site11/docs/ARCHITECTURE_OVERVIEW.md

# Site11 System Architecture Overview

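## 📋 Table of Contents
- [Overall Architecture](#overall-architecture)
- [Microservice Composition](#microservice-composition)
- [Data Flow](#data-flow)
- [Technology Stack](#technology-stack)
- [Scalability Considerations](#scalability-considerations)
- [Security Architecture](#security-architecture)
- [Performance Characteristics](#performance-characteristics)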

## Overall Architecture

### Hybrid Architecture (Current)
```
┌─────────────────────────────────────────────────────────┐
│                      External APIs                      │
│  DeepL | OpenAI | Claude | Google Search | RSS Feeds    │
└────────────────────┬────────────────────────────────────┘
                     │
┌─────────────────────┴────────────────────────────────────┐
│                Kubernetes Cluster                        │
│  ┌─────────────────────────────────────────────────────┐ │
│  │               Frontend Layer                         │ │
│  │  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐   │ │
│  │  │ Console     │ │ Images      │ │ Users       │   │ │
│  │  │ Frontend    │ │ Frontend    │ │ Frontend    │   │ │
│  │  └─────────────┘ └─────────────┘ └─────────────┘   │ │
│  └─────────────────────────────────────────────────────┘ │
│                          │                               │
│  ┌─────────────────────────────────────────────────────┐ │
│  │               API Gateway Layer                      │ │
│  │  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐   │ │
│  │  │ Console     │ │ Images      │ │ Users       │   │ │
│  │  │ Backend     │ │ Backend     │ │ Backend     │   │ │
│  │  │ (Gateway)   │ │             │ │             │   │ │
│  │  └─────────────┘ └─────────────┘ └─────────────┘   │ │
│  └─────────────────────────────────────────────────────┘ │
│                          │                               │
│  ┌─────────────────────────────────────────────────────┐ │
│  │            Pipeline Workers Layer                    │ │
│  │  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌─────────┐ │ │
│  │  │RSS       │ │Google    │ │AI Article│ │Image    │ │ │
│  │  │Collector │ │Search    │ │Generator │ │Generator│ │ │
│  │  └──────────┘ └──────────┘ └──────────┘ └─────────┘ │ │
│  │  ┌─────────────────────────────────────────────────┐ │ │
│  │  │             Translator                          │ │ │
│  │  │        (8 Languages Support)                    │ │ │
│  │  └─────────────────────────────────────────────────┘ │ │
│  └─────────────────────────────────────────────────────┘ │
└────────────────────┬────────────────────────────────────┘
                     │ host.docker.internal
┌─────────────────────┴────────────────────────────────────┐
│               Docker Compose Infrastructure              │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐        │
│  │   MongoDB   │ │    Redis    │ │    Kafka    │        │
│  │  (Primary   │ │  (Cache &   │ │ (Message    │        │
│  │  Database)  │ │   Queue)    │ │  Broker)    │        │
│  └─────────────┘ └─────────────┘ └─────────────┘        │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐        │
│  │ Zookeeper   │ │ Pipeline    │ │ Pipeline    │        │
│  │(Kafka Coord)│ │ Scheduler   │ │ Monitor     │        │
│  └─────────────┘ └─────────────┘ └─────────────┘        │
│  ┌─────────────┐ ┌─────────────┐                        │
│  │  Language   │ │ Registry    │                        │
│  │    Sync     │ │   Cache     │                        │
│  └─────────────┘ └─────────────┘                        │
└──────────────────────────────────────────────────────────┘
```

## Microservice Composition

### Console Services (API Gateway Pattern)
```yaml
Console Backend:
  Purpose: API Gateway & Orchestration
  Technology: FastAPI
  Port: 8000
  Features:
    - Service Discovery
    - Authentication & Authorization
    - Request Routing
    - Health Monitoring

Console Frontend:
  Purpose: Admin Dashboard
  Technology: React + Vite + TypeScript
  Port: 80 (nginx)
  Features:
    - Service Health Dashboard
    - Real-time Monitoring
    - User Management UI
```

### Pipeline Services (Event-Driven Architecture)
```yaml
RSS Collector:
  Purpose: RSS feed collection
  Scaling: 1-5 replicas
  Queue: rss_collection

Google Search:
  Purpose: Google search result collection
  Scaling: 1-5 replicas
  Queue: google_search

AI Article Generator:
  Purpose: AI-based content generation
  Scaling: 2-10 replicas
  Queue: ai_generation
  APIs: OpenAI, Claude

Translator:
  Purpose: Translation into 8 languages
  Scaling: 3-10 replicas (high throughput)
  Queue: translation
  API: DeepL

Image Generator:
  Purpose: Image generation and optimization
  Scaling: 2-10 replicas
  Queue: image_generation
  API: OpenAI DALL-E
```
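
The queue-per-worker layout above can be sketched as a single processing step. This is a minimal illustration, not the project's actual worker code: it assumes each worker pops JSON jobs from its own Redis list and pushes results to the next stage's list. `process_one`, the `InMemoryQueues` stand-in, and the choice of `translation` as the downstream queue are hypothetical. The two client methods used, `blpop` and `rpush`, mirror the redis-py list commands of the same names.

```python
import json
from collections import defaultdict, deque
from typing import Callable, Optional


class InMemoryQueues:
    """Hypothetical stand-in exposing the two redis-py list commands used below."""

    def __init__(self) -> None:
        self._lists = defaultdict(deque)

    def rpush(self, name: str, *values: str) -> int:
        self._lists[name].extend(values)
        return len(self._lists[name])

    def blpop(self, keys: list, timeout: int = 0) -> Optional[tuple]:
        # A real Redis client blocks for up to `timeout` seconds;
        # this stand-in returns immediately.
        for key in keys:
            if self._lists[key]:
                return key, self._lists[key].popleft()
        return None


def process_one(
    client,
    source_queue: str,
    target_queue: str,
    handler: Callable[[dict], dict],
    timeout: int = 5,
) -> bool:
    """Pop one job, run the handler, and forward the result.

    Returns False when no job arrived within `timeout` seconds.
    """
    item = client.blpop([source_queue], timeout=timeout)
    if item is None:
        return False
    _key, raw = item
    job = json.loads(raw)
    result = handler(job)
    client.rpush(target_queue, json.dumps(result))
    return True


queues = InMemoryQueues()
queues.rpush("ai_generation", json.dumps({"id": 1, "title": "hello"}))
process_one(
    queues,
    "ai_generation",
    "translation",
    lambda job: {**job, "status": "generated"},
)
```

A job is lost in this sketch if the handler raises; a production worker would need retry or dead-letter handling around the `handler(job)` call.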

### Infrastructure Services (Stateful)
```yaml
MongoDB:
  Purpose: Primary Database
  Collections:
    - articles_ko (Korean articles)
    - articles_en (English articles)
    - articles_zh_cn, articles_zh_tw (Chinese)
    - articles_ja (Japanese)
    - articles_fr, articles_de, articles_es, articles_it (European)

Redis:
  Purpose: Cache & Queue
  Usage:
    - Queue management (FIFO/Priority)
    - Session storage
    - Result caching
    - Rate limiting

Kafka:
  Purpose: Event Streaming
  Topics:
    - user-events
    - oauth-events
    - pipeline-events
    - dead-letter-queue

Pipeline Scheduler:
  Purpose: Workflow Orchestration
  Features:
    - Task scheduling
    - Dependency management
    - Error handling
    - Retry logic

Pipeline Monitor:
  Purpose: Real-time Monitoring
  Features:
    - Queue status
    - Processing metrics
    - Performance monitoring
    - Alerting
```
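
Because articles are stored in one MongoDB collection per language, code that writes them needs a consistent way to derive the collection name. The helper below is a sketch under that assumption; `collection_for` and `SUPPORTED_LANGUAGES` are hypothetical names, and only the `articles_<code>` naming pattern comes from the list above.

```python
# Language codes taken from the collection list above: Korean plus 8 target languages.
SUPPORTED_LANGUAGES = ("ko", "en", "zh_cn", "zh_tw", "ja", "fr", "de", "es", "it")


def collection_for(language: str) -> str:
    """Map a language tag such as 'zh-CN' to its collection name."""
    code = language.strip().lower().replace("-", "_")
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    return f"articles_{code}"


print(collection_for("zh-CN"))  # articles_zh_cn
```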

## Data Flow

### Content Generation Flow
```
1. Content Collection
   RSS Feeds → RSS Collector → Redis Queue
   Search Terms → Google Search → Redis Queue

2. Content Processing
   Raw Content → AI Article Generator → Enhanced Articles

3. Multi-Language Translation
   Korean Articles → Translator (DeepL) → 8 Languages

4. Image Generation
   Article Content → Image Generator (DALL-E) → Optimized Images

5. Data Storage
   Processed Content → MongoDB Collections (by language)

6. Language Synchronization
   Language Sync Service → Monitors & balances translations
```

### Real-Time Monitoring Flow
```
1. Metrics Collection
   Each Service → Pipeline Monitor → Real-time Dashboard

2. Health Monitoring
   Services → Health Endpoints → Console Backend → Dashboard

3. Queue Monitoring
   Redis Queues → Pipeline Monitor → Queue Status Display

4. Event Streaming
   Service Events → Kafka → Event Consumer → Real-time Updates
```
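
The health monitoring step (services, health endpoints, Console Backend, dashboard) amounts to collecting per-service results and reducing them to one status. The sketch below illustrates that reduction only. `aggregate_health`, the `healthy`/`degraded`/`unhealthy` labels, and the callable-based checks are assumptions; a real gateway would issue HTTP requests to each service's health endpoint instead.

```python
from typing import Callable, Dict


def aggregate_health(checks: Dict[str, Callable[[], bool]]) -> dict:
    """Run every check and summarize the results for a dashboard."""
    services = {}
    for name, check in checks.items():
        try:
            healthy = bool(check())
        except Exception:
            # A check that raises (timeout, connection error) counts as unhealthy.
            healthy = False
        services[name] = "healthy" if healthy else "unhealthy"

    healthy_count = sum(1 for status in services.values() if status == "healthy")
    if healthy_count == len(services):
        overall = "healthy"
    elif healthy_count == 0:
        overall = "unhealthy"
    else:
        overall = "degraded"
    return {"status": overall, "services": services}


def failing_check() -> bool:
    raise ConnectionError("service unreachable")


report = aggregate_health({"users": lambda: True, "images": failing_check})
print(report["status"])  # degraded
```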

## Technology Stack

### Backend Technologies
```yaml
API Framework: FastAPI (Python 3.11)
Database: MongoDB 7.0
Cache/Queue: Redis 7
Message Broker: Kafka 3.5 + ZooKeeper 3.9
Container Runtime: Docker + Kubernetes
Registry: Docker Hub + Local Registry
```

### Frontend Technologies
```yaml
Framework: React 18
Build Tool: Vite 4
Language: TypeScript
UI Library: Material-UI v7
Bundler: Rollup (via Vite)
Web Server: Nginx (Production)
```

### Infrastructure Technologies
```yaml
Orchestration: Kubernetes (Kind/Docker Desktop)
Container Platform: Docker 20.10+
Networking: Docker Networks + K8s Services
Storage: Docker Volumes + K8s PVCs
Monitoring: Custom Dashboard + kubectl
```

### External APIs
```yaml
Translation: DeepL API
AI Content: OpenAI GPT + Claude API
Image Generation: OpenAI DALL-E
Search: Google Custom Search API (SERP)
```

## Scalability Considerations

### Horizontal Scaling (Currently Implemented)
```yaml
Auto-scaling Rules:
  - CPU > 70% → Scale Up
  - Memory > 80% → Scale Up
  - Queue Length > 100 → Scale Up

Scaling Limits:
  Console: 2-10 replicas
  Translator: 3-10 replicas (highest throughput)
  AI Generator: 2-10 replicas
  Others: 1-5 replicas
```
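
The thresholds and replica ranges above can be combined into one decision function. This is a sketch under stated assumptions rather than the deployed autoscaler: it adds one replica whenever any rule fires and clamps to the listed limits. The one-replica step, the service keys, and the names `Metrics` and `desired_replicas` are hypothetical, and no scale-down rule is modeled because none is listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    cpu_percent: float
    memory_percent: float
    queue_length: int


# (min, max) replicas per service, from the scaling limits above.
SCALING_LIMITS = {
    "console": (2, 10),
    "translator": (3, 10),
    "ai-generator": (2, 10),
}
DEFAULT_LIMITS = (1, 5)  # "Others"


def should_scale_up(metrics: Metrics) -> bool:
    """Any single rule firing is enough to request a scale-up."""
    return (
        metrics.cpu_percent > 70
        or metrics.memory_percent > 80
        or metrics.queue_length > 100
    )


def desired_replicas(service: str, current: int, metrics: Metrics) -> int:
    """Return the replica count after applying the rules and limits."""
    low, high = SCALING_LIMITS.get(service, DEFAULT_LIMITS)
    target = current + 1 if should_scale_up(metrics) else current
    return max(low, min(high, target))


print(desired_replicas("translator", 3, Metrics(75.0, 50.0, 10)))  # 4
```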

### Vertical Scaling
```yaml
Resource Allocation:
  CPU Intensive: AI Generator, Image Generator
  Memory Intensive: Translator (language models)
  I/O Intensive: RSS Collector, Database operations

Resource Limits:
  Request: 100m CPU, 256Mi RAM
  Limit: 500m CPU, 512Mi RAM
```

### Database Scaling
```yaml
Current: Single MongoDB instance
Future Options:
  - MongoDB Replica Set (HA)
  - Sharding by language
  - Read replicas for different regions

Indexing Strategy:
  - Language-based indexing
  - Timestamp-based partitioning
  - Full-text search indexes
```

### Caching Strategy
```yaml
L1 Cache: Application-level (FastAPI)
L2 Cache: Redis (shared)
L3 Cache: Registry Cache (Docker images)

Cache Invalidation:
  - TTL-based expiration
  - Event-driven invalidation
  - Manual cache warming
```
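
The L1/L2 lookup order with TTL-based expiration can be sketched as follows. This is an illustration, not the project's cache layer: both levels are modeled with the same in-process `TTLCache` class, whereas the real L2 would be Redis. `TTLCache`, `cached_lookup`, and the injectable `clock` parameter are hypothetical names introduced for the example.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny key-value cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # TTL-based expiration
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._data[key] = (self._clock() + self._ttl, value)

    def invalidate(self, key: str) -> None:
        """Hook for event-driven invalidation."""
        self._data.pop(key, None)


def cached_lookup(key: str, l1: TTLCache, l2: TTLCache, load: Callable[[str], Any]) -> Any:
    """Check L1, then L2, then fall back to the loader and fill both levels."""
    value = l1.get(key)
    if value is not None:
        return value
    value = l2.get(key)
    if value is None:
        value = load(key)
        l2.set(key, value)
    l1.set(key, value)
    return value
```

Giving L1 a shorter TTL than L2 keeps per-process copies fresh while the shared level absorbs most repeated loads.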

### API Rate Limiting
```yaml
External APIs:
  DeepL: 500,000 chars/month
  OpenAI: Usage-based billing
  Google Search: 100 queries/day (free tier)

Rate Limiting Strategy:
  - Redis-based rate limiting
  - Queue-based buffering
  - Priority queuing
  - Circuit breaker pattern
```
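
A fixed quota such as the Google Search free tier (100 queries/day) can be enforced with a fixed-window counter. The sketch below keeps counters in process memory to stay self-contained; the Redis-based variant named above would hold the same counter in a shared key with an expiry. `FixedWindowLimiter` and its parameters are hypothetical names, not part of the codebase.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` calls per key within each window."""

    def __init__(
        self,
        limit: int,
        window_seconds: int,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._state: Dict[str, Tuple[int, int]] = {}  # key -> (window index, count)

    def allow(self, key: str) -> bool:
        window = int(self._clock() // self._window)
        current_window, count = self._state.get(key, (window, 0))
        if current_window != window:
            count = 0  # a new window started, so the counter resets
        if count >= self._limit:
            self._state[key] = (window, count)
            return False
        self._state[key] = (window, count + 1)
        return True


# 100 queries per day, matching the Google Search free-tier figure above.
google_search_limiter = FixedWindowLimiter(limit=100, window_seconds=24 * 60 * 60)
```

Requests that are refused here would go back to the queue rather than being dropped, which is where the queue-based buffering in the strategy list comes in.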

### Future Architecture Considerations

#### Service Mesh (Next Step)
```yaml
Technology: Istio or Linkerd
Benefits:
  - Service-to-service encryption
  - Traffic management
  - Observability
  - Circuit breaking
```

#### Multi-Region Deployment
```yaml
Current: Single cluster
Future (multi-region):
  - Regional MongoDB clusters
  - CDN for static assets
  - Geo-distributed caching
  - Language-specific regions
```

#### Event Sourcing
```yaml
Current: State-based
Future (event-based):
  - Event store (EventStore or Kafka)
  - CQRS pattern
  - Aggregate reconstruction
  - Audit trail
```

## Security Architecture

### Authentication & Authorization
```yaml
Current: JWT-based authentication
Users: Demo users (admin/user)
Tokens: 30-minute expiration

Future:
  - OAuth2 with external providers
  - RBAC with granular permissions
  - API key management
```
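
To make the 30-minute token expiration concrete, the sketch below issues and verifies an HS256-signed JWT using only the standard library. It is an illustration of the mechanism, not the project's auth code, which may well use a JWT library instead. `create_token`, `verify_token`, the claim set, and the secret handling are all assumptions.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

TOKEN_TTL_SECONDS = 30 * 60  # 30-minute expiration, as stated above


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def create_token(subject: str, secret: bytes, now: Optional[float] = None) -> str:
    """Build a compact JWT with `sub`, `iat`, and `exp` claims."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "iat": issued_at, "exp": issued_at + TOKEN_TTL_SECONDS}
    parts = [
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    ]
    signing_input = ".".join(parts)
    return signing_input + "." + _b64url(_sign(signing_input, secret))


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the payload, or raise ValueError if the token is invalid or expired."""
    pieces = token.split(".")
    if len(pieces) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = pieces
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if current >= payload["exp"]:
        raise ValueError("token expired")
    return payload
```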

### Network Security
```yaml
K8s Network Policies: Not implemented
Service Mesh Security: Future consideration
Secrets Management: K8s Secrets + .env files

Future:
  - HashiCorp Vault integration
  - mTLS between services
  - Network segmentation
```

## Performance Characteristics

### Throughput Metrics
```yaml
Translation: ~100 articles/minute (3 replicas)
AI Generation: ~50 articles/minute (2 replicas)
Image Generation: ~20 images/minute (2 replicas)
Total Processing: ~1000 articles/hour
```

### Latency Targets
```yaml
API Response: < 200ms
Translation: < 5s per article
AI Generation: < 30s per article
Image Generation: < 60s per image
End-to-end: < 2 minutes per complete article
```
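
As a quick consistency check on these targets, the per-stage limits can be summed and compared with the end-to-end limit. The calculation assumes the three stages run one after another for a single article and that the translation target covers the whole article; if stages overlap in practice, the real headroom is larger. The names below are hypothetical.

```python
# Per-stage latency targets in seconds, from the table above.
STAGE_TARGETS_SECONDS = {
    "ai_generation": 30,
    "translation": 5,
    "image_generation": 60,
}
END_TO_END_TARGET_SECONDS = 2 * 60


def latency_headroom(stage_targets: dict, end_to_end_target: int) -> int:
    """Seconds left over if every stage takes exactly its target time."""
    return end_to_end_target - sum(stage_targets.values())


print(latency_headroom(STAGE_TARGETS_SECONDS, END_TO_END_TARGET_SECONDS))  # 25
```

Under these assumptions the stages add up to 95 seconds, leaving 25 seconds for queue wait time and storage.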

### Resource Utilization
```yaml
CPU Usage: 60-80% under normal load
Memory Usage: 70-90% under normal load
Disk I/O: MongoDB primary bottleneck
Network I/O: External API calls
```