multilingual-translation/CLAUDE.md
jungwoo choi f586f930b6 Initial commit: Multilingual Translation API
- Implemented REST API for 105+ language translation
- Used Facebook M2M100 model (Apache 2.0 License - Commercial use allowed)
- Supports any-to-any translation between 105 languages
- Major languages: English, Chinese, Spanish, Arabic, Russian, Japanese, Korean, etc.
- Southeast Asian: Malay, Indonesian, Thai, Vietnamese, Tagalog, Burmese, Khmer, Lao
- South Asian: Bengali, Hindi, Urdu, Tamil, Telugu, Marathi, Gujarati, etc.
- European: German, French, Italian, Spanish, Portuguese, Russian, etc.
- African: Swahili, Amharic, Hausa, Igbo, Yoruba, Zulu, Xhosa
- And many more languages

Tech Stack:
- FastAPI for REST API
- Transformers (Hugging Face) for ML model
- PyTorch for inference
- Docker for containerization
- M2M100 418M parameter model

Features:
- Health check endpoint
- Supported languages listing
- Dynamic language validation
- Model caching for performance
- GPU support (auto-detection)
- CORS enabled for web clients

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-10 14:11:20 +09:00


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This is a Malay translation API service built with FastAPI and Hugging Face Transformers. It provides bidirectional translation between Malay (Bahasa Melayu) and English using Helsinki-NLP's OPUS-MT neural machine translation models.

Development Commands

Local Development

# Setup virtual environment and install dependencies
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

# Run the development server (with auto-reload)
python run.py

# Or run with uvicorn directly
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Docker Development

# Build and run with Docker Compose
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down

# Rebuild after code changes
docker-compose up -d --build

Testing the API

# Health check
curl http://localhost:8000/health

# Translate Malay to English
curl -X POST "http://localhost:8000/api/translate" \
  -H "Content-Type: application/json" \
  -d '{"text": "Selamat pagi", "source_lang": "ms", "target_lang": "en"}'

# Translate English to Malay
curl -X POST "http://localhost:8000/api/translate" \
  -H "Content-Type: application/json" \
  -d '{"text": "Good morning", "source_lang": "en", "target_lang": "ms"}'

Architecture

Core Components

  1. app/main.py - FastAPI application with endpoint definitions

    • Lifespan events handle model preloading on startup
    • CORS middleware configured for cross-origin requests
    • Three main endpoints: root (/), health (/health), translate (/api/translate)
  2. app/translator.py - Translation service singleton

    • Manages loading and caching of translation models
    • Automatically detects and uses GPU if available (CUDA)
    • Supports lazy loading - models are loaded on first use or preloaded at startup
    • Model naming convention: Helsinki-NLP/opus-mt-{source}-{target}
  3. app/models.py - Pydantic schemas for request/response validation

    • TranslationRequest: Validates input (text, source_lang, target_lang)
    • TranslationResponse: Structured output with metadata
    • LanguageCode enum: Only "ms" and "en" are supported
  4. app/config.py - Configuration management using pydantic-settings

    • Loads settings from environment variables or .env file
    • Default values provided for all settings
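The translator singleton described above can be sketched roughly as follows. This is an assumption about the shape of app/translator.py, not the actual implementation — only the class name, the lazy-loading/caching behavior, and the `Helsinki-NLP/opus-mt-{source}-{target}` naming convention come from the description above:

```python
class TranslationService:
    """Sketch of a lazy-loading translation service (assumed structure)."""

    def __init__(self, cache_dir: str = "./models/"):
        self.cache_dir = cache_dir
        self._models = {}  # (source, target) -> (tokenizer, model), cached in memory

    def _get_model_name(self, source: str, target: str) -> str:
        # Model naming convention: Helsinki-NLP/opus-mt-{source}-{target}
        return f"Helsinki-NLP/opus-mt-{source}-{target}"

    def get_model(self, source: str, target: str):
        # Lazy loading: download/load on first use, then serve from the in-memory cache
        key = (source, target)
        if key not in self._models:
            # Imported here so the sketch stays importable without transformers installed
            from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
            name = self._get_model_name(source, target)
            tokenizer = AutoTokenizer.from_pretrained(name, cache_dir=self.cache_dir)
            model = AutoModelForSeq2SeqLM.from_pretrained(name, cache_dir=self.cache_dir)
            self._models[key] = (tokenizer, model)
        return self._models[key]

# Module-level singleton; the FastAPI lifespan hook can preload pairs at startup
translator = TranslationService()
```

The in-memory dict is what makes repeat requests fast; the on-disk `cache_dir` is what makes restarts fast.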

Translation Flow

  1. Request received at /api/translate endpoint
  2. Pydantic validates request schema
  3. TranslationService determines appropriate model based on language pair
  4. Model is loaded if not already cached in memory
  5. Text is tokenized, translated, and decoded
  6. Response includes original text, translation, and model metadata
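The six steps above can be sketched as a plain function. This is a simplified, dependency-free illustration: in the real app Pydantic performs the validation and the cached Hugging Face model does the translation; `run_model` here is a hypothetical stand-in for the tokenize/generate/decode step:

```python
SUPPORTED = {"ms", "en"}

def handle_translate(text: str, source_lang: str, target_lang: str, run_model) -> dict:
    # Step 2: validate the request (Pydantic does this in the real app)
    if source_lang not in SUPPORTED or target_lang not in SUPPORTED:
        raise ValueError(f"unsupported language pair: {source_lang}->{target_lang}")
    # Step 3: pick the model for this language pair (Helsinki-NLP/opus-mt-{src}-{tgt})
    model_name = f"Helsinki-NLP/opus-mt-{source_lang}-{target_lang}"
    # Steps 4-5: load (or reuse) the model, then tokenize, translate, and decode
    translated = run_model(model_name, text)
    # Step 6: respond with the original text, the translation, and model metadata
    return {
        "original_text": text,
        "translated_text": translated,
        "source_lang": source_lang,
        "target_lang": target_lang,
        "model": model_name,
    }
```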

Model Caching

  • Models are downloaded to MODEL_CACHE_DIR (default: ./models/)
  • Once downloaded, models persist across restarts
  • In Docker, use volume mount to persist models
  • First translation request may be slow due to model download (~300MB per model)
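A quick way to check whether a model is already on disk (e.g. to warn about a slow first request) is to look for its snapshot folder under `MODEL_CACHE_DIR`. The `models--{org}--{name}` layout below is how recent huggingface_hub versions organize the cache; treat it as an assumption and verify against your installed version:

```python
from pathlib import Path

def is_model_cached(cache_dir: str, model_name: str) -> bool:
    """Return True if the model already has a folder under cache_dir.

    Assumes the huggingface_hub cache layout (models--{org}--{name});
    check your installed version before relying on this.
    """
    folder = "models--" + model_name.replace("/", "--")
    return (Path(cache_dir) / folder).is_dir()
```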

Device Selection

The translator automatically detects GPU availability:

  • CUDA GPU: Used automatically if available for faster inference
  • CPU: Fallback option, slower but works everywhere
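The auto-detection amounts to a one-line CUDA check. The actual check lives in app/translator.py; this standalone sketch shows the idea:

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA GPU is visible to PyTorch, else "cpu"."""
    try:
        import torch  # imported lazily so the sketch runs even without PyTorch
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass
    return "cpu"
```

The chosen device string is then used to move the model and input tensors with `.to(device)` before inference.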

Configuration

Environment variables (see .env.example):

  • API_HOST / API_PORT: Server binding
  • MODEL_CACHE_DIR: Where to store downloaded models
  • MAX_LENGTH: Maximum token length for translation (default 512)
  • ALLOWED_ORIGINS: CORS configuration
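For illustration, the same variables can be read with plain `os.environ`. The project itself uses pydantic-settings, and the field names and default values below are assumptions based on the list above:

```python
import os

def load_settings() -> dict:
    """Read the documented environment variables, with assumed defaults."""
    return {
        "api_host": os.environ.get("API_HOST", "0.0.0.0"),
        "api_port": int(os.environ.get("API_PORT", "8000")),
        "model_cache_dir": os.environ.get("MODEL_CACHE_DIR", "./models/"),
        "max_length": int(os.environ.get("MAX_LENGTH", "512")),
        # Comma-separated origin list; "*" allows all (tighten in production)
        "allowed_origins": os.environ.get("ALLOWED_ORIGINS", "*").split(","),
    }
```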

Common Tasks

Adding New Language Pairs

To add support for additional languages:

  1. Check if Helsinki-NLP has an OPUS-MT model for the language pair at https://huggingface.co/Helsinki-NLP
  2. Update app/models.py - Add new language code to LanguageCode enum
  3. Update app/translator.py - Add model mapping in _get_model_name() method
  4. Update app/main.py - Add language info to /api/supported-languages endpoint
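The two code-side changes (steps 2 and 3) would look roughly like this, using Indonesian as a hypothetical example. The enum and mapping shapes are assumptions about the actual files, and any new model name must first be confirmed to exist on the Helsinki-NLP hub page:

```python
from enum import Enum

class LanguageCode(str, Enum):
    MS = "ms"
    EN = "en"
    ID = "id"  # step 2: new language code added to the enum (hypothetical example)

# Step 3: language-pair -> model mapping, mirroring _get_model_name()
MODEL_MAP = {
    ("ms", "en"): "Helsinki-NLP/opus-mt-ms-en",
    ("en", "ms"): "Helsinki-NLP/opus-mt-en-ms",
    # New pair: verify the model exists at https://huggingface.co/Helsinki-NLP first
    ("id", "en"): "Helsinki-NLP/opus-mt-id-en",
}

def get_model_name(source: LanguageCode, target: LanguageCode) -> str:
    try:
        return MODEL_MAP[(source.value, target.value)]
    except KeyError:
        raise ValueError(f"unsupported pair: {source.value}->{target.value}")
```

Note that OPUS-MT models are directional, so each direction of a pair needs its own entry.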

Modifying Translation Behavior

Translation parameters are in app/translator.py in the translate() method:

  • Adjust max_length in tokenizer call to handle longer texts
  • Modify generation parameters passed to model.generate() for different translation strategies
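The shape of that method is roughly as follows. The tokenizer and model calls follow the standard Hugging Face API; the parameter values shown are illustrative knobs, not the project's actual defaults:

```python
def translate_text(tokenizer, model, text: str, max_length: int = 512, num_beams: int = 4) -> str:
    # Tokenize with truncation so long inputs do not exceed max_length
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_length)
    # Generation parameters control the search strategy; num_beams (beam search)
    # is a common knob for translation quality vs. speed
    output_ids = model.generate(**inputs, max_length=max_length, num_beams=num_beams)
    # Decode the first (best) sequence back to text
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```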

Production Deployment

For production use:

  1. Set reload=False in run.py or use production-ready uvicorn command
  2. Configure proper ALLOWED_ORIGINS instead of "*"
  3. Add authentication middleware if needed
  4. Consider using multiple workers: uvicorn app.main:app --workers 4
  5. Mount persistent volume for models/ directory in Docker

API Documentation

When the server is running, interactive API documentation is available at FastAPI's standard locations:

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc