multilingual-translation

Author	SHA1	Message	Date
jungwoo choi	5a99d081ab	Fix NLLB-200 tokenizer and add .dockerignore - Fixed NLLB-200 tokenizer forced_bos_token_id issue - Changed from lang_code_to_id to convert_tokens_to_ids - Added .dockerignore to exclude models directory from Docker build - Prevents disk space issues during build - Models are loaded at runtime via volume mount - Both M2M100 and NLLB-200 models tested and working 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-11 16:02:32 +09:00
jungwoo choi	28e26d19b6	Add dual model support: M2M100 and NLLB-200 - Added optional 'model' parameter to translation request (default: m2m100) - M2M100: 105 languages, Apache 2.0 License (commercial OK) - NLLB-200: 200 languages, CC-BY-NC 4.0 License (non-commercial only) - Updated /api/translate endpoint to accept model selection - Updated /api/supported-languages to show languages per model - Added comprehensive language name mappings for all NLLB-200 languages - Both models can be used independently with automatic model loading - Model information includes license and commercial use status Example usage: - Default (M2M100): {"text": "Hello", "source_lang": "en", "target_lang": "ko"} - NLLB-200: {"text": "Hello", "source_lang": "en", "target_lang": "ko", "model": "nllb200"} 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-11 15:57:00 +09:00
jungwoo choi	228f6c38e5	Update API metadata: Change to Multilingual Translation API - Updated API title from "Malaysian Language Translation API" to "Multilingual Translation API" - Updated API description to mention 105+ languages and M2M100 model - Updated /api/translate endpoint docstring to reflect multilingual support - Updated startup/shutdown log messages - Added commercial license note (Apache 2.0) in API description This ensures the Swagger UI (http://localhost:8001/docs) shows correct information. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-11 15:38:19 +09:00
jungwoo choi	f586f930b6	Initial commit: Multilingual Translation API - Implemented REST API for 105+ language translation - Used Facebook M2M100 model (Apache 2.0 License - Commercial use allowed) - Supports any-to-any translation between 105 languages - Major languages: English, Chinese, Spanish, Arabic, Russian, Japanese, Korean, etc. - Southeast Asian: Malay, Indonesian, Thai, Vietnamese, Tagalog, Burmese, Khmer, Lao - South Asian: Bengali, Hindi, Urdu, Tamil, Telugu, Marathi, Gujarati, etc. - European: German, French, Italian, Spanish, Portuguese, Russian, etc. - African: Swahili, Amharic, Hausa, Igbo, Yoruba, Zulu, Xhosa - And many more languages Tech Stack: - FastAPI for REST API - Transformers (Hugging Face) for ML model - PyTorch for inference - Docker for containerization - M2M100 418M parameter model Features: - Health check endpoint - Supported languages listing - Dynamic language validation - Model caching for performance - GPU support (auto-detection) - CORS enabled for web clients 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-10 14:11:20 +09:00
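
The optional `model` parameter described in commit 28e26d19b6 can be sketched as a small request builder. This is an illustration only, not the repository's code: `MODEL_INFO`, `DEFAULT_MODEL`, and `build_translate_request` are hypothetical names, and the license values are copied from the commit message. The field names (`text`, `source_lang`, `target_lang`, `model`) and the `m2m100` default come from the commit's own usage examples.

```python
from typing import Dict, Optional

# License facts as stated in the commit message; the dict layout is hypothetical.
MODEL_INFO: Dict[str, Dict[str, object]] = {
    "m2m100": {"license": "Apache 2.0", "commercial_use": True},
    "nllb200": {"license": "CC-BY-NC 4.0", "commercial_use": False},
}
DEFAULT_MODEL = "m2m100"  # default named in the commit message


def build_translate_request(
    text: str,
    source_lang: str,
    target_lang: str,
    model: Optional[str] = None,
) -> Dict[str, str]:
    """Build a JSON body for POST /api/translate (hypothetical helper)."""
    chosen = model if model is not None else DEFAULT_MODEL
    if chosen not in MODEL_INFO:
        raise ValueError(f"unknown model {chosen!r}; expected one of {sorted(MODEL_INFO)}")
    payload = {"text": text, "source_lang": source_lang, "target_lang": target_lang}
    # Only send "model" when the caller asked for one, so the server default applies otherwise.
    if model is not None:
        payload["model"] = chosen
    return payload
```

Omitting `model` yields the same body as the commit's default example, while passing `"nllb200"` adds the extra key. A client that cares about licensing could check `MODEL_INFO[name]["commercial_use"]` before sending the request.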

4 Commits
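
The tokenizer fix in commit 5a99d081ab (switching from `lang_code_to_id` to `convert_tokens_to_ids` for `forced_bos_token_id`) might look roughly like the sketch below. The helper name `forced_bos_token_id_for` and its unknown-token check are assumptions for illustration; the commit does not show the actual code. The helper accepts any object exposing `convert_tokens_to_ids`, which Hugging Face tokenizers provide.

```python
def forced_bos_token_id_for(tokenizer, target_lang_code: str) -> int:
    """Look up the token id of an NLLB language code such as "kor_Hang".

    Hypothetical helper: it uses convert_tokens_to_ids rather than the
    lang_code_to_id mapping, matching the change the commit describes.
    """
    token_id = tokenizer.convert_tokens_to_ids(target_lang_code)
    # Assumption: an unrecognized code maps to the unknown-token id (or None).
    unk_id = getattr(tokenizer, "unk_token_id", None)
    if token_id is None or token_id == unk_id:
        raise ValueError(f"language code {target_lang_code!r} is not in the tokenizer vocabulary")
    return token_id


# Intended use with a real model (not executed here; the checkpoint is not
# named in the commit log, so it is left as a placeholder):
#   tokenizer = AutoTokenizer.from_pretrained(<NLLB-200 checkpoint>)
#   inputs = tokenizer("Hello", return_tensors="pt")
#   output = model.generate(
#       **inputs,
#       forced_bos_token_id=forced_bos_token_id_for(tokenizer, "kor_Hang"),
#   )
```

Failing fast on an unknown code keeps a mistyped language from silently decoding into the wrong target language.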