Fix NLLB-200 tokenizer and add .dockerignore
- Fixed NLLB-200 tokenizer forced_bos_token_id issue
- Changed from lang_code_to_id to convert_tokens_to_ids
- Added .dockerignore to exclude models directory from Docker build
- Prevents disk space issues during build
- Models are loaded at runtime via volume mount
- Both M2M100 and NLLB-200 models tested and working

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@@ -473,7 +473,8 @@ class TranslationService:
 ).to(self.device)
 
 # Generate translation - NLLB uses forced_bos_token_id
-forced_bos_token_id = tokenizer.lang_code_to_id[tgt_code]
+# Convert language code to token ID
+forced_bos_token_id = tokenizer.convert_tokens_to_ids(tgt_code)
 
 with torch.no_grad():
     translated = model.generate(