Train a custom model
A model provides translations for a specific language pair. The outcome of a successful training is a model. To train a custom model, three mutually exclusive document types are required: training, tuning, and testing. If only training data is provided when queuing a training, Custom Translator will automatically assemble tuning and testing data. It will use a random subset of sentences from your training documents, and exclude these sentences from the training data itself. A minimum of 10,000 parallel training sentences are required to train a full model.
Select the Train model blade.
Type the Model name.
Keep the default Full training selected or select Dictionary-only training.
Full training displays all uploaded document types. Dictionary-only displays dictionary documents only.
Under Select documents, select the documents you want to use to train the model, for example,
sample-English-Germanand review the training cost associated with the selected number of sentences.
Select Train now.
Select Train to confirm.
Notifications displays model training in progress, e.g., Submitting data state. Training model takes few hours, subject to the number of selected sentences.
When to select dictionary-only training
For better results, we recommended letting the system learn from your training data. However, when you don't have enough parallel sentences to meet the 10,000 minimum requirements, or sentences and compound nouns must be rendered as-is, use dictionary-only training. Your model will typically complete training much faster than with full training. The resulting models will use the baseline models for translation along with the dictionaries you've added. You won't see BLEU scores or get a test report.
Custom Translator doesn't sentence-align dictionary files. Therefore, it is important that there are an equal number of source and target phrases/sentences in your dictionary documents and that they are precisely aligned. If not, the document upload will fail.
After successful model training, select the Model details blade.
Select the Model Name to review training date/time, total training time, number of sentences used for training, tuning, testing, dictionary, and whether the system generated the test and tuning sets. You'll use
Category IDto make translation requests.
Evaluate the model BLEU score. Review the test set: the BLEU score is the custom model score and the Baseline BLEU is the pre-trained baseline model used for customization. A higher BLEU score means higher translation quality using the custom model.
Select the Model details blade.
Hover over the model name and check the selection button.
Fill in New model name.
Keep Train immediately checked if no further data will be selected or uploaded, otherwise, check Save as draft
If you save the model as
Draft, Model details is updated with the model name in
To add more documents, select on the model name and follow
Create modelsection above.