Train a custom model
A model provides translations for a specific language pair. The outcome of a successful training is a model. To train a custom model, three mutually exclusive document types are required: training, tuning, and testing. If you provide only training data when queuing a training, Custom Translator automatically assembles the tuning and testing data. It uses a random subset of sentences from your training documents and excludes these sentences from the training data itself. A minimum of 10,000 parallel training sentences is required to train a full model.
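The automatic split described above can be pictured with a small local sketch. It only illustrates "a random subset, excluded from training"; it is not Custom Translator's actual procedure. The function name `split_parallel_corpus` and the `tuning_size`, `testing_size`, and `seed` parameters are hypothetical, and the subset sizes the service really uses are not stated here.

```python
import random


def split_parallel_corpus(pairs, tuning_size, testing_size, seed=0):
    """Split (source, target) sentence pairs into three mutually exclusive sets.

    Illustrative only: the sizes and sampling method are assumptions,
    not the service's documented behavior.
    """
    held_out_count = tuning_size + testing_size
    if held_out_count >= len(pairs):
        raise ValueError("not enough sentence pairs to hold out tuning and testing data")

    rng = random.Random(seed)
    # Pick distinct positions at random for the held-out sentences.
    held_out = rng.sample(range(len(pairs)), held_out_count)
    tuning_idx = held_out[:tuning_size]
    testing_idx = held_out[tuning_size:]
    excluded = set(held_out)

    tuning = [pairs[i] for i in tuning_idx]
    testing = [pairs[i] for i in testing_idx]
    # Training keeps everything that was not held out.
    training = [pair for i, pair in enumerate(pairs) if i not in excluded]
    return training, tuning, testing


if __name__ == "__main__":
    corpus = [(f"source {i}", f"target {i}") for i in range(100)]
    training, tuning, testing = split_parallel_corpus(corpus, tuning_size=10, testing_size=10)
    print(len(training), len(tuning), len(testing))
```

Because the held-out positions are removed from the training list, no sentence pair appears in more than one set, which is what "mutually exclusive" means here.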
Create model
Select the Train model blade.
Type the Model name.
Keep the default Full training selected or select Dictionary-only training.
Note
Full training displays all uploaded document types. Dictionary-only displays dictionary documents only.
Under Select documents, select the documents you want to use to train the model, for example, sample-English-German, and review the training cost associated with the selected number of sentences.
Select Train now.
Select Train to confirm.
Note
Notifications displays model training in progress, for example, the Submitting data state. Training a model takes a few hours, depending on the number of selected sentences.
When to select dictionary-only training
For better results, we recommend letting the system learn from your training data. However, when you don't have enough parallel sentences to meet the 10,000-sentence minimum, or when sentences and compound nouns must be rendered as-is, use dictionary-only training. Your model typically completes training faster than with full training. The resulting models use the baseline models for translation along with the dictionaries you added. You don't see BLEU scores or get a test report.
Note
Custom Translator doesn't sentence-align dictionary files. Therefore, it's important that your dictionary documents contain an equal number of source and target phrases/sentences and that they are precisely aligned. If not, the document upload fails.
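Because a misaligned dictionary fails at upload, it can help to check the files locally first. The following is a minimal sketch, assuming the dictionary is kept as two parallel plain-text files with one phrase or sentence per line. The function name `check_dictionary_alignment` is hypothetical, and the check covers only entry counts and empty entries, not whether the translations actually correspond.

```python
def check_dictionary_alignment(source_lines, target_lines):
    """Return a list of problems found; an empty list means the counts line up.

    Assumes one dictionary entry per line on each side (an assumption about
    the file layout, not a statement about the service's validation).
    """
    source = [line.strip() for line in source_lines]
    target = [line.strip() for line in target_lines]
    problems = []

    if len(source) != len(target):
        problems.append(
            f"entry count mismatch: {len(source)} source vs {len(target)} target"
        )

    # Compare the entries that do pair up and flag one-sided blanks.
    for number, (src, tgt) in enumerate(zip(source, target), start=1):
        if bool(src) != bool(tgt):
            problems.append(f"line {number}: one side is empty")

    return problems


if __name__ == "__main__":
    source = ["operating system", "hard drive"]
    target = ["Betriebssystem"]
    for problem in check_dictionary_alignment(source, target):
        print(problem)
```

To run it on real files, pass the result of `open(path, encoding="utf-8").read().splitlines()` for each side.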
Model details
After successful model training, select the Model details blade.
Select the Model Name to review training date/time, total training time, number of sentences used for training, tuning, testing, and dictionary, and whether the system generated the test and tuning sets. You use the Category ID to make translation requests.
Evaluate the model BLEU score. Review the test set: the BLEU score is the custom model score, and the Baseline BLEU is the score of the pretrained baseline model used for customization. A higher BLEU score means higher translation quality using the custom model.
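To show where the Category ID fits in a translation request, here is a minimal sketch that builds (but does not send) a request. It assumes the Translator v3 REST endpoint and its `category` query parameter, which this section does not spell out. The function name `build_translate_request` and all sample values (key, region, category ID) are placeholders.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for the Translator v3 REST API; adjust for your resource.
TRANSLATE_ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"


def build_translate_request(text, source_lang, target_lang, category_id, key, region):
    """Build a POST request that asks for translation with a custom model."""
    query = urllib.parse.urlencode(
        {
            "api-version": "3.0",
            "from": source_lang,
            "to": target_lang,
            # The Category ID from Model details selects the custom model.
            "category": category_id,
        }
    )
    body = json.dumps([{"Text": text}]).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Ocp-Apim-Subscription-Region": region,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        f"{TRANSLATE_ENDPOINT}?{query}", data=body, headers=headers, method="POST"
    )


if __name__ == "__main__":
    request = build_translate_request(
        "Hello, world.", "en", "de",
        category_id="your-category-id",  # placeholder
        key="your-key",                  # placeholder
        region="your-region",            # placeholder
    )
    print(request.full_url)
    # To send it: urllib.request.urlopen(request) with real credentials.
```

The language pair in the request should match the pair the model was trained for, since a model provides translations for one specific language pair.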
Duplicate model
Select the Model details blade.
Hover over the model name and check the selection button.
Select Duplicate.
Fill in New model name.
Keep Train immediately checked if no further data is selected or uploaded; otherwise, check Save as draft.
Select Save.
Note
If you save the model as Draft, Model details is updated with the model name in Draft status. To add more documents, select the model name and follow the steps in the Create model section.