Hello @Manuel
Thanks for reaching out to us. To train a model with a dictionary, you need to upload your dictionary document in Language Studio by following the steps below:
1. Go to Language Studio https://language.cognitive.azure.com/ and select "Translate Text" - "Customize Translation".
2. Follow the guidance to train a custom model. You need to create a new project as described here - https://learn.microsoft.com/en-us/azure/cognitive-services/Translator/custom-translator/how-to/train-custom-model#when-to-select-dictionary-only-training
3. Then add your dictionary document to the project as a "Dictionary set".
For details on what a dictionary set is, please refer to the documentation - https://learn.microsoft.com/en-us/azure/cognitive-services/Translator/custom-translator/concepts/dictionaries
Recommendations
- Dictionaries aren't a substitute for training a model using training data. For better results, we recommend letting the system learn from your training data. However, when sentences or compound nouns must be translated verbatim, use a dictionary.
- The phrase dictionary should be used sparingly. When a phrase within a sentence is replaced, the context of that sentence is lost or limited for translating the rest of the sentence. The result is that, while the phrase or word within the sentence will translate according to the provided dictionary, the overall translation quality of the sentence often suffers.
- The phrase dictionary works well for compound nouns like product names ("Microsoft SQL Server"), proper names ("City of Hamburg"), or product features ("pivot table"). It doesn't work as well for verbs or adjectives because those words are typically highly contextual within the source or target language. The best practice is to avoid phrase dictionary entries for anything but compound nouns.
- If you're using a phrase dictionary, capitalization and punctuation are important. Dictionary entries are case- and punctuation-sensitive. Custom Translator will only match words and phrases in the input sentence that use exactly the same capitalization and punctuation marks as specified in the source dictionary file. Also, translations will reflect the capitalization and punctuation provided in the target dictionary file.
Example
- Suppose you're training an English-to-Spanish system that uses a phrase dictionary, and you specify "SQL server" in the source file and "Microsoft SQL Server" in the target file. When you request the translation of a sentence that contains the phrase "SQL server", Custom Translator will match the dictionary entry and the translation will contain "Microsoft SQL Server."
- When you request translation of a sentence that includes the same phrase but doesn't match what is in your source file, such as "sql server", "sql Server" or "SQL Server", it won't return a match from your dictionary.
- The translation follows the rules of the target language as specified in your phrase dictionary.
- If you're using a sentence dictionary, end-of-sentence punctuation is ignored.
Example
- If your source dictionary contains "This sentence ends with punctuation!", then any translation requests containing "This sentence ends with punctuation" will match.
- Your dictionary should contain unique source lines. If a source line (a word, phrase, or sentence) appears more than once in a dictionary file, the system will always use the last entry provided and return the target when a match is found.
- Avoid adding phrases that consist of only numbers or are two- or three-letter words, such as acronyms, in the source dictionary file.
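Several of the recommendations above can be checked before you upload the files. Below is a minimal Python sketch of such a pre-upload check. `lint_dictionary` is a hypothetical helper, not part of Custom Translator or any Azure SDK, and it assumes the dictionary is a pair of line-aligned source and target files that have already been read into lists of strings. It only mirrors the rules listed above; it doesn't reproduce how the service matches entries.

```python
import re
from collections import Counter


def lint_dictionary(source_lines, target_lines):
    """Return a list of warning strings for a source/target phrase dictionary.

    Hypothetical helper: it only mirrors the recommendations above and is
    not part of Custom Translator or any Azure SDK.
    """
    warnings = []

    # Assumption: source and target files are aligned line by line.
    if len(source_lines) != len(target_lines):
        warnings.append(
            f"line count mismatch: {len(source_lines)} source vs "
            f"{len(target_lines)} target"
        )

    # Entries are compared exactly as written, because matching is
    # case- and punctuation-sensitive ("SQL server" != "sql server").
    counts = Counter(line.strip() for line in source_lines)
    for entry, count in counts.items():
        if entry and count > 1:
            warnings.append(
                f"duplicate source entry {entry!r}: only the last one is used"
            )

    for number, line in enumerate(source_lines, start=1):
        entry = line.strip()
        if not entry:
            warnings.append(f"line {number}: empty source entry")
        elif re.fullmatch(r"\d[\d\s.,]*", entry):
            # Starts with a digit and contains only digits, spaces, '.' or ','.
            warnings.append(f"line {number}: {entry!r} consists only of numbers")
        elif re.fullmatch(r"[^\W\d_]{2,3}", entry):
            # Exactly two or three letters, for example an acronym.
            warnings.append(
                f"line {number}: {entry!r} is a two- or three-letter word"
            )

    return warnings
```

For example, calling `lint_dictionary` with the source entries "SQL server", "SQL", "2023" and a second "SQL server" would flag the duplicate, the three-letter acronym and the number-only entry. "SQL server" and "sql server" are treated as two different entries, which is consistent with the case-sensitive matching described above.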
I hope this helps! Please let me know if you need further help with any of the above.
Regards
Yutong
- Please kindly accept the answer and vote 'Yes' if you found it helpful, to support the community. Thanks a lot.