Hi midhun,
Thanks for posting on Microsoft Q&A!
The “dog → [10,3,2]” example in the module is a static word vector: a fixed representation of the word on its own, independent of any sentence. In a real Transformer, text like “when my dog was” is first split into tokens, and each token is mapped to such an embedding. The encoder’s self-attention layers then refine these into contextual embeddings, so each token’s vector also reflects the surrounding words and the sentence’s meaning. The decoder attends to these contextual embeddings to generate output, such as a translation, focusing on the most relevant parts of the input. The module’s example is simplified for clarity; the sketch below makes the two steps concrete.
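To illustrate the difference, here is a minimal Python/NumPy sketch. Everything in it is a toy assumption for clarity: the four-word vocabulary, the 3-dimensional embedding values (reusing the module’s “dog → [10,3,2]” vector), and the simplified self-attention with no learned weight matrices or multiple heads. It is not the module’s actual model, just the shape of the idea:

```python
import numpy as np

# Toy vocabulary and 3-dimensional static embeddings (made-up values,
# except "dog", which reuses the module's [10, 3, 2] example).
# Real models use tens of thousands of tokens and hundreds of dimensions.
vocab = {"when": 0, "my": 1, "dog": 2, "was": 3}
embedding_table = np.array([
    [0.2, 1.1, 0.5],   # "when"
    [0.9, 0.3, 0.7],   # "my"
    [10.0, 3.0, 2.0],  # "dog" (the module's static vector)
    [0.4, 0.8, 0.1],   # "was"
])

def softmax(scores):
    # Numerically stable row-wise softmax.
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Step 1: tokenize the text and look up each token's static embedding.
tokens = ["when", "my", "dog", "was"]
X = embedding_table[[vocab[t] for t in tokens]]  # shape (4, 3)

# Step 2: simplified self-attention. Each output vector is a weighted
# mix of ALL token embeddings, so the vector for "dog" now carries
# information from its sentence context. (Real attention also applies
# learned query/key/value projections, omitted here.)
attn_weights = softmax(X @ X.T / np.sqrt(X.shape[1]))
contextual = attn_weights @ X  # contextual embeddings, shape (4, 3)

print("static 'dog':    ", X[2])
print("contextual 'dog':", contextual[2])
```

The key point the sketch shows: the static vector for “dog” is the same in every sentence, while the contextual vector depends on the other tokens it appears with, which is exactly what the encoder’s self-attention adds on top of the module’s simplified example.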
I hope this helps! If you have any further questions, feel free to ask.
If the information is useful, please accept the answer and upvote it to assist other community members.