fundamentals-machine-learning/8a-transformers

midhun 21 Reputation points
2025-06-23T14:19:23.8733333+00:00

https://learn.microsoft.com/en-us/training/modules/fundamentals-machine-learning/8a-transformers

In the image, the encoder's output is shown as a list of standalone word embeddings, e.g. "dog" - [10, 3, 2], "cat" - [10, 3, 1], "puppy" - [5, 2, 1]. However, the encoder actually produces embeddings for the tokens of the input sentence, e.g. "when" - [10, 3, 4], "my" - [10, 3, 5], "dog" - [10, 3, 6], "was" - [10, 3, 7], not a list of unrelated word embeddings like "dog", "cat", and "puppy".

This makes it confusing to understand what the decoder receives as input.

This question is related to the following Learning Module

Azure Training

Accepted answer
  1. Gowtham CP 6,020 Reputation points Volunteer Moderator
    2025-06-23T17:22:53.7633333+00:00

    Hi midhun,

    Thanks for posting on Microsoft Q&A!

    The “dog → [10,3,2]” example in the module is a static word vector, showing a word’s fixed meaning. In a real Transformer, text like “when my dog was” gets split into tokens, and each token starts with an embedding. The encoder’s self-attention then tweaks these into contextual embeddings, capturing the sentence’s meaning. The decoder uses these to generate output, like translations, by focusing on the right input parts. The module’s example is just simplified for clarity.
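    To make the distinction concrete, here is a minimal sketch of the idea described above: a toy vocabulary with made-up static 3-dimensional vectors for the tokens of "when my dog was", passed through a single unparameterized self-attention step so that each token's output vector becomes a context-weighted mix of all the input vectors. The vocabulary, vector values, and the simplified attention (no learned query/key/value projections, no positional encodings) are all illustrative assumptions, not the module's actual implementation.

    ```python
    import numpy as np

    # Static embeddings (illustrative values): one fixed vector per token,
    # regardless of the sentence the token appears in.
    static_embeddings = {
        "when": np.array([10.0, 3.0, 4.0]),
        "my":   np.array([10.0, 3.0, 5.0]),
        "dog":  np.array([10.0, 3.0, 6.0]),
        "was":  np.array([10.0, 3.0, 7.0]),
    }

    def self_attention(X):
        """Single-head self-attention without learned projections:
        each output row is a softmax-weighted mix of all input rows."""
        scores = X @ X.T / np.sqrt(X.shape[1])           # pairwise similarities
        weights = np.exp(scores - scores.max(axis=1, keepdims=True))
        weights /= weights.sum(axis=1, keepdims=True)    # row-wise softmax
        return weights @ X                               # contextual embeddings

    tokens = "when my dog was".split()
    X = np.stack([static_embeddings[t] for t in tokens])
    contextual = self_attention(X)

    for tok, vec in zip(tokens, contextual.round(2)):
        print(tok, vec)
    ```

    Note how the output vector for "dog" is no longer its static [10, 3, 6]: it has been shifted by the surrounding tokens, which is what "contextual embedding" means. The decoder attends over these per-token contextual vectors, not over a dictionary of unrelated words.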

    I hope this helps! If you have any further questions, feel free to ask.

    If the information is useful, please accept the answer and upvote it to assist other community members.


0 additional answers
