In Generative AI, are tokens converted in real time and kept in a cache, or has Microsoft converted and saved all of them?

Rajeev Gera 0 Reputation points
2025-01-08T23:57:14.4666667+00:00

When training text is converted into tokens, where are the tokens saved? Is the conversion done in real time for every query? Are the tokens saved in a cache? What if someone uses the same text for a query on a different computer? What happens when you run a new query? What happens when you close the browser and open the AI page again?

Has Microsoft, or any encoder or decoder, already converted all text and saved it as tokens?

This question is related to the following Learning Module

Azure Training

1 answer

  1. Rakesh Gurram 11,385 Reputation points Microsoft Vendor
    2025-01-09T08:58:55.49+00:00

    Hi Rajeev Gera,

    Thank you for reaching out to us on the Microsoft Q&A forum.

    In Generative AI systems like GPT, tokenization is the process of breaking text into smaller pieces called tokens, which the model uses to generate responses. This happens in real-time whenever you enter text. The tokenizer follows a fixed method to create tokens, but it doesn’t pre-process or save tokens for all possible inputs because that would require too much storage. Instead, tokenization is done only when needed.
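    As a minimal sketch of what happens on each request, here is how the tokenization step can be reproduced locally with the open-source tiktoken library. This assumes the cl100k_base encoding used by GPT-3.5/GPT-4-family models; the exact tokenizer behind a given deployment may differ, but the idea is the same: token IDs are computed on the fly from fixed rules, not looked up from a pre-saved store.

    ```python
    import tiktoken  # pip install tiktoken

    # Load a fixed set of vocabulary and merge rules (nothing is fetched per query).
    enc = tiktoken.get_encoding("cl100k_base")

    text = "Tokenization happens in real time."
    token_ids = enc.encode(text)   # text -> list of integer token IDs, computed now
    print(token_ids)               # prints a list of integers

    # Decoding reverses the mapping; the model itself never stores these IDs.
    print(enc.decode(token_ids))   # "Tokenization happens in real time."
    ```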

    During training, the model uses tokenized text to learn patterns and relationships. However, these tokens are not stored after training; the knowledge is kept in the model's learned weights, not as saved text or tokens. Each time you make a query, the text is tokenized again on the spot and processed right away.

    Sometimes, tokens may be temporarily cached during a session to speed things up, but they are not saved permanently. If you use the same text on a different device, it is simply tokenized again in real time. The tokenizer is deterministic: for a given tokenizer version, the same input always produces the same token IDs, no matter which machine runs it.
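    Because the rules are fixed, you can verify this determinism yourself with the same tiktoken sketch as above (again assuming the cl100k_base encoding):

    ```python
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    text = "The same text tokenizes identically everywhere."

    # Encoding twice (or on two different machines with the same tokenizer
    # version) yields identical token IDs, because the vocabulary and merge
    # rules are fixed, not derived from any session state.
    assert enc.encode(text) == enc.encode(text)

    # Round-trip: decoding the IDs recovers the original text exactly.
    assert enc.decode(enc.encode(text)) == text
    ```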

    If you close your browser and reopen it, any temporary tokens are cleared, and the tokenization process starts fresh for new queries. It’s not practical to pre-convert and save tokens for every possible text because there are endless possibilities. Instead, the model processes text dynamically as you use it, ensuring it can handle a wide variety of inputs efficiently.

    Please reach out to us if you have any other queries.

    If the information is helpful, please Accept Answer & Upvote so that it would be helpful to other community members.

