Hi @lakshmi
Welcome to Microsoft Q&A! Thanks for posting the question.
As stated in the documentation, the system message is not a strict rule that the model has to follow. It is more like a suggestion or a hint that the model can use to improve its output. The model may still generate answers that are longer than 50 words or do not use bullet lists, depending on the question and the private KB content.
The system message and the token limit are two different ways to control the output of the chat completion model. The system message is a soft constraint: it guides the model toward answers that match your expectations, but the model can deviate from it. The token limit is a hard constraint: generation stops once the token budget is used up, which can cut an answer off mid-sentence. Tokens also do not map one-to-one to words, so a token limit cannot be set to mean exactly 50 words. Neither setting therefore guarantees answers of 50 words or less. If you need to enforce a strict word limit, you can post-process the model output and truncate it when it exceeds the desired length.
Hope this helps.
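In case it is useful, here is a minimal sketch of the post-processing step in Python. The function name `truncate_to_word_limit` and its parameters are my own for illustration and are not part of any SDK. It assumes words are separated by whitespace and that ending on the last complete sentence is preferable to a hard cut.

```python
def truncate_to_word_limit(text: str, max_words: int = 50) -> str:
    """Return text unchanged if it fits, else cut it to at most max_words words.

    Hypothetical helper for illustration; words are whitespace-separated.
    """
    words = text.split()
    if len(words) <= max_words:
        return text

    truncated = " ".join(words[:max_words])

    # Prefer to end on a complete sentence if one ends inside the kept text.
    last_stop = max(truncated.rfind("."), truncated.rfind("!"), truncated.rfind("?"))
    if last_stop != -1:
        return truncated[: last_stop + 1]

    # No sentence boundary found: hard cut, marked with an ellipsis.
    return truncated + "..."
```

You would call this on the answer text returned by the model before showing it to the user. Note that a hard cut can drop part of a bullet list, so you may want to adapt the boundary logic to your answer format.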
Thanks,
Saurabh