Alright, so from what I've searched, there isn't any native configuration that solves this, regardless of whether you're using Azure OpenAI or OpenAI itself.
The best approaches I found for limiting the response are either a binary classifier (detect whether a prompt is on-topic or off-topic) or a similarity detector between the search results and the prompt. I went with a similarity detector that returns a score between 0 and 1 based on how similar the search results are to the user's prompt, and set a threshold: if the search results fit the prompt, the score will be high and the prompt is treated as on-topic.
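As a rough sketch of the similarity-detector idea, here's a minimal stdlib-only version using cosine similarity over bag-of-words vectors. Note this is just an illustration: in practice you'd likely score with embeddings (e.g. an embedding model's vectors) instead of raw word counts, and the 0.5 threshold here is an arbitrary assumption you'd tune on your own data.

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two texts (0 to 1)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

THRESHOLD = 0.5  # assumption: tune this against your own prompts/results

def is_on_topic(search_results: str, prompt: str) -> bool:
    # High score means the retrieved results actually match the prompt,
    # so we treat the prompt as on-topic; otherwise we can refuse to answer.
    return cosine_similarity(search_results, prompt) >= THRESHOLD
```

The same gating logic carries over unchanged if you swap the scoring function for embedding-based similarity: compute a score, compare to a threshold, and only answer when the prompt is on-topic.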
I really hope they add some sort of configuration for this, but I think it'll be challenging since it's very difficult to limit the model's response, and even in the playground the model sometimes hallucinates.