@Simone Gallo For function calling specifically, there is an intermediate step on the service side that suggests which functions to call, so more tokens are processed than appear in your input prompt.
Unfortunately, this is not documented at the moment, since the intermediate prompt is part of the service itself. There is an open discussion about this on the OpenAI Forums as well, which mentions some third-party libraries that have approximated these extra tokens through repeated trials.
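As a rough illustration of the approach those libraries take, here is a minimal sketch (assuming the `openai` and `tiktoken` packages; the endpoint, key, deployment name, and tool definition are placeholders, not values from this thread) that compares a local token count of the visible messages against the `prompt_tokens` the service reports. The gap approximates the undocumented function-calling overhead:

```python
# Sketch: approximate the hidden function-calling token overhead.
# Endpoint, key, and deployment name below are placeholders.
import tiktoken
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<your-key>",
    api_version="2024-02-01",
)

messages = [{"role": "user", "content": "What is the weather in Milan?"}]
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical example tool
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="<your-deployment>",  # placeholder deployment name
    messages=messages,
    tools=tools,
)

# Count only the tokens we can see locally. The exact chat/tool
# serialization the service uses is undocumented, so this is a floor,
# not an exact reconstruction; pick the encoding matching your model.
enc = tiktoken.get_encoding("cl100k_base")
visible = sum(len(enc.encode(m["content"])) for m in messages)

print(f"locally counted message tokens: {visible}")
print(f"reported prompt_tokens:         {response.usage.prompt_tokens}")
print(f"approx. hidden overhead:        {response.usage.prompt_tokens - visible}")
```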
As for the discrepancy itself, it is not something I have been able to observe; the metrics reported matched the exact numbers in the API response. It would be best to ensure no one else is making calls against the same instance, then compare the specific metric that looks incorrect against its API counterpart:

- Processed Prompt Tokens -> prompt_tokens
- Generated Completion Tokens -> completion_tokens
- Processed Inference Tokens -> total_tokens
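For the comparison itself, a minimal sketch that logs the three usage fields from a single response so they can be checked one-to-one against the portal metrics (again, endpoint, key, and deployment name are placeholders):

```python
# Sketch: print the exact usage fields from one response so they can be
# compared against the Azure Monitor metrics named above.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<your-key>",
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="<your-deployment>",  # placeholder deployment name
    messages=[{"role": "user", "content": "ping"}],
)

usage = response.usage
print(f"Processed Prompt Tokens     -> prompt_tokens:     {usage.prompt_tokens}")
print(f"Generated Completion Tokens -> completion_tokens: {usage.completion_tokens}")
print(f"Processed Inference Tokens  -> total_tokens:      {usage.total_tokens}")
```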
If you consistently see incorrect values while you are the only one making calls, it would be best to open a support ticket so this can be investigated further.