@SS Unfortunately, I don't believe APIM alone can handle your scenario. Instead, here are two options you could consider:
- Client-Side Logic
If you control the client application (web app, SPA, native, etc.), you could leverage client-side telemetry collection to report back all the data you need once the response is completely streamed to the client. This would be the simpler option if applicable to you.
- Proxy Service
This approach involves building a proxy service between APIM and Azure OpenAI that forwards the SSE request while inspecting the streamed chunks for the information you need and recording it. This is a bit more complex, but it covers every client application that uses your API, which matters if you do not control all of them.
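
To give a rough idea of the proxy option, here is a minimal Python sketch of just the "look into the chunks" part. It is not a full proxy and it is not tied to any web framework. The names `tap_sse_stream` and `record` are hypothetical, and I am assuming the upstream body is the usual `data: {json}` SSE format ending in `data: [DONE]`, with text in `choices[].delta.content` and, if your API version and request options return it, a `usage` object on one of the last events.

```python
import json
from typing import Callable, Iterable, Iterator


def tap_sse_stream(
    upstream: Iterable[bytes],
    record: Callable[[dict], None],
) -> Iterator[bytes]:
    """Yield upstream bytes unchanged while parsing SSE events on the side.

    `upstream` is any iterable of raw body chunks (hypothetical: whatever
    your HTTP client returns when streaming). `record` is a hypothetical
    callback that receives the collected stats when the stream ends.
    """
    buffer = b""
    stats = {"events": 0, "content_chars": 0, "usage": None}
    try:
        for chunk in upstream:
            yield chunk  # forward to the client first, so parsing adds no delay
            buffer += chunk
            # A network chunk can end mid-line, so only complete lines are parsed.
            while b"\n" in buffer:
                line, buffer = buffer.split(b"\n", 1)
                _inspect_line(line.strip(), stats)
        _inspect_line(buffer.strip(), stats)  # trailing line without a newline
    finally:
        # Runs on normal completion and also if the client disconnects early.
        record(stats)


def _inspect_line(line: bytes, stats: dict) -> None:
    """Update stats from a single SSE line; ignore anything unexpected."""
    if not line.startswith(b"data:"):
        return
    payload = line[len(b"data:"):].strip()
    if payload == b"[DONE]":
        return
    try:
        event = json.loads(payload)
    except ValueError:  # malformed JSON or invalid encoding
        return
    if not isinstance(event, dict):
        return
    stats["events"] += 1
    for choice in event.get("choices") or []:
        delta = choice.get("delta") or {}
        stats["content_chars"] += len(delta.get("content") or "")
    # Assumption: usage only appears if the upstream API is asked to include it.
    if event.get("usage"):
        stats["usage"] = event["usage"]
```

In a real proxy you would wrap the upstream response body with this generator and return it as the streaming response, with `record` writing to whatever telemetry store you use. Forwarding each chunk before parsing it keeps the streaming behavior the client sees unchanged.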