(nodejs azure/openai SDK) Can't get "streamChatCompletions" to work properly. Chunks arrive all at once after a long wait
Hello. I'm in the process of migrating from OpenAI to Azure OpenAI. I have set up my Azure OpenAI GPT-4 deployment and verified that it works. I'm now trying to finalize the implementation by enabling streaming responses. However, I'm facing an issue I don't know how to resolve. I have tried a few different implementations suggested in various GitHub repositories, on Stack Overflow, etc., and even ones suggested by GPT itself, but they all exhibit the same behavior, so I'm starting to suspect that the implementations aren't the problem, but something else is. Below is an example of one of the attempted implementations (using the @azure/openai Node.js SDK):
// client is an OpenAIClient from the @azure/openai package; client,
// deploymentId, requestBody, and streamCallback are defined elsewhere.
let result = '';
const events = await client.streamChatCompletions(deploymentId, requestBody.messages);

// Wrap the SDK's async iterable in a ReadableStream, then read it back out.
const stream = new ReadableStream({
  async start (controller) {
    for await (const event of events) {
      controller.enqueue(event);
    }
    controller.close();
  }
});

const reader = stream.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) {
    break;
  }
  for (const choice of value.choices) {
    // Log events that carry no content delta (role/metadata-only chunks).
    if (!choice.delta?.content) console.log('Debug 1', JSON.stringify(value, null, 2));
    if (choice.delta?.content !== undefined) {
      console.log('Chunk: ', choice.delta.content);
      result += choice.delta.content;
      streamCallback(result);
    }
  }
}
return result;
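For reference, the most minimal variant I tried consumes the async iterable returned by streamChatCompletions directly, without the ReadableStream wrapper (a sketch assuming the same client, deploymentId, requestBody, and streamCallback as above); it shows exactly the same bursty behavior:

// Minimal variant: iterate the SDK's async iterable directly.
// Assumes the same client/deploymentId/requestBody/streamCallback as above.
let result = '';
const events = await client.streamChatCompletions(deploymentId, requestBody.messages);
for await (const event of events) {
  for (const choice of event.choices) {
    if (choice.delta?.content !== undefined) {
      // Timestamp each chunk to make the arrival pattern visible.
      console.log(Date.now(), 'Chunk:', choice.delta.content);
      result += choice.delta.content;
      streamCallback(result);
    }
  }
}
return result;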
When this code runs, I soon get a log from "Debug 1" that looks like the following:
Debug 1 {
  "id": "chatcmpl-9ACt3...",
  "model": "gpt-4",
  "object": "chat.completion.chunk",
  "systemFingerprint": null,
  "created": "1970-01-20T19:36:59.845Z",
  "promptFilterResults": [],
  "choices": [
    {
      "index": 0,
      "finishReason": null,
      "delta": {
        "role": "assistant",
        "toolCalls": []
      },
      "contentFilterResults": {}
    }
  ]
}
Then nothing happens for a while (5-20 seconds, depending on the response length), after which I get a burst of stream events all at once, together containing the assistant's full response, followed by the finalizing event that closes the stream ("finishReason": "stop").
I would expect the chunks to flow in continuously from the beginning of the generation to the end, not to arrive all at once after a long wait. I could paste in a few more of the implementations I have tried, but as mentioned, they all show this behavior. So could it be an issue with my deployment? Am I missing a setting that needs to be turned on? Could it be a bug in the SDK? Or does the endpoint simply not return a streamed response the way I expect?
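One test I'm considering, to separate the SDK from the endpoint, is calling the REST API directly with "stream": true and logging each raw chunk with a timestamp as it arrives (a sketch for Node 18+; <resource>, <deployment>, the api-version, and the AZURE_OPENAI_KEY env var are placeholders):

// Sketch: call the Azure OpenAI REST endpoint directly to rule out the SDK.
// <resource>, <deployment>, and AZURE_OPENAI_KEY are placeholders.
const url = 'https://<resource>.openai.azure.com/openai/deployments/<deployment>'
  + '/chat/completions?api-version=2024-02-01';
const response = await fetch(url, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'api-key': process.env.AZURE_OPENAI_KEY
  },
  body: JSON.stringify({ messages: requestBody.messages, stream: true })
});

// Node 18+: the response body is a web ReadableStream, which is async-iterable.
const decoder = new TextDecoder();
for await (const chunk of response.body) {
  // If these raw SSE chunks also arrive in one burst, the buffering is
  // happening on the network/endpoint side rather than in the SDK.
  console.log(Date.now(), decoder.decode(chunk, { stream: true }));
}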
I hope someone can shed a bit of light onto this issue.
Thanks 🙏