Sorry, I don't have a C# environment, so I ran my Python code with your subscriptionKey and region. I believe the following Python code is essentially no different from the C# version. The code is as follows:
import requests
import base64
import json
import time
subscriptionKey = "62c2e33bc0ee44f4836f9bff74c65c6c"
region = "westeurope"
# a common WAVE header with zero audio length
# the streamed data has no header, but the API needs one to read the audio format, so this header must be posted as the first chunk of each query
WaveHeader16K16BitMono = bytes([ 82, 73, 70, 70, 78, 128, 0, 0, 87, 65, 86, 69, 102, 109, 116, 32, 18, 0, 0, 0, 1, 0, 1, 0, 128, 62, 0, 0, 0, 125, 0, 0, 2, 0, 16, 0, 0, 0, 100, 97, 116, 97, 0, 0, 0, 0 ])
# a generator which reads audio data chunk by chunk
# audio_source can be any audio input stream that provides a read() method, e.g. an audio file, a microphone, or a memory stream
def get_chunk(audio_source, chunk_size=1024):
    global uploadFinishTime
    yield WaveHeader16K16BitMono
    while True:
        #time.sleep(chunk_size / 32000) # to simulate human speaking rate
        chunk = audio_source.read(chunk_size)
        if not chunk:
            # record the moment the last chunk has been handed over
            uploadFinishTime = time.time()
            break
        yield chunk
# build request
url = "https://%s.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=zh-CN&profanity=masked" % region
headers = {
    'Accept': 'application/json;text/xml',
    'Connection': 'Keep-Alive',
    'Content-Type': 'audio/wav; codecs=audio/pcm; samplerate=16000',
    'Ocp-Apim-Subscription-Key': subscriptionKey,
    'Transfer-Encoding': 'chunked',
    'Expect': '100-continue'
}
audioFile = open("./source/voice2024-07-08-16-25-57.wav", 'rb')
# send request with chunked data
response = requests.post(url=url, data=get_chunk(audioFile), headers=headers)
getResponseTime = time.time()
audioFile.close()
resultJson = json.loads(response.text)
print(json.dumps(resultJson, indent=4))
print(resultJson["DisplayText"])
latency = getResponseTime - uploadFinishTime
print("Latency = %sms" % int(latency * 1000))
The result is as follows:

My question is as follows: why is the latency this time (4.5 s) longer than the previous 1.7 s? Is the difference related to file size, and if so, what is the specific reason?
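To make the file-size part of the question concrete, here is a minimal sketch (not part of the code I ran) that decodes the format fields of the header bytes used above and converts a raw PCM byte count into audio duration. The helper names `parse_fmt` and `pcm_duration_seconds` and the 320000-byte figure are made up for illustration, and the sketch assumes the canonical layout where the `fmt ` chunk starts at offset 12.

```python
import struct

# Same 46 bytes as WaveHeader16K16BitMono in the code above.
WAVE_HEADER = bytes([
    82, 73, 70, 70, 78, 128, 0, 0, 87, 65, 86, 69,
    102, 109, 116, 32, 18, 0, 0, 0, 1, 0, 1, 0,
    128, 62, 0, 0, 0, 125, 0, 0, 2, 0, 16, 0, 0, 0,
    100, 97, 116, 97, 0, 0, 0, 0,
])


def parse_fmt(header):
    """Decode the fmt fields, assuming the fmt chunk body starts at offset 20."""
    # Little-endian: format tag, channels, sample rate, byte rate,
    # block align, bits per sample (16 bytes in total).
    fmt_tag, channels, sample_rate, byte_rate, block_align, bits = \
        struct.unpack_from("<HHIIHH", header, 20)
    return {
        "format_tag": fmt_tag,
        "channels": channels,
        "sample_rate": sample_rate,
        "byte_rate": byte_rate,
        "block_align": block_align,
        "bits_per_sample": bits,
    }


def pcm_duration_seconds(num_audio_bytes, byte_rate):
    """Duration of raw PCM audio, given its size in bytes and the byte rate."""
    return num_audio_bytes / byte_rate


fmt = parse_fmt(WAVE_HEADER)
print(fmt)

# Hypothetical size, only to show the arithmetic: 320000 bytes of audio.
example_seconds = pcm_duration_seconds(320000, fmt["byte_rate"])
print("Audio duration = %.1fs" % example_seconds)
```

The byte rate decoded here is the same 32000 that appears in the commented-out `time.sleep(chunk_size / 32000)` line, so the number of bytes uploaded maps directly to seconds of speech that the service has to recognize.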
Thanks.