How to Use "Real-time speech to text" feature of Azure Cognitive Speech Service in ServiceNow?

Carmel Franco Raj 0 Reputation points
2025-05-19T10:59:36.57+00:00

How can I use the "Real-time speech to text" feature of the Azure Cognitive Speech Service in ServiceNow?

If any of you have implemented this or know how to do it, please let me know.

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

2 answers


  2. Jerald Felix 1,630 Reputation points
    2025-05-19T16:38:46.4133333+00:00

    Hello Carmel,

    There's no native ServiceNow plugin for Azure Speech-to-Text, but you can implement it either with a custom UI widget that streams audio to the Azure WebSocket API, or with an external integration using a MID Server, Integration Hub, or REST Messages.

    Option 1: Custom UI Component with WebSocket Client

    Use a Service Portal widget or a Now Experience UI Framework component to connect to Azure Speech.

    Steps:

    1. Create an Azure Speech resource
       - In the Azure portal, create a Speech resource.
       - Copy the key and region.
    2. **Open a WebSocket connection from the browser (via JavaScript)**
       - Capture the browser microphone (`getUserMedia`).
       - Stream the audio to the **Azure STT WebSocket endpoint**.
       - Parse and display the result in real time.
    3. **Embed this in a ServiceNow widget or UI page**
               
    

    GitHub sample JS client: https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/quickstart/javascript/browser
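
As a rough illustration of the browser-side steps above, here is a minimal sketch that uses the Speech SDK for JavaScript rather than a hand-written WebSocket client (the SDK manages the connection itself). The function name `startLiveTranscription` and its callback parameters are hypothetical, and `sdk` is assumed to be the SDK object (the global `SpeechSDK` when the browser bundle is loaded with a script tag). The language `en-US` is also an assumption.

```javascript
// Sketch only: `sdk` is the Speech SDK object, passed in as a parameter.
// The function name and the callback parameters are hypothetical.
function startLiveTranscription(sdk, subscriptionKey, region, onPartial, onFinal) {
  const speechConfig = sdk.SpeechConfig.fromSubscription(subscriptionKey, region);
  speechConfig.speechRecognitionLanguage = "en-US"; // assumed language

  // Uses the default browser microphone (the browser asks the user for access).
  const audioConfig = sdk.AudioConfig.fromDefaultMicrophoneInput();
  const recognizer = new sdk.SpeechRecognizer(speechConfig, audioConfig);

  // Fires repeatedly with interim text while the user is still speaking.
  recognizer.recognizing = (_sender, event) => onPartial(event.result.text);

  // Fires once per finished phrase; ignore results that contain no speech.
  recognizer.recognized = (_sender, event) => {
    if (event.result.reason === sdk.ResultReason.RecognizedSpeech) {
      onFinal(event.result.text);
    }
  };

  recognizer.startContinuousRecognitionAsync();
  // The caller can stop later with recognizer.stopContinuousRecognitionAsync().
  return recognizer;
}
```

In a widget client script this could be called as `startLiveTranscription(SpeechSDK, key, region, showInterim, appendFinal)`, where the two callbacks update the page. Putting the subscription key in browser code exposes it to users, so a short-lived token with `SpeechConfig.fromAuthorizationToken` is probably the safer choice outside of a prototype.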

    Option 2: Node.js Backend + ServiceNow Integration

    If WebSocket handling in the ServiceNow front end is too complex:

    1. Build a Node.js Express backend that:
       - Uses the Azure Speech SDK (npm package `microsoft-cognitiveservices-speech-sdk`)
       - Accepts audio input (from a browser or app)
       - Returns the transcript
    2. **Expose this backend as a REST API**
    3. In **ServiceNow**, create:
       - A REST Message to call the backend
       - A scripted UI to handle audio upload and fetch the transcription
    4. Use the transcription in Incident/Case/Record creation
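
The backend described above might look roughly like the following sketch. It assumes the client posts a short WAV file as the raw request body, and it recognizes a single utterance with `recognizeOnceAsync`, so longer recordings would need continuous recognition instead. The function names, the `/transcribe` route, the 10 MB limit, and the shape of `config` are all hypothetical. `sdk` and `express` stand for `require("microsoft-cognitiveservices-speech-sdk")` and `require("express")` and are passed in as parameters.

```javascript
// Sketch only. `sdk` and `express` are the required modules, passed in as
// parameters. Function names, the route, and the config shape are hypothetical.

// Transcribes one short WAV utterance held in a Buffer.
function transcribeWav(sdk, wavBuffer, config) {
  const speechConfig = sdk.SpeechConfig.fromSubscription(config.key, config.region);
  speechConfig.speechRecognitionLanguage = config.language || "en-US";
  const audioConfig = sdk.AudioConfig.fromWavFileInput(wavBuffer);
  const recognizer = new sdk.SpeechRecognizer(speechConfig, audioConfig);

  return new Promise((resolve, reject) => {
    recognizer.recognizeOnceAsync(
      (result) => {
        recognizer.close();
        if (result.reason === sdk.ResultReason.RecognizedSpeech) {
          resolve(result.text);
        } else {
          reject(new Error("Speech was not recognized"));
        }
      },
      (err) => {
        recognizer.close();
        reject(new Error(String(err)));
      }
    );
  });
}

// Wraps the function above in a REST endpoint that ServiceNow can call.
function createApp(express, sdk, config) {
  const app = express();
  // express.raw leaves the WAV bytes in req.body as a Buffer.
  app.post("/transcribe", express.raw({ type: "audio/wav", limit: "10mb" }), async (req, res) => {
    try {
      res.json({ transcript: await transcribeWav(sdk, req.body, config) });
    } catch (err) {
      res.status(422).json({ error: err.message });
    }
  });
  return app;
}
```

To run it, something like `createApp(require("express"), require("microsoft-cognitiveservices-speech-sdk"), { key: process.env.SPEECH_KEY, region: process.env.SPEECH_REGION }).listen(3000)` should work, with the environment variable names being an assumption. The ServiceNow REST Message would then POST the audio to `/transcribe` and read `transcript` from the JSON response.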
                  
    

    Option 3: Integration Hub + Azure REST API (Not Real-Time)

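    Use batch transcription with Integration Hub or a REST Message.

    It is not real-time, but it is easier to set up.

    You submit the audio (for batch transcription, as a URL to the audio file, for example in Azure Blob Storage), and Azure returns the transcript once the job has finished.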

    Azure Batch STT Docs
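
To make the batch option more concrete, the sketch below builds the HTTP request that creates a batch transcription job. It assumes v3.1 of the Speech to text REST API, so the path may differ for newer API versions. The function name and the `displayName` value are hypothetical. The returned fields map onto a ServiceNow REST Message's endpoint, HTTP headers, and content.

```javascript
// Sketch only: builds the request for creating a batch transcription job.
// Assumes v3.1 of the Speech to text REST API; the function name is hypothetical.
function buildBatchTranscriptionRequest(region, subscriptionKey, audioUrls, locale) {
  return {
    method: "POST",
    url: "https://" + region + ".api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions",
    headers: {
      "Ocp-Apim-Subscription-Key": subscriptionKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      contentUrls: audioUrls, // URLs Azure can read, e.g. blob URLs with a SAS token
      locale: locale,
      displayName: "ServiceNow batch transcription", // hypothetical job name
    }),
  };
}
```

The job runs asynchronously: the response to this request should contain a `self` URL that can be polled until the job's `status` is `Succeeded`, after which the result files can be downloaded.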

    Best Regards,

    Jerald Felix

