Can Conversational Language Understanding (CLU) analyze the intents of each user in a multi-user conversation?

png 10 Reputation points

I need to analyze a conversation between two humans, e.g.:

A: I want to buy this Samsung A20 phone. Is this available?

B: Which color are you looking for?

A: Black please

B: Let me check

B: Sorry, it is out of stock.

My requirement is: each time B says something, I need to determine in real time:

  • What A's intent is up to that point
  • The intent of B's latest utterance

The output (in brackets) would be something like this:

A: I want to buy this Samsung A20 phone. Is this available?

B: Which color are you looking for?

(output: A wants to buy a product and checks if it is available; B is asking about the color)

A: Black please

B: Let me check

(output: A wants to buy a product and checks if it is available; B is checking)

B: Sorry, it is out of stock.

(output: A wants to buy a product and checks if it is available; B says it is out of stock)

I read the CLU documentation, but it seems CLU can only classify a single utterance from a user, without considering the full context of the dialog.

So I wonder whether I can use CLU to build something that solves my problem. If not, any suggestions on which approaches I should use?

Azure AI Language
An Azure service that provides natural language capabilities including sentiment analysis, entity extraction, and automated question answering.

1 answer

  1. Amira Bedhiafi 12,141 Reputation points

    Based on what you describe, the system needs to maintain a conversation state (context) that is updated on every turn. That context is what lets it follow the flow of the conversation and track each speaker's changing intents and information.

    The model has to assign an intent class to every utterance. That means training it not only on what is said in single utterances but also on how those utterances relate to each other within a dialogue. For example, a question about color that follows an inquiry about product availability should be treated as part of the buying intent.

    Whenever B speaks, the system has to analyze that utterance in relation to the conversation so far. It must classify B’s reply (asking for further details, checking availability, or confirming that the item is out of stock) and update A’s intent based on the ongoing conversation.
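
    The per-turn logic described above can be sketched roughly as follows. This is a minimal illustration, not a CLU integration: `toy_classifier`, `DialogueTracker`, and all intent labels are hypothetical names, and the keyword rules only stand in for whatever single-utterance classifier you end up using.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


def toy_classifier(text: str) -> str:
    """Hypothetical stand-in for a real single-utterance intent classifier."""
    lowered = text.lower()
    if "out of stock" in lowered:
        return "ReportOutOfStock"
    if "buy" in lowered or "available" in lowered:
        return "BuyProductAndCheckAvailability"
    if "color" in lowered:
        return "AskColor"
    if "check" in lowered:
        return "Checking"
    return "None"


@dataclass
class DialogueTracker:
    """Accumulates A's intents and reports them each time B speaks."""
    classify: Callable[[str], str]
    a_intents: List[str] = field(default_factory=list)

    def observe(self, speaker: str, text: str) -> Optional[Tuple[List[str], str]]:
        intent = self.classify(text)
        if speaker == "A":
            # Remember each distinct, non-empty intent A has expressed so far.
            if intent != "None" and intent not in self.a_intents:
                self.a_intents.append(intent)
            return None
        # B spoke: return A's intents up to this point and B's current intent.
        return list(self.a_intents), intent


conversation = [
    ("A", "I want to buy this Samsung A20 phone. Is this available?"),
    ("B", "Which color are you looking for?"),
    ("A", "Black please"),
    ("B", "Let me check"),
    ("B", "Sorry, it is out of stock."),
]

tracker = DialogueTracker(classify=toy_classifier)
outputs = []
for speaker, text in conversation:
    result = tracker.observe(speaker, text)
    if result is not None:
        outputs.append(result)
        print(f"A: {result[0]}; B: {result[1]}")
```

    Keeping the classifier behind a plain callable means the keyword rules can later be swapped for a call to a trained model without changing the tracking code.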

    This has to happen in real time, so the system must process text, update the context, and classify intents quickly as each message arrives. In most cases that calls for optimized models and infrastructure that can sustain a high message throughput.

    To implement such a system, consider these approaches:

    Custom CLU Model Development: If your requirements are complex and your domain is specialized, such as retail or customer service, you may need to develop a custom CLU model. This involves creating a multi-turn conversation dataset, annotating it with intents and context, and then training a model on it using a machine learning framework.
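
    As a rough idea of what a multi-turn dataset could look like, the sketch below turns one labeled dialogue into training records that pair each utterance with the turns preceding it. The record fields, the `build_examples` helper, and the intent labels are all assumptions made for illustration; this is a generic format, not the CLU import schema.

```python
import json
from typing import Dict, List


def build_examples(dialogue: List[Dict[str, str]], window: int = 2) -> List[Dict[str, str]]:
    """Pair every labeled turn with up to `window` preceding turns as context."""
    examples = []
    for i, turn in enumerate(dialogue):
        start = max(0, i - window)
        context = " | ".join(f"{t['speaker']}: {t['text']}" for t in dialogue[start:i])
        examples.append({
            "context": context,
            "speaker": turn["speaker"],
            "text": turn["text"],
            "intent": turn["intent"],
        })
    return examples


# Hypothetical annotations for the first turns of the sample conversation.
labeled_dialogue = [
    {"speaker": "A", "text": "I want to buy this Samsung A20 phone. Is this available?",
     "intent": "BuyProductAndCheckAvailability"},
    {"speaker": "B", "text": "Which color are you looking for?", "intent": "AskColor"},
    {"speaker": "A", "text": "Black please", "intent": "ProvideColor"},
]

examples = build_examples(labeled_dialogue)
for example in examples:
    print(json.dumps(example))
```

    With records like these, "Black please" is no longer an isolated utterance: its context shows that it answers a color question inside a purchase dialogue.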

    Use of Pretrained Models with Fine-Tuning: Pre-trained models available through Azure AI Language or other NLP services (e.g., Google’s Dialogflow, IBM Watson Assistant) provide a starting point for understanding conversational context and intent. These models can often be fine-tuned on a dataset specific to your use case.

    Integration of Contextual Memory: Your system should have a contextual memory that is updated in real time and fed into intent classification. This may require bespoke development or existing tools that provide context-management capabilities.
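
    One simple form of contextual memory is a sliding window over the most recent turns, rendered into a single string that is passed to the classifier together with the new utterance. The sketch below assumes that design; `ContextMemory`, its `max_turns` parameter, and the `speaker: text | ...` rendering are hypothetical choices, not part of any Azure SDK.

```python
from collections import deque
from typing import Deque, Tuple


class ContextMemory:
    """Keeps only the most recent turns of a conversation."""

    def __init__(self, max_turns: int = 4) -> None:
        # A deque with maxlen silently drops the oldest turn when full.
        self._turns: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def add(self, speaker: str, text: str) -> None:
        self._turns.append((speaker, text))

    def render(self) -> str:
        """Return the remembered turns as one string for the classifier input."""
        return " | ".join(f"{speaker}: {text}" for speaker, text in self._turns)


memory = ContextMemory(max_turns=2)
memory.add("A", "I want to buy this Samsung A20 phone. Is this available?")
memory.add("B", "Which color are you looking for?")
memory.add("A", "Black please")
print(memory.render())
```

    A bounded window keeps the classifier input short and the per-message cost constant, at the price of forgetting older turns. If A's original request must never be dropped, store it separately from the window.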

    If the documentation implies that CLU models are limited to single-utterance analysis and do not take the full dialogue context into account, that may reflect a simpler target use case or a specific limitation of the service. For complex conversational understanding in a real-time app, you will usually need more advanced features or have to extend the system so that it preserves context across the conversation.