Helpful Terminology for Conversational AI

Created by Nitin Solanki, Modified on Thu, 8 Sep, 2022 at 4:34 PM by Nitin Solanki

Multi-expert dialog

In the SimInsights Conversational AI (SCAI) architecture, conversational skills are called experts. An application built on this architecture often contains many experts, hence the term multi-expert. Each expert is responsible for a specific skill (or behavior) exhibited by the conversational AI. For example, you might have one expert that provides help with using the application, another that guides the user through a task, and a third that takes verbal notes. The SCAI multi-expert dialog engine automatically activates the right expert for each input. While each expert is often very simple to implement, together they can form a highly capable conversational AI.
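The idea of routing an input to the right expert can be illustrated with a minimal sketch. This is not the SCAI engine itself: the expert interface (a function returning a response or None), the two example experts, and the dispatch function are all hypothetical names chosen for illustration.

```python
from typing import Callable, List, Optional

# Assumption: an expert is modeled as a function that returns a response
# string, or None when the input is not relevant to its skill.
Expert = Callable[[str], Optional[str]]


def help_expert(utterance: str) -> Optional[str]:
    """Hypothetical expert that explains how to use the application."""
    if "help" in utterance.lower():
        return "You can say 'next step' or 'take a note'."
    return None


def notes_expert(utterance: str) -> Optional[str]:
    """Hypothetical expert that records a verbal note."""
    prefix = "take a note"
    if utterance.lower().startswith(prefix):
        return "Noted: " + utterance[len(prefix):].strip(" :,")
    return None


def dispatch(utterance: str, experts: List[Expert]) -> Optional[str]:
    """Return the response of the first expert that activates, else None."""
    for expert in experts:
        response = expert(utterance)
        if response is not None:
            return response
    return None


experts = [help_expert, notes_expert]
print(dispatch("I need help", experts))
print(dispatch("Take a note: check valve 3", experts))  # Noted: check valve 3
```

Each expert stays small and independent here; the combined behavior comes from the dispatcher trying them in turn.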

Experts

Experts are software implementations that can analyze user input and contextual information. Experts can communicate with other experts. When the skill/behavior implemented within an expert is activated (usually when the user input is relevant to the skill), the expert produces a response.

As mentioned above, multiple experts can serve each application (i.e., a simulation). Occasionally experts collide, that is, more than one generates a response to the same input. These collisions are typically identified and resolved during development and testing. Each expert listed below is labeled either Deployed or Standalone. Deployed experts have been tested alongside a host of other experts and are known not to collide with any of them. Standalone experts have only been tested in their own environments, not together with other experts. Further testing and development are needed before Standalone experts can be deployed in full projects, but we aim to support them in future builds.
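One way to surface collisions during testing is to run a set of test utterances through every expert and report the inputs that more than one expert answers. The sketch below assumes the same hypothetical expert interface (a function returning a response or None); find_collisions and the two toy experts are illustrative names, not part of SCAI.

```python
from typing import Callable, Dict, List, Optional

# Assumption: an expert returns a response string, or None if it does not activate.
Expert = Callable[[str], Optional[str]]


def find_collisions(experts: Dict[str, Expert],
                    test_inputs: List[str]) -> Dict[str, List[str]]:
    """Map each test input answered by two or more experts to those experts' names."""
    collisions: Dict[str, List[str]] = {}
    for utterance in test_inputs:
        responders = [name for name, expert in experts.items()
                      if expert(utterance) is not None]
        if len(responders) > 1:
            collisions[utterance] = responders
    return collisions


def quit_expert(utterance: str) -> Optional[str]:
    return "Goodbye." if "quit" in utterance.lower() else None


def restart_expert(utterance: str) -> Optional[str]:
    return "Restarting." if "restart" in utterance.lower() else None


experts = {"quit": quit_expert, "restart": restart_expert}
report = find_collisions(experts, ["quit", "restart", "quit and restart"])
print(report)  # {'quit and restart': ['quit', 'restart']}
```

A report like this only covers the inputs that were tested, which is consistent with Standalone experts needing further testing before they are combined with others.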

List of experts developed so far

HM Graph Expert - Deployed

The HMGraphExpert is a script-based interaction module. The script can have multiple steps and can branch to alternate steps based on user input. Typically, this expert is activated in training simulations where the user is guided through a scripted interaction.

This expert can control the flow of a simulation by directing performances by virtual persons/objects as well as presenting audio/visual feedback to the user. The script used by this expert can be authored with our dialog authoring tools or coded directly in its native JSON format.
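A branching script of this kind can be pictured as a small graph of steps. The JSON layout below is purely hypothetical (the native SCAI format is not shown in this article), and ScriptRunner is an illustrative name; the sketch only shows how steps and branches might drive a guided interaction.

```python
import json
import re

# Hypothetical script layout: each step has a prompt and keyword -> next-step branches.
SCRIPT_JSON = """
{
  "start": "greet",
  "steps": {
    "greet": {"prompt": "Are you ready to begin?", "branches": {"yes": "task", "no": "wait"}},
    "wait": {"prompt": "Tell me when you are ready.", "branches": {"ready": "task"}},
    "task": {"prompt": "Pick up the wrench.", "branches": {}}
  }
}
"""


class ScriptRunner:
    """Walks through the steps of a branching script (illustrative only)."""

    def __init__(self, script: dict):
        self.steps = script["steps"]
        self.current = script["start"]

    def prompt(self) -> str:
        return self.steps[self.current]["prompt"]

    def advance(self, utterance: str) -> bool:
        """Follow the first branch whose keyword appears as a word in the utterance."""
        words = set(re.findall(r"[a-z']+", utterance.lower()))
        for keyword, target in self.steps[self.current]["branches"].items():
            if keyword in words:
                self.current = target
                return True
        return False


runner = ScriptRunner(json.loads(SCRIPT_JSON))
print(runner.prompt())          # Are you ready to begin?
runner.advance("No, not yet")
print(runner.prompt())          # Tell me when you are ready.
runner.advance("OK, I am ready")
print(runner.prompt())          # Pick up the wrench.
```

When advance returns False, the input did not match any branch of the current step, which is the kind of unscripted input another expert would need to handle.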

Key Phrase Expert - Deployed

The KeyPhraseExpert responds to specific phrases, for example “help me”, “restart the simulation”, or “quit”. It maps user utterances that match (or are similar to) a key phrase to the corresponding simulation action. This expert is commonly used in all applications, and it often works in tandem with the HMGraphExpert to handle unscripted actionable inputs. It can manage any reaction to a phrase, whether that is direct feedback to the user, a change in an object's state, a direction to an NPC, etc.
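Mapping similar utterances to key phrases can be sketched with fuzzy string matching. The example below uses difflib from the Python standard library as a stand-in; the article does not say how SCAI measures similarity, and the action names and the 0.8 cutoff are assumptions made for illustration.

```python
import difflib
from typing import Optional

# Hypothetical mapping from key phrases to simulation action names.
KEY_PHRASES = {
    "help me": "SHOW_HELP",
    "restart the simulation": "RESTART",
    "quit": "QUIT",
}


def match_action(utterance: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the action for the closest key phrase, or None if nothing is close enough."""
    normalized = utterance.lower().strip(" .!?")
    matches = difflib.get_close_matches(normalized, list(KEY_PHRASES), n=1, cutoff=cutoff)
    return KEY_PHRASES[matches[0]] if matches else None


print(match_action("Help me!"))            # SHOW_HELP
print(match_action("restart simulation"))  # RESTART
print(match_action("what time is it"))     # None
```

Raising the cutoff makes matching stricter; lowering it tolerates more variation at the risk of triggering the wrong action.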

Catch All Expert - Deployed

As the name suggests, the CatchAllExpert responds to all user utterances. Its response is used only when none of the other experts respond. Usually it replies with a phrase similar to “I didn’t catch that” or “I couldn’t understand that”. This expert is necessary in all applications; however, its behavior can be customized as appropriate for the application.
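The fallback role described above can be sketched as follows. The class and function here are hypothetical illustrations rather than SCAI code: the catch-all always produces a response, but that response is used only if every other expert declines.

```python
from typing import Callable, List, Optional

# Assumption: a regular expert returns a response string, or None if it does not activate.
Expert = Callable[[str], Optional[str]]


class CatchAllExpert:
    """Responds to every utterance; the message can be customized per application."""

    def __init__(self, message: str = "I didn't catch that."):
        self.message = message

    def __call__(self, utterance: str) -> str:
        return self.message


def respond(utterance: str, experts: List[Expert], catch_all: CatchAllExpert) -> str:
    """Use the first regular expert that responds; fall back to the catch-all otherwise."""
    for expert in experts:
        response = expert(utterance)
        if response is not None:
            return response
    return catch_all(utterance)


def quit_expert(utterance: str) -> Optional[str]:
    return "Goodbye." if "quit" in utterance.lower() else None


fallback = CatchAllExpert("Sorry, I couldn't understand that.")
print(respond("quit", [quit_expert], fallback))         # Goodbye.
print(respond("sing a song", [quit_expert], fallback))  # Sorry, I couldn't understand that.
```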

Azure PDF Expert - Standalone

The AzurePDFExpert is a conversational question-answering expert. It also demonstrates the ability to attach the behavior of an expert to an external knowledge source/API. Answers to free-form questions are generated using the Microsoft Azure PDF QnA system, which digests the information in a PDF file (such as a user manual) and answers questions based on the contents of the PDF. When the external system generates a relevant response, this expert allows SCAI to present that response to the user.
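The general pattern of wrapping an external question-answering service in an expert can be sketched without calling any real service. Everything below is hypothetical: the backend signature, the confidence threshold, and the stub that stands in for the external system. It does not reproduce the Azure API.

```python
from typing import Callable, Optional, Tuple

# Hypothetical backend signature: question -> (answer text, confidence in [0, 1]).
QABackend = Callable[[str], Tuple[str, float]]


class ExternalQAExpert:
    """Presents an external answer only when it looks relevant enough."""

    def __init__(self, backend: QABackend, min_confidence: float = 0.5):
        self.backend = backend
        self.min_confidence = min_confidence

    def respond(self, question: str) -> Optional[str]:
        answer, confidence = self.backend(question)
        if answer and confidence >= self.min_confidence:
            return answer
        return None  # Let other experts (or the catch-all) handle the input.


def stub_backend(question: str) -> Tuple[str, float]:
    """Stand-in for the external service; returns canned answers for illustration only."""
    if "reset" in question.lower():
        return ("Hold the power button for ten seconds.", 0.9)
    return ("", 0.0)


expert = ExternalQAExpert(stub_backend)
print(expert.respond("How do I reset the device?"))  # Hold the power button for ten seconds.
print(expert.respond("What is the weather?"))        # None
```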

Gaze Expert - Standalone

The GazeExpert demonstrates our ability to combine multimodal cues in conversational interactions. In the case of the GazeExpert, gaze tracking is used to identify which virtual objects/persons the user is looking towards. Our current implementation uses a conical gaze tracking system. This ability is used to perform behaviors such as:

Answering questions like “what is that”
Tracking whether the user is paying attention to the right objects in a scene. In training applications, this has been used to provide feedback to the user when they are not paying attention to the relevant objects.
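A conical gaze test can be sketched geometrically: an object counts as looked at when the angle between the gaze direction and the direction from the eye to the object is within the cone's half-angle. The function names, the 10-degree half-angle, and the scene coordinates below are assumptions for illustration, not details of the SCAI implementation.

```python
import math
from typing import Dict, Optional, Tuple

Vec3 = Tuple[float, float, float]


def angle_between_deg(a: Vec3, b: Vec3) -> float:
    """Angle in degrees between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norms))))


def gazed_object(eye: Vec3, direction: Vec3, objects: Dict[str, Vec3],
                 half_angle_deg: float = 10.0) -> Optional[str]:
    """Return the object closest to the gaze axis within the cone, or None.

    Assumes `direction` is a non-zero vector.
    """
    best_name, best_angle = None, half_angle_deg
    for name, position in objects.items():
        to_object = tuple(p - e for p, e in zip(position, eye))
        if all(c == 0 for c in to_object):
            continue  # Object sits exactly at the eye position; no direction to test.
        angle = angle_between_deg(direction, to_object)
        if angle <= best_angle:
            best_name, best_angle = name, angle
    return best_name


# Hypothetical scene: positions in meters, eye at standing height.
scene = {"desk": (0.2, 1.5, 3.0), "bin": (2.0, 0.5, 2.0)}
eye = (0.0, 1.6, 0.0)
print(gazed_object(eye, (0.0, 0.0, 1.0), scene))  # desk
print(gazed_object(eye, (1.0, 0.0, 0.0), scene))  # None
```

The returned name could answer "what is that", and a None result while a task object is in the scene could trigger attention feedback.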

Knowledge Graph Expert - Standalone

The KnowledgeGraphExpert enables knowledge-based conversational question answering. The behaviors supported by this expert include:

Finding objects in the scene by name, with utterances like “where is the desk”
Using context clues about an item to identify other objects in the scene, e.g. “What is the object to the left of the bin?”
Providing specific details about an object, e.g. “How much does the box weigh?”
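The three behaviors above map naturally onto lookups in a graph of (subject, relation, object) triples. The sketch below is a hypothetical illustration: the triples, relation names, and helper functions are invented for this example and do not describe how SCAI stores or queries its knowledge graph.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical scene knowledge as (subject, relation, object) triples.
TRIPLES: List[Triple] = [
    ("desk", "located_in", "north corner"),
    ("lamp", "left_of", "bin"),
    ("box", "weight", "2 kg"),
]


def objects_of(subject: str, relation: str) -> List[str]:
    """Values v such that (subject, relation, v) is in the graph."""
    return [o for s, r, o in TRIPLES if s == subject and r == relation]


def subjects_of(relation: str, obj: str) -> List[str]:
    """Subjects s such that (s, relation, obj) is in the graph."""
    return [s for s, r, o in TRIPLES if r == relation and o == obj]


# "Where is the desk?"
print(objects_of("desk", "located_in"))  # ['north corner']
# "What is the object to the left of the bin?"
print(subjects_of("left_of", "bin"))     # ['lamp']
# "How much does the box weigh?"
print(objects_of("box", "weight"))       # ['2 kg']
```

Turning a spoken question into one of these lookups is a separate step that this sketch leaves out.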