How to use Task Confidence

Task Confidence gives you more granular control over how you handle the output of Autopilot’s NLU ("Natural Language Understanding") engine. When a user’s utterance triggers a task, Autopilot includes the following in its request to your application:

  • a confidence score, via the CurrentTaskConfidence parameter
  • the unique name of the task with the second-highest probability of matching that input, via the NextBestTask parameter

This data is also available on the Queries resource via the confidence and next_best_task attributes in the results object. NextBestTask is null if the fallback task is selected as the next best task.
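As a minimal sketch of reading these two parameters, the snippet below parses a request body with the Python standard library. It assumes the parameters arrive form-encoded and that a null NextBestTask shows up as a missing or empty field; the function name and the sample body are made up for illustration.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs


def read_task_confidence(form_body: str) -> Tuple[float, Optional[str]]:
    """Extract CurrentTaskConfidence and NextBestTask from a form-encoded body."""
    # parse_qs maps each key to a list of values and drops blank values.
    params = parse_qs(form_body)
    confidence = float(params["CurrentTaskConfidence"][0])
    # Assumption: a null NextBestTask arrives as a missing or empty field.
    next_best = params.get("NextBestTask", [""])[0] or None
    return confidence, next_best


# Hypothetical request body, shortened to the relevant parameters.
sample = (
    "CurrentTask=reschedule_appointment"
    "&CurrentTaskConfidence=0.7"
    "&NextBestTask=cancel_appointment"
)
print(read_task_confidence(sample))  # (0.7, 'cancel_appointment')
```

In a real application your web framework would normally do the form parsing for you; the point here is only which two fields to read and how to treat a missing next-best task.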

The confidence score is a number between 0 and 1. Because Autopilot uses different machine learning models depending on the nature and quantity of your training data, the score is not a raw probability. It is a relative measure computed from the probabilities the model returns for its two most probable tasks: (probability#1 - probability#2) / probability#1.

A high confidence score indicates a larger difference between the probabilities of the top two tasks identified by the model. A low score means the model found those two tasks almost equally likely, so the task confidence feature is most useful for fine-tuning the user experience when the NLU engine returns a low confidence score.
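To make the formula concrete, here is a small worked sketch. The per-task probabilities are invented for illustration (Autopilot reports only the resulting score, not the raw probabilities), and the function name is hypothetical.

```python
from typing import Dict


def task_confidence(probabilities: Dict[str, float]) -> float:
    """Relative gap between the two most probable tasks."""
    if len(probabilities) < 2:
        raise ValueError("need at least two task probabilities")
    top, second = sorted(probabilities.values(), reverse=True)[:2]
    # (probability#1 - probability#2) / probability#1
    return (top - second) / top


# Two close candidates give a lower score than one clear winner.
ambiguous = {"reschedule_appointment": 0.5, "cancel_appointment": 0.15}
clear = {"reschedule_appointment": 0.9, "cancel_appointment": 0.05}

print(round(task_confidence(ambiguous), 2))  # 0.7
print(round(task_confidence(clear), 2))      # 0.94
```

Because the score is a ratio, two tasks with equal probabilities always give 0, whatever their absolute values.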

There are three recommended applications of this feature that can be used independently or in conjunction with each other:

  • Disambiguation
  • Setting a confidence threshold
  • Training and annotation

Clarify intent with Disambiguation

Disambiguation means asking the user to clarify their intent when a low confidence score is received. For example, consider an appointment management bot that, among other things, helps users cancel and reschedule appointments via two tasks: cancel_appointment and reschedule_appointment. If a user says "I need help with my appointment", the model will likely pick one of these two tasks with a low confidence score and return the other as the next-best task.

This information is included in Autopilot’s request to your application as follows:

| Parameter | Description | Example |
|---|---|---|
| AccountSid | Your Twilio account ID. It is 34 characters long, and always starts with the letters AC. | ACXXXXX |
| AssistantSid | The Autopilot assistant ID. It is 34 characters long, and always starts with the letters UA. | UAXXXXX |
| DialogueSid | The session identifier. It is 34 characters long, and always starts with the letters UK. | UKXXXXX |
| UserIdentifier | The unique user identifier coming from the channel. For Voice and SMS it will be the user's phone number. | +18304765664 |
| CurrentInput | The last thing the user said. | "I need help with my appointment" |
| CurrentTask | The user's current task. | reschedule_appointment |
| Field_{field-name}_Value | The key-value pair of the field value recognized. A different key-value pair will be sent for each field value. | Field_CLAIM_NUMBER_Value |
| Field_{field-name}_Type | The key-value pair of the field type recognized. A different key-value pair will be sent for each field type. | Twilio.ALPHANUMERIC |
| DialoguePayloadUrl | A URL to the Dialogue JSON payload that contains the context and data collected during the Autopilot session. | https://autopilot.twilio.com/v1/Assistants/UAXXXX/Dialogues/UKXXXX |
| Memory | A JSON payload that contains all the Autopilot memory values. NOTE: Memory is only sent in POST requests to prevent query params from getting truncated. | See the example Memory payload after this table. |
| Channel | The channel on which the interaction is taking place. | SMS |
| CurrentTaskConfidence | The confidence score for the task detected. | 0.7 |
| NextBestTask | The task with the next highest confidence score. | cancel_appointment |

Example Memory payload:

    {
      "twilio": {
        "collected_data": {
          "get_insurance": {
            "answers": {
              "car_model": {
                "answer": "3 Series",
                "filled": true,
                "confirmed": false,
                "attempts": 1
              },
              "car_year": {
                "answer": "2016",
                "filled": true,
                "type": "Twilio.NUMBER",
                "confirmed": false,
                "attempts": 1
              },
              "car_make": {
                "answer": "It's a BMW",
                "filled": true,
                "confirmed": false,
                "attempts": 1
              },
              "car_state": {
                "answer": "California",
                "type": "Twilio.US_STATE",
                "filled": true,
                "attempts": 3,
                "confirmed": false
              }
            },
            "date_completed": "",
            "date_started": "2019-08-03T00:13:18Z",
            "status": "in-progress"
          }
        }
      }
    }

In this scenario, you can ask the user to clarify or disambiguate with a response along these lines: "I can help you with that. Would you like to reschedule your appointment or cancel your appointment?"

With Autopilot's Disambiguation feature, you create a smarter user experience instead of triggering a task that may not match the user’s intent.
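One way your application might build such a response is sketched below. It assumes the reply is an Autopilot Actions JSON document using the say and listen actions, with listen restricted to the two candidate tasks. The phrase mapping and function name are hypothetical, so check the Actions reference before relying on the exact shape.

```python
import json
from typing import Dict, Optional

# Hypothetical mapping from task unique names to user-facing phrases.
TASK_PHRASES: Dict[str, str] = {
    "reschedule_appointment": "reschedule your appointment",
    "cancel_appointment": "cancel your appointment",
}


def disambiguation_actions(current_task: str, next_best_task: Optional[str]) -> Optional[dict]:
    """Build a reply asking the user to choose between the two candidate tasks."""
    if next_best_task is None:
        # The fallback task was next best, so there is no alternative to offer.
        return None
    first = TASK_PHRASES[current_task]
    second = TASK_PHRASES[next_best_task]
    return {
        "actions": [
            {"say": f"I can help you with that. Would you like to {first} or {second}?"},
            # Assumption: limit the next turn to the two candidate tasks.
            {"listen": {"tasks": [current_task, next_best_task]}},
        ]
    }


reply = disambiguation_actions("reschedule_appointment", "cancel_appointment")
print(json.dumps(reply, indent=2))
```

Restricting the listen step to the two candidates is a design choice: it keeps the follow-up answer from being routed to an unrelated task.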

Specify confidence thresholds

Your application can also specify confidence thresholds to decide how to respond to the user’s query. These thresholds can then be used to decide whether to trigger a disambiguation response, trigger the fallback task, or hand off the conversation to a human agent.

Continuing the example of the appointment management bot, let’s say Autopilot provides a confidence score of 0.2 for the query "I can’t find my appointment". If you set a fallback threshold of 0.3 and a disambiguation threshold of 0.5, your application responds with the fallback task instead of a disambiguation flow, because 0.2 falls below the fallback threshold. The response in the fallback task can be used to get the user back on track with the bot.
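The threshold logic in this example can be sketched as a small routing function. The 0.3 and 0.5 values come from the example above and are not recommendations. The function name, the returned labels, and the choice to treat each threshold as an exclusive upper bound are assumptions.

```python
from typing import Optional

FALLBACK_THRESHOLD = 0.3        # below this, use the fallback task
DISAMBIGUATION_THRESHOLD = 0.5  # below this, ask the user to clarify


def choose_response(confidence: float, next_best_task: Optional[str]) -> str:
    """Decide how to respond based on the task confidence score."""
    if confidence < FALLBACK_THRESHOLD:
        return "fallback"
    if confidence < DISAMBIGUATION_THRESHOLD and next_best_task is not None:
        return "disambiguate"
    # Confident enough, or no second candidate to offer: run the detected task.
    return "run_task"


print(choose_response(0.2, "cancel_appointment"))  # fallback
print(choose_response(0.4, "cancel_appointment"))  # disambiguate
print(choose_response(0.7, "cancel_appointment"))  # run_task
```

A hand-off to a human agent could be added as a further branch in the same way, for example after repeated low-confidence turns.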

Many factors, such as the nature of the training data and customer expectations, will differ between bot types and the experiences you want to design for your end users. Therefore, it's not possible to recommend a specific value for each threshold; the right values depend on your use case(s).

Training and annotation

The confidence score is also recorded for each query on the Queries page in the Autopilot console. The page lets you filter queries by confidence score, so you can focus your training and annotation efforts on queries that have low confidence scores.
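If you prefer to triage queries in code rather than in the console, the same filtering can be sketched over records you have already fetched from the Queries resource. The record shape below is an assumption (a query field plus a results object carrying confidence and next_best_task), and the sample data is invented.

```python
from typing import Dict, List


def low_confidence_queries(queries: List[Dict], threshold: float = 0.5) -> List[Dict]:
    """Return queries scoring below the threshold, least confident first."""
    flagged = [q for q in queries if q["results"]["confidence"] < threshold]
    return sorted(flagged, key=lambda q: q["results"]["confidence"])


# Invented sample records for illustration.
sample_queries = [
    {"query": "I need help with my appointment",
     "results": {"task": "reschedule_appointment", "confidence": 0.4,
                 "next_best_task": "cancel_appointment"}},
    {"query": "Cancel my appointment",
     "results": {"task": "cancel_appointment", "confidence": 0.95,
                 "next_best_task": "reschedule_appointment"}},
    {"query": "I can't find my appointment",
     "results": {"task": "reschedule_appointment", "confidence": 0.2,
                 "next_best_task": None}},
]

for q in low_confidence_queries(sample_queries):
    print(q["results"]["confidence"], q["query"])
```

Sorting by ascending confidence puts the queries the model was least sure about at the top of the annotation list.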
