How to use Task Confidence

Task Confidence gives you more granular control over how you handle the output of Autopilot’s NLU engine. When a user’s utterance triggers a task, Autopilot’s request to your application includes two additional parameters: CurrentTaskConfidence, the confidence score for the triggered task, and NextBestTask, the unique name of the task with the second-highest probability of matching that input. This data is also available on the Queries resource via the confidence and next_best_task attributes in the results object. NextBestTask will be null if the fallback task is selected as the next best task.

The confidence score is a number between 0 and 1. Because Autopilot uses different machine learning models depending on the nature and quantity of your training data, the score is expressed as a relative measure of the probabilities the model returns for its two most probable tasks rather than as a raw probability: confidence = (probability#1 - probability#2) / probability#1, where probability#1 and probability#2 are the highest and second-highest task probabilities.

A high confidence score indicates a large gap between the probabilities of the top two tasks identified by the model; a low score means the model found the two tasks almost equally likely. Task confidence is therefore most useful for fine-tuning the user experience when the NLU engine returns a low confidence score.
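To make the formula concrete, here is a minimal Python sketch of the calculation. The function name and the sample probabilities are illustrative assumptions; Autopilot computes the score itself and does not expose the underlying probabilities in the request.

```python
def task_confidence(p_top: float, p_second: float) -> float:
    """Relative gap between the two highest task probabilities.

    Illustrative helper only: mirrors (probability#1 - probability#2) / probability#1.
    """
    if not 0 < p_top <= 1 or not 0 <= p_second <= p_top:
        raise ValueError("expected 0 <= p_second <= p_top <= 1 and p_top > 0")
    return (p_top - p_second) / p_top


# Made-up probabilities: a near tie versus a clear winner.
near_tie = task_confidence(0.45, 0.40)
clear_winner = task_confidence(0.90, 0.05)
print(f"near tie: {near_tie:.2f}, clear winner: {clear_winner:.2f}")
```

With these made-up inputs the near tie scores roughly 0.11 and the clear winner roughly 0.94, which is why a low score is a signal to ask the user for clarification.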

There are three recommended applications of this feature that can be used independently or in conjunction with each other:

  • Disambiguation
  • Setting confidence thresholds
  • Training and annotation

Disambiguation

Disambiguation means asking the user to clarify their intent when a low confidence score is received. For example, consider an appointment management bot that, among other things, helps users cancel and reschedule appointments via two tasks: cancel_appointment and reschedule_appointment. If a user says ‘I need help with my appointment’, the model will likely pick one of these two tasks with a low confidence score and return the other as the next best task. This information is included in Autopilot’s request to your application as follows:

Parameter | Description | Example
AccountSid | Your Twilio account ID. It is 34 characters long and always starts with the letters AC. | ACXXXXX
AssistantSid | The Autopilot assistant ID. It is 34 characters long and always starts with the letters UA. | UAXXXXX
DialogueSid | The session identifier. It is 34 characters long and always starts with the letters UK. | UKXXXXX
UserIdentifier | The unique user identifier coming from the channel. For Voice and SMS it will be the user's phone number. | +18304765664
CurrentInput | The last thing the user said. | "I need help with my appointment"
CurrentTask | The user's current task. | reschedule_appointment
Field_{field-name}_Value | The key-value pair of the field value recognized. A different key-value pair will be sent for each field value. | Field_CLAIM_NUMBER_Value
Field_{field-name}_Type | The key-value pair of the field type recognized. A different key-value pair will be sent for each field type. | Twilio.ALPHANUMERIC
DialoguePayloadUrl | A URL to the Dialogue JSON payload that contains the context and data collected during the Autopilot session. | https://autopilot.twilio.com/v1/Assistants/UAXXXX/Dialogues/UKXXXX
Memory | A JSON payload that contains all the Autopilot memory values. NOTE: Memory is only sent in POST requests to prevent query params from getting truncated. | JSON object (see the example following this table)
Channel | The channel on which the interaction is taking place. | SMS
CurrentTaskConfidence | The confidence score for the detected task. | 0.7
NextBestTask | The task with the next highest confidence score. | cancel_appointment

Example Memory value:

{
  "twilio": {
    "collected_data": {
      "get_insurance": {
        "answers": {
          "car_model": {
            "answer": "3 Series",
            "filled": true,
            "confirmed": false,
            "attempts": 1
          },
          "car_year": {
            "answer": "2016",
            "filled": true,
            "type": "Twilio.NUMBER",
            "confirmed": false,
            "attempts": 1
          },
          "car_make": {
            "answer": "It's a BMW",
            "filled": true,
            "confirmed": false,
            "attempts": 1
          },
          "car_state": {
            "answer": "California",
            "type": "Twilio.US_STATE",
            "filled": true,
            "attempts": 3,
            "confirmed": false
          }
        },
        "date_completed": "",
        "date_started": "2019-08-03T00:13:18Z",
        "status": "in-progress"
      }
    }
  }
}

In this scenario, you can ask the user to disambiguate with a response along the lines of: ‘I can help you with that. Would you like to reschedule your appointment or cancel your appointment?’

Disambiguation helps you deliver a smarter user experience instead of triggering a task that may not match the user’s intent.
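As a rough illustration of the flow described above, the following Python sketch builds a disambiguation prompt from the CurrentTask and NextBestTask request parameters. The TASK_LABELS mapping and the disambiguation_prompt function are hypothetical names introduced for this example, and the sketch assumes the parameters have already been parsed into a dictionary and that a missing or empty NextBestTask means no alternative is available.

```python
from typing import Mapping, Optional

# Hypothetical user-facing wording for each task; not part of Autopilot.
TASK_LABELS = {
    "reschedule_appointment": "reschedule your appointment",
    "cancel_appointment": "cancel your appointment",
}


def disambiguation_prompt(params: Mapping[str, str]) -> Optional[str]:
    """Return a clarifying question, or None when there is nothing to disambiguate."""
    current = params.get("CurrentTask")
    next_best = params.get("NextBestTask")
    # Assumption: a missing or empty NextBestTask means no alternative was offered.
    if not current or not next_best:
        return None
    # Only ask about tasks we have wording for.
    if current not in TASK_LABELS or next_best not in TASK_LABELS:
        return None
    return (
        "I can help you with that. Would you like to "
        f"{TASK_LABELS[current]} or {TASK_LABELS[next_best]}?"
    )


request_params = {
    "CurrentInput": "I need help with my appointment",
    "CurrentTask": "reschedule_appointment",
    "NextBestTask": "cancel_appointment",
}
print(disambiguation_prompt(request_params))
```

Your application would send the returned text back to the user in whatever response format it already uses, and then route the user's answer to the matching task.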

Setting confidence thresholds

Your application can also specify confidence thresholds to decide how to respond to the user’s query. These thresholds can then be used to decide whether to trigger a disambiguation response, trigger the fallback task, or hand off the conversation to a human agent.

Continuing the example of the appointment management bot, let’s say Autopilot provides a confidence score of 0.2 for the query ‘I can’t find my appointment’. If you set the threshold for triggering the fallback task at 0.3 and the threshold for disambiguation at 0.5, your application responds with the fallback task rather than a disambiguation flow, because 0.2 falls below the fallback threshold. The response in the fallback task can be used to get the user back on track with the bot.

Since many factors, such as the nature of the training data and customer expectations, will differ between bot experiences, it’s not possible to recommend a specific value for each threshold.
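The threshold logic from the example above could be sketched in Python as follows. The function name, the returned labels, and the choice to treat a score exactly equal to a threshold as passing it are assumptions made for illustration; the 0.3 and 0.5 defaults are only the example values used on this page, not recommended settings.

```python
# Example values from the text above; tune these for your own bot.
FALLBACK_THRESHOLD = 0.3
DISAMBIGUATION_THRESHOLD = 0.5


def choose_response(
    confidence: float,
    fallback_threshold: float = FALLBACK_THRESHOLD,
    disambiguation_threshold: float = DISAMBIGUATION_THRESHOLD,
) -> str:
    """Map a confidence score to one of three hypothetical response strategies."""
    if confidence < fallback_threshold:
        return "fallback"        # too uncertain: use the fallback task
    if confidence < disambiguation_threshold:
        return "disambiguate"    # plausible but unclear: ask the user to clarify
    return "run_task"            # confident enough: run the detected task


# The request delivers the score as text, so convert it before comparing.
score = float("0.2")
print(choose_response(score))
```

A hand-off to a human agent could be added as a further branch, for example after repeated low-confidence turns in the same session.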

Training and annotation

The confidence score is also recorded for each query on the Queries page in the Autopilot console. The page also lets you filter queries by confidence score, allowing you to focus your training and annotation efforts on queries with low confidence scores.
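If you work with query data outside the console, the same triage can be approximated in code. The sketch below assumes each query has already been fetched and represented as a dictionary with a query string and a results object carrying the confidence and next_best_task attributes mentioned earlier; the sample records, the function name, and the 0.5 cutoff are invented for illustration.

```python
from typing import Any, Dict, List


def low_confidence_queries(
    queries: List[Dict[str, Any]], threshold: float = 0.5
) -> List[Dict[str, Any]]:
    """Return queries whose confidence is below the threshold, lowest first."""
    flagged = [
        q for q in queries
        if q.get("results", {}).get("confidence") is not None
        and float(q["results"]["confidence"]) < threshold
    ]
    return sorted(flagged, key=lambda q: float(q["results"]["confidence"]))


# Invented sample records shaped like the attributes described on this page.
sample_queries = [
    {"query": "I need help with my appointment",
     "results": {"task": "reschedule_appointment", "confidence": 0.4,
                 "next_best_task": "cancel_appointment"}},
    {"query": "Cancel my appointment tomorrow",
     "results": {"task": "cancel_appointment", "confidence": 0.95,
                 "next_best_task": "reschedule_appointment"}},
    {"query": "I can't find my appointment",
     "results": {"task": "reschedule_appointment", "confidence": 0.2,
                 "next_best_task": "cancel_appointment"}},
]

for q in low_confidence_queries(sample_queries):
    print(q["results"]["confidence"], q["query"])
```

Sorting the lowest scores first puts the most ambiguous utterances at the top of the list, which is usually where new training samples help most.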
