Voice Intelligence - Key Concepts

Language Operators

Language Operators (sometimes referred to as "operators") use artificial intelligence and machine learning to turn transcripts into structured information. Two categories of operators can currently be used with Voice Intelligence.

  • Prebuilt Language Operators: These Operators are either created by Twilio or built on third-party AI models. Twilio-made Operators are created, trained, and maintained by the Twilio team. They are trained on a broad range of data and typically map to pieces of information that are agnostic to use case or industry. Prebuilt Operators cannot be modified or made more specific.
  • Custom Operators: These Operators are created and maintained by our customers. They are specific to an individual customer's use case and data. Custom Operators are text analysis operators (for example, keyword or phrase based) and can be used to spot phrases or classify transcripts.


To find out more about the Prebuilt operators that Twilio currently makes available, and for examples of the actions they perform, please review Prebuilt Language Operators.

See Supported Languages for details on Transcription languages that different Operators can support.

Operator actions

Operators perform a specific action on a conversation or on a sentence within a conversation. The table below lists the types of actions that are currently available.

Operator Action | Status | Description | Example
Classify | Available | Classify a conversation into a predefined category | Classify if the call was transferred to another agent
Phrase matching | Available | Determine if an event occurred or if a piece of data or a phrase was mentioned during a conversation | Spot whether or not an agent told a customer that their call is being recorded
Redact | Available | Find and redact a value mentioned during a conversation | Redact a social security number that was mentioned during a call

Custom Operators are text-parsing operators and support the Classify and Phrase Matching actions.

To add a Custom Language Operator to your Service, follow the steps below:

  1. Navigate to the Language Operators tab on your Service.
  2. Click Create Custom Operator.
  3. Add the name of the operator and select Phrase Matching or Classify.
  4. Create Phrase Sets. Each Phrase Set can have multiple words or phrases to extract from the transcript. For each phrase, you can select:
    A. Exact Match: Find exact words or phrases in the transcript
    B. Fuzzy Match: Find words or phrases in the transcript using machine learning techniques, even if they match less than 100%
  5. Once all the Phrase Sets are created, add the new Custom Language Operator to your Service. The next time a Transcript is created, the results of the new Custom Language Operator will appear in the OperatorResults of the Transcript.
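
The difference between the Exact Match and Fuzzy Match options can be illustrated with a minimal local sketch. This is not Twilio's implementation: the machine learning technique behind Fuzzy Match is not described here, so the sketch substitutes a simple character-similarity ratio from Python's standard `difflib` module. The function names `exact_match` and `fuzzy_match` and the `threshold` parameter are hypothetical and exist only for illustration.

```python
from difflib import SequenceMatcher


def exact_match(phrase: str, transcript: str) -> bool:
    """Return True if the phrase appears verbatim (ignoring case)."""
    return phrase.lower() in transcript.lower()


def fuzzy_match(phrase: str, transcript: str, threshold: float = 0.8) -> bool:
    """Return True if some run of words is similar enough to the phrase.

    Assumption: a difflib similarity ratio stands in for the unspecified
    machine learning technique used by the real Fuzzy Match option.
    """
    target = " ".join(phrase.lower().split())
    words = transcript.lower().split()
    size = len(target.split())
    # Slide a window of `size` words across the transcript.
    for start in range(max(len(words) - size + 1, 1)):
        window = " ".join(words[start:start + size])
        if SequenceMatcher(None, target, window).ratio() >= threshold:
            return True
    return False


phrase = "this call is being recorded"
garbled = "just so you know this call is being recoded for quality"

print(exact_match(phrase, garbled))  # the mis-transcribed word breaks the exact match
print(fuzzy_match(phrase, garbled))  # the near-identical window still matches
```

An exact match is strict and predictable, while a fuzzy match tolerates transcription errors at the cost of occasional false positives, which is why the choice is made per phrase.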

Audio that doesn't correspond to speech, or that Twilio's speech recognition engine doesn't recognize, is labeled with a Non-Speech Tag. The following tags are currently used.

Non-Speech Tag | Description
[applause] | Included if a participant claps on a call
[dtmf] | Included when a participant provides input via DTMF (Dual-Tone Multi-Frequency). This tag is only included when DTMF is embedded in the audio of the recording. Out-of-band DTMF is not captured
[foreign] | Included when the speech recognition engine does not recognize the audio as being part of a supported language
[hes] | Included when a participant says a hesitation marker like umm, uhh, or hmm
[inaudible] | Included when there is unclear audio that cannot be recognized by the speech recognition engine
[laugh] | Included when a participant laughs on a call
[music] | Included when music is detected on a call. This marker typically shows up with hold music
[noise] | Included when there are noises that are not recognized as speech
[ring] | Included when there is ringing on a call. This typically shows up when a call is recorded with the record-from-ringing parameter or when a bridged leg plays ringback
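
When post-processing transcript text, it can help to count or remove these tags before running your own analysis. The following is a minimal sketch that assumes the tags appear inline in the transcript text exactly as written in the table above, in square brackets. The helper names `count_tags` and `strip_tags` are hypothetical and are not part of any Twilio library.

```python
import re
from collections import Counter

# Tag names taken from the table above.
NON_SPEECH_TAGS = (
    "applause", "dtmf", "foreign", "hes", "inaudible",
    "laugh", "music", "noise", "ring",
)

# Matches any listed tag in its bracketed form, for example "[hes]".
TAG_PATTERN = re.compile(r"\[(" + "|".join(NON_SPEECH_TAGS) + r")\]")


def count_tags(text: str) -> Counter:
    """Count how often each known Non-Speech Tag occurs in the text."""
    return Counter(TAG_PATTERN.findall(text))


def strip_tags(text: str) -> str:
    """Remove known Non-Speech Tags and collapse the leftover whitespace."""
    without_tags = TAG_PATTERN.sub(" ", text)
    return " ".join(without_tags.split())


sentence = "[ring] [ring] hello [hes] I need help [noise] with my bill"
print(count_tags(sentence))
print(strip_tags(sentence))
```

Bracketed text that is not in the tag list is left untouched, so unrelated content in square brackets is not removed by accident.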
