Conversational Intelligence - Transcript Resource
A Conversational Intelligence Transcript resource represents a transcribed voice conversation. To start transcribing a specific recording's audio, call the Create a new Conversational Intelligence Transcript endpoint. You can transcribe recordings created by Twilio as well as recordings that are created or stored externally.
If automatic transcription is enabled, Twilio creates a Conversational Intelligence Transcript resource whenever a Voice call within your Account has been recorded.
A Transcript resource contains links to the associated subresources:
- The Transcript Sentence subresource contains the recording's transcribed sentences.
- The Transcript Media subresource includes the URL of the recording's media file.
- The Transcript OperatorResults subresource contains the results from the Intelligence Service's Language Operators.
Conversational Intelligence supports various audio formats, each suited for different needs:
- Mono: A single audio channel, suitable for straightforward recordings where speaker differentiation isn't crucial.
- Stereo: Two channels providing spatial sound, but not specifically separating speakers.
- Dual-Channel: Two distinct audio tracks in the same file, ideal for differentiating speakers such as agents and customers in call recordings. This format enhances transcription accuracy and participant differentiation.
We recommend using dual-channel recordings to improve transcription accuracy, especially in scenarios requiring speaker differentiation.
The unique SID identifier of the Account.
^AC[0-9a-fA-F]{32}$
Min length: 34
Max length: 34
The unique SID identifier of the Service.
^GA[0-9a-fA-F]{32}$
Min length: 34
Max length: 34
A 34 character string that uniquely identifies this Transcript.
^GT[0-9a-fA-F]{32}$
Min length: 34
Max length: 34
The date that this Transcript was created, given in ISO 8601 format.
The date that this Transcript was updated, given in ISO 8601 format.
The Status of this Transcript. One of queued, in-progress, completed, failed, or canceled.
queued
in-progress
completed
failed
canceled
Data logging allows Twilio to improve the quality of its speech recognition and language understanding services by using customer data to refine, fine-tune, and evaluate machine learning models. Note: Data logging can't be activated via the API, only via www.twilio.com, because it requires additional consent.
The date that this Transcript's media was started, given in ISO 8601 format.
If the transcript has been redacted, a redacted alternative of the transcript will be available.
The Channel parameter object contains information about the conversational media used for the transcription.
Object representing the media associated with the transcript. It has information about the source of the transcript and its participants.
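The SID patterns listed in the properties above can be checked before a request is sent. The following is a minimal sketch, not part of any Twilio library: the `SID_PATTERNS` table and the `isValidSid` helper are hypothetical names, and the regular expressions are copied from the property descriptions.

```javascript
// SID patterns copied from the property descriptions above.
const SID_PATTERNS = {
  account: /^AC[0-9a-fA-F]{32}$/,
  service: /^GA[0-9a-fA-F]{32}$/,
  transcript: /^GT[0-9a-fA-F]{32}$/,
};

// Hypothetical helper: returns true when `value` matches the pattern for `kind`.
function isValidSid(kind, value) {
  const pattern = SID_PATTERNS[kind];
  if (!pattern) {
    throw new Error(`Unknown SID kind: ${kind}`);
  }
  return typeof value === "string" && pattern.test(value);
}

console.log(isValidSid("transcript", "GT" + "a".repeat(32))); // true
console.log(isValidSid("service", "GT" + "a".repeat(32))); // false
```

A check like this only confirms the shape of a SID. It can't tell whether the resource exists in your Account.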
POST https://intelligence.twilio.com/v2/Transcripts
Info
When you use automatic transcription, you don't need this API request to create new Conversational Intelligence Transcripts.
application/x-www-form-urlencoded
The unique SID identifier of the Service.
^GA[0-9a-fA-F]{32}$
Min length: 34
Max length: 34
Used to store client provided metadata. Maximum of 64 double-byte UTF8 characters.
The date that this Transcript's media was started, given in ISO 8601 format.
The Channel parameter object contains information about the conversational media used for the transcription. Its child properties include the media_properties and participants fields.
Object representing the media associated with the transcript. It has information about the source of the transcript and its participants.
You can optionally provide a CustomerKey parameter to map a Transcript to an internal identifier known within your system. This unique identifier helps track the Transcript, and it's included in the webhook callback when the results for Transcripts and Operators are available. Note that CustomerKey doesn't replace the Transcript SID in Conversational Intelligence API calls.
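As an illustration of how CustomerKey fits into the form-encoded request, here is a minimal sketch that assembles the request body without sending it. The `buildCreateTranscriptBody` helper and the `order-1234` key are hypothetical, and serializing the Channel object as a JSON string is an assumption about how the form field is encoded.

```javascript
// Hypothetical helper: builds an application/x-www-form-urlencoded body
// for POST https://intelligence.twilio.com/v2/Transcripts.
function buildCreateTranscriptBody({ serviceSid, channel, customerKey }) {
  const body = new URLSearchParams();
  body.set("ServiceSid", serviceSid);
  // Assumption: the Channel object is sent as a JSON string.
  body.set("Channel", JSON.stringify(channel));
  // CustomerKey is optional, so only include it when provided.
  if (customerKey !== undefined) {
    body.set("CustomerKey", customerKey);
  }
  return body;
}

const exampleBody = buildCreateTranscriptBody({
  serviceSid: "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  channel: {
    media_properties: { source_sid: "REXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" },
  },
  customerKey: "order-1234", // your own internal identifier
});

console.log(exampleBody.get("CustomerKey")); // order-1234
```

Later API calls still address the Transcript by its SID. The key only helps you correlate webhook callbacks with records in your own system.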
To transcribe Recordings made via Twilio and stored within Twilio's infrastructure, provide the Recording SID in the Channel object's media_properties.source_sid property as shown below. REXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX represents a Recording SID.
In this scenario, the Channel information appears as follows:
{
  "media_properties": {
    "source_sid": "REXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
  }
}
// Download the helper library from https://www.twilio.com/docs/node/install
const twilio = require("twilio"); // Or, for ESM: import twilio from "twilio";

// Find your Account SID and Auth Token at twilio.com/console
// and set the environment variables. See http://twil.io/secure
const accountSid = process.env.TWILIO_ACCOUNT_SID;
const authToken = process.env.TWILIO_AUTH_TOKEN;
const client = twilio(accountSid, authToken);

async function createTranscript() {
  const transcript = await client.intelligence.v2.transcripts.create({
    channel: {
      media_properties: {
        source_sid: "REXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
      },
    },
    serviceSid: "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  });

  console.log(transcript.accountSid);
}

createTranscript();
Response
{
  "account_sid": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "service_sid": "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "sid": "GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "date_created": "2010-08-31T20:36:28Z",
  "date_updated": "2010-08-31T20:36:28Z",
  "status": "queued",
  "channel": {
    "media_properties": {
      "media_url": "http://foobar.test/ClusterTests/call1.wav"
    }
  },
  "data_logging": false,
  "language_code": "en-US",
  "media_start_time": null,
  "duration": 0,
  "customer_key": "aaaaaaaa",
  "url": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "redaction": true,
  "links": {
    "sentences": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Sentences",
    "media": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Media",
    "operator_results": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/OperatorResults"
  }
}
- The recording file size must not exceed 3GB.
- Audio duration can't exceed eight hours.
- Recordings shorter than two seconds aren't transcribed.
- Transcripts are indexed and available for search for 90 days.
- You can create only one Conversational Intelligence Transcript resource for a given Recording resource. To re-transcribe a Recording, delete the original Conversational Intelligence Transcript resource and create a new one.
- To create Transcripts for Twilio Recordings in external storage, use the MediaUrl parameter. The SourceSid parameter isn't supported for externally stored Twilio Recordings.
- You can't create Conversational Intelligence Transcripts from encrypted Voice Recordings, because Conversational Intelligence can't decrypt those resources. Instead, move those recordings to your own external storage and generate pre-signed URLs for the decrypted recordings. Once you've done that, you can follow the instructions in the "Transcribe an external recording" section below.
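The size and duration limits above can be checked locally before you request a transcription. The sketch below is only illustrative: `checkRecordingLimits` is a hypothetical helper, and it assumes that "3GB" means 3 × 1024³ bytes, which the text above doesn't specify.

```javascript
// Limits taken from the list above.
const MAX_FILE_BYTES = 3 * 1024 ** 3; // Assumption: "3GB" is read as 3 GiB.
const MAX_DURATION_SECONDS = 8 * 60 * 60; // eight hours
const MIN_DURATION_SECONDS = 2; // shorter recordings aren't transcribed

// Hypothetical helper: returns a list of human-readable problems,
// or an empty array when the recording is within the documented limits.
function checkRecordingLimits({ sizeBytes, durationSeconds }) {
  const problems = [];
  if (sizeBytes > MAX_FILE_BYTES) {
    problems.push("file size exceeds 3GB");
  }
  if (durationSeconds > MAX_DURATION_SECONDS) {
    problems.push("duration exceeds eight hours");
  }
  if (durationSeconds < MIN_DURATION_SECONDS) {
    problems.push("recording is shorter than two seconds");
  }
  return problems;
}

console.log(checkRecordingLimits({ sizeBytes: 5_000_000, durationSeconds: 300 })); // []
```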
To transcribe a recording stored externally, for example, a recording stored in your own S3 bucket, provide the recording's URL in the Channel object's media_properties.media_url property.
The following limitations apply when transcribing an external recording (specified by a MediaUrl):
- You must make external recordings stored in Twilio Assets public.
- Basic authentication on MediaUrls isn't supported for external recordings. If you store the recordings on S3, use a presigned URL. If you store them on Azure Blob Storage, use a Shared Access Signature (SAS).
- MediaUrls that respond with a non-200 HTTP status code result in a failed request.
- Requests to access external recordings are performed once. There is currently no retry behavior.
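Because requests for external recordings are made only once, it can help to sanity-check a MediaUrl before submitting it. The following sketch uses the standard `URL` class. The `checkMediaUrl` helper is hypothetical, and the restriction to http and https schemes is an assumption based on the example URLs in this document.

```javascript
// Hypothetical helper: returns a list of problems found in a MediaUrl,
// or an empty array when none are detected.
function checkMediaUrl(mediaUrl) {
  let parsed;
  try {
    parsed = new URL(mediaUrl);
  } catch {
    return ["not a valid absolute URL"];
  }

  const problems = [];
  // Assumption: only http(s) URLs are usable, as in this document's examples.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    problems.push("URL scheme must be http or https");
  }
  // Basic authentication on MediaUrls isn't supported.
  if (parsed.username !== "" || parsed.password !== "") {
    problems.push("embedded basic-auth credentials aren't supported");
  }
  return problems;
}

console.log(checkMediaUrl("https://example.com/recording/call.wav")); // []
console.log(checkMediaUrl("https://user:secret@example.com/call.wav").length); // 1
```

This check doesn't contact the server, so it can't confirm that the URL responds with a 200 status code or that a pre-signed URL is still valid.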
To transcribe the audio of a Twilio Video recording, it needs additional processing to become compatible with Conversational Intelligence.
First, create a dual-channel audio recording by transcoding a separate audio-only composition for each participant in the Video Room.
curl -X POST "https://video.twilio.com/v1/Compositions" \
--data-urlencode "AudioSources=PAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
--data-urlencode "StatusCallback=https://www.example.com/callbacks" \
--data-urlencode "Format=mp4" \
--data-urlencode "RoomSid=RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
-u $TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN
Next, download the media from these compositions and merge them into a single stereo audio file.
ffmpeg -i speaker1.mp4 -i speaker2.mp4 -filter_complex "[0:a][1:a]amerge=inputs=2[a]" -map "[a]" -f flac -bits_per_raw_sample 16 -ar 44100 output.flac
If the recording duration differs between participants, you can keep the audio tracks from overlapping incorrectly. Use ffmpeg to create a single stereo audio track with a delay that covers the difference in track length. For example, if one audio track lasts 63 seconds and the other 67 seconds, use ffmpeg to create a stereo file in which the first track has four seconds of delay to match the length of the second track.
ffmpeg -i speaker1.wav -i speaker2.wav -filter_complex "aevalsrc=0:d=${second_to_delay}[s1];[s1][1:a]concat=n=2:v=0:a=1[ac2];[0:a]apad[ac1];[ac1][ac2]amerge=2[a]" -map "[a]" -f flac -bits_per_raw_sample 16 -ar 44100 output.flac
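The delay value used for `${second_to_delay}` in the command above is the difference between the two track lengths. As a rough sketch, the hypothetical `buildDelayFilter` helper below computes that difference and substitutes it into the filter string from the command. It doesn't run ffmpeg, and it doesn't decide which input should be delayed.

```javascript
// Hypothetical helper: computes the delay (in seconds) between two tracks and
// fills it into the -filter_complex string from the ffmpeg command above.
function buildDelayFilter(firstDurationSeconds, secondDurationSeconds) {
  const delaySeconds = Math.abs(secondDurationSeconds - firstDurationSeconds);
  const filter =
    `aevalsrc=0:d=${delaySeconds}[s1];` + // generate silence for the delay
    "[s1][1:a]concat=n=2:v=0:a=1[ac2];" + // prepend the silence to input 1
    "[0:a]apad[ac1];" + // pad input 0 with trailing silence
    "[ac1][ac2]amerge=2[a]"; // merge both tracks into one stereo stream
  return { delaySeconds, filter };
}

// The example from the text: tracks of 63 and 67 seconds.
const delayExample = buildDelayFilter(63, 67);
console.log(delayExample.delaySeconds); // 4
```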
Finally, use the Create a new Conversational Intelligence Transcript endpoint with the Channel parameter's media_properties.media_url property set to a publicly accessible URL of the audio file.
Recordings must be publicly accessible during transcription. You can host them publicly or share them through a time-limited pre-signed URL. To share a recording on an existing AWS S3 bucket, read the "Sharing objects with pre-signed URLs" guide from AWS.
Twilio attempts to download an external recording for up to 10 minutes. After 10 minutes, the transcription fails.
You can't transcribe encrypted recordings.
Conversational Intelligence doesn't perform speaker diarization on recordings, meaning it doesn't differentiate between different speakers. Additionally, using mono recordings can lead to reduced transcription accuracy. For improved transcription accuracy and participant differentiation, use dual-channel recordings.
Conversational Intelligence supports both mono and stereo audio formats for the following media formats:
- WAV (PCM-encoded)
- MP3
- FLAC
The following limits apply to the media files:
- The maximum file size allowed is 3GB.
- The maximum audio length is eight hours.
- The minimum sample rate required is 8 kHz (telephony grade). For best results, use 16 kHz.
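The format and sample-rate requirements above could be expressed as a small pre-flight check. This is a sketch under stated assumptions: `checkAudioProperties` is a hypothetical helper, the caller is assumed to already know the file's format and sample rate, and the check doesn't verify that a WAV file is PCM-encoded.

```javascript
// Values taken from the lists above.
const SUPPORTED_FORMATS = new Set(["wav", "mp3", "flac"]);
const MIN_SAMPLE_RATE_HZ = 8000; // telephony grade
const RECOMMENDED_SAMPLE_RATE_HZ = 16000;

// Hypothetical helper: separates hard errors from softer warnings.
function checkAudioProperties({ format, sampleRateHz }) {
  const errors = [];
  const warnings = [];
  if (!SUPPORTED_FORMATS.has(String(format).toLowerCase())) {
    errors.push(`unsupported format: ${format}`);
  }
  if (sampleRateHz < MIN_SAMPLE_RATE_HZ) {
    errors.push("sample rate is below 8 kHz");
  } else if (sampleRateHz < RECOMMENDED_SAMPLE_RATE_HZ) {
    warnings.push("sample rate is below the recommended 16 kHz");
  }
  return { errors, warnings };
}

console.log(checkAudioProperties({ format: "flac", sampleRateHz: 16000 }));
```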
In this scenario, the Channel information appears as follows:
{
  "media_properties": {
    "media_url": "http://www.example.com/recording/call.wav"
  }
}
// Download the helper library from https://www.twilio.com/docs/node/install
const twilio = require("twilio"); // Or, for ESM: import twilio from "twilio";

// Find your Account SID and Auth Token at twilio.com/console
// and set the environment variables. See http://twil.io/secure
const accountSid = process.env.TWILIO_ACCOUNT_SID;
const authToken = process.env.TWILIO_AUTH_TOKEN;
const client = twilio(accountSid, authToken);

async function createTranscript() {
  const transcript = await client.intelligence.v2.transcripts.create({
    channel: {
      media_properties: {
        media_url: "https://example.com/your-recording.wav",
      },
    },
    serviceSid: "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  });

  console.log(transcript.accountSid);
}

createTranscript();
Response
{
  "account_sid": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "service_sid": "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "sid": "GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "date_created": "2010-08-31T20:36:28Z",
  "date_updated": "2010-08-31T20:36:28Z",
  "status": "queued",
  "channel": {
    "media_properties": {
      "media_url": "http://foobar.test/ClusterTests/call1.wav"
    }
  },
  "data_logging": false,
  "language_code": "en-US",
  "media_start_time": null,
  "duration": 0,
  "customer_key": "aaaaaaaa",
  "url": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "redaction": true,
  "links": {
    "sentences": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Sentences",
    "media": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Media",
    "operator_results": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/OperatorResults"
  }
}
Warning
If you include both MediaUrl and SourceSid in the Transcript creation request, Twilio uses the MediaUrl.
By default, Conversational Intelligence labels the left channel (channel one) as Agent and the right channel (channel two) as Customer. Depending on your call flow and the recorded call leg, this may not accurately reflect the participant/channel relationships on your recording. If needed, specify which participant is on a given channel via the Channel parameter's participants array.
Warning
If the default behavior doesn't align with your application's recording implementation, you can do one of the following:
- Update your application's logic to ensure the "Agent" is always on the first channel. For two-party voice calls, that's the first call leg. For Conferences, that's the first Participant that joined the recorded Conference.
- If your application logic places all "Agents" on channel 2 and all "Customers" on channel 1, reach out to Twilio Support to invert the Agent/Customer Conversational Intelligence labeling at the Account level. This affects all recordings within that Account.
- Specify Participant information in the request to create a Transcript. Use this only if the first two options aren't feasible for your application. See below for how to do this.
Only two participants can be overridden in the Channel object of the Transcript resource.
The code sample below demonstrates an example request that overrides the default Conversational Intelligence labels.
// Download the helper library from https://www.twilio.com/docs/node/install
const twilio = require("twilio"); // Or, for ESM: import twilio from "twilio";

// Find your Account SID and Auth Token at twilio.com/console
// and set the environment variables. See http://twil.io/secure
const accountSid = process.env.TWILIO_ACCOUNT_SID;
const authToken = process.env.TWILIO_AUTH_TOKEN;
const client = twilio(accountSid, authToken);

async function createTranscript() {
  const transcript = await client.intelligence.v2.transcripts.create({
    channel: {
      media_properties: {
        media_url: "https://example.com/your-recording",
      },
      participants: [
        {
          user_id: "id1",
          channel_participant: 1,
          media_participant_id: "+1555959545",
          email: "veronica.meyer@example.com",
          full_name: "Veronica Meyer",
          image_url:
            "https://images.unsplash.com/photo-1438761681033-6461ffad8d80",
          role: "Customer",
        },
        {
          user_id: "id2",
          channel_participant: 2,
          media_participant_id: "+1555959505",
          email: "lauryn.trujillo@example.com",
          full_name: "Lauryn Trujillo",
          image_url:
            "https://images.unsplash.com/photo-1554384645-13eab165c24b",
          role: "Agent",
        },
      ],
    },
    serviceSid: "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  });

  console.log(transcript.accountSid);
}

createTranscript();
Response
{
  "account_sid": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "service_sid": "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "sid": "GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "date_created": "2010-08-31T20:36:28Z",
  "date_updated": "2010-08-31T20:36:28Z",
  "status": "queued",
  "channel": {
    "media_properties": {
      "media_url": "http://foobar.test/ClusterTests/call1.wav"
    }
  },
  "data_logging": false,
  "language_code": "en-US",
  "media_start_time": null,
  "duration": 0,
  "customer_key": "aaaaaaaa",
  "url": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "redaction": true,
  "links": {
    "sentences": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Sentences",
    "media": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Media",
    "operator_results": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/OperatorResults"
  }
}
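The participant override described above can also be assembled with a small helper that enforces the two-participant limit before the request is made. The following is a minimal sketch: `buildChannelWithParticipants` is a hypothetical name, and the rule that `channel_participant` must be 1 or 2 and unique is an assumption drawn from the example request rather than a documented constraint.

```javascript
// Hypothetical helper: builds a Channel object with participant overrides.
function buildChannelWithParticipants(mediaUrl, participants) {
  // Only two participants can be overridden.
  if (participants.length > 2) {
    throw new Error("Only two participants can be overridden");
  }
  const seenChannels = new Set();
  for (const participant of participants) {
    const channel = participant.channel_participant;
    // Assumption: channels are numbered 1 and 2, as in the example request.
    if (channel !== 1 && channel !== 2) {
      throw new Error(`channel_participant must be 1 or 2, got ${channel}`);
    }
    if (seenChannels.has(channel)) {
      throw new Error(`channel ${channel} is assigned more than once`);
    }
    seenChannels.add(channel);
  }
  return { media_properties: { media_url: mediaUrl }, participants };
}

const overrideExample = buildChannelWithParticipants(
  "https://example.com/your-recording",
  [
    { channel_participant: 1, role: "Customer", full_name: "Veronica Meyer" },
    { channel_participant: 2, role: "Agent", full_name: "Lauryn Trujillo" },
  ],
);

console.log(overrideExample.participants.length); // 2
```

The returned object can be passed as the `channel` value in a create request like the one shown earlier.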
GET https://intelligence.twilio.com/v2/Transcripts/{Sid}
Info
Use the webhook callback to know when a Create a new Conversational Intelligence Transcript request has completed and when the results are available. This is preferable to polling the Fetch a Conversational Intelligence Transcript endpoint.
The webhook callback URL can be configured on the Intelligence Service's settings.
A 34 character string that uniquely identifies this Transcript.
^GT[0-9a-fA-F]{32}$
Min length: 34
Max length: 34
// Download the helper library from https://www.twilio.com/docs/node/install
const twilio = require("twilio"); // Or, for ESM: import twilio from "twilio";

// Find your Account SID and Auth Token at twilio.com/console
// and set the environment variables. See http://twil.io/secure
const accountSid = process.env.TWILIO_ACCOUNT_SID;
const authToken = process.env.TWILIO_AUTH_TOKEN;
const client = twilio(accountSid, authToken);

async function fetchTranscript() {
  const transcript = await client.intelligence.v2
    .transcripts("GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa")
    .fetch();

  console.log(transcript.accountSid);
}

fetchTranscript();
Response
{
  "account_sid": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "service_sid": "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "sid": "GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "date_created": "2010-08-31T20:36:28Z",
  "date_updated": "2010-08-31T20:36:28Z",
  "status": "queued",
  "channel": {},
  "data_logging": false,
  "language_code": "en-US",
  "media_start_time": null,
  "duration": 0,
  "customer_key": null,
  "url": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "redaction": true,
  "links": {
    "sentences": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Sentences",
    "media": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Media",
    "operator_results": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/OperatorResults"
  }
}
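Whether the status arrives through a webhook callback or a fetched Transcript, your code may need to decide if processing has finished. The sketch below is illustrative only: `isTerminalStatus` is a hypothetical helper, and treating completed, failed, and canceled as final states (with queued and in-progress as pending) is an assumption based on the status names, not a documented guarantee.

```javascript
// Assumption: these statuses mean no further processing will happen.
const TERMINAL_STATUSES = new Set(["completed", "failed", "canceled"]);

// Hypothetical helper: true when a Transcript status is considered final.
function isTerminalStatus(status) {
  return TERMINAL_STATUSES.has(status);
}

// The sample response above has status "queued", so it is still pending.
console.log(isTerminalStatus("queued")); // false
console.log(isTerminalStatus("completed")); // true
```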
GET https://intelligence.twilio.com/v2/Transcripts
The unique SID identifier of the Service.
^GA[0-9a-fA-F]{32}$
Min length: 34
Max length: 34
How many resources to return in each list page. The default is 50, and the maximum is 1000.
Minimum: 1
Maximum: 1000
The page token. This is provided by the API.
// Download the helper library from https://www.twilio.com/docs/node/install
const twilio = require("twilio"); // Or, for ESM: import twilio from "twilio";

// Find your Account SID and Auth Token at twilio.com/console
// and set the environment variables. See http://twil.io/secure
const accountSid = process.env.TWILIO_ACCOUNT_SID;
const authToken = process.env.TWILIO_AUTH_TOKEN;
const client = twilio(accountSid, authToken);

async function listTranscript() {
  const transcripts = await client.intelligence.v2.transcripts.list({
    limit: 20,
  });

  transcripts.forEach((t) => console.log(t.accountSid));
}

listTranscript();
Response
{
  "transcripts": [
    {
      "account_sid": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
      "service_sid": "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
      "sid": "GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
      "date_created": "2010-08-31T20:36:28Z",
      "date_updated": "2010-08-31T20:36:28Z",
      "status": "queued",
      "channel": {},
      "data_logging": false,
      "language_code": "en-US",
      "media_start_time": null,
      "duration": 0,
      "customer_key": null,
      "url": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
      "redaction": true,
      "links": {
        "sentences": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Sentences",
        "media": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Media",
        "operator_results": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/OperatorResults"
      }
    }
  ],
  "meta": {
    "key": "transcripts",
    "page": 0,
    "page_size": 50,
    "first_page_url": "https://intelligence.twilio.com/v2/Transcripts?LanguageCode=en-US&SourceSid=REaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa&ServiceSid=GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa&AfterDateCreated=2019-11-22T23%3A46%3A00Z&PageSize=50&Page=0",
    "next_page_url": null,
    "previous_page_url": null,
    "url": "https://intelligence.twilio.com/v2/Transcripts?LanguageCode=en-US&SourceSid=REaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa&ServiceSid=GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa&AfterDateCreated=2019-11-22T23%3A46%3A00Z&PageSize=50&Page=0"
  }
}
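To illustrate the page-size rules for the list endpoint, here is a small sketch that builds a list URL with the standard `URL` class. The `buildListUrl` helper is hypothetical. The `ServiceSid` and `PageSize` query parameter names are taken from the sample response above, and the lower bound of 1 is an assumption.

```javascript
const LIST_URL = "https://intelligence.twilio.com/v2/Transcripts";

// Hypothetical helper: builds the list URL, defaulting PageSize to 50
// and rejecting values outside the 1 to 1000 range.
function buildListUrl({ serviceSid, pageSize = 50 } = {}) {
  if (!Number.isInteger(pageSize) || pageSize < 1 || pageSize > 1000) {
    throw new RangeError("pageSize must be an integer between 1 and 1000");
  }
  const url = new URL(LIST_URL);
  if (serviceSid) {
    url.searchParams.set("ServiceSid", serviceSid);
  }
  url.searchParams.set("PageSize", String(pageSize));
  return url.toString();
}

console.log(buildListUrl({ pageSize: 20 }));
// https://intelligence.twilio.com/v2/Transcripts?PageSize=20
```

When you use the helper library, as in the sample above, paging is handled for you, so a helper like this is only relevant if you call the REST endpoint directly.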
DELETE https://intelligence.twilio.com/v2/Transcripts/{Sid}
A 34 character string that uniquely identifies this Transcript.
^GT[0-9a-fA-F]{32}$
Min length: 34
Max length: 34
// Download the helper library from https://www.twilio.com/docs/node/install
const twilio = require("twilio"); // Or, for ESM: import twilio from "twilio";

// Find your Account SID and Auth Token at twilio.com/console
// and set the environment variables. See http://twil.io/secure
const accountSid = process.env.TWILIO_ACCOUNT_SID;
const authToken = process.env.TWILIO_AUTH_TOKEN;
const client = twilio(accountSid, authToken);

async function deleteTranscript() {
  await client.intelligence.v2
    .transcripts("GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa")
    .remove();
}

deleteTranscript();