Batch Transcription resource


Legal notice and public beta

Batch Transcription Configurations use artificial intelligence or machine learning technologies. By enabling or using any of these features or functionalities within Batch Transcription Configurations, you acknowledge and agree that your use of these features or functionalities is subject to the terms of the Predictive and Generative AI/ML Features Addendum.

Batch Transcription Configurations is currently available as a Public Beta release and the information contained in this document is subject to change. Some features are not yet implemented and others may be changed before the product is declared as Generally Available. Public Beta products are not covered by the Twilio Support Terms or Twilio Service Level Agreement.

Batch Transcription Configurations is not PCI compliant or a HIPAA Eligible Service and should not be used in workflows that are subject to HIPAA or PCI.

A Batch Transcription resource represents an asynchronous transcription job for a recorded conversation. To submit audio for transcription, call the Create a Transcription endpoint. You can transcribe Twilio Recordings using a Recording SID, or provide a direct URL to an externally hosted audio file.


Audio channel formats

Batch Transcription accepts audio in the following channel formats:

  • Mono: a single audio channel, suitable for straightforward recordings where speaker differentiation isn't crucial.
  • Stereo: two channels that provide spatial sound but don't separate speakers.
  • Dual-channel: two distinct audio tracks in the same file, ideal for differentiating speakers such as agents and customers in call recordings. This format improves transcription accuracy and participant differentiation.

For better transcription accuracy, use dual-channel recordings, especially when speaker differentiation is important.
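
Before submitting a WAV file, it can help to confirm how many channels it actually contains. The following is a minimal sketch using Python's standard `wave` module; the helper names `count_wav_channels` and `make_silent_wav` are illustrative and not part of any Twilio SDK. Note that a channel count of two can't tell you whether the file is stereo or dual-channel, because that depends on how the audio was recorded.

```python
import io
import wave


def count_wav_channels(wav_source):
    """Return the number of audio channels in a PCM WAV file.

    `wav_source` may be a filesystem path or a binary file-like object.
    """
    with wave.open(wav_source, "rb") as wav_file:
        return wav_file.getnchannels()


def make_silent_wav(channels, sample_rate=8000, seconds=1):
    """Build an in-memory 16-bit PCM WAV of silence, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * channels * sample_rate * seconds)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(count_wav_channels(make_silent_wav(channels=1)))
    print(count_wav_channels(make_silent_wav(channels=2)))
```

A result of 1 means mono. A result of 2 means the file is either stereo or dual-channel, so check with whoever produced the recording whether each speaker was captured on a separate track.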


Transcribe a Twilio Recording

To transcribe a Twilio Recording, provide the Recording SID in the sourceId parameter.

Limits for Twilio Recording transcriptions

  • The recording file size must not exceed 3 GB.
  • Audio duration can't exceed eight hours.
  • Recordings shorter than two seconds aren't transcribed.
  • To transcribe Twilio Recordings stored in external storage, use the mediaUrl parameter. The sourceId parameter isn't supported for externally stored Twilio Recordings.
  • You can't transcribe encrypted Voice Recordings. Move those recordings to your own external storage, generate pre-signed URLs for the decrypted files, and use the mediaUrl parameter instead.

Transcribe an external recording

To transcribe a recording stored externally, provide the recording's URL in the mediaUrl parameter.

Batch Transcription supports both mono and stereo audio for the following formats:

  • WAV (PCM-encoded)
  • MP3
  • FLAC

Limits for external recordings

  • The maximum file size allowed is 3 GB.
  • The maximum audio length is eight hours.
  • The minimum sample rate required is 8 kHz (telephony grade). For best results, use 16 kHz.
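
The limits above can be checked locally before a file is submitted. This is a rough sketch, not an official validator: the function name `check_audio_limits` is hypothetical, and "3 GB" is read conservatively as 3 × 10^9 bytes, which is an assumption because the page doesn't say whether decimal or binary gigabytes are meant.

```python
# Limits taken from the list above. The byte value for "3 GB" is an
# assumption (decimal gigabytes, the stricter of the two readings).
MAX_FILE_BYTES = 3 * 10**9
MAX_DURATION_SECONDS = 8 * 60 * 60
MIN_SAMPLE_RATE_HZ = 8_000


def check_audio_limits(size_bytes, duration_seconds, sample_rate_hz):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes; the maximum is 3 GB")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append(
            f"audio is {duration_seconds} s long; the maximum is eight hours"
        )
    if sample_rate_hz < MIN_SAMPLE_RATE_HZ:
        problems.append(
            f"sample rate is {sample_rate_hz} Hz; the minimum is 8 kHz"
        )
    return problems


if __name__ == "__main__":
    # A 10-minute, 16 kHz file of about 19 MB passes every check.
    print(check_audio_limits(19_200_000, 600, 16_000))
```

Returning a list of problems rather than raising on the first one lets a caller report every violated limit at once.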

Warning

You must provide either sourceId or mediaUrl, but not both.
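
One way to enforce this rule on the client side is to build the request body through a small helper that rejects invalid combinations. The sketch below assumes nothing beyond the field names shown on this page; the function `build_transcription_request` and its argument names are illustrative.

```python
def build_transcription_request(
    transcription_configuration_id,
    source_id=None,
    media_url=None,
    participants=None,
):
    """Build a JSON-serializable body for the Create a Transcription call.

    Exactly one of `source_id` or `media_url` must be given.
    """
    # True when both are missing or both are present: either case is invalid.
    if (source_id is None) == (media_url is None):
        raise ValueError("provide exactly one of source_id or media_url")

    body = {"transcriptionConfigurationId": transcription_configuration_id}
    if source_id is not None:
        body["sourceId"] = source_id
    else:
        body["mediaUrl"] = media_url
    if participants:
        body["participants"] = list(participants)
    return body


if __name__ == "__main__":
    print(
        build_transcription_request(
            "voice_transcriptionconfiguration_5pe8jw3ahdmsh7zr06yh4d45x1",
            media_url="https://example.com/audio/recording.wav",
        )
    )
```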


Specify participant information

Optionally, provide a participants array to identify who is on each audio channel. Each participant requires an audioChannelIndex (1 or 2) and can include a type (CUSTOMER, HUMAN_AGENT, or AI_AGENT), address, and name.

When sourceId is provided, participant data is inferred from the call metadata.

When mediaUrl is provided, participant data can't be resolved from the source and must be supplied explicitly. If the transcription result is stored in a conversation through a conversationConfigurationId, participant information is required. Without it, Twilio can't correctly attribute the transcript to conversation participants.
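
As an illustration of the participant rules described above, the following sketch checks a participants array before it is sent. It covers only what this page states (the channel index must be 1 or 2, and `type`, when present, must be one of the three listed values); `validate_participants` is a hypothetical helper, and the API may apply further checks that aren't modeled here.

```python
ALLOWED_CHANNEL_INDEXES = (1, 2)
ALLOWED_PARTICIPANT_TYPES = ("CUSTOMER", "HUMAN_AGENT", "AI_AGENT")


def validate_participants(participants):
    """Return a list of problems found in a participants array."""
    problems = []
    for position, participant in enumerate(participants):
        # audioChannelIndex is the only required field.
        channel = participant.get("audioChannelIndex")
        if channel not in ALLOWED_CHANNEL_INDEXES:
            problems.append(
                f"participant {position}: audioChannelIndex must be 1 or 2"
            )
        # type is optional, but must be a known value when supplied.
        participant_type = participant.get("type")
        if (
            participant_type is not None
            and participant_type not in ALLOWED_PARTICIPANT_TYPES
        ):
            problems.append(
                f"participant {position}: unknown type {participant_type!r}"
            )
    return problems


if __name__ == "__main__":
    print(
        validate_participants(
            [
                {"audioChannelIndex": 1, "type": "CUSTOMER", "name": "Jane Doe"},
                {"audioChannelIndex": 2, "type": "HUMAN_AGENT"},
            ]
        )
    )
```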


POST https://voice.twilio.com/v3/Transcriptions

Info

A transcriptionConfigurationId is required to create a Transcription. This ID identifies the configuration that controls transcription behavior such as engine, language, and callbacks. See the Transcription Configuration resource for details.

Create a Transcription with a Recording SID

To submit a Twilio Recording for transcription, provide the Recording SID in sourceId.

curl -X POST https://voice.twilio.com/v3/Transcriptions \
  -u "$TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "transcriptionConfigurationId": "voice_transcriptionconfiguration_5pe8jw3ahdmsh7zr06yh4d45x1",
    "sourceId": "RExxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "participants": [
      { "audioChannelIndex": 1, "type": "CUSTOMER", "address": "+15551234567", "name": "Jane Doe" },
      { "audioChannelIndex": 2, "type": "HUMAN_AGENT", "address": "+15559876543", "name": "Agent Smith" }
    ]
  }'

{
  "status": "PENDING",
  "statusUrl": "https://voice.twilio.com/v3/Transcriptions/voice_transcription_7n7hnd7sf68yfv5re42nvvz7aa",
  "transcription": {
    "id": "voice_transcription_7n7hnd7sf68yfv5re42nvvz7aa",
    "accountId": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "status": "PENDING",
    "transcriptionConfigurationId": "voice_transcriptionconfiguration_5pe8jw3ahdmsh7zr06yh4d45x1",
    "sourceId": "RExxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "mediaUrl": null,
    "audioStartedAt": "2026-03-10T19:42:16Z",
    "conversationId": null,
    "participants": [
      { "audioChannelIndex": 1, "type": "CUSTOMER", "address": "+15551234567", "name": "Jane Doe" },
      { "audioChannelIndex": 2, "type": "HUMAN_AGENT", "address": "+15559876543", "name": "Agent Smith" }
    ],
    "duration": null,
    "resolvedConfiguration": {
      "transcriptionEngine": "deepgram",
      "speechModel": "nova-3",
      "language": "en-US",
      "transcriptionStatusCallback": {
        "url": "https://example.com/transcription/callback",
        "method": "POST",
        "events": null
      },
      "conversationConfigurationId": "conv_configuration_5pe8jw3ahdmsh7zr06yh4d45x1",
      "participantDefaults": [
        { "audioChannelIndex": 1, "type": "CUSTOMER" },
        { "audioChannelIndex": 2, "type": "HUMAN_AGENT" }
      ]
    },
    "createdAt": "2026-04-02T19:25:20Z",
    "updatedAt": "2026-04-02T19:25:20Z",
    "url": "https://voice.twilio.com/v3/Transcriptions/voice_transcription_7n7hnd7sf68yfv5re42nvvz7aa"
  }
}

Create a Transcription with a media URL

To submit an externally hosted audio file for transcription, provide its URL in mediaUrl.

curl -X POST https://voice.twilio.com/v3/Transcriptions \
  -u "$TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "transcriptionConfigurationId": "voice_transcriptionconfiguration_5pe8jw3ahdmsh7zr06yh4d45x1",
    "mediaUrl": "https://example.com/audio/recording.wav",
    "audioStartedAt": "2026-03-10T19:42:16Z",
    "participants": [
      { "audioChannelIndex": 1, "type": "CUSTOMER", "address": "+15551234567", "name": "Jane Doe" },
      { "audioChannelIndex": 2, "type": "HUMAN_AGENT", "address": "+15559876543", "name": "Agent Smith" }
    ]
  }'

{
  "status": "PENDING",
  "statusUrl": "https://voice.twilio.com/v3/Transcriptions/voice_transcription_4abcde1234567890abcde12345",
  "transcription": {
    "id": "voice_transcription_4abcde1234567890abcde12345",
    "accountId": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "status": "PENDING",
    "transcriptionConfigurationId": "voice_transcriptionconfiguration_5pe8jw3ahdmsh7zr06yh4d45x1",
    "sourceId": null,
    "mediaUrl": "https://example.com/audio/recording.wav",
    "audioStartedAt": "2026-03-10T19:42:16Z",
    "conversationId": null,
    "participants": [
      { "audioChannelIndex": 1, "type": "CUSTOMER", "address": "+15551234567", "name": "Jane Doe" },
      { "audioChannelIndex": 2, "type": "HUMAN_AGENT", "address": "+15559876543", "name": "Agent Smith" }
    ],
    "duration": null,
    "resolvedConfiguration": {
      "transcriptionEngine": "deepgram",
      "speechModel": "nova-3",
      "language": "en-US",
      "transcriptionStatusCallback": {
        "url": "https://example.com/transcription/callback",
        "method": "POST",
        "events": null
      },
      "conversationConfigurationId": "conv_configuration_5pe8jw3ahdmsh7zr06yh4d45x1",
      "participantDefaults": [
        { "audioChannelIndex": 1, "type": "CUSTOMER" },
        { "audioChannelIndex": 2, "type": "HUMAN_AGENT" }
      ]
    },
    "createdAt": "2026-04-02T19:25:20Z",
    "updatedAt": "2026-04-02T19:25:20Z",
    "url": "https://voice.twilio.com/v3/Transcriptions/voice_transcription_4abcde1234567890abcde12345"
  }
}
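
For readers who prefer to make the same call from Python, the request above might be assembled with the standard library as sketched here. This mirrors the curl flags (`-u` becomes an HTTP Basic `Authorization` header, `-d` becomes the JSON body); the function name `build_create_request` is illustrative, and the sketch only constructs the request without sending it.

```python
import base64
import json
import urllib.request

TRANSCRIPTIONS_URL = "https://voice.twilio.com/v3/Transcriptions"


def build_create_request(account_sid, auth_token, body):
    """Build (but do not send) a POST request equivalent to the curl call."""
    # curl's -u "sid:token" is HTTP Basic auth: base64 of "sid:token".
    credentials = f"{account_sid}:{auth_token}".encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        TRANSCRIPTIONS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


if __name__ == "__main__":
    request = build_create_request(
        "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
        "your_auth_token",  # placeholder, not a real credential
        {
            "transcriptionConfigurationId": "voice_transcriptionconfiguration_5pe8jw3ahdmsh7zr06yh4d45x1",
            "mediaUrl": "https://example.com/audio/recording.wav",
        },
    )
    print(request.get_method(), request.full_url)
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the response body with `json.load`. Keeping construction separate from sending makes the request easy to inspect or test without network access.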

Info

Use the statusUrl from the response to poll for progress. The response includes a Retry-After header with the recommended number of seconds to wait before polling again.

GET https://voice.twilio.com/v3/Transcriptions/{transcriptionId}
curl -X GET "https://voice.twilio.com/v3/Transcriptions/voice_transcription_7n7hnd7sf68yfv5re42nvvz7aa" \
  -u "$TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN"

{
  "operationId": "voice_transcription_7n7hnd7sf68yfv5re42nvvz7aa",
  "status": "COMPLETED",
  "statusUrl": "https://voice.twilio.com/v3/Transcriptions/voice_transcription_7n7hnd7sf68yfv5re42nvvz7aa",
  "transcription": {
    "id": "voice_transcription_7n7hnd7sf68yfv5re42nvvz7aa",
    "accountId": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "status": "COMPLETED",
    "transcriptionConfigurationId": "voice_transcriptionconfiguration_5pe8jw3ahdmsh7zr06yh4d45x1",
    "sourceId": "RExxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "mediaUrl": null,
    "audioStartedAt": "2026-03-10T19:42:16Z",
    "conversationId": "conv_conversation_01k1etx3jbfx88476ccja0889c",
    "participants": [
      { "audioChannelIndex": 1, "type": "CUSTOMER", "address": "+15551234567", "name": "Jane Doe" },
      { "audioChannelIndex": 2, "type": "HUMAN_AGENT", "address": "+15559876543", "name": "Agent Smith" }
    ],
    "duration": 120,
    "resolvedConfiguration": {
      "transcriptionEngine": "deepgram",
      "speechModel": "nova-3",
      "language": "en-US",
      "transcriptionStatusCallback": {
        "url": "https://example.com/transcription/callback",
        "method": "POST",
        "events": null
      },
      "conversationConfigurationId": "conv_configuration_5pe8jw3ahdmsh7zr06yh4d45x1",
      "participantDefaults": [
        { "audioChannelIndex": 1, "type": "CUSTOMER" },
        { "audioChannelIndex": 2, "type": "HUMAN_AGENT" }
      ]
    },
    "createdAt": "2026-04-02T19:25:20Z",
    "updatedAt": "2026-04-02T19:25:20Z",
    "url": "https://voice.twilio.com/v3/Transcriptions/voice_transcription_7n7hnd7sf68yfv5re42nvvz7aa"
  }
}
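
The polling flow described above (fetch the statusUrl, wait for the number of seconds given in Retry-After, repeat) could be sketched as follows. Several details are assumptions rather than documented behavior: only PENDING and COMPLETED appear on this page, so the `FAILED` terminal status, the 5-second fallback wait, and the attempt cap are placeholders. The `fetch_status` callable is supplied by the caller and is expected to return the parsed JSON body and a dict of response headers.

```python
import time


def parse_retry_after(value, default_seconds):
    """Interpret a Retry-After value as seconds, falling back to a default."""
    try:
        seconds = float(value)
    except (TypeError, ValueError):
        # Header missing or not a plain number of seconds.
        return default_seconds
    return seconds if seconds >= 0 else default_seconds


def poll_transcription(
    fetch_status,
    terminal_statuses=("COMPLETED", "FAILED"),  # "FAILED" is an assumption
    default_wait_seconds=5.0,                   # assumption: fallback delay
    max_attempts=60,                            # assumption: safety cap
    sleep=time.sleep,
):
    """Call fetch_status() until the job reaches a terminal status.

    fetch_status must return (body, headers), where body is the decoded
    JSON response and headers is a dict keyed by "Retry-After" exactly.
    """
    for _ in range(max_attempts):
        body, headers = fetch_status()
        if body.get("status") in terminal_statuses:
            return body
        sleep(parse_retry_after(headers.get("Retry-After"), default_wait_seconds))
    raise TimeoutError(f"no terminal status after {max_attempts} attempts")


if __name__ == "__main__":
    # Simulated responses stand in for real GET requests to statusUrl.
    fake_responses = iter(
        [
            ({"status": "PENDING"}, {"Retry-After": "0"}),
            ({"status": "COMPLETED", "transcription": {"duration": 120}}, {}),
        ]
    )
    print(poll_transcription(lambda: next(fake_responses))["status"])
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any particular HTTP client and lets it run in tests without real delays.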