Batch Transcription resource
Legal notice and public beta
Batch Transcription Configurations use artificial intelligence or machine learning technologies. By enabling or using any of these features or functionalities within Batch Transcription Configurations, you acknowledge and agree that your use of these features or functionalities is subject to the terms of the Predictive and Generative AI/ML Features Addendum.
Batch Transcription Configurations is currently available as a Public Beta release and the information contained in this document is subject to change. Some features are not yet implemented and others may be changed before the product is declared as Generally Available. Public Beta products are not covered by the Twilio Support Terms or Twilio Service Level Agreement.
Batch Transcription Configurations is not PCI compliant or a HIPAA Eligible Service and should not be used in workflows that are subject to HIPAA or PCI.
A Batch Transcription resource represents an asynchronous transcription job for a recorded conversation. To submit audio for transcription, call the Create a Transcription endpoint. You can transcribe Twilio Recordings using a Recording SID, or provide a direct URL to an externally hosted audio file.
Batch Transcription supports several audio channel layouts, each suited to different needs:
- Mono: a single audio channel, suitable for straightforward recordings where speaker differentiation isn't crucial.
- Stereo: two channels that provide spatial sound but don't separate speakers.
- Dual-channel: two distinct audio tracks in the same file, ideal for differentiating speakers such as agents and customers in call recordings. This format improves transcription accuracy and participant differentiation.
For better transcription accuracy, use dual-channel recordings, especially when speaker differentiation is important.
To transcribe a Twilio Recording, provide the Recording SID in the sourceId parameter.
- The recording file size must not exceed 3 GB.
- Audio duration can't exceed eight hours.
- Recordings shorter than two seconds aren't transcribed.
- To transcribe Twilio Recordings stored in external storage, use the mediaUrl parameter. The sourceId parameter isn't supported for externally stored Twilio Recordings.
- You can't transcribe encrypted Voice Recordings. Move those recordings to your own external storage, generate pre-signed URLs for the decrypted files, and use the mediaUrl parameter instead.
To transcribe a recording stored externally, provide the recording's URL in the mediaUrl parameter.
Batch Transcription supports both mono and stereo audio for the following formats:
- WAV (PCM-encoded)
- MP3
- FLAC
- The maximum file size allowed is 3 GB.
- The maximum audio length is eight hours.
- The minimum sample rate required is 8 kHz (telephony grade). For best results, use 16 kHz.
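The limits above can be checked locally before a file is uploaded anywhere. The following is a minimal sketch for PCM WAV files only, using Python's standard wave module. The function name check_wav_file and the constants are hypothetical, and reading "3 GB" as binary gigabytes is an assumption; nothing here is part of the Twilio API.

```python
import os
import wave

MAX_FILE_BYTES = 3 * 1024**3        # 3 GB; binary gigabytes is an assumption
MAX_DURATION_SECONDS = 8 * 60 * 60  # eight hours
MIN_SAMPLE_RATE_HZ = 8_000          # telephony grade


def check_wav_file(path):
    """Return a list of problems found in a PCM WAV file (empty if none)."""
    problems = []
    if os.path.getsize(path) > MAX_FILE_BYTES:
        problems.append("file is larger than 3 GB")

    # Read the header fields needed for the remaining checks.
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        frames = wav.getnframes()

    duration = frames / sample_rate if sample_rate else 0.0
    if channels not in (1, 2):
        problems.append(f"{channels} channels; expected mono or stereo")
    if sample_rate < MIN_SAMPLE_RATE_HZ:
        problems.append(f"sample rate {sample_rate} Hz is below 8 kHz")
    if duration > MAX_DURATION_SECONDS:
        problems.append("audio is longer than eight hours")
    return problems
```

An empty list means the file passed these local checks; it does not guarantee that the service accepts the file.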
Warning
You must provide either sourceId or mediaUrl, but not both.
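A client can enforce this rule before sending anything. The sketch below assumes a hypothetical helper, build_transcription_payload, that assembles the request body and rejects calls that supply both sources or neither; the field names come from this page, and the helper itself is illustrative only.

```python
def build_transcription_payload(transcription_configuration_id,
                                source_id=None, media_url=None):
    """Build a request body containing exactly one of sourceId or mediaUrl."""
    # Both None or both set means the "one, but not both" rule is violated.
    if (source_id is None) == (media_url is None):
        raise ValueError("provide exactly one of sourceId or mediaUrl")

    payload = {"transcriptionConfigurationId": transcription_configuration_id}
    if source_id is not None:
        payload["sourceId"] = source_id
    else:
        payload["mediaUrl"] = media_url
    return payload
```

Failing early on the client side avoids a round trip for a request that would not be valid.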
Optionally, provide a participants array to identify who is on each audio channel. Each participant requires an audioChannelIndex (1 or 2) and can include a type (CUSTOMER, HUMAN_AGENT, or AI_AGENT), address, and name.
When sourceId is provided, participant data is inferred from the call metadata.
When mediaUrl is provided, participant data can't be resolved from the source and must be supplied explicitly. If the transcription result is stored in a conversation through a conversationConfigurationId, participant information is required. Without it, Twilio can't correctly attribute the transcript to conversation participants.
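The participant rules described above can be sketched as a small pre-flight check. The function validate_participants and its uses_media_url flag are hypothetical names; the allowed channel indexes and types are the ones listed on this page. Treating a missing participants array as an error whenever mediaUrl is used is a deliberately strict reading of the text, not confirmed API behavior.

```python
ALLOWED_CHANNELS = (1, 2)
ALLOWED_TYPES = ("CUSTOMER", "HUMAN_AGENT", "AI_AGENT")


def validate_participants(participants, uses_media_url=False):
    """Return a list of error strings for a participants array (empty if valid)."""
    errors = []
    if not participants:
        # With mediaUrl, participant data can't be inferred from call metadata.
        if uses_media_url:
            errors.append("participants must be supplied when mediaUrl is used")
        return errors

    for position, participant in enumerate(participants):
        channel = participant.get("audioChannelIndex")
        if channel not in ALLOWED_CHANNELS:
            errors.append(f"participant {position}: audioChannelIndex must be 1 or 2")
        # type is optional, but must be one of the documented values if present.
        kind = participant.get("type")
        if kind is not None and kind not in ALLOWED_TYPES:
            errors.append(f"participant {position}: unknown type {kind!r}")
    return errors
```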
POST https://voice.twilio.com/v3/Transcriptions
Info
A transcriptionConfigurationId is required to create a Transcription. This ID identifies the configuration that controls transcription behavior such as engine, language, and callbacks. See the Transcription Configuration resource for details.
To submit a Twilio Recording for transcription, provide the Recording SID in sourceId.
curl -X POST https://voice.twilio.com/v3/Transcriptions \
  -u "$TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "transcriptionConfigurationId": "voice_transcriptionconfiguration_5pe8jw3ahdmsh7zr06yh4d45x1",
    "sourceId": "RExxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "participants": [
      { "audioChannelIndex": 1, "type": "CUSTOMER", "address": "+15551234567", "name": "Jane Doe" },
      { "audioChannelIndex": 2, "type": "HUMAN_AGENT", "address": "+15559876543", "name": "Agent Smith" }
    ]
  }'
{
  "status": "PENDING",
  "statusUrl": "https://voice.twilio.com/v3/Transcriptions/voice_transcription_7n7hnd7sf68yfv5re42nvvz7aa",
  "transcription": {
    "id": "voice_transcription_7n7hnd7sf68yfv5re42nvvz7aa",
    "accountId": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "status": "PENDING",
    "transcriptionConfigurationId": "voice_transcriptionconfiguration_5pe8jw3ahdmsh7zr06yh4d45x1",
    "sourceId": "RExxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "mediaUrl": null,
    "audioStartedAt": "2026-03-10T19:42:16Z",
    "conversationId": null,
    "participants": [
      { "audioChannelIndex": 1, "type": "CUSTOMER", "address": "+15551234567", "name": "Jane Doe" },
      { "audioChannelIndex": 2, "type": "HUMAN_AGENT", "address": "+15559876543", "name": "Agent Smith" }
    ],
    "duration": null,
    "resolvedConfiguration": {
      "transcriptionEngine": "deepgram",
      "speechModel": "nova-3",
      "language": "en-US",
      "transcriptionStatusCallback": {
        "url": "https://example.com/transcription/callback",
        "method": "POST",
        "events": null
      },
      "conversationConfigurationId": "conv_configuration_5pe8jw3ahdmsh7zr06yh4d45x1",
      "participantDefaults": [
        { "audioChannelIndex": 1, "type": "CUSTOMER" },
        { "audioChannelIndex": 2, "type": "HUMAN_AGENT" }
      ]
    },
    "createdAt": "2026-04-02T19:25:20Z",
    "updatedAt": "2026-04-02T19:25:20Z",
    "url": "https://voice.twilio.com/v3/Transcriptions/voice_transcription_7n7hnd7sf68yfv5re42nvvz7aa"
  }
}
To submit an externally hosted audio file for transcription, provide its URL in mediaUrl.
curl -X POST https://voice.twilio.com/v3/Transcriptions \
  -u "$TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "transcriptionConfigurationId": "voice_transcriptionconfiguration_5pe8jw3ahdmsh7zr06yh4d45x1",
    "mediaUrl": "https://example.com/audio/recording.wav",
    "audioStartedAt": "2026-03-10T19:42:16Z",
    "participants": [
      { "audioChannelIndex": 1, "type": "CUSTOMER", "address": "+15551234567", "name": "Jane Doe" },
      { "audioChannelIndex": 2, "type": "HUMAN_AGENT", "address": "+15559876543", "name": "Agent Smith" }
    ]
  }'
{
  "status": "PENDING",
  "statusUrl": "https://voice.twilio.com/v3/Transcriptions/voice_transcription_4abcde1234567890abcde12345",
  "transcription": {
    "id": "voice_transcription_4abcde1234567890abcde12345",
    "accountId": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "status": "PENDING",
    "transcriptionConfigurationId": "voice_transcriptionconfiguration_5pe8jw3ahdmsh7zr06yh4d45x1",
    "sourceId": null,
    "mediaUrl": "https://example.com/audio/recording.wav",
    "audioStartedAt": "2026-03-10T19:42:16Z",
    "conversationId": null,
    "participants": [
      { "audioChannelIndex": 1, "type": "CUSTOMER", "address": "+15551234567", "name": "Jane Doe" },
      { "audioChannelIndex": 2, "type": "HUMAN_AGENT", "address": "+15559876543", "name": "Agent Smith" }
    ],
    "duration": null,
    "resolvedConfiguration": {
      "transcriptionEngine": "deepgram",
      "speechModel": "nova-3",
      "language": "en-US",
      "transcriptionStatusCallback": {
        "url": "https://example.com/transcription/callback",
        "method": "POST",
        "events": null
      },
      "conversationConfigurationId": "conv_configuration_5pe8jw3ahdmsh7zr06yh4d45x1",
      "participantDefaults": [
        { "audioChannelIndex": 1, "type": "CUSTOMER" },
        { "audioChannelIndex": 2, "type": "HUMAN_AGENT" }
      ]
    },
    "createdAt": "2026-04-02T19:25:20Z",
    "updatedAt": "2026-04-02T19:25:20Z",
    "url": "https://voice.twilio.com/v3/Transcriptions/voice_transcription_4abcde1234567890abcde12345"
  }
}
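For readers who prefer Python over curl, the create requests above could be assembled with the standard library roughly as follows. This is a sketch, not an official client: build_create_request and send are hypothetical names, and the only details taken from this page are the endpoint URL, HTTP Basic authentication with the Account SID and Auth Token, and the JSON content type.

```python
import base64
import json
import urllib.request

API_URL = "https://voice.twilio.com/v3/Transcriptions"


def build_create_request(account_sid, auth_token, payload):
    """Build (but do not send) a POST request for the create endpoint."""
    # Equivalent of curl's -u flag: HTTP Basic auth with SID and token.
    credentials = f"{account_sid}:{auth_token}".encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/json",
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def send(request):
    """Send a prepared request and return (parsed JSON body, response headers)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response), response.headers
```

Building the request separately from sending it makes the body and headers easy to inspect or log before any network call is made.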
Info
Use the statusUrl from the response to poll for progress. The response includes a Retry-After header with the recommended number of seconds to wait before polling again.
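A polling loop that honors Retry-After might look like the sketch below. The fetch callable is a hypothetical stand-in for whatever performs the GET on statusUrl and returns the parsed body together with the response headers. Only the PENDING and COMPLETED statuses appear on this page; treating "FAILED" as a terminal status, the five-second fallback delay, and the attempt limit are all assumptions.

```python
import time


def retry_after_seconds(headers, default=5):
    """Read Retry-After as whole seconds, falling back to a default."""
    try:
        seconds = int(headers.get("Retry-After"))
    except (TypeError, ValueError):
        return default
    return seconds if seconds >= 0 else default


def poll_transcription(fetch, sleep=time.sleep, max_attempts=60,
                       terminal_statuses=("COMPLETED", "FAILED")):
    """Call fetch() until the status is terminal, waiting as the server suggests.

    fetch must return a (body_dict, headers_mapping) pair.
    "FAILED" is an assumed status name, not one documented on this page.
    """
    for _ in range(max_attempts):
        body, headers = fetch()
        if body.get("status") in terminal_statuses:
            return body
        sleep(retry_after_seconds(headers))
    raise TimeoutError("transcription did not reach a terminal status")
```

Passing sleep as a parameter keeps the loop testable without real delays.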
GET https://voice.twilio.com/v3/Transcriptions/{transcriptionId}
curl -X GET "https://voice.twilio.com/v3/Transcriptions/voice_transcription_7n7hnd7sf68yfv5re42nvvz7aa" \
  -u "$TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN"
{
  "operationId": "voice_transcription_7n7hnd7sf68yfv5re42nvvz7aa",
  "status": "COMPLETED",
  "statusUrl": "https://voice.twilio.com/v3/Transcriptions/voice_transcription_7n7hnd7sf68yfv5re42nvvz7aa",
  "transcription": {
    "id": "voice_transcription_7n7hnd7sf68yfv5re42nvvz7aa",
    "accountId": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "status": "COMPLETED",
    "transcriptionConfigurationId": "voice_transcriptionconfiguration_5pe8jw3ahdmsh7zr06yh4d45x1",
    "sourceId": "RExxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "mediaUrl": null,
    "audioStartedAt": "2026-03-10T19:42:16Z",
    "conversationId": "conv_conversation_01k1etx3jbfx88476ccja0889c",
    "participants": [
      { "audioChannelIndex": 1, "type": "CUSTOMER", "address": "+15551234567", "name": "Jane Doe" },
      { "audioChannelIndex": 2, "type": "HUMAN_AGENT", "address": "+15559876543", "name": "Agent Smith" }
    ],
    "duration": 120,
    "resolvedConfiguration": {
      "transcriptionEngine": "deepgram",
      "speechModel": "nova-3",
      "language": "en-US",
      "transcriptionStatusCallback": {
        "url": "https://example.com/transcription/callback",
        "method": "POST",
        "events": null
      },
      "conversationConfigurationId": "conv_configuration_5pe8jw3ahdmsh7zr06yh4d45x1",
      "participantDefaults": [
        { "audioChannelIndex": 1, "type": "CUSTOMER" },
        { "audioChannelIndex": 2, "type": "HUMAN_AGENT" }
      ]
    },
    "createdAt": "2026-04-02T19:25:20Z",
    "updatedAt": "2026-04-02T19:25:20Z",
    "url": "https://voice.twilio.com/v3/Transcriptions/voice_transcription_7n7hnd7sf68yfv5re42nvvz7aa"
  }
}
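Once a response like the one above has been parsed, the fields most callers need can be pulled into a flat summary. The helper summarize_transcription below is hypothetical and relies only on field names shown in the sample response; it assumes optional fields may be null or absent.

```python
def summarize_transcription(response):
    """Extract commonly used fields from a parsed fetch response."""
    transcription = response["transcription"]

    # Map each audio channel to a display label: the name if present, else the type.
    speakers = {}
    for participant in transcription.get("participants") or []:
        label = participant.get("name") or participant.get("type")
        speakers[participant["audioChannelIndex"]] = label

    return {
        "id": transcription["id"],
        "status": transcription["status"],
        "duration": transcription.get("duration"),
        "conversationId": transcription.get("conversationId"),
        "speakersByChannel": speakers,
    }
```

For the sample response, speakersByChannel would map channel 1 to "Jane Doe" and channel 2 to "Agent Smith".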