Customize Your Conversational Intelligence Configuration


This page covers custom configurations you can use with Conversational Intelligence. While the Onboarding Guide helps you set up the basic workflow for Conversational Intelligence using Twilio Voice recordings, you can also set up Conversational Intelligence with the following changes:

  • Alternative sources, including third-party media recordings outside of Twilio and audio from Twilio Video recordings.
  • API features and optimizations, which are best practices when you use the API and features available only through the Transcript API resource.

Use a source other than Twilio Voice Transcripts

Use third-party media recordings

Conversational Intelligence supports third-party media recordings. If your call recordings aren't stored in Twilio and you want to use them with Conversational Intelligence, the recordings must be publicly accessible for the duration of transcription. You can host the recordings yourself or use a time-limited pre-signed URL.

For example, to share a recording stored in an existing AWS S3 bucket, follow this guide. Then add the public recording URL as the media_url when you create a transcript with the API.
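
As a minimal sketch of that last step, the shell snippet below wraps a recording URL in the JSON that the request carries and shows the request itself as a comment. The endpoint, the ServiceSid and Channel parameter names, and the media_properties wrapper are assumptions that you should check against the Transcript API reference. The build_channel helper and the example URL are hypothetical.

```shell
# Hypothetical helper: wrap a recording URL in the JSON object that is
# assumed to be the Channel parameter of the Transcript resource.
build_channel() {
  # $1: publicly accessible (or pre-signed) recording URL
  printf '{"media_properties":{"media_url":"%s"}}' "$1"
}

# Placeholder URL, not a real recording.
MEDIA_URL="https://example-bucket.s3.amazonaws.com/recordings/call-001.wav"
CHANNEL=$(build_channel "$MEDIA_URL")
echo "$CHANNEL"

# To send the request (assumed endpoint and parameter names; the GA...
# value is a placeholder for your Intelligence Service SID):
# curl -X POST "https://intelligence.twilio.com/v2/Transcripts" \
#   --data-urlencode "ServiceSid=GAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
#   --data-urlencode "Channel=$CHANNEL" \
#   -u "$TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN"
```

The helper does no JSON escaping, so it only suits URLs without double quotes or backslashes. A pre-signed URL must stay valid until transcription finishes.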

Create an audio recording from Twilio Video

If you want to transcribe the audio of a Twilio Video recording, the recording needs additional processing to create an audio file that you can submit for transcription.

To create a dual-channel audio recording, first transcode a separate audio-only composition for each participant in the Video Room.

Create a dual-channel audio recording

curl -X POST "https://video.twilio.com/v1/Compositions" \
  --data-urlencode "AudioSources=PAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
  --data-urlencode "StatusCallback=https://www.example.com/callbacks" \
  --data-urlencode "Format=mp4" \
  --data-urlencode "RoomSid=RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
  -u $TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN

Next, download the media from these compositions and merge them into a single stereo audio file.

Download the Video Room Media

ffmpeg -i speaker1.mp4 -i speaker2.mp4 -filter_complex "[0:a][1:a]amerge=inputs=2[a]" -map "[a]" -f flac -bits_per_raw_sample 16 -ar 44100 output.flac

If the recording duration differs between participants, you can avoid overlapping audio tracks by using ffmpeg to create a single stereo audio track with a delay that covers the difference in track length. For example, if one audio track lasts 63 seconds and the other lasts 67 seconds, use ffmpeg to create a stereo file in which the first track has four seconds of delay so that it matches the length of the second track.

Create a single stereo audio track

ffmpeg -i speaker1.wav -i speaker2.wav -filter_complex "aevalsrc=0:d=${second_to_delay}[s1];[s1][1:a]concat=n=2:v=0:a=1[ac2];[0:a]apad[ac1];[ac1][ac2]amerge=2[a]" -map "[a]" -f flac -bits_per_raw_sample 16 -ar 44100 output.flac
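
The command above expects the delay in the second_to_delay shell variable. One possible way to set it, sketched here under the assumption that the delay equals the difference between the two track lengths in whole seconds, is a small helper. The delay_seconds function is hypothetical and not part of ffmpeg or Twilio.

```shell
# Hypothetical helper: seconds of silence needed so that the shorter
# track matches the longer one. Both arguments are whole seconds.
delay_seconds() {
  if [ "$1" -ge "$2" ]; then
    echo $(( $1 - $2 ))
  else
    echo $(( $2 - $1 ))
  fi
}

# The example from the text: one track lasts 63 seconds, the other 67.
second_to_delay=$(delay_seconds 63 67)
echo "$second_to_delay"   # prints 4

# Track durations can be read with ffprobe, for example:
# ffprobe -v error -show_entries format=duration \
#   -of default=noprint_wrappers=1:nokey=1 speaker1.wav
# ffprobe reports fractional seconds, so round the value before
# passing it to delay_seconds.
```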

Finally, send a CreateTranscript request to Conversational Intelligence by providing a publicly accessible URL for this audio file as media_url in MediaSource.


API features and optimizations

Include metadata for conversation participants

By default, Conversational Intelligence assumes Participant one is on channel one, and Participant two is on channel two. If your use case doesn't follow this channel mapping, you can provide optional Participant metadata that maps the participant to the correct audio channel when you create a transcript with the API. You can also use this field to attach other participant metadata to the transcript.
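
The following sketch shows what an explicit channel mapping could look like when it is passed alongside the media URL. The field names (participants, channel_participant, role, full_name, media_properties, media_url) are assumptions to verify against the Transcript API reference. The helper functions, names, and URL are hypothetical.

```shell
# Hypothetical helper: one participant entry (field names are assumed).
participant() {
  # $1: audio channel (1 or 2), $2: role, $3: display name
  printf '{"channel_participant":%d,"role":"%s","full_name":"%s"}' "$1" "$2" "$3"
}

# Hypothetical helper: Channel JSON with a media URL and two participants.
build_channel() {
  # $1: media URL, $2 and $3: participant JSON objects
  printf '{"media_properties":{"media_url":"%s"},"participants":[%s,%s]}' "$1" "$2" "$3"
}

# Map the customer to channel 1 and the agent to channel 2.
CUSTOMER=$(participant 1 "Customer" "Alex Customer")
AGENT=$(participant 2 "Agent" "Sam Agent")
CHANNEL=$(build_channel "https://example.com/recordings/call-001.flac" "$CUSTOMER" "$AGENT")
echo "$CHANNEL"
```

The resulting JSON string would be sent as the Channel value of the create request. Building the entries with printf keeps the sketch dependency-free, but it does no escaping, so a tool such as jq is a safer choice for names that contain quotes.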

Provide a CustomerKey when you create a transcript with the API

You can provide a CustomerKey when you create a transcript with the API to map the Transcript to an internal identifier, such as a unique ID that your system uses to track transcripts. The CustomerKey is also included in the webhook callback when the results for the Transcript and Operators become available. This field is optional, and you can't use a CustomerKey in place of the Transcript SID in the APIs.
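
As an illustration, the sketch below derives a CustomerKey from an internal ticket number and shows where it would go in the create request. The ticket-based key format and the customer_key helper are hypothetical. The endpoint and the ServiceSid and Channel parameters are assumptions to verify against the Transcript API reference.

```shell
# Hypothetical helper: derive a CustomerKey from an internal ticket ID.
customer_key() {
  # $1: ticket number in your own system
  printf 'ticket-%s' "$1"
}

CUSTOMER_KEY=$(customer_key 48213)
echo "$CUSTOMER_KEY"   # prints ticket-48213

# To send the request (assumed endpoint and parameter names; CHANNEL
# holds the Channel JSON for the recording):
# curl -X POST "https://intelligence.twilio.com/v2/Transcripts" \
#   --data-urlencode "ServiceSid=GAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
#   --data-urlencode "Channel=$CHANNEL" \
#   --data-urlencode "CustomerKey=$CUSTOMER_KEY" \
#   -u "$TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN"
```

Store the Transcript SID from the response as well, because the CustomerKey can't replace it in later API calls.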