Upload Prerecorded Conversations for Voice Intelligence

February 15, 2024

With Twilio’s Voice Intelligence product, every call can be a source of data. Conversations are transcribed, and keywords and insights are surfaced to help you understand your customers while staying within regulatory policies. Gone are the days of manually listening to old audio recordings to gather information. In this post, we’ll cover uploading an audio recording and using Voice Intelligence to transcribe it and gather information from the call.

Prerequisites

Setup

If you don't have a pre-recorded conversation handy but want to follow along, you can create one with a friend. (Remember to give your friend notice that you will record the conversation!)

The following steps will guide you through the process. If you have a dual-channel recorded call handy, proceed to the section titled “Create a Voice Intelligence Service in the Twilio Console”.

Record and download a call

Follow this blog to buy a Twilio number and set up a call forwarding feature. This way your Twilio number will act as a proxy when contacting your friend. When setting up the Connect Call To widget, make sure Record Call is toggled ON. Publish your Flow, assign it to a Twilio number, and you're ready to make your call!

Once the call has ended, head to the left panel of the Console and select Monitor > Logs > Call Recordings. Click the recording you'd like to use and download it in WAV format.

Recordings must be in stereo format. Mono recordings will appear as a one-person conversation.
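If you'd like to confirm that a file is stereo before uploading it, the sketch below reads the channel count from the WAV header. It assumes a standard RIFF/WAVE file, and wavChannelCount is a hypothetical helper name, not part of any Twilio library.

```javascript
// Hypothetical helper: return the number of audio channels in a WAV file.
// Assumes a standard RIFF/WAVE layout with a "fmt " chunk.
function wavChannelCount(buffer) {
  // A RIFF/WAVE file starts with "RIFF", a 4-byte size, then "WAVE".
  if (
    buffer.length < 12 ||
    buffer.toString('ascii', 0, 4) !== 'RIFF' ||
    buffer.toString('ascii', 8, 12) !== 'WAVE'
  ) {
    throw new Error('Not a RIFF/WAVE file');
  }

  // Walk the chunks until the "fmt " chunk is found.
  let offset = 12;
  while (offset + 8 <= buffer.length) {
    const chunkId = buffer.toString('ascii', offset, offset + 4);
    const chunkSize = buffer.readUInt32LE(offset + 4);
    if (chunkId === 'fmt ') {
      if (offset + 12 > buffer.length) {
        throw new Error('Truncated "fmt " chunk');
      }
      // The channel count is the second 16-bit field in the chunk body.
      return buffer.readUInt16LE(offset + 10);
    }
    // Chunk bodies are padded to an even number of bytes.
    offset += 8 + chunkSize + (chunkSize % 2);
  }
  throw new Error('No "fmt " chunk found');
}

// Example usage (the file name is a placeholder):
// const channels = wavChannelCount(require('fs').readFileSync('recording.wav'));
// console.log(channels === 2 ? 'Stereo' : `Found ${channels} channel(s)`);
```

A result of 2 means the recording is dual-channel; a result of 1 means it is mono.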

Upload the recording to an AWS S3 bucket

Once downloaded, head over to AWS and create an S3 bucket. Upload the recording as an object in the bucket.

In your S3 bucket, select the recording object, click the Actions drop-down menu, and select Share with a pre-signed URL. An expiration time of 2 hours is recommended. Click the submit button to create the pre-signed URL. A confirmation should appear, and the URL is automatically copied to your clipboard.
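Because the pre-signed URL expires, it can help to check how long it remains valid before you run the upload script. The sketch below assumes the URL carries the X-Amz-Date and X-Amz-Expires query parameters that Signature Version 4 pre-signed URLs normally include; presignedUrlExpiry is a hypothetical helper name.

```javascript
// Hypothetical helper: compute when a pre-signed S3 URL stops working.
// Assumes the URL includes X-Amz-Date (e.g. 20240215T120000Z) and
// X-Amz-Expires (lifetime in seconds) query parameters.
function presignedUrlExpiry(presignedUrl) {
  const params = new URL(presignedUrl).searchParams;
  const signedAt = params.get('X-Amz-Date') || '';
  const lifetimeSeconds = params.get('X-Amz-Expires') || '';

  const match = /^(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})Z$/.exec(signedAt);
  if (!match || !/^\d+$/.test(lifetimeSeconds)) {
    throw new Error('URL is missing X-Amz-Date or X-Amz-Expires');
  }

  // Skip the full match, then convert each captured group to a number.
  const [year, month, day, hour, minute, second] = match.slice(1).map(Number);
  const signedAtMs = Date.UTC(year, month - 1, day, hour, minute, second);
  return new Date(signedAtMs + Number(lifetimeSeconds) * 1000);
}

// Example usage (paste your own URL):
// const expiry = presignedUrlExpiry('<your pre-signed URL>');
// console.log(expiry > new Date() ? `Valid until ${expiry.toISOString()}` : 'Expired');
```

If the URL has already expired, generate a new one in the S3 console before continuing.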

Create a Voice Intelligence Service in the Twilio Console

Head to the Twilio Console and navigate to Voice Intelligence/Services. Create a new Service and give it a unique name. Services use AI to analyze phone calls for keywords and determine intent. All selections when creating a Service are optional.

After your Service is created, select any prebuilt language operators provided by Twilio and create any language operators you'd like to include.

Upload the pre-recorded call to the Twilio Console

Make a directory for this project, run npm init, and accept all the defaults.

mkdir upload-call-voice-intelligence
cd upload-call-voice-intelligence
npm init

npm init is a utility to create a package.json file. During initialization, you’ll be prompted to define fields such as the package name, version, description, author, etc., where you can use default or custom values.

Inside the directory, install the necessary dependencies.

npm i --save dotenv twilio

Next, add a .env file containing the Account SID and Auth Token for your Twilio account. Both credentials can be found on the home page of the Twilio Console.

TWILIO_ACCOUNT_SID={REPLACE-WITH-ACCOUNT-SID}
TWILIO_AUTH_TOKEN={REPLACE-WITH-AUTH-TOKEN}
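A quick sanity check can catch a .env file that still contains placeholders. The sketch below relies on the fact that Twilio Account SIDs start with AC followed by 32 hexadecimal characters; checkTwilioEnv is a hypothetical helper, not part of the Twilio SDK.

```javascript
// Hypothetical helper: return a list of problems found in the credentials.
function checkTwilioEnv(env) {
  const problems = [];
  const sid = env.TWILIO_ACCOUNT_SID || '';
  const token = env.TWILIO_AUTH_TOKEN || '';

  // Account SIDs look like "AC" followed by 32 hexadecimal characters.
  if (!/^AC[0-9a-f]{32}$/i.test(sid)) {
    problems.push('TWILIO_ACCOUNT_SID should be "AC" followed by 32 hex characters');
  }
  // Only check that the token was filled in; its value is never printed.
  if (token === '' || token.includes('REPLACE-WITH')) {
    problems.push('TWILIO_AUTH_TOKEN is missing or still a placeholder');
  }
  return problems;
}

// Example usage, after require("dotenv").config():
// checkTwilioEnv(process.env).forEach(problem => console.error(problem));
```

An empty list means both values at least have the expected shape; it does not prove the credentials are valid.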

Create a file called create-transcript.js and add the following code:

require("dotenv").config();

const accountSid = process.env.TWILIO_ACCOUNT_SID;
const authToken = process.env.TWILIO_AUTH_TOKEN;
const client = require('twilio')(accountSid, authToken);


/* Enter the AWS pre-signed media URL and the Voice Intelligence Service SID */
const awsMediaUrl = "REPLACE-WITH-AWS-PRESIGNED-URL";
const serviceSid = "REPLACE-WITH-VOICE-INTELLIGENCE-SERVICE-SID";


// Channel 1 is the Agent (you) and channel 2 is the Customer (your friend)
const participants = [
  {
    "user_id": "id1", // Do not change
    "channel_participant": 1, // Do not change
    "media_participant_id": "REPLACE-WITH-YOUR-TWILIO-PHONE-NUMBER",
    "email": "REPLACE-WITH-YOUR-EMAIL",
    "full_name": "REPLACE-WITH-YOUR-NAME",
    "role": "Agent" // Do not change
  },
  {
    "user_id": "id2", // Do not change
    "channel_participant": 2, // Do not change
    "media_participant_id": "REPLACE-WITH-FRIEND-PHONE-NUMBER",
    "email": "REPLACE-WITH-FRIEND-EMAIL",
    "full_name": "REPLACE-WITH-FRIEND-NAME",
    "role": "Customer" // Do not change
  }
];

const channel = {
  "media_properties": {
    "source_sid": null,
    "media_url": awsMediaUrl
  },
  participants
};

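// Submit the recording to the Service and print the new transcript's SID
client.intelligence.v2.transcripts
  .create({
    serviceSid,
    channel
  })
  .then(transcript => console.log(transcript.sid))
  .catch(error => console.error(error));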

Fill in the placeholders with your own information. To find the Voice Intelligence Service SID, go to Voice Intelligence/Services in the Twilio Console and copy the SID of the newly created Service. The full name and email will be viewable in the Console once the code has run and the call has been uploaded. Participant information is optional.
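Since participant details are optional, you may prefer to build the channel object from a small function so the fixed fields stay out of the way. This is only a sketch of the same structure used in create-transcript.js; buildChannel is a hypothetical helper and not part of the Twilio SDK.

```javascript
// Hypothetical helper: build the channel object for a dual-channel recording.
// agent and customer hold the optional details (media_participant_id,
// email, full_name); the fixed fields are applied last so they always win.
function buildChannel(mediaUrl, agent = {}, customer = {}) {
  return {
    media_properties: { media_url: mediaUrl },
    participants: [
      { ...agent, user_id: 'id1', channel_participant: 1, role: 'Agent' },
      { ...customer, user_id: 'id2', channel_participant: 2, role: 'Customer' },
    ],
  };
}

// Example usage with placeholder values:
// const channel = buildChannel(awsMediaUrl, { full_name: 'Your Name' });
```

The returned object can be passed as the channel value in the create call.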

Run the file by pasting the following command in your terminal:

node create-transcript.js

Now that the call has been uploaded to the Twilio Console for analysis by the Service, copy the transcript SID printed in your terminal for the next step.

Create a Transcription of the recording

Create a file called pull-transcript-information.js and add the following code:

require("dotenv").config();
const fs = require('fs');

const accountSid = process.env.TWILIO_ACCOUNT_SID;
const authToken = process.env.TWILIO_AUTH_TOKEN;
const client = require('twilio')(accountSid, authToken);

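// Print the type, name, and confidence of each language operator result
client.intelligence.v2.transcripts('<your transcript SID>')
  .operatorResults
  .list({limit: 20})
  .then(operatorResults => operatorResults.forEach(operator =>
    console.log({
      operatorType: operator.operatorType,
      name: operator.name,
      // Use the label probabilities when present, otherwise the match probability
      probability: Object.keys(operator.labelProbabilities || {}).length
        ? operator.labelProbabilities
        : operator.matchProbability,
    })
  ))
  .catch(error => console.error(error));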

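let transcript = "";

// Collect the transcribed sentences and write them to transcript.txt
client.intelligence.v2.transcripts('<your transcript SID>')
  .sentences
  .list({limit: 20}) // Returns at most 20 sentences; raise the limit for longer calls
  .then(sentences => {
    sentences.forEach(s => {
      transcript += `${s.transcript}\n`;
    });
    fs.writeFile("transcript.txt", transcript, (err) => {
      if (err) {
        console.log(err);
      } else {
        console.log("File written successfully\n");
      }
    });
  })
  .catch(error => console.error(error));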

Replace both '<your transcript SID>' placeholders with the transcript SID logged in the previous step.

Run the file by pasting the following command in your terminal:

node pull-transcript-information.js

The output in your terminal will display the level of confidence for the language operators applied in your Service, and a new file, transcript.txt, will be created in the project folder!
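The transcript file lists sentences without saying who spoke them. As a rough sketch, you could prefix each sentence with a speaker label. This assumes each sentence object exposes a mediaChannel number that lines up with the channel_participant values set earlier (1 for the Agent, 2 for the Customer); labelSentences is a hypothetical helper.

```javascript
// Hypothetical helper: prefix each sentence with the speaker's role.
// Assumes sentence.mediaChannel matches the participant's channel number.
function labelSentences(sentences, roles = { 1: 'Agent', 2: 'Customer' }) {
  return sentences
    .map(s => {
      // Fall back to the raw channel number when no role is mapped.
      const speaker = roles[s.mediaChannel] || `Channel ${s.mediaChannel}`;
      return `${speaker}: ${s.transcript}`;
    })
    .join('\n');
}

// Example usage inside the sentences .then() callback:
// transcript = labelSentences(sentences);
```

If the labels come out reversed for your recording, swap the entries in the roles map.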

Next Steps

Once the pre-recorded call is uploaded, the transcript and invoked operators can be viewed in the Console. From there, you can export the operator results to an external data source so another team can analyze them and discover trends. To learn more about Twilio’s Voice Intelligence product, follow this tour to see its features!
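One simple way to hand operator results to another team is to flatten them into CSV. The sketch below is an illustration only: it assumes labelProbabilities is an object that maps labels to probabilities, as the script above treats it, and operatorResultsToCsv and csvCell are hypothetical helper names.

```javascript
// Hypothetical helper: quote a value when it contains a comma, quote, or newline.
function csvCell(value) {
  const text = value === null || value === undefined ? '' : String(value);
  return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Hypothetical helper: one CSV row per label, or one row per operator
// when only a match probability is available.
function operatorResultsToCsv(results) {
  const rows = [['operatorType', 'name', 'label', 'probability']];
  for (const result of results) {
    const labels = Object.entries(result.labelProbabilities || {});
    if (labels.length > 0) {
      for (const [label, probability] of labels) {
        rows.push([result.operatorType, result.name, label, probability]);
      }
    } else {
      rows.push([result.operatorType, result.name, '', result.matchProbability]);
    }
  }
  return rows.map(row => row.map(csvCell).join(',')).join('\n');
}

// Example usage inside the operatorResults .then() callback:
// require('fs').writeFileSync('operator-results.csv', operatorResultsToCsv(operatorResults));
```

The resulting file opens in any spreadsheet tool, which makes it easy to combine results from several calls.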

Voice Intelligence is currently available as a public beta release. Some features are not yet implemented and others may be changed before the product is declared as Generally Available.

My name is Kaelyn and I'm a Developer Evangelist at Twilio whose primary focus is Twilio’s Voice API. If you have any questions, comments, or would like to show me any cool projects you’re working on, let me know online!

Email: kchresfield@twilio.com

GitHub: kchresfield