
TwiML™ Voice: <Record>

The <Record> verb records the caller's voice and returns to you the URL of a file containing the audio recording. You can optionally generate text transcriptions of recorded calls by setting the 'transcribe' attribute of the <Record> verb to 'true'.

Verb Attributes

The <Record> verb supports the following attributes that modify its behavior:

Attribute Name       Allowed Values              Default Value
action               relative or absolute URL    current document URL
method               GET, POST                   POST
timeout              positive integer            5
finishOnKey          any digit, #, *             1234567890*#
maxLength            integer greater than 1      3600 (1 hour)
transcribe           true, false                 false
transcribeCallback   relative or absolute URL    none
playBeep             true, false                 true
trim                 trim-silence, do-not-trim   trim-silence

Use one or more of these attributes in a <Record> verb like so:

<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Record timeout="10" transcribe="true" />
</Response>

action

The 'action' attribute takes an absolute or relative URL as a value. When the recording is finished, Twilio will make a GET or POST request to this URL including the parameters below. If no 'action' is provided, <Record> will default to requesting the current document's URL.

After making this request, Twilio will continue the current call using the TwiML received in your response. Keep in mind that by default Twilio will re-request the current document's URL, which can lead to unwanted looping behavior if you're not careful. Any TwiML verbs occurring after a <Record> are unreachable.

There is one exception: if Twilio receives an empty recording, it will not make a request to the 'action' URL. The current call flow will continue with the next verb in the current TwiML document.
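One way to avoid the looping behavior described above is to have the 'action' URL return a document that contains no further <Record> verb. The following is a minimal sketch of such a response, not a definitive implementation. The handler location in the comment and the spoken text are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical response returned by your 'action' URL, e.g. http://example.com/handle_recording -->
<Response>
    <!-- No <Record> here, so the call does not loop back into another recording. -->
    <Say>Thank you. Your message has been recorded.</Say>
    <Hangup/>
</Response>
```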

Request Parameters

Twilio will pass the following parameters in addition to the standard TwiML Voice request parameters with its request to the 'action' URL:

Parameter           Description
RecordingUrl        the URL of the recorded audio
RecordingDuration   the duration of the recorded audio (in seconds)
Digits              the key (if any) pressed to end the recording or 'hangup' if the caller hung up

A request to the RecordingUrl will return a recording in binary WAV audio format by default. To request the recording in MP3 format, append ".mp3" to the RecordingUrl.
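As an illustration of using the 'RecordingUrl' parameter, the 'action' handler could respond with TwiML that plays the recording back to the caller in MP3 format. This is only a sketch: RECORDING_URL is a placeholder, not a real address, and your handler would need to substitute the 'RecordingUrl' value it received in the request.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical response from your 'action' URL. -->
<Response>
    <Say>Here is what you recorded.</Say>
    <!-- RECORDING_URL is a placeholder for the RecordingUrl request parameter; ".mp3" requests MP3 instead of WAV. -->
    <Play>RECORDING_URL.mp3</Play>
</Response>
```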

method

The 'method' attribute takes the value 'GET' or 'POST'. This tells Twilio whether to request the 'action' URL via HTTP GET or POST. This attribute is modeled after the HTML form 'method' attribute. 'POST' is the default value.

timeout

The 'timeout' attribute tells Twilio to end the recording after a number of seconds of silence has passed. The default is 5 seconds.

finishOnKey

The 'finishOnKey' attribute lets you choose a set of digits that end the recording when entered. For example, if you set 'finishOnKey' to '#' and the caller presses '#', Twilio will immediately stop recording and submit 'RecordingUrl', 'RecordingDuration', and the '#' as parameters in a request to the 'action' URL. The allowed values are the digits 0-9, '#' and '*'. The default is '1234567890*#' (i.e. any key will end the recording). Unlike <Gather>, you may specify more than one character as a 'finishOnKey' value.
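For instance, a 'finishOnKey' value with more than one character might look like the sketch below. The prompt wording is an assumption chosen for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Say>Leave a message after the beep, then press pound or star.</Say>
    <!-- Only '#' and '*' are in the finishOnKey set, so either one ends the recording. -->
    <Record finishOnKey="#*" />
</Response>
```

The key that ended the recording is then passed as the 'Digits' parameter in the request to the 'action' URL.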

maxLength

The 'maxLength' attribute lets you set the maximum length for the recording in seconds. If you set 'maxLength' to '30', the recording will automatically end after 30 seconds of recorded time has elapsed. This defaults to 3600 seconds (one hour) for a normal recording and 120 seconds (two minutes) for a transcribed recording.

transcribe

The 'transcribe' attribute tells Twilio that you would like a text representation of the recorded audio. Twilio will pass this recording to our speech-to-text engine and attempt to convert the audio to human-readable text. The 'transcribe' option is off by default. If you do not want transcription, simply omit the 'transcribe' attribute.

Note: transcription is a paid feature. If you include a 'transcribe' or 'transcribeCallback' attribute on your <Record> verb, your account will be charged. See the pricing page for our transcription prices.

Additionally, transcription is currently limited to recordings with a duration of two minutes or less. If you enable transcription and set a 'maxLength' attribute greater than 120 seconds, Twilio will write a warning to your debug log rather than transcribing the recording.
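A minimal sketch of a transcribed recording that stays within the two-minute limit described above:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- maxLength is set to 120 seconds so the recording does not exceed the transcription limit. -->
    <Record transcribe="true" maxLength="120" />
</Response>
```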

transcribeCallback

The 'transcribeCallback' attribute is used in conjunction with the 'transcribe' attribute. It allows you to specify a URL to which Twilio will make an asynchronous POST request when the transcription is complete. This is not a request for TwiML and the response will not change call flow, but the request will contain the standard TwiML request parameters as well as 'TranscriptionSid', 'TranscriptionStatus', 'TranscriptionText', 'TranscriptionUrl', 'RecordingSid' and 'RecordingUrl'.

If 'transcribeCallback' is specified, then there is no need to specify 'transcribe=true'. It is implied. If you specify 'transcribe=true' without a 'transcribeCallback', the completed transcription will be stored for you to retrieve later (see the REST API Transcriptions section), but Twilio will not asynchronously notify your application.

Request Parameters

Twilio will pass the following parameters with its request to the 'transcribeCallback' URL:

Parameter             Description
TranscriptionSid      The unique 34 character ID of the transcription.
TranscriptionText     Contains the text of the transcription.
TranscriptionStatus   The status of the transcription attempt: either 'completed' or 'failed'.
TranscriptionUrl      The URL for the transcription's REST API resource.
RecordingSid          The unique 34 character ID of the recording from which the transcription was generated.
RecordingUrl          The URL for the transcription's source recording resource.
CallSid               A unique identifier for this call, generated by Twilio.
AccountSid            Your Twilio account id. It is 34 characters long, and always starts with the letters AC.
From                  The phone number or client identifier of the party that initiated the call. Phone numbers are formatted with a '+' and country code, e.g. +16175551212 (E.164 format). Client identifiers begin with the client: URI scheme; for example, for a call from a client named 'tommy', the From parameter will be client:tommy.
To                    The phone number or client identifier of the called party. Phone numbers are formatted with a '+' and country code, e.g. +16175551212 (E.164 format). Client identifiers begin with the client: URI scheme; for example, for a call to a client named 'jenny', the To parameter will be client:jenny.
CallStatus            A descriptive status for the call. The value is one of queued, ringing, in-progress, completed, busy, failed or no-answer. See the CallStatus section for more details.
ApiVersion            The version of the Twilio API used to handle this call. For incoming calls, this is determined by the API version set on the called number. For outgoing calls, this is the API version used by the outgoing call's REST API request.
Direction             A string describing the direction of the call: 'inbound' for inbound calls, 'outbound-api' for calls initiated via the REST API, or 'outbound-dial' for calls initiated by a <Dial> verb.
ForwardedFrom         This parameter is set only when Twilio receives a forwarded call, and its value depends on the caller's carrier including forwarding information. Not all carriers support passing this information.

playBeep

The 'playBeep' attribute lets you choose whether a beep is played before the recording starts. The default is 'true'. If you set the value to 'false', no beep sound will be played.

trim

The 'trim' attribute lets you specify whether to trim leading and trailing silence from your audio files. 'trim' defaults to trim-silence, which removes any silence at the beginning or end of your recording. This may cause the duration of the recording to be slightly less than the duration of the call.
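The 'playBeep' and 'trim' attributes can be combined. The sketch below, with an illustrative prompt, starts recording without a beep and keeps any leading and trailing silence in the audio file:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Say>Recording starts now.</Say>
    <!-- No beep is played, and silence at the start and end of the recording is kept. -->
    <Record playBeep="false" trim="do-not-trim" />
</Response>
```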

Nesting Rules

You can't nest any verbs within <Record> and you can't nest <Record> within any other verbs.

Examples

Example 1: Simple Record

Twilio will execute the <Record> verb, causing the caller to hear a beep and the recording to start. If the caller is silent for more than 5 seconds, presses any key (the default 'finishOnKey' set), or reaches the 'maxLength' limit, Twilio will make an HTTP POST request to the default 'action' (the current document URL) with the parameters 'RecordingUrl' and 'RecordingDuration'.

<?xml version="1.0" encoding="UTF-8"?>
<!-- page located at http://example.com/simple_record.xml -->
<Response>  
    <Record/>  
</Response>

Example 2: Record a voicemail

This example shows a simple voicemail prompt. The caller is asked to leave a message at the beep. The <Record> verb beeps and then records up to 20 seconds of audio.

<?xml version="1.0" encoding="UTF-8"?>
<!-- page located at http://example.com/voicemail_record.xml -->
<Response>
    <Say>
        Please leave a message at the beep. 
        Press the star key when finished. 
    </Say>
    <Record 
        action="http://foo.edu/handleRecording.php"
        method="GET" 
        maxLength="20"
        finishOnKey="*"
        />
    <Say>I did not receive a recording</Say>
</Response>
  • If the caller does not speak at all, the <Record> verb exits after 5 seconds of silence, falling through to the next verb in the document. In this case, that is the <Say> verb.
  • If the caller speaks for less than 20 seconds and is then silent for 5 seconds, Twilio makes a GET request to the 'action' URL. The <Say> verb is never reached.
  • If the caller speaks for the full 20 seconds, Twilio makes a GET request to the 'action' URL. The <Say> verb is never reached.

Example 3: Transcribe a recording

Twilio will record the caller. When the recording is complete, Twilio will transcribe the recording and make an HTTP POST request to the 'transcribeCallback' URL with a parameter containing a transcription of the recording.

<?xml version="1.0" encoding="UTF-8"?>
<!-- page located at http://example.com/record_and_transcribe.xml -->
<Response>
    <Record transcribe="true" transcribeCallback="/handle_transcribe.php"/>
</Response> 

Hints and Advanced Uses

  • By default, Twilio will trim leading and trailing silence from your audio files. This may cause the duration of the files to be slightly shorter than the time a caller spends recording them.