High Volume Voice: Scale Outbound Voice Applications with Twilio

February 28, 2023


Scaling Voice applications is complex. From making an API request to Twilio, to ensuring your calls are answered by end users, there is a lot to consider to make your outbound voice campaigns successful and scalable.

This post is best for an existing customer who is launching larger calling campaigns or expanding usage to millions of calls per day or peaks of thousands of calls per minute. By the end of this blog post, you should feel comfortable knowing what to consider when growing your voice usage to 10,000 calls per minute or more.



CPS: How fast can you place calls?

What is CPS?

CPS stands for Calls Per Second. It is the rate at which a Twilio Account can make outbound calls.

Every Twilio account (including subaccounts) has 1 CPS by default. Elastic SIP Trunking defaults to 1 CPS per trunk per region. CPS is a per-account setting.  By default, subaccount CPS does not roll up into the parent account’s CPS or impact other subaccounts. Each account must be configured separately if needed.

CPS does not impact call concurrency. The most common limiting factor for call concurrency is the scalability of your servers.

There is no CPS rate limit for outgoing <Dial> calls or inbound calls. CPS is only for call creation via API.

Read more about CPS in our support article How fast can I place or receive phone calls with Twilio?.

What if you exceed CPS?

API requests for calls that exceed the CPS rate limit will be queued and executed as capacity becomes available. Exceeding the CPS rate on an Elastic SIP Trunk will result in a failed call with a SIP 503 “Trunk CPS limit exceeded” response. Refer to the section “How can you increase CPS?” for more information on raising your CPS limits.

Twilio will queue up to 24 hours of calls. At 1 CPS, this is 86,400 calls.
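The 86,400 figure is simply the number of seconds in 24 hours multiplied by 1 CPS. As a rough sketch of the same arithmetic, the helpers below estimate how long a backlog takes to clear. The function names are illustrative, and the calculation assumes calls leave the queue at exactly your configured CPS.

```python
SECONDS_PER_HOUR = 60 * 60


def calls_placed_in(hours: float, cps: float) -> float:
    """Calls that can be started in `hours` at a steady rate of `cps`."""
    return hours * SECONDS_PER_HOUR * cps


def drain_time_seconds(queued_calls: int, cps: float) -> float:
    """Seconds needed to start `queued_calls` at a steady rate of `cps`."""
    return queued_calls / cps


print(calls_placed_in(24, 1))          # 24 hours at 1 CPS
print(drain_time_seconds(6000, 10))    # a 6,000-call burst at 10 CPS
```

For example, a burst of 6,000 calls at 10 CPS takes 600 seconds to start dialing, so the last call waits about ten minutes.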

Some amount of queuing is normal, and tolerance for queueing depends on your use case. Every workflow is on a spectrum of how much queuing is acceptable. For example, call flows involving live agents waiting on an outbound call will probably tolerate less queuing than automated voice notifications.

Some questions to consider include:

  • Do I have a certain number of calls I must make in a certain time period? Do I have any SLAs for my customers?
  • Do I have historical information about max call volumes? How can I start estimating my requirements?

Building your own queueing logic

Consider a workflow where you place calls on behalf of different customers. During peak times, which customer’s calls take precedence? What if lots of calls are queued and a high-priority customer needs to get a call out first?

The only tool Twilio offers to control a call queue is to cancel calls via the Call Resource. To get that last-minute high-priority call out, you would need to cancel the whole queue and then re-queue all those calls. For the most fine-grained control, it is worth considering building your own queuing logic instead of relying on Twilio’s queue.

With custom logic, you can handle scenarios where calls are queued and you need to pause or throttle dialing, or need to have one client’s calls take precedence. In this scenario, you would build and manipulate any queue within your own application and send API requests to the Call Resource at the rate of your CPS to avoid queuing on Twilio.
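One way to picture this kind of custom logic is a priority queue that your application drains at your CPS. The sketch below is only an illustration, not a Twilio API. `PriorityCallQueue`, `drain`, and the `place_call` hook are hypothetical names, and `place_call` is where your own request to the Call Resource would go.

```python
import heapq
import itertools
import time
from typing import Callable, Optional


class PriorityCallQueue:
    """In-application call queue: a lower priority number dials first."""

    def __init__(self) -> None:
        self._heap: list = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order within a priority
        self.paused = False              # set to True to pause dialing

    def add(self, to_number: str, priority: int = 10) -> None:
        heapq.heappush(self._heap, (priority, next(self._order), to_number))

    def pop(self) -> Optional[str]:
        if self.paused or not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


def drain(queue: PriorityCallQueue,
          place_call: Callable[[str], None],
          cps: float,
          sleep: Callable[[float], None] = time.sleep) -> int:
    """Send queued calls at no more than `cps` per second; return the number sent."""
    interval = 1.0 / cps
    sent = 0
    while True:
        number = queue.pop()
        if number is None:
            break
        place_call(number)  # hypothetical hook: POST to the Call Resource here
        sent += 1
        sleep(interval)     # pacing keeps the request rate at or below CPS
    return sent
```

Because the queue lives in your application, a last-minute high-priority call only needs `queue.add(number, priority=0)` to jump ahead of everything still waiting.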

How can you track CPS?

For Programmable Voice requests, the QueueTime attribute is included in Twilio's response to POST requests to the Calls Resource. You can use this to gauge whether you are exceeding CPS, and by how much. For example, if requests are exceeding CPS and a call is queued for 10,000 milliseconds (10 seconds) before initiating, QueueTime will return 10000, and you may want to slow dialing or increase CPS in the Twilio Console.

ISVs can also track the average queue time across their subaccounts with Voice Insights in the Twilio Console.
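As a rough illustration of acting on QueueTime, the sketch below widens or narrows the gap between your call requests based on the last value you saw. The function name, the tolerance, and the doubling and 0.9 factors are all assumptions chosen for the example, not Twilio recommendations.

```python
def adjust_interval(queue_time_ms: int,
                    interval_s: float,
                    tolerance_ms: int = 1000,
                    min_interval_s: float = 0.05,
                    max_interval_s: float = 5.0) -> float:
    """Return the next delay between call requests, given the last QueueTime."""
    if queue_time_ms > tolerance_ms:
        # Calls are waiting in Twilio's queue: back off by doubling the gap.
        return min(interval_s * 2, max_interval_s)
    # Queueing is within tolerance: speed up gently toward the minimum gap.
    return max(interval_s * 0.9, min_interval_s)
```

With the 10,000 millisecond example above and a current gap of 0.1 seconds, the next gap would be 0.2 seconds.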

To take a more proactive approach, you can model queueing against different CPS using your historical call traffic. A great starting point is this GitHub repository containing Python scripts for extracting call logs from your Twilio account and modeling that call data to compare queueing given different CPS.

Consider the following examples showing the same sample call traffic. When modeled at 15 CPS, 20 CPS and 25 CPS, we can compare peak queuing. At 15 CPS, the high peak is over 80 seconds of queueing, at 20 CPS the peak is around 4 seconds and at 25 CPS there is never more than 1 second of queue.  

[Figure: Call queue size at 15 CPS]

[Figure: Call queue size at 20 CPS]

[Figure: Call queue size at 25 CPS]

Choosing the right CPS is always a negotiation between queuing and price. Tools like this are helpful in these discussions.
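To get a feel for this kind of modeling before pulling real call logs, here is a minimal sketch. It is not the code from the repository mentioned above. It assumes you have a list of request timestamps in seconds and that queued calls start exactly 1/CPS seconds apart. `model_queue_waits` is a hypothetical name.

```python
from typing import List, Sequence


def model_queue_waits(request_times: Sequence[float], cps: float) -> List[float]:
    """Seconds each call would wait if calls start at most `cps` per second."""
    spacing = 1.0 / cps
    next_free = float("-inf")  # earliest time the next call may start
    waits = []
    for t in sorted(request_times):
        start = max(t, next_free)
        waits.append(start - t)
        next_free = start + spacing
    return waits


# Sample traffic: 1,000 requests arriving evenly over 50 seconds (20 per second).
sample = [i * 0.05 for i in range(1000)]
for cps in (15, 20, 25):
    peak = max(model_queue_waits(sample, cps))
    print(f"{cps} CPS -> peak wait {peak:.1f}s")
```

Swapping `sample` for timestamps extracted from your own call logs lets you compare peak queueing at each CPS you are considering.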

How much does CPS cost?

You can find CPS pricing here for Programmable Voice and here for SIP Trunking. CPS scales exponentially, i.e., the 20th added CPS is much more expensive than the 2nd.

CPS is prorated when increased mid-month, but downgrades can only be scheduled for the 1st day of the following month. You cannot increase CPS for just one day or one week mid-month. For example, if you expect a one-day spike in voice traffic, you must increase CPS from that peak day through the end of the month. You can schedule CPS increases and decreases in advance in the Console.

  • If CPS is upgraded on September 1st, then you will be charged for the month and every following month, until you downgrade CPS.
  • If the CPS is upgraded on September 15th, then you will be charged for half of that month (September 15-30).

CPS is billed at the parent account level, so adding 100 extra CPS across 10 subaccounts costs the same as adding 100 CPS at the parent account level. Subaccounts are built to divide business units, not to reduce CPS costs. When calculating your costs, always aggregate your total added CPS across all accounts, keeping in mind that every account and subaccount defaults to 1 CPS.
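A small sketch of that aggregation, assuming you keep your own mapping of account names to configured CPS (the mapping and the function name are illustrative):

```python
from typing import Mapping

DEFAULT_CPS = 1  # every account and subaccount starts at 1 CPS


def total_added_cps(configured_cps: Mapping[str, int]) -> int:
    """Sum the CPS added above the default across a parent and its subaccounts."""
    return sum(max(cps - DEFAULT_CPS, 0) for cps in configured_cps.values())


# Ten subaccounts at 11 CPS each add the same 100 CPS as one parent at 101 CPS.
spread_out = {f"subaccount-{i}": 11 for i in range(10)}
parent_only = {"parent": 101}
print(total_added_cps(spread_out), total_added_cps(parent_only))
```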

How can you increase CPS?

You can self-serve CPS increases in the Twilio Console up to 30 CPS for Programmable Voice and 15 CPS for SIP Trunks. There is no API for CPS.

Beyond these limits, CPS increases must go through your Twilio Account Team.


API concurrency

What is API concurrency?

API concurrency is the limit of simultaneous requests you can make to Twilio’s edge.

There is no queueing for API Requests. Twilio will fail API requests with a 429 error if you exceed your concurrency limit.

You can find out more about our concurrency limits in our Support article on rate limits.

API concurrency is not the same as requests per second. The volume of requests you can make to, for example, the /Calls endpoints is dependent on the turn-around time for each request.

  • For example, say you want to make 500 POST requests to /Calls.
  • This will vary, but let’s say each request takes 0.2s. If you make all 500 requests simultaneously and you had, for example, a 100 POST request concurrency, the first 100 will succeed and the last 400 will fail with a 429 error. Instead, if you make 100 requests at t=0, 100 requests at t=0.2, etc. (where 't' is time), you can stay within concurrency limits.
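Put another way, with a concurrency limit of 100 and a 0.2 second turnaround, the ceiling is roughly 100 / 0.2 = 500 requests per second. Instead of timing batches by hand, one common alternative is to cap in-flight requests with a semaphore. The sketch below assumes an async HTTP client of your choice. `run_with_limit` and `fake_post` are hypothetical names, and `fake_post` stands in for a real POST to /Calls.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


def max_requests_per_second(concurrency_limit: int, request_seconds: float) -> float:
    """Upper bound on throughput when each request holds a slot for `request_seconds`."""
    return concurrency_limit / request_seconds


async def run_with_limit(jobs: Sequence[Callable[[], Awaitable[int]]],
                         limit: int) -> List[int]:
    """Run every job, keeping at most `limit` of them in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job: Callable[[], Awaitable[int]]) -> int:
        async with semaphore:
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))


async def demo(total: int = 500, limit: int = 100):
    in_flight = 0
    peak = 0

    async def fake_post() -> int:
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stand-in for the request turnaround time
        in_flight -= 1
        return 201

    results = await run_with_limit([fake_post] * total, limit)
    return len(results), peak
```

Running `asyncio.run(demo())` completes all 500 simulated requests while never holding more than 100 open at once.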

API concurrency is specific to each account. Subaccounts are meant to be containers for business units and have their own independent concurrency limits. Pace your requests to stay within your concurrency limit, and build resilient systems that retry failed requests with exponential backoff. If you have implemented both of these best practices but continue to receive Error 429 responses, please contact our Support team with your relevant use case information.
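Here is a minimal sketch of retrying with exponential backoff. It assumes a `send` callable that you supply, which performs one request and returns a status code and body. The attempt count, delays, and jitter are illustrative defaults rather than values from Twilio.

```python
import random
import time
from typing import Callable, Tuple


def post_with_backoff(send: Callable[[], Tuple[int, str]],
                      max_attempts: int = 5,
                      base_delay_s: float = 0.5,
                      max_delay_s: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call `send` until it returns a non-429 status or attempts run out."""
    status, body = 0, ""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Double the ceiling on each retry: 0.5s, 1s, 2s, ... up to max_delay_s.
            ceiling = min(base_delay_s * (2 ** attempt), max_delay_s)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(random.uniform(0, ceiling))
    return status, body
```

If every attempt returns 429, the function hands back the last 429 so the caller can decide whether to re-queue the call or raise an alert.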

How can I monitor API concurrency?

You can monitor API response headers.

  • Twilio-Concurrent-Requests indicates the number of concurrent requests, at that moment, for the account. Subaccount requests do not roll up into parent account requests. The concurrency count includes requests that receive a 429 response.
  • Twilio-Request-Duration is the time it took for the request to complete within the Twilio platform. This is the period between when the request hit our edge and when the response was sent back to your server. This does not include the network time between Twilio servers and your servers.
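A short sketch of reading these two headers from a response. It assumes your HTTP client exposes headers as a string mapping and that both values parse as numbers. The helper names and the 80% warning threshold are illustrative.

```python
from typing import Dict, Mapping, Optional


def read_concurrency_headers(headers: Mapping[str, str]) -> Dict[str, Optional[float]]:
    """Extract Twilio's concurrency headers; missing or non-numeric values become None."""
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_number(name: str) -> Optional[float]:
        raw = lowered.get(name)
        if raw is None:
            return None
        try:
            return float(raw)
        except ValueError:
            return None

    return {
        "concurrent_requests": as_number("twilio-concurrent-requests"),
        "request_duration": as_number("twilio-request-duration"),
    }


def near_limit(concurrent_requests: Optional[float], limit: int,
               threshold: float = 0.8) -> bool:
    """True when observed concurrency is at or above `threshold` of your limit."""
    return concurrent_requests is not None and concurrent_requests >= limit * threshold
```

Logging these values on every response gives you a running picture of how close you are to your limit before 429s start to appear.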

For more information, consult Twilio’s docs on REST API Best Practices and handling 429s.


Queue

What are the limitations around <Queue>?

If your voice workflow involves placing large numbers of callers in Queues, you may need to adjust the maxSize parameter.  

There is a default limit of 1000 calls in a single <Queue> (i.e. attempts to Enqueue the 1001st caller will fail). This can be increased up to 5000.

You can self-serve this increase via API. Using a Queue Sid (prefixed by QU) from the Programmable Voice API response, you can change the maximum number of calls for that queue by updating the queue resource and specifying a different MaxSize. For example:

curl 'https://api.twilio.com/2010-04-01/Accounts/ACxxx/Queues/QUxxx.json' -X POST \
--data-urlencode 'FriendlyName=QueueX' \
--data-urlencode 'MaxSize=200' \
-u ACxxx:[AuthToken]
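If your application is written in Python, the same update can be sketched with only the standard library. This builds the request without sending it. `build_queue_update` is a hypothetical helper, the SIDs and token are placeholders, and only MaxSize is set here.

```python
import base64
import urllib.parse
import urllib.request


def build_queue_update(account_sid: str, auth_token: str,
                       queue_sid: str, max_size: int) -> urllib.request.Request:
    """Build a POST that updates MaxSize on one queue, mirroring the curl call above."""
    url = (f"https://api.twilio.com/2010-04-01/Accounts/{account_sid}"
           f"/Queues/{queue_sid}.json")
    data = urllib.parse.urlencode({"MaxSize": max_size}).encode()
    request = urllib.request.Request(url, data=data, method="POST")
    # HTTP Basic auth, equivalent to curl's -u ACxxx:[AuthToken]
    token = base64.b64encode(f"{account_sid}:{auth_token}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    return request


request = build_queue_update("ACxxx", "your_auth_token", "QUxxx", 5000)
# To send it: urllib.request.urlopen(request)
```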

Conference Participants

What are the limitations around <Conference>?

If your voice workflow involves a high volume of concurrent participants using Conference, you may need to request an increase by speaking with your Twilio account team.

There is a default limit of 5000 participants across all conferences account-wide (i.e., not per unique conference). If you reach the concurrent participant limit, new participants cannot be created and will fail with a SIP 603 or 429 depending on the direction of the call. This limit is per account, so subaccounts each default to 5000 participants.

There is no conference concurrency API or other tooling to monitor participants. You would need to maintain participant counts within your own application to monitor concurrency.
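One way to keep that count is to update a counter from your conference status callbacks. The sketch below assumes your webhook handler passes in the callback's event name and that the events of interest are `participant-join` and `participant-leave`. `ParticipantCounter` is a hypothetical class, and a multi-server deployment would need a shared store instead of in-process memory.

```python
import threading


class ParticipantCounter:
    """Tracks concurrent conference participants for one account."""

    def __init__(self, limit: int = 5000) -> None:
        self.limit = limit
        self._count = 0
        self._lock = threading.Lock()  # webhooks may arrive on several threads

    def handle_event(self, event: str) -> int:
        """Apply one status callback event and return the current count."""
        with self._lock:
            if event == "participant-join":
                self._count += 1
            elif event == "participant-leave":
                self._count = max(self._count - 1, 0)
            return self._count

    def headroom(self) -> int:
        """Participants that can still join before reaching the limit."""
        with self._lock:
            return self.limit - self._count
```

Checking `headroom()` before adding participants via API lets you slow down or alert before new participants start to fail.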

Remember that CPS includes any call created by API. This does not include <Dial> but does include creating Conference participants via API.

Twilio Studio

What are the limitations around Studio?

If you use Twilio Studio for outbound voice traffic, speak with your Twilio Account team about your use case to ensure your workflow scales. Additionally, reference our docs for other Studio limits here and here.

Functions

What are the limitations around Functions?

If your voice workflow leverages Twilio Functions, you should be aware of certain limitations.

Each account can execute up to 30 concurrent invocations across all of its Functions. The Functions Invocation concurrency limit is separate from the concurrency discussed above as Function executions do not go through the Twilio REST API infrastructure. When Twilio is unable to handle the traffic, Twilio will respond with a 429 error on a Function execution.

Talk to your Twilio account team if you have any questions, and reference the Function docs for more info on additional limitations.


How can I ensure my voice campaign is successful?

Simply launching a high volume voice campaign can look like spam to carriers, so ensure you are prepared in advance of making calls. In order to maintain high answer rates and excellent number reputations, you should adhere to both industry best practices and all legal compliance requirements.

Don’t forget about inbound traffic

Some of your outbound traffic will inevitably show up as missed calls on your end users’ handsets. Even if your use case is one-way notifications or alerts, you need to be able to handle users calling you back. Consider using Twilio Studio to set up a simple IVR as a backup, or otherwise make sure you are set up to handle callbacks. Ensuring that calls are correctly routed on callback will help prevent your end users from reporting your traffic as spam.

Voice Compliance

It’s more important than ever to ensure your voice workflows comply with all legal requirements and Twilio’s Acceptable Use Policy. Violations could lead to Twilio account suspension in addition to fines and legal ramifications.

Twilio is working to influence policy both in the United States and internationally to protect calling use cases and ensure a strong voice ecosystem. Twilio is a leader in many committees that are committed to restoring trust between businesses and consumers: USTelecom, Industry Traceback Group, State Attorneys General Anti-Robocall Coalition, and North American Numbering Council. Read more about Twilio’s commitment to trust in telephony here.  

The FTC and FCC regulate telemarketing calls through rules and legislation including the Telemarketing Sales Rule (TSR) and the Telephone Consumer Protection Act (TCPA). Some key requirements include obtaining express consent, not calling consumers on the National Do Not Call Registry or state-level DNC registries, and maintaining low call abandonment rates.

Refer to Twilio’s requirements for voice traffic.

In addition, ensure your traffic follows all local legal requirements.

Spam Likely

Twilio has a new solution for spam labeling launching in public beta in Q1 2023. See ‘Voice Integrity’ below.

Carriers work with third-party analytics vendors that assess the validity of phone calls and determine whether “Spam” or “Scam Likely” displays on a user’s handset. Each vendor scores calls using its own proprietary blend of variables, but we have insight into some of the ways you can maintain your reputation.

Ensuring high answer rates starts with following the best practices of Twilio and these analytics vendors.

For a detailed overview, review Twilio’s best practices.

There are also excellent resources directly from the analytics vendors.

Following best practices is an ongoing effort for your campaigns. There are also a few one-time, free actions you should take that contribute to number health and answer rates. Right now, the only way to reset a number’s reputation is to register it directly through the analytics vendors’ telephone number registration process. See “Voice Integrity & Branded Calling: coming soon” below for the future of trusted calling.

  • Submit numbers directly through the Free Caller Registry, which works with all three major analytics vendors. There is no API, but they have a bulk upload tool.

SHAKEN/STIR

SHAKEN/STIR is a caller authentication framework meant to help restore trust by reducing fraudulent robocalls and illegal phone number spoofing. SHAKEN/STIR is a key, but not comprehensive, way to fix spam labeling. The algorithms that the analytics vendors use take into account several pieces of data, including call durations, answer rates, and crowdsourced feedback, as well as SHAKEN/STIR attestation levels.

CNAM

It’s important that the current Caller ID name (CNAM) on your numbers matches your business name. An incorrect CNAM can contribute to spam labeling and can lower answer rates.

  • You can associate your brand name with your numbers using CNAM via the TrustHub in the Twilio Console or via API.

If you are an ISV placing calls on behalf of customers, you should be aware of the best practices for account architecture. Before registering for SHAKEN/STIR and CNAM, you should ensure each end customer is in a separate subaccount. For a deeper dive on best practices for your architecture, reference Voice Architecture and Best Practices for ISVs.

Voice Integrity & Branded Calling: coming soon

Twilio’s Voice team is working on a new TrustHub offering, the Voice Integrity Trust Product. Voice Integrity will offer a way to register phone numbers directly with all the analytics vendors to ensure calls are not mislabeled as spam, along with monitoring of those numbers and redress for mislabeling. You will want to use Voice Integrity in parallel with SHAKEN/STIR and CNAM. When all three are used together, you should see a large increase in call answer rates and quality calls.

Additionally, Twilio is launching Branded Calling to display a brand as Caller ID instead of the number. Unlike the ~5% of users who subscribe to CNAM, Branded Calling will cover a much larger percentage of US consumers.

Erika Kettleson is a Senior Solutions Engineer at Twilio. She loves thorough documentation and finding owls to draw. She can be reached at ekettleson [at] twilio.com.