Guide to Architecting a Messaging Solution for High Deliverability, Resiliency, and Scalability

Engineers collaboratively architecting an SMS platform
April 05, 2023
Written by
Reviewed by
Paul Heath
Twilion
Sally Lee
Twilion

Whether you are developing a new messaging solution from scratch or looking to improve your current implementation, the following guide defines and details how to architect a solid foundation for a successful SMS platform built with the Twilio Programmable Messaging API.

It’s straightforward to send your first SMS or create a small real-world application. The natural next step is to build out a comprehensive solution for your business that reliably delivers and scales without issue. This presents a challenge due to the complexity of the messaging world, but this guide will arm you with the knowledge to establish a solid framework for success with goals of high deliverability, low opt-out rates, and a great user experience (UX) for message recipients.

Here are common examples of messaging scaling challenges:

  • Carriers filter your messages if compliance isn’t adhered to
  • Messages are delayed or dropped if the sending phone number cannot accommodate the throughput
  • Low conversion and high opt-out rates from messaging too frequently

In this post, we’ll start with the basic foundations of selecting the correct sender type based on your throughput needs and compliance considerations. Next, we’ll cover optimization techniques, both technical and practical. Finally, we’ll review how to procure and analyze relevant data so you can improve your messaging metrics and quickly detect and address issues as they arise.

Article Contents

  • Foundations
    • Throughput
    • Compliance
  • Optimization
    • Data hygiene
    • Link shortening & click tracking
    • User experience
  • Analysis & Iteration
    • Monitor health
    • Smart improvements

Throughput

The most fundamental part of sending SMS is selecting a phone number or, in SMS terms, a sender type that can handle your current and projected outbound message volume.

Sending messages at scale requires speed; how much speed is the question. Exponential growth is a windfall for any business, and it comes with a need for a huge upgrade in volume. For example, Resy, a restaurant booking service, has grown to send millions of on-time texts per day to diners to confirm reservations using Twilio Messaging. In the SMS world, the metric that governs this speed is called throughput, which is simply the maximum Message Segments per Second (MPS) a carrier permits on a given sender type. If throughput is exceeded, message segments are put into a queue that maxes out at 4 hours. So, if you send a higher volume of messages than your sender types can reasonably handle, they will be delayed, and any segments that cannot be sent within the 4-hour queue limit will be dropped with queue overflow errors.

Calculating throughput requirements

As an example, let’s say you have a toll-free number at 3 MPS throughput and you send 750,000 texts to your customers. Assume each message has one segment. Let’s calculate an instantaneous bulk send:

Max queue length: 4 hours, or 14,400 seconds

Available throughput: 3 MPS

Total message segments: 750,000

750,000 messages / 3 MPS = 250,000 seconds required

250,000 - 14,400 max queue = 235,600 seconds still required * 3 MPS = 706,800 messages in queue overflow

If you used a short code with default 100 MPS or increased the toll-free throughput to 100 MPS:

750,000 / 100 MPS = 7,500 seconds required (no queue overflow)

Crisis averted with the increase to 100 MPS throughput!

Now, let’s illustrate the importance of checking your message length. If your texts are long enough to be 2 segments, each text counts as 2 message segments against your throughput. Here is the same calculation with 2-segment messages at the same 100 MPS throughput:

(750,000 messages * 2 segments) / 100 MPS = 15,000 seconds required

15,000 - 14,400 max queue = 600 seconds still required * 100 MPS = 60,000 message segments in queue overflow

Obviously, this is not ideal if your messages are time sensitive (such as 2FA or conversational messages), so you must re-evaluate which sender type you need and how much throughput it requires. Here is a visual representation of a queue overflow in the latency tab of Twilio’s Messaging Insights tool (covered in more detail in the Monitor Health section below):

Messaging Insights Latency Tab
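
The queue arithmetic from the calculations above can be sketched as a small helper. This is only an illustration, not a Twilio API: the function name `queue_overflow` and its parameters are hypothetical, and it assumes an instantaneous bulk send from a single sender with the 4-hour (14,400-second) queue described earlier.

```python
MAX_QUEUE_SECONDS = 4 * 60 * 60  # 4-hour queue limit from the text (14,400 s)


def queue_overflow(messages, segments_per_message, mps,
                   max_queue_seconds=MAX_QUEUE_SECONDS):
    """Return (seconds_required, overflow_segments) for an instantaneous send.

    Hypothetical helper: it models one sender with a fixed MPS and ignores
    carrier-side variation.
    """
    total_segments = messages * segments_per_message
    seconds_required = total_segments / mps
    # Segments the sender can work through before the queue limit is reached.
    capacity = mps * max_queue_seconds
    overflow_segments = max(0, total_segments - capacity)
    return seconds_required, overflow_segments
```

Calling `queue_overflow(750_000, 1, 3)` reproduces the toll-free example, and changing `segments_per_message` or `mps` lets you test other sender types before a large send.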

 

Throughput and sender types

Throughput per sender type can get quite complex, as country-based and sender-specific requirements may affect this. Twilio provides an abundance of documentation about sender types and their throughput, sender type compliance, as well as geographic-based guidance for the 180+ supported countries.

Generally speaking, short codes provide the highest throughput, starting at 100 MPS, while toll-free numbers start at 3 MPS. Both short codes and toll-free numbers have an advantage here at Twilio: you can increase throughput on request by working with your account representative or our support team. This is especially helpful if you expect traffic growth but aren’t quite sure how much. Long code senders have a baseline of 1 MPS, which may vary depending on the country of origin. For example, a Twilio UK number has 10 MPS by default, whereas A2P 10DLC in the US assigns a static trust score that maps to a fixed throughput per carrier. Alphanumeric sender IDs are also an option.

We recommend adding a small margin of additional throughput on top of what you need at peak times, in case you require more than anticipated. Hey, growth is good, right? And like the old saying goes, it’s better to be safe than queued.

If your messages are not time sensitive, your send times are spread out (that is, they have a higher standard deviation), and you are not incurring major delays, then it’s probably alright to go with the middle-to-upper limit of required throughput.
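
One way to turn this guidance into a number is to work backwards from the peak volume and the time window in which it must be delivered. The sketch below is an assumption-laden illustration: `required_mps` is a hypothetical helper, and the 20% default headroom is an arbitrary stand-in for the "small margin" mentioned above, not a Twilio recommendation.

```python
def required_mps(peak_segments, window_seconds, headroom_percent=20):
    """Smallest whole-number MPS that clears peak_segments in window_seconds.

    headroom_percent pads the result; 20 is an illustrative default only.
    Integer math is used so the rounding is exact.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    padded_segments = peak_segments * (100 + headroom_percent)
    divisor = 100 * window_seconds
    return -(-padded_segments // divisor)  # ceiling division on integers
```

For example, `required_mps(750_000, 7_200)` estimates the throughput needed to deliver 750,000 single-segment messages within two hours, with headroom included.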


Compliance

SMS compliance is like eating your vegetables. You do it because it’s good for you, otherwise you’ll have a variety of health problems from the lack of nutrients. For example, you’ll develop scurvy if you don’t consume enough vitamin C for an extended period of time.

Well, in any case, if you do not follow geographic regulations, sender-based verification requirements, or Twilio Messaging compliance policies, your messages are at risk of being filtered. In severe cases of non-compliance, our support team will notify you that your Twilio Account may be suspended.

Luckily, Twilio has compliance requirements well-documented. It is best to get compliance set up from the jump so you aren’t unpleasantly surprised by service disruptions. Depending on the country you plan on sending to and the sender type, you must plan for the time required to submit and receive a response for verification.

Here are the main tenets of compliance:

If you observe SMS messages in Messaging Insights (more on this below) failing with error code 30007, your messages are being filtered due to lack of compliance. Sift through your message content, regulatory bundle status, and sender type verification status to find what’s lacking based on the above information. There’s more information on error tracking below, in the Monitor Health section of this article.

Also, Twilio will send regular email updates on any changes to the status quo based on communications we receive from carriers, industry groups, and regulators. These updates provide ample instructions and time to stay compliant, as rules and regulations do change. These messages are extremely important; we highly recommend reading them so that new rules and regulations don’t catch you off guard.

Data hygiene

The most common deliverability error we see crop up in Messaging Insights is the attempt to send an SMS to a landline or unreachable carrier.

Perhaps a customer signed up on your site with their landline number, or made up a fake number on the fly. This clogs your system with bad data, and you will be charged for every attempt to message an incompatible number. The solution is simple: filter out phone numbers that cannot receive SMS by using Twilio’s Lookup API. It’s a single endpoint that checks for phone number validity.

The main benefit of using Lookup is ensuring a phone number can receive SMS before you start sending en masse. Using Lookup, filter out customers’ numbers at the point of entry as they sign up to receive texts (for example, during website form validation if this is where you are gathering customer data), or run through the existing numbers in your database to see if they are mobile compatible for a complete data scrub. Voilà, clean data. It all adds up to better deliverability, money saved on outbound SMS, and higher conversion rates.
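
A data scrub along these lines might look like the following sketch. It assumes the Lookup v2 endpoint with the `line_type_intelligence` field and a response containing `valid` and `line_type_intelligence.type`; confirm those names against the current Lookup documentation. The helper names, the `fetch` callable (something you supply that performs the authenticated GET and returns parsed JSON), and the choice to reject only `landline` are all illustrative assumptions.

```python
from urllib.parse import quote, urlencode

LOOKUP_BASE = "https://lookups.twilio.com/v2/PhoneNumbers"
# Assumption: only landlines are rejected; extend this set to suit your policy.
NON_SMS_LINE_TYPES = {"landline"}


def lookup_url(phone_number):
    """Build the Lookup request URL for an E.164 number such as +14155550100."""
    query = urlencode({"Fields": "line_type_intelligence"})
    return f"{LOOKUP_BASE}/{quote(phone_number, safe='')}?{query}"


def can_receive_sms(lookup_result):
    """Decide from a parsed Lookup response whether to keep the number."""
    if not lookup_result.get("valid"):
        return False
    line_type = (lookup_result.get("line_type_intelligence") or {}).get("type")
    return line_type not in NON_SMS_LINE_TYPES


def scrub(numbers, fetch):
    """Keep only numbers that pass the check. `fetch` is a hypothetical
    callable: it takes a URL and returns the parsed JSON response."""
    return [n for n in numbers if can_receive_sms(fetch(lookup_url(n)))]
```

Keeping the network call behind `fetch` means the same filter can run at sign-up time and in a batch job over an existing database.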

If you want to take SMS verification one step further, Twilio provides an MFA solution called Verify. Send a verification code to your users via SMS to double down on number validity, protect users’ accounts, and cut down on fraud. Verify consists of two API endpoints, supports multiple communications channels on top of SMS, and can be implemented in as little as one sprint.

Link shortening & click tracking

Adding links to your messages provides a call-to-action for end recipients, and enables you to track the conversion rate on this click or tap. Adding a full-length link of any kind to the SMS body seems straightforward enough. However, this is not recommended for three main reasons: carrier filtering, user trust, and potentially higher costs. Avoid these risks by utilizing Twilio’s Link Shortening & Click Tracking feature, which is already built directly into the Programmable Messaging API used to send the messages normally.

Let’s dig into each of the three reasons to illustrate the value of Link Shortening.

Carriers may filter messages containing URLs, the possibility and severity of which depends on the country’s SMS guidelines. For example, carriers in the US want to see branded, shortened links while carriers in Denmark have specific guidelines about URLs in certain sender types. Shortened URLs from services like Bitly or TinyURL are not branded by default, and carriers are scanning for these links to prevent spam. As always, be sure to check the guidelines per country to ensure rules regarding links are adhered to.

Unbranded links can affect user trust. Message recipients are suspicious of – and less likely to click – links that have an unexpected domain. This is typically the first thing users are warned about in a Phishing 101 course: don’t click on suspicious links. With a properly branded short link, message recipients are more likely to click the call-to-action, leading to a higher conversion rate for all your messaging efforts.

Shorter links can save you money. If you are sending long links in your messages that cause messages to exceed the segment character limit, this could needlessly split the message into multiple segments, and you will be charged for multiple outbound messages instead of one.
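
To see how a long link can push a message into an extra segment, here is a rough segment estimator. It is a simplified sketch, not Twilio's implementation: it assumes the commonly documented limits of 160 characters (153 per segment when concatenated) for GSM-7 and 70 (67 when concatenated) for UCS-2, and the name `estimate_segments` is hypothetical. Check real messages against Twilio's own segment tooling.

```python
import math

# GSM-7 basic character set; each character takes one septet.
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension characters are sent as escape + character, so they cost two septets.
GSM7_EXTENDED = set("^{}\\[~]|€\f")


def estimate_segments(body):
    """Estimate how many segments a message body will occupy."""
    if not body:
        return 0
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in body):
        units = sum(2 if ch in GSM7_EXTENDED else 1 for ch in body)
        single_limit, multi_limit = 160, 153
    else:
        # Any other character forces UCS-2; count UTF-16 code units.
        units = len(body.encode("utf-16-le")) // 2
        single_limit, multi_limit = 70, 67
    if units <= single_limit:
        return 1
    return math.ceil(units / multi_limit)
```

Running the same copy through `estimate_segments` with a full-length URL and with a shortened one shows whether the link alone is costing you a second segment.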

Click tracking is also part of this Twilio Messaging Service feature. Data analysts and marketers can gain valuable engagement insight into whether or not a recipient responded to the message’s call-to-action by clicking the link. Each link is unique and mapped to a specific recipient, so you can get quite granular data when tracking the click-through rate (CTR), another advantage over using a generic link shortening service. Taking this concept a step further, if a recipient clicked the link, you can take a specific action following this conversion, such as sending a personalized welcome message with a coupon after a user clicked the link to sign up for your site.

Get started on implementing this feature by reading the general configuration documentation, API docs on sending a message with a link, and this extremely valuable blog post containing a full guide on how to set up your domain configuration and a prime example of sending a message with a shortened link tied to a marketing campaign utilizing click tracking.
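
As a rough illustration of what such a request carries, the sketch below only builds the form parameters for a message with link shortening turned on; it does not send anything. It assumes the Programmable Messaging `ShortenUrls` parameter used together with a Messaging Service that already has a shortening domain configured. The helper name and the placeholder SID are hypothetical, so verify the parameter names against the API docs before relying on them.

```python
from urllib.parse import urlencode


def short_link_message_params(to, body, messaging_service_sid):
    """Form parameters for a message whose links should be shortened.

    Assumption: link shortening is enabled per message via ShortenUrls and
    requires sending through a Messaging Service.
    """
    return {
        "To": to,
        "MessagingServiceSid": messaging_service_sid,
        "Body": body,
        "ShortenUrls": "true",
    }


# Placeholder values for illustration only.
params = short_link_message_params(
    "+14155550100",
    "Your table is ready to book: https://www.example.com/reservations/spring-menu",
    "MGXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
)
form_body = urlencode(params)  # what would be POSTed to the Messages endpoint
```

The full-length URL stays in the body you write; the shortening itself happens on Twilio's side once the domain configuration described in the linked guides is in place.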

User experience

At this point, you have a solid foundation for sending off SMS at scale. Now, let’s take a step back from the technical aspects of implementation and into the message recipient experience. With all things technical, it’s easy to get caught up in the complex nuances of business logic, metrics, and programming to get the solution launched and used in the wild. However, at the end of the line is a real person signing up for and opening these messages. Texts are the most personal communications channel: for successfully delivered messages, we see an average open rate of 98% for general SMS, and 82% for marketing messages.

It’s imperative to consider the content, cadence, and consent of your messages. Put yourself in your customer’s shoes and ask yourself: Would I want to receive this message? Does this text feel personalized, and would I want to follow through on the call-to-action? Would I want to receive messages this often from a brand? And, perhaps most importantly: did I even sign up to receive these?

It’s no secret customers don’t linger lovingly over their account notifications or coupon alerts. A text from a brand should be personalized, concise, compliant, and have a specific purpose. If you have data on the recipient from their initial sign up, personalize the text by adding their first name as a variable in the text. Keep your messages relevant and timely, with a clear call-to-action: if the customer followed through by clicking the link in the text, showing up to their appointment from a reminder, or texting back a response to a survey – these are your key metrics achieved! Here are ideal SMS copy examples and templates for common use cases.

To keep your opt-out rates low (ideally 1% or less), don’t be overzealous with your text length and cadence. Keep the texts short, preferably all in one segment. It’s recommended to send messages during standard business hours in the recipient’s geographic location, so as to not disturb people on your list while they’re trying to relax (some of us are familiar with stand-up comedy bits exclusively about eager telemarketers calling at dinnertime). In fact, some countries, such as India, require that messages be sent during normal business hours in order to remain compliant. And finally, ensure you are handling opt-outs appropriately. This is easily achieved by using Twilio’s Advanced Opt-Out configuration within Messaging Services.
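
A business-hours guard can be sketched with the standard library alone. The 09:00 to 17:00 window below is an illustrative assumption, not a regulatory value, and `in_send_window` is a hypothetical helper; check each country's guidelines for the hours that actually apply.

```python
from datetime import datetime, time, timedelta, timezone


def in_send_window(now_utc, recipient_tz, start=time(9, 0), end=time(17, 0)):
    """Return True if now_utc falls inside the recipient's local send window.

    now_utc must be timezone-aware; recipient_tz is any tzinfo object.
    The default window is an assumption for illustration.
    """
    if now_utc.tzinfo is None:
        raise ValueError("now_utc must be timezone-aware")
    local_time = now_utc.astimezone(recipient_tz).time()
    return start <= local_time < end


# Fixed-offset stand-in for India Standard Time (UTC+05:30).
IST = timezone(timedelta(hours=5, minutes=30))
ok_to_send = in_send_window(datetime(2023, 4, 5, 4, 0, tzinfo=timezone.utc), IST)
```

In practice you would likely pass a `zoneinfo.ZoneInfo` (for example `ZoneInfo("Asia/Kolkata")`) so that daylight saving rules are handled for regions that observe them, and hold or schedule any message that falls outside the window.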

Monitor health

Once you’ve established a solid messaging foundation, the natural next step is to keep tabs on your messaging campaign’s overall health by tracking delivery, opt-out rates, and conversion. The top recommendations to safeguard the health of these metrics are:

  • keep delivery error rates below 2%
  • keep opt-out rates as low as possible (there’s no definite number here, but shoot for less than 1-2%)
  • update any required compliance changes you receive via Twilio advisory alerts
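
The first two thresholds can be checked with a few lines of code once you have counts from any of the monitoring options described below. This is a minimal sketch: `health_report` is a hypothetical helper, and using total sent messages as the denominator for both rates is an assumption you may want to change (for example, to delivered messages).

```python
def health_report(sent, errored, opt_outs,
                  max_error_rate=0.02, max_opt_out_rate=0.01):
    """Compare delivery error and opt-out rates with the target thresholds.

    Defaults mirror the guidance above: errors below 2%, opt-outs near 1%.
    """
    error_rate = errored / sent if sent else 0.0
    opt_out_rate = opt_outs / sent if sent else 0.0
    return {
        "error_rate": error_rate,
        "opt_out_rate": opt_out_rate,
        "error_rate_ok": error_rate < max_error_rate,
        "opt_out_rate_ok": opt_out_rate < max_opt_out_rate,
    }
```

A report where either flag is false is a prompt to dig into error codes, message cadence, or copy before the next send.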

There are several ways to monitor your success, so let’s break down some options in order of ascending complexity and automation.

The first way is to visually pulse-check your message delivery and opt-out rate using Messaging Insights. This is a handy tool in the Twilio console for inspecting the delivery and error rates of all inbound and outbound messages (amongst many other data points). Who doesn’t love a solid set of graphs in a dashboard?

Messaging Insights dashboard showing ingoing and outgoing messages, plus status updates.
Messaging Insights Dashboard: have you ever seen anything so beautiful?

In all seriousness, Messaging Insights is an invaluable tool to spot check deliverability and opt-out rates, debug any problems that potentially arise, and export graphs of the data to show off in a slide show. You can get quite granular with the data to observe trends per country or messaging service, or troubleshoot which carrier or sender is the root of specific errors or filtering.

Messaging Insights is great for a manual check, but there are also programmatic ways to track deliverability and opt-outs via Message Webhooks. This is the most common and recommended way to track important data on both inbound and outbound messages. After you send a message, Twilio pushes the delivery status to the webhook as soon as we get a receipt from our Super Network partner, so you can monitor which messages were delivered by setting the StatusCallback parameter. Conversely, if there is a problem with delivery, Twilio returns specific error codes on the webhook. Webhooks can also be used to track opt-outs by scanning for the additional key-value pair OptOutType in the inbound message webhook.
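
A webhook handler ultimately receives a form-encoded request body, so the core of the tracking logic can be sketched without any web framework. The parameter names `MessageSid`, `MessageStatus`, `ErrorCode`, `From`, and `OptOutType` are the ones Twilio documents for these webhooks as far as this sketch assumes; the function name, the returned dictionary shape, and the set of statuses treated as failures are illustrative choices.

```python
from urllib.parse import parse_qs

# Statuses treated as delivery failures in this sketch.
FAILED_STATUSES = {"undelivered", "failed"}


def parse_webhook(raw_body):
    """Classify a form-encoded webhook body as an opt-out or a status update."""
    params = {key: values[0] for key, values in parse_qs(raw_body).items()}
    if "OptOutType" in params:
        # Inbound message that matched an opt-out keyword (for example STOP).
        return {
            "kind": "opt_out",
            "from": params.get("From"),
            "opt_out_type": params["OptOutType"],
        }
    status = params.get("MessageStatus")
    if status is not None:
        # Status callback for an outbound message.
        return {
            "kind": "status",
            "sid": params.get("MessageSid"),
            "status": status,
            "error_code": params.get("ErrorCode"),
            "failed": status in FAILED_STATUSES,
        }
    return {"kind": "other"}
```

Feeding the results into counters per status and error code gives you the raw numbers for delivery and opt-out rates; a production handler should also validate that the request really came from Twilio.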

As a third option, if you want to get extremely granular and sophisticated with your tracking, Twilio offers a service in public beta called Event Streams. This enables you to track virtually any event that occurs within your Twilio account. You can stream these events anywhere, such as to your own data services, Twilio Segment, or Amazon Kinesis.

Finally, with regards to compliance changes: Twilio sends out regular advisory alerts via email if we learn of any decreed changes from carriers or regulators. These notifications contain detailed information about why a change came about, and instructions on how to handle it. Messaging rules are often a moving target, and Twilio is dedicated to keeping everyone up to speed with ample time to comply.

Improvements over time

Now that your comprehensive solution is set up and humming along, the natural next step is to iterate for improvement. Here are several suggestions to fortify and strengthen your messaging platform.

Incorporating A/B or multivariate tests is an excellent way to experiment with which copy or cadence works best across the board, or per audience bucket. For the unfamiliar: A/B (or split) testing is simply creating two or more versions of copy, design, or a full feature experience and sending portions of your users or traffic to each version, while tracking whichever key performance indicators (KPIs) are relevant to determine the success of one version over the other. With regards to messaging, as an example: try sending different text copy for your campaigns and analyzing which version converted better through click tracking.

Message tagging, or adding metadata such as campaign IDs to the messaging API, is a highly requested feature that is coming out soon to make tracking a breeze.
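
The A/B split described above can be sketched as two small pieces: a stable way to assign each recipient to a variant, and a click-through comparison once click tracking data comes in. Everything here is hypothetical scaffolding (function names, the experiment label, the event format), and it does not address statistical significance.

```python
import hashlib


def assign_variant(recipient_id, experiment, variants=("A", "B")):
    """Deterministically assign a recipient to a variant.

    Hashing the experiment name with the recipient ID keeps the assignment
    stable across sends and independent between experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{recipient_id}".encode("utf-8")).digest()
    return variants[int.from_bytes(digest[:8], "big") % len(variants)]


def click_through_rates(events):
    """Compute CTR per variant from (variant, clicked) pairs."""
    sent, clicks = {}, {}
    for variant, clicked in events:
        sent[variant] = sent.get(variant, 0) + 1
        clicks[variant] = clicks.get(variant, 0) + (1 if clicked else 0)
    return {variant: clicks[variant] / sent[variant] for variant in sent}
```

Because assignment depends only on the recipient and experiment name, a recipient always sees the same copy variant for a given campaign, which keeps the comparison clean.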

As a way to simplify your messaging solution, Twilio now offers message scheduling which is intended to make it straightforward to queue up messages that are sent at a specific time. For example, this could be used to send appointment reminders one day in advance of a date and time originally selected by the user. This could also help work around throughput limits by sending scheduled messages over time instead of sending messages in blasts.
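
For the appointment reminder case, the scheduling parameters can be derived from the appointment time. The sketch below only builds the request parameters; it assumes the Programmable Messaging `SendAt` and `ScheduleType` parameters used with a Messaging Service, and the helper name and one-day default lead time are illustrative. Twilio enforces minimum and maximum scheduling windows that are not modeled here, so check the message scheduling docs for the current limits.

```python
from datetime import datetime, timedelta, timezone


def reminder_params(to, body, appointment, messaging_service_sid,
                    lead=timedelta(days=1)):
    """Build parameters for a reminder sent `lead` before the appointment.

    `appointment` must be a timezone-aware datetime.
    """
    if appointment.tzinfo is None:
        raise ValueError("appointment must be timezone-aware")
    send_at = (appointment - lead).astimezone(timezone.utc)
    return {
        "To": to,
        "Body": body,
        "MessagingServiceSid": messaging_service_sid,
        "ScheduleType": "fixed",
        # ISO 8601 timestamp in UTC.
        "SendAt": send_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

Spreading `SendAt` values across a window, rather than giving every message the same timestamp, is one way to apply the throughput workaround mentioned above.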

Finally, if you want to get super savvy by fully automating a conversation between your customers and a virtual agent via SMS, Twilio offers integrations with Google Dialogflow, a conversational AI agent service. This blog post is an excellent guide to setting up a virtual agent to text back and forth with your customer (if you are working with a basic Twilio account – that is, not using Twilio Flex – omit the Flex and TaskRouter parts of the solution and the same result is achieved).

A virtual agent that uses AI to garner conversational intent can automate tasks such as a candidate scheduling an interview, a potential home buyer learning more about available real estate agents before getting connected, or someone on vacation changing their plane tickets to stay a few days longer. This frees up your human agents for the more complex interactions and keeps your customers happy when they don’t need to wait for a resolution – all through messaging.

Conclusion

That was a lot of ground to cover! If you follow the above guides and recommendations, you have a solid framework for an at-scale messaging solution. Your deliverability rates will increase, your customers will be happy, and you’ll have the peace of mind that your system is equipped to automatically alert you if the need to debug errors arises.

Twilio’s messaging solution offerings are designed to be a one-stop shop, so you don’t need to spread your developers and budgets thin across setting up configuration and billing amongst disparate vendors. Ease of implementation with relevant features is a core tenet of our product philosophy, and as such we are always happy to listen to feedback or solution requests!

If you would like detailed assistance with building your custom messaging solution from the ground up, Twilio offers message onboarding packages as an option within our Professional Services resources. Our team will assist with project planning, account setup, number registration, and launch. Speak with Twilio sales or support to learn more about our services.

This post is a high level guide, and a compass to navigate important and relevant messaging architecture so you won’t be stuck figuring out best practices the hard way. After following the above recommendations, your SMS implementation will have a solid baseline that will insure you against common pitfalls. I hope this guide was helpful, and remember to eat your vegetables and citrus fruits to help stave off any deficiencies.

Krista Goralczyk is a Senior Solutions Engineer at Twilio with 12+ years of tech industry experience. She knows the complete gamut of what it takes to conceptualize & build a full stack product - then take it to market. She enjoys finding ways to ease the minds of developers everywhere, one code example at a time. She can be reached at kgoralczyk@twilio.com or LinkedIn.