Secure and Scale Programmable Chat Applications

Danger

Programmable Chat has been deprecated and is no longer supported. Instead, we'll be focusing on the next generation of chat: Twilio Conversations. Find out more about the EOL process here.

If you're starting a new project, please visit the Conversations Docs to begin. If you've already built on Programmable Chat, please visit our Migration Guide to learn about how to switch.

You want to build new chat functionality using Twilio's Programmable Chat, and you want it to be secure and to perform well. This guide covers best practices for securing and scaling Programmable Chat applications.

Info

Looking for the API documentation for Programmable Chat? You can find the full reference docs here.

This guide highlights common security gotchas and recommendations to improve your application's performance.

Chat fundamentals

Chat components

Twilio Chat consists of two main components: REST APIs (back-end) and Client SDKs (front-end).

REST API	The back-end is secured by a REST API and associated libraries that orchestrate and control the use of Chat. Tasks include creating chat users and creating, managing, and deleting Channels. The back-end also handles webhook event notifications from Twilio. See Programmable Chat REST API.
Client SDK	The front-end implements interfaces that expose Chat functionality to users on mobile devices (iOS and Android) or in a web browser (JavaScript SDKs). Chat SDKs are asynchronous. Chat service responses are handled using Event listeners (Android), Delegates (iOS) or Promises (JavaScript). These SDKs interact with the Chat service over a WebSocket connection.

Chat objects

Service Instance	This is a single Chat "deployment". A single Twilio account can have multiple Chat Service Instances, each siloed from all the others and identified by a unique Service Identifier (SID). All chat data, channels, and users are contained and managed within the context of the Service Instance. This gives you complete separation between different Chat use cases. For example, you may have one Service Instance for an internal employee chat application and a second for an external customer-facing chat application.
Channel	A channel is a container for a conversation between two or more Chat users. A channel is required even for a 1:1 conversation. Messages from users are sent to a channel and then propagated to the other channel members. Those messages are stored and associated with that channel until they, or the channel itself, are deleted.
Message	Each message belongs to a single channel and, as noted above, messages are sent to the channel (as opposed to individual users). Messages can be edited or deleted within a channel.
User	A user is a single real person interacting with a Chat Service Instance, usually via a first-person SDK-based client in a browser or on a mobile device. A single user may use multiple clients, for example a web-based client and a mobile client in an installed app.
Member	Each User can join many Channels. A Member is a representation of the User within a single channel, and it is the Member that interacts with that Channel, for example by sending Messages.

Use API keys for back-end authentication

The back-end and front-end (client) components of chat applications need to be considered and handled separately from a security point of view. Let's start with discussing the mechanism for securing the back-end.

The back-end application makes requests to Twilio's Chat REST API, which uses basic HTTP authentication. As with all of Twilio's REST APIs, the Twilio Account SID and Auth Token can be used as the username and password respectively. However, for Programmable Chat, we recommend using API Keys.

Here is an example back-end chat request using API keys:

curl -G https://chat.twilio.com/v2/Services \
   -u $TWILIO_API_KEY:$TWILIO_API_KEY_SECRET

API Keys are used both to authenticate to the REST API and to create and revoke the Access Tokens used by Chat clients. API Keys can be provisioned and revoked either via the REST API or the Twilio console. These tools offer fine-grained control over the developer and application access to the Chat REST API.

Note: When handling API keys, ensure you follow these two critical points:

API keys must be stored and handled with the same care as any other secret (for example, passwords and tokens for other third-party APIs and services).
API Keys and Twilio's REST APIs should only be used by the back-end application. API Keys & REST API Tokens should never be sent to or embedded in the client application. Additionally, you should never embed your API key directly in your source code or push it to a repo. If an API Key is available to a client, it could be obtained by an end user who will then have API-level access to your Twilio account.
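As a rough illustration of the two points above, the following Node.js sketch reads the API key pair from environment variables instead of source code and builds the HTTP Basic header that the curl example sends. The function names (`loadApiCredentials`, `basicAuthHeader`, `listServices`) are hypothetical, and the sketch assumes a Node.js version with a global `fetch` (18 or later).

```javascript
// Read the API key pair from the environment rather than hard-coding it.
// The variable names match the curl example above.
function loadApiCredentials(env = process.env) {
  const { TWILIO_API_KEY, TWILIO_API_KEY_SECRET } = env;
  if (!TWILIO_API_KEY || !TWILIO_API_KEY_SECRET) {
    // Fail fast so a misconfigured deployment never sends unauthenticated requests.
    throw new Error('TWILIO_API_KEY and TWILIO_API_KEY_SECRET must be set');
  }
  return { apiKey: TWILIO_API_KEY, apiSecret: TWILIO_API_KEY_SECRET };
}

// Build the value of the HTTP Basic "Authorization" header.
function basicAuthHeader({ apiKey, apiSecret }) {
  const encoded = Buffer.from(`${apiKey}:${apiSecret}`).toString('base64');
  return `Basic ${encoded}`;
}

// Back-end only: list Chat Services, equivalent to the curl request above.
async function listServices(fetchImpl = fetch, env = process.env) {
  const response = await fetchImpl('https://chat.twilio.com/v2/Services', {
    headers: { Authorization: basicAuthHeader(loadApiCredentials(env)) },
  });
  if (!response.ok) {
    throw new Error(`Chat API returned HTTP ${response.status}`);
  }
  return response.json();
}
```

Because the credentials only ever live in the server's environment, nothing in this code should be bundled into or served to a client application.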

Use JWT access tokens for client-side authentication

The Chat Client SDKs authenticate using Twilio Access Tokens. Unlike API Keys, Access Tokens are short-lived tokens granted by your application to give a client access to a Chat Service Instance for a defined amount of time.

The maximum lifetime of an Access Token can be up to 24 hours, but best practice is to limit them to the shortest amount of time feasible for your specific Chat application.

Your back-end application is responsible for authenticating the end user to ensure that they should have access to the Chat Service Instance. Your application can then generate an Access Token for the client, typically with a Twilio helper library, signing it with an API Key secret.

The Access Token contains two pieces of information:

User's Identity
Chat Grant that specifies the Service Instance SID this user can connect to

The application then passes the Access Token back to the client (for example, via Ajax). The Chat client SDK will then provide this Access Token on requests to Twilio.
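To make the token contents concrete, here is a minimal sketch that assembles a Chat Access Token by hand using only Node's `crypto` module. In a real application you would normally let a Twilio helper library do this. The claim layout shown (`grants.identity`, `grants.chat.service_sid`, the `twilio-fpa;v=1` content type) reflects our understanding of what the helper library emits and should be treated as an assumption, as should the function name `createChatAccessToken`.

```javascript
const crypto = require('crypto');

const MAX_TTL_SECONDS = 24 * 60 * 60; // Access Tokens can live for at most 24 hours

// base64url-encode a JSON-serializable value.
function b64url(value) {
  return Buffer.from(JSON.stringify(value)).toString('base64url');
}

// Hypothetical helper: mint a short-lived Chat Access Token (HS256-signed JWT).
function createChatAccessToken({
  accountSid, apiKeySid, apiKeySecret, serviceSid, identity, ttlSeconds = 3600,
}) {
  if (ttlSeconds <= 0 || ttlSeconds > MAX_TTL_SECONDS) {
    throw new RangeError('ttlSeconds must be between 1 and 86400');
  }
  const now = Math.floor(Date.now() / 1000);
  // Assumed header and claim names; verify against the helper library output.
  const header = { typ: 'JWT', alg: 'HS256', cty: 'twilio-fpa;v=1' };
  const payload = {
    jti: `${apiKeySid}-${now}`,
    iss: apiKeySid,   // the API Key that signs the token
    sub: accountSid,  // the account the token belongs to
    iat: now,
    exp: now + ttlSeconds,
    grants: {
      identity,                            // the user's identity
      chat: { service_sid: serviceSid },   // the Chat grant
    },
  };
  const signingInput = `${b64url(header)}.${b64url(payload)}`;
  const signature = crypto
    .createHmac('sha256', apiKeySecret)
    .update(signingInput)
    .digest('base64url');
  return `${signingInput}.${signature}`;
}
```

The back-end would call this only after it has authenticated the end user, and would pass a `ttlSeconds` value as short as the application can tolerate.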

As a result of this process, Twilio keys do not need to be stored in the client application. Instead, your client authenticates using the generated Access Token, as follows:

const chatClient = new Twilio.Chat.Client('the token string from server');

These tokens are based on the JSON Web Token standard (JWT). For troubleshooting purposes, you can paste a token into https://jwt.io/, and it will decode the token for you. This helps confirm that the values encoded in the token are correct.
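If you would rather not paste a live token into a website, the same inspection can be done locally. The sketch below decodes the header and payload of a JWT with Node's `Buffer`; `decodeJwt` and `secondsUntilExpiry` are hypothetical names. Note that it only decodes the token and does not verify the signature, so it is suitable for troubleshooting but not for authentication decisions.

```javascript
// Decode (but do NOT verify) the header and payload of a JWT.
function decodeJwt(token) {
  const parts = token.split('.');
  if (parts.length !== 3) {
    throw new Error('Expected a token of the form header.payload.signature');
  }
  const [header, payload] = parts
    .slice(0, 2)
    .map((part) => JSON.parse(Buffer.from(part, 'base64url').toString('utf8')));
  return { header, payload };
}

// Seconds until the token expires; negative if it has already expired.
function secondsUntilExpiry(payload, nowSeconds = Math.floor(Date.now() / 1000)) {
  return payload.exp - nowSeconds;
}
```

Checking `payload.grants` and `payload.exp` in the decoded output is usually enough to confirm that the identity, Service Instance SID, and lifetime are what you intended.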

Finally, if the Client needs uninterrupted access to the Chat Service, Access Tokens need to be renewed before they expire. This can be done using the client's updateToken method, but Twilio also has an optional helper, AccessManager, that can manage the renewal process.
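A minimal sketch of the manual renewal path might look like the following. It assumes the JavaScript Chat client emits `tokenAboutToExpire` and `tokenExpired` events and exposes `updateToken`; `fetchToken` is a hypothetical function you supply that asks your own back-end for a fresh Access Token.

```javascript
// Keep a Chat client's Access Token fresh.
// chatClient: a Programmable Chat JavaScript client instance.
// fetchToken: hypothetical async function returning a new token string
//             from your back-end.
function keepTokenFresh(chatClient, fetchToken) {
  const refresh = async () => {
    try {
      const token = await fetchToken();
      await chatClient.updateToken(token);
    } catch (err) {
      // Log and carry on; the next event will trigger another attempt.
      console.error('Could not refresh Chat Access Token:', err);
    }
  };
  chatClient.on('tokenAboutToExpire', refresh);
  chatClient.on('tokenExpired', refresh);
  return refresh; // returned so callers can also trigger a refresh manually
}
```

Your back-end should re-check that the user is still entitled to Chat access each time `fetchToken` is called, since renewal is the point at which access can be withdrawn.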

Strip down client-side permissions to the bare minimum

Programmable Chat considers a back-end application that uses the REST API and authenticates via API Keys a "trusted" application. On the other hand, the client is considered "untrusted" and so requires extra security precautions such as time-limited Access Tokens for client access to Chat.

Roles and Permissions are also a critical component to "trust." Default service-level and channel-level roles are set up within Twilio Chat and can be assigned to give either Admin-level or User-level rights to each User within the Service Instance.

While permissions defaults work well for out-of-the-box Chat applications (for example, prototype/proof of concept), you need to review the permissions granted to each User within a production Chat application.

You can use the Roles resource to check the permissions and roles for any instance:

curl -X GET 'https://chat.twilio.com/v2/Services/ISXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Roles' \
   -u AC235fc1b3aa3cf4ec101177b6b49d76f0:your_auth_token

For example, should a new User connecting to customer service Chat via the browser be allowed to create new Channels?

You might think that if your client UI doesn't offer a feature to create channels, then Users wouldn't be able to do that. However, a savvy User with developer experience may be able to access the underlying client SDK and perform unwanted operations granted by their User Role.

Therefore, Twilio's recommended best practice for Roles is to create custom Roles for your production application that grant the absolute minimum Permissions required for each User role.
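One way to act on this is to audit the output of the Roles request against an allow-list. The sketch below assumes each role object in the response has `sid`, `friendly_name`, `type`, and `permissions` fields; the function name and the allow-list contents are illustrative and should be adapted to your own use case.

```javascript
// Report roles that hold permissions outside an allow-list.
// roles: the array of role objects returned by the Roles resource (assumed shape).
// allowedByType: hypothetical allow-list keyed by role type.
function findOverPrivilegedRoles(roles, allowedByType) {
  return roles
    .map((role) => {
      const allowed = new Set(allowedByType[role.type] || []);
      const extra = role.permissions.filter((p) => !allowed.has(p));
      return { sid: role.sid, friendlyName: role.friendly_name, extra };
    })
    .filter((finding) => finding.extra.length > 0);
}

// Example allow-list for a customer service use case (illustrative only).
const allowedByType = {
  deployment: [],                        // end users may not create or join channels themselves
  channel: ['sendMessage', 'leaveChannel'],
};
```

Running the audit as part of deployment checks can catch a default role, for example one that still includes `createChannel`, before it reaches production.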

Avoid client-side channel creation

In addition to the concept of minimally-provisioned roles, Twilio recommends that your back-end application orchestrates all channel creation via the REST API (where the use case allows for that) rather than granting channel creation privileges to untrusted client users.

As an example of how someone might exploit channel creation privileges, imagine a scenario where a client user initiates unwanted chats targeting individual agents or groups.
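A back-end-orchestrated flow could be sketched as below: the trusted application creates a private Channel and adds the intended Members, so the client never needs channel creation rights. The `Channels` and `Members` endpoints and their `UniqueName`, `Type`, and `Identity` parameters are used as we understand the Chat REST API; `createPrivateChannel` and its arguments are hypothetical, and `authHeader` is assumed to be an HTTP Basic header built from an API Key on the server.

```javascript
// Back-end only: create a private Channel and add its Members via the REST API.
async function createPrivateChannel({
  serviceSid, uniqueName, identities, authHeader, fetchImpl = fetch,
}) {
  const base = `https://chat.twilio.com/v2/Services/${serviceSid}/Channels`;

  // POST form-encoded parameters and return the parsed JSON response.
  const post = async (url, params) => {
    const response = await fetchImpl(url, {
      method: 'POST',
      headers: {
        Authorization: authHeader,
        'Content-Type': 'application/x-www-form-urlencoded',
      },
      body: new URLSearchParams(params).toString(),
    });
    if (!response.ok) {
      throw new Error(`Chat API returned HTTP ${response.status}`);
    }
    return response.json();
  };

  const channel = await post(base, { UniqueName: uniqueName, Type: 'private' });
  for (const identity of identities) {
    await post(`${base}/${channel.sid}/Members`, { Identity: identity });
  }
  return channel.sid;
}
```

Because membership is decided on the server, a client cannot use this path to open chats with arbitrary agents or groups.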

Delete or archive data when no longer needed

Chat data, particularly Message and Member data associated with Channels, persists forever by default unless the individual Messages or the entire Channel is deleted. This data retention has implications for both the security and performance of your application.

Security - General security best practice is to retain only the data you need and to store that data only where you need it.

Performance - Initializing a Chat client causes it to read Channel and Member data from the Chat Service. If over time that User becomes a Member of more and more Channels with more Message data in each channel, this can significantly impact Client initialization time as well as the general performance of other client SDK methods.

Twilio recommends that when designing the Chat application architecture you consider at what point Chat information should be archived or deleted. This decision is typically tied to the required lifecycle of a Channel.

An example of a Channel lifecycle might be as follows:

An end user requests a Chat session via a web browser.
The user is placed in a newly created Channel with a customer service agent.
The Chat session completes.
The application retrieves the Message history from the Channel via the REST API and stores it in a CRM system.
The application deletes the Channel, removing all history from Twilio's platform.
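The last two steps of this lifecycle can be sketched as a small function that deletes a Channel only after its history has been stored. The three injected functions (`listMessages`, `saveTranscript`, `deleteChannel`) are hypothetical wrappers around the Chat REST API and your CRM; they are not part of any Twilio library.

```javascript
// Archive a finished Channel, then delete it from the Chat Service.
// deps.listMessages(channelSid)   -> Promise<Array> of the Channel's messages
// deps.saveTranscript(sid, msgs)  -> Promise, stores the history in your CRM
// deps.deleteChannel(channelSid)  -> Promise, deletes the Channel via the REST API
async function archiveThenDelete(channelSid, deps) {
  const messages = await deps.listMessages(channelSid);
  // If saving fails, the error propagates and the Channel is left intact,
  // so no history is lost.
  await deps.saveTranscript(channelSid, messages);
  await deps.deleteChannel(channelSid);
  return messages.length;
}
```

Ordering the calls this way means a CRM outage delays cleanup rather than destroying the only copy of the conversation.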

Additionally, you may want to consider your particular use-case and customer base:

Is any personally identifiable information (PII) shared via Chat?
Is any protected health information (PHI) shared via Chat?
Are any of your Chat users EU citizens who are protected by GDPR?

Be thoughtful as you consider your application's data retention.

Be aware of push notification configuration requirements

A few steps are required to get push notifications working for Programmable Chat.

Push must be explicitly enabled as it is disabled by default on all Chat Service Instances.
Programmable Chat push notifications are supported for specific events on iOS (via APN) and on Android and browsers (via GCM/FCM).
Use the Chat console to provision Apple Developer credentials for iOS pushes. Use FCM push credentials for provisioning Android and browser push notifications. Before Twilio Chat provisioning can be completed, you need to configure both APN and FCM credentials with Apple and Google respectively.

Plan your chat administration strategy

When architecting your Chat application, make sure you consider what level of management functionality you need to build for system administration.

While individual Channels and Users can be created and deleted by the core application business logic, administrators may need additional insights into usage of a Chat Service Instance. These insights include, for example, a detailed breakdown of Users, Channels and Messages, and the ability to modify or remove Chat objects directly.

Some of these actions can be performed via the Twilio console, but administrators often do not, or should not, have access to the console.

An admin interface can be a great solution to this requirement.