You want to build new chat functionality using Twilio's Programmable Chat, and you want it to be secure and to perform well. Here you can learn the best practices for securing and scaling Programmable Chat applications.
Looking for the API documentation for Programmable Chat? You can find the full reference docs here.
This guide highlights common security gotchas and recommendations to improve your application's performance. We'll take a close look at the following topics:
- Chat fundamentals
- Back-end authentication using API keys
- Client-side authentication using JWT access tokens
- Minimize client-side permissions
- Prevent channel creation from the client-side
- Delete or archive data that are no longer needed
- Comply with Push notification configurations
- Make sure you have a chat admin strategy
- Get your chat users' consent
Twilio Chat consists of two main components: REST APIs (back-end) and Client SDKs (front-end).
The back-end consists of a REST API and associated libraries that orchestrate and control the use of Chat. Tasks include creating chat users and creating, managing, and deleting Channels. The back-end also handles webhook event notifications from Twilio.
| Concept | Description |
|---|---|
| Service Instance | This is a single Chat “deployment”. A single Twilio account can have multiple Chat Service instances where each is siloed from all the others with a unique Service Identifier (SID). All chat data, channels, and users are contained and managed within the context of the Service Instance. This gives you complete separation between different Chat use cases. For example, you may have one Service Instance for an internal employee chat application and a second for an external customer-facing chat application. |
| Channel | A channel is a container for a conversation between two or more Chat users. A channel is required even for a 1:1 conversation. Messages from users are sent to a channel and then propagated to the other channel members. Those messages are stored and associated with that channel until they, or the channel itself, are deleted. |
| Message | Each message belongs to a single channel and, as noted above, messages are sent to the channel (as opposed to individual users). Messages can be edited or deleted within a channel. |
| User | A user is a single real person interacting with a Chat Service instance, usually via a 1st person SDK-based client in a browser or on a mobile device. A single user may use multiple clients, for example a web-based client and a mobile client in an installed app. |
| Member | Each User can join many Channels. A Member is a representation of the User within a single channel, and it is the Member that interacts with that Channel, for example by sending Messages. |
The back-end and front-end (client) components of chat applications need to be considered and handled separately from a security point of view. Let's start with discussing the mechanism for securing the back-end.
The back-end application makes requests to Twilio’s Chat REST API, which uses basic HTTP authentication. As with all of Twilio’s REST APIs, the Twilio Account SID and Auth Token can be used as the username and password respectively. However, for Programmable Chat, we recommend using API Keys.
Here is an example back-end chat request using API keys:
curl -G https://chat.twilio.com/v2/Services \
  -u '[YOUR API KEY]:[YOUR API SECRET]'
API Keys are used both to authenticate to the REST API and to create and revoke the Access Tokens used by Chat clients. API Keys can be provisioned and revoked either via the REST API or the Twilio console. These tools offer fine-grained control over the developer and application access to the Chat REST API.
Note: When handling API keys, ensure you follow these two critical points:
- API keys must be stored and handled with the same care as any other secret (for example, passwords and tokens for other third-party APIs and services).
- API Keys and Twilio’s REST APIs should only be used by the back-end application. API Keys & REST API Tokens should never be sent to or embedded in the client application. Additionally, you should never embed your API key directly in your source code or push it to a repo. If an API Key is available to a client, it could be obtained by an end user who will then have API-level access to your Twilio account.
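One way to keep an API Key out of source code is to read it from the environment when the back-end starts. The following is a minimal sketch under that assumption; the variable names `TWILIO_API_KEY` and `TWILIO_API_SECRET` and the function `basicAuthHeader` are illustrative choices, not names defined by Twilio.

```javascript
// Build the HTTP Basic Authorization header for the Chat REST API from
// environment variables, so the API Key never appears in source code.
// TWILIO_API_KEY and TWILIO_API_SECRET are assumed (hypothetical) names.
function basicAuthHeader(env = process.env) {
  const key = env.TWILIO_API_KEY;
  const secret = env.TWILIO_API_SECRET;
  if (!key || !secret) {
    // Fail fast rather than sending an unauthenticated request.
    throw new Error('TWILIO_API_KEY and TWILIO_API_SECRET must be set');
  }
  const encoded = Buffer.from(`${key}:${secret}`, 'utf8').toString('base64');
  return `Basic ${encoded}`;
}

// Example with placeholder values (not real credentials):
const exampleHeader = basicAuthHeader({
  TWILIO_API_KEY: 'SKXXXX',
  TWILIO_API_SECRET: 'example-secret',
});
console.log(exampleHeader.startsWith('Basic ')); // prints true
```

Because the header is only ever built on the server, the client never sees the key or the secret.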
The Chat Client SDKs authenticate using Twilio Access Tokens. Unlike API Keys, Access Tokens are short-lived tokens granted by your application to give a client access to a Chat Service Instance for a defined amount of time.
An Access Token can be valid for up to 24 hours, but best practice is to keep its lifetime as short as is feasible for your specific Chat application.
Your back-end application is responsible for authenticating the end user to ensure that they should have access to the Chat Service Instance. Your application can then generate an Access Token for the client, signed with an API Key secret (typically via a Twilio helper library).
The Access Token contains two pieces of information:
- User’s Identity
- Chat Grant that specifies the Service Instance SID this user can connect to
The application then passes the Access Token back to the client (for example, via Ajax). The Chat client SDK will then provide this Access Token on requests to Twilio.
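To make the contents of such a token concrete, here is a rough sketch that builds an HS256-signed JWT carrying an identity and a Chat grant using only Node's `crypto` module. The claim layout (`jti`, `iss`, `sub`, `grants.chat.service_sid`, and the `cty` header) is an assumption modeled on how Twilio's helper libraries appear to structure tokens, and `createChatToken` is a hypothetical name. In a real application you would normally let a Twilio helper library produce the token.

```javascript
const crypto = require('crypto');

// Convert a Buffer to the base64url alphabet used by JWT segments.
function toBase64Url(buffer) {
  return buffer
    .toString('base64')
    .replace(/=+$/, '')
    .replace(/\+/g, '-')
    .replace(/\//g, '_');
}

// Sketch only: the claim names below are assumptions, not a verified spec.
function createChatToken(options) {
  const { accountSid, apiKeySid, apiKeySecret, serviceSid, identity, ttlSeconds = 3600 } = options;
  const MAX_TTL_SECONDS = 24 * 60 * 60; // the 24-hour ceiling described above
  if (!Number.isInteger(ttlSeconds) || ttlSeconds <= 0 || ttlSeconds > MAX_TTL_SECONDS) {
    throw new RangeError('ttlSeconds must be between 1 and 86400');
  }
  const now = Math.floor(Date.now() / 1000);
  const header = { typ: 'JWT', alg: 'HS256', cty: 'twilio-fpa;v=1' };
  const payload = {
    jti: `${apiKeySid}-${now}`,
    iss: apiKeySid,
    sub: accountSid,
    exp: now + ttlSeconds,
    // The two pieces of information: the identity and the Chat grant.
    grants: { identity, chat: { service_sid: serviceSid } },
  };
  const signingInput = [header, payload]
    .map((part) => toBase64Url(Buffer.from(JSON.stringify(part))))
    .join('.');
  const signature = toBase64Url(
    crypto.createHmac('sha256', apiKeySecret).update(signingInput).digest()
  );
  return `${signingInput}.${signature}`;
}

// Example with placeholder values: a token valid for ten minutes.
const exampleToken = createChatToken({
  accountSid: 'ACXXXX',
  apiKeySid: 'SKXXXX',
  apiKeySecret: 'example-secret',
  serviceSid: 'ISXXXX',
  identity: 'alice',
  ttlSeconds: 600,
});
console.log(exampleToken.split('.').length); // prints 3
```

Rejecting out-of-range lifetimes in one place makes it harder for a caller to issue a long-lived token by accident.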
As a result of this process, Twilio keys do not need to be stored in the client application. Also, your client authenticates using the generated access token, as follows:
const chatClient = new Twilio.Chat.Client('the token string from server');
These tokens are based on the JSON Web Token standard (JWT). For troubleshooting purposes, you can paste a token into https://jwt.io/, and it will decode the token for you. This makes it easy to confirm that the values encoded in the token are correct.
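If you would rather inspect a token locally, the payload can be decoded with a few lines of Node. This sketch only decodes the payload and does not verify the signature, so it is suitable for troubleshooting only. The function names and the sample claim layout (`grants.identity`, `grants.chat.service_sid`) are assumptions for illustration.

```javascript
// Decode (without verifying) the payload segment of a JWT for debugging.
function decodeJwtPayload(token) {
  const parts = token.split('.');
  if (parts.length !== 3) {
    throw new Error('Expected three dot-separated JWT segments');
  }
  // JWT segments use the base64url alphabet; map it back to base64.
  const base64 = parts[1].replace(/-/g, '+').replace(/_/g, '/');
  return JSON.parse(Buffer.from(base64, 'base64').toString('utf8'));
}

// Seconds remaining before the token's `exp` claim, relative to nowMs.
function secondsUntilExpiry(token, nowMs = Date.now()) {
  return decodeJwtPayload(token).exp - Math.floor(nowMs / 1000);
}

// Build an unsigned sample token locally so the example is self-contained.
const toSegment = (obj) =>
  Buffer.from(JSON.stringify(obj))
    .toString('base64')
    .replace(/=+$/, '')
    .replace(/\+/g, '-')
    .replace(/\//g, '_');

const sampleToken = [
  toSegment({ typ: 'JWT', alg: 'HS256' }),
  toSegment({ exp: 1700000000, grants: { identity: 'alice', chat: { service_sid: 'ISXXXX' } } }),
  'signature-not-checked',
].join('.');

console.log(decodeJwtPayload(sampleToken).grants.identity); // prints alice
console.log(secondsUntilExpiry(sampleToken, 1699999400 * 1000)); // prints 600
```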
Finally, if the Client needs uninterrupted access to the Chat Service, Access Tokens need to be renewed before they expire. This can be done using the client’s updateToken method, but Twilio also has an optional helper, AccessManager, that can manage the renewal process.
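A renewal flow built on updateToken might look like the sketch below. It assumes the client object emits `tokenAboutToExpire` and `tokenExpired` events, as the JavaScript Chat SDK does in the versions we are aware of, and that `fetchToken` is a hypothetical function of your own that requests a fresh token from your back-end.

```javascript
// Attach token renewal to a Chat client-like object.
// `fetchToken` is a hypothetical async function that asks your back-end
// for a new Access Token and resolves with the token string.
function wireTokenRenewal(client, fetchToken) {
  const renew = async () => {
    const token = await fetchToken();
    await client.updateToken(token);
  };
  // Renew shortly before expiry, and again if the token has already expired.
  client.on('tokenAboutToExpire', renew);
  client.on('tokenExpired', renew);
  return renew;
}

// Usage sketch (the '/token' endpoint is an assumption about your back-end):
// wireTokenRenewal(chatClient, async () => {
//   const response = await fetch('/token');
//   return response.text();
// });
```

Keeping the fetch logic in a separate function means the same wiring works regardless of how your back-end authenticates the user.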
Programmable Chat considers a back-end application that uses the REST API and authenticates via API Keys a “trusted” application. On the other hand, the client is considered “untrusted” and so requires extra security precautions such as time-limited Access Tokens for client access to Chat.
Roles and Permissions are also a critical component to "trust." Default service-level and channel-level roles are set up within Twilio Chat and can be assigned to give either Admin-level or User-level rights to each User within the Service Instance.
While the default permissions work well for out-of-the-box Chat applications (for example, a prototype or proof of concept), you need to review the permissions granted to each User within a production Chat application.
You can use the Roles resource to check the permissions and roles for any instance:
curl -X GET 'https://chat.twilio.com/v2/Services/ISXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Roles' \
  -u AC235fc1b3aa3cf4ec101177b6b49d76f0:your_auth_token
For example, should a new User connecting to customer service Chat via the browser be allowed to create new Channels?
You might think that if your client UI doesn’t offer a feature to create channels, then Users wouldn’t be able to do that. However, a savvy User with developer experience may be able to access the underlying client SDK and perform unwanted operations granted by their User Role.
Therefore, Twilio’s recommended best practice for Roles is to create custom Roles for your production application that grant the absolute minimum Permissions required for each User role.
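As a rough illustration, the sketch below first scans a list of roles for a permission you may not want clients to have, then builds the form-encoded body for creating a narrower custom role. The role objects mimic what we assume the Roles resource returns (`friendly_name`, `type`, `permissions`), and the permission names and parameter names (`FriendlyName`, `Type`, `Permission`) should be checked against the Roles reference. The function names are hypothetical.

```javascript
// Return the names of roles that include a given permission.
function findRolesWithPermission(roles, permission) {
  return roles
    .filter((role) => role.permissions.includes(permission))
    .map((role) => role.friendly_name);
}

// Build a form-encoded body for creating a custom role with a minimal
// permission set. Parameter names are assumptions about the Roles resource.
function customRoleParams(friendlyName, type, permissions) {
  const params = new URLSearchParams({ FriendlyName: friendlyName, Type: type });
  permissions.forEach((permission) => params.append('Permission', permission));
  return params.toString();
}

// Sample data shaped like an assumed Roles listing (not real output).
const sampleRoles = [
  { friendly_name: 'channel user', type: 'channel', permissions: ['sendMessage', 'leaveChannel'] },
  { friendly_name: 'service user', type: 'deployment', permissions: ['createChannel', 'joinChannel'] },
];

console.log(findRolesWithPermission(sampleRoles, 'createChannel')); // prints [ 'service user' ]
console.log(customRoleParams('customer', 'channel', ['sendMessage', 'leaveChannel']));
// prints FriendlyName=customer&Type=channel&Permission=sendMessage&Permission=leaveChannel
```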
In addition to the concept of minimally-provisioned roles, Twilio recommends that your back-end application orchestrates all channel creation via the REST API (where the use case allows for that) rather than granting channel creation privileges to untrusted client users.
As an example of how someone might exploit channel creation privileges, imagine a scenario where a client user initiates unwanted chats targeting individual agents or groups.
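When the back-end owns channel creation, it decides who ends up in each Channel. The sketch below only plans the REST requests a back-end might send: one to create a private Channel and one per Member. It does not send them. The paths follow the `https://chat.twilio.com/v2/Services` base URL used earlier. The parameter names (`UniqueName`, `Type`, `Identity`) and addressing a Channel by its unique name in the path are assumptions to verify against the REST reference, and `planChannelSetup` is a hypothetical name.

```javascript
// Plan the REST requests for creating a private Channel and adding Members.
// Nothing is sent here; each entry describes one request for the back-end.
function planChannelSetup(serviceSid, uniqueName, identities) {
  const base = `https://chat.twilio.com/v2/Services/${serviceSid}/Channels`;
  const createChannel = {
    method: 'POST',
    url: base,
    form: new URLSearchParams({ UniqueName: uniqueName, Type: 'private' }).toString(),
  };
  const addMembers = identities.map((identity) => ({
    method: 'POST',
    url: `${base}/${encodeURIComponent(uniqueName)}/Members`,
    form: new URLSearchParams({ Identity: identity }).toString(),
  }));
  return [createChannel, ...addMembers];
}

// Example: a support Channel for one customer and one agent.
const examplePlan = planChannelSetup('ISXXXX', 'support-42', ['customer_1', 'agent_7']);
console.log(examplePlan.length); // prints 3
console.log(examplePlan[0].form); // prints UniqueName=support-42&Type=private
```

Because the client never holds channel creation privileges in this design, it cannot open Channels that target arbitrary agents or groups.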
Chat data, particularly Message and Member data associated with Channels, persists forever by default unless the individual Messages or the entire Channel is deleted. This data retention has implications for both the security and performance of your application.
Security - General security best practice is to retain only the data you need and to store that data only where you need it.
Performance - Initializing a Chat client causes it to read Channel and Member data from the Chat Service. If over time that User becomes a Member of more and more Channels with more Message data in each channel, this can significantly impact Client initialization time as well as the general performance of other client SDK methods.
Twilio recommends that when designing the Chat application architecture you consider at what point Chat information should be archived or deleted. This decision is typically tied to the required lifecycle of a Channel.
An example of a Channel lifecycle might be as follows:
- End-user requests Chat session via a web browser.
- A user is placed in a newly-created Channel with a customer service Agent.
- Chat session completes.
- The application retrieves Message history from the Channel via the REST API and stores it in a CRM system.
- Application deletes Channel, removing all history from Twilio’s platform.
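The last two steps of this lifecycle can be sketched as a single back-end routine that deletes the Channel only after the transcript has been stored. The three helpers (`listMessages`, `saveTranscript`, `deleteChannel`) are hypothetical functions you would implement on top of the REST API and your CRM, and the message fields used here (`from`, `body`, `date_created`) are assumptions about the REST representation.

```javascript
// Archive a finished Channel's messages, then delete the Channel.
// All three helpers are injected, hypothetical async functions.
async function archiveAndDeleteChannel(channelSid, helpers) {
  const { listMessages, saveTranscript, deleteChannel } = helpers;
  const messages = await listMessages(channelSid);
  // Keep only the fields the CRM needs, in line with data minimization.
  const transcript = messages.map(({ from, body, date_created }) => ({
    from,
    body,
    date_created,
  }));
  // If saving fails, this throws and the Channel is left in place.
  await saveTranscript(channelSid, transcript);
  await deleteChannel(channelSid);
  return transcript.length;
}
```

Ordering the steps this way means a CRM outage cannot cause message history to be lost.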
Additionally, you may want to consider your particular use-case and customer base:
- Is any PII shared via Chat?
- Are any of your Chat users located in the EU and therefore protected by GDPR?
Be thoughtful as you consider your application’s data retention.
A few steps are required to get push notifications working for Programmable Chat.
- Push must be explicitly enabled as it is disabled by default on all Chat Service Instances.
- Programmable Chat supports push notifications for specific events on iOS (via APN) and on Android and browsers (via GCM/FCM).
- Use the Chat console to provision Apple Developer credentials for iOS pushes. Use FCM push credentials for provisioning Android and browser push notifications. Before Twilio Chat provisioning can be completed, you need to configure both APN and FCM credentials with Apple and Google respectively.
When architecting your Chat application, make sure you consider what level of management functionality you need to build for system administration.
While individual Channels and Users can be created and deleted by the core application business logic, administrators may need additional insights into usage of a Chat Service Instance. These insights include, for example, a detailed breakdown of Users, Channels and Messages, and the ability to modify or remove Chat objects directly.
Some of these actions can be performed via the Twilio console, but administrators often do not, or should not, have access to the console.
An admin interface can be a great solution to this requirement.
When using Twilio Programmable Chat, your Users’ communications are being handled and stored by Twilio.
It is best practice, and a legal requirement in many jurisdictions, to obtain a User’s consent for this before making use of a service based on Programmable Chat. Twilio recommends that you consult with your legal counsel to make sure that you are complying with all applicable laws in connection with communications you transmit using Twilio.