Get Started with Twilio Programmable Video Authentication and Identity using TypeScript

February 09, 2021
Written by
Jamie Corkhill
Contributor
Opinions expressed by Twilio contributors are their own

This article is for reference only. We're not onboarding new customers to Programmable Video. Existing customers can continue to use the product until December 5, 2024.


We recommend migrating your application to the API provided by our preferred video partner, Zoom. We've prepared this migration guide to assist you in minimizing any service disruption.

In this article, you’ll learn to use TypeScript and Twilio Programmable Video to build a video chatting application with identity and user management controls. You’ll use the server-side Twilio library (with TypeScript) to handle tokenization for users who wish to partake in an interactive video call.

This article is an extension of my last article, Getting Started with TypeScript and Twilio Programmable Video, and will build off the “getting-started” branch of this GitHub Repository. To see the final code, visit the “adding-token-server” branch.

Twilio Programmable Video is a suite of tools for building real-time video apps that scale as you grow, from free 1:1 chats with WebRTC to larger group rooms with many participants. You can sign up for a free Twilio account to get started using Programmable Video.

TypeScript is an extension of pure JavaScript - a “superset” if you will - and adds static typing to the language. It enforces type safety, makes code easier to reason about, and permits the implementation of classic patterns in a more “traditional” manner. As a language extension, all JavaScript is valid TypeScript, and TypeScript is compiled down to JavaScript.

Parcel is a blazing-fast, zero-configuration web application bundler that supports hot module replacement and bundles and transforms your assets. You’ll use it in this article to work with TypeScript on the client without having to worry about transpilation, bundling, or configuration.

Requirements

  • Node.js - Consider using a tool like nvm to manage Node.js versions.
  • A Twilio Account for Programmable Video. If you are new to Twilio, you can create a free account.

Project Configuration

Download the project files and install dependencies

You can begin by cloning the “getting-started” branch of the accompanying GitHub Repository with the command below:

git clone -b getting-started --single-branch https://github.com/JamieCorkhill/Twilio-Video-Series

Create two new folders, client and server, and move all your existing code into the client folder, except for the .gitignore file, as shown below:

cd Twilio-Video-Series
mkdir client server
mv index.html package-lock.json package.json tsconfig.json client/
mv src client

Inside client, install any dependencies:

cd client
npm install

Next, navigate into the server folder and configure the project as follows:

cd ../server
npm init -y
npm i cors env-cmd express twilio typescript
npm i --save-dev @types/cors @types/express @types/node @types/twilio
node_modules/.bin/tsc --init
mkdir env src
touch env/dev.env

These commands created a TypeScript project, installed all the dependencies you’ll need, and created some initial files and folders.

Configure TypeScript

Next, find the tsconfig.json file inside the server folder and open it in your favorite text editor. Replace the contents of the file with the following configuration:

{
  "compilerOptions": {
    "target": "es5",                          
    "module": "commonjs",                     
    "outDir": "./dist",                       
    "rootDir": "./src",                      
    "strict": true,                          
    "esModuleInterop": true,                  
    "skipLibCheck": true,                     
    "forceConsistentCasingInFileNames": true  
  }
}

The important parts of this configuration file are the outDir and rootDir keys, short for output directory and root directory, respectively.

Here, you’ve specified that for your server-side TypeScript code, you want the compiler to ingest all source files from the src directory and output compiled JavaScript code into the dist directory, which will be created and managed automatically. These folders will come into play shortly when you write the scripts for package.json.

Add environment variables

In order to make use of the Twilio Programmable Video API, you’ll need to add your Twilio account credentials to your TypeScript app.

Open the env/dev.env file, and add the following environment variables:

TWILIO_ACCOUNT_SID=[Your Key]
TWILIO_API_KEY=[Your Key]
TWILIO_API_SECRET=[Your Key]

You can find your Account SID on the Twilio Console, and you can create your API Key and API Secret in the API Keys section of the Console. Add these keys in their respective locations, overwriting the [Your Key] placeholder in its entirety each time.

Note that on the API dashboard of the Console, your API key will be referred to as the API SID. Also, be sure to take note of your API Key Secret before navigating away from the page - you won’t be able to access it again.

Update your package.json file

Finally, update the scripts section of your package.json file so that the project is compiled with tsc and then run through env-cmd, which loads the variables from env/dev.env:


{
  "name": "server",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "start:build": "tsc",
    "start:run": "env-cmd -f ./env/dev.env node ./dist/server.js",
    "start": "npm run start:build && npm run start:run"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "cors": "^2.8.5",
    "env-cmd": "^10.1.0",
    "express": "^4.17.1",
    "twilio": "^3.54.2",
    "typescript": "^4.1.3"
  },
  "devDependencies": {
    "@types/cors": "^2.8.9",
    "@types/express": "^4.17.11",
    "@types/node": "^14.14.20",
    "@types/twilio": "^2.11.0"
  }
}

With this, you’re ready to begin developing the token server.

Building the Token Server

Load the environment variables

Use the commands below to create a new folder called config in the src folder of your server project, and place within it a file named config.ts:

mkdir src/config
touch src/config/config.ts

Inside config.ts, add the following code:

/**
 * Global application level configuration parameters.
 */
export const config = {
    twilio: {
        ACCOUNT_SID: process.env.TWILIO_ACCOUNT_SID as string,
        API_KEY: process.env.TWILIO_API_KEY as string,
        API_SECRET: process.env.TWILIO_API_SECRET as string
    }
} as const;
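
The as string casts above tell the compiler that each variable is set, but nothing checks that at runtime, so a missing entry in dev.env would only surface later as a failed token request. The following is a minimal sketch of a fail-fast alternative. The requireEnv and makeConfig helpers are hypothetical names introduced here for illustration and are not part of the original project:

```typescript
// An env-like object: every key may be missing.
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the variable's value or throws a descriptive error.
function requireEnv(env: Env, name: string): string {
    const value = env[name];
    if (value === undefined || value.trim() === '') {
        throw new Error(`Missing required environment variable: ${name}`);
    }
    return value;
}

// Hypothetical factory: builds the same shape as config.ts from any env-like object.
function makeConfig(env: Env) {
    return {
        twilio: {
            ACCOUNT_SID: requireEnv(env, 'TWILIO_ACCOUNT_SID'),
            API_KEY: requireEnv(env, 'TWILIO_API_KEY'),
            API_SECRET: requireEnv(env, 'TWILIO_API_SECRET')
        }
    } as const;
}
```

On the server you could call makeConfig(process.env) once at startup, so that a misconfigured environment stops the process immediately with a message naming the missing variable.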

In ES6 JavaScript, objects marked as const are not truly constant - only the reference to the object is constant, which means you can’t reassign a const object to another object, but you can manipulate the properties.

To remedy that at compile time, you use a TypeScript 3.4 feature known as a “const assertion”, which is the as const at the end.

A const assertion ensures that literal types are not widened, that the properties of object literals become readonly, and that array literals become readonly tuples.

To understand widening, consider creating a constant (const) variable and assigning it a literal value like 3.14. The type of that variable will be the literal type 3.14, since its value can’t change. If you assigned the literal 3.14 to a variable declared via let (i.e., a non-constant variable), the variable’s type would be widened from the literal type 3.14 to the number type, since other numbers can be assigned to it later and you’re not restricted to that literal.

To learn more about Const Assertions, see the relevant section of the TypeScript Documentation. This section of the documentation also shows an example of widening.
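
As a small illustration of the points above, the sketch below contrasts widened and non-widened declarations. The variable names are made up for this example; the commented-out lines are the ones the compiler would reject:

```typescript
// A const binding keeps the literal type: `pi` has type 3.14.
const pi = 3.14;

// A let binding is widened: `approx` has type number, so reassignment is allowed.
let approx = 3.14;
approx = 2.71;

// Without a const assertion, property types are widened (retries: number)
// and the properties stay writable.
const loose = { retries: 3, media: ['audio', 'video'] };
loose.retries = 5;

// With `as const`, retries has the literal type 3, media is a readonly tuple,
// and every property is readonly.
const locked = { retries: 3, media: ['audio', 'video'] } as const;

// locked.retries = 5;         // compile error: retries is a read-only property
// locked.media.push('data');  // compile error: push does not exist on a readonly tuple

// The assertion is erased during compilation, so the object is not frozen at runtime.
const isFrozen = Object.isFrozen(locked);
```

Note that the protection is purely a compile-time one. If you also need runtime immutability, Object.freeze() is the usual complement.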

Create a Data Transfer Object

Next, inside of src, create a folder named api, and inside of that folder, create two new files - controller.ts and dtos.ts, as shown in the Bash snippet below. The controller.ts file will hold the actual endpoint which does the work of generating tokens. The dtos.ts file is where you’ll define Data Transfer Objects.

A Data Transfer Object, or DTO, is a way of standardizing the shape of data as it passes between two systems or layers. In this case, it standardizes the body you expect in the POST request the client sends to create a token.

mkdir src/api
touch src/api/controller.ts src/api/dtos.ts

You’ll start with the dtos.ts file. Open it in your text editor and add the following interface:

/**
 * Incoming request DTO data shape for creating a token.
 */
export interface IGenerateVideoTokenRequestDto {
    identity: string;
    roomName: string;
}

This interface specifies that an incoming request to create a video token must carry two properties: identity, which could be the client’s name or some other form of ID, and roomName, the name of the room to which they wish to connect.
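
An interface only exists at compile time, so it cannot verify what a client actually sends. One way to close that gap is a user-defined type guard, sketched below. The function name isGenerateVideoTokenRequestDto is hypothetical, and the rule that both fields must be non-empty strings is an assumption made for this example:

```typescript
interface IGenerateVideoTokenRequestDto {
    identity: string;
    roomName: string;
}

// Hypothetical runtime check: narrows an unknown body to the DTO shape.
function isGenerateVideoTokenRequestDto(body: unknown): body is IGenerateVideoTokenRequestDto {
    if (typeof body !== 'object' || body === null) {
        return false;
    }

    const candidate = body as Record<string, unknown>;

    // Both fields must be present, be strings, and contain non-whitespace text.
    return (
        typeof candidate.identity === 'string' && candidate.identity.trim() !== '' &&
        typeof candidate.roomName === 'string' && candidate.roomName.trim() !== ''
    );
}
```

A request handler could call this guard on req.body and respond with a 400 status when it returns false, instead of casting the body directly.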

Generate the Access Token

Next, open controller.ts and add the following code:

import { Request, Response } from 'express';
import { jwt } from 'twilio';

import { config } from './../config/config';
import { IGenerateVideoTokenRequestDto } from './dtos';

const AccessToken = jwt.AccessToken;
const VideoGrant = AccessToken.VideoGrant;

/**
 * Generates a video token for a given identity and room name.
 */
export function generateToken(req: Request, res: Response) {
    const dto = req.body as IGenerateVideoTokenRequestDto;

    // Generate an access token for the given identity.
    const token = new AccessToken(
        config.twilio.ACCOUNT_SID,
        config.twilio.API_KEY,
        config.twilio.API_SECRET,
        { identity: dto.identity }
    );

    // Grant access to Twilio Video capabilities.
    const grant = new VideoGrant({ room: dto.roomName });
    token.addGrant(grant);

    return res.send({ token: token.toJwt() });
}

This code creates a function called generateToken() that acts as the endpoint to which users will make a POST request in order to receive a token.

Access Tokens are ephemeral credentials that control participant identity and room permissions for your application. They contain “grants” which govern the actions that the token holder may perform. Visit the documentation to learn more about Grants and Access Tokens.

This function extracts the DTO off the request body, and then creates a Twilio Access Token utilizing your three Twilio credentials and the specified identity. It then adds a grant to the token for the room name and responds with the serialized JSON Web Token in an object. The grant specifies that the user with the identity specified who holds the current token is permitted to access the room with name dto.roomName.

The token in this function’s response is what you’ll pass to the connect() method on the client side below.
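
When debugging, it can help to look inside a token. A JSON Web Token is made of three base64url-encoded segments separated by dots (header, payload, and signature), so the payload can be decoded without any library. The sketch below is for inspection only and does not verify the signature. The decodeJwtPayload and toBase64Url helpers are hypothetical, the sample token is hand-built rather than issued by Twilio, and the grants layout shown in it is an assumption you should confirm against a real token. It also assumes the global atob and btoa functions, which exist in browsers and in recent Node.js versions:

```typescript
// Decodes the payload (second segment) of a JWT without verifying its signature.
// For debugging only: never base authorization decisions on an unverified payload.
function decodeJwtPayload(token: string): Record<string, unknown> {
    const segments = token.split('.');
    if (segments.length !== 3) {
        throw new Error('Expected a JWT with three dot-separated segments');
    }

    // Convert base64url to standard base64 and restore any stripped padding.
    let base64 = segments[1].replace(/-/g, '+').replace(/_/g, '/');
    while (base64.length % 4 !== 0) {
        base64 += '=';
    }

    return JSON.parse(atob(base64));
}

// Helper used only to build a sample token for this example.
const toBase64Url = (value: object): string =>
    btoa(JSON.stringify(value)).replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');

// Hand-built sample; the payload shape is an assumption, not captured from Twilio.
const sampleToken = [
    toBase64Url({ alg: 'HS256', typ: 'JWT' }),
    toBase64Url({ grants: { identity: 'alice', video: { room: 'demo-room' } } }),
    'signature-placeholder'
].join('.');

const samplePayload = decodeJwtPayload(sampleToken);
```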

Create the Express server

At this point, all that remains for the backend is to create the Express application and wire up the endpoint to the route.

Create a new file inside src called server.ts:

touch src/server.ts

And add the following code:

import express from 'express';
import cors from 'cors';

import { generateToken } from './api/controller';

const PORT = parseInt(process.env.PORT as string) || 3000;
const app = express();

app.use(cors());
app.use(express.json());
app.use(express.urlencoded({ extended: true }));

// Endpoints 
app.post('/create-token', generateToken);

app.listen(PORT, () => console.log(`Server is up on port ${PORT}`));

This file instructs the server to run on whichever port is specified, be it the default 3000 or one provided through the PORT environment variable. It also adds CORS middleware to enable cross-origin resource sharing. This middleware allows you to make a request to this server from your client, which Parcel serves on a different port and therefore from a different origin.
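
To make the cross-origin point concrete: two URLs share an origin only when their scheme, host, and port all match. The sketch below uses the standard URL class to compare them. The isCrossOrigin helper is a hypothetical name, and the client port 1234 is simply the port Parcel is expected to use later in this article:

```typescript
// Returns true when a request from pageUrl to requestUrl crosses origins.
function isCrossOrigin(pageUrl: string, requestUrl: string): boolean {
    return new URL(pageUrl).origin !== new URL(requestUrl).origin;
}

const clientUrl = 'http://localhost:1234/index.html';   // page served by Parcel
const tokenUrl = 'http://localhost:3000/create-token';  // token server endpoint

// Same scheme and host, but the ports differ, so the browser applies CORS rules.
const needsCors = isCrossOrigin(clientUrl, tokenUrl);
```

Calling cors() with no arguments allows requests from any origin, which is convenient for a local demo. In a deployed application you would likely pass the middleware's origin option to restrict access to your own client.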

The server is now complete, so it’s time to move on to updating the client.

Updating the Client

Add text inputs to index.html

Navigate back to the client folder. The first step you’ll need to perform is updating your existing HTML structure within the index.html file. Update all the code below the media-container div and before the <script> tag as shown in the highlighted lines below:


<!DOCTYPE html>
<html lang="en">
<head>
    <title>Twilio Video Development Demo</title>

    <style>
        .media-container {
            display: flex;
        }

        .media-container > * + * {
            margin-left: 1.5rem;
        }
    </style>
</head>
<body>
    <div class="media-container">
        <div id="local-media-container"></div> 
        <div id="remote-media-container"></div>
    </div>
    
    <div>
        <input id="room-name-input" type="text" placeholder="Room Name"/>
        <input id="identity-input" type="text" placeholder="Your Name"/>
        <button id="join-button">Join Room</button>
        <button id="leave-button">Leave Room</button>
    </div>

    <script src="./src/video.ts"></script>
</body>
</html>

These updates add two text inputs for the user to enter both a room name and their own name. They also add a button that lets the user leave the room. You typically wouldn’t want to place styling within an HTML file, but this is a demo application and the styles are trivial, so it’s not too much of a problem here.

Your next steps are to update token-repository.ts to pull tokens from the server, and then update the video.ts file to make use of these new elements.

Update the token repository

From inside the client folder, install the Axios HTTP client library with the command: npm i axios.

Open the token-repository.ts file inside the client/src folder and delete all the existing code within the file. The changes you’ll be making aren’t drastic, but it’ll be easier to start from a clean slate. Paste the code below into the now empty file.

import axios from 'axios';

/**
 * Creates an instance of a token repository.
 */
function makeTokenRepository() {
    return {
        /**
         * Provides an access token for the given room name and identity.
         */
        async getToken(roomName: string, identity: string) {
            const response = await axios.post<{ token: string }>('http://localhost:3000/create-token', {
                roomName,
                identity
            });

            return response.data.token;
        }
    }
}

/**
 * An instance of a token repository.
 */
export const tokenRepository = makeTokenRepository();

Notice that you’ve renamed getNextToken() to just getToken(). You renamed the function because you’re no longer just pulling the next token as you had before; now you’re generating one for the given room name and identity. You’ve also made the function asynchronous so that it can perform a network request without blocking the event loop.

With that, you can move over to the video.ts file.

Start by adding the extra two UI handles and setting their initial disabled state, as well as a global reference to the room, as shown in the highlighted code below:


import { 
    connect,
    createLocalVideoTrack,
    RemoteAudioTrack, 
    RemoteParticipant, 
    RemoteTrack, 
    RemoteVideoTrack, 
    Room
} from 'twilio-video';

import { tokenRepository } from './token-repository';

import { Nullable } from './types';

// UI Element Handles
const joinButton = document.querySelector('#join-button') as HTMLButtonElement;
const leaveButton = document.querySelector('#leave-button') as HTMLButtonElement;
const remoteMediaContainer = document.querySelector('#remote-media-container') as HTMLDivElement;
const localMediaContainer = document.querySelector('#local-media-container') as HTMLDivElement;
const roomNameInput = document.querySelector('#room-name-input') as HTMLInputElement;
const identityInput = document.querySelector('#identity-input') as HTMLInputElement;

// Room reference
let room: Room;

/**
 * Entry point.
 */
async function main() {
    // Initial state.
    leaveButton.disabled = true;
    joinButton.disabled = false;

    // Provides a camera preview window.
    const localVideoTrack = await createLocalVideoTrack({ width: 640 });
    localMediaContainer.appendChild(localVideoTrack.attach());
}
 
… 

At the bottom of the file, after the trackExistsAndIsAttachable() function but before the line that adds the click event listener to the join button, add a function called toggleInputs() as shown in the highlighted lines below:


/**
 * Guard that a track is attachable.
 * 
 * @param track 
 * The remote track candidate.
 */
function trackExistsAndIsAttachable(track?: Nullable<RemoteTrack>): track is RemoteAudioTrack | RemoteVideoTrack {
    return !!track && (
        (track as RemoteAudioTrack).attach !== undefined ||
        (track as RemoteVideoTrack).attach !== undefined
    );
}

/**
 * Toggles inputs into their opposite form in terms of whether they're disabled.
 */
function toggleInputs() {
    joinButton.disabled = !joinButton.disabled;
    leaveButton.disabled = !leaveButton.disabled;

    identityInput.value = '';
    roomNameInput.value = '';
}

// Button event handlers.
joinButton.addEventListener('click', onJoinClick);

// Entry point.
main();

This function toggles the button disabled state to the opposite of whatever it is at the time of the function invocation, which is useful to enable you to switch between room connected and disconnected states quickly and easily. It also clears the text of the two inputs.
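
Because a toggle flips whatever state the buttons are currently in, calling it twice by accident (for example, from two code paths after a failed connection) would leave the UI out of step with the room. One alternative is to set the state explicitly, as in the sketch below. The setConnected function and the small structural interfaces are hypothetical and stand in for the real DOM elements so that the example is self-contained:

```typescript
// Minimal structural types so the sketch does not depend on the DOM.
interface Disableable { disabled: boolean; }
interface Clearable { value: string; }

interface JoinControls {
    joinButton: Disableable;
    leaveButton: Disableable;
    identityInput: Clearable;
    roomNameInput: Clearable;
}

// Hypothetical alternative to toggleInputs(): applies a known state, so
// calling it repeatedly with the same argument always gives the same result.
function setConnected(controls: JoinControls, connected: boolean): void {
    controls.joinButton.disabled = connected;
    controls.leaveButton.disabled = !connected;

    controls.identityInput.value = '';
    controls.roomNameInput.value = '';
}
```

HTMLButtonElement and HTMLInputElement satisfy these interfaces structurally, so the handles declared at the top of video.ts could be passed in as they are.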

The next changes to make exist within the onJoinClick() function, which you can update as follows:


/**
 * Triggers when the join button is clicked.
 */
async function onJoinClick() {
    const roomName = roomNameInput.value;
    const identity = identityInput.value;
    room = await connect(await tokenRepository.getToken(roomName, identity), {
        name: roomName,
        audio: true,
        video: { width: 640 }
    });

    // Attach the remote tracks of participants already in the room.
    room.participants.forEach(
        participant => manageTracksForRemoteParticipant(participant)
    );

    // Wire-up event handlers.
    room.on('participantConnected', onParticipantConnected);
    room.on('participantDisconnected', onParticipantDisconnected);
    window.onbeforeunload = () => room.disconnect();

    toggleInputs();
}

The highlighted changes update the call to the token repository, passing the specified room name and identity, and set the buttons to a connected state. In a real application, you’d want to validate and sanitize these input values to ensure they’re of the proper form. Additionally, you could screen them for explicit language and append a UUID to the end to avoid collisions between room names or user names.
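
As a rough idea of what such validation might look like, the sketch below trims a value and checks it against a simple pattern. The validateName function, the ValidationResult type, and the rules themselves (1 to 64 characters; letters, digits, spaces, hyphens, and underscores) are assumptions chosen for this example and are not limits imposed by Twilio:

```typescript
// Assumed demo rules, not Twilio limits: 1-64 characters from a small safe set.
const NAME_PATTERN = /^[A-Za-z0-9 _-]{1,64}$/;

type ValidationResult =
    | { ok: true; value: string }
    | { ok: false; error: string };

// Hypothetical validator: returns the trimmed value or a human-readable error.
function validateName(label: string, raw: string): ValidationResult {
    const value = raw.trim();

    if (value === '') {
        return { ok: false, error: `${label} is required` };
    }

    if (!NAME_PATTERN.test(value)) {
        return {
            ok: false,
            error: `${label} may only contain letters, digits, spaces, hyphens, and underscores (max 64 characters)`
        };
    }

    return { ok: true, value };
}
```

In onJoinClick(), you could run both inputs through a validator like this and return early with a message when either result has ok set to false. The same checks should be repeated on the server, since client-side validation can be bypassed.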

Lastly, add the following onLeaveClick() function beneath onJoinClick() and wire it up:


… 

/**
 * Triggers when the leave button is clicked.
 */
function onLeaveClick() {
    room.disconnect();
    toggleInputs();
}

…

// Button event handlers.
joinButton.addEventListener('click', onJoinClick);
leaveButton.addEventListener('click', onLeaveClick);

// Entry point.
main();

The onLeaveClick() function disconnects the participant from the room and resets the inputs to a non-joined state.

With this, your project is complete!

Running the Application

To demo the application, start your local backend server by running the following command from inside the server folder:

npm start

Open a second terminal window, and navigate to your client folder. From this folder run the following command to start your client’s server:

parcel index.html

With both running, you should be able to visit localhost:1234 in your browser (or whichever port Parcel chooses), and see a preview of your webcam stream after providing the relevant permissions if prompted. By opening two browser windows, you can connect both to the same room but with different identities, and you should see the remote streams. If you place both the client and the server behind ngrok, you can tunnel your localhost connections to a public URL and run this demo on different machines, so you aren’t stuck seeing the same video stream for both participants.

Conclusion

In this project, you learned how to manage identity and authentication for your users via the Twilio server-side library with TypeScript for Programmable Video. You also learned how to retrofit an existing codebase to make use of the token server. To view this project’s source code, visit the “adding-token-server” branch at its GitHub Repository.

Jamie is an 18-year-old software developer located in Texas. He has particular interests in enterprise architecture (DDD/CQRS/ES), writing elegant and testable code, and Physics and Mathematics. He is currently working on a startup in the business automation and tech education space, and when not behind a computer, he enjoys reading and learning.