Building a Video Chat App with Twilio Programmable Video and React

Demand for collaboration tools keeps growing, especially among remote developers but also among people in other careers. A video chat application whose “look and feel” you can customize is valuable because you are not restricted to the features that most commercial providers offer.
In this tutorial, we will learn how to create a video chat application using React, Node.js, and Twilio Programmable Video.

Setup React and Node.js

To develop this application, we will need basic knowledge of React, Node.js, and client-side JavaScript.
We will also need Node.js version 7.0.0 or later and a browser for testing.

Twilio Account Setup

If you do not have a Twilio account, sign up for one and then create a project. When asked for the features you want to use, select Programmable Video. The TWILIO_ACCOUNT_SID can be found in your project dashboard.

We then create an API key in the Programmable Video section. If you experience any issues setting up the Twilio API key, this GitHub Issue gives more insight into the setup. We will use the “SID” in the figure below for TWILIO_API_KEY and the “secret” for TWILIO_API_SECRET.

Note: The API key SID always starts with SK.

React Project Setup

In this project, we will primarily focus on using Twilio with React and will therefore skip the project setup and use a minimal boilerplate instead.

For SSH setup, you would type the following command on your command line in the terminal:

git clone -b setup git@github.com:kimobrian/TwilioReact.git

Alternatively, for HTTPS setup, you can type the following command on your command line in the terminal:

git clone -b setup https://github.com/kimobrian/TwilioReact.git

We then cd into our TwilioReact folder and run yarn or npm i to install our packages. We will end up with the following folder structure.

Developing our Node Programmable Video Application

We start by setting up our keys and secrets in a .env file based on the env.template file from the GitHub repo we just cloned. Keeping our keys in a .env file is a best practice that makes it easier to change them at any time and to deploy to any platform.

Note: Do not push the .env file to any git hosting service like GitHub because this will expose our secret keys to the world, compromising security.

Let’s make our .env file and add the following code to it, replacing each key and token with the one found in our Twilio console.

TWILIO_ACCOUNT_SID=ACXXXXXXXXXXXXXXXXXXXX
TWILIO_API_KEY=SKXXXXXXXXXXXXXXXXXXXXXXXXX
TWILIO_API_SECRET=XXXXXXXXXXXXXXXXXXXXXXXX
NODE_ENV=DEV
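
Before starting the server, it can help to confirm that all three credentials are present and carry the expected prefixes. The sketch below shows one way to do that. `findCredentialProblems` is a hypothetical helper name and is not part of the boilerplate; the prefix checks assume the `AC` and `SK` formats shown in the placeholders above.

```javascript
// Hypothetical helper: reports problems with the Twilio settings in an
// environment object such as process.env. It is not part of the boilerplate.
function findCredentialProblems(env) {
  const problems = [];
  const required = ['TWILIO_ACCOUNT_SID', 'TWILIO_API_KEY', 'TWILIO_API_SECRET'];

  required.forEach(name => {
    if (!env[name]) {
      problems.push(name + ' is missing');
    }
  });

  // Account SIDs start with "AC" and API key SIDs start with "SK".
  if (env.TWILIO_ACCOUNT_SID && !env.TWILIO_ACCOUNT_SID.startsWith('AC')) {
    problems.push('TWILIO_ACCOUNT_SID should start with AC');
  }
  if (env.TWILIO_API_KEY && !env.TWILIO_API_KEY.startsWith('SK')) {
    problems.push('TWILIO_API_KEY should start with SK');
  }
  return problems;
}

// Example with placeholder values; a real server would pass process.env.
const problems = findCredentialProblems({
  TWILIO_ACCOUNT_SID: 'ACXXXXXXXXXXXXXXXXXXXX',
  TWILIO_API_KEY: 'XXXXXXXX' // wrong on purpose: this is not an API key SID
});
console.log(problems);
```

An empty array means the settings look plausible; it does not prove that Twilio will accept them.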

After navigating into the “TwilioReact” folder, we can test our initial setup by running “npm start” on the command line. Now let’s open up a web browser and navigate to http://localhost:3000/ where we should see the following interface.

Server Setup

We need to set up our server to generate tokens that can be sent to the client. Twilio will then use the token to grant the client access to the Video SDK. To use Twilio Video functions, we need to install the Twilio Node.js package by typing the following on the command line:

npm install --save twilio

Our package.json includes another package, faker. We will use it to generate fake names and assign one to each connected client.

API Keys are credentials to access the Twilio API. They are used to authenticate the REST API and also to create and revoke Access Tokens. Access tokens are used to authenticate Client SDKs and also to grant access to API features.
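
To make the idea of an access token more concrete, the sketch below decodes the payload of a JWT so its contents can be inspected. The sample token is built by hand for illustration: its values are placeholders, its signature is fake, and the claim names (`iss`, `sub`, `grants`) are an assumption about what a Twilio access token typically carries. `decodeJwtPayload` and `encodeSegment` are hypothetical helper names.

```javascript
// Decode the payload (middle segment) of a JWT without verifying it.
// This is for inspection only; it performs no signature check.
function decodeJwtPayload(jwt) {
  const payloadSegment = jwt.split('.')[1];
  // JWT segments use base64url; convert to standard base64 before decoding.
  const base64 = payloadSegment.replace(/-/g, '+').replace(/_/g, '/');
  return JSON.parse(Buffer.from(base64, 'base64').toString('utf8'));
}

// Encode an object as a base64url JWT segment (used only to build the sample).
function encodeSegment(obj) {
  return Buffer.from(JSON.stringify(obj)).toString('base64')
    .replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

// Hand-built sample: placeholder values and a fake signature,
// so Twilio would reject this token.
const sampleToken = [
  encodeSegment({ alg: 'HS256', typ: 'JWT' }),
  encodeSegment({
    iss: 'SKXXXXXXXXXXXXXXXXXXXXXXXXX', // API key SID (assumed claim name)
    sub: 'ACXXXXXXXXXXXXXXXXXXXX',      // account SID (assumed claim name)
    grants: { identity: 'Jane Doe', video: {} }
  }),
  'fake-signature'
].join('.');

const payload = decodeJwtPayload(sampleToken);
console.log(payload.grants.identity); // the identity carried by the token
```

Decoding is not the same as verifying: anyone can read a token's payload, but only the holder of the API secret can produce a valid signature.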

The snippet below creates a JWT access token and sends it to the client with a fake name. We add the code inside server.js.

// server.js
// ... Code before
var AccessToken = require('twilio').jwt.AccessToken;
var VideoGrant = AccessToken.VideoGrant;
var app = express();
if(process.env.NODE_ENV === 'DEV') {
    // ... initial code block here
}

// Endpoint to generate access token
app.get('/token', function(request, response) {
   var identity = faker.name.findName();

   // Create an access token, which we will sign and return to the client
   var token = new AccessToken(
       process.env.TWILIO_ACCOUNT_SID,
       process.env.TWILIO_API_KEY,
       process.env.TWILIO_API_SECRET
   );

   // Assign the generated identity to the token
   token.identity = identity;

   const grant = new VideoGrant();
   // Grant token access to the Video API features
   token.addGrant(grant);

   // Serialize the token to a JWT string and include it in a JSON response
   response.send({
       identity: identity,
       token: token.toJwt()
   });
});
var port = process.env.PORT || 3000;
// Code after ...

Video Room Creation

Video Rooms can be created by our client-side code by default. In this application we will use that capability. However, if you need to restrict the rooms that can be created, you can turn off client-side room creation in the room settings.

In that case, you need to create rooms using the REST API and grant access to a specific room within the access token.
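
As a rough sketch of the second half of that approach, a token can be limited to one room by passing a `room` option to the video grant. `createRoomToken` is a hypothetical helper, not part of the boilerplate. It takes the `jwt` object from the twilio package as an argument, which is an assumption made here so that the function can also be tried with stand-in classes.

```javascript
// Sketch: build a token that only allows joining one named room.
// `jwt` is expected to be require('twilio').jwt; it is passed in so the
// function can also be exercised with stand-in classes.
function createRoomToken(jwt, credentials, identity, roomName) {
  const AccessToken = jwt.AccessToken;
  const VideoGrant = AccessToken.VideoGrant;

  const token = new AccessToken(
    credentials.accountSid,
    credentials.apiKey,
    credentials.apiSecret
  );
  // Same identity assignment as in the tutorial's server.js
  token.identity = identity;

  // Passing `room` limits the grant to that single room.
  token.addGrant(new VideoGrant({ room: roomName }));

  // Serialize the token to a JWT string
  return token.toJwt();
}
```

Inside the /token endpoint, this might be called with require('twilio').jwt, the three values from process.env, the generated identity, and a room name chosen by the server.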

Video Client Setup

So far, we merely have a skeleton client application. Let’s add some functionality to our app. Start by installing the twilio-video SDK and axios for making HTTP requests by typing the following command on the command line:

npm install --save twilio-video axios

In the /app/ folder, create a new file VideoComponent.js. This will contain our application component that will be included in the main/entry file app.js.
Inside VideoComponent.js, we will create a minimal component.

//app/VideoComponent.js
import React, { Component } from 'react';
import Video from 'twilio-video';
import axios from 'axios';

export default class VideoComponent extends Component {
 constructor(props) {
   super(props);
 }

 render() {
   return (
     <div>Video Component</div>
   );
 }
}

We then include this component in our main application file app.js by adding the highlighted code to that file:

// app/app.js
// ... other import statements
import VideoComponent from './VideoComponent';

let dom = document.getElementById("app");
render(
    <MuiThemeProvider muiTheme={getMuiTheme(lightBaseTheme)}>
        <div>
            <AppBar title="React Twilio Video" />
            <VideoComponent />
        </div>
    </MuiThemeProvider>
    ,
    dom
);

 

From now on, everything will happen within the ‘VideoComponent.js’ file. We can start our application by typing  ‘npm start’ on the command line and navigating to localhost:3000 in our web browser.


When a user loads the page, we expect the application to get an Access Token from our server and use the Token to join a room. Since we are allowing our client to create rooms, we will also include in our interface a section where users can join a room. By joining a room that does not exist, the room will be created automatically.

Acquiring a Token

When the component loads, it makes an API call to the server inside componentDidMount. The server returns a token along with a fake name.

Note: It’s recommended to make API calls or any other operations that cause side-effects in componentDidMount and not componentWillMount because by the time componentDidMount is called, the component has been mounted and an update to the state guarantees updates on DOM nodes. componentWillMount is called right before the component is mounted and this means if an API call completes before render, there will be no component to update.

The client uses the token to join a room. This means we will need state variables to store some values.

We also ensure that a user can only connect to a room if a room name is provided; otherwise, we show an error. We will only show the user’s video once they are connected to a room.

We are going to gradually update “VideoComponent.js” and add new functionality.
We first initialize several variables to track our state and hold some information inside the constructor.

constructor(props) {
  super(props);
  this.state = {
    identity: null,  /* Will hold the fake name assigned to the client. The name is generated by faker on the server */
    roomName: '',    /* Will store the room name */
    roomNameErr: false,  /* Track error for room name TextField. This will enable us to show an error message when this variable is true */
    previewTracks: null,
    localMediaAvailable: false, /* Represents the availability of a LocalAudioTrack (microphone) and a LocalVideoTrack (camera) */
    hasJoinedRoom: false,
    activeRoom: null // Track the current active room
  };
}

With the required variables set in the state, we can now make an API call to get the token. Add the “componentDidMount” function below the constructor.

constructor(props) {
    //... skipped constructor content
}
componentDidMount() {
  axios.get('/token').then(results => {
    /*
     Make an API call to get the token and identity (fake name) and
     update the corresponding state variables.
    */
    const { identity, token } = results.data;
    this.setState({ identity, token });
  });
}
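
The call above has no error handling, so a failed request leaves the component without a token and gives no feedback. The sketch below shows one possible way to surface such failures. `fetchToken` and the `tokenError` field are hypothetical names, and `httpGet` stands in for a function such as `url => axios.get(url)`.

```javascript
// Sketch: fetch the token and report failures instead of leaving the
// promise rejection unhandled. `httpGet` stands in for axios.get.
function fetchToken(httpGet) {
  return httpGet('/token')
    .then(results => {
      const { identity, token } = results.data;
      return { identity, token, tokenError: null };
    })
    .catch(error => ({ identity: null, token: null, tokenError: error.message }));
}

// Stand-in HTTP client for illustration; it mimics the shape of an axios response.
const fakeGet = url => Promise.resolve({
  data: { identity: 'Jane Doe', token: 'sample-token' }
});

fetchToken(fakeGet).then(state => {
  // In the component this would be: this.setState(state)
  console.log(state.identity, state.tokenError);
});
```

Inside componentDidMount, the result could be passed straight to this.setState, and the render method could then show a message whenever `tokenError` is set.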

We need to import some components from material-ui by adding the following statements in VideoComponent.js. These components will be used in the render() method.

// ... other imports
import RaisedButton from 'material-ui/RaisedButton';
import TextField from 'material-ui/TextField';
import { Card, CardHeader, CardText } from 'material-ui/Card';

We then delete the existing render() method and replace it with the following new version. The “render()” method is responsible for calling most of the methods we create and for showing the interface.

render() {
  /*
   Controls showing of the local track.
   Only show the video track after the user has joined a room; otherwise show nothing.
  */
  let showLocalTrack = this.state.localMediaAvailable ? (
    <div className="flex-item"><div ref="localMedia" /></div>
  ) : '';
  /*
   Controls showing of the 'Join Room' or 'Leave Room' button.
   Show the 'Leave Room' button if the user has already joined a room;
   otherwise show the 'Join Room' button.
  */
  let joinOrLeaveRoomButton = this.state.hasJoinedRoom ? (
    <RaisedButton label="Leave Room" secondary={true} onClick={() => alert("Leave Room")} />
  ) : (
    <RaisedButton label="Join Room" primary={true} onClick={this.joinRoom} />
  );
  return (
    <Card>
      <CardText>
        <div className="flex-container">
          {showLocalTrack} {/* Show local track if available */}
          <div className="flex-item">
            {/*
             The following text field is used to enter a room name. It calls the
             `handleRoomNameChange` method when the text changes, which sets the
             `roomName` variable initialized in the state.
            */}
            <TextField hintText="Room Name" onChange={this.handleRoomNameChange}
              errorText={this.state.roomNameErr ? 'Room Name is required' : undefined}
            /><br />
            {joinOrLeaveRoomButton} {/* Show either 'Leave Room' or 'Join Room' button */}
          </div>
          {/*
           The following div element shows all remote media (other participants' tracks)
          */}
          <div className="flex-item" ref="remoteMedia" id="remote-media" />
        </div>
      </CardText>
    </Card>
  );
}

Let us now implement some of the methods referenced in the render function.

handleRoomNameChange(e) {
  /* Fetch room name from text field and update state */
  let roomName = e.target.value;
  this.setState({ roomName });
}

joinRoom() {
  /*
   Show an error message on the room name text field if the user tries
   joining a room without providing a room name. This is enabled by
   setting `roomNameErr` to true.
  */
  if (!this.state.roomName.trim()) {
    this.setState({ roomNameErr: true });
    return;
  }

  console.log("Joining room '" + this.state.roomName + "'...");
  let connectOptions = {
    name: this.state.roomName
  };

  if (this.state.previewTracks) {
    connectOptions.tracks = this.state.previewTracks;
  }

  /*
   Connect to a room by providing the token and connection options that
   include the room name and tracks. We also show an alert if an error
   occurs while connecting to the room.
  */
  Video.connect(this.state.token, connectOptions).then(this.roomJoined, error => {
    alert('Could not connect to Twilio: ' + error.message);
  });
}

To use the two methods above, we need to bind them in the constructor.

constructor(props) {
  super(props);
  this.state = {
    // ...
  };
  this.joinRoom = this.joinRoom.bind(this);
  this.handleRoomNameChange = this.handleRoomNameChange.bind(this);
}

Binding methods sets their context to VideoComponent and ensures that any call to them using this.methodName(…) will use VideoComponent as the context (this). To learn more about method binding, refer to Jason Arnold’s article.
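
The effect of binding can be seen outside React with a plain class. The sketch below uses a hypothetical `Greeter` class for illustration: calling a method through a detached reference loses its context, while the bound copy keeps it.

```javascript
class Greeter {
  constructor() {
    this.name = 'VideoComponent';
    // Bound copy: `this` stays fixed no matter how the function is called.
    this.boundGreet = this.greet.bind(this);
  }

  greet() {
    // Class bodies run in strict mode, so a detached call sees `this` as undefined.
    return this ? this.name : 'no context';
  }
}

const greeter = new Greeter();
const detached = greeter.greet;    // similar to passing an unbound method as onClick
const bound = greeter.boundGreet;  // similar to the bound version from the constructor

console.log(detached()); // "no context"
console.log(bound());    // "VideoComponent"
```

This is why a handler such as joinRoom, which is passed to onClick rather than called directly, needs to be bound before it reads this.state.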

The “joinRoom()” function above is very important because it establishes the connection to a room. Once connected, participants, both local and remote, join the room and bring media streams with them, which we can attach to the DOM. On a successful connection, “joinRoom()” calls “roomJoined(room)”, passing in the room instance.

// Attach the Tracks to the DOM.
attachTracks(tracks, container) {
  tracks.forEach(track => {
    container.appendChild(track.attach());
  });
}

// Attach the Participant's Tracks to the DOM.
attachParticipantTracks(participant, container) {
  var tracks = Array.from(participant.tracks.values());
  this.attachTracks(tracks, container);
}

roomJoined(room) {
  // Called when a participant joins a room
  console.log("Joined as '" + this.state.identity + "'");
  this.setState({
    activeRoom: room,
    localMediaAvailable: true,
    hasJoinedRoom: true  // Removes ‘Join Room’ button and shows ‘Leave Room’
  });

  // Attach LocalParticipant's tracks to the DOM, if not already attached.
  var previewContainer = this.refs.localMedia;
  if (!previewContainer.querySelector('video')) {
    this.attachParticipantTracks(room.localParticipant, previewContainer);
  }
    // ... more event listeners
}
// ... more code

We then bind roomJoined in the constructor, just below the other methods.

this.roomJoined = this.roomJoined.bind(this);

Binding “attachTracks” and “attachParticipantTracks” is optional because we never pass them as event handlers or callbacks. We always invoke them as this.attachTracks(…) and this.attachParticipantTracks(…), so their context is already the component.

So far the above setup only handles joining a room and showing the local video track.
We can then start the application by typing “npm start” on the command line and navigating to localhost:3000 in our web browser. After entering a room name and clicking on ‘Join Room’, we should be able to see a window similar to the following. The ‘Leave Room’ button should only show an alert for now. In case of any issue, complete code up to this point can be found on this GitHub branch.

In the next steps we will explore how to leave a room and what happens when other participants join the room.

Leaving a Room

To leave a room, we call “disconnect()” on the active room and update some state variables to update the interface. This will also change the value of hasJoinedRoom in the state to false and consequently remove the ‘Leave Room’ button and show the ‘Join Room’ button in its place.

// ... code before
leaveRoom() {
   this.state.activeRoom.disconnect();
   this.setState({ hasJoinedRoom: false, localMediaAvailable: false });
}
// ... code after

We also need to update ‘joinOrLeaveRoomButton’ in our render function to call ‘leaveRoom’.

let joinOrLeaveRoomButton = this.state.hasJoinedRoom ? (
  <RaisedButton label="Leave Room" secondary={true} onClick={this.leaveRoom} />
) : (
  <RaisedButton label="Join Room" primary={true} onClick={this.joinRoom} />
);

We then bind the leaveRoom() method in the constructor.

  constructor(props) {
    super(props);
    // other code

    this.joinRoom = this.joinRoom.bind(this);
    this.handleRoomNameChange = this.handleRoomNameChange.bind(this);
    this.roomJoined = this.roomJoined.bind(this);
    this.leaveRoom = this.leaveRoom.bind(this);
  }

Handling other Participants

In this section, we are going to update the remoteMedia section when another participant joins the room. We will also handle the case when a participant removes one of their tracks or leaves the room. Let’s start with the functions that will detach participants’ tracks from the room.

detachTracks(tracks) {
  tracks.forEach(track => {
    track.detach().forEach(detachedElement => {
      detachedElement.remove();
    });
  });
}

detachParticipantTracks(participant) {
  var tracks = Array.from(participant.tracks.values());
  this.detachTracks(tracks);
}

As usual, we bind the above methods in our constructor.

  constructor(props) {
    super(props);
    // other code

    this.joinRoom = this.joinRoom.bind(this);
    this.handleRoomNameChange = this.handleRoomNameChange.bind(this);
    this.roomJoined = this.roomJoined.bind(this);
    this.leaveRoom = this.leaveRoom.bind(this);
    this.detachTracks = this.detachTracks.bind(this);
    this.detachParticipantTracks = this.detachParticipantTracks.bind(this);
  }
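
To see how attaching and detaching fit together without a browser or camera, the sketch below runs the same helper logic against stand-in objects. `FakeElement` and `FakeTrack` are hypothetical classes written for this illustration. They imitate only the behavior the tutorial relies on: attach() returns a new element, and detach() returns the elements the track was attached to.

```javascript
// Stand-in for a DOM element, for illustration only.
class FakeElement {
  constructor() { this.children = []; this.parent = null; }
  appendChild(child) { child.parent = this; this.children.push(child); return child; }
  remove() {
    if (this.parent) {
      this.parent.children = this.parent.children.filter(c => c !== this);
      this.parent = null;
    }
  }
}

// Stand-in for a twilio-video track, for illustration only.
class FakeTrack {
  constructor(kind) { this.kind = kind; this.elements = []; }
  attach() { const el = new FakeElement(); this.elements.push(el); return el; }
  detach() { const detached = this.elements; this.elements = []; return detached; }
}

// Same logic as the component's attachTracks and detachTracks methods.
function attachTracks(tracks, container) {
  tracks.forEach(track => container.appendChild(track.attach()));
}
function detachTracks(tracks) {
  tracks.forEach(track => track.detach().forEach(el => el.remove()));
}

const container = new FakeElement();
const tracks = [new FakeTrack('audio'), new FakeTrack('video')];

attachTracks(tracks, container);
const countAfterAttach = container.children.length; // one element per track

detachTracks(tracks);
const countAfterDetach = container.children.length; // container is empty again

console.log(countAfterAttach, countAfterDetach);
```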

Inside the function roomJoined(room), we will add the following snippet, which responds to events triggered on the room instance. It handles:

  • Attaching the tracks of room participants to the DOM.
  • Logging participants’ identities after they join the room.
  • Detaching participants’ tracks from the DOM when they leave the room.
  • Detaching a single participant’s track when they disable it. For instance, detaching the audio track when they turn off their microphone.
  • Detaching all tracks from the DOM when the local participant leaves a room.

  roomJoined(room) {
    // ... existing code 
    if (!previewContainer.querySelector('video')) {
      this.attachParticipantTracks(room.localParticipant, previewContainer);
    }

    // Attach the Tracks of the room's participants.
    room.participants.forEach(participant => {
      console.log("Already in Room: '" + participant.identity + "'");
      var previewContainer = this.refs.remoteMedia;
      this.attachParticipantTracks(participant, previewContainer);
    });

    // Participant joining room
    room.on('participantConnected', participant => {
      console.log("Joining: '" + participant.identity + "'");
    });

    // Attach participant’s tracks to DOM when they add a track
    room.on('trackAdded', (track, participant) => {
      console.log(participant.identity + ' added track: ' + track.kind);
      var previewContainer = this.refs.remoteMedia;
      this.attachTracks([track], previewContainer);
    });

    // Detach participant’s track from DOM when they remove a track.
    room.on('trackRemoved', (track, participant) => {
      console.log(participant.identity + ' removed track: ' + track.kind);
      this.detachTracks([track]);
    });

    // Detach all participant’s track when they leave a room.
    room.on('participantDisconnected', participant => {
      console.log("Participant '" + participant.identity + "' left the room");
      this.detachParticipantTracks(participant);
    });

    // Once the local participant leaves the room, detach the tracks
    // of all participants, including those of the LocalParticipant.
    room.on('disconnected', () => {
      if (this.state.previewTracks) {
        this.state.previewTracks.forEach(track => {
          track.stop();
        });
      }
      this.detachParticipantTracks(room.localParticipant);
      room.participants.forEach(this.detachParticipantTracks);
      // Clear the active room through setState rather than mutating state directly.
      this.setState({ activeRoom: null, hasJoinedRoom: false, localMediaAvailable: false });
    });
  }

In the above snippet, other participants can join the room and their video track will be visible on the far right in the browser window. It will be detached from the DOM when they leave the room. The application also detaches every participant once the local participant disconnects from the room. In summary, “roomJoined(room)” gives us access to different properties of the room including participants and events that can occur in the room. We then define application behaviour based on these events. We also control the appearance of the DOM by updating different state variables.
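
The event-driven pattern described above can be tried in isolation. The sketch below uses a hypothetical `FakeRoom` class that imitates only the on() and emit() behavior of an event emitter; it is a stand-in for illustration, not the Room object from twilio-video. `trackRoster` is likewise a hypothetical helper that registers handlers in the same style as roomJoined(room).

```javascript
// Minimal stand-in for a room, for illustration only.
class FakeRoom {
  constructor() { this.handlers = {}; }
  on(event, handler) {
    (this.handlers[event] = this.handlers[event] || []).push(handler);
  }
  emit(event, ...args) {
    (this.handlers[event] || []).forEach(handler => handler(...args));
  }
}

// Register handlers in the same style as roomJoined(room) and keep a
// roster of the identities currently in the room.
function trackRoster(room) {
  const roster = new Set();
  room.on('participantConnected', participant => roster.add(participant.identity));
  room.on('participantDisconnected', participant => roster.delete(participant.identity));
  return roster;
}

const room = new FakeRoom();
const roster = trackRoster(room);

room.emit('participantConnected', { identity: 'Alice' });
room.emit('participantConnected', { identity: 'Bob' });
room.emit('participantDisconnected', { identity: 'Alice' });

console.log(Array.from(roster)); // only Bob remains
```

In the component, a roster like this could live in state and drive a list of participant names next to the remote media.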

Testing the Video Application

We will manually test our application locally by running “npm start” on the command line. We can then visit localhost:3000 in our browser. The interface should look similar to the one in the screenshots below. Enter a room name of your choosing and click on ‘Join Room’. Then open up a new browser window and repeat the process there with the same room name.

A demo application is hosted on Heroku for future reference. Here are screenshots of the application before joining a room, with one user, and with two users.

Before joining a room

Only local participant in the room

Another participant joins the room (two separate browsers)

Conclusion: Programmable Video with React and Node.js

Demand for collaboration tools like Google Hangouts continues to grow. The downside of these existing enterprise tools is that they give users little room to customize them to their own needs. Twilio Programmable Video provides many pre-implemented features, allowing developers to choose the ones they want and build their own custom tools with little effort.

To get access to the complete code, check out the GitHub Repository and Heroku for a live demo of the application. Visit the official Twilio Video documentation for JavaScript to learn more on controlling the audio and video devices, screen sharing, using the data track, and many more features.

Takeaways

  • Twilio Programmable Video is a great API with lots of features to choose from.
  • The API is simple to set up and use, especially thanks to descriptive events like participantConnected, participantDisconnected, and so forth.
  • The API makes it very easy for developers to come up with their own custom collaboration tools by choosing from the many features available.

Brian Kimokoti is a software developer working mostly with Python, JavaScript, and related technologies. He is also a tech blogger on several online platforms and enjoys watching soccer. You can reach him on Twitter @kimobrian254.
