Screen Capture on Chrome - Twilio

Screen Capture on Chrome

In this guide we’ll show you how to share your screen using Chrome's Desktop Capture APIs and twilio-video.js. In order to protect user privacy, Chrome disallows ordinary web apps from accessing the Desktop Capture APIs directly; instead, we’ll need to develop a Chrome extension that our web app will communicate with in order to capture the screen.

Note: This guide is intended to help you write your own Chrome extension for enabling Screen Capture in your web app.

Web App/Extension Communication

Our web app and extension will communicate using message passing. Specifically, our web app will be responsible for sending requests to our extension using Chrome's sendMessage API, and our extension will be responsible for responding to requests raised through Chrome's onMessageExternal event. By convention, every message passed between our web app and extension will be a JSON object containing a type property, and we will use this type property to distinguish different types of messages.

Web App Requests

Our web app will send requests to our extension.

"getUserScreen" Requests

Since we want to enable Screen Capture, the most important message our web app can send to our extension is a request to capture the user's screen. We want to distinguish these requests from other types of messages, so we will set their type equal to "getUserScreen". (We could choose any string for the message type, but "getUserScreen" bears a nice resemblance to the browser's getUserMedia API.) Also, Chrome allows us to specify the DesktopCaptureSourceTypes we would like to prompt the user for, so we should include another property, sources, equal to an Array of DesktopCaptureSourceTypes. For example, the following "getUserScreen" request will prompt the user for access to their screen, a window, or a tab:

{
  "type": "getUserScreen",
  "sources": ["screen", "window", "tab"]
}

Our web app should expect a success or error message in response.

Extension Responses

Our extension will respond to our web app's requests.

Success Responses

Any time we need to communicate a successful result from our extension, we'll send a message with type equal to "success", and possibly some additional data. For example, if our web app's "getUserScreen" request succeeds, we should include the resulting streamId that Chrome provides us. Assuming Chrome returns us a streamId of "123", we should respond with

{
  "type": "success",
  "streamId": "123"
}

Error Responses

Any time we need to communicate an error from our extension, we'll send a message with type equal to "error" and an error message. For example, if our web app's "getUserScreen" request fails, we should respond with

{
  "type": "error",
  "message": "Failed to get stream ID"
}
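
To make the convention above concrete, here is a minimal sketch of how the web app side might interpret a response. The helper name streamIdFromResponse is hypothetical (it is not part of Chrome's or Twilio's APIs), and the fallback error strings are assumptions chosen for illustration.

```javascript
/**
 * Interpret an extension response that follows the "success"/"error"
 * convention described above. Hypothetical helper, for illustration only.
 * @param {?object} response
 * @returns {string} streamId
 * @throws {Error} if the response is an error or is not recognized
 */
function streamIdFromResponse(response) {
  if (!response || typeof response !== 'object') {
    throw new Error('Unknown response');
  }
  // A success response is only useful if it actually carries a streamId.
  if (response.type === 'success' && typeof response.streamId === 'string') {
    return response.streamId;
  }
  if (response.type === 'error') {
    // Assumption: fall back to a generic message if none was provided.
    throw new Error(response.message || 'Unknown error');
  }
  throw new Error('Unknown response');
}
```

Keeping this logic in one small function means the rest of the web app does not need to know the message format.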

Project Structure

In this guide, we propose the following project structure, with two top-level folders for our web app and extension.

.
├── web-app
│   ├── index.html
│   └── web-app.js
└── extension
    ├── extension.js
    └── manifest.json

Note: If you are adapting this guide to an existing project you may tweak the structure to your liking.

Web App

index.html

Since our web app will be loaded in a browser, we need some HTML entry-point to our application. This HTML file should load web-app.js and twilio-video.js.

web-app.js

Our web app's logic for creating twilio-video.js Clients, connecting to Rooms, and requesting the user's screen will live in this file.

Extension

extension.js

Our extension will run extension.js in a background page. This file will be responsible for handling requests. For more information refer to Chrome's documentation on background pages.

manifest.json

Every extension requires a manifest.json file. This file grants our extension access to Chrome's tabs and desktopCapture APIs and controls which web apps can send messages to our extension. For more information on manifest.json, refer to Chrome's documentation on the manifest file format; otherwise, feel free to tweak the example provided here. Note that we've included "*://localhost/*" in our manifest.json's "externally_connectable" section. This is useful during development, but you may not want to publish your extension with this value. Consider removing it once you're done developing your extension.

{
  "manifest_version": 2,
  "name": "your-plugin-name",
  "version": "0.10",
  "background": {
    "scripts": ["extension.js"]
  },
  "externally_connectable": {
    "matches": ["*://localhost/*", "*://*.example.com/*"]
  },
  "permissions": [
    "desktopCapture",
    "tabs"
  ]
}
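
If you would rather not edit the manifest by hand before publishing, one option is to strip the development-only pattern in a small build step. The following is a sketch under stated assumptions: the function name withoutLocalhostMatches is hypothetical, and it treats any match pattern containing "://localhost" as development-only.

```javascript
/**
 * Return a copy of a parsed manifest.json object with localhost patterns
 * removed from "externally_connectable". Hypothetical build-time helper;
 * it does not modify the object passed in.
 * @param {object} manifest
 * @returns {object}
 */
function withoutLocalhostMatches(manifest) {
  const connectable = manifest.externally_connectable || {};
  // Assumption: any pattern mentioning "://localhost" is development-only.
  const matches = (connectable.matches || []).filter(
    pattern => !pattern.includes('://localhost')
  );
  return Object.assign({}, manifest, {
    externally_connectable: Object.assign({}, connectable, { matches })
  });
}
```

A build script could read manifest.json, pass the parsed object through this function, and write the result into the folder that gets packaged for publishing.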

Requesting the Screen

We define a helper function in our web app, getUserScreen, that will send a "getUserScreen" request to our extension using Chrome's sendMessage API. If our request succeeds, we can expect a "success" response containing a streamId. Our response callback will pass that streamId to getUserMedia, and—if all goes well—our function will return a Promise that resolves to a MediaStream representing the user's screen.

/**
 * Get a MediaStream containing a MediaStreamTrack that represents the user's
 * screen.
 * 
 * This function sends a "getUserScreen" request to our Chrome Extension which,
 * if successful, responds with the streamId of one of the specified sources. We
 * then use the streamId to call getUserMedia.
 * 
 * @param {Array<DesktopCaptureSourceType>} sources
 * @param {string} extensionId
 * @returns {Promise<MediaStream>} stream
 */
function getUserScreen(sources, extensionId) {
  const request = {
    type: 'getUserScreen',
    sources: sources
  };
  return new Promise((resolve, reject) => {
    chrome.runtime.sendMessage(extensionId, request, response => {
      switch (response && response.type) {
        case 'success':
          resolve(response.streamId);
          break;

        case 'error':
          // The error text is carried on the response's "message" property.
          reject(new Error(response.message));
          break;

        default:
          reject(new Error('Unknown response'));
          break;
      }
    });
  }).then(streamId => {
    return navigator.mediaDevices.getUserMedia({
      video: {
        mandatory: {
          chromeMediaSource: 'desktop',
          chromeMediaSourceId: streamId,
          // You can provide additional constraints. For example,
          maxWidth: 1920,
          maxHeight: 1080,
          maxFrameRate: 10,
          minAspectRatio: 1.77
        }
      }
    });
  });
}
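
Before calling getUserScreen, it can help to check whether the page can message an extension at all. To the best of our knowledge, Chrome only exposes chrome.runtime.sendMessage to a web page when at least one installed extension lists that page in its "externally_connectable" section, so its absence is a reasonable hint that the extension is missing. The helper below is a hypothetical sketch; it takes the chrome object as a parameter so that it can also be exercised outside the browser.

```javascript
/**
 * Check whether the given chrome-like object exposes runtime.sendMessage.
 * Hypothetical helper: a true result does not prove that *our* extension
 * is installed, only that message passing is available to this page.
 * @param {?object} chromeLike - usually the global `chrome` object, if any
 * @returns {boolean}
 */
function canMessageExtension(chromeLike) {
  return Boolean(
    chromeLike &&
    chromeLike.runtime &&
    typeof chromeLike.runtime.sendMessage === 'function'
  );
}
```

In the web app, this might be called as canMessageExtension(typeof chrome === 'undefined' ? undefined : chrome), prompting the user to install the extension when it returns false.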

Connecting to a Room with Screen Sharing

Assume for the moment that we know our extension's ID and that we want to request the user's screen, window, or tab. We have all the information we need to call getUserScreen. When the Promise returned by getUserScreen resolves, we need to use the resulting MediaStream to construct the LocalVideoTrack object we intend to use in our Room. Once we've constructed our LocalVideoTrack representing the user's screen, we have two options for publishing it to the Room:

  1. We can provide it in our call to connect, or
  2. We can add it after connecting to the Room using addTrack.

Finally, we'll also want to add a listener for the "stopped" event. If the user stops sharing their screen, the "stopped" event will fire, and we may want to remove the LocalVideoTrack from the Room. We can do this by calling removeTrack.

const { connect, LocalVideoTrack } = require('twilio-video');

// Option 1. Provide the screenLocalTrack when connecting.
async function option1() {
  const stream = await getUserScreen(['window', 'screen', 'tab'], 'your-extension-id');
  const screenLocalTrack = new LocalVideoTrack(stream.getVideoTracks()[0]);

  const room = await connect('my-token', {
    name: 'my-room-name',
    tracks: [screenLocalTrack]
  });

  screenLocalTrack.once('stopped', () => {
    room.localParticipant.removeTrack(screenLocalTrack);
  });

  return room;
}

// Option 2. First connect, and then add screenLocalTrack.
async function option2() {
  const room = await connect('my-token', {
    name: 'my-room-name',
    tracks: []
  });

  const stream = await getUserScreen(['window', 'screen', 'tab'], 'your-extension-id');
  const screenLocalTrack = new LocalVideoTrack(stream.getVideoTracks()[0]);

  screenLocalTrack.once('stopped', () => {
    room.localParticipant.removeTrack(screenLocalTrack);
  });

  room.localParticipant.addTrack(screenLocalTrack);
  return room;
}

Handling Requests

Our extension will listen to Chrome's onMessageExternal event, which will be fired whenever our web app sends a message to the extension. In the event listener, we switch on the message type in order to determine how to handle the request. In this example, we only care about "getUserScreen" requests, but we also include a default case for handling unrecognized requests. The listener returns true to tell Chrome that we will call sendResponse asynchronously; without it, the response channel would close as soon as the listener returns.

chrome.runtime.onMessageExternal.addListener((message, sender, sendResponse) => {
  switch (message && message.type) {
    // Our web app sent us a "getUserScreen" request.
    case 'getUserScreen':
      handleGetUserScreenRequest(message.sources, sender.tab, sendResponse);
      break;

    // Our web app sent us a request we don't recognize.
    default:
      handleUnrecognizedRequest(sendResponse);
      break;
  }

  return true;
});
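
The "externally_connectable" section of the manifest already limits which pages can reach this listener. If you want an additional check inside the listener, one possible sketch is to compare the origin of sender.url against an allow-list. The function name isAllowedSender and the example origins are hypothetical.

```javascript
/**
 * Decide whether a message sender's page origin is on an allow-list.
 * Hypothetical helper, intended as an optional second line of defense.
 * @param {?object} sender - the MessageSender passed to the listener
 * @param {Array<string>} allowedOrigins - e.g. ['https://app.example.com']
 * @returns {boolean}
 */
function isAllowedSender(sender, allowedOrigins) {
  if (!sender || typeof sender.url !== 'string') {
    return false;
  }
  try {
    // URL#origin combines scheme, host, and port (if non-default).
    return allowedOrigins.includes(new URL(sender.url).origin);
  } catch (error) {
    // sender.url could not be parsed as a URL.
    return false;
  }
}
```

The listener could call this first and send an error response when it returns false.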

"getUserScreen" Requests

We define a helper function in our extension, handleGetUserScreenRequest, for responding to "getUserScreen" requests. The function invokes Chrome's chooseDesktopMedia API with sources and, if the request succeeds, sends a success response containing a streamId; otherwise, it sends an error response.

/**
 * Respond to a "getUserScreen" request.
 * @param {Array<DesktopCaptureSourceType>} sources
 * @param {Tab} tab
 * @param {function} sendResponse
 * @returns {void}
 */
function handleGetUserScreenRequest(sources, tab, sendResponse) {
  chrome.desktopCapture.chooseDesktopMedia(sources, tab, streamId => {
    // The user canceled our request. Return early so that we do not also
    // send a success response below.
    if (!streamId) {
      sendResponse({
        type: 'error',
        message: 'Failed to get stream ID'
      });
      return;
    }

    // The user accepted our request.
    sendResponse({
      type: 'success',
      streamId: streamId
    });
  });
}
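
Because the sources array comes from a web page, the extension may want to validate it before passing it to chooseDesktopMedia. The sketch below is one way to do that. The names KNOWN_SOURCE_TYPES and sanitizeSources are hypothetical, and the list of values reflects our understanding of Chrome's DesktopCaptureSourceType enum ("screen", "window", "tab", and "audio").

```javascript
// Assumption: these are the DesktopCaptureSourceType values Chrome accepts.
const KNOWN_SOURCE_TYPES = ['screen', 'window', 'tab', 'audio'];

/**
 * Keep only recognized source types, without duplicates.
 * Hypothetical helper; returns an empty Array for any non-Array input.
 * @param {*} sources - the untrusted "sources" property of a request
 * @returns {Array<string>}
 */
function sanitizeSources(sources) {
  if (!Array.isArray(sources)) {
    return [];
  }
  // Filtering the known list (rather than the input) drops duplicates and
  // unknown values in a single pass.
  return KNOWN_SOURCE_TYPES.filter(type => sources.includes(type));
}
```

If the result is empty, the extension could send an error response instead of prompting the user.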

Unrecognized Requests

For completeness, we'll also handle unrecognized requests. Any time we receive a message with a type we don't understand (or lacking a type altogether), our extension's handleUnrecognizedRequest function will send the following error response:

{
  "type": "error",
  "message": "Unrecognized request"
}

handleUnrecognizedRequest Implementation

/**
 * Respond to an unrecognized request.
 * @param {function} sendResponse
 * @returns {void}
 */
function handleUnrecognizedRequest(sendResponse) {
  sendResponse({
    type: 'error',
    message: 'Unrecognized request'
  });
}

Publishing the Extension

Finally, once we've built and tested our web app and extension, we will want to publish our extension in the Chrome Web Store so that users of our web app can enjoy our new Screen Capture functionality. Take a look at Chrome's documentation for more information.
