Document Converter Bot using Twilio WhatsApp, Adobe and Node.js

September 29, 2023
Written by
Sunil Kumar
Contributor
Opinions expressed by Twilio contributors are their own

WhatsApp is one of the most widely used messaging platforms in the world, and building bots for WhatsApp can provide a powerful way to interact with customers and improve service. This tutorial shows how to create a WhatsApp bot that allows users to convert files to different formats using the Adobe PDF Services API.

Twilio’s WhatsApp Business API will be used to integrate WhatsApp with a Node.js Express web application and Adobe PDF Services APIs. By the end of this tutorial, you will have a functional WhatsApp bot which accepts a file and converts it to a different format as shown below.

A WhatsApp conversation demonstrating the functionality of a document converter chatbot. A user sends a media file and the chatbot then asks the user to specify the desired format for the conversion. After the user sends the destination format, the chatbot returns the converted media file.

Prerequisites

To proceed with this tutorial, you will need the following:

Setting up your development environment

Create the project

Before building the chatbot, you need to set up your development environment. Open up your terminal and execute the following command:

mkdir twilio-converterx
cd twilio-converterx
npm init -y

These commands create a twilio-converterx directory and navigate into it. The npm init command initializes a project and creates a package.json file. The -y flag accepts the default settings while creating the project.

Install the dependencies

You will be using the following packages in your application.

  • express - Express is a minimal and flexible Node.js web application framework that provides a robust set of features for web and mobile applications
  • axios - It is a Promise based HTTP client for the browser and node.js
  • ext-name - Get the file extension and MIME type from a file
  • twilio - It is a package that allows you to interact with the Twilio API.
  • @adobe/pdfservices-node-sdk -  It allows you to access RESTful APIs to create, convert, and manipulate PDFs within your applications.
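  • fs - It is the built-in Node.js file system module that allows you to work with the file system on your computer. It ships with Node.js, so it does not need to be installed.
  • path - It is the built-in Node.js module that provides a way of working with directories and file paths. It also ships with Node.js.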
  • dotenv - It is a zero-dependency module that loads environment variables from a .env file into process.env

Use the following command to install these packages:

npm install express axios ext-name twilio dotenv @adobe/pdfservices-node-sdk

Get the credentials from Adobe

You need to generate credentials from the Adobe developer portal to access the Adobe PDF Services API. The credentials include a file named pdfservices-api-credentials.json which will be used later in your application. After you log in, click on this link to generate the credentials from Adobe API.

Once downloaded, unzip the file and make a note of client_id and client_secret in the pdfservices-api-credentials.json file. You should set these values as environment variables.

Set up environment variables

You will be using environment variables to store some sensitive data which are required later in your application. Create a file named .env in your project directory and paste the snippet given below:

TWILIO_ACCOUNT_SID=ACXXXXXXXXX
TWILIO_AUTH_TOKEN=
NGROK_URL=https://<ngrok_id>.ngrok-free.app/
TWILIO_WA_NUMBER=whatsapp:+14155238886
PDF_SERVICES_CLIENT_ID=<YOUR CLIENT ID>
PDF_SERVICES_CLIENT_SECRET=<YOUR CLIENT SECRET>

You can find your Account SID and Auth Token from Twilio Console. The TWILIO_WA_NUMBER number is your WhatsApp sender. The number in the code snippet is the WhatsApp Sandbox number so feel free to leave it if you haven't registered your Twilio number for WhatsApp. The NGROK_URL is the base URL of your server which you’ll add later in the tutorial.

PDF_SERVICES_CLIENT_ID and PDF_SERVICES_CLIENT_SECRET are the credentials present in pdfservices-api-credentials.json file.
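
A missing or empty variable usually only shows up later as a failed API call, so it can help to check the environment once at startup. The following is a minimal sketch and not part of the tutorial's application code: the helper name findMissingEnvVars and the REQUIRED_ENV_VARS list are assumptions made for illustration.

```javascript
// Hypothetical startup check (not part of the original tutorial code).
const REQUIRED_ENV_VARS = [
    'TWILIO_ACCOUNT_SID',
    'TWILIO_AUTH_TOKEN',
    'NGROK_URL',
    'TWILIO_WA_NUMBER',
    'PDF_SERVICES_CLIENT_ID',
    'PDF_SERVICES_CLIENT_SECRET',
];

// Returns the names that are undefined or blank in the given environment object.
function findMissingEnvVars(env, names = REQUIRED_ENV_VARS) {
    return names.filter((name) => !env[name] || env[name].trim() === '');
}

// Example with a sample object; in index.js you would pass process.env instead.
const sampleEnv = { TWILIO_ACCOUNT_SID: 'ACXXXXXXXXX', TWILIO_AUTH_TOKEN: '' };
console.log(findMissingEnvVars(sampleEnv, ['TWILIO_ACCOUNT_SID', 'TWILIO_AUTH_TOKEN']));
```

In index.js, you could call findMissingEnvVars(process.env) right after require('dotenv').config() and stop the server if the returned list is not empty.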

Build your Document Converter Bot

Configuring your Twilio Sandbox for WhatsApp

To enable the chatbot to communicate with WhatsApp users using Twilio's Messaging API, you need to configure the Twilio Sandbox for WhatsApp. Here's how to do it:

  • Assuming you've already set up a new Twilio account, go to the Twilio Console and choose the Messaging tab on the left panel.
  • Under Try it out, click on Send a WhatsApp message. You'll land on the Sandbox tab by default and you'll see a phone number +14155238886 with a code to join next to it on the left and a QR code on the right.
  • To enable the Twilio testing environment, send a WhatsApp message with this code's text to the displayed phone number. You can click on the hyperlink to direct you to the WhatsApp chat if you are using the web version. Otherwise, you can scan the QR code on your phone.

Create an Express Server

Now that you have set up your environment, it's time to build the chatbot.

Let’s set up a basic express server. In your project directory (twilio-converterx), create a file called index.js. Inside index.js, paste the code snippet given below:

const express = require('express')
const app = express()
const port = 3000
app.get('/', (req, res) => {
    res.send('Hello world!')
})
app.listen(port, () => {
    console.log(`Example app listening on port ${port}`)
})

In the code above, you are importing express with the require function and creating an app by calling the express() function provided by the express framework. The port for your local application is set to 3000, but you can choose any other available port.

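Now, import all the other required modules. Copy the require statements from the snippet given below to the top of the index.js file, and place the two app.use() lines right after const app = express(), since they depend on both express and app being defined.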

const axios = require('axios')
const extName = require('ext-name');
const fs = require('fs')
const path = require('path');
const PDFServicesSdk = require('@adobe/pdfservices-node-sdk');
require('dotenv').config();
const twilio = require('twilio')
const { MessagingResponse } = twilio.twiml;

app.use(express.urlencoded({ extended: true }));
app.use(express.static('./files'));

You’ll be using a folder called /files to temporarily store media files so the app.use(express.static('./files')); line is used to publicly expose the files. Create a folder in your main project directory and name it files.

The express.urlencoded() function is a built-in middleware function in Express. It parses incoming requests with urlencoded payloads and is based on body-parser. This middleware is available in Express v4.16.0 onwards.
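
Twilio sends webhook requests as application/x-www-form-urlencoded data, which is why this middleware is needed. As a rough illustration of what the decoding step does, the sketch below uses Node's built-in URLSearchParams on an invented sample body. Express uses its own parser internally, so this is only an approximation, and the helper name parseUrlEncoded is hypothetical.

```javascript
// Illustration only: the sample body is invented, and Express does not use
// URLSearchParams internally.
const sampleBody = 'NumMedia=1&Body=pdf&From=whatsapp%3A%2B15550001111';

// Decode a urlencoded payload into a plain object of string values.
function parseUrlEncoded(body) {
    return Object.fromEntries(new URLSearchParams(body));
}

const parsed = parseUrlEncoded(sampleBody);
console.log(parsed.From);            // whatsapp:+15550001111
console.log(typeof parsed.NumMedia); // string
```

Every decoded value is a string, so a comparison such as NumMedia > 0 relies on JavaScript's implicit string-to-number conversion.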

To fetch the environment variables and initialize the Twilio client, add this snippet right below where you initialized all of the packages in index.js:

const accountSid = process.env.TWILIO_ACCOUNT_SID; // Account SID from www.twilio.com/console
const authToken = process.env.TWILIO_AUTH_TOKEN;// Auth Token from www.twilio.com/console
const ngrok = process.env.NGROK_URL;
const client = twilio(accountSid, authToken);

You’ll need your Twilio Account SID and Auth Token to initialize the client. Since you already stored these values as environment variables, you are fetching them using process.env. You are also fetching NGROK_URL, the public URL obtained from ngrok, which you will update later.

To declare the global constants and variables, add this snippet below where you initialized the Twilio client:

var state = []

const CONVERSIONS = {
"PDF": ["DOCX", "DOC", "PPTX", "PNG", "JPEG", "XLSX"],
"DOC": ["PDF"],
"DOCX": ["PDF"],
"XLSX": ["PDF"],
"PPTX": ["PDF"],
"PNG": ["PDF"],
"JPEG": ["PDF"],
}
const conversions = Object.entries(CONVERSIONS)
.map(([sourceFormat, targetFormats]) =>
`${sourceFormat} ➡️ ${targetFormats.join(", ")}`
).join("\n");
var message;
var twiml;

The CONVERSIONS constant maps each supported source format to the formats it can be converted to. The conversions constant turns that object into a readable string, which is included in message bodies to let users know which conversions are supported. The state variable holds data about the file the user has sent while they interact with the chatbot, and the message and twiml variables are used to build the message response.
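
Because state is a single module-level array, two people using the bot at the same time would overwrite each other's data. One possible alternative, sketched here under the assumption that the sender's number (the From field of the incoming message) is a suitable key, is a Map with one entry per sender. The names pendingConversions, rememberUpload and takeUpload are hypothetical and do not appear in the tutorial's code.

```javascript
// Hypothetical per-sender store; an alternative to the single global state array.
const pendingConversions = new Map();

// Remember the details of the file a sender has just uploaded.
function rememberUpload(from, upload) {
    pendingConversions.set(from, upload);
}

// Return the stored upload for a sender and clear it, or undefined if none exists.
function takeUpload(from) {
    const upload = pendingConversions.get(from);
    pendingConversions.delete(from);
    return upload;
}

// Sample values only; the phone number and URL are made up.
rememberUpload('whatsapp:+15550001111', { inputExtension: 'PDF', mediaUrl: 'https://example.com/media' });
console.log(takeUpload('whatsapp:+15550001111').inputExtension); // PDF
console.log(takeUpload('whatsapp:+15550001111'));                // undefined
```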

Create a route for incoming message

Users should be allowed to send the message and media for this chatbot to work. To do this, create a route called /message which will handle the incoming messages. Copy the code given below and paste it into index.js file:

app.post('/message', (req, res) => {
    //Add all function logic explained below
    const { NumMedia, MediaUrl0, MediaContentType0, Body, MessageSid, From } = req.body;
});

Here, the req object will contain the data passed in the request and you can access the data related to incoming messages as shown.

Implement the application logic

Now, you will be adding your application logic inside the /message route. The user can interact with the WhatsApp bot by sending any random text message or a file so you'll need to handle these messages separately.

Handling Media attachments:

To start off, let’s handle the media attachments. Copy the code given below and paste it inside the /message route you have created.


if(NumMedia > 0){
    state = []
    deleteFiles();
    const segments = MediaUrl0.split("/");
    const mediaSid = segments[segments.length - 1];
    var inputExtension = extName.mime(MediaContentType0)[0].ext.toUpperCase();

    if (!CONVERSIONS.hasOwnProperty(inputExtension)) {
        message = `Oops! 🙊📁 Looks like I can't handle that file extension just yet! 😅🚫\nCurrently the following formats are supported\n\n${conversions}`;
        twiml = new MessagingResponse().message(message);
        res.set('Content-Type', 'text/xml');
        return res.status(200).send(twiml.toString());
    }

    message = `📁🔮 Time to work wonders! Let's choose the perfect file type for your conversion:\n ${CONVERSIONS[inputExtension].join('\n')} 🔄📲`;
    state.push(inputExtension);
    state.push(MediaUrl0);
    state.push(MessageSid)
    state.push(mediaSid)
    twiml = new MessagingResponse().message(message);
    res.set('Content-Type', 'text/xml');
    return res.status(200).send(twiml.toString());
}

If the user sends a media file, NumMedia will always be greater than zero. So you can use the NumMedia > 0 condition to handle the media messages.

When the user sends a new media file, you'll need to clear the state variable and delete all the previously stored media files. The deleteFiles() method will delete the files which are already processed. Copy the code given below and paste it into index.js file (outside of the /message route):

//Delete the files from your server
function deleteFiles(){
    const directory = path.resolve(`./files`);
    fs.readdir(directory, (err, files) => {
        if (err) throw err;
        for (const file of files) {
            fs.unlink(path.join(directory, file), (err) => {
                if (err) throw err;
            });
        }
    });
}

The media SID is the last path segment of the media URL, so the code obtains it by splitting MediaUrl0 on "/". extName maps the MIME type in MediaContentType0 to a file extension. The code then checks whether that extension is a key of CONVERSIONS by using CONVERSIONS.hasOwnProperty(inputExtension). If it is not, the code informs the user by sending a message that lists the supported formats.

If the format is supported, then inputExtension, MediaUrl0, MessageSid and mediaSid are stored into the state array.
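
To make the SID extraction concrete, here is a small sketch that runs on its own. The URL is a made-up example in the general shape of a Twilio media URL, and the helper name lastPathSegment is hypothetical.

```javascript
// Made-up example URL; real URLs contain your own Account, Message and Media SIDs.
const sampleMediaUrl =
    'https://api.twilio.com/2010-04-01/Accounts/ACxxx/Messages/MMxxx/Media/MExxx';

// Return the final non-empty path segment of a URL.
function lastPathSegment(url) {
    const segments = new URL(url).pathname.split('/').filter(Boolean);
    return segments[segments.length - 1];
}

console.log(lastPathSegment(sampleMediaUrl)); // MExxx
```

Parsing with the URL class rather than splitting the raw string means a query string or trailing slash would not end up in the result.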

Handling Text Messages

Now you need to handle text messages sent by the user. The user will send a text message in two scenarios:

  1. Sending a random text message
  2. Sending the extension of the target media file

If the user is sending a random message, then the state variable will be empty. To handle this case, add an else if block below the code chunk you added in the /message route:

else if (state.length === 0) {
    const message = `📎🔄 File Conversion App 🚀\n\nSend us your file, and we'll convert it to the following formats:\n\n${conversions}\n\nLet's get started! 🎉🔃\n\n`;
    twiml = new MessagingResponse().message(message);
    res.set('Content-Type', 'text/xml');
    return res.status(200).send(twiml.toString());
}

Since this message from the user does not contain media, the code above sends a greeting message informing the available conversion options.

Next, you’ll need to handle the message sent by the user after sending the media. Here, the body of the message will contain what the media should be converted to (eg. PDF). Copy and paste the code right below where you place the last else if statement in the /message route:

else if(!CONVERSIONS[state[0]].includes(Body.toUpperCase())){
    const message = new MessagingResponse().message(`Oh no! 🙊📁 Looks like ${state[0]}s cannot be converted to ${Body}. Please provide any valid type from the list below or you can send a new file to convert! 🤗\n ${CONVERSIONS[state[0]].join('\n')}\n\nIf you want to convert a different file, just send it to us, we will do the rest.\n\n${conversions}`);
    res.set('Content-Type', 'text/xml');
    return res.status(200).send(message.toString());
}

The code above checks whether the requested output format is supported by looking it up in the CONVERSIONS object. If the format is not supported, it informs the user and asks them to send a valid format.
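
The same check can be pulled out into a small helper, which makes it easy to try in isolation. This is a sketch rather than the tutorial's own code: the function name isSupportedConversion is hypothetical, and trimming the user's input is an added assumption.

```javascript
const CONVERSIONS = {
    PDF: ['DOCX', 'DOC', 'PPTX', 'PNG', 'JPEG', 'XLSX'],
    DOC: ['PDF'],
    DOCX: ['PDF'],
    XLSX: ['PDF'],
    PPTX: ['PDF'],
    PNG: ['PDF'],
    JPEG: ['PDF'],
};

// Returns true when the source format can be converted to the target format.
// Input is matched case-insensitively and surrounding whitespace is ignored.
function isSupportedConversion(source, target) {
    const targets = CONVERSIONS[source.trim().toUpperCase()];
    return Array.isArray(targets) && targets.includes(target.trim().toUpperCase());
}

console.log(isSupportedConversion('PDF', ' docx ')); // true
console.log(isSupportedConversion('PPTX', 'DOCX'));  // false
```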

Finally, if all the validations are successful, you are now going to perform the conversion operation. Copy and paste the code right below where you place the last else if statement in the /message route:

else {
    message = new MessagingResponse().message("📂💻 Your file is in the works! Sit back, relax, and get ready for something amazing to come! ✨🎉😄");
    res.type('text/xml').send(message.toString());

    let config = {
        method: 'get',
        url: state[1],
        responseType: "stream"
    }
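    // Use a lowercase extension so the name matches the one built in convertFileFromAdobe
    const inputFileName = `${state[2]}.${state[0].toLowerCase()}`;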
    const fullPath = path.resolve(`./files/${inputFileName}`);

    axios.request(config)
    .then((response) => {
        const fileStream = fs.createWriteStream(fullPath);
        // Start the conversion only after the file has been fully written to disk
        fileStream.on('close', function(){
            console.log(Body.toUpperCase(), From)
            removeMediaFile(state[2], state[3]);
            convertFileFromAdobe(Body.toUpperCase(), From);
        });
        response.data.pipe(fileStream);
    })
    .catch((err) => {
        // Log download failures instead of leaving the rejection unhandled
        console.log(err);
    });
}

In the config object, you are fetching the media URL from the state variable. The code then initializes two variables named inputFileName and fullPath which will contain the name and location of the downloaded file respectively.

axios is then used to download the file and save it in the location defined in the variable fullPath.
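
Writing a stream to disk can also be done with pipeline from Node's built-in stream/promises module, which reports errors from either side of the pipe through a single promise. The sketch below shows the idea without any network access: an in-memory stream stands in for response.data, and the names saveStreamToFile and demoPath are hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { Readable } = require('stream');
const { pipeline } = require('stream/promises');

// Writes any readable stream to destinationPath. Resolves once the file is
// fully written and rejects if the source or the file stream fails.
async function saveStreamToFile(readable, destinationPath) {
    await pipeline(readable, fs.createWriteStream(destinationPath));
    return destinationPath;
}

// Stand-in for the axios response stream: a small in-memory stream.
const demoPath = path.join(os.tmpdir(), `converter-demo-${process.pid}.txt`);
saveStreamToFile(Readable.from(['sample file contents']), demoPath)
    .then((savedPath) => console.log(`Saved to ${savedPath}`))
    .catch((err) => console.error('Download failed:', err.message));
```

In the route handler, response.data would take the place of the in-memory stream, and the conversion would start once the returned promise resolves.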

After the media is downloaded, it is removed from Twilio's servers. Once the file is saved locally, the copy on Twilio's servers is redundant, and removing it helps protect the privacy of user data. Create a function named removeMediaFile by adding the code given below to the index.js file.

//Removes media file from the Twilio Server
function removeMediaFile(smsSid, mediaSid){
    client.messages(smsSid).media(mediaSid).remove()
    .catch(err => console.log(err)); // Log the failure instead of leaving the rejection unhandled
}

The function convertFileFromAdobe(Body.toUpperCase(), From) will perform the document conversion operation which is explained below.

To convert a file, you first need to determine which type of conversion is requested by the user depending on the media file and the target file format provided by the user. To do this, the getOperationType(targetType, inputFileName) function will return the type of operation you are going to perform later.

The function accepts two parameters, targetType and inputFileName. The targetType is the format to which the file needs to be converted and inputFileName is the name of the input file which needs to be converted.

Copy and paste the code snippet given below in the index.js file:

function getOperationType(targetType, inputFileName){
    const { ExportPDFToImages, ExportPDF } = PDFServicesSdk;
    const input = PDFServicesSdk.FileRef.createFromLocalFile(`./files/${inputFileName}`);
    let operationType = null;

    switch(targetType){
        case "DOC":
            operationType = ExportPDF.Operation.createNew(ExportPDF.SupportedTargetFormats.DOC);
            break;
        case "DOCX":
            operationType = ExportPDF.Operation.createNew(ExportPDF.SupportedTargetFormats.DOCX);
            break;
        case "XLSX":
            operationType = ExportPDF.Operation.createNew(ExportPDF.SupportedTargetFormats.XLSX);
            break;
        case "PPTX":
            operationType = ExportPDF.Operation.createNew(ExportPDF.SupportedTargetFormats.PPTX);
            break;
        case "PNG":
            operationType = ExportPDFToImages.Operation.createNew(ExportPDFToImages.SupportedTargetFormats.PNG);
            operationType.setOutputType(ExportPDFToImages.OutputType.ZIP_OF_PAGE_IMAGES);
            break;
        case "JPEG":
            operationType = ExportPDFToImages.Operation.createNew(ExportPDFToImages.SupportedTargetFormats.JPEG);
            operationType.setOutputType(ExportPDFToImages.OutputType.ZIP_OF_PAGE_IMAGES);
            break;
        case "PDF":
            operationType = PDFServicesSdk.CreatePDF.Operation.createNew();
            break;
    }
    if (operationType) {
        operationType.setInput(input);
    }
    return operationType;
}

Here you are using three types of operations from Adobe PDF Services API: ExportPDF, ExportPDFToImages and CreatePDF.

The code first creates a reference to the input file (the file to be converted) using the FileRef.createFromLocalFile function of PDFServicesSdk.

It then defines the operationType based on the targetType, i.e. the output file format. If the targetType is PNG or JPEG, it sets the output type to ZIP_OF_PAGE_IMAGES. This is because a PDF can produce more than one image, so all the images are delivered together as a zip file.

After creating the operationType instance, it sets the local file reference as the input to the object using the setInput method.
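
The dispatch logic in the switch statement can also be expressed as a lookup table. The sketch below uses plain strings in place of the SDK objects so that it runs without the Adobe SDK or credentials. The names OPERATION_FOR_TARGET and describeOperation are hypothetical, and the returned object is only a description, not an actual SDK operation.

```javascript
// Maps each target format to the name of the operation family used for it.
const OPERATION_FOR_TARGET = {
    DOC: 'ExportPDF',
    DOCX: 'ExportPDF',
    XLSX: 'ExportPDF',
    PPTX: 'ExportPDF',
    PNG: 'ExportPDFToImages',
    JPEG: 'ExportPDFToImages',
    PDF: 'CreatePDF',
};

// Describes which operation a target format needs, or returns null if unsupported.
function describeOperation(targetType) {
    if (!Object.prototype.hasOwnProperty.call(OPERATION_FOR_TARGET, targetType)) {
        return null;
    }
    const operation = OPERATION_FOR_TARGET[targetType];
    // Image exports are the only case where the result is a zip of page images.
    return { operation, zipOutput: operation === 'ExportPDFToImages' };
}

console.log(describeOperation('PNG'));  // { operation: 'ExportPDFToImages', zipOutput: true }
console.log(describeOperation('DOCX')); // { operation: 'ExportPDF', zipOutput: false }
```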

Now, you will be creating a function to convert the files from one format to another using Adobe PDF Services API. Create a function called convertFileFromAdobe(targetType, toNumber) and copy the code given below in the index.js file:

function convertFileFromAdobe(targetType, toNumber){
    var msgOptions = {from: process.env.TWILIO_WA_NUMBER, to: toNumber};
    try {
        // Initial setup, create credentials instance.
        const credentials = PDFServicesSdk.Credentials
        .servicePrincipalCredentialsBuilder()
        .withClientId(process.env.PDF_SERVICES_CLIENT_ID)
        .withClientSecret(process.env.PDF_SERVICES_CLIENT_SECRET)
        .build();

        //Create an ExecutionContext using credentials and create a new operation instance.
        const executionContext = PDFServicesSdk.ExecutionContext.create(credentials);
        const inputFileName = `${state[2]}.${state[0].toLowerCase()}`;
        var operationType = getOperationType(targetType, inputFileName);
        //Generating a file name
        let outputFilename = createOutputFileName()
        let outputFilePath = "./files/" + outputFilename;

        // Execute the operation and Save the result to the specified location.
        operationType.execute(executionContext)
        .then(result => {
            //If the target format is PNG/JPEG we will send the zip file which includes all the pages of the source file
            if(targetType === "PNG" || targetType === "JPEG"){
                result[0].saveAsFile(outputFilePath);
                msgOptions.body = `Voila! 🎉 Your file has been transformed! Enjoy! 🪄😄\n${ngrok}${outputFilename}.zip`;
            }
            else{
                result.saveAsFile(outputFilePath);
                msgOptions.mediaUrl = `${ngrok}${outputFilename}.${targetType}`;
            }

            //Send media message
            client.messages
            .create(msgOptions)
            .then(resp => {
                console.log(resp.sid);
            })
        })
        .catch(err => {
            console.log(err)
            state = []
            msgOptions.body = `🚨 Uh-oh! It seems like something went wrong! We apologize for the hiccup in the system. 🤖🔧`
            client.messages.create(msgOptions);
        });

        //Generates a timestamp-based name for the output file.
        function createOutputFileName() {
            let date = new Date();
            let dateString = date.getFullYear() + "-" + ("0" + (date.getMonth() + 1)).slice(-2) + "-" +
                ("0" + date.getDate()).slice(-2) + "T" + ("0" + date.getHours()).slice(-2) + "-" +
                ("0" + date.getMinutes()).slice(-2) + "-" + ("0" + date.getSeconds()).slice(-2);
            return (dateString);
        }
    }
    catch (err) {
        console.log(err)
        state = []
        msgOptions.body = `🚨 Uh-oh! It seems like something went wrong! We apologize for the hiccup in the system. 🤖🔧`
        client.messages.create(msgOptions);
    }
}

The code above starts off by creating a credentials instance using PDFServicesSdk. The servicePrincipalCredentialsBuilder() requires the credentials which are stored in the .env file.

The code then creates an executionContext using the above credentials instance. Next, it calls the getOperationType function with targetType and inputFileName to create the operation instance, which defines the type of conversion to perform.

After creating the operation instance, it will define a name for the output file. It will use the current date and time to name the output files uniquely. The function createOutputFileName() will return the filename based on the current timestamp.
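
For comparison, the same timestamp format can be produced with padStart. This is a minimal sketch, not the tutorial's function: the name formatTimestamp is hypothetical, and it accepts the date as a parameter so the output can be checked with a fixed value.

```javascript
// Formats a date as YYYY-MM-DDTHH-MM-SS using the local time zone.
function formatTimestamp(date = new Date()) {
    const pad = (n) => String(n).padStart(2, '0');
    return `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())}` +
        `T${pad(date.getHours())}-${pad(date.getMinutes())}-${pad(date.getSeconds())}`;
}

// Months are zero-based in the Date constructor, so 8 means September.
console.log(formatTimestamp(new Date(2023, 8, 29, 9, 5, 7))); // 2023-09-29T09-05-07
```

Note that a name with one-second resolution is only unique as long as two conversions do not finish within the same second.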

Next, the code executes the operation and stores the result in the specified location. For the PDF to image conversion, multiple image files can be generated, so the result is a zip file containing all of them. Since the .zip format is not supported by Twilio WhatsApp, the bot sends a download URL in the message body instead. For all other conversions, the converted file is sent to the user as a media attachment.

You can refer to this Quickstart for Adobe PDF Services API (Node.js) for more details about the Document Conversion logic used.

You are done with your coding part and now it is time to test the application!

Set up the Server

Now, start your server so that you can handle the incoming messages. Navigate to the project directory (twilio-converterx) and run the below command to start your express server:

node index.js

Since you have set the port to 3000, you can access your server using the URL http://localhost:3000. You should see a “Hello world!” message when you open this URL in the browser.

Since Twilio needs to send messages to your backend, you need to host your app on a public server. An easy way to do that is to use ngrok. If you're new to ngrok, you can refer to this blog post and create a new account. Execute this command on your terminal in another tab to start the ngrok tunnel:

ngrok http 3000

The above command sets up a tunnel between your local server running on port 3000 and a public URL provided by ngrok. You will see the response as shown below.

The terminal output of the command "ngrok http 3000". It contains the ngrok public URL which can be used to set the incoming message webhook of Twilio WhatsApp sandbox number.

Once you have the ngrok forwarding URL, any requests from a client to that URL will be automatically directed to your application.

Update your .env file with the new ngrok URL as shown below.

NGROK_URL=https://<ngrok_id>.ngrok-free.app/

Replace the placeholder with your unique forwarding URL and make sure the URL ends with a forward slash (/), since the file name is appended to this URL when sending the media file.
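
If you would rather not depend on the trailing slash being present, a small helper can normalize the base URL before the file name is appended. This is an optional sketch: the function name buildFileUrl and the example host are assumptions, not part of the tutorial's code.

```javascript
// Joins a base URL and a file name with exactly one slash between them.
function buildFileUrl(baseUrl, fileName) {
    const base = baseUrl.replace(/\/+$/, ''); // drop any trailing slashes
    return `${base}/${encodeURIComponent(fileName)}`;
}

console.log(buildFileUrl('https://example.ngrok-free.app', '2023-09-29T09-05-07.pdf'));
// https://example.ngrok-free.app/2023-09-29T09-05-07.pdf
console.log(buildFileUrl('https://example.ngrok-free.app/', '2023-09-29T09-05-07.pdf'));
// https://example.ngrok-free.app/2023-09-29T09-05-07.pdf
```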

Restart your express server with node index so that the changes are applied.

Configuring the Twilio webhook

You must configure a webhook so that Twilio can forward messages sent to the WhatsApp sandbox number to your application.

To do that, head over to the Twilio Console and choose the Messaging tab on the left panel. Under the Try it out tab, choose Send a WhatsApp message. Next to the Sandbox tab, choose the Sandbox settings tab.

Copy your ngrok forwarding URL and append /message. Paste it into the WHEN A MESSAGE COMES IN field as shown below.

WhatsApp sandbox settings with ngrok URL in "When a message comes in" field

The complete URL should look something like this: https://8748-103-179-197-149.ngrok-free.app/message

Press the Save button.

Testing your Document Converter Bot

Send a message to initiate the conversation or send a media file to start the conversion job directly.

A WhatsApp conversation: An initial message with the greeting "hey" prompts the chatbot to respond, requesting the user to send a file for conversion. Alongside this request, the chatbot provides information about the supported document formats.

PDF to DOCX Conversion

A WhatsApp conversation: A user initiates by sending a PDF file. In response, the chatbot provides a list of file formats to which the PDF can be converted. The user then sends the message "Docx". Subsequently, the bot acknowledges the user's request and informs them that the file conversion is in progress. After some time, the bot sends the converted "Docx" file to the user.

Image to PDF Conversion

A WhatsApp conversation: A user initiates by sending an image file in PNG format. In response, the chatbot provides a list of file formats to which the PNG image can be converted. The user then sends another WhatsApp message, specifying "PDF." Subsequently, the bot acknowledges the user's request and informs them that the file conversion is in progress. After some time, the bot sends the converted "PDF" file to the user.

PPT to DOCX - Unsupported

A WhatsApp conversation: A user starts by sending a PPT file. In response, the chatbot provides a list of file formats to which the PPT can be converted, noting that currently, PPTs can only be converted to PDF. The user then sends another WhatsApp message, specifying "Docx." The bot responds, indicating that the requested conversion is unsupported and asks the user to enter a valid format (PDF). The user complies by sending another WhatsApp message, specifying "PDF." The bot acknowledges the user's choice and informs them that the file conversion is now in progress. After some time, the bot sends the converted "PDF" file to the user.

PDF to Image Conversion

A WhatsApp conversation: A user begins by sending a PDF file. In response, the chatbot provides a list of file formats to which the PDF can be converted. The user then sends another WhatsApp message, specifying "PNG." The bot responds, informing the user that the file conversion is underway. After some time, the bot sends a link to download the converted files in a zip format. This is necessary because the PDFs contain multiple pages, and each page needs to be converted into an image.

Conclusion

You have just built a document converter WhatsApp bot which can convert documents from one format to another. Currently, this tutorial is limited to the following conversions:

PDF to ["DOCX", "DOC", "PPTX", "PNG", "JPEG", "XLSX"] and

["DOCX", "DOC", "PPTX", "PNG", "JPEG", "XLSX"] to PDF

However, the Adobe PDF Services API provides other services such as Split PDF, Compress PDF, and Secure PDF, which can be integrated into this application. To learn more about Adobe Acrobat Services APIs, check out their docs page here.

Also note that the Twilio WhatsApp API limits media size to a maximum of 16MB. Please refer to this document for more details about WhatsApp limits - https://www.twilio.com/docs/whatsapp/guidance-whatsapp-media-messages#message-size-limits
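
One way to respect this limit is to check the size of a converted file before sending it. The sketch below assumes that 16MB means 16 * 1024 * 1024 bytes, which you should confirm against the linked document. The names WHATSAPP_MEDIA_LIMIT_BYTES and fitsWhatsAppLimit are hypothetical.

```javascript
// Assumption: 16MB is interpreted as 16 * 1024 * 1024 bytes.
const WHATSAPP_MEDIA_LIMIT_BYTES = 16 * 1024 * 1024;

// Returns true when a file of the given size can be sent as WhatsApp media.
function fitsWhatsAppLimit(sizeInBytes) {
    return sizeInBytes <= WHATSAPP_MEDIA_LIMIT_BYTES;
}

console.log(fitsWhatsAppLimit(5 * 1024 * 1024));  // true
console.log(fitsWhatsAppLimit(20 * 1024 * 1024)); // false
```

In the application, the size could come from fs.statSync(outputFilePath).size, and files over the limit could be shared as a download link, as is already done for zip files.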

Sunil Kumar is a Software Developer from India. He can be reached at blogs.sunilkumar@gmail.com.