Creating an OCR Communication App with Tesseract.js and React (Part 2)

November 10, 2021

This tutorial is divided into 2 parts: part 1 covers the project setup and front end development, while part 2 covers the back end development and testing of the app.

Part 1: Creating an OCR Communication App with Tesseract.js and React (Part 1)

In part 1, we built the front end of this OCR communication app:


completed app

Let’s now build the back end.

Building the back end

Create .env and server.js files in the root directory of the project.

First, let’s store the Twilio credentials in the .env file.

Open the .env file in a text editor and paste the following code:

TWILIO_ACCOUNT_SID=XXXXX
TWILIO_AUTH_TOKEN=XXXXX
TWILIO_PHONE_NUMBER=XXXXX

Each XXXXX is a placeholder for one of your Twilio credentials.

Obtain the Account SID and Auth Token from the Twilio Console and paste them as the values of TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN respectively.

Paste the phone number you obtained from Twilio in E.164 format as the value of TWILIO_PHONE_NUMBER.

Save the file.
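
An E.164 number starts with a plus sign and the country code, followed by the rest of the number, with at most 15 digits in total. As a minimal sketch for checking a value before saving it, the hypothetical helper below (isE164 is not part of the tutorial's code) tests a string against that pattern:

```javascript
// Hypothetical helper, not part of the tutorial's code.
// Checks that a string looks like an E.164 number:
// a "+" followed by 2 to 15 digits, where the first digit is not 0.
function isE164(phoneNumber) {
  return /^\+[1-9]\d{1,14}$/.test(phoneNumber)
}

console.log(isE164("+14155552671")) // true
console.log(isE164("415-555-2671")) // false
```

This only checks the shape of the string. It does not confirm that the number exists or that it belongs to your Twilio account.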

Next, we’ll create a Node.js server in the server.js file you created earlier.

Open server.js in a text editor and paste the following code:

require("dotenv").config()

const express = require("express")
const app = express()
const port = 5000

app.use(express.json())

app.listen(port, () => {
  console.log(`Example app listening at http://localhost:${port}`)
})

This code imports the dotenv and express libraries and instantiates Express.

We also set the server’s port to 5000. In part 1 of this tutorial, we used create-react-app in the local environment, so its development server already occupies http://localhost:3000/, and the back end needs a different port.

In an SPA (single-page application) running in a local environment, a request from the front end to the back-end API server can be blocked by the browser’s same-origin policy, which prevents the front end from reading the server’s response.

In this tutorial, the front end runs at http://localhost:3000/ and the Node.js server runs at http://localhost:5000/. Because the ports differ, the browser treats them as different origins, which triggers the same-origin policy issue.

To prevent this from happening, we’ll add a proxy definition to package.json.

Open package.json in a text editor and add the following top-level property:

"proxy": "http://localhost:5000",

Save the file.

This tells the development server at http://localhost:3000/ to forward any request it does not recognize to http://localhost:5000/, unless the request’s Accept header includes text/html.
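
To illustrate how the proxy is used, here is a rough sketch of a front-end request to the /send-sms endpoint that is defined later in this tutorial. The helper name buildSmsRequest is hypothetical, and the front end from part 1 may structure its request differently. The point is that the URL is relative: the browser sends the request to the development server on port 3000, which forwards it to port 5000.

```javascript
// Hypothetical helper, not taken from part 1 of the tutorial.
// Builds the arguments for a fetch() call to the back end.
function buildSmsRequest(to, text) {
  return {
    // Relative URL: the request goes to the dev server, which proxies it
    url: "/send-sms",
    options: {
      method: "POST",
      // express.json() on the server parses bodies sent with this content type
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ to, text }),
    },
  }
}

// Usage in the browser (not executed here):
// const { url, options } = buildSmsRequest("+14155552671", "Hello")
// fetch(url, options)
//   .then(res => res.json())
//   .then(data => console.log(data.sid))
```

Because the request never names http://localhost:5000/ directly, the browser sees it as same-origin.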

Next, we’ll retrieve the Twilio credentials from the environment variables. Open server.js in a text editor.

Paste the following code between the app.use block and the app.listen block:

// Twilio credentials
const accountSid = process.env.TWILIO_ACCOUNT_SID
const authToken = process.env.TWILIO_AUTH_TOKEN
const fromPhoneNumber = process.env.TWILIO_PHONE_NUMBER
const client = require("twilio")(accountSid, authToken)
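
If one of these variables is missing from .env, the problem may only surface later, when a request to Twilio fails. As an optional sketch, the hypothetical helper below (findMissingEnv is not part of the tutorial's server.js) lists any required variables that are unset or empty, so you can log a warning at startup:

```javascript
// Hypothetical helper, not part of the tutorial's server.js.
// Returns the names of the variables that are unset or empty in `env`.
function findMissingEnv(names, env = process.env) {
  return names.filter(name => !env[name])
}

const missing = findMissingEnv([
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "TWILIO_PHONE_NUMBER",
])

if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(", ")}`)
}
```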

Finally, we’ll configure the endpoint for sending SMS by pasting the following code under the Twilio credentials block:

// Send an SMS based on the phone number and text received in the request
app.post("/send-sms", (req, res) => {
  client.messages
    .create({body: req.body.text, from: fromPhoneNumber, to: req.body.to})
    .then(message => {
      res.json({ sid: message.sid })
    })
    .catch(() => res.sendStatus(500))
})

This code handles HTTP POST requests sent from the front end to /send-sms. It uses the Twilio client created in the credentials block to send an SMS to the phone number given in the to property of the request, with the message given in the text property. On success, the server responds with the SID of the message. If the request to Twilio fails, it responds with status 500.
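
The endpoint passes req.body.to and req.body.text to Twilio as they arrive. As a minimal sketch of an optional check, the hypothetical helper below (validateSmsRequest is not part of the tutorial's server.js) reports which of the two properties are missing or blank:

```javascript
// Hypothetical helper, not part of the tutorial's server.js.
// Returns a list of problems found in the request body,
// or an empty list if both properties are non-empty strings.
function validateSmsRequest(body) {
  const errors = []
  const isNonEmptyString = value =>
    typeof value === "string" && value.trim() !== ""

  if (!body || !isNonEmptyString(body.to)) {
    errors.push("to is required")
  }
  if (!body || !isNonEmptyString(body.text)) {
    errors.push("text is required")
  }
  return errors
}

console.log(validateSmsRequest({ to: "+14155552671", text: "Hello" }).length) // 0
console.log(validateSmsRequest({ to: "", text: "Hello" }).length) // 1
```

Inside the route handler, one option would be to call this before client.messages.create and respond with res.status(400).json({ errors }) when the list is not empty.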

The full code for server.js can be found in the GitHub repository.

Congratulations, the server is now ready!

Test the app

Let’s run the completed app.

Open two terminal windows.

In the first window, execute the following command from the root directory of the app:

npm start

Open http://localhost:3000 in your browser. If everything is working fine, you will see a screen like so:

Select file

Next, we’ll check the behavior of the server and the entire app.

In the second window, execute the following command from the root directory of the application:

node server.js

In your browser, select an image that contains English characters using “Choose file”.

Tesseract.js tends to recognize text more accurately in images with black text on a white background.
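
If your image has a colored or low-contrast background, one possible way to move it closer to black on white is to binarize the pixels before running OCR. The sketch below is an assumption for illustration and is not part of this tutorial's app. The hypothetical binarize function works on RGBA pixel data in the layout used by the canvas ImageData.data array, and the threshold of 128 is an arbitrary starting point. Whether this improves recognition depends on the image.

```javascript
// Hypothetical preprocessing step, not part of the tutorial's app.
// Converts RGBA pixel data to pure black and white.
// `rgba` holds 4 values per pixel: red, green, blue, alpha (0 to 255 each).
function binarize(rgba, threshold = 128) {
  const out = new Uint8ClampedArray(rgba.length)
  for (let i = 0; i < rgba.length; i += 4) {
    // Weighted sum approximating perceived brightness
    const luminance =
      0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2]
    const value = luminance >= threshold ? 255 : 0
    out[i] = value
    out[i + 1] = value
    out[i + 2] = value
    out[i + 3] = rgba[i + 3] // keep the original alpha
  }
  return out
}

// One light gray pixel and one dark gray pixel
const pixels = new Uint8ClampedArray([200, 200, 200, 255, 30, 30, 30, 255])
console.log(Array.from(binarize(pixels))) // light pixel becomes white, dark pixel becomes black
```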

In this tutorial, we’ll use the following image:

Image used for recognition

When you select an image, the screen will change like so:

Start OCR screen

Select Process the image with OCR.

The status “Processing…” will be displayed under the button, and then it will change to “OCR processing complete”. The text detected by the OCR process will be displayed as follows:


Editor screen

If you want to edit the OCR-processed text before sending the SMS, you can do so from the editor under “Edit the recognized text:”.

When you are done editing, enter the recipient’s phone number in the phone number field.

Use the country flag icon to select the country where the recipient’s phone number is registered, then enter the phone number without the country code in the field to the right.


Select phone number

It is dangerous to send personal information via SMS without the information owner’s consent. Before clicking “Send SMS”, make sure that the text does not contain any personal information.

Click “Send SMS”. The app screen will display “Sending SMS…”, and if there are no problems with the sending process, “Finished sending SMS” will be displayed like so:


Finished sending SMS screen

The SMS will be sent to the phone number you specified like so:


Message sent

We have now tested the app’s behavior!

Conclusion

OCR technology is used in a variety of situations in our day-to-day lives. With Tesseract.js, you can easily perform OCR processing in your browser and combine OCR with other technologies and frameworks. Using this tutorial as a foundation, why not try digitizing processes that only exist in paper form?

Stephenie is a JavaScript editor on the Twilio Voices team. She writes hands-on JavaScript tutorial articles in Japanese and English for the Twilio Blog. Reach out to Stephenie at snakajima[at]twilio.com and see what she’s building on GitHub at smwilk.