Voice Biometric Authentication With Twilio

What if you could use your voice as a password, saying a phrase that only works when you are the one saying it? That could be more secure than even a complex password, because a person’s voice is much harder to duplicate than a string of characters.


Well, you can, using voice biometric authentication. Voice biometric authentication, which uses digital sampling techniques to recognize a speaker’s voice, is gaining popularity in a number of industries. The organizations that have typically implemented it are large ones such as banks and government departments: organizations with heightened security concerns and the funds to implement the rather complex technology behind voice authentication.

But the field is changing – and now voice authentication is accessible via APIs, the way that Twilio has made communications available via API.

What Is It?

I wanted to set up a proof of concept with voice biometric authentication from the first time I heard about it, but I struggled to find an API that was usable without a large overhead. Luckily, Noel Grover from VoiceIt reached out to us, and I was able to set up a Twilio-based voice authentication demo in no time.

A common use case is two-factor authentication. For example, during a bank transfer or application login, the flow would be similar to existing Twilio two-factor authentication, but instead of entering a code delivered via SMS or IVR, the user’s voice must be recognized.

You can try it yourself by calling:  +1 612-400-7423


There are a few key concepts to become familiar with. I struggled with them a bit initially, but they are pretty standard in the field.

1. Enrollment – This is the process of creating a voice print. Typically, a user will be asked to repeat a phrase multiple times, and future authentication attempts will be compared against the recordings of these utterances.

2. Authentication – This is the comparison of a user’s spoken phrase to the recorded enrollment. These algorithms are tunable: a company can dictate how strict or loose the matching should be. VoiceIt can authenticate via a REST API and a WAV file, which is particularly convenient with Twilio.

3. Users – Self-explanatory: to use VoiceIt, you need a user created. A tip for using Twilio with VoiceIt: instead of making a user create a username, you will likely want to use their phone number and put it in the required email format for them.

These concepts and more are covered in the VoiceIt API documentation.

The Code

In this demo, I’m going to use a system that, when called, recognizes the caller’s number and, if the user exists, allows them to log in via their passphrase. If their voice matches the pre-recorded phrase, the demo simply says “thanks your voice has been recognized”. If callers aren’t recognized, they can enroll as a new user.

The REST-based API of VoiceIt made integration with Twilio super easy. Here is an example in Node, which makes sending a request from one service to another very easy.

We will start with some basic Node requirements, and use Express so we can receive web requests from Twilio.
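
A minimal sketch of that setup, assuming Express 4.16 or later is installed (`npm install express`). The `createApp` name is hypothetical and not taken from the original demo; the Express module is passed in as an argument so the wiring can be exercised on its own. Twilio sends webhook parameters as form-encoded POST bodies, so the body parser is the part that matters.

```javascript
// Hypothetical helper: builds the Express app that will receive Twilio webhooks.
function createApp(express) {
  const app = express();
  // Twilio posts webhook parameters (From, RecordingUrl, ...) as
  // application/x-www-form-urlencoded, so parse them into req.body.
  app.use(express.urlencoded({ extended: false }));
  return app;
}
```

In a real server this might be used as `const app = createApp(require('express'));` followed by `app.listen(process.env.PORT || 3000);`.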



You will need a Developer ID from VoiceIt – you can register directly on their website to set that up. In this code sample, the Developer ID is stored in an environment variable called VOICEIT_DEV_ID.
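
Reading that variable could look something like the sketch below. The `getVoiceItDeveloperId` helper is hypothetical; the only name taken from the text is VOICEIT_DEV_ID. Failing early when the variable is missing tends to be easier to debug than a rejected API call later on.

```javascript
// Hypothetical helper: reads the VoiceIt Developer ID from the environment
// and fails fast when it is missing, rather than on the first API call.
function getVoiceItDeveloperId(env = process.env) {
  const id = env.VOICEIT_DEV_ID;
  if (!id) {
    throw new Error('VOICEIT_DEV_ID is not set');
  }
  return id;
}
```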


Next, we write a helper function, callerCredentials, that will be used later, every time we receive a new call to the system.
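
One way such a helper might look is sketched here. Twilio really does send the caller’s number in the `From` parameter of the webhook body, but the placeholder email domain and the idea of deriving the password from the number are assumptions made for illustration, not necessarily what the original demo does.

```javascript
const { createHash } = require('crypto');

// Sketch of a callerCredentials helper. Twilio puts the caller's number in
// the `From` parameter of the webhook body (for example "+16125550100").
function callerCredentials(body) {
  const number = body.From;
  return {
    number,
    // Assumption: VoiceIt wants an email-shaped username, so wrap the number
    // in a placeholder domain instead of asking the caller to pick one.
    email: `${number}@twiliobioauth.example.com`,
    // Assumption: derive a password from the number so the caller never has
    // to type one. The voice print is what actually protects the account.
    password: createHash('sha256').update(number).digest('hex'),
  };
}
```

Because the credentials are derived entirely from the phone number, authentication in this sketch is bound to the phone the user calls from.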



Next, we set up the section of the code that will accept requests from Twilio when an incoming call comes in. Notice that the route is /incoming_call. If your server is running on http://yourtwilioserver.yourcompany.com/, you would configure your Twilio phone number voice URL to http://yourtwilioserver.yourcompany.com/incoming_call.

That will send inbound calls to this section of the code.  From there, a few things will happen:

1. The incoming phone number from the caller will be used to create user credentials with the callerCredentials function. Note, the only unique part of the user record in this demo is the phone number they are calling from.

2. An API call is made against VoiceIt to GET information about the user. If they don’t exist, the user is created in VoiceIt.

3. If the VoiceIt user did not exist, the process to /enroll the user is started.

4. Some feedback is played to the caller using TwiML.
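
The four steps above can be sketched as a single handler. The `voiceIt` object here is a hypothetical stand-in for whatever makes the REST calls to VoiceIt, not the real SDK, and sending existing users straight to /authenticate is an assumption based on the flow described earlier. TwiML is built as a plain string to keep the sketch free of dependencies.

```javascript
// Builds a TwiML response that speaks a message, then sends the call to
// another route. The messages here are fixed strings, so no XML escaping.
function sayAndRedirect(message, path) {
  return '<?xml version="1.0" encoding="UTF-8"?>' +
    `<Response><Say>${message}</Say><Redirect>${path}</Redirect></Response>`;
}

// `voiceIt` is a hypothetical client, not the real VoiceIt SDK:
//   getUser(creds)    resolves to the user, or null when none exists
//   createUser(creds) resolves once the user has been created
async function handleIncomingCall(creds, voiceIt) {
  const user = await voiceIt.getUser(creds);
  if (user === null) {
    // Unknown caller: create the VoiceIt user, then start enrollment.
    await voiceIt.createUser(creds);
    return sayAndRedirect('Welcome. Let us set up your voice print.', '/enroll');
  }
  // Known caller: go straight to voice authentication.
  return sayAndRedirect('Welcome back.', '/authenticate');
}

// Possible Express wiring, assuming `app`, `callerCredentials`, and `voiceIt` exist:
// app.post('/incoming_call', async (req, res) => {
//   const twiml = await handleIncomingCall(callerCredentials(req.body), voiceIt);
//   res.type('text/xml').send(twiml);
// });
```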


The experience will vary depending on whether the user is enrolling for the first time, or they are authenticating.

The /enroll route, if selected, is used to create a voice print for the user. They must say the phrase 3 times to enroll. This enrollment will be saved on VoiceIt and compared against future /authenticate attempts.
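
A rough sketch of the TwiML side of enrollment follows. The `<Record>` verb and its `action`, `maxLength`, and `trim` attributes are standard TwiML, and Twilio posts a `RecordingUrl` parameter to the `action` URL once the recording finishes. The `/process_enrollment` route, the `completed` query parameter, and the spoken prompts are assumptions for illustration only.

```javascript
const REQUIRED_ENROLLMENTS = 3;

// Escapes the characters that are special in XML text and attribute values.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Returns TwiML for the next enrollment step, given how many recordings
// VoiceIt has accepted so far for this caller.
function enrollmentTwiml(completed, phrase) {
  if (completed >= REQUIRED_ENROLLMENTS) {
    return '<Response><Say>Thank you. You are now enrolled.</Say>' +
      '<Redirect>/authenticate</Redirect></Response>';
  }
  // Hypothetical route that forwards the recording to VoiceIt.
  const action = `/process_enrollment?completed=${completed}`;
  return '<Response>' +
    `<Say>Recording ${completed + 1} of ${REQUIRED_ENROLLMENTS}. ` +
    `After the beep, please say: ${escapeXml(phrase)}</Say>` +
    `<Record action="${escapeXml(action)}" maxLength="5" trim="do-not-trim"/>` +
    '</Response>';
}
```

The route behind `action` would pass the recording to VoiceIt and, when the enrollment is accepted, respond with `enrollmentTwiml(completed + 1, phrase)` so the caller is prompted again until all three recordings are in.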


The /authenticate route is the heart of the actual voice authentication system. When a user reaches this section, they will speak their passphrase, and the Twilio recording will be sent to VoiceIt to be matched against their enrollment.

If it matches, they get a simple message, “Great Success!”, and some of the JSON returned from VoiceIt will be read back, stating the authentication match percentage. In a real-world application, the call would then go on to the next function, for example allowing you to access your bank balance.
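
The final step, turning the authentication result into spoken feedback, might look like the sketch below. The shape of `result` is an assumption: it stands for a normalized view of VoiceIt’s JSON reply, not the actual response format, and the retry on failure is an illustrative choice.

```javascript
// `result` is a hypothetical, already-normalized view of VoiceIt's JSON reply:
//   { matched: boolean, confidence: number }  (confidence as a percentage)
function authenticationTwiml(result) {
  if (result.matched) {
    const percent = Math.round(result.confidence);
    return '<Response><Say>Great success! Your voice matched with ' +
      `${percent} percent confidence.</Say></Response>`;
  }
  // No match: tell the caller and send them back to try again.
  return '<Response><Say>Sorry, your voice was not recognized. ' +
    'Please try again.</Say><Redirect>/authenticate</Redirect></Response>';
}
```

A production system would probably cap the number of retries and, on success, hand the call over to the protected part of the application instead of ending it.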

And that’s it! The machine is now recognizing your voice.  I, for one, welcome our new robot overlords.

Code also available on GitHub: https://github.com/choppen5/twilioVoiceItiVR