Bonus Tips for Virtual Agents: Dialogflow CX’s Hidden Gems
Time to read: 5 minutes
Traditional Automated Speech Recognition (ASR) tools, also called Speech-to-Text (STT) tools, are powerful, especially if you constrain the range of possible answers or caller utterances to a manageable set drawn from well-established, longstanding caller behavior.
But if you’d like to adopt a more conversational tone with your callers and give them a wider array of choices, then connecting from Twilio (using our <VirtualAgent> noun and one-click Studio Connector) to a predictive AI tool like Google’s Dialogflow CX may be just the solution you need. It can help you structure your set of to-be-detected Intents, create a set of Training Phrases (training data), and create Action/Responses for your <VirtualAgent> “bot”.
Dialogflow bots are especially useful when one of Google’s Prebuilt Agent templates suits your use case. Twilio’s integration and one-click connection between Dialogflow and the Studio Connect Virtual Agent widget make it straightforward to get started.
A bot designer or developer working with Dialogflow on Twilio can specify the language for speech recognition in the Connector, or dynamically using the <Config> TwiML noun nested inside the <Connect> verb and <VirtualAgent> noun pair. (Google has written about how they train and develop the ASR underlying their bots for use with multiple languages.) Using <Config> will override your underlying Dialogflow CX Connector's configuration, and pass additional parameters that can change other behaviors of the virtual agent.
<Config> has two attributes, name and value, and both must be provided every time you use <Config>. You must include a new <Config> noun for each configuration option you want to override. The name attribute corresponds to one of your Dialogflow CX Connector's configuration options, such as language. Additionally, some options are not present in your Dialogflow CX Connector configuration, such as speechModelVariant; you can still set these using the <Config> noun nested inside <VirtualAgent>.
For example, if you want to customize the TTS voice and language for the virtual agent interaction, you can supply the respective configuration settings inside the
<Config> noun as follows:
<Config name="language" value="en-us"/>
<Config name="voiceName" value="en-US-Wavenet-C"/>
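Putting the pieces together, a complete TwiML response might look like the following sketch. The connectorSid value here is a placeholder – substitute the SID of your own Dialogflow CX Connector:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Connect>
    <!-- connectorSid below is a placeholder; use your own Connector's SID -->
    <VirtualAgent connectorSid="UOXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX">
      <!-- Each <Config> noun overrides one Connector configuration option -->
      <Config name="language" value="en-us"/>
      <Config name="voiceName" value="en-US-Wavenet-C"/>
    </VirtualAgent>
  </Connect>
</Response>
```

When Twilio executes this TwiML, the call is handed to the virtual agent with the overridden language and TTS voice in effect for that interaction.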
Twilio’s Text-to-Speech now supports Google voices in Public Beta, and many new voices for different languages and genders have been added to the catalog. This lets you deliver a consistent experience by using the same prompting voice across all parts of your Twilio application. For example, you can use the same voice when combining Conversational AI from Google Dialogflow with other Twilio-prompted <Say> interactions, such as Twilio Verify for 2FA (two-factor authentication), or, soon, Twilio-wide with tools like Twilio <Pay> to capture payments in a PCI-compliant manner.
Any supported Google TTS voice not currently listed in the Dialogflow CX Connector dropdown list can be selected via the voiceName configuration in TwiML. In Studio, you can set an unlisted voice through the same key-value pair in the optional Configurations section of the Studio widget.
Input parameter variables are a beautiful thing.
Suppose you want to personalize a caller's experience by greeting them by name. In that case, you don't have to find your customer's name in a separate CRM database lookup step if their name is already available in the CNAM variable that Twilio or the Connect Virtual Agent Studio widget sees. Here's a list of the input variables Studio has available to send Dialogflow.
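If you're driving the interaction from TwiML rather than Studio, custom data can be supplied alongside the built-in variables. Assuming your setup supports the <Parameter> noun nested inside <VirtualAgent> (check the current TwiML reference for your Connector), a sketch might look like this; callerName is a hypothetical parameter name your Dialogflow CX agent would read from the session:

```xml
<Response>
  <Connect>
    <!-- connectorSid is a placeholder; callerName is a hypothetical
         custom parameter your Dialogflow CX agent would consume -->
    <VirtualAgent connectorSid="UOXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX">
      <Parameter name="callerName" value="Alice"/>
    </VirtualAgent>
  </Connect>
</Response>
```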
For subaccounts (especially as used by ISVs – Independent Software Vendors – with their end-customers), we have another list of best practices.
We suggest ISVs use a separate bot for each subaccount, so each bot references the correct end customer’s input data. Even though billing (e.g., for charge-backs) is already reported on a per-subaccount basis, we still recommend using parameters to feed unique data or prompts to a bot sitting at the Twilio subaccount level, rather than wrestling with the complexity of serving multiple customers with a single bot at the parent-account level.
In the former (recommended) approach, each subaccount would configure its own Connector between its Dialogflow GCP project and the Twilio sub-account and Studio flow (if used by the ISV, instead of TwiML). In either case, billing and usage would be visible on a sub-account level.
We've tried to make creating the first version of a bot as easy as possible (see our best practices) – but you should also reserve the right for your bot to get smarter in subsequent iterations of your voice workflows. You should always be trying to optimize your customers' experience while calling your business' voice front door.
For instance: what if your bot were smart enough to recognize if many customers were asking for the same seasonal special? What would be the business outcome of capturing that demand? With Twilio and Google's tools combined, agility like this is possible.
Google Dialogflow CX has sophisticated ways of generating a good combinatorial matrix of training phrases to identify caller Intents. The builder or designer need only enter a few prompts – Google recommends around ten or more.
However, the best training data for improving and optimizing a bot is what customer callers actually say to the bot itself. Hooking calls up to Twilio Call Recording and Transcriptions and Twilio Voice Intelligence to pull language operators out of structured voice conversation data can help you discover new Intents not already captured in the training data you entered into the Google Dialogflow CX bot. Consider what you might find: competitor mentions, churn intent (and reasons), and more.
With improving technology, the proper settings, and these tips and best practices, today's predictive AI "Virtual Agent" bots using Automated Speech Recognition (ASR) from Twilio and Google can provide incredible recognition accuracy performance. That holds true even in challenging, noisy environments. Twilio is the right "tool" for connecting these newly automated self-service applications to mobile and PSTN callers, and we can't wait to see the conversational experiences you build!
If you're exploring a Dialogflow bot and still need to read through our best practices for automatic speech recognition, find them here. And whether you use a Dialogflow bot or not, read how Voice Insights can help you increase the performance of your speech recognition solution.
Russ Kahan is the Principal Product Manager for <Gather> Speech Recognition, Dialogflow Virtual Agents, Media Streams, and SIPREC at Twilio. He’s enjoyed programming voice apps and conversing with robots since sometime back in the late nineties – when this stuff was still called “CTI,” for “Computer Telephony Integration” – but he also enjoys real-world pursuits like Scouting, skiing, swimming, and mountain biking with his kids. Reach him at rkahan [at] twilio.com.
Jeff Foster is a Software Engineer on Twilio's Programmable Voice team, and he’s been working on Speech Rec at Twilio for the last 6 years – including the original Dialogflow prototype implementations more than 2 years ago. He can be reached at jfoster [at] twilio.com.
Ramón Ulldemolins Andreu is a Product Manager for Twilio Voice. He helps companies transform their businesses by embracing technology to build digital, data-driven engagement at scale. He also loves traveling, experiencing local culture and food, and live music. He can be reached at rulldemolinsandreu [at] twilio.com.