Build an AI SMS Chatbot with LangChain, LLaMA 2, and Baseten
Time to read: 3 minutes
Last month, Meta and Microsoft introduced the second generation of the LLaMA LLM (Large Language Model) to enable developers and organizations to build generative AI-powered tools and experiences. Read on to learn how to build an SMS chatbot using LangChain templating, LLaMA 2, Baseten, and Twilio Programmable Messaging!
- A Twilio account - sign up for a free Twilio account here
- A Twilio phone number with SMS capabilities - learn how to buy a Twilio Phone Number here
- A Baseten account to host the LLaMA 2 model – make a Baseten account here
- Hugging Face account – make one here
- Python installed - download Python here
- ngrok, a handy utility to connect the development version of our Python application running on your machine to a public URL that Twilio can access.
LLaMA 2 is an open access Large Language Model (LLM) now licensed for commercial use. "Open access" means it is not closed behind an API and its licensing lets almost anyone use it and fine-tune new models on top of it. It is available in a few different sizes (7B, 13B, and 70B) and the largest model, with 70 billion parameters, is comparable to GPT-3.5 in numerous tasks. Currently, you must accept Meta's license for the model and request approval before you can access it.
Request access from Meta here with the email associated with your Hugging Face account. You should receive access within minutes.
Once you have access, generate a Hugging Face access token and add it to your Baseten account so Baseten can download the model weights on your behalf.
Once your Hugging Face access token is added to your Baseten account, you can deploy the LLaMA 2 chat version from the Baseten model library here. LLaMA 2-Chat is more optimized for engaging in two-way conversations and, according to TechCrunch, performs better on Meta's internal “helpfulness” and toxicity benchmarks.
After deploying your model, note the Version ID. You’ll use it to call the model from LangChain.
python3 -m venv venv
source venv/bin/activate
pip install langchain baseten flask twilio
If you're following this tutorial on Windows, enter the following commands in a command prompt window:
python -m venv venv
venv\Scripts\activate
pip install langchain baseten flask twilio
from flask import Flask, request
from langchain import LLMChain, PromptTemplate
from langchain.llms import Baseten
from langchain.memory import ConversationBufferWindowMemory
from twilio.twiml.messaging_response import MessagingResponse
Though LLaMA 2 is tuned for chat, templates are still helpful so the LLM knows what behavior is expected of it. This starting prompt is similar to ChatGPT's, so the model should behave similarly.
template = """Assistant is a large language model.
Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.
Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.
Overall, Assistant is a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.
I want you to act as Taylor Swift giving advice and answering questions. You will reply with what she would say.

{sms_input}"""
prompt = PromptTemplate(input_variables=["sms_input"], template=template)
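Under the hood, PromptTemplate performs straightforward placeholder substitution, much like Python's str.format. A minimal sketch of the idea in pure Python (no LangChain required; the shortened template here is only an illustration):

```python
# A shortened stand-in for the full template above
template = "You are a helpful assistant.\n\nHuman: {sms_input}\nAssistant:"

def format_prompt(template: str, **variables: str) -> str:
    # Substitute each {name} placeholder, as PromptTemplate.format() does
    return template.format(**variables)

prompt_text = format_prompt(template, sms_input="What should I sing tonight?")
print(prompt_text)
```

Each inbound text message is slotted into the template this way before being sent to the model, so the LLM always sees the full persona instructions alongside the user's question.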
Next, make an LLMChain, one of the core components of LangChain, which lets us chain together prompts and keep a prompt history.
Here, max_length is 4096, the maximum number of tokens (called the context window) the LLM can accept as input when generating responses.
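To get an intuition for that limit, a common rule of thumb is roughly four characters per token for English text. This is only a heuristic sketch (a real tokenizer gives exact counts), but it shows why long prompts and conversation histories must be kept in check:

```python
CONTEXT_WINDOW = 4096  # LLaMA 2's context window, in tokens
CHARS_PER_TOKEN = 4    # rough rule of thumb, not a real tokenizer

def rough_token_count(text: str) -> int:
    # Crude estimate; use the model's actual tokenizer for precise counts
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_in_context(prompt: str, reserved_for_reply: int = 512) -> bool:
    # Leave headroom so the model still has tokens left to generate with
    return rough_token_count(prompt) + reserved_for_reply <= CONTEXT_WINDOW

print(fits_in_context("Hello" * 10))  # a short prompt easily fits → True
```

This is also why the chain below uses a windowed memory: only the last few exchanges are kept, so the accumulated history never crowds out the new message.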
Don't forget to replace YOUR-MODEL-VERSION-ID with your model’s version ID!
llm = Baseten(model="YOUR-MODEL-VERSION-ID", verbose=True)
sms_chain = LLMChain(llm=llm, prompt=prompt, memory=ConversationBufferWindowMemory(k=2))
Finally, make a Flask app to accept inbound text messages, pass that to the LLM Chain, and return the output as an outbound text message with Twilio Programmable Messaging.
app = Flask(__name__)
@app.route("/sms", methods=['GET', 'POST'])
def sms():
    resp = MessagingResponse()
    inb_msg = request.form['Body'].lower().strip()
    output = sms_chain.predict(sms_input=inb_msg)
    resp.message(output)
    return str(resp)

if __name__ == "__main__":
    app.run(debug=True)
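Twilio expects the webhook to respond with TwiML, an XML dialect, and MessagingResponse builds it for you. A minimal sketch of the XML the handler returns, using only the standard library (the real twilio helper also handles escaping and edge cases):

```python
import xml.etree.ElementTree as ET

def build_twiml_reply(body: str) -> str:
    # Mirror the structure MessagingResponse produces:
    # <Response><Message>...</Message></Response>
    response = ET.Element("Response")
    message = ET.SubElement(response, "Message")
    message.text = body
    return ET.tostring(response, encoding="unicode")

twiml = build_twiml_reply("Hello from the chatbot!")
print(twiml)
```

When Twilio receives this XML from your endpoint, it sends the text inside the Message element back to the user as an SMS.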
On the command line, run python app.py to start the Flask app.
Now, your Flask app will need to be visible from the web so Twilio can send requests to it. ngrok lets you do this. With ngrok installed, run ngrok http 5000 in a new terminal tab in the directory your code is in.
ngrok will display a Forwarding URL. Grab that URL to configure your Twilio number: select your Twilio number under Active Numbers in your Twilio console, scroll to the Messaging section, and paste the ngrok URL with the /sms path in the textbox corresponding to when A Message Comes In.
Click Save, and your Twilio phone number is now configured to forward incoming messages to the web application server running locally on your computer. Text your Twilio number a question and get an answer from LLaMA 2 over SMS!
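If you want to sanity-check the message handling before involving Twilio, you can exercise the same normalization logic the /sms handler applies. A small sketch (handle_inbound is a hypothetical helper for illustration, not part of the app; it echoes instead of calling the model):

```python
def handle_inbound(form: dict) -> str:
    # Mimic the /sms handler: lowercase and trim the inbound Body field
    inb_msg = form["Body"].lower().strip()
    # The real app passes this to sms_chain.predict(sms_input=inb_msg);
    # here we just echo the cleaned message
    return f"You said: {inb_msg}"

print(handle_inbound({"Body": "  Hello, Taylor!  "}))  # → You said: hello, taylor!
```

Twilio delivers the inbound SMS as form-encoded POST data, with the message text in the Body parameter, which is why the handler reads request.form['Body'].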
There is so much fun for developers to have around building with LLMs! You can modify existing LangChain and LLM projects to use LLaMA 2 instead of GPT, build a web interface using Streamlit instead of SMS, fine-tune LLaMA 2 with your own data, and more! I can't wait to see what you build–let me know online what you're working on!