Generate images with DALL·E 2 and Twilio SMS using ASP.NET Core

January 18, 2023
Written by
Volkan Paksoy
Contributor
Opinions expressed by Twilio contributors are their own

AI-generated art has surged in popularity recently, to the point that an AI-generated piece won an art contest a few months ago. Many art generation programs are available, such as OpenAI DALL·E 2, Midjourney, and Stable Diffusion. In this article, you will use DALL·E 2 to generate images. OpenAI recently made DALL·E 2 available to the general public without a waitlist and also opened its API. New accounts also receive free credits, so you can follow this article at no cost. You can also get the final source code from my GitHub repository.

Prerequisites

To follow along, you'll need a Twilio account with a phone number, an OpenAI account, the .NET SDK, ngrok, and an IDE of your choice.

OpenAI and DALL·E 2

OpenAI started as a non-profit artificial intelligence research organization, co-founded in 2015 by a group that included Elon Musk and Sam Altman. Musk left the board in 2018. Currently, it operates under OpenAI LP, a “capped-profit” company (a hybrid of for-profit and non-profit models).

DALL·E is a machine learning model that uses a version of GPT-3 to generate images from a text description. It was initially announced in January 2021. The latest iteration of the system, DALL·E 2, was announced in April 2022. It initially required joining a waiting list, and even after being accepted, you could only generate images using the web front-end. Those limitations have now been lifted, and you can sign up and start using the API to generate images.

Overview of DALL·E 2 Front-End

When you go to the DALL·E 2 website and log in, you see an input box and a Generate button.

DALL-E 2 front-end showing an input box, Surprise me link on top of the input and a Generate button next to the input

The simplicity in the design reminds me of the Google homepage. It even has a “Surprise me” option which is similar to the “I’m feeling lucky” button.

Enter your description and press the Generate button. In a matter of seconds, you will see 4 image suggestions generated for you based on your description as shown below.

Image showing 4 images generated by DALL-E 2 based on description "a white kitten playing with red string"

You get 50 credits when you sign up, but they expire after a month. After that, you get 15 free credits every month. Each image generation costs 1 credit. You can check your credit status by clicking your profile image in the upper right-hand corner.

Image showing the account information and remaining credits with a "Buy credits" button next to it.

If you hover over the images, you see a “…” button appear. Click on it and the “Quick Actions” menu opens.

Image showing mouse hovered over a generated image and clicked on the "..." button to show the Quick Actions menu

Here you can download the image or generate more variations based on this. The new variations are quite similar to the original one though, as they are all based on the same description.

Image showing 4 more variations generated based on a previously generated image.

You can keep on generating variations from variations as well. Generating variations also costs 1 credit.

Overview of OpenAI Account

To be able to use the OpenAI API, you will need an API key and some credits. When you sign up, OpenAI gives you free credits. Note that this is different from the 50 credits DALL·E 2 gave.

To check your credit status, go to your account page.

You should see your usage breakdown and your credit status.

 

Image showing the usage and credits granted for free trial usage

The free trial credit ($18) goes a long way: at $0.02 per 1024x1024 image, it covers roughly 900 images. Smaller image sizes are cheaper still.

 

Image showing the different pricing between image models (1024x1024 $0.02, 512x512 $0.018, 256x256 $0.016) and language models (Ada $0.0004, Babbage $0.0005, Curie $0.0020, Davinci $0.02)

You can check the pricing page for full details.
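
To put the free credit in perspective, the short sketch below divides the $18 grant by the per-image prices shown in the screenshot. Those prices reflect the time of writing and may have changed since; the variable names are illustrative only.

```csharp
using System;
using System.Collections.Generic;

// Per-image prices as listed in the pricing screenshot (may be outdated).
var pricePerImage = new Dictionary<string, decimal>
{
    ["1024x1024"] = 0.020m,
    ["512x512"] = 0.018m,
    ["256x256"] = 0.016m
};

const decimal freeCredit = 18m; // free trial credit in USD

foreach (var (size, price) in pricePerImage)
{
    // Round down: a partially funded image cannot be generated.
    var images = (int)Math.Floor(freeCredit / price);
    Console.WriteLine($"{size}: about {images} images");
}
```

At these prices, the credit covers roughly 900 images at 1024x1024, 1,000 at 512x512, and 1,125 at 256x256.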

After you’ve confirmed you have free credits granted, click on API Keys on the left menu.

Here, click on the Create new secret key button.

As the prompt says, save your secret key somewhere safe, as you won't be able to view it again.

Image showing the current API keys. User clicked Create new secret key button and a new API key is generated and shown along with a warning to keep it safe.

Now that you've familiarized yourself with image generation and have your API key and credits available, move on to the next section to implement your own API that sends the DALL·E 2-generated images.

Project Implementation

To be able to generate the images, you will need to receive the image description from the user via SMS. To achieve this, you will implement a web API that responds to Twilio SMS webhook requests.

Create the API by running the following commands in a terminal:

mkdir Dalle2ImageSmsApi
cd Dalle2ImageSmsApi
dotnet new webapi

Since you're going to use Twilio, add the Twilio .NET SDK and the Twilio helper library for ASP.NET Core via NuGet:

dotnet add package Twilio
dotnet add package Twilio.AspNet.Core

Open the project with your IDE.

Under the Controllers directory, create a new file called IncomingSmsController.cs and update its contents with the code below:

using Microsoft.AspNetCore.Mvc;
using Twilio.AspNet.Core;
using Twilio.TwiML;
using Twilio.TwiML.Messaging;

namespace Dalle2ImageSmsApi.Controllers;

[ApiController]
[Route("[controller]")]
public class IncomingSmsController : TwilioController
{
    [HttpPost]
    public async Task<TwiMLResult> Index()
    {
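        // Twilio sends the SMS details as a form-encoded POST; Body holds the message text.
        var form = await Request.ReadFormAsync();
        var incomingText = form["Body"];
        
        // Placeholder response: echo the text and attach a random image instead of calling OpenAI.
        var message = new Message();
        message.Body($"Here's the image for your query: {incomingText}");
        message.Media(new Uri("https://picsum.photos/1024/1024"));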

        return new MessagingResponse()
            .Append(message)
            .ToTwiMLResult();
    }
}

The code above extracts the text message sent by the user, which arrives in the Body parameter of the form-encoded request body.

This message will be used to generate the image. For now, just for testing purposes, you will not generate anything from it: the response simply echoes the text and attaches a random photo from picsum.photos, a handy online service for random placeholder images. You can use this service to test and format your response messages without spending your OpenAI credits.

Run the application by running the following command in the terminal window:

dotnet run

You should see your application running on your localhost.

Image showing application running on http://localhost:5252
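
If you'd like to check the endpoint before involving Twilio, a small console program can imitate the webhook request. The sketch below is not part of the project. It assumes the API is listening on http://localhost:5252, as in the screenshot, and sends only the Body field, which is the only one the controller reads. BuildSmsForm is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;

// Builds the form-encoded body Twilio would send; only Body is included here.
static FormUrlEncodedContent BuildSmsForm(string text) =>
    new FormUrlEncodedContent(new Dictionary<string, string> { ["Body"] = text });

// Assumed address; adjust the port to match your own `dotnet run` output.
var endpoint = new Uri("http://localhost:5252/IncomingSms");

using var client = new HttpClient();
try
{
    using var response = await client.PostAsync(
        endpoint, BuildSmsForm("a white kitten playing with red string"));

    // The controller should reply with TwiML (XML) describing the outgoing message.
    Console.WriteLine((int)response.StatusCode);
    Console.WriteLine(await response.Content.ReadAsStringAsync());
}
catch (HttpRequestException ex)
{
    Console.WriteLine($"Could not reach the API: {ex.Message}");
}
```

A successful reply should contain a Message element with the echoed text and the placeholder image URL.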

You want Twilio to send webhook requests to your API, but currently Twilio cannot access your localhost. To fix this issue, you will tunnel your localhost to the internet with ngrok.

Copy the localhost URL, open another terminal and run the following command:

ngrok http { YOUR LOCALHOST URL }

Replace { YOUR LOCALHOST URL } with the value you copied from the other terminal (http://localhost:5252 in this example).

ngrok should generate a random Forwarding URL and forward the traffic from this URL to your local API:

Terminal window showing ngrok running and forwarding traffic to http://localhost:5252

Now that you have a publicly accessible URL, you can tell Twilio where to send webhook requests.

Go to the Twilio Console. Then go to Phone Numbers → Manage → Active Numbers and click on your number.

Scroll down to the messaging section. Select Webhook in the “A MESSAGE COMES IN” part and enter your ngrok URL followed by /IncomingSms as shown below:

Twilio Console showing A Message Comes In section with Webhook selected and the ngrok URL entered followed by /IncomingSms

Click Save.

Now that the environment is set up, send an SMS to your Twilio phone number and check that you receive a random image in response. If you do, it's time to generate images using the OpenAI API.

Implement OpenAI Client

The image generation API is still in beta as of this writing. Using image generation is quite straightforward. You send the description of the image, size, and the number of images you want to get. In this project, you will use an open-source client library which has recently been updated to support DALL·E.
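
For reference, here is a minimal sketch of sending those values to the image generation endpoint directly, without a client library. It is not part of the project. The OPENAI_API_KEY environment variable name is an assumption made for this example; the tutorial itself stores the key in user secrets instead.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Net.Http.Json;
using System.Text.Json;

// The description, number of images, and size, plus the response format.
var payload = new
{
    prompt = "a white kitten playing with red string",
    n = 1,
    size = "1024x1024",
    response_format = "url"
};

// Assumption: the key is stored in an environment variable named OPENAI_API_KEY.
var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY");

if (string.IsNullOrEmpty(apiKey))
{
    // Without a key, only show what would be sent.
    Console.WriteLine(JsonSerializer.Serialize(payload));
}
else
{
    using var client = new HttpClient();
    client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);

    using var response = await client.PostAsJsonAsync(
        "https://api.openai.com/v1/images/generations", payload);
    var json = await response.Content.ReadAsStringAsync();

    if (response.IsSuccessStatusCode)
    {
        // A successful response carries a "data" array with one entry per image.
        using var document = JsonDocument.Parse(json);
        Console.WriteLine(document.RootElement.GetProperty("data")[0].GetProperty("url").GetString());
    }
    else
    {
        Console.WriteLine($"Request failed ({(int)response.StatusCode}): {json}");
    }
}
```

A client library saves you from writing the serialization and error handling by hand, which is why the rest of this section uses one.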

Stop your application if it’s still running and run the following command in the terminal window:

dotnet add package Betalgo.OpenAI.GPT3

To use the API, you will need the OpenAI API key that you created in the previous section. You will use .NET user secrets to store it. Run the following command to initialize the user secrets:

dotnet user-secrets init

Then, add the API key to the secrets by running the following command, replacing {YOUR OPENAI API KEY} with the actual API key value:

dotnet user-secrets set "OpenAIServiceOptions:ApiKey" "{YOUR OPENAI API KEY}"

Update Program.cs to add the using directive at the top and the builder.Services.AddOpenAIService() call, as shown below:

using OpenAI.GPT3.Extensions;

var builder = WebApplication.CreateBuilder(args);

// Add services to the container.

builder.Services.AddControllers();
// Learn more about configuring Swagger/OpenAPI at https://aka.ms/aspnetcore/swashbuckle
builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();
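// Registers IOpenAIService; the API key is read from the OpenAIServiceOptions configuration section.
builder.Services.AddOpenAIService();

// The rest of the generated Program.cs stays unchanged.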

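Now you can inject the service into your IncomingSmsController controller as shown below. Compared to the previous version, the additional using directives, the constructor with its fields, and the image generation logic in the action are new or updated.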

using Microsoft.AspNetCore.Mvc;
using OpenAI.GPT3.Interfaces;
using OpenAI.GPT3.ObjectModels.RequestModels;
using Twilio.AspNet.Core;
using Twilio.TwiML;
using Twilio.TwiML.Messaging;
 
namespace Dalle2ImageSmsApi.Controllers;
 
[ApiController]
[Route("[controller]")]
public class IncomingSmsController : TwilioController
{
    private readonly ILogger<IncomingSmsController> _logger;
    private readonly IOpenAIService _openAiService;
 
    public IncomingSmsController(
        ILogger<IncomingSmsController> logger, 
        IOpenAIService openAiService
    )
    {
        _logger = logger;
        _openAiService = openAiService;
    }
    
    [HttpPost]
    public async Task<TwiMLResult> Index()
    {
        var form = await Request.ReadFormAsync();
        var incomingText = form["Body"];
 
        var createImageRequest = new ImageCreateRequest
        {
            Size = "1024x1024",
            N = 1,
            Prompt = incomingText,
            ResponseFormat = "url"
        };
 
        var createImageResponse = await _openAiService.Image.CreateImage(createImageRequest);
        if (!createImageResponse.Successful)
        {
            // Error may be null, so use null-conditional access; log with a message template.
            _logger.LogError(
                "An error occurred trying to create OpenAI image. {ErrorCode}: {ErrorMessage}",
                createImageResponse.Error?.Code,
                createImageResponse.Error?.Message);
            
            return new MessagingResponse()
                .Message("An unexpected error occurred. Try again later.")
                .ToTwiMLResult();
        }

        var image = createImageResponse.Results.First();
 
        var message = new Message();
        message.Body($"Here's the image for your query: {incomingText}");
        message.Media(new Uri(image.Url));
 
        return new MessagingResponse()
            .Append(message)
            .ToTwiMLResult();
    }
}

The action now constructs an ImageCreateRequest object with the size set to 1024x1024 and the number of images set to 1, and passes in the image description received in the incoming text message. It also sets the response format to url, because the action passes the image URL on to Twilio. Valid values for ResponseFormat are url and b64_json.

Run the application again. To test the implementation, send an SMS to your Twilio phone number again. This time the image returned should match your description.

For example, I sent the following description: “a golden retriever puppy playing with a kitten”:

Screenshot of phone showing the sent message “a golden retriever puppy playing with a kitten” and a received message showing a link to the generated image.

When I clicked the link, I got the following image:

Screenshot of phone showing the AI-generated image based on the description “a golden retriever puppy playing with a kitten”

I’m happy with the puppy but the kitten seems a bit odd. Of course, the nice thing about it is, if you don’t like the result, you can always request more.

As an improvement, you can make the image size and the number of images customizable.
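
One way to approach this is to let the sender put options at the start of the message. The sketch below is only an illustration: the "size=" and "n=" token format, the ParseImageOptions name, and the cap of 4 images are all assumptions made for this example, not part of the project or of any API.

```csharp
using System;
using System.Linq;

// Hypothetical message format: optional "size=..." and "n=..." tokens, then the description,
// for example "size=512 n=2 a white kitten playing with red string".
static (string Size, int Count, string Prompt) ParseImageOptions(string text)
{
    var size = "1024x1024"; // default used in the tutorial
    var count = 1;          // default used in the tutorial

    var words = text.Split(' ', StringSplitOptions.RemoveEmptyEntries).ToList();

    // Consume leading option tokens; stop at the first word that is not an option.
    while (words.Count > 0)
    {
        var word = words[0].ToLowerInvariant();
        if (word is "size=256" or "size=512" or "size=1024")
        {
            var edge = word["size=".Length..];
            size = $"{edge}x{edge}";
        }
        else if (word.StartsWith("n=") && int.TryParse(word[2..], out var n))
        {
            count = Math.Clamp(n, 1, 4); // arbitrary cap to limit credit usage
        }
        else
        {
            break;
        }
        words.RemoveAt(0);
    }

    return (size, count, string.Join(" ", words));
}

var options = ParseImageOptions("size=512 n=2 a white kitten playing with red string");
Console.WriteLine(options);
```

In the controller, the parsed values could then be assigned to Size, N, and Prompt on the ImageCreateRequest, with one message.Media(...) call per returned image URL.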

Conclusion

In this tutorial, you learned the basics of DALL·E 2 by using its front-end. Then you implemented your own service to interact with the OpenAI API. The service responds to Twilio SMS webhooks, uses DALL·E 2 to generate an image based on the user's description, and sends the image back to the user.

The resulting image may or may not be satisfactory based on the description. These are the early days of AI-generated images and I’m sure they will keep on getting better and better.

If you'd like to keep learning, I recommend taking a look at these articles:

Volkan Paksoy is a software developer with more than 15 years of experience, focusing mainly on C# and AWS. He’s a home lab and self-hosting fan who loves to spend his personal time developing hobby projects with Raspberry Pi, Arduino, LEGO and everything in-between. You can follow his personal blogs on software development at devpower.co.uk and cloudinternals.net.