Build a Smart QA ChatOps Assistant with Playwright, Twilio WhatsApp, LLMs, and MCP

November 25, 2025

Written by

Jacob Snipes

Contributor

Reviewed by

Amanda Lange

Twilion

Smart QA ChatOps isn't your traditional test automation where you write rigid test scripts. This is intelligent, dynamic testing where you simply tell the system what to test in natural language via WhatsApp, and the AI figures out how to test it.

You can say:

Test login on https://example.com with user@test.com
Check if the search works on https://example.com
Try adding a product to cart on https://example.com
Test the checkout flow on https://example.com

In this tutorial, you’ll build an application that lets you trigger a web test from WhatsApp. The MCP-powered backend runs Playwright tests, captures artifacts (screenshots, video, traces, logs), and sends a concise, plain-language report back to WhatsApp, typically within minutes. This removes technical barriers so that anyone (product, support, non-technical teammates) can invoke QA.

Traditional test automation requires developers to write and maintain test scripts for every scenario. When the UI changes, tests break. When requirements change, scripts need updates. This AI-powered approach eliminates rigid test scripts by:

Interpreting natural language: Describe what you want to test in plain English
Dynamic element discovery: AI identifies buttons, forms, and links intelligently without hard-coded selectors
Self-healing capabilities: AI adapts to UI changes automatically
Democratized testing: Non-technical team members can trigger tests
Instant feedback loop: Results delivered directly to WhatsApp

What You'll Learn

Integrate Twilio WhatsApp API with .NET for bidirectional messaging
Use OpenAI to parse natural language and generate test plans
Execute dynamic browser automation with Playwright
Store and serve test artifacts (screenshots, traces) locally
Orchestrate complex workflows between AI, webhooks, and browser automation
Build production-ready ChatOps applications

Prerequisites

.NET 9 SDK installed (verify: dotnet --version and expect 9.x).
An OpenAI API key (or equivalent LLM API).
A Twilio account and phone number (trial accounts require verified recipient numbers).
ngrok (or any tunnel) for exposing local HTTP to the public internet.
Node.js (for Playwright browser drivers)
Basic understanding of C#, ASP.NET Core, and REST APIs

Project Setup and Structure

Start by cloning the complete working project from GitHub.

git clone https://github.com/GeekEdmund/Smart-QA-ChatOps-Assistant-with-Playwright-Twilio-WhatsApp-LLMs.git
cd Smart-QA-ChatOps-Assistant-with-Playwright-Twilio-WhatsApp-LLMs

All code is already functional in the repository.

Understand the Solution Structure

Open the cloned project in your IDE. You'll see a clean multi-project architecture:

QAChatOpsAssistant/
├── QAChatOps.Api/           # Web API layer
├── QAChatOps.Core/          # Business logic
└── QAChatOps.Infrastructure/ # External integrations

Why this structure?

QAChatOps.Api: Web API layer handling webhooks and HTTP concerns. This serves as your entry point
QAChatOps.Core: Business logic and domain models. Pure C# with no external dependencies
QAChatOps.Infrastructure: External integrations (Twilio, OpenAI, Playwright). Isolates third-party dependencies

This separation improves testability and maintainability, and it follows Clean Architecture principles. The Core project remains framework-agnostic, making it reusable across different application types.

Project References

Now establish proper project references:

# Add project references
cd QAChatOps.Api
dotnet add reference ../QAChatOps.Core
dotnet add reference ../QAChatOps.Infrastructure
cd ../QAChatOps.Infrastructure
dotnet add reference ../QAChatOps.Core

The reference hierarchy matters: Infrastructure can depend on Core, but Core never depends on Infrastructure. This creates a unidirectional dependency flow that prevents circular references and maintains clean architecture boundaries.

Install Dependencies

Install all required NuGet packages for each project:

# Install NuGet packages
cd ../QAChatOps.Api
dotnet add package Microsoft.Playwright
dotnet add package Twilio
cd ../QAChatOps.Infrastructure
dotnet add package Microsoft.Playwright
dotnet add package Twilio
dotnet add package OpenAI
dotnet add package Microsoft.Extensions.Http
dotnet add package Microsoft.Extensions.Configuration.Abstractions
# Build
cd ../QAChatOps.Api
dotnet build
dotnet tool install --global Microsoft.Playwright.CLI
playwright install

About these packages:

Microsoft.Playwright is a cross-browser automation library that lets you control Chromium, Firefox, and WebKit. Twilio is the official Twilio SDK; it simplifies the API calls and authentication needed for WhatsApp messaging. Microsoft.Extensions.Http provides IHttpClientFactory, which adds resilience and performance features to the HttpClient used for OpenAI API calls.

Microsoft.Extensions.Configuration.Abstractions provides the IConfiguration abstraction that the Infrastructure layer relies on to read settings. The playwright install command downloads Chromium, Firefox, and WebKit to your system. These are specifically patched browser builds that enable automation features like screenshot capture, network interception, and trace recording. This step is crucial: without it, Playwright tests fail with "browser not found" errors.

Cross-Platform: Browser Binary Installation via npm

If you're on macOS or Windows and encounter browser installation errors (like "Executable doesn't exist"), you can use Playwright's npm initializer as an alternative. This works cross-platform and downloads the correct browser binaries for your OS.

# Initialize Playwright (non-interactive with defaults)
npm init playwright@latest -- --yes
# Ensure browsers are installed (the init step often runs this; run explicitly if needed)
npx playwright install

Building the Core Domain Models

Take a look at the code in this project to understand its basic structure.

The Core project defines your domain without any infrastructure concerns. These models represent the "language" of your application and will be used across all layers.

TestRequest Model

This model represents a parsed testing request. It's the bridge between a user's natural language and executable tests.

namespace QAChatOps.Core.Models;
public record TestRequest
{
    public string Url { get; init; } = string.Empty;
    public string TestIntent { get; init; } = string.Empty;
    public TestType Type { get; init; }
    public Dictionary<string, string> Parameters { get; init; } = new();
}
public enum TestType
{
    Login,
    Search,
    Navigation,
    FormSubmission,
    AddToCart,
    Checkout,
    General,
    ElementInteraction
}

The TestType enum helps the AI categorize requests. While the AI generates dynamic test plans, knowing the type helps with selector strategies. For example, login tests are known to look for email inputs and submit buttons, while cart tests look for "Add to Cart" buttons.

The Parameters dictionary stores dynamic values extracted from user messages. If someone says "Test login with user@test.com and password123", the AI populates this with {"email": "user@test.com", "password": "password123"}. Head to the next model.

TestPlan Model

The AI generates this from a TestRequest. It's a structured representation of what steps to execute.

namespace QAChatOps.Core.Models;
public record TestPlan
{
    public List<TestStep> Steps { get; init; } = new();
    public Dictionary<string, string> Selectors { get; init; } = new();
    public string Description { get; init; } = string.Empty;
}
public record TestStep
{
    public string Action { get; init; } = string.Empty;
    public string Target { get; init; } = string.Empty;
    public string? Value { get; init; }
    public string Description { get; init; } = string.Empty;
    public bool IsOptional { get; init; }
}

Action types: navigate, click, type, verify, wait, screenshot, and scroll. Each action is atomic and self-contained.
Target: Either a CSS selector or natural language text (like "Login button"). The orchestrator tries several matching strategies.
IsOptional flag: Some steps are nice-to-have but not critical. For example, waiting for a popup might be optional: if it doesn't appear, the test continues. This prevents brittle tests that fail on minor UI variations.

The Selectors dictionary stores reusable selector strategies. The AI might define {"loginButton": "button[type='submit'], button:has-text('Login'), [aria-label*='login']"}, which provides multiple fallback options. This is key for resilient testing.
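To make the fallback idea concrete, here is a minimal sketch of turning one comma-separated selector string into an ordered list of candidates. SplitSelectors is a hypothetical helper name, not something taken from the repository, and the sketch assumes the AI never emits selectors that contain commas themselves.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: split a comma-separated selector string into ordered
// fallback candidates. A plain Split(',') would also cut selectors that contain
// commas themselves (for example ":is(a, b)"), so this assumes none are present.
static List<string> SplitSelectors(string selectorList) =>
    selectorList
        .Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
        .ToList();

var candidates = SplitSelectors(
    "button[type='submit'], button:has-text('Login'), [aria-label*='login']");

// Candidates keep the order in which the AI listed them.
foreach (var candidate in candidates)
{
    Console.WriteLine(candidate);
}
```

Keeping the original order matters because the first selector that matches wins, so the most specific strategy should be listed first.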

TestExecutionResult Model

This captures everything about a test run: success or failure, timing, artifacts, and AI analysis.

namespace QAChatOps.Core.Models;
public record TestExecutionResult
{
    public string JobId { get; init; } = Guid.NewGuid().ToString();
    public string Url { get; init; } = string.Empty;
    public string TestIntent { get; init; } = string.Empty;
    public bool Success { get; init; }
    public TimeSpan Duration { get; init; }
    public List<ExecutedStep> ExecutedSteps { get; init; } = new();
    public List<string> ScreenshotPaths { get; init; } = new();
    public string? TracePath { get; init; }
    public string? VideoPath { get; init; }
    public string? ErrorMessage { get; init; }
    public string AIAnalysis { get; init; } = string.Empty;
}
public record ExecutedStep
{
    public string Action { get; init; } = string.Empty;
    public string Description { get; init; } = string.Empty;
    public bool Success { get; init; }
    public string? Error { get; init; }
    public DateTime Timestamp { get; init; }
    public string? ScreenshotPath { get; init; }
}

JobId is used to correlate artifacts. All screenshots for a test run will have this ID in their filename, making debugging easier.

ExecutedSteps creates an audit trail. You can see exactly what happened, when it happened, and whether it succeeded. This is invaluable for debugging flaky tests. Artifact paths (screenshots, traces, videos) are stored as local file paths. Later, the WhatsApp service will convert these to public URLs. This separation of concerns means the Core layer doesn't need to know about web hosting or URL generation.

Service Interfaces

Before implementing the infrastructure, define the service contracts in the Core project. This inverts the dependencies: the Infrastructure project implements interfaces that are defined in Core.

namespace QAChatOps.Core.Services;
using QAChatOps.Core.Models;
public interface IAITestGenerator
{
    Task<TestRequest> ParseIntentAsync(string message, CancellationToken cancellationToken = default);
    Task<TestPlan> GenerateTestPlanAsync(TestRequest request, CancellationToken cancellationToken = default);
    Task<string> AnalyzeResultsAsync(TestExecutionResult result, CancellationToken cancellationToken = default);
}

Why three methods?

ParseIntentAsync converts natural language into a structured TestRequest.
GenerateTestPlanAsync creates executable steps from the request.
AnalyzeResultsAsync interprets test results and provides insights.

This interface makes the AI provider swappable. Today it's OpenAI; tomorrow it could be Anthropic Claude or a local LLM. The rest of your application doesn't care.

Implement the AI Test Generator

Now you'll implement the brain of the system, the component that converts natural language into executable test plans using OpenAI's GPT-4. Feel free to use any LLM of your choice.

This file has a lot of moving parts, so the code has not been entirely reproduced here. You can view the complete implementation with all helper methods on GitHub.

Direct your attention to the following code block:

using System.Text;
using System.Text.Json;
using System.Text.Json.Serialization;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Logging;
using QAChatOps.Core.Models;
using QAChatOps.Core.Services;
namespace QAChatOps.Infrastructure.OpenAI;
public class AITestGenerator : IAITestGenerator
{
    private readonly HttpClient _httpClient;
    private readonly ILogger<AITestGenerator> _logger;
    private readonly string _apiKey;
    private readonly string _modelName;
    private const string OpenAIApiUrl = "https://api.openai.com/v1/chat/completions";
    public AITestGenerator(
        IConfiguration configuration,
        ILogger<AITestGenerator> logger,
        HttpClient httpClient)
    {
        _httpClient = httpClient;
        _logger = logger;
        _apiKey = configuration["OpenAI:ApiKey"] 
            ?? throw new InvalidOperationException("OpenAI:ApiKey not configured");
        _modelName = configuration["OpenAI:Model"] ?? "gpt-4-turbo-preview";
        _httpClient.DefaultRequestHeaders.Add("Authorization", $"Bearer {_apiKey}");
    }

Constructor injection: You inject HttpClient (supplied through IHttpClientFactory, for resilience), IConfiguration (for settings), and ILogger (for observability). This is standard .NET dependency injection, where every dependency is explicit and testable.

Using HttpClient directly: While OpenAI has official SDKs, calling the API through HttpClient gives you complete control over requests and responses and keeps error handling simple. The OpenAI Chat Completions API is straightforward enough that an SDK adds little value.
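As a rough illustration of what "using HttpClient directly" involves, the sketch below builds a Chat Completions request body with System.Text.Json. BuildChatRequest is a hypothetical helper, and the prompt texts and the max_tokens value of 500 are placeholder assumptions; the repository's actual request code may be shaped differently.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper: serialize a Chat Completions request body.
// Property names follow the public API (model, messages, temperature, max_tokens).
static string BuildChatRequest(
    string model, string systemPrompt, string userPrompt, double temperature, int maxTokens)
{
    var payload = new
    {
        model,
        messages = new[]
        {
            new { role = "system", content = systemPrompt },
            new { role = "user", content = userPrompt }
        },
        temperature,
        max_tokens = maxTokens
    };
    return JsonSerializer.Serialize(payload);
}

// Placeholder prompts; 500 is an assumed token budget for this example.
var body = BuildChatRequest(
    "gpt-4-turbo-preview",
    "You are a QA test intent parser.",
    "Test login on https://example.com",
    0.3,
    500);

Console.WriteLine(body);
```

The resulting string can then be wrapped in a StringContent with the application/json media type and posted to the OpenAIApiUrl endpoint.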

Parsing Natural Language Intent

The next segment of the generator parses the user's message:

public async Task<TestRequest> ParseIntentAsync(
        string message,
        CancellationToken cancellationToken = default)
    {
        var prompt = $@"You are a QA test intent parser. Extract testing information from user messages.
User message: ""{message}""
Extract and return ONLY a valid JSON object (no markdown, no backticks) with this structure:
{{
  ""url"": ""the website URL (required, must start with http:// or https://)"",
  ""testIntent"": ""brief description of what to test"",
  ""testType"": ""Login|Search|Checkout|Navigation|Form|General"",
  ""parameters"": {{
    ""email"": ""extracted email or test@example.com"",
    ""password"": ""extracted password or Test123!"",
    ""username"": ""extracted username or testuser"",
    ""search"": ""search term if mentioned""
  }}
}}

Why temperature 0.3? Temperature controls randomness. At 0.3, which the full method sets on the request, responses are mostly deterministic. You want consistent structure, not creative variations.
Prompt engineering: The prompt explicitly says "return ONLY a valid JSON object" and provides the exact structure. This constrains GPT-4's output and reduces parsing errors. The prompt also handles edge cases (missing URL, no parameters).
Markdown cleanup: GPT-4 sometimes wraps JSON in backticks for code blocks. The code strips these before parsing. This is a common pattern when working with LLMs that output code.
Error handling: If parsing fails completely, you return a basic TestRequest with the original message. This prevents the entire flow from crashing on malformed AI responses.
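The cleanup and fallback steps described above can be sketched as follows. StripCodeFences and TryReadUrl are hypothetical helper names used only for illustration; the repository's own helpers may differ in name and behavior.

```csharp
#nullable enable
using System;
using System.Text.Json;

// Three backticks, built at runtime so the sample itself contains no literal fence.
var fence = new string('`', 3);

// Hypothetical helper: remove a leading fence line (with or without a
// language tag) and a trailing fence, if the model added them.
string StripCodeFences(string raw)
{
    var text = raw.Trim();
    if (text.StartsWith(fence, StringComparison.Ordinal))
    {
        var firstNewline = text.IndexOf('\n');
        text = firstNewline >= 0 ? text[(firstNewline + 1)..] : text[fence.Length..];
    }
    if (text.EndsWith(fence, StringComparison.Ordinal))
    {
        text = text[..^fence.Length];
    }
    return text.Trim();
}

// Hypothetical helper: read the "url" field, returning null instead of
// throwing when the reply is not valid JSON or has an unexpected shape.
string? TryReadUrl(string modelReply)
{
    try
    {
        using var doc = JsonDocument.Parse(StripCodeFences(modelReply));
        var root = doc.RootElement;
        return root.ValueKind == JsonValueKind.Object
            && root.TryGetProperty("url", out var url)
            && url.ValueKind == JsonValueKind.String
                ? url.GetString()
                : null;
    }
    catch (JsonException)
    {
        return null;
    }
}

var fencedReply = fence + "json\n{ \"url\": \"https://example.com\" }\n" + fence;
Console.WriteLine(TryReadUrl(fencedReply) ?? "(no url)");
Console.WriteLine(TryReadUrl("not json at all") ?? "(no url)");
```

Returning null rather than throwing lets the caller decide on the fallback, such as building a basic TestRequest from the original message.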

Generating Test Plans

public async Task<TestPlan> GenerateTestPlanAsync(
        TestRequest request,
        CancellationToken cancellationToken = default)
    {
        var prompt = $@"You are a QA test automation expert. Generate a detailed test plan.
Test Request:
- URL: {request.Url}
- Intent: {request.TestIntent}
- Type: {request.Type}
- Parameters: {JsonSerializer.Serialize(request.Parameters)}
Generate a test plan with specific Playwright actions. Return ONLY valid JSON (no markdown):
{{
  ""description"": ""brief description of test"",
  ""steps"": [
    {{
      ""action"": ""navigate|click|type|verify|wait|scroll"",
      ""target"": ""CSS selector or text locator"",
      ""value"": ""value for type action, or wait duration"",
      ""description"": ""human readable step description"",
      ""isOptional"": false
    }}
  ],
  ""selectors"": {{}}
}}

Higher temperature (0.5): You want more creativity here. GPT-4 should generate varied selector strategies and think about edge cases, and test plans benefit from diverse approaches.
System prompt: The instruction "Generate robust, intelligent test plans with fallback strategies" directs GPT-4's reasoning. It prompts the model to provide multiple selectors per element, which makes tests resilient to UI changes and dynamic content.
Max tokens (2000): Test plans are larger than parsed intents. Complex flows might have 10-15 steps with detailed selectors.
Selector format: The prompt instructs GPT-4 to provide comma-separated alternatives, such as "button[type='submit'], button:has-text('Login'), [aria-label*='login']". The orchestrator tries each until one succeeds.

Analyzing Test Results

public async Task<string> AnalyzeResultsAsync(
        TestExecutionResult result,
        CancellationToken cancellationToken = default)
    {
        var stepsInfo = string.Join("\n", result.ExecutedSteps.Select(s => 
            $"- [{(s.Success ? "✓" : "✗")}] {s.Description}" + 
            (s.Success ? "" : $" (Error: {s.Error})")));
        var prompt = $@"You are a QA analyst. Analyze this test execution result and provide a brief, actionable summary.
Test Execution:
- URL: {result.Url}
- Intent: {result.TestIntent}
- Overall Success: {result.Success}
- Duration: {result.Duration.TotalSeconds:F1}s
- Steps Executed: {result.ExecutedSteps.Count}

A few key points in this block:

Temperature 0.4: Balanced between determinism and natural language. You want consistent structure but human-readable text.
Max chars constraint: "Keep it concise (max 1200 chars)" ensures the analysis fits well in WhatsApp messages. Without this, GPT-4 might generate multi-paragraph essays.
Emojis for clarity: WhatsApp is a casual medium. Emojis (✅, ❌, ⚡) make reports scannable and engaging.
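The prompt asks the model to stay under 1200 characters, but a model can overshoot. As a hedge, you could also enforce the cap on the client side. The sketch below assumes a hypothetical helper named CapLength, which is not part of the repository.

```csharp
using System;

// Hypothetical helper: cap a report at maxChars, cutting at the last space
// before the limit so words stay intact, and append an ellipsis.
static string CapLength(string text, int maxChars)
{
    if (maxChars < 1) throw new ArgumentOutOfRangeException(nameof(maxChars));
    if (text.Length <= maxChars) return text;

    // Search backwards from maxChars - 1 so one character is left for the ellipsis.
    var cut = text.LastIndexOf(' ', maxChars - 1);
    if (cut <= 0) cut = maxChars - 1;

    // Avoid splitting a surrogate pair (emoji are common in these reports).
    if (cut > 0 && char.IsHighSurrogate(text[cut - 1])) cut--;

    return text[..cut].TrimEnd() + "…";
}

var report = CapLength("alpha beta gamma", 12);
Console.WriteLine(report);
```

Text that already fits is returned unchanged, so the helper is safe to apply to every analysis before sending it.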

Why do you not use structured JSON here? Unlike intent parsing and test plans, analysis is meant for human consumption. Natural language is perfect for explaining test results to non-technical stakeholders. This transforms raw test execution data into something a little more end-user-friendly. Instead of "Step 3 failed," users get "Login button not found. The site might be using a modal dialog. Try waiting longer or check if there's a cookie consent popup." Next, proceed to the test orchestrator.

Implement the Test Orchestrator

The orchestrator executes AI-generated test plans using Playwright. This is where plans become reality: browsers launch, elements get clicked, and screenshots are captured.

Take a look at the following code:

public interface ITestOrchestrator
{
    Task<TestExecutionResult> ExecuteTestAsync(
        TestRequest request,
        TestPlan plan,
        CancellationToken cancellationToken = default);
}
public class TestOrchestrator : ITestOrchestrator
{
    private readonly ILogger<TestOrchestrator> _logger;
    private readonly string _artifactsPath;
    public TestOrchestrator(
        ILogger<TestOrchestrator> logger,
        IConfiguration configuration)
    {
        _logger = logger;
        _artifactsPath = configuration["ArtifactsPath"] ?? "wwwroot/artifacts";
        // Ensure directories exist
        Directory.CreateDirectory(Path.Combine(_artifactsPath, "screenshots"));
        Directory.CreateDirectory(Path.Combine(_artifactsPath, "traces"));
        Directory.CreateDirectory(Path.Combine(_artifactsPath, "videos"));
    }

The constructor creates the necessary directories. Because the orchestrator is responsible for artifacts, it ensures storage exists before attempting writes. This prevents cryptic "directory not found" errors during test execution.
Configurable path: Using IConfiguration makes the artifacts path environment-specific. In development it's wwwroot/artifacts; in production it could be a cloud storage mount point.

Test Execution Error Handling and Individual Step Execution

public async Task<TestExecutionResult> ExecuteTestAsync(
        TestRequest request,
        TestPlan plan,
        CancellationToken cancellationToken = default)
    {
        var jobId = Guid.NewGuid().ToString("N")[..8];
        var startTime = DateTime.UtcNow;
        var executedSteps = new List<ExecutedStep>();
        var screenshots = new List<string>();
        string? tracePath = null;
        string? videoPath = null;
        bool success = false;
        string? errorMessage = null;
        _logger.LogInformation("Starting test execution for: {Intent} on {Url}", 
            request.TestIntent, request.Url);
        // Initialize Playwright
        using var playwright = await Playwright.CreateAsync();
        // Launch with stealth settings to avoid bot detection
        await using var browser = await playwright.Chromium.LaunchAsync(new()
        {
            Headless = true,
            SlowMo = 100,
            Args = new[]
            {
                "--disable-blink-features=AutomationControlled",
                "--disable-dev-shm-usage",
                "--disable-web-security",
                "--no-sandbox"
            }
        });

The launch options set a 100 ms SlowMo delay between operations, which makes timing-related flakiness less likely, and the full implementation records video automatically. With tracing enabled, Playwright captures screenshots, DOM snapshots, and network requests for complete test replay. The orchestrator distinguishes critical from optional steps: it stops on essential failures but continues past non-critical ones, such as a missing cookie banner. NetworkIdle waits hold until the page has had no network connections for at least 500 ms before proceeding. Comprehensive error handling catches all exceptions, so the method always returns a result instead of propagating the error.

Finally blocks guarantee browser cleanup, preventing process leaks, and screenshots at each step create a visual debugging timeline accessible through Playwright's trace viewer.
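The control flow described above (skip failed optional steps, stop on failed required steps, always clean up) can be sketched without Playwright. RunStepsAsync and the tuple shape below are hypothetical stand-ins for the orchestrator's real loop, where each step would call into the browser.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical sketch: run steps in order. A failing optional step is logged
// and skipped; a failing required step stops the run. Cleanup always runs.
static async Task<(bool Success, List<string> Log)> RunStepsAsync(
    IEnumerable<(string Description, bool IsOptional, Func<Task> Run)> steps,
    Func<Task> cleanup)
{
    var log = new List<string>();
    var success = true;
    try
    {
        foreach (var step in steps)
        {
            try
            {
                await step.Run();
                log.Add($"ok: {step.Description}");
            }
            catch (Exception ex) when (step.IsOptional)
            {
                log.Add($"skipped optional: {step.Description} ({ex.Message})");
            }
            catch (Exception ex)
            {
                log.Add($"failed: {step.Description} ({ex.Message})");
                success = false;
                break;
            }
        }
    }
    finally
    {
        await cleanup(); // for example, closing the browser
    }
    return (success, log);
}

var cleanedUp = false;
var outcome = await RunStepsAsync(
    new (string, bool, Func<Task>)[]
    {
        ("open page", false, () => Task.CompletedTask),
        ("dismiss cookie banner", true, () => throw new TimeoutException("banner not shown")),
        ("click login", false, () => Task.CompletedTask),
    },
    () => { cleanedUp = true; return Task.CompletedTask; });

Console.WriteLine($"success={outcome.Success}, steps={outcome.Log.Count}, cleanedUp={cleanedUp}");
```

The exception filter (when step.IsOptional) keeps the two failure paths separate, so an optional failure never flips the overall result.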

Intelligent Element Clicking

private async Task ClickElementAsync(IPage page, string selector)
    {
        // Try multiple selector strategies
        var selectors = selector.Split(',').Select(s => s.Trim()).ToList();
        foreach (var sel in selectors)
        {
            try
            {
                // Wait for element to be visible and enabled
                await page.WaitForSelectorAsync(sel, new() 
                { 
                    State = WaitForSelectorState.Visible,
                    Timeout = 3000 
                });
                // Try smart text-based clicking first (Playwright handles visibility)
                if (sel.Contains("text=") || sel.Contains("has-text"))
                {
                    await page.ClickAsync(sel, new() { Timeout = 5000 });
                    _logger.LogInformation("Clicked element using selector: {Selector}", sel);
                    return;
                }
                // Try CSS selector
                var element = await page.QuerySelectorAsync(sel);
                if (element != null && await element.IsVisibleAsync())
                {
                    // Scroll into view first
                    await element.ScrollIntoViewIfNeededAsync();
                    await page.WaitForTimeoutAsync(300);
                    await element.ClickAsync(new() { Timeout = 5000 });
                    _logger.LogInformation("Clicked element using selector: {Selector}", sel);
                    return;
                }
            }
            catch
            {
                continue; // Try next selector
            }
        }

This is the magic of resilient testing: the AI provides multiple selectors separated by commas, and you try each until one works. If button[type='submit'] fails (maybe the site changed), you try button:has-text('Login'), then [aria-label*='login']. Text-based selectors (text= or has-text) are clicked directly through page.ClickAsync, since they are the most resilient to UI changes: a button with the text "Login" is likely to remain even if CSS classes change. The timeouts are short (3 seconds to find an element, 5 seconds to click it) because you don't want to wait forever per selector. If a selector doesn't match in time, the code tries the next one. Why catch and continue? Each selector might fail for different reasons (element not found, element not clickable, wrong element type). You don't care why it failed. You just try the next option.

private async Task TypeTextAsync(IPage page, string selector, string value, TestRequest request)
    {
        // Replace placeholders with actual values
        var actualValue = value
            .Replace("{email}", request.Parameters.GetValueOrDefault("email", "test@example.com"))
            .Replace("{password}", request.Parameters.GetValueOrDefault("password", "Password123!"))
            .Replace("{username}", request.Parameters.GetValueOrDefault("username", "testuser"))
            .Replace("{search}", request.Parameters.GetValueOrDefault("search", "laptop"));
        var selectors = selector.Split(',').Select(s => s.Trim()).ToList();
        // Determine field type for better handling
        var isPasswordField = selector.Contains("password", StringComparison.OrdinalIgnoreCase) ||
                             value.Contains("{password}");
        // For password fields, wait longer as they often appear after email entry
        var waitTimeout = isPasswordField ? 5000 : 3000;
        // Try provided selectors first
        foreach (var sel in selectors)
        {
            try
            {
                await page.WaitForSelectorAsync(sel, new() 
                { 
                    State = WaitForSelectorState.Visible,
                    Timeout = waitTimeout 
                });
                var element = await page.QuerySelectorAsync(sel);
                if (element != null && await element.IsVisibleAsync())
                {
                    await element.ScrollIntoViewIfNeededAsync();
                    await element.ClickAsync(); // Focus the input
                    await page.WaitForTimeoutAsync(300);
                    await element.FillAsync(actualValue, new() { Timeout = 5000 });
                    _logger.LogInformation("Typed into element using selector: {Selector}", sel);
                    // After typing, wait a bit for any dynamic behavior
                    await page.WaitForTimeoutAsync(500);
                    return;
                }
            }
            catch
            {
                continue;
            }
        }
        // SMART FALLBACK: Try to find ANY visible input field that might match
        _logger.LogWarning("Standard selectors failed, attempting smart input discovery");

Placeholder replacement: The AI generates test plans with placeholders like {email} and {password}. This method replaces them with actual values from the TestRequest parameters or sensible defaults.
Why this pattern? It decouples test plan generation from actual test data. The same test plan can be reused with different credentials simply by changing the parameters.
FillAsync vs TypeAsync: FillAsync clears the field first, then fills it. TypeAsync simulates individual keystrokes. For forms, filling is faster and more reliable.
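To show how one plan can be reused with different data, here is a data-driven variant of the placeholder replacement. ResolvePlaceholders is a hypothetical helper, and the defaults dictionary below is an illustrative assumption rather than the repository's code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: replace {name} placeholders with request parameters,
// falling back to a default when a parameter was not supplied.
static string ResolvePlaceholders(
    string template,
    IReadOnlyDictionary<string, string> parameters,
    IReadOnlyDictionary<string, string> defaults)
{
    var result = template;
    foreach (var (key, fallback) in defaults)
    {
        var value = parameters.TryGetValue(key, out var supplied) ? supplied : fallback;
        result = result.Replace("{" + key + "}", value);
    }
    return result;
}

var defaults = new Dictionary<string, string>
{
    ["email"] = "test@example.com",
    ["password"] = "Password123!",
};

// Only the email was extracted from the user's message in this example.
var parameters = new Dictionary<string, string> { ["email"] = "user@test.com" };

var resolvedEmail = ResolvePlaceholders("{email}", parameters, defaults);
var resolvedPassword = ResolvePlaceholders("{password}", parameters, defaults);
Console.WriteLine(resolvedEmail);
Console.WriteLine(resolvedPassword);
```

Adding a new placeholder then only requires a new entry in the defaults dictionary, not another chained Replace call.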

Element Verification

private async Task VerifyElementAsync(IPage page, string selector)
    {
        var selectors = selector.Split(',').Select(s => s.Trim()).ToList();
        foreach (var sel in selectors)
        {
            try
            {
                await page.WaitForSelectorAsync(sel, new() 
                { 
                    State = WaitForSelectorState.Visible,
                    Timeout = 5000 
                });
                return;
            }
            catch
            {
                continue;
            }
        }
        throw new Exception($"Element not found or not visible: {selector}");
    }

WaitForSelectorState.Visible requires elements to be both present in the DOM and visible on screen. This prevents false positives where an element exists but is hidden with display: none. Verification is crucial: it is how you confirm that a test actually succeeded. After clicking login, you verify that the dashboard element appears. Without verification, you'd never know whether actions had the intended effect.

Screenshot Capture

private async Task<string> CaptureScreenshotAsync(IPage page, string jobId, string step)
    {
        var filename = $"{jobId}_{step}_{DateTime.UtcNow:HHmmss}.png";
        var path = Path.Combine(_artifactsPath, "screenshots", filename);
        await page.ScreenshotAsync(new() 
        { 
            Path = path,
            FullPage = true 
        });
        return path;
    }
}

FullPage = true captures the entire scrollable page, not just the viewport. This is better for documentation and debugging. The filename format {jobId}_{step}_{timestamp}.png makes screenshots self-documenting and easy to find. The jobId groups related screenshots together.
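Step names come from the AI-generated plan, so they may contain spaces or characters that are not valid in file names. The sketch below builds the same {jobId}_{step}_{timestamp}.png name with a sanitizing step added; BuildScreenshotName is a hypothetical helper, and the sanitization is an assumption that goes beyond what the article's code does.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper: build "{jobId}_{step}_{HHmmss}.png", replacing
// whitespace and characters that are invalid in file names with '-'.
static string BuildScreenshotName(string jobId, string step, DateTime utcNow)
{
    var invalid = Path.GetInvalidFileNameChars();
    var safeStep = new string(step
        .Select(c => invalid.Contains(c) || char.IsWhiteSpace(c) ? '-' : c)
        .ToArray());
    return $"{jobId}_{safeStep}_{utcNow:HHmmss}.png";
}

var screenshotName = BuildScreenshotName(
    "a1b2c3d4",
    "click login/submit",
    new DateTime(2025, 11, 25, 9, 5, 7, DateTimeKind.Utc));

Console.WriteLine(screenshotName);
```

Passing the timestamp in as a parameter, instead of reading DateTime.UtcNow inside the helper, keeps the function deterministic and easy to test.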

Videos are saved automatically when RecordVideoDir is configured. You just need to get the path after test completion.

Next, move to the communication channel.

Implement the WhatsApp Service

Now you integrate Twilio's WhatsApp API to send test results back to users.

using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Logging;
using QAChatOps.Core.Services;
using System.Linq;
using Twilio;
using Twilio.Exceptions;
using Twilio.Rest.Api.V2010.Account;
using Twilio.Types;
namespace QAChatOps.Infrastructure.Twilio;
public interface IWhatsAppService
{
    Task SendMessageAsync(string to, string message);
    Task SendMessageWithImagesAsync(string to, string message, List<string> imagePaths);
}
public class WhatsAppService : IWhatsAppService
{
    private readonly ILogger<WhatsAppService> _logger;
    private readonly string _twilioNumber;
    private readonly string _publicBaseUrl;
    public WhatsAppService(
        IConfiguration configuration,
        ILogger<WhatsAppService> logger)
    {
        _logger = logger;
        var accountSid = configuration["Twilio:AccountSid"];
        var authToken = configuration["Twilio:AuthToken"];
        _twilioNumber = configuration["Twilio:WhatsAppNumber"] 
            ?? throw new InvalidOperationException("Twilio WhatsApp number not configured");
        _publicBaseUrl = configuration["PublicBaseUrl"] ?? "https://your-domain.com";
        TwilioClient.Init(accountSid, authToken);
    }

TwilioClient.Init is a global initialization that configures the SDK with your credentials. You only need to call it once, typically in the constructor.
PublicBaseUrl is critical: WhatsApp media messages require publicly accessible URLs, so local file paths won't work. You'll convert paths like wwwroot/artifacts/screenshots/abc123.png to https://your-domain.com/artifacts/screenshots/abc123.png.

Sending Text Messages

The last method shown here, SendMessageWithImagesAsync, sends the report together with screenshots:

public async Task SendMessageWithImagesAsync(
        string to, 
        string message, 
        List<string> imagePaths)
    {
        try
        {
            // Convert local paths to public URLs
            var mediaUrls = imagePaths
                .Select(path => {
                    // Normalize separators first so Windows-style paths also match "wwwroot/"
                    var relativePath = path.Replace("\\", "/").Replace("wwwroot/", "");
                    return $"{_publicBaseUrl}/{relativePath}";
                })
                .Take(3) // WhatsApp media limit
                .Select(url => new Uri(url))
                .ToList();
            string Normalize(string num)
            {
                if (string.IsNullOrEmpty(num)) return num;
                if (num.StartsWith("whatsapp:")) return num;
                var n = num.StartsWith("+") ? num : "+" + num.TrimStart('0');
                return $"whatsapp:{n}";
            }
            var fromNum = Normalize(_twilioNumber);
            var toNum = Normalize(to);
            var messageResource = await MessageResource.CreateAsync(
                body: message,
                from: new PhoneNumber(fromNum),
                to: new PhoneNumber(toNum),
                mediaUrl: mediaUrls
            );
            _logger.LogInformation(
                "Sent WhatsApp message with {Count} images to {To}. SID: {Sid}",
                mediaUrls.Count, to, messageResource.Sid);
        }
        catch (ApiException apiEx)
        {
            _logger.LogError(apiEx, "Failed to send WhatsApp message with images to {To}: {Message}", to, apiEx.Message);
            return;
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Failed to send WhatsApp message with images to {To}", to);
            throw;
        }
    }
}

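Phone number format: Both from and to must use the WhatsApp address format, whatsapp:+1234567890. The incoming webhook already provides the sender in this format, and the Normalize helper adds the prefix when it is missing. Message SID: Twilio returns a unique identifier for each message. Log it so you can track and debug delivery issues. Error handling: Twilio API errors (ApiException) are logged and swallowed so a failed send does not crash the background job, while any other exception is logged and rethrown so the calling code can handle it.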
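To see how the normalization behaves on different inputs, here is a standalone copy of the Normalize local function. The WhatsAppAddress class name is only for illustration; the logic mirrors the method above.

```csharp
using System;

public static class WhatsAppAddress
{
    // Standalone copy of the Normalize local function, for illustration only.
    public static string Normalize(string num)
    {
        if (string.IsNullOrEmpty(num)) return num;

        // Already in WhatsApp address format: leave untouched.
        if (num.StartsWith("whatsapp:", StringComparison.Ordinal)) return num;

        // Add the leading "+" (dropping any leading zeros) when it is missing.
        var n = num.StartsWith("+", StringComparison.Ordinal) ? num : "+" + num.TrimStart('0');
        return $"whatsapp:{n}";
    }
}
```

With this logic, both +14155238886 and 0014155238886 become whatsapp:+14155238886. Spaces and dashes inside a number are not removed, so clean the input first if users may type them.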

Build the Webhook Controller

The webhook is the heart of the system. It receives WhatsApp messages, orchestrates all services, and sends responses.

using Microsoft.AspNetCore.Mvc;
using QAChatOps.Core.Services;
using QAChatOps.Core.Models;
using QAChatOps.Infrastructure.Twilio;
using Microsoft.Extensions.Logging;
using System.Text.RegularExpressions;
namespace QAChatOps.Api.Controllers;
[ApiController]
[Route("api/[controller]")]
public class WebhookController : ControllerBase
{
    private readonly IAITestGenerator _aiGenerator;
    private readonly ITestOrchestrator _orchestrator;
    private readonly IWhatsAppService _whatsApp;
    private readonly ILogger<WebhookController> _logger;
    private static readonly Regex UrlRegex = new Regex(
        @"https?://[^\s\)\]\}\,]+", 
        RegexOptions.IgnoreCase | RegexOptions.Compiled);
    private readonly string _baseUrl;
    public WebhookController(
        IAITestGenerator aiGenerator,
        ITestOrchestrator orchestrator,
        IWhatsAppService whatsApp,
        ILogger<WebhookController> logger,
        IConfiguration configuration)
    {
        _aiGenerator = aiGenerator;
        _orchestrator = orchestrator;
        _whatsApp = whatsApp;
        _logger = logger;
        _baseUrl = configuration["BaseUrl"] ?? "http://localhost:5000";
    }

Constructor injection: All dependencies are injected, which keeps the controller testable and follows SOLID principles. There are three core dependencies: the AI generator for intelligence, the orchestrator for execution, and the WhatsApp service for communication. The controller coordinates them.

Receiving WhatsApp Messages

[HttpPost("whatsapp")]
    public async Task<IActionResult> ReceiveWhatsAppMessage(
        [FromForm] WhatsAppWebhookRequest request)
    {
        _logger.LogInformation("Received WhatsApp message from {From}: {Body}",
            request.From, request.Body);
        // Process asynchronously
        _ = Task.Run(async () => await ProcessTestRequestAsync(request));
        return Ok();
    }

[FromForm]: Twilio sends webhook data as application/x-www-form-urlencoded, not JSON. This attribute tells ASP.NET Core to bind the request from form fields.

Fire and forget: _ = Task.Run(...) starts background processing without awaiting it, so the webhook returns HTTP 200 OK within milliseconds.

Why not await? Tests take 30-60 seconds, and Twilio gives up on a webhook that does not respond within roughly 15 seconds. By returning immediately, you acknowledge receipt while processing continues in the background.

The discard operator _: This tells the compiler that you are intentionally ignoring the Task. Without it, you'd get a warning about an unawaited async call.
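One caveat of fire and forget is that an exception escaping the background task is never observed, for example when the error reply itself fails to send. A minimal sketch of a wrapper that reports such failures is shown here; BackgroundWork and its Run method are hypothetical names, not part of the tutorial's code.

```csharp
using System;
using System.Threading.Tasks;

public static class BackgroundWork
{
    // Hypothetical helper: starts work in the background and reports failures
    // through onError instead of letting them go unobserved.
    public static Task Run(Func<Task> work, Action<Exception> onError)
    {
        return Task.Run(async () =>
        {
            try
            {
                await work();
            }
            catch (Exception ex)
            {
                // Last line of defense: nothing awaits this task, so handle it here.
                onError(ex);
            }
        });
    }
}
```

In the controller, this could be used as _ = BackgroundWork.Run(() => ProcessTestRequestAsync(request), ex => _logger.LogError(ex, "Background test run failed")); so the webhook still returns immediately while failures end up in the logs.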

Processing Test Requests

private async Task ProcessTestRequestAsync(WhatsAppWebhookRequest request)
    {
        var message = request.Body?.Trim() ?? string.Empty;
        var from = request.From;
        try
        {
            // Handle help command
            if (message.Equals("help", StringComparison.OrdinalIgnoreCase) ||
                message.Equals("start", StringComparison.OrdinalIgnoreCase))
            {
                await SendHelpMessageAsync(from);
                return;
            }
            // Send acknowledgment
            await _whatsApp.SendMessageAsync(
                from,
                "🤖 Got it! Analyzing your request and generating test plan...\n\n⏳ This usually takes 30-60 seconds.");
            // Step 1: Parse intent using AI
            var testRequest = await _aiGenerator.ParseIntentAsync(message);
            // Fallback: extract the URL with a regex if the AI did not return one
            if (string.IsNullOrEmpty(testRequest.Url))
            {
                var extractedUrl = ExtractUrlFromMessage(message);
                if (!string.IsNullOrEmpty(extractedUrl))
                {
                    testRequest = testRequest with { Url = extractedUrl };
                    _logger.LogInformation("URL extracted via regex fallback: {Url}", extractedUrl);
                }
            }
            _logger.LogInformation("Parsed intent: {Intent} for {Url}", 
                testRequest.TestIntent, testRequest.Url);
            // Validate URL after fallback attempt
            if (string.IsNullOrEmpty(testRequest.Url))
            {
                await _whatsApp.SendMessageAsync(
                    from,
                    "❌ I couldn't find a website URL in your message.\n\n" +
                    "Please include the URL. Example:\n" +
                    "Test login on https://example.com");
                return;
            }
            // Validate URL format
            if (!Uri.TryCreate(testRequest.Url, UriKind.Absolute, out var uri) || 
                (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps))
            {
                await _whatsApp.SendMessageAsync(
                    from,
                    $"❌ Invalid URL format: {testRequest.Url}\n\n" +
                    "Please provide a valid URL starting with http:// or https://");
                return;
            }
            // Step 2: Generate test plan using AI
            var testPlan = await _aiGenerator.GenerateTestPlanAsync(testRequest);
            await _whatsApp.SendMessageAsync(
                from,
                $"✅ Test plan created!\n\n" +
                $"📋 Steps: {testPlan.Steps.Count}\n" +
                $"🎯 Goal: {testPlan.Description}\n\n" +
                $"🚀 Executing tests now...");
            // Step 3: Execute test with Playwright
            var result = await _orchestrator.ExecuteTestAsync(testRequest, testPlan);
            // Step 4: AI analysis of results
            result = result with 
            { 
                AIAnalysis = await _aiGenerator.AnalyzeResultsAsync(result) 
            };
            // Step 5: Send results back to WhatsApp
            await SendTestResultsAsync(from, result);
            _logger.LogInformation("Test completed for {Intent}. Success: {Success}",
                testRequest.TestIntent, result.Success);
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Error processing test request");
            await _whatsApp.SendMessageAsync(
                from,
                $"❌ Oops! Something went wrong:\n\n{ex.Message}\n\n" +
                "Send 'help' to see examples.");
        }
    }
    private string? ExtractUrlFromMessage(string message)
    {
        var match = UrlRegex.Match(message);
        if (match.Success)
        {
            // Clean up any trailing punctuation
            var url = match.Value.TrimEnd('.', ',', ')', ']', '}', '!', '?');
            return url;
        }
        return null;
    }

Users receive at least three messages during execution:

Initial acknowledgment ("Got it!")
Test plan confirmation
Final results with screenshots

This keeps users engaged and informed. Without it, they'd wait 60 seconds wondering if their message was received.

URL validation: If neither the AI nor the regex fallback finds a URL, or the URL is not a valid http:// or https:// address, the controller fails fast with helpful guidance. There is no point executing a test without a target.
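The fallback extraction and the format check can be combined into one step. The sketch below reuses the same regex and trailing-punctuation cleanup as ExtractUrlFromMessage; the UrlGuard class and FindTestUrl method are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlGuard
{
    private static readonly Regex UrlRegex = new Regex(
        @"https?://[^\s\)\]\}\,]+",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns the first http(s) URL in the message, or null when none is usable.
    public static string? FindTestUrl(string message)
    {
        var match = UrlRegex.Match(message);
        if (!match.Success) return null;

        // Trim punctuation that commonly trails a URL at the end of a sentence.
        var candidate = match.Value.TrimEnd('.', ',', ')', ']', '}', '!', '?');

        // Accept only absolute URLs with an http or https scheme.
        var isValid = Uri.TryCreate(candidate, UriKind.Absolute, out var uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);

        return isValid ? candidate : null;
    }
}
```

For a message such as "Test login on https://example.com." the helper returns https://example.com, and for a message without any URL it returns null.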

The 5-step pipeline:

Parse natural language → TestRequest
Generate dynamic plan → TestPlan
Execute with browser → TestExecutionResult
Analyze with AI → Enhanced result
Format and send → WhatsApp message

The outer try/catch catches every exception and informs the user gracefully. No stack traces in WhatsApp, just a short, user-friendly error message.

Sending Test Results

Observe the following code.

private async Task SendTestResultsAsync(string to, TestExecutionResult result)
    {
        var icon = result.Success ? "✅" : "❌";
        var status = result.Success ? "*PASSED*" : "*FAILED*";
        // Base URL for report and artifact links, read from the BaseUrl setting
        var baseUrl = $"{_baseUrl}/api/artifacts";
        var report = $"{icon} *Test Results* {icon}\n";
        report += "━━━━━━━━━━━━━━━━━━━━\n\n";
        report += $"🌐 *URL:* {result.Url}\n";
        report += $"🎯 *Intent:* {result.TestIntent}\n";
        report += $"📊 *Status:* {status}\n";
        report += $"⏱️ *Duration:* {result.Duration.TotalSeconds:F1}s\n";
        report += $"📋 *Steps:* {result.ExecutedSteps.Count(s => s.Success)}/{result.ExecutedSteps.Count} succeeded\n\n";
        // AI Analysis Section

WhatsApp formatting: Bold text with asterisks (*bold*), emojis for visual hierarchy. This makes reports scannable on mobile devices. Step limit (8) prevents message overflow. If there are, for example, 15 steps, we show the first 8 and note there are more. This keeps messages readable.

The analysis from GPT-4 is embedded in the report. This gives context beyond pass/fail, explains why something failed and how to fix it. Conditional media: If screenshots exist, send them. If not (maybe the test failed immediately), send text only and link to the media for access. This handles edge cases gracefully.
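The step limit described above can be sketched as follows. This is an illustration under assumptions: StepResult is a hypothetical stand-in for the tutorial's executed-step type, and ReportFormatter is not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class ReportFormatter
{
    // Hypothetical step shape; the tutorial's real step type may differ.
    public record StepResult(string Description, bool Success);

    public static string FormatSteps(IReadOnlyList<StepResult> steps, int maxSteps = 8)
    {
        var sb = new StringBuilder();
        sb.Append("*Steps:*\n"); // asterisks render as bold in WhatsApp

        // Show only the first maxSteps entries so the message stays short.
        foreach (var (step, index) in steps.Take(maxSteps).Select((s, i) => (s, i)))
        {
            var icon = step.Success ? "✅" : "❌";
            sb.Append($"{index + 1}. {icon} {step.Description}\n");
        }

        // Tell the reader how many steps were left out.
        if (steps.Count > maxSteps)
        {
            sb.Append($"_...and {steps.Count - maxSteps} more_\n"); // underscores render as italics
        }

        return sb.ToString();
    }
}
```

With 15 steps and the default limit, the output lists steps 1 through 8 and ends with a line noting that 7 more steps were run.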

Help Message

private async Task SendHelpMessageAsync(string to)
    {
        var help = @"🤖 *Smart QA ChatOps Assistant*
I can test websites for you! Just tell me what to test in natural language.
*Examples:*
1️⃣ *Login Testing*
`Test login on https://example.com with user@test.com`
2️⃣ *Search Testing*
`Check if search works on https://amazon.com for laptops`
3️⃣ *Add to Cart*
`Try adding a product to cart on https://shop.com`
4️⃣ *Navigation*
`Test navigation on https://mysite.com - click about, then contact`
5️⃣ *Form Submission*
`Test contact form on https://example.com with name John Doe`
6️⃣ *Checkout Flow*
`Test the full checkout on https://store.com`
*Features:*
✨ AI-powered test generation
🎯 Dynamic element discovery
📸 Automatic screenshots
🔍 Intelligent error analysis
💬 Plain language commands
Just describe what you want to test!";
        await _whatsApp.SendMessageAsync(to, help);
    }
}
// DTOs
public record WhatsAppWebhookRequest
{
    public string MessageSid { get; init; } = string.Empty;
    public string From { get; init; } = string.Empty;
    public string Body { get; init; } = string.Empty;
}

Users typing "help" or "start" get comprehensive examples. This reduces friction and demonstrates capabilities immediately. Each example is carefully chosen to show different test types. Users can copy, modify, and send.

Serve Test Artifacts with the ArtifactsController

While sending screenshots and videos directly through WhatsApp seems convenient, it's impractical in production due to WhatsApp's 3-media-per-message limit, 16MB file size restrictions, and cost. Large media files may time out during Twilio's webhook processing, and there's no way to provide video playback controls or organize multiple artifacts effectively.

The HTML Report Solution

To solve these shortcomings, you will send a single clickable link in the WhatsApp analysis report that opens a comprehensive HTML report. CI tools such as GitHub Actions, CircleCI, and Jenkins use the same pattern of linking to hosted build artifacts. The ArtifactsController serves three endpoints:

/api/artifacts/report/{jobId} - Professional HTML report with screenshot grid, embedded video player, and trace download links
/api/artifacts/video/{fileName} - Video streaming with range requests for playback controls
/api/artifacts/trace/{fileName} - Downloadable Playwright trace files for debugging

This approach delivers unlimited screenshots, full video playback controls, persistent storage, instant WhatsApp delivery, lower costs, and easy sharing across teams. Reports remain accessible for as long as the server keeps the files, and they can be archived for compliance.

The complete controller with responsive HTML templating and video streaming is available in the GitHub repository. This webhook → process → notify with link pattern is scalable, reliable, and provides superior UX compared to inline media delivery. Next, proceed to configure the application to wrap up the development.
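To give a feel for the video endpoint before you open the repository, here is a minimal sketch. It is not the repository's controller: the class name, the .webm content type (the format Playwright records), and the storage folder are assumptions based on the directories created in Program.cs.

```csharp
using System.IO;
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/artifacts")]
public class ArtifactsSketchController : ControllerBase
{
    // Assumption: videos are stored under wwwroot/artifacts/videos.
    private static readonly string VideoRoot =
        Path.Combine(Directory.GetCurrentDirectory(), "wwwroot", "artifacts", "videos");

    // Keeps only the file name so "../" segments cannot escape the root folder.
    public static string ResolveUnderRoot(string root, string fileName) =>
        Path.Combine(root, Path.GetFileName(fileName));

    [HttpGet("video/{fileName}")]
    public IActionResult GetVideo(string fileName)
    {
        var fullPath = ResolveUnderRoot(VideoRoot, fileName);

        if (!System.IO.File.Exists(fullPath))
        {
            return NotFound();
        }

        // enableRangeProcessing lets browsers seek by requesting byte ranges.
        return PhysicalFile(fullPath, "video/webm", enableRangeProcessing: true);
    }
}
```

Range processing is what makes the seek bar work in the embedded player: the browser asks for byte ranges instead of downloading the whole file. The sketch is for reading only; adding it next to the real ArtifactsController would create a route conflict.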

Configure the Application

Program.cs is where everything is wired together: it registers the services, configures the middleware, and loads the application settings.

Program.cs Setup

This is the content of your QAChatOps.Api/Program.cs.

using QAChatOps.Core.Services;
using QAChatOps.Infrastructure.OpenAI;
using QAChatOps.Infrastructure.Twilio;
var builder = WebApplication.CreateBuilder(args);
// Add services
builder.Services.AddControllers();
builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();
builder.Services.AddHttpClient<IAITestGenerator, AITestGenerator>();
// Configure Playwright
builder.Services.AddSingleton(sp =>
{
    return Microsoft.Playwright.Playwright.CreateAsync().GetAwaiter().GetResult();
});
// Configure core services (IAITestGenerator is already registered above via AddHttpClient)
builder.Services.AddSingleton<QAChatOps.Core.Services.ITestOrchestrator, QAChatOps.Core.Services.TestOrchestrator>();
builder.Services.AddSingleton<QAChatOps.Infrastructure.Twilio.IWhatsAppService, QAChatOps.Infrastructure.Twilio.WhatsAppService>();
// Configure logging
builder.Logging.AddConsole();
builder.Logging.AddDebug();
// Serve static files (for screenshots)
builder.Services.AddDirectoryBrowser();
// Ensure artifacts directory exists early so DirectoryBrowser's PhysicalFileProvider
// doesn't throw when the app is built.
var artifactsPath = Path.Combine(Directory.GetCurrentDirectory(), "wwwroot", "artifacts");
Directory.CreateDirectory(Path.Combine(artifactsPath, "screenshots"));
Directory.CreateDirectory(Path.Combine(artifactsPath, "traces"));
Directory.CreateDirectory(Path.Combine(artifactsPath, "videos"));
var app = builder.Build();
// Configure middleware
if (app.Environment.IsDevelopment())
{
    app.UseSwagger();
    app.UseSwaggerUI();
}
// Serve static files from wwwroot
app.UseStaticFiles();
// Enable directory browsing for artifacts (dev only)
if (app.Environment.IsDevelopment())
{
    app.UseDirectoryBrowser(new DirectoryBrowserOptions
    {
        FileProvider = new Microsoft.Extensions.FileProviders.PhysicalFileProvider(
            Path.Combine(Directory.GetCurrentDirectory(), "wwwroot", "artifacts")),
        RequestPath = "/artifacts"
    });
}
app.UseHttpsRedirection();
app.UseAuthorization();
app.MapControllers();
app.Run();

Application Configuration

Review the file QAChatOps.Api/appsettings.json.

{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning",
      "Microsoft.Playwright": "Information"
    }
  },
  "AllowedHosts": "*",
  "ArtifactsPath": "wwwroot/artifacts",
  "PublicBaseUrl": "https://your-ngrok-url.ngrok-free.app",
  "Twilio": {
    "AccountSid": "ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "AuthToken": "your_auth_token_here",
    "WhatsAppNumber": "whatsapp:+14155238886"
  },
  "OpenAI": {
    "ApiKey": "sk-proj-xxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "Model": "gpt-4"
  }
}

This file contains placeholder values that you need to replace.

ArtifactsPath: Relative path from the application root. Using wwwroot/artifacts ensures files are web-accessible.

PublicBaseUrl: Your ngrok URL during development (you'll set this up next) and your actual domain in production. This is critical: without it, WhatsApp can't access screenshots.

Twilio configuration:

AccountSid: Found in Twilio Console dashboard
AuthToken: Found in Twilio Console (keep this secret!)
WhatsAppNumber: Your Twilio WhatsApp number in format whatsapp:+14155238886

OpenAI configuration:

ApiKey: Get from OpenAI Platform
Model: gpt-4 is a solid default. Depending on your budget, you can switch to a more capable model for better results.

Never commit appsettings.json with real secrets to Git. Use environment variables or Azure Key Vault in production. For development, consider appsettings.Development.json which you can gitignore.
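ASP.NET Core's default configuration also reads environment variables, using a double underscore in place of the colon (for example, Twilio__AuthToken for Twilio:AuthToken). Wherever the values come from, it helps to fail fast at startup when one is missing. The sketch below is one possible way to do that; RequiredSettings and FindMissing are hypothetical names, and the key list is taken from the appsettings.json shown earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RequiredSettings
{
    // Keys taken from the appsettings.json shown earlier.
    public static readonly string[] Keys =
    {
        "PublicBaseUrl",
        "Twilio:AccountSid",
        "Twilio:AuthToken",
        "Twilio:WhatsAppNumber",
        "OpenAI:ApiKey"
    };

    // Returns the keys whose value is missing or blank.
    // `lookup` is any function that resolves a configuration key,
    // for example key => builder.Configuration[key] in Program.cs.
    public static IReadOnlyList<string> FindMissing(Func<string, string?> lookup)
    {
        return Keys.Where(key => string.IsNullOrWhiteSpace(lookup(key))).ToList();
    }
}
```

In Program.cs you could call RequiredSettings.FindMissing(key => builder.Configuration[key]) before builder.Build() and throw an InvalidOperationException that lists the missing keys, so a misconfigured deployment stops with a clear message instead of failing on the first WhatsApp request.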

Running the Application with ngrok

Since Twilio needs to send webhooks to your application, you need to expose localhost to the internet. ngrok is perfect for this.

Step 1: Start Your Application

Run the following commands to start the application.

cd QAChatOps.Api 
dotnet run

Step 2: Start ngrok

Open a new terminal window (keep your app running) and start ngrok:

ngrok http 5000

Note that you may need to adjust the port number to the port on which your application is running.

Copy the HTTPS URL provided (e.g., https://abc123.ngrok-free.app). ngrok creates a secure tunnel from a public URL to your localhost. Twilio can now reach your webhook at https://abc123.ngrok-free.app/api/webhook/whatsapp. Open http://127.0.0.1:4040 to inspect all webhook requests in real time. This is invaluable for debugging because you can see exactly what Twilio is sending.

Step 3: Update Configuration

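Update your appsettings.Development.json with your ngrok URL. Use the base URL only, without the webhook path, because the application appends artifact paths to it. The webhook controller also reads a BaseUrl setting for report links, so set it to the same value to make those links reachable from outside your machine:

{
  "PublicBaseUrl": "https://abc123.ngrok-free.app",
  "BaseUrl": "https://abc123.ngrok-free.app"
}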

Restart your application for the change to take effect. In your app terminal, press Ctrl+C, then:

dotnet run

Step 4: Configure Twilio Webhook

Go back to Twilio Console → Messaging → Try it out → Send a WhatsApp message
Scroll to Sandbox Configuration. You may need to first join the sandbox using the Join code provided.
In the When a message comes in field, enter your ngrok URL followed by the webhook route, for example: https://abc123.ngrok-free.app/api/webhook/whatsapp
Method: HTTP POST
Click Save

Send "join code" to whatsapp:+14155238886 (use the join code and number shown in your Twilio Console) to connect your WhatsApp to the sandbox.

Testing the webhook: Send any message to your Twilio WhatsApp number. Check your application logs. You should see:

info: QAChatOps.Api.Controllers.WebhookController[0]
      Received WhatsApp message from whatsapp:+1234567890: Hello

If you see this, your webhook is working!
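You can also exercise the webhook without WhatsApp by posting the same form fields Twilio sends. The sketch below is a minimal smoke test under assumptions: WebhookSmokeTest is a hypothetical helper, the SID is a placeholder, and the URL you pass in depends on the port your application listens on.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

public static class WebhookSmokeTest
{
    // Builds a form-urlencoded body with the three fields the controller binds.
    public static FormUrlEncodedContent BuildPayload(string from, string body)
    {
        return new FormUrlEncodedContent(new Dictionary<string, string>
        {
            ["MessageSid"] = "SM00000000000000000000000000000000", // placeholder SID
            ["From"] = from,
            ["Body"] = body
        });
    }

    // Posts the payload to the webhook and returns the HTTP status code.
    public static async Task<int> SendAsync(string webhookUrl, string from, string body)
    {
        using var client = new HttpClient();
        using var response = await client.PostAsync(webhookUrl, BuildPayload(from, body));
        return (int)response.StatusCode;
    }
}
```

For example, await WebhookSmokeTest.SendAsync("http://localhost:5000/api/webhook/whatsapp", "whatsapp:+1234567890", "help") should produce the same log line as a real message. Keep in mind that the application will then try to send a WhatsApp reply to the From number you supply.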

Testing Your QA ChatOps Assistant

Now for the exciting part! It's time to test the complete system.

Test 1: Search Functionality

Send this WhatsApp message:

Check if search works on https://amazon.com for MacBook

The screenshot below shows what happens.

Screenshot of testing results for Amazon search functionality on iPhone 17 Pro Max, showing steps and AI analysis.

You can also look at the logs to see what's happening. To access the media files, click the link provided in the WhatsApp message, or copy it and paste it into your browser. Make sure your app is still up and running.

Test 2: Login Flow

Send this message:

Test the login page of this website. Create an account and use those details for performing the test.

Credentials for performing test:

username: YourUserName

password: YourPassword

https://tikitiafrika.com/login

Check out the screenshots below to see the results of your application.

WhatsApp chat showing automated test plan execution for login functionality of a website and its results.

See the logs in your terminal to understand every step happening in your login workflow test.

Screenshot of a Python script with automated steps for testing a login function, filling email, and password fields.

Code defines actions for login and verification including wait times and button clicks

The screenshots below illustrate each step of the login workflow.

Login page screenshot with email input filled and an option for password recovery.

Login page with welcome message on left and login form on right

Website showing event cards for Spooky Season and Nostalgia Marafiki by Tikiti Afrika

To access the full report with the media and traces, click the link provided in the WhatsApp message, or copy it and paste it into your browser. Make sure your app is still up and running. The screenshot below shows the report.

A test report dashboard showcasing screenshots, a video recording, and options to download the video and playwright trace.

Feel free to perform other types of tests by just chatting through WhatsApp.

Troubleshooting Common Issues

Issue 1: "OpenAI API key not configured"

Symptoms: Application crashes on startup with this error.

Solution: Ensure appsettings.json has your OpenAI API key:

"OpenAI": {
  "ApiKey": "sk-proj-your-actual-key-here"
}

Testing: After adding the key, restart your app. You should see the startup log without errors.

Issue 2: WhatsApp messages not received

Symptoms: Sending messages to Twilio number, but webhook never fires.

Diagnostics:

Check ngrok is running: http://127.0.0.1:4040
Check Twilio webhook URL matches your ngrok URL exactly
Ensure your ngrok URL is HTTPS, not HTTP
Verify you joined the WhatsApp sandbox (send the join code)

Solution:

# Verify webhook URL in Twilio matches:
https://your-ngrok-url.ngrok-free.app/api/webhook/whatsapp
# Check ngrok web UI for incoming requests
# Even failed requests will show up here

Issue 3: "Failed to generate test plan"

Symptoms: Bot responds but says it couldn't generate a plan.

Causes:

OpenAI API quota exceeded
Network connectivity to OpenAI
Invalid API key

Diagnostics: Check application logs for OpenAI API errors:

dotnet run --logging:loglevel:default=Debug

Look for HTTP error codes in logs (401 = invalid key, 429 = rate limit).

Issue 4: Tests timeout frequently

Symptoms: Many tests fail with timeout errors after 30 seconds.

Causes:

Slow internet connection
Target website is slow
Selectors not found

Solution: Increase timeouts in TestOrchestrator.cs:

await page.GotoAsync(url, new() 
{ 
    WaitUntil = WaitUntilState.NetworkIdle,
    Timeout = 60000 // Increase from 30000
});

Better solution: Ask AI to generate more resilient selectors by modifying the system prompt in AITestGenerator.cs.
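As a starting point for that prompt change, here is a sketch of a guidance fragment you might append to the existing system prompt. The wording and the SelectorGuidance class are hypothetical; adapt them to whatever prompt AITestGenerator.cs already uses.

```csharp
public static class SelectorGuidance
{
    // Hypothetical prompt fragment; adapt the wording to your existing system prompt.
    public const string Text =
        "When choosing selectors, prefer in this order: " +
        "(1) visible text or accessible role and name, " +
        "(2) label, placeholder, or name attributes, " +
        "(3) stable id or data-testid attributes. " +
        "Avoid auto-generated class names and long positional CSS or XPath chains.";

    // Appends the guidance to an existing system prompt, separated by a blank line.
    public static string AppendTo(string systemPrompt) =>
        systemPrompt.TrimEnd() + "\n\n" + Text;
}
```

Selectors based on roles, labels, and visible text tend to survive markup changes better than positional ones, which reduces timeouts caused by elements that are never found.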

Next Steps and Enhancements

Now that you have a working QA ChatOps assistant, consider these improvements:

Scheduled recurring tests - Use Hangfire for hourly/daily smoke tests
Slack/Teams integration - Send results to team channels, not just WhatsApp
API testing capabilities - Extend beyond UI to test REST/GraphQL endpoints
Blob storage integration - Replace local file storage with Azure Blob or S3
Add test history database - Track all test runs, success rates, and trends

Your intelligent testing system is now operational and ready to revolutionize how your team performs quality assurance. Simply send a WhatsApp message describing what you want to test, and watch as AI interprets your intent, generates dynamic test plans, executes them with Playwright, and delivers comprehensive results with screenshots. Your ChatOps QA assistant is ready to transform your quality assurance workflow!

Jacob Snipes is a seasoned AI Engineer who transforms complex communication challenges into seamless, intelligent solutions. With a strong foundation in full-stack development and a keen eye for innovation, he crafts applications that enhance user experiences and streamline operations. Jacob's work bridges the gap between cutting-edge technology and real-world applications, delivering impactful results across various industries.