Memory and tool patterns
TAC provides multiple ways to retrieve memory and define tools. This guide explains when to use each pattern and the tradeoffs involved.
For memory retrieval, TAC offers three patterns. Choose the one that best fits your use case.
Enable automatic memory retrieval in your channel configuration. TAC fetches memory before calling your on_message_ready callback.
```python
from tac.channels.sms import SMSChannel, SMSChannelConfig
from tac.channels.voice import VoiceChannel, VoiceChannelConfig

sms_channel = SMSChannel(
    tac,
    config=SMSChannelConfig(auto_retrieve_memory=True),
)

voice_channel = VoiceChannel(
    tac,
    config=VoiceChannelConfig(auto_retrieve_memory=True),
)

async def handle_message_ready(
    message: str,
    context: ConversationSession,
    memory: TACMemoryResponse | None,
) -> str | None:
    # Memory is already fetched and passed to your callback
    if memory:
        observations = memory.observations
        profile = context.profile
```
When to use:
- Most production agents
- When you need memory for every message
- When you want TAC to handle memory retrieval timing
Performance: One memory API call per message. TAC fetches memory in parallel with other operations.
Retrieve memory on demand within your callback when you need conditional logic or custom queries.
```python
from tac.channels.sms import SMSChannel, SMSChannelConfig

sms_channel = SMSChannel(
    tac,
    config=SMSChannelConfig(auto_retrieve_memory=False),  # Disabled
)

async def handle_message_ready(
    message: str,
    context: ConversationSession,
    memory: TACMemoryResponse | None,  # Will be None
) -> str | None:
    # Conditionally retrieve memory
    if needs_personalization(message):
        memory = await tac.retrieve_memory(context, query="customer preferences")

    # Or retrieve with a specific query
    if "order" in message.lower():
        memory = await tac.retrieve_memory(context, query="order history")
```
When to use:
- Memory only needed for certain message types
- Custom semantic queries based on message content
- Cost optimization (fewer memory API calls)
- Debugging or testing without memory
Performance: You control when memory API calls happen. Can reduce costs by skipping memory for simple queries.
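The manual-retrieval snippet above calls a `needs_personalization` helper without defining it. A minimal sketch of such a gate might look like the following; the keyword list is purely illustrative (not part of TAC) and should be tuned for your domain:

```python
# Hypothetical helper: decide whether a message warrants a memory lookup.
# The keyword set below is an assumption for illustration only.
PERSONALIZATION_KEYWORDS = {"my", "last time", "again", "usual", "prefer"}

def needs_personalization(message: str) -> bool:
    """Return True if the message likely benefits from stored customer context."""
    lowered = message.lower()
    return any(keyword in lowered for keyword in PERSONALIZATION_KEYWORDS)
```

A simple substring check like this keeps the gate cheap; you could also use a lightweight classifier if keyword matching proves too coarse.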
Use the OpenAI adapter to inject memory automatically into OpenAI API calls without manual prompt building.
```python
from openai import AsyncOpenAI
from tac.adapters.openai import with_tac_memory

openai_client = AsyncOpenAI()

async def handle_message_ready(
    message: str,
    context: ConversationSession,
    memory: TACMemoryResponse | None,
) -> str | None:
    # Wrap client with memory - TAC injects memory into system prompt
    client = with_tac_memory(openai_client, memory, context)

    # Memory is automatically injected as system message or instructions
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=conversation_history[context.conversation_id],
    )

    # Or with Responses API
    response = await client.responses.create(
        model="gpt-5.4-mini",
        instructions="You are a helpful agent.",
        input=conversation_history[context.conversation_id],
    )
```
When to use:
- OpenAI-specific projects (Chat Completions or Responses API)
- When you want automatic memory formatting
- When you don't need a custom memory prompt structure
Performance: Same as auto-retrieval (memory fetched once per message), but saves you from manual prompt building.
| Pattern | Setup Complexity | Control | Performance | Best For |
|---|---|---|---|---|
| Auto-retrieval | Low | Medium | Good | Most agents |
| Manual retrieval | Medium | High | Best (optimizable) | Conditional memory, custom queries |
| OpenAI adapter | Low | Low | Good | OpenAI projects |
For custom tools (API calls, database queries, business logic), use your LLM framework's native tool definitions. TAC is framework-agnostic and works with any tool format your LLM supports.
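As an illustration of a framework-native tool, here is a custom order-lookup tool expressed in OpenAI's function-calling schema. The tool name, parameters, and purpose are hypothetical examples, not part of TAC:

```python
# A custom tool expressed in OpenAI's function-calling JSON schema.
# TAC does not need to know about it; you pass it directly to your LLM call.
# Everything here (name, parameters) is illustrative.
order_lookup_tool = {
    "type": "function",
    "function": {
        "name": "lookup_order",
        "description": "Look up the status of a customer order by its order ID.",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "The order's ID string.",
                },
            },
            "required": ["order_id"],
        },
    },
}
```

Because TAC stays out of the tool-calling loop for custom tools, the same business logic works unchanged if you later switch LLM frameworks; only the schema wrapper changes.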
TAC provides built-in tools for common agent operations like knowledge search, memory retrieval, and escalation to human agents.
```python
from tac.tools import (
    create_knowledge_tool,
    create_memory_tool,
    create_studio_handoff_tool,
)

# Knowledge base search
knowledge_tool = create_knowledge_tool(
    tac=tac,
    knowledge_base_id="know_knowledgebase_xxxxx",
    name="search_docs",
    description="Search product documentation",
)

# Memory recall with custom query
memory_tool = create_memory_tool(
    tac=tac,
    name="recall_history",
    description="Recall past interactions with customer",
)

# Studio Flow handoff (human escalation)
handoff_tool = create_studio_handoff_tool(
    tac=tac,
    flow_sid="FWxxxxx",
    name="escalate_to_human",
    description="Transfer to human agent",
)

tac.register_tool(knowledge_tool)
tac.register_tool(memory_tool)
tac.register_tool(handoff_tool)
```
When to use:
- Knowledge search across your Enterprise Knowledge bases
- Memory recall for past customer interactions
- Escalation to human agents via Studio Flows
- Auto-retrieval: Fetches memory in parallel with other TAC operations. Minimal latency impact.
- Manual retrieval: Only fetches when called. Can reduce costs by skipping memory for simple queries.
- OpenAI adapter: Same performance as auto-retrieval, but formats memory automatically.
- Built-in tools: Include Twilio API call latency for knowledge search, memory recall, and escalation operations.
- Start with auto-retrieval: Use auto-retrieval unless you have a specific reason not to. It's simple and performant.
- Use manual retrieval for optimization: If memory is only needed for 20% of messages, manual retrieval can save 80% of memory API calls.
- Prefer built-in tools: Use TAC's built-in tools (knowledge, memory, handoff) when possible. They handle Twilio API integration correctly.
- Keep tools focused: Each tool should do one thing well. Don't create mega-tools that handle multiple operations.
- Document tool parameters: LLMs rely on your tool descriptions. Be specific about what each parameter does and when to use the tool.
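To make the last two practices concrete, here is a side-by-side sketch of a vague tool description versus a focused, well-documented one. The schemas are illustrative (plain function-calling-style dicts, not a TAC API):

```python
# Vague description: the LLM has to guess when and how to call the tool.
vague = {"name": "search", "description": "Search stuff."}

# Focused, well-documented description: states what the tool does, when to
# use it, when NOT to use it, and what each parameter means.
specific = {
    "name": "search_docs",
    "description": (
        "Search the product documentation knowledge base. Use this when the "
        "customer asks how a feature works or how to configure the product. "
        "Do not use it for account- or order-specific questions."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "A short natural-language search query, such as "
                               "the customer's question rephrased as keywords.",
            },
        },
        "required": ["query"],
    },
}
```

The second form costs a few extra tokens per request but substantially reduces wrong-tool and wrong-argument calls, which is usually the better trade.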
- Channels: Learn about channel configuration and transport mechanisms.
- Core concepts: Understand TAC's architecture and component relationships.