Complete Agent Setup Guide
Everything you need to know about creating and configuring AI voice agents on the EWT Voice Agent platform.
1. Overview
An agent is the central building block of the EWT Voice Agent platform. Each agent is an independent AI persona that can receive inbound calls, make outbound calls, and handle SMS conversations. When a phone number rings, the platform routes the call to the agent assigned to that number and orchestrates a real-time pipeline:
- Telephony — Twilio connects the caller and streams audio via WebSocket.
- Speech-to-Text (STT) — Deepgram transcribes the caller's speech in real time (default model: nova-2).
- LLM Reasoning — The transcript is sent to an LLM (Anthropic Claude or any of 200+ models via OpenRouter) along with the agent's system prompt.
- Text-to-Speech (TTS) — The LLM response is streamed to ElevenLabs or Deepgram TTS, which generates natural-sounding speech played back to the caller.
You create agents via the dashboard UI or programmatically through POST /api/agents. Each agent belongs to your organization (tenant) and can be cloned, versioned, exported, and imported.
2. Core Settings
These fields define the agent's identity and how it greets callers.
| Field | Type | Default | Description |
|---|---|---|---|
| name (required) | string | — | Display name for the agent. Also injected into the system prompt so the LLM knows its own name. |
| first_message | string | null | The opening line the agent speaks when it answers a call. Example: "Thanks for calling Acme Corp! How can I help?" |
| first_message_mode | string | "assistant-speaks-first" | Controls who talks first. Set to "assistant-speaks-first" so the agent greets the caller immediately, or "user-speaks-first" to wait for the caller to speak before responding. |
| system_prompt | string | null | The core instruction set for the LLM. Defines personality, rules, knowledge, and behavior. Supports {{variable}} interpolation (see Metadata & Variables). |
| identity_prompt | string | null | A separate prompt fragment focused on who the agent is. Useful for keeping identity details out of the main system prompt. |
| tone | string | "friendly" | A label for the agent's conversational style. Common values: friendly, professional, casual, empathetic, enthusiastic. |
| language | string | "en" | Primary language code (e.g., en, es, fr). |
| greeting_template | string | null | A dynamic greeting template. Supports variables for personalized hellos when caller info is known. |
| is_active | boolean | true | Toggle the agent on or off without deleting it. |
Always set both a first_message and a system_prompt. An agent without a system prompt will fall back to generic "helpful assistant" behavior, which produces vague, off-brand responses.

Writing Effective System Prompts
The system prompt is the single most important setting for your agent. A great prompt produces natural, on-brand calls. A vague prompt produces generic, unhelpful responses. Here's the formula:
1. Identity — Tell the AI who it is.
You are Sarah, a scheduling coordinator at Bright Dental.
2. Goal — What should the agent accomplish on every call?
Your job is to help callers book, reschedule, or cancel appointments.
3. Personality — How should it sound?
Be warm, friendly, and concise. Keep responses to 1-2 sentences. Use casual language like a real person on the phone.
4. Boundaries — What should it never do?
Never diagnose medical conditions. If asked about pricing, say "I'd be happy to have our billing team reach out." Do not discuss competitors.
5. Handling unknowns — What if the caller asks something unexpected?
If you don't know the answer, say "Let me take your info and have someone get back to you" and use the takeMessage tool.
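The five parts above can be assembled mechanically. Here is a minimal Python sketch; the part names, the example text, and the build_system_prompt helper are illustrative, not a platform API:

```python
# Illustrative: compose a system prompt from the five recommended parts.
PROMPT_PARTS = {
    "identity": "You are Sarah, a scheduling coordinator at Bright Dental.",
    "goal": "Your job is to help callers book, reschedule, or cancel appointments.",
    "personality": ("Be warm, friendly, and concise. Keep responses to 1-2 sentences. "
                    "Use casual language like a real person on the phone."),
    "boundaries": ("Never diagnose medical conditions. If asked about pricing, say "
                   "\"I'd be happy to have our billing team reach out.\""),
    "unknowns": ("If you don't know the answer, say \"Let me take your info and have "
                 "someone get back to you\" and use the takeMessage tool."),
}

def build_system_prompt(parts: dict) -> str:
    """Join the prompt parts in the recommended order, one paragraph each."""
    order = ["identity", "goal", "personality", "boundaries", "unknowns"]
    return "\n\n".join(parts[key] for key in order if parts.get(key))

system_prompt = build_system_prompt(PROMPT_PARTS)
```

The resulting string is what you would send as the system_prompt field when creating the agent.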
3. LLM Configuration
Control which language model powers your agent and how it generates responses.
| Field | Type | Default | Description |
|---|---|---|---|
| model | string | "anthropic/claude-haiku-4.5" | The LLM model identifier. Use OpenRouter-style names (e.g., anthropic/claude-sonnet-4, google/gemini-2.0-flash-001, openai/gpt-4o-mini). Over 200 models are available through OpenRouter. |
| llm_provider | string | "openrouter" | Which LLM provider to route through. Typically "openrouter" (default) or "anthropic" for direct Anthropic API access. |
| temperature | number | 0.7 | Controls randomness. Range: 0.0 (deterministic) to 1.0 (creative). Use 0.3–0.5 for structured tasks like appointment booking; use 0.7–0.9 for casual conversation. |
| max_tokens | integer | 300 | Maximum tokens per LLM response. Keep this low (150–400) for voice agents — long responses sound unnatural on the phone. |
| llm_base_url | string | null | Custom LLM endpoint URL. Use this to point at a self-hosted model, an Azure OpenAI deployment, or any OpenAI-compatible API. Example: https://my-llm.example.com/v1 |
| llm_api_key | string | null | API key for the custom LLM endpoint. Only needed when using llm_base_url. This overrides the organization-level OpenRouter key for this agent. |
Choosing a Model
For most voice agents, anthropic/claude-haiku-4.5 offers the best balance of speed, quality, and cost. Haiku's low latency is critical for natural-sounding phone conversations. If you need stronger reasoning (e.g., complex sales qualification), consider anthropic/claude-sonnet-4 but be aware of slightly higher latency.
When using llm_base_url, the endpoint must be OpenAI API-compatible (it must accept /chat/completions requests with the standard message format).

4. Call Control
These settings govern the real-time behavior of voice calls — when the agent speaks, when it listens, and when it hangs up.
| Field | Type | Default | Description |
|---|---|---|---|
| max_call_duration | integer | 300 | Maximum call length in seconds. The call is automatically ended after this time. 300 = 5 minutes. Set to 600 for support calls or 900 for complex sales calls. |
| endpointing | integer | 200 | Silence threshold in milliseconds. After the caller stops talking, the agent waits this long before treating the utterance as complete and generating a response. Lower = faster but may cut off the caller. Higher = more patient but feels sluggish. Recommended range: 150–400. |
| bargein_threshold | integer | 3 | Number of words the caller must speak before the agent stops talking (barge-in). At 3, the caller saying "wait hold on" will interrupt the agent. Set to 1 for maximum responsiveness or 5–8 to prevent accidental interruptions. |
| response_delay_ms | integer | 0 | Artificial delay before the agent speaks, in milliseconds. Adds a more human-like pause. 0 for instant, 200–500 for a natural feel. |
| end_call_phrases | string[] | [] | Array of phrases that trigger an automatic hang-up when the agent says them. Example: ["goodbye", "bye", "have a great day"]. The agent will speak the phrase and then end the call. |
| idle_timeout_seconds | number | 15 | Seconds of silence before the agent considers the caller idle and sends a prompt message. Default was changed from 7.5 to 15 to avoid being too aggressive on phone calls. |
| idle_messages | string[] | [] | Messages the agent cycles through when the caller goes silent. Example: ["Are you still there?", "I'm here whenever you're ready.", "Would you like me to repeat anything?"] |
| idle_max_triggers | integer | 3 | Maximum number of idle prompts before the idle_action is taken. |
| idle_action | string | "message" | What happens after all idle triggers are exhausted. "message" keeps the call open; alternatively the agent can end the call. |
| chunk_size | integer | 50 | Number of characters per TTS chunk. Smaller chunks start speaking sooner (lower latency) but may sound choppy. 50 is a good balance. |
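The interplay of endpointing and bargein_threshold can be sketched as two small decision functions. This is a simplified model, assuming endpointing acts as a plain silence timer and barge-in as a word count, as the table describes; it is not the platform's actual implementation:

```python
# Simplified models of two call-control decisions (illustrative only).
def utterance_complete(silence_ms: int, endpointing: int = 200) -> bool:
    """The caller's turn ends once silence reaches the endpointing threshold."""
    return silence_ms >= endpointing

def should_barge_in(caller_words: list, bargein_threshold: int = 3) -> bool:
    """While the agent is speaking, stop once the caller has said
    at least bargein_threshold words."""
    return len(caller_words) >= bargein_threshold

assert not utterance_complete(120)   # brief mid-sentence pause: keep listening
assert utterance_complete(250)       # silence passed 200 ms: generate a response
assert should_barge_in("wait hold on".split())   # 3 words: interrupt the agent
assert not should_barge_in("um".split())         # 1 word: keep talking
```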
Setting endpointing below 150 will frequently cut callers off mid-sentence. Setting it above 500 makes the agent feel unresponsive. Start with 200–300 and tune from there.

5. Business Hours
Restrict when your agent answers calls. Outside configured hours, callers hear an after-hours message instead of engaging with the AI.
| Field | Type | Default | Description |
|---|---|---|---|
| business_hours | object (JSONB) | null | A JSON object keyed by day abbreviation (mon, tue, wed, thu, fri, sat, sun). Each day has start and end times in 24-hour format. Omit a day to mark it as closed. |
| timezone | string | "America/New_York" | IANA timezone string for interpreting business hours. Examples: America/Chicago, America/Los_Angeles, Europe/London. |
| after_hours_message | string | null | Message played to callers outside business hours. Example: "Our office is currently closed. Please call back Monday through Friday, 9 AM to 5 PM Eastern." |
Business Hours JSON Format
{
"mon": { "start": "09:00", "end": "17:00" },
"tue": { "start": "09:00", "end": "17:00" },
"wed": { "start": "09:00", "end": "17:00" },
"thu": { "start": "09:00", "end": "17:00" },
"fri": { "start": "09:00", "end": "17:00" }
}
In this example, Saturday and Sunday are omitted, so the agent will play the after_hours_message on weekends. Times are interpreted in the agent's configured timezone.
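A sketch of how such a schedule can be evaluated, mirroring the documented semantics (null means always open, an omitted day means closed); this is not the platform's actual code:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

DAY_KEYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]

def is_within_business_hours(business_hours, timezone, now=None):
    """True if `now` falls inside the configured hours.

    A null/empty business_hours means the agent answers 24/7; an omitted
    day key means closed all day.
    """
    if not business_hours:
        return True
    local = (now or datetime.now(ZoneInfo("UTC"))).astimezone(ZoneInfo(timezone))
    day = business_hours.get(DAY_KEYS[local.weekday()])
    if day is None:
        return False
    hhmm = local.strftime("%H:%M")  # zero-padded, so string comparison works
    return day["start"] <= hhmm < day["end"]

hours = {"mon": {"start": "09:00", "end": "17:00"}}
tz = "America/New_York"
open_monday = is_within_business_hours(
    hours, tz, datetime(2025, 1, 6, 10, 0, tzinfo=ZoneInfo(tz)))   # Monday 10:00
closed_saturday = is_within_business_hours(
    hours, tz, datetime(2025, 1, 4, 10, 0, tzinfo=ZoneInfo(tz)))   # Saturday (omitted)
```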
When business_hours is null (the default), the agent answers calls 24/7 with no restrictions.

6. Metadata & Variables
Agents support two mechanisms for attaching custom data and dynamically personalizing prompts.
metadata (JSONB)
A free-form JSON object stored alongside the agent. Use it to attach arbitrary key-value data — department codes, CRM IDs, feature flags, or anything your webhook server needs.
{
"department": "sales",
"crm_id": "SF-00412",
"priority": "high"
}
metadata_variables (JSONB)
A flat key-value map used for template interpolation inside first_message and system_prompt. Any occurrence of {{key}} in those fields is replaced at call time with the corresponding value.
{
"company_name": "Acme Corp",
"support_phone": "1-800-555-0199",
"product_name": "Widget Pro",
"booking_url": "https://acme.com/book"
}
Then in your system prompt, you can write:
You are a customer support agent for {{company_name}}.
If the caller needs to schedule a repair, direct them to {{booking_url}}.
For billing questions, transfer them to {{support_phone}}.
The platform also automatically makes {{agent_name}} available (pulled from the agent's name field), so you do not need to add it to metadata_variables manually.
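A minimal sketch of the interpolation step. The platform's own implementation may differ, but this mirrors the documented behavior, including leaving undefined placeholders untouched:

```python
import re

def interpolate(template: str, variables: dict) -> str:
    """Replace each {{key}} with its value; unknown keys are left as-is."""
    def repl(match):
        key = match.group(1)
        return str(variables.get(key, match.group(0)))  # unknown key: leave verbatim
    return re.sub(r"\{\{(\w+)\}\}", repl, template or "")

variables = {"company_name": "Acme Corp", "agent_name": "Sarah"}
greeting = interpolate("You are {{agent_name}} at {{company_name}}.", variables)
# greeting == "You are Sarah at Acme Corp."
unresolved = interpolate("Visit {{booking_url}}.", variables)
# unresolved == "Visit {{booking_url}}." (undefined key left verbatim)
```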
If you use {{variable}} placeholders in your prompts but forget to define matching keys in metadata_variables, the raw {{variable}} text will appear in the prompt as-is. The LLM may then say the placeholder out loud on the call.

7. Agent Templates
The platform provides pre-built templates at GET /api/agents/templates to get you started quickly. Select a template, customize the prompts for your business, and save.
| Template | Tone | Max Duration | Best For |
|---|---|---|---|
| Receptionist | professional | 300s (5 min) | Answering calls, taking messages, routing callers to the right person. |
| Appointment Setter | friendly | 300s (5 min) | Booking appointments, collecting name/date/time/service, confirming details. Includes an analysis_schema to extract structured data after the call. |
| Customer Support | empathetic | 600s (10 min) | Troubleshooting issues, answering FAQs, escalating to a human when needed. Includes a success_eval_prompt for post-call quality scoring. |
| Sales Qualifier | enthusiastic | 600s (10 min) | Qualifying inbound leads by discovering pain points, budget, timeline, and decision process. Includes both analysis_schema and success_eval_prompt. |
| Outbound Follow-up | casual | 300s (5 min) | Checking in on existing customers or leads. Short, natural, and warm. |
Using a Template via the API
1. Fetch templates: GET /api/agents/templates.
2. Pick a template from the response and copy its fields.
3. Customize system_prompt, first_message, and other fields for your business.
4. Send the payload to POST /api/agents.
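Steps 2–4 reduce to a plain dictionary merge. In this sketch, `template` stands in for one entry from the GET /api/agents/templates response, and payload_from_template is an illustrative helper, not a platform API:

```python
def payload_from_template(template: dict, overrides: dict) -> dict:
    """Copy every template field, then apply your customizations last."""
    payload = dict(template)
    payload.update(overrides)
    return payload

# Hypothetical template entry (field names follow the tables above).
template = {"name": "Receptionist", "tone": "professional",
            "max_call_duration": 300, "first_message": "Hello!"}

payload = payload_from_template(template, {
    "name": "Acme Front Desk",
    "first_message": "Thanks for calling Acme Corp! How can I help?",
})
# `payload` is then sent as the JSON body of POST /api/agents.
```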
8. Full Example: Sales Agent
Here is a complete, production-ready JSON payload for creating a sales qualifier agent via POST /api/agents:
{
"name": "Acme Sales Agent",
"first_message": "Hey there! Thanks for calling Acme Corp. I'd love to learn more about what you're looking for. What brings you to us today?",
"first_message_mode": "assistant-speaks-first",
"system_prompt": "You are a friendly sales qualification specialist at {{company_name}}. Your goal is to qualify inbound leads through natural conversation.\n\nGather the following information naturally (do NOT ask all at once):\n1. What problem they're trying to solve\n2. Their approximate budget range\n3. Timeline for making a decision\n4. Who else is involved in the decision\n5. What solution they currently use\n\nRules:\n- Be conversational, not interrogative\n- If they ask about pricing, give a range of $500-$5,000/month depending on needs\n- If they seem qualified (budget > $1,000/mo, timeline < 90 days), offer to schedule a demo\n- If unqualified, be helpful and suggest our free tier at {{website_url}}\n- Never make up product features\n- Keep responses under 3 sentences",
"tone": "enthusiastic",
"model": "anthropic/claude-haiku-4.5",
"llm_provider": "openrouter",
"temperature": 0.5,
"max_tokens": 250,
"max_call_duration": 600,
"endpointing": 300,
"bargein_threshold": 3,
"response_delay_ms": 150,
"end_call_phrases": ["goodbye", "thanks for calling", "have a great day"],
"idle_timeout_seconds": 20,
"idle_max_triggers": 2,
"idle_messages": [
"Still there? No rush, take your time.",
"I'm here whenever you're ready. Any other questions I can help with?"
],
"idle_action": "message",
"enable_call_analysis": true,
"analysis_schema": {
"qualified": "boolean",
"budget": "string",
"timeline": "string",
"pain_points": "string[]",
"current_solution": "string",
"next_steps": "string"
},
"success_eval_prompt": "Was the lead qualified (budget > $1,000/mo and timeline < 90 days)? Did we gather enough information to pass to a sales rep?",
"metadata_variables": {
"company_name": "Acme Corp",
"website_url": "https://acme.com/free"
},
"metadata": {
"department": "sales",
"lead_source": "inbound"
},
"business_hours": {
"mon": { "start": "08:00", "end": "18:00" },
"tue": { "start": "08:00", "end": "18:00" },
"wed": { "start": "08:00", "end": "18:00" },
"thu": { "start": "08:00", "end": "18:00" },
"fri": { "start": "08:00", "end": "17:00" }
},
"timezone": "America/New_York",
"after_hours_message": "Thanks for calling Acme Corp! Our sales team is available Monday through Friday, 8 AM to 6 PM Eastern. Please call back during business hours or visit acme.com to learn more.",
"enable_recording": true,
"enable_voicemail_detection": true,
"voicemail_message": "Hi, this is the Acme Corp sales team. We missed your call but would love to chat. Please call us back during business hours or leave a message and we'll get right back to you.",
"enable_voice_formatting": true,
"enable_backchanneling": false,
"stt_model": "nova-2"
}
9. Common Mistakes
1. Template variables left unfilled
If your system prompt contains {{company_name}} or {{booking_url}} but you did not define those keys in metadata_variables, the agent will literally say "Welcome to curly-brace curly-brace company underscore name" on the call. Always verify that every {{placeholder}} in your prompts has a matching variable defined.
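A quick pre-flight check can catch this before you save the agent. The helper below is illustrative: it scans a prompt for {{key}} placeholders that lack a matching variable, excluding the automatically injected agent_name:

```python
import re

def find_unfilled_placeholders(prompt, metadata_variables, builtins=("agent_name",)):
    """Return the {{keys}} used in the prompt that have no matching variable."""
    used = set(re.findall(r"\{\{(\w+)\}\}", prompt or ""))
    defined = set(metadata_variables or {}) | set(builtins)
    return sorted(used - defined)

prompt = "Welcome to {{company_name}}! Book at {{booking_url}}, from {{agent_name}}."
missing = find_unfilled_placeholders(prompt, {"company_name": "Acme Corp"})
# missing == ["booking_url"]
```

Run the same check against first_message, since it supports the same interpolation.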
2. No agent identity in the prompt
Without clear identity instructions, the LLM defaults to a generic assistant persona. It may introduce itself as "an AI language model" or refuse to give its name. Always include a line like: "Your name is Sarah. You work at Acme Corp as a customer service representative."
3. Missing survey questions in the prompt
If you are building a survey agent, the LLM will not know what questions to ask unless they are explicitly listed in the system_prompt. Simply naming the agent "Survey Agent" is not enough. Include the full questionnaire in the prompt.
4. Endpointing too aggressive or too loose
- Too low (< 150ms): The agent cuts in before the caller finishes speaking. Callers get frustrated quickly.
- Too high (> 500ms): The agent feels slow and unresponsive, as if it is not listening. There are long pauses after every sentence.
- Recommended: Start at 200 for fast-paced calls (sales, reception) and 300 for patient calls (support, surveys).
5. Temperature too high for structured tasks
A temperature of 0.9 produces creative, varied responses — great for casual chat, terrible for appointment booking or data collection. When the agent needs to follow a strict script or extract specific information, use 0.3–0.5. High temperature can cause the agent to hallucinate appointment times, invent product features, or go off-script.
6. max_tokens set too high
A max_tokens of 1000 or more means the agent can generate paragraph-length responses. On a phone call, this sounds like an uninterruptible monologue. Keep it between 150 and 400. The system prompt should also instruct the LLM to keep responses short (2–3 sentences max).
7. No end_call_phrases defined
Without end_call_phrases, the agent never hangs up on its own. The call continues until max_call_duration is hit or the caller disconnects. Define natural goodbye phrases so the agent can gracefully end conversations.
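A sketch of the hang-up check; the case-insensitive substring match is an assumption, so consult the platform docs for the exact matching rules:

```python
def should_hang_up(agent_reply: str, end_call_phrases: list) -> bool:
    """Hang up after a reply that contains one of the configured phrases."""
    reply = agent_reply.lower()
    return any(phrase.lower() in reply for phrase in end_call_phrases)

phrases = ["goodbye", "bye", "have a great day"]
assert should_hang_up("Thanks so much, have a great day!", phrases)
assert not should_hang_up("Let me check that for you.", phrases)
```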
8. Forgetting to assign a phone number
Creating an agent does not automatically give it a phone number. After creating the agent, assign it to a phone number via the dashboard or POST /api/agents/:id/assign. Without an assigned number, the agent cannot receive inbound calls.
Use the chat endpoint (POST /api/agents/:id/chat) to test your agent's prompt before routing live calls to it. You can also run batch test scenarios with POST /api/agents/:id/test-scenarios to validate behavior across multiple inputs.