Complete Agent Setup Guide
Everything you need to know about creating and configuring AI voice agents on the EWT Voice Agent platform.
1. Overview
An agent is the central building block of the EWT Voice Agent platform. Each agent is an independent AI persona that can receive inbound calls, make outbound calls, and handle SMS conversations. When a phone number rings, the platform routes the call to the agent assigned to that number and orchestrates a real-time pipeline:
- Telephony — Twilio connects the caller and streams audio via WebSocket.
- Speech-to-Text (STT) — Deepgram transcribes the caller's speech in real time (default model: nova-2).
- LLM Reasoning — The transcript is sent to an LLM (Anthropic Claude or any of 200+ models via OpenRouter) along with the agent's system prompt.
- Text-to-Speech (TTS) — The LLM response is streamed to ElevenLabs or Deepgram TTS, which generates natural-sounding speech played back to the caller.
You create agents via the dashboard UI or programmatically through POST /api/agents. Each agent belongs to your organization (tenant) and can be cloned, versioned, exported, and imported.
2. Core Settings
These fields define the agent's identity and how it greets callers.
| Field | Type | Default | Description |
|---|---|---|---|
| name (required) | string | — | Display name for the agent. Also injected into the system prompt so the LLM knows its own name. |
| first_message | string | null | The opening line the agent speaks when it answers a call. Example: "Thanks for calling Acme Corp! How can I help?" |
| first_message_mode | string | "assistant-speaks-first" | Controls who talks first. Set to "assistant-speaks-first" so the agent greets the caller immediately, or "user-speaks-first" to wait for the caller to speak before responding. |
| system_prompt | string | null | The core instruction set for the LLM. Defines personality, rules, knowledge, and behavior. Supports {{variable}} interpolation (see Metadata & Variables). |
| identity_prompt | string | null | A separate prompt fragment focused on who the agent is. Useful for keeping identity details out of the main system prompt. |
| tone | string | "friendly" | A label for the agent's conversational style. Common values: friendly, professional, casual, empathetic, enthusiastic. |
| language | string | "en" | Primary language code (e.g., en, es, fr). |
| greeting_template | string | null | A dynamic greeting template. Supports variables for personalized hellos when caller info is known. |
| is_active | boolean | true | Toggle the agent on or off without deleting it. |
Always set both a first_message and a system_prompt. An agent without a system prompt will fall back to generic "helpful assistant" behavior, which produces vague, off-brand responses.

Writing Effective System Prompts
The system prompt is the single most important setting for your agent. A great prompt produces natural, on-brand calls. A vague prompt produces generic, unhelpful responses. Here's the formula:
1. Identity — Tell the AI who it is.
You are Sarah, a scheduling coordinator at Bright Dental.
2. Goal — What should the agent accomplish on every call?
Your job is to help callers book, reschedule, or cancel appointments.
3. Personality — How should it sound?
Be warm, friendly, and concise. Keep responses to 1-2 sentences. Use casual language like a real person on the phone.
4. Boundaries — What should it never do?
Never diagnose medical conditions. If asked about pricing, say "I'd be happy to have our billing team reach out." Do not discuss competitors.
5. Handling unknowns — What if the caller asks something unexpected?
If you don't know the answer, say "Let me take your info and have someone get back to you" and use the takeMessage tool.
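The five parts above can be assembled mechanically. Here is a minimal Python sketch; the part names, the example text, and the build_system_prompt helper are illustrative, not a platform API:

```python
# Illustrative: compose a system prompt from the five recommended parts.
PROMPT_PARTS = {
    "identity": "You are Sarah, a scheduling coordinator at Bright Dental.",
    "goal": "Your job is to help callers book, reschedule, or cancel appointments.",
    "personality": ("Be warm, friendly, and concise. Keep responses to 1-2 sentences. "
                    "Use casual language like a real person on the phone."),
    "boundaries": ("Never diagnose medical conditions. If asked about pricing, say "
                   "\"I'd be happy to have our billing team reach out.\""),
    "unknowns": ("If you don't know the answer, say \"Let me take your info and have "
                 "someone get back to you\" and use the takeMessage tool."),
}

def build_system_prompt(parts: dict) -> str:
    """Join the prompt parts in the recommended order, one paragraph each."""
    order = ["identity", "goal", "personality", "boundaries", "unknowns"]
    return "\n\n".join(parts[key] for key in order if parts.get(key))

system_prompt = build_system_prompt(PROMPT_PARTS)
```

The resulting string is what you would send as the system_prompt field when creating the agent.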
3. LLM Configuration
Control which language model powers your agent and how it generates responses.
| Field | Type | Default | Description |
|---|---|---|---|
| model | string | "anthropic/claude-haiku-4.5" | The LLM model identifier. Use OpenRouter-style names (e.g., anthropic/claude-sonnet-4, google/gemini-2.0-flash-001, openai/gpt-4o-mini). Over 200 models are available through OpenRouter. |
| llm_provider | string | "openrouter" | Which LLM provider to route through. Typically "openrouter" (default) or "anthropic" for direct Anthropic API access. |
| temperature | number | 0.7 | Controls randomness. Range: 0.0 (deterministic) to 1.0 (creative). Use 0.3–0.5 for structured tasks like appointment booking; use 0.7–0.9 for casual conversation. |
| max_tokens | integer | 300 | Maximum tokens per LLM response. Keep this low (150–400) for voice agents — long responses sound unnatural on the phone. |
| llm_base_url | string | null | Custom LLM endpoint URL. Use this to point at a self-hosted model, an Azure OpenAI deployment, or any OpenAI-compatible API. Example: https://my-llm.example.com/v1 |
| llm_api_key | string | null | API key for the custom LLM endpoint. Only needed when using llm_base_url. This overrides the organization-level OpenRouter key for this agent. |
Choosing a Model
For most voice agents, anthropic/claude-haiku-4.5 offers the best balance of speed, quality, and cost. Haiku's low latency is critical for natural-sounding phone conversations. If you need stronger reasoning (e.g., complex sales qualification), consider anthropic/claude-sonnet-4 but be aware of slightly higher latency.
When using llm_base_url, the endpoint must be OpenAI API-compatible (it must accept /chat/completions requests with the standard message format).

4. Call Control
These settings govern the real-time behavior of voice calls — when the agent speaks, when it listens, and when it hangs up.
| Field | Type | Default | Description |
|---|---|---|---|
| max_call_duration | integer | 300 | Maximum call length in seconds. The call is automatically ended after this time. 300 = 5 minutes. Set to 600 for support calls or 900 for complex sales calls. |
| endpointing | integer | 200 | Silence threshold in milliseconds. After the caller stops talking, the agent waits this long before treating the utterance as complete and generating a response. Lower = faster but may cut off the caller. Higher = more patient but feels sluggish. Recommended range: 150–400. |
| bargein_threshold | integer | 3 | Number of words the caller must speak before the agent stops talking (barge-in). At 3, the caller saying "wait hold on" will interrupt the agent. Set to 1 for maximum responsiveness or 5–8 to prevent accidental interruptions. |
| response_delay_ms | integer | 0 | Artificial delay before the agent speaks, in milliseconds. Adds a more human-like pause. 0 for instant, 200–500 for a natural feel. |
| end_call_phrases | string[] | [] | Array of phrases that trigger an automatic hang-up when the agent says them. Example: ["goodbye", "bye", "have a great day"]. The agent will speak the phrase and then end the call. |
| idle_timeout_seconds | number | 15 | Seconds of silence before the agent considers the caller idle and sends a prompt message. Default was changed from 7.5 to 15 to avoid being too aggressive on phone calls. |
| idle_messages | string[] | [] | Messages the agent cycles through when the caller goes silent. Example: ["Are you still there?", "I'm here whenever you're ready.", "Would you like me to repeat anything?"] |
| idle_max_triggers | integer | 3 | Maximum number of idle prompts before the idle_action is taken. |
| idle_action | string | "message" | What happens after all idle triggers are exhausted. "message" keeps the call open; alternatively the agent can end the call. |
| chunk_size | integer | 50 | Number of characters per TTS chunk. Smaller chunks start speaking sooner (lower latency) but may sound choppy. 50 is a good balance. |
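The interplay of endpointing and bargein_threshold can be sketched as two small decision functions. This is a simplified model, assuming endpointing acts as a plain silence timer and barge-in as a word count, as the table describes; it is not the platform's actual implementation:

```python
# Simplified models of two call-control decisions (illustrative only).
def utterance_complete(silence_ms: int, endpointing: int = 200) -> bool:
    """The caller's turn ends once silence reaches the endpointing threshold."""
    return silence_ms >= endpointing

def should_barge_in(caller_words: list, bargein_threshold: int = 3) -> bool:
    """While the agent is speaking, stop once the caller has said
    at least bargein_threshold words."""
    return len(caller_words) >= bargein_threshold

assert not utterance_complete(120)   # brief mid-sentence pause: keep listening
assert utterance_complete(250)       # silence passed 200 ms: generate a response
assert should_barge_in("wait hold on".split())   # 3 words: interrupt the agent
assert not should_barge_in("um".split())         # 1 word: keep talking
```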
Setting endpointing below 150 will frequently cut callers off mid-sentence. Setting it above 500 makes the agent feel unresponsive. Start with 200–300 and tune from there.

5. Business Hours
Restrict when your agent answers calls. Outside configured hours, callers hear an after-hours message instead of engaging with the AI.
| Field | Type | Default | Description |
|---|---|---|---|
| business_hours | object (JSONB) | null | A JSON object keyed by day abbreviation (mon, tue, wed, thu, fri, sat, sun). Each day has start and end times in 24-hour format. Omit a day to mark it as closed. |
| timezone | string | "America/New_York" | IANA timezone string for interpreting business hours. Examples: America/Chicago, America/Los_Angeles, Europe/London. |
| after_hours_message | string | null | Message played to callers outside business hours. Example: "Our office is currently closed. Please call back Monday through Friday, 9 AM to 5 PM Eastern." |
Business Hours JSON Format
{
"mon": { "start": "09:00", "end": "17:00" },
"tue": { "start": "09:00", "end": "17:00" },
"wed": { "start": "09:00", "end": "17:00" },
"thu": { "start": "09:00", "end": "17:00" },
"fri": { "start": "09:00", "end": "17:00" }
}
In this example, Saturday and Sunday are omitted, so the agent will play the after_hours_message on weekends. Times are interpreted in the agent's configured timezone.
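A sketch of how such a schedule can be evaluated, mirroring the documented semantics (null means always open, an omitted day means closed); this is not the platform's actual code:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

DAY_KEYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]

def is_within_business_hours(business_hours, timezone, now=None):
    """True if `now` falls inside the configured hours.

    A null/empty business_hours means the agent answers 24/7; an omitted
    day key means closed all day.
    """
    if not business_hours:
        return True
    local = (now or datetime.now(ZoneInfo("UTC"))).astimezone(ZoneInfo(timezone))
    day = business_hours.get(DAY_KEYS[local.weekday()])
    if day is None:
        return False
    hhmm = local.strftime("%H:%M")  # zero-padded, so string comparison works
    return day["start"] <= hhmm < day["end"]

hours = {"mon": {"start": "09:00", "end": "17:00"}}
tz = "America/New_York"
open_monday = is_within_business_hours(
    hours, tz, datetime(2025, 1, 6, 10, 0, tzinfo=ZoneInfo(tz)))   # Monday 10:00
closed_saturday = is_within_business_hours(
    hours, tz, datetime(2025, 1, 4, 10, 0, tzinfo=ZoneInfo(tz)))   # Saturday (omitted)
```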
When business_hours is null (the default), the agent answers calls 24/7 with no restrictions.

6. Metadata & Variables
Agents support two mechanisms for attaching custom data and dynamically personalizing prompts.
metadata (JSONB)
A free-form JSON object stored alongside the agent. Use it to attach arbitrary key-value data — department codes, CRM IDs, feature flags, or anything your webhook server needs.
{
"department": "sales",
"crm_id": "SF-00412",
"priority": "high"
}
metadata_variables (JSONB)
A flat key-value map used for template interpolation inside first_message and system_prompt. Any occurrence of {{key}} in those fields is replaced at call time with the corresponding value.
{
"company_name": "Acme Corp",
"support_phone": "1-800-555-0199",
"product_name": "Widget Pro",
"booking_url": "https://acme.com/book"
}
Then in your system prompt, you can write:
You are a customer support agent for {{company_name}}.
If the caller needs to schedule a repair, direct them to {{booking_url}}.
For billing questions, transfer them to {{support_phone}}.
The platform also automatically makes {{agent_name}} available (pulled from the agent's name field), so you do not need to add it to metadata_variables manually.
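A minimal sketch of the interpolation step. The platform's own implementation may differ, but this mirrors the documented behavior, including leaving undefined placeholders untouched:

```python
import re

def interpolate(template: str, variables: dict) -> str:
    """Replace each {{key}} with its value; unknown keys are left as-is."""
    def repl(match):
        key = match.group(1)
        return str(variables.get(key, match.group(0)))  # unknown key: leave verbatim
    return re.sub(r"\{\{(\w+)\}\}", repl, template or "")

variables = {"company_name": "Acme Corp", "agent_name": "Sarah"}
greeting = interpolate("You are {{agent_name}} at {{company_name}}.", variables)
# greeting == "You are Sarah at Acme Corp."
unresolved = interpolate("Visit {{booking_url}}.", variables)
# unresolved == "Visit {{booking_url}}." (undefined key left verbatim)
```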
If you use {{variable}} placeholders in your prompts but forget to define matching keys in metadata_variables, the raw {{variable}} text will appear in the prompt as-is. The LLM may then say the placeholder out loud on the call.

7. Agent Templates
The platform provides pre-built templates at GET /api/agents/templates to get you started quickly. Select a template, customize the prompts for your business, and save.
| Template | Tone | Max Duration | Best For |
|---|---|---|---|
| Receptionist | professional | 300s (5 min) | Answering calls, taking messages, routing callers to the right person. |
| Appointment Setter | friendly | 300s (5 min) | Booking appointments, collecting name/date/time/service, confirming details. Includes an analysis_schema to extract structured data after the call. |
| Customer Support | empathetic | 600s (10 min) | Troubleshooting issues, answering FAQs, escalating to a human when needed. Includes a success_eval_prompt for post-call quality scoring. |
| Sales Qualifier | enthusiastic | 600s (10 min) | Qualifying inbound leads by discovering pain points, budget, timeline, and decision process. Includes both analysis_schema and success_eval_prompt. |
| Outbound Follow-up | casual | 300s (5 min) | Checking in on existing customers or leads. Short, natural, and warm. |
Using a Template via the API
1. Fetch templates: GET /api/agents/templates.
2. Pick a template from the response and copy its fields.
3. Customize system_prompt, first_message, and other fields for your business.
4. Send the payload to POST /api/agents.
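Steps 2–4 reduce to a plain dictionary merge. In this sketch, `template` stands in for one entry from the GET /api/agents/templates response, and payload_from_template is an illustrative helper, not a platform API:

```python
def payload_from_template(template: dict, overrides: dict) -> dict:
    """Copy every template field, then apply your customizations last."""
    payload = dict(template)
    payload.update(overrides)
    return payload

# Hypothetical template entry (field names follow the tables above).
template = {"name": "Receptionist", "tone": "professional",
            "max_call_duration": 300, "first_message": "Hello!"}

payload = payload_from_template(template, {
    "name": "Acme Front Desk",
    "first_message": "Thanks for calling Acme Corp! How can I help?",
})
# `payload` is then sent as the JSON body of POST /api/agents.
```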
8. Full Example: Sales Agent
Here is a complete, production-ready JSON payload for creating a sales qualifier agent via POST /api/agents:
{
"name": "Acme Sales Agent",
"first_message": "Hey there! Thanks for calling Acme Corp. I'd love to learn more about what you're looking for. What brings you to us today?",
"first_message_mode": "assistant-speaks-first",
"system_prompt": "You are a friendly sales qualification specialist at {{company_name}}. Your goal is to qualify inbound leads through natural conversation.\n\nGather the following information naturally (do NOT ask all at once):\n1. What problem they're trying to solve\n2. Their approximate budget range\n3. Timeline for making a decision\n4. Who else is involved in the decision\n5. What solution they currently use\n\nRules:\n- Be conversational, not interrogative\n- If they ask about pricing, give a range of $500-$5,000/month depending on needs\n- If they seem qualified (budget > $1,000/mo, timeline < 90 days), offer to schedule a demo\n- If unqualified, be helpful and suggest our free tier at {{website_url}}\n- Never make up product features\n- Keep responses under 3 sentences",
"tone": "enthusiastic",
"model": "anthropic/claude-haiku-4.5",
"llm_provider": "openrouter",
"temperature": 0.5,
"max_tokens": 250,
"max_call_duration": 600,
"endpointing": 300,
"bargein_threshold": 3,
"response_delay_ms": 150,
"end_call_phrases": ["goodbye", "thanks for calling", "have a great day"],
"idle_timeout_seconds": 20,
"idle_max_triggers": 2,
"idle_messages": [
"Still there? No rush, take your time.",
"I'm here whenever you're ready. Any other questions I can help with?"
],
"idle_action": "message",
"enable_call_analysis": true,
"analysis_schema": {
"qualified": "boolean",
"budget": "string",
"timeline": "string",
"pain_points": "string[]",
"current_solution": "string",
"next_steps": "string"
},
"success_eval_prompt": "Was the lead qualified (budget > $1,000/mo and timeline < 90 days)? Did we gather enough information to pass to a sales rep?",
"metadata_variables": {
"company_name": "Acme Corp",
"website_url": "https://acme.com/free"
},
"metadata": {
"department": "sales",
"lead_source": "inbound"
},
"business_hours": {
"mon": { "start": "08:00", "end": "18:00" },
"tue": { "start": "08:00", "end": "18:00" },
"wed": { "start": "08:00", "end": "18:00" },
"thu": { "start": "08:00", "end": "18:00" },
"fri": { "start": "08:00", "end": "17:00" }
},
"timezone": "America/New_York",
"after_hours_message": "Thanks for calling Acme Corp! Our sales team is available Monday through Friday, 8 AM to 6 PM Eastern. Please call back during business hours or visit acme.com to learn more.",
"enable_recording": true,
"enable_voicemail_detection": true,
"voicemail_message": "Hi, this is the Acme Corp sales team. We missed your call but would love to chat. Please call us back during business hours or leave a message and we'll get right back to you.",
"enable_voice_formatting": true,
"enable_backchanneling": false,
"stt_model": "nova-2"
}
9. Common Mistakes
1. Template variables left unfilled
If your system prompt contains {{company_name}} or {{booking_url}} but you did not define those keys in metadata_variables, the agent will literally say "Welcome to curly-brace curly-brace company underscore name" on the call. Always verify that every {{placeholder}} in your prompts has a matching variable defined.
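A quick pre-flight check can catch this before you save the agent. The helper below is illustrative: it scans a prompt for {{key}} placeholders that lack a matching variable, excluding the automatically injected agent_name:

```python
import re

def find_unfilled_placeholders(prompt, metadata_variables, builtins=("agent_name",)):
    """Return the {{keys}} used in the prompt that have no matching variable."""
    used = set(re.findall(r"\{\{(\w+)\}\}", prompt or ""))
    defined = set(metadata_variables or {}) | set(builtins)
    return sorted(used - defined)

prompt = "Welcome to {{company_name}}! Book at {{booking_url}}, from {{agent_name}}."
missing = find_unfilled_placeholders(prompt, {"company_name": "Acme Corp"})
# missing == ["booking_url"]
```

Run the same check against first_message, since it supports the same interpolation.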
2. No agent identity in the prompt
Without clear identity instructions, the LLM defaults to a generic assistant persona. It may introduce itself as "an AI language model" or refuse to give its name. Always include a line like: "Your name is Sarah. You work at Acme Corp as a customer service representative."
3. Missing survey questions in the prompt
If you are building a survey agent, the LLM will not know what questions to ask unless they are explicitly listed in the system_prompt. Simply naming the agent "Survey Agent" is not enough. Include the full questionnaire in the prompt.
4. Endpointing too aggressive or too loose
- Too low (< 150ms): The agent cuts in before the caller finishes speaking. Callers get frustrated quickly.
- Too high (> 500ms): The agent feels slow and unresponsive, as if it is not listening. There are long pauses after every sentence.
- Recommended: Start at 200 for fast-paced calls (sales, reception) and 300 for patient calls (support, surveys).
5. Temperature too high for structured tasks
A temperature of 0.9 produces creative, varied responses — great for casual chat, terrible for appointment booking or data collection. When the agent needs to follow a strict script or extract specific information, use 0.3–0.5. High temperature can cause the agent to hallucinate appointment times, invent product features, or go off-script.
6. max_tokens set too high
A max_tokens of 1000 or more means the agent can generate paragraph-length responses. On a phone call, this sounds like an uninterruptible monologue. Keep it between 150 and 400. The system prompt should also instruct the LLM to keep responses short (2–3 sentences max).
7. No end_call_phrases defined
Without end_call_phrases, the agent never hangs up on its own. The call continues until max_call_duration is hit or the caller disconnects. Define natural goodbye phrases so the agent can gracefully end conversations.
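A sketch of the hang-up check; the case-insensitive substring match is an assumption, so consult the platform docs for the exact matching rules:

```python
def should_hang_up(agent_reply: str, end_call_phrases: list) -> bool:
    """Hang up after a reply that contains one of the configured phrases."""
    reply = agent_reply.lower()
    return any(phrase.lower() in reply for phrase in end_call_phrases)

phrases = ["goodbye", "bye", "have a great day"]
assert should_hang_up("Thanks so much, have a great day!", phrases)
assert not should_hang_up("Let me check that for you.", phrases)
```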
8. Forgetting to assign a phone number
Creating an agent does not automatically give it a phone number. After creating the agent, assign it to a phone number via the dashboard or POST /api/agents/:id/assign. Without an assigned number, the agent cannot receive inbound calls.
Use the chat endpoint (POST /api/agents/:id/chat) to test your agent's prompt before routing live calls to it. You can also run batch test scenarios with POST /api/agents/:id/test-scenarios to validate behavior across multiple inputs.