Architecture

Understand the Voice AI Agent runtime.

Use this page to separate runtime traffic, control-plane configuration, telephony callbacks, AI media streams, and operational data before building production integrations.


Runtime API: Bearer tenant keys

Admin API: x-admin-key

Telephony: Twilio signatures

Specs: OpenAPI 3.1

Runtime plane

The runtime plane handles live phone traffic. It receives signed Twilio callbacks, starts outbound calls, opens realtime media streams, manages voice sessions, executes approved tools, and records operational state for review.

Caller or workflow

A customer calls a routed number, or your app creates an outbound call with tenant bearer auth.

StateSet Voice API

The API resolves tenant routing, validates credentials, returns TwiML, and creates session state.

Realtime AI session

Media streams connect to the configured model, voice, prompt, and tool policy.

Operations record

Call logs, voice sessions, function calls, evaluations, and callback tasks become auditable records.
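
The inbound half of this flow can be sketched as follows. This is a minimal illustration, not the StateSet implementation: the `ROUTES` table, the `PhoneRoute` type, and the `voice.example.com` stream URL are all hypothetical. The only parts taken from Twilio itself are the TwiML `<Connect><Stream>` and `<Reject>` verbs.

```python
from dataclasses import dataclass
from typing import Optional
from xml.sax.saxutils import quoteattr


@dataclass(frozen=True)
class PhoneRoute:
    tenant_id: str
    agent_version: str


# Hypothetical routing table; real routes are managed through the Admin API.
ROUTES = {
    "+15550100": PhoneRoute(tenant_id="tenant_a", agent_version="v3"),
}

XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>'


def resolve_route(to_number: str) -> Optional[PhoneRoute]:
    """Look up which tenant and agent version own the dialed number."""
    return ROUTES.get(to_number)


def build_twiml(stream_url: str) -> str:
    """Return TwiML that connects call audio to a WebSocket media stream."""
    return (
        XML_HEADER
        + f"<Response><Connect><Stream url={quoteattr(stream_url)}/></Connect></Response>"
    )


def handle_inbound_call(to_number: str) -> str:
    """Resolve the route, then answer with TwiML or reject unknown numbers."""
    route = resolve_route(to_number)
    if route is None:
        return XML_HEADER + "<Response><Reject/></Response>"
    # Assumed URL layout; the real stream endpoint is not documented here.
    url = f"wss://voice.example.com/media/{route.tenant_id}/{route.agent_version}"
    return build_twiml(url)
```

Calling `handle_inbound_call("+15550100")` yields TwiML that points the call at the tenant's media stream, while an unrouted number receives a `<Reject/>` response.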

Control plane

The control plane manages tenant configuration and production rollout. Keep it separate from runtime callers because it changes credentials, phone routes, agents, versions, rollout policy, and diagnostics.

Object	Managed by	Production consideration
Tenant runtime config	Admin API	Controls model, voice, prompt key, and tenant defaults.
Agent versions	Admin API	Promote tested versions only after synthetic calls and evaluation review.
Phone routes	Admin API and Twilio	Map numbers to the correct tenant, direction, and agent version.
Rollout governance	Admin API	Gate production traffic with evaluation thresholds and stop conditions.
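
To make the rollout-governance row concrete, here is one way a promotion gate might combine an evaluation threshold with a stop condition. The `RolloutPolicy` fields and the `can_promote` function are assumptions made for illustration; the Admin API's actual policy schema may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RolloutPolicy:
    # All four fields are hypothetical names, not Admin API fields.
    min_mean_score: float    # evaluation threshold on the mean score
    min_evaluations: int     # require enough evaluated calls first
    max_failure_rate: float  # stop condition: allowed share of failed calls
    failure_score: float     # a call scoring below this counts as a failure


def can_promote(scores: List[float], policy: RolloutPolicy) -> bool:
    """Return True only if the candidate version clears every gate."""
    if len(scores) < policy.min_evaluations:
        return False
    failures = sum(1 for score in scores if score < policy.failure_score)
    if failures / len(scores) > policy.max_failure_rate:
        return False  # stop condition tripped, regardless of the mean
    mean = sum(scores) / len(scores)
    return mean >= policy.min_mean_score
```

Checking the stop condition before the mean keeps a few very high scores from masking a cluster of failed calls.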

Trust boundaries

Tenant boundary

Bearer runtime keys

Runtime keys should access only tenant-scoped calls, sessions, logs, and operations workflows.
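
A tenant-scope check of this kind might look like the sketch below. `RuntimeKey`, `TenantScopeError`, and `assert_tenant_scope` are hypothetical names; the point is only that every record lookup compares the key's tenant with the record's tenant before returning data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeKey:
    key_id: str
    tenant_id: str


class TenantScopeError(PermissionError):
    """Raised when a runtime key reaches for another tenant's record."""


def assert_tenant_scope(key: RuntimeKey, record_tenant_id: str) -> None:
    """Reject access unless the record belongs to the key's tenant."""
    if key.tenant_id != record_tenant_id:
        raise TenantScopeError(
            f"key {key.key_id} is not scoped to tenant {record_tenant_id}"
        )
```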

Admin boundary

Admin API keys

Admin keys can change tenants, agents, routes, and rollout policy. Store them separately from runtime credentials.
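
One simple way to keep the two credentials apart is to load them from separate settings and build their headers in separate functions, so runtime code paths never touch the admin key. The environment variable names below are assumptions; the `Authorization: Bearer` and `x-admin-key` header names follow the authentication summary at the top of this page.

```python
import os
from typing import Dict, Mapping


def runtime_headers(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Headers for tenant-scoped Runtime API calls."""
    # VOICE_RUNTIME_KEY is a hypothetical variable name.
    return {"Authorization": f"Bearer {env['VOICE_RUNTIME_KEY']}"}


def admin_headers(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Headers for Admin API calls; use only from control-plane tooling."""
    # VOICE_ADMIN_KEY is a hypothetical variable name.
    return {"x-admin-key": env["VOICE_ADMIN_KEY"]}
```

A runtime service deployed without `VOICE_ADMIN_KEY` in its environment then fails fast with a `KeyError` if any code path tries to call the Admin API.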

Telephony boundary

Twilio signatures

Signature verification depends on the exact public URL and the unmodified request body, so proxies and middleware must not rewrite either one before the check runs.
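
The sketch below shows why the URL and body must arrive untouched. It follows Twilio's documented scheme for form-encoded callbacks: append each POST parameter's name and value to the full URL in sorted key order, sign the result with HMAC-SHA1 using the account auth token, and Base64-encode the digest. It is a simplified illustration (it does not handle repeated parameter names or JSON bodies); in production, the validator in Twilio's official helper library is usually the safer choice.

```python
import base64
import hashlib
import hmac
from typing import Mapping


def compute_twilio_signature(
    auth_token: str, url: str, params: Mapping[str, str]
) -> str:
    """Compute the expected X-Twilio-Signature for a form-encoded POST."""
    # The URL must be the public URL Twilio requested, scheme and query included.
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(
        auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_twilio_request(
    auth_token: str, url: str, params: Mapping[str, str], signature_header: str
) -> bool:
    """Compare the computed signature with the header in constant time."""
    expected = compute_twilio_signature(auth_token, url, params)
    return hmac.compare_digest(expected, signature_header)
```

If a load balancer terminates TLS and the application reconstructs the URL as `http://` instead of `https://`, the computed signature no longer matches and a legitimate callback is rejected.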

Media boundary

Stream tokens

Short-lived stream tokens constrain realtime media access during call setup.
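
The token format is not specified on this page, so the following is only an assumed design: an HMAC-SHA256 signed payload that binds the token to one call and carries an expiry time. The function names, the `call_id` and `exp` claims, and the `payload.signature` layout are all hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def issue_stream_token(
    secret: bytes, call_id: str, ttl_seconds: int, now: Optional[float] = None
) -> str:
    """Issue a token valid for one call and for ttl_seconds from now."""
    issued = time.time() if now is None else now
    body = json.dumps(
        {"call_id": call_id, "exp": issued + ttl_seconds}, sort_keys=True
    ).encode("utf-8")
    signature = hmac.new(secret, body, hashlib.sha256).digest()
    return _b64(body) + "." + _b64(signature)


def verify_stream_token(
    secret: bytes, token: str, call_id: str, now: Optional[float] = None
) -> bool:
    """Accept the token only if it is authentic, unexpired, and for this call."""
    current = time.time() if now is None else now
    try:
        body_part, signature_part = token.split(".")
        body = base64.urlsafe_b64decode(body_part)
        signature = base64.urlsafe_b64decode(signature_part)
    except ValueError:
        return False  # malformed token
    expected = hmac.new(secret, body, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        return False
    claims = json.loads(body)  # safe to parse: the signature already matched
    return claims["call_id"] == call_id and current < claims["exp"]
```

Binding the token to a single call and a short lifetime limits what a leaked token can reach: it cannot open a stream for a different call, and it stops working shortly after call setup.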

Operational data model

Record	Purpose	Review cadence
Call log	Business-level call status, duration, direction, escalation, and summary.	Every launch and support audit.
Voice session	Conversation transcript, function calls, metadata, and model/voice context.	QA, debugging, and prompt iteration.
Evaluation	Quality score, rubric, notes, and rollout-governance inputs.	Before promotion and after major prompt changes.
Callback task	Human follow-up queue, SLA, disposition, and notification workflow.	Daily operations and SLA review.
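
For integrations that mirror these records locally, the table could be modeled roughly as below. The field names and types are assumptions derived from the table's descriptions, not the API's actual schema; consult the OpenAPI spec for the real shapes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CallLog:
    call_id: str
    status: str
    duration_seconds: int
    direction: str  # "inbound" or "outbound"
    escalated: bool
    summary: str


@dataclass
class VoiceSession:
    call_id: str
    transcript: List[str]
    function_calls: List[str]
    model: str
    voice: str


@dataclass
class Evaluation:
    call_id: str
    score: float
    rubric: str
    notes: str = ""


@dataclass
class CallbackTask:
    call_id: str
    sla_minutes: int
    disposition: Optional[str] = None  # None until a human closes the task


def open_tasks(tasks: List[CallbackTask]) -> List[CallbackTask]:
    """Return the callback tasks still waiting for human follow-up."""
    return [task for task in tasks if task.disposition is None]
```

Keying every record on a shared `call_id` (an assumed join key) makes it straightforward to trace one call from its log entry through its session, evaluation, and any follow-up task.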