Architecture Overview
Arkenos is composed of four services that work together to provide a complete voice AI platform.
System Diagram
Data Flow
Services
Frontend (Next.js 16)
The web dashboard, built with Next.js 16, React 19, Tailwind CSS, and shadcn/ui, handles:
- User authentication via Better Auth
- Agent configuration and management
- Custom agent code editor with AI coding assistant
- Voice session interface (LiveKit WebRTC)
- Phone number search, purchase, and assignment
- Call logs, analytics, and cost dashboards
- API key management (API Keys page)
Backend (FastAPI)
The REST API server handling all business logic:
- Agent CRUD (standard and custom modes)
- Encrypted API key storage (AES-256-GCM) and resolution
- Session management, transcripts, and cost tracking
- Call analysis (AI-generated summaries, sentiment, topics)
- Telephony provisioning (auto-creates SIP trunks, dispatch rules)
- Coding agent service (Gemini-powered code assistant)
- LiveKit webhook processing
- Key injection endpoint for the agent (`/api/settings/keys/agent`)
Voice Agent (Python)
A LiveKit Agents SDK worker that runs the real-time voice pipeline:
- Fetches all API keys from the backend on startup (zero `.env` config)
- Joins LiveKit rooms when dispatched (browser or SIP calls)
- Fetches agent configuration and dynamically loads the right persona
- Runs the STT → LLM → TTS pipeline with Silero VAD
- Executes function calls and reports usage/cost events
- Streams transcripts back to the backend in real time
- Handles both standard (dashboard-configured) and custom (user-code) agents
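The startup key fetch described above can be sketched roughly as follows. This is a minimal illustration, not the actual worker code: the `BACKEND_URL` variable, the flat JSON response shape, and the key names in the usage note are assumptions, and any authentication the real endpoint requires is omitted. Only the endpoint path and the internal base URL come from this page.

```python
import json
import os
import urllib.request

# Assumption: the base URL is overridable via a hypothetical BACKEND_URL variable.
# The default is the Docker-internal address listed under Docker Networking.
BACKEND_URL = os.environ.get("BACKEND_URL", "http://backend:8000/api")
KEYS_PATH = "/settings/keys/agent"  # documented as /api/settings/keys/agent


def keys_url(base_url: str) -> str:
    """Join the API base URL and the key endpoint path without doubling slashes."""
    return base_url.rstrip("/") + KEYS_PATH


def fetch_keys(base_url: str = BACKEND_URL, timeout: float = 5.0) -> dict:
    """GET the key bundle from the backend.

    Assumption: the response is a flat JSON object mapping names to values.
    """
    with urllib.request.urlopen(keys_url(base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def apply_keys(keys: dict, environ=None) -> list:
    """Copy non-empty keys into an environment mapping and return the names set."""
    environ = os.environ if environ is None else environ
    applied = []
    for name, value in keys.items():
        if value:  # skip keys the user has not configured
            environ[name] = str(value)
            applied.append(name)
    return sorted(applied)
```

A worker following this pattern would call `apply_keys(fetch_keys())` once before constructing any provider clients, which is what lets the agent container run without a local `.env` file. Key names such as `OPENAI_API_KEY` would be whatever the backend returns; none are specified here.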
PostgreSQL
Stores all persistent data: users, agents (with config JSON), sessions, transcripts, usage events, encrypted API keys (instance_settings), custom agent files, and coding agent conversations.
MinIO
Object storage for custom agent code files. Stores versioned file content for custom Python agents.
Communication Patterns
| From | To | Protocol | Purpose |
|---|---|---|---|
| Browser | Frontend | HTTPS | Dashboard UI |
| Browser | LiveKit | WebRTC | Real-time audio |
| Frontend (server) | Backend | HTTP | API calls (SSR) |
| Frontend (client) | Backend | HTTP | API calls (browser) |
| Agent | Backend | HTTP | Key fetch, config, transcripts, usage |
| Agent | LiveKit | WebSocket | Audio pipeline |
| Agent | STT/LLM/TTS | HTTPS | Provider API calls |
| Backend | PostgreSQL | TCP | Database queries |
| Backend | Twilio | HTTPS | Number management, SIP provisioning |
| Backend | LiveKit | HTTPS | Trunk/rule management, room creation |
| LiveKit | Backend | HTTP | Webhook events |
| Twilio | LiveKit | SIP | Inbound/outbound phone calls |
Docker Networking
When running in Docker Compose, services communicate on an internal network:
- Frontend server components call `http://backend:8000/api` (internal)
- Frontend client components call `http://localhost:4201/api` (host-mapped)
- Agent calls `http://backend:8000/api` (internal)
- Backend connects to `postgres:5432` (internal)
- Backend connects to `minio:9000` (internal)
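The split between internal and host-mapped addresses exists because the `backend` hostname resolves only inside the Compose network, while browser code runs on the host and has to go through the published port. A small sketch of that choice, using the two URLs listed above; the caller labels and the function name are hypothetical and serve only to illustrate the rule:

```python
# URLs taken from the list above.
INTERNAL_API = "http://backend:8000/api"       # container-to-container
HOST_MAPPED_API = "http://localhost:4201/api"  # browser on the host machine

# Hypothetical labels for the callers that run inside the Compose network.
CALLERS_ON_COMPOSE_NETWORK = {"frontend-server", "agent"}


def api_base_url(caller: str) -> str:
    """Return the backend base URL that the given caller can actually reach."""
    if caller in CALLERS_ON_COMPOSE_NETWORK:
        return INTERNAL_API
    if caller == "frontend-client":
        return HOST_MAPPED_API
    raise ValueError(f"unknown caller: {caller!r}")
```

For example, `api_base_url("agent")` yields the internal address, while `api_base_url("frontend-client")` yields the host-mapped one.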