Architecture Overview

Arkenos is composed of five services that work together to provide a complete voice AI platform: the frontend, the backend API, the voice agent, PostgreSQL, and MinIO.

System Diagram

+------------------------------------------------------------------+
|                           Browser                                 |
|                                                                   |
|   +------------------+                +------------------------+  |
|   | Dashboard        |    REST API    | Voice Session (WebRTC) |  |
|   | (Next.js)        |--------------->| LiveKit Client SDK     |  |
|   +--------+---------+                +-----------+------------+  |
+------------|--------------------------------------|---------------+
             | HTTP                                 | WebRTC
             v                                      v
+------------------------+            +-----------------------------+
| Backend API (FastAPI)  |            | LiveKit Cloud               |
|                        |   REST     |                             |
| +--------------------+ |<---------->| +-------------------------+ |
| | PostgreSQL         | |  webhooks  | | Standard Agent Worker   | |
| | - agents, sessions | |            | | (shared voice engine)   | |
| | - encrypted keys   | |            | +-------------------------+ |
| | - costs, files     | |            |                             |
| +--------------------+ |            | +-------------------------+ |
|                        |            | | Custom Agent Containers | |
| +--------------------+ |            | | (user Python code)      | |
| | Telephony          | |            | +-------------------------+ |
| | Provisioning       | |            +-------------+---------------+
| +--------------------+ |                          |
+----------+-------------+              +-----------+-----------+
           |                            |           |           |
           v                            v           v           v
+-------------------+               +-----+   +-------+   +-----+
| Twilio            |               | STT |   |  LLM  |   | TTS |
| - SIP trunks      |               +-----+   +-------+   +-----+
| - Phone numbers   |
+-------------------+

Data Flow

Browser Call:   Dashboard --> LiveKit Token --> WebRTC --> Agent --> STT/LLM/TTS
Inbound Call:   Phone --> Twilio --> SIP Trunk --> LiveKit --> Agent --> STT/LLM/TTS
Outbound Call:  Dashboard --> Backend --> LiveKit SIP --> Twilio --> Phone
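
The "LiveKit Token" hop in the browser-call flow can be illustrated with a short sketch. It assumes LiveKit's JWT-based access tokens (HS256, signed with the API secret, with a `video` grant naming the room). The function name `mint_room_token` and all sample values are hypothetical, and a real backend would more likely use the LiveKit server SDK than hand-rolled signing.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def mint_room_token(api_key, api_secret, identity, room, ttl_seconds=600, now=None):
    """Build a short-lived HS256 JWT granting `identity` permission to join `room`.

    Hypothetical helper: the claim layout follows LiveKit's access-token
    format as an assumption, not Arkenos's actual implementation.
    """
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,                 # LiveKit API key
        "sub": identity,                # participant identity
        "nbf": issued,
        "exp": issued + ttl_seconds,
        "video": {"room": room, "roomJoin": True},
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(
        api_secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"


if __name__ == "__main__":
    # Sample credentials only; the browser would pass this token to the LiveKit client SDK.
    print(mint_room_token("devkey", "devsecret", "user-1", "demo-room"))
```

Keeping the token lifetime short limits the damage if a token leaks, since the browser only needs it to establish the WebRTC session.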

Services

Frontend (Next.js 16)

The web dashboard, built with Next.js 16, React 19, Tailwind CSS, and shadcn/ui. It handles:
  • User authentication via Better Auth
  • Agent configuration and management
  • Custom agent code editor with AI coding assistant
  • Voice session interface (LiveKit WebRTC)
  • Phone number search, purchase, and assignment
  • Call logs, analytics, and cost dashboards
  • API key management (API Keys page)

Backend (FastAPI)

The REST API server, which handles all business logic:
  • Agent CRUD (standard and custom modes)
  • Encrypted API key storage (AES-256-GCM) and resolution
  • Session management, transcripts, and cost tracking
  • Call analysis (AI-generated summaries, sentiment, topics)
  • Telephony provisioning (auto-creates SIP trunks, dispatch rules)
  • Coding agent service (Gemini-powered code assistant)
  • LiveKit webhook processing
  • Key injection endpoint for the agent (/api/settings/keys/agent)

Voice Agent (Python)

A LiveKit Agents SDK worker that runs the real-time voice pipeline:
  • Fetches all API keys from the backend on startup (zero .env config)
  • Joins LiveKit rooms when dispatched (browser or SIP calls)
  • Fetches agent configuration and dynamically loads the right persona
  • Runs the STT → LLM → TTS pipeline with Silero VAD
  • Executes function calls and reports usage/cost events
  • Streams transcripts back to the backend in real time
  • Handles both standard (dashboard-configured) and custom (user-code) agents
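
The startup key fetch in the first bullet might look roughly like the sketch below. It assumes the `/api/settings/keys/agent` endpoint returns a flat JSON object of name/value pairs. The response shape, the function names, and the sample key names are all hypothetical, and any authentication the endpoint requires is omitted.

```python
import json
import os
import urllib.request


def fetch_agent_keys(api_base, open_url=urllib.request.urlopen, timeout=10.0):
    """GET <api_base>/settings/keys/agent and return the non-empty keys.

    Assumes (hypothetically) a flat JSON object such as
    {"STT_API_KEY": "...", "LLM_API_KEY": "..."}.
    """
    url = f"{api_base.rstrip('/')}/settings/keys/agent"
    with open_url(url, timeout=timeout) as response:
        payload = json.load(response)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object of key names to values")
    # Skip keys that are not configured so they do not shadow anything with "".
    return {str(name): str(value) for name, value in payload.items() if value}


def export_keys(keys, environ=None):
    """Copy fetched keys into the process environment for provider SDKs to read."""
    target = os.environ if environ is None else environ
    for name, value in keys.items():
        target[name] = value
    return sorted(keys)


if __name__ == "__main__":
    # Inside Docker Compose the agent reaches the backend on the internal network.
    loaded = export_keys(fetch_agent_keys("http://backend:8000/api"))
    print("loaded keys:", ", ".join(loaded))
```

Passing the opener and environment as parameters keeps the network call and the environment mutation easy to stub in tests.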

PostgreSQL

Stores all persistent data: users, agents (with config JSON), sessions, transcripts, usage events, encrypted API keys (instance_settings), custom agent files, and coding agent conversations.

MinIO

Object storage for custom agent code. It holds the versioned file content for custom Python agents.

Communication Patterns

| From              | To          | Protocol  | Purpose                               |
|-------------------|-------------|-----------|---------------------------------------|
| Browser           | Frontend    | HTTPS     | Dashboard UI                          |
| Browser           | LiveKit     | WebRTC    | Real-time audio                       |
| Frontend (server) | Backend     | HTTP      | API calls (SSR)                       |
| Frontend (client) | Backend     | HTTP      | API calls (browser)                   |
| Agent             | Backend     | HTTP      | Key fetch, config, transcripts, usage |
| Agent             | LiveKit     | WebSocket | Audio pipeline                        |
| Agent             | STT/LLM/TTS | HTTPS     | Provider API calls                    |
| Backend           | PostgreSQL  | TCP       | Database queries                      |
| Backend           | Twilio      | HTTPS     | Number management, SIP provisioning   |
| Backend           | LiveKit     | HTTPS     | Trunk/rule management, room creation  |
| LiveKit           | Backend     | HTTP      | Webhook events                        |
| Twilio            | LiveKit     | SIP       | Inbound/outbound phone calls          |

Docker Networking

When running in Docker Compose, services communicate on an internal network:
  • Frontend server components call http://backend:8000/api (internal)
  • Frontend client components call http://localhost:4201/api (host-mapped)
  • Agent calls http://backend:8000/api (internal)
  • Backend connects to postgres:5432 (internal)
  • Backend connects to minio:9000 (internal)
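
The split between internal and host-mapped addresses can be captured in a small resolver. This is a minimal sketch: the two URLs come from the list above, while the function name `resolve_api_base` and the `BACKEND_API_URL` override variable are hypothetical.

```python
import os

# Addresses from the Docker networking list above.
INTERNAL_API = "http://backend:8000/api"    # reachable only on the Compose network
HOST_API = "http://localhost:4201/api"      # host-mapped port, reachable from the browser


def resolve_api_base(inside_compose_network, environ=None):
    """Pick the backend base URL for the caller's network position.

    BACKEND_API_URL is a hypothetical override, not a documented Arkenos setting.
    """
    env = os.environ if environ is None else environ
    override = env.get("BACKEND_API_URL")
    if override:
        return override.rstrip("/")
    return INTERNAL_API if inside_compose_network else HOST_API


if __name__ == "__main__":
    print(resolve_api_base(inside_compose_network=True))   # agent or server-side code
    print(resolve_api_base(inside_compose_network=False))  # code running in the browser's context
```

The service name `backend` only resolves inside the Compose network, which is why browser-side code has to use the host-mapped port instead.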