Architecture Overview

Arkenos is composed of five services that together provide a complete voice AI platform: the frontend, the backend API, the voice agent, PostgreSQL, and MinIO.

System Diagram

+------------------------------------------------------------------+
|                           Browser                                 |
|                                                                   |
|   +------------------+                +------------------------+  |
|   | Dashboard        |    REST API    | Voice Session (WebRTC) |  |
|   | (Next.js)        |--------------->| LiveKit Client SDK     |  |
|   +--------+---------+                +-----------+------------+  |
+------------|--------------------------------------|---------------+
             | HTTP                                 | WebRTC
             v                                      v
+------------------------+            +-----------------------------+
| Backend API (FastAPI)  |            | LiveKit Cloud               |
|                        |   REST     |                             |
| +--------------------+ |<---------->| +-------------------------+ |
| | PostgreSQL         | |  webhooks  | | Standard Agent Worker   | |
| | - agents, sessions | |            | | (shared voice engine)   | |
| | - encrypted keys   | |            | +-------------------------+ |
| | - costs, files     | |            |                             |
| +--------------------+ |            | +-------------------------+ |
|                        |            | | Custom Agent Containers | |
| +--------------------+ |            | | (user Python code)      | |
| | Telephony          | |            | +-------------------------+ |
| | Provisioning       | |            +-------------+---------------+
| +--------------------+ |                          |
+----------+-------------+              +-----------+-----------+
           |                            |           |           |
           v                            v           v           v
+-------------------+               +-----+   +-------+   +-----+
| Twilio            |               | STT |   |  LLM  |   | TTS |
| - SIP trunks      |               +-----+   +-------+   +-----+
| - Phone numbers   |
+-------------------+

Data Flow

Browser Call:   Dashboard --> LiveKit Token --> WebRTC --> Agent --> STT/LLM/TTS
Inbound Call:   Phone --> Twilio --> SIP Trunk --> LiveKit --> Agent --> STT/LLM/TTS
Outbound Call:  Dashboard --> Backend --> LiveKit SIP --> Twilio --> Phone

Services

Frontend (Next.js 16)

The web dashboard is built with Next.js 16, React 19, Tailwind CSS, and shadcn/ui. It handles:

User authentication via Better Auth
Agent configuration and management
Custom agent code editor with AI coding assistant
Voice session interface (LiveKit WebRTC)
Phone number search, purchase, and assignment
Call logs, analytics, and cost dashboards
API key management (API Keys page)

Backend (FastAPI)

The REST API server handling all business logic:

Agent CRUD (standard and custom modes)
Encrypted API key storage (AES-256-GCM) and resolution
Session management, transcripts, and cost tracking
Call analysis (AI-generated summaries, sentiment, topics)
Telephony provisioning (auto-creates SIP trunks, dispatch rules)
Coding agent service (Gemini-powered code assistant)
LiveKit webhook processing
Key injection endpoint for the agent (/api/settings/keys/agent)
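
As a rough illustration of the encrypted key storage item above, the sketch below encrypts and decrypts a single provider key with AES-256-GCM using the third-party `cryptography` package. The function names, the nonce-prefixed storage layout, the use of the setting name as associated data, and the `EXAMPLE_PROVIDER_KEY` name are assumptions made for illustration. This is not the Arkenos implementation.

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_SIZE = 12  # 96-bit nonce, the size commonly recommended for GCM


def encrypt_api_key(master_key: bytes, name: str, plaintext: str) -> bytes:
    """Return nonce || ciphertext+tag for one stored key (hypothetical layout)."""
    nonce = os.urandom(NONCE_SIZE)  # must be unique per encryption under one key
    # Passing the setting name as associated data ties the ciphertext to that
    # name, so a blob copied under a different name fails authentication.
    ciphertext = AESGCM(master_key).encrypt(nonce, plaintext.encode(), name.encode())
    return nonce + ciphertext


def decrypt_api_key(master_key: bytes, name: str, blob: bytes) -> str:
    """Split the stored blob back into nonce and ciphertext, then decrypt."""
    nonce, ciphertext = blob[:NONCE_SIZE], blob[NONCE_SIZE:]
    return AESGCM(master_key).decrypt(nonce, ciphertext, name.encode()).decode()


if __name__ == "__main__":
    master_key = AESGCM.generate_key(bit_length=256)  # 32-byte key for AES-256
    blob = encrypt_api_key(master_key, "EXAMPLE_PROVIDER_KEY", "secret-value")
    print(decrypt_api_key(master_key, "EXAMPLE_PROVIDER_KEY", blob))
```

In a real deployment the master key would come from configuration rather than being generated on each run; how Arkenos manages it is not described here.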

Voice Agent (Python)

A LiveKit Agents SDK worker that runs the real-time voice pipeline:

Fetches all API keys from the backend on startup (zero .env config)
Joins LiveKit rooms when dispatched (browser or SIP calls)
Fetches agent configuration and dynamically loads the right persona
Runs the STT → LLM → TTS pipeline with Silero VAD
Executes function calls and reports usage/cost events
Streams transcripts back to the backend in real time
Handles both standard (dashboard-configured) and custom (user-code) agents
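
The startup key fetch in the first item above could look roughly like the following. Only the endpoint path `/api/settings/keys/agent` comes from this page; the response shape (a flat JSON object mapping variable names to values), the helper names, and the absence of authentication are assumptions made to keep the sketch short.

```python
import json
import os
from typing import Callable, Dict, List, MutableMapping, Optional
from urllib.request import Request, urlopen

KEYS_PATH = "/api/settings/keys/agent"  # endpoint listed in the backend section


def fetch_json(url: str, timeout: float = 5.0) -> Dict[str, str]:
    """GET a URL and decode its JSON body."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


def load_agent_keys(
    base_url: str,
    fetch: Callable[[str], Dict[str, str]] = fetch_json,
    environ: Optional[MutableMapping[str, str]] = None,
) -> List[str]:
    """Copy keys returned by the backend into the environment mapping.

    Assumes the response is a flat {"NAME": "value"} object (hypothetical).
    Returns the sorted key names so the caller can log them without values.
    """
    env = os.environ if environ is None else environ
    keys = fetch(base_url.rstrip("/") + KEYS_PATH)
    for name, value in keys.items():
        env[name] = value
    return sorted(keys)
```

Inside Docker Compose the worker would call something like `load_agent_keys("http://backend:8000")` before connecting to LiveKit. The `fetch` parameter exists so the function can be exercised without a running backend.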

PostgreSQL

Stores all persistent data: users, agents (with config JSON), sessions, transcripts, usage events, encrypted API keys (instance_settings), custom agent files, and coding agent conversations.

MinIO

Object storage for custom agent code. Holds the versioned file content of custom Python agents.

Communication Patterns

From               To           Protocol   Purpose
Browser            Frontend     HTTPS      Dashboard UI
Browser            LiveKit      WebRTC     Real-time audio
Frontend (server)  Backend      HTTP       API calls (SSR)
Frontend (client)  Backend      HTTP       API calls (browser)
Agent              Backend      HTTP       Key fetch, config, transcripts, usage
Agent              LiveKit      WebSocket  Audio pipeline
Agent              STT/LLM/TTS  HTTPS      Provider API calls
Backend            PostgreSQL   TCP        Database queries
Backend            Twilio       HTTPS      Number management, SIP provisioning
Backend            LiveKit      HTTPS      Trunk/rule management, room creation
LiveKit            Backend      HTTP       Webhook events
Twilio             LiveKit      SIP        Inbound/outbound phone calls

Docker Networking

When running in Docker Compose, services communicate on an internal network:

Frontend server components call http://backend:8000/api (internal)
Frontend client components call http://localhost:4201/api (host-mapped)
Agent calls http://backend:8000/api (internal)
Backend connects to postgres:5432 (internal)
Backend connects to minio:9000 (internal)
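
The split between internal and host-mapped addresses can be summarized as a small lookup. This is only an illustrative sketch: the `Caller` enum and the `backend_api_url` helper are hypothetical names, and only the two base URLs come from the list above.

```python
from enum import Enum


class Caller(Enum):
    FRONTEND_SERVER = "frontend-server"  # Next.js server components (inside Docker)
    FRONTEND_CLIENT = "frontend-client"  # code running in the user's browser
    AGENT = "agent"                      # voice agent worker (inside Docker)


# Containers reach the backend by its Compose service name; the browser runs
# outside the Docker network and must use the port mapped onto the host.
BACKEND_BASE_URLS = {
    Caller.FRONTEND_SERVER: "http://backend:8000/api",
    Caller.FRONTEND_CLIENT: "http://localhost:4201/api",
    Caller.AGENT: "http://backend:8000/api",
}


def backend_api_url(caller: Caller, path: str) -> str:
    """Build the backend URL that the given caller should use for `path`."""
    return BACKEND_BASE_URLS[caller] + "/" + path.lstrip("/")


if __name__ == "__main__":
    for caller in Caller:
        print(caller.value, backend_api_url(caller, "settings/keys/agent"))
```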
