Version: 0.8

Server Configuration

Command Line Options

./atelico-server [OPTIONS]

Option	Default	Description
`-p`, `--port`	`11434`	Port to listen on

Environment Variables

Model & Asset Management

Variable	Description
`ATELICO_CACHE_DIR`	Local model cache directory (default: `~/Library/Caches/atelico` on macOS, `~/.cache/atelico` on Linux)
`ATELICO_ASSET_STORE_URL`	Remote asset store URL (S3, R2, HTTP, local path)
`ATELICO_ASSET_ACCESS_KEY`	Access key for private S3/R2 stores
`ATELICO_ASSET_SECRET_KEY`	Secret key for private S3/R2 stores
`HF_TOKEN`	HuggingFace token for gated models (fallback source)

Proxy Backends

Variable	Description
`OPENAI_API_KEY`	Enables the `openai::` model prefix
`OPENAI_BASE_URL`	OpenAI endpoint (default: `https://api.openai.com/v1`)
`PROXY_<NAME>_API_KEY`	Enables `<name>::` prefix for any OpenAI-compatible API
`PROXY_<NAME>_BASE_URL`	Endpoint for the named proxy

Content Safety

Variable	Description
`GUARDRAILS_ENABLED`	Enable content filtering (`true`/`false`)
`GUARDRAILS_POLICY`	`strict` or `permissive`

Classifiers

Variable	Description
`ATELICO_CLASSIFIERS`	Comma-separated classifier IDs to load on startup
`CLASSIFIER_DIR`	Custom classifier directory

Data Logging

Variable	Description
`ATELICO_DATA_LOG_ENABLED`	Enable request/response logging (`true`/`false`)
`ATELICO_DATA_LOG_DIR`	Output directory for logs
`ATELICO_DATA_LOG_SAMPLE_RATE`	Sampling rate (`0.0` - `1.0`)

Inference Tuning

Variable	Description
`ATELICO_BONSAI_PREFILL`	`speed` (default) or `memory` — Bonsai 1-bit models pre-dequantize weights to F16 for fast prefill at the cost of ~170 MB extra RAM (1.7B) or ~2.8 GB (8B). Set `memory` to skip pre-dequantization.
`ATELICO_CUDA_GRAPHS`	Set to `0` to disable CUDA graph capture/replay (useful for debugging).
`ATELICO_NO_CLOCK_LOCK`	Set to `1` to disable GPU clock locking on Windows (WDDM performance mitigation).

Audio

Variable	Description
`ATELICO_KOKORO_QUANT`	`q8_0` or `q4_0` — opt-in quantization of Kokoro's linear layers (~12% / ~18% memory saving). Default: F16 on Metal/CUDA, F32 on CPU. Convolutional decoder is unaffected.
`ATELICO_TTS_DTYPE`	Pocket TTS runtime dtype. `bf16` (default on GPU), `f32` (default on CPU, parity reference), or `f16` (~12% faster on Metal but minor quality drift).
`ATELICO_POCKET_TTS_MAX_TOKENS`	Maximum SentencePiece tokens per Pocket TTS chunk. Default `50` (the model's training distribution boundary); raising this produces out-of-distribution audio.

Debug

Variable	Description
`RUST_LOG`	Log level: `error`, `warn`, `info`, `debug`, `trace`

Example: Production Configuration

ATELICO_ASSET_STORE_URL=https://models.yourgame.com \
GUARDRAILS_ENABLED=true \
GUARDRAILS_POLICY=strict \
RUST_LOG=info \
./atelico-server --port 11434

Example: Development with OpenAI Fallback

OPENAI_API_KEY=sk-... \
RUST_LOG=debug \
./atelico-server

This lets you use local models (in-memory::...) and OpenAI models (openai::gpt-4o-mini) side by side during development.

Command Line Options​

Environment Variables​

Model & Asset Management​

Proxy Backends​

Content Safety​

Classifiers​

Data Logging​

Inference Tuning​

Audio​

Debug​

Example: Production Configuration​

Example: Development with OpenAI Fallback​

Command Line Options

Environment Variables

Model & Asset Management

Proxy Backends

Content Safety

Classifiers

Data Logging

Inference Tuning

Audio

Debug

Example: Production Configuration

Example: Development with OpenAI Fallback