Version: 0.8

C FFI API Reference

Generate atelico_ffi.h from this crate's extern "C" declarations.

Run via:

cargo run -p atelico-ffi --features gen-header --bin gen_header

Writes atelico-ffi/include/atelico_ffi.h and copies the result to demos/cig-unreal-demo/ThirdParty/atelico/atelico_ffi.h (the path the Unreal demo expects). Re-run after changing any #[no_mangle] extern "C" signature in src/lib.rs.

Constants

Result Codes

  • ATELICO_OK (0): Operation completed successfully.
  • ATELICO_ERR_INVALID_HANDLE (-1): The engine handle is null or has already been destroyed.
  • ATELICO_ERR_INVALID_ARG (-2): A required argument is null, not valid UTF-8, or otherwise invalid.
  • ATELICO_ERR_INIT_FAILED (-3): Engine initialization failed (device detection, runtime creation, etc.).
  • ATELICO_ERR_MODEL_NOT_FOUND (-4): The requested model is not loaded or does not exist.
  • ATELICO_ERR_INFERENCE_FAILED (-5): Inference failed during token generation or image synthesis.
  • ATELICO_ERR_STREAM_DONE (-6): The stream has completed; no more data will arrive. The stream handle is automatically cleaned up when this code is returned from atelico_stream_poll.
  • ATELICO_ERR_STREAM_EMPTY (-7): No data is available yet on a non-blocking poll. Retry on the next frame.
  • ATELICO_ERR_JSON_PARSE (-8): A JSON string argument could not be parsed.
  • ATELICO_ERR_STORE_NOT_FOUND (-9): The requested KV store or ANN index was not found.
  • ATELICO_ERR_IO (-10): An I/O error occurred (file system, network, etc.).
  • ATELICO_ERR_BLOCKED (-11): The request was blocked by a safety guardrail.
  • ATELICO_ERR_INTERNAL (-99): An unexpected internal error. Call atelico_last_error() for details.
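When surfacing these codes in game logs, it helps to print the symbolic name rather than the raw integer. The helper below is an illustrative sketch, not part of the generated header; the constant values mirror the table above.

```c
#include <stdint.h>

/* Values mirror the result-code table above. */
#define ATELICO_OK                     0
#define ATELICO_ERR_INVALID_HANDLE    (-1)
#define ATELICO_ERR_INVALID_ARG       (-2)
#define ATELICO_ERR_INIT_FAILED       (-3)
#define ATELICO_ERR_MODEL_NOT_FOUND   (-4)
#define ATELICO_ERR_INFERENCE_FAILED  (-5)
#define ATELICO_ERR_STREAM_DONE       (-6)
#define ATELICO_ERR_STREAM_EMPTY      (-7)
#define ATELICO_ERR_JSON_PARSE        (-8)
#define ATELICO_ERR_STORE_NOT_FOUND   (-9)
#define ATELICO_ERR_IO                (-10)
#define ATELICO_ERR_BLOCKED           (-11)
#define ATELICO_ERR_INTERNAL          (-99)

/* Hypothetical logging helper: map a result code to its symbolic name. */
static const char* atelico_result_name(int32_t code) {
    switch (code) {
        case ATELICO_OK:                   return "ATELICO_OK";
        case ATELICO_ERR_INVALID_HANDLE:   return "ATELICO_ERR_INVALID_HANDLE";
        case ATELICO_ERR_INVALID_ARG:      return "ATELICO_ERR_INVALID_ARG";
        case ATELICO_ERR_INIT_FAILED:      return "ATELICO_ERR_INIT_FAILED";
        case ATELICO_ERR_MODEL_NOT_FOUND:  return "ATELICO_ERR_MODEL_NOT_FOUND";
        case ATELICO_ERR_INFERENCE_FAILED: return "ATELICO_ERR_INFERENCE_FAILED";
        case ATELICO_ERR_STREAM_DONE:      return "ATELICO_ERR_STREAM_DONE";
        case ATELICO_ERR_STREAM_EMPTY:     return "ATELICO_ERR_STREAM_EMPTY";
        case ATELICO_ERR_JSON_PARSE:       return "ATELICO_ERR_JSON_PARSE";
        case ATELICO_ERR_STORE_NOT_FOUND:  return "ATELICO_ERR_STORE_NOT_FOUND";
        case ATELICO_ERR_IO:               return "ATELICO_ERR_IO";
        case ATELICO_ERR_BLOCKED:          return "ATELICO_ERR_BLOCKED";
        case ATELICO_ERR_INTERNAL:         return "ATELICO_ERR_INTERNAL";
        default:                           return "ATELICO_ERR_UNKNOWN";
    }
}
```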

Scheduling Modes

  • ATELICO_SCHEDULE_PRIORITIZE_COMPUTE (0): Maximize inference throughput; the GPU dedicates most time to compute. Use when rendering is paused or during loading screens.
  • ATELICO_SCHEDULE_BALANCE (1): Balance inference and rendering workloads (default).
  • ATELICO_SCHEDULE_PRIORITIZE_GRAPHICS (2): Minimize inference GPU usage to preserve frame rate. Inference yields frequently to keep the renderer at target FPS.

Types

AtelicoEngine

Opaque engine handle exposed to C callers.

Created by atelico_engine_create (or the CiG variants) and destroyed by atelico_engine_destroy. All FFI functions that take an AtelicoEngine* parameter require the pointer to remain valid for the duration of the call.

The handle owns the inference runtime, all loaded models, active streams, and GPU scheduling state. Destroying the handle releases all resources.

Functions

Configuration

atelico_set_log_level

void atelico_set_log_level(int32_t _level);

Set the log level at runtime.

Currently a no-op. Use the RUST_LOG environment variable instead (e.g., RUST_LOG=debug).

  • _level (int32_t): desired log level (0 = error, 1 = warn, 2 = info, 3 = debug, 4 = trace).

Returns: nothing (void).

CiG Queries

atelico_is_cig_d3d12_supported

int32_t atelico_is_cig_d3d12_supported(uint32_t device_index);

Query whether the GPU supports Compute-in-Graphics (CiG) via D3D12.

CiG allows the inference engine to share the GPU hardware scheduling context with the game renderer, avoiding OS-level context switching. Requires NVIDIA R570+ driver, CUDA 12.8+, and Ada Lovelace+ GPU.

  • device_index (uint32_t): CUDA device ordinal (usually 0).

Returns: int32_t -- 1 if D3D12 CiG is supported, 0 otherwise. Always returns 0 when the cuda feature is not enabled.

atelico_is_cig_vulkan_supported

int32_t atelico_is_cig_vulkan_supported(uint32_t device_index);

Query whether the GPU supports Compute-in-Graphics (CiG) via Vulkan.

Requires NVIDIA R570+ driver, CUDA 12.9+, and Ada Lovelace+ GPU.

  • device_index (uint32_t): CUDA device ordinal (usually 0).

Returns: int32_t -- 1 if Vulkan CiG is supported, 0 otherwise. Always returns 0 when the cuda feature is not enabled.

Frame Signaling

atelico_engine_on_frame

int32_t atelico_engine_on_frame(AtelicoEngine* engine);

Signal the start of a new render frame for GPU scheduling synchronization.

Call this once per game frame (typically at the top of the update loop) so the inference throttler can yield to the renderer when using frame-budget-aware scheduling modes.

  • engine (AtelicoEngine*): valid engine handle.

Returns: ATELICO_OK (0) on success, ATELICO_ERR_INVALID_HANDLE if NULL.

Safety:

engine must be a valid pointer.

atelico_engine_set_frame_time_ms

int32_t atelico_engine_set_frame_time_ms(AtelicoEngine* engine, uint32_t ms);

Set the frame time target in milliseconds. Pass 0 to disable.

When set, inference yields GPU time to keep the render frame within this budget (e.g., 16 for 60 FPS, 33 for 30 FPS).

  • engine (AtelicoEngine*): valid engine handle.
  • ms (uint32_t): frame time budget in milliseconds (0 = disabled).

Returns: ATELICO_OK (0) on success.

Safety:

engine must be a valid pointer.
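A frame-time budget is just 1000 divided by the target FPS, truncated to whole milliseconds (60 FPS gives the 16 ms and 30 FPS the 33 ms mentioned above). A small convenience helper (hypothetical, not part of the header) keeps call sites readable; the atelico_* calls in the comment are the functions documented in this section.

```c
#include <stdint.h>

/* Convert a target FPS into a whole-millisecond frame budget (0 disables). */
static uint32_t frame_budget_ms(uint32_t target_fps) {
    if (target_fps == 0) return 0;   /* 0 = no budget */
    return 1000u / target_fps;       /* 60 FPS -> 16 ms, 30 FPS -> 33 ms */
}

/* Typical integration (sketch):
 *   atelico_engine_set_frame_time_ms(engine, frame_budget_ms(60));
 *   // then, once per frame at the top of the game update loop:
 *   atelico_engine_on_frame(engine);
 */
```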

Other

atelico_engine_create

int32_t atelico_engine_create(const char* config_json, AtelicoEngine** out_engine);

Create an engine from a JSON configuration string.

Initializes the inference runtime, device backend, and optional model pre-loading. If config_json is NULL, uses defaults (auto device detection, balance scheduling, no pre-loaded models).

  • config_json (const char*): NULL for defaults, or a JSON object with the structure shown below.
  • out_engine (AtelicoEngine**): receives the engine pointer on success.

Expected JSON for config_json:

{
  "device": "auto",
  "scheduling": {
    "mode": "balance",
    "vram_budget_mb": 0,
    "target_tps": 0,
    "frame_time_ms": 0
  },
  "backends": [
    {
      "name": "in-memory",
      "type": "in-memory",
      "models": ["llama-3.2-1b"]
    }
  ],
  "guardrails": "game-safe",
  "log_level": "info"
}

Field details:

  • device: "auto", "metal", "cuda", or "cpu" (default "auto")
  • scheduling.mode: "balance", "prioritize-compute", or "prioritize-graphics"
  • scheduling.vram_budget_mb: VRAM budget in MB (0 = unlimited)
  • scheduling.target_tps: target tokens/sec (0 = unlimited)
  • scheduling.frame_time_ms: frame time budget in ms (0 = no budget)
  • backends[].name: backend name (e.g., "in-memory")
  • backends[].type: "in-memory", "proxy", or "mock"
  • backends[].models: array of model IDs to pre-load
  • guardrails: "game-safe", "child-safe", or "developer-sdk"
  • log_level: "error", "warn", "info", "debug", "trace"

Returns: ATELICO_OK (0) on success, negative error code on failure. On success, *out_engine is set to a valid AtelicoEngine* that must be freed with atelico_engine_destroy.

Safety:

config_json must be NULL or a valid null-terminated UTF-8 string. out_engine must be a valid, non-null pointer to an AtelicoEngine*.
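A minimal create/use/destroy sequence might look like the sketch below, written against the declarations above. It assumes the generated atelico_ffi.h is on the include path; error details come from atelico_last_error.

```c
#include <stdio.h>
#include "atelico_ffi.h"

int main(void) {
    AtelicoEngine* engine = NULL;

    /* NULL config: auto device detection, balance scheduling, no pre-loaded models. */
    int32_t rc = atelico_engine_create(NULL, &engine);
    if (rc != ATELICO_OK) {
        fprintf(stderr, "engine create failed (%d): %s\n", rc, atelico_last_error());
        return 1;
    }

    /* ... run inference, streams, etc. ... */

    atelico_engine_destroy(engine);  /* frees models, streams, and the runtime */
    return 0;
}
```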

atelico_engine_destroy

void atelico_engine_destroy(AtelicoEngine* engine);

Destroy the engine and free all resources, including loaded models, active streams, and the inference runtime.

Safe to call with NULL (no-op). After this call, the pointer is invalid and must not be used again.

  • engine (AtelicoEngine*): engine handle to destroy, or NULL (no-op).

Returns: nothing (void).

Safety:

engine must be a pointer returned by atelico_engine_create*, or NULL.

atelico_last_error

const char* atelico_last_error(void);

Get the last error message for the calling thread.

Returns: const char* -- a null-terminated UTF-8 string describing the most recent error on this thread. The pointer is stored in a thread-local buffer and valid until the next Atelico API call on the same thread. Copy the string if you need to keep it longer. Returns an empty string if no error has occurred.

atelico_engine_set_scheduling_mode

int32_t atelico_engine_set_scheduling_mode(AtelicoEngine* engine, uint32_t mode);

Set the GPU scheduling mode at runtime.

  • engine (AtelicoEngine*): valid engine handle.
  • mode (uint32_t): one of the ATELICO_SCHEDULE_* constants: 0 = ATELICO_SCHEDULE_PRIORITIZE_COMPUTE (maximize inference throughput), 1 = ATELICO_SCHEDULE_BALANCE (balance inference and rendering), 2 = ATELICO_SCHEDULE_PRIORITIZE_GRAPHICS (minimize inference GPU usage). Unknown values default to ATELICO_SCHEDULE_BALANCE.

Returns: ATELICO_OK (0) on success.

Safety:

engine must be a valid pointer.

atelico_engine_get_scheduling_mode

uint32_t atelico_engine_get_scheduling_mode(const AtelicoEngine* engine);

Get the current GPU scheduling mode.

  • engine (const AtelicoEngine*): valid engine handle, or NULL.

Returns: uint32_t -- one of the ATELICO_SCHEDULE_* constants. Returns ATELICO_SCHEDULE_BALANCE (1) if engine is NULL.

Safety:

engine must be a valid pointer or NULL.

atelico_engine_set_vram_budget_mb

int32_t atelico_engine_set_vram_budget_mb(AtelicoEngine* engine, uint32_t mb);

Set the VRAM budget in megabytes. Pass 0 for unlimited.

  • engine (AtelicoEngine*): valid engine handle.
  • mb (uint32_t): VRAM budget in megabytes (0 = unlimited).

Returns: ATELICO_OK (0) on success.

Safety:

engine must be a valid pointer.

atelico_engine_set_target_tps

int32_t atelico_engine_set_target_tps(AtelicoEngine* engine, uint32_t tps);

Set the target tokens per second. Pass 0 for unlimited.

When set, inference throttles generation speed to free GPU cycles for rendering.

  • engine (AtelicoEngine*): valid engine handle.
  • tps (uint32_t): target tokens per second (0 = unlimited).

Returns: ATELICO_OK (0) on success.

Safety:

engine must be a valid pointer.

atelico_engine_create_with_d3d12 (platform-specific)

int32_t atelico_engine_create_with_d3d12(uint32_t device_index, void* d3d12_queue, const char* config_json, AtelicoEngine** out_engine);

Create an engine with CiG GPU sharing via a D3D12 command queue.

Falls back gracefully to a normal CUDA context if CiG is unsupported.

  • device_index (uint32_t): CUDA device ordinal (usually 0).
  • d3d12_queue (void*): raw pointer to ID3D12CommandQueue from the game engine.
  • config_json (const char*): optional JSON configuration (NULL for defaults); see atelico_engine_create for the JSON schema.
  • out_engine (AtelicoEngine**): receives the engine pointer on success.

Returns: ATELICO_OK (0) on success, negative error code on failure. On success, *out_engine is set to a valid AtelicoEngine* with a CiG-enabled CUDA backend.

Safety:

d3d12_queue must be a valid ID3D12CommandQueue* that outlives the engine. config_json must be NULL or a valid null-terminated UTF-8 string.
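A typical D3D12 integration first queries CiG support and then creates the engine against the renderer's command queue. The sketch below assumes atelico_ffi.h is included and that `d3d12_queue` is the game's ID3D12CommandQueue* passed as void*; the explicit support check is optional belt-and-braces, since atelico_engine_create_with_d3d12 already falls back gracefully.

```c
#include <stdio.h>
#include "atelico_ffi.h"

/* Sketch: prefer CiG GPU sharing with the renderer's D3D12 queue when available. */
static AtelicoEngine* create_engine_for_game(void* d3d12_queue) {
    AtelicoEngine* engine = NULL;
    int32_t rc;

    if (atelico_is_cig_d3d12_supported(0)) {
        /* CiG path: share the GPU hardware scheduling context with the renderer. */
        rc = atelico_engine_create_with_d3d12(0, d3d12_queue, NULL, &engine);
    } else {
        /* No CiG (older driver/GPU): plain engine with a normal context. */
        rc = atelico_engine_create(NULL, &engine);
    }

    if (rc != ATELICO_OK) {
        fprintf(stderr, "atelico init failed (%d): %s\n", rc, atelico_last_error());
        return NULL;
    }
    return engine;
}
```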

atelico_engine_create_with_vulkan (platform-specific)

int32_t atelico_engine_create_with_vulkan(uint32_t device_index, void* vulkan_blob, const char* config_json, AtelicoEngine** out_engine);

Create an engine with CiG GPU sharing via a Vulkan external compute blob.

Requires CUDA 12.9+ driver. Falls back gracefully if CiG is unsupported.

  • device_index (uint32_t): CUDA device ordinal (usually 0).
  • vulkan_blob (void*): opaque blob from vkGetExternalComputeQueueDataNV.
  • config_json (const char*): optional JSON configuration (NULL for defaults); see atelico_engine_create for the JSON schema.
  • out_engine (AtelicoEngine**): receives the engine pointer on success.

Returns: ATELICO_OK (0) on success, negative error code on failure. On success, *out_engine is set to a valid AtelicoEngine* with a CiG-enabled CUDA backend.

Safety:

vulkan_blob must be a valid blob from vkGetExternalComputeQueueDataNV. config_json must be NULL or a valid null-terminated UTF-8 string.

atelico_model_load

int32_t atelico_model_load(AtelicoEngine* engine, const char* model_id);

Pre-load a model into memory. Blocks until the model weights are loaded and the model is ready for inference.

  • engine (AtelicoEngine*): valid engine handle.
  • model_id (const char*): model identifier in "backend::model-name" format (e.g., "in-memory::llama-3.2-1b").

Returns: ATELICO_OK (0) on success, ATELICO_ERR_MODEL_NOT_FOUND (-4) if the model cannot be resolved, or another negative code on failure.

Safety:

engine must be a valid pointer. model_id must be a valid C string.
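Because atelico_model_load blocks, a natural place to call it is a loading screen, so the first in-game inference hits warm weights. A sketch (assumes atelico_ffi.h; the model ID is the example from above):

```c
#include <stdio.h>
#include "atelico_ffi.h"

/* Sketch: pre-load a model during a loading screen. Returns 0 on success. */
static int preload_npc_model(AtelicoEngine* engine) {
    int32_t rc = atelico_model_load(engine, "in-memory::llama-3.2-1b");
    if (rc == ATELICO_ERR_MODEL_NOT_FOUND) {
        fprintf(stderr, "model not registered with any backend\n");
        return -1;
    }
    if (rc != ATELICO_OK) {
        fprintf(stderr, "model load failed (%d): %s\n", rc, atelico_last_error());
        return -1;
    }
    return 0;  /* model is resident and ready for inference */
}
```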

atelico_model_unload

int32_t atelico_model_unload(AtelicoEngine* engine, const char* model_id);

Unload a model from memory, freeing its GPU/CPU resources.

  • engine (AtelicoEngine*): valid engine handle.
  • model_id (const char*): model identifier in "backend::model-name" format.

Returns: ATELICO_OK (0) on success.

Safety:

engine must be a valid pointer. model_id must be a valid C string.

atelico_model_list

int32_t atelico_model_list(const AtelicoEngine* engine, const char** out_json);

List all loaded models as a JSON array.

  • engine (const AtelicoEngine*): valid engine handle.
  • out_json (const char**): receives a pointer to the JSON result string.

Returns: ATELICO_OK (0) on success.

On success, *out_json points to a JSON array of model objects:

[
  {
    "id": "in-memory::llama-3.2-1b",
    "backend": "in-memory",
    "ready": true
  }
]

The pointer is valid until the next Atelico API call on the same thread.

Safety:

engine must be a valid pointer. out_json must be a valid pointer.

atelico_llm_chat

int32_t atelico_llm_chat(AtelicoEngine* engine, const char* request_json, const char** out_response_json);

Run a blocking chat completion request (OpenAI-compatible).

  • engine (AtelicoEngine*): valid engine handle.
  • request_json (const char*): JSON ChatCompletionRequest.
  • out_response_json (const char**): receives the response JSON pointer.

Expected JSON for request_json:

{
  "model": "in-memory::llama-3.2-1b",
  "messages": [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hello!"}
  ],
  "max_tokens": 256,
  "temperature": 0.7
}

Returns: ATELICO_OK (0) on success.

On success, *out_response_json points to a ChatCompletionResponse:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello!"},
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 5,
    "total_tokens": 17
  }
}

The pointer is valid until the next Atelico API call on the same thread.

Safety:

All pointers must be valid.
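A blocking call from C builds the request JSON as a string literal and consumes the response before making any further Atelico calls on the thread. A sketch (assumes atelico_ffi.h):

```c
#include <stdio.h>
#include "atelico_ffi.h"

/* Sketch: one blocking chat completion. The response pointer is only valid
 * until the next Atelico API call on this thread, so use (or copy) it now. */
static void say_hello(AtelicoEngine* engine) {
    const char* request =
        "{"
        "\"model\": \"in-memory::llama-3.2-1b\","
        "\"messages\": [{\"role\": \"user\", \"content\": \"Hello!\"}],"
        "\"max_tokens\": 64"
        "}";
    const char* response_json = NULL;

    if (atelico_llm_chat(engine, request, &response_json) == ATELICO_OK) {
        printf("%s\n", response_json);  /* full ChatCompletionResponse JSON */
    } else {
        fprintf(stderr, "chat failed: %s\n", atelico_last_error());
    }
}
```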

atelico_llm_chat_stream

int32_t atelico_llm_chat_stream(AtelicoEngine* engine, const char* request_json, uint64_t* out_stream);

Start a streaming chat completion. Returns a stream handle for polling.

The request JSON is the same schema as atelico_llm_chat. Use atelico_stream_poll to receive incremental chunks, and atelico_stream_cancel or atelico_stream_destroy to clean up early.

  • engine (AtelicoEngine*): valid engine handle.
  • request_json (const char*): JSON ChatCompletionRequest (same schema as atelico_llm_chat).
  • out_stream (uint64_t*): receives the stream handle.

Returns: ATELICO_OK (0) on success.

On success, *out_stream is set to a stream handle. Each poll via atelico_stream_poll returns a chat.completion.chunk JSON object:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion.chunk",
  "choices": [
    {
      "delta": {"content": "Hello"},
      "finish_reason": null
    }
  ]
}

The final chunk has "finish_reason": "stop" (or "length"), and the next poll after that returns ATELICO_ERR_STREAM_DONE (-6).

Safety:

All pointers must be valid.
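The three stream result codes map naturally onto a per-frame pump function in the game loop. A sketch (assumes atelico_ffi.h; `pump_chat_stream` is a hypothetical name):

```c
#include <stdio.h>
#include "atelico_ffi.h"

/* Sketch: drive a chat stream from the game loop; call once per frame.
 * Returns 1 while the stream is alive, 0 once it finishes or errors out. */
static int pump_chat_stream(AtelicoEngine* engine, uint64_t stream_id) {
    const char* chunk_json = NULL;
    int32_t rc = atelico_stream_poll(engine, stream_id, &chunk_json);

    if (rc == ATELICO_OK) {
        printf("chunk: %s\n", chunk_json);  /* a chat.completion.chunk object */
        return 1;
    }
    if (rc == ATELICO_ERR_STREAM_EMPTY) {
        return 1;   /* nothing yet; try again next frame */
    }
    if (rc == ATELICO_ERR_STREAM_DONE) {
        return 0;   /* handle already freed by the engine; stop polling */
    }
    fprintf(stderr, "stream error (%d): %s\n", rc, atelico_last_error());
    atelico_stream_cancel(engine, stream_id);  /* drop the broken stream */
    return 0;
}
```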

atelico_llm_respond

int32_t atelico_llm_respond(AtelicoEngine* engine, const char* request_json, const char** out_response_json);

Run a blocking response request (OpenAI Responses API).

  • engine (AtelicoEngine*): valid engine handle.
  • request_json (const char*): JSON ResponseRequest.
  • out_response_json (const char**): receives the response JSON pointer.

Expected JSON for request_json:

{
  "model": "in-memory::llama-3.2-1b",
  "input": "What is 2+2?",
  "max_tokens": 256,
  "temperature": 0.7
}

Returns: ATELICO_OK (0) on success.

On success, *out_response_json points to a ResponseObject:

{
  "id": "resp-abc123",
  "object": "response",
  "output": [
    {
      "type": "message",
      "content": [
        {"type": "output_text", "text": "4"}
      ]
    }
  ],
  "usage": {"input_tokens": 10, "output_tokens": 3}
}

The pointer is valid until the next Atelico API call on the same thread.

Safety:

All pointers must be valid.

atelico_llm_call_function

int32_t atelico_llm_call_function(AtelicoEngine* engine, const char* function_json, const char* model_id, const char* variables_json, const char* system_prompt, const char** out_response_json);

Execute a templated LM function with optional structured output (blocking).

Renders a Minijinja template with the provided variables, sends it to the model, and optionally enforces a JSON Schema on the output.

  • engine (AtelicoEngine*): valid engine handle.
  • function_json (const char*): JSON defining the function.
  • model_id (const char*): model identifier in "backend::model" format.
  • variables_json (const char*): JSON object with template variable values.
  • system_prompt (const char*): optional system message (NULL to omit).
  • out_response_json (const char**): receives the response JSON pointer.

Expected JSON for function_json:

{
  "name": "summarize",
  "template": "Summarize the following text: {{text}}",
  "schema": {"type": "object", "properties": {"summary": {"type": "string"}}},
  "max_tokens": 128,
  "temperature": 0.3
}
  • name (required): function name for logging/tracing.
  • template (required): Minijinja template; use {{variable_name}} placeholders.
  • schema: optional JSON Schema. When provided, the model output is constrained to valid JSON matching this schema.
  • max_tokens: optional max tokens to generate.
  • temperature: optional sampling temperature.

Expected JSON for variables_json:

{"text": "The quick brown fox jumps over the lazy dog."}

Returns: ATELICO_OK (0) on success.

On success, *out_response_json points to a result object:

{
  "content": "A fox jumps over a dog.",
  "parsed": null
}
  • content: raw text output from the model.
  • parsed: JSON-parsed output if a schema was provided, otherwise null.

Safety:

All pointers must be valid. system_prompt may be NULL.

atelico_image_generate

int32_t atelico_image_generate(AtelicoEngine* engine, const char* request_json, const char** out_response_json);

Generate an image from a text prompt (blocking).

  • engine (AtelicoEngine*): valid engine handle.
  • request_json (const char*): JSON ImageGenerationRequest.
  • out_response_json (const char**): receives the response JSON pointer.

Expected JSON for request_json:

{
  "model": "in-memory::pixart",
  "prompt": "a cat sitting on a roof at sunset",
  "n": 1,
  "size": "512x512"
}
  • model (required): model ID in "backend::model" format.
  • prompt (required): text description of the image to generate.
  • n: number of images to generate (default 1).
  • size: image dimensions as "WxH" (e.g., "512x512", "1024x1024").

Returns: ATELICO_OK (0) on success.

On success, *out_response_json points to an ImageGenerationResponse:

{
  "created": 1234567890,
  "data": [
    {"b64_json": "iVBORw0KGgo..."}
  ]
}

Each entry in data contains a base64-encoded PNG image.

Safety:

All pointers must be valid.

atelico_image_remove_background

int32_t atelico_image_remove_background(AtelicoEngine* engine, const char* request_json, const char** out_response_json);

Remove the background from an image (blocking).

  • engine (AtelicoEngine*): valid engine handle.
  • request_json (const char*): JSON BackgroundRemovalRequest.
  • out_response_json (const char**): receives the response JSON pointer.

Expected JSON for request_json:

{
  "model": "in-memory::rembg",
  "image": "iVBORw0KGgo..."
}
  • model (required): model ID for the background removal model.
  • image (required): base64-encoded input image (PNG or JPEG).

Returns: ATELICO_OK (0) on success.

On success, *out_response_json points to a BackgroundRemovalResponse:

{
  "data": [
    {"b64_json": "iVBORw0KGgo..."}
  ]
}

Each entry in data contains a base64-encoded PNG with transparent background.

Safety:

All pointers must be valid.

atelico_audio_synthesize

int32_t atelico_audio_synthesize(AtelicoEngine* engine, const char* request_json, const char** out_response_json);

Synthesize speech from text (blocking).

  • engine (AtelicoEngine*): valid engine handle.
  • request_json (const char*): JSON AudioSpeechRequest.
  • out_response_json (const char**): receives the response JSON pointer.

Expected JSON for request_json:

{
  "model": "in-memory::tts",
  "input": "Hello from Atelico.",
  "voice": "af_heart",
  "speed": 1.0
}
  • model (required): model id, e.g. "in-memory::kokoro-82m" or "in-memory::pocket-tts".
  • input (required): text to synthesize.
  • voice: voice identifier (default "af_heart").
  • speed: 0.25–4.0 multiplier (default 1.0).

Returns: ATELICO_OK (0) on success.

On success, *out_response_json points to:

{
  "audio_b64": "UklGRn...",
  "duration_seconds": 1.42,
  "format": "wav",
  "sample_rate": 24000
}

audio_b64 decodes to a complete WAV file (RIFF header + PCM data).

Safety:

All pointers must be valid.

atelico_audio_synthesize_stream

int32_t atelico_audio_synthesize_stream(AtelicoEngine* engine, const char* request_json, uint64_t* out_stream);

Start a streaming speech synthesis. Returns a stream handle for polling.

Use atelico_stream_poll to receive incremental AudioSpeechChunk objects (one per sentence/clause). Use atelico_stream_destroy to clean up early.

  • engine (AtelicoEngine*): valid engine handle.
  • request_json (const char*): JSON AudioSpeechRequest (same schema as atelico_audio_synthesize).
  • out_stream (uint64_t*): receives the stream handle.

Returns: ATELICO_OK (0) on success.

On success, *out_stream is a stream handle. Each poll via atelico_stream_poll returns an AudioSpeechChunk JSON object:

{
  "sequence": 0,
  "audio": "<base64 WAV bytes>",
  "duration_seconds": 1.42,
  "text": "First sentence."
}

atelico_stream_poll returns ATELICO_ERR_STREAM_EMPTY (-7) when no chunk is ready yet, and ATELICO_ERR_STREAM_DONE (-6) once the stream finishes.

Safety:

All pointers must be valid.

atelico_audio_transcribe

int32_t atelico_audio_transcribe(AtelicoEngine* engine, const char* request_json, const char** out_response_json);

Transcribe audio to text (blocking).

  • engine (AtelicoEngine*): valid engine handle.
  • request_json (const char*): JSON request (see below).
  • out_response_json (const char**): receives the response JSON pointer.

Expected JSON for request_json:

{
  "model": "in-memory::whisper",
  "audio_b64": "UklGRn...",
  "language": "en"
}
  • model (required): model id, e.g. "in-memory::whisper", "in-memory::whisper-large-v3-turbo", "in-memory::distil-large-v3".
  • audio_b64 (required): base64-encoded WAV file (RIFF header + PCM data). The FFI layer decodes the WAV to f32 PCM samples and reads the sample rate from the header — no separate sample_rate field is needed.
  • language: ISO 639-1 code ("en", "ja", …) or omit for auto-detect.
  • response_format: "json" (default) or "verbose_json" (includes segments).
  • temperature: decoder sampling temperature (0.0 = greedy).
  • timestamp_granularities: array of "segment" and/or "word".

Returns: ATELICO_OK (0) on success.

On success, *out_response_json points to an AudioTranscriptionResponse:

{
  "text": "Hello from Atelico.",
  "language": "en",
  "duration": 1.4
}

Safety:

All pointers must be valid.

atelico_embed

int32_t atelico_embed(AtelicoEngine* engine, const char* request_json, const char** out_response_json);

Compute text embeddings (blocking, OpenAI-compatible).

  • engine (AtelicoEngine*): valid engine handle.
  • request_json (const char*): JSON EmbeddingRequest.
  • out_response_json (const char**): receives the response JSON pointer.

Expected JSON for request_json:

{
  "model": "in-memory::bge-small",
  "input": ["Hello world", "Another sentence"]
}

Returns: ATELICO_OK (0) on success.

On success, *out_response_json points to an EmbeddingResponse:

{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 0, "embedding": [0.012, -0.034, ...]},
    {"object": "embedding", "index": 1, "embedding": [0.008, 0.021, ...]}
  ],
  "usage": {"prompt_tokens": 8, "total_tokens": 8}
}

Each data entry contains the embedding vector for the corresponding input text.

Safety:

All pointers must be valid.

atelico_embed_similarity

int32_t atelico_embed_similarity(AtelicoEngine* engine, const char* model_id, const char* text_a, const char* text_b, float* out_score);

Compute cosine similarity between two texts using an embedding model.

Embeds both texts and returns their cosine similarity. This is a convenience wrapper around atelico_embed for pairwise comparison.

  • engine (AtelicoEngine*): valid engine handle.
  • model_id (const char*): embedding model identifier.
  • text_a (const char*): first text to compare.
  • text_b (const char*): second text to compare.
  • out_score (float*): receives the cosine similarity score.

Returns: ATELICO_OK (0) on success. On success, *out_score contains the cosine similarity in the range [-1.0, 1.0], where 1.0 means identical direction and -1.0 means opposite.

Safety:

All pointers must be valid.

atelico_classifier_load

int32_t atelico_classifier_load(AtelicoEngine* engine, const char* classifier_id, const char* directory);

Load a text classifier model from a directory on disk.

The directory must contain model weights and a metadata.json file describing the classifier configuration.

  • engine (AtelicoEngine*): valid engine handle.
  • classifier_id (const char*): unique name for this classifier (used in subsequent atelico_classifier_predict calls).
  • directory (const char*): filesystem path to the classifier model directory.

Returns: ATELICO_OK (0) on success.

Safety:

All pointers must be valid.

atelico_classifier_predict

int32_t atelico_classifier_predict(AtelicoEngine* engine, const char* request_json, const char** out_response_json);

Classify input text and return class probabilities.

  • engine (AtelicoEngine*): valid engine handle.
  • request_json (const char*): JSON classifier request.
  • out_response_json (const char**): receives the response JSON pointer.

Expected JSON for request_json:

{
  "model_id": "sentiment",
  "text": "I love this game!",
  "top_k": 3
}
  • model_id (required): name of a loaded classifier (from atelico_classifier_load).
  • text (required): input text to classify.
  • top_k: max number of top predictions to return (default: all classes).

Returns: ATELICO_OK (0) on success.

On success, *out_response_json points to a ClassifierResult:

{
  "label": "positive",
  "probability": 0.92,
  "top": [["positive", 0.92], ["neutral", 0.06], ["negative", 0.02]]
}

Safety:

All pointers must be valid.

atelico_guardrail_check_input

int32_t atelico_guardrail_check_input(AtelicoEngine* engine, const char* text, const char** out_verdict_json);

Check user input text against safety guardrails before sending to the model.

Use this to pre-screen user messages before passing them to inference. Requires the guardrails feature to be enabled at build time.

  • engine (AtelicoEngine*): valid engine handle.
  • text (const char*): the user input text to check.
  • out_verdict_json (const char**): receives the verdict JSON pointer.

Returns: ATELICO_OK (0) on success.

On success, *out_verdict_json points to a SafetyVerdict:

When allowed:

{"action": "Allow", "checker_name": "keyword", "score": null}

When blocked:

{"action": {"Block": {"reason": "profanity detected"}}, "checker_name": "keyword", "score": 0.95}
  • action: "Allow", or an object {"Block": {"reason": "..."}} or {"Rewrite": {"original": "...", "rewritten": "...", "reason": "..."}}.
  • checker_name: name of the guardrail checker that produced this verdict.
  • score: confidence score (0.0-1.0), or null if not applicable.

Safety:

All pointers must be valid.
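A quick way to gate on the verdict is to test whether action is the plain string "Allow". The strstr check below is a deliberately naive sketch matching the serialized form shown above; a real integration should use a JSON parser, since the substring could in principle appear inside user text.

```c
#include <string.h>

/* Naive sketch: treat the verdict as allowed iff action is the string "Allow".
 * Matches the example serialization above; use a JSON parser in production. */
static int verdict_allows(const char* verdict_json) {
    return verdict_json != NULL &&
           strstr(verdict_json, "\"action\": \"Allow\"") != NULL;
}
```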

atelico_guardrail_check_output

int32_t atelico_guardrail_check_output(AtelicoEngine* engine, const char* text, const char** out_verdict_json);

Check model output text against safety guardrails before displaying to the user.

Use this to post-screen model responses before showing them to the player. Requires the guardrails feature to be enabled at build time.

  • engine (AtelicoEngine*): valid engine handle.
  • text (const char*): the model output text to check.
  • out_verdict_json (const char**): receives the verdict JSON pointer.

Returns: ATELICO_OK (0) on success. The verdict JSON has the same schema as atelico_guardrail_check_input (see its documentation for the SafetyVerdict structure).

Safety:

All pointers must be valid.

atelico_lora_load

int32_t atelico_lora_load(AtelicoEngine* engine, const char* model_id, const char* adapter_path);

Load a LoRA adapter onto a loaded model.

The adapter directory must contain adapter_config.json and weight files in HuggingFace PEFT format. The adapter is applied on the next inference request. Only one adapter can be active per model at a time; loading a new adapter replaces the previous one.

  • engine (AtelicoEngine*): valid engine handle.
  • model_id (const char*): model identifier in "backend::model" format.
  • adapter_path (const char*): filesystem path to the LoRA adapter directory.

Returns: ATELICO_OK (0) on success.

Safety:

All pointers must be valid C strings.

atelico_lora_unload

int32_t atelico_lora_unload(AtelicoEngine* engine, const char* model_id);

Unload the active LoRA adapter from a model, reverting to base weights.

  • engine (AtelicoEngine*): valid engine handle.
  • model_id (const char*): model identifier in "backend::model" format.

Returns: ATELICO_OK (0) on success.

Safety:

All pointers must be valid.

atelico_lora_set_scale

int32_t atelico_lora_set_scale(AtelicoEngine* engine, const char* model_id, float scale);

Set the LoRA runtime blending scale for a model.

Allows smooth interpolation between base weights and full LoRA adaptation. Can be changed between inference requests for dynamic personality blending.

  • engine (AtelicoEngine*): valid engine handle.
  • model_id (const char*): model identifier in "backend::model" format.
  • scale (float): blending factor. 0.0 = base weights only, 1.0 = full LoRA effect. Values between 0 and 1 interpolate linearly.

Returns: ATELICO_OK (0) on success.

Safety:

All pointers must be valid.

atelico_stream_poll

int32_t atelico_stream_poll(AtelicoEngine* engine, uint64_t stream_id, const char** out_json);

Poll for the next JSON chunk from a stream (non-blocking).

Call this each game frame (or on a timer) to receive incremental results from a streaming operation. Designed for game-loop integration where blocking is not acceptable.

  • engine (AtelicoEngine*): valid engine handle.
  • stream_id (uint64_t): handle returned by a *_stream function (e.g., atelico_llm_chat_stream, atelico_llm_respond_stream).
  • out_json (const char**): receives a pointer to the JSON chunk.

Returns: one of the following result codes:

  • ATELICO_OK (0): data is available in *out_json. The JSON content depends on the stream type (chat chunks, response deltas, etc.).
  • ATELICO_ERR_STREAM_EMPTY (-7): no data available yet. Retry on the next frame.
  • ATELICO_ERR_STREAM_DONE (-6): the stream has completed. The stream handle is automatically freed; do not poll again.
  • Other negative values indicate an error.

The *out_json pointer is valid until the next Atelico API call on the same thread. Copy the string if you need to keep it longer.

Safety:

engine must be a valid pointer. out_json must be a valid pointer.

atelico_stream_cancel

int32_t atelico_stream_cancel(AtelicoEngine* engine, uint64_t stream_id);

Cancel and destroy an active stream, dropping any unread data.

Safe to call even if the stream has already completed or the handle is unknown (returns ATELICO_OK in both cases).

  • engine (AtelicoEngine*): valid engine handle.
  • stream_id (uint64_t): handle returned by a *_stream function.

Returns: ATELICO_OK (0) on success.

Safety:

engine must be a valid pointer.

atelico_stream_destroy

int32_t atelico_stream_destroy(AtelicoEngine* engine, uint64_t stream_id);

Destroy a stream handle. Equivalent to atelico_stream_cancel.

Streams that reach ATELICO_ERR_STREAM_DONE are automatically cleaned up, so calling this afterward is a no-op (but safe).

  • engine (AtelicoEngine*): valid engine handle.
  • stream_id (uint64_t): handle returned by a *_stream function.

Returns: ATELICO_OK (0) on success.

Safety:

engine must be a valid pointer.

atelico_llm_complete

int32_t atelico_llm_complete(AtelicoEngine* engine, const char* request_json, const char** out_response_json);

Run a blocking text completion (non-chat, raw prompt completion).

Unlike atelico_llm_chat, this takes a raw text prompt without conversation structure. Useful for autocomplete, fill-in, or non-conversational generation tasks.

  • engine (AtelicoEngine*): valid engine handle.
  • request_json (const char*): JSON TextCompletionRequest.
  • out_response_json (const char**): receives the response JSON pointer.

Expected JSON for request_json:

{
  "model": "in-memory::llama-3.2-1b",
  "prompt": "Once upon a time",
  "max_tokens": 64,
  "temperature": 0.7
}

Returns: ATELICO_OK (0) on success.

On success, *out_response_json points to a TextCompletionResponse:

{
  "id": "cmpl-abc123",
  "object": "text_completion",
  "choices": [
    {"index": 0, "text": " there was a village...", "finish_reason": "length"}
  ],
  "usage": {
    "prompt_tokens": 5,
    "completion_tokens": 64,
    "total_tokens": 69
  }
}

Safety:

All pointers must be valid.

atelico_llm_respond_stream

int32_t atelico_llm_respond_stream(AtelicoEngine* engine, const char* request_json, uint64_t* out_stream);

Start a streaming response request (OpenAI Responses API). Returns a stream handle.

The request JSON is the same schema as atelico_llm_respond. Use atelico_stream_poll to receive incremental response delta objects.

  • engine (AtelicoEngine*): valid engine handle.
  • request_json (const char*): JSON ResponseRequest (same schema as atelico_llm_respond).
  • out_stream (uint64_t*): receives the stream handle.

Returns: ATELICO_OK (0) on success. On success, *out_stream is set to a stream handle for use with atelico_stream_poll.

Safety:

All pointers must be valid.

atelico_guardrail_check_image_prompt

int32_t atelico_guardrail_check_image_prompt(AtelicoEngine* engine, const char* prompt, const char** out_verdict_json);

Check an image generation prompt against safety guardrails.

Use this to pre-screen image prompts before passing them to atelico_image_generate. Requires the guardrails feature.

  • engine (AtelicoEngine*): valid engine handle.
  • prompt (const char*): the image generation prompt to check.
  • out_verdict_json (const char**): receives the verdict JSON pointer.

Returns: ATELICO_OK (0) on success. The verdict JSON has the same schema as atelico_guardrail_check_input (see its documentation for the SafetyVerdict structure).

Safety:

All pointers must be valid.

atelico_model_is_loaded

int32_t atelico_model_is_loaded(const AtelicoEngine* engine, const char* model_id);

Check if a model is currently loaded and ready for inference.

  • engine (const AtelicoEngine*): valid engine handle.
  • model_id (const char*): model identifier in "backend::model" format.

Returns: int32_t -- 1 if loaded and ready, 0 if not loaded or on error.

Safety:

All pointers must be valid.

atelico_kvstore_create Platform-specific

int32_t atelico_kvstore_create(AtelicoEngine* engine, const char* config_json, uint64_t* out_store_id);

Create a new semantic KV store backed by an on-disk SQLite database.

The store combines embedding-based vector search with faceted metadata filtering. Useful for game NPC memory systems, dialogue history, and semantic search over game content.

  • engine (AtelicoEngine*): valid engine handle.
  • config_json (const char*): JSON store configuration.
  • out_store_id (uint64_t*): receives the store handle.

Expected JSON for config_json:

{
  "store_id": "npc-memories",
  "db_path": "./data",
  "table_name": "entries",
  "embed_dim": 384,
  "has_priority": true,
  "similarity_weight": 0.5,
  "priority_weight": 0.5
}
  • store_id: unique name for this store (default "default").
  • db_path: filesystem path for the SQLite database (default "./data").
  • table_name: SQLite table name (default "entries").
  • embed_dim: embedding vector dimensionality (default 384).
  • has_priority: whether entries have priority scores (default false).
  • similarity_weight: weight for similarity in ranking (default 0.5).
  • priority_weight: weight for priority in ranking (default 0.5).

Returns: ATELICO_OK (0) on success. On success, *out_store_id is set to an opaque store handle for use with other atelico_kvstore_* functions.

Safety:

All pointers must be valid.

atelico_kvstore_insert Platform-specific

int32_t atelico_kvstore_insert(AtelicoEngine* engine, const char* store_id, const char* entries_json);

Insert entries into a KV store.

  • engine (AtelicoEngine*): valid engine handle.
  • store_id (const char*): store identifier (the store_id used in atelico_kvstore_create).
  • entries_json (const char*): JSON array of entries.

Expected JSON for entries_json:

[
  {
    "id": "mem-001",
    "key_text": "guard saw the thief enter the castle",
    "key_embedding": [0.1, -0.2, 0.05],
    "facets": {"npc_id": "guard-01", "location": "castle"},
    "priority": 0.8
  }
]
  • id (required): unique entry identifier.
  • key_text (required): searchable text content.
  • key_embedding: pre-computed embedding vector. If omitted, you must compute and provide it externally (the store does not auto-embed).
  • facets: key-value string metadata for filtering during queries.
  • priority: optional priority score for ranking (used when has_priority was set to true in the store config).

Returns: ATELICO_OK (0) on success.

Safety:

All pointers must be valid.

atelico_kvstore_query Platform-specific

int32_t atelico_kvstore_query(AtelicoEngine* engine, const char* store_id, const char* query_json, const char** out_results_json);

Query a KV store with vector similarity search.

  • engine (AtelicoEngine*): valid engine handle.
  • store_id (const char*): store identifier.
  • query_json (const char*): JSON query parameters.
  • out_results_json (const char**): receives the results JSON pointer.

Expected JSON for query_json:

{
  "query_embedding": [0.1, -0.2, 0.05],
  "query_text": "who stole the key?",
  "facet_filters": {"npc_id": "guard-01"},
  "vector_search_limit": 20,
  "limit": 5,
  "use_prefilter": true
}
  • query_embedding: query vector for cosine similarity search.
  • query_text: optional text, used for logging/debugging only.
  • facet_filters: restrict results to entries matching these facet values.
  • vector_search_limit: max candidates from vector index (default 20).
  • limit: max results to return (default 5).
  • use_prefilter: apply facet filters before vector search (default true).

Returns: ATELICO_OK (0) on success.

On success, *out_results_json points to a JSON array of results:

[
  {
    "id": "mem-001",
    "key_text": "guard saw the thief enter the castle",
    "similarity": 0.95,
    "priority": 0.8,
    "combined_score": 0.87
  }
]

Safety:

All pointers must be valid.

atelico_kvstore_scan Platform-specific

int32_t atelico_kvstore_scan(AtelicoEngine* engine, const char* store_id, const char* filter, uint32_t limit, const char** out_results_json);

Scan entries in a KV store with an optional SQL-like facet filter.

Unlike atelico_kvstore_query, this does not use vector similarity -- it returns entries matching the filter in insertion order.

  • engine (AtelicoEngine*): valid engine handle.
  • store_id (const char*): store identifier.
  • filter (const char*): SQL-like filter expression (e.g., "npc_id = 'guard-01'") or NULL for no filter (returns all entries).
  • limit (uint32_t): maximum number of entries to return.
  • out_results_json (const char**): receives the results JSON pointer.

Returns: ATELICO_OK (0) on success. The result JSON array has the same schema as atelico_kvstore_query.

Safety:

All pointers must be valid. filter may be NULL.

atelico_kvstore_count Platform-specific

int32_t atelico_kvstore_count(AtelicoEngine* engine, const char* store_id, const char* filter, uint64_t* out_count);

Count entries in a KV store with an optional facet filter.

  • engine (AtelicoEngine*): valid engine handle.
  • store_id (const char*): store identifier.
  • filter (const char*): SQL-like filter expression, or NULL to count all entries.
  • out_count (uint64_t*): receives the entry count.

Returns: ATELICO_OK (0) on success. On success, *out_count contains the number of matching entries.

Safety:

All pointers must be valid. filter may be NULL.

atelico_kvstore_destroy Platform-specific

int32_t atelico_kvstore_destroy(AtelicoEngine* engine, const char* store_id);

Delete a KV store and release its in-memory resources.

The on-disk SQLite database file is NOT deleted; only the in-memory state (vector index, caches) is freed. Re-create the store with the same db_path to reload from disk.

  • engine (AtelicoEngine*): valid engine handle.
  • store_id (const char*): store identifier.

Returns: ATELICO_OK (0) on success.

Safety:

All pointers must be valid.

atelico_ann_create Platform-specific

int32_t atelico_ann_create(const char* config_json, uint64_t* out_index);

Create a new approximate nearest neighbor (ANN) index using HNSW.

The index is held in memory and independent of engine lifecycle. Use atelico_ann_destroy to free it.

  • config_json (const char*): JSON index configuration.
  • out_index (uint64_t*): receives the index handle.

Expected JSON for config_json:

{
  "dim": 384,
  "m": 16,
  "max_nb_connection": 16,
  "ef_construction": 200,
  "ef_search": 50,
  "max_elements": 10000
}
  • dim: dimensionality of vectors (default 384).
  • m: HNSW connectivity parameter (default 16).
  • max_nb_connection: max connections per node (default 16).
  • ef_construction: search scope during index construction (default 200).
  • ef_search: search scope during queries (default 50).
  • max_elements: maximum index capacity (default 10000).

Returns: ATELICO_OK (0) on success. On success, *out_index is set to an opaque index handle for use with other atelico_ann_* functions.

Safety:

All pointers must be valid.

atelico_ann_insert Platform-specific

int32_t atelico_ann_insert(uint64_t index_id, const float* vector, uint32_t dim, uint64_t label_id);

Insert a vector into an ANN index.

Vectors are staged but not searchable until atelico_ann_build is called.

  • index_id (uint64_t): handle from atelico_ann_create.
  • vector (const float*): pointer to dim contiguous float values.
  • dim (uint32_t): number of dimensions (must match the index config).
  • label_id (uint64_t): application-defined label for this vector (returned in search results to identify the entry).

Returns: ATELICO_OK (0) on success, ATELICO_ERR_STORE_NOT_FOUND (-9) if the index handle is invalid.

Safety:

vector must point to at least dim contiguous floats.

atelico_ann_build Platform-specific

int32_t atelico_ann_build(uint64_t index_id);

Build the ANN index. Must be called after all insertions and before any search queries.

This constructs the HNSW graph structure. The operation is O(n log n) in the number of inserted vectors.

  • index_id (uint64_t): handle from atelico_ann_create.

Returns: ATELICO_OK (0) on success, ATELICO_ERR_STORE_NOT_FOUND (-9) if the index handle is invalid.

atelico_ann_search Platform-specific

int32_t atelico_ann_search(uint64_t index_id, const float* query_vector, uint32_t dim, uint32_t k, const char** out_results_json);

Search the ANN index for the k nearest neighbors.

The index must have been built with atelico_ann_build before searching.

  • index_id (uint64_t): handle from atelico_ann_create.
  • query_vector (const float*): pointer to dim contiguous float query values.
  • dim (uint32_t): number of dimensions (must match the index config).
  • k (uint32_t): number of nearest neighbors to return.
  • out_results_json (const char**): receives the results JSON pointer.

Returns: ATELICO_OK (0) on success.

On success, *out_results_json points to a JSON array of results:

[
  {"label_id": 42, "distance": 0.05},
  {"label_id": 17, "distance": 0.12}
]
  • label_id: the application-defined label from atelico_ann_insert.
  • distance: cosine distance to the query vector (lower = more similar).

Safety:

query_vector must point to at least dim contiguous floats.

atelico_ann_destroy Platform-specific

int32_t atelico_ann_destroy(uint64_t index_id);

Destroy an ANN index and free its memory.

Safe to call with an unknown handle (no-op).

  • index_id (uint64_t): handle from atelico_ann_create.

Returns: ATELICO_OK (0) always.

atelico_prefix_prefill

int32_t atelico_prefix_prefill(AtelicoEngine* engine, const char* request_json, const char** out_handle_json);

Prefill a prompt prefix and store it for reuse. On success, *out_handle_json receives the serialized PrefixHandle as JSON.

Safety:

All pointers must be valid. out_handle_json must point to writable memory.

atelico_prefix_generate

int32_t atelico_prefix_generate(AtelicoEngine* engine, const char* request_json, const char** out_text);

Generate from a stored prefix (blocking). On success, *out_text receives the generated text.

Safety:

All pointers must be valid.

atelico_prefix_release

int32_t atelico_prefix_release(AtelicoEngine* engine, const char* handle_json);

Release a prefix handle. handle_json is the serialized PrefixHandle.

Safety:

All pointers must be valid.

atelico_prefix_list

int32_t atelico_prefix_list(AtelicoEngine* engine, const char** out_json);

List all cached prefix handles as a JSON array.

Safety:

All pointers must be valid.

atelico_cache_get

int32_t atelico_cache_get(AtelicoEngine* engine, const char* key_json, const char* policy_json, const char** out_result_json);

Look up a cached prompt result. Returns ATELICO_OK with the cached result in *out_result_json on a hit, or ATELICO_ERR_STREAM_EMPTY (-7) on a miss.

Safety:

All pointers must be valid.

atelico_cache_put

int32_t atelico_cache_put(AtelicoEngine* engine, const char* key_json, const char* policy_json, const char* response_json, const char* metadata_json);

Store a prompt result in the cache.

Safety:

All pointers must be valid.

atelico_cache_clear

int32_t atelico_cache_clear(AtelicoEngine* engine);

Clear all cached prompt results.

Safety:

engine must be a valid pointer.

atelico_matcher_rank

int32_t atelico_matcher_rank(AtelicoEngine* engine, const char* matcher_id, const char* request_json, const char** out_result_json);

Rank dynamic candidates against a query using a registered matcher.

Safety:

All pointers must be valid.