Version: 0.9

Godot API Reference

AtelicoEngineNode

The main Godot node for the Atelico AI Engine.

Add it as a child node in your scene tree and call initialize_engine to configure backends before using any LLM, embedding, image generation, or audio methods. A minimal setup sketch follows the signal summary below.

Signals emitted by this node:

  • inference_token_generated(job_id: int, chunk_json: String) -- a new token chunk arrived during a streaming LLM request.
  • inference_completed(job_id: int, success: bool) -- a streaming LLM request finished (success=true) or failed (success=false).
  • image_generation_chunk(job_id: int, response_json: String) -- a progress chunk from streaming image generation.
  • image_generation_completed(job_id: int, success: bool) -- streaming image generation finished.
  • audio_synthesis_chunk(job_id: int, chunk_json: String) -- an audio chunk from streaming TTS synthesis.
  • audio_synthesis_completed(job_id: int, success: bool) -- streaming TTS synthesis finished.
  • model_loading_completed(success: bool) -- engine initialization done.
  • async_request_completed(job_id: int, response_json: String) -- a non-blocking (async) request completed. response_json is empty on error.
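
For orientation, here is a minimal GDScript setup sketch. The node path, handler name, and backend config are illustrative, not part of the API:

    @onready var engine = $AtelicoEngineNode  # placeholder path to the node in your scene

    func _ready() -> void:
        # Connect signals before kicking off initialization.
        engine.model_loading_completed.connect(_on_engine_ready)
        engine.initialize_engine([{"name": "in-memory", "type": "in-memory", "config": {}}])

    func _on_engine_ready(success: bool) -> void:
        print("Atelico engine ready: ", success)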

Signals

inference_token_generated

Emitted once per token during a streaming LLM request (llm_chat_stream or llm_respond_stream).

  • job_id: matches the return value of the streaming method.
  • chunk_json: a ChatCompletionChunk or ResponseStreamDelta JSON string.

inference_completed

Emitted when a streaming LLM request finishes.

  • job_id: matches the return value of the streaming method.
  • success: true if generation completed normally, false on error.

image_generation_chunk

Emitted for each progress update during streaming image generation (image_generate_stream).

  • job_id: matches the return value of image_generate_stream.
  • response_json: an ImageGenerationResponse JSON string with partial or complete image data.

image_generation_completed

Emitted when streaming image generation finishes.

  • job_id: matches the return value of image_generate_stream.
  • success: true if generation completed normally, false on error.

audio_synthesis_chunk

Emitted for each chunk during streaming TTS synthesis (audio_synthesize_stream).

  • job_id: matches the return value of audio_synthesize_stream.
  • chunk_json: an AudioSpeechChunk JSON string with fields sequence, audio (base64 WAV), duration_seconds, text.

audio_synthesis_completed

Emitted when streaming TTS synthesis finishes.

  • job_id: matches the return value of audio_synthesize_stream.
  • success: true on normal completion, false on error.

model_loading_completed

Emitted when initialize_engine finishes configuring backends.

  • success: true if at least one backend was configured successfully.

async_request_completed

Emitted when a non-blocking (async) request completes. Fired by llm_chat_async and embed_async.

  • job_id: matches the return value of the async method.
  • response_json: the full JSON response string (empty on error).

Methods

set_env_var(key: GString, value: GString)

Set a process environment variable.

Useful for configuring RUST_LOG, HF_TOKEN, or OPENAI_API_KEY from GDScript before initializing the engine.

Inputs

  • key -- environment variable name (e.g. "RUST_LOG", "HF_TOKEN").
  • value -- value to set.

Outputs

No return value.

Example

Input: key="RUST_LOG", value="debug"
Output: (none -- environment variable is set for the process)

initialize_engine(backends_config: Array<VarDictionary>)

Configure the engine router with an array of backend definitions.

GPU scheduling configuration is read from the AtelicoSingleton at init time. Call the AtelicoSingleton setters (set_gpu_scheduling_mode(), set_vram_budget_mb(), and so on) before calling this method to configure GPU sharing behavior.

Inputs

  • backends_config -- Array of Dictionaries, each with:
    • "name" (string, required) -- logical name for the backend (e.g. "in-memory").
    • "type" (string, required) -- "in-memory" (local inference) or "proxy" (remote API).
    • "config" (Dictionary, required) -- backend-specific settings:
      • For "proxy": {"base_url": "...", "api_key": "..."}.
      • For "in-memory": empty dictionary {} is valid.

Outputs

No return value. Emits model_loading_completed(true) signal on success.

Example

Input: [{"name":"in-memory","type":"in-memory","config":{}}]
Output: (signal) model_loading_completed(true)
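
As a sketch, here is how both a local backend and a remote proxy might be configured from GDScript. The base URL is illustrative, and the API key is read from the environment rather than hard-coded:

    func _ready() -> void:
        var backends := [
            {"name": "in-memory", "type": "in-memory", "config": {}},
            {"name": "cloud", "type": "proxy", "config": {
                "base_url": "https://api.openai.com/v1",  # illustrative remote endpoint
                "api_key": OS.get_environment("OPENAI_API_KEY"),
            }},
        ]
        engine.model_loading_completed.connect(func(success): print("backends ready: ", success))
        engine.initialize_engine(backends)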

llm_chat(request_str: GString) -> GString

Synchronous (blocking) chat completion.

Inputs

  • request_str -- JSON ChatCompletionRequest:
    • model (string, required) -- model ID in "backend::org/model" format.
    • messages (array, required) -- conversation messages:
      • role (string) -- "system", "user", or "assistant".
      • content (string) -- message text.
    • max_tokens (int) -- maximum tokens to generate.
    • temperature (float) -- sampling temperature 0.0-2.0.
    • top_p (float) -- nucleus sampling threshold.
    • response_format (object) -- structured output schema.

Outputs

JSON ChatCompletionResponse (empty string on error):

  • id (string) -- unique response ID.
  • choices (array) -- generated completions:
    • message (object) -- { role, content }.
    • finish_reason (string) -- "stop" or "length".
  • usage (object) -- { prompt_tokens, completion_tokens, total_tokens }.

Example

Input: {"model":"in-memory::meta-llama/Llama-3.2-1B-Instruct","messages":[{"role":"user","content":"Hello"}],"max_tokens":100}
Output: {"id":"chatcmpl-xxx","choices":[{"message":{"role":"assistant","content":"Hi!"},"finish_reason":"stop"}],"usage":{"prompt_tokens":5,"completion_tokens":3,"total_tokens":8}}

llm_chat_async(request_str: GString) -> i64

Non-blocking chat completion. Runs on a background thread.

Inputs

  • request_str -- same JSON ChatCompletionRequest schema as llm_chat.

Outputs

int: a job_id. When inference finishes, the async_request_completed(job_id, response_json) signal is emitted with the full ChatCompletionResponse JSON (or empty string on error).

Example

Input: {"model":"in-memory::llama","messages":[{"role":"user","content":"Hi"}]}
Output: 0 (job_id; then signal async_request_completed(0, "{...}") is emitted)
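
A sketch of the async round trip in GDScript (handler names are illustrative; connect once, e.g. in _ready, to avoid duplicate connections):

    func start_chat() -> void:
        engine.async_request_completed.connect(_on_chat_done)
        var job_id: int = engine.llm_chat_async(JSON.stringify({
            "model": "in-memory::llama",
            "messages": [{"role": "user", "content": "Hi"}],
        }))
        print("started job ", job_id)

    func _on_chat_done(job_id: int, response_json: String) -> void:
        if response_json.is_empty():
            push_error("chat job %d failed" % job_id)
            return
        var response: Dictionary = JSON.parse_string(response_json)
        print(response["choices"][0]["message"]["content"])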

llm_chat_stream(request_str: GString) -> i64

Streaming chat completion. Tokens arrive via signals.

Inputs

  • request_str -- same JSON ChatCompletionRequest schema as llm_chat. The stream field is set automatically.

Outputs

int: a job_id. For each generated token, the inference_token_generated(job_id, chunk_json) signal is emitted where chunk_json is a ChatCompletionChunk:

  • id (string) -- chunk ID.
  • choices (array) -- [{"delta": {"content": "token"}, "finish_reason": null}]

When generation finishes, inference_completed(job_id, success) is emitted.

Example

Input: {"model":"in-memory::llama","messages":[{"role":"user","content":"Hi"}]}
Output: 0 (job_id; then signals: inference_token_generated(0, "{...}") per token, inference_completed(0, true))
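
A sketch of accumulating streamed tokens in GDScript (handler names are illustrative):

    var _streamed_text := ""

    func start_stream() -> void:
        engine.inference_token_generated.connect(_on_token)
        engine.inference_completed.connect(_on_stream_done)
        engine.llm_chat_stream(JSON.stringify({
            "model": "in-memory::llama",
            "messages": [{"role": "user", "content": "Hi"}],
        }))

    func _on_token(job_id: int, chunk_json: String) -> void:
        var chunk: Dictionary = JSON.parse_string(chunk_json)
        # Each chunk carries a delta; "content" may be absent on the final chunk.
        _streamed_text += chunk["choices"][0]["delta"].get("content", "")

    func _on_stream_done(job_id: int, success: bool) -> void:
        print(_streamed_text if success else "stream failed")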

llm_text_complete(request_str: GString) -> GString

Synchronous (blocking) text completion. Continues a raw prompt without chat template formatting.

Inputs

  • request_str -- JSON TextCompletionRequest:
    • model (string, required) -- backend-prefixed model ID.
    • prompt (string, required) -- raw text prompt (no chat template).
    • max_tokens (int) -- maximum tokens to generate.
    • temperature (float) -- sampling temperature 0.0-2.0.

Outputs

JSON TextCompletionResponse (empty string on error):

  • id (string) -- unique completion ID.
  • choices (array) -- [{"text": "...", "index": 0, "finish_reason": "stop"}]
  • usage (object) -- { prompt_tokens, completion_tokens, total_tokens }

Example

Input: {"model":"in-memory::llama","prompt":"Once upon a time","max_tokens":50}
Output: {"id":"cmpl-xxx","choices":[{"text":" there was a...","index":0,"finish_reason":"stop"}],"usage":{"prompt_tokens":4,"completion_tokens":50,"total_tokens":54}}

llm_respond(request_str: GString) -> GString

Synchronous (blocking) Responses API call. A higher-level conversational interface that manages conversation state internally.

Inputs

  • request_str -- JSON ResponseRequest:
    • model (string, required) -- backend-prefixed model ID.
    • input (string or array) -- user input text or message list.
    • instructions (string) -- system prompt / instructions.
    • max_output_tokens (int) -- maximum tokens to generate.
    • temperature (float) -- sampling temperature 0.0-2.0.

Outputs

JSON ResponseResponse (empty string on error):

  • id (string) -- unique response ID.
  • output (array) -- [{"type": "message", "content": [...]}]
  • usage (object) -- { input_tokens, output_tokens }

Example

Input: {"model":"in-memory::llama","input":"What is 2+2?"}
Output: {"id":"resp-xxx","output":[{"type":"message","content":[{"type":"output_text","text":"4"}]}],"usage":{"input_tokens":5,"output_tokens":1}}

llm_respond_stream(request_str: GString) -> i64

Streaming Responses API request. Tokens arrive via signals.

Inputs

  • request_str -- same JSON ResponseRequest schema as llm_respond. The stream field is set automatically.

Outputs

int: a job_id. Tokens arrive via inference_token_generated(job_id, chunk_json), and completion is signaled by inference_completed(job_id, success).

Example

Input: {"model":"in-memory::llama","input":"Hello"}
Output: 0 (job_id; then signals: inference_token_generated(0, "{...}") per token, inference_completed(0, true))

image_generate(request_str: GString) -> GString

Synchronous (blocking) image generation from a text prompt.

Inputs

  • request_str -- JSON ImageGenerationRequest:
    • model (string, required) -- backend-prefixed model ID.
    • prompt (string, required) -- text description of the image.
    • n (int) -- number of images to generate (default 1).
    • size (string) -- image dimensions (e.g. "512x512").
    • response_format (string) -- "b64_json" or "url".

Outputs

JSON ImageGenerationResponse (empty string on error):

  • created (int) -- Unix timestamp of generation.
  • data (array) -- [{"b64_json": "...", "revised_prompt": "..."}]

Example

Input: {"model":"in-memory::PixArt-alpha/PixArt-Sigma-XL-2-512-MS","prompt":"A sunset","size":"512x512"}
Output: {"created":1234567890,"data":[{"b64_json":"iVBORw0KGgo...","revised_prompt":"A sunset"}]}

audio_synthesize(request_str: GString) -> GString

Synthesize speech from text (blocking).

Inputs

  • request_str -- JSON AudioSpeechRequest:
    • model (string, required) -- e.g. "in-memory::tts", "in-memory::kokoro-82m", "in-memory::pocket-tts". Bare IDs are accepted after the prefix: tts (default → kokoro-82m), kokoro, kokoro-82m, pocket, pocket-tts.
    • input (string, required) -- text to synthesize.
    • voice (string) -- voice id (default "af_heart"). Pocket TTS ships 24 built-in English voices (alba, ian, morgan, kate, ...).
    • speed (float) -- 0.25–4.0 multiplier (default 1.0).

Outputs

JSON object (empty string on error):

  • audio_b64 (string) -- base64-encoded WAV bytes (RIFF + PCM).
  • duration_seconds (float)
  • format (string) -- "wav"
  • sample_rate (int) -- typically 24000.

Example

Input: {"model":"in-memory::tts","input":"Hello!","voice":"af_heart"}
Output: {"audio_b64":"UklGRn...","duration_seconds":1.42,"format":"wav","sample_rate":24000}

audio_transcribe(request_str: GString) -> GString

Transcribe audio to text (blocking).

Inputs

  • request_str -- JSON object:
    • model (string, required) -- e.g. "in-memory::whisper" (default → whisper-base.en), "in-memory::whisper-large-v3-turbo", "in-memory::distil-large-v3".
    • audio_b64 (string, required) -- base64-encoded WAV file (RIFF + PCM). Sample rate is read from the WAV header.
    • language (string) -- ISO 639-1 code ("en", "ja", ...) or omit for auto-detect.
    • response_format (string) -- "json" (default) or "verbose_json" (adds segments).
    • temperature (float) -- decoder sampling temperature (0.0 = greedy).
    • timestamp_granularities (array) -- ["segment"] and/or ["word"].

Outputs

JSON AudioTranscriptionResponse (empty string on error):

  • text (string)
  • language (string, optional)
  • duration (float, optional)
  • segments, words (optional, depending on response_format and granularities).

Example

Input: {"model":"in-memory::whisper","audio_b64":"UklGRn..."}
Output: {"text":"Hello from Atelico.","language":"en","duration":1.4}

audio_synthesize_stream(request_str: GString) -> i64

Streaming TTS synthesis. Chunks arrive via signals.

Inputs

  • request_str -- JSON AudioSpeechRequest (same schema as audio_synthesize). The stream field is set automatically.

Outputs

int: a job_id. Per-chunk signals fire as audio_synthesis_chunk(job_id, chunk_json), and completion is signaled by audio_synthesis_completed(job_id, success).

Example

Input: {"model":"in-memory::pocket-tts","input":"First. Second.","voice":"alba"}
Output: 7 (job_id; then signals: audio_synthesis_chunk(7, "{...}"), audio_synthesis_completed(7, true))
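
A chunk-handling sketch (handler name is illustrative; chunks arrive in "sequence" order and each carries its own playable WAV):

    func speak_streamed() -> void:
        engine.audio_synthesis_chunk.connect(_on_tts_chunk)
        engine.audio_synthesize_stream(JSON.stringify({
            "model": "in-memory::pocket-tts",
            "input": "First. Second.",
            "voice": "alba",
        }))

    func _on_tts_chunk(job_id: int, chunk_json: String) -> void:
        var chunk: Dictionary = JSON.parse_string(chunk_json)
        print("chunk %d: %s (%.2f s)" % [chunk["sequence"], chunk["text"], chunk["duration_seconds"]])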

image_generate_stream(request_str: GString) -> i64

Streaming image generation. Progress chunks arrive via signals.

Inputs

  • request_str -- same JSON ImageGenerationRequest schema as image_generate. The stream field is set automatically.

Outputs

int: a job_id. Progress chunks arrive via image_generation_chunk(job_id, response_json), and completion is signaled by image_generation_completed(job_id, success).

Example

Input: {"model":"in-memory::PixArt","prompt":"A cat"}
Output: 0 (job_id; then signals: image_generation_chunk(0, "{...}"), image_generation_completed(0, true))

image_remove_background(request_str: GString) -> GString

Synchronous (blocking) background removal from an image.

Inputs

  • request_str -- JSON BackgroundRemovalRequest:
    • model (string, required) -- backend-prefixed model ID.
    • image (string, required) -- base64-encoded image data or URL.

Outputs

JSON response (empty string on error):

  • data (array) -- [{"b64_json": "..."}] -- image with background removed.

Example

Input: {"model":"in-memory::briaai/RMBG-1.4","image":"iVBORw0KGgo..."}
Output: {"data":[{"b64_json":"iVBORw0KGgo..."}]}

embed(request_str: GString) -> GString

Synchronous (blocking) text embedding.

Inputs

  • request_str -- JSON EmbeddingRequest:
    • model (string, required) -- backend-prefixed embedding model ID.
    • input (string or array of strings) -- text(s) to embed.

Outputs

JSON EmbeddingResponse (empty string on error):

  • object (string) -- always "list".
  • model (string) -- model used.
  • data (array) -- [{"object": "embedding", "index": 0, "embedding": [0.1, ...]}]
  • usage (object) -- { prompt_tokens, total_tokens }

Example

Input: {"model":"in-memory::sentence-transformers/all-MiniLM-L6-v2","input":"Hello world"}
Output: {"object":"list","data":[{"object":"embedding","index":0,"embedding":[0.1,0.2,...]}],"usage":{"prompt_tokens":2,"total_tokens":2}}

embed_async(request_str: GString) -> i64

Non-blocking embedding. Runs on a background thread.

Inputs

  • request_str -- same JSON EmbeddingRequest schema as embed.

Outputs

int: a job_id. When embedding finishes, the async_request_completed(job_id, response_json) signal is emitted with the full EmbeddingResponse JSON (or empty string on error).

Example

Input: {"model":"in-memory::all-MiniLM-L6-v2","input":"Hello"}
Output: 0 (job_id; then signal async_request_completed(0, "{...}") is emitted)

model_load(model_id: GString) -> bool

Pre-load a model (blocking). Call during loading screens to avoid latency on the first inference request.

Inputs

  • model_id -- model ID in "backend::org/model" format (e.g. "in-memory::meta-llama/Llama-3.2-1B-Instruct").

Outputs

bool: true on success, false if the backend is not found or loading fails.

Example

Input: "in-memory::meta-llama/Llama-3.2-1B-Instruct"
Output: true
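
A loading-screen sketch; since the call blocks, a real project might push it onto a background thread (e.g. via WorkerThreadPool) and poll for completion:

    func _ready() -> void:
        if not engine.model_load("in-memory::meta-llama/Llama-3.2-1B-Instruct"):
            push_error("model pre-load failed")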

lora_load(model_id: GString, adapter_path: GString) -> bool

Register a LoRA adapter for a model. This records the intent to load; the adapter is actually applied when the model is referenced using the backend::model::adapter naming convention.

Inputs

  • model_id -- the base model ID to attach the adapter to.
  • adapter_path -- filesystem path to a directory containing adapter_config.json and adapter weight files.

Outputs

bool: true on success.

Example

Input: model_id="in-memory::llama", adapter_path="/adapters/my-lora"
Output: true

lora_unload(model_id: GString) -> bool

Unload a LoRA adapter from a model, reverting to base weights.

Inputs

  • model_id -- the model whose adapter should be removed.

Outputs

bool: true on success.

Example

Input: "in-memory::llama"
Output: true

lora_set_scale(model_id: GString, scale: f64) -> bool

Set the LoRA runtime scale factor for a model's loaded adapter.

A scale of 1.0 applies the full adapter effect; 0.0 effectively disables it without unloading.

Inputs

  • model_id -- the model with a loaded LoRA adapter.
  • scale -- scale factor (typically 0.0 to 1.0; 1.0 = full adapter effect).

Outputs

bool: true on success.

Example

Input: model_id="in-memory::llama", scale=0.5
Output: true
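
Putting the three LoRA methods together in one sketch (the adapter path is hypothetical; the directory must contain adapter_config.json and the adapter weights):

    func try_adapter() -> void:
        if engine.lora_load("in-memory::llama", "/adapters/my-lora"):
            engine.lora_set_scale("in-memory::llama", 0.5)  # apply the adapter at half strength
            # ... run inference here with the adapter in effect ...
            engine.lora_unload("in-memory::llama")  # revert to base weights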

Deprecated Methods

chat_completions(request_str: GString) -> GString

DEPRECATED: Use llm_chat instead.

Synchronous chat completion (old API name). Delegates to llm_chat.

Inputs

  • request_str -- same as llm_chat.

Outputs

Same as llm_chat.

Example

Input: {"model":"in-memory::llama","messages":[{"role":"user","content":"Hi"}]}
Output: (same as llm_chat -- use llm_chat instead)

async_chat_completions(request_str: GString) -> i64

DEPRECATED: Use llm_chat_async instead.

Non-blocking chat completion (old API name). Delegates to llm_chat_async.

Inputs

  • request_str -- same as llm_chat_async.

Outputs

Same as llm_chat_async.

Example

Input: {"model":"in-memory::llama","messages":[{"role":"user","content":"Hi"}]}
Output: (same as llm_chat_async -- use llm_chat_async instead)

stream_chat_completions(request_str: GString) -> i64

DEPRECATED: Use llm_chat_stream instead.

Streaming chat completion (old API name). Delegates to llm_chat_stream.

Inputs

  • request_str -- same as llm_chat_stream.

Outputs

Same as llm_chat_stream.

Example

Input: {"model":"in-memory::llama","messages":[{"role":"user","content":"Hi"}]}
Output: (same as llm_chat_stream -- use llm_chat_stream instead)

responses(request_str: GString) -> GString

DEPRECATED: Use llm_respond instead.

Synchronous Responses API call (old API name). Delegates to llm_respond.

Inputs

  • request_str -- same as llm_respond.

Outputs

Same as llm_respond.

Example

Input: {"model":"in-memory::llama","input":"Hello"}
Output: (same as llm_respond -- use llm_respond instead)

stream_responses(request_str: GString) -> i64

DEPRECATED: Use llm_respond_stream instead.

Streaming Responses API (old API name). Delegates to llm_respond_stream.

Inputs

  • request_str -- same as llm_respond_stream.

Outputs

Same as llm_respond_stream.

Example

Input: {"model":"in-memory::llama","input":"Hello"}
Output: (same as llm_respond_stream -- use llm_respond_stream instead)

embeddings(request_str: GString) -> GString

DEPRECATED: Use embed instead.

Synchronous embedding (old API name). Delegates to embed.

Inputs

  • request_str -- same as embed.

Outputs

Same as embed.

Example

Input: {"model":"in-memory::all-MiniLM-L6-v2","input":"Hello"}
Output: (same as embed -- use embed instead)

async_embeddings(request_str: GString) -> i64

DEPRECATED: Use embed_async instead.

Non-blocking embedding (old API name). Delegates to embed_async.

Inputs

  • request_str -- same as embed_async.

Outputs

Same as embed_async.

Example

Input: {"model":"in-memory::all-MiniLM-L6-v2","input":"Hello"}
Output: (same as embed_async -- use embed_async instead)

stream_image_generation(request_str: GString) -> i64

DEPRECATED: Use image_generate_stream instead.

Streaming image generation (old API name). Delegates to image_generate_stream.

Inputs

  • request_str -- same as image_generate_stream.

Outputs

Same as image_generate_stream.

Example

Input: {"model":"in-memory::PixArt","prompt":"A cat"}
Output: (same as image_generate_stream -- use image_generate_stream instead)

AtelicoSingleton

Engine-level singleton for GPU scheduling and inference configuration.

Registered as AtelicoSingleton and accessible from GDScript via Engine.get_singleton("AtelicoSingleton"). Configure GPU scheduling mode, VRAM budgets, token rate limiting, and frame budgets before calling AtelicoEngineNode.initialize_engine().

Settings configured here are read once at engine initialization time. Changing them after initialize_engine() has no effect.
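
A configuration sketch (the values are illustrative; as noted above, the singleton must be configured before initialize_engine):

    func _ready() -> void:
        var atelico = Engine.get_singleton("AtelicoSingleton")
        atelico.set_gpu_scheduling_mode(2)        # PRIORITIZE_GRAPHICS
        atelico.set_vram_budget_mb(4096)          # cap inference VRAM at 4 GB
        atelico.set_target_tokens_per_second(30)  # throttle generation to protect frame rate
        atelico.set_frame_time_ms(16)             # hint a 60 FPS target
        $AtelicoEngineNode.initialize_engine([{"name": "in-memory", "type": "in-memory", "config": {}}])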

Methods

set_gpu_scheduling_mode(mode: i32)

Set the GPU scheduling mode to control how GPU time is shared between inference and rendering.

  • mode: one of PRIORITIZE_COMPUTE (0), BALANCE (1), or PRIORITIZE_GRAPHICS (2). Invalid values default to BALANCE.

get_gpu_scheduling_mode() -> i32

Get the current GPU scheduling mode as an integer constant.

Returns: one of PRIORITIZE_COMPUTE (0), BALANCE (1), or PRIORITIZE_GRAPHICS (2).

set_vram_budget_mb(budget_mb: i32)

Set the maximum VRAM (in megabytes) that inference may use. When exceeded, model loading will be rejected.

  • budget_mb: VRAM cap in MB. Set to 0 for unlimited (default). Negative values are clamped to 0.

get_vram_budget_mb() -> i32

Get the configured VRAM budget in MB.

Returns: VRAM budget in megabytes, or 0 if unlimited.

set_target_tokens_per_second(tps: i32)

Set the maximum tokens per second for inference. The inference loop sleeps between tokens to stay under this rate, freeing GPU time for rendering.

  • tps: target tokens per second. Set to 0 for unlimited (default). Negative values are clamped to 0.

get_target_tokens_per_second() -> i32

Get the configured token rate limit.

Returns: target tokens per second, or 0 if unlimited.

set_frame_time_ms(ms: i32)

Set the target frame time in milliseconds, used as a hint for adaptive inference throttling. For example, pass 16 for a 60 FPS target or 33 for 30 FPS.

  • ms: frame budget in milliseconds. Set to 0 for no frame budget (default). Negative values are clamped to 0.

get_frame_time_ms() -> i32

Get the configured frame time budget in milliseconds.

Returns: frame time in milliseconds, or 0 if no budget is set.

set_lora_pre_merge(enabled: bool)

Configure whether LoRA adapters are pre-merged into base weights at load time.

Pre-merging (the default) bakes the LoRA delta into the base weight, giving zero per-token overhead during inference. This temporarily uses ~2× the base weight's memory during the merge and clones the base weight to support adapter unload.

Disable pre-merging on memory-constrained devices (iPhone, low-RAM iPads): the A/B matrices are kept separate and the delta is applied at runtime in forward(), trading lower peak memory for a small per-token cost.

  • enabled: true to pre-merge (default), false for runtime delta.

Must be called BEFORE AtelicoEngineNode.initialize_engine(). Changing the setting after init has no effect on already-loaded adapters.

get_lora_pre_merge() -> bool

Get the configured LoRA pre-merge policy. Returns the engine default (true) if set_lora_pre_merge has not been called.

is_cig_d3d12_supported() -> bool

Check whether the GPU supports CiG (Compute-in-Graphics) with D3D12. Requires an NVIDIA Ada Lovelace+ GPU with HAGS enabled and R570+ driver.

Returns: true if D3D12 CiG is supported, false otherwise. Always returns false on non-CUDA builds (macOS, CPU-only).

is_cig_vulkan_supported() -> bool

Check whether the GPU supports CiG (Compute-in-Graphics) with Vulkan. Requires an NVIDIA Ada Lovelace+ GPU with CUDA 12.9+ driver.

Returns: true if Vulkan CiG is supported, false otherwise. Always returns false on non-CUDA builds (macOS, CPU-only).

get_gpu_scheduling_mode_raw() -> i32

set_gpu_scheduling_mode_raw(value: i32)

Raw accessors for the integer value backing the GPU scheduling mode property. Prefer get_gpu_scheduling_mode/set_gpu_scheduling_mode, which validate the value.

AtelicoClassifierNode

Godot node for embedding-based text classification.

Loads pre-trained classifier models (centroid or KNN/HNSW) and predicts the class of input text strings. Useful for intent detection, sentiment analysis, or content routing in games.

Usage: Add as a child node, call initialize(), then load_classifier() with a classifier ID and directory path, then predict() to classify text.

Methods

initialize() -> bool

Initialize the classifier engine. Must be called before load_classifier or predict.

Inputs

(none)

Outputs

bool: true on success, false if initialization fails.

Example

Input: (no arguments)
Output: true

load_classifier(classifier_id: GString, directory: GString) -> bool

Load a classifier from a directory on disk.

Inputs

  • classifier_id -- logical name to reference this classifier in predict.
  • directory -- filesystem path to the classifier directory containing model weights and metadata files.

Outputs

bool: true on success, false on error (logged).

Example

Input: classifier_id="sentiment", directory="/models/sentiment_v1"
Output: true

predict(classifier_id: GString, text: GString, top_k: i32) -> GString

Predict the class of input text using a loaded classifier.

Inputs

  • classifier_id -- the ID passed to load_classifier.
  • text -- the input text string to classify.
  • top_k -- number of top predictions to include (minimum 1).

Outputs

JSON prediction result (empty string on error):

  • label (string) -- predicted class name.
  • probability (float) -- confidence score for the top prediction.
  • top (array) -- top-k predictions: [{"label": "...", "probability": 0.95}, ...]

Example

Input: classifier_id="sentiment", text="I love this game!", top_k=3
Output: {"label":"positive","probability":0.95,"top":[{"label":"positive","probability":0.95},{"label":"neutral","probability":0.04},{"label":"negative","probability":0.01}]}

AtelicoKvStoreNode

Godot node for semantic key-value storage with vector similarity search.

Backed by LanceDB. Store text entries with embeddings, then query by vector similarity to find semantically related content. Useful for game dialogue recall, lore lookup, or dynamic context retrieval.

Usage: Add as a child node, call kvstore_create() with a config JSON, then kvstore_query() to search by embedding similarity.

Methods

kvstore_create(config_json: GString) -> bool

Create a new KV store backed by LanceDB.

Inputs

  • config_json -- JSON configuration object:
    • store_id (string) -- unique identifier for this store (default "default").
    • db_path (string) -- filesystem path for the database (default "./data").
    • table_name (string) -- database table name (default "entries").
    • embed_dim (int) -- embedding vector dimensionality (default 384).
    • has_priority (bool) -- enable priority scoring (default false).
    • similarity_weight (float) -- weight for similarity in combined score (default 0.5).
    • priority_weight (float) -- weight for priority in combined score (default 0.5).

Outputs

bool: true on success, false on error.

Example

Input: {"store_id":"lore","db_path":"./data/lore","table_name":"entries","embed_dim":384}
Output: true

kvstore_query(store_id: GString, query_json: GString) -> GString

Query a KV store using vector similarity search.

Inputs

  • store_id -- the store to query (must match a previous kvstore_create call).
  • query_json -- JSON query object:
    • query_embedding (array of float, required) -- the query vector.
    • query_text (string) -- text for display/debug.
    • vector_search_limit (int) -- ANN candidate pool size (default 20).
    • limit (int) -- number of results to return (default 5).
    • use_prefilter (bool) -- apply facet filters before ANN (default true).

Outputs

JSON array of result objects ("[]" on error):

  • id (string) -- entry identifier.
  • key_text (string) -- entry key text.
  • similarity (float) -- cosine similarity score.
  • priority (float) -- priority value (if enabled).
  • combined_score (float) -- weighted combination of similarity and priority.

Example

Input: store_id="lore", query_json={"query_embedding":[0.1,0.2,...],"limit":3}
Output: [{"id":"1","key_text":"dragon lore","similarity":0.95,"priority":1.0,"combined_score":0.95}]

kvstore_count(store_id: GString, filter: GString) -> i64

Count entries in a KV store, optionally filtered.

Inputs

  • store_id -- the store to count entries in.
  • filter -- SQL-like filter expression (empty string = no filter, count all).

Outputs

int: the entry count, or -1 if the store is not found or an error occurs.

Example

Input: store_id="lore", filter=""
Output: 42

kvstore_destroy(store_id: GString) -> bool

Delete a KV store and remove it from memory.

Inputs

  • store_id -- the store to destroy.

Outputs

bool: true if the store existed and was removed, false if not found.

Example

Input: "lore"
Output: true

AtelicoGuardrailsNode

Godot node for content safety checking (guardrails).

Configure with a preset ("game-safe", "child-safe", or "developer-sdk") and use check_input/check_output/check_image_prompt to validate user or model content against safety policies.

Usage: Add as a child node, call initialize("game-safe"), then pass text through check_input() before sending to the LLM or check_output() before displaying model responses to the player.

Methods

initialize(preset: GString) -> bool

Initialize guardrails with a safety preset.

Inputs

  • preset -- one of "game-safe" (default), "child-safe", or "developer-sdk". Each preset configures different thresholds for violence, sexual content, hate speech, etc.

Outputs

bool: true on success, false on error.

Example

Input: "game-safe"
Output: true

check_input(text: GString) -> GString

Check user input text against safety guardrails.

Inputs

  • text -- the user-provided input text to validate.

Outputs

JSON SafetyVerdict (empty string on error):

  • allowed (bool) -- whether the input passes safety checks.
  • category (string, optional) -- violation category (only if blocked).
  • reason (string, optional) -- explanation of the violation (only if blocked).

Example

Input: "Tell me a joke about dragons"
Output: {"allowed":true}

check_output(text: GString) -> GString

Check model-generated output against safety guardrails.

Inputs

  • text -- the model output to validate before displaying to the player.

Outputs

JSON SafetyVerdict (same schema as check_input, empty string on error):

  • allowed (bool) -- whether the output passes safety checks.
  • category (string, optional) -- violation category (only if blocked).
  • reason (string, optional) -- explanation of the violation (only if blocked).

Example

Input: "The dragon breathes fire at the castle."
Output: {"allowed":true}

check_image_prompt(prompt: GString) -> GString

Check an image generation prompt against safety guardrails.

Inputs

  • prompt -- the image generation prompt to validate.

Outputs

JSON SafetyVerdict (same schema as check_input, empty string on error):

  • allowed (bool) -- whether the prompt passes safety checks.
  • category (string, optional) -- violation category (only if blocked).
  • reason (string, optional) -- explanation of the violation (only if blocked).

Example

Input: "A peaceful sunset over mountains"
Output: {"allowed":true}

AtelicoAnnIndexNode

Godot node for approximate nearest neighbor (ANN) vector search.

A pure data structure backed by an HNSW (Hierarchical Navigable Small World) graph -- no GPU or ML models required. Insert vectors with label IDs, build the index, then search for nearest neighbors by cosine distance.

Usage: Add as a child node, call create(dim, max_elements), then insert() vectors, call build(), and finally search() to find nearest neighbors.

Methods

create(dim: i32, max_elements: i32) -> bool

Create a new empty ANN index.

Inputs

  • dim -- dimensionality of vectors (must match vectors inserted later).
  • max_elements -- maximum number of vectors the index can hold.

Outputs

bool: true on success.

Example

Input: dim=384, max_elements=1000
Output: true

insert(vector: Array<f32>, label_id: i64) -> bool

Insert a vector with an associated label ID.

Call build() after all insertions are complete before searching.

Inputs

  • vector -- array of floats with length matching dim from create().
  • label_id -- integer label to identify this vector in search results.

Outputs

bool: true on success, false if the index has not been created.

Example

Input: vector=[0.1, 0.2, 0.3], label_id=42
Output: true

build() -> bool

Build the HNSW index graph. Must be called after all insertions and before any search() calls.

Inputs

(none)

Outputs

bool: true on success, false if the index has not been created.

Example

Input: (no arguments)
Output: true

search(query: Array<f32>, k: i32) -> GString

Search for the k nearest neighbors of a query vector.

Inputs

  • query -- query vector (array of floats with length matching dim).
  • k -- number of nearest neighbors to return.

Outputs

JSON array of result objects sorted by ascending distance ("[]" if index not created):

  • label_id (int) -- the label assigned during insert().
  • distance (float) -- cosine distance (lower = more similar).

Example

Input: query=[0.1, 0.2, 0.3], k=3
Output: [{"label_id":42,"distance":0.05},{"label_id":17,"distance":0.12},{"label_id":8,"distance":0.34}]

AtelicoCacheNode

Godot node for prompt result caching.

Methods

initialize() -> bool

Initialize the cache. Must be called before the other cache methods. Returns true on success.

cache_get(key_json: GString, policy_json: GString) -> GString

Look up a cached result. Returns the cached response JSON, or an empty string on a miss.

cache_put(key_json: GString, policy_json: GString, response_json: GString) -> bool

Store a result in the cache. Returns true on success.

cache_clear()

Remove all cached entries.

cache_size() -> i32

Return the number of entries currently in the cache.

AtelicoMatcherNode

Godot node for ranking candidates against a query.

Methods

initialize() -> bool

Initialize the matcher engine. Must be called before match_one. Returns true on success.

match_one(matcher_id: GString, query: GString, elements_json: GString) -> GString

Match the single best candidate from elements_json against the query. Returns a JSON result, or an empty string on error.

VectorMemoryStore

Godot node for vector memory storage backed by LanceDB.

Stores character memories and metadata as vector-embedded records and supports similarity-based retrieval. Useful for NPC memory recall, narrative context, and character persona management.

Usage: Add as a child node, set db_name and embed_dim exports, then call connect_to_db() and get_or_create_table() before reading or writing records.

Methods

connect_to_db(db_name: GString) -> bool

Open (or create) a LanceDB database at user://database/ followed by db_name. Does not create tables -- call get_or_create_table afterwards to set up tables with the desired schema.

  • db_name: database directory name (e.g. "npc_memories").

Returns: true on success, false if the connection fails or the database is already connected with the same name.

get_or_create_table(table_name: GString, table_schema: GString) -> bool

Open or create a table with the given name. If the table already exists, it is opened; otherwise a new table is created with the specified schema.

  • table_name: logical table name (e.g. "npc_observations").
  • table_schema: schema preset name. Composable convention: "memory", "memory+emotion", "character", "character+emotion".

Returns: true on success, false if not connected or on error.
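
A setup sketch for this node (the node path and names are placeholders):

    @onready var store = $VectorMemoryStore

    func _ready() -> void:
        if store.connect_to_db("npc_memories") and store.get_or_create_table("npc_observations", "memory+emotion"):
            print("rows: ", store.get_table_row_count("npc_observations", ""))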

write_memory(_table_name: GString, node: VarDictionary) -> bool

Write a single memory record to the store.

  • _table_name: reserved for future multi-table support (currently unused).
  • node: a Dictionary containing all Memory fields. Required keys: uuid, observer, counter, node_count, type_count, depth, poignancy, node_type, embedding_key, location, created, expiration, subject, predicate, object, description, filling (Array of String), and vector (Array of float with length matching embed_dim). Optional: emotion_at_encoding (String).

Returns: true on success, false if not connected or the dictionary is missing required fields.

write_character(_table_name: GString, character: VarDictionary) -> bool

Write a single character/persona metadata record to the store.

  • _table_name: reserved for future multi-table support (currently unused).
  • character: a Dictionary containing character fields. Required keys: uuid, uuid_int, persona_name, age, personality, background, story, daily_plan, home, lifestyle, usual_wake_up_time, activity, reflectiveness. Optional: emotional_baseline (String, defaults to "neutral").

Returns: true on success, false if not connected or the dictionary is missing required fields.

get_table_row_count(_table_name: GString, filter: GString) -> i64

Count rows in the memory store, optionally filtered.

  • _table_name: reserved for future multi-table support (currently unused).
  • filter: SQL-like filter expression (empty string means count all rows).

Returns: the number of matching rows, or -1 if not connected or on error.