Version: 0.9

Godot API Reference

AtelicoEngineNode

The main Godot node for the Atelico AI Engine.

Add it as a child node in your scene tree and call initialize_engine to configure backends before using any LLM, embedding, image generation, or audio methods. A minimal setup sketch follows the signal summary below.

Signals emitted by this node:

  • inference_token_generated(job_id: int, chunk_json: String) -- a new token chunk arrived during a streaming LLM request.
  • inference_completed(job_id: int, success: bool) -- a streaming LLM request finished (success=true) or failed (success=false).
  • image_generation_chunk(job_id: int, response_json: String) -- a progress chunk from streaming image generation.
  • image_generation_completed(job_id: int, success: bool) -- streaming image generation finished.
  • audio_synthesis_chunk(job_id: int, chunk_json: String) -- an audio chunk from streaming TTS synthesis.
  • audio_synthesis_completed(job_id: int, success: bool) -- streaming TTS synthesis finished.
  • model_loading_completed(success: bool) -- engine initialization done.
  • async_request_completed(job_id: int, response_json: String) -- a non-blocking (async) request completed. response_json is empty on error.
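
For orientation, here is a minimal GDScript setup sketch. The node path, handler name, and backend config are illustrative, not part of the API:

    @onready var engine = $AtelicoEngineNode  # placeholder path to the node in your scene

    func _ready() -> void:
        # Connect signals before kicking off initialization.
        engine.model_loading_completed.connect(_on_engine_ready)
        engine.initialize_engine([{"name": "in-memory", "type": "in-memory", "config": {}}])

    func _on_engine_ready(success: bool) -> void:
        print("Atelico engine ready: ", success)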

Signals

inference_token_generated

Emitted once per token during a streaming LLM request (llm_chat_stream or llm_respond_stream).

  • job_id: matches the return value of the streaming method.
  • chunk_json: a ChatCompletionChunk or ResponseStreamDelta JSON string.

inference_completed

Emitted when a streaming LLM request finishes.

  • job_id: matches the return value of the streaming method.
  • success: true if generation completed normally, false on error.

image_generation_chunk

Emitted for each progress update during streaming image generation (image_generate_stream).

  • job_id: matches the return value of image_generate_stream.
  • response_json: an ImageGenerationResponse JSON string with partial or complete image data.

image_generation_completed

Emitted when streaming image generation finishes.

  • job_id: matches the return value of image_generate_stream.
  • success: true if generation completed normally, false on error.

audio_synthesis_chunk

Emitted for each chunk during streaming TTS synthesis (audio_synthesize_stream).

  • job_id: matches the return value of audio_synthesize_stream.
  • chunk_json: an AudioSpeechChunk JSON string with fields sequence, audio (base64 WAV), duration_seconds, text.

audio_synthesis_completed

Emitted when streaming TTS synthesis finishes.

  • job_id: matches the return value of audio_synthesize_stream.
  • success: true on normal completion, false on error.

model_loading_completed

Emitted when initialize_engine finishes configuring backends.

  • success: true if at least one backend was configured successfully.

async_request_completed

Emitted when a non-blocking (async) request completes. Fired by llm_chat_async and embed_async.

  • job_id: matches the return value of the async method.
  • response_json: the full JSON response string (empty on error).

Methods

set_env_var(key: GString, value: GString)

Set a process environment variable.

Useful for configuring RUST_LOG, HF_TOKEN, or OPENAI_API_KEY from GDScript before initializing the engine.

Inputs

  • key -- environment variable name (e.g. "RUST_LOG", "HF_TOKEN").
  • value -- value to set.

Outputs

No return value.

Example

Input: key="RUST_LOG", value="debug"
Output: (none -- environment variable is set for the process)

initialize_engine(backends_config: Array<VarDictionary>)

Configure the engine router with an array of backend definitions.

GPU scheduling configuration is read from the AtelicoSingleton at init time. Call the AtelicoSingleton setters (set_gpu_scheduling_mode(), set_vram_budget_mb(), and so on) before calling this method to configure GPU sharing behavior.

Inputs

  • backends_config -- Array of Dictionaries, each with:
    • "name" (string, required) -- logical name for the backend (e.g. "in-memory").
    • "type" (string, required) -- "in-memory" (local inference) or "proxy" (remote API).
    • "config" (Dictionary, required) -- backend-specific settings:
      • For "proxy": {"base_url": "...", "api_key": "..."}.
      • For "in-memory": empty dictionary {} is valid.

Outputs

No return value. Emits model_loading_completed(true) signal on success.

Example

Input: [{"name":"in-memory","type":"in-memory","config":{}}]
Output: (signal) model_loading_completed(true)
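
As a sketch, here is how both a local backend and a remote proxy might be configured from GDScript. The base URL is illustrative, and the API key is read from the environment rather than hard-coded:

    func _ready() -> void:
        var backends := [
            {"name": "in-memory", "type": "in-memory", "config": {}},
            {"name": "cloud", "type": "proxy", "config": {
                "base_url": "https://api.openai.com/v1",  # illustrative remote endpoint
                "api_key": OS.get_environment("OPENAI_API_KEY"),
            }},
        ]
        engine.model_loading_completed.connect(func(success): print("backends ready: ", success))
        engine.initialize_engine(backends)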

llm_chat(request_str: GString) -> GString

Synchronous (blocking) chat completion.

Inputs

  • request_str -- JSON ChatCompletionRequest:
    • model (string, required) -- model ID in "backend::org/model" format.
    • messages (array, required) -- conversation messages:
      • role (string) -- "system", "user", or "assistant".
      • content (string) -- message text.
    • max_tokens (int) -- maximum tokens to generate.
    • temperature (float) -- sampling temperature 0.0-2.0.
    • top_p (float) -- nucleus sampling threshold.
    • response_format (object) -- structured output schema.

Outputs

JSON ChatCompletionResponse (empty string on error):

  • id (string) -- unique response ID.
  • choices (array) -- generated completions:
    • message (object) -- { role, content }.
    • finish_reason (string) -- "stop" or "length".
  • usage (object) -- { prompt_tokens, completion_tokens, total_tokens }.

Example

Input: {"model":"in-memory::meta-llama/Llama-3.2-1B-Instruct","messages":[{"role":"user","content":"Hello"}],"max_tokens":100}
Output: {"id":"chatcmpl-xxx","choices":[{"message":{"role":"assistant","content":"Hi!"},"finish_reason":"stop"}],"usage":{"prompt_tokens":5,"completion_tokens":3,"total_tokens":8}}

llm_chat_async(request_str: GString) -> i64

Non-blocking chat completion. Runs on a background thread.

Inputs

  • request_str -- same JSON ChatCompletionRequest schema as llm_chat.

Outputs

int: a job_id. When inference finishes, the async_request_completed(job_id, response_json) signal is emitted with the full ChatCompletionResponse JSON (or empty string on error).

Example

Input: {"model":"in-memory::llama","messages":[{"role":"user","content":"Hi"}]}
Output: 0 (job_id; then signal async_request_completed(0, "{...}") is emitted)
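
A sketch of the async round trip in GDScript (handler names are illustrative; connect once, e.g. in _ready, to avoid duplicate connections):

    func start_chat() -> void:
        engine.async_request_completed.connect(_on_chat_done)
        var job_id: int = engine.llm_chat_async(JSON.stringify({
            "model": "in-memory::llama",
            "messages": [{"role": "user", "content": "Hi"}],
        }))
        print("started job ", job_id)

    func _on_chat_done(job_id: int, response_json: String) -> void:
        if response_json.is_empty():
            push_error("chat job %d failed" % job_id)
            return
        var response: Dictionary = JSON.parse_string(response_json)
        print(response["choices"][0]["message"]["content"])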

llm_chat_stream(request_str: GString) -> i64

Streaming chat completion. Tokens arrive via signals.

Inputs

  • request_str -- same JSON ChatCompletionRequest schema as llm_chat. The stream field is set automatically.

Outputs

int: a job_id. For each generated token, the inference_token_generated(job_id, chunk_json) signal is emitted where chunk_json is a ChatCompletionChunk:

  • id (string) -- chunk ID.
  • choices (array) -- [{"delta": {"content": "token"}, "finish_reason": null}]

When generation finishes, inference_completed(job_id, success) is emitted.

Example

Input: {"model":"in-memory::llama","messages":[{"role":"user","content":"Hi"}]}
Output: 0 (job_id; then signals: inference_token_generated(0, "{...}") per token, inference_completed(0, true))
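
A sketch of accumulating streamed tokens in GDScript (handler names are illustrative):

    var _streamed_text := ""

    func start_stream() -> void:
        engine.inference_token_generated.connect(_on_token)
        engine.inference_completed.connect(_on_stream_done)
        engine.llm_chat_stream(JSON.stringify({
            "model": "in-memory::llama",
            "messages": [{"role": "user", "content": "Hi"}],
        }))

    func _on_token(job_id: int, chunk_json: String) -> void:
        var chunk: Dictionary = JSON.parse_string(chunk_json)
        # Each chunk carries a delta; "content" may be absent on the final chunk.
        _streamed_text += chunk["choices"][0]["delta"].get("content", "")

    func _on_stream_done(job_id: int, success: bool) -> void:
        print(_streamed_text if success else "stream failed")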

llm_text_complete(request_str: GString) -> GString

Synchronous (blocking) text completion. Continues a raw prompt without chat template formatting.

Inputs

  • request_str -- JSON TextCompletionRequest:
    • model (string, required) -- backend-prefixed model ID.
    • prompt (string, required) -- raw text prompt (no chat template).
    • max_tokens (int) -- maximum tokens to generate.
    • temperature (float) -- sampling temperature 0.0-2.0.

Outputs

JSON TextCompletionResponse (empty string on error):

  • id (string) -- unique completion ID.
  • choices (array) -- [{"text": "...", "index": 0, "finish_reason": "stop"}]
  • usage (object) -- { prompt_tokens, completion_tokens, total_tokens }

Example

Input: {"model":"in-memory::llama","prompt":"Once upon a time","max_tokens":50}
Output: {"id":"cmpl-xxx","choices":[{"text":" there was a...","index":0,"finish_reason":"stop"}],"usage":{"prompt_tokens":4,"completion_tokens":50,"total_tokens":54}}

llm_respond(request_str: GString) -> GString

Synchronous (blocking) Responses API call. A higher-level conversational interface that manages conversation state internally.

Inputs

  • request_str -- JSON ResponseRequest:
    • model (string, required) -- backend-prefixed model ID.
    • input (string or array) -- user input text or message list.
    • instructions (string) -- system prompt / instructions.
    • max_output_tokens (int) -- maximum tokens to generate.
    • temperature (float) -- sampling temperature 0.0-2.0.

Outputs

JSON ResponseResponse (empty string on error):

  • id (string) -- unique response ID.
  • output (array) -- [{"type": "message", "content": [...]}]
  • usage (object) -- { input_tokens, output_tokens }

Example

Input: {"model":"in-memory::llama","input":"What is 2+2?"}
Output: {"id":"resp-xxx","output":[{"type":"message","content":[{"type":"output_text","text":"4"}]}],"usage":{"input_tokens":5,"output_tokens":1}}

llm_respond_stream(request_str: GString) -> i64

Streaming Responses API request. Tokens arrive via signals.

Inputs

  • request_str -- same JSON ResponseRequest schema as llm_respond. The stream field is set automatically.

Outputs

int: a job_id. Tokens arrive via inference_token_generated(job_id, chunk_json), and completion is signaled by inference_completed(job_id, success).

Example

Input: {"model":"in-memory::llama","input":"Hello"}
Output: 0 (job_id; then signals: inference_token_generated(0, "{...}") per token, inference_completed(0, true))

image_generate(request_str: GString) -> GString

Synchronous (blocking) image generation from a text prompt.

Inputs

  • request_str -- JSON ImageGenerationRequest:
    • model (string, required) -- backend-prefixed model ID.
    • prompt (string, required) -- text description of the image.
    • n (int) -- number of images to generate (default 1).
    • size (string) -- image dimensions (e.g. "512x512").
    • response_format (string) -- "b64_json" or "url".

Outputs

JSON ImageGenerationResponse (empty string on error):

  • created (int) -- Unix timestamp of generation.
  • data (array) -- [{"b64_json": "...", "revised_prompt": "..."}]

Example

Input: {"model":"in-memory::PixArt-alpha/PixArt-Sigma-XL-2-512-MS","prompt":"A sunset","size":"512x512"}
Output: {"created":1234567890,"data":[{"b64_json":"iVBORw0KGgo...","revised_prompt":"A sunset"}]}

audio_synthesize(request_str: GString) -> GString

Synthesize speech from text (blocking).

Inputs

  • request_str -- JSON AudioSpeechRequest:
    • model (string, required) -- e.g. "in-memory::tts", "in-memory::kokoro-82m", "in-memory::pocket-tts". Bare IDs are accepted after the prefix: tts (default → kokoro-82m), kokoro, kokoro-82m, pocket, pocket-tts.
    • input (string, required) -- text to synthesize.
    • voice (string) -- voice id (default "af_heart"). Pocket TTS ships 24 built-in English voices (alba, ian, morgan, kate, ...).
    • speed (float) -- 0.25–4.0 multiplier (default 1.0).

Outputs

JSON object (empty string on error):

  • audio_b64 (string) -- base64-encoded WAV bytes (RIFF + PCM).
  • duration_seconds (float)
  • format (string) -- "wav"
  • sample_rate (int) -- typically 24000.

Example

Input: {"model":"in-memory::tts","input":"Hello!","voice":"af_heart"}
Output: {"audio_b64":"UklGRn...","duration_seconds":1.42,"format":"wav","sample_rate":24000}

audio_transcribe(request_str: GString) -> GString

Transcribe audio to text (blocking).

Inputs

  • request_str -- JSON object:
    • model (string, required) -- e.g. "in-memory::whisper" (default → whisper-base.en), "in-memory::whisper-large-v3-turbo", "in-memory::distil-large-v3".
    • audio_b64 (string, required) -- base64-encoded WAV file (RIFF + PCM). Sample rate is read from the WAV header.
    • language (string) -- ISO 639-1 code ("en", "ja", ...) or omit for auto-detect.
    • response_format (string) -- "json" (default) or "verbose_json" (adds segments).
    • temperature (float) -- decoder sampling temperature (0.0 = greedy).
    • timestamp_granularities (array) -- ["segment"] and/or ["word"].

Outputs

JSON AudioTranscriptionResponse (empty string on error):

  • text (string)
  • language (string, optional)
  • duration (float, optional)
  • segments, words (optional, depending on response_format and granularities).

Example

Input: {"model":"in-memory::whisper","audio_b64":"UklGRn..."}
Output: {"text":"Hello from Atelico.","language":"en","duration":1.4}

audio_synthesize_stream(request_str: GString) -> i64

Streaming TTS synthesis. Chunks arrive via signals.

Inputs

  • request_str -- JSON AudioSpeechRequest (same schema as audio_synthesize). The stream field is set automatically.

Outputs

int: a job_id. Per-chunk signals fire as audio_synthesis_chunk(job_id, chunk_json), and completion is signaled by audio_synthesis_completed(job_id, success).

Example

Input: {"model":"in-memory::pocket-tts","input":"First. Second.","voice":"alba"}
Output: 7 (job_id; then signals: audio_synthesis_chunk(7, "{...}"), audio_synthesis_completed(7, true))
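
A chunk-handling sketch (handler name is illustrative; chunks arrive in "sequence" order and each carries its own playable WAV):

    func speak_streamed() -> void:
        engine.audio_synthesis_chunk.connect(_on_tts_chunk)
        engine.audio_synthesize_stream(JSON.stringify({
            "model": "in-memory::pocket-tts",
            "input": "First. Second.",
            "voice": "alba",
        }))

    func _on_tts_chunk(job_id: int, chunk_json: String) -> void:
        var chunk: Dictionary = JSON.parse_string(chunk_json)
        print("chunk %d: %s (%.2f s)" % [chunk["sequence"], chunk["text"], chunk["duration_seconds"]])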

image_generate_stream(request_str: GString) -> i64

Streaming image generation. Progress chunks arrive via signals.

Inputs

  • request_str -- same JSON ImageGenerationRequest schema as image_generate. The stream field is set automatically.

Outputs

int: a job_id. Progress chunks arrive via image_generation_chunk(job_id, response_json), and completion is signaled by image_generation_completed(job_id, success).

Example

Input: {"model":"in-memory::PixArt","prompt":"A cat"}
Output: 0 (job_id; then signals: image_generation_chunk(0, "{...}"), image_generation_completed(0, true))

image_remove_background(request_str: GString) -> GString

Synchronous (blocking) background removal from an image.

Inputs

  • request_str -- JSON BackgroundRemovalRequest:
    • model (string, required) -- backend-prefixed model ID.
    • image (string, required) -- base64-encoded image data or URL.

Outputs

JSON response (empty string on error):

  • data (array) -- [{"b64_json": "..."}] -- image with background removed.

Example

Input: {"model":"in-memory::briaai/RMBG-1.4","image":"iVBORw0KGgo..."}
Output: {"data":[{"b64_json":"iVBORw0KGgo..."}]}

embed(request_str: GString) -> GString

Synchronous (blocking) text embedding.

Inputs

  • request_str -- JSON EmbeddingRequest:
    • model (string, required) -- backend-prefixed embedding model ID.
    • input (string or array of strings) -- text(s) to embed.

Outputs

JSON EmbeddingResponse (empty string on error):

  • object (string) -- always "list".
  • model (string) -- model used.
  • data (array) -- [{"object": "embedding", "index": 0, "embedding": [0.1, ...]}]
  • usage (object) -- { prompt_tokens, total_tokens }

Example

Input: {"model":"in-memory::sentence-transformers/all-MiniLM-L6-v2","input":"Hello world"}
Output: {"object":"list","data":[{"object":"embedding","index":0,"embedding":[0.1,0.2,...]}],"usage":{"prompt_tokens":2,"total_tokens":2}}

embed_async(request_str: GString) -> i64

Non-blocking embedding. Runs on a background thread.

Inputs

  • request_str -- same JSON EmbeddingRequest schema as embed.

Outputs

int: a job_id. When embedding finishes, the async_request_completed(job_id, response_json) signal is emitted with the full EmbeddingResponse JSON (or empty string on error).

Example

Input: {"model":"in-memory::all-MiniLM-L6-v2","input":"Hello"}
Output: 0 (job_id; then signal async_request_completed(0, "{...}") is emitted)

model_load(model_id: GString) -> bool

Pre-load a model (blocking). Call during loading screens to avoid latency on the first inference request.

Inputs

  • model_id -- model ID in "backend::org/model" format (e.g. "in-memory::meta-llama/Llama-3.2-1B-Instruct").

Outputs

bool: true on success, false if the backend is not found or loading fails.

Example

Input: "in-memory::meta-llama/Llama-3.2-1B-Instruct"
Output: true
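
A loading-screen sketch; since the call blocks, a real project might push it onto a background thread (e.g. via WorkerThreadPool) and poll for completion:

    func _ready() -> void:
        if not engine.model_load("in-memory::meta-llama/Llama-3.2-1B-Instruct"):
            push_error("model pre-load failed")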

lora_load(model_id: GString, adapter_path: GString) -> bool

Register a LoRA adapter for a model. This records the intent to load; the adapter is actually applied when the model is referenced using the backend::model::adapter naming convention.

Inputs

  • model_id -- the base model ID to attach the adapter to.
  • adapter_path -- filesystem path to a directory containing adapter_config.json and adapter weight files.

Outputs

bool: true on success.

Example

Input: model_id="in-memory::llama", adapter_path="/adapters/my-lora"
Output: true

lora_unload(model_id: GString) -> bool

Unload a LoRA adapter from a model, reverting to base weights.

Inputs

  • model_id -- the model whose adapter should be removed.

Outputs

bool: true on success.

Example

Input: "in-memory::llama"
Output: true

lora_set_scale(model_id: GString, scale: f64) -> bool

Set the LoRA runtime scale factor for a model's loaded adapter.

A scale of 1.0 applies the full adapter effect; 0.0 effectively disables it without unloading.

Inputs

  • model_id -- the model with a loaded LoRA adapter.
  • scale -- scale factor (typically 0.0 to 1.0; 1.0 = full adapter effect).

Outputs

bool: true on success.

Example

Input: model_id="in-memory::llama", scale=0.5
Output: true
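
Putting the three LoRA methods together in one sketch (the adapter path is hypothetical; the directory must contain adapter_config.json and the adapter weights):

    func try_adapter() -> void:
        if engine.lora_load("in-memory::llama", "/adapters/my-lora"):
            engine.lora_set_scale("in-memory::llama", 0.5)  # apply the adapter at half strength
            # ... run inference here with the adapter in effect ...
            engine.lora_unload("in-memory::llama")  # revert to base weights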

Deprecated Methods

chat_completions(request_str: GString) -> GString

DEPRECATED: Use llm_chat instead.

Synchronous chat completion (old API name). Delegates to llm_chat.

Inputs

  • request_str -- same as llm_chat.

Outputs

Same as llm_chat.

Example

Input: {"model":"in-memory::llama","messages":[{"role":"user","content":"Hi"}]}
Output: (same as llm_chat -- use llm_chat instead)

async_chat_completions(request_str: GString) -> i64

DEPRECATED: Use llm_chat_async instead.

Non-blocking chat completion (old API name). Delegates to llm_chat_async.

Inputs

  • request_str -- same as llm_chat_async.

Outputs

Same as llm_chat_async.

Example

Input: {"model":"in-memory::llama","messages":[{"role":"user","content":"Hi"}]}
Output: (same as llm_chat_async -- use llm_chat_async instead)

stream_chat_completions(request_str: GString) -> i64

DEPRECATED: Use llm_chat_stream instead.

Streaming chat completion (old API name). Delegates to llm_chat_stream.

Inputs

  • request_str -- same as llm_chat_stream.

Outputs

Same as llm_chat_stream.

Example

Input: {"model":"in-memory::llama","messages":[{"role":"user","content":"Hi"}]}
Output: (same as llm_chat_stream -- use llm_chat_stream instead)

responses(request_str: GString) -> GString

DEPRECATED: Use llm_respond instead.

Synchronous Responses API call (old API name). Delegates to llm_respond.

Inputs

  • request_str -- same as llm_respond.

Outputs

Same as llm_respond.

Example

Input: {"model":"in-memory::llama","input":"Hello"}
Output: (same as llm_respond -- use llm_respond instead)

stream_responses(request_str: GString) -> i64

DEPRECATED: Use llm_respond_stream instead.

Streaming Responses API (old API name). Delegates to llm_respond_stream.

Inputs

  • request_str -- same as llm_respond_stream.

Outputs

Same as llm_respond_stream.

Example

Input: {"model":"in-memory::llama","input":"Hello"}
Output: (same as llm_respond_stream -- use llm_respond_stream instead)

embeddings(request_str: GString) -> GString

DEPRECATED: Use embed instead.

Synchronous embedding (old API name). Delegates to embed.

Inputs

  • request_str -- same as embed.

Outputs

Same as embed.

Example

Input: {"model":"in-memory::all-MiniLM-L6-v2","input":"Hello"}
Output: (same as embed -- use embed instead)

async_embeddings(request_str: GString) -> i64

DEPRECATED: Use embed_async instead.

Non-blocking embedding (old API name). Delegates to embed_async.

Inputs

  • request_str -- same as embed_async.

Outputs

Same as embed_async.

Example

Input: {"model":"in-memory::all-MiniLM-L6-v2","input":"Hello"}
Output: (same as embed_async -- use embed_async instead)

stream_image_generation(request_str: GString) -> i64

DEPRECATED: Use image_generate_stream instead.

Streaming image generation (old API name). Delegates to image_generate_stream.

Inputs

  • request_str -- same as image_generate_stream.

Outputs

Same as image_generate_stream.

Example

Input: {"model":"in-memory::PixArt","prompt":"A cat"}
Output: (same as image_generate_stream -- use image_generate_stream instead)

AtelicoSingleton

Engine-level singleton for GPU scheduling and inference configuration.

Registered as AtelicoSingleton and accessible from GDScript via Engine.get_singleton("AtelicoSingleton"). Configure GPU scheduling mode, VRAM budgets, token rate limiting, and frame budgets before calling AtelicoEngineNode.initialize_engine().

Settings configured here are read once at engine initialization time. Changing them after initialize_engine() has no effect.
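
A configuration sketch (the values are illustrative; as noted above, the singleton must be configured before initialize_engine):

    func _ready() -> void:
        var atelico = Engine.get_singleton("AtelicoSingleton")
        atelico.set_gpu_scheduling_mode(2)        # PRIORITIZE_GRAPHICS
        atelico.set_vram_budget_mb(4096)          # cap inference VRAM at 4 GB
        atelico.set_target_tokens_per_second(30)  # throttle generation to protect frame rate
        atelico.set_frame_time_ms(16)             # hint a 60 FPS target
        $AtelicoEngineNode.initialize_engine([{"name": "in-memory", "type": "in-memory", "config": {}}])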

Methods

set_gpu_scheduling_mode(mode: i32)

Set the GPU scheduling mode to control how GPU time is shared between inference and rendering.

  • mode: one of PRIORITIZE_COMPUTE (0), BALANCE (1), or PRIORITIZE_GRAPHICS (2). Invalid values default to BALANCE.

get_gpu_scheduling_mode() -> i32

Get the current GPU scheduling mode as an integer constant.

Returns: one of PRIORITIZE_COMPUTE (0), BALANCE (1), or PRIORITIZE_GRAPHICS (2).

set_vram_budget_mb(budget_mb: i32)

Set the maximum VRAM (in megabytes) that inference may use. When exceeded, model loading will be rejected.

  • budget_mb: VRAM cap in MB. Set to 0 for unlimited (default). Negative values are clamped to 0.

get_vram_budget_mb() -> i32

Get the configured VRAM budget in MB.

Returns: VRAM budget in megabytes, or 0 if unlimited.

set_target_tokens_per_second(tps: i32)

Set the maximum tokens per second for inference. The inference loop sleeps between tokens to stay under this rate, freeing GPU time for rendering.

  • tps: target tokens per second. Set to 0 for unlimited (default). Negative values are clamped to 0.

get_target_tokens_per_second() -> i32

Get the configured token rate limit.

Returns: target tokens per second, or 0 if unlimited.

set_frame_time_ms(ms: i32)

Set the target frame time in milliseconds, used as a hint for adaptive inference throttling. For example, pass 16 for a 60 FPS target or 33 for 30 FPS.

  • ms: frame budget in milliseconds. Set to 0 for no frame budget (default). Negative values are clamped to 0.

get_frame_time_ms() -> i32

Get the configured frame time budget in milliseconds.

Returns: frame time in milliseconds, or 0 if no budget is set.

set_lora_pre_merge(enabled: bool)

Configure whether LoRA adapters are pre-merged into base weights at load time.

Pre-merging (the default) bakes the LoRA delta into the base weight, giving zero per-token overhead during inference. This temporarily uses ~2× the base weight's memory during the merge and clones the base weight to support adapter unload.

Disable pre-merging on memory-constrained devices (iPhone, low-RAM iPads): the A/B matrices are kept separate and the delta is applied at runtime in forward(), trading lower peak memory for a small per-token cost.

  • enabled: true to pre-merge (default), false for runtime delta.

Must be called BEFORE AtelicoEngineNode.initialize_engine(). Changing the setting after init has no effect on already-loaded adapters.

get_lora_pre_merge() -> bool

Get the configured LoRA pre-merge policy. Returns the engine default (true) if set_lora_pre_merge has not been called.

is_cig_d3d12_supported() -> bool

Check whether the GPU supports CiG (Compute-in-Graphics) with D3D12. Requires an NVIDIA Ada Lovelace+ GPU with HAGS enabled and R570+ driver.

Returns: true if D3D12 CiG is supported, false otherwise. Always returns false on non-CUDA builds (macOS, CPU-only).

is_cig_vulkan_supported() -> bool

Check whether the GPU supports CiG (Compute-in-Graphics) with Vulkan. Requires an NVIDIA Ada Lovelace+ GPU with CUDA 12.9+ driver.

Returns: true if Vulkan CiG is supported, false otherwise. Always returns false on non-CUDA builds (macOS, CPU-only).

get_gpu_scheduling_mode_raw() -> i32

set_gpu_scheduling_mode_raw(value: i32)

Raw accessors for the integer value backing the GPU scheduling mode property. Prefer get_gpu_scheduling_mode/set_gpu_scheduling_mode, which validate the value.

AtelicoClassifierNode

Godot node for embedding-based text classification.

Loads pre-trained classifier models (centroid or KNN/HNSW) and predicts the class of input text strings. Useful for intent detection, sentiment analysis, or content routing in games.

Usage: Add as a child node, call initialize(), then load_classifier() with a classifier ID and directory path, then predict() to classify text.

Methods

initialize() -> bool

Initialize the classifier engine. Must be called before load_classifier or predict.

Inputs

(none)

Outputs

bool: true on success, false if initialization fails.

Example

Input: (no arguments)
Output: true

load_classifier(classifier_id: GString, directory: GString) -> bool

Load a classifier from a directory on disk.

Inputs

  • classifier_id -- logical name to reference this classifier in predict.
  • directory -- filesystem path to the classifier directory containing model weights and metadata files.

Outputs

bool: true on success, false on error (logged).

Example

Input: classifier_id="sentiment", directory="/models/sentiment_v1"
Output: true

predict(classifier_id: GString, text: GString, top_k: i32) -> GString

Predict the class of input text using a loaded classifier.

Inputs

  • classifier_id -- the ID passed to load_classifier.
  • text -- the input text string to classify.
  • top_k -- number of top predictions to include (minimum 1).

Outputs

JSON prediction result (empty string on error):

  • label (string) -- predicted class name.
  • probability (float) -- confidence score for the top prediction.
  • top (array) -- top-k predictions: [{"label": "...", "probability": 0.95}, ...]

Example

Input: classifier_id="sentiment", text="I love this game!", top_k=3
Output: {"label":"positive","probability":0.95,"top":[{"label":"positive","probability":0.95},{"label":"neutral","probability":0.04},{"label":"negative","probability":0.01}]}

AtelicoKvStoreNode

Godot node for semantic key-value storage with vector similarity search.

Backed by LanceDB. Store text entries with embeddings, then query by vector similarity to find semantically related content. Useful for game dialogue recall, lore lookup, or dynamic context retrieval.

Usage: Add as a child node, call kvstore_create() with a config JSON, then kvstore_query() to search by embedding similarity.

Methods

kvstore_create(config_json: GString) -> bool

Create a new KV store backed by LanceDB.

Inputs

  • config_json -- JSON configuration object:
    • store_id (string) -- unique identifier for this store (default "default").
    • db_path (string) -- filesystem path for the database (default "./data").
    • table_name (string) -- database table name (default "entries").
    • embed_dim (int) -- embedding vector dimensionality (default 384).
    • has_priority (bool) -- enable priority scoring (default false).
    • similarity_weight (float) -- weight for similarity in combined score (default 0.5).
    • priority_weight (float) -- weight for priority in combined score (default 0.5).

Outputs

bool: true on success, false on error.

Example

Input: {"store_id":"lore","db_path":"./data/lore","table_name":"entries","embed_dim":384}
Output: true

kvstore_query(store_id: GString, query_json: GString) -> GString

Query a KV store using vector similarity search.

Inputs

  • store_id -- the store to query (must match a previous kvstore_create call).
  • query_json -- JSON query object:
    • query_embedding (array of float, required) -- the query vector.
    • query_text (string) -- text for display/debug.
    • vector_search_limit (int) -- ANN candidate pool size (default 20).
    • limit (int) -- number of results to return (default 5).
    • use_prefilter (bool) -- apply facet filters before ANN (default true).

Outputs

JSON array of result objects ("[]" on error):

  • id (string) -- entry identifier.
  • key_text (string) -- entry key text.
  • similarity (float) -- cosine similarity score.
  • priority (float) -- priority value (if enabled).
  • combined_score (float) -- weighted combination of similarity and priority.

Example

Input: store_id="lore", query_json={"query_embedding":[0.1,0.2,...],"limit":3}
Output: [{"id":"1","key_text":"dragon lore","similarity":0.95,"priority":1.0,"combined_score":0.95}]

kvstore_count(store_id: GString, filter: GString) -> i64

Count entries in a KV store, optionally filtered.

Inputs

  • store_id -- the store to count entries in.
  • filter -- SQL-like filter expression (empty string = no filter, count all).

Outputs

int: the entry count, or -1 if the store is not found or an error occurs.

Example

Input: store_id="lore", filter=""
Output: 42

kvstore_destroy(store_id: GString) -> bool

Delete a KV store and remove it from memory.

Inputs

  • store_id -- the store to destroy.

Outputs

bool: true if the store existed and was removed, false if not found.

Example

Input: "lore"
Output: true

AtelicoGuardrailsNode

Godot node for content safety checking (guardrails).

Configure with a preset ("game-safe", "child-safe", or "developer-sdk") and use check_input/check_output/check_image_prompt to validate user or model content against safety policies.

Usage: Add as a child node, call initialize("game-safe"), then pass text through check_input() before sending to the LLM or check_output() before displaying model responses to the player.

Methods

initialize(preset: GString) -> bool

Initialize guardrails with a safety preset.

Inputs

  • preset -- one of "game-safe" (default), "child-safe", or "developer-sdk". Each preset configures different thresholds for violence, sexual content, hate speech, etc.

Outputs

bool: true on success, false on error.

Example

Input: "game-safe"
Output: true

check_input(text: GString) -> GString

Check user input text against safety guardrails.

Inputs

  • text -- the user-provided input text to validate.

Outputs

JSON SafetyVerdict (empty string on error):

  • allowed (bool) -- whether the input passes safety checks.
  • category (string, optional) -- violation category (only if blocked).
  • reason (string, optional) -- explanation of the violation (only if blocked).

Example

Input: "Tell me a joke about dragons"
Output: {"allowed":true}

check_output(text: GString) -> GString

Check model-generated output against safety guardrails.

Inputs

  • text -- the model output to validate before displaying to the player.

Outputs

JSON SafetyVerdict (same schema as check_input, empty string on error):

  • allowed (bool) -- whether the output passes safety checks.
  • category (string, optional) -- violation category (only if blocked).
  • reason (string, optional) -- explanation of the violation (only if blocked).

Example

Input: "The dragon breathes fire at the castle."
Output: {"allowed":true}

check_image_prompt(prompt: GString) -> GString

Check an image generation prompt against safety guardrails.

Inputs

  • prompt -- the image generation prompt to validate.

Outputs

JSON SafetyVerdict (same schema as check_input, empty string on error):

  • allowed (bool) -- whether the prompt passes safety checks.
  • category (string, optional) -- violation category (only if blocked).
  • reason (string, optional) -- explanation of the violation (only if blocked).

Example

Input: "A peaceful sunset over mountains"
Output: {"allowed":true}

AtelicoAnnIndexNode

Godot node for approximate nearest neighbor (ANN) vector search.

A pure data structure backed by an HNSW (Hierarchical Navigable Small World) graph -- no GPU or ML models required. Insert vectors with label IDs, build the index, then search for nearest neighbors by cosine distance.

Usage: Add as a child node, call create(dim, max_elements), then insert() vectors, call build(), and finally search() to find nearest neighbors.

Methods

create(dim: i32, max_elements: i32) -> bool

Create a new empty ANN index.

Inputs

  • dim -- dimensionality of vectors (must match vectors inserted later).
  • max_elements -- maximum number of vectors the index can hold.

Outputs

bool: true on success.

Example

Input: dim=384, max_elements=1000
Output: true

insert(vector: Array<f32>, label_id: i64) -> bool

Insert a vector with an associated label ID.

Call build() after all insertions are complete before searching.

Inputs

  • vector -- array of floats with length matching dim from create().
  • label_id -- integer label to identify this vector in search results.

Outputs

bool: true on success, false if the index has not been created.

Example

Input: vector=[0.1, 0.2, 0.3], label_id=42
Output: true

build() -> bool

Build the HNSW index graph. Must be called after all insertions and before any search() calls.

Inputs

(none)

Outputs

bool: true on success, false if the index has not been created.

Example

Input: (no arguments)
Output: true

search(query: Array<f32>, k: i32) -> GString

Search for the k nearest neighbors of a query vector.

Inputs

  • query -- query vector (array of floats with length matching dim).
  • k -- number of nearest neighbors to return.

Outputs

JSON array of result objects sorted by ascending distance ("[]" if index not created):

  • label_id (int) -- the label assigned during insert().
  • distance (float) -- cosine distance (lower = more similar).

Example

Input: query=[0.1, 0.2, 0.3], k=3
Output: [{"label_id":42,"distance":0.05},{"label_id":17,"distance":0.12},{"label_id":8,"distance":0.34}]

AtelicoCacheNode

Godot node for prompt result caching.

Methods

initialize() -> bool

Initialize the cache. Must be called before the other cache methods. Returns true on success.

cache_get(key_json: GString, policy_json: GString) -> GString

Look up a cached result. Returns the cached response JSON, or an empty string on a miss.

cache_put(key_json: GString, policy_json: GString, response_json: GString) -> bool

Store a result in the cache. Returns true on success.

cache_clear()

Remove all cached entries.

cache_size() -> i32

Return the number of entries currently in the cache.

AtelicoMatcherNode

Godot node for ranking candidates against a query.

Methods

initialize() -> bool

Initialize the matcher engine. Must be called before match_one. Returns true on success.

match_one(matcher_id: GString, query: GString, elements_json: GString) -> GString

Match the single best candidate from elements_json against the query. Returns a JSON result, or an empty string on error.

VectorMemoryStore

Godot node for vector memory storage backed by LanceDB.

Stores character memories and metadata as vector-embedded records and supports similarity-based retrieval. Useful for NPC memory recall, narrative context, and character persona management.

Usage: Add as a child node, set db_name and embed_dim exports, then call connect_to_db() and get_or_create_table() before reading or writing records.

Methods

connect_to_db(db_name: GString) -> bool

Open (or create) a LanceDB database at user://database/ followed by db_name. Does not create tables -- call get_or_create_table afterwards to set up tables with the desired schema.

  • db_name: database directory name (e.g. "npc_memories").

Returns: true on success, false if the connection fails or the database is already connected with the same name.

get_or_create_table(table_name: GString, table_schema: GString) -> bool

Open or create a table with the given name. If the table already exists, it is opened; otherwise a new table is created with the specified schema.

  • table_name: logical table name (e.g. "npc_observations").
  • table_schema: schema preset name. Composable convention: "memory", "memory+emotion", "character", "character+emotion".

Returns: true on success, false if not connected or on error.
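
A setup sketch for this node (the node path and names are placeholders):

    @onready var store = $VectorMemoryStore

    func _ready() -> void:
        if store.connect_to_db("npc_memories") and store.get_or_create_table("npc_observations", "memory+emotion"):
            print("rows: ", store.get_table_row_count("npc_observations", ""))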

write_memory(_table_name: GString, node: VarDictionary) -> bool

Write a single memory record to the store.

  • _table_name: reserved for future multi-table support (currently unused).
  • node: a Dictionary containing all Memory fields. Required keys: uuid, observer, counter, node_count, type_count, depth, poignancy, node_type, embedding_key, location, created, expiration, subject, predicate, object, description, filling (Array of String), and vector (Array of float with length matching embed_dim). Optional: emotion_at_encoding (String).

Returns: true on success, false if not connected or the dictionary is missing required fields.

write_character(_table_name: GString, character: VarDictionary) -> bool

Write a single character/persona metadata record to the store.

  • _table_name: reserved for future multi-table support (currently unused).
  • character: a Dictionary containing character fields. Required keys: uuid, uuid_int, persona_name, age, personality, background, story, daily_plan, home, lifestyle, usual_wake_up_time, activity, reflectiveness. Optional: emotional_baseline (String, defaults to "neutral").

Returns: true on success, false if not connected or the dictionary is missing required fields.

get_table_row_count(_table_name: GString, filter: GString) -> i64

Count rows in the memory store, optionally filtered.

  • _table_name: reserved for future multi-table support (currently unused).
  • filter: SQL-like filter expression (empty string means count all rows).

Returns: the number of matching rows, or -1 if not connected or on error.