Version: 0.9

Classifiers

Embedding-based classifiers for text and images. Three families are shipped:

  • Centroid — mean embedding per class, nearest-cosine prediction. Zero hyperparameters, strong baseline.
  • KNN (HNSW) — approximate nearest-neighbour vote. Best when classes have local structure rather than a single global centroid.
  • SetFit — few-shot contrastive fine-tuning of a sentence transformer with an optional LP-FT recipe.

All three share one API surface: a tagged embedder config picks the modality, and the classifier itself is modality-agnostic. Add image (or future audio) data by swapping the embedder; everything else — training, evaluation, persistence, the inference endpoint — stays the same.
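
To make the Centroid rule concrete, here is a minimal sketch over plain f32 vectors; it is illustrative only, not the shipped implementation: training averages the embeddings of each class, prediction returns the label whose centroid has the highest cosine similarity to the input.

// Illustrative sketch of the Centroid rule, not the atelico-classifiers code.

fn centroid(embeddings: &[Vec<f32>]) -> Vec<f32> {
    // Mean embedding over all examples of one class.
    let mut mean = vec![0.0; embeddings[0].len()];
    for e in embeddings {
        for (m, x) in mean.iter_mut().zip(e) {
            *m += x / embeddings.len() as f32;
        }
    }
    mean
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm(a) * norm(b))
}

fn predict<'a>(input: &[f32], centroids: &'a [(String, Vec<f32>)]) -> &'a str {
    // Nearest-cosine prediction over the per-class centroids.
    centroids
        .iter()
        .max_by(|(_, a), (_, b)| {
            cosine(input, a).partial_cmp(&cosine(input, b)).unwrap()
        })
        .map(|(label, _)| label.as_str())
        .expect("at least one class")
}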

Picking an embedder

Classifiers are configured with a ClassifierEmbedderConfig, a tagged enum over the per-modality embedders shipped by atelico-embed:

  • Text(EmbedderConfig): sentence transformers (AllMiniLML6V2, BGEBaseENV15, BGESmallENV15, etc.). Default for text.
  • Vision(VisionEmbedderConfig): DINOv2-Small or DINOv2-Large. New in 0.9, used by image classifiers.

Mixing modalities in one classifier is not supported — train one classifier per modality.
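
Both variants wrap an ordinary embedder config. A minimal sketch of constructing each follows; the Text arm assumes EmbedderConfig implements Default, since its exact fields live in atelico-embed:

use atelico_classifiers::embedder_config::ClassifierEmbedderConfig;
use atelico_embed::{
    vision_embedder::{VisionEmbedderConfig, VisionEmbeddingModel},
    EmbedderConfig,
};

// Text classifier: sentence-transformer embeddings (Default impl assumed here).
let text = ClassifierEmbedderConfig::Text(EmbedderConfig::default());

// Image classifier: DINOv2 embeddings, fields as shown in the training example below.
let vision = ClassifierEmbedderConfig::Vision(VisionEmbedderConfig {
    model: VisionEmbeddingModel::DINOv2Small,
    batch_size: 4,
});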

Dataset format

A single JSONL row carries exactly one of text, image_path, or audio_path, plus a label:

{"text": "US stocks rally on tech earnings", "label": "Business"}
{"image_path": "data/cats/cat_001.jpg", "label": "cat"}
{"image_path": "data/dogs/dog_017.jpg", "label": "dog"}

Rows that mix input fields, or omit all of them, are rejected at load time with an explicit error.
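
The rule is mechanical, and a hedged sketch of the check looks like this; the Row struct and error messages are illustrative, not the loader's actual types (requires serde with the derive feature):

use serde::Deserialize;

// Illustrative row type mirroring the JSONL fields above.
#[derive(Deserialize)]
struct Row {
    text: Option<String>,
    image_path: Option<String>,
    audio_path: Option<String>,
    label: String,
}

fn validate(row: &Row) -> Result<(), String> {
    // Exactly one of the three input fields must be present.
    let present = [&row.text, &row.image_path, &row.audio_path]
        .iter()
        .filter(|f| f.is_some())
        .count();
    match present {
        1 => Ok(()),
        0 => Err(format!("row {:?}: no input field set", row.label)),
        _ => Err(format!("row {:?}: multiple input fields set", row.label)),
    }
}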

Training (Rust)

Training happens in Rust via the atelico-classifiers crate; the resulting model is then loaded into the engine for serving from any binding. Example (image-modality centroid):

use atelico_classifiers::{
    centroid::{CentroidClassifier, CentroidConfig},
    data::InMemoryDataset,
    embedder_config::ClassifierEmbedderConfig,
};
use atelico_embed::{
    vision_embedder::{VisionEmbedderConfig, VisionEmbeddingModel},
    EmbedInputOwned,
};
use std::path::{Path, PathBuf};

fn main() -> anyhow::Result<()> {
    // One labelled image per class keeps the example minimal.
    let inputs: Vec<EmbedInputOwned> = vec![
        EmbedInputOwned::image(PathBuf::from("data/cats/01.jpg")),
        EmbedInputOwned::image(PathBuf::from("data/dogs/01.jpg")),
    ];
    let labels = vec!["cat".into(), "dog".into()];
    let dataset = InMemoryDataset::new(inputs, labels);

    // The Vision embedder variant selects the image modality.
    let cfg = CentroidConfig {
        embedder: ClassifierEmbedderConfig::Vision(VisionEmbedderConfig {
            model: VisionEmbeddingModel::DINOv2Small,
            batch_size: 4,
        }),
    };
    let mut clf = CentroidClassifier::new(cfg, /* assets */ None)?;
    clf.train(&dataset, /* batch_size */ 4)?;
    clf.save(Path::new("./models/animals"))?;
    Ok(())
}

Swap CentroidConfig for KnnConfig or SetFitConfig to use the other classifier types — the call sites stay identical.
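
As a sketch of that swap, assuming a knn module that mirrors the centroid one (KnnClassifier and a Default impl for KnnConfig are assumptions here, not confirmed API):

use atelico_classifiers::knn::{KnnClassifier, KnnConfig};

// Reuses `dataset` and the Vision embedder from the example above. KnnConfig's
// HNSW-specific fields aren't documented on this page, so an assumed Default
// impl fills them in; only the config and classifier types change.
let cfg = KnnConfig {
    embedder: ClassifierEmbedderConfig::Vision(VisionEmbedderConfig {
        model: VisionEmbeddingModel::DINOv2Small,
        batch_size: 4,
    }),
    ..Default::default()
};
let mut clf = KnnClassifier::new(cfg, /* assets */ None)?;
clf.train(&dataset, /* batch_size */ 4)?;
clf.save(Path::new("./models/animals_knn"))?;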

Serving — text input

Once a classifier is saved to disk, load it under an ID and call classifier_predict.

import json
import atelico

engine = atelico.Engine()
# (Implementation note: classifier loading is currently performed at engine
# startup via the ATELICO_CLASSIFIERS environment variable; programmatic load
# from Python follows the same pattern as other subsystems.)

result = json.loads(engine.classifier_predict("sentiment", "I love this!", top_k=3))
print(result["label"], result["probability"])

Serving — image input (DINOv2)

The classifier referenced by model_id must have been trained with a Vision embedder. Calling the image endpoint against a text-only classifier returns an error.

import json
import atelico

engine = atelico.Engine()
result = json.loads(engine.classifier_predict_image(
    "animals",
    "/abs/path/to/cat.jpg",
    top_k=3,
))
print(result["label"], result["probability"])

DINOv2 vision embeddings

VisionEmbeddingModel ships two sizes:

  • DINOv2Small: facebook/dinov2-small, 384-dim embeddings. Default. Fast (~22M params); good for general object / scene categories.
  • DINOv2Large: facebook/dinov2-large, 1024-dim embeddings. Higher quality at ~300M params; worth the cost when classes are visually subtle.

DINOv2 produces strong general-purpose visual features without task-specific fine-tuning, which makes the Centroid classifier a surprisingly capable baseline for image tasks — try it first before reaching for SetFit fine-tuning.

The same VisionEmbedder is also exposed as a standalone embedder via atelico-embed if you only need raw image vectors (e.g. for similarity search, clustering, or feeding the Hybrid Search store).
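
A rough sketch of standalone use follows; the VisionEmbedder constructor and embed method names below are assumptions, so treat this as pseudocode against atelico-embed's actual surface:

use atelico_embed::{
    vision_embedder::{VisionEmbedder, VisionEmbedderConfig, VisionEmbeddingModel},
    EmbedInputOwned,
};
use std::path::PathBuf;

// `VisionEmbedder::new` and `embed` are assumed names, not confirmed API.
let embedder = VisionEmbedder::new(VisionEmbedderConfig {
    model: VisionEmbeddingModel::DINOv2Small,
    batch_size: 4,
})?;
let vectors = embedder.embed(&[EmbedInputOwned::image(PathBuf::from("cat.jpg"))])?;
assert_eq!(vectors[0].len(), 384); // DINOv2-Small dim, per the list above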

Persistence and serving

Trained classifiers persist to disk (safetensors for SetFit, JSON for centroid / KNN) and are loaded into the engine via the ATELICO_CLASSIFIERS environment variable on startup, or programmatically through the SDK and bindings shown above.

The same classifier infrastructure also powers the Guardrails ML-classifier layer for content moderation.