Cortex (Inference)
The current UE plugin exposes both local node-backed cortex helpers and remote API-backed cortex helpers.
Local Cortex
Local inference is driven by `Cortex/CortexThunks.h` and currently loads a model from a file path.
Important points:
- `InitCortex` takes a model ID (e.g. `smollm2-135m`) which maps to a HuggingFace GGUF URL; the SDK downloads it automatically on first use to `{ProjectSavedDir}/ForbocAI/models/`
- You can also pass a direct file path to a `.gguf` model
- Local completion uses the currently initialized llama.cpp context (Metal GPU on Mac)
- `GenerateEmbedding` is also available through `SDKOps::GenerateEmbedding(...)`; the embedding model (all-MiniLM-L6-v2) is also downloaded automatically on first use
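To make the model-resolution behavior above concrete, here is a minimal, hypothetical sketch of how a model ID versus a direct `.gguf` path could be distinguished and mapped to the cache directory. The function name and path layout are assumptions for illustration, not the SDK's actual implementation.

```cpp
#include <string>

// Hypothetical sketch: resolve either a model ID (e.g. "smollm2-135m")
// to a cache path under {ProjectSavedDir}/ForbocAI/models/, or pass a
// direct .gguf file path through unchanged. Illustrative only.
std::string ResolveModelPath(const std::string& ProjectSavedDir,
                             const std::string& ModelIdOrPath)
{
    const std::string Ext = ".gguf";
    // A direct .gguf path is used as-is.
    if (ModelIdOrPath.size() >= Ext.size() &&
        ModelIdOrPath.compare(ModelIdOrPath.size() - Ext.size(), Ext.size(), Ext) == 0)
    {
        return ModelIdOrPath;
    }
    // Otherwise treat the argument as a model ID; the SDK would download
    // the corresponding GGUF from HuggingFace into this directory.
    return ProjectSavedDir + "/ForbocAI/models/" + ModelIdOrPath + Ext;
}
```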
Remote Cortex
Protocol Requirement
`rtk::processNPC(...)` with `rtk::LocalProtocolRuntime()` uses the local cortex for `ExecuteInference` instructions. Remote fallback is intentionally disabled in that path.
If the local cortex is not initialized and the API requests inference, `processNPC(...)` fails fast.
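The fail-fast rule can be sketched as follows. The struct, function, and error message are hypothetical stand-ins, not the plugin's real types; the point is only that an uninitialized local cortex produces an immediate error rather than a silent fallback to the remote API.

```cpp
#include <stdexcept>
#include <string>

// Illustrative state flag for whether a local llama.cpp context exists.
struct LocalCortexState
{
    bool bInitialized = false;
};

// Hypothetical handler for an ExecuteInference instruction under a local
// protocol runtime: no remote fallback, so it fails fast when the local
// cortex is not ready.
std::string ExecuteInference(const LocalCortexState& Cortex, const std::string& Prompt)
{
    if (!Cortex.bInitialized)
    {
        throw std::runtime_error("Local cortex not initialized; remote fallback is disabled");
    }
    // ... a real implementation would run a local llama.cpp completion here ...
    return "completion for: " + Prompt;
}
```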
Key Types
- `FCortexStatus`
- `FCortexConfig`
- `FCortexResponse`
- `FCortexModelInfo`
`FCortexConfig` also supports optional stop sequences and a JSON schema payload for remote completion requests.
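As a rough sketch of what such a config might carry, here is a standalone struct with optional stop sequences and a JSON-schema payload. The field names and types are assumptions made for illustration; the real `FCortexConfig` is defined by the plugin and will differ (e.g. using UE container types).

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical mirror of a completion config: a model ID, optional stop
// sequences (empty vector means none), and an optional JSON schema used
// to constrain structured output on remote completion requests.
struct CortexConfigSketch
{
    std::string ModelId;
    std::vector<std::string> StopSequences;
    std::optional<std::string> JsonSchema;
};
```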
