nexusAI/docs/reference/API-routes.md
2026-04-19 06:50:24 -07:00

API Routes

This page lists every HTTP endpoint across the NexusAI services. Clients communicate only with the orchestration service (port 4000); the memory, embedding, and inference service routes are listed for reference and direct debugging.


Orchestration Service — port 4000

Health

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Service health check |

Chat

| Method | Path | Description |
| --- | --- | --- |
| POST | /chat | Send a message, receive full response |
| POST | /chat/stream | Send a message, receive SSE token stream |

POST /chat and POST /chat/stream — request body:

{
  "sessionId": "your-session-uuid",
  "message": "Hello, my name is Tim.",
  "model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
  "temperature": 0.7
}

model and temperature are optional. Inference parameters (temperature, topP, topK, repeatPenalty) are read from settings.json on every request — controlled via PATCH /settings.
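As a rough illustration of the request shape above, the following sketch builds (but does not send) a chat request with Python's standard library. The `BASE_URL` host and the `build_chat_request` helper are assumptions made for this example; the document only specifies the port and the body fields.

```python
import json
import urllib.request

# Assumption: orchestration runs on localhost; only the port is documented.
BASE_URL = "http://localhost:4000"


def build_chat_request(session_id, message, model=None, temperature=None, stream=False):
    """Build a POST /chat or POST /chat/stream request without sending it."""
    body = {"sessionId": session_id, "message": message}
    # Optional fields are omitted entirely when the caller does not set them.
    if model is not None:
        body["model"] = model
    if temperature is not None:
        body["temperature"] = temperature
    path = "/chat/stream" if stream else "/chat"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` once the service is running; for the non-streaming route the reply can then be decoded with `json.load`.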

POST /chat — response:

{
  "sessionId": "your-session-uuid",
  "response": "Hello Tim! How can I help you today?",
  "model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
  "tokenCount": 87
}

POST /chat/stream — response (SSE):

data: {"text":"Hello"}
data: {"text":" Tim"}
data: {"done":true,"model":"gemma-4-26B...gguf","tokenCount":87}
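A client consuming this stream might accumulate the `text` chunks until the event carrying `done` arrives. The sketch below assumes each event is a single `data:` line holding one JSON object, as in the sample above; `collect_stream` is a hypothetical helper name, not part of NexusAI.

```python
import json


def collect_stream(lines):
    """Join streamed text chunks and return them with the final done event."""
    parts = []
    final = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data SSE fields
        event = json.loads(line[len("data:"):].strip())
        if event.get("done"):
            final = event  # carries model and tokenCount
            break
        parts.append(event.get("text", ""))
    return "".join(parts), final
```

The function expects decoded strings, so lines read from a raw HTTP response would need `.decode("utf-8")` first.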

Sessions

| Method | Path | Description |
| --- | --- | --- |
| GET | /sessions | Paginated session list |
| GET | /sessions/:sessionId/history | Paginated episode history for a session |
| PATCH | /sessions/:sessionId | Update session name and/or project assignment |
| DELETE | /sessions/:sessionId | Delete session and all its episodes |

GET /sessions — query params:

| Param | Default | Description |
| --- | --- | --- |
| limit | 20 | Sessions per page |
| offset | 0 | Pagination offset |
| projectId | | Filter by project (integer ID) |

PATCH /sessions/:sessionId — body:

{ "name": "My Session", "projectId": 3 }

Either name or projectId is required. Both can be sent together. Returns the updated session object.
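The "at least one field" rule can be checked on the client before the request is sent. This is a minimal sketch; `build_session_patch` is a hypothetical helper, and how the service itself responds to an empty body is not stated here.

```python
def build_session_patch(name=None, project_id=None):
    """Return a PATCH /sessions/:sessionId body, requiring at least one field."""
    body = {}
    if name is not None:
        body["name"] = name
    if project_id is not None:
        body["projectId"] = project_id
    if not body:
        raise ValueError("either name or projectId is required")
    return body
```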

GET /sessions/:sessionId/history — query params:

| Param | Default | Description |
| --- | --- | --- |
| limit | 20 | Episodes per page |
| offset | 0 | Pagination offset |

Returns { sessionId, episodes: [...] }. Episodes are ordered newest first.
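Walking the full history with `limit` and `offset` could look like the sketch below. `fetch_page` is a hypothetical callable that performs the GET and returns the `episodes` list for one page. The stop condition (a page shorter than `limit`) is an assumption, since the response shape shown does not include a total count.

```python
def iter_pages(fetch_page, limit=20):
    """Yield items page by page until a short or empty page is returned."""
    offset = 0
    while True:
        items = fetch_page(limit, offset)
        yield from items
        if len(items) < limit:
            break  # assumed end-of-data signal
        offset += limit
```

The same loop would apply to GET /sessions and GET /episodes, which use the same two parameters.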

Projects

| Method | Path | Description |
| --- | --- | --- |
| GET | /projects | Get all projects |
| POST | /projects | Create a new project |
| PATCH | /projects/:id | Update a project (partial: any subset of fields) |
| DELETE | /projects/:id | Delete a project (nulls session assignments) |

POST /projects — body:

{
  "name": "My Project",
  "description": "Optional description",
  "colour": "#3d3a79",
  "icon": null,
  "isolated": 1
}

name is required; all other fields are optional. isolated is always 1 because all projects use isolated memory. Returns 201 with the created project object.

PATCH /projects/:id — body: any subset of fields, all optional.

| Field | Type | Description |
| --- | --- | --- |
| name | string | Project name |
| description | string | Project description |
| colour | string | Hex colour for UI accent |
| icon | string | Icon identifier |
| isolated | integer | Memory isolation flag (always 1) |
| notes | string | User-authored project notes |
| system_prompt | string | Per-project system prompt override (null = use global) |

Only provided fields are updated — omitted fields are not touched.
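The partial-update rule distinguishes an omitted field (left alone) from a field sent as null (written as null, which for system_prompt means falling back to the global prompt). The sketch below models that with a plain dict. `apply_project_patch` is a hypothetical name, and rejecting unknown fields is an assumption rather than documented behaviour.

```python
PROJECT_FIELDS = {
    "name", "description", "colour", "icon",
    "isolated", "notes", "system_prompt",
}


def apply_project_patch(project, patch):
    """Return a copy of project with only the provided fields replaced."""
    unknown = set(patch) - PROJECT_FIELDS
    if unknown:
        # Assumption: unknown keys are treated as an error.
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    updated = dict(project)
    updated.update(patch)  # a key present with value None still counts as an update
    return updated
```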

Models

| Method | Path | Description |
| --- | --- | --- |
| GET | /models | Available models scanned live from models folder |
| GET | /models/props | Live model props from llama-server (context window, loaded model) |

GET /models — returns array:

[{ "value": "model-name.gguf", "label": "Display Name", "description": null, "size": "19.7 GB" }]

Scans .gguf files live from modelsFolderPath (set in settings). Merges with models.json in the same folder for label and description metadata.
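The scan-and-merge step might be approximated as follows. Several details are assumptions made for illustration: that models.json maps a file name to an object with `label` and `description`, that the label falls back to the file stem, and that sizes are reported in decimal gigabytes. `list_models` is a hypothetical name.

```python
import json
from pathlib import Path


def list_models(folder):
    """List .gguf files in folder, merged with optional models.json metadata."""
    folder = Path(folder)
    meta_path = folder / "models.json"
    # Assumed shape: {"file.gguf": {"label": "...", "description": "..."}}
    meta = json.loads(meta_path.read_text()) if meta_path.exists() else {}
    models = []
    for path in sorted(folder.glob("*.gguf")):
        info = meta.get(path.name, {})
        size_gb = path.stat().st_size / 1e9  # assumption: decimal GB
        models.append({
            "value": path.name,
            "label": info.get("label", path.stem),
            "description": info.get("description"),
            "size": f"{size_gb:.1f} GB",
        })
    return models
```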

GET /models/props — returns:

{ "contextWindow": 64000, "modelAlias": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf" }

Fetched directly from llama-server /props. The context window is read from data.default_generation_settings.n_ctx in the llama-server response. Returns 503 if llama-server is unreachable.
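Reading the context window out of the /props payload is a two-level lookup. The sketch below assumes `props` is the already-decoded JSON body (the `data` in the path above) and returns None instead of raising when a key is missing; `context_window_from_props` is a hypothetical helper.

```python
def context_window_from_props(props):
    """Return n_ctx from a decoded llama-server /props payload, or None."""
    settings = props.get("default_generation_settings") or {}
    return settings.get("n_ctx")
```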

Settings

| Method | Path | Description |
| --- | --- | --- |
| GET | /settings | Get all current settings |
| PATCH | /settings | Update one or more settings |

GET /settings — response:

{
  "recentEpisodeLimit": 9,
  "semanticLimit": 5,
  "scoreThreshold": 0.6,
  "modelsFolderPath": "/mnt/nexus-models",
  "temperature": 0.65,
  "repeatPenalty": 1.3,
  "topP": 0.9,
  "topK": 41,
  "systemPrompt": "You are a helpful assistant..."
}

PATCH /settings — body: any subset of the above fields.

| Field | Type | Range | Description |
| --- | --- | --- | --- |
| recentEpisodeLimit | integer | 1–20 | Recent episodes injected into prompt |
| semanticLimit | integer | 1–20 | Max semantic search results |
| scoreThreshold | float | 0–1 | Minimum similarity score |
| modelsFolderPath | string | | Path to folder containing .gguf files |
| temperature | float | 0–2 | Inference randomness |
| repeatPenalty | float | 1–2 | Repeat token penalty |
| topP | float | 0–1 | Nucleus sampling probability mass |
| topK | integer | 1–100 | Top-K token candidates per step |
| systemPrompt | string | | Global system prompt (null reverts to hardcoded default) |

Settings are persisted to data/settings.json and read on every request — changes take effect immediately without a service restart.
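A client could pre-check a PATCH /settings body against the documented ranges before sending it. The sketch below reads the Range column as inclusive bounds (1 to 20, 0 to 1, 0 to 2, 1 to 2, 1 to 100), which is an assumption, as is the choice to report errors rather than clamp. `SETTING_RULES` and `validate_settings_patch` are hypothetical names.

```python
SETTING_RULES = {
    "recentEpisodeLimit": (int, 1, 20),
    "semanticLimit": (int, 1, 20),
    "scoreThreshold": (float, 0, 1),
    "temperature": (float, 0, 2),
    "repeatPenalty": (float, 1, 2),
    "topP": (float, 0, 1),
    "topK": (int, 1, 100),
}


def validate_settings_patch(patch):
    """Return a list of error strings; an empty list means the patch looks valid."""
    errors = []
    for key, value in patch.items():
        rule = SETTING_RULES.get(key)
        if rule is None:
            continue  # string fields (modelsFolderPath, systemPrompt) have no range
        kind, low, high = rule
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{key}: expected a number")
        elif kind is int and not isinstance(value, int):
            errors.append(f"{key}: expected an integer")
        elif not low <= value <= high:
            errors.append(f"{key}: must be between {low} and {high}")
    return errors
```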

Episodes

| Method | Path | Description |
| --- | --- | --- |
| GET | /episodes | Paginated episode list across all sessions |
| DELETE | /episodes/:id | Delete an episode (SQLite + Qdrant) |

GET /episodes — query params:

| Param | Default | Description |
| --- | --- | --- |
| limit | 20 | Episodes per page |
| offset | 0 | Pagination offset |
| q | | Keyword search (FTS) |

Memory Service — port 3002

Direct access is for debugging only. All client traffic goes through orchestration.

Health

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Service health check |

Sessions

| Method | Path | Description |
| --- | --- | --- |
| POST | /sessions | Create a new session |
| GET | /sessions | Paginated session list with optional projectId filter |
| GET | /sessions/:id | Get session by internal ID |
| GET | /sessions/by-external/:externalId | Get session by external ID |
| PATCH | /sessions/by-external/:externalId | Update session fields |
| DELETE | /sessions/by-external/:externalId | Delete session (cascades to episodes) |

Route ordering: by-external/:externalId must be defined before /:id to prevent by-external being captured as an ID param.

POST /sessions — body:

{ "externalId": "unique-uuid", "metadata": {} }

PATCH /sessions/by-external/:externalId — body:

{ "name": "Session Name", "projectId": 3 }

Both fields are optional. Only provided fields are updated.

Episodes

| Method | Path | Description |
| --- | --- | --- |
| POST | /episodes | Create episode + auto-embed into Qdrant |
| GET | /episodes | Paginated episode list across all sessions |
| GET | /episodes/search?q=&limit= | FTS keyword search across all episodes |
| GET | /episodes/:id | Get episode by ID |
| GET | /sessions/:id/episodes?limit=&offset= | Paginated episodes for a session |
| DELETE | /episodes/:id | Delete episode (SQLite + Qdrant cleanup) |

Route ordering: /episodes/search must be defined before /episodes/:id.
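Why registration order matters can be shown with a toy first-match router. This is not the service's actual framework; it only assumes that routes are tried in the order they were registered and that a `:param` segment matches any single path segment. `compile_route` and `match_first` are hypothetical names.

```python
import re


def compile_route(pattern):
    """Turn '/episodes/:id' into a regex with one named group per :param."""
    regex = re.sub(r":(\w+)", r"(?P<\1>[^/]+)", pattern)
    return re.compile("^" + regex + "$")


def match_first(routes, path):
    """Return (pattern, params) for the first registered route that matches."""
    for pattern in routes:
        m = compile_route(pattern).match(path)
        if m:
            return pattern, m.groupdict()
    return None


wrong_order = ["/episodes/:id", "/episodes/search"]
right_order = ["/episodes/search", "/episodes/:id"]
```

With `wrong_order`, a request for /episodes/search is handled by the `:id` route with `id` set to "search"; with `right_order`, the literal route wins and numeric IDs still reach `:id`.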

POST /episodes — body:

{
  "sessionId": 1,
  "userMessage": "Hello",
  "aiResponse": "Hi there!",
  "tokenCount": 10
}

Projects

| Method | Path | Description |
| --- | --- | --- |
| POST | /projects | Create a new project |
| GET | /projects | Get all projects |
| GET | /projects/:id | Get project by ID |
| PATCH | /projects/:id | Update a project (dynamic: any subset of fields) |
| DELETE | /projects/:id | Delete project + null session assignments |

Same request/response shape as orchestration /projects above.

Entities

| Method | Path | Description |
| --- | --- | --- |
| POST | /entities | Upsert entity (creates or updates by name + type) |
| GET | /entities/by-type/:type | All entities of a given type |
| GET | /entities/:id | Get entity by ID |
| DELETE | /entities/:id | Delete entity (cascades to relationships) |

Route ordering: /entities/by-type/:type must be before /entities/:id.

POST /entities — body:

{
  "name": "NexusAI",
  "type": "project",
  "notes": "My AI memory project",
  "metadata": {}
}

Relationships

| Method | Path | Description |
| --- | --- | --- |
| POST | /relationships | Upsert a relationship between two entities |
| GET | /entities/:id/relationships | All relationships for an entity |
| DELETE | /relationships | Delete a specific relationship |

POST /relationships — body:

{ "fromId": 1, "toId": 2, "label": "uses", "metadata": {} }

DELETE /relationships — body:

{ "fromId": 1, "toId": 2, "label": "uses" }

Relationships are identified by the composite key (fromId, toId, label). Delete uses request body rather than URL params since this three-part key is awkward to encode in a path.
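The composite-key behaviour can be modelled with an in-memory dict keyed by the `(fromId, toId, label)` tuple. This is a stand-in for illustration only; the real service stores relationships in its own database, and `RelationshipStore` is a hypothetical class.

```python
class RelationshipStore:
    """In-memory model of upsert/delete keyed by (fromId, toId, label)."""

    def __init__(self):
        self._rows = {}

    def upsert(self, from_id, to_id, label, metadata=None):
        """Insert or overwrite a relationship; return True if it was created."""
        key = (from_id, to_id, label)
        created = key not in self._rows
        self._rows[key] = metadata or {}
        return created

    def delete(self, from_id, to_id, label):
        """Remove a relationship; return True if it existed."""
        return self._rows.pop((from_id, to_id, label), None) is not None
```

Because all three parts form the key, two entities can hold several relationships at once as long as the labels differ.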


Embedding Service — port 3003

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Service health check |
| POST | /embed | Embed a single text string |
| POST | /embed/batch | Embed an array of text strings |

POST /embed — body:

{ "text": "Hello from NexusAI" }

POST /embed — response:

{ "embedding": [0.123, -0.456, ...], "model": "nomic-embed-text", "dimensions": 768 }

Inference Service — port 3001

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Health check; reports active provider and model |
| POST | /complete | Full completion; awaits the entire response |
| POST | /complete/stream | Streaming completion via SSE |

POST /complete — body:

{
  "prompt": "What is the capital of France?",
  "model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
  "temperature": 0.7,
  "maxTokens": 1024,
  "topP": 0.9,
  "topK": 40,
  "repeatPenalty": 1.1
}

All fields except prompt are optional. In normal usage these are forwarded from orchestration, which reads them from settings.json.

POST /complete — response:

{
  "text": "The capital of France is Paris.",
  "model": "gemma-4-26B...gguf",
  "done": true,
  "evalCount": 8,
  "promptEvalCount": 41
}