API Routes
All HTTP endpoints across NexusAI services. Clients communicate only with the orchestration service (port 4000); routes for the memory, embedding, and inference services are listed here for reference and direct debugging use.
Orchestration Service — port 4000
Health
| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |
Chat
| Method | Path | Description |
|---|---|---|
| POST | /chat | Send a message, receive full response |
| POST | /chat/stream | Send a message, receive SSE token stream |
POST /chat and POST /chat/stream — request body:
{
"sessionId": "your-session-uuid",
"message": "Hello, my name is Tim.",
"model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
"temperature": 0.7
}
model and temperature are optional. Inference parameters (temperature,
topP, topK, repeatPenalty) are read from settings.json on every request, so
any values sent for them in the request body are ignored. Change them via
PATCH /settings.
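As a minimal sketch of the request shape above, the following builds (but does not send) a POST /chat request using only the Python standard library. The base URL and the helper name build_chat_request are assumptions for illustration, not part of NexusAI.

```python
import json
import urllib.request
from typing import Optional

# Assumed base URL for a locally running orchestration service (port 4000).
BASE_URL = "http://localhost:4000"


def build_chat_request(session_id: str, message: str,
                       model: Optional[str] = None) -> urllib.request.Request:
    """Build a POST /chat request without sending it (hypothetical helper)."""
    body = {"sessionId": session_id, "message": message}
    if model is not None:
        body["model"] = model  # optional field
    # temperature is left out on purpose: it is read from settings.json.
    return urllib.request.Request(
        f"{BASE_URL}/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("your-session-uuid", "Hello, my name is Tim.")
# To send it: urllib.request.urlopen(req), then json.load() the response.
```

Leaving temperature out of the body keeps client code honest about where that value actually comes from.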
POST /chat — response:
{
"sessionId": "your-session-uuid",
"response": "Hello Tim! How can I help you today?",
"model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
"tokenCount": 87
}
POST /chat/stream — response (SSE):
data: {"text":"Hello"}
data: {"text":" Tim"}
data: {"done":true,"model":"gemma-4-26B...gguf","tokenCount":87}
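One way a client might consume this stream is sketched below: it joins the text chunks until the event carrying "done": true arrives. The function name collect_stream and the sample model name are made up for illustration; the sketch assumes each event arrives as a single data: line, as in the sample above.

```python
import json
from typing import Iterable, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, dict]:
    """Join "text" chunks and return them with the final "done" event."""
    parts = []
    final = {}
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data SSE fields
        event = json.loads(line[len("data:"):])
        if event.get("done"):
            final = event  # carries model and tokenCount
            break
        parts.append(event.get("text", ""))
    return "".join(parts), final


# Sample lines shaped like the stream above (model name is a placeholder).
sample = [
    'data: {"text":"Hello"}',
    "",
    'data: {"text":" Tim"}',
    "",
    'data: {"done":true,"model":"example.gguf","tokenCount":87}',
]
text, final = collect_stream(sample)
print(text)  # Hello Tim
```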
Sessions
| Method | Path | Description |
|---|---|---|
| GET | /sessions | Paginated session list |
| GET | /sessions/:sessionId/history | Paginated episode history for a session |
| PATCH | /sessions/:sessionId | Update session name and/or project assignment |
| DELETE | /sessions/:sessionId | Delete session and all its episodes |
GET /sessions — query params:
| Param | Default | Description |
|---|---|---|
| limit | 20 | Sessions per page |
| offset | 0 | Pagination offset |
| projectId | — | Filter by project (integer ID) |
PATCH /sessions/:sessionId — body:
{ "name": "My Session", "projectId": 3 }
At least one of name or projectId is required; both can be sent together.
Returns the updated session object.
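A client can enforce the "at least one field" rule before sending the request. The helper below is a hypothetical sketch; the server's own validation and error format are not described on this page.

```python
from typing import Optional


def session_patch_body(name: Optional[str] = None,
                       project_id: Optional[int] = None) -> dict:
    """Build a PATCH /sessions/:sessionId body (hypothetical helper)."""
    body = {}
    if name is not None:
        body["name"] = name
    if project_id is not None:
        body["projectId"] = project_id
    if not body:
        # Mirrors the documented rule: name or projectId must be present.
        raise ValueError("at least one of name or projectId is required")
    return body


body = session_patch_body(name="My Session", project_id=3)
```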
GET /sessions/:sessionId/history — query params:
| Param | Default | Description |
|---|---|---|
| limit | 20 | Episodes per page |
| offset | 0 | Pagination offset |
Returns { sessionId, episodes: [...] }. Episodes are ordered newest first.
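The limit and offset parameters can be walked with a small generator. This sketch assumes a page shorter than limit marks the end of the history, since no total count is documented here; fetch_page stands in for whatever HTTP call the client makes and is replaced by a fake in the usage lines.

```python
from typing import Callable, Iterator, List


def iter_episodes(fetch_page: Callable[[int, int], List[dict]],
                  limit: int = 20) -> Iterator[dict]:
    """Yield every episode by advancing offset one page at a time."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            break  # assumed end-of-data signal: a short (or empty) page
        offset += limit


# Fake data source standing in for GET /sessions/:sessionId/history.
FAKE_EPISODES = [{"id": i} for i in range(45, 0, -1)]  # newest first


def fake_fetch(limit: int, offset: int) -> List[dict]:
    return FAKE_EPISODES[offset:offset + limit]


ids = [episode["id"] for episode in iter_episodes(fake_fetch)]
```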
Projects
| Method | Path | Description |
|---|---|---|
| GET | /projects | Get all projects |
| POST | /projects | Create a new project |
| PATCH | /projects/:id | Update a project |
| DELETE | /projects/:id | Delete a project (nulls session assignments) |
POST /projects — body:
{
"name": "My Project",
"description": "Optional description",
"colour": "#3d3a79",
"icon": null,
"isolated": 0
}
name is required. All other fields are optional. isolated is 0 or 1.
Returns 201 with the created project object.
PATCH /projects/:id — body: same fields as POST, all optional.
Models
| Method | Path | Description |
|---|---|---|
| GET | /models | Available models scanned live from models folder |
| GET | /models/props | Live model props from llama-server (context window, loaded model) |
GET /models — returns array:
[{ "value": "model-name.gguf", "label": "Display Name", "description": null, "size": "19.7 GB" }]
Scans .gguf files live from modelsFolderPath (set in settings). Merges
with models.json in the same folder for label and description metadata.
GET /models/props — returns:
{ "contextWindow": 64000, "modelAlias": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf" }
Fetches directly from llama-server /props. Returns 503 if llama-server
is unreachable.
Settings
| Method | Path | Description |
|---|---|---|
| GET | /settings | Get all current settings |
| PATCH | /settings | Update one or more settings |
GET /settings — response:
{
"recentEpisodeLimit": 9,
"semanticLimit": 5,
"scoreThreshold": 0.6,
"modelsFolderPath": "/mnt/nexus-models",
"temperature": 0.65,
"repeatPenalty": 1.3,
"topP": 0.9,
"topK": 41
}
PATCH /settings — body: any subset of the above fields.
| Field | Type | Range | Description |
|---|---|---|---|
| recentEpisodeLimit | integer | 1–20 | Recent episodes injected into prompt |
| semanticLimit | integer | 1–20 | Max semantic search results |
| scoreThreshold | float | 0–1 | Minimum similarity score |
| modelsFolderPath | string | — | Path to folder containing .gguf files |
| temperature | float | 0–2 | Inference randomness |
| repeatPenalty | float | 1–2 | Repeat token penalty |
| topP | float | 0–1 | Nucleus sampling probability mass |
| topK | integer | 1–100 | Top-K token candidates per step |
Settings are persisted to data/settings.json and read on every request —
changes take effect immediately without a service restart.
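The documented ranges can be checked on the client before a PATCH /settings call. The sketch below assumes the ranges are inclusive and that unknown fields should be rejected; neither is confirmed on this page, and validate_settings_patch is a hypothetical helper.

```python
# (type, minimum, maximum) for each numeric setting documented for PATCH /settings.
RANGES = {
    "recentEpisodeLimit": (int, 1, 20),
    "semanticLimit": (int, 1, 20),
    "scoreThreshold": (float, 0, 1),
    "temperature": (float, 0, 2),
    "repeatPenalty": (float, 1, 2),
    "topP": (float, 0, 1),
    "topK": (int, 1, 100),
}


def validate_settings_patch(patch: dict) -> list:
    """Return a list of error strings; an empty list means the patch looks valid."""
    errors = []
    for field, value in patch.items():
        if field == "modelsFolderPath":
            if not isinstance(value, str):
                errors.append(f"{field}: expected a string")
            continue
        if field not in RANGES:
            errors.append(f"{field}: unknown setting")
            continue
        kind, low, high = RANGES[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{field}: expected a number")
        elif kind is int and not isinstance(value, int):
            errors.append(f"{field}: expected an integer")
        elif not low <= value <= high:
            errors.append(f"{field}: must be between {low} and {high}")
    return errors


print(validate_settings_patch({"topK": 500}))
# ['topK: must be between 1 and 100']
```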
Episodes
| Method | Path | Description |
|---|---|---|
| GET | /episodes | Paginated episode list across all sessions |
| DELETE | /episodes/:id | Delete an episode (SQLite + Qdrant) |
GET /episodes — query params:
| Param | Default | Description |
|---|---|---|
| limit | 20 | Episodes per page |
| offset | 0 | Pagination offset |
| q | — | Keyword search (FTS) |
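For illustration, the query parameters above can be assembled with the standard library so the search term is encoded safely. The base URL and the helper name episodes_url are assumptions.

```python
from typing import Optional
from urllib.parse import urlencode


def episodes_url(base: str, limit: int = 20, offset: int = 0,
                 q: Optional[str] = None) -> str:
    """Build a GET /episodes URL; q is only included when provided."""
    params = {"limit": limit, "offset": offset}
    if q:
        params["q"] = q
    return f"{base}/episodes?{urlencode(params)}"


url = episodes_url("http://localhost:4000", q="capital of France")
print(url)
# http://localhost:4000/episodes?limit=20&offset=0&q=capital+of+France
```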
Memory Service — port 3002
Direct access is for debugging only. All client traffic goes through orchestration.
Health
| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |
Sessions
| Method | Path | Description |
|---|---|---|
| POST | /sessions | Create a new session |
| GET | /sessions | Paginated session list with optional projectId filter |
| GET | /sessions/:id | Get session by internal ID |
| GET | /sessions/by-external/:externalId | Get session by external ID |
| PATCH | /sessions/by-external/:externalId | Update session fields |
| DELETE | /sessions/by-external/:externalId | Delete session (cascades to episodes) |
Route ordering: by-external/:externalId must be defined before /:id to prevent by-external being captured as an ID param.
POST /sessions — body:
{ "externalId": "unique-uuid", "metadata": {} }
PATCH /sessions/by-external/:externalId — body:
{ "name": "Session Name", "projectId": 3 }
Both fields are optional. Only provided fields are updated — other fields are not touched.
Episodes
| Method | Path | Description |
|---|---|---|
| POST | /episodes | Create episode + auto-embed into Qdrant |
| GET | /episodes | Paginated episode list across all sessions |
| GET | /episodes/search?q=&limit= | FTS keyword search across all episodes |
| GET | /episodes/:id | Get episode by ID |
| GET | /sessions/:id/episodes?limit=&offset= | Paginated episodes for a session |
| DELETE | /episodes/:id | Delete episode (SQLite + Qdrant cleanup) |
Route ordering: /episodes/search must be defined before /episodes/:id.
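The ordering rule follows from first-match routing. The toy router below is not the memory service's actual router (its framework is not named on this page); it is a sketch that assumes routes are tried in registration order and that a :param segment matches any single path segment.

```python
import re


def compile_route(pattern: str) -> "re.Pattern[str]":
    """Turn '/episodes/:id' into a regex with one named group per :param."""
    regex = re.sub(r":(\w+)", r"(?P<\1>[^/]+)", pattern)
    return re.compile(f"^{regex}$")


def first_match(routes: list, path: str):
    """Return (pattern, params) for the first route that matches, else None."""
    for pattern in routes:
        match = compile_route(pattern).match(path)
        if match:
            return pattern, match.groupdict()
    return None


# Wrong order: the parameterised route swallows the literal "search".
wrong = first_match(["/episodes/:id", "/episodes/search"], "/episodes/search")
# Right order: the literal route is tried first.
right = first_match(["/episodes/search", "/episodes/:id"], "/episodes/search")
print(wrong)  # ('/episodes/:id', {'id': 'search'})
print(right)  # ('/episodes/search', {})
```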
POST /episodes — body:
{
"sessionId": 1,
"userMessage": "Hello",
"aiResponse": "Hi there!",
"tokenCount": 10
}
Projects
| Method | Path | Description |
|---|---|---|
| POST | /projects | Create a new project |
| GET | /projects | Get all projects |
| GET | /projects/:id | Get project by ID |
| PATCH | /projects/:id | Update a project |
| DELETE | /projects/:id | Delete project + null session assignments |
Same request/response shape as orchestration /projects above.
Entities
| Method | Path | Description |
|---|---|---|
| POST | /entities | Upsert entity (creates or updates by name + type) |
| GET | /entities/by-type/:type | All entities of a given type |
| GET | /entities/:id | Get entity by ID |
| DELETE | /entities/:id | Delete entity (cascades to relationships) |
Route ordering: /entities/by-type/:type must be before /entities/:id.
POST /entities — body:
{
"name": "NexusAI",
"type": "project",
"notes": "My AI memory project",
"metadata": {}
}
Relationships
| Method | Path | Description |
|---|---|---|
| POST | /relationships | Upsert a relationship between two entities |
| GET | /entities/:id/relationships | All relationships for an entity |
| DELETE | /relationships | Delete a specific relationship |
POST /relationships — body:
{ "fromId": 1, "toId": 2, "label": "uses", "metadata": {} }
DELETE /relationships — body:
{ "fromId": 1, "toId": 2, "label": "uses" }
Relationships are identified by the composite key (fromId, toId, label).
Delete uses request body rather than URL params since this three-part key
is awkward to encode in a path.
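Because the key travels in the body, the client has to attach JSON to a DELETE request, which some HTTP helpers do not do by default. A minimal sketch with the standard library follows; the base URL and the helper name are assumptions, and the request is built but not sent.

```python
import json
import urllib.request


def build_delete_relationship(from_id: int, to_id: int, label: str,
                              base: str = "http://localhost:3002") -> urllib.request.Request:
    """Build a DELETE /relationships request carrying the composite key as JSON."""
    body = {"fromId": from_id, "toId": to_id, "label": label}
    return urllib.request.Request(
        f"{base}/relationships",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="DELETE",  # explicit, since a body would otherwise imply POST
    )


req = build_delete_relationship(1, 2, "uses")
```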
Embedding Service — port 3003
| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |
| POST | /embed | Embed a single text string |
| POST | /embed/batch | Embed an array of text strings |
POST /embed — body:
{ "text": "Hello from NexusAI" }
POST /embed — response:
{ "embedding": [0.123, -0.456, ...], "model": "nomic-embed-text", "dimensions": 768 }
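When debugging against the embedding service directly, a quick sanity check is that the vector length agrees with the reported dimensions. The sketch below uses a shortened, made-up vector rather than real model output, and check_embedding is a hypothetical helper.

```python
def check_embedding(response: dict) -> list:
    """Return the vector if its length matches the reported dimensions."""
    embedding = response["embedding"]
    dimensions = response["dimensions"]
    if len(embedding) != dimensions:
        raise ValueError(
            f"expected {dimensions} values, got {len(embedding)}"
        )
    return embedding


# Shortened, made-up response; a real one would report 768 dimensions.
sample = {"embedding": [0.123, -0.456, 0.789],
          "model": "nomic-embed-text", "dimensions": 3}
vector = check_embedding(sample)
```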
Inference Service — port 3001
| Method | Path | Description |
|---|---|---|
| GET | /health | Health check — reports active provider and model |
| POST | /complete | Full completion — awaits entire response |
| POST | /complete/stream | Streaming completion via SSE |
POST /complete — body:
{
"prompt": "What is the capital of France?",
"model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
"temperature": 0.7,
"maxTokens": 1024,
"topP": 0.9,
"topK": 40,
"repeatPenalty": 1.1
}
All fields except prompt are optional. In normal usage these are forwarded
from orchestration, which reads them from settings.json.
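The forwarding described above might look roughly like the sketch below: the sampling parameters are copied from the settings object into the /complete body. How orchestration actually does this is not shown on this page, so the function name and structure are illustrative assumptions.

```python
from typing import Optional

# Inference parameters that the settings object shares with the /complete body.
FORWARDED = ("temperature", "topP", "topK", "repeatPenalty")


def build_complete_body(prompt: str, settings: dict,
                        model: Optional[str] = None,
                        max_tokens: Optional[int] = None) -> dict:
    """Build a POST /complete body from a prompt plus stored settings."""
    body = {"prompt": prompt}  # the only required field
    for key in FORWARDED:
        if key in settings:
            body[key] = settings[key]
    if model is not None:
        body["model"] = model
    if max_tokens is not None:
        body["maxTokens"] = max_tokens
    return body


# Settings values taken from the GET /settings example on this page.
settings = {"recentEpisodeLimit": 9, "temperature": 0.65,
            "repeatPenalty": 1.3, "topP": 0.9, "topK": 41}
body = build_complete_body("What is the capital of France?", settings)
```

Memory-related settings such as recentEpisodeLimit are deliberately not copied, since the /complete body has no field for them.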
POST /complete — response:
{
"text": "The capital of France is Paris.",
"model": "gemma-4-26B...gguf",
"done": true,
"evalCount": 8,
"promptEvalCount": 41
}