API Routes
All HTTP endpoints across NexusAI services. Clients communicate only with the orchestration service (port 4000); the routes of the other services are listed here for reference and direct debugging use.
Orchestration Service — port 4000
Health
| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |
Chat
| Method | Path | Description |
|---|---|---|
| POST | /chat | Send a message, receive full response |
| POST | /chat/stream | Send a message, receive SSE token stream |
POST /chat and POST /chat/stream — request body:
{
"sessionId": "your-session-uuid",
"message": "Hello, my name is Tim.",
"model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
"temperature": 0.7
}
model and temperature are optional. Inference parameters (temperature,
topP, topK, repeatPenalty) are read from settings.json on every request —
controlled via PATCH /settings.
POST /chat — response:
{
"sessionId": "your-session-uuid",
"response": "Hello Tim! How can I help you today?",
"model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
"tokenCount": 87
}
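A minimal client-side sketch of the request above, using only the Python standard library. The host in `BASE_URL`, the `Content-Type` header, and the helper name `build_chat_request` are assumptions for illustration. The function only builds the request; nothing is sent until the result is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

BASE_URL = "http://localhost:4000"  # assumed host; only the port comes from this document


def build_chat_request(session_id, message, model=None, temperature=None, stream=False):
    """Build (but do not send) a POST /chat or POST /chat/stream request."""
    body = {"sessionId": session_id, "message": message}
    # model and temperature are optional, so include them only when given.
    if model is not None:
        body["model"] = model
    if temperature is not None:
        body["temperature"] = temperature
    path = "/chat/stream" if stream else "/chat"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )
```

Calling `build_chat_request("your-session-uuid", "Hello, my name is Tim.")` yields a request whose body contains only `sessionId` and `message`.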
POST /chat/stream — response (SSE):
data: {"text":"Hello"}
data: {"text":" Tim"}
data: {"done":true,"model":"gemma-4-26B...gguf","tokenCount":87}
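The stream above can be consumed roughly as follows. This is a sketch that assumes every event arrives as a single `data:` line holding one JSON object, as in the sample; `collect_stream` is a hypothetical helper name, and `lines` can be any iterable of decoded text lines.

```python
import json


def collect_stream(lines):
    """Join streamed text chunks and return (full_text, final_event)."""
    chunks = []
    final_event = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data SSE fields
        event = json.loads(line[len("data:"):])
        if event.get("done"):
            final_event = event  # carries model and tokenCount
            break
        chunks.append(event.get("text", ""))
    return "".join(chunks), final_event
```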
Sessions
| Method | Path | Description |
|---|---|---|
| GET | /sessions | Paginated session list |
| GET | /sessions/:sessionId/history | Paginated episode history for a session |
| PATCH | /sessions/:sessionId | Update session name and/or project assignment |
| DELETE | /sessions/:sessionId | Delete session and all its episodes |
GET /sessions — query params:
| Param | Default | Description |
|---|---|---|
| limit | 20 | Sessions per page |
| offset | 0 | Pagination offset |
| projectId | — | Filter by project (integer ID) |
PATCH /sessions/:sessionId — body:
{ "name": "My Session", "projectId": 3 }
Either name or projectId is required. Both can be sent together.
Returns the updated session object.
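A client can enforce the "either field is required" rule before sending. The helper below is an illustrative sketch, not part of the service; it treats `None` as "not provided", so it cannot express an explicit null value.

```python
def build_session_patch(name=None, project_id=None):
    """Build a PATCH /sessions/:sessionId body, requiring at least one field."""
    if name is None and project_id is None:
        raise ValueError("either name or projectId is required")
    body = {}
    if name is not None:
        body["name"] = name
    if project_id is not None:
        body["projectId"] = project_id
    return body
```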
GET /sessions/:sessionId/history — query params:
| Param | Default | Description |
|---|---|---|
| limit | 20 | Episodes per page |
| offset | 0 | Pagination offset |
Returns { sessionId, episodes: [...] }. Episodes ordered newest first.
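The limit/offset parameters can be driven by a small paging loop. This sketch assumes the end of the history is signalled by a page shorter than `limit`, since no total count is documented. `fetch_page` is a hypothetical callable that returns the `episodes` list of one response, and the default `base_url` host is also an assumption.

```python
from urllib.parse import urlencode


def history_url(session_id, limit=20, offset=0, base_url="http://localhost:4000"):
    """Build the URL for one page of session history."""
    query = urlencode({"limit": limit, "offset": offset})
    return f"{base_url}/sessions/{session_id}/history?{query}"


def iter_episodes(fetch_page, limit=20):
    """Yield episodes page by page until a short page signals the end."""
    offset = 0
    while True:
        episodes = fetch_page(limit, offset)
        yield from episodes
        if len(episodes) < limit:  # assumed end-of-data condition
            break
        offset += limit
```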
Projects
| Method | Path | Description |
|---|---|---|
| GET | /projects | Get all projects |
| POST | /projects | Create a new project |
| PATCH | /projects/:id | Update a project (partial — any subset of fields) |
| DELETE | /projects/:id | Delete a project (nulls session assignments) |
POST /projects — body:
{
"name": "My Project",
"description": "Optional description",
"colour": "#3d3a79",
"icon": null,
"isolated": 1
}
name is required. All other fields optional. isolated is always 1 —
all projects use isolated memory. Returns 201 with the created project object.
PATCH /projects/:id — body: any subset of fields, all optional.
| Field | Type | Description |
|---|---|---|
| name | string | Project name |
| description | string | Project description |
| colour | string | Hex colour for UI accent |
| icon | string | Icon identifier |
| isolated | integer | Memory isolation flag (always 1) |
| notes | string | User-authored project notes |
| system_prompt | string | Per-project system prompt override (null = use global) |
Only provided fields are updated — omitted fields are not touched.
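The partial-update rule can be pictured as a dictionary merge in which key presence, not value, decides what changes. This is a sketch of the semantics only; rejecting unknown fields is an assumption, and `apply_project_patch` is a hypothetical name.

```python
UPDATABLE_FIELDS = {
    "name", "description", "colour", "icon", "isolated", "notes", "system_prompt",
}


def apply_project_patch(project, patch):
    """Return a copy of project with only the provided fields changed."""
    unknown = set(patch) - UPDATABLE_FIELDS
    if unknown:
        # Assumed behaviour: the document does not say how unknown keys are handled.
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    updated = dict(project)
    updated.update(patch)  # keys absent from patch keep their old values
    return updated
```

Note the difference between omitting `system_prompt` (left untouched) and sending it as null (falls back to the global prompt).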
Summaries
| Method | Path | Description |
|---|---|---|
| GET | /summaries/session/:sessionId | Get all summaries for a session (by external UUID) |
| GET | /summaries/project/:projectId | Get all summaries for a project |
GET /summaries/session/:sessionId — resolves the external UUID to an
internal session ID, then fetches summaries from the memory service.
Returns an array of summary objects ordered by created_at ascending.
GET /summaries/project/:projectId — proxies directly to the memory service project summaries endpoint.
Summary object shape:
{
"id": 8,
"session_id": 72,
"project_id": null,
"content": "The user asked about...",
"token_count": 579,
"episode_range": "246-251",
"created_at": 1776766518,
"updated_at": 1776766518
}
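When consuming summary objects, two fields usually need decoding. The sketch below assumes `episode_range` is a "first-last" pair of episode IDs and that `created_at` is a Unix timestamp in seconds; neither is stated explicitly here, and both helper names are hypothetical.

```python
from datetime import datetime, timezone


def parse_episode_range(episode_range):
    """Split a range such as "246-251" into a (first, last) tuple of ints."""
    first, last = episode_range.split("-")
    return int(first), int(last)


def created_at_utc(summary):
    """Interpret created_at as Unix seconds (assumption) and return a UTC datetime."""
    return datetime.fromtimestamp(summary["created_at"], tz=timezone.utc)
```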
Proxy requirement:
/summaries must be added to both the Caddyfile reverse proxy and the Vite dev proxy config alongside the other route prefixes. See orchestration-service.md for the Caddy block pattern.
Models
| Method | Path | Description |
|---|---|---|
| GET | /models | Available models scanned live from models folder |
| GET | /models/props | Live model props from llama-server (context window, loaded model) |
GET /models — returns array:
[{ "value": "model-name.gguf", "label": "Display Name", "description": null, "size": "19.7 GB" }]
Scans .gguf files live from modelsFolderPath (set in settings). Merges
with models.json in the same folder for label and description metadata.
GET /models/props — returns:
{ "contextWindow": 64000, "modelAlias": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf" }
Fetches directly from llama-server /props. n_ctx is at
data.default_generation_settings.n_ctx in the llama-server response.
Returns 503 if llama-server is unreachable.
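The lookup path above can be read defensively. This sketch takes the parsed JSON body of the llama-server /props response as `props`, on the assumption that the leading `data.` in the path refers to an HTTP client's response wrapper. It returns `None` when the field is missing so the caller can decide how to respond.

```python
def context_window_from_props(props):
    """Return n_ctx from a llama-server /props body, or None if absent."""
    settings = props.get("default_generation_settings") or {}
    n_ctx = settings.get("n_ctx")
    return n_ctx if isinstance(n_ctx, int) else None
```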
Settings
| Method | Path | Description |
|---|---|---|
| GET | /settings | Get all current settings |
| PATCH | /settings | Update one or more settings |
GET /settings — response:
{
"recentEpisodeLimit": 9,
"semanticLimit": 5,
"scoreThreshold": 0.6,
"modelsFolderPath": "/mnt/nexus-models",
"temperature": 0.65,
"repeatPenalty": 1.3,
"topP": 0.9,
"topK": 41,
"systemPrompt": "You are a helpful assistant..."
}
PATCH /settings — body: any subset of the above fields.
| Field | Type | Range | Description |
|---|---|---|---|
| recentEpisodeLimit | integer | 1–20 | Recent episodes injected into prompt |
| semanticLimit | integer | 1–20 | Max semantic search results |
| scoreThreshold | float | 0–1 | Minimum similarity score for Qdrant results |
| semanticWeight | float | 0–5 | RRF weight for Qdrant semantic results |
| keywordWeight | float | 0–5 | RRF weight for FTS5 keyword results (0 = disabled) |
| modelsFolderPath | string | — | Path to folder containing .gguf files |
| temperature | float | 0–2 | Inference randomness |
| repeatPenalty | float | 1–2 | Repeat token penalty |
| topP | float | 0–1 | Nucleus sampling probability mass |
| topK | integer | 1–100 | Top-K token candidates per step |
| systemPrompt | string | — | Global system prompt (null reverts to hardcoded default) |
Settings are persisted to data/settings.json and read on every request —
changes take effect immediately without a service restart.
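The ranges in the table above can be checked on the client before sending a PATCH. This is an illustrative sketch: it assumes the ranges are inclusive, and it does not claim to mirror whatever validation the service itself performs. `SETTING_RULES` and `validate_settings_patch` are hypothetical names.

```python
# (type, min, max) per field, copied from the table; None means no range.
SETTING_RULES = {
    "recentEpisodeLimit": (int, 1, 20),
    "semanticLimit": (int, 1, 20),
    "scoreThreshold": (float, 0, 1),
    "semanticWeight": (float, 0, 5),
    "keywordWeight": (float, 0, 5),
    "modelsFolderPath": (str, None, None),
    "temperature": (float, 0, 2),
    "repeatPenalty": (float, 1, 2),
    "topP": (float, 0, 1),
    "topK": (int, 1, 100),
    "systemPrompt": (str, None, None),
}


def validate_settings_patch(patch):
    """Return a list of problems found in a PATCH /settings body."""
    errors = []
    for key, value in patch.items():
        if key not in SETTING_RULES:
            errors.append(f"{key}: unknown setting")
            continue
        kind, low, high = SETTING_RULES[key]
        if key == "systemPrompt" and value is None:
            continue  # null reverts to the hardcoded default
        # Floats also accept whole numbers; booleans are never accepted.
        accepted = (int, float) if kind is float else (kind,)
        if isinstance(value, bool) or not isinstance(value, accepted):
            errors.append(f"{key}: expected {kind.__name__}")
        elif low is not None and not (low <= value <= high):
            errors.append(f"{key}: must be between {low} and {high}")
    return errors
```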
Episodes
| Method | Path | Description |
|---|---|---|
| GET | /episodes | Paginated episode list across all sessions |
| DELETE | /episodes/:id | Delete an episode (SQLite + Qdrant) |
GET /episodes — query params:
| Param | Default | Description |
|---|---|---|
| limit | 20 | Episodes per page |
| offset | 0 | Pagination offset |
| q | — | Keyword search (FTS) |
Memory Service — port 3002
Direct access is for debugging only. All client traffic goes through orchestration.
Health
| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |
Sessions
| Method | Path | Description |
|---|---|---|
| POST | /sessions | Create a new session |
| GET | /sessions | Paginated session list with optional projectId filter |
| GET | /sessions/:id | Get session by internal ID |
| GET | /sessions/by-external/:externalId | Get session by external ID |
| PATCH | /sessions/by-external/:externalId | Update session fields |
| DELETE | /sessions/by-external/:externalId | Delete session (cascades to episodes) |
Route ordering:
by-external/:externalId must be defined before /:id to prevent by-external being captured as an ID param.
POST /sessions — body:
{ "externalId": "unique-uuid", "metadata": {} }
PATCH /sessions/by-external/:externalId — body:
{ "name": "Session Name", "projectId": 3 }
Both fields are optional. Only provided fields are updated.
Episodes
| Method | Path | Description |
|---|---|---|
| POST | /episodes | Create episode + auto-embed into Qdrant |
| GET | /episodes | Paginated episode list across all sessions |
| GET | /episodes/search?q=&limit= | FTS keyword search across all episodes |
| GET | /episodes/:id | Get episode by ID |
| GET | /sessions/:id/episodes?limit=&offset= | Paginated episodes for a session |
| DELETE | /episodes/:id | Delete episode (SQLite + Qdrant cleanup) |
Route ordering:
/episodes/search must be defined before /episodes/:id.
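The ordering rule follows from first-match-wins routing. The routing framework is not named in this document, so the toy matcher below is only a model of that behaviour, with hypothetical helper names. It shows that a `:id` segment will happily capture the literal `search` if it is tried first.

```python
def match(pattern, path):
    """Return captured params if path matches pattern, else None."""
    pattern_parts = pattern.strip("/").split("/")
    path_parts = path.strip("/").split("/")
    if len(pattern_parts) != len(path_parts):
        return None
    params = {}
    for expected, actual in zip(pattern_parts, path_parts):
        if expected.startswith(":"):
            params[expected[1:]] = actual  # a param segment accepts any value
        elif expected != actual:
            return None
    return params


def first_match(routes, path):
    """Return (pattern, params) for the first route that matches, else None."""
    for pattern in routes:
        params = match(pattern, path)
        if params is not None:
            return pattern, params
    return None
```

With the routes registered as `["/episodes/:id", "/episodes/search"]`, a request for `/episodes/search` resolves to the `:id` route with `id` set to `"search"`; reversing the order resolves it to the search route.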
POST /episodes — body:
{
"sessionId": 1,
"userMessage": "Hello",
"aiResponse": "Hi there!",
"tokenCount": 10
}
Projects
| Method | Path | Description |
|---|---|---|
| POST | /projects | Create a new project |
| GET | /projects | Get all projects |
| GET | /projects/:id | Get project by ID |
| PATCH | /projects/:id | Update a project (dynamic — any subset of fields) |
| DELETE | /projects/:id | Delete project + null session assignments |
Same request/response shape as orchestration /projects above.
Summaries
| Method | Path | Description |
|---|---|---|
| POST | /summaries | Create a new summary |
| GET | /sessions/:id/summaries | Get all summaries for a session (internal ID) |
| GET | /projects/:id/summaries | Get all summaries for a project |
| PATCH | /summaries/:id | Update a summary (content, tokenCount, episodeRange) |
| DELETE | /summaries/:id | Delete a summary |
POST /summaries — body:
{
"sessionId": 72,
"content": "The user discussed...",
"tokenCount": 579,
"episodeRange": "246-251"
}
content is required. Either sessionId or projectId is required.
PATCH /summaries/:id — body: any subset of content, tokenCount, episodeRange.
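The two requirements on the POST /summaries body can be expressed as a small pre-flight check. This is a client-side sketch with a hypothetical helper name; treating an empty string as missing content is an assumption.

```python
def validate_summary(body):
    """Return a list of problems found in a POST /summaries body."""
    errors = []
    if not body.get("content"):
        errors.append("content is required")
    if body.get("sessionId") is None and body.get("projectId") is None:
        errors.append("either sessionId or projectId is required")
    return errors
```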
Entities
| Method | Path | Description |
|---|---|---|
| POST | /entities | Upsert entity (creates or updates by name + type) |
| GET | /entities/by-type/:type | All entities of a given type |
| GET | /entities/:id | Get entity by ID |
| DELETE | /entities/:id | Delete entity (cascades to relationships) |
Route ordering:
/entities/by-type/:type must be defined before /entities/:id.
POST /entities — body:
{
"name": "NexusAI",
"type": "project",
"notes": "My AI memory project",
"metadata": {}
}
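Upserting by name + type can be modelled with an in-memory store keyed on that pair. This sketch only illustrates the create-or-update idea. How the real service merges `notes` and `metadata` on update is not documented here, so the rule used below (keep the old value when the new one is `None`) is an assumption.

```python
class EntityStore:
    """Toy in-memory model of upsert-by-(name, type)."""

    def __init__(self):
        self._by_key = {}
        self._next_id = 1

    def upsert(self, name, entity_type, notes=None, metadata=None):
        key = (name, entity_type)
        entity = self._by_key.get(key)
        if entity is None:
            entity = {
                "id": self._next_id,
                "name": name,
                "type": entity_type,
                "notes": notes,
                "metadata": metadata or {},
            }
            self._next_id += 1
            self._by_key[key] = entity
        else:
            # Assumed merge rule: None leaves the stored value unchanged.
            if notes is not None:
                entity["notes"] = notes
            if metadata is not None:
                entity["metadata"] = metadata
        return entity
```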
Relationships
| Method | Path | Description |
|---|---|---|
| POST | /relationships | Upsert a relationship between two entities |
| GET | /entities/:id/relationships | All relationships for an entity |
| DELETE | /relationships | Delete a specific relationship |
POST /relationships — body:
{ "fromId": 1, "toId": 2, "label": "uses", "metadata": {} }
DELETE /relationships — body:
{ "fromId": 1, "toId": 2, "label": "works_on", "notes": "Alice is the primary developer.", "metadata": {} }
notes is optional. label should be a snake_case verb. Relationships are identified by the composite key (fromId, toId, label): re-submitting a POST with the same key increments mention_count and preserves the existing notes if the new value is null.
DELETE takes the key in the request body rather than in URL params, since a three-part key is awkward to encode in a path.
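The composite-key behaviour described above can be sketched with a dictionary keyed on (fromId, toId, label). Starting `mention_count` at 1 for a new relationship is an assumption, and both function names are hypothetical.

```python
def upsert_relationship(store, from_id, to_id, label, notes=None):
    """Create the relationship or bump its mention_count if it already exists."""
    key = (from_id, to_id, label)
    existing = store.get(key)
    if existing is None:
        store[key] = {"notes": notes, "mention_count": 1}  # assumed initial count
    else:
        existing["mention_count"] += 1
        if notes is not None:  # a null value preserves the existing notes
            existing["notes"] = notes
    return store[key]


def delete_relationship(store, from_id, to_id, label):
    """Remove the relationship with this key; return True if it existed."""
    return store.pop((from_id, to_id, label), None) is not None
```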
Graph
| Method | Path | Description |
|---|---|---|
| GET | /graph/neighborhood/:entityId | Entity neighborhood — nodes + edges within N hops |
| POST | /graph/neighbors | Bulk 1-hop neighborhood for a set of entity IDs |
GET /graph/neighborhood/:entityId — query params:
| Param | Default | Max | Description |
|---|---|---|---|
| depth | 1 | 3 | Traversal depth |
Returns { entity, neighborhood: { nodes, edges } }. Returns 404 if entity not found.
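An N-hop neighborhood is a bounded breadth-first traversal. The sketch below works on a plain list of `(from_id, to_id, label)` tuples and makes two assumptions that this document does not confirm: edges are followed in both directions, and a `depth` above the maximum is clamped rather than rejected.

```python
from collections import deque

MAX_DEPTH = 3  # maximum from the table above


def neighborhood(edges, entity_id, depth=1):
    """Return (sorted node ids, edges) reachable within `depth` hops of entity_id."""
    depth = max(1, min(depth, MAX_DEPTH))  # assumed clamping behaviour
    adjacency = {}
    for edge in edges:
        from_id, to_id, _label = edge
        adjacency.setdefault(from_id, []).append((to_id, edge))
        adjacency.setdefault(to_id, []).append((from_id, edge))
    seen = {entity_id}
    found_edges = []
    queue = deque([(entity_id, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # do not expand beyond the requested depth
        for neighbor, edge in adjacency.get(node, []):
            if edge not in found_edges:
                found_edges.append(edge)
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return sorted(seen), found_edges
```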
POST /graph/neighbors — body:
{ "entityIds": [5, 8, 12] }
Returns { nodes: [...], edges: [...] }. Used internally by orchestration — not a client-facing endpoint.
Embedding Service — port 3003
| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |
| POST | /embed | Embed a single text string |
| POST | /embed/batch | Embed an array of text strings |
POST /embed — body:
{ "text": "Hello from NexusAI" }
POST /embed — response:
{ "embedding": [0.123, -0.456, ...], "model": "nomic-embed-text", "dimensions": 768 }
Inference Service — port 3001
| Method | Path | Description |
|---|---|---|
| GET | /health | Health check — reports active provider and model |
| POST | /complete | Full completion — awaits entire response |
| POST | /complete/stream | Streaming completion via SSE |
POST /complete — body:
{
"prompt": "What is the capital of France?",
"model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
"temperature": 0.7,
"maxTokens": 1024,
"topP": 0.9,
"topK": 40,
"repeatPenalty": 1.1
}
All fields except prompt are optional. In normal usage these are forwarded
from orchestration, which reads them from settings.json.
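The forwarding described above might look like the following. This is a sketch of the idea, not the orchestration service's actual code. It assumes the four inference parameters keep the same names in settings.json and in the /complete body, as the examples in this document suggest, and it leaves out `maxTokens` because no matching setting is listed.

```python
FORWARDED_KEYS = ("temperature", "topP", "topK", "repeatPenalty")


def complete_body_from_settings(prompt, settings, model=None):
    """Build a POST /complete body from a prompt and a settings dictionary."""
    body = {"prompt": prompt}  # prompt is the only required field
    if model is not None:
        body["model"] = model
    for key in FORWARDED_KEYS:
        if key in settings:
            body[key] = settings[key]
    return body
```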
POST /complete — response:
{
"text": "The capital of France is Paris.",
"model": "gemma-4-26B...gguf",
"done": true,
"evalCount": 8,
"promptEvalCount": 41
}