Storme-bit 055683424d retrieval fusion

2026-04-27 07:03:46 -07:00

API Routes

All HTTP endpoints across NexusAI services. Clients communicate only with the orchestration service (port 4000); routes for the memory, embedding, and inference services are listed here for reference and direct debugging.

Orchestration Service — port 4000

Health

Method	Path	Description
GET	/health	Service health check

Chat

Method	Path	Description
POST	/chat	Send a message, receive full response
POST	/chat/stream	Send a message, receive SSE token stream

POST /chat and POST /chat/stream — request body:

{
  "sessionId": "your-session-uuid",
  "message": "Hello, my name is Tim.",
  "model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
  "temperature": 0.7
}

model and temperature are optional. Inference parameters (temperature, topP, topK, repeatPenalty) are read from settings.json on every request — controlled via PATCH /settings.

POST /chat — response:

{
  "sessionId": "your-session-uuid",
  "response": "Hello Tim! How can I help you today?",
  "model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
  "tokenCount": 87
}

POST /chat/stream — response (SSE):

data: {"text":"Hello"}
data: {"text":" Tim"}
data: {"done":true,"model":"gemma-4-26B...gguf","tokenCount":87}
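A client consuming the stream has to stitch the `text` fragments together and stop at the `done` event. The following is a minimal sketch of that loop, assuming each event arrives as a single `data:` line as in the sample above; the function name `collect_sse_response` is hypothetical, and the lines are supplied as a plain iterable rather than read from a live HTTP connection.

```python
import json
from typing import Iterable, Tuple


def collect_sse_response(lines: Iterable[str]) -> Tuple[str, dict]:
    """Accumulate streamed tokens until the final `done` event.

    Returns the full response text and the final event payload.
    """
    parts = []
    final = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank separators and any non-data SSE fields.
        if not line.startswith("data:"):
            continue
        event = json.loads(line[len("data:"):].strip())
        if event.get("done"):
            final = event
            break
        parts.append(event.get("text", ""))
    return "".join(parts), final


stream = [
    'data: {"text":"Hello"}',
    "",
    'data: {"text":" Tim"}',
    "",
    'data: {"done":true,"model":"example.gguf","tokenCount":87}',
]
text, final = collect_sse_response(stream)
print(text)                 # Hello Tim
print(final["tokenCount"])  # 87
```

The model name in the sample stream is a placeholder, not a real file name.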

Sessions

Method	Path	Description
GET	/sessions	Paginated session list
GET	/sessions/:sessionId/history	Paginated episode history for a session
PATCH	/sessions/:sessionId	Update session name and/or project assignment
DELETE	/sessions/:sessionId	Delete session and all its episodes

GET /sessions — query params:

Param	Default	Description
limit	20	Sessions per page
offset	0	Pagination offset
projectId	—	Filter by project (integer ID)

PATCH /sessions/:sessionId — body:

{ "name": "My Session", "projectId": 3 }

At least one of name or projectId is required; both can be sent together. Returns the updated session object.

GET /sessions/:sessionId/history — query params:

Param	Default	Description
limit	20	Episodes per page
offset	0	Pagination offset

Returns { sessionId, episodes: [...] }. Episodes are ordered newest first.
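The `limit` and `offset` parameters used by the paginated endpoints can be walked with a simple loop. This is a rough sketch only: `iter_pages` and `fetch_page` are hypothetical names, and the fetch function here slices an in-memory list instead of calling the API. It assumes a page shorter than `limit` means the end of the data, which the source does not state.

```python
from typing import Callable, Iterator, List


def iter_pages(fetch_page: Callable[[int, int], List[dict]], limit: int = 20) -> Iterator[dict]:
    """Yield items page by page until a short or empty page is returned."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            break
        offset += limit


# Stand-in for GET /sessions/:sessionId/history?limit=&offset=
fake_episodes = [{"id": i} for i in range(45, 0, -1)]  # newest first


def fetch_page(limit: int, offset: int) -> List[dict]:
    return fake_episodes[offset:offset + limit]


all_episodes = list(iter_pages(fetch_page))
print(len(all_episodes))  # 45
```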

Projects

Method	Path	Description
GET	/projects	Get all projects
POST	/projects	Create a new project
PATCH	/projects/:id	Update a project (partial — any subset of fields)
DELETE	/projects/:id	Delete a project (nulls session assignments)

POST /projects — body:

{
  "name": "My Project",
  "description": "Optional description",
  "colour": "#3d3a79",
  "icon": null,
  "isolated": 1
}

name is required; all other fields are optional. isolated is always 1, since all projects use isolated memory. Returns 201 with the created project object.

PATCH /projects/:id — body: any subset of fields, all optional.

Field	Type	Description
`name`	string	Project name
`description`	string	Project description
`colour`	string	Hex colour for UI accent
`icon`	string	Icon identifier
`isolated`	integer	Memory isolation flag (always 1)
`notes`	string	User-authored project notes
`system_prompt`	string	Per-project system prompt override (null = use global)

Only provided fields are updated — omitted fields are not touched.
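Partial-update semantics mean that an omitted key and a key set to null are different things: the first leaves the field alone, the second writes null (which, for `system_prompt`, falls back to the global prompt). A small sketch of that distinction, using a hypothetical `apply_project_patch` helper on plain dictionaries; rejecting unknown fields is an assumption, not documented behaviour.

```python
from typing import Any, Dict

# Fields listed in the PATCH /projects/:id table above.
PATCHABLE_FIELDS = {"name", "description", "colour", "icon", "isolated", "notes", "system_prompt"}


def apply_project_patch(project: Dict[str, Any], patch: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `project` with only the provided, known fields replaced."""
    unknown = set(patch) - PATCHABLE_FIELDS
    if unknown:
        # Assumption: unknown keys are an error rather than silently ignored.
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    updated = dict(project)
    # Key presence decides what changes, so an explicit None is a real update.
    updated.update(patch)
    return updated


project = {"id": 3, "name": "My Project", "colour": "#3d3a79", "system_prompt": "Be terse."}
patched = apply_project_patch(project, {"system_prompt": None})
print(patched["name"])           # My Project
print(patched["system_prompt"])  # None
```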

Summaries

Method	Path	Description
GET	/summaries/session/:sessionId	Get all summaries for a session (by external UUID)
GET	/summaries/project/:projectId	Get all summaries for a project

GET /summaries/session/:sessionId — resolves the external UUID to an internal session ID, then fetches summaries from the memory service. Returns an array of summary objects ordered by created_at ascending.

GET /summaries/project/:projectId — proxies directly to the memory service project summaries endpoint.

Summary object shape:

{
  "id": 8,
  "session_id": 72,
  "project_id": null,
  "content": "The user asked about...",
  "token_count": 579,
  "episode_range": "246-251",
  "created_at": 1776766518,
  "updated_at": 1776766518
}

Proxy requirement: /summaries must be added to both the Caddyfile reverse proxy and the Vite dev proxy config alongside the other route prefixes. See orchestration-service.md for the Caddy block pattern.

Models

Method	Path	Description
GET	/models	Available models scanned live from models folder
GET	/models/props	Live model props from llama-server (context window, loaded model)

GET /models — returns array:

[{ "value": "model-name.gguf", "label": "Display Name", "description": null, "size": "19.7 GB" }]

Scans .gguf files live from modelsFolderPath (set in settings). Merges with models.json in the same folder for label and description metadata.
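The scan-and-merge step described above might look roughly like the sketch below. The shape of `models.json` (a mapping from file name to `label` and `description`), the fallback label, and the decimal-gigabyte size formatting are all assumptions made for illustration; the source only states that the two are merged.

```python
import json
from pathlib import Path
from typing import Dict, List


def format_size(num_bytes: int) -> str:
    """Format a byte count as decimal gigabytes with one decimal place (assumed unit)."""
    return f"{num_bytes / 1e9:.1f} GB"


def list_models(folder: Path) -> List[dict]:
    """Scan `folder` for .gguf files and attach metadata from models.json if present."""
    meta_path = folder / "models.json"
    # Assumed shape: {"file.gguf": {"label": "...", "description": "..."}}
    metadata: Dict[str, dict] = json.loads(meta_path.read_text()) if meta_path.exists() else {}
    models = []
    for path in sorted(folder.glob("*.gguf")):
        meta = metadata.get(path.name, {})
        models.append({
            "value": path.name,
            # Fall back to the file name without extension when no label is given.
            "label": meta.get("label", path.stem),
            "description": meta.get("description"),
            "size": format_size(path.stat().st_size),
        })
    return models
```

Because the folder is scanned on each call, a newly copied model file would show up without any restart, which matches the "scanned live" wording above.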

GET /models/props — returns:

{ "contextWindow": 64000, "modelAlias": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf" }

Fetches directly from llama-server /props. n_ctx is at data.default_generation_settings.n_ctx in the llama-server response. Returns 503 if llama-server is unreachable.
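As a small illustration of where the context window sits in the llama-server payload, the sketch below reads `default_generation_settings.n_ctx` from an already-parsed response body. The helper name `extract_context_window` is hypothetical, and returning None for a missing value is an assumption; the sample dictionary contains only the one field named in the text.

```python
from typing import Any, Dict, Optional


def extract_context_window(props: Dict[str, Any]) -> Optional[int]:
    """Read n_ctx from default_generation_settings; return None if absent."""
    settings = props.get("default_generation_settings") or {}
    n_ctx = settings.get("n_ctx")
    return int(n_ctx) if n_ctx is not None else None


sample = {"default_generation_settings": {"n_ctx": 64000}}
print(extract_context_window(sample))  # 64000
```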

Settings

Method	Path	Description
GET	/settings	Get all current settings
PATCH	/settings	Update one or more settings

GET /settings — response:

{
  "recentEpisodeLimit": 9,
  "semanticLimit": 5,
  "scoreThreshold": 0.6,
  "modelsFolderPath": "/mnt/nexus-models",
  "temperature": 0.65,
  "repeatPenalty": 1.3,
  "topP": 0.9,
  "topK": 41,
  "systemPrompt": "You are a helpful assistant..."
}

PATCH /settings — body: any subset of the fields below.

Field	Type	Range	Description
`recentEpisodeLimit`	integer	1–20	Recent episodes injected into prompt
`semanticLimit`	integer	1–20	Max semantic search results
`scoreThreshold`	float	0–1	Minimum similarity score for Qdrant results
`semanticWeight`	float	0–5	RRF weight for Qdrant semantic results
`keywordWeight`	float	0–5	RRF weight for FTS5 keyword results (`0` = disabled)
`modelsFolderPath`	string	—	Path to folder containing .gguf files
`temperature`	float	0–2	Inference randomness
`repeatPenalty`	float	1–2	Repeat token penalty
`topP`	float	0–1	Nucleus sampling probability mass
`topK`	integer	1–100	Top-K token candidates per step
`systemPrompt`	string	—	Global system prompt (null reverts to hardcoded default)
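The ranges in the table above lend themselves to a table-driven check before a PATCH is sent. This is a sketch under assumptions: the bounds are treated as inclusive (suggested by `0` being a valid `keywordWeight`), and `validate_settings_patch` is a hypothetical client-side helper, not the service's own validation.

```python
from typing import Any, Dict, List

# (type, min, max) taken from the ranges table above.
NUMERIC_RANGES = {
    "recentEpisodeLimit": (int, 1, 20),
    "semanticLimit": (int, 1, 20),
    "scoreThreshold": (float, 0, 1),
    "semanticWeight": (float, 0, 5),
    "keywordWeight": (float, 0, 5),
    "temperature": (float, 0, 2),
    "repeatPenalty": (float, 1, 2),
    "topP": (float, 0, 1),
    "topK": (int, 1, 100),
}


def validate_settings_patch(patch: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable errors; empty means the patch is in range."""
    errors = []
    for key, value in patch.items():
        if key not in NUMERIC_RANGES:
            continue  # string fields (modelsFolderPath, systemPrompt) are not range-checked
        kind, lo, hi = NUMERIC_RANGES[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{key}: expected a number")
        elif kind is int and not float(value).is_integer():
            errors.append(f"{key}: expected an integer")
        elif not lo <= value <= hi:
            errors.append(f"{key}: {value} is outside {lo}-{hi}")
    return errors


print(validate_settings_patch({"temperature": 0.65, "topK": 41}))  # []
print(validate_settings_patch({"topK": 0}))  # ['topK: 0 is outside 1-100']
```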

Settings are persisted to data/settings.json and read on every request — changes take effect immediately without a service restart.
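The semanticWeight and keywordWeight settings are described as RRF weights. The source does not show the fusion formula, so the following is only a sketch of one common weighted form of reciprocal rank fusion, where each result contributes weight / (k + rank). The constant k = 60 and the function name `weighted_rrf` are assumptions, not values taken from NexusAI.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def weighted_rrf(rankings: Dict[str, Sequence[int]], weights: Dict[str, float], k: int = 60) -> List[int]:
    """Fuse ranked ID lists; each hit contributes weight / (k + rank), rank starting at 1."""
    scores: Dict[int, float] = defaultdict(float)
    for source, ids in rankings.items():
        weight = weights.get(source, 1.0)
        if weight == 0:
            continue  # a zero weight disables that source entirely
        for rank, item_id in enumerate(ids, start=1):
            scores[item_id] += weight / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


fused = weighted_rrf(
    {"semantic": [101, 102, 103], "keyword": [103, 104]},
    {"semantic": 1.0, "keyword": 1.0},
)
print(fused)  # [103, 101, 102, 104]
```

Episode 103 wins because it appears in both lists. Setting the keyword weight to 0 in this sketch reduces the result to the semantic ranking alone, which mirrors the "0 = disabled" note in the table.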

Episodes

Method	Path	Description
GET	/episodes	Paginated episode list across all sessions
DELETE	/episodes/:id	Delete an episode (SQLite + Qdrant)

GET /episodes — query params:

Param	Default	Description
limit	20	Episodes per page
offset	0	Pagination offset
q	—	Keyword search (FTS)

Memory Service — port 3002

Direct access is for debugging only. All client traffic goes through orchestration.

Health

Method	Path	Description
GET	/health	Service health check

Sessions

Method	Path	Description
POST	/sessions	Create a new session
GET	/sessions	Paginated session list with optional projectId filter
GET	/sessions/:id	Get session by internal ID
GET	/sessions/by-external/:externalId	Get session by external ID
PATCH	/sessions/by-external/:externalId	Update session fields
DELETE	/sessions/by-external/:externalId	Delete session (cascades to episodes)

Route ordering: by-external/:externalId must be defined before /:id to prevent by-external being captured as an ID param.

POST /sessions — body:

{ "externalId": "unique-uuid", "metadata": {} }

PATCH /sessions/by-external/:externalId — body:

{ "name": "Session Name", "projectId": 3 }

Both fields are optional. Only provided fields are updated.

Episodes

Method	Path	Description
POST	/episodes	Create episode + auto-embed into Qdrant
GET	/episodes	Paginated episode list across all sessions
GET	/episodes/search?q=&limit=	FTS keyword search across all episodes
GET	/episodes/:id	Get episode by ID
GET	/sessions/:id/episodes?limit=&offset=	Paginated episodes for a session
DELETE	/episodes/:id	Delete episode (SQLite + Qdrant cleanup)

Route ordering: /episodes/search must be defined before /episodes/:id.
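The ordering rule follows from first-match routing: a parameterised pattern such as /episodes/:id will happily capture the literal segment "search" if it is registered first. The sketch below imitates that behaviour with a tiny regex-based matcher. It is not the router the service uses (the source does not name one); `compile_route` and `match_first` are hypothetical helpers.

```python
import re
from typing import List, Optional, Tuple


def compile_route(pattern: str) -> "re.Pattern[str]":
    """Turn '/episodes/:id' into a regex where ':name' matches one path segment."""
    regex = re.sub(r":(\w+)", r"(?P<\1>[^/]+)", pattern)
    return re.compile(f"^{regex}$")


def match_first(routes: List[str], path: str) -> Optional[Tuple[str, dict]]:
    """Return the first registered route that matches, mimicking first-match routing."""
    for pattern in routes:
        m = compile_route(pattern).match(path)
        if m:
            return pattern, m.groupdict()
    return None


wrong_order = ["/episodes/:id", "/episodes/search"]
right_order = ["/episodes/search", "/episodes/:id"]

print(match_first(wrong_order, "/episodes/search"))  # ('/episodes/:id', {'id': 'search'})
print(match_first(right_order, "/episodes/search"))  # ('/episodes/search', {})
print(match_first(right_order, "/episodes/42"))      # ('/episodes/:id', {'id': '42'})
```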

POST /episodes — body:

{
  "sessionId": 1,
  "userMessage": "Hello",
  "aiResponse": "Hi there!",
  "tokenCount": 10
}

Projects

Method	Path	Description
POST	/projects	Create a new project
GET	/projects	Get all projects
GET	/projects/:id	Get project by ID
PATCH	/projects/:id	Update a project (dynamic — any subset of fields)
DELETE	/projects/:id	Delete project + null session assignments

Same request/response shape as orchestration /projects above.

Summaries

Method	Path	Description
POST	/summaries	Create a new summary
GET	/sessions/:id/summaries	Get all summaries for a session (internal ID)
GET	/projects/:id/summaries	Get all summaries for a project
PATCH	/summaries/:id	Update a summary (content, tokenCount, episodeRange)
DELETE	/summaries/:id	Delete a summary

POST /summaries — body:

{
  "sessionId": 72,
  "content": "The user discussed...",
  "tokenCount": 579,
  "episodeRange": "246-251"
}

content is required. Either sessionId or projectId is required.
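The two requirements above can be expressed as a short pre-flight check. This is an illustrative sketch: `validate_summary_body` is a hypothetical helper, and treating an empty or whitespace-only content string as missing is an assumption.

```python
from typing import Any, Dict, List


def validate_summary_body(body: Dict[str, Any]) -> List[str]:
    """Check the POST /summaries rules: content required, plus sessionId or projectId."""
    errors = []
    content = body.get("content")
    # Assumption: blank content counts as missing.
    if not isinstance(content, str) or not content.strip():
        errors.append("content is required")
    if body.get("sessionId") is None and body.get("projectId") is None:
        errors.append("either sessionId or projectId is required")
    return errors


print(validate_summary_body({"sessionId": 72, "content": "The user discussed..."}))  # []
print(validate_summary_body({"content": "Orphan summary"}))
# ['either sessionId or projectId is required']
```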

PATCH /summaries/:id — body: any subset of content, tokenCount, episodeRange.

Entities

Method	Path	Description
POST	/entities	Upsert entity (creates or updates by name + type)
GET	/entities/by-type/:type	All entities of a given type
GET	/entities/:id	Get entity by ID
DELETE	/entities/:id	Delete entity (cascades to relationships)

Route ordering: /entities/by-type/:type must be before /entities/:id.

POST /entities — body:

{
  "name": "NexusAI",
  "type": "project",
  "notes": "My AI memory project",
  "metadata": {}
}

Relationships

Method	Path	Description
POST	/relationships	Upsert a relationship between two entities
GET	/entities/:id/relationships	All relationships for an entity
DELETE	/relationships	Delete a specific relationship

POST /relationships — body:

{ "fromId": 1, "toId": 2, "label": "works_on", "notes": "Alice is the primary developer.", "metadata": {} }

notes is optional. label should be a snake_case verb. Re-submitting with the same (fromId, toId, label) key increments mention_count and preserves existing notes if the new value is null.

DELETE /relationships — body:

{ "fromId": 1, "toId": 2, "label": "works_on" }

Relationships are identified by the composite key (fromId, toId, label). Delete uses the request body rather than URL params since this three-part key is awkward to encode in a path.
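The composite-key behaviour can be pictured with an in-memory dictionary keyed by (fromId, toId, label). The sketch below is illustrative only: the function names are hypothetical, the real service stores relationships in its database, and starting mention_count at 1 on first insert is an assumption.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[int, int, str]  # (fromId, toId, label)


def upsert_relationship(store: Dict[Key, dict], from_id: int, to_id: int, label: str,
                        notes: Optional[str] = None) -> dict:
    """Insert a relationship, or bump mention_count if the key already exists."""
    key = (from_id, to_id, label)
    existing = store.get(key)
    if existing is None:
        store[key] = {"fromId": from_id, "toId": to_id, "label": label,
                      "notes": notes, "mention_count": 1}
    else:
        existing["mention_count"] += 1
        # A null (None) value keeps whatever notes were stored before.
        if notes is not None:
            existing["notes"] = notes
    return store[key]


def delete_relationship(store: Dict[Key, dict], from_id: int, to_id: int, label: str) -> bool:
    """Remove the relationship identified by the three-part key; True if it existed."""
    return store.pop((from_id, to_id, label), None) is not None


store: Dict[Key, dict] = {}
upsert_relationship(store, 1, 2, "works_on", notes="Alice is the primary developer.")
rel = upsert_relationship(store, 1, 2, "works_on")  # same key, no notes
print(rel["mention_count"])  # 2
print(rel["notes"])          # Alice is the primary developer.
```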

Graph

Method	Path	Description
GET	/graph/neighborhood/:entityId	Entity neighborhood — nodes + edges within N hops
POST	/graph/neighbors	Bulk 1-hop neighborhood for a set of entity IDs

GET /graph/neighborhood/:entityId — query params:

Param	Default	Max	Description
depth	1	3	Traversal depth

Returns { entity, neighborhood: { nodes, edges } }. Returns 404 if entity not found.
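A neighborhood "within N hops" is a breadth-first traversal that stops expanding at the requested depth. The sketch below shows that idea over a list of edges. Treating edges as undirected and clamping depth into the 1 to 3 range (rather than rejecting out-of-range values) are assumptions; the `neighborhood` function is hypothetical and returns bare IDs and tuples rather than the service's full node and edge objects.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int, str]  # (fromId, toId, label)
MAX_DEPTH = 3  # maximum from the table above


def neighborhood(edges: List[Edge], entity_id: int, depth: int = 1) -> Dict[str, list]:
    """Breadth-first walk collecting nodes and edges within `depth` hops, ignoring direction."""
    depth = max(1, min(depth, MAX_DEPTH))
    adjacency: Dict[int, List[Edge]] = {}
    for edge in edges:
        adjacency.setdefault(edge[0], []).append(edge)
        adjacency.setdefault(edge[1], []).append(edge)

    seen: Set[int] = {entity_id}
    found_edges: Set[Edge] = set()
    queue = deque([(entity_id, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # do not expand beyond the requested depth
        for edge in adjacency.get(node, []):
            found_edges.add(edge)
            other = edge[1] if edge[0] == node else edge[0]
            if other not in seen:
                seen.add(other)
                queue.append((other, dist + 1))
    return {"nodes": sorted(seen), "edges": sorted(found_edges)}


edges = [(1, 2, "uses"), (2, 3, "works_on"), (3, 4, "uses")]
print(neighborhood(edges, 1, depth=1)["nodes"])  # [1, 2]
print(neighborhood(edges, 1, depth=2)["nodes"])  # [1, 2, 3]
```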

POST /graph/neighbors — body:

{ "entityIds": [5, 8, 12] }

Returns { nodes: [...], edges: [...] }. Used internally by orchestration — not a client-facing endpoint.

Embedding Service — port 3003

Method	Path	Description
GET	/health	Service health check
POST	/embed	Embed a single text string
POST	/embed/batch	Embed an array of text strings

POST /embed — body:

{ "text": "Hello from NexusAI" }

POST /embed — response:

{ "embedding": [0.123, -0.456, ...], "model": "nomic-embed-text", "dimensions": 768 }

Inference Service — port 3001

Method	Path	Description
GET	/health	Health check — reports active provider and model
POST	/complete	Full completion — awaits entire response
POST	/complete/stream	Streaming completion via SSE

POST /complete — body:

{
  "prompt": "What is the capital of France?",
  "model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
  "temperature": 0.7,
  "maxTokens": 1024,
  "topP": 0.9,
  "topK": 40,
  "repeatPenalty": 1.1
}

All fields except prompt are optional. In normal usage these are forwarded from orchestration, which reads them from settings.json.

POST /complete — response:

{
  "text": "The capital of France is Paris.",
  "model": "gemma-4-26B...gguf",
  "done": true,
  "evalCount": 8,
  "promptEvalCount": 41
}
