# API Routes

This page lists every HTTP endpoint across the NexusAI services. Clients
communicate only with the orchestration service (port 4000); the memory,
embedding, and inference service routes are listed for reference and direct
debugging.

---

## Orchestration Service — port 4000

### Health

| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |

### Chat

| Method | Path | Description |
|---|---|---|
| POST | /chat | Send a message, receive full response |
| POST | /chat/stream | Send a message, receive SSE token stream |

**POST /chat and POST /chat/stream — request body:**
```json
{
  "sessionId": "your-session-uuid",
  "message": "Hello, my name is Tim.",
  "model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
  "temperature": 0.7
}
```
`model` and `temperature` are optional. Inference parameters (`temperature`,
`topP`, `topK`, `repeatPenalty`) are read from `settings.json` on every
request and can be changed via `PATCH /settings`.
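
As a rough illustration of the optional fields, the sketch below builds a request body and leaves out `model` and `temperature` unless the caller supplies them. The helper name `build_chat_request` is hypothetical and not part of NexusAI; how a per-request `temperature` interacts with the value in `settings.json` is not covered here.

```python
from typing import Any, Optional


def build_chat_request(
    session_id: str,
    message: str,
    model: Optional[str] = None,
    temperature: Optional[float] = None,
) -> dict[str, Any]:
    """Build a /chat or /chat/stream body, omitting unset optional fields."""
    body: dict[str, Any] = {"sessionId": session_id, "message": message}
    if model is not None:
        body["model"] = model
    if temperature is not None:
        body["temperature"] = temperature
    return body
```

Serialising the result with `json.dumps` gives a body that can be posted to either chat route.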

**POST /chat — response:**
```json
{
  "sessionId": "your-session-uuid",
  "response": "Hello Tim! How can I help you today?",
  "model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
  "tokenCount": 87
}
```

**POST /chat/stream — response (SSE):**
```
data: {"text":"Hello"}
data: {"text":" Tim"}
data: {"done":true,"model":"gemma-4-26B...gguf","tokenCount":87}
```
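
A client might consume this stream along the following lines. This is a minimal sketch that assumes each event arrives as a single `data:` line carrying JSON, as in the example above; the function name `collect_stream` is hypothetical, and reading lines off the network is left to whichever HTTP client is in use.

```python
import json
from typing import Any, Iterable


def collect_stream(lines: Iterable[str]) -> tuple[str, dict[str, Any]]:
    """Join streamed text chunks and return them with the final 'done' event."""
    parts: list[str] = []
    final: dict[str, Any] = {}
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        event = json.loads(line[len("data:"):])
        if event.get("done"):
            final = event  # carries model and tokenCount
            break
        parts.append(event.get("text", ""))
    return "".join(parts), final
```

The first element of the returned tuple is the assembled response text; the second is the closing event, or an empty dict if the stream ended without one.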

### Sessions

| Method | Path | Description |
|---|---|---|
| GET | /sessions | Paginated session list |
| GET | /sessions/:sessionId/history | Paginated episode history for a session |
| PATCH | /sessions/:sessionId | Update session name and/or project assignment |
| DELETE | /sessions/:sessionId | Delete session and all its episodes |

**GET /sessions — query params:**

| Param | Default | Description |
|---|---|---|
| limit | 20 | Sessions per page |
| offset | 0 | Pagination offset |
| projectId | — | Filter by project (integer ID) |

**PATCH /sessions/:sessionId — body:**
```json
{ "name": "My Session", "projectId": 3 }
```
At least one of `name` or `projectId` is required; both can be sent together.
Returns the updated session object.

**GET /sessions/:sessionId/history — query params:**

| Param | Default | Description |
|---|---|---|
| limit | 20 | Episodes per page |
| offset | 0 | Pagination offset |

Returns `{ sessionId, episodes: [...] }`. Episodes are ordered newest first.
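
Walking a full history with `limit` and `offset` could look like the sketch below. It assumes that a page shorter than `limit` marks the end of the history, since no total count is documented here. `iter_episodes` and the `fetch_page` callback are hypothetical names; the callback stands in for the actual HTTP call and should return the `episodes` array for one page.

```python
from typing import Any, Callable, Iterator

Page = list[dict[str, Any]]


def iter_episodes(
    fetch_page: Callable[[int, int], Page], limit: int = 20
) -> Iterator[dict[str, Any]]:
    """Yield episodes page by page until a short or empty page is returned."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)  # called as fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            break  # assumption: a short page means there is nothing left
        offset += limit
```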

### Projects

| Method | Path | Description |
|---|---|---|
| GET | /projects | Get all projects |
| POST | /projects | Create a new project |
| PATCH | /projects/:id | Update a project (partial — any subset of fields) |
| DELETE | /projects/:id | Delete a project (nulls session assignments) |

**POST /projects — body:**
```json
{
  "name": "My Project",
  "description": "Optional description",
  "colour": "#3d3a79",
  "icon": null,
  "isolated": 1
}
```
`name` is required; all other fields are optional. `isolated` is always `1`
because all projects use isolated memory. Returns `201` with the created
project object.

**PATCH /projects/:id — body:** any subset of fields, all optional.

| Field | Type | Description |
|---|---|---|
| `name` | string | Project name |
| `description` | string | Project description |
| `colour` | string | Hex colour for UI accent |
| `icon` | string | Icon identifier |
| `isolated` | integer | Memory isolation flag (always 1) |
| `notes` | string | User-authored project notes |

Only provided fields are updated; omitted fields are left untouched. This
enables safe partial updates (e.g. saving just `notes` without affecting
`name` or `colour`). Both the orchestration and memory services implement
dynamic field patching.
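
The partial-update behaviour can be pictured with the sketch below. It is an illustration only, not the services' implementation: `apply_project_patch` is a hypothetical helper, and silently ignoring unknown keys is an assumption (the real services may reject them instead).

```python
from typing import Any

# Field names taken from the PATCH /projects/:id table above.
PATCHABLE_FIELDS = ("name", "description", "colour", "icon", "isolated", "notes")


def apply_project_patch(project: dict[str, Any], patch: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of project with only the provided, known fields replaced."""
    updates = {key: value for key, value in patch.items() if key in PATCHABLE_FIELDS}
    return {**project, **updates}
```

Sending `{"notes": "Draft"}` through this helper changes `notes` and leaves every other field as it was.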

### Models

| Method | Path | Description |
|---|---|---|
| GET | /models | Available models scanned live from models folder |
| GET | /models/props | Live model props from llama-server (context window, loaded model) |

**GET /models** — returns array:
```json
[{ "value": "model-name.gguf", "label": "Display Name", "description": null, "size": "19.7 GB" }]
```
Scans `.gguf` files live from `modelsFolderPath` (set in settings). Merges
with `models.json` in the same folder for label and description metadata.

**GET /models/props** — returns:
```json
{ "contextWindow": 64000, "modelAlias": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf" }
```
Fetches directly from the llama-server `/props` endpoint. The context window
is read from `data.default_generation_settings.n_ctx` in the llama-server
response. Returns `503` if llama-server is unreachable.
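
Reading that nested field defensively might look like the sketch below. It assumes `data` refers to the parsed JSON body of the llama-server response; `context_window` is a hypothetical helper name.

```python
from typing import Any, Optional


def context_window(props: dict[str, Any]) -> Optional[int]:
    """Read n_ctx from a llama-server /props body, or None if it is absent."""
    settings = props.get("default_generation_settings")
    if not isinstance(settings, dict):
        return None
    n_ctx = settings.get("n_ctx")
    return n_ctx if isinstance(n_ctx, int) else None
```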

### Settings

| Method | Path | Description |
|---|---|---|
| GET | /settings | Get all current settings |
| PATCH | /settings | Update one or more settings |

**GET /settings — response:**
```json
{
  "recentEpisodeLimit": 9,
  "semanticLimit": 5,
  "scoreThreshold": 0.6,
  "modelsFolderPath": "/mnt/nexus-models",
  "temperature": 0.65,
  "repeatPenalty": 1.3,
  "topP": 0.9,
  "topK": 41
}
```

**PATCH /settings — body:** any subset of the above fields.

| Field | Type | Range | Description |
|---|---|---|---|
| `recentEpisodeLimit` | integer | 1–20 | Recent episodes injected into prompt |
| `semanticLimit` | integer | 1–20 | Max semantic search results |
| `scoreThreshold` | float | 0–1 | Minimum similarity score |
| `modelsFolderPath` | string | — | Path to folder containing .gguf files |
| `temperature` | float | 0–2 | Inference randomness |
| `repeatPenalty` | float | 1–2 | Repeat token penalty |
| `topP` | float | 0–1 | Nucleus sampling probability mass |
| `topK` | integer | 1–100 | Top-K token candidates per step |

Settings are persisted to `data/settings.json` and read on every request —
changes take effect immediately without a service restart.
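
A client can check a patch against the documented ranges before sending it. The sketch below copies the ranges from the table above and treats them as inclusive, which is an assumption; whether the service rejects or clamps out-of-range values is not specified here. `validate_settings_patch` is a hypothetical helper.

```python
from typing import Any

# (type, minimum, maximum) per field, copied from the PATCH /settings table.
RANGES: dict[str, tuple[type, float, float]] = {
    "recentEpisodeLimit": (int, 1, 20),
    "semanticLimit": (int, 1, 20),
    "scoreThreshold": (float, 0, 1),
    "temperature": (float, 0, 2),
    "repeatPenalty": (float, 1, 2),
    "topP": (float, 0, 1),
    "topK": (int, 1, 100),
}


def validate_settings_patch(patch: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the patch looks valid."""
    errors: list[str] = []
    for key, value in patch.items():
        if key == "modelsFolderPath":
            if not isinstance(value, str):
                errors.append(f"{key}: expected a string")
            continue
        if key not in RANGES:
            errors.append(f"{key}: unknown setting")
            continue
        kind, low, high = RANGES[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or (kind is int and not isinstance(value, int)):
            errors.append(f"{key}: expected {kind.__name__}")
        elif not low <= value <= high:
            errors.append(f"{key}: must be between {low} and {high}")
    return errors
```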

### Episodes

| Method | Path | Description |
|---|---|---|
| GET | /episodes | Paginated episode list across all sessions |
| DELETE | /episodes/:id | Delete an episode (SQLite + Qdrant) |

**GET /episodes — query params:**

| Param | Default | Description |
|---|---|---|
| limit | 20 | Episodes per page |
| offset | 0 | Pagination offset |
| q | — | Keyword search (FTS) |

---

## Memory Service — port 3002

Direct access is for debugging only. All client traffic goes through
orchestration.

### Health

| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |

### Sessions

| Method | Path | Description |
|---|---|---|
| POST | /sessions | Create a new session |
| GET | /sessions | Paginated session list with optional projectId filter |
| GET | /sessions/:id | Get session by internal ID |
| GET | /sessions/by-external/:externalId | Get session by external ID |
| PATCH | /sessions/by-external/:externalId | Update session fields |
| DELETE | /sessions/by-external/:externalId | Delete session (cascades to episodes) |

> Route ordering: `by-external/:externalId` must be defined before `/:id`
> to prevent `by-external` being captured as an ID param.

**POST /sessions — body:**
```json
{ "externalId": "unique-uuid", "metadata": {} }
```

**PATCH /sessions/by-external/:externalId — body:**
```json
{ "name": "Session Name", "projectId": 3 }
```
Both fields are optional. Only provided fields are updated.

### Episodes

| Method | Path | Description |
|---|---|---|
| POST | /episodes | Create episode + auto-embed into Qdrant |
| GET | /episodes | Paginated episode list across all sessions |
| GET | /episodes/search?q=&limit= | FTS keyword search across all episodes |
| GET | /episodes/:id | Get episode by ID |
| GET | /sessions/:id/episodes?limit=&offset= | Paginated episodes for a session |
| DELETE | /episodes/:id | Delete episode (SQLite + Qdrant cleanup) |

> Route ordering: `/episodes/search` must be defined before `/episodes/:id`.
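
The reason for this ordering can be shown with a toy first-match router. The sketch below is not the framework the services use; it only assumes that routes are tried in registration order and that a `:name` segment matches any single path segment. `match` and `first_match` are hypothetical helpers.

```python
from typing import Optional


def match(pattern: str, path: str) -> Optional[dict[str, str]]:
    """Match one path against one pattern; ':name' segments capture any value."""
    pattern_parts = pattern.strip("/").split("/")
    path_parts = path.strip("/").split("/")
    if len(pattern_parts) != len(path_parts):
        return None
    params: dict[str, str] = {}
    for expected, actual in zip(pattern_parts, path_parts):
        if expected.startswith(":"):
            params[expected[1:]] = actual
        elif expected != actual:
            return None
    return params


def first_match(routes: list[str], path: str) -> Optional[tuple[str, dict[str, str]]]:
    """Return the first registered pattern that matches the path."""
    for pattern in routes:
        params = match(pattern, path)
        if params is not None:
            return pattern, params
    return None
```

With `/episodes/:id` registered first, a request for `/episodes/search` resolves to the ID route with `id` set to `"search"`; registering `/episodes/search` first sends it to the search handler, while `/episodes/42` still reaches the ID route.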

**POST /episodes — body:**
```json
{
  "sessionId": 1,
  "userMessage": "Hello",
  "aiResponse": "Hi there!",
  "tokenCount": 10
}
```

### Projects

| Method | Path | Description |
|---|---|---|
| POST | /projects | Create a new project |
| GET | /projects | Get all projects |
| GET | /projects/:id | Get project by ID |
| PATCH | /projects/:id | Update a project (dynamic — any subset of fields) |
| DELETE | /projects/:id | Delete project + null session assignments |

Same request/response shape as orchestration `/projects` above.

### Entities

| Method | Path | Description |
|---|---|---|
| POST | /entities | Upsert entity (creates or updates by name + type) |
| GET | /entities/by-type/:type | All entities of a given type |
| GET | /entities/:id | Get entity by ID |
| DELETE | /entities/:id | Delete entity (cascades to relationships) |

> Route ordering: `/entities/by-type/:type` must be before `/entities/:id`.

**POST /entities — body:**
```json
{
  "name": "NexusAI",
  "type": "project",
  "notes": "My AI memory project",
  "metadata": {}
}
```

### Relationships

| Method | Path | Description |
|---|---|---|
| POST | /relationships | Upsert a relationship between two entities |
| GET | /entities/:id/relationships | All relationships for an entity |
| DELETE | /relationships | Delete a specific relationship |

**POST /relationships — body:**
```json
{ "fromId": 1, "toId": 2, "label": "uses", "metadata": {} }
```

**DELETE /relationships — body:**
```json
{ "fromId": 1, "toId": 2, "label": "uses" }
```

Relationships are identified by the composite key `(fromId, toId, label)`.
`DELETE` takes this key in the request body rather than in URL params because
a three-part key is awkward to encode in a path.
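
Some HTTP helpers do not attach a body to `DELETE` by default, so the request usually has to be built explicitly. The sketch below does this with Python's standard `urllib.request`. It builds the request without sending it; `build_delete_relationship` is a hypothetical helper and the base URL is supplied by the caller.

```python
import json
import urllib.request


def build_delete_relationship(
    base_url: str, from_id: int, to_id: int, label: str
) -> urllib.request.Request:
    """Build (but do not send) a DELETE /relationships request with a JSON body."""
    payload = json.dumps({"fromId": from_id, "toId": to_id, "label": label}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/relationships",
        data=payload,
        method="DELETE",
        headers={"Content-Type": "application/json"},
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; for a local debugging session the base URL would be something like `http://localhost:3002`, where the host is an assumption.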

---

## Embedding Service — port 3003

| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |
| POST | /embed | Embed a single text string |
| POST | /embed/batch | Embed an array of text strings |

**POST /embed — body:**
```json
{ "text": "Hello from NexusAI" }
```

**POST /embed — response:**
```json
{ "embedding": [0.123, -0.456, ...], "model": "nomic-embed-text", "dimensions": 768 }
```

---

## Inference Service — port 3001

| Method | Path | Description |
|---|---|---|
| GET | /health | Health check — reports active provider and model |
| POST | /complete | Full completion — awaits entire response |
| POST | /complete/stream | Streaming completion via SSE |

**POST /complete — body:**
```json
{
  "prompt": "What is the capital of France?",
  "model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
  "temperature": 0.7,
  "maxTokens": 1024,
  "topP": 0.9,
  "topK": 40,
  "repeatPenalty": 1.1
}
```
All fields except `prompt` are optional. In normal usage these are forwarded
from orchestration, which reads them from `settings.json`.

**POST /complete — response:**
```json
{
  "text": "The capital of France is Paris.",
  "model": "gemma-4-26B...gguf",
  "done": true,
  "evalCount": 8,
  "promptEvalCount": 41
}
```