update documentation

Storme-bit
2026-04-17 03:46:17 -07:00
parent 27e3c98304
commit 5145b9a7db
13 changed files with 822 additions and 794 deletions

@@ -43,48 +43,34 @@ src/
│ └── index.js # Qdrant collection management, upsert, search, delete
├── entities/
│ └── index.js # Entity + relationship CRUD
└── index.js # Express app + route definitions
└── index.js # Express app + all route definitions
```
## SQLite Schema
Six core tables:
- **sessions** — top-level conversation containers, identified by an `external_id`, optional `name`, and optional `project_id`
- **sessions** — top-level conversation containers. Fields: `external_id`, `name`, `project_id`, `metadata`
- **episodes** — individual exchanges (user message + AI response) tied to a session
- **entities** — named things the system learns about (people, places, concepts)
- **relationships** — directional labeled links between entities
- **summaries** — condensed episode groups for efficient context retrieval
- **projects** — named groupings of sessions with optional description, colour, and icon
- **projects** — named groupings of sessions with `name`, `description`, `colour`, `icon`, `isolated`
### Migrations
Schema changes that cannot be expressed in `CREATE TABLE IF NOT EXISTS` are applied
as migrations in `db/index.js` at startup, wrapped in try/catch to safely ignore
already-applied changes:
Schema changes that cannot use `CREATE TABLE IF NOT EXISTS` are applied as
idempotent migrations in `db/index.js` at startup:
```js
try {
db.exec(`ALTER TABLE sessions ADD COLUMN name TEXT`);
} catch {}
try {
db.exec(`ALTER TABLE sessions ADD COLUMN project_id INTEGER REFERENCES projects(id)`);
} catch {}
try {
db.exec(`CREATE INDEX IF NOT EXISTS idx_sessions_project ON sessions(project_id)`);
} catch {}
try { db.exec(`ALTER TABLE sessions ADD COLUMN name TEXT`); } catch {}
try { db.exec(`ALTER TABLE sessions ADD COLUMN project_id INTEGER REFERENCES projects(id)`); } catch {}
try { db.exec(`CREATE INDEX IF NOT EXISTS idx_sessions_project ON sessions(project_id)`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN isolated INTEGER NOT NULL DEFAULT 0`); } catch {}
```
This pattern is idempotent, so it is safe to run on every startup. New migrations should
always be appended here rather than modifying the schema file, since `ALTER TABLE ... ADD COLUMN`
cannot use an `IF NOT EXISTS` guard in SQLite (`CREATE INDEX` can).
Current migrations:
- `ALTER TABLE sessions ADD COLUMN name TEXT` — adds display name to sessions
- `ALTER TABLE sessions ADD COLUMN project_id INTEGER` — links sessions to projects
- `CREATE INDEX idx_sessions_project` — index on the new project_id column
New migrations are always appended here. Never modify the schema file for
existing tables, since SQLite's `ALTER TABLE ... ADD COLUMN` has no `IF NOT EXISTS` guard.
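A stricter variant of the pattern above swallows only the error that means a column already exists and rethrows everything else. This is a minimal sketch, not the service's code: `runMigrations` is a hypothetical helper, and `db` is assumed to expose a better-sqlite3-style `exec(sql)` that throws an error whose message contains SQLite's `duplicate column name` text.

```js
// Hypothetical helper: apply statements in order and report which ones ran.
// `db` is assumed to expose exec(sql) and to throw on SQL errors.
function runMigrations(db, statements) {
  const applied = [];
  for (const sql of statements) {
    try {
      db.exec(sql);
      applied.push(sql);
    } catch (err) {
      // SQLite reports a re-added column as "duplicate column name: <col>".
      // Anything else (typo, missing table) is a real failure and is rethrown.
      if (!/duplicate column name/i.test(String(err && err.message))) throw err;
    }
  }
  return applied;
}
```

Compared with a bare `catch {}`, this keeps startup idempotent while still surfacing a genuinely broken migration.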
### FTS5 Full-Text Search
@@ -96,11 +82,27 @@ keep the FTS index automatically in sync with the episodes table.
- `journal_mode = WAL` — non-blocking reads during writes
- `foreign_keys = ON` — enforces referential integrity and cascade deletes
- PRAGMAs are set via `db.pragma()` separately from `db.exec()`
- PRAGMAs are set via `db.pragma()`, not `db.exec()`
### Dynamic Session Updates
`updateSession` builds its `SET` clause dynamically from only the fields
passed, so a partial update never overwrites fields that were not
supplied:
```js
function updateSession(id, { name, projectId } = {}) {
  const updates = [];
  const values = [];
  if (name !== undefined) { updates.push('name = ?'); values.push(name ?? null); }
  if (projectId !== undefined) { updates.push('project_id = ?'); values.push(projectId ?? null); }
  // ...
}
```
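The rest of `updateSession` is elided in the source. As an illustration only, the same idea can be expressed as a pure function that returns the statement and its bound values; `buildSessionUpdate` and its return shape are assumptions, not the service's API.

```js
// Hypothetical helper (not from the service): builds an UPDATE statement
// from only the fields that were actually passed.
function buildSessionUpdate(id, { name, projectId } = {}) {
  const updates = [];
  const values = [];
  if (name !== undefined) { updates.push('name = ?'); values.push(name ?? null); }
  if (projectId !== undefined) { updates.push('project_id = ?'); values.push(projectId ?? null); }

  // No fields supplied: signal "nothing to do" rather than emit invalid SQL.
  if (updates.length === 0) return null;

  return {
    sql: `UPDATE sessions SET ${updates.join(', ')} WHERE id = ?`,
    values: [...values, id],
  };
}
```

Passing `null` clears a field, while leaving a key out keeps the stored value, which is why the check is `!== undefined` rather than a truthiness test.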
## Qdrant / Semantic Layer
Three collections are initialized on service startup (created if they don't already exist):
Three Qdrant collections are initialized on service startup:
| Collection | Purpose |
|---|---|
@@ -108,208 +110,50 @@ Three collections are initialized on service startup (created if they don't alre
| `entities` | Embeddings for named entities |
| `summaries` | Embeddings for condensed episode summaries |
All collections use **768-dimension vectors** with **Cosine similarity**, matching the
output of the `nomic-embed-text` embedding model via Ollama.
All collections use **768-dimension vectors** with **Cosine similarity**,
matching `nomic-embed-text` via Ollama. Vector size and distance metric are
defined in `@nexusai/shared` — not hardcoded here.
Vector dimension and distance metric are defined in `@nexusai/shared` constants
(`QDRANT.VECTOR_SIZE`, `QDRANT.DISTANCE_METRIC`) — not hardcoded in this service.
### Semantic Layer Operations
Each collection exposes three operations via helper functions in `src/semantic/index.js`:
- **Upsert** — stores a vector with a payload containing the SQLite row ID, enabling
lookups back to the full content after a vector search
- **Search** — returns the top-k most similar vectors, with optional Qdrant filter
- **Delete** — removes a vector point by ID
The `wait: true` flag is used on all write operations so the caller receives confirmation
only after Qdrant has committed the change.
Each collection exposes three operations in `src/semantic/index.js`:
upsert, search (with optional Qdrant filter), and delete. The `wait: true`
flag is used on all writes.
## Embedding Write Path
When a new episode is created, the memory service automatically generates and stores
a vector embedding in Qdrant via the embedding service:
When a new episode is created:
1. Episode is saved to SQLite synchronously — the response is returned immediately
2. Both sides of the exchange are combined into a single text:
```
User: {userMessage}
Assistant: {aiResponse}
```
3. This text is sent to the embedding service (`POST /embed`)
4. The returned vector is upserted into the `episodes` Qdrant collection with a
payload of `{ sessionId, createdAt }` for filtering and lookups
1. Episode saved to SQLite synchronously — response returned immediately
2. User message + AI response combined: `User: ...\nAssistant: ...`
3. Text sent to embedding service (`POST /embed`)
4. Vector upserted into `episodes` Qdrant collection with payload `{ sessionId, createdAt }`
The embedding step is **fire-and-forget** — it runs asynchronously after the SQLite
insert succeeds. If embedding fails, the episode is still saved and searchable via
FTS. The error is logged but does not affect the API response.
This step is **fire-and-forget**: if embedding fails, the episode is still
saved and searchable via FTS. The error is logged but does not affect the API response.
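The write path above can be sketched as follows. The injected functions (`saveEpisode`, `embed`, `upsertVector`) and the `logger` are hypothetical stand-ins for the SQLite insert, the `POST /embed` call, and the Qdrant upsert; only the ordering and error handling are taken from the description.

```js
// Hypothetical wiring: all dependencies are injected stand-ins, not the
// service's real modules.
function createEpisode(input, { saveEpisode, embed, upsertVector, logger = console }) {
  // Step 1: synchronous SQLite write; the caller gets this result immediately.
  const episode = saveEpisode(input);

  // Step 2: combine both sides of the exchange into one text.
  const text = `User: ${input.userMessage}\nAssistant: ${input.aiResponse}`;

  // Steps 3 and 4: deliberately not awaited. A failure is logged and never
  // reaches the caller, so the episode stays saved either way.
  Promise.resolve()
    .then(() => embed(text))
    .then((vector) => upsertVector(episode.id, vector, {
      sessionId: episode.sessionId,
      createdAt: episode.createdAt,
    }))
    .catch((err) => logger.error('episode embedding failed', err));

  return episode;
}
```

Starting the chain with `Promise.resolve()` means even a synchronous throw inside `embed` ends up in the `.catch` instead of breaking the request.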
### Hybrid Retrieval Pattern
Qdrant and SQLite work as a pair — neither operates in isolation:
1. Query is embedded and searched in Qdrant → returns IDs + similarity scores
2. IDs are used to fetch full content from SQLite
3. Results are ranked and assembled into a context package
> The Qdrant payload stores `sessionId` (the internal integer ID). This is
> used for per-session and per-project filtering during semantic search. See
> `memory-isolation.md` for how project-level filtering works.
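Steps 2 and 3 of the retrieval pattern amount to a small join between search hits and SQLite rows. The sketch below is illustrative: `assembleContext` and `fetchEpisodesByIds` are hypothetical names, and the hit shape `{ id, score }` is an assumption about what the semantic layer returns.

```js
// Hypothetical helper: `hits` mimics vector search results ({ id, score });
// `fetchEpisodesByIds` stands in for a SQLite lookup returning rows with `id`.
function assembleContext(hits, fetchEpisodesByIds) {
  const rows = fetchEpisodesByIds(hits.map((hit) => hit.id));
  const byId = new Map(rows.map((row) => [row.id, row]));

  return hits
    .filter((hit) => byId.has(hit.id))      // skip vectors whose row no longer exists
    .sort((a, b) => b.score - a.score)      // highest similarity first
    .map((hit) => ({ ...byId.get(hit.id), score: hit.score }));
}
```

Dropping hits with no matching row keeps the result consistent when a vector has outlived its SQLite record.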
## Entity Layer
Entities and relationships are stored in SQLite with two key constraints:
Entities and relationships use upsert semantics with composite unique
constraints to prevent duplicates:
- `UNIQUE(name, type)` on entities — ensures no duplicates; upsert updates existing records
- `UNIQUE(from_id, to_id, label)` on relationships — prevents duplicate edges
- `ON DELETE CASCADE` on both `from_id` and `to_id` — deleting an entity automatically
removes all relationships where it appears on either end
- `UNIQUE(name, type)` on entities
- `UNIQUE(from_id, to_id, label)` on relationships
- `ON DELETE CASCADE` on relationship foreign keys
## Endpoints
## Project Delete Behaviour
### Health
Deleting a project runs as a transaction — it first nulls out `project_id`
on all assigned sessions, then deletes the project. This avoids a foreign
key constraint failure since `sessions.project_id` has no `ON DELETE` rule:
| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |
### Sessions
| Method | Path | Description |
|---|---|---|
| POST | /sessions | Create a new session |
| GET | /sessions | Get paginated list of all sessions |
| GET | /sessions/:id | Get session by internal ID |
| GET | /sessions/by-external/:externalId | Get session by external ID |
| PATCH | /sessions/by-external/:externalId | Update session name |
| DELETE | /sessions/by-external/:externalId | Delete session (cascades to episodes + summaries) |
> Route ordering matters in Express: `by-external/:externalId` must be defined before
> `/:id` to prevent the literal string `by-external` being captured as an ID parameter.
**POST /sessions body:**
```json
{
"externalId": "unique-session-id",
"metadata": {}
}
```
```js
const doDelete = db.transaction(() => {
  db.prepare(`UPDATE sessions SET project_id = NULL WHERE project_id = ?`).run(id);
  db.prepare(`DELETE FROM projects WHERE id = ?`).run(id);
});
```
**PATCH /sessions/by-external/:externalId body:**
```json
{
"name": "My Renamed Session"
}
```
Returns the updated session object. `name` is required and must be non-empty.
**DELETE /sessions/by-external/:externalId**
Returns `204 No Content` on success. Cascades to delete all associated episodes
and summaries via SQLite `ON DELETE CASCADE`.
### Episodes
| Method | Path | Description |
|---|---|---|
| POST | /episodes | Create episode + auto-embed into Qdrant |
| GET | /episodes/search?q=&limit= | Full-text search across episodes |
| GET | /episodes/:id | Get episode by ID |
| GET | /sessions/:id/episodes?limit=&offset= | Get paginated episodes for a session |
| DELETE | /episodes/:id | Delete an episode |
**POST /episodes body:**
```json
{
"sessionId": 1,
"userMessage": "Hello",
"aiResponse": "Hi there!",
"tokenCount": 10,
"metadata": {}
}
```
> Note: `/episodes/search` must be defined before `/episodes/:id` in Express to prevent
> the word `search` being captured as an ID parameter.
### Projects
| Method | Path | Description |
|---|---|---|
| POST | /projects | Create a new project |
| GET | /projects | Get all projects |
| GET | /projects/:id | Get project by ID |
| PATCH | /projects/:id | Update a project |
| DELETE | /projects/:id | Delete a project |
**POST /projects body:**
```json
{
"name": "My Project",
"description": "Optional description",
"colour": "#3d3a79",
"icon": null
}
```
`name` is required. `description`, `colour`, and `icon` are optional.
Returns `201` with the created project object on success.
**PATCH /projects/:id body:** same fields as POST, all optional.
**DELETE /projects/:id**
Returns `204 No Content`. Sessions assigned to the project are not deleted —
their `project_id` foreign key is left as-is (nullable, no cascade).
### Entities
| Method | Path | Description |
|---|---|---|
| POST | /entities | Upsert an entity (creates or updates by name + type) |
| GET | /entities/by-type/:type | Get all entities of a given type |
| GET | /entities/:id | Get entity by internal ID |
| DELETE | /entities/:id | Delete entity (cascades to relationships) |
**POST /entities body:**
```json
{
"name": "NexusAI",
"type": "project",
"notes": "My AI memory project",
"metadata": {}
}
```
> Note: `/entities/by-type/:type` must be defined before `/entities/:id` in Express to
> prevent `by-type` being captured as an ID parameter.
### Relationships
| Method | Path | Description |
|---|---|---|
| POST | /relationships | Upsert a relationship between two entities |
| GET | /entities/:id/relationships | Get all relationships originating from an entity |
| DELETE | /relationships | Delete a specific relationship |
**POST /relationships body:**
```json
{
"fromId": 1,
"toId": 2,
"label": "uses",
"metadata": {}
}
```
**DELETE /relationships body:**
```json
{
"fromId": 1,
"toId": 2,
"label": "uses"
}
```
> Relationships are identified by the composite key `(fromId, toId, label)`. Delete uses
> the request body rather than URL params as this three-part key is awkward to express
> cleanly in a path.
For all HTTP endpoints, see `api-routes.md`.