documentation updates for entity extraction and summarization

2026-04-21 03:50:38 -07:00
parent 32365e67f4
commit acda21317b
6 changed files with 540 additions and 107 deletions
--- a/docs/services/entity-extraction.md
+++ b/docs/services/entity-extraction.md
@@ -0,0 +1,178 @@
+# Memory Service
+
+**Package:** `@nexusai/memory-service`  
+**Location:** `packages/memory-service`  
+**Deployed on:** Mini PC 1 (192.168.0.81)  
+**Port:** 3002
+
+## Purpose
+
+Responsible for all reading and writing of long-term memory. Acts as the
+sole interface to both SQLite and Qdrant — no other service accesses these
+stores directly. On episode creation, automatically calls the embedding
+service to generate and store a vector in Qdrant.
+
+## Dependencies
+
+- `express` — HTTP API
+- `better-sqlite3` — SQLite driver
+- `@qdrant/js-client-rest` — Qdrant vector store client
+- `dotenv` — environment variable loading
+- `@nexusai/shared` — shared utilities and constants
+
+## Environment Variables
+
+| Variable | Required | Default | Description |
+|---|---|---|---|
+| PORT | No | 3002 | Port to listen on |
+| SQLITE_PATH | Yes | — | Path to SQLite database file |
+| QDRANT_URL | No | http://localhost:6333 | Qdrant instance URL |
+| EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |
+| EXTRACTION_URL | No | http://localhost:11434 | Ollama URL for entity extraction |
+| EXTRACTION_MODEL | No | qwen2.5:3b | Ollama model used for entity extraction |
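+
+As a minimal sketch of how the table above could be read at startup, the
+helper below applies the documented defaults and fails fast when
+`SQLITE_PATH` is missing. `loadConfig` and its property names are
+hypothetical; the service's actual loading code is not shown here.
+
+```js
+// Hypothetical helper; only the variable names and defaults come from the table.
+function loadConfig(env = process.env) {
+  if (!env.SQLITE_PATH) {
+    throw new Error('SQLITE_PATH is required');
+  }
+  return {
+    // `||` treats an empty string the same as an unset variable.
+    port: Number(env.PORT || 3002),
+    sqlitePath: env.SQLITE_PATH,
+    qdrantUrl: env.QDRANT_URL || 'http://localhost:6333',
+    embeddingServiceUrl: env.EMBEDDING_SERVICE_URL || 'http://localhost:3003',
+    extractionUrl: env.EXTRACTION_URL || 'http://localhost:11434',
+    extractionModel: env.EXTRACTION_MODEL || 'qwen2.5:3b',
+  };
+}
+```
+
+Passing `env` as a parameter keeps the function easy to exercise in tests
+without mutating `process.env`.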
+
+## Internal Structure
+
+```
+src/
+├── db/
+│   ├── index.js       # SQLite connection + initialization + migrations
+│   ├── schema.js      # Table definitions, indexes, FTS5, triggers
+│   ├── projects.js    # Project CRUD functions
+│   └── summaries.js   # Summary CRUD functions
+├── episodic/
+│   └── index.js       # Session + episode CRUD, FTS search, embedding write path
+├── semantic/
+│   └── index.js       # Qdrant collection management, upsert, search, delete
+├── entities/
+│   ├── index.js       # Entity + relationship CRUD
+│   └── extraction.js  # Automatic entity extraction via qwen2.5:3b on Ollama
+└── index.js           # Express app + all route definitions
+```
+
+## SQLite Schema
+
+Six core tables:
+
+- **sessions** — top-level conversation containers. Fields: `external_id`, `name`, `project_id`, `metadata`
+- **episodes** — individual exchanges (user message + AI response) tied to a session
+- **entities** — named things the system learns about (people, places, concepts)
+- **relationships** — directional labeled links between entities
+- **summaries** — condensed episode groups for efficient context retrieval
+- **projects** — named groupings of sessions with `name`, `description`, `colour`, `icon`, `isolated`, `notes`, `system_prompt`
+
+### Migrations
+
+Schema changes that cannot use `CREATE TABLE IF NOT EXISTS` are applied as
+idempotent migrations in `db/index.js` at startup:
+
+```js
+try { db.exec(`ALTER TABLE sessions ADD COLUMN name TEXT`); } catch {}
+try { db.exec(`ALTER TABLE sessions ADD COLUMN project_id INTEGER REFERENCES projects(id)`); } catch {}
+try { db.exec(`CREATE INDEX IF NOT EXISTS idx_sessions_project ON sessions(project_id)`); } catch {}
+try { db.exec(`ALTER TABLE projects ADD COLUMN isolated INTEGER NOT NULL DEFAULT 0`); } catch {}
+try { db.exec(`ALTER TABLE projects ADD COLUMN notes TEXT`); } catch {}
+try { db.exec(`ALTER TABLE projects ADD COLUMN system_prompt TEXT`); } catch {}
+```
+
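+New migrations are always appended here. Do not edit the definition of an
+existing table in the schema file: `CREATE TABLE IF NOT EXISTS` skips tables
+that already exist, and `ALTER TABLE ... ADD COLUMN` has no `IF NOT EXISTS`
+form, which is why each statement is wrapped in `try`/`catch`.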
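+
+The empty `catch {}` blocks above also hide unrelated failures such as a
+typo in the SQL. One possible alternative, sketched here under the
+assumption that `db` is a `better-sqlite3` database, checks
+`PRAGMA table_info` before altering. `addColumnIfMissing` is a hypothetical
+helper and is not part of `db/index.js` as documented.
+
+```js
+// Hypothetical helper; assumes `db` is a better-sqlite3 Database instance.
+// `table`, `column`, and `definition` are interpolated into SQL, so only
+// pass trusted literals, never user input.
+function addColumnIfMissing(db, table, column, definition) {
+  // table_info yields one row per existing column, each with a `name` field.
+  const columns = db.pragma(`table_info(${table})`);
+  if (columns.some((c) => c.name === column)) return false;
+  db.exec(`ALTER TABLE ${table} ADD COLUMN ${column} ${definition}`);
+  return true;
+}
+```
+
+A call such as `addColumnIfMissing(db, 'projects', 'notes', 'TEXT')` stays
+idempotent across restarts while still letting genuine errors propagate.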
+
+### FTS5 Full-Text Search
+
+An `episodes_fts` virtual table enables keyword search across all episodes.
+Three triggers (`episodes_fts_insert`, `episodes_fts_update`, `episodes_fts_delete`)
+keep the FTS index automatically in sync with the episodes table.
+
+### SQLite Configuration
+
+- `journal_mode = WAL` — non-blocking reads during writes
+- `foreign_keys = ON` — enforces referential integrity and cascade deletes
+- PRAGMAs set via `db.pragma()`, not `db.exec()`
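+
+A short sketch of the configuration above, assuming `db` is a
+`better-sqlite3` database. `configureDatabase` is a hypothetical name used
+only for illustration.
+
+```js
+// Sketch only; assumes `db` is a better-sqlite3 Database instance.
+function configureDatabase(db) {
+  db.pragma('journal_mode = WAL'); // readers are not blocked by a writer
+  db.pragma('foreign_keys = ON');  // enforcement is off by default in SQLite
+  return db;
+}
+```
+
+`foreign_keys` is a per-connection setting, so it has to be applied every
+time the database is opened, not once per file.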
+
+### Dynamic Updates
+
+Both `updateSession` and `updateProject` build their `SET` clause dynamically
+from only the fields passed, so a partial update never overwrites fields
+that weren't touched.
+
+`updateProject` allowlist:
+```js
+const allowed = ['name', 'description', 'colour', 'icon', 'isolated', 'notes', 'system_prompt'];
+```
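+
+The pattern can be sketched as a small query builder. This is an
+illustration under assumptions, not the service's own code: `buildUpdate`
+and its return shape are hypothetical.
+
+```js
+// Hypothetical helper showing an allowlisted, dynamically built SET clause.
+function buildUpdate(table, id, fields, allowed) {
+  // Keep only allowlisted keys that were actually provided.
+  const keys = Object.keys(fields).filter(
+    (k) => allowed.includes(k) && fields[k] !== undefined
+  );
+  if (keys.length === 0) return null; // nothing to update
+
+  // Column names come from the allowlist; values are bound as parameters.
+  const setClause = keys.map((k) => `${k} = ?`).join(', ');
+  return {
+    sql: `UPDATE ${table} SET ${setClause} WHERE id = ?`,
+    params: [...keys.map((k) => fields[k]), id],
+  };
+}
+```
+
+Because column names are taken only from the allowlist, unknown keys in a
+request body are dropped instead of reaching the SQL string.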
+
+## Qdrant / Semantic Layer
+
+Three Qdrant collections are initialized on service startup via `semantic.initCollections()`:
+
+| Collection | Purpose |
+|---|---|
+| `episodes` | Embeddings for individual conversation exchanges |
+| `entities` | Embeddings for named entities |
+| `summaries` | Embeddings for condensed episode summaries |
+
+All collections use **768-dimension vectors** with **Cosine similarity**,
+matching `nomic-embed-text` via Ollama. Vector size and distance metric are
+defined in `@nexusai/shared` — not hardcoded here.
+
+`initCollections()` iterates `Object.values(COLLECTIONS)` and creates any
+collection that doesn't already exist at startup — all three collections are
+guaranteed to exist before any requests are handled, avoiding race conditions
+between the first entity embed and an entity search.
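+
+A minimal sketch of that startup routine follows. It assumes `client` is a
+`QdrantClient` from `@qdrant/js-client-rest`; the constants are local
+stand-ins for the values exported by `@nexusai/shared`, and the names
+`VECTOR_SIZE` and `DISTANCE` are assumptions.
+
+```js
+// Stand-ins for the shared constants; the real export names may differ.
+const COLLECTIONS = { EPISODES: 'episodes', ENTITIES: 'entities', SUMMARIES: 'summaries' };
+const VECTOR_SIZE = 768;
+const DISTANCE = 'Cosine';
+
+// `client` is expected to be a QdrantClient from @qdrant/js-client-rest.
+async function initCollections(client) {
+  const { collections } = await client.getCollections();
+  const existing = new Set(collections.map((c) => c.name));
+
+  for (const name of Object.values(COLLECTIONS)) {
+    if (existing.has(name)) continue; // already created on an earlier run
+    await client.createCollection(name, {
+      vectors: { size: VECTOR_SIZE, distance: DISTANCE },
+    });
+  }
+}
+```
+
+Awaiting this function before the HTTP server starts listening is what
+would give the ordering guarantee described above.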
+
+Each collection exposes upsert, search (with optional Qdrant filter), and
+delete operations. The `wait: true` flag is used on all writes.
+
+## Embedding Write Path
+
+When a new episode is created:
+
+1. Episode saved to SQLite synchronously — response returned immediately
+2. User message + AI response combined: `User: ...\nAssistant: ...`
+3. Text sent to embedding service (`POST /embed`)
+4. Vector upserted into `episodes` Qdrant collection with payload `{ sessionId, createdAt }`
+
+Steps 2 to 4 are **fire-and-forget**: if embedding fails, the episode is
+still saved and searchable via FTS. The error is logged but not surfaced.
+
+> The Qdrant payload stores `sessionId` (the internal integer ID). See
+> `memory-isolation.md` for how project-level filtering works.
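+
+The write path can be sketched with injected dependencies. Every function
+name below (`createEpisode`, `saveEpisode`, `embed`, `upsertVector`) is
+hypothetical; only the ordering, the combined text format, and the payload
+shape come from the description above.
+
+```js
+// Sketch only; `deps` bundles hypothetical SQLite, embedding, and Qdrant helpers.
+function createEpisode(deps, { sessionId, userMessage, aiResponse }) {
+  // 1. Synchronous SQLite write; the caller gets the row back right away.
+  const episode = deps.saveEpisode({ sessionId, userMessage, aiResponse });
+
+  // 2. Combine both sides of the exchange into one text.
+  const text = `User: ${userMessage}\nAssistant: ${aiResponse}`;
+
+  // 3-4. Embed and upsert in the background. The promise is deliberately
+  // not awaited, and failures are logged instead of thrown.
+  Promise.resolve()
+    .then(() => deps.embed(text))
+    .then((vector) =>
+      deps.upsertVector('episodes', episode.id, vector, {
+        sessionId,
+        createdAt: episode.createdAt,
+      })
+    )
+    .catch((err) => console.error('episode embedding failed:', err.message));
+
+  return episode;
+}
+```
+
+Starting the chain with `Promise.resolve()` means even a synchronous throw
+inside `embed` ends up in the `.catch` handler and never reaches the caller.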
+
+## Entity Layer
+
+Entities and relationships use upsert semantics with composite unique
+constraints to prevent duplicates:
+
+- `UNIQUE(name, type)` on entities
+- `UNIQUE(from_id, to_id, label)` on relationships
+- `ON DELETE CASCADE` on relationship foreign keys
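+
+With `UNIQUE(name, type)` in place, an entity upsert can be written as a
+single statement. The sketch below assumes `db` is a `better-sqlite3`
+database and a SQLite build that supports `RETURNING`. The `description`
+column and the `upsertEntity` name are hypothetical, since the full column
+list is not shown here.
+
+```js
+// Sketch only; `description` is an assumed column on the entities table.
+function upsertEntity(db, { name, type, description = null }) {
+  const stmt = db.prepare(`
+    INSERT INTO entities (name, type, description)
+    VALUES (?, ?, ?)
+    ON CONFLICT(name, type) DO UPDATE SET description = excluded.description
+    RETURNING id
+  `);
+  // Returns the row id whether the entity was inserted or updated.
+  return stmt.get(name, type, description);
+}
+```
+
+The conflict target has to match the unique constraint exactly, which is
+why it names both `name` and `type`.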
+
+After each episode is saved, `extraction.js` automatically extracts named
+entities from the conversation using `qwen2.5:3b` on Ollama — fire-and-forget.
+
+> For full details on the extraction pipeline, prompt format, constrained
+> decoding, stoplist, and Qdrant storage, see `entity-extraction.md`.
+
+## Summaries Layer
+
+Session summaries are generated by `orchestration-service/src/services/summarization.js`
+after each episode write and stored here via `POST /summaries`. The memory
+service is responsible only for CRUD — generation logic lives in orchestration.
+
+> For full details on trigger conditions, prompt format, cumulative updates,
+> and ChatML token stripping, see `summarization.md`.
+
+## Project Delete Behaviour
+
+Deleting a project runs as a transaction — it first nulls out `project_id`
+on all assigned sessions, then deletes the project. This avoids a foreign
+key constraint failure since `sessions.project_id` has no `ON DELETE` rule:
+
+```js
+const doDelete = db.transaction(() => {
+  db.prepare(`UPDATE sessions SET project_id = NULL WHERE project_id = ?`).run(id);
+  db.prepare(`DELETE FROM projects WHERE id = ?`).run(id);
+});
+```
+
+For all HTTP endpoints, see `api-routes.md`.