roadmap phase 1 complete

2026-04-27 03:10:39 -07:00
parent 9fe8e568cf
commit 1a97b19280
19 changed files with 759 additions and 281 deletions
--- a/docs/services/entity-extraction.md
+++ b/docs/services/entity-extraction.md
@@ -1,178 +1,140 @@
-# Memory Service
+# Entity Extraction

-**Package:** `@nexusai/memory-service`  
-**Location:** `packages/memory-service`  
-**Deployed on:** Mini PC 1 (192.168.0.81)  
-**Port:** 3002
+**Location:** `packages/memory-service/src/entities/extraction.js`  
+**Triggered by:** Episode creation (`POST /episodes`)  
+**Model:** `qwen2.5:3b` via Ollama (configurable via `EXTRACTION_MODEL` env var)

 ## Purpose

-Responsible for all reading and writing of long-term memory. Acts as the
-sole interface to both SQLite and Qdrant — no other service accesses these
-stores directly. On episode creation, automatically calls the embedding
-service to generate and store a vector in Qdrant.
+After each episode is saved to SQLite, the extraction pipeline runs
+asynchronously in the background to identify named entities and the
+relationships between them. Results are written back to SQLite and
+embedded into Qdrant — the episode response is never delayed.

-## Dependencies
+## Trigger

- `express` — HTTP API
- `better-sqlite3` — SQLite driver
- `@qdrant/js-client-rest` — Qdrant vector store client
- `dotenv` — environment variable loading
- `@nexusai/shared` — shared utilities and constants
-
-## Environment Variables
-
-| Variable | Required | Default | Description |
-|---|---|---|---|
-| PORT | No | 3002 | Port to listen on |
-| SQLITE_PATH | Yes | — | Path to SQLite database file |
-| QDRANT_URL | No | http://localhost:6333 | Qdrant instance URL |
-| EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |
-| EXTRACTION_URL | No | http://localhost:11434 | Ollama URL for entity extraction |
-| EXTRACTION_MODEL | No | qwen2.5:3b | Ollama model used for entity extraction |
-
-## Internal Structure
-
-```
-src/
-├── db/
-│   ├── index.js       # SQLite connection + initialization + migrations
-│   ├── schema.js      # Table definitions, indexes, FTS5, triggers
-│   ├── projects.js    # Project CRUD functions
-│   └── summaries.js   # Summary CRUD functions
-├── episodic/
-│   └── index.js       # Session + episode CRUD, FTS search, embedding write path
-├── semantic/
-│   └── index.js       # Qdrant collection management, upsert, search, delete
-├── entities/
-│   ├── index.js       # Entity + relationship CRUD
-│   └── extraction.js  # Automatic entity extraction via qwen2.5:3b on Ollama
-└── index.js           # Express app + all route definitions
-```
-
-## SQLite Schema
-
-Seven core tables:
-
- **sessions** — top-level conversation containers. Fields: `external_id`, `name`, `project_id`, `metadata`
- **episodes** — individual exchanges (user message + AI response) tied to a session
- **entities** — named things the system learns about (people, places, concepts)
- **relationships** — directional labeled links between entities
- **summaries** — condensed episode groups for efficient context retrieval
- **projects** — named groupings of sessions with `name`, `description`, `colour`, `icon`, `isolated`, `notes`, `system_prompt`
-
-### Migrations
-
-Schema changes that cannot use `CREATE TABLE IF NOT EXISTS` are applied as
-idempotent migrations in `db/index.js` at startup:
+`createEpisode()` in `episodic/index.js` calls `extractAndStoreEntities()`
+immediately after the SQLite insert, without awaiting it:

 ```js
-try { db.exec(`ALTER TABLE sessions ADD COLUMN name TEXT`); } catch {}
-try { db.exec(`ALTER TABLE sessions ADD COLUMN project_id INTEGER REFERENCES projects(id)`); } catch {}
-try { db.exec(`CREATE INDEX IF NOT EXISTS idx_sessions_project ON sessions(project_id)`); } catch {}
-try { db.exec(`ALTER TABLE projects ADD COLUMN isolated INTEGER NOT NULL DEFAULT 0`); } catch {}
-try { db.exec(`ALTER TABLE projects ADD COLUMN notes TEXT`); } catch {}
-try { db.exec(`ALTER TABLE projects ADD COLUMN system_prompt TEXT`); } catch {}
+extractAndStoreEntities(userMessage, aiResponse, episode.id, projectId)
+  .catch(err => logger.error(`Failed to extract entities for episode ${episode.id}:`, err.message));
 ```

-New migrations are always appended here — never modify the schema file for
-existing tables since `ALTER TABLE` cannot use `IF NOT EXISTS`.
+If extraction throws, the episode is unaffected — the error is logged and
+swallowed.

-### FTS5 Full-Text Search
+## Model Settings

-An `episodes_fts` virtual table enables keyword search across all episodes.
-Three triggers (`episodes_fts_insert`, `episodes_fts_update`, `episodes_fts_delete`)
-keep the FTS index automatically in sync with the episodes table.
+| Setting | Value | Notes |
+|---|---|---|
+| Model | `qwen2.5:3b` | Ollama, configurable via `EXTRACTION_MODEL` |
+| Temperature | 0.1 | Low, for consistent, near-deterministic output |
+| `num_predict` | 1500 | Higher ceiling to accommodate entity + relationship JSON |
+| `format` | `'json'` | Ollama constrained decoding — enforces valid JSON output |
+| Prompt format | ChatML | `<\|im_start\|>` / `<\|im_end\|>` tokens |
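
 A request carrying these settings might be assembled as in the sketch below.
 It assumes Ollama's `/api/generate` endpoint with `raw: true` (so the
 hand-built ChatML prompt bypasses the model's own template) and
 `stream: false`. The source confirms neither choice, and
 `buildOllamaRequest` is a hypothetical helper name.

 ```js
 // Hypothetical helper: builds the JSON body for POST <EXTRACTION_URL>/api/generate.
 // `env` is injectable so the default model can be checked without touching process.env.
 function buildOllamaRequest(prompt, env = process.env) {
   return {
     model: env.EXTRACTION_MODEL || 'qwen2.5:3b',
     prompt,
     raw: true,      // assumption: the prompt already contains ChatML tokens
     stream: false,  // assumption: one complete response instead of chunks
     format: 'json', // Ollama constrained decoding
     options: { temperature: 0.1, num_predict: 1500 },
   };
 }
 ```

 The body would then be serialized with `JSON.stringify` and sent with any
 HTTP client.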

-### SQLite Configuration
+## Prompt Structure

- `journal_mode = WAL` — non-blocking reads during writes
- `foreign_keys = ON` — enforces referential integrity and cascade deletes
- PRAGMAs set via `db.pragma()`, not `db.exec()`
+The prompt is built by `buildExtractionPrompt()`. It includes:

-### Dynamic Updates
+1. **System message** — declares the model's role as an entity and relationship extractor
+2. **Instructions** — entity types, field rules, relationship label format, required JSON schema
+3. **Known entities block** — last 20 entities from SQLite, by `rowid DESC`, used to encourage consistent name/type pairs across conversations
+4. **Conversation** — the raw user message and AI response, delimited clearly

-Both `updateSession` and `updateProject` build their `SET` clause dynamically
-from only the fields passed — prevents partial updates from overwriting fields
-that weren't touched.
+```
+<|im_start|>system
+You are a named entity and relationship extractor. You output only valid JSON.
+<|im_end|>
+<|im_start|>user
+Read the conversation below and extract all named entities and the relationships between them.
+Entity types: person, place, project, technology, concept, organization
+...
+Return this exact JSON structure:
+{ "entities": [...], "relationships": [...] }

-`updateProject` allowlist:
-```js
-const allowed = ['name', 'description', 'colour', 'icon', 'isolated', 'notes', 'system_prompt'];
+Already known entities (use these exact name and type values if the same entity appears):
+- "NexusAI" (project)
+- "Alice" (person)
+
+--- CONVERSATION ---
+User: ...
+Assistant: ...
+--- END CONVERSATION ---
+<|im_end|>
+<|im_start|>assistant
 ```
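
 The known-entities block in the prompt above could be rendered by a small
 formatter like the following sketch. `formatKnownEntities` is a hypothetical
 name, and omitting the block when no entities exist is an assumption; the
 rows are presumed to come from a query ordered by `rowid DESC` with a limit
 of 20.

 ```js
 // Hypothetical formatter for the "Already known entities" prompt block.
 // `entities` is an array of { name, type } rows, newest first.
 function formatKnownEntities(entities) {
   if (entities.length === 0) return ''; // assumption: block omitted when empty
   const lines = entities.map(e => `- "${e.name}" (${e.type})`);
   return [
     'Already known entities (use these exact name and type values if the same entity appears):',
     ...lines,
   ].join('\n');
 }
 ```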

-## Qdrant / Semantic Layer
+## Expected JSON Output

-Three Qdrant collections are initialized on service startup via `semantic.initCollections()`:
-
-| Collection | Purpose |
-|---|---|
-| `episodes` | Embeddings for individual conversation exchanges |
-| `entities` | Embeddings for named entities |
-| `summaries` | Embeddings for condensed episode summaries |
-
-All collections use **768-dimension vectors** with **Cosine similarity**,
-matching `nomic-embed-text` via Ollama. Vector size and distance metric are
-defined in `@nexusai/shared` — not hardcoded here.
-
-`initCollections()` iterates `Object.values(COLLECTIONS)` and creates any
-collection that doesn't already exist at startup — all three collections are
-guaranteed to exist before any requests are handled, avoiding race conditions
-between the first entity embed and an entity search.
-
-Each collection exposes upsert, search (with optional Qdrant filter), and
-delete operations. The `wait: true` flag is used on all writes.
-
-## Embedding Write Path
-
-When a new episode is created:
-
-1. Episode saved to SQLite synchronously — response returned immediately
-2. User message + AI response combined: `User: ...\nAssistant: ...`
-3. Text sent to embedding service (`POST /embed`)
-4. Vector upserted into `episodes` Qdrant collection with payload `{ sessionId, createdAt }`
-
-This step is **fire-and-forget** — if embedding fails, the episode is still
-saved and searchable via FTS. The error is logged but not surfaced.
-
-> The Qdrant payload stores `sessionId` (the internal integer ID). See
-> `memory-isolation.md` for how project-level filtering works.
-
-## Entity Layer
-
-Entities and relationships use upsert semantics with composite unique
-constraints to prevent duplicates:
-
- `UNIQUE(name, type)` on entities
- `UNIQUE(from_id, to_id, label)` on relationships
- `ON DELETE CASCADE` on relationship foreign keys
-
-After each episode is saved, `extraction.js` automatically extracts named
-entities from the conversation using `qwen2.5:3b` on Ollama — fire-and-forget.
-
-> For full details on the extraction pipeline, prompt format, constrained
-> decoding, stoplist, and Qdrant storage, see `entity-extraction.md`.
-
-## Summaries Layer
-
-Session summaries are generated by `orchestration-service/src/services/summarization.js`
-after each episode write and stored here via `POST /summaries`. The memory
-service is responsible only for CRUD — generation logic lives in orchestration.
-
-> For full details on trigger conditions, prompt format, cumulative updates,
-> and ChatML token stripping, see `summarization.md`.
-
-## Project Delete Behaviour
-
-Deleting a project runs as a transaction — it first nulls out `project_id`
-on all assigned sessions, then deletes the project. This avoids a foreign
-key constraint failure since `sessions.project_id` has no `ON DELETE` rule:
-
-```js
-const doDelete = db.transaction(() => {
-  db.prepare(`UPDATE sessions SET project_id = NULL WHERE project_id = ?`).run(id);
-  db.prepare(`DELETE FROM projects WHERE id = ?`).run(id);
-});
+```json
+{
+  "entities": [
+    { "name": "Alice", "type": "person", "notes": "Software engineer working on NexusAI." },
+    { "name": "NexusAI", "type": "project", "notes": "A modular AI assistant with persistent memory." }
+  ],
+  "relationships": [
+    {
+      "from": "Alice", "fromType": "person",
+      "to": "NexusAI", "toType": "project",
+      "label": "works_on",
+      "notes": "Alice is the primary developer."
+    }
+  ]
+}
 ```

-For all HTTP endpoints, see `api-routes.md`.
+Relationship labels use **snake_case verbs** (e.g. `works_on`, `manages`, `uses`,
+`knows`, `located_in`, `part_of`, `created_by`).
+
+## JSON Parsing
+
+The raw model response is matched with `/\{[\s\S]*\}/` before parsing. The
+pattern is greedy, so it captures everything from the first `{` to the last
+`}`, which tolerates any preamble or trailing prose the model emits alongside
+the JSON. If the match fails or `JSON.parse` throws, the function logs a
+warning and returns without writing anything.
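
 A minimal sketch of this tolerant parse step is shown below. The helper name
 `parseModelJson`, the injectable `warn` callback, and the `null` return value
 are assumptions made for illustration; only the regex and the
 warn-and-return behaviour come from the description above.

 ```js
 // Hypothetical helper: extracts the outermost {...} span and parses it.
 // Returns the parsed object, or null when nothing usable is found.
 function parseModelJson(raw, warn = console.warn) {
   const match = String(raw).match(/\{[\s\S]*\}/);
   if (!match) {
     warn('Extraction: no JSON object found in model response');
     return null;
   }
   try {
     return JSON.parse(match[0]);
   } catch (err) {
     warn(`Extraction: JSON.parse failed: ${err.message}`);
     return null;
   }
 }
 ```

 Because the match is greedy, a response containing two separate JSON objects
 is captured as one span and fails to parse, so it is treated like any other
 malformed output.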
+
+## Entity Processing
+
+For each entity in `parsed.entities`:
+
+1. Validate `name`, `type` (must be in `ENTITY_TYPES`), and not in `IGNORED_NAMES`
+2. Call `upsertEntity(name, type, notes)`:
+   - **Insert**: creates new row with `mention_count = 1`, `source = 'extraction'`
+   - **Conflict** on `(name, type)`: increments `mention_count`, updates `last_seen_at`, preserves existing `notes` if new extraction returns null
+3. Add to `entityMap` keyed by `"${name}::${type}"` — used for relationship resolution below
+4. Call `linkEntityToEpisode(entity.id, episodeId)` — writes to `entity_episodes` join table
+5. Fire-and-forget: embed as `"${name} (${type}): ${notes}"` → store to Qdrant `entities` collection with `{ name, type, notes, projectId }` in payload
+
+**Valid entity types:** `person`, `place`, `project`, `technology`, `concept`, `organization`
+
+**Stoplist (ignored names):** `good morning`, `good night`, `hello`, `goodbye`, `thanks`, `thank you`
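
 The validation in step 1 and the map key in step 3 might look like the
 sketch below. Representing `ENTITY_TYPES` and `IGNORED_NAMES` as `Set`s,
 trimming the name, and comparing against the stoplist case-insensitively are
 all assumptions; `isValidEntity` and `entityKey` are hypothetical names.

 ```js
 const ENTITY_TYPES = new Set([
   'person', 'place', 'project', 'technology', 'concept', 'organization',
 ]);
 const IGNORED_NAMES = new Set([
   'good morning', 'good night', 'hello', 'goodbye', 'thanks', 'thank you',
 ]);

 // Returns true only for entities with a non-empty name, a known type,
 // and a name that is not on the stoplist.
 function isValidEntity(entity) {
   if (!entity || typeof entity.name !== 'string' || typeof entity.type !== 'string') {
     return false;
   }
   const name = entity.name.trim();
   if (name === '') return false;
   if (!ENTITY_TYPES.has(entity.type)) return false;
   return !IGNORED_NAMES.has(name.toLowerCase()); // assumption: case-insensitive
 }

 // Key format used for entityMap lookups during relationship resolution.
 const entityKey = (name, type) => `${name}::${type}`;
 ```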
+
+## Relationship Processing
+
+After all entities are saved, relationships are processed:
+
+1. For each entry in `parsed.relationships`, look up both endpoints in `entityMap` using `"${from}::${fromType}"` and `"${to}::${toType}"` as keys
+2. If either endpoint is missing (filtered out, invalid type, or not in this extraction), the relationship is silently skipped
+3. Call `upsertRelationship(fromId, toId, label, notes)`:
+   - **Insert**: creates new row with `mention_count = 1`
+   - **Conflict** on `(from_id, to_id, label)`: increments `mention_count`, preserves existing `notes` if new is null
+
+Relationships are unidirectional in storage. Bidirectionality is handled at
+query time by the graph traversal layer.
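
 Endpoint resolution (steps 1 and 2) can be sketched as follows, assuming
 `entityMap` is a `Map` from `"name::type"` keys to stored entity rows with an
 `id` field. `resolveRelationships` is a hypothetical name; the real code
 would pass each resolved entry to `upsertRelationship`.

 ```js
 // Resolves each relationship's endpoints against entityMap and drops
 // any relationship whose endpoints were not saved in this extraction.
 function resolveRelationships(relationships, entityMap) {
   const resolved = [];
   for (const rel of relationships ?? []) {
     const from = entityMap.get(`${rel.from}::${rel.fromType}`);
     const to = entityMap.get(`${rel.to}::${rel.toType}`);
     if (!from || !to) continue; // missing endpoint: silently skipped
     resolved.push({
       fromId: from.id,
       toId: to.id,
       label: rel.label,
       notes: rel.notes ?? null,
     });
   }
   return resolved;
 }
 ```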
+
+## Project Scoping
+
+`projectId` is threaded through from the episode creation call. It is stored
+in the Qdrant entity payload, which enables project-scoped entity search in
+orchestration. SQLite entities and relationships are global; scoping applies
+only at the Qdrant retrieval layer.
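
 On the retrieval side, project scoping would typically be expressed as a
 Qdrant payload filter on `projectId`. The sketch below builds such a filter;
 `projectFilter` is a hypothetical helper, and returning no filter when
 `projectId` is absent is an assumption about how unscoped searches behave.

 ```js
 // Builds a Qdrant filter that matches entity points from one project.
 function projectFilter(projectId) {
   if (projectId == null) return undefined; // assumption: unscoped search
   return { must: [{ key: 'projectId', match: { value: projectId } }] };
 }
 ```

 The result can be passed as the `filter` option of a Qdrant search request
 against the `entities` collection.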
+
+## Error Behaviour
+
+All steps after the initial model call are wrapped in a single outer try/catch.
+If Ollama is unreachable, returns a non-200 status, or the JSON cannot be
+parsed, the function logs at `warn` level and returns. There is no retry logic.
+Individual entity embedding failures are caught per-entity and logged at `warn`
+level without affecting other entities in the same batch.