roadmap phase 1 complete

2026-04-27 03:10:39 -07:00
parent 9fe8e568cf
commit 1a97b19280
19 changed files with 759 additions and 281 deletions
--- a/docs/architecture/overview.md
+++ b/docs/architecture/overview.md
@@ -74,7 +74,7 @@ service by ID after the vector search.
 The core four-service architecture is complete and operational. Key capabilities:

 - **Hybrid memory retrieval** — recent episodes + semantic search combined into every prompt
- **Entity layer** — automatic extraction of named entities from conversations via qwen2.5:3b, stored in SQLite and Qdrant, injected into every prompt as structured knowledge
+- **Entity layer + Knowledge graph** — automatic extraction of named entities and relationships from conversations via qwen2.5:3b. Entities and relationships are stored in SQLite with `mention_count` tracking. A graph traversal layer expands Qdrant entity search hits into a 1-hop neighborhood subgraph, injecting structured connected knowledge into every prompt
 - **Projects** — sessions grouped with shared or isolated memory pools
 - **Auto-naming** — sessions named automatically from first exchange via inference
 - **Project-scoped semantic search** — Qdrant filtered by project session IDs
--- a/docs/reference/API-routes.md
+++ b/docs/reference/API-routes.md
@@ -360,13 +360,34 @@ Same request/response shape as orchestration `/projects` above.

 **DELETE /relationships — body:**
 ```json
-{ "fromId": 1, "toId": 2, "label": "uses" }
+{ "fromId": 1, "toId": 2, "label": "works_on", "notes": "Alice is the primary developer.", "metadata": {} }
 ```
+`notes` is optional. `label` should be a snake_case verb. A relationship is identified by the composite key `(fromId, toId, label)`; re-submitting with the same key increments `mention_count` and preserves the existing `notes` if the new value is null.

 Relationships are identified by the composite key `(fromId, toId, label)`.
 Delete uses request body rather than URL params since this three-part key
 is awkward to encode in a path.

+### Graph
+
+| Method | Path | Description |
+|---|---|---|
+| GET | /graph/neighborhood/:entityId | Entity neighborhood — nodes + edges within N hops |
+| POST | /graph/neighbors | Bulk 1-hop neighborhood for a set of entity IDs |
+
+**GET /graph/neighborhood/:entityId — query params:**
+
+| Param | Default | Max | Description |
+|---|---|---|---|
+| depth | 1 | 3 | Traversal depth |
+
+Returns `{ entity, neighborhood: { nodes, edges } }`. Returns `404` if entity not found.
+
+**POST /graph/neighbors — body:**
+```json
+{ "entityIds": [5, 8, 12] }
+```
+Returns `{ nodes: [...], edges: [...] }`. Used internally by orchestration; not a client-facing endpoint.
+
 ---

 ## Embedding Service — port 3003
--- a/docs/roadmap.md
+++ b/docs/roadmap.md
@@ -59,10 +59,10 @@

 ### 1. Knowledge Graph (SQLite)
 The highest-leverage memory upgrade. Transforms NexusAI from "remembers conversations" to "understands relationships between things."
- [ ] Graph schema — `nodes` and `edges` tables with typed relationships
- [ ] Entity → node promotion pipeline
- [ ] Relationship traversal queries
- [ ] Graph-aware context assembly in orchestration
+- [x] Graph schema — `nodes` and `edges` tables with typed relationships
+- [x] Entity → node promotion pipeline (`mention_count` tracked; threshold gating deferred to Phase 2)
+- [x] Relationship traversal queries
+- [x] Graph-aware context assembly in orchestration

 ### 2. Retrieval Fusion + Full-Text Search
 Multi-strategy retrieval merged into a single ranked result set.
--- a/docs/services/entity-extraction.md
+++ b/docs/services/entity-extraction.md
@@ -1,178 +1,140 @@
-# Memory Service
+# Entity Extraction

-**Package:** `@nexusai/memory-service`  
-**Location:** `packages/memory-service`  
-**Deployed on:** Mini PC 1 (192.168.0.81)  
-**Port:** 3002
+**Location:** `packages/memory-service/src/entities/extraction.js`  
+**Triggered by:** Episode creation (`POST /episodes`)  
+**Model:** `qwen2.5:3b` via Ollama (configurable via `EXTRACTION_MODEL` env var)

 ## Purpose

-Responsible for all reading and writing of long-term memory. Acts as the
-sole interface to both SQLite and Qdrant — no other service accesses these
-stores directly. On episode creation, automatically calls the embedding
-service to generate and store a vector in Qdrant.
+After each episode is saved to SQLite, the extraction pipeline runs
+asynchronously in the background to identify named entities and the
+relationships between them. Results are written back to SQLite and
+embedded into Qdrant — the episode response is never delayed.

-## Dependencies
+## Trigger

- `express` — HTTP API
- `better-sqlite3` — SQLite driver
- `@qdrant/js-client-rest` — Qdrant vector store client
- `dotenv` — environment variable loading
- `@nexusai/shared` — shared utilities and constants
-
-## Environment Variables
-
-| Variable | Required | Default | Description |
-|---|---|---|---|
-| PORT | No | 3002 | Port to listen on |
-| SQLITE_PATH | Yes | — | Path to SQLite database file |
-| QDRANT_URL | No | http://localhost:6333 | Qdrant instance URL |
-| EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |
-| EXTRACTION_URL | No | http://localhost:11434 | Ollama URL for entity extraction |
-| EXTRACTION_MODEL | No | qwen2.5:3b | Ollama model used for entity extraction |
-
-## Internal Structure
-
-```
-src/
-├── db/
-│   ├── index.js       # SQLite connection + initialization + migrations
-│   ├── schema.js      # Table definitions, indexes, FTS5, triggers
-│   ├── projects.js    # Project CRUD functions
-│   └── summaries.js   # Summary CRUD functions
-├── episodic/
-│   └── index.js       # Session + episode CRUD, FTS search, embedding write path
-├── semantic/
-│   └── index.js       # Qdrant collection management, upsert, search, delete
-├── entities/
-│   ├── index.js       # Entity + relationship CRUD
-│   └── extraction.js  # Automatic entity extraction via qwen2.5:3b on Ollama
-└── index.js           # Express app + all route definitions
-```
-
-## SQLite Schema
-
-Seven core tables:
-
- **sessions** — top-level conversation containers. Fields: `external_id`, `name`, `project_id`, `metadata`
- **episodes** — individual exchanges (user message + AI response) tied to a session
- **entities** — named things the system learns about (people, places, concepts)
- **relationships** — directional labeled links between entities
- **summaries** — condensed episode groups for efficient context retrieval
- **projects** — named groupings of sessions with `name`, `description`, `colour`, `icon`, `isolated`, `notes`, `system_prompt`
-
-### Migrations
-
-Schema changes that cannot use `CREATE TABLE IF NOT EXISTS` are applied as
-idempotent migrations in `db/index.js` at startup:
+`createEpisode()` in `episodic/index.js` calls `extractAndStoreEntities()`
+immediately after the SQLite insert, without awaiting it:

 ```js
-try { db.exec(`ALTER TABLE sessions ADD COLUMN name TEXT`); } catch {}
-try { db.exec(`ALTER TABLE sessions ADD COLUMN project_id INTEGER REFERENCES projects(id)`); } catch {}
-try { db.exec(`CREATE INDEX IF NOT EXISTS idx_sessions_project ON sessions(project_id)`); } catch {}
-try { db.exec(`ALTER TABLE projects ADD COLUMN isolated INTEGER NOT NULL DEFAULT 0`); } catch {}
-try { db.exec(`ALTER TABLE projects ADD COLUMN notes TEXT`); } catch {}
-try { db.exec(`ALTER TABLE projects ADD COLUMN system_prompt TEXT`); } catch {}
+extractAndStoreEntities(userMessage, aiResponse, episode.id, projectId)
+  .catch(err => logger.error(`Failed to extract entities for episode ${episode.id}:`, err.message));
 ```

-New migrations are always appended here — never modify the schema file for
-existing tables since `ALTER TABLE` cannot use `IF NOT EXISTS`.
+If extraction throws, the episode is unaffected — the error is logged and
+swallowed.

-### FTS5 Full-Text Search
+## Model Settings

-An `episodes_fts` virtual table enables keyword search across all episodes.
-Three triggers (`episodes_fts_insert`, `episodes_fts_update`, `episodes_fts_delete`)
-keep the FTS index automatically in sync with the episodes table.
+| Setting | Value | Notes |
+|---|---|---|
+| Model | `qwen2.5:3b` | Ollama, configurable via `EXTRACTION_MODEL` |
+| Temperature | 0.1 | Low for consistent, near-deterministic output |
+| `num_predict` | 1500 | Higher ceiling to accommodate entity + relationship JSON |
+| `format` | `'json'` | Ollama constrained decoding — enforces valid JSON output |
+| Prompt format | ChatML | `<\|im_start\|>` / `<\|im_end\|>` tokens |

-### SQLite Configuration
+## Prompt Structure

- `journal_mode = WAL` — non-blocking reads during writes
- `foreign_keys = ON` — enforces referential integrity and cascade deletes
- PRAGMAs set via `db.pragma()`, not `db.exec()`
+The prompt is built by `buildExtractionPrompt()`. It includes:

-### Dynamic Updates
+1. **System message** — declares the model's role as an entity and relationship extractor
+2. **Instructions** — entity types, field rules, relationship label format, required JSON schema
+3. **Known entities block** — last 20 entities from SQLite, by `rowid DESC`, used to encourage consistent name/type pairs across conversations
+4. **Conversation** — the raw user message and AI response, delimited clearly

-Both `updateSession` and `updateProject` build their `SET` clause dynamically
-from only the fields passed — prevents partial updates from overwriting fields
-that weren't touched.
+```
+<|im_start|>system
+You are a named entity and relationship extractor. You output only valid JSON.
+<|im_end|>
+<|im_start|>user
+Read the conversation below and extract all named entities and the relationships between them.
+Entity types: person, place, project, technology, concept, organization
+...
+Return this exact JSON structure:
+{ "entities": [...], "relationships": [...] }

-`updateProject` allowlist:
-```js
-const allowed = ['name', 'description', 'colour', 'icon', 'isolated', 'notes', 'system_prompt'];
+Already known entities (use these exact name and type values if the same entity appears):
+- "NexusAI" (project)
+- "Alice" (person)
+
+--- CONVERSATION ---
+User: ...
+Assistant: ...
+--- END CONVERSATION ---
+<|im_end|>
+<|im_start|>assistant
 ```

-## Qdrant / Semantic Layer
+## Expected JSON Output

-Three Qdrant collections are initialized on service startup via `semantic.initCollections()`:
-
-| Collection | Purpose |
-|---|---|
-| `episodes` | Embeddings for individual conversation exchanges |
-| `entities` | Embeddings for named entities |
-| `summaries` | Embeddings for condensed episode summaries |
-
-All collections use **768-dimension vectors** with **Cosine similarity**,
-matching `nomic-embed-text` via Ollama. Vector size and distance metric are
-defined in `@nexusai/shared` — not hardcoded here.
-
-`initCollections()` iterates `Object.values(COLLECTIONS)` and creates any
-collection that doesn't already exist at startup — all three collections are
-guaranteed to exist before any requests are handled, avoiding race conditions
-between the first entity embed and an entity search.
-
-Each collection exposes upsert, search (with optional Qdrant filter), and
-delete operations. The `wait: true` flag is used on all writes.
-
-## Embedding Write Path
-
-When a new episode is created:
-
-1. Episode saved to SQLite synchronously — response returned immediately
-2. User message + AI response combined: `User: ...\nAssistant: ...`
-3. Text sent to embedding service (`POST /embed`)
-4. Vector upserted into `episodes` Qdrant collection with payload `{ sessionId, createdAt }`
-
-This step is **fire-and-forget** — if embedding fails, the episode is still
-saved and searchable via FTS. The error is logged but not surfaced.
-
-> The Qdrant payload stores `sessionId` (the internal integer ID). See
-> `memory-isolation.md` for how project-level filtering works.
-
-## Entity Layer
-
-Entities and relationships use upsert semantics with composite unique
-constraints to prevent duplicates:
-
- `UNIQUE(name, type)` on entities
- `UNIQUE(from_id, to_id, label)` on relationships
- `ON DELETE CASCADE` on relationship foreign keys
-
-After each episode is saved, `extraction.js` automatically extracts named
-entities from the conversation using `qwen2.5:3b` on Ollama — fire-and-forget.
-
-> For full details on the extraction pipeline, prompt format, constrained
-> decoding, stoplist, and Qdrant storage, see `entity-extraction.md`.
-
-## Summaries Layer
-
-Session summaries are generated by `orchestration-service/src/services/summarization.js`
-after each episode write and stored here via `POST /summaries`. The memory
-service is responsible only for CRUD — generation logic lives in orchestration.
-
-> For full details on trigger conditions, prompt format, cumulative updates,
-> and ChatML token stripping, see `summarization.md`.
-
-## Project Delete Behaviour
-
-Deleting a project runs as a transaction — it first nulls out `project_id`
-on all assigned sessions, then deletes the project. This avoids a foreign
-key constraint failure since `sessions.project_id` has no `ON DELETE` rule:
-
-```js
-const doDelete = db.transaction(() => {
-  db.prepare(`UPDATE sessions SET project_id = NULL WHERE project_id = ?`).run(id);
-  db.prepare(`DELETE FROM projects WHERE id = ?`).run(id);
-});
+```json
+{
+  "entities": [
+    { "name": "Alice", "type": "person", "notes": "Software engineer working on NexusAI." },
+    { "name": "NexusAI", "type": "project", "notes": "A modular AI assistant with persistent memory." }
+  ],
+  "relationships": [
+    {
+      "from": "Alice", "fromType": "person",
+      "to": "NexusAI", "toType": "project",
+      "label": "works_on",
+      "notes": "Alice is the primary developer."
+    }
+  ]
+}
 ```

-For all HTTP endpoints, see `api-routes.md`.
+Relationship labels use **snake_case verbs** (e.g. `works_on`, `manages`, `uses`,
+`knows`, `located_in`, `part_of`, `created_by`).
+
+## JSON Parsing
+
+The raw model response is matched with `/\{[\s\S]*\}/` before parsing — this
+tolerates any preamble or trailing prose the model emits alongside the JSON.
+If the match fails or `JSON.parse` throws, the function logs a warning and
+returns without writing anything.
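
A minimal sketch of the match-then-parse step described above. The helper name `parseExtractionResponse` is hypothetical and the real code in `extraction.js` may differ; only the regex is taken from the text.

```js
// Hypothetical helper: pulls the outermost {...} span out of a raw model
// response and parses it. Returns null instead of throwing on failure.
function parseExtractionResponse(raw) {
  const match = String(raw).match(/\{[\s\S]*\}/);
  if (!match) return null; // no JSON object anywhere in the response
  try {
    return JSON.parse(match[0]);
  } catch {
    return null; // the matched span was not valid JSON
  }
}

const parsed = parseExtractionResponse(
  'Sure! {"entities": [{"name": "Alice", "type": "person"}], "relationships": []} Done.'
);
console.log(parsed.entities[0].name);
```

Because the pattern is greedy, it spans from the first `{` to the last `}` in the response. Trailing prose that itself contains a `}` would make `JSON.parse` fail, which this sketch treats the same as a missing match.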
+
+## Entity Processing
+
+For each entity in `parsed.entities`:
+
+1. Validate `name`, `type` (must be in `ENTITY_TYPES`), and not in `IGNORED_NAMES`
+2. Call `upsertEntity(name, type, notes)`:
+   - **Insert**: creates new row with `mention_count = 1`, `source = 'extraction'`
+   - **Conflict** on `(name, type)`: increments `mention_count`, updates `last_seen_at`, preserves existing `notes` if new extraction returns null
+3. Add to `entityMap` keyed by `"${name}::${type}"` — used for relationship resolution below
+4. Call `linkEntityToEpisode(entity.id, episodeId)` — writes to `entity_episodes` join table
+5. Fire-and-forget: embed as `"${name} (${type}): ${notes}"` → store to Qdrant `entities` collection with `{ name, type, notes, projectId }` in payload
+
+**Valid entity types:** `person`, `place`, `project`, `technology`, `concept`, `organization`
+
+**Stoplist (ignored names):** `good morning`, `good night`, `hello`, `goodbye`, `thanks`, `thank you`
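
The validation and upsert rules above can be sketched with an in-memory `Map` standing in for the SQLite `entities` table. This is an illustration under assumptions, not the code in `entities/index.js`: the case-insensitive stoplist match, leaving `last_seen_at` as `null` on insert, and overwriting `notes` when a non-null value arrives are all guesses.

```js
const ENTITY_TYPES = new Set(['person', 'place', 'project', 'technology', 'concept', 'organization']);
const IGNORED_NAMES = new Set(['good morning', 'good night', 'hello', 'goodbye', 'thanks', 'thank you']);

// In-memory stand-in for the `entities` table, keyed by "name::type".
const table = new Map();
let nextId = 1;

// Step 1: reject empty names, unknown types, and stoplisted names.
function isValidEntity(e) {
  if (!e || typeof e.name !== 'string' || !e.name.trim()) return false;
  if (!ENTITY_TYPES.has(e.type)) return false;
  return !IGNORED_NAMES.has(e.name.trim().toLowerCase()); // assumption: case-insensitive
}

// Step 2: insert on first sight, otherwise bump the counters.
function upsertEntity(name, type, notes = null, now = Math.floor(Date.now() / 1000)) {
  const key = `${name}::${type}`;
  const row = table.get(key);
  if (!row) {
    const created = { id: nextId++, name, type, notes, mention_count: 1, source: 'extraction', last_seen_at: null };
    table.set(key, created);
    return created;
  }
  row.mention_count += 1;
  row.last_seen_at = now;
  if (notes != null) row.notes = notes; // a null extraction keeps the existing notes
  return row;
}

upsertEntity('Alice', 'person', 'Software engineer working on NexusAI.');
const alice = upsertEntity('Alice', 'person', null, 1700000000);
console.log(alice.mention_count, alice.notes);
```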
+
+## Relationship Processing
+
+After all entities are saved, relationships are processed:
+
+1. For each entry in `parsed.relationships`, look up both endpoints in `entityMap` using `"${from}::${fromType}"` and `"${to}::${toType}"` as keys
+2. If either endpoint is missing (filtered out, invalid type, or not in this extraction), the relationship is silently skipped
+3. Call `upsertRelationship(fromId, toId, label, notes)`:
+   - **Insert**: creates new row with `mention_count = 1`
+   - **Conflict** on `(from_id, to_id, label)`: increments `mention_count`, preserves existing `notes` if new is null
+
+Relationships are unidirectional in storage. Bidirectionality is handled at
+query time by the graph traversal layer.
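
A rough sketch of the endpoint lookup in steps 1 and 2, assuming `entityMap` is a `Map` from `"name::type"` to the saved entity ID. The function name `resolveRelationships` is hypothetical; in the real pipeline each resolved entry would be passed to `upsertRelationship`.

```js
// Resolves extracted relationships against the entities saved in this run.
// Entries whose endpoints are not both in entityMap are skipped silently.
function resolveRelationships(relationships, entityMap) {
  const resolved = [];
  for (const rel of relationships ?? []) {
    const fromId = entityMap.get(`${rel.from}::${rel.fromType}`);
    const toId = entityMap.get(`${rel.to}::${rel.toType}`);
    if (fromId === undefined || toId === undefined) continue;
    resolved.push({ fromId, toId, label: rel.label, notes: rel.notes ?? null });
  }
  return resolved;
}

const entityMap = new Map([['Alice::person', 5], ['NexusAI::project', 8]]);
const resolved = resolveRelationships(
  [
    { from: 'Alice', fromType: 'person', to: 'NexusAI', toType: 'project', label: 'works_on' },
    { from: 'Alice', fromType: 'person', to: 'Bob', toType: 'person', label: 'knows' },
  ],
  entityMap
);
console.log(resolved);
```

The second relationship is dropped because `Bob::person` was never added to the map, which mirrors the "not in this extraction" case.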
+
+## Project Scoping
+
+`projectId` is threaded through from the episode creation call. It is stored
+in the Qdrant entity payload, which enables project-scoped entity search in
+orchestration. SQLite entities and relationships are global — scoping only
+applies at the Qdrant retrieval layer.
+
+## Error Behaviour
+
+All steps after the initial model call are wrapped in a single outer try/catch.
+If Ollama is unreachable, returns a non-200 status, or the JSON cannot be
+parsed, the function logs at `warn` level and returns. There is no retry logic.
+Individual entity embedding failures are caught per-entity and logged at `warn`
+level without affecting other entities in the same batch.
--- a/docs/services/knowledge-graph.md
+++ b/docs/services/knowledge-graph.md
@@ -0,0 +1,213 @@
+# Knowledge Graph
+
+**Location:** `packages/memory-service/src/graph/index.js`  
+**Schema additions:** `entity_episodes` table; new columns on `entities` and `relationships`  
+**Exposed via:** `GET /graph/neighborhood/:entityId`, `POST /graph/neighbors`  
+**Consumed by:** Orchestration service context assembly
+
+## Purpose
+
+The knowledge graph transforms NexusAI from "remembers conversations" to
+"understands relationships between things." Rather than injecting a flat
+list of entity facts into every prompt, orchestration now retrieves a
+1-hop subgraph of connected entities and their relationships, giving the
+model structured, linked knowledge about people, projects, technologies,
+and concepts that have appeared across conversations.
+
+## Schema
+
+### `entity_episodes` (join table)
+
+Tracks which episodes contributed to each entity's knowledge. Defined in
+`schema.js` — exists on all installs.
+
+```sql
+CREATE TABLE IF NOT EXISTS entity_episodes (
+  entity_id  INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
+  episode_id INTEGER NOT NULL REFERENCES episodes(id) ON DELETE CASCADE,
+  PRIMARY KEY (entity_id, episode_id)
+);
+```
+
+Both FKs cascade on delete — removing an entity or episode automatically
+cleans up its join rows.
+
+### New columns on `entities`
+
+Added via migration in `db/index.js`:
+
+| Column | Type | Default | Description |
+|---|---|---|---|
+| `mention_count` | INTEGER | 1 | How many times this entity has been extracted across conversations |
+| `confidence` | REAL | 1.0 | Reserved for future confidence scoring |
+| `source` | TEXT | `'extraction'` | `'extraction'` (auto) or `'manual'` |
+| `last_seen_at` | INTEGER | NULL | Unix timestamp of most recent extraction hit |
+
+### New columns on `relationships`
+
+| Column | Type | Default | Description |
+|---|---|---|---|
+| `mention_count` | INTEGER | 1 | How many times this edge has been extracted |
+| `notes` | TEXT | NULL | Relationship context sentence from extraction |
+
+## Entity Promotion Model
+
+Entities are not created equal — some are mentioned once in passing, others
+recur across many conversations. `mention_count` is the signal:
+
+- Every time `upsertEntity` is called for an existing `(name, type)` pair, `mention_count` is incremented and `last_seen_at` is updated.
+- `ENTITIES.PROMOTION_THRESHOLD` (default: **3**) is the `mention_count` at which an entity is considered "well-established" — referenced in the codebase for future filtering and scoring logic.
+- Currently `mention_count` is stored and incremented but not yet used to gate retrieval. It provides the foundation for future features such as orphan cleanup (entities never re-extracted) and confidence-weighted graph traversal.
+
+The same pattern applies to relationships — `mention_count` rises each time
+the same `(from_id, to_id, label)` triple is extracted.
+
+## Graph Traversal
+
+`src/graph/index.js` exports two functions built on SQLite's `WITH RECURSIVE`
+CTE support. No external graph database is needed.
+
+### `getNeighborhood(entityId, depth)`
+
+Traverses the graph from a single entity, following edges in **both directions**,
+up to `depth` hops. Returns `{ nodes: [...entities], edges: [...relationships] }`.
+
+Default depth: `ENTITIES.GRAPH_HOP_DEPTH` (1). Maximum enforced at HTTP layer: 3.
+
+**SQLite query:**
+
+```sql
+WITH RECURSIVE traverse(entity_id, depth) AS (
+    SELECT ?, 0
+    UNION
+    SELECT
+        CASE WHEN r.from_id = t.entity_id THEN r.to_id ELSE r.from_id END,
+        t.depth + 1
+    FROM relationships r
+    JOIN traverse t ON (r.from_id = t.entity_id OR r.to_id = t.entity_id)
+    WHERE t.depth < ?
+)
+SELECT DISTINCT entity_id FROM traverse
+```
+
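+`UNION` (not `UNION ALL`) discards duplicate `(entity_id, depth)` rows, so a
+node reached twice at the same depth is expanded only once. A node on a cycle
+can still reappear at a greater depth; the `t.depth < ?` bound guarantees
+termination, and the final `SELECT DISTINCT` collapses the repeats.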
+
+After collecting node IDs, two follow-up queries fetch:
+- All entity rows for those IDs
+- All relationship rows where both `from_id` and `to_id` are in the node set
+
+This ensures edges between neighbors are included even if they aren't on the
+traversal path from the seed.
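
For readers less familiar with recursive CTEs, the same traversal can be sketched as a breadth-first walk over an in-memory edge list. This is an illustrative equivalent, not the code in `graph/index.js`; the function names are hypothetical and it returns the same node set the query would.

```js
// edges: [{ from_id, to_id, label }]. Direction is ignored while walking.
function getNeighborhoodIds(edges, seedId, depth = 1) {
  const visited = new Set([seedId]);
  let frontier = [seedId];
  for (let hop = 0; hop < depth && frontier.length > 0; hop++) {
    const next = [];
    for (const id of frontier) {
      for (const e of edges) {
        let neighbor;
        if (e.from_id === id) neighbor = e.to_id;
        else if (e.to_id === id) neighbor = e.from_id;
        else continue;
        if (!visited.has(neighbor)) {
          visited.add(neighbor);
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  return visited;
}

// Follow-up step: keep every edge whose endpoints are both in the node set.
function inducedEdges(edges, nodeIds) {
  return edges.filter(e => nodeIds.has(e.from_id) && nodeIds.has(e.to_id));
}

const edges = [
  { from_id: 1, to_id: 2, label: 'works_on' },
  { from_id: 3, to_id: 1, label: 'knows' },
  { from_id: 2, to_id: 4, label: 'uses' },
  { from_id: 3, to_id: 2, label: 'uses' },
];
const oneHop = getNeighborhoodIds(edges, 1, 1);
const oneHopEdges = inducedEdges(edges, oneHop);
console.log([...oneHop], oneHopEdges.length);
```

In this sample the `3 → 2` edge is not on any path from seed `1`, yet it is returned because both endpoints are in the node set.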
+
+### `getEntityNeighbors(entityIds[])`
+
+Bulk 1-hop version designed for orchestration. Given multiple seed entity IDs
+(the results of Qdrant semantic search), returns the combined 1-hop subgraph.
+
+1. Finds all neighbor IDs via one query using `IN (...)` on both `from_id` and `to_id`
+2. Deduplicates seeds + neighbors using a JavaScript `Set`
+3. Fetches all entity rows and all relationship rows within the combined node set
+
+This is intentionally simpler than the recursive version — orchestration always
+uses depth=1, and the bulk query avoids N separate CTE calls.
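
The three steps could look roughly like the sketch below. The SQL text and both helper names are assumptions made for illustration; the actual query in `graph/index.js` is not shown in this document.

```js
// Step 1: build one parameterised query covering all seeds in both directions.
function buildNeighborQuery(seedIds) {
  const ids = [...new Set(seedIds)];
  const placeholders = ids.map(() => '?').join(', ');
  const sql =
    `SELECT from_id, to_id FROM relationships ` +
    `WHERE from_id IN (${placeholders}) OR to_id IN (${placeholders})`;
  // The ID list is bound twice, once per IN clause.
  return { sql, params: [...ids, ...ids] };
}

// Step 2: merge seeds with both endpoints of every matching row.
function collectNodeIds(seedIds, rows) {
  const nodeIds = new Set(seedIds);
  for (const row of rows) {
    nodeIds.add(row.from_id);
    nodeIds.add(row.to_id);
  }
  return [...nodeIds];
}

const { sql, params } = buildNeighborQuery([5, 8, 5]);
const nodeIds = collectNodeIds([5, 8], [
  { from_id: 5, to_id: 8 },
  { from_id: 12, to_id: 8 },
]);
console.log(sql, params, nodeIds);
```

Step 3 would then fetch entity and relationship rows for `nodeIds` with the same placeholder technique.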
+
+## Graph-Aware Context Assembly
+
+Orchestration's `assembleContext` (in `src/chat/index.js`) integrates the
+graph at step 7 of the chat pipeline:
+
+1. Qdrant entity search returns up to `ORCHESTRATION.ENTITIES_LIMIT` results, each including `r.id` (the SQLite entity ID) alongside the Qdrant payload
+2. `graph.getNeighbors(entityIds)` is called with those IDs → `POST /graph/neighbors` on memory-service
+3. The returned `{ nodes, edges }` is passed to `formatGraphContext()`
+4. On failure, falls back to using the Qdrant payload data directly as flat nodes with no edges
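
A hedged sketch of steps 2 to 4, with the graph client passed in as a function so the fallback path is easy to see. `expandWithGraph` and `hitToFlatNode` are hypothetical names, and the hit shape `{ id, payload }` is an assumption based on the description above.

```js
// Fallback shape: a Qdrant hit becomes a flat node with no edges.
function hitToFlatNode(hit) {
  const { name, type, notes = null } = hit.payload ?? {};
  return { id: hit.id, name, type, notes };
}

// getNeighbors is expected to resolve to { nodes, edges }.
async function expandWithGraph(hits, getNeighbors) {
  const entityIds = hits.map(h => h.id);
  if (entityIds.length === 0) return { nodes: [], edges: [] };
  try {
    return await getNeighbors(entityIds);
  } catch {
    // Graph call failed: use the payload data directly, without edges.
    return { nodes: hits.map(hitToFlatNode), edges: [] };
  }
}

const hits = [{ id: 5, payload: { name: 'Alice', type: 'person', notes: 'Engineer', projectId: 1 } }];
const failingClient = async () => { throw new Error('memory-service unreachable'); };
expandWithGraph(hits, failingClient).then(result => console.log(result));
```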
+
+### Prompt Format
+
+`formatGraphContext(nodes, edges)` in `chat/index.js` formats the subgraph as:
+
+```
+Here is what you know about entities relevant to this conversation and their connections:
+- Alice (person): software engineer working on NexusAI
+  → works_on NexusAI (project)
+  → knows Bob (person)
+- NexusAI (project): AI assistant framework
+- Bob (person): Alice's colleague
+```
+
+- One line per node: `- {name} ({type}): {notes}`
+- Outbound edges indented below: `  → {label} {target_name} ({target_type})`
+- Nodes with only inbound edges (pulled in as neighbors) appear without connection lines
+- Only outbound edges are shown — each relationship appears once, from the `from_id` side
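
The formatting rules above can be expressed in a few lines. This is a sketch that follows those rules, not the actual `formatGraphContext` in `chat/index.js`; rendering missing `notes` as an empty string and returning `''` for an empty node list are assumptions.

```js
function formatGraphContext(nodes, edges) {
  if (!nodes || nodes.length === 0) return '';
  const byId = new Map(nodes.map(n => [n.id, n]));
  const lines = [
    'Here is what you know about entities relevant to this conversation and their connections:',
  ];
  for (const node of nodes) {
    lines.push(`- ${node.name} (${node.type}): ${node.notes ?? ''}`);
    // Only outbound edges are listed, so each relationship appears once.
    for (const edge of edges ?? []) {
      if (edge.from_id !== node.id) continue;
      const target = byId.get(edge.to_id);
      if (target) lines.push(`  → ${edge.label} ${target.name} (${target.type})`);
    }
  }
  return lines.join('\n');
}

const context = formatGraphContext(
  [
    { id: 5, name: 'Alice', type: 'person', notes: 'software engineer working on NexusAI' },
    { id: 8, name: 'NexusAI', type: 'project', notes: 'AI assistant framework' },
    { id: 9, name: 'Bob', type: 'person', notes: "Alice's colleague" },
  ],
  [
    { from_id: 5, to_id: 8, label: 'works_on' },
    { from_id: 5, to_id: 9, label: 'knows' },
  ]
);
console.log(context);
```

With these inputs the output matches the example block above line for line.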
+
+## Project Scoping
+
+The knowledge graph respects project boundaries at the **entry point**, not
+during traversal:
+
+- Qdrant entity search is filtered by `projectId` — only entities tagged with this project are returned as seeds
+- Graph traversal in SQLite is unfiltered — neighbors can be from any project or no project
+- This is intentional: the graph entry is project-scoped, but traversal follows the global relationship graph to discover connected knowledge
+
+Entities are tagged with `projectId` in the Qdrant payload at extraction time.
+Entities extracted from non-project sessions have `projectId: null` and only
+appear in unfiltered global searches.
+
+## API Reference
+
+### `GET /graph/neighborhood/:entityId`
+
+Returns the neighborhood of a single entity.
+
+**Query params:**
+
+| Param | Default | Max | Description |
+|---|---|---|---|
+| `depth` | `ENTITIES.GRAPH_HOP_DEPTH` (1) | 3 | Traversal depth |
+
+**Response:**
+```json
+{
+  "entity": { "id": 5, "name": "Alice", "type": "person", "notes": "...", "mention_count": 4 },
+  "neighborhood": {
+    "nodes": [
+      { "id": 5, "name": "Alice", "type": "person", "notes": "..." },
+      { "id": 8, "name": "NexusAI", "type": "project", "notes": "..." }
+    ],
+    "edges": [
+      { "id": 2, "from_id": 5, "to_id": 8, "label": "works_on", "notes": "...", "mention_count": 3 }
+    ]
+  }
+}
+```
+
+Returns 404 if the entity does not exist.
+
+### `POST /graph/neighbors`
+
+Bulk 1-hop neighborhood for a set of entity IDs. Used internally by
+orchestration — not intended for direct client use.
+
+**Request body:**
+```json
+{ "entityIds": [5, 8, 12] }
+```
+
+**Response:**
+```json
+{
+  "nodes": [ ...entity objects... ],
+  "edges": [ ...relationship objects... ]
+}
+```
+
+Returns 400 if `entityIds` is missing or empty.
+
+## Constants (`packages/shared/src/config/constants.js`)
+
+| Constant | Value | Description |
+|---|---|---|
+| `ENTITIES.PROMOTION_THRESHOLD` | 3 | `mention_count` at which an entity is considered well-established |
+| `ENTITIES.GRAPH_HOP_DEPTH` | 1 | Default traversal depth for neighborhood queries |
+| `ORCHESTRATION.ENTITIES_LIMIT` | 5 | Max entity seeds returned from Qdrant search |
+| `ORCHESTRATION.ENTITIES_THRESHOLD` | 0.55 | Minimum similarity score for entity Qdrant search |
--- a/docs/services/memory-service.md
+++ b/docs/services/memory-service.md
@@ -9,8 +9,8 @@

 Responsible for all reading and writing of long-term memory. Acts as the
 sole interface to both SQLite and Qdrant — no other service accesses these
-stores directly. On episode creation, automatically calls the embedding
-service to generate and store a vector in Qdrant.
+stores directly. On episode creation, automatically triggers entity and
+relationship extraction and embeds results into Qdrant.

 ## Dependencies

@@ -45,19 +45,22 @@ src/
 ├── semantic/
 │   └── index.js       # Qdrant collection management, upsert, search, delete
 ├── entities/
-│   ├── index.js       # Entity + relationship CRUD
-│   └── extraction.js  # Automatic entity extraction via qwen2.5:3b on Ollama
+│   ├── index.js       # Entity + relationship CRUD (upsert, mention tracking)
+│   └── extraction.js  # Automatic entity + relationship extraction via qwen2.5:3b
+├── graph/
+│   └── index.js       # Knowledge graph traversal (neighborhood queries, recursive CTE)
 └── index.js           # Express app + all route definitions
 ```

 ## SQLite Schema

-Seven core tables:
+Eight core tables:

 - **sessions** — top-level conversation containers. Fields: `external_id`, `name`, `project_id`, `metadata`
 - **episodes** — individual exchanges (user message + AI response) tied to a session
- **entities** — named things the system learns about (people, places, concepts)
- **relationships** — directional labeled links between entities
+- **entities** — named things the system learns about (people, places, concepts, etc.). Fields include `mention_count`, `confidence`, `source`, `last_seen_at`
+- **relationships** — directional labeled links between entities (`from_id`, `to_id`, `label`). Fields include `mention_count`, `notes`
+- **entity_episodes** — join table linking entities to the episodes where they were extracted. Used for provenance and orphan cleanup
 - **summaries** — condensed episode groups for efficient context retrieval
 - **projects** — named groupings of sessions with `name`, `description`, `colour`, `icon`, `isolated`, `notes`, `system_prompt`

@@ -73,10 +76,18 @@ try { db.exec(`CREATE INDEX IF NOT EXISTS idx_sessions_project ON sessions(proje
 try { db.exec(`ALTER TABLE projects ADD COLUMN isolated INTEGER NOT NULL DEFAULT 0`); } catch {}
 try { db.exec(`ALTER TABLE projects ADD COLUMN notes TEXT`); } catch {}
 try { db.exec(`ALTER TABLE projects ADD COLUMN system_prompt TEXT`); } catch {}
+// Knowledge graph columns:
+try { db.exec(`ALTER TABLE entities ADD COLUMN mention_count INTEGER NOT NULL DEFAULT 1`); } catch {}
+try { db.exec(`ALTER TABLE entities ADD COLUMN confidence REAL NOT NULL DEFAULT 1.0`); } catch {}
+try { db.exec(`ALTER TABLE entities ADD COLUMN source TEXT NOT NULL DEFAULT 'extraction'`); } catch {}
+try { db.exec(`ALTER TABLE entities ADD COLUMN last_seen_at INTEGER`); } catch {}
+try { db.exec(`ALTER TABLE relationships ADD COLUMN mention_count INTEGER NOT NULL DEFAULT 1`); } catch {}
+try { db.exec(`ALTER TABLE relationships ADD COLUMN notes TEXT`); } catch {}
 ```

-New migrations are always appended here — never modify the schema file for
-existing tables since `ALTER TABLE` cannot use `IF NOT EXISTS`.
+`entity_episodes` is defined in `schema.js` itself (not a migration) since it is a new table.
+
+New migrations are always appended — never modify the schema file for existing tables since `ALTER TABLE` cannot use `IF NOT EXISTS`.

 ### FTS5 Full-Text Search

@@ -117,8 +128,7 @@ defined in `@nexusai/shared` — not hardcoded here.

 `initCollections()` iterates `Object.values(COLLECTIONS)` and creates any
 collection that doesn't already exist at startup — all three collections are
-guaranteed to exist before any requests are handled, avoiding race conditions
-between the first entity embed and an entity search.
+guaranteed to exist before any requests are handled.

 Each collection exposes upsert, search (with optional Qdrant filter), and
 delete operations. The `wait: true` flag is used on all writes.
@@ -143,15 +153,27 @@ saved and searchable via FTS. The error is logged but not surfaced.
 Entities and relationships use upsert semantics with composite unique
 constraints to prevent duplicates:

- `UNIQUE(name, type)` on entities
- `UNIQUE(from_id, to_id, label)` on relationships
+- `UNIQUE(name, type)` on entities — conflict increments `mention_count` and updates `last_seen_at`
+- `UNIQUE(from_id, to_id, label)` on relationships — conflict increments `mention_count` and preserves existing `notes`
 - `ON DELETE CASCADE` on relationship foreign keys

 After each episode is saved, `extraction.js` automatically extracts named
-entities from the conversation using `qwen2.5:3b` on Ollama — fire-and-forget.
+entities **and relationships** from the conversation using `qwen2.5:3b` on
+Ollama — fire-and-forget. Each saved entity is also linked to the episode
+via the `entity_episodes` join table.

-> For full details on the extraction pipeline, prompt format, constrained
-> decoding, stoplist, and Qdrant storage, see `entity-extraction.md`.
+> For full details on the extraction pipeline and JSON format, see `entity-extraction.md`.  
+> For the knowledge graph traversal layer, see `knowledge-graph.md`.
+
+## Knowledge Graph Layer
+
+`src/graph/index.js` provides SQLite-based graph traversal over the entities
+and relationships tables. Two functions are exposed via HTTP:
+
+- **`getNeighborhood(entityId, depth)`** — recursive CTE traversal, bidirectional, returns `{ nodes, edges }`
+- **`getEntityNeighbors(entityIds[])`** — bulk 1-hop traversal for orchestration context assembly
+
+> For design rationale, traversal queries, and integration with orchestration, see `knowledge-graph.md`.

 ## Summaries Layer

@@ -175,4 +197,4 @@ const doDelete = db.transaction(() => {
 });
 ```

-For all HTTP endpoints, see `api-routes.md`.
+For all HTTP endpoints, see `api-routes.md`.
--- a/docs/services/orchestration-service.md
+++ b/docs/services/orchestration-service.md
@@ -42,9 +42,10 @@ src/
 │   ├── inference.js      # HTTP client for inference service
 │   ├── embedding.js      # HTTP client for embedding service
 │   ├── qdrant.js         # HTTP client for Qdrant (direct vector search)
+│   ├── graph.js          # HTTP client for memory-service graph endpoints
 │   └── summarization.js  # Session summarisation — triggers after each episode
 ├── chat/
-│   └── index.js          # Core pipeline — context assembly, isolation, auto-naming
+│   └── index.js          # Core pipeline — context assembly, graph expansion, auto-naming
 ├── config/
 │   └── settings.js       # Settings load/save — reads/writes data/settings.json
 ├── routes/
@@ -71,7 +72,7 @@ via `appSettings.load()` — changes apply immediately without a service restart
 |---|---|---|
 | `recentEpisodeLimit` | 5 | Recent episodes injected into prompt |
 | `semanticLimit` | 5 | Semantic search results injected into prompt |
-| `scoreThreshold` | 0.75 | Minimum similarity score for semantic results |
+| `scoreThreshold` | 0.5 | Minimum similarity score for semantic results |
 | `modelsFolderPath` | `/mnt/nexus-models` | Path to folder containing .gguf files |
 | `temperature` | 0.7 | Inference temperature |
 | `repeatPenalty` | 1.1 | Repeat token penalty |
@@ -104,20 +105,27 @@ difference is how the inference response is delivered to the client.
   episodes. Deduplicated against recent episodes. Non-critical.

 6. **Entity search** — query `entities` Qdrant collection filtered by
-   `projectId`. Non-project sessions receive no entity context. Non-critical.
+   `projectId`. Returns entity IDs alongside Qdrant payload data (the Qdrant
+   point ID equals the SQLite entity ID). Non-critical.

-7. **Prompt assembly** — combine system prompt, entity context, semantic
+7. **Graph neighborhood expansion** — call `POST /graph/neighbors` on
+   memory-service with the entity IDs from step 6. Returns a 1-hop subgraph
+   `{ nodes, edges }` — entity objects plus the relationships connecting them.
+   If no entities were found or the graph call fails, falls back to flat entity
+   list (no edges). Non-critical.
+
+8. **Prompt assembly** — combine system prompt, graph context, semantic
   episodes, recent episodes, and user message.

-8. **Inference** — send to inference service. `/chat` awaits full response;
+9. **Inference** — send to inference service. `/chat` awaits full response;
   `/chat/stream` pipes SSE chunks to the client.

-9. **Episode write** — write exchange back to memory with `projectId`.
+10. **Episode write** — write exchange back to memory with `projectId`.

-10. **Summarisation trigger** — `triggerSummary(session, allEpisodes)` called
+11. **Summarisation trigger** — `triggerSummary(session, allEpisodes)` called
    fire-and-forget. See `summarization.md` for full details.

-11. **Auto-naming** — on first message with no session name, fires a secondary
+12. **Auto-naming** — on first message with no session name, fires a secondary
    inference call (max 20 tokens, temperature 0.3) to generate a session name.

 ### Prompt Structure
@@ -125,8 +133,9 @@ difference is how the inference response is delivered to the client.
 ```
 [Resolved system prompt]

-Here is what you know about entities relevant to this conversation:
+Here is what you know about entities relevant to this conversation and their connections:
 - {name} ({type}): {notes}
+  → {label} {neighbor_name} ({neighbor_type})
 ---
 Here are some relevant memories from earlier conversations:
 User: {past user message}
@@ -141,6 +150,12 @@ User: {current message}
 Assistant:
 ```

+The entity block renders the full graph neighborhood — seed entities matched
+by Qdrant search plus any neighbors pulled in by 1-hop traversal. Each entity
+shows its `notes` and any outbound relationships with their targets. Neighbor
+nodes that have no outbound edges within the subgraph appear without connection
+lines.
+
 ## Summarisation

 After each episode write, `triggerSummary` is called fire-and-forget. It
@@ -199,4 +214,7 @@ handle /health*    { reverse_proxy localhost:4000 }

 After updating: `caddy reload --config /path/to/Caddyfile`

-For all HTTP endpoints, see `api-routes.md`.
+> Note: `/graph` routes are on the memory-service (port 3002) and are called
+> internally by orchestration — they do not need a Caddy entry.
+
+For all HTTP endpoints, see `api-routes.md`.