roadmap phase 1 complete

Storme-bit
2026-04-27 03:10:39 -07:00
parent 9fe8e568cf
commit 1a97b19280
19 changed files with 759 additions and 281 deletions


@@ -74,7 +74,7 @@ service by ID after the vector search.
The core four-service architecture is complete and operational. Key capabilities:
- **Hybrid memory retrieval** — recent episodes + semantic search combined into every prompt
- **Entity layer** — automatic extraction of named entities from conversations via qwen2.5:3b, stored in SQLite and Qdrant, injected into every prompt as structured knowledge
- **Entity layer + Knowledge graph** — automatic extraction of named entities and relationships from conversations via qwen2.5:3b. Entities and relationships are stored in SQLite with `mention_count` tracking. A graph traversal layer expands Qdrant entity search hits into a 1-hop neighborhood subgraph, injecting structured connected knowledge into every prompt
- **Projects** — sessions grouped with shared or isolated memory pools
- **Auto-naming** — sessions named automatically from first exchange via inference
- **Project-scoped semantic search** — Qdrant filtered by project session IDs


@@ -360,13 +360,34 @@ Same request/response shape as orchestration `/projects` above.
**DELETE /relationships — body:**
```json
{ "fromId": 1, "toId": 2, "label": "uses" }
{ "fromId": 1, "toId": 2, "label": "works_on", "notes": "Alice is the primary developer.", "metadata": {} }
```
`notes` is optional. `label` should be a snake_case verb. A relationship is identified by the composite key `(fromId, toId, label)`; re-submitting with the same key increments `mention_count` and preserves the existing `notes` if the new value is null.
Relationships are identified by the composite key `(fromId, toId, label)`.
Delete uses request body rather than URL params since this three-part key
is awkward to encode in a path.
### Graph
| Method | Path | Description |
|---|---|---|
| GET | /graph/neighborhood/:entityId | Entity neighborhood — nodes + edges within N hops |
| POST | /graph/neighbors | Bulk 1-hop neighborhood for a set of entity IDs |
**GET /graph/neighborhood/:entityId — query params:**
| Param | Default | Max | Description |
|---|---|---|---|
| depth | 1 | 3 | Traversal depth |
Returns `{ entity, neighborhood: { nodes, edges } }`. Returns `404` if entity not found.
**POST /graph/neighbors — body:**
```json
{ "entityIds": [5, 8, 12] }
```
Returns `{ nodes: [...], edges: [...] }`. Used internally by orchestration; not a client-facing endpoint.
---
## Embedding Service port 3003


@@ -59,10 +59,10 @@
### 1. Knowledge Graph (SQLite)
The highest-leverage memory upgrade. Transforms NexusAI from "remembers conversations" to "understands relationships between things."
- [ ] Graph schema — `nodes` and `edges` tables with typed relationships
- [ ] Entity → node promotion pipeline
- [ ] Relationship traversal queries
- [ ] Graph-aware context assembly in orchestration
- [x] Graph schema — `nodes` and `edges` tables with typed relationships
- [x] Entity → node promotion pipeline (`mention_count` tracked; threshold gating deferred to Phase 2)
- [x] Relationship traversal queries
- [x] Graph-aware context assembly in orchestration
### 2. Retrieval Fusion + Full-Text Search
Multi-strategy retrieval merged into a single ranked result set.


@@ -1,178 +1,140 @@
# Memory Service
# Entity Extraction
**Package:** `@nexusai/memory-service`
**Location:** `packages/memory-service`
**Deployed on:** Mini PC 1 (192.168.0.81)
**Port:** 3002
**Location:** `packages/memory-service/src/entities/extraction.js`
**Triggered by:** Episode creation (`POST /episodes`)
**Model:** `qwen2.5:3b` via Ollama (configurable via `EXTRACTION_MODEL` env var)
## Purpose
Responsible for all reading and writing of long-term memory. Acts as the
sole interface to both SQLite and Qdrant — no other service accesses these
stores directly. On episode creation, automatically calls the embedding
service to generate and store a vector in Qdrant.
After each episode is saved to SQLite, the extraction pipeline runs
asynchronously in the background to identify named entities and the
relationships between them. Results are written back to SQLite and
embedded into Qdrant — the episode response is never delayed.
## Dependencies
## Trigger
- `express` — HTTP API
- `better-sqlite3` — SQLite driver
- `@qdrant/js-client-rest` — Qdrant vector store client
- `dotenv` — environment variable loading
- `@nexusai/shared` — shared utilities and constants
## Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| PORT | No | 3002 | Port to listen on |
| SQLITE_PATH | Yes | — | Path to SQLite database file |
| QDRANT_URL | No | http://localhost:6333 | Qdrant instance URL |
| EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |
| EXTRACTION_URL | No | http://localhost:11434 | Ollama URL for entity extraction |
| EXTRACTION_MODEL | No | qwen2.5:3b | Ollama model used for entity extraction |
## Internal Structure
```
src/
├── db/
│ ├── index.js # SQLite connection + initialization + migrations
│ ├── schema.js # Table definitions, indexes, FTS5, triggers
│ ├── projects.js # Project CRUD functions
│ └── summaries.js # Summary CRUD functions
├── episodic/
│ └── index.js # Session + episode CRUD, FTS search, embedding write path
├── semantic/
│ └── index.js # Qdrant collection management, upsert, search, delete
├── entities/
│ ├── index.js # Entity + relationship CRUD
│ └── extraction.js # Automatic entity extraction via qwen2.5:3b on Ollama
└── index.js # Express app + all route definitions
```
## SQLite Schema
Seven core tables:
- **sessions** — top-level conversation containers. Fields: `external_id`, `name`, `project_id`, `metadata`
- **episodes** — individual exchanges (user message + AI response) tied to a session
- **entities** — named things the system learns about (people, places, concepts)
- **relationships** — directional labeled links between entities
- **summaries** — condensed episode groups for efficient context retrieval
- **projects** — named groupings of sessions with `name`, `description`, `colour`, `icon`, `isolated`, `notes`, `system_prompt`
### Migrations
Schema changes that cannot use `CREATE TABLE IF NOT EXISTS` are applied as
idempotent migrations in `db/index.js` at startup:
`createEpisode()` in `episodic/index.js` calls `extractAndStoreEntities()`
immediately after the SQLite insert, without awaiting it:
```js
try { db.exec(`ALTER TABLE sessions ADD COLUMN name TEXT`); } catch {}
try { db.exec(`ALTER TABLE sessions ADD COLUMN project_id INTEGER REFERENCES projects(id)`); } catch {}
try { db.exec(`CREATE INDEX IF NOT EXISTS idx_sessions_project ON sessions(project_id)`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN isolated INTEGER NOT NULL DEFAULT 0`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN notes TEXT`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN system_prompt TEXT`); } catch {}
extractAndStoreEntities(userMessage, aiResponse, episode.id, projectId)
.catch(err => logger.error(`Failed to extract entities for episode ${episode.id}:`, err.message));
```
New migrations are always appended here — never modify the schema file for
existing tables since `ALTER TABLE` cannot use `IF NOT EXISTS`.
If extraction throws, the episode is unaffected — the error is logged and
swallowed.
### FTS5 Full-Text Search
## Model Settings
An `episodes_fts` virtual table enables keyword search across all episodes.
Three triggers (`episodes_fts_insert`, `episodes_fts_update`, `episodes_fts_delete`)
keep the FTS index automatically in sync with the episodes table.
| Setting | Value | Notes |
|---|---|---|
| Model | `qwen2.5:3b` | Ollama, configurable via `EXTRACTION_MODEL` |
| Temperature | 0.1 | Low for consistent, deterministic output |
| `num_predict` | 1500 | Higher ceiling to accommodate entity + relationship JSON |
| `format` | `'json'` | Ollama constrained decoding — enforces valid JSON output |
| Prompt format | ChatML | `<\|im_start\|>` / `<\|im_end\|>` tokens |
### SQLite Configuration
## Prompt Structure
- `journal_mode = WAL` — non-blocking reads during writes
- `foreign_keys = ON` — enforces referential integrity and cascade deletes
- PRAGMAs set via `db.pragma()`, not `db.exec()`
The prompt is built by `buildExtractionPrompt()`. It includes:
### Dynamic Updates
1. **System message** — declares the model's role as an entity and relationship extractor
2. **Instructions** — entity types, field rules, relationship label format, required JSON schema
3. **Known entities block** — last 20 entities from SQLite, by `rowid DESC`, used to encourage consistent name/type pairs across conversations
4. **Conversation** — the raw user message and AI response, delimited clearly
Both `updateSession` and `updateProject` build their `SET` clause dynamically
from only the fields passed — prevents partial updates from overwriting fields
that weren't touched.
```
<|im_start|>system
You are a named entity and relationship extractor. You output only valid JSON.
<|im_end|>
<|im_start|>user
Read the conversation below and extract all named entities and the relationships between them.
Entity types: person, place, project, technology, concept, organization
...
Return this exact JSON structure:
{ "entities": [...], "relationships": [...] }
`updateProject` allowlist:
```js
const allowed = ['name', 'description', 'colour', 'icon', 'isolated', 'notes', 'system_prompt'];
Already known entities (use these exact name and type values if the same entity appears):
- "NexusAI" (project)
- "Alice" (person)
--- CONVERSATION ---
User: ...
Assistant: ...
--- END CONVERSATION ---
<|im_end|>
<|im_start|>assistant
```
## Qdrant / Semantic Layer
## Expected JSON Output
Three Qdrant collections are initialized on service startup via `semantic.initCollections()`:
| Collection | Purpose |
|---|---|
| `episodes` | Embeddings for individual conversation exchanges |
| `entities` | Embeddings for named entities |
| `summaries` | Embeddings for condensed episode summaries |
All collections use **768-dimension vectors** with **Cosine similarity**,
matching `nomic-embed-text` via Ollama. Vector size and distance metric are
defined in `@nexusai/shared` — not hardcoded here.
`initCollections()` iterates `Object.values(COLLECTIONS)` and creates any
collection that doesn't already exist at startup — all three collections are
guaranteed to exist before any requests are handled, avoiding race conditions
between the first entity embed and an entity search.
Each collection exposes upsert, search (with optional Qdrant filter), and
delete operations. The `wait: true` flag is used on all writes.
## Embedding Write Path
When a new episode is created:
1. Episode saved to SQLite synchronously — response returned immediately
2. User message + AI response combined: `User: ...\nAssistant: ...`
3. Text sent to embedding service (`POST /embed`)
4. Vector upserted into `episodes` Qdrant collection with payload `{ sessionId, createdAt }`
This step is **fire-and-forget** — if embedding fails, the episode is still
saved and searchable via FTS. The error is logged but not surfaced.
> The Qdrant payload stores `sessionId` (the internal integer ID). See
> `memory-isolation.md` for how project-level filtering works.
## Entity Layer
Entities and relationships use upsert semantics with composite unique
constraints to prevent duplicates:
- `UNIQUE(name, type)` on entities
- `UNIQUE(from_id, to_id, label)` on relationships
- `ON DELETE CASCADE` on relationship foreign keys
After each episode is saved, `extraction.js` automatically extracts named
entities from the conversation using `qwen2.5:3b` on Ollama — fire-and-forget.
> For full details on the extraction pipeline, prompt format, constrained
> decoding, stoplist, and Qdrant storage, see `entity-extraction.md`.
## Summaries Layer
Session summaries are generated by `orchestration-service/src/services/summarization.js`
after each episode write and stored here via `POST /summaries`. The memory
service is responsible only for CRUD — generation logic lives in orchestration.
> For full details on trigger conditions, prompt format, cumulative updates,
> and ChatML token stripping, see `summarization.md`.
## Project Delete Behaviour
Deleting a project runs as a transaction — it first nulls out `project_id`
on all assigned sessions, then deletes the project. This avoids a foreign
key constraint failure since `sessions.project_id` has no `ON DELETE` rule:
```js
const doDelete = db.transaction(() => {
db.prepare(`UPDATE sessions SET project_id = NULL WHERE project_id = ?`).run(id);
db.prepare(`DELETE FROM projects WHERE id = ?`).run(id);
});
```json
{
"entities": [
{ "name": "Alice", "type": "person", "notes": "Software engineer working on NexusAI." },
{ "name": "NexusAI", "type": "project", "notes": "A modular AI assistant with persistent memory." }
],
"relationships": [
{
"from": "Alice", "fromType": "person",
"to": "NexusAI", "toType": "project",
"label": "works_on",
"notes": "Alice is the primary developer."
}
]
}
```
For all HTTP endpoints, see `api-routes.md`.
Relationship labels use **snake_case verbs** (e.g. `works_on`, `manages`, `uses`,
`knows`, `located_in`, `part_of`, `created_by`).
## JSON Parsing
The raw model response is matched with `/\{[\s\S]*\}/` before parsing — this
tolerates any preamble or trailing prose the model emits alongside the JSON.
If the match fails or `JSON.parse` throws, the function logs a warning and
returns without writing anything.
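
The parsing step described above can be sketched roughly as follows. This is a minimal illustration, not the code in `extraction.js`: the helper name `parseExtraction` and the choice to default missing arrays to `[]` are assumptions.

```js
// Hypothetical helper (name is an assumption): pull the JSON object out of a
// raw model response that may carry preamble or trailing prose.
function parseExtraction(raw) {
  // Greedy match from the first "{" to the last "}" in the response.
  const match = typeof raw === 'string' ? raw.match(/\{[\s\S]*\}/) : null;
  if (!match) return null; // no JSON-looking span at all

  try {
    const parsed = JSON.parse(match[0]);
    // Assumption: tolerate a missing key by treating it as an empty list.
    return {
      entities: Array.isArray(parsed.entities) ? parsed.entities : [],
      relationships: Array.isArray(parsed.relationships) ? parsed.relationships : [],
    };
  } catch {
    return null; // malformed JSON: caller logs a warning and writes nothing
  }
}
```

Because the match is greedy, a response that contains two separate JSON objects would be captured as one span and fail to parse, which falls into the same "log and return" path.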
## Entity Processing
For each entity in `parsed.entities`:
1. Validate `name`, `type` (must be in `ENTITY_TYPES`), and not in `IGNORED_NAMES`
2. Call `upsertEntity(name, type, notes)`:
- **Insert**: creates new row with `mention_count = 1`, `source = 'extraction'`
- **Conflict** on `(name, type)`: increments `mention_count`, updates `last_seen_at`, preserves existing `notes` if new extraction returns null
3. Add to `entityMap` keyed by `"${name}::${type}"` — used for relationship resolution below
4. Call `linkEntityToEpisode(entity.id, episodeId)` — writes to `entity_episodes` join table
5. Fire-and-forget: embed as `"${name} (${type}): ${notes}"` → store to Qdrant `entities` collection with `{ name, type, notes, projectId }` in payload
**Valid entity types:** `person`, `place`, `project`, `technology`, `concept`, `organization`
**Stoplist (ignored names):** `good morning`, `good night`, `hello`, `goodbye`, `thanks`, `thank you`
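
The validation and upsert semantics in steps 1 to 3 can be illustrated with an in-memory stand-in for SQLite. This is a sketch under stated assumptions: the `Map`-based store, the `processEntities` name, the trimming and case-insensitive stoplist comparison, and the seconds-based timestamp are all illustrative choices rather than confirmed details of the service. Episode linking and embedding (steps 4 and 5) are omitted.

```js
// Assumed constants mirroring the lists documented above.
const ENTITY_TYPES = new Set(['person', 'place', 'project', 'technology', 'concept', 'organization']);
const IGNORED_NAMES = new Set(['good morning', 'good night', 'hello', 'goodbye', 'thanks', 'thank you']);

// In-memory stand-in for the SQLite upsert keyed on (name, type).
function upsertEntity(store, name, type, notes, now = Math.floor(Date.now() / 1000)) {
  const key = `${name}::${type}`;
  const existing = store.get(key);
  if (existing) {
    existing.mention_count += 1;
    existing.last_seen_at = now;
    if (notes != null) existing.notes = notes; // keep old notes when the new value is null
    return existing;
  }
  const row = {
    id: store.size + 1, name, type, notes: notes ?? null,
    mention_count: 1, source: 'extraction', last_seen_at: null,
  };
  store.set(key, row);
  return row;
}

// Validate each extracted entity, upsert it, and build the lookup map that
// relationship resolution uses afterwards.
function processEntities(store, entities) {
  const entityMap = new Map();
  for (const e of entities) {
    if (!e || typeof e.name !== 'string' || !e.name.trim()) continue;
    if (!ENTITY_TYPES.has(e.type)) continue;
    // Assumption: stoplist comparison ignores case and surrounding whitespace.
    if (IGNORED_NAMES.has(e.name.trim().toLowerCase())) continue;
    const row = upsertEntity(store, e.name.trim(), e.type, e.notes ?? null);
    entityMap.set(`${row.name}::${row.type}`, row);
  }
  return entityMap;
}
```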
## Relationship Processing
After all entities are saved, relationships are processed:
1. For each entry in `parsed.relationships`, look up both endpoints in `entityMap` using `"${from}::${fromType}"` and `"${to}::${toType}"` as keys
2. If either endpoint is missing (filtered out, invalid type, or not in this extraction), the relationship is silently skipped
3. Call `upsertRelationship(fromId, toId, label, notes)`:
- **Insert**: creates new row with `mention_count = 1`
- **Conflict** on `(from_id, to_id, label)`: increments `mention_count`, preserves existing `notes` if new is null
Relationships are unidirectional in storage. Bidirectionality is handled at
query time by the graph traversal layer.
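
A minimal sketch of the endpoint resolution in steps 1 and 2, assuming `entityMap` is a `Map` from `"name::type"` keys to rows that carry an `id`. The function name `resolveRelationships` and the decision to skip entries without a `label` are assumptions; the real code calls `upsertRelationship` for each resolved entry instead of returning a list.

```js
// Resolve extracted relationships to entity IDs, skipping any entry whose
// endpoints were filtered out or never appeared in this extraction.
function resolveRelationships(entityMap, relationships) {
  const resolved = [];
  for (const r of relationships) {
    const from = entityMap.get(`${r.from}::${r.fromType}`);
    const to = entityMap.get(`${r.to}::${r.toType}`);
    // Assumption: an entry without a label is treated like a missing endpoint.
    if (!from || !to || !r.label) continue;
    resolved.push({ fromId: from.id, toId: to.id, label: r.label, notes: r.notes ?? null });
  }
  return resolved;
}
```

The direction of each entry is kept as extracted (`from` to `to`), matching the unidirectional storage described above.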
## Project Scoping
`projectId` is threaded through from the episode creation call. It is stored
in the Qdrant entity payload, which enables project-scoped entity search in
orchestration. SQLite entities and relationships are global — scoping only
applies at the Qdrant retrieval layer.
## Error Behaviour
All steps after the initial model call are wrapped in a single outer try/catch.
If Ollama is unreachable, returns a non-200 status, or the JSON cannot be
parsed, the function logs at `warn` level and returns. There is no retry logic.
Individual entity embedding failures are caught per-entity and logged at `warn`
level without affecting other entities in the same batch.

View File


@@ -0,0 +1,213 @@
# Knowledge Graph
**Location:** `packages/memory-service/src/graph/index.js`
**Schema additions:** `entity_episodes` table; new columns on `entities` and `relationships`
**Exposed via:** `GET /graph/neighborhood/:entityId`, `POST /graph/neighbors`
**Consumed by:** Orchestration service context assembly
## Purpose
The knowledge graph transforms NexusAI from "remembers conversations" to
"understands relationships between things." Rather than injecting a flat
list of entity facts into every prompt, orchestration now retrieves a
1-hop subgraph of connected entities and their relationships, giving the
model structured, linked knowledge about people, projects, technologies,
and concepts that have appeared across conversations.
## Schema
### `entity_episodes` (join table)
Tracks which episodes contributed to each entity's knowledge. Defined in
`schema.js` — exists on all installs.
```sql
CREATE TABLE IF NOT EXISTS entity_episodes (
entity_id INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
episode_id INTEGER NOT NULL REFERENCES episodes(id) ON DELETE CASCADE,
PRIMARY KEY (entity_id, episode_id)
);
```
Both FKs cascade on delete — removing an entity or episode automatically
cleans up its join rows.
### New columns on `entities`
Added via migration in `db/index.js`:
| Column | Type | Default | Description |
|---|---|---|---|
| `mention_count` | INTEGER | 1 | How many times this entity has been extracted across conversations |
| `confidence` | REAL | 1.0 | Reserved for future confidence scoring |
| `source` | TEXT | `'extraction'` | `'extraction'` (auto) or `'manual'` |
| `last_seen_at` | INTEGER | NULL | Unix timestamp of most recent extraction hit |
### New columns on `relationships`
| Column | Type | Default | Description |
|---|---|---|---|
| `mention_count` | INTEGER | 1 | How many times this edge has been extracted |
| `notes` | TEXT | NULL | Relationship context sentence from extraction |
## Entity Promotion Model
Entities are not created equal — some are mentioned once in passing, others
recur across many conversations. `mention_count` is the signal:
- Every time `upsertEntity` is called for an existing `(name, type)` pair, `mention_count` is incremented and `last_seen_at` is updated.
- `ENTITIES.PROMOTION_THRESHOLD` (default: **3**) is the `mention_count` at which an entity is considered "well-established" — referenced in the codebase for future filtering and scoring logic.
- Currently `mention_count` is stored and incremented but not yet used to gate retrieval. It provides the foundation for future features such as orphan cleanup (entities never re-extracted) and confidence-weighted graph traversal.
The same pattern applies to relationships — `mention_count` rises each time
the same `(from_id, to_id, label)` triple is extracted.
## Graph Traversal
`src/graph/index.js` exports two functions built on SQLite's `WITH RECURSIVE`
CTE support. No external graph database is needed.
### `getNeighborhood(entityId, depth)`
Traverses the graph from a single entity, following edges in **both directions**,
up to `depth` hops. Returns `{ nodes: [...entities], edges: [...relationships] }`.
Default depth: `ENTITIES.GRAPH_HOP_DEPTH` (1). Maximum enforced at HTTP layer: 3.
**SQLite query:**
```sql
WITH RECURSIVE traverse(entity_id, depth) AS (
SELECT ?, 0
UNION
SELECT
CASE WHEN r.from_id = t.entity_id THEN r.to_id ELSE r.from_id END,
t.depth + 1
FROM relationships r
JOIN traverse t ON (r.from_id = t.entity_id OR r.to_id = t.entity_id)
WHERE t.depth < ?
)
SELECT DISTINCT entity_id FROM traverse
```
`UNION` (not `UNION ALL`) discards duplicate `(entity_id, depth)` rows, and the
`depth < ?` bound guarantees termination on cycles. A node reachable at more
than one depth is collapsed by the final `SELECT DISTINCT`.
After collecting node IDs, two follow-up queries fetch:
- All entity rows for those IDs
- All relationship rows where both `from_id` and `to_id` are in the node set
This ensures edges between neighbors are included even if they aren't on the
traversal path from the seed.
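
For readers less familiar with recursive CTEs, the same traversal can be expressed as a breadth-first search over an in-memory edge list. This is an illustrative equivalent, not the service's implementation: `getNeighborhoodInMemory` is a hypothetical name, and it takes the relationship rows as an array rather than querying SQLite.

```js
// Bidirectional, depth-limited traversal over rows shaped like
// { from_id, to_id, label }. Returns the visited node IDs plus every edge
// whose endpoints are both inside the visited set.
function getNeighborhoodInMemory(relationships, seedId, maxDepth = 1) {
  const visited = new Set([seedId]);
  let frontier = [seedId];

  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const r of relationships) {
      for (const id of frontier) {
        // Follow the edge in whichever direction leads away from `id`.
        const other = r.from_id === id ? r.to_id : r.to_id === id ? r.from_id : undefined;
        if (other !== undefined && !visited.has(other)) {
          visited.add(other);
          next.push(other);
        }
      }
    }
    frontier = next;
  }

  // Induced subgraph: includes edges between neighbors, not only traversal edges.
  const edges = relationships.filter(r => visited.has(r.from_id) && visited.has(r.to_id));
  return { nodeIds: [...visited], edges };
}
```

With edges `1→2`, `3→1`, `2→3`, and `2→4`, a depth-1 call from node 1 visits nodes 1, 2, and 3 and also returns the `2→3` edge, even though that edge is not on any path from the seed.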
### `getEntityNeighbors(entityIds[])`
Bulk 1-hop version designed for orchestration. Given multiple seed entity IDs
(the results of Qdrant semantic search), returns the combined 1-hop subgraph.
1. Finds all neighbor IDs via one query using `IN (...)` on both `from_id` and `to_id`
2. Deduplicates seeds + neighbors using a JavaScript `Set`
3. Fetches all entity rows and all relationship rows within the combined node set
This is intentionally simpler than the recursive version — orchestration always
uses depth=1, and the bulk query avoids N separate CTE calls.
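
Steps 1 and 2 might look roughly like the sketch below. The helper names `buildNeighborQuery` and `collectNodeIds` are hypothetical, and the SQL text is an assumption based on the description above; only the `relationships` table and its `from_id` / `to_id` columns come from the documented schema.

```js
// Build one parameterised query that finds every edge touching any seed ID.
// Returns null for an empty seed list so the caller can skip the query.
function buildNeighborQuery(entityIds) {
  const ids = [...new Set(entityIds)];
  if (ids.length === 0) return null;
  const placeholders = ids.map(() => '?').join(', ');
  const sql =
    'SELECT from_id, to_id FROM relationships ' +
    `WHERE from_id IN (${placeholders}) OR to_id IN (${placeholders})`;
  // The ID list is bound twice, once per IN clause.
  return { sql, params: [...ids, ...ids] };
}

// Merge seeds and both endpoints of every returned edge into one node set.
function collectNodeIds(seedIds, rows) {
  const nodeIds = new Set(seedIds);
  for (const { from_id, to_id } of rows) {
    nodeIds.add(from_id);
    nodeIds.add(to_id);
  }
  return [...nodeIds];
}
```

Using `?` placeholders rather than interpolating the IDs keeps the query safe to run through a prepared statement.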
## Graph-Aware Context Assembly
Orchestration's `assembleContext` (in `src/chat/index.js`) integrates the
graph at step 7 of the chat pipeline:
1. Qdrant entity search returns up to `ORCHESTRATION.ENTITIES_LIMIT` results, each including `r.id` (the SQLite entity ID) alongside the Qdrant payload
2. `graph.getNeighbors(entityIds)` is called with those IDs → `POST /graph/neighbors` on memory-service
3. The returned `{ nodes, edges }` is passed to `formatGraphContext()`
4. On failure, falls back to using the Qdrant payload data directly as flat nodes with no edges
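
The expansion-with-fallback flow in steps 2 to 4 can be sketched as below. This is an assumption-laden illustration: `expandEntities` is a hypothetical name, the graph client is passed in as a plain async function, and search hits are assumed to have the shape `{ id, payload }`.

```js
// Expand Qdrant entity hits into a 1-hop subgraph, falling back to the flat
// payload data (no edges) when there is nothing to expand or the call fails.
async function expandEntities(searchResults, getNeighbors) {
  const flat = {
    nodes: searchResults.map(r => ({ id: r.id, ...r.payload })),
    edges: [],
  };
  if (searchResults.length === 0) return flat;

  try {
    const { nodes, edges } = await getNeighbors(searchResults.map(r => r.id));
    return { nodes, edges };
  } catch {
    // Non-critical step: a graph failure must not break the chat pipeline.
    return flat;
  }
}
```

Either branch returns the same `{ nodes, edges }` shape, so the formatting step does not need to know whether the graph call succeeded.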
### Prompt Format
`formatGraphContext(nodes, edges)` in `chat/index.js` formats the subgraph as:
```
Here is what you know about entities relevant to this conversation and their connections:
- Alice (person): software engineer working on NexusAI
→ works_on NexusAI (project)
→ knows Bob (person)
- NexusAI (project): AI assistant framework
- Bob (person): Alice's colleague
```
- One line per node: `- {name} ({type}): {notes}`
- Outbound edges indented below: ` → {label} {target_name} ({target_type})`
- Nodes with only inbound edges (pulled in as neighbors) appear without connection lines
- Only outbound edges are shown — each relationship appears once, from the `from_id` side
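
A formatter that follows the rules above could be sketched like this. It is not the code in `chat/index.js`: the two-space indent before the arrow, the empty-string result for an empty node list, and the handling of missing `notes` are assumptions.

```js
// Render a { nodes, edges } subgraph as the entity block of the prompt.
function formatGraphContext(nodes, edges) {
  if (!nodes || nodes.length === 0) return '';
  const byId = new Map(nodes.map(n => [n.id, n]));
  const lines = [
    'Here is what you know about entities relevant to this conversation and their connections:',
  ];

  for (const node of nodes) {
    const notes = node.notes ? `: ${node.notes}` : '';
    lines.push(`- ${node.name} (${node.type})${notes}`);
    // Outbound edges only, so each relationship is printed once.
    for (const edge of edges || []) {
      if (edge.from_id !== node.id) continue;
      const target = byId.get(edge.to_id);
      if (target) lines.push(`  → ${edge.label} ${target.name} (${target.type})`);
    }
  }
  return lines.join('\n');
}
```

Edges whose target is not in `nodes` are dropped rather than printed with a bare ID, which keeps the block readable if the subgraph is ever truncated.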
## Project Scoping
The knowledge graph respects project boundaries at the **entry point**, not
during traversal:
- Qdrant entity search is filtered by `projectId` — only entities tagged with this project are returned as seeds
- Graph traversal in SQLite is unfiltered — neighbors can be from any project or no project
- This is intentional: the graph entry is project-scoped, but traversal follows the global relationship graph to discover connected knowledge
Entities are tagged with `projectId` in the Qdrant payload at extraction time.
Entities extracted from non-project sessions have `projectId: null` and only
appear in unfiltered global searches.
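
The entry-point scoping can be illustrated with a small filter builder. The helper name `buildEntityFilter` is hypothetical, and the choice to send no filter when `projectId` is null is an assumption drawn from the description above; the `must` / `key` / `match` structure follows Qdrant's payload filter format.

```js
// Build the Qdrant payload filter for entity search. A null or undefined
// projectId yields no filter, which corresponds to an unfiltered global search.
function buildEntityFilter(projectId) {
  if (projectId == null) return undefined;
  return { must: [{ key: 'projectId', match: { value: projectId } }] };
}
```

The filter only constrains which entities are returned as seeds; the subsequent SQLite traversal receives plain entity IDs and applies no project condition.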
## API Reference
### `GET /graph/neighborhood/:entityId`
Returns the neighborhood of a single entity.
**Query params:**
| Param | Default | Max | Description |
|---|---|---|---|
| `depth` | `ENTITIES.GRAPH_HOP_DEPTH` (1) | 3 | Traversal depth |
**Response:**
```json
{
"entity": { "id": 5, "name": "Alice", "type": "person", "notes": "...", "mention_count": 4 },
"neighborhood": {
"nodes": [
{ "id": 5, "name": "Alice", "type": "person", "notes": "..." },
{ "id": 8, "name": "NexusAI", "type": "project", "notes": "..." }
],
"edges": [
{ "id": 2, "from_id": 5, "to_id": 8, "label": "works_on", "notes": "...", "mention_count": 3 }
]
}
}
```
Returns 404 if the entity does not exist.
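
One way to apply the default and maximum for `depth` at the HTTP layer is sketched below. Whether the service clamps out-of-range values or rejects them is not stated here, so clamping is an assumption, as is the helper name `parseDepth`.

```js
// Parse the `depth` query parameter: fall back to the default when it is
// missing or invalid, and cap it at the documented maximum.
function parseDepth(raw, defaultDepth = 1, maxDepth = 3) {
  const n = Number.parseInt(raw, 10);
  if (!Number.isInteger(n) || n < 1) return defaultDepth;
  return Math.min(n, maxDepth);
}
```

Capping the depth keeps the recursive query bounded even when a client requests a very large value.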
### `POST /graph/neighbors`
Bulk 1-hop neighborhood for a set of entity IDs. Used internally by
orchestration — not intended for direct client use.
**Request body:**
```json
{ "entityIds": [5, 8, 12] }
```
**Response:**
```json
{
"nodes": [ ...entity objects... ],
"edges": [ ...relationship objects... ]
}
```
Returns 400 if `entityIds` is missing or empty.
## Constants (`packages/shared/src/config/constants.js`)
| Constant | Value | Description |
|---|---|---|
| `ENTITIES.PROMOTION_THRESHOLD` | 3 | `mention_count` at which an entity is considered well-established |
| `ENTITIES.GRAPH_HOP_DEPTH` | 1 | Default traversal depth for neighborhood queries |
| `ORCHESTRATION.ENTITIES_LIMIT` | 5 | Max entity seeds returned from Qdrant search |
| `ORCHESTRATION.ENTITIES_THRESHOLD` | 0.55 | Minimum similarity score for entity Qdrant search |


@@ -9,8 +9,8 @@
Responsible for all reading and writing of long-term memory. Acts as the
sole interface to both SQLite and Qdrant — no other service accesses these
stores directly. On episode creation, automatically calls the embedding
service to generate and store a vector in Qdrant.
stores directly. On episode creation, automatically triggers entity and
relationship extraction and embeds results into Qdrant.
## Dependencies
@@ -45,19 +45,22 @@ src/
├── semantic/
│ └── index.js # Qdrant collection management, upsert, search, delete
├── entities/
│ ├── index.js # Entity + relationship CRUD
│ └── extraction.js # Automatic entity extraction via qwen2.5:3b on Ollama
│ ├── index.js # Entity + relationship CRUD (upsert, mention tracking)
│ └── extraction.js # Automatic entity + relationship extraction via qwen2.5:3b
├── graph/
│ └── index.js # Knowledge graph traversal (neighborhood queries, recursive CTE)
└── index.js # Express app + all route definitions
```
## SQLite Schema
Seven core tables:
Eight core tables:
- **sessions** — top-level conversation containers. Fields: `external_id`, `name`, `project_id`, `metadata`
- **episodes** — individual exchanges (user message + AI response) tied to a session
- **entities** — named things the system learns about (people, places, concepts)
- **relationships** — directional labeled links between entities
- **entities** — named things the system learns about (people, places, concepts, etc.). Fields include `mention_count`, `confidence`, `source`, `last_seen_at`
- **relationships** — directional labeled links between entities (`from_id`, `to_id`, `label`). Fields include `mention_count`, `notes`
- **entity_episodes** — join table linking entities to the episodes where they were extracted. Used for provenance and orphan cleanup
- **summaries** — condensed episode groups for efficient context retrieval
- **projects** — named groupings of sessions with `name`, `description`, `colour`, `icon`, `isolated`, `notes`, `system_prompt`
@@ -73,10 +76,18 @@ try { db.exec(`CREATE INDEX IF NOT EXISTS idx_sessions_project ON sessions(proje
try { db.exec(`ALTER TABLE projects ADD COLUMN isolated INTEGER NOT NULL DEFAULT 0`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN notes TEXT`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN system_prompt TEXT`); } catch {}
// Knowledge graph columns:
try { db.exec(`ALTER TABLE entities ADD COLUMN mention_count INTEGER NOT NULL DEFAULT 1`) } catch {}
try { db.exec(`ALTER TABLE entities ADD COLUMN confidence REAL NOT NULL DEFAULT 1.0`) } catch {}
try { db.exec(`ALTER TABLE entities ADD COLUMN source TEXT NOT NULL DEFAULT 'extraction'`) } catch {}
try { db.exec(`ALTER TABLE entities ADD COLUMN last_seen_at INTEGER`) } catch {}
try { db.exec(`ALTER TABLE relationships ADD COLUMN mention_count INTEGER NOT NULL DEFAULT 1`) } catch {}
try { db.exec(`ALTER TABLE relationships ADD COLUMN notes TEXT`) } catch {}
```
New migrations are always appended here — never modify the schema file for
existing tables since `ALTER TABLE` cannot use `IF NOT EXISTS`.
`entity_episodes` is defined in `schema.js` itself (not a migration) since it is a new table.
New migrations are always appended — never modify the schema file for existing tables since `ALTER TABLE` cannot use `IF NOT EXISTS`.
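
The `try { ... } catch {}` pattern above swallows every error, including ones unrelated to the column already existing. A slightly stricter variant is sketched below as an alternative, not as what `db/index.js` does. The helper name `addColumnIfMissing` is hypothetical; it assumes `db.exec` throws an error whose message contains SQLite's `duplicate column name` text when the column is already present.

```js
// Idempotent ADD COLUMN: ignore only the "already exists" case and rethrow
// anything else (missing table, syntax error, locked database, ...).
function addColumnIfMissing(db, table, columnDef) {
  try {
    db.exec(`ALTER TABLE ${table} ADD COLUMN ${columnDef}`);
    return true; // column was added on this run
  } catch (err) {
    if (/duplicate column name/i.test(String(err && err.message))) return false;
    throw err;
  }
}
```

A call such as `addColumnIfMissing(db, 'relationships', 'notes TEXT')` would then behave like the existing one-liners on repeat runs while still surfacing genuine failures at startup.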
### FTS5 Full-Text Search
@@ -117,8 +128,7 @@ defined in `@nexusai/shared` — not hardcoded here.
`initCollections()` iterates `Object.values(COLLECTIONS)` and creates any
collection that doesn't already exist at startup — all three collections are
guaranteed to exist before any requests are handled, avoiding race conditions
between the first entity embed and an entity search.
guaranteed to exist before any requests are handled.
Each collection exposes upsert, search (with optional Qdrant filter), and
delete operations. The `wait: true` flag is used on all writes.
@@ -143,15 +153,27 @@ saved and searchable via FTS. The error is logged but not surfaced.
Entities and relationships use upsert semantics with composite unique
constraints to prevent duplicates:
- `UNIQUE(name, type)` on entities
- `UNIQUE(from_id, to_id, label)` on relationships
- `UNIQUE(name, type)` on entities — conflict increments `mention_count` and updates `last_seen_at`
- `UNIQUE(from_id, to_id, label)` on relationships — conflict increments `mention_count` and preserves existing `notes`
- `ON DELETE CASCADE` on relationship foreign keys
After each episode is saved, `extraction.js` automatically extracts named
entities from the conversation using `qwen2.5:3b` on Ollama — fire-and-forget.
entities **and relationships** from the conversation using `qwen2.5:3b` on
Ollama — fire-and-forget. Each saved entity is also linked to the episode
via the `entity_episodes` join table.
> For full details on the extraction pipeline, prompt format, constrained
> decoding, stoplist, and Qdrant storage, see `entity-extraction.md`.
> For full details on the extraction pipeline and JSON format, see `entity-extraction.md`.
> For the knowledge graph traversal layer, see `knowledge-graph.md`.
## Knowledge Graph Layer
`src/graph/index.js` provides SQLite-based graph traversal over the entities
and relationships tables. Two functions are exposed via HTTP:
- **`getNeighborhood(entityId, depth)`** — recursive CTE traversal, bidirectional, returns `{ nodes, edges }`
- **`getEntityNeighbors(entityIds[])`** — bulk 1-hop traversal for orchestration context assembly
> For design rationale, traversal queries, and integration with orchestration, see `knowledge-graph.md`.
## Summaries Layer
@@ -175,4 +197,4 @@ const doDelete = db.transaction(() => {
});
```
For all HTTP endpoints, see `api-routes.md`.
For all HTTP endpoints, see `api-routes.md`.


@@ -42,9 +42,10 @@ src/
│ ├── inference.js # HTTP client for inference service
│ ├── embedding.js # HTTP client for embedding service
│ ├── qdrant.js # HTTP client for Qdrant (direct vector search)
│ ├── graph.js # HTTP client for memory-service graph endpoints
│ └── summarization.js # Session summarisation — triggers after each episode
├── chat/
│ └── index.js # Core pipeline — context assembly, isolation, auto-naming
│ └── index.js # Core pipeline — context assembly, graph expansion, auto-naming
├── config/
│ └── settings.js # Settings load/save — reads/writes data/settings.json
├── routes/
@@ -71,7 +72,7 @@ via `appSettings.load()` — changes apply immediately without a service restart
|---|---|---|
| `recentEpisodeLimit` | 5 | Recent episodes injected into prompt |
| `semanticLimit` | 5 | Semantic search results injected into prompt |
| `scoreThreshold` | 0.75 | Minimum similarity score for semantic results |
| `scoreThreshold` | 0.5 | Minimum similarity score for semantic results |
| `modelsFolderPath` | `/mnt/nexus-models` | Path to folder containing .gguf files |
| `temperature` | 0.7 | Inference temperature |
| `repeatPenalty` | 1.1 | Repeat token penalty |
@@ -104,20 +105,27 @@ difference is how the inference response is delivered to the client.
episodes. Deduplicated against recent episodes. Non-critical.
6. **Entity search** — query `entities` Qdrant collection filtered by
`projectId`. Non-project sessions receive no entity context. Non-critical.
`projectId`. Returns entity IDs alongside Qdrant payload data (the Qdrant
point ID equals the SQLite entity ID). Non-critical.
7. **Prompt assembly** — combine system prompt, entity context, semantic
7. **Graph neighborhood expansion** — call `POST /graph/neighbors` on
memory-service with the entity IDs from step 6. Returns a 1-hop subgraph
`{ nodes, edges }` — entity objects plus the relationships connecting them.
If no entities were found or the graph call fails, falls back to flat entity
list (no edges). Non-critical.
8. **Prompt assembly** — combine system prompt, graph context, semantic
episodes, recent episodes, and user message.
8. **Inference** — send to inference service. `/chat` awaits full response;
9. **Inference** — send to inference service. `/chat` awaits full response;
`/chat/stream` pipes SSE chunks to the client.
9. **Episode write** — write exchange back to memory with `projectId`.
10. **Episode write** — write exchange back to memory with `projectId`.
10. **Summarisation trigger**: `triggerSummary(session, allEpisodes)` called
11. **Summarisation trigger**: `triggerSummary(session, allEpisodes)` called
fire-and-forget. See `summarization.md` for full details.
11. **Auto-naming** — on first message with no session name, fires a secondary
12. **Auto-naming** — on first message with no session name, fires a secondary
inference call (max 20 tokens, temperature 0.3) to generate a session name.
### Prompt Structure
@@ -125,8 +133,9 @@ difference is how the inference response is delivered to the client.
```
[Resolved system prompt]
Here is what you know about entities relevant to this conversation:
Here is what you know about entities relevant to this conversation and their connections:
- {name} ({type}): {notes}
→ {label} {neighbor_name} ({neighbor_type})
---
Here are some relevant memories from earlier conversations:
User: {past user message}
@@ -141,6 +150,12 @@ User: {current message}
Assistant:
```
The entity block renders the full graph neighborhood — seed entities matched
by Qdrant search plus any neighbors pulled in by 1-hop traversal. Each entity
shows its `notes` and any outbound relationships with their targets. Neighbor
nodes that have no outbound edges within the subgraph appear without connection
lines.
## Summarisation
After each episode write, `triggerSummary` is called fire-and-forget. It
@@ -199,4 +214,7 @@ handle /health* { reverse_proxy localhost:4000 }
After updating: `caddy reload --config /path/to/Caddyfile`
For all HTTP endpoints, see `api-routes.md`.
> Note: `/graph` routes are on the memory-service (port 3002) and are called
> internally by orchestration — they do not need a Caddy entry.
For all HTTP endpoints, see `api-routes.md`.