roadmap phase 1 complete

Storme-bit
2026-04-27 03:10:39 -07:00
parent 9fe8e568cf
commit 1a97b19280
19 changed files with 759 additions and 281 deletions

@@ -9,8 +9,8 @@
Responsible for all reading and writing of long-term memory. Acts as the
sole interface to both SQLite and Qdrant — no other service accesses these
-stores directly. On episode creation, automatically calls the embedding
-service to generate and store a vector in Qdrant.
+stores directly. On episode creation, automatically triggers entity and
+relationship extraction and embeds results into Qdrant.
## Dependencies
@@ -45,19 +45,22 @@ src/
├── semantic/
│ └── index.js # Qdrant collection management, upsert, search, delete
├── entities/
-│ ├── index.js # Entity + relationship CRUD
-│ └── extraction.js # Automatic entity extraction via qwen2.5:3b on Ollama
+│ ├── index.js # Entity + relationship CRUD (upsert, mention tracking)
+│ └── extraction.js # Automatic entity + relationship extraction via qwen2.5:3b
+├── graph/
+│ └── index.js # Knowledge graph traversal (neighborhood queries, recursive CTE)
└── index.js # Express app + all route definitions
```
## SQLite Schema
-Seven core tables:
+Eight core tables:
- **sessions** — top-level conversation containers. Fields: `external_id`, `name`, `project_id`, `metadata`
- **episodes** — individual exchanges (user message + AI response) tied to a session
-- **entities** — named things the system learns about (people, places, concepts)
-- **relationships** — directional labeled links between entities
+- **entities** — named things the system learns about (people, places, concepts, etc.). Fields include `mention_count`, `confidence`, `source`, `last_seen_at`
+- **relationships** — directional labeled links between entities (`from_id`, `to_id`, `label`). Fields include `mention_count`, `notes`
+- **entity_episodes** — join table linking entities to the episodes where they were extracted. Used for provenance and orphan cleanup
- **summaries** — condensed episode groups for efficient context retrieval
- **projects** — named groupings of sessions with `name`, `description`, `colour`, `icon`, `isolated`, `notes`, `system_prompt`
@@ -73,10 +76,18 @@ try { db.exec(`CREATE INDEX IF NOT EXISTS idx_sessions_project ON sessions(proje
try { db.exec(`ALTER TABLE projects ADD COLUMN isolated INTEGER NOT NULL DEFAULT 0`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN notes TEXT`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN system_prompt TEXT`); } catch {}
+// Knowledge graph columns:
+try { db.exec(`ALTER TABLE entities ADD COLUMN mention_count INTEGER NOT NULL DEFAULT 1`) } catch {}
+try { db.exec(`ALTER TABLE entities ADD COLUMN confidence REAL NOT NULL DEFAULT 1.0`) } catch {}
+try { db.exec(`ALTER TABLE entities ADD COLUMN source TEXT NOT NULL DEFAULT 'extraction'`) } catch {}
+try { db.exec(`ALTER TABLE entities ADD COLUMN last_seen_at INTEGER`) } catch {}
+try { db.exec(`ALTER TABLE relationships ADD COLUMN mention_count INTEGER NOT NULL DEFAULT 1`) } catch {}
+try { db.exec(`ALTER TABLE relationships ADD COLUMN notes TEXT`) } catch {}
```
-New migrations are always appended here — never modify the schema file for
-existing tables since `ALTER TABLE` cannot use `IF NOT EXISTS`.
+`entity_episodes` is defined in `schema.js` itself (not a migration) since it is a new table.
+New migrations are always appended — never modify the schema file for existing tables since `ALTER TABLE` cannot use `IF NOT EXISTS`.
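The bare `try { ... } catch {}` pattern above makes each `ALTER TABLE` safe to re-run, but it also hides unrelated failures. A minimal sketch of a stricter variant follows. `runMigration` is a hypothetical helper, not part of the service, and it assumes the driver surfaces SQLite's own error text ("duplicate column name: ...") on `err.message`.

```javascript
// Hypothetical helper (not in the codebase): run one additive migration and
// tolerate only the "column already exists" case.
// `db` is assumed to expose exec(sql), as in the migration lines above.
function runMigration(db, sql) {
  try {
    db.exec(sql);
    return true; // applied during this start-up
  } catch (err) {
    // SQLite reports a repeated ADD COLUMN as "duplicate column name: <col>".
    if (/duplicate column name/i.test(String(err && err.message))) {
      return false; // already applied on an earlier start-up
    }
    throw err; // syntax errors, locked database, etc. should stay visible
  }
}
```

Called once per statement at start-up, this keeps the append-only migration list while letting a typo in a new migration fail loudly.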
### FTS5 Full-Text Search
@@ -117,8 +128,7 @@ defined in `@nexusai/shared` — not hardcoded here.
`initCollections()` iterates `Object.values(COLLECTIONS)` and creates any
collection that doesn't already exist at startup — all three collections are
-guaranteed to exist before any requests are handled, avoiding race conditions
-between the first entity embed and an entity search.
+guaranteed to exist before any requests are handled.
Each collection exposes upsert, search (with optional Qdrant filter), and
delete operations. The `wait: true` flag is used on all writes.
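The start-up behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the service's code: `client` is assumed to expose `getCollections()`, `createCollection()` and `upsert()` in the style of `@qdrant/js-client-rest`, the collection map and vector size are passed in as parameters, and the `Cosine` distance is a placeholder choice.

```javascript
// Sketch only: create every collection named in `collections` that the
// Qdrant server does not already have. Returns the names it created.
async function initCollections(client, collections, vectorSize) {
  const { collections: existing } = await client.getCollections();
  const have = new Set(existing.map((c) => c.name));
  const created = [];
  for (const name of Object.values(collections)) {
    if (have.has(name)) continue; // already present, nothing to do
    await client.createCollection(name, {
      vectors: { size: vectorSize, distance: 'Cosine' }, // assumed settings
    });
    created.push(name);
  }
  return created;
}

// Sketch of a write using wait: true, so the call resolves only after the
// point has been applied and is visible to subsequent searches.
async function upsertPoint(client, collection, point) {
  return client.upsert(collection, { wait: true, points: [point] });
}
```

Awaiting `initCollections` before the HTTP server starts listening is what gives the "exists before any request" guarantee.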
@@ -143,15 +153,27 @@ saved and searchable via FTS. The error is logged but not surfaced.
Entities and relationships use upsert semantics with composite unique
constraints to prevent duplicates:
-- `UNIQUE(name, type)` on entities
-- `UNIQUE(from_id, to_id, label)` on relationships
+- `UNIQUE(name, type)` on entities — conflict increments `mention_count` and updates `last_seen_at`
+- `UNIQUE(from_id, to_id, label)` on relationships — conflict increments `mention_count` and preserves existing `notes`
- `ON DELETE CASCADE` on relationship foreign keys
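The conflict behaviour in the list above maps naturally onto SQLite's `ON CONFLICT ... DO UPDATE` clause. The statements below are a sketch, not the service's actual queries: the helper names are hypothetical, `db` is assumed to offer a `prepare(sql).run(...)` interface, and the unit of `last_seen_at` (milliseconds here) is an assumption.

```javascript
// Sketch: a new row starts with mention_count at its column default of 1;
// a repeat mention bumps the counter and refreshes last_seen_at.
const UPSERT_ENTITY_SQL = `
  INSERT INTO entities (name, type, last_seen_at)
  VALUES (?, ?, ?)
  ON CONFLICT(name, type) DO UPDATE SET
    mention_count = mention_count + 1,
    last_seen_at  = excluded.last_seen_at
`;

// Sketch: notes is deliberately absent from DO UPDATE, so a conflict leaves
// the existing notes value untouched.
const UPSERT_RELATIONSHIP_SQL = `
  INSERT INTO relationships (from_id, to_id, label, notes)
  VALUES (?, ?, ?, ?)
  ON CONFLICT(from_id, to_id, label) DO UPDATE SET
    mention_count = mention_count + 1
`;

// Hypothetical wrappers around the two statements.
function upsertEntity(db, name, type, now = Date.now()) {
  return db.prepare(UPSERT_ENTITY_SQL).run(name, type, now);
}

function upsertRelationship(db, fromId, toId, label, notes = null) {
  return db.prepare(UPSERT_RELATIONSHIP_SQL).run(fromId, toId, label, notes);
}
```

Because the uniqueness check and the increment happen in one statement, repeated extraction of the same entity never needs a separate read-then-write step.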
After each episode is saved, `extraction.js` automatically extracts named
-entities from the conversation using `qwen2.5:3b` on Ollama — fire-and-forget.
+entities **and relationships** from the conversation using `qwen2.5:3b` on
+Ollama — fire-and-forget. Each saved entity is also linked to the episode
+via the `entity_episodes` join table.
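"Fire-and-forget" here means the episode write returns without waiting for extraction, and an extraction failure is logged rather than propagated. One way to express that pattern is sketched below. `extractInBackground` and its parameters are hypothetical names for illustration; the real call site in `extraction.js` may look different.

```javascript
// Sketch: start extraction without awaiting it. The caller continues
// immediately; any failure is logged and never reaches the caller.
function extractInBackground(extractFn, episode, log = console.error) {
  Promise.resolve()
    .then(() => extractFn(episode)) // runs after the current call stack unwinds
    .catch((err) => log('entity extraction failed', err));
}
```

The trailing `.catch` matters: without it, a rejected extraction promise would surface as an unhandled rejection instead of a log line.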
-> For full details on the extraction pipeline, prompt format, constrained
-> decoding, stoplist, and Qdrant storage, see `entity-extraction.md`.
+> For full details on the extraction pipeline and JSON format, see `entity-extraction.md`.
+> For the knowledge graph traversal layer, see `knowledge-graph.md`.
+## Knowledge Graph Layer
+`src/graph/index.js` provides SQLite-based graph traversal over the entities
+and relationships tables. Two functions are exposed via HTTP:
+- **`getNeighborhood(entityId, depth)`** — recursive CTE traversal, bidirectional, returns `{ nodes, edges }`
+- **`getEntityNeighbors(entityIds[])`** — bulk 1-hop traversal for orchestration context assembly
+> For design rationale, traversal queries, and integration with orchestration, see `knowledge-graph.md`.
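To make "bidirectional, depth-limited" concrete, here is an in-memory breadth-first version of the same idea. It is not the recursive CTE the service uses, only a sketch of the traversal semantics. The function name, the `hops` field and the node shape are assumptions; edges use the `from_id` / `to_id` fields from the schema.

```javascript
// Sketch: collect every entity reachable from startId within `depth` hops,
// following relationships in either direction, plus the edges traversed.
function getNeighborhoodInMemory(edges, startId, depth) {
  const seen = new Map([[startId, 0]]); // entity id -> hop distance
  const keptEdges = new Set();
  let frontier = [startId];
  for (let hop = 1; hop <= depth && frontier.length > 0; hop++) {
    const next = [];
    for (const edge of edges) {
      const touches =
        frontier.includes(edge.from_id) || frontier.includes(edge.to_id);
      if (!touches) continue;
      keptEdges.add(edge);
      // Bidirectional: whichever endpoint is new joins the next frontier.
      for (const id of [edge.from_id, edge.to_id]) {
        if (!seen.has(id)) {
          seen.set(id, hop);
          next.push(id);
        }
      }
    }
    frontier = next;
  }
  return {
    nodes: [...seen].map(([id, hops]) => ({ id, hops })),
    edges: [...keptEdges],
  };
}
```

With `depth` set to 1 this reduces to a single-hop lookup, which is the case `getEntityNeighbors` handles in bulk for several starting entities at once.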
## Summaries Layer
@@ -175,4 +197,4 @@ const doDelete = db.transaction(() => {
});
```
-For all HTTP endpoints, see `api-routes.md`.
+For all HTTP endpoints, see `api-routes.md`.