
Storme-bit 1a97b19280 roadmap phase 1 complete

2026-04-27 03:10:39 -07:00


Memory Service

Package: @nexusai/memory-service
Location: packages/memory-service
Deployed on: Mini PC 1 (192.168.0.81)
Port: 3002

Purpose

Responsible for all reading and writing of long-term memory. Acts as the sole interface to both SQLite and Qdrant — no other service accesses these stores directly. On episode creation, automatically triggers entity and relationship extraction and embeds results into Qdrant.

Dependencies

express — HTTP API
better-sqlite3 — SQLite driver
@qdrant/js-client-rest — Qdrant vector store client
dotenv — environment variable loading
@nexusai/shared — shared utilities and constants

Environment Variables

Variable	Required	Default	Description
PORT	No	3002	Port to listen on
SQLITE_PATH	Yes	—	Path to SQLite database file
QDRANT_URL	No	http://localhost:6333	Qdrant instance URL
EMBEDDING_SERVICE_URL	No	http://localhost:3003	Embedding service URL
EXTRACTION_URL	No	http://localhost:11434	Ollama URL for entity extraction
EXTRACTION_MODEL	No	qwen2.5:3b	Ollama model used for entity extraction
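
As an illustration of how the table above could translate into startup code, here is a minimal sketch. The helper name loadConfig and the shape of the returned object are assumptions made for this example, not the service's actual code; only the variable names and defaults come from the table.

```javascript
// Sketch only: loadConfig is a hypothetical helper, not the service's actual code.
function loadConfig(env = process.env) {
  if (!env.SQLITE_PATH) {
    // SQLITE_PATH is the only variable the table marks as required.
    throw new Error('SQLITE_PATH is required');
  }
  return {
    port: Number(env.PORT) || 3002,
    sqlitePath: env.SQLITE_PATH,
    qdrantUrl: env.QDRANT_URL || 'http://localhost:6333',
    embeddingServiceUrl: env.EMBEDDING_SERVICE_URL || 'http://localhost:3003',
    extractionUrl: env.EXTRACTION_URL || 'http://localhost:11434',
    extractionModel: env.EXTRACTION_MODEL || 'qwen2.5:3b',
  };
}
```

Failing fast on a missing SQLITE_PATH surfaces a misconfigured deployment at startup rather than on the first query.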

Internal Structure

src/
├── db/
│   ├── index.js       # SQLite connection + initialization + migrations
│   ├── schema.js      # Table definitions, indexes, FTS5, triggers
│   ├── projects.js    # Project CRUD functions
│   └── summaries.js   # Summary CRUD functions
├── episodic/
│   └── index.js       # Session + episode CRUD, FTS search, embedding write path
├── semantic/
│   └── index.js       # Qdrant collection management, upsert, search, delete
├── entities/
│   ├── index.js       # Entity + relationship CRUD (upsert, mention tracking)
│   └── extraction.js  # Automatic entity + relationship extraction via qwen2.5:3b
├── graph/
│   └── index.js       # Knowledge graph traversal (neighborhood queries, recursive CTE)
└── index.js           # Express app + all route definitions

SQLite Schema

Core tables:

sessions — top-level conversation containers. Fields: external_id, name, project_id, metadata
episodes — individual exchanges (user message + AI response) tied to a session
entities — named things the system learns about (people, places, concepts, etc.). Fields include mention_count, confidence, source, last_seen_at
relationships — directional labeled links between entities (from_id, to_id, label). Fields include mention_count, notes
entity_episodes — join table linking entities to the episodes where they were extracted. Used for provenance and orphan cleanup
summaries — condensed episode groups for efficient context retrieval
projects — named groupings of sessions with name, description, colour, icon, isolated, notes, system_prompt

Migrations

Schema changes that cannot use CREATE TABLE IF NOT EXISTS are applied as idempotent migrations in db/index.js at startup:

try { db.exec(`ALTER TABLE sessions ADD COLUMN name TEXT`); } catch {}
try { db.exec(`ALTER TABLE sessions ADD COLUMN project_id INTEGER REFERENCES projects(id)`); } catch {}
try { db.exec(`CREATE INDEX IF NOT EXISTS idx_sessions_project ON sessions(project_id)`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN isolated INTEGER NOT NULL DEFAULT 0`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN notes TEXT`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN system_prompt TEXT`); } catch {}
// Knowledge graph columns:
try { db.exec(`ALTER TABLE entities ADD COLUMN mention_count INTEGER NOT NULL DEFAULT 1`); } catch {}
try { db.exec(`ALTER TABLE entities ADD COLUMN confidence REAL NOT NULL DEFAULT 1.0`); } catch {}
try { db.exec(`ALTER TABLE entities ADD COLUMN source TEXT NOT NULL DEFAULT 'extraction'`); } catch {}
try { db.exec(`ALTER TABLE entities ADD COLUMN last_seen_at INTEGER`); } catch {}
try { db.exec(`ALTER TABLE relationships ADD COLUMN mention_count INTEGER NOT NULL DEFAULT 1`); } catch {}
try { db.exec(`ALTER TABLE relationships ADD COLUMN notes TEXT`); } catch {}

entity_episodes is defined in schema.js itself (not a migration) since it is a new table.

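New migrations are always appended. Never edit the definition of an existing table in schema.js: CREATE TABLE IF NOT EXISTS skips tables that already exist, so the change would never reach a deployed database. ALTER TABLE ... ADD COLUMN has no IF NOT EXISTS form, which is why each migration is wrapped in try/catch.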
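
An empty catch block also hides unrelated failures such as a typo in the SQL. One possible variant, sketched here under assumptions, ignores only the error that a re-run migration is expected to raise. The helper name applyMigrations is hypothetical, and the sketch assumes the driver surfaces SQLite's "duplicate column name" message unchanged.

```javascript
// Sketch only: applyMigrations is a hypothetical helper. `db` is assumed to be a
// better-sqlite3 Database (anything with an exec(sql) method works here).
function applyMigrations(db, statements) {
  const applied = [];
  for (const sql of statements) {
    try {
      db.exec(sql);
      applied.push(sql);
    } catch (err) {
      // Assumption: a re-added column is reported as "duplicate column name: <col>".
      // Anything else (typo, missing table) is rethrown instead of being hidden.
      if (!/duplicate column name/i.test(String(err && err.message))) throw err;
    }
  }
  return applied; // statements that actually changed the schema on this run
}
```

The statements array would hold the same ALTER TABLE strings listed above, in the same order.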

FTS5 Full-Text Search

An episodes_fts virtual table enables keyword search across all episodes. Three triggers (episodes_fts_insert, episodes_fts_update, episodes_fts_delete) keep the FTS index automatically in sync with the episodes table.
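
The trigger pattern can be sketched as follows. This assumes episodes_fts is an external-content FTS5 table keyed on episodes.id, which the text above does not confirm. The helper buildFtsTriggers and the column names in the usage note are hypothetical; only the table and trigger names come from this document.

```javascript
// Sketch only. Assumptions: episodes_fts is an external-content FTS5 table over
// episodes, episodes has an integer id column, and the caller supplies the
// indexed column names.
function buildFtsTriggers(columns) {
  const cols = columns.join(', ');
  const newVals = columns.map((c) => `new.${c}`).join(', ');
  const oldVals = columns.map((c) => `old.${c}`).join(', ');
  // FTS5 removes a row from an external-content index via the special 'delete' command.
  const remove = `INSERT INTO episodes_fts(episodes_fts, rowid, ${cols}) VALUES ('delete', old.id, ${oldVals});`;
  const add = `INSERT INTO episodes_fts(rowid, ${cols}) VALUES (new.id, ${newVals});`;
  return [
    `CREATE TRIGGER IF NOT EXISTS episodes_fts_insert AFTER INSERT ON episodes BEGIN ${add} END;`,
    `CREATE TRIGGER IF NOT EXISTS episodes_fts_delete AFTER DELETE ON episodes BEGIN ${remove} END;`,
    `CREATE TRIGGER IF NOT EXISTS episodes_fts_update AFTER UPDATE ON episodes BEGIN ${remove} ${add} END;`,
  ];
}
```

For example, buildFtsTriggers(['user_message', 'ai_response']) returns three statements that could be passed to db.exec() one by one; both column names are placeholders.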

SQLite Configuration

journal_mode = WAL — non-blocking reads during writes
foreign_keys = ON — enforces referential integrity and cascade deletes
Both PRAGMAs are set via db.pragma(), not db.exec()
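
A minimal sketch of the configuration above, assuming db is a better-sqlite3 Database. The wrapper name configureDatabase is hypothetical.

```javascript
// Sketch only: configureDatabase is a hypothetical helper. `db` is assumed to be
// a better-sqlite3 Database, whose pragma() method runs a PRAGMA statement.
function configureDatabase(db) {
  db.pragma('journal_mode = WAL'); // write-ahead log: readers are not blocked by a writer
  db.pragma('foreign_keys = ON');  // enforce foreign key constraints on this connection
  return db;
}
```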

Dynamic Updates

Both updateSession and updateProject build their SET clause dynamically from only the fields that were passed, so a partial update cannot overwrite fields the caller did not touch.

updateProject allowlist:

const allowed = ['name', 'description', 'colour', 'icon', 'isolated', 'notes', 'system_prompt'];
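
The dynamic SET clause could be built along these lines. The helper buildUpdate is a hypothetical name for illustration, and treating undefined as "not passed" is an assumption; only the allowlist itself comes from this document.

```javascript
// Sketch only: buildUpdate is a hypothetical helper, not the service's code.
const PROJECT_FIELDS = ['name', 'description', 'colour', 'icon', 'isolated', 'notes', 'system_prompt'];

function buildUpdate(table, id, fields, allowed) {
  // Keep only allowlisted keys that were actually passed (undefined = not passed).
  const keys = allowed.filter((k) => fields[k] !== undefined);
  if (keys.length === 0) return null; // nothing to update
  const setClause = keys.map((k) => `${k} = ?`).join(', ');
  return {
    sql: `UPDATE ${table} SET ${setClause} WHERE id = ?`,
    params: [...keys.map((k) => fields[k]), id],
  };
}
```

Because column names come only from the allowlist and values are bound as parameters, unknown keys in the request body never reach the SQL string. An explicit null still counts as passed, so a caller can clear a field.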

Qdrant / Semantic Layer

Three Qdrant collections are initialized on service startup via semantic.initCollections():

Collection	Purpose
`episodes`	Embeddings for individual conversation exchanges
`entities`	Embeddings for named entities
`summaries`	Embeddings for condensed episode summaries

All collections use 768-dimension vectors with Cosine similarity, matching nomic-embed-text via Ollama. Vector size and distance metric are defined in @nexusai/shared — not hardcoded here.

initCollections() iterates Object.values(COLLECTIONS) and creates any collection that does not already exist, so all three collections exist before any requests are handled.

Each collection exposes upsert, search (with optional Qdrant filter), and delete operations. The wait: true flag is used on all writes.
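
The startup check and a write with wait: true might look like the sketch below. It assumes client is a QdrantClient from @qdrant/js-client-rest. The constants are local stand-ins for the values the text says live in @nexusai/shared, so their names here are assumptions, as is the helper upsertVector.

```javascript
// Sketch only. `client` is assumed to be a QdrantClient from @qdrant/js-client-rest.
// COLLECTIONS, VECTOR_SIZE and DISTANCE are local stand-ins for shared constants.
const COLLECTIONS = { EPISODES: 'episodes', ENTITIES: 'entities', SUMMARIES: 'summaries' };
const VECTOR_SIZE = 768;
const DISTANCE = 'Cosine';

async function initCollections(client) {
  const { collections } = await client.getCollections();
  const existing = new Set(collections.map((c) => c.name));
  const created = [];
  for (const name of Object.values(COLLECTIONS)) {
    if (existing.has(name)) continue; // already present, leave untouched
    await client.createCollection(name, { vectors: { size: VECTOR_SIZE, distance: DISTANCE } });
    created.push(name);
  }
  return created;
}

async function upsertVector(client, collection, id, vector, payload) {
  // wait: true makes the call resolve only after the write has been applied.
  await client.upsert(collection, { wait: true, points: [{ id, vector, payload }] });
}
```

Awaiting initCollections() before the HTTP server starts listening is what would make the "exists before any request" guarantee hold.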

Embedding Write Path

When a new episode is created:

Episode saved to SQLite synchronously — response returned immediately
User message + AI response combined: User: ...\nAssistant: ...
Text sent to embedding service (POST /embed)
Vector upserted into episodes Qdrant collection with payload { sessionId, createdAt }

The embedding and upsert steps are fire-and-forget: if embedding fails, the episode is still saved and searchable via FTS. The error is logged but not returned to the caller.

The Qdrant payload stores sessionId (the internal integer ID). See memory-isolation.md for how project-level filtering works.
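
The write path above can be sketched with injected dependencies. The functions saveEpisode, embed and upsert are hypothetical stand-ins for the SQLite insert, the POST /embed call and the Qdrant upsert; the field names on the input object are also assumptions.

```javascript
// Sketch only: saveEpisode, embed and upsert are hypothetical injected functions
// standing in for the SQLite insert, POST /embed and the Qdrant upsert.
function buildEmbeddingText(userMessage, aiResponse) {
  return `User: ${userMessage}\nAssistant: ${aiResponse}`;
}

function createEpisode(input, { saveEpisode, embed, upsert, log = console.error }) {
  const episode = saveEpisode(input); // synchronous SQLite write

  // Fire-and-forget: the promise chain is deliberately not awaited or returned.
  Promise.resolve()
    .then(() => embed(buildEmbeddingText(input.userMessage, input.aiResponse)))
    .then((vector) => upsert(episode.id, vector, {
      sessionId: episode.sessionId,
      createdAt: episode.createdAt,
    }))
    .catch((err) => log('episode embedding failed:', err.message));

  return episode; // returned before embedding has finished
}
```

The trailing catch matters: without it, a failed embedding would surface as an unhandled promise rejection instead of a log line.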

Entity Layer

Entities and relationships use upsert semantics with composite unique constraints to prevent duplicates:

UNIQUE(name, type) on entities — conflict increments mention_count and updates last_seen_at
UNIQUE(from_id, to_id, label) on relationships — conflict increments mention_count and preserves existing notes
ON DELETE CASCADE on relationship foreign keys
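
These constraints pair naturally with SQLite's ON CONFLICT ... DO UPDATE clause. The statements below are a sketch, not the service's actual SQL: the inserted column lists, the helper name upsertEntity and the millisecond timestamp are assumptions.

```javascript
// Sketch only: column lists are assumptions. `db` is assumed to be a
// better-sqlite3 Database (prepare(sql).run(...params)).
const UPSERT_ENTITY_SQL = `
  INSERT INTO entities (name, type, last_seen_at)
  VALUES (?, ?, ?)
  ON CONFLICT(name, type) DO UPDATE SET
    mention_count = mention_count + 1,
    last_seen_at = excluded.last_seen_at
`;

// notes is absent from the SET list, so a conflict keeps the stored notes.
const UPSERT_RELATIONSHIP_SQL = `
  INSERT INTO relationships (from_id, to_id, label, notes)
  VALUES (?, ?, ?, ?)
  ON CONFLICT(from_id, to_id, label) DO UPDATE SET
    mention_count = mention_count + 1
`;

function upsertEntity(db, name, type, now = Date.now()) {
  // Assumption: last_seen_at holds a millisecond Unix timestamp.
  return db.prepare(UPSERT_ENTITY_SQL).run(name, type, now);
}
```

In the DO UPDATE clause, a bare column name refers to the existing row and excluded.column refers to the row that failed to insert.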

After each episode is saved, extraction.js automatically extracts named entities and relationships from the conversation using the Ollama model set by EXTRACTION_MODEL (qwen2.5:3b by default). Like the embedding step, extraction is fire-and-forget. Each saved entity is also linked to the episode via the entity_episodes join table.

For full details on the extraction pipeline and JSON format, see entity-extraction.md.
For the knowledge graph traversal layer, see knowledge-graph.md.

Knowledge Graph Layer

src/graph/index.js provides SQLite-based graph traversal over the entities and relationships tables. Two functions are exposed via HTTP:

getNeighborhood(entityId, depth) — recursive CTE traversal, bidirectional, returns { nodes, edges }
getEntityNeighbors(entityIds[]) — bulk 1-hop traversal for orchestration context assembly
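
To show what a depth-bounded, bidirectional traversal returns, here is an in-memory breadth-first sketch over a plain edge list. It is not the service's recursive CTE, and the function name neighborhoodInMemory is hypothetical; only the from_id and to_id field names and the { nodes, edges } result shape come from this document.

```javascript
// Sketch only: an in-memory stand-in for the SQL traversal, not the service's query.
// `edges` is an array of objects with from_id and to_id, as in the relationships table.
function neighborhoodInMemory(edges, entityId, depth) {
  const seen = new Set([entityId]);
  const usedEdges = new Set();
  let frontier = [entityId];

  for (let d = 0; d < depth && frontier.length > 0; d++) {
    const next = [];
    for (const edge of edges) {
      for (const id of frontier) {
        // Bidirectional: follow the edge whichever end touches the frontier.
        const other = edge.from_id === id ? edge.to_id
          : edge.to_id === id ? edge.from_id
          : undefined;
        if (other === undefined) continue;
        usedEdges.add(edge);
        if (!seen.has(other)) {
          seen.add(other);
          next.push(other);
        }
      }
    }
    frontier = next;
  }
  return { nodes: [...seen], edges: [...usedEdges] };
}
```

Here nodes holds entity IDs only; a real response would presumably carry full entity rows. With depth 1 the same loop yields a 1-hop neighborhood, the kind of result getEntityNeighbors is described as returning in bulk.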

For design rationale, traversal queries, and integration with orchestration, see knowledge-graph.md.

Summaries Layer

Session summaries are generated by orchestration-service/src/services/summarization.js after each episode write and stored here via POST /summaries. The memory service is responsible only for CRUD — generation logic lives in orchestration.

For full details on trigger conditions, prompt format, cumulative updates, and ChatML token stripping, see summarization.md.

Project Delete Behaviour

Deleting a project runs as a transaction — it first nulls out project_id on all assigned sessions, then deletes the project. This avoids a foreign key constraint failure since sessions.project_id has no ON DELETE rule:

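const doDelete = db.transaction(() => {
  // Detach sessions first so the DELETE does not violate the foreign key.
  db.prepare(`UPDATE sessions SET project_id = NULL WHERE project_id = ?`).run(id);
  db.prepare(`DELETE FROM projects WHERE id = ?`).run(id);
});
doDelete(); // db.transaction() only wraps the function; calling it runs both statements atomically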

For all HTTP endpoints, see api-routes.md.
