# Memory Service

**Package:** `@nexusai/memory-service`
**Location:** `packages/memory-service`
**Deployed on:** Mini PC 1 (192.168.0.81)
**Port:** 3002

## Purpose

Responsible for all reading and writing of long-term memory. Acts as the
sole interface to both SQLite and Qdrant; no other service accesses these
stores directly. On episode creation, automatically calls the embedding
service to generate and store a vector in Qdrant.

## Dependencies

- `express`: HTTP API
- `better-sqlite3`: SQLite driver
- `@qdrant/js-client-rest`: Qdrant vector store client
- `dotenv`: environment variable loading
- `@nexusai/shared`: shared utilities and constants

## Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| PORT | No | 3002 | Port to listen on |
| SQLITE_PATH | Yes | (none) | Path to SQLite database file |
| QDRANT_URL | No | http://localhost:6333 | Qdrant instance URL |
| EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |
| EXTRACTION_URL | No | http://localhost:11434 | Ollama URL for entity extraction |
| EXTRACTION_MODEL | No | qwen2.5:3b | Ollama model used for entity extraction |

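The table above can be read as a small configuration loader. The following is a minimal sketch, not the service's actual startup code: `loadConfig` and the property names on the returned object are hypothetical, and only the variable names and defaults come from the table.

```js
// Sketch only: `loadConfig` is a hypothetical helper, not part of the service.
function loadConfig(env = process.env) {
  // SQLITE_PATH is the only variable without a default, so fail fast when it is missing.
  if (!env.SQLITE_PATH) {
    throw new Error('SQLITE_PATH is required');
  }
  return {
    port: Number(env.PORT ?? 3002),
    sqlitePath: env.SQLITE_PATH,
    qdrantUrl: env.QDRANT_URL ?? 'http://localhost:6333',
    embeddingServiceUrl: env.EMBEDDING_SERVICE_URL ?? 'http://localhost:3003',
    extractionUrl: env.EXTRACTION_URL ?? 'http://localhost:11434',
    extractionModel: env.EXTRACTION_MODEL ?? 'qwen2.5:3b',
  };
}
```

Passing `env` as a parameter keeps the function easy to exercise without touching the real process environment.
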
## Internal Structure

```
src/
├── db/
│   ├── index.js        # SQLite connection + initialization + migrations
│   ├── schema.js       # Table definitions, indexes, FTS5, triggers
│   ├── projects.js     # Project CRUD functions
│   └── summaries.js    # Summary CRUD functions
├── episodic/
│   └── index.js        # Session + episode CRUD, FTS search, embedding write path
├── semantic/
│   └── index.js        # Qdrant collection management, upsert, search, delete
├── entities/
│   ├── index.js        # Entity + relationship CRUD
│   └── extraction.js   # Automatic entity extraction via qwen2.5:3b on Ollama
└── index.js            # Express app + all route definitions
```

## SQLite Schema

Core tables:

- **sessions**: top-level conversation containers. Fields: `external_id`, `name`, `project_id`, `metadata`
- **episodes**: individual exchanges (user message + AI response) tied to a session
- **entities**: named things the system learns about (people, places, concepts)
- **relationships**: directional labeled links between entities
- **summaries**: condensed episode groups for efficient context retrieval
- **projects**: named groupings of sessions with `name`, `description`, `colour`, `icon`, `isolated`, `notes`, `system_prompt`

### Migrations

Schema changes that cannot use `CREATE TABLE IF NOT EXISTS` are applied as
idempotent migrations in `db/index.js` at startup:

```js
// ALTER TABLE throws once a column already exists; the empty catch makes reruns a no-op.
try { db.exec(`ALTER TABLE sessions ADD COLUMN name TEXT`); } catch {}
try { db.exec(`ALTER TABLE sessions ADD COLUMN project_id INTEGER REFERENCES projects(id)`); } catch {}
try { db.exec(`CREATE INDEX IF NOT EXISTS idx_sessions_project ON sessions(project_id)`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN isolated INTEGER NOT NULL DEFAULT 0`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN notes TEXT`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN system_prompt TEXT`); } catch {}
```

Always append new migrations here. Never edit the schema file to change an
existing table: `CREATE TABLE IF NOT EXISTS` skips tables that already exist,
and `ALTER TABLE ... ADD COLUMN` has no `IF NOT EXISTS` form, which is why each
statement is wrapped in `try`/`catch`.

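An empty `catch` also hides unrelated failures such as a typo in the SQL. One possible refinement, shown here as a sketch rather than as the service's code, is to ignore only SQLite's "duplicate column name" error. `addColumnIfMissing` is a hypothetical helper, and `db` is assumed to expose a `better-sqlite3` style `exec(sql)` method.

```js
// Sketch only: `addColumnIfMissing` is a hypothetical helper, not the service's code.
// `db` is anything with a better-sqlite3 style `exec(sql)` method.
function addColumnIfMissing(db, table, columnDef) {
  try {
    db.exec(`ALTER TABLE ${table} ADD COLUMN ${columnDef}`);
    return true; // column was added on this run
  } catch (err) {
    // SQLite reports "duplicate column name: <col>" when the column already exists.
    if (/duplicate column name/i.test(String(err && err.message))) {
      return false; // already migrated, safe to ignore
    }
    throw err; // anything else (typo, missing table, locked db) should surface
  }
}
```

A call such as `addColumnIfMissing(db, 'projects', 'notes TEXT')` stays safe to run on every startup, while genuine errors still stop the service.
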
### FTS5 Full-Text Search

An `episodes_fts` virtual table enables keyword search across all episodes.
Three triggers (`episodes_fts_insert`, `episodes_fts_update`, `episodes_fts_delete`)
keep the FTS index automatically in sync with the episodes table.

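For readers unfamiliar with trigger-synced FTS5 tables, the sketch below shows one common shape for this setup. It is an illustration under assumptions, not the contents of `schema.js`: the column names `user_message` and `ai_response` are hypothetical, as is the choice of an external-content table keyed on `episodes.id`. Only the table and trigger names come from this document.

```js
// Sketch only: the column names `user_message` and `ai_response` are assumptions,
// as is the use of an external-content FTS5 table keyed on episodes.id.
const EPISODES_FTS_SQL = `
CREATE VIRTUAL TABLE IF NOT EXISTS episodes_fts USING fts5(
  user_message, ai_response,
  content='episodes', content_rowid='id'
);

CREATE TRIGGER IF NOT EXISTS episodes_fts_insert AFTER INSERT ON episodes BEGIN
  INSERT INTO episodes_fts(rowid, user_message, ai_response)
  VALUES (new.id, new.user_message, new.ai_response);
END;

CREATE TRIGGER IF NOT EXISTS episodes_fts_delete AFTER DELETE ON episodes BEGIN
  -- External-content tables remove rows via the special 'delete' command.
  INSERT INTO episodes_fts(episodes_fts, rowid, user_message, ai_response)
  VALUES ('delete', old.id, old.user_message, old.ai_response);
END;

CREATE TRIGGER IF NOT EXISTS episodes_fts_update AFTER UPDATE ON episodes BEGIN
  -- An update is modelled as delete-old followed by insert-new.
  INSERT INTO episodes_fts(episodes_fts, rowid, user_message, ai_response)
  VALUES ('delete', old.id, old.user_message, old.ai_response);
  INSERT INTO episodes_fts(rowid, user_message, ai_response)
  VALUES (new.id, new.user_message, new.ai_response);
END;
`;
```

With `better-sqlite3`, a string like this can be applied with `db.exec(EPISODES_FTS_SQL)`, and a keyword search then takes the form `SELECT rowid FROM episodes_fts WHERE episodes_fts MATCH ?`.
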
### SQLite Configuration

- `journal_mode = WAL`: non-blocking reads during writes
- `foreign_keys = ON`: enforces referential integrity and cascade deletes
- PRAGMAs set via `db.pragma()`, not `db.exec()`

### Dynamic Updates

Both `updateSession` and `updateProject` build their `SET` clause dynamically
from only the fields passed, so a partial update never overwrites fields
that were not touched.

`updateProject` allowlist:
```js
const allowed = ['name', 'description', 'colour', 'icon', 'isolated', 'notes', 'system_prompt'];
```

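The dynamic `SET` clause can be sketched as below. `buildUpdate` is a hypothetical helper written for illustration; the real `updateSession` and `updateProject` may be structured differently. Only the allowlist itself is taken from this document.

```js
// Sketch only: `buildUpdate` is a hypothetical helper illustrating the pattern.
function buildUpdate(table, allowed, fields, id) {
  // Keep only allowlisted keys that were actually passed (undefined means "not touched").
  const keys = allowed.filter((key) => fields[key] !== undefined);
  if (keys.length === 0) return null; // nothing to update
  const setClause = keys.map((key) => `${key} = ?`).join(', ');
  return {
    sql: `UPDATE ${table} SET ${setClause} WHERE id = ?`,
    params: [...keys.map((key) => fields[key]), id],
  };
}

const allowed = ['name', 'description', 'colour', 'icon', 'isolated', 'notes', 'system_prompt'];
const update = buildUpdate('projects', allowed, { notes: 'draft', is_admin: 1 }, 7);
// update.sql    → 'UPDATE projects SET notes = ? WHERE id = ?'
// update.params → ['draft', 7]
```

Column names only ever come from the allowlist and values only ever travel as bound parameters, so an unexpected key such as `is_admin` is dropped rather than interpolated into the SQL.
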
## Qdrant / Semantic Layer

Three Qdrant collections are initialized on service startup via `semantic.initCollections()`:

| Collection | Purpose |
|---|---|
| `episodes` | Embeddings for individual conversation exchanges |
| `entities` | Embeddings for named entities |
| `summaries` | Embeddings for condensed episode summaries |

All collections use **768-dimension vectors** with **Cosine similarity**,
matching `nomic-embed-text` via Ollama. Vector size and distance metric are
defined in `@nexusai/shared`, not hardcoded here.

`initCollections()` iterates `Object.values(COLLECTIONS)` and creates any
collection that doesn't already exist at startup. All three collections are
therefore guaranteed to exist before any requests are handled, which avoids
race conditions between the first entity embed and an entity search.

Each collection exposes upsert, search (with optional Qdrant filter), and
delete operations. The `wait: true` flag is used on all writes, so a write
call returns only after Qdrant has applied the change.

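The startup behaviour described above can be sketched as follows. This is an approximation, not the code in `semantic/index.js`: the shape of `COLLECTIONS` and the constant names are assumptions (the real values live in `@nexusai/shared`), and `client` is assumed to behave like `QdrantClient` from `@qdrant/js-client-rest`, using its `getCollections()` and `createCollection()` methods.

```js
// Sketch only: in the service, COLLECTIONS, the vector size, and the distance
// metric come from @nexusai/shared; they are inlined here as assumptions.
const COLLECTIONS = { EPISODES: 'episodes', ENTITIES: 'entities', SUMMARIES: 'summaries' };
const VECTOR_SIZE = 768;
const DISTANCE = 'Cosine';

// `client` is expected to behave like QdrantClient from @qdrant/js-client-rest.
async function initCollections(client) {
  const { collections } = await client.getCollections();
  const existing = new Set(collections.map((c) => c.name));
  const created = [];
  for (const name of Object.values(COLLECTIONS)) {
    if (existing.has(name)) continue; // already present, nothing to do
    await client.createCollection(name, {
      vectors: { size: VECTOR_SIZE, distance: DISTANCE },
    });
    created.push(name);
  }
  return created; // names created on this run, useful for startup logging
}
```

Awaiting this function before the HTTP server starts listening is what gives the "exists before any request" guarantee.
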
## Embedding Write Path

When a new episode is created:

1. The episode is saved to SQLite synchronously and the response is returned immediately
2. User message + AI response are combined: `User: ...\nAssistant: ...`
3. The text is sent to the embedding service (`POST /embed`)
4. The vector is upserted into the `episodes` Qdrant collection with payload `{ sessionId, createdAt }`

Steps 2 to 4 are **fire-and-forget**: if embedding fails, the episode is still
saved and searchable via FTS. The error is logged but not surfaced.

> The Qdrant payload stores `sessionId` (the internal integer ID). See
> `memory-isolation.md` for how project-level filtering works.

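The write path above can be sketched as a single function. This is a simplified illustration, not the code in `episodic/index.js`: `createEpisode`, `saveEpisode`, `embed`, and `upsertVector` are hypothetical names, with the last three standing in for the SQLite write, the `POST /embed` call, and the Qdrant upsert.

```js
// Sketch only: `saveEpisode`, `embed`, and `upsertVector` are hypothetical
// injected functions standing in for the SQLite, embedding, and Qdrant calls.
function createEpisode(deps, sessionId, userMessage, aiResponse) {
  // Step 1: the synchronous SQLite write decides the caller's result.
  const episode = deps.saveEpisode(sessionId, userMessage, aiResponse);

  // Step 2: combine both sides of the exchange into one text to embed.
  const text = `User: ${userMessage}\nAssistant: ${aiResponse}`;

  // Steps 3 and 4: not awaited, so failures are logged and never reach the caller.
  Promise.resolve()
    .then(() => deps.embed(text))
    .then((vector) =>
      deps.upsertVector('episodes', episode.id, vector, {
        sessionId,
        createdAt: episode.createdAt,
      })
    )
    .catch((err) => console.error('episode embedding failed:', err.message));

  return episode;
}
```

Starting the chain with `Promise.resolve()` means even a synchronous throw inside `embed` lands in the `.catch`, so the caller still receives the saved episode.
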
## Entity Layer

Entities and relationships use upsert semantics with composite unique
constraints to prevent duplicates:

- `UNIQUE(name, type)` on entities
- `UNIQUE(from_id, to_id, label)` on relationships
- `ON DELETE CASCADE` on relationship foreign keys

After each episode is saved, `extraction.js` automatically extracts named
entities from the conversation using `qwen2.5:3b` on Ollama. Like the
embedding step, this is fire-and-forget.

> For full details on the extraction pipeline, prompt format, constrained
> decoding, stoplist, and Qdrant storage, see `entity-extraction.md`.

## Summaries Layer

Session summaries are generated by `orchestration-service/src/services/summarization.js`
after each episode write and stored here via `POST /summaries`. The memory
service is responsible only for CRUD; generation logic lives in orchestration.

> For full details on trigger conditions, prompt format, cumulative updates,
> and ChatML token stripping, see `summarization.md`.

## Project Delete Behaviour

Deleting a project runs as a transaction: it first nulls out `project_id`
on all assigned sessions, then deletes the project. This avoids a foreign
key constraint failure since `sessions.project_id` has no `ON DELETE` rule:

```js
const doDelete = db.transaction(() => {
  // Detach sessions first so the project row has no remaining references.
  db.prepare(`UPDATE sessions SET project_id = NULL WHERE project_id = ?`).run(id);
  db.prepare(`DELETE FROM projects WHERE id = ?`).run(id);
});
doDelete(); // db.transaction() only wraps the function; calling it runs the transaction
```

For all HTTP endpoints, see `api-routes.md`.