nexusAI/docs/services/memory-service.md

# Memory Service

**Package:** `@nexusai/memory-service`
**Location:** `packages/memory-service`
**Deployed on:** Mini PC 1 (192.168.0.81)
**Port:** 3002

## Purpose

Responsible for all reading and writing of long-term memory. Acts as the
sole interface to both SQLite and Qdrant — no other service accesses these
stores directly. On episode creation, automatically calls the embedding
service to generate and store a vector in Qdrant.

## Dependencies

- `express` — HTTP API
- `better-sqlite3` — SQLite driver
- `@qdrant/js-client-rest` — Qdrant vector store client
- `dotenv` — environment variable loading
- `@nexusai/shared` — shared utilities and constants

## Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| PORT | No | 3002 | Port to listen on |
| SQLITE_PATH | Yes | — | Path to SQLite database file |
| QDRANT_URL | No | http://localhost:6333 | Qdrant instance URL |
| EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |
| EXTRACTION_URL | No | http://localhost:11434 | Ollama URL for entity extraction |
| EXTRACTION_MODEL | No | qwen2.5:3b | Ollama model used for entity extraction |

## Internal Structure

```
src/
├── db/
│   ├── index.js       # SQLite connection + initialization + migrations
│   ├── schema.js      # Table definitions, indexes, FTS5, triggers
│   └── projects.js    # Project CRUD functions
├── episodic/
│   └── index.js       # Session + episode CRUD, FTS search, embedding write path
├── semantic/
│   └── index.js       # Qdrant collection management, upsert, search, delete
├── entities/
│   ├── index.js       # Entity + relationship CRUD
│   └── extraction.js  # Automatic entity extraction via qwen2.5:3b on Ollama
└── index.js           # Express app + all route definitions
```

## SQLite Schema

Six core tables:

- **sessions** — top-level conversation containers. Fields: `external_id`, `name`, `project_id`, `metadata`
- **episodes** — individual exchanges (user message + AI response) tied to a session
- **entities** — named things the system learns about (people, places, concepts)
- **relationships** — directional labeled links between entities
- **summaries** — condensed episode groups for efficient context retrieval
- **projects** — named groupings of sessions with `name`, `description`, `colour`, `icon`, `isolated`, `notes`, `system_prompt`

### Migrations

Schema changes that cannot use `CREATE TABLE IF NOT EXISTS` are applied as
idempotent migrations in `db/index.js` at startup:

```js
try { db.exec(`ALTER TABLE sessions ADD COLUMN name TEXT`); } catch {}
try { db.exec(`ALTER TABLE sessions ADD COLUMN project_id INTEGER REFERENCES projects(id)`); } catch {}
try { db.exec(`CREATE INDEX IF NOT EXISTS idx_sessions_project ON sessions(project_id)`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN isolated INTEGER NOT NULL DEFAULT 0`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN notes TEXT`); } catch {}
try { db.exec(`ALTER TABLE projects ADD COLUMN system_prompt TEXT`); } catch {}
```

New migrations are always appended here — never modify the schema file for
existing tables since `ALTER TABLE` cannot use `IF NOT EXISTS`.

### FTS5 Full-Text Search

An `episodes_fts` virtual table enables keyword search across all episodes.
Three triggers (`episodes_fts_insert`, `episodes_fts_update`, `episodes_fts_delete`)
keep the FTS index automatically in sync with the episodes table.

### SQLite Configuration

- `journal_mode = WAL` — non-blocking reads during writes
- `foreign_keys = ON` — enforces referential integrity and cascade deletes
- PRAGMAs set via `db.pragma()`, not `db.exec()`

### Dynamic Updates

Both `updateSession` and `updateProject` build their `SET` clause dynamically
from only the fields passed — prevents partial updates from overwriting fields
that weren't touched.

`updateProject` allowlist:
```js
const allowed = ['name', 'description', 'colour', 'icon', 'isolated', 'notes', 'system_prompt'];
```

This means saving just `{ notes: "..." }` or `{ system_prompt: "..." }` won't
touch any other field.

## Qdrant / Semantic Layer

Three Qdrant collections are initialized on service startup:

| Collection | Purpose |
|---|---|
| `episodes` | Embeddings for individual conversation exchanges |
| `entities` | Embeddings for named entities |
| `summaries` | Embeddings for condensed episode summaries |

All collections use **768-dimension vectors** with **Cosine similarity**,
matching `nomic-embed-text` via Ollama. Vector size and distance metric are
defined in `@nexusai/shared` — not hardcoded here.

Each collection exposes three operations in `src/semantic/index.js`:
upsert, search (with optional Qdrant filter), and delete. The `wait: true`
flag is used on all writes.

## Embedding Write Path

When a new episode is created:

1. Episode saved to SQLite synchronously — response returned immediately
2. User message + AI response combined: `User: ...\nAssistant: ...`
3. Text sent to embedding service (`POST /embed`)
4. Vector upserted into `episodes` Qdrant collection with payload `{ sessionId, createdAt }`

This step is **fire-and-forget** — if embedding fails, the episode is still
saved and searchable via FTS. The error is logged but not surfaced.

> The Qdrant payload stores `sessionId` (the internal integer ID). This is
> used for per-session and per-project filtering during semantic search. See
> `memory-isolation.md` for how project-level filtering works.

## Entity Layer

Entities and relationships use upsert semantics with composite unique
constraints to prevent duplicates:

- `UNIQUE(name, type)` on entities
- `UNIQUE(from_id, to_id, label)` on relationships
- `ON DELETE CASCADE` on relationship foreign keys

### Automatic Entity Extraction

After each episode is saved, `extraction.js` automatically extracts named
entities from the conversation using `qwen2.5:3b` running on Ollama (Mini PC 1).
This runs **fire-and-forget** — the episode is already saved and returned
before extraction begins.

**Entity types extracted:** `person`, `place`, `project`, `technology`,
`concept`, `organization`

The extraction prompt uses ChatML format (native to qwen2.5) and primes the
response by ending with `[` to steer the model directly into JSON array output.
A list of already-known entities is injected into the prompt so the model
reuses existing `(name, type)` pairs rather than creating duplicates with
different types.

After extraction, each entity is:
1. Upserted into SQLite via `upsertEntity` — notes are only written if
   the entity is new (`COALESCE(entities.notes, excluded.notes)` prevents
   overwriting existing notes with speculative updates)
2. Embedded via the embedding service and upserted into the `entities`
   Qdrant collection with `{ name, type, notes, projectId }` as payload —
   `projectId` scopes entities to their project for isolated retrieval

`extractAndStoreEntities` receives `projectId` from `createEpisode`, which
receives it from the episode route, which receives it from orchestration's
`createEpisode` call. This ensures entities are tagged with the correct
project scope at extraction time.

## Project Delete Behaviour

Deleting a project runs as a transaction — it first nulls out `project_id`
on all assigned sessions, then deletes the project. This avoids a foreign
key constraint failure since `sessions.project_id` has no `ON DELETE` rule:

```js
const doDelete = db.transaction(() => {
  db.prepare(`UPDATE sessions SET project_id = NULL WHERE project_id = ?`).run(id);
  db.prepare(`DELETE FROM projects WHERE id = ?`).run(id);
});
```

For all HTTP endpoints, see `api-routes.md`.