nexusAI/docs/services/memory-service.md
2026-04-05 00:26:58 -07:00


# Memory Service
**Package:** `@nexusai/memory-service`
**Location:** `packages/memory-service`
**Deployed on:** Mini PC 1 (192.168.0.81)
**Port:** 3002
## Purpose
The memory service handles all reads and writes of long-term memory. It is the
sole interface to both SQLite and Qdrant; no other service accesses these
stores directly. When an episode is created, it automatically calls the
embedding service and stores the resulting vector in Qdrant.
## Dependencies
- `express` — HTTP API
- `better-sqlite3` — SQLite driver
- `@qdrant/js-client-rest` — Qdrant vector store client
- `dotenv` — environment variable loading
- `@nexusai/shared` — shared utilities and constants
## Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| PORT | No | 3002 | Port to listen on |
| SQLITE_PATH | Yes | — | Path to SQLite database file |
| QDRANT_URL | No | http://localhost:6333 | Qdrant instance URL |
| EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |
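
To make the defaults in the table concrete, here is a minimal sketch of how the variables could be resolved at startup. The `loadConfig` helper and the shape of the object it returns are illustrative assumptions, not the service's actual code.

```javascript
// Hypothetical helper: resolves configuration from an env-like object,
// applying the defaults listed in the table above.
function loadConfig(env = process.env) {
  if (!env.SQLITE_PATH) {
    // SQLITE_PATH is the only required variable and has no default.
    throw new Error('SQLITE_PATH is required');
  }
  return {
    port: Number(env.PORT || 3002),
    sqlitePath: env.SQLITE_PATH,
    qdrantUrl: env.QDRANT_URL || 'http://localhost:6333',
    embeddingServiceUrl: env.EMBEDDING_SERVICE_URL || 'http://localhost:3003',
  };
}
```

Failing fast on a missing `SQLITE_PATH` surfaces a misconfiguration at boot rather than on the first write.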
## Internal Structure
```
src/
├── db/
│ ├── index.js # SQLite connection + initialization
│ └── schema.js # Table definitions, indexes, FTS5, triggers
├── episodic/
│ └── index.js # Session + episode CRUD, FTS search, embedding write path
├── semantic/
│ └── index.js # Qdrant collection management, upsert, search, delete
├── entities/
│ └── index.js # Entity + relationship CRUD
└── index.js # Express app + route definitions
```
## SQLite Schema
Five core tables:
- **sessions** — top-level conversation containers, identified by an `external_id`
- **episodes** — individual exchanges (user message + AI response) tied to a session
- **entities** — named things the system learns about (people, places, concepts)
- **relationships** — directional labeled links between entities
- **summaries** — condensed episode groups for efficient context retrieval
### FTS5 Full-Text Search
An `episodes_fts` virtual table enables keyword search across all episodes.
Three triggers (`episodes_fts_insert`, `episodes_fts_update`, `episodes_fts_delete`)
keep the FTS index automatically in sync with the episodes table.
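
For illustration, the FTS table and its three triggers could be declared along the following lines. This is a sketch that assumes an external-content FTS5 table and assumes the `episodes` columns are named `id`, `user_message`, and `ai_response`; the real definitions live in `src/db/schema.js` and may differ.

```javascript
// Assumed column names (id, user_message, ai_response) are illustrative only.
const EPISODES_FTS_SCHEMA = `
CREATE VIRTUAL TABLE IF NOT EXISTS episodes_fts USING fts5(
  user_message, ai_response,
  content='episodes', content_rowid='id'
);

CREATE TRIGGER IF NOT EXISTS episodes_fts_insert AFTER INSERT ON episodes BEGIN
  INSERT INTO episodes_fts(rowid, user_message, ai_response)
  VALUES (new.id, new.user_message, new.ai_response);
END;

CREATE TRIGGER IF NOT EXISTS episodes_fts_delete AFTER DELETE ON episodes BEGIN
  INSERT INTO episodes_fts(episodes_fts, rowid, user_message, ai_response)
  VALUES ('delete', old.id, old.user_message, old.ai_response);
END;

CREATE TRIGGER IF NOT EXISTS episodes_fts_update AFTER UPDATE ON episodes BEGIN
  INSERT INTO episodes_fts(episodes_fts, rowid, user_message, ai_response)
  VALUES ('delete', old.id, old.user_message, old.ai_response);
  INSERT INTO episodes_fts(rowid, user_message, ai_response)
  VALUES (new.id, new.user_message, new.ai_response);
END;
`;

// Keyword search joined back to the source rows, best matches first.
const EPISODES_FTS_SEARCH = `
SELECT episodes.*
FROM episodes_fts
JOIN episodes ON episodes.id = episodes_fts.rowid
WHERE episodes_fts MATCH ?
ORDER BY rank
LIMIT ?
`;
```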
### SQLite Configuration
- `journal_mode = WAL` — non-blocking reads during writes
- `foreign_keys = ON` — enforces referential integrity and cascade deletes
- PRAGMAs are set via `db.pragma()` separately from `db.exec()`
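
A minimal sketch of that initialization order, assuming `db` is a `better-sqlite3` `Database` instance and `schemaSql` is the DDL string exported by the schema module. The `configureDatabase` name is hypothetical.

```javascript
// Hypothetical helper: applies the PRAGMAs first, then the schema.
// `db` is expected to be a better-sqlite3 Database instance.
function configureDatabase(db, schemaSql) {
  db.pragma('journal_mode = WAL'); // readers are not blocked by a writer
  db.pragma('foreign_keys = ON');  // required for ON DELETE CASCADE to fire
  db.exec(schemaSql);              // tables, indexes, FTS5 table, triggers
  return db;
}

// Usage (assumes better-sqlite3 is installed):
//   const Database = require('better-sqlite3');
//   const db = configureDatabase(new Database(process.env.SQLITE_PATH), schemaSql);
```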
## Qdrant / Semantic Layer
Three collections are initialized on service startup (created if they don't already exist):
| Collection | Purpose |
|---|---|
| `episodes` | Embeddings for individual conversation exchanges |
| `entities` | Embeddings for named entities |
| `summaries` | Embeddings for condensed episode summaries |
All collections use **768-dimension vectors** with **Cosine similarity**, matching the
output of the `nomic-embed-text` embedding model via Ollama.
Vector dimension and distance metric are defined in `@nexusai/shared` constants
(`QDRANT.VECTOR_SIZE`, `QDRANT.DISTANCE_METRIC`) — not hardcoded in this service.
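
The create-if-missing startup step could look roughly like this. It is a sketch: `client` is assumed to be a `QdrantClient` from `@qdrant/js-client-rest`, `ensureCollections` is a hypothetical name, and the local `QDRANT_DEFAULTS` object only stands in for the shared constants.

```javascript
// Stand-in for the QDRANT constants exported by @nexusai/shared.
const QDRANT_DEFAULTS = { VECTOR_SIZE: 768, DISTANCE_METRIC: 'Cosine' };
const COLLECTION_NAMES = ['episodes', 'entities', 'summaries'];

// Creates any collection that does not exist yet; returns the names it created.
async function ensureCollections(client, names = COLLECTION_NAMES) {
  const { collections } = await client.getCollections();
  const existing = new Set(collections.map((c) => c.name));
  const created = [];
  for (const name of names) {
    if (existing.has(name)) continue;
    await client.createCollection(name, {
      vectors: {
        size: QDRANT_DEFAULTS.VECTOR_SIZE,
        distance: QDRANT_DEFAULTS.DISTANCE_METRIC,
      },
    });
    created.push(name);
  }
  return created;
}
```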
### Semantic Layer Operations
Each collection exposes three operations via helper functions in `src/semantic/index.js`:
- **Upsert** — stores a vector with a payload containing the SQLite row ID, enabling
lookups back to the full content after a vector search
- **Search** — returns the top-k most similar vectors, with optional Qdrant filter
- **Delete** — removes a vector point by ID
The `wait: true` flag is used on all write operations so the caller receives confirmation
only after Qdrant has committed the change.
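
As a rough sketch, the three operations map onto the `@qdrant/js-client-rest` client as follows. The helper names and signatures are assumptions for illustration; the actual helpers in `src/semantic/index.js` may be shaped differently.

```javascript
// `client` is assumed to be a QdrantClient from @qdrant/js-client-rest.

// Upsert: one point per SQLite row; wait: true blocks until Qdrant commits.
async function upsertVector(client, collection, id, vector, payload = {}) {
  return client.upsert(collection, {
    wait: true,
    points: [{ id, vector, payload }],
  });
}

// Search: top-k nearest vectors, optionally narrowed by a Qdrant filter.
async function searchVectors(client, collection, vector, limit = 5, filter) {
  const params = { vector, limit, with_payload: true };
  if (filter) params.filter = filter;
  return client.search(collection, params);
}

// Delete: removes a single point by ID.
async function deleteVector(client, collection, id) {
  return client.delete(collection, { wait: true, points: [id] });
}
```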
## Embedding Write Path
When a new episode is created, the memory service automatically generates and stores
a vector embedding in Qdrant via the embedding service:
1. The episode is saved to SQLite synchronously, and the API response is returned without waiting for the embedding
2. Both sides of the exchange are combined into a single text:
```
User: {userMessage}
Assistant: {aiResponse}
```
3. This text is sent to the embedding service (`POST /embed`)
4. The returned vector is upserted into the `episodes` Qdrant collection with a
payload of `{ sessionId, createdAt }` for filtering and lookups
The embedding step is **fire-and-forget** — it runs asynchronously after the SQLite
insert succeeds. If embedding fails, the episode is still saved and searchable via
FTS. The error is logged but does not affect the API response.
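
The write path above can be sketched as follows. The SQLite insert, the embedding call, and the Qdrant upsert are passed in as functions, so the names `createEpisode`, `insertEpisode`, `embed`, and `upsertVector` are all illustrative assumptions rather than the service's real internals.

```javascript
// Combines both sides of the exchange into the text that gets embedded.
function buildEmbeddingText(userMessage, aiResponse) {
  return `User: ${userMessage}\nAssistant: ${aiResponse}`;
}

// Hypothetical write path. `deps` supplies:
//   insertEpisode(input) -> saved episode row (synchronous SQLite write)
//   embed(text)          -> Promise of a vector (embedding service call)
//   upsertVector(id, vector, payload) -> Promise (Qdrant upsert)
function createEpisode(deps, input) {
  const { insertEpisode, embed, upsertVector, logError = console.error } = deps;

  const episode = insertEpisode(input);
  const text = buildEmbeddingText(input.userMessage, input.aiResponse);

  // Fire-and-forget: the chain is not awaited, and any failure is only logged.
  const embedding = Promise.resolve()
    .then(() => embed(text))
    .then((vector) =>
      upsertVector(episode.id, vector, {
        sessionId: episode.sessionId,
        createdAt: episode.createdAt,
      })
    )
    .catch((err) => logError('episode embedding failed', err));

  // `embedding` is returned only so tests can wait for the background work.
  return { episode, embedding };
}
```

A route handler would respond with `episode` straight away and ignore the `embedding` promise.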
### Hybrid Retrieval Pattern
Qdrant and SQLite work as a pair — neither operates in isolation:
1. Query is embedded and searched in Qdrant → returns IDs + similarity scores
2. IDs are used to fetch full content from SQLite
3. Results are ranked and assembled into a context package
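
A minimal sketch of these three steps, with the embedding call, the Qdrant search, and the SQLite lookup injected as functions. `hybridSearch` and the dependency names are hypothetical, and ranking purely by similarity score is an assumption; the real ranking may combine other signals.

```javascript
// Hypothetical retrieval flow. `deps` supplies:
//   embed(query)                -> Promise of a query vector
//   searchVectors(vector, k)    -> Promise of [{ id, score }] from Qdrant
//   fetchEpisodesByIds(ids)     -> rows from SQLite, each with an `id`
async function hybridSearch(deps, query, limit = 5) {
  const { embed, searchVectors, fetchEpisodesByIds } = deps;

  const vector = await embed(query);
  const hits = await searchVectors(vector, limit);
  const rows = await fetchEpisodesByIds(hits.map((h) => h.id));
  const byId = new Map(rows.map((r) => [r.id, r]));

  return hits
    .filter((h) => byId.has(h.id)) // skip vectors whose row no longer exists
    .sort((a, b) => b.score - a.score)
    .map((h) => ({ ...byId.get(h.id), score: h.score }));
}
```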
## Entity Layer
Entities and relationships are stored in SQLite with three key constraints:
- `UNIQUE(name, type)` on entities — ensures no duplicates; upsert updates existing records
- `UNIQUE(from_id, to_id, label)` on relationships — prevents duplicate edges
- `ON DELETE CASCADE` on both `from_id` and `to_id` — deleting an entity automatically
removes all relationships where it appears on either end
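
To show how the `UNIQUE(name, type)` constraint supports upserts, here is a sketch built on SQLite's `ON CONFLICT` clause. The column names `notes` and `metadata` are inferred from the request body and are assumptions, as is the `upsertEntity` helper; `db` is expected to be a `better-sqlite3` `Database`.

```javascript
// Column names are assumed; the conflict target matches UNIQUE(name, type).
const UPSERT_ENTITY_SQL = `
  INSERT INTO entities (name, type, notes, metadata)
  VALUES (@name, @type, @notes, @metadata)
  ON CONFLICT(name, type) DO UPDATE SET
    notes = excluded.notes,
    metadata = excluded.metadata
  RETURNING *
`;

// Hypothetical helper: inserts a new entity or updates the existing one,
// returning the resulting row either way.
function upsertEntity(db, { name, type, notes = null, metadata = {} }) {
  return db.prepare(UPSERT_ENTITY_SQL).get({
    name,
    type,
    notes,
    metadata: JSON.stringify(metadata), // stored as a JSON string
  });
}
```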
## Endpoints
### Health
| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |
### Sessions
| Method | Path | Description |
|---|---|---|
| POST | /sessions | Create a new session |
| GET | /sessions/:id | Get session by internal ID |
| GET | /sessions/by-external/:externalId | Get session by external ID |
| DELETE | /sessions/:id | Delete session (cascades to episodes + summaries) |
**POST /sessions body:**
```json
{
"externalId": "unique-session-id",
"metadata": {}
}
```
### Episodes
| Method | Path | Description |
|---|---|---|
| POST | /episodes | Create episode + auto-embed into Qdrant |
| GET | /episodes/search?q=&limit= | Full-text search across episodes |
| GET | /episodes/:id | Get episode by ID |
| GET | /sessions/:id/episodes?limit=&offset= | Get paginated episodes for a session |
| DELETE | /episodes/:id | Delete an episode |
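
Express matches routes in the order they are registered, so the static `/episodes/search` path has to come before the parameterized `/episodes/:id` path. A sketch of that wiring, where `registerEpisodeRoutes` and the `handlers` object are hypothetical names:

```javascript
// Hypothetical wiring: `app` is an Express app (or Router), `handlers` holds
// the route handler functions.
function registerEpisodeRoutes(app, handlers) {
  app.post('/episodes', handlers.create);
  app.get('/episodes/search', handlers.search); // static path first
  app.get('/episodes/:id', handlers.getById);   // would otherwise capture "search"
  app.delete('/episodes/:id', handlers.remove);
}
```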
**POST /episodes body:**
```json
{
"sessionId": 1,
"userMessage": "Hello",
"aiResponse": "Hi there!",
"tokenCount": 10,
"metadata": {}
}
```
> Note: `/episodes/search` must be defined before `/episodes/:id` in Express to prevent
> the word `search` being captured as an ID parameter.
### Entities
| Method | Path | Description |
|---|---|---|
| POST | /entities | Upsert an entity (creates or updates by name + type) |
| GET | /entities/by-type/:type | Get all entities of a given type |
| GET | /entities/:id | Get entity by internal ID |
| DELETE | /entities/:id | Delete entity (cascades to relationships) |
**POST /entities body:**
```json
{
"name": "NexusAI",
"type": "project",
"notes": "My AI memory project",
"metadata": {}
}
```
> Note: `/entities/by-type/:type` must be defined before `/entities/:id` in Express to
> prevent `by-type` being captured as an ID parameter.
### Relationships
| Method | Path | Description |
|---|---|---|
| POST | /relationships | Upsert a relationship between two entities |
| GET | /entities/:id/relationships | Get all relationships originating from an entity |
| DELETE | /relationships | Delete a specific relationship |
**POST /relationships body:**
```json
{
"fromId": 1,
"toId": 2,
"label": "uses",
"metadata": {}
}
```
**DELETE /relationships body:**
```json
{
"fromId": 1,
"toId": 2,
"label": "uses"
}
```
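
A client call that sends this body might look like the following. It is a sketch only: `deleteRelationship` is a hypothetical helper, the base URL is supplied by the caller, and the response format is not specified here, so only the HTTP status is returned.

```javascript
// Hypothetical client helper. `fetchImpl` defaults to the global fetch
// available in Node 18+ and can be swapped out in tests.
async function deleteRelationship(baseUrl, { fromId, toId, label }, fetchImpl = fetch) {
  const res = await fetchImpl(`${baseUrl}/relationships`, {
    method: 'DELETE',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ fromId, toId, label }),
  });
  if (!res.ok) {
    throw new Error(`DELETE /relationships failed with status ${res.status}`);
  }
  return res.status;
}

// Usage:
//   await deleteRelationship('http://localhost:3002', { fromId: 1, toId: 2, label: 'uses' });
```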
> Relationships are identified by the composite key `(fromId, toId, label)`. Delete uses
> the request body rather than URL params as this three-part key is awkward to express
> cleanly in a path.