# Memory Service

**Package:** `@nexusai/memory-service`
**Location:** `packages/memory-service`
**Deployed on:** Mini PC 1 (192.168.0.81)
**Port:** 3002

## Purpose

The memory service is responsible for all reading and writing of long-term memory. It acts as the
sole interface to both SQLite and Qdrant; no other service accesses these
stores directly. On episode creation, it automatically calls the embedding
service to generate a vector and stores that vector in Qdrant.

## Dependencies

- `express`: HTTP API
- `better-sqlite3`: SQLite driver
- `@qdrant/js-client-rest`: Qdrant vector store client
- `dotenv`: environment variable loading
- `@nexusai/shared`: shared utilities and constants

## Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| PORT | No | 3002 | Port to listen on |
| SQLITE_PATH | Yes | (none) | Path to SQLite database file |
| QDRANT_URL | No | http://localhost:6333 | Qdrant instance URL |
| EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |

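The defaults in the table above could be resolved in one place at startup. The following is a minimal sketch, not the service's actual code: `loadConfig` and the returned property names are hypothetical, and it assumes `dotenv` has already populated `process.env`.

```javascript
// Hypothetical helper: resolves settings from an env-like object,
// applying the defaults listed in the Environment Variables table.
function loadConfig(env = process.env) {
  // SQLITE_PATH is the only required variable, so fail fast without it.
  if (!env.SQLITE_PATH) {
    throw new Error('SQLITE_PATH is required');
  }
  return {
    port: Number(env.PORT || 3002),
    sqlitePath: env.SQLITE_PATH,
    qdrantUrl: env.QDRANT_URL || 'http://localhost:6333',
    embeddingServiceUrl: env.EMBEDDING_SERVICE_URL || 'http://localhost:3003',
  };
}
```

Passing the environment in as an argument keeps the function easy to exercise with a plain object instead of the real process environment.
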
## Internal Structure

```
src/
├── db/
│   ├── index.js      # SQLite connection + initialization
│   └── schema.js     # Table definitions, indexes, FTS5, triggers
├── episodic/
│   └── index.js      # Session + episode CRUD, FTS search, embedding write path
├── semantic/
│   └── index.js      # Qdrant collection management, upsert, search, delete
├── entities/
│   └── index.js      # Entity + relationship CRUD
└── index.js          # Express app + route definitions
```

## SQLite Schema

Five core tables:

- **sessions**: top-level conversation containers, identified by an `external_id`
- **episodes**: individual exchanges (user message + AI response) tied to a session
- **entities**: named things the system learns about (people, places, concepts)
- **relationships**: directional labeled links between entities
- **summaries**: condensed episode groups for efficient context retrieval

### FTS5 Full-Text Search

An `episodes_fts` virtual table enables keyword search across all episodes.
Three triggers (`episodes_fts_insert`, `episodes_fts_update`, `episodes_fts_delete`)
keep the FTS index automatically in sync with the episodes table.

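For illustration, the virtual table and its three triggers might look like the sketch below. The trigger and table names come from this document, but everything else is an assumption: the column names (`id`, `user_message`, `ai_response`), the use of an FTS5 external-content table, and the `applyFtsSchema` helper. The real definitions live in `src/db/schema.js` and may differ.

```javascript
// Assumed column names: id, user_message, ai_response (not confirmed by this README).
const FTS_SCHEMA_SQL = `
CREATE VIRTUAL TABLE IF NOT EXISTS episodes_fts USING fts5(
  user_message,
  ai_response,
  content='episodes',
  content_rowid='id'
);

CREATE TRIGGER IF NOT EXISTS episodes_fts_insert AFTER INSERT ON episodes BEGIN
  INSERT INTO episodes_fts(rowid, user_message, ai_response)
  VALUES (new.id, new.user_message, new.ai_response);
END;

CREATE TRIGGER IF NOT EXISTS episodes_fts_delete AFTER DELETE ON episodes BEGIN
  INSERT INTO episodes_fts(episodes_fts, rowid, user_message, ai_response)
  VALUES ('delete', old.id, old.user_message, old.ai_response);
END;

CREATE TRIGGER IF NOT EXISTS episodes_fts_update AFTER UPDATE ON episodes BEGIN
  INSERT INTO episodes_fts(episodes_fts, rowid, user_message, ai_response)
  VALUES ('delete', old.id, old.user_message, old.ai_response);
  INSERT INTO episodes_fts(rowid, user_message, ai_response)
  VALUES (new.id, new.user_message, new.ai_response);
END;
`;

// Hypothetical helper: `db` is expected to be a better-sqlite3 Database,
// whose exec() runs a multi-statement SQL script.
function applyFtsSchema(db) {
  db.exec(FTS_SCHEMA_SQL);
}
```

With an external-content table, an update is expressed as a `'delete'` of the old row followed by an insert of the new one, which is why the update trigger contains two statements.
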
### SQLite Configuration

- `journal_mode = WAL`: non-blocking reads during writes
- `foreign_keys = ON`: enforces referential integrity and cascade deletes
- PRAGMAs are set via `db.pragma()` separately from `db.exec()`

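A minimal sketch of that configuration step is shown below. `configureDatabase` is a hypothetical name, and `db` is assumed to be a `better-sqlite3` Database instance.

```javascript
// Hypothetical helper: applies the two PRAGMAs described above.
// db.pragma() is the better-sqlite3 method for PRAGMA statements.
function configureDatabase(db) {
  db.pragma('journal_mode = WAL'); // readers are not blocked by a writer
  db.pragma('foreign_keys = ON');  // needed for ON DELETE CASCADE to take effect
  return db;
}

// Possible usage (requires the better-sqlite3 package):
//   const Database = require('better-sqlite3');
//   const db = configureDatabase(new Database(process.env.SQLITE_PATH));
```

SQLite leaves foreign key enforcement off by default for each new connection, so the second PRAGMA has to run every time the database is opened.
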
## Qdrant / Semantic Layer

Three collections are initialized on service startup (created if they don't already exist):

| Collection | Purpose |
|---|---|
| `episodes` | Embeddings for individual conversation exchanges |
| `entities` | Embeddings for named entities |
| `summaries` | Embeddings for condensed episode summaries |

All collections use **768-dimension vectors** with **Cosine similarity**, matching the
output of the `nomic-embed-text` embedding model via Ollama.

Vector dimension and distance metric are defined in `@nexusai/shared` constants
(`QDRANT.VECTOR_SIZE`, `QDRANT.DISTANCE_METRIC`) and are not hardcoded in this service.

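The create-if-missing startup step could be sketched as follows. This is an illustration under assumptions: `ensureCollections` is a hypothetical name, the local `QDRANT` object stands in for the constants exported by `@nexusai/shared`, and `client` is assumed to be a `QdrantClient` from `@qdrant/js-client-rest`.

```javascript
// Stand-in for the @nexusai/shared constants (values taken from this README).
const QDRANT = { VECTOR_SIZE: 768, DISTANCE_METRIC: 'Cosine' };
const COLLECTIONS = ['episodes', 'entities', 'summaries'];

// Hypothetical helper: creates any of the three collections that are missing.
// Returns the names of the collections it created.
async function ensureCollections(client) {
  const { collections } = await client.getCollections();
  const existing = new Set(collections.map((c) => c.name));
  const created = [];
  for (const name of COLLECTIONS) {
    if (existing.has(name)) continue; // already present, leave untouched
    await client.createCollection(name, {
      vectors: { size: QDRANT.VECTOR_SIZE, distance: QDRANT.DISTANCE_METRIC },
    });
    created.push(name);
  }
  return created;
}
```

Checking the existing collections first keeps startup idempotent, so restarting the service does not disturb stored vectors.
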
### Semantic Layer Operations

Each collection exposes three operations via helper functions in `src/semantic/index.js`:

- **Upsert**: stores a vector with a payload containing the SQLite row ID, enabling
  lookups back to the full content after a vector search
- **Search**: returns the top-k most similar vectors, with optional Qdrant filter
- **Delete**: removes a vector point by ID

The `wait: true` flag is used on all write operations so the caller receives confirmation
only after Qdrant has committed the change.

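As a rough sketch, the three operations could wrap the Qdrant client like this. `createSemanticStore` and its method names are hypothetical and do not reflect the actual exports of `src/semantic/index.js`; `client` is assumed to be a `QdrantClient` from `@qdrant/js-client-rest`.

```javascript
// Hypothetical factory: binds the three operations to one collection.
function createSemanticStore(client, collection) {
  return {
    // Store (or overwrite) one point; wait: true resolves after the write is applied.
    upsert(id, vector, payload = {}) {
      return client.upsert(collection, { wait: true, points: [{ id, vector, payload }] });
    },
    // Return the `limit` most similar points, optionally narrowed by a Qdrant filter.
    search(vector, limit = 5, filter) {
      return client.search(collection, { vector, limit, ...(filter ? { filter } : {}) });
    },
    // Remove one point by ID, again waiting for confirmation.
    remove(id) {
      return client.delete(collection, { wait: true, points: [id] });
    },
  };
}
```

Only the two write paths pass `wait: true`; a search is a read and has no such flag.
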
## Embedding Write Path

When a new episode is created, the memory service automatically generates and stores
a vector embedding in Qdrant via the embedding service:

1. The episode is saved to SQLite synchronously, and the response is returned immediately
2. Both sides of the exchange are combined into a single text:
   ```
   User: {userMessage}
   Assistant: {aiResponse}
   ```
3. This text is sent to the embedding service (`POST /embed`)
4. The returned vector is upserted into the `episodes` Qdrant collection with a
   payload of `{ sessionId, createdAt }` for filtering and lookups

The embedding step is **fire-and-forget**: it runs asynchronously after the SQLite
insert succeeds. If embedding fails, the episode is still saved and searchable via
FTS. The error is logged but does not affect the API response.

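The write path above can be sketched with its collaborators passed in as plain functions. All names here (`buildEmbeddingText`, `createEpisode`, and the `deps` members) are hypothetical, and using the episode's SQLite ID as the Qdrant point ID is an assumption.

```javascript
// Combine both sides of the exchange into the single text that gets embedded.
function buildEmbeddingText(userMessage, aiResponse) {
  return `User: ${userMessage}\nAssistant: ${aiResponse}`;
}

// Hypothetical write path. `deps` supplies:
//   saveEpisode(input)            -> synchronous SQLite insert, returns the saved row
//   embed(text)                   -> Promise of a vector (e.g. a POST /embed call)
//   upsertVector(id, vec, payload)-> Promise, writes to the `episodes` collection
//   logError(err)                 -> logging hook
function createEpisode(input, deps) {
  const episode = deps.saveEpisode(input);

  // Fire-and-forget: the promise chain is not awaited, and any failure
  // is logged instead of propagating to the caller.
  Promise.resolve()
    .then(() => deps.embed(buildEmbeddingText(input.userMessage, input.aiResponse)))
    .then((vector) =>
      deps.upsertVector(episode.id, vector, {
        sessionId: episode.sessionId,
        createdAt: episode.createdAt,
      })
    )
    .catch((err) => deps.logError(err));

  return episode; // returned before embedding completes
}
```

Starting the chain with `Promise.resolve()` means even a synchronous throw inside `embed` ends up in the `.catch` handler rather than failing the request.
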
### Hybrid Retrieval Pattern

Qdrant and SQLite work as a pair; neither operates in isolation:

1. The query is embedded and searched in Qdrant, which returns IDs and similarity scores
2. The IDs are used to fetch full content from SQLite
3. Results are ranked and assembled into a context package

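The three steps above might be combined as in the following sketch. `hybridSearch` and the `deps` functions are hypothetical names, and ranking purely by similarity score is an assumption; the actual ranking logic is not described here.

```javascript
// Hypothetical retrieval helper. `deps` supplies:
//   embed(text)                  -> Promise of a query vector
//   searchVectors(vector, limit) -> Promise of [{ id, score }] from Qdrant
//   getEpisodesByIds(ids)        -> array of SQLite rows, each with an `id`
async function hybridSearch(query, deps, limit = 5) {
  const vector = await deps.embed(query);
  const hits = await deps.searchVectors(vector, limit);
  const rows = deps.getEpisodesByIds(hits.map((h) => h.id));
  const byId = new Map(rows.map((r) => [r.id, r]));

  return hits
    .filter((h) => byId.has(h.id))       // skip vectors whose row no longer exists
    .sort((a, b) => b.score - a.score)   // highest similarity first
    .map((h) => ({ ...byId.get(h.id), score: h.score }));
}
```

Filtering on `byId` covers the case where a vector outlives its SQLite row, for example if a delete reached SQLite but not Qdrant.
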
## Entity Layer

Entities and relationships are stored in SQLite with three key constraints:

- `UNIQUE(name, type)` on entities: ensures no duplicates; upsert updates existing records
- `UNIQUE(from_id, to_id, label)` on relationships: prevents duplicate edges
- `ON DELETE CASCADE` on both `from_id` and `to_id`: deleting an entity automatically
  removes all relationships where it appears on either end

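The `UNIQUE(name, type)` constraint is what makes an entity upsert possible in a single statement. The sketch below assumes `notes` and `metadata` columns (inferred from the `POST /entities` body) and a hypothetical `upsertEntity` helper; `db` is assumed to be a `better-sqlite3` Database.

```javascript
// Assumed columns: name, type, notes, metadata (metadata stored as JSON text).
const UPSERT_ENTITY_SQL = `
INSERT INTO entities (name, type, notes, metadata)
VALUES (@name, @type, @notes, @metadata)
ON CONFLICT(name, type) DO UPDATE SET
  notes = excluded.notes,
  metadata = excluded.metadata
`;

// Hypothetical helper: inserts a new entity or updates the existing
// row that shares the same (name, type) pair.
function upsertEntity(db, { name, type, notes = null, metadata = {} }) {
  return db.prepare(UPSERT_ENTITY_SQL).run({
    name,
    type,
    notes,
    metadata: JSON.stringify(metadata),
  });
}
```

In `ON CONFLICT ... DO UPDATE`, `excluded` refers to the row that failed to insert, so the existing record takes on the newly supplied values.
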
## Endpoints

### Health

| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |

### Sessions

| Method | Path | Description |
|---|---|---|
| POST | /sessions | Create a new session |
| GET | /sessions/:id | Get session by internal ID |
| GET | /sessions/by-external/:externalId | Get session by external ID |
| DELETE | /sessions/:id | Delete session (cascades to episodes + summaries) |

**POST /sessions body:**
```json
{
  "externalId": "unique-session-id",
  "metadata": {}
}
```

### Episodes

| Method | Path | Description |
|---|---|---|
| POST | /episodes | Create episode + auto-embed into Qdrant |
| GET | /episodes/search?q=&limit= | Full-text search across episodes |
| GET | /episodes/:id | Get episode by ID |
| GET | /sessions/:id/episodes?limit=&offset= | Get paginated episodes for a session |
| DELETE | /episodes/:id | Delete an episode |

**POST /episodes body:**
```json
{
  "sessionId": 1,
  "userMessage": "Hello",
  "aiResponse": "Hi there!",
  "tokenCount": 10,
  "metadata": {}
}
```

> Note: `/episodes/search` must be defined before `/episodes/:id` in Express to prevent
> the word `search` being captured as an ID parameter.

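The ordering rule in the note can be made explicit by registering both routes in one place. In this sketch, `registerEpisodeRoutes` and the `handlers` object are hypothetical; `app` is assumed to be an Express application, which matches routes in registration order.

```javascript
// Hypothetical registration helper. Express tries routes in the order
// they were added, so the literal path has to come first.
function registerEpisodeRoutes(app, handlers) {
  app.get('/episodes/search', handlers.search); // literal segment: must be first
  app.get('/episodes/:id', handlers.getById);   // would otherwise match "search" as :id
}

// Possible usage (requires the express package):
//   const app = require('express')();
//   registerEpisodeRoutes(app, { search: searchHandler, getById: getByIdHandler });
```

The same pattern applies to any literal path that shares a prefix with a parameterized one.
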
### Entities

| Method | Path | Description |
|---|---|---|
| POST | /entities | Upsert an entity (creates or updates by name + type) |
| GET | /entities/by-type/:type | Get all entities of a given type |
| GET | /entities/:id | Get entity by internal ID |
| DELETE | /entities/:id | Delete entity (cascades to relationships) |

**POST /entities body:**
```json
{
  "name": "NexusAI",
  "type": "project",
  "notes": "My AI memory project",
  "metadata": {}
}
```

> Note: `/entities/by-type/:type` must be defined before `/entities/:id` in Express to
> prevent `by-type` being captured as an ID parameter.

### Relationships

| Method | Path | Description |
|---|---|---|
| POST | /relationships | Upsert a relationship between two entities |
| GET | /entities/:id/relationships | Get all relationships originating from an entity |
| DELETE | /relationships | Delete a specific relationship |

**POST /relationships body:**
```json
{
  "fromId": 1,
  "toId": 2,
  "label": "uses",
  "metadata": {}
}
```

**DELETE /relationships body:**
```json
{
  "fromId": 1,
  "toId": 2,
  "label": "uses"
}
```

> Relationships are identified by the composite key `(fromId, toId, label)`. Delete uses
> the request body rather than URL params as this three-part key is awkward to express
> cleanly in a path.
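
Because the composite key arrives in the request body, a handler would likely validate all three parts before touching the database. The following sketch shows one way to do that; `parseRelationshipKey` and its error messages are hypothetical and not part of the service's documented behavior.

```javascript
// Hypothetical validator for the DELETE /relationships body.
// Returns the three-part key or throws a TypeError describing the problem.
function parseRelationshipKey(body) {
  const { fromId, toId, label } = body ?? {};
  if (!Number.isInteger(fromId) || !Number.isInteger(toId)) {
    throw new TypeError('fromId and toId must be integers');
  }
  if (typeof label !== 'string' || label.length === 0) {
    throw new TypeError('label must be a non-empty string');
  }
  return { fromId, toId, label };
}
```

In an Express handler with JSON body parsing enabled, this would be called with `req.body`, and a thrown `TypeError` could be mapped to a 400 response.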