# Memory Service

**Package:** `@nexusai/memory-service`  
**Location:** `packages/memory-service`  
**Deployed on:** Mini PC 1 (192.168.0.81)  
**Port:** 3002

## Purpose

Responsible for all reading and writing of long-term memory. Acts as the sole interface
to both SQLite and Qdrant — no other service accesses these stores directly. On episode
creation, automatically calls the embedding service to generate and store a vector in
Qdrant.

## Dependencies

- `express` — HTTP API
- `better-sqlite3` — SQLite driver
- `@qdrant/js-client-rest` — Qdrant vector store client
- `dotenv` — environment variable loading
- `@nexusai/shared` — shared utilities and constants

## Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| PORT | No | 3002 | Port to listen on |
| SQLITE_PATH | Yes | — | Path to SQLite database file |
| QDRANT_URL | No | http://localhost:6333 | Qdrant instance URL |
| EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |

## Internal Structure

```
src/
├── db/
│   ├── index.js      # SQLite connection + initialization
│   └── schema.js     # Table definitions, indexes, FTS5, triggers
├── episodic/
│   └── index.js      # Session + episode CRUD, FTS search, embedding write path
├── semantic/
│   └── index.js      # Qdrant collection management, upsert, search, delete
├── entities/
│   └── index.js      # Entity + relationship CRUD
└── index.js          # Express app + route definitions
```

## SQLite Schema

Five core tables:

- **sessions** — top-level conversation containers, identified by an `external_id`
- **episodes** — individual exchanges (user message + AI response) tied to a session
- **entities** — named things the system learns about (people, places, concepts)
- **relationships** — directional labeled links between entities
- **summaries** — condensed episode groups for efficient context retrieval

### FTS5 Full-Text Search

An `episodes_fts` virtual table enables keyword search across all episodes.
Three triggers (`episodes_fts_insert`, `episodes_fts_update`, `episodes_fts_delete`)
keep the FTS index automatically in sync with the episodes table.

### SQLite Configuration

- `journal_mode = WAL` — non-blocking reads during writes
- `foreign_keys = ON` — enforces referential integrity and cascade deletes
- PRAGMAs are set via `db.pragma()` separately from `db.exec()`

## Qdrant / Semantic Layer

Three collections are initialized on service startup (created if they don't already exist):

| Collection | Purpose |
|---|---|
| `episodes` | Embeddings for individual conversation exchanges |
| `entities` | Embeddings for named entities |
| `summaries` | Embeddings for condensed episode summaries |

All collections use **768-dimension vectors** with **Cosine similarity**, matching the
output of the `nomic-embed-text` embedding model via Ollama. Vector dimension and
distance metric are defined in `@nexusai/shared` constants (`QDRANT.VECTOR_SIZE`,
`QDRANT.DISTANCE_METRIC`) — not hardcoded in this service.

### Semantic Layer Operations

Each collection exposes three operations via helper functions in `src/semantic/index.js`:

- **Upsert** — stores a vector with a payload containing the SQLite row ID, enabling
  lookups back to the full content after a vector search
- **Search** — returns the top-k most similar vectors, with optional Qdrant filter
- **Delete** — removes a vector point by ID

The `wait: true` flag is used on all write operations so the caller receives confirmation
only after Qdrant has committed the change.

## Embedding Write Path

When a new episode is created, the memory service automatically generates and stores a
vector embedding in Qdrant via the embedding service:

1. Episode is saved to SQLite synchronously — the response is returned immediately
2. Both sides of the exchange are combined into a single text:
   ```
   User: {userMessage}
   Assistant: {aiResponse}
   ```
3. This text is sent to the embedding service (`POST /embed`)
4. The returned vector is upserted into the `episodes` Qdrant collection with a payload
   of `{ sessionId, createdAt }` for filtering and lookups

The embedding step is **fire-and-forget** — it runs asynchronously after the SQLite
insert succeeds. If embedding fails, the episode is still saved and searchable via FTS.
The error is logged but does not affect the API response.

### Hybrid Retrieval Pattern

Qdrant and SQLite work as a pair — neither operates in isolation:

1. Query is embedded and searched in Qdrant → returns IDs + similarity scores
2. IDs are used to fetch full content from SQLite
3. Results are ranked and assembled into a context package

## Entity Layer

Entities and relationships are stored in SQLite with three key constraints:

- `UNIQUE(name, type)` on entities — ensures no duplicates; upsert updates existing records
- `UNIQUE(from_id, to_id, label)` on relationships — prevents duplicate edges
- `ON DELETE CASCADE` on both `from_id` and `to_id` — deleting an entity automatically
  removes all relationships where it appears on either end

## Endpoints

### Health

| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |

### Sessions

| Method | Path | Description |
|---|---|---|
| POST | /sessions | Create a new session |
| GET | /sessions/:id | Get session by internal ID |
| GET | /sessions/by-external/:externalId | Get session by external ID |
| DELETE | /sessions/:id | Delete session (cascades to episodes + summaries) |

**POST /sessions body:**
```json
{
  "externalId": "unique-session-id",
  "metadata": {}
}
```

### Episodes

| Method | Path | Description |
|---|---|---|
| POST | /episodes | Create episode + auto-embed into Qdrant |
| GET | /episodes/search?q=&limit= | Full-text search across episodes |
| GET | /episodes/:id | Get episode by ID |
| GET | /sessions/:id/episodes?limit=&offset= | Get paginated episodes for a session |
| DELETE | /episodes/:id | Delete an episode |

**POST /episodes body:**
```json
{
  "sessionId": 1,
  "userMessage": "Hello",
  "aiResponse": "Hi there!",
  "tokenCount": 10,
  "metadata": {}
}
```

> Note: `/episodes/search` must be defined before `/episodes/:id` in Express to prevent
> the word `search` being captured as an ID parameter.

### Entities

| Method | Path | Description |
|---|---|---|
| POST | /entities | Upsert an entity (creates or updates by name + type) |
| GET | /entities/by-type/:type | Get all entities of a given type |
| GET | /entities/:id | Get entity by internal ID |
| DELETE | /entities/:id | Delete entity (cascades to relationships) |

**POST /entities body:**
```json
{
  "name": "NexusAI",
  "type": "project",
  "notes": "My AI memory project",
  "metadata": {}
}
```

> Note: `/entities/by-type/:type` must be defined before `/entities/:id` in Express to
> prevent `by-type` being captured as an ID parameter.

### Relationships

| Method | Path | Description |
|---|---|---|
| POST | /relationships | Upsert a relationship between two entities |
| GET | /entities/:id/relationships | Get all relationships originating from an entity |
| DELETE | /relationships | Delete a specific relationship |

**POST /relationships body:**
```json
{
  "fromId": 1,
  "toId": 2,
  "label": "uses",
  "metadata": {}
}
```

**DELETE /relationships body:**
```json
{
  "fromId": 1,
  "toId": 2,
  "label": "uses"
}
```

> Relationships are identified by the composite key `(fromId, toId, label)`. Delete uses
> the request body rather than URL params as this three-part key is awkward to express
> cleanly in a path.
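
**Example: calling DELETE /relationships from a client**

Because the delete key travels in the request body, a caller has to send JSON with a `DELETE` request, which some HTTP helpers do not do by default. The sketch below shows one way a client might do this. The helper names (`buildDeleteRelationshipRequest`, `deleteRelationship`) are hypothetical and not part of this package, the base URL is assumed from the deployment details at the top of this document, and the response is returned untouched because its shape is not documented here.

```javascript
// Assumed base URL, taken from the "Deployed on" and "Port" fields above.
const MEMORY_SERVICE_URL = 'http://192.168.0.81:3002';

// Hypothetical helper: builds the fetch arguments for DELETE /relationships.
// The composite key (fromId, toId, label) goes in the JSON body, not the path.
function buildDeleteRelationshipRequest(baseUrl, { fromId, toId, label }) {
  return {
    url: `${baseUrl}/relationships`,
    options: {
      method: 'DELETE',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ fromId, toId, label }),
    },
  };
}

// Hypothetical helper: sends the request and rejects on a non-2xx status.
// Requires a runtime with a global fetch (Node 18 or later).
async function deleteRelationship(key, baseUrl = MEMORY_SERVICE_URL) {
  const { url, options } = buildDeleteRelationshipRequest(baseUrl, key);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`DELETE /relationships failed with status ${res.status}`);
  }
  return res;
}
```

A caller would then write `await deleteRelationship({ fromId: 1, toId: 2, label: 'uses' })`. Keeping the request construction separate from the network call makes the body format easy to check without a running service.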