Memory Service
Package: @nexusai/memory-service
Location: packages/memory-service
Deployed on: Mini PC 1 (192.168.0.81)
Port: 3002
Purpose
Responsible for all reading and writing of long-term memory. Acts as the sole interface to both SQLite and Qdrant — no other service accesses these stores directly. On episode creation, automatically calls the embedding service to generate and store a vector in Qdrant.
Dependencies
- express: HTTP API
- better-sqlite3: SQLite driver
- @qdrant/js-client-rest: Qdrant vector store client
- dotenv: environment variable loading
- @nexusai/shared: shared utilities and constants
Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| PORT | No | 3002 | Port to listen on |
| SQLITE_PATH | Yes | — | Path to SQLite database file |
| QDRANT_URL | No | http://localhost:6333 | Qdrant instance URL |
| EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |
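The defaults in the table above can be applied with a small loader. This is a minimal sketch rather than the service's actual code: `loadConfig` and the returned property names are hypothetical, and the environment is passed in as an argument so the function can be exercised without touching `process.env`.

```javascript
// Hypothetical helper: resolves the variables listed in the table above.
function loadConfig(env) {
  // SQLITE_PATH is the only required variable and has no default.
  if (!env.SQLITE_PATH) {
    throw new Error('SQLITE_PATH is required');
  }
  const port = Number.parseInt(env.PORT ?? '3002', 10);
  if (Number.isNaN(port)) {
    throw new Error(`PORT must be a number, got "${env.PORT}"`);
  }
  return {
    port,
    sqlitePath: env.SQLITE_PATH,
    qdrantUrl: env.QDRANT_URL ?? 'http://localhost:6333',
    embeddingServiceUrl: env.EMBEDDING_SERVICE_URL ?? 'http://localhost:3003',
  };
}
```

A service would typically call this once at startup as `loadConfig(process.env)`, after dotenv has loaded the `.env` file.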
Internal Structure
src/
├── db/
│ ├── index.js # SQLite connection + initialization
│ └── schema.js # Table definitions, indexes, FTS5, triggers
├── episodic/
│ └── index.js # Session + episode CRUD, FTS search, embedding write path
├── semantic/
│ └── index.js # Qdrant collection management, upsert, search, delete
├── entities/
│ └── index.js # Entity + relationship CRUD
└── index.js # Express app + route definitions
SQLite Schema
Five core tables:
- sessions: top-level conversation containers, identified by an external_id
- episodes: individual exchanges (user message + AI response) tied to a session
- entities: named things the system learns about (people, places, concepts)
- relationships: directional labeled links between entities
- summaries: condensed episode groups for efficient context retrieval
FTS5 Full-Text Search
An episodes_fts virtual table enables keyword search across all episodes.
Three triggers (episodes_fts_insert, episodes_fts_update, episodes_fts_delete)
keep the FTS index automatically in sync with the episodes table.
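One common way to wire up such an index is the external-content pattern from the SQLite FTS5 documentation, sketched below. The table and trigger names come from the text above; the indexed column names (`user_message`, `ai_response`), the `id` rowid column, and the `createFtsIndex` helper are assumptions, and the real definitions live in src/db/schema.js.

```javascript
// Assumed column names; only the table and trigger names are taken from this page.
const FTS_SCHEMA_SQL = `
CREATE VIRTUAL TABLE IF NOT EXISTS episodes_fts USING fts5(
  user_message,
  ai_response,
  content='episodes',
  content_rowid='id'
);

CREATE TRIGGER IF NOT EXISTS episodes_fts_insert AFTER INSERT ON episodes BEGIN
  INSERT INTO episodes_fts(rowid, user_message, ai_response)
  VALUES (new.id, new.user_message, new.ai_response);
END;

CREATE TRIGGER IF NOT EXISTS episodes_fts_delete AFTER DELETE ON episodes BEGIN
  INSERT INTO episodes_fts(episodes_fts, rowid, user_message, ai_response)
  VALUES ('delete', old.id, old.user_message, old.ai_response);
END;

CREATE TRIGGER IF NOT EXISTS episodes_fts_update AFTER UPDATE ON episodes BEGIN
  INSERT INTO episodes_fts(episodes_fts, rowid, user_message, ai_response)
  VALUES ('delete', old.id, old.user_message, old.ai_response);
  INSERT INTO episodes_fts(rowid, user_message, ai_response)
  VALUES (new.id, new.user_message, new.ai_response);
END;
`;

// db is expected to be a better-sqlite3 Database; exec() accepts multiple statements.
function createFtsIndex(db) {
  db.exec(FTS_SCHEMA_SQL);
}
```

The update trigger removes the old row's terms and then inserts the new ones, which is how an external-content FTS5 table is kept consistent with its source table.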
SQLite Configuration
- journal_mode = WAL: non-blocking reads during writes
- foreign_keys = ON: enforces referential integrity and cascade deletes
- PRAGMAs are set via db.pragma() separately from db.exec()
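A minimal sketch of that configuration step, assuming `db` is a better-sqlite3 `Database` instance; the `configureDatabase` helper name is hypothetical.

```javascript
// db is expected to be a better-sqlite3 Database instance.
function configureDatabase(db) {
  // WAL lets readers proceed while a write is in progress.
  db.pragma('journal_mode = WAL');
  // Enforce foreign keys so that ON DELETE CASCADE rules take effect.
  db.pragma('foreign_keys = ON');
  return db;
}
```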
Qdrant / Semantic Layer
Three collections are initialized on service startup (created if they don't already exist):
| Collection | Purpose |
|---|---|
| episodes | Embeddings for individual conversation exchanges |
| entities | Embeddings for named entities |
| summaries | Embeddings for condensed episode summaries |
All collections use 768-dimension vectors with Cosine similarity, matching the
output of the nomic-embed-text embedding model via Ollama.
Vector dimension and distance metric are defined in @nexusai/shared constants
(QDRANT.VECTOR_SIZE, QDRANT.DISTANCE_METRIC) — not hardcoded in this service.
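The create-if-missing startup step could look roughly like the sketch below. It assumes `client` is a `QdrantClient` from @qdrant/js-client-rest; the `ensureCollections` helper is hypothetical, and `vectorSize` and `distance` stand in for the shared `QDRANT.VECTOR_SIZE` and `QDRANT.DISTANCE_METRIC` constants.

```javascript
const COLLECTIONS = ['episodes', 'entities', 'summaries'];

// client is expected to be a QdrantClient from @qdrant/js-client-rest.
async function ensureCollections(client, { vectorSize, distance }) {
  const { collections } = await client.getCollections();
  const existing = new Set(collections.map((c) => c.name));
  const created = [];

  for (const name of COLLECTIONS) {
    if (existing.has(name)) continue; // already present, leave untouched
    await client.createCollection(name, {
      vectors: { size: vectorSize, distance },
    });
    created.push(name);
  }
  // Names of the collections created during this call.
  return created;
}
```

With the values described above, the call would be `ensureCollections(client, { vectorSize: 768, distance: 'Cosine' })`.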
Semantic Layer Operations
Each collection exposes three operations via helper functions in src/semantic/index.js:
- Upsert — stores a vector with a payload containing the SQLite row ID, enabling lookups back to the full content after a vector search
- Search — returns the top-k most similar vectors, with optional Qdrant filter
- Delete — removes a vector point by ID
The wait: true flag is used on all write operations so the caller receives confirmation
only after Qdrant has committed the change.
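The three operations might be wrapped as in the following sketch. It assumes a `QdrantClient` from @qdrant/js-client-rest; `createSemanticStore`, the reuse of the SQLite row ID as the point ID, and the `sqliteId` payload key are all assumptions made for illustration.

```javascript
// client is expected to be a QdrantClient from @qdrant/js-client-rest.
function createSemanticStore(client, collection) {
  return {
    // Assumption: the SQLite row ID doubles as the point ID and is also
    // stored in the payload under the hypothetical key "sqliteId".
    upsert(id, vector, payload = {}) {
      return client.upsert(collection, {
        wait: true, // resolve only after Qdrant has committed the write
        points: [{ id, vector, payload: { ...payload, sqliteId: id } }],
      });
    },
    // Returns the top-k nearest points, optionally narrowed by a Qdrant filter.
    search(vector, { limit = 5, filter } = {}) {
      return client.search(collection, {
        vector,
        limit,
        with_payload: true,
        ...(filter ? { filter } : {}),
      });
    },
    remove(id) {
      return client.delete(collection, { wait: true, points: [id] });
    },
  };
}
```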
Embedding Write Path
When a new episode is created, the memory service automatically generates and stores a vector embedding in Qdrant via the embedding service:
- Episode is saved to SQLite synchronously — the response is returned immediately
- Both sides of the exchange are combined into a single text: User: {userMessage} Assistant: {aiResponse}
- This text is sent to the embedding service (POST /embed)
- The returned vector is upserted into the episodes Qdrant collection with a payload of { sessionId, createdAt } for filtering and lookups
The embedding step is fire-and-forget — it runs asynchronously after the SQLite insert succeeds. If embedding fails, the episode is still saved and searchable via FTS. The error is logged but does not affect the API response.
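The write path described above can be sketched as follows. The three injected functions (`insertEpisode`, `embedText`, `upsertVector`) are hypothetical stand-ins for the SQLite insert, the `POST /embed` call, and the Qdrant upsert; the shape of the saved row is likewise assumed.

```javascript
// Hypothetical dependencies, injected for illustration:
//   insertEpisode(input)              -> saved row (synchronous SQLite insert)
//   embedText(text)                   -> Promise<number[]> (POST /embed)
//   upsertVector(id, vector, payload) -> Promise (upsert into "episodes")
function createEpisode(deps, input) {
  const { insertEpisode, embedText, upsertVector, logger = console } = deps;

  // Step 1: the episode is persisted before any embedding work starts.
  const episode = insertEpisode(input);

  // Step 2: both sides of the exchange become a single text.
  const text = `User: ${input.userMessage} Assistant: ${input.aiResponse}`;

  // Steps 3 and 4: fire-and-forget. Failures are logged and never rethrown.
  const embedding = Promise.resolve()
    .then(() => embedText(text))
    .then((vector) =>
      upsertVector(episode.id, vector, {
        sessionId: episode.sessionId,
        createdAt: episode.createdAt,
      })
    )
    .catch((err) => {
      logger.error(`Embedding failed for episode ${episode.id}:`, err.message);
    });

  // The promise is returned only so tests can wait on it; a route handler
  // would respond with `episode` right away and ignore `embedding`.
  return { episode, embedding };
}
```

Because the chain ends in `.catch`, a failing embedding service cannot produce an unhandled rejection or change the API response.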
Hybrid Retrieval Pattern
Qdrant and SQLite work as a pair — neither operates in isolation:
- Query is embedded and searched in Qdrant → returns IDs + similarity scores
- IDs are used to fetch full content from SQLite
- Results are ranked and assembled into a context package
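The three steps above can be sketched as a single function. The injected helpers (`embedText`, `searchVectors`, `getEpisodesByIds`) are hypothetical, and ranking purely by similarity score is an assumption; the service may combine other signals.

```javascript
// Hypothetical dependencies, injected for illustration:
//   embedText(text)              -> Promise<number[]>
//   searchVectors(vector, limit) -> Promise<Array<{ id, score }>>  (Qdrant)
//   getEpisodesByIds(ids)        -> Array<{ id, ... }>              (SQLite)
async function hybridSearch(deps, query, limit = 5) {
  const { embedText, searchVectors, getEpisodesByIds } = deps;

  // Step 1: embed the query and find the nearest vectors.
  const vector = await embedText(query);
  const hits = await searchVectors(vector, limit);

  // Step 2: resolve the returned IDs to full rows.
  const rows = getEpisodesByIds(hits.map((hit) => hit.id));
  const rowsById = new Map(rows.map((row) => [row.id, row]));

  // Step 3: rank by similarity and attach the score to each row.
  return hits
    .filter((hit) => rowsById.has(hit.id)) // skip vectors whose row is gone
    .sort((a, b) => b.score - a.score)
    .map((hit) => ({ ...rowsById.get(hit.id), score: hit.score }));
}
```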
Entity Layer
Entities and relationships are stored in SQLite with the following key constraints:
- UNIQUE(name, type) on entities: ensures no duplicates; upsert updates existing records
- UNIQUE(from_id, to_id, label) on relationships: prevents duplicate edges
- ON DELETE CASCADE on both from_id and to_id: deleting an entity automatically removes all relationships where it appears on either end
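A schema expressing these constraints might look like the sketch below. Only the constraints and the `name`, `type`, `from_id`, `to_id`, and `label` columns come from the text; the remaining columns, the SQL constants, and the `upsertEntity` helper are assumptions, with `db` expected to be a better-sqlite3 `Database`.

```javascript
// Columns other than those named in the constraints are assumptions.
const ENTITY_SCHEMA_SQL = `
CREATE TABLE IF NOT EXISTS entities (
  id    INTEGER PRIMARY KEY,
  name  TEXT NOT NULL,
  type  TEXT NOT NULL,
  notes TEXT,
  UNIQUE (name, type)
);

CREATE TABLE IF NOT EXISTS relationships (
  id      INTEGER PRIMARY KEY,
  from_id INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
  to_id   INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
  label   TEXT NOT NULL,
  UNIQUE (from_id, to_id, label)
);
`;

// The UNIQUE(name, type) constraint is what ON CONFLICT targets here.
const UPSERT_ENTITY_SQL = `
INSERT INTO entities (name, type, notes)
VALUES (@name, @type, @notes)
ON CONFLICT (name, type) DO UPDATE SET notes = excluded.notes
`;

// entity: { name, type, notes }, bound to the @name/@type/@notes parameters.
function upsertEntity(db, entity) {
  return db.prepare(UPSERT_ENTITY_SQL).run(entity);
}
```

Posting the same `name` and `type` twice therefore updates the existing row instead of inserting a second one.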
Endpoints
Health
| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |
Sessions
| Method | Path | Description |
|---|---|---|
| POST | /sessions | Create a new session |
| GET | /sessions/:id | Get session by internal ID |
| GET | /sessions/by-external/:externalId | Get session by external ID |
| DELETE | /sessions/:id | Delete session (cascades to episodes + summaries) |
POST /sessions body:
{
"externalId": "unique-session-id",
"metadata": {}
}
Episodes
| Method | Path | Description |
|---|---|---|
| POST | /episodes | Create episode + auto-embed into Qdrant |
| GET | /episodes/search?q=&limit= | Full-text search across episodes |
| GET | /episodes/:id | Get episode by ID |
| GET | /sessions/:id/episodes?limit=&offset= | Get paginated episodes for a session |
| DELETE | /episodes/:id | Delete an episode |
POST /episodes body:
{
"sessionId": 1,
"userMessage": "Hello",
"aiResponse": "Hi there!",
"tokenCount": 10,
"metadata": {}
}
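From a calling service, the session and episode endpoints could be combined as in this sketch. The base URL is built from the host and port listed at the top of this page; the helper names are hypothetical, and the assumption that the session response exposes its internal ID as `id` is not confirmed by this document.

```javascript
// Built from the deployment host and port given at the top of this page.
const MEMORY_SERVICE_URL = 'http://192.168.0.81:3002';

// fetchImpl defaults to the global fetch (Node 18+) and is injectable for tests.
async function postJson(path, body, fetchImpl = fetch) {
  const res = await fetchImpl(`${MEMORY_SERVICE_URL}${path}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!res.ok) {
    throw new Error(`POST ${path} failed with status ${res.status}`);
  }
  return res.json();
}

// Assumption: the created session is returned with its internal ID as `id`.
async function recordExchange(externalId, exchange, fetchImpl = fetch) {
  const session = await postJson('/sessions', { externalId, metadata: {} }, fetchImpl);
  return postJson(
    '/episodes',
    {
      sessionId: session.id,
      userMessage: exchange.userMessage,
      aiResponse: exchange.aiResponse,
      tokenCount: exchange.tokenCount,
      metadata: {},
    },
    fetchImpl
  );
}
```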
Note: /episodes/search must be defined before /episodes/:id in Express to prevent the word search being captured as an ID parameter.
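Express matches routes in registration order, so the static path has to be registered first. A minimal sketch, assuming `app` is an Express application or router; `registerEpisodeRoutes` and the `handlers` object are hypothetical names.

```javascript
// app is expected to be an Express application (or Router).
function registerEpisodeRoutes(app, handlers) {
  app.post('/episodes', handlers.create);

  // Static path first: Express tries routes in the order they were added.
  app.get('/episodes/search', handlers.search);

  // Registered afterwards, so "search" never reaches this handler as :id.
  app.get('/episodes/:id', handlers.getById);
  app.delete('/episodes/:id', handlers.remove);
}
```

The same ordering applies to any static segment that shares a prefix with a parameterized route.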
Entities
| Method | Path | Description |
|---|---|---|
| POST | /entities | Upsert an entity (creates or updates by name + type) |
| GET | /entities/by-type/:type | Get all entities of a given type |
| GET | /entities/:id | Get entity by internal ID |
| DELETE | /entities/:id | Delete entity (cascades to relationships) |
POST /entities body:
{
"name": "NexusAI",
"type": "project",
"notes": "My AI memory project",
"metadata": {}
}
Note: /entities/by-type/:type must be defined before /entities/:id in Express to prevent by-type being captured as an ID parameter.
Relationships
| Method | Path | Description |
|---|---|---|
| POST | /relationships | Upsert a relationship between two entities |
| GET | /entities/:id/relationships | Get all relationships originating from an entity |
| DELETE | /relationships | Delete a specific relationship |
POST /relationships body:
{
"fromId": 1,
"toId": 2,
"label": "uses",
"metadata": {}
}
DELETE /relationships body:
{
"fromId": 1,
"toId": 2,
"label": "uses"
}
Relationships are identified by the composite key
(fromId, toId, label). Delete uses the request body rather than URL params as this three-part key is awkward to express cleanly in a path.
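A handler reading the composite key from the request body might be sketched as follows. Everything here is illustrative: `parseRelationshipKey`, `makeDeleteRelationshipHandler`, the injected `deleteRelationship` function, and the 400/404 status codes are assumptions, not the service's documented behavior.

```javascript
// Validates the composite key and maps it to the snake_case column names.
function parseRelationshipKey(body) {
  const { fromId, toId, label } = body ?? {};
  if (!Number.isInteger(fromId) || !Number.isInteger(toId)) {
    throw new TypeError('fromId and toId must be integers');
  }
  if (typeof label !== 'string' || label.length === 0) {
    throw new TypeError('label must be a non-empty string');
  }
  return { from_id: fromId, to_id: toId, label };
}

// deleteRelationship(key) is a hypothetical stand-in for the SQLite delete;
// it is assumed to return true when a row was removed.
function makeDeleteRelationshipHandler(deleteRelationship) {
  return (req, res) => {
    let key;
    try {
      key = parseRelationshipKey(req.body);
    } catch (err) {
      return res.status(400).json({ error: err.message });
    }
    const deleted = deleteRelationship(key);
    return res.status(deleted ? 200 : 404).json({ deleted });
  };
}
```

For `req.body` to be populated on a DELETE request, a JSON body parser such as `express.json()` has to be registered before the route.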