From 7d3f083485b3c008cf8c4a2de2c5363104acd21b Mon Sep 17 00:00:00 2001
From: Storme-bit
Date: Sat, 4 Apr 2026 08:15:29 -0700
Subject: [PATCH] updated documentation for semantic and constant refactor

---
 docs/architecture/overview.md   | 42 +++++++++++++-------
 docs/services/memory-service.md | 59 +++++++++++++++++++++-------
 docs/services/shared.md         | 69 ++++++++++++++++++++++++++++-----
 3 files changed, 132 insertions(+), 38 deletions(-)

diff --git a/docs/architecture/overview.md b/docs/architecture/overview.md
index 9d05cb5..0c7fafd 100644
--- a/docs/architecture/overview.md
+++ b/docs/architecture/overview.md
@@ -1,38 +1,50 @@
 # Architecture Overview
 
-NexusAI is a modular, memory-centric AI system designed for persistent, context-aware conversations. It separates concerns across different services that can be independently deployed and evolved
+NexusAI is a modular, memory-centric AI system designed for persistent, context-aware conversations. It separates concerns across different services that can be independently deployed and evolved.
 
 ## Core Design Principles
-- **Decoupled layers:** memory, inference, orchestration independent of eachother
-- **Hybrid retrieval:** semantic similarity (QDrant) combined with structured storage (SQLite) for flexible, ranked context assembly
-- **Home lab:** Services are properly distributed across the various nodes according to available hardware and resources
+
+- **Decoupled layers:** memory, inference, and orchestration are independent of each other
+- **Hybrid retrieval:** semantic similarity (Qdrant) combined with structured storage (SQLite) for flexible, ranked context assembly
+- **Home lab:** services are distributed across nodes according to available hardware and resources
 
 ## Memory Model
-Memory is split between SQLite and QDrant, which both work together as a pair
-- **SQlite:** episodic interactions, entities, relationships, summaries
-- **QDrant:** vector embeddings for semantic similarity search
-When recallng memory, QDrant returns IDs and similarity scores, which are used to fetch full content from SQLite. Neither SQlite or QDrant work in isolation
+Memory is split between SQLite and Qdrant, which work together as a pair:
+
+- **SQLite:** episodic interactions, entities, relationships, summaries
+- **Qdrant:** vector embeddings for semantic similarity search
+
+When recalling memory, Qdrant returns IDs and similarity scores, which are used to fetch
+full content from SQLite. Neither SQLite nor Qdrant works in isolation.
 
 ## Hardware Layout
+
+| Node | Address | Role |
 |---|---|---|
 | Main PC | local | Primary inference (RTX A4000 16GB) |
 | Mini PC 1 | 192.168.0.81 | Memory service, Embedding service, Qdrant |
 | Mini PC 2 | 192.168.0.205 | Orchestration service, Gitea |
 
 ## Service Communication
-All services expose a REST HTTP api. The orchestration service is the single entgry-point. Clients dont talk directly to the memory or inference services
+All services expose a REST HTTP API. The orchestration service is the single entry point —
+clients do not talk directly to the memory or inference services.
+
+```
 Client
 └─► Orchestration (:4000)
-├─► Memory Service (:3002)
-│ └─► Qdrant (:6333)
-│ └─► SQLite
-├─► Embedding Service (:3003)
-└─► Inference Service (:3001)
-└─► Ollama
+    ├─► Memory Service (:3002)
+    │     ├─► Qdrant (:6333)
+    │     └─► SQLite
+    ├─► Embedding Service (:3003)
+    │     └─► Ollama
+    └─► Inference Service (:3001)
+          └─► Ollama
+```
 
 ## Technology Choices
+
 | Concern | Choice | Reason |
 |---|---|---|
 | Language | Node.js (JavaScript) | Familiar stack, async I/O suits service architecture |
diff --git a/docs/services/memory-service.md b/docs/services/memory-service.md
index c201c94..b77fb7b 100644
--- a/docs/services/memory-service.md
+++ b/docs/services/memory-service.md
@@ -17,7 +17,7 @@ stores directly.
 - `better-sqlite3` — SQLite driver
 - `@qdrant/js-client-rest` — Qdrant vector store client
 - `dotenv` — environment variable loading
-- `@nexusai/shared` — shared utilities
+- `@nexusai/shared` — shared utilities and constants
 
 ## Environment Variables
 
@@ -28,18 +28,23 @@ stores directly.
 | QDRANT_URL | No | http://localhost:6333 | Qdrant instance URL |
 
 ## Internal Structure
+
+```
 src/
 ├── db/
 │   ├── index.js      # SQLite connection + initialization
 │   └── schema.js     # Table definitions, indexes, FTS5, triggers
 ├── episodic/
-│   └── index.js      # Session + episode CRUD
-├── semantic/         # Qdrant vector operations (in progress)
+│   └── index.js      # Session + episode CRUD and FTS search
+├── semantic/
+│   └── index.js      # Qdrant collection management, upsert, search, delete
 ├── entities/         # Entity + relationship CRUD (upcoming)
 └── index.js          # Express app + route definitions
+```
 
 ## SQLite Schema
-Four core tables:
+
+Five core tables:
 
 - **sessions** — top-level conversation containers, identified by an `external_id`
 - **episodes** — individual exchanges (user message + AI response) tied to a session
@@ -59,6 +64,42 @@ keep the FTS index automatically in sync with the episodes table.
 - `foreign_keys = ON` — enforces referential integrity and cascade deletes
 - PRAGMAs are set via `db.pragma()` separately from `db.exec()`
 
+## Qdrant / Semantic Layer
+
+Three collections are initialized on service startup (created if they don't already exist):
+
+| Collection | Purpose |
+|---|---|
+| `episodes` | Embeddings for individual conversation exchanges |
+| `entities` | Embeddings for named entities |
+| `summaries` | Embeddings for condensed episode summaries |
+
+All collections use **768-dimension vectors** with **Cosine similarity**, matching the
+output of the `nomic-embed-text` embedding model via Ollama.
+
+Vector dimension and distance metric are defined in `@nexusai/shared` constants
+(`QDRANT.VECTOR_SIZE`, `QDRANT.DISTANCE_METRIC`) — not hardcoded in this service.
+
+### Semantic Layer Operations
+
+Each collection exposes three operations via helper functions in `src/semantic/index.js`:
+
+- **Upsert** — stores a vector with a payload containing the SQLite row ID, enabling
+  lookups back to the full content after a vector search
+- **Search** — returns the top-k most similar vectors, with optional Qdrant filter
+- **Delete** — removes a vector point by ID
+
+The `wait: true` flag is used on all write operations so the caller receives confirmation
+only after Qdrant has committed the change.
+
+### Hybrid Retrieval Pattern
+
+Qdrant and SQLite work as a pair — neither operates in isolation:
+
+1. Query is embedded and searched in Qdrant → returns IDs + similarity scores
+2. IDs are used to fetch full content from SQLite
+3. Results are ranked and assembled into a context package
+
 ## Endpoints
 
 ### Health
@@ -105,12 +146,4 @@ keep the FTS index automatically in sync with the episodes table.
 }
 ```
 
-> Semantic (Qdrant) and entity endpoints will be documented as they are built out.
-
-## Endpoints
-
-| Method | Path | Description |
-|---|---|---|
-| GET | /health | Service health check |
-
-> Further endpoints will be documented as the service is built out.
\ No newline at end of file
+> Semantic (Qdrant) and entity REST endpoints will be documented as they are built out.
\ No newline at end of file
diff --git a/docs/services/shared.md b/docs/services/shared.md
index 0d964b7..7fb248a 100644
--- a/docs/services/shared.md
+++ b/docs/services/shared.md
@@ -1,18 +1,67 @@
 # Shared Package
 
-**Package:** '@nexusai/shared'
-**Location:** 'packages/shared'
+**Package:** `@nexusai/shared`
+**Location:** `packages/shared`
 
 ## Purpose
-Common utilities and configuration used across all NexusAI services
-Keeping these here avoids duplicating and ensure consistent behavior
-# Exports
+Common utilities and configuration used across all NexusAI services.
+Keeping these here avoids duplication and ensures consistent behaviour.
 
-### 'getEnv(key, defaultValue?)'
-Loads an environment variable by key. If no default is provided and the variable is missing, throws at startup rather than failing later on.
-```javascript
+## Exports
+
+### `getEnv(key, defaultValue?)`
+
+Loads an environment variable by key. If no default is provided and the
+variable is missing, throws at startup rather than failing silently later.
+
+```js
 const { getEnv } = require('@nexusai/shared');
-const PORT = getEnv('PORT', '3002'); // optional — falls back to 3002
-const DB = getEnv('SQLITE_PATH'); // required — throws if missing
+
+const PORT = getEnv('PORT', '3002');   // optional — falls back to 3002
+const DB   = getEnv('SQLITE_PATH');    // required — throws if missing
 ```
+
+---
+
+### Constants
+
+Tuneable values and shared identifiers are centralised in `constants.js`
+rather than hardcoded across services. Import the relevant group by name.
+
+```js
+const { QDRANT, COLLECTIONS, EPISODIC } = require('@nexusai/shared');
+```
+
+#### `QDRANT`
+
+Vector store configuration. Values here must stay in sync with the
+embedding model and Qdrant collection setup.
+
+| Key | Value | Description |
+|---|---|---|
+| `DEFAULT_URL` | `http://localhost:6333` | Fallback Qdrant URL if `QDRANT_URL` env var is not set |
+| `VECTOR_SIZE` | `768` | Output dimensions of `nomic-embed-text` |
+| `DISTANCE_METRIC` | `'Cosine'` | Similarity metric used for all collections |
+| `DEFAULT_LIMIT` | `10` | Default top-k for vector searches |
+
+#### `COLLECTIONS`
+
+Canonical Qdrant collection names. Used by both the semantic layer and
+any service that constructs Qdrant queries directly.
+
+| Key | Value |
+|---|---|
+| `EPISODES` | `'episodes'` |
+| `ENTITIES` | `'entities'` |
+| `SUMMARIES` | `'summaries'` |
+
+#### `EPISODIC`
+
+Default pagination and result limits for SQLite episode queries.
+
+| Key | Value | Description |
+|---|---|---|
+| `DEFAULT_RECENT_LIMIT` | `10` | Default number of recent episodes to retrieve |
+| `DEFAULT_PAGE_SIZE` | `20` | Default episodes per page for paginated queries |
+| `DEFAULT_SEARCH_LIMIT` | `10` | Default number of FTS search results to return |
\ No newline at end of file
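
The behaviour that `docs/services/shared.md` documents for `@nexusai/shared` can be illustrated with a small sketch. This is not the code touched by this commit: the constant names and values are copied from the tables in the patch, while the single-file layout, the use of `Object.freeze`, the treatment of an empty string as a set value, and the error message are all assumptions made for illustration.

```javascript
'use strict';

// Hypothetical sketch of packages/shared. Names and values mirror the
// tables in docs/services/shared.md; everything else is assumed.

// Vector store configuration (must match the embedding model's output size).
const QDRANT = Object.freeze({
  DEFAULT_URL: 'http://localhost:6333',
  VECTOR_SIZE: 768,
  DISTANCE_METRIC: 'Cosine',
  DEFAULT_LIMIT: 10,
});

// Canonical Qdrant collection names.
const COLLECTIONS = Object.freeze({
  EPISODES: 'episodes',
  ENTITIES: 'entities',
  SUMMARIES: 'summaries',
});

// Default limits for SQLite episode queries.
const EPISODIC = Object.freeze({
  DEFAULT_RECENT_LIMIT: 10,
  DEFAULT_PAGE_SIZE: 20,
  DEFAULT_SEARCH_LIMIT: 10,
});

// Returns the environment variable `key`. Falls back to `defaultValue` when
// the variable is unset, and throws when it is unset and no default is given.
// Assumption: an empty string counts as "set" and is returned as-is.
function getEnv(key, defaultValue) {
  const value = process.env[key];
  if (value !== undefined) return value;
  if (defaultValue !== undefined) return defaultValue;
  throw new Error(`Missing required environment variable: ${key}`);
}

module.exports = { getEnv, QDRANT, COLLECTIONS, EPISODIC };
```

Under these assumptions a service could resolve its Qdrant URL with `getEnv('QDRANT_URL', QDRANT.DEFAULT_URL)`, which matches the documented fallback, and freezing the groups keeps one service from mutating values that others rely on.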