From 7d3f083485b3c008cf8c4a2de2c5363104acd21b Mon Sep 17 00:00:00 2001
From: Storme-bit <tk.stomre@gmail.com>
Date: Sat, 4 Apr 2026 08:15:29 -0700
Subject: [PATCH] Update documentation for semantic layer and constants refactor

---
 docs/architecture/overview.md   | 42 +++++++++++++-------
 docs/services/memory-service.md | 59 +++++++++++++++++++++-------
 docs/services/shared.md         | 69 ++++++++++++++++++++++++++++-----
 3 files changed, 132 insertions(+), 38 deletions(-)

diff --git a/docs/architecture/overview.md b/docs/architecture/overview.md
index 9d05cb5..0c7fafd 100644
--- a/docs/architecture/overview.md
+++ b/docs/architecture/overview.md
@@ -1,38 +1,50 @@
 # Architecture Overview
 
-NexusAI is a modular, memory-centric AI system designed for persistent, context-aware conversations.  It separates concerns across different services that can be independently deployed and evolved
+NexusAI is a modular, memory-centric AI system designed for persistent, context-aware conversations. It separates concerns across different services that can be independently deployed and evolved.
 
 ## Core Design Principles
-- **Decoupled layers:** memory, inference, orchestration independent of eachother
-- **Hybrid retrieval:** semantic similarity (QDrant) combined with structured storage (SQLite) for flexible, ranked context assembly
-- **Home lab:**  Services are properly distributed across the various nodes according to available hardware and resources
+
+- **Decoupled layers:** memory, inference, and orchestration are independent of each other
+- **Hybrid retrieval:** semantic similarity (Qdrant) combined with structured storage (SQLite) for flexible, ranked context assembly
+- **Home lab:** services are distributed across nodes according to available hardware and resources
 
 ## Memory Model
-Memory is split between SQLite and QDrant, which both work together as a pair
-- **SQlite:** episodic interactions, entities, relationships, summaries
-- **QDrant:** vector embeddings for semantic similarity search
 
-When recallng memory, QDrant returns IDs and similarity scores, which are used to fetch full content from SQLite.  Neither SQlite or QDrant work in isolation
+Memory is split between SQLite and Qdrant, which work together as a pair:
+
+- **SQLite:** episodic interactions, entities, relationships, summaries
+- **Qdrant:** vector embeddings for semantic similarity search
+
+When recalling memory, Qdrant returns IDs and similarity scores, which are used to fetch
+full content from SQLite. Neither SQLite nor Qdrant works in isolation.
 
 ## Hardware Layout
+
+| Node | Address | Role |
 |---|---|---|
 | Main PC | local | Primary inference (RTX A4000 16GB) |
 | Mini PC 1 | 192.168.0.81 | Memory service, Embedding service, Qdrant |
 | Mini PC 2 | 192.168.0.205 | Orchestration service, Gitea |
 
 ## Service Communication
-All services expose a REST HTTP api.  The orchestration service is the single entgry-point.  Clients dont talk directly to the memory or inference services
 
+All services expose a REST HTTP API. The orchestration service is the single entry point —
+clients do not talk directly to the memory or inference services.
+
+```
 Client
 └─► Orchestration (:4000)
-├─► Memory Service (:3002)
-│     └─► Qdrant (:6333)
-│     └─► SQLite
-├─► Embedding Service (:3003)
-└─► Inference Service (:3001)
-└─► Ollama
+    ├─► Memory Service (:3002)
+    │     ├─► Qdrant (:6333)
+    │     └─► SQLite
+    ├─► Embedding Service (:3003)
+    │     └─► Ollama
+    └─► Inference Service (:3001)
+          └─► Ollama
+```
 
 ## Technology Choices
+
 | Concern | Choice | Reason |
 |---|---|---|
 | Language | Node.js (JavaScript) | Familiar stack, async I/O suits service architecture |
diff --git a/docs/services/memory-service.md b/docs/services/memory-service.md
index c201c94..b77fb7b 100644
--- a/docs/services/memory-service.md
+++ b/docs/services/memory-service.md
@@ -17,7 +17,7 @@ stores directly.
 - `better-sqlite3` — SQLite driver
 - `@qdrant/js-client-rest` — Qdrant vector store client
 - `dotenv` — environment variable loading
-- `@nexusai/shared` — shared utilities
+- `@nexusai/shared` — shared utilities and constants
 
 ## Environment Variables
 
@@ -28,18 +28,23 @@ stores directly.
 | QDRANT_URL | No | http://localhost:6333 | Qdrant instance URL |
 
 ## Internal Structure
+
+```
 src/
 ├── db/
 │   ├── index.js       # SQLite connection + initialization
 │   └── schema.js      # Table definitions, indexes, FTS5, triggers
 ├── episodic/
-│   └── index.js       # Session + episode CRUD
-├── semantic/          # Qdrant vector operations (in progress)
+│   └── index.js       # Session + episode CRUD and FTS search
+├── semantic/
+│   └── index.js       # Qdrant collection management, upsert, search, delete
 ├── entities/          # Entity + relationship CRUD (upcoming)
 └── index.js           # Express app + route definitions
+```
 
 ## SQLite Schema
-Four core tables:
+
+Five core tables:
 
 - **sessions** — top-level conversation containers, identified by an `external_id`
 - **episodes** — individual exchanges (user message + AI response) tied to a session
@@ -59,6 +64,42 @@ keep the FTS index automatically in sync with the episodes table.
 - `foreign_keys = ON` — enforces referential integrity and cascade deletes
 - PRAGMAs are set via `db.pragma()` separately from `db.exec()`
 
+## Qdrant / Semantic Layer
+
+Three collections are initialized on service startup (created if they don't already exist):
+
+| Collection | Purpose |
+|---|---|
+| `episodes` | Embeddings for individual conversation exchanges |
+| `entities` | Embeddings for named entities |
+| `summaries` | Embeddings for condensed episode summaries |
+
+All collections use **768-dimension vectors** with **Cosine similarity**, matching the
+output of the `nomic-embed-text` embedding model via Ollama.
+
+Vector dimension and distance metric are defined in `@nexusai/shared` constants
+(`QDRANT.VECTOR_SIZE`, `QDRANT.DISTANCE_METRIC`) — not hardcoded in this service.
+
+### Semantic Layer Operations
+
+Each collection exposes three operations via helper functions in `src/semantic/index.js`:
+
+- **Upsert** — stores a vector with a payload containing the SQLite row ID, enabling
+  lookups back to the full content after a vector search
+- **Search** — returns the top-k most similar vectors, with an optional Qdrant filter
+- **Delete** — removes a vector point by ID
+
+The `wait: true` flag is used on all write operations so the caller receives confirmation
+only after Qdrant has committed the change.
+
+### Hybrid Retrieval Pattern
+
+Qdrant and SQLite work as a pair — neither operates in isolation:
+
+1. The query is embedded and searched in Qdrant, which returns IDs and similarity scores
+2. IDs are used to fetch full content from SQLite
+3. Results are ranked and assembled into a context package
+
 ## Endpoints
 
 ### Health
@@ -105,12 +146,4 @@ keep the FTS index automatically in sync with the episodes table.
 }
 ```
 
-> Semantic (Qdrant) and entity endpoints will be documented as they are built out.
-
-## Endpoints
-
-| Method | Path | Description |
-|---|---|---|
-| GET | /health | Service health check |
-
-> Further endpoints will be documented as the service is built out.
\ No newline at end of file
+> Semantic (Qdrant) and entity REST endpoints will be documented as they are built out.
\ No newline at end of file
diff --git a/docs/services/shared.md b/docs/services/shared.md
index 0d964b7..7fb248a 100644
--- a/docs/services/shared.md
+++ b/docs/services/shared.md
@@ -1,18 +1,67 @@
 # Shared Package
 
-**Package:** '@nexusai/shared'
-**Location:** 'packages/shared'
+**Package:** `@nexusai/shared`  
+**Location:** `packages/shared`
 
 ## Purpose
-Common utilities and configuration used across all NexusAI services
-Keeping these here avoids duplicating and ensure consistent behavior
 
-# Exports
+Common utilities and configuration used across all NexusAI services.
+Keeping these here avoids duplication and ensures consistent behaviour.
 
-### 'getEnv(key, defaultValue?)'
-Loads an environment variable by key.  If no default is provided and the variable is missing, throws at startup rather than failing later on.
-```javascript
+## Exports
+
+### `getEnv(key, defaultValue?)`
+
+Loads an environment variable by key. If no default is provided and the
+variable is missing, throws at startup rather than failing silently later.
+
+```js
 const { getEnv } = require('@nexusai/shared');
-const PORT = getEnv('PORT', '3002');         // optional — falls back to 3002
-const DB   = getEnv('SQLITE_PATH');          // required — throws if missing
+
+const PORT = getEnv('PORT', '3002');   // optional — falls back to 3002
+const DB   = getEnv('SQLITE_PATH');    // required — throws if missing
 ```
+
+---
+
+### Constants
+
+Tunable values and shared identifiers are centralised in `constants.js`
+rather than hardcoded across services. Import the relevant group by name.
+
+```js
+const { QDRANT, COLLECTIONS, EPISODIC } = require('@nexusai/shared');
+```
+
+#### `QDRANT`
+
+Vector store configuration. Values here must stay in sync with the
+embedding model and Qdrant collection setup.
+
+| Key | Value | Description |
+|---|---|---|
+| `DEFAULT_URL` | `http://localhost:6333` | Fallback Qdrant URL if `QDRANT_URL` env var is not set |
+| `VECTOR_SIZE` | `768` | Output dimensions of `nomic-embed-text` |
+| `DISTANCE_METRIC` | `'Cosine'` | Similarity metric used for all collections |
+| `DEFAULT_LIMIT` | `10` | Default top-k for vector searches |
+
+#### `COLLECTIONS`
+
+Canonical Qdrant collection names. Used by both the semantic layer and
+any service that constructs Qdrant queries directly.
+
+| Key | Value |
+|---|---|
+| `EPISODES` | `'episodes'` |
+| `ENTITIES` | `'entities'` |
+| `SUMMARIES` | `'summaries'` |
+
+#### `EPISODIC`
+
+Default pagination and result limits for SQLite episode queries.
+
+| Key | Value | Description |
+|---|---|---|
+| `DEFAULT_RECENT_LIMIT` | `10` | Default number of recent episodes to retrieve |
+| `DEFAULT_PAGE_SIZE` | `20` | Default episodes per page for paginated queries |
+| `DEFAULT_SEARCH_LIMIT` | `10` | Default number of FTS search results to return |
\ No newline at end of file