From ebbeecfa1a4dc1f0525b86dec5767e78ea1f1dbc Mon Sep 17 00:00:00 2001
From: Storme-bit
Date: Sat, 4 Apr 2026 05:22:36 -0700
Subject: [PATCH] Add initial project documentation

---
 docs/README.md                         | 12 ++++++++
 docs/architecture/overview.md          | 43 +++++++++++++++++++++++++++++
 docs/deployment/homelab.md             | 42 ++++++++++++++++++++++++++++
 docs/services/embedding-service.md     | 34 +++++++++++++++++++++++
 docs/services/inference-service.md     | 35 +++++++++++++++++++++++
 docs/services/memory-service.md        | 36 ++++++++++++++++++++++++
 docs/services/orchestration-service.md | 36 ++++++++++++++++++++++++
 docs/services/shared.md                | 18 ++++++++++++
 8 files changed, 256 insertions(+)
 create mode 100644 docs/README.md
 create mode 100644 docs/architecture/overview.md
 create mode 100644 docs/deployment/homelab.md
 create mode 100644 docs/services/embedding-service.md
 create mode 100644 docs/services/inference-service.md
 create mode 100644 docs/services/memory-service.md
 create mode 100644 docs/services/orchestration-service.md
 create mode 100644 docs/services/shared.md

diff --git a/docs/README.md b/docs/README.md
new file mode 100644
index 0000000..6eafe0e
--- /dev/null
+++ b/docs/README.md
@@ -0,0 +1,12 @@
+# NexusAI Documentation
+
+## Contents
+
+- [Architecture Overview](architecture/overview.md)
+- [Services](services/)
+  - [Shared Package](services/shared.md)
+  - [Memory Service](services/memory-service.md)
+  - [Embedding Service](services/embedding-service.md)
+  - [Inference Service](services/inference-service.md)
+  - [Orchestration Service](services/orchestration-service.md)
+- [Deployment](deployment/homelab.md)
\ No newline at end of file
diff --git a/docs/architecture/overview.md b/docs/architecture/overview.md
new file mode 100644
index 0000000..9d05cb5
--- /dev/null
+++ b/docs/architecture/overview.md
@@ -0,0 +1,43 @@
+# Architecture Overview
+
+NexusAI is a modular, memory-centric AI system designed for persistent, context-aware conversations.
+It separates concerns across services that can be deployed and evolved independently.
+
+## Core Design Principles
+- **Decoupled layers:** memory, inference, and orchestration are independent of each other
+- **Hybrid retrieval:** semantic similarity (Qdrant) combined with structured storage (SQLite) for flexible, ranked context assembly
+- **Home-lab fit:** services are distributed across the nodes according to each node's available hardware and resources
+
+## Memory Model
+Memory is split between SQLite and Qdrant, which work together as a pair:
+- **SQLite:** episodic interactions, entities, relationships, summaries
+- **Qdrant:** vector embeddings for semantic similarity search
+
+When recalling memory, Qdrant returns IDs and similarity scores, which are used to fetch full content from SQLite. Neither store works in isolation.
+
+## Hardware Layout
+| Node | Address | Role |
+|---|---|---|
+| Main PC | local | Primary inference (RTX A4000 16GB) |
+| Mini PC 1 | 192.168.0.81 | Memory service, Embedding service, Qdrant |
+| Mini PC 2 | 192.168.0.205 | Orchestration service, Gitea |
+
+## Service Communication
+All services expose a REST HTTP API. The orchestration service is the single entry point.
+Clients don't talk directly to the memory or inference services.
+
+Client
+ └─► Orchestration (:4000)
+      ├─► Memory Service (:3002)
+      │    ├─► Qdrant (:6333)
+      │    └─► SQLite
+      ├─► Embedding Service (:3003)
+      └─► Inference Service (:3001)
+           └─► Ollama
+
+## Technology Choices
+| Concern | Choice | Reason |
+|---|---|---|
+| Language | Node.js (JavaScript) | Familiar stack, async I/O suits service architecture |
+| Package management | npm workspaces | Monorepo with shared code, no publishing needed |
+| Vector store | Qdrant | Mature, Docker-native, excellent Node.js client |
+| Relational store | SQLite (better-sqlite3) | Zero-ops, fast, sufficient for single-user |
+| LLM runtime | Ollama | Easiest local LLM management, serves embeddings too |
+| Version control | Gitea (self-hosted) | Code stays on local network |
\ No newline at end of file
diff --git a/docs/deployment/homelab.md b/docs/deployment/homelab.md
new file mode 100644
index 0000000..8cdc9a9
--- /dev/null
+++ b/docs/deployment/homelab.md
@@ -0,0 +1,42 @@
+# Homelab Deployment
+
+## Overview
+
+NexusAI is distributed across three nodes. Each node runs only the
+services appropriate for its hardware.
+
+## Mini PC 1 — 192.168.0.81
+
+Runs: Qdrant, Memory Service, Embedding Service
+
+```bash
+ssh username@192.168.0.81
+cd ~/nexusai
+docker compose -f docker-compose.mini1.yml up -d  # Qdrant
+npm run memory
+npm run embedding
+```
+
+## Mini PC 2 — 192.168.0.205
+
+Runs: Gitea, Orchestration Service
+
+```bash
+ssh username@192.168.0.205
+cd ~/gitea
+docker compose up -d  # Gitea
+cd ~/nexusai
+npm run orchestration
+```
+
+## Main PC
+
+Runs: Ollama, Inference Service
+
+```bash
+ollama serve
+npm run inference
+```
+
+## Environment Files
+
+Each node needs a `.env` file in the relevant service package directory.
+These are not committed to git. See each service's documentation for
+required variables.
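As an illustration, a `.env` for the memory service on Mini PC 1 might look like the following. The variable names come from the service documentation below; the values, and the database path in particular, are examples only:

```ini
# packages/memory-service/.env — example values, adjust per node
PORT=3002
SQLITE_PATH=/home/username/nexusai/data/memory.db
QDRANT_URL=http://localhost:6333
```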
\ No newline at end of file
diff --git a/docs/services/embedding-service.md b/docs/services/embedding-service.md
new file mode 100644
index 0000000..cc9819c
--- /dev/null
+++ b/docs/services/embedding-service.md
@@ -0,0 +1,34 @@
+# Embedding Service
+
+**Package:** `@nexusai/embedding-service`
+**Location:** `packages/embedding-service`
+**Deployed on:** Mini PC 1 (192.168.0.81)
+**Port:** 3003
+
+## Purpose
+
+Converts text into vector embeddings for storage in Qdrant. Keeps
+embedding workload off the main inference node.
+
+## Dependencies
+
+- `express` — HTTP API
+- `ollama` — Ollama client for embedding model
+- `dotenv` — environment variable loading
+- `@nexusai/shared` — shared utilities
+
+## Environment Variables
+
+| Variable | Required | Default | Description |
+|---|---|---|---|
+| PORT | No | 3003 | Port to listen on |
+| OLLAMA_URL | No | http://localhost:11434 | Ollama instance URL |
+| EMBEDDING_MODEL | No | nomic-embed-text | Ollama embedding model to use |
+
+## Endpoints
+
+| Method | Path | Description |
+|---|---|---|
+| GET | /health | Service health check |
+
+> Further endpoints will be documented as the service is built out.
\ No newline at end of file
diff --git a/docs/services/inference-service.md b/docs/services/inference-service.md
new file mode 100644
index 0000000..4a8e961
--- /dev/null
+++ b/docs/services/inference-service.md
@@ -0,0 +1,35 @@
+# Inference Service
+
+**Package:** `@nexusai/inference-service`
+**Location:** `packages/inference-service`
+**Deployed on:** Main PC
+**Port:** 3001
+
+## Purpose
+
+Thin adapter layer around the local LLM runtime (Ollama). Receives
+assembled context packages from the orchestration service and returns
+model responses.
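As a rough sketch of the adapter idea, a context package could be translated into an Ollama-style chat request like this. The field names (`systemPrompt`, `memories`, `userMessage`) are hypothetical, not the service's actual wire format:

```javascript
// Hypothetical sketch: turn an assembled context package into an
// Ollama-style chat request. All field names are illustrative.
function buildChatRequest(pkg, model = 'llama3') {
  const messages = [];
  if (pkg.systemPrompt) {
    messages.push({ role: 'system', content: pkg.systemPrompt });
  }
  // Recalled memories are placed in the context ahead of the new user turn.
  for (const memory of pkg.memories || []) {
    messages.push({ role: 'system', content: `Memory: ${memory}` });
  }
  messages.push({ role: 'user', content: pkg.userMessage });
  return { model, messages };
}
```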
+
+## Dependencies
+
+- `express` — HTTP API
+- `ollama` — Ollama client
+- `dotenv` — environment variable loading
+- `@nexusai/shared` — shared utilities
+
+## Environment Variables
+
+| Variable | Required | Default | Description |
+|---|---|---|---|
+| PORT | No | 3001 | Port to listen on |
+| OLLAMA_URL | No | http://localhost:11434 | Ollama instance URL |
+| DEFAULT_MODEL | No | llama3 | Default model to use for inference |
+
+## Endpoints
+
+| Method | Path | Description |
+|---|---|---|
+| GET | /health | Service health check |
+
+> Further endpoints will be documented as the service is built out.
\ No newline at end of file
diff --git a/docs/services/memory-service.md b/docs/services/memory-service.md
new file mode 100644
index 0000000..be70b82
--- /dev/null
+++ b/docs/services/memory-service.md
@@ -0,0 +1,36 @@
+# Memory Service
+
+**Package:** `@nexusai/memory-service`
+**Location:** `packages/memory-service`
+**Deployed on:** Mini PC 1 (192.168.0.81)
+**Port:** 3002
+
+## Purpose
+
+Responsible for all reading and writing of long-term memory. Acts as the
+sole interface to both SQLite and Qdrant — no other service accesses these
+stores directly.
+
+## Dependencies
+
+- `express` — HTTP API
+- `better-sqlite3` — SQLite driver
+- `@qdrant/js-client-rest` — Qdrant vector store client
+- `dotenv` — environment variable loading
+- `@nexusai/shared` — shared utilities
+
+## Environment Variables
+
+| Variable | Required | Default | Description |
+|---|---|---|---|
+| PORT | No | 3002 | Port to listen on |
+| SQLITE_PATH | Yes | — | Path to SQLite database file |
+| QDRANT_URL | No | http://localhost:6333 | Qdrant instance URL |
+
+## Endpoints
+
+| Method | Path | Description |
+|---|---|---|
+| GET | /health | Service health check |
+
+> Further endpoints will be documented as the service is built out.
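To make the hybrid-retrieval flow concrete, here is a sketch of recall: Qdrant supplies IDs and similarity scores, and SQLite supplies the full content. The `episodes` collection/table name is invented for illustration, and the store handles are injected rather than reflecting the service's real wiring:

```javascript
// Sketch of hybrid recall. 'qdrant' and 'db' are injected handles
// (e.g. a @qdrant/js-client-rest client and a better-sqlite3 database);
// the 'episodes' collection/table names are hypothetical.
async function recall(qdrant, db, queryVector, limit = 5) {
  // Step 1: Qdrant returns only IDs and similarity scores.
  const hits = await qdrant.search('episodes', { vector: queryVector, limit });
  // Step 2: full episode content is fetched from SQLite by ID.
  const stmt = db.prepare('SELECT * FROM episodes WHERE id = ?');
  return hits.map(hit => ({ score: hit.score, episode: stmt.get(hit.id) }));
}
```

Because the handles are injected, the ranking step and the content step stay separable, which matches the "neither store works in isolation" design.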
\ No newline at end of file
diff --git a/docs/services/orchestration-service.md b/docs/services/orchestration-service.md
new file mode 100644
index 0000000..c0c5f00
--- /dev/null
+++ b/docs/services/orchestration-service.md
@@ -0,0 +1,36 @@
+# Orchestration Service
+
+**Package:** `@nexusai/orchestration-service`
+**Location:** `packages/orchestration-service`
+**Deployed on:** Mini PC 2 (192.168.0.205)
+**Port:** 4000
+
+## Purpose
+
+The main entry point for all clients. Assembles context packages from
+memory, routes prompts to inference, and writes new episodes back to
+memory after each interaction.
+
+## Dependencies
+
+- `express` — HTTP API
+- `node-fetch` — inter-service HTTP communication
+- `dotenv` — environment variable loading
+- `@nexusai/shared` — shared utilities
+
+## Environment Variables
+
+| Variable | Required | Default | Description |
+|---|---|---|---|
+| PORT | No | 4000 | Port to listen on |
+| MEMORY_SERVICE_URL | No | http://localhost:3002 | Memory service URL |
+| EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |
+| INFERENCE_SERVICE_URL | No | http://localhost:3001 | Inference service URL |
+
+## Endpoints
+
+| Method | Path | Description |
+|---|---|---|
+| GET | /health | Service health check |
+
+> Further endpoints will be documented as the service is built out.
\ No newline at end of file
diff --git a/docs/services/shared.md b/docs/services/shared.md
new file mode 100644
index 0000000..0d964b7
--- /dev/null
+++ b/docs/services/shared.md
@@ -0,0 +1,18 @@
+# Shared Package
+
+**Package:** `@nexusai/shared`
+**Location:** `packages/shared`
+
+## Purpose
+Common utilities and configuration used across all NexusAI services.
+Keeping these here avoids duplication and ensures consistent behavior.
+
+## Exports
+
+### `getEnv(key, defaultValue?)`
+Loads an environment variable by key. If no default is provided and the variable is missing, it throws at startup rather than failing later.
+```javascript
+const { getEnv } = require('@nexusai/shared');
+
+const PORT = getEnv('PORT', '3002'); // optional — falls back to 3002
+const DB = getEnv('SQLITE_PATH');    // required — throws if missing
+```
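For reference, a minimal implementation consistent with that contract could look like the following. This is a sketch, not necessarily the package's actual code:

```javascript
// Sketch of getEnv: return the variable if set, else the supplied
// default, else fail fast at startup with a descriptive error.
function getEnv(key, defaultValue) {
  const value = process.env[key];
  if (value !== undefined && value !== '') return value;
  if (defaultValue !== undefined) return defaultValue;
  throw new Error(`Missing required environment variable: ${key}`);
}
```

Failing fast here is the point: a service with a missing required variable dies immediately at boot instead of misbehaving mid-request.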