Add initial project documentation
12
docs/README.md
Normal file
@@ -0,0 +1,12 @@
# NexusAI Documentation

## Contents

- [Architecture Overview](architecture/overview.md)
- [Services](services/)
  - [Shared Package](services/shared.md)
  - [Memory Service](services/memory-service.md)
  - [Embedding Service](services/embedding-service.md)
  - [Inference Service](services/inference-service.md)
  - [Orchestration Service](services/orchestration-service.md)
- [Deployment](deployment/homelab.md)
43
docs/architecture/overview.md
Normal file
@@ -0,0 +1,43 @@
# Architecture Overview

NexusAI is a modular, memory-centric AI system designed for persistent, context-aware conversations. It separates concerns across services that can be deployed and evolved independently.

## Core Design Principles

- **Decoupled layers:** memory, inference, and orchestration are independent of each other
- **Hybrid retrieval:** semantic similarity (Qdrant) combined with structured storage (SQLite) for flexible, ranked context assembly
- **Home lab fit:** services are distributed across the nodes according to the hardware and resources available on each

## Memory Model

Memory is split between SQLite and Qdrant, which work together as a pair:

- **SQLite:** episodic interactions, entities, relationships, summaries
- **Qdrant:** vector embeddings for semantic similarity search

When recalling memory, Qdrant returns IDs and similarity scores, which are used to fetch the full content from SQLite. Neither store works in isolation.
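This recall flow can be sketched as a small merge step. The function name and row shape below are illustrative assumptions, not the actual NexusAI code:

```javascript
// Hypothetical sketch of recall: Qdrant supplies (id, score) pairs,
// SQLite supplies the full rows, and the result is ranked by similarity.
function assembleRecall(qdrantHits, fetchRowsById) {
  const rows = fetchRowsById(qdrantHits.map(h => h.id));
  const byId = new Map(rows.map(r => [r.id, r]));
  return qdrantHits
    .filter(h => byId.has(h.id))            // drop ids with no SQLite row
    .sort((a, b) => b.score - a.score)      // highest similarity first
    .map(h => ({ ...byId.get(h.id), score: h.score }));
}
```

The join-by-id step is why the two stores must share identifiers: the vector search alone returns no content, only pointers.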
## Hardware Layout

| Node | Address | Role |
|---|---|---|
| Main PC | local | Primary inference (RTX A4000 16GB) |
| Mini PC 1 | 192.168.0.81 | Memory service, Embedding service, Qdrant |
| Mini PC 2 | 192.168.0.205 | Orchestration service, Gitea |

## Service Communication

All services expose a REST HTTP API. The orchestration service is the single entry point; clients do not talk directly to the memory or inference services.

```
Client
 └─► Orchestration (:4000)
      ├─► Memory Service (:3002)
      │    ├─► Qdrant (:6333)
      │    └─► SQLite
      ├─► Embedding Service (:3003)
      └─► Inference Service (:3001)
           └─► Ollama
```

## Technology Choices

| Concern | Choice | Reason |
|---|---|---|
| Language | Node.js (JavaScript) | Familiar stack, async I/O suits service architecture |
| Package management | npm workspaces | Monorepo with shared code, no publishing needed |
| Vector store | Qdrant | Mature, Docker-native, excellent Node.js client |
| Relational store | SQLite (better-sqlite3) | Zero-ops, fast, sufficient for single-user |
| LLM runtime | Ollama | Easiest local LLM management, serves embeddings too |
| Version control | Gitea (self-hosted) | Code stays on the local network |
42
docs/deployment/homelab.md
Normal file
@@ -0,0 +1,42 @@
# Homelab Deployment

## Overview

NexusAI is distributed across three nodes. Each node runs only the services appropriate for its hardware.

## Mini PC 1 — 192.168.0.81

Runs: Qdrant, Memory Service, Embedding Service

```bash
ssh username@192.168.0.81
cd ~/nexusai
docker compose -f docker-compose.mini1.yml up -d   # Qdrant
npm run memory
npm run embedding
```

## Mini PC 2 — 192.168.0.205

Runs: Gitea, Orchestration Service

```bash
ssh username@192.168.0.205
cd ~/gitea
docker compose up -d   # Gitea
cd ~/nexusai
npm run orchestration
```

## Main PC

Runs: Ollama, Inference Service

```bash
ollama serve
npm run inference
```

## Environment Files

Each node needs a `.env` file in the relevant service package directory. These are not committed to git. See each service's documentation for required variables.
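As an illustration, a `.env` for the memory service on Mini PC 1 might look like the following — the variable names come from the memory service doc, but the values (especially the database path) are examples only:

```
PORT=3002
SQLITE_PATH=/home/username/nexusai/data/nexus.db
QDRANT_URL=http://localhost:6333
```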
34
docs/services/embedding-service.md
Normal file
@@ -0,0 +1,34 @@
# Embedding Service

**Package:** `@nexusai/embedding-service`
**Location:** `packages/embedding-service`
**Deployed on:** Mini PC 1 (192.168.0.81)
**Port:** 3003

## Purpose

Converts text into vector embeddings for storage in Qdrant. Keeps the embedding workload off the main inference node.
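A sketch of the core operation this service performs. `embedText` is a hypothetical helper (only `/health` is documented so far), and the Ollama client is passed in so the flow can be exercised without a running Ollama instance:

```javascript
// Hypothetical helper: validate input, ask Ollama for an embedding,
// and return the vector plus its dimensionality.
async function embedText(text, ollamaClient, model = 'nomic-embed-text') {
  if (!text || typeof text !== 'string') {
    throw new Error('text must be a non-empty string');
  }
  // The `ollama` npm client exposes embeddings({ model, prompt }).
  const { embedding } = await ollamaClient.embeddings({ model, prompt: text });
  return { embedding, dimensions: embedding.length };
}
```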
## Dependencies

- `express` — HTTP API
- `ollama` — Ollama client for embedding model
- `dotenv` — environment variable loading
- `@nexusai/shared` — shared utilities

## Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| PORT | No | 3003 | Port to listen on |
| OLLAMA_URL | No | http://localhost:11434 | Ollama instance URL |
| EMBEDDING_MODEL | No | nomic-embed-text | Ollama embedding model to use |

## Endpoints

| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |

> Further endpoints will be documented as the service is built out.
35
docs/services/inference-service.md
Normal file
@@ -0,0 +1,35 @@
# Inference Service

**Package:** `@nexusai/inference-service`
**Location:** `packages/inference-service`
**Deployed on:** Main PC
**Port:** 3001

## Purpose

Thin adapter layer around the local LLM runtime (Ollama). Receives assembled context packages from the orchestration service and returns model responses.
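The adapter role might look like this sketch — the context-package shape and `runInference` are assumptions, and the Ollama client is injected so the flow is visible on its own:

```javascript
// Hypothetical adapter: turn an assembled context package into an
// Ollama chat call and return the model's text.
async function runInference(contextPackage, ollamaClient, model = 'llama3') {
  const messages = [
    { role: 'system', content: contextPackage.system },
    ...contextPackage.messages,   // prior turns assembled by orchestration
  ];
  // The `ollama` npm client exposes chat({ model, messages }).
  const response = await ollamaClient.chat({ model, messages });
  return response.message.content;
}
```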
## Dependencies

- `express` — HTTP API
- `ollama` — Ollama client
- `dotenv` — environment variable loading
- `@nexusai/shared` — shared utilities

## Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| PORT | No | 3001 | Port to listen on |
| OLLAMA_URL | No | http://localhost:11434 | Ollama instance URL |
| DEFAULT_MODEL | No | llama3 | Default model to use for inference |

## Endpoints

| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |

> Further endpoints will be documented as the service is built out.
36
docs/services/memory-service.md
Normal file
@@ -0,0 +1,36 @@
# Memory Service

**Package:** `@nexusai/memory-service`
**Location:** `packages/memory-service`
**Deployed on:** Mini PC 1 (192.168.0.81)
**Port:** 3002

## Purpose

Responsible for all reading and writing of long-term memory. Acts as the sole interface to both SQLite and Qdrant — no other service accesses these stores directly.
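Because this service is the sole gatekeeper, a write must touch both stores under one id. A hypothetical sketch with the store clients injected (names are illustrative):

```javascript
// Hypothetical write path: the episode row goes to SQLite, its embedding
// to Qdrant, keyed by the same id so recall can join the two later.
function storeEpisode(episode, embedding, { sqlite, qdrant }) {
  const id = sqlite.insertEpisode(episode);   // returns the new row id
  qdrant.upsert({ id, vector: embedding });   // same id links the two stores
  return id;
}
```

Sharing the id is the invariant the whole memory model rests on: a vector hit is useless unless it can be resolved back to a SQLite row.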
## Dependencies

- `express` — HTTP API
- `better-sqlite3` — SQLite driver
- `@qdrant/js-client-rest` — Qdrant vector store client
- `dotenv` — environment variable loading
- `@nexusai/shared` — shared utilities

## Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| PORT | No | 3002 | Port to listen on |
| SQLITE_PATH | Yes | — | Path to SQLite database file |
| QDRANT_URL | No | http://localhost:6333 | Qdrant instance URL |

## Endpoints

| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |

> Further endpoints will be documented as the service is built out.
36
docs/services/orchestration-service.md
Normal file
@@ -0,0 +1,36 @@
# Orchestration Service

**Package:** `@nexusai/orchestration-service`
**Location:** `packages/orchestration-service`
**Deployed on:** Mini PC 2 (192.168.0.205)
**Port:** 4000

## Purpose

The main entry point for all clients. Assembles context packages from memory, routes prompts to inference, and writes new episodes back to memory after each interaction.
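The loop described above can be sketched with the three downstream calls injected as plain async functions. The names and shapes are illustrative, not the documented API:

```javascript
// Hypothetical interaction loop: embed the prompt, recall context,
// run inference, then persist the new episode before replying.
async function handleInteraction(prompt, { embed, recall, infer, remember }) {
  const vector = await embed(prompt);           // Embedding Service (:3003)
  const context = await recall(vector);         // Memory Service (:3002)
  const reply = await infer(prompt, context);   // Inference Service (:3001)
  await remember({ prompt, reply });            // write the episode back
  return reply;
}
```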
## Dependencies

- `express` — HTTP API
- `node-fetch` — inter-service HTTP communication
- `dotenv` — environment variable loading
- `@nexusai/shared` — shared utilities

## Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| PORT | No | 4000 | Port to listen on |
| MEMORY_SERVICE_URL | No | http://localhost:3002 | Memory service URL |
| EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |
| INFERENCE_SERVICE_URL | No | http://localhost:3001 | Inference service URL |

## Endpoints

| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |

> Further endpoints will be documented as the service is built out.
18
docs/services/shared.md
Normal file
@@ -0,0 +1,18 @@
# Shared Package

**Package:** `@nexusai/shared`
**Location:** `packages/shared`

## Purpose

Common utilities and configuration used across all NexusAI services. Keeping these here avoids duplication and ensures consistent behavior.

## Exports

### `getEnv(key, defaultValue?)`

Loads an environment variable by key. If no default is provided and the variable is missing, throws at startup rather than failing later on.

```javascript
const { getEnv } = require('@nexusai/shared');

const PORT = getEnv('PORT', '3002'); // optional — falls back to 3002
const DB = getEnv('SQLITE_PATH');    // required — throws if missing
```
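A minimal sketch of how `getEnv` might be implemented — illustrative, not the actual source:

```javascript
// Hypothetical implementation: read process.env, fall back to the
// default, and fail fast when a required variable is absent.
function getEnv(key, defaultValue) {
  const value = process.env[key] ?? defaultValue;
  if (value === undefined) {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}
```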