# Architecture Overview
NexusAI is a modular, memory-centric AI system designed for persistent, context-aware conversations. It separates concerns across services that can be independently deployed and evolved.
## Core Design Principles
- **Decoupled layers:** memory, inference, and orchestration are independent of each other
- **Hybrid retrieval:** semantic similarity (Qdrant) combined with structured storage (SQLite) for flexible, ranked context assembly
- **Home lab deployment:** services are distributed across the nodes according to each machine's available hardware and resources
## Memory Model
Memory is split between SQLite and Qdrant, which work together as a pair:
- **SQLite:** episodic interactions, entities, relationships, summaries
- **Qdrant:** vector embeddings for semantic similarity search
When recalling memory, Qdrant returns IDs and similarity scores, which are then used to fetch the full records from SQLite. Neither SQLite nor Qdrant works in isolation; a sketch of the recall flow follows.
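As a hedged illustration only: a minimal sketch of the two-step recall, assuming the `@qdrant/js-client-rest` and `better-sqlite3` packages implied by the Technology Choices table below. The collection name, table name, and columns are hypothetical, not taken from the codebase.

```js
// Sketch: semantic search in Qdrant, then hydrate full records from SQLite.
// 'memories' (collection/table) and its columns are illustrative names.
const { QdrantClient } = require('@qdrant/js-client-rest');
const Database = require('better-sqlite3');

const qdrant = new QdrantClient({ url: 'http://192.168.0.81:6333' });
const db = new Database('memory.db', { readonly: true });

async function recall(queryEmbedding, limit = 5) {
  // 1. Qdrant returns only point IDs and similarity scores.
  const hits = await qdrant.search('memories', { vector: queryEmbedding, limit });
  if (hits.length === 0) return [];

  // 2. Fetch the full episodic records from SQLite by ID.
  const placeholders = hits.map(() => '?').join(',');
  const rows = db
    .prepare(`SELECT id, content, created_at FROM memories WHERE id IN (${placeholders})`)
    .all(hits.map((h) => h.id));

  // 3. Re-attach scores so the caller can rank the assembled context.
  const scoreById = new Map(hits.map((h) => [h.id, h.score]));
  return rows
    .map((r) => ({ ...r, score: scoreById.get(r.id) }))
    .sort((a, b) => b.score - a.score);
}
```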
## Hardware Layout
| Node | Address | Services |
|---|---|---|
| Main PC | local | Primary inference (RTX A4000 16GB) |
| Mini PC 1 | 192.168.0.81 | Memory service, Embedding service, Qdrant |
| Mini PC 2 | 192.168.0.205 | Orchestration service, Gitea |
## Service Communication
All services expose a REST HTTP API. The orchestration service is the single entry point; clients never talk directly to the memory or inference services:
```
Client
 └─► Orchestration (:4000)
      ├─► Memory Service (:3002)
      │     ├─► Qdrant (:6333)
      │     └─► SQLite
      ├─► Embedding Service (:3003)
      └─► Inference Service (:3001)
            └─► Ollama
```
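As a hedged sketch (the endpoint paths, payload shapes, and the inference host are assumptions; only the ports and the Mini PC 1 address come from the tables and diagram above), one chat turn through the orchestrator might look like this on Node 18+, where `fetch` is built in:

```js
// Sketch: the orchestrator fanning out to its downstream services.
// URLs below are assumptions; 'main-pc.lan' is a hypothetical hostname
// for the Main PC, which the hardware table lists only as "local".
const EMBEDDING_URL = 'http://192.168.0.81:3003';
const MEMORY_URL = 'http://192.168.0.81:3002';
const INFERENCE_URL = 'http://main-pc.lan:3001';

async function postJSON(url, body) {
  const res = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  return res.json();
}

async function handleChat(userMessage) {
  // 1. Embed the incoming message (Embedding Service).
  const { embedding } = await postJSON(`${EMBEDDING_URL}/embed`, { text: userMessage });

  // 2. Recall relevant context (Memory Service fronts Qdrant + SQLite).
  const { memories } = await postJSON(`${MEMORY_URL}/recall`, { embedding, limit: 5 });

  // 3. Generate a reply with the recalled context (Inference Service → Ollama).
  const { reply } = await postJSON(`${INFERENCE_URL}/generate`, {
    message: userMessage,
    context: memories,
  });
  return reply;
}
```

Because the orchestrator is the only entry point, clients see one stable API while the memory, embedding, and inference services can move between nodes without client-side changes.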
## Technology Choices
| Concern | Choice | Reason |
|---|---|---|
| Language | Node.js (JavaScript) | Familiar stack, async I/O suits service architecture |
| Package management | npm workspaces | Monorepo with shared code, no publishing needed |
| Vector store | Qdrant | Mature, Docker-native, excellent Node.js client |
| Relational store | SQLite (better-sqlite3) | Zero-ops, fast, sufficient for single-user |
| LLM runtime | Ollama | Easiest local LLM management, serves embeddings too |
| Version control | Gitea (self-hosted) | Code stays on local network |
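One hedged illustration of the "serves embeddings too" point: Ollama exposes a REST embeddings endpoint alongside generation, so the embedding service can be a thin wrapper over it. The model name below is an assumption, not a documented project choice.

```js
// Sketch: fetching an embedding from Ollama's REST API (default port 11434).
// 'nomic-embed-text' is an assumed model; any embedding model pulled into
// Ollama works the same way.
async function embed(text) {
  const res = await fetch('http://localhost:11434/api/embeddings', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model: 'nomic-embed-text', prompt: text }),
  });
  const { embedding } = await res.json();
  return embedding; // array of floats, ready to upsert into Qdrant
}
```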