Add initial project documentation
43
docs/architecture/overview.md
Normal file
@@ -0,0 +1,43 @@
# Architecture Overview

NexusAI is a modular, memory-centric AI system designed for persistent, context-aware conversations. It separates concerns across services that can be deployed and evolved independently.

## Core Design Principles
- **Decoupled layers:** memory, inference, and orchestration are independent of each other
- **Hybrid retrieval:** semantic similarity (Qdrant) combined with structured storage (SQLite) for flexible, ranked context assembly
- **Home lab:** services are distributed across the nodes according to each node's available hardware and resources

## Memory Model
Memory is split between SQLite and Qdrant, which work together as a pair:
- **SQLite:** episodic interactions, entities, relationships, summaries
- **Qdrant:** vector embeddings for semantic similarity search

When recalling memory, Qdrant returns IDs and similarity scores, which are used to fetch the full content from SQLite. Neither SQLite nor Qdrant works in isolation.
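
The two-step recall described above can be sketched as a small join function. This is a minimal illustration, not the project's actual code: `recall`, `searchVectors`, `fetchRows`, and the row shape are hypothetical names, and the two stores are passed in as plain functions so the logic runs without Qdrant or SQLite. In a real service, `searchVectors` would presumably wrap a Qdrant search and `fetchRows` a SQLite query.

```javascript
// Hypothetical sketch: join vector hits (IDs + scores) with full rows.
async function recall(queryVector, searchVectors, fetchRows, limit = 5) {
  // Step 1: the vector store returns only IDs and similarity scores.
  const hits = await searchVectors(queryVector, limit);
  if (hits.length === 0) return [];

  // Step 2: the relational store supplies the full content for those IDs.
  const rows = await fetchRows(hits.map((hit) => hit.id));
  const rowsById = new Map(rows.map((row) => [row.id, row]));

  // Join in descending score order; drop hits whose row no longer exists.
  return hits
    .filter((hit) => rowsById.has(hit.id))
    .sort((a, b) => b.score - a.score)
    .map((hit) => ({ ...rowsById.get(hit.id), score: hit.score }));
}

// In-memory stand-ins for the two stores (illustrative data only).
const fakeSearch = async () => [
  { id: 'm2', score: 0.71 },
  { id: 'm1', score: 0.93 },
  { id: 'gone', score: 0.5 },
];
const fakeRows = async (ids) =>
  [
    { id: 'm1', content: 'User prefers dark mode' },
    { id: 'm2', content: 'User owns an RTX A4000' },
  ].filter((row) => ids.includes(row.id));
```

Calling `recall([0.1, 0.2], fakeSearch, fakeRows)` resolves to the two rows that exist in both stores, ordered by score. Dropping orphaned hits is one possible policy; a stricter implementation might log or repair the mismatch instead.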

## Hardware Layout
| Node | Address | Services |
|---|---|---|
| Main PC | local | Primary inference (RTX A4000 16 GB) |
| Mini PC 1 | 192.168.0.81 | Memory service, Embedding service, Qdrant |
| Mini PC 2 | 192.168.0.205 | Orchestration service, Gitea |
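
One way to capture this layout in code is a small service registry. The sketch below is an assumption, not part of the project: `SERVICES`, `serviceUrl`, and `MAIN_PC_HOST` are hypothetical names, the addresses come from the table above, and the ports come from the Service Communication diagram in this document. The table lists the Main PC only as "local", so its host is left as a placeholder, and plain `http` is assumed.

```javascript
// Placeholder: the table gives no address for the Main PC, only "local".
const MAIN_PC_HOST = 'localhost';

// Hypothetical registry: hosts from the hardware table, ports from the
// Service Communication diagram.
const SERVICES = {
  orchestration: { host: '192.168.0.205', port: 4000 },
  memory: { host: '192.168.0.81', port: 3002 },
  embedding: { host: '192.168.0.81', port: 3003 },
  qdrant: { host: '192.168.0.81', port: 6333 },
  inference: { host: MAIN_PC_HOST, port: 3001 },
};

// Build an absolute URL for a named service; `path` is caller-supplied.
function serviceUrl(name, path = '/') {
  const service = SERVICES[name];
  if (!service) throw new Error(`Unknown service: ${name}`);
  return new URL(path, `http://${service.host}:${service.port}`).toString();
}
```

Keeping addresses in one place means a service can move to another node by editing a single entry. In practice these values would more likely come from environment variables than from constants.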

## Service Communication
All services expose a REST API over HTTP. The orchestration service is the single entry point; clients don't talk directly to the memory or inference services.

    Client
      └─► Orchestration (:4000)
            ├─► Memory Service (:3002)
            │     ├─► Qdrant (:6333)
            │     └─► SQLite
            ├─► Embedding Service (:3003)
            └─► Inference Service (:3001)
                  └─► Ollama
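
The fan-out from the orchestration service might look roughly like the following. This is a sketch under stated assumptions: `handleChat`, the `call(service, path, body)` transport, the request paths, the payload shapes, and the order of calls (embed, then recall, then generate) are all hypothetical and not taken from the project. The transport is injected so the flow runs without any service being up.

```javascript
// Hypothetical orchestration flow: the client sends one request and the
// orchestration layer talks to the three backend services on its behalf.
async function handleChat(message, call) {
  // Assumed step 1: turn the message into an embedding vector.
  const { vector } = await call('embedding', '/embed', { text: message });
  // Assumed step 2: recall relevant memories for that vector.
  const { memories } = await call('memory', '/recall', { vector, limit: 5 });
  // Assumed step 3: generate a reply with the recalled context.
  const { reply } = await call('inference', '/generate', {
    prompt: message,
    context: memories,
  });
  return reply;
}

// Stub transport that records which services were contacted, in order.
const calls = [];
async function stubCall(service, path, body) {
  calls.push(service);
  if (service === 'embedding') return { vector: [0.1, 0.2, 0.3] };
  if (service === 'memory') return { memories: ['User prefers dark mode'] };
  return { reply: `Answer using ${body.context.length} memory item(s)` };
}
```

In a real deployment, `call` would be a thin wrapper that sends an HTTP POST to the named service. Injecting it keeps the orchestration logic testable in isolation.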

## Technology Choices
| Concern | Choice | Reason |
|---|---|---|
| Language | Node.js (JavaScript) | Familiar stack, async I/O suits service architecture |
| Package management | npm workspaces | Monorepo with shared code, no publishing needed |
| Vector store | Qdrant | Mature, Docker-native, excellent Node.js client |
| Relational store | SQLite (better-sqlite3) | Zero-ops, fast, sufficient for single-user |
| LLM runtime | Ollama | Easiest local LLM management, serves embeddings too |
| Version control | Gitea (self-hosted) | Code stays on local network |