updated documentation for semantic and constant refactor

Storme-bit
2026-04-04 08:15:29 -07:00
parent bd600d9865
commit 7d3f083485
3 changed files with 132 additions and 38 deletions

@@ -1,38 +1,50 @@
# Architecture Overview
NexusAI is a modular, memory-centric AI system designed for persistent, context-aware conversations. It separates concerns across different services that can be independently deployed and evolved
NexusAI is a modular, memory-centric AI system designed for persistent, context-aware conversations. It separates concerns across different services that can be independently deployed and evolved.
## Core Design Principles
- **Decoupled layers:** memory, inference, orchestration independent of eachother
- **Hybrid retrieval:** semantic similarity (QDrant) combined with structured storage (SQLite) for flexible, ranked context assembly
- **Home lab:** Services are properly distributed across the various nodes according to available hardware and resources
- **Decoupled layers:** memory, inference, and orchestration are independent of each other
- **Hybrid retrieval:** semantic similarity (Qdrant) combined with structured storage (SQLite) for flexible, ranked context assembly
- **Home lab:** services are distributed across nodes according to available hardware and resources
## Memory Model
Memory is split between SQLite and QDrant, which both work together as a pair
- **SQlite:** episodic interactions, entities, relationships, summaries
- **QDrant:** vector embeddings for semantic similarity search
When recallng memory, QDrant returns IDs and similarity scores, which are used to fetch full content from SQLite. Neither SQlite or QDrant work in isolation
Memory is split between SQLite and Qdrant, which work together as a pair:
- **SQLite:** episodic interactions, entities, relationships, summaries
- **Qdrant:** vector embeddings for semantic similarity search
When recalling memory, Qdrant returns IDs and similarity scores, which are used to fetch
full content from SQLite. Neither SQLite nor Qdrant works in isolation.
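The recall step described above can be sketched as a join between vector hits and stored rows. This is a minimal illustration, not the project's actual code: `assembleContext`, the `hits` shape (`id`, `score`), and the `rows` shape (`id`, `content`) are hypothetical, and plain arrays stand in for the Qdrant and SQLite calls.

```javascript
// Hypothetical sketch: `hits` mimics a vector search result (ID + similarity score),
// `rows` mimics the full records fetched from SQLite for those IDs.
function assembleContext(hits, rows) {
  const byId = new Map(rows.map((row) => [row.id, row]));
  return hits
    .filter((hit) => byId.has(hit.id))   // skip vectors that have no stored row
    .sort((a, b) => b.score - a.score)   // highest similarity first
    .map((hit) => ({ ...byId.get(hit.id), score: hit.score }));
}

// Example data (made up for illustration).
const hits = [
  { id: 2, score: 0.71 },
  { id: 1, score: 0.93 },
  { id: 9, score: 0.4 }, // no matching row in `rows`
];
const rows = [
  { id: 1, content: 'User prefers short answers.' },
  { id: 2, content: 'User runs a home lab.' },
];

const context = assembleContext(hits, rows);
console.log(context.map((entry) => entry.id));
```

The vector store only contributes IDs and scores, while the text comes from the relational side, which is why neither store is useful alone.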
## Hardware Layout
| Node | Address | Role |
|---|---|---|
| Main PC | local | Primary inference (RTX A4000 16GB) |
| Mini PC 1 | 192.168.0.81 | Memory service, Embedding service, Qdrant |
| Mini PC 2 | 192.168.0.205 | Orchestration service, Gitea |
## Service Communication
All services expose a REST HTTP api. The orchestration service is the single entgry-point. Clients dont talk directly to the memory or inference services
All services expose a REST HTTP API. The orchestration service is the single entry point;
clients do not talk directly to the memory or inference services.
```
Client
└─► Orchestration (:4000)
    ├─► Memory Service (:3002)
    │   ├─► Qdrant (:6333)
    │   └─► SQLite
    ├─► Embedding Service (:3003)
    └─► Inference Service (:3001)
        └─► Ollama
    ├─► Memory Service (:3002)
    │   ├─► Qdrant (:6333)
    │   └─► SQLite
    ├─► Embedding Service (:3003)
    │   └─► Ollama
    └─► Inference Service (:3001)
        └─► Ollama
```
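The addresses in the hardware table and the ports in the diagram can be combined into a small endpoint map. This is a sketch under stated assumptions: `NODES`, `SERVICES`, and `serviceUrl` are hypothetical names, the Main PC's "local" address is assumed to mean `localhost`, and the `/chat` path in the usage line is made up for illustration.

```javascript
// Hypothetical endpoint map built from the hardware table and the diagram.
const NODES = {
  mainPc: 'localhost',       // assumption: the table only says "local"
  miniPc1: '192.168.0.81',
  miniPc2: '192.168.0.205',
};

const SERVICES = {
  orchestration: { host: NODES.miniPc2, port: 4000 },
  memory: { host: NODES.miniPc1, port: 3002 },
  embedding: { host: NODES.miniPc1, port: 3003 },
  qdrant: { host: NODES.miniPc1, port: 6333 },
  inference: { host: NODES.mainPc, port: 3001 },
};

// Build an absolute URL for a named service; throws on unknown names.
function serviceUrl(name, path = '/') {
  const service = SERVICES[name];
  if (!service) {
    throw new Error(`Unknown service: ${name}`);
  }
  return new URL(path, `http://${service.host}:${service.port}`).toString();
}

// Clients would only ever build URLs for the orchestration service.
console.log(serviceUrl('orchestration', '/chat')); // '/chat' is a hypothetical route
```

Keeping the map in one place lets a service move to a different node with a single-line change.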
## Technology Choices
| Concern | Choice | Reason |
|---|---|---|
| Language | Node.js (JavaScript) | Familiar stack, async I/O suits service architecture |