updated documentation for semantic and constant refactor

2026-04-04 08:15:29 -07:00
parent bd600d9865
commit 7d3f083485
3 changed files with 132 additions and 38 deletions
--- a/docs/architecture/overview.md
+++ b/docs/architecture/overview.md
@@ -1,38 +1,50 @@
 # Architecture Overview

-NexusAI is a modular, memory-centric AI system designed for persistent, context-aware conversations.  It separates concerns across different services that can be independently deployed and evolved
+NexusAI is a modular, memory-centric AI system designed for persistent, context-aware conversations. It separates concerns across different services that can be independently deployed and evolved.

 ## Core Design Principles
- **Decoupled layers:** memory, inference, orchestration independent of eachother
- **Hybrid retrieval:** semantic similarity (QDrant) combined with structured storage (SQLite) for flexible, ranked context assembly
- **Home lab:**  Services are properly distributed across the various nodes according to available hardware and resources
+
+- **Decoupled layers:** memory, inference, and orchestration are independent of each other
+- **Hybrid retrieval:** semantic similarity (Qdrant) combined with structured storage (SQLite) for flexible, ranked context assembly
+- **Home lab:** services are distributed across nodes according to available hardware and resources

 ## Memory Model
-Memory is split between SQLite and QDrant, which both work together as a pair
- **SQlite:** episodic interactions, entities, relationships, summaries
- **QDrant:** vector embeddings for semantic similarity search

-When recallng memory, QDrant returns IDs and similarity scores, which are used to fetch full content from SQLite.  Neither SQlite or QDrant work in isolation
+Memory is split between SQLite and Qdrant, which work together as a pair:
+
+- **SQLite:** episodic interactions, entities, relationships, summaries
+- **Qdrant:** vector embeddings for semantic similarity search
+
+When recalling memory, Qdrant returns IDs and similarity scores, which are used to fetch
+full content from SQLite. Neither SQLite nor Qdrant works in isolation.
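
The recall flow described above can be sketched as a small join step. This is only an illustration, not the project's actual code: `assembleContext`, the hit shape `{ id, score }`, and the row shape `{ id, content }` are assumptions, and both the vector search and the SQLite lookup are replaced by in-memory arrays.

```javascript
// Hypothetical helper (not from the NexusAI codebase): joins vector-search
// hits with the full records loaded from structured storage.
// `hits` stands in for what a similarity search returns: [{ id, score }].
// `rows` stands in for records fetched by ID from SQLite: [{ id, content }].
function assembleContext(hits, rows) {
  const rowsById = new Map(rows.map((row) => [row.id, row]));
  return hits
    .filter((hit) => rowsById.has(hit.id)) // drop hits with no stored record
    .sort((a, b) => b.score - a.score) // highest similarity first
    .map((hit) => ({ ...rowsById.get(hit.id), score: hit.score }));
}

// Sample data, invented for the example.
const hits = [
  { id: 7, score: 0.62 },
  { id: 3, score: 0.91 },
  { id: 9, score: 0.4 }, // no matching row, so it is dropped
];
const rows = [
  { id: 3, content: 'User prefers dark mode' },
  { id: 7, content: 'User asked about backups' },
];

const context = assembleContext(hits, rows);
console.log(context.map((entry) => `${entry.score} ${entry.content}`));
```

The sketch shows why neither store is useful alone: the hits carry ranking but no content, and the rows carry content but no ranking.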

 ## Hardware Layout
+
+| Node | Address | Role |
 |---|---|---|
 | Main PC | local | Primary inference (RTX A4000 16GB) |
 | Mini PC 1 | 192.168.0.81 | Memory service, Embedding service, Qdrant |
 | Mini PC 2 | 192.168.0.205 | Orchestration service, Gitea |

 ## Service Communication
-All services expose a REST HTTP api.  The orchestration service is the single entgry-point.  Clients dont talk directly to the memory or inference services

+All services expose a REST HTTP API. The orchestration service is the single entry point —
+clients do not talk directly to the memory or inference services.
+
+```
 Client
 └─► Orchestration (:4000)
-├─► Memory Service (:3002)
-│     └─► Qdrant (:6333)
-│     └─► SQLite
-├─► Embedding Service (:3003)
-└─► Inference Service (:3001)
-└─► Ollama
+    ├─► Memory Service (:3002)
+    │     ├─► Qdrant (:6333)
+    │     └─► SQLite
+    ├─► Embedding Service (:3003)
+    │     └─► Ollama
+    └─► Inference Service (:3001)
+          └─► Ollama
+```
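
One way the orchestration service might resolve its downstream calls is a single lookup table, sketched below. The ports come from the diagram and the `192.168.0.81` address from the hardware table; the inference host is listed only as "local" in the source, so `main-pc.example` is a placeholder, and `serviceUrl` and the `/recall` path are hypothetical names used for illustration.

```javascript
// Assumed service registry for the orchestration layer.
// Memory and embedding run on Mini PC 1 per the hardware table.
const SERVICES = {
  memory: 'http://192.168.0.81:3002',
  embedding: 'http://192.168.0.81:3003',
  inference: 'http://main-pc.example:3001', // placeholder host, not from the source
};

// Hypothetical helper: builds the full URL for a call to a named service.
function serviceUrl(name, path) {
  const base = SERVICES[name];
  if (!base) {
    throw new Error(`Unknown service: ${name}`);
  }
  return new URL(path, base).href;
}

// '/recall' is an invented endpoint path, used only as an example.
const memoryUrl = serviceUrl('memory', '/recall');
console.log(memoryUrl);
```

Keeping the addresses in one place matches the single-entry-point design: clients only need to know the orchestration address, and a service can move to another node by changing one entry.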

 ## Technology Choices
+
 | Concern | Choice | Reason |
 |---|---|---|
 | Language | Node.js (JavaScript) | Familiar stack, async I/O suits service architecture |