added chat client documentation

Storme-bit
2026-04-06 05:00:12 -07:00
parent f6cdc65464
commit 0aea052311
5 changed files with 200 additions and 7 deletions


@@ -9,4 +9,5 @@
- [Embedding Service](services/embedding-service.md)
- [Inference Service](services/inference-service.md)
- [Orchestration Service](services/orchestration-service.md)
- [Chat Client](services/chat-client.md)
- [Deployment](deployment/homelab.md)


@@ -24,7 +24,7 @@ full content from SQLite. Neither SQLite nor Qdrant work in isolation.
|---|---|---|
| Main PC | local | Primary inference (RTX A4000 16GB) |
| Mini PC 1 | 192.168.0.81 | Memory service, Embedding service, Qdrant |
| Mini PC 2 | 192.168.0.205 | Orchestration service, Gitea |
| Mini PC 2 | 192.168.0.205 | Orchestration service, Chat Client, Gitea |
## Service Communication
@@ -34,6 +34,7 @@ clients do not talk directly to the memory or inference services.
```
Client
└─► Orchestration (:4000)
├─► Chat Client (static files, /srv/nexusai)
├─► Memory Service (:3002)
│ ├─► Qdrant (:6333)
│ └─► SQLite


@@ -18,11 +18,16 @@ npm run embedding
## Mini PC 2 — 192.168.0.205
Runs: Gitea, Orchestration Service
Runs: Gitea, Orchestration Service, Chat Client (via Caddy)
```bash
ssh username@192.168.0.205
cd ~/gitea
docker compose up -d # Gitea
docker compose up -d # Gitea
cd /opt/stacks/network
docker compose up -d # Caddy, Authelia, and other network services
cd ~/nexusai
npm run orchestration
```
@@ -35,6 +40,47 @@ ollama serve
npm run inference
```
## Chat Client Deployment
The chat client is a React + Vite app built to static files and served by Caddy on Mini PC 2 (Infrastructure node). It does not run as a Node process.
```bash
# On dev machine or Mini PC 2 after git pull
cd ~/nexusAI/packages/chat-client
npm run build
# Output lands in packages/chat-client/dist/
# Caddy serves this directory directly via volume mount
```
Caddy config (`/opt/docker/caddy/Caddyfile`):
```caddy
nexus.jellystorm.com {
import authelia
handle /chat* {
reverse_proxy 192.168.0.205:4000
}
handle /sessions* {
reverse_proxy 192.168.0.205:4000
}
handle {
root * /srv/nexusai
try_files {path} /index.html
file_server
}
}
```
The Caddy container mounts the dist directory via Docker volume:
```yaml
- /home/storme/nexusAI/packages/chat-client/dist:/srv/nexusai
```
> After adding or changing volume mounts, a full `docker compose down caddy && docker compose up -d caddy`
> is required. Caddyfile-only changes only need `docker compose restart caddy`.
## Environment Files
Each node needs a `.env` file in the relevant service package directory.


@@ -0,0 +1,116 @@
# Chat Client
**Package:** `@nexusai/chat-client`
**Location:** `packages/chat-client`
**Deployed on:** Mini PC 2 (192.168.0.205)
**URL:** `https://nexus.jellystorm.com` (behind Authelia SSO)
## Purpose
Browser-based chat interface for NexusAI. Communicates exclusively with
the orchestration service — no direct access to memory, embedding, or
inference services. Served as static files by Caddy on Mini PC 2.
## Dependencies
- `react` + `react-dom` — UI framework
- `uuid` — session ID generation
- `vite` + `@vitejs/plugin-react` — build tooling
## Build
```bash
cd packages/chat-client
npm run build # outputs to dist/
npm run dev # local dev server on port 5173
```
Vite bakes environment variables into the bundle at build time. The `.env`
file is only needed on the machine running the build, not where files are served.
## Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| VITE_ORCHESTRATION_URL | No | `''` (empty) | Orchestration base URL. When empty, requests use relative paths, which the Vite proxy handles in dev and the Caddy proxy handles in production. |
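
As a rough illustration of the empty-default behaviour, the sketch below joins a base URL with an API path. The `apiUrl` helper is hypothetical and not taken from the codebase; in the Vite app the base would typically be read from `import.meta.env.VITE_ORCHESTRATION_URL`.

```javascript
// Hypothetical helper (not from the codebase): joins the configured base URL
// with an API path. An empty base yields a relative path, which the dev or
// production proxy is then expected to forward to orchestration.
function apiUrl(base, path) {
  const trimmed = (base || '').replace(/\/+$/, ''); // drop trailing slashes
  return `${trimmed}${path}`;
}

// Empty base: the browser requests a path on the page's own origin.
console.log(apiUrl('', '/sessions'));

// Explicit base: requests go straight to the given host.
console.log(apiUrl('http://192.168.0.205:4000/', '/chat'));
```

With an empty base, the browser resolves the path against whatever origin served the page, so the same bundle works behind either proxy without a rebuild.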
## Internal Structure
```
src/
├── api/
│ └── orchestration.js # All fetch calls to the orchestration service
├── hooks/
│ ├── useSession.js # Session list, history loading, active session state
│ └── useChat.js # Message sending, SSE streaming, message state
├── components/
│ ├── App.jsx # Root component — layout and shared state
│ ├── SessionList.jsx # Left sidebar — session list and new chat button
│ ├── ChatWindow.jsx # Centre panel — message thread and input bar
│ ├── MessageBubble.jsx # Individual message bubble (user or assistant)
│ └── InfoPanel.jsx # Right panel — model selector and session metadata
├── index.css # Global reset and CSS variables
└── main.jsx # React entry point
```
## Layout
Three-panel layout with collapsible sidebars:
┌─────────────────┬──────────────────────────┬───────────────┐
│ Session List    │ Chat Window              │ Info Panel    │
│ (collapsible)   │                          │ (collapsible) │
│                 │ [message thread]         │               │
│ + New Chat      │                          │ Model         │
│                 │                          │ Session ID    │
│ Session 1       │                          │ Token count   │
│ Session 2       │                          │               │
│                 │ [input bar]              │               │
└─────────────────┴──────────────────────────┴───────────────┘
On mobile, sidebars collapse to a 56px icon rail. The centre chat window
always fills the remaining space.
## API Layer
All orchestration calls are centralised in `src/api/orchestration.js`:
| Function | Method | Path | Description |
|---|---|---|---|
| `fetchSessions` | GET | /sessions | Load session list for sidebar |
| `fetchSessionHistory` | GET | /sessions/:id/history | Load episode history on session select |
| `sendMessage` | POST | /chat | Send message, await full response |
| `streamMessage` | POST | /chat/stream | Send message, receive SSE token stream |
`streamMessage` returns an abort function; call it to cancel a stream mid-flight.
Incoming data is buffered so that SSE events split across network chunks are reassembled before parsing.
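
The buffering idea can be sketched as follows. This is a minimal illustration, not the code in `src/api/orchestration.js`: `createSseParser` and its callback are hypothetical names, and it assumes each event is a single `data:` line carrying JSON.

```javascript
// Sketch of a buffered SSE line parser (hypothetical; assumes one JSON
// payload per `data:` line). Chunks may end mid-line, so the trailing
// partial line is kept in `buffer` until the rest arrives.
function createSseParser(onEvent) {
  let buffer = '';
  return function push(chunk) {
    buffer += chunk;
    const lines = buffer.split('\n');
    buffer = lines.pop(); // last element is an incomplete line (or '')
    for (const raw of lines) {
      const line = raw.trim();
      if (!line.startsWith('data:')) continue; // skip blank separators and other fields
      onEvent(JSON.parse(line.slice(5).trim()));
    }
  };
}

// Usage: the same three events, delivered in awkwardly split chunks.
const events = [];
const push = createSseParser((event) => events.push(event));
push('data: {"text":"Hel');
push('lo"}\n\ndata: {"text":" Tim"}\n\ndata: {"do');
push('ne":true}\n\n');
console.log(events);
```

In a browser, the chunks would typically come from a `ReadableStream` reader on the `fetch` response, decoded with `TextDecoder` before being passed to `push`.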
## Streaming
The chat input sends messages via `POST /chat/stream`. Tokens arrive as SSE events:
data: {"text":"Hello"}
data: {"text":" Tim"}
data: {"done":true}
An empty assistant bubble is appended immediately when the stream opens, then
updated token by token using `updateLastMessage`. The blinking cursor in
`MessageBubble` is shown while `message.streaming === true` and disappears
when `done` is received.
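
The update flow described above might look roughly like this when written as pure functions. Only the `streaming` flag and the name `updateLastMessage` come from this document; the `role` and `content` fields, the function signatures, and the other helper names are assumptions made for illustration.

```javascript
// Hypothetical message shape: { role, content, streaming }.
// `role` and `content` are assumed field names.

// Append an empty assistant bubble when the stream opens.
function startAssistantMessage(messages) {
  return [...messages, { role: 'assistant', content: '', streaming: true }];
}

// Return a new array with the result of `patch(last)` merged into the last message.
function updateLastMessage(messages, patch) {
  if (messages.length === 0) return messages;
  const last = messages[messages.length - 1];
  return [...messages.slice(0, -1), { ...last, ...patch(last) }];
}

// Apply one parsed SSE event to the thread.
function applyStreamEvent(messages, event) {
  if (event.done) {
    return updateLastMessage(messages, () => ({ streaming: false }));
  }
  return updateLastMessage(messages, (last) => ({ content: last.content + event.text }));
}

let thread = [{ role: 'user', content: 'Hi', streaming: false }];
thread = startAssistantMessage(thread);
for (const event of [{ text: 'Hello' }, { text: ' Tim' }, { done: true }]) {
  thread = applyStreamEvent(thread, event);
}
console.log(thread[1]); // assistant message after the stream completes
```

Returning a new array on every token keeps the functions compatible with a React state setter, for example `setMessages((prev) => applyStreamEvent(prev, event))`.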
## Model Selector
Available models are defined in `InfoPanel.jsx`:
| Label | Value |
|---|---|
| Companion | `companion:latest` |
| Mistral Nemo | `mistral-nemo:latest` |
| Coder | `coder:latest` |
| Qwen 2.5 Coder 14B | `qwen2.5-coder:14b` |
The selected model is passed with every chat request. To add a new model,
update the `MODELS` array in `InfoPanel.jsx`.
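
For reference, a `MODELS` array matching the table could be shaped like the sketch below. The `{ label, value }` object shape and the `labelFor` helper are assumptions based on the table's column names, not a copy of `InfoPanel.jsx`.

```javascript
// Assumed shape: one { label, value } object per selectable model.
const MODELS = [
  { label: 'Companion', value: 'companion:latest' },
  { label: 'Mistral Nemo', value: 'mistral-nemo:latest' },
  { label: 'Coder', value: 'coder:latest' },
  { label: 'Qwen 2.5 Coder 14B', value: 'qwen2.5-coder:14b' },
];

// Hypothetical helper: look up the display label for a model value,
// falling back to the raw value if it is not in the list.
function labelFor(value) {
  const match = MODELS.find((model) => model.value === value);
  return match ? match.label : value;
}

console.log(labelFor('coder:latest'));
console.log(labelFor('unknown:tag'));
```

Under this shape, adding a model is a one-line change: append another `{ label, value }` entry whose `value` is the model tag passed with chat requests.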
## Session Management
Sessions are identified by an `external_id`, a human-readable string or UUID
generated client-side. New sessions are created locally with `uuid` and auto-registered
in the memory service on the first message. The session list refreshes after each
completed response to surface newly created sessions.
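
One way to picture this lifecycle is sketched below. Both helpers and the `pending` flag are hypothetical, and the ID generator is injected so the example runs without dependencies; in the app it would presumably be `v4` from the `uuid` package.

```javascript
// Hypothetical sketch: create a session locally, then show it in the sidebar
// until the refreshed server list contains it.

// `generateId` is injected; the real client would likely pass uuid's v4.
function createLocalSession(generateId) {
  return { external_id: generateId(), pending: true };
}

// Merge the server's session list with a not-yet-registered local session.
function mergeSessionList(serverSessions, localSession) {
  if (!localSession) return serverSessions;
  const registered = serverSessions.some(
    (session) => session.external_id === localSession.external_id
  );
  return registered ? serverSessions : [localSession, ...serverSessions];
}

// Deterministic stand-in ID for the example only.
const local = createLocalSession(() => 'example-session-id');

// Before the first message: the server does not know the session yet.
const before = mergeSessionList([{ id: 1, external_id: 'test-semantic' }], local);

// After the first completed response: the refreshed list includes it.
const after = mergeSessionList(
  [{ id: 2, external_id: 'example-session-id' }, { id: 1, external_id: 'test-semantic' }],
  local
);

console.log(before.length, after.length);
```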


@@ -14,10 +14,11 @@ or inference services — all traffic flows through orchestration.
## Dependencies
- `express` HTTP API
- `node-fetch` — inter-service HTTP communication (memory service client only)
- `dotenv` — environment variable loading
- `@nexusai/shared` — shared utilities
- `express` : HTTP API
- `cors` : cross-origin resource sharing middleware
- `node-fetch` : inter-service HTTP communication (memory service client only)
- `dotenv` : environment variable loading
- `@nexusai/shared` : shared utilities
> `memory.js` uses `node-fetch` v2 (pinned) because it is CommonJS. All other
> service clients use Node.js built-in `fetch`.
@@ -31,6 +32,7 @@ or inference services — all traffic flows through orchestration.
| EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |
| INFERENCE_SERVICE_URL | No | http://localhost:3001 | Inference service URL |
| QDRANT_URL | No | http://localhost:6333 | Qdrant URL for semantic search |
| CORS_ORIGIN | No | http://localhost:5173 | Allowed origin for CORS requests |
## Internal Structure
```
@@ -138,6 +140,7 @@ the `{"done":true}` event.
| Method | Path | Description |
|---|---|---|
| GET | /sessions | Get paginated list of all sessions |
| GET | /sessions/:sessionId/history | Get paginated episode history for a session |
---
@@ -215,4 +218,30 @@ Response:
}
```
---
**GET /sessions**
Returns a paginated list of all sessions, ordered by most recently active.
Query parameters:
| Parameter | Default | Description |
|---|---|---|
| limit | 20 | Maximum number of sessions to return |
| offset | 0 | Number of sessions to skip (for pagination) |
Response:
```json
[
{
"id": 1,
"external_id": "test-semantic",
"metadata": null,
"created_at": 1712345678,
"updated_at": 1712345999
}
]
```
Episodes are ordered newest first. Returns `404` if the session does not exist.
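
To illustrate the `limit` and `offset` parameters of `GET /sessions`, the sketch below builds the request path for a given page. The helper names are hypothetical; the defaults mirror the query parameter table.

```javascript
// Hypothetical helper: build the GET /sessions path with pagination.
// Defaults match the documented ones (limit 20, offset 0).
function sessionsPath({ limit = 20, offset = 0 } = {}) {
  const params = new URLSearchParams({
    limit: String(limit),
    offset: String(offset),
  });
  return `/sessions?${params.toString()}`;
}

// Convert a zero-based page index into an offset.
function pageOffset(page, limit = 20) {
  return page * limit;
}

console.log(sessionsPath());
console.log(sessionsPath({ limit: 10, offset: pageOffset(3, 10) }));
```

A client would pass the resulting path to `fetch` and request the next page until fewer than `limit` sessions come back.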