diff --git a/docs/README.md b/docs/README.md
index 6eafe0e..86e10c8 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -9,4 +9,5 @@
   - [Embedding Service](services/embedding-service.md)
   - [Inference Service](services/inference-service.md)
   - [Orchestration Service](services/orchestration-service.md)
+  - [Chat Client](services/chat-client.md)
 - [Deployment](deployment/homelab.md)
\ No newline at end of file
diff --git a/docs/architecture/overview.md b/docs/architecture/overview.md
index 0c7fafd..19bf479 100644
--- a/docs/architecture/overview.md
+++ b/docs/architecture/overview.md
@@ -24,7 +24,7 @@ full content from SQLite. Neither SQLite nor Qdrant work in isolation.
 |---|---|---|
 | Main PC | local | Primary inference (RTX A4000 16GB) |
 | Mini PC 1 | 192.168.0.81 | Memory service, Embedding service, Qdrant |
-| Mini PC 2 | 192.168.0.205 | Orchestration service, Gitea |
+| Mini PC 2 | 192.168.0.205 | Orchestration service, Chat Client, Gitea |
 
 ## Service Communication
 
@@ -34,6 +34,7 @@ clients do not talk directly to the memory or inference services.
 ```
 Client
   └─► Orchestration (:4000)
+        ├─► Chat Client (static files, /srv/nexusai)
         ├─► Memory Service (:3002)
         │     ├─► Qdrant (:6333)
         │     └─► SQLite
diff --git a/docs/deployment/homelab.md b/docs/deployment/homelab.md
index 8cdc9a9..cc43870 100644
--- a/docs/deployment/homelab.md
+++ b/docs/deployment/homelab.md
@@ -18,11 +18,16 @@ npm run embedding
 
 ## Mini PC 2 — 192.168.0.205
 
-Runs: Gitea, Orchestration Service
+Runs: Gitea, Orchestration Service, Chat Client (via Caddy)
 ```bash
 ssh username@192.168.0.205
+
 cd ~/gitea
-docker compose up -d # Gitea
+docker compose up -d # Gitea
+
+cd /opt/stacks/network
+docker compose up -d # Caddy, Authelia, and other network services
+
 cd ~/nexusai
 npm run orchestration
 ```
@@ -35,6 +40,47 @@ ollama serve
 npm run inference
 ```
 
+## Chat Client Deployment
+
+The chat client is a React + Vite app built to static files and served by Caddy on Mini PC 2 (Infrastructure node). It does not run as a Node process.
+```bash
+# On dev machine or Mini PC 2 after git pull
+cd ~/nexusAI/packages/chat-client
+npm run build
+# Output lands in packages/chat-client/dist/
+# Caddy serves this directory directly via volume mount
+```
+Caddy config (`/opt/docker/caddy/Caddyfile`):
+```caddy
+nexus.jellystorm.com {
+    import authelia
+
+    handle /chat* {
+        reverse_proxy 192.168.0.205:4000
+    }
+
+    handle /sessions* {
+        reverse_proxy 192.168.0.205:4000
+    }
+
+    handle {
+        root * /srv/nexusai
+        try_files {path} /index.html
+        file_server
+    }
+}
+```
+
+The Caddy container mounts the dist directory via Docker volume:
+```yaml
+- /home/storme/nexusAI/packages/chat-client/dist:/srv/nexusai
+```
+
+> After adding or changing volume mounts, a full `docker compose down caddy && docker compose up -d caddy`
+> is required. Caddyfile-only changes only need `docker compose restart caddy`.
+
+
+
 ## Environment Files
 
 Each node needs a `.env` file in the relevant service package directory.
diff --git a/docs/services/chat-client.md b/docs/services/chat-client.md
new file mode 100644
index 0000000..13287d8
--- /dev/null
+++ b/docs/services/chat-client.md
@@ -0,0 +1,116 @@
+# Chat Client
+
+**Package:** `@nexusai/chat-client`
+**Location:** `packages/chat-client`
+**Deployed on:** Mini PC 2 (192.168.0.205)
+**URL:** `https://nexus.jellystorm.com` (behind Authelia SSO)
+
+## Purpose
+
+Browser-based chat interface for NexusAI. Communicates exclusively with
+the orchestration service — no direct access to memory, embedding, or
+inference services. Served as static files by Caddy on Mini PC 2.
+
+## Dependencies
+
+- `react` + `react-dom` — UI framework
+- `uuid` — session ID generation
+- `vite` + `@vitejs/plugin-react` — build tooling
+
+## Build
+```bash
+cd packages/chat-client
+npm run build # outputs to dist/
+npm run dev # local dev server on port 5173
+```
+
+Vite bakes environment variables into the bundle at build time. The `.env`
+file is only needed on the machine running the build, not where files are served.
+
+## Environment Variables
+
+| Variable | Required | Default | Description |
+|---|---|---|---|
+| VITE_ORCHESTRATION_URL | No | `''` (empty) | Orchestration base URL. Empty string uses Vite proxy in dev, Caddy proxy in production. |
+
+## Internal Structure
+```
+src/
+├── api/
+│   └── orchestration.js    # All fetch calls to the orchestration service
+├── hooks/
+│   ├── useSession.js       # Session list, history loading, active session state
+│   └── useChat.js          # Message sending, SSE streaming, message state
+├── components/
+│   ├── App.jsx             # Root component — layout and shared state
+│   ├── SessionList.jsx     # Left sidebar — session list and new chat button
+│   ├── ChatWindow.jsx      # Centre panel — message thread and input bar
+│   ├── MessageBubble.jsx   # Individual message bubble (user or assistant)
+│   └── InfoPanel.jsx       # Right panel — model selector and session metadata
+├── index.css               # Global reset and CSS variables
+└── main.jsx                # React entry point
+```
+
+## Layout
+
+Three-panel layout with collapsible sidebars:
+┌─────────────────┬──────────────────────────┬─────────────┐
+│ Session List    │ Chat Window              │ Info Panel  │
+│ (collapsible)   │                          │ (collapsible)│
+│                 │ [message thread]         │             │
+│ + New Chat      │                          │ Model       │
+│                 │                          │ Session ID  │
+│ Session 1       │                          │ Token count │
+│ Session 2       │                          │             │
+│                 │ [input bar]              │             │
+└─────────────────┴──────────────────────────┴─────────────┘
+
+On mobile, sidebars collapse to a 56px icon rail. The centre chat window
+always fills the remaining space.
+
+## API Layer
+
+All orchestration calls are centralised in `src/api/orchestration.js`:
+
+| Function | Method | Path | Description |
+|---|---|---|---|
+| `fetchSessions` | GET | /sessions | Load session list for sidebar |
+| `fetchSessionHistory` | GET | /sessions/:id/history | Load episode history on session select |
+| `sendMessage` | POST | /chat | Send message, await full response |
+| `streamMessage` | POST | /chat/stream | Send message, receive SSE token stream |
+
+`streamMessage` returns an abort function — call it to cancel a stream mid-flight.
+It uses a buffer pattern to handle SSE chunks that may span multiple network packets.
+
+## Streaming
+
+The chat input sends messages via `POST /chat/stream`. Tokens arrive as SSE events:
+data: {"text":"Hello"}
+data: {"text":" Tim"}
+data: {"done":true}
+
+An empty assistant bubble is appended immediately when the stream opens, then
+updated token by token using `updateLastMessage`. The blinking cursor in
+`MessageBubble` is shown while `message.streaming === true` and disappears
+when `done` is received.
+
+## Model Selector
+
+Available models are defined in `InfoPanel.jsx`:
+
+| Label | Value |
+|---|---|
+| Companion | `companion:latest` |
+| Mistral Nemo | `mistral-nemo:latest` |
+| Coder | `coder:latest` |
+| Qwen 2.5 Coder 14B | `qwen2.5-coder:14b` |
+
+The selected model is passed with every chat request. To add a new model,
+update the `MODELS` array in `InfoPanel.jsx`.
+
+## Session Management
+
+Sessions are identified by an `external_id` — a human-readable string or UUID
+generated client-side. New sessions are created locally with `uuid` and auto-registered
+in the memory service on the first message. The session list refreshes after each
+completed response to surface newly created sessions.
\ No newline at end of file
diff --git a/docs/services/orchestration-service.md b/docs/services/orchestration-service.md
index 70f7a2d..56c796a 100644
--- a/docs/services/orchestration-service.md
+++ b/docs/services/orchestration-service.md
@@ -14,10 +14,11 @@ or inference services — all traffic flows through orchestration.
 
 ## Dependencies
 
-- `express` — HTTP API
-- `node-fetch` — inter-service HTTP communication (memory service client only)
-- `dotenv` — environment variable loading
-- `@nexusai/shared` — shared utilities
+- `express` : HTTP API
+- `cors` : cross-origin resource sharing middleware
+- `node-fetch` : inter-service HTTP communication (memory service client only)
+- `dotenv` : environment variable loading
+- `@nexusai/shared` : shared utilities
 
 > `memory.js` uses `node-fetch` v2 (pinned) because it is CommonJS. All other
 > service clients use Node.js built-in `fetch`.
@@ -31,6 +32,7 @@ or inference services — all traffic flows through orchestration.
 | EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |
 | INFERENCE_SERVICE_URL | No | http://localhost:3001 | Inference service URL |
 | QDRANT_URL | No | http://localhost:6333 | Qdrant URL for semantic search |
+| CORS_ORIGIN | No | http://localhost:5173 | Allowed origin for CORS requests |
 
 ## Internal Structure
 ```
@@ -138,6 +140,7 @@ the `{"done":true}` event.
 
 | Method | Path | Description |
 |---|---|---|
+| GET | /sessions | Get paginated list of all sessions |
 | GET | /sessions/:sessionId/history | Get paginated episode history for a session |
 
 ---
@@ -215,4 +218,30 @@ Response:
 }
 ```
 
+---
+
+**GET /sessions**
+
+Returns a paginated list of all sessions, ordered by most recently active.
+
+Query parameters:
+
+| Parameter | Default | Description |
+|---|---|---|
+| limit | 20 | Maximum number of sessions to return |
+| offset | 0 | Number of sessions to skip (for pagination) |
+
+Response:
+```json
+[
+  {
+    "id": 1,
+    "external_id": "test-semantic",
+    "metadata": null,
+    "created_at": 1712345678,
+    "updated_at": 1712345999
+  }
+]
+```
+
 Episodes are ordered newest first. Returns `404` if the session does not exist.
\ No newline at end of file
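
The `streamMessage` description in `docs/services/chat-client.md` mentions a buffer pattern for SSE chunks that may span multiple network packets, but this diff does not include the implementation. The following is only a minimal sketch of that idea, assuming newline-delimited `data:` lines that carry JSON, as in the Streaming section. `createSseParser` and `onEvent` are hypothetical names and are not taken from `src/api/orchestration.js`.

```javascript
// Hypothetical helper, not the project's actual code: accumulates raw text
// chunks and emits one parsed object per complete `data:` line.
function createSseParser(onEvent) {
  let buffer = '';

  return function push(chunk) {
    buffer += chunk;

    // Split on newlines. The final element is either '' (the chunk ended on a
    // newline) or an incomplete line, so it is held back for the next call.
    const lines = buffer.split('\n');
    buffer = lines.pop();

    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed.startsWith('data:')) continue; // skip blank separator lines

      const payload = trimmed.slice('data:'.length).trim();
      if (payload === '') continue;

      onEvent(JSON.parse(payload));
    }
  };
}

// Example: the first event is split across two chunks.
const received = [];
const push = createSseParser((event) => received.push(event));
push('data: {"text":"Hel');
push('lo"}\n\ndata: {"text":" Tim"}\n');
push('\ndata: {"done":true}\n\n');
```

In a browser, the chunks would typically come from reading the `fetch` response body and decoding each piece with `TextDecoder` in streaming mode before passing it to `push`. Holding back the trailing partial line is what keeps `JSON.parse` from seeing half an event.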