smarter context assembly implementation

minor clean up
retrieval fusion
2026-04-27 21:41:32 -07:00 · 2026-04-27 20:17:05 -07:00 · 2026-04-27 07:03:46 -07:00 · 2026-04-27 05:56:23 -07:00 · 2026-04-27 05:46:01 -07:00 · 2026-04-27 05:21:43 -07:00
84 changed files with 9194 additions and 1575 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -5,4 +5,5 @@ data/
 .env
 .env.*
 *.db
 .claude/settings.local.json
 EOF
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1,108 @@
 # CLAUDE.md
 This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
 ## Development Commands
 ```bash
 # Start individual services
 npm run memory           # Memory Service (port 3002)
 npm run embedding        # Embedding Service (port 3003)
 npm run inference        # Inference Service (port 3001)
 npm run orchestration    # Orchestration Service (port 4000)
 npm run mini1            # Start memory + embedding concurrently
 # Per-service dev mode (with --watch)
 npm -w packages/<service-name> run dev
 # Chat client
 npm -w packages/chat-client run dev      # Vite dev server (port 5173)
 npm -w packages/chat-client run build    # Production build
 ```
 No test framework or linter is configured.
 ## Architecture Overview
 NexusAI is a **modular AI assistant** with persistent, project-scoped memory. It's a Node.js monorepo (`npm workspaces`) with 4 independent backend services, 1 React frontend, and 1 shared package.
 ### Services
 | Package | Port | Role |
 |---|---|---|
 | `orchestration-service` | 4000 | Central gateway; coordinates all others |
 | `memory-service` | 3002 | SQLite + Qdrant hybrid memory |
 | `embedding-service` | 3003 | Text embeddings via Ollama (`nomic-embed-text`, 768-dim) |
 | `inference-service` | 3001 | LLM inference (Ollama or llama.cpp) |
 | `chat-client` | 5173 | React/Vite frontend |
 | `shared` | — | Constants, env helpers, logger, formatters |
 All inter-service communication is **REST HTTP only** — no message queues or WebSockets.
 ### Chat Request Flow
 1. Client POSTs to orchestration `/chat/stream`
 2. Orchestration resolves session, fetches **recent episodes** (SQLite) + **semantic episodes** (Qdrant vector search) + **entities** (Qdrant, scoped by project)
 3. Embedding computed for user message (embedding-service)
 4. Prompt assembled: system message → entities → semantic memories → recent episodes → user message
 5. Inference streams response (inference-service)
 6. Episode stored in SQLite + Qdrant (fire-and-forget embedding)
 7. Entity extraction triggered async (qwen2.5:3b via inference-service)
 8. Auto-summarization checked (threshold: 200+ tokens, re-triggers every 5 episodes)
 9. Auto-naming on first message (temp 0.3, 20 tokens max)
 ### Memory Model
 **Dual store — neither works alone:**
 - **SQLite** (`better-sqlite3`, synchronous) — Full content: sessions, episodes, entities, relationships, projects, summaries, FTS5 index
 - **Qdrant** — Vector embeddings for semantic search; IDs used to fetch full content from SQLite afterward
 Orchestration queries Qdrant directly (bypasses memory-service) for performance, then fetches full episode content from memory-service by ID.
 **Project-scoped isolation:** Sessions grouped into projects; Qdrant queries use `should` filter on session IDs to enforce memory boundaries. Non-project sessions share a common pool.
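As a rough illustration of the `should` filter mentioned above, the sketch below builds a Qdrant filter object from a list of session IDs. The helper name `buildSessionFilter` is hypothetical, and the `sessionId` payload key is assumed from the episode payload shape described in the architecture docs; the real orchestration code may differ.

```javascript
// Hypothetical helper: build a Qdrant filter matching any of the given sessions.
// Assumes episode points carry a `sessionId` payload field.
function buildSessionFilter(sessionIds) {
  if (!Array.isArray(sessionIds) || sessionIds.length === 0) {
    // No IDs: the caller must decide whether that means "no results"
    // or "unfiltered"; this sketch does not choose for it.
    return null;
  }
  return {
    should: sessionIds.map((id) => ({
      key: 'sessionId',
      match: { value: id },
    })),
  };
}

console.log(JSON.stringify(buildSessionFilter([12, 15]), null, 2));
```

Returning `null` for an empty list keeps the "no sessions" case explicit, so an empty project cannot silently fall back to an unfiltered search.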
 ### Key File Locations
 **Orchestration** (`packages/orchestration-service/src/`):
 - `chat/index.js` — Core prompt building and memory assembly
 - `routes/` — HTTP endpoints: chat, sessions, projects, episodes, models, settings, summaries
 - `services/` — Thin HTTP clients for memory, embedding, inference, and direct Qdrant access
 - `config/settings.js` — Loads/saves `data/settings.json` (user-tunable: model params, thresholds, system prompt)
 **Memory** (`packages/memory-service/src/`):
 - `db/schema.js` — SQLite table definitions (source of truth for data model)
 - `episodic/` — Episode CRUD
 - `semantic/` — Qdrant operations
 - `entities/` — Entity extraction + CRUD
 - `summarization/` — Project summary generation
 **Shared** (`packages/shared/src/`):
 - `config/constants.js` — All tunables (ports, thresholds, model names, vector size)
 - `config/env.js` — `getEnv()` helper with fallback to constants
 - `utils.js` — `parseRow()`, `formatEpisodeText()`, `logger`
 **Frontend** (`packages/chat-client/src/`):
 - `App.jsx` — View router and top-level state (views: home, chat, all-chats, all-projects, project, memory, summaries, settings)
 - `hooks/` — `useChat`, `useSession`, `useModels`, `useProjects`, `useSettings`, `useContextMenu`
 - `api/orchestration.js` — Fetch wrapper for all API calls
 - Vite proxy points to `192.168.0.205:4000` (Mini PC 2 / orchestration)
 ### Configuration
 Each service uses `.env` via `dotenv`, falling back to `packages/shared/src/config/constants.js`. The orchestration service also serves `data/settings.json` to the frontend via `/settings` — this is the single source of truth for user-facing inference parameters and system prompt.
 ### Deployment
 Home lab across 3 nodes, managed with Docker Compose:
 - **Main PC** — RTX A4000 (inference via llama.cpp)
 - **Mini PC 1** — memory + embedding services, Qdrant, Ollama
 - **Mini PC 2** — orchestration + chat client, Caddy reverse proxy + Authelia SSO
 Docker Compose files: `docker-compose.mini1.yml`, `docker-compose.mini2.yml`. All services expose `/health`. Deployment docs: `docs/deployment/homelab.md`.
 ## Key Development Principles
 - **Layer-by-layer validation** — always build and test backend → orchestration → frontend in sequence, curl-testing each layer before proceeding
 - **New orchestration routes require changes in four places**: route file, `orchestration-service/src/index.js`, Caddyfile on Mini PC 2 (`192.168.0.205`), and `vite.config.js` in the chat client
 - **All services read settings on every request** — no restart required for config changes
 - **Backend-first development** — data layer → service endpoints → orchestration proxy → frontend
--- a/docs/README.md
+++ b/docs/README.md
@@ -1,13 +1,23 @@
 # NexusAI Documentation
-## Contents
+## Architecture
 - [Architecture Overview](architecture/overview.md)
- [Services](services/)
+
 ## Services
 - [Shared Package](services/shared.md)
 - [Memory Service](services/memory-service.md)
 - [Embedding Service](services/embedding-service.md)
 - [Inference Service](services/inference-service.md)
 - [Orchestration Service](services/orchestration-service.md)
 - [Chat Client](services/chat-client.md)
- [Deployment](deployment/homelab.md)
+
 ## Reference
 - [API Routes](reference/API-routes.md) — all HTTP endpoints across all services
 - [Memory Isolation](reference/memory-isolation.md) — project-scoped memory model
 ## Deployment
 - [Homelab](deployment/homelab.md)
--- a/docs/architecture/overview.md
+++ b/docs/architecture/overview.md
@@ -1,56 +1,81 @@
 # Architecture Overview
-NexusAI is a modular, memory-centric AI system designed for persistent, context-aware conversations. It separates concerns across different services that can be independently deployed and evolved.
+NexusAI is a modular, memory-centric AI assistant designed for persistent,
 context-aware conversations. It separates concerns across independent services
 that can be evolved and deployed separately.
 ## Core Design Principles
- **Decoupled layers:** memory, inference, and orchestration are independent of each other
+- **Decoupled layers** — memory, inference, and orchestration are independent of each other
- **Hybrid retrieval:** semantic similarity (Qdrant) combined with structured storage (SQLite) for flexible, ranked context assembly
+- **Hybrid retrieval** — semantic similarity (Qdrant) combined with structured storage (SQLite) for flexible, ranked context assembly
- **Home lab:** services are distributed across nodes according to available hardware and resources
+- **Project-scoped memory** — sessions can be grouped into projects with shared or isolated memory pools
 - **Home lab first** — services are distributed across nodes according to available hardware
 ## Memory Model
-Memory is split between SQLite and Qdrant, which work together as a pair:
+Memory is split between SQLite and Qdrant, which always work as a pair:
- **SQLite:** episodic interactions, entities, relationships, summaries
+- **SQLite** — episodic interactions, entities, relationships, summaries, sessions, projects
- **Qdrant:** vector embeddings for semantic similarity search
+- **Qdrant** — vector embeddings for semantic similarity search
-When recalling memory, Qdrant returns IDs and similarity scores, which are used to fetch
+When recalling memory, Qdrant returns IDs and similarity scores, which are used
-full content from SQLite. Neither SQLite nor Qdrant work in isolation.
+to fetch full content from SQLite. Neither store works in isolation.
 Episode embeddings carry a `{ sessionId, createdAt }` payload in Qdrant,
 enabling per-session and per-project filtering at search time. See
 `memory-isolation.md` for how project-scoped retrieval works.
 ## Hardware Layout
 | Node | Address | Role |
 |---|---|---|
-| Main PC | local | Primary inference (RTX A4000 16GB) |
+| Main PC | 192.168.0.79 | Primary inference — RTX A4000 16GB |
-| Mini PC 1 | 192.168.0.81 | Memory service, Embedding service, Qdrant |
+| Mini PC 1 | 192.168.0.81 | Memory service, Embedding service, Qdrant, Ollama |
-| Mini PC 2 | 192.168.0.205 | Orchestration service, Chat Client, Gitea |
+| Mini PC 2 | 192.168.0.205 | Orchestration service, Chat Client, Caddy, Authelia, Gitea |
 ## Service Communication
-All services expose a REST HTTP API. The orchestration service is the single entry point —
+All services expose a REST HTTP API. The orchestration service is the single
-clients do not talk directly to the memory or inference services.
+entry point — clients never talk directly to memory or inference services.
 ```
-Client
+Client (browser)
-└─► Orchestration (:4000)
+└─► Caddy (HTTPS + Authelia SSO)
-    ├─► Chat Client (static files, /srv/nexusai)
+    └─► Orchestration (:4000) — Mini PC 2
-    ├─► Memory Service (:3002)
+        ├─► Memory Service (:3002) — Mini PC 1
-    │     ├─► Qdrant (:6333)
+        │     ├─► SQLite (local file)
-    │     └─► SQLite
+        │     └─► Qdrant (:6333) — Mini PC 1
-    ├─► Embedding Service (:3003)
+        ├─► Embedding Service (:3003) — Mini PC 1
-    │     └─► Ollama
+        │     └─► Ollama (:11434) — Mini PC 1
-    └─► Inference Service (:3001)
+        ├─► Inference Service (:3001) — Main PC
-          └─► Ollama
+        │     └─► llama-server (:8080) — Main PC
        └─► Qdrant (:6333) — Mini PC 1 (direct — semantic search)
 ```
 Note: Orchestration queries Qdrant directly for semantic search (bypassing
 the memory service) but always fetches full episode content from the memory
 service by ID after the vector search.
 ## Technology Choices
 | Concern | Choice | Reason |
 |---|---|---|
-| Language | Node.js (JavaScript) | Familiar stack, async I/O suits service architecture |
+| Language | Node.js (CommonJS) | Familiar stack, async I/O suits service architecture |
 | Package management | npm workspaces | Monorepo with shared code, no publishing needed |
 | Vector store | Qdrant | Mature, Docker-native, excellent Node.js client |
-| Relational store | SQLite (better-sqlite3) | Zero-ops, fast, sufficient for single-user |
+| Relational store | SQLite (better-sqlite3) | Zero-ops, fast, sufficient for single-user scale |
-| LLM runtime | Ollama | Easiest local LLM management, serves embeddings too |
+| LLM inference | llama.cpp (`llama-server`) | Maximum GPU utilisation on RTX A4000, OpenAI-compatible API |
 | Embeddings | Ollama (`nomic-embed-text`) | Co-located with memory service on Mini PC 1, 768-dim Cosine |
 | Reverse proxy | Caddy + Authelia | Automatic HTTPS, SSO/MFA for all exposed services |
 | Version control | Gitea (self-hosted) | Code stays on local network |
 ## Current State
 The core four-service architecture is complete and operational. Key capabilities:
 - **Retrieval fusion** — Reciprocal Rank Fusion (RRF) merges semantic (Qdrant vector search) and keyword (SQLite FTS5) episode retrieval into a single ranked result set. Weights are configurable per strategy via settings; keyword search is off by default (`keywordWeight: 0`) and can be enabled without a service restart
 - **Entity layer + Knowledge graph** — automatic extraction of named entities and relationships from conversations via qwen2.5:3b. Entities and relationships are stored in SQLite with `mention_count` tracking. A graph traversal layer expands Qdrant entity search hits into a 1-hop neighborhood subgraph, injecting structured connected knowledge into every prompt
 - **Projects** — sessions grouped with shared or isolated memory pools
 - **Auto-naming** — sessions named automatically from first exchange via inference
 - **Project-scoped semantic search** — Qdrant filtered by project session IDs
 - **Chat client** — view-based UI with sidebar navigation, project views, session management
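The retrieval fusion bullet above can be sketched as a weighted Reciprocal Rank Fusion, where each result's score is the sum of `weight / (k + rank)` over the lists it appears in. This is a minimal sketch, not the project's implementation: the function name `fuseRankings`, the input shape, and the constant `k = 60` (a common default for RRF) are all assumptions.

```javascript
// Weighted RRF sketch: score(id) = sum over lists of weight / (k + rank).
// `k = 60` is a conventional default, assumed here rather than taken from the code.
function fuseRankings(rankedLists, k = 60) {
  const scores = new Map();
  for (const { ids, weight } of rankedLists) {
    if (weight <= 0) continue; // a weight of 0 disables that strategy
    ids.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) || 0) + weight / (k + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

const fused = fuseRankings([
  { ids: [101, 102, 103], weight: 1 }, // semantic hits, best first
  { ids: [103, 104], weight: 1 },      // keyword hits, best first
]);
console.log(fused.map((r) => r.id));
```

An episode found by both strategies (103 here) accumulates score from each list, so it outranks episodes that appear in only one.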
--- a/docs/deployment/homelab.md
+++ b/docs/deployment/homelab.md
@@ -7,50 +7,73 @@ services appropriate for its hardware.
 ## Mini PC 1 — 192.168.0.81
-Runs: Qdrant, Memory Service, Embedding Service
+Runs: Qdrant, Memory Service, Embedding Service, Ollama
 ```bash
-ssh username@192.168.0.81
+ssh storme@192.168.0.81
 cd ~/nexusai
 docker compose -f docker-compose.mini1.yml up -d  # Qdrant
-npm run memory
+npm run memory      # port 3002
-npm run embedding
+npm run embedding   # port 3003
 ollama serve        # port 11434 — must bind 0.0.0.0 (OLLAMA_HOST=0.0.0.0)
 ```
 > Ollama must be started with `OLLAMA_HOST=0.0.0.0` to accept connections
 > from other services on the LAN. Without this, embedding requests from the
 > memory service will be refused.
 ## Mini PC 2 — 192.168.0.205
-Runs: Gitea, Orchestration Service, Chat Client (via Caddy)
+Runs: Orchestration Service, Chat Client (via Caddy), Gitea, Caddy, Authelia
-```bash
-ssh username@192.168.0.205
-cd ~/gitea
-docker compose up -d        # Gitea
+```bash
+ssh storme@192.168.0.205
 cd /opt/stacks/network
 docker compose up -d        # Caddy, Authelia, and other network services
-cd ~/nexusai
+cd ~/nexusAI
-npm run orchestration
+npm run orchestration       # port 4000
 ```
-## Main PC
+## Main PC — 192.168.0.79
-Runs: Ollama, Inference Service
+Runs: Inference Service, llama-server
-```bash
+
-ollama serve
+```powershell
-npm run inference
+# Start llama-server first — inference service depends on it
 .\llama-gpu\llama-server.exe `
  -m .\models\gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf `
  -ngl 99 --reasoning off --host 0.0.0.0 --port 8080 -c 64000
 # Then start inference service
 npm run inference            # port 3001
 ```
 ## Chat Client Deployment
-The chat client is a React + Vite app build to static files and served by Caddy on Mini PC 2 (Infrastructure node).  It does not run as a Node process
+The chat client is a React + Vite app built to static files and served by
 Caddy on Mini PC 2. It does not run as a Node process.
 ```bash
-# On dev machine or Mini PC 2 after git pull
+# On Mini PC 2 after git pull
 cd ~/nexusAI/packages/chat-client
-npm run build
+
 # Set production URL before building
 VITE_ORCHESTRATION_URL=https://nexus.jellystorm.com npm run build
 # Output lands in packages/chat-client/dist/
-# Caddy serves this directory directly via volume mount
+# Caddy serves this directory directly via Docker volume mount
 ```
-Caddy config (`/opt/docker/caddy/Caddyfile`):
+
 > Do NOT set `VITE_ORCHESTRATION_URL` during local dev — Vite's proxy handles
 > routing and setting the HTTPS domain will cause Authelia to intercept API
 > requests, producing confusing JSON parse errors.
 ## Caddy Configuration
 The Caddyfile on Mini PC 2 must include a handle block for each route prefix
 the client needs to reach. Current required blocks for NexusAI:
 ```caddy
 nexus.jellystorm.com {
    import authelia
@@ -63,6 +86,14 @@ nexus.jellystorm.com {
        reverse_proxy 192.168.0.205:4000
    }
    handle /models* {
        reverse_proxy 192.168.0.205:4000
    }
    handle /projects* {
        reverse_proxy 192.168.0.205:4000
    }
    handle {
        root * /srv/nexusai
        try_files {path} /index.html
@@ -71,18 +102,45 @@ nexus.jellystorm.com {
 }
 ```
-The Caddy container mounts the dist directory via Docker volume:
+When adding new top-level routes to the orchestration service, add a matching
 handle block here and reload Caddy:
 ```bash
 caddy reload --config /path/to/Caddyfile
 ```
 The Caddy container mounts the `dist` directory via Docker volume:
 ```yaml
 - /home/storme/nexusAI/packages/chat-client/dist:/srv/nexusai
 ```
 > After adding or changing volume mounts, a full `docker compose down caddy && docker compose up -d caddy`
-> is required. Caddyfile-only changes only need `docker compose restart caddy`.
+> is required. Caddyfile-only changes only need `caddy reload`.
 ## Environment Files
-Each node needs a `.env` file in the relevant service package directory.
+Each service needs a `.env` file in its package directory. These are not
-These are not committed to git. See each service's documentation for
+committed to git. See each service's documentation for required variables.
-required variables.
+
 | Service | Location | Key Variables |
 |---|---|---|
 | Memory | `packages/memory-service/.env` | `SQLITE_PATH`, `QDRANT_URL`, `EMBEDDING_SERVICE_URL` |
 | Embedding | `packages/embedding-service/.env` | `OLLAMA_URL`, `EMBEDDING_MODEL` |
 | Inference | `packages/inference-service/.env` | `INFERENCE_PROVIDER`, `INFERENCE_URL`, `DEFAULT_MODEL` |
 | Orchestration | `packages/orchestration-service/src/.env` | `MEMORY_SERVICE_URL`, `EMBEDDING_SERVICE_URL`, `INFERENCE_SERVICE_URL`, `QDRANT_URL`, `MODELS_MANIFEST_PATH` |
 | Chat client | `packages/chat-client/.env` | `VITE_ORCHESTRATION_URL` (production builds only) |
 ## Models Manifest
 The models manifest (`models.json`) lives on the Main PC alongside the model
 files, accessible to orchestration via an SMB mount at `/mnt/nexus-models`.
 ```json
 [
  { "value": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf", "label": "Gemma 4 26B Claude Distill" }
 ]
 ```
 `value` must exactly match the model name as reported by `llama-server`
 (including `.gguf` extension). No service restart needed to pick up changes.
--- a/docs/homelab/homelab-overview.md
+++ b/docs/homelab/homelab-overview.md
@@ -39,21 +39,21 @@ All external access is routed through **Caddy** (reverse proxy) with **Authelia*
 |------|--------|
 | GPU | NVIDIA RTX A4000 |
 | Role | Primary AI inference node |
-| Key Services | Ollama (inference) |
+| Key Services | llama-server (llama.cpp), Inference Service |
 ### Mini PC 1 — Media Node (`192.168.0.81`)
 | Spec | Detail |
 |------|--------|
 | GPU | NVIDIA RTX 5050 |
 | Role | Media services, embeddings, vector storage |
-| Key Services | Jellyfin, Nextcloud, Qdrant, arr stack, NexusAI memory/embedding |
+| Key Services | Jellyfin, Nextcloud, Qdrant, arr stack, NexusAI memory/embedding, Ollama |
 | Storage | NVMe (OS) + 3x external HDDs (see [Storage Layout](#storage-layout)) |
 ### Mini PC 2 — Infrastructure Node (`192.168.0.205`)
 | Spec | Detail |
 |------|--------|
-| Role | Network management, monitoring, auth, DNS, git |
+| Role | Network management, monitoring, auth, DNS, git, NexusAI orchestration |
-| Key Services | Caddy, Authelia, Tailscale, Pihole, Grafana, Gitea |
+| Key Services | Caddy, Authelia, Tailscale, Pihole, Grafana, Gitea, NexusAI orchestration |
 | Storage | NVMe (OS only) |
 ---
@@ -155,7 +155,8 @@ All external access is routed through **Caddy** (reverse proxy) with **Authelia*
 | Service | Notes |
 |---------|-------|
-| Ollama | Runs LLM inference using the RTX A4000. Also serves `nomic-embed-text` embeddings (768-dim vectors) consumed by NexusAI's embedding service on Mini PC 1. |
+| llama-server (llama.cpp) | Primary LLM inference using the RTX A4000. Started manually before the inference service. Serves the OpenAI-compatible API on port 8080. |
 | Ollama | Serves `nomic-embed-text` embeddings (768-dim vectors) consumed by NexusAI's embedding service on Mini PC 1. |
 ---
@@ -234,7 +235,7 @@ Phase 1 focused on establishing a stable, secure, and observable foundation:
 - ✅ Self-hosted git (Gitea)
 - ✅ Media stack fully operational (Jellyfin, arr stack, Nextcloud)
 - ✅ Download pipeline with VPN isolation (Gluetun + qBittorrent)
- ✅ NexusAI foundation services running (Qdrant, Ollama)
+- ✅ NexusAI foundation services running (Qdrant, Ollama, llama.cpp)
 - ✅ Container management across nodes (Portainer + agent)
 ---
@@ -249,6 +250,6 @@ Phase 2 shifts focus to resilience, security hardening, and smart home integrati
 - **Additional security hardening** — Audit exposed services, tighten firewall rules, review Authelia policies
 - **IP webcam integration** — Add camera feeds into the homelab ecosystem
 - **Home Assistant** — Integrate smart home automation and sensor data
- **Continued NexusAI development** — Entities layer, embedding service, inference and orchestration buildout
+- **Continued NexusAI development** — Entity extraction pipeline, summaries layer, SettingsView implementation
 > This section will be expanded as Phase 2 planning matures.
--- a/docs/reference/API-routes.md
+++ b/docs/reference/API-routes.md
@@ -0,0 +1,447 @@
 # API Routes
 All HTTP endpoints across NexusAI services. Clients communicate only with
 the orchestration service (port 4000) — memory service routes are listed
 here for reference and direct debugging use.
 ---
 ## Orchestration Service — port 4000
 ### Health
 | Method | Path | Description |
 |---|---|---|
 | GET | /health | Service health check |
 ### Chat
 | Method | Path | Description |
 |---|---|---|
 | POST | /chat | Send a message, receive full response |
 | POST | /chat/stream | Send a message, receive SSE token stream |
 **POST /chat and POST /chat/stream — request body:**
 ```json
 {
  "sessionId": "your-session-uuid",
  "message": "Hello, my name is Tim.",
  "model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
  "temperature": 0.7
 }
 ```
 `model` and `temperature` are optional. Inference parameters (temperature,
 topP, topK, repeatPenalty) are read from `settings.json` on every request —
 controlled via `PATCH /settings`.
 **POST /chat — response:**
 ```json
 {
  "sessionId": "your-session-uuid",
  "response": "Hello Tim! How can I help you today?",
  "model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
  "tokenCount": 87
 }
 ```
 **POST /chat/stream — response (SSE):**
 ```
 data: {"text":"Hello"}
 data: {"text":" Tim"}
 data: {"done":true,"model":"gemma-4-26B...gguf","tokenCount":87}
 ```
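A client might consume the stream shown above roughly as follows. This is a sketch under assumptions: the function names are hypothetical, and each event is assumed to arrive as a single `data:` line carrying one JSON object.

```javascript
// Hypothetical parser for the `data: {...}` lines shown above.
// Assumes one JSON payload per `data:` line; other SSE fields are ignored.
function parseSseChunk(chunk) {
  const events = [];
  for (const line of chunk.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue;
    const payload = trimmed.slice('data:'.length).trim();
    if (payload === '') continue;
    events.push(JSON.parse(payload));
  }
  return events;
}

// Concatenate token text and keep the final `done` event, if any.
function collectResponse(events) {
  let text = '';
  let done = null;
  for (const event of events) {
    if (typeof event.text === 'string') text += event.text;
    if (event.done) done = event;
  }
  return { text, done };
}

const sample = [
  'data: {"text":"Hello"}',
  'data: {"text":" Tim"}',
  'data: {"done":true,"model":"example.gguf","tokenCount":87}',
].join('\n\n');
console.log(collectResponse(parseSseChunk(sample)));
```

A real network stream can split an event across chunks, so production code would need to buffer incomplete lines before parsing.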
 ### Sessions
 | Method | Path | Description |
 |---|---|---|
 | GET | /sessions | Paginated session list |
 | GET | /sessions/:sessionId/history | Paginated episode history for a session |
 | PATCH | /sessions/:sessionId | Update session name and/or project assignment |
 | DELETE | /sessions/:sessionId | Delete session and all its episodes |
 **GET /sessions — query params:**
 | Param | Default | Description |
 |---|---|---|
 | limit | 20 | Sessions per page |
 | offset | 0 | Pagination offset |
 | projectId | — | Filter by project (integer ID) |
 **PATCH /sessions/:sessionId — body:**
 ```json
 { "name": "My Session", "projectId": 3 }
 ```
 At least one of `name` or `projectId` is required; both can be sent together.
 Returns the updated session object.
 **GET /sessions/:sessionId/history — query params:**
 | Param | Default | Description |
 |---|---|---|
 | limit | 20 | Episodes per page |
 | offset | 0 | Pagination offset |
 Returns `{ sessionId, episodes: [...] }`. Episodes ordered newest first.
 ### Projects
 | Method | Path | Description |
 |---|---|---|
 | GET | /projects | Get all projects |
 | POST | /projects | Create a new project |
 | PATCH | /projects/:id | Update a project (partial — any subset of fields) |
 | DELETE | /projects/:id | Delete a project (nulls session assignments) |
 **POST /projects — body:**
 ```json
 {
  "name": "My Project",
  "description": "Optional description",
  "colour": "#3d3a79",
  "icon": null,
  "isolated": 1
 }
 ```
 `name` is required. All other fields optional. `isolated` is always `1` —
 all projects use isolated memory. Returns `201` with the created project object.
 **PATCH /projects/:id — body:** any subset of fields, all optional.
 | Field | Type | Description |
 |---|---|---|
 | `name` | string | Project name |
 | `description` | string | Project description |
 | `colour` | string | Hex colour for UI accent |
 | `icon` | string | Icon identifier |
 | `isolated` | integer | Memory isolation flag (always 1) |
 | `notes` | string | User-authored project notes |
 | `system_prompt` | string | Per-project system prompt override (null = use global) |
 Only provided fields are updated — omitted fields are not touched.
 ### Summaries
 | Method | Path | Description |
 |---|---|---|
 | GET | /summaries/session/:sessionId | Get all summaries for a session (by external UUID) |
 | GET | /summaries/project/:projectId | Get all summaries for a project |
 **GET /summaries/session/:sessionId** — resolves the external UUID to an
 internal session ID, then fetches summaries from the memory service.
 Returns an array of summary objects ordered by `created_at` ascending.
 **GET /summaries/project/:projectId** — proxies directly to the memory
 service project summaries endpoint.
 **Summary object shape:**
 ```json
 {
  "id": 8,
  "session_id": 72,
  "project_id": null,
  "content": "The user asked about...",
  "token_count": 579,
  "episode_range": "246-251",
  "created_at": 1776766518,
  "updated_at": 1776766518
 }
 ```
 > **Proxy requirement:** `/summaries` must be added to both the Caddyfile
 > reverse proxy and the Vite dev proxy config alongside the other route
 > prefixes. See `orchestration-service.md` for the Caddy block pattern.
 ### Models
 | Method | Path | Description |
 |---|---|---|
 | GET | /models | Available models scanned live from models folder |
 | GET | /models/props | Live model props from llama-server (context window, loaded model) |
 **GET /models** — returns array:
 ```json
 [{ "value": "model-name.gguf", "label": "Display Name", "description": null, "size": "19.7 GB" }]
 ```
 Scans `.gguf` files live from `modelsFolderPath` (set in settings). Merges
 with `models.json` in the same folder for label and description metadata.
 **GET /models/props** — returns:
 ```json
 { "contextWindow": 64000, "modelAlias": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf" }
 ```
 Fetches directly from llama-server `/props`. `n_ctx` is at
 `data.default_generation_settings.n_ctx` in the llama-server response.
 Returns `503` if llama-server is unreachable.
 ### Settings
 | Method | Path | Description |
 |---|---|---|
 | GET | /settings | Get all current settings |
 | PATCH | /settings | Update one or more settings |
 **GET /settings — response:**
 ```json
 {
  "recentEpisodeLimit": 9,
  "semanticLimit": 5,
  "scoreThreshold": 0.6,
  "modelsFolderPath": "/mnt/nexus-models",
  "temperature": 0.65,
  "repeatPenalty": 1.3,
  "topP": 0.9,
  "topK": 41,
  "systemPrompt": "You are a helpful assistant..."
 }
 ```
 **PATCH /settings — body:** any subset of the fields listed below.
 | Field | Type | Range | Description |
 |---|---|---|---|
 | `recentEpisodeLimit` | integer | 1–20 | Recent episodes injected into prompt |
 | `semanticLimit` | integer | 1–20 | Max semantic search results |
 | `scoreThreshold` | float | 0–1 | Minimum similarity score for Qdrant results |
 | `semanticWeight` | float | 0–5 | RRF weight for Qdrant semantic results |
 | `keywordWeight` | float | 0–5 | RRF weight for FTS5 keyword results (`0` = disabled) |
 | `modelsFolderPath` | string | — | Path to folder containing .gguf files |
 | `temperature` | float | 0–2 | Inference randomness |
 | `repeatPenalty` | float | 1–2 | Repeat token penalty |
 | `topP` | float | 0–1 | Nucleus sampling probability mass |
 | `topK` | integer | 1–100 | Top-K token candidates per step |
 | `systemPrompt` | string | — | Global system prompt (null reverts to hardcoded default) |
 Settings are persisted to `data/settings.json` and read on every request —
 changes take effect immediately without a service restart.
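The ranges in the table above could be enforced with a small validator like the one below. This is only a sketch: the numeric ranges are copied from the table, but the function name, the error format, and the choice to reject rather than clamp out-of-range values are assumptions.

```javascript
// Ranges copied from the table above; the validator itself is hypothetical.
const NUMERIC_RANGES = {
  recentEpisodeLimit: { min: 1, max: 20, integer: true },
  semanticLimit: { min: 1, max: 20, integer: true },
  scoreThreshold: { min: 0, max: 1 },
  semanticWeight: { min: 0, max: 5 },
  keywordWeight: { min: 0, max: 5 },
  temperature: { min: 0, max: 2 },
  repeatPenalty: { min: 1, max: 2 },
  topP: { min: 0, max: 1 },
  topK: { min: 1, max: 100, integer: true },
};

// Returns a list of human-readable errors; an empty list means the patch is valid.
function validateSettingsPatch(patch) {
  const errors = [];
  for (const [field, value] of Object.entries(patch)) {
    const range = NUMERIC_RANGES[field];
    if (!range) continue; // string fields are not range-checked in this sketch
    if (typeof value !== 'number' || Number.isNaN(value)) {
      errors.push(`${field} must be a number`);
    } else if (range.integer && !Number.isInteger(value)) {
      errors.push(`${field} must be an integer`);
    } else if (value < range.min || value > range.max) {
      errors.push(`${field} must be between ${range.min} and ${range.max}`);
    }
  }
  return errors;
}

console.log(validateSettingsPatch({ temperature: 0.65, topK: 41 }));
console.log(validateSettingsPatch({ keywordWeight: 7, semanticLimit: 2.5 }));
```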
 ### Episodes
 | Method | Path | Description |
 |---|---|---|
 | GET | /episodes | Paginated episode list across all sessions |
 | DELETE | /episodes/:id | Delete an episode (SQLite + Qdrant) |
 **GET /episodes — query params:**
 | Param | Default | Description |
 |---|---|---|
 | limit | 20 | Episodes per page |
 | offset | 0 | Pagination offset |
 | q | — | Keyword search (FTS) |
 ---
 ## Memory Service — port 3002
 Direct access is for debugging only. All client traffic goes through
 orchestration.
 ### Health
 | Method | Path | Description |
 |---|---|---|
 | GET | /health | Service health check |
 ### Sessions
 | Method | Path | Description |
 |---|---|---|
 | POST | /sessions | Create a new session |
 | GET | /sessions | Paginated session list with optional projectId filter |
 | GET | /sessions/:id | Get session by internal ID |
 | GET | /sessions/by-external/:externalId | Get session by external ID |
 | PATCH | /sessions/by-external/:externalId | Update session fields |
 | DELETE | /sessions/by-external/:externalId | Delete session (cascades to episodes) |
 > Route ordering: `by-external/:externalId` must be defined before `/:id`
 > to prevent `by-external` being captured as an ID param.
 **POST /sessions — body:**
 ```json
 { "externalId": "unique-uuid", "metadata": {} }
 ```
 **PATCH /sessions/by-external/:externalId — body:**
 ```json
 { "name": "Session Name", "projectId": 3 }
 ```
 Both fields are optional. Only provided fields are updated.
 ### Episodes
 | Method | Path | Description |
 |---|---|---|
 | POST | /episodes | Create episode + auto-embed into Qdrant |
 | GET | /episodes | Paginated episode list across all sessions |
 | GET | /episodes/search?q=&limit= | FTS keyword search across all episodes |
 | GET | /episodes/:id | Get episode by ID |
 | GET | /sessions/:id/episodes?limit=&offset= | Paginated episodes for a session |
 | DELETE | /episodes/:id | Delete episode (SQLite + Qdrant cleanup) |
 > Route ordering: `/episodes/search` must be defined before `/episodes/:id`.
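The ordering note above applies to any router that returns the first matching pattern. The toy matcher below is an illustration only, not the service's actual router; `matchRoute` is a hypothetical name.

```javascript
// Minimal first-match router used only to illustrate why ordering matters.
function matchRoute(routes, path) {
  const parts = path.split('/').filter(Boolean);
  for (const pattern of routes) {
    const patternParts = pattern.split('/').filter(Boolean);
    if (patternParts.length !== parts.length) continue;
    const params = {};
    const matches = patternParts.every((segment, i) => {
      if (segment.startsWith(':')) {
        params[segment.slice(1)] = parts[i]; // capture the path parameter
        return true;
      }
      return segment === parts[i];
    });
    if (matches) return { pattern, params };
  }
  return null;
}

// Wrong order: `search` is swallowed by the `:id` parameter.
console.log(matchRoute(['/episodes/:id', '/episodes/search'], '/episodes/search'));
// Right order: the literal route wins.
console.log(matchRoute(['/episodes/search', '/episodes/:id'], '/episodes/search'));
```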
 **POST /episodes — body:**
 ```json
 {
  "sessionId": 1,
  "userMessage": "Hello",
  "aiResponse": "Hi there!",
  "tokenCount": 10
 }
 ```
 ### Projects
 | Method | Path | Description |
 |---|---|---|
 | POST | /projects | Create a new project |
 | GET | /projects | Get all projects |
 | GET | /projects/:id | Get project by ID |
 | PATCH | /projects/:id | Update a project (dynamic — any subset of fields) |
 | DELETE | /projects/:id | Delete project + null session assignments |
 Same request/response shape as orchestration `/projects` above.
 ### Summaries
 | Method | Path | Description |
 |---|---|---|
 | POST | /summaries | Create a new summary |
 | GET | /sessions/:id/summaries | Get all summaries for a session (internal ID) |
 | GET | /projects/:id/summaries | Get all summaries for a project |
 | PATCH | /summaries/:id | Update a summary (content, tokenCount, episodeRange) |
 | DELETE | /summaries/:id | Delete a summary |
 **POST /summaries — body:**
 ```json
 {
  "sessionId": 72,
  "content": "The user discussed...",
  "tokenCount": 579,
  "episodeRange": "246-251"
 }
 ```
 `content` is required. Either `sessionId` or `projectId` is required.
 **PATCH /summaries/:id — body:** any subset of `content`, `tokenCount`, `episodeRange`.
 ### Entities
 | Method | Path | Description |
 |---|---|---|
 | POST | /entities | Upsert entity (creates or updates by name + type) |
 | GET | /entities/by-type/:type | All entities of a given type |
 | GET | /entities/:id | Get entity by ID |
 | DELETE | /entities/:id | Delete entity (cascades to relationships) |
 > Route ordering: `/entities/by-type/:type` must be before `/entities/:id`.
 **POST /entities — body:**
 ```json
 {
  "name": "NexusAI",
  "type": "project",
  "notes": "My AI memory project",
  "metadata": {}
 }
 ```
 ### Relationships
 | Method | Path | Description |
 |---|---|---|
 | POST | /relationships | Upsert a relationship between two entities |
 | GET | /entities/:id/relationships | All relationships for an entity |
 | DELETE | /relationships | Delete a specific relationship |
 **POST /relationships — body:**
 ```json
 { "fromId": 1, "toId": 2, "label": "uses", "metadata": {} }
 ```
 **DELETE /relationships — body:**
 ```json
 { "fromId": 1, "toId": 2, "label": "works_on", "notes": "Alice is the primary developer.", "metadata": {} }
 ```
 `notes` is optional. `label` should be a snake_case verb. On upsert, re-submitting a relationship with the same composite key `(fromId, toId, label)` increments `mention_count` and preserves the existing `notes` if the new value is null.
 Relationships are identified by the composite key `(fromId, toId, label)`.
 Delete uses request body rather than URL params since this three-part key
 is awkward to encode in a path.
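The composite-key semantics can be illustrated with an in-memory stand-in for the relationships table. This is only a sketch: `createRelationshipStore` is a hypothetical helper, the starting `mention_count` of 1 is an assumption, and the real service persists to SQLite rather than a `Map`.

```javascript
// In-memory stand-in for the relationships table (hypothetical).
function createRelationshipStore() {
  const rows = new Map();
  // fromId and toId are integers, so this string key is unambiguous.
  const keyOf = ({ fromId, toId, label }) => `${fromId}:${toId}:${label}`;
  return {
    upsert(rel) {
      const key = keyOf(rel);
      const existing = rows.get(key);
      if (!existing) {
        const row = { ...rel, notes: rel.notes ?? null, mention_count: 1 };
        rows.set(key, row);
        return row;
      }
      existing.mention_count += 1;
      // Keep the old notes when the new value is null or missing.
      existing.notes = rel.notes ?? existing.notes;
      return existing;
    },
    // Delete needs the same three-part key, hence the request body.
    remove(rel) {
      return rows.delete(keyOf(rel));
    },
    size: () => rows.size,
  };
}

const store = createRelationshipStore();
store.upsert({ fromId: 1, toId: 2, label: 'works_on', notes: 'Alice is the primary developer.' });
const again = store.upsert({ fromId: 1, toId: 2, label: 'works_on', notes: null });
console.log(again.mention_count, again.notes); // 2 'Alice is the primary developer.'
```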
 ### Graph
 | Method | Path | Description |
 |---|---|---|
 | GET | /graph/neighborhood/:entityId | Entity neighborhood — nodes + edges within N hops |
 | POST | /graph/neighbors | Bulk 1-hop neighborhood for a set of entity IDs |
 **GET /graph/neighborhood/:entityId — query params:**
 | Param | Default | Max | Description |
 |---|---|---|---|
 | depth | 1 | 3 | Traversal depth |
 Returns `{ entity, neighborhood: { nodes, edges } }`. Returns `404` if entity not found.
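As an illustration of "within N hops", the traversal can be sketched as a breadth-first expansion with the depth clamped to the documented default and maximum. The function names are hypothetical, and several details are assumptions rather than documented behaviour: edges are followed in both directions, the start entity is included in `nodes`, and out-of-range depths are clamped instead of rejected.

```javascript
const DEFAULT_DEPTH = 1;
const MAX_DEPTH = 3;

// Clamp the requested depth into the documented 1..3 range.
function clampDepth(raw) {
  const n = Number.parseInt(raw, 10);
  if (Number.isNaN(n)) return DEFAULT_DEPTH;
  return Math.min(Math.max(n, 1), MAX_DEPTH);
}

// edges: [{ fromId, toId, label }]. Expands one hop per loop iteration.
function neighborhood(edges, startId, rawDepth) {
  const depth = clampDepth(rawDepth);
  const seen = new Set([startId]);
  const keptEdges = new Set();
  let frontier = [startId];
  for (let hop = 0; hop < depth; hop++) {
    const next = [];
    for (const edge of edges) {
      const touchesFrontier =
        frontier.includes(edge.fromId) || frontier.includes(edge.toId);
      if (!touchesFrontier) continue;
      keptEdges.add(edge);
      for (const id of [edge.fromId, edge.toId]) {
        if (!seen.has(id)) {
          seen.add(id);
          next.push(id);
        }
      }
    }
    frontier = next;
  }
  return { nodes: [...seen], edges: [...keptEdges] };
}

const chain = [
  { fromId: 1, toId: 2, label: 'uses' },
  { fromId: 2, toId: 3, label: 'uses' },
  { fromId: 3, toId: 4, label: 'uses' },
  { fromId: 4, toId: 5, label: 'uses' },
];
console.log(neighborhood(chain, 1, '2').nodes); // [ 1, 2, 3 ]
```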
 **POST /graph/neighbors — body:**
 ```json
 { "entityIds": [5, 8, 12] }
 ```
 Returns `{ nodes: [...], edges: [...] }`. Used internally by orchestration; not a client-facing endpoint.
 ---
 ## Embedding Service — port 3003
 | Method | Path | Description |
 |---|---|---|
 | GET | /health | Service health check |
 | POST | /embed | Embed a single text string |
 | POST | /embed/batch | Embed an array of text strings |
 **POST /embed — body:**
 ```json
 { "text": "Hello from NexusAI" }
 ```
 **POST /embed — response:**
 ```json
 { "embedding": [0.123, -0.456, ...], "model": "nomic-embed-text", "dimensions": 768 }
 ```
 ---
 ## Inference Service — port 3001
 | Method | Path | Description |
 |---|---|---|
 | GET | /health | Health check — reports active provider and model |
 | POST | /complete | Full completion — awaits entire response |
 | POST | /complete/stream | Streaming completion via SSE |
 **POST /complete — body:**
 ```json
 {
  "prompt": "What is the capital of France?",
  "model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
  "temperature": 0.7,
  "maxTokens": 1024,
  "topP": 0.9,
  "topK": 40,
  "repeatPenalty": 1.1
 }
 ```
 All fields except `prompt` are optional. In normal usage these are forwarded
 from orchestration, which reads them from `settings.json`.
 **POST /complete — response:**
 ```json
 {
  "text": "The capital of France is Paris.",
  "model": "gemma-4-26B...gguf",
  "done": true,
  "evalCount": 8,
  "promptEvalCount": 41
 }
 ```
--- a/docs/reference/Memory-isolation.md
+++ b/docs/reference/Memory-isolation.md
@@ -0,0 +1,160 @@
 # Memory Isolation
 NexusAI implements project-scoped memory — sessions belonging to the same
 project share semantic context within that project's boundary. All projects
 are isolated by default.
 ## Concepts
 **Session** — a single conversation thread. Identified by `external_id`.
 **Project** — a named grouping of sessions. `isolated` is always `1` —
 the toggle has been removed from the UI and `isolated: 1` is hardcoded on
 project creation.
 **Semantic search** — at inference time, the user's message is embedded and
 compared against past episodes and entities in Qdrant to surface relevant
 context. The scope of this search is controlled by the project context.
 ## Semantic Search Scope
 | Session state | Episode search scope | Entity search scope |
 |---|---|---|
 | No project | All non-project episodes (shared pool) | No entity context |
 | Assigned to a project | All episodes across all sessions in that project | Entities tagged with that project |
 | Removed from a project | Back to shared non-project pool | Back to no entity context |
 Non-project sessions share a common memory pool — they can draw on each
 other's episodes via semantic search, but cannot access episodes from any
 project session. Project sessions are fully isolated from all non-project
 sessions and from other projects.
 ## How It Works
 ### Step 1 — Project context resolution (orchestration)
 In `chat/index.js`, immediately after session resolution:
 ```js
 let projectSessionIds = null;
 if (session.project_id) {
  const project = await memory.getProject(session.project_id);
  if (project) {
    const projectSessions = await memory.getProjectSessions(session.project_id);
    projectSessionIds = projectSessions.map(s => s.id);
  }
 }
 ```
 If the session belongs to any project, `projectSessionIds` is populated with
 the internal integer IDs of all sessions in that project — creating a shared
 memory pool across all conversations in the project.
 ### Step 2 — Qdrant episode filter construction
 In `services/qdrant.js`, `searchEpisodes` builds the filter:
 ```js
 if (projectSessionIds) {
  body.filter = {
    should: projectSessionIds.map(id => ({
      key: 'sessionId', match: { value: id }
    }))
  };
 } else if (sessionId) {
  body.filter = { must: [{ key: 'sessionId', match: { value: sessionId } }] };
 }
 ```
 `should` is Qdrant's "match any of" operator — equivalent to SQL
 `WHERE sessionId IN (...)`. When `projectSessionIds` is set, the single-session
 filter is not used.
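To make the "match any of" behaviour concrete, the branch above can be pulled into a pure function and checked against a local matcher. This is a sketch only: `buildEpisodeFilter` and `payloadMatches` are hypothetical names, and `payloadMatches` models just these two filter shapes, not Qdrant's full filter semantics (an empty `should` list in particular is not modelled).

```javascript
// Mirrors the branch logic shown above as a pure function.
function buildEpisodeFilter({ projectSessionIds, sessionId }) {
  if (projectSessionIds) {
    return {
      should: projectSessionIds.map(id => ({
        key: 'sessionId', match: { value: id }
      }))
    };
  }
  if (sessionId) {
    return { must: [{ key: 'sessionId', match: { value: sessionId } }] };
  }
  return undefined;
}

// Local stand-in for evaluating these two filter shapes against a payload.
function payloadMatches(filter, payload) {
  if (!filter) return true;
  const hit = c => payload[c.key] === c.match.value;
  if (filter.must && !filter.must.every(hit)) return false;
  if (filter.should && !filter.should.some(hit)) return false;
  return true;
}

const projectFilter = buildEpisodeFilter({ projectSessionIds: [42, 43], sessionId: 42 });
console.log(payloadMatches(projectFilter, { sessionId: 43 })); // true
console.log(payloadMatches(projectFilter, { sessionId: 7 }));  // false
```

Keeping the filter construction pure makes the precedence rule (project scope wins over single-session scope) easy to check without a running Qdrant instance.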
 ### Step 3 — Entity search scoping
 Entity search is also project-scoped. `searchEntities` in `services/qdrant.js`
 accepts a `projectId` parameter and filters accordingly:
 ```js
 if (projectId) {
  body.filter = {
    must: [{ key: 'projectId', match: { value: projectId } }]
  };
 }
 // No filter for non-project sessions — entity context not provided
 ```
 Non-project sessions receive no entity context. Project sessions only see
 entities extracted from conversations within that project.
 ### Step 4 — Episode payloads
 Every episode upserted into Qdrant carries `{ sessionId, createdAt }` in its
 payload. `sessionId` here is the **internal integer ID** from SQLite.
 ### Step 5 — Entity payloads
 Every entity upserted into Qdrant carries `{ name, type, notes, projectId }`
 in its payload. `projectId` is the integer project ID.
 Entities are extracted and stored with `projectId` by `extraction.js`, which
 receives it from `createEpisode` in `episodic/index.js`, which receives it
 from the memory service episode route, which receives it from orchestration's
 `createEpisode` call in `chat/index.js`. The full chain:
 ```
 chat/index.js → memory.createEpisode(session.id, ..., session.project_id)
  → POST /episodes { projectId }
  → episodic.createEpisode(..., projectId)
  → extractAndStoreEntities(userMessage, aiResponse, projectId)
  → semantic.upsertEntity(id, vector, { name, type, notes, projectId })
 ```
 ## Important Behaviours
 **Pre-existing episodes are included immediately.** When a session is added
 to a project and a new message is sent, Qdrant can match all of that session's
 existing episodes since the filter only requires the `sessionId` to be in the
 project's session list.
 **Removing a session from a project takes effect immediately.** On the next
 message, `getProjectSessions` will not include that session's ID, so its
 episodes disappear from the semantic search scope.
 **Entity tags are immutable.** Entities extracted from a session's episodes
 are tagged with the `projectId` at extraction time. If a session is later
 moved to a different project, its previously extracted entities retain the
 original `projectId`. New entities extracted after the move will use the new
 `projectId`. Re-tagging existing entities requires a Qdrant payload update.
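A re-tagging pass could use Qdrant's set-payload endpoint (`POST /collections/{collection}/points/payload`), which merges the given keys into the payload of the listed points. The sketch below only builds the request and does not send it. The helper name `buildRetagRequest`, the base URL, the collection name `entities`, and the point IDs are assumptions for illustration; NexusAI does not necessarily ship such a helper.

```javascript
// Builds (does not send) a Qdrant "set payload" request that re-tags
// specific entity points. Base URL and collection name are assumptions.
function buildRetagRequest({ baseUrl, collection, pointIds, newProjectId }) {
  return {
    method: 'POST',
    url: `${baseUrl}/collections/${collection}/points/payload`,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      payload: { projectId: newProjectId }, // merged into the existing payload
      points: pointIds,
    }),
  };
}

const retag = buildRetagRequest({
  baseUrl: 'http://192.168.0.81:6333', // assumed Qdrant address
  collection: 'entities',              // assumed collection name
  pointIds: [101, 102],                // hypothetical point IDs
  newProjectId: 5,
});
console.log(retag.url);
// To send it: await fetch(retag.url, retag);
```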
 **New sessions created from ProjectView are assigned after the first message.**
 `handleNewProjectChat` in `App.jsx` calls `sendMessage` with the project ID,
 which is passed to `useChat`. After `onDone` fires, `useChat` calls
 `updateSession` to write the project assignment to the backend. There is a
 brief window during the first message where the session has no project assigned.
 The project is correctly applied from the second message onward.
 ## Verified Behaviours (tested April 2026)
 - Project sessions cannot read episodes from non-project sessions ✓
 - Non-project sessions cannot read episodes from project sessions ✓
 - Non-project sessions can read each other's episodes ✓
 - Adding a session to a project — its history joins the project pool immediately ✓
 - Removing a session from a project — exits the project pool immediately ✓
 - Entity contamination across projects eliminated by `projectId` filter ✓
 ## Qdrant Payload Structures
 **Episodes:**
 ```json
 { "sessionId": 42, "createdAt": 1776080188 }
 ```
 **Entities:**
 ```json
 { "name": "NexusAI", "type": "project", "notes": "...", "projectId": 3 }
 ```
 `sessionId` is the SQLite `sessions.id` integer, not the `external_id` UUID.
 `projectId` is the SQLite `projects.id` integer.
 Always use internal IDs when building Qdrant filters.
--- a/docs/roadmap.md
+++ b/docs/roadmap.md
@@ -0,0 +1,228 @@
 # NexusAI — Master Roadmap
 > A modular, memory-centric AI assistant and personal second brain.  
 > Built on Node.js, React/Vite, SQLite, Qdrant, and llama.cpp.  
 > Repo: `https://gitea.jellystorm.com/storme/nexusAI`
 ---
 ## Current State (Completed)
 ### Backend — Core Four Services
 - ✅ **Shared package** — `getEnv`, constants (`QDRANT`, `COLLECTIONS`, `EPISODIC`, `SERVICES`)
 - ✅ **Memory service** (port 3002, Mini PC 1) — SQLite schema (sessions, episodes, entities, relationships, summaries), FTS5 search, full CRUD endpoints, Qdrant semantic layer (3 collections), embedding write path
 - ✅ **Embedding service** (port 3003, Mini PC 1) — `nomic-embed-text` via Ollama, 768-dim vectors, `/embed` and `/embed/batch`
 - ✅ **Inference service** (port 3001, Main PC) — provider pattern (`INFERENCE_PROVIDER`), llama.cpp provider, `/complete` and `/complete/stream` (SSE)
 - ✅ **Orchestration service** (port 4000, Mini PC 2) — `/chat` and `/chat/stream`, session auto-create, dual-layer context assembly (recency + semantic), episode write-back
 ### Memory System
 - ✅ Episodic memory — full conversation history in SQLite
 - ✅ Semantic memory — Qdrant vector search across episodes and entities
 - ✅ Entity extraction — background inference pass after each episode (qwen2.5:3b via Ollama)
 - ✅ Automatic summarization — triggered at context threshold, cumulative summary updates
 - ✅ Project memory isolation — project sessions fully isolated from each other and from non-project sessions
 ### Chat Client
 - ✅ React/Vite frontend served via Caddy
 - ✅ Sidebar navigation — recent chats, projects, settings
 - ✅ Project management — CRUD, colour coding, isolated flag, ProjectView
 - ✅ Session management — auto-naming, project assignment, SessionModal
 - ✅ Streaming chat interface — SSE token-by-token rendering
 - ✅ Memory viewer — episode browsing, deletion, health panel
 - ✅ Settings panel — models section, configuration
 ### Infrastructure
 - ✅ Caddy reverse proxy with Authelia SSO
 - ✅ Prometheus + Grafana monitoring (VRAM, CPU, RAM)
 - ✅ npm workspaces monorepo
 - ✅ Gitea self-hosted repo
 ---
 ## Phase 1 — Loose Ends & Stability - COMPLETE ✅
 *Target: Next development session (Saturday)*
 ### Bug Fixes
 ✅ **Entity extraction JSON parsing** — robustify response parser in `extraction.js` to handle model returning markdown fences or preamble around JSON
 ✅ **Qdrant entity search empty results** — verify entities embedded post-isolation-fix are surfacing correctly in project session searches
 ### Tech Debt
 ✅ **Logging** — introduce `LOG_LEVEL` env var across all services; reduce noise in production
 ✅ **Error response consistency** — audit all endpoints for uniform `{ error, detail }` shape
 ✅ **Constants audit** — move any remaining inline magic numbers (limits, thresholds, timeouts) to shared config
 ✅ **Orchestration `chat/index.js` review** — extract any logic that has grown beyond its intended scope into dedicated modules
 ---
 ## Phase 2 — Memory System Upgrades
 *The core intelligence layer*
 ### 1. Knowledge Graph (SQLite) ✅
 The highest-leverage memory upgrade. Transforms NexusAI from "remembers conversations" to "understands relationships between things."
 - [x] Graph schema — `nodes` and `edges` tables with typed relationships
 - [x] Entity → node promotion pipeline (`mention_count` tracked; threshold gating deferred to Phase 2)
 - [x] Relationship traversal queries
 - [x] Graph-aware context assembly in orchestration
 ### 2. Retrieval Fusion + Full-Text Search ✅
 Multi-strategy retrieval merged into a single ranked result set.
 - [x] Reciprocal Rank Fusion (RRF) — merge semantic (Qdrant) + keyword (FTS5) results
 - [x] Configurable weights per retrieval strategy (`semanticWeight`, `keywordWeight` via `PATCH /settings`)
 - [x] Score threshold retained per-strategy; FTS scoped to session/project sessions; `keywordWeight: 0` default (disabled until tuned)
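For reference, weighted Reciprocal Rank Fusion scores each item as the sum over strategies of `weight / (k + rank)`. The sketch below is a generic version of that formula, not NexusAI's implementation: `fuseRankings` is a hypothetical name, `k = 60` is the constant commonly used for RRF, and the `semanticWeight` default of 1 is an assumption (only the `keywordWeight: 0` default is stated above).

```javascript
// Weighted Reciprocal Rank Fusion over ranked lists of episode IDs.
function fuseRankings({ semantic = [], keyword = [] },
                      { semanticWeight = 1, keywordWeight = 0, k = 60 } = {}) {
  const scores = new Map();
  const addList = (ids, weight) => {
    if (weight === 0) return; // a zero weight disables the strategy
    ids.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  };
  addList(semantic, semanticWeight);
  addList(keyword, keywordWeight);
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

const lists = { semantic: [1, 2, 3], keyword: [3, 2, 4] };
console.log(fuseRankings(lists).map(r => r.id));
// [ 1, 2, 3 ] (keyword strategy disabled by default)
console.log(fuseRankings(lists, { keywordWeight: 1 }).map(r => r.id));
// [ 3, 2, 1, 4 ] (items found by both strategies rise to the top)
```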
 ### 3. Memory Consolidation Lifecycle
 Prevents long-term memory degradation and enables compression.
 - [ ] Episode aging — score/weight episodes by recency and access frequency
 - [ ] Consolidation pass — merge related low-weight episodes into summary nodes
 - [ ] Orphan cleanup — remove entities no longer referenced by active episodes
 ### 4. User Preference Model
 Automatically maintained profile injected into every system prompt.
 - [ ] Preference schema — communication style, interests, known facts, tone preferences
 - [ ] Auto-update from conversation history
 - [ ] Manual override / review UI
 ### 5. Confidence-Based Routing *(inspired by acid2lake)*
 Short-circuit simple requests before they reach the LLM.
 - [ ] Intent classifier in orchestration — categorise incoming messages
 - [ ] Confidence bands — FAST PATH (memory lookup only) vs FULL (LLM + context)
 - [ ] Fast-path handlers — direct memory queries, session lookups, factual recalls
 ### 6. Smarter Context Assembly *(inspired by acid2lake)*
 Budget-aware context selection instead of dumping all relevant memory into the prompt.
 - [ ] Token budget manager in orchestration
 - [ ] Priority scoring — recency × relevance × entity weight
 - [ ] Configurable context budget via env var
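One possible shape for the planned budget manager is a greedy selection over priority-ranked candidates. This is a sketch of the idea only, since the items above are not yet implemented: `selectContext` and the candidate fields (`recency`, `relevance`, `entityWeight`, `tokenCount`) are hypothetical, and the three factors are assumed to be normalised to the range 0 to 1.

```javascript
// Greedy budget-aware selection: rank by recency × relevance × entity
// weight, then take items in order while they still fit the budget.
function selectContext(candidates, tokenBudget) {
  const ranked = candidates
    .map(c => ({ ...c, priority: c.recency * c.relevance * c.entityWeight }))
    .sort((a, b) => b.priority - a.priority);
  const selected = [];
  let used = 0;
  for (const item of ranked) {
    if (used + item.tokenCount > tokenBudget) continue; // skip what does not fit
    selected.push(item);
    used += item.tokenCount;
  }
  return { selected, used };
}

const candidates = [
  { id: 'a', recency: 0.9, relevance: 0.9, entityWeight: 1.0, tokenCount: 300 },
  { id: 'b', recency: 0.5, relevance: 0.8, entityWeight: 1.0, tokenCount: 500 },
  { id: 'c', recency: 0.2, relevance: 0.9, entityWeight: 0.5, tokenCount: 100 },
];
const picked = selectContext(candidates, 450);
console.log(picked.selected.map(c => c.id), picked.used); // [ 'a', 'c' ] 400
```

The budget itself could come from an environment variable, for example a hypothetical `CONTEXT_TOKEN_BUDGET`, parsed once per request so that changes apply without a restart.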
 ### 7. Procedural Memory Store *(inspired by acid2lake)*
 Learns "how NexusAI has successfully handled this type of request before."
 - [ ] Procedural memory schema — trigger pattern, steps, success count, confidence
 - [ ] Auto-population from successful interaction traces
 - [ ] Procedural context injection for matched request types
 ### 8. Reflection / Self-Summarization
 NexusAI periodically reviews and synthesises its own memory.
 - [ ] Scheduled reflection pass — background job, configurable interval
 - [ ] Cross-session insight extraction
 - [ ] Summary nodes written back to knowledge graph
 - *Requires: Knowledge graph + consolidation lifecycle*
 ### 9. Proactive Agent Loop
 The JARVIS moment — NexusAI reasons, plans, and acts across multiple steps.
 - [ ] Tool calling framework in orchestration
 - [ ] Built-in tools — memory search, entity lookup, summarize, web fetch
 - [ ] Reasoning loop — think → act → observe → respond
 - [ ] Agent mode toggle per session
 - *Requires: All Phase 2 items above*
 ---
 ## Phase 3 — Client Features
 *Making the daily driver experience excellent*
 ### Core Chat Enhancements
 - [ ] Message regeneration — re-roll last AI response
 - [ ] Edit & resend — edit a previous message, clear subsequent history
 - [ ] Copy message button — hover icon per message
 - [ ] Message timestamps — subtle, toggleable
 - [ ] Token count display — per-response usage indicator
 ### Memory Visibility
 - [ ] **"What I remember" panel** — show which episodes/entities were injected into context
 - [ ] Memory pinning — mark episodes as always-include
 - [x] Session summary view — on-demand or auto-generated session summary
 - [ ] Memory attribution — subtle indicator on responses that were memory-informed
 ### Session & Project Management
 - [ ] Session search — full-text search across all sessions
 - [ ] Session tagging — freeform tags beyond project assignment
 - [ ] Session export — download as markdown or JSON
 - [ ] Pinned sessions — pin frequently used sessions to sidebar top
 - [ ] Bulk session actions — delete, move to project
 ### Model & Persona Controls *(high priority — circles back to companion origins)*
 - [ ] Per-session model switching — override default model per session
 - [x] System prompt editor — per-project custom prompts
 - [ ] System prompt editor — per-session custom prompts
 - [ ] Persona profiles — saved configurations (model + system prompt + temperature)
  - Examples: "Daily Driver", "Creative Mode", "Concise Mode", "Coding Mode"
 - [ ] Temperature / parameter sliders — collapsible panel for power users
 ### Second Brain Features
 - [ ] **Quick capture** — minimal input to save a thought directly to memory without starting a chat
 - [ ] **Knowledge graph visualiser** — interactive node/edge view of entities and relationships
 - [ ] Memory search page — dedicated search UI across all episodes and entities
 - [ ] Daily digest — generated summary of recent activity and learned facts
 ### Quality of Life
 - [ ] Keyboard shortcuts — `Ctrl+K` command palette, `Ctrl+Enter` to send
 - [ ] Dark/light theme toggle
 - [ ] Mobile layout polish — collapsible sidebar, touch-friendly inputs
 - [ ] Notification support — browser notifications for long completions
 ---
 ## Phase 4 — Coding Copilot
 *After core is feature-complete*
 ### Project Directory Awareness
 - [ ] Directory watcher service — monitors a VS Code workspace for changes
 - [ ] Symbol indexer — AST parsing via Tree-sitter, file → symbol map in SQLite
 - [ ] Diagnostic indexer — compiler errors/warnings per file, triggered on save
 - [ ] Maps to existing project isolation — coding project = NexusAI project with `indexedDirectory` flag
 ### Coding-Specific Memory
 - [ ] Procedural patterns per language/framework — stored in procedural memory layer
 - [ ] Skill compilation — successful coding solutions abstracted into reusable patterns
 - [ ] Codebase semantic search — embed code chunks into Qdrant, search by intent
 ---
 ## Phase 5 — Stretch Goals
 ### Voice Layer
 - [ ] TTS output — text-to-speech for AI responses
 - [ ] STT input — speech-to-text for voice messages
 - [ ] Hardware-dependent — deferred until appropriate hardware available
 - *Architecturally clean addition — new input/output modality only*
 ### Homelab Enhancements
 - [ ] Backup improvements — automated, verified backups of SQLite + Qdrant data
 - [ ] Security hardening — network segmentation, service-level auth
 - [ ] IP webcam integration
 - [ ] Home Assistant integration
 ---
 ## Architecture Reference
 ### Services & Nodes
 | Service | Host | Port | Role |
 |---|---|---|---|
 | Inference | Main PC `192.168.0.79` | 3001 | llama.cpp provider, `/complete`, `/complete/stream` |
 | Memory | Mini PC 1 `192.168.0.81` | 3002 | SQLite, episode/entity/summary CRUD |
 | Embedding | Mini PC 1 `192.168.0.81` | 3003 | nomic-embed-text via Ollama, vector generation |
 | Qdrant | Mini PC 1 `192.168.0.81` | 6333 | Vector store — episodes, entities, summaries collections |
 | Orchestration | Hub `192.168.0.205` | 4000 | Chat pipeline, context assembly, session management |
 | Chat Client | Hub `192.168.0.205` | — | React/Vite, served via Caddy |
 | Caddy + Authelia | Hub `192.168.0.205` | 443 | Reverse proxy, SSO |
 ### Primary Models
 | Role | Model | Notes |
 |---|---|---|
 | Daily driver | Gemma 4 26B Claude Distill APEX I-Mini | `--reasoning off` flag critical |
 | Creative/worldbuilding | Gemma 4 21B REAP Q5_K_M | |
 | Coding | DeepSeek Coder V2 Lite Instruct Q6_K | |
 | Background tasks | qwen2.5:3b via Ollama | Entity extraction, summarization |
 ### Key Design Principles
 - **Layer-by-layer validation** — backend → orchestration → frontend, curl-test each layer
 - **Fire-and-forget async** — embedding and entity extraction never block the chat response
 - **All services read settings on every request** — no restart required for config changes
 - **Backend-first development** — data layer → endpoints → orchestration proxy → frontend
 ---
 *Last updated: April 2026*
--- a/docs/services/chat-client.md
+++ b/docs/services/chat-client.md
@@ -14,6 +14,7 @@ inference services. Served as static files by Caddy on Mini PC 2.
 ## Dependencies
 - `react` + `react-dom` — UI framework
 - `react-markdown` — Markdown rendering in message bubbles and memory viewer
 - `uuid` — session ID generation
 - `vite` + `@vitejs/plugin-react` — build tooling
@@ -55,10 +56,6 @@ VITE_ORCHESTRATION_URL=https://nexus.jellystorm.com
 during local development, bypassing Caddy and Authelia entirely:
 ```js
 // vite.config.js
 import { defineConfig } from 'vite';
 import react from '@vitejs/plugin-react';
 export default defineConfig({
  plugins: [react()],
  server: {
@@ -66,12 +63,17 @@ export default defineConfig({
      '/models':   'http://192.168.0.205:4000',
      '/sessions': 'http://192.168.0.205:4000',
      '/chat':     'http://192.168.0.205:4000',
      '/projects': 'http://192.168.0.205:4000',
      '/episodes': 'http://192.168.0.205:4000',
      '/settings': 'http://192.168.0.205:4000',
      '/health':   'http://192.168.0.205:4000',
    }
  }
 });
 ```
-If new routes are added to the orchestration service, add them here too.
+When adding new top-level routes to the orchestration service, add a matching
 entry here and in the Caddy config.
 ## Internal Structure
@@ -80,42 +82,106 @@ src/
 ├── api/
 │   └── orchestration.js     # All fetch calls to the orchestration service
 ├── config/
-│   └── constants.js        # FALLBACK_MODELS, DEFAULT_MODEL, API_DEFAULTS
+│   └── constants.js         # FALLBACK_MODELS, DEFAULT_MODEL, API_DEFAULTS, CLIENT_DEFAULTS
 ├── hooks/
 │   ├── useSession.js        # Session list, history loading, active session state
 │   ├── useChat.js           # Message sending, SSE streaming, message state
 │   ├── useModels.js         # Dynamic model list fetched from /models endpoint
 │   ├── useProjects.js       # Project list fetched from /projects endpoint
 │   ├── useSettings.js       # Settings fetch + saveSetting helper
 │   └── useContextMenu.js    # Right-click context menu position and visibility
 ├── components/
-│   ├── App.jsx              # Root component — layout and shared state
+│   ├── App.jsx              # Root component — layout, shared state, view routing
-│   ├── SessionList.jsx      # Left sidebar — session list, rename, delete
+│   ├── Sidebar.jsx          # Left sidebar — projects, grouped recent chats, navigation
-│   ├── ChatWindow.jsx       # Centre panel — message thread and input bar
+│   ├── HomeView.jsx         # Landing screen — greeting, centred input, quick actions
-│   ├── MessageBubble.jsx    # Individual message bubble (user or assistant)
+│   ├── ChatWindow.jsx       # Centre panel — message thread, back button, model pill
-│   ├── InfoPanel.jsx        # Right panel — model selector and session metadata
+│   ├── MessageBubble.jsx    # Individual message bubble — renders markdown via react-markdown
-│   └── SessionModal.jsx     # Modal dialog for session settings and delete confirmation
+│   ├── InfoPanel.jsx        # Right panel — model selector and session metadata (slide-in)
 │   ├── SessionModal.jsx     # Modal for session rename, project assignment, delete
 │   ├── ProjectModal.jsx     # Modal for project create/edit — name, description, colour,
 │   │                        #   system prompt override; delete confirmation
 │   ├── AllChatsView.jsx     # Paginated session list with project indicator column
 │   ├── AllProjectsView.jsx  # Project tile grid with create/edit/delete; tile click navigates to ProjectView
 │   ├── ProjectView.jsx      # Individual project — conversations, new chat input, memory
 │   │                        #   placeholder, user notes, ⋮ edit/delete menu
 │   ├── MemoryView.jsx       # Paginated, searchable, expandable, deletable episode viewer
 │   └── SettingsView.jsx     # Settings — Memory, Models, Behaviour (system prompt),
 │                            #   About, Appearance
 ├── index.css                # Global reset, CSS variables, utility classes
 └── main.jsx                 # React entry point
 ```
 ## Layout
-Three-panel layout with collapsible sidebars:
+The app uses a view-based layout. `App.jsx` manages a `view` state string
 that controls which main panel is rendered. The left sidebar and right info
 panel are persistent across all views.
 ```
-┌─────────────────┬──────────────────────────┬─────────────┐
+┌──────────────────┬──────────────────────────────┐
-│  Session List   │       Chat Window         │  Info Panel │
+│     Sidebar      │   Main Area (view-dependent)  │
-│  (collapsible)  │                           │ (collapsible)│
+│  (collapsible)   │                               │
-│                 │  [message thread]         │             │
+│                  │  home         → HomeView      │
-│ + New Chat      │                           │ Model       │
+│ + New Chat       │  chat         → ChatWindow    │
-│                 │                           │ Session ID  │
+│ ⊞ View Projects  │  all-chats    → AllChatsView  │
-│ Session 1       │                           │ Token count │
+│                  │  all-projects → AllProjectsView│
-│ Session 2       │                           │             │
+│ PROJECTS ▾       │  project      → ProjectView   │
-│                 │  [input bar]              │             │
+│  [tile] [tile]   │  settings     → SettingsView  │
-└─────────────────┴──────────────────────────┴─────────────┘
+│  All Projects →  │  memory       → MemoryView    │
 │                  │                               │
 │ RECENT CHATS ▾   │                               │
 │  ● Project A     │                               │
 │    Session 1     │                               │
 │    Session 2     │                               │
 │  ● Project B     │                               │
 │    Session 3     │                               │
 │  Other           │                               │
 │    Session 4     │                               │
 │  All Chats →     │                               │
 │                  │                               │
 │ ⚙ Settings       │                               │
 └──────────────────┴──────────────────────────────┘
 ```
-Sidebars collapse to a 56px icon rail. The centre chat window always
+The sidebar collapses to a 48px icon rail and starts collapsed on the home
-fills the remaining space.
+view. The right `InfoPanel` slides in from the right using
 `transform: translateX()` — hidden by default, toggled via the `⊹` button
 in the `ChatWindow` header.
 ## View Routing
 | View | Component | Trigger |
 |---|---|---|
 | `'home'` | `HomeView` | Initial load |
 | `'chat'` | `ChatWindow` | Selecting a session; new chat; sending from HomeView |
 | `'all-chats'` | `AllChatsView` | "All Chats →" or ☰ icon in collapsed rail |
 | `'all-projects'` | `AllProjectsView` | "View Projects" button or ⊞ icon |
 | `'project'` | `ProjectView` | Clicking a project tile in sidebar or AllProjectsView |
 | `'settings'` | `SettingsView` | Settings button or ⚙ icon |
 | `'memory'` | `MemoryView` | "Open →" button in Settings → Memory section |
 `activeProject` state in `App.jsx` tracks which project `ProjectView` is
 displaying. Set via `onSelectProject` before navigating to `'project'`.
 ### View History Stack
 `App.jsx` maintains a `viewHistory` array. Each `navigate(view)` call pushes
 the current view onto the stack. `goBack()` pops the last entry and restores
 it. All view components receive `onBack={goBack}` — no component hardcodes
 its own back destination. Navigating to `'home'` collapses the sidebar;
 leaving `'home'` expands it.
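The stack behaviour can be modelled without React as two pure functions over a small state object. This is a sketch rather than the code in `App.jsx`: the state shape and the `sidebarCollapsed` field name are assumptions, and tying the sidebar strictly to the current view is a simplification that ignores manual collapsing.

```javascript
// Framework-free model of the navigation state described above.
function navigate(state, view) {
  return {
    view,
    viewHistory: [...state.viewHistory, state.view], // push the current view
    sidebarCollapsed: view === 'home',
  };
}

function goBack(state) {
  if (state.viewHistory.length === 0) return state; // nothing to restore
  const previous = state.viewHistory[state.viewHistory.length - 1];
  return {
    view: previous,
    viewHistory: state.viewHistory.slice(0, -1),
    sidebarCollapsed: previous === 'home',
  };
}

let nav = { view: 'home', viewHistory: [], sidebarCollapsed: true };
nav = navigate(nav, 'all-projects');
nav = navigate(nav, 'project');
nav = goBack(nav);
console.log(nav.view, nav.viewHistory); // all-projects [ 'home' ]
```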
 ## Home View
 `HomeView` is the landing screen shown on initial load. It displays:
 - Time-based greeting ("Morning / Afternoon / Evening, Tim")
 - Currently loaded model name (from `modelProps.modelAlias`, stripped of `.gguf`)
 - Centred textarea input — sending creates a new session and navigates to chat
 - Quick action pills that populate the input without auto-sending
 `handleHomeSend` in `App.jsx` calls `createSession()` (which returns the new
 session object), then immediately calls `sendMessage` with the session passed
 directly — avoiding the React state settling race condition.
 ## CSS Architecture
@@ -148,7 +214,7 @@ rules, inline styles for dynamic prop-driven values.
 | Class | Description |
 |---|---|
-| `.panel-header` | Shared header row — used in all three panels |
+| `.panel-header` | Shared header row — used across all panels |
 | `.btn-reset` | Resets button styles (no border, bg, cursor pointer) |
 | `.btn-icon` | Icon button with hover state |
 | `.btn-primary` | Accent-coloured action button with `:hover` and `:disabled` states |
@@ -161,110 +227,148 @@ rules, inline styles for dynamic prop-driven values.
 | `.label-upper` | Uppercase section label style |
 | `.truncate` | Text overflow ellipsis |
 ## API Layer
 All orchestration calls are centralised in `src/api/orchestration.js`:
 | Function | Method | Path | Description |
 |---|---|---|---|
 | `fetchSessions` | GET | /sessions | Load session list for sidebar |
 | `fetchSessionHistory` | GET | /sessions/:id/history | Load episode history on session select |
 | `sendMessage` | POST | /chat | Send message, await full response |
 | `streamMessage` | POST | /chat/stream | Send message, receive SSE token stream |
 | `fetchModels` | GET | /models | Load available models from manifest |
 | `renameSession` | PATCH | /sessions/:id | Rename a session |
 | `deleteSession` | DELETE | /sessions/:id | Delete a session |
 `streamMessage` returns an abort function — call it to cancel a stream mid-flight.
 Uses a buffer pattern to handle SSE chunks that may span multiple network packets.
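The buffer pattern mentioned above can be sketched as follows: incoming text is appended to a buffer, complete lines are parsed, and any trailing partial line is kept for the next chunk. `createSseParser` is a hypothetical name and the event payloads (`{ text }` tokens followed by a `{ done: true }` event) are assumed from the stream format; the actual `streamMessage` implementation may differ.

```javascript
// Buffers raw text chunks and emits one parsed object per complete
// "data: {...}" line. Incomplete trailing text stays in the buffer.
function createSseParser(onEvent) {
  let buffer = '';
  return function push(chunk) {
    buffer += chunk;
    const lines = buffer.split('\n');
    buffer = lines.pop(); // last element is a partial line (or '')
    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed.startsWith('data:')) continue; // skip blanks and other fields
      onEvent(JSON.parse(trimmed.slice('data:'.length).trim()));
    }
  };
}

const received = [];
const push = createSseParser(event => received.push(event));
push('data: {"te');                               // chunk ends mid-event
push('xt":"Hello"}\n\ndata: {"done":true}\n');    // remainder plus the done event
console.log(received); // [ { text: 'Hello' }, { done: true } ]
```

Parsing only complete lines avoids calling `JSON.parse` on a half-delivered event, which is the failure mode the buffer exists to prevent.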
 ## Streaming
-The chat input sends messages via `POST /chat/stream`. Tokens arrive as SSE events:
+Messages are sent via `POST /chat/stream`. Tokens arrive as SSE events and
 are written into the active assistant bubble token by token via
 `updateLastMessage`. The blinking cursor in `MessageBubble` is shown while
 `message.streaming === true`.
-```
+`useChat.sendMessage` accepts an optional `session` parameter (4th arg) that
-data: {"text":"Hello"}
+overrides the closed-over `activeSession`. This is used by `handleHomeSend`
-data: {"text":" Tim"}
+and `handleNewProjectChat` in `App.jsx` to pass the newly created session
-data: {"done":true,"model":"gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf","tokenCount":87}
+object directly, avoiding React state settling races.
-```
-An empty assistant bubble is appended immediately when the stream opens, then
+`useChat` accepts an optional `projectId` parameter in `sendMessage`. After
-updated token by token using `updateLastMessage`. The blinking cursor in
+the first message completes in a new session, if `projectId` is set,
-`MessageBubble` is shown while `message.streaming === true` and disappears
+`updateSession` is called to write the project assignment to the backend.
 When the done event is received, its model name and token count are stored
 in `useChat` state and displayed in the InfoPanel.
 ## Dynamic Model Selector
 Available models are fetched from `GET /models` on mount via the `useModels` hook.
 The hook initialises with `FALLBACK_MODELS` from `constants.js` and replaces them
 with the server response on success. If the fetch fails, the fallback list stays
 in place and a warning is logged to the console.
 To add a model, update `models.json` on the main PC — no client rebuild needed.
 `FALLBACK_MODELS` in `constants.js` should be kept in sync with `models.json`
 as a reasonable last-resort list in case the endpoint is unreachable.
 ## Session Management
-Sessions are identified by `external_id` — a UUID generated client-side via the
+Sessions are identified by `external_id` — a UUID generated client-side via
-`uuid` package. New sessions are created locally and auto-registered in the memory
+the `uuid` package. New sessions are created locally and auto-registered in
-service on the first message. The session list refreshes after each completed
+the memory service on the first message. The session list refreshes after
-response to surface newly created sessions.
+each completed response to surface newly created sessions.
-### Session Name Display
+`useSession.createSession` returns the new session object — callers can pass
 it directly to `sendMessage` rather than waiting for React state to update.
-The chat header and session list both display `session.name` if set, falling back
+`useSession.selectSession` skips the history fetch for new (`isNew: true`)
-to `session.external_id` if no name has been assigned:
+sessions — fetching history for an unsaved session would 404 since it doesn't
 exist in the backend yet.
-```js
+### Auto-naming
-activeSession.name || activeSession.external_id
+
-```
+After the first exchange completes, orchestration fires a secondary inference
 call with a short naming prompt (max 20 tokens, temperature 0.3). The result
 is written back as `session.name`. The client fires a second `refreshSessions`
 after a 3-second delay to pick up the name once written.
 Manually renamed sessions are never overwritten — the `!session.name` guard
 in `chat/index.js` prevents this.
 ### Session Actions
-The session list supports rename and delete via two entry points:
+Session rows support rename, project assignment, and delete via:
 - **Hover** — reveals ✎ and ✕ icon buttons alongside the row
 - **Right-click** — context menu with the same actions
- **Hover** — reveals ✎ (rename) and ✕ (delete) icon buttons alongside the session row
+`SessionModal` handles rename and project assignment together in `settings`
- **Right-click** — opens a context menu with the same actions
+mode, and delete confirmation in `confirm-delete` mode.
-Both trigger `SessionModal` — a shared modal component with two modes:
+### Key Patterns
-| Mode | Trigger | Behaviour |
+- Button nesting: action icons are siblings of row buttons, not children — HTML forbids `<button>` inside `<button>`
-|---|---|---|
+- Context menu rendered outside sidebar via React fragment to avoid `overflow: hidden` clipping
-| `settings` | Rename button / context menu rename | Shows name input, saves on Enter or Save button |
+- `useContextMenu` dismisses on a `window` click listener
-| `confirm-delete` | Delete button / context menu delete | Shows confirmation dialog with session name, requires explicit Delete button click |
+- Dynamic `updateSession` SQL builds `SET` clause from only the fields passed — prevents accidental overwrites
 - `AllChatsView` pagination uses `CLIENT_DEFAULTS.PAGE_SIZE` (not `API_DEFAULTS.PAGE_SIZE`, which doesn't exist)
 - `Sidebar` groups sessions by project — `key` must be passed directly to `<SessionRow key={...}>`, not included in the props spread object
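 The dynamic `SET` clause pattern from the list above might look like the sketch below. The table name `sessions`, the column whitelist, and the helper name `buildSessionUpdate` are assumptions for illustration; the real query lives in the memory service and may differ.
 ```js
 // Hypothetical sketch: build an UPDATE that touches only the fields the caller passed.
 // The column whitelist is an assumption, not the service's actual list.
 const UPDATABLE_COLUMNS = ['name', 'project_id'];

 function buildSessionUpdate(externalId, fields) {
   // undefined means "not passed"; null is a real value (e.g. clearing a project).
   const columns = UPDATABLE_COLUMNS.filter((col) => fields[col] !== undefined);
   if (columns.length === 0) return null; // nothing to update
   const setClause = columns.map((col) => `${col} = ?`).join(', ');
   return {
     sql: `UPDATE sessions SET ${setClause} WHERE external_id = ?`,
     params: [...columns.map((col) => fields[col]), externalId],
   };
 }

 const update = buildSessionUpdate('abc-123', { name: 'Trip planning' });
 // update.sql    -> 'UPDATE sessions SET name = ? WHERE external_id = ?'
 // update.params -> ['Trip planning', 'abc-123']
 ```
 Column names come only from the whitelist and values are always bound as parameters, so a rename cannot reset the project assignment and caller input never reaches the SQL text.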
-The modal is intentionally titled "Session Settings" and structured to expand
+## Sidebar — Session Grouping
 into a full settings panel in future iterations.
-Actions are disabled on unsaved (new) sessions that haven't had a first message sent yet.
+Recent sessions in the sidebar are grouped by project under a colour dot +
 project name label. Unassigned sessions appear under "Other" if any project
 groups are present. The grouping is computed client-side from the `sessions`
 array and `projects` list already available in `App.jsx` — no extra API call.
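 A rough sketch of that client-side grouping follows. It assumes each session carries a `project_id` field and each project an `id`; both field names and the function name are hypothetical, chosen only for illustration.
 ```js
 // Hypothetical sketch: group sessions by project for the sidebar.
 function groupSessionsByProject(sessions, projects) {
   const groups = new Map(projects.map((p) => [p.id, { project: p, sessions: [] }]));
   const unassigned = [];
   for (const session of sessions) {
     const group = groups.get(session.project_id);
     if (group) group.sessions.push(session);
     else unassigned.push(session);
   }
   // Drop projects that have no recent sessions.
   const result = [...groups.values()].filter((g) => g.sessions.length > 0);
   // Unassigned sessions get an "Other" label only when project groups are present.
   if (unassigned.length > 0) {
     result.push({ project: null, label: result.length > 0 ? 'Other' : null, sessions: unassigned });
   }
   return result;
 }
 ```
 Because the function only reads arrays that are already in memory, it can run on every render without an extra request.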
-### Active Session Clearing on Delete
+`AllChatsView` receives `projects` as a prop from `App.jsx` and displays a
 project indicator column (colour dot + truncated name) in each session row.
-When the deleted session is the currently active one, `App.jsx` detects the match
+## Project Management
 and calls `selectSession(null)` to clear the chat window before refreshing the list:
-```js
+All projects are isolated by default (`isolated: 1` hardcoded on create).
-function handleSessionsChange(deletedSession) {
+The isolated toggle has been removed from `ProjectModal`.
    if (deletedSession?.external_id === activeSession?.external_id) {
        selectSession(null);
    }
    refreshSessions();
 }
 ```
-### Context Menu
+`useProjects` fetches the project list from `GET /projects` on mount and
 exposes `refreshProjects` for keeping the sidebar in sync after mutations.
-Implemented via `useContextMenu` hook — tracks `{ x, y, session }` state and
+### ProjectModal Fields
 attaches a `window` click listener to dismiss on any outside click. Rendered
 outside the sidebar div via a React fragment to avoid being clipped by
 `overflow: hidden`.
-### Button Nesting
+- **Name** (required)
 - **Description** (optional)
 - **Colour** — picker from six preset hex values
 - **System Prompt** (optional) — overrides the global system prompt for all
  conversations in this project. Leave blank to use the global default.
  Stored as `system_prompt` (snake_case) matching the SQLite column.
  `Enter` key does not submit — textarea fields make it ambiguous. Save button only.
-Session row action icons (✎ ✕) are rendered as siblings of the session
+`handleSave` in `ProjectView` destructures `system_prompt` (snake_case) to
-`<button>`, not children — HTML does not allow `<button>` inside `<button>`.
+match what `ProjectModal` sends. `updateProject` in `orchestration.js` uses
-The outer `<div>` owns hover state and context menu; the inner `<button>` handles
+a passthrough pattern — spreads all fields into the request body.
-session selection; action icon buttons sit alongside it in the same flex row.
+
 ### System Prompt Hierarchy
 System prompt resolution in `chat/index.js` (orchestration):
 1. `project.system_prompt` — if set on the project (highest priority)
 2. `settings.systemPrompt` — global setting from `settings.json`
 3. `ORCHESTRATION.SYSTEM_PROMPT` — hardcoded constant in `@nexusai/shared` (last resort)
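 The three-level fallback can be sketched as below. `DEFAULT_SYSTEM_PROMPT` stands in for `ORCHESTRATION.SYSTEM_PROMPT` with a placeholder value, and treating blank strings as unset is an assumption of this sketch.
 ```js
 // Placeholder for ORCHESTRATION.SYSTEM_PROMPT; the real value lives in @nexusai/shared.
 const DEFAULT_SYSTEM_PROMPT = 'You are a helpful assistant.';

 function resolveSystemPrompt(project, settings) {
   // Ordered from highest to lowest priority.
   const candidates = [project?.system_prompt, settings?.systemPrompt];
   // Treat null, undefined, and blank strings as "not set" (assumption).
   const chosen = candidates.find((p) => typeof p === 'string' && p.trim() !== '');
   return chosen ?? DEFAULT_SYSTEM_PROMPT;
 }
 ```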
 ### ProjectView
 `ProjectView` is a full project workspace with:
 - Colour accent bar + project title + description
 - ⋮ dropdown menu for edit (opens `ProjectModal` pre-filled) and delete
 - Conversations list — each session is a clickable row navigating to `'chat'`
 - `ChatInput` component below the list (or centred when no sessions exist) for
  starting new project-tied conversations without a separate button
 - **Project Memory** — placeholder section explaining upcoming auto-summary feature
 - **Project Notes** — textarea with Save button; notes saved to `projects.notes`
  column in SQLite; save button only appears when content has changed from last
  saved value (`savedNotes` state tracks the baseline, not `initialNotes`)
 `updateProject` in `orchestration.js` uses a passthrough pattern — spreads
 all fields directly into the request body. This allows partial updates like
 `{ notes }` or `{ system_prompt }` without clobbering other fields.
 For memory isolation behaviour, see `memory-isolation.md`.
 ## Settings
 `useSettings` fetches from `GET /settings` on mount and exposes a
 `saveSetting(key, value)` helper that issues a `PATCH /settings` with a
 single key-value pair. The `saving` boolean is exposed for disabling save
 buttons during in-flight requests.
 `SettingsView` receives `settings`/`saveSetting`/`saving` from a single
 `useSettings()` call at the top level and passes them as props to
 `ModelsSection`, `ModelsFolderSetting`, and `SystemPromptSetting`, which avoids
 three separate fetches on mount. `modelProps` (context window, loaded model) is fetched
 once in `App.jsx` and passed down as a prop.
 `SettingsView` is organised into sections:
 - **Memory** — recent episode limit, semantic limit, score threshold, link to MemoryView
 - **Models** — models folder path, temperature, repeat penalty, Top-P, Top-K,
  active model dropdown, read-only model info panel (file, size, context window,
  loaded model from llama-server)
 - **Behaviour** — global system prompt textarea (`SystemPromptSetting`). Save
  button appears only when content differs from `savedPrompt` state. Saving an
  empty string sends `null`, which reverts to the hardcoded default.
 - **About** — service health check panel, version
 - **Appearance** — theme (coming soon)
 An error boundary (`SettingsSectionErrorBoundary`) wraps the Models section —
 if the models fetch fails, only that section shows an error with a Retry
 button rather than blanking the entire settings view.
--- a/docs/services/embedding-service.md
+++ b/docs/services/embedding-service.md
@@ -27,80 +27,43 @@ minimizing network hops on the memory write path.
 | OLLAMA_URL | No | http://localhost:11434 | Ollama instance URL |
 | EMBEDDING_MODEL | No | nomic-embed-text | Ollama embedding model to use |
 > Ollama must be running with `OLLAMA_HOST=0.0.0.0` to accept LAN connections
 > from other services.
 ## Model
-**nomic-embed-text** via Ollama produces **768-dimension** vectors using **Cosine similarity**.
+**nomic-embed-text** via Ollama produces **768-dimension** vectors with
-This must match the `QDRANT.VECTOR_SIZE` constant in `@nexusai/shared`.
+**Cosine similarity**. This must match `QDRANT.VECTOR_SIZE` in `@nexusai/shared`.
 If the embedding model is changed, the Qdrant collections must be reinitialized
-with the new vector dimension — updating `QDRANT.VECTOR_SIZE` in `constants.js` is
+with the new vector dimension. Updating `QDRANT.VECTOR_SIZE` in `constants.js`
-the single change required to keep everything consistent.
+is the single change required to keep everything consistent.
 ## Ollama API
-Uses the `/api/embed` endpoint (Ollama v0.4+). Request shape:
+Uses the `/api/embed` endpoint (Ollama v0.4+):
 ```json
 // Request
 { "model": "nomic-embed-text", "input": "text to embed" }
 ```
 Response key is `embeddings[0]` — an array of 768 floats.
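 A minimal sketch of a call against this endpoint is shown below. The helper names are hypothetical and the constants mirror the documented defaults; the service's own implementation may be structured differently.
 ```js
 // Constants mirror the documented defaults; in the service they come from env vars.
 const OLLAMA_URL = 'http://localhost:11434';
 const EMBEDDING_MODEL = 'nomic-embed-text';

 // Hypothetical helper: builds the /api/embed request.
 function buildEmbedRequest(text) {
   return {
     url: `${OLLAMA_URL}/api/embed`,
     init: {
       method: 'POST',
       headers: { 'Content-Type': 'application/json' },
       body: JSON.stringify({ model: EMBEDDING_MODEL, input: text }),
     },
   };
 }

 // Hypothetical helper: reads embeddings[0] and checks the vector size.
 function readEmbedding(json, expectedSize = 768) {
   const vector = json?.embeddings?.[0];
   if (!Array.isArray(vector) || vector.length !== expectedSize) {
     throw new Error(`Expected a ${expectedSize}-dimension embedding`);
   }
   return vector;
 }

 // Requires Node 18+ for the global fetch, or pass a fetch implementation.
 async function embedText(text, fetchFn = fetch) {
   const { url, init } = buildEmbedRequest(text);
   const res = await fetchFn(url, init);
   if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
   return readEmbedding(await res.json());
 }
 ```
 Checking the vector length at the call site surfaces a model or dimension mismatch before the vector reaches Qdrant.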
-## Endpoints
+// Response key
-
+embeddings[0]  // array of 768 floats
 ### Health
 | Method | Path | Description |
 |---|---|---|
 | GET | /health | Service health check |
 ### Embed
 | Method | Path | Description |
 |---|---|---|
 | POST | /embed | Embed a single text string |
 | POST | /embed/batch | Embed an array of text strings |
 ---
 **POST /embed**
 Embeds a single text string and returns the vector.
 Request body:
 ```json
 {
  "text": "Hello from NexusAI"
 }
 ```
-Response:
+> Earlier Ollama versions used `/api/embeddings` with a `prompt` key and
-```json
+> returned `embedding` (singular). Use `/api/embed`, `input`, and
-{
+> `embeddings[0]` for Ollama v0.4+.
  "embedding": [0.123, -0.456, ...],
  "model": "nomic-embed-text",
  "dimensions": 768
 }
 ```
---
+## Usage in NexusAI
-**POST /embed/batch**
+The embedding service is called in two places:
-Embeds an array of strings sequentially and returns all vectors in the same order.
+1. **Memory service** — after each episode is saved to SQLite, the combined
-Ollama does not natively parallelize embeddings, so requests are processed one at a time.
+   `User: ..\nAssistant: ..` text is embedded and upserted into Qdrant.
   This is fire-and-forget — failures are logged but don't affect the response.
-Request body:
+2. **Orchestration service** — the user's message is embedded at the start of
-```json
+   the chat pipeline to perform semantic search against past episodes.
 {
  "texts": ["first sentence", "second sentence"]
 }
 ```
-Response:
+For all HTTP endpoints, see `api-routes.md`.
 ```json
 {
  "embeddings": [[0.123, ...], [0.456, ...]],
  "model": "nomic-embed-text",
  "dimensions": 768,
  "count": 2
 }
 ```
--- a/docs/services/entity-extraction.md
+++ b/docs/services/entity-extraction.md
@@ -0,0 +1,140 @@
 # Entity Extraction
 **Location:** `packages/memory-service/src/entities/extraction.js`  
 **Triggered by:** Episode creation (`POST /episodes`)  
 **Model:** `qwen2.5:3b` via Ollama (configurable via `EXTRACTION_MODEL` env var)
 ## Purpose
 After each episode is saved to SQLite, the extraction pipeline runs
 asynchronously in the background to identify named entities and the
 relationships between them. Results are written back to SQLite and
 embedded into Qdrant — the episode response is never delayed.
 ## Trigger
 `createEpisode()` in `episodic/index.js` calls `extractAndStoreEntities()`
 immediately after the SQLite insert, without awaiting it:
 ```js
 extractAndStoreEntities(userMessage, aiResponse, episode.id, projectId)
  .catch(err => logger.error(`Failed to extract entities for episode ${episode.id}:`, err.message));
 ```
 If extraction throws, the episode is unaffected — the error is logged and
 swallowed.
 ## Model Settings
 | Setting | Value | Notes |
 |---|---|---|
 | Model | `qwen2.5:3b` | Ollama, configurable via `EXTRACTION_MODEL` |
 | Temperature | 0.1 | Low for consistent, deterministic output |
 | `num_predict` | 1500 | Higher ceiling to accommodate entity + relationship JSON |
 | `format` | `'json'` | Ollama constrained decoding — enforces valid JSON output |
 | Prompt format | ChatML | `<\|im_start\|>` / `<\|im_end\|>` tokens |
 ## Prompt Structure
 The prompt is built by `buildExtractionPrompt()`. It includes:
 1. **System message** — declares the model's role as an entity and relationship extractor
 2. **Instructions** — entity types, field rules, relationship label format, required JSON schema
 3. **Known entities block** — last 20 entities from SQLite, by `rowid DESC`, used to encourage consistent name/type pairs across conversations
 4. **Conversation** — the raw user message and AI response, delimited clearly
 ```
 <|im_start|>system
 You are a named entity and relationship extractor. You output only valid JSON.
 <|im_end|>
 <|im_start|>user
 Read the conversation below and extract all named entities and the relationships between them.
 Entity types: person, place, project, technology, concept, organization
 ...
 Return this exact JSON structure:
 { "entities": [...], "relationships": [...] }
 Already known entities (use these exact name and type values if the same entity appears):
 - "NexusAI" (project)
 - "Alice" (person)
 --- CONVERSATION ---
 User: ...
 Assistant: ...
 --- END CONVERSATION ---
 <|im_end|>
 <|im_start|>assistant
 ```
 ## Expected JSON Output
 ```json
 {
  "entities": [
    { "name": "Alice", "type": "person", "notes": "Software engineer working on NexusAI." },
    { "name": "NexusAI", "type": "project", "notes": "A modular AI assistant with persistent memory." }
  ],
  "relationships": [
    {
      "from": "Alice", "fromType": "person",
      "to": "NexusAI", "toType": "project",
      "label": "works_on",
      "notes": "Alice is the primary developer."
    }
  ]
 }
 ```
 Relationship labels use **snake_case verbs** (e.g. `works_on`, `manages`, `uses`,
 `knows`, `located_in`, `part_of`, `created_by`).
 ## JSON Parsing
 The raw model response is matched with `/\{[\s\S]*\}/` before parsing. The greedy
 match spans the first `{` to the last `}`, so preamble or trailing prose around the
 JSON is tolerated as long as that prose contains no braces of its own.
 If the match fails or `JSON.parse` throws, the function logs a warning and
 returns without writing anything.
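 The parse step might be sketched like this. `parseExtraction` is a hypothetical name, and normalising missing arrays to `[]` is an addition of this sketch rather than documented behaviour.
 ```js
 // Hypothetical sketch of the tolerant parse step.
 function parseExtraction(raw) {
   // Greedy match: from the first "{" to the last "}" in the response.
   const match = typeof raw === 'string' ? raw.match(/\{[\s\S]*\}/) : null;
   if (!match) return null;
   try {
     const parsed = JSON.parse(match[0]);
     return {
       entities: Array.isArray(parsed.entities) ? parsed.entities : [],
       relationships: Array.isArray(parsed.relationships) ? parsed.relationships : [],
     };
   } catch {
     return null; // caller logs a warning and writes nothing
   }
 }
 ```
 Returning `null` for both failure modes lets the caller handle "no match" and "invalid JSON" with a single warning path.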
 ## Entity Processing
 For each entity in `parsed.entities`:
 1. Validate that `name` is present and not in `IGNORED_NAMES`, and that `type` is in `ENTITY_TYPES`
 2. Call `upsertEntity(name, type, notes)`:
   - **Insert**: creates new row with `mention_count = 1`, `source = 'extraction'`
   - **Conflict** on `(name, type)`: increments `mention_count`, updates `last_seen_at`, preserves existing `notes` if new extraction returns null
 3. Add to `entityMap` keyed by `"${name}::${type}"` — used for relationship resolution below
 4. Call `linkEntityToEpisode(entity.id, episodeId)` — writes to `entity_episodes` join table
 5. Fire-and-forget: embed as `"${name} (${type}): ${notes}"` → store to Qdrant `entities` collection with `{ name, type, notes, projectId }` in payload
 **Valid entity types:** `person`, `place`, `project`, `technology`, `concept`, `organization`
 **Stoplist (ignored names):** `good morning`, `good night`, `hello`, `goodbye`, `thanks`, `thank you`
 ## Relationship Processing
 After all entities are saved, relationships are processed:
 1. For each entry in `parsed.relationships`, look up both endpoints in `entityMap` using `"${from}::${fromType}"` and `"${to}::${toType}"` as keys
 2. If either endpoint is missing (filtered out, invalid type, or not in this extraction), the relationship is silently skipped
 3. Call `upsertRelationship(fromId, toId, label, notes)`:
   - **Insert**: creates new row with `mention_count = 1`
   - **Conflict** on `(from_id, to_id, label)`: increments `mention_count`, preserves existing `notes` if new is null
 Relationships are unidirectional in storage. Bidirectionality is handled at
 query time by the graph traversal layer.
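 Endpoint resolution against `entityMap` can be sketched as below. It assumes `entityMap` is a JavaScript `Map` whose values carry an `id`; the helper name and the extra `label` check are hypothetical.
 ```js
 // Hypothetical sketch: resolve relationship endpoints to entity IDs.
 function resolveRelationships(relationships, entityMap) {
   const resolved = [];
   for (const rel of relationships) {
     const from = entityMap.get(`${rel.from}::${rel.fromType}`);
     const to = entityMap.get(`${rel.to}::${rel.toType}`);
     // Skip silently when an endpoint (or the label) is missing.
     if (!from || !to || !rel.label) continue;
     resolved.push({ fromId: from.id, toId: to.id, label: rel.label, notes: rel.notes ?? null });
   }
   return resolved;
 }
 ```
 Keying on both name and type means a relationship to `"Mercury" (place)` cannot attach to an entity stored as `"Mercury" (technology)`.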
 ## Project Scoping
 `projectId` is threaded through from the episode creation call. It is stored
 in the Qdrant entity payload, which enables project-scoped entity search in
 orchestration. SQLite entities and relationships are global — scoping only
 applies at the Qdrant retrieval layer.
 ## Error Behaviour
 All steps after the initial model call are wrapped in a single outer try/catch.
 If Ollama is unreachable, returns a non-200 status, or the JSON cannot be
 parsed, the function logs at `warn` level and returns. There is no retry logic.
 Individual entity embedding failures are caught per-entity and logged at `warn`
 level without affecting other entities in the same batch.
--- a/docs/services/inference-service.md
+++ b/docs/services/inference-service.md
@@ -24,20 +24,19 @@ to switch inference backends without changes to the rest of the system.
 | Variable | Required | Default | Description |
 |---|---|---|---|
 | PORT | No | 3001 | Port to listen on |
-| INFERENCE_PROVIDER | No | llamacpp | Active inference provider (`ollama` or `llamacpp`) |
+| INFERENCE_PROVIDER | No | llamacpp | Active provider (`ollama` or `llamacpp`) |
 | INFERENCE_URL | No | http://localhost:8080 | URL of the inference runtime |
 | DEFAULT_MODEL | No | local-model | Default model name passed to the provider |
 > `INFERENCE_URL` points to `llama-server` directly (port 8080), not to this
-> service itself. The orchestration service uses `INFERENCE_SERVICE_URL` to
+> service. The orchestration service uses `INFERENCE_SERVICE_URL` to reach
-> reach this service on port 3001.
+> this service on port 3001.
 ## Provider Architecture
-The inference service uses a provider pattern to abstract the underlying
+The active provider is selected at startup via `INFERENCE_PROVIDER` and
-LLM runtime. The active provider is selected at startup via `INFERENCE_PROVIDER`
+loaded from `src/providers/`. Both providers expose identical function
-and loaded from `src/providers/`. Both providers expose identical function
+signatures.
 signatures, so the rest of the service is unaware of which backend is active.
 ### Supported Providers
@@ -46,28 +45,41 @@ signatures, so the rest of the service is unaware of which backend is active.
 | llama.cpp | `llamacpp` | llama.cpp server (OpenAI-compatible API) — **current default** |
 | Ollama | `ollama` | Ollama via the `ollama` npm package — available as fallback |
-Switching providers requires only a `.env` change — no code modifications needed:
+Switching providers requires only a `.env` change — no code modifications:
 ```
 INFERENCE_PROVIDER=llamacpp
 INFERENCE_URL=http://localhost:8080
 ```
-### Provider Validation
+The provider loader throws immediately on an unknown value, preventing silent
 misconfiguration.
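 A fail-fast loader along these lines might look like the sketch below. The provider objects are stand-ins and `selectProvider` is a hypothetical name; the real loader in `infer.js` loads modules from `src/providers/`.
 ```js
 // Stand-in provider registry; the real providers are modules in src/providers/.
 const PROVIDERS = {
   ollama: { name: 'ollama' },
   llamacpp: { name: 'llamacpp' },
 };

 // Hypothetical sketch: validate the configured provider once, at startup.
 function selectProvider(value = 'llamacpp') {
   if (!Object.prototype.hasOwnProperty.call(PROVIDERS, value)) {
     const valid = Object.keys(PROVIDERS).join(', ');
     throw new Error(`Unknown inference provider: "${value}". Valid options: ${valid}`);
   }
   return PROVIDERS[value];
 }
 ```
 Throwing at load time means a typo in `.env` stops the service before it accepts requests, instead of surfacing on the first completion call.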
 > **LM Studio compatibility note:** LM Studio exposes an OpenAI-compatible
 > `/v1/chat/completions` endpoint with the same request shape as llama.cpp.
 > A future `lmstudio.js` provider would be nearly identical to `llamacpp.js` —
 > only the `BASE_URL` would differ. No architectural changes required.
 ## Internal Structure
 The provider loader validates `INFERENCE_PROVIDER` at startup and throws immediately
 if an unknown value is set — prevents silent misconfiguration:
 ```
-Error: Unknown inference provider: "foo". Valid options: ollama, llamacpp
+src/
 ├── providers/
 │   ├── ollama.js      # Ollama provider
 │   └── llamacpp.js    # llama.cpp provider (OpenAI-compatible REST)
 ├── routes/
 │   └── inference.js   # /complete and /complete/stream route handlers
 ├── infer.js           # Provider loader — selects and re-exports active provider
 └── index.js           # Express app + route definitions
 ```
 ## llama.cpp Provider
-The llama.cpp provider uses the OpenAI-compatible REST API exposed by `llama-server`.
+Uses the OpenAI-compatible REST API exposed by `llama-server`.
 ### Starting llama-server
-`llama-server` must be started manually on the main PC before the inference service
+Must be started manually on the main PC before the inference service can
-can handle requests. It loads a single model at startup:
+handle requests:
 ```powershell
 .\llama-gpu\llama-server.exe `
@@ -79,60 +91,42 @@ can handle requests. It loads a single model at startup:
  -c 64000
 ```
 Key flags:
 | Flag | Description |
 |---|---|
 | `-m` | Path to the `.gguf` model file |
 | `-ngl 99` | Offload as many layers as possible to GPU |
-| `--reasoning off` | Disables thinking/reasoning delay on Gemma 4 models |
+| `--reasoning off` | Disables thinking delay on Gemma 4 models |
-| `--host 0.0.0.0` | Allows connections from other machines on the LAN |
+| `--host 0.0.0.0` | Allows LAN connections |
 | `--port 8080` | Port for the llama-server HTTP API |
 | `-c 64000` | Context window size in tokens |
-> `-c 64000` is intentionally large. Monitor VRAM usage — if pressure builds,
+> `-c 64000` is intentionally large. NexusAI's memory architecture handles
-> reduce this value. The NexusAI memory architecture handles context injection
+> context injection so 6–8K is often sufficient if VRAM pressure builds.
 > so a smaller window (6–8K) is often sufficient.
 ### Model Naming
-The model name sent in API requests must match the name as reported by
+The model name in requests must match the name reported by `llama-server`
-`llama-server` — including the `.gguf` extension. The reported name can be
+including the `.gguf` extension:
 verified with:
 ```powershell
 Invoke-RestMethod -Uri "http://192.168.0.79:8080/v1/models"
 ```
-Set `DEFAULT_MODEL` in `.env` to the exact reported name:
+Set `DEFAULT_MODEL` in `.env` to the exact reported name.
 ```
 DEFAULT_MODEL=gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf
 ```
 ### Inference Parameters
-The llamacpp provider maps NexusAI options to OpenAI-compatible fields:
+All parameters are resolved in `resolveOptions()` — falling back to
 `INFERENCE_DEFAULTS` from `@nexusai/shared` if not provided in the request.
 In normal usage, orchestration reads these from `settings.json` and forwards
 them on every request.
-| NexusAI option | API field | Default |
+| NexusAI option | API field | Default | Description |
-|---|---|---|
+|---|---|---|---|
-| `temperature` | `temperature` | 0.7 |
+| `temperature` | `temperature` | 0.7 | Response randomness (0 = deterministic) |
-| `maxTokens` | `max_tokens` | 1024 |
+| `maxTokens` | `max_tokens` | 1024 | Max tokens to generate |
-| `topP` | `top_p` | 0.9 |
+| `topP` | `top_p` | 0.9 | Nucleus sampling probability mass |
-| `topK` | `top_k` | 40 |
+| `topK` | `top_k` | 40 | Top-K token candidates per step |
-| `repeatPenalty` | `repeat_penalty` | 1.1 |
+| `repeatPenalty` | `repeat_penalty` | 1.1 | Penalty for recently used tokens |
-| `seed` | `seed` | null (random) |
+| `seed` | `seed` | null | null = random; integer for reproducible output |
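 The mapping in the table can be sketched as follows. The defaults are copied from the table; the real `INFERENCE_DEFAULTS` lives in `@nexusai/shared` and the real `resolveOptions()` may differ. Omitting `seed` from the body when it is `null` is a choice of this sketch.
 ```js
 // Copied from the table above for illustration only.
 const INFERENCE_DEFAULTS = {
   temperature: 0.7, maxTokens: 1024, topP: 0.9, topK: 40, repeatPenalty: 1.1, seed: null,
 };

 function resolveOptions(options = {}) {
   // ?? keeps explicit zeros such as temperature: 0.
   const pick = (key) => options[key] ?? INFERENCE_DEFAULTS[key];
   const body = {
     temperature: pick('temperature'),
     max_tokens: pick('maxTokens'),
     top_p: pick('topP'),
     top_k: pick('topK'),
     repeat_penalty: pick('repeatPenalty'),
   };
   const seed = pick('seed');
   if (seed !== null) body.seed = seed; // leave the field out so the server picks a seed
   return body;
 }
 ```
 Using `??` rather than `||` matters here: `temperature: 0` is a valid request for deterministic output and should not be replaced by the default.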
 ## Internal Structure
 ```
 src/
 ├── providers/
 │   ├── ollama.js      # Ollama provider — uses ollama npm package
 │   └── llamacpp.js    # llama.cpp provider — uses OpenAI-compatible REST API
 ├── routes/
 │   └── inference.js   # /complete and /complete/stream route handlers
 ├── infer.js           # Provider loader — selects and re-exports active provider
 └── index.js           # Express app + route definitions
 ```
 ## Streaming Response Format
@@ -143,7 +137,7 @@ The llama.cpp provider yields chunks in this shape:
 { response: '', done: true, model: "model-name.gguf", tokenCount: 42 }
 ```
-The inference route re-emits these as SSE events:
+The inference route re-emits as SSE:
 ```
 data: {"response":"token text"}
 data: {"done":true,"model":"model-name.gguf","tokenCount":42}
@@ -151,66 +145,6 @@ data: [DONE]
 ```
 `model` and `tokenCount` are captured from the llama.cpp `finish_reason: stop`
-chunk (`usage.completion_tokens`) and emitted on the done event so the
+chunk and emitted on the done event.
 orchestration layer can forward them to the client.
-## Endpoints
+For all HTTP endpoints, see `api-routes.md`.
 ### Health
 | Method | Path | Description |
 |---|---|---|
 | GET | /health | Service health check — reports active provider and model |
 ### Inference
 | Method | Path | Description |
 |---|---|---|
 | POST | /complete | Standard completion — returns full response when done |
 | POST | /complete/stream | Streaming completion via Server-Sent Events |
 ---
 **POST /complete**
 Request body:
 ```json
 {
  "prompt": "What is the capital of France?",
  "model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
  "temperature": 0.7,
  "maxTokens": 1024
 }
 ```
 `model` is optional — falls back to `DEFAULT_MODEL` if omitted.  
 `maxTokens` is optional — defaults to 1024.  
 `temperature` is optional — defaults to 0.7.
 Response:
 ```json
 {
  "text": "The capital of France is Paris.",
  "model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
  "done": true,
  "evalCount": 8,
  "promptEvalCount": 41
 }
 ```
 ---
 **POST /complete/stream**
 Same request body as `/complete`.
 Response is a stream of Server-Sent Events:
 ```
 data: {"response":"The"}
 data: {"response":" capital of France is Paris."}
 data: {"done":true,"model":"gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf","tokenCount":8}
 data: [DONE]
 ```
 Clients should accumulate `response` fields to build the full response string.
 The `done` event carries `model` and `tokenCount` for display in the UI.
--- a/docs/services/knowledge-graph.md
+++ b/docs/services/knowledge-graph.md
@@ -0,0 +1,213 @@
 # Knowledge Graph
 **Location:** `packages/memory-service/src/graph/index.js`  
 **Schema additions:** `entity_episodes` table; new columns on `entities` and `relationships`  
 **Exposed via:** `GET /graph/neighborhood/:entityId`, `POST /graph/neighbors`  
 **Consumed by:** Orchestration service context assembly
 ## Purpose
 The knowledge graph transforms NexusAI from "remembers conversations" to
 "understands relationships between things." Rather than injecting a flat
 list of entity facts into every prompt, orchestration now retrieves a
 1-hop subgraph of connected entities and their relationships, giving the
 model structured, linked knowledge about people, projects, technologies,
 and concepts that have appeared across conversations.
 ## Schema
 ### `entity_episodes` (join table)
 Tracks which episodes contributed to each entity's knowledge. Defined in
 `schema.js` — exists on all installs.
 ```sql
 CREATE TABLE IF NOT EXISTS entity_episodes (
  entity_id  INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
  episode_id INTEGER NOT NULL REFERENCES episodes(id) ON DELETE CASCADE,
  PRIMARY KEY (entity_id, episode_id)
 );
 ```
 Both FKs cascade on delete — removing an entity or episode automatically
 cleans up its join rows.
 ### New columns on `entities`
 Added via migration in `db/index.js`:
 | Column | Type | Default | Description |
 |---|---|---|---|
 | `mention_count` | INTEGER | 1 | How many times this entity has been extracted across conversations |
 | `confidence` | REAL | 1.0 | Reserved for future confidence scoring |
 | `source` | TEXT | `'extraction'` | `'extraction'` (auto) or `'manual'` |
 | `last_seen_at` | INTEGER | NULL | Unix timestamp of most recent extraction hit |
 ### New columns on `relationships`
 | Column | Type | Default | Description |
 |---|---|---|---|
 | `mention_count` | INTEGER | 1 | How many times this edge has been extracted |
 | `notes` | TEXT | NULL | Relationship context sentence from extraction |
 ## Entity Promotion Model
 Entities are not created equal — some are mentioned once in passing, others
 recur across many conversations. `mention_count` is the signal:
 - Every time `upsertEntity` is called for an existing `(name, type)` pair, `mention_count` is incremented and `last_seen_at` is updated.
 - `ENTITIES.PROMOTION_THRESHOLD` (default: **3**) is the `mention_count` at which an entity is considered "well-established" — referenced in the codebase for future filtering and scoring logic.
 - Currently `mention_count` is stored and incremented but not yet used to gate retrieval. It provides the foundation for future features such as orphan cleanup (entities never re-extracted) and confidence-weighted graph traversal.
 The same pattern applies to relationships — `mention_count` rises each time
 the same `(from_id, to_id, label)` triple is extracted.
 ## Graph Traversal
 `src/graph/index.js` exports two functions built on SQLite's `WITH RECURSIVE`
 CTE support. No external graph database is needed.
 ### `getNeighborhood(entityId, depth)`
 Traverses the graph from a single entity, following edges in **both directions**,
 up to `depth` hops. Returns `{ nodes: [...entities], edges: [...relationships] }`.
 Default depth: `ENTITIES.GRAPH_HOP_DEPTH` (1). Maximum enforced at HTTP layer: 3.
 **SQLite query:**
 ```sql
 WITH RECURSIVE traverse(entity_id, depth) AS (
    SELECT ?, 0
    UNION
    SELECT
        CASE WHEN r.from_id = t.entity_id THEN r.to_id ELSE r.from_id END,
        t.depth + 1
    FROM relationships r
    JOIN traverse t ON (r.from_id = t.entity_id OR r.to_id = t.entity_id)
    WHERE t.depth < ?
 )
 SELECT DISTINCT entity_id FROM traverse
 ```
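 `UNION` (not `UNION ALL`) discards rows that repeat an earlier `(entity_id, depth)`
 pair, which stops the same node from being expanded twice at the same depth. A cycle
 can still reach a node again at a greater depth, so termination rests on the
 `t.depth < ?` bound, and the final `SELECT DISTINCT` collapses the repeats.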
 After collecting node IDs, two follow-up queries fetch:
 - All entity rows for those IDs
 - All relationship rows where both `from_id` and `to_id` are in the node set
 This ensures edges between neighbors are included even if they aren't on the
 traversal path from the seed.
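 For intuition, the same traversal can be written over in-memory arrays. This is an illustrative equivalent only; the service runs the query in SQLite, and `neighborhoodInMemory` is a hypothetical name.
 ```js
 // Collects the IDs reachable from seedId within maxDepth hops, ignoring edge direction.
 function neighborhoodIds(edges, seedId, maxDepth = 1) {
   const visited = new Set([seedId]);
   let frontier = [seedId];
   for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
     const next = [];
     for (const edge of edges) {
       // Follow each edge in both directions.
       for (const [a, b] of [[edge.from_id, edge.to_id], [edge.to_id, edge.from_id]]) {
         if (frontier.includes(a) && !visited.has(b)) {
           visited.add(b);
           next.push(b);
         }
       }
     }
     frontier = next;
   }
   return visited;
 }

 function neighborhoodInMemory(entities, edges, seedId, maxDepth = 1) {
   const ids = neighborhoodIds(edges, seedId, maxDepth);
   return {
     nodes: entities.filter((e) => ids.has(e.id)),
     // Keep every edge whose endpoints are both in the node set.
     edges: edges.filter((r) => ids.has(r.from_id) && ids.has(r.to_id)),
   };
 }
 ```
 The second filter is what pulls in neighbour-to-neighbour edges that the traversal itself never walked.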
 ### `getEntityNeighbors(entityIds[])`
 Bulk 1-hop version designed for orchestration. Given multiple seed entity IDs
 (the results of Qdrant semantic search), returns the combined 1-hop subgraph.
 1. Finds all neighbor IDs via one query using `IN (...)` on both `from_id` and `to_id`
 2. Deduplicates seeds + neighbors using a JavaScript `Set`
 3. Fetches all entity rows and all relationship rows within the combined node set
 This is intentionally simpler than the recursive version — orchestration always
 uses depth=1, and the bulk query avoids N separate CTE calls.
 ## Graph-Aware Context Assembly
 Orchestration's `assembleContext` (in `src/chat/index.js`) integrates the
 graph at step 7 of the chat pipeline:
 1. Qdrant entity search returns up to `ORCHESTRATION.ENTITIES_LIMIT` results, each including `r.id` (the SQLite entity ID) alongside the Qdrant payload
 2. `graph.getNeighbors(entityIds)` is called with those IDs → `POST /graph/neighbors` on memory-service
 3. The returned `{ nodes, edges }` is passed to `formatGraphContext()`
 4. On failure, falls back to using the Qdrant payload data directly as flat nodes with no edges
 ### Prompt Format
 `formatGraphContext(nodes, edges)` in `chat/index.js` formats the subgraph as:
 ```
 Here is what you know about entities relevant to this conversation and their connections:
 - Alice (person): software engineer working on NexusAI
  → works_on NexusAI (project)
  → knows Bob (person)
 - NexusAI (project): AI assistant framework
 - Bob (person): Alice's colleague
 ```
 - One line per node: `- {name} ({type}): {notes}`
 - Outbound edges indented below: `  → {label} {target_name} ({target_type})`
 - Nodes with only inbound edges (pulled in as neighbors) appear without connection lines
 - Only outbound edges are shown — each relationship appears once, from the `from_id` side
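 The rules above can be reproduced in a few lines. This is a re-implementation
 for illustration, not the code in `chat/index.js`; the name
 `formatGraphContextSketch` and the sample data are assumptions.
 ```js
 // Illustrative re-implementation of the documented format rules.
 function formatGraphContextSketch(nodes, edges) {
   const byId = new Map(nodes.map((n) => [n.id, n]));
   const lines = [
     'Here is what you know about entities relevant to this conversation and their connections:',
   ];
   for (const node of nodes) {
     lines.push(`- ${node.name} (${node.type}): ${node.notes}`);
     for (const e of edges) {
       const target = byId.get(e.to_id);
       // Outbound only: each edge is rendered once, under its from_id node.
       if (e.from_id === node.id && target) {
         lines.push(`  → ${e.label} ${target.name} (${target.type})`);
       }
     }
   }
   return lines.join('\n');
 }
 const graphText = formatGraphContextSketch(
   [
     { id: 5, name: 'Alice', type: 'person', notes: 'software engineer working on NexusAI' },
     { id: 8, name: 'NexusAI', type: 'project', notes: 'AI assistant framework' },
     { id: 9, name: 'Bob', type: 'person', notes: "Alice's colleague" },
   ],
   [
     { from_id: 5, to_id: 8, label: 'works_on' },
     { from_id: 5, to_id: 9, label: 'knows' },
   ]
 );
 ```
 NexusAI and Bob have only inbound edges here, so they are listed without
 connection lines, matching the example block above.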
 ## Project Scoping
 The knowledge graph respects project boundaries at the **entry point**, not
 during traversal:
 - Qdrant entity search is filtered by `projectId` — only entities tagged with this project are returned as seeds
 - Graph traversal in SQLite is unfiltered — neighbors can be from any project or no project
 - This is intentional: the graph entry is project-scoped, but traversal follows the global relationship graph to discover connected knowledge
 Entities are tagged with `projectId` in the Qdrant payload at extraction time.
 Entities extracted from non-project sessions have `projectId: null` and only
 appear in unfiltered global searches.
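 As a sketch of the entry-point scoping, the request body for the Qdrant entity
 search might be built as follows. `buildEntitySearchBody` is a hypothetical
 helper, and omitting the filter when no project is set is an assumption about
 how the unfiltered global search is issued.
 ```js
 // Hypothetical helper. The limit and threshold mirror
 // ORCHESTRATION.ENTITIES_LIMIT and ORCHESTRATION.ENTITIES_THRESHOLD.
 function buildEntitySearchBody(vector, projectId) {
   const body = { vector, limit: 5, score_threshold: 0.55, with_payload: true };
   if (projectId != null) {
     // Only seeds tagged with this project are returned.
     body.filter = { must: [{ key: 'projectId', match: { value: projectId } }] };
   }
   return body;
 }
 ```
 The filter applies to seed selection only; the subsequent SQLite traversal is
 not given a `projectId` at all.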
 ## API Reference
 ### `GET /graph/neighborhood/:entityId`
 Returns the neighborhood of a single entity.
 **Query params:**
 | Param | Default | Max | Description |
 |---|---|---|---|
 | `depth` | `ENTITIES.GRAPH_HOP_DEPTH` (1) | 3 | Traversal depth |
 **Response:**
 ```json
 {
  "entity": { "id": 5, "name": "Alice", "type": "person", "notes": "...", "mention_count": 4 },
  "neighborhood": {
    "nodes": [
      { "id": 5, "name": "Alice", "type": "person", "notes": "..." },
      { "id": 8, "name": "NexusAI", "type": "project", "notes": "..." }
    ],
    "edges": [
      { "id": 2, "from_id": 5, "to_id": 8, "label": "works_on", "notes": "...", "mention_count": 3 }
    ]
  }
 }
 ```
 Returns 404 if the entity does not exist.
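 One way the `depth` parameter could be resolved is sketched below. Whether the
 route clamps an out-of-range value or rejects it is not stated here, so the
 clamping behaviour and the name `resolveDepth` are assumptions.
 ```js
 // Sketch only: assumes out-of-range values are clamped, not rejected.
 const DEFAULT_DEPTH = 1; // ENTITIES.GRAPH_HOP_DEPTH
 const MAX_DEPTH = 3;
 function resolveDepth(raw) {
   const n = Number.parseInt(raw, 10);
   // Missing, non-numeric, or non-positive values fall back to the default.
   if (Number.isNaN(n) || n < 1) return DEFAULT_DEPTH;
   return Math.min(n, MAX_DEPTH);
 }
 ```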
 ### `POST /graph/neighbors`
 Bulk 1-hop neighborhood for a set of entity IDs. Used internally by
 orchestration — not intended for direct client use.
 **Request body:**
 ```json
 { "entityIds": [5, 8, 12] }
 ```
 **Response:**
 ```json
 {
  "nodes": [ ...entity objects... ],
  "edges": [ ...relationship objects... ]
 }
 ```
 Returns 400 if `entityIds` is missing or empty.
 ## Constants (`packages/shared/src/config/constants.js`)
 | Constant | Value | Description |
 |---|---|---|
 | `ENTITIES.PROMOTION_THRESHOLD` | 3 | `mention_count` at which an entity is considered well-established |
 | `ENTITIES.GRAPH_HOP_DEPTH` | 1 | Default traversal depth for neighborhood queries |
 | `ORCHESTRATION.ENTITIES_LIMIT` | 5 | Max entity seeds returned from Qdrant search |
 | `ORCHESTRATION.ENTITIES_THRESHOLD` | 0.55 | Minimum similarity score for Qdrant entity search |
--- a/docs/services/memory-service.md
+++ b/docs/services/memory-service.md
@@ -9,8 +9,8 @@
 Responsible for all reading and writing of long-term memory. Acts as the
 sole interface to both SQLite and Qdrant — no other service accesses these
-stores directly. On episode creation, automatically calls the embedding
+stores directly. On episode creation, automatically triggers entity and
-service to generate and store a vector in Qdrant.
+relationship extraction and embeds results into Qdrant.
 ## Dependencies
@@ -28,6 +28,8 @@ service to generate and store a vector in Qdrant.
 | SQLITE_PATH | Yes | — | Path to SQLite database file |
 | QDRANT_URL | No | http://localhost:6333 | Qdrant instance URL |
 | EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |
 | EXTRACTION_URL | No | http://localhost:11434 | Ollama URL for entity extraction |
 | EXTRACTION_MODEL | No | qwen2.5:3b | Ollama model used for entity extraction |
 ## Internal Structure
@@ -35,42 +37,57 @@ service to generate and store a vector in Qdrant.
 src/
 ├── db/
 │   ├── index.js       # SQLite connection + initialization + migrations
-│   └── schema.js      # Table definitions, indexes, FTS5, triggers
+│   ├── schema.js      # Table definitions, indexes, FTS5, triggers
 │   ├── projects.js    # Project CRUD functions
 │   └── summaries.js   # Summary CRUD functions
 ├── episodic/
 │   └── index.js       # Session + episode CRUD, FTS search, embedding write path
 ├── semantic/
 │   └── index.js       # Qdrant collection management, upsert, search, delete
 ├── entities/
-│   └── index.js       # Entity + relationship CRUD
+│   ├── index.js       # Entity + relationship CRUD (upsert, mention tracking)
-└── index.js           # Express app + route definitions
+│   └── extraction.js  # Automatic entity + relationship extraction via qwen2.5:3b
 ├── graph/
 │   └── index.js       # Knowledge graph traversal (neighborhood queries, recursive CTE)
 └── index.js           # Express app + all route definitions
 ```
 ## SQLite Schema
-Five core tables:
+Eight core tables:
- **sessions** — top-level conversation containers, identified by an `external_id` and optional `name`
+- **sessions** — top-level conversation containers. Fields: `external_id`, `name`, `project_id`, `metadata`
 - **episodes** — individual exchanges (user message + AI response) tied to a session
- **entities** — named things the system learns about (people, places, concepts)
+- **entities** — named things the system learns about (people, places, concepts, etc.). Fields include `mention_count`, `confidence`, `source`, `last_seen_at`
- **relationships** — directional labeled links between entities
+- **relationships** — directional labeled links between entities (`from_id`, `to_id`, `label`). Fields include `mention_count`, `notes`
 - **entity_episodes** — join table linking entities to the episodes where they were extracted. Used for provenance and orphan cleanup
 - **summaries** — condensed episode groups for efficient context retrieval
 - **projects** — named groupings of sessions with `name`, `description`, `colour`, `icon`, `isolated`, `notes`, `system_prompt`
 ### Migrations
-Schema changes that cannot be expressed in `CREATE TABLE IF NOT EXISTS` are applied
+Schema changes that cannot use `CREATE TABLE IF NOT EXISTS` are applied as
-as migrations in `db/index.js` at startup, wrapped in try/catch to safely ignore
+idempotent migrations in `db/index.js` at startup:
 already-applied changes:
 ```js
-try {
+try { db.exec(`ALTER TABLE sessions ADD COLUMN name TEXT`); } catch {}
-    db.exec(`ALTER TABLE sessions ADD COLUMN name TEXT`);
+try { db.exec(`ALTER TABLE sessions ADD COLUMN project_id INTEGER REFERENCES projects(id)`); } catch {}
-} catch {
+try { db.exec(`CREATE INDEX IF NOT EXISTS idx_sessions_project ON sessions(project_id)`); } catch {}
-    // Column already exists — safe to ignore on subsequent startups
+try { db.exec(`ALTER TABLE projects ADD COLUMN isolated INTEGER NOT NULL DEFAULT 0`); } catch {}
-}
+try { db.exec(`ALTER TABLE projects ADD COLUMN notes TEXT`); } catch {}
 try { db.exec(`ALTER TABLE projects ADD COLUMN system_prompt TEXT`); } catch {}
 // Knowledge graph columns:
 try { db.exec(`ALTER TABLE entities ADD COLUMN mention_count INTEGER NOT NULL DEFAULT 1`) } catch {}
 try { db.exec(`ALTER TABLE entities ADD COLUMN confidence REAL NOT NULL DEFAULT 1.0`) } catch {}
 try { db.exec(`ALTER TABLE entities ADD COLUMN source TEXT NOT NULL DEFAULT 'extraction'`) } catch {}
 try { db.exec(`ALTER TABLE entities ADD COLUMN last_seen_at INTEGER`) } catch {}
 try { db.exec(`ALTER TABLE relationships ADD COLUMN mention_count INTEGER NOT NULL DEFAULT 1`) } catch {}
 try { db.exec(`ALTER TABLE relationships ADD COLUMN notes TEXT`) } catch {}
 ```
-Current migrations:
+`entity_episodes` is defined in `schema.js` itself (not a migration) since it is a new table.
- `ALTER TABLE sessions ADD COLUMN name TEXT` — adds display name to sessions
+
 New migrations are always appended — never modify the schema file for existing tables since `ALTER TABLE` cannot use `IF NOT EXISTS`.
 ### FTS5 Full-Text Search
@@ -82,11 +99,22 @@ keep the FTS index automatically in sync with the episodes table.
 - `journal_mode = WAL` — non-blocking reads during writes
 - `foreign_keys = ON` — enforces referential integrity and cascade deletes
- PRAGMAs are set via `db.pragma()` separately from `db.exec()`
+- PRAGMAs set via `db.pragma()`, not `db.exec()`
 ### Dynamic Updates
 Both `updateSession` and `updateProject` build their `SET` clause dynamically
 from only the fields passed — prevents partial updates from overwriting fields
 that weren't touched.
 `updateProject` allowlist:
 ```js
 const allowed = ['name', 'description', 'colour', 'icon', 'isolated', 'notes', 'system_prompt'];
 ```
 ## Qdrant / Semantic Layer
-Three collections are initialized on service startup (created if they don't already exist):
+Three Qdrant collections are initialized on service startup via `semantic.initCollections()`:
 | Collection | Purpose |
 |---|---|
@@ -94,177 +122,79 @@ Three collections are initialized on service startup (created if they don't alre
 | `entities` | Embeddings for named entities |
 | `summaries` | Embeddings for condensed episode summaries |
-All collections use **768-dimension vectors** with **Cosine similarity**, matching the
+All collections use **768-dimension vectors** with **Cosine similarity**,
-output of the `nomic-embed-text` embedding model via Ollama.
+matching `nomic-embed-text` via Ollama. Vector size and distance metric are
 defined in `@nexusai/shared` — not hardcoded here.
-Vector dimension and distance metric are defined in `@nexusai/shared` constants
+`initCollections()` iterates `Object.values(COLLECTIONS)` and creates any
-(`QDRANT.VECTOR_SIZE`, `QDRANT.DISTANCE_METRIC`) — not hardcoded in this service.
+collection that doesn't already exist at startup — all three collections are
 guaranteed to exist before any requests are handled.
-### Semantic Layer Operations
+Each collection exposes upsert, search (with optional Qdrant filter), and
-
+delete operations. The `wait: true` flag is used on all writes.
 Each collection exposes three operations via helper functions in `src/semantic/index.js`:
 - **Upsert** — stores a vector with a payload containing the SQLite row ID, enabling
  lookups back to the full content after a vector search
 - **Search** — returns the top-k most similar vectors, with optional Qdrant filter
 - **Delete** — removes a vector point by ID
 The `wait: true` flag is used on all write operations so the caller receives confirmation
 only after Qdrant has committed the change.
 ## Embedding Write Path
-When a new episode is created, the memory service automatically generates and stores
+When a new episode is created:
 a vector embedding in Qdrant via the embedding service:
-1. Episode is saved to SQLite synchronously — the response is returned immediately
+1. Episode saved to SQLite synchronously — response returned immediately
-2. Both sides of the exchange are combined into a single text:
+2. User message + AI response combined: `User: ...\nAssistant: ...`
-   ```
+3. Text sent to embedding service (`POST /embed`)
-   User: {userMessage}
+4. Vector upserted into `episodes` Qdrant collection with payload `{ sessionId, createdAt }`
   Assistant: {aiResponse}
   ```
 3. This text is sent to the embedding service (`POST /embed`)
 4. The returned vector is upserted into the `episodes` Qdrant collection with a
   payload of `{ sessionId, createdAt }` for filtering and lookups
-The embedding step is **fire-and-forget** — it runs asynchronously after the SQLite
+This step is **fire-and-forget** — if embedding fails, the episode is still
-insert succeeds. If embedding fails, the episode is still saved and searchable via
+saved and searchable via FTS. The error is logged but not surfaced.
 FTS. The error is logged but does not affect the API response.
-### Hybrid Retrieval Pattern
+> The Qdrant payload stores `sessionId` (the internal integer ID). See
-
+> `memory-isolation.md` for how project-level filtering works.
 Qdrant and SQLite work as a pair — neither operates in isolation:
 1. Query is embedded and searched in Qdrant → returns IDs + similarity scores
 2. IDs are used to fetch full content from SQLite
 3. Results are ranked and assembled into a context package
 ## Entity Layer
-Entities and relationships are stored in SQLite with two key constraints:
+Entities and relationships use upsert semantics with composite unique
 constraints to prevent duplicates:
- `UNIQUE(name, type)` on entities — ensures no duplicates; upsert updates existing records
+- `UNIQUE(name, type)` on entities — conflict increments `mention_count` and updates `last_seen_at`
- `UNIQUE(from_id, to_id, label)` on relationships — prevents duplicate edges
+- `UNIQUE(from_id, to_id, label)` on relationships — conflict increments `mention_count` and preserves existing `notes`
- `ON DELETE CASCADE` on both `from_id` and `to_id` — deleting an entity automatically
+- `ON DELETE CASCADE` on relationship foreign keys
  removes all relationships where it appears on either end
-## Endpoints
+After each episode is saved, `extraction.js` automatically extracts named
 entities **and relationships** from the conversation using `qwen2.5:3b` on
 Ollama — fire-and-forget. Each saved entity is also linked to the episode
 via the `entity_episodes` join table.
-### Health
+> For full details on the extraction pipeline and JSON format, see `entity-extraction.md`.  
 > For the knowledge graph traversal layer, see `knowledge-graph.md`.
-| Method | Path | Description |
+## Knowledge Graph Layer
 |---|---|---|
 | GET | /health | Service health check |
-### Sessions
+`src/graph/index.js` provides SQLite-based graph traversal over the entities
 and relationships tables. Two functions are exposed via HTTP:
-| Method | Path | Description |
+- **`getNeighborhood(entityId, depth)`** — recursive CTE traversal, bidirectional, returns `{ nodes, edges }`
-|---|---|---|
+- **`getEntityNeighbors(entityIds[])`** — bulk 1-hop traversal for orchestration context assembly
 | POST | /sessions | Create a new session |
 | GET | /sessions | Get paginated list of all sessions |
 | GET | /sessions/:id | Get session by internal ID |
 | GET | /sessions/by-external/:externalId | Get session by external ID |
 | PATCH | /sessions/by-external/:externalId | Update session name |
 | DELETE | /sessions/by-external/:externalId | Delete session (cascades to episodes + summaries) |
-> Route ordering matters in Express: `by-external/:externalId` must be defined before
+> For design rationale, traversal queries, and integration with orchestration, see `knowledge-graph.md`.
 > `/:id` to prevent the literal string `by-external` being captured as an ID parameter.
-**POST /sessions body:**
+## Summaries Layer
-```json
+
-{
+Session summaries are generated by `orchestration-service/src/services/summarization.js`
-  "externalId": "unique-session-id",
+after each episode write and stored here via `POST /summaries`. The memory
-  "metadata": {}
+service is responsible only for CRUD — generation logic lives in orchestration.
-}
+
 > For full details on trigger conditions, prompt format, cumulative updates,
 > and ChatML token stripping, see `summarization.md`.
 ## Project Delete Behaviour
 Deleting a project runs as a transaction — it first nulls out `project_id`
 on all assigned sessions, then deletes the project. This avoids a foreign
 key constraint failure since `sessions.project_id` has no `ON DELETE` rule:
 ```js
 const doDelete = db.transaction(() => {
  db.prepare(`UPDATE sessions SET project_id = NULL WHERE project_id = ?`).run(id);
  db.prepare(`DELETE FROM projects WHERE id = ?`).run(id);
 });
 ```
-**PATCH /sessions/by-external/:externalId body:**
+For all HTTP endpoints, see `api-routes.md`.
 ```json
 {
  "name": "My Renamed Session"
 }
 ```
 Returns the updated session object. `name` is required and must be non-empty.
 **DELETE /sessions/by-external/:externalId**
 Returns `204 No Content` on success. Cascades to delete all associated episodes
 and summaries via SQLite `ON DELETE CASCADE`.
 ### Episodes
 | Method | Path | Description |
 |---|---|---|
 | POST | /episodes | Create episode + auto-embed into Qdrant |
 | GET | /episodes/search?q=&limit= | Full-text search across episodes |
 | GET | /episodes/:id | Get episode by ID |
 | GET | /sessions/:id/episodes?limit=&offset= | Get paginated episodes for a session |
 | DELETE | /episodes/:id | Delete an episode |
 **POST /episodes body:**
 ```json
 {
  "sessionId": 1,
  "userMessage": "Hello",
  "aiResponse": "Hi there!",
  "tokenCount": 10,
  "metadata": {}
 }
 ```
 > Note: `/episodes/search` must be defined before `/episodes/:id` in Express to prevent
 > the word `search` being captured as an ID parameter.
 ### Entities
 | Method | Path | Description |
 |---|---|---|
 | POST | /entities | Upsert an entity (creates or updates by name + type) |
 | GET | /entities/by-type/:type | Get all entities of a given type |
 | GET | /entities/:id | Get entity by internal ID |
 | DELETE | /entities/:id | Delete entity (cascades to relationships) |
 **POST /entities body:**
 ```json
 {
  "name": "NexusAI",
  "type": "project",
  "notes": "My AI memory project",
  "metadata": {}
 }
 ```
 > Note: `/entities/by-type/:type` must be defined before `/entities/:id` in Express to
 > prevent `by-type` being captured as an ID parameter.
 ### Relationships
 | Method | Path | Description |
 |---|---|---|
 | POST | /relationships | Upsert a relationship between two entities |
 | GET | /entities/:id/relationships | Get all relationships originating from an entity |
 | DELETE | /relationships | Delete a specific relationship |
 **POST /relationships body:**
 ```json
 {
  "fromId": 1,
  "toId": 2,
  "label": "uses",
  "metadata": {}
 }
 ```
 **DELETE /relationships body:**
 ```json
 {
  "fromId": 1,
  "toId": 2,
  "label": "uses"
 }
 ```
 > Relationships are identified by the composite key `(fromId, toId, label)`. Delete uses
 > the request body rather than URL params as this three-part key is awkward to express
 > cleanly in a path.
--- a/docs/services/orchestration-service.md
+++ b/docs/services/orchestration-service.md
@@ -27,256 +27,200 @@ or inference services — all traffic flows through orchestration.
 | MEMORY_SERVICE_URL | No | http://localhost:3002 | Memory service URL |
 | EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |
 | INFERENCE_SERVICE_URL | No | http://localhost:3001 | Inference service URL |
 | LLAMA_SERVER_URL | No | http://localhost:8080 | Direct llama-server URL for /models/props |
 | QDRANT_URL | No | http://localhost:6333 | Qdrant URL for semantic search |
 | CORS_ORIGIN | No | http://localhost:5173 | Allowed origin for CORS requests |
-| MODELS_MANIFEST_PATH | Yes | — | Path to `models.json` manifest file |
+| EXTRACTION_URL | No | http://localhost:11434 | Ollama URL for summarisation |
 | EXTRACTION_MODEL | No | qwen2.5:3b | Ollama model used for summarisation |
 ## Internal Structure
 ```
 src/
 ├── services/
 │   ├── memory.js         # HTTP client for memory service
 │   ├── inference.js      # HTTP client for inference service
 │   ├── embedding.js      # HTTP client for embedding service
-│   └── qdrant.js      # HTTP client for Qdrant vector search
+│   ├── qdrant.js         # HTTP client for Qdrant (direct vector search)
 │   ├── graph.js          # HTTP client for memory-service graph endpoints
 │   └── summarization.js  # Session summarisation — triggers after each episode
 ├── chat/
-│   └── index.js       # Core pipeline logic — context assembly and coordination
+│   └── index.js          # Core pipeline — context assembly, graph expansion, auto-naming
 ├── config/
 │   └── settings.js       # Settings load/save — reads/writes data/settings.json
 ├── routes/
-│   ├── chat.js        # POST /chat and POST /chat/stream route handlers
+│   ├── chat.js           # POST /chat and POST /chat/stream
-│   ├── sessions.js    # Session list, history, rename, and delete routes
+│   ├── sessions.js       # Session CRUD proxy
-│   └── models.js      # GET /models — reads models.json manifest from disk
+│   ├── projects.js       # Project CRUD proxy
 │   ├── episodes.js       # Episode list and delete proxy
 │   ├── summaries.js      # GET /summaries/session/:id and /summaries/project/:id
 │   ├── settings.js       # GET /settings and PATCH /settings
 │   ├── health.js         # GET /health/services — pings all four services
 │   └── models.js         # GET /models and GET /models/props
 └── index.js              # Express app entry point
 ```
-The `services/` layer wraps all downstream HTTP calls in named functions,
+The `services/` layer wraps all downstream HTTP calls in named functions.
 keeping the pipeline logic in `chat/index.js` readable and ensuring that
 URL or endpoint changes have a single place to be updated.
 ## Settings
 Settings are persisted to `data/settings.json` and loaded on every request
 via `appSettings.load()` — changes apply immediately without a service restart.
 | Setting | Default | Description |
 |---|---|---|
 | `recentEpisodeLimit` | 5 | Recent episodes injected into prompt |
 | `semanticLimit` | 5 | Semantic search results injected into prompt |
 | `scoreThreshold` | 0.5 | Minimum similarity score for Qdrant semantic results |
 | `semanticWeight` | 1.0 | RRF weight for Qdrant semantic results |
 | `keywordWeight` | 0 | RRF weight for FTS5 keyword results (`0` = disabled) |
 | `modelsFolderPath` | `/mnt/nexus-models` | Path to folder containing .gguf files |
 | `temperature` | 0.7 | Inference temperature |
 | `repeatPenalty` | 1.1 | Repeat token penalty |
 | `topP` | 0.9 | Nucleus sampling probability mass |
 | `topK` | 40 | Top-K token candidates per step |
 | `systemPrompt` | *(ORCHESTRATION.SYSTEM_PROMPT)* | Global system prompt. `null` reverts to hardcoded constant. |
 ## Chat Pipeline
-Both `POST /chat` and `POST /chat/stream` share the same context assembly
+Both `POST /chat` and `POST /chat/stream` share the same steps. The only
-steps. The only difference is how the inference response is delivered to
+difference is how the inference response is delivered to the client.
 the client.
-1. **Session resolution** — looks up the session by `externalId` in the memory
+### Steps
   service. If not found, auto-creates a new session. Clients can generate a
   UUID for new conversations and pass it directly — no pre-creation step needed.
-2. **Recent episode retrieval** — fetches the most recent episodes for the session
+1. **Session resolution** — look up session by `externalId`. Auto-create if
-   (default: 5) from the memory service.
+   not found.
-3. **Semantic search** — embeds the user message via the embedding service, then
+2. **Project context resolution** — if the session has a `project_id`, fetch
-   queries Qdrant for the top-5 most similar past episodes (score threshold: 0.75).
+   the project and all its session IDs. Used to scope semantic search. The
-   Results are deduplicated against the recent episode set using a `Set` of IDs.
+   project's `system_prompt` is also read at this step if set.
   Full episode content is fetched from the memory service by ID. This step is
   non-critical — if it fails, a warning is logged and the pipeline continues with
   recency-only context.
-4. **Prompt assembly** — combines the system prompt, semantic episodes (if any),
+3. **System prompt resolution** — three-tier hierarchy:
-   recent episodes, and the current user message into a single prompt string.
+   - `project.system_prompt` — highest priority
   - `settings.systemPrompt` — global setting from `settings.json`
   - `ORCHESTRATION.SYSTEM_PROMPT` — hardcoded constant (last resort)
-5. **Inference** — sends the assembled prompt to the inference service. `/chat`
+4. **Recent episode retrieval** — fetch most recent episodes (`recentEpisodeLimit`).
   awaits the full response; `/chat/stream` opens an SSE connection and pipes
   chunks to the client as they arrive.
-6. **Episode write** — writes the new exchange (user message + AI response)
+5. **Fused episode retrieval** — runs semantic (Qdrant) and keyword (FTS5)
-   back to the memory service as a fire-and-forget operation. For streaming,
+   search in parallel, then merges results via Reciprocal Rank Fusion (RRF).
-   the full response text is accumulated across chunks before writing.
+   Both paths are filtered against `recentIds` before fusion. FTS is scoped
   to the current session or all project sessions. If `keywordWeight` is `0`,
   the FTS call is skipped entirely. Non-critical — failures fall back to
   whichever strategy succeeded.
-7. **Response** — returns the AI response, model name, session ID, and token
+6. **Entity search** — query `entities` Qdrant collection filtered by
-   count to the client.
+   `projectId`. Returns entity IDs alongside Qdrant payload data (the Qdrant
   point ID equals the SQLite entity ID). Non-critical.
 7. **Graph neighborhood expansion** — call `POST /graph/neighbors` on
   memory-service with the entity IDs from step 6. Returns a 1-hop subgraph
   `{ nodes, edges }` — entity objects plus the relationships connecting them.
   If no entities were found or the graph call fails, falls back to flat entity
   list (no edges). Non-critical.
 8. **Prompt assembly** — combine system prompt, graph context, fused episodes,
   recent episodes, and user message.
 9. **Inference** — send to inference service. `/chat` awaits full response;
   `/chat/stream` pipes SSE chunks to the client.
 10. **Episode write** — write exchange back to memory with `projectId`.
 11. **Summarisation trigger** — `triggerSummary(session, allEpisodes)` called
    fire-and-forget. See `summarization.md` for full details.
 12. **Auto-naming** — on first message with no session name, fires a secondary
    inference call (max 20 tokens, temperature 0.3) to generate a session name.
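 The three-tier hierarchy in step 3 amounts to a short fallback chain. The
 sketch below uses a hypothetical `resolveSystemPrompt` helper and assumes that
 an empty string counts as "not set".
 ```js
 // Hypothetical helper: the first non-empty tier wins.
 function resolveSystemPrompt(project, settings, hardcodedDefault) {
   return (
     (project && project.system_prompt) ||   // 1. project override
     (settings && settings.systemPrompt) ||  // 2. global setting
     hardcodedDefault                        // 3. ORCHESTRATION.SYSTEM_PROMPT
   );
 }
 ```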
 ### Prompt Structure
 ## Prompt Structure
 ```
-[System prompt]
+[Resolved system prompt]
 Here is what you know about entities relevant to this conversation and their connections:
 - {name} ({type}): {notes}
  → {label} {neighbor_name} ({neighbor_type})
 ---
 Here are some relevant memories from earlier conversations:
 User: {past user message}
 Assistant: {past ai response}
 ... (up to 5 semantic episodes)
 ---
 Here are some relevant memories from your past conversations:
 User: {past user message}
 Assistant: {past ai response}
 ... (up to 5 recent episodes)
 --- End of recent memories ---
 User: {current message}
 Assistant:
 ```
-Semantic episodes appear before recent episodes so the model encounters
+The entity block renders the full graph neighborhood — seed entities matched
-long-range relevant context before the immediate conversation flow.
+by Qdrant search plus any neighbors pulled in by 1-hop traversal. Each entity
 shows its `notes` and any outbound relationships with their targets. Neighbor
 nodes that have no outbound edges within the subgraph appear without connection
 lines.
 ## Summarisation
 After each episode write, `triggerSummary` is called fire-and-forget. It
 checks token thresholds and episode counts before generating, then stores
 the result in the memory service.
 > For full details on trigger conditions, prompt format, cumulative updates,
 > ChatML token stripping, and episode range tracking, see `summarization.md`.
 ## SSE Stream Format
-The inference service emits chunks from the llama.cpp provider in this format:
+Inference service → orchestration:
 ```
 data: {"response":"Hello","done":false}
-data: {"response":"!","done":false}
+data: {"done":true,"model":"gemma-4-26B...gguf","tokenCount":42}
 data: {"done":true,"model":"gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf","tokenCount":42}
 data: [DONE]
 ```
-The orchestration service re-emits to the client as:
+Orchestration → client:
 ```
 data: {"text":"Hello"}
-data: {"text":"!"}
+data: {"done":true,"model":"gemma-4-26B...gguf","tokenCount":42}
 data: {"done":true,"model":"gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf","tokenCount":42}
 ```
-The `[DONE]` sentinel from the inference service is consumed internally
+The `[DONE]` sentinel is consumed internally and not forwarded.
 and not forwarded. The client stream is terminated by `res.end()` after
 the done event. Model name and token count are included on the done event
 so the client can display them in the UI.
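 The translation between the two formats can be sketched as a per-line function.
 `translateSseLine` is a hypothetical name, and the sketch assumes each SSE
 event arrives as one complete `data:` line.
 ```js
 // Illustrative sketch of the re-emit step, one SSE line at a time.
 function translateSseLine(line) {
   if (!line.startsWith('data: ')) return null;
   const data = line.slice('data: '.length).trim();
   if (data === '[DONE]') return null; // sentinel is consumed, not forwarded
   const msg = JSON.parse(data);
   if (msg.done) {
     const done = { done: true, model: msg.model, tokenCount: msg.tokenCount };
     return `data: ${JSON.stringify(done)}`;
   }
   // Token chunk: rename `response` to `text` for the client.
   return `data: ${JSON.stringify({ text: msg.response })}`;
 }
 ```
 A `null` return means nothing is written to the client for that line.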
-## Models Manifest
+## Models Route
-The `/models` endpoint reads a `models.json` file from disk at the path
+`GET /models` scans `.gguf` files live from `modelsFolderPath` and merges
-specified by `MODELS_MANIFEST_PATH`. The file lives on the main PC alongside
+with `models.json` for metadata. Returns file size in GB.
 the model files, and is accessible to orchestration via a network share
 mounted at `/mnt/nexus-models`.
-The manifest is read fresh on each request — no restart needed when models
+`GET /models/props` fetches directly from llama-server. Returns
-are added or removed.
+`{ contextWindow, modelAlias }`. Returns `503` if unreachable.
-**models.json format:**
+## Sessions Route Behaviour
-```json
+
-[
+`PATCH /sessions/:sessionId` accepts `name`, `projectId`, or both.
-  { "value": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf", "label": "Gemma 4 26B Claude Distill" }
+Rejects only when neither is provided — allows `useChat` to write project
-]
+assignment separately from rename operations.
 ## Caddy Configuration
 Each route prefix needs a handle block in the Caddyfile on Mini PC 2.
 **Any new top-level route must be added here AND in `vite.config.js`.**
 ```
 handle /chat*      { reverse_proxy localhost:4000 }
 handle /sessions*  { reverse_proxy localhost:4000 }
 handle /models*    { reverse_proxy localhost:4000 }
 handle /projects*  { reverse_proxy localhost:4000 }
 handle /episodes*  { reverse_proxy localhost:4000 }
 handle /settings*  { reverse_proxy localhost:4000 }
 handle /summaries* { reverse_proxy localhost:4000 }
 handle /health*    { reverse_proxy localhost:4000 }
 ```
- `value` — must match the model name as reported by `llama-server` (including `.gguf` extension)
+After updating: `caddy reload --config /path/to/Caddyfile`
 - `label` — display name shown in the UI
-## Endpoints
+> Note: `/graph` routes are on the memory-service (port 3002) and are called
 > internally by orchestration — they do not need a Caddy entry.
-### Health
+For all HTTP endpoints, see `api-routes.md`.
 | Method | Path | Description |
 |---|---|---|
 | GET | /health | Service health check — reports downstream service URLs |
 ### Chat
 | Method | Path | Description |
 |---|---|---|
 | POST | /chat | Send a message and receive a complete response |
 | POST | /chat/stream | Send a message and receive a streaming SSE response |
 ### Sessions
 | Method | Path | Description |
 |---|---|---|
 | GET | /sessions | Get paginated list of all sessions |
 | GET | /sessions/:sessionId/history | Get paginated episode history for a session |
 | PATCH | /sessions/:sessionId | Rename a session |
 | DELETE | /sessions/:sessionId | Delete a session and all its episodes |
 ### Models
 | Method | Path | Description |
 |---|---|---|
 | GET | /models | Get list of available models from manifest file |
 ---
 **POST /chat**
 Request body:
 ```json
 {
  "sessionId": "your-session-uuid",
  "message": "Hello, my name is Tim.",
  "model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
  "temperature": 0.7
 }
 ```
 `model` and `temperature` are optional — fall back to inference service defaults
 if omitted.
 Response:
 ```json
 {
  "sessionId": "your-session-uuid",
  "response": "Hello Tim! How can I help you today?",
  "model": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf",
  "tokenCount": 87
 }
 ```
 ---
 **POST /chat/stream**
 Same request body as `POST /chat`.
 Response is a stream of Server-Sent Events:
 ```
 data: {"text":"Hello"}
 data: {"text":" Tim"}
 data: {"done":true,"model":"gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf","tokenCount":87}
 ```
 ---
 **PATCH /sessions/:sessionId**
 Request body:
 ```json
 { "name": "My Renamed Session" }
 ```
 Returns the updated session object. `name` is required and trimmed of whitespace.
 ---
 **DELETE /sessions/:sessionId**
 Returns `204 No Content`. Cascades to delete all episodes for the session.
 ---
 **GET /sessions/:sessionId/history**
 Query parameters:
 | Parameter | Default | Description |
 |---|---|---|
 | limit | 20 | Maximum number of episodes to return |
 | offset | 0 | Number of episodes to skip (for pagination) |
 Response:
 ```json
 {
  "sessionId": "your-session-uuid",
  "episodes": [
    {
      "id": 42,
      "session_id": 1,
      "user_message": "Hello, my name is Tim.",
      "ai_response": "Hello Tim! How can I help you today?",
      "token_count": 87,
      "created_at": 1712345678,
      "metadata": null
    }
  ]
 }
 ```
 Episodes are ordered newest first.
 ---
 **GET /models**
 Returns the parsed contents of `models.json`:
 ```json
 [
  { "value": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf", "label": "Gemma 4 26B Claude Distill" }
 ]
 ```
 Returns `500` if the manifest file cannot be read or parsed.
--- a/docs/services/retrieval-fusion.md
+++ b/docs/services/retrieval-fusion.md
@@ -0,0 +1,153 @@
 # Retrieval Fusion
 **Implementation:** `packages/orchestration-service/src/chat/index.js`  
 **FTS scoping:** `packages/memory-service/src/episodic/index.js`, `src/index.js`  
 **Settings:** `semanticWeight`, `keywordWeight` via `PATCH /settings`
 ## Purpose
 Rather than relying solely on Qdrant vector similarity (which finds semantically
 related content but misses exact keyword matches) or FTS5 keyword search alone
 (which finds exact matches but not paraphrases), Reciprocal Rank Fusion (RRF)
 merges the ranked results from both strategies into a single better-ranked list.
 Episodes that rank highly in **both** lists score highest. An episode that is
 the top semantic match but irrelevant by keyword, or vice versa, scores lower
 than one that satisfies both.
 ## How RRF Works
 For each episode `d`, its fused score is:
 ```
 RRF(d) = w_semantic / (k + rank_semantic(d))
        + w_keyword  / (k + rank_keyword(d))
 ```
 - `rank_i(d)` — 1-based position in that strategy's result list (episode absent from a list contributes 0 for that term)
 - `k = 60` — smoothing constant (standard; not exposed in settings)
 - `w_semantic`, `w_keyword` — user-tunable weights (both default-sourced from `RETRIEVAL` constants)
 Setting a weight to `0` removes that strategy's contribution entirely. Setting
 `keywordWeight` to `0` also short-circuits the FTS network call.
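 The formula can be checked with a small worked example. This is a sketch only: `rrfScore` is a hypothetical helper, and both weights are set to `1` here for illustration rather than taken from the shipped defaults.
 ```js
 // Hypothetical helper: scores one episode from its 1-based ranks.
 // A rank of null means the episode is absent from that list and contributes 0.
 const K = 60;
 function rrfScore({ semanticRank = null, keywordRank = null }, weights = {}) {
     const { semanticWeight = 1, keywordWeight = 1 } = weights;
     const term = (weight, rank) => (rank === null ? 0 : weight / (K + rank));
     return term(semanticWeight, semanticRank) + term(keywordWeight, keywordRank);
 }
 const both = rrfScore({ semanticRank: 3, keywordRank: 3 }); // 1/63 + 1/63
 const semanticOnly = rrfScore({ semanticRank: 1 });         // 1/61
 console.log(both > semanticOnly); // true: 2/63 ≈ 0.0317 beats 1/61 ≈ 0.0164
 ```
 With equal weights, an episode ranked third in both lists outscores one ranked first in only a single list, which is the behaviour described under Purpose.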
 ## Architecture
 Fusion lives in orchestration — the service already coordinates multiple data
 sources, and fusion is a retrieval strategy, not a storage concern.
 ```
 getFusedEpisodes()
 ├── getSemanticEpisodes()     — Qdrant embed+search → fetch full rows by ID
 │   (existing path, unchanged)
 └── getFTSResults()           — memory-service /episodes/search → full rows directly
    (skipped entirely if keywordWeight == 0)
         ↓
 fuseEpisodeResults()          — pure RRF, no I/O
         ↓
 fusedEpisodes[]               — top semanticLimit episodes by RRF score
 ```
 ### Data Shape Consistency
 Both sides must enter fusion as `Episode[]` — full SQLite row objects with
 the same shape — and both must be filtered against `recentIds` first:
 - **Semantic path**: `recentIds` filter applied before `getEpisodeById` fetch (existing behaviour)
 - **FTS path**: full rows returned directly; `recentIds` filter applied in `getFusedEpisodes` after receiving them
 FTS requests `semanticLimit * 2` results to provide headroom for the
 `recentIds` filter without under-serving the fusion.
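 A rough sketch of the FTS-side filter, assuming `recentIds` is a list of episode IDs and each FTS row carries an `id`. `dropRecent` is a hypothetical name used only for illustration.
 ```js
 // Hypothetical helper: removes rows whose IDs are already in the recent window.
 function dropRecent(ftsRows, recentIds) {
     const recent = new Set(recentIds);
     return ftsRows.filter(row => !recent.has(row.id));
 }
 const semanticLimit = 2;
 // FTS was asked for semanticLimit * 2 rows, so two survive after filtering.
 const ftsRows = [{ id: 9 }, { id: 8 }, { id: 5 }, { id: 2 }];
 const candidates = dropRecent(ftsRows, [9, 8]);
 console.log(candidates.map(row => row.id)); // [5, 2]
 ```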
 ## FTS Session Scoping
 Without scoping, FTS5 searches across all episodes in the database. For
 context assembly, results must be constrained to the current session or
 project session pool — the same scope used for Qdrant semantic search.
 `searchEpisodes(query, limit, sessionIds)` in memory-service accepts an
 optional `sessionIds` array. When provided, the SQL becomes:
 ```sql
 SELECT e.* FROM episodes e
 JOIN episodes_fts fts ON e.id = fts.rowid
 WHERE episodes_fts MATCH ?
 AND e.session_id IN (?, ?, ...)
 ORDER BY rank
 LIMIT ?
 ```
 The HTTP endpoint `GET /episodes/search` accepts `sessionIds` as a
 comma-separated query param: `?q=hello&sessionIds=1,2,3`.
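 The scoping could be sketched as follows. Both function names are hypothetical, and the real service may validate or build the query differently; the point is only how a comma-separated param maps onto `IN (?, ?, ...)` placeholders.
 ```js
 // Hypothetical helper: "1,2,3" -> [1, 2, 3]; non-numeric entries are dropped.
 function parseSessionIds(param) {
     if (!param) return [];
     return param
         .split(',')
         .map(s => Number.parseInt(s.trim(), 10))
         .filter(n => Number.isInteger(n));
 }
 // Hypothetical helper: emits one "?" placeholder per session ID.
 function buildSearchSql(sessionIds) {
     const scope = sessionIds.length
         ? `AND e.session_id IN (${sessionIds.map(() => '?').join(', ')})`
         : '';
     return [
         'SELECT e.* FROM episodes e',
         'JOIN episodes_fts fts ON e.id = fts.rowid',
         'WHERE episodes_fts MATCH ?',
         scope,
         'ORDER BY rank',
         'LIMIT ?',
     ].filter(Boolean).join('\n');
 }
 const ids = parseSessionIds('1,2,3');
 console.log(buildSearchSql(ids));
 // Bind parameters would then be [query, ...ids, limit].
 ```
 Using placeholders rather than interpolating the IDs keeps the statement parameterised even though the list length varies.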
 In orchestration, `ftsSessionIds` is set to:
 - `projectSessionIds` (all sessions in the project) — if the session belongs to a project
 - `[session.id]` — otherwise (single session only)
 This mirrors the Qdrant scoping logic exactly.
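 As an illustration of that choice, assuming the session row exposes a nullable `project_id` and the project's session IDs have already been loaded (`resolveFtsSessionIds` is a hypothetical name):
 ```js
 // Hypothetical helper; `project_id` and `projectSessionIds` are assumed inputs.
 function resolveFtsSessionIds(session, projectSessionIds = []) {
     const inProject = session.project_id != null;
     return inProject ? projectSessionIds : [session.id];
 }
 console.log(resolveFtsSessionIds({ id: 7, project_id: null }));         // [7]
 console.log(resolveFtsSessionIds({ id: 7, project_id: 2 }, [5, 6, 7])); // [5, 6, 7]
 ```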
 ## `fuseEpisodeResults` — Implementation Detail
 ```js
 function fuseEpisodeResults(semanticEps, keywordEps, { semanticWeight, keywordWeight, limit }) {
    const k = RETRIEVAL.RRF_K; // 60
    const scores = new Map();  // episode.id → { episode, score }
    // Score semantic results (already filtered against recentIds)
    semanticEps.forEach((ep, i) => {
        scores.set(ep.id, { episode: ep, score: semanticWeight / (k + i + 1) });
    });
    // Score + merge keyword results (already filtered against recentIds)
    keywordEps.forEach((ep, i) => {
        const contrib = keywordWeight / (k + i + 1);
        if (scores.has(ep.id)) {
            scores.get(ep.id).score += contrib;   // appears in both — sum scores
        } else if (contrib > 0) {
            scores.set(ep.id, { episode: ep, score: contrib });  // FTS-only episode
        }
        // contrib == 0 (keywordWeight: 0) → episode not added (guard prevents score-0 bleed-through)
    });
    return [...scores.values()]
        .sort((a, b) => b.score - a.score)
        .slice(0, limit)
        .map(({ episode }) => episode);
 }
 ```
 The `else if (contrib > 0)` guard prevents FTS-only episodes from entering
 the result set with a score of 0 when `keywordWeight` is 0 — verified by
 the test suite.
 ## Settings
 | Setting | Default | Range | Description |
 |---|---|---|---|
 | `semanticWeight` | 1.0 | 0–5 | Weight applied to Qdrant semantic results |
 | `keywordWeight` | 0 | 0–5 | Weight applied to FTS5 keyword results. `0` = disabled |
 Both are readable via `GET /settings` and writable via `PATCH /settings`
 without a service restart. Changes take effect on the next chat request.
 **To enable keyword search:**
 ```bash
 curl -X PATCH http://localhost:4000/settings \
  -H "Content-Type: application/json" \
  -d '{"keywordWeight": 1.0}'
 ```
 **To favour keyword matches over semantic:**
 ```bash
 curl -X PATCH http://localhost:4000/settings \
  -H "Content-Type: application/json" \
  -d '{"semanticWeight": 0.5, "keywordWeight": 2.0}'
 ```
 ## Constants (`packages/shared/src/config/constants.js`)
 | Constant | Value | Description |
 |---|---|---|
 | `RETRIEVAL.RRF_K` | 60 | RRF smoothing constant — not exposed in settings |
 | `RETRIEVAL.SEMANTIC_WEIGHT` | 1.0 | Default semantic weight |
 | `RETRIEVAL.KEYWORD_WEIGHT` | 0 | Default keyword weight (off) |
--- a/docs/services/shared.md
+++ b/docs/services/shared.md
@@ -142,6 +142,9 @@ llama.cpp runtime defaults — used by the llama.cpp inference provider.
 #### `INFERENCE_DEFAULTS`
 Default inference parameters applied when not specified in a request.
 These are used as fallbacks in `resolveOptions()` in both providers.
 Orchestration reads live values from `settings.json` and forwards them
 on every request — these constants are the fallback layer only.
 | Key | Value | Description |
 |---|---|---|
@@ -154,21 +157,52 @@ Default inference parameters applied when not specified in a request.
 #### `ORCHESTRATION`
-Orchestration pipeline defaults.
+Orchestration pipeline defaults. Used as fallback values in
 `config/settings.js` when `settings.json` doesn't contain a key.
 | Key | Value | Description |
 |---|---|---|
 | `RECENT_EPISODE_LIMIT` | `5` | Recent episodes to inject into prompt |
 | `SEMANTIC_LIMIT` | `5` | Semantic search results to inject into prompt |
 | `SCORE_THRESHOLD` | `0.75` | Minimum similarity score for semantic results |
 | `ENTITIES_LIMIT` | `5` | Max entity search results to inject into prompt |
 | `ENTITIES_THRESHOLD` | `0.55` | Minimum similarity score for entity results |
 | `TEMPERATURE` | `0.7` | Default inference temperature |
 | `CORS_ORIGIN` | `'http://localhost:5173'` | Fallback allowed CORS origin |
 | `SYSTEM_PROMPT` | *(see below)* | Default system prompt |
 > `ENTITIES_THRESHOLD` is set to `0.55` — lower than `SCORE_THRESHOLD` because
 > entity notes generated by a 3B model tend to embed with lower cosine similarity
 > than full episode text. Tune upward if irrelevant entities appear in context.
 > `repeatPenalty`, `topP`, and `topK` defaults are sourced from
 > `INFERENCE_DEFAULTS` in `config/settings.js` rather than `ORCHESTRATION`,
 > since those constants already define the canonical values.
 Default system prompt:
 > "You are a helpful, context-aware AI assistant. You have access to memories
 > of past conversations with the user. Use them to provide consistent,
 > personalised responses."
 #### `SUMMARIES`
 Controls the automatic session summarisation system in `orchestration-service/src/services/summarization.js`.
 | Key | Value | Description |
 |---|---|---|
 | `THRESHOLD_TOKENS` | `200` | Minimum total session tokens before summarisation is considered |
 | `MAX_SUMMARY_TOKENS` | `800` | If the existing summary exceeds this length (in characters), a new row is created instead of updating the existing one |
 | `MIN_EPISODES_SINCE` | `5` | Minimum new episodes since last summary before re-summarising |
 These can be overridden per-deployment via environment variables in the
 orchestration service `.env`:
 ```
 SUMMARY_THRESHOLD_TOKENS=200
 SUMMARY_MAX_TOKENS=800
 SUMMARY_MIN_EPISODES=5
 ```
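 One way such overrides could be read, shown as a sketch rather than the actual contents of `constants.js`. `envInt` is a hypothetical helper, and a plain object stands in for `process.env` so the example runs anywhere.
 ```js
 // Hypothetical helper: parses an integer env value, falling back when unset or invalid.
 function envInt(env, name, fallback) {
     const parsed = Number.parseInt(env[name] ?? '', 10);
     return Number.isNaN(parsed) ? fallback : parsed;
 }
 const env = { SUMMARY_THRESHOLD_TOKENS: '300' }; // stand-in for process.env
 const SUMMARIES = {
     THRESHOLD_TOKENS: envInt(env, 'SUMMARY_THRESHOLD_TOKENS', 200),
     MAX_SUMMARY_TOKENS: envInt(env, 'SUMMARY_MAX_TOKENS', 800),
     MIN_EPISODES_SINCE: envInt(env, 'SUMMARY_MIN_EPISODES', 5),
 };
 console.log(SUMMARIES); // THRESHOLD_TOKENS overridden, the other two at their defaults
 ```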
 #### `SQLITE`
 | Key | Value | Description |
--- a/docs/services/summarization.md
+++ b/docs/services/summarization.md
@@ -0,0 +1,201 @@
 # Summarization
 Session summarization generates rolling plain-text summaries of conversation
 history, giving the model a condensed view of past context without consuming
 the full context window with raw episodes.
 **Location:** `packages/orchestration-service/src/services/summarization.js`  
 **Triggered by:** `chat/index.js` after every episode write (fire-and-forget)  
 **Model:** `qwen2.5:3b` via Ollama on Mini PC 1 (192.168.0.81)
 ---
 ## Trigger Conditions
 `triggerSummary(session, allEpisodes)` calls `maybeSummarize` fire-and-forget.
 `maybeSummarize` proceeds only when both conditions are met:
 1. Total session token count exceeds `SUMMARIES.THRESHOLD_TOKENS` (default 200)
 2. At least `SUMMARIES.MIN_EPISODES_SINCE` (default 5) new episodes have
   accumulated since the last summary
 The token threshold is intentionally low — it ensures summaries start
 generating early in a session's life rather than only after very long
 conversations.
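 The two conditions can be sketched as a single predicate. `shouldSummarize` and its argument names are hypothetical; the sketch reads "exceeds" as a strict comparison and "at least" as an inclusive one.
 ```js
 const SUMMARIES = { THRESHOLD_TOKENS: 200, MIN_EPISODES_SINCE: 5 };
 // Hypothetical predicate mirroring the two trigger conditions above.
 function shouldSummarize({ totalTokens, episodesSinceLastSummary }) {
     return totalTokens > SUMMARIES.THRESHOLD_TOKENS
         && episodesSinceLastSummary >= SUMMARIES.MIN_EPISODES_SINCE;
 }
 console.log(shouldSummarize({ totalTokens: 450, episodesSinceLastSummary: 5 })); // true
 console.log(shouldSummarize({ totalTokens: 450, episodesSinceLastSummary: 2 })); // false
 ```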
 ---
 ## Summary Rows and Cumulative Updates
 Each session can have multiple summary rows in the `summaries` table.
 The update strategy depends on the size of the most recent summary:
 | Condition | Action |
 |---|---|
 | No existing summary | Generate fresh summary from all episodes |
 | Latest summary under `MAX_SUMMARY_TOKENS` | Update: summarise new episodes with existing summary as context |
 | Latest summary over `MAX_SUMMARY_TOKENS` | Create new row: treat as fresh summarisation |
 This produces a chain of summary rows over time. Each row's `episode_range`
 covers only the episodes summarised in that specific pass (e.g. `259-263`),
 not all episodes in the session.
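 The decision table could be expressed roughly as below. This is a sketch under assumptions: `chooseSummaryAction` is a hypothetical name, the summary size is measured as the character length of `content`, and a summary exactly at the limit is treated as updatable.
 ```js
 // Hypothetical helper: picks the update strategy for the next summarisation pass.
 function chooseSummaryAction(latestSummary, maxLength = 800) {
     if (!latestSummary) return 'create-fresh';
     return latestSummary.content.length > maxLength
         ? 'create-new-row'
         : 'update-existing';
 }
 console.log(chooseSummaryAction(null));                         // "create-fresh"
 console.log(chooseSummaryAction({ content: 'Short summary' })); // "update-existing"
 console.log(chooseSummaryAction({ content: 'x'.repeat(900) })); // "create-new-row"
 ```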
 ---
 ## Ollama Request
 ```js
 {
    model: EXTRACTION_MODEL,   // qwen2.5:3b (set via EXTRACTION_MODEL env var)
    prompt: buildSummaryPrompt(episodesToSummarize, existingSummary),
    stream: false,
    // No format: 'json' — free-text output required for summaries
    options: {
        temperature: 0.2,
        num_predict: 500,
    },
 }
 ```
 `temperature: 0.2` is slightly higher than the extraction setting (0.1) because
 summaries benefit from some fluency. `num_predict: 500` leaves room for five
 thorough sentences while capping runaway generation.
 ---
 ## Prompt Format
 ChatML format — native to qwen2.5:
 ```
 <|im_start|>user
 Summarize the conversation below in 3-5 sentences.
 Write in third person. Do not quote directly — paraphrase only.
 Do not include greetings, sign-offs, or filler. Output only the summary text.
 Conversation:
 {context}
 <|im_end|>
 <|im_start|>assistant
 ```
 For cumulative updates, the instruction and context change:
 ```
 <|im_start|>user
 Update the summary below to incorporate the new exchanges.
 Write 3-5 sentences in third person. Do not quote directly — paraphrase only.
 Do not include greetings, sign-offs, or filler. Output only the updated summary text.
 Previous summary:
 {existingSummary}
 New exchanges:
 {context}
 <|im_end|>
 <|im_start|>assistant
 ```
 ### Input truncation
 Episode context is truncated to `MAX_CHARS = 3000` characters, keeping the
 most recent exchanges (sliced from the end). This keeps Qwen focused and
 prevents the prompt from exceeding its effective context window.
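 Slicing from the end can be sketched with a negative `slice` index. `truncateContext` is a hypothetical helper; the real code may cut on episode boundaries rather than raw characters.
 ```js
 const MAX_CHARS = 3000;
 // Hypothetical helper: keeps only the last maxChars characters of the context.
 function truncateContext(context, maxChars = MAX_CHARS) {
     return context.length > maxChars ? context.slice(-maxChars) : context;
 }
 const context = 'old exchange. '.repeat(300) + 'latest exchange.';
 const truncated = truncateContext(context);
 console.log(truncated.length);                        // 3000
 console.log(truncated.endsWith('latest exchange.'));  // true
 ```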
 ---
 ## ChatML Token Stripping
 Qwen occasionally echoes ChatML tokens back into its response. The raw output
 is cleaned before saving:
 ```js
 const raw = data.response?.trim() ?? '';
 const content = raw
    .replace(/<\|im_start\|>.*?<\|im_end\|>/gs, '')
    .replace(/<\|im_start\|>|<\|im_end\|>|<\|im_sep\|>/g, '')
    .trim();
 return content;
 ```
 Without this, leaked tokens get stored in the summary and then injected
 back into the next summarisation prompt — causing the model to append a new
 summary after the old one rather than replacing it.
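 The effect of the two patterns can be seen on sample strings. The wrapper name `stripChatML` and the inputs are illustrative only; the regular expressions are the ones shown above.
 ```js
 // Illustrative wrapper around the cleanup shown above.
 function stripChatML(raw) {
     return raw
         .replace(/<\|im_start\|>.*?<\|im_end\|>/gs, '')            // drop echoed blocks
         .replace(/<\|im_start\|>|<\|im_end\|>|<\|im_sep\|>/g, '')  // drop stray tokens
         .trim();
 }
 const echoedPrompt = '<|im_start|>user\nSummarize the conversation.<|im_end|>\nThe user introduced himself as Tim.';
 const trailingToken = 'The user introduced himself as Tim.<|im_end|>';
 console.log(stripChatML(echoedPrompt));  // "The user introduced himself as Tim."
 console.log(stripChatML(trailingToken)); // "The user introduced himself as Tim."
 ```
 One consequence worth knowing: a response that is entirely wrapped in a `<|im_start|>` ... `<|im_end|>` pair is removed by the first pattern, leaving an empty string.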
 ---
 ## Episode Range Tracking
 Each summary row stores `episode_range` as `"firstId-lastId"` covering only
 the episodes summarised in that pass:
 ```js
 const summarizedIds = episodesToSummarize.map(ep => ep.id).sort((a,b) => a - b);
 const episodeRange = `${summarizedIds.at(0)}-${summarizedIds.at(-1)}`;
 ```
 This makes SummaryView cards meaningful — "Episodes 259-263" tells you
 exactly which exchanges that summary covers, rather than always showing
 the full session range.
 ---
 ## Summary Storage
 Summaries are written directly to the memory service from orchestration:
 ```js
 // Create new row
 await fetch(`${MEMORY_URL}/summaries`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ sessionId: session.id, content, tokenCount, episodeRange }),
 });
 // Update existing row
 await fetch(`${MEMORY_URL}/summaries/${latest.id}`, {
    method: 'PATCH',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ content, tokenCount, episodeRange }),
 });
 ```
 `session.id` here is the internal SQLite integer ID — not the external UUID.
 It is available directly on the `session` object passed from `chat/index.js`.
 ---
 ## Client-Side Indicator
 The chat client shows a "Summarising…" spinner in the `ChatWindow` header
 and on the InfoPanel's Session Memory button while summarisation may be
 in progress.
 Since summarisation is fire-and-forget with no completion signal back to
 the client, the indicator is timer-based: it activates when the stream
 finishes and clears after 8 seconds.
 ```js
 // In App.jsx, watching the streaming state from useChat:
 useEffect(() => {
    if (prevStreaming.current && !streaming) {
        setSummarising(true);
        const t = setTimeout(() => setSummarising(false), 8000);
        return () => clearTimeout(t);
    }
    prevStreaming.current = streaming;
 }, [streaming]);
 ```
 ---
 ## Environment Variables
 Set in `packages/orchestration-service/src/.env`:
 | Variable | Default | Description |
 |---|---|---|
 | `EXTRACTION_URL` | `http://localhost:11434` | Ollama instance URL |
 | `EXTRACTION_MODEL` | `qwen2.5:3b` | Model for summarisation |
 | `MEMORY_SERVICE_URL` | `http://localhost:3002` | Memory service URL |
 | `SUMMARY_THRESHOLD_TOKENS` | `200` | Token threshold before summarisation triggers |
 | `SUMMARY_MAX_TOKENS` | `800` | Max summary length before a new row is created |
 | `SUMMARY_MIN_EPISODES` | `5` | Min new episodes since last summary before re-summarising |
--- a/package-lock.json
+++ b/package-lock.json
--- a/packages/chat-client/package.json
+++ b/packages/chat-client/package.json
@@ -10,6 +10,7 @@
  "dependencies": {
    "react": "^18.2.0",
    "react-dom": "^18.2.0",
    "react-markdown": "^10.1.0",
    "uuid": "^13.0.0"
  },
  "devDependencies": {
--- a/packages/chat-client/src/App.jsx
+++ b/packages/chat-client/src/App.jsx
@@ -1,18 +1,54 @@
-import React, { useState } from 'react';
+import React, { useState, useEffect } from 'react';
 import SessionList from './components/SessionList';
 import ChatWindow from './components/ChatWindow';
 import InfoPanel from './components/InfoPanel';
 import Sidebar from './components/Sidebar';
 import HomeView from './components/HomeView';
 import { v4 as uuidv4 } from 'uuid';
 import { getModelProps } from './api/orchestration';
 /*** View Panels*** */
 import AllChatsView from './components/AllChatsView';
 import AllProjectsView from './components/AllProjectsView';
 import SettingsView from './components/SettingsView';
 import ProjectView from './components/ProjectView';
 import MemoryView from './components/MemoryView';
 import SummaryView from './components/SummaryView';
 /**** useHooks **** */
 import { useSession } from './hooks/useSession';
 import { useChat } from './hooks/useChat';
 import { useModels } from './hooks/useModels';
 import { useProjects } from './hooks/useProjects';
 // Views where back nav makes sense, and where they go back to
 const BACK_MAP = {
  'chat':         'home',
  'all-chats':    'home',
  'all-projects': 'home',
  'settings':     'home',
  'project':      'all-projects',
  'memory':       'settings',
  'summaries':    'chat',   
 };
 export default function App() {
-  const [leftOpen, setLeftOpen] = useState(true);
+  const [leftOpen, setLeftOpen] = useState(false); // collapsed on home
-  const [rightOpen, setRightOpen] = useState(true);
+  const [rightOpen, setRightOpen] = useState(false);
  const { models, selectedModel, setSelectedModel } = useModels();
  const [view, setView] = useState('home');
  const [viewHistory, setViewHistory] = useState([]);
  const [activeProject, setActiveProject] = useState(null);
  const { projects, refreshProjects } = useProjects();
  // Lifted model props — available to header + SettingsView
  const [modelProps, setModelProps] = useState(null);
  useEffect(() => {
    getModelProps().then(setModelProps).catch(() => {});
  }, []);
  const {
    sessions,
    setSessions,
    activeSession,
    messages,
    loadingHistory,
@@ -29,10 +65,32 @@ export default function App() {
    streaming,
    lastTokenCount,
    lastModel,
    summarising,
  } = useChat({ activeSession, appendMessage, updateLastMessage, refreshSessions });
  function navigate(nextView) {
    setViewHistory(prev => [...prev, view]);
    setView(nextView);
    // Expand sidebar when leaving home
    if (view === 'home') setLeftOpen(true);
  }
  function goBack() {
    if (viewHistory.length > 0) {
      const prev = viewHistory[viewHistory.length - 1];
      setViewHistory(h => h.slice(0, -1));
      setView(prev);
      if (prev === 'home') setLeftOpen(false);
    } else {
      // Fallback to BACK_MAP
      const dest = BACK_MAP[view] ?? 'home';
      setView(dest);
      if (dest === 'home') setLeftOpen(false);
    }
  }
  function handleSendMessage(text) {
-    sendMessage(text, selectedModel);
+    sendMessage(text, selectedModel, activeSession?.project_id ?? null);
  }
  function handleSessionsChange(deletedSession) {
@@ -42,22 +100,57 @@ export default function App() {
    refreshSessions();
  }
  // Home: create session, navigate to chat, then send after a tick
  function handleHomeSend(text) {
    const newSession = createSession(); // ← capture the returned session
    setViewHistory(prev => [...prev, 'home']);
    setView('chat');
    setLeftOpen(true);
    sendMessage(text, selectedModel, null, newSession); // ← pass directly, no setTimeout needed
  }
  function handleNewProjectChat(text) {
    const newSession = {
      external_id: uuidv4(),
      metadata: null,
      isNew: true,
      project_id: activeProject?.id ?? null,
    };
    setSessions(prev => [newSession, ...prev]);
    selectSession(newSession);
    setViewHistory(prev => [...prev, view]);
    setView('chat');
    setLeftOpen(true);
    sendMessage(text, selectedModel, activeProject?.id ?? null, newSession); // ← direct, no timeout
  }
  const canGoBack = view !== 'home';
  return (
-    <div style={{
+    <div style={{ display: 'flex', height: '100vh', overflow: 'hidden' }}>
-      display: 'flex',
+      <Sidebar
      height: '100vh',
      overflow: 'hidden',
    }}>
      <SessionList
        sessions={sessions}
        activeSession={activeSession}
-        onSelectSession={selectSession}
+        onSelectSession={session => { selectSession(session); navigate('chat'); }}
-        onNewChat={createSession}
+        onNewChat={() => { createSession(); navigate('chat'); }}
        onNewProject={() => navigate('all-projects')}
        isOpen={leftOpen}
        onToggle={() => setLeftOpen(o => !o)}
        onSessionsChange={handleSessionsChange}
        onNavigate={navigate}
        projects={projects}
        onProjectsChange={refreshProjects}
        onSelectProject={setActiveProject}
      />
      {view === 'home' && (
        <HomeView
          onSendMessage={handleHomeSend}
          loadedModel={modelProps?.modelAlias ?? null}
        />
      )}
      {view === 'chat' && (
        <ChatWindow
          messages={messages}
          loadingHistory={loadingHistory}
@@ -65,7 +158,63 @@ export default function App() {
          activeSession={activeSession}
          onSendMessage={handleSendMessage}
          onCancel={cancelStream}
          onTogglePanel={() => setRightOpen(o => !o)}
          onBack={goBack}
          canGoBack={canGoBack}
          loadedModel={modelProps?.modelAlias ?? null}
          summarising={summarising}
        />
      )}
      {view === 'all-chats' && (
        <AllChatsView
          onBack={goBack}
          onSelectSession={session => { selectSession(session); navigate('chat'); }}
          projects={projects}
        />
      )}
      {view === 'all-projects' && (
        <AllProjectsView
          onBack={goBack}
          onProjectsChange={refreshProjects}
          onSelectProject={setActiveProject}
          onNavigate={navigate}
        />
      )}
      {view === 'settings' && (
        <SettingsView
          onNavigate={navigate}
          onBack={goBack}
          modelProps={modelProps}
        />
      )}
      {view === 'project' && activeProject && (
        <ProjectView
          project={activeProject}
          onNavigate={navigate}
          onBack={goBack}
          onSelectSession={selectSession}
          onNewProjectChat={handleNewProjectChat}
          onProjectsChange={refreshProjects}
        />
      )}
      {view === 'memory' && (
        <MemoryView
          onNavigate={navigate}
          onBack={goBack}
        />
      )}
      {view === 'summaries' && (
        <SummaryView
          activeSession={activeSession}
          onBack={goBack}
        />
      )}
      <InfoPanel
        isOpen={rightOpen}
@@ -76,7 +225,8 @@ export default function App() {
        onModelChange={setSelectedModel}
        lastModel={lastModel}
        lastTokenCount={lastTokenCount}
-
+        summarising={summarising}
        onViewSummary={() => navigate('summaries')}
      />
    </div>
  );
--- a/packages/chat-client/src/api/orchestration.js
+++ b/packages/chat-client/src/api/orchestration.js
@@ -1,11 +1,17 @@
 import { API_DEFAULTS } from "../config/constants";
 const BASE_URL = import.meta.env.VITE_ORCHESTRATION_URL ?? '';
 // ── Sessions ────────────────────────────────────────────────
-export async function fetchSessions(limit = API_DEFAULTS.SESSIONS_LIMIT, offset = API_DEFAULTS.OFFSET) {
+export async function fetchSessions(limit = API_DEFAULTS.SESSIONS_LIMIT, offset = API_DEFAULTS.OFFSET, projectId = null) {
-  const res = await fetch(`${BASE_URL}/sessions?limit=${limit}&offset=${offset}`);
+  const url = new URL(`${BASE_URL}/sessions`, window.location.origin);
  url.searchParams.set('limit', limit);
  url.searchParams.set('offset', offset);
  if (projectId) url.searchParams.set('projectId', projectId);
  const res = await fetch(url.toString());
  if (!res.ok) throw new Error(`Failed to fetch sessions: ${res.status}`);
  return res.json();
 }
@@ -28,60 +34,6 @@ export async function sendMessage(sessionId, message, model) {
  return res.json();
 }
 // onChunk(text) called for each token
 // onDone({ model, tokenCount }) called when stream closes
 // returns an abort function — call it to cancel mid-stream
 /*
 export function streamMessage(sessionId, message, model, { onChunk, onDone, onError }) {
  const controller = new AbortController();
  (async () => {
    try {
      const res = await fetch(`${BASE_URL}/chat/stream`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ sessionId, message, model }),
        signal: controller.signal,
      });
      if (!res.ok) throw new Error(`Stream request failed: ${res.status}`);
      const reader = res.body.getReader();
      const decoder = new TextDecoder();
      let buffer = '';
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        // Append to buffer and split on double newline (SSE event delimiter)
        buffer += decoder.decode(value, { stream: true });
        const events = buffer.split('\n\n');
        buffer = events.pop(); // last item may be incomplete — keep in buffer
        for (const event of events) {
          const line = event.trim();
          if (!line.startsWith('data: ')) continue;
          const raw = line.slice(6);
          try {
            const data = JSON.parse(raw);
            if (data.text) onChunk(data.text);
            if (data.done) onDone({ model: data.model ?? model, tokenCount: data.tokenCount ?? 0 });
            if (data.error) onError(new Error(data.error));
          } catch {
            // malformed JSON — skip
          }
        }
      }
    } catch (err) {
      if (err.name !== 'AbortError') onError(err);
    }
  })();
  return () => controller.abort();
 }
 */
 export function streamMessage(sessionId, message, model, { onChunk, onDone, onError }) {
  const controller = new AbortController();
@@ -144,19 +96,131 @@ export async function fetchModels() {
  return res.json();
 }
-export async function renameSession(sessionId, name) {
+export async function updateSession(sessionId, { name, projectId } = {}) {
  const res = await fetch(`${BASE_URL}/sessions/${sessionId}`, {
    method: 'PATCH',
    headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ name }),
+    body: JSON.stringify({ name, projectId }),
  });
-    if (!res.ok) throw new Error(`Failed to rename session: ${res.status}`);
+  if (!res.ok) throw new Error(`Failed to update session: ${res.status}`);
  return res.json();
 }
 export async function renameSession(sessionId, name) {
  return updateSession(sessionId, {name})
 }
 export async function deleteSession(sessionId) {
    const res = await fetch(`${BASE_URL}/sessions/${sessionId}`, {
        method: 'DELETE',
    });
    if (!res.ok) throw new Error(`Failed to delete session: ${res.status}`);
 }
 export async function fetchProjects() {
  const res = await fetch(`${BASE_URL}/projects`);
  if (!res.ok) throw new Error(`Failed to fetch projects: ${res.status}`);
  return res.json();
 }
 export async function createProject({ name, description, colour, icon, isolated }) {
  const res = await fetch(`${BASE_URL}/projects`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ name, description, colour, icon, isolated: isolated ? 1 : 0 }),
  });
  if (!res.ok) throw new Error(`Failed to create project: ${res.status}`);
  return res.json();
 }
 export async function updateProject(id, fields = {}) {
  // Convert isolated boolean to integer if present
  const body = { ...fields };
  if (body.isolated !== undefined) body.isolated = body.isolated ? 1 : 0;
  const res = await fetch(`${BASE_URL}/projects/${id}`, {
    method: 'PATCH',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Failed to update project: ${res.status}`);
  return res.json();
 }
 export async function deleteProject(id) {
  const res = await fetch(`${BASE_URL}/projects/${id}`, { method: 'DELETE' });
  if (!res.ok) throw new Error(`Failed to delete project: ${res.status}`);
 }
 export async function updateSessionProject(sessionId, projectId) {
  const res = await fetch(`${BASE_URL}/sessions/${sessionId}`, {
    method: 'PATCH',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ projectId }),
  });
  if (!res.ok) throw new Error(`Failed to update session project: ${res.status}`);
  return res.json();
 }
 export async function getEpisodes({ limit = API_DEFAULTS.EPISODE_LIMIT, offset = API_DEFAULTS.OFFSET, sessionId, q } = {}) {
  const url = new URL(`${BASE_URL}/episodes`, window.location.origin);
  url.searchParams.set('limit', limit);
  url.searchParams.set('offset', offset);
  if (sessionId) url.searchParams.set('sessionId', sessionId);
  if (q) url.searchParams.set('q', q);
  const res = await fetch(url.toString());
  if (!res.ok) throw new Error(`Failed to fetch episodes: ${res.status}`);
  return res.json(); // { episodes, total }
 }
 export async function deleteEpisode(id) {
  const res = await fetch(`${BASE_URL}/episodes/${id}`, { method: 'DELETE' });
  if (!res.ok) throw new Error(`Failed to delete episode: ${res.status}`);
 }
 export async function getSettings() {
  const res = await fetch(`${BASE_URL}/settings`);
  if (!res.ok) throw new Error(`Failed to fetch settings: ${res.status}`);
  return res.json();
 }
 export async function updateSettings(updates) {
  const res = await fetch(`${BASE_URL}/settings`, {
    method: 'PATCH',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(updates),
  });
  if (!res.ok) throw new Error(`Failed to update settings: ${res.status}`);
  return res.json();
 }
 export async function getServiceHealth() {
  const res = await fetch(`${BASE_URL}/health/services`);
  if (!res.ok) throw new Error(`Failed to fetch health: ${res.status}`);
  return res.json();
 }
 export async function getModelProps() {
  const res = await fetch(`${BASE_URL}/models/props`);
  if (!res.ok) throw new Error(`Failed to fetch model props: ${res.status}`);
  return res.json();
 }
 export async function fetchSessionSummaries(sessionId) {
  const res = await fetch(`${BASE_URL}/summaries/session/${sessionId}`);
  if (!res.ok) throw new Error(`Failed to fetch summaries: ${res.status}`);
  return res.json();
 }
 export async function generateProjectSummary(projectId) {
    const res = await fetch(`${BASE_URL}/summaries/project/${projectId}/generate`, { method: 'POST' });
    if (!res.ok) throw new Error(`Failed to generate project summary: ${res.status}`);
    return res.json();
 }
 export async function fetchProjectOverviewSummary(projectId) {
    const res = await fetch(`${BASE_URL}/summaries/project/${projectId}/overview`);
    if (!res.ok) throw new Error(`Failed to fetch project overview: ${res.status}`);
    return res.json(); // null if none exists yet
 }
--- a/packages/chat-client/src/components/AllChatsView.jsx
+++ b/packages/chat-client/src/components/AllChatsView.jsx
@@ -0,0 +1,274 @@
 import React, { useState, useEffect } from 'react';
 import { fetchSessions, deleteSession } from '../api/orchestration';
 import { CLIENT_DEFAULTS } from '../config/constants';
 const PAGE_SIZE = CLIENT_DEFAULTS.PAGE_SIZE;
 export default function AllChatsView({ onSelectSession, onBack, projects }) {
  const [sessions, setSessions] = useState([]);
  const [loading, setLoading] = useState(true);
  const [page, setPage] = useState(0);
  const [total, setTotal] = useState(0);
  const [selected, setSelected] = useState(new Set());
  const [confirmOpen, setConfirmOpen] = useState(false);
  const [deleting, setDeleting] = useState(false);
  useEffect(() => {
    loadPage(page);
  }, [page]);
  async function loadPage(p) {
    setLoading(true);
    setSelected(new Set());
    try {
      const data = await fetchSessions(PAGE_SIZE, p * PAGE_SIZE);
      setSessions(data);
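      // fetchSessions returns only the page of rows, so estimate the total:
      // a full page implies at least one more page; a short page is the last one.
      setTotal(data.length === PAGE_SIZE ? (p + 2) * PAGE_SIZE : p * PAGE_SIZE + data.length);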
    } catch (err) {
      console.error('[AllChatsView] Failed to load sessions:', err.message);
    } finally {
      setLoading(false);
    }
  }
  function toggleSelect(id) {
    setSelected(prev => {
      const next = new Set(prev);
      next.has(id) ? next.delete(id) : next.add(id);
      return next;
    });
  }
  function toggleSelectAll() {
    if (selected.size === sessions.length) {
      setSelected(new Set());
    } else {
      setSelected(new Set(sessions.map(s => s.external_id)));
    }
  }
  async function handleBulkDelete() {
    setDeleting(true);
    try {
      await Promise.all([...selected].map(id => deleteSession(id)));
      setConfirmOpen(false);
      await loadPage(page);
    } catch (err) {
      console.error('[AllChatsView] Bulk delete failed:', err.message);
    } finally {
      setDeleting(false);
    }
  }
  function formatTimestamp(ts) {
    if (!ts) return '—';
    const date = new Date(ts * 1000);
    const now = new Date();
    const diffMs = now - date;
    const diffMins = Math.floor(diffMs / 60000);
    const diffHours = Math.floor(diffMs / 3600000);
    const diffDays = Math.floor(diffMs / 86400000);
    if (diffMins < 1)   return 'Just now';
    if (diffMins < 60)  return `${diffMins}m ago`;
    if (diffHours < 24) return `${diffHours}h ago`;
    if (diffDays === 1) return 'Yesterday';
    return date.toLocaleDateString([], { month: 'short', day: 'numeric', year: 'numeric' });
  }
  function getProject(projectId) {
    if (!projectId || !projects) return null;
    return projects.find(p => p.id === projectId) ?? null;
  }
  const totalPages = Math.ceil(total / PAGE_SIZE);
  const allSelected = sessions.length > 0 && selected.size === sessions.length;
  return (
    <div className="flex-col flex-1 overflow-hidden" style={{ background: 'var(--bg-base)' }}>
      {/* Header */}
      <div className="panel-header" style={{ padding: '0 8px 0 8px', justifyContent: 'space-between' }}>
        <div style={{ display: 'flex', alignItems: 'center', gap: '4px' }}>
          <button className="btn-icon" onClick={onBack} title="Back" style={{ fontSize: '16px', padding: '4px 8px' }}>←</button>
          <span className="text-base" style={{ fontWeight: 500, color: 'var(--text-secondary)' }}>All Chats</span>
        </div>
        {selected.size > 0 && (
          <button
            onClick={() => setConfirmOpen(true)}
            className="btn-reset text-xs"
            style={{
              padding: '4px 10px',
              borderRadius: 'var(--radius-md)',
              background: '#c0392b22',
              color: '#ff6b6b',
              border: '1px solid #c0392b55',
            }}
          >
            Delete {selected.size} selected
          </button>
        )}
      </div>
      {/* Table */}
      <div className="flex-1 scroll-y" style={{ padding: '16px 24px' }}>
        {loading ? (
          <div className="text-base text-muted" style={{ padding: '40px', textAlign: 'center' }}>
            Loading...
          </div>
        ) : (
          <table style={{ width: '100%', borderCollapse: 'collapse' }}>
            <thead>
              <tr style={{ borderBottom: '1px solid var(--border)' }}>
                <th style={{ width: '36px', padding: '8px 0' }}>
                  <input
                    type="checkbox"
                    checked={allSelected}
                    onChange={toggleSelectAll}
                    style={{ cursor: 'pointer', accentColor: 'var(--accent-hover)' }}
                  />
                </th>
                <th className="label-upper" style={{ textAlign: 'left', padding: '8px 12px' }}>Name</th>
                <th className="label-upper" style={{ textAlign: 'left', padding: '8px 12px', width: '130px' }}>Project</th>
                <th className="label-upper" style={{ textAlign: 'right', padding: '8px 0', width: '110px' }}>Last Active</th>
              </tr>
            </thead>
            <tbody>
              {sessions.map(session => {
                const isSelected = selected.has(session.external_id);
                const project = getProject(session.project_id);
                return (
                  <tr
                    key={session.external_id}
                    style={{
                      borderBottom: '1px solid var(--border)',
                      background: isSelected ? 'var(--bg-elevated)' : 'transparent',
                      transition: 'background 0.1s',
                    }}
                    onMouseEnter={e => { if (!isSelected) e.currentTarget.style.background = 'var(--bg-surface)'; }}
                    onMouseLeave={e => { if (!isSelected) e.currentTarget.style.background = 'transparent'; }}
                  >
                    <td style={{ padding: '10px 0', width: '36px' }}>
                      <input
                        type="checkbox"
                        checked={isSelected}
                        onChange={() => toggleSelect(session.external_id)}
                        style={{ cursor: 'pointer', accentColor: 'var(--accent-hover)' }}
                      />
                    </td>
                    <td style={{ padding: '10px 12px' }}>
                      <button
                        className="btn-reset text-base"
                        onClick={() => onSelectSession(session)}
                        style={{ color: 'var(--text-primary)', textAlign: 'left' }}
                      >
                        {session.name || session.external_id}
                      </button>
                    </td>
                    <td style={{ padding: '10px 12px' }}>
                      {project ? (
                        <div style={{ display: 'flex', alignItems: 'center', gap: '6px' }}>
                          <div style={{
                            width: '6px', height: '6px', borderRadius: '50%', flexShrink: 0,
                            background: project.colour ?? 'var(--accent)',
                          }} />
                          <span className="text-xs text-muted truncate" style={{ maxWidth: '90px' }}>
                            {project.name}
                          </span>
                        </div>
                      ) : (
                        <span className="text-xs text-muted">—</span>
                      )}
                    </td>
                    <td className="text-xs text-muted" style={{ textAlign: 'right', padding: '10px 0' }}>
                      {formatTimestamp(session.updated_at)}
                    </td>
                  </tr>
                );
              })}
              {sessions.length === 0 && (
                <tr>
                  <td colSpan={4} className="text-base text-muted"
                    style={{ textAlign: 'center', padding: '40px' }}>
                    No conversations yet
                  </td>
                </tr>
              )}
            </tbody>
          </table>
        )}
      </div>
      {/* Pagination */}
      {totalPages > 1 && (
        <div className="flex items-center" style={{
          borderTop: '1px solid var(--border)',
          padding: '10px 24px',
          gap: '12px',
          flexShrink: 0,
          justifyContent: 'flex-end',
        }}>
          <span className="text-xs text-muted">
            Page {page + 1} of {totalPages}
          </span>
          <button
            className="btn-icon"
            onClick={() => setPage(p => p - 1)}
            disabled={page === 0}
            style={{ fontSize: '14px' }}
          >‹</button>
          <button
            className="btn-icon"
            onClick={() => setPage(p => p + 1)}
            disabled={(page + 1) * PAGE_SIZE >= total}
            style={{ fontSize: '14px' }}
          >›</button>
        </div>
      )}
      {/* Bulk delete confirmation dialog */}
      {confirmOpen && (
        <div onClick={() => setConfirmOpen(false)} style={{
          position: 'fixed', inset: 0,
          background: 'rgba(0,0,0,0.5)',
          display: 'flex', alignItems: 'center', justifyContent: 'center',
          zIndex: 100,
        }}>
          <div onClick={e => e.stopPropagation()} style={{
            background: 'var(--bg-surface)',
            border: '1px solid var(--border)',
            borderRadius: 'var(--radius-lg)',
            padding: '24px', width: '360px',
            display: 'flex', flexDirection: 'column', gap: '16px',
          }}>
            <h2 style={{ fontSize: '15px', fontWeight: 600, color: 'var(--text-primary)' }}>
              Delete {selected.size} conversation{selected.size !== 1 ? 's' : ''}?
            </h2>
            <p className="text-sm text-secondary">
              This will permanently remove all selected conversations and their messages. This cannot be undone.
            </p>
            <div className="flex" style={{ gap: '8px', justifyContent: 'flex-end' }}>
              <button
                className="btn-reset text-base text-muted"
                onClick={() => setConfirmOpen(false)}
                style={{ padding: '8px 14px', borderRadius: 'var(--radius-md)' }}
              >Cancel</button>
              <button
                className="btn-reset text-base"
                onClick={handleBulkDelete}
                disabled={deleting}
                style={{
                  padding: '8px 16px', borderRadius: 'var(--radius-md)',
                  background: deleting ? 'var(--bg-elevated)' : '#c0392b',
                  color: deleting ? 'var(--text-muted)' : 'white',
                }}
              >{deleting ? 'Deleting...' : 'Delete'}</button>
            </div>
          </div>
        </div>
      )}
    </div>
  );
 }
--- a/packages/chat-client/src/components/AllProjectsView.jsx
+++ b/packages/chat-client/src/components/AllProjectsView.jsx
@@ -0,0 +1,166 @@
 import React, { useState, useEffect } from 'react';
 import ProjectModal from './ProjectModal';
 import { fetchProjects, createProject, updateProject, deleteProject } from '../api/orchestration';
 export default function AllProjectsView({ onProjectsChange, onBack, onSelectProject, onNavigate }) {
  const [projects, setProjects] = useState([]);
  const [loading, setLoading] = useState(true);
  const [modal, setModal] = useState(null); // { mode, project? }
  useEffect(() => { load(); }, []);
  async function load() {
    setLoading(true);
    try {
      setProjects(await fetchProjects());
    } catch (err) {
      console.error('[AllProjectsView] Failed to load:', err.message);
    } finally {
      setLoading(false);
    }
  }
  async function handleSave({ name, description, colour, icon }) {
    try {
      if (modal.mode === 'create') {
        await createProject({ name, description, colour, icon });
      } else {
        await updateProject(modal.project.id, { name, description, colour, icon });
      }
      await load();
      onProjectsChange?.(); // notify the parent that the project list changed
    } catch (err) {
      console.error('[AllProjectsView] Save failed:', err.message);
    }
  }
  async function handleDelete(id) {
    try {
      await deleteProject(id);
      await load();
      onProjectsChange?.(); // notify the parent that the project list changed
    } catch (err) {
      console.error('[AllProjectsView] Delete failed:', err.message);
    }
  }
  return (
    <div className="flex-col flex-1 overflow-hidden" style={{ background: 'var(--bg-base)' }}>
      {/* Header */}
      <div className="panel-header" style={{ padding: '0 8px 0 8px', justifyContent: 'space-between' }}>
        <div style={{ display: 'flex', alignItems: 'center', gap: '4px' }}>
          <button className="btn-icon" onClick={onBack} title="Back" style={{ fontSize: '16px', padding: '4px 8px' }}>←</button>
          <span className="text-base" style={{ fontWeight: 500, color: 'var(--text-secondary)' }}>All Projects</span>
        </div>
        <button
          className="btn-primary"
          onClick={() => setModal({ mode: 'create' })}
          style={{ padding: '5px 12px', fontSize: '12px' }}
        >
          + New Project
        </button>
      </div>
      {/* Tile grid */}
      <div className="flex-1 scroll-y" style={{ padding: '24px' }}>
        {loading ? (
          <div className="text-base text-muted" style={{ textAlign: 'center', padding: '40px' }}>
            Loading...
          </div>
        ) : (
          <div style={{
            display: 'grid',
            gridTemplateColumns: 'repeat(auto-fill, minmax(180px, 1fr))',
            gap: '16px',
          }}>
            {projects.map(project => (
              <ProjectTile
                key={project.id}
                project={project}
                onSelect={() => { onSelectProject(project); onNavigate('project'); }}
                onEdit={() => setModal({ mode: 'edit', project })}
                onDelete={() => setModal({ mode: 'confirm-delete', project })}
              />
            ))}
            {projects.length === 0 && (
              <div className="text-base text-muted" style={{
                gridColumn: '1 / -1', textAlign: 'center', padding: '60px 0',
              }}>
                No projects yet — create one to get started
              </div>
            )}
          </div>
        )}
      </div>
      {modal && (
        <ProjectModal
          project={modal.project}
          mode={modal.mode}
          onSave={handleSave}
          onDelete={handleDelete}
          onClose={() => setModal(null)}
        />
      )}
    </div>
  );
 }
 function ProjectTile({ project, onSelect, onEdit, onDelete }) {
  const [hovered, setHovered] = useState(false);
  return (
    <div
      onClick={onSelect}
      onMouseEnter={() => setHovered(true)}
      onMouseLeave={() => setHovered(false)}
      style={{
        background: 'var(--bg-surface)',
        border: `1px solid ${hovered ? 'var(--accent)' : 'var(--border)'}`,
        borderRadius: 'var(--radius-lg)',
        padding: '16px',
        display: 'flex', flexDirection: 'column', gap: '8px',
        transition: 'border-color 0.15s',
        position: 'relative',
        minHeight: '100px',
        cursor: 'pointer',
      }}
    >
      {/* Colour accent bar */}
      <div style={{
        position: 'absolute', top: 0, left: 0, right: 0,
        height: '3px',
        background: project.colour ?? 'var(--accent)',
        borderRadius: 'var(--radius-lg) var(--radius-lg) 0 0',
      }} />
      <span className="text-base truncate" style={{
        fontWeight: 500, color: 'var(--text-primary)', marginTop: '4px',
      }}>
        {project.name}
      </span>
      {project.description && (
        <span className="text-xs text-muted" style={{
          display: '-webkit-box', WebkitLineClamp: 2,
          WebkitBoxOrient: 'vertical', overflow: 'hidden',
        }}>
          {project.description}
        </span>
      )}
      {/* Action buttons — appear on hover */}
      {hovered && (
        <div className="flex" style={{ gap: '4px', marginTop: 'auto', justifyContent: 'flex-end' }}>
          <button className="btn-icon" onClick={e => { e.stopPropagation(); onEdit(); }}
            title="Edit" style={{ fontSize: '12px' }}>✎</button>
          <button className="btn-icon" onClick={e => { e.stopPropagation(); onDelete(); }}
            title="Delete" style={{ fontSize: '12px', color: '#ff6b6b' }}>✕</button>
        </div>
      )}
    </div>
  );
 }
--- a/packages/chat-client/src/components/ChatWindow.jsx
+++ b/packages/chat-client/src/components/ChatWindow.jsx
@@ -1,7 +1,19 @@
 import React, { useEffect, useRef } from 'react';
 import MessageBubble from './MessageBubble';
-export default function ChatWindow({ messages, loadingHistory, streaming, onSendMessage, onCancel, activeSession }) {
+export default function ChatWindow({
  messages,
  loadingHistory,
  streaming,
  onSendMessage,
  onCancel,
  activeSession,
  onTogglePanel,
  onBack,
  canGoBack,
  loadedModel,
  summarising,
 }) {
  const bottomRef = useRef(null);
  const inputRef = useRef(null);
  const [input, setInput] = React.useState('');
@@ -24,16 +36,75 @@ export default function ChatWindow({ messages, loadingHistory, streaming, onSend
    }
  }
  // Trim .gguf for display
  const modelLabel = loadedModel ? loadedModel.replace(/\.gguf$/, '') : null;
  return (
    <div className="flex-col flex-1 overflow-hidden" style={{ background: 'var(--bg-base)' }}>
      {/* Header */}
-      <div className="panel-header" style={{ padding: '0 20px' }}>
+      <div className="panel-header" style={{ padding: '0 12px 0 8px', justifyContent: 'space-between' }}>
-        <span className="text-base text-secondary">
+        <div style={{ display: 'flex', alignItems: 'center', gap: '4px', minWidth: 0 }}>
-          {activeSession ? ( activeSession.name || activeSession.external_id) : 'No session selected'}
+          {/* Back button */}
          {canGoBack && (
            <button
              className="btn-icon"
              onClick={onBack}
              title="Go back"
              style={{ flexShrink: 0, fontSize: '16px', padding: '4px 8px' }}
            >←</button>
          )}
          {/* Session name */}
          <span className="text-base text-secondary truncate">
            {activeSession ? (activeSession.name || activeSession.external_id) : 'New chat'}
          </span>
        </div>
        <div style={{ display: 'flex', alignItems: 'center', gap: '8px', flexShrink: 0 }}>
          {/* Loaded model pill */}
          {modelLabel && (
            <span style={{
              fontSize: '11px',
              color: 'var(--text-muted)',
              background: 'var(--bg-elevated)',
              border: '1px solid var(--border)',
              borderRadius: '999px',
              padding: '2px 10px',
              maxWidth: '200px',
              overflow: 'hidden',
              textOverflow: 'ellipsis',
              whiteSpace: 'nowrap',
            }}>
              {modelLabel}
            </span>
          )}
          {!modelLabel && (
            <span style={{
              fontSize: '11px',
              color: 'var(--text-muted)',
              fontStyle: 'italic',
            }}>
              No model loaded
            </span>
          )}
          {summarising && (
            <div style={{ display: 'flex', alignItems: 'center', gap: '6px' }}>
              <div style={{
                width: '10px', height: '10px', borderRadius: '50%',
                border: '2px solid var(--accent)',
                borderTopColor: 'transparent',
                animation: 'spin 0.7s linear infinite',
                flexShrink: 0,
              }} />
              <span style={{ fontSize: '11px', color: 'var(--text-muted)', whiteSpace: 'nowrap' }}>
                Summarising…
              </span>
            </div>
          )}
          <button className="btn-icon" onClick={onTogglePanel} title="Session info">⊹</button>
        </div>
      </div>
      {/* Message thread */}
      <div className="flex-1 scroll-y" style={{ padding: '20px 0' }}>
        {!activeSession && (
@@ -43,7 +114,7 @@ export default function ChatWindow({ messages, loadingHistory, streaming, onSend
            gap: '12px',
          }}>
            <div style={{ fontSize: '32px', opacity: 0.4 }}>✦</div>
-            <p className="text-base">Select a session or start a new chat</p>
+            <p className="text-base">Start typing to begin</p>
          </div>
        )}
@@ -79,8 +150,7 @@ export default function ChatWindow({ messages, loadingHistory, streaming, onSend
            value={input}
            onChange={e => setInput(e.target.value)}
            onKeyDown={handleKeyDown}
-            disabled={!activeSession}
-            placeholder={activeSession ? 'Message NexusAI...' : 'Select a session to start chatting'}
+            placeholder="Message NexusAI..."
            rows={1}
            style={{
              flex: 1,
@@ -114,7 +184,7 @@ export default function ChatWindow({ messages, loadingHistory, streaming, onSend
          ) : (
            <button
              onClick={handleSend}
-              disabled={!activeSession || !input.trim()}
+              disabled={!input.trim()}
              className="btn-primary"
              style={{
                width: '32px',
--- a/packages/chat-client/src/components/HomeView.jsx
+++ b/packages/chat-client/src/components/HomeView.jsx
@@ -0,0 +1,149 @@
 import React, { useState } from 'react';
 function getGreeting() {
  const h = new Date().getHours();
  if (h < 12) return 'Morning';
  if (h < 18) return 'Afternoon';
  return 'Evening';
 }
 const QUICK_ACTIONS = [
  { label: 'Summarise something', icon: '◈' },
  { label: 'Help me write', icon: '✦' },
  { label: 'Explain a concept', icon: '◎' },
  { label: 'Debug my code', icon: '</>' },
 ];
 export default function HomeView({ onSendMessage, loadedModel }) {
  const [input, setInput] = useState('');
  function handleSend() {
    const text = input.trim();
    if (!text) return;
    setInput('');
    onSendMessage(text);
  }
  function handleKeyDown(e) {
    if (e.key === 'Enter' && !e.shiftKey) {
      e.preventDefault();
      handleSend();
    }
  }
  // Trim the .gguf extension for display
  const modelLabel = loadedModel ? loadedModel.replace(/\.gguf$/, '') : null;
  return (
    <div className="flex-col flex-1 overflow-hidden" style={{
      background: 'var(--bg-base)',
      alignItems: 'center',
      justifyContent: 'center',
      gap: '32px',
    }}>
      {/* Greeting */}
      <div style={{ textAlign: 'center' }}>
        <h1 style={{
          fontSize: '32px',
          fontWeight: 600,
          color: 'var(--text-primary)',
          letterSpacing: '-0.5px',
          marginBottom: '8px',
        }}>
          {getGreeting()}, Tim
        </h1>
        <p className="text-sm text-muted">
          {modelLabel ? `Running ${modelLabel}` : 'No model loaded'}
        </p>
      </div>
      {/* Input */}
      <div style={{ width: '100%', maxWidth: '580px', padding: '0 24px' }}>
        <div style={{
          background: 'var(--bg-elevated)',
          border: '1px solid var(--border)',
          borderRadius: 'var(--radius-lg)',
          padding: '12px 14px',
        }}>
          <textarea
            value={input}
            onChange={e => setInput(e.target.value)}
            onKeyDown={handleKeyDown}
            placeholder="How can I help you today?"
            rows={1}
            autoFocus
            style={{
              width: '100%',
              background: 'transparent',
              border: 'none',
              outline: 'none',
              color: 'var(--text-primary)',
              fontSize: '14px',
              lineHeight: '1.6',
              resize: 'none',
              fontFamily: 'inherit',
              maxHeight: '120px',
              overflowY: 'auto',
            }}
            onInput={e => {
              e.target.style.height = 'auto';
              e.target.style.height = `${e.target.scrollHeight}px`;
            }}
          />
          <div style={{ display: 'flex', justifyContent: 'flex-end', marginTop: '8px' }}>
            <button
              onClick={handleSend}
              disabled={!input.trim()}
              className="btn-primary"
              style={{
                width: '32px', height: '32px',
                fontSize: '16px',
                border: '1px solid var(--border)',
              }}
            >↑</button>
          </div>
        </div>
        <p className="text-xs text-muted" style={{ textAlign: 'center', marginTop: '8px' }}>
          Enter to send · Shift+Enter for new line
        </p>
      </div>
      {/* Quick action pills — populate input, don't auto-send */}
      <div style={{
        display: 'flex', gap: '8px',
        flexWrap: 'wrap', justifyContent: 'center',
        padding: '0 24px',
      }}>
        {QUICK_ACTIONS.map(({ label, icon }) => (
          <button
            key={label}
            onClick={() => setInput(label)}
            style={{
              display: 'flex', alignItems: 'center', gap: '6px',
              padding: '7px 14px',
              background: 'var(--bg-surface)',
              border: '1px solid var(--border)',
              borderRadius: '999px',
              color: 'var(--text-secondary)',
              fontSize: '13px',
              cursor: 'pointer',
              transition: 'border-color 0.15s, color 0.15s',
            }}
            onMouseEnter={e => {
              e.currentTarget.style.borderColor = 'var(--accent)';
              e.currentTarget.style.color = 'var(--text-primary)';
            }}
            onMouseLeave={e => {
              e.currentTarget.style.borderColor = 'var(--border)';
              e.currentTarget.style.color = 'var(--text-secondary)';
            }}
          >
            <span style={{ fontSize: '11px', opacity: 0.7 }}>{icon}</span>
            {label}
          </button>
        ))}
      </div>
    </div>
  );
 }
--- a/packages/chat-client/src/components/InfoPanel.jsx
+++ b/packages/chat-client/src/components/InfoPanel.jsx
@@ -1,14 +1,29 @@
 import React from 'react';
-export default function InfoPanel({ isOpen, onToggle, activeSession, lastModel, lastTokenCount, selectedModel, onModelChange, models }) {
+export default function InfoPanel({ 
  isOpen, 
  onToggle, 
  activeSession, 
  lastModel, 
  lastTokenCount, 
  selectedModel, 
  onModelChange, 
  models, 
  summarising,
  onViewSummary,
 }) {
  return (
    <div className="flex-col" style={{
-      width: isOpen ? 'var(--panel-width)' : '56px',
+  position: 'fixed',
-      flexShrink: 0,
+  top: 0,
  right: 0,
  height: '100vh',
  width: 'var(--panel-width)',
  background: 'var(--bg-surface)',
  borderLeft: '1px solid var(--border)',
-      transition: 'width 0.2s ease',
+  transform: isOpen ? 'translateX(0)' : 'translateX(100%)',
-      overflow: 'hidden',
+  transition: 'transform 0.2s ease',
  zIndex: 20,
 }}>
      {/* Header */}
@@ -70,13 +85,37 @@ export default function InfoPanel({ isOpen, onToggle, activeSession, lastModel,
            )}
          </Section>
-        </div>
+          {/* Session Memory button */}
          {activeSession && !activeSession.isNew && (
          <button
            onClick={onViewSummary}
            className="btn-reset text-sm"
            style={{
              marginTop: '8px', width: '100%', padding: '7px 10px',
              borderRadius: 'var(--radius-md)',
              background: 'var(--bg-elevated)',
              border: '1px solid var(--border)',
              color: 'var(--text-secondary)',
              display: 'flex', alignItems: 'center', gap: '8px',
            }}
            onMouseEnter={e => e.currentTarget.style.borderColor = 'var(--accent-hover)'}
            onMouseLeave={e => e.currentTarget.style.borderColor = 'var(--border)'}
          >
            <span>◈</span>
            <span>Session Memory</span>
            {summarising && (
              <div style={{
                marginLeft: 'auto',
                width: '8px', height: '8px', borderRadius: '50%',
                border: '2px solid var(--accent-hover)',
                borderTopColor: 'transparent',
                animation: 'spin 0.7s linear infinite',
                flexShrink: 0,
              }} />
            )}
          </button>
        )}
      {!isOpen && (
        <div className="flex-col items-center" style={{ flex: 1, paddingTop: '16px', gap: '16px' }}>
          <IconHint title="Model">M</IconHint>
          <IconHint title="Session">S</IconHint>
        </div>
      )}
    </div>
--- a/packages/chat-client/src/components/MemoryView.jsx
+++ b/packages/chat-client/src/components/MemoryView.jsx
@@ -0,0 +1,194 @@
 import React, { useState, useEffect, useCallback } from 'react';
 import { getEpisodes, deleteEpisode } from '../api/orchestration';
 import ReactMarkdown from 'react-markdown';
 const PAGE_SIZE = 20;
 export default function MemoryView({ onNavigate, onBack }) {
  const [episodes, setEpisodes]   = useState([]);
  const [total, setTotal]         = useState(0);
  const [offset, setOffset]       = useState(0);
  const [search, setSearch]       = useState('');
  const [query, setQuery]         = useState('');   // committed search term
  const [expanded, setExpanded]   = useState(null); // episode id
  const [loading, setLoading]     = useState(false);
  const [error, setError]         = useState(null);
  const load = useCallback(async () => {
    setLoading(true);
    setError(null);
    try {
      const data = await getEpisodes({ limit: PAGE_SIZE, offset, q: query || undefined });
      setEpisodes(data.episodes);
      setTotal(data.total);
    } catch (err) {
      setError(err.message);
    } finally {
      setLoading(false);
    }
  }, [offset, query]);
  useEffect(() => { load(); }, [load]);
  function handleSearch(e) {
    e.preventDefault();
    setOffset(0);       // reset to page 1 on new search
    setQuery(search);
  }
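  async function handleDelete(id) {
    if (!window.confirm('Delete this memory? This cannot be undone.')) return;
    try {
      await deleteEpisode(id);
      await load();
    } catch (err) {
      // Surface the failure in the list instead of leaving an unhandled rejection
      setError(err.message);
    }
  }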
  const totalPages = Math.ceil(total / PAGE_SIZE);
  const currentPage = Math.floor(offset / PAGE_SIZE) + 1;
  return (
    <div style={{ display: 'flex', flexDirection: 'column', flex: 1, overflow: 'hidden', background: 'var(--bg-base)' }}>
      {/* Header */}
      <div className="panel-header" style={{ padding: '0 24px', gap: 12 }}>
        <button className="btn-icon" onClick={onBack} title="Back">
          ←
        </button>
        <span className="text-base" style={{ fontWeight: 500 }}>Memory Viewer</span>
        <span className="text-sm text-muted" style={{ marginLeft: 'auto' }}>
          {total} episode{total !== 1 ? 's' : ''}
        </span>
      </div>
      {/* Search bar */}
      <form onSubmit={handleSearch} style={{ padding: '12px 24px', borderBottom: '1px solid var(--border)' }}>
        <div style={{ display: 'flex', gap: 8 }}>
          <input
            className="text-sm"
            value={search}
            onChange={e => setSearch(e.target.value)}
            placeholder="Search memories…"
            style={{
              flex: 1, padding: '8px 12px',
              background: 'var(--bg-surface)', border: '1px solid var(--border)',
              borderRadius: 'var(--radius)', color: 'var(--text-primary)',
            }}
          />
          <button type="submit" className="btn-primary" style={{ padding: '8px 16px' }}>
            Search
          </button>
          {query && (
            <button type="button" className="btn-icon" onClick={() => { setSearch(''); setQuery(''); setOffset(0); }}>
              ✕
            </button>
          )}
        </div>
      </form>
      {/* Episode list */}
      <div className="scroll-y flex-1" style={{ padding: '16px 24px' }}>
        {loading && <p className="text-sm text-muted">Loading…</p>}
        {error  && <p className="text-sm" style={{ color: 'var(--error, #e05)' }}>{error}</p>}
        {!loading && episodes.length === 0 && (
          <p className="text-sm text-muted">No memories found.</p>
        )}
        {episodes.map(ep => (
          <EpisodeCard
            key={ep.id}
            episode={ep}
            expanded={expanded === ep.id}
            onToggle={() => setExpanded(expanded === ep.id ? null : ep.id)}
            onDelete={() => handleDelete(ep.id)}
          />
        ))}
      </div>
      {/* Pagination */}
      {totalPages > 1 && (
        <div style={{
          display: 'flex', alignItems: 'center', justifyContent: 'center',
          gap: 12, padding: '12px', borderTop: '1px solid var(--border)',
        }}>
          <button className="btn-icon" disabled={offset === 0}
            onClick={() => setOffset(o => Math.max(0, o - PAGE_SIZE))}>←</button>
          <span className="text-sm text-muted">{currentPage} / {totalPages}</span>
          <button className="btn-icon" disabled={currentPage >= totalPages}
            onClick={() => setOffset(o => o + PAGE_SIZE)}>→</button>
        </div>
      )}
    </div>
  );
 }
 function stripMarkdown(text) {
  return text
    .replace(/\*\*(.*?)\*\*/g, '$1')  // bold
    .replace(/\*(.*?)\*/g, '$1')       // italic
    .replace(/`([^`]+)`/g, '$1')       // inline code
    .replace(/^#{1,6}\s+/gm, '')       // headings
    .replace(/^\s*[-*+]\s+/gm, '')     // list markers
    .trim();
 }
 function EpisodeCard({ episode, expanded, onToggle, onDelete }) {
  const date = new Date(episode.created_at * 1000).toLocaleString();
   const stripped = stripMarkdown(episode.user_message); // measure the stripped text, not the raw markdown
   const preview = stripped.slice(0, 80) + (stripped.length > 80 ? '…' : '');
  return (
    <div style={{
      background: 'var(--bg-surface)', border: '1px solid var(--border)',
      borderRadius: 'var(--radius-lg)', marginBottom: 8, overflow: 'hidden',
    }}>
      {/* Card header — always visible */}
      <div style={{ display: 'flex', alignItems: 'center', gap: 8, padding: '10px 14px', cursor: 'pointer' }}
        onClick={onToggle}>
        <span style={{ flex: 1, fontSize: 13, color: 'var(--text-primary)' }}>{preview}</span>
        <span className="text-sm text-muted">{date}</span>
        <span className="text-muted" style={{ fontSize: 11 }}>#{episode.id}</span>
        <button className="btn-icon" style={{ color: 'var(--error, #e05)', fontSize: 14 }}
          onClick={e => { e.stopPropagation(); onDelete(); }} title="Delete">🗑</button>
        <span className="text-muted" style={{ fontSize: 11 }}>{expanded ? '▲' : '▼'}</span>
      </div>
      {/* Expanded content */}
      {expanded && (
        <div style={{ padding: '0 14px 14px', borderTop: '1px solid var(--border)' }}>
          <MessageBlock label="You" content={episode.user_message} color="var(--accent)" />
          <MessageBlock label="NexusAI" content={episode.ai_response} color="var(--text-secondary)" />
          {episode.token_count > 0 && (
            <p className="text-sm text-muted" style={{ marginTop: 8 }}>
              Tokens: {episode.token_count}
            </p>
          )}
        </div>
      )}
    </div>
  );
 }
 function MessageBlock({ label, content, color }) {
  const isAI = label === 'NexusAI';
  return (
    <div style={{ marginTop: 12 }}>
      <p style={{ fontSize: 11, fontWeight: 600, color, marginBottom: 4, textTransform: 'uppercase', letterSpacing: '0.05em' }}>
        {label}
      </p>
       <ReactMarkdown
         components={{
           p:      ({children}) => <p style={{ margin: '0 0 8px', lineHeight: 1.6, fontSize: 13 }}>{children}</p>,
           ul:     ({children}) => <ul style={{ margin: '0 0 8px', paddingLeft: '20px' }}>{children}</ul>,
           ol:     ({children}) => <ol style={{ margin: '0 0 8px', paddingLeft: '20px' }}>{children}</ol>,
           li:     ({children}) => <li style={{ marginBottom: '2px', fontSize: 13 }}>{children}</li>,
           code:   ({inline, children}) => inline
             ? <code style={{ background: 'var(--bg-elevated)', padding: '1px 5px', borderRadius: 'var(--radius-sm)', fontSize: 12, fontFamily: 'monospace' }}>{children}</code>
             : <pre style={{ background: 'var(--bg-elevated)', padding: '10px 12px', borderRadius: 'var(--radius-md)', overflowX: 'auto', fontSize: 12, fontFamily: 'monospace' }}><code>{children}</code></pre>,
           strong: ({children}) => <strong style={{ fontWeight: 600, color: 'var(--text-primary)' }}>{children}</strong>,
         }}
       >{content}</ReactMarkdown>
    </div>
  );
 }
--- a/packages/chat-client/src/components/MessageBubble.jsx
+++ b/packages/chat-client/src/components/MessageBubble.jsx
@@ -1,4 +1,5 @@
 import React from 'react';
+import ReactMarkdown from 'react-markdown';
 export default function MessageBubble({ message }) {
  const isUser = message.role === 'user';
@@ -24,17 +25,29 @@ export default function MessageBubble({ message }) {
      <div style={{
        maxWidth: '70%',
-        padding: '10px 14px',
+        padding: '14px 14px',
-        borderRadius: isUser ? '18px 18px 4px 18px' : '18px 18px 18px 4px',
+        borderRadius: isUser ? '18px 4px 4px 18px' : '4px 18px 18px 4px',
        background: isUser ? 'var(--bubble-user)' : 'var(--bubble-ai)',
        color: 'var(--text-primary)',
-        fontSize: '14px',
+        fontSize: '18px',
-        lineHeight: '1.6',
+        lineHeight: '1.8',
-        border: isUser ? 'none' : '1px solid var(--border)',
+        border: isUser ? 'none' : '2px solid var(--border)',
        whiteSpace: 'pre-wrap',
        wordBreak: 'break-word',
      }}>
-        {message.text}
+        <ReactMarkdown
+          components={{
+            // Tighten up default spacing so it fits the bubble style
+            p: ({ children }) => <p style={{ margin: '0 0 8px', lineHeight: 1.6 }}>{children}</p>,
+            ul: ({ children }) => <ul style={{ margin: '0 0 8px', paddingLeft: '20px' }}>{children}</ul>,
+            ol: ({ children }) => <ol style={{ margin: '0 0 8px', paddingLeft: '20px' }}>{children}</ol>,
+            li: ({ children }) => <li style={{ marginBottom: '2px' }}>{children}</li>,
+            code: ({ inline, children }) => inline
+              ? <code style={{ background: 'var(--bg-elevated)', padding: '1px 5px', borderRadius: 'var(--radius-sm)', fontSize: '12px', fontFamily: 'monospace' }}>{children}</code>
+              : <pre style={{ background: 'var(--bg-elevated)', padding: '10px 12px', borderRadius: 'var(--radius-md)', overflowX: 'auto', fontSize: '12px', fontFamily: 'monospace' }}><code>{children}</code></pre>,
+            strong: ({ children }) => <strong style={{ fontWeight: 600, color: 'var(--text-primary)' }}>{children}</strong>,
+          }}
+        >{message.text}</ReactMarkdown>
        {message.streaming && (
          <span style={{
            display: 'inline-block',
@@ -47,7 +60,7 @@ export default function MessageBubble({ message }) {
          }} />
        )}
        {message.error && (
-          <div className="text-xs" style={{ marginTop: '6px', color: '#ff6b6b' }}>
+          <div className="text-xs" style={{ marginTop: '6px', color: 'var(--warning)' }}>
            ⚠ Failed to complete response
          </div>
        )}
--- a/packages/chat-client/src/components/ProjectModal.jsx
+++ b/packages/chat-client/src/components/ProjectModal.jsx
@@ -0,0 +1,178 @@
 import React, { useState, useEffect, useRef } from 'react';
 const COLOURS = ['#3d3a79', '#2d6a4f', '#7b2d8b', '#c0392b', '#d4800a', '#1a6b8a'];
 export default function ProjectModal({ project, mode, onSave, onDelete, onClose }) {
  const [name, setName] = useState(project?.name ?? '');
  const [description, setDescription] = useState(project?.description ?? '');
  const [colour, setColour] = useState(project?.colour ?? COLOURS[0]);
  const [systemPrompt, setSystemPrompt] = useState(project?.system_prompt ?? '');
  const inputRef = useRef(null);
  useEffect(() => {
    if (mode !== 'confirm-delete') inputRef.current?.focus();
  }, [mode]);
  function handleSubmit() {
    const trimmed = name.trim();
    if (!trimmed) return;
    onSave({
      name: trimmed,
      description: description.trim() || null,
      colour,
      icon: null,
      isolated: 1,
      system_prompt: systemPrompt.trim() || null,
    });
    onClose();
  }
  function handleKeyDown(e) {
    if (e.key === 'Escape') onClose();
    // Don't submit on Enter — textarea fields make Enter ambiguous
  }
  return (
    <div onClick={onClose} style={{
      position: 'fixed', inset: 0,
      background: 'rgba(0,0,0,0.5)',
      display: 'flex', alignItems: 'center', justifyContent: 'center',
      zIndex: 100,
    }}>
      <div onClick={e => e.stopPropagation()} onKeyDown={handleKeyDown} style={{
        background: 'var(--bg-surface)',
        border: '1px solid var(--border)',
        borderRadius: 'var(--radius-lg)',
        padding: '24px', width: '420px',
        maxHeight: '90vh', overflowY: 'auto',
        display: 'flex', flexDirection: 'column', gap: '16px',
      }}>
        {mode === 'confirm-delete' ? (
          <>
            <h2 style={{ fontSize: '15px', fontWeight: 600, color: 'var(--text-primary)' }}>
              Delete project?
            </h2>
            <p className="text-sm text-secondary">
              Are you sure you want to delete{' '}
              <span style={{ color: 'var(--text-primary)', fontWeight: 500 }}>{project.name}</span>?
              Sessions in this project will not be deleted.
            </p>
            <div className="flex" style={{ gap: '8px', justifyContent: 'flex-end' }}>
              <button className="btn-reset text-base text-muted"
                onClick={onClose}
                style={{ padding: '8px 14px', borderRadius: 'var(--radius-md)' }}>
                Cancel
              </button>
              <button className="btn-reset text-base"
                onClick={() => { onDelete(project.id); onClose(); }}
                style={{ padding: '8px 16px', borderRadius: 'var(--radius-md)', background: '#c0392b', color: 'white' }}>
                Delete
              </button>
            </div>
          </>
        ) : (
          <>
            <h2 style={{ fontSize: '15px', fontWeight: 600, color: 'var(--text-primary)' }}>
              {mode === 'create' ? 'New Project' : 'Edit Project'}
            </h2>
            {/* Name */}
            <div className="flex-col" style={{ gap: '6px' }}>
              <label className="label-upper">Name</label>
              <input
                ref={inputRef}
                value={name}
                onChange={e => setName(e.target.value)}
                placeholder="Project name..."
                style={{
                  background: 'var(--bg-elevated)', border: '1px solid var(--border)',
                  borderRadius: 'var(--radius-md)', padding: '8px 12px',
                  color: 'var(--text-primary)', fontSize: '14px', outline: 'none', width: '100%',
                }}
              />
            </div>
            {/* Description */}
            <div className="flex-col" style={{ gap: '6px' }}>
              <label className="label-upper">Description <span style={{ opacity: 0.5 }}>(optional)</span></label>
              <textarea
                value={description}
                onChange={e => setDescription(e.target.value)}
                placeholder="What's this project about..."
                rows={2}
                style={{
                  background: 'var(--bg-elevated)', border: '1px solid var(--border)',
                  borderRadius: 'var(--radius-md)', padding: '8px 12px',
                  color: 'var(--text-primary)', fontSize: '14px', outline: 'none',
                  width: '100%', resize: 'none', fontFamily: 'inherit',
                }}
              />
            </div>
            {/* Colour picker */}
            <div className="flex-col" style={{ gap: '6px' }}>
              <label className="label-upper">Colour</label>
              <div className="flex" style={{ gap: '8px' }}>
                {COLOURS.map(c => (
                  <button
                    key={c}
                    onClick={() => setColour(c)}
                    className="btn-reset"
                    style={{
                      width: '24px', height: '24px',
                      borderRadius: '50%',
                      background: c,
                      border: colour === c ? '2px solid var(--text-primary)' : '2px solid transparent',
                      outline: colour === c ? '2px solid var(--accent-hover)' : 'none',
                      outlineOffset: '2px',
                    }}
                  />
                ))}
              </div>
            </div>
            {/* System Prompt */}
            <div className="flex-col" style={{ gap: '6px' }}>
              <label className="label-upper">
                System Prompt <span style={{ opacity: 0.5 }}>(optional)</span>
              </label>
              <p className="text-xs text-muted" style={{ marginTop: '-2px' }}>
                Overrides the global system prompt for conversations in this project.
                Leave blank to use the global default.
              </p>
              <textarea
                value={systemPrompt}
                onChange={e => setSystemPrompt(e.target.value)}
                placeholder="You are a helpful assistant specialised in..."
                rows={4}
                style={{
                  background: 'var(--bg-elevated)', border: '1px solid var(--border)',
                  borderRadius: 'var(--radius-md)', padding: '8px 12px',
                  color: 'var(--text-primary)', fontSize: '13px', outline: 'none',
                  width: '100%', resize: 'vertical', fontFamily: 'inherit',
                  lineHeight: '1.6',
                }}
                onFocus={e => e.target.style.borderColor = 'var(--accent)'}
                onBlur={e => e.target.style.borderColor = 'var(--border)'}
              />
            </div>
            <div className="flex" style={{ gap: '8px', justifyContent: 'flex-end' }}>
              <button className="btn-reset text-base text-muted"
                onClick={onClose}
                style={{ padding: '8px 14px', borderRadius: 'var(--radius-md)' }}>
                Cancel
              </button>
              <button className="btn-primary"
                onClick={handleSubmit}
                disabled={!name.trim()}
                style={{ padding: '8px 16px' }}>
                {mode === 'create' ? 'Create' : 'Save'}
              </button>
            </div>
          </>
        )}
      </div>
    </div>
  );
 }
--- a/packages/chat-client/src/components/ProjectView.jsx
+++ b/packages/chat-client/src/components/ProjectView.jsx
@@ -0,0 +1,440 @@
 import React, { useState, useEffect } from 'react';
 import { fetchSessions, updateProject, deleteProject, generateProjectSummary, fetchProjectOverviewSummary } from '../api/orchestration';
 import ProjectModal from './ProjectModal';
 export default function ProjectView({ project, onNavigate, onBack, onSelectSession, onNewProjectChat, onProjectsChange }) {
  const [sessions, setSessions] = useState([]);
  const [loading, setLoading] = useState(true);
  const [input, setInput] = useState('');
  const [menuOpen, setMenuOpen] = useState(false);
  const [modal, setModal] = useState(null);
  const [overview, setOverview] = useState(null);
  const [overviewLoading, setOverviewLoading] = useState(true);
  const [generating, setGenerating] = useState(false);
  const [generateError, setGenerateError] = useState(null);
  useEffect(() => { load(); }, [project.id]);
  useEffect(() => {
    async function loadOverview() {
      setOverviewLoading(true);
      try {
        setOverview(await fetchProjectOverviewSummary(project.id));
      } catch (err) {
        console.error('[ProjectView] Failed to load overview:', err.message);
      } finally {
        setOverviewLoading(false);
      }
    }
    loadOverview();
  }, [project.id]);
  async function load() {
    setLoading(true);
    try {
      setSessions(await fetchSessions(50, 0, project.id));
    } catch (err) {
      console.error('[ProjectView] Failed to load sessions:', err.message);
    } finally {
      setLoading(false);
    }
  }
  function handleSend() {
    const text = input.trim();
    if (!text) return;
    setInput('');
    onNewProjectChat(text);
  }
  function handleKeyDown(e) {
    if (e.key === 'Enter' && !e.shiftKey) {
      e.preventDefault();
      handleSend();
    }
  }
  async function handleSave({ name, description, colour, icon, isolated, system_prompt }) {
    try {
      await updateProject(project.id, { name, description, colour, icon, isolated, system_prompt });
      onProjectsChange?.();
      setModal(null);
    } catch (err) {
      console.error('[ProjectView] Update failed:', err.message);
    }
  }
  async function handleDelete() {
    try {
      await deleteProject(project.id);
      onProjectsChange?.();
      onBack();
    } catch (err) {
      console.error('[ProjectView] Delete failed:', err.message);
    }
  }
  function formatTimestamp(ts) {
    if (!ts) return '—';
    const date = new Date(ts * 1000);
    const now = new Date();
    const diffMs = now - date;
    const diffMins = Math.floor(diffMs / 60000);
    const diffHours = Math.floor(diffMs / 3600000);
    const diffDays = Math.floor(diffMs / 86400000);
    if (diffMins < 1)   return 'Just now';
    if (diffMins < 60)  return `${diffMins}m ago`;
    if (diffHours < 24) return `${diffHours}h ago`;
    if (diffDays === 1) return 'Yesterday';
    return date.toLocaleDateString([], { month: 'short', day: 'numeric', year: 'numeric' });
  }
  async function handleGenerateSummary() {
    setGenerating(true);
    setGenerateError(null);
    try {
       setOverview(await generateProjectSummary(project.id));
     } catch (err) {
       // 422 means no session summaries exist yet; surface a friendly message
       setGenerateError(
         err.message.includes('422')
           ? 'No conversations found in this project yet.'
           : 'Failed to generate summary. Please try again.'
       );
     } finally {
       setGenerating(false);
     }
   }
  return (
    <div className="flex-col flex-1 overflow-hidden" style={{ background: 'var(--bg-base)' }}>
      {/* Colour accent bar */}
      <div style={{ height: '3px', flexShrink: 0, background: project.colour ?? 'var(--accent)' }} />
      {/* Header */}
      <div className="panel-header" style={{ padding: '0 24px', justifyContent: 'space-between' }}>
        <button
          className="btn-reset text-xs text-muted"
          onClick={onBack}
          style={{ display: 'flex', alignItems: 'center', gap: '4px' }}
          onMouseEnter={e => e.currentTarget.style.color = 'var(--text-secondary)'}
          onMouseLeave={e => e.currentTarget.style.color = 'var(--text-muted)'}
        >
          ← All Projects
        </button>
        <div style={{ position: 'relative' }}>
          <button
            className="btn-icon"
            onClick={() => setMenuOpen(o => !o)}
            title="Project options"
            style={{ fontSize: '18px', letterSpacing: '1px' }}
          >⋮</button>
          {menuOpen && (
            <>
              <div style={{ position: 'fixed', inset: 0, zIndex: 40 }} onClick={() => setMenuOpen(false)} />
              <div style={{
                position: 'absolute', top: '100%', right: 0,
                background: 'var(--bg-elevated)',
                border: '1px solid var(--border)',
                borderRadius: 'var(--radius-md)',
                padding: '4px', zIndex: 50, minWidth: '150px',
              }}>
                <MenuButton onClick={() => { setMenuOpen(false); setModal({ mode: 'edit' }); }}>
                  ✎ Edit details
                </MenuButton>
                <MenuButton danger onClick={() => { setMenuOpen(false); setModal({ mode: 'confirm-delete' }); }}>
                  ✕ Delete project
                </MenuButton>
              </div>
            </>
          )}
        </div>
      </div>
      {/* Scrollable content */}
      <div className="flex-1 scroll-y" style={{ padding: '32px 24px' }}>
        {/* Project title + description */}
        <div style={{ marginBottom: '32px' }}>
          <h1 style={{ fontSize: '22px', fontWeight: 600, color: 'var(--text-primary)', marginBottom: '8px' }}>
            {project.name}
          </h1>
          {project.description && (
            <p className="text-sm" style={{ color: 'var(--text-secondary)', maxWidth: '560px', lineHeight: 1.6 }}>
              {project.description}
            </p>
          )}
        </div>
        {/* ── Conversations ── */}
        <div style={{ marginBottom: '40px' }}>
          <p className="label-upper" style={{ marginBottom: '12px' }}>Conversations</p>
          {loading ? (
            <div className="text-sm text-muted">Loading...</div>
          ) : sessions.length === 0 ? (
            <div style={{ display: 'flex', flexDirection: 'column', alignItems: 'center', gap: '16px', padding: '32px 0' }}>
              <p className="text-sm text-muted">No conversations yet — start one below</p>
              <ChatInput
                value={input}
                onChange={setInput}
                onSend={handleSend}
                placeholder={`Start a conversation in ${project.name}…`}
                autoFocus
              />
            </div>
          ) : (
            <>
              <div style={{ display: 'flex', flexDirection: 'column', marginBottom: '16px' }}>
                {sessions.map((session, i) => (
                  <button
                    key={session.external_id}
                    className="btn-reset"
                    onClick={() => { onSelectSession(session); onNavigate('chat'); }}
                    style={{
                      padding: '12px 16px',
                      display: 'flex', alignItems: 'center', justifyContent: 'space-between',
                      borderBottom: i < sessions.length - 1 ? '1px solid var(--border)' : 'none',
                      borderRadius: i === 0
                        ? 'var(--radius-md) var(--radius-md) 0 0'
                        : i === sessions.length - 1
                        ? '0 0 var(--radius-md) var(--radius-md)'
                        : '0',
                      background: 'var(--bg-surface)',
                      textAlign: 'left',
                    }}
                    onMouseEnter={e => e.currentTarget.style.background = 'var(--bg-elevated)'}
                    onMouseLeave={e => e.currentTarget.style.background = 'var(--bg-surface)'}
                  >
                    <span className="text-base" style={{ color: 'var(--text-primary)' }}>
                      {session.name || session.external_id}
                    </span>
                    <span className="text-xs text-muted" style={{ flexShrink: 0, marginLeft: '16px' }}>
                      {formatTimestamp(session.updated_at)}
                    </span>
                  </button>
                ))}
              </div>
              <ChatInput
                value={input}
                onChange={setInput}
                onSend={handleSend}
                placeholder={`New conversation in ${project.name}…`}
              />
            </>
          )}
        </div>
        {/* ── Project Memory ── */}
        <div style={{ marginBottom: '40px' }}>
            <div style={{ display: 'flex', alignItems: 'center', justifyContent: 'space-between', marginBottom: '12px' }}>
                <p className="label-upper">Project Memory</p>
                <button
                    className="btn-primary"
                    style={{ padding: '5px 12px', fontSize: '12px', display: 'flex', alignItems: 'center', gap: '6px' }}
                    onClick={handleGenerateSummary}
                    disabled={generating}
                >
                    {generating
                        ? <><span className="spinner" />Generating…</>
                        : overview ? 'Regenerate' : 'Generate Summary'
                    }
                </button>
            </div>
            <div style={{
                background: 'var(--bg-surface)',
                border: '1px solid var(--border)',
                borderRadius: 'var(--radius-lg)',
                padding: '20px',
            }}>
                {overviewLoading ? (
                    <p className="text-sm text-muted">Loading…</p>
                ) : generateError ? (
                    <p className="text-sm" style={{ color: 'var(--text-muted)', fontStyle: 'italic' }}>
                        {generateError}
                    </p>
                ) : overview ? (
                    <>
                        <p className="text-sm" style={{ color: 'var(--text-secondary)', lineHeight: 1.7, whiteSpace: 'pre-wrap' }}>
                            {overview.content}
                        </p>
                        <p className="text-xs text-muted" style={{ marginTop: '12px' }}>
                            Last generated {formatTimestamp(overview.created_at)}
                        </p>
                    </>
                ) : (
                    // No overview exists yet — explain what this section is for
                    <div style={{ display: 'flex', flexDirection: 'column', gap: '10px' }}>
                        <div style={{ display: 'flex', alignItems: 'center', gap: '10px' }}>
                            <span style={{ fontSize: '20px', opacity: 0.4 }}>◈</span>
                            <span className="text-sm" style={{ fontWeight: 500, color: 'var(--text-primary)' }}>
                                No project summary yet
                            </span>
                        </div>
                        <p className="text-sm text-muted" style={{ lineHeight: 1.6, maxWidth: '520px' }}>
                            Generate a summary to create a concise overview of this project's goals,
                            progress, and key decisions — built from your session summaries.
                        </p>
                    </div>
                )}
            </div>
        </div>
        {/* ── Notes ── */}
        <NotesSection projectId={project.id} initialNotes={project.notes ?? ''} />
      </div>
      {/* Modal */}
      {modal && (
        <ProjectModal
          project={project}
          mode={modal.mode}
          onSave={handleSave}
          onDelete={handleDelete}
          onClose={() => setModal(null)}
        />
      )}
    </div>
  );
 }
 // ── Sub-components ─────────────────────────────────────────
 function ChatInput({ value, onChange, onSend, placeholder, autoFocus }) {
  function handleKeyDown(e) {
    if (e.key === 'Enter' && !e.shiftKey) {
      e.preventDefault();
      onSend();
    }
  }
  return (
    <div style={{ width: '100%', maxWidth: '520px' }}>
      <div style={{
        background: 'var(--bg-elevated)',
        border: '1px solid var(--border)',
        borderRadius: 'var(--radius-lg)',
        padding: '12px 14px',
      }}>
        <textarea
          value={value}
          onChange={e => onChange(e.target.value)}
          onKeyDown={handleKeyDown}
          placeholder={placeholder}
          rows={1}
          autoFocus={autoFocus}
          style={{
            width: '100%', background: 'transparent',
            border: 'none', outline: 'none',
            color: 'var(--text-primary)', fontSize: '14px',
            lineHeight: '1.6', resize: 'none', fontFamily: 'inherit',
            maxHeight: '120px', overflowY: 'auto',
          }}
          onInput={e => {
            e.target.style.height = 'auto';
            e.target.style.height = `${e.target.scrollHeight}px`;
          }}
        />
        <div style={{ display: 'flex', justifyContent: 'flex-end', marginTop: '8px' }}>
          <button
            onClick={onSend}
            disabled={!value.trim()}
            className="btn-primary"
            style={{ width: '32px', height: '32px', fontSize: '16px', border: '1px solid var(--border)' }}
          >↑</button>
        </div>
      </div>
      <p className="text-xs text-muted" style={{ textAlign: 'center', marginTop: '8px' }}>
        Enter to send · Shift+Enter for new line
      </p>
    </div>
  );
 }
 function NotesSection({ projectId, initialNotes }) {
  const [notes, setNotes] = useState(initialNotes);
  const [savedNotes, setSavedNotes] = useState(initialNotes);
  const [saving, setSaving] = useState(false);
  const isDirty = notes !== savedNotes;
  async function handleSave() {
    setSaving(true);
    try {
      await updateProject(projectId, { notes });
      setSavedNotes(notes);
    } catch (err) {
      console.error('[NotesSection] Save failed:', err.message);
    } finally {
      setSaving(false);
    }
  }
  return (
    <div style={{ marginBottom: '40px' }}>
      <div style={{ display: 'flex', alignItems: 'center', justifyContent: 'space-between', marginBottom: '12px' }}>
        <p className="label-upper">Project Notes</p>
        {isDirty && (
          <button
            className="btn-primary"
            style={{ padding: '5px 12px', fontSize: '12px' }}
            disabled={saving}
            onClick={handleSave}
          >
            {saving ? 'Saving…' : 'Save'}
          </button>
        )}
      </div>
      <textarea
        value={notes}
        onChange={e => setNotes(e.target.value)}
        placeholder="Add notes about this project — references, goals, context, anything useful…"
        rows={6}
        style={{
          width: '100%',
          background: 'var(--bg-surface)',
          border: '1px solid var(--border)',
          borderRadius: 'var(--radius-lg)',
          padding: '14px 16px',
          color: 'var(--text-primary)',
          fontSize: '13px', lineHeight: '1.6',
          resize: 'vertical', fontFamily: 'inherit',
          outline: 'none', boxSizing: 'border-box',
        }}
        onFocus={e => e.target.style.borderColor = 'var(--accent)'}
        onBlur={e => e.target.style.borderColor = 'var(--border)'}
      />
      {!isDirty && notes && (
        <p className="text-xs text-muted" style={{ marginTop: '6px' }}>Saved</p>
      )}
    </div>
  );
 }
 function MenuButton({ children, onClick, danger }) {
  return (
    <button
      className="btn-reset text-sm"
      onClick={onClick}
      style={{
        width: '100%', padding: '8px 12px',
        borderRadius: 'var(--radius-sm)',
        justifyContent: 'flex-start',
        color: danger ? '#ff6b6b' : 'var(--text-primary)',
      }}
      onMouseEnter={e => e.currentTarget.style.background = 'var(--bg-surface)'}
      onMouseLeave={e => e.currentTarget.style.background = 'transparent'}
    >{children}</button>
  );
 }
--- a/packages/chat-client/src/components/SessionList.jsx
+++ b/packages/chat-client/src/components/SessionList.jsx
@@ -1,215 +0,0 @@
 import React, { useState } from 'react';
 import SessionModal from './SessionModal';
 import { useContextMenu } from '../hooks/useContextMenu';
 import { renameSession, deleteSession } from '../api/orchestration';
 export default function SessionList({ sessions, activeSession, onSelectSession, onNewChat, isOpen, onToggle, onSessionsChange }) {
  const [modalSession, setModalSession] = useState(null);
  const [hoveredId, setHoveredId] = useState(null);
  const { menu, open: openMenu, close: closeMenu } = useContextMenu();
  const [modalMode, setModalMode] = useState('settings');
  function formatDate(ts) {
    if (!ts) return '';
    const date = new Date(ts * 1000);
    const now = new Date();
    const diffDays = Math.floor((now - date) / (1000 * 60 * 60 * 24));
    if (diffDays === 0) return date.toLocaleTimeString([], { hour: '2-digit', minute: '2-digit' });
    if (diffDays === 1) return 'Yesterday';
    if (diffDays < 7) return date.toLocaleDateString([], { weekday: 'long' });
    return date.toLocaleDateString([], { month: 'short', day: 'numeric' });
  }
  function getPreview(session) {
    if (session.isNew) return 'New conversation';
    return session.name || session.external_id;
  }
  async function handleRename(session, name) {
    try {
      await renameSession(session.external_id, name);
      onSessionsChange();
    } catch (err) {
      console.error('[SessionList] Rename failed:', err.message);
    }
  }
  async function handleDelete(session) {
    try {
      await deleteSession(session.external_id);
      onSessionsChange();
    } catch (err) {
      console.error('[SessionList] Delete failed:', err.message);
    }
  }
  return (
    <>
      <div style={{
        width: isOpen ? 'var(--sidebar-width)' : '56px',
        flexShrink: 0,
        background: 'var(--bg-surface)',
        borderRight: '1px solid var(--border)',
        transition: 'width 0.2s ease',
        overflow: 'hidden',
      }} className="flex-col">
        {/* Header */}
        <div className="panel-header" style={{
          justifyContent: isOpen ? 'space-between' : 'center',
          padding: isOpen ? '0 12px 0 16px' : '0',
        }}>
          {isOpen && <span className="text-base" style={{ fontWeight: 500, color: 'var(--text-secondary)' }}>Conversations</span>}
          <button className="btn-icon" onClick={onToggle}>{isOpen ? '◀' : '▶'}</button>
        </div>
        {/* New chat button */}
        <div style={{ padding: isOpen ? '12px' : '12px 8px', flexShrink: 0 }}>
          <button className="btn-primary" onClick={onNewChat} style={{
            width: '100%',
            padding: isOpen ? '8px 12px' : '8px',
            display: 'flex',
            alignItems: 'center',
            justifyContent: isOpen ? 'flex-start' : 'center',
            gap: '8px',
            whiteSpace: 'nowrap',
            overflow: 'hidden',
          }}>
            <span style={{ fontSize: '18px', lineHeight: 1, flexShrink: 0 }}>+</span>
            {isOpen && <span>New Chat</span>}
          </button>
        </div>
        {/* Session list */}
        <div className="flex-1 scroll-y">
          {isOpen && sessions.map(session => {
            const isActive = activeSession?.external_id === session.external_id;
            const isHovered = hoveredId === session.external_id;
            return (
              <div
                key={session.external_id}
                onMouseEnter={() => setHoveredId(session.external_id)}
                onMouseLeave={() => setHoveredId(null)}
                onContextMenu={e => !session.isNew && openMenu(e, session)}
                style={{
                  position: 'relative',
                  display: 'flex',
                  alignItems: 'stretch',
                  background: isActive ? 'var(--bg-elevated)' : isHovered ? 'var(--bg-elevated)' : 'transparent',
                  borderLeft: isActive ? '2px solid var(--accent)' : '2px solid transparent',
                  transition: 'background 0.1s',
                }}
              >
                {/* Session select button — no action icons inside */}
                <button
                  onClick={() => onSelectSession(session)}
                  className="btn-reset"
                  style={{
                    flex: 1,
                    padding: '10px 16px',
                    paddingRight: isHovered && !session.isNew ? '4px' : '16px',
                    textAlign: 'left',
                    flexDirection: 'column',
                    gap: '3px',
                    minWidth: 0, // allows truncation to work
                  }}
                >
                  <div className="flex" style={{ gap: '8px', width: '100%' }}>
                    <span className="text-base truncate" style={{
                      color: isActive ? 'var(--text-primary)' : 'var(--text-secondary)',
                      fontWeight: isActive ? 500 : 400,
                      flex: 1,
                    }}>
                      {getPreview(session)}
                    </span>
                    {!isHovered && (
                      <span className="text-xs text-muted flex-shrink">
                        {formatDate(session.updated_at)}
                      </span>
                    )}
                  </div>
                  {session.isNew && (
                    <span className="text-xs text-accent" style={{ fontStyle: 'italic' }}>Unsaved</span>
                  )}
                </button>
                {/* Action icons — outside the button, alongside it */}
                {isHovered && !session.isNew && (
                  <div className="flex items-center flex-shrink" style={{ gap: '2px', paddingRight: '8px' }}>
                    <button
                      className="btn-icon"
                      title="Rename"
                      onClick={() => { setModalMode('settings'); setModalSession(session); }}
                      style={{ padding: '2px 4px', fontSize: '12px' }}
                    >✎</button>
                    <button
                      className="btn-icon"
                      title="Delete"
                      onClick={() => { setModalMode('confirm-delete'); setModalSession(session); }}
                      style={{ padding: '2px 4px', fontSize: '12px', color: '#ff6b6b' }}
                    >✕</button>
                  </div>
                )}
              </div>
            );
          })}
          {isOpen && sessions.length === 0 && (
            <div className="text-base text-muted" style={{ padding: '24px 16px', textAlign: 'center' }}>
              No conversations yet
            </div>
          )}
        </div>
      </div>
      {/* Context menu */}
      {menu && (
        <div
          onClick={e => e.stopPropagation()}
          style={{
            position: 'fixed',
            top: menu.y,
            left: menu.x,
            background: 'var(--bg-elevated)',
            border: '1px solid var(--border)',
            borderRadius: 'var(--radius-md)',
            padding: '4px',
            zIndex: 50,
            minWidth: '140px',
          }}
        >
          <button 
            className="btn-reset text-base" 
            onClick={() => { setModalMode('settings'); setModalSession(menu.session); closeMenu(); }}
            style={{ width: '100%', padding: '8px 12px', borderRadius: 'var(--radius-sm)', justifyContent: 'flex-start', color: 'var(--text-primary)' }}
            onMouseEnter={e => e.currentTarget.style.background = 'var(--bg-surface)'}
            onMouseLeave={e => e.currentTarget.style.background = 'transparent'}
          >
            ✎ Rename
          </button>
          <button className="btn-reset text-base"
            onClick={() => { setModalMode('confirm-delete'); setModalSession(menu.session); closeMenu(); }}
            style={{
              width: '100%', padding: '8px 12px', borderRadius: 'var(--radius-sm)',
              justifyContent: 'flex-start', color: '#ff6b6b'
            }}
            onMouseEnter={e => e.currentTarget.style.background = 'var(--bg-surface)'}
            onMouseLeave={e => e.currentTarget.style.background = 'transparent'}
          >✕ Delete</button>
        </div>
      )}
      {/* Rename modal */}
      {modalSession && (
        <SessionModal
          session={modalSession}
          mode={modalMode}
          onRename={handleRename}
          onDelete={handleDelete}
          onClose={() => setModalSession(null)}
        />
      )}
    </>
  );
 }
--- a/packages/chat-client/src/components/SessionModal.jsx
+++ b/packages/chat-client/src/components/SessionModal.jsx
@@ -1,8 +1,9 @@
 // SessionModal.jsx
 import React, { useState, useEffect, useRef } from 'react';
 import { updateSession } from '../api/orchestration';
-export default function SessionModal({ session, mode = 'settings', onRename, onDelete, onClose }) {
+export default function SessionModal({ session, mode = 'settings', onRename, onDelete, onClose, projects = [] }) {
  const [name, setName] = useState(session?.name || '');
  const [projectId, setProjectId] = useState(session?.project_id ?? '');
  const inputRef = useRef(null);
  useEffect(() => {
@@ -15,7 +16,7 @@ export default function SessionModal({ session, mode = 'settings', onRename, onD
  function handleSubmit() {
    const trimmed = name.trim();
    if (!trimmed) return;
-        onRename(session, trimmed);
+    onRename(session, trimmed, projectId || null);
    onClose();
  }
@@ -28,30 +29,26 @@ export default function SessionModal({ session, mode = 'settings', onRename, onD
  return (
    <div onClick={onClose} style={{
-            position: 'fixed',
+      position: 'fixed', inset: 0,
            inset: 0,
      background: 'rgba(0,0,0,0.5)',
-            display: 'flex',
+      display: 'flex', alignItems: 'center', justifyContent: 'center',
            alignItems: 'center',
            justifyContent: 'center',
      zIndex: 100,
    }}>
      <div onClick={e => e.stopPropagation()} onKeyDown={handleKeyDown} style={{
        background: 'var(--bg-surface)',
        border: '1px solid var(--border)',
        borderRadius: 'var(--radius-lg)',
-                padding: '24px',
+        padding: '24px', width: '360px',
-                width: '360px',
+        display: 'flex', flexDirection: 'column', gap: '16px',
                display: 'flex',
                flexDirection: 'column',
                gap: '16px',
      }}>
        {mode === 'settings' ? (
          <>
            <h2 style={{ fontSize: '15px', fontWeight: 600, color: 'var(--text-primary)' }}>
              Session Settings
            </h2>
-                        <div className="flex-col" style={{ gap: '8px' }}>
+
            {/* Name */}
            <div className="flex-col" style={{ gap: '6px' }}>
              <label className="label-upper">Name</label>
              <input
                ref={inputRef}
@@ -59,26 +56,44 @@ export default function SessionModal({ session, mode = 'settings', onRename, onD
                onChange={e => setName(e.target.value)}
                placeholder="Enter session name..."
                style={{
-                                    background: 'var(--bg-elevated)',
+                  background: 'var(--bg-elevated)', border: '1px solid var(--border)',
-                                    border: '1px solid var(--border)',
+                  borderRadius: 'var(--radius-md)', padding: '8px 12px',
-                                    borderRadius: 'var(--radius-md)',
+                  color: 'var(--text-primary)', fontSize: '14px', outline: 'none', width: '100%',
                                    padding: '8px 12px',
                                    color: 'var(--text-primary)',
                                    fontSize: '14px',
                                    outline: 'none',
                                    width: '100%',
                }}
              />
            </div>
            {/* Project assignment */}
            <div className="flex-col" style={{ gap: '6px' }}>
              <label className="label-upper">Project <span style={{ opacity: 0.5 }}>(optional)</span></label>
              <select
                value={projectId}
                onChange={e => setProjectId(e.target.value)}
                style={{
                  width: '100%', padding: '8px 10px',
                  background: 'var(--bg-elevated)', border: '1px solid var(--border)',
                  borderRadius: 'var(--radius-md)', color: 'var(--text-primary)',
                  fontSize: '13px', cursor: 'pointer', outline: 'none',
                }}
              >
                <option value=''>No project</option>
                {projects.map(p => (
                  <option key={p.id} value={p.id}>{p.name}</option>
                ))}
              </select>
            </div>
            <div className="flex" style={{ gap: '8px', justifyContent: 'flex-end' }}>
              <button className="btn-reset text-base text-muted"
                onClick={onClose}
-                                style={{ padding: '8px 14px', borderRadius: 'var(--radius-md)' }}
+                style={{ padding: '8px 14px', borderRadius: 'var(--radius-md)' }}>
-                            >Cancel</button>
+                Cancel
              </button>
              <button className="btn-primary" onClick={handleSubmit}
                disabled={!name.trim()}
-                                style={{ padding: '8px 16px' }}
+                style={{ padding: '8px 16px' }}>
-                            >Save</button>
+                Save
              </button>
            </div>
          </>
        ) : (
@@ -96,17 +111,14 @@ export default function SessionModal({ session, mode = 'settings', onRename, onD
            <div className="flex" style={{ gap: '8px', justifyContent: 'flex-end' }}>
              <button className="btn-reset text-base text-muted"
                onClick={onClose}
-                                style={{ padding: '8px 14px', borderRadius: 'var(--radius-md)' }}
+                style={{ padding: '8px 14px', borderRadius: 'var(--radius-md)' }}>
-                            >Cancel</button>
+                Cancel
              </button>
              <button className="btn-reset text-base"
                onClick={() => { onDelete(session); onClose(); }}
-                                style={{
+                style={{ padding: '8px 16px', borderRadius: 'var(--radius-md)', background: '#c0392b', color: 'white' }}>
-                                    padding: '8px 16px',
+                Delete
-                                    borderRadius: 'var(--radius-md)',
+              </button>
                                    background: '#c0392b',
                                    color: 'white',
                                }}
                            >Delete</button>
            </div>
          </>
        )}
--- a/packages/chat-client/src/components/SettingsView.jsx
+++ b/packages/chat-client/src/components/SettingsView.jsx
@@ -0,0 +1,502 @@
 import React, { useState, useEffect, useCallback } from 'react';
 import { useSettings } from '../hooks/useSettings';
 import { useModels } from '../hooks/useModels';
 import { getServiceHealth } from '../api/orchestration';
 export default function SettingsView({ onNavigate, onBack, modelProps }) {
  const { settings, saveSetting, saving } = useSettings();
  return (
    <div style={{ display: 'flex', flexDirection: 'column', flex: 1, overflow: 'hidden', background: 'var(--bg-base)' }}>
      <div className="panel-header" style={{ padding: '0 8px 0 8px' }}>
        <div style={{ display: 'flex', alignItems: 'center', gap: '4px' }}>
          <button className="btn-icon" onClick={onBack} title="Back" style={{ fontSize: '16px', padding: '4px 8px' }}>←</button>
          <span className="text-base" style={{ fontWeight: 500, color: 'var(--text-secondary)' }}>Settings</span>
        </div>
      </div>
      <div className="flex-1 scroll-y" style={{ padding: '24px' }}>
        <SettingsSection title="Memory">
          <SettingsRow
            label="Memory Viewer"
            description="Browse, search, and delete stored episodes"
            action={<button className="btn-primary" style={{ padding: '6px 14px', fontSize: '13px' }}
              onClick={() => onNavigate('memory')}>Open →</button>}
          />
          <NumberSetting
            label="Recent Episode Limit"
            description="Recent episodes injected into each prompt"
            value={settings?.recentEpisodeLimit}
            min={1} max={20}
            onSave={val => saveSetting('recentEpisodeLimit', val)}
            saving={saving}
          />
          <NumberSetting
            label="Semantic Search Limit"
            description="Max episodes retrieved via vector search per query"
            value={settings?.semanticLimit}
            min={1} max={20}
            onSave={val => saveSetting('semanticLimit', val)}
            saving={saving}
          />
          <NumberSetting
            label="Score Threshold"
            description="Minimum similarity score for semantic results (0–1)"
            value={settings?.scoreThreshold}
            min={0} max={1} step={0.05}
            onSave={val => saveSetting('scoreThreshold', val)}
            saving={saving}
          />
        </SettingsSection>
        <SettingsSection title="Models">
          <SettingsSectionErrorBoundary>
            <ModelsSection settings={settings} saveSetting={saveSetting} saving={saving} modelProps={modelProps} />
          </SettingsSectionErrorBoundary>
        </SettingsSection>
        {/* Global system prompt */}
        <SettingsSection title="Behaviour">
          <SystemPromptSetting settings={settings} saveSetting={saveSetting} saving={saving} />
        </SettingsSection>
        <SettingsSection title="About">
          <SettingsRow
            label="Service Health"
            description="Ping all four services"
            action={<ServiceHealth />}
          />
          <SettingsRow
            label="Version"
            description="NexusAI"
            action={<span className="text-sm text-muted">v0.1.0</span>}
          />
        </SettingsSection>
        <SettingsSection title="Appearance">
          <SettingsRow label="Theme" description="UI colour scheme" action={<ComingSoon />} />
        </SettingsSection>
      </div>
    </div>
  );
 }
 // ── Error boundary ───────────────────────────────────────────
 class SettingsSectionErrorBoundary extends React.Component {
  constructor(props) {
    super(props);
    this.state = { error: null };
  }
  static getDerivedStateFromError(error) {
    return { error };
  }
  render() {
    if (this.state.error) {
      return (
        <SettingsRow
          label="Models unavailable"
          description={this.state.error.message ?? 'Failed to load model settings'}
          action={
            <button className="btn-primary" style={{ padding: '5px 10px', fontSize: '12px' }}
              onClick={() => this.setState({ error: null })}>
              Retry
            </button>
          }
        />
      );
    }
    return this.props.children;
  }
 }
 // ── Layout components ────────────────────────────────────────
 function SettingsSection({ title, children }) {
  return (
    <div style={{ marginBottom: '32px' }}>
      <p className="label-upper" style={{ marginBottom: '12px', color: 'var(--text-secondary)' }}>
        {title}
      </p>
      <div style={{
        background: 'var(--bg-surface)',
        border: '1px solid var(--border)',
        borderRadius: 'var(--radius-lg)',
        overflow: 'hidden',
      }}>
        {children}
      </div>
    </div>
  );
 }
 function SettingsRow({ label, description, action }) {
  return (
    <div style={{
      display: 'flex', alignItems: 'flex-start', justifyContent: 'space-between',
      padding: '14px 16px',
      borderBottom: '1px solid var(--border)',
    }}>
      <div style={{ display: 'flex', flexDirection: 'column', gap: 2 }}>
        <span className="text-sm" style={{ color: 'var(--text-primary)', fontWeight: 500 }}>{label}</span>
        {description && <span className="text-xs text-muted">{description}</span>}
      </div>
      <div style={{ flexShrink: 0, marginLeft: 16 }}>
        {action}
      </div>
    </div>
  );
 }
 function NumberSetting({ label, description, value, min, max, step = 1, onSave, saving }) {
  const [local, setLocal] = useState(value ?? '');
  const isDirty = local !== '' && Number(local) !== value;
  useEffect(() => {
    // Skip both undefined and null so the input never becomes uncontrolled.
    if (value != null) setLocal(value);
  }, [value]);
  return (
    <SettingsRow
      label={label}
      description={description}
      action={
        <div style={{ display: 'flex', alignItems: 'center', gap: 6 }}>
          <input
            type="number"
            value={local}
            min={min} max={max} step={step}
            onChange={e => setLocal(e.target.value)}
            style={{
              width: '64px', padding: '5px 8px', textAlign: 'center',
              background: 'var(--bg-elevated)', border: '1px solid var(--border)',
              borderRadius: 'var(--radius-md)', color: 'var(--text-primary)',
              fontSize: '13px', outline: 'none',
            }}
          />
          {isDirty && (
            <button
              className="btn-primary"
              style={{ padding: '5px 10px', fontSize: '12px' }}
              disabled={saving}
              onClick={() => onSave(Number(local))}
            >
              Save
            </button>
          )}
        </div>
      }
    />
  );
 }
 function ComingSoon() {
  return <span className="text-xs text-muted" style={{ fontStyle: 'italic' }}>Coming soon</span>;
 }
 // ── System prompt setting ────────────────────────────────────
 function SystemPromptSetting({ settings, saveSetting, saving }) {
  const [local, setLocal] = useState(settings?.systemPrompt ?? '');
  const [savedPrompt, setSavedPrompt] = useState(settings?.systemPrompt ?? '');
  useEffect(() => {
    if (settings?.systemPrompt !== undefined) {
      setLocal(settings.systemPrompt ?? '');
      setSavedPrompt(settings.systemPrompt ?? '');
    }
  }, [settings?.systemPrompt]);
  const isDirty = local !== savedPrompt;
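  async function handleSave() {
    // Save the trimmed prompt and keep local state in sync with what was persisted.
    const trimmed = local.trim();
    try {
      await saveSetting('systemPrompt', trimmed || null);
      setLocal(trimmed);
      setSavedPrompt(trimmed);
    } catch (err) {
      console.error('[SystemPromptSetting] Save failed:', err.message);
    }
  }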
  return (
    <div style={{ padding: '14px 16px', borderBottom: '1px solid var(--border)' }}>
      <div style={{ display: 'flex', alignItems: 'flex-start', justifyContent: 'space-between', marginBottom: '8px' }}>
        <div style={{ display: 'flex', flexDirection: 'column', gap: 2 }}>
          <span className="text-sm" style={{ color: 'var(--text-primary)', fontWeight: 500 }}>
            System Prompt
          </span>
          <span className="text-xs text-muted">
            Default instruction given to the model on every request. Projects can override this.
          </span>
        </div>
        {isDirty && (
          <button
            className="btn-primary"
            style={{ padding: '5px 12px', fontSize: '12px', flexShrink: 0, marginLeft: '16px' }}
            disabled={saving}
            onClick={handleSave}
          >
            {saving ? 'Saving…' : 'Save'}
          </button>
        )}
      </div>
      <textarea
        value={local}
        onChange={e => setLocal(e.target.value)}
        rows={5}
        style={{
          width: '100%',
          background: 'var(--bg-elevated)',
          border: '1px solid var(--border)',
          borderRadius: 'var(--radius-md)',
          padding: '10px 12px',
          color: 'var(--text-primary)',
          fontSize: '13px', lineHeight: '1.6',
          resize: 'vertical', fontFamily: 'inherit',
          outline: 'none', boxSizing: 'border-box',
        }}
        onFocus={e => e.target.style.borderColor = 'var(--accent)'}
        onBlur={e => e.target.style.borderColor = 'var(--border)'}
      />
      {!isDirty && local && (
        <p className="text-xs text-muted" style={{ marginTop: '6px' }}>Saved</p>
      )}
    </div>
  );
 }
 // ── Service health ───────────────────────────────────────────
 function ServiceHealth() {
  const [services, setServices] = useState(null);
  const [loading, setLoading] = useState(false);
  const [lastChecked, setLastChecked] = useState(null);
  const check = useCallback(async () => {
    setLoading(true);
    try {
      setServices(await getServiceHealth());
      setLastChecked(new Date());
    } catch (err) {
      console.error('[ServiceHealth]', err.message);
    } finally {
      setLoading(false);
    }
  }, []);
  return (
    <div style={{ display: 'flex', flexDirection: 'column', gap: 8 }}>
      <div style={{ display: 'flex', alignItems: 'center', gap: 8 }}>
        <button
          className="btn-primary"
          style={{ padding: '5px 12px', fontSize: '12px' }}
          disabled={loading}
          onClick={check}
        >
          {loading ? 'Checking…' : 'Check Now'}
        </button>
        {lastChecked && (
          <span className="text-xs text-muted">
            {lastChecked.toLocaleTimeString()}
          </span>
        )}
      </div>
      {services && (
        <div style={{
          display: 'flex', flexDirection: 'column',
          border: '1px solid var(--border)',
          borderRadius: 'var(--radius-md)',
          overflow: 'hidden', marginTop: 4,
        }}>
          {services.map((svc, i) => (
            <div key={svc.key} style={{
              display: 'flex', alignItems: 'center', gap: 10,
              padding: '8px 12px',
              borderBottom: i < services.length - 1 ? '1px solid var(--border)' : 'none',
              background: 'var(--bg-elevated)',
            }}>
              <div style={{
                width: 8, height: 8, borderRadius: '50%', flexShrink: 0,
                background: svc.status === 'healthy' ? '#2ecc71' : '#e74c3c',
              }} />
              <span className="text-sm" style={{ minWidth: 90, color: 'var(--text-primary)' }}>
                {svc.label}
              </span>
              <span className="text-xs text-muted" style={{ flex: 1 }}>
                {svc.key === 'inference' && svc.detail?.model
                  ? svc.detail.model
                  : svc.status === 'unreachable' ? 'Unreachable' : ''}
              </span>
              <span className="text-xs text-muted" style={{ flexShrink: 0 }}>
                {svc.latency != null ? `${svc.latency}ms` : '—'}
              </span>
            </div>
          ))}
        </div>
      )}
    </div>
  );
 }
 // ── Models section ───────────────────────────────────────────
 function ModelsSection({ settings, saveSetting, saving, modelProps }) {
  const { models, selectedModel, setSelectedModel } = useModels();
  // Derived on each render from models and selectedModel; no extra state or effect needed.
  const selectedInfo = models.find(m => m.value === selectedModel) ?? null;
  return (
    <>
      <SettingsRow
        label="Models Folder"
        description="Path to folder containing .gguf files"
        action={<ModelsFolderSetting settings={settings} saveSetting={saveSetting} saving={saving} />}
      />
      <NumberSetting
        label="Temperature"
        description="Response randomness — lower is more focused, higher is more creative (0–2)"
        value={settings?.temperature}
        min={0} max={2} step={0.05}
        onSave={val => saveSetting('temperature', val)}
        saving={saving}
      />
      <NumberSetting
        label="Repeat Penalty"
        description="Penalises repeated tokens — higher reduces repetition (1–2)"
        value={settings?.repeatPenalty}
        min={1} max={2} step={0.05}
        onSave={val => saveSetting('repeatPenalty', val)}
        saving={saving}
      />
      <NumberSetting
        label="Top-P"
        description="Nucleus sampling — limits token pool by cumulative probability (0–1)"
        value={settings?.topP}
        min={0} max={1} step={0.05}
        onSave={val => saveSetting('topP', val)}
        saving={saving}
      />
      <NumberSetting
        label="Top-K"
        description="Limits token pool to K most likely tokens per step (1–100)"
        value={settings?.topK}
        min={1} max={100} step={1}
        onSave={val => saveSetting('topK', val)}
        saving={saving}
      />
      <SettingsRow
        label="Active Model"
        description="Model used for inference"
        action={
          <select
            value={selectedModel}
            onChange={e => setSelectedModel(e.target.value)}
            style={{
              padding: '6px 10px', fontSize: '13px',
              background: 'var(--bg-elevated)', border: '1px solid var(--border)',
              borderRadius: 'var(--radius-md)', color: 'var(--text-primary)',
              cursor: 'pointer', outline: 'none', maxWidth: '220px',
            }}
          >
            {models.map(m => (
              <option key={m.value} value={m.value}>{m.label}</option>
            ))}
          </select>
        }
      />
      {selectedInfo && (
        <div style={{
          margin: '0', padding: '14px 16px',
          borderTop: '1px solid var(--border)',
          background: 'var(--bg-elevated)',
          display: 'flex', flexDirection: 'column', gap: 8,
        }}>
          <p className="label-upper" style={{ color: 'var(--text-muted)' }}>Model Info</p>
          <div style={{ display: 'flex', flexDirection: 'column', gap: 6 }}>
            <InfoLine label="File" value={selectedInfo.value} mono />
            <InfoLine label="Size" value={selectedInfo.size ?? '—'} />
            {selectedInfo.description && (
              <InfoLine label="Description" value={selectedInfo.description} />
            )}
            <InfoLine
              label="Context"
              value={modelProps?.contextWindow
                ? `${modelProps.contextWindow.toLocaleString()} tokens`
                : '—'}
            />
            <InfoLine
              label="Loaded"
              value={modelProps?.modelAlias ?? '—'}
              mono
            />
          </div>
          <p className="text-xs text-muted" style={{ marginTop: 4, fontStyle: 'italic' }}>
            Model loading and parameter configuration coming soon
          </p>
        </div>
      )}
    </>
  );
 }
 function InfoLine({ label, value, mono }) {
  return (
    <div style={{ display: 'flex', gap: 8, alignItems: 'baseline' }}>
      <span className="text-xs text-muted" style={{ minWidth: 72, flexShrink: 0 }}>{label}</span>
      <span style={{
        fontSize: 12, color: 'var(--text-secondary)',
        fontFamily: mono ? 'monospace' : 'inherit',
        wordBreak: 'break-all',
      }}>{value}</span>
    </div>
  );
 }
 function ModelsFolderSetting({ settings, saveSetting, saving }) {
  const [local, setLocal] = useState('');
  const [error, setError] = useState(null);
  useEffect(() => {
    if (settings?.modelsFolderPath) setLocal(settings.modelsFolderPath);
  }, [settings?.modelsFolderPath]);
  const isDirty = local !== '' && local !== settings?.modelsFolderPath;
  async function handleSave() {
    setError(null);
    try {
      await saveSetting('modelsFolderPath', local);
    } catch (err) {
      console.error('[ModelsFolderSetting] Save failed:', err.message);
      setError('Path not accessible');
    }
  }
  return (
    <div style={{ display: 'flex', flexDirection: 'column', gap: 4, alignItems: 'flex-end' }}>
      <div style={{ display: 'flex', gap: 6, alignItems: 'center' }}>
        <input
          value={local}
          onChange={e => { setLocal(e.target.value); setError(null); }}
          style={{
            width: '220px', padding: '5px 8px', fontSize: '12px',
            fontFamily: 'monospace',
            background: 'var(--bg-elevated)', border: `1px solid ${error ? '#e74c3c' : 'var(--border)'}`,
            borderRadius: 'var(--radius-md)', color: 'var(--text-primary)', outline: 'none',
          }}
        />
        {isDirty && (
          <button className="btn-primary" style={{ padding: '5px 10px', fontSize: '12px' }}
            disabled={saving} onClick={handleSave}>
            Save
          </button>
        )}
      </div>
      {error && <span className="text-xs" style={{ color: '#e74c3c' }}>{error}</span>}
    </div>
  );
 }
--- a/packages/chat-client/src/components/Sidebar.jsx
+++ b/packages/chat-client/src/components/Sidebar.jsx
@@ -0,0 +1,424 @@
 import React, { useState } from 'react';
 import SessionModal from './SessionModal';
 import { useContextMenu } from '../hooks/useContextMenu';
 import { renameSession, deleteSession, updateSession } from '../api/orchestration';
 export default function Sidebar({
  sessions,
  activeSession,
  onSelectSession,
  onNewChat,
  onNewProject,
  isOpen,
  onToggle,
  onSessionsChange,
  onNavigate,
  projects,
  onProjectsChange,
  onSelectProject
 }) {
  const [chatsOpen, setChatsOpen] = useState(true);
  const [projectsOpen, setProjectsOpen] = useState(true);
  const [modalSession, setModalSession] = useState(null);
  const [modalMode, setModalMode] = useState('settings');
  const [hoveredId, setHoveredId] = useState(null);
  const { menu, open: openMenu, close: closeMenu } = useContextMenu();
  // ── Handlers ────────────────────────────────────────────
  async function handleRename(session, name, projectId) {
    try {
      await updateSession(session.external_id, { name, projectId });
      onSessionsChange();
    } catch (err) {
      console.error('[Sidebar] Rename failed:', err.message);
    }
  }
  async function handleDelete(session) {
    try {
      await deleteSession(session.external_id);
      onSessionsChange(session);
    } catch (err) {
      console.error('[Sidebar] Delete failed:', err.message);
    }
  }
  // ── Collapsed rail ───────────────────────────────────────
  if (!isOpen) {
    return (
      <div className="flex-col" style={{
        width: '48px',
        flexShrink: 0,
        background: 'var(--bg-surface)',
        borderRight: '1px solid var(--border)',
        alignItems: 'center',
        paddingTop: '8px',
        paddingBottom: '8px',
        gap: '4px',
      }}>
        {/* Expand toggle */}
        <button className="btn-icon" onClick={onToggle} title="Expand sidebar"
          style={{ marginBottom: '4px' }}>▶</button>
        <div style={{ width: '32px', height: '1px', background: 'var(--border)', margin: '4px 0' }} />
        {/* New Chat */}
        <button className="btn-icon" onClick={onNewChat} title="New Chat"
          style={{ fontSize: '18px', color: 'var(--text-secondary)' }}>+</button>
        {/* View Projects */}
        <button className="btn-icon" onClick={onNewProject} title="View Projects"
          style={{ fontSize: '14px', color: 'var(--text-secondary)' }}>⊞</button>
        {/* All Chats */}
        <button className="btn-icon" onClick={() => onNavigate('all-chats')} title="All Chats"
          style={{ fontSize: '14px', color: 'var(--text-secondary)' }}>☰</button>
        {/* Spacer */}
        <div style={{ flex: 1 }} />
        {/* Settings */}
        <button className="btn-icon" onClick={() => onNavigate('settings')} title="Settings"
          style={{ fontSize: '14px', color: 'var(--text-secondary)' }}>⚙</button>
      </div>
    );
  }
  // ── Expanded sidebar ─────────────────────────────────────
  const recentSessions = sessions.slice(0, 10);
  // Group recent sessions by project
  const grouped = {};
  const unassigned = [];
  for (const session of recentSessions) {
    if (session.project_id) {
      if (!grouped[session.project_id]) grouped[session.project_id] = [];
      grouped[session.project_id].push(session);
    } else {
      unassigned.push(session);
    }
  }
  const sessionRowProps = (session) => ({
    session,
    isActive: activeSession?.external_id === session.external_id,
    isHovered: hoveredId === session.external_id,
    onHover: setHoveredId,
    onSelect: () => { onSelectSession(session); onNavigate('chat'); },
    onRename: () => { setModalMode('settings'); setModalSession(session); },
    onDelete: () => { setModalMode('confirm-delete'); setModalSession(session); },
    onContextMenu: e => !session.isNew && openMenu(e, session),
  });
  return (
    <>
      <div className="flex-col" style={{
        width: 'var(--sidebar-width)',
        flexShrink: 0,
        background: 'var(--bg-surface)',
        borderRight: '1px solid var(--border)',
        overflow: 'hidden',
      }}>
        {/* Header */}
        <div className="panel-header" style={{ justifyContent: 'space-between', padding: '0 12px 0 16px' }}>
          <span className="text-base" style={{ fontWeight: 1000, color: 'var(--text-secondary)' }}>NexusAI</span>
          <button className="btn-icon" onClick={onToggle}>◀</button>
        </div>
        {/* Action buttons */}
        <div style={{ padding: '10px 10px 6px', display: 'flex', flexDirection: 'column', gap: '6px', flexShrink: 0 }}>
          <button className="btn-primary" onClick={onNewChat} style={{
            width: '100%', padding: '7px 12px',
            display: 'flex', alignItems: 'center', gap: '8px',
          }}>
            <span style={{ fontSize: '16px', lineHeight: 1 }}>+</span>
            <span>New Chat</span>
          </button>
          <button className="btn-primary" onClick={onNewProject} style={{
            width: '100%', padding: '7px 12px',
            display: 'flex', alignItems: 'center', gap: '8px',
          }}>
            <span style={{ fontSize: '14px', lineHeight: 1 }}>⊞</span>
            <span>View Projects</span>
          </button>
        </div>
        <div style={{ height: '1px', background: 'var(--border)', flexShrink: 0, margin: '2px 0' }} />
        {/* Scrollable content */}
        <div className="flex-1 scroll-y">
          {/* ── Projects section ── */}
          <SectionHeader
            label="Projects"
            isOpen={projectsOpen}
            onToggle={() => setProjectsOpen(o => !o)}
          />
          {projectsOpen && (
            <div style={{ padding: '4px 10px 8px' }}>
              {!projects?.length ? (
                <div style={{
                  padding: '10px',
                  borderRadius: 'var(--radius-md)',
                  border: '1px dashed var(--border)',
                  color: 'var(--text-sb-hdr)',
                  fontSize: '13px',
                  textAlign: 'center',
                }}>
                  No projects yet
                </div>
              ) : (
                <div style={{ display: 'flex', flexWrap: 'wrap', gap: '6px' }}>
                  {projects.slice(0, 6).map(project => (
                    <button
                      key={project.id}
                      onClick={() => { onSelectProject(project); onNavigate('project'); }}
                      className="btn-reset text-xs"
                      style={{
                        padding: '4px 8px',
                        borderRadius: 'var(--radius-sm)',
                        background: 'var(--bg-elevated)',
                        border: `1px solid ${project.colour ?? 'var(--border)'}`,
                        color: 'var(--text-secondary)',
                        maxWidth: '100%',
                      }}
                      title={project.description ?? project.name}
                    >
                      <span className="truncate" style={{ display: 'block', maxWidth: '140px' }}>
                        {project.name}
                      </span>
                    </button>
                  ))}
                </div>
              )}
            </div>
          )}
          <div style={{ height: '1px', background: 'var(--border)', margin: '2px 0' }} />
          {/* ── Recent Chats section ── */}
          <SectionHeader
            label="Recent Chats"
            isOpen={chatsOpen}
            onToggle={() => setChatsOpen(o => !o)}
          />
          {chatsOpen && (
            <>
              {recentSessions.length === 0 && (
                <div className="text-xs text-muted" style={{ padding: '12px 16px', textAlign: 'center' }}>
                  No conversations yet
                </div>
              )}
              {/* Project groups */}
              {Object.entries(grouped).map(([projectId, projectSessions]) => {
                const project = projects?.find(p => p.id === Number(projectId));
                return (
                  <div key={projectId}>
                    {/* Project group label */}
                    <div style={{
                      display: 'flex', alignItems: 'center', gap: '6px',
                      padding: '6px 16px 2px',
                    }}>
                      <span className=" text-muted truncate" 
                            style={{
                              fontSize: '12px', 
                              textTransform: 'uppercase', 
                              fontWeight: '500', 
                              textAlign: 'center',
                              borderRadius: 'var(--radius-md)',
                              border: `1px solid ${project?.colour ?? 'var(--border)'}`,
                              padding: '2px 2px',
                              width: '100%'
                            }}>
                        {project?.name ?? 'Project'}
                      </span>
                    </div>
                    {projectSessions.map(session => (
                      <SessionRow key={session.external_id} {...sessionRowProps(session)} />
                    ))}
                  </div>
                );
              })}
              {/* Unassigned sessions */}
              {unassigned.length > 0 && (
                <>
                  {Object.keys(grouped).length > 0 && (
                    <div style={{ padding: '6px 16px 2px' }}>
                      <span className=" text-muted " style={{fontSize: '12px', textTransform: 'uppercase', fontWeight: '500', textAlign: 'center',}}>Other</span>
                    </div>
                  )}
                  {unassigned.map(session => (
                    <SessionRow key={session.external_id} {...sessionRowProps(session)} />
                  ))}
                </>
              )}
              {sessions.length > 0 && (
                <button
                  onClick={() => onNavigate('all-chats')}
                  className="btn-reset text-xs text-muted"
                  style={{ width: '100%', padding: '6px', borderRadius: 'var(--radius-sm)' }}
                  onMouseEnter={e => e.currentTarget.style.color = 'var(--text-secondary)'}
                  onMouseLeave={e => e.currentTarget.style.color = 'var(--text-muted)'}
                >
                  All Chats →
                </button>
              )}
            </>
          )}
        </div>
        {/* Settings — pinned to bottom */}
        <div style={{ borderTop: '1px solid var(--border)', padding: '8px 10px', flexShrink: 0 }}>
          <button
            onClick={() => onNavigate('settings')}
            className="btn-reset text-base"
            style={{
              width: '100%', padding: '8px 12px',
              borderRadius: 'var(--radius-md)',
              display: 'flex', alignItems: 'center', gap: '8px',
              color: 'var(--text-secondary)',
            }}
            onMouseEnter={e => e.currentTarget.style.background = 'var(--bg-elevated)'}
            onMouseLeave={e => e.currentTarget.style.background = 'transparent'}
          >
            <span style={{ fontSize: '14px' }}>⚙</span>
            <span>Settings</span>
          </button>
        </div>
      </div>
      {/* Context menu */}
      {menu && (
        <div
          onClick={e => e.stopPropagation()}
          style={{
            position: 'fixed', top: menu.y, left: menu.x,
            background: 'var(--bg-elevated)', border: '1px solid var(--border)',
            borderRadius: 'var(--radius-md)', padding: '4px', zIndex: 50, minWidth: '140px',
          }}
        >
          <ContextMenuItem
            onClick={() => { setModalMode('settings'); setModalSession(menu.session); closeMenu(); }}
          >✎ Rename</ContextMenuItem>
          <ContextMenuItem
            onClick={() => { setModalMode('confirm-delete'); setModalSession(menu.session); closeMenu(); }}
            danger
          >✕ Delete</ContextMenuItem>
        </div>
      )}
      {/* Session modal */}
      {modalSession && (
        <SessionModal
          session={modalSession}
          mode={modalMode}
          onRename={handleRename}
          onDelete={handleDelete}
          onClose={() => setModalSession(null)}
          projects={projects}
        />
      )}
    </>
  );
 }
 // ── Sub-components ───────────────────────────────────────────
 function SectionHeader({ label, isOpen, onToggle }) {
  return (
    <button
      onClick={onToggle}
      className="btn-reset label-upper"
      style={{
        width: '100%', padding: '8px 16px',
        display: 'flex', alignItems: 'center', justifyContent: 'center',
        color: 'var(--text-sb-hdr)',
      }}
    >
      <span>{label}</span>
      <span style={{ fontSize: '13px' }}>{isOpen ? '▾' : '▸'}</span>
    </button>
  );
 }
 function SessionRow({ session, isActive, isHovered, onHover, onSelect, onRename, onDelete, onContextMenu }) {
  return (
    <div
      onMouseEnter={() => onHover(session.external_id)}
      onMouseLeave={() => onHover(null)}
      onContextMenu={onContextMenu}
      style={{
        position: 'relative', display: 'flex', alignItems: 'stretch',
        background: isActive || isHovered ? 'var(--bg-elevated)' : 'transparent',
        borderLeft: isActive ? '2px solid var(--accent)' : '2px solid transparent',
        transition: 'background 0.1s',
        overflow: 'hidden',
        width: '100%',
        boxSizing: 'border-box',
      }}
    >
      <button
        onClick={onSelect}
        className="btn-reset"
        style={{
          flex: 1, padding: '8px 16px',
          paddingRight: isHovered && !session.isNew ? '4px' : '16px',
          textAlign: 'left',
          minWidth: 0,
          overflow: 'hidden',
        }}
      >
        <span className="text-base truncate" style={{
          display: 'block',
          color: isActive ? 'var(--text-primary)' : 'var(--text-secondary)',
          fontWeight: isActive ? 500 : 400,
        }}>
          {session.isNew ? 'New conversation' : (session.name || session.external_id)}
        </span>
        {session.isNew && (
          <span className="text-xs text-accent" style={{ fontStyle: 'italic' }}>Unsaved</span>
        )}
      </button>
      <div
        style={{
          display: 'flex', alignItems: 'center',
          gap: '2px',
          paddingRight: isHovered && !session.isNew ? '8px' : '0px',
          flexShrink: 0,
          width: isHovered && !session.isNew ? '44px' : '0px',
          overflow: 'hidden',
          transition: 'width 0.1s ease',
        }}
      >
        <button className="btn-icon" title="Rename" onClick={onRename}
          style={{ padding: '2px 4px', fontSize: '12px' }}>✎</button>
        <button className="btn-icon" title="Delete" onClick={onDelete}
          style={{ padding: '2px 4px', fontSize: '12px', color: '#ff6b6b' }}>✕</button>
      </div>
    </div>
  );
 }
 function ContextMenuItem({ children, onClick, danger }) {
  return (
    <button
      className="btn-reset text-base"
      onClick={onClick}
      style={{ width: '100%', padding: '8px 12px', borderRadius: 'var(--radius-sm)', justifyContent: 'flex-start', color: danger ? '#ff6b6b' : 'var(--text-primary)' }}
      onMouseEnter={e => e.currentTarget.style.background = 'var(--bg-surface)'}
      onMouseLeave={e => e.currentTarget.style.background = 'transparent'}
    >{children}</button>
  );
 }
--- a/packages/chat-client/src/components/SummaryView.jsx
+++ b/packages/chat-client/src/components/SummaryView.jsx
@@ -0,0 +1,124 @@
 import React, { useState, useEffect } from 'react';
 import { fetchSessionSummaries } from '../api/orchestration';
 import ReactMarkdown from 'react-markdown';
 export default function SummaryView({ activeSession, onBack }) {
  const [summaries, setSummaries] = useState([]);
  const [loading, setLoading]     = useState(true);
  const [error, setError]         = useState(null);
  const [expanded, setExpanded]   = useState(null);
  useEffect(() => {
    if (!activeSession || activeSession.isNew) {
      setLoading(false);
      return;
    }
    setLoading(true);
    setError(null); // clear any error left over from a previously viewed session
    fetchSessionSummaries(activeSession.external_id)
      .then(data => setSummaries(Array.isArray(data) ? data : []))
      .catch(err => setError(err.message))
      .finally(() => setLoading(false));
  }, [activeSession]);
  function formatTimestamp(ts) {
    if (!ts) return '—';
    return new Date(ts * 1000).toLocaleString([], {
      month: 'short', day: 'numeric',
      hour: '2-digit', minute: '2-digit',
    });
  }
  return (
    <div style={{ display: 'flex', flexDirection: 'column', flex: 1, overflow: 'hidden', background: 'var(--bg-base)' }}>
      {/* Header */}
      <div className="panel-header" style={{ padding: '0 24px', gap: 12 }}>
        <button className="btn-icon" onClick={onBack}>←</button>
        <span className="text-base" style={{ fontWeight: 500 }}>Session Memory</span>
        <span className="text-sm text-muted" style={{ marginLeft: 'auto' }}>
          {summaries.length} summar{summaries.length !== 1 ? 'ies' : 'y'}
        </span>
      </div>
      {/* Session name pill */}
      {activeSession && (
        <div style={{ padding: '8px 24px 0' }}>
          <span className="text-xs text-muted" style={{
            background: 'var(--bg-elevated)',
            border: '1px solid var(--border)',
            borderRadius: '999px',
            padding: '3px 10px',
          }}>
            {activeSession.name || activeSession.external_id}
          </span>
        </div>
      )}
      {/* Content */}
      <div className="scroll-y flex-1" style={{ padding: '16px 24px' }}>
        {loading && <p className="text-sm text-muted">Loading…</p>}
        {error   && <p className="text-sm" style={{ color: 'var(--error, #e05)' }}>{error}</p>}
        {!loading && !activeSession && (
          <p className="text-sm text-muted">No active session.</p>
        )}
        {!loading && activeSession && summaries.length === 0 && (
          <div style={{
            display: 'flex', flexDirection: 'column', alignItems: 'center',
            gap: '12px', padding: '48px 0', color: 'var(--text-muted)',
          }}>
            <span style={{ fontSize: '28px', opacity: 0.3 }}>◈</span>
            <p className="text-sm">No summaries yet for this session.</p>
            <p className="text-xs text-muted" style={{ maxWidth: '280px', textAlign: 'center', lineHeight: 1.6 }}>
              Summaries generate automatically once a session accumulates enough conversation.
            </p>
          </div>
        )}
        {summaries.map(summary => (
          <div key={summary.id} style={{
            background: 'var(--bg-surface)',
            border: '1px solid var(--border)',
            borderRadius: 'var(--radius-lg)',
            marginBottom: '10px', overflow: 'hidden',
          }}>
            {/* Card header */}
            <div
              onClick={() => setExpanded(expanded === summary.id ? null : summary.id)}
              style={{ display: 'flex', alignItems: 'center', gap: '10px', padding: '10px 14px', cursor: 'pointer' }}
            >
              <span style={{ flex: 1, fontSize: 13, color: 'var(--text-primary)' }}>
                Episodes {summary.episode_range}
              </span>
              <span className="text-xs text-muted">{formatTimestamp(summary.created_at)}</span>
              <span className="text-muted" style={{ fontSize: 11 }}>
                {expanded === summary.id ? '▲' : '▼'}
              </span>
            </div>
            {/* Expanded content */}
            {expanded === summary.id && (
              <div style={{ padding: '0 14px 14px', borderTop: '1px solid var(--border)' }}>
                <ReactMarkdown components={{
                  p: ({ children }) => (
                    <p style={{ margin: '8px 0', lineHeight: 1.7, fontSize: 13, color: 'var(--text-secondary)' }}>
                      {children}
                    </p>
                  ),
                }}>
                  {summary.content}
                </ReactMarkdown>
                {summary.token_count > 0 && (
                  <p className="text-xs text-muted" style={{ marginTop: 8 }}>
                    {summary.token_count.toLocaleString()} tokens covered
                  </p>
                )}
              </div>
            )}
          </div>
        ))}
      </div>
    </div>
  );
 }
--- a/packages/chat-client/src/config/constants.js
+++ b/packages/chat-client/src/config/constants.js
@@ -11,4 +11,9 @@ export const API_DEFAULTS = {
    SESSIONS_LIMIT: 20,
    HISTORY_LIMIT:  50,
    OFFSET:          0,
    EPISODE_LIMIT:  50,
 }
 export const CLIENT_DEFAULTS = {
    PAGE_SIZE: 20,
 }
--- a/packages/chat-client/src/hooks/useChat.js
+++ b/packages/chat-client/src/hooks/useChat.js
@@ -1,5 +1,5 @@
-import { useState, useCallback, useRef } from 'react';
+import React, { useEffect, useState, useCallback, useRef } from 'react';
-import { streamMessage } from '../api/orchestration';
+import { streamMessage, updateSession } from '../api/orchestration';
 export function useChat({ activeSession, appendMessage, updateLastMessage, refreshSessions }) {
  const [streaming, setStreaming] = useState(false);
@@ -7,9 +7,22 @@ export function useChat({ activeSession, appendMessage, updateLastMessage, refre
  const [lastTokenCount, setLastTokenCount] = useState(0);
  const [lastModel, setLastModel] = useState(null);
  const cancelRef = useRef(null);
  const prevStreaming = useRef(false);
  const [summarising, setSummarising] = useState(false);
-  const sendMessage = useCallback(async (text, model) => {
+  useEffect(() => {
-    if (!activeSession || !text.trim() || streaming) return;
+    if (prevStreaming.current && !streaming) {
      // Stream just finished — trigger the summarising indicator
      setSummarising(true);
      const t = setTimeout(() => setSummarising(false), 8000);
      return () => clearTimeout(t);
    }
    prevStreaming.current = streaming;
  }, [streaming]);
  const sendMessage = useCallback(async (text, model, projectId = null, session=null) => {
    const targetSession = session ?? activeSession;
    if (!targetSession || !text.trim() || streaming) return;
    setError(null);
@@ -32,7 +45,7 @@ export function useChat({ activeSession, appendMessage, updateLastMessage, refre
    // 3. Open stream
    cancelRef.current = streamMessage(
-      activeSession.external_id,
+      targetSession.external_id,
      text,
      model,
      {
@@ -53,6 +66,15 @@ export function useChat({ activeSession, appendMessage, updateLastMessage, refre
          // Refresh session list so new sessions appear in sidebar
          refreshSessions();
          // Delayed refresh
          setTimeout( () => refreshSessions(), 3000);
          // Assign project after first message if one was set
          if (projectId) {
            updateSession(targetSession.external_id, { projectId })
              .catch(err => console.warn('[useChat] Failed to assign project:', err.message));
          }
        },
        onError: (err) => {
@@ -86,5 +108,6 @@ export function useChat({ activeSession, appendMessage, updateLastMessage, refre
    error,
    lastTokenCount,
    lastModel,
    summarising,
  };
 }
--- a/packages/chat-client/src/hooks/useProjects.js
+++ b/packages/chat-client/src/hooks/useProjects.js
@@ -0,0 +1,19 @@
 import { useState, useEffect, useCallback } from 'react';
 import { fetchProjects } from '../api/orchestration';
 export function useProjects() {
  const [projects, setProjects] = useState([]);
  const refreshProjects = useCallback(async () => {
    try {
      setProjects(await fetchProjects());
    } catch (err) {
      console.warn('[useProjects] Failed to load projects:', err.message);
    }
  }, []);
  useEffect(() => { refreshProjects(); }, [refreshProjects]);
  return { projects, refreshProjects };
 }
--- a/packages/chat-client/src/hooks/useSession.js
+++ b/packages/chat-client/src/hooks/useSession.js
@@ -36,6 +36,7 @@ export function useSession() {
  const selectSession = useCallback(async (session) => {
    setActiveSession(session);
    setMessages([]);
    if (!session || session.isNew) return;
    setLoadingHistory(true);
    try {
@@ -57,11 +58,12 @@ export function useSession() {
    const newSession = {
      external_id: uuidv4(),
      metadata: null,
-      isNew: true, // flag so SessionList can style it differently
+      isNew: true,
    };
    setSessions(prev => [newSession, ...prev]);
    setActiveSession(newSession);
    setMessages([]);
    return newSession;
  }, []);
@@ -82,6 +84,7 @@ export function useSession() {
  return {
    sessions,
    setSessions,
    activeSession,
    messages,
    loadingHistory,
--- a/packages/chat-client/src/hooks/useSettings.js
+++ b/packages/chat-client/src/hooks/useSettings.js
@@ -0,0 +1,25 @@
 import { useState, useEffect } from 'react';
 import { getSettings, updateSettings } from '../api/orchestration';
 export function useSettings() {
  const [settings, setSettings] = useState(null);
  const [saving, setSaving] = useState(false);
  useEffect(() => {
    getSettings().then(setSettings).catch(console.error);
  }, []);
  async function saveSetting(key, value) {
    setSaving(true);
    try {
      const updated = await updateSettings({ [key]: value });
      setSettings(updated);
    } catch (err) {
      console.error('[useSettings] Save failed:', err.message);
    } finally {
      setSaving(false);
    }
  }
  return { settings, saveSetting, saving };
 }
--- a/packages/chat-client/src/index.css
+++ b/packages/chat-client/src/index.css
@@ -1,17 +1,19 @@
 *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
 :root {
-  --bg-base:        #0f1117;
+  --bg-base:        #9c9a9a;
-  --bg-surface:     #0e0d0d;
+  --bg-surface:     #000000;
-  --bg-elevated:    #222536;
+  --bg-elevated:    #111111;
-  --border:         #2e3150;
+  --border:         #989899;
-  --accent:         #3d3a79;
+  --accent:         #333335;
  --accent-hover:   #574fd6;
  --text-primary:   #e8e8f0;
  --text-secondary: #8b8fa8;
-  --text-muted:     #555870;
+  --text-muted:     #ababaf;
-  --bubble-user:    #4742a8;
+  --text-sb-hdr:    #ffffff;
-  --bubble-ai:      #20264d;
+  --bubble-user:    #020202;
  --bubble-ai:      #303033;
  --warning:        #ec5353;
  --sidebar-width:  180px;
  --panel-width:    200px;
  --header-height:  40px;
@@ -33,6 +35,10 @@ html, body, #root {
  50%       { opacity: 0; }
 }
@keyframes spin {
  to { transform: rotate(360deg); }
 }
 /* ── Layout ─────────────────────────────────────────── */
 .flex        { display: flex; }
@@ -64,7 +70,9 @@ html, body, #root {
  cursor: pointer;
  display: flex;
  align-items: center;
-  justify-content: center;
+  justify-content: flex-start;
  min-width: 0;
  overflow: hidden;
 }
 .btn-icon {
@@ -105,5 +113,15 @@ html, body, #root {
 .text-muted   { color: var(--text-muted); }
 .text-secondary { color: var(--text-secondary); }
 .text-accent  { color: var(--accent); }
-.label-upper  { font-size: 11px; font-weight: 500; color: var(--text-muted); text-transform: uppercase; letter-spacing: 0.08em; }
+.label-upper  { font-size: 13px; font-weight: 750; color: var(--text-muted); text-transform: uppercase; letter-spacing: 0.08em; }
 .truncate     { overflow: hidden; text-overflow: ellipsis; white-space: nowrap; }
 .spinner {
  width: 12px;
  height: 12px;
  border: 2px solid var(--border);
  border-top-color: var(--text-muted);
  border-radius: 50%;
  animation: spin 0.7s linear infinite;
  flex-shrink: 0;
 }
--- a/packages/chat-client/vite.config.js
+++ b/packages/chat-client/vite.config.js
@@ -13,6 +13,10 @@ export default defineConfig({
      '/sessions':  'http://192.168.0.205:4000',
      '/models':    'http://192.168.0.205:4000',
      '/projects':  'http://192.168.0.205:4000',
      '/episodes':  'http://192.168.0.205:4000',
      '/settings':  'http://192.168.0.205:4000',
      '/health':    'http://192.168.0.205:4000',
      '/summaries': 'http://192.168.0.205:4000',
    },
  },
 });
--- a/packages/embedding-service/CLAUDE.md
+++ b/packages/embedding-service/CLAUDE.md
@@ -0,0 +1,64 @@
 # CLAUDE.md
 This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
 See the root [CLAUDE.md](../../CLAUDE.md) for overall architecture, service roles, and deployment layout.
 ## Running This Service
 ```bash
 npm run embedding                          # From repo root
 npm -w packages/embedding-service run dev  # With --watch
 ```
 Default port: **3003**. Requires Ollama to be reachable at `OLLAMA_URL`.
 ## Single-File Service
 The entire service is `src/index.js` — no subdirectory structure. All routes, the Ollama helper, and startup are in one file.
 ## Environment Variables
 | Variable | Default | Description |
 |---|---|---|
 | `PORT` | `3003` | Port to listen on |
 | `OLLAMA_URL` | `http://localhost:11434` | Ollama instance URL |
 | `EMBEDDING_MODEL` | `nomic-embed-text` | Model passed to Ollama `/api/embed` |
 Note: the env var name is `EMBEDDING_MODEL`, not `EMBED_MODEL` — the internal constant is `EMBED_MODEL` but the lookup key is different.
 ## Ollama API Details
 Uses Ollama's `/api/embed` endpoint (not `/api/embeddings`). Request shape:
 ```json
 { "model": "nomic-embed-text", "input": "text to embed" }
 ```
 Ollama returns `{ "embeddings": [[...]] }` — an array of arrays even for a single input. The helper takes `data.embeddings[0]` to return the single vector.
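The request and response shapes above can be sketched as two small helpers. This is a minimal illustration, not the service's actual code: `buildEmbedRequest` and `firstEmbedding` are hypothetical names, and the shape check is an added safeguard that the original helper may not perform.

```javascript
// Sketch only: both helper names are hypothetical and not taken from src/index.js.

// Build the fetch() init object for Ollama's /api/embed endpoint.
function buildEmbedRequest(text, model = 'nomic-embed-text') {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, input: text }),
  };
}

// Ollama wraps vectors in an outer array even for a single input,
// so the single vector lives at index 0.
function firstEmbedding(data) {
  if (!data || !Array.isArray(data.embeddings) || !Array.isArray(data.embeddings[0])) {
    throw new Error('Unexpected embed response shape');
  }
  return data.embeddings[0];
}
```

A caller would pass the result of `buildEmbedRequest` as the second argument to `fetch` and hand the parsed JSON body to `firstEmbedding`.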
 The `ollama` npm package is listed as a dependency but is **not used** — all calls are raw `fetch`. Do not refactor to use the package without checking the API shape matches.
 ## Batch Endpoint
 `POST /embed/batch` embeds items **sequentially** in a for-loop, not in parallel. The comment explains this: Ollama doesn't parallelise embedding calls, so parallel requests would queue internally anyway. Do not change to `Promise.all` without verifying Ollama behaviour.
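The sequential behaviour described above might look roughly like the following. It is a sketch under assumptions: `embedSequentially` is a hypothetical name, and `embedFn` stands in for whatever single-text embedding helper the service uses.

```javascript
// Sketch only: embedSequentially is a hypothetical name; embedFn is an injected
// stand-in for the service's single-text embedding helper.
async function embedSequentially(texts, embedFn) {
  const embeddings = [];
  for (const text of texts) {
    // Awaiting inside the loop keeps exactly one request in flight at a time.
    embeddings.push(await embedFn(text.trim()));
  }
  return embeddings;
}
```

Injecting `embedFn` keeps the loop testable without a running Ollama instance. Switching to `Promise.all` would change only the loop body, which is why the note above asks for Ollama's behaviour to be verified first.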
 ## Error Responses
 | Condition | Status | Notes |
 |---|---|---|
 | Missing/empty `text` | 400 | |
 | Ollama call fails | 502 | Upstream failure — correct status |
 | Empty `texts` array | 400 | |
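The 400 conditions in the table can be illustrated with a small validator. This is a sketch, not the route handler itself: `validateTexts` is a hypothetical helper, and the message used for an empty array is an assumption.

```javascript
// Sketch only: validateTexts is a hypothetical helper, and the empty-array
// message text is an assumption rather than the service's real wording.
function validateTexts(texts) {
  if (!Array.isArray(texts) || texts.length === 0) {
    return { status: 400, error: 'texts must be a non-empty array' };
  }
  const invalid = texts.findIndex(t => !t || typeof t !== 'string' || t.trim() === '');
  if (invalid !== -1) {
    return { status: 400, error: `texts[${invalid}] is empty or not a string` };
  }
  // Input is valid. A failure in the later Ollama call maps to 502 instead.
  return null;
}
```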
 ## Known Issue
 The 400 error message for `/embed` reads `"text is required and must be empty"` — the word "not" is missing. Should read `"must not be empty"`.
 ## API Endpoints
 | Method | Path | Notes |
 |---|---|---|
 | GET | `/health` | Static response — does not verify Ollama is reachable |
 | POST | `/embed` | Body: `{ text: string }`. Returns `{ embedding, model, dimensions }` |
 | POST | `/embed/batch` | Body: `{ texts: string[] }`. Returns `{ embeddings, model, dimensions, count }` |
--- a/packages/embedding-service/package.json
+++ b/packages/embedding-service/package.json
@@ -9,7 +9,6 @@
  "dependencies": {
    "@nexusai/shared": "^1.0.0",
    "dotenv": "^17.4.0",
-    "express": "^5.2.1",
-    "ollama": "^0.6.3"
+    "express": "^5.2.1"
  }
 }
--- a/packages/embedding-service/src/index.js
+++ b/packages/embedding-service/src/index.js
@@ -1,20 +1,21 @@
 require ('dotenv').config();
 const express = require('express');
-const {getEnv, OLLAMA, PORTS} = require('@nexusai/shared');
+const {getEnv, OLLAMA, PORTS, logger} = require('@nexusai/shared');
 const app = express();
-app.use(express.json());
+app.use(express.json({ limit: '1mb' }));    // limit request body to 1mb to prevent abuse - embedding requests should be small
-const PORT          = getEnv('PORT',            PORTS.EMBEDDING);  // Default to 3003 if PORT is not set
+const PORT          = getEnv('PORT',            PORTS.EMBEDDING);  
-const OLLAMA_URL    = getEnv('OLLAMA_URL',      OLLAMA.DEFAULT_URL); // URL for Ollama API
+const OLLAMA_URL    = getEnv('OLLAMA_URL',      OLLAMA.DEFAULT_URL); 
-const EMBED_MODEL   = getEnv('EMBEDDING_MODEL', OLLAMA.EMBED_MODEL); // Ollama model for embeddings
+const EMBED_MODEL   = getEnv('EMBEDDING_MODEL', OLLAMA.EMBED_MODEL); 
 //OLLAMA embedding helper function
 async function embedText(text) {
    const res = await fetch(`${OLLAMA_URL}/api/embed`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ model: EMBED_MODEL, input: text })
+        body: JSON.stringify({ model: EMBED_MODEL, input: text }),
        signal: AbortSignal.timeout(30_000),
    });
    if (!res.ok) {
@@ -37,7 +38,7 @@ app.get('/health', (req,res) => {
 app.post('/embed', async (req, res) => {
    const { text } = req.body;
    if (!text || typeof text !== 'string' || text.trim() === '') {
-        return res.status(400).json({ error: 'text is required and must be empty' });
+        return res.status(400).json({ error: 'text is required and must not be empty' });
    }
    try {
@@ -60,7 +61,10 @@ app.post('/embed/batch', async (req, res) => {
    }
    try {
-        //sequential embedding for now, Ollama doesn't natively parallize embeddings
+        const invalid = texts.findIndex(t => !t || typeof t !== 'string' || t.trim() === '');
        if (invalid !== -1)
            return res.status(400).json({ error: `texts[${invalid}] is empty or not a string` });
        const embeddings = [];
        for (const text of texts) {
            embeddings.push(await embedText(text.trim()));
@@ -78,5 +82,5 @@ app.post('/embed/batch', async (req, res) => {
 /******* Start Server ********/
 app.listen(PORT, () => {
-    console.log(`Embedding Service listening on port ${PORT}`);
+    logger.info(`Embedding Service listening on port ${PORT}`);
 });
--- a/packages/inference-service/CLAUDE.md
+++ b/packages/inference-service/CLAUDE.md
@@ -0,0 +1,75 @@
 # CLAUDE.md
 This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
 See the root [CLAUDE.md](../../CLAUDE.md) for overall architecture, service roles, and deployment layout.
 ## Running This Service
 ```bash
 npm run inference                          # From repo root
 npm -w packages/inference-service run dev  # With --watch
 ```
 Default port: **3001**. Set `INFERENCE_PROVIDER` to select the backend.
 ## Provider Pattern
 `src/infer.js` reads `INFERENCE_PROVIDER` at startup and loads one of two providers:
 | `INFERENCE_PROVIDER` | Module | Backend |
 |---|---|---|
 | `ollama` (default) | `src/providers/ollama.js` | Ollama npm client → `/api/generate` |
 | `llamacpp` | `src/providers/llamacpp.js` | Raw fetch → `/v1/chat/completions` (OpenAI-compatible) |
 An unknown provider throws immediately at startup — fail-fast, not at request time.
 Both providers export the same interface: `complete(prompt, options)` and `completeStream(prompt, options)`.
 ## Environment Variables
 | Variable | Default | Description |
 |---|---|---|
 | `PORT` | `3001` | Port to listen on |
 | `INFERENCE_PROVIDER` | `ollama` | `ollama` or `llamacpp` |
 | `INFERENCE_URL` | `http://localhost:11434` (Ollama) / `http://localhost:8080` (llama.cpp) | Backend URL |
 | `DEFAULT_MODEL` | Provider-specific | Model name passed to backend |
 `INFERENCE_URL` defaults differ per provider — Ollama uses the Ollama default URL, llama.cpp uses the llama-server default.
 ## Options Resolution
 Both providers use `resolveOptions(options)` to merge caller-supplied options with `INFERENCE_DEFAULTS` from shared constants. Any option not supplied by the caller falls back to the constant.
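A minimal sketch of the merge described above, assuming caller values win and `undefined` falls back to the default. `resolveOptionsSketch` and `EXAMPLE_DEFAULTS` are hypothetical names with illustrative values; the real `resolveOptions` and `INFERENCE_DEFAULTS` are not shown here and may differ.

```javascript
// Illustrative defaults only; the real values live in INFERENCE_DEFAULTS in the shared package.
const EXAMPLE_DEFAULTS = { temperature: 0.7, maxTokens: 1024, topP: 0.9 };

// Hypothetical stand-in for resolveOptions: caller values win, undefined falls back to the default.
function resolveOptionsSketch(options = {}, defaults = EXAMPLE_DEFAULTS) {
    const resolved = { ...defaults };
    for (const [key, value] of Object.entries(options)) {
        // Skip undefined so fields destructured from a sparse request body do not erase defaults.
        if (value !== undefined) resolved[key] = value;
    }
    return resolved;
}

const exampleResolved = resolveOptionsSketch({ temperature: 0.2, maxTokens: undefined });
```

The explicit `undefined` check matters because the routes destructure every option from `req.body`, so any field the caller omits arrives as `undefined`; a plain object spread would overwrite the default with it.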
 ## Streaming Chunk Format
 The two providers yield differently shaped chunks — the route in `src/routes/inference.js` normalises them:
 **Ollama** yields raw Ollama generate chunks per token (`{ response, done: false, model, ... }`). Its final chunk is normalised to `{ response: '', done: true, model, tokenCount }`, where the token count is `eval_count + prompt_eval_count`.
 **llama.cpp** yields:
 - Per-token: `{ response: delta, done: false }`
 - Final: `{ response: '', done: true, model, tokenCount }`, where the token count is `completion_tokens + prompt_tokens` from the usage chunk
 The route checks `chunk.response` to stream text and `chunk.done` to capture metadata. Both providers set `tokenCount` on the done chunk, so the route reads `chunk.tokenCount` regardless of backend.
 ## `maxTokens` in the Streaming Route
 `POST /complete` and `POST /complete/stream` both destructure `maxTokens` (along with `topP`, `topK`, and `repeatPenalty`) from the request body and pass it to the provider. Streaming completions fall back to `INFERENCE_DEFAULTS.MAX_TOKENS` only when the caller omits `maxTokens`.
 ## SSE Format (route → caller)
 ```
 data: {"response":"Hello"}        ← per token
 data: {"response":" world"}
 data: {"done":true,"model":"...","tokenCount":42}  ← final metadata
 data: [DONE]                       ← sentinel
 ```
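A minimal sketch of how a caller might parse the frames shown above. `parseSseFrames` is a hypothetical helper, not part of this repository; it assumes each frame is a single `data:` line terminated by a blank line and that the whole payload has already been buffered.

```javascript
// Hypothetical helper: parse a buffered SSE payload in the format shown above.
// Assumes every frame is one `data: ...` line followed by a blank line.
function parseSseFrames(payload) {
    const result = { text: '', model: null, tokenCount: 0, finished: false, error: null };

    for (const frame of payload.split('\n\n')) {
        const line = frame.trim();
        if (!line.startsWith('data: ')) continue;   // skip blanks and non-data lines

        const body = line.slice(6);
        if (body === '[DONE]') {                     // sentinel: stream is complete
            result.finished = true;
            break;
        }

        const json = JSON.parse(body);
        if (json.error) result.error = json.error;          // error frame written by the route
        if (json.response) result.text += json.response;    // per-token text
        if (json.done) {                                    // final metadata frame
            result.model = json.model ?? result.model;
            result.tokenCount = json.tokenCount ?? result.tokenCount;
        }
    }
    return result;
}

const sampleStream = [
    'data: {"response":"Hello"}',
    'data: {"response":" world"}',
    'data: {"done":true,"model":"example-model","tokenCount":42}',
    'data: [DONE]',
].join('\n\n') + '\n\n';

const parsedStream = parseSseFrames(sampleStream);
```

A real client reading from a network stream would also need to handle frames split across reads; that buffering is left out to keep the sketch short.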
 ## API Endpoints
 | Method | Path | Notes |
 |---|---|---|
 | GET | `/health` | Returns `{ service, status, provider, model }` |
 | POST | `/complete` | Body: `{ prompt, model?, temperature?, maxTokens?, topP?, topK?, repeatPenalty? }` |
 | POST | `/complete/stream` | Same body as `/complete`; responds with an SSE stream |
--- a/packages/inference-service/src/index.js
+++ b/packages/inference-service/src/index.js
@@ -1,10 +1,10 @@
 require ('dotenv').config();
 const express = require('express');
-const {getEnv, PORTS, OLLAMA} = require('@nexusai/shared');
+const {getEnv, PORTS, OLLAMA, logger} = require('@nexusai/shared');
 const inferenceRouter = require('./routes/inference');
 const app = express();
-app.use(express.json());
+app.use(express.json({ limit: '8mb' }));  // prompts include full context window
 const PORT      = getEnv('PORT', PORTS.INFERENCE);
 const PROVIDER  = getEnv('INFERENCE_PROVIDER',   'ollama');
@@ -24,5 +24,5 @@ app.use('/', inferenceRouter);
 // Start the server
 app.listen(PORT, () => {
-    console.log(`Inference Service is running on port ${PORT}`);
+    logger.info(`Inference Service is running on port ${PORT}`);
 });
--- a/packages/inference-service/src/providers/llamacpp.js
+++ b/packages/inference-service/src/providers/llamacpp.js
@@ -1,4 +1,4 @@
-const { getEnv, LLAMACPP, INFERENCE_DEFAULTS } = require("@nexusai/shared");
+const { getEnv, LLAMACPP, INFERENCE_DEFAULTS, logger } = require("@nexusai/shared");
 const BASE_URL = getEnv("INFERENCE_URL", LLAMACPP.DEFAULT_URL);
 const DEFAULT_MODEL = getEnv("DEFAULT_MODEL", LLAMACPP.DEFAULT_MODEL);
@@ -26,6 +26,7 @@ function buildPayload(prompt, options, stream = false) {
    top_k: opts.topK,
    repeat_penalty: opts.repeatPenalty,
    stream,
    stream_options: stream ? { include_usage: true } : undefined,
    ...(opts.seed !== null && { seed: opts.seed }),
  };
 }
@@ -75,15 +76,21 @@ async function* completeStream(prompt, options = {}) {
        const json = JSON.parse(line.slice(6));
        const delta = json.choices?.[0]?.delta?.content;
-      // Capture final metadata from the stop chunk
-      if (json.choices?.[0]?.finish_reason === "stop") {
+        if (json.choices?.[0]?.finish_reason === 'stop') {
             finalModel = json.model ?? finalModel;
-        finalTokenCount = json.usage?.completion_tokens ?? finalTokenCount;
+        }
        // usage arrives in a separate final chunk with empty choices array
        if (json.usage) {
            finalTokenCount = (json.usage.completion_tokens ?? 0) + (json.usage.prompt_tokens ?? 0);
        }
        if (delta) yield { response: delta, done: false };
    }
  }
  logger.info('[llamacpp] finalTokenCount:', finalTokenCount);
  yield { response: '', done: true, model: finalModel, tokenCount: finalTokenCount };
 }
--- a/packages/inference-service/src/providers/ollama.js
+++ b/packages/inference-service/src/providers/ollama.js
@@ -57,8 +57,17 @@ async function* completeStream(prompt, options = {} ) {
    });
    for await (const chunk of stream) {
        if (chunk.done) {
            yield {
                response:   '',
                done:       true,
                model:      chunk.model,
                tokenCount: (chunk.eval_count ?? 0) + (chunk.prompt_eval_count ?? 0),
            };
        } else {
            yield chunk;
        }
    }
 }
 module.exports = { complete, completeStream };
--- a/packages/inference-service/src/routes/inference.js
+++ b/packages/inference-service/src/routes/inference.js
@@ -1,28 +1,29 @@
 const { Router } = require('express');
 const { complete, completeStream } = require('../infer');
 const { logger } = require('@nexusai/shared');
 const router = Router();
 // Standard completion endpoint - returns full response when done
 router.post('/complete', async (req, res) => {
-    const { prompt, model, temperature, maxTokens } = req.body;
+    const { prompt, model, temperature, maxTokens, topP, topK, repeatPenalty } = req.body;
    if (!prompt) {
        return res.status(400).json({ error: 'prompt is required'});
    }
    try {
-        const result = await complete (prompt, {model, temperature, maxTokens});
+        const result = await complete (prompt, {model, temperature, maxTokens, topP, topK, repeatPenalty});
        res.json(result);
    } catch (error) {
-        console.error('[Inference] Completion error:', error.message);
+        logger.error('[Inference] Completion error:', error.message);
-        res.status(500).json({ error: error.message });
+        res.status(500).json({ error: 'Inference failed', detail: error.message });
    }
 });
 // Streaming completion endpoint - sends partial responses as they arrive
 router.post('/complete/stream', async (req, res) => {
-    const { prompt, model, temperature } = req.body;
+    const { prompt, model, temperature, maxTokens, topP, topK, repeatPenalty } = req.body;
    if (!prompt) return res.status(400).json({ error: 'prompt is required' });
@@ -34,7 +35,7 @@ router.post('/complete/stream', async (req, res) => {
        let lastModel = model;
        let tokenCount = 0;
-        for await (const chunk of completeStream(prompt, { model, temperature })) {
+        for await (const chunk of completeStream(prompt, { model, temperature, maxTokens, topP, topK, repeatPenalty })) {
            if (chunk.response) {
                res.write(`data: ${JSON.stringify({ response: chunk.response })}\n\n`);
            }
@@ -42,6 +43,7 @@ router.post('/complete/stream', async (req, res) => {
                // capture final metadata from the done signal
                lastModel  = chunk.model      ?? lastModel;
                tokenCount = chunk.tokenCount ?? tokenCount;
                logger.info('[inference router] tokenCount from chunk:', chunk.tokenCount, '→', tokenCount);
            }
        }
@@ -50,7 +52,7 @@ router.post('/complete/stream', async (req, res) => {
        res.write('data: [DONE]\n\n');
    } catch (err) {
-        console.error('[Inference] Streaming error:', err.message);
+        logger.error('[Inference] Streaming error:', err.message);
        res.write(`data: ${JSON.stringify({ error: err.message })}\n\n`);
    } finally {
        res.end();
--- a/packages/memory-service/CLAUDE.md
+++ b/packages/memory-service/CLAUDE.md
@@ -0,0 +1,114 @@
 # CLAUDE.md
 This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
 See the root [CLAUDE.md](../../CLAUDE.md) for overall architecture, service roles, and the dual-store memory model.
 ## Running This Service
 ```bash
 npm run memory             # From repo root (node src/index.js)
 npm -w packages/memory-service run dev   # With --watch
 ```
 Default port: **3002**. Requires Qdrant and the embedding-service to be reachable on startup.
 ## SQLite Schema
 `src/db/schema.js` is the source of truth for the data model. Key schema facts:
 - `sessions` and `episodes` are linked by FK with cascade delete — deleting a session removes all its episodes automatically.
 - `episodes_fts` is an FTS5 virtual table that mirrors `user_message` and `ai_response`. It is kept in sync via SQL triggers on INSERT/UPDATE/DELETE. On service startup, the FTS index is fully rebuilt from live episode data.
 - Several columns (`sessions.name`, `sessions.project_id`, `entities.mention_count`, etc.) were added as migrations using `ALTER TABLE` wrapped in individual try-catch blocks. Failures are silently swallowed — if a column already exists, the alter fails and the service continues. The `idx_summaries_project` index is defined twice (benign duplicate).
 - `summaries` rows with `session_id IS NULL` and a `project_id` represent project-level overviews, not session summaries. This distinction is how `GET /projects/:id/overview` works.
 - `entity_episodes` is a join table linking entities to the episodes in which they were extracted. Used for provenance tracking and future orphan cleanup. Defined in `schema.js` (not a migration), so it exists on all installs.
 **New columns on `entities` (added via migration):**
 - `mention_count INTEGER DEFAULT 1` — incremented every time this entity is re-extracted
 - `confidence REAL DEFAULT 1.0` — reserved for future confidence scoring
 - `source TEXT DEFAULT 'extraction'` — `'extraction'` or `'manual'`
 - `last_seen_at INTEGER` — Unix timestamp of most recent extraction hit
 **New columns on `relationships` (added via migration):**
 - `mention_count INTEGER DEFAULT 1` — incremented every time this edge is re-extracted
 - `notes TEXT` — relationship context sentence from extraction
 ## Async Pipeline: Episode Creation
 `POST /episodes` returns a 201 as soon as the SQLite insert succeeds. Two background tasks run after without blocking the response:
 1. **Embedding** — Fetches a vector from embedding-service, stores to Qdrant with `{sessionId, createdAt}` as payload metadata.
 2. **Entity + relationship extraction** — Sends the episode text to Ollama (`qwen2.5:3b`, temp 0.1, 1500 tokens) and upserts any recognized entities and relationships to both SQLite and Qdrant. Also links each entity to the episode via `entity_episodes`.
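 Both tasks catch and log their own errors without surfacing them to the caller. If the embedding step fails, an episode can exist in SQLite with no corresponding Qdrant point; if extraction fails, the episode simply has no linked entities.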
 ## Entity Extraction Details
 `src/entities/extraction.js`:
 - Fetches the last 20 known entities from SQLite before prompting the model, so the prompt can ask for name/type consistency with existing entries.
 - Recognized entity types: `person`, `place`, `project`, `technology`, `concept`, `organization` — anything else is discarded.
 - Ignores a hardcoded list of low-value names (`hello`, `thanks`, `good morning`, etc.).
 - Extracts JSON using a regex (`{...}`) applied to raw model output, so surrounding prose doesn't break parsing.
 - The model is asked to return both entities and relationships in a single JSON response: `{ "entities": [...], "relationships": [...] }`.
 - Entity upsert uses `ON CONFLICT(name, type) DO UPDATE` — preserves existing `notes` if the new extraction returns null, increments `mention_count`, updates `last_seen_at`.
 - Relationship upsert uses `ON CONFLICT(from_id, to_id, label) DO UPDATE` — increments `mention_count`, preserves existing `notes` if new is null.
 - Relationships are resolved by looking up both endpoints in the `entityMap` built during entity processing — if either entity wasn't saved (filtered out or invalid type), the relationship is silently dropped.
 - After upsert, embeds each entity as `"${name} (${type}): ${notes}"` and stores to Qdrant with `projectId` in the payload for project-scoped filtering.
 > For full details see `docs/services/entity-extraction.md` and `docs/services/knowledge-graph.md`.
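The regex-based JSON recovery described above can be sketched as follows. `extractJsonObject` is a hypothetical wrapper name; the pattern is the greedy `{...}` match the extraction code applies to raw model output.

```javascript
// Pull the outermost {...} span out of raw model output and parse it.
// Returns null when no object is found or the JSON is malformed.
function extractJsonObject(raw) {
    const match = raw.trim().match(/\{[\s\S]*\}/);   // greedy: first "{" through last "}"
    if (!match) return null;
    try {
        return JSON.parse(match[0]);
    } catch {
        return null;
    }
}

const noisyOutput = [
    'Sure! Here is the result:',
    '{ "entities": [{"name": "Qdrant", "type": "technology", "notes": null}], "relationships": [] }',
    'Let me know if you need more.',
].join('\n');

const extracted = extractJsonObject(noisyOutput);
```

Because the match is greedy, trailing prose that itself contains a `}` widens the span and makes parsing fail, so the sketch returns `null` in that case rather than a partial result.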
 ## Knowledge Graph
 `src/graph/index.js` provides two SQLite traversal functions:
 - **`getNeighborhood(entityId, depth)`** — Single-entity recursive CTE traversal. Bidirectional (follows edges in both directions). Returns `{ nodes: [...entities], edges: [...relationships] }`. Depth defaults to `ENTITIES.GRAPH_HOP_DEPTH` (1), max enforced to 3 at the HTTP layer.
 - **`getEntityNeighbors(entityIds[])`** — Bulk 1-hop version for orchestration. Given a set of seed entity IDs, returns their immediate neighbors plus all edges within the combined node set.
 The recursive CTE uses `UNION` (not `UNION ALL`) to eliminate cycles and duplicate visits automatically.
 > For full design rationale and usage see `docs/services/knowledge-graph.md`.
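An in-memory analogue of the bidirectional, depth-limited traversal described above, as a rough sketch. It is not the SQL the service runs: `neighborhoodSketch` and the edge shape are illustrative assumptions, and the `visited` set plays the de-duplicating role that `UNION` plays in the recursive CTE.

```javascript
// In-memory analogue of a bidirectional, depth-limited neighborhood traversal.
// `edges` is an array of { from_id, to_id, label }; the shape is illustrative.
function neighborhoodSketch(edges, startId, depth = 1) {
    const visited = new Set([startId]);   // prevents revisiting nodes, so cycles terminate
    let frontier = [startId];

    for (let hop = 0; hop < depth; hop++) {
        const next = [];
        for (const { from_id, to_id } of edges) {
            // Follow each edge in both directions.
            if (frontier.includes(from_id) && !visited.has(to_id)) {
                visited.add(to_id);
                next.push(to_id);
            }
            if (frontier.includes(to_id) && !visited.has(from_id)) {
                visited.add(from_id);
                next.push(from_id);
            }
        }
        frontier = next;
    }

    // Keep every edge whose endpoints both landed in the visited set.
    const inSet = edges.filter(e => visited.has(e.from_id) && visited.has(e.to_id));
    return { nodeIds: [...visited], edges: inSet };
}

const exampleEdges = [
    { from_id: 1, to_id: 2, label: 'uses' },
    { from_id: 3, to_id: 1, label: 'knows' },
    { from_id: 2, to_id: 4, label: 'part_of' },
];
const oneHop = neighborhoodSketch(exampleEdges, 1, 1);
```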
 ## Summarization Strategy
 `src/summarization/project.js`:
 - Preferred path: generate a project overview from existing **session-level summaries** (higher-level abstraction, shorter input).
 - Fallback path: if no session summaries exist, summarize raw episodes directly (up to `SUMMARIES.MAX_PROJECT_EPISODE_LIMIT`).
 - Both paths truncate input at `SUMMARIES.MAX_SUMMARY_CHARS` (8,000 chars) by slicing from the end (most recent content wins).
 - Strips ChatML tokens from the Ollama response (`<|im_start|>`, `<|im_end|>`).
 - Uses temp 0.2 and `num_predict 1200`.
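The end-anchored truncation mentioned above can be sketched like this. `truncateFromEnd` is a hypothetical helper name, and the limit in the example is a small stand-in for `SUMMARIES.MAX_SUMMARY_CHARS`.

```javascript
// Keep only the last `maxChars` characters so the most recent content survives.
function truncateFromEnd(text, maxChars) {
    if (text.length <= maxChars) return text;
    return text.slice(text.length - maxChars);
}

const combinedSummaries = 'oldest session notes | newer session notes | newest session notes';
const trimmed = truncateFromEnd(combinedSummaries, 20);
```

Slicing from the end assumes the input is ordered oldest to newest, which matches the `ORDER BY created_at ASC` used by the summary queries.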
 ## Qdrant Client
 `src/semantic/index.js` creates the Qdrant client lazily on first use and reuses it. All three collections (`episodes`, `entities`, `summaries`) are created at startup if missing. There is no connection health check — if Qdrant is unreachable, semantic operations throw at call time.
 ## API Endpoints Quick Reference
 | Method | Path | Notes |
 |---|---|---|
 | GET | `/health` | Static response, no dependency checks |
 | GET/POST | `/sessions` | POST requires `externalId`; duplicate → 409 |
 | GET/PATCH | `/sessions/by-external/:externalId` | PATCH accepts `name`, `projectId` |
 | DELETE | `/sessions/by-external/:externalId` | Cascades to episodes, summaries, relationships |
 | GET/POST | `/episodes` | POST triggers async embedding + entity/relationship extraction |
 | GET | `/episodes/search` | FTS5 search; route must precede `/:id` |
 | GET | `/sessions/:id/episodes` | Paginated, ordered `created_at DESC` |
 | DELETE | `/episodes/:id` | Removes from SQLite + async Qdrant delete |
 | POST | `/entities` | Upsert by `(name, type)`; increments `mention_count` on conflict |
 | GET | `/entities/by-type/:type` | All entities of given type |
 | GET/DELETE | `/entities/:id` | |
 | POST | `/relationships` | Upsert by `(fromId, toId, label)`; increments `mention_count` on conflict. Body: `fromId`, `toId`, `label`, `notes` (optional) |
 | GET | `/entities/:id/relationships` | Outbound only |
 | DELETE | `/relationships` | Body: `fromId`, `toId`, `label` |
 | GET | `/graph/neighborhood/:entityId` | Single-entity neighborhood; `?depth=` (default 1, max 3) |
 | POST | `/graph/neighbors` | Bulk 1-hop neighborhood; body: `{ entityIds: [...] }` |
 | GET/POST | `/projects` | POST requires non-empty `name` |
 | GET/PATCH/DELETE | `/projects/:id` | |
 | POST | `/projects/:id/summarize` | On-demand overview generation; 422 if no data |
 | GET | `/projects/:id/overview` | Returns null (not 404) if no overview exists |
 | GET | `/projects/:id/summaries` | All summaries for project |
 | POST | `/summaries` | Requires `content` + at least one of `sessionId`/`projectId` |
 | GET | `/sessions/:id/summaries` | |
 | PATCH/DELETE | `/summaries/:id` | |
--- a/packages/memory-service/src/db/index.js
+++ b/packages/memory-service/src/db/index.js
@@ -1,6 +1,6 @@
 const Database = require('better-sqlite3');
 const schema = require('./schema');
-const {getEnv, SQLITE } = require('@nexusai/shared');
+const {getEnv, SQLITE, logger } = require('@nexusai/shared');
 let db;  // Declare db variable in a scope accessible to all functions
@@ -26,12 +26,48 @@ function getDB() {
            db.exec(`CREATE INDEX IF NOT EXISTS idx_sessions_project ON sessions(project_id)`);
        } catch {}  
        try {
            db.exec(`ALTER TABLE projects ADD COLUMN isolated INTEGER NOT NULL DEFAULT 0`);
        } catch {}
        try {
            db.exec(`ALTER TABLE projects ADD COLUMN notes TEXT`);
        } catch {}
        try { 
            db.exec(`ALTER TABLE projects ADD COLUMN system_prompt TEXT`); 
        } catch {}
        try {
            db.exec(`ALTER TABLE summaries ADD COLUMN project_id INTEGER REFERENCES projects(id) ON DELETE CASCADE`);
        } catch {}
        try {
            db.exec(`ALTER TABLE summaries ADD COLUMN token_count INTEGER`);
        } catch {}
        try {
            db.exec(`CREATE INDEX IF NOT EXISTS idx_summaries_project ON summaries(project_id)`);
        } catch {}
        try {
            db.exec(`CREATE INDEX IF NOT EXISTS idx_summaries_session ON summaries(session_id)`);
        } catch {}
        try { db.exec(`ALTER TABLE entities ADD COLUMN mention_count INTEGER NOT NULL DEFAULT 1`) } catch {}
        try { db.exec(`ALTER TABLE entities ADD COLUMN confidence REAL NOT NULL DEFAULT 1.0`) } catch {}
        try { db.exec(`ALTER TABLE entities ADD COLUMN source TEXT NOT NULL DEFAULT 'extraction'`) } catch {}
        try { db.exec(`ALTER TABLE entities ADD COLUMN last_seen_at INTEGER`) } catch {}
        try { db.exec(`ALTER TABLE relationships ADD COLUMN mention_count INTEGER NOT NULL DEFAULT 1`) } catch {}
        try { db.exec(`ALTER TABLE relationships ADD COLUMN notes TEXT`) } catch {}
        // Sync FTS index with any existing episodes data
        db.exec(`INSERT OR REPLACE INTO episodes_fts(rowid, user_message, ai_response) 
            SELECT id, user_message, ai_response FROM episodes`);
-        console.log(`Connected to SQLite database at ${path}`);
+        logger.info(`Connected to SQLite database at ${path}`);
    }
    return db;
 }
--- a/packages/memory-service/src/db/projects.js
+++ b/packages/memory-service/src/db/projects.js
@@ -1,12 +1,12 @@
 const { getDB } = require('./index');
 const { parseRow } = require('@nexusai/shared');
-function createProject({ name, description, colour, icon }) {
+function createProject({ name, description, colour, icon, isolated }) {
  const db = getDB();
  const result = db.prepare(`
-    INSERT INTO projects (name, description, colour, icon)
+    INSERT INTO projects (name, description, colour, icon, isolated)
-    VALUES (?, ?, ?, ?)
+    VALUES (?, ?, ?, ?, ?)
-  `).run(name, description ?? null, colour ?? null, icon ?? null);
+  `).run(name, description ?? null, colour ?? null, icon ?? null, isolated ?? 0);
  return getProject(result.lastInsertRowid);
 }
@@ -20,18 +20,33 @@ function getProject(id) {
  return parseRow(db.prepare(`SELECT * FROM projects WHERE id = ?`).get(id));
 }
-function updateProject(id, { name, description, colour, icon }) {
+function updateProject(id, fields = {}) {
  const db = getDB();
-  db.prepare(`
-    UPDATE projects SET name = ?, description = ?, colour = ?, icon = ?
-    WHERE id = ?
-  `).run(name, description ?? null, colour ?? null, icon ?? null, id);
+  const allowed = ['name', 'description', 'colour', 'icon', 'isolated', 'notes', 'system_prompt'];
+  const updates = [];
+  const values = [];
+
  for (const key of allowed) {
    if (fields[key] !== undefined) {
      updates.push(`${key} = ?`);
      values.push(fields[key] ?? null);
    }
  }
  if (updates.length === 0) return getProject(id);
  values.push(id);
  db.prepare(`UPDATE projects SET ${updates.join(', ')} WHERE id = ?`).run(...values);
  return getProject(id);
 }
 function deleteProject(id) {
  const db = getDB();
  const doDelete = db.transaction(() => {
    db.prepare(`UPDATE sessions SET project_id = NULL WHERE project_id = ?`).run(id);
    db.prepare(`DELETE FROM projects WHERE id = ?`).run(id);
  });
  doDelete();
 }
 module.exports = { createProject, getProjects, getProject, updateProject, deleteProject };
--- a/packages/memory-service/src/db/schema.js
+++ b/packages/memory-service/src/db/schema.js
@@ -38,10 +38,35 @@ const schema = `
    UNIQUE(from_id, to_id, label)
  );
  CREATE INDEX IF NOT EXISTS idx_relationships_from ON relationships(from_id);
  CREATE INDEX IF NOT EXISTS idx_relationships_to   ON relationships(to_id);
  CREATE TABLE IF NOT EXISTS entity_episodes (
    entity_id  INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
    episode_id INTEGER NOT NULL REFERENCES episodes(id) ON DELETE CASCADE,
    PRIMARY KEY (entity_id, episode_id)
  );
  CREATE INDEX IF NOT EXISTS idx_entity_episodes_entity  ON entity_episodes(entity_id);
  CREATE INDEX IF NOT EXISTS idx_entity_episodes_episode ON entity_episodes(episode_id);
  CREATE TABLE IF NOT EXISTS projects (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    name        TEXT NOT NULL,
    description TEXT,
    colour      TEXT,
    icon        TEXT,
    created_at  INTEGER NOT NULL DEFAULT (unixepoch())
  );
  CREATE TABLE IF NOT EXISTS summaries (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id    INTEGER REFERENCES sessions(id) ON DELETE CASCADE,
    project_id    INTEGER REFERENCES projects(id) ON DELETE CASCADE,
    content       TEXT    NOT NULL,
    token_count   INTEGER,
    episode_range TEXT,
    created_at    INTEGER NOT NULL DEFAULT (unixepoch()),
    metadata      TEXT
@@ -53,8 +78,6 @@ const schema = `
    ON episodes(created_at);
  CREATE INDEX IF NOT EXISTS idx_entities_type 
    ON entities(type);
  CREATE INDEX IF NOT EXISTS idx_summaries_session 
    ON summaries(session_id);
  CREATE VIRTUAL TABLE IF NOT EXISTS episodes_fts 
    USING fts5(user_message, ai_response, content=episodes, content_rowid=id);
@@ -79,14 +102,7 @@ const schema = `
      VALUES (new.id, new.user_message, new.ai_response);
    END;
-  CREATE TABLE IF NOT EXISTS projects (
-    id          INTEGER PRIMARY KEY AUTOINCREMENT,
-    name        TEXT NOT NULL,
-    description TEXT,
-    colour      TEXT,
-    icon        TEXT,
-    created_at  INTEGER NOT NULL DEFAULT (unixepoch())
-  );
+  
 `;
 module.exports = schema;
--- a/packages/memory-service/src/db/summaries.js
+++ b/packages/memory-service/src/db/summaries.js
@@ -0,0 +1,76 @@
 const { getDB } = require('./index');
 const { parseRow } = require('@nexusai/shared');
 function createSummary({ sessionId = null, projectId = null, content, tokenCount = null, episodeRange = null, metadata = null }) {
    const db = getDB();
    const result = db.prepare(`
        INSERT INTO summaries (session_id, project_id, content, token_count, episode_range, metadata)
        VALUES (?, ?, ?, ?, ?, ?)
    `).run(sessionId, projectId, content, tokenCount, episodeRange, metadata ? JSON.stringify(metadata) : null);
    return getSummary(result.lastInsertRowid);
 }
 function getSummary(id) {
    const db = getDB();
    const row = db.prepare(`SELECT * FROM summaries WHERE id = ?`).get(id);
    return row ? parseRow(row) : null;
 }
 function getSummariesBySession(sessionId) {
    const db = getDB();
    return db.prepare(`SELECT * FROM summaries WHERE session_id = ? ORDER BY created_at ASC`)
        .all(sessionId).map(parseRow);
 }
 function getSummariesByProject(projectId) {
    const db = getDB();
    return db.prepare(`SELECT * FROM summaries WHERE project_id = ? ORDER BY created_at ASC`)
        .all(projectId).map(parseRow);
 }
 function updateSummary(id, { content, tokenCount, episodeRange, metadata }) {
    const db = getDB();
    const fields = [];
    const values = [];
    if (content     !== undefined) { fields.push('content = ?');       values.push(content); }
    if (tokenCount  !== undefined) { fields.push('token_count = ?');    values.push(tokenCount); }
    if (episodeRange !== undefined){ fields.push('episode_range = ?');  values.push(episodeRange); }
    if (metadata    !== undefined) { fields.push('metadata = ?');       values.push(JSON.stringify(metadata)); }
    if (!fields.length) return getSummary(id);
    values.push(id);
    db.prepare(`UPDATE summaries SET ${fields.join(', ')} WHERE id = ?`).run(...values);
    return getSummary(id);
 }
 function deleteSummary(id) {
    getDB().prepare(`DELETE FROM summaries WHERE id = ?`).run(id);
 }
 // Fetches session summaries that belong to sessions in a given project
 // Joins through sessions table since session summaries don't store project_id directly
 function getSessionSummariesForProject(projectId) {
    const db = getDB();
    return db.prepare(`
        SELECT s.* FROM summaries s
        JOIN sessions sess ON sess.id = s.session_id
        WHERE sess.project_id = ? AND s.session_id IS NOT NULL
        ORDER BY s.created_at ASC
    `).all(projectId).map(parseRow);
 }
 // Fetches the most recent project-level overview summary (session_id IS NULL distinguishes it)
 function getProjectOverviewSummary(projectId) {
    const db = getDB();
    const row = db.prepare(`
        SELECT * FROM summaries
        WHERE project_id = ? AND session_id IS NULL
        ORDER BY created_at DESC LIMIT 1
    `).get(projectId);
    return row ? parseRow(row) : null;
 }
 module.exports = { createSummary, getSummary, getSummariesBySession, getSummariesByProject, updateSummary, deleteSummary, getSessionSummariesForProject, getProjectOverviewSummary };
--- a/packages/memory-service/src/entities/extraction.js
+++ b/packages/memory-service/src/entities/extraction.js
@@ -0,0 +1,172 @@
 const semantic = require('../semantic')
 const { getEnv, SERVICES, formatEpisodeText, ENTITIES, logger } = require('@nexusai/shared');
 const { upsertEntity, upsertRelationship, linkEntityToEpisode } = require('./index');
 const EXTRACTION_URL = getEnv('EXTRACTION_URL', 'http://localhost:11434');
 const EXTRACTION_MODEL = getEnv('EXTRACTION_MODEL', 'qwen2.5:3b'); // ChatML format — see buildExtractionPrompt
 const EMBEDDING_SERVICE_URL = getEnv('EMBEDDING_SERVICE_URL', SERVICES.EMBEDDING_URL);
 const ENTITY_TYPES = ENTITIES.TYPES;
 const IGNORED_NAMES = ['good morning', 'good night', 'hello', 'goodbye', 'thanks', 'thank you'];
 // NOTE: This prompt uses ChatML format (<|im_start|> / <|im_end|> tags), which is
 // specific to qwen-family models. If EXTRACTION_MODEL is changed to a Llama-family
 // or other model, this format will need to change — most alternatives use either
 // plain text or [INST] / <<SYS>> tags. Silent degradation is likely if mismatched.
 function buildExtractionPrompt(userMessage, aiResponse, knownEntities = []) {
    const knownBlock = knownEntities.length > 0
        ? [
            'Already known entities (use these exact name and type values if the same entity appears):',
            ...knownEntities.map(e => `- "${e.name}" (${e.type})`),
            '',
          ].join('\n')
        : '';
    return [
        '<|im_start|>system',
        'You are a named entity and relationship extractor. You output only valid JSON.',
        '<|im_end|>',
        '<|im_start|>user',
        'Read the conversation below and extract all named entities and the relationships between them.',
        `Entity types: ${ENTITY_TYPES.join(', ')}`,
        'Use "character" for any fictional, game, or media characters (e.g. characters from anime, games, books, TV shows, movies)',
        'Use "person" only for real people',
        'For each entity provide:',
        '  "name": short proper noun only (max 4 words)',
        '  "type": one of the valid types',
        '  "notes": one specific sentence about this entity based on the conversation',
        'For relationships, use snake_case verb labels (e.g. works_on, manages, uses, knows, located_in, part_of, created_by).',
        'Only include relationships between entities you have listed above.',
        'Return this exact JSON structure:',
        '{ "entities": [{"name": "...", "type": "...", "notes": "..."}], "relationships": [{"from": "...", "fromType": "...", "to": "...", "toType": "...", "label": "...", "notes": "..."}] }',
        '',
        knownBlock,
        '--- CONVERSATION ---',
        `User: ${userMessage}`,
        `Assistant: ${aiResponse}`,
        '--- END CONVERSATION ---',
        '<|im_end|>',
        '<|im_start|>assistant',
    ].join('\n');
 }
 async function embedEntity(entity) {
    // Combine name, type and notes into a single descriptive string for embedding
    const text = `${entity.name} (${entity.type}): ${entity.notes ?? entity.name}`;
    const res = await fetch(`${EMBEDDING_SERVICE_URL}/embed`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ text }),
    });
    if (!res.ok) throw new Error(`Embedding service error: ${res.status}`);
    const data = await res.json();
    return data.embedding;
 }
 async function extractAndStoreEntities(userMessage, aiResponse, episodeId=null, projectId=null) {
    logger.info('[entities] Extraction triggered')
    try {
        // Fetch existing entities to guide the model toward consistent name/type pairs
        const db = require('../db').getDB();
        const knownEntities = db.prepare(`SELECT name, type FROM entities ORDER BY rowid DESC LIMIT 20`).all();
        const prompt = buildExtractionPrompt(userMessage, aiResponse, knownEntities);
        const res = await fetch(`${EXTRACTION_URL}/api/generate`, {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({
                model: EXTRACTION_MODEL,
                prompt: prompt,
                stream: false,
                format: 'json',
                options: {
                    temperature: ENTITIES.TEMPERATURE,
                    num_predict: ENTITIES.NUM_PREDICT,
                },
            }),
            signal: AbortSignal.timeout(60_000),
        });
        if (!res.ok) throw new Error(`Ollama responded ${res.status}`);
        const data = await res.json();
        const raw = data.response?.trim() ?? '';
        const jsonMatch = raw.match(/\{[\s\S]*\}/);
        if (!jsonMatch) {
            logger.warn('[entities] No JSON object found in response');
            logger.debug('[entities] Raw response was:', raw);
            return;
        }
        let parsed;
        try {
            parsed = JSON.parse(jsonMatch[0]);
        } catch (err) {
            logger.warn('[entities] Failed to parse extraction response:', err.message);
            logger.debug('[entities] Raw response was:', raw);
            return;
        }
        const entities = Array.isArray(parsed.entities) ? parsed.entities : [];
        if (entities.length === 0) {
            logger.debug('[entities] No entities found in this exchange — skipping');
            return;
        }
        // Map of "name::type" → saved entity, used for relationship resolution below
        const entityMap = new Map();
        let saved = 0;
        for (const { name, type, notes } of entities) {
            if (!name || !type || !ENTITY_TYPES.includes(type)) continue;
            if (IGNORED_NAMES.includes(name.toLowerCase())) continue;
            const entity = upsertEntity(name, type, notes ?? null);
            entityMap.set(`${name}::${type}`, entity);
            logger.info('[entities] Upserted entity:', entity);
            if (episodeId) linkEntityToEpisode(entity.id, episodeId);
            embedEntity(entity)
                .then(vector => semantic.upsertEntity(entity.id, vector, {
                    name: entity.name,
                    type: entity.type,
                    notes: entity.notes,
                    projectId: projectId ?? null,
                }))
                .catch(err => {
                    logger.warn(`[entities] Failed to embed entity "${entity.name}":`, err.message);
                });
            saved++;
        }
        if (saved > 0) logger.info(`[entities] Extracted and stored ${saved} entities`);
        // Process extracted relationships — both entities must have been saved above
        const relationships = Array.isArray(parsed.relationships) ? parsed.relationships : [];
        let relSaved = 0;
        for (const { from, fromType, to, toType, label, notes } of relationships) {
            if (!from || !fromType || !to || !toType || !label) continue;
            const fromEntity = entityMap.get(`${from}::${fromType}`);
            const toEntity   = entityMap.get(`${to}::${toType}`);
            if (!fromEntity || !toEntity) continue;
            upsertRelationship(fromEntity.id, toEntity.id, label, notes ?? null);
            relSaved++;
        }
        if (relSaved > 0) logger.info(`[entities] Extracted and stored ${relSaved} relationships`);
    } catch (err) {
        // Non-critical — log and move on, episode is already saved
        logger.warn('[entities] Extraction failed:', err.message);
    }
 }
 module.exports = { extractAndStoreEntities };
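
Review note: the response handling in extractAndStoreEntities (regex match, then JSON.parse, then bail out on failure) can be read as one small helper. The sketch below is only an illustration of that step; `extractJsonObject` is a hypothetical name and is not part of this change.

```javascript
// Hypothetical helper (not in the PR): pull the first-to-last brace span out of
// a model response and parse it, returning null instead of throwing.
function extractJsonObject(raw) {
    if (typeof raw !== 'string') return null;
    // Greedy match from the first "{" to the last "}", as in the diff above
    const match = raw.trim().match(/\{[\s\S]*\}/);
    if (!match) return null;
    try {
        return JSON.parse(match[0]);
    } catch (err) {
        // Malformed JSON is treated as "nothing extracted"
        return null;
    }
}
```

With a helper like this, the caller would only need one null check before reading `parsed.entities`.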
--- a/packages/memory-service/src/entities/index.js
+++ b/packages/memory-service/src/entities/index.js
@@ -4,18 +4,23 @@ const { parseRow } = require ('@nexusai/shared')
 /******* Entities ********/
 // Upsert an entity - insert or update if (name, type) already exists
-function upsertEntity(name, type, notes = null, metadata = null) {
+function upsertEntity(name, type, notes = null, metadata = null, source = 'extraction') {
    const db = getDB();
 const stmt = db.prepare(`
-        INSERT INTO entities (name, type, notes, metadata)
+    INSERT INTO entities (name, type, notes, metadata, source, last_seen_at)
-        VALUES (?, ?, ?, ?)
+    VALUES (?, ?, ?, ?, ?, unixepoch())
    ON CONFLICT(name, type) DO UPDATE SET
-            notes       = excluded.notes,
+        -- First extraction wins: notes are never overwritten once set.
        -- Revisit during Memory Consolidation Lifecycle (Phase 2) — once entity
        -- quality scoring exists, a higher-confidence extraction should be able
        -- to replace stale notes rather than being silently dropped.
        notes = COALESCE(entities.notes, excluded.notes),
        metadata      = excluded.metadata,
        mention_count = entities.mention_count + 1,
        last_seen_at  = unixepoch(),
        updated_at    = unixepoch()
 `);
-    const result = stmt.run(name, type, notes, metadata ? JSON.stringify(metadata) : null);
+    stmt.run(name, type, notes, metadata ? JSON.stringify(metadata) : null, source);
    return getEntityByNameType(name, type);
 }
@@ -40,15 +45,17 @@ function deleteEntity(id) {
 /********* Relationships *********/
 // Upsert a relationship: insert, or bump mention_count if (from_id, to_id, label) already exists
-function upsertRelationship(fromId, toId, label, metadata = null){
+function upsertRelationship(fromId, toId, label, notes = null, metadata = null) {
    const db = getDB();
    const stmt = db.prepare(`
-        INSERT INTO relationships (from_id, to_id, label, metadata)
+        INSERT INTO relationships (from_id, to_id, label, notes, metadata)
-        VALUES (?, ?, ?, ?)
+        VALUES (?, ?, ?, ?, ?)
-        ON CONFLICT(from_id, to_id, label) DO NOTHING
+        ON CONFLICT(from_id, to_id, label) DO UPDATE SET
            mention_count = relationships.mention_count + 1,
            -- First extraction wins for notes — same policy as entities.
            notes         = COALESCE(relationships.notes, excluded.notes)
    `);
-    const result = stmt.run(fromId, toId, label, metadata ?JSON.stringify(metadata) : null);
+    stmt.run(fromId, toId, label, notes, metadata ? JSON.stringify(metadata) : null);
    return getRelationship(fromId, toId, label);
 }
@@ -69,7 +76,7 @@ function getEntityByNameType(name, type) {
 }
 // Retrieve all relationships originating from a given entity
-function getRelationshipsByEntity(entityId) {
+function getOutboundRelationships(entityId) {
    const db = getDB();
    return db.prepare(`SELECT * FROM relationships WHERE from_id = ?`).all(entityId).map(parseRow);
 }
@@ -81,14 +88,23 @@ function deleteRelationship(fromId, toId, label) {
    db.prepare(`DELETE FROM relationships WHERE from_id = ? AND to_id = ? AND label = ?`).run(fromId, toId, label);
 }   
 function linkEntityToEpisode(entityId, episodeId) {
    const db = getDB();
    db.prepare(`
        INSERT OR IGNORE INTO entity_episodes (entity_id, episode_id)
        VALUES (?, ?)
    `).run(entityId, episodeId);
 }
 module.exports = {
    upsertEntity,
    getEntity,
    getEntitiesByType,
    getEntityByNameType,
    deleteEntity,
    linkEntityToEpisode,
    upsertRelationship,
    getRelationship,
-    getRelationshipsByEntity,
+    getOutboundRelationships,
    deleteRelationship
 }
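
Review note: the ON CONFLICT branch of upsertEntity keeps existing notes, overwrites metadata, and increments the mention counter. A minimal plain-JS model of that policy, assuming rows are simple objects (`mergeEntityOnConflict` is a hypothetical name used only for illustration):

```javascript
// Hypothetical model of the ON CONFLICT branch; not the service's code.
function mergeEntityOnConflict(existing, incoming) {
    return {
        ...existing,
        // COALESCE(entities.notes, excluded.notes): stored notes win unless NULL
        notes: existing.notes ?? incoming.notes ?? null,
        // metadata = excluded.metadata: always overwritten, even with null
        metadata: incoming.metadata ?? null,
        // mention_count = entities.mention_count + 1
        mention_count: existing.mention_count + 1,
    };
}
```

One consequence worth noting: because metadata is taken from the incoming row unconditionally, a later extraction that passes no metadata clears whatever was stored before.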
--- a/packages/memory-service/src/episodic/index.js
+++ b/packages/memory-service/src/episodic/index.js
@@ -1,6 +1,7 @@
 const {getDB} = require('../db');
-const { EPISODIC, getEnv, SERVICES, parseRow, formatEpisodeText } = require('@nexusai/shared');
+const { EPISODIC, getEnv, SERVICES, parseRow, formatEpisodeText, SUMMARIES, logger } = require('@nexusai/shared');
 const semantic = require('../semantic');
 const { extractAndStoreEntities } = require('../entities/extraction')
 // --Sessions --------------------------------------------------
@@ -23,14 +24,25 @@ function getSession(id) {
    return parseRow(stmt.get(id));
 }
-function getSessions(limit = EPISODIC.DEFAULT_PAGE_SIZE, offset = 0) {
+
 function getSessions(limit = EPISODIC.DEFAULT_PAGE_SIZE, offset = EPISODIC.DEFAULT_OFFSET, projectId = null) {
  const db = getDB();
-  const stmt = db.prepare(`
+  const stmt = projectId
    ? db.prepare(`
        SELECT * FROM sessions
        WHERE project_id = ?
        ORDER BY updated_at DESC
        LIMIT ? OFFSET ?
      `)
    : db.prepare(`
        SELECT * FROM sessions
        ORDER BY updated_at DESC
        LIMIT ? OFFSET ?
      `);
-  return stmt.all(limit, offset).map(parseRow);
+
  return projectId
    ? stmt.all(projectId, limit, offset).map(parseRow)
    : stmt.all(limit, offset).map(parseRow);
 }
 // Retrieves a session by its external ID
@@ -52,13 +64,22 @@ function deleteSession(id) {
    db.prepare(`DELETE FROM sessions WHERE id = ?`).run(id);
 }
-function updateSession(id, { name } = {}){
+function updateSession(id, { name, projectId } = {}) {
  const db = getDB();
-  db.prepare(`
+  
-    UPDATE sessions
+  // Build update dynamically based on what was provided
-    SET name = ?, updated_at = unixepoch()
+  const updates = [];
-    WHERE id = ?
+  const values = [];
-  `).run(name ?? null, id);
+
  if (name !== undefined) { updates.push('name = ?'); values.push(name ?? null); }
  if (projectId !== undefined) { updates.push('project_id = ?'); values.push(projectId ?? null); }
  if (updates.length === 0) return getSession(id);
  updates.push('updated_at = unixepoch()');
  values.push(id);
  db.prepare(`UPDATE sessions SET ${updates.join(', ')} WHERE id = ?`).run(...values);
  return getSession(id);
 }
@@ -77,21 +98,20 @@ function deleteSessionByExternalId(externalId) {
 // --Episodes --------------------------------------------------
 // Creates a new episode linked to a session, with user message, AI response, optional token count, and optional project ID (passed to entity extraction)
-async function createEpisode(sessionId, userMessage, aiResponse, tokenCount = null, metadata = null) {
+async function createEpisode(sessionId, userMessage, aiResponse, tokenCount = null, projectId=null) {
  const db = getDB();
  // Wrap insert + session touch in a transaction — both succeed or neither does
  const insert = db.transaction(() => {
    const stmt = db.prepare(`
-      INSERT INTO episodes (session_id, user_message, ai_response, token_count, metadata)
+      INSERT INTO episodes (session_id, user_message, ai_response, token_count)
-      VALUES (?, ?, ?, ?, ?)
+      VALUES (?, ?, ?, ?)
    `);
    const result = stmt.run(
      sessionId,
      userMessage,
      aiResponse,
      tokenCount,
-      metadata ? JSON.stringify(metadata) : null
    );
    touchSession(sessionId);
    return getEpisode(result.lastInsertRowid);
@@ -105,7 +125,11 @@ async function createEpisode(sessionId, userMessage, aiResponse, tokenCount = nu
      sessionId: episode.session_id,
      createdAt: episode.created_at
    }))
-    .catch(err => console.error(`Failed to embed episode ${episode.id}:`, err.message));
+    .catch(err => logger.error(`Failed to embed episode ${episode.id}:`, err.message));
  extractAndStoreEntities(userMessage, aiResponse, episode.id, projectId)
    .catch(err => logger.error(`Failed to extract entities for episode ${episode.id}:`, err.message));
  return episode;
 }
@@ -118,7 +142,7 @@ function getEpisode(id) {
 }
 // Retrieves episodes for a given session, ordered by creation time descending, with pagination
-function getEpisodesBySession(sessionId, limit = EPISODIC.DEFAULT_PAGE_SIZE, offset = 0) {
+function getEpisodesBySession(sessionId, limit = EPISODIC.DEFAULT_PAGE_SIZE, offset = EPISODIC.DEFAULT_OFFSET) {
  const db = getDB();
  const stmt = db.prepare(`
    SELECT * FROM episodes
@@ -130,30 +154,41 @@ function getEpisodesBySession(sessionId, limit = EPISODIC.DEFAULT_PAGE_SIZE, off
 }
 // Retrieves recent episodes for a given session, ordered by creation time descending, with a limit
-function getRecentEpisodes(limit = EPISODIC.DEFAULT_RECENT_LIMIT) {
+function getRecentEpisodes(sessionId, limit = EPISODIC.DEFAULT_RECENT_LIMIT) {
   // Session-scoped recent episodes, useful for recency-based retrieval
  const db = getDB();
  const stmt = db.prepare(`
    SELECT * FROM episodes
    WHERE session_id = ?
    ORDER BY created_at DESC
    LIMIT ?
  `);
-  return stmt.all(limit).map(parseRow);
+  return stmt.all(sessionId, limit).map(parseRow);
 }
 // Searches episodes using FTS5 full-text search, ordered by relevance, with a limit
-function searchEpisodes(query, limit = EPISODIC.DEFAULT_SEARCH_LIMIT) {
+function searchEpisodes(query, limit = EPISODIC.DEFAULT_SEARCH_LIMIT, sessionIds = null) {
   // FTS5 full-text search, optionally restricted to the given session IDs
  const db = getDB();
-  const stmt = db.prepare(`
+  const safeQuery = `"${query.replace(/"/g, '""')}"`;
  if (sessionIds && sessionIds.length > 0) {
    const ph = sessionIds.map(() => '?').join(',');
    return db.prepare(`
      SELECT e.* FROM episodes e
      JOIN episodes_fts fts ON e.id = fts.rowid
      WHERE episodes_fts MATCH ?
      AND e.session_id IN (${ph})
      ORDER BY rank
      LIMIT ?
    `).all(safeQuery, ...sessionIds, limit).map(parseRow);
  }
  return db.prepare(`
    SELECT e.* FROM episodes e
    JOIN episodes_fts fts ON e.id = fts.rowid
    WHERE episodes_fts MATCH ?
    ORDER BY rank
    LIMIT ?
-  `);
-  return stmt.all(query, limit).map(parseRow);
+  `).all(safeQuery, limit).map(parseRow);
 }
 // Deletes an episode by its ID
@@ -172,7 +207,8 @@ async function getEpisodeEmbedding(userMessage, aiResponse){
  const res = await fetch(`${url}/embed`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
-    body: JSON.stringify({ text })  
+    body: JSON.stringify({ text }),  
    signal: AbortSignal.timeout(30_000),
  })
  if (!res.ok) {
@@ -182,6 +218,17 @@ async function getEpisodeEmbedding(userMessage, aiResponse){
  return data.embedding;
 }
 function getEpisodesByProject(projectId, limit = SUMMARIES.MAX_PROJECT_EPISODE_LIMIT) {
    const db = getDB();
    return db.prepare(`
        SELECT e.* FROM episodes e
        JOIN sessions s ON s.id = e.session_id
        WHERE s.project_id = ?
        ORDER BY e.created_at ASC
        LIMIT ?
    `).all(projectId, limit).map(parseRow);
 }
 module.exports = {
  createSession,
  getSession,
@@ -196,5 +243,6 @@ module.exports = {
  getEpisodesBySession,
  getRecentEpisodes,
  searchEpisodes,
-  deleteEpisode
+  deleteEpisode,
  getEpisodesByProject
 };
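
Review note: searchEpisodes relies on two small techniques, quoting the user query as a single FTS5 phrase and expanding a variable-length IN list into placeholders. The sketch below isolates both; `toFtsPhrase` and `buildSessionFilter` are hypothetical names introduced here for illustration and do not exist in the codebase.

```javascript
// Hypothetical helpers; illustrative only.

// Wrap the query in double quotes and double any embedded quotes so FTS5
// treats it as one phrase rather than parsing operators out of user input.
function toFtsPhrase(query) {
    return `"${String(query).replace(/"/g, '""')}"`;
}

// Build an optional "AND e.session_id IN (?, ...)" clause plus its bound values.
function buildSessionFilter(sessionIds) {
    if (!Array.isArray(sessionIds) || sessionIds.length === 0) {
        return { clause: '', params: [] };
    }
    const placeholders = sessionIds.map(() => '?').join(',');
    return { clause: `AND e.session_id IN (${placeholders})`, params: [...sessionIds] };
}
```

Keeping the values in `params` and only the `?` markers in the SQL text means the session IDs are always bound, never interpolated.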
--- a/packages/memory-service/src/graph/index.js
+++ b/packages/memory-service/src/graph/index.js
@@ -0,0 +1,77 @@
 const { getDB } = require('../db');
 const { parseRow, ENTITIES } = require('@nexusai/shared');
 // Single-entity neighborhood via recursive CTE — bidirectional, configurable depth
 function getNeighborhood(entityId, depth = ENTITIES.GRAPH_HOP_DEPTH) {
    const db = getDB();
    const nodeRows = db.prepare(`
        WITH RECURSIVE traverse(entity_id, depth) AS (
            SELECT ?, 0
            UNION
            SELECT
                CASE WHEN r.from_id = t.entity_id THEN r.to_id ELSE r.from_id END,
                t.depth + 1
            FROM relationships r
            JOIN traverse t ON (r.from_id = t.entity_id OR r.to_id = t.entity_id)
            WHERE t.depth < ?
        )
        SELECT DISTINCT entity_id FROM traverse
    `).all(entityId, depth);
    const nodeIds = nodeRows.map(r => r.entity_id);
    if (nodeIds.length === 0) return { nodes: [], edges: [] };
    const ph = nodeIds.map(() => '?').join(',');
    const nodes = db.prepare(
        `SELECT * FROM entities WHERE id IN (${ph})`
    ).all(...nodeIds).map(parseRow);
    const edges = db.prepare(
        `SELECT * FROM relationships WHERE from_id IN (${ph}) AND to_id IN (${ph})`
    ).all(...nodeIds, ...nodeIds).map(parseRow);
    return { nodes, edges };
 }
 // Bulk 1-hop neighborhood for orchestration — seeds are entity IDs from Qdrant search
 function getEntityNeighbors(entityIds) {
    if (!entityIds.length) return { nodes: [], edges: [] };
    const db = getDB();
    const ph = entityIds.map(() => '?').join(',');
    // entityIds appears three times — once for the CASE (finding the neighbor),
    // and once each for the FROM and TO sides of the WHERE clause
    const neighborRows = db.prepare(`
        SELECT DISTINCT
            CASE WHEN from_id IN (${ph}) THEN to_id ELSE from_id END AS entity_id
        FROM relationships
        WHERE from_id IN (${ph}) OR to_id IN (${ph})
    `).all(...entityIds, ...entityIds, ...entityIds);
    const allIds = [...new Set([...entityIds, ...neighborRows.map(r => r.entity_id)])];
    const allPh = allIds.map(() => '?').join(',');
    const nodes = db.prepare(
        `SELECT * FROM entities WHERE id IN (${allPh})`
    ).all(...allIds).map(parseRow);
    const edges = db.prepare(
        `SELECT * FROM relationships WHERE from_id IN (${allPh}) AND to_id IN (${allPh})`
    ).all(...allIds, ...allIds).map(parseRow);
    return { nodes, edges };
 }
 // Returns episode IDs linked to any of the given entity IDs via entity_episodes
 function getEpisodeIdsByEntities(entityIds) {
    if (!entityIds.length) return [];
    const db = getDB();
    const ph = entityIds.map(() => '?').join(',');
    return db.prepare(
        `SELECT DISTINCT episode_id FROM entity_episodes WHERE entity_id IN (${ph})`
    ).all(...entityIds).map(r => r.episode_id);
 }
 module.exports = { getNeighborhood, getEntityNeighbors, getEpisodeIdsByEntities };
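
Review note: the recursive CTE in getNeighborhood collects every entity reachable within `depth` hops, following edges in both directions. An in-memory breadth-first sketch of the same idea, assuming `edges` is an array of `{ from_id, to_id }` rows (`neighborhoodIds` is a hypothetical name, not part of this change):

```javascript
// Hypothetical in-memory equivalent of the recursive CTE; illustrative only.
function neighborhoodIds(edges, startId, maxDepth) {
    const seen = new Set([startId]);
    let frontier = [startId];
    for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
        const next = [];
        for (const { from_id, to_id } of edges) {
            for (const id of frontier) {
                // Bidirectional: take whichever endpoint is not the current node
                let neighbor = null;
                if (from_id === id) neighbor = to_id;
                else if (to_id === id) neighbor = from_id;
                if (neighbor !== null && !seen.has(neighbor)) {
                    seen.add(neighbor);
                    next.push(neighbor);
                }
            }
        }
        frontier = next;
    }
    return [...seen];
}
```

This can serve as a reference when checking the SQL against small fixtures, since both should return the same set of entity IDs for a given start node and depth.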
--- a/packages/memory-service/src/index.js
+++ b/packages/memory-service/src/index.js
@@ -1,14 +1,18 @@
 require ('dotenv').config();
 const express = require('express');
-const {getEnv, PORTS, EPISODIC} = require('@nexusai/shared');
+const {getEnv, PORTS, EPISODIC, logger} = require('@nexusai/shared');
 const { getDB } = require('./db');
 const { createProject, getProjects, getProject, updateProject, deleteProject } = require('./db/projects');
 const { createSummary, getSummary, getSummariesBySession, getSummariesByProject, updateSummary, deleteSummary } = require('./db/summaries');
 const { generateAndStoreProjectSummary } = require('./summarization/project');
 const graph = require('./graph');
 const episodic = require('./episodic');
 const semantic = require('./semantic');
 const entities = require('./entities');
 const app = express();
-app.use(express.json());
+app.use(express.json({ limit: '2mb' }));
 const  PORT = getEnv('PORT', PORTS.MEMORY);
@@ -16,8 +20,8 @@ const  PORT = getEnv('PORT', PORTS.MEMORY);
 const db = getDB();
 semantic.initCollections()
-    .then(() => console.log(`QDrant collections ready`))
+    .then(() => logger.info(`QDrant collections ready`))
-    .catch(err => console.error(`QDrant initialization error:`, err.message));
+    .catch(err => logger.error(`QDrant initialization error:`, err.message));
 // Health check endpoint
 app.get('/health', (req, res) => {
@@ -29,6 +33,19 @@ app.get('/health', (req, res) => {
 /************************************ */
 // Creates a new session with an external ID and optional metadata
 app.get('/sessions', (req, res) => {
  const { 
    limit = EPISODIC.DEFAULT_PAGE_SIZE, 
    offset = EPISODIC.DEFAULT_OFFSET, 
    projectId 
  } = req.query;
  const parsedProjectId = projectId && projectId !== 'null' ? Number(projectId) : null;
  const sessions = episodic.getSessions(Number(limit), Number(offset), parsedProjectId);
  res.json(sessions);
 });
 app.post('/sessions', (req, res) => {
  const { externalId, metadata } = req.body;
  if (!externalId) {
@@ -42,12 +59,6 @@ app.post('/sessions', (req, res) => {
  }
 });
-app.get('/sessions', (req, res) => {
-  const {limit = EPISODIC.DEFAULT_PAGE_SIZE, offset = EPISODIC.DEFAULT_OFFSET } = req.query;
-  const sessions = episodic.getSessions(Number(limit), Number(offset));
-  res.json(sessions);
-})
 // Retrieves a session by its external ID
 app.get('/sessions/by-external/:externalId', (req, res) => {
  const session = episodic.getSessionByExternalId(req.params.externalId);
@@ -65,18 +76,16 @@ app.get('/sessions/:id', (req, res) => {
 });
 app.patch('/sessions/by-external/:externalId', (req, res) => {
-  const { name } = req.body;
+  const { name, projectId } = req.body;
  try {
-    const session = episodic.updateSessionByExternalId(req.params.externalId, {name });
+    const session = episodic.updateSessionByExternalId(req.params.externalId, {name, projectId });
    res.json(session);
  } catch (err) {
-    res.status(500).json({error: err.message });
+    res.status(500).json({ error: 'Failed to update session', detail: err.message });
  }
 });
-
+// Deletes a session and all associated episodes
 // Updates the session's updated_at timestamp to now
 app.delete('/sessions/by-external/:externalId', (req, res) => {
  episodic.deleteSessionByExternalId(req.params.externalId);
  res.status(204).send();
@@ -88,28 +97,46 @@ app.delete('/sessions/by-external/:externalId', (req, res) => {
 /************************************* */
 app.post('/episodes', async (req, res) => {
-  const { sessionId, userMessage, aiResponse, tokenCount, metadata } = req.body;
+  const { sessionId, userMessage, aiResponse, tokenCount, projectId } = req.body;
  if (!sessionId || !userMessage || !aiResponse) {
    return res.status(400).json({ error: 'sessionId, userMessage and aiResponse are required' });
  }
-  const episode = await episodic.createEpisode(sessionId, userMessage, aiResponse, tokenCount, metadata);
+  const episode = await episodic.createEpisode(sessionId, userMessage, aiResponse, tokenCount, projectId);
   logger.debug('[memory] create episode body:', {
    sessionId,
    userMessageLength: userMessage?.length,
    aiResponseLength: aiResponse?.length,
    tokenCount
  });
  res.status(201).json(episode);
 });
 app.get('/episodes', (req, res) => {
  const { limit = 50, offset = 0, sessionId, q } = req.query;
  if (q) {
    const results = episodic.searchEpisodes(q, Number(limit));
    return res.json({ episodes: results, total: results.length });
  }
  const db = getDB();
  let episodes;
  if (sessionId) {
    episodes = episodic.getEpisodesBySession(Number(sessionId), Number(limit), Number(offset));
  } else {
    episodes = db.prepare(
      `SELECT * FROM episodes ORDER BY created_at DESC LIMIT ? OFFSET ?`
    ).all(Number(limit), Number(offset)).map(row => require('@nexusai/shared').parseRow(row));
  }
  const total = db.prepare(`SELECT COUNT(*) as count FROM episodes`).get().count;
  res.json({ episodes, total });
 });
 // Search MUST come before /:id — otherwise 'search' gets captured as an id
 app.get('/episodes/search', (req, res) => {
-  const { q, limit = EPISODIC.DEFAULT_PAGE_SIZE } = req.query;
+  const { q, limit = EPISODIC.DEFAULT_PAGE_SIZE, sessionIds } = req.query;
  if (!q) return res.status(400).json({ error: 'q (query) parameter is required' });
-  const results = episodic.searchEpisodes(q, Number(limit));
+  const parsedSessionIds = sessionIds
-  res.json(results);
+    ? sessionIds.split(',').map(Number).filter(Boolean)
    : null;
  res.json(episodic.searchEpisodes(q, Number(limit), parsedSessionIds));
 });
 app.get('/episodes/:id', (req, res) => {
@@ -130,7 +157,12 @@ app.get('/sessions/:id/episodes', (req, res) => {
 });
 app.delete('/episodes/:id', (req, res) => {
-  episodic.deleteEpisode(req.params.id);
+  const id = Number(req.params.id);
  episodic.deleteEpisode(id);
  semantic.deleteEpisode(id)  // fire-and-forget
    .catch(err => logger.error(`[Memory] Qdrant delete failed for episode ${id}:`, err.message));
  res.status(204).send();
 });
@@ -173,17 +205,17 @@ app.delete('/entities/:id', (req, res) => {
 // Upsert a relationship between two entities
 app.post('/relationships', (req, res) => {
-  const {fromId, toId, label, metadata } = req.body;
+  const { fromId, toId, label, notes, metadata } = req.body;
  if (!fromId || !toId || !label) {
    return res.status(400).json({ error: 'fromId, toId and label are required' });
  }
-  const relationship = entities.upsertRelationship(fromId, toId, label, metadata);
+  const relationship = entities.upsertRelationship(fromId, toId, label, notes, metadata);
  res.status(201).json(relationship);
 });
 // Get all outbound relationships (from_id = :id) for a given entity ID
 app.get('/entities/:id/relationships', (req, res) => {
-  res.json(entities.getRelationshipsByEntity(req.params.id));
+  res.json(entities.getOutboundRelationships(req.params.id));
 });
 // Delete a specific relationship
@@ -196,11 +228,149 @@ app.delete('/relationships', (req, res) => {
  res.status(204).send();
 })
 /********************************* */
 /********** Graph Routes ********** */
 /********************************* */
 // Single-entity neighborhood — depth defaults to ENTITIES.GRAPH_HOP_DEPTH
 app.get('/graph/neighborhood/:entityId', (req, res) => {
    const entity = entities.getEntity(req.params.entityId);
    if (!entity) return res.status(404).json({ error: 'Entity not found' });
    const depth = req.query.depth ? Math.min(Number(req.query.depth), 3) : undefined;
    const neighborhood = graph.getNeighborhood(Number(req.params.entityId), depth);
    res.json({ entity, neighborhood });
 });
 // Bulk 1-hop neighborhood — body: { entityIds: [...] }
 app.post('/graph/neighbors', (req, res) => {
    const { entityIds } = req.body;
    if (!Array.isArray(entityIds) || entityIds.length === 0) {
        return res.status(400).json({ error: 'entityIds array is required' });
    }
    res.json(graph.getEntityNeighbors(entityIds.map(Number)));
 });
 app.post('/episodes/by-entities', (req, res) => {
    const { entityIds } = req.body;
    if (!Array.isArray(entityIds) || entityIds.length === 0) {
        return res.status(400).json({ error: 'entityIds array is required' });
    }
    res.json({ episodeIds: graph.getEpisodeIdsByEntities(entityIds.map(Number)) });
 });
 /*********************************** */
 /********** Project Routes ********** */
 /*********************************** */
 app.post('/projects', (req, res) => {
  const { name, description, colour, icon } = req.body;
  if (!name?.trim()) return res.status(400).json({ error: 'name is required' });
  try {
    res.status(201).json(createProject({ name: name.trim(), description, colour, icon }));
  } catch (err) {
    res.status(500).json({ error: 'Failed to create project', detail: err.message });
  }
 });
 app.get('/projects', (req, res) => {
  res.json(getProjects());
 });
 // Generate (or regenerate) a project overview summary on demand
 app.post('/projects/:id/summarize', async (req, res) => {
    const project = getProject(Number(req.params.id));
    if (!project) return res.status(404).json({ error: 'Project not found' });
    try {
        const summary = await generateAndStoreProjectSummary(Number(req.params.id));
        res.status(201).json(summary);
    } catch (err) {
        if (err.message.includes('No session summaries or episodes')) {
            return res.status(422).json({ error: err.message });
        }
        res.status(500).json({ error: 'Failed to generate project summary', detail: err.message });
    }
 });
 // Get the current project overview summary
 app.get('/projects/:id/overview', async (req, res) => {
    const { getProjectOverviewSummary } = require('./db/summaries');
    const summary = getProjectOverviewSummary(Number(req.params.id));
    // 200 with null is fine — frontend can handle "no overview yet" gracefully
    res.json(summary ?? null);
 });
 // Get summaries for a project
 app.get('/projects/:id/summaries', (req, res) => {
    res.json(getSummariesByProject(req.params.id));
 });
 app.get('/projects/:id', (req, res) => {
  const project = getProject(req.params.id);
  if (!project) return res.status(404).json({ error: 'Not found' });
  res.json(project);
 });
 app.patch('/projects/:id', (req, res) => {
  const project = getProject(req.params.id);
  if (!project) return res.status(404).json({ error: 'Not found' });
  res.json(updateProject(req.params.id, req.body));
 });
 app.delete('/projects/:id', (req, res) => {
  const project = getProject(req.params.id);
  if (!project) return res.status(404).json({ error: 'Not found' });
  deleteProject(req.params.id);
  res.status(204).send();
 });
 /*********************************** */
 /********** Summary Routes ********** */
 /*********************************** */
 // Create a summary (called by orchestration, fire-and-forget style)
 app.post('/summaries', (req, res) => {
    const { sessionId, projectId, content, tokenCount, episodeRange, metadata } = req.body;
    if (!content) return res.status(400).json({ error: 'content is required' });
    if (!sessionId && !projectId) return res.status(400).json({ error: 'sessionId or projectId is required' });
    try {
        const summary = createSummary({ sessionId, projectId, content, tokenCount, episodeRange, metadata });
        res.status(201).json(summary);
    } catch (err) {
        res.status(500).json({ error: 'Failed to create summary', detail: err.message });
    }
 });
 // Get summaries for a session
 app.get('/sessions/:id/summaries', (req, res) => {
    res.json(getSummariesBySession(req.params.id));
 });
 // Update a summary (for cumulative updates)
 app.patch('/summaries/:id', (req, res) => {
    const summary = getSummary(req.params.id);
    if (!summary) return res.status(404).json({ error: 'Not found' });
    res.json(updateSummary(req.params.id, req.body));
 });
 // Delete a summary
 app.delete('/summaries/:id', (req, res) => {
    deleteSummary(req.params.id);
    res.status(204).send();
 });
 /********************************** */
 /********** Start Server ********** */
 /********************************** */
 app.listen(PORT, () => {
-    console.log(`Memory Service is running on port ${PORT}`);
+    logger.info(`Memory Service is running on port ${PORT}`);
 });
--- a/packages/memory-service/src/semantic/index.js
+++ b/packages/memory-service/src/semantic/index.js
@@ -1,5 +1,5 @@
 const {QdrantClient} = require('@qdrant/js-client-rest');
-const {QDRANT, COLLECTIONS, getEnv} = require('@nexusai/shared');
+const {QDRANT, COLLECTIONS, getEnv, logger} = require('@nexusai/shared');
 let client;
@@ -24,9 +24,9 @@ async function initCollections() {
            distance: QDRANT.DISTANCE_METRIC
        }
      });
-      console.log(`Created Qdrant collection: ${name}`);
+      logger.info(`Created Qdrant collection: ${name}`);
    } else {
-      console.log(`Qdrant collection already exists: ${name}`);
+      logger.info(`Qdrant collection already exists: ${name}`);
    }
  }
 }
@@ -95,6 +95,11 @@ async function deleteVector(collection, id) {
    });
 }
 async function deleteEpisode(id) {
  return deleteVector(COLLECTIONS.EPISODES, id);
 }
 module.exports = {
    initCollections,
    upsertEpisode,
@@ -103,5 +108,6 @@ module.exports = {
    searchEpisodes,
    searchEntities,
    searchSummaries,
-    deleteVector
+    deleteVector,
    deleteEpisode
 };
--- a/packages/memory-service/src/summarization/project.js
+++ b/packages/memory-service/src/summarization/project.js
@@ -0,0 +1,142 @@
 const { SERVICES, getEnv, SUMMARIES } = require('@nexusai/shared');
 const { 
    getSessionSummariesForProject,
    getProjectOverviewSummary,
    createSummary,
    updateSummary,
 } = require('../db/summaries');
 const { getEpisodesByProject } = require('../episodic');
 const { getProject } = require('../db/projects');
 const EXTRACTION_URL   = getEnv('EXTRACTION_URL', 'http://localhost:11434');
 const EXTRACTION_MODEL = getEnv('EXTRACTION_MODEL', 'qwen2.5:3b');
 const MAX_SUMMARY_CHARS = SUMMARIES.MAX_SUMMARY_CHARS; // generous ceiling before we truncate input
 function buildProjectSummaryPrompt(projectName, sessionSummaries) {
    let summaryBlock = sessionSummaries
        .map((s, i) => `Session ${i + 1}:\n${s.content}`)
        .join('\n\n');
    // Guard against very large inputs — truncate oldest sessions if needed
    if (summaryBlock.length > MAX_SUMMARY_CHARS) {
        summaryBlock = summaryBlock.slice(-MAX_SUMMARY_CHARS);
    }
    return [
        '<|im_start|>user',
        `The following are session summaries from a project called "${projectName}".`,
        'Write a project overview covering: goals, progress, key decisions, and current state.',
        'Scale the length to the material — use multiple paragraphs for complex projects, a few sentences for simple ones.',
        'Be comprehensive but avoid padding. Do not repeat the same point twice.',
        'Write in third person. Output only the overview text, no headings or labels.',
         '',
         summaryBlock,
         '<|im_end|>',
         '<|im_start|>assistant',
     ].join('\n');
 }
 function buildProjectSummaryFromEpisodesPrompt(projectName, episodes) {
    // Condense episodes into a readable block, truncating if needed
    let episodeBlock = episodes
        .map(ep => `User: ${ep.user_message}\nAssistant: ${ep.ai_response}`)
        .join('\n\n');
    if (episodeBlock.length > MAX_SUMMARY_CHARS) {
        // Keep the most recent episodes — slice from the end
        episodeBlock = episodeBlock.slice(-MAX_SUMMARY_CHARS);
    }
    return [
        '<|im_start|>user',
        `The following are conversations from a project called "${projectName}".`,
        'Write a project overview covering: goals, progress, key decisions, and current state.',
        'Scale the length to the material — use multiple paragraphs for complex projects, a few sentences for simple ones.',
        'Be comprehensive but avoid padding. Do not repeat the same point twice.',
        'Write in third person. Output only the overview text, no headings or labels.',
        '',
        episodeBlock,
        '<|im_end|>',
        '<|im_start|>assistant',
    ].join('\n');
 }
 async function generateProjectSummaryFromEpisodes(projectName, episodes) {
    const prompt = buildProjectSummaryFromEpisodesPrompt(projectName, episodes);
    const res = await fetch(`${EXTRACTION_URL}/api/generate`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
            model: EXTRACTION_MODEL,
            prompt,
            stream: false,
            options: { temperature: 0.2, num_predict: 1200 },
        }),
    });
    if (!res.ok) throw new Error(`Ollama responded ${res.status}`);
    const data = await res.json();
    const raw = data.response?.trim() ?? '';
    return raw
        .replace(/<\|im_start\|>.*?<\|im_end\|>/gs, '')
        .replace(/<\|im_start\|>|<\|im_end\|>|<\|im_sep\|>/g, '')
        .trim();
 }
 async function generateProjectSummary(projectName, sessionSummaries) {
    const prompt = buildProjectSummaryPrompt(projectName, sessionSummaries);
    const res = await fetch(`${EXTRACTION_URL}/api/generate`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
            model: EXTRACTION_MODEL,
            prompt,
            stream: false,
            // No format: 'json' — we want free-text narrative, same as session summarization
            options: { temperature: 0.2, num_predict: 1200 },
        }),
    });
    if (!res.ok) throw new Error(`Ollama responded ${res.status}`);
    const data = await res.json();
    const raw = data.response?.trim() ?? '';
    return raw
        .replace(/<\|im_start\|>.*?<\|im_end\|>/gs, '')
        .replace(/<\|im_start\|>|<\|im_end\|>|<\|im_sep\|>/g, '')
        .trim();
 }
 // Main entry point — called by the route handler
 async function generateAndStoreProjectSummary(projectId) {
    const project = getProject(projectId);
    if (!project) throw new Error('Project not found');
    let content;
    const sessionSummaries = getSessionSummariesForProject(projectId);
    if (sessionSummaries.length > 0) {
        // Preferred path — summarize the summaries
        content = await generateProjectSummary(project.name, sessionSummaries);
    } else {
        // Fallback — summarize raw episodes directly
        const episodes = getEpisodesByProject(projectId);
        if (!episodes.length) {
            throw new Error('No session summaries or episodes found for this project');
        }
        content = await generateProjectSummaryFromEpisodes(project.name, episodes);
    }
    if (!content) throw new Error('Model returned empty summary');
    const existing = getProjectOverviewSummary(projectId);
    if (existing) {
        return updateSummary(existing.id, { content, tokenCount: null, episodeRange: null });
    } else {
        return createSummary({ projectId, content, sessionId: null });
    }
 }
 module.exports = { generateAndStoreProjectSummary };
--- a/packages/orchestration-service/CLAUDE.md
+++ b/packages/orchestration-service/CLAUDE.md
@@ -0,0 +1,156 @@
 # CLAUDE.md
 This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
 See the root [CLAUDE.md](../../CLAUDE.md) for overall architecture, service roles, and the end-to-end chat flow.
 ## Running This Service
 ```bash
 npm run orchestration             # From repo root (node src/index.js)
 npm -w packages/orchestration-service run dev   # With --watch
 ```
 Default port: **4000**. Depends on memory-service, embedding-service, inference-service, and Qdrant.
 ## Context Assembly (`src/chat/index.js`)
 `assembleContext(externalId, userMessage)` is the core function that builds the inference prompt. Order of operations:
 1. Resolve session by `externalId` (creates it if missing — every chat call is self-healing).
 2. If session has a `project_id`, load the project and fetch all sibling sessions (via `getProjectSessions`, hardcoded `limit=200`).
 3. Fetch `recentEpisodeLimit` recent episodes from memory-service.
 4. Embed the user message; search Qdrant EPISODES with `scoreThreshold`:
   - No project: `must: [sessionId == this session]`
   - Project: `should: [sessionId == s1, sessionId == s2, ...]` across all project sessions
   - Dedup against recent episode IDs before including.
 5. Run **fused episode retrieval** via `getFusedEpisodes`: the Qdrant semantic search from step 4 and an FTS5 keyword search run in parallel, both filtered against `recentIds`, then merged via Reciprocal Rank Fusion (RRF). If `keywordWeight` is `0`, the FTS call is skipped. Returns the top `semanticLimit` episodes by fused score.
 6. Embed and search Qdrant ENTITIES (filtered by `projectId` if in a project). Returns entity IDs alongside payload — the Qdrant point ID equals the SQLite entity ID.
 7. Expand matched entities into a 1-hop graph neighborhood via `POST /graph/neighbors` on the memory-service. Returns `{ nodes, edges }` — the full entity objects plus connecting relationships. Falls back to flat entity list (no edges) if the graph call fails.
 8. Build prompt in this fixed order: **system prompt → graph context → fused episodes → recent episodes → user message → "Assistant:"**
 The ordering prioritizes established facts (graph context) and relevant past context (fused episodes) over pure recency.
 ## Graph Context Format
 `formatGraphContext(nodes, edges)` in `src/chat/index.js` formats the neighborhood as:
 ```
 - Alice (person): software engineer working on NexusAI
   → works_on NexusAI (project)
   → knows Bob (person)
 - NexusAI (project): AI assistant framework
 - Bob (person): Alice's colleague
 ```
 Each node shows its notes on the first line. Outbound edges are indented below with `→ label target (type)`. Nodes with only inbound edges (neighbors pulled in by traversal) appear without connection lines.
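A minimal usage sketch of the format described above. The function body mirrors `formatGraphContext` as it appears in `src/chat/index.js`; the sample nodes and edges (ids, names, notes) are made up for illustration and are not taken from real data.

```javascript
// Mirrors formatGraphContext from src/chat/index.js.
function formatGraphContext(nodes, edges) {
  if (!nodes.length) return null;
  const nodeMap = new Map(nodes.map((n) => [n.id, n]));
  // Outbound adjacency: node id -> list of "label target (type)" strings
  const outbound = new Map(nodes.map((n) => [n.id, []]));
  for (const edge of edges) {
    if (outbound.has(edge.from_id) && nodeMap.has(edge.to_id)) {
      const target = nodeMap.get(edge.to_id);
      outbound.get(edge.from_id).push(`${edge.label} ${target.name} (${target.type})`);
    }
  }
  return nodes
    .map((n) => {
      const lines = [`- ${n.name} (${n.type}): ${n.notes ?? '(no notes)'}`];
      for (const conn of outbound.get(n.id) ?? []) lines.push(`  → ${conn}`);
      return lines.join('\n');
    })
    .join('\n');
}

// Hypothetical sample neighborhood (illustrative values only).
const sampleNodes = [
  { id: 1, name: 'Alice', type: 'person', notes: 'software engineer working on NexusAI' },
  { id: 2, name: 'NexusAI', type: 'project', notes: 'AI assistant framework' },
  { id: 3, name: 'Bob', type: 'person', notes: "Alice's colleague" },
];
const sampleEdges = [
  { from_id: 1, to_id: 2, label: 'works_on' },
  { from_id: 1, to_id: 3, label: 'knows' },
];

console.log(formatGraphContext(sampleNodes, sampleEdges));
```

Edges whose source or target is missing from `nodes` are silently dropped, which is why NexusAI and Bob appear without connection lines in this sample.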
 ## System Prompt Resolution
 Priority from highest to lowest:
 1. `project.system_prompt` (stored on the project row in memory-service)
 2. `settings.systemPrompt` (saved in `data/settings.json`)
 3. `ORCHESTRATION.SYSTEM_PROMPT` (shared constants fallback)
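The priority chain above can be sketched as a small helper. This is an illustration only: `resolveSystemPrompt` is a hypothetical name, and `DEFAULT_SYSTEM_PROMPT` is a placeholder standing in for `ORCHESTRATION.SYSTEM_PROMPT`, whose actual value is not shown here.

```javascript
// Placeholder for ORCHESTRATION.SYSTEM_PROMPT (actual value is an assumption).
const DEFAULT_SYSTEM_PROMPT = 'You are a helpful assistant.';

// Hypothetical helper: picks the highest-priority system prompt available.
function resolveSystemPrompt(project, settings) {
  // 1. A non-empty project prompt wins.
  if (project && project.system_prompt) return project.system_prompt;
  // 2. Then the saved setting; 3. then the shared default.
  return settings.systemPrompt ?? DEFAULT_SYSTEM_PROMPT;
}

console.log(resolveSystemPrompt({ system_prompt: 'Project prompt' }, { systemPrompt: 'Saved prompt' }));
console.log(resolveSystemPrompt(null, { systemPrompt: null }));
```

Note that the project check is a truthiness test while the settings check uses `??`, so an empty project prompt falls through to the saved setting.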
 ## Settings (`src/config/settings.js`)
 Settings are loaded from `data/settings.json` merged with defaults at every `GET /settings` call. `PATCH /settings` validates each field individually with specific constraints:
 | Field | Constraint |
 |---|---|
 | `recentEpisodeLimit` | integer, 1–20 |
 | `semanticLimit` | integer, 1–20 |
 | `scoreThreshold` | number, 0–1 |
 | `temperature` | number, 0–2 |
 | `repeatPenalty` | number, 1–2 |
 | `topP` | number, 0–1 |
 | `topK` | integer, 1–100 |
 | `modelsFolderPath` | path must exist and be readable |
 | `systemPrompt` | string (trimmed); `null` reverts to shared default |
 `data/settings.json` is created on first save. Parent directories are created if missing.
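As a rough illustration of per-field validation for the numeric rows in the table above, the sketch below encodes each constraint as a rule and returns an error string or `null`. The names `NUMERIC_RULES` and `validateNumericSetting` are hypothetical, and the real `PATCH /settings` handler may be structured differently.

```javascript
// Hypothetical rule table derived from the constraints listed above.
const NUMERIC_RULES = new Map([
  ['recentEpisodeLimit', { integer: true, min: 1, max: 20 }],
  ['semanticLimit', { integer: true, min: 1, max: 20 }],
  ['scoreThreshold', { integer: false, min: 0, max: 1 }],
  ['temperature', { integer: false, min: 0, max: 2 }],
  ['repeatPenalty', { integer: false, min: 1, max: 2 }],
  ['topP', { integer: false, min: 0, max: 1 }],
  ['topK', { integer: true, min: 1, max: 100 }],
]);

// Returns null when the value is acceptable, otherwise a human-readable error.
function validateNumericSetting(field, value) {
  const rule = NUMERIC_RULES.get(field);
  if (!rule) return `unknown numeric field: ${field}`;
  if (typeof value !== 'number' || !Number.isFinite(value)) return `${field} must be a number`;
  if (rule.integer && !Number.isInteger(value)) return `${field} must be an integer`;
  if (value < rule.min || value > rule.max) {
    return `${field} must be between ${rule.min} and ${rule.max}`;
  }
  return null;
}

console.log(validateNumericSetting('topK', 40));
console.log(validateNumericSetting('scoreThreshold', 1.2));
```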
 ## Streaming SSE (`src/chat/index.js` — `chatStream`)
 The route sets SSE headers and delegates to `chatStream`, which:
 1. Calls `inference.completeStream()` → receives a raw HTTP Response with a readable body.
 2. Reads the body in chunks, buffers across chunk boundaries, splits on `\n\n`.
 3. For each event line starting with `data: `, parses the JSON and calls `onChunk(data.response)`.
 4. The `[DONE]` sentinel (used by some llama-server versions) is explicitly ignored.
 5. After stream ends, saves the assembled full response as an episode (same as non-streaming).
 If a chunk parse fails, the error is logged and the stream continues. If the response body closes with no text accumulated, the episode is not saved and a warning is logged.
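The buffering described in steps 2 to 4 can be sketched in isolation. `createSseParser` is a hypothetical helper written for this illustration (it is not a function in the service); it takes already-decoded text chunks and returns the JSON payloads of any complete events.

```javascript
// Hypothetical helper: returns a feed(text) function that buffers partial events.
function createSseParser() {
  let buffer = '';
  return function feed(text) {
    buffer += text;
    const events = buffer.split('\n\n');
    // The last element is either '' or an incomplete event; keep it for the next chunk.
    buffer = events.pop() ?? '';
    const payloads = [];
    for (const event of events) {
      const raw = event
        .split('\n')
        .filter((line) => line.startsWith('data: '))
        .map((line) => line.slice(6))
        .join('\n')
        .trim();
      if (!raw || raw === '[DONE]') continue; // ignore empty events and the sentinel
      try {
        payloads.push(JSON.parse(raw));
      } catch (err) {
        // A bad event is reported and skipped; later events are still processed.
        console.warn('Skipping unparseable SSE event:', err.message);
      }
    }
    return payloads;
  };
}

const feed = createSseParser();
const firstBatch = feed('data: {"response":"Hel');       // event split mid-JSON
const secondBatch = feed('lo"}\n\ndata: [DONE]\n\n');    // completes it, then the sentinel
console.log(firstBatch, secondBatch);
```

Splitting on `\n\n` rather than `\n` is what lets an event that straddles two network chunks be reassembled before parsing.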
 ## Fire-and-Forget Tasks
 After every successful chat turn:
 - **Summarization** (`services/summarization.js` → `triggerSummary`): checks token threshold → recency guard → calls Ollama → POSTs to memory-service. Only runs if `SUMMARIES.THRESHOLD_TOKENS` is exceeded AND at least `SUMMARIES.MIN_EPISODES_SINCE` new episodes have occurred since the last summary.
 - **Auto-naming** (`chat/index.js` → `autoNameSession`): only fires on the first message of a session. Uses temp 0.3, `maxTokens=20`, prompts for a ≤5-word title.
 Both tasks catch all errors and log warnings without surfacing to the client.
 ## Summarization Recency Guard
 `src/services/summarization.js` reads the `episode_range` field of the latest existing summary (format: `"<startId>-<endId>"`). It counts SQLite episodes with `id > endId`; if fewer than `SUMMARIES.MIN_EPISODES_SINCE`, it skips. This prevents rapid re-summarization on high-traffic sessions.
 When the existing summary's token count exceeds `SUMMARIES.MAX_SUMMARY_TOKENS`, it is treated as "expired" — a fresh summary is generated instead of an incremental update.
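A minimal sketch of the recency guard alone, leaving out the token threshold and the "expired" check. `passesRecencyGuard` is a hypothetical name, `minEpisodesSince` stands in for `SUMMARIES.MIN_EPISODES_SINCE`, and treating a missing or unparseable `episode_range` as "no guard" is an assumption of this sketch rather than confirmed behavior.

```javascript
// Hypothetical helper modelling only the recency guard.
function passesRecencyGuard(latestSummary, episodeIds, minEpisodesSince) {
  // Assumption: with no prior summary (or no range), there is nothing to guard against.
  if (!latestSummary || !latestSummary.episode_range) return true;
  // episode_range has the form "<startId>-<endId>"
  const endId = Number(latestSummary.episode_range.split('-')[1]);
  if (!Number.isFinite(endId)) return true; // assumption: unparseable range does not block
  const newerCount = episodeIds.filter((id) => id > endId).length;
  return newerCount >= minEpisodesSince;
}

console.log(passesRecencyGuard({ episode_range: '1-10' }, [8, 9, 10, 11, 12], 3));
console.log(passesRecencyGuard({ episode_range: '1-10' }, [9, 10, 11, 12, 13], 3));
```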
 ## Qdrant Calls (Direct, Not Via Memory-Service)
 `src/services/qdrant.js` makes REST calls to Qdrant directly at `QDRANT_URL`. This bypasses memory-service for semantic search performance. Orchestration fetches episode/entity content from memory-service by ID *after* getting vector search results from Qdrant.
 `searchEntities` checks `projectId !== null && projectId !== undefined` before applying the filter — a session with no project skips the filter entirely and searches globally.
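To make the scoping rules concrete, here is a sketch of how the two filter shapes could be built using Qdrant's `must` / `should` filter clauses. The helper names and the payload keys `projectId` and `sessionId` are assumptions for illustration; the actual payload schema lives in `src/services/qdrant.js`.

```javascript
// Hypothetical: entity filter is only applied when a project id is present.
function buildEntityFilter(projectId) {
  // Explicit null/undefined check so that a falsy-but-valid id (e.g. 0) still filters.
  if (projectId === null || projectId === undefined) return undefined;
  return { must: [{ key: 'projectId', match: { value: projectId } }] };
}

// Hypothetical: episode filter is one session (must) or any project session (should).
function buildEpisodeFilter(sessionId, projectSessionIds) {
  if (projectSessionIds && projectSessionIds.length > 0) {
    return {
      should: projectSessionIds.map((id) => ({ key: 'sessionId', match: { value: id } })),
    };
  }
  return { must: [{ key: 'sessionId', match: { value: sessionId } }] };
}

console.log(JSON.stringify(buildEntityFilter(null)));
console.log(JSON.stringify(buildEpisodeFilter(7, [7, 8, 9])));
```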
 ## Retrieval Fusion (`src/chat/index.js`)
 Three functions handle fusion — all pure or lightly async, all non-critical:
 - **`getFTSResults(userMessage, { limit, sessionIds })`** — calls `memory.searchEpisodes`; returns `[]` and logs a warning on failure
 - **`fuseEpisodeResults(semanticEps, keywordEps, { semanticWeight, keywordWeight, limit })`** — pure RRF implementation. Key guard: FTS-only episodes are only added to the scores Map if `contrib > 0` (prevents score-0 bleed-through when `keywordWeight: 0`)
 - **`getFusedEpisodes(userMessage, session, recentIds, projectSessionIds, settings)`** — orchestrates both paths in `Promise.all`, applies `recentIds` filter to FTS results, calls fusion. Short-circuits FTS call entirely if `keywordWeight === 0`
 FTS is scoped to `projectSessionIds` if in a project, otherwise `[session.id]` — mirrors Qdrant scoping exactly.
 > For RRF formula, weight semantics, and enabling keyword search, see `docs/services/retrieval-fusion.md`.
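A small worked example of the fusion step. The function body follows `fuseEpisodeResults` in `src/chat/index.js`; `RRF_K = 60` is an assumed value standing in for `RETRIEVAL.RRF_K`, and the episode objects are reduced to bare ids for illustration.

```javascript
const RRF_K = 60; // assumption: placeholder for RETRIEVAL.RRF_K

// Follows fuseEpisodeResults: each list contributes weight / (k + rank) per episode.
function fuseEpisodeResults(semanticEps, keywordEps, { semanticWeight, keywordWeight, limit }) {
  const scores = new Map();
  semanticEps.forEach((ep, i) => {
    scores.set(ep.id, { episode: ep, score: semanticWeight / (RRF_K + i + 1) });
  });
  keywordEps.forEach((ep, i) => {
    const contrib = keywordWeight / (RRF_K + i + 1);
    if (scores.has(ep.id)) {
      scores.get(ep.id).score += contrib;
    } else if (contrib > 0) {
      // Guard: keyword-only hits are dropped when keywordWeight is 0.
      scores.set(ep.id, { episode: ep, score: contrib });
    }
  });
  return [...scores.values()].sort((a, b) => b.score - a.score).slice(0, limit);
}

const semanticHits = [{ id: 1 }, { id: 2 }];
const keywordHits = [{ id: 2 }, { id: 3 }];

const balanced = fuseEpisodeResults(semanticHits, keywordHits, { semanticWeight: 1, keywordWeight: 1, limit: 3 });
const semanticOnly = fuseEpisodeResults(semanticHits, keywordHits, { semanticWeight: 1, keywordWeight: 0, limit: 3 });

console.log(balanced.map((r) => r.episode.id));     // episode 2 appears in both lists, so it ranks first
console.log(semanticOnly.map((r) => r.episode.id)); // episode 3 (keyword-only) is excluded
```

With equal weights, episode 2 scores 1/62 + 1/61, which beats episode 1 at 1/61 and episode 3 at 1/62, so an item found by both retrievers outranks the top hit of either one alone.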
 ## Graph Service Client (`src/services/graph.js`)
 Thin HTTP client for memory-service graph endpoints. One function:
 - **`getNeighbors(entityIds[])`** — POSTs to `memory-service/graph/neighbors` with the entity IDs from Qdrant entity search. Returns `{ nodes, edges }`. Throws on non-2xx — caller wraps in try/catch with graceful fallback.
 ## Models Endpoint
 `GET /models` scans `modelsFolderPath` for `.gguf` files and optionally reads a `models.json` manifest (keyed by filename) for labels and descriptions. File size is reported in GB. Returns 500 if the folder is inaccessible.
 `GET /models/props` proxies `/props` from llama-server and returns `{contextWindow, modelAlias}`. Returns 503 if llama-server is unreachable.
 ## Health Check
 `GET /health/services` runs parallel fetch calls to all four dependent services with a 3-second `AbortSignal.timeout` each. Results are returned as an array — the endpoint never returns a non-2xx itself regardless of downstream status.
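The parallel check with per-request timeouts can be sketched as follows. `checkServices` and its `fetchImpl` parameter are hypothetical (the injectable fetch exists only to make the sketch easy to exercise without real services); `AbortSignal.timeout` and the global `fetch` are standard in recent Node.js versions.

```javascript
// Hypothetical helper: probes every service in parallel and never rejects.
async function checkServices(services, { timeoutMs = 3000, fetchImpl = fetch } = {}) {
  return Promise.all(
    services.map(async ({ name, url }) => {
      try {
        const res = await fetchImpl(url, { signal: AbortSignal.timeout(timeoutMs) });
        return { name, ok: res.ok, status: res.status };
      } catch (err) {
        // Timeouts and connection errors are reported as data, not thrown.
        return { name, ok: false, error: err.message };
      }
    }),
  );
}

// Usage with a stand-in fetch so the example runs without a network.
const fakeFetch = async (url) => {
  if (url.includes('down')) throw new Error('connect ECONNREFUSED');
  return { ok: true, status: 200 };
};

const healthPromise = checkServices(
  [
    { name: 'memory', url: 'http://memory.example/health' },
    { name: 'inference', url: 'http://down.example/health' },
  ],
  { fetchImpl: fakeFetch },
);
healthPromise.then((results) => console.log(results));
```

Because each failure is caught inside its own task, one unreachable dependency cannot turn the aggregate result into a rejection, which matches the "never returns a non-2xx itself" behavior described above.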
 ## Background Model (qwen2.5:3b)
 Used for entity/relationship extraction and summarization via Ollama on Mini PC 1. Uses **ChatML format** (`<|im_start|>` / `<|im_end|>`) — not Phi3 format. Use `format: 'json'` only for structured extraction, never for free-text summarization.
 ## API Endpoints Quick Reference
 | Method | Path | Notes |
 |---|---|---|
 | GET | `/health` | Returns service URLs |
 | GET | `/health/services` | Parallel status of all dependencies |
 | POST | `/chat` | Blocking completion |
 | POST | `/chat/stream` | SSE streaming |
 | GET/PATCH | `/settings` | Persistent settings |
 | GET | `/models` | `.gguf` file scan |
 | GET | `/models/props` | llama-server model info |
 | GET | `/sessions` | Delegates to memory-service |
 | GET | `/sessions/:sessionId/history` | Paginated episodes by external ID |
 | PATCH | `/sessions/:sessionId` | `name` and/or `projectId` |
 | DELETE | `/sessions/:sessionId` | |
 | GET | `/episodes` | Delegates; supports `q` for FTS |
 | DELETE | `/episodes/:id` | Delegates |
 | GET/POST/PATCH/DELETE | `/projects` and `/projects/:id` | Delegates |
 | POST | `/summaries/project/:projectId/generate` | On-demand; 422 if no data |
 | GET | `/summaries/project/:projectId/overview` | |
 | GET | `/summaries/session/:sessionId` | Resolves external ID first |
 | GET | `/summaries/project/:projectId` | |
--- a/packages/orchestration-service/src/chat/index.js
+++ b/packages/orchestration-service/src/chat/index.js
@@ -1,85 +1,354 @@
-const memory = require('../services/memory');
+const memory = require("../services/memory");
-const inference = require('../services/inference');
+const inference = require("../services/inference");
-const embedding = require('../services/embedding');
+const embedding = require("../services/embedding");
-const qdrant = require('../services/qdrant');
+const qdrant = require("../services/qdrant");
-const { ORCHESTRATION } = require('@nexusai/shared')
+const { ORCHESTRATION, RETRIEVAL, logger } = require("@nexusai/shared");
 const appSettings = require("../config/settings");
 const {triggerSummary} = require('../services/summarization')
 const graph = require('../services/graph');
-const { RECENT_EPISODE_LIMIT, SEMANTIC_LIMIT, SCORE_THRESHOLD, SYSTEM_PROMPT } = ORCHESTRATION;
+function buildPrompt(guaranteed, selected, neighborhood, userMessage, systemPrompt) {
    const parts = [systemPrompt ?? ORCHESTRATION.SYSTEM_PROMPT];
-function buildPrompt(recentEpisodes, semanticEpisodes, userMessage) {
+    const graphText = formatGraphContext(neighborhood.nodes ?? [], neighborhood.edges ?? []);
-    const parts = [SYSTEM_PROMPT];
+    if (graphText) {
        parts.push("Here is what you know about entities relevant to this conversation and their connections:");
        parts.push(graphText);
        parts.push("---");
    }
-    if (semanticEpisodes.length > 0 )
+  if (selected.length > 0) {
-    {
+    parts.push("Relevant memories from earlier conversations:");
-        parts.push('Here are some relevant memories from earlier conversations:')
+    for (const ep of selected) {
        for (const ep of semanticEpisodes) {
      parts.push(`User: ${ep.user_message}\nAssistant: ${ep.ai_response}`);
    }
-        parts.push('---')
+    parts.push("---");
  }
-    if (recentEpisodes.length > 0) {
+  if (guaranteed.length > 0) {
-        parts.push(`Here are some relevant memories from your past conversations:`);
+    parts.push("Recent conversation history (most recent exchanges):");
-        for (const ep of recentEpisodes) {
+    for (const ep of guaranteed) {
      parts.push(`User: ${ep.user_message}\nAssistant: ${ep.ai_response}`);
    }
-        parts.push('--- End of recent memories ---\n');
+    parts.push("--- End of recent memories ---\n");
  }
  parts.push(`User: ${userMessage}`);
-    parts.push('Assistant:');
+  parts.push("Assistant:");
-    return parts.join('\n');
+  return parts.join("\n");
 }
-async function getSemanticEpisodes(userMessage, sessionId, recentIds) {
+function buildNamingPrompt(userMessage, aiResponse) {
  return [
    "Your task is to generate a short title for a conversation based on its first exchange.",
    "Rules: maximum 5 words, no punctuation, no quotes, plain text only.",
    'Examples: "Setting up a Raspberry Pi", "Help with Python list comprehension", "Planning a trip to Japan"',
    "",
    `User: ${userMessage}`,
    `Assistant: ${aiResponse}`,
    "",
    "Title:",
  ].join("\n");
 }
 function formatGraphContext(nodes, edges) {
    if (!nodes.length) return null;
    const nodeMap = new Map(nodes.map(n => [n.id, n]));
    // Build outbound adjacency
    const outbound = new Map(nodes.map(n => [n.id, []]));
    for (const edge of edges) {
        if (outbound.has(edge.from_id) && nodeMap.has(edge.to_id)) {
            const target = nodeMap.get(edge.to_id);
            outbound.get(edge.from_id).push(`${edge.label} ${target.name} (${target.type})`);
        }
    }
    return nodes.map(n => {
        const lines = [`- ${n.name} (${n.type}): ${n.notes ?? '(no notes)'}`];
        for (const conn of outbound.get(n.id) ?? []) lines.push(`  → ${conn}`);
        return lines.join('\n');
    }).join('\n');
 }
 async function autoNameSession(externalId, userMessage, aiResponse) {
  try {
    const prompt = buildNamingPrompt(userMessage, aiResponse);
    const result = await inference.complete(prompt, {
      maxTokens: 20, // title only needs a handful of tokens
      temperature: 0.3, // low temperature for consistent, factual naming
    });
    const name = result.text?.trim().replace(/^["']|["']$/g, ""); // strip any quotes the model adds
    if (name) {
      await memory.updateSession(externalId, { name });
      logger.info(
        `[orchestration] Auto-named session "${externalId}": "${name}"`,
      );
    }
  } catch (err) {
    logger.warn(
      "[orchestration] Auto-naming failed (non-critical):",
      err.message,
    );
  }
 }
 async function getSemanticEpisodes(
  userMessage,
  sessionId,
  recentIds,
  projectSessionIds = null,
  { semanticLimit, scoreThreshold } = {},
 ) {
  try {
    const vector = await embedding.embed(userMessage);
    const results = await qdrant.searchEpisodes(vector, {
-            limit: SEMANTIC_LIMIT,
+      limit: semanticLimit,
-            scoreThreshold: SCORE_THRESHOLD,
+      scoreThreshold: scoreThreshold,
-            sessionId,
+      sessionId: projectSessionIds ? null : sessionId,
      projectSessionIds,
    });
    const fetched = await Promise.all(
      results
-                .filter(r => !recentIds.has(r.id))
+        .filter((r) => !recentIds.has(r.id))
-                .map(r => memory.getEpisodeById(r.id))
+        .map((r) => memory.getEpisodeById(r.id)),
    );
    return fetched.filter(Boolean);
  } catch (err) {
-        console.warn(`[orchestration] Semantic search failed, continuing without: `, err.message);
+    logger.warn(
      `[orchestration] Semantic search failed, continuing without: `,
      err.message,
    );
    return [];
  }
 }
-async function chat(externalId, userMessage, options = {}) {
+async function getRelevantEntities(userMessage, projectId = null) {
    try {
        const vector = await embedding.embed(userMessage);
        const results = await qdrant.searchEntities(vector, { projectId });
        logger.info(
            '[orchestration] Entity search results:',
            results.map((r) => ({ name: r.payload?.name, score: r.score })),
        );
        // Include the Qdrant point ID (== SQLite entity ID) for graph traversal
        return results.map((r) => r.payload ? { id: r.id, ...r.payload } : null).filter(Boolean);
    } catch (err) {
        logger.debug('[orchestration] Entity search failed, continuing without:', err.message);
        return [];
    }
 }
 async function getFTSResults(userMessage, { limit, sessionIds }) {
    try {
        return await memory.searchEpisodes(userMessage, { limit, sessionIds });
    } catch (err) {
        logger.warn('[orchestration] FTS search failed, continuing without:', err.message);
        return [];
    }
 }
 // Returns {episode, score}[] — scores needed for buildScoredPool downstream
 function fuseEpisodeResults(semanticEps, keywordEps, { semanticWeight, keywordWeight, limit }) {
    const k = RETRIEVAL.RRF_K;
    const scores = new Map();
    semanticEps.forEach((ep, i) => {
        scores.set(ep.id, { episode: ep, score: semanticWeight / (k + i + 1) });
    });
    keywordEps.forEach((ep, i) => {
        const contrib = keywordWeight / (k + i + 1);
        if (scores.has(ep.id)) {
            scores.get(ep.id).score += contrib;
        } else if (contrib > 0) {
            scores.set(ep.id, { episode: ep, score: contrib });
        }
    });
    return [...scores.values()]
        .sort((a, b) => b.score - a.score)
        .slice(0, limit);
 }
 function estimateTokens(episode) {
    return episode.token_count
        ?? Math.ceil((episode.user_message.length + episode.ai_response.length) / 4);
 }
 function buildScoredPool(fusedWithScores, recentEpisodes, entityBoostedIds, { entityWeight }) {
    const k = RETRIEVAL.RRF_K;
    const pool = new Map(); // episode.id → {episode, score}
    for (const { episode, score } of fusedWithScores) {
        pool.set(episode.id, { episode, score });
    }
    recentEpisodes.forEach((ep, i) => {
        const recencyScore = 1.0 / (k + i + 1);
        if (pool.has(ep.id)) {
            pool.get(ep.id).score += recencyScore;
        } else {
            pool.set(ep.id, { episode: ep, score: recencyScore });
        }
    });
    for (const id of entityBoostedIds) {
        if (pool.has(id)) pool.get(id).score += entityWeight;
    }
    return [...pool.values()].sort((a, b) => b.score - a.score);
 }
 function selectWithinBudget(scoredPool, contextBudget, minRecentEpisodes, recentEpisodes) {
    let budget = contextBudget;
    const sortByTime = (a, b) => a.created_at - b.created_at;
    // Guarantee floor: always include the N most recent episodes
    const guaranteed = recentEpisodes.slice(0, minRecentEpisodes);
    const guaranteedIds = new Set(guaranteed.map(ep => ep.id));
    for (const ep of guaranteed) budget -= estimateTokens(ep);
    // Fill remaining budget from scored pool, highest-priority first
    const selected = [];
    for (const { episode } of scoredPool) {
        if (guaranteedIds.has(episode.id)) continue;
        const cost = estimateTokens(episode);
        // Break rather than skip: lower-priority episodes aren't worth fitting over higher-priority ones
        if (budget - cost < 0) break;
        selected.push(episode);
        budget -= cost;
    }
    return {
        guaranteed: [...guaranteed].sort(sortByTime),
        selected:   selected.sort(sortByTime),
    };
 }
 async function getFusedEpisodes(userMessage, session, recentIds, projectSessionIds, settings) {
    const { semanticLimit, scoreThreshold, semanticWeight, keywordWeight } = settings;
    const ftsSessionIds = projectSessionIds ?? [session.id];
    // FTS and semantic results may overlap significantly, so fetch twice as many FTS hits to give the fusion step more to work with after deduplication.
    const ftsPromise = keywordWeight > 0
        ? getFTSResults(userMessage, { limit: semanticLimit * 2, sessionIds: ftsSessionIds })
        : Promise.resolve([]);
    const [semanticEps, rawKeywordEps] = await Promise.all([
        getSemanticEpisodes(userMessage, session.id, recentIds, projectSessionIds, { semanticLimit, scoreThreshold }),
        ftsPromise,
    ]);
    const keywordEps = rawKeywordEps.filter(ep => !recentIds.has(ep.id));
    return fuseEpisodeResults(semanticEps, keywordEps, { semanticWeight, keywordWeight, limit: semanticLimit });
 }
 async function assembleContext(externalId, userMessage) {
    const settings = appSettings.load();
    const { recentEpisodeLimit, semanticLimit, scoreThreshold,
            temperature, repeatPenalty, topP, topK, systemPrompt,
            semanticWeight, keywordWeight,
            contextBudget, entityWeight, minRecentEpisodes } = settings;
    // 1. Resolve or create session
    let session = await memory.getSessionByExternalId(externalId);
    if (!session) session = await memory.createSession(externalId);
-    // 2. Fetch recent episodes for context
+    // 2. Resolve project context
-    const recentEpisodes = await memory.getRecentEpisodes(session.id, RECENT_EPISODE_LIMIT );
+    let projectSessionIds = null;
    let activeSystemPrompt = systemPrompt ?? ORCHESTRATION.SYSTEM_PROMPT;
    if (session.project_id) {
        try {
            const project = await memory.getProject(session.project_id);
            if (project) {
                const projectSessions = await memory.getProjectSessions(session.project_id);
                if (project.system_prompt) activeSystemPrompt = project.system_prompt;
                projectSessionIds = projectSessions.map(s => s.id);
            }
        } catch (err) {
            logger.warn('[orchestration] Failed to resolve project context:', err.message);
        }
    }
    // 3. Fetch recent episodes
    const recentEpisodes = await memory.getRecentEpisodes(session.id, recentEpisodeLimit);
    const isFirstMessage = recentEpisodes.length === 0;
    const recentIds = new Set(recentEpisodes.map(e => e.id));
-    // 3. Semantic Search
+    // 4. Fused retrieval + entity search in parallel (both are independent)
-    const semanticEpisodes = await getSemanticEpisodes(userMessage, session.id, recentIds);
+    const [fusedWithScores, entityResults] = await Promise.all([
        getFusedEpisodes(userMessage, session, recentIds, projectSessionIds, { semanticLimit, scoreThreshold, semanticWeight, keywordWeight }),
        getRelevantEntities(userMessage, session.project_id ?? null),
    ]);
-    // 4. Assemble prompt
+    // 5. Entity-linked episode IDs for scoring bonus
-    const prompt = buildPrompt(recentEpisodes, semanticEpisodes, userMessage);
+    const entityIds = entityResults.map(e => e.id);
    let entityBoostedIds = new Set();
    if (entityIds.length > 0) {
        try {
            const result = await memory.getEpisodesByEntities(entityIds);
            entityBoostedIds = new Set(result.episodeIds);
        } catch (err) {
            logger.debug('[orchestration] Entity-episode lookup failed, skipping bonus:', err.message);
        }
    }
-    // 5. Run inference
+    // 6. Build unified scored pool and select within token budget
-    const result = await inference.complete(prompt, options);
+    const scoredPool = buildScoredPool(fusedWithScores, recentEpisodes, entityBoostedIds, { entityWeight });
    const { guaranteed, selected } = selectWithinBudget(scoredPool, contextBudget, minRecentEpisodes, recentEpisodes);
-    // 6. Write episode back to memory
+    // 7. Graph neighborhood expansion
-    memory.createEpisode(
+    let neighborhood = { nodes: [], edges: [] };
-        session.id,
+    if (entityIds.length > 0) {
-        userMessage,
+        try {
-        result.text,
+            neighborhood = await graph.getNeighbors(entityIds);
-        (result.evalCount || 0) + (result.promptEvalCount || 0 )
+        } catch (err) {
-    ).catch(err => console.error(`[orchestration] Failed to save episode`, err.message));
+            logger.warn('[orchestration] Graph neighborhood fetch failed, falling back to flat entities:', err.message);
            neighborhood = { nodes: entityResults, edges: [] };
        }
    }
    // 8. Assemble prompt
    const prompt = buildPrompt(guaranteed, selected, neighborhood, userMessage, activeSystemPrompt);
    return {
        session,
        prompt,
        isFirstMessage,
        inferenceOptions: { temperature, repeatPenalty, topP, topK },
    };
 }
 async function chat(externalId, userMessage, options = {}) {
    const { session, prompt, isFirstMessage, inferenceOptions } = await assembleContext(externalId, userMessage);
    const result = await inference.complete(prompt, { ...options, ...inferenceOptions });
    try {
        await memory.createEpisode(
            session.id, userMessage, result.text,
            (result.evalCount || 0) + (result.promptEvalCount || 0),
            session.project_id ?? null,
        );
    } catch (err) {
        logger.error('[orchestration] Failed to save episode:', err.message);
    }
    const allEpisodes = await memory.getRecentEpisodes(session.id, 9999);
    triggerSummary(session, allEpisodes);
    if (isFirstMessage && !session.name) {
        autoNameSession(externalId, userMessage, result.text).catch(() => {});
    }
    // 7. Return response
    return {
        sessionId: externalId,
        response: result.text,
@@ -87,131 +356,58 @@ async function chat(externalId, userMessage, options = {}) {
        tokenCount: (result.evalCount || 0) + (result.promptEvalCount || 0),
    };
 }
-/*
+
 async function chatStream(externalId, userMessage, onChunk, options = {}) {
    // 1. Resolve or create session
    let session = await memory.getSessionByExternalId(externalId);
    if (!session) session = await memory.createSession(externalId);
    // 2. Context assembly
    const recentEpisodes = await memory.getRecentEpisodes(session.id, RECENT_EPISODE_LIMIT);
    const recentIds = new Set(recentEpisodes.map(e => e.id));
    const semanticEpisodes = await getSemanticEpisodes(userMessage, session.id, recentIds)
    // 3. Assemble Prompt
    const prompt = buildPrompt(recentEpisodes, semanticEpisodes, userMessage);
    // 4. Open stream to inference service
    const res = await inference.completeStream(prompt, options);
    let fullText = '';
    let model = '';
    let tokenCount = 0;
    // 5. Parse SSE chunks
    // Replace the current SSE parsing block in chatStream:
    for await (const chunk of res.body) {
        const lines = chunk.toString().split('\n');
        for (const line of lines) {
            if (!line.startsWith('data: ')) continue;
            const raw = line.slice(6).trim();
            if (raw === '[DONE]') continue;
    try {
-                const data = JSON.parse(raw);
+        const { session, prompt, isFirstMessage, inferenceOptions } = await assembleContext(externalId, userMessage);
-                // llama.cpp provider shape: { response, done }
+        const res = await inference.completeStream(prompt, { ...options, ...inferenceOptions });
                if (data.response) {
                    fullText += data.response;
                    onChunk(data.response);
                }
-                // model comes through on done chunk from inference route
+        let fullText = '', model = '', tokenCount = 0, buffer = '';
                if (data.model) model = data.model;
                // token count — inference.js route sends this on the done chunk
                if (data.done && data.tokenCount !== undefined) {
                    tokenCount = data.tokenCount;
                }
            } catch {
                // partial chunk — skip
            }
        }
    }
    // 6. Write Complete episode to memory
    if(fullText && fullText.trim()){
        memory.createEpisode(session.id, userMessage, fullText, tokenCount)
            .catch(err => console.error('[orchestration] Failed to save streamed episode:', err.message))
    }
    return {model, tokenCount};        
 }
 */
 async function chatStream(externalId, userMessage, onChunk, options = {}) {
  let session = await memory.getSessionByExternalId(externalId);
  if (!session) session = await memory.createSession(externalId);
  const recentEpisodes = await memory.getRecentEpisodes(session.id, RECENT_EPISODE_LIMIT);
  const recentIds = new Set(recentEpisodes.map(e => e.id));
  const semanticEpisodes = await getSemanticEpisodes(userMessage, session.id, recentIds);
  const prompt = buildPrompt(recentEpisodes, semanticEpisodes, userMessage);
  const res = await inference.completeStream(prompt, options);
  let fullText = '';
  let model = '';
  let tokenCount = 0;
  let buffer = '';
        for await (const chunk of res.body) {
            buffer += Buffer.from(chunk).toString('utf8');
            const events = buffer.split('\n\n');
            buffer = events.pop() || '';
            for (const event of events) {
-      const lines = event.split('\n');
-      const dataLines = lines
+                const dataLines = event.split('\n')
                    .filter(line => line.startsWith('data: '))
                    .map(line => line.slice(6));
-      if (dataLines.length === 0) continue;
+                if (!dataLines.length) continue;
                const raw = dataLines.join('\n').trim();
                if (raw === '[DONE]') continue;
                try {
                    const data = JSON.parse(raw);
-
-        if (data.response) {
-          fullText += data.response;
-          onChunk(data.response);
-        }
+                    if (data.response) { fullText += data.response; onChunk(data.response); }
                    if (data.model) model = data.model;
-        if (data.done && data.tokenCount !== undefined) {
-          tokenCount = data.tokenCount;
-        }
-        if (data.error) {
-          throw new Error(data.error);
-        }
+                    if (data.done && data.tokenCount !== undefined) tokenCount = data.tokenCount;
+                    if (data.error) throw new Error(data.error);
                } catch (err) {
-        console.error('[orchestration] Failed to parse inference SSE event:', raw, err.message);
+                    logger.error('[orchestration] Failed to parse SSE event:', raw, err.message);
                }
            }
        }
  console.log('[orchestration] final streamed text length:', fullText.length);
        if (fullText.trim()) {
-    await memory.createEpisode(session.id, userMessage, fullText, tokenCount);
+            await memory.createEpisode(session.id, userMessage, fullText, tokenCount, session.project_id ?? null);
            const allEpisodes = await memory.getRecentEpisodes(session.id, 9999);
            triggerSummary(session, allEpisodes);
        } else {
-    console.warn('[orchestration] Stream finished with no assistant text; episode not saved');
+            logger.warn('[orchestration] Stream finished with no assistant text; episode not saved');
        }
        if (isFirstMessage && !session.name) {
            autoNameSession(externalId, userMessage, fullText).catch(() => {});
        }
        return { model, tokenCount };
    } catch (err) {
        logger.error('[orchestration] chatStream fatal error:', err.message, err.stack);
        throw err;
    }
 }
 module.exports = { chat, chatStream };
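
Review note: the SSE handling in `chatStream` buffers raw text, splits on blank lines, and carries the trailing partial event over to the next chunk. A minimal standalone sketch of that buffering follows. `createSseParser` and `onData` are hypothetical names used only for illustration and are not part of this change.

```javascript
// Hypothetical helper mirroring the buffering logic in chatStream.
// It is a sketch, not the code in this diff.
function createSseParser(onData) {
  let buffer = '';
  return {
    push(text) {
      buffer += text;
      const events = buffer.split('\n\n');
      // The last element is either '' or an incomplete event; keep it for later.
      buffer = events.pop() || '';
      for (const event of events) {
        const dataLines = event
          .split('\n')
          .filter(line => line.startsWith('data: '))
          .map(line => line.slice(6));
        if (!dataLines.length) continue;
        const raw = dataLines.join('\n').trim();
        if (raw === '[DONE]') continue;
        onData(JSON.parse(raw));
      }
    },
  };
}

// Usage: an event split across two network chunks is parsed once it is complete.
const received = [];
const parser = createSseParser(data => received.push(data));
parser.push('data: {"response":"Hel');
parser.push('lo"}\n\ndata: {"done":true,"tokenCount":2}\n\n');
parser.push('data: [DONE]\n\n');
```

Unlike the loop in the diff, this sketch lets `JSON.parse` errors propagate to the caller. Whether parse failures should be logged and skipped or surfaced is a design choice the change leaves to the surrounding `try`/`catch`.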
--- a/packages/orchestration-service/src/config/settings.js
+++ b/packages/orchestration-service/src/config/settings.js
@@ -0,0 +1,41 @@
 const fs = require('fs');
 const path = require('path');
 const { getEnv, ORCHESTRATION, INFERENCE_DEFAULTS, RETRIEVAL } = require('@nexusai/shared');
 const SETTINGS_PATH = path.join(__dirname, '../../data/settings.json');
 const DEFAULTS = {
  recentEpisodeLimit:   ORCHESTRATION.RECENT_EPISODE_LIMIT,
  semanticLimit:        ORCHESTRATION.SEMANTIC_LIMIT,
  scoreThreshold:       ORCHESTRATION.SCORE_THRESHOLD,
  modelsFolderPath:     getEnv('MODELS_MANIFEST_PATH', '/mnt/nexus-models'),
  temperature:          INFERENCE_DEFAULTS.TEMPERATURE,
  repeatPenalty:        INFERENCE_DEFAULTS.REPEAT_PENALTY,
  topP:                 INFERENCE_DEFAULTS.TOP_P,
  topK:                 INFERENCE_DEFAULTS.TOP_K,
  systemPrompt:         ORCHESTRATION.SYSTEM_PROMPT,
  semanticWeight:       RETRIEVAL.SEMANTIC_WEIGHT,
  keywordWeight:        RETRIEVAL.KEYWORD_WEIGHT,
  contextBudget:        ORCHESTRATION.CONTEXT_BUDGET,
  entityWeight:         ORCHESTRATION.ENTITY_WEIGHT,
  minRecentEpisodes:    ORCHESTRATION.MIN_RECENT_EPISODES,
 };
 function load() {
  try {
    const raw = fs.readFileSync(SETTINGS_PATH, 'utf8');
    return { ...DEFAULTS, ...JSON.parse(raw) };
  } catch {
    return { ...DEFAULTS }; // file doesn't exist yet — use defaults
  }
 }
 function save(updates) {
  const current = load();
  const next = { ...current, ...updates };
  fs.mkdirSync(path.dirname(SETTINGS_PATH), { recursive: true });
  fs.writeFileSync(SETTINGS_PATH, JSON.stringify(next, null, 2));
  return next;
 }
 module.exports = { load, save, DEFAULTS };
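
Review note: `load()` merges stored values over `DEFAULTS` with object spread, so any key present in the stored JSON wins, including an explicit `null`. The sketch below illustrates that behaviour in isolation. The default values and the `merge` helper are illustrative assumptions, not the values exported by `@nexusai/shared`.

```javascript
// Illustrative defaults only; the real ones come from @nexusai/shared.
const DEFAULTS = { temperature: 0.7, systemPrompt: 'You are helpful.' };

// Same shape as the merge performed in load(): stored keys override defaults.
function merge(defaults, stored) {
  return { ...defaults, ...stored };
}

const partial = merge(DEFAULTS, { temperature: 0.2 });
const nulled = merge(DEFAULTS, { systemPrompt: null });

// A stored null survives the spread, so a consumer that wants
// "null means default" has to resolve it explicitly.
const resolvedPrompt = nulled.systemPrompt ?? DEFAULTS.systemPrompt;
```

If a stored `null` is meant to revert a setting to its default, the fallback has to happen wherever the setting is read, since the spread alone will not restore it.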
--- a/packages/orchestration-service/src/index.js
+++ b/packages/orchestration-service/src/index.js
@@ -1,14 +1,21 @@
 require ('dotenv').config();
 const express = require('express');
-const {getEnv, PORTS, SERVICES, ORCHESTRATION} = require('@nexusai/shared');
+const {getEnv, PORTS, SERVICES, ORCHESTRATION, logger} = require('@nexusai/shared');
 /**** ROUTERS *** */
 const chatRouter = require('./routes/chat');
 const sessionsRouter = require('./routes/sessions');
 const modelsRouter = require('./routes/models');
 const projectsRouter = require('./routes/projects');
 const episodesRouter = require('./routes/episodes');
 const settingsRouter = require('./routes/settings');
 const healthRouter = require('./routes/health');
 const summariesRouter = require('./routes/summaries')
 const cors = require('cors');
 const app = express();
-app.use(express.json());
+app.use(express.json({ limit: '2mb' }));
 app.use(cors({
    origin: [
@@ -39,8 +46,12 @@ app.use('/chat', chatRouter);
 app.use('/sessions', sessionsRouter);
 app.use('/models', modelsRouter);
 app.use('/projects', projectsRouter);
 app.use('/episodes', episodesRouter);
 app.use('/settings', settingsRouter);
 app.use('/health/services', healthRouter);
 app.use('/summaries', summariesRouter)
 /******* Start the server ************/
 app.listen(PORT, () => {
-    console.log(`Orchestration Service is running on port ${PORT}`);
+    logger.info(`Orchestration Service is running on port ${PORT}`);
 });
--- a/packages/orchestration-service/src/routes/chat.js
+++ b/packages/orchestration-service/src/routes/chat.js
@@ -1,6 +1,8 @@
 const { Router } = require('express')
 const { chat, chatStream } = require('../chat/index');
 const memory = require('../services/memory')
 const { logger } = require('@nexusai/shared');
 const router = Router();
@@ -17,8 +19,8 @@ router.post('/', async (req, res) => {
        });
        res.json(result)
    } catch (err) {
-        console.error(`[orchestration] chat error: `, err.message)
+        logger.error(`[orchestration] chat error: `, err.message)
-        res.status(500).json ({ error: err.message})
+        res.status(500).json ({ error: 'Chat failed', detail: err.message })
    }
 });
--- a/packages/orchestration-service/src/routes/episodes.js
+++ b/packages/orchestration-service/src/routes/episodes.js
@@ -0,0 +1,25 @@
 const { Router } = require('express');
 const memory = require('../services/memory');
 const router = Router();
 router.get('/', async (req, res) => {
  const { limit, offset, sessionId, q } = req.query;
  try {
    const result = await memory.getEpisodes({ limit, offset, sessionId, q });
    res.json(result);
  } catch (err) {
    res.status(500).json({ error: 'Failed to fetch episodes', detail: err.message });
  }
 });
 router.delete('/:id', async (req, res) => {
  try {
    await memory.deleteEpisode(req.params.id);
    res.status(204).send();
  } catch (err) {
    res.status(500).json({ error: 'Failed to delete episode', detail: err.message });
  }
 });
 module.exports = router;
--- a/packages/orchestration-service/src/routes/health.js
+++ b/packages/orchestration-service/src/routes/health.js
@@ -0,0 +1,30 @@
 const { Router } = require('express');
 const fetch = require('node-fetch');
 const { getEnv, SERVICES, PORTS } = require('@nexusai/shared');
 const router = Router();
 const SERVICES_MAP = [
  { key: 'inference',     label: 'Inference',     url: `${getEnv('INFERENCE_SERVICE_URL', SERVICES.INFERENCE_URL)}/health` },
  { key: 'memory',        label: 'Memory',        url: `${getEnv('MEMORY_SERVICE_URL',    SERVICES.MEMORY_URL)}/health` },
  { key: 'embedding',     label: 'Embedding',     url: `${getEnv('EMBEDDING_SERVICE_URL', SERVICES.EMBEDDING_URL)}/health` },
  { key: 'orchestration', label: 'Orchestration', url: `http://localhost:${getEnv('PORT', PORTS.ORCHESTRATION)}/health` },
 ];
 router.get('/', async (req, res) => {
  const results = await Promise.all(
    SERVICES_MAP.map(async ({ key, label, url }) => {
      const start = Date.now();
      try {
        const r = await fetch(url, { signal: AbortSignal.timeout(3000) });
        const data = await r.json();
        return { key, label, status: 'healthy', latency: Date.now() - start, detail: data };
      } catch (err) {
        return { key, label, status: 'unreachable', latency: Date.now() - start, detail: null };
      }
    })
  );
  res.json(results);
 });
 module.exports = router;
--- a/packages/orchestration-service/src/routes/models.js
+++ b/packages/orchestration-service/src/routes/models.js
@@ -1,21 +1,70 @@
 // routes/models.js
 const express = require('express');
 const router = express.Router();
 const fs = require('fs');
 const path = require('path');
-const {getEnv} = require('@nexusai/shared');
+const appSettings = require('../config/settings');
-const MODELS_PATH = getEnv('MODELS_MANIFEST_PATH', path.join(__dirname, '../models.json'));
+const { getEnv, LLAMACPP, logger } = require('@nexusai/shared');
 const LLAMA_URL = getEnv('LLAMA_SERVER_URL', LLAMACPP.DEFAULT_URL);
 router.get('/', (req, res) => {
  const { modelsFolderPath } = appSettings.load();
  try {
-    const raw = fs.readFileSync(MODELS_PATH, 'utf8');
+    // Try scanning folder for .gguf files
-    const models = JSON.parse(raw);
+    const files = fs.readdirSync(modelsFolderPath)
      .filter(f => f.endsWith('.gguf'));
    // Try loading models.json for richer metadata (label, description)
    let manifest = {};
    try {
      const manifestPath = path.join(modelsFolderPath, 'models.json');
      const raw = fs.readFileSync(manifestPath, 'utf8');
      // Index manifest by filename for quick lookup
      const list = JSON.parse(raw);
      for (const m of list) {
        manifest[m.value] = m;
      }
    } catch {
      // No manifest — scan only, that's fine
    }
    const models = files.map(filename => ({
      value:       filename,
      label:       manifest[filename]?.label ?? filename.replace('.gguf', ''),
      description: manifest[filename]?.description ?? null,
      size:        getFileSizeMB(path.join(modelsFolderPath, filename)),
    }));
    res.json(models);
  } catch (err) {
-    console.error('[models] Failed to read manifest:', err.message);
+    logger.error('[models] Failed to scan folder:', err.message);
-    res.status(500).json({ error: 'Could not load models manifest' });
+    res.status(500).json({ error: `Could not read models folder: ${modelsFolderPath}` });
  }
 });
 router.get('/props', async (req, res) => {
  try {
    const response = await fetch(`${LLAMA_URL}/props`);
    if (!response.ok) throw new Error(`llama-server error: ${response.status}`);
    const data = await response.json();
    res.json({
      contextWindow: data.default_generation_settings?.n_ctx ?? null,
      modelAlias: data.model_alias,
    });
  } catch (err) {
    logger.error('[models/props]', err.message);
    res.status(503).json({ error: 'Could not reach llama-server' });
  }
 });
 function getFileSizeMB(filepath) {
  try {
    const bytes = fs.statSync(filepath).size;
    return (bytes / (1024 ** 3)).toFixed(1) + ' GB'; // models are big, so this reports GB despite the function name
  } catch {
    return null;
  }
 }
 module.exports = router;
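
Review note: the new `GET /models` handler scans the folder for `.gguf` files and overlays optional metadata from `models.json`. The merge step can be sketched as a pure function, which makes it easier to reason about without touching the filesystem. `mergeModels` is a hypothetical name, and the sketch omits the `size` field because that needs `fs.statSync`.

```javascript
// Hypothetical pure version of the scan + manifest merge (no filesystem access).
function mergeModels(files, manifestList = []) {
  // Index manifest entries by filename for quick lookup.
  const byFilename = new Map(manifestList.map(m => [m.value, m]));
  return files
    .filter(f => f.endsWith('.gguf'))
    .map(filename => ({
      value: filename,
      // Fall back to the filename without its extension when no label is given.
      label: byFilename.get(filename)?.label ?? filename.replace(/\.gguf$/, ''),
      description: byFilename.get(filename)?.description ?? null,
    }));
}

const models = mergeModels(
  ['a.gguf', 'notes.txt', 'b.gguf'],
  [{ value: 'a.gguf', label: 'Model A', description: 'test entry' }]
);
```

Files without a manifest entry still appear in the result, which matches the "scan only" fallback in the handler.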
--- a/packages/orchestration-service/src/routes/projects.js
+++ b/packages/orchestration-service/src/routes/projects.js
@@ -7,17 +7,17 @@ router.get('/', async (req, res) => {
    try {
        res.json(await memory.getProjects());
    } catch (err) {
-        res.status(500).json({ error: err.message });
+        res.status(500).json({ error: 'Failed to fetch projects', detail: err.message });
    }
 });
 router.post('/', async (req, res) => {
-    const { name, description, colour, icon } = req.body;
+    const { name, description, colour, icon, isolated } = req.body;
    if (!name?.trim()) return res.status(400).json({ error: 'name is required' });
    try {
-        res.status(201).json(await memory.createProject({ name: name.trim(), description, colour, icon }));
+        res.status(201).json(await memory.createProject({ name: name.trim(), description, colour, icon, isolated }));
    } catch (err) {
-        res.status(500).json({ error: err.message });
+        res.status(500).json({ error: 'Failed to create project', detail: err.message });
    }
 });
@@ -25,7 +25,7 @@ router.patch('/:id', async (req, res) => {
  try {
    res.json(await memory.updateProject(req.params.id, req.body));
  } catch (err) {
-        res.status(500).json({ error: err.message });
+    res.status(500).json({ error: 'Failed to update project', detail: err.message });
  }
 });
@@ -34,7 +34,7 @@ router.delete('/:id', async (req, res) => {
        await memory.deleteProject(req.params.id);
        res.status(204).send();
    } catch (err) {
-        res.status(500).json({ error: err.message });
+        res.status(500).json({ error: 'Failed to delete project', detail: err.message });
    }
 });
--- a/packages/orchestration-service/src/routes/sessions.js
+++ b/packages/orchestration-service/src/routes/sessions.js
@@ -15,29 +15,37 @@ router.get('/:sessionId/history', async (req, res) => {
    const history = await memory.getSessionHistory(session.id, Number(limit), Number(offset));
    res.json({ sessionId, episodes: history });
  } catch (err) {
-    res.status(500).json({ error: err.message });
+    res.status(500).json({ error: 'Failed to fetch session history', detail: err.message });
  }
 });
 router.get('/', async (req, res) => {
-    const { limit = EPISODIC.DEFAULT_PAGE_SIZE, offset = EPISODIC.DEFAULT_OFFSET } = req.query;
+  const { limit = EPISODIC.DEFAULT_PAGE_SIZE, offset = EPISODIC.DEFAULT_OFFSET, projectId } = req.query;
  const parsedProjectId = projectId && projectId !== 'null' ? projectId : null;
  try {
-        const sessions = await memory.getSessions(Number(limit), Number(offset));
+    const sessions = await memory.getSessions(Number(limit), Number(offset), parsedProjectId);
    res.json(sessions);
  } catch (err) {
-        res.status(500).json({error: err.message});
+    res.status(500).json({ error: 'Failed to fetch sessions', detail: err.message });
  }
-})
+});
 router.patch('/:sessionId', async (req, res) => {
-    const { name } = req.body;
+  const { name, projectId } = req.body;
-    if (!name?.trim()) return res.status(400).json({ error: 'name is required' });
+  
  // Allow patch with just projectId, or just name, or both
  if (!name?.trim() && projectId === undefined) {
    return res.status(400).json({ error: 'name or projectId is required' });
  }
  try {
-        const session = await memory.updateSession(req.params.sessionId, { name: name.trim() });
+    const session = await memory.updateSession(req.params.sessionId, { 
      name: name?.trim() || undefined, 
      projectId 
    });
    res.json(session);
  } catch (err) {
-        res.status(500).json({ error: err.message });
+    res.status(500).json({ error: 'Failed to update session', detail: err.message });
  }
 });
@@ -46,7 +54,7 @@ router.delete('/:sessionId', async (req, res) => {
        await memory.deleteSession(req.params.sessionId);
        res.status(204).send();
    } catch (err) {
-        res.status(500).json({ error: err.message });
+        res.status(500).json({ error: 'Failed to delete session', detail: err.message });
    }
 });
--- a/packages/orchestration-service/src/routes/settings.js
+++ b/packages/orchestration-service/src/routes/settings.js
@@ -0,0 +1,121 @@
 const { Router } = require('express');
 const settings = require('../config/settings');
 const fs = require('fs');
 const router = Router();
 router.get('/', (req, res) => {
  res.json(settings.load());
 });
 router.patch('/', (req, res) => {
  const { recentEpisodeLimit, semanticLimit, scoreThreshold } = req.body;
  const updates = {};
  if (recentEpisodeLimit !== undefined) {
    const val = Number(recentEpisodeLimit);
    if (!Number.isInteger(val) || val < 1 || val > 20)
      return res.status(400).json({ error: 'recentEpisodeLimit must be 1–20' });
    updates.recentEpisodeLimit = val;
  }
  if (semanticLimit !== undefined) {
    const val = Number(semanticLimit);
    if (!Number.isInteger(val) || val < 1 || val > 20)
      return res.status(400).json({ error: 'semanticLimit must be 1–20' });
    updates.semanticLimit = val;
  }
  if (scoreThreshold !== undefined) {
    const val = Number(scoreThreshold);
    if (isNaN(val) || val < 0 || val > 1)
      return res.status(400).json({ error: 'scoreThreshold must be 0–1' });
    updates.scoreThreshold = val;
  }
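  if (req.body.modelsFolderPath !== undefined) {
    // Guard against non-string input so .trim() cannot throw and surface as a 500
    if (typeof req.body.modelsFolderPath !== 'string')
      return res.status(400).json({ error: 'modelsFolderPath must be a string' });
    const val = req.body.modelsFolderPath.trim();
    if (!val) return res.status(400).json({ error: 'modelsFolderPath cannot be empty' });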
    // Verify the path exists and is readable
    try {
        fs.readdirSync(val);
    } catch {
        return res.status(400).json({ error: `Path not accessible: ${val}` });
    }
    updates.modelsFolderPath = val;
  }
  if (req.body.temperature !== undefined) {
    const val = Number(req.body.temperature);
    if (isNaN(val) || val < 0 || val > 2)
        return res.status(400).json({ error: 'temperature must be 0–2' });
    updates.temperature = val;
  }
  if (req.body.repeatPenalty !== undefined) {
    const val = Number(req.body.repeatPenalty);
    if (isNaN(val) || val < 1 || val > 2)
      return res.status(400).json({ error: 'repeatPenalty must be 1–2' });
    updates.repeatPenalty = val;
  }
  if (req.body.topP !== undefined) {
    const val = Number(req.body.topP);
    if (isNaN(val) || val < 0 || val > 1)
      return res.status(400).json({ error: 'topP must be 0–1' });
    updates.topP = val;
  }
  if (req.body.topK !== undefined) {
    const val = Number(req.body.topK);
    if (!Number.isInteger(val) || val < 1 || val > 100)
      return res.status(400).json({ error: 'topK must be 1–100' });
    updates.topK = val;
  }
  if (req.body.systemPrompt !== undefined) {
    const val = req.body.systemPrompt;
    if (typeof val !== 'string')
      return res.status(400).json({ error: 'systemPrompt must be a string' });
    updates.systemPrompt = val.trim() || null; // null reverts to default
  }
  if (req.body.semanticWeight !== undefined) {
    const val = Number(req.body.semanticWeight);
    if (isNaN(val) || val < 0 || val > 5)
      return res.status(400).json({ error: 'semanticWeight must be 0–5' });
    updates.semanticWeight = val;
  }
  if (req.body.keywordWeight !== undefined) {
    const val = Number(req.body.keywordWeight);
    if (isNaN(val) || val < 0 || val > 5)
      return res.status(400).json({ error: 'keywordWeight must be 0–5' });
    updates.keywordWeight = val;
  }
  if (req.body.contextBudget !== undefined) {
    const val = Number(req.body.contextBudget);
    if (!Number.isInteger(val) || val < 512 || val > 32768)
        return res.status(400).json({ error: 'contextBudget must be 512–32768' });
    updates.contextBudget = val;
  }
  if (req.body.entityWeight !== undefined) {
      const val = Number(req.body.entityWeight);
      if (isNaN(val) || val < 0 || val > 2)
          return res.status(400).json({ error: 'entityWeight must be 0–2' });
      updates.entityWeight = val;
  }
  if (req.body.minRecentEpisodes !== undefined) {
      const val = Number(req.body.minRecentEpisodes);
      if (!Number.isInteger(val) || val < 0 || val > 10)
          return res.status(400).json({ error: 'minRecentEpisodes must be 0–10' });
      updates.minRecentEpisodes = val;
  }
  res.json(settings.save(updates));
 });
 module.exports = router;
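
Review note: the `PATCH /settings` handler repeats the same range check for each numeric field. One possible alternative is a rule table, sketched below for a subset of the fields. `RULES` and `validateNumeric` are hypothetical names, and the bounds are copied from the checks in this diff.

```javascript
// Hypothetical rule table; bounds mirror a few of the checks in the route.
const RULES = {
  recentEpisodeLimit: { integer: true, min: 1, max: 20 },
  scoreThreshold: { min: 0, max: 1 },
  topK: { integer: true, min: 1, max: 100 },
  contextBudget: { integer: true, min: 512, max: 32768 },
};

// Returns { updates } on success or { error } on the first invalid field.
function validateNumeric(body) {
  const updates = {};
  for (const [key, rule] of Object.entries(RULES)) {
    if (body[key] === undefined) continue;
    const val = Number(body[key]);
    const invalid =
      Number.isNaN(val) ||
      (rule.integer && !Number.isInteger(val)) ||
      val < rule.min ||
      val > rule.max;
    if (invalid) return { error: `${key} must be ${rule.min}–${rule.max}` };
    updates[key] = val;
  }
  return { updates };
}

const ok = validateNumeric({ topK: '40', scoreThreshold: 0.5 });
const tooLow = validateNumeric({ topK: 0 });
const notANumber = validateNumeric({ scoreThreshold: 'abc' });
```

In the route, an `error` result would map to a 400 response and `updates` would be passed to `settings.save`. String fields such as `modelsFolderPath` and `systemPrompt` would still need their own handling.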
--- a/packages/orchestration-service/src/routes/summaries.js
+++ b/packages/orchestration-service/src/routes/summaries.js
@@ -0,0 +1,48 @@
 const { Router } = require('express');
 const memory = require('../services/memory');
 const router = Router();
 // Trigger on-demand project summary generation
 router.post('/project/:projectId/generate', async (req, res) => {
    try {
        const summary = await memory.generateProjectSummary(req.params.projectId);
        res.status(201).json(summary);
    } catch (err) {
        // Pass through 422 from memory-service ("no session summaries yet")
        const status = err.message.includes('422') ? 422 : 500;
        res.status(status).json({ error: err.message });
    }
 });
 // Get current project overview summary
 router.get('/project/:projectId/overview', async (req, res) => {
    try {
        const summary = await memory.getProjectOverviewSummary(req.params.projectId);
        res.json(summary);
    } catch (err) {
        res.status(500).json({ error: 'Failed to fetch project overview summary', detail: err.message });
    }
 });
 router.get('/session/:sessionId', async (req, res) => {
    try {
        const session = await memory.getSessionByExternalId(req.params.sessionId);
        if (!session) return res.status(404).json({ error: 'Session not found' });
        const summaries = await memory.getSummariesBySession(session.id);
        res.json(summaries);
    } catch (err) {
        res.status(500).json({ error: 'Failed to fetch session summaries', detail: err.message });
    }
 });
 router.get('/project/:projectId', async (req, res) => {
    try {
        const summaries = await memory.getSummariesByProject(req.params.projectId);
        res.json(summaries);
    } catch (err) {
        res.status(500).json({ error: 'Failed to fetch project summaries', detail: err.message });
    }
 });
 module.exports = router;
--- a/packages/orchestration-service/src/services/graph.js
+++ b/packages/orchestration-service/src/services/graph.js
@@ -0,0 +1,15 @@
 const { getEnv, SERVICES } = require('@nexusai/shared');
 const MEMORY_URL = getEnv('MEMORY_SERVICE_URL', SERVICES.MEMORY_URL);
 async function getNeighbors(entityIds) {
    const res = await fetch(`${MEMORY_URL}/graph/neighbors`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ entityIds }),
    });
    if (!res.ok) throw new Error(`Graph neighbors error: ${res.status}`);
    return res.json();
 }
 module.exports = { getNeighbors };
--- a/packages/orchestration-service/src/services/memory.js
+++ b/packages/orchestration-service/src/services/memory.js
@@ -29,11 +29,11 @@ async function getRecentEpisodes(sessionId, limit = EPISODIC.DEFAULT_SESSIONS_LI
    return res.json();
 }
-async function createEpisode(sessionId, userMessage, aiResponse, tokenCount) {
+async function createEpisode(sessionId, userMessage, aiResponse, tokenCount, projectId=null) {
    const res = await fetch(`${BASE_URL}/episodes`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ sessionId, userMessage, aiResponse, tokenCount })
+        body: JSON.stringify({ sessionId, userMessage, aiResponse, tokenCount, projectId })
    });
    if (!res.ok) throw new Error(`Failed to create episode: ${res.status} ${res.statusText}`);
    return res.json();
@@ -55,19 +55,22 @@ async function getSessionHistory(sessionId, limit = EPISODIC.DEFAULT_SESSIONS_LI
    return res.json();
 }
-async function getSessions(limit = EPISODIC.DEFAULT_SESSIONS_LIMIT, offset = EPISODIC.DEFAULT_OFFSET) {
+async function getSessions(limit = EPISODIC.DEFAULT_SESSIONS_LIMIT, offset = EPISODIC.DEFAULT_OFFSET, projectId = null) {
-    const res = await fetch(
+  const url = new URL(`${BASE_URL}/sessions`);
-        `${BASE_URL}/sessions?limit=${limit}&offset=${offset}`
+  url.searchParams.set('limit', limit);
-    );
+  url.searchParams.set('offset', offset);
  if (projectId) url.searchParams.set('projectId', projectId);
  const res = await fetch(url.toString());
  if (!res.ok) throw new Error(`Failed to fetch sessions: ${res.status}`);
  return res.json();
 }
-async function updateSession(externalId, { name }) {
+async function updateSession(externalId, { name, projectId }) {
    const res = await fetch(`${BASE_URL}/sessions/by-external/${externalId}`, {
        method: 'PATCH',
        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ name }),
+        body: JSON.stringify({ name, projectId }),
    });
    if (!res.ok) throw new Error(`Failed to update session: ${res.status}`);
    return res.json();
@@ -97,11 +100,11 @@ async function getProjects() {
    return res.json();
 }
-async function updateProject(id, { name, description, colour, icon }) {
+async function updateProject(id, fields = {}) {
    const res = await fetch(`${BASE_URL}/projects/${id}`, {
        method: 'PATCH',
        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ name, description, colour, icon })
+        body: JSON.stringify(fields)
    });
    if (!res.ok) throw new Error(`Failed to update project: ${res.status}`);
    return res.json();
@@ -112,6 +115,107 @@ async function deleteProject(id) {
    if (!res.ok) throw new Error(`Failed to delete project: ${res.status}`);
 }
 async function getProjectSessions(projectId) {
  const url = new URL(`${BASE_URL}/sessions`);
  url.searchParams.set('limit', 200);  // generous upper bound
  url.searchParams.set('offset', 0);
  url.searchParams.set('projectId', projectId);
  const res = await fetch(url.toString());
  if (!res.ok) throw new Error(`Failed to fetch project sessions: ${res.status}`);
  return res.json(); // returns array of session objects
 }
 async function getProject(id) {
  const res = await fetch(`${BASE_URL}/projects/${id}`);
  if (res.status === 404) return null;
  if (!res.ok) throw new Error(`Failed to fetch project: ${res.status}`);
  return res.json();
 }
 async function getEpisodes({ limit = 50, offset = 0, sessionId, q } = {}) {
  const url = new URL(`${BASE_URL}/episodes`);
  url.searchParams.set('limit', limit);
  url.searchParams.set('offset', offset);
  if (sessionId) url.searchParams.set('sessionId', sessionId);
  if (q) url.searchParams.set('q', q);
  const res = await fetch(url.toString());
  if (!res.ok) throw new Error(`Failed to fetch episodes: ${res.status}`);
  return res.json();
 }
 async function deleteEpisode(id) {
  const res = await fetch(`${BASE_URL}/episodes/${id}`, { method: 'DELETE' });
  if (!res.ok) throw new Error(`Failed to delete episode: ${res.status}`);
 }
 async function getSummariesBySession(sessionId) {
    const res = await fetch(`${BASE_URL}/sessions/${sessionId}/summaries`);
    if (!res.ok) throw new Error(`Failed to fetch summaries: ${res.status}`);
    return res.json();
 }
 async function createSummary({ sessionId, projectId, content, tokenCount, episodeRange }) {
    const res = await fetch(`${BASE_URL}/summaries`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ sessionId, projectId, content, tokenCount, episodeRange }),
    });
    if (!res.ok) throw new Error(`Failed to create summary: ${res.status}`);
    return res.json();
 }
 async function updateSummary(id, { content, tokenCount, episodeRange }) {
    const res = await fetch(`${BASE_URL}/summaries/${id}`, {
        method: 'PATCH',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ content, tokenCount, episodeRange }),
    });
    if (!res.ok) throw new Error(`Failed to update summary: ${res.status}`);
    return res.json();
 }
 async function getSummariesByProject(projectId) {
    const res = await fetch(`${BASE_URL}/projects/${projectId}/summaries`);
    if (!res.ok) throw new Error(`Failed to fetch summaries: ${res.status}`);
    return res.json();
 }
 async function generateProjectSummary(projectId) {
    const res = await fetch(`${BASE_URL}/projects/${projectId}/summarize`, {
        method: 'POST',
    });
    if (!res.ok) throw new Error(`Failed to generate project summary: ${res.status}`);
    return res.json();
 }
 async function getProjectOverviewSummary(projectId) {
    const res = await fetch(`${BASE_URL}/projects/${projectId}/overview`);
    if (!res.ok) throw new Error(`Failed to fetch project overview: ${res.status}`);
    return res.json(); // null if none exists yet
 }
 async function searchEpisodes(query, { limit = 10, sessionIds = null } = {}) {
    const url = new URL(`${BASE_URL}/episodes/search`);
    url.searchParams.set('q', query);
    url.searchParams.set('limit', limit);
    if (sessionIds?.length) url.searchParams.set('sessionIds', sessionIds.join(','));
    const res = await fetch(url.toString());
    if (!res.ok) throw new Error(`FTS search error: ${res.status}`);
    return res.json();
 }
 async function getEpisodesByEntities(entityIds) {
    const res = await fetch(`${BASE_URL}/episodes/by-entities`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ entityIds }),
    });
    if (!res.ok) throw new Error(`Episodes-by-entities error: ${res.status}`);
    return res.json(); // { episodeIds: [...] }
 }
 module.exports = {
    getSessionByExternalId,
    createSession,
@@ -126,4 +230,16 @@ module.exports = {
    getProjects,
    updateProject,
    deleteProject,
    getProjectSessions,
    getProject,
    getEpisodes,
    deleteEpisode,
    getSummariesBySession,
    createSummary,
    updateSummary,
    getSummariesByProject,
    generateProjectSummary,
    getProjectOverviewSummary,
    searchEpisodes,
    getEpisodesByEntities,
 }
--- a/packages/orchestration-service/src/services/qdrant.js
+++ b/packages/orchestration-service/src/services/qdrant.js
@@ -2,13 +2,18 @@ const {getEnv, QDRANT, COLLECTIONS, ORCHESTRATION } = require('@nexusai/shared')
 const BASE_URL = getEnv('QDRANT_URL', QDRANT.DEFAULT_URL);
-async function searchEpisodes( vector, {limit = ORCHESTRATION.RECENT_EPISODE_LIMIT, scoreThreshold = ORCHESTRATION.SCORE_THRESHOLD, sessionId } = {}) {
+async function searchEpisodes( vector, {limit = ORCHESTRATION.RECENT_EPISODE_LIMIT, scoreThreshold = ORCHESTRATION.SCORE_THRESHOLD, sessionId, projectSessionIds } = {}) {
    const body = {vector, limit, score_threshold: scoreThreshold, with_payload: true};
-    if (sessionId) {
+    if(projectSessionIds) {
        body.filter = {
            should: projectSessionIds.map(id => ({
                key: 'sessionId', match: { value: id }
            }))
        };
    } else if (sessionId) {
        body.filter = { must: [{key: 'sessionId', match: {value: sessionId} }] };
    }
    const res = await fetch (
        `${BASE_URL}/collections/${COLLECTIONS.EPISODES}/points/search`,
        {
@@ -24,4 +29,30 @@ async function searchEpisodes( vector, {limit = ORCHESTRATION.RECENT_EPISODE_LIM
    return data.result;
 }
-module.exports = { searchEpisodes };
+async function searchEntities(vector, { limit = ORCHESTRATION.ENTITIES_LIMIT, scoreThreshold = ORCHESTRATION.ENTITIES_THRESHOLD, projectId = undefined } = {}) {
    const body = { vector, limit, score_threshold: scoreThreshold, with_payload: true };
    if (projectId !== null && projectId !== undefined) {
        body.filter = {
            must: [{ key: 'projectId', match: { value: projectId } }]
        };
    }
    const res = await fetch(
        `${BASE_URL}/collections/${COLLECTIONS.ENTITIES}/points/search`,
        {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify(body),
        }
    );
    if (!res.ok) {
        const body = await res.text();
        throw new Error(`Qdrant error: ${res.status} - ${body}`);
    }
    const data = await res.json();
    return data.result;
 }
 module.exports = { searchEpisodes, searchEntities };
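
Review note: `searchEpisodes` now builds a `should` filter (match any session in the project) when `projectSessionIds` is given, and otherwise a `must` filter on a single `sessionId`. The filter construction can be sketched on its own as below. `buildSessionFilter` is a hypothetical name, and the sketch assumes an empty `projectSessionIds` array should fall through to the single-session case, which differs from the truthiness check in the diff.

```javascript
// Hypothetical helper isolating the Qdrant filter construction.
function buildSessionFilter({ sessionId, projectSessionIds } = {}) {
  // Assumption: an empty array is treated as "no project scope".
  if (projectSessionIds?.length) {
    return {
      should: projectSessionIds.map(id => ({
        key: 'sessionId',
        match: { value: id },
      })),
    };
  }
  if (sessionId) {
    return { must: [{ key: 'sessionId', match: { value: sessionId } }] };
  }
  return undefined; // no filter: search the whole collection
}

const projectFilter = buildSessionFilter({ sessionId: 's1', projectSessionIds: ['s1', 's2'] });
const sessionFilter = buildSessionFilter({ sessionId: 's1', projectSessionIds: [] });
const noFilter = buildSessionFilter();
```

Whether an empty project should search nothing, the current session, or everything is a product decision. The sketch only makes that choice explicit.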
--- a/packages/orchestration-service/src/services/summarization.js
+++ b/packages/orchestration-service/src/services/summarization.js
@@ -0,0 +1,151 @@
 const { getEnv, SERVICES, SUMMARIES, logger } = require('@nexusai/shared');
 const EXTRACTION_URL  = getEnv('EXTRACTION_URL', 'http://localhost:11434');
 const EXTRACTION_MODEL = getEnv('EXTRACTION_MODEL', 'qwen2.5:3b');
 const MEMORY_URL      = getEnv('MEMORY_SERVICE_URL', SERVICES.MEMORY_URL);
 const THRESHOLD_TOKENS  = parseInt(getEnv('SUMMARY_THRESHOLD_TOKENS', SUMMARIES.THRESHOLD_TOKENS), 10);
 const MAX_SUMMARY_TOKENS = parseInt(getEnv('SUMMARY_MAX_TOKENS', SUMMARIES.MAX_SUMMARY_TOKENS), 10);
 const MIN_EPISODES_SINCE = parseInt(getEnv('SUMMARY_MIN_EPISODES', SUMMARIES.MIN_EPISODES_SINCE), 10);
 function buildSummaryPrompt(episodes, existingSummary = null) {
    const MAX_CHARS = 3000;
    let context = episodes
        .map(ep => `User: ${ep.user_message}\nAssistant: ${ep.ai_response}`)
        .join('\n\n');
    if (context.length > MAX_CHARS) {
        context = context.slice(-MAX_CHARS);
    }
    const instruction = existingSummary
        ? `Update the summary below to incorporate the new exchanges.
 Write 3-5 sentences in third person. Do not quote directly — paraphrase only.
 Do not include greetings, sign-offs, or filler. Output only the updated summary text.
 Previous summary:
 ${existingSummary}
 New exchanges:
 ${context}`
        : `Summarize the conversation below in 3-5 sentences.
 Write in third person. Do not quote directly — paraphrase only.
 Do not include greetings, sign-offs, or filler. Output only the summary text.
 Conversation:
 ${context}`;
    return [
        '<|im_start|>user',   // ChatML for qwen2.5
        instruction,
        '<|im_end|>',
        '<|im_start|>assistant',
    ].join('\n');
 }
 async function generateSummary(episodes, existingSummary = null) {
    const prompt = buildSummaryPrompt(episodes, existingSummary);
    const res = await fetch(`${EXTRACTION_URL}/api/generate`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
            model: EXTRACTION_MODEL,
            prompt,
            stream: false,
            options: {
                temperature: 0.2,   // slightly higher than entities — summaries benefit from some fluency
                num_predict: 500,   // generous but bounded — keeps summaries from running long
            },
        }),
    });
    if (!res.ok) throw new Error(`Ollama responded ${res.status}`);
    const data = await res.json();
    const raw = data.response?.trim() ?? '';
    // Strip any leaked ChatML tokens Qwen echoes back
    const content = raw
        .replace(/<\|im_start\|>.*?<\|im_end\|>/gs, '')
        .replace(/<\|im_start\|>|<\|im_end\|>|<\|im_sep\|>/g, '')
        .trim();
    return content;
 }
 async function maybeSummarize(session, allEpisodes) {
    // 1. Sum total tokens for this session
    const totalTokens = allEpisodes.reduce((sum, ep) => sum + (ep.token_count || 0), 0);
    if (totalTokens < THRESHOLD_TOKENS) return; // under threshold — nothing to do
    // 2. Fetch existing summaries for session
    const summariesRes = await fetch(`${MEMORY_URL}/sessions/${session.id}/summaries`);
    if (!summariesRes.ok) return;
    const summaries = await summariesRes.json();
    const latest = summaries.at(-1) ?? null;
    const lastCoveredId = latest
        ? parseInt(latest.episode_range?.split('-').at(-1), 10) || 0
        : 0;
    // 3. Guard — don't re-summarize until MIN_EPISODES_SINCE new episodes have accumulated
    if (latest) {
        const newEpisodes = allEpisodes.filter(ep => ep.id > lastCoveredId);
        if (newEpisodes.length < MIN_EPISODES_SINCE) return;
    }
    // 4. Determine episodes to summarize
    const episodesToSummarize = latest
        ? allEpisodes.filter(ep => ep.id > lastCoveredId)
        : allEpisodes;
    // 5. Determine episode range from the episodes actually being summarized
    const summarizedIds = episodesToSummarize.map(ep => ep.id).sort((a,b) => a - b);
    const episodeRange = `${summarizedIds.at(0)}-${summarizedIds.at(-1)}`;
    const totalEpisodeTokens = totalTokens; // same session total already computed in step 1
    // Temporary debug logging ahead of the generateSummary call
    logger.debug('[summarization] episodes to summarize:', episodesToSummarize.length);
    const content = await generateSummary(
        episodesToSummarize,
        latest && latest.content.length < MAX_SUMMARY_TOKENS ? latest.content : null
        // If the existing summary is already large, start fresh rather than appending to a huge blob.
        // Note: content.length counts characters, so it is only a rough proxy for the token limit.
    );
    if (!content) return;
    // 6. Create new row or update existing
    if (!latest || latest.content.length >= MAX_SUMMARY_TOKENS) {
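        const createRes = await fetch(`${MEMORY_URL}/summaries`, {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({
                sessionId: session.id,
                content,
                tokenCount: totalEpisodeTokens,
                episodeRange,
            }),
        });
        // Surface write failures instead of logging success unconditionally
        if (!createRes.ok) throw new Error(`Memory service responded ${createRes.status} while creating summary`);
        logger.debug(`[summarization] Created new summary for session ${session.id}`);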
    } else {
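        const updateRes = await fetch(`${MEMORY_URL}/summaries/${latest.id}`, {
            method: 'PATCH',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({
                content,
                tokenCount: totalEpisodeTokens,
                episodeRange,
            }),
        });
        // Surface write failures instead of logging success unconditionally
        if (!updateRes.ok) throw new Error(`Memory service responded ${updateRes.status} while updating summary ${latest.id}`);
        logger.debug(`[summarization] Updated summary ${latest.id} for session ${session.id}`);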
    }
 }
 async function triggerSummary(session, allEpisodes) {
    // Intentionally fire-and-forget — caller doesn't await this
    maybeSummarize(session, allEpisodes).catch(err =>
        logger.warn('[summarization] Summary failed (non-critical):', err.message)
    );
 }
 module.exports = { triggerSummary };
--- a/packages/shared/src/config/constants.js
+++ b/packages/shared/src/config/constants.js
@@ -24,7 +24,13 @@ const EPISODIC = {
 const ORCHESTRATION = {
    RECENT_EPISODE_LIMIT:   5,
    SEMANTIC_LIMIT:         5,
-    SCORE_THRESHOLD:        0.75,
+    SCORE_THRESHOLD:        0.5,
    ENTITIES_LIMIT:         5,
    ENTITIES_THRESHOLD:     0.55,
    TEMPERATURE:            0.7,
    CONTEXT_BUDGET:         4096,
    ENTITY_WEIGHT:          0.5,
    MIN_RECENT_EPISODES:    2,
    CORS_ORIGIN:            'http://localhost:5173',
    SYSTEM_PROMPT:          `You are a helpful, context-aware AI assistant. You have access to memories of past conversations with the user. Use them to provide consistent, personalised responses.`
 }
@@ -37,7 +43,7 @@ const OLLAMA = {
 const LLAMACPP = {
    DEFAULT_URL:    'http://localhost:8080',
-    DEFAULT_MODEL:  'gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf',
+    DEFAULT_MODEL:  'qwen/qwen3.6-35b-a3b',
 }
 const PORTS = {
@@ -66,6 +72,39 @@ const SQLITE = {
    DEFAULT_PATH: './data/nexusai.db'
 }
 const SUMMARIES = {
    THRESHOLD_TOKENS:   200,    // trigger a summary when the session reaches this many tokens
    MAX_SUMMARY_TOKENS: 800,    // if the existing summary exceeds this, create a new one instead of updating
    MIN_EPISODES_SINCE: 5,      // don't re-summarize until N new episodes have accumulated since the last summary
    MAX_SUMMARY_CHARS:  8000,   // max chars of recent episodes to include when generating a summary (controls prompt size)
    MAX_PROJECT_EPISODE_LIMIT: 200, // max episodes to consider across the whole project when generating a summary (controls prompt size)
 }
 const ENTITIES = {
    TEMPERATURE:    0.1,    // Low temperature, more precise extraction, less creative
    NUM_PREDICT:    1500,   // Max tokens the model may generate for the extraction output (num_predict caps output, not input)
    THRESHOLD:      0.55,   // Minimum confidence score for an extracted entity to be included in the results
    PROMOTION_THRESHOLD: 3, // mention_count threshold before entity is considered well-established
    GRAPH_HOP_DEPTH: 1,     // Default traversal depth for neighborhood queries
    TYPES: [
        'person', 
        'place', 
        'project', 
        'technology', 
        'concept', 
        'organization', 
        'character', 
        'event', 
        'topic'
    ],
 }
 const RETRIEVAL = {
    RRF_K:              60,     // Reciprocal Rank Fusion smoothing constant; softens the rank-1 advantage. Not exposed in settings
    SEMANTIC_WEIGHT:    1.0,    // Weight applied to semantic (Qdrant) results
    KEYWORD_WEIGHT:     0,      // Weight applied to keyword (SQLite) results; 0 disables keyword fusion, set >0 to enable and tune the semantic vs keyword balance
 }
 module.exports = {
    QDRANT,
    COLLECTIONS,
@@ -76,5 +115,8 @@ module.exports = {
    LLAMACPP,
    INFERENCE_DEFAULTS,
    SQLITE,
-    ORCHESTRATION
+    ORCHESTRATION,
    SUMMARIES,
    ENTITIES,
    RETRIEVAL,
 };
--- a/packages/shared/src/index.js
+++ b/packages/shared/src/index.js
@@ -1,6 +1,7 @@
 const {getEnv} = require('./config/env');
-const {QDRANT, COLLECTIONS, EPISODIC, SERVICES, OLLAMA, PORTS, LLAMACPP, INFERENCE_DEFAULTS, SQLITE, ORCHESTRATION } = require('./config/constants');
+const {QDRANT, COLLECTIONS, EPISODIC, SERVICES, OLLAMA, PORTS, LLAMACPP, INFERENCE_DEFAULTS, SQLITE, ORCHESTRATION, SUMMARIES, ENTITIES, RETRIEVAL } = require('./config/constants');
 const {parseRow, formatEpisodeText} = require('./utils')
 const logger = require('./utils/logger');
 module.exports = {
    getEnv, 
@@ -16,4 +17,8 @@ module.exports = {
    ORCHESTRATION,
    parseRow,
    formatEpisodeText,
    SUMMARIES,
    ENTITIES,
    logger,
    RETRIEVAL,
 };
--- a/packages/shared/src/utils/logger.js
+++ b/packages/shared/src/utils/logger.js
@@ -0,0 +1,12 @@
 const LEVELS = { error: 0, warn: 1, info: 2, debug: 3 };
 const current = LEVELS[process.env.LOG_LEVEL?.toLowerCase()] ?? LEVELS.info;
 const logger = {
    error: (...args) => current >= LEVELS.error && console.error('[ERROR]', ...args),
    warn:  (...args) => current >= LEVELS.warn  && console.warn( '[WARN]',  ...args),
    info:  (...args) => current >= LEVELS.info  && console.log(  '[INFO]',  ...args),
    debug: (...args) => current >= LEVELS.debug && console.log(  '[DEBUG]', ...args),
 };
 module.exports = logger;
--- a/test-fusion.js
+++ b/test-fusion.js
@@ -0,0 +1,67 @@
 // test-fusion.js
 const { RETRIEVAL } = require('./packages/shared/src/config/constants');
 function fuseEpisodeResults(semanticEps, keywordEps, { semanticWeight, keywordWeight, limit }) {
    const k = RETRIEVAL.RRF_K;
    const scores = new Map();
    semanticEps.forEach((ep, i) => {
        scores.set(ep.id, { episode: ep, score: semanticWeight / (k + i + 1) });
    });
    keywordEps.forEach((ep, i) => {
        const contrib = keywordWeight / (k + i + 1);
        if (scores.has(ep.id)) {
            scores.get(ep.id).score += contrib;
        } else if (contrib > 0) {
            scores.set(ep.id, { episode: ep, score: contrib });
        }
    });
    return [...scores.values()]
        .sort((a, b) => b.score - a.score)
        .slice(0, limit)
        .map(({ episode }) => episode);
 }
 // --- Test 1: episodes in both lists rank highest ---
 const semantic = [
    { id: 1, user_message: 'ep1 — semantic only, rank 1' },
    { id: 2, user_message: 'ep2 — in both lists, rank 2 semantic' },
    { id: 3, user_message: 'ep3 — in both lists, rank 3 semantic' },
 ];
 const keyword = [
    { id: 3, user_message: 'ep3 — rank 1 FTS' },
    { id: 2, user_message: 'ep2 — rank 2 FTS' },
    { id: 4, user_message: 'ep4 — FTS only, rank 3' },
 ];
 const assert = require('assert'); // console.assert only logs on failure; assert throws, so PASS is never printed after a failed check
 const rankOf = (list, id) => list.findIndex(ep => ep.id === id);
 const result = fuseEpisodeResults(semantic, keyword, { semanticWeight: 1, keywordWeight: 1, limit: 5 });
 console.log('Test 1: equal weights, episodes in both lists should rank highest:');
 result.forEach((ep, i) => console.log(`  ${i + 1}. id=${ep.id} "${ep.user_message}"`));
 assert.ok(result[0].id === 2 || result[0].id === 3, 'FAIL: ep2 or ep3 should be rank 1');
 assert.ok(rankOf(result, 1) === -1 || rankOf(result, 1) > rankOf(result, 2), 'FAIL: ep1 (semantic only) should rank below ep2');
 console.log('  PASS\n');
 // --- Test 2: keywordWeight:0 gives pure semantic passthrough ---
 const result2 = fuseEpisodeResults(semantic, keyword, { semanticWeight: 1, keywordWeight: 0, limit: 5 });
 console.log('Test 2: keywordWeight:0 should return only semantic results in original order:');
 result2.forEach((ep, i) => console.log(`  ${i + 1}. id=${ep.id}`));
 assert.strictEqual(result2.length, 3, `FAIL: expected 3, got ${result2.length}`);
 assert.strictEqual(result2[0].id, 1, 'FAIL: ep1 should be rank 1');
 assert.strictEqual(result2[1].id, 2, 'FAIL: ep2 should be rank 2');
 console.log('  PASS\n');
 // --- Test 3: limit is respected ---
 const result3 = fuseEpisodeResults(semantic, keyword, { semanticWeight: 1, keywordWeight: 1, limit: 2 });
 console.log('Test 3: limit:2 should return exactly 2 results:');
 assert.strictEqual(result3.length, 2, `FAIL: expected 2, got ${result3.length}`);
 console.log('  PASS\n');
 // --- Test 4: no overlap, so all unique episodes appear, ordered by individual contribution ---
 const semOnly = [{ id: 10, user_message: 'sem' }];
 const ftsOnly = [{ id: 20, user_message: 'fts' }];
 const result4 = fuseEpisodeResults(semOnly, ftsOnly, { semanticWeight: 1, keywordWeight: 1, limit: 5 });
 console.log('Test 4: no overlap, both should appear:');
 assert.strictEqual(result4.length, 2, `FAIL: expected 2, got ${result4.length}`);
 assert.strictEqual(result4[0].id, 10, 'FAIL: semantic rank-1 should come first (equal weight and rank give equal scores; the sort is stable, so the earlier-inserted semantic entry wins the tie)');
 console.log('  PASS\n');
 console.log('All tests passed.');
Author	SHA1	Message	Date
Storme-bit	e4908193bd	smarter context assembly implementation	2026-04-27 21:41:32 -07:00
Storme-bit	b58a4e4692	minor clean up	2026-04-27 20:17:05 -07:00
Storme-bit	055683424d	retrieval fusion	2026-04-27 07:03:46 -07:00
Storme-bit	27ad614130	retrieval fusion	2026-04-27 05:56:23 -07:00
Storme-bit	8ade5c68ca	retrieval fusion	2026-04-27 05:46:01 -07:00
Storme-bit	49982a85de	retrieval fusion	2026-04-27 05:21:43 -07:00
Storme-bit	9c6c5c9a42	entity extraction prompt	2026-04-27 03:50:13 -07:00
Storme-bit	c9cbac87ac	knowledge graph entity fixes	2026-04-27 03:41:56 -07:00
Storme-bit	1a97b19280	roadmap phase 1 complete	2026-04-27 03:10:39 -07:00
Storme-bit	9fe8e568cf	roadmap phase 1 complete	2026-04-27 00:28:42 -07:00
Storme-bit	5ad01c6ad8	clean up	2026-04-27 00:14:51 -07:00
Storme-bit	aac0923351	Merge branch 'main' of http://192.168.0.205:3100/storme/nexusai	2026-04-27 00:10:16 -07:00
Storme-bit	54218894c0	logger clean up	2026-04-27 00:09:16 -07:00
Storme-bit	66a95f4479	logger clean up	2026-04-27 00:07:51 -07:00
storme	78476e166f	Delete .claude/settings.local.json	2026-04-27 06:57:49 +00:00
Storme-bit	696ead29f8	chat/index.js cleanup	2026-04-26 23:04:31 -07:00
Storme-bit	45db47a584	error response consistency, human readible1	2026-04-26 23:00:55 -07:00
Storme-bit	095c9a623e	error response consistency, human readible1	2026-04-26 23:00:18 -07:00
Storme-bit	f5011fddca	logger updates	2026-04-26 22:29:57 -07:00
Storme-bit	86e78cc4c6	logger updates	2026-04-26 22:28:54 -07:00
Storme-bit	c86b565eed	code cleanup/hardening	2026-04-26 21:59:16 -07:00
Storme-bit	be1c38b654	code cleanup/hardening	2026-04-26 21:57:39 -07:00
Storme-bit	4f3b18de08	code cleanup/hardening	2026-04-26 21:53:33 -07:00
Storme-bit	43fa12899c	NexusAI roadmap addition	2026-04-26 21:14:27 -07:00
Storme-bit	84f01ef209	NexusAI roadmap addition	2026-04-26 21:14:04 -07:00
Storme-bit	a50a748bcf	NexusAI roadmap addition	2026-04-26 21:13:15 -07:00
Storme-bit	32e8a83233	NexusAI roadmap addition	2026-04-26 21:08:19 -07:00
Storme-bit	855de6d0af	project summaries addition	2026-04-26 21:02:42 -07:00
Storme-bit	fcaf0e651f	project summaries addition	2026-04-26 19:11:40 -07:00
Storme-bit	6cdee72af2	project summaries addition	2026-04-26 18:59:28 -07:00
Storme-bit	4c6bd1df2d	project summaries addition	2026-04-26 18:57:25 -07:00
Storme-bit	2429fedb2c	code clean up pass	2026-04-26 18:18:40 -07:00
Storme-bit	bdc5947fcb	code clean up pass	2026-04-26 05:38:47 -07:00
Storme-bit	785047a824	code clean up pass	2026-04-26 05:19:31 -07:00
Storme-bit	acda21317b	documentation updates for entity extraction and summarization	2026-04-21 03:50:38 -07:00
Storme-bit	32365e67f4	summarization fix	2026-04-21 03:05:24 -07:00
Storme-bit	59918d5733	summaries chat client	2026-04-21 02:52:31 -07:00
Storme-bit	01f35b7b82	summaries chat client	2026-04-21 02:42:18 -07:00
Storme-bit	21a7e5f3b5	extraction error logging	2026-04-21 01:07:31 -07:00
Storme-bit	c81a1cb20e	extraction error logging	2026-04-21 00:35:48 -07:00
Storme-bit	781bf8a615	extraction error logging	2026-04-21 00:28:13 -07:00
Storme-bit	b44d35e7cb	extraction error logging	2026-04-21 00:27:28 -07:00
Storme-bit	22686fca3c	extraction error logging	2026-04-21 00:26:41 -07:00
Storme-bit	588e8395f8	extraction error logging	2026-04-21 00:22:39 -07:00
Storme-bit	936b04742e	extraction error logging	2026-04-21 00:22:29 -07:00
Storme-bit	9ab63cca19	extraction error logging	2026-04-21 00:02:13 -07:00
Storme-bit	528318b374	extraction error logging	2026-04-20 23:54:26 -07:00
Storme-bit	43dc800a0a	extraction error logging	2026-04-20 23:50:15 -07:00
Storme-bit	143df71efa	extraction error logging	2026-04-20 23:46:23 -07:00
Storme-bit	72b41056a5	extraction error logging	2026-04-20 23:44:26 -07:00
Storme-bit	5de64ba68e	extraction error logging	2026-04-20 23:40:20 -07:00
Storme-bit	405676edb5	extraction error logging	2026-04-20 23:28:46 -07:00
Storme-bit	980053a0ee	extraction error logging	2026-04-20 23:25:31 -07:00
Storme-bit	3636ef3ff9	extraction error logging	2026-04-20 23:19:01 -07:00
Storme-bit	d2352ea48b	updated extraction for phi3	2026-04-20 23:13:47 -07:00
Storme-bit	af04cef307	session summarization	2026-04-20 23:04:13 -07:00
Storme-bit	17e2fd8f14	session summarization	2026-04-20 22:59:54 -07:00
Storme-bit	c9f3f5bc79	session summarization	2026-04-20 22:39:26 -07:00
Storme-bit	2fc372815f	fixed summary creation	2026-04-19 18:05:00 -07:00
Storme-bit	395c06137c	fixed summary creation	2026-04-19 17:35:44 -07:00
Storme-bit	98b89d44a5	fixed ordering of fetched episodes	2026-04-19 17:14:52 -07:00
Storme-bit	57edf97270	summarization fetch failed	2026-04-19 15:31:59 -07:00
Storme-bit	cb6428448d	summarization fetch failed	2026-04-19 15:23:24 -07:00
Storme-bit	a674f4d774	summarization fetch failed	2026-04-19 15:17:30 -07:00
Storme-bit	7824404319	fixed token count reading	2026-04-19 07:59:31 -07:00
Storme-bit	0619c4c7f3	fixed token count reading	2026-04-19 07:50:10 -07:00
Storme-bit	225728e531	fixed token count reading	2026-04-19 07:38:36 -07:00
Storme-bit	8c807fb35b	summary system backend implementation	2026-04-19 07:23:00 -07:00
Storme-bit	4cc87d96b6	summary system backend implementation	2026-04-19 07:19:27 -07:00
Storme-bit	57e8c4c486	summary system backend implementation	2026-04-19 06:59:06 -07:00
Storme-bit	ef5bfd5757	summary system backend implementation	2026-04-19 06:57:09 -07:00
Storme-bit	a6e17e33a0	summary system backend implementation	2026-04-19 06:52:43 -07:00
Storme-bit	01ed60a547	summary system backend implementation	2026-04-19 06:51:39 -07:00
Storme-bit	2769f436fa	summary system backend implementation	2026-04-19 06:50:24 -07:00
Storme-bit	15c1bec609	system prompt client global and project	2026-04-19 03:02:03 -07:00
Storme-bit	fa3b0859f0	system prompt client global and project	2026-04-19 02:57:11 -07:00
Storme-bit	a0154e15e6	system prompt backend	2026-04-19 02:32:38 -07:00
Storme-bit	9c903a56ae	memory isolation fix and session grouping in client	2026-04-19 02:09:12 -07:00
Storme-bit	56355d232b	memory isolation fix	2026-04-19 01:02:52 -07:00
Storme-bit	ed57a0331a	documentation update	2026-04-19 00:26:48 -07:00
Storme-bit	e1375e7d1b	documentation update	2026-04-18 23:37:32 -07:00
Storme-bit	1fc6e8a66d	saving project notes	2026-04-18 23:17:34 -07:00
Storme-bit	ee8f5bb5f0	project view updates	2026-04-18 23:06:59 -07:00
Storme-bit	c87760cc01	project view updates	2026-04-18 22:53:24 -07:00
Storme-bit	e69ceb44e7	project view updates	2026-04-18 22:53:03 -07:00
Storme-bit	ad5ecb5ff3	chat client fixes	2026-04-18 21:21:05 -07:00
Storme-bit	44989a2b8b	documentation updated for model inference settings	2026-04-18 06:41:50 -07:00
Storme-bit	c198a00dde	model inference settings	2026-04-18 06:33:22 -07:00
Storme-bit	dd4013685b	model inference settings	2026-04-18 06:29:47 -07:00
Storme-bit	2d1f7176ff	model inference settings	2026-04-18 06:23:50 -07:00
Storme-bit	6935459428	model inference settings	2026-04-18 06:20:58 -07:00
Storme-bit	4b75529806	model inference settings	2026-04-18 06:16:31 -07:00
Storme-bit	daf5b9a8ae	model inference settings	2026-04-18 03:25:22 -07:00
Storme-bit	2b47b06563	model temperature settings	2026-04-18 02:54:47 -07:00
Storme-bit	616383e9bc	model temperature settings	2026-04-18 02:45:43 -07:00
Storme-bit	8bd4836cd7	model temperature settings	2026-04-18 02:40:31 -07:00
Storme-bit	9950ea3b62	implementing model selector and info panel	2026-04-18 02:28:01 -07:00
Storme-bit	9fccc4809d	implementing model selector	2026-04-18 01:53:26 -07:00
Storme-bit	68f2d758b1	implementing model selector	2026-04-18 01:52:02 -07:00
Storme-bit	072758df9c	health panel implementation	2026-04-17 23:35:31 -07:00
Storme-bit	8a5caf7399	health panel implementation	2026-04-17 23:32:33 -07:00
Storme-bit	afae2af85b	memory settings implementation	2026-04-17 23:18:48 -07:00
Storme-bit	77275cf476	memory settings implementation	2026-04-17 23:13:36 -07:00
Storme-bit	1cc7b62d79	added react-markdown	2026-04-17 22:45:24 -07:00
Storme-bit	fc864041c5	added react-markdown	2026-04-17 22:23:21 -07:00
Storme-bit	8ae12c8c50	memory view in chat client	2026-04-17 20:00:44 -07:00
Storme-bit	bf074295eb	memory view in chat client	2026-04-17 19:56:54 -07:00
Storme-bit	b3fb936494	memory view in chat client	2026-04-17 19:50:13 -07:00
Storme-bit	05f1fbb04e	bulk episodic deletion	2026-04-17 19:43:18 -07:00
Storme-bit	930a6dbd13	bulk episodic deletion	2026-04-17 19:34:35 -07:00
Storme-bit	99a4914d66	bulk episodic deletion	2026-04-17 19:34:21 -07:00
Storme-bit	91e4f68a8c	updated documentation for entity implementation	2026-04-17 07:00:28 -07:00
Storme-bit	7e50e82d8c	fix entity duplication glitch	2026-04-17 06:46:26 -07:00
Storme-bit	cfa1358174	adding in entity extraction layer with semantic search enabled	2026-04-17 06:28:15 -07:00
Storme-bit	1ed76e4d95	adding in entity extraction layer with semantic search enabled	2026-04-17 06:23:41 -07:00
Storme-bit	06d7031e44	adding in entity extraction layer with semantic search enabled	2026-04-17 06:18:39 -07:00
Storme-bit	902725b7f7	adding in entity extraction layer with semantic search enabled	2026-04-17 06:08:12 -07:00
Storme-bit	cf7f387add	adding in entity extraction layer with semantic search enabled	2026-04-17 06:04:24 -07:00
Storme-bit	b4fd3ed72c	adding in entity extraction layer with semantic search enabled	2026-04-17 06:01:49 -07:00
Storme-bit	cef1803af6	adding in entity extraction layer with semantic search enabled	2026-04-17 06:01:21 -07:00
Storme-bit	0cad85d4a7	adding in entity extraction layer	2026-04-17 05:54:33 -07:00
Storme-bit	4070eb5559	adding in entity extraction layer	2026-04-17 05:53:08 -07:00
Storme-bit	ba1e6b32e7	adding in entity extraction layer	2026-04-17 05:50:54 -07:00
Storme-bit	940b636175	adding in entity extraction layer	2026-04-17 05:47:34 -07:00
Storme-bit	2d2164451d	adding in entity extraction layer	2026-04-17 05:43:41 -07:00
Storme-bit	ec44b935d1	adding in entity extraction layer	2026-04-17 05:37:24 -07:00
Storme-bit	bb05d1508d	update documentation	2026-04-17 03:48:49 -07:00
Storme-bit	ac1bd963ef	update documentation	2026-04-17 03:46:45 -07:00
Storme-bit	5145b9a7db	update documentation	2026-04-17 03:46:17 -07:00
Storme-bit	27e3c98304	semantic search within project	2026-04-15 03:15:26 -07:00
Storme-bit	e1c16a5714	semantic search within project	2026-04-15 03:04:04 -07:00
Storme-bit	0db2896b55	missing POST /sessions	2026-04-15 02:52:40 -07:00
Storme-bit	46f3013a51	missing POST /sessions	2026-04-15 02:52:31 -07:00
Storme-bit	5f5fec9d00	wired in project isolation	2026-04-15 02:43:16 -07:00
Storme-bit	f83e37f5c7	wired in project isolation	2026-04-15 02:36:37 -07:00
Storme-bit	e8b81554c7	chat sessions in project view	2026-04-15 02:23:38 -07:00
Storme-bit	3f79cd4a41	chat sessions in project view	2026-04-14 02:16:10 -07:00
Storme-bit	4f388faaef	chat sessions in project view	2026-04-14 02:12:24 -07:00
Storme-bit	1d420789b3	chat sessions in project view	2026-04-14 02:11:23 -07:00
Storme-bit	11449bb207	chat sessions in project view	2026-04-14 02:04:16 -07:00
Storme-bit	eb702624c3	chat sessions in project view	2026-04-14 02:03:54 -07:00
Storme-bit	996db6d4f1	chat sessions in project view	2026-04-14 01:58:08 -07:00
Storme-bit	f8fcc99929	chat sessions in project view	2026-04-14 01:55:25 -07:00
Storme-bit	c892f54a04	chat sessions in project view	2026-04-14 01:52:11 -07:00
Storme-bit	cdd74b5902	get sessions by projectId	2026-04-14 01:29:13 -07:00
Storme-bit	271a396ef5	get sessions by projectId	2026-04-14 01:16:59 -07:00
Storme-bit	30aaad6f77	get sessions by projectId	2026-04-14 01:14:19 -07:00
Storme-bit	7598e8b9f4	get sessions by projectId	2026-04-14 01:07:59 -07:00
Storme-bit	8d4a553a2a	ALTER TABLE to add an isolated property for projects	2026-04-14 00:59:54 -07:00
Storme-bit	649ed2b350	added being able to assign sessions to projects via the sessions modal	2026-04-13 20:36:42 -07:00
Storme-bit	e3f6b9a9db	autonaming error logging	2026-04-13 20:15:57 -07:00
Storme-bit	70959e945a	autonaming error logging	2026-04-13 20:12:14 -07:00
Storme-bit	4e0f7d33aa	added auto-naming on first message	2026-04-13 20:04:36 -07:00
Storme-bit	0b9fedcd6e	updated documentation to reflect additions of new project, settings, and UI restructure	2026-04-13 17:26:20 -07:00
Storme-bit	699592071f	chat client UI restructure + added all projects view and settings view(placeholder)	2026-04-13 17:08:52 -07:00
Storme-bit	7501fc54f1	added missing memory service project routes	2026-04-13 06:18:34 -07:00