nexusAI/docs/deployment/homelab.md
2026-04-17 03:46:17 -07:00

Homelab Deployment

Overview

NexusAI is distributed across three nodes. Each node runs only the services appropriate for its hardware.

Mini PC 1 — 192.168.0.81

Runs: Qdrant, Memory Service, Embedding Service, Ollama

ssh storme@192.168.0.81
docker compose -f docker-compose.mini1.yml up -d  # Qdrant
npm run memory      # port 3002
npm run embedding   # port 3003
OLLAMA_HOST=0.0.0.0 ollama serve   # port 11434, listening on all interfaces

Ollama listens only on 127.0.0.1 by default. Start it with OLLAMA_HOST=0.0.0.0 so it accepts connections from other services on the LAN; without this, embedding requests from the memory service will be refused.
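One way to confirm the bind address is to inspect the listening sockets on Mini PC 1. The sketch below is not part of NexusAI: listen_scope is a hypothetical helper that classifies the output of ss -ltnH for one port, and it assumes a Linux host with ss and awk available.

```shell
# Hypothetical helper: classify how a TCP port is bound.
# Reads `ss -ltnH` style lines on stdin; $1 is the port to look for.
listen_scope() {
  awk -v port="$1" '
    {
      addr = $4                               # local address:port column
      n = split(addr, parts, ":")
      if (parts[n] != port) next              # not the port we care about
      host = substr(addr, 1, length(addr) - length(parts[n]) - 1)
      if (host == "0.0.0.0" || host == "*" || host == "[::]") all = 1
      else if (host == "127.0.0.1" || host == "[::1]") loop = 1
      else other = 1
    }
    END {
      if (all) print "all-interfaces"
      else if (other) print "specific-address"
      else if (loop) print "loopback-only"
      else print "not-listening"
    }'
}

# Example (run on Mini PC 1):
#   ss -ltnH | listen_scope 11434
```

A result of loopback-only would suggest that OLLAMA_HOST was not picked up by the running process.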

Mini PC 2 — 192.168.0.205

Runs: Orchestration Service, Chat Client (via Caddy), Gitea, Caddy, Authelia

ssh storme@192.168.0.205

cd /opt/stacks/network
docker compose up -d        # Caddy, Authelia, and other network services

cd ~/nexusAI
npm run orchestration       # port 4000

Main PC — 192.168.0.79

Runs: Inference Service, llama-server

# Start llama-server first — inference service depends on it
.\llama-gpu\llama-server.exe `
  -m .\models\gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf `
  -ngl 99 --reasoning off --host 0.0.0.0 --port 8080 -c 64000

# Then start inference service
npm run inference            # port 3001

Chat Client Deployment

The chat client is a React + Vite app built to static files and served by Caddy on Mini PC 2. It does not run as a Node process.

# On Mini PC 2 after git pull
cd ~/nexusAI/packages/chat-client

# Set production URL before building
VITE_ORCHESTRATION_URL=https://nexus.jellystorm.com npm run build

# Output lands in packages/chat-client/dist/
# Caddy serves this directory directly via Docker volume mount

Do NOT set VITE_ORCHESTRATION_URL during local dev. Vite's proxy already handles routing, and pointing the client at the HTTPS domain causes Authelia to intercept API requests. The client then receives Authelia's response instead of JSON, which surfaces as confusing JSON parse errors.
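A small guard around the production build can make a missing or mistyped URL fail early rather than end up in the bundle. This is only a sketch: valid_orchestration_url and build_chat_client are hypothetical names, not scripts in the repository, and the only check performed is that the value starts with https://.

```shell
# Hypothetical guard: accept only non-empty https:// URLs.
valid_orchestration_url() {
  case "$1" in
    https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: refuse to build unless the URL looks like production.
# Run it from packages/chat-client.
build_chat_client() {
  if ! valid_orchestration_url "${VITE_ORCHESTRATION_URL:-}"; then
    echo "VITE_ORCHESTRATION_URL must be an https:// URL for production builds" >&2
    return 1
  fi
  # Pass the variable explicitly so the npm child process sees it.
  VITE_ORCHESTRATION_URL="$VITE_ORCHESTRATION_URL" npm run build
}

# Example:
#   VITE_ORCHESTRATION_URL=https://nexus.jellystorm.com build_chat_client
```

For local dev, plain npm scripts without the wrapper keep the variable unset, which matches the advice above.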

Caddy Configuration

The Caddyfile on Mini PC 2 must include a handle block for each route prefix the client needs to reach. Current required blocks for NexusAI:

nexus.jellystorm.com {
    import authelia

    handle /chat* {
        reverse_proxy 192.168.0.205:4000
    }

    handle /sessions* {
        reverse_proxy 192.168.0.205:4000
    }

    handle /models* {
        reverse_proxy 192.168.0.205:4000
    }

    handle /projects* {
        reverse_proxy 192.168.0.205:4000
    }

    handle {
        root * /srv/nexusai
        try_files {path} /index.html
        file_server
    }
}

When adding new top-level routes to the orchestration service, add a matching handle block here and reload Caddy:

caddy reload --config /path/to/Caddyfile
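With the Caddyfile shown above, a route prefix without its own handle block falls through to the final catch-all and is answered with index.html rather than proxied to orchestration. A quick check can catch that before reloading. The helper below is a hypothetical sketch, not a Caddy feature: missing_handles uses a plain grep, so it assumes each handle directive sits on its own line in the form used in this document, and the Caddyfile path in the example is left as a placeholder.

```shell
# Hypothetical helper: print every route prefix that has no
# `handle <prefix>*` line in the given Caddyfile.
# $1 is the Caddyfile path; remaining arguments are prefixes like /chat.
missing_handles() (
  caddyfile=$1
  shift
  for prefix in "$@"; do
    if ! grep -Eq "^[[:space:]]*handle[[:space:]]+${prefix}\*" "$caddyfile"; then
      printf '%s\n' "$prefix"
    fi
  done
)

# Example, using the prefixes currently required for NexusAI:
#   missing_handles /path/to/Caddyfile /chat /sessions /models /projects
```

Empty output means every listed prefix has a matching block; any printed prefix still needs one.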

The Caddy container mounts the dist directory via Docker volume:

- /home/storme/nexusAI/packages/chat-client/dist:/srv/nexusai

After adding or changing volume mounts, the Caddy container must be recreated: run docker compose down caddy && docker compose up -d caddy. Changes to the Caddyfile alone need only caddy reload.

Environment Files

Each service needs a .env file in its package directory (the orchestration service keeps its file under src/). These files are not committed to git. See each service's documentation for the full list of required variables.

| Service | Location | Key Variables |
|---|---|---|
| Memory | packages/memory-service/.env | SQLITE_PATH, QDRANT_URL, EMBEDDING_SERVICE_URL |
| Embedding | packages/embedding-service/.env | OLLAMA_URL, EMBEDDING_MODEL |
| Inference | packages/inference-service/.env | INFERENCE_PROVIDER, INFERENCE_URL, DEFAULT_MODEL |
| Orchestration | packages/orchestration-service/src/.env | MEMORY_SERVICE_URL, EMBEDDING_SERVICE_URL, INFERENCE_SERVICE_URL, QDRANT_URL, MODELS_MANIFEST_PATH |
| Chat client | packages/chat-client/.env | VITE_ORCHESTRATION_URL (production builds only) |
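Because the .env files are not in git, a fresh checkout on any node starts without them. The following sketch shows one way to spot missing keys before starting a service. missing_env_keys is a hypothetical helper, and it only checks that a KEY= line exists, not that the value is correct.

```shell
# Hypothetical helper: print each required key that has no KEY= line
# in the given .env file. $1 is the file; remaining arguments are keys.
missing_env_keys() (
  env_file=$1
  shift
  for key in "$@"; do
    # Accept both `KEY=value` and `export KEY=value`.
    grep -Eq "^[[:space:]]*(export[[:space:]]+)?${key}=" "$env_file" 2>/dev/null \
      || printf '%s\n' "$key"
  done
)

# Example, run from the repository root on Mini PC 1:
#   missing_env_keys packages/memory-service/.env \
#     SQLITE_PATH QDRANT_URL EMBEDDING_SERVICE_URL
```

If the file itself is absent, every key is reported as missing.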

Models Manifest

The models manifest (models.json) lives on the Main PC alongside the model files, accessible to orchestration via an SMB mount at /mnt/nexus-models.

[
  { "value": "gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf", "label": "Gemma 4 26B Claude Distill" }
]

The value field must exactly match the model name as reported by llama-server, including the .gguf extension. No service restart is needed to pick up changes to the manifest.
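Since the match has to be exact, it can help to test a name against the manifest before relying on it. The sketch below is an assumption-laden illustration: manifest_values and manifest_has_model are hypothetical helpers, the extraction is a rough grep rather than a real JSON parser (it expects the simple one-entry-per-line layout shown above), and the file path /mnt/nexus-models/models.json in the example is assumed from the mount point.

```shell
# Hypothetical helper: list every "value" string in a models manifest.
# This is a grep-based approximation, not a JSON parser.
manifest_values() {
  grep -o '"value"[[:space:]]*:[[:space:]]*"[^"]*"' "$1" \
    | sed 's/.*"\([^"]*\)"$/\1/'
}

# Hypothetical helper: succeed only if the manifest contains the exact name.
manifest_has_model() {
  manifest_values "$1" | grep -Fxq -- "$2"
}

# Example (path assumed, see above):
#   manifest_has_model /mnt/nexus-models/models.json \
#     gemma-4-26B-A4B-Claude-Distill-APEX-I-Mini.gguf && echo "listed"
```

The name to compare against should come from llama-server itself; its OpenAI-compatible model listing endpoint is one likely place to read it, though the exact field to use is not confirmed by this document.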