nexusAI/docs/homelab/overview.md
2026-04-05 01:37:01 -07:00

# 🏠 Homelab Documentation
> **Status:** Phase 1 Complete → Phase 2 In Progress
> **Last Updated:** April 2026
---
## Table of Contents
1. [Overview](#overview)
2. [Hardware](#hardware)
3. [Network & Access](#network--access)
4. [Infrastructure Node — Mini PC 2](#infrastructure-node--mini-pc-2)
5. [Media Node — Mini PC 1](#media-node--mini-pc-1)
6. [Main PC](#main-pc)
7. [Storage Layout](#storage-layout)
8. [Reverse Proxy — Caddy](#reverse-proxy--caddy)
9. [Phase 1 Summary](#phase-1-summary)
10. [Phase 2 Roadmap](#phase-2-roadmap)
---
## Overview
This homelab is a self-hosted, multi-node setup running across three machines on a local network. It is designed around modularity and separation of concerns:
- **Mini PC 2** acts as the infrastructure manager — handling networking, authentication, monitoring, DNS, and git hosting.
- **Mini PC 1** acts as the media and services workhorse — running the arr stack, Jellyfin, Nextcloud, NexusAI dependencies, and download management.
- **Main PC** is the primary inference node, housing the RTX A4000 GPU for AI workloads.
External access is routed through **Caddy** (reverse proxy), with **Authelia** providing SSO/MFA. **Tailscale** provides secure remote access, and **Pi-hole** handles local DNS.
---
## Hardware
### Main PC
| Spec | Detail |
|------|--------|
| GPU | NVIDIA RTX A4000 |
| Role | Primary AI inference node |
| Key Services | Ollama (inference) |
### Mini PC 1 — Media Node (`192.168.0.81`)
| Spec | Detail |
|------|--------|
| GPU | NVIDIA RTX 5050 |
| Role | Media services, embeddings, vector storage |
| Key Services | Jellyfin, Nextcloud, Qdrant, arr stack, NexusAI memory/embedding |
| Storage | NVMe (OS) + 3x external HDDs (see [Storage Layout](#storage-layout)) |
### Mini PC 2 — Infrastructure Node (`192.168.0.205`)
| Spec | Detail |
|------|--------|
| Role | Network management, monitoring, auth, DNS, git |
| Key Services | Caddy, Authelia, Tailscale, Pi-hole, Grafana, Gitea |
| Storage | NVMe (OS only) |
---
## Network & Access
| Component | Tool | Notes |
|-----------|------|-------|
| Reverse Proxy | Caddy | Handles HTTPS termination for all services |
| Authentication | Authelia | SSO + MFA for protected services |
| Remote Access | Tailscale | Secure VPN mesh for remote connectivity |
| DNS | Pi-hole | Local DNS resolution + ad blocking |
| Git Hosting | Gitea | Self-hosted at `192.168.0.205:3100` |
> **Note:** Gitea bypasses SSO and uses its own authentication. All other externally exposed services are protected behind Authelia.
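
A quick way to confirm that a listed service is actually listening is a plain TCP connect test. The following is a minimal sketch, not part of the existing tooling; the helper name `is_port_open` is hypothetical, and the Gitea address in the closing comment is the one from the table above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


# Example call from a machine on the LAN (Gitea, per the table above):
#   is_port_open("192.168.0.205", 3100)
```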
---
## Infrastructure Node — Mini PC 2
**IP:** `192.168.0.205`
### Containers
| Container | Port | Stack | Notes |
|-----------|------|-------|-------|
| caddy | 80, 443 | network | Reverse proxy, HTTPS termination |
| authelia | 9091 | network | SSO / MFA provider |
| tailscale | — | network | Mesh VPN |
| pihole | — | dns | Local DNS + ad blocking |
| prometheus | 9090 | monitoring | Metrics scraping |
| grafana | 3002 | monitoring | Dashboards |
| uptime-kuma | 3001 | monitoring | Uptime monitoring |
| node_exporter | 9100 | monitoring | Host metrics |
| cadvisor | 8088 | monitoring | Container metrics |
| prowlarr | 9696 | indexing | Indexer manager |
| flaresolverr | 8191 | indexing | Cloudflare bypass for indexers |
| homepage | 3000 | homepage | Service dashboard |
| gitea | 3100 | gitea | Self-hosted git |
| portainer | 9000 | — | Container management (primary) |
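
With this many containers on one host, a duplicate host port is an easy mistake to make when adding a service. The sketch below copies the ports from the table above into a dictionary and reports any port claimed by more than one container; `find_port_conflicts` is a hypothetical helper, and containers without a listed port (tailscale, pihole) are left out.

```python
# Host ports copied from the Mini PC 2 container table.
INFRA_PORTS = {
    "caddy": [80, 443],
    "authelia": [9091],
    "prometheus": [9090],
    "grafana": [3002],
    "uptime-kuma": [3001],
    "node_exporter": [9100],
    "cadvisor": [8088],
    "prowlarr": [9696],
    "flaresolverr": [8191],
    "homepage": [3000],
    "gitea": [3100],
    "portainer": [9000],
}


def find_port_conflicts(ports_by_container: dict[str, list[int]]) -> dict[int, list[str]]:
    """Map each port claimed by more than one container to its claimants."""
    claims: dict[int, list[str]] = {}
    for name, ports in ports_by_container.items():
        for port in ports:
            claims.setdefault(port, []).append(name)
    return {port: names for port, names in claims.items() if len(names) > 1}


if __name__ == "__main__":
    conflicts = find_port_conflicts(INFRA_PORTS)
    print(conflicts or "no port conflicts")
```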
---
## Media Node — Mini PC 1
**IP:** `192.168.0.81`
### Containers
#### Media Apps
| Container | Port | Stack | Notes |
|-----------|------|-------|-------|
| jellyfin | 8096 | mediaapps | Media streaming server |
| seer | 5055 | mediaapps | Request management (Overseerr/Jellyseerr) |
| kavita | 5000 | mediaapps | Comics / manga / ebook reader |
#### Arr Stack
| Container | Port | Stack | Notes |
|-----------|------|-------|-------|
| radarr | 7878 | arrstack | Movie management |
| radarr-anime | 7877 | arrstack | Anime movie management |
| sonarr | 8989 | arrstack | TV show management |
| sonarr-anime | 8988 | arrstack | Anime series management |
| whisparr | 6969 | arrstack | Adult content management |
| bazarr | 6767 | arrstack | Subtitle management |
| suwayomi | 4567 | arrstack | Manga reader / downloader |
#### Download Core
| Container | Port | Stack | Notes |
|-----------|------|-------|-------|
| gluetun | 8080 | download_core | VPN container — qBittorrent routes through this |
| qbittorrent | (via gluetun) | download_core | Torrent client, traffic tunnelled through Gluetun |
#### Cloud & Tools
| Container | Port | Stack | Notes |
|-----------|------|-------|-------|
| nextcloud (AIO) | — | — | Self-hosted cloud storage suite |
| filebrowser | 8085 | home_tools | Web-based file management |
| couchdb-obsidian-livesync | 5984 | obsidian | CouchDB backend for Obsidian LiveSync |
#### NexusAI
| Container | Port | Stack | Notes |
|-----------|------|-------|-------|
| qdrant | 6333 | nexusai | Vector database for NexusAI memory service |
#### Monitoring & Management
| Container | Port | Stack | Notes |
|-----------|------|-------|-------|
| node_exporter | 9100 | monitoring | Host metrics |
| cadvisor | 8088 | monitoring | Container metrics |
| nvidia_smi_exporter | 9835 | — | GPU metrics (RTX 5050) |
| portainer_agent | 9001 | — | Managed by Portainer on Mini PC 2 |
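
The exporters on both nodes are presumably scraped by Prometheus on Mini PC 2. The real `prometheus.yml` is not shown in this document, so the following is only a sketch of how the scrape targets could be derived from the tables, assuming static targets addressed by LAN IP and job names that mirror the container names (both are assumptions).

```python
import json

# Node addresses as documented in the Hardware section.
NODES = {
    "mini-pc-2": "192.168.0.205",
    "mini-pc-1": "192.168.0.81",
}

# Exporter ports from the container tables; job names are assumed.
EXPORTERS = {
    "node_exporter": {"port": 9100, "nodes": ["mini-pc-2", "mini-pc-1"]},
    "cadvisor": {"port": 8088, "nodes": ["mini-pc-2", "mini-pc-1"]},
    "nvidia_smi_exporter": {"port": 9835, "nodes": ["mini-pc-1"]},
}


def build_scrape_configs() -> list[dict]:
    """Build a list shaped like the scrape_configs section of prometheus.yml."""
    configs = []
    for job, spec in EXPORTERS.items():
        targets = [f"{NODES[node]}:{spec['port']}" for node in spec["nodes"]]
        configs.append({"job_name": job, "static_configs": [{"targets": targets}]})
    return configs


if __name__ == "__main__":
    # Printed as JSON for inspection; the structure maps directly onto YAML.
    print(json.dumps({"scrape_configs": build_scrape_configs()}, indent=2))
```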
---
## Main PC
**Role:** Primary AI inference node
| Service | Notes |
|---------|-------|
| Ollama | Runs LLM inference using the RTX A4000. Also serves `nomic-embed-text` embeddings (768-dim vectors) consumed by NexusAI's embedding service on Mini PC 1. |
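
A client on Mini PC 1 could request an embedding from Ollama roughly as sketched below. This is an illustration, not NexusAI's actual embedding service code: the host name `main-pc.example` is a placeholder because the Main PC's address is not documented here, port 11434 is Ollama's default and may differ in this setup, and the helper names are hypothetical. The sketch uses Ollama's `/api/embed` endpoint and checks the 768-dimension size stated above.

```python
import json
import urllib.request

# Placeholder address: the Main PC's IP is not given in this document.
# 11434 is Ollama's default port (assumed unchanged here).
OLLAMA_URL = "http://main-pc.example:11434"
EXPECTED_DIM = 768  # nomic-embed-text vector size, per the table above


def build_embed_request(text: str, base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/embed endpoint."""
    payload = json.dumps({"model": "nomic-embed-text", "input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embedding(body: bytes) -> list[float]:
    """Extract the first vector from the response and verify its dimension."""
    vector = json.loads(body)["embeddings"][0]
    if len(vector) != EXPECTED_DIM:
        raise ValueError(f"expected {EXPECTED_DIM} dimensions, got {len(vector)}")
    return vector


def embed(text: str, base_url: str = OLLAMA_URL, timeout: float = 30.0) -> list[float]:
    """Send the request to Ollama and return the embedding vector."""
    with urllib.request.urlopen(build_embed_request(text, base_url), timeout=timeout) as resp:
        return parse_embedding(resp.read())
```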
---
## Storage Layout
### Mini PC 2 (Infrastructure Node)
| Device | Size | Mount | Notes |
|--------|------|-------|-------|
| sda | 238.5G | — | Primary disk |
| sda1 | 1G | /boot/efi | EFI partition |
| sda2 | 2G | /boot | Boot partition |
| sda3 → LVM | 100G | / | OS root via LVM |
### Mini PC 1 (Media Node)
| Device | Size | Mount | Notes |
|--------|------|-------|-------|
| nvme0n1p1 | 1G | /boot/efi | EFI partition |
| nvme0n1p2 | 464.7G | / | OS root (NVMe) |
| sda1 | 10.9T | /mnt/media-anime | External HDD — anime media |
| sdb1 | 7.3T | /mnt/media-main | External HDD — main media library |
| sdc1 | 7.3T | /mnt/seedbox | External HDD — seedbox/download staging |
> **Total external storage on Mini PC 1:** ~25.5TB across 3 drives
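
The total follows directly from the three drive sizes in the table. A small sketch of the arithmetic, using the sizes exactly as listed:

```python
# Sizes as listed in the Mini PC 1 storage table, keyed by mount point.
EXTERNAL_DRIVES_TB = {
    "/mnt/media-anime": 10.9,
    "/mnt/media-main": 7.3,
    "/mnt/seedbox": 7.3,
}

# Round to one decimal place to match the precision of the table.
total_tb = round(sum(EXTERNAL_DRIVES_TB.values()), 1)
print(f"{total_tb} TB across {len(EXTERNAL_DRIVES_TB)} drives")  # 25.5 TB across 3 drives
```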
---
## Reverse Proxy — Caddy
All services are accessed via subdomains proxied through Caddy on Mini PC 2. Most site blocks import the `authelia` forward-auth snippet shown below; Gitea is exempt from SSO.
```caddy
# ============================================================
# Caddyfile — redacted for documentation
# Full config lives at [REDACTED PATH] on Mini PC 2
# ============================================================

# --- Authelia forward auth snippet ---
(authelia) {
    forward_auth authelia:9091 {
        uri /api/verify?rd=https://auth.[DOMAIN]
        copy_headers Remote-User Remote-Groups Remote-Name Remote-Email
    }
}

# --- Example: SSO-protected service ---
service.[DOMAIN] {
    import authelia
    reverse_proxy [HOST]:[PORT]
}

# --- Example: Public / SSO-exempt service (e.g. Gitea) ---
git.[DOMAIN] {
    reverse_proxy 192.168.0.205:3100
}

# --- Example: Internal-only service ---
grafana.[DOMAIN] {
    import authelia
    reverse_proxy 192.168.0.205:3002
}
```
> Actual subdomains, domain name, and internal IPs beyond what's documented above are redacted.
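
Because every site block follows one of the two shapes above, new entries could be generated rather than hand-written. The sketch below is illustrative only: `render_site` is a hypothetical helper, the `[DOMAIN]` placeholder is kept as redacted, and the real Caddyfile may contain directives that are not shown here.

```python
def render_site(subdomain: str, upstream: str, sso: bool = True,
                domain: str = "[DOMAIN]") -> str:
    """Render a Caddyfile site block in the style of the examples above."""
    lines = [f"{subdomain}.{domain} {{"]
    if sso:
        # Pull in the (authelia) forward-auth snippet for protected services.
        lines.append("    import authelia")
    lines.append(f"    reverse_proxy {upstream}")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Gitea is SSO-exempt; Grafana sits behind Authelia.
    print(render_site("git", "192.168.0.205:3100", sso=False))
    print(render_site("grafana", "192.168.0.205:3002"))
```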
---
## Phase 1 Summary
Phase 1 focused on establishing a stable, secure, and observable foundation:
- ✅ Reverse proxy with HTTPS (Caddy)
- ✅ SSO & MFA across services (Authelia)
- ✅ Secure remote access (Tailscale)
- ✅ Local DNS & ad blocking (Pi-hole)
- ✅ Full monitoring stack (Prometheus + Grafana + Uptime Kuma + exporters)
- ✅ Self-hosted git (Gitea)
- ✅ Media stack fully operational (Jellyfin, arr stack, Nextcloud)
- ✅ Download pipeline with VPN isolation (Gluetun + qBittorrent)
- ✅ NexusAI foundation services running (Qdrant, Ollama)
- ✅ Container management across nodes (Portainer + agent)
---
## Phase 2 Roadmap
Phase 2 shifts focus to resilience, security hardening, and smart home integration.
### Priorities
- **Backup improvements** — Formalize and automate backup strategies for critical data (Nextcloud, databases, configs, media metadata)
- **Additional security hardening** — Audit exposed services, tighten firewall rules, review Authelia policies
- **IP webcam integration** — Add camera feeds into the homelab ecosystem
- **Home Assistant** — Integrate smart home automation and sensor data
- **Continued NexusAI development** — Entities layer, embedding service, inference and orchestration buildout
> This section will be expanded as Phase 2 planning matures.