# Inference Service
**Package:** `@nexusai/inference-service`
**Location:** `packages/inference-service`
**Deployed on:** Main PC
**Port:** 3001
## Purpose
A thin adapter layer around the local LLM runtime (Ollama). It receives
assembled context packages from the orchestration service and returns
model responses.
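As a rough illustration of that flow, the sketch below forwards a prompt to Ollama's REST `/api/generate` endpoint. The helper names (`buildGenerateRequest`, `infer`) are hypothetical, not part of the service's actual code; only the endpoints listed below are documented.

```typescript
// Hypothetical sketch of the adapter's core request flow. Helper names are
// illustrative; the Ollama /api/generate REST endpoint itself is real.

interface GenerateRequest {
  model: string;
  prompt: string;
  stream: boolean;
}

// Build the JSON body for Ollama's /api/generate endpoint.
function buildGenerateRequest(model: string, prompt: string): GenerateRequest {
  // stream: false asks Ollama for a single JSON reply instead of a stream
  return { model, prompt, stream: false };
}

// Forward an assembled context package to the configured Ollama instance.
async function infer(prompt: string): Promise<string> {
  const base = process.env.OLLAMA_URL ?? "http://localhost:11434";
  const model = process.env.DEFAULT_MODEL ?? "llama3";
  const res = await fetch(`${base}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildGenerateRequest(model, prompt)),
  });
  const data = (await res.json()) as { response: string };
  return data.response;
}
```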
## Dependencies
- `express` — HTTP API
- `ollama` — Ollama client
- `dotenv` — environment variable loading
- `@nexusai/shared` — shared utilities
## Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| PORT | No | 3001 | Port to listen on |
| OLLAMA_URL | No | http://localhost:11434 | Ollama instance URL |
| DEFAULT_MODEL | No | llama3 | Default model to use for inference |
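The defaults in the table above suggest a config loader along these lines. This is a minimal sketch, not the service's actual implementation; in the real service `dotenv` would populate `process.env` before this runs.

```typescript
// Sketch of config resolution matching the environment-variable table.
// All three variables are optional, so every field has a fallback.

interface Config {
  port: number;
  ollamaUrl: string;
  defaultModel: string;
}

function loadConfig(env: Record<string, string | undefined> = process.env): Config {
  return {
    port: Number(env.PORT ?? 3001),
    ollamaUrl: env.OLLAMA_URL ?? "http://localhost:11434",
    defaultModel: env.DEFAULT_MODEL ?? "llama3",
  };
}
```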
## Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |
> Further endpoints will be documented as the service is built out.
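For reference, a `/health` handler might look like the following. The payload shape beyond a `status` field is an assumption, since the response body is not documented.

```typescript
// Hypothetical /health payload builder; field names other than "status"
// are assumptions and may differ from the real service.
function healthPayload(): { status: string; service: string } {
  return { status: "ok", service: "@nexusai/inference-service" };
}

// Wired into the Express app it could be used as:
//   app.get("/health", (_req, res) => res.json(healthPayload()));
```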