Embedding Service
Package: @nexusai/embedding-service
Location: packages/embedding-service
Deployed on: Mini PC 1 (192.168.0.81)
Port: 3003
Purpose
Converts text into vector embeddings via Ollama for storage in Qdrant. Keeps embedding workload co-located with the memory service on Mini PC 1, minimizing network hops on the memory write path.
Dependencies
- express — HTTP API
- @nexusai/shared — shared utilities
- dotenv — environment variable loading

Uses Node.js built-in fetch — no additional HTTP client library is needed.
Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| PORT | No | 3003 | Port to listen on |
| OLLAMA_URL | No | http://localhost:11434 | Ollama instance URL |
| EMBEDDING_MODEL | No | nomic-embed-text | Ollama embedding model to use |
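A minimal `.env` file matching the defaults in the table above (all three variables are optional; values shown are illustrative):

```shell
PORT=3003
OLLAMA_URL=http://localhost:11434
EMBEDDING_MODEL=nomic-embed-text
```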
Model
nomic-embed-text via Ollama produces 768-dimension vectors using Cosine similarity.
This must match the QDRANT.VECTOR_SIZE constant in @nexusai/shared.
If the embedding model is changed, update QDRANT.VECTOR_SIZE in constants.js
to the new dimension and reinitialize the Qdrant collections — both steps are
required to keep the write path consistent.
Ollama API
Uses the /api/embed endpoint (Ollama v0.4+). Request shape:

```json
{ "model": "nomic-embed-text", "input": "text to embed" }
```

The response key is embeddings[0] — an array of 768 floats.
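The request and response shapes above can be sketched as two small helpers — building the /api/embed payload and pulling the vector out of the response JSON. This is a sketch of the shapes only; no network call is made, and the helper names are illustrative:

```javascript
// Builds the /api/embed request body for a single input string.
function buildEmbedRequest(text, model = 'nomic-embed-text') {
  return { model, input: text };
}

// Extracts the vector from an Ollama v0.4+ response:
// { "embeddings": [[...floats]] } for a single input.
function extractVector(responseJson) {
  const vector = responseJson.embeddings?.[0];
  if (!Array.isArray(vector)) {
    throw new Error('Unexpected Ollama response: missing embeddings[0]');
  }
  return vector;
}
```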
Endpoints
Health
| Method | Path | Description |
|---|---|---|
| GET | /health | Service health check |
Embed
| Method | Path | Description |
|---|---|---|
| POST | /embed | Embed a single text string |
| POST | /embed/batch | Embed an array of text strings |
POST /embed
Embeds a single text string and returns the vector.
Request body:
```json
{
  "text": "Hello from NexusAI"
}
```
Response:
```json
{
  "embedding": [0.123, -0.456, ...],
  "model": "nomic-embed-text",
  "dimensions": 768
}
```
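A client-side sketch of calling POST /embed. The base URL matches the deployment details above; `fetchImpl` is injectable so the function can be exercised without a live service (that parameter and the function name are assumptions for illustration):

```javascript
// Calls POST /embed on the embedding service and returns the parsed
// { embedding, model, dimensions } response.
async function embedText(
  text,
  { baseUrl = 'http://192.168.0.81:3003', fetchImpl = fetch } = {}
) {
  const res = await fetchImpl(`${baseUrl}/embed`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text }),
  });
  if (!res.ok) throw new Error(`Embed failed: HTTP ${res.status}`);
  return res.json();
}
```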
POST /embed/batch
Embeds an array of strings sequentially and returns all vectors in the same order. Ollama does not natively parallelize embeddings, so requests are processed one at a time.
Request body:
```json
{
  "texts": ["first sentence", "second sentence"]
}
```
Response:
```json
{
  "embeddings": [[0.123, ...], [0.456, ...]],
  "model": "nomic-embed-text",
  "dimensions": 768,
  "count": 2
}
```
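The sequential processing described above can be sketched as a plain loop that awaits each embedding before starting the next, preserving input order. `embedOne` is a hypothetical stand-in for the single-text embed call:

```javascript
// Embeds each text one at a time (Ollama does not natively parallelize
// embeddings) and returns the vectors in the same order as the inputs.
async function embedBatch(texts, embedOne) {
  const embeddings = [];
  for (const text of texts) {
    embeddings.push(await embedOne(text)); // sequential: await before next call
  }
  return { embeddings, count: embeddings.length };
}
```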