nexusAI/docs/services/embedding-service.md
Last modified: 2026-04-04 21:45:29 -07:00

# Embedding Service

- **Package:** `@nexusai/embedding-service`
- **Location:** `packages/embedding-service`
- **Deployed on:** Mini PC 1 (192.168.0.81)
- **Port:** 3003

## Purpose

Converts text into vector embeddings via Ollama for storage in Qdrant. Keeps embedding workload co-located with the memory service on Mini PC 1, minimizing network hops on the memory write path.

## Dependencies

- `express` — HTTP API
- `@nexusai/shared` — shared utilities
- `dotenv` — environment variable loading

Uses the Node.js built-in `fetch`; no additional HTTP client library is needed.

## Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `PORT` | No | `3003` | Port to listen on |
| `OLLAMA_URL` | No | `http://localhost:11434` | Ollama instance URL |
| `EMBEDDING_MODEL` | No | `nomic-embed-text` | Ollama embedding model to use |

## Model

`nomic-embed-text` via Ollama produces 768-dimensional vectors compared with cosine similarity. This dimension must match the `QDRANT.VECTOR_SIZE` constant in `@nexusai/shared`.

If the embedding model is changed, update `QDRANT.VECTOR_SIZE` in `constants.js` *and* reinitialize the Qdrant collections with the new vector dimension; both steps are required to keep the model, the constant, and the collections consistent.
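A lightweight runtime guard can catch a model/collection mismatch early. This is a sketch, not the service's actual code, and it inlines the constant rather than assuming the exact export shape of `@nexusai/shared`:

```javascript
// Sketch: guard against a mismatch between the embedding model's output
// dimension and the size the Qdrant collections were created with.
// QDRANT.VECTOR_SIZE stands in for the constant described above.
const QDRANT = { VECTOR_SIZE: 768 };

function assertVectorSize(embedding) {
  if (!Array.isArray(embedding) || embedding.length !== QDRANT.VECTOR_SIZE) {
    throw new Error(
      `expected ${QDRANT.VECTOR_SIZE}-dimension vector, got ${embedding?.length}`
    );
  }
  return embedding;
}
```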

## Ollama API

Uses the `/api/embed` endpoint (Ollama v0.4+). Request shape:

```json
{ "model": "nomic-embed-text", "input": "text to embed" }
```

The response key is `embeddings[0]`, an array of 768 floats.
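For reference, the Ollama call can be sketched with the built-in `fetch`. The defaults mirror the environment-variable table above; the helper names are illustrative, not the service's actual code:

```javascript
// Sketch of the Ollama /api/embed call using Node's built-in fetch.
// OLLAMA_URL and EMBEDDING_MODEL defaults mirror the table above.
const OLLAMA_URL = process.env.OLLAMA_URL || 'http://localhost:11434';
const EMBEDDING_MODEL = process.env.EMBEDDING_MODEL || 'nomic-embed-text';

// Build the request body documented above.
function buildEmbedRequest(input, model = EMBEDDING_MODEL) {
  return { model, input };
}

async function ollamaEmbed(text) {
  const res = await fetch(`${OLLAMA_URL}/api/embed`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildEmbedRequest(text)),
  });
  if (!res.ok) throw new Error(`Ollama embed failed: ${res.status}`);
  const { embeddings } = await res.json();
  return embeddings[0]; // array of 768 floats for nomic-embed-text
}
```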

## Endpoints

### Health

| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Service health check |

### Embed

| Method | Path | Description |
|--------|------|-------------|
| POST | `/embed` | Embed a single text string |
| POST | `/embed/batch` | Embed an array of text strings |

### POST /embed

Embeds a single text string and returns the vector.

Request body:

```json
{
  "text": "Hello from NexusAI"
}
```

Response:

```json
{
  "embedding": [0.123, -0.456, ...],
  "model": "nomic-embed-text",
  "dimensions": 768
}
```
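A client call might look like the following. The base URL is an assumption taken from the deployment notes above, and the shape check simply mirrors the documented response fields:

```javascript
// Minimal client sketch for POST /embed.
// The base URL is assumed from the deployment notes above.
const EMBED_SERVICE_URL = 'http://192.168.0.81:3003';

// Validate the documented response shape before using the vector.
function isEmbedResponse(body) {
  return Array.isArray(body.embedding)
    && typeof body.model === 'string'
    && body.dimensions === body.embedding.length;
}

async function embed(text) {
  const res = await fetch(`${EMBED_SERVICE_URL}/embed`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text }),
  });
  if (!res.ok) throw new Error(`embed failed: ${res.status}`);
  const body = await res.json();
  if (!isEmbedResponse(body)) throw new Error('unexpected response shape');
  return body.embedding;
}
```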

### POST /embed/batch

Embeds an array of strings sequentially and returns all vectors in the same order. Ollama does not natively parallelize embeddings, so requests are processed one at a time.

Request body:

```json
{
  "texts": ["first sentence", "second sentence"]
}
```

Response:

```json
{
  "embeddings": [[0.123, ...], [0.456, ...]],
  "model": "nomic-embed-text",
  "dimensions": 768,
  "count": 2
}
```
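The sequential behavior described above can be sketched as a simple loop; `embedOne` is a placeholder for whatever single-text embedding call the service uses, not an actual export:

```javascript
// Sketch: embed texts one at a time, preserving input order.
// `embedOne` is a placeholder for the single-text embedding call.
async function embedBatch(texts, embedOne) {
  const embeddings = [];
  for (const text of texts) {
    // Awaiting inside the loop keeps requests strictly sequential,
    // since Ollama does not natively parallelize embeddings.
    embeddings.push(await embedOne(text));
  }
  return embeddings;
}
```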