minor clean up
64
packages/embedding-service/CLAUDE.md
Normal file
@@ -0,0 +1,64 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

See the root [CLAUDE.md](../../CLAUDE.md) for overall architecture, service roles, and deployment layout.

## Running This Service

```bash
npm run embedding                          # From repo root
npm -w packages/embedding-service run dev  # With --watch
```

Default port: **3003**. Requires Ollama to be reachable at `OLLAMA_URL`.

## Single-File Service

The entire service is `src/index.js`; there is no subdirectory structure. All routes, the Ollama helper, and startup are in one file.

## Environment Variables

| Variable | Default | Description |
|---|---|---|
| `PORT` | `3003` | Port to listen on |
| `OLLAMA_URL` | `http://localhost:11434` | Ollama instance URL |
| `EMBEDDING_MODEL` | `nomic-embed-text` | Model passed to Ollama `/api/embed` |

Note: the env var name is `EMBEDDING_MODEL`, not `EMBED_MODEL`. The internal constant is named `EMBED_MODEL`, but the key it is read from is `EMBEDDING_MODEL`.
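
The naming mismatch is easier to see in code. The following is a minimal sketch, assuming a plain `process.env` read with fallbacks; the `loadConfig` function is hypothetical, and only the variable names and defaults come from the table above.

```javascript
// Hypothetical helper: reads settings from an env-like object.
// Variable names and defaults are taken from the table above.
function loadConfig(env = process.env) {
  return {
    port: Number(env.PORT) || 3003,
    ollamaUrl: env.OLLAMA_URL || 'http://localhost:11434',
    // The constant is called EMBED_MODEL, but the key read is EMBEDDING_MODEL.
    EMBED_MODEL: env.EMBEDDING_MODEL || 'nomic-embed-text',
  };
}

// Setting EMBED_MODEL in the environment has no effect; only EMBEDDING_MODEL is consulted.
const config = loadConfig({ EMBED_MODEL: 'ignored', PORT: '4000' });
console.log(config.EMBED_MODEL, config.port); // prints "nomic-embed-text 4000"
```

If a model override appears to be ignored, checking which of the two names was exported is a reasonable first step.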

## Ollama API Details

Uses Ollama's `/api/embed` endpoint (not `/api/embeddings`). Request shape:

```json
{ "model": "nomic-embed-text", "input": "text to embed" }
```

Ollama returns `{ "embeddings": [[...]] }`, an array of arrays even for a single input. The helper takes `data.embeddings[0]` to return the single vector.

The `ollama` npm package is listed as a dependency but is **not used**; all calls are raw `fetch`. Do not refactor to use the package without checking that its API shape matches.
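
The request and response handling described above can be sketched as follows. This is an illustration under stated assumptions, not the code in `src/index.js`: the function name `embedText`, its options object, and the error message are hypothetical, and `fetchImpl` is injectable only so the sketch can be run without a live Ollama instance.

```javascript
// Hypothetical helper mirroring the behaviour described above.
async function embedText(text, {
  ollamaUrl = 'http://localhost:11434',
  model = 'nomic-embed-text',
  fetchImpl = globalThis.fetch, // global fetch is available in Node 18+
} = {}) {
  const res = await fetchImpl(`${ollamaUrl}/api/embed`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, input: text }),
  });
  if (!res.ok) {
    throw new Error(`Ollama responded with status ${res.status}`);
  }
  const data = await res.json();
  // embeddings is an array of arrays; a single input yields one inner vector.
  return data.embeddings[0];
}
```

Forgetting the `[0]` would hand callers a one-element array containing the vector instead of the vector itself, which is the mistake the note above is guarding against.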

## Batch Endpoint

`POST /embed/batch` embeds items **sequentially** in a for-loop, not in parallel. A code comment explains why: Ollama doesn't parallelise embedding calls, so parallel requests would queue internally anyway. Do not change to `Promise.all` without verifying Ollama behaviour.
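
A minimal sketch of the sequential pattern, for comparison with a `Promise.all` version. The names `embedBatchSequential` and `embedOne` are hypothetical; `embedOne` stands for any async function that returns a vector for one string.

```javascript
// Hypothetical sketch of the sequential loop described above.
async function embedBatchSequential(texts, embedOne) {
  const embeddings = [];
  for (const text of texts) {
    // await inside the loop: the next request starts only after this one finishes.
    embeddings.push(await embedOne(text));
  }
  return embeddings;
}
```

With this shape at most one request is in flight at a time and results come back in input order. A `Promise.all(texts.map(embedOne))` version would issue every request at once.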

## Error Responses

| Condition | Status | Notes |
|---|---|---|
| Missing/empty `text` | 400 | |
| Ollama call fails | 502 | Upstream failure, so 502 is the correct status |
| Empty `texts` array | 400 | |
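
The status mapping in the table can be sketched in a framework-agnostic way. Everything here is an assumption made for illustration: the `handleEmbed` name, the `{ status, body }` return shape, the `{ error }` body, and treating whitespace-only text as empty. Only the status codes and the success fields come from this document.

```javascript
// Hypothetical handler core: returns { status, body } instead of writing to a response object.
async function handleEmbed(body, embedOne, model = 'nomic-embed-text') {
  const text = body && body.text;
  // Assumption: whitespace-only input is treated the same as an empty string.
  if (typeof text !== 'string' || text.trim() === '') {
    return { status: 400, body: { error: 'text is required and must not be empty' } };
  }
  try {
    const embedding = await embedOne(text);
    return { status: 200, body: { embedding, model, dimensions: embedding.length } };
  } catch (err) {
    // The upstream (Ollama) failed, so report 502 rather than 500.
    return { status: 502, body: { error: String((err && err.message) || err) } };
  }
}
```

Keeping validation ahead of the upstream call means a bad request never reaches Ollama, and any failure inside the `try` is attributed to the upstream.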

## Known Issue

The 400 error message for `/embed` reads `"text is required and must be empty"`. The word "not" is missing; the message should say `"must not be empty"`.

## API Endpoints

| Method | Path | Notes |
|---|---|---|
| GET | `/health` | Static response; does not verify Ollama is reachable |
| POST | `/embed` | Body: `{ text: string }`. Returns `{ embedding, model, dimensions }` |
| POST | `/embed/batch` | Body: `{ texts: string[] }`. Returns `{ embeddings, model, dimensions, count }` |
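
For callers, a request to the batch endpoint might look like the sketch below. The `embedBatchViaService` function and its options are hypothetical, and `baseUrl` assumes the default port from this document. `fetchImpl` is injectable so the sketch can be exercised without the service running.

```javascript
// Hypothetical client for this service; not part of src/index.js.
async function embedBatchViaService(texts, {
  baseUrl = 'http://localhost:3003',
  fetchImpl = globalThis.fetch, // global fetch is available in Node 18+
} = {}) {
  const res = await fetchImpl(`${baseUrl}/embed/batch`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ texts }),
  });
  if (!res.ok) {
    // 400 for an empty texts array, 502 when the Ollama call fails.
    throw new Error(`embedding service returned ${res.status}`);
  }
  // Expected shape per the table: { embeddings, model, dimensions, count }
  return res.json();
}
```

Because `/health` returns a static response, a successful health check does not guarantee this call will succeed; a 502 here is the first sign that Ollama is unreachable.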