diff --git a/packages/memory-service/CLAUDE.md b/packages/memory-service/CLAUDE.md
new file mode 100644
index 0000000..d3a725e
--- /dev/null
+++ b/packages/memory-service/CLAUDE.md
@@ -0,0 +1,88 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+See the root [CLAUDE.md](../../CLAUDE.md) for overall architecture, service roles, and the dual-store memory model.
+
+## Running This Service
+
+```bash
+npm run memory # From repo root (node src/index.js)
+npm -w packages/memory-service run dev # With --watch
+```
+
+Default port: **3002**. Requires Qdrant and the embedding-service to be reachable on startup.
+
+## SQLite Schema
+
+`src/db/schema.js` is the source of truth for the data model. Key schema facts:
+
+- `sessions` and `episodes` are linked by FK with cascade delete — deleting a session removes all its episodes automatically.
+- `episodes_fts` is an FTS5 virtual table that mirrors `user_message` and `ai_response`. It is kept in sync via SQL triggers on INSERT/UPDATE/DELETE. On service startup, the FTS index is fully rebuilt from live episode data.
+- Several columns (`sessions.name`, `sessions.project_id`, `projects.isolated`, etc.) were added as migrations using `ALTER TABLE` wrapped in individual try-catch blocks. Failures are silently swallowed — if a column already exists, the alter fails and the service continues. The `idx_summaries_project` index is defined twice (benign duplicate).
+- `summaries` rows with `session_id IS NULL` and a `project_id` represent project-level overviews, not session summaries. This distinction is how `GET /projects/:id/overview` works.
+
+## Async Pipeline: Episode Creation
+
+`POST /episodes` returns a 201 as soon as the SQLite insert succeeds. Two background tasks run after without blocking the response:
+
+1. **Embedding** — Fetches a vector from embedding-service, stores to Qdrant with `{sessionId, createdAt}` as payload metadata.
+2. **Entity extraction** — Sends the episode text to Ollama (`qwen2.5:3b`, temp 0.1, 200 tokens) and upserts any recognized entities to both SQLite and Qdrant.
+
+Both tasks catch and log errors silently. An episode can exist in SQLite with no corresponding Qdrant point if either step fails.
+
+## Entity Extraction Details
+
+`src/entities/extraction.js`:
+
+- Fetches the last 20 known entities from SQLite before prompting the model, so the prompt can ask for name/type consistency with existing entries.
+- Recognized types: `person`, `place`, `project`, `technology`, `concept`, `organization` — anything else is discarded.
+- Ignores a hardcoded list of low-value names (`hello`, `thanks`, `good morning`, etc.).
+- Extracts JSON using a regex (`{...}`) applied to raw model output, so surrounding prose doesn't break parsing.
+- Entity upsert uses `ON CONFLICT(name, type) DO UPDATE` — preserves existing `notes` if the new extraction returns null (`COALESCE(entities.notes, excluded.notes)`).
+- After upsert, embeds each entity as `"${name} (${type}): ${notes}"` and stores to Qdrant with `projectId` in the payload for project-scoped filtering.
+
+## Summarization Strategy
+
+`src/summarization/project.js`:
+
+- Preferred path: generate a project overview from existing **session-level summaries** (higher-level abstraction, shorter input).
+- Fallback path: if no session summaries exist, summarize raw episodes directly (up to `SUMMARIES.MAX_PROJECT_EPISODE_LIMIT`).
+- Both paths truncate input at `SUMMARIES.MAX_SUMMARY_CHARS` (30,000 chars) by slicing from the end (most recent content wins).
+- Strips ChatML tokens from the Ollama response (`<|im_start|>`, `<|im_end|>`).
+- Uses temp 0.2 and `num_predict 1200`.
+
+## Known Quirk: `getRecentEpisodes`
+
+`src/episodic/index.js` `getRecentEpisodes(sessionId, limit)` has a parameter mismatch — the SQLite query binds only `limit`, not `sessionId`, so it returns recent episodes across **all sessions**. Orchestration-service uses `getEpisodesBySession()` (the paginated route) instead, so this bug is not visible in normal operation. Don't rely on `getRecentEpisodes` when you need session-scoped results.
+
+## Qdrant Client
+
+`src/semantic/index.js` creates the Qdrant client lazily on first use and reuses it. All three collections (`episodes`, `entities`, `summaries`) are created at startup if missing. There is no connection health check — if Qdrant is unreachable, semantic operations throw at call time.
+
+## API Endpoints Quick Reference
+
+| Method | Path | Notes |
+|---|---|---|
+| GET | `/health` | Static response, no dependency checks |
+| GET/POST | `/sessions` | POST requires `externalId`; duplicate → 409 |
+| GET/PATCH | `/sessions/by-external/:externalId` | PATCH accepts `name`, `projectId` |
+| DELETE | `/sessions/by-external/:externalId` | Cascades to episodes, summaries, relationships |
+| GET/POST | `/episodes` | POST triggers async embedding + entity extraction |
+| GET | `/episodes/search` | FTS5 search; route must precede `/:id` |
+| GET | `/sessions/:id/episodes` | Paginated, ordered `created_at DESC` |
+| DELETE | `/episodes/:id` | Removes from SQLite + async Qdrant delete |
+| POST | `/entities` | Upsert by `(name, type)` |
+| GET | `/entities/by-type/:type` | All entities of given type |
+| GET/DELETE | `/entities/:id` | |
+| POST | `/relationships` | Upsert by `(fromId, toId, label)`; conflict = no-op |
+| GET | `/entities/:id/relationships` | Outbound only |
+| DELETE | `/relationships` | Body: `fromId`, `toId`, `label` |
+| GET/POST | `/projects` | POST requires non-empty `name` |
+| GET/PATCH/DELETE | `/projects/:id` | |
+| POST | `/projects/:id/summarize` | On-demand overview generation; 422 if no data |
+| GET | `/projects/:id/overview` | Returns null (not 404) if no overview exists |
+| GET | `/projects/:id/summaries` | All summaries for project |
+| POST | `/summaries` | Requires `content` + at least one of `sessionId`/`projectId` |
+| GET | `/sessions/:id/summaries` | |
+| PATCH/DELETE | `/summaries/:id` | |
diff --git a/packages/memory-service/src/episodic/index.js b/packages/memory-service/src/episodic/index.js
index 5e870fc..622d430 100644
--- a/packages/memory-service/src/episodic/index.js
+++ b/packages/memory-service/src/episodic/index.js
@@ -163,7 +163,7 @@ function getRecentEpisodes(sessionId, limit = EPISODIC.DEFAULT_RECENT_LIMIT) {
     ORDER BY created_at DESC
     LIMIT ?
   `);
-  return stmt.all(limit).map(parseRow);
+  return stmt.all(sessionId, limit).map(parseRow);
 }