retrieval fusion
@@ -24,9 +24,10 @@ Default port: **4000**. Depends on memory-service, embedding-service, inference-
 - No project: `must: [sessionId == this session]`
 - Project: `should: [sessionId == s1, sessionId == s2, ...]` across all project sessions
 - Dedup against recent episode IDs before including.
-5. Embed and search Qdrant ENTITIES (filtered by `projectId` if in a project). Returns entity IDs alongside payload; the Qdrant point ID equals the SQLite entity ID.
-6. Expand matched entities into a 1-hop graph neighborhood via `POST /graph/neighbors` on the memory-service. Returns `{ nodes, edges }`: the full entity objects plus connecting relationships. Falls back to flat entity list (no edges) if the graph call fails.
-7. Build prompt in this fixed order: **system prompt → graph context → semantic episodes → recent episodes → user message → "Assistant:"**
+5. Run **fused episode retrieval** via `getFusedEpisodes`: Qdrant semantic search and FTS5 keyword search run in parallel, both filtered against `recentIds`, then merged via Reciprocal Rank Fusion (RRF). If `keywordWeight` is `0`, the FTS call is skipped. Returns top `semanticLimit` episodes by fused score.
+6. Embed and search Qdrant ENTITIES (filtered by `projectId` if in a project). Returns entity IDs alongside payload; the Qdrant point ID equals the SQLite entity ID.
+7. Expand matched entities into a 1-hop graph neighborhood via `POST /graph/neighbors` on the memory-service. Returns `{ nodes, edges }`: the full entity objects plus connecting relationships. Falls back to flat entity list (no edges) if the graph call fails.
+8. Build prompt in this fixed order: **system prompt → graph context → fused episodes → recent episodes → user message → "Assistant:"**
 
 The ordering prioritizes established facts (graph context) and relevant past context (semantic) over pure recency.
 
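The fixed prompt order in step 8 can be illustrated with a small sketch. This is not the service's actual builder: the function name `buildPromptSketch`, the plain-string sections, the `User:` prefix, and the blank-line separator are all assumptions made for illustration. Only the ordering comes from the text above.

```javascript
// Sketch only: each section is assumed to be an already-rendered string.
// The "User:" prefix and the "\n\n" separator are illustrative assumptions.
function buildPromptSketch({ systemPrompt, graphContext, fusedEpisodes, recentEpisodes, userMessage }) {
  // Order taken from the text: system prompt, graph context, fused episodes,
  // recent episodes, user message, then the trailing "Assistant:" cue.
  const sections = [
    systemPrompt,
    graphContext,
    fusedEpisodes,
    recentEpisodes,
    `User: ${userMessage}`,
  ];

  // Drop sections that are missing or empty so the order of the rest is preserved.
  const present = sections.filter((s) => typeof s === 'string' && s.trim() !== '');

  return present.join('\n\n') + '\n\nAssistant:';
}
```

Keeping the order in a single array makes the priority explicit: moving a section up or down is a one-line change, and an empty section (for example, no graph context) simply disappears without shifting the others.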
@@ -100,6 +101,18 @@ When the existing summary's token count exceeds `SUMMARIES.MAX_SUMMARY_TOKENS`,
 `searchEntities` checks `projectId !== null && projectId !== undefined` before applying the filter; a session with no project skips the filter entirely and searches globally.
 
+## Retrieval Fusion (`src/chat/index.js`)
+
+Three functions handle fusion, all pure or lightly async, all non-critical:
+
+- **`getFTSResults(userMessage, { limit, sessionIds })`**: calls `memory.searchEpisodes`; returns `[]` and logs a warning on failure
+- **`fuseEpisodeResults(semanticEps, keywordEps, { semanticWeight, keywordWeight, limit })`**: pure RRF implementation. Key guard: FTS-only episodes are only added to the scores Map if `contrib > 0` (prevents score-0 bleed-through when `keywordWeight: 0`)
+- **`getFusedEpisodes(userMessage, session, recentIds, projectSessionIds, settings)`**: orchestrates both paths in `Promise.all`, applies `recentIds` filter to FTS results, calls fusion. Short-circuits FTS call entirely if `keywordWeight === 0`
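To make the `contrib > 0` guard concrete, here is a minimal weighted-RRF sketch. It is not the code from `src/chat/index.js`: the name `fuseEpisodeResultsSketch`, the constant `k = 60` (a commonly used RRF default), the default option values, and the `{ id }` episode shape are assumptions. The project's actual formula is documented in `docs/services/retrieval-fusion.md`.

```javascript
// Sketch only. RRF_K = 60 and the { id, ... } episode shape are assumptions.
const RRF_K = 60;

function fuseEpisodeResultsSketch(
  semanticEps,
  keywordEps,
  { semanticWeight = 1, keywordWeight = 0, limit = 5 } = {}
) {
  const scores = new Map(); // episode id -> { episode, score }

  // Semantic hits seed the map. Ranks are 1-based, hence i + 1.
  semanticEps.forEach((episode, i) => {
    scores.set(episode.id, { episode, score: semanticWeight / (RRF_K + i + 1) });
  });

  keywordEps.forEach((episode, i) => {
    const contrib = keywordWeight / (RRF_K + i + 1);
    const existing = scores.get(episode.id);
    if (existing) {
      // Found by both searches: contributions add up.
      existing.score += contrib;
    } else if (contrib > 0) {
      // Guard described in the text: an FTS-only episode enters the map only
      // if it actually contributes, so keywordWeight 0 cannot leak score-0 entries.
      scores.set(episode.id, { episode, score: contrib });
    }
  });

  return [...scores.values()]
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map(({ episode }) => episode);
}
```

With equal weights, an episode returned by both searches outranks one returned by only the semantic search, because its two reciprocal-rank terms are summed. With `keywordWeight: 0`, the result is the semantic ranking, truncated to `limit`.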
+
+FTS is scoped to `projectSessionIds` if in a project, otherwise `[session.id]`, mirroring Qdrant scoping exactly.
+
+> For RRF formula, weight semantics, and enabling keyword search, see `docs/services/retrieval-fusion.md`.
+
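The orchestration and scoping described for `getFusedEpisodes` might look roughly like the sketch below. It is a hedged illustration, not the real function: `deps.semanticSearch`, `deps.ftsSearch`, and `deps.fuse` are hypothetical injected stand-ins for the Qdrant search, the FTS call, and the fusion step, the "in a project" check is assumed to mean a non-empty `projectSessionIds` array, and the failure handling is inlined here although the text places it in `getFTSResults`.

```javascript
// Hypothetical sketch: `deps.*` are injected stand-ins, not real service clients.
async function getFusedEpisodesSketch(userMessage, session, recentIds, projectSessionIds, settings, deps) {
  const { semanticLimit, semanticWeight, keywordWeight } = settings;

  // Same scope for both paths: all project sessions if any, else this session only.
  // Treating "in a project" as "non-empty array" is an assumption.
  const sessionIds = Array.isArray(projectSessionIds) && projectSessionIds.length > 0
    ? projectSessionIds
    : [session.id];

  const recent = new Set(recentIds);
  const notRecent = (episode) => !recent.has(episode.id);

  // Keyword path is non-critical: skipped when the weight is 0, and a failure
  // degrades to an empty list instead of failing the whole retrieval.
  const keywordPromise = keywordWeight === 0
    ? Promise.resolve([])
    : deps.ftsSearch(userMessage, { limit: semanticLimit, sessionIds }).catch((err) => {
        console.warn('FTS search failed, using semantic results only:', err.message);
        return [];
      });

  // Both searches run concurrently.
  const [semanticEps, keywordEps] = await Promise.all([
    deps.semanticSearch(userMessage, { limit: semanticLimit, sessionIds }),
    keywordPromise,
  ]);

  return deps.fuse(semanticEps.filter(notRecent), keywordEps.filter(notRecent), {
    semanticWeight,
    keywordWeight,
    limit: semanticLimit,
  });
}
```

Passing the same `sessionIds` array to both searches is what keeps the keyword scope identical to the semantic scope, and resolving the keyword path to `[]` on skip or failure lets the fusion step run unchanged in every case.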
 ## Graph Service Client (`src/services/graph.js`)
 
 Thin HTTP client for memory-service graph endpoints. One function: