roadmap phase 1 complete

Storme-bit
2026-04-27 03:10:39 -07:00
parent 9fe8e568cf
commit 1a97b19280
19 changed files with 759 additions and 281 deletions

@@ -42,9 +42,10 @@ src/
│ ├── inference.js # HTTP client for inference service
│ ├── embedding.js # HTTP client for embedding service
│ ├── qdrant.js # HTTP client for Qdrant (direct vector search)
│ ├── graph.js # HTTP client for memory-service graph endpoints
│ └── summarization.js # Session summarisation — triggers after each episode
├── chat/
│ └── index.js # Core pipeline — context assembly, isolation, auto-naming
│ └── index.js # Core pipeline — context assembly, graph expansion, auto-naming
├── config/
│ └── settings.js # Settings load/save — reads/writes data/settings.json
├── routes/
@@ -71,7 +72,7 @@ via `appSettings.load()` — changes apply immediately without a service restart
|---|---|---|
| `recentEpisodeLimit` | 5 | Recent episodes injected into prompt |
| `semanticLimit` | 5 | Semantic search results injected into prompt |
| `scoreThreshold` | 0.75 | Minimum similarity score for semantic results |
| `scoreThreshold` | 0.5 | Minimum similarity score for semantic results |
| `modelsFolderPath` | `/mnt/nexus-models` | Path to folder containing .gguf files |
| `temperature` | 0.7 | Inference temperature |
| `repeatPenalty` | 1.1 | Repeat token penalty |
@@ -104,20 +105,27 @@ difference is how the inference response is delivered to the client.
episodes. Deduplicated against recent episodes. Non-critical.
6. **Entity search** — query `entities` Qdrant collection filtered by
`projectId`. Non-project sessions receive no entity context. Non-critical.
`projectId`. Returns entity IDs alongside Qdrant payload data (the Qdrant
point ID equals the SQLite entity ID). Non-critical.
7. **Prompt assembly** — combine system prompt, entity context, semantic
7. **Graph neighborhood expansion** — call `POST /graph/neighbors` on
memory-service with the entity IDs from step 6. Returns a 1-hop subgraph
`{ nodes, edges }` — entity objects plus the relationships connecting them.
If no entities were found or the graph call fails, the pipeline falls back to
a flat entity list (no edges). Non-critical.
8. **Prompt assembly** — combine system prompt, graph context, semantic
episodes, recent episodes, and user message.
8. **Inference** — send to inference service. `/chat` awaits full response;
9. **Inference** — send to inference service. `/chat` awaits full response;
`/chat/stream` pipes SSE chunks to the client.
9. **Episode write** — write exchange back to memory with `projectId`.
10. **Episode write** — write exchange back to memory with `projectId`.
10. **Summarisation trigger**: `triggerSummary(session, allEpisodes)` called
11. **Summarisation trigger**: `triggerSummary(session, allEpisodes)` called
fire-and-forget. See `summarization.md` for full details.
11. **Auto-naming** — on first message with no session name, fires a secondary
12. **Auto-naming** — on first message with no session name, fires a secondary
inference call (max 20 tokens, temperature 0.3) to generate a session name.
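
The graph neighborhood expansion step and its fallback could look roughly like
the sketch below. It is not the code in `services/graph.js`. The request body
shape (`{ entityIds }`), the `MEMORY_SERVICE_URL` variable, the entity field
names, and the function names `expandNeighborhood` and `flatEntityList` are
all assumptions made for illustration. Only the endpoint path, the port, and
the `{ nodes, edges }` response shape come from the description above.

```javascript
// Sketch only: base URL variable, request body shape, and entity fields are assumed.
const MEMORY_SERVICE_URL = process.env.MEMORY_SERVICE_URL || 'http://localhost:3002';

// Fallback used when the graph cannot be expanded: seed entities, no edges.
function flatEntityList(seedEntities) {
  return { nodes: seedEntities, edges: [] };
}

// Expand the seed entities from the entity search step into a 1-hop subgraph.
// Non-critical: any failure degrades to the flat list instead of throwing.
// `fetchImpl` is injectable so the function can be exercised without a network.
async function expandNeighborhood(seedEntities, fetchImpl = fetch) {
  if (seedEntities.length === 0) return flatEntityList(seedEntities);
  try {
    const res = await fetchImpl(`${MEMORY_SERVICE_URL}/graph/neighbors`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ entityIds: seedEntities.map((e) => e.id) }),
    });
    if (!res.ok) throw new Error(`graph/neighbors returned ${res.status}`);
    const { nodes, edges } = await res.json();
    if (!Array.isArray(nodes) || !Array.isArray(edges)) {
      throw new Error('unexpected response shape');
    }
    return { nodes, edges };
  } catch (err) {
    console.warn('graph expansion failed, using flat entity list:', err.message);
    return flatEntityList(seedEntities);
  }
}
```

Because the fallback returns the same `{ nodes, edges }` shape as a successful
call, prompt assembly can treat both cases identically.
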
### Prompt Structure
@@ -125,8 +133,9 @@ difference is how the inference response is delivered to the client.
```
[Resolved system prompt]
Here is what you know about entities relevant to this conversation:
Here is what you know about entities relevant to this conversation and their connections:
- {name} ({type}): {notes}
→ {label} {neighbor_name} ({neighbor_type})
---
Here are some relevant memories from earlier conversations:
User: {past user message}
@@ -141,6 +150,12 @@ User: {current message}
Assistant:
```
The entity block renders the full graph neighborhood — seed entities matched
by Qdrant search plus any neighbors pulled in by 1-hop traversal. Each entity
shows its `notes` and any outbound relationships with their targets. Neighbor
nodes that have no outbound edges within the subgraph appear without connection
lines.
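
The rendering rule described above can be sketched as a small pure function.
This is a minimal illustration, not the code in `chat/index.js`. The edge
fields `sourceId`, `targetId`, and `label`, the node fields `id`, `name`,
`type`, and `notes`, the two-space indent before the arrow, and the function
name `renderGraphContext` are assumptions.

```javascript
// Sketch only: node and edge field names are assumed, not taken from the service.
function renderGraphContext({ nodes, edges }) {
  if (nodes.length === 0) return '';
  const byId = new Map(nodes.map((n) => [n.id, n]));
  const lines = [
    'Here is what you know about entities relevant to this conversation and their connections:',
  ];
  for (const node of nodes) {
    lines.push(`- ${node.name} (${node.type}): ${node.notes ?? ''}`);
    // List only outbound edges whose target is also inside the subgraph.
    for (const edge of edges) {
      const target = byId.get(edge.targetId);
      if (edge.sourceId !== node.id || !target) continue;
      lines.push(`  → ${edge.label} ${target.name} (${target.type})`);
    }
  }
  return lines.join('\n');
}

// Usage: one seed entity with one neighbor that has no outbound edges.
const block = renderGraphContext({
  nodes: [
    { id: 1, name: 'Nexus', type: 'project', notes: 'Local AI stack' },
    { id: 2, name: 'Qdrant', type: 'service', notes: 'Vector store' },
  ],
  edges: [{ sourceId: 1, targetId: 2, label: 'uses' }],
});
console.log(block);
```

In this example the `Qdrant` node is printed with its notes but no arrow lines,
which matches the behaviour described for neighbors without outbound edges.
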
## Summarisation
After each episode write, `triggerSummary` is called fire-and-forget. It
@@ -199,4 +214,7 @@ handle /health* { reverse_proxy localhost:4000 }
After updating: `caddy reload --config /path/to/Caddyfile`
For all HTTP endpoints, see `api-routes.md`.
> Note: `/graph` routes are on the memory-service (port 3002) and are called
> internally by orchestration — they do not need a Caddy entry.
For all HTTP endpoints, see `api-routes.md`.