roadmap phase 1 complete
@@ -42,9 +42,10 @@ src/
 │ ├── inference.js # HTTP client for inference service
 │ ├── embedding.js # HTTP client for embedding service
 │ ├── qdrant.js # HTTP client for Qdrant (direct vector search)
+│ ├── graph.js # HTTP client for memory-service graph endpoints
 │ └── summarization.js # Session summarisation — triggers after each episode
 ├── chat/
-│ └── index.js # Core pipeline — context assembly, isolation, auto-naming
+│ └── index.js # Core pipeline — context assembly, graph expansion, auto-naming
 ├── config/
 │ └── settings.js # Settings load/save — reads/writes data/settings.json
 ├── routes/
@@ -71,7 +72,7 @@ via `appSettings.load()` — changes apply immediately without a service restart
 |---|---|---|
 | `recentEpisodeLimit` | 5 | Recent episodes injected into prompt |
 | `semanticLimit` | 5 | Semantic search results injected into prompt |
-| `scoreThreshold` | 0.75 | Minimum similarity score for semantic results |
+| `scoreThreshold` | 0.5 | Minimum similarity score for semantic results |
 | `modelsFolderPath` | `/mnt/nexus-models` | Path to folder containing .gguf files |
 | `temperature` | 0.7 | Inference temperature |
 | `repeatPenalty` | 1.1 | Repeat token penalty |
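Review note: the hunk above lowers `scoreThreshold` from 0.75 to 0.5, which admits more semantic results into the prompt. A minimal sketch of the effect, assuming saved settings are overlaid on the table defaults and that "minimum" is inclusive. `resolveSettings`, `selectSemanticHits`, and the `{ id, score }` hit shape are hypothetical names for illustration, not taken from the repository.

```javascript
// Defaults copied from the settings table in this diff (post-change values).
const DEFAULTS = {
  recentEpisodeLimit: 5,
  semanticLimit: 5,
  scoreThreshold: 0.5,
  modelsFolderPath: '/mnt/nexus-models',
  temperature: 0.7,
  repeatPenalty: 1.1,
};

// Hypothetical helper: overlay saved values (e.g. parsed from
// data/settings.json) on top of the defaults.
function resolveSettings(saved = {}) {
  return { ...DEFAULTS, ...saved };
}

// Hypothetical helper: keep hits at or above the threshold, best first,
// capped at semanticLimit. Each hit is assumed to carry a numeric `score`.
function selectSemanticHits(hits, settings) {
  return hits
    .filter((hit) => hit.score >= settings.scoreThreshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, settings.semanticLimit);
}

const settings = resolveSettings({ semanticLimit: 2 });
const hits = [
  { id: 'a', score: 0.62 },
  { id: 'b', score: 0.48 },
  { id: 'c', score: 0.81 },
];
console.log(selectSemanticHits(hits, settings).map((h) => h.id)); // [ 'c', 'a' ]
```

With the previous threshold of 0.75, only hit `c` would pass in this example.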
@@ -104,20 +105,27 @@ difference is how the inference response is delivered to the client.
 episodes. Deduplicated against recent episodes. Non-critical.
 
 6. **Entity search** — query `entities` Qdrant collection filtered by
-`projectId`. Non-project sessions receive no entity context. Non-critical.
+`projectId`. Returns entity IDs alongside Qdrant payload data (the Qdrant
+point ID equals the SQLite entity ID). Non-critical.
 
-7. **Prompt assembly** — combine system prompt, entity context, semantic
+7. **Graph neighborhood expansion** — call `POST /graph/neighbors` on
+memory-service with the entity IDs from step 6. Returns a 1-hop subgraph
+`{ nodes, edges }` — entity objects plus the relationships connecting them.
+If no entities were found or the graph call fails, falls back to flat entity
+list (no edges). Non-critical.
+
+8. **Prompt assembly** — combine system prompt, graph context, semantic
 episodes, recent episodes, and user message.
 
-8. **Inference** — send to inference service. `/chat` awaits full response;
+9. **Inference** — send to inference service. `/chat` awaits full response;
 `/chat/stream` pipes SSE chunks to the client.
 
-9. **Episode write** — write exchange back to memory with `projectId`.
+10. **Episode write** — write exchange back to memory with `projectId`.
 
-10. **Summarisation trigger** — `triggerSummary(session, allEpisodes)` called
+11. **Summarisation trigger** — `triggerSummary(session, allEpisodes)` called
 fire-and-forget. See `summarization.md` for full details.
 
-11. **Auto-naming** — on first message with no session name, fires a secondary
+12. **Auto-naming** — on first message with no session name, fires a secondary
 inference call (max 20 tokens, temperature 0.3) to generate a session name.
 
 ### Prompt Structure
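Review note: the new step 7 in the hunk above (graph neighborhood expansion with a flat-list fallback) could look roughly like the sketch below. The request body field `entityIds`, the helper name `expandNeighborhood`, and the injected `fetchFn` are assumptions made so the example runs without a live memory-service. Only the `POST /graph/neighbors` route, the `{ nodes, edges }` response shape, and port 3002 come from the diff.

```javascript
// Hypothetical helper for step 7. `fetchFn` is injected for testability;
// on Node 18+ the global `fetch` could be passed instead.
async function expandNeighborhood(entities, { baseUrl, fetchFn }) {
  // Fallback shape: the seed entities as nodes, with no edges.
  const flat = { nodes: entities, edges: [] };
  if (entities.length === 0) return flat;

  try {
    const res = await fetchFn(`${baseUrl}/graph/neighbors`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      // Assumed request body; the real field name may differ.
      body: JSON.stringify({ entityIds: entities.map((e) => e.id) }),
    });
    if (!res.ok) return flat;
    const { nodes, edges } = await res.json();
    return { nodes: nodes ?? entities, edges: edges ?? [] };
  } catch {
    // Non-critical step: any network or parse failure degrades to the flat list.
    return flat;
  }
}

// Demonstration with a stub that simulates an unreachable memory-service.
const unreachable = async () => {
  throw new Error('memory-service unreachable');
};
expandNeighborhood([{ id: 1, name: 'Nexus' }], {
  baseUrl: 'http://localhost:3002',
  fetchFn: unreachable,
}).then((graph) => console.log(graph.edges.length)); // 0
```

Returning the same `{ nodes, edges }` shape on every path means prompt assembly does not need a separate branch for the fallback case.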
@@ -125,8 +133,9 @@ difference is how the inference response is delivered to the client.
 ```
 [Resolved system prompt]
 
-Here is what you know about entities relevant to this conversation:
+Here is what you know about entities relevant to this conversation and their connections:
 - {name} ({type}): {notes}
+→ {label} {neighbor_name} ({neighbor_type})
 ---
 Here are some relevant memories from earlier conversations:
 User: {past user message}
@@ -141,6 +150,12 @@ User: {current message}
 Assistant:
 ```
 
+The entity block renders the full graph neighborhood — seed entities matched
+by Qdrant search plus any neighbors pulled in by 1-hop traversal. Each entity
+shows its `notes` and any outbound relationships with their targets. Neighbor
+nodes that have no outbound edges within the subgraph appear without connection
+lines.
+
 ## Summarisation
 
 After each episode write, `triggerSummary` is called fire-and-forget. It
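Review note: the paragraph added in the hunk above describes how the entity block is rendered from the subgraph. A minimal sketch of that rendering follows, assuming nodes shaped like `{ id, name, type, notes }` and edges shaped like `{ source, target, label }`. These field names, the two-space indent on connection lines, and the helper name `renderEntityBlock` are assumptions. Only the two line templates come from the diff.

```javascript
// Hypothetical renderer for the entity block of the prompt.
// Assumed shapes (not confirmed by the source):
//   node: { id, name, type, notes }    edge: { source, target, label }
function renderEntityBlock({ nodes, edges }) {
  const byId = new Map(nodes.map((n) => [n.id, n]));
  const lines = [];
  for (const node of nodes) {
    lines.push(`- ${node.name} (${node.type}): ${node.notes}`);
    // Only outbound edges whose target is inside the subgraph get a line,
    // so nodes without such edges appear with no connection lines.
    for (const edge of edges) {
      const target = byId.get(edge.target);
      if (edge.source === node.id && target) {
        lines.push(`  → ${edge.label} ${target.name} (${target.type})`);
      }
    }
  }
  return lines.join('\n');
}

const block = renderEntityBlock({
  nodes: [
    { id: 1, name: 'Nexus', type: 'project', notes: 'Local assistant stack' },
    { id: 2, name: 'Qdrant', type: 'tool', notes: 'Vector store' },
  ],
  edges: [{ source: 1, target: 2, label: 'uses' }],
});
console.log(block);
// - Nexus (project): Local assistant stack
//   → uses Qdrant (tool)
// - Qdrant (tool): Vector store
```

With the flat fallback (`edges: []`), the same function emits one line per entity, close to the pre-change entity list.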
@@ -199,4 +214,7 @@ handle /health* { reverse_proxy localhost:4000 }
 
 After updating: `caddy reload --config /path/to/Caddyfile`
 
-For all HTTP endpoints, see `api-routes.md`.
+> Note: `/graph` routes are on the memory-service (port 3002) and are called
+> internally by orchestration — they do not need a Caddy entry.
+
+For all HTTP endpoints, see `api-routes.md`.