updated documentation for entity implementation
@@ -76,17 +76,22 @@ difference is how the inference response is delivered to the client.
 recent episodes. Non-critical — if it fails, pipeline continues with
 recency-only context.

-5. **Prompt assembly** — combine system prompt, semantic episodes, recent
-episodes, and user message.
+5. **Entity search** — reuse the embedded user message vector to query the
+`entities` Qdrant collection (score threshold 0.6, limit 5). Returns
+entity payloads (`name`, `type`, `notes`) directly — no SQLite roundtrip
+needed. Non-critical — if it fails, pipeline continues without entity context.

-6. **Inference** — send to inference service. `/chat` awaits full response;
+6. **Prompt assembly** — combine system prompt, entity context, semantic
+episodes, recent episodes, and user message.
+
+7. **Inference** — send to inference service. `/chat` awaits full response;
 `/chat/stream` pipes SSE chunks to the client.

-7. **Episode write** — write the exchange back to memory. Fire-and-forget
+8. **Episode write** — write the exchange back to memory. Fire-and-forget
 for `/chat`; awaited for `/chat/stream` to ensure the full text is
 accumulated before saving.

-8. **Auto-naming** — on `isFirstMessage && !session.name`, fire a secondary
+9. **Auto-naming** — on `isFirstMessage && !session.name`, fire a secondary
 inference call with a naming prompt (max 20 tokens, temperature 0.3) and
 write the result back as `session.name`. Fully fire-and-forget.

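Review note: the entity search step described in this hunk could look roughly like the sketch below. Only the `entities` collection name, the 0.6 score threshold, the limit of 5, and the `name`/`type`/`notes` payload fields come from the documentation. The `VectorSearch` type, the `searchEntities` function, and the `log` callback are hypothetical stand-ins, not the project's actual code or the Qdrant client API.

```typescript
// Payload fields as listed in the docs.
interface EntityPayload {
  name: string;
  type: string;
  notes: string;
}

interface ScoredEntity {
  score: number;
  payload: EntityPayload;
}

// Hypothetical stand-in for whatever client call queries the vector store.
type VectorSearch = (
  collection: string,
  vector: number[],
  opts: { scoreThreshold: number; limit: number },
) => Promise<ScoredEntity[]>;

const ENTITY_SCORE_THRESHOLD = 0.6; // from the docs
const ENTITY_LIMIT = 5; // from the docs

async function searchEntities(
  search: VectorSearch,
  messageVector: number[], // the already-embedded user message, reused here
  log: (msg: string) => void = () => {},
): Promise<EntityPayload[]> {
  try {
    const hits = await search("entities", messageVector, {
      scoreThreshold: ENTITY_SCORE_THRESHOLD,
      limit: ENTITY_LIMIT,
    });
    // Payloads are returned directly, so no second lookup is needed.
    return hits.map((hit) => hit.payload);
  } catch (err) {
    // Non-critical step: report the failure and continue without entities.
    log(`entity search failed: ${String(err)}`);
    return [];
  }
}
```

Returning an empty list on failure lets prompt assembly treat "no entities found" and "search unavailable" the same way, which matches the non-critical behaviour the step describes.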
@@ -95,6 +100,10 @@ difference is how the inference response is delivered to the client.
 ```
 [System prompt]

+Here is what you know about entities relevant to this conversation:
+- {name} ({type}): {notes}
+... (up to 5 entity results)
+---
 Here are some relevant memories from earlier conversations:
 User: {past user message}
 Assistant: {past ai response}
@@ -110,8 +119,9 @@ User: {current message}
 Assistant:
 ```

-Semantic episodes appear before recent episodes so the model sees
-long-range context before the immediate conversation flow.
+Entity context appears first — before episodic memory — because structured
+facts about known entities are the most stable and reliable context. Semantic
+episodes follow, then recent episodes as the immediate conversation flow.

 ## SSE Stream Format

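Review note: the section ordering in the updated prompt template (entities, then semantic episodes, then recent episodes, then the current message) can be illustrated with a small sketch. The two heading strings and the `User:`/`Assistant:` line format are taken from the template in this diff. The function and type names, the `---` separator between every section, and the choice to skip empty sections are assumptions. The heading for the recent-episodes section is not visible in the diff, so the sketch emits those turns without one.

```typescript
interface Entity {
  name: string;
  type: string;
  notes: string;
}

interface Episode {
  userMessage: string;
  aiResponse: string;
}

// Renders episodes in the "User: ... / Assistant: ..." form used by the template.
function formatEpisodes(episodes: Episode[]): string {
  return episodes
    .map((e) => `User: ${e.userMessage}\nAssistant: ${e.aiResponse}`)
    .join("\n");
}

function assemblePrompt(input: {
  systemPrompt: string;
  entities: Entity[];
  semanticEpisodes: Episode[];
  recentEpisodes: Episode[];
  userMessage: string;
}): string {
  const sections: string[] = [];

  // 1. Entity context first: structured facts about known entities.
  if (input.entities.length > 0) {
    const lines = input.entities.map((e) => `- ${e.name} (${e.type}): ${e.notes}`);
    sections.push(
      "Here is what you know about entities relevant to this conversation:\n" +
        lines.join("\n"),
    );
  }

  // 2. Semantic (long-range) episodes.
  if (input.semanticEpisodes.length > 0) {
    sections.push(
      "Here are some relevant memories from earlier conversations:\n" +
        formatEpisodes(input.semanticEpisodes),
    );
  }

  // 3. Recent episodes (heading omitted; it is not shown in this diff).
  if (input.recentEpisodes.length > 0) {
    sections.push(formatEpisodes(input.recentEpisodes));
  }

  // 4. Current message, ending with an open "Assistant:" turn.
  sections.push(`User: ${input.userMessage}\nAssistant:`);

  // Assumed layout: blank line after the system prompt, "---" between sections.
  return `${input.systemPrompt}\n\n${sections.join("\n---\n")}`;
}
```

Skipping empty sections means a failed or empty entity search leaves the prompt in the same shape it would have had before entity context was introduced.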