updated documentation for entity implementation
This commit is contained in:
@@ -28,6 +28,8 @@ service to generate and store a vector in Qdrant.
|
||||
| SQLITE_PATH | Yes | — | Path to SQLite database file |
|
||||
| QDRANT_URL | No | http://localhost:6333 | Qdrant instance URL |
|
||||
| EMBEDDING_SERVICE_URL | No | http://localhost:3003 | Embedding service URL |
|
||||
| EXTRACTION_URL | No | http://localhost:11434 | Ollama URL for entity extraction |
|
||||
| EXTRACTION_MODEL | No | qwen2.5:3b | Ollama model used for entity extraction |
|
||||
|
||||
## Internal Structure
|
||||
|
||||
@@ -42,7 +44,8 @@ src/
|
||||
├── semantic/
|
||||
│ └── index.js # Qdrant collection management, upsert, search, delete
|
||||
├── entities/
|
||||
│ └── index.js # Entity + relationship CRUD
|
||||
│ ├── index.js # Entity + relationship CRUD
|
||||
│ └── extraction.js # Automatic entity extraction via qwen2.5:3b on Ollama
|
||||
└── index.js # Express app + all route definitions
|
||||
```
|
||||
|
||||
@@ -143,6 +146,32 @@ constraints to prevent duplicates:
|
||||
- `UNIQUE(from_id, to_id, label)` on relationships
|
||||
- `ON DELETE CASCADE` on relationship foreign keys
|
||||
|
||||
### Automatic Entity Extraction
|
||||
|
||||
After each episode is saved, `extraction.js` automatically extracts named
|
||||
entities from the conversation using `qwen2.5:3b` running on Ollama (Mini PC 1).
|
||||
This runs **fire-and-forget** — the episode is already saved and returned
|
||||
before extraction begins.
|
||||
|
||||
**Entity types extracted:** `person`, `place`, `project`, `technology`,
|
||||
`concept`, `organization`
|
||||
|
||||
The extraction prompt uses ChatML format (native to qwen2.5) and primes the
|
||||
response by ending with `[` to steer the model directly into JSON array output.
|
||||
A list of already-known entities is injected into the prompt so the model
|
||||
reuses existing `(name, type)` pairs rather than creating duplicates with
|
||||
different types.
|
||||
|
||||
After extraction, each entity is:
|
||||
1. Upserted into SQLite via `upsertEntity` — notes are only written if
|
||||
the entity is new (`COALESCE(entities.notes, excluded.notes)` prevents
|
||||
overwriting existing notes with speculative updates)
|
||||
2. Embedded via the embedding service and upserted into the `entities`
|
||||
Qdrant collection with `{ name, type, notes }` as payload
|
||||
|
||||
The Qdrant payload stores enough information to reconstruct entity context
|
||||
at retrieval time without a SQLite roundtrip.
|
||||
|
||||
## Project Delete Behaviour
|
||||
|
||||
Deleting a project runs as a transaction — it first nulls out `project_id`
|
||||
|
||||
Reference in New Issue
Block a user