
# Entity Extraction

**Location:** `packages/memory-service/src/entities/extraction.js`
**Triggered by:** Episode creation (`POST /episodes`)
**Model:** `qwen2.5:3b` via Ollama (configurable via `EXTRACTION_MODEL` env var)

## Purpose

After each episode is saved to SQLite, the extraction pipeline runs
asynchronously in the background to identify named entities and the
relationships between them. Results are written back to SQLite and
embedded into Qdrant. Because the pipeline is not awaited, it never delays
the episode response.

## Trigger

`createEpisode()` in `episodic/index.js` calls `extractAndStoreEntities()`
immediately after the SQLite insert, without awaiting it:

```js
extractAndStoreEntities(userMessage, aiResponse, episode.id, projectId)
  .catch(err => logger.error(`Failed to extract entities for episode ${episode.id}:`, err.message));
```

If extraction rejects, the episode is unaffected: the `.catch` handler logs
the error and swallows it.

## Model Settings

| Setting | Value | Notes |
|---|---|---|
| Model | `qwen2.5:3b` | Ollama, configurable via `EXTRACTION_MODEL` |
| Temperature | 0.1 | Low, for consistent, near-deterministic output |
| `num_predict` | 1500 | Higher ceiling to accommodate entity + relationship JSON |
| `format` | `'json'` | Ollama constrained decoding — enforces valid JSON output |
| Prompt format | ChatML | `<\|im_start\|>` / `<\|im_end\|>` tokens |
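
As a rough illustration, the settings above could map onto an Ollama `/api/generate` request body as sketched below. The helper name `buildGenerateRequest`, the `raw: true` flag (so the hand-built ChatML prompt is sent without Ollama applying its own template), and `stream: false` are assumptions made for this sketch, not details confirmed by this page.

```js
// Hypothetical helper, not the service's actual code: maps the documented
// settings onto an Ollama /api/generate request body.
const DEFAULT_EXTRACTION_MODEL = 'qwen2.5:3b';

function buildGenerateRequest(prompt, env = process.env) {
  return {
    // EXTRACTION_MODEL overrides the default model when set.
    model: env.EXTRACTION_MODEL || DEFAULT_EXTRACTION_MODEL,
    prompt,
    raw: true,      // assumption: the prompt already carries ChatML tokens
    stream: false,  // assumption: one complete response instead of chunks
    format: 'json', // Ollama JSON mode
    options: {
      temperature: 0.1,
      num_predict: 1500,
    },
  };
}
```

A caller would serialize this object with `JSON.stringify` and POST it to `/api/generate` on the Ollama host.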

## Prompt Structure

The prompt is built by `buildExtractionPrompt()`. It includes:

1. **System message**: declares the model's role as an entity and relationship extractor
2. **Instructions**: entity types, field rules, relationship label format, required JSON schema
3. **Known entities block**: the 20 most recently inserted entities from SQLite (ordered by `rowid DESC`), included to encourage consistent name/type pairs across conversations
4. **Conversation**: the raw user message and AI response, wrapped in `--- CONVERSATION ---` / `--- END CONVERSATION ---` delimiters

```
<|im_start|>system
You are a named entity and relationship extractor. You output only valid JSON.
<|im_end|>
<|im_start|>user
Read the conversation below and extract all named entities and the relationships between them.
Entity types: person, place, project, technology, concept, organization
...
Return this exact JSON structure:
{ "entities": [...], "relationships": [...] }

Already known entities (use these exact name and type values if the same entity appears):
- "NexusAI" (project)
- "Alice" (person)

--- CONVERSATION ---
User: ...
Assistant: ...
--- END CONVERSATION ---
<|im_end|>
<|im_start|>assistant
```
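
The layout above could be assembled along these lines. This is a minimal sketch, not the real `buildExtractionPrompt()`: the parameter list is assumed, the elided instruction text is passed in as a hypothetical `instructions` argument instead of being reproduced, and omitting the known-entities block when the list is empty is also an assumption.

```js
// Hypothetical sketch of a prompt builder that follows the documented layout.
function buildExtractionPrompt(userMessage, aiResponse, knownEntities = [], instructions = '') {
  const lines = [
    '<|im_start|>system',
    'You are a named entity and relationship extractor. You output only valid JSON.',
    '<|im_end|>',
    '<|im_start|>user',
    instructions, // assumption: the full instruction text is supplied by the caller
    '',
  ];

  // Assumption: the known-entities block is skipped when nothing is known yet.
  if (knownEntities.length > 0) {
    lines.push('Already known entities (use these exact name and type values if the same entity appears):');
    for (const { name, type } of knownEntities) {
      lines.push(`- "${name}" (${type})`);
    }
    lines.push('');
  }

  lines.push(
    '--- CONVERSATION ---',
    `User: ${userMessage}`,
    `Assistant: ${aiResponse}`,
    '--- END CONVERSATION ---',
    '<|im_end|>',
    '<|im_start|>assistant',
    '' // trailing newline so the model continues on the assistant turn
  );

  return lines.join('\n');
}
```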

## Expected JSON Output

```json
{
  "entities": [
    { "name": "Alice", "type": "person", "notes": "Software engineer working on NexusAI." },
    { "name": "NexusAI", "type": "project", "notes": "A modular AI assistant with persistent memory." }
  ],
  "relationships": [
    {
      "from": "Alice", "fromType": "person",
      "to": "NexusAI", "toType": "project",
      "label": "works_on",
      "notes": "Alice is the primary developer."
    }
  ]
}
```

Relationship labels use **snake_case verbs** (e.g. `works_on`, `manages`, `uses`,
`knows`, `located_in`, `part_of`, `created_by`).

## JSON Parsing

The raw model response is matched with `/\{[\s\S]*\}/` before parsing. The
pattern is greedy, so it captures everything from the first `{` to the last
`}`, which tolerates preamble or trailing prose around the JSON object.
If the match fails or `JSON.parse` throws, the function logs a warning and
returns without writing anything.
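
The match-then-parse step can be sketched as follows. The function name `parseExtractionResponse`, the injectable `warn` callback, and the `null` return value on failure are assumptions for illustration; only the regular expression and the warn-and-return behaviour come from the description above.

```js
// Sketch of the match-then-parse step (names are hypothetical).
function parseExtractionResponse(raw, warn = console.warn) {
  // Greedy match: first "{" through last "}" in the response.
  const match = typeof raw === 'string' ? raw.match(/\{[\s\S]*\}/) : null;
  if (!match) {
    warn('Extraction response contained no JSON object');
    return null;
  }
  try {
    return JSON.parse(match[0]);
  } catch (err) {
    warn(`Extraction response was not valid JSON: ${err.message}`);
    return null;
  }
}
```

Because the match is greedy, trailing prose that itself contains a `}` would be pulled into the candidate string and make `JSON.parse` fail, which ends in the same warn-and-return path.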

## Entity Processing

For each entity in `parsed.entities`:

1. Validate `name` and `type`: `type` must be in `ENTITY_TYPES` and `name` must not be in `IGNORED_NAMES`
2. Call `upsertEntity(name, type, notes)`:
   - **Insert**: creates new row with `mention_count = 1`, `source = 'extraction'`
   - **Conflict** on `(name, type)`: increments `mention_count`, updates `last_seen_at`, and keeps the existing `notes` when the new extraction supplies null notes
3. Add to `entityMap` keyed by `"${name}::${type}"` — used for relationship resolution below
4. Call `linkEntityToEpisode(entity.id, episodeId)` — writes to `entity_episodes` join table
5. Fire-and-forget: embed as `"${name} (${type}): ${notes}"` → store to Qdrant `entities` collection with `{ name, type, notes, projectId }` in payload

**Valid entity types:** `person`, `place`, `project`, `technology`, `concept`, `organization`

**Stoplist (ignored names):** `good morning`, `good night`, `hello`, `goodbye`, `thanks`, `thank you`
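
The validation and upsert semantics above can be modelled in memory as below. This is illustrative only: the real `upsertEntity()` writes to SQLite, whereas this sketch takes a `Map` as a hypothetical `store` argument. The case-insensitive stoplist comparison, the empty-name check, and overwriting `notes` when the new value is non-null are assumptions inferred from the description.

```js
// In-memory model of the documented validation and upsert rules.
const ENTITY_TYPES = new Set(['person', 'place', 'project', 'technology', 'concept', 'organization']);
const IGNORED_NAMES = new Set(['good morning', 'good night', 'hello', 'goodbye', 'thanks', 'thank you']);

function isValidEntity(entity) {
  if (!entity || typeof entity.name !== 'string' || !entity.name.trim()) return false;
  if (!ENTITY_TYPES.has(entity.type)) return false;
  // Assumption: the stoplist is compared case-insensitively.
  return !IGNORED_NAMES.has(entity.name.trim().toLowerCase());
}

function upsertEntity(store, name, type, notes = null) {
  const key = `${name}::${type}`; // stands in for the (name, type) conflict target
  const now = new Date().toISOString();
  const existing = store.get(key);

  if (existing) {
    existing.mention_count += 1;
    existing.last_seen_at = now;
    existing.notes = notes ?? existing.notes; // keep old notes when new notes are null
    return existing;
  }

  const row = {
    id: store.size + 1,
    name,
    type,
    notes,
    mention_count: 1,
    source: 'extraction',
    last_seen_at: now,
  };
  store.set(key, row);
  return row;
}
```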

## Relationship Processing

After all entities are saved, relationships are processed:

1. For each entry in `parsed.relationships`, look up both endpoints in `entityMap` using `"${from}::${fromType}"` and `"${to}::${toType}"` as keys
2. If either endpoint is missing (filtered out, invalid type, or not in this extraction), the relationship is silently skipped
3. Call `upsertRelationship(fromId, toId, label, notes)`:
   - **Insert**: creates new row with `mention_count = 1`
   - **Conflict** on `(from_id, to_id, label)`: increments `mention_count` and keeps the existing `notes` when the new notes are null

Relationships are unidirectional in storage. Bidirectionality is handled at
query time by the graph traversal layer.
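
The endpoint lookup in steps 1 and 2 can be sketched as a pure function. The name `resolveRelationships` and the use of a `Map` for `entityMap` are assumptions; the key format and the skip-on-missing behaviour follow the description above.

```js
// Hypothetical sketch: resolve relationship endpoints against entityMap and
// drop any relationship whose endpoints were not saved in this extraction.
function resolveRelationships(relationships, entityMap) {
  const resolved = [];
  for (const rel of relationships ?? []) {
    const from = entityMap.get(`${rel.from}::${rel.fromType}`);
    const to = entityMap.get(`${rel.to}::${rel.toType}`);
    if (!from || !to) continue; // silently skipped, as documented
    resolved.push({
      fromId: from.id,
      toId: to.id,
      label: rel.label,
      notes: rel.notes ?? null,
    });
  }
  return resolved;
}
```

Each resolved entry carries the four arguments that `upsertRelationship(fromId, toId, label, notes)` expects.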

## Project Scoping

`projectId` is threaded through from the episode creation call. It is stored
in the Qdrant entity payload, which enables project-scoped entity search in
orchestration. SQLite entities and relationships are global — scoping only
applies at the Qdrant retrieval layer.
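
For illustration, a project-scoped search against the `entities` collection could build its request body as below, using Qdrant's payload filter syntax. The helper name `buildEntitySearchBody`, the default limit, and the choice to search unscoped when `projectId` is absent are assumptions; this page does not describe how orchestration actually issues the query.

```js
// Hypothetical sketch of a Qdrant search body scoped by the projectId payload field.
function buildEntitySearchBody(vector, projectId, limit = 10) {
  const body = { vector, limit, with_payload: true };
  if (projectId != null) {
    // Only return points whose payload.projectId equals the given value.
    body.filter = {
      must: [{ key: 'projectId', match: { value: projectId } }],
    };
  }
  return body;
}
```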

## Error Behaviour

All steps after the initial model call are wrapped in a single outer try/catch.
If Ollama is unreachable, returns a non-200 status, or the JSON cannot be
parsed, the function logs at `warn` level and returns. There is no retry logic.
Individual entity embedding failures are caught per-entity and logged at `warn`
level without affecting other entities in the same batch.
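
The per-entity isolation described above can be sketched like this. The function name `embedEntitiesInBackground` and the injected `embedAndStore` callback are hypothetical stand-ins for the real embedding and Qdrant calls; the promises are returned only so that a caller or test can observe completion.

```js
// Hypothetical sketch: each entity gets its own catch handler, so one
// failed embedding is logged and does not affect the others.
function embedEntitiesInBackground(entities, embedAndStore, warn = console.warn) {
  return entities.map(entity =>
    Promise.resolve()
      .then(() => embedAndStore(entity))
      .catch(err => warn(`Embedding failed for ${entity.name}: ${err.message}`))
  );
}
```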