Knowledge Graph
Location: packages/memory-service/src/graph/index.js
Schema additions: entity_episodes table; new columns on entities and relationships
Exposed via: GET /graph/neighborhood/:entityId, POST /graph/neighbors
Consumed by: Orchestration service context assembly
Purpose
The knowledge graph transforms NexusAI from "remembers conversations" to "understands relationships between things." Rather than injecting a flat list of entity facts into every prompt, orchestration now retrieves a 1-hop subgraph of connected entities and their relationships, giving the model structured, linked knowledge about people, projects, technologies, and concepts that have appeared across conversations.
Schema
entity_episodes (join table)
Tracks which episodes contributed to each entity's knowledge. Defined in
schema.js — exists on all installs.
CREATE TABLE IF NOT EXISTS entity_episodes (
  entity_id INTEGER NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
  episode_id INTEGER NOT NULL REFERENCES episodes(id) ON DELETE CASCADE,
  PRIMARY KEY (entity_id, episode_id)
);
Both FKs cascade on delete — removing an entity or episode automatically cleans up its join rows.
New columns on entities
Added via migration in db/index.js:
| Column | Type | Default | Description |
|---|---|---|---|
| mention_count | INTEGER | 1 | How many times this entity has been extracted across conversations |
| confidence | REAL | 1.0 | Reserved for future confidence scoring |
| source | TEXT | 'extraction' | 'extraction' (auto) or 'manual' |
| last_seen_at | INTEGER | NULL | Unix timestamp of most recent extraction hit |
New columns on relationships
| Column | Type | Default | Description |
|---|---|---|---|
| mention_count | INTEGER | 1 | How many times this edge has been extracted |
| notes | TEXT | NULL | Relationship context sentence from extraction |
Entity Promotion Model
Entities are not created equal — some are mentioned once in passing, others
recur across many conversations. mention_count is the signal:
- Every time upsertEntity is called for an existing (name, type) pair, mention_count is incremented and last_seen_at is updated.
- ENTITIES.PROMOTION_THRESHOLD (default: 3) is the mention_count at which an entity is considered "well-established"; it is referenced in the codebase for future filtering and scoring logic.
- Currently, mention_count is stored and incremented but not yet used to gate retrieval. It provides the foundation for future features such as orphan cleanup (entities never re-extracted) and confidence-weighted graph traversal.
The same pattern applies to relationships — mention_count rises each time
the same (from_id, to_id, label) triple is extracted.
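The counting rule can be sketched in a few lines. This is a minimal illustration and not the service's code: it uses a hypothetical in-memory Map in place of the entities table, and it assumes a first insert leaves last_seen_at at its column default.

```javascript
// Mirrors ENTITIES.PROMOTION_THRESHOLD from the constants table.
const PROMOTION_THRESHOLD = 3;

// `store` is a hypothetical Map standing in for the entities table,
// keyed by "name|type". `now` is a Unix timestamp in seconds.
function upsertEntity(store, { name, type, notes = null }, now = Math.floor(Date.now() / 1000)) {
  const key = `${name}|${type}`;
  const existing = store.get(key);
  if (existing) {
    // Repeat extraction: bump the counter and refresh the timestamp.
    existing.mention_count += 1;
    existing.last_seen_at = now;
    return existing;
  }
  // First sighting: use the column defaults (assumption: last_seen_at stays NULL).
  const row = { name, type, notes, mention_count: 1, confidence: 1.0, source: 'extraction', last_seen_at: null };
  store.set(key, row);
  return row;
}

// An entity counts as "well-established" once it reaches the threshold.
function isWellEstablished(entity) {
  return entity.mention_count >= PROMOTION_THRESHOLD;
}
```

Calling upsertEntity three times with the same (name, type) pair yields a single row with mention_count 3, which isWellEstablished reports as promoted.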
Graph Traversal
src/graph/index.js exports two functions built on SQLite's WITH RECURSIVE
CTE support. No external graph database is needed.
getNeighborhood(entityId, depth)
Traverses the graph from a single entity, following edges in both directions,
up to depth hops. Returns { nodes: [...entities], edges: [...relationships] }.
Default depth: ENTITIES.GRAPH_HOP_DEPTH (1). Maximum enforced at HTTP layer: 3.
SQLite query:
WITH RECURSIVE traverse(entity_id, depth) AS (
  SELECT ?, 0
  UNION
  SELECT
    CASE WHEN r.from_id = t.entity_id THEN r.to_id ELSE r.from_id END,
    t.depth + 1
  FROM relationships r
  JOIN traverse t ON (r.from_id = t.entity_id OR r.to_id = t.entity_id)
  WHERE t.depth < ?
)
SELECT DISTINCT entity_id FROM traverse
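UNION (not UNION ALL) discards rows that were already produced, so the same (entity_id, depth) pair is never expanded twice. Because depth is part of each row, a node on a cycle can still reappear at a greater depth; the t.depth < ? bound guarantees termination, and the final SELECT DISTINCT collapses those repeats into one ID per node.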
After collecting node IDs, two follow-up queries fetch:
- All entity rows for those IDs
- All relationship rows where both from_id and to_id are in the node set
This ensures edges between neighbors are included even if they aren't on the traversal path from the seed.
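The traversal and the two follow-up queries can be pictured as a breadth-first walk over in-memory arrays. The sketch below is illustrative only: the real function queries SQLite, and passing entities and relationships in as arrays is an assumption made so the example runs without a database.

```javascript
// In-memory sketch of the bidirectional, depth-bounded traversal.
// `entities` and `relationships` are hypothetical arrays standing in
// for the SQLite tables of the same names.
function getNeighborhood(entities, relationships, entityId, depth = 1) {
  const visited = new Set([entityId]);
  let frontier = [entityId];

  for (let hop = 0; hop < depth && frontier.length > 0; hop++) {
    const next = [];
    for (const r of relationships) {
      for (const id of frontier) {
        // Follow the edge in either direction, like the CASE expression.
        const other = r.from_id === id ? r.to_id : r.to_id === id ? r.from_id : null;
        if (other !== null && !visited.has(other)) {
          visited.add(other);
          next.push(other);
        }
      }
    }
    frontier = next;
  }

  return {
    nodes: entities.filter((e) => visited.has(e.id)),
    // Keep every edge whose endpoints are both in the node set,
    // including edges between two neighbors of the seed.
    edges: relationships.filter((r) => visited.has(r.from_id) && visited.has(r.to_id)),
  };
}
```

With edges 1→2, 3→1, and 2→3, a depth-1 query from entity 1 returns all three edges, because the 2→3 edge connects two nodes that are both in the result set.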
getEntityNeighbors(entityIds[])
Bulk 1-hop version designed for orchestration. Given multiple seed entity IDs (the results of Qdrant semantic search), returns the combined 1-hop subgraph.
- Finds all neighbor IDs via one query using IN (...) on both from_id and to_id
- Deduplicates seeds + neighbors using a JavaScript Set
- Fetches all entity rows and all relationship rows within the combined node set
This is intentionally simpler than the recursive version — orchestration always uses depth=1, and the bulk query avoids N separate CTE calls.
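The first two steps might look roughly like the helpers below. Both function names and the exact SQL text are hypothetical; only the IN (...) shape and the Set-based deduplication come from the description above.

```javascript
// Hypothetical helper: builds the single bulk query for 1-hop neighbors.
// The SQL text is an assumption, not the service's actual statement.
function buildNeighborQuery(entityIds) {
  const placeholders = entityIds.map(() => '?').join(', ');
  const sql =
    'SELECT from_id, to_id FROM relationships ' +
    `WHERE from_id IN (${placeholders}) OR to_id IN (${placeholders})`;
  // The ID list is bound twice, once per IN clause.
  return { sql, params: [...entityIds, ...entityIds] };
}

// Hypothetical helper: merges seeds and neighbor rows into one
// deduplicated ID list, preserving first-seen order.
function collectNodeIds(entityIds, rows) {
  const ids = new Set(entityIds);
  for (const row of rows) {
    ids.add(row.from_id);
    ids.add(row.to_id);
  }
  return [...ids];
}
```

Binding values through placeholders rather than interpolating the IDs keeps the statement parameterized regardless of how many seeds are passed.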
Graph-Aware Context Assembly
Orchestration's assembleContext (in src/chat/index.js) integrates the
graph at step 7 of the chat pipeline:
- Qdrant entity search returns up to ORCHESTRATION.ENTITIES_LIMIT results, each including r.id (the SQLite entity ID) alongside the Qdrant payload
- graph.getNeighbors(entityIds) is called with those IDs → POST /graph/neighbors on memory-service
- The returned { nodes, edges } is passed to formatGraphContext()
- On failure, falls back to using the Qdrant payload data directly as flat nodes with no edges
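The fallback behavior can be sketched as follows. The function name resolveGraph, the payload fields, and the injected getNeighbors parameter are assumptions made so the example runs without a memory-service; the real assembleContext may be structured differently.

```javascript
// `searchResults` mimics Qdrant hits: { id, payload: { name, type, notes } }.
// `getNeighbors` is injected; in the service it would issue POST /graph/neighbors.
async function resolveGraph(searchResults, getNeighbors) {
  const entityIds = searchResults.map((r) => r.id);
  // Assumption: skip the call when there are no seeds, since the
  // endpoint rejects an empty entityIds array with a 400.
  if (entityIds.length === 0) return { nodes: [], edges: [] };
  try {
    return await getNeighbors(entityIds);
  } catch (err) {
    // Fallback: flat nodes built from the Qdrant payloads, no edges.
    return {
      nodes: searchResults.map((r) => ({ id: r.id, ...r.payload })),
      edges: [],
    };
  }
}
```

Either way the caller receives a { nodes, edges } object, so the formatting step does not need to know whether the graph lookup succeeded.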
Prompt Format
formatGraphContext(nodes, edges) in chat/index.js formats the subgraph as:
Here is what you know about entities relevant to this conversation and their connections:
- Alice (person): software engineer working on NexusAI
  → works_on NexusAI (project)
  → knows Bob (person)
- NexusAI (project): AI assistant framework
- Bob (person): Alice's colleague
- One line per node: - {name} ({type}): {notes}
- Outbound edges indented below: → {label} {target_name} ({target_type})
- Nodes with only inbound edges (pulled in as neighbors) appear without connection lines
- Only outbound edges are shown; each relationship appears once, from the from_id side
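A formatter that follows these rules could look like the sketch below. It is a reconstruction from the rules and the sample output, not the code in chat/index.js, and the two-space indent before each arrow is an assumption.

```javascript
// Sketch of a formatter for the { nodes, edges } subgraph.
function formatGraphContext(nodes, edges) {
  const byId = new Map(nodes.map((n) => [n.id, n]));
  const lines = [
    'Here is what you know about entities relevant to this conversation and their connections:',
  ];
  for (const node of nodes) {
    lines.push(`- ${node.name} (${node.type}): ${node.notes}`);
    for (const edge of edges) {
      // Only outbound edges are listed, so each relationship appears once.
      const target = byId.get(edge.to_id);
      if (edge.from_id === node.id && target) {
        lines.push(`  → ${edge.label} ${target.name} (${target.type})`);
      }
    }
  }
  return lines.join('\n');
}
```

Edges whose target is missing from the node list are skipped here rather than rendered with a placeholder, which is one of several reasonable choices.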
Project Scoping
The knowledge graph respects project boundaries at the entry point, not during traversal:
- Qdrant entity search is filtered by projectId: only entities tagged with this project are returned as seeds
- Graph traversal in SQLite is unfiltered: neighbors can be from any project or no project
- This is intentional: the graph entry is project-scoped, but traversal follows the global relationship graph to discover connected knowledge
Entities are tagged with projectId in the Qdrant payload at extraction time.
Entities extracted from non-project sessions have projectId: null and only
appear in unfiltered global searches.
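The entry-point scoping amounts to attaching a payload filter to the seed search only. The sketch below assumes Qdrant's standard must/match filter shape and a hypothetical helper name; how the orchestration service actually builds its search request is not shown in this document.

```javascript
// Hypothetical helper: builds the payload filter for the Qdrant entity search.
// Returns undefined for a global (unfiltered) search.
function buildEntityFilter(projectId) {
  if (projectId == null) return undefined;
  // Match only points whose payload carries this projectId.
  return { must: [{ key: 'projectId', match: { value: projectId } }] };
}
```

No equivalent filter is applied on the SQLite side, which is what lets neighbors from other projects appear in the returned subgraph.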
API Reference
GET /graph/neighborhood/:entityId
Returns the neighborhood of a single entity.
Query params:
| Param | Default | Max | Description |
|---|---|---|---|
| depth | ENTITIES.GRAPH_HOP_DEPTH (1) | 3 | Traversal depth |
Response:
{
  "entity": { "id": 5, "name": "Alice", "type": "person", "notes": "...", "mention_count": 4 },
  "neighborhood": {
    "nodes": [
      { "id": 5, "name": "Alice", "type": "person", "notes": "..." },
      { "id": 8, "name": "NexusAI", "type": "project", "notes": "..." }
    ],
    "edges": [
      { "id": 2, "from_id": 5, "to_id": 8, "label": "works_on", "notes": "...", "mention_count": 3 }
    ]
  }
}
Returns 404 if the entity does not exist.
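One way the depth parameter could be normalized at the HTTP layer is shown below. Whether the service clamps an oversized value or rejects it is not stated here, so the clamping behavior, the handling of invalid input, and the function name are all assumptions.

```javascript
const GRAPH_HOP_DEPTH = 1; // mirrors ENTITIES.GRAPH_HOP_DEPTH
const MAX_DEPTH = 3;       // maximum enforced at the HTTP layer

// Hypothetical helper: turns the raw query-string value into a usable depth.
function parseDepth(raw) {
  const n = Number.parseInt(raw, 10);
  // Missing, non-numeric, or non-positive input falls back to the default.
  if (!Number.isInteger(n) || n < 1) return GRAPH_HOP_DEPTH;
  // Assumption: values above the maximum are clamped rather than rejected.
  return Math.min(n, MAX_DEPTH);
}
```

A request such as GET /graph/neighborhood/5?depth=2 would then traverse two hops, while depth=10 would be treated as 3 under this sketch.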
POST /graph/neighbors
Bulk 1-hop neighborhood for a set of entity IDs. Used internally by orchestration — not intended for direct client use.
Request body:
{ "entityIds": [5, 8, 12] }
Response:
{
  "nodes": [ ...entity objects... ],
  "edges": [ ...relationship objects... ]
}
Returns 400 if entityIds is missing or empty.
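The 400 rule can be expressed as a small validation step ahead of the query. The helper name, return shape, and error message below are hypothetical; only the missing-or-empty condition comes from the description above.

```javascript
// Hypothetical helper: validates the POST /graph/neighbors request body.
function validateNeighborsBody(body) {
  const ids = body && body.entityIds;
  if (!Array.isArray(ids) || ids.length === 0) {
    // Missing or empty entityIds maps to a 400 response.
    return { ok: false, status: 400, error: 'entityIds must be a non-empty array' };
  }
  return { ok: true, entityIds: ids };
}
```

A body of { "entityIds": [5, 8, 12] } passes, while {} and { "entityIds": [] } are both rejected.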
Constants (packages/shared/src/config/constants.js)
| Constant | Value | Description |
|---|---|---|
| ENTITIES.PROMOTION_THRESHOLD | 3 | mention_count at which an entity is considered well-established |
| ENTITIES.GRAPH_HOP_DEPTH | 1 | Default traversal depth for neighborhood queries |
| ORCHESTRATION.ENTITIES_LIMIT | 5 | Max entity seeds returned from Qdrant search |
| ORCHESTRATION.ENTITIES_THRESHOLD | 0.55 | Minimum similarity score for entity Qdrant search |