Shared Package

Package: @nexusai/shared
Location: packages/shared

Purpose

Common utilities and configuration used across all NexusAI services. Keeping these here avoids duplication and ensures consistent behaviour.

Exports

getEnv(key, defaultValue?)

Reads an environment variable by key. If the variable is missing and no default is provided, it throws immediately, so a misconfiguration surfaces at startup rather than failing silently later.

const { getEnv } = require('@nexusai/shared');

const PORT = getEnv('PORT', '3002');   // optional — falls back to 3002
const DB   = getEnv('SQLITE_PATH');    // required — throws if missing
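
The behaviour described above can be sketched as follows. This is a hypothetical re-implementation for illustration, not the package source: it assumes values are read from process.env, that only an undefined variable counts as missing, and the error wording is invented.

```javascript
// Hypothetical sketch of getEnv; the real code in packages/shared may differ.
function getEnv(key, defaultValue) {
  const value = process.env[key];
  if (value !== undefined) return value;               // variable is set: use it
  if (defaultValue !== undefined) return defaultValue; // optional: fall back
  // Required and missing: fail fast (error wording is an assumption).
  throw new Error(`Missing required environment variable: ${key}`);
}
```

Because environment variables are always strings, passing string defaults such as '3002' keeps the return type consistent whether or not the variable is set.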

parseRow(row)

Parses a SQLite row object, deserialising any JSON-encoded metadata fields into plain objects. Returns null if the row is null or undefined.

const { parseRow } = require('@nexusai/shared');
const session = parseRow(db.prepare('SELECT * FROM sessions WHERE id = ?').get(id));
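
A minimal sketch of what such a helper might look like. The column name metadata and the choice to leave malformed JSON untouched are assumptions made for this example; the actual function may handle other fields or errors differently.

```javascript
// Hypothetical sketch of parseRow; `metadata` as the JSON column is an assumption.
function parseRow(row) {
  if (row === null || row === undefined) return null;
  const parsed = { ...row }; // copy so the caller's row object is not mutated
  if (typeof parsed.metadata === 'string') {
    try {
      parsed.metadata = JSON.parse(parsed.metadata);
    } catch {
      // Assumed behaviour: keep the raw string if it is not valid JSON.
    }
  }
  return parsed;
}

const row = { id: 's1', metadata: '{"topic":"setup"}' };
console.log(parseRow(row).metadata.topic); // setup
console.log(parseRow(undefined));          // null
```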

formatEpisodeText(userMessage, aiResponse)

Combines a user message and AI response into the canonical text format used for embedding:

User: {userMessage}
Assistant: {aiResponse}

Used by the memory service's embedding write path to ensure consistent vector representations across all episodes.
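
As an illustration, the format above could be produced by a one-line template. This is a sketch only; it assumes the two lines are joined by a single newline and that inputs are used verbatim, neither of which is confirmed by this page.

```javascript
// Hypothetical sketch of formatEpisodeText; the separator is assumed to be "\n".
function formatEpisodeText(userMessage, aiResponse) {
  return `User: ${userMessage}\nAssistant: ${aiResponse}`;
}

console.log(formatEpisodeText('What port does memory use?', 'It defaults to 3002.'));
// User: What port does memory use?
// Assistant: It defaults to 3002.
```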


Constants

Tuneable values and shared identifiers are centralised in constants.js rather than hardcoded across services. Import the relevant group by name.

const { QDRANT, COLLECTIONS, EPISODIC, LLAMACPP } = require('@nexusai/shared');

QDRANT

Vector store configuration. Values here must stay in sync with the embedding model and Qdrant collection setup.

| Key | Value | Description |
| --- | --- | --- |
| DEFAULT_URL | http://localhost:6333 | Fallback Qdrant URL |
| VECTOR_SIZE | 768 | Output dimensions of nomic-embed-text |
| DISTANCE_METRIC | 'Cosine' | Similarity metric used for all collections |
| DEFAULT_LIMIT | 10 | Default top-k for vector searches |
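
To show why these values must match the collection setup, here is a sketch that builds the Qdrant REST request for creating a collection (PUT /collections/{name} with a vectors config). The helper name buildCreateCollectionRequest is hypothetical, and the constants are copied from the table above rather than imported so the example runs on its own.

```javascript
// Values copied from the QDRANT table for a self-contained example.
const QDRANT = {
  DEFAULT_URL: 'http://localhost:6333',
  VECTOR_SIZE: 768,
  DISTANCE_METRIC: 'Cosine',
  DEFAULT_LIMIT: 10,
};

// Hypothetical helper: describes the request without sending it.
function buildCreateCollectionRequest(name, baseUrl = QDRANT.DEFAULT_URL) {
  return {
    method: 'PUT',
    url: `${baseUrl}/collections/${name}`,
    body: {
      vectors: { size: QDRANT.VECTOR_SIZE, distance: QDRANT.DISTANCE_METRIC },
    },
  };
}

console.log(buildCreateCollectionRequest('episodes').url);
// http://localhost:6333/collections/episodes
```

If the embedding model changes to one with a different output dimension, a collection created with the old VECTOR_SIZE would reject the new vectors, which is why the constant and the model need to be updated together.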

COLLECTIONS

Canonical Qdrant collection names.

| Key | Value |
| --- | --- |
| EPISODES | 'episodes' |
| ENTITIES | 'entities' |
| SUMMARIES | 'summaries' |

EPISODIC

Default pagination and result limits for SQLite episode queries.

| Key | Value | Description |
| --- | --- | --- |
| DEFAULT_RECENT_LIMIT | 10 | Default number of recent episodes to retrieve |
| DEFAULT_PAGE_SIZE | 20 | Default episodes per page for paginated queries |
| DEFAULT_SEARCH_LIMIT | 10 | Default number of FTS search results to return |
| DEFAULT_OFFSET | 0 | Default pagination offset |
| DEFAULT_SESSIONS_LIMIT | 20 | Default number of sessions to return |
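
One way these defaults might be applied is when turning raw query-string values into LIMIT and OFFSET numbers. The helper resolvePagination and the query field names limit and offset are assumptions for this sketch; the constants are copied from the table above.

```javascript
// Values copied from the EPISODIC table for a self-contained example.
const EPISODIC = {
  DEFAULT_RECENT_LIMIT: 10,
  DEFAULT_PAGE_SIZE: 20,
  DEFAULT_SEARCH_LIMIT: 10,
  DEFAULT_OFFSET: 0,
  DEFAULT_SESSIONS_LIMIT: 20,
};

// Hypothetical helper: falls back to the shared defaults when a value is
// missing, non-numeric, or out of range.
function resolvePagination(query = {}) {
  const limit = Number.parseInt(query.limit, 10);
  const offset = Number.parseInt(query.offset, 10);
  return {
    limit: Number.isInteger(limit) && limit > 0 ? limit : EPISODIC.DEFAULT_PAGE_SIZE,
    offset: Number.isInteger(offset) && offset >= 0 ? offset : EPISODIC.DEFAULT_OFFSET,
  };
}

console.log(resolvePagination({ limit: '5' })); // { limit: 5, offset: 0 }
console.log(resolvePagination({}));             // { limit: 20, offset: 0 }
```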

SERVICES

Default URLs for inter-service communication. Used as fallback values when the corresponding environment variable is not set.

| Key | Value | Description |
| --- | --- | --- |
| EMBEDDING_URL | http://localhost:3003 | Fallback embedding service URL |
| MEMORY_URL | http://localhost:3002 | Fallback memory service URL |
| INFERENCE_URL | http://localhost:3001 | Fallback inference service URL |
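
The fallback pattern might look like the sketch below. It assumes each environment variable has the same name as its constant key (for example MEMORY_URL), which this page implies but does not state; resolveServiceUrl is a hypothetical helper.

```javascript
// Values copied from the SERVICES table for a self-contained example.
const SERVICES = {
  EMBEDDING_URL: 'http://localhost:3003',
  MEMORY_URL: 'http://localhost:3002',
  INFERENCE_URL: 'http://localhost:3001',
};

// Hypothetical helper. Assumption: env var names match the constant keys.
// An empty string in the environment is treated as "not set".
function resolveServiceUrl(key, env = process.env) {
  return env[key] || SERVICES[key];
}

const env = { MEMORY_URL: 'http://memory.internal:3002' };
console.log(resolveServiceUrl('MEMORY_URL', env));    // http://memory.internal:3002
console.log(resolveServiceUrl('EMBEDDING_URL', env)); // http://localhost:3003
```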

PORTS

Default port numbers for each service.

| Key | Value |
| --- | --- |
| INFERENCE | '3001' |
| MEMORY | '3002' |
| EMBEDDING | '3003' |
| ORCHESTRATION | '4000' |

OLLAMA

Ollama runtime defaults — used by the Ollama inference provider.

| Key | Value | Description |
| --- | --- | --- |
| DEFAULT_URL | http://localhost:11434 | Fallback Ollama URL |
| EMBED_MODEL | 'nomic-embed-text' | Default embedding model |
| OLLAMA_MODEL | 'companion:latest' | Default chat model |

LLAMACPP

llama.cpp runtime defaults — used by the llama.cpp inference provider.

| Key | Value | Description |
| --- | --- | --- |
| DEFAULT_URL | http://localhost:8080 | Fallback llama-server URL |
| DEFAULT_MODEL | 'local-model' | Fallback model name (override via DEFAULT_MODEL env var) |

Always set DEFAULT_MODEL in the inference service .env to the exact model name reported by llama-server, including the .gguf extension. The shared constant is a last-resort fallback only.

INFERENCE_DEFAULTS

Default inference parameters applied when not specified in a request. These are used as fallbacks in resolveOptions() in both providers. Orchestration reads live values from settings.json and forwards them on every request — these constants are the fallback layer only.

| Key | Value | Description |
| --- | --- | --- |
| TEMPERATURE | 0.7 | Controls randomness (0 = deterministic, 1 = creative) |
| MAX_TOKENS | 1024 | Maximum tokens to generate |
| TOP_P | 0.9 | Nucleus sampling probability mass |
| TOP_K | 40 | Top-K candidates at each step |
| REPEAT_PENALTY | 1.1 | Penalty for recently used tokens |
| SEED | null | null = random; set integer for reproducible outputs |
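
A sketch of how a resolveOptions() fallback layer could work. The request field names maxTokens and seed are assumptions (only repeatPenalty, topP, and topK are named on this page), and the real providers may map these to provider-specific parameter names.

```javascript
// Values copied from the INFERENCE_DEFAULTS table for a self-contained example.
const INFERENCE_DEFAULTS = {
  TEMPERATURE: 0.7,
  MAX_TOKENS: 1024,
  TOP_P: 0.9,
  TOP_K: 40,
  REPEAT_PENALTY: 1.1,
  SEED: null,
};

// Hypothetical sketch. `??` only falls back on null/undefined, so an explicit
// temperature of 0 from the request is preserved rather than replaced.
function resolveOptions(options = {}) {
  return {
    temperature: options.temperature ?? INFERENCE_DEFAULTS.TEMPERATURE,
    maxTokens: options.maxTokens ?? INFERENCE_DEFAULTS.MAX_TOKENS,
    topP: options.topP ?? INFERENCE_DEFAULTS.TOP_P,
    topK: options.topK ?? INFERENCE_DEFAULTS.TOP_K,
    repeatPenalty: options.repeatPenalty ?? INFERENCE_DEFAULTS.REPEAT_PENALTY,
    seed: options.seed ?? INFERENCE_DEFAULTS.SEED,
  };
}

console.log(resolveOptions({ temperature: 0 }).temperature); // 0
console.log(resolveOptions({}).maxTokens);                   // 1024
```

Using `||` instead of `??` here would silently turn a requested temperature of 0 into 0.7, which is worth checking for in any real fallback code.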

ORCHESTRATION

Orchestration pipeline defaults. Used as fallback values in config/settings.js when settings.json doesn't contain a key.

| Key | Value | Description |
| --- | --- | --- |
| RECENT_EPISODE_LIMIT | 5 | Recent episodes to inject into prompt |
| SEMANTIC_LIMIT | 5 | Semantic search results to inject into prompt |
| SCORE_THRESHOLD | 0.75 | Minimum similarity score for semantic results |
| ENTITIES_LIMIT | 5 | Max entity search results to inject into prompt |
| ENTITIES_THRESHOLD | 0.55 | Minimum similarity score for entity results |
| TEMPERATURE | 0.7 | Default inference temperature |
| CORS_ORIGIN | 'http://localhost:5173' | Fallback allowed CORS origin |
| SYSTEM_PROMPT | (see below) | Default system prompt |

ENTITIES_THRESHOLD (0.55) is deliberately lower than SCORE_THRESHOLD (0.75), because entity notes generated by a 3B model tend to embed with lower cosine similarity than full episode text. Raise it if irrelevant entities appear in context.
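
To make the effect of the two thresholds concrete, the sketch below filters the same set of search hits with each one. The helper selectHits is hypothetical, the hits are made-up sample data, and treating the threshold as inclusive (>=) is an assumption.

```javascript
// Values copied from the ORCHESTRATION table for a self-contained example.
const ORCHESTRATION = {
  SEMANTIC_LIMIT: 5,
  SCORE_THRESHOLD: 0.75,
  ENTITIES_LIMIT: 5,
  ENTITIES_THRESHOLD: 0.55,
};

// Hypothetical helper: keep hits at or above the threshold, best first,
// capped at `limit`.
function selectHits(hits, threshold, limit) {
  return hits
    .filter((hit) => hit.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Made-up sample scores, for illustration only.
const hits = [
  { id: 'a', score: 0.81 },
  { id: 'b', score: 0.62 },
  { id: 'c', score: 0.5 },
];

const episodes = selectHits(hits, ORCHESTRATION.SCORE_THRESHOLD, ORCHESTRATION.SEMANTIC_LIMIT);
const entities = selectHits(hits, ORCHESTRATION.ENTITIES_THRESHOLD, ORCHESTRATION.ENTITIES_LIMIT);
console.log(episodes.map((h) => h.id)); // [ 'a' ]
console.log(entities.map((h) => h.id)); // [ 'a', 'b' ]
```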

In config/settings.js, the defaults for repeatPenalty, topP, and topK come from INFERENCE_DEFAULTS rather than ORCHESTRATION, because INFERENCE_DEFAULTS already defines the canonical values.
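
The split between the two constant groups could look roughly like this. The function resolveSettings and the camelCase key names assumed for settings.json are hypothetical; only repeatPenalty, topP, and topK are named on this page.

```javascript
// Subsets of the two constant groups, copied from the tables on this page.
const INFERENCE_DEFAULTS = { TOP_P: 0.9, TOP_K: 40, REPEAT_PENALTY: 1.1 };
const ORCHESTRATION = {
  RECENT_EPISODE_LIMIT: 5,
  SEMANTIC_LIMIT: 5,
  SCORE_THRESHOLD: 0.75,
  TEMPERATURE: 0.7,
};

// Hypothetical sketch of the merge: values parsed from settings.json win,
// and each missing key falls back to the group that owns it.
function resolveSettings(fileSettings = {}) {
  return {
    recentEpisodeLimit: fileSettings.recentEpisodeLimit ?? ORCHESTRATION.RECENT_EPISODE_LIMIT,
    semanticLimit: fileSettings.semanticLimit ?? ORCHESTRATION.SEMANTIC_LIMIT,
    scoreThreshold: fileSettings.scoreThreshold ?? ORCHESTRATION.SCORE_THRESHOLD,
    temperature: fileSettings.temperature ?? ORCHESTRATION.TEMPERATURE,
    // Sampling parameters are owned by INFERENCE_DEFAULTS, not ORCHESTRATION.
    topP: fileSettings.topP ?? INFERENCE_DEFAULTS.TOP_P,
    topK: fileSettings.topK ?? INFERENCE_DEFAULTS.TOP_K,
    repeatPenalty: fileSettings.repeatPenalty ?? INFERENCE_DEFAULTS.REPEAT_PENALTY,
  };
}

console.log(resolveSettings({ topK: 20 }).topK); // 20
console.log(resolveSettings({}).repeatPenalty);  // 1.1
```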

Default system prompt:

"You are a helpful, context-aware AI assistant. You have access to memories of past conversations with the user. Use them to provide consistent, personalised responses."

SUMMARIES

Controls the automatic session summarisation system in orchestration-service/src/services/summarization.js.

| Key | Value | Description |
| --- | --- | --- |
| THRESHOLD_TOKENS | 200 | Minimum total session tokens before summarisation is considered |
| MAX_SUMMARY_TOKENS | 800 | Maximum length of an existing summary, measured in characters despite the name; if exceeded, a new row is created instead of updating |
| MIN_EPISODES_SINCE | 5 | Minimum new episodes since last summary before re-summarising |

These can be overridden per-deployment via environment variables in the orchestration service .env:

SUMMARY_THRESHOLD_TOKENS=200
SUMMARY_MAX_TOKENS=800
SUMMARY_MIN_EPISODES=5
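
The sketch below shows one way the overrides and thresholds could fit together. It is not the logic in summarization.js: the helpers envInt, loadSummaryConfig, and planSummary, the session field names, and the order in which the checks combine are all assumptions.

```javascript
// Values copied from the SUMMARIES table for a self-contained example.
const SUMMARIES = { THRESHOLD_TOKENS: 200, MAX_SUMMARY_TOKENS: 800, MIN_EPISODES_SINCE: 5 };

// Parse an integer env var, falling back when it is unset or not a number.
function envInt(name, fallback, env = process.env) {
  const parsed = Number.parseInt(env[name], 10);
  return Number.isInteger(parsed) ? parsed : fallback;
}

// Hypothetical loader using the env var names listed above.
function loadSummaryConfig(env = process.env) {
  return {
    thresholdTokens: envInt('SUMMARY_THRESHOLD_TOKENS', SUMMARIES.THRESHOLD_TOKENS, env),
    maxSummaryTokens: envInt('SUMMARY_MAX_TOKENS', SUMMARIES.MAX_SUMMARY_TOKENS, env),
    minEpisodesSince: envInt('SUMMARY_MIN_EPISODES', SUMMARIES.MIN_EPISODES_SINCE, env),
  };
}

// Assumed decision logic: returns 'skip', 'create', or 'update'.
function planSummary(session, config) {
  if (session.totalTokens < config.thresholdTokens) return 'skip';
  if (session.episodesSinceSummary < config.minEpisodesSince) return 'skip';
  if (!session.existingSummary) return 'create';
  // Length is compared in characters, per the table description.
  return session.existingSummary.length > config.maxSummaryTokens ? 'create' : 'update';
}

const config = loadSummaryConfig({ SUMMARY_MAX_TOKENS: '1200' });
console.log(config.maxSummaryTokens); // 1200
console.log(planSummary({ totalTokens: 150, episodesSinceSummary: 9 }, config)); // skip
```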

SQLITE

| Key | Value | Description |
| --- | --- | --- |
| DEFAULT_PATH | './data/nexusai.db' | Fallback SQLite database path |