documentation updated for model inference settings
@@ -142,6 +142,9 @@ llama.cpp runtime defaults — used by the llama.cpp inference provider.
#### `INFERENCE_DEFAULTS`

Default inference parameters applied when not specified in a request.
These are used as fallbacks in `resolveOptions()` in both providers.
Orchestration reads live values from `settings.json` and forwards them
on every request; these constants are the fallback layer only.
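
The fallback described above can be sketched as follows. This is a minimal illustration, not the project's actual `resolveOptions()`: the function body and the numeric values in the `INFERENCE_DEFAULTS` object are assumptions made for the example, and the real constants may differ.

```javascript
// Hypothetical stand-in for INFERENCE_DEFAULTS; the values are placeholders,
// not the project's real defaults.
const INFERENCE_DEFAULTS = Object.freeze({
  temperature: 0.7,
  topP: 0.9,
  topK: 40,
  repeatPenalty: 1.1,
});

// Assumed shape of resolveOptions(): every option the request leaves
// undefined (or null) falls back to the matching constant.
function resolveOptions(requestOptions = {}) {
  const resolved = {};
  for (const [key, fallback] of Object.entries(INFERENCE_DEFAULTS)) {
    // `??` keeps explicit falsy values such as 0, unlike `||`.
    resolved[key] = requestOptions[key] ?? fallback;
  }
  return resolved;
}

// Values forwarded with the request win; the constants only fill the gaps.
console.log(resolveOptions({ temperature: 0, topK: 20 }));
// → { temperature: 0, topP: 0.9, topK: 20, repeatPenalty: 1.1 }
```

Using `??` instead of `||` matters here because a caller may legitimately send `temperature: 0`, which `||` would silently replace with the default.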

| Key | Value | Description |
|---|---|---|
@@ -154,16 +157,22 @@ Default inference parameters applied when not specified in a request.
#### `ORCHESTRATION`

Orchestration pipeline defaults. Used as fallback values in
`config/settings.js` when `settings.json` doesn't contain a key.
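
One way to picture this fallback is the sketch below. It is an illustration under stated assumptions rather than the contents of `config/settings.js`: `loadSettings` is a hypothetical helper, and it assumes `settings.json` uses the same key names as the `ORCHESTRATION` constants. The default values are the ones listed in the table for this section.

```javascript
// Defaults copied from the ORCHESTRATION table (SYSTEM_PROMPT omitted).
const ORCHESTRATION = Object.freeze({
  RECENT_EPISODE_LIMIT: 5,
  SEMANTIC_LIMIT: 5,
  SCORE_THRESHOLD: 0.75,
  TEMPERATURE: 0.7,
  CORS_ORIGIN: 'http://localhost:5173',
});

// Hypothetical helper: takes the raw text of settings.json and returns a
// complete settings object, using ORCHESTRATION for any key that is absent.
function loadSettings(settingsJsonText) {
  let fromFile = {};
  try {
    fromFile = JSON.parse(settingsJsonText) ?? {};
  } catch {
    // Malformed or empty file: fall back to the defaults entirely.
  }
  const settings = {};
  for (const [key, fallback] of Object.entries(ORCHESTRATION)) {
    // A key counts as present whenever the file defines it, even as 0 or ''.
    settings[key] = fromFile[key] !== undefined ? fromFile[key] : fallback;
  }
  return settings;
}

const settings = loadSettings('{"SEMANTIC_LIMIT": 3}');
console.log(settings.SEMANTIC_LIMIT, settings.SCORE_THRESHOLD);
// → 3 0.75
```

Taking the file text as an argument, instead of reading from disk inside the helper, keeps the sketch free of I/O and easy to test.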

| Key | Value | Description |
|---|---|---|
| `RECENT_EPISODE_LIMIT` | `5` | Recent episodes to inject into prompt |
| `SEMANTIC_LIMIT` | `5` | Semantic search results to inject into prompt |
| `SCORE_THRESHOLD` | `0.75` | Minimum similarity score for semantic results |
| `TEMPERATURE` | `0.7` | Default inference temperature |
| `CORS_ORIGIN` | `'http://localhost:5173'` | Fallback allowed CORS origin |
| `SYSTEM_PROMPT` | *(see below)* | Default system prompt |
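
As a rough illustration of how `SEMANTIC_LIMIT` and `SCORE_THRESHOLD` might interact, the sketch below filters search hits by score and keeps the best few. `selectSemanticResults` and the `{ text, score }` result shape are hypothetical. The source does not show how the pipeline applies these values, including whether the threshold comparison is inclusive.

```javascript
const SEMANTIC_LIMIT = 5;     // value from the ORCHESTRATION table
const SCORE_THRESHOLD = 0.75; // value from the ORCHESTRATION table

// Hypothetical helper: drop hits below the threshold, then keep the
// highest-scoring ones up to the limit.
function selectSemanticResults(
  hits,
  limit = SEMANTIC_LIMIT,
  threshold = SCORE_THRESHOLD,
) {
  return hits
    .filter((hit) => hit.score >= threshold) // assumed inclusive comparison
    .sort((a, b) => b.score - a.score)       // best match first
    .slice(0, limit);
}

const hits = [
  { text: 'likes hiking', score: 0.91 },
  { text: 'owns a cat', score: 0.62 },
  { text: 'works in Berlin', score: 0.78 },
];
console.log(selectSemanticResults(hits).map((hit) => hit.text));
// → [ 'likes hiking', 'works in Berlin' ]
```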

> `repeatPenalty`, `topP`, and `topK` defaults are sourced from
> `INFERENCE_DEFAULTS` in `config/settings.js` rather than `ORCHESTRATION`,
> since those constants already define the canonical values.

Default system prompt:
> "You are a helpful, context-aware AI assistant. You have access to memories
> of past conversations with the user. Use them to provide consistent,