Files
git.stella-ops.org/docs/modules/policy/guides/assistant-parameters.md
2026-01-13 18:53:39 +02:00

144 lines
12 KiB
Markdown
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Advisory AI Assistant Parameters
_Primary audience: platform operators & policy authors • Updated: 2026-01-13_
This note centralises the tunable knobs that control Advisory AI’s planner, retrieval stack, inference clients, and guardrails. All options live under the `AdvisoryAI` configuration section and can be set via `appsettings.*` files or environment variables using ASP.NET Core’s double-underscore convention (`ADVISORYAI__Inference__Mode`, etc.). Chat quotas and tool allowlists can also be overridden per tenant/user via the chat settings endpoints; appsettings/env values are defaults.
**Policy/version pin** — For Sprint 0111, use the policy bundle hash shipped on 2025-11-19 (same drop as `CLI-VULN-29-001` / `CLI-VEX-30-001`). Set `AdvisoryAI:PolicyVersion` or `ADVISORYAI__POLICYVERSION=2025.11.19` in deployments; include the hash in DSSE metadata for Offline Kits.
| Area | Key(s) | Environment variable | Default | Notes |
| --- | --- | --- | --- | --- |
| Inference mode | `AdvisoryAI:Inference:Mode` | `ADVISORYAI__INFERENCE__MODE` | `Local` | `Local` runs the deterministic pipeline only; `Remote` posts sanitized prompts to `Remote.BaseAddress`. |
| Remote base URI | `AdvisoryAI:Inference:Remote:BaseAddress` | `ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS` | — | Required when `Mode=Remote`. HTTPS strongly recommended. |
| Remote API key | `AdvisoryAI:Inference:Remote:ApiKey` | `ADVISORYAI__INFERENCE__REMOTE__APIKEY` | — | Injected as `Authorization: Bearer <key>` when present. |
| Remote timeout | `AdvisoryAI:Inference:Remote:TimeoutSeconds` | `ADVISORYAI__INFERENCE__REMOTE__TIMEOUTSECONDS` | `30` | Failing requests fall back to the sanitized prompt with `inference.fallback_reason=remote_timeout`. |
| Guardrail prompt cap | `AdvisoryAI:Guardrails:MaxPromptLength` | `ADVISORYAI__GUARDRAILS__MAXPROMPTLENGTH` | `16000` | Prompts longer than the cap are blocked with `prompt_too_long`. |
| Guardrail citations | `AdvisoryAI:Guardrails:RequireCitations` | `ADVISORYAI__GUARDRAILS__REQUIRECITATIONS` | `true` | When `true`, at least one citation must accompany every prompt. |
| Guardrail phrase seeds | `AdvisoryAI:Guardrails:BlockedPhrases[]`<br>`AdvisoryAI:Guardrails:BlockedPhraseFile` | `ADVISORYAI__GUARDRAILS__BLOCKEDPHRASES__0`<br>`ADVISORYAI__GUARDRAILS__BLOCKEDPHRASEFILE` | See defaults below | File paths are resolved relative to the content root; phrases are merged, de-duped, and lower-cased. |
| Chat request quota | `AdvisoryAI:Chat:Quotas:RequestsPerMinute` | `ADVISORYAI__CHAT__QUOTAS__REQUESTSPERMINUTE` | `60` | Requests per minute per user/org. |
| Chat daily request quota | `AdvisoryAI:Chat:Quotas:RequestsPerDay` | `ADVISORYAI__CHAT__QUOTAS__REQUESTSPERDAY` | `500` | Requests per day per user/org. |
| Chat token budget | `AdvisoryAI:Chat:Quotas:TokensPerDay` | `ADVISORYAI__CHAT__QUOTAS__TOKENSPERDAY` | `100000` | Tokens per day per user/org. |
| Chat tool budget | `AdvisoryAI:Chat:Quotas:ToolCallsPerDay` | `ADVISORYAI__CHAT__QUOTAS__TOOLCALLSPERDAY` | `10000` | Tool calls per day per user/org. |
| Guardrail scrubber entropy | `AdvisoryAI:Guardrails:EntropyThreshold` | `ADVISORYAI__GUARDRAILS__ENTROPYTHRESHOLD` | `3.5` | Entropy threshold for high-risk token redaction. |
| Guardrail scrubber min length | `AdvisoryAI:Guardrails:EntropyMinLength` | `ADVISORYAI__GUARDRAILS__ENTROPYMINLENGTH` | `20` | Minimum token length for entropy checks. |
| Guardrail scrubber allowlist file | `AdvisoryAI:Guardrails:AllowlistFile` | `ADVISORYAI__GUARDRAILS__ALLOWLISTFILE` | `data/advisory-ai/allowlist.txt` | Allowlisted patterns bypass redaction. |
| Guardrail scrubber allowlist patterns | `AdvisoryAI:Guardrails:AllowlistPatterns` | `ADVISORYAI__GUARDRAILS__ALLOWLISTPATTERNS__0` | See defaults | Additional allowlist patterns appended to defaults. |
| Chat tools allow all | `AdvisoryAI:Chat:Tools:AllowAll` | `ADVISORYAI__CHAT__TOOLS__ALLOWALL` | `true` | When true, allow all tools with enabled providers. |
| Chat tool allowlist | `AdvisoryAI:Chat:Tools:AllowedTools` | `ADVISORYAI__CHAT__TOOLS__ALLOWEDTOOLS__0` | See defaults | Allowed tools when `AllowAll=false`. |
| Chat audit enabled | `AdvisoryAI:Chat:Audit:Enabled` | `ADVISORYAI__CHAT__AUDIT__ENABLED` | `true` | Toggles chat audit persistence. |
| Chat audit connection string | `AdvisoryAI:Chat:Audit:ConnectionString` | `ADVISORYAI__CHAT__AUDIT__CONNECTIONSTRING` | --- | Postgres connection string for chat audit logs. |
| Chat audit schema | `AdvisoryAI:Chat:Audit:SchemaName` | `ADVISORYAI__CHAT__AUDIT__SCHEMANAME` | `advisoryai` | Schema for chat audit tables. |
| Chat audit evidence bundle | `AdvisoryAI:Chat:Audit:IncludeEvidenceBundle` | `ADVISORYAI__CHAT__AUDIT__INCLUDEEVIDENCEBUNDLE` | `false` | Store full evidence bundle JSON in audit log. |
| Chat audit retention | `AdvisoryAI:Chat:Audit:RetentionPeriod` | `ADVISORYAI__CHAT__AUDIT__RETENTIONPERIOD` | `90.00:00:00` | Retention period for audit logs. |
| Chat action policy allow | `AdvisoryAI:Chat:Actions:RequirePolicyAllow` | `ADVISORYAI__CHAT__ACTIONS__REQUIREPOLICYALLOW` | `true` | Require policy lattice approval before actions. |
| Plan cache TTL | `AdvisoryAI:PlanCache:DefaultTimeToLive`* | `ADVISORYAI__PLANCACHE__DEFAULTTIMETOLIVE` | `00:10:00` | Controls how long cached plans are reused. (`CleanupInterval` defaults to `00:05:00`). |
| Queue capacity | `AdvisoryAI:Queue:Capacity` | `ADVISORYAI__QUEUE__CAPACITY` | `1024` | Upper bound on in-memory tasks when using the default queue. |
| Queue wait interval | `AdvisoryAI:Queue:DequeueWaitInterval` | `ADVISORYAI__QUEUE__DEQUEUEWAITINTERVAL` | `00:00:01` | Back-off between queue polls when empty. |
> \* The plan-cache section is bound via `AddOptions<AdvisoryPlanCacheOptions>()`; override by adding an `AdvisoryAI__PlanCache` block to the host configuration.
---
## 1. Inference knobs & “temperature”
Advisory AI supports two inference modes:
- **Local (default)** – The orchestrator emits deterministic prompts and the worker returns the sanitized prompt verbatim. This mode is offline-friendly and does **not** call any external LLMs. There is no stochastic “temperature” here—the pipeline is purely rule-based.
- **Remote** – Sanitized prompts, citations, and metadata are POSTed to `Remote.BaseAddress + Remote.Endpoint` (default `/v1/inference`). Remote providers control sampling temperature on their side. StellaOps treats remote responses deterministically: we record the provider’s `modelId`, token usage, and any metadata they return. If your remote tier exposes a temperature knob, set it there; Advisory AI simply forwards the prompt.
### Remote inference quick sample
```json
{
"AdvisoryAI": {
"Inference": {
"Mode": "Remote",
"Remote": {
"BaseAddress": "https://inference.internal",
"Endpoint": "/v1/inference",
"ApiKey": "${ADVISORYAI_REMOTE_KEY}",
"TimeoutSeconds": 45
}
}
}
}
```
## 2. Guardrail configuration
| Setting | Default | Explanation |
| --- | --- | --- |
| `MaxPromptLength` | 16000 chars | Upper bound enforced after redaction. Increase cautiously—remote providers typically cap prompts at 32k tokens. |
| `RequireCitations` | `true` | Forces each prompt to include at least one citation. Disable only when testing synthetic prompts. |
| `BlockedPhrases[]` | `ignore previous instructions`, `disregard earlier instructions`, `you are now the system`, `override the system prompt`, `please jailbreak` | Inline list merged with the optional file. Comparisons are case-insensitive. |
| `BlockedPhraseFile` | — | Points to a newline-delimited list. Relative paths resolve against the content root (`AdvisoryAI.Hosting` sticks to AppContext base). |
| `EntropyThreshold` | `3.5` | Shannon entropy threshold for high-risk token redaction. Set to `0` to disable entropy checks. |
| `EntropyMinLength` | `20` | Minimum token length evaluated by the entropy scrubber. |
| `AllowlistPatterns` | Defaults (sha256/sha1/sha384/sha512) | Regex patterns that bypass entropy redaction for known-safe identifiers. |
| `AllowlistFile` | — | Optional allowlist file (JSON array or newline-delimited). Paths resolve against the content root. |
Violations surface in the response metadata (`guardrail.violations[*]`) and increment `advisory_ai_guardrail_blocks_total`. Console consumes the same payload for its ribbon state.
## 2.1 Tool policy lattice (chat)
Chat tool calls are allowed only when policy rules permit. Scope is evaluated on tenant, role, tool name, and resource.
Example (pseudo):
```text
allow_tool("vex.query") if role in ["analyst"] and namespace in ["team-a"]
deny_tool("vault.secrets.get") always
```
## 3. Retrieval & ranking weights (per-task)
Each task type (Summary, Conflict, Remediation) inherits the defaults below. Override any value via `AdvisoryAI:Tasks:<TaskType>:<Property>`.
| Task | `StructuredMaxChunks` | `VectorTopK` | `VectorQueries` (default) | `SbomMaxTimelineEntries` | `SbomMaxDependencyPaths` | `IncludeBlastRadius` |
| --- | --- | --- | --- | --- | --- | --- |
| Summary | 25 | 5 | `Summarize key facts`, `What is impacted?` | 10 | 20 | ✔ |
| Conflict | 30 | 6 | `Highlight conflicting statements`, `Where do sources disagree?` | 8 | 15 | ✖ |
| Remediation | 35 | 6 | `Provide remediation steps`, `Outline mitigations and fixes` | 12 | 25 | ✔ |
These knobs act as weighting levers: lower `VectorTopK` emphasises deterministic evidence; higher values favor breadth. `StructuredMaxChunks` bounds how many CSAF/OSV/VEX chunks reach the prompt, keeping token budgets predictable.
## 4. Token budgets
`AdvisoryTaskBudget` holds `PromptTokens` and `CompletionTokens` per task. Defaults:
| Task | Prompt tokens | Completion tokens |
| --- | --- | --- |
| Summary | 2 048 | 512 |
| Conflict | 2 048 | 512 |
| Remediation | 2 048 | 640 |
Overwrite via `AdvisoryAI:Tasks:Summary:Budget:PromptTokens`, etc. The worker records actual consumption in the response metadata (`inference.prompt_tokens`, `inference.completion_tokens`).
## 5. Cache TTLs & queue directories
- **Plan cache TTLs** – In-memory and file-system caches honour `AdvisoryAI:PlanCache:DefaultTimeToLive` (default 10 minutes) and `CleanupInterval` (default 5 minutes). Shorten the TTL to reduce stale plans or increase it to favour offline reuse. Both values accept ISO 8601 or `hh:mm:ss` time spans.
- **Queue & storage paths** – `AdvisoryAI:Queue:DirectoryPath`, `AdvisoryAI:Storage:PlanCacheDirectory`, and `AdvisoryAI:Storage:OutputDirectory` default to `data/advisory-ai/{queue,plans,outputs}` under the content root; override these when mounting RWX volumes in sovereign clusters.
- **Output TTLs** – Output artefacts inherit the host file-system retention policies. Combine `DefaultTimeToLive` with a cron or systemd timer to prune `outputs/` periodically when operating in remote-inference-heavy environments.
### Example: raised TTL & custom queue path
```json
{
"AdvisoryAI": {
"PlanCache": {
"DefaultTimeToLive": "00:20:00",
"CleanupInterval": "00:05:00"
},
"Queue": {
"DirectoryPath": "/var/lib/advisory-ai/queue"
}
}
}
```
## 6. Operational notes
- Updating **guardrail phrases** triggers only on host reload. When distributing blocked-phrase files via Offline Kits, keep filenames stable and version them through Git so QA can diff changes.
- **Temperature / sampling** remains a remote-provider concern. StellaOps records the provider’s `modelId` and exposes fallback metadata so policy authors can audit when sanitized prompts were returned instead of model output.
- Always track changes in `docs/implplan/SPRINT_0111_0001_0001_advisoryai.md` (task `DOCS-AIAI-31-006`) when promoting this document so the guild can trace which parameters were added per sprint.