# Advisory AI Assistant Parameters

*Primary audience: platform operators & policy authors • Updated: 2026-01-13*

This note centralises the tunable knobs that control Advisory AI’s planner, retrieval stack, inference clients, and guardrails. All options live under the `AdvisoryAI` configuration section and can be set via `appsettings.*` files or environment variables using ASP.NET Core’s double-underscore convention (`ADVISORYAI__Inference__Mode`, etc.). Chat quotas and tool allowlists can also be overridden per tenant/user via the chat settings endpoints; `appsettings`/env values act as defaults.

**Policy/version pin** – For Sprint 0111, use the policy bundle hash shipped on 2025-11-19 (the same drop as CLI-VULN-29-001 / CLI-VEX-30-001). Set `AdvisoryAI:PolicyVersion` or `ADVISORYAI__POLICYVERSION=2025.11.19` in deployments; include the hash in DSSE metadata for Offline Kits.
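
A minimal `appsettings` fragment applying the pin (the nesting follows the standard options layout; only the `PolicyVersion` key and value come from this note):

```json
{
  "AdvisoryAI": {
    "PolicyVersion": "2025.11.19"
  }
}
```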

| Area | Key(s) | Environment variable | Default | Notes |
| --- | --- | --- | --- | --- |
| Inference mode | `AdvisoryAI:Inference:Mode` | `ADVISORYAI__INFERENCE__MODE` | `Local` | `Local` runs the deterministic pipeline only; `Remote` posts sanitized prompts to `Remote.BaseAddress`. |
| Remote base URI | `AdvisoryAI:Inference:Remote:BaseAddress` | `ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS` | — | Required when `Mode=Remote`. HTTPS strongly recommended. |
| Remote API key | `AdvisoryAI:Inference:Remote:ApiKey` | `ADVISORYAI__INFERENCE__REMOTE__APIKEY` | — | Injected as `Authorization: Bearer <key>` when present. |
| Remote timeout | `AdvisoryAI:Inference:Remote:TimeoutSeconds` | `ADVISORYAI__INFERENCE__REMOTE__TIMEOUTSECONDS` | `30` | Failed requests fall back to the sanitized prompt with `inference.fallback_reason=remote_timeout`. |
| Guardrail prompt cap | `AdvisoryAI:Guardrails:MaxPromptLength` | `ADVISORYAI__GUARDRAILS__MAXPROMPTLENGTH` | `16000` | Prompts longer than the cap are blocked with `prompt_too_long`. |
| Guardrail citations | `AdvisoryAI:Guardrails:RequireCitations` | `ADVISORYAI__GUARDRAILS__REQUIRECITATIONS` | `true` | When `true`, at least one citation must accompany every prompt. |
| Guardrail phrase seeds | `AdvisoryAI:Guardrails:BlockedPhrases[]`<br>`AdvisoryAI:Guardrails:BlockedPhraseFile` | `ADVISORYAI__GUARDRAILS__BLOCKEDPHRASES__0`<br>`ADVISORYAI__GUARDRAILS__BLOCKEDPHRASEFILE` | See defaults below | File paths are resolved relative to the content root; phrases are merged, de-duplicated, and lower-cased. |
| Chat request quota | `AdvisoryAI:Chat:Quotas:RequestsPerMinute` | `ADVISORYAI__CHAT__QUOTAS__REQUESTSPERMINUTE` | `60` | Requests per minute per user/org. |
| Chat daily request quota | `AdvisoryAI:Chat:Quotas:RequestsPerDay` | `ADVISORYAI__CHAT__QUOTAS__REQUESTSPERDAY` | `500` | Requests per day per user/org. |
| Chat token budget | `AdvisoryAI:Chat:Quotas:TokensPerDay` | `ADVISORYAI__CHAT__QUOTAS__TOKENSPERDAY` | `100000` | Tokens per day per user/org. |
| Chat tool budget | `AdvisoryAI:Chat:Quotas:ToolCallsPerDay` | `ADVISORYAI__CHAT__QUOTAS__TOOLCALLSPERDAY` | `10000` | Tool calls per day per user/org. |
| Guardrail scrubber entropy | `AdvisoryAI:Guardrails:EntropyThreshold` | `ADVISORYAI__GUARDRAILS__ENTROPYTHRESHOLD` | `3.5` | Entropy threshold for high-risk token redaction. |
| Guardrail scrubber min length | `AdvisoryAI:Guardrails:EntropyMinLength` | `ADVISORYAI__GUARDRAILS__ENTROPYMINLENGTH` | `20` | Minimum token length for entropy checks. |
| Guardrail scrubber allowlist file | `AdvisoryAI:Guardrails:AllowlistFile` | `ADVISORYAI__GUARDRAILS__ALLOWLISTFILE` | `data/advisory-ai/allowlist.txt` | Allowlisted patterns bypass redaction. |
| Guardrail scrubber allowlist patterns | `AdvisoryAI:Guardrails:AllowlistPatterns` | `ADVISORYAI__GUARDRAILS__ALLOWLISTPATTERNS__0` | See defaults | Additional allowlist patterns appended to the defaults. |
| Chat tools allow all | `AdvisoryAI:Chat:Tools:AllowAll` | `ADVISORYAI__CHAT__TOOLS__ALLOWALL` | `true` | When `true`, allows all tools with enabled providers. |
| Chat tool allowlist | `AdvisoryAI:Chat:Tools:AllowedTools` | `ADVISORYAI__CHAT__TOOLS__ALLOWEDTOOLS__0` | See defaults | Allowed tools when `AllowAll=false`. |
| Chat audit enabled | `AdvisoryAI:Chat:Audit:Enabled` | `ADVISORYAI__CHAT__AUDIT__ENABLED` | `true` | Toggles chat audit persistence. |
| Chat audit connection string | `AdvisoryAI:Chat:Audit:ConnectionString` | `ADVISORYAI__CHAT__AUDIT__CONNECTIONSTRING` | — | Postgres connection string for chat audit logs. |
| Chat audit schema | `AdvisoryAI:Chat:Audit:SchemaName` | `ADVISORYAI__CHAT__AUDIT__SCHEMANAME` | `advisoryai` | Schema for chat audit tables. |
| Chat audit evidence bundle | `AdvisoryAI:Chat:Audit:IncludeEvidenceBundle` | `ADVISORYAI__CHAT__AUDIT__INCLUDEEVIDENCEBUNDLE` | `false` | Stores the full evidence-bundle JSON in the audit log. |
| Chat audit retention | `AdvisoryAI:Chat:Audit:RetentionPeriod` | `ADVISORYAI__CHAT__AUDIT__RETENTIONPERIOD` | `90.00:00:00` | Retention period for audit logs. |
| Chat action policy allow | `AdvisoryAI:Chat:Actions:RequirePolicyAllow` | `ADVISORYAI__CHAT__ACTIONS__REQUIREPOLICYALLOW` | `true` | Requires policy-lattice approval before actions. |
| Plan cache TTL | `AdvisoryAI:PlanCache:DefaultTimeToLive`\* | `ADVISORYAI__PLANCACHE__DEFAULTTIMETOLIVE` | `00:10:00` | Controls how long cached plans are reused (`CleanupInterval` defaults to `00:05:00`). |
| Queue capacity | `AdvisoryAI:Queue:Capacity` | `ADVISORYAI__QUEUE__CAPACITY` | `1024` | Upper bound on in-memory tasks when using the default queue. |
| Queue wait interval | `AdvisoryAI:Queue:DequeueWaitInterval` | `ADVISORYAI__QUEUE__DEQUEUEWAITINTERVAL` | `00:00:01` | Back-off between queue polls when the queue is empty. |

\* The plan-cache section is bound via `AddOptions<AdvisoryPlanCacheOptions>()`; override it by adding an `AdvisoryAI__PlanCache` block to the host configuration.
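
For instance, a deployment that tightens the chat defaults from the table above might ship the following sketch; the quota values are illustrative and the connection string is a placeholder:

```json
{
  "AdvisoryAI": {
    "Chat": {
      "Quotas": {
        "RequestsPerMinute": 30,
        "RequestsPerDay": 250,
        "TokensPerDay": 50000,
        "ToolCallsPerDay": 5000
      },
      "Audit": {
        "Enabled": true,
        "ConnectionString": "Host=postgres;Database=stellaops;Username=advisoryai;Password=<secret>",
        "SchemaName": "advisoryai",
        "RetentionPeriod": "30.00:00:00"
      }
    }
  }
}
```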


## 1. Inference knobs & “temperature”

Advisory AI supports two inference modes:

- **Local (default)** – The orchestrator emits deterministic prompts and the worker returns the sanitized prompt verbatim. This mode is offline-friendly and calls no external LLMs. There is no stochastic “temperature” here; the pipeline is purely rule-based.
- **Remote** – Sanitized prompts, citations, and metadata are POSTed to `Remote.BaseAddress` + `Remote.Endpoint` (default `/v1/inference`). Remote providers control sampling temperature on their side. StellaOps treats remote responses deterministically: it records the provider’s `modelId`, token usage, and any metadata they return. If your remote tier exposes a temperature knob, set it there; Advisory AI simply forwards the prompt.

Remote inference quick sample:

```json
{
  "AdvisoryAI": {
    "Inference": {
      "Mode": "Remote",
      "Remote": {
        "BaseAddress": "https://inference.internal",
        "Endpoint": "/v1/inference",
        "ApiKey": "${ADVISORYAI_REMOTE_KEY}",
        "TimeoutSeconds": 45
      }
    }
  }
}
```
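
When a remote call fails, the worker falls back to the sanitized prompt and annotates the response. A hypothetical metadata fragment after a timeout (the layout is assumed for illustration; the key names are the ones documented in this guide):

```json
{
  "inference.fallback_reason": "remote_timeout",
  "inference.prompt_tokens": 1842,
  "inference.completion_tokens": 0
}
```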

## 2. Guardrail configuration

| Setting | Default | Explanation |
| --- | --- | --- |
| `MaxPromptLength` | 16000 chars | Upper bound enforced after redaction. Increase cautiously; remote providers typically cap prompts at 32k tokens. |
| `RequireCitations` | `true` | Forces each prompt to include at least one citation. Disable only when testing synthetic prompts. |
| `BlockedPhrases[]` | `ignore previous instructions`, `disregard earlier instructions`, `you are now the system`, `override the system prompt`, `please jailbreak` | Inline list merged with the optional file. Comparisons are case-insensitive. |
| `BlockedPhraseFile` | — | Points to a newline-delimited list. Relative paths resolve against the content root (`AdvisoryAI.Hosting` sticks to the `AppContext` base). |
| `EntropyThreshold` | 3.5 | Shannon-entropy threshold for high-risk token redaction. Set to 0 to disable entropy checks. |
| `EntropyMinLength` | 20 | Minimum token length evaluated by the entropy scrubber. |
| `AllowlistPatterns` | Defaults (sha256/sha1/sha384/sha512) | Regex patterns that bypass entropy redaction for known-safe identifiers. |
| `AllowlistFile` | — | Optional allowlist file (JSON array or newline-delimited). Paths resolve against the content root. |

Violations surface in the response metadata (`guardrail.violations[*]`) and increment `advisory_ai_guardrail_blocks_total`. Console consumes the same payload for its ribbon state.
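
The guardrail knobs combine as in the following sketch. The blocked phrase, phrase-file path, and regex values are illustrative additions, not shipped defaults:

```json
{
  "AdvisoryAI": {
    "Guardrails": {
      "MaxPromptLength": 16000,
      "RequireCitations": true,
      "BlockedPhrases": ["act as the root operator"],
      "BlockedPhraseFile": "data/advisory-ai/blocked-phrases.txt",
      "EntropyThreshold": 3.5,
      "EntropyMinLength": 20,
      "AllowlistFile": "data/advisory-ai/allowlist.txt",
      "AllowlistPatterns": ["^sha256:[0-9a-f]{64}$"]
    }
  }
}
```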

### 2.1 Tool policy lattice (chat)

Chat tool calls are allowed only when policy rules permit. Scope is evaluated on tenant, role, tool name, and resource.

Example (pseudo):

```text
allow_tool("vex.query") if role in ["analyst"] and namespace in ["team-a"]
deny_tool("vault.secrets.get") always
```
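
To move from allow-all to an explicit allowlist, pair the lattice rules with the tool keys from the reference table. A sketch with illustrative tool names (`advisory.summarise` is hypothetical; `vex.query` is borrowed from the pseudo-rules above):

```json
{
  "AdvisoryAI": {
    "Chat": {
      "Tools": {
        "AllowAll": false,
        "AllowedTools": ["vex.query", "advisory.summarise"]
      }
    }
  }
}
```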

## 3. Retrieval & ranking weights (per-task)

Each task type (Summary, Conflict, Remediation) inherits the defaults below. Override any value via `AdvisoryAI:Tasks:<TaskType>:<Property>`.

| Task | StructuredMaxChunks | VectorTopK | VectorQueries (default) | SbomMaxTimelineEntries | SbomMaxDependencyPaths | IncludeBlastRadius |
| --- | --- | --- | --- | --- | --- | --- |
| Summary | 25 | 5 | “Summarize key facts”, “What is impacted?” | 10 | 20 | ✔ |
| Conflict | 30 | 6 | “Highlight conflicting statements”, “Where do sources disagree?” | 8 | 15 | ✖ |
| Remediation | 35 | 6 | “Provide remediation steps”, “Outline mitigations and fixes” | 12 | 25 | ✔ |

These knobs act as weighting levers: lower `VectorTopK` emphasises deterministic evidence; higher values favour breadth. `StructuredMaxChunks` bounds how many CSAF/OSV/VEX chunks reach the prompt, keeping token budgets predictable.
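
For example, to narrow Summary retrieval to fewer, higher-confidence chunks, override the per-task properties under the documented `AdvisoryAI:Tasks:<TaskType>:<Property>` convention (the values here are illustrative):

```json
{
  "AdvisoryAI": {
    "Tasks": {
      "Summary": {
        "StructuredMaxChunks": 15,
        "VectorTopK": 3
      }
    }
  }
}
```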

## 4. Token budgets

`AdvisoryTaskBudget` holds `PromptTokens` and `CompletionTokens` per task. Defaults:

| Task | Prompt tokens | Completion tokens |
| --- | --- | --- |
| Summary | 2048 | 512 |
| Conflict | 2048 | 512 |
| Remediation | 2048 | 640 |

Override via `AdvisoryAI:Tasks:Summary:Budget:PromptTokens`, etc. The worker records actual consumption in the response metadata (`inference.prompt_tokens`, `inference.completion_tokens`).
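
A budget override follows the same key convention; a sketch with illustrative values:

```json
{
  "AdvisoryAI": {
    "Tasks": {
      "Remediation": {
        "Budget": {
          "PromptTokens": 3072,
          "CompletionTokens": 768
        }
      }
    }
  }
}
```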

## 5. Cache TTLs & queue directories

- **Plan cache TTLs** – In-memory and file-system caches honour `AdvisoryAI:PlanCache:DefaultTimeToLive` (default 10 minutes) and `CleanupInterval` (default 5 minutes). Shorten the TTL to reduce stale plans, or lengthen it to favour offline reuse. Both values accept ISO 8601 or `hh:mm:ss` time spans.
- **Queue & storage paths** – `AdvisoryAI:Queue:DirectoryPath`, `AdvisoryAI:Storage:PlanCacheDirectory`, and `AdvisoryAI:Storage:OutputDirectory` default to `data/advisory-ai/{queue,plans,outputs}` under the content root; override these when mounting RWX volumes in sovereign clusters.
- **Output TTLs** – Output artefacts inherit the host file-system retention policies. Combine `DefaultTimeToLive` with a cron or systemd timer to prune `outputs/` periodically when operating in remote-inference-heavy environments.

Example: raised TTL & custom queue path

```json
{
  "AdvisoryAI": {
    "PlanCache": {
      "DefaultTimeToLive": "00:20:00",
      "CleanupInterval": "00:05:00"
    },
    "Queue": {
      "DirectoryPath": "/var/lib/advisory-ai/queue"
    }
  }
}
```

## 6. Operational notes

- **Guardrail phrase updates** take effect only on host reload. When distributing blocked-phrase files via Offline Kits, keep filenames stable and version them through Git so QA can diff changes.
- **Temperature / sampling** remains a remote-provider concern. StellaOps records the provider’s `modelId` and exposes fallback metadata so policy authors can audit when sanitized prompts were returned instead of model output.
- **Traceability** – Always track changes in `docs/implplan/SPRINT_0111_0001_0001_advisoryai.md` (task DOCS-AIAI-31-006) when promoting this document so the guild can trace which parameters were added per sprint.