audit, advisories and doctors/setup work
- Preserve determinism: sort outputs, normalise timestamps (UTC ISO-8601), and avoid machine-specific artefacts.
- Keep Offline Kit parity in mind—document air-gapped workflows for any new feature.
- Update runbooks/observability assets when operational characteristics change.
- Assistant tool lattice rules must be deterministic and policy-auditable.

## Required Reading

- `docs/modules/policy/README.md`
- `docs/modules/policy/architecture.md`

# Advisory AI Assistant Parameters

_Primary audience: platform operators & policy authors • Updated: 2026-01-13_

This note centralises the tunable knobs that control Advisory AI’s planner, retrieval stack, inference clients, and guardrails. All options live under the `AdvisoryAI` configuration section and can be set via `appsettings.*` files or environment variables using ASP.NET Core’s double-underscore convention (`ADVISORYAI__Inference__Mode`, etc.). Chat quotas and tool allowlists can also be overridden per tenant/user via the chat settings endpoints; appsettings/env values are defaults.

**Policy/version pin** — For Sprint 0111, use the policy bundle hash shipped on 2025-11-19 (same drop as `CLI-VULN-29-001` / `CLI-VEX-30-001`). Set `AdvisoryAI:PolicyVersion` or `ADVISORYAI__POLICYVERSION=2025.11.19` in deployments; include the hash in DSSE metadata for Offline Kits.

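As a quick illustration of the double-underscore convention, the overrides below are hypothetical deployment values, not recommendations:

```shell
# Hypothetical overrides; each ":" in a configuration key becomes "__".
export ADVISORYAI__INFERENCE__MODE=Remote
export ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS="https://inference.internal.example/"
export ADVISORYAI__GUARDRAILS__MAXPROMPTLENGTH=24000
```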
| Area | Key(s) | Environment variable | Default | Notes |
| --- | --- | --- | --- | --- |
| Inference mode | `AdvisoryAI:Inference:Mode` | `ADVISORYAI__INFERENCE__MODE` | `Local` | `Local` runs the deterministic pipeline only; `Remote` posts sanitized prompts to `Remote.BaseAddress`. |
| Remote base URI | `AdvisoryAI:Inference:Remote:BaseAddress` | `ADVISORYAI__INFERENCE__REMOTE__BASEADDRESS` | — | Required when `Mode=Remote`. HTTPS strongly recommended. |
| Remote API key | `AdvisoryAI:Inference:Remote:ApiKey` | `ADVISORYAI__INFERENCE__REMOTE__APIKEY` | — | Injected as `Authorization: Bearer <key>` when present. |
| Remote timeout | `AdvisoryAI:Inference:Remote:TimeoutSeconds` | `ADVISORYAI__INFERENCE__REMOTE__TIMEOUTSECONDS` | `30` | Failing requests fall back to the sanitized prompt with `inference.fallback_reason=remote_timeout`. |
| Guardrail prompt cap | `AdvisoryAI:Guardrails:MaxPromptLength` | `ADVISORYAI__GUARDRAILS__MAXPROMPTLENGTH` | `16000` | Prompts longer than the cap are blocked with `prompt_too_long`. |
| Guardrail citations | `AdvisoryAI:Guardrails:RequireCitations` | `ADVISORYAI__GUARDRAILS__REQUIRECITATIONS` | `true` | When `true`, at least one citation must accompany every prompt. |
| Guardrail phrase seeds | `AdvisoryAI:Guardrails:BlockedPhrases[]`<br>`AdvisoryAI:Guardrails:BlockedPhraseFile` | `ADVISORYAI__GUARDRAILS__BLOCKEDPHRASES__0`<br>`ADVISORYAI__GUARDRAILS__BLOCKEDPHRASEFILE` | See defaults below | File paths are resolved relative to the content root; phrases are merged, de-duped, and lower-cased. |
| Chat request quota | `AdvisoryAI:Chat:Quotas:RequestsPerMinute` | `ADVISORYAI__CHAT__QUOTAS__REQUESTSPERMINUTE` | `60` | Requests per minute per user/org. |
| Chat daily request quota | `AdvisoryAI:Chat:Quotas:RequestsPerDay` | `ADVISORYAI__CHAT__QUOTAS__REQUESTSPERDAY` | `500` | Requests per day per user/org. |
| Chat token budget | `AdvisoryAI:Chat:Quotas:TokensPerDay` | `ADVISORYAI__CHAT__QUOTAS__TOKENSPERDAY` | `100000` | Tokens per day per user/org. |
| Chat tool budget | `AdvisoryAI:Chat:Quotas:ToolCallsPerDay` | `ADVISORYAI__CHAT__QUOTAS__TOOLCALLSPERDAY` | `10000` | Tool calls per day per user/org. |
| Guardrail scrubber entropy | `AdvisoryAI:Guardrails:EntropyThreshold` | `ADVISORYAI__GUARDRAILS__ENTROPYTHRESHOLD` | `3.5` | Entropy threshold for high-risk token redaction. |
| Guardrail scrubber min length | `AdvisoryAI:Guardrails:EntropyMinLength` | `ADVISORYAI__GUARDRAILS__ENTROPYMINLENGTH` | `20` | Minimum token length for entropy checks. |
| Guardrail scrubber allowlist file | `AdvisoryAI:Guardrails:AllowlistFile` | `ADVISORYAI__GUARDRAILS__ALLOWLISTFILE` | `data/advisory-ai/allowlist.txt` | Allowlisted patterns bypass redaction. |
| Guardrail scrubber allowlist patterns | `AdvisoryAI:Guardrails:AllowlistPatterns` | `ADVISORYAI__GUARDRAILS__ALLOWLISTPATTERNS__0` | See defaults | Additional allowlist patterns appended to defaults. |
| Chat tools allow all | `AdvisoryAI:Chat:Tools:AllowAll` | `ADVISORYAI__CHAT__TOOLS__ALLOWALL` | `true` | When `true`, allow all tools with enabled providers. |
| Chat tool allowlist | `AdvisoryAI:Chat:Tools:AllowedTools` | `ADVISORYAI__CHAT__TOOLS__ALLOWEDTOOLS__0` | See defaults | Allowed tools when `AllowAll=false`. |
| Chat audit enabled | `AdvisoryAI:Chat:Audit:Enabled` | `ADVISORYAI__CHAT__AUDIT__ENABLED` | `true` | Toggles chat audit persistence. |
| Chat audit connection string | `AdvisoryAI:Chat:Audit:ConnectionString` | `ADVISORYAI__CHAT__AUDIT__CONNECTIONSTRING` | — | Postgres connection string for chat audit logs. |
| Chat audit schema | `AdvisoryAI:Chat:Audit:SchemaName` | `ADVISORYAI__CHAT__AUDIT__SCHEMANAME` | `advisoryai` | Schema for chat audit tables. |
| Chat audit evidence bundle | `AdvisoryAI:Chat:Audit:IncludeEvidenceBundle` | `ADVISORYAI__CHAT__AUDIT__INCLUDEEVIDENCEBUNDLE` | `false` | Store full evidence bundle JSON in audit log. |
| Chat audit retention | `AdvisoryAI:Chat:Audit:RetentionPeriod` | `ADVISORYAI__CHAT__AUDIT__RETENTIONPERIOD` | `90.00:00:00` | Retention period for audit logs. |
| Chat action policy allow | `AdvisoryAI:Chat:Actions:RequirePolicyAllow` | `ADVISORYAI__CHAT__ACTIONS__REQUIREPOLICYALLOW` | `true` | Require policy lattice approval before actions. |
| Plan cache TTL | `AdvisoryAI:PlanCache:DefaultTimeToLive` | `ADVISORYAI__PLANCACHE__DEFAULTTIMETOLIVE` | `00:10:00` | Controls how long cached plans are reused. (`CleanupInterval` defaults to `00:05:00`). |
| Queue capacity | `AdvisoryAI:Queue:Capacity` | `ADVISORYAI__QUEUE__CAPACITY` | `1024` | Upper bound on in-memory tasks when using the default queue. |
| Queue wait interval | `AdvisoryAI:Queue:DequeueWaitInterval` | `ADVISORYAI__QUEUE__DEQUEUEWAITINTERVAL` | `00:00:01` | Back-off between queue polls when empty. |
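The chat quota rows above map to the following `appsettings.json` fragment (values shown are the defaults):

```json
{
  "AdvisoryAI": {
    "Chat": {
      "Quotas": {
        "RequestsPerMinute": 60,
        "RequestsPerDay": 500,
        "TokensPerDay": 100000,
        "ToolCallsPerDay": 10000
      }
    }
  }
}
```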
---
## 1. Inference knobs & “temperature”

Advisory AI supports two inference modes:

- **Local (default)** – The orchestrator emits deterministic prompts and the worker returns the sanitized prompt verbatim. This mode is offline-friendly and does **not** call any external LLMs. There is no stochastic “temperature” here—the pipeline is purely rule-based.
- **Remote** – Sanitized prompts, citations, and metadata are POSTed to `Remote.BaseAddress + Remote.Endpoint` (default `/v1/inference`). Remote providers control sampling temperature on their side. StellaOps treats remote responses deterministically: we record the provider’s `modelId`, token usage, and any metadata they return. If your remote tier exposes a temperature knob, set it there; Advisory AI simply forwards the prompt.

### Remote inference quick sample
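A hedged sketch of the request shape: the worker POSTs the sanitized prompt, citations, and metadata to `Remote.BaseAddress + Remote.Endpoint`, sending the API key as a bearer token when configured. The body field names below are assumptions for illustration, not the wire contract:

```text
POST /v1/inference HTTP/1.1
Authorization: Bearer <ApiKey>
Content-Type: application/json

{
  "prompt": "<sanitized prompt>",
  "citations": ["<citation>", "..."],
  "metadata": { "taskType": "Summary" }
}
```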
| Setting | Default | Explanation |
| --- | --- | --- |
| `MaxPromptLength` | 16000 chars | Upper bound enforced after redaction. Increase cautiously—remote providers typically cap prompts at 32k tokens. |
| `RequireCitations` | `true` | Forces each prompt to include at least one citation. Disable only when testing synthetic prompts. |
| `BlockedPhrases[]` | `ignore previous instructions`, `disregard earlier instructions`, `you are now the system`, `override the system prompt`, `please jailbreak` | Inline list merged with the optional file. Comparisons are case-insensitive. |
| `BlockedPhraseFile` | — | Points to a newline-delimited list. Relative paths resolve against the content root (`AdvisoryAI.Hosting` sticks to AppContext base). |
| `EntropyThreshold` | `3.5` | Shannon entropy threshold for high-risk token redaction. Set to `0` to disable entropy checks. |
| `EntropyMinLength` | `20` | Minimum token length evaluated by the entropy scrubber. |
| `AllowlistPatterns` | Defaults (sha256/sha1/sha384/sha512) | Regex patterns that bypass entropy redaction for known-safe identifiers. |
| `AllowlistFile` | — | Optional allowlist file (JSON array or newline-delimited). Paths resolve against the content root. |
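To make the entropy knobs concrete, here is a minimal sketch of how such a scrubber could behave. It is illustrative only, not the worker's actual implementation; the function names are hypothetical.

```python
import math
import re

# Illustrative defaults mirroring the table above.
ENTROPY_THRESHOLD = 3.5   # AdvisoryAI:Guardrails:EntropyThreshold
MIN_LENGTH = 20           # AdvisoryAI:Guardrails:EntropyMinLength
ALLOWLIST = [re.compile(r"^[0-9a-f]{64}$")]  # e.g. sha256 digests bypass redaction

def shannon_entropy(token: str) -> float:
    """Bits per character over the token's character distribution."""
    counts: dict[str, int] = {}
    for ch in token:
        counts[ch] = counts.get(ch, 0) + 1
    n = len(token)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def redact_high_entropy(token: str) -> str:
    """Redact long, high-entropy tokens unless an allowlist pattern matches."""
    if len(token) < MIN_LENGTH:
        return token
    if any(p.match(token) for p in ALLOWLIST):
        return token
    if shannon_entropy(token) >= ENTROPY_THRESHOLD:
        return "[REDACTED]"
    return token
```

A 64-character hex digest passes through via the allowlist, while an arbitrary high-entropy secret of 20+ characters is replaced.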
|
Violations surface in the response metadata (`guardrail.violations[*]`) and increment `advisory_ai_guardrail_blocks_total`. Console consumes the same payload for its ribbon state.

## 2.1 Tool policy lattice (chat)

Chat tool calls are allowed only when policy rules permit. Scope is evaluated on tenant, role, tool name, and resource.

Example (pseudo):

```text
allow_tool("vex.query") if role in ["analyst"] and namespace in ["team-a"]
deny_tool("vault.secrets.get") always
```

## 3. Retrieval & ranking weights (per-task)

Each task type (Summary, Conflict, Remediation) inherits the defaults below. Override any value via `AdvisoryAI:Tasks:<TaskType>:<Property>`.

| Task | `StructuredMaxChunks` | `VectorTopK` | `VectorQueries` (default) | `SbomMaxTimelineEntries` | `SbomMaxDependencyPaths` | `IncludeBlastRadius` |
| --- | --- | --- | --- | --- | --- | --- |
| Summary | 25 | 5 | `Summarize key facts`, `What is impacted?` | 10 | 20 | ✔ |
| Conflict | 30 | 6 | `Highlight conflicting statements`, `Where do sources disagree?` | 8 | 15 | ✖ |
| Remediation | 35 | 6 | `Provide remediation steps`, `Outline mitigations and fixes` | 12 | 25 | ✔ |

These knobs act as weighting levers: lower `VectorTopK` emphasises deterministic evidence; higher values favour breadth. `StructuredMaxChunks` bounds how many CSAF/OSV/VEX chunks reach the prompt, keeping token budgets predictable.
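Per-task overrides follow the same environment-variable convention; the values below are a hypothetical example of widening Summary retrieval, not recommendations:

```shell
# Hypothetical overrides for AdvisoryAI:Tasks:Summary:* properties.
export ADVISORYAI__TASKS__SUMMARY__VECTORTOPK=8
export ADVISORYAI__TASKS__SUMMARY__STRUCTUREDMAXCHUNKS=40
```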

| Task | Prompt tokens | Completion tokens |
| --- | --- | --- |
| Summary | 2 048 | 512 |
| Conflict | 2 048 | 512 |
| Remediation | 2 048 | 640 |

Overwrite via `AdvisoryAI:Tasks:Summary:Budget:PromptTokens`, etc. The worker records actual consumption in the response metadata (`inference.prompt_tokens`, `inference.completion_tokens`).
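In `appsettings.json` form, a hypothetical raised Remediation budget would look like this (the values are illustrative, not the defaults):

```json
{
  "AdvisoryAI": {
    "Tasks": {
      "Remediation": {
        "Budget": {
          "PromptTokens": 3072,
          "CompletionTokens": 768
        }
      }
    }
  }
}
```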

## 5. Cache TTLs & queue directories

- **Plan cache TTLs** – In-memory and file-system caches honour `AdvisoryAI:PlanCache:DefaultTimeToLive` (default 10 minutes) and `CleanupInterval` (default 5 minutes). Shorten the TTL to reduce stale plans or increase it to favour offline reuse. Both values accept ISO 8601 or `hh:mm:ss` time spans.
- **Queue & storage paths** – `AdvisoryAI:Queue:DirectoryPath`, `AdvisoryAI:Storage:PlanCacheDirectory`, and `AdvisoryAI:Storage:OutputDirectory` default to `data/advisory-ai/{queue,plans,outputs}` under the content root; override these when mounting RWX volumes in sovereign clusters.
- **Output TTLs** – Output artefacts inherit the host file-system retention policies. Combine `DefaultTimeToLive` with a cron or systemd timer to prune `outputs/` periodically when operating in remote-inference-heavy environments.

### Example: raised TTL & custom queue path
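A sketch consistent with the options above; the TTL, interval, and path values are illustrative, not recommendations:

```json
{
  "AdvisoryAI": {
    "PlanCache": {
      "DefaultTimeToLive": "01:00:00",
      "CleanupInterval": "00:15:00"
    },
    "Queue": {
      "DirectoryPath": "/mnt/advisory-ai/queue"
    }
  }
}
```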
## 6. Operational notes

- Updating **guardrail phrases** takes effect only on host reload. When distributing blocked-phrase files via Offline Kits, keep filenames stable and version them through Git so QA can diff changes.
- **Temperature / sampling** remains a remote-provider concern. StellaOps records the provider’s `modelId` and exposes fallback metadata so policy authors can audit when sanitized prompts were returned instead of model output.
- Always track changes in `docs/implplan/SPRINT_0111_0001_0001_advisoryai.md` (task `DOCS-AIAI-31-006`) when promoting this document so the guild can trace which parameters were added per sprint.

---

_New file: `docs/modules/policy/guides/assistant-tool-lattice.md` (29 lines)_

# Assistant Tool Lattice Policy Mapping

This guide defines the tool lattice rule schema and default scope mapping for assistant tool calls.
The lattice is evaluated by Policy Gateway and returns allow or deny decisions for each tool request.

## Default deny behavior

- If no rule matches a tool request, the decision is deny.
- A rule must match tool name, action, and any configured tenant, role, scope, or resource filters to allow access.

## Rule fields

- `tool`: Tool name or wildcard pattern (for example, `vex.query` or `scanner.*`).
- `action`: Read or action discriminator (for example, `read` or `action`).
- `scopes`: Required Authority scopes (one or more).
- `roles`: Optional role filters (one or more).
- `tenants`: Optional tenant filters (one or more).
- `resource`: Optional resource pattern (for example, `sbom:component:*`).
- `effect`: `allow` or `deny`.
- `priority`: Integer priority; higher values evaluate first.

## Default scope mapping

| Tool | Action | Required scopes |
| --- | --- | --- |
| `vex.query` | read | `vex:read` |
| `sbom.read` | read | `sbom:read` |
| `scanner.findings.topk` | read | `scanner:read` or `findings:read` |


## Override guidance

- Use `priority` to override default rules.
- Keep rules deterministic by using stable patterns and avoiding ambiguous overlaps.
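The default-deny evaluation described in this guide can be sketched as follows. This is an illustrative model only, not the Policy Gateway implementation; the `Rule` and `evaluate` names are hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Optional

@dataclass
class Rule:
    tool: str                          # wildcard pattern, e.g. "scanner.*"
    action: str                        # "read" or "action"
    scopes: frozenset                  # required Authority scopes
    effect: str                        # "allow" or "deny"
    priority: int = 0                  # higher values evaluate first
    roles: Optional[frozenset] = None      # optional role filter
    tenants: Optional[frozenset] = None    # optional tenant filter
    resource: Optional[str] = None         # optional resource pattern

def evaluate(rules, *, tool, action, scopes, role, tenant, resource=None):
    """First matching rule wins (descending priority); no match means deny."""
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if not fnmatch(tool, rule.tool) or action != rule.action:
            continue
        if not rule.scopes <= set(scopes):          # caller must hold all scopes
            continue
        if rule.roles is not None and role not in rule.roles:
            continue
        if rule.tenants is not None and tenant not in rule.tenants:
            continue
        if rule.resource is not None and (
            resource is None or not fnmatch(resource, rule.resource)
        ):
            continue
        return rule.effect
    return "deny"  # default deny when nothing matches
```

Under this sketch, an analyst holding `vex:read` can call `vex.query`, while any request with no matching rule falls through to deny.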