Add Policy DSL Validator, Schema Exporter, and Simulation Smoke tools

- Implemented PolicyDslValidator with command-line options for strict mode and JSON output.
- Created PolicySchemaExporter to generate JSON schemas for policy-related models.
- Developed PolicySimulationSmoke tool to validate policy simulations against expected outcomes.
- Added project files and necessary dependencies for each tool.
- Ensured proper error handling and usage instructions across tools.
2025-10-27 08:00:11 +02:00
parent 651b8e0fa3
commit 96d52884e8
712 changed files with 49449 additions and 6124 deletions
--- a/docs/policy/dsl.md
+++ b/docs/policy/dsl.md
@@ -0,0 +1,294 @@
+# Stella Policy DSL (`stella-dsl@1`)
+
+> **Audience:** Policy authors, reviewers, and tooling engineers building lint/compile flows for the Policy Engine v2 rollout (Sprint 20).
+
+This document specifies the `stella-dsl@1` grammar, semantics, and guardrails used by Stella Ops to transform SBOM facts, Concelier advisories, and Excititor VEX statements into effective findings. Use it with the [Policy Engine Overview](overview.md) for architectural context and the upcoming lifecycle/run guides for operational workflows.
+
+---
+
+## 1 · Design Goals
+
+- **Deterministic:** Same policy + same inputs ⇒ identical findings on every machine.
+- **Declarative:** No arbitrary loops, network calls, or clock access.
+- **Explainable:** Every decision records the rule, inputs, and rationale in the explain trace.
+- **Lean authoring:** Common precedence, severity, and suppression patterns are first-class.
+- **Offline-friendly:** Grammar and built-ins avoid cloud dependencies and behave identically in sealed deployments.
+
+---
+
+## 2 · Document Structure
+
+Policy packs ship one or more `.stella` files. Each file contains exactly one `policy` block:
+
+```dsl
+policy "Default Org Policy" syntax "stella-dsl@1" {
+  metadata {
+    description = "Baseline severity + VEX precedence"
+    tags = ["baseline","vex"]
+  }
+
+  profile severity {
+    map vendor_weight {
+      source "GHSA" => +0.5;
+      source "OSV"  => +0.0;
+      source "VendorX" => -0.2;
+    }
+    env exposure_adjustments {
+      if env.runtime == "serverless" then -0.5;
+      if env.exposure == "internal-only" then -1.0;
+    }
+  }
+
+  rule vex_precedence priority 10 {
+    when vex.any(status in ["not_affected","fixed"])
+      and vex.justification in ["component_not_present","vulnerable_code_not_present"]
+    then status := vex.status
+    because "Strong vendor justification prevails";
+  }
+}
+```
+
+High-level layout:
+
+| Section | Purpose |
+|---------|---------|
+| `metadata` | Optional descriptive fields surfaced in Console/CLI. |
+| `imports` | Reserved for future reuse (not yet implemented in `@1`). |
+| `profile` blocks | Declarative scoring modifiers (`severity`, `trust`, `reachability`). |
+| `rule` blocks | When/then logic applied to each `(component, advisory, vex[])` tuple. |
+| `settings` | Optional evaluation toggles (sampling, default status overrides). |
+
+---
+
+## 3 · Lexical Rules
+
+- **Case sensitivity:** Keywords are lowercase; identifiers are case-sensitive.
+- **Whitespace:** Space, tab, newline act as separators. Indentation is cosmetic.
+- **Comments:** `// inline` and `/* block */` are ignored.
+- **Literals:**
+  - Strings use double quotes (`"text"`); escape with `\"`, `\n`, `\t`.
+  - Numbers are decimal; suffix `%` allowed for percentage weights (`-2.5%` becomes `-0.025`).
+  - Booleans: `true`, `false`.
+  - Lists: `[1, 2, 3]`, `["a","b"]`.
+- **Identifiers:** Start with letter or underscore, continue with letters, digits, `_`.
+- **Operators:** `=`, `==`, `!=`, `<`, `<=`, `>`, `>=`, `in`, `not in`, `and`, `or`, `not`, `:=`.
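+
+The lexical rules above can be seen together in a short fragment. This is an illustrative sketch only: `trust` is one of the documented profile kinds, but the map name `feed_weight`, the scalar `reviewers`, and the weights are hypothetical and not taken from a shipped pack.
+
+```dsl
+// Line comment: ignored by the lexer.
+/* Block comment: also ignored. */
+profile trust {
+  map feed_weight {                // hypothetical map name
+    source "GHSA"    => +5%;       // percent suffix: read as +0.05
+    source "VendorX" => -2.5%;     // read as -0.025, as in the literal rule above
+  }
+  reviewers = ["alice", "bob"];    // hypothetical scalar holding a list of strings
+}
+```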
+
+---
+
+## 4 · Grammar (EBNF)
+
+```ebnf
+policy      = "policy", string, "syntax", string, "{", policy-body, "}" ;
+policy-body = { metadata | profile | settings | rule | helper } ;
+
+metadata    = "metadata", "{", { meta-entry }, "}" ;
+meta-entry  = identifier, "=", (string | list) ;
+
+profile     = "profile", identifier, "{", { profile-item }, "}" ;
+profile-item= map | env-map | scalar ;
+map         = "map", identifier, "{", { "source", string, "=>", number, ";" }, "}" ;
+env-map     = "env", identifier, "{", { "if", expression, "then", number, ";" }, "}" ;
+scalar      = identifier, "=", (number | string | list), ";" ;
+
+settings    = "settings", "{", { setting-entry }, "}" ;
+setting-entry = identifier, "=", (number | string | boolean), ";" ;
+
+rule        = "rule", identifier, [ "priority", integer ], "{",
+                 "when", predicate,
+                 { "and", predicate },
+                 "then", { action },
+                 [ "else", { action } ],
+                 [ "because", string ],
+             "}" ;
+
+predicate   = expression ;
+expression  = term, { ("and" | "or"), term } ;
+term        = ["not"], factor ;
+factor      = comparison | membership | function-call | literal | identifier | "(" expression ")" ;
+comparison  = value, comparator, value ;
+membership  = value, ("in" | "not in"), list ;
+value       = identifier | literal | function-call | field-access ;
+field-access= identifier, { ".", identifier | "[" literal "]" } ;
+function-call = identifier, "(", [ arg-list ], ")" ;
+arg-list    = expression, { ",", expression } ;
+literal     = string | number | boolean | list ;
+
+action      = assignment | ignore | escalate | require | warn | defer | annotate ;
+assignment  = target, ":=", expression, ";" ;
+target      = identifier, { ".", identifier } ;
+ignore      = "ignore", [ "until", expression ], [ "because", string ], ";" ;
+escalate    = "escalate", [ "to", expression ], [ "when", expression ], ";" ;
+require     = "requireVex", "{", require-fields, "}", ";" ;
+warn        = "warn", [ "message", string ], ";" ;
+defer       = "defer", [ "until", expression ], ";" ;
+annotate    = "annotate", identifier, ":=", expression, ";" ;
+```
+
+Notes:
+
+- `helper` is reserved for shared calculations (not yet implemented in `@1`).
+- The `else` branch executes only if the `when` predicates evaluate falsy **and** no rule earlier in priority order has already handled the tuple.
+- Semicolons inside rule bodies are optional when each clause is on its own line; the compiler emits canonical semicolons in IR.
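+
+As a sketch of how the `rule` production reads in practice, the fragment below spells out every semicolon and chains two actions after `then`. The rule name, message text, and annotation key are hypothetical; only the keywords, built-ins, and namespaces come from this document.
+
+```dsl
+rule vendor_claim_review priority 20 {
+  when advisory.source == "VendorX"
+    and exists(vex.statementId)
+  then
+    warn message "Vendor claim needs review";
+    annotate reviewed_source := advisory.source;
+  because "Vendor-only advisories get a second look";
+}
+```
+
+Per the note on semicolons, the same rule should also parse with the semicolons omitted, since each clause sits on its own line.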
+
+---
+
+## 5 · Evaluation Context
+
+Within predicates and actions you may reference the following namespaces:
+
+| Namespace | Fields | Description |
+|-----------|--------|-------------|
+| `sbom` | `purl`, `name`, `version`, `licenses`, `layerDigest`, `tags`, `usedByEntrypoint` | Component metadata from Scanner. |
+| `advisory` | `id`, `source`, `aliases`, `severity`, `cvss`, `publishedAt`, `modifiedAt`, `content.raw` | Canonical Concelier advisory view. |
+| `vex` | `status`, `justification`, `statementId`, `timestamp`, `scope` | Current VEX statement when iterating; aggregator helpers available. |
+| `vex` (aggregates) | `vex.any(...)`, `vex.all(...)`, `vex.count(...)` | Functions operating over all matching statements. |
+| `run` | `policyId`, `policyVersion`, `tenant`, `timestamp` | Metadata for explain annotations. |
+| `env` | Arbitrary keys (e.g., `environment`, `runtime`) | Key/value pairs injected per run. |
+| `telemetry` | Optional reachability signals | Missing fields evaluate to `unknown`. |
+| `profile.<name>` | Maps and scalars | Values computed inside profile blocks. |
+
+Missing fields evaluate to `null`, which is falsey in boolean context and propagates through comparisons unless explicitly checked.
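+
+Because `env` keys are injected per run and may be absent, a predicate can guard the access explicitly before comparing. The following is a minimal sketch, assuming `env.exposure` is populated as in the other examples in this document; the rule name and rationale text are hypothetical.
+
+```dsl
+rule guarded_exposure_check priority 30 {
+  when exists(env.exposure)              // explicit null check comes first
+    and env.exposure == "internet"       // only meaningful once the key exists
+  then escalate to severity_band("critical")
+  because "Internet exposure confirmed by run metadata";
+}
+```
+
+Without the `exists(...)` guard, a missing `env.exposure` would propagate `null` through the comparison and the rule would simply not match, which can hide a misconfigured run.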
+
+---
+
+## 6 · Built-ins (v1)
+
+| Function / Property | Signature | Description |
+|---------------------|-----------|-------------|
+| `normalize_cvss(advisory)` | `Advisory → SeverityScalar` | Parses `advisory.content.raw` for CVSS data; falls back to policy maps. |
+| `cvss(score, vector)` | `double × string → SeverityScalar` | Constructs a severity object manually. |
+| `severity_band(value)` | `string → SeverityBand` | Normalises strings like `"critical"`, `"medium"`. |
+| `risk_score(base, modifiers...)` | Variadic | Multiplies numeric modifiers (severity × trust × reachability). |
+| `vex.any(predicate)` | `(Statement → bool) → bool` | `true` if any statement satisfies predicate. |
+| `vex.all(predicate)` | `(Statement → bool) → bool` | `true` if all statements satisfy predicate. |
+| `vex.latest()` | `→ Statement` | Lexicographically newest statement. |
+| `advisory.has_tag(tag)` | `string → bool` | Checks advisory metadata tags. |
+| `advisory.matches(pattern)` | `string → bool` | Glob match against advisory identifiers. |
+| `sbom.has_tag(tag)` | `string → bool` | Uses SBOM inventory tags (usage vs inventory). |
+| `exists(expression)` | `→ bool` | `true` when the value is neither null nor empty. |
+| `coalesce(a, b, ...)` | `→ value` | First non-null argument. |
+| `days_between(dateA, dateB)` | `→ int` | Absolute day difference (UTC). |
+| `percent_of(part, whole)` | `→ double` | Fractions for scoring adjustments. |
+| `lowercase(text)` | `string → string` | Normalises casing deterministically (InvariantCulture). |
+
+All built-ins are pure; if inputs are null the result is null unless otherwise noted.
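+
+A few of these built-ins compose naturally inside a single rule. The sketch below is illustrative only: the rule name, the 90-day threshold, and the annotation key `last_touched` are hypothetical, while the functions and fields are those listed in this document.
+
+```dsl
+rule aging_ghsa_advisory priority 40 {
+  when lowercase(advisory.source) == "ghsa"
+    and days_between(advisory.publishedAt, run.timestamp) > 90
+  then
+    warn message "Advisory is more than 90 days old"
+    annotate last_touched := coalesce(advisory.modifiedAt, advisory.publishedAt)
+  because "Old advisories deserve a fresh look";
+}
+```
+
+Using `run.timestamp` rather than a wall-clock call keeps the age check deterministic for a given run.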
+
+---
+
+## 7 · Rule Semantics
+
+1. **Ordering:** Rules execute in ascending `priority`. When priorities tie, lexical order defines precedence.
+2. **Short-circuit:** Once a rule sets `status`, subsequent rules only execute if they use `combine`. Use `combine` sparingly to avoid ambiguity.
+3. **Actions:**
+   - `status := <string>` – Allowed values: `affected`, `not_affected`, `fixed`, `suppressed`, `under_investigation`, `escalated`.
+   - `severity := <SeverityScalar>` – Either from `normalize_cvss`, `cvss`, or a numeric map; the result always carries `normalized` and `score`.
+   - `ignore until <ISO-8601>` – Temporarily treats finding as suppressed until timestamp; recorded in explain trace.
+   - `warn message "<text>"` – Adds warn verdict and deducts `warnPenalty`.
+   - `escalate to severity_band("critical") when condition` – Forces verdict severity upward when condition true.
+   - `requireVex { vendors = ["VendorX"], justifications = ["component_not_present"] }` – Fails evaluation if matching VEX evidence absent.
+   - `annotate reason := "text"` – Adds free-form key/value pairs to explain payload.
+4. **Because clause:** Mandatory for actions changing status or severity; captured verbatim in explain traces.
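+
+The ordering and short-circuit behaviour described above can be sketched with two rules. Rule names, the `legacy` tag, and the waiver date are hypothetical; the actions and status values are the ones listed in this section.
+
+```dsl
+// Lower priority number runs first.
+rule vex_not_affected priority 5 {
+  when vex.any(status == "not_affected")
+  then status := "not_affected"
+  because "Vendor VEX statement marks the component as not affected";
+}
+
+// Evaluated after the rule above. Under the short-circuit rule it is
+// skipped for tuples whose status was already set at priority 5.
+rule legacy_waiver priority 10 {
+  when sbom.has_tag("legacy")
+  then ignore until "2025-12-31T00:00:00Z"
+  because "Legacy components are waived until the migration date";
+}
+```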
+
+---
+
+## 8 · Scoping Helpers
+
+- **Maps:** Use `profile severity { map vendor_weight { ... } }` to declare additive factors. Retrieve with `profile.severity.vendor_weight["GHSA"]`.
+- **Environment overrides:** `env` profiles allow conditional adjustments based on runtime metadata.
+- **Tenancy:** `run.tenant` ensures policies remain tenant-aware; avoid hardcoding single-tenant IDs.
+- **Default values:** Use `settings { default_status = "affected"; }` to override built-in defaults.
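+
+A short sketch of the helpers above used together, assuming the `profile severity { map vendor_weight { ... } }` block from section 2 is declared in the same policy. The rule name and the annotation key are hypothetical.
+
+```dsl
+settings {
+  default_status = "affected";    // overrides the built-in default
+}
+
+rule ghsa_weight_note priority 50 {
+  when advisory.source == "GHSA"
+  then annotate vendor_weight := profile.severity.vendor_weight["GHSA"]
+}
+```
+
+The rule only annotates, so it changes neither status nor severity and omits the `because` clause.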
+
+---
+
+## 9 · Examples
+
+### 9.1 Baseline Severity Normalisation
+
+```dsl
+rule advisory_normalization {
+  when advisory.source in ["GHSA","OSV"]
+  then severity := normalize_cvss(advisory)
+  because "Align vendor severity to CVSS baseline";
+}
+```
+
+### 9.2 VEX Override with Quiet Mode
+
+```dsl
+rule vex_strong_claim priority 5 {
+  when vex.any(status == "not_affected")
+       and vex.justification in ["component_not_present","vulnerable_code_not_present"]
+  then status := vex.status
+       annotate winning_statement := vex.latest().statementId
+       warn message "VEX override applied"
+  because "Strong VEX justification";
+}
+```
+
+### 9.3 Environment-Specific Escalation
+
+```dsl
+rule internet_exposed_guard {
+  when env.exposure == "internet"
+       and severity.normalized >= "High"
+  then escalate to severity_band("Critical")
+  because "Internet-exposed assets require critical posture";
+}
+```
+
+### 9.4 Anti-pattern (flagged by linter)
+
+```dsl
+rule catch_all {
+  when true
+  then status := "suppressed"
+  because "Suppress everything"  // ❌ Fails lint: unbounded suppression
+}
+```
+
+---
+
+## 10 · Validation & Tooling
+
+- `stella policy lint` ensures:
+  - Grammar compliance and canonical formatting.
+  - Static determinism guard (no forbidden namespaces).
+  - Anti-pattern detection (e.g., unconditional suppression, missing `because`).
+- `stella policy compile` emits IR (`.stella.ir.json`) and SHA-256 digest used in `policy_runs`.
+- CI pipelines (see `DEVOPS-POLICY-20-001`) compile sample packs and fail on lint violations.
+- Simulation harnesses (`stella policy simulate`) highlight provided/queried fields so policy authors can confirm their assumptions before promotion.
+
+---
+
+## 11 · Anti-patterns & Mitigations
+
+| Anti-pattern | Risk | Mitigation |
+|--------------|------|------------|
+| Catch-all suppress/ignore without scope | Masks all findings | Linter blocks rules with `when true` unless `priority` > 1000 and justification includes remediation plan. |
+| Comparing strings with inconsistent casing | Missed matches | Wrap comparisons in `lowercase(value)` to align casing or normalise metadata during ingest. |
+| Referencing `telemetry` without fallback | Null propagation | Wrap access in `exists(telemetry.reachability)`. |
+| Hardcoding tenant IDs | Breaks multi-tenant | Prefer `env.tenantTag` or metadata-sourced predicates. |
+| Duplicated rule names | Explain trace ambiguity | Compiler enforces unique `rule` identifiers within a policy. |
+
+---
+
+## 12 · Versioning & Compatibility
+
+- `syntax "stella-dsl@1"` is mandatory.
+- Future revisions (`@2`, …) will be additive; existing packs continue to compile with their declared version.
+- The compiler canonicalises documents (sorted keys, normalised whitespace) before hashing to ensure reproducibility.
+
+---
+
+## 13 · Compliance Checklist
+
+- [ ] **Grammar validated:** Policy compiles with `stella policy lint` and matches `syntax "stella-dsl@1"`.
+- [ ] **Deterministic constructs only:** No use of forbidden namespaces (`DateTime.Now`, `Guid.NewGuid`, external services).
+- [ ] **Rationales present:** Every status/severity change includes a `because` clause or `annotate` entry.
+- [ ] **Scoped suppressions:** Rules that ignore/suppress findings reference explicit components, vendors, or VEX justifications.
+- [ ] **Explain fields verified:** `annotate` keys align with Console/CLI expectations (documented in upcoming lifecycle guide).
+- [ ] **Offline parity tested:** Policy pack simulated in sealed mode (`--sealed`) to confirm absence of network dependencies.
+
+---
+
+*Last updated: 2025-10-26 (Sprint 20).*