# Trust Algebra and Lattice Engine Specification
This spec defines a deterministic “Trust Algebra / Lattice Engine” that ingests heterogeneous security assertions (SBOM, VEX, reachability, provenance attestations), normalizes them into a canonical claim model, merges them using lattice operations that preserve **unknowns and contradictions**, and produces a **signed, replayable verdict** with an auditable proof trail.
The design deliberately separates:
1. **Knowledge aggregation** (monotone, conflict-preserving, order-independent), from
2. **Decision selection** (policy-driven, trust-aware, environment-aware).
This prevents “heuristics creep” and makes the system explainable and reproducible.
---
# 1) Scope and objectives
### 1.1 What the engine must do
* Accept VEX from multiple standards (OpenVEX, CSAF VEX, CycloneDX/ECMA-424 VEX).
* Accept internally generated evidence (SBOM, reachability proofs, mitigations, patch/pedigree evidence).
* Merge claims while representing:
* **Unknown** (no evidence)
* **Conflict** (credible evidence for both sides)
* Compute an output disposition aligned to common VEX output states:
* CycloneDX impact-analysis states include: `resolved`, `resolved_with_pedigree`, `exploitable`, `in_triage`, `false_positive`, `not_affected`. ([Ecma International][1])
* Provide deterministic, signed, replayable results:
* Same inputs + same policy bundle ⇒ same outputs.
* Produce a proof object that can be independently verified offline.
### 1.2 Non-goals
* “One score to rule them all” without proofs.
* Probabilistic scoring as the primary decision mechanism.
* Trust by vendor branding instead of cryptographic/verifiable identity.
---
# 2) Standards surface (external inputs) and canonicalization targets
The engine should support at minimum these external statement types:
### 2.1 CycloneDX / ECMA-424 VEX (embedded)
CycloneDX's vulnerability “impact analysis” model defines:
* `analysis.state` values: `resolved`, `resolved_with_pedigree`, `exploitable`, `in_triage`, `false_positive`, `not_affected`. ([Ecma International][1])
* `analysis.justification` values: `code_not_present`, `code_not_reachable`, `requires_configuration`, `requires_dependency`, `requires_environment`, `protected_by_compiler`, `protected_at_runtime`, `protected_at_perimeter`, `protected_by_mitigating_control`. ([Ecma International][1])
This is the richest mainstream state model; we will treat it as the “maximal” target semantics.
### 2.2 OpenVEX
OpenVEX defines status labels:
* `not_affected`, `affected`, `fixed`, `under_investigation`. ([Docker Documentation][2])
For `not_affected`, OpenVEX requires supplying either a status justification or an `impact_statement`. ([GitHub][3])
### 2.3 CSAF VEX
CSAF VEX requires `product_status` containing at least one of:
* `fixed`, `known_affected`, `known_not_affected`, `under_investigation`. ([OASIS Documents][4])
### 2.4 Provenance / attestations
The engine should ingest signed attestations, particularly DSSE-wrapped in-toto statements (common in Sigstore/Cosign flows). Sigstore documentation states payloads are signed using the DSSE signing spec. ([Sigstore][5])
DSSE's design highlights include binding the payload **and its type** (to prevent confusion attacks) and avoiding canonicalization (to reduce attack surface). ([GitHub][6])
---
# 3) Canonical internal model
## 3.1 Core identifiers
### Subject identity
A **Subject** is what we are making a security determination about.
Minimum viable Subject key:
* `artifact.digest` (e.g., OCI image digest, binary hash)
* `component.id` (prefer `purl`, else `cpe`, else `bom-ref`)
* `vuln.id` (CVE/OSV/etc.)
* `context.id` (optional but recommended; see below)
```
Subject := (ArtifactRef, ComponentRef, VulnerabilityRef, ContextRef?)
```
### Context identity (optional but recommended)
ContextRef allows environment-sensitive statements to remain valid and deterministic:
* build flags
* runtime config profile (e.g., feature gates)
* deployment mode (cluster policy)
* OS / libc family
* FIPS mode, SELinux/AppArmor posture, etc.
ContextRef must be hashable (canonical JSON → digest).
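Illustrative sketch of the canonical-JSON → digest step (Python; the helper names and example field values are placeholders, not normative):
```python
import hashlib
import json

def canonical_bytes(obj: dict) -> bytes:
    # Canonical JSON: sorted keys, no insignificant whitespace, UTF-8.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")

def digest(obj: dict) -> str:
    # Content address: sha256 over the canonical bytes.
    return "sha256:" + hashlib.sha256(canonical_bytes(obj)).hexdigest()

context_ref = digest({
    "build_flags": ["-O2", "--fips"],
    "runtime_profile": "prod",
    "os": "linux/glibc",
})

subject_key = digest({
    "artifact": "sha256:abc123...",           # OCI image digest (placeholder)
    "component": "pkg:npm/lodash@4.17.21",    # purl (placeholder)
    "vuln": "CVE-2021-23337",
    "context": context_ref,
})
```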
---
## 3.2 Claims, evidence, attestations
### Claim
A **Claim** is a signed or unsigned assertion about a Subject.
Required fields:
* `claim.id`: content-addressable digest of canonical claim JSON
* `claim.subject`
* `claim.issuer`: principal identity
* `claim.time`: `issued_at`, `valid_from`, `valid_until` (optional)
* `claim.assertions[]`: list of atomic assertions (see §4)
* `claim.evidence_refs[]`: pointers to evidence objects
* `claim.signature`: optional DSSE / signature wrapper reference
### Evidence
Evidence is a typed object that supports replay and audit:
* `evidence.type`: e.g., `sbom_node`, `callgraph_path`, `loader_resolution`, `config_snapshot`, `patch_diff`, `pedigree_commit_chain`
* `evidence.digest`: hash of canonical bytes
* `evidence.producer`: tool identity and version
* `evidence.time`
* `evidence.payload_ref`: CAS pointer
* `evidence.signature_ref`: optional (attested evidence)
### Attestation wrapper
For signed payloads (claims or evidence bundles):
* Prefer DSSE envelopes for transport/type binding. ([GitHub][6])
* Prefer in-toto statement structure (subject + predicate + type).
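Illustrative sketch of these records as data types (Python; field names follow the lists above, signature and DSSE handling elided):
```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Evidence:
    type: str            # e.g. "callgraph_path", "config_snapshot", "patch_diff"
    digest: str          # hash of canonical payload bytes
    producer: str        # tool identity + version
    time: str            # RFC 3339 timestamp
    payload_ref: str     # CAS pointer
    signature_ref: Optional[str] = None

@dataclass(frozen=True)
class Assertion:
    atom: str            # one of the atoms in section 5, e.g. "REACHABLE"
    value: bool          # asserted truth value
    condition: Optional[str] = None  # e.g. "under current config snapshot"

@dataclass(frozen=True)
class Claim:
    id: str                          # sha256 of canonical claim JSON
    subject: dict                    # (ArtifactRef, ComponentRef, VulnerabilityRef, ContextRef?)
    issuer: str                      # principal id
    issued_at: str
    assertions: tuple[Assertion, ...] = ()
    evidence_refs: tuple[str, ...] = ()
    signature: Optional[str] = None  # DSSE envelope reference, if signed
```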
---
# 4) The fact lattice: representing truth, unknowns, and conflicts
## 4.1 Why a lattice, not booleans
For vulnerability disposition you will routinely see:
* **no evidence** (unknown)
* **incomplete evidence** (triage)
* **contradictory evidence** (vendor says not affected; scanner says exploitable)
A boolean cannot represent these safely.
## 4.2 Four-valued fact lattice (Belnap-style)
For each atomic proposition `p`, the engine stores a value in:
```
K4 := { ⊥, T, F, ⊤ }
⊥ = unknown (no support)
T = supported true
F = supported false
⊤ = conflict (support for both true and false)
```
### Knowledge ordering (≤k)
* ⊥ ≤k T ≤k ⊤
* ⊥ ≤k F ≤k ⊤
* T and F incomparable
### Join operator (⊔k)
Join is “union of support” and is monotone:
* ⊥ ⊔k x = x
* T ⊔k F = ⊤
* ⊤ ⊔k x = ⊤
* T ⊔k T = T, F ⊔k F = F
This operator is order-independent; it provides deterministic aggregation even under parallel ingestion.
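Illustrative sketch of the K4 join (Python; the value encoding is arbitrary):
```python
from enum import Enum

class K4(Enum):
    BOTTOM = "unknown"   # ⊥: no support
    TRUE = "true"        # T: supported true
    FALSE = "false"      # F: supported false
    TOP = "conflict"     # ⊤: support for both

def join(a: K4, b: K4) -> K4:
    """Union of support: monotone, commutative, associative, idempotent."""
    if a == b:
        return a
    if a == K4.BOTTOM:
        return b
    if b == K4.BOTTOM:
        return a
    return K4.TOP  # T vs F, or anything joined with ⊤

# Order independence: folding claims in any order yields the same value.
assert join(join(K4.TRUE, K4.BOTTOM), K4.FALSE) == K4.TOP
assert join(K4.FALSE, join(K4.BOTTOM, K4.TRUE)) == K4.TOP
```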
---
# 5) Atomic propositions (canonical “security atoms”)
For each Subject `S`, the engine maintains K4 truth values for these propositions:
1. **PRESENT**: the component instance is present in the artifact/context.
2. **APPLIES**: vulnerability applies to that component (version/range/cpe match).
3. **REACHABLE**: vulnerable code is reachable in the given context.
4. **MITIGATED**: controls prevent exploitation (compiler/runtime/perimeter/controls).
5. **FIXED**: remediation has been applied to the artifact.
6. **MISATTRIBUTED**: the finding is a false association (false positive).
These atoms are intentionally orthogonal; external formats are normalized into them.
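For reference, the atoms as a small enumeration (Python; illustrative):
```python
from enum import Enum

class Atom(str, Enum):
    PRESENT = "PRESENT"              # component instance is present
    APPLIES = "APPLIES"              # vulnerability applies (version/range/cpe match)
    REACHABLE = "REACHABLE"          # vulnerable code reachable in context
    MITIGATED = "MITIGATED"          # controls prevent exploitation
    FIXED = "FIXED"                  # remediation applied to the artifact
    MISATTRIBUTED = "MISATTRIBUTED"  # finding is a false association

# Per Subject, the engine maintains one K4 value per atom (see section 4):
# AtomTable = dict[Atom, K4]
```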
---
# 6) Trust algebra: principals, assurance, and authority
Trust is not a single number; it must represent:
* cryptographic verification
* identity assurance
* authority scope
* freshness/revocation
* evidence strength
We model trust as a label computed deterministically from policy + verification.
## 6.1 Principal
A principal is an issuer identity with verifiable keys:
* `principal.id` (URI-like)
* `principal.key_ids[]`
* `principal.identity_claims` (e.g., cert SANs, OIDC subject, org, repo)
* `principal.roles[]` (vendor, distro, internal-sec, build-system, scanner, auditor)
## 6.2 Trust label
A trust label is a tuple:
```
TrustLabel := (
assurance_level, // cryptographic + identity verification strength
authority_scope, // what subjects this principal is authoritative for
freshness_class, // time validity
evidence_class // strength/type of evidence attached
)
```
### Assurance levels (example)
Deterministic levels, increasing:
* A0: unsigned / unverifiable
* A1: signed, key known but weak identity binding
* A2: signed, verified identity (e.g., cert chain / keyless identity)
* A3: signed + provenance binding to artifact digest
* A4: signed + provenance + transparency log inclusion (if available)
Sigstore Cosign's attestation verification documentation references DSSE signing for payloads. ([Sigstore][5])
DSSE design includes payload-type binding and avoids canonicalization. ([GitHub][6])
### Authority scope
Authority is not purely cryptographic. It is policy-defined mapping between:
* principal identity and
* subject namespaces (vendors, products, package namespaces, internal artifacts)
Examples:
* Vendor principal is authoritative for `product.vendor == VendorX`.
* Distro principal authoritative for packages under their repos.
* Internal security principal authoritative for internal runtime reachability proofs.
### Evidence class
Evidence class is derived from evidence types:
* E0: statement-only (no supporting evidence refs)
* E1: SBOM linkage evidence (component present + version)
* E2: reachability/mitigation evidence (call paths, config snapshots)
* E3: remediation evidence (patch diffs, pedigree/commit chain)
CycloneDX/ECMA-424 explicitly distinguishes `resolved_with_pedigree` as remediation with verifiable commit history/diffs in pedigree. ([Ecma International][1])
## 6.3 Trust ordering and operators
Trust labels define a partial order ≤t (policy-defined). A simple implementation is component-wise ordering, but authority scope is set-based.
Core operators:
* **join (⊔t)**: combine independent supporting trust (often max-by-order)
* **meet (⊓t)**: compose along dependency chain (often min-by-order)
* **compose (⊗)**: trust of derived claim = min(trust of prerequisites) adjusted by method assurance
**Important:** Trust affects **decision selection**, not raw knowledge aggregation. Aggregation retains conflicts even if one side is low-trust.
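Illustrative sketch of the trust operators restricted to assurance levels (Python; authority scope, freshness, and evidence class are omitted, and the `method_penalty` adjustment is one possible reading of ⊗):
```python
ASSURANCE_ORDER = ["A0", "A1", "A2", "A3", "A4"]

def rank(level: str) -> int:
    return ASSURANCE_ORDER.index(level)

def trust_join(a: str, b: str) -> str:
    # Independent corroboration: take the stronger assurance.
    return a if rank(a) >= rank(b) else b

def trust_meet(a: str, b: str) -> str:
    # Composition along a dependency chain: as weak as the weakest link.
    return a if rank(a) <= rank(b) else b

def trust_compose(prerequisites: list[str], method_penalty: int = 0) -> str:
    # Derived claim: min over prerequisites, optionally degraded by the derivation method.
    weakest = min(prerequisites, key=rank)
    return ASSURANCE_ORDER[max(0, rank(weakest) - method_penalty)]

assert trust_join("A1", "A3") == "A3"
assert trust_meet("A1", "A3") == "A1"
assert trust_compose(["A3", "A4"], method_penalty=1) == "A2"
```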
---
# 7) Normalization: external VEX → canonical atoms
## 7.1 CycloneDX / ECMA-424 normalization
From `analysis.state` ([Ecma International][1]):
* `resolved` → FIXED := T
* `resolved_with_pedigree` → FIXED := T and require pedigree/diff evidence (E3)
* `exploitable` → APPLIES := T, REACHABLE := T, MITIGATED := F (unless explicit mitigation evidence exists)
* `in_triage` → mark triage flag; leave atoms mostly ⊥ unless other fields present
* `false_positive` → MISATTRIBUTED := T
* `not_affected` → requires justification mapping (below)
From `analysis.justification` ([Ecma International][1])
Map into atoms as conditional facts (context-sensitive):
* `code_not_present` → PRESENT := F
* `code_not_reachable` → REACHABLE := F
* `requires_configuration` → REACHABLE := F *under current config snapshot*
* `requires_dependency` → REACHABLE := F *unless dependency present*
* `requires_environment` → REACHABLE := F *under current environment constraints*
* `protected_by_compiler` / `protected_at_runtime` / `protected_at_perimeter` / `protected_by_mitigating_control` → MITIGATED := T (with evidence refs expected)
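Illustrative sketch of this mapping as lookup tables (Python; atom names are from §5, and the conditional qualifiers are reduced to comments):
```python
# CycloneDX analysis.state -> atom assertions (True/False), per the mapping above.
STATE_TO_ATOMS: dict[str, list[tuple[str, bool]]] = {
    "resolved": [("FIXED", True)],
    "resolved_with_pedigree": [("FIXED", True)],    # additionally requires E3 pedigree evidence
    "exploitable": [("APPLIES", True), ("REACHABLE", True), ("MITIGATED", False)],
    "in_triage": [],                                # triage flag only; atoms stay ⊥
    "false_positive": [("MISATTRIBUTED", True)],
    "not_affected": [],                             # resolved via justification below
}

# CycloneDX analysis.justification -> conditional atom assertions.
JUSTIFICATION_TO_ATOMS: dict[str, list[tuple[str, bool]]] = {
    "code_not_present": [("PRESENT", False)],
    "code_not_reachable": [("REACHABLE", False)],
    "requires_configuration": [("REACHABLE", False)],  # conditioned on current config snapshot
    "requires_dependency": [("REACHABLE", False)],      # conditioned on dependency absence
    "requires_environment": [("REACHABLE", False)],     # conditioned on current environment
    "protected_by_compiler": [("MITIGATED", True)],
    "protected_at_runtime": [("MITIGATED", True)],
    "protected_at_perimeter": [("MITIGATED", True)],
    "protected_by_mitigating_control": [("MITIGATED", True)],
}

def normalize_cyclonedx(state: str, justification: str | None) -> list[tuple[str, bool]]:
    atoms = list(STATE_TO_ATOMS.get(state, []))
    if state == "not_affected" and justification:
        atoms += JUSTIFICATION_TO_ATOMS.get(justification, [])
    return atoms
```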
## 7.2 OpenVEX normalization
OpenVEX statuses: `not_affected`, `affected`, `fixed`, `under_investigation`. ([Docker Documentation][2])
For `not_affected`, OpenVEX requires justification or an impact statement. ([GitHub][3])
Mapping:
* `fixed` → FIXED := T
* `affected` → APPLIES := T (conservative; leave REACHABLE := ⊥ unless reachability evidence is present)
* `under_investigation` → triage flag
* `not_affected` → choose mapping based on provided justification / impact statement:
  * component not present → PRESENT := F
  * vulnerable code not reachable → REACHABLE := F
  * mitigations already exist → MITIGATED := T
  * otherwise → APPLIES := F only if explicitly asserted
## 7.3 CSAF VEX normalization
CSAF product_status includes `fixed`, `known_affected`, `known_not_affected`, `under_investigation`. ([OASIS Documents][4])
Mapping:
* `fixed` → FIXED := T
* `known_affected` → APPLIES := T
* `known_not_affected` → APPLIES := F unless stronger justification indicates PRESENT := F / REACHABLE := F / MITIGATED := T
* `under_investigation` → triage flag
---
# 8) Lattice engine: aggregation algorithm
Aggregation is pure, monotone, and order-independent.
## 8.1 Support sets
For each Subject `S` and atom `p`, maintain:
* `SupportTrue[S,p]` = set of claim IDs supporting p=true
* `SupportFalse[S,p]` = set of claim IDs supporting p=false
Optionally store per-support:
* trust label
* evidence digests
* timestamps
## 8.2 Compute K4 value
For each `(S,p)`:
* if both support sets empty → ⊥
* if only true non-empty → T
* if only false non-empty → F
* if both non-empty → ⊤
## 8.3 Track trust on each side
Maintain:
* `TrustTrue[S,p]` = max trust label among SupportTrue
* `TrustFalse[S,p]` = max trust label among SupportFalse
This enables policy selection without losing conflict information.
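Illustrative sketch of §8.1–8.3 (Python; claim IDs, trust labels, and the `rank` table are placeholders):
```python
def k4_value(support_true: set[str], support_false: set[str]) -> str:
    # Section 8.2: derive the K4 value from the two support sets.
    if not support_true and not support_false:
        return "unknown"        # ⊥
    if support_true and not support_false:
        return "true"           # T
    if support_false and not support_true:
        return "false"          # F
    return "conflict"           # ⊤

def max_trust(claim_ids: set[str], trust_of: dict[str, str], rank: dict[str, int]) -> str | None:
    # Section 8.3: strongest trust label on one side; None if the side is empty.
    if not claim_ids:
        return None
    return max((trust_of[c] for c in claim_ids), key=lambda label: rank[label])

rank = {"A0": 0, "A1": 1, "A2": 2, "A3": 3, "A4": 4}
trust_of = {"claim-vendor": "A3", "claim-scanner": "A1"}
support_true = {"claim-scanner"}    # scanner says REACHABLE = true
support_false = {"claim-vendor"}    # vendor says REACHABLE = false

assert k4_value(support_true, support_false) == "conflict"
assert max_trust(support_true, trust_of, rank) == "A1"
assert max_trust(support_false, trust_of, rank) == "A3"
```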
---
# 9) Decision selection: from atoms → disposition
Decision selection is where “trust algebra” actually participates. It is **policy-driven** and can differ by environment (prod vs dev, regulated vs non-regulated).
## 9.1 Output disposition space
The engine should be able to emit a CycloneDX-compatible disposition (ECMA-424): ([Ecma International][1])
* `resolved_with_pedigree`
* `resolved`
* `false_positive`
* `not_affected`
* `exploitable`
* `in_triage`
## 9.2 Deterministic selection rules (baseline)
Define `D(S)`:
1. If `FIXED == T` and pedigree evidence meets threshold → `resolved_with_pedigree`
2. Else if `FIXED == T` → `resolved`
3. Else if `MISATTRIBUTED == T` and trust ≥ threshold → `false_positive`
4. Else if `APPLIES == F` or `PRESENT == F` → `not_affected`
5. Else if `REACHABLE == F` or `MITIGATED == T` → `not_affected` (with justification)
6. Else if `REACHABLE == T` and `MITIGATED != T` → `exploitable`
7. Else → `in_triage`
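Illustrative sketch of these rules as a pure selection function (Python; the pedigree and trust thresholds are reduced to boolean inputs):
```python
def select_disposition(atoms: dict[str, str],
                       pedigree_ok: bool,
                       misattribution_trust_ok: bool) -> str:
    # Rules 1-7 from section 9.2, evaluated in order; first match wins.
    # Atom values are "true" / "false" / "unknown" / "conflict".
    if atoms.get("FIXED") == "true" and pedigree_ok:
        return "resolved_with_pedigree"
    if atoms.get("FIXED") == "true":
        return "resolved"
    if atoms.get("MISATTRIBUTED") == "true" and misattribution_trust_ok:
        return "false_positive"
    if atoms.get("APPLIES") == "false" or atoms.get("PRESENT") == "false":
        return "not_affected"
    if atoms.get("REACHABLE") == "false" or atoms.get("MITIGATED") == "true":
        return "not_affected"   # justification must be carried in the proof bundle
    if atoms.get("REACHABLE") == "true" and atoms.get("MITIGATED") != "true":
        return "exploitable"
    return "in_triage"

# Conflicting reachability (Example A in section 13) matches none of the rules
# above and therefore falls through to in_triage under the skeptical mode.
assert select_disposition({"REACHABLE": "conflict", "APPLIES": "true"}, False, False) == "in_triage"
```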
## 9.3 Conflict-handling modes (policy selectable)
When any required atom is ⊤ (conflict) or ⊥ (unknown), policy chooses a stance:
* **Skeptical (default for production gating):**
  * conflict/unknown biases toward `in_triage` or `exploitable` depending on risk tolerance
* **Authority-weighted:**
  * if high-authority vendor statement conflicts with low-trust scanner output, accept vendor but record conflict in proof
* **Quorum-based:**
  * accept `not_affected` only if:
    * (vendor trust ≥ A3) OR
    * (internal reachability proof trust ≥ A3) OR
    * (two independent principals ≥ A2 agree)
  * otherwise remain `in_triage`
This is where “trust algebra” expresses **institutional policy** without destroying underlying knowledge.
---
# 10) Proof object: verifiable explainability
Every verdict emits a **Proof Bundle** that can be verified offline.
## 10.1 Proof bundle contents
* `subject` (canonical form)
* `inputs`:
* list of claim IDs + digests
* list of evidence digests
* policy bundle digest
* vulnerability feed snapshot digest (if applicable)
* `normalization`:
* mappings applied (e.g., OpenVEX status→atoms)
* `atom_table`:
* each atom p: K4 value, support sets, trust per side
* `decision_trace`:
* rule IDs fired
* thresholds used
* `output`:
* disposition + justification + confidence metadata
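Illustrative sketch of an unsigned proof bundle being assembled and content-addressed (Python; all digests and values are placeholders):
```python
import hashlib
import json

proof_bundle = {
    "subject": {"artifact": "sha256:abc...", "component": "pkg:npm/lodash@4.17.21",
                "vuln": "CVE-2021-23337", "context": "sha256:ctx..."},
    "inputs": {
        "claims": ["sha256:claim1...", "sha256:claim2..."],
        "evidence": ["sha256:ev1..."],
        "policy_bundle": "sha256:policy...",
        "feed_snapshot": "sha256:feed...",
    },
    "normalization": [{"source": "openvex", "status": "not_affected",
                       "mapped_to": [["REACHABLE", False]]}],
    "atom_table": {"REACHABLE": {"value": "conflict",
                                 "support_true": ["sha256:claim2..."],
                                 "support_false": ["sha256:claim1..."],
                                 "trust_true": "A1", "trust_false": "A3"}},
    "decision_trace": {"rules_fired": ["rule-7"], "thresholds": {"not_affected_mode": "quorum"}},
    "output": {"disposition": "in_triage", "justification": None},
}

canonical = json.dumps(proof_bundle, sort_keys=True, separators=(",", ":")).encode()
bundle_id = "sha256:" + hashlib.sha256(canonical).hexdigest()
# The canonical bytes are what would be wrapped in a DSSE envelope for signing (section 10.2).
```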
## 10.2 Signing
The proof bundle is itself a payload suitable for signing in DSSE, enabling attested verdicts. DSSE's type binding is important so a proof bundle cannot be reinterpreted as a different payload class. ([GitHub][6])
---
# 11) Policy bundle specification (Trust + Decision DSL)
A policy bundle is a hashable document. Example structure (YAML-like; illustrative):
```yaml
policy_id: "org.prod.default.v1"
trust_roots:
  - principal: "did:web:vendor.example"
    min_assurance: A2
    authority:
      products: ["vendor.example/*"]
  - principal: "did:web:sec.internal"
    min_assurance: A2
    authority:
      artifacts: ["sha256:*"]   # internal is authoritative for internal artifacts
acceptance_thresholds:
  resolved_with_pedigree:
    min_evidence_class: E3
    min_assurance: A3
  not_affected:
    mode: quorum
    quorum:
      - any:
          - { principal_role: vendor, min_assurance: A3 }
          - { evidence_type: callgraph_path, min_assurance: A3 }
          - all:
              - { distinct_principals: 2 }
              - { min_assurance_each: A2 }
conflict_mode:
  production: skeptical
  development: authority_weighted
```
The engine must treat the policy bundle as an **input artifact** (hashed, stored, referenced in proofs).
---
# 12) Determinism requirements
To guarantee deterministic replay:
1. **Canonical JSON** for all stored objects (claims, evidence, policy bundles).
2. **Content-addressing**:
   * `id = sha256(canonical_bytes)`
3. **Stable sorting**:
   * when iterating claims/evidence, sort by `(type, id)` to prevent nondeterministic traversal
4. **Time handling**:
   * evaluation time is explicit input (e.g., `as_of` timestamp)
   * expired claims are excluded deterministically
5. **Version pinning**:
   * tool identity + version recorded in evidence
   * vuln feed snapshot digests recorded
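Illustrative sketch of rules 3 and 4 (Python; the claim fields shown are placeholders):
```python
from datetime import datetime, timezone

def evaluation_order(claims: list[dict]) -> list[dict]:
    # Rule 3: iterate in (type, id) order so replay does not depend on ingestion order.
    return sorted(claims, key=lambda c: (c["type"], c["id"]))

def effective_claims(claims: list[dict], as_of: datetime) -> list[dict]:
    # Rule 4: evaluation time is an explicit input; expiry is applied deterministically.
    def alive(c: dict) -> bool:
        starts = c.get("valid_from")
        ends = c.get("valid_until")
        return (starts is None or starts <= as_of) and (ends is None or as_of < ends)
    return [c for c in evaluation_order(claims) if alive(c)]

as_of = datetime(2025, 6, 1, tzinfo=timezone.utc)
claims = [
    {"type": "openvex", "id": "sha256:b...", "valid_from": None, "valid_until": None},
    {"type": "csaf", "id": "sha256:a...",
     "valid_from": datetime(2025, 1, 1, tzinfo=timezone.utc),
     "valid_until": datetime(2025, 3, 1, tzinfo=timezone.utc)},   # expired at as_of
]
assert [c["type"] for c in effective_claims(claims, as_of)] == ["openvex"]
```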
---
# 13) Worked examples
## Example A: Vendor says not affected; scanner says exploitable
Inputs:
* OpenVEX: `not_affected` with justification (required by spec) ([GitHub][3])
* Internal scanner: flags exploitable
Aggregation:
* REACHABLE: ⊤ (conflict)
Selection (production skeptical):
* verdict: `in_triage`
Selection (authority-weighted, vendor authoritative):
* verdict: `not_affected`
Proof bundle records conflict and why policy accepted vendor.
## Example B: Fixed with pedigree
Inputs:
* CycloneDX analysis.state = `resolved_with_pedigree` ([Ecma International][1])
* Evidence includes commit history/diff in pedigree
Selection:
* `resolved_with_pedigree`
## Example C: Not affected due to mitigations
Inputs:
* CycloneDX analysis.state=`not_affected`, justification=`protected_at_runtime` ([Ecma International][1])
* Evidence: runtime mitigation proof (E2/E3)
Selection:
* `not_affected` with justification preserved
---
# 14) Implementation checklist
Minimum viable implementation modules:
1. **Format adapters**
* CycloneDX/ECMA-424 VEX parser
* OpenVEX parser
* CSAF VEX parser
2. **Canonicalizer**
* canonical JSON encoder
* digest computation
3. **Verifier**
* signature/DSSE verifier (pluggable)
* principal resolver (keys/certs/OIDC claims)
4. **Normalizer**
* maps external statements → atoms
* maps justifications → conditions/evidence expectations
5. **Lattice store**
* support sets per (S,p)
* efficient indexing by artifact/component/vuln
6. **Policy evaluator**
* thresholds, authority scopes, conflict modes
7. **Proof generator**
* deterministic trace emission
* optional DSSE signing step
---
## What you should build next (practical sequencing)
1. Implement the **canonical atom model** + K4 aggregation (fast, foundational).
2. Add CycloneDX/ECMA-424 ingestion first (richest semantics). ([Ecma International][1])
3. Add OpenVEX + CSAF ingestion with mapping to atoms. ([Docker Documentation][2])
4. Add trust label computation + policy bundle hashing.
5. Add proof bundles + DSSE signing support. ([GitHub][6])
Follow-up artifacts that should accompany this spec:
* a concrete JSON schema for `Claim`, `Evidence`, `PolicyBundle`, and `ProofBundle`;
* an explicit mapping table from OpenVEX/CSAF justifications to ECMA-424 justifications/atoms;
* a reference evaluation pseudocode implementation (deterministic, testable).
[1]: https://ecma-international.org/wp-content/uploads/ECMA-424_1st_edition_june_2024.pdf "ECMA-424, 1st edition, June 2024"
[2]: https://docs.docker.com/scout/how-tos/create-exceptions-vex/ "Create an exception using the VEX"
[3]: https://github.com/openvex/spec/blob/main/OPENVEX-SPEC.md "OpenVEX Specification (OPENVEX-SPEC.md)"
[4]: https://docs.oasis-open.org/csaf/csaf/v2.0/os/csaf-v2.0-os.html "Common Security Advisory Framework Version 2.0"
[5]: https://docs.sigstore.dev/cosign/verifying/attestation/ "In-Toto Attestations"
[6]: https://github.com/secure-systems-lab/dsse "DSSE: Dead Simple Signing Envelope"