house keeping work
docs/product-advisories/unprocessed/19-Dec-2025 - Moat #3.md
## Outcome you are shipping

A deterministic “claim resolution” capability that takes:

* Multiple **claims** about the same vulnerability (vendor VEX, distro VEX, internal assessments, scanner inferences),
* A **policy** describing trust and merge semantics,
* A set of **evidence artifacts** (SBOM, config snapshots, reachability proofs, etc.),

…and produces a **single resolved status** per vulnerability/component/artifact **with an explainable trail**:

* Which claims applied and why
* Which were rejected and why
* What evidence was required and whether it was satisfied
* What policy rules triggered the resolution outcome

This replaces naive precedence like `vendor > distro > internal`.

---

# Directions for Product Managers

## 1) Write the PRD around “claims resolution,” not “VEX support”

The customer outcome is not “we ingest VEX.” It is:

* “We can *safely* accept ‘not affected’ without hiding risk.”
* “We can prove, to auditors and change control, why a CVE was downgraded.”
* “We can consistently resolve conflicts between issuer statements.”

### Non-negotiable product properties

* **Deterministic**: same inputs → same resolved outcome
* **Explainable**: a human can trace the decision path
* **Guardrailed**: a “safe” resolution requires evidence, not just a statement

---

## 2) Define the core objects (these drive everything)

In the PRD, define these three objects explicitly:

### A) Claim (normalized)

A “claim” is any statement about vulnerability applicability to an artifact/component, regardless of source format.

Minimum fields:

* `vuln_id` (CVE/GHSA/etc.)
* `subject` (component identity; ideally package + version + digest/purl)
* `target` (the thing we’re evaluating: image, repo build, runtime instance)
* `status` (affected / not_affected / fixed / under_investigation / unknown)
* `justification` (human/machine reason)
* `issuer` (who said it; plus verification state)
* `scope` (what it applies to; versions, ranges, products)
* `timestamp` (when produced)
* `references` (links/IDs to evidence or external material)
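
As a rough illustration of the minimum fields above, a normalized claim could be modeled as an immutable record. This is only a sketch: the class names `Status`, `Issuer`, and `Claim`, the field types, and the example values (including the issuer ID `internal-appsec`) are assumptions made for illustration, not part of the advisory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Status(str, Enum):
    """The five statuses listed in the minimum fields."""
    AFFECTED = "affected"
    NOT_AFFECTED = "not_affected"
    FIXED = "fixed"
    UNDER_INVESTIGATION = "under_investigation"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class Issuer:
    id: str                   # hypothetical issuer identifier
    category: str             # vendor / distro / internal / scanner
    signature_verified: bool  # the "verification state" of the issuer


@dataclass(frozen=True)
class Claim:
    vuln_id: str
    subject: str              # component identity, e.g. a purl
    target: str               # image, repo build, or runtime instance
    status: Status
    justification: str
    issuer: Issuer
    scope: str
    timestamp: datetime
    references: tuple[str, ...] = ()


# Hypothetical example values, mirroring the placeholders used in this document.
example_claim = Claim(
    vuln_id="CVE-XXXX-YYYY",
    subject="pkg:maven/org.example/foo@1.2.3",
    target="sha256:...",
    status=Status.NOT_AFFECTED,
    justification="feature_flag_off",
    issuer=Issuer(id="internal-appsec", category="internal", signature_verified=False),
    scope="environment=prod",
    timestamp=datetime(2025, 12, 19, tzinfo=timezone.utc),
    references=("ev_hash_1", "ev_hash_2"),
)
```

Freezing the dataclasses keeps a claim from being mutated after normalization, which fits the determinism requirement.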

### B) Evidence

A typed artifact that can satisfy a requirement.

Examples (not exhaustive):

* `config_snapshot` (e.g., Helm values, env var map, feature flag export)
* `sbom_presence_or_absence` (SBOM proof that a component is or isn’t present)
* `reachability_proof` (call-path evidence from entrypoint to vulnerable symbol)
* `symbol_absence` (binary inspection shows symbol/function not present)
* `patch_presence` (artifact includes backport / fixed build)
* `manual_attestation` (human-reviewed attestation with reviewer identity + scope)

Each evidence item must have:

* `type`
* `collector` (tool/provider)
* `inputs_hash` and `output_hash`
* `scope` (what artifact/environment it applies to)
* `confidence` (optional but recommended)
* `expires_at` / `valid_for` (for config/runtime evidence)
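
The required evidence fields can be sketched the same way. The `Evidence` class and its `is_valid_at` helper below are hypothetical names. Treating a missing `expires_at` as "never expires" is an assumption that a real policy might reject for config or runtime evidence.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Evidence:
    type: str                      # e.g. "config_snapshot"
    collector: str                 # tool/provider that produced it
    inputs_hash: str
    output_hash: str
    scope: str                     # artifact/environment it applies to
    confidence: Optional[float] = None
    expires_at: Optional[datetime] = None

    def is_valid_at(self, when: datetime) -> bool:
        """Evidence without an expiry is treated as still valid (an assumption)."""
        return self.expires_at is None or when < self.expires_at


snapshot = Evidence(
    type="config_snapshot",
    collector="helm-values-exporter",   # hypothetical collector name
    inputs_hash="sha256:...",
    output_hash="sha256:...",
    scope="environment=prod",
    confidence=0.9,
    expires_at=datetime(2026, 1, 1, tzinfo=timezone.utc),
)

still_valid = snapshot.is_valid_at(datetime(2025, 12, 19, tzinfo=timezone.utc))
expired = not snapshot.is_valid_at(datetime(2026, 2, 1, tzinfo=timezone.utc))
```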

### C) Policy

A policy describes:

* **Trust rules** (how much to trust whom, under which conditions)
* **Merge semantics** (how to resolve conflicts)
* **Evidence requirements** (what must be present to accept certain claims)

---

## 3) Ship “policy-controlled merge semantics” as a configuration schema first

Do not start with a fully general policy language. You need a small, explicit schema that makes behavior predictable.

PM deliverable: a policy spec with these sections:

1. **Issuer trust**

   * weights by issuer category (vendor/distro/internal/scanner)
   * optional constraints (must be signed, must match product ownership, must be within time window)
2. **Applicability rules**

   * what constitutes a match to artifact/component (range semantics, digest match priority)
3. **Evidence requirements**

   * per status + per justification: what evidence types are required
4. **Conflict resolution strategy**

   * conservative vs weighted vs most-specific
   * explicit guardrails (never accept “safe” without evidence)
5. **Override rules**

   * when internal can override vendor (and what evidence is required to do so)
   * environment-specific policies (prod vs dev)

---

## 4) Make “evidence hooks” a first-class user workflow

You are explicitly shipping the ability to say:

> “This is not affected **because** feature flag X is off.”

That requires:

* a way to **provide or discover** feature flag state, and
* a way to **bind** that flag to the vulnerable surface

PM must specify: what does the user do to assert that?

Minimum viable workflow:

* User attaches a `config_snapshot` (or system captures it)
* User provides a “binding” to the vulnerable module/function:

  * either automatic (later) or manual (first release)
  * e.g., `flag X gates module Y` with references (file path, code reference, runbook)

This “binding” itself becomes evidence.

---

## 5) Define acceptance criteria as decision trace tests

PM should write acceptance criteria as “given claims + policy + evidence → resolved outcome + trace”.

You need at least these canonical tests:

1. **Distro backport vs vendor version logic conflict**

   * Vendor says affected (by version range)
   * Distro says fixed (backport)
   * Policy says: in distro context, distro claim can override vendor if patch evidence exists
   * Outcome: fixed, with trace proving why

2. **Internal ‘feature flag off’ downgrade**

   * Vendor says affected
   * Internal says not_affected because flag off
   * Evidence: config snapshot + flag→module binding
   * Outcome: not_affected **only for that environment context**, with trace

3. **Evidence missing**

   * Internal says not_affected because “code not reachable”
   * No reachability evidence present
   * Outcome: unknown or affected (policy-dependent), but **not “not_affected”**

4. **Conflicting “safe” claims**

   * Vendor says not_affected (reason A)
   * Internal says affected (reason B) with strong evidence
   * Outcome follows merge strategy, and trace must show why.

---

## 6) Package it as an “Explainable Resolution” feature

UI/UX requirements PM must specify:

* A “Resolved Status” view per vuln/component showing:

  * contributing claims (ranked)
  * rejected claims (with reason)
  * evidence required vs evidence present
  * the policy clauses triggered (line-level references)
* A policy editor can be CLI/JSON first; UI later, but explainability cannot wait.

---

# Directions for Development Managers

## 1) Implement as three services/modules with strict interfaces

### Module A: Claim Normalization

* Inputs: OpenVEX / CycloneDX VEX / CSAF / internal annotations / scanner hints
* Output: canonical `Claim` objects

Rules:

* Canonicalize IDs (normalize CVE formats, normalize package coordinates)
* Preserve provenance: issuer identity, signature metadata, timestamps, original document hash

### Module B: Evidence Providers (plugin boundary)

* Provide an interface like:

```
evaluate_evidence(context, claim) -> EvidenceEvaluation
```

Where `EvidenceEvaluation` returns:

* required evidence types for this claim (from policy)
* found evidence items (from store/providers)
* satisfied / not satisfied
* explanation strings
* confidence

Start with 3 providers:

1. SBOM provider (presence/absence)
2. Config provider (feature flags/config snapshot ingestion)
3. Reachability provider (even if initially limited or stubbed, it must exist as a typed hook)
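
One way the plugin boundary might look, sketched with a structural `Protocol`. Only `evaluate_evidence` and `EvidenceEvaluation` come from the text above. The field names, the plain-`dict` parameter types, and `StubReachabilityProvider` are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class EvidenceEvaluation:
    required: tuple[str, ...]      # evidence types required by policy
    found: tuple[str, ...]         # evidence item IDs/hashes that were found
    satisfied: bool
    explanations: tuple[str, ...]
    confidence: float


class EvidenceProvider(Protocol):
    """Any object with this method can act as a provider (structural typing)."""

    def evaluate_evidence(self, context: dict, claim: dict) -> EvidenceEvaluation:
        ...


class StubReachabilityProvider:
    """Stubbed typed hook: it never finds a proof, so the requirement stays unsatisfied."""

    def evaluate_evidence(self, context: dict, claim: dict) -> EvidenceEvaluation:
        return EvidenceEvaluation(
            required=("reachability_proof",),
            found=(),
            satisfied=False,
            explanations=("reachability analysis is stubbed; no proof available",),
            confidence=0.0,
        )


provider: EvidenceProvider = StubReachabilityProvider()
result = provider.evaluate_evidence(
    context={"environment": "prod"},
    claim={"status": "not_affected", "justification": "not_reachable"},
)
```

A stub that always reports "not satisfied" is the safe default: a `not_reachable` claim cannot win until a real provider supplies proof.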

### Module C: Merge & Resolution Engine

* Inputs: set of claims + policy + evidence evaluations + context
* Output: `ResolvedDecision`

A `ResolvedDecision` must include:

* final status
* selected “winning” claim(s)
* all considered claims
* evidence satisfaction summary
* applied policy rule IDs
* deterministic ordering keys/hashes

---

## 2) Define the evaluation context (this avoids foot-guns)

The resolved outcome must be context-aware.

Create an immutable `EvaluationContext` object, containing:

* artifact identity (image digest / build digest / SBOM hash)
* environment identity (prod/stage/dev; cluster; region)
* config snapshot ID
* time (evaluation timestamp)
* policy version hash

This is how you support: “not affected because feature flag off” in prod but not in dev.
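
A minimal sketch of such a context, assuming a frozen dataclass is an acceptable way to get immutability. The field names and example values (such as `cfg-prod-001`) are hypothetical. The environment identity is collapsed to a single string for brevity.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EvaluationContext:
    artifact_digest: str       # image digest / build digest / SBOM hash
    environment: str           # prod / stage / dev
    config_snapshot_id: str
    evaluated_at: str          # evaluation timestamp, ISO 8601 text
    policy_hash: str


prod = EvaluationContext(
    artifact_digest="sha256:...",
    environment="prod",
    config_snapshot_id="cfg-prod-001",
    evaluated_at="2025-12-19T00:00:00Z",
    policy_hash="sha256:...",
)

# Same artifact and policy, different environment and config snapshot:
# a separate context, so it gets its own resolution.
dev = replace(prod, environment="dev", config_snapshot_id="cfg-dev-001")
```

Because the context is part of the resolution key, a `not_affected` decision reached for `prod` does not carry over to `dev`.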

---

## 3) Merge semantics: implement scoring + guardrails, not precedence

You need a deterministic function. One workable approach:

### Step 1: compute statement strength

For each claim:

* `trust_weight` from policy (issuer + scope + signature requirements)
* `evidence_factor` (1.0 if requirements satisfied; <1 or 0 if not)
* `specificity_factor` (exact digest match > exact version > range)
* `freshness_factor` (optional; policy-defined)
* `applicability` must be true or claim is excluded

Compute:

```
support = trust_weight * evidence_factor * specificity_factor * freshness_factor
```
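
The formula translates directly into a small pure function. In this sketch, returning `None` for a non-applicable claim and defaulting `freshness_factor` to 1.0 are assumptions. The numeric factors in the examples are arbitrary.

```python
from typing import Optional


def support(
    trust_weight: float,
    evidence_factor: float,
    specificity_factor: float,
    freshness_factor: float = 1.0,
    applicable: bool = True,
) -> Optional[float]:
    """Return the claim's support, or None if the claim is excluded."""
    if not applicable:
        return None  # non-applicable claims are excluded, not scored as zero
    return trust_weight * evidence_factor * specificity_factor * freshness_factor


# Evidence satisfied, exact-version match (factor 0.75 is illustrative).
with_evidence = support(trust_weight=80, evidence_factor=1.0, specificity_factor=0.75)

# Evidence requirements not met: the product collapses to zero.
without_evidence = support(trust_weight=80, evidence_factor=0.0, specificity_factor=0.75)
```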

### Step 2: apply merge strategy (policy-controlled)

Ship at least two strategies:

1. **Conservative default**

   * If any “unsafe” claim (affected/under_investigation) has support above threshold, it wins
   * A “safe” claim (not_affected/fixed) can override only if:

     * its support is at least the unsafe claim’s support plus a policy-defined delta, AND
     * its evidence requirements are satisfied

2. **Evidence-weighted**

   * Highest support wins, but safe statuses have a hard evidence gate

### Step 3: apply guardrails

Hard guardrail to prevent bad outcomes:

* **Never emit a safe status unless evidence requirements for that safe claim are satisfied.**
* If a safe claim lacks evidence, downgrade the safe claim to “unsupported” and do not allow it to win.

This single rule is what makes your system materially different from “VEX as suppression.”
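
The conservative strategy and the guardrail could be combined roughly as follows. This is a sketch under assumptions. Claims are plain dicts with `status`, `support`, `evidence_satisfied`, and `claim_hash` keys. Ties are broken by claim hash. A weak unsafe claim still wins when no evidenced safe claim exists. Default thresholds are taken from the example policy later in this document.

```python
SAFE = {"not_affected", "fixed"}
UNSAFE = {"affected", "under_investigation"}


def resolve_conservative(claims, unsafe_wins_threshold=50, safe_override_delta=10):
    """Return (status, winning_claim_hash) using the conservative strategy."""
    # Stable ordering: highest support first, claim hash as tie-breaker.
    ordered = sorted(claims, key=lambda c: (-c["support"], c["claim_hash"]))

    # Guardrail: a safe claim without satisfied evidence can never win.
    eligible_safe = [c for c in ordered if c["status"] in SAFE and c["evidence_satisfied"]]
    unsafe = [c for c in ordered if c["status"] in UNSAFE]
    strong_unsafe = [c for c in unsafe if c["support"] > unsafe_wins_threshold]

    if strong_unsafe:
        top = strong_unsafe[0]
        for c in eligible_safe:
            if c["support"] >= top["support"] + safe_override_delta:
                return c["status"], c["claim_hash"]
        return top["status"], top["claim_hash"]
    if eligible_safe:
        return eligible_safe[0]["status"], eligible_safe[0]["claim_hash"]
    if unsafe:  # assumption: a weak unsafe claim still beats having no answer
        return unsafe[0]["status"], unsafe[0]["claim_hash"]
    return "unknown", None


vendor = {"claim_hash": "claim_hash_vendor", "status": "affected",
          "support": 62, "evidence_satisfied": True}
internal = {"claim_hash": "claim_hash_abc", "status": "not_affected",
            "support": 78, "evidence_satisfied": True}

decision = resolve_conservative([vendor, internal])
```

With these inputs the internal claim wins, since 78 is at least 62 + 10 and its evidence is satisfied. Flipping `evidence_satisfied` to `False` on the internal claim makes the vendor's `affected` win instead.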

---

## 4) Evidence hooks: treat them as typed contracts, not strings

For “feature flag off,” implement it as a structured evidence requirement.

Example evidence requirement for a “safe because feature flag off” claim:

* Required evidence types:

  * `config_snapshot`
  * `flag_binding` (the mapping “flag X gates vulnerable surface Y”)

Implementation:

* Config provider can parse:

  * Helm values / env var sets / feature flag exports
  * Store them as normalized key/value with hashes
* Binding evidence can start as manual JSON that references:

  * repo path / module / function group
  * a link to code ownership / runbook
  * optional test evidence

Later you can automate binding via static analysis, but do not block shipping on that.
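
To make the typed contract concrete, here is one possible check for a “feature flag off” claim. Everything in it is illustrative: the function name, the binding shape (`flag`, `gates`, `references`), the flag and module names, and the set of string values treated as "off".

```python
OFF_VALUES = {"false", "off", "0"}  # assumption: normalized snapshot values are strings


def check_flag_off(flag, surface, config_snapshot, bindings):
    """Return (satisfied, explanations) for a 'feature flag off' claim."""
    explanations = []

    # Requirement 1: the config snapshot must show the flag as off.
    if flag not in config_snapshot:
        explanations.append(f"flag {flag!r} missing from config snapshot")
    elif config_snapshot[flag].lower() not in OFF_VALUES:
        explanations.append(f"flag {flag!r} is {config_snapshot[flag]!r}, not off")

    # Requirement 2: a binding must state that this flag gates the vulnerable surface.
    if not any(b["flag"] == flag and b["gates"] == surface for b in bindings):
        explanations.append(f"no binding shows {flag!r} gating {surface!r}")

    return (not explanations, explanations)


snapshot = {"enable_legacy_parser": "false"}          # hypothetical flag name
bindings = [{
    "flag": "enable_legacy_parser",
    "gates": "org.example.foo.LegacyParser",          # hypothetical module
    "references": ["docs/runbooks/legacy-parser.md"], # hypothetical runbook path
}]

satisfied, why = check_flag_off(
    "enable_legacy_parser", "org.example.foo.LegacyParser", snapshot, bindings
)
```

Both requirements must hold. A snapshot alone proves the flag is off but not that the flag matters, which is why the binding counts as evidence in its own right.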

---

## 5) Determinism requirements (engineering non-negotiables)

Development manager should enforce:

* stable sorting of claims by canonical key
* stable tie-breakers (e.g., issuer ID, timestamp, claim hash)
* no nondeterministic external calls during evaluation (or they must be snapshot-based)
* every evaluation produces:

  * `input_bundle_hash` (claims + evidence + policy + context)
  * `decision_hash`

This is the foundation for replayability and audits.
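
One way to obtain a stable `input_bundle_hash` is to hash a canonical JSON encoding and to sort per-item hashes so that input order does not matter. This sketch assumes all inputs are JSON-serializable dicts. The helper names and the `sha256:` prefix convention are illustrative.

```python
import hashlib
import json


def canonical_hash(obj) -> str:
    """Hash a JSON-serializable object independent of dict key order."""
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def input_bundle_hash(claims, evidence, policy, context) -> str:
    """Combine claims + evidence + policy + context into one replayable hash."""
    return canonical_hash({
        # Sorting the per-item hashes makes the result independent of input order.
        "claims": sorted(canonical_hash(c) for c in claims),
        "evidence": sorted(canonical_hash(e) for e in evidence),
        "policy": canonical_hash(policy),
        "context": canonical_hash(context),
    })


claims_a = [{"issuer": "vendor", "status": "affected"},
            {"issuer": "internal", "status": "not_affected"}]
claims_b = [{"status": "not_affected", "issuer": "internal"},
            {"status": "affected", "issuer": "vendor"}]
policy = {"version": 1}
context = {"environment": "prod"}

hash_a = input_bundle_hash(claims_a, [], policy, context)
hash_b = input_bundle_hash(claims_b, [], policy, context)  # same content, reordered
```

The same approach applied to the `ResolvedDecision` would give the `decision_hash`. Replaying an evaluation then reduces to checking that equal input hashes produce equal decision hashes.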

---

## 6) Storage model: store raw inputs and canonical forms

Minimum stores:

* Raw documents (original VEX/CSAF/etc.) keyed by content hash
* Canonical claims keyed by claim hash
* Evidence items keyed by evidence hash and scoped by context
* Policy versions keyed by policy hash
* Resolutions keyed by (context, vuln_id, subject) with decision hash

---

## 7) “Definition of done” checklist for engineering

You are done when:

1. You can ingest at least two formats into canonical claims (pick OpenVEX + CycloneDX VEX first).
2. You can configure issuer trust and evidence requirements in a policy file.
3. You can resolve conflicts deterministically.
4. You can attach a config snapshot and produce:

   * `not_affected because feature flag off` **only when evidence satisfied**
5. The system produces a decision trace with:

   * applied policy rules
   * evidence satisfaction
   * selected/rejected claims and reasons
6. Golden test vectors exist for the acceptance scenarios listed above.

---

# A concrete example policy (schema-first, no full DSL required)

```yaml
version: 1

trust:
  issuers:
    - match: {category: vendor}
      weight: 70
      require_signature: true
    - match: {category: distro}
      weight: 75
      require_signature: true
    - match: {category: internal}
      weight: 85
      require_signature: false
    - match: {category: scanner}
      weight: 40

evidence_requirements:
  safe_status_requires_evidence: true

  rules:
    - when:
        status: not_affected
        reason: feature_flag_off
      require: [config_snapshot, flag_binding]

    - when:
        status: not_affected
        reason: component_not_present
      require: [sbom_absence]

    - when:
        status: not_affected
        reason: not_reachable
      require: [reachability_proof]

merge:
  strategy: conservative
  unsafe_wins_threshold: 50
  safe_override_delta: 10
```
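
To show how an engine might consume the `evidence_requirements` section, the sketch below mirrors part of the policy as a Python dict and looks up the rule matching a claim. A dict is used so the example needs no YAML parser. The function name and the rule-ID format it returns are assumptions.

```python
POLICY = {
    "evidence_requirements": {
        "safe_status_requires_evidence": True,
        "rules": [
            {"when": {"status": "not_affected", "reason": "feature_flag_off"},
             "require": ["config_snapshot", "flag_binding"]},
            {"when": {"status": "not_affected", "reason": "component_not_present"},
             "require": ["sbom_absence"]},
            {"when": {"status": "not_affected", "reason": "not_reachable"},
             "require": ["reachability_proof"]},
        ],
    },
}


def required_evidence(policy, status, reason):
    """Return (rule_id, required_types) for the first matching rule, else (None, [])."""
    rules = policy["evidence_requirements"]["rules"]
    for index, rule in enumerate(rules):
        when = rule["when"]
        if when["status"] == status and when["reason"] == reason:
            return f"evidence_requirements.rules[{index}]", list(rule["require"])
    return None, []


rule_id, required = required_evidence(POLICY, "not_affected", "feature_flag_off")
```

Returning the rule ID alongside the requirements lets the decision trace cite the exact policy clause that was applied. When no rule matches a safe status and `safe_status_requires_evidence` is true, the caller would presumably treat the claim as unsupported.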

---

# A concrete example output trace (what auditors and engineers must see)

```json
{
  "vuln_id": "CVE-XXXX-YYYY",
  "subject": "pkg:maven/org.example/foo@1.2.3",
  "context": {
    "artifact_digest": "sha256:...",
    "environment": "prod",
    "policy_hash": "sha256:..."
  },
  "resolved_status": "not_affected",
  "because": [
    {
      "winning_claim": "claim_hash_abc",
      "reason": "feature_flag_off",
      "evidence_required": ["config_snapshot", "flag_binding"],
      "evidence_present": ["ev_hash_1", "ev_hash_2"],
      "policy_rules_applied": ["trust.issuers[internal]", "evidence_requirements.rules[0]", "merge.safe_override_delta"]
    }
  ],
  "claims_considered": [
    {"issuer": "vendor", "status": "affected", "support": 62, "accepted": false, "rejection_reason": "overridden_by_higher_support_safe_claim_with_satisfied_evidence"},
    {"issuer": "internal", "status": "not_affected", "support": 78, "accepted": true, "evidence_satisfied": true}
  ],
  "decision_hash": "sha256:..."
}
```

---

## The two strategic pitfalls to explicitly avoid

1. **“Trust precedence” as the merge mechanism**

   * It will fail immediately on backports, forks, downstream patches, and environment-specific mitigations.
2. **Allowing “safe” without evidence**

   * That turns VEX into a suppression system and will collapse trust in the product.