StellaOps Bot 5b57b04484 house keeping work

2025-12-19 22:19:08 +02:00

Outcome you are shipping

A deterministic “claim resolution” capability that takes:

Multiple claims about the same vulnerability (vendor VEX, distro VEX, internal assessments, scanner inferences),
A policy describing trust and merge semantics,
A set of evidence artifacts (SBOM, config snapshots, reachability proofs, etc.),

…and produces a single resolved status per vulnerability/component/artifact with an explainable trail:

Which claims applied and why
Which were rejected and why
What evidence was required and whether it was satisfied
What policy rules triggered the resolution outcome

This replaces naive precedence like vendor > distro > internal.

Directions for Product Managers

1) Write the PRD around “claims resolution,” not “VEX support”

The customer outcome is not “we ingest VEX.” It is:

“We can safely accept ‘not affected’ without hiding risk.”
“We can prove, to auditors and change control, why a CVE was downgraded.”
“We can consistently resolve conflicts between issuer statements.”

Non-negotiable product properties

Deterministic: same inputs → same resolved outcome
Explainable: a human can trace the decision path
Guardrailed: a “safe” resolution requires evidence, not just a statement

2) Define the core objects (these drive everything)

In the PRD, define these three objects explicitly:

A) Claim (normalized)

A “claim” is any statement about vulnerability applicability to an artifact/component, regardless of source format.

Minimum fields:

vuln_id (CVE/GHSA/etc.)
subject (component identity; ideally package + version + digest/purl)
target (the thing we’re evaluating: image, repo build, runtime instance)
status (affected / not_affected / fixed / under_investigation / unknown)
justification (human/machine reason)
issuer (who said it; plus verification state)
scope (what it applies to; versions, ranges, products)
timestamp (when produced)
references (links/IDs to evidence or external material)
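The minimum fields above could be captured in a small immutable record. The following is only a sketch: the class name `Claim`, the split of issuer into `issuer` plus `issuer_verified`, the string-typed `scope` and `timestamp`, and the `claim_hash` helper are assumptions made for illustration, not part of the specification.

```python
from dataclasses import dataclass, asdict
import hashlib
import json

# Status values taken from the field list above.
VALID_STATUSES = frozenset(
    {"affected", "not_affected", "fixed", "under_investigation", "unknown"}
)


@dataclass(frozen=True)
class Claim:
    vuln_id: str           # CVE/GHSA/etc.
    subject: str           # component identity, ideally a purl
    target: str            # image, repo build, or runtime instance
    status: str            # one of VALID_STATUSES
    justification: str     # human/machine reason
    issuer: str            # who said it
    issuer_verified: bool  # verification state (assumed to be a boolean here)
    scope: str             # versions, ranges, products
    timestamp: str         # ISO 8601 string, kept opaque in this sketch
    references: tuple = () # links/IDs to evidence or external material

    def __post_init__(self):
        if self.status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")

    def claim_hash(self) -> str:
        # Canonical JSON (sorted keys, fixed separators) so equal claims hash equally.
        payload = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Hashing a canonical serialization gives each normalized claim a stable key regardless of which source format it came from.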

B) Evidence

A typed artifact that can satisfy a requirement.

Examples (not exhaustive):

config_snapshot (e.g., Helm values, env var map, feature flag export)
sbom_presence_or_absence (SBOM proof that component is/isn’t present)
reachability_proof (call-path evidence from entrypoint to vulnerable symbol)
symbol_absence (binary inspection shows symbol/function not present)
patch_presence (artifact includes backport / fixed build)
manual_attestation (human-reviewed attestation with reviewer identity + scope)

Each evidence item must have:

type
collector (tool/provider)
inputs_hash and output_hash
scope (what artifact/environment it applies to)
confidence (optional but recommended)
expires_at / valid_for (for config/runtime evidence)
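As a rough illustration, an evidence item with these fields might look like the record below. The class name `EvidenceItem`, the choice of `expires_at` over `valid_for`, and the `is_valid_at` helper are assumptions for the sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class EvidenceItem:
    type: str                              # e.g. "config_snapshot"
    collector: str                         # tool/provider that produced it
    inputs_hash: str
    output_hash: str
    scope: str                             # artifact/environment it applies to
    confidence: Optional[float] = None     # optional but recommended
    expires_at: Optional[datetime] = None  # mainly for config/runtime evidence

    def is_valid_at(self, when: datetime) -> bool:
        # Evidence without an expiry is treated as always valid in this sketch.
        return self.expires_at is None or when < self.expires_at


snapshot = EvidenceItem(
    type="config_snapshot",
    collector="helm-values-export",  # hypothetical collector name
    inputs_hash="sha256:...",
    output_hash="sha256:...",
    scope="prod",
    expires_at=datetime(2025, 1, 2, tzinfo=timezone.utc),
)
```

Checking expiry against the evaluation timestamp, rather than the wall clock, keeps the check replayable.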

C) Policy

A policy describes:

Trust rules (how much to trust whom, under which conditions)
Merge semantics (how to resolve conflicts)
Evidence requirements (what must be present to accept certain claims)

3) Ship “policy-controlled merge semantics” as a configuration schema first

Do not start with a fully general policy language. You need a small, explicit schema that makes behavior predictable.

PM deliverable: a policy spec with these sections:

Issuer trust
- weights by issuer category (vendor/distro/internal/scanner)
- optional constraints (must be signed, must match product ownership, must be within time window)
Applicability rules
- what constitutes a match to artifact/component (range semantics, digest match priority)
Evidence requirements
- per status + per justification: what evidence types are required
Conflict resolution strategy
- conservative vs weighted vs most-specific
- explicit guardrails (never accept “safe” without evidence)
Override rules
- when internal can override vendor (and what evidence is required to do so)
- environment-specific policies (prod vs dev)

4) Make “evidence hooks” a first-class user workflow

You are explicitly shipping the ability to say:

“This is not affected because feature flag X is off.”

That requires:

a way to provide or discover feature flag state, and
a way to bind that flag to the vulnerable surface

PM must specify: what does the user do to assert that?

Minimum viable workflow:

User attaches a config_snapshot (or system captures it)
User provides a “binding” to the vulnerable module/function:
- either automatic (later) or manual (first release)
- e.g., flag X gates module Y with references (file path, code reference, runbook)

This “binding” itself becomes evidence.
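A manual binding for the first release could be as simple as the record below, checked against a normalized key/value config snapshot. This is a sketch only: the `FlagBinding` shape, the flag name `features.legacy_parser`, and the set of "off" values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlagBinding:
    flag: str              # config key that gates the vulnerable surface
    gates: str             # module/function group controlled by the flag
    off_values: frozenset  # normalized values that mean "disabled"
    references: tuple      # file paths, code references, runbook links


def flag_is_off(config_snapshot: dict, binding: FlagBinding) -> bool:
    """Return True only if the flag is present and set to an 'off' value.

    A missing key returns False: absence of configuration is not treated
    as evidence that the feature is disabled.
    """
    if binding.flag not in config_snapshot:
        return False
    value = str(config_snapshot[binding.flag]).strip().lower()
    return value in binding.off_values


binding = FlagBinding(
    flag="features.legacy_parser",          # hypothetical flag name
    gates="parser/legacy",                  # hypothetical module path
    off_values=frozenset({"false", "0", "off"}),
    references=("docs/runbooks/legacy-parser.md",),  # hypothetical reference
)
```

Treating a missing flag as "not proven off" keeps the workflow on the conservative side of the evidence guardrail.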

5) Define acceptance criteria as decision trace tests

PM should write acceptance criteria as “given claims + policy + evidence → resolved outcome + trace”.

You need at least these canonical tests:

Distro backport vs vendor version logic conflict
- Vendor says affected (by version range)
- Distro says fixed (backport)
- Policy says: in distro context, distro claim can override vendor if patch evidence exists
- Outcome: fixed, with trace proving why
Internal ‘feature flag off’ downgrade
- Vendor says affected
- Internal says not_affected because flag off
- Evidence: config snapshot + flag→module binding
- Outcome: not_affected only for that environment context, with trace
Evidence missing
- Internal says not_affected because “code not reachable”
- No reachability evidence present
- Outcome: unknown or affected (policy-dependent), but not “not_affected”
Conflicting “safe” claims
- Vendor says not_affected (reason A)
- Internal says affected (reason B) with strong evidence
- Outcome follows merge strategy, and trace must show why.

6) Package it as an “Explainable Resolution” feature

UI/UX requirements PM must specify:

A “Resolved Status” view per vuln/component showing:
- contributing claims (ranked)
- rejected claims (with reason)
- evidence required vs evidence present
- the policy clauses triggered (line-level references)
The policy editor can be CLI/JSON first, with a UI later, but explainability cannot wait.

Directions for Development Managers

1) Implement as three services/modules with strict interfaces

Module A: Claim Normalization

Inputs: OpenVEX / CycloneDX VEX / CSAF / internal annotations / scanner hints
Output: canonical Claim objects

Rules:

Canonicalize IDs (normalize CVE formats, normalize package coordinates)
Preserve provenance: issuer identity, signature metadata, timestamps, original document hash

Module B: Evidence Providers (plugin boundary)

Provide an interface like:

evaluate_evidence(context, claim) -> EvidenceEvaluation

Where EvidenceEvaluation returns:

required evidence types for this claim (from policy)
found evidence items (from store/providers)
satisfied / not satisfied
explanation strings
confidence
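One possible shape for that result is sketched below. To stay self-contained it uses a simplified signature: instead of `(context, claim)`, the function takes the required evidence types (already looked up from policy) and a mapping of available evidence. Both parameters and the "lowest confidence wins" rule are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceEvaluation:
    required: tuple      # evidence types required for this claim
    found: tuple         # IDs of evidence items that matched
    satisfied: bool
    explanations: tuple  # one human-readable string per required type
    confidence: float


def evaluate_evidence(required_types, available) -> EvidenceEvaluation:
    """required_types: iterable of evidence type names.
    available: mapping of evidence type -> (evidence_id, confidence).
    """
    required = tuple(sorted(required_types))  # sorted for deterministic output
    found, explanations, confidences = [], [], []
    for ev_type in required:
        if ev_type in available:
            ev_id, conf = available[ev_type]
            found.append(ev_id)
            confidences.append(conf)
            explanations.append(f"{ev_type}: satisfied by {ev_id}")
        else:
            explanations.append(f"{ev_type}: missing")
    satisfied = len(found) == len(required)
    # Overall confidence is the weakest link; zero when anything is missing.
    confidence = min(confidences, default=1.0) if satisfied else 0.0
    return EvidenceEvaluation(
        required, tuple(found), satisfied, tuple(explanations), confidence
    )
```

Returning explanation strings for missing evidence as well as found evidence is what lets the decision trace show "required vs present" without re-running the providers.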

Start with 3 providers:

SBOM provider (presence/absence)
Config provider (feature flags/config snapshot ingestion)
Reachability provider (even if initially limited or stubbed, it must exist as a typed hook)

Module C: Merge & Resolution Engine

Inputs: set of claims + policy + evidence evaluations + context
Output: ResolvedDecision

A ResolvedDecision must include:

final status
selected “winning” claim(s)
all considered claims
evidence satisfaction summary
applied policy rule IDs
deterministic ordering keys/hashes

2) Define the evaluation context (this avoids foot-guns)

The resolved outcome must be context-aware.

Create an immutable EvaluationContext object, containing:

artifact identity (image digest / build digest / SBOM hash)
environment identity (prod/stage/dev; cluster; region)
config snapshot ID
time (evaluation timestamp)
policy version hash

This is how you support: “not affected because feature flag off” in prod but not in dev.
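A frozen dataclass is one way to get the immutability asked for here. The field names below and the `context_hash` helper are illustrative assumptions that mirror the list above.

```python
from dataclasses import dataclass, asdict
import hashlib
import json


@dataclass(frozen=True)
class EvaluationContext:
    artifact_digest: str     # image digest / build digest / SBOM hash
    environment: str         # e.g. "prod", "stage", "dev"
    config_snapshot_id: str
    evaluated_at: str        # evaluation timestamp as an ISO 8601 string
    policy_hash: str

    def context_hash(self) -> str:
        # Canonical JSON so the same context always yields the same hash.
        payload = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


prod = EvaluationContext("sha256:...", "prod", "snap-1", "2025-01-01T00:00:00Z", "sha256:...")
dev = EvaluationContext("sha256:...", "dev", "snap-2", "2025-01-01T00:00:00Z", "sha256:...")
```

Because `prod` and `dev` hash differently, a resolution stored against one context cannot be silently reused for the other.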

3) Merge semantics: implement scoring + guardrails, not precedence

You need a deterministic function. One workable approach:

Step 1: compute statement strength

For each claim:

trust_weight from policy (issuer + scope + signature requirements)
evidence_factor (1.0 if requirements satisfied; <1 or 0 if not)
specificity_factor (exact digest match > exact version > range)
freshness_factor (optional; policy-defined)
applicability must be true or claim is excluded

Compute:

support = trust_weight * evidence_factor * specificity_factor * freshness_factor
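The formula can be sketched as a pure function. The specificity values (1.0 / 0.8 / 0.6) and the default of 0 for unsatisfied evidence are placeholder assumptions; the text only fixes their ordering and leaves the actual numbers to policy.

```python
# Assumed factor values; only the ordering digest > version > range comes from the text.
SPECIFICITY = {"digest": 1.0, "version": 0.8, "range": 0.6}


def compute_support(
    trust_weight: float,
    evidence_satisfied: bool,
    match_kind: str,
    freshness_factor: float = 1.0,
    applicable: bool = True,
    unsatisfied_evidence_factor: float = 0.0,
):
    """Return the claim's support, or None if the claim is excluded."""
    if not applicable:
        return None  # inapplicable claims are excluded, not scored as zero
    evidence_factor = 1.0 if evidence_satisfied else unsatisfied_evidence_factor
    specificity_factor = SPECIFICITY[match_kind]
    return trust_weight * evidence_factor * specificity_factor * freshness_factor
```

Returning `None` for inapplicable claims keeps "excluded" distinct from "scored zero", which matters when the trace lists rejected claims and their reasons.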

Step 2: apply merge strategy (policy-controlled)

Ship at least two strategies:

Conservative default
- If any “unsafe” claim (affected/under_investigation) has support above unsafe_wins_threshold, it wins
- A “safe” claim (not_affected/fixed) can override only if:
  - its support is at least the strongest unsafe claim’s support plus safe_override_delta, AND
  - its evidence requirements are satisfied
Evidence-weighted
- Highest support wins, but safe statuses have a hard evidence gate

Step 3: apply guardrails

Hard guardrail to prevent bad outcomes:

Never emit a safe status unless evidence requirements for that safe claim are satisfied.
If a safe claim lacks evidence, downgrade the safe claim to “unsupported” and do not allow it to win.

This single rule is what makes your system materially different from “VEX as suppression.”
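A minimal sketch of the conservative strategy with the hard guardrail follows. It reads "equal/higher support + delta" as "safe support >= strongest unsafe support + delta", and it returns `unknown` when nothing qualifies; both are assumptions. `ScoredClaim` is a hypothetical holder for the values computed in the earlier steps.

```python
from dataclasses import dataclass

SAFE = frozenset({"not_affected", "fixed"})
UNSAFE = frozenset({"affected", "under_investigation"})


@dataclass(frozen=True)
class ScoredClaim:
    claim_hash: str
    status: str
    support: float
    evidence_satisfied: bool


def resolve_conservative(claims, unsafe_wins_threshold, safe_override_delta):
    """Return (resolved_status, winning_claim_hash)."""
    # Deterministic order: highest support first, claim hash as tie-breaker.
    ordered = sorted(claims, key=lambda c: (-c.support, c.claim_hash))
    # Guardrail: a safe claim without satisfied evidence can never win.
    safe = [c for c in ordered if c.status in SAFE and c.evidence_satisfied]
    unsafe = [
        c for c in ordered
        if c.status in UNSAFE and c.support > unsafe_wins_threshold
    ]
    if unsafe:
        top_unsafe = unsafe[0]
        if safe and safe[0].support >= top_unsafe.support + safe_override_delta:
            return safe[0].status, safe[0].claim_hash
        return top_unsafe.status, top_unsafe.claim_hash
    if safe:
        return safe[0].status, safe[0].claim_hash
    return "unknown", None


claims = [
    ScoredClaim("claim_vendor", "affected", 62, True),
    ScoredClaim("claim_internal", "not_affected", 78, True),
]
print(resolve_conservative(claims, 50, 10))  # ('not_affected', 'claim_internal')
```

With a threshold of 50 and a delta of 10, the internal claim wins because 78 >= 62 + 10 and its evidence is satisfied. Flip `evidence_satisfied` to `False` and the vendor's `affected` claim wins instead.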

4) Evidence hooks: treat them as typed contracts, not strings

For “feature flag off,” implement it as a structured evidence requirement.

Example evidence requirement for a “safe because feature flag off” claim:

Required evidence types:
- config_snapshot
- flag_binding (the mapping “flag X gates vulnerable surface Y”)

Implementation:

Config provider can parse:
- Helm values / env var sets / feature flag exports
- Store them as normalized key/value with hashes
Binding evidence can start as manual JSON that references:
- repo path / module / function group
- a link to code ownership / runbook
- optional test evidence

Later you can automate binding via static analysis, but do not block shipping on that.

5) Determinism requirements (engineering non-negotiables)

Development manager should enforce:

stable sorting of claims by canonical key
stable tie-breakers (e.g., issuer ID, timestamp, claim hash)
no nondeterministic external calls during evaluation (or they must be snapshot-based)
every evaluation produces:
- input_bundle_hash (claims + evidence + policy + context)
- decision_hash

This is the foundation for replayability and audits.
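One way to get order-independent hashes is to serialize to canonical JSON and sort the per-item hashes before combining them. The helper names and the dict-based inputs below are assumptions for illustration.

```python
import hashlib
import json


def canonical_hash(obj) -> str:
    # Sorted keys and fixed separators give a stable byte representation.
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def input_bundle_hash(claims, evidence, policy, context) -> str:
    """Hash claims + evidence + policy + context, independent of input order."""
    bundle = {
        # Sorting the item hashes removes any dependence on arrival order.
        "claims": sorted(canonical_hash(c) for c in claims),
        "evidence": sorted(canonical_hash(e) for e in evidence),
        "policy": canonical_hash(policy),
        "context": canonical_hash(context),
    }
    return canonical_hash(bundle)


def decision_hash(bundle_hash: str, resolved_status: str, winning_claims) -> str:
    return canonical_hash(
        {
            "input_bundle_hash": bundle_hash,
            "resolved_status": resolved_status,
            "winning_claims": sorted(winning_claims),
        }
    )
```

Binding the decision hash to the input bundle hash means a replay can confirm both that the inputs were identical and that the same outcome was reached.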

6) Storage model: store raw inputs and canonical forms

Minimum stores:

Raw documents (original VEX/CSAF/etc.) keyed by content hash
Canonical claims keyed by claim hash
Evidence items keyed by evidence hash and scoped by context
Policy versions keyed by policy hash
Resolutions keyed by (context, vuln_id, subject) with decision hash

7) “Definition of done” checklist for engineering

You are done when:

You can ingest at least two formats into canonical claims (pick OpenVEX + CycloneDX VEX first).
You can configure issuer trust and evidence requirements in a policy file.
You can resolve conflicts deterministically.
You can attach a config snapshot and produce not_affected (reason: feature flag off) only when the evidence requirements are satisfied.
The system produces a decision trace with:
- applied policy rules
- evidence satisfaction
- selected/rejected claims and reasons
Golden test vectors exist for the acceptance scenarios listed above.

A concrete example policy (schema-first, no full DSL required)

version: 1

trust:
  issuers:
    - match: {category: vendor}
      weight: 70
      require_signature: true
    - match: {category: distro}
      weight: 75
      require_signature: true
    - match: {category: internal}
      weight: 85
      require_signature: false
    - match: {category: scanner}
      weight: 40

evidence_requirements:
  safe_status_requires_evidence: true

  rules:
    - when:
        status: not_affected
        reason: feature_flag_off
      require: [config_snapshot, flag_binding]

    - when:
        status: not_affected
        reason: component_not_present
      require: [sbom_absence]

    - when:
        status: not_affected
        reason: not_reachable
      require: [reachability_proof]

merge:
  strategy: conservative
  unsafe_wins_threshold: 50
  safe_override_delta: 10
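To show how an engine might consume the evidence section of this policy, here is a sketch of the rule lookup. It works on the policy as an already-parsed dictionary (the YAML loading step is omitted). The function name and the convention of returning `None` for "no acceptable evidence defined" are assumptions.

```python
SAFE_STATUSES = frozenset({"not_affected", "fixed"})

# The evidence_requirements section of the policy above, as parsed data.
POLICY = {
    "evidence_requirements": {
        "safe_status_requires_evidence": True,
        "rules": [
            {"when": {"status": "not_affected", "reason": "feature_flag_off"},
             "require": ["config_snapshot", "flag_binding"]},
            {"when": {"status": "not_affected", "reason": "component_not_present"},
             "require": ["sbom_absence"]},
            {"when": {"status": "not_affected", "reason": "not_reachable"},
             "require": ["reachability_proof"]},
        ],
    }
}


def required_evidence(policy, status, reason):
    """Return (required_types, rule_id).

    required_types is None when a safe status has no matching rule and the
    policy demands evidence: such a claim cannot be accepted at all.
    """
    section = policy["evidence_requirements"]
    for index, rule in enumerate(section["rules"]):
        when = rule["when"]
        if when["status"] == status and when["reason"] == reason:
            return list(rule["require"]), f"evidence_requirements.rules[{index}]"
    if status in SAFE_STATUSES and section.get("safe_status_requires_evidence"):
        return None, "evidence_requirements.safe_status_requires_evidence"
    return [], None
```

Returning the rule identifier alongside the requirement gives the trace a precise policy reference for each decision.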

A concrete example output trace (what auditors and engineers must see)

{
  "vuln_id": "CVE-XXXX-YYYY",
  "subject": "pkg:maven/org.example/foo@1.2.3",
  "context": {
    "artifact_digest": "sha256:...",
    "environment": "prod",
    "policy_hash": "sha256:..."
  },
  "resolved_status": "not_affected",
  "because": [
    {
      "winning_claim": "claim_hash_abc",
      "reason": "feature_flag_off",
      "evidence_required": ["config_snapshot", "flag_binding"],
      "evidence_present": ["ev_hash_1", "ev_hash_2"],
      "policy_rules_applied": ["trust.issuers[internal]", "evidence_requirements.rules[0]", "merge.safe_override_delta"]
    }
  ],
  "claims_considered": [
    {"issuer": "vendor", "status": "affected", "support": 62, "accepted": false, "rejection_reason": "overridden_by_higher_support_safe_claim_with_satisfied_evidence"},
    {"issuer": "internal", "status": "not_affected", "support": 78, "accepted": true, "evidence_satisfied": true}
  ],
  "decision_hash": "sha256:..."
}

The two strategic pitfalls to explicitly avoid

“Trust precedence” as the merge mechanism
- It will fail immediately on backports, forks, downstream patches, and environment-specific mitigations.
Allowing “safe” without evidence
- That turns VEX into a suppression system and will collapse trust in the product.
