git.stella-ops.org/docs/product-advisories/09-Dec-2025 - Smart‑Diff and Provenance‑Rich Binaries.md
2025-12-09 20:23:50 +02:00


Here's a compact, first-time-friendly plan to add two high-leverage features to your platform: an image Smart-Diff (with signed, policy-aware deltas) and better binaries (symbol/byte-level SCA + provenance + SARIF).

Smart-Diff (images & containers) — what/why/how

What it is: Compute deltas between two images (or layers) and enrich them with context: which files, which packages, which configs flip behavior, and whether the change is actually reachable at runtime. Then sign the report so downstream tools can trust it.

Why it matters: Teams drown in “changed but harmless” noise. A diff that knows “is this reachable, config-activated, and under the running user?” prioritizes real risk and shortens MTTR.

How to ship it (Stella Ops-style, on-prem, .NET 10):

  • Scope of diff

    • Layer → file → package → symbol (map file changes to package + version; map package to symbols/exports when available).
    • Config/env lens: overlay ENTRYPOINT/CMD, env, feature flags, mounted secrets, user/UID.
  • Reachability gates (3-bit severity gate)

    • Reachable? (call graph / entrypoints / process tree)
    • Config-activated? (feature flags, env, args)
    • Running user? (match file/dir ACLs, capabilities, container User:)
    • Compute a severity class from these bits (e.g., 0-7) and attach a short rationale.
  • Attestation

    • Emit a DSSE-wrapped in-toto attestation with the Smart-Diff as predicate.
    • Include: artifact digests (old/new), diff summary, gate bits, rule versions, and scanner build info.
    • Sign offline; verify with cosign/rekor when online is available.
  • Predicate (minimal JSON shape)

{
  "predicateType": "stellaops.dev/predicates/smart-diff@v1",
  "predicate": {
    "baseImage": {"name":"...", "digest":"sha256:..."},
    "targetImage": {"name":"...", "digest":"sha256:..."},
    "diff": {
      "filesAdded": [...],
      "filesRemoved": [...],
      "filesChanged": [{"path":"...", "hunks":[...]}],
      "packagesChanged": [{"name":"openssl","from":"1.1.1u","to":"3.0.14"}]
    },
    "context": {
      "entrypoint":["/app/start"],
      "env":{"FEATURE_X":"true"},
      "user":{"uid":1001,"caps":["NET_BIND_SERVICE"]}
    },
    "reachabilityGate": {"reachable":true,"configActivated":true,"runningUser":false,"class":6},
    "scanner": {"name":"StellaOps.Scanner","version":"...","ruleset":"reachability-2025.12"}
  }
}
  • Pipelines

    • Scanner computes diff → predicate JSON → DSSE envelope → write .intoto.jsonl.
    • Optionally export a lightweight human report (markdown) and a machine delta (protobuf/JSON).
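The 3-bit gate above collapses naturally into the 0-7 class via bit arithmetic. A minimal sketch — the bit weights (reachable = 4, config-activated = 2, running user = 1) are an assumption, chosen so the example predicate above (reachable + config-activated, different user) yields class 6:

```csharp
// Hypothetical helper: pack the three gate bits into a 0-7 severity class.
// Assumed weights: reachable = 4, configActivated = 2, runningUser = 1.
static int GateClass(bool reachable, bool configActivated, bool runningUser) =>
    (reachable ? 4 : 0) | (configActivated ? 2 : 0) | (runningUser ? 1 : 0);

// Matches the example predicate: reachable, config-activated, different user.
System.Console.WriteLine(GateClass(true, true, false)); // prints 6
```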

Better binaries — symbol/byte SCA + provenance + SARIF

What it is: Go beyond package SBOMs. Identify symbols, sections, and compiler fingerprints in each produced binary; capture provenance (compiler, flags, LTO, link order, hashes); then emit:

  1. an in-toto statement per binary, and
  2. a SARIF 2.1.0 report for GitHub code scanning.

Why it matters: A lot of risk hides below package level (vendored code, static libs, LTO). Symbol/byte SCA catches it; provenance proves how the binary was built.

How to ship it:

  • Extractors (modular analyzers)

    • ELF/PE/Mach-O parsers (sections, imports/exports, build-ids, rpaths).
    • Symbol tables (public + demangled), string tables, compiler notes (.comment), PDB/DWARF when present.
    • Fingerprints: rolling hashes per section/function; Bloom filters for quick symbol presence checks.
  • Provenance capture

    • Compiler: name/version, target triple, LTO (on/off/mode).
    • Flags: -O, -fstack-protector, -D_FORTIFY_SOURCE, PIE/RELRO, CET/CFGuard.
    • Linker: version, libs, order, dead-strip/LTO decisions.
  • In-toto statement (per binary)

{
  "predicateType":"https://slsa.dev/provenance/v0.2",
  "subject":[{"name":"bin/app","digest":{"sha256":"..."}}],
  "predicate":{
    "builder":{"id":"stellaops://builder/ci"},
    "buildType":"stellaops.dev/build/native@v1",
    "metadata":{"buildInvocationID":"...","buildStartedOn":"...","buildFinishedOn":"..."},
    "materials":[{"uri":"git+ssh://...#<commit>","digest":{"sha1":"..."}}],
    "buildConfig":{
      "compiler":{"name":"clang","version":"18.1.3"},
      "flags":["-O2","-fstack-protector-strong","-fPIE"],
      "lto":"thin",
      "linker":{"name":"lld","version":"18.1.3"},
      "hardening":{"pie":true,"relro":"full","fortify":true}
    }
  }
}
  • SARIF 2.1.0 for GitHub code scanning

    • One SARIF file per build (or per repo), tool name StellaOps.BinarySCA.

    • For each finding (e.g., vulnerable function signature or insecure linker flag), add:

      • ruleId, CWE/Vuln ID, severity, location (binary + symbol), helpUri.
    • Upload via Actions/API so issues appear in Security → Code scanning alerts.

  • CI wiring (on-prem friendly)

    • Build → run binary analyzers → write binary.intoto.jsonl + sca.sarif.json.
    • Sign the in-toto statement (DSSE). If air-gapped, store in your internal evidence bucket; sync to a Rekor mirror later.
    • Optional: export a compact “binary SBOM” (function inventory + hashes).
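As a concrete target for the SARIF exporter, a minimal 2.1.0 file for one of the starter rules might look like this (the shape follows the SARIF 2.1.0 spec; the rule metadata and message text are illustrative):

```json
{
  "version": "2.1.0",
  "runs": [
    {
      "tool": {
        "driver": {
          "name": "StellaOps.BinarySCA",
          "rules": [
            { "id": "STB001_NO_RELRO", "shortDescription": { "text": "Binary linked without full RELRO" } }
          ]
        }
      },
      "results": [
        {
          "ruleId": "STB001_NO_RELRO",
          "level": "warning",
          "message": { "text": "bin/app is linked without full RELRO" },
          "locations": [
            { "physicalLocation": { "artifactLocation": { "uri": "bin/app" } } }
          ]
        }
      ]
    }
  ]
}
```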

Minimal .NET 10 / CLI layout (suggested)

src/Scanner/StellaOps.Scanner.SmartDiff/
src/Scanner/StellaOps.Scanner.BinarySCA/
src/Predicates/StellaOps.Predicates/           # JSON schemas, versioned
src/Sign/StellaOps.Attestation/                # DSSE envelopes, cosign integration
src/Exports/StellaOps.Exports.Sarif/
src/Exports/StellaOps.Exports.InToto/
  • Contracts: freeze JSON Schemas under StellaOps.Predicates and version them (e.g., smart-diff@v1, binary-provenance@v1).
  • Determinism: lock analyzer rulesets + feeds with content hashes; record them in each predicate (rulesetDigest).
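For instance, a frozen smart-diff@v1 contract under StellaOps.Predicates could start roughly like this — a sketch against JSON Schema draft 2020-12, with an illustrative $id:

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://stella-ops.org/schemas/smart-diff@v1.json",
  "type": "object",
  "required": ["predicateType", "predicate"],
  "properties": {
    "predicateType": { "const": "stellaops.dev/predicates/smart-diff@v1" },
    "predicate": {
      "type": "object",
      "required": ["baseImage", "targetImage", "diff", "reachabilityGate", "scanner"],
      "properties": {
        "reachabilityGate": {
          "type": "object",
          "properties": {
            "class": { "type": "integer", "minimum": 0, "maximum": 7 }
          }
        }
      }
    }
  }
}
```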

Quick wins this week

  • Implement the 3-bit reachability gate and surface it in your UI filter.
  • Emit DSSE-wrapped in-toto for Smart-Diff first (binary provenance next).
  • Add a basic SARIF exporter that maps binary findings → GitHub alerts (even if only a couple of rules to start: missing RELRO/PIE; insecure __strcpy occurrences).

If you want, I can draft:

  • The JSON Schemas for smart-diff@v1 and binary-provenance@v1.
  • A tiny C# sample that wraps a predicate in DSSE and verifies with cosign.
  • A SARIF emitter stub wired to your CI.

I will split this into two parts:

  1. A compact list of Stella Ops advantages (how we win vs other scanners).
  2. Concrete developer guidelines that make those advantages real in code.

1. Stella Ops advantages (what we are optimizing for)

When you build any feature, it should clearly reinforce at least one of these:

  1. Evidence-first, signed, replayable. Every non-trivial operation produces signed, DSSE-wrapped attestations and can be re-run later to obtain the same result byte-for-byte.

  2. Reachability-first triage (Smart-Diff + gates). We never just say “this CVE exists”; we say: it changed, it is or is not reachable, it is or is not activated by config, and which user actually executes it.

  3. Binary-level SCA + provenance. We do not stop at packages. We inspect binaries (symbols, sections, toolchain fingerprints) and provide in-toto/SLSA provenance plus SARIF to development tools.

  4. Crypto-sovereign and offline-ready. All signing/verification can use local trust roots and local cryptographic profiles (FIPS / eIDAS / GOST / SM) with no hard dependency on public CAs or external clouds.

  5. Deterministic, replayable scans. A “scan” is a pure function of: artifact digests, feeds, rules, lattice policies, and configuration. Anything not captured there is a bug.

  6. Policy & lattice engine instead of ad-hoc rules. Risk and VEX decisions are the result of explicit lattice merge rules (“trust algebra”), not opaque if-else trees in the code.

  7. Proof-of-integrity graph. All artifacts (source → build → container → runtime) are connected in a cryptographic graph that can be traversed, audited, and exported.

  8. Quiet-by-design UX. The system is optimized to answer three questions fast:

    1. Can I ship this?
    2. If not, what blocks me?
    3. What is the minimal safe change?

Everything you build should clearly map to one or more of the above.


2. Developer guidelines by advantage

2.1 Evidence-first, signed, replayable

Core rule: Any non-trivial action must be traceable as a signed, replayable evidence record.

Implementation guidelines

  1. Uniform attestation model

    • Define and use a shared library, e.g. StellaOps.Predicates, with:

      • Versioned JSON Schemas (e.g. smart-diff@v1, binary-provenance@v1, reachability-summary@v1).
      • Strongly-typed C# DTOs that match the schemas.
    • Every module (Scanner, Sbomer, Concelier, Excititor/Vexer, Authority, Scheduler, Feedser) must:

      • Emit DSSE-wrapped in-toto statements.
      • Use the same hashing strategy (e.g., SHA-256 over canonical JSON, no whitespace variance).
      • Include tool name, version, ruleset/feeds digests and configuration id in each predicate.
  2. Link-not-merge

    • Never rewrite or mutate third-party SBOM/VEX/attestations.

    • Instead:

      • Store original documents as immutable blobs addressed by hash.
      • Refer to them using digests and URIs (e.g. sha256:…) from your own predicates.
      • Emit linking evidence: “this SBOM (digest X) was used to compute decision Y”.
  3. Deterministic scan manifests

    • Each scan must have a manifest object:

      {
        "artifactDigest": "sha256:...",
        "scannerVersion": "1.2.3",
        "rulesetDigest": "sha256:...",
        "feedsDigests": { "nvd": "sha256:...", "vendorX": "sha256:..." },
        "latticePolicyDigest": "sha256:...",
        "configId": "prod-eu-1",
        "timestamp": "2025-12-09T13:37:00Z"
      }
      
    • Store it alongside results and include its digest in all predicates produced by that run.

  4. Signing & verification

    • All attestation writing goes through a single abstraction, e.g.:

      interface IAttestationSigner {
          Task<DSSEEnvelope> SignAsync<TPredicate>(TPredicate predicate, CancellationToken ct);
      }
      
    • Implementations may use:

      • Sigstore (Fulcio + Rekor) when online.
      • Local keys (HSM, TPM, file key) when offline.
    • Never do ad-hoc crypto directly in features; always go through the shared crypto layer (see 2.4).
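For reference, the envelope an IAttestationSigner implementation returns follows the standard DSSE shape; a sketch (the keyid value is illustrative):

```json
{
  "payloadType": "application/vnd.in-toto+json",
  "payload": "<base64 of the canonical predicate JSON>",
  "signatures": [
    { "keyid": "stellaops-signing-2025", "sig": "<base64 signature over the DSSE PAE>" }
  ]
}
```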


2.2 Reachability-first triage and Smart-Diff

Core rule: You must never treat “found a CVE” as sufficient. You must track change + reachability + config + execution context.

Smart-Diff

  1. Diff levels

    • Implement layered diffs:

      • Image / layer → file → package → symbol.
    • Map:

      • File changes → owning package, version.
      • Package changes → known vulnerabilities and exports.
    • Attach config context: entrypoint, env vars, feature flags, user/UID.

  2. Signed Smart-Diff predicate

    • Use a minimal shape like this one (simplified):

      {
        "predicateType": "stellaops.dev/predicates/smart-diff@v1",
        "predicate": {
          "baseImage": {...},
          "targetImage": {...},
          "diff": {...},
          "context": {...},
          "reachabilityGate": {...},
          "scanner": {...}
        }
      }
      
    • Always sign as DSSE and attach the scan manifest digest.

Reachability gate

Use the 3-bit gate consistently:

  • reachable (static/dynamic call graph says “yes”)
  • configActivated (env/flags/args activate the code path)
  • runningUser (the user/UID that can actually execute it)

Guidelines:

  1. Data model

    public sealed record ReachabilityGate(
        bool? Reachable,       // true / false / null for unknown
        bool? ConfigActivated,
        bool? RunningUser,
        int Class,             // 0..7 derived from the bits when all known
        string Rationale       // short explanation, human-readable
    );
    
  2. Unknowns must stay unknown

    • Never silently treat null as false or true.
    • If any of the bits is null, compute Class only from known bits or set Class = -1 to denote “incomplete”.
    • Feed all “unknown” cases into a dedicated “Unknowns ranking” path (separate heuristics and UX).
  3. Where reachability is computed

    • Respect your standing rule: lattice and reachability algorithms run in Scanner.WebService, not in Concelier, Feedser, or Excititor/Vexer.

    • Other services only:

      • Persist / index results.
      • Prune / filter based on policy.
      • Present data — never recompute core reachability.
  4. Caching reachability

    • Key caches by:

      • Artifact digest (image/layer/binary).
      • Ruleset/lattice digest.
      • Language/runtime version (for static analysis).
    • Pattern:

      • First time a call path is requested, compute and cache.
      • Subsequent accesses in the same scan use the in-memory cache.
      • For cross-scan reuse, store a compact summary keyed by (artifactDigest, rulesetDigest) in Scanner's persistence node.
    • Never cache across incompatible rule or feed versions.
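The cache-key guidance above can be sketched as a deterministic key composer (names and the separator format are illustrative, not an existing Stella Ops API):

```csharp
// Hypothetical: fold the three cache dimensions into one deterministic key.
// Incompatible ruleset digests produce distinct keys, so stale reuse is impossible.
static string CacheKey(string artifactDigest, string rulesetDigest, string runtimeVersion) =>
    $"{artifactDigest}|{rulesetDigest}|{runtimeVersion}";

System.Console.WriteLine(CacheKey("sha256:aaa", "sha256:rules1", "net10"));
// prints sha256:aaa|sha256:rules1|net10
```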


2.3 Binary-level SCA and provenance

Core rule: Treat each built binary as a first-class subject with its own SBOM, SCA, and provenance.

  1. Pluggable analyzers

    • Create analyzers per binary format/language:

      • ELF, PE, Mach-O.
      • Language/toolchain detectors (GCC/Clang/MSVC/.NET/Go/Rust).
    • Common interface:

      interface IBinaryAnalyzer {
          bool CanHandle(BinaryContext ctx);
          Task<BinaryAnalysisResult> AnalyzeAsync(BinaryContext ctx, CancellationToken ct);
      }
      
  2. Binary SBOM + SCA

    • Output per-binary:

      • Function/symbol inventory (names, addresses).
      • Linked static libraries.
      • Detected third-party components (via fingerprints).
    • Map to known vulnerabilities via:

      • Symbol signatures.
      • Function-level or section-level hashes.
    • Emit:

      • CycloneDX/SPDX component entries for binaries.
      • A separate predicate binary-sca@v1.
  3. Provenance (in-toto/SLSA)

    • Emit an in-toto statement per binary:

      • Subject = bin/app (digest).
      • Predicate = build metadata (compiler, flags, LTO, linker, hardening).
    • Always include:

      • Source material (git repo + commit).
      • Build environment (container image digest or runner OS).
      • Exact build command / script identifier.
  4. SARIF for GitHub / IDEs

    • Provide an exporter:

      • Input: BinaryAnalysisResult.

      • Output: SARIF 2.1.0 with:

        • Findings: missing RELRO/PIE, unsafe functions, known vulns, weak flags.
        • Locations: binary path + symbol/function name.
    • Keep rule IDs stable and documented (e.g. STB001_NO_RELRO, STB010_VULN_SYMBOL).
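The binary-sca@v1 predicate named above is not specified elsewhere in this document; one plausible shape, mirroring the smart-diff predicate conventions (all field names here are a proposal, not a frozen contract):

```json
{
  "predicateType": "stellaops.dev/predicates/binary-sca@v1",
  "predicate": {
    "subject": { "name": "bin/app", "digest": { "sha256": "..." } },
    "symbols": [ { "name": "SSL_read", "section": ".text" } ],
    "staticLibraries": [ "libssl.a" ],
    "components": [ { "name": "openssl", "version": "3.0.14", "matchedBy": "symbol-signature" } ],
    "findings": [ { "ruleId": "STB010_VULN_SYMBOL", "symbol": "SSL_read", "severity": "high" } ]
  }
}
```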


2.4 Crypto-sovereign, offline-ready

Core rule: No feature may rely on a single global PKI or always-online trust path.

  1. Crypto abstraction

    • Introduce a narrow interface:

      interface ICryptoProfile {
          string Name { get; }
          IAttestationSigner AttestationSigner { get; }
          IVerifier DefaultVerifier { get; }
      }
      
    • Provide implementations:

      • FipsCryptoProfile
      • EUeIDASCryptoProfile
      • GostCryptoProfile
      • SmCryptoProfile
    • Selection via configuration, not code changes.

  2. Offline bundles

    • Everything needed to verify a decision must be downloadable:

      • Scanner binaries.
      • Rules/feeds snapshot.
      • CA chains / trust roots.
      • Public keys for signers.
    • Implement a “bundle manifest” that ties these together and is itself signed.

  3. Rekor / ledger independence

    • If Rekor is available:

      • Log attestations.
    • If not:

      • Log to Stella Ops Proof-Market Ledger or internal append-only store.
    • Features must not break when Rekor is absent.
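A signed bundle manifest tying the offline pieces together might look like this (structure and field names are a sketch):

```json
{
  "bundleVersion": "2025.12",
  "contents": {
    "scanner":    { "uri": "bin/StellaOps.Scanner.tar.gz", "digest": "sha256:..." },
    "rulesFeeds": { "uri": "feeds/snapshot.tar.zst",       "digest": "sha256:..." },
    "trustRoots": { "uri": "trust/ca-chains.pem",          "digest": "sha256:..." },
    "signerKeys": { "uri": "trust/signer-keys.pem",        "digest": "sha256:..." }
  },
  "signature": { "profile": "FipsCryptoProfile", "sig": "<base64>" }
}
```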


2.5 Policy & lattice engine

Core rule: Risk decisions are lattice evaluations over facts; do not hide policy logic inside business code.

  1. Facts vs policy

    • Facts are:

      • CVE presence, severity, exploit data.
      • Reachability gates.
      • Runtime events (was this function ever executed?).
      • Vendor VEX statements.
    • Policy is:

      • Lattice definitions and merge rules.
      • Trust preferences (vendor vs runtime vs scanner).
    • In code:

      • Facts are input DTOs stored in the evidence graph.
      • Policy is JSON/YAML configuration with versioned schemas.
  2. Single evaluation engine in Scanner.WebService

    • Lattice evaluation must run only in StellaOps.Scanner.WebService (your standing rule).

    • Other services:

      • Request decisions from Scanner.
      • Pass only references (IDs/digests) to facts, not raw policy.
  3. Deterministic evaluation

    • Lattice evaluation must:

      • Use only input facts + policy.
      • Never depend on current time, random, environment state.
    • Every decision object must include:

      • policyDigest
      • inputFactsDigests[]
      • decisionReason (short machine+human readable explanation)

2.6 Proof-of-integrity graph

Core rule: Everything is a node; all relationships are typed edges; nothing disappears.

  1. Graph model

    • Nodes: source repo, commit, build job, SBOM, attestation, image, container runtime, host.

    • Edges: “built_from”, “scanned_with”, “deployed_as”, “executes_on”, “derived_from”.

    • Store in a graph store or graph-like relational schema:

      • IDs are content digests where possible.
  2. Append-only

    • Never delete or overwrite nodes; mark as superseded if needed.
    • Evidence mutations (e.g. new scan) are new nodes/edges.
  3. APIs

    • Provide traversal APIs:

      • “Given this CVE, which production pods are affected?”
      • “Given this pod, show full ancestry up to source commit.”
    • All UI queries must work via these APIs, not ad-hoc joins.
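A minimal serialized slice of that graph, using the node types and edge labels above (IDs shown as placeholder digests):

```json
{
  "nodes": [
    { "id": "sha256:<image digest>",  "type": "image" },
    { "id": "sha1:<commit digest>",   "type": "commit" }
  ],
  "edges": [
    { "from": "sha256:<image digest>", "to": "sha1:<commit digest>", "type": "built_from" }
  ]
}
```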


2.7 Quiet-by-design UX and observability

Core rule: Default to minimal, actionable noise; logs and telemetry must be compliant and air-gap friendly.

  1. Triage model

    • Classify everything into:

      • “Blockers” (fail pipeline).
      • “Needs review” (warn but pass).
      • “Noise” (hidden unless requested).
    • The classification uses:

      • Lattice decisions.
      • Reachability gates.
      • Environment criticality (prod vs dev).
  2. Evidence-centric UX

    • Each UI card or API answer must:

      • Reference the underlying attestations by ID/digest.
      • Provide a one-click path to “show raw evidence”.
  3. Logging & telemetry defaults

    • Logging:

      • Structured JSON.
      • No secrets, no PII, no full source in logs.
      • Local file + log rotation is the default.
    • Telemetry:

      • OpenTelemetry-compatible exporters.

      • Pluggable sinks:

        • In-memory (dev).
        • Postgres.
        • External APM if configured.
    • For on-prem:

      • All telemetry must be optional.
      • The system must be fully operational with only local logs.
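A structured log line meeting those defaults might look like this (field names are a sketch; note there are no secrets or PII, and the manifest digest ties the line back to evidence):

```json
{
  "ts": "2025-12-09T13:37:00Z",
  "level": "info",
  "service": "StellaOps.Scanner.WebService",
  "event": "scan.completed",
  "artifactDigest": "sha256:...",
  "scanManifestDigest": "sha256:...",
  "traceId": "..."
}
```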

2.8 AI Codex / Zastava Companion

Core rule: AI is a consumer of the evidence graph, never a source of truth.

  1. Separation of roles

    • Zastava:

      • Reads evidence, decisions, and context.
      • Produces explanations and remediation plans.
    • It must not:

      • Invent vulnerabilities or states not present in evidence.
      • Change decisions or policies.
  2. Interfaces

    • Input:

      • IDs/digests of:

        • Attestations.
        • Lattice decisions.
        • Smart-Diff results.
    • Output:

      • Natural language summary.
      • Ordered remediation steps with references back to evidence IDs.
  3. Determinism around AI

    • Core security behaviour must not depend on AI responses.
    • Pipelines should never “pass/fail based on AI text”.
    • AI is advice only; enforcement is always policy + lattice + evidence.
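The interface contract above could be serialized along these lines (a sketch: evidence enters strictly by digest, and the output carries references back to evidence IDs, never new facts):

```json
{
  "input": {
    "attestations": [ "sha256:..." ],
    "latticeDecisions": [ "sha256:..." ],
    "smartDiffResults": [ "sha256:..." ]
  },
  "output": {
    "summary": "Plain-language explanation of the current blockers.",
    "remediationSteps": [
      { "order": 1, "action": "Upgrade the affected package", "evidence": [ "sha256:..." ] }
    ]
  }
}
```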

3. Cross-cutting rules for all Stella Ops developers

When you implement anything in Stella Ops, verify you comply with these:

  1. Determinism first

    • If re-running with the same:

      • artifact digests,
      • feeds,
      • rules,
      • lattices,
      • config,
    • then results must be identical (except for timestamps and cryptographic randomness inside signatures).

  2. Offline-first

    • No hard dependency on:

      • External CAs.
      • External DBs of vulnerabilities.
      • External ledgers.
    • All remote interactions must be:

      • Optional.
      • Pluggable.
      • Replaceable with local mirrors.
  3. Evidence over UI

    • Never implement logic “only in the UI”.
    • The API and attestations must fully reflect what the UI shows.
  4. Contracts over convenience

    • Schemas are contracts:

      • Version them.
      • Do not change the meaning of existing fields.
      • Add fields with defaults.
    • Deprecate explicitly, never silently break consumers.

  5. Golden fixtures

    • For any new predicate or decision:

      • Create golden fixtures (input → output → attestations).
      • Use them in regression tests.
    • This is crucial for “deterministic replayable scans”.

  6. Respect service boundaries

    • Scanner: facts + evaluation (lattices, reachability).
    • Sbomer: SBOM generation and normalization.
    • Concelier / Vexer: policy application, filtering, presentation; they preserve and prune source data, never recompute it.
    • Authority: signing keys, crypto profiles, trust roots.
    • Feedser: feeds ingestion; must never “decide”, only normalize.

If you want, next step I can do a very concrete checklist for adding a new scanner feature (e.g., “Smart-Diff for Python wheels”) with exact project structure (src/Scanner/StellaOps.Scanner.*), tests, and the minimal set of predicates and attestations that must be produced.