Here’s a compact, first‑time‑friendly plan to add two high‑leverage features to your platform: an **image Smart‑Diff** (with signed, policy‑aware deltas) and **better binaries** (symbol/byte‑level SCA + provenance + SARIF).

# Smart‑Diff (images & containers) — what/why/how

**What it is:** Compute *deltas* between two images (or layers) and enrich them with context: which files, which packages, which configs flip behavior, and whether the change is actually *reachable at runtime*. Then sign the report so downstream tools can trust it.

**Why it matters:** Teams drown in “changed but harmless” noise. A diff that knows “is this reachable, config‑activated, and under the running user?” prioritizes real risk and shortens MTTR.

**How to ship it (Stella Ops‑style, on‑prem, .NET 10):**

* **Scope of diff**
  * Layer → file → package → symbol (map file changes to package + version; map package to symbols/exports when available).
  * Config/env lens: overlay `ENTRYPOINT`/`CMD`, env, feature flags, mounted secrets, user/UID.
* **Reachability gates (3‑bit severity gate)**
  * `Reachable?` (call graph / entrypoints / process tree)
  * `Config‑activated?` (feature flags, env, args)
  * `Running user?` (match file/dir ACLs, capabilities, container `User:`)
  * Compute a severity class from these bits (e.g., 0–7) and attach a short rationale.
* **Attestation**
  * Emit a **DSSE‑wrapped in‑toto attestation** with the Smart‑Diff as predicate.
  * Include: artifact digests (old/new), diff summary, gate bits, rule versions, and scanner build info.
  * Sign offline; verify with cosign/Rekor when online is available.
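The bits-to-class mapping can be sketched as follows (Python for brevity; the production version would live in the C# scanner). The bit weights are an assumption: reachable = 4, config-activated = 2, running-user = 1, which is consistent with the `class: 6` sample predicate (reachable and config-activated, but not under the running user).

```python
# Hypothetical encoding of the 3-bit reachability gate as a severity class 0..7.
# Bit weights are an assumption: reachable=4, config-activated=2, running-user=1.
from typing import Optional

def gate_class(reachable: Optional[bool],
               config_activated: Optional[bool],
               running_user: Optional[bool]) -> int:
    """Return 0..7 when all bits are known, or -1 to flag an incomplete gate."""
    bits = (reachable, config_activated, running_user)
    if any(b is None for b in bits):
        return -1  # unknowns stay unknown; route to the dedicated unknowns-ranking path
    return ((4 if reachable else 0)
            | (2 if config_activated else 0)
            | (1 if running_user else 0))
```

Any unknown bit short-circuits to −1 (“incomplete”) instead of being coerced to false, in line with the “unknowns must stay unknown” rule later in this document.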
* **Predicate (minimal JSON shape)**

  ```json
  {
    "predicateType": "stellaops.dev/predicates/smart-diff@v1",
    "predicate": {
      "baseImage": {"name": "...", "digest": "sha256:..."},
      "targetImage": {"name": "...", "digest": "sha256:..."},
      "diff": {
        "filesAdded": [...],
        "filesRemoved": [...],
        "filesChanged": [{"path": "...", "hunks": [...]}],
        "packagesChanged": [{"name": "openssl", "from": "1.1.1u", "to": "3.0.14"}]
      },
      "context": {
        "entrypoint": ["/app/start"],
        "env": {"FEATURE_X": "true"},
        "user": {"uid": 1001, "caps": ["NET_BIND_SERVICE"]}
      },
      "reachabilityGate": {"reachable": true, "configActivated": true, "runningUser": false, "class": 6},
      "scanner": {"name": "StellaOps.Scanner", "version": "...", "ruleset": "reachability-2025.12"}
    }
  }
  ```

* **Pipelines**
  * Scanner computes diff → predicate JSON → DSSE envelope → write `.intoto.jsonl`.
  * Optionally export a lightweight **human report** (markdown) and a **machine delta** (protobuf/JSON).

# Better binaries — symbol/byte SCA + provenance + SARIF

**What it is:** Go beyond package SBOMs. Identify *symbols, sections, and compiler fingerprints* in each produced binary; capture provenance (compiler, flags, LTO, link order, hashes); then emit:

1. an **in‑toto statement per binary**, and
2. a **SARIF 2.1.0** report for GitHub code scanning.

**Why it matters:** A lot of risk hides below package level (vendored code, static libs, LTO). Symbol/byte SCA catches it; provenance proves how the binary was built.

**How to ship it:**

* **Extractors (modular analyzers)**
  * ELF/PE/Mach‑O parsers (sections, imports/exports, build‑ids, rpaths).
  * Symbol tables (public + demangled), string tables, compiler notes (`.comment`), PDB/DWARF when present.
  * Fingerprints: rolling hashes per section/function; Bloom filters for quick symbol presence checks.
* **Provenance capture**
  * Compiler: name/version, target triple, LTO (on/off/mode).
  * Flags: `-O`, `-fstack-protector`, `-D_FORTIFY_SOURCE`, PIE/RELRO, CET/CFGuard.
  * Linker: version, libs, order, dead‑strip/LTO decisions.
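The fingerprints bullet mentions Bloom filters for quick symbol presence checks; a minimal sketch (Python for brevity; `SymbolBloom` and its parameters are illustrative, not an existing API): one small filter per binary lets a vulnerable-symbol query skip most binaries without loading their full symbol tables.

```python
# Illustrative Bloom filter for symbol-presence pre-checks (not a real StellaOps API).
# False positives are possible (confirm against the full symbol table); false negatives are not.
import hashlib

class SymbolBloom:
    def __init__(self, size_bits: int = 1 << 16, hashes: int = 4):
        self.size = size_bits
        self.hashes = hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, symbol: str):
        # Derive k positions from salted SHA-256 digests of the symbol name.
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{symbol}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, symbol: str) -> None:
        for p in self._positions(symbol):
            self.bits[p // 8] |= 1 << (p % 8)

    def maybe_contains(self, symbol: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(symbol))
```

Build one filter per binary at analysis time; at query time, only binaries whose filter answers “maybe” pay for the expensive symbol-table scan.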
* **In‑toto statement (per binary)**

  ```json
  {
    "_type": "https://in-toto.io/Statement/v0.1",
    "predicateType": "https://slsa.dev/provenance/v0.2",
    "subject": [{"name": "bin/app", "digest": {"sha256": "..."}}],
    "predicate": {
      "builder": {"id": "stellaops://builder/ci"},
      "buildType": "stellaops.dev/build/native@v1",
      "metadata": {"buildInvocationID": "...", "buildStartedOn": "...", "buildFinishedOn": "..."},
      "materials": [{"uri": "git+ssh://...#", "digest": {"sha1": "..."}}],
      "buildConfig": {
        "compiler": {"name": "clang", "version": "18.1.3"},
        "flags": ["-O2", "-fstack-protector-strong", "-fPIE"],
        "lto": "thin",
        "linker": {"name": "lld", "version": "18.1.3"},
        "hardening": {"pie": true, "relro": "full", "fortify": true}
      }
    }
  }
  ```

* **SARIF 2.1.0 for GitHub code scanning**
  * One SARIF file per build (or per repo), tool name `StellaOps.BinarySCA`.
  * For each finding (e.g., vulnerable function signature or insecure linker flag), add: `ruleId`, CWE/Vuln ID, severity, location (binary + symbol), `helpUri`.
  * Upload via Actions/API so issues appear in *Security → Code scanning alerts*.
* **CI wiring (on‑prem friendly)**
  * Build → run binary analyzers → write `binary.intoto.jsonl` + `sca.sarif.json`.
  * Sign the in‑toto statement (DSSE). If air‑gapped, store in your internal evidence bucket; sync to a Rekor mirror later.
  * Optional: export a compact “binary SBOM” (function inventory + hashes).

# Minimal .NET 10 / CLI layout (suggested)

```
src/Scanner/StellaOps.Scanner.SmartDiff/
src/Scanner/StellaOps.Scanner.BinarySCA/
src/Predicates/StellaOps.Predicates/     # JSON schemas, versioned
src/Sign/StellaOps.Attestation/          # DSSE envelopes, cosign integration
src/Exports/StellaOps.Exports.Sarif/
src/Exports/StellaOps.Exports.InToto/
```

* **Contracts:** freeze JSON Schemas under `StellaOps.Predicates` and version them (e.g., `smart-diff@v1`, `binary-provenance@v1`).
* **Determinism:** lock analyzer rulesets + feeds with content hashes; record them in each predicate (`rulesetDigest`).
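To make the SARIF export concrete, here is a sketch of the minimal SARIF 2.1.0 shape for a single binary finding (Python for brevity; the real exporter would be C# code in `StellaOps.Exports.Sarif`). The rule ID reuses the stable-ID convention described later in this document (`STB001_NO_RELRO`); the message text is an example.

```python
# Sketch of a minimal SARIF 2.1.0 log for one binary finding (illustrative, not the real exporter).
# Rule ID follows the stable-ID convention (STB001_NO_RELRO); message text is an example.

def make_sarif(binary_path: str, symbol: str, rule_id: str, message: str) -> dict:
    """Build a one-result SARIF log locating a finding at binary_path / symbol."""
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": "StellaOps.BinarySCA",
                                "rules": [{"id": rule_id}]}},
            "results": [{
                "ruleId": rule_id,
                "level": "warning",
                "message": {"text": message},
                "locations": [{
                    "physicalLocation": {"artifactLocation": {"uri": binary_path}},
                    "logicalLocations": [{"name": symbol, "kind": "function"}],
                }],
            }],
        }],
    }

sarif = make_sarif("bin/app", "main", "STB001_NO_RELRO",
                   "Binary is linked without full RELRO.")
```

Serialized to JSON and uploaded via the code-scanning API, a log of this shape surfaces the finding as a GitHub alert; `logicalLocations` carries the symbol name, since binaries have no line numbers.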
# Quick wins this week

* Implement the **3‑bit reachability gate** and surface it in your UI filter.
* Emit **DSSE‑wrapped in‑toto** for Smart‑Diff first (binary provenance next).
* Add a **basic SARIF exporter** that maps binary findings → GitHub alerts (even if only a couple of rules to start: missing RELRO/PIE; insecure `__strcpy` occurrences).

If you want, I can draft:

* The JSON Schemas for `smart-diff@v1` and `binary-provenance@v1`.
* A tiny C# sample that wraps a predicate in DSSE and verifies with cosign.
* A SARIF emitter stub wired to your CI.

I will split this into two parts:

1. A compact list of Stella Ops advantages (how we win vs other scanners).
2. Concrete developer guidelines that make those advantages real in code.

---

## 1. Stella Ops advantages (what we are optimizing for)

When you build any feature, it should clearly reinforce at least one of these:

1. **Evidence-first, signed, replayable**
   Every non-trivial operation produces signed, DSSE-wrapped attestations and can be re-run later to obtain the same result byte-for-byte.
2. **Reachability-first triage (Smart-Diff + gates)**
   We never just say “this CVE exists”; we say: *it changed*, *it is or is not reachable*, *it is or is not activated by config*, and *which user actually executes it*.
3. **Binary-level SCA + provenance**
   We do not stop at packages. We inspect binaries (symbols, sections, toolchain fingerprints) and provide in-toto/SLSA provenance plus SARIF to development tools.
4. **Crypto-sovereign and offline-ready**
   All signing/verification can use local trust roots and local cryptographic profiles (FIPS / eIDAS / GOST / SM) with no hard dependency on public CAs or external clouds.
5. **Deterministic, replayable scans**
   A “scan” is a pure function of: artifact digests, feeds, rules, lattice policies, and configuration. Anything not captured there is a bug.
6.
   **Policy & lattice engine instead of ad-hoc rules**
   Risk and VEX decisions are the result of explicit lattice merge rules (“trust algebra”), not opaque if-else trees in the code.
7. **Proof-of-integrity graph**
   All artifacts (source → build → container → runtime) are connected in a cryptographic graph that can be traversed, audited, and exported.
8. **Quiet-by-design UX**
   The system is optimized to answer three questions fast:
   1. Can I ship this?
   2. If not, what blocks me?
   3. What is the minimal safe change?

Everything you build should clearly map to one or more of the above.

---

## 2. Developer guidelines by advantage

### 2.1 Evidence-first, signed, replayable

**Core rule:** Any non-trivial action must be traceable as a signed, replayable evidence record.

**Implementation guidelines**

1. **Uniform attestation model**
   * Define and use a shared library, e.g. `StellaOps.Predicates`, with:
     * Versioned JSON Schemas (e.g. `smart-diff@v1`, `binary-provenance@v1`, `reachability-summary@v1`).
     * Strongly-typed C# DTOs that match the schemas.
   * Every module (Scanner, Sbomer, Concelier, Excititor/Vexer, Authority, Scheduler, Feedser) must:
     * Emit **DSSE-wrapped in-toto statements**.
     * Use the same hashing strategy (e.g., SHA-256 over canonical JSON, no whitespace variance).
     * Include tool name, version, ruleset/feed digests, and configuration ID in each predicate.
2. **Link-not-merge**
   * Never rewrite or mutate third-party SBOM/VEX/attestations.
   * Instead:
     * Store original documents as immutable blobs addressed by hash.
     * Refer to them using digests and URIs (e.g. `sha256:…`) from your own predicates.
     * Emit **linking evidence**: “this SBOM (digest X) was used to compute decision Y”.
3. **Deterministic scan manifests**
   * Each scan must have a manifest object:

     ```json
     {
       "artifactDigest": "sha256:...",
       "scannerVersion": "1.2.3",
       "rulesetDigest": "sha256:...",
       "feedsDigests": {
         "nvd": "sha256:...",
         "vendorX": "sha256:..."
       },
       "latticePolicyDigest": "sha256:...",
       "configId": "prod-eu-1",
       "timestamp": "2025-12-09T13:37:00Z"
     }
     ```
   * Store it alongside results and include its digest in all predicates produced by that run.
4. **Signing & verification**
   * All attestation writing goes through a single abstraction, e.g.:

     ```csharp
     interface IAttestationSigner
     {
         Task SignAsync<TPredicate>(TPredicate predicate, CancellationToken ct);
     }
     ```

   * Implementations may use:
     * Sigstore (Fulcio + Rekor) when online.
     * Local keys (HSM, TPM, file key) when offline.
   * Never do ad-hoc crypto directly in features; always go through the shared crypto layer (see 2.4).

---

### 2.2 Reachability-first triage and Smart-Diff

**Core rule:** You must never treat “found a CVE” as sufficient. You must track change + reachability + config + execution context.

#### Smart-Diff

1. **Diff levels**
   * Implement layered diffs:
     * Image / layer → file → package → symbol.
   * Map:
     * File changes → owning package, version.
     * Package changes → known vulnerabilities and exports.
   * Attach config context: entrypoint, env vars, feature flags, user/UID.
2. **Signed Smart-Diff predicate**
   * Use the minimal shape (simplified):

     ```json
     {
       "predicateType": "stellaops.dev/predicates/smart-diff@v1",
       "predicate": {
         "baseImage": {...},
         "targetImage": {...},
         "diff": {...},
         "context": {...},
         "reachabilityGate": {...},
         "scanner": {...}
       }
     }
     ```

   * Always sign as DSSE and attach the scan manifest digest.

#### Reachability gate

Use the **3-bit gate** consistently:

* `reachable` (static/dynamic call graph says “yes”)
* `configActivated` (env/flags/args activate the code path)
* `runningUser` (the user/UID that can actually execute it)

Guidelines:

1. **Data model**

   ```csharp
   public sealed record ReachabilityGate(
       bool? Reachable,        // true / false / null for unknown
       bool? ConfigActivated,
       bool? RunningUser,
       int Class,              // 0..7 derived from the bits when all known
       string Rationale        // short explanation, human-readable
   );
   ```

2.
   **Unknowns must stay unknown**
   * Never silently treat `null` as `false` or `true`.
   * If any of the bits is `null`, compute `Class` only from known bits or set `Class = -1` to denote “incomplete”.
   * Feed all “unknown” cases into a dedicated “Unknowns ranking” path (separate heuristics and UX).
3. **Where reachability is computed**
   * Respect your standing rule: **lattice and reachability algorithms run in `Scanner.WebService`**, not in Concelier, Feedser, or Excititor/Vexer.
   * Other services only:
     * Persist / index results.
     * Prune / filter based on policy.
     * Present data — never recompute core reachability.
4. **Caching reachability**
   * Key caches by:
     * Artifact digest (image/layer/binary).
     * Ruleset/lattice digest.
     * Language/runtime version (for static analysis).
   * Pattern:
     * The first time a call path is requested, compute and cache it.
     * Subsequent accesses in the same scan use the in-memory cache.
     * For cross-scan reuse, store a compact summary keyed by (artifactDigest, rulesetDigest) in Scanner’s persistence node.
   * Never cache across incompatible rule or feed versions.

---

### 2.3 Binary-level SCA and provenance

**Core rule:** Treat each built binary as a first-class subject with its own SBOM, SCA, and provenance.

1. **Pluggable analyzers**
   * Create analyzers per binary format/language:
     * ELF, PE, Mach-O.
     * Language/toolchain detectors (GCC/Clang/MSVC/.NET/Go/Rust).
   * Common interface:

     ```csharp
     interface IBinaryAnalyzer
     {
         bool CanHandle(BinaryContext ctx);
         Task<BinaryAnalysisResult> AnalyzeAsync(BinaryContext ctx, CancellationToken ct);
     }
     ```

2. **Binary SBOM + SCA**
   * Output per binary:
     * Function/symbol inventory (names, addresses).
     * Linked static libraries.
     * Detected third-party components (via fingerprints).
   * Map to known vulnerabilities via:
     * Symbol signatures.
     * Function-level or section-level hashes.
   * Emit:
     * CycloneDX/SPDX component entries for binaries.
     * A separate predicate `binary-sca@v1`.
3.
   **Provenance (in-toto/SLSA)**
   * Emit an in-toto statement per binary:
     * Subject = `bin/app` (digest).
     * Predicate = build metadata (compiler, flags, LTO, linker, hardening).
   * Always include:
     * Source material (git repo + commit).
     * Build environment (container image digest or runner OS).
     * Exact build command / script identifier.
4. **SARIF for GitHub / IDEs**
   * Provide an exporter:
     * Input: `BinaryAnalysisResult`.
     * Output: SARIF 2.1.0 with:
       * Findings: missing RELRO/PIE, unsafe functions, known vulns, weak flags.
       * Locations: binary path + symbol/function name.
   * Keep rule IDs stable and documented (e.g. `STB001_NO_RELRO`, `STB010_VULN_SYMBOL`).

---

### 2.4 Crypto-sovereign, offline-ready

**Core rule:** No feature may rely on a single global PKI or always-online trust path.

1. **Crypto abstraction**
   * Introduce a narrow interface:

     ```csharp
     interface ICryptoProfile
     {
         string Name { get; }
         IAttestationSigner AttestationSigner { get; }
         IVerifier DefaultVerifier { get; }
     }
     ```

   * Provide implementations:
     * `FipsCryptoProfile`
     * `EUeIDASCryptoProfile`
     * `GostCryptoProfile`
     * `SmCryptoProfile`
   * Selection via configuration, not code changes.
2. **Offline bundles**
   * Everything needed to verify a decision must be downloadable:
     * Scanner binaries.
     * Rules/feeds snapshot.
     * CA chains / trust roots.
     * Public keys for signers.
   * Implement a “bundle manifest” that ties these together and is itself signed.
3. **Rekor / ledger independence**
   * If Rekor is available:
     * Log attestations.
   * If not:
     * Log to Stella Ops Proof-Market Ledger or internal append-only store.
   * Features must not break when Rekor is absent.

---

### 2.5 Policy & lattice engine

**Core rule:** Risk decisions are lattice evaluations over facts; do not hide policy logic inside business code.

1. **Facts vs policy**
   * Facts are:
     * CVE presence, severity, exploit data.
     * Reachability gates.
     * Runtime events (was this function ever executed?).
     * Vendor VEX statements.
   * Policy is:
     * Lattice definitions and merge rules.
     * Trust preferences (vendor vs runtime vs scanner).
   * In code:
     * Facts are input DTOs stored in the evidence graph.
     * Policy is JSON/YAML configuration with versioned schemas.
2. **Single evaluation engine in Scanner.WebService**
   * Lattice evaluation must run only in `StellaOps.Scanner.WebService` (your standing rule).
   * Other services:
     * Request decisions from Scanner.
     * Pass only references (IDs/digests) to facts, not raw policy.
3. **Deterministic evaluation**
   * Lattice evaluation must:
     * Use only input facts + policy.
     * Never depend on current time, randomness, or environment state.
   * Every decision object must include:
     * `policyDigest`
     * `inputFactsDigests[]`
     * `decisionReason` (short machine- and human-readable explanation)

---

### 2.6 Proof-of-integrity graph

**Core rule:** Everything is a node; all relationships are typed edges; nothing disappears.

1. **Graph model**
   * Nodes: source repo, commit, build job, SBOM, attestation, image, container runtime, host.
   * Edges: “built_from”, “scanned_with”, “deployed_as”, “executes_on”, “derived_from”.
   * Store in a graph store or a graph-like relational schema:
     * IDs are content digests where possible.
2. **Append-only**
   * Never delete or overwrite nodes; mark them as superseded if needed.
   * Evidence mutations (e.g. a new scan) are new nodes/edges.
3. **APIs**
   * Provide traversal APIs:
     * “Given this CVE, which production pods are affected?”
     * “Given this pod, show full ancestry up to the source commit.”
   * All UI queries must work via these APIs, not ad-hoc joins.

---

### 2.7 Quiet-by-design UX and observability

**Core rule:** Default to minimal, actionable noise; logs and telemetry must be compliant and air-gap friendly.

1. **Triage model**
   * Classify everything into:
     * “Blockers” (fail the pipeline).
     * “Needs review” (warn but pass).
     * “Noise” (hidden unless requested).
   * The classification uses:
     * Lattice decisions.
     * Reachability gates.
     * Environment criticality (prod vs dev).
2.
   **Evidence-centric UX**
   * Each UI card or API answer must:
     * Reference the underlying attestations by ID/digest.
     * Provide a one-click path to “show raw evidence”.
3. **Logging & telemetry defaults**
   * Logging:
     * Structured JSON.
     * No secrets, no PII, no full source in logs.
     * Local file + log rotation is the default.
   * Telemetry:
     * OpenTelemetry-compatible exporters.
     * Pluggable sinks:
       * In-memory (dev).
       * Postgres.
       * External APM if configured.
   * For on-prem:
     * All telemetry must be optional.
     * The system must be fully operational with only local logs.

---

### 2.8 AI Codex / Zastava Companion

**Core rule:** AI is a consumer of the evidence graph, never a source of truth.

1. **Separation of roles**
   * Zastava:
     * Reads evidence, decisions, and context.
     * Produces explanations and remediation plans.
   * It must not:
     * Invent vulnerabilities or states not present in evidence.
     * Change decisions or policies.
2. **Interfaces**
   * Input: IDs/digests of:
     * Attestations.
     * Lattice decisions.
     * Smart-Diff results.
   * Output:
     * Natural-language summary.
     * Ordered remediation steps with references back to evidence IDs.
3. **Determinism around AI**
   * Core security behaviour must not depend on AI responses.
   * Pipelines must never pass/fail based on AI text.
   * AI is advice only; enforcement is always policy + lattice + evidence.

---

## 3. Cross-cutting rules for all Stella Ops developers

When you implement anything in Stella Ops, verify you comply with these:

1. **Determinism first**
   * If you re-run with the same:
     * artifact digests,
     * feeds,
     * rules,
     * lattices,
     * config,
   * then results must be identical (except for timestamps and cryptographic randomness inside signatures).
2. **Offline-first**
   * No hard dependency on:
     * External CAs.
     * External vulnerability databases.
     * External ledgers.
   * All remote interactions must be:
     * Optional.
     * Pluggable.
     * Replaceable with local mirrors.
3. **Evidence over UI**
   * Never implement logic “only in the UI”.
   * The API and attestations must fully reflect what the UI shows.
4. **Contracts over convenience**
   * Schemas are contracts:
     * Version them.
     * Do not change the meaning of existing fields.
     * Add new fields with defaults.
     * Deprecate explicitly; never silently break consumers.
5. **Golden fixtures**
   * For any new predicate or decision:
     * Create golden fixtures (input → output → attestations).
     * Use them in regression tests.
   * This is crucial for deterministic, replayable scans.
6. **Respect service boundaries**
   * Scanner: facts + evaluation (lattices, reachability).
   * Sbomer: SBOM generation and normalization.
   * Concelier / Vexer: policy application, filtering, presentation; they persist, prune, and present, never decide.
   * Authority: signing keys, crypto profiles, trust roots.
   * Feedser: feeds ingestion; it must never “decide”, only normalize.

If you want, as a next step I can write a very concrete checklist for adding a **new scanner feature** (e.g., “Smart-Diff for Python wheels”) with the exact project structure (`src/Scanner/StellaOps.Scanner.*`), tests, and the minimal set of predicates and attestations that must be produced.
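As a concrete footnote to the canonical-hashing rule in 2.1 and the determinism rules above, here is a sketch of content-addressing a predicate over canonical JSON (Python for brevity; the real helper would be C# in `StellaOps.Predicates`). “Sorted keys, no insignificant whitespace” is an assumed canonicalization standing in for whatever the frozen schemas actually mandate (e.g. RFC 8785 JCS).

```python
# Sketch: content-address a predicate via SHA-256 over canonical JSON.
# "Canonical" here is an assumption (sorted keys, compact separators, UTF-8);
# the frozen schemas would pin the exact canonicalization, e.g. RFC 8785 JCS.
import hashlib
import json

def predicate_digest(predicate: dict) -> str:
    canonical = json.dumps(predicate, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Key order and whitespace no longer affect the digest, which is what makes
# replayed runs byte-for-byte comparable and golden fixtures stable:
a = predicate_digest({"b": 1, "a": {"y": 2, "x": 3}})
b = predicate_digest({"a": {"x": 3, "y": 2}, "b": 1})
assert a == b
```

A golden-fixture regression test then reduces to comparing stored digests against freshly computed ones for the same inputs.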