Here's a compact, from-scratch playbook for running **attestation, verification, and SBOM ingestion fully offline**, including pre-seeded keyrings, an offline Rekor-style log, and deterministic evidence reconciliation inside sealed networks.

---

# 1) Core concepts (quick)

* **SBOM**: a machine-readable inventory (CycloneDX/SPDX) of what's in an artifact.
* **Attestation**: signed metadata (e.g., in-toto/SLSA provenance, VEX) bound to an artifact's digest.
* **Verification**: cryptographically checking the artifact + attestations against trusted keys/policies.
* **Transparency log (Rekor-style)**: tamper-evident ledger of entries (hashes + proofs). Offline we use a **private mirror** (no internet).
* **Deterministic reconciliation**: repeatable joining of SBOM + attestation + policy into a stable "evidence graph" with identical results when inputs match.

---

# 2) Golden inputs you must pre-seed into the air-gap

* **Root of trust**:
  * Vendor/org public keys (X.509 or SSH/age/PGP), **AND** their certificate chains if using Fulcio-like PKI.
  * A pinned **CT/transparency log root** (your private one) + inclusion proof parameters.
* **Policy bundle**:
  * Verification policies (Cosign/in-toto rules, VEX merge rules, allow/deny lists).
  * Hash-pinned toolchain manifests (exact versions + SHA256 of cosign, oras, jq, your scanners, etc.).
* **Evidence bundle**:
  * SBOMs (CycloneDX 1.5/1.6 and/or SPDX 3.0.x).
  * DSSE-wrapped attestations (provenance, build, SLSA, VEX).
  * Optional: vendor CVE feeds/VEX as static snapshots.
* **Offline log snapshot**:
  * A **signed checkpoint** (tree head) and **entry pack** (all log entries you rely on), plus Merkle proofs.

Ship all of the above on signed, write-once media (WORM/BD-R or signed tar with detached sigs).

---

# 3) Minimal offline directory layout

```
/evidence/
  keys/
    roots/               # root/intermediate certs, PGP pubkeys
    identities/          # per-vendor public keys
    tlog-root/           # hashed/pinned tlog root(s)
  policy/
    verify-policy.yaml   # cosign/in-toto verification policies
    lattice-rules.yaml   # your VEX merge / trust lattice rules
  sboms/                 # *.cdx.json, *.spdx.json
  attestations/          # *.intoto.jsonl.dsig (DSSE)
  tlog/
    checkpoint.sig       # signed tree head
    entries/             # *.jsonl (Merkle leaves) + proofs
  tools/
    cosign-<version>        # binary + pinned sha256
    oras-<version>          # binary + pinned sha256
    jq-<version>            # binary + pinned sha256
    your-scanner-<version>  # binary + pinned sha256
```

---

# 4) Pre-seeded keyrings (no online CA lookups)

**Cosign** (example with file-based roots and identity pins):

```bash
# Verify a DSSE attestation with local roots & identities only.
# Use EITHER --key (file-based trust) OR the --certificate-* identity pins
# (Fulcio-like PKI); they are alternative trust models, shown together for reference.
# --rekor-url targets your private tlog; omit it entirely if you run no
# server inside the air-gap (see §6 for file-based proofs).
COSIGN_EXPERIMENTAL=1 cosign verify-attestation \
  --key ./evidence/keys/identities/vendor_A.pub \
  --certificate-identity "https://ci.vendorA/build" \
  --certificate-oidc-issuer "https://fulcio.offline" \
  --insecure-ignore-tlog \
  --rekor-url "http://127.0.0.1:8080" \
  --policy ./evidence/policy/verify-policy.yaml \
  "$IMAGE_REF"   # artifact reference, pinned by digest
```

If you do **not** run any server inside the air-gap, omit `--rekor-url` and use **local tlog proofs** (see §6).

**in-toto** (offline layout):

```bash
in-toto-verify \
  --layout ./attestations/layout.root.json \
  --layout-keys ./keys/identities/vendor_A.pub \
  --link-dir ./attestations/links   # link metadata shipped in the bundle
```

---

# 5) SBOM ingestion (deterministic)

1. Normalize SBOMs to a canonical form:

   ```bash
   jq -S . sboms/app.cdx.json  > sboms/_canon/app.cdx.json
   jq -S . sboms/app.spdx.json > sboms/_canon/app.spdx.json
   ```

2. Validate schemas (use vendored validators).
3. Hash-pin the canonical files and record them in a **manifest.lock**:

   ```bash
   sha256sum sboms/_canon/*.json > manifest.lock
   ```

4. Import into your DB with **idempotent keys = (artifactDigest, sbomHash)**. Reject if the same key exists with different bytes.
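To make step 4 concrete, here is a minimal C# sketch of an idempotent import keyed by `(artifactDigest, sbomHash)`, assuming an in-memory store; the `SbomStore` type and its members are illustrative, not an existing API:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;

// Illustrative in-memory store; a real system would back this with a DB
// that enforces the same uniqueness constraint.
public sealed class SbomStore
{
    private readonly Dictionary<(string ArtifactDigest, string SbomHash), byte[]> _rows = new();

    public void Import(string artifactDigest, string sbomHash, byte[] canonicalBytes)
    {
        // Fail closed: the declared hash must match the canonical bytes.
        var computed = Convert.ToHexString(SHA256.HashData(canonicalBytes)).ToLowerInvariant();
        if (!computed.Equals(sbomHash, StringComparison.OrdinalIgnoreCase))
            throw new InvalidOperationException($"Declared hash {sbomHash} != computed {computed}.");

        var key = (artifactDigest, sbomHash);
        if (_rows.TryGetValue(key, out var existing))
        {
            // Same key with different bytes should be impossible once hashes are
            // checked; the comparison is kept as a belt-and-braces guard.
            if (!existing.SequenceEqual(canonicalBytes))
                throw new InvalidOperationException("Existing key with different bytes; rejecting.");
            return; // identical re-import is a no-op
        }
        _rows[key] = canonicalBytes;
    }
}
```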
---

# 6) Offline Rekor mirror (no internet)

Two patterns:

**A. Embedded file-ledger (simplest)**

* Keep `tlog/checkpoint.sig` (signed tree head) and `tlog/entries/*.jsonl` (leaves + inclusion proofs).
* During verify:
  * Recompute the Merkle root from the entries.
  * Check it matches `checkpoint.sig` (after verifying its signature with your **tlog root key**).
  * For each attestation, verify its **UUID / digest** appears in the entry pack and the **inclusion proof** resolves.

**B. Private Rekor instance (inside air-gap)**

* Run Rekor pointing to your local storage.
* Load entries via an **import job** from the entry pack.
* Pin the Rekor **public key** in `keys/tlog-root/`.
* Verification uses `--rekor-url http://rekor.local:3000` with no outbound traffic.

> In both cases, verification must **not** fall back to the public internet. Fail closed if proofs or keys are missing.

---

# 7) Deterministic evidence reconciliation (the "merge without magic" loop)

Goal: produce the same "evidence graph" every time given the same inputs.

Algorithm sketch:

1. **Index** artifacts by immutable digest.
2. For each artifact digest:
   * Collect SBOM nodes (components) from canonical SBOM files.
   * Collect attestations: provenance, VEX, SLSA, signatures (DSSE).
   * Validate each attestation **before** merge:
     * Sig verifies with pre-seeded keys.
     * (If used) tlog inclusion proof verifies against the offline checkpoint.
3. **Normalize** all docs (stable key order, drop timestamps except explicitly allowed fields, lower-case URIs).
4. **Apply lattice rules** (your `lattice-rules.yaml`):
   * Example: `VEX: under_review < affected < fixed < not_affected (statement-trust)` with **vendor > maintainer > 3rd-party** precedence.
   * Conflicts are resolved via a deterministic priority list (source, signature strength, issuance time rounded to minutes, then lexical tiebreak).
5. Emit:
   * `evidence-graph.json` (stable node/edge order).
   * `evidence-graph.sha256` and a DSSE signature from your **Authority** key.

This gives you **byte-for-byte identical** outputs across runs.

---

# 8) Offline provenance for the tools themselves

* Treat every tool binary in `/evidence/tools/` like a supply-chain artifact:
  * Keep an **SBOM for the tool**, its **checksum**, and a **signature** from your build or a trusted vendor.
  * The verification policy must reject running a tool without a matching `(checksum, signature)` entry.

---

# 9) Example verification policy (cosign-style, offline)

```yaml
# evidence/policy/verify-policy.yaml
keys:
  - ./evidence/keys/identities/vendor_A.pub
  - ./evidence/keys/identities/your_authority.pub

tlog:
  mode: "offline"            # never reach out
  checkpoint: "./evidence/tlog/checkpoint.sig"
  entry_pack: "./evidence/tlog/entries"

attestations:
  required:
    - type: slsa-provenance
    - type: cyclonedx-sbom
  optional:
    - type: vex

constraints:
  subjects:
    alg: "sha256"            # only sha256 digests accepted
  certs:
    allowed_issuers:
      - "https://fulcio.offline"
    allow_expired_if_timepinned: true
```

---

# 10) Operational flow inside the sealed network

1. **Import bundle** (mount WORM media read-only).
2. **Verify tools** (hash + signature) before execution.
3. **Verify tlog checkpoint**, then **verify each inclusion proof**.
4. **Verify attestations** (keyring + policy).
5. **Ingest SBOMs** (canonicalize + hash).
6. **Reconcile** (apply lattice rules → evidence graph).
7. **Record your run**:
   * Write `run.manifest` with hashes of: inputs, policies, tools, outputs.
   * DSSE-sign `run.manifest` with the Authority key.
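The inclusion-proof check in pattern A (§6) and step 3 above is plain Merkle math, so it can run with no server at all. Below is a compact C# sketch following the RFC 6962/9162 hashing convention (`SHA-256(0x00 || entry)` for leaves, `0x01` prefix for interior nodes); the class name and signature are illustrative:

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

public static class MerkleProof
{
    // RFC 6962 interior node: SHA-256(0x01 || left || right).
    static byte[] HashChildren(byte[] left, byte[] right)
    {
        var buf = new byte[1 + left.Length + right.Length];
        buf[0] = 0x01;
        left.CopyTo(buf, 1);
        right.CopyTo(buf, 1 + left.Length);
        return SHA256.HashData(buf);
    }

    // leafHash is SHA-256(0x00 || entry); proof is the audit path, leaf to root.
    public static bool VerifyInclusion(
        long leafIndex, long treeSize, byte[] leafHash, byte[][] proof, byte[] expectedRoot)
    {
        if (leafIndex < 0 || leafIndex >= treeSize) return false;
        long fn = leafIndex, sn = treeSize - 1;
        var r = leafHash;
        foreach (var p in proof)
        {
            if (sn == 0) return false;
            if ((fn & 1) == 1 || fn == sn)
            {
                r = HashChildren(p, r); // sibling is on the left
                if ((fn & 1) == 0)
                    while ((fn & 1) == 0 && fn != 0) { fn >>= 1; sn >>= 1; }
            }
            else
            {
                r = HashChildren(r, p); // sibling is on the right
            }
            fn >>= 1; sn >>= 1;
        }
        // Proof must consume the whole path and land on the pinned root.
        return sn == 0 && r.SequenceEqual(expectedRoot);
    }
}
```

Verifying `checkpoint.sig` itself is a separate signature check against the pinned tlog root key; only after that does the recomputed root above mean anything.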
---

# 11) Disaster-ready "seed and refresh" model

* **Seed**: quarterly (or release-based) export from the connected world → signed bundle.
* **Delta refreshes**: smaller entry packs with only new SBOMs/attestations + an updated checkpoint.
* Always keep the **N previous checkpoints** to allow replay and audits.

---

# 12) Quick hardening checklist

* Fail closed on: unknown keys, missing proofs, schema drift, clock skew beyond tolerance.
* No online fallbacks: build guardrails such as a `NO_NETWORK=1` environment variable into your verification binaries.
* Pin all versions and capture `--version` output into `run.manifest`.
* Use reproducible container images (digest-locked) even for your internal tools.

---

If you want, I can turn this into:

* a **ready-to-run folder template** (with sample policies + scripts),
* a **.NET 10** helper library for DSSE + offline tlog proof checks,
* or a **Stella Ops module sketch** (Authority, Sbomer, Vexer, Scanner, Feedser) wired exactly to this flow.

I'll split this into two parts:

1. Stella Ops advantages (deepened, structured as "moats").
2. Concrete developer guidelines you can drop into a `DEV_GUIDELINES.md` for all Stella services.

---

## 1. Stella Ops advantages – expanded

### 1.1 Evidence-first, not "CVE list-first"

**Problem in the market**

Most tools:

* Dump long CVE lists from a single scanner + single feed.
* Have weak cross-scanner consistency.
* Treat SBOM, VEX, and runtime evidence as separate, loosely coupled features.

**Stella advantage**

Stella's core product is an **evidence graph**, not a report:

* All inputs (SBOMs, scanner findings, VEX, runtime probes, policies) are ingested as **immutable evidence nodes**, with:
  * Cryptographic identity (hash / DSSE envelope / tlog proof).
  * Clear provenance (source, time, keys, feeds).
* Risk signals (what is exploitable/important) are derived **after** evidence is stored, via lattice logic in `Scanner.WebService`, not during ingestion.
* UI, API, and CI output are always **explanations of the evidence graph** ("this CVE is suppressed by this signed VEX statement, proven by these keys and these rules").

This gives you:

* **Quiet-by-design UX**: the "noise vs. signal" ratio is controlled by lattice logic and reachability, not vendor marketing severity.
* **Traceable decisions**: every "allow/deny" decision can be traced to concrete evidence and rules.

Developer consequence: every Stella module must treat its job as **producing or transforming evidence**, not "telling the user what to do."

---

### 1.2 Deterministic, replayable scans

**Problem**

* Existing tools are hard to replay: feeds change, scanners change, rules change.
* For audits/compliance you cannot easily re-run "the same scan" from 9 months ago and get the same answer.

**Stella advantage**

Each scan in Stella is defined by a **deterministic manifest** (see the sketch below):

* Precise hashes and versions of:
  * Scanner binaries / containers.
  * SBOM parsers, VEX parsers.
  * Lattice rules, policies, allow/deny lists.
  * Feed snapshots (CVE, CPE/CPE-2.3, OS vendor advisories, distro data).
* Exact artifact digests (image, files, dependencies).
* Exact tlog checkpoints used for attestation verification.
* Config parameters (flags, perf knobs), recorded verbatim.

From this:

* You can recreate a *replay bundle* and re-run the scan offline with **byte-for-byte identical outcomes**, given the same inputs.
* Auditors/clients can verify that a historical decision was correct given the knowledge at that time.
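As a sketch of what such a manifest might capture, here is an illustrative C# record; the field names are assumptions, not a fixed Stella schema:

```csharp
using System.Collections.Generic;

// Illustrative shape of a deterministic scan manifest: every input that can
// influence the result is pinned by digest, so the scan can be replayed offline.
public sealed record ScanManifest(
    string ArtifactDigest,                              // e.g. "sha256:..."
    IReadOnlyDictionary<string, string> ToolDigests,    // tool name -> sha256 of binary/image
    IReadOnlyDictionary<string, string> FeedSnapshots,  // feed id -> snapshot hash
    string PolicyBundleHash,                            // lattice rules, allow/deny lists
    string TlogCheckpointHash,                          // offline checkpoint used for proofs
    IReadOnlyDictionary<string, string> Parameters,     // flags/knobs, recorded verbatim
    long? RandomSeed);                                  // only if any randomized step exists
```

Replaying a scan then means resolving every digest in this record from the offline bundle and comparing the resulting evidence graph byte-for-byte.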
Developer consequence: any new feature that affects risk decisions must:

* Persist versioned configuration and inputs in a **scan manifest**, and
* Be able to reconstruct results from that manifest without network calls.

---

### 1.3 Crypto-sovereign, offline-ready by design

**Problem**

* Most "Sigstore-enabled" tooling assumes access to public Fulcio/Rekor over the internet.
* Many orgs (banks, defense, state operators) cannot rely on foreign CAs or public transparency logs.
* Regional crypto standards (GOST, SM2/3/4, eIDAS, FIPS) are rarely supported properly.

**Stella advantage**

* **Offline trust anchors**: Stella runs with a fully pre-seeded root of trust:
  * Local CA chains (Fulcio-like), private Rekor mirror or file-based Merkle log.
  * Vendor/org keys and cert chains for SBOM, VEX, and provenance.
* **Crypto abstraction layer**:
  * Pluggable algorithms: NIST curves, Ed25519, GOST, SM2/3/4, PQC (Dilithium/Falcon) as optional profiles.
  * Policy-driven: a per-tenant crypto policy defines which signatures are acceptable in which contexts.
* **No online fallback**:
  * Verification will never "phone home" to public CAs/logs.
  * Missing keys/proofs → deterministic, explainable failures.

Developer consequence: every crypto operation must:

* Go through a **central crypto and trust-policy abstraction**, not directly through platform libraries.
* Support an **offline-only execution mode** that fails closed when external services are not available.

---

### 1.4 Rich SBOM/VEX semantics and "link-not-merge"

**Problem**

* Many tools turn SBOMs into their own proprietary schema early, losing fidelity.
* VEX data is often flattened into flags ("affected/not affected") without preserving original statements and signatures.

**Stella advantage**

* **Native support** for:
  * CycloneDX 1.5/1.6 and SPDX 3.x as first-class citizens.
  * DSSE-wrapped attestations (provenance, VEX, custom).
* **Link-not-merge model**:
  * Original SBOM/VEX files are stored **immutable** (canonical JSON).
  * Stella maintains **links** between: Artifacts ↔ Components ↔ Vulnerabilities ↔ VEX statements ↔ Attestations.
  * Derived views are computed on top of links, not by mutating original data.
* **Trust-aware VEX lattice**:
  * Multiple VEX statements from different parties can conflict.
  * A lattice engine defines precedence and resolution: vendor vs. maintainer vs. third-party; affected/under-investigation/not-affected/fixed, etc.

Developer consequence: no module is ever allowed to "rewrite" SBOM/VEX content. Modules may:

* Store it,
* Canonicalize it,
* Link it,
* Derive views on top of it,

but must always keep the original bytes addressable and hash-pinned.

---

### 1.5 Lattice-based trust algebra (Stella "Trust Algebra Studio")

**Problem**

* Existing tools treat suppressions, exceptions, and VEX as ad-hoc rule sets, hard to reason about and even harder to audit.
* There is no clear, composable way to combine multiple trust sources.

**Stella advantage**

* Use of **lattice theory** for trust:
  * Risk states (e.g., exploitable, mitigated, irrelevant, unknown) are elements of a lattice.
  * VEX statements, policies, and runtime evidence act as **morphisms** over that lattice.
  * The final state is a deterministic join/meet over all evidence (see the sketch below).
* Vendor- and customer-configurable:
  * Visual and declarative editing in "Trust Algebra Studio."
  * Exported as machine-readable manifests used by `Scanner.WebService`.
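To make the join idea tangible, here is a toy C# sketch over the VEX-status chain used as an example in part 1 (`under_review < affected < fixed < not_affected`) with vendor > maintainer > third-party precedence. The enums, record, and resolution order are illustrative; the real, policy-driven engine lives in `Scanner.WebService`:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy trust lattice: a chain of VEX statuses ordered by statement trust,
// combined with a deterministic source-precedence tiebreak.
public enum VexStatus { UnderReview = 0, Affected = 1, Fixed = 2, NotAffected = 3 }
public enum Source { ThirdParty = 0, Maintainer = 1, Vendor = 2 }

public sealed record VexStatement(VexStatus Status, Source Source, string StatementId);

public static class Lattice
{
    // On a totally ordered chain, the join (least upper bound) is the maximum.
    public static VexStatus Join(VexStatus a, VexStatus b) =>
        (VexStatus)Math.Max((int)a, (int)b);

    // Deterministic resolution over a non-empty set of statements:
    // highest-precedence source wins; within a source, the lattice order
    // decides; remaining ties break lexically on statement id.
    public static VexStatement Resolve(IEnumerable<VexStatement> statements) =>
        statements
            .OrderByDescending(s => s.Source)                    // vendor > maintainer > 3rd-party
            .ThenByDescending(s => s.Status)                     // then lattice order
            .ThenBy(s => s.StatementId, StringComparer.Ordinal)  // stable tiebreak
            .First();
}
```

Because every comparison in `Resolve` is total and the final tiebreak is lexical, the same set of statements always yields the same winner, which is exactly the determinism property the engine needs.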
Developer consequence: all "Is this safe?" or "Should we fail the build?" logic:

* Lives in the **lattice engine in `Scanner.WebService`**, not in Sbomer / Vexer / Feedser / Concelier.
* Must be fully driven by declarative policy artifacts, which are:
  * Versioned,
  * Hash-pinned,
  * Stored as evidence.

---

### 1.6 Proof-of-Integrity Graph (build → deploy → runtime)

**Problem**

* Many vendors provide a one-shot scan or "image signing" with no continuous linkage back to build provenance and SBOM.
* Runtime views are disconnected from build-time evidence.

**Stella advantage**

* **Proof-of-Integrity Graph**:
  * For each running container/process, Stella tracks: image digest → SBOM → provenance attestation → signatures and tlog proofs → policies applied → runtime signals.
  * Every node in that chain is cryptographically linked.
* This lets you say: "This running pod corresponds to this exact build, these SBOM components, and these VEX statements, verified with these keys."

Developer consequence: any runtime-oriented module (scanner sidecars, agents, k8s admission, etc.) must:

* Treat the **digest** + attestation chain as the identity of a workload.
* Never operate solely on mutable labels (tags, names, namespaces) without a digest backlink.

---

### 1.7 AI Codex / Assistant on top of proofs, not heuristics

**Problem**

* Most AI-driven security assistants are "LLM over text reports," effectively hallucinating risk judgments.

**Stella advantage**

* The AI assistant (Zastava / Companion) is constrained to:
  * Read from the **evidence graph**, lattice decisions, and deterministic manifests.
  * Generate **explanations**, remediation plans, and playbooks, but never bypass hard rules.
* This yields:
  * Human-readable, audit-friendly reasoning.
  * Low hallucination risk, because the assistant is grounded in structured facts.

Developer consequence: all AI-facing APIs must:

* Expose **structured, well-typed evidence and decisions**, not raw strings.
* Treat LLM/AI output as advisory, never as an authority that can modify evidence, policy, or crypto state.

---

## 2. Stella Ops – developer guidelines

You can think of this as a "short charter" for all devs in the Stella codebase.

### 2.1 Architectural principles

1. **Evidence-first, policy-second, UI-third**
   * First: model and persist raw evidence (SBOM, VEX, scanner findings, attestations, logs).
   * Second: apply policies/lattice logic to evaluate evidence.
   * Third: build UI and CLI views that explain decisions based on evidence and policies.
2. **Pipeline-first interfaces**
   * Every capability must be consumable from:
     * CLI,
     * API,
     * CI/CD YAML integration.
   * The web UI is an explainer/debugger, not the only control plane.
3. **Offline-first design**
   * Every network dependency must have:
     * A clear "online" path, and
     * A documented "offline bundle" path (pre-seeded feeds, keyrings, logs).
   * No module is allowed to perform optional online calls that change security outcomes when offline.
4. **Determinism by default**
   * Core algorithms (matching, reachability, lattice resolution) must not:
     * Depend on wall-clock time (beyond inputs captured in the scan manifest),
     * Depend on network responses,
     * Use randomness without a seed recorded in the manifest.
   * Outputs must be reproducible given:
     * The same inputs,
     * The same policies,
     * The same versions of components.
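One concrete building block for principle 4 is canonical serialization. A small sketch, assuming `System.Text.Json`, that recursively sorts object keys (the C# analogue of the `jq -S` step in part 1); full canonicalization schemes such as RFC 8785 also normalize numbers and escaping, which this sketch omits:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json.Nodes;

public static class CanonicalJson
{
    // Recursively sort object keys so semantically identical documents
    // serialize to identical bytes (and therefore identical hashes).
    public static JsonNode? Canonicalize(JsonNode? node) => node switch
    {
        JsonObject obj => new JsonObject(
            obj.OrderBy(kv => kv.Key, StringComparer.Ordinal)
               .Select(kv => KeyValuePair.Create(kv.Key, Canonicalize(kv.Value)))),
        JsonArray arr => new JsonArray(arr.Select(Canonicalize).ToArray()),
        JsonValue val => val.DeepClone(),
        _ => null,
    };
}

// Usage sketch:
//   var canon = CanonicalJson.Canonicalize(JsonNode.Parse(rawSbomJson))!.ToJsonString();
```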
---

### 2.2 Solution & code organization (.NET 10 / C#)

For each service, follow a consistent layout, e.g.:

* `StellaOps.<Service>.Domain`
  * Pure domain models, lattice algebra types, value objects.
  * No I/O, no HTTP, no EF, no external libs except core BCL and domain math libs.
* `StellaOps.<Service>.Application`
  * Use-cases / handlers / orchestrations (CQRS style if preferred).
  * Interfaces for repositories, crypto, feeds, scanners.
* `StellaOps.<Service>.Infrastructure`
  * Implementations of ports:
    * EF Core 9 / Dapper for Postgres,
    * MongoDB drivers,
    * Integration with external scanners and tools.
* `StellaOps.<Service>.WebService`
  * ASP.NET minimal APIs or controllers.
  * AuthZ, multi-tenancy boundaries, DTOs, API versioning.
  * Lattice engine for Scanner only (per your standing rule).
* `StellaOps.Sdk.*`
  * Shared models and clients for:
    * Evidence graph schemas,
    * DSSE/attestation APIs,
    * Crypto abstraction.

Guideline: no domain logic inside controllers, jobs, or EF entities. All logic lives in `Domain` and `Application`.

---

### 2.3 Global invariants developers must respect

1. **Original evidence is immutable**
   * Once an SBOM/VEX/attestation/scanner report is stored:
     * Never update the stored bytes.
     * Only mark it as superseded/obsolete via new records.
   * Every mutation of state must be modeled as:
     * A new evidence node, or
     * A new relationship.
2. **"Link-not-merge" for external content**
   * Store external documents as canonical blobs + parsed, normalized models.
   * Link them to internal models; do not re-serialize a "Stella version" and throw away the original.
3. **Lattice logic only in `Scanner.WebService`**
   * Sbomer/Vexer/Feedser/Concelier must:
     * Ingest/normalize/publish evidence,
     * Never implement their own evaluation of "safe vs. unsafe."
   * `Scanner.WebService` is the only place where:
     * Reachability,
     * Severity,
     * VEX resolution,
     * Policy decisions
     are computed.
4. **Crypto operations via Authority**
   * Any signing or verification of:
     * SBOMs,
     * VEX,
     * Provenance,
     * Scan manifests
     must go through Authority abstractions:
     * Key store,
     * Trust policy engine,
     * Rekor/log verifier (online or offline).
   * No direct `RSA.Create()` etc. inside application services.
5. **No implicit network trust**
   * Any HTTP client must explicitly declare whether it is allowed in:
     * Online mode only, or
     * Online + offline (with mirror).
   * Online fetches may only:
     * Pull feeds and cache them as immutable snapshots.
     * Never change decisions made for already-completed scans.

---

### 2.4 Module-level guidelines

#### 2.4.1 Scanner.*

Responsibilities:

* Integrate one or more scanners (Trivy, Grype, OSV, custom engines, Bun/Node, etc.).
* Normalize their findings into a **canonical finding model**.
* Run lattice + reachability algorithms to derive final "risk states".

Guidelines:

* Each engine integration:
  * Runs in an isolated, well-typed adapter (e.g., `IScannerEngine`; see the sketch after this section).
  * Produces **raw findings** with full context (CVE, package, version, location, references).
* Canonical model:
  * Represent vulnerability, package, location, and evidence origin explicitly.
  * Track which engine(s) reported each finding.
* Lattice engine:
  * Consumes:
    * Canonical findings,
    * SBOM components,
    * VEX statements,
    * Policies,
    * Optional runtime call graph / reachability information.
  * Produces:
    * A deterministic risk state per (vulnerability, component, artifact).
* Scanner output:
  * Always includes:
    * Raw evidence references (IDs),
    * Decisions (state),
    * Justification (which rules fired).
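As a possible shape for the adapter boundary referenced above, here is an illustrative C# sketch; `IScannerEngine` is named in the guidelines, but the members, `RawFinding`, and the parameter names are assumptions about how the contract could look:

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// A raw finding, before lattice/VEX resolution: full context, no verdict.
public sealed record RawFinding(
    string Engine,            // e.g. "trivy": which engine reported it
    string VulnerabilityId,   // e.g. "CVE-2025-1234"
    string PackageName,
    string PackageVersion,
    string Location,          // layer/file/path inside the artifact
    IReadOnlyList<string> References);

// One adapter per engine; isolation keeps engine quirks out of the core.
public interface IScannerEngine
{
    string Name { get; }

    // Scans an artifact pinned by digest using a pinned feeds snapshot;
    // must be runnable offline and yield raw findings only (no risk states).
    Task<IReadOnlyList<RawFinding>> ScanAsync(
        string artifactDigest,
        string feedsSnapshotId,
        CancellationToken ct = default);
}
```

Keeping verdicts out of this contract enforces invariant 3: engines report evidence; only the Scanner lattice engine decides.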
#### 2.4.2 Sbomer.*

Responsibilities:

* Ingest, validate, and store SBOMs.
* Canonicalize and expose them as structured evidence.

Guidelines:

* Support CycloneDX + SPDX first; plug-in architecture for others.
* Canonicalization:
  * Sort keys, normalize IDs, strip non-essential formatting.
  * Compute a **canonical hash** and store it.
* Never drop information:
  * Unknown/extension fields should be preserved in a generic structure.
* Indexing:
  * Index SBOMs by artifact digest and canonical hash.
  * Make ingestion idempotent for identical content.

#### 2.4.3 Vexer / Excitors.*

Responsibilities:

* Ingest and normalize VEX and advisory documents from multiple sources.
* Maintain a source-preserving model of statements, not final risk.

Guidelines:

* VEX statements:
  * Model as: (subject, vulnerability, status, justification, timestamp, signer, source).
  * Keep source granularity (which file, line, signature).
* Excitors (feed-to-VEX/advisory converters):
  * Pull from vendors (Red Hat, Debian, etc.) and convert into normalized VEX-like statements or an internal advisory format.
  * Preserve raw docs alongside normalized statements.
* No lattice resolution:
  * They only output statements; resolution happens in Scanner based on the trust lattice.

#### 2.4.4 Feedser.*

Responsibilities:

* Fetch and snapshot external feeds (CVE, OS, language ecosystems, vendor advisories).

Guidelines:

* Snapshot model:
  * Each fetch = a versioned snapshot with:
    * Source URL,
    * Time,
    * Hash,
    * Signed metadata if available.
* Offline bundles:
  * Ability to export/import snapshots as tarballs for air-gapped environments.
* Idempotency:
  * Importing the same snapshot twice must be a no-op.

#### 2.4.5 Authority.*

Responsibilities:

* Central key and trust management.
* DSSE, signing, verification.
* Rekor/log (online or offline) integration.

Guidelines:

* Key management:
  * Clearly separate:
    * Online signing keys,
    * Offline/HSM keys,
    * Root keys.
* Verification:
  * Use local keyrings, pinned CAs, and offline logs by default.
  * Enforce "no public fallback" unless explicitly opted in by the admin.
* API:
  * Provide a stable interface for:
    * `VerifyAttestation(artifactDigest, dsseEnvelope, verificationPolicy) → VerificationResult`
    * `SignEvidence(evidenceHash, keyId, context) → Signature`

#### 2.4.6 Concelier.*

Responsibilities:

* Map the evidence graph to business context:
  * Applications, environments, customers, SLAs.

Guidelines:

* Never change evidence; only:
  * Attach business labels,
  * Build views (per app, per cluster, per customer).
* Use decisions from Scanner:
  * Do not re-implement risk logic.
  * Only interpret risk in business terms (SLA breaches, policy exceptions, etc.).

---

### 2.5 Testing & quality guidelines

1. **Golden fixtures everywhere**
   * For SBOM/VEX/attestation/scan pipelines:
     * Maintain small, realistic fixture sets with:
       * Inputs (files),
       * Config manifests,
       * Expected evidence graph outputs.
   * Tests must be deterministic and work offline.
2. **Snapshot-style tests**
   * For lattice decisions:
     * Use snapshot tests of decisions per (artifact, vulnerability).
   * Any change must be reviewed as a potential policy or algorithm change.
3. **Offline mode tests**
   * CI must include a job that:
     * Runs with `NO_NETWORK=1` (or equivalent),
     * Uses only pre-seeded bundles,
     * Ensures features degrade gracefully but deterministically.
4. **Performance caps**
   * For core algorithms (matching, lattice, reachability):
     * Maintain per-feature benchmarks with target upper bounds.
     * Fail PRs that introduce significant regressions.
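A minimal xUnit sketch of the golden-fixture + offline-replay idea from 2.5; the fixture paths and the `ScanRunner.ReplayAsync` entry point are hypothetical:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Xunit;

// Hypothetical entry point; in a real codebase this would live in the
// Scanner application layer and be injected, not static.
public static class ScanRunner
{
    public static Task<byte[]> ReplayAsync(string manifestJson, bool offlineOnly) =>
        throw new NotImplementedException("illustrative stub");
}

public class ReplayTests
{
    [Fact]
    public async Task Scan_Replays_ByteForByte_Offline()
    {
        // Pre-seeded bundle only: the test must pass with NO_NETWORK=1.
        var manifest = await File.ReadAllTextAsync("fixtures/run1/scan.manifest.json");
        var expected = await File.ReadAllBytesAsync("fixtures/run1/evidence-graph.json");

        byte[] actual = await ScanRunner.ReplayAsync(manifest, offlineOnly: true);

        // Determinism contract: identical inputs => identical bytes.
        Assert.Equal(expected, actual);
    }
}
```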
---

### 2.6 CI/CD and deployment guidelines

1. **Immutable build**
   * All binaries and containers are:
     * Built in controlled CI,
     * SBOM-ed,
     * Signed (Authority),
     * Given an optional tlog entry (online or offline).
2. **Self-hosting expectations**
   * The default deployment is:
     * Docker Compose or Kubernetes,
     * Postgres + Mongo (if used) pinned with migrations,
     * No internet required after the initial bundle import.
3. **Scan as code**
   * Scans are declared as YAML (or JSON) manifests checked into Git:
     * Artifact(s),
     * Policies,
     * Feed snapshot IDs,
     * Toolchain versions.
   * CI jobs call the Stella CLI/SDK using those manifests.

---

### 2.7 Definition of Done for a new feature

When you implement a new Stella feature, it is "Done" only if:

1. Evidence
   * New data is persisted as immutable evidence with canonical hashes.
   * Original external content is stored and linkable.
2. Determinism
   * Results are deterministic given a manifest of inputs; a "replay" test exists.
3. Offline
   * The feature works with offline bundles and does not silently call the internet.
   * Degradation behavior is clearly defined and tested.
4. Trust & crypto
   * All signing/verification goes through Authority.
   * Any new trust decisions are expressible via lattice/policy manifests.
5. UX & pipeline
   * The feature is accessible from:
     * CLI,
     * API,
     * CI.
   * The UI only explains and navigates; it is not the sole control.

If you like, the next step I can take is to pick one module (e.g., `StellaOps.Scanner.WebService`) and write a concrete, file-level skeleton (projects, directories, main classes/interfaces) that follows all of these rules.