Here's a clean, air-gap-ready spine for turning container images into verifiable SBOMs and provenance, built to be idempotent and easy to slot into Stella Ops or any CI/CD.

```mermaid
flowchart LR
  A[OCI Image/Repo] --> B[Layer Extractor]
  B --> C[Sbomer: CycloneDX/SPDX]
  C --> D[DSSE Sign]
  D --> E["in-toto Statement (SLSA Provenance)"]
  E --> F[Transparency Log Adapter]
  C --> G[POST /sbom/ingest]
  F --> H[POST /attest/verify]
```

### What this does (in plain words)

* **Pull & crack the image** → extract layers, metadata (labels, env, history).
* **Build an SBOM** → emit **CycloneDX 1.6** and **SPDX 3.0.1** (pick one or both).
* **Sign artifacts** → wrap SBOM/provenance in **DSSE** envelopes.
* **Provenance** → generate an **in-toto Statement** with **SLSA Provenance v1** as the predicate.
* **Auditability** → optionally publish attestations to a transparency log (e.g., Rekor) so they're tamper-evident via Merkle proofs.
* **APIs are idempotent** → safe to re-ingest the same image/SBOM/attestation without version churn.

### Design notes you can hand to an agent

* **Idempotency keys**
  * `contentAddress` = SHA256 of OCI manifest (or full image digest)
  * `sbomHash` = SHA256 of normalized SBOM JSON
  * `attHash` = SHA256 of DSSE payload (base64-stable)

  Store these; answer duplicates with HTTP 200 + `"status":"already_present"` instead of creating a new record.
* **Default formats**
  * SBOM export: CycloneDX v1.6 (`application/vnd.cyclonedx+json`), SPDX 3.0.1 (`application/spdx+json`)
  * DSSE envelope: `application/vnd.dsse.envelope.v1+json`
  * in-toto Statement: `application/vnd.in-toto+json` with `predicateType` = SLSA Provenance v1
* **Air-gap mode**
  * No external calls required; Rekor publish is optional.
  * Keep a local Merkle log (pluggable) and allow later "sync-to-Rekor" when online.
* **Transparency log adapter**
  * Interface: `Put(entry) -> {logIndex, logID, inclusionProof}`
  * Backends: `rekor`, `local-merkle`, `null` (no-op)

### Minimal API sketch

* `POST /sbom/ingest`
  * Body: `{ imageDigest, sbom, format, dsseSignature? }`
  * Returns: `{ sbomId, status, sbomHash }` (status: `stored|already_present`)
* `POST /attest/verify`
  * Body: `{ dsseEnvelope, expectedSubjects:[{name, digest}] }`
  * Verifies DSSE, checks in-toto subject ↔ image digest, optionally records/logs.
  * Returns: `{ verified:true, predicateType, logIndex?, inclusionProof? }`

### CLI flow (pseudocode)

```bash
# 1) Extract
stella-extract --image "$IMG" --out /work/extract

# 2) SBOM (Cdx + SPDX)
stella-sbomer cdx  --in /work/extract --out /work/sbom.cdx.json
stella-sbomer spdx --in /work/extract --out /work/sbom.spdx.json

# 3) DSSE sign (offline keyring or HSM)
stella-sign dsse --in /work/sbom.cdx.json --out /work/sbom.cdx.dsse.json --key file:k.pem

# 4) SLSA provenance (in-toto Statement)
stella-provenance slsa-v1 --subject "$IMG_DIGEST" --materials /work/extract/manifest.json \
  --out /work/prov.dsse.json --key file:k.pem

# 5) (optional) Publish to transparency log
stella-log publish --in /work/prov.dsse.json --backend rekor --rekor-url "$REKOR"
```

### Validation rules (quick)

* **Subject binding**: in-toto Statement `subject[].digest.sha256` must equal the OCI image digest you scanned.
* **Key policy**: enforce allowed issuers (Fulcio, internal CA, GOST/SM/eIDAS/FIPS as needed).
* **Normalization**: canonicalize JSON before hashing/signing to keep idempotency stable.

### Why this matters

* **Audit-ready**: You can always prove *what* you scanned, *how* it was built, and *who* signed it.
* **Noise-gated**: With deterministic SBOMs + provenance, downstream VEX/reachability gets much cleaner.
* **Drop-in**: Works in harsh environments (offline, mirrors, sovereign crypto stacks) without changing your pipeline.

If you want, I can generate:

* a ready-to-use OpenAPI stub for `POST /sbom/ingest` and `POST /attest/verify`,
* C# (.NET 10) DSSE + in-toto helpers (interfaces + test fixtures),
* or a Docker-compose "air-gap bundle" showing the full spine end-to-end.
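The idempotency keys and the `stored|already_present` contract described above can be sketched in a few lines. This is a minimal illustration in Python for brevity, not the .NET implementation the spec targets. The names `canonical_json`, `sbom_hash`, and `SbomStore` are hypothetical, the store is an in-memory stand-in for the database, and `json.dumps` with sorted keys only approximates canonicalization (it does not sort arrays or strip timestamps).

```python
import hashlib
import json


def canonical_json(doc) -> bytes:
    """Serialize with sorted object keys and no insignificant whitespace, as UTF-8.

    Assumption: this is a simplified normalization; array ordering and
    unstable fields would need format-specific handling on top of it.
    """
    text = json.dumps(doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sbom_hash(sbom: dict) -> str:
    """SHA-256 over the canonical form; used as the idempotency key."""
    return "sha256:" + hashlib.sha256(canonical_json(sbom)).hexdigest()


class SbomStore:
    """Hypothetical in-memory stand-in for the persistence layer."""

    def __init__(self):
        self._rows = {}  # (imageDigest, sbomHash) -> sbomId

    def ingest(self, image_digest: str, sbom: dict, fmt: str) -> dict:
        digest = sbom_hash(sbom)
        key = (image_digest, digest)
        if key in self._rows:
            # Duplicate content: report the existing record, create nothing.
            return {"status": "already_present", "sbomId": self._rows[key], "sbomHash": digest}
        sbom_id = f"sbom-{len(self._rows) + 1}"
        self._rows[key] = sbom_id
        return {"status": "stored", "sbomId": sbom_id, "sbomHash": digest}


if __name__ == "__main__":
    store = SbomStore()
    doc = {"bomFormat": "CycloneDX", "specVersion": "1.6", "components": []}
    same_doc_reordered = {"components": [], "specVersion": "1.6", "bomFormat": "CycloneDX"}
    print(store.ingest("sha256:abc", doc, "cyclonedx")["status"])
    print(store.ingest("sha256:abc", same_doc_reordered, "cyclonedx")["status"])
```

The second call differs only in key order, so it hashes to the same `sbomHash` and is reported as `already_present`. A real service would map the two statuses to `201` and `200` respectively.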
Below is a full architecture plan you can hand to an agent as the "master spec" for implementing the SBOM & provenance spine (image → SBOM → DSSE → in-toto/SLSA → transparency log → REST APIs), with idempotent APIs and air-gap readiness.

---

## 1. Scope and Objectives

**Goal:** Implement a deterministic, air-gap-ready "SBOM spine" that:

* Converts OCI images into SBOMs (CycloneDX 1.6 and SPDX 3.0.1).
* Generates SLSA v1 provenance wrapped in in-toto Statements.
* Signs all artifacts with DSSE envelopes using pluggable crypto providers.
* Optionally publishes attestations to transparency logs (Rekor/local-Merkle/none).
* Exposes stable, idempotent APIs:
  * `POST /sbom/ingest`
  * `POST /attest/verify`
* Avoids versioning by design: APIs are extended, not versioned, and all mutations are idempotent, keyed by content digests.

**Out of scope (for this iteration):**

* Full vulnerability scanning (delegated to Scanner service).
* Policy evaluation / lattice logic (delegated to Scanner/Graph engine).
* Vendor-facing proof-market ledger and trust economics (future module).

---

## 2. High-Level Architecture

### 2.1 Logical Components

1. **StellaOps.SupplyChain.Core (Library)**
   * Shared types and utilities:
     * Domain models: SBOM, DSSE, in-toto Statement, SLSA predicates.
     * Canonicalization & hashing utilities.
     * DSSE sign/verify abstractions.
     * Transparency log entry model & Merkle proof verification.
2. **StellaOps.Sbomer.Engine (Library)**
   * Image → SBOM functionality:
     * Layer & manifest analysis.
     * SBOM generation: CycloneDX, SPDX.
     * Extraction of metadata (labels, env, history).
     * Deterministic ordering & normalization.
3. **StellaOps.Provenance.Engine (Library)**
   * Build provenance & in-toto:
     * In-toto Statement generator.
     * SLSA v1 provenance predicate builder.
     * Subject and material resolution from image metadata & SBOM.
4. **StellaOps.Authority (Service/Library)**
   * Crypto & keys:
     * Key management abstraction (file, HSM, KMS, sovereign crypto).
     * DSSE signing & verification with multiple key types.
     * Trust roots, certificate chains, key policies.
5. **StellaOps.LogBridge (Service/Library)**
   * Transparency log adapter:
     * Rekor backend.
     * Local Merkle log backend (for air-gap).
     * Null backend (no-op).
     * Merkle proof validation.
6. **StellaOps.SupplyChain.Api (Service)**
   * The SBOM spine HTTP API:
     * `POST /sbom/ingest`
     * `POST /attest/verify`
     * Optionally: `GET /sbom/{id}`, `GET /attest/{id}`, `GET /image/{digest}/summary`.
   * Performs orchestrations:
     * SBOM/attestation parsing, canonicalization, hashing.
     * Idempotency and persistence.
     * Delegation to Authority and LogBridge.
7. **CLI Tools (optional but recommended)**
   * `stella-extract`, `stella-sbomer`, `stella-sign`, `stella-provenance`, `stella-log`.
   * Thin wrappers over the above libraries; usable offline and in CI pipelines.
8. **Persistence Layer**
   * Primary DB: PostgreSQL (or other RDBMS).
   * Optional object storage: S3/MinIO for large SBOM/attestation blobs.
   * Tables: `images`, `sboms`, `attestations`, `signatures`, `log_entries`, `keys`.

### 2.2 Deployment View (Kubernetes / Docker)

```mermaid
flowchart LR
  subgraph Node1[Cluster Node]
    A["StellaOps.SupplyChain.Api (ASP.NET Core)"]
    B[StellaOps.Authority Service]
    C[StellaOps.LogBridge Service]
  end

  subgraph Node2[Worker Node]
    D[Runner / CI / Air-gap host]
    E[CLI Tools\nstella-extract/sbomer/sign/provenance/log]
  end

  F[(PostgreSQL)]
  G[(Object Storage\nS3/MinIO)]
  H[(Local Merkle Log\nor Rekor)]

  A --> F
  A --> G
  A --> C
  A --> B
  C --> H
  E --> A
```

* **Air-gap mode:**
  * Rekor backend disabled; LogBridge uses local Merkle log (`H`) or `null`.
  * All components run within the offline network.
* **Online mode:**
  * LogBridge talks to external Rekor instance using outbound HTTPS only.

---

## 3. Domain Model and Storage Design

Use EF Core 9 with PostgreSQL in .NET 10.

### 3.1 Core Entities

1. **ImageArtifact**
   * `Id` (GUID/ULID, internal).
   * `ImageDigest` (string; OCI digest; UNIQUE).
   * `Registry` (string).
   * `Repository` (string).
   * `Tag` (string, nullable, since digest is canonical).
   * `FirstSeenAt` (timestamp).
   * `MetadataJson` (JSONB; manifest, labels, env).
2. **Sbom**
   * `Id` (string, primary key = `SbomHash` or derived ULID).
   * `ImageArtifactId` (FK).
   * `Format` (enum: `CycloneDX_1_6`, `SPDX_3_0_1`).
   * `ContentHash` (string; normalized JSON SHA-256; UNIQUE with `TenantId`).
   * `StorageLocation` (inline JSONB or external object storage key).
   * `CreatedAt`.
   * `Origin` (enum: `Generated`, `Uploaded`, `ExternalVendor`).
   * Unique constraint: `(TenantId, ContentHash)`.
3. **Attestation**
   * `Id` (string, primary key = `AttestationHash` or derived ULID).
   * `ImageArtifactId` (FK).
   * `Type` (enum: `InTotoStatement_SLSA_v1`, `Other`).
   * `PayloadHash` (hash of DSSE payload, before envelope).
   * `DsseEnvelopeHash` (hash of full DSSE JSON).
   * `StorageLocation` (inline JSONB or object storage).
   * `CreatedAt`.
   * `Issuer` (string; signer identity / certificate subject).
   * Unique constraint: `(TenantId, DsseEnvelopeHash)`.
4. **SignatureInfo**
   * `Id` (GUID/ULID).
   * `AttestationId` (FK).
   * `KeyId` (logical key identifier).
   * `Algorithm` (enum; includes PQ & sovereign algs).
   * `VerifiedAt`.
   * `VerificationStatus` (enum: `Valid`, `Invalid`, `Unknown`).
   * `DetailsJson` (JSONB; trust-chain, error reasons, etc.).
5. **TransparencyLogEntry**
   * `Id` (GUID/ULID).
   * `AttestationId` (FK).
   * `Backend` (enum: `Rekor`, `LocalMerkle`).
   * `LogIndex` (string).
   * `LogId` (string).
   * `InclusionProofJson` (JSONB).
   * `RecordedAt`.
   * Unique constraint: `(Backend, LogId, LogIndex)`.
6. **KeyRecord** (optional if not reusing Authority's DB)
   * `KeyId` (string, PK).
   * `KeyType` (enum).
   * `Usage` (enum: `Signing`, `Verification`, `Both`).
   * `Status` (enum: `Active`, `Retired`, `Revoked`).
   * `MetadataJson` (JSONB; KMS ARN, HSM slot, etc.).

### 3.2 Idempotency Keys

* SBOM:
  * `sbomHash = SHA256(canonicalJson(sbom))`.
  * Uniqueness enforced by `(TenantId, sbomHash)` in DB.
* Attestation:
  * `attHash = SHA256(canonicalJson(dsse.payload))` or full envelope.
  * Uniqueness enforced by `(TenantId, attHash)` in DB.
* Image:
  * `imageDigest` is globally unique (per OCI spec).

---

## 4. Service-Level Architecture

### 4.1 StellaOps.SupplyChain.Api (.NET 10, ASP.NET Core)

**Responsibilities:**

* Expose HTTP API for ingest / verify.
* Handle idempotency logic & persistence.
* Delegate cryptographic operations to Authority.
* Delegate transparency logging to LogBridge.
* Perform basic validation against schemas (SBOM, DSSE, in-toto, SLSA).

**Key Endpoints:**

1. `POST /sbom/ingest`
   * Request:
     * `imageDigest` (string).
     * `sbom` (raw JSON).
     * `format` (enum/string).
     * Optional: `dsseSignature` or `dsseEnvelope`.
   * Behavior:
     * Parse & validate SBOM structure.
     * Canonicalize JSON, compute `sbomHash`.
     * If `sbomHash` exists for `imageDigest` and tenant:
       * Return `200` with `{ status: "already_present", sbomId, sbomHash }`.
     * Else:
       * Persist `Sbom` entity.
       * Optionally verify DSSE signature via Authority.
       * Return `201` with `{ status: "stored", sbomId, sbomHash }`.
2. `POST /attest/verify`
   * Request:
     * `dsseEnvelope` (JSON).
     * `expectedSubjects` (list of `{ name, digest }`).
   * Behavior:
     * Canonicalize payload, compute `attHash`.
     * Verify DSSE signature via Authority.
     * Parse in-toto Statement; ensure `subject[].digest.sha256` matches `expectedSubjects`.
     * Persist `Attestation` & `SignatureInfo`.
     * If configured, call LogBridge to publish and store `TransparencyLogEntry`.
     * If `attHash` already exists:
       * Return `200` with `status: "already_present"` and existing references.
     * Else, return `201` with `verified:true`, plus log info when available.
3. Optional read APIs:
   * `GET /sbom/by-image/{digest}`
   * `GET /attest/by-image/{digest}`
   * `GET /image/{digest}/summary` (SBOM + attestations + log status).

### 4.2 StellaOps.Sbomer.Engine

**Responsibilities:**

* Given:
  * OCI image manifest & layers (from local tarball or remote registry).
* Produce:
  * CycloneDX 1.6 JSON.
  * SPDX 3.0.1 JSON.

**Design:**

* Use layered analyzers:
  * `ILayerAnalyzer` for generic filesystem traversal.
  * Language-specific analyzers (optional for SBOM detail):
    * `DotNetAnalyzer`, `NodeJsAnalyzer`, `PythonAnalyzer`, `JavaAnalyzer`, `PhpAnalyzer`, etc.
* Determinism:
  * Sort all lists (components, dependencies) by stable keys.
  * Remove unstable fields (timestamps, machine IDs, ephemeral paths).
  * Provide `Normalize()` method per format that returns canonical JSON.

### 4.3 StellaOps.Provenance.Engine

**Responsibilities:**

* Build in-toto Statement with SLSA v1 predicate:
  * `subject` derived from image digest(s).
  * `materials` from:
    * Git commit, tag, builder image, SBOM components if available.
* Ensure determinism:
  * Sort materials by URI + digest.
  * Normalize nested maps.

**Key APIs (internal library):**

* `InTotoStatement BuildSlsaProvenance(ImageArtifact image, Sbom sbom, ProvenanceContext ctx)`
* `string ToCanonicalJson(InTotoStatement stmt)`

### 4.4 StellaOps.Authority

**Responsibilities:**

* DSSE signing & verification.
* Key management abstraction.
* Policy enforcement (which keys/trust roots are allowed).

**Interfaces:**

* `ISigningProvider`
  * `Task<DsseEnvelope> SignAsync(byte[] payload, string payloadType, string keyId)`
* `IVerificationProvider`
  * `Task<VerificationResult> VerifyAsync(DsseEnvelope envelope, VerificationPolicy policy)`

**Backends:**

* File-based keys (PEM).
* HSM/KMS (AWS KMS, Azure Key Vault, on-prem HSM).
* Sovereign crypto providers (GOST, SMx, etc.).
* Optional PQ providers (Dilithium, Falcon).

### 4.5 StellaOps.LogBridge

**Responsibilities:**

* Abstract interaction with transparency logs.

**Interface:**

* `ILogBackend`
  * `Task<LogEntryResult> PutAsync(byte[] canonicalPayloadHash, DsseEnvelope env)`
  * `Task VerifyInclusionAsync(LogEntryResult entry)`

**Backends:**

* `RekorBackend`:
  * Calls Rekor REST API with hashed payload.
* `LocalMerkleBackend`:
  * Maintains Merkle tree in local DB.
  * Returns `logIndex`, `logId`, and inclusion proof.
* `NullBackend`:
  * Returns empty/no-op results.

### 4.6 CLI Tools (Optional)

Use the same libraries as the services:

* `stella-extract`:
  * Input: image reference.
  * Output: local tarball + manifest JSON.
* `stella-sbomer`:
  * Input: manifest & layers.
  * Output: SBOM JSON.
* `stella-sign`:
  * Input: JSON file.
  * Output: DSSE envelope.
* `stella-provenance`:
  * Input: image digest, build metadata.
  * Output: signed in-toto/SLSA DSSE.
* `stella-log`:
  * Input: DSSE envelope.
  * Output: log entry details.

---

## 5. End-to-End Flows

### 5.1 SBOM Ingest (Upload Path)

```mermaid
sequenceDiagram
  participant Client
  participant API as SupplyChain.Api
  participant Core as SupplyChain.Core
  participant DB as PostgreSQL

  Client->>API: POST /sbom/ingest (imageDigest, sbom, format)
  API->>Core: Validate & canonicalize SBOM
  Core-->>API: sbomHash
  API->>DB: SELECT Sbom WHERE sbomHash & imageDigest
  DB-->>API: Not found
  API->>DB: INSERT Sbom (sbomHash, imageDigest, content)
  DB-->>API: ok
  API-->>Client: 201 { status:"stored", sbomId, sbomHash }
```

Re-ingest of the same SBOM repeats the steps up to the SELECT, then returns `status:"already_present"` with `200`.

### 5.2 Attestation Verify & Record

```mermaid
sequenceDiagram
  participant Client
  participant API as SupplyChain.Api
  participant Auth as Authority
  participant Log as LogBridge
  participant DB as PostgreSQL

  Client->>API: POST /attest/verify (dsseEnvelope, expectedSubjects)
  API->>Auth: Verify DSSE (keys, policy)
  Auth-->>API: VerificationResult(Valid/Invalid)
  API->>API: Parse in-toto, check subjects vs expected
  API->>DB: SELECT Attestation WHERE attHash
  DB-->>API: Not found
  API->>DB: INSERT Attestation + SignatureInfo
  alt Logging enabled
    API->>Log: PutAsync(attHash, envelope)
    Log-->>API: LogEntryResult(logIndex, logId, proof)
    API->>DB: INSERT TransparencyLogEntry
  end
  API-->>Client: 201 { verified:true, attestationId, logIndex?, inclusionProof? }
```

If the attestation already exists, the API returns `200` with `status:"already_present"`.

---

## 6. Idempotency and Determinism Strategy

1. **Canonicalization rules:**
   * Remove insignificant whitespace.
   * Sort all object keys lexicographically.
   * Sort arrays where order is not semantically meaningful (components, materials).
   * Strip non-deterministic fields (timestamps, random IDs) where allowed.
2. **Hashing:**
   * Always hash canonical JSON as UTF-8.
   * Use SHA-256 for core IDs; allow crypto provider to also compute other digests if needed.
3. **Persistence:**
   * Enforce uniqueness in DB via indices on:
     * `(TenantId, ContentHash)` for SBOMs.
     * `(TenantId, AttHash)` for attestations.
     * `(Backend, LogId, LogIndex)` for log entries.
   * API behavior:
     * Existing row → `200` with `"already_present"`.
     * New row → `201` with `"stored"`.
4. **API design:**
   * No version numbers in path.
   * Add fields over time; never break or repurpose existing ones.
   * Use explicit capability discovery via `GET /meta/capabilities` if needed.

---

## 7. Air-Gap Mode and Synchronization

### 7.1 Air-Gap Mode

* Configuration flag `Mode = Offline` on SupplyChain.Api.
* LogBridge backend:
  * Default to `LocalMerkle` or `Null`.
  * Rekor-specific configuration disabled or absent.
* DB & Merkle log stored locally inside the secure network.

### 7.2 Later Synchronization to Rekor (Optional Future Step)

Not mandatory for first iteration, but prepare for:

* Background job (Scheduler module) that:
  * Enumerates local `TransparencyLogEntry` not yet exported.
  * Publishes hashed payloads to Rekor when network is available.
  * Stores mapping between local log entries and remote Rekor entries.

---

## 8. Security, Access Control, and Observability

### 8.1 Security

* mTLS between internal services (SupplyChain.Api, Authority, LogBridge).
* Authentication:
  * API keys/OIDC for clients.
  * Per-tenant scoping; `TenantId` must be present in context.
* Authorization:
  * RBAC: which tenants/users can write/verify/only read.

### 8.2 Crypto Policies

* Policy object defines:
  * Allowed key types and algorithms.
  * Trust roots (Fulcio, internal CA, sovereign PKI).
  * Revocation checking strategy (CRL/OCSP, offline lists).
* Authority enforces policies; SupplyChain.Api only consumes `VerificationResult`.

### 8.3 Observability

* Logs:
  * Structured logs with correlation IDs; log imageDigest, sbomHash, attHash.
* Metrics:
  * SBOM ingest count, dedup hit rate.
  * Attestation verify latency.
  * Transparency log publish success/failure counts.
* Traces:
  * OpenTelemetry tracing across API → Authority → LogBridge.

---

## 9. Implementation Plan (Epics & Work Packages)

You can give this section directly to agents to split.

### Epic 1: Core Domain & Canonicalization

1. Define .NET 10 solution structure:
   * Projects:
     * `StellaOps.SupplyChain.Core`
     * `StellaOps.Sbomer.Engine`
     * `StellaOps.Provenance.Engine`
     * `StellaOps.SupplyChain.Api`
     * `StellaOps.Authority` (if not already present)
     * `StellaOps.LogBridge`
2. Implement core domain models:
   * SBOM, DSSE, in-toto, SLSA v1.
3. Implement canonicalization & hashing utilities.
4. Unit tests:
   * Given semantically equivalent JSON, hashes must match.
   * Reordering tests: inputs whose key or element order changes but whose meaning does not must still hash identically.

### Epic 2: Persistence Layer

1. Design EF Core models for:
   * ImageArtifact, Sbom, Attestation, SignatureInfo, TransparencyLogEntry, KeyRecord.
2. Write migrations for PostgreSQL.
3. Implement repository interfaces for read/write.
4. Tests:
   * Unique constraints and idempotency behavior.
   * Query performance for common access paths (by imageDigest).

### Epic 3: SBOM Engine

1. Implement minimal layer analysis:
   * Accepts local tarball or path (for now).
2. Implement CycloneDX 1.6 generator.
3. Implement SPDX 3.0.1 generator.
4. Deterministic normalization across formats.
5. Tests:
   * Golden files for images → SBOM output.
   * Stability under repeated runs.

### Epic 4: Provenance Engine

1. Implement in-toto Statement model with SLSA v1 predicate.
2. Implement builder to map:
   * ImageDigest → subject.
   * Build metadata → materials.
3. Deterministic canonicalization.
4. Tests:
   * Golden in-toto/SLSA statements for sample inputs.
   * Subject matching logic.

### Epic 5: Authority Integration

1. Implement `ISigningProvider`, `IVerificationProvider` contracts.
2. Implement file-based key backend as default.
3. Implement DSSE wrapper:
   * `SignAsync(payload, payloadType, keyId)`.
   * `VerifyAsync(envelope, policy)`.
4. Tests:
   * DSSE round-trip; invalid signature scenarios.
   * Policy enforcement tests.

### Epic 6: Transparency Log Bridge

1. Implement `ILogBackend` interface.
2. Implement `LocalMerkleBackend`:
   * Simple Merkle tree with DB storage.
3. Implement `NullBackend`.
4. Define configuration model to select backend.
5. (Optional later) Implement `RekorBackend`.
6. Tests:
   * Stable Merkle root; inclusion proof verification.

### Epic 7: SupplyChain.Api

1. Implement `POST /sbom/ingest`:
   * Request/response DTOs.
   * Integration with canonicalization, persistence, idempotency logic.
2. Implement `POST /attest/verify`:
   * End-to-end verification and persistence.
   * Integration with Authority and LogBridge.
3. Optional read APIs.
4. Add input validation (JSON schema, basic constraints).
5. Integration tests:
   * Full flows for new and duplicate inputs.
   * Error cases (invalid DSSE, subject mismatch).

### Epic 8: CLI Tools

1. Implement `stella-sbomer` (wraps Sbomer.Engine).
2. Implement `stella-provenance` (wraps Provenance.Engine + Authority).
3. Implement `stella-sign` and `stella-log`.
4. Provide clear help/usage and sample scripts.

### Epic 9: Hardening, Air-Gap Profile, and Docs

1. Configuration profiles:
   * `Offline` vs `Online`.
   * Log backend selection.
2. Security hardening:
   * mTLS, authentication, authorization.
3. Observability:
   * Metrics, logs, traces wiring.
4. Documentation:
   * API reference.
   * Sequence diagrams.
   * Deployment recipes for:
     * Single-node air-gap.
     * Clustered online deployment.
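
As a reference point for Epic 6 ("Stable Merkle root; inclusion proof verification"), here is a minimal sketch of what `LocalMerkleBackend` has to compute. It is written in Python for brevity instead of the spec's .NET, keeps the tree in memory instead of a database, and assumes RFC 6962-style hashing (a `0x00` prefix for leaves, `0x01` for interior nodes, split at the largest power of two below the tree size). The spec does not fix the tree construction, so that choice and all function names here are assumptions.

```python
import hashlib


def _leaf(data: bytes) -> bytes:
    # Domain-separated leaf hash (0x00 prefix).
    return hashlib.sha256(b"\x00" + data).digest()


def _node(left: bytes, right: bytes) -> bytes:
    # Domain-separated interior node hash (0x01 prefix).
    return hashlib.sha256(b"\x01" + left + right).digest()


def _split(n: int) -> int:
    """Largest power of two strictly less than n (requires n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(entries: list[bytes]) -> bytes:
    """Root hash over the log entries, in insertion order."""
    if not entries:
        return hashlib.sha256(b"").digest()
    if len(entries) == 1:
        return _leaf(entries[0])
    k = _split(len(entries))
    return _node(merkle_root(entries[:k]), merkle_root(entries[k:]))


def inclusion_proof(index: int, entries: list[bytes]) -> list[bytes]:
    """Sibling hashes from the leaf at `index` up to the root (bottom first)."""
    if len(entries) == 1:
        return []
    k = _split(len(entries))
    if index < k:
        return inclusion_proof(index, entries[:k]) + [merkle_root(entries[k:])]
    return inclusion_proof(index - k, entries[k:]) + [merkle_root(entries[:k])]


def _root_from_proof(index: int, size: int, leaf_hash: bytes, proof: list[bytes]) -> bytes:
    if size == 1:
        if proof:
            raise ValueError("proof too long")
        return leaf_hash
    if not proof:
        raise ValueError("proof too short")
    k = _split(size)
    if index < k:
        return _node(_root_from_proof(index, k, leaf_hash, proof[:-1]), proof[-1])
    return _node(proof[-1], _root_from_proof(index - k, size - k, leaf_hash, proof[:-1]))


def verify_inclusion(entry: bytes, index: int, size: int, proof: list[bytes], root: bytes) -> bool:
    """Recompute the root from the entry and its proof, then compare."""
    try:
        return _root_from_proof(index, size, _leaf(entry), proof) == root
    except ValueError:
        return False


if __name__ == "__main__":
    log = [f"attestation-{i}".encode() for i in range(5)]
    root = merkle_root(log)
    proof = inclusion_proof(3, log)
    print(verify_inclusion(log[3], 3, len(log), proof, root))
```

A verifier only needs the entry, its index, the tree size, the proof, and a trusted root, which is what makes the `{logIndex, logId, inclusionProof}` result useful offline. A production backend would persist nodes and avoid recomputing subtrees on every call.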
---

If you want, as a next step I can:

* Turn this into an AGENTS/TASKS/PROMPT set for your codex workers, or
* Produce concrete .NET 10 project skeletons (csproj layout, folder structure, and initial interfaces) for the core libraries and API service.