git.stella-ops.org/docs/product-advisories/02-Dec-2025 - Converting SBOM Data into Proof Chains.md
2025-12-09 20:23:50 +02:00


Here's a clean way to turn an SBOM into a verifiable supply-chain proof without extra fluff: use the CycloneDX per-component hashes (components[].hashes, plus metadata.component.hashes for the artifact the SBOM itself describes) as the cryptographic anchors, map each component@version to an in-toto subject, wrap the result in a DSSE envelope, record it in Rekor, and (optionally) attach or reference your VEX claims. This gives you a deterministic, end-to-end “SBOM → DSSE → Rekor → VEX” spine you can replay and audit anytime.


Why this works (quick background)

  • CycloneDX SBOM: lists components; each can carry hashes (SHA-256/SHA-512) in its hashes[] array (components[].hashes; metadata.component.hashes covers only the top-level component the SBOM describes).
  • in-toto: describes supply-chain steps; a “subject” is just a file/artifact + its digest(s).
  • DSSE: standard envelope to sign statements (like in-toto) without touching payload bytes.
  • Rekor (Sigstore): transparency log that provides append-only proofs (inclusion/consistency).
  • VEX: vulnerability status for components (affected/not affected, under investigation, fixed).

Minimal mapping

  1. From CycloneDX → subjects
  • For each component with a hash:

    • Subject name: pkg:<type>/<name>@<version> (or your canonical URI)
    • Subject digest(s): copy from the component's hashes[] array
  2. in-toto statement
{
  "_type": "https://in-toto.io/Statement/v1",
  "predicateType": "https://stella-ops.org/predicates/sbom-linkage/v1",
  "subject": [
    { "name": "pkg:npm/lodash@4.17.21",
      "digest": { "sha256": "…", "sha512": "…" } }
  ],
  "predicate": {
    "sbom": {
      "format": "CycloneDX",
      "version": "1.6",
      "sha256": "…sbom file hash…"
    },
    "generatedAt": "2025-12-01T00:00:00Z",
    "generator": "StellaOps.Sbomer/1.0"
  }
}
  3. Wrap in DSSE
  • Create a DSSE envelope with the statement as payload.
  • Sign with your org key (or keyless Sigstore if online; for air-gapped environments, use your offline CA/PKCS#11).
  4. Log to Rekor
  • Submit the DSSE envelope to Rekor; store the returned logIndex, UUID, and inclusion proof.
  • In offline/air-gap kits, mirror to your own Rekor instance and sync later.
  5. Link VEX
  • For each component subject, attach a VEX item (same subject name + digest) or store a pointer:
"predicate": {
  "vex": [
    { "subject": "pkg:npm/lodash@4.17.21",
      "digest": { "sha256": "…" },
      "vulnerability": "CVE-XXXX-YYYY",
      "status": "not_affected",
      "justification": "component_not_present",
      "timestamp": "2025-12-01T00:00:00Z" }
  ]
}
  • You can keep VEX in a separate DSSE/in-toto document; cross-reference by subject digest.

Deterministic replay recipe (Stella Ops-style)

  • Input: CycloneDX file + deterministic hashing rules.

  • Process:

    1. Normalize SBOM (stable sort keys, strip volatile fields).
    2. Extract each component's hashes (components[].hashes); fail the build if any are missing.
    3. Emit the in-toto statement with sorted subjects.
    4. DSSE-sign with a fixed algorithm (e.g., SHA-256 + Ed25519) and a pinned key id.
    5. Log to Rekor; record the logIndex in your store.
    6. Emit VEX statements keyed by the same subject digests.
  • Output: (SBOM hash, DSSE envelope, Rekor proofs, VEX docs), all content-addressed.


Quick C# sketch (.NET 10) to build subjects

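using System;
using System.Collections.Generic;

// CycloneDxSbom is an assumed model type: Components mirrors the SBOM's components[] array,
// and each component exposes Purl, Name, Version and Hashes.
IEnumerable<Subject> ToSubjects(CycloneDxSbom sbom)
{
    foreach (var c in sbom.Components)
    {
        // No hashes, no anchor: skipped here and rejected by the no-hash gate.
        if (c.Hashes == null || c.Hashes.Count == 0) continue;

        // Prefer the SBOM's purl. The CycloneDX "type" field (library, application, ...)
        // is not the purl ecosystem type (npm, nuget, ...), so it cannot build a purl.
        var name = c.Purl ?? "component:" + Uri.EscapeDataString($"{c.Name}@{c.Version}");

        // Sorted keys keep the output deterministic; "SHA-256" becomes the digest key "sha256".
        var dig = new SortedDictionary<string, string>(StringComparer.Ordinal);
        foreach (var h in c.Hashes)
            dig[h.Algorithm.Replace("-", "").ToLowerInvariant()] = h.Value.ToLowerInvariant();

        yield return new Subject(name, dig);
    }
}

public record Subject(string Name, IReadOnlyDictionary<string, string> Digest);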

Validation gates you'll want

  • No hash = no ship: reject SBOM components without strong digests.
  • Stable ordering: sort subjects and digests before signing.
  • Key policy: pin algorithm + key id; rotate on a schedule; record KMS path.
  • Proof check: verify Rekor inclusion on CI and during runtime attestation.
  • VEX parity: every shipped subject must have a VEX stance (even “unknown/under investigation”).

Where this helps you

  • Audits: one click from running container → component digest → Rekor proof → VEX decision.
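  • Air-gap: DSSE + a local Rekor mirror keeps everything verifiable offline and syncs later.
  • Determinism: the same inputs always produce byte-identical statements and DSSE payloads; only signature bytes (with randomized schemes such as ECDSA) and log positions can vary.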

If you want, I can turn this into a drop-in StellaOps.Sbomer → Vexer guideline (with schema files, DSSE signing helper, and a Rekor client wrapper) tailored to your .NET 10 repos. Below is a compact but complete guideline you can hand directly to Stella Ops devs.


Stella Ops Developer Guidelines

Converting SBOM Data into Proof Chains

1. Objective

Define how Stella Ops components (Sbomer, Authority, Vexer, Proof Graph, Rekor bridge) convert raw SBOM data (CycloneDX / SPDX) into cryptographically verifiable proof chains:

Artifact/Image → SBOM → in-toto Statement → DSSE Envelope → Rekor Entry → VEX Attestations → Proof-of-Integrity Graph.

This must be:

  • Deterministic (replayable).
  • Content-addressed (hashes everywhere).
  • Offline-capable (air-gapped), with later synchronization.
  • Crypto-sovereign (pluggable crypto backends, including PQC later).

2. Responsibilities by Service

StellaOps.Sbomer

  • Ingest SBOMs (CycloneDX 1.6, SPDX 3.x).
  • Canonicalize and hash SBOM.
  • Extract component subjects from SBOM.
  • Build in-toto Statement for “sbom-linkage”.
  • Call Authority to DSSE-sign Statement.
  • Hand signed envelopes to Rekor bridge + Proof Graph.

StellaOps.Authority

  • Abstract cryptography (sign/verify, hash, key resolution).
  • Support multiple profiles (default: FIPS-style SHA-256 + Ed25519/ECDSA; future: GOST/SM/eIDAS/PQC).
  • Enforce key policies (which key for which tenant/realm).

StellaOps.RekorBridge (could be a sub-package of Authority or a separate microservice)

  • Log DSSE envelopes to Rekor (or local Rekor-compatible ledger).
  • Handle offline queuing and later sync.
  • Return stable Rekor metadata: logIndex, logId, inclusionProof.

StellaOps.Vexer (Excitors)

  • Produce VEX statements that reference the same subjects as the SBOM proof chain.
  • DSSE-sign VEX statements via Authority.
  • Optionally log VEX DSSE envelopes to Rekor using the same bridge.
  • Never run lattice logic here (per your rule); only attach VEX and preserve provenance.

StellaOps.ProofGraph

  • Persist the full chain:

    • Artifacts, SBOM docs, in-toto Statements, DSSE envelopes, Rekor entries, VEX docs.
  • Expose graph APIs for Scanner / runtime agents:

    • “Show me proof for this container/image/binary.”

3. High-Level Flow

For each scanned artifact (e.g., container image):

  1. SBOM ingestion (Sbomer)

    • Accept SBOM file/stream (CycloneDX/SPDX).
    • Normalize & hash the SBOM document.
  2. Subject extraction (Sbomer)

    • Derive a stable list of subjects[] from SBOM components (name + digests).
  3. Statement construction (Sbomer)

    • Build in-toto Statement with predicateType = "https://stella-ops.org/predicates/sbom-linkage/v1".
  4. DSSE signing (Authority)

    • Wrap Statement as DSSE envelope.
    • Sign with the appropriate org/tenant key.
  5. Rekor logging (RekorBridge)

    • Submit DSSE envelope to Rekor.
    • Store log metadata & proofs.
  6. VEX linkage (Vexer)

    • For each subject, optionally emit VEX statements (status: affected/not_affected/etc.).
    • DSSE-sign and log VEX to Rekor (same pattern).
  7. Proof-of-Integrity Graph (ProofGraph)

    • Insert nodes & edges to represent the whole chain, content-addressed by hash.

4. Canonicalizing and Hashing SBOMs (Sbomer)

4.1 Supported formats

  • MUST support:

    • CycloneDX JSON 1.4+ (target 1.6).
    • SPDX 3.x JSON.
  • MUST map both formats into a common internal SbomDocument model.

4.2 Canonicalization rules

All hashes used as identifiers MUST be computed over canonical form:

  • For JSON SBOMs:

    • Remove insignificant whitespace.

    • Sort object keys lexicographically.

    • For arrays where order is not semantically meaningful (e.g., components), sort deterministically (e.g., by bom-ref or purl).

    • Strip volatile fields if present:

      • Timestamps (generation time).
      • Tool build IDs.
      • Non-deterministic UUIDs.
  • For other formats (if ever accepted):

    • Convert to internal JSON representation first, then canonicalize JSON.

Example C# signature:

public interface ISbomCanonicalizer
{
    byte[] Canonicalize(ReadOnlySpan<byte> rawSbom, string mediaType);
}

public interface IBlobHasher
{
    string ComputeSha256Hex(ReadOnlySpan<byte> data);
}

Contract: same input bytes → same canonical bytes → same sha256 → replayable.
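
To make the contract concrete, here is a minimal sketch of the JSON part of a canonicalizer, using only System.Text.Json. It covers whitespace removal and recursive key sorting; stripping volatile fields and sorting order-insensitive arrays are deliberately left out, and the class name JsonCanonicalizer is hypothetical rather than an existing Stella Ops type.

```csharp
using System;
using System.Buffers;
using System.Linq;
using System.Text.Json;

public static class JsonCanonicalizer
{
    // Re-serializes JSON with no insignificant whitespace and ordinally sorted object keys.
    public static byte[] Canonicalize(ReadOnlySpan<byte> rawJson)
    {
        using var doc = JsonDocument.Parse(rawJson.ToArray());
        var buffer = new ArrayBufferWriter<byte>();
        // The default writer options are non-indented, which drops whitespace.
        using (var writer = new Utf8JsonWriter(buffer))
        {
            Write(doc.RootElement, writer);
        }
        return buffer.WrittenSpan.ToArray();
    }

    private static void Write(JsonElement element, Utf8JsonWriter writer)
    {
        switch (element.ValueKind)
        {
            case JsonValueKind.Object:
                writer.WriteStartObject();
                // Ordinal comparison avoids culture-dependent key ordering.
                foreach (var property in element.EnumerateObject()
                             .OrderBy(p => p.Name, StringComparer.Ordinal))
                {
                    writer.WritePropertyName(property.Name);
                    Write(property.Value, writer);
                }
                writer.WriteEndObject();
                break;
            case JsonValueKind.Array:
                // Array order is preserved; semantic sorting (e.g. by bom-ref) is out of scope here.
                writer.WriteStartArray();
                foreach (var item in element.EnumerateArray())
                {
                    Write(item, writer);
                }
                writer.WriteEndArray();
                break;
            default:
                // Strings, numbers, booleans and null are copied through unchanged.
                element.WriteTo(writer);
                break;
        }
    }
}
```

Two inputs that differ only in whitespace and key order should yield the same bytes, which is what a golden-fixture test for this contract would check.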

4.3 SBOM identity

Define SBOM identity as:

sbomId = sha256(canonicalSbomBytes)

Store:

  • SbomId (hex string).
  • MediaType (e.g., application/vnd.cyclonedx+json).
  • SpecVersion.
  • Optional Source (file path, OCI label, etc.).
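
As a sketch of how sbomId could be derived, the helper below hashes the canonical bytes with SHA-256 and returns lowercase hex. BlobHasher and SbomIdentity are illustrative names; the record simply mirrors the stored fields listed above.

```csharp
#nullable enable
using System;
using System.Security.Cryptography;

public static class BlobHasher
{
    // Lowercase hex keeps identifiers comparable with a plain ordinal string compare.
    public static string ComputeSha256Hex(ReadOnlySpan<byte> data) =>
        Convert.ToHexString(SHA256.HashData(data)).ToLowerInvariant();
}

// Mirrors the stored fields; Source stays optional.
public sealed record SbomIdentity(string SbomId, string MediaType, string SpecVersion, string? Source = null);
```

The important design point is that the input is the canonical form, never the raw upload, so a re-ingested SBOM with different whitespace resolves to the same SbomId.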

5. Extracting Subjects from SBOM Components

5.1 Subject schema

Internal model:

public sealed record ProofSubject(
    string Name,                       // e.g. "pkg:npm/lodash@4.17.21"
    IReadOnlyDictionary<string,string> Digest  // e.g. { ["sha256"] = "..." }
);

5.2 Name rules

  • Prefer PURL when present.

    • Name = purl exactly as in SBOM.
  • Fallback per ecosystem:

    • npm: pkg:npm/{name}@{version}
    • NuGet/.NET: pkg:nuget/{name}@{version}
    • Maven: pkg:maven/{groupId}/{artifactId}@{version}
    • OS packages (rpm/deb/apk): appropriate purl.
  • If nothing else is available:

    • Name = "component:" + UrlEncode(componentName + "@" + version).
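
The precedence above can be sketched as a single resolver. The ecosystem and group parameters are assumptions about what the internal SbomDocument model exposes, and OS packages are omitted because their purl needs distro qualifiers that this sketch does not model.

```csharp
#nullable enable
using System;

public static class SubjectNaming
{
    public static string Resolve(string? purl, string? ecosystem, string name, string version, string? group = null)
    {
        // Rule 1: a purl from the SBOM is used verbatim.
        if (!string.IsNullOrWhiteSpace(purl)) return purl;

        // Rule 2: per-ecosystem fallback; rule 3: opaque "component:" name.
        return ecosystem switch
        {
            "npm" => $"pkg:npm/{name}@{version}",
            "nuget" => $"pkg:nuget/{name}@{version}",
            "maven" when !string.IsNullOrEmpty(group) => $"pkg:maven/{group}/{name}@{version}",
            _ => "component:" + Uri.EscapeDataString(name + "@" + version),
        };
    }
}
```

Names that contain purl-reserved characters (scoped npm packages, for example) need purl percent-encoding before they go through the fallback templates; the sketch assumes plain names.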

5.3 Digest rules

  • Consume all strong digests provided (CycloneDX hashes[], SPDX checksums).

  • Normalize algorithm keys:

    • Lowercase (e.g., sha256, sha512).
    • For SHA-1, still capture it but mark as weak in predicate metadata.
  • MUST have at least one of:

    • sha256
    • sha512
  • If no strong digest exists, the component:

    • MUST NOT be used as a primary subject in the proof chain.
    • MAY be logged in an “incomplete_subjects” block inside the predicate for diagnostics.
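
A possible shape for these rules in code: normalize algorithm keys, then decide whether a component qualifies as a primary subject. The "sha-" prefix rewrite is an assumption about how input is spelled (CycloneDX writes "SHA-256"); other algorithm names are only lowercased.

```csharp
using System;
using System.Collections.Generic;

public static class DigestRules
{
    private static readonly HashSet<string> Strong =
        new HashSet<string>(StringComparer.Ordinal) { "sha256", "sha512" };

    // Lowercases keys and values and rewrites "sha-256" style names to "sha256".
    public static SortedDictionary<string, string> Normalize(IEnumerable<KeyValuePair<string, string>> hashes)
    {
        var result = new SortedDictionary<string, string>(StringComparer.Ordinal);
        foreach (var pair in hashes)
        {
            var key = pair.Key.Trim().ToLowerInvariant();
            if (key.StartsWith("sha-", StringComparison.Ordinal)) key = "sha" + key.Substring(4);
            result[key] = pair.Value.Trim().ToLowerInvariant();
        }
        return result;
    }

    // True when at least one strong digest (sha256 or sha512) is present.
    public static bool IsPrimarySubject(IReadOnlyDictionary<string, string> digest)
    {
        foreach (var key in digest.Keys)
        {
            if (Strong.Contains(key)) return true;
        }
        return false;
    }

    // SHA-1 is captured but should be flagged as weak in predicate metadata.
    public static bool HasWeakDigest(IReadOnlyDictionary<string, string> digest) =>
        digest.ContainsKey("sha1");
}
```

Components for which IsPrimarySubject returns false would go to the incomplete-subjects block rather than the subject list.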

5.4 Deterministic ordering

  • Sort subjects by:

    1. Name ascending.
    2. Then by lexicographic concat of algorithm:value pairs.

This ordering must be applied before building the in-toto Statement.
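
One way to implement this ordering is sketched below. ProofSubject is repeated so the block stands alone, and the "," separator between algorithm:value pairs is an assumption, since the rule above does not name one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record ProofSubject(string Name, IReadOnlyDictionary<string, string> Digest);

public static class SubjectOrdering
{
    // Concatenates "algorithm:value" pairs in ordinal key order.
    public static string DigestKey(ProofSubject subject) =>
        string.Join(",", subject.Digest
            .OrderBy(kv => kv.Key, StringComparer.Ordinal)
            .Select(kv => kv.Key + ":" + kv.Value));

    // Sorts by name first, then by the concatenated digest key.
    public static List<ProofSubject> Sort(IEnumerable<ProofSubject> subjects) =>
        subjects
            .OrderBy(s => s.Name, StringComparer.Ordinal)
            .ThenBy(s => DigestKey(s), StringComparer.Ordinal)
            .ToList();
}
```

Ordinal comparison is used throughout so that the result does not depend on the build agent's culture settings.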


6. Building the in-toto Statement (Sbomer)

6.1 Statement shape

Use the generic in-toto v1 Statement:

{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [ /* from SBOM subjects */ ],
  "predicateType": "https://stella-ops.org/predicates/sbom-linkage/v1",
  "predicate": {
    "sbom": {
      "id": "<sbomId hex>",
      "format": "CycloneDX",
      "specVersion": "1.6",
      "mediaType": "application/vnd.cyclonedx+json",
      "sha256": "<sha256 of canonicalSbomBytes>",
      "location": "oci://… or file://…"
    },
    "generator": {
      "name": "StellaOps.Sbomer",
      "version": "x.y.z"
    },
    "generatedAt": "2025-12-09T10:37:42Z",
    "incompleteSubjects": [ /* optional, see 5.3 */ ],
    "tags": {
      "tenantId": "…",
      "projectId": "…",
      "pipelineRunId": "…"
    }
  }
}

6.2 Implementation rules

  • All dictionary keys in the final JSON MUST be sorted.
  • Use UTC ISO-8601 for timestamps.
  • tags is an extensible string map; do not put secrets here.
  • The Statement payload given to DSSE MUST be the canonical JSON (same key order each time).

C# sketch:

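// Supporting shapes follow the JSON in 6.1. IncompleteSubject's fields are an assumption,
// since 5.3 does not fix them.
public record SbomDescriptor(string Id, string Format, string SpecVersion, string MediaType, string Sha256, string? Location);
public record GeneratorDescriptor(string Name, string Version);
public record IncompleteSubject(string Name, string Reason);

public record SbomLinkagePredicate(
    SbomDescriptor Sbom,
    GeneratorDescriptor Generator,
    DateTimeOffset GeneratedAt,
    IReadOnlyList<IncompleteSubject>? IncompleteSubjects,
    IReadOnlyDictionary<string,string>? Tags
);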
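
The implementation rules in 6.2 can be sketched by building the Statement from sorted dictionaries, so key order falls out of the data structure instead of the serializer. The predicate is trimmed to a few fields for brevity, and SbomLinkageStatement is a hypothetical helper, not an existing API.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;

public static class SbomLinkageStatement
{
    public static byte[] Build(
        IReadOnlyList<(string Name, SortedDictionary<string, string> Digest)> sortedSubjects,
        string sbomId,
        DateTimeOffset generatedAt)
    {
        var subjects = new List<object>();
        foreach (var (name, digest) in sortedSubjects)
        {
            subjects.Add(new SortedDictionary<string, object>(StringComparer.Ordinal)
            {
                ["digest"] = digest,
                ["name"] = name,
            });
        }

        var statement = new SortedDictionary<string, object>(StringComparer.Ordinal)
        {
            ["_type"] = "https://in-toto.io/Statement/v1",
            ["predicate"] = new SortedDictionary<string, object>(StringComparer.Ordinal)
            {
                // Always UTC, second precision, fixed format.
                ["generatedAt"] = generatedAt.UtcDateTime.ToString(
                    "yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture),
                ["sbom"] = new SortedDictionary<string, object>(StringComparer.Ordinal)
                {
                    ["id"] = sbomId,
                    ["sha256"] = sbomId,
                },
            },
            ["predicateType"] = "https://stella-ops.org/predicates/sbom-linkage/v1",
            ["subject"] = subjects,
        };

        // Compact output; keys are emitted in the dictionaries' sorted order.
        return JsonSerializer.SerializeToUtf8Bytes(statement);
    }
}
```

Note that generatedAt is an input here: a replay only reproduces the same bytes if the caller supplies the same timestamp, so it is worth deciding where that value is pinned.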

7. DSSE Signing (Authority)

7.1 Abstraction

All signing MUST run through Authority; no direct crypto calls from Sbomer/Vexer.

public interface IDsseSigner
{
    Task<DsseEnvelope> SignAsync(
        ReadOnlyMemory<byte> payload,
        string payloadType,   // always "application/vnd.in-toto+json"
        string keyProfile,    // e.g. "default", "gov-bg", "pqc-lab"
        CancellationToken ct = default);
}

7.2 DSSE rules

  • payloadType fixed: "application/vnd.in-toto+json".

  • signatures[]:

    • At least one signature.

    • Each signature MUST carry:

      • keyid (stable identifier within Authority).
      • sig (base64).
      • Optional cert if X.509 is used (but not required to be in the hashed payload).
  • Crypto profile:

    • Default: SHA-256 + Ed25519/ECDSA (configurable).
    • Key resolution must be config-driven per tenant/realm.

7.3 Determinism

  • DSSE envelope JSON MUST also be canonical when hashed or sent to Rekor.
  • Signature bytes may differ across runs when the scheme is randomized (ECDSA without deterministic RFC 6979 nonces; Ed25519 signatures are deterministic), but the payload hash and Statement hash MUST remain stable.
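
For reference, DSSE does not sign the payload bytes directly. It signs the pre-authentication encoding (PAE) of the payload type and payload, as defined in the DSSE specification. A minimal sketch of that encoding follows; the class name Dsse is illustrative, and the actual signing call stays behind Authority.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

public static class Dsse
{
    // PAE(type, body) = "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body
    // where LEN is the byte length as ASCII decimal with no leading zeros.
    public static byte[] Pae(string payloadType, ReadOnlySpan<byte> payload)
    {
        var type = Encoding.UTF8.GetBytes(payloadType);
        var head = Encoding.ASCII.GetBytes(
            "DSSEv1 " + type.Length.ToString(CultureInfo.InvariantCulture) + " ");
        var mid = Encoding.ASCII.GetBytes(
            " " + payload.Length.ToString(CultureInfo.InvariantCulture) + " ");

        using var stream = new MemoryStream(head.Length + type.Length + mid.Length + payload.Length);
        stream.Write(head, 0, head.Length);
        stream.Write(type, 0, type.Length);
        stream.Write(mid, 0, mid.Length);
        stream.Write(payload);
        return stream.ToArray();
    }
}
```

Because the PAE depends only on payloadType and the canonical Statement bytes, it is stable across runs even when the resulting signature is not.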

8. Rekor Logging (RekorBridge)

8.1 When to log

  • Every SBOM linkage DSSE envelope SHOULD be logged to a Rekor-compatible transparency log.

  • In air-gapped mode:

    • Enqueue entries in a local store.
    • Tag them with a “pending” status and sync log later.

8.2 Entry type

Use Rekor's DSSE/in-toto entry kind. The exact spec is an implementation detail, but as guidelines:

  • Entry contains:

    • DSSE envelope.
    • apiVersion / kind fields required by Rekor.
  • On success, Rekor returns:

    • logIndex
    • logId
    • integratedTime
    • inclusionProof (Merkle proof).
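
To show what checking the returned inclusionProof involves, here is a sketch of the Merkle audit-path verification from RFC 6962 / RFC 9162 (SHA-256, 0x00 leaf prefix, 0x01 node prefix). It assumes the log uses that hashing scheme and that the caller has already decoded the proof's hashes, leaf index, tree size, and root hash; verifying the signed checkpoint that vouches for the root is a separate step not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

public static class MerkleProof
{
    public static byte[] LeafHash(byte[] leaf) => Hash(0x00, leaf, Array.Empty<byte>());

    public static byte[] NodeHash(byte[] left, byte[] right) => Hash(0x01, left, right);

    private static byte[] Hash(byte prefix, byte[] a, byte[] b)
    {
        var buffer = new byte[1 + a.Length + b.Length];
        buffer[0] = prefix;
        a.CopyTo(buffer, 1);
        b.CopyTo(buffer, 1 + a.Length);
        return SHA256.HashData(buffer);
    }

    // Recomputes the root from the leaf hash and audit path, then compares it to rootHash.
    public static bool VerifyInclusion(
        long leafIndex, long treeSize, byte[] leafHash, IReadOnlyList<byte[]> path, byte[] rootHash)
    {
        if (leafIndex < 0 || leafIndex >= treeSize) return false;

        long fn = leafIndex;
        long sn = treeSize - 1;
        var r = leafHash;

        foreach (var p in path)
        {
            if (sn == 0) return false; // path is longer than the tree allows
            if ((fn & 1) == 1 || fn == sn)
            {
                r = NodeHash(p, r);
                // Skip levels where this node has no right sibling.
                while ((fn & 1) == 0 && fn != 0)
                {
                    fn >>= 1;
                    sn >>= 1;
                }
            }
            else
            {
                r = NodeHash(r, p);
            }
            fn >>= 1;
            sn >>= 1;
        }

        return sn == 0 && r.AsSpan().SequenceEqual(rootHash);
    }
}
```

Running this check in CI and again at runtime attestation keeps the proof independent of the log operator's word.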

8.3 Data persisted back into ProofGraph

For each DSSE envelope:

  • Store:
{
  "dsseSha256": "<sha256 of canonical dsse envelope>",
  "rekor": {
    "logIndex": 12345,
    "logId": "…",
    "integratedTime": 1733736000,
    "inclusionProof": { /* Merkle path */ }
  }
}
  • Link this Rekor entry node to the DSSE envelope node with a LOGGED_IN edge.

9. VEX Linkage (Vexer)

9.1 Core rule

VEX subjects MUST align with SBOM proof subjects:

  • Same name value.
  • Same digest set (sha256 at minimum).
  • If VEX is created later (e.g., days after SBOM), they still link through the subject digests.

9.2 VEX statement

StellaOps VEX may be its own predicateType, e.g.:

{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [
    { "name": "pkg:npm/lodash@4.17.21",
      "digest": { "sha256": "…" } }
  ],
  "predicateType": "https://stella-ops.org/predicates/vex/v1",
  "predicate": {
    "vulnerabilities": [
      {
        "id": "CVE-2024-XXXX",
        "status": "not_affected",
        "justification": "component_not_present",
        "timestamp": "2025-12-09T10:40:00Z",
        "details": "…"
      }
    ]
  }
}

Then:

  1. Canonicalize JSON.
  2. DSSE-sign via Authority.
  3. Optionally log DSSE envelope to Rekor.
  4. Insert into ProofGraph with HAS_VEX relationships from subject → VEX node.

9.3 Non-functional

  • Vexer must not run lattice algorithms; Scanner's policy engine consumes these VEX proofs.
  • Vexer MUST be idempotent when re-emitting VEX for the same (subject, CVE, status) tuple.
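
One way to get that idempotence is to derive a stable key from the tuple and use it for de-duplication. The sketch below is an assumption about how such a key could look; the field separator and case normalization are illustrative choices, not part of any VEX specification.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class VexIdempotency
{
    // Stable key for (subject, CVE, status). Assumes no field contains a newline.
    public static string Key(string subjectName, string subjectSha256, string vulnerabilityId, string status)
    {
        var material = string.Join("\n",
            subjectName,
            subjectSha256.ToLowerInvariant(),
            vulnerabilityId.ToUpperInvariant(),
            status.ToLowerInvariant());
        return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(material))).ToLowerInvariant();
    }
}
```

Re-emitting VEX for a tuple that already has this key can then be treated as a no-op instead of producing a second statement.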

10. Proof-of-Integrity Graph (ProofGraph)

10.1 Node types (suggested)

  • Artifact (container image, binary, Helm chart, etc.).
  • SbomDocument (by sbomId).
  • InTotoStatement (by statement hash).
  • DsseEnvelope.
  • RekorEntry.
  • VexStatement.

10.2 Edge types

  • DESCRIBED_BY: Artifact → SbomDocument.
  • ATTESTED_BY: SbomDocument → InTotoStatement.
  • WRAPPED_BY: InTotoStatement → DsseEnvelope.
  • LOGGED_IN: DsseEnvelope → RekorEntry.
  • HAS_VEX: Artifact/Subject → VexStatement.
  • Optionally CONTAINS_SUBJECT: InTotoStatement → Subject nodes if you materialise them.

10.3 Identifiers

  • All nodes MUST be addressable by a content hash:

    • ArtifactId = hash of image manifest or binary.
    • SbomId = hash of canonical SBOM.
    • StatementId = hash of canonical in-toto JSON.
    • DsseId = hash of canonical DSSE JSON.
    • VexId = hash of canonical VEX Statement JSON.

Idempotence rule: inserting the same chain twice must result in the same nodes, not duplicates.
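
The idempotence rule falls out naturally when node identity is the content hash. A toy in-memory sketch follows (InMemoryProofGraph is hypothetical; a real ProofGraph would sit on a database with unique constraints on the same keys):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class InMemoryProofGraph
{
    // Node id (content hash) -> node kind, e.g. "SbomDocument".
    private readonly Dictionary<string, string> _nodes = new(StringComparer.Ordinal);
    private readonly HashSet<(string From, string Edge, string To)> _edges = new();

    public int NodeCount => _nodes.Count;
    public int EdgeCount => _edges.Count;

    // Returns false, and changes nothing, when the node already exists.
    public bool AddNode(string contentHash, string kind) => _nodes.TryAdd(contentHash, kind);

    // Returns false, and changes nothing, when the edge already exists.
    public bool AddEdge(string from, string edge, string to) => _edges.Add((from, edge, to));

    // Targets reachable from a node over one edge type, in stable order.
    public IReadOnlyList<string> Follow(string from, string edge) =>
        _edges.Where(e => e.From == from && e.Edge == edge)
              .Select(e => e.To)
              .OrderBy(id => id, StringComparer.Ordinal)
              .ToList();
}
```

Inserting the same chain twice leaves NodeCount and EdgeCount unchanged, which is a convenient assertion for a golden test.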


11. Error Handling & Policy Gates

11.1 Ingestion failures

  • If SBOM is missing or invalid:

    • Mark the artifact as “unproven” in the graph.
    • Raise a policy event so Scanner/CI can enforce “no SBOM, no ship” if configured.

11.2 Missing digests

  • If a component lacks sha256/sha512:

    • Log as incomplete subject.
    • Expose in predicate and UI as “unverifiable component (not anchored to proof chain)”.

11.3 Rekor failures

  • If Rekor is unavailable:

    • Still store DSSE envelope locally.
    • Queue for retry.
    • Proof chain is internal-only until Rekor sync succeeds; flag accordingly (rekorStatus: "pending").
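
The store-locally, queue, and flag behaviour could look like the outbox below. It is an in-memory sketch under the assumption that submission is a callback returning success or failure; RekorOutbox and RekorStatus are illustrative names, and a real implementation would persist the queue.

```csharp
using System;
using System.Collections.Generic;

public enum RekorStatus { Pending, Logged }

public sealed class RekorOutbox
{
    private readonly Dictionary<string, RekorStatus> _status = new(StringComparer.Ordinal);
    private readonly Queue<string> _pending = new();

    // Registers an envelope (by dsseSha256) as pending; duplicates are ignored.
    public void Enqueue(string dsseSha256)
    {
        if (_status.TryAdd(dsseSha256, RekorStatus.Pending)) _pending.Enqueue(dsseSha256);
    }

    public RekorStatus StatusOf(string dsseSha256) => _status[dsseSha256];

    // Tries each pending entry once; failures go back to the end of the queue.
    public int Flush(Func<string, bool> trySubmit)
    {
        int attempts = _pending.Count;
        int logged = 0;
        for (int i = 0; i < attempts; i++)
        {
            var id = _pending.Dequeue();
            if (trySubmit(id))
            {
                _status[id] = RekorStatus.Logged;
                logged++;
            }
            else
            {
                _pending.Enqueue(id);
            }
        }
        return logged;
    }
}
```

The same structure serves air-gapped mode: Flush is simply not called until a sync window opens.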

12. Definition of Done for Dev Work

Any feature that “converts SBOMs into proof chains” is only done when:

  1. Canonicalization

    • Given the same SBOM file, multiple runs produce identical:

      • sbomId
      • Statement JSON bytes
      • DSSE payload bytes (before signing)
  2. Subject extraction

    • All strong-digest components appear as subjects.
    • Deterministic ordering is tested with golden fixtures.
  3. DSSE + Rekor

    • DSSE envelopes verifiable with Authority key material.
    • Rekor entry present (or in offline queue) for each envelope.
    • Rekor metadata linked in ProofGraph.
  4. VEX integration

    • VEX for a subject is discoverable via the same subject in graph queries.
    • Scanner can prove: “this vulnerability is (not_)affected because of VEX X”.
  5. Graph query

    • From a running container/image, you can traverse:

      • Artifact → SBOM → Statement → DSSE → Rekor → VEX in a single query.

If you want, next step I can do a concrete .cs layout (interfaces + record types + one golden test fixture) specifically for StellaOps.Sbomer and StellaOps.ProofGraph, so you can drop it straight into your .NET 10 solution.