# SLSA Source Track Capture (SC3)

Status: Draft · Date: 2025-12-03
Scope: Define deterministic capture of SLSA Source Track data for replay bundles and CycloneDX 1.7 + CBOM exports. Aligns Scanner record/replay with provenance signals (build-id, repo/ref, provenance DSSE).

## Objectives
- Persist source provenance required by SLSA 1.2 Source Track: repo URI, resolved ref, digest of checked-out tree, invocation hash, builder ID, and reproducible build inputs.
- Make data replayable offline: no network fetch; hashes + DSSE envelope paths must resolve locally.
- Keep ordering/hashes deterministic: canonical JSON (sorted keys), BLAKE3-256 primary hash, SHA-256 secondary.

## Minimal fields (per build)
- `source.repo`: canonical URI (https, ssh); normalized to lower-case host; trailing slash stripped.
- `source.ref`: fully qualified ref (`refs/heads/main`, tag, or immutable commit).
- `source.commit`: 40-hex commit digest.
- `source.treeHash`: BLAKE3-256 of source tree snapshot (stable archive); optional SHA-256 mirror.
- `build.invocation.hash`: BLAKE3-256 of normalized invocation (args/env/tool versions); also store `build.invocation.dsse` hash when signed.
- `builder.id`: URI for builder identity (SLSA-style).
- `provenance.dsse`: SHA-256 of DSSE envelope for provenance statement (e.g., in-toto SLSA provenance v1.0). Optionally include BLAKE3 and CAS URI.

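
The normalization rules for `source.repo` and the shape check for `source.commit` can be sketched as below. This is a minimal illustration, not the Scanner implementation: `normalize_repo` and `is_commit` are hypothetical helper names, and rejecting scp-style remotes (`git@host:path`), dropping URI passwords, and accepting only lower-case hex are assumptions this document does not settle.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumption: commits are recorded as 40 lower-case hex characters.
COMMIT_RE = re.compile(r"[0-9a-f]{40}")


def normalize_repo(uri: str) -> str:
    """Lower-case the host and strip trailing slashes from the path."""
    parts = urlsplit(uri)
    if parts.scheme not in ("https", "ssh"):
        raise ValueError(f"unsupported scheme in repo URI: {uri!r}")
    host = parts.hostname  # urlsplit already lower-cases the hostname
    if not host:
        raise ValueError(f"repo URI has no host: {uri!r}")
    netloc = host
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    if parts.username:
        # Keep the user (e.g. "git@"), but never persist a password.
        netloc = f"{parts.username}@{netloc}"
    path = parts.path.rstrip("/")  # path case is preserved
    # Query and fragment are dropped (assumption: not part of the canonical URI).
    return urlunsplit((parts.scheme, netloc, path, "", ""))


def is_commit(value: str) -> bool:
    """True when value looks like a 40-hex commit digest."""
    return COMMIT_RE.fullmatch(value) is not None


if __name__ == "__main__":
    print(normalize_repo("https://Example.INVALID/demo/"))
    print(is_commit("a" * 40))
```

Only the host is lower-cased; the path keeps its case because many forges treat repository paths as case-sensitive.
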
## JSON shape (suggested)
```json
{
  "source": {
    "repo": "https://example.invalid/demo",
    "ref": "refs/tags/v1.0.0",
    "commit": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "treeHash": "b3:1111...",
    "builderId": "https://builder.stellaops.local/scanner",
    "invocationHash": "b3:2222...",
    "invocationDsse": "sha256:3333...",
    "provenance": {
      "dsse": "sha256:4444...",
      "cas": "cas://provenance/demo/v1.0.0.dsse"
    }
  }
}
```

## Where to store
- CycloneDX 1.7 + CBOM: encode under `metadata.properties` using namespaced keys:
  - `source.repo`, `source.ref`, `source.commit`, `source.tree.hash`, `builder.id`, `build.invocation.hash`, `build.invocation.dsse`, `provenance.dsse`, `provenance.cas`.
- Replay manifest: add `source` block mirroring the JSON shape above; include hashes in manifest subject list.
- CAS: store provenance DSSE envelope under `cas://provenance/{component}/{version}.dsse`; store tree snapshot tarball under `cas://source/{commit}.tar.gz`.

## Determinism rules
- Canonical JSON (lexicographic keys, UTF-8, no pretty-print) before hashing.
- Timestamps in provenance statements must be UTC `Z`; strip milliseconds unless non-zero.
- All hashes recorded with algorithm prefix (`b3:` for BLAKE3-256, `sha256:` for SHA-256).

## Verification
- Verifier MUST: (1) schema-check fields are present; (2) recompute `treeHash` from tree tarball; (3) recompute `build.invocation.hash` from normalized invocation file; (4) verify DSSE envelope hash matches `provenance.dsse` and signature keys; (5) ensure repo/ref/commit are consistent (ref→commit mapping known or provided in bundle).
- Fail closed on any mismatch; never fetch network.

## Fixtures
- `docs/modules/scanner/fixtures/cdx17-cbom/source-track.sample.json`: deterministic example with placeholder hashes.
- Future: add CAS tarball + invocation file under `tests/reachability/fixtures/source-track/` with recomputation script.

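
The determinism rules above (canonical JSON, UTC `Z` timestamps, algorithm-prefixed hashes) can be illustrated with the standard library alone. This is a sketch under stated assumptions: the helper names are hypothetical, `sort_keys=True` orders keys by code point (one reading of "lexicographic keys", not necessarily RFC 8785), sub-millisecond precision is truncated, and only the `sha256:` form is shown because BLAKE3 is not in the Python standard library.

```python
import hashlib
import json
from datetime import datetime, timezone


def canonical_json(obj) -> bytes:
    """Serialize with sorted keys, no insignificant whitespace, UTF-8."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def prefixed_sha256(data: bytes) -> str:
    """Return the digest with its algorithm prefix, e.g. 'sha256:ab12...'."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def normalize_timestamp(dt: datetime) -> str:
    """Render as UTC with a 'Z' suffix; keep milliseconds only if non-zero."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: timezone is required")
    dt = dt.astimezone(timezone.utc)
    base = dt.strftime("%Y-%m-%dT%H:%M:%S")
    millis = dt.microsecond // 1000  # assumption: truncate below 1 ms
    return f"{base}.{millis:03d}Z" if millis else f"{base}Z"


if __name__ == "__main__":
    # Key order in the input does not affect the canonical bytes or the hash.
    a = {"ref": "refs/tags/v1.0.0", "repo": "https://example.invalid/demo"}
    b = {"repo": "https://example.invalid/demo", "ref": "refs/tags/v1.0.0"}
    assert canonical_json(a) == canonical_json(b)
    print(prefixed_sha256(canonical_json(a)))
    print(normalize_timestamp(datetime(2025, 12, 3, 10, 0, tzinfo=timezone.utc)))
```

Hashing the canonical bytes rather than the on-disk file means two producers that pretty-print differently still arrive at the same digest.
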
## TODO (outside this doc)
- Implement `scripts/scanner/verify_source_track.py` to validate source-track blocks and CAS payloads offline.
- Extend replay manifest schema to include `source` block; add determinism tests in Scanner replay suite once manifest contract lands.
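
As a possible starting point for the offline verifier listed above, the sketch below covers two of the checks from the Verification section: required fields are present, and the DSSE envelope bytes hash to the recorded `provenance.dsse` value. It assumes the suggested JSON shape; `verify_source_block` and `VerificationError` are hypothetical names, and tree-hash recomputation, invocation-hash recomputation, signature verification, and ref/commit consistency are deliberately left out.

```python
import hashlib


class VerificationError(Exception):
    """Raised on any mismatch so that callers fail closed."""


# Field names follow the suggested JSON shape in this document.
REQUIRED_FIELDS = ("repo", "ref", "commit", "treeHash",
                   "builderId", "invocationHash", "provenance")


def verify_source_block(source: dict, dsse_envelope: bytes) -> None:
    """Check field presence and the DSSE envelope hash; no network access."""
    missing = [name for name in REQUIRED_FIELDS if name not in source]
    if missing:
        raise VerificationError(f"missing fields: {', '.join(missing)}")

    provenance = source["provenance"]
    if not isinstance(provenance, dict):
        raise VerificationError("provenance must be an object")

    expected = provenance.get("dsse")
    if not isinstance(expected, str) or not expected.startswith("sha256:"):
        raise VerificationError("provenance.dsse must be a sha256: digest")

    # Recompute from the locally stored envelope bytes (e.g. read from CAS).
    actual = "sha256:" + hashlib.sha256(dsse_envelope).hexdigest()
    if actual != expected:
        raise VerificationError("DSSE envelope hash mismatch")


if __name__ == "__main__":
    envelope = b'{"payloadType":"application/vnd.in-toto+json"}'
    block = {
        "repo": "https://example.invalid/demo",
        "ref": "refs/tags/v1.0.0",
        "commit": "a" * 40,
        "treeHash": "b3:1111...",
        "builderId": "https://builder.stellaops.local/scanner",
        "invocationHash": "b3:2222...",
        "provenance": {
            "dsse": "sha256:" + hashlib.sha256(envelope).hexdigest(),
        },
    }
    verify_source_block(block, envelope)  # passes silently
    print("source block verified")
```

Raising an exception for every failure path, instead of returning a boolean, keeps the fail-closed rule hard to bypass by accident.
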