# Binary Diff Attestation

## Overview

Binary Diff Attestation enables verification of binary-level changes between container images, producing cryptographically signed evidence of what changed at the ELF section level (PE and Mach-O support is on the roadmap). This capability is essential for:

- **Vendor backport detection**: Identify when a vendor has patched a binary without changing version numbers
- **Supply chain verification**: Prove that expected changes (and no unexpected changes) occurred between releases
- **VEX evidence generation**: Provide concrete evidence for "not_affected" or "fixed" vulnerability status claims
- **Audit trail**: Maintain verifiable records of binary modifications across deployments

### Relationship to SBOM and VEX

Binary diff attestations complement SBOM and VEX documents:

| Artifact | Purpose | Granularity |
|----------|---------|-------------|
| SBOM | Inventory of components | Package/library level |
| VEX | Exploitability status | Vulnerability level |
| Binary Diff Attestation | Change evidence | Section level (function level planned) |

The attestation provides the *evidence* that supports VEX claims. For example, a VEX statement claiming a CVE is "fixed" due to a vendor backport can reference the binary diff attestation showing that the `.text` section hash changed.
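The section-level comparison behind such evidence can be sketched as follows. This is a minimal Python illustration, not the project's implementation: the function name `section_deltas` and the `added`/`removed` statuses are assumptions made for the example, while `modified` and `identical` mirror the `sectionDeltas` values shown later in this document. The sample digests are computed over placeholder bytes rather than real ELF section contents.

```python
import hashlib
from typing import Dict, List


def section_deltas(base: Dict[str, str], target: Dict[str, str]) -> List[dict]:
    """Compare per-section SHA-256 hex digests of one binary in two images.

    `base` and `target` map section names (e.g. ".text") to lowercase hex
    digests. Output is sorted by section name so it is deterministic.
    """
    deltas = []
    for section in sorted(set(base) | set(target)):
        if section not in base:
            status = "added"      # hypothetical status, not in the source doc
        elif section not in target:
            status = "removed"    # hypothetical status, not in the source doc
        elif base[section] == target[section]:
            status = "identical"
        else:
            status = "modified"
        deltas.append({"section": section, "status": status})
    return deltas


def sha256_hex(data: bytes) -> str:
    """Lowercase hexadecimal SHA-256 digest with no prefix."""
    return hashlib.sha256(data).hexdigest()


# Placeholder bytes stand in for real section contents.
base_hashes = {
    ".text": sha256_hex(b"original code"),
    ".rodata": sha256_hex(b"string constants"),
}
target_hashes = {
    ".text": sha256_hex(b"patched code"),
    ".rodata": sha256_hex(b"string constants"),
}

deltas = section_deltas(base_hashes, target_hashes)
for delta in deltas:
    print(delta["section"], delta["status"])
```

A changed `.text` digest with an unchanged `.rodata` digest is the pattern a backport typically produces, which is why the resulting deltas are useful as supporting evidence for a VEX claim.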
## Architecture

### Component Diagram

```
┌──────────────────────────────────────────────────────────────────────────────┐
│                         Binary Diff Attestation Flow                         │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    │
│  │     OCI     │    │    Layer    │    │   Binary    │    │   Section   │    │
│  │  Registry   │───▶│ Extraction  │───▶│  Detection  │───▶│    Hash     │    │
│  │   Client    │    │             │    │             │    │  Extractor  │    │
│  └─────────────┘    └─────────────┘    └─────────────┘    └──────┬──────┘    │
│                                                                  │           │
│  Base Image ───────────────────────────────────────────┐         │           │
│  Target Image ─────────────────────────────────────────┤         ▼           │
│                                                        │  ┌─────────────┐    │
│                                                        └─▶│    Diff     │    │
│                                                           │ Computation │    │
│                                                           └──────┬──────┘    │
│                                                                  │           │
│                                                                  ▼           │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    │
│  │    DSSE     │◀───│  Predicate  │◀───│   Finding   │◀───│   Verdict   │    │
│  │   Signer    │    │   Builder   │    │ Aggregation │    │ Classifier  │    │
│  └──────┬──────┘    └─────────────┘    └─────────────┘    └─────────────┘    │
│         │                                                                    │
│         ▼                                                                    │
│  ┌─────────────┐    ┌─────────────┐                                          │
│  │    Rekor    │    │    File     │                                          │
│  │ Submission  │    │   Output    │                                          │
│  └─────────────┘    └─────────────┘                                          │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘
```

### Key Components

| Component | Location | Responsibility |
|-----------|----------|----------------|
| `ElfSectionHashExtractor` | `Scanner.Analyzers.Native` | Extract per-section SHA-256 hashes from ELF binaries |
| `BinaryDiffService` | `Cli.Services` | Orchestrate diff computation between two images |
| `BinaryDiffPredicateBuilder` | `Attestor.StandardPredicates` | Construct BinaryDiffV1 in-toto predicates |
| `BinaryDiffDsseSigner` | `Attestor.StandardPredicates` | Sign predicates with DSSE envelopes |

### Data Flow

1. **Image Resolution**: Resolve base and target image references to manifest digests
2. **Layer Extraction**: Download and extract layers from both images
3. **Binary Identification**: Identify ELF binaries in both filesystems
4. **Section Hash Computation**: Compute SHA-256 for each target section in each binary
5. **Diff Computation**: Compare section hashes between base and target
6. **Verdict Classification**: Classify changes as patched/vanilla/unknown/incompatible
7. **Predicate Construction**: Build BinaryDiffV1 predicate with findings
8. **DSSE Signing**: Sign predicate and optionally submit to Rekor

## ELF Section Hashing

### Target Sections

The following ELF sections are analyzed for hash computation:

| Section | Purpose | Backport Relevance |
|---------|---------|-------------------|
| `.text` | Executable code | **High** - Patched functions modify this section |
| `.rodata` | Read-only data (strings, constants) | Medium - String constants may change with patches |
| `.data` | Initialized global/static variables | Low - Rarely changes for security patches |
| `.symtab` | Symbol table (function names, addresses) | **High** - Function additions, removals, or size changes |
| `.dynsym` | Dynamic symbols (exports) | **High** - Exported API changes |

### Hash Algorithm

**Primary**: SHA-256
- Industry standard, widely supported
- Collision-resistant for security applications

**Optional**: BLAKE3-256
- Faster computation for large binaries
- Enabled via configuration

### Hash Computation

```
For each ELF binary:
  1. Parse ELF header
  2. Locate section headers
  3. For each target section:
     a. Read section contents
     b. Compute SHA-256(contents)
     c. Store: {name, offset, size, sha256}
  4. Sort sections by name (lexicographic)
  5. Return ElfSectionHashSet
```

### Determinism Guarantees

All operations produce deterministic output:

| Aspect | Guarantee |
|--------|-----------|
| Section ordering | Sorted lexicographically by name |
| Hash format | Lowercase hexadecimal, no prefix |
| Timestamps | From injected `TimeProvider` |
| JSON serialization | RFC 8785 canonical JSON |

## BinaryDiffV1 Predicate

### Schema Overview

The `BinaryDiffV1` predicate follows the in-toto attestation format:

```json
{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [
    {
      "name": "docker://repo/app@sha256:target...",
      "digest": { "sha256": "target..." }
    }
  ],
  "predicateType": "stellaops.binarydiff.v1",
  "predicate": {
    "inputs": {
      "base": { "digest": "sha256:base..." },
      "target": { "digest": "sha256:target..." }
    },
    "findings": [...],
    "metadata": {...}
  }
}
```

### Predicate Fields

| Field | Type | Description |
|-------|------|-------------|
| `subject` | array | Target image references with digests (statement-level field, outside `predicate`) |
| `inputs.base` | object | Base image reference |
| `inputs.target` | object | Target image reference |
| `findings` | array | Per-binary diff findings |
| `metadata` | object | Tool version, timestamp, config |

### Finding Structure

Each finding represents a binary comparison:

```json
{
  "path": "/usr/lib/libssl.so.3",
  "changeType": "modified",
  "binaryFormat": "elf",
  "sectionDeltas": [
    { "section": ".text", "status": "modified" },
    { "section": ".rodata", "status": "identical" }
  ],
  "confidence": 0.95,
  "verdict": "patched"
}
```

### Verdicts

| Verdict | Meaning | Confidence Threshold |
|---------|---------|---------------------|
| `patched` | Binary shows evidence of security patch | >= 0.90 |
| `vanilla` | Binary matches upstream/unmodified | >= 0.95 |
| `unknown` | Cannot determine patch status | < 0.90 |
| `incompatible` | Cannot compare (different architecture, etc.) | N/A |

## DSSE Attestation

### Envelope Structure

```json
{
  "payloadType": "stellaops.binarydiff.v1",
  "payload": "<base64-encoded payload>",
  "signatures": [
    {
      "keyid": "...",
      "sig": "<base64-encoded signature>"
    }
  ]
}
```

### Signature Algorithm

- **Default**: Ed25519
- **Alternative**: ECDSA P-256, RSA-PSS (via `ICryptoProviderRegistry`)
- **Keyless**: Sigstore Fulcio certificate chain

### Rekor Submission

When Rekor is enabled:

1. The DSSE envelope is submitted to the Rekor transparency log
2. The inclusion proof is retrieved
3. Rekor metadata is stored in the result

```json
{
  "rekorLogIndex": 12345678,
  "rekorEntryId": "abc123...",
  "integratedTime": "2026-01-13T12:00:00Z"
}
```

### Verification

Binary diff attestations can be verified with:

```bash
# Using cosign
# NOTE: '.*' accepts any signer identity and issuer; pin both in production.
cosign verify-attestation \
  --type stellaops.binarydiff.v1 \
  --certificate-identity-regexp '.*' \
  --certificate-oidc-issuer-regexp '.*' \
  docker://repo/app:1.0.1

# Using stella CLI
stella verify attestation ./binarydiff.dsse.json \
  --type stellaops.binarydiff.v1
```

## Integration Points

### VEX Mapping

Binary diff evidence can support VEX claims:

```json
{
  "vulnerability": "CVE-2024-1234",
  "status": "fixed",
  "justification": "vulnerable_code_not_present",
  "detail": "Vendor backport applied; evidence in binary diff attestation",
  "evidence": {
    "attestationRef": "sha256:dsse-envelope-hash...",
    "finding": {
      "path": "/usr/lib/libssl.so.3",
      "verdict": "patched",
      "confidence": 0.95
    }
  }
}
```

### Policy Engine

Policy rules can reference binary diff evidence:

```rego
# Accept high-confidence patch verdicts as mitigation
allow contains decision if {
    # Bind a single finding so both checks apply to the same binary
    finding := input.binaryDiff.findings[_]
    finding.verdict == "patched"
    finding.confidence >= 0.90
    decision := {
        "action": "accept",
        "reason": "Binary diff shows patched code",
        "evidence": input.binaryDiff.attestationRef
    }
}
```

### SBOM Properties

Section hashes appear in SBOM component properties:

```json
{
  "type": "library",
  "name": "libssl.so.3",
  "properties": [
    {"name": "evidence:section:.text:sha256", "value": "abc123..."},
    {"name": "evidence:section:.rodata:sha256", "value": "def456..."},
    {"name": "evidence:extractor-version", "value": "1.0.0"}
  ]
}
```

## Configuration

### Scanner Options

```yaml
scanner:
  native:
    sectionHashes:
      enabled: true
      algorithms:
        - sha256
        - blake3  # optional
      sections:
        - .text
        - .rodata
        - .data
        - .symtab
        - .dynsym
      maxSectionSize: 104857600  # 100 MiB limit
```

### CLI Options

See [CLI Reference](../../API_CLI_REFERENCE.md#stella-scan-diff) for full option documentation.

## Limitations and Future Work

### Current Limitations

1. **ELF only**: PE and Mach-O support planned for M2
2. **Single platform**: Multi-platform diff requires multiple invocations
3. **No function-level analysis**: Section-level granularity only
4. **Confidence scoring**: Based on section changes, not semantic analysis

### Roadmap

| Milestone | Capability |
|-----------|------------|
| M2 | PE section analysis for Windows containers |
| M2 | Mach-O section analysis for macOS binaries |
| M3 | Vendor backport corpus with curated test fixtures |
| M3 | Function-level diff using DWARF debug info |
| M4 | ML-based verdict classification |

## References

- [BinaryDiffV1 JSON Schema](../../schemas/binarydiff-v1.schema.json)
- [in-toto Attestation Specification](https://github.com/in-toto/attestation)
- [DSSE Envelope Specification](https://github.com/secure-systems-lab/dsse)
- [ELF Specification](https://refspecs.linuxfoundation.org/elf/elf.pdf)