# Binary Diff Attestation

## Overview

Binary Diff Attestation enables verification of binary-level changes between container images, producing cryptographically signed evidence of what changed at the ELF section level (PE and Mach-O support is planned). This capability is essential for:

- **Vendor backport detection**: Identify when a vendor has patched a binary without changing version numbers
- **Supply chain verification**: Prove that expected changes (and no unexpected changes) occurred between releases
- **VEX evidence generation**: Provide concrete evidence for "not_affected" or "fixed" vulnerability status claims
- **Audit trail**: Maintain verifiable records of binary modifications across deployments

### Relationship to SBOM and VEX

Binary diff attestations complement SBOM and VEX documents:

| Artifact | Purpose | Granularity |
|----------|---------|-------------|
| SBOM | Inventory of components | Package/library level |
| VEX | Exploitability status | Vulnerability level |
| Binary Diff Attestation | Change evidence | Section level (function level planned) |

The attestation provides the *evidence* that supports VEX claims. For example, a VEX statement claiming a CVE is "fixed" due to a vendor backport can reference the binary diff attestation showing that the `.text` section hash changed.
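
To make "section-level change evidence" concrete, the following is a minimal Python sketch of how two sets of section hashes might be compared. It is an illustration only, not the `BinaryDiffService` implementation: the function name `diff_sections` and the input shape (a plain mapping from section name to hex digest) are hypothetical. The `removed` and `unchanged` status values are assumptions, since this document only shows `modified` and `added`.

```python
def diff_sections(base, target):
    """Compare two {section_name: sha256_hex} maps and return section deltas.

    Hypothetical helper for illustration; the input shape and the
    "removed"/"unchanged" status names are assumptions.
    """
    deltas = []
    # Sort by section name so the output order is deterministic.
    for section in sorted(set(base) | set(target)):
        if section not in base:
            status = "added"
        elif section not in target:
            status = "removed"
        elif base[section] != target[section]:
            status = "modified"
        else:
            status = "unchanged"
        deltas.append({"section": section, "status": status})
    return deltas


if __name__ == "__main__":
    base = {".text": "aa11", ".rodata": "bb22"}
    target = {".text": "cc33", ".rodata": "bb22"}
    for delta in diff_sections(base, target):
        print(delta["section"], delta["status"])
```

A changed `.text` digest with an unchanged `.rodata` digest is the kind of result a VEX statement could cite as backport evidence.
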
## Architecture

### Component Diagram

```
+-------------------+   +--------------------+   +--------------------+   +----------------------+
| OCI Registry      |-->| Layer Extraction   |-->| ELF Detection      |-->| Section Hash Extract |
+-------------------+   +--------------------+   +--------------------+   +----------------------+
          |
  base + target images
          v
+-------------------+   +--------------------+   +------------------+   +------------------+
| Diff Computation  |-->| Predicate Builder  |-->| DSSE Signer      |-->| Output Files     |
+-------------------+   +--------------------+   +------------------+   +------------------+
```

### Key Components

| Component | Location | Responsibility |
|-----------|----------|----------------|
| `ElfSectionHashExtractor` | `Scanner.Analyzers.Native` | Extract per-section SHA-256 hashes from ELF binaries |
| `BinaryDiffService` | `Cli.Commands.Scan` | Orchestrate diff computation between two images |
| `BinaryDiffPredicateBuilder` | `Attestor.StandardPredicates` | Construct BinaryDiffV1 predicate payloads |
| `BinaryDiffDsseSigner` | `Attestor.StandardPredicates` | Sign predicates with DSSE envelopes |

### Data Flow

1. **Image Resolution**: Resolve base and target image references to manifest digests
2. **Layer Extraction**: Download and extract layers from both images
3. **Binary Identification**: Identify ELF binaries in both filesystems
4. **Section Hash Computation**: Compute SHA-256 for each target section in each binary
5. **Diff Computation**: Compare section hashes between base and target
6. **Verdict Classification**: Basic classification of unchanged vs modified binaries
7. **Predicate Construction**: Build BinaryDiffV1 predicate with findings
8. **DSSE Signing**: Sign predicate; optional transparency log submission is handled by attestor tooling

## ELF Section Hashing

### Target Sections

The following ELF sections are analyzed for hash computation:

| Section | Purpose | Backport Relevance |
|---------|---------|-------------------|
| `.text` | Executable code | **High** - Patched functions modify this section |
| `.rodata` | Read-only data (strings, constants) | Medium - String constants may change with patches |
| `.data` | Initialized global/static variables | Low - Rarely changes for security patches |
| `.symtab` | Symbol table (function names, addresses) | **High** - Function signature changes |
| `.dynsym` | Dynamic symbols (exports) | **High** - Exported API changes |

### Hash Algorithm

**Primary**: SHA-256

- Industry standard, widely supported
- Collision-resistant for security applications

**Optional**: BLAKE3-256

- Faster computation for large binaries
- Enabled via configuration

### Hash Computation

```
For each ELF binary:
  1. Parse ELF header
  2. Locate section headers
  3. For each target section:
     a. Read section contents
     b. Compute SHA-256(contents)
     c. Store: {name, offset, size, sha256}
  4. Sort sections by name (lexicographic)
  5. Return ElfSectionHashSet
```

### Determinism Guarantees

All operations produce deterministic output:

| Aspect | Guarantee |
|--------|-----------|
| Section ordering | Sorted lexicographically by name |
| Hash format | Lowercase hexadecimal, no prefix |
| Timestamps | From injected `TimeProvider` |
| JSON serialization | RFC 8785 canonical JSON |

## BinaryDiffV1 Predicate

### Schema Overview

The `BinaryDiffV1` predicate payload uses the following structure:

```json
{
  "predicateType": "stellaops.binarydiff.v1",
  "subjects": [
    {
      "name": "docker://repo/app@sha256:target...",
      "digest": { "sha256": "target..." },
      "platform": { "os": "linux", "architecture": "amd64" }
    }
  ],
  "inputs": {
    "base": { "digest": "sha256:base..." },
    "target": { "digest": "sha256:target..." }
  },
  "findings": [...],
  "metadata": { ... }
}
```

### Predicate Fields

| Field | Type | Description |
|-------|------|-------------|
| `subjects` | array | Target image references with digests |
| `inputs.base` | object | Base image reference |
| `inputs.target` | object | Target image reference |
| `findings` | array | Per-binary diff findings |
| `metadata` | object | Tool version, timestamp, config |

### Finding Structure

Each finding represents a binary comparison:

```json
{
  "path": "/usr/lib/libssl.so.3",
  "changeType": "modified",
  "binaryFormat": "elf",
  "sectionDeltas": [
    { "section": ".text", "status": "modified" },
    { "section": ".rodata", "status": "added" }
  ],
  "confidence": 0.50,
  "verdict": "unknown"
}
```

### Verdicts

Current CLI output uses `vanilla` for unchanged binaries and `unknown` for modified binaries. Advanced verdict classification (patched/vanilla) is planned for follow-up work.

| Verdict | Meaning | Confidence Threshold |
|---------|---------|---------------------|
| `patched` | Binary shows evidence of security patch | >= 0.90 |
| `vanilla` | Binary matches upstream/unmodified | >= 0.95 |
| `unknown` | Cannot determine patch status | < 0.90 |
| `incompatible` | Cannot compare (different architecture, etc.) | N/A |

## DSSE Attestation

### Envelope Structure

```json
{
  "payloadType": "stellaops.binarydiff.v1",
  "payload": "",
  "signatures": [
    { "keyid": "...", "sig": "" }
  ]
}
```

In a real envelope, `payload` carries the base64-encoded predicate and `sig` the base64-encoded signature; both are shown empty here.

### Signature Algorithm

- **CLI output**: ECDSA (P-256/384/521) with operator-provided PEM key
- **Library support**: Ed25519 available via `EnvelopeSignatureService`

### Rekor Submission

When Rekor is enabled in attestor tooling:

1. DSSE envelope is submitted to Rekor transparency log
2. Inclusion proof is retrieved
3. Rekor metadata is stored in result

```json
{
  "rekorLogIndex": 12345678,
  "rekorEntryId": "abc123...",
  "integratedTime": "2026-01-13T12:00:00Z"
}
```

Note: `stella scan diff` does not submit to Rekor; it only emits local DSSE outputs.
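
The envelope shown under Envelope Structure can be assembled roughly as follows. This is a hedged Python sketch, not the `BinaryDiffDsseSigner` implementation. The `pae` function follows the pre-authentication encoding defined by the DSSE specification, which determines the exact bytes that get signed. `build_envelope`, the `sign` callback, and the `keyid` value are hypothetical names introduced for illustration. The sketch serializes with compact, key-sorted JSON, which only approximates RFC 8785 canonicalization.

```python
import base64
import json


def pae(payload_type, payload):
    """DSSE pre-authentication encoding: the bytes that are actually signed."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def build_envelope(payload_type, predicate, sign, keyid):
    """Wrap a predicate dict in a DSSE envelope.

    `sign` is a caller-supplied function (bytes -> signature bytes); it is an
    assumption of this sketch, standing in for an ECDSA or Ed25519 signer.
    """
    # Approximation of canonical JSON: sorted keys, no insignificant whitespace.
    payload = json.dumps(predicate, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = sign(pae(payload_type, payload))
    return {
        "payloadType": payload_type,
        "payload": base64.b64encode(payload).decode("ascii"),
        "signatures": [
            {"keyid": keyid, "sig": base64.b64encode(signature).decode("ascii")}
        ],
    }


if __name__ == "__main__":
    # Placeholder signer for demonstration only; it provides no security.
    fake_sign = lambda data: b"not-a-real-signature"
    envelope = build_envelope(
        "stellaops.binarydiff.v1", {"findings": []}, fake_sign, "example-key"
    )
    print(json.dumps(envelope, indent=2))
```

Signing the PAE output rather than the raw payload binds the signature to the payload type, so a verifier rejects an envelope whose `payloadType` has been altered.
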
### Verification

Binary diff attestations can be verified with:

```bash
# Attach the DSSE envelope to the image
stella attest attach \
  --image docker://repo/app:1.0.1 \
  --attestation ./binarydiff.dsse.json

# Verify with cosign (key-based)
cosign verify-attestation \
  --type stellaops.binarydiff.v1 \
  --key ./keys/binarydiff.pub \
  docker://repo/app:1.0.1

# Verify with stella CLI
stella attest verify \
  --image docker://repo/app:1.0.1 \
  --predicate-type stellaops.binarydiff.v1
```

## Integration Points

### VEX Mapping

Binary diff evidence can support VEX claims:

```json
{
  "vulnerability": "CVE-2024-1234",
  "status": "fixed",
  "justification": "vulnerable_code_not_present",
  "detail": "Vendor backport applied; evidence in binary diff attestation",
  "evidence": {
    "attestationRef": "sha256:dsse-envelope-hash...",
    "finding": {
      "path": "/usr/lib/libssl.so.3",
      "verdict": "patched",
      "confidence": 0.95
    }
  }
}
```

### Policy Engine

Policy rules can reference binary diff evidence:

```rego
# Accept high-confidence patch verdicts as mitigation
allow contains decision if {
    # Bind a single finding so that both checks apply to the same binary
    finding := input.binaryDiff.findings[_]
    finding.verdict == "patched"
    finding.confidence >= 0.90
    decision := {
        "action": "accept",
        "reason": "Binary diff shows patched code",
        "evidence": input.binaryDiff.attestationRef
    }
}
```

### SBOM Properties

Section hashes appear in SBOM component properties:

```json
{
  "type": "library",
  "name": "libssl.so.3",
  "properties": [
    {"name": "evidence:section:.text:sha256", "value": "abc123..."},
    {"name": "evidence:section:.rodata:sha256", "value": "def456..."},
    {"name": "evidence:extractor-version", "value": "1.0.0"}
  ]
}
```

## Configuration

### Scanner Options

```yaml
scanner:
  native:
    sectionHashes:
      enabled: true
      algorithms:
        - sha256
        - blake3  # optional
      sections:
        - .text
        - .rodata
        - .data
        - .symtab
        - .dynsym
      maxSectionSize: 104857600  # 100 MiB limit
```

### CLI Options

See [CLI Reference](../../API_CLI_REFERENCE.md#stella-scan-diff) for full option documentation.
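
To illustrate how the `sections` and `maxSectionSize` scanner options could drive the steps listed under Hash Computation, here is a minimal Python sketch. It is not the `ElfSectionHashExtractor` implementation: `hash_sections` and its parameters are hypothetical, it handles only little-endian ELF64 files, and it assumes oversized and `SHT_NOBITS` sections are simply skipped.

```python
import hashlib
import struct

ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")  # 64-byte ELF64 file header
ELF64_SECTION = struct.Struct("<IIQQQQIIQQ")       # 64-byte ELF64 section header
SHT_NOBITS = 8  # section occupies no bytes in the file (for example .bss)

DEFAULT_SECTIONS = (".text", ".rodata", ".data", ".symtab", ".dynsym")


def hash_sections(blob, wanted=DEFAULT_SECTIONS, max_section_size=104857600):
    """Return [{name, offset, size, sha256}, ...] sorted by section name.

    Hypothetical helper: supports little-endian ELF64 only.
    """
    if blob[:4] != b"\x7fELF" or blob[4] != 2 or blob[5] != 1:
        raise ValueError("sketch handles little-endian ELF64 only")

    # 1-2. Parse the ELF header and locate the section header table.
    fields = ELF64_HEADER.unpack_from(blob, 0)
    shoff, shentsize, shnum, shstrndx = fields[6], fields[11], fields[12], fields[13]
    if shnum == 0:
        return []
    headers = [
        ELF64_SECTION.unpack_from(blob, shoff + i * shentsize) for i in range(shnum)
    ]

    # Section names are stored in the string table indexed by e_shstrndx.
    str_off, str_size = headers[shstrndx][4], headers[shstrndx][5]
    strtab = blob[str_off:str_off + str_size]

    # 3. Hash each requested section that has file contents within the limit.
    results = []
    for name_idx, sh_type, _flags, _addr, offset, size, *_rest in headers:
        end = strtab.index(b"\x00", name_idx)
        name = strtab[name_idx:end].decode("ascii", "replace")
        if name not in wanted or sh_type == SHT_NOBITS or size > max_section_size:
            continue
        digest = hashlib.sha256(blob[offset:offset + size]).hexdigest()
        results.append({"name": name, "offset": offset, "size": size, "sha256": digest})

    # 4. Sort by name so repeated runs produce identical output.
    return sorted(results, key=lambda section: section["name"])
```

A caller would read the binary with `open(path, "rb").read()` and pass the bytes in. `hexdigest()` yields lowercase hexadecimal without a prefix, which matches the hash format listed under Determinism Guarantees.
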
## Limitations and Future Work

### Current Limitations

1. **ELF only**: PE and Mach-O support planned for M2
2. **Single platform**: Multi-platform diff requires multiple invocations
3. **No function-level analysis**: Section-level granularity only
4. **Confidence scoring**: Placeholder scoring only; verdict classifier is minimal

### Roadmap

| Milestone | Capability |
|-----------|------------|
| M2 | PE section analysis for Windows containers |
| M2 | Mach-O section analysis for macOS binaries |
| M3 | Vendor backport corpus with curated test fixtures |
| M3 | Function-level diff using DWARF debug info |
| M4 | ML-based verdict classification |

## References

- [BinaryDiffV1 JSON Schema](../../schemas/binarydiff-v1.schema.json)
- [in-toto Attestation Specification](https://github.com/in-toto/attestation)
- [DSSE Envelope Specification](https://github.com/secure-systems-lab/dsse)
- [ELF Specification](https://refspecs.linuxfoundation.org/elf/elf.pdf)