# Binary Diff Attestation
## Overview
Binary Diff Attestation verifies binary-level changes between container images and produces cryptographically signed evidence of what changed at the section level. Only ELF binaries are analyzed today; PE and Mach-O support is planned. This capability supports:
- **Vendor backport detection**: Identify when a vendor has patched a binary without changing version numbers
- **Supply chain verification**: Prove that expected changes (and no unexpected changes) occurred between releases
- **VEX evidence generation**: Provide concrete evidence for "not_affected" or "fixed" vulnerability status claims
- **Audit trail**: Maintain verifiable records of binary modifications across deployments
### Relationship to SBOM and VEX
Binary diff attestations complement SBOM and VEX documents:
| Artifact | Purpose | Granularity |
|----------|---------|-------------|
| SBOM | Inventory of components | Package/library level |
| VEX | Exploitability status | Vulnerability level |
| Binary Diff Attestation | Change evidence | Section level (function level planned) |

The attestation provides the *evidence* that supports VEX claims. For example, a VEX statement claiming a CVE is "fixed" due to a vendor backport can reference the binary diff attestation showing that the `.text` section hash changed.
## Architecture
### Component Diagram
```
+-------------------+ +--------------------+ +--------------------+ +----------------------+
| OCI Registry |-->| Layer Extraction |-->| ELF Detection |-->| Section Hash Extract |
+-------------------+ +--------------------+ +--------------------+ +----------------------+
| base + target images
v
+-------------------+ +--------------------+ +------------------+ +------------------+
| Diff Computation |-->| Predicate Builder |-->| DSSE Signer |-->| Output Files |
+-------------------+ +--------------------+ +------------------+ +------------------+
```
### Key Components
| Component | Location | Responsibility |
|-----------|----------|----------------|
| `ElfSectionHashExtractor` | `Scanner.Analyzers.Native` | Extract per-section SHA-256 hashes from ELF binaries |
| `BinaryDiffService` | `Cli.Commands.Scan` | Orchestrate diff computation between two images |
| `BinaryDiffPredicateBuilder` | `Attestor.StandardPredicates` | Construct BinaryDiffV1 predicate payloads |
| `BinaryDiffDsseSigner` | `Attestor.StandardPredicates` | Sign predicates with DSSE envelopes |
### Data Flow
1. **Image Resolution**: Resolve base and target image references to manifest digests
2. **Layer Extraction**: Download and extract layers from both images
3. **Binary Identification**: Identify ELF binaries in both filesystems
4. **Section Hash Computation**: Compute SHA-256 for each target section in each binary
5. **Diff Computation**: Compare section hashes between base and target
6. **Verdict Classification**: Basic classification of unchanged vs modified binaries
7. **Predicate Construction**: Build BinaryDiffV1 predicate with findings
8. **DSSE Signing**: Sign predicate; optional transparency log submission is handled by attestor tooling
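
Steps 5 and 6 can be sketched as follows. This is a minimal illustration, not the `BinaryDiffService` implementation: the function names are hypothetical, the `removed` status is an assumption (only `modified` and `added` appear in this document), and the verdict rule mirrors the current CLI behavior of `vanilla` for unchanged binaries and `unknown` for modified ones.

```python
from typing import Dict, List


def diff_sections(base: Dict[str, str], target: Dict[str, str]) -> List[dict]:
    """Compare per-section hashes of one binary; return deltas sorted by section name."""
    deltas = []
    for name in sorted(set(base) | set(target)):
        if name not in base:
            status = "added"
        elif name not in target:
            status = "removed"  # assumed status name, not confirmed by the schema
        elif base[name] != target[name]:
            status = "modified"
        else:
            continue  # identical hashes produce no delta
        deltas.append({"section": name, "status": status})
    return deltas


def classify(deltas: List[dict]) -> str:
    """Basic classification: no deltas means vanilla, anything else is unknown."""
    return "unknown" if deltas else "vanilla"
```

Iterating over the sorted union of section names keeps the delta order stable regardless of how the input maps were built.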
## ELF Section Hashing
### Target Sections
The following ELF sections are analyzed for hash computation:
| Section | Purpose | Backport Relevance |
|---------|---------|-------------------|
| `.text` | Executable code | **High** - Patched functions modify this section |
| `.rodata` | Read-only data (strings, constants) | Medium - String constants may change with patches |
| `.data` | Initialized global/static variables | Low - Rarely changes for security patches |
| `.symtab` | Symbol table (function names, addresses) | **High** - Function signature changes |
| `.dynsym` | Dynamic symbols (exports) | **High** - Exported API changes |
### Hash Algorithm
**Primary**: SHA-256

- Industry standard, widely supported
- Collision-resistant for security applications

**Optional**: BLAKE3-256

- Faster computation for large binaries
- Enabled via configuration
### Hash Computation
```
For each ELF binary:
  1. Parse ELF header
  2. Locate section headers
  3. For each target section:
     a. Read section contents
     b. Compute SHA-256(contents)
     c. Store: {name, offset, size, sha256}
  4. Sort sections by name (lexicographic)
  5. Return ElfSectionHashSet
```
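
The procedure above can be sketched in Python using only the standard library. This is an illustrative sketch, not the `ElfSectionHashExtractor` code: it handles little-endian ELF64 files only, the function name `section_hashes` is hypothetical, and skipping `SHT_NOBITS` sections (which occupy no file bytes) is an assumption about the desired behavior.

```python
import hashlib
import struct

TARGET_SECTIONS = {".text", ".rodata", ".data", ".symtab", ".dynsym"}
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian (64 bytes)
SHDR = struct.Struct("<IIQQQQIIQQ")        # Elf64_Shdr, little-endian (64 bytes)
SHT_NOBITS = 8                             # section type with no bytes in the file


def section_hashes(data: bytes, targets=TARGET_SECTIONS) -> list:
    """Return [{name, offset, size, sha256}] for target sections, sorted by name."""
    ident = data[:16]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("only little-endian ELF64 is handled in this sketch")

    # 1-2. Parse the ELF header and locate the section header table.
    fields = EHDR.unpack_from(data, 0)
    shoff, shentsize, shnum, shstrndx = fields[6], fields[11], fields[12], fields[13]
    if shnum == 0:
        return []
    headers = [SHDR.unpack_from(data, shoff + i * shentsize) for i in range(shnum)]

    # Section names live in the section header string table.
    str_off, str_size = headers[shstrndx][4], headers[shstrndx][5]
    strtab = data[str_off:str_off + str_size]

    results = []
    for sh_name, sh_type, _flags, _addr, sh_offset, sh_size, *_rest in headers:
        name = strtab[sh_name:strtab.index(b"\x00", sh_name)].decode("ascii")
        if name not in targets or sh_type == SHT_NOBITS:
            continue
        # 3. Read the section contents and hash them (hexdigest is lowercase).
        contents = data[sh_offset:sh_offset + sh_size]
        results.append({
            "name": name,
            "offset": sh_offset,
            "size": sh_size,
            "sha256": hashlib.sha256(contents).hexdigest(),
        })

    # 4. Sort lexicographically by name for deterministic output.
    return sorted(results, key=lambda s: s["name"])
```

A production extractor would also need to handle ELF32, big-endian files, truncated inputs, and a size cap such as the configured `maxSectionSize`.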
### Determinism Guarantees
All operations produce deterministic output:
| Aspect | Guarantee |
|--------|-----------|
| Section ordering | Sorted lexicographically by name |
| Hash format | Lowercase hexadecimal, no prefix |
| Timestamps | From injected `TimeProvider` |
| JSON serialization | RFC 8785 canonical JSON |
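
The ordering and serialization guarantees can be illustrated with a short sketch. Both helper names are hypothetical, and the serializer only approximates RFC 8785: sorted keys and compact separators match the canonical form for ASCII keys and for string, integer, boolean, and null values, but full compliance also requires the RFC's number formatting and UTF-16 key ordering, which a dedicated canonicalization library would provide.

```python
import json


def canonical_bytes(value) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace (RFC 8785 approximation)."""
    return json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def normalized_sections(sections: list) -> list:
    """Sort section records by name and force lowercase hexadecimal digests."""
    return sorted(
        ({**s, "sha256": s["sha256"].lower()} for s in sections),
        key=lambda s: s["name"],
    )
```

With these two steps applied, the same set of sections yields identical bytes regardless of input order, which is what makes the signed payload reproducible.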
## BinaryDiffV1 Predicate
### Schema Overview
The `BinaryDiffV1` predicate payload uses the following structure:
```json
{
  "predicateType": "stellaops.binarydiff.v1",
  "subjects": [
    {
      "name": "docker://repo/app@sha256:target...",
      "digest": { "sha256": "target..." },
      "platform": { "os": "linux", "architecture": "amd64" }
    }
  ],
  "inputs": {
    "base": { "digest": "sha256:base..." },
    "target": { "digest": "sha256:target..." }
  },
  "findings": [...],
  "metadata": { ... }
}
```
### Predicate Fields
| Field | Type | Description |
|-------|------|-------------|
| `subjects` | array | Target image references with digests |
| `inputs.base` | object | Base image reference |
| `inputs.target` | object | Target image reference |
| `findings` | array | Per-binary diff findings |
| `metadata` | object | Tool version, timestamp, config |
### Finding Structure
Each finding represents a binary comparison:
```json
{
  "path": "/usr/lib/libssl.so.3",
  "changeType": "modified",
  "binaryFormat": "elf",
  "sectionDeltas": [
    { "section": ".text", "status": "modified" },
    { "section": ".rodata", "status": "added" }
  ],
  "confidence": 0.50,
  "verdict": "unknown"
}
```
### Verdicts
The CLI currently emits only `vanilla` (unchanged binaries) and `unknown` (modified binaries).
Advanced verdict classification (patched/vanilla) is planned for follow-up work, so the table below includes verdicts that the CLI does not yet emit.
| Verdict | Meaning | Confidence Threshold |
|---------|---------|---------------------|
| `patched` | Binary shows evidence of security patch | >= 0.90 |
| `vanilla` | Binary matches upstream/unmodified | >= 0.95 |
| `unknown` | Cannot determine patch status | < 0.90 |
| `incompatible` | Cannot compare (different architecture, etc.) | N/A |
## DSSE Attestation
### Envelope Structure
```json
{
  "payloadType": "stellaops.binarydiff.v1",
  "payload": "<base64-encoded predicate>",
  "signatures": [
    {
      "keyid": "...",
      "sig": "<base64-encoded signature>"
    }
  ]
}
```
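
For context, DSSE does not sign the raw payload: the signature covers the pre-authentication encoding (PAE) of the payload type and payload, as defined in the DSSE specification. The sketch below shows that encoding and how the envelope fields are assembled. It is not the `BinaryDiffDsseSigner` implementation; `build_envelope` and its `sign` callback are hypothetical, and the actual ECDSA or Ed25519 signing is left to the caller.

```python
import base64
from typing import Callable


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: 'DSSEv1 LEN(type) type LEN(body) body'."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def build_envelope(
    payload_type: str,
    payload: bytes,
    keyid: str,
    sign: Callable[[bytes], bytes],  # hypothetical callback wrapping the real signer
) -> dict:
    """Assemble a DSSE envelope; the signature is computed over PAE, not the payload."""
    signature = sign(pae(payload_type, payload))
    return {
        "payloadType": payload_type,
        "payload": base64.b64encode(payload).decode("ascii"),
        "signatures": [
            {"keyid": keyid, "sig": base64.b64encode(signature).decode("ascii")}
        ],
    }
```

Because the payload type is part of the signed bytes, a verifier that rebuilds the PAE from the envelope will reject a signature that was produced for a different predicate type.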
### Signature Algorithm
- **CLI output**: ECDSA (P-256/384/521) with operator-provided PEM key
- **Library support**: Ed25519 available via `EnvelopeSignatureService`
### Rekor Submission
When Rekor is enabled in attestor tooling:
1. DSSE envelope is submitted to Rekor transparency log
2. Inclusion proof is retrieved
3. Rekor metadata is stored in result
```json
{
  "rekorLogIndex": 12345678,
  "rekorEntryId": "abc123...",
  "integratedTime": "2026-01-13T12:00:00Z"
}
```
Note: `stella scan diff` does not submit to Rekor; it only emits local DSSE outputs.
### Verification
Binary diff attestations can be verified with:
```bash
# Attach the DSSE envelope to the image
stella attest attach \
  --image docker://repo/app:1.0.1 \
  --attestation ./binarydiff.dsse.json

# Verify with cosign (key-based)
cosign verify-attestation \
  --type stellaops.binarydiff.v1 \
  --key ./keys/binarydiff.pub \
  docker://repo/app:1.0.1

# Verify with stella CLI
stella attest verify \
  --image docker://repo/app:1.0.1 \
  --predicate-type stellaops.binarydiff.v1
```
## Integration Points
### VEX Mapping
Binary diff evidence can support VEX claims:
```json
{
  "vulnerability": "CVE-2024-1234",
  "status": "fixed",
  "justification": "vulnerable_code_not_present",
  "detail": "Vendor backport applied; evidence in binary diff attestation",
  "evidence": {
    "attestationRef": "sha256:dsse-envelope-hash...",
    "finding": {
      "path": "/usr/lib/libssl.so.3",
      "verdict": "patched",
      "confidence": 0.95
    }
  }
}
```
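
The `evidence` object can be derived mechanically from a finding and its signed envelope. The sketch below is one possible approach, with two assumptions that this document does not confirm: `attestationRef` is taken to be the SHA-256 of the serialized DSSE envelope bytes, and the helper name `vex_evidence` is hypothetical.

```python
import hashlib


def vex_evidence(envelope_bytes: bytes, finding: dict) -> dict:
    """Build the VEX evidence block from an envelope and one binary diff finding."""
    # Assumption: the reference is the SHA-256 digest of the envelope file as stored.
    digest = hashlib.sha256(envelope_bytes).hexdigest()
    return {
        "attestationRef": "sha256:" + digest,
        # Copy only the fields shown in the VEX example above.
        "finding": {key: finding[key] for key in ("path", "verdict", "confidence")},
    }
```

Hashing the stored envelope bytes lets a consumer locate the exact attestation that backs the claim, provided the envelope is stored byte-for-byte as signed.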
### Policy Engine
Policy rules can reference binary diff evidence:
```rego
# Accept high-confidence patch verdicts as mitigation
allow contains decision if {
    # Bind a single finding so both conditions apply to the same binary
    some finding in input.binaryDiff.findings
    finding.verdict == "patched"
    finding.confidence >= 0.90
    decision := {
        "action": "accept",
        "reason": "Binary diff shows patched code",
        "evidence": input.binaryDiff.attestationRef
    }
}
```
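
For unit tests outside the policy engine, the intent of this rule can be mirrored in plain Python. This is a hypothetical helper, not part of the Policy Engine; it assumes the rule should fire only when a single finding is both `patched` and at or above the threshold, and it returns at most one decision, matching the set semantics of the Rego rule.

```python
def accept_decisions(binary_diff: dict, threshold: float = 0.90) -> list:
    """Return one accept decision if any single finding is patched with enough confidence."""
    qualifies = any(
        f.get("verdict") == "patched" and f.get("confidence", 0.0) >= threshold
        for f in binary_diff.get("findings", [])
    )
    if not qualifies:
        return []
    return [{
        "action": "accept",
        "reason": "Binary diff shows patched code",
        "evidence": binary_diff.get("attestationRef"),
    }]
```

Checking both conditions on the same finding matters: an input with one low-confidence `patched` finding and one unrelated high-confidence finding should not be accepted.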
### SBOM Properties
Section hashes appear in SBOM component properties:
```json
{
  "type": "library",
  "name": "libssl.so.3",
  "properties": [
    {"name": "evidence:section:.text:sha256", "value": "abc123..."},
    {"name": "evidence:section:.rodata:sha256", "value": "def456..."},
    {"name": "evidence:extractor-version", "value": "1.0.0"}
  ]
}
```
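
The property names follow a regular pattern, so they can be generated from a map of section hashes. The helper below is a hypothetical sketch of that mapping; the name format is taken from the example above, while sorting by section name is an assumption made to keep the output deterministic.

```python
def section_hash_properties(section_hashes: dict, extractor_version: str) -> list:
    """Turn {section name: sha256 hex} into SBOM component properties."""
    properties = [
        {"name": f"evidence:section:{name}:sha256", "value": digest}
        for name, digest in sorted(section_hashes.items())
    ]
    properties.append({"name": "evidence:extractor-version", "value": extractor_version})
    return properties
```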
## Configuration
### Scanner Options
```yaml
scanner:
  native:
    sectionHashes:
      enabled: true
      algorithms:
        - sha256
        - blake3 # optional
      sections:
        - .text
        - .rodata
        - .data
        - .symtab
        - .dynsym
      maxSectionSize: 104857600 # 100 MiB limit
```
### CLI Options
See [CLI Reference](../../API_CLI_REFERENCE.md#stella-scan-diff) for full option documentation.
## Limitations and Future Work
### Current Limitations
1. **ELF only**: PE and Mach-O support planned for M2
2. **Single platform**: Multi-platform diff requires multiple invocations
3. **No function-level analysis**: Section-level granularity only
4. **Confidence scoring**: Placeholder scoring only; verdict classifier is minimal
### Roadmap
| Milestone | Capability |
|-----------|------------|
| M2 | PE section analysis for Windows containers |
| M2 | Mach-O section analysis for macOS binaries |
| M3 | Vendor backport corpus with curated test fixtures |
| M3 | Function-level diff using DWARF debug info |
| M4 | ML-based verdict classification |
## References
- [BinaryDiffV1 JSON Schema](../../schemas/binarydiff-v1.schema.json)
- [in-toto Attestation Specification](https://github.com/in-toto/attestation)
- [DSSE Envelope Specification](https://github.com/secure-systems-lab/dsse)
- [ELF Specification](https://refspecs.linuxfoundation.org/elf/elf.pdf)