# ADR 0044: Binary Delta Signatures for Backport Detection

## Status
ACCEPTED (2026-01-03)

## Context

Vulnerability scanners today rely on version string comparison to determine whether a package is vulnerable. However, Linux distributions (RHEL, Debian, Ubuntu, SUSE, Alpine) routinely **backport** security fixes into older versions without bumping the upstream version number.

### The Problem

**Example:** OpenSSL 1.0.1e on RHEL 6 has Heartbleed (CVE-2014-0160) patched, but upstream says `1.0.1e < 1.0.1g` (the fix version), so scanners flag it as vulnerable. This creates:

1. **False positives** - Patched systems flagged as vulnerable
2. **Alert fatigue** - Security teams waste time investigating non-issues
3. **Compliance failures** - Audit reports show phantom vulnerabilities
4. **Trust erosion** - Users distrust scanner results
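
The false positive in the example can be reproduced with a few lines of Python. This is a minimal sketch of a version-only check, not the logic of any particular scanner; the function names are hypothetical and the parser only handles OpenSSL 1.x style versions such as `1.0.1e`.

```python
import re


def parse_openssl_version(version: str) -> tuple:
    """Split an OpenSSL 1.x version such as '1.0.1e' into a sortable tuple."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)([a-z]*)", version)
    if match is None:
        raise ValueError(f"unsupported version format: {version!r}")
    major, minor, patch, letter = match.groups()
    return (int(major), int(minor), int(patch), letter)


def naive_is_vulnerable(installed: str, fixed_in: str) -> bool:
    """Version-only check: anything older than the upstream fix is flagged."""
    return parse_openssl_version(installed) < parse_openssl_version(fixed_in)


# A distro build of 1.0.1e with the fix backported is still flagged,
# because the version string carries no information about the patch.
flagged = naive_is_vulnerable("1.0.1e", "1.0.1g")
```

The check has no input that could distinguish a patched 1.0.1e from an unpatched one, which is why the decision below moves detection to the compiled code itself.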

### Current Mitigations

1. **Distro-specific advisory feeds** (DSA, RHSA, USN) - Incomplete coverage
2. **VEX statements from vendors** - Requires vendor participation, often delayed
3. **Manual triage** - Doesn't scale
4. **OVAL feeds** - OS packages only, not application binaries

### Requirements

- **Binary-level detection**: Examine compiled code, not version strings
- **Cryptographic proof**: Hash-based evidence that fix is present
- **Offline operation**: Work in air-gapped environments
- **Multi-architecture**: Support x86-64, ARM64, and other ISAs
- **Deterministic**: Same binary → same signature across platforms
- **LTO resilience**: Handle Link-Time Optimization changes

## Decision

**Implement binary delta signature matching using normalized code comparison.**

### Architecture

```
┌────────────────────────────────────────────────────────────────────────────┐
│                          Delta Signature Pipeline                          │
├────────────────────────────────────────────────────────────────────────────┤
│                                                                            │
│  Binary          Disassembly        Normalization       Signature          │
│  ───────────►    ───────────────►   ──────────────►     ─────────────►     │
│  ELF/PE/MachO    Iced (x86) or      Zero addresses      SHA-256 +          │
│                  B2R2 (ARM/MIPS)    Canonicalize NOPs   CFG hash +         │
│                                     Normalize PLT/GOT   Chunk hashes       │
│                                                                            │
└────────────────────────────────────────────────────────────────────────────┘
```

### Disassembly Engine Selection

**Chosen: Plugin-based architecture with Iced (primary for disassembly) + B2R2 (primary for IR lifting)**

| Engine | Strengths | Weaknesses | Use Case |
|--------|-----------|------------|----------|
| **Iced** | Fastest x86/x86-64, MIT license, pure C# | x86 only | Fast disassembly for delta-sig normalization |
| **B2R2** | Multi-arch (ARM, MIPS, RISC-V), IR lifting, MIT license | F# (requires wrapper) | Semantic IR analysis, multi-arch |

**Rationale:**
- Iced for performance-critical x86/x86-64 delta-sig path (90%+ of scanned binaries)
- B2R2 for ARM64, MIPS, RISC-V when needed for delta-sigs
- **B2R2 as primary backend for semantic IR lifting** (see `SPRINT_20260118_027_BinaryIndex_b2r2_full_integration.md`)
- Plugin architecture allows adding engines without core changes

**Update (2026-01-19):** B2R2 is now the primary backend for semantic IR lifting via `B2R2LowUirLiftingService`. This enables high-fidelity semantic analysis across x86, ARM64, MIPS, RISC-V, PowerPC, and SPARC architectures. See `docs/modules/binary-index/semantic-diffing.md` for details.
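
The plugin idea can be illustrated with a small registry that maps an architecture name to a disassembly engine. This is a sketch in Python for brevity; the actual components are .NET libraries, and every name here (`EngineRegistry`, `iced_stub`, `b2r2_stub`, the architecture keys) is a hypothetical stand-in rather than the project's API.

```python
from typing import Callable, Dict, List

# Hypothetical plugin contract: raw code bytes in, textual listing out.
Disassembler = Callable[[bytes], List[str]]


class EngineRegistry:
    """Maps an architecture name to the engine that should handle it."""

    def __init__(self) -> None:
        self._engines: Dict[str, Disassembler] = {}

    def register(self, arch: str, engine: Disassembler) -> None:
        self._engines[arch] = engine

    def resolve(self, arch: str) -> Disassembler:
        try:
            return self._engines[arch]
        except KeyError:
            raise LookupError(f"no disassembly engine registered for {arch}") from None


def iced_stub(code: bytes) -> List[str]:
    """Stand-in for the fast x86/x86-64 engine."""
    return [f"x86 listing for {len(code)} bytes"]


def b2r2_stub(code: bytes) -> List[str]:
    """Stand-in for the multi-architecture engine."""
    return [f"multi-arch listing for {len(code)} bytes"]


registry = EngineRegistry()
for arch in ("x86", "x86_64"):
    registry.register(arch, iced_stub)
for arch in ("arm64", "mips", "riscv64"):
    registry.register(arch, b2r2_stub)
```

Adding an engine is then a `register` call rather than a change to the pipeline, which is the property the rationale above relies on.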

### Normalization Strategy

To compare binaries compiled by different toolchains/versions, we normalize:

1. **Zero absolute addresses** - Remove variance from absolute and PC-relative (including RIP-relative) address operands
2. **Canonicalize NOPs** - Collapse NOP encodings (single-byte `0x90`, multi-byte `0x0F 0x1F ...`, etc.) to a single canonical NOP
3. **Normalize PLT/GOT** - Replace dynamic linking stubs with symbolic tokens
4. **Zero relocations** - Remove relocation target variance
5. **Normalize jump tables** - Convert absolute offsets to relative

**Recipe versioning**: Every signature includes the normalization recipe ID and version. Changing normalization behavior requires a version bump.
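
The first three steps can be sketched on a toy instruction model. The `Insn` type, the `ADDR` and `@plt:` tokens, and the recipe ID below are illustrative assumptions, not the shipped recipe; a real normalizer works on decoded instructions from the disassembly engine and also handles relocations and jump tables.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Insn:
    """Toy decoded instruction (hypothetical model for illustration)."""
    mnemonic: str
    operands: tuple = ()
    target: Optional[int] = None      # resolved address operand, if any
    plt_symbol: Optional[str] = None  # set when the target is a PLT/GOT stub


# Hypothetical recipe identity; it would travel with every signature.
RECIPE = {"recipeId": "example.normalize.toy", "version": "0.0.1"}


def normalize(insns: List[Insn]) -> List[str]:
    """Return layout-independent tokens for a function body."""
    out: List[str] = []
    for insn in insns:
        if insn.mnemonic == "nop":
            # Collapse a run of NOPs (any encoding) into one token.
            if not out or out[-1] != "nop":
                out.append("nop")
        elif insn.plt_symbol is not None:
            # Replace the stub address with a symbolic token.
            out.append(f"{insn.mnemonic} @plt:{insn.plt_symbol}")
        elif insn.target is not None:
            # Zero the concrete address so layout changes do not matter.
            out.append(f"{insn.mnemonic} ADDR")
        else:
            out.append(" ".join((insn.mnemonic, *insn.operands)))
    return out


build_a = [
    Insn("push", ("rbp",)),
    Insn("call", target=0x401000, plt_symbol="memcpy"),
    Insn("nop"), Insn("nop"),
    Insn("jmp", target=0x401234),
    Insn("ret"),
]
build_b = [
    Insn("push", ("rbp",)),
    Insn("call", target=0x7F0010, plt_symbol="memcpy"),
    Insn("nop"),
    Insn("jmp", target=0x7F0400),
    Insn("ret"),
]
```

Both builds differ in load addresses and padding, yet `normalize(build_a)` and `normalize(build_b)` produce the same token list, which is the property the hashing stage depends on.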

### Signature Components

```json
{
  "schema": "stellaops.deltasig.v1",
  "cve": "CVE-2014-0160",
  "package": { "name": "openssl", "soname": "libssl.so.1.0.0" },
  "target": { "arch": "x86_64", "abi": "gnu" },
  "normalization": { "recipeId": "stellaops.normalize.x64.v1", "version": "1.0.0" },
  "signatureState": "patched",
  "symbols": [
    {
      "name": "tls1_process_heartbeat",
      "hashAlg": "sha256",
      "hashHex": "abc123...",
      "sizeBytes": 1234,
      "cfgBbCount": 15,
      "cfgEdgeHash": "def456...",
      "chunks": [
        { "offset": 0, "size": 2048, "hashHex": "..." },
        { "offset": 2048, "size": 2048, "hashHex": "..." }
      ]
    }
  ]
}
```
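
A per-symbol entry of this shape could be derived from normalized bytes roughly as follows. This is a sketch under stated assumptions: chunks are taken as fixed 2048-byte windows (matching the sizes in the example), the helper name `build_symbol_signature` is hypothetical, and the CFG fields are omitted because they require control-flow recovery.

```python
import hashlib
from typing import Dict

CHUNK_SIZE = 2048  # assumption: fixed-size windows, as in the example above


def build_symbol_signature(name: str, normalized: bytes,
                           chunk_size: int = CHUNK_SIZE) -> Dict:
    """Hash the whole normalized function plus each fixed-size chunk."""
    chunks = []
    for offset in range(0, len(normalized), chunk_size):
        piece = normalized[offset:offset + chunk_size]
        chunks.append({
            "offset": offset,
            "size": len(piece),  # the final chunk may be shorter
            "hashHex": hashlib.sha256(piece).hexdigest(),
        })
    return {
        "name": name,
        "hashAlg": "sha256",
        "hashHex": hashlib.sha256(normalized).hexdigest(),
        "sizeBytes": len(normalized),
        "chunks": chunks,
    }
```

Because the input is already normalized, the same function built at a different load address yields the same `hashHex`, while the chunk list leaves room for partial matches when only part of the function changed.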

### Matching Strategy

1. **Exact match** - Full normalized hash matches patched or vulnerable signature
2. **Chunk match** - ≥70% of chunks match (handles LTO modifications)
3. **CFG match** - Control flow graph structure matches (catches recompilations)
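
The three tiers can be read as a fall-through check, sketched below. Several details are assumptions made for illustration: the tiers are tried in the order listed, chunk hashes are compared as a set regardless of offset, and a CFG match means equal basic-block count and edge hash. The function name `match_symbol` is hypothetical.

```python
from typing import Dict

CHUNK_MATCH_THRESHOLD = 0.70  # the ">= 70% of chunks" rule


def match_symbol(candidate: Dict, signature: Dict) -> str:
    """Return 'exact', 'chunk', 'cfg', or 'none' for one symbol."""
    # Tier 1: the full normalized hash is identical.
    if candidate["hashHex"] == signature["hashHex"]:
        return "exact"

    # Tier 2: enough of the signature's chunks appear in the candidate.
    sig_chunks = signature.get("chunks", [])
    if sig_chunks:
        candidate_hashes = {c["hashHex"] for c in candidate.get("chunks", [])}
        matched = sum(1 for c in sig_chunks if c["hashHex"] in candidate_hashes)
        if matched / len(sig_chunks) >= CHUNK_MATCH_THRESHOLD:
            return "chunk"

    # Tier 3: same control-flow shape even though the bytes differ.
    edge_hash = candidate.get("cfgEdgeHash")
    if (edge_hash is not None
            and edge_hash == signature.get("cfgEdgeHash")
            and candidate.get("cfgBbCount") == signature.get("cfgBbCount")):
        return "cfg"

    return "none"
```

Running the same function against both the patched and the vulnerable signature for a symbol tells the caller which state the binary is closer to, and by which tier.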

### VEX Evidence Emission

When a binary is confirmed patched via delta signature:

```json
{
  "result": "patched",
  "cveIds": ["CVE-2014-0160"],
  "confidence": 0.95,
  "symbolMatches": [
    { "symbolName": "tls1_process_heartbeat", "state": "patched", "exactMatch": true }
  ],
  "justification": "vulnerable_code_not_present",
  "summary": "Binary confirmed PATCHED with 95% confidence. 1 symbol(s) matched patched signatures exactly."
}
```

This evidence feeds into VEX candidate generation with full audit trail.
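
As an illustration of how such a record might be assembled from per-symbol results, the sketch below rebuilds the example above. The helper `build_evidence` is hypothetical, and the confidence is passed in by the caller because this document does not define how it is computed.

```python
from typing import Dict, List


def build_evidence(cve_ids: List[str], symbol_matches: List[Dict],
                   confidence: float) -> Dict:
    """Assemble a 'patched' evidence record from per-symbol match results."""
    exact = sum(
        1 for m in symbol_matches
        if m["state"] == "patched" and m["exactMatch"]
    )
    return {
        "result": "patched",
        "cveIds": cve_ids,
        "confidence": confidence,
        "symbolMatches": symbol_matches,
        "justification": "vulnerable_code_not_present",
        "summary": (
            f"Binary confirmed PATCHED with {confidence:.0%} confidence. "
            f"{exact} symbol(s) matched patched signatures exactly."
        ),
    }


evidence = build_evidence(
    ["CVE-2014-0160"],
    [{"symbolName": "tls1_process_heartbeat", "state": "patched", "exactMatch": True}],
    0.95,
)
```

Keeping the per-symbol matches in the record is what makes the summary auditable: a reviewer can trace the verdict back to the individual signature hits.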

## Alternatives Considered

### 1. Source Code Comparison
**Rejected**: Requires source access, does not work for closed-source binaries, and cannot account for compile options that change the shipped code.

### 2. Debug Symbol Matching
**Rejected**: Symbols are often stripped in production, and symbol names do not prove code content.

### 3. File Hash Matching
**Rejected**: Entire binary must match exactly; any rebuild invalidates the signature.

### 4. YARA Rules
**Rejected**: Pattern-based, high false positive rate, and does not provide cryptographic proof.

### 5. Single Disassembly Engine (B2R2 only)
**Rejected**: Performance is critical; Iced is 3-5x faster for x86/x86-64, which accounts for 90%+ of scanned binaries.

## Consequences

### Positive

1. **Eliminate false positives** for backported security fixes
2. **Cryptographic proof** of patch status (auditable, reproducible)
3. **Offline operation** with signature packs
4. **Multi-architecture** support for modern infrastructure
5. **VEX integration** for automated triage

### Negative

1. **Signature authoring required** - Must create signatures for each CVE/package
2. **Normalization limits** - Extreme compiler optimizations may defeat matching
3. **Storage overhead** - Signature database growth
4. **Compute cost** - Disassembly + normalization per binary

### Mitigations

- **Signature federation** - Share signatures across organizations
- **Chunk matching** - Resilient to LTO and PGO changes
- **Priority authoring** - Focus on high-severity CVEs first
- **Incremental scanning** - Cache analysis results

## Implementation

### Sprint: SPRINT_20260102_001_BE

| Component | Status | Notes |
|-----------|--------|-------|
| Disassembly.Abstractions | DONE | Plugin interface, models |
| Disassembly.Iced | DONE | x86/x86-64 support |
| Disassembly.B2R2 | DONE | Multi-arch support |
| Normalization | DONE | X64 + ARM64 pipelines |
| DeltaSig | DONE | Generator + matcher |
| Persistence | DONE | PostgreSQL schema |
| CLI | DONE | extract, author, sign, verify, match, pack, inspect |
| Scanner integration | DONE | DeltaSigAnalyzer, IBinaryVulnerabilityService |
| VEX emission | DONE | DeltaSignatureEvidence, DeltaSigVexEmitter |

### Test Coverage

- 74 unit tests for DeltaSig library
- 45 unit tests for Normalization
- 24 unit tests for Disassembly
- 11 property tests (FsCheck) for normalization idempotency
- 14 golden tests for known CVEs (Heartbleed, Log4Shell, POODLE)
- 25 unit tests for VEX evidence emission

## References

- [Binary Diff Signatures Advisory](../product/advisories/30-Dec-2025%20-%20Binary%20Diff%20Signatures%20for%20Patch%20Detection.md)
- [B2R2 GitHub](https://github.com/B2R2-org/B2R2)
- [Iced GitHub](https://github.com/icedland/iced)
- [OpenVEX Specification](https://github.com/openvex/spec)
- [CVE-2014-0160 (Heartbleed)](https://nvd.nist.gov/vuln/detail/CVE-2014-0160)