ADR 0044: Binary Delta Signatures for Backport Detection

Status

ACCEPTED (2026-01-03)

Context

Vulnerability scanners today rely on version string comparison to determine if a package is vulnerable. However, Linux distributions (RHEL, Debian, Ubuntu, SUSE, Alpine) routinely backport security fixes into older versions without bumping the upstream version number.

The Problem

Example: OpenSSL 1.0.1e on RHEL 6 has the Heartbleed fix (CVE-2014-0160) backported, but because 1.0.1e sorts before the upstream fix version 1.0.1g, version-based scanners flag it as vulnerable. This creates:

  1. False positives - Patched systems flagged as vulnerable
  2. Alert fatigue - Security teams waste time investigating non-issues
  3. Compliance failures - Audit reports show phantom vulnerabilities
  4. Trust erosion - Users distrust scanner results
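To make the failure mode concrete, here is a minimal sketch of the kind of version-string check described above. The function names and the parsing rule are illustrative assumptions, not taken from any particular scanner.

```python
import re

def parse_openssl_version(version: str) -> tuple:
    """Split an OpenSSL 1.x style version such as '1.0.1e' into a sortable tuple."""
    m = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)([a-z]*)", version)
    if m is None:
        raise ValueError(f"unrecognized version: {version!r}")
    major, minor, patch, letter = m.groups()
    return (int(major), int(minor), int(patch), letter)

def naive_is_vulnerable(installed: str, fixed_in: str) -> bool:
    """Hypothetical scanner rule: vulnerable if installed sorts before the fix version."""
    return parse_openssl_version(installed) < parse_openssl_version(fixed_in)

# A RHEL 6 build still reports 1.0.1e even though the fix is backported,
# so the version-only rule reports it as vulnerable.
flagged = naive_is_vulnerable("1.0.1e", "1.0.1g")
```

The check has no access to the distribution's patch level, which is why the result is a false positive for backported builds.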

Current Mitigations

  1. Distro-specific advisory feeds (DSA, RHSA, USN) - Incomplete coverage
  2. VEX statements from vendors - Requires vendor participation, often delayed
  3. Manual triage - Doesn't scale
  4. OVAL feeds - OS packages only, not application binaries

Requirements

  • Binary-level detection: Examine compiled code, not version strings
  • Cryptographic proof: Hash-based evidence that fix is present
  • Offline operation: Work in air-gapped environments
  • Multi-architecture: Support x86-64, ARM64, and other ISAs
  • Deterministic: Same binary → same signature across platforms
  • LTO resilience: Handle Link-Time Optimization changes

Decision

Implement binary delta signature matching using normalized code comparison.

Architecture

┌────────────────────────────────────────────────────────────────────────────┐
│                        Delta Signature Pipeline                             │
├────────────────────────────────────────────────────────────────────────────┤
│                                                                            │
│  Binary         Disassembly        Normalization       Signature           │
│  ───────────►   ───────────────►   ──────────────►     ─────────────►      │
│  ELF/PE/MachO   Iced (x86) or     Zero addresses      SHA-256 +            │
│                 B2R2 (ARM/MIPS)   Canonicalize NOPs   CFG hash +           │
│                                   Normalize PLT/GOT   Chunk hashes         │
│                                                                            │
└────────────────────────────────────────────────────────────────────────────┘

Disassembly Engine Selection

Chosen: Plugin-based architecture with Iced (primary) + B2R2 (fallback)

| Engine | Strengths | Weaknesses |
|--------|-----------|------------|
| Iced | Fastest x86/x86-64, MIT license, pure C# | x86 only |
| B2R2 | Multi-arch (ARM, MIPS, RISC-V), IR lifting, MIT license | F# (requires wrapper) |

Rationale:

  • Iced for performance-critical x86/x86-64 path (90%+ of scanned binaries)
  • B2R2 for ARM64, MIPS, RISC-V when needed
  • Plugin architecture allows adding engines without core changes
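The primary/fallback selection can be sketched as a small registry that picks the highest-priority engine supporting a given architecture. This is an illustration in Python, not the project's actual plugin interface: the class names, the priority numbers, and the assumption that the fallback engine also lists x86-64 are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EnginePlugin:
    """Hypothetical plugin descriptor: a name, a priority, and supported ISAs."""
    name: str
    priority: int
    archs: frozenset

    def supports(self, arch: str) -> bool:
        return arch in self.archs

class PluginRegistry:
    """Holds registered engines and selects one per architecture."""

    def __init__(self):
        self._plugins = []

    def register(self, plugin: EnginePlugin) -> None:
        self._plugins.append(plugin)

    def select(self, arch: str) -> EnginePlugin:
        candidates = [p for p in self._plugins if p.supports(arch)]
        if not candidates:
            raise LookupError(f"no disassembly engine registered for {arch}")
        # Highest priority wins, so the fast x86 engine beats the fallback.
        return max(candidates, key=lambda p: p.priority)

registry = PluginRegistry()
registry.register(EnginePlugin("iced", 100, frozenset({"x86", "x86_64"})))
# Assumption for the sketch: the fallback engine is registered for x86_64 too.
registry.register(
    EnginePlugin("b2r2", 50, frozenset({"x86_64", "arm64", "mips", "riscv64"}))
)
```

Adding another engine is then a single `register` call, which is the property the plugin design is meant to preserve.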

Normalization Strategy

To compare binaries compiled by different toolchains/versions, we normalize:

  1. Zero address operands - Remove variance from absolute addresses and PC-relative (RIP-relative) displacements
  2. Canonicalize NOPs - Collapse NOP encodings (single-byte 0x90, multi-byte 0x0F 0x1F forms, etc.) to a single canonical NOP
  3. Normalize PLT/GOT - Replace dynamic linking stubs with symbolic tokens
  4. Zero relocations - Remove relocation target variance
  5. Normalize jump tables - Convert absolute offsets to relative

Recipe versioning: Every signature includes the normalization recipe ID and version. Changing normalization behavior requires a version bump.
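A minimal sketch of three of these steps (address zeroing, NOP canonicalization, PLT tokenization) on a toy instruction model follows. The `Insn` type, the `@` prefix marking absolute addresses, and the placeholder tokens are assumptions made for illustration; a real pipeline would work on decoded instructions from the disassembly engine.

```python
from dataclasses import dataclass
from typing import Optional

# Recipe identity copied from the sample signature in this ADR.
RECIPE = {"recipeId": "stellaops.normalize.x64.v1", "version": "1.0.0"}
ADDR_TOKEN = "<addr>"

@dataclass(frozen=True)
class Insn:
    """Toy instruction: operands are strings, '@...' marks an absolute address."""
    mnemonic: str
    operands: tuple = ()
    plt_symbol: Optional[str] = None  # set when a call targets a PLT stub

def _is_address(operand: str) -> bool:
    return operand.startswith("@") or operand.startswith("[rip")

def normalize(insns):
    out = []
    for insn in insns:
        if insn.mnemonic == "nop":
            # Collapse any run of NOP encodings into one canonical NOP.
            if not (out and out[-1].mnemonic == "nop"):
                out.append(Insn("nop"))
        elif insn.plt_symbol is not None:
            # Replace the stub address with a symbolic token.
            out.append(Insn(insn.mnemonic, (f"<plt:{insn.plt_symbol}>",)))
        else:
            ops = tuple(ADDR_TOKEN if _is_address(op) else op for op in insn.operands)
            out.append(Insn(insn.mnemonic, ops))
    return out

# Two hypothetical builds of the same code, laid out at different addresses.
build_a = [
    Insn("nop"), Insn("nop", ("dword ptr [rax]",)),
    Insn("lea", ("rdi", "[rip+0x2f3a]")),
    Insn("call", ("@0x401030",), plt_symbol="memcpy"),
    Insn("jmp", ("@0x401200",)),
]
build_b = [
    Insn("nop"),
    Insn("lea", ("rdi", "[rip+0x1b20]")),
    Insn("call", ("@0x5010",), plt_symbol="memcpy"),
    Insn("jmp", ("@0x52c0",)),
]
```

Both builds normalize to the same sequence, and applying `normalize` twice gives the same result as applying it once. Any change to these rules would change the output, which is why `RECIPE` travels with every signature.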

Signature Components

{
  "schema": "stellaops.deltasig.v1",
  "cve": "CVE-2014-0160",
  "package": { "name": "openssl", "soname": "libssl.so.1.0.0" },
  "target": { "arch": "x86_64", "abi": "gnu" },
  "normalization": { "recipeId": "stellaops.normalize.x64.v1", "version": "1.0.0" },
  "signatureState": "patched",
  "symbols": [
    {
      "name": "tls1_process_heartbeat",
      "hashAlg": "sha256",
      "hashHex": "abc123...",
      "sizeBytes": 1234,
      "cfgBbCount": 15,
      "cfgEdgeHash": "def456...",
      "chunks": [
        { "offset": 0, "size": 2048, "hashHex": "..." },
        { "offset": 2048, "size": 2048, "hashHex": "..." }
      ]
    }
  ]
}
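As a rough illustration of how one `symbols` entry could be derived, the sketch below hashes a function's normalized bytes as a whole and in fixed-size chunks. The 2048-byte chunk size is read off the sample above; the function name is hypothetical, and the CFG fields are taken as inputs because their derivation is not specified here.

```python
import hashlib

CHUNK_SIZE = 2048  # matches the chunk size shown in the sample signature

def build_symbol_entry(name: str, normalized: bytes,
                       cfg_bb_count: int, cfg_edge_hash: str) -> dict:
    """Build one 'symbols' entry from already-normalized function bytes."""
    chunks = []
    for offset in range(0, len(normalized), CHUNK_SIZE):
        piece = normalized[offset:offset + CHUNK_SIZE]
        chunks.append({
            "offset": offset,
            "size": len(piece),  # the final chunk may be shorter
            "hashHex": hashlib.sha256(piece).hexdigest(),
        })
    return {
        "name": name,
        "hashAlg": "sha256",
        "hashHex": hashlib.sha256(normalized).hexdigest(),
        "sizeBytes": len(normalized),
        "cfgBbCount": cfg_bb_count,
        "cfgEdgeHash": cfg_edge_hash,
        "chunks": chunks,
    }

# Placeholder bytes stand in for real normalized code.
entry = build_symbol_entry("tls1_process_heartbeat", bytes(5000), 15, "def456")
```

Because the inputs are normalized bytes and SHA-256 is deterministic, the same function body yields the same entry on any platform.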

Matching Strategy

  1. Exact match - Full normalized hash matches patched or vulnerable signature
  2. Chunk match - ≥70% of chunks match (handles LTO modifications)
  3. CFG match - Control flow graph structure matches (catches recompilations)

VEX Evidence Emission

When a binary is confirmed patched via delta signature:

{
  "result": "patched",
  "cveIds": ["CVE-2014-0160"],
  "confidence": 0.95,
  "symbolMatches": [
    { "symbolName": "tls1_process_heartbeat", "state": "patched", "exactMatch": true }
  ],
  "justification": "vulnerable_code_not_present",
  "summary": "Binary confirmed PATCHED with 95% confidence. 1 symbol(s) matched patched signatures exactly."
}

This evidence feeds into VEX candidate generation with full audit trail.
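A sketch of assembling that evidence object from per-symbol match results is shown below. How the confidence value is computed is not specified here, so the sketch takes it as an input; the function name and the `"unknown"` fallback result are assumptions.

```python
def build_evidence(cve_ids, symbol_matches, confidence: float) -> dict:
    """Assemble an evidence record shaped like the example above."""
    all_patched = bool(symbol_matches) and all(
        m["state"] == "patched" for m in symbol_matches
    )
    evidence = {
        # "unknown" is a placeholder for any outcome other than confirmed patched.
        "result": "patched" if all_patched else "unknown",
        "cveIds": list(cve_ids),
        "confidence": confidence,
        "symbolMatches": list(symbol_matches),
    }
    if all_patched:
        exact = sum(1 for m in symbol_matches if m["exactMatch"])
        evidence["justification"] = "vulnerable_code_not_present"
        evidence["summary"] = (
            f"Binary confirmed PATCHED with {confidence:.0%} confidence. "
            f"{exact} symbol(s) matched patched signatures exactly."
        )
    return evidence

evidence = build_evidence(
    ["CVE-2014-0160"],
    [{"symbolName": "tls1_process_heartbeat", "state": "patched", "exactMatch": True}],
    0.95,
)
```

The justification is only attached when every matched symbol is in the patched state, so a mixed or empty result never produces a "not affected" claim.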

Alternatives Considered

1. Source Code Comparison

Rejected: Requires source access, doesn't work for closed-source binaries, and cannot account for compile options that change the shipped code.

2. Debug Symbol Matching

Rejected: Symbols often stripped in production, doesn't prove code content.

3. File Hash Matching

Rejected: Entire binary must match exactly; any rebuild invalidates signature.

4. YARA Rules

Rejected: Pattern-based, high false positive rate, doesn't provide cryptographic proof.

5. Single Disassembly Engine (B2R2 only)

Rejected: Performance is critical; Iced is 3-5x faster for x86/x86-64, which accounts for 90%+ of scanned binaries.

Consequences

Positive

  1. Eliminate false positives for backported security fixes
  2. Cryptographic proof of patch status (auditable, reproducible)
  3. Offline operation with signature packs
  4. Multi-architecture support for modern infrastructure
  5. VEX integration for automated triage

Negative

  1. Signature authoring required - Must create signatures for each CVE/package
  2. Normalization limits - Extreme compiler optimizations may defeat matching
  3. Storage overhead - Signature database growth
  4. Compute cost - Disassembly + normalization per binary

Mitigations

  • Signature federation - Share signatures across organizations
  • Chunk matching - Resilient to LTO and PGO changes
  • Priority authoring - Focus on high-severity CVEs first
  • Incremental scanning - Cache analysis results

Implementation

Sprint: SPRINT_20260102_001_BE

| Component | Status | Notes |
|-----------|--------|-------|
| Disassembly.Abstractions | DONE | Plugin interface, models |
| Disassembly.Iced | DONE | x86/x86-64 support |
| Disassembly.B2R2 | DONE | Multi-arch support |
| Normalization | DONE | X64 + ARM64 pipelines |
| DeltaSig | DONE | Generator + matcher |
| Persistence | DONE | PostgreSQL schema |
| CLI | DONE | extract, author, sign, verify, match, pack, inspect |
| Scanner integration | DONE | DeltaSigAnalyzer, IBinaryVulnerabilityService |
| VEX emission | DONE | DeltaSignatureEvidence, DeltaSigVexEmitter |

Test Coverage

  • 74 unit tests for DeltaSig library
  • 45 unit tests for Normalization
  • 24 unit tests for Disassembly
  • 11 property tests (FsCheck) for normalization idempotency
  • 14 golden tests for known CVEs (Heartbleed, Log4Shell, POODLE)
  • 25 unit tests for VEX evidence emission

References