Sprint 6000 Series Summary: BinaryIndex Module

Overview

The 6000 series implements the BinaryIndex module: a database of known-vulnerable binaries that detects vulnerable code at the binary level, independent of package metadata.

Advisory Source: docs/product-advisories/21-Dec-2025 - Mapping Evidence Within Compiled Binaries.md


MVP Roadmap

MVP 1: Known-Build Binary Catalog (Sprint 6000.0001)

Goal: Query "is this Build-ID vulnerable?" with distro-level precision.

| Sprint | Topic | Description |
|--------|-------|-------------|
| 6000.0001.0001 | Binaries Schema | PostgreSQL schema creation |
| 6000.0001.0002 | Binary Identity Service | Core identity extraction and storage |
| 6000.0001.0003 | Debian Corpus Connector | Debian/Ubuntu package ingestion |
| 6000.0001.0004 | Build-ID Lookup Service | Query API for Build-ID matching |

Acceptance: Given a Build-ID, return associated CVEs from known distro builds.
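
The acceptance criterion reduces to a keyed lookup. A minimal sketch, assuming an in-memory dictionary in place of the vulnerable_buildids table; the class name InMemoryBuildIdCatalog, its methods, and any sample Build-ID or CVE values are hypothetical and not part of the BinaryIndex API:

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;

// Hypothetical stand-in for the vulnerable_buildids table; not the real service.
public sealed class InMemoryBuildIdCatalog
{
    // Build-IDs are hex strings, so keys are compared case-insensitively.
    private readonly Dictionary<string, ImmutableArray<string>> _cvesByBuildId =
        new(StringComparer.OrdinalIgnoreCase);

    // Registers the CVEs associated with one known distro build.
    public void Add(string buildId, params string[] cveIds) =>
        _cvesByBuildId[buildId.Trim()] = ImmutableArray.Create(cveIds);

    // Returns the CVEs for a Build-ID, or an empty array when the build is unknown.
    public ImmutableArray<string> Lookup(string buildId) =>
        _cvesByBuildId.TryGetValue(buildId.Trim(), out var cves)
            ? cves
            : ImmutableArray<string>.Empty;
}
```

An unknown Build-ID yields an empty result rather than an error, so in this sketch "not in the catalog" stays distinct from a lookup failure.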


MVP 2: Patch-Aware Backport Handling (Sprint 6000.0002)

Goal: Handle the case where the version string says "vulnerable" but the distro has backported the fix.

| Sprint | Topic | Description |
|--------|-------|-------------|
| 6000.0002.0001 | Fix Evidence Parser | Changelog and patch header parsing |
| 6000.0002.0002 | Fix Index Builder | Merge evidence into fix index |
| 6000.0002.0003 | Version Comparator Integration | Reference existing Concelier comparators (see note below) |
| 6000.0002.0004 | RPM Corpus Connector | RHEL/Fedora package ingestion |

Acceptance: For a package version that upstream marks as vulnerable to a CVE, correctly report a distro build carrying the backported fix as fixed.

Note (2025-12-22): Sprint 6000.0002.0003 originally planned to implement distro-specific version comparators. However, production-ready comparators already exist in Concelier:

  • src/Concelier/__Libraries/StellaOps.Concelier.Merge/Comparers/Nevra.cs (RPM)
  • src/Concelier/__Libraries/StellaOps.Concelier.Merge/Comparers/DebianEvr.cs (Debian/Ubuntu)
  • src/Concelier/__Libraries/StellaOps.Concelier.Merge/Comparers/ApkVersion.cs (Alpine, via SPRINT_2000_0003_0001)

This sprint should instead:

  1. Create a shared StellaOps.VersionComparison library by extracting the existing comparators
  2. Reference this library from BinaryIndex.FixIndex
  3. Add proof-line generation per SPRINT_4000_0002_0001

See also:

  • SPRINT_2000_0003_0001 (Alpine connector/comparator)
  • SPRINT_2000_0003_0002 (Comprehensive version tests)
  • SPRINT_4000_0002_0001 (Backport UX explainability)
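
To illustrate the backport decision: a fix counts as present when the installed package version sorts at or above the version recorded in the fix index, using the distro's own ordering rather than the upstream version. The following is a simplified sketch of Debian-style epoch:upstream-revision ordering, including the rule that ~ sorts before everything else. It is not the Concelier DebianEvr comparer; DebianOrderSketch, IsFixed, and the proof-line text are hypothetical.

```csharp
using System.Globalization;

// Simplified Debian-style ordering sketch; NOT the Concelier DebianEvr comparer.
public static class DebianOrderSketch
{
    // Returns <0, 0, or >0 when left sorts before, equal to, or after right.
    public static int Compare(string left, string right)
    {
        var (le, lu, lr) = Split(left);
        var (re, ru, rr) = Split(right);
        if (le != re) return le < re ? -1 : 1;
        int upstream = CompareFragment(lu, ru);
        return upstream != 0 ? upstream : CompareFragment(lr, rr);
    }

    // A fix counts as present when installed >= fixedIn (hypothetical proof-line format).
    public static (bool Fixed, string ProofLine) IsFixed(string installed, string fixedIn)
    {
        bool isFixed = Compare(installed, fixedIn) >= 0;
        string op = isFixed ? ">=" : "<";
        return (isFixed, $"installed {installed} {op} fixed {fixedIn} (Debian ordering)");
    }

    // Splits "[epoch:]upstream[-revision]"; a missing epoch is 0, a missing revision is "".
    private static (long Epoch, string Upstream, string Revision) Split(string version)
    {
        long epoch = 0;
        int colon = version.IndexOf(':');
        if (colon >= 0)
        {
            epoch = long.Parse(version[..colon], CultureInfo.InvariantCulture);
            version = version[(colon + 1)..];
        }
        int dash = version.LastIndexOf('-');
        return dash >= 0
            ? (epoch, version[..dash], version[(dash + 1)..])
            : (epoch, version, "");
    }

    private static bool IsDigit(char c) => c >= '0' && c <= '9';
    private static bool IsLetter(char c) => (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    private static char At(string s, int i) => i < s.Length ? s[i] : '\0';

    // Sort weight of a non-digit character: '~' lowest, then end of string, letters, others.
    private static int Order(char c) =>
        c == '\0' || IsDigit(c) ? 0 : IsLetter(c) ? c : c == '~' ? -1 : c + 256;

    // Compares alternating non-digit and digit runs.
    private static int CompareFragment(string a, string b)
    {
        int i = 0, j = 0;
        while (i < a.Length || j < b.Length)
        {
            while ((i < a.Length && !IsDigit(a[i])) || (j < b.Length && !IsDigit(b[j])))
            {
                int ac = Order(At(a, i)), bc = Order(At(b, j));
                if (ac != bc) return ac < bc ? -1 : 1;
                i++; j++;
            }
            while (i < a.Length && a[i] == '0') i++; // leading zeros do not count
            while (j < b.Length && b[j] == '0') j++;
            int firstDiff = 0;
            while (i < a.Length && IsDigit(a[i]) && j < b.Length && IsDigit(b[j]))
            {
                if (firstDiff == 0) firstDiff = a[i].CompareTo(b[j]);
                i++; j++;
            }
            if (i < a.Length && IsDigit(a[i])) return 1;   // longer digit run is larger
            if (j < b.Length && IsDigit(b[j])) return -1;
            if (firstDiff != 0) return firstDiff < 0 ? -1 : 1;
        }
        return 0;
    }
}
```

With this ordering, an installed 1.1.1n-0+deb11u5 compared against a fixed version of 1.1.1n-0+deb11u4 is reported as fixed, even if an upstream-only comparison of 1.1.1n would mark it vulnerable. RPM and Alpine need their own ordering rules, which is why the note above points to the separate Concelier comparers.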

MVP 3: Binary Fingerprint Factory (Sprint 6000.0003)

Goal: Detect vulnerable code independent of package metadata.

| Sprint | Topic | Description |
|--------|-------|-------------|
| 6000.0003.0001 | Fingerprint Storage | Database and blob storage for fingerprints |
| 6000.0003.0002 | Reference Build Pipeline | Generate vulnerable/fixed reference builds |
| 6000.0003.0003 | Fingerprint Generator | Extract function fingerprints from binaries |
| 6000.0003.0004 | Fingerprint Matching Engine | Similarity search and matching |
| 6000.0003.0005 | Validation Corpus | Golden corpus for fingerprint validation |

Acceptance: Detect CVE in stripped binary with no package metadata, confidence > 0.95.
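
This summary does not specify the similarity metric. As one possible illustration of threshold-based matching, the sketch below scores two fingerprints by the Jaccard similarity of their feature-hash sets and accepts a match only above 0.95. The class name, the ulong feature hashes, and the choice of Jaccard are all assumptions made for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical matcher; Jaccard similarity is an assumption, not the module's algorithm.
public static class FingerprintSimilaritySketch
{
    public const double DefaultThreshold = 0.95;

    // |A ∩ B| / |A ∪ B| over two sets of feature hashes.
    public static double Jaccard(IReadOnlySet<ulong> a, IReadOnlySet<ulong> b)
    {
        // Two empty fingerprints carry no evidence, so score them as no match.
        if (a.Count == 0 && b.Count == 0) return 0.0;
        int shared = a.Count(x => b.Contains(x));
        int union = a.Count + b.Count - shared;
        return (double)shared / union;
    }

    // A candidate matches only when similarity is strictly above the threshold.
    public static bool IsMatch(
        IReadOnlySet<ulong> candidate,
        IReadOnlySet<ulong> reference,
        double threshold = DefaultThreshold) =>
        Jaccard(candidate, reference) > threshold;
}
```

A strict threshold trades recall for precision: a function that shares 3 of 5 distinct features with the reference scores 0.6 and is rejected.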


MVP 4: Scanner Integration (Sprint 6000.0004)

Goal: Surface binary evidence in production scans.

| Sprint | Topic | Description |
|--------|-------|-------------|
| 6000.0004.0001 | Scanner Worker Integration | Wire BinaryIndex into scan pipeline |
| 6000.0004.0002 | Findings Ledger Integration | Record binary matches as findings |
| 6000.0004.0003 | Proof Segment Attestation | DSSE attestations for binary evidence |
| 6000.0004.0004 | CLI Binary Match Inspection | CLI commands for match inspection |

Acceptance: Container scan produces binary match findings with evidence chain.


Dependencies

graph TD
    subgraph MVP1["MVP 1: Known-Build Catalog"]
        S6001[6000.0001.0001<br/>Schema]
        S6002[6000.0001.0002<br/>Identity Service]
        S6003[6000.0001.0003<br/>Debian Connector]
        S6004[6000.0001.0004<br/>Build-ID Lookup]

        S6001 --> S6002
        S6002 --> S6003
        S6002 --> S6004
        S6003 --> S6004
    end

    subgraph MVP2["MVP 2: Patch-Aware"]
        S6011[6000.0002.0001<br/>Fix Parser]
        S6012[6000.0002.0002<br/>Fix Index Builder]
        S6013[6000.0002.0003<br/>Version Comparators]
        S6014[6000.0002.0004<br/>RPM Connector]

        S6011 --> S6012
        S6013 --> S6012
        S6012 --> S6014
    end

    subgraph MVP3["MVP 3: Fingerprints"]
        S6021[6000.0003.0001<br/>FP Storage]
        S6022[6000.0003.0002<br/>Ref Build Pipeline]
        S6023[6000.0003.0003<br/>FP Generator]
        S6024[6000.0003.0004<br/>Matching Engine]
        S6025[6000.0003.0005<br/>Validation Corpus]

        S6021 --> S6023
        S6022 --> S6023
        S6023 --> S6024
        S6024 --> S6025
    end

    subgraph MVP4["MVP 4: Integration"]
        S6031[6000.0004.0001<br/>Scanner Integration]
        S6032[6000.0004.0002<br/>Findings Ledger]
        S6033[6000.0004.0003<br/>Attestations]
        S6034[6000.0004.0004<br/>CLI]

        S6031 --> S6032
        S6032 --> S6033
        S6031 --> S6034
    end

    MVP1 --> MVP2
    MVP1 --> MVP3
    MVP2 --> MVP4
    MVP3 --> MVP4
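
The edges in the graph define a partial order over sprints. A minimal sketch of deriving one valid execution order with Kahn's algorithm, using only the MVP 1 edges from the graph; the class and method names are hypothetical, and ties are broken by sprint ID so the result is deterministic:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SprintOrderSketch
{
    // MVP 1 edges copied from the graph: (prerequisite, dependent).
    public static readonly (string Prerequisite, string Dependent)[] Mvp1Edges =
    {
        ("6000.0001.0001", "6000.0001.0002"),
        ("6000.0001.0002", "6000.0001.0003"),
        ("6000.0001.0002", "6000.0001.0004"),
        ("6000.0001.0003", "6000.0001.0004"),
    };

    // Kahn's algorithm: repeatedly emit a sprint with no unfinished prerequisites.
    public static List<string> TopologicalOrder(
        IEnumerable<(string Prerequisite, string Dependent)> edges)
    {
        var remaining = new Dictionary<string, int>();           // unmet prerequisite count
        var dependents = new Dictionary<string, List<string>>(); // adjacency list
        foreach (var (prerequisite, dependent) in edges)
        {
            if (!remaining.ContainsKey(prerequisite)) remaining[prerequisite] = 0;
            remaining[dependent] = remaining.TryGetValue(dependent, out var n) ? n + 1 : 1;
            if (!dependents.TryGetValue(prerequisite, out var list))
                dependents[prerequisite] = list = new List<string>();
            list.Add(dependent);
        }

        // SortedSet keeps ties deterministic (lowest sprint ID first).
        var ready = new SortedSet<string>(
            remaining.Where(kv => kv.Value == 0).Select(kv => kv.Key),
            StringComparer.Ordinal);
        var order = new List<string>();
        while (ready.Count > 0)
        {
            string current = ready.Min!;
            ready.Remove(current);
            order.Add(current);
            if (!dependents.TryGetValue(current, out var next)) continue;
            foreach (var dependent in next)
                if (--remaining[dependent] == 0) ready.Add(dependent);
        }

        // Any sprint left unemitted means the edges contain a cycle.
        if (order.Count != remaining.Count)
            throw new InvalidOperationException("Dependency cycle detected.");
        return order;
    }
}
```

The same routine applies to the other subgraphs by passing their edges. A thrown InvalidOperationException would indicate that a planning change introduced a circular dependency.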

Module Structure

src/BinaryIndex/
├── StellaOps.BinaryIndex.WebService/        # API service
├── StellaOps.BinaryIndex.Worker/            # Corpus ingestion worker
├── __Libraries/
│   ├── StellaOps.BinaryIndex.Core/          # Domain models, interfaces
│   ├── StellaOps.BinaryIndex.Persistence/   # PostgreSQL + RustFS
│   ├── StellaOps.BinaryIndex.Corpus/        # Corpus connector framework
│   ├── StellaOps.BinaryIndex.Corpus.Debian/ # Debian connector
│   ├── StellaOps.BinaryIndex.Corpus.Rpm/    # RPM connector
│   ├── StellaOps.BinaryIndex.FixIndex/      # Patch-aware fix index
│   └── StellaOps.BinaryIndex.Fingerprints/  # Fingerprint generation
└── __Tests/
    ├── StellaOps.BinaryIndex.Core.Tests/
    ├── StellaOps.BinaryIndex.Persistence.Tests/
    ├── StellaOps.BinaryIndex.Corpus.Tests/
    └── StellaOps.BinaryIndex.Integration.Tests/

Key Interfaces

// Query interface (consumed by Scanner.Worker)
public interface IBinaryVulnerabilityService
{
    Task<ImmutableArray<BinaryVulnMatch>> LookupByIdentityAsync(BinaryIdentity identity, CancellationToken ct);
    Task<ImmutableArray<BinaryVulnMatch>> LookupByFingerprintAsync(CodeFingerprint fp, CancellationToken ct);
    Task<FixRecord?> GetFixStatusAsync(string distro, string release, string sourcePkg, string cveId, CancellationToken ct);
}

// Corpus connector interface
public interface IBinaryCorpusConnector
{
    string ConnectorId { get; }
    Task<CorpusSnapshot> FetchSnapshotAsync(CorpusQuery query, CancellationToken ct);
    IAsyncEnumerable<ExtractedBinary> ExtractBinariesAsync(PackageReference pkg, CancellationToken ct);
}

// Fix index interface
public interface IFixIndexBuilder
{
    Task BuildIndexAsync(DistroRelease distro, CancellationToken ct);
    Task<FixRecord?> GetFixRecordAsync(string distro, string release, string sourcePkg, string cveId, CancellationToken ct);
}

Database Schema

Schema: binaries
Owner: BinaryIndex module

Key Tables:

| Table | Purpose |
|-------|---------|
| binary_identity | Known binary identities (Build-ID, hashes) |
| binary_package_map | Binary → package mapping per snapshot |
| vulnerable_buildids | Build-IDs known to be vulnerable |
| cve_fix_index | Patch-aware fix status per distro |
| vulnerable_fingerprints | Function fingerprints for CVEs |
| fingerprint_matches | Match results (findings evidence) |

See: docs/db/schemas/binaries_schema_specification.md


Integration Points

Scanner.Worker

// During binary extraction
var identity = await _featureExtractor.ExtractIdentityAsync(binaryStream, ct);
var matches = await _binaryVulnService.LookupByIdentityAsync(identity, ct);

// If distro known, check fix status
var fixStatus = await _binaryVulnService.GetFixStatusAsync(
    distro, release, sourcePkg, cveId, ct);

Findings Ledger

public record BinaryVulnerabilityFinding : IFinding
{
    public string MatchType { get; init; }     // "fingerprint", "buildid"
    public string VulnerablePurl { get; init; }
    public string MatchedSymbol { get; init; }
    public float Similarity { get; init; }
    public string[] LinkedCves { get; init; }
}

Policy Engine

New proof segment type: binary_fingerprint_evidence


Configuration

binaryindex:
  enabled: true
  corpus:
    connectors:
      - type: debian
        enabled: true
        releases: [bookworm, bullseye, jammy, noble]
  fingerprinting:
    enabled: true
    target_components: [openssl, glibc, zlib, curl]
  lookup:
    cache_ttl: 3600

Success Criteria

MVP 1

  • binaries schema deployed and migrated
  • Debian/Ubuntu corpus ingestion operational
  • Build-ID lookup returns CVEs with < 100ms p95 latency
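
The latency criterion can be checked with a nearest-rank percentile over measured lookup times. A minimal sketch; the helper names are hypothetical, and nearest-rank is only one of several percentile definitions:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LatencySketch
{
    // Nearest-rank percentile: the smallest sample such that at least
    // `percentile` percent of all samples are less than or equal to it.
    public static double Percentile(IEnumerable<double> samplesMs, double percentile)
    {
        if (percentile <= 0 || percentile > 100)
            throw new ArgumentOutOfRangeException(nameof(percentile));
        var sorted = samplesMs.OrderBy(x => x).ToArray();
        if (sorted.Length == 0)
            throw new ArgumentException("No samples.", nameof(samplesMs));
        // Multiply before dividing to avoid rounding error in percentile / 100.
        int rank = (int)Math.Ceiling(percentile * sorted.Length / 100.0);
        return sorted[rank - 1];
    }

    // True when the p95 latency is strictly below the target (100 ms by default).
    public static bool MeetsP95Target(IEnumerable<double> samplesMs, double targetMs = 100.0) =>
        Percentile(samplesMs, 95) < targetMs;
}
```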

MVP 2

  • Fix index correctly handles Debian/RHEL backports
  • 95%+ accuracy on backport test corpus

MVP 3

  • Fingerprints generated for OpenSSL, glibc, zlib, curl
  • < 5% false positive rate on validation corpus

MVP 4

  • Scanner produces binary match findings
  • DSSE attestations include binary evidence
  • CLI stella binary-matches command operational

References

  • Architecture: docs/modules/binaryindex/architecture.md
  • Schema: docs/db/schemas/binaries_schema_specification.md
  • Advisory: docs/product-advisories/21-Dec-2025 - Mapping Evidence Within Compiled Binaries.md
  • Existing fingerprinting: src/Scanner/__Libraries/StellaOps.Scanner.EntryTrace/Binary/
  • Build-ID indexing: src/Scanner/StellaOps.Scanner.Analyzers.Native/Index/

Document Version: 1.0.0
Created: 2025-12-21