# Sprint 6000 Series Summary: BinaryIndex Module

## Overview

The 6000 series implements the **BinaryIndex** module - a vulnerable binaries database that enables detection of vulnerable code at the binary level, independent of package metadata.

**Advisory Source:** `docs/product-advisories/21-Dec-2025 - Mapping Evidence Within Compiled Binaries.md`

---

## MVP Roadmap

### MVP 1: Known-Build Binary Catalog (Sprint 6000.0001)

**Goal:** Query "is this Build-ID vulnerable?" with distro-level precision.

| Sprint | Topic | Description |
|--------|-------|-------------|
| 6000.0001.0001 | Binaries Schema | PostgreSQL schema creation |
| 6000.0001.0002 | Binary Identity Service | Core identity extraction and storage |
| 6000.0001.0003 | Debian Corpus Connector | Debian/Ubuntu package ingestion |
| 6000.0001.0004 | Build-ID Lookup Service | Query API for Build-ID matching |

**Acceptance:** Given a Build-ID, return associated CVEs from known distro builds.

---

### MVP 2: Patch-Aware Backport Handling (Sprint 6000.0002)

**Goal:** Handle "version says vulnerable but distro backported the fix."

| Sprint | Topic | Description |
|--------|-------|-------------|
| 6000.0002.0001 | Fix Evidence Parser | Changelog and patch header parsing |
| 6000.0002.0002 | Fix Index Builder | Merge evidence into fix index |
| 6000.0002.0003 | Version Comparators | Distro-specific version comparison |
| 6000.0002.0004 | RPM Corpus Connector | RHEL/Fedora package ingestion |

**Acceptance:** For a CVE that upstream marks vulnerable, correctly identify distro backport as fixed.

---

### MVP 3: Binary Fingerprint Factory (Sprint 6000.0003)

**Goal:** Detect vulnerable code independent of package metadata.


| Sprint | Topic | Description |
|--------|-------|-------------|
| 6000.0003.0001 | Fingerprint Storage | Database and blob storage for fingerprints |
| 6000.0003.0002 | Reference Build Pipeline | Generate vulnerable/fixed reference builds |
| 6000.0003.0003 | Fingerprint Generator | Extract function fingerprints from binaries |
| 6000.0003.0004 | Fingerprint Matching Engine | Similarity search and matching |
| 6000.0003.0005 | Validation Corpus | Golden corpus for fingerprint validation |

**Acceptance:** Detect CVE in stripped binary with no package metadata, confidence > 0.95.

---

### MVP 4: Scanner Integration (Sprint 6000.0004)

**Goal:** Binary evidence in production scans.

| Sprint | Topic | Description |
|--------|-------|-------------|
| 6000.0004.0001 | Scanner Worker Integration | Wire BinaryIndex into scan pipeline |
| 6000.0004.0002 | Findings Ledger Integration | Record binary matches as findings |
| 6000.0004.0003 | Proof Segment Attestation | DSSE attestations for binary evidence |
| 6000.0004.0004 | CLI Binary Match Inspection | CLI commands for match inspection |

**Acceptance:** Container scan produces binary match findings with evidence chain.

---

## Dependencies

```mermaid
graph TD
    subgraph MVP1["MVP 1: Known-Build Catalog"]
        S6001[6000.0001.0001<br/>Schema]
        S6002[6000.0001.0002<br/>Identity Service]
        S6003[6000.0001.0003<br/>Debian Connector]
        S6004[6000.0001.0004<br/>Build-ID Lookup]
        S6001 --> S6002
        S6002 --> S6003
        S6002 --> S6004
        S6003 --> S6004
    end

    subgraph MVP2["MVP 2: Patch-Aware"]
        S6011[6000.0002.0001<br/>Fix Parser]
        S6012[6000.0002.0002<br/>Fix Index Builder]
        S6013[6000.0002.0003<br/>Version Comparators]
        S6014[6000.0002.0004<br/>RPM Connector]
        S6011 --> S6012
        S6013 --> S6012
        S6012 --> S6014
    end

    subgraph MVP3["MVP 3: Fingerprints"]
        S6021[6000.0003.0001<br/>FP Storage]
        S6022[6000.0003.0002<br/>Ref Build Pipeline]
        S6023[6000.0003.0003<br/>FP Generator]
        S6024[6000.0003.0004<br/>Matching Engine]
        S6025[6000.0003.0005<br/>Validation Corpus]
        S6021 --> S6023
        S6022 --> S6023
        S6023 --> S6024
        S6024 --> S6025
    end

    subgraph MVP4["MVP 4: Integration"]
        S6031[6000.0004.0001<br/>Scanner Integration]
        S6032[6000.0004.0002<br/>Findings Ledger]
        S6033[6000.0004.0003<br/>Attestations]
        S6034[6000.0004.0004<br/>CLI]
        S6031 --> S6032
        S6032 --> S6033
        S6031 --> S6034
    end

    MVP1 --> MVP2
    MVP1 --> MVP3
    MVP2 --> MVP4
    MVP3 --> MVP4
```

---

## Module Structure

```
src/BinaryIndex/
├── StellaOps.BinaryIndex.WebService/          # API service
├── StellaOps.BinaryIndex.Worker/              # Corpus ingestion worker
├── __Libraries/
│   ├── StellaOps.BinaryIndex.Core/            # Domain models, interfaces
│   ├── StellaOps.BinaryIndex.Persistence/     # PostgreSQL + RustFS
│   ├── StellaOps.BinaryIndex.Corpus/          # Corpus connector framework
│   ├── StellaOps.BinaryIndex.Corpus.Debian/   # Debian connector
│   ├── StellaOps.BinaryIndex.Corpus.Rpm/      # RPM connector
│   ├── StellaOps.BinaryIndex.FixIndex/        # Patch-aware fix index
│   └── StellaOps.BinaryIndex.Fingerprints/    # Fingerprint generation
└── __Tests/
    ├── StellaOps.BinaryIndex.Core.Tests/
    ├── StellaOps.BinaryIndex.Persistence.Tests/
    ├── StellaOps.BinaryIndex.Corpus.Tests/
    └── StellaOps.BinaryIndex.Integration.Tests/
```

---

## Key Interfaces

```csharp
// Query interface (consumed by Scanner.Worker)
public interface IBinaryVulnerabilityService
{
    Task> LookupByIdentityAsync(BinaryIdentity identity, CancellationToken ct);
    Task> LookupByFingerprintAsync(CodeFingerprint fp, CancellationToken ct);
    Task GetFixStatusAsync(string distro, string release, string sourcePkg, string cveId, CancellationToken ct);
}

// Corpus connector interface
public interface IBinaryCorpusConnector
{
    string ConnectorId { get; }
    Task FetchSnapshotAsync(CorpusQuery query, CancellationToken ct);
    IAsyncEnumerable ExtractBinariesAsync(PackageReference pkg, CancellationToken ct);
}

// Fix index interface
public interface IFixIndexBuilder
{
    Task BuildIndexAsync(DistroRelease distro, CancellationToken ct);
    Task GetFixRecordAsync(string distro, string release, string sourcePkg, string cveId, CancellationToken ct);
}
```

---

## Database Schema

Schema: `binaries`
Owner: BinaryIndex module

**Key Tables:**

| Table | Purpose |
|-------|---------|
| `binary_identity` | Known binary identities (Build-ID, hashes) |
| `binary_package_map` | Binary → package mapping per snapshot |
| `vulnerable_buildids` | Build-IDs known to be vulnerable |
| `cve_fix_index` | Patch-aware fix status per distro |
| `vulnerable_fingerprints` | Function fingerprints for CVEs |
| `fingerprint_matches` | Match results (findings evidence) |

See: `docs/db/schemas/binaries_schema_specification.md`

---

## Integration Points

### Scanner.Worker

```csharp
// During binary extraction
var identity = await _featureExtractor.ExtractIdentityAsync(binaryStream, ct);
var matches = await _binaryVulnService.LookupByIdentityAsync(identity, ct);

// If distro known, check fix status
var fixStatus = await _binaryVulnService.GetFixStatusAsync(
    distro, release, sourcePkg, cveId, ct);
```

### Findings Ledger

```csharp
public record BinaryVulnerabilityFinding : IFinding
{
    public string MatchType { get; init; }  // "fingerprint", "buildid"
    public string VulnerablePurl { get; init; }
    public string MatchedSymbol { get; init; }
    public float Similarity { get; init; }
    public string[] LinkedCves { get; init; }
}
```

### Policy Engine

New proof segment type: `binary_fingerprint_evidence`

---

## Configuration

```yaml
binaryindex:
  enabled: true
  corpus:
    connectors:
      - type: debian
        enabled: true
        releases: [bookworm, bullseye, jammy, noble]
  fingerprinting:
    enabled: true
    target_components: [openssl, glibc, zlib, curl]
  lookup:
    cache_ttl: 3600
```

---

## Success Criteria

### MVP 1
- [ ] `binaries` schema deployed and migrated
- [ ] Debian/Ubuntu corpus ingestion operational
- [ ] Build-ID lookup returns CVEs with < 100ms p95 latency

### MVP 2
- [ ] Fix index correctly handles Debian/RHEL backports
- [ ] 95%+ accuracy on backport test corpus

### MVP 3
- [ ] Fingerprints generated for OpenSSL, glibc, zlib, curl
- [ ] < 5% false positive rate on validation corpus

### MVP 4
- [ ] Scanner produces binary match findings
- [ ] DSSE attestations include binary evidence
- [ ] CLI `stella binary-matches` command operational

---

## References

- Architecture: `docs/modules/binaryindex/architecture.md`
- Schema: `docs/db/schemas/binaries_schema_specification.md`
- Advisory: `docs/product-advisories/21-Dec-2025 - Mapping Evidence Within Compiled Binaries.md`
- Existing fingerprinting: `src/Scanner/__Libraries/StellaOps.Scanner.EntryTrace/Binary/`
- Build-ID indexing: `src/Scanner/StellaOps.Scanner.Analyzers.Native/Index/`

---

*Document Version: 1.0.0*
*Created: 2025-12-21*
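
---

To make the MVP 1 acceptance criterion concrete (given a Build-ID, return the associated CVEs), the following is a minimal in-memory sketch of the lookup behaviour. It is illustrative only: `InMemoryBuildIdCatalog`, `Add`, and `Lookup` are hypothetical names that do not appear in the BinaryIndex design, and the sketch assumes Build-IDs are hex strings that can be compared case-insensitively. The planned service is backed by the `binaries` PostgreSQL schema rather than a dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the `vulnerable_buildids` table:
// maps a Build-ID to the set of CVE IDs recorded against it.
public sealed class InMemoryBuildIdCatalog
{
    // Assumption: Build-IDs are hex strings, so case differences are ignored.
    private readonly Dictionary<string, HashSet<string>> _cvesByBuildId =
        new Dictionary<string, HashSet<string>>(StringComparer.OrdinalIgnoreCase);

    // Records that the binary with this Build-ID is affected by the given CVE.
    public void Add(string buildId, string cveId)
    {
        var key = Normalize(buildId);
        if (!_cvesByBuildId.TryGetValue(key, out var cves))
        {
            cves = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
            _cvesByBuildId[key] = cves;
        }
        cves.Add(cveId); // duplicates are ignored by the set
    }

    // Returns the CVE IDs known for the Build-ID in ordinal order,
    // or an empty list when the Build-ID is not in the catalog.
    public IReadOnlyList<string> Lookup(string buildId)
    {
        if (!_cvesByBuildId.TryGetValue(Normalize(buildId), out var cves))
        {
            return Array.Empty<string>();
        }
        return cves.OrderBy(c => c, StringComparer.Ordinal).ToList();
    }

    // Strips surrounding whitespace so padded inputs still match.
    private static string Normalize(string buildId) => buildId.Trim();
}
```

An unknown Build-ID yields an empty list rather than an error, which lets a caller treat "no match" as a normal outcome and fall through to other evidence such as fingerprint matching.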