Sprint 6000 Series Summary: BinaryIndex Module
Overview
The 6000 series implements the BinaryIndex module - a vulnerable binaries database that enables detection of vulnerable code at the binary level, independent of package metadata.
Advisory Source: docs/product-advisories/21-Dec-2025 - Mapping Evidence Within Compiled Binaries.md
MVP Roadmap
MVP 1: Known-Build Binary Catalog (Sprint 6000.0001)
Goal: Query "is this Build-ID vulnerable?" with distro-level precision.
| Sprint | Topic | Description |
|---|---|---|
| 6000.0001.0001 | Binaries Schema | PostgreSQL schema creation |
| 6000.0001.0002 | Binary Identity Service | Core identity extraction and storage |
| 6000.0001.0003 | Debian Corpus Connector | Debian/Ubuntu package ingestion |
| 6000.0001.0004 | Build-ID Lookup Service | Query API for Build-ID matching |
Acceptance: Given a Build-ID, return associated CVEs from known distro builds.
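The MVP 1 acceptance criterion can be sketched as a simple keyed lookup. This is a minimal illustration, not the module's implementation: `BuildIdCatalog`, `Register`, and `Lookup` are hypothetical names, an in-memory dictionary stands in for the PostgreSQL-backed catalog, and the CVE identifiers in the usage note are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;

// Hypothetical in-memory stand-in for the Build-ID catalog. The real service
// is expected to query PostgreSQL rather than a dictionary.
public sealed class BuildIdCatalog
{
    // Build-IDs are hex strings, so they are compared case-insensitively.
    private readonly Dictionary<string, ImmutableArray<string>> _cvesByBuildId =
        new(StringComparer.OrdinalIgnoreCase);

    // Records the CVEs associated with a known distro build.
    public void Register(string buildId, params string[] cveIds) =>
        _cvesByBuildId[buildId] = ImmutableArray.Create(cveIds);

    // Returns the CVEs recorded for a Build-ID, or an empty array when unknown.
    public ImmutableArray<string> Lookup(string buildId) =>
        _cvesByBuildId.TryGetValue(buildId, out var cves)
            ? cves
            : ImmutableArray<string>.Empty;
}
```

For example, after `catalog.Register("ABCDEF0123", "CVE-0000-0001")`, calling `catalog.Lookup("abcdef0123")` returns that single placeholder CVE, while an unregistered Build-ID yields an empty array rather than an error.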
MVP 2: Patch-Aware Backport Handling (Sprint 6000.0002)
Goal: Handle "version says vulnerable but distro backported the fix."
| Sprint | Topic | Description |
|---|---|---|
| 6000.0002.0001 | Fix Evidence Parser | Changelog and patch header parsing |
| 6000.0002.0002 | Fix Index Builder | Merge evidence into fix index |
| 6000.0002.0003 | Version Comparators | Distro-specific version comparison |
| 6000.0002.0004 | RPM Corpus Connector | RHEL/Fedora package ingestion |
Acceptance: For a CVE that upstream marks vulnerable, correctly identify distro backport as fixed.
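The backport decision described above can be sketched as follows, assuming the fix index yields the first distro package version that carries the fix. `FixState`, `BackportFix`, and `BackportResolver` are hypothetical names, and the comparator is injected because real Debian and RPM version ordering is distro-specific and considerably more involved than a plain string comparison.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public enum FixState { Vulnerable, Fixed }

// Hypothetical, simplified fix record: the first distro package version
// that contains the backported fix for a CVE.
public sealed record BackportFix(string CveId, string FixedInVersion);

public static class BackportResolver
{
    // Decides fix status from distro evidence instead of the upstream version range.
    // distroComparer must implement the distro's own version ordering.
    public static FixState Resolve(
        string installedVersion,
        BackportFix? fix,
        IComparer<string> distroComparer)
    {
        // No backport evidence: keep the upstream verdict, assumed "vulnerable" here.
        if (fix is null)
        {
            return FixState.Vulnerable;
        }

        // Installed version at or beyond the fixed version means the fix is present.
        return distroComparer.Compare(installedVersion, fix.FixedInVersion) >= 0
            ? FixState.Fixed
            : FixState.Vulnerable;
    }
}
```

Keeping the comparator as a parameter means the same resolution logic can serve both the Debian and RPM connectors once each supplies its own ordering.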
MVP 3: Binary Fingerprint Factory (Sprint 6000.0003)
Goal: Detect vulnerable code independent of package metadata.
| Sprint | Topic | Description |
|---|---|---|
| 6000.0003.0001 | Fingerprint Storage | Database and blob storage for fingerprints |
| 6000.0003.0002 | Reference Build Pipeline | Generate vulnerable/fixed reference builds |
| 6000.0003.0003 | Fingerprint Generator | Extract function fingerprints from binaries |
| 6000.0003.0004 | Fingerprint Matching Engine | Similarity search and matching |
| 6000.0003.0005 | Validation Corpus | Golden corpus for fingerprint validation |
Acceptance: Detect CVE in stripped binary with no package metadata, confidence > 0.95.
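To make the confidence threshold concrete, here is a minimal sketch of thresholded similarity matching. The source does not say how similarity is computed, so Jaccard similarity over sets of hashed function features is an assumption chosen purely for illustration, and `FingerprintSimilarity` and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class FingerprintSimilarity
{
    // Jaccard similarity: |A ∩ B| / |A ∪ B| over two sets of feature hashes.
    public static double Jaccard(
        IReadOnlyCollection<ulong> candidate,
        IReadOnlyCollection<ulong> reference)
    {
        var left = new HashSet<ulong>(candidate);
        var right = new HashSet<ulong>(reference);

        // Two empty feature sets carry no evidence, so treat them as no match.
        if (left.Count == 0 && right.Count == 0)
        {
            return 0.0;
        }

        var intersection = new HashSet<ulong>(left);
        intersection.IntersectWith(right);

        int unionCount = left.Count + right.Count - intersection.Count;
        return (double)intersection.Count / unionCount;
    }

    // A candidate matches only when similarity is strictly above the threshold,
    // mirroring the "confidence > 0.95" acceptance criterion.
    public static bool IsMatch(
        IReadOnlyCollection<ulong> candidate,
        IReadOnlyCollection<ulong> reference,
        double threshold = 0.95) =>
        Jaccard(candidate, reference) > threshold;
}
```

A strict set-overlap metric like this is easy to reason about but sensitive to compiler and optimization differences; a production matcher would likely need features that are more robust to such variation.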
MVP 4: Scanner Integration (Sprint 6000.0004)
Goal: Binary evidence in production scans.
| Sprint | Topic | Description |
|---|---|---|
| 6000.0004.0001 | Scanner Worker Integration | Wire BinaryIndex into scan pipeline |
| 6000.0004.0002 | Findings Ledger Integration | Record binary matches as findings |
| 6000.0004.0003 | Proof Segment Attestation | DSSE attestations for binary evidence |
| 6000.0004.0004 | CLI Binary Match Inspection | CLI commands for match inspection |
Acceptance: Container scan produces binary match findings with evidence chain.
Dependencies
graph TD
subgraph MVP1["MVP 1: Known-Build Catalog"]
S6001[6000.0001.0001<br/>Schema]
S6002[6000.0001.0002<br/>Identity Service]
S6003[6000.0001.0003<br/>Debian Connector]
S6004[6000.0001.0004<br/>Build-ID Lookup]
S6001 --> S6002
S6002 --> S6003
S6002 --> S6004
S6003 --> S6004
end
subgraph MVP2["MVP 2: Patch-Aware"]
S6011[6000.0002.0001<br/>Fix Parser]
S6012[6000.0002.0002<br/>Fix Index Builder]
S6013[6000.0002.0003<br/>Version Comparators]
S6014[6000.0002.0004<br/>RPM Connector]
S6011 --> S6012
S6013 --> S6012
S6012 --> S6014
end
subgraph MVP3["MVP 3: Fingerprints"]
S6021[6000.0003.0001<br/>FP Storage]
S6022[6000.0003.0002<br/>Ref Build Pipeline]
S6023[6000.0003.0003<br/>FP Generator]
S6024[6000.0003.0004<br/>Matching Engine]
S6025[6000.0003.0005<br/>Validation Corpus]
S6021 --> S6023
S6022 --> S6023
S6023 --> S6024
S6024 --> S6025
end
subgraph MVP4["MVP 4: Integration"]
S6031[6000.0004.0001<br/>Scanner Integration]
S6032[6000.0004.0002<br/>Findings Ledger]
S6033[6000.0004.0003<br/>Attestations]
S6034[6000.0004.0004<br/>CLI]
S6031 --> S6032
S6032 --> S6033
S6031 --> S6034
end
MVP1 --> MVP2
MVP1 --> MVP3
MVP2 --> MVP4
MVP3 --> MVP4
Module Structure
src/BinaryIndex/
├── StellaOps.BinaryIndex.WebService/ # API service
├── StellaOps.BinaryIndex.Worker/ # Corpus ingestion worker
├── __Libraries/
│ ├── StellaOps.BinaryIndex.Core/ # Domain models, interfaces
│ ├── StellaOps.BinaryIndex.Persistence/ # PostgreSQL + RustFS
│ ├── StellaOps.BinaryIndex.Corpus/ # Corpus connector framework
│ ├── StellaOps.BinaryIndex.Corpus.Debian/ # Debian connector
│ ├── StellaOps.BinaryIndex.Corpus.Rpm/ # RPM connector
│ ├── StellaOps.BinaryIndex.FixIndex/ # Patch-aware fix index
│ └── StellaOps.BinaryIndex.Fingerprints/ # Fingerprint generation
└── __Tests/
├── StellaOps.BinaryIndex.Core.Tests/
├── StellaOps.BinaryIndex.Persistence.Tests/
├── StellaOps.BinaryIndex.Corpus.Tests/
└── StellaOps.BinaryIndex.Integration.Tests/
Key Interfaces
// Query interface (consumed by Scanner.Worker)
public interface IBinaryVulnerabilityService
{
Task<ImmutableArray<BinaryVulnMatch>> LookupByIdentityAsync(BinaryIdentity identity, CancellationToken ct);
Task<ImmutableArray<BinaryVulnMatch>> LookupByFingerprintAsync(CodeFingerprint fp, CancellationToken ct);
Task<FixRecord?> GetFixStatusAsync(string distro, string release, string sourcePkg, string cveId, CancellationToken ct);
}
// Corpus connector interface
public interface IBinaryCorpusConnector
{
string ConnectorId { get; }
Task<CorpusSnapshot> FetchSnapshotAsync(CorpusQuery query, CancellationToken ct);
IAsyncEnumerable<ExtractedBinary> ExtractBinariesAsync(PackageReference pkg, CancellationToken ct);
}
// Fix index interface
public interface IFixIndexBuilder
{
Task BuildIndexAsync(DistroRelease distro, CancellationToken ct);
Task<FixRecord?> GetFixRecordAsync(string distro, string release, string sourcePkg, string cveId, CancellationToken ct);
}
Database Schema
Schema: binaries
Owner: BinaryIndex module
Key Tables:
| Table | Purpose |
|---|---|
| binary_identity | Known binary identities (Build-ID, hashes) |
| binary_package_map | Binary → package mapping per snapshot |
| vulnerable_buildids | Build-IDs known to be vulnerable |
| cve_fix_index | Patch-aware fix status per distro |
| vulnerable_fingerprints | Function fingerprints for CVEs |
| fingerprint_matches | Match results (findings evidence) |
See: docs/db/schemas/binaries_schema_specification.md
Integration Points
Scanner.Worker
// During binary extraction
var identity = await _featureExtractor.ExtractIdentityAsync(binaryStream, ct);
var matches = await _binaryVulnService.LookupByIdentityAsync(identity, ct);
// If distro known, check fix status
var fixStatus = await _binaryVulnService.GetFixStatusAsync(
distro, release, sourcePkg, cveId, ct);
Findings Ledger
public record BinaryVulnerabilityFinding : IFinding
{
public string MatchType { get; init; } // "fingerprint", "buildid"
public string VulnerablePurl { get; init; }
public string MatchedSymbol { get; init; }
public float Similarity { get; init; }
public string[] LinkedCves { get; init; }
}
Policy Engine
New proof segment type: binary_fingerprint_evidence
Configuration
binaryindex:
enabled: true
corpus:
connectors:
- type: debian
enabled: true
releases: [bookworm, bullseye, jammy, noble]
fingerprinting:
enabled: true
target_components: [openssl, glibc, zlib, curl]
lookup:
cache_ttl: 3600
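The configuration above could map onto strongly typed options along these lines. This is a sketch under stated assumptions: every class and property name is hypothetical, the defaults simply echo the sample values, and `cache_ttl` is assumed to be expressed in seconds, which the source does not state.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical options types mirroring the YAML keys; not the module's actual classes.
public sealed class BinaryIndexOptions
{
    public bool Enabled { get; set; } = true;
    public CorpusOptions Corpus { get; set; } = new();
    public FingerprintingOptions Fingerprinting { get; set; } = new();
    public LookupOptions Lookup { get; set; } = new();
}

public sealed class CorpusOptions
{
    public List<ConnectorOptions> Connectors { get; set; } = new();
}

public sealed class ConnectorOptions
{
    public string Type { get; set; } = "";          // e.g. "debian"
    public bool Enabled { get; set; } = true;
    public List<string> Releases { get; set; } = new();
}

public sealed class FingerprintingOptions
{
    public bool Enabled { get; set; } = true;
    public List<string> TargetComponents { get; set; } = new();
}

public sealed class LookupOptions
{
    // Assumption: the TTL is in seconds.
    public int CacheTtl { get; set; } = 3600;

    // Convenience view of the TTL for cache APIs that expect a TimeSpan.
    public TimeSpan CacheTtlDuration => TimeSpan.FromSeconds(CacheTtl);
}
```

Under the seconds assumption, the sample value of 3600 corresponds to a one-hour cache lifetime for lookup results.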
Success Criteria
MVP 1
- binaries schema deployed and migrated
- Debian/Ubuntu corpus ingestion operational
- Build-ID lookup returns CVEs with < 100ms p95 latency
MVP 2
- Fix index correctly handles Debian/RHEL backports
- 95%+ accuracy on backport test corpus
MVP 3
- Fingerprints generated for OpenSSL, glibc, zlib, curl
- < 5% false positive rate on validation corpus
MVP 4
- Scanner produces binary match findings
- DSSE attestations include binary evidence
- CLI stella binary-matches command operational
References
- Architecture: docs/modules/binaryindex/architecture.md
- Schema: docs/db/schemas/binaries_schema_specification.md
- Advisory: docs/product-advisories/21-Dec-2025 - Mapping Evidence Within Compiled Binaries.md
- Existing fingerprinting: src/Scanner/__Libraries/StellaOps.Scanner.EntryTrace/Binary/
- Build-ID indexing: src/Scanner/StellaOps.Scanner.Analyzers.Native/Index/
Document Version: 1.0.0
Created: 2025-12-21