tests fixes and sprints work
@@ -1239,7 +1239,183 @@ binaryindex:

---

## 10. References
## 10. Golden Corpus for Patch Provenance

> **Sprint:** SPRINT_20260121_034/035/036 - Golden Corpus Implementation

The BinaryIndex module supports a **golden corpus** of patch-paired artifacts that enables offline SBOM reproducibility and binary-level patch provenance verification.

### 10.1 Corpus Purpose

The golden corpus provides:

- **Auditor-ready evidence bundles** for air-gapped customers
- **Regression testing** for binary matching accuracy
- **Proof of patch status** independent of package metadata

### 10.2 Corpus Sources

| Source | Type | Purpose |
|--------|------|---------|
| Debian Security Tracker / DSAs | Advisory | Primary advisory linkage |
| Debian Snapshot | Binary archive | Pre/post patch binary pairs |
| Ubuntu Security Notices | Advisory | Ubuntu-specific advisories |
| Alpine secdb | Advisory | Alpine YAML advisories |
| OSV dump | Unified schema | Cross-reference and commit ranges |

### 10.2.1 Symbol Source Connectors

> **Sprint:** SPRINT_20260121_035_BinaryIndex_golden_corpus_connectors_cli

The corpus ingestion layer uses pluggable connectors to retrieve symbols and metadata from upstream sources:

| Connector ID | Implementation | Protocol | Data Retrieved |
|--------------|----------------|----------|----------------|
| `debuginfod-fedora` | `DebuginfodConnector` | debuginfod HTTP | ELF debug symbols by Build-ID |
| `debuginfod-ubuntu` | `DebuginfodConnector` | debuginfod HTTP | ELF debug symbols by Build-ID |
| `ddeb-ubuntu` | `DdebConnector` | APT/HTTP | `.ddeb` debug packages |
| `buildinfo-debian` | `BuildinfoConnector` | HTTP | `.buildinfo` reproducibility records |
| `secdb-alpine` | `AlpineSecDbConnector` | Git/HTTP | `secfixes` YAML from APKBUILD |

**Connector Interface:**

```csharp
public interface ISymbolSourceConnector
{
    string ConnectorId { get; }
    string DisplayName { get; }
    string[] SupportedDistros { get; }

    Task<ConnectorStatus> GetStatusAsync(CancellationToken ct);
    Task SyncAsync(SyncOptions options, CancellationToken ct);
    Task<SymbolLookupResult?> LookupByBuildIdAsync(string buildId, CancellationToken ct);
    Task<IAsyncEnumerable<SymbolRecord>> SearchAsync(SymbolSearchQuery query, CancellationToken ct);
}
```

**Debuginfod Connector:**

The `DebuginfodConnector` implements the [debuginfod protocol](https://sourceware.org/elfutils/Debuginfod.html) for retrieving debug symbols:

- Endpoint: `GET /buildid/<build-id>/debuginfo`
- Supports federated queries across multiple debuginfod servers
- Caches retrieved symbols in RustFS blob storage
- Rate-limited to respect upstream server policies
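
The endpoint and federation behaviour listed above can be illustrated with a small Python sketch. It is not the `DebuginfodConnector` code: the function names, the example server URLs, and the injected `fetch` callable are assumptions made for this illustration, and caching and rate limiting are left out.

```python
import re
from typing import Callable, Iterable, Optional

# Build-IDs are written as lowercase hex strings.
_BUILD_ID_RE = re.compile(r"[0-9a-f]+")


def debuginfo_url(server: str, build_id: str) -> str:
    """Return the debuginfod debuginfo URL for a Build-ID on one server."""
    normalized = build_id.strip().lower()
    if not _BUILD_ID_RE.fullmatch(normalized):
        raise ValueError(f"Build-ID must be hex: {build_id!r}")
    return f"{server.rstrip('/')}/buildid/{normalized}/debuginfo"


def federated_lookup(
    servers: Iterable[str],
    build_id: str,
    fetch: Callable[[str], Optional[bytes]],
) -> Optional[bytes]:
    """Try each server in order and return the first non-empty payload.

    `fetch` is a hypothetical transport hook that returns None on a miss.
    """
    for server in servers:
        payload = fetch(debuginfo_url(server, build_id))
        if payload:
            return payload
    return None
```

Passing the transport in as a callable keeps the sketch free of network access, so the URL construction and fallback order can be exercised offline.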

**Ubuntu ddeb Connector:**

The `DdebConnector` retrieves Ubuntu debug symbol packages (`.ddeb`):

- Sources: `ddebs.ubuntu.com` mirror
- Indexes: Reads `Packages.xz` for package metadata
- Extraction: Unpacks `.ddeb` AR archives to extract DWARF symbols
- Mapping: Links debug symbols to binary packages via Build-ID

**Debian Buildinfo Connector:**

The `BuildinfoConnector` retrieves Debian buildinfo files for reproducibility verification:

- Source: `buildinfos.debian.net` and snapshot archives
- Purpose: Provides build environment metadata for reproducible builds
- Fields extracted: `Build-Date`, `Build-Architecture`, `Checksums-Sha256`
- Integration: Cross-references with binary packages for provenance
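
To make the extracted fields concrete, here is a minimal Python sketch that reads the three fields named above from a `.buildinfo` file, assuming the usual Debian control-file layout (`Field: value` lines with indented continuation lines). The function names are hypothetical and signature verification is deliberately out of scope.

```python
from typing import Dict, List, Tuple


def parse_buildinfo(text: str) -> Dict[str, str]:
    """Parse 'Field: value' lines, joining indented continuation lines."""
    fields: Dict[str, str] = {}
    current = None
    for line in text.splitlines():
        if line.startswith("-----BEGIN PGP SIGNATURE-----"):
            break  # stop at the signature block of a clearsigned file
        if line[:1] in (" ", "\t") and current:
            fields[current] += "\n" + line.strip()
        elif ":" in line and not line.startswith("-"):
            name, _, value = line.partition(":")
            current = name.strip()
            fields[current] = value.strip()
        else:
            current = None  # blank line or armor header
    return fields


def sha256_entries(fields: Dict[str, str]) -> List[Tuple[str, int, str]]:
    """Return (digest, size, filename) rows from Checksums-Sha256."""
    entries = []
    for row in fields.get("Checksums-Sha256", "").splitlines():
        parts = row.split()
        if len(parts) == 3:
            digest, size, name = parts
            entries.append((digest, int(size), name))
    return entries
```

The digests returned by `sha256_entries` are what a cross-reference step would compare against the hashes of the binary packages held in the corpus.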

**Alpine SecDB Connector:**

The `AlpineSecDbConnector` parses Alpine's security database:

- Source: `secfixes` blocks in APKBUILD files
- Repository: `alpine/aports` Git repository
- Format: YAML blocks mapping CVEs to fixed versions
- Example:

```yaml
secfixes:
  3.0.11-r0:
    - CVE-2024-0727
    - CVE-2024-0728
```
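
A parser for such a block could look roughly like the Python sketch below. It assumes the block is carried in the APKBUILD as `#`-prefixed comment lines, as is conventional in aports, and it uses only the standard library rather than a YAML parser. The function names are hypothetical and do not describe `AlpineSecDbConnector` internals.

```python
import re
from typing import Dict, List

_VERSION_RE = re.compile(r"^#\s+(\S+):\s*$")   # e.g. "#   3.0.11-r0:"
_CVE_RE = re.compile(r"^#\s+-\s+(\S+)")        # e.g. "#     - CVE-2024-0727"


def parse_secfixes(apkbuild_text: str) -> Dict[str, List[str]]:
    """Map each fixed version to the identifiers listed under it."""
    fixes: Dict[str, List[str]] = {}
    in_block = False
    version = None
    for line in apkbuild_text.splitlines():
        stripped = line.rstrip()
        if stripped == "# secfixes:":
            in_block, version = True, None
            continue
        if not in_block:
            continue
        if not stripped.startswith("#"):
            break  # the comment block has ended
        cve = _CVE_RE.match(stripped)
        ver = _VERSION_RE.match(stripped)
        if cve and version is not None:
            fixes[version].append(cve.group(1))
        elif ver:
            version = ver.group(1)
            fixes.setdefault(version, [])
    return fixes


def fixed_version_by_cve(fixes: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert the mapping so a CVE can be looked up directly."""
    return {cve: version for version, cves in fixes.items() for cve in cves}
```

The inverted mapping is the shape a patch-status check typically needs: given a CVE, return the first package version that carries the fix.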

**OSV Dump Parser:**

The `OsvDumpParser` processes Google OSV database dumps for advisory cross-correlation:

- Source: `osv.dev` bulk exports (JSON)
- Purpose: CVE → commit range extraction for patch identification
- Cross-reference: Correlates OSV entries with distribution advisories
- Inconsistency detection: Identifies discrepancies between OSV and distro advisories

```csharp
public interface IOsvDumpParser
{
    IAsyncEnumerable<OsvParsedEntry> ParseDumpAsync(Stream osvDumpStream, CancellationToken ct);
    OsvCveIndex BuildCveIndex(IEnumerable<OsvParsedEntry> entries);
    IEnumerable<AdvisoryCorrelation> CrossReferenceWithExternal(
        OsvCveIndex osvIndex,
        IEnumerable<ExternalAdvisory> externalAdvisories);
    IEnumerable<AdvisoryInconsistency> DetectInconsistencies(
        IEnumerable<AdvisoryCorrelation> correlations);
}
```
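
The CVE-to-commit-range step can be sketched in Python as follows. The field names (`id`, `aliases`, `affected`, `ranges`, `type`, `repo`, `events`, `introduced`, `fixed`) follow the public OSV schema. `CommitRange` and the three functions are hypothetical names for this illustration and are not the types behind `IOsvDumpParser`.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional


class CommitRange(NamedTuple):
    repo: str
    introduced: Optional[str]
    fixed: Optional[str]


def cve_ids(entry: dict) -> List[str]:
    """Collect CVE identifiers from the entry id and its aliases."""
    ids = [entry.get("id", "")] + list(entry.get("aliases", []))
    return sorted({i for i in ids if i.startswith("CVE-")})


def git_ranges(entry: dict) -> List[CommitRange]:
    """Pair introduced/fixed events of every GIT-typed range."""
    ranges = []
    for affected in entry.get("affected", []):
        for rng in affected.get("ranges", []):
            if rng.get("type") != "GIT":
                continue  # skip SEMVER and ECOSYSTEM ranges
            repo = rng.get("repo", "")
            introduced = None
            for event in rng.get("events", []):
                if "introduced" in event:
                    introduced = event["introduced"]
                elif "fixed" in event:
                    ranges.append(CommitRange(repo, introduced, event["fixed"]))
                    introduced = None
            if introduced is not None:
                ranges.append(CommitRange(repo, introduced, None))  # no fix yet
    return ranges


def build_cve_index(entries: Iterable[dict]) -> Dict[str, List[CommitRange]]:
    """Index commit ranges by CVE identifier."""
    index: Dict[str, List[CommitRange]] = {}
    for entry in entries:
        for cve in cve_ids(entry):
            index.setdefault(cve, []).extend(git_ranges(entry))
    return index
```

A range whose `fixed` value is `None` marks an advisory with no recorded fix commit, which is one kind of gap an inconsistency check against distribution advisories could surface.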

**CLI Access:**

All connectors are manageable via the `stella groundtruth sources` CLI commands:

```bash
# List all connectors
stella groundtruth sources list

# Sync specific connector
stella groundtruth sources sync --source buildinfo-debian --full

# Enable/disable connectors
stella groundtruth sources enable ddeb-ubuntu
stella groundtruth sources disable debuginfod-fedora
```

See [Ground-Truth CLI Guide](../cli/guides/ground-truth-cli.md) for complete CLI documentation.

### 10.3 Key Performance Indicators

| KPI | Target | Description |
|-----|--------|-------------|
| Per-function match rate | >= 90% | Functions matched in post-patch binary |
| False-negative patch detection | <= 5% | Patched functions incorrectly classified |
| SBOM canonical-hash stability | 3/3 | Determinism across independent runs |
| Binary reconstruction equivalence | Trend | Rebuilt binary matches original |
| End-to-end verify time (p95, cold) | Trend | Offline verification performance |
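
The three KPIs with fixed targets can be checked mechanically. The Python sketch below shows one way to do so under stated assumptions: the rates are plain ratios of counts, and the canonical SBOM form is taken to be key-sorted, whitespace-free JSON hashed with SHA-256. The document does not specify the canonicalization, so that choice and all function names are hypothetical.

```python
import hashlib
import json
from typing import Iterable


def match_rate(matched: int, total: int) -> float:
    """Fraction of functions matched in the post-patch binary."""
    return matched / total if total else 0.0


def false_negative_rate(missed_patched: int, total_patched: int) -> float:
    """Fraction of patched functions that were classified incorrectly."""
    return missed_patched / total_patched if total_patched else 0.0


def canonical_hash(sbom: dict) -> str:
    """Hash a key-sorted, compact JSON encoding (assumed canonical form)."""
    encoded = json.dumps(sbom, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def meets_targets(
    matched: int,
    total: int,
    missed_patched: int,
    total_patched: int,
    sboms: Iterable[dict],
) -> bool:
    """Apply the >= 90%, <= 5%, and 3/3 targets from the KPI table."""
    runs = list(sboms)
    stable = len(runs) == 3 and len({canonical_hash(s) for s in runs}) == 1
    return (
        match_rate(matched, total) >= 0.90
        and false_negative_rate(missed_patched, total_patched) <= 0.05
        and stable
    )
```

The two trend KPIs are omitted because they have no pass/fail threshold to evaluate.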

### 10.4 Validation Harness

The validation harness (`IValidationHarness`) orchestrates end-to-end verification:

```
Binary Pair (pre/post) → Symbol Recovery → IR Lifting → Fingerprinting → Matching → Metrics
```

### 10.5 Evidence Bundle Format

Evidence bundles follow OCI/ORAS conventions:

```
<pkg>-<advisory>-bundle.oci.tar
├── manifest.json                    # OCI manifest
└── blobs/
    ├── sha256:<sbom>                # Canonical SBOM
    ├── sha256:<pre-bin>             # Pre-fix binary
    ├── sha256:<post-bin>            # Post-fix binary
    ├── sha256:<delta-sig>           # DSSE delta-sig predicate
    └── sha256:<timestamp>           # RFC 3161 timestamp
```
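
Because every blob in this layout is named by its digest, an offline verifier can re-hash the blobs without any external lookup. The Python sketch below illustrates that check, assuming blob entries are stored under `blobs/` with names of the form `sha256:<hex>` (the `sha256/<hex>` directory form is accepted as well). The function name is hypothetical, and manifest, DSSE, and timestamp validation are not covered.

```python
import hashlib
import tarfile
from typing import List


def mismatched_blobs(bundle: tarfile.TarFile) -> List[str]:
    """Return blob paths whose SHA-256 differs from the digest in their name."""
    bad = []
    for member in bundle.getmembers():
        if not member.isfile() or not member.name.startswith("blobs/"):
            continue
        name = member.name[len("blobs/"):]
        if not name.startswith("sha256"):
            continue  # only SHA-256 addressed blobs are checked here
        expected = name[len("sha256"):].lstrip(":/")
        data = bundle.extractfile(member).read()
        if hashlib.sha256(data).hexdigest() != expected:
            bad.append(member.name)
    return bad
```

A caller would open the bundle with `tarfile.open(path)` and treat a non-empty result as a failed integrity check.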

### 10.6 Related Documentation

- [Golden Corpus KPIs](../../benchmarks/golden-corpus-kpis.md)
- [Golden Corpus Seed List](../../benchmarks/golden-corpus-seed-list.md)
- [Ground-Truth Corpus Specification](../../benchmarks/ground-truth-corpus.md)

---

## 11. References

- Advisory: `docs/product/advisories/21-Dec-2025 - Mapping Evidence Within Compiled Binaries.md`
- Scanner Native Analysis: `src/Scanner/StellaOps.Scanner.Analyzers.Native/`
@@ -1248,8 +1424,9 @@ binaryindex:
- **Semantic Diffing Sprint:** `docs/implplan/SPRINT_20260105_001_001_BINDEX_semdiff_ir_semantics.md`
- **Semantic Library:** `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/`
- **Semantic Tests:** `src/BinaryIndex/__Tests/StellaOps.BinaryIndex.Semantic.Tests/`
- **Golden Corpus Sprints:** `docs/implplan/SPRINT_20260121_034_BinaryIndex_golden_corpus_foundation.md`

---

*Document Version: 1.1.1*
*Last Updated: 2026-01-14*
*Document Version: 1.2.0*
*Last Updated: 2026-01-21*