Structured Provenance Hints for Unknowns

Module

Unknowns

Status

IMPLEMENTED

Description

A structured provenance hint system for unknown binaries and components. Each hint is typed (BuildIdMatch, DebugLink, ImportTableFingerprint, ExportTableFingerprint, SectionLayout, CompilerSignature, DistroPattern, VersionString, SymbolPattern), carries a confidence score, and generates a hypothesis that guides resolution (e.g., "Binary matches distro build-ID, likely backport").
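To make the shape of a hint concrete, here is a minimal, language-neutral sketch in Python. It is not the actual C# model: the nine type names come from the description above, while the field names, the `[0.0, 1.0]` confidence range, and the `<build-id>` placeholder are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ProvenanceHintType(Enum):
    # The nine hint types named in the description; the Python member
    # names are illustrative spellings, not the C# enum itself.
    BUILD_ID_MATCH = "BuildIdMatch"
    DEBUG_LINK = "DebugLink"
    IMPORT_TABLE_FINGERPRINT = "ImportTableFingerprint"
    EXPORT_TABLE_FINGERPRINT = "ExportTableFingerprint"
    SECTION_LAYOUT = "SectionLayout"
    COMPILER_SIGNATURE = "CompilerSignature"
    DISTRO_PATTERN = "DistroPattern"
    VERSION_STRING = "VersionString"
    SYMBOL_PATTERN = "SymbolPattern"


@dataclass(frozen=True)
class ProvenanceHint:
    hint_type: ProvenanceHintType
    evidence: str      # hypothetical field: e.g. the matched build ID
    confidence: float  # assumed to lie in [0.0, 1.0]
    hypothesis: str    # human-readable resolution hypothesis

    def __post_init__(self):
        # Reject scores outside the assumed range early.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0.0, 1.0]")


hint = ProvenanceHint(
    hint_type=ProvenanceHintType.BUILD_ID_MATCH,
    evidence="<build-id>",  # placeholder, not a real build ID
    confidence=0.9,         # illustrative value only
    hypothesis="Binary matches distro build-ID, likely backport",
)
```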

Implementation Details

  • Provenance Hint Builder: src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Hints/ProvenanceHintBuilder.cs -- fluent builder for constructing typed provenance hints with confidence scores; supports chaining multiple hint sources (build ID, debug link, import table, section layout) into a ranked hypothesis list.
  • IProvenanceHintBuilder Interface: src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Hints/IProvenanceHintBuilder.cs -- interface for the provenance hint builder, enabling dependency injection and testability.
  • Provenance Hint Model: src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Models/ProvenanceHint.cs -- data model for a single provenance hint containing hint type, source evidence, confidence score, and generated hypothesis text.
  • Provenance Hint Type: src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Models/ProvenanceHintType.cs -- enum defining all supported hint types: BuildIdMatch, DebugLink, ImportTableFingerprint, ExportTableFingerprint, SectionLayout, CompilerSignature, DistroPattern, VersionString, SymbolPattern.
  • Provenance Evidence: src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Models/ProvenanceEvidence.cs -- evidence payload associated with a provenance hint (e.g., the matched build ID string, the fingerprint hash, the compiler version string).
  • Native Unknown Context: src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Models/NativeUnknownContext.cs -- context model for native (C/C++/Rust/Go) unknown binaries, providing the binary analysis data that hint builders consume.
  • Native Unknown Classifier: src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Services/NativeUnknownClassifier.cs -- classifies unknown binaries by running all available hint builders and ranking hypotheses by confidence score.
  • Unknown Proof Emitter: src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Services/UnknownProofEmitter.cs -- emits attestation proofs for resolved unknowns, linking the provenance hints that led to identification.
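
The fluent chaining and confidence ranking described above can be sketched as follows. This is an illustration in Python under stated assumptions, not the StellaOps API: `Hint`, `HintBuilderSketch`, `add`, and `build` are hypothetical names, and the evidence strings and confidence values are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hint:
    hint_type: str
    evidence: str
    confidence: float
    hypothesis: str


class HintBuilderSketch:
    """Hypothetical fluent builder: each add() returns self so calls chain."""

    def __init__(self):
        self._hints = []

    def add(self, hint_type, evidence, confidence, hypothesis):
        self._hints.append(Hint(hint_type, evidence, confidence, hypothesis))
        return self

    def build(self):
        # Rank hypotheses by descending confidence; sorted() is stable,
        # so hints with equal scores keep their insertion order.
        return sorted(self._hints, key=lambda h: h.confidence, reverse=True)


ranked = (
    HintBuilderSketch()
    .add("SectionLayout", "<section-layout>", 0.55, "Layout resembles a known toolchain")
    .add("BuildIdMatch", "<build-id>", 0.92, "Binary matches distro build-ID, likely backport")
    .add("CompilerSignature", "<compiler-string>", 0.70, "Compiler string suggests a distro toolchain")
    .build()
)

for h in ranked:
    print(f"{h.confidence:.2f}  {h.hint_type}: {h.hypothesis}")
```

Sorting once in `build()` rather than on every `add()` keeps the chain cheap and makes the ranking rule a single, testable line.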

E2E Test Plan

  • For a binary with a known GNU build-ID, invoke ProvenanceHintBuilder with the BuildIdMatch type and verify that the resulting hint contains the correct build ID, a confidence score > 0.8, and a hypothesis string mentioning the matched package
  • Build hints for a binary with multiple evidence sources (build ID + section layout + compiler signature), and verify NativeUnknownClassifier ranks them by descending confidence score
  • Submit a NativeUnknownContext for a binary with no matching evidence and verify the classifier returns an empty hint list with no false-positive hypotheses
  • Build a DistroPattern hint for a binary matching a known distro build pattern (e.g., Debian hardening flags) and verify the hypothesis correctly identifies the distribution
  • Resolve an unknown binary using provenance hints and verify UnknownProofEmitter produces an attestation proof linking the hints to the resolution decision
  • Verify that every ProvenanceHintType enum value has a corresponding builder path in ProvenanceHintBuilder by constructing one hint of each type and confirming that each call completes without error
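
Three of the checks in this plan (empty input, descending ranking, and per-type coverage) might be structured roughly as below. The sketch is in Python and runs against a toy `classify` function that stands in for the classifier. The function, its dictionary-based context, and the confidence values are assumptions made for illustration, not the real test harness.

```python
HINT_TYPES = [
    "BuildIdMatch", "DebugLink", "ImportTableFingerprint",
    "ExportTableFingerprint", "SectionLayout", "CompilerSignature",
    "DistroPattern", "VersionString", "SymbolPattern",
]


def classify(context):
    """Toy classifier: one hint per evidence entry, ranked by confidence.

    `context` maps a hint type name to an (evidence, confidence) pair.
    """
    hints = [
        {"type": hint_type, "evidence": evidence, "confidence": confidence}
        for hint_type, (evidence, confidence) in context.items()
    ]
    return sorted(hints, key=lambda h: h["confidence"], reverse=True)


def test_no_evidence_yields_no_hints():
    # No matching evidence must not produce false-positive hypotheses.
    assert classify({}) == []


def test_hints_ranked_by_descending_confidence():
    hints = classify({
        "SectionLayout": ("<section-layout>", 0.55),
        "BuildIdMatch": ("<build-id>", 0.92),
        "CompilerSignature": ("<compiler-string>", 0.70),
    })
    scores = [h["confidence"] for h in hints]
    assert scores == sorted(scores, reverse=True)


def test_every_hint_type_constructs():
    # One hint per type; any exception fails the test.
    for hint_type in HINT_TYPES:
        [hint] = classify({hint_type: ("<placeholder>", 0.5)})
        assert hint["type"] == hint_type


for test in (
    test_no_evidence_yields_no_hints,
    test_hints_ranked_by_descending_confidence,
    test_every_hint_type_constructs,
):
    test()
```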