# Structured Provenance Hints for Unknowns

## Module
Unknowns

## Status
IMPLEMENTED

## Description
Structured provenance hint system for unknown binaries/components with typed hints (`BuildIdMatch`, `DebugLink`, `ImportTableFingerprint`, `ExportTableFingerprint`, `SectionLayout`, `CompilerSignature`, `DistroPattern`, `VersionString`, `SymbolPattern`), confidence scoring, and hypothesis generation for resolution (e.g., "Binary matches distro build-ID, likely backport").

## Implementation Details
- **Provenance Hint Builder**: `src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Hints/ProvenanceHintBuilder.cs` -- fluent builder for constructing typed provenance hints with confidence scores; supports chaining multiple hint sources (build ID, debug link, import table, section layout) into a ranked hypothesis list.
- **IProvenanceHintBuilder Interface**: `src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Hints/IProvenanceHintBuilder.cs` -- interface for the provenance hint builder, enabling dependency injection and testability.
- **Provenance Hint Model**: `src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Models/ProvenanceHint.cs` -- data model for a single provenance hint containing hint type, source evidence, confidence score, and generated hypothesis text.
- **Provenance Hint Type**: `src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Models/ProvenanceHintType.cs` -- enum defining all supported hint types: `BuildIdMatch`, `DebugLink`, `ImportTableFingerprint`, `ExportTableFingerprint`, `SectionLayout`, `CompilerSignature`, `DistroPattern`, `VersionString`, `SymbolPattern`.
- **Provenance Evidence**: `src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Models/ProvenanceEvidence.cs` -- evidence payload associated with a provenance hint (e.g., the matched build ID string, the fingerprint hash, the compiler version string).
- **Native Unknown Context**: `src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Models/NativeUnknownContext.cs` -- context model for native (C/C++/Rust/Go) unknown binaries, providing the binary analysis data that hint builders consume.
- **Native Unknown Classifier**: `src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Services/NativeUnknownClassifier.cs` -- classifies unknown binaries by running all available hint builders and ranking hypotheses by confidence score.
- **Unknown Proof Emitter**: `src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Services/UnknownProofEmitter.cs` -- emits attestation proofs for resolved unknowns, linking the provenance hints that led to identification.

## E2E Test Plan
- [ ] Build a provenance hint for a binary with a known GNU build-ID, invoke `ProvenanceHintBuilder` with `BuildIdMatch` type, and verify the hint contains the correct build ID, a confidence score > 0.8, and a hypothesis string mentioning the matched package
- [ ] Build hints for a binary with multiple evidence sources (build ID + section layout + compiler signature), and verify `NativeUnknownClassifier` ranks them by descending confidence score
- [ ] Submit a `NativeUnknownContext` for a binary with no matching evidence and verify the classifier returns an empty hint list with no false-positive hypotheses
- [ ] Build a `DistroPattern` hint for a binary matching a known distro build pattern (e.g., Debian hardening flags) and verify the hypothesis correctly identifies the distribution
- [ ] Resolve an unknown binary using provenance hints and verify `UnknownProofEmitter` produces an attestation proof linking the hints to the resolution decision
- [ ] Verify all `ProvenanceHintType` enum values have corresponding builder paths in `ProvenanceHintBuilder` by constructing one hint of each type and confirming no errors
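
## Illustrative Sketch
The fluent chaining and confidence-ranked hypothesis list described under Implementation Details could look roughly like the sketch below. It is not the actual `ProvenanceHintBuilder` API: `HintSketch`, `HintSketchBuilder`, `Add`, `BuildRanked`, and `SketchDemo` are hypothetical names, the confidence range of 0.0 to 1.0 is an assumption, and the evidence strings and scores are made-up placeholders. Only the `ProvenanceHintType` values come from this document. The sketch assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hint types as listed in this document.
public enum ProvenanceHintType
{
    BuildIdMatch,
    DebugLink,
    ImportTableFingerprint,
    ExportTableFingerprint,
    SectionLayout,
    CompilerSignature,
    DistroPattern,
    VersionString,
    SymbolPattern
}

// Hypothetical shape; the real ProvenanceHint model may differ.
public sealed record HintSketch(
    ProvenanceHintType Type,
    string Evidence,
    double Confidence,
    string Hypothesis);

// Hypothetical fluent builder; method names are illustrative only.
public sealed class HintSketchBuilder
{
    private readonly List<HintSketch> _hints = new();

    public HintSketchBuilder Add(
        ProvenanceHintType type, string evidence, double confidence, string hypothesis)
    {
        // Assumed range 0.0..1.0; the negated form also rejects NaN.
        if (!(confidence >= 0.0 && confidence <= 1.0))
            throw new ArgumentOutOfRangeException(nameof(confidence));

        _hints.Add(new HintSketch(type, evidence, confidence, hypothesis));
        return this; // returning the builder enables chaining
    }

    // Returns a copy of the hints ordered by descending confidence.
    public IReadOnlyList<HintSketch> BuildRanked() =>
        _hints.OrderByDescending(h => h.Confidence).ToList();
}

public static class SketchDemo
{
    // Chains three placeholder hint sources into one ranked list.
    public static IReadOnlyList<HintSketch> Run() =>
        new HintSketchBuilder()
            .Add(ProvenanceHintType.SectionLayout, "example-section-layout", 0.55,
                 "Section layout resembles a known toolchain output")
            .Add(ProvenanceHintType.BuildIdMatch, "example-build-id", 0.92,
                 "Binary matches distro build-ID, likely backport")
            .Add(ProvenanceHintType.CompilerSignature, "example-compiler-string", 0.70,
                 "Compiler signature suggests a distro toolchain")
            .BuildRanked();
}
```

Calling `SketchDemo.Run()` yields the `BuildIdMatch` hint first, followed by `CompilerSignature` and `SectionLayout`, mirroring the descending-confidence ordering the test plan expects from `NativeUnknownClassifier`. An empty builder returns an empty list, which corresponds to the no-evidence case in the test plan.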