component_architecture_unknowns.md - Stella Ops Unknowns (2025Q4, updated 2026-03-04)
Unknown component and symbol tracking registry.
Scope. Standalone microservice architecture for Unknowns: tracking unresolved components, symbols, and mappings that Scanner and other analyzers cannot definitively identify.
0) Mission & boundaries
Mission. Provide a structured registry for tracking unknown components, unresolved symbols, and incomplete mappings. Enable visibility into coverage gaps and guide future enhancement priorities.
Boundaries.
- Unknowns is a standalone microservice with its own HTTP API surface, DbContext, and schema ownership.
- Unknowns is independently deployable and is not consolidated into Policy or any other module.
- Unknowns does not guess identities. It records what cannot be determined.
- All unknowns are categorized for actionability.
- Library layers within Unknowns are consumed by Scanner, Signals, and Platform via ProjectReference.
Boundary decision (Sprint 206, 2026-02-25): Unknowns retains its own UnknownsDbContext and schema ownership. No source consolidation into Policy and no DbContext merge. See docs/implplan/SPRINT_20260225_206_Policy_absorb_unknowns.md for rationale.
1) Solution & project layout
src/Unknowns/
├─ StellaOps.Unknowns.WebService/ # Standalone Minimal API host
│ └─ Endpoints/
│ ├─ UnknownsEndpoints.cs # /api/unknowns (list, detail, hints, history, triage, hot-queue, summary)
│ └─ GreyQueueEndpoints.cs # /api/grey-queue (enqueue, process, resolve, escalate, etc.)
├─ StellaOps.Unknowns.Services/ # Business logic layer
├─ __Libraries/
│ ├─ StellaOps.Unknowns.Core/ # Unknown models, categorization
│ ├─ StellaOps.Unknowns.Persistence/ # Storage abstractions + EF DbContext (UnknownsDbContext with DbSet<UnknownEntity>)
│ └─ StellaOps.Unknowns.Persistence.EfCore/ # EF Core compiled model
│
└─ __Tests/
├─ StellaOps.Unknowns.Core.Tests/
└─ StellaOps.Unknowns.Persistence.Tests/
2) Contracts & data model
2.1 Unknown Record
{
"unknownId": "unk-2025-01-15-abc123",
"category": "symbol_unmapped",
"context": {
"scanId": "scan-xyz",
"binaryPath": "/usr/lib/libfoo.so",
"symbolName": "_ZN3foo3barEv"
},
"reason": "No PURL mapping available",
"firstSeen": "2025-01-15T10:30:00Z",
"occurrences": 42,
"provenanceHints": [
{
"hint_id": "hint:sha256:abc123...",
"type": "BuildIdMatch",
"confidence": 0.95,
"hypothesis": "Binary matches openssl 1.1.1k from debian",
"suggested_actions": [
{
"action": "verify_build_id",
"priority": 1,
"effort": "low",
"description": "Verify Build-ID against distro package repositories"
}
]
}
],
"bestHypothesis": "Binary matches openssl 1.1.1k from debian",
"combinedConfidence": 0.95
}
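As a reading aid for the record above, the following minimal sketch loads a trimmed copy of it and picks the highest-confidence hint. The `ProvenanceHint` dataclass and the `strongest_hint` helper are hypothetical names introduced here, and the idea that `bestHypothesis` mirrors the strongest hint is an assumption drawn only from this single-hint example, not a documented rule.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvenanceHint:
    """Hypothetical in-memory view of one entry in provenanceHints."""
    hint_id: str
    type: str
    confidence: float
    hypothesis: str


def strongest_hint(record: dict) -> Optional[ProvenanceHint]:
    """Return the highest-confidence hint of a record, or None if it has none.

    Assumption: ties are broken on hint_id so the result is deterministic.
    """
    hints = [
        ProvenanceHint(h["hint_id"], h["type"], h["confidence"], h["hypothesis"])
        for h in record.get("provenanceHints", [])
    ]
    return min(hints, key=lambda h: (-h.confidence, h.hint_id), default=None)


# Trimmed copy of the example record from section 2.1.
record = json.loads("""
{
  "unknownId": "unk-2025-01-15-abc123",
  "category": "symbol_unmapped",
  "provenanceHints": [
    {
      "hint_id": "hint:sha256:abc123...",
      "type": "BuildIdMatch",
      "confidence": 0.95,
      "hypothesis": "Binary matches openssl 1.1.1k from debian"
    }
  ],
  "bestHypothesis": "Binary matches openssl 1.1.1k from debian",
  "combinedConfidence": 0.95
}
""")

best = strongest_hint(record)
```

For the example record, `best.hypothesis` equals the record's `bestHypothesis`. Note that the hint fields use snake_case (`hint_id`, `suggested_actions`) while the enclosing record uses camelCase, so a consumer has to handle both conventions.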
2.2 Categories
| Category | Description |
|---|---|
| component_unidentified | Binary without package mapping |
| symbol_unmapped | Symbol without PURL resolution |
| version_ambiguous | Multiple version candidates |
| purl_invalid | Malformed package URL |
2.3 Provenance Hints
Added in SPRINT_20260106_001_005_UNKNOWNS
Provenance hints explain why something is unknown and provide hypotheses for resolution.
Hint Types (15+):
- BuildIdMatch - ELF/PE Build-ID match against known catalog
- DebugLink - Debug link (.gnu_debuglink) reference
- ImportTableFingerprint - Import table fingerprint comparison
- ExportTableFingerprint - Export table fingerprint comparison
- SectionLayout - Section layout similarity
- StringTableSignature - String table signature match
- CompilerSignature - Compiler/linker identification
- PackageMetadata - Package manager metadata (RPATH, NEEDED, etc.)
- DistroPattern - Distro/vendor pattern match
- VersionString - Version string extraction
- SymbolPattern - Symbol name pattern match
- PathPattern - File path pattern match
- CorpusMatch - Hash match against known corpus
- SbomCrossReference - SBOM cross-reference
- AdvisoryCrossReference - Advisory cross-reference
Confidence Levels:
- VeryHigh (>= 0.9) - Strong evidence, high reliability
- High (0.7 - 0.9) - Good evidence, likely accurate
- Medium (0.5 - 0.7) - Moderate evidence, worth investigating
- Low (0.3 - 0.5) - Weak evidence, low confidence
- VeryLow (< 0.3) - Very weak evidence, exploratory only
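The bands above can be sketched as a simple threshold lookup. The function name `confidence_level` is hypothetical. The listed ranges share their endpoints (0.9, 0.7, 0.5, 0.3), so this sketch assumes each band includes its lower bound, which is consistent with "VeryHigh (>= 0.9)" and "VeryLow (< 0.3)". It also assumes scores fall in the range 0 to 1.

```python
def confidence_level(score: float) -> str:
    """Map a numeric confidence to its named band.

    Assumption: each band includes its lower bound and excludes its upper one.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must be within [0, 1], got {score}")
    if score >= 0.9:
        return "VeryHigh"
    if score >= 0.7:
        return "High"
    if score >= 0.5:
        return "Medium"
    if score >= 0.3:
        return "Low"
    return "VeryLow"
```

Under these assumptions, the example record's `combinedConfidence` of 0.95 falls in the VeryHigh band.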
Suggested Actions:
Each hint includes prioritized resolution actions:
- verify_build_id - Verify Build-ID against distro package repositories
- distro_package_lookup - Search distro package repositories
- version_verification - Verify extracted version against known releases
- analyze_imports - Cross-reference imported libraries
- compare_section_layout - Compare section layout with known binaries
- expand_catalog - Add missing distros/packages to Build-ID catalog
Hint Combination:
When multiple hints agree, confidence is boosted:
Single hint: confidence = 0.85
Two agreeing: confidence = min(0.99, 0.85 + 0.1) = 0.95
Three agreeing: confidence = min(0.99, 0.85 + 0.2) = 0.99
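The three cases above fit the pattern "base confidence plus 0.1 per additional agreeing hint, capped at 0.99". Below is a minimal sketch of that pattern. The function name `combine_confidence`, taking the highest individual confidence as the base, and leaving the decision of which hints "agree" to the caller are all assumptions, since the text does not specify them.

```python
from typing import Sequence


def combine_confidence(
    agreeing: Sequence[float], boost: float = 0.1, cap: float = 0.99
) -> float:
    """Combine the confidences of hints that support the same hypothesis.

    Assumptions: the base is the strongest hint, and every additional
    agreeing hint adds `boost`, with the result capped at `cap`.
    """
    if not agreeing:
        raise ValueError("at least one hint confidence is required")
    base = max(agreeing)
    boosted = base + boost * (len(agreeing) - 1)
    # Round to suppress floating-point noise before applying the cap.
    return min(cap, round(boosted, 10))
```

With a base of 0.85 this reproduces the worked values: one hint stays at 0.85, two agreeing hints give 0.95, and three hit the 0.99 cap.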
JSON Schema:
See src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Schemas/provenance-hint.schema.json
Related Documentation
- Scanner: ../scanner/architecture.md
- Signals: ../signals/architecture.md
- Policy: ../policy/architecture.md (Policy references Unknowns via UnknownsBudgetGate but does not own Unknowns persistence or source)
- Boundary decision: ../../implplan/SPRINT_20260225_206_Policy_absorb_unknowns.md
Advisory Gap Status (2026-03-04 Batch)
Status: implementation delivered in Sprint 304.
- AttachProvenanceHintsAsync and GetWithHighConfidenceHintsAsync are implemented in active repositories:
  - src/Unknowns/__Libraries/StellaOps.Unknowns.Persistence/Postgres/Repositories/PostgresUnknownRepository.cs
  - src/Unknowns/__Libraries/StellaOps.Unknowns.Persistence/EfCore/Repositories/UnknownEfRepository.cs
- High-confidence retrieval now applies deterministic ordering (combined_confidence DESC, id ASC) and tenant scoping.
- Migration src/Unknowns/__Libraries/StellaOps.Unknowns.Persistence/Migrations/002_provenance_hints.sql targets unknowns.unknown (aligned with runtime repositories).
- Active EF runtime path is src/Unknowns/__Libraries/StellaOps.Unknowns.Persistence/EfCore/**.
- Duplicate scaffold path src/Unknowns/__Libraries/StellaOps.Unknowns.Persistence.EfCore/** is explicitly marked as non-active/deprecated to prevent behavior drift.
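The ordering and tenant scoping noted above (combined_confidence DESC, id ASC) can be illustrated in memory as follows. This is not the repository code. The `UnknownRow` shape, the `tenant_id` field name, and the `min_confidence` threshold of 0.7 are hypothetical and chosen only for the illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class UnknownRow:
    """Hypothetical projection of a stored unknown."""
    id: str
    tenant_id: str
    combined_confidence: float


def high_confidence_for_tenant(
    rows: Iterable[UnknownRow], tenant_id: str, min_confidence: float = 0.7
) -> List[UnknownRow]:
    """Filter to one tenant, then order by confidence DESC and id ASC.

    The id tie-break is what makes the result order deterministic when
    several rows share the same confidence.
    """
    scoped = [
        r for r in rows
        if r.tenant_id == tenant_id and r.combined_confidence >= min_confidence
    ]
    return sorted(scoped, key=lambda r: (-r.combined_confidence, r.id))
```

Without the secondary key on id, rows with equal confidence could come back in a different order between runs or between the two repository implementations, which is the kind of drift a deterministic ordering avoids.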
Closure sprint: docs/implplan/SPRINT_20260304_304_Unknowns_provenance_hints_persistence_completion.md