git.stella-ops.org/docs/modules/unknowns/architecture.md
2026-01-07 09:43:12 +02:00

# component_architecture_unknowns.md - **Stella Ops Unknowns** (2025Q4)
> Unknown component and symbol tracking registry.
> **Scope.** Library architecture for **Unknowns**: tracking unresolved components, symbols, and mappings that Scanner and other analyzers cannot definitively identify.
---
## 0) Mission & boundaries
**Mission.** Provide a **structured registry** for tracking unknown components, unresolved symbols, and incomplete mappings. Enable visibility into coverage gaps and guide future enhancement priorities.
**Boundaries.**
* Unknowns is a **library layer** consumed by Scanner and Signals.
* Unknowns **does not** guess identities. It records what cannot be determined.
* All unknowns are **categorized** for actionability.
---
## 1) Solution & project layout
```
src/Unknowns/
├─ __Libraries/
│  ├─ StellaOps.Unknowns.Core/                # Unknown models, categorization
│  ├─ StellaOps.Unknowns.Persistence/         # Storage abstractions
│  └─ StellaOps.Unknowns.Persistence.EfCore/
└─ __Tests/
   ├─ StellaOps.Unknowns.Core.Tests/
   └─ StellaOps.Unknowns.Persistence.Tests/
```
---
## 2) Contracts & data model
### 2.1 Unknown Record
```json
{
  "unknownId": "unk-2025-01-15-abc123",
  "category": "symbol_unmapped",
  "context": {
    "scanId": "scan-xyz",
    "binaryPath": "/usr/lib/libfoo.so",
    "symbolName": "_ZN3foo3barEv"
  },
  "reason": "No PURL mapping available",
  "firstSeen": "2025-01-15T10:30:00Z",
  "occurrences": 42,
  "provenanceHints": [
    {
      "hint_id": "hint:sha256:abc123...",
      "type": "BuildIdMatch",
      "confidence": 0.95,
      "hypothesis": "Binary matches openssl 1.1.1k from debian",
      "suggested_actions": [
        {
          "action": "verify_build_id",
          "priority": 1,
          "effort": "low",
          "description": "Verify Build-ID against distro package repositories"
        }
      ]
    }
  ],
  "bestHypothesis": "Binary matches openssl 1.1.1k from debian",
  "combinedConfidence": 0.95
}
```
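As an illustration of how a consumer might read this record, the following minimal Python sketch loads the JSON into typed objects. The class names `UnknownRecord` and `ProvenanceHint` and the function `parse_unknown` are hypothetical and are not the library's actual .NET types. Only the JSON field names come from the example above, and the inline sample is a trimmed copy of it.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class ProvenanceHint:
    """Hypothetical mirror of one entry in `provenanceHints`."""
    hint_id: str
    type: str
    confidence: float
    hypothesis: str
    suggested_actions: list[dict[str, Any]] = field(default_factory=list)


@dataclass(frozen=True)
class UnknownRecord:
    """Hypothetical mirror of the top-level unknown record."""
    unknown_id: str
    category: str
    context: dict[str, str]
    reason: str
    first_seen: str
    occurrences: int
    hints: list[ProvenanceHint] = field(default_factory=list)


def parse_unknown(raw: str) -> UnknownRecord:
    """Parse a JSON unknown record; hints are optional."""
    doc = json.loads(raw)
    hints = [
        ProvenanceHint(
            hint_id=h["hint_id"],
            type=h["type"],
            confidence=float(h["confidence"]),
            hypothesis=h["hypothesis"],
            suggested_actions=list(h.get("suggested_actions", [])),
        )
        for h in doc.get("provenanceHints", [])
    ]
    return UnknownRecord(
        unknown_id=doc["unknownId"],
        category=doc["category"],
        context=dict(doc.get("context", {})),
        reason=doc["reason"],
        first_seen=doc["firstSeen"],
        occurrences=int(doc["occurrences"]),
        hints=hints,
    )


# Trimmed copy of the example record, for demonstration only.
SAMPLE = """
{
  "unknownId": "unk-2025-01-15-abc123",
  "category": "symbol_unmapped",
  "context": {"scanId": "scan-xyz", "symbolName": "_ZN3foo3barEv"},
  "reason": "No PURL mapping available",
  "firstSeen": "2025-01-15T10:30:00Z",
  "occurrences": 42,
  "provenanceHints": [
    {"hint_id": "hint:sha256:abc123...", "type": "BuildIdMatch",
     "confidence": 0.95,
     "hypothesis": "Binary matches openssl 1.1.1k from debian"}
  ]
}
"""

record = parse_unknown(SAMPLE)
```

Note that the example record mixes camelCase keys at the top level with snake_case keys inside each hint, so a parser has to handle both conventions.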
### 2.2 Categories
| Category | Description |
|----------|-------------|
| `component_unidentified` | Binary without package mapping |
| `symbol_unmapped` | Symbol without PURL resolution |
| `version_ambiguous` | Multiple version candidates |
| `purl_invalid` | Malformed package URL |
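
Because every unknown carries exactly one category, a consumer can tally records per category to see where coverage gaps concentrate. The sketch below is a hypothetical Python illustration. The enum name `UnknownCategory` and the helper `tally_by_category` are assumptions, and only the four category strings come from the table above.

```python
from collections import Counter
from enum import Enum
from typing import Iterable


class UnknownCategory(str, Enum):
    """The four categories from the table; the enum itself is illustrative."""
    COMPONENT_UNIDENTIFIED = "component_unidentified"
    SYMBOL_UNMAPPED = "symbol_unmapped"
    VERSION_AMBIGUOUS = "version_ambiguous"
    PURL_INVALID = "purl_invalid"


def tally_by_category(categories: Iterable[str]) -> dict[UnknownCategory, int]:
    """Count unknowns per category; an unrecognized string raises ValueError."""
    counts = Counter(UnknownCategory(c) for c in categories)
    # Include zero counts so every category shows up in a report.
    return {cat: counts.get(cat, 0) for cat in UnknownCategory}


report = tally_by_category(
    ["symbol_unmapped", "symbol_unmapped", "purl_invalid"]
)
```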
### 2.3 Provenance Hints
**Added in SPRINT_20260106_001_005_UNKNOWNS**
Provenance hints explain **why** something is unknown and provide hypotheses for resolution.
**Hint Types (15+):**
* **BuildIdMatch** - ELF/PE Build-ID match against known catalog
* **DebugLink** - Debug link (.gnu_debuglink) reference
* **ImportTableFingerprint** - Import table fingerprint comparison
* **ExportTableFingerprint** - Export table fingerprint comparison
* **SectionLayout** - Section layout similarity
* **StringTableSignature** - String table signature match
* **CompilerSignature** - Compiler/linker identification
* **PackageMetadata** - Package manager metadata (RPATH, NEEDED, etc.)
* **DistroPattern** - Distro/vendor pattern match
* **VersionString** - Version string extraction
* **SymbolPattern** - Symbol name pattern match
* **PathPattern** - File path pattern match
* **CorpusMatch** - Hash match against known corpus
* **SbomCrossReference** - SBOM cross-reference
* **AdvisoryCrossReference** - Advisory cross-reference
**Confidence Levels:**
* **VeryHigh** (>= 0.9) - Strong evidence, high reliability
* **High** (>= 0.7 and < 0.9) - Good evidence, likely accurate
* **Medium** (>= 0.5 and < 0.7) - Moderate evidence, worth investigating
* **Low** (>= 0.3 and < 0.5) - Weak evidence, needs corroboration
* **VeryLow** (< 0.3) - Very weak evidence, exploratory only
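
The mapping from a numeric score to a level can be sketched as a threshold lookup. This Python example is illustrative only. The function name `confidence_level` is hypothetical, and it assumes each band includes its lower bound and excludes its upper bound, which is what the `>= 0.9` and `< 0.3` endpoints suggest.

```python
# Thresholds in descending order; the first one the score meets wins.
_THRESHOLDS = [
    (0.9, "VeryHigh"),
    (0.7, "High"),
    (0.5, "Medium"),
    (0.3, "Low"),
]


def confidence_level(score: float) -> str:
    """Map a confidence score in [0, 1] to its named level."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {score}")
    for threshold, name in _THRESHOLDS:
        if score >= threshold:
            return name
    return "VeryLow"
```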
**Suggested Actions:**
Each hint includes prioritized resolution actions:
* **verify_build_id** - Verify Build-ID against distro package repositories
* **distro_package_lookup** - Search distro package repositories
* **version_verification** - Verify extracted version against known releases
* **analyze_imports** - Cross-reference imported libraries
* **compare_section_layout** - Compare section layout with known binaries
* **expand_catalog** - Add missing distros/packages to Build-ID catalog
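
A consumer that wants to present these actions in working order could sort them by `priority` and break ties by `effort`. The Python sketch below is hypothetical. It assumes a lower `priority` number means more urgent, and it assumes `effort` takes the values `low`, `medium`, and `high`. Only `low` appears in the example record.

```python
from typing import Any

# Assumed effort scale; only "low" is attested in the example record.
EFFORT_RANK = {"low": 0, "medium": 1, "high": 2}


def order_actions(actions: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return actions sorted by priority, then by ascending effort."""
    def sort_key(action: dict[str, Any]) -> tuple[int, int]:
        # Unknown or missing effort values sort after all known ones.
        effort = EFFORT_RANK.get(action.get("effort", ""), len(EFFORT_RANK))
        return (action["priority"], effort)

    return sorted(actions, key=sort_key)


ordered = order_actions([
    {"action": "expand_catalog", "priority": 2, "effort": "high"},
    {"action": "distro_package_lookup", "priority": 2, "effort": "low"},
    {"action": "verify_build_id", "priority": 1, "effort": "low"},
])
```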
**Hint Combination:**
When multiple hints agree, confidence is boosted by 0.1 for each additional agreeing hint and capped at 0.99:
```
Single hint: confidence = 0.85
Two agreeing: confidence = min(0.99, 0.85 + 0.1) = 0.95
Three agreeing: confidence = min(0.99, 0.85 + 0.2) = 0.99
```
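
The arithmetic above can be reproduced with a small function. This is a sketch under stated assumptions, not the library's implementation. The name `combine_confidence` is hypothetical, the base is taken to be the highest confidence among the agreeing hints, and the caller is assumed to pass only hints that support the same hypothesis. The result is rounded to avoid floating-point noise.

```python
from typing import Sequence


def combine_confidence(
    confidences: Sequence[float],
    boost: float = 0.1,   # added per additional agreeing hint
    cap: float = 0.99,    # combined confidence never exceeds this
) -> float:
    """Combine confidences of hints that agree on one hypothesis."""
    if not confidences:
        raise ValueError("at least one hint is required")
    base = max(confidences)  # assumption: strongest hint sets the base
    combined = min(cap, base + boost * (len(confidences) - 1))
    return round(combined, 4)
```

With a base of 0.85 this yields 0.85 for one hint, 0.95 for two, and 0.99 for three, matching the figures above.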
**JSON Schema:**
See `src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Schemas/provenance-hint.schema.json`
---
## Related Documentation
* Scanner: `../scanner/architecture.md`
* Signals: `../signals/architecture.md`