save progress

This commit is contained in:
master
2026-01-09 18:27:36 +02:00
parent e608752924
commit a21d3dbc1f
361 changed files with 63068 additions and 1192 deletions

@@ -0,0 +1,551 @@
# Reachability Module Architecture
## Overview
The **Reachability** module provides a unified hybrid reachability analysis system that combines static call-graph analysis with runtime execution evidence to determine whether vulnerable code paths are actually exploitable in a given artifact. It serves as the **evidence backbone** for VEX (Vulnerability Exploitability eXchange) verdicts.
## Problem Statement
Vulnerability scanners generate excessive false positives, and neither analysis mode can correct them on its own:
- **Static analysis** over-approximates: flags code that is dead, feature-gated, or unreachable
- **Runtime analysis** under-approximates: misses rarely-executed but exploitable paths
- **No unified view** across static and runtime evidence sources
- **Symbol mismatch** between static extraction (Roslyn, ASM) and runtime observation (ETW, eBPF)
### Before Reachability Module
| Question | Answer Method | Limitation |
|----------|---------------|------------|
| Is CVE reachable statically? | Query ReachGraph | No runtime context |
| Was CVE executed at runtime? | Query Signals runtime facts | No static context |
| Should we mark the CVE as not affected (NA)? | Manual analysis | No evidence, no audit trail |
| What's the confidence? | Guesswork | No formal model |
### After Reachability Module
A single `IReachabilityIndex.QueryHybridAsync()` call returns:
- Lattice state (8-level certainty model)
- Confidence score (0.0-1.0)
- Evidence URIs (auditable, reproducible)
- Recommended VEX status + justification
---
## Module Location
```
src/__Libraries/StellaOps.Reachability.Core/
├── IReachabilityIndex.cs               # Main facade interface
├── ReachabilityIndex.cs                # Implementation
├── ReachabilityQueryOptions.cs         # Query configuration
├── Models/
│   ├── SymbolRef.cs                    # Symbol reference
│   ├── CanonicalSymbol.cs              # Canonicalized symbol
│   ├── StaticReachabilityResult.cs     # Static query result
│   ├── RuntimeReachabilityResult.cs    # Runtime query result
│   ├── HybridReachabilityResult.cs     # Combined result
│   └── LatticeState.cs                 # 8-state lattice enum
├── Symbols/
│   ├── ISymbolCanonicalizer.cs         # Symbol normalization interface
│   ├── SymbolCanonicalizer.cs          # Implementation
│   ├── Normalizers/
│   │   ├── DotNetSymbolNormalizer.cs   # .NET symbols
│   │   ├── JavaSymbolNormalizer.cs     # Java symbols
│   │   ├── NativeSymbolNormalizer.cs   # C/C++/Rust
│   │   └── ScriptSymbolNormalizer.cs   # JS/Python/PHP
│   └── SymbolMatchOptions.cs           # Matching configuration
├── CveMapping/
│   ├── ICveSymbolMappingService.cs     # CVE-symbol mapping interface
│   ├── CveSymbolMappingService.cs      # Implementation
│   ├── CveSymbolMapping.cs             # Mapping record
│   ├── VulnerableSymbol.cs             # Vulnerable symbol record
│   ├── MappingSource.cs                # Source enum
│   └── Extractors/
│       ├── IPatchSymbolExtractor.cs    # Patch analysis interface
│       ├── GitDiffExtractor.cs         # Git diff parsing
│       ├── OsvEnricher.cs              # OSV API enrichment
│       └── DeltaSigMatcher.cs          # Binary signature matching
├── Lattice/
│   ├── ReachabilityLattice.cs          # Lattice state machine
│   ├── LatticeTransition.cs            # State transitions
│   └── ConfidenceCalculator.cs         # Confidence scoring
├── Evidence/
│   ├── EvidenceUriBuilder.cs           # stella:// URI construction
│   ├── EvidenceBundle.cs               # Evidence collection
│   └── EvidenceAttestationService.cs   # DSSE signing
└── Integration/
    ├── ReachGraphAdapter.cs            # ReachGraph integration
    ├── SignalsAdapter.cs               # Signals integration
    └── PolicyEngineAdapter.cs          # Policy Engine integration
```
---
## Core Concepts
### 1. Reachability Lattice (8-State Model)
The lattice orders evidence states from Unknown (bottom) to Contested (top), so that static and runtime evidence can be aggregated with deterministic rules:
```
                 X (Contested)
                /             \
               /               \
    CR (Confirmed           CU (Confirmed
        Reachable)              Unreachable)
        |      \               /      |
        |       \             /       |
    RO (Runtime             RU (Runtime
        Observed)               Unobserved)
        |                             |
        |                             |
    SR (Static              SU (Static
        Reachable)              Unreachable)
         \                           /
          \                         /
                 U (Unknown)
```
| State | Code | Description | Confidence Base |
|-------|------|-------------|-----------------|
| Unknown | U | No analysis performed | 0.00 |
| Static Reachable | SR | Call graph shows path exists | 0.30 |
| Static Unreachable | SU | Call graph proves no path | 0.40 |
| Runtime Observed | RO | Symbol executed at runtime | 0.70 |
| Runtime Unobserved | RU | Observation window passed, no execution | 0.60 |
| Confirmed Reachable | CR | Multiple sources confirm reachability | 0.90 |
| Confirmed Unreachable | CU | Multiple sources confirm no reachability | 0.95 |
| Contested | X | Evidence conflict | 0.20 (requires review) |
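
As a rough illustration of how this table could translate into code, the sketch below maps each state to its base confidence and combines one static and one runtime observation. Only the base confidence values come from the table. The `LatticeSketch` class and its `Combine` rules are assumptions made for illustration: the source does not spell out the transition rules, and this sketch simply treats any disagreement between the two sources as `Contested`.

```csharp
using System;
using System.Globalization;

// Illustrative usage: static analysis found a path and runtime saw it execute.
var state = LatticeSketch.Combine(staticReachable: true, runtimeObserved: true);
var confidence = LatticeSketch.BaseConfidence(state).ToString("0.00", CultureInfo.InvariantCulture);
Console.WriteLine($"{state} (base confidence {confidence})");

// Mirrors the LatticeState enum from the API contract section.
public enum LatticeState
{
    Unknown = 0, StaticReachable = 1, StaticUnreachable = 2, RuntimeObserved = 3,
    RuntimeUnobserved = 4, ConfirmedReachable = 5, ConfirmedUnreachable = 6, Contested = 7
}

// HYPOTHETICAL helper, not the real ReachabilityLattice implementation.
public static class LatticeSketch
{
    // Base confidence per state, copied from the table above.
    public static double BaseConfidence(LatticeState s) => s switch
    {
        LatticeState.Unknown => 0.00,
        LatticeState.StaticReachable => 0.30,
        LatticeState.StaticUnreachable => 0.40,
        LatticeState.RuntimeObserved => 0.70,
        LatticeState.RuntimeUnobserved => 0.60,
        LatticeState.ConfirmedReachable => 0.90,
        LatticeState.ConfirmedUnreachable => 0.95,
        LatticeState.Contested => 0.20,
        _ => throw new ArgumentOutOfRangeException(nameof(s))
    };

    // ASSUMED combination rule; null means "no evidence from that source".
    public static LatticeState Combine(bool? staticReachable, bool? runtimeObserved) =>
        (staticReachable, runtimeObserved) switch
        {
            (null, null) => LatticeState.Unknown,
            (true, null) => LatticeState.StaticReachable,
            (false, null) => LatticeState.StaticUnreachable,
            (null, true) => LatticeState.RuntimeObserved,
            (null, false) => LatticeState.RuntimeUnobserved,
            (true, true) => LatticeState.ConfirmedReachable,
            (false, false) => LatticeState.ConfirmedUnreachable,
            // The two sources disagree: assumed to escalate to Contested for review.
            _ => LatticeState.Contested
        };
}
```

A real implementation would likely weigh the observation window and evidence quality before declaring a conflict; the point here is only that the combination is a pure function of its inputs, with no randomness.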
### 2. Symbol Canonicalization
Symbols from different sources must be normalized to enable matching:
| Source | Raw Format | Canonical Format |
|--------|-----------|------------------|
| Roslyn (.NET) | `StellaOps.Scanner.Core.SbomGenerator::GenerateAsync` | `stellaops.scanner.core/sbomgenerator/generateasync/(cancellationtoken)` |
| ASM (Java) | `org/apache/log4j/core/lookup/JndiLookup.lookup(Ljava/lang/String;)Ljava/lang/String;` | `org.apache.log4j.core.lookup/jndilookup/lookup/(string)` |
| eBPF (Native) | `_ZN4llvm12DenseMapBaseINS_...` | `llvm/densemapbase/operator[]/(keytype)` |
| ETW (.NET) | `MethodID=12345 ModuleID=67890` | (resolved via metadata) |
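
To make the Java row concrete, here is a minimal sketch that turns an ASM-style `owner.method(descriptor)` string into the canonical form shown in the table. The class name `JavaSymbolSketch` is hypothetical and this is not the actual `JavaSymbolNormalizer`. It assumes that multiple parameters are joined with commas (the table only shows a single parameter), that arrays are rendered with a `[]` suffix, and it does not validate malformed descriptors.

```csharp
using System;
using System.Collections.Generic;

Console.WriteLine(JavaSymbolSketch.Canonicalize(
    "org/apache/log4j/core/lookup/JndiLookup.lookup(Ljava/lang/String;)Ljava/lang/String;"));
// prints: org.apache.log4j.core.lookup/jndilookup/lookup/(string)

// HYPOTHETICAL helper for illustration only.
public static class JavaSymbolSketch
{
    public static string Canonicalize(string raw)
    {
        int open = raw.IndexOf('(');
        int close = open < 0 ? -1 : raw.IndexOf(')', open + 1);
        if (open < 0 || close < 0) throw new FormatException("Expected owner.method(descriptor)");

        string ownerAndMethod = raw[..open];
        int dot = ownerAndMethod.LastIndexOf('.');
        if (dot < 0) throw new FormatException("Expected '.' between owner and method");

        string owner = ownerAndMethod[..dot];          // org/apache/.../JndiLookup
        string method = ownerAndMethod[(dot + 1)..];   // lookup
        int slash = owner.LastIndexOf('/');
        string packageName = slash < 0 ? "" : owner[..slash].Replace('/', '.');
        string typeName = owner[(slash + 1)..];

        List<string> parameters = ParseParameters(raw[(open + 1)..close]);
        string canonical = $"{packageName}/{typeName}/{method}/({string.Join(",", parameters)})";
        return canonical.ToLowerInvariant();           // culture-invariant lowercasing
    }

    // Walks a JVM parameter descriptor such as "I[Ljava/lang/String;".
    private static List<string> ParseParameters(string descriptor)
    {
        var result = new List<string>();
        int i = 0;
        while (i < descriptor.Length)
        {
            int dims = 0;
            while (descriptor[i] == '[') { dims++; i++; }

            string name;
            if (descriptor[i] == 'L')
            {
                int end = descriptor.IndexOf(';', i);
                string fqcn = descriptor[(i + 1)..end];
                name = fqcn[(fqcn.LastIndexOf('/') + 1)..];   // keep the simple class name
                i = end + 1;
            }
            else
            {
                name = descriptor[i] switch
                {
                    'B' => "byte", 'C' => "char", 'D' => "double", 'F' => "float",
                    'I' => "int", 'J' => "long", 'S' => "short", 'Z' => "boolean",
                    _ => throw new FormatException($"Unknown descriptor char '{descriptor[i]}'")
                };
                i++;
            }
            for (int d = 0; d < dims; d++) name += "[]";
            result.Add(name);
        }
        return result;
    }
}
```

The return type in the descriptor is dropped, which matches the canonical example in the table.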
### 3. CVE-Symbol Mapping
Maps CVE identifiers to specific vulnerable symbols:
```json
{
  "cveId": "CVE-2021-44228",
  "symbols": [
    {
      "canonicalId": "sha256:abc123...",
      "displayName": "org.apache.log4j.core.lookup/jndilookup/lookup/(string)",
      "type": "Sink",
      "condition": "When lookup string contains ${jndi:...}"
    }
  ],
  "source": "PatchAnalysis",
  "confidence": 0.98,
  "patchCommitUrl": "https://github.com/apache/logging-log4j2/commit/abc123"
}
```
### 4. Evidence URIs
Standardized `stella://` URI scheme for evidence references:
| Pattern | Example |
|---------|---------|
| `stella://reachgraph/{digest}` | `stella://reachgraph/blake3:abc123` |
| `stella://reachgraph/{digest}/slice?symbol={id}` | `stella://reachgraph/blake3:abc123/slice?symbol=sha256:def` |
| `stella://signals/runtime/{tenant}/{artifact}` | `stella://signals/runtime/acme/sha256:abc` |
| `stella://cvemap/{cveId}` | `stella://cvemap/CVE-2021-44228` |
| `stella://attestation/{digest}` | `stella://attestation/sha256:sig789` |
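
The patterns above are simple enough to build with string interpolation. The sketch below is one possible shape for such a builder; the class and method names are hypothetical and do not describe the real `EvidenceUriBuilder`. It assumes inputs are already URI-safe, since the examples in the table keep `:` unescaped.

```csharp
using System;

string[] uris =
{
    EvidenceUriSketch.Signals("acme", "sha256:abc"),
    EvidenceUriSketch.ReachGraphSlice("blake3:abc123", "sha256:def"),
    EvidenceUriSketch.CveMap("CVE-2021-44228"),
};

// Ordinal sort gives a stable, culture-independent lexicographic order.
Array.Sort(uris, StringComparer.Ordinal);
foreach (string uri in uris) Console.WriteLine(uri);

// HYPOTHETICAL helper: one method per pattern in the table above.
public static class EvidenceUriSketch
{
    public static string ReachGraph(string digest) =>
        $"stella://reachgraph/{digest}";

    public static string ReachGraphSlice(string digest, string symbolId) =>
        $"stella://reachgraph/{digest}/slice?symbol={symbolId}";

    public static string Signals(string tenant, string artifact) =>
        $"stella://signals/runtime/{tenant}/{artifact}";

    public static string CveMap(string cveId) =>
        $"stella://cvemap/{cveId}";

    public static string Attestation(string digest) =>
        $"stella://attestation/{digest}";
}
```

Sorting before returning keeps the evidence list byte-for-byte identical across runs, which is what the reproducibility rules in this document ask for.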
---
## Architecture Diagram
```
┌──────────────────────────────────────────────────────────────────────────────────┐
│                            Reachability Core Library                             │
├──────────────────────────────────────────────────────────────────────────────────┤
│                                                                                  │
│  ┌────────────────────────────────────────────────────────────────────────────┐  │
│  │                             IReachabilityIndex                             │  │
│  │  ┌────────────────────┐  ┌────────────────────┐  ┌──────────────────────┐  │  │
│  │  │  QueryStaticAsync  │  │ QueryRuntimeAsync  │  │   QueryHybridAsync   │  │  │
│  │  └──────────┬─────────┘  └──────────┬─────────┘  └───────────┬──────────┘  │  │
│  └─────────────┼───────────────────────┼────────────────────────┼─────────────┘  │
│                │                       │                        │                │
│                ▼                       ▼                        ▼                │
│  ┌────────────────────────────────────────────────────────────────────────────┐  │
│  │                            Internal Components                             │  │
│  │                                                                            │  │
│  │  ┌────────────────────┐  ┌────────────────────┐  ┌──────────────────────┐  │  │
│  │  │ Symbol             │  │ CVE-Symbol         │  │ Reachability         │  │  │
│  │  │ Canonicalizer      │  │ Mapping            │  │ Lattice              │  │  │
│  │  │                    │  │                    │  │                      │  │  │
│  │  │ ┌────────────────┐ │  │ ┌────────────────┐ │  │ ┌──────────────────┐ │  │  │
│  │  │ │ .NET Norm.     │ │  │ │ PatchExtract   │ │  │ │ State Machine    │ │  │  │
│  │  │ │ Java Norm.     │ │  │ │ OSV Enrich     │ │  │ │ Confidence Calc  │ │  │  │
│  │  │ │ Native Norm.   │ │  │ │ DeltaSig       │ │  │ │ Transition Rules │ │  │  │
│  │  │ │ Script Norm.   │ │  │ │ Manual Input   │ │  │ └──────────────────┘ │  │  │
│  │  │ └────────────────┘ │  │ └────────────────┘ │  │                      │  │  │
│  │  └────────────────────┘  └────────────────────┘  └──────────────────────┘  │  │
│  │                                                                            │  │
│  └────────────────────────────────────────────────────────────────────────────┘  │
│                                        │                                         │
│                                        ▼                                         │
│  ┌────────────────────────────────────────────────────────────────────────────┐  │
│  │                               Evidence Layer                               │  │
│  │                                                                            │  │
│  │  ┌────────────────────┐  ┌────────────────────┐  ┌──────────────────────┐  │  │
│  │  │ Evidence URI       │  │ Evidence Bundle    │  │ Evidence Attestation │  │  │
│  │  │ Builder            │  │ (Collection)       │  │ Service (DSSE)       │  │  │
│  │  └────────────────────┘  └────────────────────┘  └──────────────────────┘  │  │
│  │                                                                            │  │
│  └────────────────────────────────────────────────────────────────────────────┘  │
│                                                                                  │
└──────────────────────────────────────────────────────────────────────────────────┘
                ┌────────────────────────┼────────────────────────┐
                │                        │                        │
                ▼                        ▼                        ▼
        ┌────────────────┐       ┌────────────────┐       ┌────────────────┐
        │   ReachGraph   │       │    Signals     │       │ Policy Engine  │
        │    Adapter     │       │    Adapter     │       │    Adapter     │
        └───────┬────────┘       └───────┬────────┘       └───────┬────────┘
                │                        │                        │
                ▼                        ▼                        ▼
        ┌────────────────┐       ┌────────────────┐       ┌────────────────┐
        │   ReachGraph   │       │    Signals     │       │ Policy Engine  │
        │   WebService   │       │   WebService   │       │   (VEX Emit)   │
        └────────────────┘       └────────────────┘       └────────────────┘
```
---
## Data Flow
### Query Flow
```
1. Consumer calls IReachabilityIndex.QueryHybridAsync(symbol, artifact, options)
2. SymbolCanonicalizer normalizes input symbol to CanonicalSymbol
3. Parallel queries:
   ├── ReachGraphAdapter.QueryAsync()          → StaticReachabilityResult
   └── SignalsAdapter.QueryRuntimeFactsAsync() → RuntimeReachabilityResult
4. ReachabilityLattice computes combined state from evidence
5. ConfidenceCalculator applies evidence weights and guardrails
6. EvidenceBundle collects URIs for audit trail
7. Return HybridReachabilityResult with verdict recommendation
```
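
Step 3 of the query flow above is the part that benefits most from a concrete shape. The sketch below starts both lookups before awaiting either one, so their latencies overlap instead of adding up. The two query functions are hypothetical stand-ins that simulate latency with `Task.Delay`; they are not the real `ReachGraphAdapter` or `SignalsAdapter` calls, and the boolean results are a simplification of the result records.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// HYPOTHETICAL stand-in for the static call-graph lookup.
static async Task<bool> QueryStaticAsync(string symbol, string artifact, CancellationToken ct)
{
    await Task.Delay(50, ct);   // simulate a ReachGraph round trip
    return true;                // "a call path exists"
}

// HYPOTHETICAL stand-in for the runtime facts lookup.
static async Task<bool> QueryRuntimeAsync(string symbol, string artifact, CancellationToken ct)
{
    await Task.Delay(30, ct);   // simulate a Signals round trip
    return false;               // "not observed in the window"
}

static async Task<(bool StaticReachable, bool RuntimeObserved)> QueryBothAsync(
    string symbol, string artifact, CancellationToken ct)
{
    // Start both tasks first, then await them together.
    Task<bool> staticTask = QueryStaticAsync(symbol, artifact, ct);
    Task<bool> runtimeTask = QueryRuntimeAsync(symbol, artifact, ct);
    await Task.WhenAll(staticTask, runtimeTask);
    return (await staticTask, await runtimeTask);
}

var evidence = await QueryBothAsync(
    "org.apache.log4j.core.lookup/jndilookup/lookup/(string)", "sha256:abc", CancellationToken.None);
Console.WriteLine($"static={evidence.StaticReachable}, runtime={evidence.RuntimeObserved}");
```

With this arrangement the combined wait is roughly the slower of the two calls rather than their sum, and a single `CancellationToken` cancels both.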
### Ingestion Flow (CVE Mapping)
```
1. Patch commit detected (Concelier, Feedser, or manual)
2. GitDiffExtractor parses diff to find changed functions
3. SymbolCanonicalizer normalizes extracted symbols
4. OsvEnricher adds context from OSV database
5. CveSymbolMappingService persists mapping with provenance
6. Mapping available for reachability queries
```
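
For step 2 of the ingestion flow, one cheap heuristic is to read the enclosing-function context that `git diff` prints after the second `@@` of each hunk header. The sketch below collects those contexts from a unified diff. It is an assumption about how a diff-based extractor could start, not a description of `GitDiffExtractor`, and the function name and sample diff are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// HYPOTHETICAL helper: returns the distinct hunk-header contexts in ordinal order.
static IReadOnlyList<string> ChangedFunctionContexts(string unifiedDiff)
{
    // Matches e.g. "@@ -40,7 +40,9 @@ public String lookup(String key) {"
    var hunkHeader = new Regex(@"^@@ -\d+(,\d+)? \+\d+(,\d+)? @@ (?<ctx>.+)$");
    var seen = new SortedSet<string>(StringComparer.Ordinal);

    foreach (string rawLine in unifiedDiff.Split('\n'))
    {
        Match m = hunkHeader.Match(rawLine.TrimEnd('\r'));
        if (m.Success) seen.Add(m.Groups["ctx"].Value.Trim());
    }
    return new List<string>(seen);
}

// Made-up diff used only to exercise the helper.
string diff = string.Join("\n",
    "--- a/src/Lookup.java",
    "+++ b/src/Lookup.java",
    "@@ -40,7 +40,9 @@ public String lookup(String key) {",
    "-        return resolve(key);",
    "+        return resolveSafely(key);");

foreach (string context in ChangedFunctionContexts(diff)) Console.WriteLine(context);
```

The hunk context is only a hint: it names the nearest preceding function-like line, so a production extractor would still need language-aware parsing before handing symbols to the canonicalizer.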
---
## API Contracts
### IReachabilityIndex
```csharp
public interface IReachabilityIndex
{
    /// <summary>
    /// Query static reachability from call graph.
    /// </summary>
    Task<StaticReachabilityResult> QueryStaticAsync(
        SymbolRef symbol,
        string artifactDigest,
        CancellationToken ct);

    /// <summary>
    /// Query runtime reachability from observed facts.
    /// </summary>
    Task<RuntimeReachabilityResult> QueryRuntimeAsync(
        SymbolRef symbol,
        string artifactDigest,
        TimeSpan observationWindow,
        CancellationToken ct);

    /// <summary>
    /// Query hybrid reachability combining static + runtime.
    /// </summary>
    Task<HybridReachabilityResult> QueryHybridAsync(
        SymbolRef symbol,
        string artifactDigest,
        HybridQueryOptions options,
        CancellationToken ct);

    /// <summary>
    /// Batch query for CVE vulnerability analysis.
    /// </summary>
    Task<IReadOnlyList<HybridReachabilityResult>> QueryBatchAsync(
        IEnumerable<SymbolRef> symbols,
        string artifactDigest,
        HybridQueryOptions options,
        CancellationToken ct);

    /// <summary>
    /// Get vulnerable symbols for a CVE.
    /// </summary>
    Task<CveSymbolMapping?> GetCveMappingAsync(
        string cveId,
        CancellationToken ct);
}
```
### Result Types
```csharp
public sealed record HybridReachabilityResult
{
    public required SymbolRef Symbol { get; init; }
    public required string ArtifactDigest { get; init; }
    public required LatticeState LatticeState { get; init; }
    public required double Confidence { get; init; }
    public required StaticEvidence? StaticEvidence { get; init; }
    public required RuntimeEvidence? RuntimeEvidence { get; init; }
    public required VerdictRecommendation Verdict { get; init; }
    public required ImmutableArray<string> EvidenceUris { get; init; }
    public required DateTimeOffset ComputedAt { get; init; }
    public required string ComputedBy { get; init; }
}

public sealed record VerdictRecommendation
{
    public required VexStatus Status { get; init; }
    public VexJustification? Justification { get; init; }
    public required ConfidenceBucket ConfidenceBucket { get; init; }
    public string? ImpactStatement { get; init; }
    public string? ActionStatement { get; init; }
}

public enum LatticeState
{
    Unknown = 0,
    StaticReachable = 1,
    StaticUnreachable = 2,
    RuntimeObserved = 3,
    RuntimeUnobserved = 4,
    ConfirmedReachable = 5,
    ConfirmedUnreachable = 6,
    Contested = 7
}
```
---
## Integration Points
### Upstream (Data Sources)
| Module | Interface | Data |
|--------|-----------|------|
| ReachGraph | `IReachGraphSliceService` | Static call-graph nodes/edges |
| Signals | `IRuntimeFactsService` | Runtime method observations |
| Scanner.CallGraph | `ICallGraphExtractor` | Per-artifact call graphs |
| Feedser | `IBackportProofService` | Patch analysis results |
### Downstream (Consumers)
| Module | Interface | Usage |
|--------|-----------|-------|
| Policy Engine | `IReachabilityAwareVexEmitter` | VEX verdict with evidence |
| VexLens | `IReachabilityIndex` | Consensus enrichment |
| Web Console | REST API | Evidence panel display |
| CLI | `stella reachability` | Command-line queries |
| ExportCenter | `IReachabilityExporter` | Offline bundles |
---
## Storage
### PostgreSQL Schema
```sql
-- CVE-Symbol Mappings
CREATE TABLE reachability.cve_symbol_mappings (
    id                  UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id           UUID NOT NULL,
    cve_id              TEXT NOT NULL,
    symbol_canonical_id TEXT NOT NULL,
    symbol_display_name TEXT NOT NULL,
    vulnerability_type  TEXT NOT NULL,
    condition           TEXT,
    source              TEXT NOT NULL,
    confidence          DECIMAL(3,2) NOT NULL,
    patch_commit_url    TEXT,
    delta_sig_digest    TEXT,
    extracted_at        TIMESTAMPTZ NOT NULL,
    created_at          TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (tenant_id, cve_id, symbol_canonical_id)
);

-- Query Cache
CREATE TABLE reachability.query_cache (
    cache_key           TEXT PRIMARY KEY,
    artifact_digest     TEXT NOT NULL,
    symbol_canonical_id TEXT NOT NULL,
    lattice_state       INTEGER NOT NULL,
    confidence          DECIMAL(3,2) NOT NULL,
    result_json         JSONB NOT NULL,
    computed_at         TIMESTAMPTZ NOT NULL,
    expires_at          TIMESTAMPTZ NOT NULL
);

-- Audit Log
CREATE TABLE reachability.query_audit_log (
    id              BIGSERIAL PRIMARY KEY,
    tenant_id       UUID NOT NULL,
    query_type      TEXT NOT NULL,
    artifact_digest TEXT NOT NULL,
    symbol_count    INTEGER NOT NULL,
    lattice_state   INTEGER NOT NULL,
    confidence      DECIMAL(3,2) NOT NULL,
    duration_ms     INTEGER NOT NULL,
    queried_at      TIMESTAMPTZ NOT NULL DEFAULT now()
);
```
### Valkey (Redis) Caching
| Key Pattern | TTL | Purpose |
|-------------|-----|---------|
| `reach:static:{artifact}:{symbol}` | 1h | Static query cache |
| `reach:runtime:{artifact}:{symbol}` | 5m | Runtime query cache |
| `reach:hybrid:{artifact}:{symbol}:{options_hash}` | 15m | Hybrid query cache |
| `cvemap:{cve_id}` | 24h | CVE mapping cache |
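
A small sketch of how the key patterns and TTLs in this table might be kept together in code, so that a key is never written with the wrong expiry. The helper names are hypothetical, and `optionsHash` is assumed to be computed elsewhere because the source does not say how it is derived.

```csharp
using System;

// Key patterns and TTLs copied from the table above; helper names are HYPOTHETICAL.
static (string Key, TimeSpan Ttl) StaticKey(string artifact, string symbol) =>
    ($"reach:static:{artifact}:{symbol}", TimeSpan.FromHours(1));

static (string Key, TimeSpan Ttl) RuntimeKey(string artifact, string symbol) =>
    ($"reach:runtime:{artifact}:{symbol}", TimeSpan.FromMinutes(5));

static (string Key, TimeSpan Ttl) HybridKey(string artifact, string symbol, string optionsHash) =>
    ($"reach:hybrid:{artifact}:{symbol}:{optionsHash}", TimeSpan.FromMinutes(15));

static (string Key, TimeSpan Ttl) CveMapKey(string cveId) =>
    ($"cvemap:{cveId}", TimeSpan.FromHours(24));

var (key, ttl) = HybridKey("sha256:abc", "sha256:def", "opts1");
Console.WriteLine($"{key} expires after {ttl.TotalMinutes} min");
```

The tenant prefix mentioned under Tenant Isolation does not appear in the patterns listed here, so it is left out of the sketch rather than guessed at.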
---
## Determinism Guarantees
### Reproducibility Rules
1. **Canonical Symbol IDs:** SHA-256 of `purl|namespace|type|method|signature` (lowercase, sorted)
2. **Stable Lattice Transitions:** Deterministic state machine, no randomness
3. **Ordered Evidence:** Evidence URIs sorted lexicographically
4. **Time Injection:** All `ComputedAt` via `TimeProvider`
5. **Culture Invariance:** `InvariantCulture` for all string operations
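
Rule 1 can be sketched as a pure function. The version below joins the five fields with `|`, lowercases the result with the invariant culture (rule 5), and hashes the UTF-8 bytes with SHA-256. The function name, the UTF-8 encoding, and the `sha256:` prefix on the hex digest are assumptions (the prefix follows the `canonicalId` example in the CVE mapping JSON). What "sorted" refers to in rule 1 is not specified in the source, so no sorting is applied here.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// HYPOTHETICAL helper: SHA-256 over "purl|namespace|type|method|signature", lowercased.
static string CanonicalSymbolId(string purl, string ns, string type, string method, string signature)
{
    string joined = string.Join('|', purl, ns, type, method, signature).ToLowerInvariant();
    byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(joined));
    return "sha256:" + Convert.ToHexString(hash).ToLowerInvariant();
}

// Illustrative input values only.
string id = CanonicalSymbolId(
    "pkg:maven/org.apache.logging.log4j/log4j-core",
    "org.apache.log4j.core.lookup", "JndiLookup", "lookup", "(string)");
Console.WriteLine(id);
```

Because the function depends only on its arguments, two extractors that disagree on casing still arrive at the same ID, which is what makes static and runtime symbols joinable.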
### Replay Verification
```csharp
public interface IReachabilityReplayService
{
    Task<ReplayResult> ReplayAsync(
        HybridReachabilityInputs inputs,
        HybridReachabilityResult expected,
        CancellationToken ct);
}
```
---
## Performance Characteristics
| Operation | Target P95 | Notes |
|-----------|-----------|-------|
| Static query (cached) | <10ms | Valkey hit |
| Static query (uncached) | <100ms | ReachGraph slice |
| Runtime query (cached) | <5ms | Valkey hit |
| Runtime query (uncached) | <50ms | Signals lookup |
| Hybrid query | <50ms | Parallel static + runtime |
| Batch query (100 symbols) | <500ms | Parallelized |
| CVE mapping lookup | <10ms | Cached |
| Symbol canonicalization | <1ms | In-memory |
---
## Security Considerations
### Access Control
| Operation | Required Scope |
|-----------|---------------|
| Query reachability | `reachability:read` |
| Ingest CVE mapping | `reachability:write` |
| Admin CVE mapping | `reachability:admin` |
| Export bundles | `reachability:export` |
### Tenant Isolation
- All queries filtered by `tenant_id`
- RLS policies on all tables
- Cache keys include tenant prefix
### Data Sensitivity
- Symbol names may reveal internal architecture
- Runtime traces expose execution patterns
- CVE mappings are security-sensitive
---
## Observability
### Metrics
| Metric | Type | Labels |
|--------|------|--------|
| `reachability_query_duration_seconds` | histogram | query_type, cache_hit |
| `reachability_lattice_state_total` | counter | state |
| `reachability_cache_hit_ratio` | gauge | cache_type |
| `reachability_cvemap_count` | gauge | source |
### Traces
| Span | Description |
|------|-------------|
| `reachability.query.static` | Static graph query |
| `reachability.query.runtime` | Runtime facts query |
| `reachability.query.hybrid` | Combined computation |
| `reachability.canonicalize` | Symbol normalization |
| `reachability.lattice.compute` | State calculation |
---
## Related Documentation
- [Product Advisory: Hybrid Reachability](../../product/advisories/09-Jan-2026%20-%20Hybrid%20Reachability%20and%20VEX%20Integration%20(Revised).md)
- [ReachGraph Architecture](../reach-graph/architecture.md)
- [Signals Architecture](../signals/architecture.md)
- [VexLens Architecture](../vex-lens/architecture.md)
- [Sprint Index](../../implplan/SPRINT_20260109_009_000_INDEX_hybrid_reachability.md)
---
_Last updated: 09-Jan-2026_