git.stella-ops.org/docs-archived/sprints/evid-001/TASKS_DETAILED.md
2026-01-12 12:24:17 +02:00


# Evidence Pipeline - Detailed Task Breakdown
**Format**: JIRA-compatible task definitions with acceptance criteria.
---
## Epic: EVID-001 - Evidence Pipeline Consolidation
### User Story: EVID-001-001 - CVE-to-Sink Mapping Service
**As a** scanner
**I want to** look up vulnerable symbols (sinks) for a given CVE
**So that** I can run targeted reachability analysis
#### Tasks
---
**EVID-001-001-001**: Create ICveSymbolMappingService interface
- **Type**: Task
- **Component**: Scanner.Reachability
- **Effort**: 0.5 days
- **Priority**: P0
**Description**:
Define the interface for CVE-to-symbol mapping lookup.
**Acceptance Criteria**:
```gherkin
Given a CVE ID and PURL
When I call GetSinksForCveAsync
Then I receive a list of VulnerableSymbol records
And each symbol has name, canonical_id, file_path, and confidence
```
**Technical Notes**:
```csharp
// Location: Scanner/__Libraries/StellaOps.Scanner.Reachability/Services/ICveSymbolMappingService.cs
public interface ICveSymbolMappingService
{
    Task<IReadOnlyList<VulnerableSymbol>> GetSinksForCveAsync(
        string cveId,
        string purl,
        CancellationToken ct);

    Task<bool> HasMappingAsync(string cveId, CancellationToken ct);

    Task<int> GetMappingCountAsync(CancellationToken ct);
}

public record VulnerableSymbol(
    string SymbolName,
    string? CanonicalId,
    string? FilePath,
    int? StartLine,
    VulnerabilityType VulnType,
    decimal Confidence);
```
**Dependencies**: None
---
**EVID-001-001-002**: Implement PostgresCveSymbolMappingRepository
- **Type**: Task
- **Component**: Scanner.Reachability
- **Effort**: 2 days
- **Priority**: P0
**Description**:
Implement the repository against the existing `reachability.cve_symbol_mappings` table.
**Acceptance Criteria**:
```gherkin
Given CVE-2021-44228 exists in the database
When I query for sinks with purl "pkg:maven/org.apache.logging.log4j/log4j-core"
Then I receive JndiLookup.lookup and JndiManager.lookup symbols
And confidence scores are included
```
**Technical Notes**:
```csharp
// Uses existing schema from V20260110__reachability_cve_mapping_schema.sql
// EF Core entity mapping to reachability.cve_symbol_mappings table
```
**Dependencies**: EVID-001-001-001
---
**EVID-001-001-003**: Create CveSymbolMappingLoader
- **Type**: Task
- **Component**: Concelier
- **Effort**: 3 days
- **Priority**: P0
**Description**:
Import CVE-to-symbol mappings from OSV, NVD, and patch analysis.
**Acceptance Criteria**:
```gherkin
Given an OSV advisory with affected symbols
When the loader processes the advisory
Then symbols are inserted into cve_symbol_mappings
And source is set to 'osv_advisory'

Given a git patch URL
When the loader analyzes the diff
Then changed functions are extracted
And inserted with source 'patch_analysis'
```
**Technical Notes**:
- Parse OSV `affected[].ranges[].events` for version info
- Parse `affected[].ecosystem_specific.vulnerable_functions` for symbols
- For patch analysis, use existing diff parsing from Concelier.Analyzers
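
The OSV symbol lookup described in the notes above could be sketched as follows. This is an illustration under assumptions, not the loader's implementation: `OsvSymbolReader` is a hypothetical name, the `vulnerable_functions` field is taken from the note above and is assumed to be an array of strings, and range/version parsing is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: collects (package, symbol) pairs from an OSV advisory document.
public static class OsvSymbolReader
{
    public static IReadOnlyList<(string Package, string Symbol)> ReadVulnerableFunctions(string osvJson)
    {
        var results = new List<(string Package, string Symbol)>();
        using var doc = JsonDocument.Parse(osvJson);

        if (!doc.RootElement.TryGetProperty("affected", out var affected)
            || affected.ValueKind != JsonValueKind.Array)
        {
            return results;
        }

        foreach (var entry in affected.EnumerateArray())
        {
            if (entry.ValueKind != JsonValueKind.Object)
            {
                continue;
            }

            // affected[].package.name is optional here; fall back to an empty name.
            var package = string.Empty;
            if (entry.TryGetProperty("package", out var pkg)
                && pkg.ValueKind == JsonValueKind.Object
                && pkg.TryGetProperty("name", out var name)
                && name.ValueKind == JsonValueKind.String)
            {
                package = name.GetString() ?? string.Empty;
            }

            // Assumed field (per the note above): ecosystem_specific.vulnerable_functions.
            if (!entry.TryGetProperty("ecosystem_specific", out var eco)
                || eco.ValueKind != JsonValueKind.Object
                || !eco.TryGetProperty("vulnerable_functions", out var funcs)
                || funcs.ValueKind != JsonValueKind.Array)
            {
                continue;
            }

            foreach (var func in funcs.EnumerateArray())
            {
                if (func.ValueKind == JsonValueKind.String)
                {
                    var symbol = func.GetString();
                    if (!string.IsNullOrEmpty(symbol))
                    {
                        results.Add((package, symbol));
                    }
                }
            }
        }

        return results;
    }
}
```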
**Dependencies**: EVID-001-001-002
---
**EVID-001-001-004**: Create PatchAnalysisExtractor
- **Type**: Task
- **Component**: Scanner.Reachability
- **Effort**: 2 days
- **Priority**: P1
**Description**:
Parse git diffs to extract changed function symbols.
**Acceptance Criteria**:
```gherkin
Given a git diff URL for a security patch
When I run the extractor
Then I receive a list of changed function names
And file paths and line numbers are included
And language is detected
```
**Technical Notes**:
```csharp
public interface IPatchAnalysisExtractor
{
    Task<PatchAnalysisResult> ExtractAsync(string commitUrl, CancellationToken ct);
}

public record PatchAnalysisResult(
    string CommitUrl,
    string? Language,
    IReadOnlyList<ExtractedSymbol> Symbols,
    string? Error);
```
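A rough first pass at symbol extraction can read the hunk headers of a unified diff, since git appends the enclosing function's signature line after the second `@@`. The sketch below assumes that approach; `UnifiedDiffHunks` and `HunkSymbolSketch` are hypothetical names, and the captured context is a raw signature line rather than a normalized symbol name. Downloading the diff and language detection are out of scope here.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical result shape: one entry per hunk that carries a function context.
public sealed record HunkSymbolSketch(string FilePath, int StartLine, string Context);

public static class UnifiedDiffHunks
{
    // Matches "@@ -a,b +c,d @@ optional context" and captures c and the context.
    private static readonly Regex HunkHeader = new Regex(
        @"^@@ -\d+(?:,\d+)? \+(?<start>\d+)(?:,\d+)? @@ ?(?<context>.*)$",
        RegexOptions.Compiled);

    public static IReadOnlyList<HunkSymbolSketch> Extract(string diffText)
    {
        var results = new List<HunkSymbolSketch>();
        string? currentFile = null;

        foreach (var rawLine in diffText.Split('\n'))
        {
            var line = rawLine.TrimEnd('\r');

            // "+++ b/path" names the post-patch file for the hunks that follow.
            // Simplification: an added line whose content starts with "++ " would
            // also match this check; a full parser tracks hunk line counts instead.
            if (line.StartsWith("+++ ", StringComparison.Ordinal))
            {
                var path = line.Substring(4);
                if (path == "/dev/null")
                {
                    currentFile = null; // file was deleted by the patch
                }
                else if (path.StartsWith("b/", StringComparison.Ordinal))
                {
                    currentFile = path.Substring(2);
                }
                else
                {
                    currentFile = path;
                }
                continue;
            }

            var match = HunkHeader.Match(line);
            if (!match.Success || currentFile is null)
            {
                continue;
            }

            var context = match.Groups["context"].Value.Trim();
            if (context.Length == 0)
            {
                continue; // hunk without an enclosing-function hint
            }

            var startLine = int.Parse(match.Groups["start"].Value, CultureInfo.InvariantCulture);
            results.Add(new HunkSymbolSketch(currentFile, startLine, context));
        }

        return results;
    }
}
```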
**Dependencies**: None
---
**EVID-001-001-005**: Wire to Concelier CVE enrichment
- **Type**: Task
- **Component**: Concelier
- **Effort**: 2 days
- **Priority**: P0
**Description**:
Enrich CVE data with sink mappings during ingestion.
**Acceptance Criteria**:
```gherkin
Given a new CVE is ingested from NVD
When Concelier processes it
Then it checks for OSV symbol data
And creates cve_symbol_mappings entries if found
And marks the CVE with has_reachability_data = true
```
**Dependencies**: EVID-001-001-003
---
### User Story: EVID-001-002 - Reachability Evidence Job
**As a** scanner
**I want to** queue reachability analysis for a specific CVE+image
**So that** I get an evidence-backed verdict
#### Tasks
---
**EVID-001-002-001**: Create ReachabilityEvidenceJob model
- **Type**: Task
- **Component**: Scanner.Queue
- **Effort**: 1 day
- **Priority**: P0
**Description**:
Define the job model for queued reachability analysis.
**Acceptance Criteria**:
```gherkin
Given a reachability job request
When serialized and deserialized
Then all fields are preserved
And the job ID is derived deterministically from a hash of the inputs
```
**Technical Notes**:
```csharp
public record ReachabilityEvidenceJob(
    string JobId,
    string ImageDigest,
    string CveId,
    string Purl,
    string? SourceCommit,
    ReachabilityJobOptions Options,
    DateTimeOffset QueuedAt);

public record ReachabilityJobOptions(
    bool IncludeL2 = false, // Binary resolution
    bool IncludeL3 = false, // Runtime (if available)
    int MaxPaths = 5,
    int MaxDepth = 256);
```
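One way to meet the deterministic-ID criterion is to hash a canonical rendering of the job inputs and leave `QueuedAt` out, since the timestamp differs between otherwise identical requests. The sketch below is an assumption rather than the project's implementation: `ReachabilityJobId`, the `rej-` prefix, the field order, and the newline separator are all hypothetical choices.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: derives a stable job ID from the job inputs.
public static class ReachabilityJobId
{
    public static string Compute(
        string imageDigest,
        string cveId,
        string purl,
        string? sourceCommit,
        bool includeL2,
        bool includeL3,
        int maxPaths,
        int maxDepth)
    {
        // A fixed field order and a separator that cannot appear inside the fields
        // keep the hash input canonical. QueuedAt is deliberately excluded so that
        // re-queuing the same request yields the same ID.
        var canonical = string.Join(
            "\n",
            imageDigest,
            cveId.ToUpperInvariant(),
            purl,
            sourceCommit ?? string.Empty,
            includeL2 ? "1" : "0",
            includeL3 ? "1" : "0",
            maxPaths.ToString(CultureInfo.InvariantCulture),
            maxDepth.ToString(CultureInfo.InvariantCulture));

        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return "rej-" + Convert.ToHexString(hash).ToLowerInvariant();
    }
}
```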
**Dependencies**: None
---
**EVID-001-002-002**: Create ReachabilityEvidenceJobExecutor
- **Type**: Task
- **Component**: Scanner.Worker
- **Effort**: 3 days
- **Priority**: P0
**Description**:
Implement the job executor that orchestrates L1 analysis.
**Acceptance Criteria**:
```gherkin
Given a queued reachability job
When the executor processes it
Then it retrieves or computes CallGraphSnapshot
And runs ReachabilityAnalyzer with CVE sinks
And produces ReachabilityStack with L1 populated
And stores EvidenceBundle in database
```
**Technical Notes**:
```csharp
// Requires: using System.Collections.Immutable; using System.Linq;
public sealed class ReachabilityEvidenceJobExecutor : IJobExecutor<ReachabilityEvidenceJob>
{
    // Inject: ICveSymbolMappingService, ICallGraphCache, ReachabilityAnalyzer,
    // ReachabilityStackEvaluator, IEvidenceStore

    public async Task<ReachabilityStack> ExecuteAsync(
        ReachabilityEvidenceJob job,
        CancellationToken ct)
    {
        // 1. Get sinks from CVE mapping service
        var sinks = await _cveSymbolService.GetSinksForCveAsync(job.CveId, job.Purl, ct);

        // 2. Get or compute call graph
        var graph = await _callGraphCache.GetOrComputeAsync(job.ImageDigest, ct);

        // 3. Run reachability analysis
        var analysisResult = _analyzer.Analyze(graph, new ReachabilityAnalysisOptions
        {
            ExplicitSinks = sinks.Select(s => s.CanonicalId ?? s.SymbolName).ToImmutableArray(),
            MaxTotalPaths = job.Options.MaxPaths,
            MaxDepth = job.Options.MaxDepth
        });

        // 4. Build Layer 1
        var layer1 = BuildLayer1(analysisResult);

        // 5. Evaluate stack (L2, L3 as Unknown initially)
        // Note: VulnerableSymbol.Unknown is a fallback sentinel that the record in
        // EVID-001-001-001 does not declare; it has to be added there.
        var stack = _stackEvaluator.Evaluate(
            findingId: $"{job.CveId}:{job.Purl}",
            symbol: sinks.FirstOrDefault() ?? VulnerableSymbol.Unknown,
            layer1: layer1,
            layer2: ReachabilityLayer2.Unknown(),
            layer3: ReachabilityLayer3.Unknown());

        // 6. Store evidence
        await _evidenceStore.StoreAsync(stack.ToEvidenceBundle(), ct);

        return stack;
    }
}
```
**Dependencies**: EVID-001-001-001, EVID-001-002-001
---
**EVID-001-002-003**: Wire CallGraphSnapshot retrieval
- **Type**: Task
- **Component**: Scanner.Worker
- **Effort**: 2 days
- **Priority**: P0
**Description**:
Integrate with existing call graph cache/computation.
**Acceptance Criteria**:
```gherkin
Given an image digest with existing call graph
When the job requests it
Then the cached graph is returned

Given an image digest without cached call graph
When the job requests it
Then the graph is computed
And cached for future requests
```
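The get-or-compute behaviour in these criteria can be illustrated with an in-memory cache keyed by image digest. This is a sketch under assumptions: `GetOrComputeCache<TValue>` is a hypothetical stand-in for the existing call graph cache, it keeps entries in process memory only, and the computation runs with the cancellation token of whichever caller arrives first.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical in-memory cache; the real ICallGraphCache may be persistent.
public sealed class GetOrComputeCache<TValue>
{
    private readonly ConcurrentDictionary<string, Lazy<Task<TValue>>> _entries =
        new ConcurrentDictionary<string, Lazy<Task<TValue>>>();

    private readonly Func<string, CancellationToken, Task<TValue>> _compute;

    public GetOrComputeCache(Func<string, CancellationToken, Task<TValue>> compute)
    {
        _compute = compute;
    }

    public async Task<TValue> GetOrComputeAsync(string imageDigest, CancellationToken ct)
    {
        // Lazy<T> makes concurrent callers for the same digest share one computation
        // instead of each starting their own.
        var entry = _entries.GetOrAdd(
            imageDigest,
            key => new Lazy<Task<TValue>>(() => _compute(key, ct)));

        try
        {
            return await entry.Value.ConfigureAwait(false);
        }
        catch
        {
            // Drop failed or cancelled computations so a later request can retry.
            _entries.TryRemove(imageDigest, out _);
            throw;
        }
    }
}
```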
**Dependencies**: None (uses existing CallGraph infrastructure)
---
**EVID-001-002-004**: Emit ReachabilityStack with L1 analysis
- **Type**: Task
- **Component**: Scanner.Reachability
- **Effort**: 1 day
- **Priority**: P0
**Description**:
Convert ReachabilityAnalysisResult to ReachabilityLayer1.
**Acceptance Criteria**:
```gherkin
Given a ReachabilityAnalysisResult with paths
When converted to Layer1
Then IsReachable is true
And Paths contains the converted paths
And ReachingEntrypoints lists unique entrypoints
And Confidence is High

Given a ReachabilityAnalysisResult with no paths
When converted to Layer1
Then IsReachable is false
And Confidence is Medium
```
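The conversion rules above (the `BuildLayer1` step in the executor sketch) might look like the following. All types here are simplified, hypothetical stand-ins for `ReachabilityAnalysisResult` and `ReachabilityLayer1`, whose real shapes are not shown in this document; only the reachable/unreachable and High/Medium rules come from the criteria.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum ConfidenceSketch { Low, Medium, High }

// Hypothetical path shape: the entrypoint plus the call chain down to the sink.
public sealed record CallPathSketch(string Entrypoint, IReadOnlyList<string> Nodes);

public sealed record Layer1Sketch(
    bool IsReachable,
    IReadOnlyList<CallPathSketch> Paths,
    IReadOnlyList<string> ReachingEntrypoints,
    ConfidenceSketch Confidence);

public static class Layer1Builder
{
    public static Layer1Sketch Build(IReadOnlyList<CallPathSketch> paths)
    {
        if (paths.Count == 0)
        {
            // No path found: reported as not reachable, with Medium confidence
            // as required by the second scenario.
            return new Layer1Sketch(
                false,
                Array.Empty<CallPathSketch>(),
                Array.Empty<string>(),
                ConfidenceSketch.Medium);
        }

        // Deduplicate entrypoints and sort them so the output is stable across runs.
        var entrypoints = paths
            .Select(p => p.Entrypoint)
            .Distinct(StringComparer.Ordinal)
            .OrderBy(e => e, StringComparer.Ordinal)
            .ToArray();

        return new Layer1Sketch(true, paths, entrypoints, ConfidenceSketch.High);
    }
}
```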
**Dependencies**: EVID-001-002-002
---
**EVID-001-002-005**: Store result in EvidenceDbContext
- **Type**: Task
- **Component**: Evidence.Persistence
- **Effort**: 1 day
- **Priority**: P0
**Description**:
Persist the ReachabilityStack as an EvidenceBundle.
**Acceptance Criteria**:
```gherkin
Given a ReachabilityStack
When stored
Then an EvidenceBundle is created
And ReachabilityEvidence is included
And the bundle ID is returned
And it can be retrieved by ID
```
**Dependencies**: None (uses existing EvidenceDbContext)
---
**EVID-001-002-006**: Add WebService endpoint
- **Type**: Task
- **Component**: Scanner.WebService
- **Effort**: 1 day
- **Priority**: P0
**Description**:
Expose reachability analysis via REST API.
**Acceptance Criteria**:
```gherkin
Given a POST to /api/reachability/analyze
And the request body is { imageDigest, cveId, purl }
When the request is processed
Then a job is queued
And the job ID is returned

Given a GET to /api/reachability/result/{jobId}
When the job is complete
Then the ReachabilityStack is returned
And evidence bundle URI is included
```
**Technical Notes**:
```csharp
// Location: Scanner.WebService/Endpoints/ReachabilityEvidenceEndpoints.cs
public static class ReachabilityEvidenceEndpoints
{
    public static void MapReachabilityEvidenceEndpoints(this IEndpointRouteBuilder routes)
    {
        routes.MapPost("/api/reachability/analyze", AnalyzeAsync);
        routes.MapGet("/api/reachability/result/{jobId}", GetResultAsync);
    }
}
```
**Dependencies**: EVID-001-002-002
---
### User Story: EVID-001-003 - VEX Integration
**As a** security team
**I want** reachability verdicts to automatically update VEX status
**So that** I don't manually triage unreachable vulnerabilities
#### Tasks
---
**EVID-001-003-001**: Create IVexStatusDeterminer interface
- **Type**: Task
- **Component**: VexHub
- **Effort**: 0.5 days
- **Priority**: P0
**Description**:
Define interface for verdict-to-VEX mapping.
**Technical Notes**:
```csharp
public interface IVexStatusDeterminer
{
    VexStatus DetermineStatus(ReachabilityVerdict verdict);

    VexJustification BuildJustification(
        ReachabilityStack stack,
        IReadOnlyList<string> evidenceUris);
}
```
**Dependencies**: None
---
**EVID-001-003-002**: Implement verdict to VEX status mapping
- **Type**: Task
- **Component**: VexHub
- **Effort**: 2 days
- **Priority**: P0
**Description**:
Map ReachabilityVerdict to CycloneDX VEX status.
**Acceptance Criteria**:
```gherkin
Given verdict Exploitable
When mapped to VEX
Then status is "affected"
And impact_statement includes path summary

Given verdict Unreachable
When mapped to VEX
Then status is "not_affected"
And justification includes "code_not_reachable"
```
**Technical Notes**:
```csharp
public VexStatus DetermineStatus(ReachabilityVerdict verdict) => verdict switch
{
    ReachabilityVerdict.Exploitable => VexStatus.Affected,
    ReachabilityVerdict.LikelyExploitable => VexStatus.Affected,
    ReachabilityVerdict.PossiblyExploitable => VexStatus.UnderInvestigation,
    ReachabilityVerdict.Unreachable => VexStatus.NotAffected,
    ReachabilityVerdict.Unknown => VexStatus.UnderInvestigation,
    _ => VexStatus.UnderInvestigation
};
```
**Dependencies**: EVID-001-003-001
---
**EVID-001-003-003**: Create ReachabilityVexJustificationBuilder
- **Type**: Task
- **Component**: VexHub
- **Effort**: 2 days
- **Priority**: P0
**Description**:
Build VEX justification from reachability evidence.
**Acceptance Criteria**:
```gherkin
Given a ReachabilityStack with Unreachable verdict
When justification is built
Then detail includes "No call path from entrypoints to vulnerable symbol"
And layer summaries are included
And evidence URIs are referenced
```
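As an illustration of these criteria, the detail text for an Unreachable verdict could be assembled as below. This is only a sketch: `UnreachableJustificationSketch` and `LayerSummarySketch` are hypothetical names, the sentence layout is an assumption, and the real builder would read its inputs from `ReachabilityStack` rather than from plain lists.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical per-layer summary, e.g. ("L1", "0 paths found").
public sealed record LayerSummarySketch(string Layer, string Summary);

public static class UnreachableJustificationSketch
{
    public static string BuildDetail(
        IReadOnlyList<LayerSummarySketch> layers,
        IReadOnlyList<string> evidenceUris)
    {
        var detail = new StringBuilder();

        // Leading statement required by the acceptance criteria.
        detail.Append("No call path from entrypoints to vulnerable symbol.");

        // One short clause per analysis layer.
        foreach (var layer in layers)
        {
            detail.Append(' ').Append(layer.Layer).Append(": ").Append(layer.Summary).Append('.');
        }

        // Reference the evidence bundles that back the statement, if any.
        if (evidenceUris.Count > 0)
        {
            detail.Append(" Evidence: ").Append(string.Join(", ", evidenceUris));
        }

        return detail.ToString();
    }
}
```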
**Dependencies**: EVID-001-003-002
---
**EVID-001-003-004**: Wire to VexHub emission
- **Type**: Task
- **Component**: VexHub
- **Effort**: 3 days
- **Priority**: P0
**Description**:
Automatically emit VEX documents when reachability evidence is produced.
**Acceptance Criteria**:
```gherkin
Given a new ReachabilityStack is stored
When the VEX bridge processes it
Then a VEX statement is created for the CVE+component
And it references the evidence bundle
And it's stored in VexHub
```
**Dependencies**: EVID-001-003-003
---
**EVID-001-003-005**: Add evidence URI to VEX justification
- **Type**: Task
- **Component**: VexHub
- **Effort**: 1 day
- **Priority**: P0
**Description**:
Include evidence bundle URI in VEX document.
**Acceptance Criteria**:
```gherkin
Given a VEX document with reachability evidence
When serialized to CycloneDX
Then analysis.detail includes evidence URI
And URI follows stella:// scheme
```
**Dependencies**: EVID-001-003-004
---
### User Story: EVID-001-004 - Runtime Observation
**As a** security team
**I want** runtime execution data to refine reachability verdicts
**So that** I know if vulnerable code is actually running
#### Tasks
---
**EVID-001-004-001**: Create IRuntimeCaptureAdapter interface
- **Type**: Task
- **Component**: Scanner.Analyzers.Native
- **Effort**: 1 day
- **Priority**: P0
**Description**:
Define the interface for runtime capture backends.
**Technical Notes**:
```csharp
// Location: Scanner.Analyzers.Native/RuntimeCapture/IRuntimeCaptureAdapter.cs
public interface IRuntimeCaptureAdapter
{
    string Platform { get; } // "linux", "windows", "macos"
    string Method { get; }   // "ebpf", "etw", "dyld-interpose"

    Task<RuntimeCaptureSession> StartSessionAsync(
        RuntimeCaptureOptions options,
        CancellationToken ct);

    Task StopSessionAsync(string sessionId, CancellationToken ct);

    IAsyncEnumerable<RuntimeLoadEvent> StreamEventsAsync(
        string sessionId,
        CancellationToken ct);

    Task<RuntimeEvidence> GetEvidenceAsync(
        string sessionId,
        CancellationToken ct);
}
```
**Dependencies**: None (uses existing RuntimeEvidence models)
---
**EVID-001-004-002**: Implement TetragonAdapter
- **Type**: Task
- **Component**: Scanner.Analyzers.Native
- **Effort**: 5 days
- **Priority**: P1
**Description**:
Implement runtime capture using Cilium Tetragon.
**Acceptance Criteria**:
```gherkin
Given Tetragon is running in the cluster
When a capture session is started
Then tracing policies are applied
And library load events are captured
And events are correlated to container digest

Given a session is stopped
When evidence is requested
Then all captured events are returned
And unique libraries are summarized
```
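The "unique libraries are summarized" step can be sketched independently of Tetragon itself. The example below assumes a simplified, hypothetical event shape (`LoadEventSketch`); the real `RuntimeLoadEvent` may carry different fields, and no Tetragon API is used here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical event shape: which container loaded which library, and when.
public sealed record LoadEventSketch(string ContainerDigest, string LibraryPath, DateTimeOffset Timestamp);

public sealed record LibrarySummarySketch(string LibraryPath, int LoadCount, DateTimeOffset FirstSeen);

public static class LoadEventSummarizer
{
    public static IReadOnlyList<LibrarySummarySketch> Summarize(
        IEnumerable<LoadEventSketch> events,
        string containerDigest)
    {
        return events
            // Keep only events correlated to the requested container.
            .Where(e => string.Equals(e.ContainerDigest, containerDigest, StringComparison.Ordinal))
            // One summary row per distinct library path.
            .GroupBy(e => e.LibraryPath, StringComparer.Ordinal)
            .Select(g => new LibrarySummarySketch(g.Key, g.Count(), g.Min(e => e.Timestamp)))
            // Sort for stable output.
            .OrderBy(s => s.LibraryPath, StringComparer.Ordinal)
            .ToList();
    }
}
```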
**Technical Notes**:
- Use Tetragon gRPC API for event streaming
- Apply TracingPolicy for library loads
- Correlate events via cgroup/container ID
**Dependencies**: EVID-001-004-001
---
**EVID-001-004-003**: Implement EtwAdapter (Windows)
- **Type**: Task
- **Component**: Scanner.Analyzers.Native
- **Effort**: 3 days
- **Priority**: P2
**Description**:
Implement runtime capture using Windows ETW.
**Acceptance Criteria**:
```gherkin
Given ETW providers are available
When a capture session is started
Then DLL load events are captured
And events include process and timestamp
```
**Dependencies**: EVID-001-004-001
---
**EVID-001-004-004**: Create RuntimeEvidenceCollector service
- **Type**: Task
- **Component**: Scanner.Worker
- **Effort**: 2 days
- **Priority**: P1
**Description**:
Orchestrate runtime evidence collection for a target.
**Acceptance Criteria**:
```gherkin
Given a container image running in the cluster
When the collector is invoked
Then it selects the appropriate adapter
And starts a session for the configured duration
And stores the runtime evidence
```
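Adapter selection could be as simple as matching the target platform against each adapter's `Platform` value. The sketch below assumes that rule; `ICaptureAdapterInfo` is a hypothetical, trimmed-down view of `IRuntimeCaptureAdapter` (only `Platform` and `Method`), and failing with `NotSupportedException` when nothing matches is an assumption about the desired behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, reduced view of IRuntimeCaptureAdapter used only for selection.
public interface ICaptureAdapterInfo
{
    string Platform { get; }
    string Method { get; }
}

public sealed record AdapterInfoSketch(string Platform, string Method) : ICaptureAdapterInfo;

public static class CaptureAdapterSelector
{
    public static ICaptureAdapterInfo Select(
        IEnumerable<ICaptureAdapterInfo> adapters,
        string platform)
    {
        // First registered adapter for the platform wins; registration order
        // therefore acts as the priority order.
        var match = adapters.FirstOrDefault(
            a => string.Equals(a.Platform, platform, StringComparison.OrdinalIgnoreCase));

        return match ?? throw new NotSupportedException(
            $"No runtime capture adapter registered for platform '{platform}'.");
    }
}
```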
**Dependencies**: EVID-001-004-002
---
**EVID-001-004-005**: Wire to existing RuntimeEvidence models
- **Type**: Task
- **Component**: Scanner.Analyzers.Native
- **Effort**: 1 day
- **Priority**: P1
**Description**:
Ensure adapters produce existing RuntimeEvidence types.
**Dependencies**: EVID-001-004-004
---
### User Story: EVID-001-005 - Binary Patch Verification
**As a** security team
**I want to** verify whether a binary is actually patched
**So that** I can trust distro backport claims
#### Tasks
---
**EVID-001-005-001**: Create IBinaryDiffService interface
- **Type**: Task
- **Component**: Scanner.Reachability
- **Effort**: 1 day
- **Priority**: P1
**Description**:
Define interface for binary comparison.
**Technical Notes**:
```csharp
public interface IBinaryDiffService
{
    Task<PatchDiffResult> DiffAsync(
        Stream vulnerableBinary,
        Stream patchedBinary,
        IReadOnlyList<string> targetSymbols,
        CancellationToken ct);
}

public record PatchDiffResult(
    bool IsPatched,
    IReadOnlyList<FunctionDiff> ChangedFunctions,
    double SimilarityScore,
    string DiffSummary,
    byte[]? DiffArtifact);

public record FunctionDiff(
    string FunctionName,
    ulong OriginalAddress,
    ulong PatchedAddress,
    int InstructionChanges,
    double Similarity);
```
**Dependencies**: None
---
**EVID-001-005-002**: Implement B2R2BinaryDiffService
- **Type**: Task
- **Component**: Scanner.Reachability
- **Effort**: 5 days
- **Priority**: P1
**Description**:
Implement binary diffing using B2R2 (already a project dependency).
**Acceptance Criteria**:
```gherkin
Given two versions of a binary (vulnerable and patched)
When diffed for a target symbol
Then changed basic blocks are identified
And similarity score is computed
And IsPatched is true if significant changes are detected
```
**Technical Notes**:
- Use B2R2.FrontEnd for disassembly
- Use B2R2.MiddleEnd for IR comparison
- Compare function CFGs for similarity
**Dependencies**: EVID-001-005-001
---
**EVID-001-005-003**: Create function similarity matching
- **Type**: Task
- **Component**: Scanner.Reachability
- **Effort**: 3 days
- **Priority**: P1
**Description**:
Match functions between binaries even with different addresses.
**Acceptance Criteria**:
```gherkin
Given a function name from CVE mapping
When searching in patched binary
Then the function is located by name, or by signature similarity if it was renamed
And match confidence is returned
```
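The name-first, similarity-fallback lookup can be sketched as follows. Everything here is an assumption for illustration: the feature sets stand in for whatever B2R2-derived signature is chosen (for example normalized instruction mnemonics), Jaccard similarity is a simple substitute for CFG comparison, and the `0.7` threshold and the confidence of `1.0` for name matches are hypothetical values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical function descriptor: a name plus a bag of signature features.
public sealed record FunctionSketch(string Name, IReadOnlyCollection<string> Features);

public sealed record FunctionMatchSketch(string MatchedName, double Confidence, bool MatchedByName);

public static class FunctionMatcher
{
    public static FunctionMatchSketch? Find(
        FunctionSketch target,
        IReadOnlyList<FunctionSketch> candidates,
        double minSimilarity = 0.7)
    {
        // 1. Prefer an exact name match.
        var byName = candidates.FirstOrDefault(
            c => string.Equals(c.Name, target.Name, StringComparison.Ordinal));
        if (byName is not null)
        {
            return new FunctionMatchSketch(byName.Name, 1.0, true);
        }

        // 2. Otherwise pick the most similar candidate above the threshold.
        FunctionSketch? best = null;
        var bestScore = 0.0;
        foreach (var candidate in candidates)
        {
            var score = Jaccard(target.Features, candidate.Features);
            if (score > bestScore)
            {
                best = candidate;
                bestScore = score;
            }
        }

        return best is not null && bestScore >= minSimilarity
            ? new FunctionMatchSketch(best.Name, bestScore, false)
            : null;
    }

    // Jaccard similarity: |A ∩ B| / |A ∪ B| over the distinct features.
    public static double Jaccard(IReadOnlyCollection<string> a, IReadOnlyCollection<string> b)
    {
        var setA = new HashSet<string>(a, StringComparer.Ordinal);
        var setB = new HashSet<string>(b, StringComparer.Ordinal);
        if (setA.Count == 0 && setB.Count == 0)
        {
            return 1.0;
        }

        var intersection = setA.Count(x => setB.Contains(x));
        var union = setA.Count + setB.Count - intersection;
        return (double)intersection / union;
    }
}
```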
**Dependencies**: EVID-001-005-002
---
**EVID-001-005-004**: Create PatchDiffEvidence model
- **Type**: Task
- **Component**: Evidence.Bundle
- **Effort**: 1 day
- **Priority**: P1
**Description**:
Add patch diff evidence to the bundle types.
**Technical Notes**:
```csharp
// Location: Evidence.Bundle/PatchDiffEvidence.cs
public sealed class PatchDiffEvidence
{
    public required EvidenceStatus Status { get; init; }
    public string? Hash { get; init; }
    public bool IsPatched { get; init; }
    public double SimilarityScore { get; init; }
    public IReadOnlyList<ChangedFunctionSummary>? ChangedFunctions { get; init; }
    public string? DiffSummary { get; init; }
    public string? ArtifactUri { get; init; }
}

public record ChangedFunctionSummary(
    string FunctionName,
    int InstructionChanges,
    double Similarity);
```
**Dependencies**: None
---
**EVID-001-005-005**: Add to evidence bundle
- **Type**: Task
- **Component**: Evidence.Bundle
- **Effort**: 1 day
- **Priority**: P1
**Description**:
Include PatchDiffEvidence in EvidenceBundle.
**Dependencies**: EVID-001-005-004
---
## Summary Statistics
| Category | Count |
|----------|-------|
| Epics | 1 |
| User Stories | 5 |
| Tasks | 26 |
| Total Effort | ~50 days |
## Sprint Assignment Suggestion
| Sprint | Stories | Effort |
|--------|---------|--------|
| S1 | EVID-001-001 (CVE Mapping) | 9.5 days |
| S2 | EVID-001-002 (Evidence Job) | 9 days |
| S3 | EVID-001-003 (VEX Integration) | 8.5 days |
| S4 | EVID-001-004 (Runtime - part 1) | 9 days |
| S5 | EVID-001-004 (Runtime - part 2) + EVID-001-005 (Binary Diff) | 14 days |