feat: Implement distro-native version comparison for RPM, Debian, and Alpine packages

- Add RpmVersionComparer for RPM version comparison with epoch, version, and release handling. - Introduce DebianVersion for parsing Debian EVR (Epoch:Version-Release) strings. - Create ApkVersion for parsing Alpine APK version strings with suffix support. - Define IVersionComparator interface for version comparison with proof-line generation. - Implement VersionComparisonResult struct to encapsulate comparison results and proof lines. - Add tests for Debian and RPM version comparers to ensure correct functionality and edge case handling. - Create project files for the version comparison library and its tests.
2025-12-22 09:49:38 +02:00
parent aff0ceb2fe
commit 634233dfed
112 changed files with 31925 additions and 1813 deletions
--- a/docs/modules/benchmark/architecture.md
+++ b/docs/modules/benchmark/architecture.md
@@ -0,0 +1,444 @@
+# Benchmark Module Architecture
+
+## Overview
+
+The Benchmark module provides infrastructure for validating and demonstrating Stella Ops' competitive advantages through automated comparison against other container security scanners (Trivy, Grype, Syft, etc.).
+
+**Module Path**: `src/Scanner/__Libraries/StellaOps.Scanner.Benchmark/`
+**Status**: PLANNED (Sprint 7000.0001.0001)
+
+---
+
+## Mission
+
+Establish verifiable, reproducible benchmarks that:
+1. Validate competitive claims with evidence
+2. Detect regressions in accuracy or performance
+3. Generate marketing-ready comparison materials
+4. Provide ground-truth corpus for testing
+
+---
+
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                      Benchmark Module                            │
+├─────────────────────────────────────────────────────────────────┤
+│                                                                  │
+│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐         │
+│  │   Corpus    │    │   Harness   │    │   Metrics   │         │
+│  │  Manager    │───▶│   Runner    │───▶│  Calculator │         │
+│  └─────────────┘    └─────────────┘    └─────────────┘         │
+│         │                  │                  │                 │
+│         │                  │                  │                 │
+│         ▼                  ▼                  ▼                 │
+│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐         │
+│  │Ground Truth │    │ Competitor  │    │   Claims    │         │
+│  │  Manifest   │    │  Adapters   │    │   Index     │         │
+│  └─────────────┘    └─────────────┘    └─────────────┘         │
+│                                                                  │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Components
+
+### 1. Corpus Manager
+
+**Namespace**: `StellaOps.Scanner.Benchmark.Corpus`
+
+Manages the ground-truth corpus of container images with known vulnerabilities.
+
+```csharp
+public interface ICorpusManager
+{
+    Task<Corpus> LoadCorpusAsync(string corpusPath, CancellationToken ct);
+    Task<CorpusImage> GetImageAsync(string digest, CancellationToken ct);
+    Task<GroundTruth> GetGroundTruthAsync(string digest, CancellationToken ct);
+}
+
+public record Corpus(
+    string Version,
+    DateTimeOffset CreatedAt,
+    ImmutableArray<CorpusImage> Images
+);
+
+public record CorpusImage(
+    string Digest,
+    string Name,
+    string Tag,
+    CorpusCategory Category,
+    GroundTruth GroundTruth
+);
+
+public record GroundTruth(
+    ImmutableArray<string> TruePositives,
+    ImmutableArray<string> KnownFalsePositives,
+    ImmutableArray<string> Notes
+);
+
+public enum CorpusCategory
+{
+    BaseOS,           // Alpine, Debian, Ubuntu, RHEL
+    ApplicationNode,  // Node.js applications
+    ApplicationPython,// Python applications
+    ApplicationJava,  // Java applications
+    ApplicationDotNet,// .NET applications
+    BackportScenario, // Known backported fixes
+    Unreachable       // Known unreachable vulns
+}
+```
+
+### 2. Harness Runner
+
+**Namespace**: `StellaOps.Scanner.Benchmark.Harness`
+
+Executes scans using Stella Ops and competitor tools.
+
+```csharp
+public interface IHarnessRunner
+{
+    Task<BenchmarkRun> RunAsync(
+        Corpus corpus,
+        ImmutableArray<ITool> tools,
+        BenchmarkOptions options,
+        CancellationToken ct
+    );
+}
+
+public interface ITool
+{
+    string Name { get; }
+    string Version { get; }
+    Task<ToolResult> ScanAsync(string imageRef, CancellationToken ct);
+}
+
+public record BenchmarkRun(
+    string RunId,
+    DateTimeOffset StartedAt,
+    DateTimeOffset CompletedAt,
+    ImmutableArray<ToolResult> Results
+);
+
+public record ToolResult(
+    string ToolName,
+    string ToolVersion,
+    string ImageDigest,
+    ImmutableArray<NormalizedFinding> Findings,
+    TimeSpan Duration
+);
+```
+
+### 3. Competitor Adapters
+
+**Namespace**: `StellaOps.Scanner.Benchmark.Adapters`
+
+Normalize output from competitor tools.
+
+```csharp
+public interface ICompetitorAdapter : ITool
+{
+    Task<ImmutableArray<NormalizedFinding>> ParseOutputAsync(
+        string output,
+        CancellationToken ct
+    );
+}
+
+// Implementations
+public class TrivyAdapter : ICompetitorAdapter { }
+public class GrypeAdapter : ICompetitorAdapter { }
+public class SyftAdapter : ICompetitorAdapter { }
+public class StellaOpsAdapter : ICompetitorAdapter { }
+```
+
+### 4. Metrics Calculator
+
+**Namespace**: `StellaOps.Scanner.Benchmark.Metrics`
+
+Calculate precision, recall, F1, and other metrics.
+
+```csharp
+public interface IMetricsCalculator
+{
+    BenchmarkMetrics Calculate(
+        ToolResult result,
+        GroundTruth groundTruth
+    );
+
+    ComparativeMetrics Compare(
+        BenchmarkMetrics baseline,
+        BenchmarkMetrics comparison
+    );
+}
+
+public record BenchmarkMetrics(
+    int TruePositives,
+    int FalsePositives,
+    int TrueNegatives,
+    int FalseNegatives,
+    double Precision,
+    double Recall,
+    double F1Score,
+    ImmutableDictionary<string, BenchmarkMetrics> ByCategory
+);
+
+public record ComparativeMetrics(
+    string BaselineTool,
+    string ComparisonTool,
+    double PrecisionDelta,
+    double RecallDelta,
+    double F1Delta,
+    ImmutableArray<string> UniqueFindings,
+    ImmutableArray<string> MissedFindings
+);
+```
+
+### 5. Claims Index
+
+**Namespace**: `StellaOps.Scanner.Benchmark.Claims`
+
+Manage verifiable claims with evidence links.
+
+```csharp
+public interface IClaimsIndex
+{
+    Task<ImmutableArray<Claim>> GetAllClaimsAsync(CancellationToken ct);
+    Task<ClaimVerification> VerifyClaimAsync(string claimId, CancellationToken ct);
+    Task UpdateClaimsAsync(BenchmarkRun run, CancellationToken ct);
+}
+
+public record Claim(
+    string Id,
+    ClaimCategory Category,
+    string Statement,
+    string EvidencePath,
+    ClaimStatus Status,
+    DateTimeOffset LastVerified
+);
+
+public enum ClaimStatus { Pending, Verified, Published, Disputed, Resolved }
+
+public record ClaimVerification(
+    string ClaimId,
+    bool IsValid,
+    string? Evidence,
+    string? FailureReason
+);
+```
+
+---
+
+## Data Flow
+
+```
+┌────────────────┐
+│  Corpus Images │
+│  (50+ images)  │
+└───────┬────────┘
+        │
+        ▼
+┌────────────────┐     ┌────────────────┐
+│ Stella Ops Scan│     │ Trivy/Grype    │
+│                │     │ Scan           │
+└───────┬────────┘     └───────┬────────┘
+        │                      │
+        ▼                      ▼
+┌────────────────┐     ┌────────────────┐
+│ Normalized     │     │ Normalized     │
+│ Findings       │     │ Findings       │
+└───────┬────────┘     └───────┬────────┘
+        │                      │
+        └──────────┬───────────┘
+                   │
+                   ▼
+           ┌──────────────┐
+           │ Ground Truth │
+           │ Comparison   │
+           └──────┬───────┘
+                  │
+                  ▼
+           ┌──────────────┐
+           │   Metrics    │
+           │  (P/R/F1)    │
+           └──────┬───────┘
+                  │
+                  ▼
+           ┌──────────────┐
+           │ Claims Index │
+           │   Update     │
+           └──────────────┘
+```
+
+---
+
+## Corpus Structure
+
+```
+bench/competitors/
+├── corpus/
+│   ├── manifest.json              # Corpus metadata
+│   ├── ground-truth/
+│   │   ├── alpine-3.18.json       # Per-image ground truth
+│   │   ├── debian-bookworm.json
+│   │   └── ...
+│   └── images/
+│       ├── base-os/
+│       ├── applications/
+│       └── edge-cases/
+├── results/
+│   ├── 2025-12-22/
+│   │   ├── stellaops.json
+│   │   ├── trivy.json
+│   │   ├── grype.json
+│   │   └── comparison.json
+│   └── latest -> 2025-12-22/
+└── fixtures/
+    └── adapters/                  # Test fixtures for adapters
+```
+
+---
+
+## Ground Truth Format
+
+```json
+{
+  "imageDigest": "sha256:abc123...",
+  "imageName": "alpine:3.18",
+  "category": "BaseOS",
+  "groundTruth": {
+    "truePositives": [
+      {
+        "cveId": "CVE-2024-1234",
+        "package": "openssl",
+        "version": "3.0.8",
+        "notes": "Fixed in 3.0.9"
+      }
+    ],
+    "knownFalsePositives": [
+      {
+        "cveId": "CVE-2024-9999",
+        "package": "zlib",
+        "version": "1.2.13",
+        "reason": "Backported in alpine:3.18"
+      }
+    ],
+    "expectedUnreachable": [
+      {
+        "cveId": "CVE-2024-5678",
+        "package": "curl",
+        "reason": "Vulnerable function not linked"
+      }
+    ]
+  },
+  "lastVerified": "2025-12-01T00:00:00Z",
+  "verifiedBy": "security-team"
+}
+```
+
+---
+
+## CI Integration
+
+### Workflow: `benchmark-vs-competitors.yml`
+
+```yaml
+name: Competitive Benchmark
+
+on:
+  schedule:
+    - cron: '0 2 * * 0'  # Weekly Sunday 2 AM
+  workflow_dispatch:
+  push:
+    paths:
+      - 'src/Scanner/__Libraries/StellaOps.Scanner.Benchmark/**'
+      - 'bench/competitors/**'
+
+jobs:
+  benchmark:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Install competitor tools
+        run: |
+          # Install Trivy
+          curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh
+          # Install Grype
+          curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh
+
+      - name: Run benchmark
+        run: stella benchmark run --corpus bench/competitors/corpus --output bench/competitors/results/$(date +%Y-%m-%d)
+
+      - name: Update claims index
+        run: stella benchmark claims --output docs/claims-index.md
+
+      - name: Upload results
+        uses: actions/upload-artifact@v4
+        with:
+          name: benchmark-results
+          path: bench/competitors/results/
+```
+
+---
+
+## CLI Commands
+
+```bash
+# Run full benchmark
+stella benchmark run --corpus <path> --competitors trivy,grype,syft
+
+# Verify a specific claim
+stella benchmark verify <CLAIM_ID>
+
+# Generate claims index
+stella benchmark claims --output docs/claims-index.md
+
+# Generate marketing battlecard
+stella benchmark battlecard --output docs/marketing/battlecard.md
+
+# Show comparison summary
+stella benchmark summary --format table|json|markdown
+```
+
+---
+
+## Testing
+
+| Test Type | Location | Purpose |
+|-----------|----------|---------|
+| Unit | `StellaOps.Scanner.Benchmark.Tests/` | Adapter parsing, metrics calculation |
+| Integration | `StellaOps.Scanner.Benchmark.Integration.Tests/` | Full benchmark flow |
+| Golden | `bench/competitors/fixtures/` | Deterministic output verification |
+
+---
+
+## Security Considerations
+
+1. **Competitor binaries**: Run in isolated containers, no network access during scan
+2. **Corpus images**: Verified digests, no external pulls during benchmark
+3. **Results**: Signed with DSSE before publishing
+4. **Claims**: Require PR review before status change
+
+---
+
+## Dependencies
+
+- `StellaOps.Scanner.Core` - Normalized finding models
+- `StellaOps.Attestor.Dsse` - Result signing
+- Docker - Competitor tool execution
+- Ground-truth corpus (maintained separately)
+
+---
+
+## Related Documentation
+
+- [Claims Index](../../claims-index.md)
+- [Sprint 7000.0001.0001](../../implplan/SPRINT_7000_0001_0001_competitive_benchmarking.md)
+- [Testing Strategy](../../implplan/SPRINT_5100_SUMMARY.md)
+
+---
+
+*Document Version*: 1.0.0
+*Created*: 2025-12-22