# CVSS and Competitive Analysis Technical Reference
**Source Advisories**:
- 29-Nov-2025 - CVSS v4.0 Momentum in Vulnerability Management
- 30-Nov-2025 - Comparative Evidence Patterns for Stella Ops
- 03-Dec-2025 - NextGen Scanner Differentiators and Evidence Moat
**Last Updated**: 2025-12-14
---
## 1. CVSS V4.0 INTEGRATION
### 1.1 Requirements
- Vulnerability data sources (NVD, GitHub, Microsoft, Snyk) are shipping CVSS v4 signals
- Receipt schemas, reporting, and UI must stay aligned with v4 semantics
### 1.2 Determinism & Offline
- Keep CVSS vector parsing deterministic
- Pin scoring library versions in receipts
- Avoid live API dependency
- Rely on mirrored NVD feeds or frozen samples
### 1.3 Schema Mapping
- Map impacts to receipt schemas
- Identify UI/reporting deltas for transparency
- Record CVSS receipt impacts in the sprint Decisions & Risks log
### 1.4 CVSS v4.0 MacroVector Scoring System
CVSS v4.0 uses a **MacroVector-based scoring system** instead of the direct formula computation used in v2/v3. The MacroVector is a 6-digit string derived from the base metrics, which maps to a precomputed score table with 486 possible combinations.
**MacroVector Structure**:
```
MacroVector = EQ1 + EQ2 + EQ3 + EQ4 + EQ5 + EQ6
Example: "001100" -> Base Score = 8.2
```
**Equivalence Classes (EQ1-EQ6)**:
| EQ | Metrics Used | Values | Meaning |
|----|--------------|--------|---------|
| EQ1 | Attack Vector + Privileges Required | 0-2 | Network reachability and auth barrier |
| EQ2 | Attack Complexity + User Interaction | 0-1 | Attack prerequisites |
| EQ3 | Vulnerable System CIA | 0-2 | Impact on vulnerable system |
| EQ4 | Subsequent System CIA | 0-2 | Impact on downstream systems |
| EQ5 | Attack Requirements | 0-1 | Preconditions needed |
| EQ6 | Combined Impact Pattern | 0-2 | Multi-impact severity |
**EQ1 (Attack Vector + Privileges Required)**:
- AV=Network + PR=None -> 0 (worst case: remote, no auth)
- AV=Network + PR=Low/High -> 1
- AV=Adjacent + PR=None -> 1
- AV=Adjacent + PR=Low/High -> 2
- AV=Local or Physical -> 2 (requires local access)
**EQ2 (Attack Complexity + User Interaction)**:
- AC=Low + UI=None -> 0 (easiest to exploit)
- AC=Low + UI=Passive/Active -> 1
- AC=High + any UI -> 1 (harder to exploit)
**EQ3 (Vulnerable System CIA)**:
- Any High in VC/VI/VA -> 0 (severe impact)
- Any Low in VC/VI/VA -> 1 (moderate impact)
- All None -> 2 (no impact)
**EQ4 (Subsequent System CIA)**:
- Any High in SC/SI/SA -> 0 (cascading impact)
- Any Low in SC/SI/SA -> 1
- All None -> 2
**EQ5 (Attack Requirements)**:
- AT=None -> 0 (no preconditions)
- AT=Present -> 1 (needs specific setup)
**EQ6 (Combined Impact Pattern)**:
- >=2 High impacts across the vulnerable or subsequent system -> 0 (severe multi-impact)
- 1 High impact -> 1
- 0 High impacts -> 2
**Scoring Algorithm**:
1. Parse base metrics from vector string
2. Compute EQ1-EQ6 from metrics
3. Build MacroVector string: "{EQ1}{EQ2}{EQ3}{EQ4}{EQ5}{EQ6}"
4. Lookup base score from MacroVectorLookup table
5. Round up to nearest 0.1 (per FIRST spec)
**Implementation**: `src/Policy/StellaOps.Policy.Scoring/Engine/CvssV4Engine.cs:262-359`
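A minimal C# sketch of steps 1-5 under the EQ rules above; all names are hypothetical and the lookup table is truncated to two entries (the shipped `CvssV4Engine` is the authoritative implementation):
```csharp
// Illustrative sketch of the EQ derivation and table lookup described above.
// Names and the truncated lookup table are hypothetical, not CvssV4Engine code.
using System;
using System.Collections.Generic;
using System.Linq;

static class MacroVectorSketch
{
    // Stand-in for the full precomputed MacroVector score table.
    static readonly Dictionary<string, double> Lookup = new()
    {
        ["000000"] = 10.0,  // worst case on every equivalence class
        ["001100"] = 8.2,   // example from the text above
    };

    static int Eq1(string av, string pr) => (av, pr) switch
    {
        ("N", "N") => 0,                 // remote, no auth
        ("N", _) or ("A", "N") => 1,
        _ => 2,                          // adjacent+privs, local, or physical
    };

    static int Eq2(string ac, string ui) => ac == "L" && ui == "N" ? 0 : 1;

    // Shared shape for EQ3 (VC/VI/VA) and EQ4 (SC/SI/SA).
    static int EqImpact(string c, string i, string a) =>
        new[] { c, i, a }.Contains("H") ? 0
        : new[] { c, i, a }.Contains("L") ? 1
        : 2;

    static int Eq5(string at) => at == "N" ? 0 : 1;

    static int Eq6(int highImpacts) =>
        highImpacts >= 2 ? 0 : highImpacts == 1 ? 1 : 2;

    public static double BaseScore(
        string av, string ac, string at, string pr, string ui,
        string vc, string vi, string va, string sc, string si, string sa)
    {
        int highs = new[] { vc, vi, va, sc, si, sa }.Count(m => m == "H");
        string macro = $"{Eq1(av, pr)}{Eq2(ac, ui)}{EqImpact(vc, vi, va)}"
                     + $"{EqImpact(sc, si, sa)}{Eq5(at)}{Eq6(highs)}";
        if (!Lookup.TryGetValue(macro, out double score))
            throw new InvalidOperationException($"MacroVector {macro} not in table");
        return Math.Ceiling(score * 10) / 10;  // round up to nearest 0.1
    }
}
```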
### 1.5 Threat Metrics and Exploit Maturity
CVSS v4.0 introduces **Threat Metrics** to adjust scores based on real-world exploit intelligence. The primary metric is **Exploit Maturity (E)**, which applies a multiplier to the base score.
**Exploit Maturity Values**:
| Value | Code | Multiplier | Description |
|-------|------|------------|-------------|
| Attacked | A | **1.00** | Active exploitation in the wild |
| Proof of Concept | P | **0.94** | Public PoC exists but no active exploitation |
| Unreported | U | **0.91** | No known exploit activity |
| Not Defined | X | 1.00 | Default (assume worst case) |
**Score Computation (CVSS-BT)**:
```
Threat Score = Base Score x Threat Multiplier
Example:
Base Score = 9.1
Exploit Maturity = Unreported (U)
Threat Score = 9.1 x 0.91 = 8.3 (rounded up)
```
**Threat Metrics in Vector String**:
```
CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:N/SI:N/SA:N/E:A
                                                                ^^^
                                                                Exploit Maturity
```
**Why Threat Metrics Matter**:
- Reduces noise: An unreported vulnerability scores ~9% lower
- Prioritizes real threats: Actively exploited vulns maintain full score
- Evidence-based: Integrates with KEV, EPSS, and internal threat feeds
**Implementation**: `src/Policy/StellaOps.Policy.Scoring/Engine/CvssV4Engine.cs:365-375`
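A sketch of the multiplier application described above; `ThreatScoreSketch` is a hypothetical name, not the engine's API:
```csharp
// Hypothetical sketch of the CVSS-BT adjustment above; the multiplier
// table mirrors the one in the text, not the shipped engine.
using System;
using System.Collections.Generic;

static class ThreatScoreSketch
{
    static readonly Dictionary<char, double> ExploitMaturityMultiplier = new()
    {
        ['A'] = 1.00,  // Attacked: active exploitation in the wild
        ['P'] = 0.94,  // Proof of Concept
        ['U'] = 0.91,  // Unreported
        ['X'] = 1.00,  // Not Defined: assume worst case
    };

    public static double ThreatScore(double baseScore, char exploitMaturity)
    {
        double raw = baseScore * ExploitMaturityMultiplier[exploitMaturity];
        return Math.Ceiling(raw * 10) / 10;  // round up to nearest 0.1
    }
}

// ThreatScoreSketch.ThreatScore(9.1, 'U') == 8.3, matching the worked example.
```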
### 1.6 Environmental Score Modifiers
**Security Requirements Multipliers**:
| Requirement | Low | Medium | High |
|-------------|-----|--------|------|
| Confidentiality (CR) | 0.5 | 1.0 | 1.5 |
| Integrity (IR) | 0.5 | 1.0 | 1.5 |
| Availability (AR) | 0.5 | 1.0 | 1.5 |
**Modified Base Metrics** (can override any base metric):
- MAV (Modified Attack Vector)
- MAC (Modified Attack Complexity)
- MAT (Modified Attack Requirements)
- MPR (Modified Privileges Required)
- MUI (Modified User Interaction)
- MVC/MVI/MVA (Modified Vulnerable System CIA)
- MSC/MSI/MSA (Modified Subsequent System CIA)
**Score Computation (CVSS-BE)**:
1. Apply modified metrics to base metrics (if defined)
2. Compute modified MacroVector
3. Lookup modified base score
4. Multiply by average of Security Requirements
5. Clamp to [0, 10]
```
Environmental Score = Modified Base x (CR + IR + AR) / 3
```
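A sketch of steps 4-5, assuming the modified base score has already been derived via the modified MacroVector (steps 1-3):
```csharp
// Hypothetical sketch of the CVSS-BE computation above. modifiedBaseScore
// stands in for steps 1-3 (re-derive the MacroVector with modified metrics).
using System;

static class EnvironmentalScoreSketch
{
    // CR/IR/AR multipliers: Low = 0.5, Medium = 1.0, High = 1.5.
    public static double EnvironmentalScore(
        double modifiedBaseScore, double cr, double ir, double ar)
    {
        double score = modifiedBaseScore * (cr + ir + ar) / 3.0;
        return Math.Clamp(score, 0.0, 10.0);  // step 5: clamp to [0, 10]
    }
}
```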
### 1.7 Supplemental Metrics (Non-Scoring)
CVSS v4.0 introduces supplemental metrics that provide context but **do not affect the score**:
| Metric | Values | Purpose |
|--------|--------|---------|
| Safety (S) | Negligible/Present | Safety impact (ICS/OT systems) |
| Automatable (AU) | No/Yes | Can attack be automated? |
| Recovery (R) | Automatic/User/Irrecoverable | System recovery difficulty |
| Value Density (V) | Diffuse/Concentrated | Target value concentration |
| Response Effort (RE) | Low/Moderate/High | Effort to respond |
| Provider Urgency (U) | Clear/Green/Amber/Red | Vendor urgency rating |
**Use Cases**:
- **Safety**: Critical for ICS/SCADA vulnerability prioritization
- **Automatable**: Indicates wormable vulnerabilities
- **Provider Urgency**: Vendor-supplied priority signal
## 2. SCANNER DISCREPANCIES ANALYSIS
### 2.1 Trivy vs Grype Comparative Study (927 images)
**Findings**:
- Tools disagreed on total vulnerability counts and specific CVE IDs
- Grype: ~603,259 vulns; Trivy: ~473,661 vulns
- Exact match in only 9.2% of cases (80 out of 865 vulnerable images)
- Even with same counts, specific vulnerability IDs differed
**Root Causes**:
- Divergent vulnerability databases
- Differing matching logic
- Incomplete visibility
### 2.2 VEX Tools Consistency Study (2025)
**Tools Tested**:
- Trivy
- Grype
- OWASP DepScan
- Docker Scout
- Snyk CLI
- OSV-Scanner
- Vexy
**Results**:
- Low consistency/similarity across container scanners
- DepScan: 18,680 vulns; Vexy: 191 vulns (2 orders of magnitude difference)
- Pairwise Jaccard indices very low (near 0)
- 4 most consistent tools shared only ~18% common vulnerabilities
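For reference, the pairwise Jaccard index over two tools' reported findings is the agreement measure cited above; a minimal sketch, assuming each set holds the CVE IDs a tool reported for the same image:
```csharp
// Jaccard index = |A ∩ B| / |A ∪ B| over two scanners' CVE ID sets.
using System.Collections.Generic;
using System.Linq;

static class JaccardSketch
{
    public static double Jaccard(HashSet<string> a, HashSet<string> b)
    {
        int union = a.Union(b).Count();
        return union == 0 ? 1.0 : (double)a.Intersect(b).Count() / union;
    }
}
```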
### 2.3 Implications for StellaOps
**Moats Needed**:
- Golden-fixture benchmarks (container images with known, audited vulnerabilities)
- Deterministic, replayable scans
- Cryptographic integrity
- VEX/SBOM proofs
**Metrics**:
- **Closure rate**: Time from flagged to confirmed exploitable
- **Proof coverage**: % of dependencies with valid SBOM/VEX proofs
- **Differential-closure**: Impact of database updates or policy changes on prior scan results
### 2.4 Deterministic Receipt System
Every CVSS scoring decision in StellaOps is captured in a **deterministic receipt** that enables audit-grade reproducibility.
**Receipt Schema**:
```json
{
"receiptId": "uuid",
"inputHash": "sha256:...",
"baseMetrics": { ... },
"threatMetrics": { ... },
"environmentalMetrics": { ... },
"supplementalMetrics": { ... },
"scores": {
"baseScore": 9.1,
"threatScore": 8.3,
"environmentalScore": null,
"fullScore": null,
"effectiveScore": 8.3,
"effectiveScoreType": "threat"
},
"policyRef": "policy/cvss-v4-default@v1.2.0",
"policyDigest": "sha256:...",
"evidence": [ ... ],
"attestationRefs": [ ... ],
"createdAt": "2025-12-14T00:00:00Z"
}
```
**InputHash Computation**:
```
inputHash = SHA256(canonicalize({
baseMetrics,
threatMetrics,
environmentalMetrics,
supplementalMetrics,
policyRef,
policyDigest
}))
```
**Determinism Guarantees**:
- Same inputs -> same `inputHash` -> same scores
- Receipts are immutable once created
- Amendments create new receipts with `supersedes` reference
- Optional DSSE signatures for cryptographic binding
**Implementation**: `src/Policy/StellaOps.Policy.Scoring/Receipts/ReceiptBuilder.cs`
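A sketch of the hash computation, assuming a simple key-sorting canonicalizer; the actual `ReceiptBuilder` may use a stricter canonical JSON profile (e.g., RFC 8785):
```csharp
// Hypothetical sketch of the inputHash computation above: sort object keys,
// serialize without insignificant whitespace, then SHA-256 the UTF-8 bytes.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json.Nodes;

static class InputHashSketch
{
    static JsonNode? Canonicalize(JsonNode? node) => node switch
    {
        JsonObject obj => new JsonObject(
            obj.OrderBy(p => p.Key, StringComparer.Ordinal)
               .Select(p => KeyValuePair.Create(p.Key, Canonicalize(p.Value)))),
        JsonArray arr => new JsonArray(arr.Select(Canonicalize).ToArray()),
        null => null,
        _ => node.DeepClone(),
    };

    public static string ComputeInputHash(JsonObject scoringInputs)
    {
        // Key-sorted tree serialized compactly, so equal inputs hash equally.
        string canonical = Canonicalize(scoringInputs)!.ToJsonString();
        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return "sha256:" + Convert.ToHexString(digest).ToLowerInvariant();
    }
}
```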
## 3. RUNTIME REACHABILITY APPROACHES
### 3.1 Runtime-Aware Vulnerability Prioritization
**Approach**:
- Monitor container workloads at runtime to determine which vulnerable components are actually used
- Use eBPF-based monitors, dynamic tracers, or built-in profiling
- Construct runtime call graph or dependency graph
- Map vulnerabilities to code entities (functions/modules)
- If execution trace covers entity, vulnerability is "reachable"
**Findings**: ~85% of critical vulns in containers are in inactive code (Sysdig)
### 3.2 Reachability Analysis Techniques
**Static**:
- Call-graph analysis (Snyk reachability, CodeQL)
- All possible paths
**Dynamic**:
- Runtime observation (loaded modules, invoked functions)
- Actual runtime paths
**Granularity Levels**:
- Function-level (precise, but limited to certain languages such as Java and .NET)
- Package/module-level (broader coverage, coarser precision)
**Hybrid Approach**: Combine static analysis (all possible paths) with dynamic observation (actual runtime paths), as sketched below
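A minimal sketch of this classification, with all type and input names hypothetical:
```csharp
// Hybrid reachability: intersect the statically reachable symbol set with
// runtime-observed symbols to rank a vulnerable code entity.
using System.Collections.Generic;

enum Reachability { ConfirmedAtRuntime, StaticOnly, NotReachable }

static class HybridReachabilitySketch
{
    public static Reachability Classify(
        string vulnerableSymbol,
        HashSet<string> staticallyReachable,   // from call-graph analysis
        HashSet<string> observedAtRuntime)     // from eBPF/profiling traces
    {
        if (observedAtRuntime.Contains(vulnerableSymbol))
            return Reachability.ConfirmedAtRuntime;   // actual runtime path
        if (staticallyReachable.Contains(vulnerableSymbol))
            return Reachability.StaticOnly;           // possible path only
        return Reachability.NotReachable;             // deprioritize
    }
}
```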
## 4. CONTAINER PROVENANCE & SUPPLY CHAIN
### 4.1 In-Toto/DSSE Framework (NDSS 2024)
**Purpose**:
- Track chain of custody in software builds
- Signed metadata (attestations) for each step
- DSSE: Dead Simple Signing Envelope for standardized signing
### 4.2 Scudo System
**Features**:
- Combines in-toto with Uptane
- Verifies build process and final image
- Full verification on the client is inefficient; verification happens upstream and the client trusts a signed summary
- Client checks final signature + hash only
### 4.3 Supply Chain Verification
**Signers**:
- Developer key signs code commit
- CI key signs build attestation
- Scanner key signs vulnerability attestation
- Release key signs container image
**Verification Optimization**: Repository verifies in-toto attestations; client verifies final metadata only
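A sketch of that client-side check: verifying one DSSE envelope signature over its pre-authentication encoding (PAE). Key handling, signature encoding, and envelope parsing are assumptions, not the Scudo implementation:
```csharp
// Minimal DSSE verification sketch for the "final metadata only" client.
using System;
using System.Security.Cryptography;
using System.Text;

static class DsseVerifySketch
{
    // DSSE v1 PAE: "DSSEv1 <len(type)> <type> <len(body)> <body>"
    static byte[] PreAuthEncoding(string payloadType, byte[] payload)
    {
        byte[] typeBytes = Encoding.UTF8.GetBytes(payloadType);
        byte[] header = Encoding.UTF8.GetBytes(
            $"DSSEv1 {typeBytes.Length} {payloadType} {payload.Length} ");
        var pae = new byte[header.Length + payload.Length];
        header.CopyTo(pae, 0);
        payload.CopyTo(pae, header.Length);
        return pae;
    }

    public static bool Verify(
        ECDsa releaseKey, string payloadType, byte[] payload, byte[] signature)
    {
        // Assumes an IEEE P1363 signature encoding; real envelopes may differ.
        byte[] pae = PreAuthEncoding(payloadType, payload);
        return releaseKey.VerifyData(pae, signature, HashAlgorithmName.SHA256);
    }
}
```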
## 5. VENDOR EVIDENCE PATTERNS
### 5.1 Snyk
**Evidence Handling**:
- Runtime insights integration (Nov 2025)
- Evolution from static-scan noise to prioritized workflow
- Deployment context awareness
**VEX Support**:
- CycloneDX VEX format
- Reachability-aware suppression
### 5.2 GitHub Advanced Security
**Features**:
- CodeQL for static analysis
- Dependency graph
- Dependabot alerts
- Security advisories
**Evidence**:
- SARIF output
- SBOM generation (SPDX)
### 5.3 Aqua Security
**Approach**:
- Runtime protection
- Image scanning
- Kubernetes security
**Evidence**:
- Dynamic runtime traces
- Network policy violations
### 5.4 Anchore/Grype
**Features**:
- Open-source scanner
- Policy-based compliance
- SBOM generation
**Evidence**:
- CycloneDX/SPDX SBOM
- Vulnerability reports (JSON)
### 5.5 Prisma Cloud
**Features**:
- Cloud-native security
- Runtime defense
- Compliance monitoring
**Evidence**:
- Multi-cloud attestations
- Compliance reports
## 6. STELLAOPS DIFFERENTIATORS
### 6.1 Reachability-with-Evidence
**Why it Matters**:
- Snyk Container integrating runtime insights as "signal" (Nov 2025)
- Evolution from static-scan noise to prioritized, actionable workflow
- Deployment context: what's running, what's reachable, what's exploitable
**Implication**: Container security triage relies on runtime/context signals
### 6.2 Proof-First Architecture
**Advantages**:
- Every claim backed by DSSE-signed attestations
- Cryptographic integrity
- Audit trail
- Offline verification
### 6.3 Deterministic Scanning
**Advantages**:
- Reproducible results
- Bit-identical outputs given same inputs
- Replay manifests
- Golden fixture benchmarks
### 6.4 VEX-First Decisioning
**Advantages**:
- Exploitability modeled in OpenVEX
- Lattice logic for stable outcomes
- Evidence-linked justifications
### 6.5 Offline/Air-Gap First
**Advantages**:
- No hidden network dependencies
- Bundled feeds, keys, Rekor snapshots
- Verifiable without internet access
### 6.6 CVSS + KEV Risk Signal Combination
StellaOps combines CVSS scores with KEV (Known Exploited Vulnerabilities) data using a deterministic formula:
**Risk Formula**:
```
risk_score = clamp01((cvss / 10) + kevBonus)
where:
kevBonus = 0.2 if vulnerability is in CISA KEV catalog
kevBonus = 0.0 otherwise
```
**Example Calculations**:
| CVSS Score | KEV Flag | Risk Score |
|------------|----------|------------|
| 9.0 | No | 0.90 |
| 9.0 | Yes | 1.00 (clamped) |
| 7.5 | No | 0.75 |
| 7.5 | Yes | 0.95 |
| 5.0 | No | 0.50 |
| 5.0 | Yes | 0.70 |
**Rationale**:
- KEV inclusion indicates active exploitation
- 20% bonus prioritizes known-exploited over theoretical risks
- Clamping prevents scores > 1.0
- Deterministic formula enables reproducible prioritization
**Implementation**: `src/RiskEngine/StellaOps.RiskEngine/StellaOps.RiskEngine.Core/Providers/CvssKevProvider.cs`
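A sketch of the formula above (the real `CvssKevProvider` signature may differ):
```csharp
// risk_score = clamp01((cvss / 10) + kevBonus), per the formula above.
using System;

static class CvssKevRiskSketch
{
    const double KevBonus = 0.2;  // applied when the CVE is in the CISA KEV catalog

    public static double RiskScore(double cvss, bool inKevCatalog)
    {
        double bonus = inKevCatalog ? KevBonus : 0.0;
        return Math.Clamp(cvss / 10.0 + bonus, 0.0, 1.0);  // clamp01
    }
}

// RiskScore(9.0, true) == 1.00 (clamped); RiskScore(7.5, false) == 0.75.
```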
## 7. COMPETITIVE POSITIONING
### 7.1 Market Segments
| Vendor | Strength | Weakness vs StellaOps |
|--------|----------|----------------------|
| Snyk | Developer experience | Less deterministic, SaaS-only |
| Aqua | Runtime protection | Less reachability precision |
| Anchore | Open-source, SBOM | Less proof infrastructure |
| Prisma Cloud | Cloud-native breadth | Less offline/air-gap support |
| GitHub | Integration with dev workflow | Less cryptographic proof chain |
### 7.2 StellaOps Unique Value
1. **Deterministic + Provable**: Bit-identical scans with cryptographic proofs
2. **Reachability + Runtime**: Hybrid static/dynamic analysis
3. **Offline/Sovereign**: Air-gap operation with regional crypto (FIPS/GOST/eIDAS/SM)
4. **VEX-First**: Evidence-backed decisioning, not just alerting
5. **AGPL-3.0**: Self-hostable, no vendor lock-in
## 8. MOAT METRICS
### 8.1 Proof Coverage
```
proof_coverage = findings_with_valid_receipts / total_findings
Target: ≥95%
```
### 8.2 Closure Rate
```
closure_rate = time_from_flagged_to_confirmed_exploitable
Target: P95 < 24 hours
```
### 8.3 Differential-Closure Impact
```
differential_impact = findings_changed_after_db_update / total_findings
Target: <5% (non-code changes)
```
### 8.4 False Positive Reduction
```
fp_reduction = (baseline_fp_rate - stella_fp_rate) / baseline_fp_rate
Target: ≥50% vs baseline scanner
```
### 8.5 Reachability Accuracy
```
reachability_accuracy = correct_r0_r1_r2_r3_classifications / total_classifications
Target: ≥90%
```
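A sketch computing the ratio metrics above from raw counts (closure rate is a latency percentile and is omitted; field names are illustrative):
```csharp
// Moat metrics 8.1, 8.3, 8.4, and 8.5 as plain ratios over raw counts.
static class MoatMetricsSketch
{
    public static double ProofCoverage(int withValidReceipts, int totalFindings) =>
        totalFindings == 0 ? 1.0 : (double)withValidReceipts / totalFindings;

    public static double DifferentialImpact(int changedAfterDbUpdate, int totalFindings) =>
        totalFindings == 0 ? 0.0 : (double)changedAfterDbUpdate / totalFindings;

    public static double FpReduction(double baselineFpRate, double stellaFpRate) =>
        (baselineFpRate - stellaFpRate) / baselineFpRate;

    public static double ReachabilityAccuracy(int correct, int total) =>
        total == 0 ? 0.0 : (double)correct / total;
}
```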
## 9. COMPETITIVE INTELLIGENCE TRACKING
### 9.1 Feature Parity Matrix
| Feature | Snyk | Aqua | Anchore | Prisma | StellaOps |
|---------|------|------|---------|--------|-----------|
| SBOM Generation | ✓ | ✓ | ✓ | ✓ | ✓ |
| VEX Support | ✓ | ✗ | Partial | ✗ | ✓ |
| Reachability Analysis | ✓ | ✗ | ✗ | ✗ | ✓ |
| Runtime Evidence | ✓ | ✓ | ✗ | ✓ | ✓ |
| Cryptographic Proofs | ✗ | ✗ | ✗ | ✗ | ✓ |
| Deterministic Scans | ✗ | ✗ | ✗ | ✗ | ✓ |
| Offline/Air-Gap | ✗ | Partial | ✗ | ✗ | ✓ |
| Regional Crypto | ✗ | ✗ | ✗ | ✗ | ✓ |
### 9.2 Monitoring Strategy
- Track vendor release notes
- Monitor GitHub repos for feature announcements
- Participate in security conferences
- Engage with customer feedback
- Update competitive matrix quarterly
## 10. MESSAGING FRAMEWORK
### 10.1 Core Message
"StellaOps provides deterministic, proof-backed vulnerability management with reachability analysis for offline/air-gapped environments."
### 10.2 Key Differentiators (Elevator Pitch)
1. **Deterministic**: Same inputs → same outputs, every time
2. **Provable**: Cryptographic proof chains for every decision
3. **Reachable**: Static + runtime analysis, not just presence
4. **Sovereign**: Offline operation, regional crypto compliance
5. **Open**: AGPL-3.0, self-hostable, no lock-in
### 10.3 Target Personas
- **Security Engineers**: Need proof-backed decisions for audits
- **DevOps Teams**: Need deterministic scans in CI/CD
- **Compliance Officers**: Need offline/air-gap for regulated environments
- **Platform Engineers**: Need self-hostable, sovereign solution
## 11. BENCHMARKING PROTOCOL
### 11.1 Comparative Test Suite
**Images**:
- 50 representative production images
- Known vulnerabilities labeled
- Reachability ground truth established
**Metrics**:
- Precision (TP / (TP + FP))
- Recall (TP / (TP + FN))
- F1 score
- Scan time (P50, P95)
- Determinism (identical outputs over 10 runs)
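A sketch of the precision/recall/F1 computation against labeled ground truth, assuming findings are keyed by image + CVE ID:
```csharp
// Score one tool's reported findings against the labeled ground truth set.
using System.Collections.Generic;
using System.Linq;

static class BenchmarkMetricsSketch
{
    public static (double Precision, double Recall, double F1) Score(
        HashSet<string> reported, HashSet<string> groundTruth)
    {
        int tp = reported.Intersect(groundTruth).Count();
        int fp = reported.Count - tp;      // reported but not in ground truth
        int fn = groundTruth.Count - tp;   // in ground truth but missed
        double precision = tp == 0 ? 0 : (double)tp / (tp + fp);
        double recall = tp == 0 ? 0 : (double)tp / (tp + fn);
        double f1 = precision + recall == 0 ? 0
            : 2 * precision * recall / (precision + recall);
        return (precision, recall, f1);
    }
}
```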
### 11.2 Test Execution
```bash
# Run StellaOps scan
stellaops scan --image test-image:v1 --output stella-results.json
# Run competitor scans
trivy image --format json test-image:v1 > trivy-results.json
grype test-image:v1 -o json > grype-results.json
snyk container test test-image:v1 --json > snyk-results.json
# Compare results
stellaops benchmark compare \
--ground-truth ground-truth.json \
--stella stella-results.json \
--trivy trivy-results.json \
--grype grype-results.json \
--snyk snyk-results.json
```
### 11.3 Results Publication
- Publish benchmarks quarterly
- Open-source test images and ground truth
- Invite community contributions
- Document methodology transparently
---
**Document Version**: 1.0
**Target Platform**: .NET 10, PostgreSQL ≥16, Angular v17