Add reference architecture and testing strategy documentation

- Created a new document for the Stella Ops Reference Architecture outlining the system's topology, trust boundaries, artifact association, and interfaces.
- Developed a comprehensive Testing Strategy document detailing the importance of offline readiness, interoperability, determinism, and operational guardrails.
- Introduced a README for the Testing Strategy, summarizing processing details and key concepts implemented.
- Added guidance for AI agents and developers in the tests directory, including directory structure, test categories, key patterns, and rules for test development.
This commit is contained in:
2025-12-22 07:59:15 +02:00
parent 5d398ec442
commit 53503cb407
96 changed files with 37565 additions and 71 deletions

View File

@@ -0,0 +1,81 @@
# Archived Advisory: Mapping Evidence Within Compiled Binaries
**Original Advisory:** `21-Dec-2025 - Mapping Evidence Within Compiled Binaries.md`
**Archived:** 2025-12-21
**Status:** Converted to Implementation Plan
---
## Summary
This advisory proposed building a **Vulnerable Binaries Database** that enables detection of vulnerable code at the binary level, independent of package metadata.
## Implementation Artifacts Created
### Architecture Documentation
- `docs/modules/binaryindex/architecture.md` - Full module architecture
- `docs/db/schemas/binaries_schema_specification.md` - Database schema
### Sprint Files
**Summary:**
- `docs/implplan/SPRINT_6000_SUMMARY.md` - MVP roadmap overview
**MVP 1: Known-Build Binary Catalog (Sprint 6000.0001)**
- `SPRINT_6000_0001_0001_binaries_schema.md` - PostgreSQL schema
- `SPRINT_6000_0001_0002_binary_identity_service.md` - Identity extraction
- `SPRINT_6000_0001_0003_debian_corpus_connector.md` - Debian/Ubuntu ingestion
**MVP 2: Patch-Aware Backport Handling (Sprint 6000.0002)**
- `SPRINT_6000_0002_0001_fix_evidence_parser.md` - Changelog/patch parsing
**MVP 3: Binary Fingerprint Factory (Sprint 6000.0003)**
- `SPRINT_6000_0003_0001_fingerprint_storage.md` - Fingerprint storage
**MVP 4: Scanner Integration (Sprint 6000.0004)**
- `SPRINT_6000_0004_0001_scanner_integration.md` - Scanner.Worker integration
## Key Decisions
| Decision | Rationale |
|----------|-----------|
| New `BinaryIndex` module | Binary vulnerability DB is distinct concern from Scanner |
| Build-ID as primary key | Most deterministic identifier for ELF binaries |
| `binaries` PostgreSQL schema | Aligns with existing per-module schema pattern |
| Three-tier lookup | Assertions → Build-ID → Fingerprints for precision |
| Patch-aware fix index | Handles distro backports correctly |
## Module Structure
```
src/BinaryIndex/
├── StellaOps.BinaryIndex.WebService/
├── StellaOps.BinaryIndex.Worker/
├── __Libraries/
│ ├── StellaOps.BinaryIndex.Core/
│ ├── StellaOps.BinaryIndex.Persistence/
│ ├── StellaOps.BinaryIndex.Corpus/
│ ├── StellaOps.BinaryIndex.Corpus.Debian/
│ ├── StellaOps.BinaryIndex.FixIndex/
│ └── StellaOps.BinaryIndex.Fingerprints/
└── __Tests/
```
## Database Tables
| Table | Purpose |
|-------|---------|
| `binaries.binary_identity` | Known binary identities |
| `binaries.binary_package_map` | Binary → package mapping |
| `binaries.vulnerable_buildids` | Vulnerable Build-IDs |
| `binaries.cve_fix_index` | Patch-aware fix status |
| `binaries.vulnerable_fingerprints` | Function fingerprints |
| `binaries.fingerprint_matches` | Scan match results |
## References
- Original advisory: This folder
- Architecture: `docs/modules/binaryindex/architecture.md`
- Schema: `docs/db/schemas/binaries_schema_specification.md`
- Sprints: `docs/implplan/SPRINT_6000_*.md`

View File

@@ -0,0 +1,97 @@
# MOAT Gap Closure Archive Manifest
**Archive Date**: 2025-12-21
**Archive Reason**: Product advisories processed and implementation gaps identified
---
## Summary
This archive contains 12 MOAT (Market-Oriented Architecture Transformation) product advisories that were analyzed against the StellaOps codebase. After thorough source code exploration, the implementation coverage was assessed at **~92%**.
---
## Implementation Coverage
| Advisory Topic | Coverage | Notes |
|---------------|----------|-------|
| CVSS and Competitive Analysis | 100% | Full CVSS v4 engine, all attack complexity metrics |
| Determinism and Reproducibility | 100% | Stable ordering, hash chains, replayTokens, NDJSON |
| Developer Onboarding | 100% | AGENTS.md files, CLAUDE.md, module dossiers |
| Offline and Air-Gap | 100% | Bundle system, egress allowlists, offline sources |
| PostgreSQL Patterns | 100% | RLS, tenant isolation, schema per module |
| Proof and Evidence Chain | 100% | ProofSpine, DSSE envelopes, hash chaining |
| Reachability Analysis | 100% | CallGraphAnalyzer, AttackPathScorer, CodePathResult |
| Rekor Integration | 100% | RekorClient, transparency log publishing |
| Smart-Diff | 100% | MaterialRiskChangeDetector, hash-based diffing |
| Testing and Quality Guardrails | 100% | Testcontainers, benchmarks, truth schemas |
| UX and Time-to-Evidence | 100% | EvidencePanel, keyboard shortcuts, motion tokens |
| Triage and Unknowns | 75% | UnknownRanker exists, missing decay/containment |
**Overall**: ~92% implementation coverage
---
## Identified Gaps & Sprint References
Three implementation gaps were identified and documented in sprints:
### Gap 1: Decay Algorithm (Sprint 4000.0001.0001)
- **File**: `docs/implplan/SPRINT_4000_0001_0001_unknowns_decay_algorithm.md`
- **Scope**: Add time-based decay factor to UnknownRanker
- **Story Points**: 15
- **Working Directory**: `src/Policy/__Libraries/StellaOps.Policy.Unknowns/`
### Gap 2: BlastRadius & Containment (Sprint 4000.0001.0002)
- **File**: `docs/implplan/SPRINT_4000_0001_0002_unknowns_blast_radius_containment.md`
- **Scope**: Add BlastRadius and ContainmentSignals to ranking
- **Story Points**: 19
- **Working Directory**: `src/Policy/__Libraries/StellaOps.Policy.Unknowns/`
### Gap 3: EPSS Feed Connector (Sprint 4000.0002.0001)
- **File**: `docs/implplan/SPRINT_4000_0002_0001_epss_feed_connector.md`
- **Scope**: Create Concelier connector for orchestrated EPSS ingestion
- **Story Points**: 22
- **Working Directory**: `src/Concelier/__Libraries/StellaOps.Concelier.Connector.Epss/`
**Total Gap Closure Effort**: 56 story points
---
## Archived Files (12)
1. `14-Dec-2025 - CVSS and Competitive Analysis Technical Reference.md`
2. `14-Dec-2025 - Determinism and Reproducibility Technical Reference.md`
3. `14-Dec-2025 - Developer Onboarding Technical Reference.md`
4. `14-Dec-2025 - Offline and Air-Gap Technical Reference.md`
5. `14-Dec-2025 - PostgreSQL Patterns Technical Reference.md`
6. `14-Dec-2025 - Proof and Evidence Chain Technical Reference.md`
7. `14-Dec-2025 - Reachability Analysis Technical Reference.md`
8. `14-Dec-2025 - Rekor Integration Technical Reference.md`
9. `14-Dec-2025 - Smart-Diff Technical Reference.md`
10. `14-Dec-2025 - Testing and Quality Guardrails Technical Reference.md`
11. `14-Dec-2025 - Triage and Unknowns Technical Reference.md`
12. `14-Dec-2025 - UX and Time-to-Evidence Technical Reference.md`
---
## Key Discoveries
Features that were discovered to exist with different naming than expected:
| Expected | Actual Implementation |
|----------|----------------------|
| FipsProfile, GostProfile, SmProfile | ComplianceProfiles (unified) |
| FindingsLedger.HashChain | Exists in FindingsSnapshot with replayTokens |
| Benchmark suite | Exists in `__Benchmarks/` directories |
| EvidencePanel | Exists in Web UI with motion tokens |
---
## Post-Closure Target
After completing the three gap-closure sprints:
- Implementation coverage: **95%+**
- All advisory requirements addressed
- Triage/Unknowns module fully featured

View File

@@ -0,0 +1,146 @@
# MOAT Phase 2 Archive Manifest
**Archive Date**: 2025-12-21
**Archive Reason**: Product advisories processed and implementation gaps identified
**Epoch**: 4100 (MOAT Phase 2 - Governance & Replay)
---
## Summary
This archive contains 11 MOAT (Market-Oriented Architecture Transformation) product advisories from 19-Dec and 20-Dec 2025 that were analyzed against the StellaOps codebase. After thorough source code exploration, the implementation coverage was assessed at **~65% baseline** with sprints planned to reach **~90% target**.
---
## Gap Analysis (from 65% baseline)
| Area | Current | Target | Gap |
|------|---------|--------|-----|
| Security Snapshots & Deltas | 55% | 90% | Unified snapshot, DeltaVerdict |
| Risk Verdict Attestations | 50% | 90% | RVA contract, OCI push |
| VEX Claims Resolution | 80% | 95% | JSON parsing, evidence providers |
| Unknowns First-Class | 60% | 95% | Reason codes, budgets, attestations |
| Knowledge Snapshots | 60% | 90% | Manifest, ReplayEngine |
| Risk Budgets & Gates | 20% | 80% | RP scoring, gate levels |
---
## Sprint Structure (10 Sprints, 169 Story Points)
### Batch 4100.0001: Unknowns Enhancement (40 pts)
| Sprint | Topic | Points | Status |
|--------|-------|--------|--------|
| 4100.0001.0001 | Reason-Coded Unknowns | 15 | Planned |
| 4100.0001.0002 | Unknown Budgets & Env Thresholds | 13 | Planned |
| 4100.0001.0003 | Unknowns in Attestations | 12 | Planned |
### Batch 4100.0002: Knowledge Snapshots & Replay (55 pts)
| Sprint | Topic | Points | Status |
|--------|-------|--------|--------|
| 4100.0002.0001 | Knowledge Snapshot Manifest | 18 | Planned |
| 4100.0002.0002 | Replay Engine | 22 | Planned |
| 4100.0002.0003 | Snapshot Export/Import | 15 | Planned |
### Batch 4100.0003: Risk Verdict & OCI (34 pts)
| Sprint | Topic | Points | Status |
|--------|-------|--------|--------|
| 4100.0003.0001 | Risk Verdict Attestation Contract | 16 | Planned |
| 4100.0003.0002 | OCI Referrer Push & Discovery | 18 | Planned |
### Batch 4100.0004: Deltas & Gates (38 pts)
| Sprint | Topic | Points | Status |
|--------|-------|--------|--------|
| 4100.0004.0001 | Security State Delta & Verdict | 20 | Planned |
| 4100.0004.0002 | Risk Budgets & Gate Levels | 18 | Planned |
---
## Sprint File References
| Sprint | File |
|--------|------|
| 4100.0001.0001 | `docs/implplan/SPRINT_4100_0001_0001_reason_coded_unknowns.md` |
| 4100.0001.0002 | `docs/implplan/SPRINT_4100_0001_0002_unknown_budgets.md` |
| 4100.0001.0003 | `docs/implplan/SPRINT_4100_0001_0003_unknowns_attestations.md` |
| 4100.0002.0001 | `docs/implplan/SPRINT_4100_0002_0001_knowledge_snapshot_manifest.md` |
| 4100.0002.0002 | `docs/implplan/SPRINT_4100_0002_0002_replay_engine.md` |
| 4100.0002.0003 | `docs/implplan/SPRINT_4100_0002_0003_snapshot_export_import.md` |
| 4100.0003.0001 | `docs/implplan/SPRINT_4100_0003_0001_risk_verdict_attestation.md` |
| 4100.0003.0002 | `docs/implplan/SPRINT_4100_0003_0002_oci_referrer_push.md` |
| 4100.0004.0001 | `docs/implplan/SPRINT_4100_0004_0001_security_state_delta.md` |
| 4100.0004.0002 | `docs/implplan/SPRINT_4100_0004_0002_risk_budgets_gates.md` |
---
## Archived Files (11)
### 19-Dec-2025 Moat Advisories (7)
1. `19-Dec-2025 - Moat #1.md`
2. `19-Dec-2025 - Moat #2.md`
3. `19-Dec-2025 - Moat #3.md`
4. `19-Dec-2025 - Moat #4.md`
5. `19-Dec-2025 - Moat #5.md`
6. `19-Dec-2025 - Moat #6.md`
7. `19-Dec-2025 - Moat #7.md`
### 20-Dec-2025 Moat Explanation Advisories (4)
8. `20-Dec-2025 - Moat Explanation - Exception management as auditable objects.md`
9. `20-Dec-2025 - Moat Explanation - Guidelines for Product and Development Managers - Signed, Replayable Risk Verdicts.md`
10. `20-Dec-2025 - Moat Explanation - Knowledge Snapshots and Time-Travel Replay.md`
11. `20-Dec-2025 - Moat Explanation - Risk Budgets and Diff-Aware Release Gates.md`
---
## Key New Concepts
| Concept | Description | Sprint |
|---------|-------------|--------|
| UnknownReasonCode | 7 reason codes: U-RCH, U-ID, U-PROV, U-VEX, U-FEED, U-CONFIG, U-ANALYZER | 4100.0001.0001 |
| UnknownBudget | Environment-aware thresholds (prod: block, stage: warn, dev: warn_only) | 4100.0001.0002 |
| KnowledgeSnapshotManifest | Content-addressed bundle (ksm:sha256:{hash}) | 4100.0002.0001 |
| ReplayEngine | Time-travel replay with frozen inputs for determinism verification | 4100.0002.0002 |
| RiskVerdictAttestation | PASS/FAIL/PASS_WITH_EXCEPTIONS/INDETERMINATE verdicts | 4100.0003.0001 |
| OCI Referrer Push | OCI 1.1 referrers API with fallback to tagged indexes | 4100.0003.0002 |
| SecurityStateDelta | Baseline vs target comparison with DeltaVerdict | 4100.0004.0001 |
| GateLevel | G0-G4 diff-aware release gates with RP scoring | 4100.0004.0002 |
---
## Recommended Parallel Execution
```
Phase 1: 4100.0001.0001 + 4100.0002.0001 + 4100.0003.0001 + 4100.0004.0002
Phase 2: 4100.0001.0002 + 4100.0002.0002 + 4100.0003.0002
Phase 3: 4100.0001.0003 + 4100.0002.0003 + 4100.0004.0001
```
---
## Success Criteria
| Metric | Target |
|--------|--------|
| Reason-coded unknowns | 7 codes implemented |
| Unknown budget tests | 5+ passing |
| Knowledge snapshot tests | 8+ passing |
| Replay engine golden tests | 10+ passing |
| RVA verification tests | 6+ passing |
| OCI push integration tests | 4+ passing |
| Delta computation tests | 6+ passing |
| Overall MOAT coverage | 85%+ |
---
## Post-Closure Target
After completing all 10 sprints:
- Implementation coverage: **90%+**
- All Phase 2 advisory requirements addressed
- Full governance and replay capabilities
- Risk budgets and gate levels operational

View File

@@ -1,12 +1,12 @@
Heres a compact, practical plan to harden StellaOps around **offlineready security evidence and deterministic verdicts**, with just enough background so it all clicks.
Here's a compact, practical plan to harden Stella Ops around **offlineready security evidence and deterministic verdicts**, with just enough background so it all clicks.
---
# Why this matters (quick primer)
* **Airgapped/offline**: Many customers cant reach public feeds or registries. Your scanners, SBOM tooling, and attestations must work with **presynced bundles** and prove what data they used.
* **Airgapped/offline**: Many customers can't reach public feeds or registries. Your scanners, SBOM tooling, and attestations must work with **presynced bundles** and prove what data they used.
* **Interoperability**: Teams mix tools (Syft/Grype/Trivy, cosign, CycloneDX/SPDX). Your CI should **roundtrip** SBOMs and attestations endtoend and prove that downstream consumers (e.g., Grype) can load them.
* **Determinism**: Auditors expect **same inputs → same verdict.** Capture inputs, policies, and feed hashes so a verdict is exactly reproducible later.
* **Determinism**: Auditors expect **"same inputs → same verdict."** Capture inputs, policies, and feed hashes so a verdict is exactly reproducible later.
* **Operational guardrails**: Shipping gates should fail early on **unknowns** and apply **backpressure** gracefully when load spikes.
---
@@ -15,14 +15,14 @@ Heres a compact, practical plan to harden StellaOps around **offlinerea
1. **Airgapped operation e2e**
* Package offline bundle (vuln feeds, package catalogs, policy/lattice rules, certs, keys).
* Package "offline bundle" (vuln feeds, package catalogs, policy/lattice rules, certs, keys).
* Run scans (containers, OS, language deps, binaries) **without network**.
* Assert: SBOMs generated, attestations signed/verified, verdicts emitted.
* Evidence: manifest of bundle contents + hashes in the run log.
2. **Interop roundtrips (SBOM ⇄ attestation ⇄ scanner)**
* Produce SBOM (CycloneDX1.6 and SPDX3.0.1) with Syft.
* Produce SBOM (CycloneDX 1.6 and SPDX 3.0.1) with Syft.
* Create **DSSE/cosign** attestation for that SBOM.
* Verify consumer tools:
@@ -33,11 +33,11 @@ Heres a compact, practical plan to harden StellaOps around **offlinerea
3. **Replayability (deltaverdicts + strict replay)**
* Store input set: artifact digest(s), SBOM digests, policy version, feed digests, lattice rules, tool versions.
* Rerun later; assert **byteidentical verdict** and same deltaverdict when inputs unchanged.
* Rerun later; assert **byteidentical verdict** and same "deltaverdict" when inputs unchanged.
4. **Unknownsbudget policy gates**
* Inject controlled unknown conditions (missing CPE mapping, unresolved package source, unparsed distro).
* Inject controlled "unknown" conditions (missing CPE mapping, unresolved package source, unparsed distro).
* Gate: **fail build if unknowns > budget** (e.g., prod=0, staging≤N).
* Assert: UI, CLI, and attestation all record unknown counts and gate decision.
@@ -45,7 +45,7 @@ Heres a compact, practical plan to harden StellaOps around **offlinerea
* Produce: buildprovenance (intoto/DSSE), SBOM attest, VEX attest, final **verdict attest**.
* Verify: signature (cosign), certificate chain, timestamping, Rekorstyle (or mirror) inclusion when online; cached proofs when offline.
* Assert: each attestation is linked in the verdicts evidence index.
* Assert: each attestation is linked in the verdict's evidence index.
6. **Router backpressure chaos (HTTP 429/503 + RetryAfter)**
@@ -55,7 +55,7 @@ Heres a compact, practical plan to harden StellaOps around **offlinerea
7. **UI reducer tests for reachability & VEX chips**
* Component tests: large SBOM graphs, focused **reachability subgraphs**, and VEX status chips (affected/notaffected/underinvestigation).
* Assert: stable rendering under 50k+ nodes; interactions remain <200ms.
* Assert: stable rendering under 50k+ nodes; interactions remain <200 ms.
---
@@ -95,7 +95,7 @@ Heres a compact, practical plan to harden StellaOps around **offlinerea
* Router under burst emits **correct RetryAfter** and recovers cleanly.
* UI handles huge graphs; VEX chips never desync from evidence.
If you want, Ill turn this into GitLab/Gitea pipeline YAML + a tiny sample repo (image, SBOM, policies, and goldens) so your team can plugandplay.
If you want, I'll turn this into GitLab/Gitea pipeline YAML + a tiny sample repo (image, SBOM, policies, and goldens) so your team can plugandplay.
Below is a complete, end-to-end testing strategy for Stella Ops that turns your moats (offline readiness, deterministic replayable verdicts, lattice/policy decisioning, attestation provenance, unknowns budgets, router backpressure, UI reachability evidence) into continuously verified guarantees.
---
@@ -124,21 +124,21 @@ A scan/verdict is *deterministic* iff **same inputs → byte-identical outputs**
### 1.2 Offline by default
Every CI job (except explicitly tagged online) runs with **no egress**.
Every CI job (except explicitly tagged "online") runs with **no egress**.
* Offline bundle is mandatory input for scanning.
* Any attempted network call fails the test (proves air-gap compliance).
### 1.3 Evidence-first validation
No assertion is verdict == pass without verifying the chain of evidence:
No assertion is "verdict == pass" without verifying the chain of evidence:
* verdict references SBOM digest(s)
* SBOM references artifact digest(s)
* VEX claims reference vulnerabilities + components + reachability evidence
* attestations verify cryptographically and chain to configured roots.
### 1.4 Interop is required, not nice to have
### 1.4 Interop is required, not "nice to have"
Stella Ops must round-trip with:
@@ -146,19 +146,19 @@ Stella Ops must round-trip with:
* Attestation: DSSE / in-toto style envelopes, cosign-compatible flows
* Consumer scanners: at least Grype from SBOM; ideally Trivy as cross-check
Interop tests are treated as compatibility contracts and block releases.
Interop tests are treated as "compatibility contracts" and block releases.
### 1.5 Architectural boundary enforcement (your standing rule)
* Lattice/policy merge algorithms run **in `scanner.webservice`**.
* `Concelier` and `Excitors` must preserve prune source.
* `Concelier` and `Excitors` must "preserve prune source".
This is enforced with tests that detect forbidden behavior (see §6.2).
---
## 2) The test portfolio (what kinds of tests exist)
Think coverage by risk, not coverage by lines.
Think "coverage by risk", not "coverage by lines".
### 2.1 Test layers and what they prove
@@ -172,9 +172,9 @@ Think “coverage by risk”, not “coverage by lines”.
2. **Property-based tests** (FsCheck)
* Reordering inputs does not change verdict hash
* Graph merge is associative/commutative where policy declares it
* Unknowns budgets always monotonic with missing evidence
* "Reordering inputs does not change verdict hash"
* "Graph merge is associative/commutative where policy declares it"
* "Unknowns budgets always monotonic with missing evidence"
* Parser robustness: arbitrary JSON for SBOM/VEX envelopes never crashes
3. **Component tests** (service + Postgres; optional Valkey)
@@ -194,7 +194,7 @@ Think “coverage by risk”, not “coverage by lines”.
* Router → scanner.webservice → attestor → storage
* Offline bundle import/export
* Knowledge snapshot time travel replay pipeline
* Knowledge snapshot "time travel" replay pipeline
6. **End-to-end tests** (realistic flows)
@@ -224,10 +224,10 @@ Both must pass.
### 3.2 Environment isolation
* Containers started with **no network** unless a test explicitly declares online.
* Containers started with **no network** unless a test explicitly declares "online".
* For Kubernetes e2e: apply a default-deny egress NetworkPolicy.
### 3.3 Golden corpora repository (your truth set)
### 3.3 Golden corpora repository (your "truth set")
Create a versioned `stellaops-test-corpus/` containing:
@@ -285,7 +285,7 @@ Bundle includes:
* crypto provider modules (for sovereign readiness)
* optional: Rekor mirror snapshot / inclusion proofs cache
**Test invariant:** offline scan is blocked if bundle is missing required parts; error is explicit and counts as unknown only where policy says so.
**Test invariant:** offline scan is blocked if bundle is missing required parts; error is explicit and counts as "unknown" only where policy says so.
### 4.3 Evidence Index
@@ -295,7 +295,7 @@ The verdict is not the product; the product is verdict + evidence graph:
* their digests and verification status
* unknowns list with codes + remediation hints
**Test invariant:** every not affected claim has required evidence hooks per policy (because feature flag off etc.), otherwise becomes unknown/fail.
**Test invariant:** every "not affected" claim has required evidence hooks per policy ("because feature flag off" etc.), otherwise becomes unknown/fail.
---
@@ -333,8 +333,8 @@ These are your release blockers.
* Assertions:
* verdict bytes identical
* evidence index identical (except allowed execution metadata section)
* delta verdict is empty delta
* evidence index identical (except allowed "execution metadata" section)
* delta verdict is "empty delta"
### Flow D: Diff-aware delta verdict (smart-diff)
@@ -366,7 +366,7 @@ These are your release blockers.
* clients backoff; no request loss
* metrics expose throttling reasons
### Flow G: Evidence export (audit pack)
### Flow G: Evidence export ("audit pack")
* Run scan
* Export a sealed audit pack (bundle + run manifest + evidence + verdict)
@@ -390,16 +390,16 @@ Must have:
**Critical invariant tests:**
* Vendor > distro > internal must be demonstrably *configurable*, and wrong merges must fail deterministically.
* "Vendor > distro > internal" must be demonstrably *configurable*, and wrong merges must fail deterministically.
### 6.2 Boundary enforcement: Concelier & Excitors preserve prune source
Add a behavioral boundary suite:
Add a "behavioral boundary suite":
* instrument events/telemetry that records where merges happened
* feed in conflicting VEX claims and assert:
* Concelier/Excitors do not resolve conflicts; they retain provenance and prune source
* Concelier/Excitors do not resolve conflicts; they retain provenance and "prune source"
* only `scanner.webservice` produces the final merged semantics
If Concelier/Excitors output a resolved claim, the test fails.
@@ -439,7 +439,7 @@ Define standard workloads:
* small image (200 packages)
* medium (2k packages)
* large (20k+ packages)
* monorepo container worst case (50k+ nodes graph)
* "monorepo container" worst case (50k+ nodes graph)
Metrics collected:
@@ -529,7 +529,7 @@ Release candidate is blocked if any of these fail:
### Phase 2: Offline e2e + interop
* offline bundle builder + strict no egress enforcement
* offline bundle builder + strict "no egress" enforcement
* SBOM attestation round-trip + consumer parsing suite
### Phase 3: Unknowns budgets + delta verdict
@@ -556,7 +556,7 @@ If you do only three things, do these:
1. **Run Manifest** as first-class test artifact
2. **Golden corpus** that pins all digests (feeds, policies, images, expected outputs)
3. **No egress default** in CI with explicit opt-in for online tests
3. **"No egress" default** in CI with explicit opt-in for online tests
Everything else becomes far easier once these are in place.

View File

@@ -0,0 +1,56 @@
# Archived Advisory: Testing Strategy
**Archived**: 2025-12-21
**Original**: `docs/product-advisories/20-Dec-2025 - Testing strategy.md`
## Processing Summary
This advisory was processed into Sprint Epic 5100 - Comprehensive Testing Strategy.
### Artifacts Created
**Sprint Files** (12 sprints, ~75 tasks):
| Sprint | Name | Phase |
|--------|------|-------|
| 5100.0001.0001 | Run Manifest Schema | Phase 0 |
| 5100.0001.0002 | Evidence Index Schema | Phase 0 |
| 5100.0001.0003 | Offline Bundle Manifest | Phase 0 |
| 5100.0001.0004 | Golden Corpus Expansion | Phase 0 |
| 5100.0002.0001 | Canonicalization Utilities | Phase 1 |
| 5100.0002.0002 | Replay Runner Service | Phase 1 |
| 5100.0002.0003 | Delta-Verdict Generator | Phase 1 |
| 5100.0003.0001 | SBOM Interop Round-Trip | Phase 2 |
| 5100.0003.0002 | No-Egress Enforcement | Phase 2 |
| 5100.0004.0001 | Unknowns Budget CI Gates | Phase 3 |
| 5100.0005.0001 | Router Chaos Suite | Phase 4 |
| 5100.0006.0001 | Audit Pack Export/Import | Phase 5 |
**Documentation Updated**:
- `docs/implplan/SPRINT_5100_SUMMARY.md` - Master epic summary
- `docs/19_TEST_SUITE_OVERVIEW.md` - Test suite documentation
- `tests/AGENTS.md` - AI agent guidance for tests directory
### Key Concepts Implemented
1. **Deterministic Replay**: Run Manifests capture all inputs for byte-identical verdict reproduction
2. **Canonical JSON**: RFC 8785 principles for stable serialization
3. **Evidence Index**: Linking verdicts to complete evidence chain
4. **Air-Gap Compliance**: Network-isolated testing with `--network none`
5. **SBOM Interoperability**: Round-trip testing with Syft, Grype, cosign
6. **Unknowns Budget Gates**: Environment-based budget enforcement
7. **Router Backpressure**: HTTP 429/503 with Retry-After validation
8. **Audit Packs**: Sealed export/import for compliance verification
### Release Blocking Gates
- Replay Verification: 0 byte diff
- Interop Suite: 95%+ findings parity
- Offline E2E: All pass with no network
- Unknowns Budget: Within configured limits
- Router Retry-After: 100% compliance
---
*Processed by: Claude Code*
*Date: 2025-12-21*