Testing Enhancements Architecture
Version: 1.0.0
Last Updated: 2026-01-05
Status: In Development
Overview
This document describes the architecture of StellaOps testing enhancements derived from the product advisory "New Testing Enhancements for Stella Ops" (05-Dec-2026). The enhancements address gaps in temporal correctness, policy drift control, replayability, and competitive awareness.
Problem Statement
"The next gains for StellaOps testing are no longer about coverage—they're about temporal correctness, policy drift control, replayability, and competitive awareness. Systems that fail now do so quietly, over time, and under sequence pressure."
Key Gaps Identified
| Gap | Impact | Current State |
|---|---|---|
| Temporal Edge Cases | Silent failures under clock drift, leap seconds, TTL boundaries | TimeProvider exists but no edge case tests |
| Failure Choreography | Cascading failures untested | Single-point chaos tests only |
| Trace Replay | Assumptions vs. reality mismatch | Replay module underutilized |
| Policy Drift | Silent behavior changes | Determinism tests exist but no diff testing |
| Decision Opacity | Audit/debug difficulty | Verdicts without explanations |
| Evidence Gaps | Test runs not audit-grade | TRX files not in EvidenceLocker |
Architecture Overview
┌─────────────────────────────────────────────────────────────────────────┐
│ Testing Enhancements Architecture │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ │
│ │ Time-Skew │ │ Trace Replay │ │ Failure │ │
│ │ & Idempotency │ │ & Evidence │ │ Choreography │ │
│ └───────┬────────┘ └───────┬────────┘ └───────┬────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ StellaOps.Testing.* Libraries │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌──────────┐ │ │
│ │ │ Temporal │ │ Replay │ │ Chaos │ │ Evidence │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └──────────┘ │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌──────────┐ │ │
│ │ │ Policy │ │Explainability│ │ Coverage │ │ConfigDiff│ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └──────────┘ │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ Existing Infrastructure │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌──────────┐ │ │
│ │ │ TestKit │ │Determinism │ │ Postgres │ │ AirGap │ │ │
│ │ │ │ │ Testing │ │ Testing │ │ Testing │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └──────────┘ │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Component Architecture
1. Temporal Testing (StellaOps.Testing.Temporal)
Purpose: Simulate temporal edge conditions and verify idempotency.
┌─────────────────────────────────────────────────────────────┐
│ Temporal Testing │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │ SimulatedTimeProvider│ │ IdempotencyVerifier │ │
│ │ - Advance() │ │ - VerifyAsync() │ │
│ │ - JumpTo() │ │ - VerifyWithRetries│ │
│ │ - SetDrift() │ └─────────────────────┘ │
│ │ - JumpBackward() │ │
│ └─────────────────────┘ │
│ │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │LeapSecondTimeProvider│ │TtlBoundaryTimeProvider│ │
│ │ - AdvanceThrough │ │ - PositionAtExpiry │ │
│ │ LeapSecond() │ │ - GenerateBoundary │ │
│ └─────────────────────┘ │ TestCases() │ │
│ └─────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ ClockSkewAssertions │ │
│ │ - AssertHandlesClockJumpForward() │ │
│ │ - AssertHandlesClockJumpBackward() │ │
│ │ - AssertHandlesClockDrift() │ │
│ └─────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Key Interfaces:
- SimulatedTimeProvider - Time progression with drift
- IdempotencyVerifier<T> - Retry idempotency verification
- ClockSkewAssertions - Clock anomaly assertions
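The idea behind the temporal components can be illustrated with a small sketch. It is written in Python for brevity and is not the StellaOps.Testing.Temporal API: `SimulatedClock`, `is_expired`, and `verify_idempotent` are hypothetical names chosen for this example, and the drift model (a fixed amount of extra time per simulated second) is an assumption. The sketch shows a controllable clock being positioned just before and exactly at a TTL boundary, then moved backward, with a simple check that repeated evaluation yields the same result.

```python
from datetime import datetime, timedelta, timezone


class SimulatedClock:
    """Hypothetical stand-in for a simulated time provider (not the real API)."""

    def __init__(self, start: datetime) -> None:
        self._now = start
        self._drift_per_second = timedelta(0)

    def now(self) -> datetime:
        return self._now

    def set_drift(self, drift_per_second: timedelta) -> None:
        # Assumed drift model: extra time added per simulated second elapsed.
        self._drift_per_second = drift_per_second

    def advance(self, delta: timedelta) -> None:
        if delta < timedelta(0):
            raise ValueError("use jump_backward for negative movement")
        drift = self._drift_per_second * delta.total_seconds()
        self._now += delta + drift

    def jump_to(self, target: datetime) -> None:
        self._now = target

    def jump_backward(self, delta: timedelta) -> None:
        self._now -= delta


def is_expired(issued_at: datetime, ttl: timedelta, clock: SimulatedClock) -> bool:
    """Example unit under test: a TTL check that reads the injected clock."""
    return clock.now() >= issued_at + ttl


def verify_idempotent(operation, repetitions: int = 3) -> bool:
    """Run the operation several times; True if every result equals the first."""
    results = [operation() for _ in range(repetitions)]
    return all(result == results[0] for result in results)


start = datetime(2026, 1, 5, tzinfo=timezone.utc)
ttl = timedelta(minutes=5)
clock = SimulatedClock(start)

clock.advance(ttl - timedelta(microseconds=1))
just_before = is_expired(start, ttl, clock)     # one microsecond before expiry
clock.advance(timedelta(microseconds=1))
at_boundary = is_expired(start, ttl, clock)     # exactly at expiry
clock.jump_backward(timedelta(hours=1))
after_rollback = is_expired(start, ttl, clock)  # clock moved back one hour
stable = verify_idempotent(lambda: is_expired(start, ttl, clock))
```

Injecting the clock instead of reading system time is what makes the boundary cases reachable deterministically; the same test body can then be repeated for forward jumps, backward jumps, and drift.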
2. Trace Replay & Evidence (StellaOps.Testing.Replay, StellaOps.Testing.Evidence)
Purpose: Replay production traces and link test runs to EvidenceLocker.
┌─────────────────────────────────────────────────────────────┐
│ Trace Replay & Evidence │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ ┌─────────────────────┐ │
│ │TraceAnonymizer │ │ TestEvidenceService │ │
│ │ - AnonymizeAsync│ │ - BeginSessionAsync │ │
│ │ - ValidateAnon │ │ - RecordTestResult │ │
│ └────────┬────────┘ │ - FinalizeSession │ │
│ │ └──────────┬──────────┘ │
│ ▼ │ │
│ ┌─────────────────┐ ▼ │
│ │TraceCorpusManager│ ┌─────────────────────┐ │
│ │ - ImportAsync │ │ EvidenceLocker │ │
│ │ - QueryAsync │ │ (immutable storage)│ │
│ └────────┬─────────┘ └─────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ ReplayIntegrationTestBase │ │
│ │ - ReplayAndVerifyAsync() │ │
│ │ - ReplayBatchAsync() │ │
│ └─────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Data Flow:
Production Traces → Anonymization → Corpus → Replay Tests → Evidence Bundle
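The data flow above can be sketched end to end in a few functions. This is an illustrative Python sketch, not the StellaOps.Testing.Replay or StellaOps.Testing.Evidence API: the trace shape (`id`, `tenant`, `request`, `expected`), the salted-hash pseudonyms, and the SHA-256 digest over a canonical JSON payload are all assumptions made for the example.

```python
import hashlib
import json

SENSITIVE_FIELDS = {"tenant", "user"}  # assumed field names, for illustration only


def anonymize(trace: dict, salt: str) -> dict:
    """Replace sensitive values with salted, stable pseudonyms."""
    cleaned = {}
    for key, value in trace.items():
        if key in SENSITIVE_FIELDS:
            digest = hashlib.sha256((salt + str(value)).encode()).hexdigest()
            cleaned[key] = "anon-" + digest[:12]
        else:
            cleaned[key] = value
    return cleaned


def replay(corpus: list, handler) -> list:
    """Feed each recorded request to the handler and compare with the recorded response."""
    results = []
    for trace in corpus:
        actual = handler(trace["request"])
        results.append({"id": trace["id"], "passed": actual == trace["expected"]})
    return results


def evidence_bundle(results: list) -> dict:
    """Serialize results canonically and attach a content hash."""
    payload = json.dumps(results, sort_keys=True, separators=(",", ":"))
    return {"results": results, "sha256": hashlib.sha256(payload.encode()).hexdigest()}


production_traces = [
    {"id": "t1", "tenant": "acme", "request": 2, "expected": 4},
    {"id": "t2", "tenant": "acme", "request": 3, "expected": 7},
]
corpus = [anonymize(trace, salt="example-salt") for trace in production_traces]
results = replay(corpus, handler=lambda request: request * 2)
bundle = evidence_bundle(results)
```

Stable pseudonyms keep relationships between traces intact (both traces still share one tenant) while removing the original value, and hashing a canonical serialization lets the stored bundle be checked for tampering later.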
3. Failure Choreography (StellaOps.Testing.Chaos)
Purpose: Orchestrate sequenced, cascading failure scenarios.
┌─────────────────────────────────────────────────────────────┐
│ Failure Choreography │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ FailureChoreographer │ │
│ │ - InjectFailure(componentId, failureType) │ │
│ │ - RecoverComponent(componentId) │ │
│ │ - ExecuteOperation(name, action) │ │
│ │ - AssertCondition(name, condition) │ │
│ │ - ExecuteAsync() → ChoreographyResult │ │
│ └─────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────┼───────────────┐ │
│ ▼ ▼ ▼ │
│ ┌────────────────┐ ┌────────────┐ ┌────────────────┐ │
│ │DatabaseFailure │ │HttpClient │ │ CacheFailure │ │
│ │ Injector │ │ Injector │ │ Injector │ │
│ └────────────────┘ └────────────┘ └────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ ConvergenceTracker │ │
│ │ - CaptureSnapshotAsync() │ │
│ │ - WaitForConvergenceAsync() │ │
│ │ - VerifyConvergenceAsync() │ │
│ └─────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────┼───────────────┐ │
│ ▼ ▼ ▼ │
│ ┌────────────────┐ ┌────────────┐ ┌────────────────┐ │
│ │ DatabaseState │ │ Metrics │ │ QueueState │ │
│ │ Probe │ │ Probe │ │ Probe │ │
│ └────────────────┘ └────────────┘ └────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Failure Types:
- Unavailable - Component completely down
- Timeout - Slow responses
- Intermittent - Random failures
- PartialFailure - Some operations fail
- Degraded - Reduced capacity
- Flapping - Alternating up/down
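A choreographed scenario is essentially an ordered script of inject, assert, and recover steps. The Python sketch below illustrates that shape under stated assumptions: `Choreographer` and `read_source` are hypothetical names, only the Unavailable failure type is modeled, and steps run synchronously rather than through the asynchronous `ExecuteAsync()` shown in the diagram.

```python
from dataclasses import dataclass, field


@dataclass
class Choreographer:
    """Hypothetical miniature choreographer: records steps, then runs them in order."""

    failed: set = field(default_factory=set)
    steps: list = field(default_factory=list)

    def inject_failure(self, component_id: str) -> "Choreographer":
        self.steps.append(("inject", component_id, lambda: self.failed.add(component_id)))
        return self

    def recover_component(self, component_id: str) -> "Choreographer":
        self.steps.append(("recover", component_id, lambda: self.failed.discard(component_id)))
        return self

    def assert_condition(self, name: str, condition) -> "Choreographer":
        self.steps.append(("assert", name, condition))
        return self

    def execute(self) -> list:
        log = []
        for kind, name, action in self.steps:
            outcome = action()
            # Only assertion steps can fail; inject and recover always succeed here.
            passed = bool(outcome) if kind == "assert" else True
            log.append((kind, name, passed))
        return log


def read_source(failed: set):
    """Example system under test: read from cache, fall back to the database."""
    if "cache" not in failed:
        return "cache"
    if "database" not in failed:
        return "database"
    return None


choreo = Choreographer()
log = (
    choreo.inject_failure("cache")
    .assert_condition("falls back to database", lambda: read_source(choreo.failed) == "database")
    .inject_failure("database")
    .assert_condition("reports unavailable", lambda: read_source(choreo.failed) is None)
    .recover_component("database")
    .recover_component("cache")
    .assert_condition("converges to cache", lambda: read_source(choreo.failed) == "cache")
    .execute()
)
```

Recording steps first and executing them afterwards keeps the sequence explicit, which is the difference between a cascading scenario and a single-point chaos test: each assertion is evaluated against the state left behind by all earlier steps.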
4. Policy & Explainability (StellaOps.Core.Explainability, StellaOps.Testing.Policy)
Purpose: Explain automated decisions and test policy changes.
┌─────────────────────────────────────────────────────────────┐
│ Policy & Explainability │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ DecisionExplanation │ │
│ │ - DecisionId, DecisionType, DecidedAt │ │
│ │ - Outcome (value, confidence, summary) │ │
│ │ - Factors[] (type, weight, contribution) │ │
│ │ - AppliedRules[] (id, triggered, impact) │ │
│ │ - Metadata (engine version, input hashes) │ │
│ └─────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────┐ ┌─────────────────────────┐ │
│ │IExplainableDecision│ │ ExplainabilityAssertions│ │
│ │ <TInput, TOutput> │ │ - AssertHasExplanation │ │
│ │ - EvaluateWith │ │ - AssertExplanation │ │
│ │ ExplanationAsync│ │ Reproducible │ │
│ └─────────────────┘ └─────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ PolicyDiffEngine │ │
│ │ - ComputeDiffAsync(baseline, new, inputs) │ │
│ │ → PolicyDiffResult (changed behaviors, deltas) │ │
│ └─────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ PolicyRegressionTestBase │ │
│ │ - Policy_Change_Produces_Expected_Diff() │ │
│ │ - Policy_Change_No_Unexpected_Regressions() │ │
│ └─────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Explainable Services:
- ExplainableVexConsensusService
- ExplainableRiskScoringService
- ExplainablePolicyEngine
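Policy diff testing, as outlined in the diagram, amounts to evaluating a baseline and a candidate policy over the same inputs and comparing outcomes. The following Python sketch illustrates the technique only; the function names, the `cvss` threshold policies, and the finding shape are hypothetical and do not reflect the PolicyDiffEngine implementation.

```python
def compute_policy_diff(baseline, candidate, inputs):
    """Evaluate both policies on the same inputs; return entries whose outcome changed."""
    changes = []
    for item in inputs:
        before, after = baseline(item), candidate(item)
        if before != after:
            changes.append({"id": item["id"], "before": before, "after": after})
    return changes


def unexpected_changes(diff, expected_ids):
    """Changes not declared up front by the policy author count as regressions."""
    return [change for change in diff if change["id"] not in expected_ids]


# Hypothetical policies: the candidate lowers the blocking threshold.
def baseline_policy(finding):
    return "block" if finding["cvss"] >= 9.0 else "allow"


def candidate_policy(finding):
    return "block" if finding["cvss"] >= 7.0 else "allow"


findings = [
    {"id": "a", "cvss": 9.8},
    {"id": "b", "cvss": 7.5},
    {"id": "c", "cvss": 3.1},
]
diff = compute_policy_diff(baseline_policy, candidate_policy, findings)
regressions = unexpected_changes(diff, expected_ids={"b"})
```

The two helper functions mirror the two test methods in the diagram: the diff itself is compared against the expected change set, and anything outside that set is reported as an unexpected regression.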
5. Cross-Cutting Standards (StellaOps.Testing.*)
Purpose: Enforce standards across all testing.
┌─────────────────────────────────────────────────────────────┐
│ Cross-Cutting Standards │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌───────────────────────────────────────────┐ │
│ │ BlastRadius Annotations │ │
│ │ - Auth, Scanning, Evidence, Compliance │ │
│ │ - Advisories, RiskPolicy, Crypto │ │
│ │ - Integrations, Persistence, Api │ │
│ └───────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────────────┐ │
│ │ SchemaEvolutionTestBase │ │
│ │ - TestAgainstPreviousSchemaAsync() │ │
│ │ - TestReadBackwardCompatibilityAsync() │ │
│ │ - TestWriteForwardCompatibilityAsync() │ │
│ └───────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────────────┐ │
│ │ BranchCoverageEnforcer │ │
│ │ - Validate() → dead paths │ │
│ │ - GenerateDeadPathReport() │ │
│ │ - Exemption mechanism │ │
│ └───────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────────────┐ │
│ │ ConfigDiffTestBase │ │
│ │ - TestConfigBehavioralDeltaAsync() │ │
│ │ - TestConfigIsolationAsync() │ │
│ └───────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
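As one illustration of the cross-cutting standards, blast-radius enforcement can be pictured as tagging each test with the areas it touches and failing validation for tests that carry no tag. The Python sketch below is an assumption-laden stand-in: the `blast_radius` decorator and `missing_annotations` helper are hypothetical, and the actual libraries may express annotations differently. Only the category names are taken from the diagram.

```python
ALLOWED_AREAS = {
    "Auth", "Scanning", "Evidence", "Compliance", "Advisories",
    "RiskPolicy", "Crypto", "Integrations", "Persistence", "Api",
}


def blast_radius(*areas):
    """Hypothetical decorator that tags a test with the areas it can affect."""
    unknown = set(areas) - ALLOWED_AREAS
    if not areas or unknown:
        raise ValueError(f"invalid blast-radius areas: {sorted(unknown) or 'none given'}")

    def decorate(test_function):
        test_function.blast_radius = frozenset(areas)
        return test_function

    return decorate


def missing_annotations(tests):
    """Return the names of tests that carry no blast-radius tag."""
    return [t.__name__ for t in tests if not getattr(t, "blast_radius", None)]


@blast_radius("Auth", "Api")
def test_token_refresh():
    pass


def test_without_annotation():
    pass


unannotated = missing_annotations([test_token_refresh, test_without_annotation])
```

A PR-gating check could fail whenever the list of unannotated tests is non-empty, which is roughly the role the document assigns to the annotation validation workflow.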
Library Structure
src/__Tests/__Libraries/
├── StellaOps.Testing.Temporal/
│ ├── SimulatedTimeProvider.cs
│ ├── LeapSecondTimeProvider.cs
│ ├── TtlBoundaryTimeProvider.cs
│ ├── IdempotencyVerifier.cs
│ └── ClockSkewAssertions.cs
│
├── StellaOps.Testing.Replay/
│ ├── ReplayIntegrationTestBase.cs
│ └── IReplayOrchestrator.cs
│
├── StellaOps.Testing.Evidence/
│ ├── ITestEvidenceService.cs
│ ├── TestEvidenceService.cs
│ └── XunitEvidenceReporter.cs
│
├── StellaOps.Testing.Chaos/
│ ├── FailureChoreographer.cs
│ ├── ConvergenceTracker.cs
│ ├── Injectors/
│ │ ├── IFailureInjector.cs
│ │ ├── DatabaseFailureInjector.cs
│ │ ├── HttpClientFailureInjector.cs
│ │ └── CacheFailureInjector.cs
│ └── Probes/
│ ├── IStateProbe.cs
│ ├── DatabaseStateProbe.cs
│ └── MetricsStateProbe.cs
│
├── StellaOps.Testing.Policy/
│ ├── PolicyDiffEngine.cs
│ ├── PolicyRegressionTestBase.cs
│ └── PolicyVersionControl.cs
│
├── StellaOps.Testing.Explainability/
│ └── ExplainabilityAssertions.cs
│
├── StellaOps.Testing.SchemaEvolution/
│ └── SchemaEvolutionTestBase.cs
│
├── StellaOps.Testing.Coverage/
│ └── BranchCoverageEnforcer.cs
│
└── StellaOps.Testing.ConfigDiff/
└── ConfigDiffTestBase.cs
CI/CD Integration
Pipeline Structure
┌─────────────────────────────────────────────────────────────┐
│ CI/CD Pipelines │
├─────────────────────────────────────────────────────────────┤
│ │
│ PR-Gating: │
│ ├── test-blast-radius.yml (validate annotations) │
│ ├── policy-diff.yml (policy change validation) │
│ ├── dead-path-detection.yml (coverage enforcement) │
│ └── test-evidence.yml (evidence capture) │
│ │
│ Scheduled: │
│ ├── schema-evolution.yml (backward compat tests) │
│ ├── chaos-choreography.yml (failure choreography) │
│ └── trace-replay.yml (production trace replay) │
│ │
│ On-Demand: │
│ └── rollback-lag.yml (rollback timing measurement) │
│ │
└─────────────────────────────────────────────────────────────┘
Workflow Triggers
| Workflow | Trigger | Purpose |
|---|---|---|
| test-blast-radius | PR (test files) | Validate annotations |
| policy-diff | PR (policy files) | Validate policy changes |
| dead-path-detection | Push/PR | Prevent untested code |
| test-evidence | Push (main) | Store test evidence |
| schema-evolution | Daily | Backward compatibility |
| chaos-choreography | Weekly | Cascading failure tests |
| trace-replay | Weekly | Production trace validation |
| rollback-lag | Manual | Measure rollback timing |
Implementation Roadmap
Sprint Schedule
| Sprint | Focus | Duration | Key Deliverables |
|---|---|---|---|
| 002_001 | Time-Skew & Idempotency | 3 weeks | Temporal libraries, module tests |
| 002_002 | Trace Replay & Evidence | 3 weeks | Anonymization, evidence linking |
| 002_003 | Failure Choreography | 3 weeks | Choreographer, cascade tests |
| 002_004 | Policy & Explainability | 3 weeks | Explanation schema, diff testing |
| 002_005 | Cross-Cutting Standards | 3 weeks | Annotations, CI enforcement |
Dependencies
002_001 (Temporal) ────┐
│
002_002 (Replay) ──────┼──→ 002_003 (Choreography) ──→ 002_005 (Cross-Cutting)
│ ↑
002_004 (Policy) ──────┘────────────────────────────────────┘
Success Metrics
| Metric | Baseline | Target | Sprint |
|---|---|---|---|
| Temporal edge case coverage | ~5% | 80%+ | 002_001 |
| Idempotency test coverage | ~10% | 90%+ | 002_001 |
| Replay test coverage | 0% | 50%+ | 002_002 |
| Test evidence capture | 0% | 100% | 002_002 |
| Choreographed failure scenarios | 0 | 15+ | 002_003 |
| Decisions with explanations | 0% | 100% | 002_004 |
| Policy changes with diff tests | 0% | 100% | 002_004 |
| Tests with blast-radius | ~10% | 100% | 002_005 |
| Dead paths (non-exempt) | Unknown | <50 | 002_005 |
References
- Sprint Files:
  - docs/implplan/SPRINT_20260105_002_001_TEST_time_skew_idempotency.md
  - docs/implplan/SPRINT_20260105_002_002_TEST_trace_replay_evidence.md
  - docs/implplan/SPRINT_20260105_002_003_TEST_failure_choreography.md
  - docs/implplan/SPRINT_20260105_002_004_TEST_policy_explainability.md
  - docs/implplan/SPRINT_20260105_002_005_TEST_cross_cutting.md
- Advisory: docs/product/advisories/05-Dec-2026 - New Testing Enhancements for Stella Ops.md
- Test Infrastructure: src/__Tests/AGENTS.md