Testing Enhancements Architecture

Version: 1.0.0
Last Updated: 2026-01-05
Status: In Development

Overview

This document describes the architecture of StellaOps testing enhancements derived from the product advisory "New Testing Enhancements for Stella Ops" (05-Dec-2026). The enhancements address gaps in temporal correctness, policy drift control, replayability, and competitive awareness.

Problem Statement

"The next gains for StellaOps testing are no longer about coverage—they're about temporal correctness, policy drift control, replayability, and competitive awareness. Systems that fail now do so quietly, over time, and under sequence pressure."

Key Gaps Identified

| Gap | Impact | Current State |
|-----|--------|---------------|
| Temporal Edge Cases | Silent failures under clock drift, leap seconds, TTL boundaries | TimeProvider exists but no edge case tests |
| Failure Choreography | Cascading failures untested | Single-point chaos tests only |
| Trace Replay | Assumptions vs. reality mismatch | Replay module underutilized |
| Policy Drift | Silent behavior changes | Determinism tests exist but no diff testing |
| Decision Opacity | Audit/debug difficulty | Verdicts without explanations |
| Evidence Gaps | Test runs not audit-grade | TRX files not in EvidenceLocker |

Architecture Overview

┌─────────────────────────────────────────────────────────────────────────┐
│                    Testing Enhancements Architecture                     │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐              │
│  │   Time-Skew    │  │ Trace Replay   │  │   Failure      │              │
│  │  & Idempotency │  │  & Evidence    │  │ Choreography   │              │
│  └───────┬────────┘  └───────┬────────┘  └───────┬────────┘              │
│          │                   │                   │                       │
│          ▼                   ▼                   ▼                       │
│  ┌───────────────────────────────────────────────────────────────┐      │
│  │                 StellaOps.Testing.* Libraries                  │      │
│  │  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌──────────┐ │      │
│  │  │  Temporal   │ │   Replay    │ │    Chaos    │ │ Evidence │ │      │
│  │  └─────────────┘ └─────────────┘ └─────────────┘ └──────────┘ │      │
│  │  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌──────────┐ │      │
│  │  │   Policy    │ │Explainability│ │  Coverage  │ │ConfigDiff│ │      │
│  │  └─────────────┘ └─────────────┘ └─────────────┘ └──────────┘ │      │
│  └───────────────────────────────────────────────────────────────┘      │
│                                  │                                       │
│                                  ▼                                       │
│  ┌───────────────────────────────────────────────────────────────┐      │
│  │                     Existing Infrastructure                    │      │
│  │  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌──────────┐ │      │
│  │  │  TestKit    │ │Determinism  │ │  Postgres   │ │  AirGap  │ │      │
│  │  │             │ │  Testing    │ │  Testing    │ │ Testing  │ │      │
│  │  └─────────────┘ └─────────────┘ └─────────────┘ └──────────┘ │      │
│  └───────────────────────────────────────────────────────────────┘      │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

Component Architecture

1. Temporal Testing (StellaOps.Testing.Temporal)

Purpose: Simulate temporal edge conditions and verify idempotency.

┌─────────────────────────────────────────────────────────────┐
│                    Temporal Testing                          │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌─────────────────────┐    ┌─────────────────────┐         │
│  │ SimulatedTimeProvider│    │ IdempotencyVerifier │         │
│  │  - Advance()         │    │  - VerifyAsync()    │         │
│  │  - JumpTo()          │    │  - VerifyWithRetries│         │
│  │  - SetDrift()        │    └─────────────────────┘         │
│  │  - JumpBackward()    │                                    │
│  └─────────────────────┘                                     │
│                                                              │
│  ┌─────────────────────┐    ┌─────────────────────┐         │
│  │LeapSecondTimeProvider│   │TtlBoundaryTimeProvider│        │
│  │  - AdvanceThrough    │   │  - PositionAtExpiry   │        │
│  │    LeapSecond()      │   │  - GenerateBoundary   │        │
│  └─────────────────────┘   │    TestCases()        │        │
│                             └─────────────────────┘         │
│                                                              │
│  ┌─────────────────────────────────────────────────┐        │
│  │            ClockSkewAssertions                   │        │
│  │  - AssertHandlesClockJumpForward()              │        │
│  │  - AssertHandlesClockJumpBackward()             │        │
│  │  - AssertHandlesClockDrift()                    │        │
│  └─────────────────────────────────────────────────┘        │
│                                                              │
└─────────────────────────────────────────────────────────────┘
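The TTL boundary cases named in the diagram can be illustrated with a small sketch. It is written in Python for brevity and is not the .NET library's API; the function names and the "expired once now >= created + TTL" convention are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def generate_boundary_test_cases(created_at, ttl, epsilon=timedelta(milliseconds=1)):
    """Return (label, instant, expected_expired) tuples around the TTL expiry.

    Assumption: an entry counts as expired once now >= created_at + ttl.
    """
    expiry = created_at + ttl
    return [
        ("just_before_expiry", expiry - epsilon, False),
        ("exactly_at_expiry", expiry, True),
        ("just_after_expiry", expiry + epsilon, True),
    ]


def is_expired(created_at, ttl, now):
    """Toy system under test that follows the same >= convention."""
    return now >= created_at + ttl
```

A test would loop over the generated cases and compare the system's answer with `expected_expired`; an off-by-one in the comparison operator shows up at the `exactly_at_expiry` case.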

Key Interfaces:

  • SimulatedTimeProvider - Time progression with drift
  • IdempotencyVerifier<T> - Retry idempotency verification
  • ClockSkewAssertions - Clock anomaly assertions
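As a rough illustration of how a controllable clock and an idempotency check fit together, here is a minimal Python sketch. The real components are .NET types whose signatures are not shown in this document, so `SimulatedClock`, `verify_idempotent`, and the drift model (extra time accrued per simulated second) are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


class SimulatedClock:
    """Hypothetical stand-in for a simulated time provider: time moves only on request."""

    def __init__(self, start):
        self._now = start
        self._drift_per_second = timedelta(0)

    def now(self):
        return self._now

    def set_drift(self, drift_per_second):
        # Extra (or missing) time accrued for every simulated second advanced.
        self._drift_per_second = drift_per_second

    def advance(self, delta):
        if delta < timedelta(0):
            raise ValueError("use jump_backward for negative movement")
        self._now += delta + self._drift_per_second * delta.total_seconds()

    def jump_to(self, instant):
        self._now = instant

    def jump_backward(self, delta):
        self._now -= delta


def verify_idempotent(operation, repetitions=3):
    """Run the operation several times; report whether every result equals the first."""
    results = [operation() for _ in range(repetitions)]
    return all(r == results[0] for r in results), results
```

Code under test would read time only through the injected clock, so a test can advance, drift, or rewind it and then use `verify_idempotent` to confirm that retries across the anomaly leave the same observable state.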

2. Trace Replay & Evidence (StellaOps.Testing.Replay, StellaOps.Testing.Evidence)

Purpose: Replay production traces and link test runs to EvidenceLocker.

┌─────────────────────────────────────────────────────────────┐
│              Trace Replay & Evidence                         │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌─────────────────┐      ┌─────────────────────┐           │
│  │TraceAnonymizer  │      │  TestEvidenceService │           │
│  │ - AnonymizeAsync│      │  - BeginSessionAsync │           │
│  │ - ValidateAnon  │      │  - RecordTestResult  │           │
│  └────────┬────────┘      │  - FinalizeSession   │           │
│           │               └──────────┬──────────┘           │
│           ▼                          │                       │
│  ┌─────────────────┐                 ▼                       │
│  │TraceCorpusManager│       ┌─────────────────────┐          │
│  │ - ImportAsync    │       │  EvidenceLocker     │          │
│  │ - QueryAsync     │       │  (immutable storage)│          │
│  └────────┬─────────┘       └─────────────────────┘          │
│           │                                                  │
│           ▼                                                  │
│  ┌─────────────────────────────────────────────────┐        │
│  │           ReplayIntegrationTestBase             │        │
│  │  - ReplayAndVerifyAsync()                       │        │
│  │  - ReplayBatchAsync()                           │        │
│  └─────────────────────────────────────────────────┘        │
│                                                              │
└─────────────────────────────────────────────────────────────┘
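The begin/record/finalize lifecycle in the diagram can be sketched as follows. This is an in-memory Python illustration, not the TestEvidenceService API: the class, its field names, and the choice of a SHA-256 digest over canonical JSON are assumptions.

```python
import hashlib
import json


class EvidenceSession:
    """Hypothetical in-memory analogue of a test evidence session."""

    def __init__(self, session_id):
        self.session_id = session_id
        self._results = []
        self._finalized = False

    def record_test_result(self, test_name, outcome, duration_ms):
        if self._finalized:
            raise RuntimeError("session already finalized")
        self._results.append(
            {"test": test_name, "outcome": outcome, "duration_ms": duration_ms}
        )

    def finalize(self):
        """Freeze the session and return a bundle with a content digest."""
        self._finalized = True
        payload = {
            "session_id": self.session_id,
            # Sort so the digest does not depend on test execution order.
            "results": sorted(self._results, key=lambda r: r["test"]),
        }
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
        return {"payload": payload, "sha256": digest}
```

Refusing writes after finalization mirrors the immutability expected of stored evidence, and a digest over canonical content lets an auditor confirm later that a bundle was not altered.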

Data Flow:

Production Traces → Anonymization → Corpus → Replay Tests → Evidence Bundle
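The anonymization step of this flow might look like the sketch below. It is a Python illustration under stated assumptions: the document does not say which fields are sensitive or how they are masked, so the email and IPv4 patterns, the keyed-hash pseudonyms, and both function names are hypothetical.

```python
import hashlib
import hmac
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def _pseudonym(value, key, prefix):
    # Keyed hash: the same input always maps to the same token, so events
    # can still be correlated, but the original value cannot be recovered
    # without the key.
    mac = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:12]
    return f"{prefix}_{mac}"


def anonymize(text, key):
    """Replace email addresses and IPv4 literals with stable pseudonyms."""
    text = EMAIL_RE.sub(lambda m: _pseudonym(m.group(0), key, "email"), text)
    return IPV4_RE.sub(lambda m: _pseudonym(m.group(0), key, "ip"), text)


def validate_anonymized(text):
    """Return True when no email address or IPv4 literal remains."""
    return not (EMAIL_RE.search(text) or IPV4_RE.search(text))
```

Running the validator as a separate pass before a trace enters the corpus gives a second line of defense if a new sensitive pattern slips past the anonymizer.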

3. Failure Choreography (StellaOps.Testing.Chaos)

Purpose: Orchestrate sequenced, cascading failure scenarios.

┌─────────────────────────────────────────────────────────────┐
│                Failure Choreography                          │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌─────────────────────────────────────────────────┐        │
│  │              FailureChoreographer                │        │
│  │  - InjectFailure(componentId, failureType)      │        │
│  │  - RecoverComponent(componentId)                │        │
│  │  - ExecuteOperation(name, action)               │        │
│  │  - AssertCondition(name, condition)             │        │
│  │  - ExecuteAsync() → ChoreographyResult          │        │
│  └─────────────────────────────────────────────────┘        │
│                           │                                  │
│           ┌───────────────┼───────────────┐                 │
│           ▼               ▼               ▼                 │
│  ┌────────────────┐ ┌────────────┐ ┌────────────────┐       │
│  │DatabaseFailure │ │HttpClient  │ │ CacheFailure   │       │
│  │  Injector      │ │ Injector   │ │   Injector     │       │
│  └────────────────┘ └────────────┘ └────────────────┘       │
│                                                              │
│  ┌─────────────────────────────────────────────────┐        │
│  │             ConvergenceTracker                   │        │
│  │  - CaptureSnapshotAsync()                       │        │
│  │  - WaitForConvergenceAsync()                    │        │
│  │  - VerifyConvergenceAsync()                     │        │
│  └─────────────────────────────────────────────────┘        │
│                           │                                  │
│           ┌───────────────┼───────────────┐                 │
│           ▼               ▼               ▼                 │
│  ┌────────────────┐ ┌────────────┐ ┌────────────────┐       │
│  │ DatabaseState  │ │ Metrics    │ │  QueueState    │       │
│  │    Probe       │ │  Probe     │ │    Probe       │       │
│  └────────────────┘ └────────────┘ └────────────────┘       │
│                                                              │
└─────────────────────────────────────────────────────────────┘
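Convergence tracking, as drawn above, amounts to polling state probes until the system matches a baseline snapshot or a deadline passes. A minimal Python sketch follows; the function name, the dictionary-based probes, and the injectable `clock` and `sleep` parameters are assumptions for illustration rather than the ConvergenceTracker API.

```python
import time


def wait_for_convergence(probes, baseline, timeout_s=5.0, interval_s=0.1,
                         clock=time.monotonic, sleep=time.sleep):
    """Poll every probe until its reading equals the baseline or time runs out.

    probes:   dict of name -> zero-argument callable returning current state
    baseline: dict of name -> state expected after recovery
    Returns (converged, last_snapshot).
    """
    deadline = clock() + timeout_s
    while True:
        snapshot = {name: probe() for name, probe in probes.items()}
        if snapshot == baseline:
            return True, snapshot
        if clock() >= deadline:
            return False, snapshot
        sleep(interval_s)
```

Returning the last snapshot alongside the verdict makes a failed convergence diagnosable, since the test output shows which probe never returned to its baseline value.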

Failure Types:

  • Unavailable - Component completely down
  • Timeout - Slow responses
  • Intermittent - Random failures
  • PartialFailure - Some operations fail
  • Degraded - Reduced capacity
  • Flapping - Alternating up/down
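To make the idea of a sequenced scenario concrete, the following Python sketch queues inject, recover, operation, and assertion steps and then runs them in order. It mirrors the method names in the diagram, but the fluent style, the result dictionary, and the way failures are recorded are assumptions; real injectors would act on databases, HTTP clients, or caches rather than on a dictionary.

```python
from enum import Enum


class FailureType(Enum):
    UNAVAILABLE = "unavailable"
    TIMEOUT = "timeout"
    INTERMITTENT = "intermittent"
    PARTIAL_FAILURE = "partial_failure"
    DEGRADED = "degraded"
    FLAPPING = "flapping"


class Choreographer:
    """Hypothetical builder: queue steps, then run them strictly in order."""

    def __init__(self):
        self._steps = []
        self.active_failures = {}  # component id -> FailureType

    def inject_failure(self, component_id, failure_type):
        self._steps.append((
            f"inject:{component_id}",
            lambda: self.active_failures.__setitem__(component_id, failure_type),
        ))
        return self

    def recover_component(self, component_id):
        self._steps.append((
            f"recover:{component_id}",
            lambda: self.active_failures.pop(component_id, None),
        ))
        return self

    def execute_operation(self, name, action):
        self._steps.append((f"op:{name}", action))
        return self

    def assert_condition(self, name, condition):
        def check():
            if not condition():
                raise AssertionError(name)
        self._steps.append((f"assert:{name}", check))
        return self

    def execute(self):
        """Run steps in order; stop at the first failing step and report it."""
        completed = []
        for label, step in self._steps:
            try:
                step()
            except Exception as exc:
                return {"success": False, "completed": completed,
                        "failed_step": label, "error": repr(exc)}
            completed.append(label)
        return {"success": True, "completed": completed,
                "failed_step": None, "error": None}
```

Because steps are declared first and executed later, the same scenario can be replayed with a different ordering, which is where sequence-dependent bugs tend to surface.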

4. Policy & Explainability (StellaOps.Core.Explainability, StellaOps.Testing.Policy)

Purpose: Explain automated decisions and test policy changes.

┌─────────────────────────────────────────────────────────────┐
│              Policy & Explainability                         │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌─────────────────────────────────────────────────┐        │
│  │              DecisionExplanation                 │        │
│  │  - DecisionId, DecisionType, DecidedAt          │        │
│  │  - Outcome (value, confidence, summary)         │        │
│  │  - Factors[] (type, weight, contribution)       │        │
│  │  - AppliedRules[] (id, triggered, impact)       │        │
│  │  - Metadata (engine version, input hashes)      │        │
│  └─────────────────────────────────────────────────┘        │
│                                                              │
│  ┌─────────────────┐    ┌─────────────────────────┐         │
│  │IExplainableDecision│  │ ExplainabilityAssertions│         │
│  │ <TInput, TOutput> │  │  - AssertHasExplanation │         │
│  │ - EvaluateWith    │  │  - AssertExplanation    │         │
│  │   ExplanationAsync│  │    Reproducible         │         │
│  └─────────────────┘    └─────────────────────────┘         │
│                                                              │
│  ┌─────────────────────────────────────────────────┐        │
│  │             PolicyDiffEngine                     │        │
│  │  - ComputeDiffAsync(baseline, new, inputs)      │        │
│  │  → PolicyDiffResult (changed behaviors, deltas) │        │
│  └─────────────────────────────────────────────────┘        │
│                           │                                  │
│                           ▼                                  │
│  ┌─────────────────────────────────────────────────┐        │
│  │          PolicyRegressionTestBase               │        │
│  │  - Policy_Change_Produces_Expected_Diff()       │        │
│  │  - Policy_Change_No_Unexpected_Regressions()    │        │
│  └─────────────────────────────────────────────────┘        │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Explainable Services:

  • ExplainableVexConsensusService
  • ExplainableRiskScoringService
  • ExplainablePolicyEngine
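A toy model helps show how explanations and policy diffing relate. The Python sketch below assumes a weighted-sum policy with a threshold, which is an invented stand-in and not how the StellaOps policy engine works; the dataclasses carry only a subset of the fields listed in the diagram, and all names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Factor:
    type: str
    weight: float
    contribution: float


@dataclass(frozen=True)
class DecisionExplanation:
    decision_id: str
    outcome: str
    factors: tuple = field(default_factory=tuple)


def evaluate_with_explanation(policy, decision_id, inputs):
    """Score inputs against a toy policy and record each factor's contribution."""
    factors = tuple(
        Factor(name, weight, weight * inputs.get(name, 0.0))
        for name, weight in sorted(policy["weights"].items())
    )
    score = sum(f.contribution for f in factors)
    outcome = "block" if score >= policy["threshold"] else "allow"
    return DecisionExplanation(decision_id, outcome, factors)


def compute_policy_diff(baseline, candidate, corpus):
    """Return (id, old_outcome, new_outcome) for inputs whose outcome changes."""
    changes = []
    for decision_id, inputs in corpus.items():
        before = evaluate_with_explanation(baseline, decision_id, inputs)
        after = evaluate_with_explanation(candidate, decision_id, inputs)
        if before.outcome != after.outcome:
            changes.append((decision_id, before.outcome, after.outcome))
    return changes
```

A regression test would pin the expected list of changes for a policy edit; any extra entry is an unexpected behavior change, and evaluating the same input twice should yield equal explanations if the decision is reproducible.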

5. Cross-Cutting Standards (StellaOps.Testing.*)

Purpose: Enforce standards across all testing.

┌─────────────────────────────────────────────────────────────┐
│                Cross-Cutting Standards                       │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌───────────────────────────────────────────┐              │
│  │         BlastRadius Annotations            │              │
│  │  - Auth, Scanning, Evidence, Compliance   │              │
│  │  - Advisories, RiskPolicy, Crypto         │              │
│  │  - Integrations, Persistence, Api         │              │
│  └───────────────────────────────────────────┘              │
│                                                              │
│  ┌───────────────────────────────────────────┐              │
│  │        SchemaEvolutionTestBase            │              │
│  │  - TestAgainstPreviousSchemaAsync()       │              │
│  │  - TestReadBackwardCompatibilityAsync()   │              │
│  │  - TestWriteForwardCompatibilityAsync()   │              │
│  └───────────────────────────────────────────┘              │
│                                                              │
│  ┌───────────────────────────────────────────┐              │
│  │        BranchCoverageEnforcer             │              │
│  │  - Validate() → dead paths                │              │
│  │  - GenerateDeadPathReport()               │              │
│  │  - Exemption mechanism                    │              │
│  └───────────────────────────────────────────┘              │
│                                                              │
│  ┌───────────────────────────────────────────┐              │
│  │          ConfigDiffTestBase               │              │
│  │  - TestConfigBehavioralDeltaAsync()       │              │
│  │  - TestConfigIsolationAsync()             │              │
│  └───────────────────────────────────────────┘              │
│                                                              │
└─────────────────────────────────────────────────────────────┘
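Blast-radius enforcement can be pictured as tagging each test with one or more of the areas listed above and failing the build when a test has none. The sketch below uses a Python decorator to play the role of the annotation; the decorator, the attribute name, and the checker function are hypothetical, and only the area names come from this document.

```python
ALLOWED_BLAST_RADII = {
    "Auth", "Scanning", "Evidence", "Compliance", "Advisories",
    "RiskPolicy", "Crypto", "Integrations", "Persistence", "Api",
}


def blast_radius(*areas):
    """Decorator attaching blast-radius areas to a test function."""
    unknown = set(areas) - ALLOWED_BLAST_RADII
    if unknown:
        raise ValueError(f"unknown blast radius: {sorted(unknown)}")

    def decorate(func):
        func.blast_radius = frozenset(areas)
        return func

    return decorate


def find_unannotated(tests):
    """Return names of tests lacking a non-empty blast_radius attribute."""
    return [t.__name__ for t in tests if not getattr(t, "blast_radius", None)]
```

Rejecting unknown area names at decoration time keeps the vocabulary closed, so reports grouped by blast radius do not fragment into near-duplicate labels.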

Library Structure

src/__Tests/__Libraries/
├── StellaOps.Testing.Temporal/
│   ├── SimulatedTimeProvider.cs
│   ├── LeapSecondTimeProvider.cs
│   ├── TtlBoundaryTimeProvider.cs
│   ├── IdempotencyVerifier.cs
│   └── ClockSkewAssertions.cs
│
├── StellaOps.Testing.Replay/
│   ├── ReplayIntegrationTestBase.cs
│   └── IReplayOrchestrator.cs
│
├── StellaOps.Testing.Evidence/
│   ├── ITestEvidenceService.cs
│   ├── TestEvidenceService.cs
│   └── XunitEvidenceReporter.cs
│
├── StellaOps.Testing.Chaos/
│   ├── FailureChoreographer.cs
│   ├── ConvergenceTracker.cs
│   ├── Injectors/
│   │   ├── IFailureInjector.cs
│   │   ├── DatabaseFailureInjector.cs
│   │   ├── HttpClientFailureInjector.cs
│   │   └── CacheFailureInjector.cs
│   └── Probes/
│       ├── IStateProbe.cs
│       ├── DatabaseStateProbe.cs
│       └── MetricsStateProbe.cs
│
├── StellaOps.Testing.Policy/
│   ├── PolicyDiffEngine.cs
│   ├── PolicyRegressionTestBase.cs
│   └── PolicyVersionControl.cs
│
├── StellaOps.Testing.Explainability/
│   └── ExplainabilityAssertions.cs
│
├── StellaOps.Testing.SchemaEvolution/
│   └── SchemaEvolutionTestBase.cs
│
├── StellaOps.Testing.Coverage/
│   └── BranchCoverageEnforcer.cs
│
└── StellaOps.Testing.ConfigDiff/
    └── ConfigDiffTestBase.cs

CI/CD Integration

Pipeline Structure

┌─────────────────────────────────────────────────────────────┐
│                    CI/CD Pipelines                           │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  PR-Gating:                                                  │
│  ├── test-blast-radius.yml    (validate annotations)        │
│  ├── policy-diff.yml          (policy change validation)    │
│  ├── dead-path-detection.yml  (coverage enforcement)        │
│  └── test-evidence.yml        (evidence capture)            │
│                                                              │
│  Scheduled:                                                  │
│  ├── schema-evolution.yml     (backward compat tests)       │
│  ├── chaos-choreography.yml   (failure choreography)        │
│  └── trace-replay.yml         (production trace replay)     │
│                                                              │
│  On-Demand:                                                  │
│  └── rollback-lag.yml         (rollback timing measurement) │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Workflow Triggers

| Workflow | Trigger | Purpose |
|----------|---------|---------|
| test-blast-radius | PR (test files) | Validate annotations |
| policy-diff | PR (policy files) | Validate policy changes |
| dead-path-detection | Push/PR | Prevent untested code |
| test-evidence | Push (main) | Store test evidence |
| schema-evolution | Daily | Backward compatibility |
| chaos-choreography | Weekly | Cascading failure tests |
| trace-replay | Weekly | Production trace validation |
| rollback-lag | Manual | Measure rollback timing |

Implementation Roadmap

Sprint Schedule

| Sprint | Focus | Duration | Key Deliverables |
|--------|-------|----------|------------------|
| 002_001 | Time-Skew & Idempotency | 3 weeks | Temporal libraries, module tests |
| 002_002 | Trace Replay & Evidence | 3 weeks | Anonymization, evidence linking |
| 002_003 | Failure Choreography | 3 weeks | Choreographer, cascade tests |
| 002_004 | Policy & Explainability | 3 weeks | Explanation schema, diff testing |
| 002_005 | Cross-Cutting Standards | 3 weeks | Annotations, CI enforcement |

Dependencies

002_001 (Temporal) ────┐
                       │
002_002 (Replay) ──────┼──→ 002_003 (Choreography) ──→ 002_005 (Cross-Cutting)
                       │                                    ↑
002_004 (Policy) ──────┘────────────────────────────────────┘
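One way to read the diagram is as a dependency graph that can be ordered mechanically. The Python sketch below encodes one interpretation of the arrows (002_001, 002_002, and 002_004 feed 002_003; 002_003 and 002_004 feed 002_005) and groups sprints into waves that could start together. The edge list is an assumption drawn from the drawing, not from the sprint files.

```python
from graphlib import TopologicalSorter

# Each sprint maps to the sprints it waits on, as read from the diagram.
SPRINT_DEPENDENCIES = {
    "002_001": set(),
    "002_002": set(),
    "002_004": set(),
    "002_003": {"002_001", "002_002", "002_004"},
    "002_005": {"002_003", "002_004"},
}


def sprint_waves(dependencies):
    """Group sprints into waves; every sprint in a wave can start together."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# sprint_waves(SPRINT_DEPENDENCIES)
# → [['002_001', '002_002', '002_004'], ['002_003'], ['002_005']]
```

Under this reading the first three sprints have no prerequisites among themselves, while Failure Choreography and then Cross-Cutting Standards must follow in sequence.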

Success Metrics

| Metric | Baseline | Target | Sprint |
|--------|----------|--------|--------|
| Temporal edge case coverage | ~5% | 80%+ | 002_001 |
| Idempotency test coverage | ~10% | 90%+ | 002_001 |
| Replay test coverage | 0% | 50%+ | 002_002 |
| Test evidence capture | 0% | 100% | 002_002 |
| Choreographed failure scenarios | 0 | 15+ | 002_003 |
| Decisions with explanations | 0% | 100% | 002_004 |
| Policy changes with diff tests | 0% | 100% | 002_004 |
| Tests with blast-radius | ~10% | 100% | 002_005 |
| Dead paths (non-exempt) | Unknown | <50 | 002_005 |

References

  • Sprint Files:
    • docs/implplan/SPRINT_20260105_002_001_TEST_time_skew_idempotency.md
    • docs/implplan/SPRINT_20260105_002_002_TEST_trace_replay_evidence.md
    • docs/implplan/SPRINT_20260105_002_003_TEST_failure_choreography.md
    • docs/implplan/SPRINT_20260105_002_004_TEST_policy_explainability.md
    • docs/implplan/SPRINT_20260105_002_005_TEST_cross_cutting.md
  • Advisory: docs/product-advisories/05-Dec-2026 - New Testing Enhancements for Stella Ops.md
  • Test Infrastructure: src/__Tests/AGENTS.md