
Testing Practices

Scope

  • Applies to all modules, shared libraries, and tooling in this repository.
  • Covers quality, maintainability, security, reusability, and test readiness.

Required test layers

  • Unit tests for every library and service (happy paths, edge cases, determinism, serialization).
  • Integration tests for cross-component flows (database, messaging, storage, and service contracts).
  • End-to-end tests for user-visible workflows and release-critical flows.
  • Performance tests for scanners, exporters, and release orchestration paths.
  • Security tests for authn/authz, input validation, and dependency risk checks.
  • Offline and airgap validation: all suites must run without network access.
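
As a sketch of the determinism and serialization expectations above, a unit test can serialize the same input twice and assert byte-identical output (the record and names here are illustrative, not real module types):

```csharp
using System.Text.Json;
using Xunit;

public class SerializationDeterminismTests
{
    // Illustrative DTO; real tests target the module's own types.
    private sealed record ScanResult(string Digest, int FindingCount);

    [Fact]
    [Trait("Category", "Unit")]
    public void Serialize_ProducesIdenticalOutputForIdenticalInput()
    {
        var input = new ScanResult("sha256:abc123", 3);

        // Deterministic serialization: same input, byte-identical JSON.
        var first = JsonSerializer.Serialize(input);
        var second = JsonSerializer.Serialize(input);

        Assert.Equal(first, second);
    }
}
```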

Cadence

  • Per change: unit tests plus relevant integration tests and determinism checks.
  • Nightly: full integration, end-to-end suites, and longevity tests per module.
  • Weekly: performance baselines, flakiness triage, and cross-version compatibility checks.
  • Release gate: full test matrix, security verification, reproducible build checks, and interop validation.

Evidence and reporting

  • Record results in sprint Execution Logs with date, scope, and outcomes.
  • Track flaky tests and block releases until mitigations are documented.
  • Store deterministic fixtures and hashes for any generated artifacts.

Environment expectations

  • Use UTC timestamps, fixed seeds, and CultureInfo.InvariantCulture where relevant.
  • Avoid live network calls; rely on fixtures and local emulators only.
  • Inject time and ID providers (TimeProvider, IGuidGenerator) for testability.
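
The TimeProvider guidance above can be sketched as follows (ExpiryService is an illustrative stand-in; FakeTimeProvider ships in the Microsoft.Extensions.TimeProvider.Testing package):

```csharp
using System;
using Microsoft.Extensions.Time.Testing;
using Xunit;

public class ExpiryServiceTests
{
    // Illustrative service; production code receives TimeProvider via DI.
    private sealed class ExpiryService(TimeProvider time)
    {
        public bool IsExpired(DateTimeOffset expiresAt) => time.GetUtcNow() >= expiresAt;
    }

    [Fact]
    [Trait("Category", "Unit")]
    public void IsExpired_UsesInjectedClock()
    {
        var clock = new FakeTimeProvider(new DateTimeOffset(2026, 1, 1, 0, 0, 0, TimeSpan.Zero));
        var sut = new ExpiryService(clock);
        var deadline = clock.GetUtcNow().AddHours(1);

        Assert.False(sut.IsExpired(deadline));

        // Advancing the fake clock is deterministic; no Thread.Sleep needed.
        clock.Advance(TimeSpan.FromHours(2));
        Assert.True(sut.IsExpired(deadline));
    }
}
```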

Targeted xUnit v3 execution

  • Some Stella Ops test projects expose the xUnit v3 in-process runner through Microsoft Testing Platform.
  • On those projects, dotnet test --filter ... may be ignored even when the caller expects a narrow subset.
  • For targeted verification on those projects, use pwsh ./scripts/test-targeted-xunit.ps1 -Project <test-project>.csproj ... when PowerShell 7 is available, or powershell -ExecutionPolicy Bypass -File .\scripts\test-targeted-xunit.ps1 -Project <test-project>.csproj ... on Windows PowerShell hosts.
  • The helper auto-selects the correct runner:
    • dotnet exec <test-dll> for standard library-style test assemblies
    • dotnet run --project <test-project> -- ... for ASP.NET host tests that reference Microsoft.AspNetCore.Mvc.Testing
  • Do not force raw dotnet exec for MVC-testing projects; that can fail with loader-context false negatives even when the targeted test actually passes through the normal project runner.
  • Capture the exact targeted method/class/trait arguments in sprint evidence so reviewers can confirm the run was actually scoped.
  • If the test assembly is stale, rebuild the specific .csproj first; prefer a scoped build over a solution-wide rebuild.
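
Putting the above together, a targeted run might look like this (the project path and the -Method argument are illustrative; consult the script's help output for the exact parameter names it accepts):

```shell
# Rebuild only the affected test project first if the assembly is stale.
dotnet build src/Signer/StellaOps.Signer.Tests/StellaOps.Signer.Tests.csproj

# PowerShell 7 host; the helper picks dotnet exec or dotnet run as appropriate.
pwsh ./scripts/test-targeted-xunit.ps1 \
  -Project src/Signer/StellaOps.Signer.Tests/StellaOps.Signer.Tests.csproj \
  -Method Signer_RejectsExpiredCertificate
```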

Intent tagging (Turn #6)

Every non-trivial test must declare its intent using the Intent trait. Intent clarifies why the behavior exists and enables CI to flag changes that violate intent even if tests pass.

Intent categories:

  • Regulatory: compliance, audit requirements, legal obligations.
  • Safety: security invariants, fail-secure behavior, cryptographic correctness.
  • Performance: latency, throughput, resource usage guarantees.
  • Competitive: parity with competitor tools (Syft, Grype, Trivy, Anchore).
  • Operational: observability, diagnosability, operability requirements.

Usage:

[Fact]
[Trait("Intent", "Safety")]
[Trait("Category", "Unit")]
public void Signer_RejectsExpiredCertificate()
{
    // Verify that expired certificates are rejected (safety invariant).
}

[Fact]
[Trait("Intent", "Regulatory")]
[Trait("Category", "Integration")]
public void EvidenceBundle_IsImmutableAfterSigning()
{
    // Verify that signed evidence cannot be modified (audit requirement).
}

Enforcement:

  • Tests without intent tags in regulatory modules (Policy, Authority, Signer, Attestor, EvidenceLocker) will trigger CI warnings.
  • Intent coverage metrics are tracked per module in TEST_COVERAGE_MATRIX.md.

Observability contract testing (Turn #6)

Logs, metrics, and traces are APIs. WebService tests (W1 model) must validate observability contracts.

OTel trace contracts:

  • Required spans must exist for core operations.
  • Span attributes must include required fields (correlation ID, tenant ID where applicable).
  • Attribute cardinality must be bounded (no unbounded label explosion).

Structured log contracts:

  • Required fields must be present (timestamp, level, message, correlation ID).
  • No PII in logs (validated via pattern matching).
  • Log levels must be appropriate (no ERROR for expected conditions).
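
As an illustrative sketch (the captured entries and field names are hypothetical; a real test would collect structured log output from the system under test), a log contract test can assert required fields and screen for PII patterns:

```csharp
using System.Collections.Generic;
using Xunit;

public class LogContractTests
{
    [Fact]
    [Trait("Category", "Unit")]
    public void Logs_CarryRequiredFields_AndNoPii()
    {
        // Hypothetical captured entry; real tests capture the SUT's logs.
        var entries = new[]
        {
            new Dictionary<string, string>
            {
                ["timestamp"] = "2026-04-22T13:06:39Z",
                ["level"] = "Information",
                ["message"] = "request processed",
                ["corr_id"] = "abc123",
            },
        };

        foreach (var entry in entries)
        {
            // Required-field contract.
            Assert.True(entry.ContainsKey("timestamp"));
            Assert.True(entry.ContainsKey("level"));
            Assert.True(entry.ContainsKey("corr_id"));

            // Crude PII screen: fail on anything resembling an email address.
            Assert.DoesNotMatch(@"[\w.+-]+@[\w-]+\.[\w.]+", entry["message"]);
        }
    }
}
```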

Metrics contracts:

  • Required metrics must exist for core operations.
  • Label cardinality must be bounded (< 100 distinct values per label).
  • Counters must be monotonic.

Usage:

using var otel = new OtelCapture();
await sut.ProcessAsync(request);

OTelContractAssert.HasRequiredSpans(otel, "ProcessRequest", "ValidateInput", "PersistResult");
OTelContractAssert.SpanHasAttributes(otel.GetSpan("ProcessRequest"), "corr_id", "tenant_id");
OTelContractAssert.NoHighCardinalityAttributes(otel, threshold: 100);

Evidence traceability (Turn #6)

Every critical behavior must link: requirement -> test -> run -> artifact -> deployed version. This chain enables audit and root cause analysis.

Requirement linking:

[Fact]
[Requirement("REQ-EVIDENCE-001", SprintTaskId = "TEST-ENH6-06")]
[Trait("Intent", "Regulatory")]
public void EvidenceChain_IsComplete()
{
    // Verify that the evidence chain is traceable end to end.
}

Artifact immutability:

  • Tests for compliance-critical artifacts must verify hash stability.
  • Use EvidenceChainAssert.ArtifactImmutable() for determinism verification.
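
A minimal hash-stability check, assuming a canonical byte serialization (SerializeCanonical is a stand-in for the module's real canonical writer; the EvidenceChainAssert helper presumably packages the same pattern):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using Xunit;

public class ArtifactImmutabilityTests
{
    // Stand-in for the module's canonical serializer.
    private static byte[] SerializeCanonical(string artifact) =>
        Encoding.UTF8.GetBytes(artifact);

    [Fact]
    [Trait("Intent", "Regulatory")]
    public void Artifact_HashIsStableAcrossSerializations()
    {
        const string artifact = "evidence-bundle-v1";

        var first = Convert.ToHexString(SHA256.HashData(SerializeCanonical(artifact)));
        var second = Convert.ToHexString(SHA256.HashData(SerializeCanonical(artifact)));

        // A stable hash is safe to pin in fixtures and sprint evidence.
        Assert.Equal(first, second);
    }
}
```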

Traceability reporting:

  • CI generates traceability matrix linking requirements to tests to artifacts.
  • Orphaned tests (no requirement reference) in regulatory modules trigger warnings.

Cross-version and environment testing (Turn #6)

Integration tests must validate interoperability across versions and environments.

Cross-version testing (Interop):

  • N-1 compatibility: current service must work with previous schema/API version.
  • N+1 compatibility: previous service must work with current schema/API version.
  • Run before releases to prevent breaking changes.

Environment skew testing:

  • Run integration tests across varied infrastructure profiles.
  • Profiles: standard, high-latency (100ms), low-bandwidth (10 Mbps), packet-loss (1%).
  • Assert result equivalence across profiles.

Usage:

[Fact]
[Trait("Category", "Interop")]
public async Task SchemaV2_CompatibleWithV1Client()
{
    await using var v1Client = await fixture.StartVersion("v1.0.0", "EvidenceLocker");
    await using var v2Server = await fixture.StartVersion("v2.0.0", "EvidenceLocker");

    var result = await fixture.TestHandshake(v1Client, v2Server);
    Assert.True(result.IsCompatible);
}
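
An environment-skew run can be sketched the same way, as a parameterized test over the profiles listed above (SkewFixture and its members are hypothetical):

```csharp
using System.Threading.Tasks;
using Xunit;

public class EnvironmentSkewTests
{
    [Theory]
    [Trait("Category", "Integration")]
    [InlineData("standard")]
    [InlineData("high-latency")]
    [InlineData("low-bandwidth")]
    [InlineData("packet-loss")]
    public async Task ExportRun_ProducesEquivalentResults(string profile)
    {
        // Hypothetical fixture that applies the profile's latency,
        // bandwidth, and loss characteristics before the scenario runs.
        await using var env = await SkewFixture.StartAsync(profile);

        var result = await env.RunExportScenarioAsync();

        // Results must match the baseline recorded under the standard profile.
        Assert.Equal(env.BaselineDigest, result.Digest);
    }
}
```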

Time-extended and post-incident testing (Turn #6)

Long-running tests surface issues that only emerge over time. Post-incident tests prevent recurrence.

Time-extended (longevity) tests:

  • Run E2E scenarios continuously for hours to detect memory leaks, counter drift, quota exhaustion.
  • Verify memory returns to baseline after sustained load.
  • Verify connection pools do not leak under sustained load.
  • Run nightly; release-gating for critical modules.

Post-incident replay tests:

  • Every production incident (P1/P2) produces a permanent E2E regression test.
  • Test derived from replay manifest capturing exact event sequence.
  • Test includes incident metadata (ID, root cause, severity).
  • Tests tagged with [Trait("Category", "PostIncident")].

Usage:

[Fact]
[Trait("Category", "Longevity")]
[Trait("Intent", "Operational")]
public async Task ScannerWorker_NoMemoryLeakUnderLoad()
{
    var runner = new StabilityTestRunner();
    await runner.RunExtended(
        scenario: () => ProcessScanBatch(),
        duration: TimeSpan.FromHours(1),
        metrics: new StabilityMetrics(),
        ct: CancellationToken.None);

    var report = runner.GenerateReport();
    Assert.True(report.MemoryGrowthRate < 0.01, "Memory growth rate exceeds threshold");
}
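
A post-incident replay test could follow the same shape (the manifest path and the ReplayManifest/ReplayRunner helpers are hypothetical):

```csharp
using System.Threading.Tasks;
using Xunit;

public class PostIncidentTests
{
    [Fact]
    [Trait("Category", "PostIncident")]
    [Trait("Intent", "Operational")]
    public async Task Incident_QueueStall_DoesNotRecur()
    {
        // Hypothetical manifest capturing the exact event sequence from the
        // incident, plus metadata (ID, root cause, severity).
        var manifest = ReplayManifest.Load("fixtures/incidents/INC-EXAMPLE.json");

        var outcome = await ReplayRunner.RunAsync(manifest);

        // The original failure mode must not reproduce.
        Assert.False(outcome.QueueStalled);
    }
}
```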

Related documentation

  • Test strategy models: docs/technical/testing/testing-strategy-models.md
  • CI quality gates: docs/technical/testing/ci-quality-gates.md
  • TestKit usage: docs/technical/testing/testkit-usage-guide.md
  • Test coverage matrix: docs/technical/testing/TEST_COVERAGE_MATRIX.md