# 19 · Test‑Suite Overview — **Stella Ops**

*(v2.0 — 12 Jul 2025)*

> **Purpose** — Describe the **multi‑layer automated‑test strategy** that guards Stella Ops’ five‑second performance promise, security posture and API stability, and show how each layer maps to CI gates and release criteria.

---

## 0 Table of Contents

1. Test‑pyramid at a glance
2. Layer definitions & tooling
3. Repository layout
4. CI workflows
5. Quality gates & budgets
6. Evidence retention
7. Developer quick‑start
8. Flaky‑test triage & escalation
9. Change log

---

## 1 Test‑pyramid at a glance

| Layer                  | Framework(s)                | Scope                                                     | CI frequency |
| ---------------------- | --------------------------- | --------------------------------------------------------- | ------------ |
| **Unit**               | xUnit + FluentAssertions    | Pure C# methods, guard clauses, mapping                   | Every PR     |
| **Mutation**           | **Stryker.NET**             | Critical algorithm branches                               | Nightly      |
| **Static analysis**    | **CodeQL**, **Semgrep**     | OWASP, injection, secrets                                 | Every PR     |
| **Integration**        | Testcontainers + xUnit      | Redis, Trivy exec, plug‑in hot‑load                       | Every PR     |
| **Quota / throttle**   | Testcontainers + Clock‑mock | 333‑scan counter, 5 s & 60 s retry‑after headers          | Every PR     |
| **End‑to‑End (UI)**    | **Playwright C#**           | Login, scan list, mute flow                               | Merge→main   |
| **Performance**        | Hyperfine + K6              | P95 latency, 40 rps throughput                            | Nightly      |
| **Security DAST**      | OWASP ZAP baseline          | TLS headers, auth, XSS                                    | Nightly + RC |
| **Chaos / Resilience** | **Pumba** & Toxiproxy       | Redis latency, container kill                             | Weekly       |
| **Compliance smoke**   | Spectral + JSON‑Schema      | SBOM & API payloads                                       | Every PR     |
| **Token validity**     | xUnit + ClockMock           | Expiry warning, OUK update refresh, `/token/offline` flow | Every PR     |

---

## 2 Layer definitions & tooling

### 2.1 Unit

* Target ≥ 80 % **line** and ≥ 60 % **branch** coverage (`coverlet` + ReportGenerator).
* Naming: `Method_ShouldExpected_WhenCondition`.

### 2.2 Mutation

* **Stryker.NET** runs only on projects tagged `critical-logic=true` in `Directory.Build.props`.
* Threshold: target ≥ 60 % mutation score; the build goes red below 55 %.

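To make the interaction of the two thresholds concrete, here is a minimal sketch that maps a mutation score to a gate colour. The function name and the `amber` label for scores between 55 % and 60 % are assumptions; the text above only fixes the 60 % target and the 55 % red line.

```python
def classify_mutation_score(score: float,
                            target: float = 60.0,
                            red_below: float = 55.0) -> str:
    """Map a mutation score (0-100) to a gate colour.

    'amber' is a hypothetical label for the band between the red line and
    the target; the overview does not say how that band is handled.
    """
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"score out of range: {score}")
    if score < red_below:
        return "red"      # build fails
    if score < target:
        return "amber"    # below target, but not yet a red build
    return "green"
```

A nightly job could call this on the score reported by Stryker.NET and fail only on `"red"`, which is one possible reading of the policy rather than the documented behaviour.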
### 2.3 Integration

* `RedisTestcontainer`, `TrivyServerTestcontainer`, `TestcontainersNetwork` for realistic wiring.
* Each test cleans keys and volumes; parallelisable.

* **Quota & throttle tests (new)** — spin up a Redis container, fix the system clock to just before UTC midnight, and hammer `/scan` with a stub token to validate:
  1. Counter hits **200** → header `X-Stella-Quota-Remaining: 133`; banner socket event emitted; a 5 s delay is added.
  2. Counter hits **333** → a 60 s delay is added.
  3. At the UTC‑midnight rollover the key expires → counter resets to 0.

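The behaviour these tests validate can be sketched with an in-memory counter and an injected clock. This is an illustration only: `QuotaCounter` and `record_scan` are hypothetical names, the real counter lives in Redis, and treating both thresholds as inclusive is an assumption.

```python
from datetime import datetime, timedelta, timezone

DAILY_LIMIT = 333      # scans per token per UTC day (from the text)
SOFT_THRESHOLD = 200   # the 5 s delay starts here (from the text)
SOFT_DELAY_S = 5
HARD_DELAY_S = 60


class QuotaCounter:
    """In-memory stand-in for the Redis counter (hypothetical)."""

    def __init__(self) -> None:
        self._day = None
        self._count = 0

    def record_scan(self, now: datetime) -> tuple[int, int]:
        """Count one scan at `now` (timezone-aware).

        Returns (remaining_quota, delay_seconds).
        """
        day = now.astimezone(timezone.utc).date()
        if day != self._day:           # UTC-midnight rollover: key expired
            self._day, self._count = day, 0
        self._count += 1
        remaining = max(0, DAILY_LIMIT - self._count)
        if self._count >= DAILY_LIMIT:
            return remaining, HARD_DELAY_S
        if self._count >= SOFT_THRESHOLD:
            return remaining, SOFT_DELAY_S
        return remaining, 0


# Frozen clock just before UTC midnight, as in the test description.
clock = datetime(2025, 7, 12, 23, 59, 0, tzinfo=timezone.utc)
quota = QuotaCounter()
results = [quota.record_scan(clock) for _ in range(DAILY_LIMIT)]
after_rollover = quota.record_scan(clock + timedelta(minutes=2))
```

Passing the time in as an argument, instead of reading the system clock inside the counter, is what lets a test jump across midnight without sleeping.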
### 2.3.1 Quota / throttle layer (explicit)

* Uses the same fixture as the quota & throttle tests in § 2.3 but runs in isolation to keep CI time predictable.
* Fails the pipeline if **any** of the four behaviours above mis‑fires.

### 2.4 End‑to‑End

* API suite asserts presence of `X-Stella-Quota-Remaining` on every successful `/scan`.
* API suite uses **async httpx** for accurate latency numbers.
* UI suite uses **Playwright** headless Chromium; Lighthouse a11y snapshot recorded.

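The header assertion from the first bullet could be factored into a small helper like the one below. It is a sketch that uses only the standard library; the helper name is hypothetical, and requiring a non-negative integer value is an assumption based on the `133` example in § 2.3.

```python
from typing import Mapping

QUOTA_HEADER = "X-Stella-Quota-Remaining"  # plain ASCII hyphens on the wire


def quota_remaining(status_code: int, headers: Mapping[str, str]) -> int:
    """Return the remaining quota from a successful `/scan` response.

    Raises AssertionError if the response is not 2xx, the header is
    missing, or its value is not a non-negative integer.
    """
    assert 200 <= status_code < 300, f"unexpected status {status_code}"
    # HTTP header names are case-insensitive, so compare lower-cased keys.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get(QUOTA_HEADER.lower())
    assert value is not None, f"{QUOTA_HEADER} missing"
    text = value.strip()
    assert text.isascii() and text.isdigit(), f"bad {QUOTA_HEADER}: {value!r}"
    return int(text)
```

In an httpx-based test this would be called as `quota_remaining(response.status_code, response.headers)`, since httpx responses expose both attributes.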
### 2.5 Performance

* Hyperfine measures CLI workflows (`SBOM_LOCAL`, `SBOM_REMOTE`, `IMAGE_WARM`).
* **K6** hits `/scan` at 40 rps for 3 min; checks P95 ≤ 5 s and error‑rate = 0.
* **PHASE QUOTA_WAIT** benchmark:
  * ≤ 5 s median for the first 30 blocked requests (soft back‑off).
  * Exactly 60 s wall‑clock time for the hard wait‑wall.

### 2.6 Security (DAST + SAST)

* SAST: **CodeQL** (GitHub native) + **Semgrep OSS** ruleset.
* DAST: **ZAP baseline** spider + passive rules; fails on High risk alerts.

### 2.7 Chaos / Resilience

* **Pumba** randomly kills the Trivy side‑car; test asserts queue retry.
* **Toxiproxy** injects 150 ms latency on Redis; perf budget still ≤ 6 s.

---

## 3 Repository layout

```text
tests/
├─ unit/                     # *.Unit.csproj
├─ mutation/stryker.conf.json
├─ integration/              # *.Integration.csproj
│   └─ fixtures/
├─ e2e/
│   ├─ api/pytest/           # test_*.py
│   └─ ui/playwright/        # *.spec.ts
├─ perf/
│   ├─ compose-perf.yml
│   ├─ hyperfine/
│   └─ k6/
├─ security/
│   ├─ zap-baseline.conf
│   └─ semgrep/
└─ chaos/
    ├─ toxiproxy/
    └─ pumba/
```

Tests mirror the module namespaces; each src project owns a matching test project.

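The "each src project owns a matching test project" rule lends itself to an automated check. The sketch below assumes a unit-test project is named `<Project>.Unit.csproj`, following the `*.Unit.csproj` pattern in the tree; the function name and the example paths in the usage note are hypothetical.

```python
from pathlib import PurePosixPath

UNIT_SUFFIX = ".Unit.csproj"


def missing_unit_projects(src_projects: list[str],
                          test_projects: list[str]) -> list[str]:
    """Return the src project names that have no `<name>.Unit.csproj`."""
    tested = {
        PurePosixPath(path).name.removesuffix(UNIT_SUFFIX)
        for path in test_projects
        if path.endswith(UNIT_SUFFIX)
    }
    # `.stem` drops only the final `.csproj`, so dotted names survive.
    names = (PurePosixPath(path).stem for path in src_projects)
    return sorted(name for name in names if name not in tested)
```

For example, given `src/Alpha/Alpha.csproj` and `src/Beta.Core/Beta.Core.csproj` with only `tests/unit/Alpha.Unit/Alpha.Unit.csproj` present, the function reports `Beta.Core` as uncovered.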
## 4 CI workflows

| File         | Trigger       | Stages                                       |
| ------------ | ------------- | -------------------------------------------- |
| ci.yml       | Push / PR     | Lint → Unit → Static analysis → Integration  |
| e2e.yml      | Merge→main    | Compose stack → API+UI Playwright            |
| perf.yml     | Nightly       | Hyperfine + K6; update Grafana JSON          |
| security.yml | Nightly       | ZAP baseline, Trivy FS, CodeQL               |
| mutation.yml | Nightly       | Stryker.NET; comment PR if < threshold       |
| chaos.yml    | Weekly (cron) | Toxiproxy + Pumba scenarios                  |
| release.yml  | Tag           | Run all above + evidence bundling            |

Failure policy: any red gate blocks merge; nightly failures ping #stella-ci.

## 5 Quality gates & budgets

| Metric                                 | Threshold | Source                 | Maps to KPI            |
| -------------------------------------- | --------- | ---------------------- | ---------------------- |
| Line coverage                          | ≥ 80 %    | Unit, Integration      | Maintainability        |
| Mutation score                         | ≥ 60 %    | Stryker                | Defect escape          |
| P95 SBOM‑first                         | ≤ 5 s     | Hyperfine              | Product promise        |
| P95 QUOTA_WAIT (soft)                  | ≤ 10 s    | Hyperfine + Clock‑mock | Predictable throttling |
| Hard wait‑wall accuracy                | 60 ± 1 s  | Hyperfine              | Compliance with spec   |
| P95 image‑unpack                       | ≤ 10 s    | Hyperfine              | SRS FR‑IMG‑1           |
| /scan error‑rate                       | 0         | K6                     | Reliability            |
| ZAP High alerts                        | 0         | ZAP JSON               | Security NFR           |
| Trivy Critical CVEs in release SBOM    | 0         | Trivy FS               | NFR‑SEC‑1              |
| Offline token expiry warning lead‑time | ≥ 7 days  | Token tests            |                        |

Coverage & perf budgets live in `tests/budgets/*.json`; CI actions fail on regression.

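A budget check of this kind can be sketched as follows. The JSON layout, the metric keys and both function names are assumptions (the actual schema of the budget files is not shown here); only the threshold values are taken from the table above. P95 is computed with the nearest-rank method, which may differ from what Hyperfine or K6 report.

```python
import json


def p95(samples: list[float]) -> float:
    """95th percentile by the nearest-rank method."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100   # ceil(0.95 * n), 1-based
    return ordered[rank - 1]


def check_budgets(measured: dict, budgets: dict) -> list[str]:
    """Return one message per violated budget; empty list means all pass.

    Each budget is {"op": "<=" or ">=", "limit": number} (hypothetical).
    """
    failures = []
    for metric, rule in budgets.items():
        value, limit = measured[metric], rule["limit"]
        ok = value <= limit if rule["op"] == "<=" else value >= limit
        if not ok:
            failures.append(f"{metric}: {value} violates {rule['op']} {limit}")
    return failures


# Thresholds copied from the table; the file format is an assumption.
BUDGETS = json.loads("""
{
  "line_coverage_pct":  {"op": ">=", "limit": 80},
  "mutation_score_pct": {"op": ">=", "limit": 60},
  "p95_sbom_first_s":   {"op": "<=", "limit": 5},
  "scan_error_rate":    {"op": "<=", "limit": 0}
}
""")
```

A CI step would compute the measured values from the run's artefacts, call `check_budgets`, and exit non-zero if the returned list is not empty.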
## 6 Evidence retention

| Artefact                 | Retention      | Storage               |
| ------------------------ | -------------- | --------------------- |
| Hyperfine & K6 CSV       | 18 months      | GitHub artefacts → S3 |
| Mutation reports         | 6 months       | S3                    |
| ZAP & Trivy SARIF        | 18 months      | GitHub Security tab   |
| Playwright videos        | Last 50 builds | MinIO                 |
| Test logs (JUnit/Allure) | 12 months      | S3, lifecycle policy  |

## 7 Developer quick‑start

```bash
# Bring up the full stack for e2e on a laptop
docker compose -f tests/e2e/compose-core.yml up -d

# Run unit + integration
dotnet test --collect:"XPlat Code Coverage"

# API e2e (the subshell keeps the working directory at the repo root)
(cd tests/e2e/api && pytest -q)

# UI e2e
(cd tests/e2e/ui && npx playwright install && npm test)
```

## 8 Flaky‑test triage & escalation

1. Label the failing test `flaky` and open a GitHub Discussion.
2. After 3 consecutive nightly failures, auto‑page <ops@stella-ops.org>.
3. Root‑cause within the next sprint or quarantine behind a feature flag (max 2 weeks).

*Token‑expiry tests cannot be quarantined* — they guard offline operability.

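The escalation rules can be expressed as a small decision function. This is a sketch of one possible encoding: the `FlakyTest` fields and the action strings are hypothetical, while the numeric limits come from the rules above.

```python
from dataclasses import dataclass

PAGE_AFTER_CONSECUTIVE_FAILURES = 3   # "3 consecutive nightly failures"
MAX_QUARANTINE_DAYS = 14              # "max 2 weeks"


@dataclass
class FlakyTest:
    name: str
    consecutive_nightly_failures: int
    days_in_quarantine: int = 0
    guards_token_expiry: bool = False


def triage(test: FlakyTest) -> list[str]:
    """Return the actions the policy calls for (action names are made up)."""
    actions = ["label:flaky", "open-discussion"]
    if test.consecutive_nightly_failures >= PAGE_AFTER_CONSECUTIVE_FAILURES:
        actions.append("page-ops")
    if test.guards_token_expiry:
        # Token-expiry tests may never be quarantined.
        actions.append("fix-now:no-quarantine")
    elif test.days_in_quarantine > MAX_QUARANTINE_DAYS:
        actions.append("quarantine-expired")
    else:
        actions.append("root-cause-or-quarantine")
    return actions
```

Keeping the token-expiry exception as an explicit branch makes it hard to quarantine such a test by accident.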
## 9 Change log

| Version | Date       | Notes                                                                                                                      |
| ------- | ---------- | -------------------------------------------------------------------------------------------------------------------------- |
| v2.0    | 2025‑07‑12 | Full overhaul: mutation tests, CodeQL/Semgrep, chaos layer, role‑based escalation, perf/security budgets aligned with SRS. |
| v1.0    | 2025‑07‑09 | Original minimal overview                                                                                                  |

(End of Test‑Suite Overview v2.0)