# CI Quality Gates

Sprint: `SPRINT_0350_0001_0001_ci_quality_gates_foundation`
Task: `QGATE-0350-009`

## Overview

StellaOps implements automated quality gates in CI to enforce:
- **Reachability Quality** - Recall/precision thresholds for vulnerability detection
- **TTFS Regression** - Time-to-First-Signal performance tracking
- **Performance SLOs** - Scan time and compute budget enforcement

These gates run as part of the `build-test-deploy.yml` workflow after the main test suite completes.

## Quality Gate Jobs

### Reachability Quality Gate

**Script:** `scripts/ci/compute-reachability-metrics.sh`
**Config:** `scripts/ci/reachability-thresholds.yaml`

Validates that the scanner meets recall/precision thresholds against the ground-truth corpus.

#### Metrics Computed

| Metric | Description | Threshold |
|--------|-------------|-----------|
| `runtime_dependency_recall` | % of runtime dep vulns detected | ≥ 95% |
| `unreachable_false_positives` | FP rate for unreachable findings | ≤ 5% |
| `reachability_underreport` | Underreporting rate | ≤ 10% |
| `os_package_recall` | % of OS package vulns detected | ≥ 92% |
| `code_vuln_recall` | % of code vulns detected | ≥ 88% |
| `config_vuln_recall` | % of config vulns detected | ≥ 85% |

#### Running Locally

```bash
# Dry run (no enforcement)
./scripts/ci/compute-reachability-metrics.sh --dry-run

# Full run against corpus
./scripts/ci/compute-reachability-metrics.sh
```

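
The thresholds in the Metrics Computed table reduce to one comparison per metric. As a rough illustration only (this is not the implementation of `compute-reachability-metrics.sh`), the sketch below encodes them in Python. The `THRESHOLDS` mapping, the `check_reachability` function, and the convention that metrics are fractions between 0 and 1 are assumptions made for the example.

```python
import operator

# Hypothetical encoding of the thresholds table: metric -> (comparison, limit).
# Values are fractions (0.95 == 95%); the real config file format is not shown here.
THRESHOLDS = {
    "runtime_dependency_recall": (operator.ge, 0.95),
    "unreachable_false_positives": (operator.le, 0.05),
    "reachability_underreport": (operator.le, 0.10),
    "os_package_recall": (operator.ge, 0.92),
    "code_vuln_recall": (operator.ge, 0.88),
    "config_vuln_recall": (operator.ge, 0.85),
}


def check_reachability(metrics):
    """Return the sorted names of metrics that are missing or violate their threshold."""
    failures = []
    for name, (compare, limit) in THRESHOLDS.items():
        value = metrics.get(name)
        if value is None or not compare(value, limit):
            failures.append(name)
    return sorted(failures)
```

Treating a missing metric as a failure is a choice made in this sketch so that a silently skipped measurement cannot pass the gate.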
### TTFS Regression Gate

**Script:** `scripts/ci/compute-ttfs-metrics.sh`
**Baseline:** `bench/baselines/ttfs-baseline.json`

Detects performance regressions in Time-to-First-Signal.

#### Metrics Computed

| Metric | Description | Threshold |
|--------|-------------|-----------|
| `ttfs_p50_ms` | P50 time to first signal | ≤ baseline + 10% |
| `ttfs_p95_ms` | P95 time to first signal | ≤ baseline + 15% |
| `ttfs_max_ms` | Maximum TTFS | ≤ baseline + 25% |

#### Baseline Format

```json
{
  "ttfs_p50_ms": 450,
  "ttfs_p95_ms": 1200,
  "ttfs_max_ms": 3000,
  "measured_at": "2025-12-16T00:00:00Z",
  "sample_count": 1000
}
```

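
To make the "baseline + N%" thresholds concrete, here is a minimal sketch of the comparison, assuming the baseline file has the shape shown above. The `TOLERANCE_PCT` mapping and the `ttfs_regressions` function are hypothetical names for illustration; the actual logic in `compute-ttfs-metrics.sh` may differ.

```python
import json

# Allowed regression over the baseline, in percent, per metric (from the table above).
TOLERANCE_PCT = {"ttfs_p50_ms": 10, "ttfs_p95_ms": 15, "ttfs_max_ms": 25}


def ttfs_regressions(baseline, current):
    """Return {metric: (measured, allowed_limit)} for metrics above baseline + tolerance."""
    regressions = {}
    for metric, pct in TOLERANCE_PCT.items():
        limit = baseline[metric] * (100 + pct) / 100
        if current[metric] > limit:
            regressions[metric] = (current[metric], limit)
    return regressions


# Example baseline with the same fields as the documented format.
baseline = json.loads("""
{"ttfs_p50_ms": 450, "ttfs_p95_ms": 1200, "ttfs_max_ms": 3000,
 "measured_at": "2025-12-16T00:00:00Z", "sample_count": 1000}
""")
```

With this baseline, the allowed limits work out to 495 ms (P50), 1380 ms (P95), and 3750 ms (max); a measurement exactly at the limit still passes.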
### Performance SLO Gate

**Script:** `scripts/ci/enforce-performance-slos.sh`
**Config:** `scripts/ci/performance-slos.yaml`

Enforces scan time and compute budget SLOs.

#### SLOs Enforced

| SLO | Description | Target |
|-----|-------------|--------|
| `scan_time_p50_ms` | P50 scan time | ≤ 120,000 ms (2 min) |
| `scan_time_p95_ms` | P95 scan time | ≤ 300,000 ms (5 min) |
| `memory_peak_mb` | Peak memory usage | ≤ 2048 MB |
| `cpu_seconds` | Total CPU time | ≤ 120 seconds |

## Workflow Integration

Quality gates are integrated into the main CI workflow:

```yaml
# .gitea/workflows/build-test-deploy.yml

quality-gates:
  runs-on: ${{ vars.LINUX_RUNNER_LABEL || 'ubuntu-latest' }}
  needs: build-test
  steps:
    - name: Reachability quality gate
      run: ./scripts/ci/compute-reachability-metrics.sh

    - name: TTFS regression gate
      run: ./scripts/ci/compute-ttfs-metrics.sh

    - name: Performance SLO gate
      run: ./scripts/ci/enforce-performance-slos.sh --warn-only
```

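
The workflow passes `--warn-only` to the Performance SLO step. The sketch below shows one plausible reading of that mode: violations are still reported, but the step exits with status 0. This is an assumption about the flag's behaviour, not a description of `enforce-performance-slos.sh`; `SLO_LIMITS` and `enforce_slos` are hypothetical names.

```python
# SLO targets from the "SLOs Enforced" table; every target is an upper bound.
SLO_LIMITS = {
    "scan_time_p50_ms": 120_000,
    "scan_time_p95_ms": 300_000,
    "memory_peak_mb": 2048,
    "cpu_seconds": 120,
}


def enforce_slos(measured, warn_only=False):
    """Return (exit_code, messages); violations fail the run only when warn_only is False."""
    messages = []
    for name, limit in SLO_LIMITS.items():
        value = measured[name]
        if value > limit:
            level = "WARN" if warn_only else "FAIL"
            messages.append(f"{level}: {name}={value} exceeds {limit}")
    exit_code = 1 if messages and not warn_only else 0
    return exit_code, messages
```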
## Failure Modes

### Hard Failure (Blocks Merge)

- Reachability recall below threshold
- TTFS regression exceeds 25%
- Memory budget exceeded by 50%

### Soft Failure (Warning Only)

- Minor TTFS regression (< 15%)
- Memory near budget limit
- Missing baseline data (new fixtures)

## Adding New Quality Gates

1. Create computation script in `scripts/ci/`
2. Add threshold configuration (YAML or JSON)
3. Integrate into workflow as a new step
4. Update this documentation
5. Add to sprint tracking

## Troubleshooting

### Gate Fails on PR but Passes on Main

Check for:
- Non-deterministic test execution
- Timing-sensitive assertions
- Missing test fixtures in PR branch

### Baseline Drift

If baselines become stale:

```bash
# Regenerate baselines
./scripts/ci/compute-ttfs-metrics.sh --update-baseline
./scripts/ci/compute-reachability-metrics.sh --update-baseline
```

---

## Turn #6 Quality Gates (2026-01-27)

Source: Testing Enhancements (Automation Turn #6)
Sprint: `docs/implplan/SPRINT_0127_002_DOCS_testing_enhancements_turn6.md`

### Intent Violation Gate

**Purpose:** Detect test changes that violate declared intent categories.

**Script:** `scripts/ci/check-intent-violations.sh`

| Check | Description | Action |
|-------|-------------|--------|
| Intent missing | Non-trivial test without Intent trait | Warning (regulatory modules: Error) |
| Intent contradiction | Test behavior contradicts declared intent | Error |
| Intent coverage drop | Module intent coverage decreased | Warning |

**Enforcement:**
- PR-gating for regulatory modules (Policy, Authority, Signer, Attestor, EvidenceLocker).
- Warning-only for other modules (to allow gradual adoption).

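
The Action column above depends on both the check and the module. A minimal sketch of that mapping follows; the check identifiers (`intent_missing` and so on) and the `intent_severity` function are hypothetical, and only the module list and the severities come from the table and enforcement notes.

```python
# Regulatory modules as listed under Enforcement.
REGULATORY_MODULES = {"Policy", "Authority", "Signer", "Attestor", "EvidenceLocker"}


def intent_severity(check, module):
    """Map a (check, module) pair to "warning" or "error" following the Action column."""
    if check == "intent_missing":
        # Escalated to an error only for regulatory modules.
        return "error" if module in REGULATORY_MODULES else "warning"
    if check == "intent_contradiction":
        return "error"
    if check == "intent_coverage_drop":
        return "warning"
    raise ValueError(f"unknown check: {check}")
```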
### Observability Contract Gate

**Purpose:** Validate OTel spans, structured logs, and metrics contracts.

**Script:** `scripts/ci/check-observability-contracts.sh`

| Check | Description | Threshold |
|-------|-------------|-----------|
| Required spans missing | Core operation spans not emitted | Error |
| Span attribute missing | Required attributes not present | Error |
| High cardinality attribute | Label cardinality exceeds limit | Warning (> 50), Error (> 100) |
| PII in logs | Sensitive data patterns in log output | Error |
| Missing log fields | Required fields not present | Warning |

**Enforcement:**
- PR-gating for all W1 (WebService) modules.
- Run as part of contract test lane.

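
The high-cardinality check can be pictured as counting distinct values per attribute and comparing the count with the two limits in the table. The sketch below assumes attribute samples are available as `(name, value)` pairs; that input shape and the `cardinality_findings` function are illustrative assumptions, not the interface of `check-observability-contracts.sh`.

```python
from collections import defaultdict

WARN_LIMIT = 50    # more than 50 distinct values: warning
ERROR_LIMIT = 100  # more than 100 distinct values: error


def cardinality_findings(samples):
    """samples: iterable of (attribute_name, value). Return {attribute: "warning" | "error"}."""
    distinct = defaultdict(set)
    for name, value in samples:
        distinct[name].add(value)

    findings = {}
    for name, values in distinct.items():
        if len(values) > ERROR_LIMIT:
            findings[name] = "error"
        elif len(values) > WARN_LIMIT:
            findings[name] = "warning"
    return findings
```

Attributes at or below the warning limit are omitted from the result, so an empty dictionary means the check passed.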
### Evidence Chain Gate

**Purpose:** Verify requirement -> test -> artifact traceability.

**Script:** `scripts/ci/check-evidence-chain.sh`

| Check | Description | Action |
|-------|-------------|--------|
| Orphaned test | Regulatory test without Requirement attribute | Warning |
| Artifact hash drift | Artifact hash changed unexpectedly | Error |
| Artifact non-deterministic | Multiple runs produce different artifacts | Error |
| Traceability gap | Requirement without test coverage | Warning |

**Enforcement:**
- PR-gating for regulatory modules.
- Traceability report generated as CI artifact.

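
The two hash-based checks (drift and non-determinism) can be sketched with a content digest. SHA-256 is an assumption here, since the hash algorithm is not specified above, and `artifact_digest`, `has_drifted`, and `is_deterministic` are hypothetical helper names.

```python
import hashlib


def artifact_digest(data: bytes) -> str:
    """Hex digest of the artifact bytes (SHA-256 assumed for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def has_drifted(data: bytes, expected_digest: str) -> bool:
    """True when the artifact no longer matches its recorded digest."""
    return artifact_digest(data) != expected_digest


def is_deterministic(build, runs=3):
    """Call build() `runs` times and report whether every run yields the same digest."""
    digests = {artifact_digest(build()) for _ in range(runs)}
    return len(digests) == 1
```

Here `build` stands for any zero-argument callable that returns the artifact bytes, for example a function that re-runs the generator and reads the output file.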
### Longevity Gate (Release Gating)

**Purpose:** Detect memory leaks, connection leaks, and counter drift under sustained load.

**Script:** `scripts/ci/run-longevity-gate.sh`
**Cadence:** Nightly + pre-release

| Metric | Description | Threshold |
|--------|-------------|-----------|
| Memory growth rate | Memory increase per hour | ≤ 1% |
| Connection pool leaks | Unreturned connections | 0 |
| Counter drift | Counter value outside expected range | Error |
| GC pressure | Gen2 collections per hour | ≤ 10 |

**Enforcement:**
- Not PR-gating (too slow).
- Release-gating: longevity tests must pass before release.
- Results stored for trend analysis.

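
As an illustration of the numeric thresholds in the table, the sketch below computes an average memory growth rate from timestamped samples and combines it with the connection and GC limits. The sampling format, both function names, and the choice to measure growth relative to the first sample are assumptions; a real gate might fit a trend line instead. Counter drift is left out because its expected range is not defined here.

```python
def memory_growth_pct_per_hour(samples):
    """samples: list of (elapsed_hours, memory_mb) ordered by time.

    Returns average growth per hour as a percentage of the first sample.
    """
    (t0, m0), (t1, m1) = samples[0], samples[-1]
    if t1 <= t0 or m0 <= 0:
        raise ValueError("need increasing timestamps and a positive starting value")
    return (m1 - m0) / m0 / (t1 - t0) * 100


def longevity_passes(samples, leaked_connections, gen2_per_hour):
    """Apply the table thresholds: growth <= 1%/h, zero leaks, Gen2 collections <= 10/h."""
    return (
        memory_growth_pct_per_hour(samples) <= 1.0
        and leaked_connections == 0
        and gen2_per_hour <= 10
    )
```

For example, growing from 1000 MB to 1040 MB over 8 hours is 4% in total, or roughly 0.5% per hour, which is within the limit.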
### Interop Gate (Release Gating)

**Purpose:** Validate cross-version and environment compatibility.

**Script:** `scripts/ci/run-interop-gate.sh`
**Cadence:** Weekly + pre-release

| Check | Description | Threshold |
|-------|-------------|-----------|
| N-1 compatibility | Current server with previous client | Must pass |
| N+1 compatibility | Previous server with current client | Must pass |
| Environment equivalence | Same results across infra profiles | ≤ 5% deviation |

**Profiles Tested:**
- `standard`: default Testcontainers configuration.
- `high-latency`: +100ms network latency.
- `low-bandwidth`: 10 Mbps limit.
- `packet-loss`: 1% packet loss (Linux only).

**Enforcement:**
- Not PR-gating (requires multi-version infrastructure).
- Release-gating: interop tests must pass before release.

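
The environment-equivalence threshold can be read as a bound on how far any profile's result may deviate from a reference profile. The sketch below assumes each profile produces a single numeric result and that `standard` is the reference; both are assumptions for illustration, as is the `max_deviation_pct` function.

```python
def max_deviation_pct(results, reference="standard"):
    """results: {profile_name: numeric_result}.

    Returns the largest deviation from the reference profile, in percent.
    """
    base = results[reference]
    if base == 0:
        raise ValueError("reference result must be non-zero")
    return max(abs(value - base) * 100 / abs(base) for value in results.values())


# Illustrative numbers only: one result per tested profile.
example = {"standard": 200, "high-latency": 204, "low-bandwidth": 196, "packet-loss": 212}
gate_passes = max_deviation_pct(example) <= 5
```

In this example the `packet-loss` profile deviates by 6% from `standard`, so `gate_passes` is `False`.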
### Post-Incident Gate

**Purpose:** Ensure incident-derived tests are maintained and passing.

**Script:** `scripts/ci/check-post-incident-tests.sh`

| Check | Description | Action |
|-------|-------------|--------|
| Incident test failing | PostIncident test not passing | Error (P1/P2), Warning (P3) |
| Incident test missing metadata | Missing IncidentId or RootCause | Warning |
| Incident coverage | P1/P2 incidents without tests | Error |

**Enforcement:**
- PR-gating: P1/P2 incident tests must pass.
- Release-gating: all incident tests must pass.

---

## Gate Summary by Gating Level

### PR-Gating (Must Pass for Merge)

| Gate | Scope |
|------|-------|
| Reachability Quality | All |
| TTFS Regression | All |
| Intent Violation | Regulatory modules |
| Observability Contract | W1 modules |
| Evidence Chain | Regulatory modules |
| Post-Incident (P1/P2) | All |

### Release-Gating (Must Pass for Release)

| Gate | Scope |
|------|-------|
| All PR gates | All |
| Longevity | Worker modules |
| Interop | Schema/API-dependent modules |
| Post-Incident (all) | All |
| Performance SLO | All |

### Warning-Only (Informational)

| Gate | Scope |
|------|-------|
| Intent missing | Non-regulatory modules |
| Intent coverage drop | All |
| Orphaned test | All |
| Traceability gap | All |

---

## Related Documentation

- [Test Suite Overview](../TEST_SUITE_OVERVIEW.md)
- [Testing Strategy Models](./testing-strategy-models.md)
- [Test Catalog](./TEST_CATALOG.yml)
- [Reachability Corpus Plan](../reachability/corpus-plan.md)
- [Performance Workbook](../PERFORMANCE_WORKBOOK.md)
- [Testing Quality Guardrails](./testing-quality-guardrails-implementation.md)
- [Testing Practices](../../code-of-conduct/TESTING_PRACTICES.md)