git.stella-ops.org/docs/technical/testing/ci-lane-integration.md
2026-01-07 10:23:21 +02:00

# CI Lane Integration Guide
This guide explains how to integrate standardized test-lane filtering into CI workflows.
## Overview
StellaOps uses a lane-based test categorization system with six standardized lanes:
- **Unit**: Fast, isolated, deterministic tests (PR-gating)
- **Contract**: API contract stability tests (PR-gating)
- **Integration**: Service and storage tests with Testcontainers (PR-gating)
- **Security**: AuthZ, input validation, negative tests (PR-gating)
- **Performance**: Benchmarks and regression thresholds (optional/scheduled)
- **Live**: External API smoke tests (opt-in only, never PR-gating)
## Using Lane Filters in CI
### Using the Test Runner Script
The recommended approach is to use `scripts/test-lane.sh`:
```yaml
- name: Run Unit lane tests
  run: |
    chmod +x scripts/test-lane.sh
    ./scripts/test-lane.sh Unit \
      --logger "trx;LogFileName=unit-tests.trx" \
      --results-directory ./test-results \
      --verbosity normal
```
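The contents of `scripts/test-lane.sh` are not shown in this guide. As a rough mental model only, a wrapper of this kind could validate the lane name and forward the remaining arguments to `dotnet test`. The function names `lane_filter` and `run_lane` are hypothetical and are not taken from the real script.

```shell
# Hypothetical sketch; the real scripts/test-lane.sh may differ.

# Print the dotnet test filter expression for a lane, or fail on an unknown name.
lane_filter() {
  case "$1" in
    Unit|Contract|Integration|Security|Performance|Live)
      printf 'Lane=%s\n' "$1"
      ;;
    *)
      echo "Unknown lane: $1" >&2
      return 1
      ;;
  esac
}

# Run one lane, forwarding extra arguments (logger, results directory, ...)
# to dotnet test unchanged.
run_lane() {
  local lane filter
  lane="$1"
  shift
  filter="$(lane_filter "$lane")" || return 1
  dotnet test --filter "$filter" "$@"
}

# Example (not executed here):
#   run_lane Unit --results-directory ./test-results
```

Rejecting unknown lane names up front avoids a silent run in which a misspelled lane matches no tests.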
### Direct dotnet test Filtering
Alternatively, use `dotnet test` with lane filters directly:
```yaml
- name: Run Integration lane tests
  run: |
    dotnet test \
      --filter "Lane=Integration" \
      --configuration Release \
      --logger "trx;LogFileName=integration-tests.trx" \
      --results-directory ./test-results
```
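The `--filter` option of `dotnet test` also accepts expressions joined with `|` (logical OR), so several lanes can be selected in one invocation. The helper below is a small sketch of building such an expression; `lanes_filter` is a hypothetical name, and combining lanes this way is assumed to be useful mainly for local runs, since this guide recommends a separate CI job per lane.

```shell
# Join lane names into a single filter expression: "Lane=A|Lane=B|...".
lanes_filter() {
  local expr="" lane
  for lane in "$@"; do
    # Prepend "<previous>|" only when an earlier lane has already been added.
    expr="${expr:+$expr|}Lane=$lane"
  done
  printf '%s\n' "$expr"
}

# Example (not executed here):
#   dotnet test --filter "$(lanes_filter Unit Contract)" --configuration Release
```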
## Lane-Based Workflow Pattern
### Full Workflow Example
See `.gitea/workflows/test-lanes.yml` for a complete reference implementation.
Key features:
- **Separate jobs per lane** for parallel execution
- **PR-gating lanes** run on all PRs (Unit, Contract, Integration, Security)
- **Optional lanes** run on schedule or manual trigger (Performance, Live)
- **Test results summary** aggregates all lane results
### Job Structure
```yaml
unit-tests:
  name: Unit Tests
  runs-on: ubuntu-22.04
  timeout-minutes: 15
  steps:
    - uses: actions/checkout@v4
    - name: Setup .NET
      uses: actions/setup-dotnet@v4
      with:
        dotnet-version: '10.0.100'
    - name: Build
      run: dotnet build src/StellaOps.sln --configuration Release
    - name: Run Unit lane
      run: ./scripts/test-lane.sh Unit --results-directory ./test-results
    - name: Upload results
      if: always()  # Upload results even when the lane fails
      uses: actions/upload-artifact@v4
      with:
        name: unit-test-results
        path: ./test-results
```
## Lane Execution Guidelines
### Unit Lane
- **Timeout**: 10-15 minutes
- **Dependencies**: None (no I/O, no network, no databases)
- **PR gating**: ✅ Required
- **Characteristics**: Deterministic, fast, offline
### Contract Lane
- **Timeout**: 5-10 minutes
- **Dependencies**: None (schema validation only)
- **PR gating**: ✅ Required
- **Characteristics**: OpenAPI/schema validation, no external calls
### Integration Lane
- **Timeout**: 20-30 minutes
- **Dependencies**: Testcontainers (Postgres, Valkey)
- **PR gating**: ✅ Required
- **Characteristics**: End-to-end service flows, database tests
### Security Lane
- **Timeout**: 15-20 minutes
- **Dependencies**: Testcontainers (if needed for auth tests)
- **PR gating**: ✅ Required
- **Characteristics**: RBAC, injection prevention, negative tests
### Performance Lane
- **Timeout**: 30-45 minutes
- **Dependencies**: Baseline data, historical metrics
- **PR gating**: ❌ Optional (scheduled/manual)
- **Characteristics**: Benchmarks, regression thresholds
### Live Lane
- **Timeout**: 15-20 minutes
- **Dependencies**: External APIs, upstream services
- **PR gating**: ❌ Never (opt-in only)
- **Characteristics**: Smoke tests, connector validation
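The per-lane guidelines above can be condensed into a small lookup, for example for a script that validates workflow settings. This is only a sketch: the function names are hypothetical, and the timeout returned is the upper bound of each documented range.

```shell
# Upper bound of the documented timeout range for a lane, in minutes.
lane_timeout_minutes() {
  case "$1" in
    Contract)      echo 10 ;;
    Unit)          echo 15 ;;
    Security|Live) echo 20 ;;
    Integration)   echo 30 ;;
    Performance)   echo 45 ;;
    *) echo "Unknown lane: $1" >&2; return 1 ;;
  esac
}

# Succeeds only for lanes this guide marks as required on pull requests.
is_pr_gating() {
  case "$1" in
    Unit|Contract|Integration|Security) return 0 ;;
    *) return 1 ;;
  esac
}

# Example:
#   is_pr_gating Live || echo "Live is not PR-gating"
```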
## Migration from Per-Project to Lane-Based
### Before (Per-Project)
```yaml
- name: Run Concelier tests
  run: dotnet test src/Concelier/StellaOps.Concelier.sln
- name: Run Authority tests
  run: dotnet test src/Authority/StellaOps.Authority.sln
- name: Run Scanner tests
  run: dotnet test src/Scanner/StellaOps.Scanner.sln
```
### After (Lane-Based)
```yaml
- name: Run Unit lane
  run: ./scripts/test-lane.sh Unit
- name: Run Integration lane
  run: ./scripts/test-lane.sh Integration
- name: Run Security lane
  run: ./scripts/test-lane.sh Security
```
**Benefits**:
- Run all unit tests across all modules in parallel
- Clear separation of concerns by test type
- Faster feedback (fast tests run first)
- Better resource utilization (no Testcontainers for Unit tests)
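Outside CI, the same lanes can be run one after another, stopping at the first failure. The sketch below assumes a local pre-push check. The function name is hypothetical, and the ordering (shortest documented timeout first) is an assumption about which lanes finish soonest.

```shell
# PR-gating lanes from this guide, ordered by documented timeout (assumed fastest first).
PR_GATING_LANES="Contract Unit Security Integration"

# Run each PR-gating lane in order and stop at the first failure.
# $1: runner command (defaults to the lane script used throughout this guide).
run_pr_gating_lanes() {
  local runner="${1:-./scripts/test-lane.sh}" lane
  for lane in $PR_GATING_LANES; do
    echo "==> $lane lane" >&2
    "$runner" "$lane" || { echo "Lane failed: $lane" >&2; return 1; }
  done
}

# Example (not executed here):
#   run_pr_gating_lanes
```

Taking the runner as a parameter keeps the loop easy to dry-run, for instance with `run_pr_gating_lanes echo`.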
## Best Practices
### 1. Parallel Execution
Run PR-gating lanes in parallel for faster feedback:
```yaml
jobs:
  unit-tests:
    # ...
  integration-tests:
    # ...
  security-tests:
    # ...
```
### 2. Conditional Execution
Use workflow inputs for optional lanes:
```yaml
on:
  workflow_dispatch:
    inputs:
      run_performance:
        type: boolean
        default: false

jobs:
  performance-tests:
    if: github.event.inputs.run_performance == 'true'
    # ...
```
### 3. Test Result Aggregation
Create a summary job that depends on all lane jobs:
```yaml
test-summary:
  needs: [unit-tests, contract-tests, integration-tests, security-tests]
  if: always()
  runs-on: ubuntu-22.04
  steps:
    - uses: actions/checkout@v4  # Needed so the aggregation script is present
    - name: Download all results
      uses: actions/download-artifact@v4
    - name: Generate summary
      run: ./scripts/ci/aggregate-test-results.sh
```
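This guide does not show `scripts/ci/aggregate-test-results.sh`. As an illustration of what aggregation can involve, the sketch below sums the `total` and `failed` attributes of the `<Counters>` element that TRX result files contain. The function names are hypothetical, and the sketch assumes that each `<Counters>` element sits on a single line.

```shell
# Read one attribute (for example "total" or "failed") from the <Counters>
# element of a TRX file. Assumes the element is written on one line.
trx_counter() {
  sed -n 's/.*<Counters[^>]* '"$2"'="\([0-9][0-9]*\)".*/\1/p' "$1" | head -n 1
}

# Sum counters over every .trx file directly in a directory or one level
# below it (download-artifact creates one subdirectory per artifact).
# Returns non-zero when any test failed.
summarize_trx() {
  local dir="$1" total=0 failed=0 file n
  for file in "$dir"/*.trx "$dir"/*/*.trx; do
    [ -f "$file" ] || continue   # skip glob patterns that matched nothing
    n="$(trx_counter "$file" total)"
    total=$((total + ${n:-0}))
    n="$(trx_counter "$file" failed)"
    failed=$((failed + ${n:-0}))
  done
  echo "total=$total failed=$failed"
  [ "$failed" -eq 0 ]
}

# Example (not executed here):
#   summarize_trx . || exit 1
```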
### 4. Timeout Configuration
Set appropriate timeouts per lane:
```yaml
unit-tests:
  timeout-minutes: 15  # Fast
integration-tests:
  timeout-minutes: 30  # Testcontainers startup
performance-tests:
  timeout-minutes: 45  # Benchmark execution
```
### 5. Environment Isolation
Use Testcontainers for the Integration lane rather than workflow-level `services:` containers:
```yaml
integration-tests:
  steps:
    - name: Run Integration tests
      env:
        POSTGRES_TEST_IMAGE: postgres:16-alpine
      run: ./scripts/test-lane.sh Integration
```
Testcontainers provides:
- Per-test isolation
- Automatic cleanup
- Consistent behavior across environments
## Troubleshooting
### Tests Not Found
**Problem**: `dotnet test --filter "Lane=Unit"` finds no tests
**Solution**: Ensure tests have lane attributes:
```csharp
[Fact]
[UnitTest] // This attribute is required
public void MyTest() { }
```
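To locate tests that are missing a lane attribute, a quick source scan can help. The following is a heuristic sketch, not a parser: it assumes each attribute sits on its own line directly above the method, and it only knows the two attribute names shown in this guide (`[UnitTest]` and `[IntegrationTest]`), so the pattern needs extending with whatever other lane attributes the test kit defines. The function name is hypothetical.

```shell
# Print "file:line" for each [Fact]/[Theory] whose attribute group contains
# no known lane attribute. Heuristic: one attribute per line, placed directly
# above the method declaration.
find_untagged_tests() {
  awk '
    /^[ \t]*\[/ {
      # Attribute line: remember whether it marks a test or a lane.
      if ($0 ~ /\[(Fact|Theory)/) { is_test = 1; line = FNR }
      if ($0 ~ /\[(UnitTest|IntegrationTest)\]/) has_lane = 1
      next
    }
    {
      # First non-attribute line ends the attribute group.
      if (is_test && !has_lane) print FILENAME ":" line
      is_test = 0; has_lane = 0
    }
  ' "$@"
}

# Example (not executed here; the path is a placeholder):
#   find_untagged_tests path/to/SomeTests.cs
```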
### Wrong Lane Assignment
**Problem**: Integration test running in Unit lane
**Solution**: Check test attributes:
```csharp
// Bad: No database in Unit lane
[Fact]
[UnitTest]
public async Task DatabaseTest() { /* uses Postgres */ }

// Good: Use Integration lane for database tests
[Fact]
[IntegrationTest]
public async Task DatabaseTest() { /* uses Testcontainers */ }
```
### Testcontainers Timeout
**Problem**: Integration tests time out waiting for containers
**Solution**: Increase job timeout and ensure Docker is available:
```yaml
integration-tests:
  timeout-minutes: 30  # Increased from 15
  steps:
    - name: Verify Docker
      run: docker info
```
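If the Docker daemon is merely slow to start on the runner, a single `docker info` call can fail spuriously. One possible refinement, sketched here under that assumption, is to retry the probe a few times before giving up. `wait_for` is a hypothetical helper, not part of the repository scripts.

```shell
# Retry a probe command until it succeeds or the attempts run out.
# Usage: wait_for <attempts> <delay-seconds> <command> [args...]
wait_for() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while :; do
    "$@" >/dev/null 2>&1 && return 0
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example (not executed here): probe Docker up to 10 times, 3 seconds apart.
#   wait_for 10 3 docker info || { echo "Docker is not available" >&2; exit 1; }
```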
### Live Tests in PR
**Problem**: Live lane tests failing in PRs
**Solution**: Never run Live tests in PRs:
```yaml
live-tests:
  if: github.event_name == 'workflow_dispatch' && github.event.inputs.run_live == 'true'
  # Never runs automatically on PR
```
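As an extra safeguard, the step that runs the Live lane could repeat the same check in shell, so the lane stays skipped even if the job-level condition is edited by mistake. This is a sketch only: `live_lane_allowed` and `RUN_LIVE` are hypothetical names, and it assumes the runner exports `GITHUB_EVENT_NAME` the way GitHub Actions does.

```shell
# Succeeds only for a manual dispatch with the run_live input set to "true".
# Usage: live_lane_allowed <event-name> <run_live-input>
live_lane_allowed() {
  [ "$1" = "workflow_dispatch" ] && [ "$2" = "true" ]
}

# Example step body (not executed here). RUN_LIVE is a hypothetical variable
# that the workflow would populate from the run_live input:
#   live_lane_allowed "$GITHUB_EVENT_NAME" "${RUN_LIVE:-false}" \
#     || { echo "Skipping Live lane"; exit 0; }
#   ./scripts/test-lane.sh Live
```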
## Integration with Existing Workflows
### Adding Lane-Based Testing to build-test-deploy.yml
Replace per-module test execution with lane-based execution:
```yaml
# Old approach
- name: Run Concelier tests
  run: dotnet test src/Concelier/StellaOps.Concelier.sln

# New approach (recommended)
- name: Run all Unit tests
  run: ./scripts/test-lane.sh Unit
- name: Run all Integration tests
  run: ./scripts/test-lane.sh Integration
```
### Gradual Migration Strategy
1. **Phase 1**: Add lane attributes to existing tests
2. **Phase 2**: Add lane-based jobs alongside existing per-project jobs
3. **Phase 3**: Monitor lane-based jobs for stability
4. **Phase 4**: Remove per-project jobs once the lane-based jobs have proven stable
## Related Documentation
- Test Lane Filters: `docs/technical/testing/ci-lane-filters.md`
- Testing Strategy: `docs/technical/testing/testing-strategy-models.md`
- Test Catalog: `docs/technical/testing/TEST_CATALOG.yml`
- TestKit README: `src/__Libraries/StellaOps.TestKit/README.md`
- Example Workflow: `.gitea/workflows/test-lanes.yml`