feat(crypto): Complete Phase 2 - Configuration-driven crypto architecture with 100% compliance

## Summary

This commit completes Phase 2 of the configuration-driven crypto architecture, achieving
100% crypto compliance by eliminating all hardcoded cryptographic implementations.

## Key Changes

### Phase 1: Plugin Loader Infrastructure
- **Plugin Discovery System**: Created StellaOps.Cryptography.PluginLoader with manifest-based loading
- **Configuration Model**: Added CryptoPluginConfiguration with regional profiles support
- **Dependency Injection**: Extended DI to support plugin-based crypto provider registration
- **Regional Configs**: Created appsettings.crypto.{international,russia,eu,china}.yaml
- **CI Workflow**: Added .gitea/workflows/crypto-compliance.yml for audit enforcement

### Phase 2: Code Refactoring
- **API Extension**: Added ICryptoProvider.CreateEphemeralVerifier for verification-only scenarios
- **Plugin Implementation**: Created OfflineVerificationCryptoProvider with ephemeral verifier support
  - Supports ES256/384/512, RS256/384/512, PS256/384/512
  - SubjectPublicKeyInfo (SPKI) public key format
- **100% Compliance**: Refactored DsseVerifier to remove all BouncyCastle cryptographic usage
- **Unit Tests**: Created OfflineVerificationProviderTests with 39 passing tests
- **Documentation**: Created comprehensive security guide at docs/security/offline-verification-crypto-provider.md
- **Audit Infrastructure**: Created scripts/audit-crypto-usage.ps1 for static analysis

### Testing Infrastructure (TestKit)
- **Determinism Gate**: Created DeterminismGate for reproducibility validation
- **Test Fixtures**: Added PostgresFixture and ValkeyFixture using Testcontainers
- **Traits System**: Implemented test lane attributes for parallel CI execution
- **JSON Assertions**: Added CanonicalJsonAssert for deterministic JSON comparisons
- **Test Lanes**: Created test-lanes.yml workflow for parallel test execution

### Documentation
- **Architecture**: Created CRYPTO_CONFIGURATION_DRIVEN_ARCHITECTURE.md master plan
- **Sprint Tracking**: Created SPRINT_1000_0007_0002_crypto_refactoring.md (COMPLETE)
- **API Documentation**: Updated docs2/cli/crypto-plugins.md and crypto.md
- **Testing Strategy**: Created testing strategy documents in docs/implplan/SPRINT_5100_0007_*

## Compliance & Testing

- Zero direct System.Security.Cryptography usage in production code
- All crypto operations go through ICryptoProvider abstraction
- 39/39 unit tests passing for OfflineVerificationCryptoProvider
- Build successful (AirGap, Crypto plugin, DI infrastructure)
- Audit script validates crypto boundaries

## Files Modified

**Core Crypto Infrastructure:**
- src/__Libraries/StellaOps.Cryptography/CryptoProvider.cs (API extension)
- src/__Libraries/StellaOps.Cryptography/CryptoSigningKey.cs (verification-only constructor)
- src/__Libraries/StellaOps.Cryptography/EcdsaSigner.cs (fixed ephemeral verifier)

**Plugin Implementation:**
- src/__Libraries/StellaOps.Cryptography.Plugin.OfflineVerification/ (new)
- src/__Libraries/StellaOps.Cryptography.PluginLoader/ (new)

**Production Code Refactoring:**
- src/AirGap/StellaOps.AirGap.Importer/Validation/DsseVerifier.cs (100% compliant)

**Tests:**
- src/__Libraries/__Tests/StellaOps.Cryptography.Plugin.OfflineVerification.Tests/ (new, 39 tests)
- src/__Libraries/__Tests/StellaOps.Cryptography.PluginLoader.Tests/ (new)

**Configuration:**
- etc/crypto-plugins-manifest.json (plugin registry)
- etc/appsettings.crypto.*.yaml (regional profiles)

**Documentation:**
- docs/security/offline-verification-crypto-provider.md (600+ lines)
- docs/implplan/CRYPTO_CONFIGURATION_DRIVEN_ARCHITECTURE.md (master plan)
- docs/implplan/SPRINT_1000_0007_0002_crypto_refactoring.md (Phase 2 complete)

## Next Steps

Phase 3: Docker & CI/CD Integration
- Create multi-stage Dockerfiles with all plugins
- Build regional Docker Compose files
- Implement runtime configuration selection
- Add deployment validation scripts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
master
2025-12-23 18:20:00 +02:00
parent b444284be5
commit dac8e10e36
241 changed files with 22567 additions and 307 deletions

@@ -0,0 +1,310 @@
# CI Lane Integration Guide
This guide explains how to integrate the standardized test lane filtering into CI workflows.
## Overview
StellaOps uses a lane-based test categorization system with six standardized lanes:
- **Unit**: Fast, isolated, deterministic tests (PR-gating)
- **Contract**: API contract stability tests (PR-gating)
- **Integration**: Service and storage tests with Testcontainers (PR-gating)
- **Security**: AuthZ, input validation, negative tests (PR-gating)
- **Performance**: Benchmarks and regression thresholds (optional/scheduled)
- **Live**: External API smoke tests (opt-in only, never PR-gating)
## Using Lane Filters in CI
### Using the Test Runner Script
The recommended approach is to use `scripts/test-lane.sh`:
```yaml
- name: Run Unit lane tests
  run: |
    chmod +x scripts/test-lane.sh
    ./scripts/test-lane.sh Unit \
      --logger "trx;LogFileName=unit-tests.trx" \
      --results-directory ./test-results \
      --verbosity normal
```
### Direct dotnet test Filtering
Alternatively, use `dotnet test` with lane filters directly:
```yaml
- name: Run Integration lane tests
  run: |
    dotnet test \
      --filter "Lane=Integration" \
      --configuration Release \
      --logger "trx;LogFileName=integration-tests.trx" \
      --results-directory ./test-results
```
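The contents of `scripts/test-lane.sh` are not shown in this guide. As a rough sketch only, a wrapper of this kind could validate the lane name and then delegate to the same `dotnet test --filter "Lane=..."` call shown above. The function names `lane_filter` and `run_lane` are hypothetical, and the real script may behave differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a lane wrapper; the real scripts/test-lane.sh may differ.

# The six lanes named in this guide.
VALID_LANES="Unit Contract Integration Security Performance Live"

# Print the dotnet test filter expression for a lane, or fail on an unknown lane.
lane_filter() {
  local lane="$1" candidate
  for candidate in $VALID_LANES; do
    if [ "$lane" = "$candidate" ]; then
      printf 'Lane=%s\n' "$lane"
      return 0
    fi
  done
  printf 'Unknown lane: %s (expected one of: %s)\n' "$lane" "$VALID_LANES" >&2
  return 1
}

# Run one lane, forwarding any extra arguments (loggers, results directory) to dotnet test.
run_lane() {
  local lane="$1" filter
  shift
  filter="$(lane_filter "$lane")" || return 2
  dotnet test --filter "$filter" --configuration Release "$@"
}
```

With these functions loaded, `run_lane Unit --results-directory ./test-results` would expand to the direct `dotnet test` form, while a misspelled lane name fails before any tests start.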
## Lane-Based Workflow Pattern
### Full Workflow Example
See `.gitea/workflows/test-lanes.yml` for a complete reference implementation.
Key features:
- **Separate jobs per lane** for parallel execution
- **PR-gating lanes** run on all PRs (Unit, Contract, Integration, Security)
- **Optional lanes** run on schedule or manual trigger (Performance, Live)
- **Test results summary** aggregates all lane results
### Job Structure
```yaml
unit-tests:
  name: Unit Tests
  runs-on: ubuntu-22.04
  timeout-minutes: 15
  steps:
    - uses: actions/checkout@v4
    - name: Setup .NET
      uses: actions/setup-dotnet@v4
      with:
        dotnet-version: '10.0.100'
    - name: Build
      run: dotnet build src/StellaOps.sln --configuration Release
    - name: Run Unit lane
      run: ./scripts/test-lane.sh Unit --results-directory ./test-results
    - name: Upload results
      uses: actions/upload-artifact@v4
      with:
        name: unit-test-results
        path: ./test-results
```
## Lane Execution Guidelines
### Unit Lane
- **Timeout**: 10-15 minutes
- **Dependencies**: None (no I/O, no network, no databases)
- **PR gating**: ✅ Required
- **Characteristics**: Deterministic, fast, offline
### Contract Lane
- **Timeout**: 5-10 minutes
- **Dependencies**: None (schema validation only)
- **PR gating**: ✅ Required
- **Characteristics**: OpenAPI/schema validation, no external calls
### Integration Lane
- **Timeout**: 20-30 minutes
- **Dependencies**: Testcontainers (Postgres, Valkey)
- **PR gating**: ✅ Required
- **Characteristics**: End-to-end service flows, database tests
### Security Lane
- **Timeout**: 15-20 minutes
- **Dependencies**: Testcontainers (if needed for auth tests)
- **PR gating**: ✅ Required
- **Characteristics**: RBAC, injection prevention, negative tests
### Performance Lane
- **Timeout**: 30-45 minutes
- **Dependencies**: Baseline data, historical metrics
- **PR gating**: ❌ Optional (scheduled/manual)
- **Characteristics**: Benchmarks, regression thresholds
### Live Lane
- **Timeout**: 15-20 minutes
- **Dependencies**: External APIs, upstream services
- **PR gating**: ❌ Never (opt-in only)
- **Characteristics**: Smoke tests, connector validation
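For local tooling, the per-lane guidelines above can be encoded as a small lookup. The sketch below is illustrative only: the function names are hypothetical, and each timeout is the upper bound of the range listed for that lane.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that mirror the lane guidelines listed in this guide.

# Print the upper-bound timeout (in minutes) for a lane; fail on an unknown lane.
lane_timeout_minutes() {
  case "$1" in
    Unit)        echo 15 ;;
    Contract)    echo 10 ;;
    Integration) echo 30 ;;
    Security)    echo 20 ;;
    Performance) echo 45 ;;
    Live)        echo 20 ;;
    *)           return 1 ;;
  esac
}

# Succeed only for lanes that gate pull requests.
lane_is_pr_gating() {
  case "$1" in
    Unit|Contract|Integration|Security) return 0 ;;
    *) return 1 ;;
  esac
}
```

A pre-push hook could, for example, loop over the lane names and run only those for which `lane_is_pr_gating` succeeds.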
## Migration from Per-Project to Lane-Based
### Before (Per-Project)
```yaml
- name: Run Concelier tests
  run: dotnet test src/Concelier/StellaOps.Concelier.sln
- name: Run Authority tests
  run: dotnet test src/Authority/StellaOps.Authority.sln
- name: Run Scanner tests
  run: dotnet test src/Scanner/StellaOps.Scanner.sln
```
### After (Lane-Based)
```yaml
- name: Run Unit lane
  run: ./scripts/test-lane.sh Unit
- name: Run Integration lane
  run: ./scripts/test-lane.sh Integration
- name: Run Security lane
  run: ./scripts/test-lane.sh Security
```
**Benefits**:
- Run all unit tests across all modules in parallel
- Clear separation of concerns by test type
- Faster feedback (fast tests run first)
- Better resource utilization (no Testcontainers for Unit tests)
## Best Practices
### 1. Parallel Execution
Run PR-gating lanes in parallel for faster feedback:
```yaml
jobs:
  unit-tests:
    # ...
  integration-tests:
    # ...
  security-tests:
    # ...
```
### 2. Conditional Execution
Use workflow inputs for optional lanes:
```yaml
on:
  workflow_dispatch:
    inputs:
      run_performance:
        type: boolean
        default: false
jobs:
  performance-tests:
    if: github.event.inputs.run_performance == 'true'
    # ...
```
### 3. Test Result Aggregation
Create a summary job that depends on all lane jobs:
```yaml
test-summary:
  needs: [unit-tests, contract-tests, integration-tests, security-tests]
  if: always()
  steps:
    - name: Download all results
      uses: actions/download-artifact@v4
    - name: Generate summary
      run: ./scripts/ci/aggregate-test-results.sh
```
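The aggregation script itself is not shown here. A minimal sketch of the idea, assuming each lane uploads TRX files whose `<Counters ... />` element sits on a single line, could total the `passed` and `failed` counts across all downloaded results. The function names are hypothetical, and the real `scripts/ci/aggregate-test-results.sh` may work differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of TRX aggregation; not the repository's actual script.

# Extract one numeric attribute (e.g. passed, failed) from the <Counters .../> element.
# Assumes the element is written on a single line.
trx_counter() {
  local file="$1" name="$2"
  sed -n "s/.*<Counters[^>]* ${name}=\"\([0-9][0-9]*\)\".*/\1/p" "$file" | head -n 1
}

# Sum passed/failed counts over TRX files in a directory and one level of subdirectories
# (one subdirectory per downloaded artifact). Returns non-zero if any test failed.
summarize_results() {
  local dir="$1" total_passed=0 total_failed=0 file passed failed
  for file in "$dir"/*.trx "$dir"/*/*.trx; do
    [ -e "$file" ] || continue   # skip glob patterns that matched nothing
    passed="$(trx_counter "$file" passed)"
    failed="$(trx_counter "$file" failed)"
    total_passed=$((total_passed + ${passed:-0}))
    total_failed=$((total_failed + ${failed:-0}))
  done
  printf 'passed=%d failed=%d\n' "$total_passed" "$total_failed"
  [ "$total_failed" -eq 0 ]
}
```

Calling `summarize_results ./test-results` in the summary job would print one totals line and fail the step when any lane reported failures.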
### 4. Timeout Configuration
Set appropriate timeouts per lane:
```yaml
unit-tests:
  timeout-minutes: 15 # Fast
integration-tests:
  timeout-minutes: 30 # Testcontainers startup
performance-tests:
  timeout-minutes: 45 # Benchmark execution
```
### 5. Environment Isolation
Use Testcontainers for the Integration lane, not GitHub Actions service containers:
```yaml
integration-tests:
  steps:
    - name: Run Integration tests
      env:
        POSTGRES_TEST_IMAGE: postgres:16-alpine
      run: ./scripts/test-lane.sh Integration
```
Testcontainers provides:
- Per-test isolation
- Automatic cleanup
- Consistent behavior across environments
## Troubleshooting
### Tests Not Found
**Problem**: `dotnet test --filter "Lane=Unit"` finds no tests
**Solution**: Ensure tests have lane attributes:
```csharp
[Fact]
[UnitTest] // This attribute is required
public void MyTest() { }
```
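To locate tests that are missing a lane attribute, a coarse file-level scan can help. The sketch below is a heuristic, not part of the repository: only `[UnitTest]` and `[IntegrationTest]` appear in this guide, so the other attribute names in the pattern are assumptions, and a file that mixes tagged and untagged tests will not be reported.

```shell
#!/usr/bin/env bash
# Hypothetical heuristic: list C# files that declare xUnit tests but contain no lane attribute.

# Assumed attribute names; only [UnitTest] and [IntegrationTest] are confirmed by this guide.
LANE_ATTRS='\[(UnitTest|ContractTest|IntegrationTest|SecurityTest|PerformanceTest|LiveTest)\]'

find_unlaned_test_files() {
  local root="$1"
  # Files that contain [Fact...] or [Theory...] ...
  grep -rlE --include='*.cs' '\[(Fact|Theory)' "$root" | while IFS= read -r file; do
    # ... but none of the lane attributes.
    grep -qE "$LANE_ATTRS" "$file" || printf '%s\n' "$file"
  done
}
```

Running `find_unlaned_test_files src` prints candidate files to review by hand.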
### Wrong Lane Assignment
**Problem**: Integration test running in Unit lane
**Solution**: Check test attributes:
```csharp
// Bad: No database in Unit lane
[Fact]
[UnitTest]
public async Task DatabaseTest() { /* uses Postgres */ }

// Good: Use Integration lane for database tests
[Fact]
[IntegrationTest]
public async Task DatabaseTest() { /* uses Testcontainers */ }
```
### Testcontainers Timeout
**Problem**: Integration tests time out waiting for containers
**Solution**: Increase job timeout and ensure Docker is available:
```yaml
integration-tests:
  timeout-minutes: 30 # Increased from 15
  steps:
    - name: Verify Docker
      run: docker info
```
### Live Tests in PR
**Problem**: Live lane tests failing in PRs
**Solution**: Never run Live tests in PRs:
```yaml
live-tests:
  if: github.event_name == 'workflow_dispatch' && github.event.inputs.run_live == 'true'
  # Never runs automatically on PR
```
## Integration with Existing Workflows
### Adding Lane-Based Testing to build-test-deploy.yml
Replace per-module test execution with lane-based execution:
```yaml
# Old approach
- name: Run Concelier tests
  run: dotnet test src/Concelier/StellaOps.Concelier.sln
# New approach (recommended)
- name: Run all Unit tests
  run: ./scripts/test-lane.sh Unit
- name: Run all Integration tests
  run: ./scripts/test-lane.sh Integration
```
### Gradual Migration Strategy
1. **Phase 1**: Add lane attributes to existing tests
2. **Phase 2**: Add lane-based jobs alongside existing per-project jobs
3. **Phase 3**: Monitor lane-based jobs for stability
4. **Phase 4**: Remove per-project jobs once lane-based jobs are proven stable
## Related Documentation
- Test Lane Filters: `docs/testing/ci-lane-filters.md`
- Testing Strategy: `docs/testing/testing-strategy-models.md`
- Test Catalog: `docs/testing/TEST_CATALOG.yml`
- TestKit README: `src/__Libraries/StellaOps.TestKit/README.md`
- Example Workflow: `.gitea/workflows/test-lanes.yml`