CI/CD Architecture
Extended Documentation: See docs/cicd/ for comprehensive CI/CD guides.
Overview
StellaOps CI/CD infrastructure is built on Gitea Actions with a modular, layered architecture designed for:
- Determinism: Reproducible builds and tests across environments
- Offline-first: Support for air-gapped deployments
- Security: Cryptographic signing and attestation at every stage
- Scalability: Parallel execution with intelligent caching
Quick Links
| Document | Purpose |
|---|---|
| CI/CD Overview | High-level architecture and getting started |
| Workflow Triggers | Complete trigger matrix and dependency chains |
| Release Pipelines | Suite, module, and bundle release flows |
| Security Scanning | SAST, secrets, container, and dependency scanning |
| Troubleshooting | Common issues and solutions |
| Script Reference | CI/CD script documentation |
Workflow Trigger Summary
Trigger Matrix (100 Workflows)
| Trigger Type | Count | Examples |
|---|---|---|
| PR + Main Push | 15 | test-matrix.yml, build-test-deploy.yml |
| Tag-Based | 3 | release-suite.yml, release.yml, module-publish.yml |
| Scheduled | 8 | nightly-regression.yml, renovate.yml |
| Manual Only | 25+ | rollback.yml, cli-build.yml |
| Module-Specific | 50+ | Scanner, Concelier, Authority workflows |
Tag Patterns
| Pattern | Workflow | Example |
|---|---|---|
| suite-* | Suite release | suite-2026.04 |
| v* | Bundle release | v2025.12.1 |
| module-*-v* | Module publish | module-authority-v1.2.3 |
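As a rough illustration of how a tag could be routed to a release flow, the sketch below matches a tag against the glob patterns from the table. The function name `classify_tag` and the returned labels are hypothetical; the real workflows rely on the trigger filters in their own YAML.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Patterns copied from the Tag Patterns table. The labels are the table's
# descriptions, not workflow file names.
TAG_PATTERNS = [
    ("suite-*", "Suite release"),
    ("module-*-v*", "Module publish"),
    ("v*", "Bundle release"),
]


def classify_tag(tag: str) -> Optional[str]:
    """Return the release flow whose glob pattern matches the tag, or None."""
    for pattern, flow in TAG_PATTERNS:
        if fnmatchcase(tag, pattern):
            return flow
    return None


if __name__ == "__main__":
    for example in ("suite-2026.04", "v2025.12.1", "module-authority-v1.2.3"):
        print(example, "->", classify_tag(example))
```

The three patterns do not overlap for the example tags, so the order of `TAG_PATTERNS` only matters if new prefixes starting with `v` are introduced.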
Schedule Overview
| Time (UTC) | Workflow | Purpose |
|---|---|---|
| 2:00 AM Daily | nightly-regression.yml | Full regression |
| 3:00 AM/PM Daily | renovate.yml | Dependency updates |
| 3:30 AM Monday | sast-scan.yml | Weekly security scan |
| 5:00 AM Daily | test-matrix.yml | Extended tests |
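The schedule above can be read as cron expressions. The sketch below is one such reading: the cron strings are inferred from the table, not copied from the workflow files, and `is_due` is a hypothetical helper that only understands `*` and comma-separated integers.

```python
from datetime import datetime, timezone

# Assumed cron equivalents of the Schedule Overview table (all times UTC).
SCHEDULES = {
    "nightly-regression.yml": "0 2 * * *",
    "renovate.yml": "0 3,15 * * *",
    "sast-scan.yml": "30 3 * * 1",
    "test-matrix.yml": "0 5 * * *",
}


def _matches(field: str, value: int) -> bool:
    """Support only '*' and comma-separated integers."""
    return field == "*" or value in {int(part) for part in field.split(",")}


def is_due(cron: str, when: datetime) -> bool:
    """Check a five-field cron expression against a UTC timestamp."""
    minute, hour, day, month, weekday = cron.split()
    # cron counts Sunday as 0; datetime.weekday() counts Monday as 0.
    cron_weekday = (when.weekday() + 1) % 7
    return all((
        _matches(minute, when.minute),
        _matches(hour, when.hour),
        _matches(day, when.day),
        _matches(month, when.month),
        _matches(weekday, cron_weekday),
    ))


def due_workflows(when: datetime) -> list:
    """List the scheduled workflows that would fire at the given minute."""
    return sorted(name for name, cron in SCHEDULES.items() if is_due(cron, when))


if __name__ == "__main__":
    monday = datetime(2025, 1, 27, 3, 30, tzinfo=timezone.utc)
    print(due_workflows(monday))
```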
Full Details: See Workflow Triggers
Pipeline Architecture
Release Pipeline Flow
graph TD
subgraph "Trigger Layer"
TAG[Git Tag] --> PARSE[Parse Tag]
DISPATCH[Manual Dispatch] --> PARSE
SCHEDULE[Scheduled] --> PARSE
end
subgraph "Validation Layer"
PARSE --> VALIDATE[Validate Inputs]
VALIDATE --> RESOLVE[Resolve Versions]
end
subgraph "Build Layer"
RESOLVE --> BUILD[Build Modules]
BUILD --> TEST[Run Tests]
TEST --> DETERMINISM[Determinism Check]
end
subgraph "Artifact Layer"
DETERMINISM --> CONTAINER[Build Container]
CONTAINER --> SBOM[Generate SBOM]
SBOM --> SIGN[Sign Artifacts]
end
subgraph "Release Layer"
SIGN --> MANIFEST[Update Manifest]
MANIFEST --> CHANGELOG[Generate Changelog]
CHANGELOG --> DOCS[Generate Docs]
DOCS --> PUBLISH[Publish Release]
end
subgraph "Post-Release"
PUBLISH --> VERIFY[Verify Release]
VERIFY --> NOTIFY[Notify Stakeholders]
end
Service Release Pipeline
graph LR
subgraph "Trigger"
A["service-{name}-v{semver}"] --> B["Parse Service & Version"]
end
subgraph "Build"
B --> C[Read Directory.Versions.props]
C --> D[Bump Version]
D --> E[Build Service]
E --> F[Run Tests]
end
subgraph "Package"
F --> G[Build Container]
G --> H[Generate Docker Tag]
H --> I[Push to Registry]
end
subgraph "Attestation"
I --> J[Generate SBOM]
J --> K[Sign with Cosign]
K --> L[Create Attestation]
end
subgraph "Finalize"
L --> M[Update Manifest]
M --> N[Commit Changes]
end
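The "Parse Service & Version" step in the diagram above could look roughly like the following. This is a minimal sketch: the allowed characters in the service name and the strict `MAJOR.MINOR.PATCH` version form are assumptions, and `parse_service_tag` is a hypothetical name.

```python
import re
from typing import NamedTuple, Optional


class ServiceTag(NamedTuple):
    name: str
    version: str


# Assumption: lowercase service names that may contain hyphens, followed by a
# plain three-part version. Pre-release and build suffixes are not handled.
_SERVICE_TAG = re.compile(
    r"^service-(?P<name>[a-z0-9][a-z0-9-]*?)-v(?P<version>\d+\.\d+\.\d+)$"
)


def parse_service_tag(tag: str) -> Optional[ServiceTag]:
    """Split a service-{name}-v{semver} tag into its parts, or return None."""
    match = _SERVICE_TAG.match(tag)
    if match is None:
        return None
    return ServiceTag(match.group("name"), match.group("version"))


if __name__ == "__main__":
    print(parse_service_tag("service-scanner-v1.2.3"))
```

Returning `None` for a non-matching tag lets the calling step fail early with a clear message instead of building with an empty version.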
Test Matrix Execution
graph TD
subgraph "Matrix Strategy"
TRIGGER[PR/Push] --> FILTER[Path Filter]
FILTER --> MATRIX[Generate Matrix]
end
subgraph "Parallel Execution"
MATRIX --> UNIT[Unit Tests]
MATRIX --> INT[Integration Tests]
MATRIX --> DET[Determinism Tests]
end
subgraph "Test Types"
UNIT --> UNIT_FAST[Fast Unit]
UNIT --> UNIT_SLOW[Slow Unit]
INT --> INT_PG[PostgreSQL]
INT --> INT_VALKEY[Valkey]
DET --> DET_SCANNER[Scanner]
DET --> DET_BUILD[Build Output]
end
subgraph "Reporting"
UNIT_FAST --> TRX[TRX Reports]
UNIT_SLOW --> TRX
INT_PG --> TRX
INT_VALKEY --> TRX
DET_SCANNER --> TRX
DET_BUILD --> TRX
TRX --> SUMMARY[Job Summary]
end
Workflow Dependencies
Core Dependencies
graph TD
BTD[build-test-deploy.yml] --> TM[test-matrix.yml]
BTD --> DG[determinism-gate.yml]
TM --> TL[test-lanes.yml]
TM --> ITG[integration-tests-gate.yml]
RS[release-suite.yml] --> BTD
RS --> MP[module-publish.yml]
RS --> AS[artifact-signing.yml]
SR[service-release.yml] --> BTD
SR --> AS
MP --> AS
MP --> AB[attestation-bundle.yml]
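To see which workflows a change to one file can affect, the graph above can be walked programmatically. The sketch below reads each arrow as "calls or depends on", which is an interpretation of the diagram; `reachable` is a hypothetical helper.

```python
# Edges copied from the Core Dependencies diagram.
EDGES = {
    "build-test-deploy.yml": ["test-matrix.yml", "determinism-gate.yml"],
    "test-matrix.yml": ["test-lanes.yml", "integration-tests-gate.yml"],
    "release-suite.yml": [
        "build-test-deploy.yml",
        "module-publish.yml",
        "artifact-signing.yml",
    ],
    "service-release.yml": ["build-test-deploy.yml", "artifact-signing.yml"],
    "module-publish.yml": ["artifact-signing.yml", "attestation-bundle.yml"],
}


def reachable(workflow: str) -> set:
    """Return every workflow reachable from the given one by following arrows."""
    seen = set()
    stack = [workflow]
    while stack:
        current = stack.pop()
        for target in EDGES.get(current, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


if __name__ == "__main__":
    for name in sorted(reachable("release-suite.yml")):
        print(name)
```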
Security Chain
graph LR
BUILD[Build] --> SBOM[SBOM Generation]
SBOM --> SIGN[Cosign Signing]
SIGN --> ATTEST[Attestation]
ATTEST --> VERIFY[Verification]
VERIFY --> PUBLISH[Publish]
Execution Stages
Stage 1: Validation
| Step | Purpose | Tools |
|---|---|---|
| Parse trigger | Extract tag/input parameters | bash |
| Validate config | Check required files exist | bash |
| Resolve versions | Read from Directory.Versions.props | Python |
| Check permissions | Verify secrets available | Gitea Actions |
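The "Resolve versions" step reads Directory.Versions.props, which is an MSBuild property file. A minimal sketch of that read is shown below; the property names in the sample are invented for illustration, and the actual script may work differently.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def _local_name(tag: str) -> str:
    """Drop an XML namespace prefix such as '{http://...}PropertyGroup'."""
    return tag.rsplit("}", 1)[-1]


def read_versions(props_xml: str) -> Dict[str, str]:
    """Collect every non-empty property from all PropertyGroup elements."""
    versions: Dict[str, str] = {}
    root = ET.fromstring(props_xml)
    for element in root.iter():
        if _local_name(element.tag) != "PropertyGroup":
            continue
        for prop in element:
            value = (prop.text or "").strip()
            if value:
                versions[_local_name(prop.tag)] = value
    return versions


# Hypothetical file content; the real property names are not shown in this document.
SAMPLE = """
<Project>
  <PropertyGroup>
    <ScannerVersion>1.2.3</ScannerVersion>
    <AuthorityVersion>1.0.0</AuthorityVersion>
  </PropertyGroup>
</Project>
"""

if __name__ == "__main__":
    print(read_versions(SAMPLE))
```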
Stage 2: Build
| Step | Purpose | Tools |
|---|---|---|
| Restore packages | NuGet/npm dependencies | dotnet restore, npm ci |
| Build solution | Compile all projects | dotnet build |
| Run analyzers | Code analysis | dotnet analyzers |
Stage 3: Test
| Step | Purpose | Tools |
|---|---|---|
| Unit tests | Component testing | xUnit |
| Integration tests | Service integration | Testcontainers |
| Determinism tests | Output reproducibility | Custom scripts |
Stage 4: Package
| Step | Purpose | Tools |
|---|---|---|
| Build container | Docker image | docker build |
| Generate SBOM | Software bill of materials | Syft |
| Sign artifacts | Cryptographic signing | Cosign |
| Create attestation | in-toto/DSSE envelope | Custom tools |
Stage 5: Publish
| Step | Purpose | Tools |
|---|---|---|
| Push container | Registry upload | docker push |
| Upload attestation | Rekor transparency | Cosign |
| Update manifest | Version tracking | Python |
| Generate docs | Release documentation | Python |
Concurrency Control
Strategy
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
Workflow Groups
| Group | Behavior | Workflows |
|---|---|---|
| Build | Cancel in-progress | build-test-deploy.yml |
| Release | No cancel (sequential) | release-suite.yml |
| Deploy | Environment-locked | promote.yml |
| Scheduled | Allow concurrent | renovate.yml |
Caching Strategy
Cache Layers
graph TD
subgraph "Package Cache"
NUGET[NuGet Cache<br>~/.nuget/packages]
NPM[npm Cache<br>~/.npm]
end
subgraph "Build Cache"
OBJ[Object Files<br>**/obj]
BIN[Binaries<br>**/bin]
end
subgraph "Test Cache"
TC[Testcontainers<br>Images]
FIX[Test Fixtures]
end
subgraph "Keys"
K1[runner.os-nuget-hash] --> NUGET
K2[runner.os-npm-hash] --> NPM
K3[runner.os-dotnet-hash] --> OBJ
K3 --> BIN
end
Cache Configuration
| Cache | Key Pattern | Restore Keys |
|---|---|---|
| NuGet | ${{ runner.os }}-nuget-${{ hashFiles('**/*.csproj') }} | ${{ runner.os }}-nuget- |
| npm | ${{ runner.os }}-npm-${{ hashFiles('**/package-lock.json') }} | ${{ runner.os }}-npm- |
| .NET Build | ${{ runner.os }}-dotnet-${{ github.sha }} | ${{ runner.os }}-dotnet- |
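The relationship between a key and its restore keys can be sketched as a lookup: an exact key match wins, otherwise the newest entry sharing a restore-key prefix is used. The code below models that behavior as it is commonly documented for the cache action; `resolve_cache` and the sample keys are hypothetical, and the cache server in a given Gitea setup may differ in detail.

```python
from typing import Optional, Sequence


def resolve_cache(
    key: str,
    restore_keys: Sequence[str],
    stored_keys: Sequence[str],
) -> Optional[str]:
    """Pick the cache entry to restore.

    stored_keys is assumed to be ordered from oldest to newest.
    """
    if key in stored_keys:
        return key  # exact hit
    for prefix in restore_keys:
        candidates = [stored for stored in stored_keys if stored.startswith(prefix)]
        if candidates:
            return candidates[-1]  # newest partial match
    return None  # cache miss: packages are restored from scratch


if __name__ == "__main__":
    stored = ["Linux-nuget-aaa111", "Linux-nuget-bbb222", "Linux-npm-ccc333"]
    print(resolve_cache("Linux-nuget-ddd444", ["Linux-nuget-"], stored))
```

Because the .NET build key ends in the commit SHA, an exact hit only occurs when the same commit is rebuilt; other runs fall back to the prefix match.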
Runner Requirements
Self-Hosted Runners
| Label | Purpose | Requirements |
|---|---|---|
| ubuntu-latest | General builds | 4 CPU, 16GB RAM, 100GB disk |
| linux-arm64 | ARM builds | ARM64 host |
| windows-latest | Windows builds | Windows Server 2022 |
| macos-latest | macOS builds | macOS 13+ |
Docker-in-Docker
Required for:
- Testcontainers integration tests
- Multi-architecture builds
- Container scanning
Network Requirements
| Endpoint | Purpose | Required |
|---|---|---|
| git.stella-ops.org | Source, Registry | Always |
| nuget.org | NuGet packages | Online mode |
| registry.npmjs.org | npm packages | Online mode |
| ghcr.io | GitHub Container Registry | Optional |
Artifact Flow
Build Artifacts
artifacts/
├── binaries/
│   ├── StellaOps.Cli-linux-x64
│   ├── StellaOps.Cli-linux-arm64
│   ├── StellaOps.Cli-win-x64
│   └── StellaOps.Cli-osx-arm64
├── containers/
│   ├── scanner:1.2.3+20250128143022
│   └── authority:1.0.0+20250128143022
├── sbom/
│   ├── scanner.cyclonedx.json
│   └── authority.cyclonedx.json
└── attestations/
    ├── scanner.intoto.jsonl
    └── authority.intoto.jsonl
Release Artifacts
docs/releases/2026.04/
├── README.md
├── CHANGELOG.md
├── services.md
├── docker-compose.yml
├── docker-compose.airgap.yml
├── upgrade-guide.md
├── checksums.txt
└── manifest.yaml
Error Handling
Retry Strategy
| Step Type | Retries | Backoff |
|---|---|---|
| Network calls | 3 | Exponential |
| Docker push | 3 | Linear (30s) |
| Tests | 0 | N/A |
| Signing | 2 | Linear (10s) |
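The retry table can be expressed as a small helper. This is a sketch under assumptions: "Linear (30s)" is read as delays of 30s, 60s, 90s; the exponential base delay of 1s is invented because the table does not state one; and `retry` and `POLICIES` are hypothetical names.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")

# (retries, backoff kind, base delay in seconds), following the table above.
POLICIES = {
    "network": (3, "exponential", 1.0),  # base delay is an assumption
    "docker_push": (3, "linear", 30.0),
    "tests": (0, "none", 0.0),
    "signing": (2, "linear", 10.0),
}


def backoff_delays(retries: int, backoff: str, base_delay: float) -> List[float]:
    """Return the wait before each retry attempt."""
    if backoff == "exponential":
        return [base_delay * (2 ** attempt) for attempt in range(retries)]
    if backoff == "linear":
        return [base_delay * (attempt + 1) for attempt in range(retries)]
    return [0.0] * retries


def retry(
    action: Callable[[], T],
    retries: int,
    backoff: str,
    base_delay: float,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run action, retrying on any exception up to `retries` extra times."""
    delays = backoff_delays(retries, backoff, base_delay)
    for attempt in range(retries + 1):
        try:
            return action()
        except Exception:
            if attempt == retries:
                raise  # retries exhausted: surface the last error
            sleep(delays[attempt])
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. With the `tests` policy the first failure propagates immediately, matching the zero-retry row.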
Failure Actions
| Failure Type | Action |
|---|---|
| Build failure | Fail fast, notify |
| Test failure | Continue, report |
| Signing failure | Fail, alert security |
| Deploy failure | Rollback, notify |
Security Architecture
Secret Management
graph TD
subgraph "Gitea Secrets"
GS[Organization Secrets]
RS[Repository Secrets]
ES[Environment Secrets]
end
subgraph "Usage"
GS --> BUILD[Build Workflows]
RS --> SIGN[Signing Workflows]
ES --> DEPLOY[Deploy Workflows]
end
subgraph "Rotation"
ROTATE[Key Rotation] --> RS
ROTATE --> ES
end
Signing Chain
- Build outputs: SHA-256 checksums
- Container images: Cosign keyless/keyed signing
- SBOMs: in-toto attestation
- Releases: GPG-signed tags
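The first link in the chain, SHA-256 checksums for build outputs, can be sketched as follows. The output layout (`<hash>  <relative path>`, one per line, sorted) mirrors the common `sha256sum` format and is an assumption about how checksums.txt is written; `write_checksums` is a hypothetical helper.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large binaries do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksums(root: Path, out_name: str = "checksums.txt") -> Path:
    """Write one '<sha256>  <relative path>' line per file under root."""
    files = sorted(
        path for path in root.rglob("*")
        if path.is_file() and path.name != out_name
    )
    lines = [
        f"{sha256_file(path)}  {path.relative_to(root).as_posix()}"
        for path in files
    ]
    out_path = root / out_name
    out_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out_path
```

Sorting the paths keeps the file byte-identical across runs, which fits the determinism goal stated in the overview.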
Monitoring & Observability
Workflow Metrics
| Metric | Source | Dashboard |
|---|---|---|
| Build duration | Gitea Actions | Grafana |
| Test pass rate | TRX reports | Grafana |
| Cache hit rate | Actions cache | Prometheus |
| Artifact size | Upload artifact | Prometheus |
Alerts
| Alert | Condition | Action |
|---|---|---|
| Build time > 30m | Duration threshold | Investigate |
| Test failures > 5% | Rate threshold | Review |
| Cache miss streak | 3 consecutive | Clear cache |
| Security scan critical | Any critical CVE | Block merge |
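The alert conditions above translate directly into threshold checks. The sketch below is illustrative only: the `PipelineStats` fields are hypothetical, and in practice these rules would more likely live in the Grafana or Prometheus alerting configuration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PipelineStats:
    build_minutes: float
    tests_failed: int
    tests_total: int
    consecutive_cache_misses: int
    critical_cves: int


def evaluate_alerts(stats: PipelineStats) -> List[str]:
    """Return the actions from the Alerts table whose conditions are met."""
    actions = []
    if stats.build_minutes > 30:
        actions.append("Investigate")
    if stats.tests_total and stats.tests_failed / stats.tests_total > 0.05:
        actions.append("Review")
    if stats.consecutive_cache_misses >= 3:
        actions.append("Clear cache")
    if stats.critical_cves > 0:
        actions.append("Block merge")
    return actions


if __name__ == "__main__":
    sample = PipelineStats(
        build_minutes=42,
        tests_failed=1,
        tests_total=200,
        consecutive_cache_misses=3,
        critical_cves=0,
    )
    print(evaluate_alerts(sample))
```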