Release Orchestrator Architecture
Technical architecture specification for the Release Orchestrator — Stella Ops Suite's central release control plane for non-Kubernetes container estates.
Status: Planned (not yet implemented)
Overview
The Release Orchestrator transforms Stella Ops Suite from a vulnerability scanning platform into a centralized, auditable release control plane. It sits between CI systems and runtime targets, governing promotion across environments, enforcing security and policy gates, and producing verifiable evidence for every release decision.
Core Value Proposition
- Release orchestration — UI-driven promotion (Dev → Stage → Prod), approvals, policy gates, rollbacks
- Security decisioning as a gate — Scan on build, evaluate on release, re-evaluate on CVE updates
- OCI-digest-first releases — Immutable digest-based release identity
- Toolchain-agnostic integrations — Plug into any SCM, CI, registry, secrets system
- Auditability + standards — Evidence packets, SBOM/VEX/attestation support, deterministic replay
Design Principles
- Digest-First Release Identity — A release is an immutable set of OCI digests, never mutable tags. Tags are resolved to digests at release creation time.
- Pluggable Everything, Stable Core — Integrations are plugins; the core orchestration engine is stable. Plugins contribute UI screens, connector logic, step types, and agent types.
- Evidence for Every Decision — Every deployment/promotion produces an immutable evidence record containing who, what, why, how, and when.
- No Feature Gating — All plans include all features. Limits are only: environments, new digests/day, fair use on deployments.
- Offline-First Operation — All core operations work in air-gapped environments. Plugins may require connectivity; core does not.
- Immutable Generated Artifacts — Every deployment generates and stores immutable artifacts (compose lockfiles, scripts, evidence).
Platform Themes
The Release Orchestrator introduces ten new functional themes:
| Theme | Purpose | Key Modules |
|---|---|---|
| INTHUB | Integration hub | Integration Manager, Connection Profiles, Connector Runtime |
| ENVMGR | Environment management | Environment Manager, Target Registry, Agent Manager |
| RELMAN | Release management | Component Registry, Version Manager, Release Manager |
| WORKFL | Workflow engine | Workflow Designer, Workflow Engine, Step Executor |
| PROMOT | Promotion and approval | Promotion Manager, Approval Gateway, Decision Engine |
| DEPLOY | Deployment execution | Deploy Orchestrator, Target Executor, Artifact Generator |
| AGENTS | Deployment agents | Agent Core, Docker/Compose/ECS/Nomad agents |
| PROGDL | Progressive delivery | A/B Manager, Traffic Router, Canary Controller |
| RELEVI | Release evidence | Evidence Collector, Sticker Writer, Audit Exporter |
| PLUGIN | Plugin infrastructure | Plugin Registry, Plugin Loader, Plugin SDK |
Components
```
ReleaseOrchestrator/
├── __Libraries/
│   ├── StellaOps.ReleaseOrchestrator.Core/         # Core domain models
│   ├── StellaOps.ReleaseOrchestrator.Workflow/     # DAG workflow engine
│   ├── StellaOps.ReleaseOrchestrator.Promotion/    # Promotion logic
│   ├── StellaOps.ReleaseOrchestrator.Deploy/       # Deployment coordination
│   ├── StellaOps.ReleaseOrchestrator.Evidence/     # Evidence generation
│   ├── StellaOps.ReleaseOrchestrator.Plugin/       # Plugin infrastructure
│   └── StellaOps.ReleaseOrchestrator.Integration/  # Integration connectors
├── StellaOps.ReleaseOrchestrator.WebService/       # HTTP API
├── StellaOps.ReleaseOrchestrator.Worker/           # Background processing
├── StellaOps.Agent.Core/                           # Agent base framework
├── StellaOps.Agent.Docker/                         # Docker host agent
├── StellaOps.Agent.Compose/                        # Docker Compose agent
├── StellaOps.Agent.SSH/                            # SSH agentless executor
├── StellaOps.Agent.WinRM/                          # WinRM agentless executor
├── StellaOps.Agent.ECS/                            # AWS ECS agent
├── StellaOps.Agent.Nomad/                          # HashiCorp Nomad agent
└── __Tests/
    └── StellaOps.ReleaseOrchestrator.*.Tests/
```
Data Flow
Release Orchestration Flow
```
CI Build → Registry Push → Webhook → Stella Scan → Create Release →
Request Promotion → Gate Evaluation → Decision Record →
Deploy via Agent → Version Sticker → Evidence Packet
```
Detailed Flow
- CI pushes image to registry by digest; triggers webhook to Stella
- Stella scans the new digest (if not already scanned); stores verdict
- Release created bundling component digests with semantic version
- Promotion requested to move release from source → target environment
- Gate evaluation runs: security verdict, approval count, freeze windows, custom policies
- Decision record produced with evidence references, then signed
- Deployment executed via agent to target (Docker/Compose/ECS/Nomad)
- Version sticker written to target for drift detection
- Evidence packet sealed and stored
Key Abstractions
Environment
```csharp
public sealed record Environment
{
    public required Guid Id { get; init; }
    public required Guid TenantId { get; init; }
    public required string Name { get; init; }          // "dev", "stage", "prod"
    public required string Slug { get; init; }          // URL-safe identifier
    public required int PromotionOrder { get; init; }   // 1, 2, 3...
    public required FreezeWindow[] FreezeWindows { get; init; }
    public required ApprovalPolicy ApprovalPolicy { get; init; }
    public required bool IsProduction { get; init; }
    public EnvironmentState State { get; init; }        // Active, Frozen, Retired
}
```
Release
```csharp
using System.Collections.Immutable;

public sealed record Release
{
    public required Guid Id { get; init; }
    public required Guid TenantId { get; init; }
    public required string Version { get; init; }       // SemVer: "2.3.1"
    public required string Name { get; init; }          // Display name
    public required ImmutableDictionary<string, ComponentDigest> Components { get; init; }
    public required string SourceRef { get; init; }     // Git SHA or tag
    public required DateTimeOffset CreatedAt { get; init; }
    public required Guid CreatedBy { get; init; }
    public ReleaseState State { get; init; }            // Draft, Active, Deprecated
}

public sealed record ComponentDigest
{
    public required string Repository { get; init; }        // registry.example.com/app/api
    public required string Digest { get; init; }            // sha256:abc123...
    public required string? ResolvedFromTag { get; init; }  // Tag the digest was resolved from, e.g. "v2.3.1"; null if none
}
```
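The digest-first principle implies that a release should only ever accept digest-pinned references. The following is a minimal sketch of such a guard, not the project's implementation: `DigestReference`, `IsDigest`, and `TryParse` are hypothetical names, and the sketch assumes `sha256` digests written as 64 lowercase hex characters (OCI permits other algorithms).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of the specification): accepts only
// digest-pinned image references of the form "repository@sha256:<64 hex>".
public static class DigestReference
{
    // Assumption: only sha256 digests are accepted in this sketch.
    private static readonly Regex Sha256Digest =
        new(@"^sha256:[0-9a-f]{64}\z", RegexOptions.Compiled);

    public static bool IsDigest(string value) => Sha256Digest.IsMatch(value);

    // Splits "repo@sha256:..." into its parts. Returns false for tag-based
    // references such as "repo:v2.3.1", which must be resolved to a digest first.
    public static bool TryParse(string imageRef, out string repository, out string digest)
    {
        repository = string.Empty;
        digest = string.Empty;

        int at = imageRef.LastIndexOf('@');
        if (at <= 0) return false;

        string candidate = imageRef[(at + 1)..];
        if (!IsDigest(candidate)) return false;

        repository = imageRef[..at];
        digest = candidate;
        return true;
    }
}
```

A caller could reject any reference for which `TryParse` returns false and route it through a registry connector to resolve the tag before the release is created.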
Promotion
```csharp
using System.Collections.Immutable;

public sealed record Promotion
{
    public required Guid Id { get; init; }
    public required Guid TenantId { get; init; }
    public required Guid ReleaseId { get; init; }
    public required Guid SourceEnvironmentId { get; init; }
    public required Guid TargetEnvironmentId { get; init; }
    public required Guid RequestedBy { get; init; }
    public required DateTimeOffset RequestedAt { get; init; }
    public PromotionState State { get; init; }          // Pending, Approved, Rejected, Deployed, RolledBack
    public required ImmutableArray<GateResult> GateResults { get; init; }
    public required ImmutableArray<ApprovalRecord> Approvals { get; init; }
    public required DecisionRecord? Decision { get; init; }
}
```
Workflow
```csharp
using System.Collections.Immutable;

public sealed record Workflow
{
    public required Guid Id { get; init; }
    public required string Name { get; init; }
    public required ImmutableArray<WorkflowStep> Steps { get; init; }
    public required ImmutableDictionary<string, string[]> DependencyGraph { get; init; }
}

public sealed record WorkflowStep
{
    public required string Id { get; init; }
    public required string Type { get; init; }          // "script", "approval", "deploy", "gate"
    public required StepProvider Provider { get; init; }
    public required ImmutableDictionary<string, object> Config { get; init; }
    public required string[] DependsOn { get; init; }
    public StepState State { get; init; }
}
```
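One way to derive a repeatable execution order from a dependency graph shaped like `Workflow.DependencyGraph` is Kahn's algorithm with a sorted ready set. The sketch below assumes the graph maps each step ID to the IDs it depends on; `WorkflowOrdering` and `TopologicalOrder` are hypothetical names, and the actual engine may order or parallelize steps differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: orders step IDs so that every step comes after the
// steps it depends on. Ties are broken by ordinal ID so the order is stable.
public static class WorkflowOrdering
{
    // dependsOn maps a step ID to the IDs it depends on (assumed shape of
    // Workflow.DependencyGraph).
    public static IReadOnlyList<string> TopologicalOrder(
        IReadOnlyDictionary<string, string[]> dependsOn)
    {
        var remaining = new Dictionary<string, int>(StringComparer.Ordinal);
        var dependents = new Dictionary<string, List<string>>(StringComparer.Ordinal);

        foreach (var (step, deps) in dependsOn)
        {
            remaining[step] = deps.Length;
            foreach (var dep in deps)
            {
                if (!dependsOn.ContainsKey(dep))
                    throw new InvalidOperationException(
                        $"Step '{step}' depends on unknown step '{dep}'.");
                if (!dependents.TryGetValue(dep, out var list))
                    dependents[dep] = list = new List<string>();
                list.Add(step);
            }
        }

        // Steps with no unmet dependencies, kept sorted for determinism.
        var ready = new SortedSet<string>(
            remaining.Where(kv => kv.Value == 0).Select(kv => kv.Key),
            StringComparer.Ordinal);
        var order = new List<string>(remaining.Count);

        while (ready.Count > 0)
        {
            string next = ready.First(); // smallest ID among the ready steps
            ready.Remove(next);
            order.Add(next);

            if (!dependents.TryGetValue(next, out var unlocked)) continue;
            foreach (var step in unlocked)
            {
                if (--remaining[step] == 0) ready.Add(step);
            }
        }

        // Any step never reaching zero unmet dependencies is part of a cycle.
        if (order.Count != remaining.Count)
            throw new InvalidOperationException("Dependency graph contains a cycle.");

        return order;
    }
}
```

Because the ready set is sorted, the same graph always yields the same order regardless of dictionary enumeration order, which is one way to approach the determinism the specification calls for.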
Target
```csharp
using System.Collections.Immutable;

public sealed record Target
{
    public required Guid Id { get; init; }
    public required Guid TenantId { get; init; }
    public required Guid EnvironmentId { get; init; }
    public required string Name { get; init; }
    public required TargetType Type { get; init; }      // See TargetType below
    public required ImmutableDictionary<string, string> Labels { get; init; }
    public required Guid? AgentId { get; init; }        // Null for agentless
    public required TargetState State { get; init; }
    public required HealthStatus Health { get; init; }
}

public enum TargetType
{
    DockerHost,
    ComposeHost,
    ECSService,
    NomadJob,
    SSHRemote,
    WinRMRemote
}
```
Agent
```csharp
using System.Collections.Immutable;

public sealed record Agent
{
    public required Guid Id { get; init; }
    public required Guid TenantId { get; init; }
    public required string Name { get; init; }
    public required string Version { get; init; }
    public required ImmutableArray<string> Capabilities { get; init; }
    public required DateTimeOffset LastHeartbeat { get; init; }
    public required AgentState State { get; init; }     // Online, Offline, Degraded
    public required ImmutableDictionary<string, string> Labels { get; init; }
}
```
Database Schema
| Table | Purpose |
|---|---|
| `release.environments` | Environment definitions with freeze windows |
| `release.targets` | Deployment targets within environments |
| `release.agents` | Registered deployment agents |
| `release.components` | Component definitions (service → repository mapping) |
| `release.releases` | Release bundles (version → component digests) |
| `release.promotions` | Promotion requests and state |
| `release.approvals` | Approval records |
| `release.workflows` | Workflow templates |
| `release.workflow_runs` | Workflow execution state |
| `release.deployment_jobs` | Deployment job records |
| `release.evidence_packets` | Sealed evidence records |
| `release.integrations` | Integration configurations |
| `release.plugins` | Plugin registrations |
Gate Types
| Gate | Purpose | Evaluation |
|---|---|---|
| Security | Check scan verdict | Query latest scan for release digest; block on critical/high reachable |
| Approval | Human sign-off | Count approvals; check SoD rules |
| FreezeWindow | Calendar-based blocking | Check target environment freeze windows |
| PreviousEnvironment | Require prior deployment | Verify release deployed to source environment |
| Policy | Custom OPA/Rego rules | Evaluate policy with promotion context |
| HealthCheck | Target health | Verify target is healthy before deploy |
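Two of the gates in the table are simple enough to sketch directly. The example below is illustrative only: the specification names `FreezeWindow` and `GateResult` but does not define their members, so the record shapes here are assumptions, as are `GateEvaluation` and its methods. The Security, PreviousEnvironment, Policy, and HealthCheck gates are omitted because they depend on external services.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shapes: the specification does not define these members.
public sealed record FreezeWindow(DateTimeOffset Start, DateTimeOffset End, string Reason);
public sealed record GateResult(string Gate, bool Passed, string Message);

public static class GateEvaluation
{
    // FreezeWindow gate: fails if 'now' falls inside any window [Start, End).
    public static GateResult EvaluateFreezeWindows(
        IEnumerable<FreezeWindow> windows, DateTimeOffset now)
    {
        var active = windows.FirstOrDefault(w => w.Start <= now && now < w.End);
        return active is null
            ? new GateResult("FreezeWindow", true, "No active freeze window.")
            : new GateResult("FreezeWindow", false,
                $"Frozen until {active.End:O}: {active.Reason}");
    }

    // Approval gate: counts distinct approvers against the required minimum.
    public static GateResult EvaluateApprovals(IEnumerable<Guid> approvers, int required)
    {
        int count = approvers.Distinct().Count();
        return new GateResult("Approval", count >= required,
            $"{count} of {required} required approvals.");
    }

    // A promotion may proceed only if every gate passed.
    public static bool AllPassed(IEnumerable<GateResult> results) =>
        results.All(r => r.Passed);
}
```

Returning a `GateResult` for passing gates as well as failing ones keeps a complete record of what was evaluated, which fits the requirement that every decision produces evidence.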
Plugin System (Three-Surface Model)
Plugins contribute through three surfaces:
1. Manifest (Static Declaration)
```yaml
# plugin-manifest.yaml
name: github-integration
version: 1.0.0
provider: StellaOps.Integration.GitHub.Plugin
capabilities:
  integrations:
    - type: scm
      id: github
      displayName: GitHub
  steps:
    - type: github-status
      displayName: Update GitHub Status
  gates:
    - type: github-check
      displayName: GitHub Check Required
```
2. Connector Runtime (Dynamic Execution)
```csharp
public interface IIntegrationConnector
{
    Task<ConnectionTestResult> TestConnectionAsync(CancellationToken ct);
    Task<HealthStatus> GetHealthAsync(CancellationToken ct);
    Task<IReadOnlyList<Resource>> DiscoverResourcesAsync(string resourceType, CancellationToken ct);
}

public interface ISCMConnector : IIntegrationConnector
{
    // "ref" is a reserved C# keyword, so the parameter is named gitRef.
    Task<CommitInfo> GetCommitAsync(string gitRef, CancellationToken ct);
    Task CreateCommitStatusAsync(string commit, CommitStatus status, CancellationToken ct);
}

public interface IRegistryConnector : IIntegrationConnector
{
    Task<string> ResolveDigestAsync(string imageRef, CancellationToken ct);
    Task<bool> VerifyDigestAsync(string imageRef, string expectedDigest, CancellationToken ct);
}
```
3. Step Provider (Execution Contract)
```csharp
public interface IStepProvider
{
    StepExecutionCharacteristics Characteristics { get; }
    Task<StepResult> ExecuteAsync(StepContext context, CancellationToken ct);
    Task<StepResult> RollbackAsync(StepContext context, CancellationToken ct);
}

public sealed record StepExecutionCharacteristics
{
    public bool IsIdempotent { get; init; }
    public bool SupportsRollback { get; init; }
    public TimeSpan DefaultTimeout { get; init; }
    public ResourceRequirements Resources { get; init; }
}
```
Invariants
- Release identity is immutable — Once created, a release's component digests cannot be changed. Create a new release instead.
- Promotions are append-only — Promotion state transitions are recorded; no edits or deletions.
- Evidence packets are sealed — Evidence is cryptographically signed and stored immutably.
- Digest verification at deploy time — Agents verify image digests at pull time; mismatch fails deployment.
- Separation of duties enforced — Requester cannot be sole approver for production promotions.
- Workflow execution is deterministic — Same inputs produce same execution order and outputs.
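The separation-of-duties invariant can be read as a small predicate over the requester and the set of approvers. This is one possible interpretation, sketched under the assumption that approvers are identified by the same `Guid` user IDs as `Promotion.RequestedBy`; `SeparationOfDuties` and `IsSatisfied` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SeparationOfDuties
{
    // For production targets, at least one approver must differ from the
    // requester. Non-production promotions are not restricted by this rule.
    // Note: with no approvers at all, a production promotion is not satisfied.
    public static bool IsSatisfied(
        Guid requestedBy, IEnumerable<Guid> approvers, bool isProduction)
    {
        if (!isProduction) return true;
        return approvers.Any(approver => approver != requestedBy);
    }
}
```

Under this reading the requester may still approve their own promotion, but that approval alone never unlocks a production deployment.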
Error Handling
- Transient failures — Retry with exponential backoff; circuit breaker for repeated failures
- Agent disconnection — Mark agent offline; reassign pending tasks to other agents
- Deployment failure — Automatic rollback if configured; otherwise mark promotion as failed
- Gate failure — Block promotion; require manual intervention or re-evaluation
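The transient-failure policy above can be sketched as a retry loop whose delay doubles on each attempt up to a cap. This is a minimal illustration rather than the planned implementation: `Retry`, `BackoffDelay`, and `ExecuteAsync` are hypothetical names, the caller supplies the definition of "transient", and jitter and the circuit breaker are left out for brevity.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Retry
{
    // Delay before retry number 'attempt' (1-based):
    // baseDelay * 2^(attempt - 1), capped at maxDelay.
    public static TimeSpan BackoffDelay(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        double factor = Math.Pow(2, attempt - 1);
        double ms = Math.Min(baseDelay.TotalMilliseconds * factor, maxDelay.TotalMilliseconds);
        return TimeSpan.FromMilliseconds(ms);
    }

    // Runs 'operation' up to maxAttempts times. Only exceptions that
    // 'isTransient' accepts are retried; the last failure is rethrown.
    public static async Task<T> ExecuteAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        Func<Exception, bool> isTransient,
        int maxAttempts,
        TimeSpan baseDelay,
        TimeSpan maxDelay,
        CancellationToken ct)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation(ct);
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                await Task.Delay(BackoffDelay(attempt, baseDelay, maxDelay), ct);
            }
        }
    }
}
```

The exception filter keeps non-transient errors and the final failed attempt out of the catch block, so they propagate to the caller with their original stack trace.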
Observability
Metrics
- `release_promotions_total` — Counter by environment and outcome
- `release_deployments_duration_seconds` — Histogram of deployment times
- `release_gate_evaluations_total` — Counter by gate type and result
- `release_agents_online` — Gauge of online agents
- `release_workflow_steps_duration_seconds` — Histogram by step type
Traces
- `promotion.request` — Span for promotion request handling
- `gate.evaluate` — Span for each gate evaluation
- `deployment.execute` — Span for deployment execution
- `agent.task` — Span for agent task execution
Logs
- Structured logs with correlation IDs
- Promotion ID, release ID, environment ID in all relevant logs
- Sensitive data (secrets, credentials) masked
Security Considerations
Agent Security
- mTLS authentication — Agents authenticate with CA-signed certificates
- Short-lived credentials — Task credentials expire after execution
- Capability-based authorization — Agents only receive tasks matching their capabilities
- Heartbeat monitoring — Detect and flag agent disconnections
Secrets Management
- Never stored in database — Only vault references stored
- Fetched at execution time — Secrets retrieved just-in-time for deployment
- Short-lived — Dynamic credentials with minimal TTL
- Masked in logs — Secret values never logged
Plugin Sandbox
- Resource limits — CPU, memory, timeout limits per plugin
- Capability restrictions — Plugins declare required capabilities
- Network isolation — Optional network restrictions for plugins
Performance Characteristics
- Promotion evaluation — < 5 seconds for typical gate evaluation
- Deployment latency — Dominated by image pull time; orchestration overhead < 10 seconds
- Agent heartbeat — 30-second interval; offline detection within 90 seconds
- Workflow step timeout — Configurable; default 5 minutes per step
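The heartbeat figures above (30-second interval, offline detection within 90 seconds) amount to tolerating up to three missed heartbeats. A minimal sketch of that classification follows; `HeartbeatMonitor` and `Classify` are hypothetical names, the enum mirrors the states listed for `Agent.State`, and how `Degraded` is derived is not specified, so this sketch never returns it.

```csharp
using System;

// Mirrors the states listed for Agent.State in the specification.
public enum AgentState { Online, Offline, Degraded }

public static class HeartbeatMonitor
{
    // Values taken from the stated performance targets.
    public static readonly TimeSpan HeartbeatInterval = TimeSpan.FromSeconds(30);
    public static readonly TimeSpan OfflineThreshold = TimeSpan.FromSeconds(90);

    // An agent is considered offline once its last heartbeat is at least
    // OfflineThreshold old. Degraded is not derived from heartbeats here.
    public static AgentState Classify(DateTimeOffset lastHeartbeat, DateTimeOffset now) =>
        now - lastHeartbeat >= OfflineThreshold ? AgentState.Offline : AgentState.Online;
}
```

Passing `now` explicitly rather than reading the system clock keeps the check deterministic and easy to test.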
Implementation Roadmap
| Phase | Focus | Key Deliverables |
|---|---|---|
| Phase 1 | Foundation | Environment management, integration hub, release bundles |
| Phase 2 | Workflow Engine | DAG execution, step registry, workflow templates |
| Phase 3 | Promotion & Decision | Approval gateway, security gates, decision records |
| Phase 4 | Deployment Execution | Docker/Compose agents, artifact generation, rollback |
| Phase 5 | UI & Polish | Release dashboard, promotion UI, environment management |
| Phase 6 | Progressive Delivery | A/B releases, canary, traffic routing |
| Phase 7 | Extended Targets | ECS, Nomad, SSH/WinRM agentless |
| Phase 8 | Plugin Ecosystem | Full plugin system, marketplace |