# Architecture Overview (High-Level)
This document is the 10-minute tour for Stella Ops Suite: what components exist, how they fit together, and what "release control plane + security gates + evidence-linked decisions" means in practice.
For the full reference map (services, boundaries, detailed flows), see `docs/ARCHITECTURE_REFERENCE.md`.
## What Stella Ops Suite Is
**Stella Ops Suite is a centralized, auditable release control plane for non-Kubernetes container estates.**
It sits between your CI and your runtime targets, governs promotion across environments, enforces security and policy gates, and produces verifiable evidence for every release decision.
```
CI Build → Registry → Stella (Scan + Release + Promote + Gate + Deploy) → Targets → Evidence
```
## Guiding Principles
- **Digest-first releases:** a release is an immutable set of OCI digests, never mutable tags.
- **Deterministic replay:** the same inputs yield the same outputs (stable ordering, canonical hashing, UTC timestamps).
- **Evidence-linked decisions:** every release decision links to concrete evidence artifacts (scan verdicts, approvals, policy evaluations).
- **Pluggable everything:** integrations are plugins; the core orchestration engine is stable.
- **Offline-first:** all core operations work in air-gapped environments.
- **No feature gating:** all plans include all features; plans differ only in their limits on environments and new digests per day.
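
The digest-first and deterministic-replay principles can be illustrated with a small Python sketch. It is not taken from the Stella Ops codebase: the `release_id` function, the component names, and the choice of sorted compact JSON as the canonical form are all assumptions made for illustration.

```python
import hashlib
import json


def release_id(components: dict[str, str]) -> str:
    """Derive a stable identifier from component name -> OCI digest pairs.

    Hypothetical helper: the real release identity scheme may differ.
    """
    for name, digest in components.items():
        # Reject mutable tags: only sha256 content digests are accepted.
        if not digest.startswith("sha256:") or len(digest) != len("sha256:") + 64:
            raise ValueError(f"{name}: expected a sha256 digest, got {digest!r}")
    # Canonical form (assumed): sorted keys, compact separators, UTF-8 bytes.
    canonical = json.dumps(components, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Insertion order differs, but the canonical form (and so the ID) does not.
first = release_id({"api": "sha256:" + "a" * 64, "web": "sha256:" + "b" * 64})
second = release_id({"web": "sha256:" + "b" * 64, "api": "sha256:" + "a" * 64})
```

Because the identifier depends only on the sorted digest set, two builds of the same release description agree regardless of the order in which components were listed.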
## System Map
### Release-Centric Flow
```
Build → Scan → Create Release → Request Promotion → Gate Evaluation → Deploy → Evidence
         ↑                                                 ↓
         └────────── Re-evaluate on CVE Updates ───────────┘
```
### Platform Themes
Stella Ops Suite organizes capabilities into **themes** (functional areas):
#### Existing Themes (Operational)
| Theme | Purpose | Key Modules |
|-------|---------|-------------|
| **INGEST** | Advisory ingestion | Concelier, Advisory-AI |
| **VEXOPS** | VEX document handling | Excititor, VEX Lens, VEX Hub |
| **REASON** | Policy and decisioning | Policy Engine, OPA Runtime |
| **SCANENG** | Scanning and SBOM | Scanner, SBOM Service, Reachability |
| **EVIDENCE** | Evidence and attestation | Evidence Locker, Attestor, Export Center |
| **RUNTIME** | Runtime signals | Signals, Graph, Zastava |
| **JOBCTRL** | Job orchestration | Scheduler, Orchestrator, TaskRunner |
| **OBSERVE** | Observability | Notifier, Telemetry |
| **REPLAY** | Deterministic replay | Replay Engine |
| **DEVEXP** | Developer experience | CLI, Web UI, SDK |
#### Planned Themes (Release Orchestration)
| Theme | Purpose | Key Modules |
|-------|---------|-------------|
| **INTHUB** | Integration hub | Integration Manager, Connection Profiles, Connector Runtime, Doctor Checks |
| **ENVMGR** | Environment management | Environment Manager, Target Registry, Agent Manager, Inventory Sync |
| **RELMAN** | Release management | Component Registry, Version Manager, Release Manager, Release Catalog |
| **WORKFL** | Workflow engine | Workflow Designer, Workflow Engine, Step Executor, Step Registry |
| **PROMOT** | Promotion and approval | Promotion Manager, Approval Gateway, Decision Engine, Gate Registry |
| **DEPLOY** | Deployment execution | Deploy Orchestrator, Target Executor, Runner Executor, Artifact Generator, Rollback Manager |
| **AGENTS** | Deployment agents | Agent Core, Agent Docker, Agent Compose, Agent SSH, Agent WinRM, Agent ECS, Agent Nomad |
| **PROGDL** | Progressive delivery | A/B Manager, Traffic Router, Canary Controller, Rollout Strategy |
| **RELEVI** | Release evidence | Evidence Collector, Evidence Signer, Sticker Writer, Audit Exporter |
| **PLUGIN** | Plugin infrastructure | Plugin Registry, Plugin Loader, Plugin Sandbox, Plugin SDK |
### Service Tiers
| Tier | Services | Key Responsibilities |
|------|----------|----------------------|
| **Edge / Identity** | `StellaOps.Authority` | Issues short-lived tokens (DPoP + mTLS), exposes OIDC flows, rotates JWKS |
| **Release Control** | `StellaOps.ReleaseManager`, `StellaOps.PromotionManager`, `StellaOps.WorkflowEngine` | Release bundles, promotion workflows, gate evaluation (planned) |
| **Integration Hub** | `StellaOps.IntegrationManager`, `StellaOps.ConnectorRuntime` | SCM/CI/Registry/Vault connectors (planned) |
| **Scan & Attest** | `StellaOps.Scanner`, `StellaOps.Signer`, `StellaOps.Attestor` | Accepts SBOMs/images, produces DSSE bundles, handles transparency logging |
| **Evidence Graph** | `StellaOps.Concelier`, `StellaOps.Excititor`, `StellaOps.Policy.Engine` | Advisories/VEX, linksets, lattice policy |
| **Deployment** | `StellaOps.DeployOrchestrator`, `StellaOps.Agent.*` | Deployment execution to Docker/Compose/ECS/Nomad (planned) |
| **Experience** | `StellaOps.Web`, `StellaOps.Cli`, `StellaOps.Notify`, `StellaOps.ExportCenter` | Operator UX, automation, notifications |
| **Data Plane** | PostgreSQL, Valkey, RustFS/object storage | Canonical store, queues, artifact storage |
## Infrastructure (What Is Required)
**Required**
- **PostgreSQL:** canonical persistent store for module schemas.
- **Valkey:** Redis-compatible cache/streams and DPoP nonce store.
- **RustFS (or equivalent S3-compatible store):** object storage for artifacts, bundles, and evidence.
**Optional (deployment-dependent)**
- **NATS JetStream:** messaging transport used in some deployments.
- **Transparency log services:** Rekor mirror (and CA services) when transparency is enabled.
## End-to-End Flows
### Current: Vulnerability Scanning Flow
1. **Evidence enters** via Concelier and Excititor connectors (Aggregation-Only Contract).
2. **SBOM arrives** from CLI/CI; Scanner deduplicates layers and enqueues work.
3. **Analyzer bundle** runs inside the Worker and stores evidence in content-addressed caches.
4. **Policy Engine** merges advisories, VEX, and inventory/usage facts; emits explain traces and stable dispositions.
5. **Signer + Attestor** wrap outputs into DSSE bundles and (optionally) anchor them in a Rekor mirror.
6. **Console/CLI/Export** surface findings and package verifiable evidence; Notify emits digests/incidents.
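
Step 5 wraps outputs in DSSE bundles. The sketch below shows the DSSE pre-authentication encoding (PAE) and envelope shape as defined by the DSSE specification. The signing scheme is an assumption: HMAC-SHA256 stands in for whatever keys and algorithms Signer actually uses, and the payload type, key, and key ID are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the bytes that actually get signed."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def sign_envelope(payload_type: str, payload: bytes, key: bytes, keyid: str) -> dict:
    # Stand-in signer: HMAC-SHA256 replaces a real asymmetric signature.
    sig = hmac.new(key, pae(payload_type, payload), hashlib.sha256).digest()
    return {
        "payloadType": payload_type,
        "payload": base64.b64encode(payload).decode("ascii"),
        "signatures": [{"keyid": keyid, "sig": base64.b64encode(sig).decode("ascii")}],
    }


def verify_envelope(envelope: dict, key: bytes) -> bool:
    payload = base64.b64decode(envelope["payload"])
    expected = hmac.new(key, pae(envelope["payloadType"], payload), hashlib.sha256).digest()
    return any(
        hmac.compare_digest(expected, base64.b64decode(s["sig"]))
        for s in envelope["signatures"]
    )


# Hypothetical verdict payload and media type, for illustration only.
verdict = json.dumps({"digest": "sha256:" + "c" * 64, "verdict": "pass"}, sort_keys=True)
envelope = sign_envelope("application/vnd.example.verdict+json", verdict.encode(), b"demo-key", "demo")
```

Signing the PAE rather than the raw payload binds the payload type into the signature, so a verifier cannot be tricked into interpreting the same bytes under a different type.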
### Planned: Release Orchestration Flow
1. **CI pushes image** to registry by digest; triggers webhook to Stella.
2. **Stella scans** the new digest and stores the verdict.
3. **Release created**, bundling component digests under a semantic version.
4. **Promotion requested** to move the release from Dev → Stage → Prod.
5. **Gate evaluation** checks the security verdict, approval count, freeze windows, and custom policies.
6. **Decision record** produced with evidence references, then signed.
7. **Deployment executed** via agent to target (Docker/Compose/ECS/Nomad).
8. **Version sticker** written to target for drift detection.
9. **Evidence packet** sealed and stored.
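
Steps 5 and 6 of the planned flow can be sketched as follows. Since this flow is still planned, every name here (`PromotionRequest`, `GateResult`, `evaluate_gates`, the gate identifiers, the record fields) is hypothetical; the sketch only shows how gate results and evidence references might be gathered into one decision record. Signing is omitted.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PromotionRequest:
    release_id: str
    target_env: str
    scan_verdict: str                 # "pass" or "fail" (assumed vocabulary)
    approvals: tuple[str, ...]        # approver identities
    requested_at: datetime            # UTC
    evidence_refs: tuple[str, ...]    # pointers to scan verdicts, approvals, ...


@dataclass(frozen=True)
class GateResult:
    gate: str
    passed: bool
    reason: str


def evaluate_gates(req: PromotionRequest, min_approvals: int,
                   freeze_windows: list[tuple[datetime, datetime]]) -> dict:
    """Run each gate and return a decision record that links to the evidence."""
    approvers = len(set(req.approvals))  # duplicate approvals count once
    frozen = any(start <= req.requested_at < end for start, end in freeze_windows)
    results = [
        GateResult("security-verdict", req.scan_verdict == "pass",
                   f"scan verdict is {req.scan_verdict}"),
        GateResult("approval-count", approvers >= min_approvals,
                   f"{approvers} of {min_approvals} required approvals"),
        GateResult("freeze-window", not frozen,
                   "inside a freeze window" if frozen else "no active freeze"),
    ]
    return {
        "release": req.release_id,
        "target": req.target_env,
        "allowed": all(r.passed for r in results),
        "gates": [asdict(r) for r in results],
        "evidence_refs": list(req.evidence_refs),
    }


request = PromotionRequest(
    release_id="rel-1", target_env="prod", scan_verdict="pass",
    approvals=("alice", "bob"),
    requested_at=datetime(2026, 1, 9, 12, 0, tzinfo=timezone.utc),
    evidence_refs=("scan:example-verdict",),
)
record = evaluate_gates(request, min_approvals=2, freeze_windows=[])
```

Keeping the evaluation a pure function of the request and the policy parameters is what makes the resulting record replayable later.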
## Extension Points
### Current Extension Points
- **Scanner analyzers** (restart-time plug-ins) for ecosystem-specific parsing and facts extraction.
- **Concelier connectors** for new advisory sources (preserving aggregation-only guardrails).
- **Policy packs** for organization-specific gating and waivers/justifications.
- **Export profiles** for output formats and offline bundle shapes.
### Planned Extension Points (Three-Surface Plugin Model)
Plugins contribute through three surfaces:
1. **Manifest** (static declaration): what the plugin provides (integrations, steps, agents, gates).
2. **Connector Runtime** (dynamic execution): gRPC interface for runtime operations.
3. **Step Provider** (execution contract): execution characteristics for workflow steps.
Plugin types:
- **Integration connectors:** SCM, CI, Registry, Vault, Target, Router
- **Step providers:** Custom workflow steps
- **Agent types:** New deployment target types
- **Gate providers:** Custom gate evaluations
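
To make the manifest surface concrete, here is a minimal sketch of a static declaration and a registry index built from it. The plugin model is planned, so the field names, the `ALLOWED_KINDS` set, and the `index_manifests` helper are assumptions rather than the actual manifest schema.

```python
from dataclasses import dataclass

# The four contribution kinds named in the manifest surface above.
ALLOWED_KINDS = {"integration", "step", "agent", "gate"}


@dataclass(frozen=True)
class Capability:
    kind: str   # one of ALLOWED_KINDS
    name: str   # e.g. a connector or step name (hypothetical)


@dataclass(frozen=True)
class PluginManifest:
    plugin_id: str
    version: str
    provides: tuple[Capability, ...]


def index_manifests(manifests: list[PluginManifest]) -> dict[tuple[str, str], str]:
    """Map (kind, name) to the providing plugin, rejecting bad declarations."""
    index: dict[tuple[str, str], str] = {}
    for manifest in manifests:
        for cap in manifest.provides:
            if cap.kind not in ALLOWED_KINDS:
                raise ValueError(f"{manifest.plugin_id}: unknown kind {cap.kind!r}")
            key = (cap.kind, cap.name)
            if key in index:
                raise ValueError(f"{key} already provided by {index[key]}")
            index[key] = manifest.plugin_id
    return index


registry = index_manifests([
    PluginManifest("example.scm", "1.0.0", (Capability("integration", "scm"),)),
    PluginManifest("example.gates", "0.2.0", (Capability("gate", "license-check"),
                                              Capability("step", "notify"))),
])
```

Because the manifest is static, the core can validate and index a plugin's contributions before loading any of its code.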
## Offline & Sovereign Notes
- Offline Kit carries vulnerability feeds, container images, signatures, and verification material so the workflow stays identical when air-gapped.
- Authority + token verification remain local; quota enforcement is verifiable offline.
- Attestor can cache transparency proofs for offline verification.
- Evidence packets are exportable for external audit in air-gapped environments.
- All release decisions can be replayed with frozen inputs.
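
Replaying a decision from frozen inputs can be sketched as below. The packet layout, the `decide` rule, and the hashing scheme are assumptions for illustration; the point is that a verifier with no network access can recompute the decision and compare hashes.

```python
import hashlib
import json


def canonical_hash(obj) -> str:
    # Assumed canonical form: sorted keys, compact separators, UTF-8.
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def decide(inputs: dict) -> dict:
    """Hypothetical pure decision rule: no clock, network, or randomness."""
    blocked_by = sorted(
        finding["id"]
        for finding in inputs["findings"]
        if finding["severity"] in inputs["blocking_severities"]
    )
    return {"allowed": not blocked_by, "blocked_by": blocked_by}


def seal(inputs: dict) -> dict:
    """Freeze the inputs together with hashes of the inputs and the decision."""
    decision = decide(inputs)
    return {
        "inputs": inputs,
        "decision": decision,
        "inputs_hash": canonical_hash(inputs),
        "decision_hash": canonical_hash(decision),
    }


def replay(packet: dict) -> bool:
    """Recompute the decision offline and check it against the sealed hashes."""
    if canonical_hash(packet["inputs"]) != packet["inputs_hash"]:
        return False  # frozen inputs were altered after sealing
    return canonical_hash(decide(packet["inputs"])) == packet["decision_hash"]


packet = seal({
    "findings": [{"id": "CVE-0000-0001", "severity": "critical"},
                 {"id": "CVE-0000-0002", "severity": "low"}],
    "blocking_severities": ["critical", "high"],
})
```

In practice the sealed hashes would themselves be signed, so that an auditor can trust them without trusting the storage they came from.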
## Key Architectural Decisions
| Decision | Rationale |
|----------|-----------|
| **Digest-first release identity** | Tags are mutable; digests provide immutable release identity for audit |
| **Three-surface plugin model** | Enables extensibility without core code changes |
| **Compiled C# scripts + sandboxed bash** | C# for complex orchestration; bash for simple hooks |
| **Agent + agentless execution** | Agent-based execution is preferred for reliability; agentless execution lowers the barrier to adoption |
| **Evidence packets for every decision** | Enables deterministic replay and audit-grade compliance |
## References
- `docs/ARCHITECTURE_REFERENCE.md` — Full reference map
- `docs/modules/release-orchestrator/architecture.md` — Release orchestrator design (planned)
- `docs/OFFLINE_KIT.md` — Air-gap operations
- `docs/API_CLI_REFERENCE.md` — API and CLI contracts
- `docs/modules/platform/architecture-overview.md` — Platform service design
- `docs/product/advisories/09-Jan-2026 - Stella Ops Orchestrator Architecture.md` — Full orchestrator specification