# High‑Level Architecture: **Stella Ops** (Consolidated • 2025Q4)

> **Purpose.** A complete, implementation‑ready map of Stella Ops: product vision, all runtime components, trust boundaries, tokens/licensing, control/data flows, storage, APIs, security, scale, DevOps, and verification logic.
> **Scope.** This file **replaces** the separate `components.md`; all component details now live here.

---

## 0) Product vision & principles

**Vision.** Stella Ops is a **deterministic SBOM + VEX platform** for CI/CD and runtime, tuned for **speed** (per‑layer deltas), **quiet output** (usage‑scoped views), and **verifiability** (DSSE + Rekor v2). It is **self‑hostable**, **air‑gap capable**, and **commercially enforceable**: only licensed installations can produce **Stella Ops‑verified** attestations.

**Operating principles.**

* **Scanner‑owned SBOMs.** We generate our own BOMs; we do not warehouse third‑party SBOM content (we can **link** to attested SBOMs).
* **Deterministic evidence.** Facts come from package DBs, installed metadata, linkers, and verified attestations; no fuzzy guessing in the core.
* **Per‑layer caching.** Cache fragments by **layer digest** and compose image SBOMs via **CycloneDX BOM‑Link** / **SPDX ExternalRef**.
* **Inventory vs Usage.** Always record the full **inventory** of what exists; separately present **usage** (entrypoint closure + loaded libs).
* **Backend decides.** PASS/FAIL is produced by **Policy** + **VEX** + **Advisories**. The scanner reports facts.
* **Attest or it didn’t happen.** Every export is signed as **in‑toto/DSSE** and logged in **Rekor v2**.
* **Sovereign‑ready.** Cloud is used only for licensing and optional endorsement; everything else is first‑party and self‑hostable.

---

## 1) Service topology & trust boundaries

### 1.1 Runtime inventory (first‑party)

| Service / Tool | Container image | Core role | Scale pattern |
| --- | --- | --- | --- |
| **Scanner.WebService** | `stellaops/scanner-web` | Control plane for scans; catalog; SBOM composition (inventory & usage); diff; exports; **analysis‑only report runs** for Scheduler. | Stateless; N replicas behind LB. |
| **Scanner.Worker** | `stellaops/scanner-worker` | Runs analyzers (OS, Lang: Java/Node/Python/Go/.NET/Rust, Native ELF/PE/Mach‑O, EntryTrace); emits per‑layer SBOMs and composes image SBOMs. | Horizontal; queue‑driven; sharded by layer digest. |
| **Scanner.Sbomer.BuildXPlugin** | `stellaops/sbom-indexer` | BuildKit **generator** for build‑time SBOMs as OCI **referrers**. | CI‑side; ephemeral. |
| **Scanner.Sbomer.DockerImage** | `stellaops/scanner-cli` | CLI‑orchestrated scanner container for post‑build scans. | Local/CI; ephemeral. |
| **Concelier.WebService** | `stellaops/concelier-web` | Vulnerability ingest/normalize/merge/export (JSON + Trivy DB). | HA via Mongo locks. |
| **Excititor.WebService** | `stellaops/excititor-web` | VEX ingest/normalize/consensus; conflict retention; exports. | HA via Mongo locks. |
| **Policy Engine** | (in `scanner-web`) | YAML DSL evaluator (waivers, vendor preferences, KEV/EPSS, license, usage‑gating); produces **policy digest**. | In‑process; cache per digest. |
| **Scheduler.WebService** | `stellaops/scheduler-web` | Schedules **re‑evaluation** runs; consumes Concelier/Excititor deltas; selects **impacted images** via BOM‑Index; orchestrates analysis‑only reports. | Stateless API. |
| **Scheduler.Worker** | `stellaops/scheduler-worker` | Executes selection and enqueues batches toward Scanner; enforces rate/limits and windows; maintains impact cursors. | Horizontal; queue‑driven. |
| **Notify.WebService** | `stellaops/notify-web` | Rules engine for outbound notifications; manages channels, templates, throttle/digest logic. | Stateless API. |
| **Notify.Worker** | `stellaops/notify-worker` | Delivers to Slack/Teams/Email/Webhooks; idempotent retries; digests. | Horizontal; per‑channel rate limits. |
| **Signer** | `stellaops/signer` | **Hard gate:** validates entitlement + release integrity; mints signing cert (Fulcio keyless) or uses KMS; signs DSSE. | Stateless; HPA by QPS. |
| **Attestor** | `stellaops/attestor` | Posts DSSE bundles to **Rekor v2**; verification endpoints. | Stateless; HPA by QPS. |
| **Authority** | `stellaops/authority` | On‑prem OIDC issuing **short‑lived OpToks** with DPoP/mTLS sender constraint. | HA behind LB. |
| **Zastava** (Runtime) | `stellaops/zastava` | Runtime inspector/enforcer (observer + optional Admission Webhook). | DaemonSet + Webhook. |
| **Web UI** | `stellaops/ui` | Angular app for scans, diffs, policy, VEX, **Scheduler**, **Notify**, runtime, reports. | Stateless. |
| **StellaOps.Cli** | `stellaops/cli` | CLI for init/scan/export/diff/policy/report/verify; Buildx helper; **schedule** and **notify** verbs. | Local/CI. |

### 1.2 Third‑party (self‑hosted)

* **Fulcio** (Sigstore CA): issues short‑lived signing certs (keyless).
* **Rekor v2** (tile‑backed transparency log).
* **MinIO**: S3‑compatible object store with lifecycle & Object Lock.
* **MongoDB**: catalog, advisories, VEX, scheduler, notify.
* **Queue**: Redis Streams / NATS / RabbitMQ (pluggable).
* **OCI Registry**: must support **Referrers API** (discover SBOMs/signatures).

### 1.3 Cloud licensing (Stella Ops)

* **Licensing Service** (`www.stella-ops.org`): issues long‑lived **License Tokens (LT)**; exchanges LT → **Proof‑of‑Entitlement (PoE)** bound to an installation key; revoke/introspect PoE; optional cross‑log **endorsement**.

### 1.4 Diagram (control/data planes & trust)

```mermaid
flowchart LR
  subgraph Cloud["www.stella-ops.org (Cloud)"]
    LS["Licensing Service<br/>LT→PoE / revoke / introspect"]
  end

  subgraph OnPrem["Customer Site (Self-hosted)"]
    Auth["Authority (OIDC)<br/>OpTok (DPoP/mTLS)"]
    SW[Scanner.WebService]
    WK[Scanner.Worker xN]
    CONC[Concelier]
    EXC[Excititor]
    SCHW[Scheduler.Web]
    SCH[Scheduler.Worker xN]
    NOTW[Notify.Web]
    NOT[Notify.Worker xN]
    POL["Policy Engine (in Scanner.Web)"]
    SGN["Signer<br/>(entitlement + signing)"]
    ATT["Attestor<br/>(Rekor v2 submit/verify)"]
    UI["Web UI (Angular)"]
    Z["Zastava<br/>(Runtime Inspector/Enforcer)"]
    MIN[(MinIO S3)]
    MGO[(MongoDB)]
    QUE[(Queue/Streams)]
  end

  CLI["StellaOps.Cli / Buildx Plugin"]
  REG[(OCI Registry with Referrers)]
  FUL[Fulcio]
  REK["Rekor v2 (tiles)"]

  CLI -->|scan/build| SW
  SW -->|jobs| QUE
  QUE --> WK
  WK --> MIN
  SW --> MGO
  CONC --> MGO
  EXC --> MGO
  UI --> SW
  Z --> SW

  %% New event-driven loop
  CONC -- export.delta --> SCHW
  EXC  -- export.delta --> SCHW
  SCHW --> SCH
  SCH --> SW
  SW -- report.ready --> NOTW
  Z  -- admission/observe --> NOTW

  SGN <--> Auth
  SGN --> FUL
  SGN -->|mTLS| ATT
  ATT --> REK

  SGN <-->|verify referrers| REG
```

**Trust boundaries.** Only **Signer** can sign; only **Attestor** can write to **Rekor v2**. Scanner/UI/Scheduler/Notify never sign.

---

## 2) Licensing & tokens (installation‑ready, theft‑resistant)

**Token model (two licensing tokens plus one operational token).**

* **License Token (LT)**: long‑lived JWT from **Licensing Service**; used **once** to enroll the installation; never used in the hot path.
* **Proof‑of‑Entitlement (PoE)**: bound to the installation key (mTLS client cert **or** DPoP‑bound JWT with `cnf`); medium‑lived; renewable; revocable.
* **Operational token (OpTok)**: 2–5 min OIDC token from **Authority**, **sender‑constrained** (DPoP or mTLS). Used to authenticate to **Signer**/**Scanner.WebService**/**Scheduler.Web**/**Notify.Web**.

**Signer enforces both PoE and OpTok:** PoE proves entitlement; OpTok proves “who is calling now”. It also **independently verifies** that the **scanner image digest** is **Stella Ops‑signed** via **Referrers + cosign** before signing anything.

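
The checks the Signer applies before signing can be sketched as a single gate function. This is a minimal illustration under stated assumptions, not the Signer's actual code: the `PoE` and `OpTok` field names, the reason strings, and the idea of comparing key thumbprints directly are all hypothetical, and real validation would involve signature checks, introspection, and a cosign verification call.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
import time


@dataclass(frozen=True)
class PoE:
    license_id: str
    key_thumbprint: str   # installation key the PoE is bound to (hypothetical field)
    expires_at: float     # epoch seconds
    revoked: bool = False


@dataclass(frozen=True)
class OpTok:
    subject: str
    key_thumbprint: str   # sender-constraint key, DPoP or mTLS (hypothetical field)
    expires_at: float


def signer_gate(poe: PoE, optok: OpTok, installation_key: str, proof_key: str,
                image_is_signed: bool, now: Optional[float] = None) -> Tuple[bool, str]:
    """Return (allowed, reason); every check must pass before anything is signed."""
    now = time.time() if now is None else now
    if poe.revoked or poe.expires_at <= now:
        return False, "poe_inactive"
    if poe.key_thumbprint != installation_key:
        return False, "poe_key_mismatch"
    if optok.expires_at <= now:
        return False, "optok_expired"
    if optok.key_thumbprint != proof_key:
        return False, "sender_constraint_mismatch"
    if not image_is_signed:   # result of the Referrers + cosign check, done elsewhere
        return False, "scanner_image_unverified"
    return True, "ok"
```

A denied request maps naturally to the `403 (no attestation)` branch of the signing sequence in section 13.
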
**Enrollment sequence (LT → PoE).**

```plantuml
@startuml
actor Operator
participant "Install Agent" as IA
participant "Licensing Service" as LS
Operator -> IA: Provide LT
IA -> IA: Generate K_inst
IA -> LS: /license/enroll {LT, pub(K_inst)}
LS --> IA: PoE (mTLS client cert or JWT with cnf=K_inst), CRL/OCSP/introspect
@enduml
```

---

## 3) Scanner subsystem (facts engine)

### 3.1 Analyzers (deterministic only)

* **OS packages:** apk/dpkg/rpm (Linux); Windows MSI/SxS/GAC (M2).
* **Language (installed state):**

  * Java (pom.properties / MANIFEST) → `pkg:maven/...`
  * Node (`node_modules/*/package.json`) → `pkg:npm/...`
  * Python (`*.dist-info/METADATA`) → `pkg:pypi/...`
  * Go (buildinfo) → `pkg:golang/...`
  * .NET (`*.deps.json`) → `pkg:nuget/...`
  * **Rust:** deterministic **language markers** (symbol mangling) and crates only when present; otherwise `bin:{sha256}`.
* **Native:** ELF/PE/Mach‑O imports, DT_NEEDED, RPATH/RUNPATH, symbol versions, PE version info.
* **EntryTrace:** parse `ENTRYPOINT`/`CMD`; shell AST; resolve launchers (Java/Node/Python) to terminal program; record file:line chain.

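
As one concrete instance of "installed state" analysis, the Python mapping above can be sketched as follows. This is an illustration, not the analyzer's implementation: the function name is hypothetical, and percent-encoding of unusual version strings is left out for brevity. It relies on the fact that `METADATA` files use RFC 822 style headers and on the purl convention for `pypi` of lowercasing names and replacing underscores with dashes.

```python
from email.parser import Parser


def pypi_purl_from_metadata(metadata_text: str) -> str:
    """Build a pkg:pypi purl from the Name/Version headers of a METADATA file."""
    headers = Parser().parsestr(metadata_text, headersonly=True)
    name, version = headers["Name"], headers["Version"]
    if not name or not version:
        raise ValueError("METADATA is missing Name or Version")
    # Normalize the name the way the purl 'pypi' type expects.
    normalized = name.strip().lower().replace("_", "-")
    # Simplification: the version is used verbatim (no percent-encoding).
    return f"pkg:pypi/{normalized}@{version.strip()}"
```

Because the result depends only on file content, the same layer always yields the same purl, which is what makes per‑layer caching safe.
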
### 3.2 Caching & composition

* **Layer cache:** `{layerDigest → SBOM fragment + analyzer meta}`.
* **File CAS:** `{sha256(file) → parse result (ELF/JAR metadata/etc.)}`.
* **Composition:** build **image SBOMs** from fragments via **BOM‑Link/ExternalRef**; emit **two views**:

  * **Inventory** (complete filesystem inventory).
  * **Usage** (entrypoint closure + linked libs).
* **Transport:** JSON **and** **CycloneDX Protobuf** (compact, fast to parse).
* **Index:** BOM‑Index sidecar with purl table + roaring bitmap + `usedByEntrypoint` flag for fast joins.

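
The caching and two-view composition described above can be sketched in a few lines. This is a simplified model under stated assumptions: fragments are reduced to sets of purls, the `analyze` callback and the function names are hypothetical, and real composition links fragments via BOM‑Link rather than merging them and must also account for files removed by later layers.

```python
from typing import Callable, Dict, FrozenSet, List, Set

LayerCache = Dict[str, FrozenSet[str]]  # layer digest -> purls found in that layer


def fragment_for(layer_digest: str, cache: LayerCache,
                 analyze: Callable[[str], FrozenSet[str]]) -> FrozenSet[str]:
    """Return the cached fragment, running the analyzer only on a cache miss."""
    if layer_digest not in cache:
        cache[layer_digest] = analyze(layer_digest)
    return cache[layer_digest]


def compose_views(layer_digests: List[str], cache: LayerCache,
                  analyze: Callable[[str], FrozenSet[str]],
                  used_by_entrypoint: Set[str]) -> Dict[str, List[str]]:
    """Compose the Inventory and Usage views for one image from its layers."""
    inventory: Set[str] = set()
    for digest in layer_digests:
        inventory |= fragment_for(digest, cache, analyze)
    return {
        "inventory": sorted(inventory),                   # sorted for deterministic output
        "usage": sorted(inventory & used_by_entrypoint),  # subset reached from the entrypoint
    }
```

A second image that shares a base layer reuses the cached fragment, so only its new layers are analyzed.
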
### 3.3 Diff (image → layer → package)

* Added / Removed / Version‑changed entries, each **attributed** to the layer that caused the change.
* Raw diffs preserved; backend view applies **VEX + Policy**.

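
A minimal sketch of layer-attributed diffing, assuming each image is modeled as an ordered list of layers with a package-to-version map per layer. The data shape and function names are hypothetical; removals here are attributed to the old layer that last carried the package, which is one possible convention rather than the documented one.

```python
from typing import Dict, List, Tuple

# Each image is an ordered list of (layer_digest, {package_name: version}).
Layers = List[Tuple[str, Dict[str, str]]]


def flatten(layers: Layers) -> Dict[str, Tuple[str, str]]:
    """Map package -> (version, digest of the last layer that set it)."""
    state: Dict[str, Tuple[str, str]] = {}
    for digest, packages in layers:
        for name, version in packages.items():
            state[name] = (version, digest)
    return state


def diff_images(old: Layers, new: Layers) -> Dict[str, list]:
    """Return added/removed/changed packages, each tagged with a layer digest."""
    before, after = flatten(old), flatten(new)
    added = [(n, v, d) for n, (v, d) in sorted(after.items()) if n not in before]
    removed = [(n, v, d) for n, (v, d) in sorted(before.items()) if n not in after]
    changed = [(n, before[n][0], v, d) for n, (v, d) in sorted(after.items())
               if n in before and before[n][0] != v]
    return {"added": added, "removed": removed, "changed": changed}
```

The output stays a raw diff; applying VEX and Policy to it is the backend's job.
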
### 3.4 Build‑time SBOMs (fast CI path)

* Buildx **generator** runs analyzers during `docker buildx build --attest=type=sbom,generator=stellaops/sbom-indexer` and attaches SBOMs as **OCI referrers**.
* Scanner.WebService can trust these (policy‑configurable) and **skip** re‑scan; DSSE + Rekor v2 can be done either at build time or post‑push via Signer/Attestor.

### 3.5 Events / integrations

* **Out:** `report.ready` (summary + verdict + Rekor UUID) → internal bus for **Notify** & UI.
* **Expose:** image‑level **BOM‑Index** metadata for **Scheduler** impact selection.

---

## 4) Backend evaluation (decider)

### 4.1 Concelier (advisories)

* Ingests vendor, distro, OSS feeds; normalizes & merges; persists canonical advisories in Mongo; exports **deterministic JSON** and **Trivy DB**.
* Offline kit bundles for air‑gapped sites.

### 4.2 Excititor (VEX)

* Ingests **OpenVEX / CSAF VEX / CycloneDX VEX**; normalizes claims; retains conflicts; computes **consensus** with provider trust weights and justification gates.

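
One way to picture "consensus with provider trust weights and justification gates" is a weighted vote. The sketch below is an assumption-heavy illustration, not Excititor's algorithm: the `Claim` shape, the weight values, the specific gate (a `not_affected` claim counts only if it carries a justification), and the fallback status are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Claim(NamedTuple):
    provider: str
    status: str          # e.g. "affected", "not_affected", "fixed"
    justification: str   # empty when the provider gave none


def consensus(claims: List[Claim], trust: Dict[str, float]) -> Dict[str, object]:
    """Pick the status with the highest summed provider weight; keep dissenting claims."""
    scores: Dict[str, float] = defaultdict(float)
    for claim in claims:
        # Assumed justification gate: unjustified "not_affected" claims are not counted.
        if claim.status == "not_affected" and not claim.justification:
            continue
        scores[claim.status] += trust.get(claim.provider, 0.0)
    if not scores:
        return {"status": "under_investigation", "conflicts": list(claims)}
    winner = max(sorted(scores), key=lambda s: scores[s])  # sorted() makes ties deterministic
    conflicts = [c for c in claims if c.status != winner]   # retained, never discarded
    return {"status": winner, "conflicts": conflicts}
```

Returning the conflicting claims alongside the winner mirrors the "retains conflicts" requirement: consumers can see who disagreed and why.
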
### 4.3 Policy Engine (YAML DSL)

* Matchers: `image/repo/env/purl/cve/vendor/source/path/layerDigest/usedByEntrypoint`
* Actions: `ignore(until, justification)`, `fail`, `warn`, `defer`, `requireVEX{vendors, justifications}`, `escalate {sev, KEV, EPSS}`, license constraints.
* Produces a **policy digest** (SHA‑256 of canonicalized policy).

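
The policy digest can be illustrated with a short sketch. The canonicalization rule used here (parsed policy rendered as compact JSON with sorted keys) is an assumption made for the example; the document does not specify the canonical form, and the function name is hypothetical.

```python
import hashlib
import json


def policy_digest(policy: dict) -> str:
    """SHA-256 over a canonical JSON rendering of the parsed policy."""
    # Assumed canonical form: sorted keys, no insignificant whitespace, UTF-8.
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Hashing the parsed structure rather than the raw YAML text means that reordering keys or changing indentation does not change the digest, while any semantic edit does.
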
### 4.4 PASS/FAIL flow

1. SBOM (Inventory / Usage) → join with **Concelier** advisories.
2. Apply **Excititor** consensus (statuses & justifications).
3. Apply **Policy**; compute PASS/FAIL with waiver TTLs.
4. Sign the **final report** (DSSE via **Signer**) and log to **Rekor v2** via **Attestor**.

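
Steps 1 to 3 of the flow above can be sketched as a single pure function (step 4, signing and logging, is out of scope here). The data shapes, the function name, and the simplistic "any remaining finding fails" policy are assumptions for illustration; the real Policy Engine evaluates a YAML DSL with many more matchers and actions.

```python
from datetime import datetime, timezone
from typing import Dict, List, Tuple


def evaluate(components: List[str],
             advisories: Dict[str, List[str]],           # purl -> advisory ids
             vex_status: Dict[Tuple[str, str], str],     # (purl, advisory id) -> consensus status
             waivers: Dict[str, datetime],               # advisory id -> waiver expiry
             now: datetime) -> Dict[str, object]:
    """Join SBOM with advisories, apply VEX, then waivers; return verdict and findings."""
    findings = []
    for purl in components:                               # step 1: join with advisories
        for adv in advisories.get(purl, []):
            if vex_status.get((purl, adv)) in ("not_affected", "fixed"):   # step 2: VEX
                continue
            if adv in waivers and waivers[adv] > now:                       # step 3: waiver TTL
                continue
            findings.append((purl, adv))
    return {"verdict": "FAIL" if findings else "PASS", "findings": findings}
```
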

---

## 5) Runtime enforcement (Zastava)

* **Observer:** inventories running containers, checks image signatures, SBOM presence (referrers), detects drift (entrypoint chain divergence), flags unapproved images.
* **Admission Webhook (optional):** blocks policy‑fail pods (dry‑run first).
* **Integration:** posts runtime events to Scanner.WebService; can request **delta scans** on changed layers.

---

## 6) Storage & catalogs (MinIO/Mongo)

**MinIO layout**

```
s3://stellaops/
  layers/<sha256>/sbom.cdx.json.zst
  layers/<sha256>/sbom.spdx.json.zst
  images/<imgDigest>/inventory.cdx.pb
  images/<imgDigest>/usage.cdx.pb
  indexes/<imgDigest>/bom-index.bin
  attest/<artifactSha256>.dsse.json
```

**Catalog (Mongo)**

* `artifacts` (type/format/sha/size/rekor/ttl/immutable/refCount/createdAt)
* `images`, `layers`, `links`, `lifecycleRules`
* **Scheduler:** `schedules`, `runs`, `locks`, `impact_cursors`
* **Notify:** `rules`, `deliveries`, `channels`, `templates`

**Retention**

* MinIO **ILM** for coarse TTL; Scanner.WebService GC decrements `refCount` and deletes unreferenced metadata; **Object Lock** for immutable classes (auditable artifacts).

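
The reference-counted part of retention can be sketched as follows. This is an in-memory illustration only: the `Artifact` class and `release` function are hypothetical, and a real implementation would update the Mongo `artifacts` collection atomically and leave TTL expiry to MinIO ILM.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Artifact:
    key: str              # object key under s3://stellaops/
    ref_count: int
    immutable: bool = False   # immutable classes are protected (Object Lock)


def release(catalog: Dict[str, Artifact], key: str) -> List[str]:
    """Drop one reference; return object keys that are now safe to delete."""
    artifact = catalog[key]
    artifact.ref_count = max(0, artifact.ref_count - 1)
    if artifact.ref_count == 0 and not artifact.immutable:
        del catalog[key]      # remove unreferenced metadata
        return [key]
    return []
```
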

---

## 7) APIs (consolidated surface)

### 7.1 Scanner.WebService

```
POST /api/scans                          { imageRef|digest, force? } → { scanId }
GET  /api/scans/{id}                     → { status, digests, artifacts[] }
GET  /api/sboms/{imageDigest}            ?format=cdx-json|cdx-pb|spdx-json&view=inventory|usage
GET  /api/diff?old=<digest>&new=<digest> → { added[], removed[], changed[], byLayer[] }
POST /api/exports                        { imageDigest, format, view } → { artifactId, rekorUrl }
POST /api/reports                        { imageDigest, policyRevision?, vexSnapshot? } → { reportId, verdict, rekorUrl }
GET  /api/catalog/artifacts/{id}         → { size, ttl, immutable, rekor, refs }
GET  /healthz | /readyz | /metrics
```

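
A hedged sketch of how a client might build the `POST /api/scans` call using only the standard library. The host name, the helper name, and the use of the `DPoP` authorization scheme for the OpTok are assumptions; a real DPoP call also needs a per-request proof JWT in a `DPoP` header, which is omitted here.

```python
import json
import urllib.request


def build_scan_request(base_url: str, image_ref: str, optok: str,
                       force: bool = False) -> urllib.request.Request:
    """Prepare (but do not send) a scan request for Scanner.WebService."""
    body = json.dumps({"imageRef": image_ref, "force": force}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/scans",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed header shape; the DPoP proof header is intentionally left out.
            "Authorization": f"DPoP {optok}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; per the listing above, the response body is expected to carry a `scanId` to poll via `GET /api/scans/{id}`.
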
### 7.2 Signer (mTLS; hard gate)

```
POST /sign/dsse    # body: {subjectHash, imageDigest, predicate}; headers: OpTok (DPoP/mTLS) + PoE
GET  /verify/referrers?imageDigest=sha256:...  # is this image StellaOps-signed?
```

### 7.3 Attestor (mTLS)

```
POST /rekor/entries      # DSSE bundle → {uuid, index, proof, logURL}
GET  /rekor/entries/{uuid}
```

### 7.4 Authority (OIDC)

* `/.well-known/openid-configuration`, `/oauth/token` (DPoP/mTLS), `/oauth/introspect`, `/jwks`

### 7.5 Licensing (cloud)

```
POST /license/enroll      { LT, pubKey }           → PoE + introspection endpoints
POST /license/revoke      { license_id }           → ok
POST /license/introspect  { poe }                  → { active, claims, exp }
POST /attest/endorse      { bundle }               → endorsement bundle (optional)
```

### 7.6 Scheduler

```
POST /api/v1/scheduler/schedules         {yaml|json}      → { scheduleId }
GET  /api/v1/scheduler/schedules                          → [ { id, nextRun, status, stats } ]
POST /api/v1/scheduler/run               { id|selector }   → { runId }
GET  /api/v1/scheduler/runs/{id}                          → { status, counts, links }
GET  /api/v1/scheduler/cursor                            → { lastConcelierExportId, lastExcititorExportId }
```

### 7.7 Notify

```
POST /api/v1/notify/test                 { channel, target } → { delivered }
POST /api/v1/notify/rules                {yaml|json}         → { ruleId }
GET  /api/v1/notify/rules                                   → [ { id, match, actions, enabled } ]
GET  /api/v1/notify/deliveries                              → [ { id, eventId, channel, status, attempts } ]
```

---

## 8) Security & verifiability

* **Sender‑constrained tokens.** All operational calls use **DPoP** (RFC 9449) or **mTLS‑bound** tokens (RFC 8705).
* **Entitlement.** **PoE** is mandatory; revocation honored online.
* **Release integrity.** **Signer** independently verifies **scanner image digest** via **Referrers + cosign** before signing.
* **Separation of duties.** Scanner/UI/Scheduler/Notify cannot sign; only **Signer** can sign; only **Attestor** can write to **Rekor v2**.
* **Verifiers.** Anyone can verify: DSSE signature → certificate chain to **Stella Ops Fulcio/KMS root** → **Rekor v2** inclusion.
* **RBAC.** Roles: `scanner.admin|read`, `scheduler.admin|read`, `notify.admin|read`, `zastava.admin|read`.
* **Community vs Authorized.** Free/community runs are throttled and produce no official attestations; authorized runs operate at full speed and produce **Stella Ops‑verified** bundles.

**DSSE predicate (SBOM/report)**

```json
{
  "predicateType": "https://stella-ops.org/attestations/sbom/1",
  "subject": [{ "name": "s3://stellaops/images/<digest>/inventory.cdx.pb", "digest": { "sha256": "<sha256>" } }],
  "predicate": {
    "image_digest": "<sha256:...>",
    "stellaops_version": "2.3.1 (2027.04)",
    "license_id": "LIC-9F2A...",
    "customer_id": "CUST-ACME",
    "plan": "pro",
    "policy_digest": "sha256:...",
    "views": ["inventory","usage"],
    "created": "2025-10-17T12:34:56Z"
  }
}
```

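
For readers unfamiliar with DSSE, the sketch below shows how a statement like the one above is wrapped into an envelope. The pre-authentication encoding (PAE) follows the DSSE specification; everything else is an assumption for illustration: the `envelope` helper, the caller-supplied `sign` callable, and the use of sorted-key JSON for the payload bytes. It does not show how Signer obtains keys or certificates.

```python
import base64
import json


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the exact bytes that get signed."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def envelope(statement: dict, sign) -> dict:
    """Wrap a statement in a DSSE envelope; `sign` maps bytes to signature bytes."""
    payload_type = "application/vnd.in-toto+json"
    payload = json.dumps(statement, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = sign(pae(payload_type, payload))
    return {
        "payloadType": payload_type,
        "payload": base64.b64encode(payload).decode("ascii"),
        "signatures": [{"sig": base64.b64encode(signature).decode("ascii")}],
    }
```

A verifier recomputes `pae(payloadType, decoded payload)` and checks the signature over those bytes, so the signature covers the payload type as well as the content.
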
**BOM‑Index sidecar**
Binary header + purl table + roaring bitmaps; optional `usedByEntrypoint` flags for fast policy joins.

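
To show why bitmaps make impact selection cheap, here is a toy index in which plain Python integers stand in for roaring bitmaps. The class, its layout (one bitmap per purl, one bit per image), and the method names are hypothetical; the actual sidecar is a binary format whose details are not specified here.

```python
from typing import Dict, List


class BomIndex:
    """Toy index: one bitmap per purl, one bit per image."""

    def __init__(self) -> None:
        self.images: List[str] = []          # bit position -> image digest
        self.bitmaps: Dict[str, int] = {}    # purl -> bitmap of images containing it

    def add_image(self, digest: str, purls: List[str]) -> None:
        bit = 1 << len(self.images)
        self.images.append(digest)
        for purl in purls:
            self.bitmaps[purl] = self.bitmaps.get(purl, 0) | bit

    def impacted(self, changed_purls: List[str]) -> List[str]:
        """Union the bitmaps of all changed purls and list the matching images."""
        mask = 0
        for purl in changed_purls:
            mask |= self.bitmaps.get(purl, 0)
        return [d for i, d in enumerate(self.images) if (mask >> i) & 1]
```
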

---

## 9) Scale, performance & quotas

* **Workers:** horizontal; **distributed lock per layer digest**; global CAS in MinIO.
* **Queues:** Redis Streams / NATS / RabbitMQ. HPA by queue depth, CPU, memory.
* **Registry throttling:** per‑registry concurrency budgets.
* **Targets:**

  * Build‑time path P95 ≤ 3–5 s on warmed bases.
  * Post‑build delta scan P95 ≤ 10 s for 200 MB images.
  * Policy + VEX evaluation ≤ 500 ms for 5k components using BOM‑Index.
  * **Event → notification** P95 ≤ **30–60 s** under nominal load.
  * **Export delta → re‑evaluation verdict** P95 ≤ **5 min** for 10k impacted images.
* **Quotas:** license plan enforces QPS/concurrency/size; **Signer** throttles and can deny DSSE.

---

## 10) DevOps & distribution

* **Releases:** all first‑party images **cosign‑signed**; labels embed `org.stellaops.version` and `org.stellaops.release_date`.
* **Channels:**

  * **Community** (public registry): throttled, non‑attesting.
  * **Authorized** (private registry): full speed, DSSE enabled.
* **Client update flow:** containers self‑verify signatures at boot; report version; **Signer** enforces `valid_release_year` / `max_version` from PoE before signing.
* **Compose skeleton:**

```yaml
services:
  authority:       { image: stellaops/authority }
  fulcio:          { image: sigstore/fulcio }
  rekor:           { image: sigstore/rekor-v2 }
  minio:           { image: minio/minio, command: server /data --console-address ":9001" }
  mongo:           { image: mongo:7 }
  signer:          { image: stellaops/signer, depends_on: [authority, fulcio] }
  attestor:        { image: stellaops/attestor, depends_on: [rekor, signer] }
  scanner-web:     { image: stellaops/scanner-web, depends_on: [mongo, minio, signer, attestor] }
  scanner-worker:  { image: stellaops/scanner-worker, deploy: { replicas: 4 }, depends_on: [scanner-web] }
  concelier:       { image: stellaops/concelier-web, depends_on: [mongo] }
  excititor:       { image: stellaops/excititor-web, depends_on: [mongo] }
  scheduler-web:   { image: stellaops/scheduler-web, depends_on: [mongo] }
  scheduler-worker: { image: stellaops/scheduler-worker, deploy: { replicas: 2 }, depends_on: [scheduler-web] }
  notify-web:      { image: stellaops/notify-web, depends_on: [mongo] }
  notify-worker:   { image: stellaops/notify-worker, deploy: { replicas: 2 }, depends_on: [notify-web] }
  ui:              { image: stellaops/ui, depends_on: [scanner-web, concelier, excititor, scheduler-web, notify-web] }
```

* **Backups:** Mongo dumps; MinIO versioned buckets & replication; Rekor v2 DB snapshots; JWKS/Fulcio/KMS key rotation.
* **Ops runbooks:** Scheduler catch‑up after Concelier/Excititor recovery; connector key rotation (Slack/Teams/SMTP).
* **SLOs & alerts:** lag between Concelier/Excititor export and first rescan verdict; delivery failure rates by channel.

---

## 11) Observability & audit

* **Metrics:** scan latency, layer cache hit %, artifact bytes, DSSE/Rekor latency, policy evaluation time, queue depth, admission decisions (Zastava).
* **Scheduler metrics:** `scheduler.impacted_images_total`, `scheduler.jobs_enqueued_total`, `scheduler.selection_ms`, end‑to‑end p95 (event → verdict).
* **Notify metrics:** `notify.sent_total{channel}`, `notify.dropped_total{reason}`, `notify.digest_coalesced_total`, `notify.latency_ms`.
* **Tracing:** per‑stage spans; correlation IDs across Scanner→Signer→Attestor and Concelier/Excititor→Scheduler→Scanner→Notify.
* **Audit logs:** every signing records `license_id`, `image_digest`, `policy_digest`, and Rekor UUID; Scheduler records who scheduled what; Notify records where, when, and why messages were sent or deduped.
* **Compliance:** MinIO **Object Lock** for immutable artifacts; reproducible outputs via policy digest + SBOM digest in predicate.

---

## 12) Roadmap (anchored to this architecture)

* M2: Windows MSI/SxS/GAC analyzers; deeper Rust (DWARF enrichers).
* M2: Buildx generator certified flows; cross‑registry trust policies.
* M3: Patch‑Presence plugin (signature‑based backport detection), opt‑in.
* M3: Zastava Admission control GA with policy presets and dry‑run→enforce stages.
* M3: **Scheduler GA** with export‑delta impact routing and capacity‑aware pacing.
* M3: **Notify GA** with digests, Slack/Teams/Email/Webhooks; **M4:** PagerDuty/Opsgenie connectors.
* Continuous: Policy UX (waiver TTLs, vendor rules), Excititor connectors expansion.

---

## 13) Canonical sequences (verification, re‑evaluation & notify)

**Sign & log (OpTok + PoE, image verify, DSSE, Rekor).**

```mermaid
sequenceDiagram
  autonumber
  participant Scan as Scanner.WebService
  participant Auth as Authority (OIDC)
  participant Sign as Signer
  participant Reg as OCI Registry
  participant Ful as Fulcio/KMS
  participant Att as Attestor
  participant Rek as Rekor v2

  Scan->>Auth: Get OpTok (DPoP/mTLS)
  Scan->>Sign: sign(request) + OpTok + PoE + DPoP proof
  Sign->>Auth: Validate OpTok & sender-constraint
  Sign->>Sign: Validate PoE (introspect/revocation)
  Sign->>Reg: Verify scanner image is StellaOps-signed (Referrers + cosign)
  alt OK
    Sign->>Ful: Get signing cert (keyless) or use KMS key
    Sign-->>Scan: DSSE bundle (cert chain)
    Scan->>Att: Submit bundle
    Att->>Rek: Create entry
    Rek-->>Att: {uuid,index,proof}
    Att-->>Scan: Rekor URL
  else Deny
    Sign-->>Scan: 403 (no attestation)
  end
```

**Event‑driven re‑evaluation & notify.**

```mermaid
sequenceDiagram
  participant CONC as Concelier
  participant EXC as Excititor
  participant SCH as Scheduler
  participant SC as Scanner.WebService
  participant NO as Notify
  participant CH as Slack/Teams/Email/Webhook

  CONC->>SCH: export.delta {changedProductKeys, exportId}
  EXC->>SCH: export.delta {changedProductKeys, exportId}
  SCH->>SCH: Impact select via BOM-Index bitmaps
  SCH->>SC: Enqueue analysis-only reports (batches)
  SC-->>SCH: verdict stream (PASS/FAIL, deltas)
  SCH->>NO: rescan.delta {imageDigest, newCriticals, links}
  NO->>CH: deliver (throttle/digest rules applied)
```

---

## 14) Minimal data shapes (Scheduler & Notify)

**Scheduler schedule (YAML via UI/CLI)**

```yaml
name: nightly-eu
when: "0 2 * * * Europe/Sofia"
mode: analysis-only        # or content-refresh
selection:
  scope: all-images        # or tenant/ns/repo label selectors
  onlyIf: { lastReportOlderThanDays: 7 }
notify:
  onNewFindings: true
  minSeverity: high
limits:
  maxJobs: 5000
  ratePerSecond: 50
```

**Notify rule (YAML)**

```yaml
name: high-critical-alerts
match:
  eventKinds: ["report.ready","rescan.delta","zastava.admission"]
  minSeverity: high
  namespaces: ["prod-*"]
  vex: { includeAcceptedJustifications: false }
actions:
  - channel: slack
    target: "#sec-alerts"
    template: "concise"
    throttle: "5m"
  - channel: email
    target: "soc@acme.org"
    digest: "hourly"
enabled: true
```
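
A sketch of how a rule shaped like the one above might be matched against an incoming event and throttled. The event field names (`kind`, `severity`, `namespace`), the severity ordering, glob-style namespace matching, and both function names are assumptions for illustration; the rule's `vex` filter is not modeled.

```python
from fnmatch import fnmatchcase
from typing import Dict

SEVERITY_ORDER = ["low", "medium", "high", "critical"]   # assumed ordering


def rule_matches(rule: Dict, event: Dict) -> bool:
    """True when the event satisfies the rule's kind, severity, and namespace matchers."""
    if not rule.get("enabled", True):
        return False
    match = rule["match"]
    if event["kind"] not in match["eventKinds"]:
        return False
    if SEVERITY_ORDER.index(event["severity"]) < SEVERITY_ORDER.index(match["minSeverity"]):
        return False
    return any(fnmatchcase(event["namespace"], pattern) for pattern in match["namespaces"])


def throttled(last_sent: Dict[str, float], key: str, now: float, window_s: float) -> bool:
    """True when a message for `key` went out less than `window_s` seconds ago."""
    previous = last_sent.get(key)
    if previous is not None and now - previous < window_s:
        return True
    last_sent[key] = now   # record this send as the new reference point
    return False
```

With `throttle: "5m"` parsed to 300 seconds, a second matching event for the same channel and target inside that window would be suppressed or folded into a digest rather than delivered immediately.
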