
High-Level Architecture: StellaOps (Consolidated • 2025Q4)

Want the 10-minute tour? See high-level-architecture.md; this file retains the exhaustive reference.

Purpose. A complete, implementation-ready map of StellaOps: product vision, all runtime components, trust boundaries, tokens/licensing, control/data flows, storage, APIs, security, scale, DevOps, and verification logic.

Scope. This file replaces the separate components.md; all component details now live here.


0) Product vision & principles

Vision. StellaOps is a deterministic SBOM + VEX platform for CI/CD and runtime, tuned for speed (per-layer deltas), quiet output (usage-scoped views), and verifiability (DSSE + Rekor v2). It is self-hostable, air-gap capable, and commercially enforceable: only licensed installations can produce StellaOps-verified attestations.

Operating principles.

  • Scanner-owned SBOMs. We generate our own BOMs; we do not warehouse third-party SBOM content (we can link to attested SBOMs).
  • Deterministic evidence. Facts come from package DBs, installed metadata, linkers, and verified attestations; no fuzzy guessing in the core.
  • Per-layer caching. Cache fragments by layer digest and compose image SBOMs via CycloneDX BOM-Link / SPDX ExternalRef.
  • Inventory vs Usage. Always record the full inventory of what exists; separately present usage (entrypoint closure + loaded libs).
  • Backend decides. PASS/FAIL is produced by Policy + VEX + Advisories. The scanner reports facts.
  • Attest or it didn't happen. Every export is signed as in-toto/DSSE and logged in Rekor v2.
  • Hybrid reachability attestations. Every reachability graph ships with a graph-level DSSE (mandatory) plus optional edge-bundle DSSEs for runtime/init/contested edges; Policy/Signals consume graph DSSE as baseline and edge bundles for quarantine/disputes.
  • Sovereign-ready. Cloud is used only for licensing and optional endorsement; everything else is first-party and self-hostable.
  • Competitive clarity. Moats: deterministic replay, hybrid reachability proofs, lattice VEX, sovereign crypto, proof graph; see docs/market/competitive-landscape.md.

1) Service topology & trust boundaries

1.1 Runtime inventory (first-party)

| Service / Tool | Container image | Core role | Scale pattern |
|---|---|---|---|
| Scanner.WebService | stellaops/scanner-web | Control plane for scans; catalog; SBOM composition (inventory & usage); diff; exports; analysis-only report runs for Scheduler. | Stateless; N replicas behind LB. |
| Scanner.Worker | stellaops/scanner-worker | Runs analyzers (OS, Lang: Java/Node/Python/Go/.NET/Rust, Native ELF/PE/Mach-O, EntryTrace); emits per-layer SBOMs and composes image SBOMs. | Horizontal; queue-driven; sharded by layer digest. |
| Scanner.Sbomer.BuildXPlugin | stellaops/sbom-indexer | BuildKit generator for build-time SBOMs as OCI referrers. | CI-side; ephemeral. |
| Scanner.Sbomer.DockerImage | stellaops/scanner-cli | CLI-orchestrated scanner container for post-build scans. | Local/CI; ephemeral. |
| Concelier.WebService | stellaops/concelier-web | Vulnerability ingest/normalize/merge/export (JSON + Trivy DB). | HA via Mongo locks. |
| Excititor.WebService | stellaops/excititor-web | VEX ingest/normalize/consensus; conflict retention; exports. | HA via Mongo locks. |
| Policy Engine | (in scanner-web) | YAML DSL evaluator (waivers, vendor preferences, KEV/EPSS, license, usage-gating); produces policy digest. | In-process; cache per digest. |
| Scheduler.WebService | stellaops/scheduler-web | Schedules re-evaluation runs; consumes Concelier/Excititor deltas; selects impacted images via BOM-Index; orchestrates analysis-only reports. | Stateless API. |
| Scheduler.Worker | stellaops/scheduler-worker | Executes selection and enqueues batches toward Scanner; enforces rate/limits and windows; maintains impact cursors. | Horizontal; queue-driven. |
| Notify.WebService | stellaops/notify-web | Rules engine for outbound notifications; manages channels, templates, throttle/digest logic. | Stateless API. |
| Notify.Worker | stellaops/notify-worker | Delivers to Slack/Teams/Email/Webhooks; idempotent retries; digests. | Horizontal; per-channel rate limits. |
| Signer | stellaops/signer | Hard gate: validates entitlement + release integrity; mints signing cert (Fulcio keyless) or uses KMS; signs DSSE. | Stateless; HPA by QPS. |
| Attestor | stellaops/attestor | Posts DSSE bundles to Rekor v2; verification endpoints. | Stateless; HPA by QPS. |
| Authority | stellaops/authority | On-prem OIDC issuing short-lived OpToks with DPoP/mTLS sender constraint. | HA behind LB. |
| Zastava (Runtime) | stellaops/zastava | Runtime inspector/enforcer (observer + optional Admission Webhook). | DaemonSet + Webhook. |
| Web UI | stellaops/ui | Angular app for scans, diffs, policy, VEX, Scheduler, Notify, runtime, reports. | Stateless. |
| StellaOps.Cli | stellaops/cli | CLI for init/scan/export/diff/policy/report/verify; Buildx helper; schedule and notify verbs. | Local/CI. |

1.2 Third-party (self-hosted)

  • Fulcio (Sigstore CA): issues short-lived signing certs (keyless).
  • Rekor v2 (tile-backed transparency log).
  • RustFS: offline-first object store with deterministic REST API (S3/MinIO fallback available for legacy installs).
  • PostgreSQL (≥15): control-plane storage with per-module schema isolation (auth, vuln, vex, scheduler, notify, policy). See Database Architecture.
  • MongoDB (≥7): legacy catalog support; being phased out in favor of PostgreSQL for control-plane domains.
  • Queue: Redis Streams / NATS / RabbitMQ (pluggable).
  • OCI Registry: must support Referrers API (discover SBOMs/signatures).

1.3 Cloud licensing (StellaOps)

  • Licensing Service (www.stella-ops.org): issues long-lived License Tokens (LT); exchanges LT → Proof-of-Entitlement (PoE) bound to an installation key; revoke/introspect PoE; optional cross-log endorsement.

1.4 Diagram (control/data planes & trust)

flowchart LR
  subgraph Cloud["www.stella-ops.org (Cloud)"]
    LS[Licensing Service<br/>LT→PoE / revoke / introspect]
  end

  subgraph OnPrem["Customer Site (Self-hosted)"]
    Auth["Authority (OIDC)<br/>OpTok (DPoP/mTLS)"]
    SW[Scanner.WebService]
    WK[Scanner.Worker xN]
    CONC[Concelier]
    EXC[Excititor]
    SCHW[Scheduler.Web]
    SCH[Scheduler.Worker xN]
    NOTW[Notify.Web]
    NOT[Notify.Worker xN]
    POL["Policy Engine (in Scanner.Web)"]
    SGN["Signer<br/>(entitlement + signing)"]
    ATT["Attestor<br/>(Rekor v2 submit/verify)"]
    UI["Web UI (Angular)"]
    Z["Zastava<br/>(Runtime Inspector/Enforcer)"]
    RFS[(RustFS object store)]
    MGO[(MongoDB)]
    QUE[(Queue/Streams)]
  end

  CLI[StellaOps.Cli / Buildx Plugin]
  REG[(OCI Registry with Referrers)]
  FUL[Fulcio]
  REK["Rekor v2 (tiles)"]

  CLI -->|scan/build| SW
  SW -->|jobs| QUE
  QUE --> WK
  WK --> RFS
  SW --> MGO
  CONC --> MGO
  EXC --> MGO
  UI --> SW
  Z --> SW

  %% New event-driven loop
  CONC -- export.delta --> SCHW
  EXC  -- export.delta --> SCHW
  SCHW --> SCH
  SCH --> SW
  SW -- report.ready --> NOTW
  Z  -- admission/observe --> NOTW

  SGN <--> Auth
  SGN --> FUL
  SGN -->|mTLS| ATT
  ATT --> REK

  SGN <-->|verify referrers| REG

Trust boundaries. Only Signer can sign; only Attestor can write to Rekor v2. Scanner/UI/Scheduler/Notify never sign.


2) Licensing & tokens (installation-ready, theft-resistant)

Two-token model.

  • License Token (LT): long-lived JWT from Licensing Service; used once to enroll the installation; never used in the hot path.
  • Proof-of-Entitlement (PoE): bound to the installation key (mTLS client cert or DPoP-bound JWT with cnf); medium-lived; renewable; revocable.
  • Operational token (OpTok): 2–5 min OIDC token from Authority, sender-constrained (DPoP or mTLS). Used to authenticate to Signer/Scanner.WebService/Scheduler.Web/Notify.Web.

Signer enforces both: PoE proves entitlement; OpTok proves “who is calling now”. It also independently verifies the scanner image digest is StellaOps-signed via Referrers + cosign before signing anything.
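The gate described above amounts to three independent checks that must all pass before anything is signed. The following is a minimal sketch, not the Signer's actual code: `SignRequest`, its fields, and `signer_gate` are hypothetical names, and each boolean stands in for a real verification step (token validation, PoE introspection, Referrers + cosign lookup).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignRequest:
    """Hypothetical summary of the checks Signer performs before signing."""
    optok_valid: bool           # OpTok verified and sender-constrained (DPoP or mTLS)
    poe_active: bool            # PoE present, unexpired, and not revoked
    scanner_image_signed: bool  # scanner image digest verified via Referrers + cosign


def signer_gate(req: SignRequest) -> tuple[bool, str]:
    """Return (allowed, reason). Signing is allowed only if every check passes."""
    if not req.optok_valid:
        return False, "OpTok invalid or not sender-constrained"
    if not req.poe_active:
        return False, "PoE missing, expired, or revoked"
    if not req.scanner_image_signed:
        return False, "scanner image is not StellaOps-signed"
    return True, "ok"


# A valid caller with a valid entitlement is still refused if the
# scanner image itself cannot be verified.
print(signer_gate(SignRequest(True, True, False)))
```

The order of the checks is an illustrative choice; the point is that entitlement (PoE), caller identity (OpTok), and release integrity are evaluated separately and none can substitute for another.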

Enrollment sequence (LT → PoE).

@startuml
actor Operator
participant "Install Agent" as IA
participant "Licensing Service" as LS
Operator -> IA: Provide LT
IA -> IA: Generate K_inst
IA -> LS: /license/enroll {LT, pub(K_inst)}
LS --> IA: PoE (mTLS client cert or JWT with cnf=K_inst), CRL/OCSP/introspect
@enduml

3) Scanner subsystem (facts engine)

3.1 Analyzers (deterministic only)

  • OS packages: apk/dpkg/rpm (Linux); Windows MSI/SxS/GAC (M2).

  • Language (installed state):

    • Java (pom.properties / MANIFEST) → pkg:maven/...
    • Node (node_modules/*/package.json) → pkg:npm/...
    • Python (*.dist-info/METADATA) → pkg:pypi/...
    • Go (buildinfo) → pkg:golang/...
    • .NET (*.deps.json) → pkg:nuget/...
    • Rust: deterministic language markers (symbol mangling) and crates only when present; otherwise bin:{sha256}.
  • Native: ELF/PE/Mach-O imports, DT_NEEDED, RPATH/RUNPATH, symbol versions, PE version info.

  • EntryTrace: parse ENTRYPOINT/CMD; shell AST; resolve launchers (Java/Node/Python) to terminal program; record file:line chain.

3.2 Caching & composition

  • Layer cache: {layerDigest → SBOM fragment + analyzer meta}.

  • File CAS: {sha256(file) → parse result (ELF/JAR metadata/etc.)}.

  • Composition: build image SBOMs from fragments via BOM-Link/ExternalRef; emit two views:

    • Inventory (complete filesystem inventory).
    • Usage (entrypoint closure + linked libs).
  • Transport: JSON and CycloneDX Protobuf (compact, fast to parse).

  • Index: BOM-Index sidecar with purl table + roaring bitmap + usedByEntrypoint flag for fast joins.
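To illustrate why caching by layer digest pays off, here is a small sketch of a digest-keyed fragment cache with image composition by reference. It is an in-memory stand-in under stated assumptions: `LayerCache`, `compose_image`, `fake_analyze`, and the fragment shape are all hypothetical, and a real deployment would persist fragments in the object store rather than a dict.

```python
from typing import Callable, Dict, List

Fragment = Dict[str, object]  # hypothetical fragment shape


class LayerCache:
    """In-memory stand-in for the {layerDigest -> SBOM fragment} cache."""

    def __init__(self, analyze: Callable[[str], Fragment]) -> None:
        self._analyze = analyze
        self._fragments: Dict[str, Fragment] = {}
        self.misses = 0

    def get(self, layer_digest: str) -> Fragment:
        # Analyze a layer only the first time its digest is seen.
        if layer_digest not in self._fragments:
            self.misses += 1
            self._fragments[layer_digest] = self._analyze(layer_digest)
        return self._fragments[layer_digest]


def compose_image(cache: LayerCache, layer_digests: List[str]) -> Dict[str, object]:
    """Compose an image view as ordered references to per-layer fragments."""
    fragments = [cache.get(d) for d in layer_digests]
    return {"layers": layer_digests, "fragments": fragments}


def fake_analyze(layer_digest: str) -> Fragment:
    # Placeholder analyzer; a real one would inspect the layer contents.
    return {"layer": layer_digest, "packages": []}


cache = LayerCache(fake_analyze)
compose_image(cache, ["sha256:base", "sha256:app-v1"])
compose_image(cache, ["sha256:base", "sha256:app-v2"])
print(cache.misses)  # 3: the shared base layer is analyzed only once
```

Two images that share a base layer trigger three analyses instead of four; with many images on a few common bases the saving grows accordingly.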

3.3 Diff (image → layer → package)

  • Added / Removed / Version-changed changes, attributed to the layer that caused them.
  • Raw diffs preserved; backend view applies VEX + Policy.
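A rough sketch of a layer-attributed package diff follows. It assumes a simplified model in which each layer is a `(digest, {package: version})` pair and later layers override earlier ones; `flatten` and `diff_images` are hypothetical names. Because the sketch does not model whiteouts, a removal is reported against the old layer that last provided the package rather than the new layer that deleted it.

```python
from typing import Dict, List, Tuple

Layer = Tuple[str, Dict[str, str]]  # (layer digest, {package name: version})


def flatten(layers: List[Layer]) -> Dict[str, Tuple[str, str]]:
    """Map each package to (version, digest of the last layer that set it)."""
    state: Dict[str, Tuple[str, str]] = {}
    for digest, packages in layers:
        for name, version in packages.items():
            state[name] = (version, digest)
    return state


def diff_images(old_layers: List[Layer], new_layers: List[Layer]) -> Dict[str, list]:
    """Return added/removed/changed packages, each tagged with a layer digest."""
    old, new = flatten(old_layers), flatten(new_layers)
    added = [(n, new[n][0], new[n][1]) for n in sorted(new.keys() - old.keys())]
    removed = [(n, old[n][0], old[n][1]) for n in sorted(old.keys() - new.keys())]
    changed = [
        (n, old[n][0], new[n][0], new[n][1])
        for n in sorted(old.keys() & new.keys())
        if old[n][0] != new[n][0]
    ]
    return {"added": added, "removed": removed, "changed": changed}


old_image = [("sha256:base1", {"openssl": "3.0.1", "zlib": "1.2.13"})]
new_image = [
    ("sha256:base2", {"openssl": "3.0.2"}),
    ("sha256:app", {"curl": "8.4.0"}),
]
print(diff_images(old_image, new_image))
```

The result is the raw diff only; applying VEX and Policy to it is a separate backend step, as noted above.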

3.4 Build-time SBOMs (fast CI path)

  • Buildx generator runs analyzers during docker buildx build --attest=type=sbom,generator=stellaops/sbom-indexer, attaches SBOMs as OCI referrers.
  • Scanner.WebService can trust these (policy-configurable) and skip rescan; DSSE + Rekor v2 can be done either at build time or post-push via Signer/Attestor.

3.5 Events / integrations

  • Out: report.ready (summary + verdict + Rekor UUID) → internal bus for Notify & UI.
  • Expose: image-level BOM-Index metadata for Scheduler impact selection.

4) Backend evaluation (decider)

4.1 Concelier (advisories)

  • Ingests vendor, distro, OSS feeds; normalizes & merges; persists canonical advisories in Mongo; exports deterministic JSON and Trivy DB.
  • Offline kit bundles for air-gapped sites.

4.2 Excititor (VEX)

  • Ingests OpenVEX / CSAF VEX / CycloneDX VEX; normalizes claims; retains conflicts; computes consensus with provider trust weights and justification gates.

4.3 Policy Engine (YAML DSL)

  • Matchers: image/repo/env/purl/cve/vendor/source/path/layerDigest/usedByEntrypoint
  • Actions: ignore(until, justification), fail, warn, defer, requireVEX{vendors, justifications}, escalate {sev, KEV, EPSS}, license constraints.
  • Produces a policy digest (SHA-256 of canonicalized policy).
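One way to obtain a stable digest is to hash a canonical serialization of the parsed policy. The sketch below assumes, purely for illustration, that canonicalization means compact JSON with sorted keys; the actual canonical form used by the Policy Engine is not specified here, and `policy_digest` is a hypothetical helper.

```python
import hashlib
import json


def policy_digest(policy: dict) -> str:
    """Hash a canonical (sorted-key, compact) JSON rendering of the policy."""
    canonical = json.dumps(
        policy, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same policy, different key order in the source document.
a = {"rules": [{"match": {"cve": "CVE-2024-0001"}, "action": "warn"}], "version": 1}
b = {"version": 1, "rules": [{"action": "warn", "match": {"cve": "CVE-2024-0001"}}]}
print(policy_digest(a) == policy_digest(b))  # True: key order does not affect the digest
```

List order is still significant in this sketch, which is usually what is wanted for ordered rule sets.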

4.4 PASS/FAIL flow

  1. SBOM (Inventory / Usage) → join with Concelier advisories.
  2. Apply Excititor consensus (statuses & justifications).
  3. Apply Policy; compute PASS/FAIL with waiver TTLs.
  4. Sign the final report (DSSE via Signer) and log to Rekor v2 via Attestor.
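Steps 1 to 3 of the flow above can be sketched as a small pure function. Everything here is a simplification under stated assumptions: the data shapes, the `evaluate` name, and the rule that VEX statuses `not_affected` and `fixed` clear a finding are illustrative choices, and step 4 (signing and logging) is out of scope.

```python
from datetime import date
from typing import Dict, List, Tuple

SUPPRESSING_VEX = {"not_affected", "fixed"}  # assumption: statuses that clear a finding


def evaluate(
    components: List[str],                # purls from the SBOM view
    advisories: Dict[str, List[str]],     # purl -> advisory ids
    vex: Dict[Tuple[str, str], str],      # (purl, advisory id) -> consensus status
    waivers: Dict[str, date],             # advisory id -> waiver expiry (inclusive)
    today: date,
) -> Tuple[str, List[Tuple[str, str]]]:
    # Step 1: join SBOM components with advisories.
    findings = [(purl, adv) for purl in components for adv in advisories.get(purl, [])]
    # Step 2: apply VEX consensus.
    findings = [f for f in findings if vex.get(f) not in SUPPRESSING_VEX]
    # Step 3: apply policy; a waiver only counts until its TTL expires.
    blocking = [
        (purl, adv)
        for purl, adv in findings
        if not (adv in waivers and today <= waivers[adv])
    ]
    return ("FAIL" if blocking else "PASS"), blocking


verdict, blocking = evaluate(
    components=["pkg:npm/a@1", "pkg:npm/b@2"],
    advisories={"pkg:npm/a@1": ["CVE-1", "CVE-2"], "pkg:npm/b@2": ["CVE-3"]},
    vex={("pkg:npm/a@1", "CVE-1"): "not_affected"},
    waivers={"CVE-2": date(2025, 12, 31)},
    today=date(2025, 6, 1),
)
print(verdict, blocking)
```

Because the function takes `today` as an argument instead of reading the clock, the same inputs always produce the same verdict, which is what makes a later replay possible.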

5) Runtime enforcement (Zastava)

  • Observer: inventories running containers, checks image signatures, SBOM presence (referrers), detects drift (entrypoint chain divergence), flags unapproved images.
  • Admission Webhook (optional): blocks policy-fail pods (dry-run first).
  • Integration: posts runtime events to Scanner.WebService; can request delta scans on changed layers.

6) Storage & catalogs (RustFS/PostgreSQL)

RustFS layout (default)

rustfs://stellaops/
  layers/<sha256>/sbom.cdx.json.zst
  layers/<sha256>/sbom.spdx.json.zst
  images/<imgDigest>/inventory.cdx.pb
  images/<imgDigest>/usage.cdx.pb
  indexes/<imgDigest>/bom-index.bin
  attest/<artifactSha256>.dsse.json

Database Architecture (PostgreSQL)

StellaOps uses PostgreSQL for all control-plane data with per-module schema isolation. Each module owns and manages only its own schema, ensuring clear ownership and independent migration lifecycles.

Schema topology:

┌─────────────────────────────────────────────────────────────────┐
│                    PostgreSQL Cluster                            │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                    stellaops (database)                      ││
│  │  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐           ││
│  │  │  auth   │ │  vuln   │ │   vex   │ │scheduler│           ││
│  │  └─────────┘ └─────────┘ └─────────┘ └─────────┘           ││
│  │  ┌─────────┐ ┌─────────┐ ┌─────────┐                       ││
│  │  │ notify  │ │ policy  │ │  audit  │                       ││
│  │  └─────────┘ └─────────┘ └─────────┘                       ││
│  └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘

Schema ownership:

| Schema | Owner Module | Purpose |
|---|---|---|
| auth | Authority | Identity, authentication, authorization, licensing, sessions |
| vuln | Concelier | Vulnerability advisories, CVSS, affected packages, sources |
| vex | Excititor | VEX statements, graphs, observations, evidence, consensus |
| scheduler | Scheduler | Jobs, triggers, workers, locks, execution history |
| notify | Notify | Channels, templates, rules, deliveries, escalations |
| policy | Policy | Policy packs, rules, risk profiles, evaluations |
| audit | Shared | Cross-cutting audit log (optional) |

Key design principles:

  1. Module isolation — Each module controls only its own schema. Cross-schema queries are rare and explicitly documented.
  2. Multi-tenancy — Single database, single schema set, tenant_id column on all tenant-scoped tables with row-level security.
  3. Forward-only migrations — No down migrations; fixes are applied as new forward migrations.
  4. Advisory lock coordination — Startup migrations use pg_try_advisory_lock(hashtext('schema_name')) to prevent concurrent execution.
  5. Air-gap compatible — All migrations embedded in assemblies, no external network dependencies.

Migration categories:

| Category | Prefix | Execution | Description |
|---|---|---|---|
| Startup (A) | 001-099 | Automatic at boot | Non-breaking DDL (CREATE IF NOT EXISTS, ADD COLUMN nullable) |
| Release (B) | 100-199 | Manual via CLI | Breaking changes (DROP, ALTER TYPE), require maintenance window |
| Seed | S001-S999 | After schema | Reference data with ON CONFLICT DO NOTHING |
| Data (C) | DM001-DM999 | Background job | Batched data transformations, resumable |
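The prefix ranges above lend themselves to a small classifier. This is a sketch only: `categorize`, the example file names, case-sensitive matching, and raising `ValueError` for empty or out-of-range names are assumptions made for illustration, not the behavior of the actual migration host.

```python
import re
from typing import Optional

# Lettered prefixes are checked before bare numeric ones.
_RULES = [
    ("data", re.compile(r"^DM(\d{3})(?!\d)"), range(1, 1000)),
    ("seed", re.compile(r"^S(\d{3})(?!\d)"), range(1, 1000)),
    ("startup", re.compile(r"^(\d{3})(?!\d)"), range(1, 100)),
    ("release", re.compile(r"^(\d{3})(?!\d)"), range(100, 200)),
]


def categorize(name: Optional[str]) -> str:
    """Map a migration name to its category based on the numeric prefix."""
    if name is None or not name.strip():
        raise ValueError("migration name must not be empty")
    stem = name.strip()
    for category, pattern, valid in _RULES:
        match = pattern.match(stem)
        if match and int(match.group(1)) in valid:
            return category
    raise ValueError(f"unrecognized migration prefix: {name!r}")


# Hypothetical file names, one per category.
for example in ("001_create_table.sql", "100_drop_column.sql",
                "S001_reference_data.sql", "DM001_backfill.sql"):
    print(example, "->", categorize(example))
```

Rejecting unknown prefixes outright, rather than defaulting to a category, keeps a misnamed breaking migration from being run automatically at boot.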

Detailed documentation: See docs/db/ for full specification, coding rules, and phase-by-phase conversion tasks.

Retention

  • RustFS applies retention via X-RustFS-Retain-Seconds; Scanner.WebService GC decrements refCount and deletes unreferenced metadata; S3/MinIO fallback retains native Object Lock when enabled.
  • PostgreSQL retention managed via time-based partitioning for high-volume tables (runs, execution_logs) with monthly partition drops.

7) APIs (consolidated surface)

7.1 Scanner.WebService

POST /api/scans                          { imageRef|digest, force? } → { scanId }
GET  /api/scans/{id}                     → { status, digests, artifacts[] }
GET  /api/sboms/{imageDigest}            ?format=cdx-json|cdx-pb|spdx-json&view=inventory|usage
GET  /api/diff?old=<digest>&new=<digest> → { added[], removed[], changed[], byLayer[] }
POST /api/exports                        { imageDigest, format, view } → { artifactId, rekorUrl }
POST /api/reports                        { imageDigest, policyRevision?, vexSnapshot? } → { reportId, verdict, rekorUrl }
GET  /api/catalog/artifacts/{id}         → { size, ttl, immutable, rekor, refs }
GET  /healthz | /readyz | /metrics
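As a client-side illustration of the SBOM endpoint listed above, the following sketch builds the request URL and validates the `format` and `view` values against the options shown. The helper name `sbom_url` and the host name are hypothetical; authentication (OpTok with DPoP or mTLS) is deliberately left out.

```python
from urllib.parse import quote, urlencode

FORMATS = {"cdx-json", "cdx-pb", "spdx-json"}
VIEWS = {"inventory", "usage"}


def sbom_url(base_url: str, image_digest: str,
             fmt: str = "cdx-json", view: str = "inventory") -> str:
    """Build GET /api/sboms/{imageDigest}?format=...&view=... for a given host."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    if view not in VIEWS:
        raise ValueError(f"unsupported view: {view}")
    # Keep the ':' in 'sha256:...' readable; escape anything else unsafe.
    path = "/api/sboms/" + quote(image_digest, safe=":")
    return base_url.rstrip("/") + path + "?" + urlencode({"format": fmt, "view": view})


# 'scanner.example.internal' is a placeholder host.
print(sbom_url("https://scanner.example.internal/", "sha256:abc123", "cdx-pb", "usage"))
```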

7.2 Signer (mTLS; hard gate)

POST /sign/dsse    # body: {subjectHash, imageDigest, predicate}; headers: OpTok (DPoP/mTLS) + PoE
GET  /verify/referrers?imageDigest=sha256:...  # is this image StellaOps-signed?

7.3 Attestor (mTLS)

POST /rekor/entries      # DSSE bundle → {uuid, index, proof, logURL}
GET  /rekor/entries/{uuid}

7.4 Authority (OIDC)

  • /.well-known/openid-configuration, /oauth/token (DPoP/mTLS), /oauth/introspect, /jwks

7.5 Licensing (cloud)

POST /license/enroll      { LT, pubKey }           → PoE + introspection endpoints
POST /license/revoke      { license_id }           → ok
POST /license/introspect  { poe }                  → { active, claims, exp }
POST /attest/endorse      { bundle }               → endorsement bundle (optional)

7.6 Scheduler

POST /api/v1/scheduler/schedules         {yaml|json}      → { scheduleId }
GET  /api/v1/scheduler/schedules                          → [ { id, nextRun, status, stats } ]
POST /api/v1/scheduler/run               { id|selector }   → { runId }
GET  /api/v1/scheduler/runs/{id}                          → { status, counts, links }
GET  /api/v1/scheduler/cursor                            → { lastConcelierExportId, lastExcititorExportId }

7.7 Notify

POST /api/v1/notify/test                 { channel, target } → { delivered }
POST /api/v1/notify/rules                {yaml|json}         → { ruleId }
GET  /api/v1/notify/rules                                   → [ { id, match, actions, enabled } ]
GET  /api/v1/notify/deliveries                              → [ { id, eventId, channel, status, attempts } ]

8) Security & verifiability

  • Sender-constrained tokens. All operational calls use DPoP (RFC 9449) or mTLS-bound tokens (RFC 8705).
  • Entitlement. PoE is mandatory; revocation honored online.
  • Release integrity. Signer independently verifies scanner image digest via Referrers + cosign before signing.
  • Separation of duties. Scanner/UI/Scheduler/Notify cannot sign; only Signer can sign; only Attestor can write to Rekor v2.
  • Verifiers. Anyone can verify: DSSE signature → certificate chain to StellaOps Fulcio/KMS root → Rekor v2 inclusion.
  • RBAC. Roles: scanner.admin|read, scheduler.admin|read, notify.admin|read, zastava.admin|read.
  • Community vs Authorized. Free/community runs throttled with no official attestations; authorized runs full speed and produce StellaOps-verified bundles.

DSSE predicate (SBOM/report)

{
  "predicateType": "https://stella-ops.org/attestations/sbom/1",
  "subject": [{ "name": "s3://stellaops/images/<digest>/inventory.cdx.pb", "digest": { "sha256": "<sha256>" } }],
  "predicate": {
    "image_digest": "<sha256:...>",
    "stellaops_version": "2.3.1 (2027.04)",
    "license_id": "LIC-9F2A...",
    "customer_id": "CUST-ACME",
    "plan": "pro",
    "policy_digest": "sha256:...",
    "views": ["inventory","usage"],
    "created": "2025-10-17T12:34:56Z"
  }
}
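For readers unfamiliar with DSSE, the sketch below shows how a statement like the one above would be wrapped: the signature covers the DSSE pre-authentication encoding (PAE) of the payload type and payload, not the raw JSON. The `pae` function follows the DSSE specification; `make_envelope`, the `sign` callback, and the `keyid` value are hypothetical, and the payload type assumes an in-toto statement.

```python
import base64
import json
from typing import Callable

IN_TOTO_PAYLOAD_TYPE = "application/vnd.in-toto+json"


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE Pre-Authentication Encoding: the exact bytes that get signed."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def make_envelope(statement: dict, sign: Callable[[bytes], bytes], keyid: str) -> dict:
    """Wrap a statement in a DSSE envelope using a caller-supplied signer."""
    payload = json.dumps(statement, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = sign(pae(IN_TOTO_PAYLOAD_TYPE, payload))
    return {
        "payloadType": IN_TOTO_PAYLOAD_TYPE,
        "payload": base64.b64encode(payload).decode("ascii"),
        "signatures": [
            {"keyid": keyid, "sig": base64.b64encode(signature).decode("ascii")}
        ],
    }


# Test vector from the DSSE specification.
print(pae("http://example.com/HelloWorld", b"hello world"))
```

In this architecture the `sign` callback would correspond to Signer; no other component holds signing material.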

BOM-Index sidecar. Binary header + purl table + roaring bitmaps; optional usedByEntrypoint flags for fast policy joins.


9) Scale, performance & quotas

  • Workers: horizontal; distributed lock per layer digest; global CAS in MinIO.

  • Queues: Redis Streams / NATS / RabbitMQ. HPA by queue depth, CPU, memory.

  • Registry throttling: per-registry concurrency budgets.

  • Targets:

    • Build-time path P95 ≤ 3–5 s on warmed bases.
    • Post-build delta scan P95 ≤ 10 s for 200 MB images.
    • Policy + VEX evaluation ≤ 500 ms for 5k components using BOM-Index.
    • Event → notification P95 ≤ 30–60 s under nominal load.
    • Export delta → re-evaluation verdict P95 ≤ 5 min for 10k impacted images.
  • Quotas: license plan enforces QPS/concurrency/size; Signer throttles and can deny DSSE.


10) DevOps & distribution

  • Releases: all first-party images cosign-signed; labels embed org.stellaops.version and org.stellaops.release_date.

  • Channels:

    • Community (public registry): throttled, non-attesting.
    • Authorized (private registry): full speed, DSSE enabled.
  • Client update flow: containers self-verify signatures at boot; report version; Signer enforces valid_release_year / max_version from PoE before signing.

  • Compose skeleton:

services:
  authority:       { image: stellaops/authority, depends_on: [postgres] }
  fulcio:          { image: sigstore/fulcio }
  rekor:           { image: sigstore/rekor-v2 }
  minio:           { image: minio/minio, command: server /data --console-address ":9001" }
  postgres:        { image: postgres:15-alpine, environment: { POSTGRES_DB: stellaops, POSTGRES_USER: stellaops } }
  signer:          { image: stellaops/signer, depends_on: [authority, fulcio] }
  attestor:        { image: stellaops/attestor, depends_on: [rekor, signer] }
  scanner-web:     { image: stellaops/scanner-web, depends_on: [postgres, minio, signer, attestor] }
  scanner-worker:  { image: stellaops/scanner-worker, deploy: { replicas: 4 }, depends_on: [scanner-web] }
  concelier:       { image: stellaops/concelier-web, depends_on: [postgres] }
  excititor:       { image: stellaops/excititor-web, depends_on: [postgres] }
  scheduler-web:   { image: stellaops/scheduler-web, depends_on: [postgres] }
  scheduler-worker:{ image: stellaops/scheduler-worker, deploy: { replicas: 2 }, depends_on: [scheduler-web] }
  notify-web:      { image: stellaops/notify-web, depends_on: [postgres] }
  notify-worker:   { image: stellaops/notify-worker, deploy: { replicas: 2 }, depends_on: [notify-web] }
  ui:              { image: stellaops/ui, depends_on: [scanner-web, concelier, excititor, scheduler-web, notify-web] }
  • Binary prerequisites (offline-first):

    • Single curated NuGet location: local-nugets/ holds the .nupkg feed (hashed in manifest.json) and the restore output (local-nugets/packages, configured via NuGet.config).
    • Non-NuGet binaries (plugins/CLIs/tools) are catalogued with SHA-256 in vendor/manifest.json; air-gap bundles are registered in offline/feeds/manifest.json.
    • CI guard: scripts/verify-binaries.sh blocks binaries outside approved roots; offline restores use dotnet restore --source local-nugets with OFFLINE=1 (override via ALLOW_REMOTE=1).
  • Backups: Mongo dumps; RustFS snapshots (or S3 versioning when fallback driver is used); Rekor v2 DB snapshots; JWKS/Fulcio/KMS key rotation.

  • Ops runbooks: Scheduler catch-up after Concelier/Excititor recovery; connector key rotation (Slack/Teams/SMTP).

  • SLOs & alerts: lag between Concelier/Excititor export and first rescan verdict; delivery failure rates by channel.


11) Observability & audit

  • Metrics: scan latency, layer cache hit %, artifact bytes, DSSE/Rekor latency, policy evaluation time, queue depth, admission decisions (Zastava).
  • Scheduler metrics: scheduler.impacted_images_total, scheduler.jobs_enqueued_total, scheduler.selection_ms, end-to-end p95 (event → verdict).
  • Notify metrics: notify.sent_total{channel}, notify.dropped_total{reason}, notify.digest_coalesced_total, notify.latency_ms.
  • Tracing: per-stage spans; correlation IDs across Scanner→Signer→Attestor and Concelier/Excititor→Scheduler→Scanner→Notify.
  • Audit logs: every signing records license_id, image_digest, policy_digest, and Rekor UUID; Scheduler records who scheduled what; Notify records where, when, and why messages were sent or deduped.
  • Compliance: RustFS retention headers (or MinIO Object Lock when operating in S3 mode) keep immutable artifacts tamper-resistant; reproducible outputs via policy digest + SBOM digest in predicate.

12) Roadmap (anchored to this architecture)

  • M2: Windows MSI/SxS/GAC analyzers; deeper Rust (DWARF enrichers).
  • M2: Buildx generator certified flows; cross-registry trust policies.
  • M3: Patch-Presence plugin (signature-based backport detection), opt-in.
  • M3: Zastava Admission control GA with policy presets and dry-run→enforce stages.
  • M3: Scheduler GA with export-delta impact routing and capacity-aware pacing.
  • M3: Notify GA with digests, Slack/Teams/Email/Webhooks; M4: PagerDuty/Opsgenie connectors.
  • Continuous: Policy UX (waiver TTLs, vendor rules), Excititor connectors expansion.

13) Canonical sequences (verification, re-evaluation & notify)

Sign & log (OpTok + PoE, image verify, DSSE, Rekor).

sequenceDiagram
  autonumber
  participant Scan as Scanner.WebService
  participant Auth as Authority (OIDC)
  participant Sign as Signer
  participant Reg as OCI Registry
  participant Ful as Fulcio/KMS
  participant Att as Attestor
  participant Rek as Rekor v2

  Scan->>Auth: Get OpTok (DPoP/mTLS)
  Scan->>Sign: sign(request) + OpTok + PoE + DPoP proof
  Sign->>Auth: Validate OpTok & sender-constraint
  Sign->>Sign: Validate PoE (introspect/revocation)
  Sign->>Reg: Verify scanner image is StellaOps-signed (Referrers + cosign)
  alt OK
    Sign->>Ful: Get signing cert (keyless) or use KMS key
    Sign-->>Scan: DSSE bundle (cert chain)
    Scan->>Att: Submit bundle
    Att->>Rek: Create entry
    Rek-->>Att: {uuid,index,proof}
    Att-->>Scan: Rekor URL
  else Deny
    Sign-->>Scan: 403 (no attestation)
  end

Event-driven re-evaluation & notify.

sequenceDiagram
  participant CONC as Concelier
  participant EXC as Excititor
  participant SCH as Scheduler
  participant SC as Scanner.WebService
  participant NO as Notify

  CONC->>SCH: export.delta {changedProductKeys, exportId}
  EXC ->>SCH: export.delta {changedProductKeys, exportId}
  SCH->>SCH: Impact select via BOM-Index bitmaps
  SCH->>SC: Enqueue analysis-only reports (batches)
  SC-->>SCH: verdict stream (PASS/FAIL, deltas)
  SCH->>NO: rescan.delta {imageDigest, newCriticals, links}
  NO-->>Slack/Teams/Email/Webhook: deliver (throttle/digest rules applied)
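The impact-selection step in this sequence can be pictured as a bitmap union. The sketch below uses Python integers as a stand-in for roaring bitmaps and assumes that the changed product keys are purls; `BomIndex`, `add`, and `impacted` are hypothetical names, not the BOM-Index binary format.

```python
from typing import Dict, Iterable, List


class BomIndex:
    """Toy inverted index: purl -> bitmap of images containing it."""

    def __init__(self) -> None:
        self._bit_of: Dict[str, int] = {}   # image digest -> bit position
        self._images: List[str] = []        # bit position -> image digest
        self._by_purl: Dict[str, int] = {}  # purl -> bitmap (Python int)

    def add(self, image_digest: str, purls: Iterable[str]) -> None:
        bit = self._bit_of.setdefault(image_digest, len(self._images))
        if bit == len(self._images):        # first time this image is seen
            self._images.append(image_digest)
        for purl in purls:
            self._by_purl[purl] = self._by_purl.get(purl, 0) | (1 << bit)

    def impacted(self, changed_keys: Iterable[str]) -> List[str]:
        # Union the bitmaps of every changed key, then decode set bits.
        mask = 0
        for key in changed_keys:
            mask |= self._by_purl.get(key, 0)
        return [img for i, img in enumerate(self._images) if (mask >> i) & 1]


index = BomIndex()
index.add("sha256:img1", ["pkg:npm/lodash@4.17.20", "pkg:pypi/requests@2.31.0"])
index.add("sha256:img2", ["pkg:pypi/requests@2.31.0"])
index.add("sha256:img3", ["pkg:golang/example.com/x@1.0.0"])
print(index.impacted(["pkg:pypi/requests@2.31.0"]))
```

Only the images whose bit is set are enqueued for analysis-only reports; images that do not contain any changed key are never touched.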

14) Minimal data shapes (Scheduler & Notify)

Scheduler schedule (YAML via UI/CLI)

name: nightly-eu
when: "0 2 * * * Europe/Sofia"
mode: analysis-only        # or content-refresh
selection:
  scope: all-images        # or tenant/ns/repo label selectors
  onlyIf: { lastReportOlderThanDays: 7 }
notify:
  onNewFindings: true
  minSeverity: high
limits:
  maxJobs: 5000
  ratePerSecond: 50
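One possible reading of the `limits` block is that `maxJobs` caps how many jobs a single run may enqueue and `ratePerSecond` paces the enqueueing. Under that assumption, which the schedule format itself does not confirm, a hypothetical `plan_run` helper can estimate how a run would play out:

```python
import math


def plan_run(impacted_images: int, max_jobs: int, rate_per_second: int) -> dict:
    """Estimate job count, overflow, and minimum enqueue time for one run."""
    if rate_per_second <= 0:
        raise ValueError("rate_per_second must be positive")
    jobs = min(impacted_images, max_jobs)
    return {
        "jobs": jobs,                            # jobs enqueued in this run
        "deferred": impacted_images - jobs,      # left for a later run
        "min_enqueue_seconds": math.ceil(jobs / rate_per_second),
    }


# 12,000 impacted images against the limits shown above.
print(plan_run(12000, max_jobs=5000, rate_per_second=50))
# {'jobs': 5000, 'deferred': 7000, 'min_enqueue_seconds': 100}
```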

Notify rule (YAML)

name: high-critical-alerts
match:
  eventKinds: ["report.ready","rescan.delta","zastava.admission"]
  minSeverity: high
  namespaces: ["prod-*"]
  vex: { includeAcceptedJustifications: false }
actions:
  - channel: slack
    target: "#sec-alerts"
    template: "concise"
    throttle: "5m"
  - channel: email
    target: "soc@acme.org"
    digest: "hourly"
enabled: true
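To show how the `match` block of such a rule might be evaluated, here is a minimal sketch. The event shape (`kind`, `severity`, `namespace`), the severity ordering, and the use of shell-style globbing for `namespaces` are assumptions for illustration; the `vex` filter is not modeled.

```python
from fnmatch import fnmatchcase

SEVERITY_ORDER = ["low", "medium", "high", "critical"]  # assumed ordering


def rule_matches(rule: dict, event: dict) -> bool:
    """Return True if an enabled rule's match block accepts the event."""
    if not rule.get("enabled", True):
        return False
    match = rule["match"]
    if event["kind"] not in match["eventKinds"]:
        return False
    if SEVERITY_ORDER.index(event["severity"]) < SEVERITY_ORDER.index(match["minSeverity"]):
        return False
    # Namespace patterns such as "prod-*" are treated as shell-style globs.
    return any(fnmatchcase(event["namespace"], pat) for pat in match["namespaces"])


rule = {
    "name": "high-critical-alerts",
    "match": {
        "eventKinds": ["report.ready", "rescan.delta", "zastava.admission"],
        "minSeverity": "high",
        "namespaces": ["prod-*"],
    },
    "enabled": True,
}
event = {"kind": "report.ready", "severity": "critical", "namespace": "prod-eu"}
print(rule_matches(rule, event))  # True
```

Throttle and digest handling would apply after a match, per action, and are left out of this sketch.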