# High‑Level Architecture: **Stella Ops** (Consolidated • 2025Q4)

> **Purpose.** A complete, implementation‑ready map of Stella Ops: product vision, all runtime components, trust boundaries, tokens/licensing, control/data flows, storage, APIs, security, scale, DevOps, and verification logic.
> **Scope.** This file **replaces** the separate `components.md`; all component details now live here.

---

## 0) Product vision & principles

**Vision.** Stella Ops is a **deterministic SBOM + VEX platform** for CI/CD and runtime, tuned for **speed** (per‑layer deltas), **quiet output** (usage‑scoped views), and **verifiability** (DSSE + Rekor v2). It is **self‑hostable**, **air‑gap capable**, and **commercially enforceable**: only licensed installations can produce **Stella Ops‑verified** attestations.

**Operating principles.**

* **Scanner‑owned SBOMs.** We generate our own BOMs; we do not warehouse third‑party SBOM content (we can **link** to attested SBOMs).
* **Deterministic evidence.** Facts come from package DBs, installed metadata, linkers, and verified attestations; no fuzzy guessing in the core.
* **Per‑layer caching.** Cache fragments by **layer digest** and compose image SBOMs via **CycloneDX BOM‑Link** / **SPDX ExternalRef**.
* **Inventory vs Usage.** Always record the full **inventory** of what exists; separately present **usage** (entrypoint closure + loaded libs).
* **Backend decides.** PASS/FAIL is produced by **Policy** + **VEX** + **Advisories**. The scanner reports facts.
* **Attest or it didn’t happen.** Every export is signed as **in‑toto/DSSE** and logged in **Rekor v2**.
* **Sovereign‑ready.** Cloud is used only for licensing and optional endorsement; everything else is first‑party and self‑hostable.

---

## 1) Service topology & trust boundaries

### 1.1 Runtime inventory (first‑party)

| Service / Tool | Container image | Core role | Scale pattern |
| --- | --- | --- | --- |
| **Scanner.WebService** | `stellaops/scanner-web` | Control plane for scans; catalog; SBOM composition (inventory & usage); diff; exports. | Stateless; N replicas behind LB. |
| **Scanner.Worker** | `stellaops/scanner-worker` | Runs analyzers (OS, Lang: Java/Node/Python/Go/.NET/Rust, Native ELF/PE/Mach‑O, EntryTrace); emits per‑layer SBOMs and composes image SBOMs. | Horizontal; queue‑driven; sharded by layer digest. |
| **Scanner.Sbomer.BuildXPlugin** | `stellaops/sbom-indexer` | BuildKit **generator** for build‑time SBOMs as OCI **referrers**. | CI‑side; ephemeral. |
| **Scanner.Sbomer.DockerImage** | `stellaops/scanner-cli` | CLI‑orchestrated scanner container for post‑build scans. | Local/CI; ephemeral. |
| **Concelier.WebService** | `stellaops/concelier-web` | Vulnerability ingest/normalize/merge/export (JSON + Trivy DB). | HA via Mongo locks. |
| **Excititor.WebService** | `stellaops/excititor-web` | VEX ingest/normalize/consensus; conflict retention; exports. | HA via Mongo locks. |
| **Policy Engine** | (in `scanner-web`) | YAML DSL evaluator (waivers, vendor preferences, KEV/EPSS, license, usage‑gating); produces **policy digest**. | In‑process; cache per digest. |
| **Signer** | `stellaops/signer` | **Hard gate:** validates entitlement + release integrity; mints signing cert (Fulcio keyless) or uses KMS; signs DSSE. | Stateless; HPA by QPS. |
| **Attestor** | `stellaops/attestor` | Posts DSSE bundles to **Rekor v2**; verification endpoints. | Stateless; HPA by QPS. |
| **Authority** | `stellaops/authority` | On‑prem OIDC issuing **short‑lived OpToks** with DPoP/mTLS sender constraint. | HA behind LB. |
| **Zastava** (Runtime) | `stellaops/zastava` | Runtime inspector/enforcer (observer + optional Admission Webhook). | DaemonSet + Webhook. |
| **Web UI** | `stellaops/ui` | Angular app for scans, diffs, policy, VEX, runtime, reports. | Stateless. |
| **StellaOps.Cli** | `stellaops/cli` | CLI for init/scan/export/diff/policy/report/verify; Buildx helper. | Local/CI. |

### 1.2 Third‑party (self‑hosted)

* **Fulcio** (Sigstore CA): issues short‑lived signing certs (keyless).
* **Rekor v2** (tile‑backed transparency log).
* **MinIO**: S3‑compatible object store with lifecycle & Object Lock.
* **MongoDB**: catalog, advisories, VEX.
* **Queue**: Redis Streams / NATS / RabbitMQ (pluggable).
* **OCI Registry**: must support the **Referrers API** (to discover SBOMs/signatures).

### 1.3 Cloud licensing (Stella Ops)

* **Licensing Service** (`www.stella-ops.org`): issues long‑lived **License Tokens (LT)**; exchanges LT → **Proof‑of‑Entitlement (PoE)** bound to an installation key; revokes/introspects PoE; optional cross‑log **endorsement**.

### 1.4 Diagram (control/data planes & trust)

```mermaid
flowchart LR
  subgraph Cloud["www.stella-ops.org (Cloud)"]
    LS["Licensing Service<br/>LT→PoE / revoke / introspect"]
  end

  subgraph OnPrem["Customer Site (Self-hosted)"]
    Auth["Authority (OIDC)<br/>OpTok (DPoP/mTLS)"]
    SW[Scanner.WebService]
    WK[Scanner.Worker xN]
    FEED[Concelier]
    VEX[Excititor]
    POL["Policy Engine (in Scanner.Web)"]
    SGN["Signer<br/>(entitlement + signing)"]
    ATT["Attestor<br/>(Rekor v2 submit/verify)"]
    UI["Web UI (Angular)"]
    Z["Zastava<br/>(Runtime Inspector/Enforcer)"]
    MIN[(MinIO S3)]
    MGO[(MongoDB)]
    QUE[(Queue/Streams)]
  end

  CLI["StellaOps.Cli / Buildx Plugin"]
  REG[(OCI Registry with Referrers)]
  FUL[Fulcio]
  REK["Rekor v2 (tiles)"]

  CLI -->|scan/build| SW
  SW -->|jobs| QUE
  QUE --> WK
  WK --> MIN
  SW --> MGO
  FEED --> MGO
  VEX --> MGO
  UI --> SW
  Z --> SW

  SGN <--> Auth
  SGN --> FUL
  SGN -->|mTLS| ATT
  ATT --> REK

  SGN <-->|verify referrers| REG
```

**Trust boundaries.** Only **Signer** can sign; only **Attestor** can write to **Rekor v2**. Scanner/UI never sign.

---

## 2) Licensing & tokens (installation‑ready, theft‑resistant)

**Token model.** Two tokens gate every signing request (PoE and OpTok); a third, the License Token, is used only once at enrollment.

* **License Token (LT)**: long‑lived JWT from the **Licensing Service**; used **once** to enroll the installation; never used in the hot path.
* **Proof‑of‑Entitlement (PoE)**: bound to the installation key (mTLS client cert **or** DPoP‑bound JWT with `cnf`); medium‑lived; renewable; revocable.
* **Operational token (OpTok)**: 2–5 min OIDC token from **Authority**, **sender‑constrained** (DPoP or mTLS). Used to authenticate to **Signer**/**Scanner.WebService**.

**Signer enforces both:** PoE proves entitlement; OpTok proves “who is calling now”. It also **independently verifies** that the **scanner image digest** is **Stella Ops‑signed** via **Referrers + cosign** before signing anything.

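
The three checks the Signer applies can be sketched as a small decision function. This is only an illustration under stated assumptions: the `SignRequest` and `Decision` types, their field names, and the order of the checks are hypothetical and not taken from the Signer's actual code; each boolean stands in for a real verification step (token validation, PoE introspection, Referrers + cosign lookup).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignRequest:
    # Each flag is a stand-in for the outcome of a real verification step.
    optok_valid: bool           # OpTok signature/expiry accepted by Authority
    sender_bound: bool          # DPoP proof or mTLS cert matches the token binding
    poe_active: bool            # PoE introspection says active and not revoked
    scanner_image_signed: bool  # Referrers + cosign check on the scanner image


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def signer_gate(req: SignRequest) -> Decision:
    """Deny unless caller identity, entitlement and release integrity all hold."""
    if not (req.optok_valid and req.sender_bound):
        return Decision(False, "OpTok invalid or not sender-constrained")
    if not req.poe_active:
        return Decision(False, "PoE missing, expired or revoked")
    if not req.scanner_image_signed:
        return Decision(False, "scanner image is not Stella Ops-signed")
    return Decision(True, "ok")
```

A denied request would map to the `403 (no attestation)` branch of the signing sequence; only an allowed one proceeds to certificate issuance and DSSE signing.
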
**Enrollment sequence (LT → PoE).**

```plantuml
@startuml
actor Operator
participant "Install Agent" as IA
participant "Licensing Service" as LS
Operator -> IA: Provide LT
IA -> IA: Generate K_inst
IA -> LS: /license/enroll {LT, pub(K_inst)}
LS --> IA: PoE (mTLS client cert or JWT with cnf=K_inst), CRL/OCSP/introspect
@enduml
```

---

## 3) Scanner subsystem (facts engine)

### 3.1 Analyzers (deterministic only)

* **OS packages:** apk/dpkg/rpm (Linux); Windows MSI/SxS/GAC (M2).
* **Language (installed state):**

  * Java (pom.properties / MANIFEST) → `pkg:maven/...`
  * Node (`node_modules/*/package.json`) → `pkg:npm/...`
  * Python (`*.dist-info/METADATA`) → `pkg:pypi/...`
  * Go (buildinfo) → `pkg:golang/...`
  * .NET (`*.deps.json`) → `pkg:nuget/...`
  * **Rust:** deterministic **language markers** (symbol mangling) and crates only when present; otherwise `bin:{sha256}`.
* **Native:** ELF/PE/Mach‑O imports, DT_NEEDED, RPATH/RUNPATH, symbol versions, PE version info.
* **EntryTrace:** parse `ENTRYPOINT`/`CMD`; shell AST; resolve launchers (Java/Node/Python) to terminal program; record file:line chain.

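
As an illustration of what "installed state" means for one of the language analyzers, the sketch below derives a `pkg:pypi/...` purl from the text of a `*.dist-info/METADATA` file. It is a minimal example, not the analyzer itself: the function name `pypi_purl` is hypothetical, and the normalization (lowercase, underscores to dashes) is the rule the purl specification gives for the `pypi` type.

```python
from email.parser import Parser


def pypi_purl(metadata_text: str) -> str:
    """Build a purl from the RFC 822-style headers of a dist-info METADATA file."""
    headers = Parser().parsestr(metadata_text)
    name, version = headers["Name"], headers["Version"]
    if not name or not version:
        raise ValueError("METADATA is missing Name or Version")
    # purl 'pypi' type: names are lowercased and '_' is replaced with '-'.
    normalized = name.lower().replace("_", "-")
    return f"pkg:pypi/{normalized}@{version}"


sample = "Metadata-Version: 2.1\nName: Typing_Extensions\nVersion: 4.12.2\n"
purl = pypi_purl(sample)
```

Because the input is the installed metadata file rather than a manifest or lockfile, the result reflects what is actually on disk, which is the point of the deterministic-evidence principle.
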
### 3.2 Caching & composition

* **Layer cache:** `{layerDigest → SBOM fragment + analyzer meta}`.
* **File CAS:** `{sha256(file) → parse result (ELF/JAR metadata/etc.)}`.
* **Composition:** build **image SBOMs** from fragments via **BOM‑Link/ExternalRef**; emit **two views**:

  * **Inventory** (complete filesystem inventory).
  * **Usage** (entrypoint closure + linked libs).
* **Transport:** JSON **and** **CycloneDX Protobuf** (compact, fast to parse).
* **Index:** BOM‑Index sidecar with purl table + roaring bitmap + `usedByEntrypoint` flag for fast joins.

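
The layer cache and composition step can be pictured with a small in-memory sketch. Everything here is an assumption made for illustration: the `LayerCache` class, the fragment shape (a list of purls), and the output of `compose_image` are hypothetical, and a real deployment would keep fragments in MinIO and emit BOM-Link or ExternalRef references instead of Python dicts.

```python
from typing import Callable, Dict, List

Fragment = List[str]  # hypothetical fragment shape: purls found in one layer


class LayerCache:
    """Memoizes analyzer output by layer digest so shared layers are scanned once."""

    def __init__(self, analyze: Callable[[str], Fragment]) -> None:
        self._analyze = analyze
        self._store: Dict[str, Fragment] = {}
        self.misses = 0

    def fragment(self, layer_digest: str) -> Fragment:
        if layer_digest not in self._store:
            self.misses += 1
            self._store[layer_digest] = self._analyze(layer_digest)
        return self._store[layer_digest]


def compose_image(cache: LayerCache, layer_digests: List[str]) -> List[dict]:
    """Compose an image view as ordered references to per-layer fragments."""
    return [{"layer": d, "components": cache.fragment(d)} for d in layer_digests]


# Two images that share a base layer: the base is analyzed only once.
fake_layers = {
    "sha256:base": ["pkg:deb/debian/openssl@3.0.11"],
    "sha256:app1": ["pkg:npm/express@4.19.2"],
    "sha256:app2": ["pkg:pypi/flask@3.0.3"],
}
cache = LayerCache(lambda digest: fake_layers[digest])
image_a = compose_image(cache, ["sha256:base", "sha256:app1"])
image_b = compose_image(cache, ["sha256:base", "sha256:app2"])
```

After both compositions `cache.misses` is 3 rather than 4, which is the per-layer delta effect the speed targets rely on.
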
### 3.3 Diff (image → layer → package)

* Added / Removed / Version‑changed components, each **attributed** to the layer that caused the change.
* Raw diffs preserved; backend view applies **VEX + Policy**.

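
A layer-attributed diff can be sketched as follows. The data model is a simplification assumed for this example only: each layer is a `(digest, changes)` pair where `changes` maps a component key to its version, or to `None` when the layer deletes it; the function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Layer = Tuple[str, Dict[str, Optional[str]]]  # (digest, {component: version or None})


def flatten(layers: List[Layer]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Replay layers in order; return final state and the last layer touching each key."""
    state: Dict[str, str] = {}
    touched: Dict[str, str] = {}
    for digest, changes in layers:
        for name, version in changes.items():
            touched[name] = digest
            if version is None:
                state.pop(name, None)
            else:
                state[name] = version
    return state, touched


def diff_images(old: List[Layer], new: List[Layer]) -> List[dict]:
    """Report added/removed/changed components with the new-image layer responsible."""
    old_state, _ = flatten(old)
    new_state, touched = flatten(new)
    out = []
    for name in sorted(old_state.keys() | new_state.keys()):
        before, after = old_state.get(name), new_state.get(name)
        if before == after:
            continue
        kind = "added" if before is None else "removed" if after is None else "changed"
        out.append({"name": name, "kind": kind, "old": before,
                    "new": after, "layer": touched.get(name)})
    return out


base = ("sha256:base", {"pkg:deb/openssl": "3.0.2", "pkg:deb/curl": "7.81.0"})
app = ("sha256:app", {"pkg:deb/openssl": "3.0.13", "pkg:deb/curl": None,
                      "pkg:npm/express": "4.19.2"})
changes = diff_images([base], [base, app])
```

In this toy input all three changes are attributed to `sha256:app`, because that is the last layer in the new image to touch each component.
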
### 3.4 Build‑time SBOMs (fast CI path)

* Buildx **generator** runs analyzers during `docker buildx build --attest=type=sbom,generator=stellaops/sbom-indexer` and attaches SBOMs as **OCI referrers**.
* Scanner.WebService can trust these (policy‑configurable) and **skip** re‑scan; DSSE + Rekor v2 can be done either at build time or post‑push via Signer/Attestor.

---

## 4) Backend evaluation (decider)

### 4.1 Concelier (advisories)

* Ingests vendor, distro, OSS feeds; normalizes & merges; persists canonical advisories in Mongo; exports **deterministic JSON** and **Trivy DB**.
* Offline kit bundles for air‑gapped sites.

### 4.2 Excititor (VEX)

* Ingests **OpenVEX / CSAF VEX / CycloneDX VEX**; normalizes claims; retains conflicts; computes **consensus** with provider trust weights and justification gates.

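
The document does not spell out how consensus is computed, so the following is only one plausible reading: a weighted vote over normalized statuses that keeps the losing claims instead of discarding them. The function name, the tie-break rule (alphabetical), and the example weights are assumptions; the status strings are the OpenVEX status labels.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Claim = Tuple[str, str]  # (provider, status)


def consensus(claims: List[Claim], trust: Dict[str, float]) -> dict:
    """Pick the status with the highest summed provider trust; retain conflicts."""
    totals: Dict[str, float] = defaultdict(float)
    for provider, status in claims:
        totals[status] += trust.get(provider, 0.0)  # unknown providers carry no weight
    # Iterating sorted keys makes ties resolve deterministically (alphabetical).
    winner = max(sorted(totals), key=lambda s: totals[s])
    conflicts = [c for c in claims if c[1] != winner]
    return {"status": winner, "weights": dict(totals), "conflicts": conflicts}


result = consensus(
    [("vendor", "not_affected"), ("distro", "affected"), ("community", "affected")],
    {"vendor": 1.0, "distro": 0.6, "community": 0.3},
)
```

Here the vendor claim wins on weight, while both dissenting claims remain available in `conflicts` for audit, which mirrors the "retains conflicts" behaviour described above.
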
### 4.3 Policy Engine (YAML DSL)

* Matchers: `image/repo/env/purl/cve/vendor/source/path/layerDigest/usedByEntrypoint`
* Actions: `ignore(until, justification)`, `fail`, `warn`, `defer`, `requireVEX{vendors, justifications}`, `escalate {sev, KEV, EPSS}`, license constraints.
* Produces a **policy digest** (SHA‑256 of canonicalized policy).

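
A policy digest of this kind might be computed as sketched below. The canonical form is an assumption, since the document only says "canonicalized": here the parsed policy is serialized as JSON with sorted keys and no insignificant whitespace before hashing. The `policy_digest` name and the sample policy fields are hypothetical.

```python
import hashlib
import json


def policy_digest(policy: dict) -> str:
    """SHA-256 over a canonical JSON rendering of an already-parsed policy."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same rules, different key order: the digest must not change.
policy_a = {"rules": [{"match": {"cve": "CVE-2024-0001"}, "action": "fail"}], "version": 1}
policy_b = {"version": 1, "rules": [{"action": "fail", "match": {"cve": "CVE-2024-0001"}}]}
digest_a = policy_digest(policy_a)
digest_b = policy_digest(policy_b)
```

Key order and whitespace in the source YAML do not affect the digest under this scheme, while any change to a rule does, so the digest can be recorded in attestations as a stable policy identifier.
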
### 4.4 PASS/FAIL flow

1. SBOM (Inventory / Usage) → join with **Concelier** advisories.
2. Apply **Excititor** consensus (statuses & justifications).
3. Apply **Policy**; compute PASS/FAIL with waiver TTLs.
4. Sign the **final report** (DSSE via **Signer**) and log to **Rekor v2** via **Attestor**.

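
Steps 1 to 3 of this flow can be sketched as a single pure function. The input shapes are assumptions chosen to keep the example short (advisories as `purl -> [cve]`, VEX consensus as `(purl, cve) -> status`, waivers as `cve -> expiry`), and treating a missing VEX statement as `affected` is likewise an assumption, not documented behaviour.

```python
from datetime import datetime, timezone
from typing import Dict, List, Tuple


def evaluate(components: List[str],
             advisories: Dict[str, List[str]],
             vex: Dict[Tuple[str, str], str],
             waivers: Dict[str, datetime],
             now: datetime) -> Tuple[str, List[Tuple[str, str]]]:
    """Join SBOM with advisories, apply VEX, then waivers; return verdict and findings."""
    findings = []
    for purl in components:
        for cve in advisories.get(purl, []):          # step 1: advisory join
            status = vex.get((purl, cve), "affected")  # step 2: VEX consensus
            if status in ("not_affected", "fixed"):
                continue
            expiry = waivers.get(cve)                  # step 3: policy waiver with TTL
            if expiry is not None and now < expiry:
                continue
            findings.append((purl, cve))
    return ("FAIL" if findings else "PASS"), findings


now = datetime(2025, 10, 17, tzinfo=timezone.utc)
verdict, findings = evaluate(
    components=["pkg:npm/a@1.0.0", "pkg:npm/b@2.0.0", "pkg:npm/c@3.0.0"],
    advisories={"pkg:npm/a@1.0.0": ["CVE-1"], "pkg:npm/b@2.0.0": ["CVE-2"],
                "pkg:npm/c@3.0.0": ["CVE-3"]},
    vex={("pkg:npm/a@1.0.0", "CVE-1"): "not_affected"},
    waivers={"CVE-2": datetime(2025, 12, 1, tzinfo=timezone.utc)},
    now=now,
)
```
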

---

## 5) Runtime enforcement (Zastava)

* **Observer:** inventories running containers, checks image signatures and SBOM presence (referrers), detects drift (entrypoint chain divergence), flags unapproved images.
* **Admission Webhook (optional):** blocks policy‑fail pods (dry‑run first).
* **Integration:** posts runtime events to Scanner.WebService; can request **delta scans** on changed layers.

---

## 6) Storage & catalogs (MinIO/Mongo)

**MinIO layout**

```
s3://stellaops/
  layers/<sha256>/sbom.cdx.json.zst
  layers/<sha256>/sbom.spdx.json.zst
  images/<imgDigest>/inventory.cdx.pb
  images/<imgDigest>/usage.cdx.pb
  indexes/<imgDigest>/bom-index.bin
  attest/<artifactSha256>.dsse.json
```

**Catalog (Mongo)**

* `artifacts` (type/format/sha/size/rekor/ttl/immutable/refCount/createdAt)
* `images`, `layers`, `links`, `lifecycleRules`

**Retention**

* MinIO **ILM** for coarse TTL; Scanner.WebService GC decrements `refCount` and deletes unreferenced metadata; **Object Lock** for immutable classes (auditable artifacts).

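
The reference-counting part of retention can be sketched as below. The in-memory `catalog` dict stands in for the Mongo `artifacts` collection, and the two helper names are hypothetical; only the `refCount` and `immutable` fields come from the catalog description above.

```python
from typing import Dict, List


def release(catalog: Dict[str, dict], artifact_id: str) -> None:
    """Drop one reference to an artifact (never below zero)."""
    entry = catalog[artifact_id]
    entry["refCount"] = max(0, entry["refCount"] - 1)


def collect(catalog: Dict[str, dict]) -> List[str]:
    """Delete unreferenced, non-immutable entries and return their ids."""
    doomed = sorted(k for k, v in catalog.items()
                    if v["refCount"] == 0 and not v["immutable"])
    for key in doomed:
        del catalog[key]
    return doomed


catalog = {
    "layers/aaa/sbom.cdx.json.zst": {"refCount": 2, "immutable": False},
    "layers/bbb/sbom.cdx.json.zst": {"refCount": 1, "immutable": False},
    "attest/ccc.dsse.json": {"refCount": 0, "immutable": True},  # Object Lock class
}
release(catalog, "layers/aaa/sbom.cdx.json.zst")
release(catalog, "layers/bbb/sbom.cdx.json.zst")
deleted = collect(catalog)
```
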

---

## 7) APIs (consolidated surface)

### 7.1 Scanner.WebService

```
POST /api/scans                   { imageRef|digest, force? }      → { scanId }
GET  /api/scans/{id}                                               → { status, digests, artifacts[] }
GET  /api/sboms/{imageDigest}     ?format=cdx-json|cdx-pb|spdx-json&view=inventory|usage
GET  /api/diff?old=<digest>&new=<digest>                           → { added[], removed[], changed[], byLayer[] }
POST /api/exports                 { imageDigest, format, view }    → { artifactId, rekorUrl }
POST /api/reports                 { imageDigest, policyRevision? } → { reportId, rekorUrl }
GET  /api/catalog/artifacts/{id}                                   → { size, ttl, immutable, rekor, refs }
GET  /healthz | /readyz | /metrics
```

### 7.2 Signer (mTLS; hard gate)

```
POST /sign/dsse                              # body: {subjectHash, imageDigest, predicate}; headers: OpTok (DPoP/mTLS) + PoE
GET  /verify/referrers?imageDigest=sha256:...  # is this image StellaOps-signed?
```

### 7.3 Attestor (mTLS)

```
POST /rekor/entries        # DSSE bundle → {uuid, index, proof, logURL}
GET  /rekor/entries/{uuid}
```

### 7.4 Authority (OIDC)

* `/.well-known/openid-configuration`, `/oauth/token` (DPoP/mTLS), `/oauth/introspect`, `/jwks`

### 7.5 Licensing (cloud)

```
POST /license/enroll      { LT, pubKey }   → PoE + introspection endpoints
POST /license/revoke      { license_id }   → ok
POST /license/introspect  { poe }          → { active, claims, exp }
POST /attest/endorse      { bundle }       → endorsement bundle (optional)
```

---

## 8) Security & verifiability

* **Sender‑constrained tokens.** All operational calls use **DPoP** (RFC 9449) or **mTLS‑bound** tokens (RFC 8705).
* **Entitlement.** **PoE** is mandatory; revocation is honored online.
* **Release integrity.** **Signer** independently verifies the **scanner image digest** via **Referrers + cosign** before signing.
* **Separation of duties.** Scanner/UI cannot sign; only **Signer** can sign; only **Attestor** can write to **Rekor v2**.
* **Verifiers.** Anyone can verify: DSSE signature → certificate chain to **Stella Ops Fulcio/KMS root** → **Rekor v2** inclusion.
* **Community vs Authorized.** Community (free) runs are throttled and produce no official attestations; authorized runs operate at full speed and produce **Stella Ops‑verified** bundles.

**DSSE predicate (SBOM/report)**

```json
{
  "predicateType": "https://stella-ops.org/attestations/sbom/1",
  "subject": [{ "name": "s3://stellaops/images/<digest>/inventory.cdx.pb", "digest": { "sha256": "<sha256>" } }],
  "predicate": {
    "image_digest": "<sha256:...>",
    "stellaops_version": "2.3.1 (2027.04)",
    "license_id": "LIC-9F2A...",
    "customer_id": "CUST-ACME",
    "plan": "pro",
    "policy_digest": "sha256:...",
    "views": ["inventory","usage"],
    "created": "2025-10-17T12:34:56Z"
  }
}
```

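
For readers unfamiliar with DSSE, the sketch below shows the pre-authentication encoding (PAE) that a DSSE signature is computed over, and how a statement like the one above would be wrapped in an envelope. The PAE layout follows the DSSE specification; the `envelope` helper, the `sign` callable, and the `keyid` value are placeholders for this example and do not describe how the Signer actually obtains or uses keys.

```python
import base64
import json
from typing import Callable


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: 'DSSEv1 <len> <type> <len> <payload>'."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def envelope(payload_type: str, payload: bytes,
             sign: Callable[[bytes], bytes], keyid: str = "") -> dict:
    """Wrap a payload in a DSSE envelope; the signature covers PAE, not the raw payload."""
    signature = sign(pae(payload_type, payload))
    return {
        "payloadType": payload_type,
        "payload": base64.b64encode(payload).decode("ascii"),
        "signatures": [{"keyid": keyid, "sig": base64.b64encode(signature).decode("ascii")}],
    }


statement = json.dumps({"predicateType": "https://stella-ops.org/attestations/sbom/1"}).encode()
# Placeholder signer: echoes a fixed value; a real one would use Fulcio/KMS-backed keys.
env = envelope("application/vnd.in-toto+json", statement, sign=lambda msg: b"not-a-real-signature")
```

Signing the PAE rather than the bare payload binds the payload type into the signature, so a verifier cannot be tricked into interpreting the same bytes as a different document type.
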
**BOM‑Index sidecar**

Binary header + purl table + roaring bitmaps; optional `usedByEntrypoint` flags for fast policy joins.

---

## 9) Scale, performance & quotas

* **Workers:** horizontal; **distributed lock per layer digest**; global CAS in MinIO.
* **Queues:** Redis Streams / NATS / RabbitMQ. HPA by queue depth, CPU, memory.
* **Registry throttling:** per‑registry concurrency budgets.
* **Targets:**

  * Build‑time path P95 ≤ 3–5 s on warmed bases.
  * Post‑build delta scan P95 ≤ 10 s for 200 MB images.
  * Policy + VEX evaluation ≤ 500 ms for 5k components using BOM‑Index.
* **Quotas:** license plan enforces QPS/concurrency/size; **Signer** throttles and can deny DSSE.

---

## 10) DevOps & distribution

* **Releases:** all first‑party images **cosign‑signed**; labels embed `org.stellaops.version` and `org.stellaops.release_date`.
* **Channels:**

  * **Community** (public registry): throttled, non‑attesting.
  * **Authorized** (private registry): full speed, DSSE enabled.
* **Client update flow:** containers self‑verify signatures at boot; report version; **Signer** enforces `valid_release_year` / `max_version` from PoE before signing.
* **Compose skeleton:**

```yaml
services:
  authority:   { image: stellaops/authority }
  fulcio:      { image: sigstore/fulcio }
  rekor:       { image: sigstore/rekor-v2 }
  minio:       { image: minio/minio, command: 'server /data --console-address ":9001"' }
  mongo:       { image: "mongo:7" }
  signer:      { image: stellaops/signer, depends_on: [authority, fulcio] }
  attestor:    { image: stellaops/attestor, depends_on: [rekor, signer] }
  scanner-web: { image: stellaops/scanner-web, depends_on: [mongo, minio, signer, attestor] }
  scanner-worker:
    image: stellaops/scanner-worker
    deploy: { replicas: 4 }
    depends_on: [scanner-web]
  concelier:   { image: stellaops/concelier-web, depends_on: [mongo] }
  excititor:   { image: stellaops/excititor-web, depends_on: [mongo] }
  ui:          { image: stellaops/ui, depends_on: [scanner-web, concelier, excititor] }
```

* **Backups:** Mongo dumps; MinIO versioned buckets & replication; Rekor v2 DB snapshots; JWKS/Fulcio/KMS key rotation.

---

## 11) Observability & audit

* **Metrics:** scan latency, layer cache hit %, artifact bytes, DSSE/Rekor latency, policy evaluation time, queue depth, admission decisions (Zastava).
* **Tracing:** per‑stage spans; correlation IDs across Scanner→Signer→Attestor.
* **Audit logs:** every signing records `license_id`, `image_digest`, `policy_digest`, and Rekor UUID.
* **Compliance:** MinIO **Object Lock** for immutable artifacts; reproducible outputs via policy digest + SBOM digest in predicate.

---

## 12) Roadmap (anchored to this architecture)

* M2: Windows MSI/SxS/GAC analyzers; deeper Rust (DWARF enrichers).
* M2: Buildx generator certified flows; cross‑registry trust policies.
* M3: Patch‑Presence plugin (signature‑based backport detection), opt‑in.
* M3: Zastava Admission control GA with policy presets and dry‑run→enforce stages.
* Continuous: Policy UX (waiver TTLs, vendor rules), Excititor connectors expansion.

---

## 13) Canonical sequences (verification & signing)

**Sign & log (OpTok + PoE, image verify, DSSE, Rekor).**

```mermaid
sequenceDiagram
  autonumber
  participant Scan as Scanner.WebService
  participant Auth as Authority (OIDC)
  participant Sign as Signer
  participant Reg as OCI Registry
  participant Ful as Fulcio/KMS
  participant Att as Attestor
  participant Rek as Rekor v2

  Scan->>Auth: Get OpTok (DPoP/mTLS)
  Scan->>Sign: sign(request) + OpTok + PoE + DPoP proof
  Sign->>Auth: Validate OpTok & sender-constraint
  Sign->>Sign: Validate PoE (introspect/revocation)
  Sign->>Reg: Verify scanner image is StellaOps-signed (Referrers + cosign)
  alt OK
    Sign->>Ful: Get signing cert (keyless) or use KMS key
    Sign-->>Scan: DSSE bundle (cert chain)
    Scan->>Att: Submit bundle
    Att->>Rek: Create entry
    Rek-->>Att: {uuid,index,proof}
    Att-->>Scan: Rekor URL
  else Deny
    Sign-->>Scan: 403 (no attestation)
  end
```

**Verification (third party).**

```plantuml
@startuml
actor Verifier
participant "stellaops verify" as Tool
database "Fulcio/KMS root" as Root
participant "Rekor v2" as R2
Verifier -> Tool: bundle (URL/file)
Tool -> Tool: Verify DSSE signature
Tool -> Root: Verify cert chain to StellaOps root
Tool -> R2: Verify inclusion proof / query by UUID
Tool -> Verifier: OK + claims (license_id, policy_digest, version)
@enduml
```

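
The "verify inclusion proof" step can be illustrated with the Merkle audit-path check defined in RFC 6962 / RFC 9162. This sketch assumes the log uses that hashing scheme (SHA-256, `0x00` leaf prefix, `0x01` node prefix); it does not parse real Rekor responses, and the three-leaf tree is built by hand purely to exercise the function.

```python
import hashlib
from typing import List


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_inclusion(leaf: bytes, index: int, tree_size: int,
                     proof: List[bytes], root: bytes) -> bool:
    """Recompute the root from a leaf hash and its audit path (RFC 9162, 2.1.3.2)."""
    if index < 0 or index >= tree_size:
        return False
    fn, sn, r = index, tree_size - 1, leaf
    for p in proof:
        if sn == 0:
            return False
        if fn & 1 or fn == sn:
            r = node_hash(p, r)
            # Skip levels where this node has no right sibling.
            while fn and not fn & 1:
                fn >>= 1
                sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root


# Hand-built tree over three entries: root = H(H(a, b), c).
ha, hb, hc = (leaf_hash(x) for x in (b"entry-a", b"entry-b", b"entry-c"))
root = node_hash(node_hash(ha, hb), hc)
ok_first = verify_inclusion(ha, 0, 3, [hb, hc], root)
ok_last = verify_inclusion(hc, 2, 3, [node_hash(ha, hb)], root)
```
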

---

**End of `high_level_architecture.md` (Consolidated).**