Rewrite architecture docs and add Vexer connector template

2025-10-17 19:34:43 +03:00
parent 29a7d51e41
commit fbd1826ef3
25 changed files with 4885 additions and 777 deletions


@@ -25,4 +25,10 @@ Pipeline note: deployment workflows should template `etc/feedser.yaml` during CI
injecting environment-specific Mongo credentials and telemetry endpoints. Upcoming
releases will add Microsoft OAuth (Entra ID) authentication support—track the quickstart
for integration steps once available.
## Documentation
- `docs/README.md` now consolidates the platform index and points to the updated high-level architecture.
- Module architecture dossiers live under `docs/ARCHITECTURE_*.md`; the most relevant here are `docs/ARCHITECTURE_FEEDSER.md` (service layout, merge engine, exports) and `docs/ARCHITECTURE_CLI.md` (command surface, AOT packaging, auth flows). Related services such as the Signer, Attestor, Authority, Scanner, UI, Vexer, Zastava, and DevOps pipeline each have their own dossier.
- Offline operation guidance moved to `docs/24_OFFLINE_KIT.md`, which details bundle composition, verification, and delta workflows. Feedser-specific connector operations stay in `docs/ops/feedser-certbund-operations.md` and companion runbooks under `docs/ops/`.


@@ -1,388 +1,430 @@
# 7 · High-Level Architecture — **StellaOps**
Below is the **revised, consolidated** `high_level_architecture.md`.
It **absorbs** all content from `components.md` so you have a single, authoritative file. No separate components doc is required.
---
## 0 · Purpose & Scope
# High-Level Architecture — **StellaOps** (Consolidated • 2025 Q4)
Give contributors, DevOps engineers and auditors a **complete yet readable map** of the Core:
* Major runtime components and message paths.
* Where plugins, CLI helpers and runtime agents attach.
* Technology choices that enable the sub-5-second SBOM goal.
* Typical operational scenarios (pipeline scan, mute, nightly rescan, etc.).
Anything enterprise-only (signed PDF, custom/regulated TLS, LDAP, enforcement) **must arrive as a plugin**; the Core never hardcodes those concerns.
---
## 1 · Component Overview
| # | Component | Responsibility |
|---|-----------|---------------|
| 1 | **API Gateway** | REST endpoints (`/scan`, `/quota`, **`/token/offline`**); token auth; quota enforcement |
| 2 | **Scan Service** | SBOM parsing, Delta-SBOM cache, vulnerability lookup |
| 3 | **Policy Engine** | YAML / (optional) Rego rule evaluation; verdict assembly |
| 4 | **Quota Service** | Per-token counters; **333 scans/day**; waits & HTTP 429 |
| 5 | **Client-JWT Issuer** | Issues 30-day offline tokens; bundles them into OUK |
| 6 | **Registry** | Anonymous internal Docker registry for agents, SBOM uploads |
| 7 | **Web UI** | React/Blazor SPA; dashboards, policy editor, quota banner |
| 8 | **Data Stores** | **Redis** (cache, quota) & **MongoDB** (SBOMs, findings, audit) |
| 9 | **Plugin Host** | Hot-load .NET DLLs; isolates community plugins |
| 10 | **Agents** | `sbombuilder`, `Stella CLI` scanner CLI, future `StellaOpsAttestor` |
```mermaid
flowchart TD
subgraph "External Actors"
DEV["Developer / DevSecOps / Manager"]
CI["CI/CD Pipeline (e.g., Stella CLI)"]
K8S["Kubernetes Cluster (e.g., Zastava Agent)"]
end
subgraph "Stella Ops Runtime"
subgraph "Core Services"
CORE["Stella Core<br>(REST + gRPC APIs, Orchestration)"]
REDIS[("Redis<br>(Cache, Queues, Trivy DB Mirror)")]
MONGO[("MongoDB<br>(Optional: Long-term Storage)")]
POL["Mute Policies<br>(OPA & YAML Evaluator)"]
REG["StellaOps Registry<br>(Docker Registry v2)"]
ATT["StellaOps Attestor<br>(SLSA + Rekor)"]
end
subgraph "Agents & Builders"
SB["SBOM Builder<br>(Go Binary: Extracts Layers, Generates SBOMs)"]
SA["Stella CLI<br>(Pipeline Helper: Invokes Builder, Triggers Scans)"]
ZA["Zastava Agent<br>(K8s Webhook: Enforces Policies, Inventories Containers)"]
end
subgraph "Scanners & UI"
TRIVY["Trivy Scanner<br>(Plugin Container: Vulnerability Scanning)"]
UI["Web UI<br>(Vue3 + Tailwind: Dashboards, Policy Editor)"]
CLI["Stella CLI<br>(CLI Helper: Triggers Scans, Mutes)"]
end
end
DEV -->|Browses Findings, Mutes CVEs| UI
DEV -->|Triggers Scans| CLI
CI -->|Generates SBOM, Calls /scan| SA
K8S -->|Inventories Containers, Enforces Gates| ZA
UI -- "REST" --> CORE
CLI -- "REST/gRPC" --> CORE
SA -->|Scan Requests| CORE
SB -->|Uploads SBOMs| CORE
ZA -->|Policy Gates| CORE
CORE -- "Queues, Caches" --> REDIS
CORE -- "Persists Data" --> MONGO
CORE -->|Evaluates Policies| POL
CORE -->|Attests Provenance| ATT
CORE -->|Scans Vulnerabilities| TRIVY
SB -- "Pulls Images" --> REG
SA -- "Pulls Images" --> REG
ZA -- "Pulls Images" --> REG
style DEV fill:#f9f,stroke:#333
style CI fill:#f9f,stroke:#333
style K8S fill:#f9f,stroke:#333
style CORE fill:#ddf,stroke:#333
style REDIS fill:#fdd,stroke:#333
style MONGO fill:#fdd,stroke:#333
style POL fill:#dfd,stroke:#333
style REG fill:#dfd,stroke:#333
style ATT fill:#dfd,stroke:#333
style SB fill:#fdf,stroke:#333
style SA fill:#fdf,stroke:#333
style ZA fill:#fdf,stroke:#333
style TRIVY fill:#ffd,stroke:#333
style UI fill:#ffd,stroke:#333
style CLI fill:#ffd,stroke:#333
```
* **Developer / DevSecOps / Manager** browses findings, mutes CVEs, triggers scans.
* **Stella CLI** generates SBOMs and calls `/scan` during CI.
* **Zastava Agent** inventories live containers; Core ships it in *passive* mode only (no kill).
### 1.1 Client-JWT Lifecycle (offline-aware)
1. **Online instance:** user signs in → `/connect/token` issues a JWT valid for 12 h.
2. **Offline instance:** a JWT with an `exp` of 30 days ships in the OUK; the backend
**re-signs** and stores it during import.
3. Tokens embed a `tier` claim (“Free”) and `maxScansPerDay: 333`.
4. The UI surfaces a red toast **7 days** before the token expires.
> **Purpose.** A complete, implementation-ready map of StellaOps: product vision, all runtime components, trust boundaries, tokens/licensing, control/data flows, storage, APIs, security, scale, DevOps, and verification logic.
> **Scope.** This file **replaces** the separate `components.md`; all component details now live here.
---
## 2 · Component Responsibilities (runtime view)
## 0) Product vision & principles
| Component | Core Responsibility | Implementation Highlights |
| -------------------------- | ---------------------------------------------------------------------------------------------------------- | --------------------------------------------------------- |
| **Stella Core** | Orchestrates scans, persists SBOM blobs, serves REST/gRPC APIs, fans out jobs to scanners & policy engine. | .NET{{ dotnet }}, CQRS, Redis Streams; pluggable runner interfaces. |
| **SBOM Builder** | Extracts image layers, queries Core for *missing* layers, generates SBOMs (multi-format), uploads blobs. | Go binary; wraps Trivy & Syft libs. |
| **Stella CLI** | Pipeline-side helper; invokes Builder, triggers scan, streams progress back to CI/CD. | Static musl build. |
| **Zastava Agent** | K8s admission webhook enforcing policy verdicts before Pod creation. | Rust for sub-10 ms latencies. |
| **UI** | Angular 17 SPA for dashboards, settings, policy editor. | Tailwind CSS; Webpack module federation (future). |
| **Redis** | Cache, queue, Trivy DB mirror, layer diffing. | Single instance or Sentinel. |
| **MongoDB** (opt.) | Long-term SBOM & policy audit storage (> 180 days). | Optional; enabled via flag. |
| **StellaOps.Registry** | Anonymous read-only Docker v2 registry with optional Cosign verification. | `registry :2` behind nginx reverse proxy. |
| **StellaOps.MutePolicies** | YAML/Rego evaluator, policy version store, `/policy/*` API. | Embeds OPA-WASM; falls back to `opa exec`. |
| **StellaOpsAttestor** | Generate SLSA provenance & Rekor signatures; verify on demand. | Sidecar container; DSSE + Rekor CLI. |
**Vision.** StellaOps is a **deterministic SBOM + VEX platform** for CI/CD and runtime, tuned for **speed** (per-layer deltas), **quiet output** (usage-scoped views), and **verifiability** (DSSE + Rekor v2). It is **self-hostable**, **air-gap capable**, and **commercially enforceable**: only licensed installations can produce **StellaOps-verified** attestations.
All cross-component calls use dependency-injected interfaces—no
intra-component reach-ins.
**Operating principles.**
* **Scanner-owned SBOMs.** We generate our own BOMs; we do not warehouse third-party SBOM content (we can **link** to attested SBOMs).
* **Deterministic evidence.** Facts come from package DBs, installed metadata, linkers, and verified attestations; no fuzzy guessing in the core.
* **Per-layer caching.** Cache fragments by **layer digest** and compose image SBOMs via **CycloneDX BOM-Link** / **SPDX ExternalRef**.
* **Inventory vs Usage.** Always record the full **inventory** of what exists; separately present **usage** (entrypoint closure + loaded libs).
* **Backend decides.** PASS/FAIL is produced by **Policy** + **VEX** + **Advisories**. The scanner reports facts.
* **Attest or it didn't happen.** Every export is signed as **in-toto/DSSE** and logged in **Rekor v2**.
* **Sovereign-ready.** Cloud is used only for licensing and optional endorsement; everything else is first-party and self-hostable.
---
## 3 · Principal Backend Modules & Plugin Hooks
## 1) Service topology & trust boundaries
| Namespace | Responsibility | Built-in Tech / Default | Plugin Contract |
| --------------- | -------------------------------------------------- | ----------------------- | ------------------------------------------------- |
| `configuration` | Parse env/JSON, health-check endpoint | .NET{{ dotnet }} Options | `IConfigValidator` |
| `identity` | Embedded OAuth2/OIDC (OpenIddict 6) | MIT OpenIddict | `IIdentityProvider` for LDAP/SAML/JWT gateway |
| `pluginloader` | Discover DLLs, SemVer gate, optional Cosign verify | Reflection + Cosign | `IPluginLifecycleHook` for telemetry |
| `scanning` | SBOM & image-flow orchestration; runner pool | Trivy CLI (default) | `IScannerRunner` e.g., Grype, Copacetic, Clair |
| `feedser` (vulnerability ingest/merge/export service) | Nightly NVD merge & feed enrichment | Hangfire job | drop-in `*.Schedule.dll` for OSV, GHSA, NVD 2.0, CNNVD, CNVD, ENISA, JVN and BDU feeds |
| `tls` | TLS provider abstraction | OpenSSL | `ITlsProvider` for custom suites (incl. **SM2**, where law or security requires it) |
| `reporting` | Render HTML/PDF reports | RazorLight | `IReportRenderer` |
| `ui` | Angular SPA & i18n | Angular{{ angular }} | new locales via `/locales/{lang}.json` |
| `scheduling` | Cron + retries | Hangfire | any recurrent job via `*.Schedule.dll` |
### 1.1 Runtime inventory (first-party)
```mermaid
classDiagram
class configuration
class identity
class pluginloader
class scanning
class feedser
class tls
class reporting
class ui
class scheduling
| Service / Tool | Container image | Core role | Scale pattern |
| ------------------------------- | -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------- |
| **Scanner.WebService** | `stellaops/scanner-web` | Control plane for scans; catalog; SBOM composition (inventory & usage); diff; exports. | Stateless; N replicas behind LB. |
| **Scanner.Worker** | `stellaops/scanner-worker` | Runs analyzers (OS, Lang: Java/Node/Python/Go/.NET/Rust, Native ELF/PE/Mach-O, EntryTrace); emits per-layer SBOMs and composes image SBOMs. | Horizontal; queue-driven; sharded by layer digest. |
| **Scanner.Sbomer.BuildXPlugin** | `stellaops/sbom-indexer` | BuildKit **generator** for build-time SBOMs as OCI **referrers**. | CI-side; ephemeral. |
| **Scanner.Sbomer.DockerImage** | `stellaops/scanner-cli` | CLI-orchestrated scanner container for post-build scans. | Local/CI; ephemeral. |
| **Feedser.WebService** | `stellaops/feedser-web` | Vulnerability ingest/normalize/merge/export (JSON + Trivy DB). | HA via Mongo locks. |
| **Vexer.WebService** | `stellaops/vexer-web` | VEX ingest/normalize/consensus; conflict retention; exports. | HA via Mongo locks. |
| **Policy Engine** | (in `scanner-web`) | YAML DSL evaluator (waivers, vendor preferences, KEV/EPSS, license, usage-gating); produces **policy digest**. | In-process; cache per digest. |
| **Signer** | `stellaops/signer` | **Hard gate:** validates entitlement + release integrity; mints signing cert (Fulcio keyless) or uses KMS; signs DSSE. | Stateless; HPA by QPS. |
| **Attestor** | `stellaops/attestor` | Posts DSSE bundles to **Rekor v2**; verification endpoints. | Stateless; HPA by QPS. |
| **Authority** | `stellaops/authority` | On-prem OIDC issuing **short-lived OpToks** with DPoP/mTLS sender constraint. | HA behind LB. |
| **Zastava** (Runtime) | `stellaops/zastava` | Runtime inspector/enforcer (observer + optional Admission Webhook). | DaemonSet + Webhook. |
| **Web UI** | `stellaops/ui` | Angular app for scans, diffs, policy, VEX, runtime, reports. | Stateless. |
| **StellaOps.Cli** | `stellaops/cli` | CLI for init/scan/export/diff/policy/report/verify; Buildx helper. | Local/CI. |
class AllModules
### 1.2 Third-party (self-hosted)
configuration ..> identity : Uses
identity ..> pluginloader : Authenticates Plugins
pluginloader ..> scanning : Loads Scanner Runners
scanning ..> feedser : Triggers Feed Merges
tls ..> AllModules : Provides TLS Abstraction
reporting ..> ui : Renders Reports for UI
scheduling ..> feedser : Schedules Nightly Jobs
* **Fulcio** (Sigstore CA) — issues short-lived signing certs (keyless).
* **Rekor v2** (tile-backed transparency log).
* **MinIO** — S3-compatible object store with lifecycle & Object Lock.
* **MongoDB** — catalog, advisories, VEX.
* **Queue** — Redis Streams / NATS / RabbitMQ (pluggable).
* **OCI Registry** — must support **Referrers API** (discover SBOMs/signatures).
note for scanning "Pluggable: IScannerRunner<br>e.g., Trivy, Grype"
note for feedser "Pluggable: *.Schedule.dll<br>e.g., OSV, GHSA Feeds"
note for identity "Pluggable: IIdentityProvider<br>e.g., LDAP, SAML"
note for reporting "Pluggable: IReportRenderer<br>e.g., Custom PDF"
```
### 1.3 Cloud licensing (StellaOps)
**When remaining = 0:**
API returns `429 Too Many Requests`, `Retry-After: <UTC-midnight>` (sequence omitted for brevity).
* **Licensing Service** (`www.stella-ops.org`) — issues long-lived **License Tokens (LT)**; exchanges LT → **Proof-of-Entitlement (PoE)** bound to an installation key; revoke/introspect PoE; optional cross-log **endorsement**.
---
## 4 · Data Flows
### 4.1 SBOM-First (≤ 5 s P95)
Builder produces SBOM locally, so Core never touches the Docker
socket.
Trivy path hits ≤ 5 s on alpine:3.19 with warmed DB.
Image-unpack fallback stays ≤ 10 s for 200 MB images.
```mermaid
sequenceDiagram
participant CI as CI/CD Pipeline (Stella CLI)
participant SB as SBOM Builder
participant CORE as Stella Core
participant REDIS as Redis Queue
participant RUN as Scanner Runner (e.g., Trivy)
participant POL as Policy Evaluator
CI->>SB: Invoke SBOM Generation
SB->>CORE: Check Missing Layers (/layers/missing)
CORE->>REDIS: Query Layer Diff (SDIFF)
REDIS-->>CORE: Missing Layers List
CORE-->>SB: Return Missing Layers
SB->>SB: Generate Delta SBOM
SB->>CORE: Upload SBOM Blob (POST /scan(sbom))
CORE->>REDIS: Enqueue Scan Job
REDIS->>RUN: Fan Out to Runner
RUN->>RUN: Perform Vulnerability Scan
RUN-->>CORE: Return Scan Results
CORE->>POL: Evaluate Mute Policies
POL-->>CORE: Policy Verdict
CORE-->>CI: JSON Verdict & Progress Stream
Note over CORE,CI: Achieves ≤5s P95 with Warmed DB
```
### 4.2 Delta SBOM
Builder collects layer digests.
`POST /layers/missing` → Redis SDIFF → missing layer list (< 20 ms).
SBOM generated only for those layers and uploaded.
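
The delta lookup above reduces to a set difference. A minimal sketch in Python, assuming an in-memory set stands in for the Redis set of already-analyzed layer digests; the function name `missing_layers` and the sample digests are illustrative and not part of the StellaOps API:

```python
from typing import Iterable, List, Set


def missing_layers(image_layers: Iterable[str], known_layers: Set[str]) -> List[str]:
    """Return the image's layer digests that have no cached SBOM yet.

    Same result as a set difference (image set minus known set), but the
    image's layer order is kept and duplicate digests are dropped.
    """
    seen: Set[str] = set()
    missing: List[str] = []
    for digest in image_layers:
        if digest not in known_layers and digest not in seen:
            seen.add(digest)
            missing.append(digest)
    return missing


# Hypothetical digests; a real deployment would keep the known set in Redis.
known = {"sha256:aaa", "sha256:bbb"}
image = ["sha256:aaa", "sha256:ccc", "sha256:bbb", "sha256:ddd"]
print(missing_layers(image, known))
```

Only the digests returned here would need a fresh SBOM fragment; everything else is served from the cache.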
### 4.3 Feedser Harvest & Export
```mermaid
sequenceDiagram
participant SCHED as Feedser Scheduler
participant CONN as Source Connector Plug-in
participant FEEDSER as Feedser Core
participant MONGO as MongoDB (Canonical Advisories)
participant EXPORT as Exporter (JSON / Trivy DB)
participant ART as Artifact Store / Offline Kit
SCHED->>CONN: Trigger window (init/resume)
CONN->>CONN: Fetch source documents + metadata
CONN->>FEEDSER: Submit raw document for parsing
FEEDSER->>FEEDSER: Parse & normalize to DTO
FEEDSER->>FEEDSER: Merge & deduplicate canonical advisory
FEEDSER->>MONGO: Write advisory, provenance, merge_event
FEEDSER->>EXPORT: Queue export delta request
EXPORT->>MONGO: Read canonical snapshot/deltas
EXPORT->>EXPORT: Build deterministic JSON & Trivy DB artifacts
EXPORT->>ART: Publish artifacts / Offline Kit bundle
ART-->>FEEDSER: Record export state + digests
```
### 4.4 Identity & Auth Flow
OpenIddict issues JWTs via client-credentials or password grant.
An IIdentityProvider plugin can delegate to LDAP, SAML or external OIDC
without Core changes.
---
## 5 · Runtime Helpers
| Helper | Form | Purpose | Extensible Bits |
|-----------|---------------------------------------|--------------------------------------------------------------------|-------------------------------------------|
| **Stella CLI** | Distroless CLI | Generates SBOM, calls `/scan`, honours threshold flag | `--engine`, `--pdf-out` piped to plugins |
| **Zastava** | Static Go binary / DaemonSet | Watches Docker/CRI-O events; uploads SBOMs; can enforce gate | Policy plugin could alter thresholds |
---
## 6 · Persistence & Cache Strategy
| Store | Primary Use | Why chosen |
|----------------|-----------------------------------------------|--------------------------------|
| **MongoDB** | Feedser canonical advisories, merge events, export state | Deterministic canonical store with flexible schema |
| **Redis 7** | CLI quotas, short-lived job scheduling, layer diff cache | Sub-1ms P99 latency for hot-path coordination |
| **Local tmpfs**| Trivy layer cache (`/var/cache/trivy`) | Keeps disk I/O off hot path |
### 1.4 Diagram (control/data planes & trust)
```mermaid
flowchart LR
subgraph "Persistence Layers"
REDIS[(Redis: Quotas & Short-lived Queues<br>Sub-1ms P99)]
MONGO[(MongoDB: Canonical Advisories<br>Merge Events & Export State)]
TMPFS[(Local tmpfs: Trivy Layer Cache<br>Low I/O Overhead)]
end
subgraph Cloud["www.stella-ops.org (Cloud)"]
LS[Licensing Service<br/>LT→PoE / revoke / introspect]
end
CORE["Stella Core"] -- Queues & SBOM Cache --> REDIS
CORE -- Long-term Storage --> MONGO
TRIVY["Trivy Scanner"] -- Layer Unpack Cache --> TMPFS
subgraph OnPrem["Customer Site (Self-hosted)"]
Auth["Authority (OIDC)<br/>OpTok (DPoP/mTLS)"]
SW[Scanner.WebService]
WK[Scanner.Worker xN]
FEED[Feedser]
VEX[Vexer]
POL["Policy Engine (in Scanner.Web)"]
SGN["Signer<br/>(entitlement + signing)"]
ATT["Attestor<br/>(Rekor v2 submit/verify)"]
UI["Web UI (Angular)"]
Z["Zastava<br/>(Runtime Inspector/Enforcer)"]
MIN[(MinIO S3)]
MGO[(MongoDB)]
QUE[(Queue/Streams)]
end
style REDIS fill:#fdd,stroke:#333
style MONGO fill:#dfd,stroke:#333
style TMPFS fill:#ffd,stroke:#333
CLI[StellaOps.Cli / Buildx Plugin]
REG[(OCI Registry with Referrers)]
FUL[Fulcio]
REK["Rekor v2 (tiles)"]
CLI -->|scan/build| SW
SW -->|jobs| QUE
QUE --> WK
WK --> MIN
SW --> MGO
FEED --> MGO
VEX --> MGO
UI --> SW
Z --> SW
SGN <--> Auth
SGN --> FUL
SGN -->|mTLS| ATT
ATT --> REK
SGN <-->|verify referrers| REG
```
**Trust boundaries.** Only **Signer** can sign; only **Attestor** can write to **Rekor v2**. Scanner/UI never sign.
---
## 2) Licensing & tokens (installation-ready, theft-resistant)
**Two-token model.**
* **License Token (LT)** — long-lived JWT from **Licensing Service**; used **once** to enroll the installation; never used in hot path.
* **Proof-of-Entitlement (PoE)** — bound to the installation key (mTLS client cert **or** DPoP-bound JWT with `cnf`); medium-lived; renewable; revocable.
* **Operational token (OpTok)** — 25min OIDC token from **Authority**, **sender-constrained** (DPoP or mTLS). Used to authenticate to **Signer**/**Scanner.WebService**.
**Signer enforces both:** PoE proves entitlement; OpTok proves “who is calling now”. It also **independently verifies** that the **scanner image digest** is **StellaOps-signed** via **Referrers + cosign** before signing anything.
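
The Signer's gate can be read as a short chain of checks that must all pass. The sketch below is illustrative only: it assumes each check has already been reduced to a boolean upstream, and the names `SignRequest` and `signer_decision` are hypothetical rather than taken from the Signer codebase. The `403` on denial follows the signing sequence later in this document.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SignRequest:
    optok_valid: bool           # OpTok signature and expiry check out
    sender_bound: bool          # DPoP proof or mTLS cert matches the OpTok binding
    poe_active: bool            # PoE is present and not expired or revoked
    scanner_image_signed: bool  # Referrers + cosign check on the scanner image passed


def signer_decision(req: SignRequest) -> Tuple[int, str]:
    """Return (HTTP status, reason); sign only when every gate passes."""
    checks = [
        (req.optok_valid, "OpTok invalid or expired"),
        (req.sender_bound, "sender constraint (DPoP/mTLS) not satisfied"),
        (req.poe_active, "PoE missing, expired or revoked"),
        (req.scanner_image_signed, "scanner image not StellaOps-signed"),
    ]
    for passed, reason in checks:
        if not passed:
            return 403, reason
    return 200, "ok"


print(signer_decision(SignRequest(True, True, True, True)))
print(signer_decision(SignRequest(True, True, False, True)))
```

Evaluating the checks in a fixed order keeps the denial reason deterministic, which helps when the outcome is written to an audit log.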
**Enrollment sequence (LT → PoE).**
```plantuml
@startuml
actor Operator
participant "Install Agent" as IA
participant "Licensing Service" as LS
Operator -> IA: Provide LT
IA -> IA: Generate K_inst
IA -> LS: /license/enroll {LT, pub(K_inst)}
LS --> IA: PoE (mTLS client cert or JWT with cnf=K_inst), CRL/OCSP/introspect
@enduml
```
---
## 7 · Typical Scenarios
## 3) Scanner subsystem (facts engine)
| # | Flow | Steps |
|---------|----------------------------|-------------------------------------------------------------------------------------------------|
| **S1** | Pipeline Scan & Alert | Stella CLI → SBOM → `/scan` → policy verdict → CI exit code & link to *Scan Detail* |
| **S2** | Mute Noisy CVE | Dev toggles **Mute** in UI → rule stored in Redis → next build passes |
| **S3** | Nightly Re-scan | `SbomNightly.Schedule` re-queues SBOMs (mask-filter) → dashboard highlights new Criticals |
| **S4** | Feed Update Cycle | `Feedser (vulnerability ingest/merge/export service)` refreshes feeds → UI *Feed Age* tile turns green |
| **S5** | Custom Report Generation | Plugin registers `IReportRenderer` → `/report/custom/{digest}` → CI downloads artifact |
### 3.1 Analyzers (deterministic only)
* **OS packages:** apk/dpkg/rpm (Linux); Windows MSI/SxS/GAC (M2).
* **Language (installed state):**
* Java (pom.properties / MANIFEST) → `pkg:maven/...`
* Node (`node_modules/*/package.json`) → `pkg:npm/...`
* Python (`*.dist-info/METADATA`) → `pkg:pypi/...`
* Go (buildinfo) → `pkg:golang/...`
* .NET (`*.deps.json`) → `pkg:nuget/...`
* **Rust:** deterministic **language markers** (symbol mangling) and crates only when present; otherwise `bin:{sha256}`.
* **Native:** ELF/PE/Mach-O imports, DT_NEEDED, RPATH/RUNPATH, symbol versions, PE version info.
* **EntryTrace:** parse `ENTRYPOINT`/`CMD`; shell AST; resolve launchers (Java/Node/Python) to terminal program; record file:line chain.
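
As one concrete instance of the installed-state analyzers listed above, the Python mapping can be sketched with the standard library alone. This is an illustration, not the Scanner.Worker implementation: the function name `pypi_purl` and the sample metadata are assumptions. The name normalization (lowercase, `_` replaced by `-`) follows the Package URL rules for the `pypi` type; percent-encoding of unusual version strings is left out.

```python
from email import message_from_string


def pypi_purl(metadata_text: str) -> str:
    """Build a pkg:pypi purl from the text of a *.dist-info/METADATA file."""
    # METADATA uses RFC 822-style headers, so the email parser can read it.
    msg = message_from_string(metadata_text)
    name, version = msg["Name"], msg["Version"]
    if not name or not version:
        raise ValueError("METADATA lacks a Name or Version header")
    normalized = name.strip().lower().replace("_", "-")
    return f"pkg:pypi/{normalized}@{version.strip()}"


# Hypothetical METADATA content for illustration.
sample = (
    "Metadata-Version: 2.1\n"
    "Name: Typing_Extensions\n"
    "Version: 4.12.2\n"
    "\n"
    "Long description goes here.\n"
)
print(pypi_purl(sample))
```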
### 3.2 Caching & composition
* **Layer cache:** `{layerDigest → SBOM fragment + analyzer meta}`.
* **File CAS:** `{sha256(file) → parse result (ELF/JAR metadata/etc.)}`.
* **Composition:** build **image SBOMs** from fragments via **BOM-Link/ExternalRef**; emit **two views**:
* **Inventory** (complete filesystem inventory).
* **Usage** (entrypoint closure + linked libs).
* **Transport:** JSON **and** **CycloneDX Protobuf** (compact, fast to parse).
* **Index:** BOM-Index sidecar with purl table + roaring bitmap + `usedByEntrypoint` flag for fast joins.
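
The layer cache and composition step can be sketched as follows. This is a simplified model under stated assumptions: fragments are plain lists of purls, the cache is a dict keyed by layer digest, and file removals in upper layers (whiteouts) are ignored. `compose_inventory` and `fake_analyze` are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List


def compose_inventory(
    layer_digests: Iterable[str],
    cache: Dict[str, List[str]],
    analyze: Callable[[str], List[str]],
) -> Dict[str, str]:
    """Map each purl to the first layer digest that introduced it."""
    inventory: Dict[str, str] = {}
    for digest in layer_digests:
        if digest not in cache:  # cache miss: run the analyzers once per layer
            cache[digest] = analyze(digest)
        for purl in cache[digest]:
            inventory.setdefault(purl, digest)
    return inventory


calls: List[str] = []


def fake_analyze(digest: str) -> List[str]:
    """Stand-in analyzer that records which layers were actually scanned."""
    calls.append(digest)
    fragments = {
        "sha256:base": ["pkg:apk/alpine/musl@1.2.4"],
        "sha256:app": ["pkg:npm/left-pad@1.3.0"],
    }
    return fragments[digest]


cache: Dict[str, List[str]] = {}
first = compose_inventory(["sha256:base", "sha256:app"], cache, fake_analyze)
second = compose_inventory(["sha256:base"], cache, fake_analyze)
print(first)
print(calls)
```

The second call reuses the cached fragment for `sha256:base`, so the analyzer runs only twice in total; that reuse is the point of keying the cache by layer digest.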
### 3.3 Diff (image → layer → package)
* Added / Removed / Version-changed changes, **attributed** to the layer that caused them.
* Raw diffs preserved; backend view applies **VEX + Policy**.
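
A raw diff of this kind can be sketched over two component maps. The data model here is an assumption made for illustration: each package key maps to a `(version, layer digest)` pair, added and changed entries are attributed to the layer in the new image, and removed entries report the layer that last held the package in the old image. The real attribution logic may differ.

```python
from typing import Dict, List, NamedTuple, Tuple

Component = Tuple[str, str]  # (version, digest of the layer holding the package)


class Change(NamedTuple):
    kind: str         # "added", "removed" or "changed"
    package: str
    layer: str
    old_version: str
    new_version: str


def diff_components(old: Dict[str, Component], new: Dict[str, Component]) -> List[Change]:
    """Compare two images' component maps and attribute each change to a layer."""
    changes: List[Change] = []
    for pkg in sorted(old.keys() | new.keys()):
        before, after = old.get(pkg), new.get(pkg)
        if before is None:
            changes.append(Change("added", pkg, after[1], "", after[0]))
        elif after is None:
            changes.append(Change("removed", pkg, before[1], before[0], ""))
        elif before[0] != after[0]:
            changes.append(Change("changed", pkg, after[1], before[0], after[0]))
    return changes


# Hypothetical component maps for two image digests.
old_image = {
    "pkg:apk/alpine/openssl": ("3.1.4-r0", "sha256:l1"),
    "pkg:npm/lodash": ("4.17.20", "sha256:l2"),
}
new_image = {
    "pkg:apk/alpine/openssl": ("3.1.5-r0", "sha256:l1b"),
    "pkg:npm/express": ("4.19.2", "sha256:l3"),
}
for change in diff_components(old_image, new_image):
    print(change)
```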
### 3.4 Build-time SBOMs (fast CI path)
* Buildx **generator** runs analyzers during `docker buildx build --attest=type=sbom,generator=stellaops/sbom-indexer`, attaches SBOMs as **OCI referrers**.
* Scanner.WebService can trust these (policy-configurable) and **skip** rescan; DSSE + Rekor v2 can be done either at build time or post-push via Signer/Attestor.
---
## 4) Backend evaluation (decider)
### 4.1 Feedser (advisories)
* Ingests vendor, distro, OSS feeds; normalizes & merges; persists canonical advisories in Mongo; exports **deterministic JSON** and **Trivy DB**.
* Offline kit bundles for air-gapped sites.
### 4.2 Vexer (VEX)
* Ingests **OpenVEX / CSAF VEX / CycloneDX VEX**; normalizes claims; retains conflicts; computes **consensus** with provider trust weights and justification gates.
### 4.3 Policy Engine (YAML DSL)
* Matchers: `image/repo/env/purl/cve/vendor/source/path/layerDigest/usedByEntrypoint`
* Actions: `ignore(until, justification)`, `fail`, `warn`, `defer`, `requireVEX{vendors, justifications}`, `escalate {sev, KEV, EPSS}`, license constraints.
* Produces a **policy digest** (SHA-256 of canonicalized policy).
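
A policy digest of this kind might be computed as below. The canonicalization rule is an assumption (the document does not specify one): the policy is serialized as compact JSON with sorted keys, so that key order and whitespace do not affect the hash. The function name `policy_digest` and the sample policy are illustrative.

```python
import hashlib
import json


def policy_digest(policy: dict) -> str:
    """Hash a policy after canonicalizing it (assumed: sorted-key compact JSON)."""
    canonical = json.dumps(
        policy, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two logically identical policies that differ only in key order.
policy_a = {"rules": [{"match": {"cve": "CVE-0000-0001"}, "action": "warn"}], "version": 1}
policy_b = {"version": 1, "rules": [{"action": "warn", "match": {"cve": "CVE-0000-0001"}}]}

print(policy_digest(policy_a) == policy_digest(policy_b))
```

Because the digest depends only on the canonical form, it can be embedded in a report to show which policy revision produced a verdict.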
### 4.4 PASS/FAIL flow
1. SBOM (Inventory / Usage) → join with **Feedser** advisories.
2. Apply **Vexer** consensus (statuses & justifications).
3. Apply **Policy**; compute PASS/FAIL with waiver TTLs.
4. Sign the **final report** (DSSE via **Signer**) and log to **Rekor v2** via **Attestor**.
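
Steps 1 to 3 can be sketched as a single filter over findings. Everything here is a simplified assumption: findings are `(purl, cve, severity)` tuples, VEX consensus is a lookup keyed by `(purl, cve)` using OpenVEX status names, waivers map a CVE to an expiry date, and the policy fails on `critical` and `high`. Signing and logging (step 4) are out of scope for the sketch.

```python
from datetime import date
from typing import Dict, FrozenSet, List, Tuple

Finding = Tuple[str, str, str]  # (purl, cve, severity)


def evaluate(
    findings: List[Finding],
    vex: Dict[Tuple[str, str], str],
    waivers: Dict[str, date],
    today: date,
    fail_on: FrozenSet[str] = frozenset({"critical", "high"}),
) -> Tuple[str, List[Tuple[str, str]]]:
    """Return ("PASS" | "FAIL", blocking findings) after VEX and waivers."""
    blocking: List[Tuple[str, str]] = []
    for purl, cve, severity in findings:
        if vex.get((purl, cve)) in {"not_affected", "fixed"}:
            continue  # VEX consensus says this finding does not apply
        expiry = waivers.get(cve)
        if expiry is not None and today <= expiry:
            continue  # waiver still within its TTL
        if severity in fail_on:
            blocking.append((purl, cve))
    return ("FAIL" if blocking else "PASS", blocking)


# Placeholder identifiers, not real advisories.
findings = [
    ("pkg:npm/a@1.0.0", "CVE-0000-0001", "high"),
    ("pkg:npm/b@2.0.0", "CVE-0000-0002", "critical"),
    ("pkg:npm/c@3.0.0", "CVE-0000-0003", "low"),
]
vex = {("pkg:npm/a@1.0.0", "CVE-0000-0001"): "not_affected"}
waivers = {"CVE-0000-0002": date(2025, 12, 31)}

print(evaluate(findings, vex, waivers, today=date(2025, 10, 17)))
print(evaluate(findings, vex, waivers, today=date(2026, 1, 1)))
```

The second call shows the effect of a waiver TTL: once the waiver lapses, the same findings flip the verdict to FAIL.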
---
## 5) Runtime enforcement (Zastava)
* **Observer:** inventories running containers, checks image signatures, SBOM presence (referrers), detects drift (entrypoint chain divergence), flags unapproved images.
* **Admission Webhook (optional):** blocks policy-fail pods (dry-run first).
* **Integration:** posts runtime events to Scanner.WebService; can request **delta scans** on changed layers.
---
## 6) Storage & catalogs (MinIO/Mongo)
**MinIO layout**
```
s3://stellaops/
layers/<sha256>/sbom.cdx.json.zst
layers/<sha256>/sbom.spdx.json.zst
images/<imgDigest>/inventory.cdx.pb
images/<imgDigest>/usage.cdx.pb
indexes/<imgDigest>/bom-index.bin
attest/<artifactSha256>.dsse.json
```
**Catalog (Mongo)**
* `artifacts` (type/format/sha/size/rekor/ttl/immutable/refCount/createdAt)
* `images`, `layers`, `links`, `lifecycleRules`
**Retention**
* MinIO **ILM** for coarse TTL; Scanner.WebService GC decrements `refCount` and deletes unreferenced metadata; **Object Lock** for immutable classes (auditable artifacts).
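
The reference-counted part of retention can be sketched with an in-memory catalog. The field names `refCount` and `immutable` come from the `artifacts` collection above; the function `release_reference` and the dict-based catalog are illustrative stand-ins for Mongo and the real GC.

```python
from typing import Dict


def release_reference(catalog: Dict[str, dict], artifact_id: str) -> bool:
    """Drop one reference; delete the catalog entry when nothing refers to it.

    Immutable artifacts (Object Lock classes) are never deleted here.
    Returns True when the entry was removed.
    """
    entry = catalog[artifact_id]
    entry["refCount"] = max(0, entry["refCount"] - 1)
    if entry["refCount"] == 0 and not entry["immutable"]:
        del catalog[artifact_id]
        return True
    return False


# Hypothetical catalog entries.
catalog = {
    "layer-sbom-1": {"refCount": 2, "immutable": False},
    "report-1": {"refCount": 1, "immutable": True},
}
print(release_reference(catalog, "layer-sbom-1"))
print(release_reference(catalog, "layer-sbom-1"))
print(release_reference(catalog, "report-1"))
print(sorted(catalog))
```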
---
## 7) APIs (consolidated surface)
### 7.1 Scanner.WebService
```
POST /api/scans { imageRef|digest, force? } → { scanId }
GET /api/scans/{id} → { status, digests, artifacts[] }
GET /api/sboms/{imageDigest} ?format=cdx-json|cdx-pb|spdx-json&view=inventory|usage
GET /api/diff?old=<digest>&new=<digest> → { added[], removed[], changed[], byLayer[] }
POST /api/exports { imageDigest, format, view } → { artifactId, rekorUrl }
POST /api/reports { imageDigest, policyRevision? } → { reportId, rekorUrl }
GET /api/catalog/artifacts/{id} → { size, ttl, immutable, rekor, refs }
GET /healthz | /readyz | /metrics
```
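
A client would typically validate the `format` and `view` values before calling the SBOM endpoint. The helper below only builds the URL and makes no network call; the host name is a placeholder and `sbom_url` is a hypothetical helper, not part of StellaOps.Cli.

```python
from urllib.parse import quote, urlencode

FORMATS = {"cdx-json", "cdx-pb", "spdx-json"}
VIEWS = {"inventory", "usage"}


def sbom_url(base: str, image_digest: str, fmt: str = "cdx-json", view: str = "inventory") -> str:
    """Build the GET /api/sboms/{imageDigest} URL with validated query values."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    if view not in VIEWS:
        raise ValueError(f"unsupported view: {view}")
    path = f"/api/sboms/{quote(image_digest, safe=':')}"
    query = urlencode({"format": fmt, "view": view})
    return f"{base.rstrip('/')}{path}?{query}"


# Placeholder host; substitute the address of your Scanner.WebService.
print(sbom_url("https://scanner.example.internal/", "sha256:abc", "cdx-pb", "usage"))
```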
### 7.2 Signer (mTLS; hard gate)
```
POST /sign/dsse # body: {subjectHash, imageDigest, predicate}; headers: OpTok (DPoP/mTLS) + PoE
GET /verify/referrers?imageDigest=sha256:... # is this image StellaOps-signed?
```
### 7.3 Attestor (mTLS)
```
POST /rekor/entries # DSSE bundle → {uuid, index, proof, logURL}
GET /rekor/entries/{uuid}
```
### 7.4 Authority (OIDC)
* `/.well-known/openid-configuration`, `/oauth/token` (DPoP/mTLS), `/oauth/introspect`, `/jwks`
### 7.5 Licensing (cloud)
```
POST /license/enroll { LT, pubKey } → PoE + introspection endpoints
POST /license/revoke { license_id } → ok
POST /license/introspect { poe } → { active, claims, exp }
POST /attest/endorse { bundle } → endorsement bundle (optional)
```
---
## 8) Security & verifiability
* **Sender-constrained tokens.** All operational calls use **DPoP** (RFC 9449) or **mTLS-bound** tokens (RFC 8705).
* **Entitlement.** **PoE** is mandatory; revocation honored online.
* **Release integrity.** **Signer** independently verifies **scanner image digest** via **Referrers + cosign** before signing.
* **Separation of duties.** Scanner/UI cannot sign; only **Signer** can sign; only **Attestor** can write to **Rekor v2**.
* **Verifiers.** Anyone can verify: DSSE signature → certificate chain to **StellaOps Fulcio/KMS root** → **Rekor v2** inclusion.
* **Community vs Authorized.** Free/community runs throttled with no official attestations; authorized runs full speed and produce **StellaOps-verified** bundles.
**DSSE predicate (SBOM/report)**
```json
{
"predicateType": "https://stella-ops.org/attestations/sbom/1",
"subject": [{ "name": "s3://stellaops/images/<digest>/inventory.cdx.pb", "digest": { "sha256": "<sha256>" } }],
"predicate": {
"image_digest": "<sha256:...>",
"stellaops_version": "2.3.1 (2027.04)",
"license_id": "LIC-9F2A...",
"customer_id": "CUST-ACME",
"plan": "pro",
"policy_digest": "sha256:...",
"views": ["inventory","usage"],
"created": "2025-10-17T12:34:56Z"
}
}
```
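
In DSSE, the signature does not cover the raw predicate JSON. It covers the Pre-Authentication Encoding (PAE) of the payload type and payload bytes, as defined by the DSSE specification. A minimal sketch of that encoding follows; the in-toto payload type shown in the usage lines is an assumption about how the statement would be wrapped.

```python
def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE Pre-Authentication Encoding: the exact bytes that get signed.

    PAE = "DSSEv1" SP LEN(type) SP type SP LEN(payload) SP payload,
    where LEN is the byte length written in ASCII decimal.
    """
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %s %d %s" % (
        len(type_bytes),
        type_bytes,
        len(payload),
        payload,
    )


# Worked example taken from the DSSE specification.
print(pae("http://example.com/HelloWorld", b"hello world"))

# Assumed payload type for an in-toto statement carrying the predicate above.
statement = b'{"predicateType":"https://stella-ops.org/attestations/sbom/1"}'
to_sign = pae("application/vnd.in-toto+json", statement)
print(len(to_sign))
```

Binding the payload type into the signed bytes prevents a signature over one document type from being replayed as another.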
**BOM-Index sidecar**
Binary header + purl table + roaring bitmaps; optional `usedByEntrypoint` flags for fast policy joins.
---
## 9) Scale, performance & quotas
* **Workers:** horizontal; **distributed lock per layer digest**; global CAS in MinIO.
* **Queues:** Redis Streams / NATS / RabbitMQ. HPA by queue depth, CPU, memory.
* **Registry throttling:** per-registry concurrency budgets.
* **Targets:**
* Build-time path P95 ≤35s on warmed bases.
* Post-build delta scan P95 ≤ 10 s for 200 MB images.
* Policy + VEX evaluation ≤ 500 ms for 5k components using BOM-Index.
* **Quotas:** license plan enforces QPS/concurrency/size; **Signer** throttles and can deny DSSE.
---
## 10) DevOps & distribution
* **Releases:** all first-party images **cosign-signed**; labels embed `org.stellaops.version` and `org.stellaops.release_date`.
* **Channels:**
* **Community** (public registry): throttled, non-attesting.
* **Authorized** (private registry): full speed, DSSE enabled.
* **Client update flow:** containers self-verify signatures at boot; report version; **Signer** enforces `valid_release_year` / `max_version` from PoE before signing.
* **Compose skeleton:**
```yaml
services:
  authority: { image: stellaops/authority }
  fulcio: { image: sigstore/fulcio }
  rekor: { image: sigstore/rekor-v2 }
  minio: { image: minio/minio, command: 'server /data --console-address ":9001"' }
  mongo: { image: "mongo:7" }
  signer: { image: stellaops/signer, depends_on: [authority, fulcio] }
  attestor: { image: stellaops/attestor, depends_on: [rekor, signer] }
  scanner-web: { image: stellaops/scanner-web, depends_on: [mongo, minio, signer, attestor] }
  scanner-worker:
    image: stellaops/scanner-worker
    deploy: { replicas: 4 }
    depends_on: [scanner-web]
  feedser: { image: stellaops/feedser-web, depends_on: [mongo] }
  vexer: { image: stellaops/vexer-web, depends_on: [mongo] }
  ui: { image: stellaops/ui, depends_on: [scanner-web, feedser, vexer] }
```
* **Backups:** Mongo dumps; MinIO versioned buckets & replication; Rekor v2 DB snapshots; JWKS/Fulcio/KMS key rotation.
---
## 11) Observability & audit
* **Metrics:** scan latency, layer cache hit %, artifact bytes, DSSE/Rekor latency, policy evaluation time, queue depth, admission decisions (Zastava).
* **Tracing:** per-stage spans; correlation IDs across Scanner→Signer→Attestor.
* **Audit logs:** every signing records `license_id`, `image_digest`, `policy_digest`, and Rekor UUID.
* **Compliance:** MinIO **Object Lock** for immutable artifacts; reproducible outputs via policy digest + SBOM digest in predicate.
---
## 12) Roadmap (anchored to this architecture)
* M2: Windows MSI/SxS/GAC analyzers; deeper Rust (DWARF enrichers).
* M2: Buildx generator certified flows; cross-registry trust policies.
* M3: Patch-Presence plugin (signature-based backport detection), opt-in.
* M3: Zastava Admission control GA with policy presets and dry-run→enforce stages.
* Continuous: Policy UX (waiver TTLs, vendor rules), Vexer connectors expansion.
---
## 13) Canonical sequences (verification & signing)
**Sign & log (OpTok + PoE, image verify, DSSE, Rekor).**
```mermaid
sequenceDiagram
participant DEV as Developer
participant UI as Web UI
participant CORE as Stella Core
participant REDIS as Redis
participant RUN as Scanner Runner
autonumber
participant Scan as Scanner.WebService
participant Auth as Authority (OIDC)
participant Sign as Signer
participant Reg as OCI Registry
participant Ful as Fulcio/KMS
participant Att as Attestor
participant Rek as Rekor v2
DEV->>UI: Toggle Mute for CVE
UI->>CORE: Update Mute Rule (POST /policy/mute)
CORE->>REDIS: Store Mute Policy
Note over CORE,REDIS: YAML/Rego Evaluator Updates
alt Next Pipeline Build
CI->>CORE: Trigger Scan (POST /scan)
CORE->>RUN: Enqueue & Scan
RUN-->>CORE: Raw Findings
CORE->>REDIS: Apply Mute Policies
REDIS-->>CORE: Filtered Verdict (Passes)
CORE-->>CI: Success Exit Code
end
Scan->>Auth: Get OpTok (DPoP/mTLS)
Scan->>Sign: sign(request) + OpTok + PoE + DPoP proof
Sign->>Auth: Validate OpTok & sender-constraint
Sign->>Sign: Validate PoE (introspect/revocation)
Sign->>Reg: Verify scanner image is StellaOps-signed (Referrers + cosign)
alt OK
Sign->>Ful: Get signing cert (keyless) or use KMS key
Sign-->>Scan: DSSE bundle (cert chain)
Scan->>Att: Submit bundle
Att-->>Rek: Create entry
Rek-->>Att: {uuid,index,proof}
Att-->>Scan: Rekor URL
else Deny
Sign-->>Scan: 403 (no attestation)
end
```
```mermaid
sequenceDiagram
  participant CRON as SbomNightly.Schedule
  participant CORE as Stella Core
  participant REDIS as Redis Queue
  participant RUN as Scanner Runner
  participant UI as Dashboard
  CRON->>CORE: Re-queue SBOMs (Mask-Filter)
  CORE->>REDIS: Enqueue Filtered Jobs
  REDIS->>RUN: Fan Out to Runners
  RUN-->>CORE: New Scan Results
  CORE->>UI: Highlight New Criticals
  Note over CORE,UI: Focus on Changes Since Last Scan
```
**Verification (third party).**
```plantuml
@startuml
actor Verifier
participant "stellaops verify" as Tool
database "Fulcio/KMS root" as Root
participant "Rekor v2" as R2
Verifier -> Tool: bundle (URL/file)
Tool -> Tool: Verify DSSE signature
Tool -> Root: Verify cert chain to StellaOps root
Tool -> R2: Verify inclusion proof / query by UUID
Tool -> Verifier: OK + claims (license_id, policy_digest, version)
@enduml
```
---
## 8 · UI Fast Facts
* **Stack:** Angular 17 + Vite dev server; Tailwind CSS.
* **State:** Signals + RxJS for live scan progress.
* **i18n / l10n:** JSON bundles served from `/locales/{lang}.json`.
* **Module Structure:** Lazy-loaded feature modules (`dashboard`, `scans`, `settings`); runtime route injection by UI plugins (roadmap Q2 2026).
---
## 9 · Cross-Cutting Concerns
* **Security:** containers run non-root, `CAP_DROP:ALL`, read-only FS, hardened seccomp profiles.
* **Observability:** Serilog JSON, OpenTelemetry OTLP exporter, Prometheus `/metrics`.
* **Upgrade Policy:** `/api/v1` endpoints & CLI flags stable across a minor; breaking changes bump major.
---
## 10 · Performance & Scalability
| Scenario         | P95 target | Bottleneck   | Mitigation                                      |
|------------------|-----------:|--------------|-------------------------------------------------|
| SBOM-first       | 5 s        | Redis queue  | More CPU, increase `ScannerPool.Workers`        |
| Image-unpack     | 10 s       | Layer unpack | Prefer SBOM path, warm Docker cache             |
| High concurrency | 40 rps     | Runner CPU   | Scale Core replicas + sidecar scanner services  |
---
## 11 · Future Architectural Anchors
* **ScanService micro-split (gRPC):** isolate heavy runners for large clusters.
* **UI route plugins:** dynamic Angular module loader (roadmap Q2 2026).
* **Redis Cluster:** transparently sharded cache once sustained > 100 rps.
---
## 12 · Assumptions & Trade-offs
Requires Docker/CRI-O runtime; .NET 9 available on hosts; Windows containers are out of scope this cycle.
Embedded auth simplifies deployment but may need plugins for enterprise IdPs.
Speed is prioritised over exhaustive feature parity with heavyweight commercial scanners.
---
## 13 · References & Further Reading
* **C4 Model** <https://c4model.com>
* **.NET Architecture Guides** <https://learn.microsoft.com/dotnet/architecture>
* **OSS Examples** Kubernetes Architecture docs, Prometheus design papers, Backstage.
*(End of High-Level Architecture v2.2)*
**End of `high_level_architecture.md` (Consolidated).**

View File

@@ -1,208 +0,0 @@
# 8 · Detailed Module Specifications — **StellaOps Feedser**
_This document describes the Feedser service, its supporting libraries, connectors, exporters, and test assets that live in the OSS repository._
---
## 0 Scope
Feedser is the vulnerability ingest/merge/export subsystem of StellaOps. It
fetches primary advisories, normalizes and deduplicates them into MongoDB, and
produces deterministic JSON and Trivy DB exports. This document lists the
projects that make up that workflow, the extension points they expose, and the
artefacts they ship.
---
## 1 Repository layout (current)
```text
src/
├─ Directory.Build.props / Directory.Build.targets
├─ StellaOps.Plugin/
├─ StellaOps.Feedser.Core/
├─ StellaOps.Feedser.Core.Tests/
├─ StellaOps.Feedser.Models/ (+ .Tests/)
├─ StellaOps.Feedser.Normalization/ (+ .Tests/)
├─ StellaOps.Feedser.Merge/ (+ .Tests/)
├─ StellaOps.Feedser.Storage.Mongo/ (+ .Tests/)
├─ StellaOps.Feedser.Exporter.Json/ (+ .Tests/)
├─ StellaOps.Feedser.Exporter.TrivyDb/ (+ .Tests/)
├─ StellaOps.Feedser.Source.* / StellaOps.Feedser.Source.*.Tests/
├─ StellaOps.Feedser.Testing/
├─ StellaOps.Feedser.Tests.Shared/
├─ StellaOps.Feedser.WebService/ (+ .Tests/)
├─ PluginBinaries/
└─ StellaOps.Feedser.sln
```
Each folder is a .NET project (or set of projects) referenced by
`StellaOps.Feedser.sln`. Build assets are shared through the root
`Directory.Build.props/targets` so conventions stay consistent.
---
## 2 Shared libraries
| Project | Purpose | Key extension points |
|---------|---------|----------------------|
| `StellaOps.Plugin` | Base contracts for connectors, exporters, and DI routines plus Cosign validation helpers. | `IFeedConnector`, `IExporterPlugin`, `IDependencyInjectionRoutine` |
| `StellaOps.DependencyInjection` | Composable service registrations for Feedser and plug-ins. | `IDependencyInjectionRoutine` discovery |
| `StellaOps.Feedser.Testing` | Common fixtures, builders, and harnesses for integration/unit tests. | `FeedserMongoFixture`, test builders |
| `StellaOps.Feedser.Tests.Shared` | Shared assembly metadata and fixtures wired in via `Directory.Build.props`. | Test assembly references |
---
## 3 Core projects
| Project | Responsibility | Extensibility |
|---------|----------------|---------------|
| `StellaOps.Feedser.WebService` | ASP.NET Core minimal API hosting Feedser jobs, status endpoints, and scheduler. | DI-based plug-in discovery; configuration binding |
| `StellaOps.Feedser.Core` | Job orchestration, connector pipelines, merge workflows, export coordination. | `IFeedConnector`, `IExportJob`, deterministic merge policies |
| `StellaOps.Feedser.Models` | Canonical advisory DTOs and enums persisted in MongoDB and exported artefacts. | Partial classes for source-specific metadata |
| `StellaOps.Feedser.Normalization` | Version comparison, CVSS normalization, text utilities for canonicalization. | Helpers consumed by connectors/merge |
| `StellaOps.Feedser.Merge` | Precedence evaluation, alias graph maintenance, merge-event hashing. | Policy extensions via DI |
| `StellaOps.Feedser.Storage.Mongo` | Repository layer for documents, DTOs, advisories, merge events, export state. | Connection string/config via options |
| `StellaOps.Feedser.Exporter.Json` | Deterministic vuln-list JSON export pipeline. | Dependency injection for storage + plugin to host |
| `StellaOps.Feedser.Exporter.TrivyDb` | Builds Trivy DB artefacts from canonical advisories. | Optional ORAS push routines |
### 3.1 StellaOps.Feedser.WebService
* Hosts minimal API endpoints (`/health`, `/status`, `/jobs`).
* Runs the scheduler that triggers connectors and exporters according to
configured windows.
* Applies dependency-injection routines from `PluginBinaries/` at startup only
(restart-time plug-ins).
### 3.2 StellaOps.Feedser.Core
* Defines job primitives (fetch, parse, map, merge, export) used by connectors.
* Coordinates deterministic merge flows and writes `merge_event` documents.
* Provides telemetry/log scopes consumed by WebService and exporters.
### 3.3 StellaOps.Feedser.Storage.Mongo
* Persists raw documents, DTO records, canonical advisories, aliases, affected
packages, references, merge events, export state, and job leases.
* Exposes repository helpers for exporters to stream full/delta snapshots.
### 3.4 StellaOps.Feedser.Exporter.*
* `Exporter.Json` mirrors the Aqua vuln-list tree with canonical ordering.
* `Exporter.TrivyDb` builds Trivy DB Bolt archives and optional OCI bundles.
* Both exporters honour deterministic hashing and respect export cursors.
---
## 4 Source connectors
Connectors live under `StellaOps.Feedser.Source.*` and conform to the interfaces
in `StellaOps.Plugin`.
| Family | Project(s) | Notes |
|--------|------------|-------|
| Distro PSIRTs | `StellaOps.Feedser.Source.Distro.*` | Debian, Red Hat, SUSE, Ubuntu connectors with NEVRA/EVR helpers. |
| Vendor PSIRTs | `StellaOps.Feedser.Source.Vndr.*` | Adobe, Apple, Cisco, Chromium, Microsoft, Oracle, VMware. |
| Regional CERTs | `StellaOps.Feedser.Source.Cert*`, `Source.Ru.*`, `Source.Ics.*`, `Source.Kisa` | Provide enrichment metadata while preserving vendor precedence. |
| OSS ecosystems | `StellaOps.Feedser.Source.Ghsa`, `Source.Osv`, `Source.Cve`, `Source.Kev`, `Source.Acsc`, `Source.Cccs`, `Source.Jvn` | Emit SemVer/alias-rich advisories. |
Each connector ships fixtures/tests under the matching `*.Tests` project.
---
## 5 · Module Details
> _Focus on the Feedser-specific services that replace the legacy FeedMerge cron._
### 5.1 Feedser.Core
* Owns the fetch → parse → merge → export job pipeline and enforces deterministic
merge hashes (`merge_event`).
* Provides `JobSchedulerBuilder`, job coordinator, and telemetry scopes consumed
by the WebService and exporters.
### 5.2 Feedser.Storage.Mongo
* Bootstrapper creates collections/indexes (documents, dto, advisory, alias,
affected, merge_event, export_state, jobs, locks).
* Repository APIs surface full/delta advisory reads for exporters, plus
SourceState and job lease persistence.
### 5.3 Feedser.Exporter.Json / Feedser.Exporter.TrivyDb
* JSON exporter mirrors vuln-list layout with per-file digests and manifest.
* Trivy DB exporter shells or native-builds Bolt archives, optionally pushes OCI
layers, and records export cursors. Delta runs reuse unchanged blobs from the
previous full baseline, annotating `metadata.json` with `mode`, `baseExportId`,
`baseManifestDigest`, `resetBaseline`, and `delta.changedFiles[]`/`delta.removedPaths[]`.
ORAS pushes honour `publishFull` / `publishDelta`, and offline bundles respect
`includeFull` / `includeDelta` for air-gapped syncs.
### 5.4 Feedser.WebService
* Minimal API host exposing `/health`, `/ready`, `/jobs` and wiring telemetry.
* Loads restart-time plug-ins from `PluginBinaries/`, executes Mongo bootstrap,
and registers built-in connectors/exporters with the scheduler.
### 5.5 Plugin host & DI bridge
* `StellaOps.Plugin` + `StellaOps.DependencyInjection` provide the contracts and
helper routines for connectors/exporters to integrate with the WebService.
---
## 6 · Plug-ins & Agents
* **Plug-in discovery:** restart-only; the WebService enumerates
`PluginBinaries/` (or configured directories) and executes the contained
`IDependencyInjectionRoutine` implementations.
* **Connector/exporter packages:** each source/exporter can ship as a plug-in
assembly with its own options and HttpClient configuration, keeping the core
image minimal.
* **StellaOps CLI (agent):** new `StellaOps.Cli` module that exposes
`scanner`, `scan`, and `db` verbs (via System.CommandLine 2.0) to download
scanner container bundles, install them locally, execute scans against target
directories, automatically upload results, and trigger Feedser jobs
(`db fetch/merge/export`) aligned with the SBOM-first workflow described in
`AGENTS.md`.
* **Offline Kit:** bundles Feedser plug-ins, JSON tree, Trivy DB, and export
manifests so air-gapped sites can load the latest vulnerability data without
outbound connectivity.
---
## 7 · Docker & Distribution Artefacts
| Artefact | Path / Identifier | Notes |
|----------|-------------------|-------|
| Feedser WebService image | `containers/feedser/Dockerfile` (built via CI) | Self-contained ASP.NET runtime hosting scheduler/endpoints. |
| Plugin bundle | `PluginBinaries/` | Mounted or baked-in assemblies for connectors/exporters. |
| Offline Kit tarball | Produced by CI release pipeline | Contains JSON tree, Trivy DB OCI layout, export manifest, and plug-ins. |
| Local dev compose | `scripts/` + future compose overlays | Developers can run MongoDB, Redis (optional), and WebService locally. |
---
## 8 · Performance Budget
| Scenario | Budget | Source |
|----------|--------|--------|
| Advisory upsert (large advisory) | ≤ 500 ms/advisory | `AdvisoryStorePerformanceTests` (Mongo) |
| Advisory fetch (`GetRecent`) | ≤ 200 ms/advisory | Same performance test harness |
| Advisory point lookup (`Find`) | ≤ 200 ms/advisory | Same performance test harness |
| Bulk upsert/fetch cycle | ≤ 28 s total for 30 large advisories | Same performance test harness |
| Feedser job scheduling | Deterministic cron execution via `JobSchedulerHostedService` | `StellaOps.Feedser.Core` tests |
| Trivy DB export | Deterministic digests across runs (ongoing TODO for end-to-end test) | `Exporter.TrivyDb` backlog |
Budgets are enforced in automated tests where available; outstanding TODO/DOING
items (see task boards) continue tracking gaps such as exporter determinism.
---
## 9 Testing
* Unit and integration tests live alongside each component (`*.Tests`).
* Shared fixtures come from `StellaOps.Feedser.Testing` and
`StellaOps.Feedser.Tests.Shared` (linked via `Directory.Build.props`).
* Integration suites use ephemeral MongoDB and Redis via Testcontainers to
validate end-to-end flow without external dependencies.
---

View File

@@ -107,6 +107,7 @@ See the detailed rules in
## 6·Related documentation
* **Install guide:** `/install/#air-gapped`
* **Sovereign mode rationale:** `/sovereign/`
* **Security policy:** `/security/#reporting-a-vulnerability`
* **CERT-Bund snapshots:** `python tools/certbund_offline_snapshot.py --help` (see `docs/ops/feedser-certbund-operations.md`)

View File

@@ -11,6 +11,7 @@ Produce and maintain offline-friendly documentation for StellaOps modules, cover
## Operating Principles
- Keep guides deterministic and in sync with shipped configuration samples.
- Prefer tables/checklists for operator steps; flag security-sensitive actions.
- When work involves a specific `StellaOps.<Component>` project, consult both `docs/07_HIGH_LEVEL_ARCHITECTURE.md` and the matching dossier `docs/ARCHITECTURE_<COMPONENT>.md` before drafting or editing content.
- Update `docs/TASKS.md` whenever work items change status (TODO/DOING/REVIEW/DONE/BLOCKED).
## Coordination

View File

@@ -0,0 +1,384 @@
# component_architecture_attestor.md — **StellaOps Attestor** (2025Q4)
> **Scope.** Implementation-ready architecture for the **Attestor**: the service that **submits** DSSE envelopes to **Rekor v2**, retrieves/validates inclusion proofs, caches results, and exposes verification APIs. It accepts DSSE **only** from the **Signer** over mTLS, enforces chain-of-trust to StellaOps roots, and returns `{uuid, index, proof, logURL}` to calling services (Scanner.WebService for SBOMs; backend for final reports; Vexer exports when configured).
---
## 0) Mission & boundaries
**Mission.** Turn a signed DSSE envelope from the Signer into a **transparency-logged, verifiable fact** with a durable, replayable proof (Merkle inclusion + (optional) checkpoint anchoring). Provide **fast verification** for downstream consumers and a stable retrieval interface for UI/CLI.
**Boundaries.**
* Attestor **does not sign**; it **must not** accept unsigned or third-party-signed bundles.
* Attestor **does not decide PASS/FAIL**; it logs attestations for SBOMs, reports, and export artifacts.
* Rekor v2 backends may be **local** (self-hosted) or **remote**; Attestor handles both with retries, backoff, and idempotency.
---
## 1) Topology & dependencies
**Process shape:** single stateless service `stellaops/attestor` behind mTLS.
**Dependencies:**
* **Signer** (caller) — authenticated via **mTLS** and **Authority** OpToks.
* **Rekor v2** — tile-backed transparency log endpoint(s).
* **MinIO (S3)** — optional archive store for DSSE envelopes & verification bundles.
* **MongoDB** — local cache of `{uuid, index, proof, artifactSha256, bundleSha256}`; job state; audit.
* **Redis** — dedupe/idempotency keys and short-lived rate-limit buckets.
* **Licensing Service (optional)** — “endorse” call for cross-log publishing when customer opts in.
Trust boundary: **Only the Signer** is allowed to call submission endpoints; enforced by **mTLS peer cert allowlist** + `aud=attestor` OpTok.
---
## 2) Data model (Mongo)
Database: `attestor`
**Collections & schemas**
* `entries`
```
{ _id: "<rekor-uuid>",
artifact: { sha256: "<sha256>", kind: "sbom|report|vex-export", imageDigest?, subjectUri? },
bundleSha256: "<sha256>", // canonicalized DSSE
index: <int>, // log index/sequence if provided by backend
proof: { // inclusion proof
checkpoint: { origin, size, rootHash, timestamp },
inclusion: { leafHash, path[] } // Merkle path (tiles)
},
log: { url, logId? },
createdAt, status: "included|pending|failed",
signerIdentity: { mode: "keyless|kms", issuer, san?, kid? }
}
```
* `dedupe`
```
{ key: "bundle:<sha256>", rekorUuid, createdAt, ttlAt } // idempotency key
```
* `audit`
```
{ _id, ts, caller: { cn, mTLSThumbprint, sub, aud }, // from mTLS + OpTok
action: "submit|verify|fetch",
artifactSha256, bundleSha256, rekorUuid?, index?, result, latencyMs, backend }
```
Indexes:
* `entries` on `artifact.sha256`, `bundleSha256`, `createdAt`, and `{status:1, createdAt:-1}`.
* `dedupe.key` unique (TTL 24–48 h).
* `audit.ts` for timerange queries.
---
## 3) Input contract (from Signer)
**Attestor accepts only** DSSE envelopes that satisfy all of:
1. **mTLS** peer certificate maps to `signer` service (CA-pinned).
2. **Authority** OpTok with `aud=attestor`, `scope=attestor.write`, DPoP or mTLS bound.
3. DSSE envelope is **signed by the Signer's key** (or includes a **Fulcio-issued** cert chain) and **chains to configured roots** (Fulcio/KMS).
4. **Predicate type** is one of StellaOps types (sbom/report/vex-export) with valid schema.
5. `subject[*].digest.sha256` is present and canonicalized.
**Wire shape (JSON):**
```json
{
"bundle": { "dsse": { "payloadType": "application/vnd.in-toto+json", "payload": "<b64>", "signatures": [ ... ] },
"certificateChain": [ "-----BEGIN CERTIFICATE-----..." ],
"mode": "keyless" },
"meta": {
"artifact": { "sha256": "<subject sha256>", "kind": "sbom|report|vex-export", "imageDigest": "sha256:..." },
"bundleSha256": "<sha256 of canonical dsse>",
"logPreference": "primary", // "primary" | "mirror" | "both"
"archive": true // whether Attestor should archive bundle to S3
}
}
```
---
## 4) APIs
### 4.1 Submission
`POST /api/v1/rekor/entries` *(mTLS + OpTok required)*
* **Body**: as above.
* **Behavior**:
* Verify caller (mTLS + OpTok).
* Validate DSSE bundle (signature, cert chain to Fulcio/KMS; DSSE structure; payloadType allowed).
* Idempotency: compute `bundleSha256`; check `dedupe`. If present, return existing `rekorUuid`.
* Submit canonicalized bundle to Rekor v2 (primary or mirror according to `logPreference`).
* Retrieve **inclusion proof** (blocking until inclusion or up to `proofTimeoutMs`); if backend returns promise only, return `status=pending` and retry asynchronously.
* Persist `entries` record; archive DSSE to S3 if `archive=true`.
* **Response 200**:
```json
{
"uuid": "…",
"index": 123456,
"proof": {
"checkpoint": { "origin": "rekor@site", "size": 987654, "rootHash": "…", "timestamp": "…" },
"inclusion": { "leafHash": "…", "path": ["…","…"] }
},
"logURL": "https://rekor…/api/v2/log/…/entries/…",
"status": "included"
}
```
* **Errors**: `401 invalid_token`, `403 not_signer|chain_untrusted`, `409 duplicate_bundle` (with existing `uuid`), `502 rekor_unavailable`, `504 proof_timeout`.
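
The idempotency step in the behavior list above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the service's .NET code: `DedupeStore`, `submit_once`, and the `submit_to_log` callback are hypothetical names, and an in-memory dict stands in for the `dedupe` collection.

```python
import hashlib


class DedupeStore:
    """In-memory stand-in for the `dedupe` collection (hypothetical helper)."""

    def __init__(self):
        self._uuid_by_key = {}

    def get(self, bundle_sha256):
        return self._uuid_by_key.get("bundle:" + bundle_sha256)

    def put(self, bundle_sha256, rekor_uuid):
        self._uuid_by_key["bundle:" + bundle_sha256] = rekor_uuid


def submit_once(canonical_bundle, dedupe, submit_to_log):
    """Submit a canonicalized bundle at most once.

    Returns (rekor_uuid, was_duplicate). `submit_to_log` is an assumed
    callable that sends the bytes to the log and returns its uuid.
    """
    digest = hashlib.sha256(canonical_bundle).hexdigest()
    existing = dedupe.get(digest)
    if existing is not None:
        # Same bundle seen before: hand back the recorded uuid, no new entry.
        return existing, True
    rekor_uuid = submit_to_log(canonical_bundle)
    dedupe.put(digest, rekor_uuid)
    return rekor_uuid, False
```

A real deployment would need the check-and-insert to be atomic (for example through the unique index on `dedupe.key`), which this single-process sketch does not attempt.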
### 4.2 Proof retrieval
`GET /api/v1/rekor/entries/{uuid}`
* Returns `entries` row (refreshes proof from Rekor if stale/missing).
* Accepts `?refresh=true` to force backend query.
### 4.3 Verification (third-party or internal)
`POST /api/v1/rekor/verify`
* **Body** (one of):
* `{ "uuid": "…" }`
* `{ "bundle": { …DSSE… } }`
* `{ "artifactSha256": "…" }` *(looks up most recent entry)*
* **Checks**:
1. **Bundle signature** → cert chain to Fulcio/KMS roots configured.
2. **Inclusion proof** → recompute leaf hash; verify Merkle path against checkpoint root.
3. Optionally verify **checkpoint** against local trust anchors (if Rekor signs checkpoints).
  4. Confirm **subject.digest** matches caller-provided hash (when given).
* **Response**:
```json
{ "ok": true, "uuid": "…", "index": 123, "logURL": "…", "checkedAt": "…" }
```
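
Check 2 above (recompute the leaf hash, then verify the Merkle path against the checkpoint root) can be illustrated with the following Python sketch. It assumes RFC 6962-style hashing (`0x00` prefix for leaves, `0x01` for interior nodes, SHA-256) and follows the inclusion-proof verification steps of RFC 9162; the exact leaf encoding used by a given Rekor v2 backend is not specified here, and `merkle_root` and `inclusion_path` are demo helpers only.

```python
import hashlib


def leaf_hash(data):
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left, right):
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_inclusion(leaf, index, tree_size, path, root):
    """Recompute the root from a leaf hash and its audit path."""
    if index >= tree_size:
        return False
    fn, sn = index, tree_size - 1
    r = leaf
    for p in path:
        if sn == 0:
            return False
        if (fn & 1) or fn == sn:
            r = node_hash(p, r)
            if not (fn & 1):
                # Skip levels where this node had no right sibling.
                while fn != 0 and not (fn & 1):
                    fn >>= 1
                    sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root


def _split(n):
    """Largest power of two strictly less than n (n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(leaves):
    """Reference root over raw leaf data, used only to build demo proofs."""
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = _split(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


def inclusion_path(index, leaves):
    """Reference audit path for leaves[index], leaf-side hashes first."""
    if len(leaves) == 1:
        return []
    k = _split(len(leaves))
    if index < k:
        return inclusion_path(index, leaves[:k]) + [merkle_root(leaves[k:])]
    return inclusion_path(index - k, leaves[k:]) + [merkle_root(leaves[:k])]
```

In the verification flow, `root` would come from the checkpoint's `rootHash`, `tree_size` from its `size`, and `path` from `inclusion.path[]`.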
### 4.4 Batch submission (optional)
`POST /api/v1/rekor/batch` accepts an array of submission objects; processes with per-item results.
---
## 5) Rekor v2 driver (backend)
* **Canonicalization**: DSSE envelopes are **normalized** (stable JSON ordering, no insignificant whitespace) before hashing and submission.
* **Transport**: HTTP/2 with retries (exponential backoff, jitter), budgeted timeouts.
* **Idempotency**: if backend returns “already exists,” map to existing `uuid`.
* **Proof acquisition**:
* In synchronous mode, poll the log for inclusion up to `proofTimeoutMs`.
* In asynchronous mode, return `pending` and schedule a **proof fetcher** job (Mongo job doc + backoff).
* **Mirrors/dual logs**:
* When `logPreference="both"`, submit to primary and mirror; store **both** UUIDs (primary canonical).
* Optional **cloud endorsement**: POST to the StellaOps cloud `/attest/endorse` with `{uuid, artifactSha256}`; store returned endorsement id.
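
The canonicalization rule at the top of this list (stable JSON ordering, no insignificant whitespace, then hash) might look like the Python sketch below. It is an approximation for illustration: `canonicalize` and `bundle_sha256` are hypothetical names, and sorted keys with compact separators is not a complete canonical-JSON scheme such as RFC 8785 (number formatting, for instance, is not normalized).

```python
import hashlib
import json


def canonicalize(envelope):
    """Serialize with sorted keys and no insignificant whitespace."""
    text = json.dumps(
        envelope,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return text.encode("utf-8")


def bundle_sha256(envelope):
    """Hex SHA-256 over the canonical bytes, usable as an idempotency key."""
    return hashlib.sha256(canonicalize(envelope)).hexdigest()
```

Because key order and whitespace no longer affect the bytes, two semantically identical envelopes yield the same `bundleSha256`, which is what makes the dedupe lookup reliable.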
---
## 6) Security model
* **mTLS required** for submission from **Signer** (CA-pinned).
* **Authority token** with `aud=attestor` and DPoP/mTLS binding must be presented; Attestor verifies both.
* **Bundle acceptance policy**:
* DSSE signature must chain to the configured **Fulcio** (keyless) or **KMS/HSM** roots.
* SAN (Subject Alternative Name) must match **Signer identity** policy (e.g., `urn:stellaops:signer` or pinned OIDC issuer).
* Predicate `predicateType` must be on allowlist (sbom/report/vex-export).
  * `subject.digest.sha256` values must be present and well-formed (hex).
* **No public submission** path. **Never** accept bundles from untrusted clients.
* **Rate limits**: per mTLS thumbprint/license (from Signer-forwarded claims) to avoid flooding the log.
* **Redaction**: Attestor never logs secret material; DSSE payloads **should** be public by design (SBOMs/reports). If customers require redaction, enforce policy at Signer (predicate minimization) **before** Attestor.
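
The per-caller rate limit mentioned above could be realized with a token bucket per mTLS thumbprint. The sketch below is one possible approach in Python, not the documented mechanism; `TokenBucket` and `PerCallerLimiter` are hypothetical names, and time is passed in explicitly so the logic stays deterministic.

```python
class TokenBucket:
    """Refills at `qps` tokens per second, holding at most `burst` tokens."""

    def __init__(self, qps, burst, now):
        self.rate = float(qps)
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.updated = now

    def allow(self, now):
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerCallerLimiter:
    """One bucket per caller key, e.g. the mTLS certificate thumbprint."""

    def __init__(self, qps, burst):
        self._qps = qps
        self._burst = burst
        self._buckets = {}

    def allow(self, caller_key, now):
        bucket = self._buckets.get(caller_key)
        if bucket is None:
            bucket = TokenBucket(self._qps, self._burst, now)
            self._buckets[caller_key] = bucket
        return bucket.allow(now)
```

With several replicas the bucket state would have to live in a shared store such as Redis rather than in process memory.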
---
## 7) Storage & archival
* **Entries** in Mongo provide a local ledger keyed by `rekorUuid` and **artifact sha256** for quick reverse lookups.
* **S3 archival** (if enabled):
```
s3://stellaops/attest/
dsse/<bundleSha256>.json
proof/<rekorUuid>.json
bundle/<artifactSha256>.zip # optional verification bundle
```
* **Verification bundles** (zip):
* DSSE (`*.dsse.json`), proof (`*.proof.json`), `chain.pem` (certs), `README.txt` with verification steps & hashes.
---
## 8) Observability & audit
**Metrics** (Prometheus):
* `attestor.submit_total{result,backend}`
* `attestor.submit_latency_seconds{backend}`
* `attestor.proof_fetch_total{result}`
* `attestor.verify_total{result}`
* `attestor.dedupe_hits_total`
* `attestor.errors_total{type}`
**Tracing**:
* Spans: `validate`, `rekor.submit`, `rekor.poll`, `persist`, `archive`, `verify`.
**Audit**:
* Immutable `audit` rows (ts, caller, action, hashes, uuid, index, backend, result, latency).
---
## 9) Configuration (YAML)
```yaml
attestor:
listen: "https://0.0.0.0:8444"
security:
mtls:
caBundle: /etc/ssl/signer-ca.pem
requireClientCert: true
authority:
issuer: "https://authority.internal"
jwksUrl: "https://authority.internal/jwks"
requireSenderConstraint: "dpop" # or "mtls"
signerIdentity:
mode: ["keyless","kms"]
fulcioRoots: ["/etc/fulcio/root.pem"]
allowedSANs: ["urn:stellaops:signer"]
kmsKeys: ["kms://cluster-kms/stellaops-signer"]
rekor:
primary:
url: "https://rekor-v2.internal"
proofTimeoutMs: 15000
pollIntervalMs: 250
maxAttempts: 60
mirror:
enabled: false
url: "https://rekor-v2.mirror"
mongo:
uri: "mongodb://mongo/attestor"
s3:
enabled: true
endpoint: "http://minio:9000"
bucket: "stellaops"
prefix: "attest/"
objectLock: "governance"
redis:
url: "redis://redis:6379/2"
quotas:
perCaller:
qps: 50
burst: 100
```
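
How `proofTimeoutMs`, `pollIntervalMs`, and `maxAttempts` interact is not spelled out above. The Python sketch below shows one plausible reading, in which polling stops at whichever limit is reached first and the caller then falls back to `status=pending`. `poll_for_proof` and `fetch_proof` are hypothetical names; the clock and sleep functions are injectable so the loop can be exercised without waiting.

```python
import time


def poll_for_proof(
    fetch_proof,
    timeout_ms=15000,
    interval_ms=250,
    max_attempts=60,
    clock=time.monotonic,
    sleep=time.sleep,
):
    """Poll until a proof arrives, the time budget is spent, or attempts run out.

    `fetch_proof` is an assumed callable returning the proof, or None when the
    entry is not yet included. Returns ("included", proof) or ("pending", None).
    """
    interval_s = interval_ms / 1000.0
    deadline = clock() + timeout_ms / 1000.0
    for _ in range(max_attempts):
        proof = fetch_proof()
        if proof is not None:
            return "included", proof
        if clock() + interval_s > deadline:
            break  # the next attempt would start after the deadline
        sleep(interval_s)
    return "pending", None
```

With the sample values, 60 attempts at 250 ms intervals add up to the same 15 s as `proofTimeoutMs`, so either limit ends the loop at about the same time.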
---
## 10) End-to-end sequences
**A) Submit & include (happy path)**
```mermaid
sequenceDiagram
autonumber
participant SW as Scanner.WebService
participant SG as Signer
participant AT as Attestor
participant RK as Rekor v2
SW->>SG: POST /sign/dsse (OpTok+PoE)
SG-->>SW: DSSE bundle (+certs)
SW->>AT: POST /rekor/entries (mTLS + OpTok)
AT->>AT: Validate DSSE (chain to Fulcio/KMS; signer identity)
AT->>RK: submit(bundle)
RK-->>AT: {uuid, index?}
AT->>RK: poll inclusion until proof or timeout
RK-->>AT: inclusion proof (checkpoint + path)
AT-->>SW: {uuid, index, proof, logURL}
```
**B) Verify by artifact digest (CLI)**
```mermaid
sequenceDiagram
autonumber
participant CLI as stellaops verify
participant SW as Scanner.WebService
participant AT as Attestor
CLI->>SW: GET /catalog/artifacts/{id}
SW-->>CLI: {artifactSha256, rekor: {uuid}}
CLI->>AT: POST /rekor/verify { uuid }
AT-->>CLI: { ok: true, index, logURL }
```
---
## 11) Failure modes & responses
| Condition                             | Return                  | Details                                                   |
| ------------------------------------- | ----------------------- | --------------------------------------------------------- |
| mTLS/OpTok invalid                    | `401 invalid_token`     | Include `WWW-Authenticate` DPoP challenge when applicable |
| Bundle not signed by trusted identity | `403 chain_untrusted`   | DSSE accepted only from Signer identities                 |
| Duplicate bundle                      | `409 duplicate_bundle`  | Return existing `uuid` (idempotent)                       |
| Rekor unreachable/timeout             | `502 rekor_unavailable` | Retry with backoff; surface `Retry-After`                 |
| Inclusion proof timeout               | `202 accepted`          | `status=pending`, background job continues to fetch proof |
| Archive failure                       | `207 multi-status`      | Entry recorded; archive will retry asynchronously         |
| Verification mismatch                 | `400 verify_failed`     | Include reason: chain \| leafHash \| rootMismatch         |
---
## 12) Performance & scale
* Stateless; scale horizontally.
* **Targets**:
  * Submit+proof P95 ≤ **300 ms** (warm log; local Rekor).
  * Verify P95 ≤ **30 ms** from cache; ≤ **120 ms** with live proof fetch.
* 1k submissions/minute per replica sustained.
* **Hot caches**: `dedupe` (bundle hash → uuid), recent `entries` by artifact sha256.
---
## 13) Testing matrix
* **Happy path**: valid DSSE, inclusion within timeout.
* **Idempotency**: resubmit same `bundleSha256` → same `uuid`.
* **Security**: reject non-Signer mTLS, wrong `aud`, DPoP replay, untrusted cert chain, forbidden predicateType.
* **Rekor variants**: promise-then-proof, proof delayed, mirror dual-submit, mirror failure.
* **Verification**: corrupt leaf path, wrong root, tampered bundle.
* **Throughput**: soak test with 10k submissions; latency SLOs, zero drops.
---
## 14) Implementation notes
* Language: **.NET 10** minimal API; `HttpClient` with **sockets handler** tuned for HTTP/2.
* JSON: **canonical writer** for DSSE payload hashing.
* Crypto: use **BouncyCastle**/**System.Security.Cryptography**; PEM parsing for cert chains.
* Rekor client: pluggable driver; treat backend errors as retryable/non-retryable with granular mapping.
* Safety: size caps on bundles; decompress bombs guarded; strict UTF-8.
* CLI integration: `stellaops verify attestation <uuid|bundle|artifact>` calls `/rekor/verify`.
---
## 15) Optional features
* **Dual-log** write (primary + mirror) and **cross-log proof** packaging.
* **Cloud endorsement**: send `{uuid, artifactSha256}` to StellaOps cloud; store returned endorsement id for marketing/chain-of-custody.
* **Checkpoint pinning**: periodically pin latest Rekor checkpoints to an external audit store for independent monitoring.

View File

@@ -0,0 +1,394 @@
# component_architecture_authority.md — **StellaOps Authority** (2025Q4)
> **Scope.** Implementation-ready architecture for **StellaOps Authority**: the on-prem **OIDC/OAuth2** service that issues **short-lived, sender-constrained operational tokens (OpToks)** to first-party services and tools. Covers protocols (DPoP & mTLS binding), token shapes, endpoints, storage, rotation, HA, RBAC, audit, and testing. This component is the trust anchor for *who* is calling inside a StellaOps installation. (Entitlement is proven separately by **PoE** from the cloud Licensing Service; Authority does not issue PoE.)
---
## 0) Mission & boundaries
**Mission.** Provide **fast, local, verifiable** authentication for StellaOps microservices and tools by minting **very short-lived** OAuth2/OIDC tokens that are **sender-constrained** (DPoP or mTLS-bound). Support RBAC scopes, multi-tenant claims, and deterministic validation for APIs (Scanner, Signer, Attestor, Vexer, Feedser, UI, CLI, Zastava).
**Boundaries.**
* Authority **does not** validate entitlements/licensing. That is enforced by **Signer** using **PoE** with the cloud Licensing Service.
* Authority tokens are **operational only** (2–5 min TTL) and must not be embedded in long-lived artifacts or stored in SBOMs.
* Authority is **stateless for validation** (JWT) and **optional introspection** for services that prefer online checks.
---
## 1) Protocols & cryptography
* **OIDC Discovery**: `/.well-known/openid-configuration`
* **OAuth2** grant types:
* **Client Credentials** (service↔service, with mTLS or private_key_jwt)
* **Device Code** (CLI login on headless agents; optional)
* **Authorization Code + PKCE** (browser login for UI; optional)
* **Sender constraint options** (choose per caller or per audience):
  * **DPoP** (Demonstration of Proof-of-Possession): proof JWT on each HTTP request, bound to the access token via `cnf.jkt`.
  * **OAuth 2.0 mTLS** (certificate-bound tokens): token bound to client certificate thumbprint via `cnf.x5t#S256`.
* **Signing algorithms**: **EdDSA (Ed25519)** preferred; fallback **ES256 (P-256)**. Rotation is supported via **kid** in JWKS.
* **Token format**: **JWT** access tokens (compact), optionally opaque reference tokens for services that insist on introspection.
* **Clock skew tolerance**: ±60 s; issue `nbf`, `iat`, `exp` accordingly.
---
## 2) Token model
### 2.1 Access token (OpTok) — short-lived (120–300 s)
**Registered claims**
```
iss = https://authority.<domain>
sub = <client_id or user_id>
aud = <service audience: signer|scanner|attestor|feedser|vexer|ui|zastava>
exp = <unix ts> (<= 300 s from iat)
iat = <unix ts>
nbf = iat - 30
jti = <uuid>
scope = "scanner.scan scanner.export signer.sign ..."
```
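
A resource server's checks on these registered claims might look like the Python sketch below. It covers only audience, lifetime, time window (with the ±60 s skew tolerance), and scope; signature verification and the sender-constraint check are deliberately out of scope. `validate_optok_claims` and the error strings are hypothetical.

```python
CLOCK_SKEW_S = 60  # tolerance from the protocol section
MAX_TTL_S = 300    # OpTok ceiling: exp <= iat + 300 s


def validate_optok_claims(claims, expected_aud, required_scope, now):
    """Return a list of problems; an empty list means the claims are acceptable.

    `claims` is the already signature-verified JWT payload as a dict and
    `now` is the current Unix time in seconds.
    """
    problems = []
    if claims.get("aud") != expected_aud:
        problems.append("wrong_audience")
    iat, exp = claims.get("iat"), claims.get("exp")
    if not isinstance(iat, int) or not isinstance(exp, int):
        problems.append("missing_iat_or_exp")
        return problems
    nbf = claims.get("nbf", iat)
    if exp - iat > MAX_TTL_S:
        problems.append("ttl_too_long")
    if now > exp + CLOCK_SKEW_S:
        problems.append("expired")
    if now < nbf - CLOCK_SKEW_S:
        problems.append("not_yet_valid")
    if required_scope not in claims.get("scope", "").split():
        problems.append("missing_scope")
    return problems
```

Returning every problem instead of stopping at the first makes audit entries more informative; a production validator would more likely reject on the first failure.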
**Sender-constraint (`cnf`)**
* **DPoP**:
```json
"cnf": { "jkt": "<base64url(SHA-256(JWK))>" }
```
* **mTLS**:
```json
"cnf": { "x5t#S256": "<base64url(SHA-256(client_cert_der))>" }
```
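
Both `cnf` values are base64url-encoded SHA-256 digests without padding. The Python sketch below shows how they could be computed, assuming the JWK thumbprint follows RFC 7638 (only the required members for the key type, in lexicographic order, with no whitespace). `jwk_thumbprint` and `cert_thumbprint` are hypothetical helper names, and the key material in the usage example is a placeholder.

```python
import base64
import hashlib
import json

# Required members per key type for an RFC 7638 thumbprint.
_REQUIRED_MEMBERS = {
    "EC": ("crv", "kty", "x", "y"),
    "OKP": ("crv", "kty", "x"),
    "RSA": ("e", "kty", "n"),
}


def b64url(data):
    """Base64url without '=' padding, as used in JOSE."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwk_thumbprint(jwk):
    """Value for `cnf.jkt`: hash of the canonical required members only."""
    members = _REQUIRED_MEMBERS[jwk["kty"]]
    canonical = json.dumps(
        {name: jwk[name] for name in members},
        sort_keys=True,
        separators=(",", ":"),
    )
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


def cert_thumbprint(cert_der):
    """Value for `cnf.x5t#S256`: hash of the DER-encoded client certificate."""
    return b64url(hashlib.sha256(cert_der).digest())
```

For example, `jwk_thumbprint({"kty": "OKP", "crv": "Ed25519", "x": "<public key>"})` ignores optional members such as `use` or `kid`, so the thumbprint stays stable however the client serializes its key.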
**Install/tenant context (custom claims)**
```
tid = <tenant id> // multi-tenant
inst = <installation id> // unique installation
roles = [ "svc.scanner", "svc.signer", "ui.admin", ... ]
plan? = <plan name> // optional hint for UIs; not used for enforcement
```
> **Note**: Do **not** copy PoE claims into OpTok; OpTok ≠ entitlement. Only **Signer** checks PoE.
### 2.2 Refresh tokens (optional)
* Default **disabled**. If enabled (for UI interactive logins), pair with **DPoP-bound** refresh tokens or **mTLS** client sessions; short TTL (≤ 8 h), rotating on use (replay-safe).
### 2.3 ID tokens (optional)
* Issued for UI/browser OIDC flows (Authorization Code + PKCE); not used for service auth.
---
## 3) Endpoints & flows
### 3.1 OIDC discovery & keys
* `GET /.well-known/openid-configuration` → endpoints, algs, jwks_uri
* `GET /jwks` → JSON Web Key Set (rotating, at least 2 active keys during transition)
### 3.2 Token issuance
* `POST /oauth/token`
* **Client Credentials** (service→service):
* **mTLS**: mutual TLS + `client_id` → bound token (`cnf.x5t#S256`)
    * **private_key_jwt**: JWT-based client auth + **DPoP** header (preferred for tools and CLI)
* **Device Code** (CLI): `POST /oauth/device/code` + `POST /oauth/token` poll
* **Authorization Code + PKCE** (UI): standard
**DPoP handshake (example)**
1. Client prepares **JWK** (ephemeral keypair).
2. Client sends **DPoP proof** header with fields:
```
htm=POST
htu=https://authority.../oauth/token
iat=<now>
jti=<uuid>
```
signed with the DPoP private key; header carries JWK.
3. Authority validates proof; issues access token with `cnf.jkt=<thumbprint(JWK)>`.
4. Client uses the same DPoP key to sign **every subsequent API request** to services (Signer, Scanner, …).
**mTLS flow**
* Mutual TLS at the connection; Authority extracts client cert, validates chain; token carries `cnf.x5t#S256`.
### 3.3 Introspection & revocation (optional)
* `POST /oauth/introspect` → `{ active, sub, scope, aud, exp, cnf, ... }`
* `POST /oauth/revoke` → revokes refresh tokens or opaque access tokens.
* **Replay prevention**: maintain **DPoP `jti` cache** (TTL ≤ 10 min) to reject duplicate proofs when services supply DPoP nonces (Signer requires nonce for high-value operations).
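A `jti` replay cache can be illustrated with an in-memory structure. This is only a sketch: the class name and the injectable clock are assumptions, and a shared deployment would more likely rely on the Redis cache mentioned elsewhere in this document (for example an atomic set-if-absent with expiry) rather than process memory.

```python
import time


class JtiReplayCache:
    """Remembers each jti for ttl_seconds and rejects repeats within that window."""

    def __init__(self, ttl_seconds: float = 600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._seen: dict[str, float] = {}  # jti -> expiry timestamp

    def accept(self, jti: str) -> bool:
        now = self._clock()
        # Drop entries whose window has passed so the map stays bounded.
        self._seen = {k: exp for k, exp in self._seen.items() if exp > now}
        if jti in self._seen:
            return False  # duplicate proof inside the TTL window
        self._seen[jti] = now + self._ttl
        return True
```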
### 3.4 UserInfo (optional for UI)
* `GET /userinfo` (ID token context).
---
## 4) Audiences, scopes & RBAC
### 4.1 Audiences
* `signer` — only the **Signer** service should accept tokens with `aud=signer`.
* `attestor`, `scanner`, `feedser`, `vexer`, `ui`, `zastava` similarly.
Services **must** verify `aud` and **sender constraint** (DPoP/mTLS) per their policy.
### 4.2 Core scopes
| Scope | Service | Operation |
| ---------------------------------- | ------------------ | -------------------------- |
| `signer.sign` | Signer | Request DSSE signing |
| `attestor.write` | Attestor | Submit Rekor entries |
| `scanner.scan` | Scanner.WebService | Submit scan jobs |
| `scanner.export` | Scanner.WebService | Export SBOMs |
| `scanner.read` | Scanner.WebService | Read catalog/SBOMs |
| `vex.read` / `vex.admin` | Vexer | Query/operate |
| `feedser.read` / `feedser.export` | Feedser | Query/exports |
| `ui.read` / `ui.admin` | UI | View/admin |
| `zastava.emit` / `zastava.enforce` | Scanner/Zastava | Runtime events / admission |
**Roles → scopes mapping** is configured centrally (Authority policy) and pushed during token issuance.
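One way to picture the role-to-scope expansion at issuance is the sketch below. The mapping contents are hypothetical (only the role and scope names come from this document), and the rule that a grant is the intersection of role-derived, requested, and client-allowed scopes is an assumption about how Authority policy might be applied.

```python
# Hypothetical role -> scope policy; real mappings live in Authority configuration.
ROLE_SCOPES = {
    "svc.scanner": {"scanner.scan", "scanner.export", "scanner.read"},
    "svc.signer": {"signer.sign"},
    "ui.admin": {"ui.read", "ui.admin"},
}


def granted_scopes(roles, requested, client_allowed):
    """Scopes to put in the token: implied by roles, requested, and allowed for the client."""
    from_roles = set().union(*(ROLE_SCOPES.get(role, set()) for role in roles))
    return sorted(from_roles & set(requested) & set(client_allowed))
```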
---
## 5) Storage & state
* **Configuration DB** (PostgreSQL/MySQL): clients, audiences, role→scope maps, tenant/installation registry, device code grants, persistent consents (if any).
* **Cache** (Redis):
* DPoP **jti** replay cache (short TTL)
* **Nonce** store (per resource server, if they demand nonce)
* Device code pollers, rate limiting buckets
* **JWKS**: key material in HSM/KMS or encrypted at rest; JWKS served from memory.
---
## 6) Key management & rotation
* Maintain **at least 2 signing keys** active during rotation; tokens carry `kid`.
* Prefer **Ed25519** for compact tokens; maintain **ES256** fallback for FIPS contexts.
* Rotation cadence: 30-90 days; emergency rotation supported.
* Publish new JWKS **before** issuing tokens with the new `kid` to avoid cold-start validation misses.
* Keep **old keys** available **at least** for max token TTL + 5 minutes.
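The retention rule for old keys is simple arithmetic. As an illustration (the function name is an assumption), with a 180-second access token TTL a key that last signed a token at time `t` must stay in the JWKS until at least `t + 180 + 300` seconds:

```python
def earliest_key_retirement(last_issued_at: int, max_token_ttl_s: int, grace_s: int = 300) -> int:
    """Earliest Unix time at which an old signing key may leave the JWKS."""
    return last_issued_at + max_token_ttl_s + grace_s
```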
---
## 7) HA & performance
* **Stateless issuance** (except device codes/refresh) → scale horizontally behind a load balancer.
* **DB** only for client metadata and optional flows; token checks are JWT-local; introspection endpoints hit cache/DB minimally.
* **Targets**:
  * Token issuance P95 ≤ **20 ms** under warm cache.
  * DPoP proof validation ≤ **1 ms** extra per request at resource servers (Signer/Scanner).
  * 99.9% uptime; HPA on CPU/latency.
---
## 8) Security posture
* **Strict TLS** (1.3 preferred); HSTS; modern cipher suites.
* **mTLS** enabled where required (Signer/Attestor paths).
* **Replay protection**: DPoP `jti` cache, nonce support for **Signer** (add `DPoP-Nonce` header on 401; clients re-sign).
* **Rate limits** per client & per IP; exponential backoff on failures.
* **Secrets**: clients use **private_key_jwt** or **mTLS**; never basic secrets over the wire.
* **CSP/CSRF** hardening on UI flows; `SameSite=Lax` cookies; PKCE enforced.
* **Logs** redact `Authorization` and DPoP proofs; store `sub`, `aud`, `scopes`, `inst`, `tid`, `cnf` thumbprints, not full keys.
---
## 9) Multi-tenancy & installations
* **Tenant (`tid`)** and **Installation (`inst`)** registries define which audiences/scopes a client can request.
* Cross-tenant isolation enforced at issuance (disallow rogue `aud`), and resource servers **must** check that `tid` matches their configured tenant.
---
## 10) Admin & operations APIs
All under `/admin` (mTLS + `authority.admin` scope).
```
POST /admin/clients # create/update client (confidential/public)
POST /admin/audiences # register audience resource URIs
POST /admin/roles # define role→scope mappings
POST /admin/tenants # create tenant/install entries
POST /admin/keys/rotate # rotate signing key (zero-downtime)
GET /admin/metrics # Prometheus exposition (token issue rates, errors)
GET /admin/healthz|readyz # health/readiness
```
---
## 11) Integration hard lines (what resource servers must enforce)
Every StellaOps service that consumes Authority tokens **must**:
1. Verify JWT signature (`kid` in JWKS), `iss`, `aud`, `exp`, `nbf`.
2. Enforce **sender-constraint**:
   * **DPoP**: validate DPoP proof (`htu`, `htm`, `iat`, `jti`) and match `cnf.jkt`; cache `jti` for replay defense; honor nonce challenges.
   * **mTLS**: match presented client cert thumbprint to token `cnf.x5t#S256`.
3. Check **scopes**; optionally map to internal roles.
4. Check **tenant** (`tid`) and **installation** (`inst`) as appropriate.
5. For **Signer** only: require **both** OpTok and **PoE** in the request (enforced by Signer, not Authority).
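The claim checks in the list above might look like the following on a resource server. This is a sketch under assumptions: signature verification and the DPoP/mTLS proof check are assumed to have happened already, the function name and the failure-list return shape are invented for illustration, and the 60-second skew mirrors `clockSkewSeconds` from the configuration section.

```python
import time


def check_claims(claims, *, issuer, audience, required_scope, tenant,
                 now=None, skew_s=60):
    """Return the names of failed checks; an empty list means the token passes."""
    now = int(time.time()) if now is None else now
    failures = []
    if claims.get("iss") != issuer:
        failures.append("iss")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        failures.append("aud")
    if now >= claims.get("exp", 0) + skew_s:      # expired, allowing for clock skew
        failures.append("exp")
    if now < claims.get("nbf", 0) - skew_s:       # not yet valid
        failures.append("nbf")
    if required_scope not in claims.get("scope", "").split():
        failures.append("scope")
    if claims.get("tid") != tenant:
        failures.append("tid")
    if not claims.get("cnf"):                     # plain bearer tokens are rejected
        failures.append("cnf")
    return failures
```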
---
## 12) Error surfaces & UX
* Token endpoint errors follow OAuth2 (`invalid_client`, `invalid_grant`, `invalid_scope`, `unauthorized_client`).
* Resource servers use RFC 6750 style (`WWW-Authenticate: DPoP error="invalid_token", error_description="…", dpop_nonce="…"`).
* For DPoP nonce challenges, clients retry with the server-supplied nonce once.
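The single-retry rule for nonce challenges can be sketched with injected callables so that no particular HTTP library is implied. The names `send` and `make_proof` are hypothetical, and reading the nonce from a `DPoP-Nonce` response header (with exact-case lookup in a plain dict) is a simplifying assumption.

```python
def call_with_nonce_retry(send, make_proof):
    """send(proof) -> (status, headers); make_proof(nonce_or_None) -> proof string."""
    status, headers = send(make_proof(None))
    nonce = headers.get("DPoP-Nonce")
    if status == 401 and nonce:
        # Exactly one retry, with a fresh proof that embeds the server nonce.
        status, headers = send(make_proof(nonce))
    return status, headers
```

A second `401` is returned to the caller as-is, which keeps a misbehaving server from causing a retry loop.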
---
## 13) Observability & audit
* **Metrics**:
* `authority.tokens_issued_total{grant,aud}`
* `authority.dpop_validations_total{result}`
* `authority.mtls_bindings_total{result}`
* `authority.jwks_rotations_total`
* `authority.errors_total{type}`
* **Audit log** (immutable sink): token issuance (`sub`, `aud`, `scopes`, `tid`, `inst`, `cnf thumbprint`, `jti`), revocations, admin changes.
* **Tracing**: token flows, DB reads, JWKS cache.
---
## 14) Configuration (YAML)
```yaml
authority:
  issuer: "https://authority.internal"
  keys:
    algs: [ "EdDSA", "ES256" ]
    rotationDays: 60
    storage: kms://cluster-kms/authority-signing
  tokens:
    accessTtlSeconds: 180
    enableRefreshTokens: false
    clockSkewSeconds: 60
  dpop:
    enable: true
    nonce:
      enable: true
      ttlSeconds: 600
  mtls:
    enable: true
    caBundleFile: /etc/ssl/mtls/clients-ca.pem
  clients:
    - clientId: scanner-web
      grantTypes: [ "client_credentials" ]
      audiences: [ "scanner" ]
      auth: { type: "private_key_jwt", jwkFile: "/secrets/scanner-web.jwk" }
      senderConstraint: "dpop"
      scopes: [ "scanner.scan", "scanner.export", "scanner.read" ]
    - clientId: signer
      grantTypes: [ "client_credentials" ]
      audiences: [ "signer" ]
      auth: { type: "mtls" }
      senderConstraint: "mtls"
      scopes: [ "signer.sign" ]
```
---
## 15) Testing matrix
* **JWT validation**: wrong `aud`, expired `exp`, skewed `nbf`, stale `kid`.
* **DPoP**: invalid `htu`/`htm`, replayed `jti`, stale `iat`, wrong `jkt`, nonce dance.
* **mTLS**: wrong client cert, wrong CA, thumbprint mismatch.
* **RBAC**: scope enforcement per audience; over-privileged client denied.
* **Rotation**: JWKS rotation while load-testing; zero-downtime verification.
* **HA**: kill one Authority instance; verify issuance continues; JWKS served by peers.
* **Performance**: 1k token issuance/sec on 2 cores with Redis enabled for jti caching.
---
## 16) Threat model & mitigations (summary)
| Threat | Vector | Mitigation |
| ------------------- | ---------------- | ------------------------------------------------------------------------------------------ |
| Token theft         | Copy of JWT      | **Short TTL**, **sender-constraint** (DPoP/mTLS); replay blocked by `jti` cache and nonces |
| Replay across hosts | Reuse DPoP proof | Enforce `htu`/`htm`, `iat` freshness, `jti` uniqueness; services may require **nonce** |
| Impersonation | Fake client | mTLS or `private_key_jwt` with pinned JWK; client registration & rotation |
| Key compromise | Signing key leak | HSM/KMS storage, key rotation, audit; emergency key revoke path; narrow token TTL |
| Cross-tenant abuse  | Scope elevation  | Enforce `aud`, `tid`, `inst` at issuance and resource servers                              |
| Downgrade to bearer | Strip DPoP | Resource servers require DPoP/mTLS based on `aud`; reject bearer without `cnf` |
---
## 17) Deployment & HA
* **Stateless** microservice, containerized; run ≥ 2 replicas behind LB.
* **DB**: HA Postgres (or MySQL) for clients/roles; **Redis** for device codes, DPoP nonces/jtis.
* **Secrets**: mount client JWKs via K8s Secrets/HashiCorp Vault; signing keys via KMS.
* **Backups**: DB daily; Redis not critical (ephemeral).
* **Disaster recovery**: export/import of client registry; JWKS rehydrate from KMS.
* **Compliance**: TLS audit; penetration testing for OIDC flows.
---
## 18) Implementation notes
* Reference stack: **.NET 10** + **OpenIddict 6** (or IdentityServer if licensed) with custom DPoP validator and mTLS binding middleware.
* Keep the DPoP/JTI cache pluggable; allow Redis/Memcached.
* Provide **client SDKs** for C# and Go: DPoP key mgmt, proof generation, nonce handling, token refresh helper.
---
## 19) Quick reference — wire examples
**Access token (payload excerpt)**
```json
{
"iss": "https://authority.internal",
"sub": "scanner-web",
"aud": "signer",
"exp": 1760668800,
"iat": 1760668620,
"nbf": 1760668620,
"jti": "9d9c3f01-6e1a-49f1-8f77-9b7e6f7e3c50",
"scope": "signer.sign",
"tid": "tenant-01",
"inst": "install-7A2B",
"cnf": { "jkt": "KcVb2V...base64url..." }
}
```
**DPoP proof header fields (for POST /sign/dsse)**
```json
{
"htu": "https://signer.internal/sign/dsse",
"htm": "POST",
"iat": 1760668620,
"jti": "4b1c9b3c-8a95-4c58-8a92-9c6cfb4a6a0b"
}
```
Signer validates that `hash(JWK)` in the proof matches `cnf.jkt` in the token.
---
## 20) Rollout plan
1. **MVP**: Client Credentials (private_key_jwt + DPoP), JWKS, short OpToks, per-audience scopes.
2. **Add**: mTLS-bound tokens for Signer/Attestor; device code for CLI; optional introspection.
3. **Hardening**: DPoP nonce support; full audit pipeline; HA tuning.
4. **UX**: Tenant/installation admin UI; role→scope editors; client bootstrap wizards.

389
docs/ARCHITECTURE_CLI.md Normal file

@@ -0,0 +1,389 @@
# component_architecture_cli.md — **StellaOps CLI** (2025Q4)
> **Scope.** Implementation-ready architecture for **StellaOps CLI**: command surface, process model, auth (Authority/DPoP), integration with Scanner/Vexer/Feedser/Signer/Attestor, Buildx plugin management, offline kit behavior, packaging, observability, security posture, and CI ergonomics.
---
## 0) Mission & boundaries
**Mission.** Provide a **fast, deterministic, CI-friendly** command-line interface to drive StellaOps workflows:
* Build-time SBOM generation via **Buildx generator** orchestration.
* Post-build **scan/compose/diff/export** against **Scanner.WebService**.
* **Policy** operations and **VEX/Vuln** data pulls (operator tasks).
* **Verification** (attestation, referrers, signatures) for audits.
* Air-gapped/offline **kit** administration.
**Boundaries.**
* CLI **never** signs; it only calls **Signer**/**Attestor** via backend APIs when needed (e.g., `report --attest`).
* CLI **does not** store long-lived credentials beyond OS keychain; tokens are **short** (Authority OpToks).
* Heavy work (scanning, merging, policy) is executed **server-side** (Scanner/Vexer/Feedser).
---
## 1) Solution layout & runtime form
```
src/
├─ StellaOps.Cli/ # net10.0 (Native AOT) single binary
├─ StellaOps.Cli.Core/ # verb plumbing, config, HTTP, auth
├─ StellaOps.Cli.Plugins/ # optional verbs packaged as plugins
├─ StellaOps.Cli.Tests/ # unit + golden-output tests
└─ packaging/
    ├─ msix / msi / deb / rpm / brew formula
    └─ scoop manifest / winget manifest
```
**Language/runtime**: .NET 10 **Native AOT** for speed/startup; Linux builds use **musl** static when possible.
**OS targets**: linux-x64/arm64, windows-x64/arm64, macOS-x64/arm64.
---
## 2) Command surface (verbs)
> All verbs emit **JSON** output when `--json` is set (CI mode). Human output is concise and deterministic.
### 2.1 Auth & profile
* `auth login`
* Modes: **device-code** (default), **client-credentials** (service principal).
* Produces **Authority** access token (OpTok) + stores **DPoP** keypair in OS keychain.
* `auth status` — show current issuer, subject, audiences, expiry.
* `auth logout` — wipe cached tokens/keys.
### 2.2 Build-time SBOM (Buildx)
* `buildx install` — install/update the **StellaOps.Scanner.Sbomer.BuildXPlugin** on the host.
* `buildx verify` — ensure generator is usable.
* `buildx build` — thin wrapper around `docker buildx build --attest=type=sbom,generator=stellaops/sbom-indexer` with convenience flags:
* `--attest` (request Signer/Attestor via backend post-push)
* `--provenance` passthrough (optional)
### 2.3 Scanning & artifacts
* `scan image <ref|digest>`
* Options: `--force`, `--wait`, `--view=inventory|usage|both`, `--format=cdx-json|cdx-pb|spdx-json`, `--attest` (ask backend to sign/log).
* Streams progress; exits early unless `--wait`.
* `diff image --old <digest> --new <digest> [--view ...]` — show layer-attributed changes.
* `export sbom <digest> [--view ... --format ... --out file]` — download artifact.
* `report final <digest> [--policy-revision ... --attest]` — request PASS/FAIL report from backend (policy+vex) and optional attestation.
### 2.4 Policy & data
* `policy get/set/apply` — fetch active policy, apply staged policy, compute digest.
* `feedser export` — trigger/export canonical JSON or Trivy DB (admin).
* `vexer export` — trigger/export consensus/raw claims (admin).
### 2.5 Verification
* `verify attestation --uuid <rekor-uuid> | --artifact <sha256> | --bundle <path>` — call **Attestor /verify** and print proof summary.
* `verify referrers <digest>` — ask **Signer /verify/referrers** (is image Stella-signed?).
* `verify image-signature <ref|digest>` — standalone cosign verification (optional, local).
### 2.6 Runtime (Zastava helper)
* `runtime policy test --images <digest,...> [--ns <name> --labels k=v,...]` — ask backend `/policy/runtime` like the webhook would.
### 2.7 Offline kit
* `offline kit pull` — fetch latest **Feedser JSON + Trivy DB + Vexer exports** as a tarball from a mirror.
* `offline kit import <tar>` — upload the kit to on-prem services (Feedser/Vexer).
* `offline kit status` — list current seed versions.
### 2.8 Utilities
* `config set/get` — endpoint & defaults.
* `whoami` — short auth display.
* `version` — CLI + protocol versions; release channel.
---
## 3) AuthN: Authority + DPoP
### 3.1 Token acquisition
* **Device-code**: the CLI opens an OIDC device code flow against **Authority**; the browser login is optional for service principals.
* **Client-credentials**: service principals use **private_key_jwt** or **mTLS** to get tokens.
### 3.2 DPoP key management
* On first login, the CLI generates an **ephemeral JWK** (Ed25519) and stores it in the **OS keychain** (Keychain/DPAPI/KWallet/Gnome Keyring).
* Every request to backend services includes a **DPoP proof**; CLI refreshes tokens as needed.
### 3.3 Multi-audience & scopes
* CLI requests **audiences** as needed per verb:
* `scanner` for scan/export/report/diff
* `signer` (indirect; usually backend calls Signer)
* `attestor` for verify
* `feedser`/`vexer` for admin verbs
CLI rejects verbs if required scopes are missing.
---
## 4) Process model & reliability
### 4.1 HTTP client
* Single **HTTP/2** client with connection pooling, DNS pinning, retry/backoff (idempotent GET/POST marked safe).
* **DPoP nonce** handling: on `401` with a nonce challenge, the CLI retries once with the supplied nonce.
### 4.2 Streaming
* `scan` and `report` support **server-sent JSON lines** (progress events).
* `--json` prints machine events; human mode shows compact spinners and crucial updates only.
### 4.3 Exit codes (CI-safe)
| Code | Meaning |
| ---- | ------------------------------------------- |
| 0 | Success |
| 2 | Policy fail (final report verdict=fail) |
| 3 | Verification failed (attestation/signature) |
| 4 | Auth error (invalid/missing token/DPoP) |
| 5 | Resource not found (image/SBOM) |
| 6 | Rate limited / quota exceeded |
| 7 | Backend unavailable (retryable) |
| 9 | Invalid arguments |
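A CI wrapper could interpret these exit codes roughly as follows. The helper is a hypothetical sketch, not part of the CLI; only code 7 is treated as retryable because that is the only code the table marks as such.

```python
# Meanings copied from the exit-code table; anything else is reported as unknown.
EXIT_MEANINGS = {
    0: "success",
    2: "policy fail",
    3: "verification failed",
    4: "auth error",
    5: "resource not found",
    6: "rate limited / quota exceeded",
    7: "backend unavailable",
    9: "invalid arguments",
}
RETRYABLE = {7}


def classify_exit(code: int) -> tuple[str, bool]:
    """Return (meaning, should_retry) for a CLI exit code."""
    return EXIT_MEANINGS.get(code, "unknown"), code in RETRYABLE
```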
---
## 5) Configuration model
**Precedence:** CLI flags → env vars → config file → defaults.
**Config file**: `${XDG_CONFIG_HOME}/stellaops/config.yaml` (Windows: `%APPDATA%\StellaOps\config.yaml`)
```yaml
cli:
  authority: "https://authority.internal"
  backend:
    scanner: "https://scanner-web.internal"
    attestor: "https://attestor.internal"
    feedser: "https://feedser-web.internal"
    vexer: "https://vexer-web.internal"
  auth:
    audienceDefault: "scanner"
    deviceCode: true
  output:
    json: false
    color: auto
  tls:
    caBundle: "/etc/ssl/certs/ca-bundle.crt"
  offline:
    kitMirror: "s3://mirror/stellaops-kit"
```
Environment variables: `STELLAOPS_AUTHORITY`, `STELLAOPS_SCANNER_URL`, etc.
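The precedence chain (flags, then environment, then config file, then defaults) can be sketched as a small resolver. The function and parameter names are illustrative assumptions, and the environment variable name is passed explicitly because only a couple of variable names are documented here.

```python
def resolve_setting(flag_value, env, env_name, config_value, default):
    """Return the first value that is set, in precedence order."""
    for candidate in (flag_value, env.get(env_name), config_value):
        if candidate is not None:
            return candidate
    return default
```

For example, with no `--authority` flag but `STELLAOPS_AUTHORITY` exported, the environment value is chosen over the config file entry.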
---
## 6) Buildx generator orchestration
* `buildx install` locates the Docker root directory, writes the **generator** plugin manifest, and pulls `stellaops/sbom-indexer` image (pinned digest).
* `buildx build` wrapper injects:
* `--attest=type=sbom,generator=stellaops/sbom-indexer`
* `--label org.stellaops.request=sbom`
* Post-build: CLI optionally calls **Scanner.WebService** to **verify referrers**, **compose** image SBOMs, and **attest** via Signer/Attestor.
**Detection**: If Buildx or generator unavailable, CLI falls back to **post-build scan** with a warning.
---
## 7) Artifact handling
* **Downloads** (`export sbom`, `report final`): stream to file; compute sha256 on the fly; write sidecar `.sha256` and optional **verification bundle** (if `--bundle`).
* **Uploads** (`offline kit import`): chunked upload; retry on transient errors; show progress bar (unless `--json`).
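Hashing a download while it streams to disk might look like the sketch below. The function name, the chunk size, and the `sha256sum`-style sidecar layout are assumptions; the document only states that a `.sha256` sidecar is written.

```python
import hashlib
import io
from pathlib import Path


def save_with_digest(stream, dest: Path, chunk_size: int = 64 * 1024) -> str:
    """Copy stream to dest, hashing each chunk as it passes, then write dest + '.sha256'."""
    digest = hashlib.sha256()
    with dest.open("wb") as out:
        while chunk := stream.read(chunk_size):
            digest.update(chunk)
            out.write(chunk)
    hex_digest = digest.hexdigest()
    sidecar = dest.with_name(dest.name + ".sha256")
    # "<hex>  <filename>" is the format sha256sum -c understands (assumed layout).
    sidecar.write_text(f"{hex_digest}  {dest.name}\n", encoding="utf-8")
    return hex_digest
```

Any object with a `read(n)` method works as `stream`, for instance `io.BytesIO` in tests or an HTTP response body.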
---
## 8) Security posture
* **DPoP private keys** stored in **OS keychain**; metadata cached in config.
* **No plaintext tokens** on disk; short-lived **OpToks** held in memory.
* **TLS**: verify backend certificates; allow custom CA bundle for on-prem.
* **Redaction**: CLI logs remove `Authorization`, DPoP headers, PoE tokens.
* **Supply chain**: CLI distribution binaries are **cosign-signed**; `stellaops version --verify` checks its own signature.
---
## 9) Observability
* `--verbose` adds request IDs, timings, and retry traces.
* **Metrics** (optional, disabled by default): Prometheus text file exporter for local monitoring in long-running agents.
* **Structured logs** (`--json`): per-event JSON lines with `ts`, `verb`, `status`, `latencyMs`.
---
## 10) Performance targets
* Startup ≤ **20 ms** (AOT).
* `scan image` request/response overhead ≤ **5 ms** (excluding server work).
* Buildx wrapper overhead negligible (< 1 ms).
* Large artifact download (100 MB) sustained **80 MB/s** on local networks.
---
## 11) Tests & golden outputs
* **Unit tests**: argument parsing, config precedence, URL resolution, DPoP proof creation.
* **Integration tests** (Testcontainers): mock Authority/Scanner/Attestor; CI pipeline with fake registry.
* **Golden outputs**: verb snapshots for `--json` across OSes; kept in `tests/golden/…`.
* **Contract tests**: ensure API shapes match service OpenAPI; fail build if incompatible.
---
## 12) Error envelopes (human + JSON)
**Human:**
```
✖ Policy FAIL: 3 high, 1 critical (VEX suppressed 12)
- pkg:rpm/openssl (CVE-2025-12345) — affected (vendor) — fixed in 3.0.14
- pkg:npm/lodash (GHSA-xxxx) — affected — no fix
See: https://ui.internal/scans/sha256:...
Exit code: 2
```
**JSON (`--json`):**
```json
{ "event":"report", "status":"fail", "critical":1, "high":3, "url":"https://ui..." }
```
---
## 13) Admin & advanced flags
* `--authority`, `--scanner`, `--attestor`, `--feedser`, `--vexer` override config URLs.
* `--no-color`, `--quiet`, `--json`.
* `--timeout`, `--retries`, `--retry-backoff-ms`.
* `--ca-bundle`, `--insecure` (dev only; prints warning).
* `--trace` (dump HTTP traces to file; scrubbed).
---
## 14) Interop with other tools
* Emits **CycloneDX Protobuf** directly to stdout when `export sbom --format cdx-pb --out -`.
* Pipes to `jq`/`yq` cleanly in JSON mode.
* Can act as a **credential helper** for scripts: `stellaops auth token --aud scanner` prints a one-shot token for curl.
---
## 15) Packaging & distribution
* **Installers**: deb/rpm (postinst registers completions), Homebrew, Scoop, Winget, MSI/MSIX.
* **Shell completions**: bash/zsh/fish/pwsh.
* **Update channel**: `stellaops self-update` (optional) fetches cosign-signed release manifest; corporate environments can disable.
---
## 16) Security hard lines
* Refuse to print token values; redact Authorization headers in verbose output.
* Disallow `--insecure` unless `STELLAOPS_CLI_ALLOW_INSECURE=1` set (double opt-in).
* Enforce **short token TTL**; refresh proactively when < 30 s left.
* Device-code cache binding to **machine** and **user** (protect against copy to other machines).
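The proactive refresh rule reduces to a comparison against the token's `exp` claim. A minimal sketch, with an assumed function name and the 30-second threshold taken from the list above:

```python
def needs_refresh(exp: int, now: int, threshold_s: int = 30) -> bool:
    """True when fewer than threshold_s seconds of token lifetime remain."""
    return exp - now < threshold_s
```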
---
## 17) Wire sequences
**A) Scan & wait with attestation**
```mermaid
sequenceDiagram
autonumber
participant CLI
participant Auth as Authority
participant SW as Scanner.WebService
participant SG as Signer
participant AT as Attestor
CLI->>Auth: device code flow (DPoP)
Auth-->>CLI: OpTok (aud=scanner)
CLI->>SW: POST /scans { imageRef, attest:true }
SW-->>CLI: { scanId }
CLI->>SW: GET /scans/{id} (poll)
SW-->>CLI: { status: completed, artifacts, rekor? } # if attested
alt attestation pending
SW->>SG: POST /sign/dsse (server-side)
SG-->>SW: DSSE
SW->>AT: POST /rekor/entries
AT-->>SW: { uuid, proof }
end
CLI->>SW: GET /sboms/<digest>?format=cdx-pb&view=usage
SW-->>CLI: bytes
```
**B) Verify attestation by artifact**
```mermaid
sequenceDiagram
autonumber
participant CLI
participant AT as Attestor
CLI->>AT: POST /rekor/verify { artifactSha256 }
AT-->>CLI: { ok:true, uuid, index, logURL }
```
---
## 18) Roadmap (CLI)
* `scan fs <path>` (local filesystem tree) upload to backend for analysis.
* `policy test --sbom <file>` (simulate policy results offline using local policy bundle).
* `runtime capture` (developer mode) capture small `/proc/<pid>/maps` for troubleshooting.
* Pluggable output renderers for SARIF/HTML (admin-controlled).
---
## 19) Example CI snippets
**GitHub Actions (post-build)**
```yaml
- name: Login (device code w/ OIDC broker)
  run: stellaops auth login --json --authority ${{ secrets.AUTHORITY_URL }}
- name: Scan
  run: stellaops scan image ${{ steps.build.outputs.digest }} --wait --json
- name: Export (usage view, protobuf)
  run: stellaops export sbom ${{ steps.build.outputs.digest }} --view usage --format cdx-pb --out sbom.pb
- name: Verify attestation
  run: stellaops verify attestation --artifact $(sha256sum sbom.pb | cut -d' ' -f1) --json
```
**GitLab (buildx generator)**
```yaml
script:
  - stellaops buildx install
  - docker buildx build --attest=type=sbom,generator=stellaops/sbom-indexer -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
  - stellaops scan image $CI_REGISTRY_IMAGE@$IMAGE_DIGEST --wait --json
```
---
## 20) Test matrix (OS/arch)
* Linux: ubuntu-20.04/22.04/24.04 (x64, arm64), alpine (musl).
* macOS: 13-15 (x64, arm64).
* Windows: 10/11, Server 2019/2022 (x64, arm64).
* Docker engines: Docker Desktop, containerd-based runners.

462
docs/ARCHITECTURE_DEVOPS.md Normal file

@@ -0,0 +1,462 @@
# component_architecture_devops.md — **StellaOps Release & Operations** (2025Q4)
> **Scope.** Implementation-ready blueprint for **how StellaOps is built, versioned, signed, distributed, upgraded, licensed (PoE)**, and operated in customer environments (online and air-gapped). Covers reproducible builds, supply-chain attestations, registries, offline kits, migration/rollback, artifact lifecycle (MinIO/Mongo), monitoring SLOs, and customer activation.
---
## 0) Product vision (operations lens)
StellaOps must be **trustable at a glance** and **boringly operable**:
* Every release ships with **first-party SBOMs, provenance, and signatures**; services verify **each other's** integrity at runtime.
* Customers can deploy by **digest** and stay aligned with **LTS/stable/edge** channels.
* Paid customers receive **attestation authority** (Signer accepts their PoE) while the core platform remains **free to run**.
* Air-gapped customers receive **offline kits** with verifiable digests and deterministic import.
* Artifacts expire predictably; operators know what's kept, for how long, and why.
---
## 1) Release trains & versioning
### 1.1 Channels
* **LTS** (12-month support window): quarterly cadence (Q1/Q2/Q3/Q4).
* **Stable** (default): monthly rollup (bug fixes + compatible features).
* **Edge**: weekly; for early adopters, no guarantees.
### 1.2 Version strings
Semantic core + calendar tag:
```
<MAJOR>.<MINOR>.<PATCH> (<YYYY>.<MM>) e.g., 2.4.1 (2027.06)
```
* **MAJOR**: breaking API/DB changes (rare).
* **MINOR**: new features, compatible schema migrations (expand/contract pattern).
* **PATCH**: bug fixes, perf and security updates.
* **Calendar tag** exposes **release year** used by Signer for **PoE window checks**.
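Parsing the combined version string is straightforward. The sketch below assumes the exact `MAJOR.MINOR.PATCH (YYYY.MM)` layout shown above; the type and function names are hypothetical.

```python
import re
from typing import NamedTuple


class ReleaseVersion(NamedTuple):
    major: int
    minor: int
    patch: int
    year: int
    month: int


_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)\s+\((\d{4})\.(\d{2})\)$")


def parse_release_version(text: str) -> ReleaseVersion:
    """Split 'MAJOR.MINOR.PATCH (YYYY.MM)' into its semantic and calendar parts."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a release version string: {text!r}")
    return ReleaseVersion(*map(int, match.groups()))
```

The `year` field is the part a PoE window check would compare against.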
### 1.3 Component alignment
A release is a **bundle** of image digests + charts + manifests. All services in a bundle are **wire-compatible**. Mixed minor versions are allowed within a bounded skew:
* **Web UI ↔ backend**: `±1 minor`.
* **Scanner ↔ Policy/Vexer/Feedser**: `±1 minor`.
* **Authority/Signer/Attestor triangle**: **must** be same minor (crypto and DPoP/mTLS binding rules).
At startup, services **self-advertise** their semver & channel; the UI surfaces **mismatch warnings**.
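The skew rules lend themselves to a small compatibility check such as the one a UI might run before showing a mismatch warning. This is a sketch: the function name and the lowercase service identifiers are assumptions, and requiring equal major versions is inferred from the statement that MAJOR marks breaking changes.

```python
# Services that must run the same minor version (Authority/Signer/Attestor triangle).
SAME_MINOR = {"authority", "signer", "attestor"}


def skew_ok(a_name: str, a_version: tuple, b_name: str, b_version: tuple) -> bool:
    """Versions are (major, minor) pairs; returns True when the pair is within allowed skew."""
    (a_major, a_minor), (b_major, b_minor) = a_version, b_version
    if a_major != b_major:
        return False
    allowed = 0 if {a_name, b_name} <= SAME_MINOR else 1
    return abs(a_minor - b_minor) <= allowed
```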
---
## 2) Supply-chain pipeline (how a release is built)
### 2.1 Deterministic builds
* **Builders**: isolated **BuildKit** workers with pinned base images (digest only).
* **Pinning**: lock files or `go.mod`, `package-lock.json`, `global.json`, `Directory.Packages.props` are **frozen** at tag.
* **Reproducibility**: timestamps normalized; source date epoch; deterministic zips/tars.
* **Multi-arch**: linux/amd64 + linux/arm64 (Windows images track M2 roadmap).
### 2.2 First-party SBOMs & provenance
* Each image gets **CycloneDX (JSON+Protobuf) SBOM** and **SLSA-style provenance** attached as **OCI referrers**.
* Scanner's **Buildx generator** is used to produce SBOMs *during* build; a separate post-build scan verifies parity (red flag if drift).
* **Release manifest** (see §6.1) lists all digests and SBOM/attestation refs.
### 2.3 Signing & transparency
* Images are **cosign-signed** (keyless) with a StellaOps release identity; inclusion in a **transparency log** (Rekor) is required.
* SBOM and provenance attestations are **DSSE** and also transparency-logged.
* Release keys (Fulcio roots or public keys) are embedded in **Signer** policy (for **scanner-release validation** at customer side).
### 2.4 Gates & tests
* **Static**: linters, codegen checks, protobuf API freeze (backward-compat tests).
* **Unit/integration**: per-component, plus **end-to-end** flows (scan→vex→policy→sign→attest).
* **Perf SLOs**: hot paths (SBOM compose, diff, export) measured against budgets.
* **Security**: dependency audit vs Feedser export; container hardening tests; minimal caps.
* **Canary cohort**: internal staging + selected customers; one week on **edge** before **stable** tag.
---
## 3) Distribution & activation
### 3.1 Registries
* **Primary**: `registry.stella-ops.org` (OCI v2, supports Referrers API).
* **Mirrors**: GHCR (read-only), regional mirrors for latency.
* **Pull by digest only** in Kubernetes/Compose manifests.
**Gating policy**:
* **Core images** (Authority, Scanner, Feedser, Vexer, Attestor, UI): public **read**.
* **Enterprise add-ons** (if any) and **pre-release**: private repos via OAuth2 token service.
> Monetization lever is **signing** (PoE gate), not image pulls, so the core remains simple to consume.
### 3.2 OAuth2 token service (for private repos)
* Docker Registry's token flow backed by **Authority**:
  1. Client hits registry (`401` with `WWW-Authenticate: Bearer realm=…`).
  2. Client gets an **access token** from the token service (validated by Authority) with `scope=repository:…:pull`.
  3. Registry allows pull for the requested repo.
* Tokens are **short-lived** (60-300 s) and **DPoP-bound**.
### 3.3 Offline kits (air-gapped)
* Tarball per release channel:
```
stellaops-kit-<ver>-<channel>.tar.zst
  /images/     OCI layout with all first-party images (multi-arch)
  /sboms/      CycloneDX JSON+PB for each image
  /attest/     DSSE bundles + Rekor proofs
  /charts/     Helm charts + values templates
  /compose/    docker-compose.yml + .env template
  /plugins/    Feedser/Vexer connectors (restart-time)
  /policy/     example policies
  /manifest/   release.yaml (see §6.1)
```
* Import via CLI `offline kit import`; checks digests and signatures before load.
---
## 4) Licensing (PoE) & monetization
**Principle**: **Only paid StellaOps issues valid signed attestations.** Running the stack is free; signing requires PoE.
### 4.1 PoE issuance
* Customers purchase a plan and obtain a **PoE artifact** from `www.stella-ops.org`:
* **PoE-JWT** (DPoP/mTLS-bound) **or** **PoE mTLS client certificate**.
* Contains: `license_id`, `plan`, `valid_release_year`, `max_version`, `exp`, optional `tenant/customer` IDs.
### 4.2 Online enforcement
* **Signer** calls **Licensing /license/introspect** on every signing request (see signer doc).
* If **revoked/expired/out-of-window** → deny with machine-readable reason.
* All **valid** bundles are DSSE-signed and **Attestor** logs them; Rekor UUID returned.
* UI badges: “**Verified by StellaOps**” with link to the public log.
### 4.3 Air-gapped / offline
* Customers obtain a **time-boxed PoE lease** (signed JSON, 7-30 days).
* Signer accepts the lease and emits **provisional** attestations (clearly labeled).
* When connectivity returns, a background job **endorses** the provisional entries with the cloud service, updating their status to **verified**.
* Operators can export a **verification bundle** for auditors even before endorsement (contains DSSE + local Rekor proof + lease snapshot).
### 4.4 Stolen/abused PoE
* Customers report theft; **Licensing** flags `license_id` as **revoked**.
* Subsequent Signer requests **deny**; previous attestations remain but can be marked **contested** (UI shows badge, optional re-sign path upon new PoE).
---
## 5) Deployment path (customer side)
### 5.1 First install
* **Helm** (Kubernetes) or **Compose** (VMs). Example (K8s):
```bash
helm repo add stellaops https://charts.stella-ops.org
helm install stella stellaops/platform \
--version 2.4.0 \
--set global.channel=stable \
--set authority.issuer=https://authority.stella.local \
--set scanner.minio.endpoint=http://minio.stella.local:9000 \
--set scanner.mongo.uri=mongodb://mongo/scanner \
--set feedser.mongo.uri=mongodb://mongo/feedser \
--set vexer.mongo.uri=mongodb://mongo/vexer
```
* Post-install job registers **Authority clients** (Scanner, Signer, Attestor, UI) and prints **bootstrap** URLs and client credentials (sealed secrets).
* UI banner shows **release bundle** and verification state (cosign OK? Rekor OK?).
### 5.2 Updates
* **Blue/green**: pull new bundle by **digest**; deploy side-by-side; cut traffic.
* **Rolling**: upgrade components in safe order:
  1. Authority (stateless, dual-key rotation ready)
  2. Signer/Attestor (same minor)
  3. Scanner WebService & Workers
  4. Feedser, then Vexer (schema migrations are expand/contract)
  5. UI last
* **DB migrations** are **expand/contract**:
* Phase A (release N): **add** new fields/indexes, write old+new.
* Phase B (N+1): **read** new fields; **drop** old.
* Rollback is a matter of redeploying previous images and keeping both schemas valid.
### 5.3 Rollback
* Images referenced by **digest**; keep previous release manifest `K` versions back.
* `helm rollback` or compose `docker compose -f release-K.yml up -d`.
* Mongo migrations are additive; **no destructive changes** within a single minor.
---
## 6) Release payloads & manifests
### 6.1 Release manifest (`release.yaml`)
```yaml
release:
  version: "2.4.1"
  channel: "stable"
  date: "2027-06-20T12:00:00Z"
  calendar: "2027.06"
  components:
    - name: scanner-webservice
      image: registry.stella-ops.org/stellaops/scanner-web@sha256:aa..bb
      sbom: oci://.../referrers/cdx-json@sha256:11..22
      provenance: oci://.../attest/provenance@sha256:33..44
      signature: { rekorUUID: "…" }
    - name: signer
      image: registry.stella-ops.org/stellaops/signer@sha256:cc..dd
      signature: { rekorUUID: "…" }
  charts:
    - name: platform
      version: "2.4.1"
      digest: "sha256:ee..ff"
  compose:
    file: "docker-compose.yml"
    digest: "sha256:77..88"
  checksums:
    sha256: "… digest of this release.yaml …"
```
The manifest is **cosign-signed**; UI/CLI can verify a bundle without talking to registries.
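
A minimal sketch of the offline digest check, assuming the `sha256:<hex>` form shown in the manifest (for example the `compose.digest` field). The helper names are hypothetical, and verifying the cosign signature on the manifest itself is a separate step that is not shown here.

```python
import hashlib
import hmac


def sha256_digest(data: bytes) -> str:
    """Return the digest in the `sha256:<hex>` form used by the manifest."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def matches_manifest(data: bytes, expected_digest: str) -> bool:
    """Compare a locally computed digest with the value recorded in the manifest."""
    # compare_digest avoids leaking the position of the first mismatch
    return hmac.compare_digest(sha256_digest(data), expected_digest)
```

A caller would read the compose file from the bundle as bytes and pass the manifest value as `expected_digest`; a mismatch should abort the import.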
### 6.2 Image labels (release metadata)
Each image sets OCI labels:
```
org.opencontainers.image.version = "2.4.1"
org.opencontainers.image.revision = "<git sha>"
org.opencontainers.image.created = "2027-06-20T12:00:00Z"
org.stellaops.release.calendar = "2027.06"
org.stellaops.release.channel = "stable"
org.stellaops.build.slsaProvenance = "oci://…"
```
Signer validates the **scanner** image's cosign identity and calendar tag for **release window** checks.
---
## 7) Artifact lifecycle & storage (MinIO/Mongo)
### 7.1 Buckets & prefixes (MinIO)
```
s3://stellaops/
  scanner/
    layers/<sha256>/sbom.cdx.json.zst
    images/<imgDigest>/inventory.cdx.pb
    images/<imgDigest>/usage.cdx.pb
    diffs/<old>_<new>/diff.json.zst
    attest/<artifactSha256>.dsse.json
  feedser/
    json/<exportId>/...
    trivy/<exportId>/...
  vexer/
    exports/<exportId>/...
  attestor/
    dsse/<bundleSha256>.json
    proof/<rekorUuid>.json
```
### 7.2 ILM classes
* **`short`**: working artifacts (diffs, queues); TTL 7-14 days.
* **`default`**: SBOMs & indexes; TTL 90-180 days (configurable).
* **`compliance`**: signed reports & attested exports; **Object Lock** (governance/compliance) 1-7 years.
### 7.3 Artifact Lifecycle Controller (ALC)
* A background worker (part of Scanner.WebService) enforces **TTL** and **reference counting**:
* Artifacts referenced by **reports** or **tickets** are pinned.
* ILM actions logged; UI shows per-class usage & upcoming purges.
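
The purge decision could look roughly like the sketch below. It assumes the upper bound of each TTL range from 7.2 and treats the `compliance` class as never purged by the controller (retention is left to Object Lock); the function name, the `TTL` table and the `ref_count` parameter are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumed TTLs: upper bound of the ranges listed for each ILM class.
TTL = {
    "short": timedelta(days=14),
    "default": timedelta(days=180),
}


def should_purge(ilm_class: str, created_at: datetime, ref_count: int, now: datetime) -> bool:
    """Decide whether an artifact may be deleted."""
    if ref_count > 0:
        return False  # pinned by a report or ticket
    ttl = TTL.get(ilm_class)
    if ttl is None:
        return False  # e.g. "compliance": not handled by TTL in this sketch
    return now - created_at > ttl
```

Checking the reference count before the TTL keeps pinned artifacts alive regardless of age.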
### 7.4 Mongo retention
* **Scanner**: `runtime.events` use TTL (e.g., 30-90 days); **catalog** permanent.
* **Feedser/Vexer**: raw docs keep **last N windows**; canonical stores permanent.
* **Attestor**: `entries` permanent; `dedupe` TTL 24-48h.
---
## 8) Observability & SLOs (operations)
* **Uptime SLO**: 99.9% for Signer/Authority/Attestor; 99.5% for Scanner WebService; Vexer/Feedser 99.0%.
* **Error budgets**: tracked per month; dashboards show burn rates.
* **Golden signals**:
* **Latency**: token issuance, sign→attest round-trip, scan enqueue→emit, export build.
* **Saturation**: queue depth, Mongo write IOPS, MinIO net throughput.
* **Traffic**: scans/min, attestations/min, webhook admits/min.
* **Errors**: 5xx rates, cosign verification failures, Rekor timeouts.
Prometheus + OTLP; Grafana dashboards ship in the charts.
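
As a worked example of the monthly error budgets implied by the SLOs above, the following sketch converts an SLO percentage into allowed downtime and computes a simple burn rate. The 30-day month and both function names are assumptions made for illustration.

```python
def error_budget_minutes(slo_percent: float, days: int = 30) -> float:
    """Minutes of allowed unavailability in the period for a given SLO."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - slo_percent) / 100.0


def burn_rate(bad_minutes: float, elapsed_days: float, slo_percent: float, days: int = 30) -> float:
    """Fraction of budget consumed divided by fraction of the period elapsed.

    A value above 1.0 means the budget will run out before the period ends.
    """
    budget = error_budget_minutes(slo_percent, days)
    return (bad_minutes / budget) / (elapsed_days / days)
```

Over 30 days this gives roughly 43.2 minutes for 99.9%, 216 minutes for 99.5% and 432 minutes for 99.0%.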
---
## 9) Security & compliance operations
* **Key rotation**:
* Authority JWKS: 60-day cadence, dual-key overlap.
* Release signing identities: rotate per minor or quarterly.
* Sigstore roots mirrored and pinned; alarms on drift.
* **FIPS mode** (Gov build):
* Enforce `ES256` + KMS/HSM; disable Ed25519; MLS ciphers only.
* Local **Rekor v2** and **Fulcio** alternatives; **air-gapped** CA.
* **Vulnerability response**:
* Feedser red-flag advisories trigger accelerated **stable** patch rollout; UI/CLI “security patch available” notice.
* **Backups/DR**:
* Mongo nightly snapshots; MinIO versioning + replication (if configured).
* Restore runbooks tested quarterly with synthetic data.
---
## 10) Customer update flow (how versions are fetched & activated)
### 10.1 Online clusters
* **UI** surfaces update banner with **release manifest** diff and risk notes.
* Operator approves → **Controller** pulls new images by digest; health-checks; moves traffic; deprecates old revision.
* Post-switch, **schema Phase B** migrations (if any) run automatically.
### 10.2 Air-gapped clusters
* Operator downloads **offline kit** from a mirror → `stellaops offline kit import`.
* Controller validates bundle checksums and **cosign signatures**; applies charts/compose by digest.
* After install, **verify** page shows green checks: image sigs, SBOMs attached, provenance logged.
### 10.3 CLI self-update (optional)
* `stellaops self-update` pulls a **signed release manifest** and verifies the **CLI binary** with cosign before swapping (admin can disable).
---
## 11) Compatibility & deprecation policy
* **APIs** are stable within a **major**; breaking changes imply **MAJOR++** and a deprecation period of one minor.
* **Storage**: expand/contract; “drop old fields” only after one minor grace.
* **Config**: feature flags (default off) for risky features (e.g., eBPF).
---
## 12) Runbooks (selected)
### 12.1 Lost PoE
1. Suspend **automatic attestation** jobs.
2. Use CLI `stellaops signer status` to confirm `entitlement_denied`.
3. Obtain new PoE from portal; verify on Signer `/poe/verify`.
4. Re-enable; optionally **re-sign** last N reports (UI button → batch).
### 12.2 Rekor outage (self-hosted)
* Attestor returns `202 (pending)` with queued proof fetch.
* Keep DSSE bundles locally; resubmit on schedule; UI badge shows **Pending**.
* If outage > SLA, you can switch to a **mirror** log in config; Attestor writes to both when restored.
### 12.3 Emergency downgrade
* Identify prior release manifest (UI → Admin → Releases).
* `helm rollback stella <revision>` (or compose apply previous file).
* Services tolerate skew per §1.3; ensure **Signer/Authority/Attestor** are rolled together.
---
## 13) Example: cluster bootstrap (Compose)
```yaml
version: "3.9"
services:
  authority:
    image: registry.stella-ops.org/stellaops/authority@sha256:...
    env_file: ./env/authority.env
    ports: ["8440:8440"]
  signer:
    image: registry.stella-ops.org/stellaops/signer@sha256:...
    depends_on: [authority]
    environment:
      - SIGNER__POE__LICENSING__INTROSPECTURL=https://www.stella-ops.org/api/v1/license/introspect
  attestor:
    image: registry.stella-ops.org/stellaops/attestor@sha256:...
    depends_on: [signer]
  scanner-web:
    image: registry.stella-ops.org/stellaops/scanner-web@sha256:...
    environment:
      - SCANNER__S3__ENDPOINT=http://minio:9000
  scanner-worker:
    image: registry.stella-ops.org/stellaops/scanner-worker@sha256:...
    deploy: { replicas: 4 }
  feedser:
    image: registry.stella-ops.org/stellaops/feedser@sha256:...
  vexer:
    image: registry.stella-ops.org/stellaops/vexer@sha256:...
  web-ui:
    image: registry.stella-ops.org/stellaops/web-ui@sha256:...
  mongo:
    image: mongo:7
  minio:
    image: minio/minio:RELEASE.2025-07-10T00-00-00Z
```
---
## 14) Governance & keys (who owns the trust root)
* **Release key policy**: only the Release Engineering group can push signed releases; 4-eyes approval; TUF-style manifest possible in future.
* **Signer acceptance policy**: embedded release identities are updated **only** via minor upgrade; emergency CRL supported.
* **Customer keys**: none needed for core use; enterprise add-ons may require per-customer registries and keys.
---
## 15) Roadmap (Ops)
* **Windows containers GA** (Scanner + Zastava).
* **Key Transparency** for Signer certs.
* **Delta-kit** (offline) for incremental updates.
* **Operator CRDs** (K8s) to manage policy and ILM declaratively.
* **SBOM protobuf** as default transport at rest (smaller, faster).
---
### Appendix A — Minimal SLO monitors
* `authority.tokens_issued_total` slope ≈ normal.
* `signer.requests_total{result="success"}/minute` > 0 (when scans occur).
* `attestor.submit_latency_seconds{quantile=0.95}` < 0.3.
* `scanner.scan_latency_seconds{quantile=0.95}` < target per image size.
* `feedser.export.duration_seconds` stable; `vexer.consensus.conflicts_total` not exploding after policy changes.
* MinIO `s3_requests_errors_total` near zero; Mongo `opcounters` hit expected baseline.
### Appendix B — Upgrade safety checklist
* Verify **release manifest** signature.
* Ensure **Signer/Authority/Attestor** are same minor.
* Verify **DB backups** < 24h old.
* Confirm **ILM** won't purge compliance artifacts during upgrade window.
* Roll **one component** at a time; watch SLOs; abort on regression.
---
**End — component_architecture_devops.md**


@@ -1,190 +1,433 @@
# ARCHITECTURE.md — **StellaOps.Feedser**
# component_architecture_feedser.md — **StellaOps Feedser** (2025Q4)
> **Goal**: Build a sovereign-ready, self-hostable **feed-merge service** that ingests authoritative vulnerability sources, normalizes and de-duplicates them into **MongoDB**, and exports **JSON** and **Trivy-compatible DB** artifacts.
> **Form factor**: Long-running **Web Service** with **REST APIs** (health, status, control) and an embedded **internal cron scheduler**. Controllable by StellaOps.Cli (# stella db ...)
> **No signing inside Feedser** (signing is a separate pipeline step).
> **Runtime SDK baseline**: .NET 10 Preview 7 (SDK 10.0.100-preview.7.25380.108) targeting `net10.0`, aligned with the deployed api.stella-ops.org service.
> **Four explicit stages**:
>
> 1. **Source Download** → raw documents.
> 2. **Parse & Normalize** → schema-validated DTOs enriched with canonical identifiers.
> 3. **Merge & Deduplicate** → precedence-aware canonical records persisted to MongoDB.
> 4. **Export** → JSON or TrivyDB (full or delta), then (externally) sign/publish.
> **Scope.** Implementation-ready architecture for **Feedser**: the vulnerability ingest/normalize/merge/export subsystem that produces deterministic advisory data for the Scanner + Policy + Vexer pipeline. Covers domain model, connectors, merge rules, storage schema, exports, APIs, performance, security, and test matrices.
---
## 1) Naming & Solution Layout
## 0) Mission & boundaries
**Source connectors** namespace prefix: `StellaOps.Feedser.Source.*`
**Exporters**:
**Mission.** Acquire authoritative **vulnerability advisories** (vendor PSIRTs, distros, OSS ecosystems, CERTs), normalize them into a **canonical model**, reconcile aliases and version ranges, and export **deterministic artifacts** (JSON, Trivy DB) for fast backend joins.
* `StellaOps.Feedser.Exporter.Json`
* `StellaOps.Feedser.Exporter.TrivyDb`
**Boundaries.**
**Projects** (`/src`):
* Feedser **does not** sign with private keys. When attestation is required, the export artifact is handed to the **Signer**/**Attestor** pipeline (out-of-process).
* Feedser **does not** decide PASS/FAIL; it provides data to the **Policy** engine.
* Online operation is **allowlist-only**; air-gapped deployments use the **Offline Kit**.
---
## 1) Topology & processes
**Process shape:** single ASP.NET Core service `StellaOps.Feedser.WebService` hosting:
* **Scheduler** with distributed locks (Mongo backed).
* **Connectors** (fetch/parse/map).
* **Merger** (canonical record assembly + precedence).
* **Exporters** (JSON, Trivy DB).
* **Minimal REST** for health/status/trigger/export.
**Scale:** HA by running N replicas; **locks** prevent overlapping jobs per source/exporter.
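
To illustrate how lease-based locks keep replicas from running the same job twice, here is a small in-memory sketch. It is not the Mongo implementation: the class, its method names and the millisecond clock passed in by the caller are assumptions, and a real store would need an atomic conditional write instead of a Python dict.

```python
class LeaseLocks:
    """In-memory stand-in for a lock collection keyed by job (illustration only)."""

    def __init__(self):
        self._locks = {}  # job_key -> {"holder", "heartbeatAt", "leaseMs"}

    def try_acquire(self, job_key: str, holder: str, now_ms: int, lease_ms: int) -> bool:
        current = self._locks.get(job_key)
        if current is not None and current["holder"] != holder:
            lease_alive = now_ms - current["heartbeatAt"] < current["leaseMs"]
            if lease_alive:
                return False  # another replica still owns the job
        self._locks[job_key] = {"holder": holder, "heartbeatAt": now_ms, "leaseMs": lease_ms}
        return True

    def heartbeat(self, job_key: str, holder: str, now_ms: int) -> bool:
        current = self._locks.get(job_key)
        if current is None or current["holder"] != holder:
            return False  # lock was lost or taken over
        current["heartbeatAt"] = now_ms
        return True

    def release(self, job_key: str, holder: str) -> None:
        current = self._locks.get(job_key)
        if current is not None and current["holder"] == holder:
            del self._locks[job_key]
```

A replica that stops sending heartbeats loses the lock once the lease elapses, so a crashed worker cannot block a source indefinitely.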
---
## 2) Canonical domain model
> Stored in MongoDB (database `feedser`), serialized with a **canonical JSON** writer (stable order, camelCase, normalized timestamps).
### 2.1 Core entities
**Advisory**
```
StellaOps.Feedser.WebService/ # ASP.NET Core (Minimal API, net10.0 preview) WebService + embedded scheduler
StellaOps.Feedser.Core/ # Domain models, pipelines, merge/dedupe engine, jobs orchestration
StellaOps.Feedser.Models/ # Canonical POCOs, JSON Schemas, enums
StellaOps.Feedser.Storage.Mongo/ # Mongo repositories, GridFS access, indexes, resume "flags"
StellaOps.Feedser.Source.Common/ # HTTP clients, rate-limiters, schema validators, parsers utils
StellaOps.Feedser.Source.Cve/
StellaOps.Feedser.Source.Nvd/
StellaOps.Feedser.Source.Ghsa/
StellaOps.Feedser.Source.Osv/
StellaOps.Feedser.Source.Jvn/
StellaOps.Feedser.Source.CertCc/
StellaOps.Feedser.Source.Kev/
StellaOps.Feedser.Source.Kisa/
StellaOps.Feedser.Source.CertIn/
StellaOps.Feedser.Source.CertFr/
StellaOps.Feedser.Source.CertBund/
StellaOps.Feedser.Source.Acsc/
StellaOps.Feedser.Source.Cccs/
StellaOps.Feedser.Source.Ru.Bdu/ # HTML→schema with LLM fallback (gated)
StellaOps.Feedser.Source.Ru.Nkcki/ # PDF/HTML bulletins → structured
StellaOps.Feedser.Source.Vndr.Msrc/
StellaOps.Feedser.Source.Vndr.Cisco/
StellaOps.Feedser.Source.Vndr.Oracle/
StellaOps.Feedser.Source.Vndr.Adobe/ # APSB ingest; emits vendor RangePrimitives with adobe.track/platform/priority telemetry + fixed-status provenance.
StellaOps.Feedser.Source.Vndr.Apple/
StellaOps.Feedser.Source.Vndr.Chromium/
StellaOps.Feedser.Source.Vndr.Vmware/
StellaOps.Feedser.Source.Distro.RedHat/
StellaOps.Feedser.Source.Distro.Debian/ # Fetches DSA list + detail HTML, emits EVR RangePrimitives with per-release provenance and telemetry.
StellaOps.Feedser.Source.Distro.Ubuntu/ # Ubuntu Security Notices connector (JSON index → EVR ranges with ubuntu.pocket telemetry).
StellaOps.Feedser.Source.Distro.Suse/ # CSAF fetch pipeline emitting NEVRA RangePrimitives with suse.status vendor telemetry.
StellaOps.Feedser.Source.Ics.Cisa/
StellaOps.Feedser.Source.Ics.Kaspersky/
StellaOps.Feedser.Normalization/ # Canonical mappers, validators, version-range normalization
StellaOps.Feedser.Merge/ # Identity graph, precedence, deterministic merge
StellaOps.Feedser.Exporter.Json/
StellaOps.Feedser.Exporter.TrivyDb/
StellaOps.Feedser.<Component>.Tests/ # Component-scoped unit/integration suites (Core, Storage.Mongo, Source.*, Exporter.*, WebService, etc.)
advisoryId // internal GUID
advisoryKey // stable string key (e.g., CVE-2025-12345 or vendor ID)
title // short title (best-of from sources)
summary // normalized summary (English; i18n optional)
published // earliest source timestamp
modified // latest source timestamp
severity // normalized {none, low, medium, high, critical}
cvss // {v2?, v3?, v4?} objects (vector, baseScore, severity, source)
exploitKnown // bool (e.g., KEV/active exploitation flags)
references[] // typed links (advisory, kb, patch, vendor, exploit, blog)
sources[] // provenance for traceability (doc digests, URIs)
```
---
**Alias**
## 2) Runtime Shape
```
advisoryId
scheme // CVE, GHSA, RHSA, DSA, USN, MSRC, etc.
value // e.g., "CVE-2025-12345"
```
**Process**: single service (`StellaOps.Feedser.WebService`)
**Affected**
* `Program.cs`: top-level entry using **Generic Host**, **DI**, **Options** binding from `appsettings.json` + environment + optional `feedser.yaml`.
* Built-in **scheduler** (cron-like) + **job manager** with **distributed locks** in Mongo to prevent overlaps, enforce timeouts, allow cancel/kill.
* **REST APIs** for health/readiness/progress/trigger/kill/status.
```
advisoryId
productKey // canonical product identity (see 2.2)
rangeKind // semver | evr | nvra | apk | rpm | deb | generic | exact
introduced? // string (format depends on rangeKind)
fixed? // string (format depends on rangeKind)
lastKnownSafe? // optional explicit safe floor
arch? // arch or platform qualifier if source declares (x86_64, aarch64)
distro? // distro qualifier when applicable (rhel:9, debian:12, alpine:3.19)
ecosystem? // npm|pypi|maven|nuget|golang|…
notes? // normalized notes per source
```
**Key NuGet concepts** (indicative): `MongoDB.Driver`, `Polly` (retry/backoff), `System.Threading.Channels`, `Microsoft.Extensions.Http`, `Microsoft.Extensions.Hosting`, `Serilog`, `OpenTelemetry`.
**Reference**
```
advisoryId
url
kind // advisory | patch | kb | exploit | mitigation | blog | cvrf | csaf
sourceTag // e.g., vendor/redhat, distro/debian, oss/ghsa
```
**MergeEvent**
```
advisoryKey
beforeHash // canonical JSON hash before merge
afterHash // canonical JSON hash after merge
mergedAt
inputs[] // source doc digests that contributed
```
**ExportState**
```
exportKind // json | trivydb
baseExportId? // last full baseline
baseDigest? // digest of last full baseline
lastFullDigest? // digest of last full export
lastDeltaDigest? // digest of last delta export
cursor // per-kind incremental cursor
files[] // last manifest snapshot (path → sha256)
```
### 2.2 Product identity (`productKey`)
* **Primary:** `purl` (Package URL).
* **OS packages:** RPM (NEVRA→purl:rpm), DEB (dpkg→purl:deb), APK (apk→purl:alpine), with **EVR/NVRA** preserved.
* **Secondary:** `cpe` retained for compatibility; advisory records may carry both.
* **Image/platform:** `oci:<registry>/<repo>@<digest>` for image-level advisories (rare).
* **Unmappable:** if a source is non-deterministic, keep native string under `productKey="native:<provider>:<id>"` and mark **non-joinable**.
---
## 3) Data Storage — **MongoDB** (single source of truth)
## 3) Source families & precedence
**Database**: `feedser`
**Write concern**: `majority` for merge/export state, `acknowledged` for raw docs.
**Collections** (with “flags”/resume points):
### 3.1 Families
* `source`
* `_id`, `name`, `type`, `baseUrl`, `auth`, `notes`.
* `source_state`
* Keys: `sourceName` (unique), `enabled`, `cursor`, `lastSuccess`, `failCount`, `backoffUntil`, `paceOverrides`, `paused`.
* Drives incremental fetch/parse/map resume and operator pause/pace controls.
* `document`
* `_id`, `sourceName`, `uri`, `fetchedAt`, `sha256`, `contentType`, `status`, `metadata`, `gridFsId`, `etag`, `lastModified`.
* Index `{sourceName:1, uri:1}` unique; optional TTL for superseded versions.
* `dto`
* `_id`, `sourceName`, `documentId`, `schemaVer`, `payload` (BSON), `validatedAt`.
* Index `{sourceName:1, documentId:1}`.
* `advisory`
* `_id`, `advisoryKey`, `title`, `summary`, `lang`, `published`, `modified`, `severity`, `exploitKnown`.
* Unique `{advisoryKey:1}` plus indexes on `modified` and `published`.
* `alias`
* `advisoryId`, `scheme`, `value` with index `{scheme:1, value:1}`.
* `affected`
* `advisoryId`, `platform`, `name`, `versionRange`, `cpe`, `purl`, `fixedBy`, `introducedVersion`.
* Index `{platform:1, name:1}`, `{advisoryId:1}`.
* `reference`
* `advisoryId`, `url`, `kind`, `sourceTag` (e.g., advisory/patch/kb).
* Flags collections: `kev_flag`, `ru_flags`, `jp_flags`, `psirt_flags` keyed by `advisoryId`.
* `merge_event`
* `_id`, `advisoryKey`, `beforeHash`, `afterHash`, `mergedAt`, `inputs` (document ids).
* `export_state`
* `_id` (`json`/`trivydb`), `baseExportId`, `baseDigest`, `lastFullDigest`, `lastDeltaDigest`, `exportCursor`, `targetRepo`, `exporterVersion`.
* `locks`
* `_id` (`jobKey`), `holder`, `acquiredAt`, `heartbeatAt`, `leaseMs`, `ttlAt` (TTL index cleans dead locks).
* `jobs`
* `_id`, `type`, `args`, `state`, `startedAt`, `endedAt`, `error`, `owner`, `heartbeatAt`, `timeoutMs`.
* **Vendor PSIRTs**: Microsoft, Oracle, Cisco, Adobe, Apple, VMware, Chromium…
* **Linux distros**: Red Hat, SUSE, Ubuntu, Debian, Alpine…
* **OSS ecosystems**: OSV, GHSA (GitHub Security Advisories), PyPI, npm, Maven, NuGet, Go.
* **CERTs / national CSIRTs**: CISA (KEV, ICS), JVN, ACSC, CCCS, KISA, CERT-FR/BUND, etc.
**GridFS buckets**: `fs.documents` for raw large payloads; referenced by `document.gridFsId`.
### 3.2 Precedence (when claims conflict)
1. **Vendor PSIRT** (authoritative for their product).
2. **Distro** (authoritative for packages they ship, including backports).
3. **Ecosystem** (OSV/GHSA) for library semantics.
4. **CERTs/aggregators** for enrichment (KEV/known exploited).
> Precedence affects **Affected** ranges and **fixed** info; **severity** is normalized to the **maximum** credible severity unless policy overrides. Conflicts are retained with **source provenance**.
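
The precedence order can be expressed as a simple ranking, sketched below. The `family` key on each claim and the function name are hypothetical; the ranking itself follows the four levels listed above, and lower-ranked claims are returned rather than discarded so that provenance is retained.

```python
# Lower rank wins; order follows the precedence list (vendor first, CERTs last).
FAMILY_RANK = {"vendor": 0, "distro": 1, "ecosystem": 2, "cert": 3}


def pick_preferred(claims):
    """Return (preferred_claim, retained_claims) ordered by source family."""
    # sorted() is stable, so claims from the same family keep their input order
    ordered = sorted(claims, key=lambda c: FAMILY_RANK[c["family"]])
    return ordered[0], ordered[1:]
```

For example, a vendor claim of `fixed: 1.2.5` would be preferred over an ecosystem claim of `fixed: 1.2.4`, while the ecosystem claim stays available for Policy to inspect.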
---
## 4) Job & Scheduler Model
## 4) Connectors & normalization
* Scheduler stores cron expressions per source/exporter in config; persists next-run pointers in Mongo.
* Jobs acquire locks (`locks` collection) to ensure singleton execution per source/exporter.
* Supports manual triggers via API endpoints (`POST /jobs/{type}`) and pause/resume toggles per source.
---
## 5) Connector Contracts
Connectors implement:
### 4.1 Connector contract
```csharp
public interface IFeedConnector {
string SourceName { get; }
Task FetchAsync(IServiceProvider sp, CancellationToken ct);
Task ParseAsync(IServiceProvider sp, CancellationToken ct);
Task MapAsync(IServiceProvider sp, CancellationToken ct);
string SourceName { get; }
Task FetchAsync(IServiceProvider sp, CancellationToken ct); // -> document collection
Task ParseAsync(IServiceProvider sp, CancellationToken ct); // -> dto collection (validated)
Task MapAsync(IServiceProvider sp, CancellationToken ct); // -> advisory/alias/affected/reference
}
```
* Fetch populates `document` rows respecting rate limits, conditional GET, and `source_state.cursor`.
* Parse validates schema (JSON Schema, XSD) and writes sanitized DTO payloads.
* Map produces canonical advisory rows + provenance entries; must be idempotent.
* Base helpers in `StellaOps.Feedser.Source.Common` provide HTTP clients, retry policies, and watermark utilities.
* **Fetch**: windowed (cursor), conditional GET (ETag/LastModified), retry/backoff, rate limiting.
* **Parse**: schema validation (JSON Schema, XSD/CSAF), content type checks; write **DTO** with normalized casing.
* **Map**: build canonical records; all outputs carry **provenance** (doc digest, URI, anchors).
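
A hedged sketch of the conditional-GET part of the fetch stage follows. The `state` dictionary and the injected `http_get` callable are stand-ins invented for this example (they are not Feedser APIs); only the `If-None-Match` / `If-Modified-Since` request headers and the `304 Not Modified` status are standard HTTP.

```python
import hashlib


def fetch_if_changed(uri, state, http_get):
    """Fetch `uri` only if it changed since the last run.

    state:    dict mapping uri -> {"etag", "lastModified", "sha256"}
    http_get: callable(uri, headers) -> (status, response_headers, body_bytes)
    Returns the new body, or None when nothing changed.
    """
    prior = state.get(uri, {})
    headers = {}
    if prior.get("etag"):
        headers["If-None-Match"] = prior["etag"]
    if prior.get("lastModified"):
        headers["If-Modified-Since"] = prior["lastModified"]

    status, resp_headers, body = http_get(uri, headers)
    if status == 304:
        return None  # server confirmed the cached copy is current

    digest = hashlib.sha256(body).hexdigest()
    if digest == prior.get("sha256"):
        return None  # same bytes even though the server answered 200

    state[uri] = {
        "etag": resp_headers.get("ETag"),
        "lastModified": resp_headers.get("Last-Modified"),
        "sha256": digest,
    }
    return body
```

Keeping the content hash next to the validators makes the stage idempotent even for servers that ignore conditional headers.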
### 4.2 Version range normalization
* **SemVer** ecosystems (npm, pypi, maven, nuget, golang): normalize to `introduced`/`fixed` semver ranges (use `~`, `^`, `<`, `>=` canonicalized to intervals).
* **RPM EVR**: `epoch:version-release` with `rpmvercmp` semantics; store raw EVR strings and also **computed order keys** for query.
* **DEB**: dpkg version comparison semantics mirrored; store computed keys.
* **APK**: Alpine version semantics; compute order keys.
* **Generic**: if provider uses text, retain raw; do **not** invent ranges.
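
For the SemVer case, canonicalizing `^` and `~` to intervals might look like the sketch below. It assumes npm-style semantics, accepts only full `major.minor.patch` versions without prerelease tags, and returns a half-open interval `[introduced, fixed)`; the function names are illustrative.

```python
def parse_version(text: str):
    """Parse 'major.minor.patch' into a tuple of ints (no prerelease support)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def to_interval(spec: str):
    """Turn '^x.y.z' or '~x.y.z' into (introduced, fixed) strings."""
    operator, version = spec[0], parse_version(spec[1:])
    major, minor, patch = version
    if operator == "^":
        # caret: the left-most non-zero component may not change
        if major > 0:
            upper = (major + 1, 0, 0)
        elif minor > 0:
            upper = (0, minor + 1, 0)
        else:
            upper = (0, 0, patch + 1)
    elif operator == "~":
        # tilde: patch-level changes only
        upper = (major, minor + 1, 0)
    else:
        raise ValueError(f"unsupported range operator in {spec!r}")
    fmt = lambda parts: ".".join(str(p) for p in parts)
    return fmt(version), fmt(upper)
```

Other ecosystems define these operators differently, so a real normalizer would select the rule set per ecosystem rather than reuse this one everywhere.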
### 4.3 Severity & CVSS
* Normalize **CVSS v2/v3/v4** where available (vector, baseScore, severity).
* If multiple CVSS sources exist, track them all; **effective severity** defaults to **max** by policy (configurable).
* **ExploitKnown** toggled by KEV and equivalent sources; store **evidence** (source, date).
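
The default `max` policy can be sketched as below, using the qualitative rating bands from the CVSS v3.x specification (v2 scores use different bands and are not covered). The function names are assumptions for this example.

```python
SEVERITY_ORDER = ["none", "low", "medium", "high", "critical"]


def severity_from_score(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative rating."""
    if score == 0.0:
        return "none"
    if score < 4.0:
        return "low"
    if score < 7.0:
        return "medium"
    if score < 9.0:
        return "high"
    return "critical"


def effective_severity(scores):
    """Return the highest rating among all recorded scores, or None if there are none."""
    if not scores:
        return None
    ratings = [severity_from_score(s) for s in scores]
    return max(ratings, key=SEVERITY_ORDER.index)
```

All individual scores stay on the record; only the derived effective value changes when the policy is switched.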
---
## 6) Merge & Normalization
## 5) Merge engine
* Canonical model stored in `StellaOps.Feedser.Models` with serialization contracts used by storage/export layers.
* `StellaOps.Feedser.Normalization` handles NEVRA/EVR/PURL range parsing, CVSS normalization, localization.
* `StellaOps.Feedser.Merge` builds alias graphs keyed by CVE first, then falls back to vendor/regional IDs.
* Precedence rules: PSIRT/OVAL overrides generic ranges; KEV only toggles exploitation; regional feeds enrich severity but don't override vendor truth.
* Determinism enforced via canonical JSON hashing logged in `merge_event`.
### 5.1 Keying & identity
* Identity graph: **CVE** is primary node; vendor/distro IDs resolved via **Alias** edges (from connectors and Feedser's alias tables).
* `advisoryKey` is the canonical primary key (CVE if present, else vendor/distro key).
### 5.2 Merge algorithm (deterministic)
1. **Gather** all rows for `advisoryKey` (across sources).
2. **Select title/summary** by precedence source (vendor>distro>ecosystem>cert).
3. **Union aliases** (dedupe by scheme+value).
4. **Merge `Affected`** with rules:
* Prefer **vendor** ranges for vendor products; prefer **distro** for **distro-shipped** packages.
* If both exist for same `productKey`, keep **both**; mark `sourceTag` and `precedence` so **Policy** can decide.
* Never collapse range semantics across different families (e.g., rpm EVR vs semver).
5. **CVSS/severity**: record all CVSS sets; compute **effectiveSeverity** = max (unless policy override).
6. **References**: union with type precedence (advisory > patch > kb > exploit > blog); dedupe by URL; preserve `sourceTag`.
7. Produce **canonical JSON**; compute **afterHash**; store **MergeEvent** with inputs and hashes.
> The merge is **pure** given inputs. Any change in inputs or precedence matrices changes the **hash** predictably.
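
The purity property can be demonstrated with a reduced sketch that covers only the alias union and the hash step. It assumes that "canonical JSON" means sorted keys and compact separators, which is one plausible reading of "stable order"; the record shape and function names are illustrative.

```python
import hashlib
import json


def canonical_hash(record: dict) -> str:
    """Hash a record using a deterministic JSON encoding."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def union_aliases(alias_lists):
    """Union aliases from all sources, deduplicated by (scheme, value)."""
    seen = {(a["scheme"], a["value"]) for aliases in alias_lists for a in aliases}
    return [{"scheme": scheme, "value": value} for scheme, value in sorted(seen)]


def merge(advisory_key: str, inputs):
    """Build a (reduced) canonical record and its hash from source inputs."""
    record = {
        "advisoryKey": advisory_key,
        "aliases": union_aliases(i["aliases"] for i in inputs),
    }
    return record, canonical_hash(record)
```

Because the aliases are sorted before hashing, feeding the same inputs in a different order yields the same `afterHash`.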
---
## 6) Storage schema (MongoDB)
**Collections & indexes**
* `source` `{_id, type, baseUrl, enabled, notes}`
* `source_state` `{sourceName(unique), enabled, cursor, lastSuccess, backoffUntil, paceOverrides}`
* `document` `{_id, sourceName, uri, fetchedAt, sha256, contentType, status, metadata, gridFsId?, etag?, lastModified?}`
* Index: `{sourceName:1, uri:1}` unique, `{fetchedAt:-1}`
* `dto` `{_id, sourceName, documentId, schemaVer, payload, validatedAt}`
* Index: `{sourceName:1, documentId:1}`
* `advisory` `{_id, advisoryKey, title, summary, published, modified, severity, cvss, exploitKnown, sources[]}`
* Index: `{advisoryKey:1}` unique, `{modified:-1}`, `{severity:1}`, text index (title, summary)
* `alias` `{advisoryId, scheme, value}`
* Index: `{scheme:1,value:1}`, `{advisoryId:1}`
* `affected` `{advisoryId, productKey, rangeKind, introduced?, fixed?, arch?, distro?, ecosystem?}`
* Index: `{productKey:1}`, `{advisoryId:1}`, `{productKey:1, rangeKind:1}`
* `reference` `{advisoryId, url, kind, sourceTag}`
* Index: `{advisoryId:1}`, `{kind:1}`
* `merge_event` `{advisoryKey, beforeHash, afterHash, mergedAt, inputs[]}`
* Index: `{advisoryKey:1, mergedAt:-1}`
* `export_state` `{_id(exportKind), baseExportId?, baseDigest?, lastFullDigest?, lastDeltaDigest?, cursor, files[]}`
* `locks` `{_id(jobKey), holder, acquiredAt, heartbeatAt, leaseMs, ttlAt}` (TTL cleans dead locks)
* `jobs` `{_id, type, args, state, startedAt, heartbeatAt, endedAt, error}`
**GridFS buckets**: `fs.documents` for raw payloads.
---
## 7) Exporters
* JSON exporter mirrors `aquasecurity/vuln-list` layout with deterministic ordering and reproducible timestamps.
* Trivy DB exporter shells out to `trivy-db build`, produces Bolt archives, and reuses unchanged blobs from the last full baseline when running in delta mode. The exporter annotates `metadata.json` with `mode`, `baseExportId`, `baseManifestDigest`, `resetBaseline`, and `delta.changedFiles[]`/`delta.removedPaths[]`, and honours `publishFull` / `publishDelta` (ORAS) plus `includeFull` / `includeDelta` (offline bundle) toggles.
* `StellaOps.Feedser.Storage.Mongo` provides cursors for delta exports based on `export_state.exportCursor` and the persisted per-file manifest (`export_state.files`).
* Export jobs produce OCI tarballs (layer media type `application/vnd.aquasec.trivy.db.layer.v1.tar+gzip`) and optionally push via ORAS; `metadata.json` accompanies each layout so mirrors can decide between full refreshes and deltas.
### 7.1 Deterministic JSON (vuln-list style)
* Folder structure mirroring `/<scheme>/<first-two>/<rest>/…` with one JSON per advisory; deterministic ordering, stable timestamps, normalized whitespace.
* `manifest.json` lists all files with SHA-256 and a top-level **export digest**.
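
One way such a manifest could be assembled is sketched below. How the export digest is actually derived is not specified here, so hashing the sorted `path` and `sha256` pairs is purely an assumption, as are the field names `files` and `exportDigest`.

```python
import hashlib


def build_manifest(files):
    """Build a manifest from a mapping of relative path -> file bytes."""
    entries = [
        {"path": path, "sha256": hashlib.sha256(files[path]).hexdigest()}
        for path in sorted(files)  # sorted paths make the output order-independent
    ]
    rollup = hashlib.sha256()
    for entry in entries:
        rollup.update(f"{entry['path']}\n{entry['sha256']}\n".encode("utf-8"))
    return {"files": entries, "exportDigest": "sha256:" + rollup.hexdigest()}
```

Two exports with identical content then produce the same digest regardless of the order in which files were written.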
### 7.2 Trivy DB exporter
* Builds Bolt DB archives compatible with Trivy; supports **full** and **delta** modes.
* In delta, unchanged blobs are reused from the base; metadata captures:
```
{
"mode": "delta|full",
"baseExportId": "...",
"baseManifestDigest": "sha256:...",
"changed": ["path1", "path2"],
"removed": ["path3"]
}
```
* Optional ORAS push (OCI layout) for registries.
* Offline kit bundles include Trivy DB + JSON tree + export manifest.
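
The `changed` and `removed` lists in the delta metadata could be derived by comparing two manifest snapshots, as in this sketch. It assumes each snapshot is a mapping of path to SHA-256, matching the `files[]` description of `ExportState`; the function name is hypothetical.

```python
def diff_manifests(base: dict, current: dict) -> dict:
    """Compare two path -> sha256 snapshots and describe the delta."""
    changed = sorted(path for path, digest in current.items() if base.get(path) != digest)
    removed = sorted(path for path in base if path not in current)
    return {"mode": "delta", "changed": changed, "removed": removed}
```

New files show up under `changed` because they have no entry in the base snapshot; unchanged paths appear in neither list and can be reused from the base export.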
### 7.3 Handoff to Signer/Attestor (optional)
* On export completion, if `attest: true` is set in job args, Feedser **posts** the artifact metadata to **Signer**/**Attestor**; Feedser itself **does not** hold signing keys.
* Export record stores returned `{ uuid, index, url }` from **Rekor v2**.
---
## 8) Observability
## 8) REST APIs
* Serilog structured logging with enrichment fields (`source`, `uri`, `stage`, `durationMs`).
* OpenTelemetry traces around fetch/parse/map/export; metrics for rate limit hits, schema failures, dedupe ratios, package size. Connector HTTP metrics are emitted via the shared `feedser.source.http.*` instruments tagged with `feedser.source=<connector>` so per-source dashboards slice on that label instead of bespoke metric names.
* Prometheus scraping endpoint served by WebService.
All under `/api/v1/feedser`.
**Health & status**
```
GET /healthz | /readyz
GET /status → sources, last runs, export cursors
```
**Sources & jobs**
```
GET /sources → list of configured sources
POST /sources/{name}/trigger → { jobId }
POST /sources/{name}/pause | /resume → toggle
GET /jobs/{id} → job status
```
**Exports**
```
POST /exports/json { full?:bool, force?:bool, attest?:bool } → { exportId, digest, rekor? }
POST /exports/trivy { full?:bool, force?:bool, publish?:bool, attest?:bool } → { exportId, digest, rekor? }
GET /exports/{id} → export metadata (kind, digest, createdAt, rekor?)
```
**Search (operator debugging)**
```
GET /advisories/{key}
GET /advisories?scheme=CVE&value=CVE-2025-12345
GET /affected?productKey=pkg:rpm/openssl&limit=100
```
**AuthN/Z:** Authority tokens (OpTok) with roles: `feedser.read`, `feedser.admin`, `feedser.export`.
---
## 9) Security Considerations
## 9) Configuration (YAML)
* Offline-first: connectors only reach allowlisted hosts.
* BDU LLM fallback gated by config flag; logs audit trail with confidence score.
* No secrets written to logs; secrets loaded via environment or mounted files.
* Signing handled outside Feedser pipeline.
```yaml
feedser:
  mongo: { uri: "mongodb://mongo/feedser" }
  s3:
    endpoint: "http://minio:9000"
    bucket: "stellaops-feedser"
  scheduler:
    windowSeconds: 30
    maxParallelSources: 4
  sources:
    - name: redhat
      kind: csaf
      baseUrl: https://access.redhat.com/security/data/csaf/v2/
      signature: { type: pgp, keys: [ "…redhat PGP…" ] }
      enabled: true
      windowDays: 7
    - name: suse
      kind: csaf
      baseUrl: https://ftp.suse.com/pub/projects/security/csaf/
      signature: { type: pgp, keys: [ "…suse PGP…" ] }
    - name: ubuntu
      kind: usn-json
      baseUrl: https://ubuntu.com/security/notices.json
      signature: { type: none }
    - name: osv
      kind: osv
      baseUrl: https://api.osv.dev/v1/
      signature: { type: none }
    - name: ghsa
      kind: ghsa
      baseUrl: https://api.github.com/graphql
      auth: { tokenRef: "env:GITHUB_TOKEN" }
  exporters:
    json:
      enabled: true
      output: s3://stellaops-feedser/json/
    trivy:
      enabled: true
      mode: full
      output: s3://stellaops-feedser/trivy/
      oras:
        enabled: false
        repo: ghcr.io/org/feedser
  precedence:
    vendorWinsOverDistro: true
    distroWinsOverOsv: true
  severity:
    policy: max # or 'vendorPreferred' / 'distroPreferred'
```
---
## 10) Deployment Notes
## 10) Security & compliance
* **Outbound allowlist** per connector (domains, protocols); proxy support; TLS pinning where possible.
* **Signature verification** for raw docs (PGP/cosign/x509) with results stored in `document.metadata.sig`. Docs failing verification may still be ingested but flagged; **merge** can downweight or ignore them by config.
* **No secrets in logs**; auth material via `env:` or mounted files; HTTP redaction of `Authorization` headers.
* **Multi-tenant**: per-tenant DBs or prefixes; per-tenant S3 prefixes; tenant-scoped API tokens.
* **Determinism**: canonical JSON writer; export digests stable across runs given same inputs.
---
## 11) Performance targets & scale
* **Ingest**: ≥ 5k documents/min on 4 cores (CSAF/OpenVEX/JSON).
* **Normalize/map**: ≥ 50k `Affected` rows/min on 4 cores.
* **Merge**: ≤ 10ms P95 per advisory at steady-state updates.
* **Export**: 1M advisories JSON in ≤ 90s (streamed, zstd), Trivy DB in ≤ 60s on 8 cores.
* **Memory**: hard cap per job; chunked streaming writers; backpressure to avoid GC spikes.
**Scale pattern**: add Feedser replicas; Mongo scaling via indices and read/write concerns; GridFS only for oversized docs.
---
## 12) Observability
* **Metrics**
* `feedser.fetch.docs_total{source}`
* `feedser.fetch.bytes_total{source}`
* `feedser.parse.failures_total{source}`
* `feedser.map.affected_total{source}`
* `feedser.merge.changed_total`
* `feedser.export.bytes{kind}`
* `feedser.export.duration_seconds{kind}`
* **Tracing** around fetch/parse/map/merge/export.
* **Logs**: structured with `source`, `uri`, `docDigest`, `advisoryKey`, `exportId`.
---
## 13) Testing matrix
* **Connectors:** fixture suites for each provider/format (happy path; malformed; signature fail).
* **Version semantics:** EVR vs dpkg vs semver edge cases (epoch bumps, tilde versions, prereleases).
* **Merge:** conflicting sources (vendor vs distro vs OSV); verify precedence & dual retention.
* **Export determinism:** byte-for-byte stable outputs across runs; digest equality.
* **Performance:** soak tests with 1M advisories; cap memory; verify backpressure.
* **API:** pagination, filters, RBAC, error envelopes (RFC 7807).
* **Offline kit:** bundle build & import correctness.
---
## 14) Failure modes & recovery
* **Source outages:** scheduler backs off with exponential delay; `source_state.backoffUntil`; alerts on staleness.
* **Schema drifts:** parse stage marks DTO invalid; job fails with clear diagnostics; connector version flags track supported schema ranges.
* **Partial exports:** exporters write to temp prefix; **manifest commit** is atomic; only then move to final prefix and update `export_state`.
* **Resume:** all stages idempotent; `source_state.cursor` supports window resume.
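
The partial-export rule above (write to a temp prefix, commit the manifest atomically, then promote) can be sketched on a local filesystem as follows. This is a minimal sketch under stated assumptions: the real exporters target object storage, and `publish_export`, the `.tmp-` naming and the manifest shape are hypothetical.

```python
import json
import os
import tempfile


def publish_export(root: str, export_id: str, files: dict) -> str:
    """Stage files under a temp prefix, commit a manifest, then promote."""
    staging = os.path.join(root, f".tmp-{export_id}")
    final = os.path.join(root, export_id)
    os.makedirs(staging)

    for name, data in files.items():
        with open(os.path.join(staging, name), "wb") as fh:
            fh.write(data)

    # Write the manifest to a scratch name first, then rename it into place.
    manifest = {"exportId": export_id, "files": sorted(files)}
    scratch = os.path.join(staging, "manifest.json.tmp")
    with open(scratch, "w", encoding="utf-8") as fh:
        json.dump(manifest, fh, sort_keys=True)
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(scratch, os.path.join(staging, "manifest.json"))  # commit point

    # Only after the manifest exists is the staging prefix promoted.
    os.rename(staging, final)
    return final


demo_root = tempfile.mkdtemp()
final_path = publish_export(demo_root, "exp-1", {"advisories.json": b"[]"})
```

A reader that only trusts directories containing `manifest.json` never observes a half-written export; updating `export_state` would follow the rename.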
---
## 15) Operator runbook (quick)
* **Trigger all sources:** `POST /api/v1/feedser/sources/*/trigger`
* **Force full export JSON:** `POST /api/v1/feedser/exports/json { "full": true, "force": true }`
* **Force Trivy DB delta publish:** `POST /api/v1/feedser/exports/trivy { "full": false, "publish": true }`
* **Inspect advisory:** `GET /api/v1/feedser/advisories?scheme=CVE&value=CVE-2025-12345`
* **Pause noisy source:** `POST /api/v1/feedser/sources/osv/pause`
---
## 16) Rollout plan
1. **MVP**: Red Hat (CSAF), SUSE (CSAF), Ubuntu (USN JSON), OSV; JSON export.
2. **Add**: GHSA GraphQL, Debian (DSA HTML/JSON), Alpine secdb; Trivy DB export.
3. **Attestation handoff**: integrate with **Signer/Attestor** (optional).
4. **Scale & diagnostics**: provider dashboards, staleness alerts, export cache reuse.
5. **Offline kit**: end-to-end verified bundles for air-gap.
* Default storage MongoDB; for air-gapped, bundle Mongo image + seeded data backup.
* Horizontal scale achieved via multiple web service instances sharing Mongo locks.
* Provide `feedser.yaml` template describing sources, rate limits, and export settings.


@@ -0,0 +1,413 @@
# component_architecture_scanner.md — **StellaOps Scanner** (2025Q4)
> **Scope.** Implementation-ready architecture for the **Scanner** subsystem: WebService, Workers, analyzers, SBOM assembly (inventory & usage), per-layer caching, three-way diffs, artifact catalog (MinIO+Mongo), attestation handoff, and scale/security posture. This document is the contract between the scanning plane and everything else (Policy, Vexer, Feedser, UI, CLI).
---
## 0) Mission & boundaries
**Mission.** Produce **deterministic**, **explainable** SBOMs and diffs for container images and filesystems, quickly and repeatedly, without guessing. Emit two views: **Inventory** (everything present) and **Usage** (entrypoint closure + actually linked libs). Attach attestations through **Signer→Attestor→Rekor v2**.
**Boundaries.**
* Scanner **does not** produce PASS/FAIL. The backend (Policy + Vexer + Feedser) decides presentation and verdicts.
* Scanner **does not** keep third-party SBOM warehouses. It may **bind** to existing attestations for exact hashes.
* Core analyzers are **deterministic** (no fuzzy identity). Optional heuristic plugins (e.g., patch-presence) run under explicit flags and never contaminate the core SBOM.
---
## 1) Solution & project layout
```
src/
├─ StellaOps.Scanner.WebService/ # REST control plane, catalog, diff, exports
├─ StellaOps.Scanner.Worker/ # queue consumer; executes analyzers
├─ StellaOps.Scanner.Models/ # DTOs, evidence, graph nodes, CDX/SPDX adapters
├─ StellaOps.Scanner.Storage/ # Mongo repositories; MinIO object client; ILM/GC
├─ StellaOps.Scanner.Queue/ # queue abstraction (Redis/NATS/RabbitMQ)
├─ StellaOps.Scanner.Cache/ # layer cache; file CAS; bloom/bitmap indexes
├─ StellaOps.Scanner.EntryTrace/ # ENTRYPOINT/CMD → terminal program resolver (shell AST)
├─ StellaOps.Scanner.Analyzers.OS.[Apk|Dpkg|Rpm]/
├─ StellaOps.Scanner.Analyzers.Lang.[Java|Node|Python|Go|DotNet|Rust]/
├─ StellaOps.Scanner.Analyzers.Native.[ELF|PE|MachO]/ # PE/Mach-O planned (M2)
├─ StellaOps.Scanner.Emit.CDX/ # CycloneDX (JSON + Protobuf)
├─ StellaOps.Scanner.Emit.SPDX/ # SPDX 3.0.1 JSON
├─ StellaOps.Scanner.Diff/ # image→layer→component three-way diff
├─ StellaOps.Scanner.Index/ # BOM-Index sidecar (purls + roaring bitmaps)
├─ StellaOps.Scanner.Tests.* # unit/integration/e2e fixtures
└─ tools/
   ├─ StellaOps.Scanner.Sbomer.BuildXPlugin/ # BuildKit generator (image referrer SBOMs)
   └─ StellaOps.Scanner.Sbomer.DockerImage/ # CLI-driven scanner container
```
**Runtime form-factor:** two deployables
* **Scanner.WebService** (stateless REST)
* **Scanner.Worker** (N replicas; queue-driven)
---
## 2) External dependencies
* **OCI registry** with **Referrers API** (discover attached SBOMs/signatures).
* **MinIO** (S3-compatible) for SBOM artifacts; **Object Lock** for immutable classes; **ILM** for TTL.
* **MongoDB** for catalog, job state, diffs, ILM rules.
* **Queue** (Redis Streams/NATS/RabbitMQ).
* **Authority** (on-prem OIDC) for **OpToks** (DPoP/mTLS).
* **Signer** + **Attestor** (+ **Fulcio/KMS** + **Rekor v2**) for DSSE + transparency.
---
## 3) Contracts & data model
### 3.1 Evidence-first component model
**Nodes**
* `Image`, `Layer`, `File`
* `Component` (`purl?`, `name`, `version?`, `type`, `id` — may be `bin:{sha256}`)
* `Executable` (ELF/PE/Mach-O), `Library` (native or managed), `EntryScript` (shell/launcher)
**Edges** (all carry **Evidence**)
* `contains(Image|Layer → File)`
* `installs(PackageDB → Component)` (OS database row)
* `declares(InstalledMetadata → Component)` (dist-info, pom.properties, deps.json…)
* `links_to(Executable → Library)` (ELF `DT_NEEDED`, PE imports)
* `calls(EntryScript → Program)` (file:line from shell AST)
* `attests(Rekor → Component|Image)` (SBOM/predicate binding)
* `bound_from_attestation(Component_attested → Component_observed)` (hash equality proof)
**Evidence**
```
{ source: enum, locator: (path|offset|line), sha256?, method: enum, timestamp }
```
No confidences. Either a fact is proven with listed mechanisms, or it is not claimed.
### 3.2 Catalog schema (Mongo)
* `artifacts`
```
{ _id, type: layer-bom|image-bom|diff|index,
  format: cdx-json|cdx-pb|spdx-json,
  bytesSha256, size, rekor: { uuid,index,url }?,
  ttlClass, immutable, refCount, createdAt }
```
* `images { imageDigest, repo, tag?, arch, createdAt, lastSeen }`
* `layers { layerDigest, mediaType, size, createdAt, lastSeen }`
* `links { fromType, fromDigest, artifactId }` // image/layer -> artifact
* `jobs { _id, kind, args, state, startedAt, heartbeatAt, endedAt, error }`
* `lifecycleRules { ruleId, scope, ttlDays, retainIfReferenced, immutable }`
### 3.3 Object store layout (MinIO)
```
layers/<sha256>/sbom.cdx.json.zst
layers/<sha256>/sbom.spdx.json.zst
images/<imgDigest>/inventory.cdx.pb # CycloneDX Protobuf
images/<imgDigest>/usage.cdx.pb
indexes/<imgDigest>/bom-index.bin # purls + roaring bitmaps
diffs/<old>_<new>/diff.json.zst
attest/<artifactSha256>.dsse.json # DSSE bundle (cert chain + Rekor proof)
```
---
## 4) REST API (Scanner.WebService)
All under `/api/v1/scanner`. Auth: **OpTok** (DPoP/mTLS); RBAC scopes.
```
POST /scans { imageRef|digest, force?:bool } → { scanId }
GET /scans/{id} → { status, imageDigest, artifacts[], rekor? }
GET /sboms/{imageDigest} ?format=cdx-json|cdx-pb|spdx-json&view=inventory|usage → bytes
GET /diff?old=<digest>&new=<digest>&view=inventory|usage → diff.json
POST /exports { imageDigest, format, view, attest?:bool } → { artifactId, rekor? }
POST /reports { imageDigest, policyRevision? } → { reportId, rekor? } # delegates to backend policy+vex
GET /catalog/artifacts/{id} → { meta }
GET /healthz | /readyz | /metrics
```
---
## 5) Execution flow (Worker)
### 5.1 Acquire & verify
1. **Resolve image** (prefer `repo@sha256:`).
2. **(Optional) verify image signature** per policy (cosign).
3. **Pull blobs**, compute layer digests; record metadata.
### 5.2 Layer union FS
* Apply whiteouts; materialize final filesystem; map **file → first introducing layer**.
* Windows layers (MSI/SxS/GAC) planned in **M2**.
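
To make the whiteout step concrete, here is a small sketch of merging ordered layers into a `path → introducing layer` map. It assumes the OCI whiteout naming convention (`.wh.<name>` and the opaque marker `.wh..wh..opq`) and represents each layer as a plain list of relative paths; `union_layers` is a hypothetical helper, not the Worker's implementation.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."
OPAQUE_MARKER = ".wh..wh..opq"


def union_layers(layers):
    """Merge ordered layers into {path: index of the introducing layer}."""
    merged = {}
    for index, paths in enumerate(layers):
        paths = list(paths)
        # Pass 1: whiteouts remove content inherited from lower layers.
        for path in paths:
            directory, name = posixpath.split(path)
            if name == OPAQUE_MARKER:
                prefix = directory + "/"
                for old in [p for p in merged if p.startswith(prefix)]:
                    del merged[old]
            elif name.startswith(WHITEOUT_PREFIX):
                target = posixpath.join(directory, name[len(WHITEOUT_PREFIX):])
                doomed = [p for p in merged
                          if p == target or p.startswith(target + "/")]
                for old in doomed:
                    del merged[old]
        # Pass 2: regular files; keep the first layer that introduced a path.
        for path in paths:
            if posixpath.basename(path).startswith(WHITEOUT_PREFIX):
                continue
            merged.setdefault(path, index)
    return merged


layers = [
    ["etc/os-release", "usr/lib/libfoo.so", "tmp/cache/a"],
    ["usr/lib/.wh.libfoo.so", "tmp/cache/.wh..wh..opq", "tmp/cache/b",
     "etc/os-release"],
    ["usr/lib/libfoo.so"],
]
final_fs = union_layers(layers)
```

In this sketch a file that is merely modified in a later layer stays attributed to its first introducing layer, while a file that is deleted and re-added is attributed to the layer that re-added it. Whether the real attribution treats modifications this way is an assumption.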
### 5.3 Evidence harvest (parallel analyzers; deterministic only)
**A) OS packages**
* **apk**: `/lib/apk/db/installed`
* **dpkg**: `/var/lib/dpkg/status`, `/var/lib/dpkg/info/*.list`
* **rpm**: `/var/lib/rpm/Packages` (via librpm or parser)
* Record `name`, `version` (epoch/revision), `arch`, source package where present, and **declared file lists**.
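
As an illustration of the dpkg case, the following sketch parses the text of a `/var/lib/dpkg/status` file into installed-package records. It is a simplified reader (it ignores the `info/*.list` file lists and assumes well-formed stanzas); the function name and output keys are hypothetical.

```python
def parse_dpkg_status(text):
    """Return one record per stanza whose Status ends in 'installed'."""
    packages = []
    for stanza in text.strip().split("\n\n"):
        fields, key = {}, None
        for line in stanza.splitlines():
            if line[:1] in (" ", "\t") and key:
                # Continuation line of a multi-line field (e.g. Description).
                fields[key] += "\n" + line.strip()
            elif ":" in line:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        status = fields.get("Status", "").split()
        if not status or status[-1] != "installed":
            continue
        source = fields.get("Source")
        packages.append({
            "name": fields["Package"],
            "version": fields["Version"],  # may carry epoch and revision
            "arch": fields.get("Architecture"),
            # Source may read "name (version)"; keep only the name.
            "source": source.split()[0] if source else None,
        })
    return packages


SAMPLE = """\
Package: libssl3
Status: install ok installed
Architecture: amd64
Source: openssl
Version: 3.0.11-1~deb12u2
Description: Secure Sockets Layer toolkit
 shared libraries

Package: oldpkg
Status: deinstall ok config-files
Architecture: all
Version: 1:2.0-1
"""

installed = parse_dpkg_status(SAMPLE)
```

Packages left in `config-files` state are skipped here because nothing of them is installed on disk, which matches the "installed state only" stance of this section.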
**B) Language ecosystems (installed state only)**
* **Java**: `META-INF/maven/*/pom.properties`, MANIFEST → `pkg:maven/...`
* **Node**: `node_modules/**/package.json` → `pkg:npm/...`
* **Python**: `*.dist-info/{METADATA,RECORD}` → `pkg:pypi/...`
* **Go**: Go **buildinfo** in binaries → `pkg:golang/...`
* **.NET**: `*.deps.json` + assembly metadata → `pkg:nuget/...`
* **Rust**: crates only when **explicitly present** (embedded metadata or cargo/registry traces); otherwise binaries reported as `bin:{sha256}`.
> **Rule:** We only report components proven **on disk** with authoritative metadata. Lockfiles are evidence only.
**C) Native link graph**
* **ELF**: parse `PT_INTERP`, `DT_NEEDED`, RPATH/RUNPATH, **GNU symbol versions**; map **SONAMEs** to file paths; link executables → libs.
* **PE/Mach-O** (planned M2): import table, delay-imports; version resources; code signatures.
* Map libs back to **OS packages** if possible (via file lists); else emit `bin:{sha256}` components.
**D) EntryTrace (ENTRYPOINT/CMD → terminal program)**
* Read image config; parse shell (POSIX/Bash subset) with AST: `source`/`.` includes; `case/if`; `exec`/`command`; `run-parts`.
* Resolve commands via **PATH** within the **built rootfs**; follow language launchers (Java/Node/Python) to identify the terminal program (ELF/JAR/venv script).
* Record **file:line** and choices for each hop; output chain graph.
* Unresolvable dynamic constructs are recorded as **unknown** edges with reasons (e.g., `$FOO` unresolved).
**E) Attestation & SBOM bind (optional)**
* For each **file hash** or **binary hash**, query local cache of **Rekor v2** indices; if an SBOM attestation is found for **exact hash**, bind it to the component (origin=`attested`).
* For the **image** digest, likewise bind SBOM attestations (build-time referrers).
### 5.4 Component normalization (exact only)
* Create `Component` nodes only with deterministic identities: purl, or **`bin:{sha256}`** for unlabeled binaries.
* Record **origin** (OS DB, installed metadata, linker, attestation).
### 5.5 SBOM assembly & emit
* **Per-layer SBOM fragments**: components introduced by the layer (+ relationships).
* **Image SBOMs**: merge fragments; refer back to them via **CycloneDX BOM-Link** (or SPDX ExternalRef).
* Emit both **Inventory** & **Usage** views.
* Serialize **CycloneDX JSON** and **CycloneDX Protobuf**; optionally **SPDX 3.0.1 JSON**.
* Build **BOM-Index** sidecar: purl table + roaring bitmap; flag `usedByEntrypoint` components for fast backend joins.
### 5.6 DSSE attestation (via Signer/Attestor)
* WebService constructs **predicate** with `image_digest`, `stellaops_version`, `license_id`, `policy_digest?` (when emitting **final reports**), timestamps.
* Calls **Signer** (requires **OpTok + PoE**); Signer verifies **entitlement + scanner image integrity** and returns **DSSE bundle**.
* **Attestor** logs to **Rekor v2**; returns `{uuid,index,proof}` → stored in `artifacts.rekor`.
---
## 6) Three-way diff (image → layer → component)
### 6.1 Keys & classification
* Component key: **purl** when present; else `bin:{sha256}`.
* Diff classes: `added`, `removed`, `version_changed` (`upgraded|downgraded`), `metadata_changed` (e.g., origin from attestation vs observed).
* Layer attribution: for each change, resolve the **introducing/removing layer**.
### 6.2 Algorithm (outline)
```
A = components(imageOld, key)
B = components(imageNew, key)
added = B \ A
removed = A \ B
changed = { k in A∩B : version(A[k]) != version(B[k]) || origin changed }
for each item in added/removed/changed:
  layer = attribute_to_layer(item, imageOld|imageNew)
  usageFlag = usedByEntrypoint(item, imageNew)
emit diff.json (grouped by layer with badges)
```
Diffs are stored as artifacts and feed **UI** and **CLI**.
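
The outline above can be sketched in runnable form as follows. The sketch assumes that the component key is a version-less purl (or `bin:{sha256}`), that each component already carries its introducing layer, and it does not attempt the `upgraded|downgraded` split, which needs ecosystem-specific version ordering. All names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    key: str      # version-less purl when present, else "bin:{sha256}"
    version: str
    origin: str   # e.g. "os-db", "installed-metadata", "attested"
    layer: str    # digest of the introducing layer


def diff_components(old, new):
    """Classify changes between two component sets, with layer attribution."""
    a = {c.key: c for c in old}
    b = {c.key: c for c in new}
    entries = []
    for k in sorted(b.keys() - a.keys()):
        entries.append({"key": k, "class": "added", "layer": b[k].layer})
    for k in sorted(a.keys() - b.keys()):
        entries.append({"key": k, "class": "removed", "layer": a[k].layer})
    for k in sorted(a.keys() & b.keys()):
        if a[k].version != b[k].version:
            kind = "version_changed"
        elif a[k].origin != b[k].origin:
            kind = "metadata_changed"
        else:
            continue
        entries.append({"key": k, "class": kind, "layer": b[k].layer})
    return entries


old_image = [
    Component("pkg:deb/debian/openssl", "3.0.11-1", "os-db", "sha256:l1"),
    Component("pkg:npm/left-pad", "1.3.0", "installed-metadata", "sha256:l2"),
]
new_image = [
    Component("pkg:deb/debian/openssl", "3.0.13-1", "os-db", "sha256:l1b"),
    Component("bin:abc123", "", "linker", "sha256:l3"),
]
result = diff_components(old_image, new_image)
```

Grouping `result` by `layer` and adding the `usedByEntrypoint` flag would give the shape described for `diff.json`.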
---
## 7) Build-time SBOMs (fast CI path)
**Scanner.Sbomer.BuildXPlugin** can act as a BuildKit **generator**:
* During `docker buildx build --attest=type=sbom,generator=stellaops/sbom-indexer`, run analyzers on the build context/output; attach SBOMs as OCI **referrers** to the built image.
* Optionally request **Signer/Attestor** to produce **StellaOps-verified** attestation immediately; otherwise, Scanner.WebService can verify and re-attest post-push.
* Scanner.WebService trusts build-time SBOMs per policy, enabling **no-rescan** for unchanged bases.
---
## 8) Configuration (YAML)
```yaml
scanner:
  queue:
    kind: redis
    url: "redis://queue:6379/0"
  mongo:
    uri: "mongodb://mongo/scanner"
  s3:
    endpoint: "http://minio:9000"
    bucket: "stellaops"
    objectLock: "governance" # or 'compliance'
  analyzers:
    os: { apk: true, dpkg: true, rpm: true }
    lang: { java: true, node: true, python: true, go: true, dotnet: true, rust: true }
    native: { elf: true, pe: false, macho: false } # PE/Mach-O in M2
    entryTrace: { enabled: true, shellMaxDepth: 64, followRunParts: true }
  emit:
    cdx: { json: true, protobuf: true }
    spdx: { json: true }
    compress: "zstd"
  rekor:
    url: "https://rekor-v2.internal"
  signer:
    url: "https://signer.internal"
  limits:
    maxParallel: 8
    perRegistryConcurrency: 2
  policyHints:
    verifyImageSignature: false
    trustBuildTimeSboms: true
```
---
## 9) Scale & performance
* **Parallelism**: per-analyzer concurrency; bounded directory walkers; file CAS dedupe by sha256.
* **Distributed locks** per **layer digest** to prevent duplicate work across Workers.
* **Registry throttles**: per-host concurrency budgets; exponential backoff on 429/5xx.
* **Targets**:
  * **Build-time**: P95 ≤ 3–5 s on warmed bases (CI generator).
  * **Post-build delta**: P95 ≤ 10 s for 200 MB images with cache hit.
  * **Emit**: CycloneDX Protobuf ≤ 150 ms for 5k components; JSON ≤ 500 ms.
  * **Diff**: ≤ 200 ms for 5k vs 5k components.
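
For the registry throttle bullet, one common way to implement "exponential backoff on 429/5xx" is capped backoff with full jitter. The sketch below is an assumption about how that could look; the base delay, cap and function names are illustrative, not configured values from this document.

```python
import random


def should_retry(status: int) -> bool:
    """Retry on 429 (rate limited) and on any 5xx server error."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng=random.random) -> float:
    """Full jitter: a uniform delay in [0, min(cap, base * 2**attempt)]."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


# Upper bounds for the first few attempts (rng pinned to 1.0 for illustration).
ceilings = [backoff_delay(n, rng=lambda: 1.0) for n in range(8)]
```

Passing `rng` as a parameter keeps the function deterministic under test while production code uses the default random source.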
---
## 10) Security posture
* **AuthN**: Authority-issued short OpToks (DPoP/mTLS).
* **AuthZ**: scopes (`scanner.scan`, `scanner.export`, `scanner.catalog.read`).
* **mTLS** to **Signer**/**Attestor**; only **Signer** can sign.
* **No network fetches** during analysis (except registry pulls and optional Rekor index reads).
* **Sandboxing**: non-root containers; read-only FS; seccomp profiles; disable execution of scanned content.
* **Release integrity**: all first-party images are **cosign-signed**; Workers/WebService self-verify at startup.
---
## 11) Observability & audit
* **Metrics**:
  * `scanner.jobs_inflight`, `scanner.scan_latency_seconds`
  * `scanner.layer_cache_hits_total`, `scanner.file_cas_hits_total`
  * `scanner.artifact_bytes_total{format}`
  * `scanner.attestation_latency_seconds`, `scanner.rekor_failures_total`
* **Tracing**: spans for acquire→union→analyzers→compose→emit→sign→log.
* **Audit logs**: DSSE requests log `license_id`, `image_digest`, `artifactSha256`, `policy_digest?`, Rekor UUID on success.
---
## 12) Testing matrix
* **Determinism:** given same image + analyzers → byte-identical **CDX Protobuf**; JSON normalized.
* **OS packages:** ground-truth images per distro; compare to package DB.
* **Lang ecosystems:** sample images per ecosystem (Java/Node/Python/Go/.NET/Rust) with installed metadata; negative tests with lockfile-only inputs.
* **Native & EntryTrace:** ELF graph correctness; shell AST cases (includes, run-parts, exec, case/if).
* **Diff:** layer attribution against synthetic two-image sequences.
* **Performance:** cold vs warm cache; large `node_modules` and `site-packages`.
* **Security:** ensure no code execution from image; fuzz parser inputs; path traversal resistance on layer extract.
---
## 13) Failure modes & degradations
* **Missing OS DB** (files exist, DB removed): record **files**; do **not** fabricate package components; emit `bin:{sha256}` where unavoidable; flag in evidence.
* **Unreadable metadata** (corrupt dist-info): record file evidence; skip component creation; annotate.
* **Dynamic shell constructs**: mark unresolved edges with reasons (env var unknown) and continue; **Usage** view may be partial.
* **Registry rate limits**: honor backoff; queue job retries with jitter.
* **Signer refusal** (license/plan/version): scan completes; artifact produced; **no attestation**; WebService marks result as **unverified**.
---
## 14) Optional plugins (off by default)
* **Patch-presence detector** (signature-based backport checks). Reads curated function-level signatures from advisories; inspects binaries for patched code snippets to lower false positives for backported fixes. Runs as a sidecar analyzer that **annotates** components; never overrides core identities.
* **Runtime probes** (with Zastava): when allowed, compare **/proc/<pid>/maps** (DSOs actually loaded) with static **Usage** view for precision.
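
The runtime-probe comparison can be illustrated with a small parser for the Linux `/proc/<pid>/maps` text format. This is a sketch only: the `.so` filename heuristic and the function name are assumptions, and how Zastava actually collects or reports this data is not described here.

```python
def loaded_dsos(maps_text: str) -> set:
    """Return file-backed shared objects that are mapped executable."""
    dsos = set()
    for line in maps_text.splitlines():
        # Columns: address perms offset dev inode [pathname]
        parts = line.split(None, 5)
        if len(parts) < 6:
            continue  # anonymous mapping without a pathname
        perms, path = parts[1], parts[5].strip()
        if path.startswith("/") and "x" in perms and ".so" in path:
            dsos.add(path)
    return dsos


MAPS = """\
55d0c0a00000-55d0c0a01000 r-xp 00000000 08:01 131 /usr/bin/app
7f1c2a000000-7f1c2a1c0000 r-xp 00000000 08:01 262 /usr/lib/x86_64-linux-gnu/libc.so.6
7f1c2a400000-7f1c2a420000 r--p 00000000 08:01 262 /usr/lib/x86_64-linux-gnu/libc.so.6
7f1c2a600000-7f1c2a650000 r-xp 00000000 08:01 300 /usr/lib/x86_64-linux-gnu/libssl.so.3
7ffd5e1f0000-7ffd5e211000 rw-p 00000000 00:00 0 [stack]
"""

# Hypothetical static Usage view for the same image.
static_usage = {
    "/usr/lib/x86_64-linux-gnu/libc.so.6",
    "/usr/lib/x86_64-linux-gnu/libz.so.1",
}

runtime = loaded_dsos(MAPS)
runtime_only = runtime - static_usage   # loaded, but missed statically
never_loaded = static_usage - runtime   # linked, but not observed at runtime
```

Libraries in `runtime_only` typically come from `dlopen` or language runtimes, which is the kind of gap a runtime probe can close.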
---
## 15) DevOps & operations
* **HA**: WebService horizontal scale; Workers autoscale by queue depth & CPU; distributed locks on layers.
* **Retention**: ILM rules per artifact class (`short`, `default`, `compliance`); **Object Lock** for compliance artifacts (reports, signed SBOMs).
* **Upgrades**: bump **cache schema** when analyzer outputs change; WebService triggers refresh of dependent artifacts.
* **Backups**: Mongo (daily dumps); MinIO (versioned buckets, replication); Rekor v2 DB snapshots.
---
## 16) CLI & UI touch points
* **CLI**: `stellaops scan <ref>`, `stellaops diff --old --new`, `stellaops export`, `stellaops verify attestation <bundle|url>`.
* **UI**: Scan detail shows **Inventory/Usage** toggles, **Diff by Layer**, **Attestation badge** (verified/unverified), Rekor link, and **EntryTrace** chain with file:line breadcrumbs.
---
## 17) Roadmap (Scanner)
* **M2**: Windows containers (MSI/SxS/GAC analyzers), PE/Mach-O native analyzer, deeper Rust metadata.
* **M2**: Buildx generator GA (certified external registries), cross-registry trust policies.
* **M3**: Patch-presence plugin GA (opt-in), cross-image corpus clustering (evidence-only; not identity).
* **M3**: Advanced EntryTrace (POSIX shell features breadth, busybox detection).
---
### Appendix A — EntryTrace resolution (pseudo)
```csharp
ResolveEntrypoint(ImageConfig cfg, RootFs fs):
  cmd = Normalize(cfg.ENTRYPOINT, cfg.CMD)
  stack = [ Script(cmd, path=FindOnPath(cmd[0], fs)) ]
  visited = set()
  depth = 0                               // hops taken so far, bounded by MAX

  while stack not empty and depth < MAX:
    cur = stack.pop()
    depth = depth + 1
    if cur in visited: continue           // guard against include/exec cycles
    visited.add(cur)

    if IsShellScript(cur.path):
      ast = ParseShell(cur.path)
      foreach directive in ast:
        if directive is Source include:
          p = ResolveInclude(include.path, cur.env, fs)
          stack.push(Script(p))
        if directive is Exec call:
          p = ResolveExec(call.argv[0], cur.env, fs)
          stack.push(Program(p, argv=call.argv))
        if directive is Interpreter (python -m / node / java -jar):
          term = ResolveInterpreterTarget(directive, fs)
          stack.push(Program(term))
    else:
      return Terminal(cur.path)           // non-script program: end of chain

  // stack exhausted or depth limit hit without reaching a terminal program
  return Unknown(reason="no terminal program resolved")
```
### Appendix B — BOM-Index sidecar
```
struct Header { magic, version, imageDigest, createdAt }
vector<string> purls
map<purlIndex, roaring_bitmap> components
optional map<purlIndex, roaring_bitmap> usedByEntrypoint
```

docs/ARCHITECTURE_SIGNER.md Normal file

@@ -0,0 +1,418 @@
# component_architecture_signer.md — **StellaOps Signer** (2025Q4)
> **Scope.** Implementation-ready architecture for the **Signer**: the *only* service allowed to produce **StellaOps-verified** signatures over SBOMs and reports. It enforces **entitlement** (PoE), **release integrity** (scanner provenance), **sender-constrained auth** (DPoP/mTLS), and emits **in-toto/DSSE** bundles suitable for **Rekor v2** logging by the Attestor. Includes APIs, data flow, storage, quotas, security, and test matrices.
---
## 0) Mission & boundaries
**Mission.** Convert authenticated signing requests from trusted StellaOps services into **verifiable** DSSE bundles while enforcing **license policy** and **supply-chain integrity**.
**Boundaries.**
* **Signer does not push to Rekor** — it returns DSSE to the caller; **Attestor** logs to **Rekor v2**.
* **Signer does not compute PASS/FAIL** — it signs SBOMs/reports produced by Scanner/WebService after backend evaluation.
* **Signer is stateless for hot path** — long-term storage is limited to audit events; all secrets/keys live in KMS/HSM or are ephemeral (keyless).
---
## 1) Responsibilities (contract)
1. **Authenticate** caller with **OpTok** (Authority OIDC, DPoP or mTLS-bound).
2. **Authorize** scopes (`signer.sign`) + audience (`aud=signer`) + tenant/installation.
3. **Validate entitlement** via **PoE** (Proof-of-Entitlement) against Cloud Licensing `/license/introspect`.
4. **Verify release integrity** of the **scanner** image digest presented in the request: must be **cosign-signed** by StellaOps release key, discoverable via **OCI Referrers API**.
5. **Enforce plan & quotas** (concurrency/QPS/artifact size/rate caps).
6. **Mint signing identity**:
   * **Keyless** (default): get a short-lived X.509 cert from **Fulcio** using the Signer's OIDC identity and sign the DSSE.
   * **Keyful** (optional): sign with an HSM/KMS key.
7. **Return DSSE bundle** (subject digests + predicate + cert chain or KMS key id).
8. **Audit** every decision; expose metrics.
---
## 2) External dependencies
* **Authority** (on-prem OIDC): validates OpToks (JWKS/introspection) and DPoP/mTLS.
* **Licensing Service (cloud)**: `/license/introspect` to verify PoE (active, claims, expiry, revocation).
* **Fulcio** (Sigstore) *or* **KMS/HSM**: to obtain certs or perform signatures.
* **OCI Registry (Referrers API)**: to verify **scanner** image release signature.
* **Attestor**: downstream service that writes DSSE bundles to **Rekor v2**.
* **Config/state stores**: Redis (caches, rate buckets), Mongo/Postgres (audit log).
---
## 3) API surface (mTLS; DPoP supported)
Base path: `/api/v1/signer`. **All endpoints require**:
* Access token (JWT) from **Authority** with `aud=signer`, `scope=signer.sign`.
* **Sender constraint**: DPoP proof per request or mTLS client cert.
* **PoE** presented as either:
  * **Client TLS cert** (if PoE is mTLS-style) chained to Licensing CA, *or*
  * **PoE JWT** (DPoP/mTLS-bound) in `X-PoE` header or request body.
### 3.1 `POST /sign/dsse`
Request (JSON):
```json
{
  "subject": [
    { "name": "s3://stellaops/images/sha256:.../inventory.cdx.pb",
      "digest": { "sha256": "..." } }
  ],
  "predicateType": "https://stella-ops.org/attestations/sbom/1",
  "predicate": {
    "image_digest": "sha256:...",
    "stellaops_version": "2.3.1 (2027.04)",
    "license_id": "LIC-9F2A...",
    "customer_id": "CUST-ACME",
    "plan": "pro",
    "policy_digest": "sha256:...", // optional for final reports
    "views": ["inventory", "usage"],
    "created": "2025-10-17T12:34:56Z"
  },
  "scannerImageDigest": "sha256:sc-web-or-worker-digest",
  "poe": {
    "format": "jwt", // or "mtls"
    "value": "eyJhbGciOi..." // PoE JWT when not using mTLS PoE
  },
  "options": {
    "signingMode": "keyless", // "keyless" | "kms"
    "expirySeconds": 600, // cert lifetime hint (keyless)
    "returnBundle": "dsse+cert" // dsse (default) | dsse+cert
  }
}
```
Response 200:
```json
{
  "bundle": {
    "dsse": { "payloadType": "application/vnd.in-toto+json", "payload": "<base64>", "signatures": [ ... ] },
    "certificateChain": [ "-----BEGIN CERTIFICATE-----...", "... root ..." ],
    "mode": "keyless",
    "signingIdentity": { "issuer": "https://fulcio.internal", "san": "urn:stellaops:signer", "certExpiry": "2025-10-17T12:44:56Z" }
  },
  "policy": { "plan": "pro", "maxArtifactBytes": 104857600, "qpsRemaining": 97 },
  "auditId": "a7c9e3f2-1b7a-4e87-8c3a-90d7d2c3ad12"
}
```
Errors (RFC 7807):
* `401 invalid_token` (JWT/DPoP/mTLS failure)
* `403 entitlement_denied` (PoE invalid/revoked/expired; release year mismatch)
* `403 release_untrusted` (scanner image not Stella-signed)
* `429 plan_throttled` (license plan caps)
* `413 artifact_too_large` (size cap)
* `400 invalid_request` (schema/predicate/type invalid)
* `503 signing_unavailable` (Fulcio/KMS outage; matches the failure table in section 15)
### 3.2 `GET /verify/referrers?imageDigest=<sha256>`
Checks whether the **image** at digest is signed by **StellaOps release key**.
Response:
```json
{ "trusted": true, "signatures": [ { "type": "cosign", "digest": "sha256:...", "signedBy": "StellaOps Release 2027 Q2" } ] }
```
> **Note:** This endpoint is also used internally by Signer before issuing signatures.
---
## 4) Validation pipeline (hot path)
```mermaid
sequenceDiagram
autonumber
participant Client as Scanner.WebService
participant Auth as Authority (OIDC)
participant Sign as Signer
participant Lic as Licensing Service (cloud)
participant Reg as OCI Registry (Referrers)
participant Ful as Fulcio/KMS
Client->>Sign: POST /sign/dsse (OpTok + DPoP/mTLS, PoE, request)
Note over Sign: 1) Validate OpTok, audience, scope, DPoP/mTLS binding
Sign->>Lic: /license/introspect(PoE)
Lic-->>Sign: { active, claims: {license_id, plan, valid_release_year, max_version}, exp }
Note over Sign: 2) Enforce plan/version window and revocation
Sign->>Reg: Verify scannerImageDigest signed (Referrers + cosign)
Reg-->>Sign: OK with signer identity
Note over Sign: 3) Enforce release integrity
Note over Sign: 4) Enforce quotas (QPS/concurrency/size)
Sign->>Ful: Mint cert (keyless) or sign via KMS
Ful-->>Sign: Cert or signature
Sign-->>Client: DSSE bundle (+cert chain), policy counters, auditId
```
**DPoP nonce dance (when enabled for high-value ops):**
* If DPoP proof lacks a valid nonce, Signer replies `401` with `WWW-Authenticate: DPoP error="use_dpop_nonce", dpop_nonce="<nonce>"`.
* Client retries with new proof including the nonce; Signer validates nonce and `jti` uniqueness (Redis TTL cache).
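
The `jti` uniqueness check can be pictured as a small TTL cache. The sketch below is an in-memory stand-in for the Redis TTL cache mentioned above; the class name, the 600-second default and the injectable clock are assumptions for illustration.

```python
import time


class ReplayCache:
    """Remember each DPoP `jti` for a fixed TTL and reject repeats."""

    def __init__(self, ttl_seconds: float = 600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._seen = {}  # jti -> expiry timestamp

    def check_and_store(self, jti: str) -> bool:
        """Return True if `jti` is fresh (and record it), False on replay."""
        now = self._clock()
        # Drop entries whose TTL has elapsed.
        self._seen = {k: exp for k, exp in self._seen.items() if exp > now}
        if jti in self._seen:
            return False
        self._seen[jti] = now + self._ttl
        return True


fake_now = [0.0]
cache = ReplayCache(ttl_seconds=600, clock=lambda: fake_now[0])
first = cache.check_and_store("jti-1")
replay = cache.check_and_store("jti-1")
fake_now[0] = 601.0
after_expiry = cache.check_and_store("jti-1")
```

With several Signer replicas the check has to be a single atomic operation in the shared store, so that two replicas cannot both accept the same `jti`.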
---
## 5) Entitlement enforcement (PoE)
* **Accepted forms**:
  * **mTLS PoE**: client presents a **PoE client cert** at TLS handshake; Signer validates chain to **Licensing CA** (CA bundle configured) and calls `/license/introspect` with cert thumbprint + serial.
  * **JWT PoE**: `X-PoE` bearer token (DPoP/mTLS-bound) is validated (sig + `cnf`) locally (Licensing JWKS) and then **introspected** for status and claims.
* **Claims required**:
  * `license_id`, `plan` (free|pro|enterprise|gov), `valid_release_year`, `max_version`, `exp`.
  * Optional: `tenant_id`, `customer_id`, `entitlements[]`.
* **Enforcements**:
  * Reject if **revoked**, **expired**, **plan mismatch** or **release outside window** (`stellaops_version` in predicate exceeds `max_version` or release date beyond `valid_release_year`).
  * Apply plan **throttles** (QPS/concurrency/artifact bytes) via token-bucket in Redis keyed by `license_id`.
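
The enforcement rules above might be expressed roughly as follows. This sketch assumes the introspection response has the shape shown in the sequence diagram (`active`, `exp`, `claims`), that `max_version` is a dotted `major.minor.patch` string, and that `stellaops_version` follows the `2.3.1 (2027.04)` pattern from the request example. The reason strings are hypothetical.

```python
import re

_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)\s*\((\d{4})\.(\d{2})\)")


def parse_release(stellaops_version: str):
    """Split '2.3.1 (2027.04)' into ((2, 3, 1), 2027)."""
    match = _VERSION_RE.fullmatch(stellaops_version.strip())
    if match is None:
        raise ValueError(f"unparseable stellaops_version: {stellaops_version!r}")
    major, minor, patch, year, _month = (int(g) for g in match.groups())
    return (major, minor, patch), year


def check_entitlement(introspection: dict, stellaops_version: str, now_epoch: int):
    """Return None when entitled, else a short denial reason."""
    if not introspection.get("active", False):
        return "inactive_or_revoked"
    if introspection["exp"] <= now_epoch:
        return "expired"
    claims = introspection["claims"]
    version, release_year = parse_release(stellaops_version)
    max_version = tuple(int(p) for p in claims["max_version"].split("."))
    if version > max_version:
        return "version_exceeds_max"
    if release_year > claims["valid_release_year"]:
        return "release_year_out_of_window"
    return None


poe = {
    "active": True,
    "exp": 2_000_000_000,
    "claims": {"license_id": "LIC-1", "plan": "pro",
               "valid_release_year": 2027, "max_version": "2.4.0"},
}
ok = check_entitlement(poe, "2.3.1 (2027.04)", 1_900_000_000)
```

Any non-`None` result would map to `403 entitlement_denied`; plan-mismatch checks are left out because the document does not define what a plan permits.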
---
## 6) Release integrity (scanner provenance)
* **Input**: `scannerImageDigest` representing the actual Scanner component that produced the artifact.
* **Check**:
  1. Use **OCI Referrers API** to enumerate signatures of that digest.
  2. Verify **cosign** signatures against the configured **StellaOps Release** keyring (keyless Fulcio roots *or* keyful public keys).
  3. Optionally require Rekor inclusion for those signatures.
* **Policy**:
  * If not signed by an authorized **StellaOps Release** identity → **deny**.
  * If signed but **release year** > PoE `valid_release_year` → **deny**.
* **Cache**: LRU of digest → verification result (TTL 10–30 min) to avoid registry thrash.
---
## 7) Signing modes
### 7.1 Keyless (default; Sigstore Fulcio)
* Signer authenticates to **Fulcio** using its on-prem OIDC identity (client credentials) and requests a **short-lived cert** (5–10 min).
* Generates **ephemeral keypair**, gets cert for the public key, signs DSSE with the **private key**.
* DSSE **bundle** includes **certificate chain**; verifiers validate to Fulcio root.
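
For orientation, the bytes that actually get signed in a DSSE envelope are the pre-authentication encoding (PAE) of the payload type and payload, as defined by the DSSE specification. The sketch below builds that encoding and an envelope; the `placeholder_sign` callback is a loudly labelled stand-in (a plain hash, not a signature), since real signing happens with the ephemeral key or through KMS.

```python
import base64
import hashlib


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE v1 PAE: 'DSSEv1 <len(type)> <type> <len(body)> <body>'."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (
        len(type_bytes), type_bytes, len(payload), payload)


def build_envelope(payload_type: str, payload: bytes, sign, keyid: str = "") -> dict:
    """Assemble a DSSE envelope; `sign` receives the PAE bytes."""
    signature = sign(pae(payload_type, payload))
    return {
        "payloadType": payload_type,
        "payload": base64.b64encode(payload).decode("ascii"),
        "signatures": [
            {"keyid": keyid, "sig": base64.b64encode(signature).decode("ascii")}
        ],
    }


def placeholder_sign(message: bytes) -> bytes:
    # NOT a signature: a stand-in so the sketch runs without a crypto library.
    return hashlib.sha256(message).digest()


statement = b'{"_type":"https://in-toto.io/Statement/v1"}'
envelope = build_envelope("application/vnd.in-toto+json", statement, placeholder_sign)
```

Signing the PAE rather than the raw payload binds the payload type into the signature, so a verifier cannot be tricked into interpreting the same bytes under a different type.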
### 7.2 Keyful (optional; KMS/HSM)
* Signer uses a configured **KMS** key (AWS KMS, GCP KMS, Azure Key Vault, Vault Transit, or HSM).
* DSSE bundle includes **key metadata** (kid, cert chain if x509).
* Recommended for FIPS/sovereign environments.
---
## 8) Predicates & schema
Supported **predicate types** (extensible):
* `https://stella-ops.org/attestations/sbom/1` (SBOM emissions)
* `https://stella-ops.org/attestations/report/1` (final PASS/FAIL reports)
* `https://stella-ops.org/attestations/vex-export/1` (Vexer exports; optional)
**Validation**:
* JSON-Schema per predicate type; **canonical property order**.
* `subject[*].digest` must include `sha256`.
* `predicate.stellaops_version` must parse and match policy windows.
---
## 9) Quotas & throttling
Per `license_id` (from PoE):
* **QPS** (token bucket), **concurrency** (semaphore), **artifact bytes** (sliding window).
* On exceed → `429 plan_throttled` with `Retry-After`.
* Free/community plan may also receive **randomized delay** to disincentivize farmed signing.
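
The QPS part of these quotas is a classic token bucket. The sketch below is an in-process version with an injectable clock; in the design described here the state would live in Redis keyed by `license_id`, and the class and parameter names are assumptions.

```python
import time


class TokenBucket:
    """Refill `rate` tokens per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def try_acquire(self, cost: float = 1.0) -> float:
        """Return 0.0 if admitted, else seconds to wait (a Retry-After hint)."""
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return 0.0
        return (cost - self._tokens) / self.rate


fake_now = [0.0]
# Values mirror the `free` plan in the configuration section (qps: 5).
buckets = {"LIC-FREE": TokenBucket(rate=5, capacity=5, clock=lambda: fake_now[0])}

burst = [buckets["LIC-FREE"].try_acquire() for _ in range(5)]
wait = buckets["LIC-FREE"].try_acquire()      # sixth request in the same instant
fake_now[0] = 1.0
after_refill = buckets["LIC-FREE"].try_acquire()
```

A non-zero return value maps naturally onto `429 plan_throttled` with the wait rounded up into `Retry-After`.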
---
## 10) Storage & caches
* **Redis**:
  * DPoP nonce & `jti` replay cache (TTL ≤ 10 min).
  * PoE introspection cache (short TTL, e.g., 60–120 s).
  * Release-verify cache (`scannerImageDigest` → { trusted, ts }).
* **Audit store** (Mongo or Postgres): `signer.audit_events`
```
{ _id, ts, tenantId, installationId, licenseId, customerId,
  plan, actor{sub,cnf}, request{predicateType, subjectSha256[], imageDigest},
  poe{type, thumbprint|jwtKid, exp, introspectSnapshot},
  release{digest, signerId, policy},
  mode: "keyless"|"kms",
  result: "success"|"deny:<reason>"|"error:<reason>",
  bundleSha256? }
```
* **Config**: StellaOps release signing keyring, Fulcio roots, Licensing CA bundle.
---
## 11) Security & privacy
* **mTLS** on all Signer endpoints.
* **No bearer fallbacks** — DPoP/mTLS enforced for `aud=signer`.
* **PoE** is never persisted beyond audit snapshots (minimized fields).
* **Secrets**: no long-lived private keys on disk (keyless) or handled via KMS APIs.
* **Input hardening**: schema-validate predicates; cap payload sizes; zstd/gzip decompression bombs guarded.
* **Logging**: redact PoE JWTs, access tokens, DPoP proofs; log only hashes and identifiers.
---
## 12) Metrics & observability
* `signer.requests_total{result}`
* `signer.latency_seconds{stage=auth|introspect|release_verify|sign}`
* `signer.poe_failures_total{reason}`
* `signer.release_verify_failures_total{reason}`
* `signer.plan_throttle_total{license_id}`
* `signer.bundle_bytes_total`
* `signer.keyless_certs_issued_total` / `signer.kms_sign_total`
* OTEL traces across stages; correlation id (`auditId`) returned to client.
---
## 13) Configuration (YAML)
```yaml
signer:
  listen: "https://0.0.0.0:8443"
  authority:
    issuer: "https://authority.internal"
    jwksUrl: "https://authority.internal/jwks"
    require: "dpop" # "dpop" | "mtls"
  poe:
    mode: "both" # "jwt" | "mtls" | "both"
    licensing:
      introspectUrl: "https://www.stella-ops.org/api/v1/license/introspect"
      caBundle: "/etc/ssl/licensing-ca.pem"
      cacheTtlSeconds: 90
  release:
    referrers:
      allowRekorVerified: true
    keyrings:
      - type: "cosign-keyless"
        fulcioRoots: ["/etc/fulcio/root.pem"]
        identities:
          - san: "mailto:release@stella-ops.org"
          - san: "https://sigstore.dev/oidc/stellaops"
  signing:
    mode: "keyless" # "keyless" | "kms"
    fulcio:
      issuer: "https://fulcio.internal"
      oidcClientId: "signer"
      oidcClientSecretRef: "env:FULCIO_CLIENT_SECRET"
      certTtlSeconds: 600
    kms:
      provider: "aws-kms"
      keyId: "arn:aws:kms:...:key/..."
  quotas:
    default:
      qps: 100
      concurrency: 20
      maxArtifactBytes: 104857600
    free:
      qps: 5
      concurrency: 1
      maxArtifactBytes: 1048576
```
---
## 14) Testing matrix
* **Auth & DPoP**: bad `aud`, wrong `jkt`, replayed `jti`, missing nonce, mTLS mismatch.
* **PoE**: expired, revoked, plan mismatch, release year gate, max_version gate.
* **Release verify**: unsigned digest, wrong signer, Rekor-absent (when required), referrers unreachable.
* **Signing**: Fulcio outage; KMS timeouts; bundle correctness (verifier harness).
* **Quotas**: burst above QPS, artifact over size, concurrency overflow.
* **Schema**: invalid predicate types/required fields.
* **Determinism**: same request → identical DSSE (aside from cert validity period).
* **Perf**: P95 end-to-end under 120 ms with caches warm (excluding network to Fulcio).
---
## 15) Failure modes & responses
| Failure | HTTP | Problem type | Notes |
| ----------------------- | ---- | --------------------- | -------------------------------------------- |
| Invalid OpTok / DPoP | 401 | `invalid_token` | `WWW-Authenticate` with DPoP nonce if needed |
| PoE invalid/revoked | 403 | `entitlement_denied` | Include `license_id` (hashed) and reason |
| Scanner image untrusted | 403 | `release_untrusted` | Include digest and required identity |
| Plan throttle | 429 | `plan_throttled` | Include limits and `Retry-After` |
| Artifact too large | 413 | `artifact_too_large` | Include cap |
| Fulcio/KMS down         | 503  | `signing_unavailable` | `Retry-After` with jitter                    |
---
## 16) Deployment & HA
* Run ≥ 2 replicas; front with L7 LB; **sticky** not required.
* Redis for replay/quota caches (HA).
* Audit sink (Mongo/Postgres) in primary region; asynchronous write with local fallback buffer.
* Fulcio/KMS clients configured with retries/backoff; circuit breakers.
---
## 17) Implementation notes
* **.NET 10** minimal API + Kestrel mTLS; custom DPoP middleware; JWT/JWKS cache.
* **Cosign verification** via sigstore libraries; Referrers queries over registry API with retries.
* **DSSE** via in-toto libs; canonical JSON writer for predicates.
* **Backpressure** paths: refuse at auth/quota stages before any expensive network calls.
---
## 18) Examples (wire)
**Request (free plan; expect throttle if burst):**
```http
POST /api/v1/signer/sign/dsse HTTP/1.1
Authorization: DPoP <JWT>
DPoP: <proof>
Content-Type: application/json
```
**Error (release untrusted):**
```json
{
"type": "https://stella-ops.org/problems/release_untrusted",
"title": "Scanner image not signed by StellaOps",
"status": 403,
"detail": "sha256:abcd... not in trusted keyring",
"instance": "urn:audit:a7c9e3f2-..."
}
```
---
## 19) Roadmap
* **Key Transparency**: optional publication of Signer's *own* certs to a KT log.
* **Attested Build**: SLSA-style provenance for Signer container itself, checked at startup.
* **FIPS mode**: enforce `ES256` + KMS/HSM only; disallow Ed25519.
* **Dual attestation**: optional immediate push to **Attestor** (sync mode) with timeout budget, returning Rekor UUID inline.

docs/ARCHITECTURE_UI.md Normal file

@@ -0,0 +1,342 @@
# component_architecture_web_ui.md — **StellaOps Web UI** (2025Q4)
> **Scope.** Implementation-ready architecture for the **Angular SPA** that operators and developers use to drive StellaOps. This document defines UX surfaces, module boundaries, data flows, auth, RBAC, real-time updates, performance targets, i18n/a11y, security headers, testing and deployment. The UI is a *consumer* of backend APIs (Scanner, Policy, Vexer, Feedser, Attestor, Authority) and never performs scanning, merging, or signing on its own.
---
## 0) Mission & non-goals
**Mission.** Provide a **fast, explainable** console for:
* Scans (status, SBOMs, diffs, EntryTrace, attestation).
* Policy management (rules, exemptions, VEX consumption view).
* Vulnerability intel (Feedser status), VEX consensus exploration (Vexer).
* Runtime posture (Zastava observer + admission).
* Admin operations (tenants, tokens, quotas, licensing posture).
**Non-goals.** No client-side crypto signing; no Docker/CRI access; no direct registry access beyond fetching static assets or OCI referrer summaries exposed by backend.
---
## 1) Technology baseline
* **Framework**: Angular 17+ (Standalone APIs / Signals), TypeScript 5.
* **Styling**: Tailwind CSS + headless component patterns; CSS variables for theming.
* **Charts**: Lightweight SVG (uPlot or Apache ECharts via on-demand import).
* **State**: Angular **Signals** + `@ngrx/signals` store for cross-page slices.
* **Transport**: `fetch` + RxJS interop; **SSE** (EventSource) for progress streams.
* **Build**: Angular CLI + Vite builder.
* **Testing**: Jest + Testing Library, Playwright for e2e.
* **Packaging**: Containerized NGINX (immutable assets, ETag + content hashing).
---
## 2) High-level module map
```
/app
├─ core/ # bootstrap, config, auth, http, error boundary, i18n
├─ shared/ # UI kit (tables, code-viewers, badges), pipes
├─ dashboard/ # live tiles, fleet KPIs, feed/vex age, queue depth
├─ scans/ # scan list, detail, SBOM viewer, diff-by-layer, EntryTrace
├─ runtime/ # Zastava posture, drift events, admission decisions
├─ policy/ # rules editor (YAML/Rego), exemptions, previews
├─ vex/ # VEX explorer (claims, consensus, conflicts)
├─ feedser/ # source health, export cursors, rebuild/export triggers
├─ attest/ # attestation proofs, verification bundles, Rekor links
├─ admin/ # tenants, roles, clients, quotas, licensing posture
└─ plugins/ # route plug-ins (lazy remote modules, governed)
```
Each feature folder builds as a **standalone route** (lazy loaded). All HTTP shapes live in `core/api/` clients with shared DTOs.
---
## 3) Navigation & key views
### 3.1 Dashboard
* **Tiles**: “New criticals (24h)”, “VEX suppressions applied”, “Attested SBOMs (7d)”, “Feed age per provider”, “Scanner queue depth”, “Admission events”.
* **Trends**: sparkline for vulns/day, pass/fail rates, attestation throughput.
### 3.2 Scans
* **Scan list** with status, image digest, repo, time, artifacts, attestation badge.
* **Scan detail**:
* **SBOM viewer**: Inventory/Usage toggle; component table (virtualized), filters by package type, severity, source.
* **Diff by layer**: A→B change grid (added/removed/upgraded), grouped by introducing/removing layer; tooltips show provenance and links to layer SBOM fragment.
* **EntryTrace**: shell chain with file:line breadcrumbs; jump-to source viewer (read-only, hex-dump fallback).
* **Attestation**: Rekor UUID, index, inclusion proof; **Verify** button calls Attestor `/verify`.
* **Export**: download buttons (CycloneDX JSON, Protobuf, SPDX JSON); size shown; SHA-256 inline.
### 3.3 Runtime (Zastava)
* **Observer timeline**: container start/stop, drift, policy violations; faceted by namespace/owner.
* **Live process view**: top N processes, loaded libs summary vs Usage SBOM.
* **Admission decisions**: per-namespace rules, allow/deny events, cache TTL, reasons.
### 3.4 Policy
* **Policy bundles**: active vs staged; diff viewer with change summary.
* **Editors**:
* YAML rules (ignore lists, thresholds, vendor precedence overrides).
* Rego blocks (advanced gates) with **WASM** preview evaluator (client-side sandbox) for “preview” against sample SBOMs.
* **VEX inclusion controls**: weight sliders (visualization only), provider allow/deny toggles.
* **Preview**: select SBOM (or image digest) → show verdict under staged policy.
### 3.5 Vexer
* **Claims explorer**: search by vulnId/productKey/provider; show raw claim (status, justification, evidence).
* **Consensus view**: rollup per (vuln, product) with accepted/rejected sources, weights, timestamps.
* **Conflicts**: grid of top conflicts; filters for justification gates failed.
### 3.6 Feedser
* **Sources** table: staleness, last run, errors.
* **Advisory search**: by CVE/alias; show normalized affected ranges.
* **Exports**: trigger full/delta JSON/Trivy DB; show manifest digests and Rekor link if attested.
### 3.7 Attest
* **Proofs list**: last 7 days Rekor entries; filter by kind (sbom/report/vex).
* **Verification**: paste UUID or upload bundle → verify; result with explanations (chain, Merkle path).
### 3.8 Admin
* **Tenants/Installations**: view/edit, isolation hints.
* **Clients & roles**: Authority clients, role→scope mapping, rotation hints.
* **Quotas**: per license plan, counters, throttle events.
* **Licensing posture**: last PoE introspection snapshot (redacted), release window.
---
## 4) Auth, sessions & RBAC
### 4.1 OIDC flow
* **Authorization Code + PKCE** to **Authority**.
* **ID Token** for UX identity; **Access Token** (OpTok) for APIs (25min TTL).
* **DPoP (browser)**: generate ephemeral **WebCrypto** keypair; store public JWK in memory, private key in **IndexedDB** (non-exportable if platform allows). Access token includes `cnf.jkt`; each API call adds `DPoP` proof; handle nonce challenges automatically.
* **Refresh**: optional DPoP-bound refresh tokens with rotation; otherwise silent renew.
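
To make the per-call `DPoP` proof concrete, the unsigned parts of a proof JWT might be assembled as below. This is a sketch under stated assumptions: the claim names (`jti`, `htm`, `htu`, `iat`, `nonce`) and the `dpop+jwt` type follow RFC 9449, while `buildDpopProofParts`, the injected `jti`/`nowMs` parameters and the choice of `ES256` are illustrative. Signing with the WebCrypto key and the `ath` access-token hash are deliberately left out.

```typescript
// Sketch only: builds the unsigned header and claims of a DPoP proof (RFC 9449).
// Signing (WebCrypto ECDSA) and the `ath` claim are intentionally omitted.
interface EcPublicJwk {
  kty: 'EC';
  crv: 'P-256';
  x: string;
  y: string;
}

interface DpopProofParts {
  header: { typ: 'dpop+jwt'; alg: 'ES256'; jwk: EcPublicJwk };
  claims: { jti: string; htm: string; htu: string; iat: number; nonce?: string };
}

interface DpopProofInput {
  method: string;
  url: string;
  publicJwk: EcPublicJwk;
  jti: string;    // unique per proof; injected so the function stays pure
  nowMs: number;  // current time in milliseconds; injected for testability
  nonce?: string; // server-provided nonce from a previous challenge, if any
}

// `htu` is the request URI without query string or fragment.
function stripQueryAndFragment(url: string): string {
  const cut = url.search(/[?#]/);
  return cut === -1 ? url : url.slice(0, cut);
}

function buildDpopProofParts(input: DpopProofInput): DpopProofParts {
  const claims: DpopProofParts['claims'] = {
    jti: input.jti,
    htm: input.method.toUpperCase(),
    htu: stripQueryAndFragment(input.url),
    iat: Math.floor(input.nowMs / 1000),
  };
  if (input.nonce !== undefined) {
    claims.nonce = input.nonce;
  }
  return { header: { typ: 'dpop+jwt', alg: 'ES256', jwk: input.publicJwk }, claims };
}
```

Keeping the builder pure means the nonce-retry path can simply call it again with the nonce from the challenge and a fresh `jti`.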
### 4.2 RBAC
* Roles (`ui.read`, `ui.admin`, plus service roles) are embedded in ID token or fetched via `/me` endpoint.
* **Route guards** enforce access; **feature flags** hide admin pages for non-admins.
### 4.3 Session storage
* Access tokens & refresh metadata in memory; persist **only** minimal session (subject, expiries) in `sessionStorage`. Never persist raw JWTs to `localStorage`. Use **SameSite=Lax** cookies for anti-CSRF if cookies are required (prefer pure bearer headers + DPoP).
---
## 5) HTTP layer & API clients
* **`core/http/api-client.ts`** centralizes:
* Base URLs (Scanner, Vexer, Feedser, Attestor).
* **Retry** policies on idempotent GETs (backoff + jitter).
* **Problem+JSON** parser → uniform error toasts with correlation ID.
* **SSE** helper (EventSource) with auto-reconnect & backpressure.
* **DPoP** injector & nonce handling.
* Typed API clients (DTOs in `core/api/models.ts`):
* `ScannerApi`, `PolicyApi`, `VexerApi`, `FeedserApi`, `AttestorApi`, `AuthorityApi`.
**DTO examples (abbrev):**
```ts
export type ImageDigest = `sha256:${string}`;
export interface ScanSummary {
imageDigest: ImageDigest; createdAt: string;
artifacts: { view: 'inventory'|'usage'; format: 'cdx-json'|'cdx-pb'|'spdx-json'; sha256: string; size: number }[];
status: 'queued'|'running'|'completed'|'error';
rekor?: { uuid: string; index?: number; url?: string };
}
export interface DiffEntry {
key: string; change: 'added'|'removed'|'upgraded'|'downgraded';
fromVersion?: string; toVersion?: string; layer: string; usedByEntrypoint?: boolean;
}
export interface VexConsensus {
vulnId: string; productKey: string; rollupStatus: 'affected'|'not_affected'|'fixed'|'under_investigation';
sources: { providerId: string; status: string; weight: number; accepted: boolean; reason: string }[];
}
```
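
The retry policy and Problem+JSON parsing that `api-client.ts` centralizes could look roughly like the following. This is a hedged sketch: `backoffDelayMs`, `parseProblem`, the 200 ms base and the 5 s cap are hypothetical choices, and "full jitter" is one common backoff variant rather than something the source prescribes.

```typescript
// Sketch only: names and default timings are assumptions, not the real client.
interface Problem {
  type: string;
  title: string;
  status: number;
  detail?: string;
  instance?: string;
}

// Exponential backoff with "full jitter": a random delay in [0, min(cap, base * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  random: () => number = Math.random,
  baseMs = 200,
  capMs = 5000,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Accepts an already-decoded JSON body and returns a Problem, or undefined if
// the body does not look like Problem+JSON.
function parseProblem(body: unknown): Problem | undefined {
  if (typeof body !== 'object' || body === null) return undefined;
  const b = body as Record<string, unknown>;
  const title = b['title'];
  const status = b['status'];
  if (typeof title !== 'string' || typeof status !== 'number') return undefined;
  const type = b['type'];
  const problem: Problem = {
    // A missing `type` defaults to "about:blank" in the Problem Details spec.
    type: typeof type === 'string' ? type : 'about:blank',
    title,
    status,
  };
  const detail = b['detail'];
  if (typeof detail === 'string') problem.detail = detail;
  const instance = b['instance'];
  if (typeof instance === 'string') problem.instance = instance;
  return problem;
}
```

The `instance` field is what an error toast would surface as the correlation ID.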
---
## 6) State, caching & real-time
* **Per-page stores** (Signals) for list filters, pagination, and selected entities.
* **Normalized caches** keyed by `(imageDigest, view, format)`; artifacts are downloaded via pre-signed URLs from Scanner and streamed; SHA-256 verified client-side before exposing “verified” badge.
* **SSE channels**:
* `/scans/{id}/events` → progress log.
* `/runtime/events/stream` (optional) → live drift/admission feed (rate-limited).
* **Cache invalidation** on job completion or explicit “refresh”.
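
A cache keyed by `(imageDigest, view, format)` with invalidation per image might be sketched as follows. `ArtifactCache` and its methods are hypothetical names, and the `|` separator is an assumption that relies on digests, views and formats never containing that character.

```typescript
// Sketch only: an in-memory cache keyed by (imageDigest, view, format).
type ArtifactView = 'inventory' | 'usage';
type ArtifactFormat = 'cdx-json' | 'cdx-pb' | 'spdx-json';

class ArtifactCache<T> {
  private readonly entries = new Map<string, T>();

  static key(imageDigest: string, view: ArtifactView, format: ArtifactFormat): string {
    return `${imageDigest}|${view}|${format}`;
  }

  get(imageDigest: string, view: ArtifactView, format: ArtifactFormat): T | undefined {
    return this.entries.get(ArtifactCache.key(imageDigest, view, format));
  }

  set(imageDigest: string, view: ArtifactView, format: ArtifactFormat, value: T): void {
    this.entries.set(ArtifactCache.key(imageDigest, view, format), value);
  }

  // Drops every cached artifact for one image, e.g. on job completion or an
  // explicit "refresh". Returns the number of entries removed.
  invalidateImage(imageDigest: string): number {
    const prefix = `${imageDigest}|`;
    const doomed = Array.from(this.entries.keys()).filter((k) => k.startsWith(prefix));
    for (const k of doomed) {
      this.entries.delete(k);
    }
    return doomed.length;
  }
}
```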
---
## 7) SBOM viewing & diff UX
* **Huge tables** rendered with **virtual scrolling** (CDK Virtual Scroll); sort/filter performed client-side for ≤ 20k rows; beyond that, server-side queries via BOM-Index endpoints.
* **Component row** shows purl, version, origin (OS pkg / metadata / linker / attested), licenses, and **used** badge (Usage view).
* **Diff**: compact heatmap per layer; clicking opens a right pane with evidence: introducing paths, file hashes, VEX notes (from Vexer consensus) and links to advisories (Feedser).
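
The per-layer heatmap needs change counts grouped by layer. A minimal sketch, reusing the `DiffEntry` shape from the DTO examples (redeclared here so the snippet is self-contained); `groupDiffByLayer` is a hypothetical helper, not an existing function in the codebase.

```typescript
// Sketch only: counts changes per layer for a heatmap.
type ChangeKind = 'added' | 'removed' | 'upgraded' | 'downgraded';

interface DiffEntry {
  key: string;
  change: ChangeKind;
  fromVersion?: string;
  toVersion?: string;
  layer: string;
  usedByEntrypoint?: boolean;
}

type LayerCounts = Record<ChangeKind, number>;

// Single pass over the entries; the Map keeps layers in first-seen order.
function groupDiffByLayer(entries: readonly DiffEntry[]): Map<string, LayerCounts> {
  const byLayer = new Map<string, LayerCounts>();
  for (const entry of entries) {
    let counts = byLayer.get(entry.layer);
    if (counts === undefined) {
      counts = { added: 0, removed: 0, upgraded: 0, downgraded: 0 };
      byLayer.set(entry.layer, counts);
    }
    counts[entry.change] += 1;
  }
  return byLayer;
}
```

The grouping is linear in the number of entries, which suits the client-side budget for diffs of a few thousand changes.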
---
## 8) Policy editor & VEX integration
* **YAML editor**: Monaco-based with schema hints; previews show which rules matched.
* **Rego editor**: Monaco with WASM-OPA sandbox for read-only evaluation on sample SBOMs.
* **VEX toggles**: per-provider enable/disable; “explain” drawer shows why a claim was accepted/rejected (justification gate, signature state, weight).
* **Staged → Active** promotion is a two-step flow with confirmation and automatic policy digest computation.
---
## 9) Accessibility, i18n & theming
* **A11y**: WCAG 2.2 AA; keyboard navigation, focus management, ARIA roles; color-contrast tokens verified by unit tests.
* **I18n**: Angular i18n + runtime translation loader (`/locales/{lang}.json`); dates/numbers localized via `Intl`.
* **Languages**: English default; Bulgarian, German, Japanese as initial additions.
* **Theming**: dark/light via CSS variables; persisted in a `prefers-color-scheme`-aware store.
---
## 10) Performance budgets
* **TTI** ≤ 1.5s on 4G/slow CPU (first visit), ≤ 0.6s repeat (HTTP/2, cached).
* **JS** initial < 300KB gz (lazy routes).
* **SBOM list**: render 10k rows in < 70ms with virtualization; filter in < 150ms.
* **Diff view**: compute client-side grouping for 5k changes in < 120ms.
Techniques: route-level code splitting; `ChangeDetectionStrategy.OnPush`; Signals; server compression (zstd/gzip), immutable assets with long `max-age` (cache busting via hashes).
---
## 11) Security headers & CSP
* **CSP**: `default-src 'self'; connect-src 'self' https://*.internal; img-src 'self' data:; script-src 'self'; style-src 'self' 'unsafe-inline'; frame-ancestors 'none';`
* **HSTS** enabled at gateway.
* **Referrer Policy**: `no-referrer`.
* **X-Frame-Options**: `DENY`.
* **COOP/COEP** to enable faster cross-origin isolation (for WASM OPA).
* **Subresource Integrity (SRI)** for third-party fonts (minimize third-party).
---
## 12) Error handling & UX hygiene
* **Global error boundary** surfaces Problem+JSON `title/detail/instance` with correlation ID.
* **Retry toast** for 429 (quota throttles) with backoff timer.
* **Auth expiry**: pre-emptive refresh; unobtrusive banner when < 60s TTL; re-login modal if refresh fails.
* **Network down**: offline banner with queued actions (idempotent resubmits).
* **File verify**: show SHA-256 mismatch warnings if artifact altered in transit.
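
For the 429 retry toast, the backoff timer can be driven by the `Retry-After` header, which carries either a number of seconds or an HTTP date. The helper below is a sketch; `retryAfterMs` and its 1 s fallback are assumptions.

```typescript
// Sketch only: converts a Retry-After header value into a delay in milliseconds.
// `nowMs` is injected so the function stays deterministic in tests.
function retryAfterMs(headerValue: string | null, nowMs: number, fallbackMs = 1000): number {
  if (headerValue === null) return fallbackMs;
  const trimmed = headerValue.trim();
  // Form 1: delay in whole seconds, e.g. "30".
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  // Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return fallbackMs;
  // A date in the past means "retry now".
  return Math.max(0, dateMs - nowMs);
}
```

A toast component would call this with the response header and `Date.now()`, then count down from the returned value.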
---
## 13) Observability
* **Frontend telemetry** (OpenTelemetry Web): route timings, API latency by service, error counts; sampled to 15% and shipped to backend OTLP endpoint.
* **User actions** logged anonymously (no PII): “policy promote”, “scan export”, “attest verify”.
* **Metrics dash** in admin shows SLOs and recent frontend errors.
---
## 14) Testing strategy
* **Unit**: pure component logic via Jest + Testing Library (no TestBed when possible).
* **Component harness** for table, code viewer, diff heatmap.
* **Contract tests**: OpenAPI schemas pulled at build time; DTOs validated; breaking changes fail CI.
* **e2e**: Playwright scenarios (login, scan detail, diff, policy edit, admission deny).
* **A11y**: axe-core CI checks; color-contrast lints.
* **i18n**: key coverage tests (no missing translations in supported locales).
---
## 15) Deployment & ops
* **Container**: `stellaops/web-ui:<ver>-<rev>`; NGINX with `gzip_static` + brotli; immutable assets under `/static/<hash>/…`.
* **Config**: `/config.json` served by gateway (injected at runtime): API base URLs, authority issuer, telemetry sampling.
* **Version banner**: footer shows UI & backend versions; warns on major mismatches.
* **CDN** (optional): cache static bundle; APIs stay behind internal gateway.
* **Feature flags**: environment gates (staged policies, eBPF runtime) readable from config.
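
Because `/config.json` is injected at runtime, the SPA would likely validate it at bootstrap before use. The sketch below assumes hypothetical field names (`apiBaseUrls`, `authorityIssuer`, `telemetrySampleRate`); the real schema is not given here.

```typescript
// Sketch only: field names are illustrative, not the actual /config.json schema.
interface UiRuntimeConfig {
  apiBaseUrls: Record<string, string>;
  authorityIssuer: string;
  telemetrySampleRate: number; // fraction in [0, 1]
}

// Validates an already-decoded JSON value and throws on the first problem found.
function parseRuntimeConfig(raw: unknown): UiRuntimeConfig {
  if (typeof raw !== 'object' || raw === null) {
    throw new Error('config.json must be a JSON object');
  }
  const r = raw as Record<string, unknown>;

  const issuer = r['authorityIssuer'];
  if (typeof issuer !== 'string' || !issuer.startsWith('https://')) {
    throw new Error('authorityIssuer must be an https:// URL');
  }

  const rate = r['telemetrySampleRate'];
  if (typeof rate !== 'number' || !(rate >= 0 && rate <= 1)) {
    throw new Error('telemetrySampleRate must be a number in [0, 1]');
  }

  const urls = r['apiBaseUrls'];
  if (typeof urls !== 'object' || urls === null) {
    throw new Error('apiBaseUrls must be an object');
  }
  const apiBaseUrls: Record<string, string> = {};
  for (const [name, value] of Object.entries(urls as Record<string, unknown>)) {
    if (typeof value !== 'string') {
      throw new Error(`apiBaseUrls.${name} must be a string`);
    }
    apiBaseUrls[name] = value;
  }

  return { apiBaseUrls, authorityIssuer: issuer, telemetrySampleRate: rate };
}
```

Failing fast here keeps a misconfigured deployment from surfacing later as confusing API or login errors.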
---
## 16) Plugin system (route plugins)
* **Manifest**: Backend provides a signed plugin manifest with remote module URLs and **cosign signature** per JS bundle.
* **Loader**: dynamic import with **SRI** and signature verification (WebCrypto).
* **Sandbox**: plugins are routed modules receiving a limited **UI SDK** (navigation, theme, API gateway). No direct token access; API calls proxied through the UI SDK which enforces RBAC.
* **Examples**: custom reports, vendor dashboards, regulated TLS config UIs.
---
## 17) Wire sequences (representative)
**A) View scan progress**
```mermaid
sequenceDiagram
autonumber
participant UI
participant Auth as Authority
participant SW as Scanner.WebService
UI->>Auth: /authorize (PKCE)
Auth-->>UI: code → token (DPoP-bound)
UI->>SW: GET /scans/{id} (Authorization+DPoP)
SW-->>UI: { status: running }
UI->>SW: (SSE) GET /scans/{id}/events
SW-->>UI: progress events …
SW-->>UI: terminal event { status: completed, artifacts[] }
```
**B) Verify attestation**
```mermaid
sequenceDiagram
autonumber
participant UI
participant AT as Attestor
UI->>AT: POST /rekor/verify { uuid }
AT-->>UI: { ok:true, index, logURL }
```
**C) Promote policy & preview**
```mermaid
sequenceDiagram
autonumber
participant UI
participant BE as Scanner.WebService (Policy endpoint)
UI->>BE: POST /policy/stage { yaml, rego }
BE-->>UI: { policyRevision, diagnostics }
UI->>BE: POST /policy/preview { imageDigest, policyRevision }
BE-->>UI: { verdict: pass|fail, reasons[] }
UI->>BE: POST /policy/promote { policyRevision }
BE-->>UI: { ok:true }
```
---
## 18) Security hard lines
* Never store JWTs in `localStorage`.
* Enforce DPoP for API calls; if DPoP unsupported for a service, require **SameSite=Lax** cookies with CSRF token header.
* Block mixed content; only HTTPS origins allowed.
* Validate and render only **escaped** user content; code viewer uses safe highlighter.
* Downloaded artifacts are treated as **opaque binaries**; no HTML rendering.
---
## 19) Roadmap
* **PWA** offline shell (read-only) for dashboards and cached scan details.
* **SBOM graph** visualization (force-directed) for small component sets.
* **Runtime session replay** (privacy-safe) to debug operator workflows (opt-in).
* **Assistive wizards** for policy creation with guided templates.


@@ -1,85 +1,463 @@
# StellaOps Vexer Architecture
# component_architecture_vexer.md — **StellaOps Vexer** (2025Q4)
Vexer is StellaOps' vulnerability-exploitability (VEX) platform. It ingests VEX statements from multiple providers, normalizes them into canonical claims, projects trust-weighted consensus, and delivers deterministic export artifacts with signed attestations. This document summarizes the target architecture and how the current implementation maps to those goals.
> **Scope.** This document specifies the **Vexer** service: its purpose, trust model, data structures, APIs, plugin contracts, storage schema, normalization/consensus algorithms, performance budgets, testing matrix, and how it integrates with Scanner, Policy, Feedser, and the attestation chain. It is implementation-ready.
## 1. Solution topology
---
| Module | Purpose | Key contracts |
| --- | --- | --- |
| `StellaOps.Vexer.Core` | Domain models (`VexClaim`, `VexConsensus`, `VexExportManifest`), deterministic JSON helpers, shared abstractions (connectors, exporters, attestations). | `IVexConnector`, `IVexExporter`, `IVexAttestationClient`, `VexCanonicalJsonSerializer` |
| `StellaOps.Vexer.Policy` | Loads operator policy (weights, overrides, justification gates) and exposes snapshots for consensus. | `IVexPolicyProvider`, `IVexPolicyEvaluator`, `VexPolicyOptions` |
| `StellaOps.Vexer.Storage.Mongo` | Persistence layer for providers, raw docs, claims, consensus, exports, cache. | `IVexRawStore`, `IVexExportStore`, Mongo class maps |
| `StellaOps.Vexer.Export` | Orchestrates export pipeline (query signature → cache lookup → snapshot build → attestation handoff). | `IExportEngine`, `IVexExportDataSource` |
| `StellaOps.Vexer.Attestation` *(planned)* | Builds in-toto/DSSE envelopes and communicates with Sigstore/Rekor. | `IVexAttestationClient` |
| `StellaOps.Vexer.WebService` *(planned)* | Minimal API host for ingest/export endpoints. | `AddVexerWebService()` |
| `StellaOps.Vexer.Worker` *(planned)* | Background executor for scheduled pulls, verification, reconciliation, cache GC. | Hosted services |
## 0) Mission & role in the platform
All modules target .NET 10 preview and follow the same deterministic logging and serialization conventions as Feedser.
**Mission.** Convert heterogeneous **VEX** statements (OpenVEX, CSAF VEX, CycloneDX VEX; vendor/distro/platform sources) into **canonical, queryable claims**; compute **deterministic consensus** per *(vuln, product)*; preserve **conflicts with provenance**; publish **stable, attestable exports** that the backend uses to suppress non-exploitable findings, prioritize remaining risk, and explain decisions.
## 2. Data model
**Boundaries.**
MongoDB acts as the canonical store; collections (with logical responsibilities) are:
* Vexer **does not** decide PASS/FAIL. It supplies **evidence** (statuses + justifications + provenance weights).
* Vexer preserves **conflicting claims** unchanged; consensus encodes how we would pick, but the raw set is always exportable.
* VEX consumption is **backend-only**: Scanner never applies VEX. The backend's **Policy Engine** asks Vexer for status evidence and then decides what to show.
- `vex.providers`: provider metadata, trust tiers, discovery endpoints, and cosign/PGP details.
- `vex.raw`: immutable raw documents (CSAF, CycloneDX VEX, OpenVEX, OCI attestations) with digests, retrieval metadata, and signature state.
- `vex.claims`: normalized `VexClaim` rows; deduped on `(providerId, vulnId, productKey, docDigest)`.
- `vex.consensus`: consensus projections per `(vulnId, productKey)` capturing rollup status, source weights, conflicts, and policy revision.
- `vex.exports`: export manifests containing artifact digests, cache metadata, and attestation pointers.
- `vex.cache`: index from `querySignature`/`format` to export digest for fast reuse.
- `vex.migrations`: tracks applied storage migrations (index bootstrap, future schema updates).
---
GridFS is used for large raw payloads when necessary, and artifact stores (S3/MinIO/file) hold serialized exports referenced by `vex.exports`.
## 1) Inputs, outputs & canonical domain
## 3. Ingestion and reconciliation flow
### 1.1 Accepted input formats (ingest)
1. **Discovery & configuration**: connectors load YAML/JSON settings via `StellaOps.Vexer.Policy` (provider enablement, trust overrides).
2. **Fetch**: each `IVexConnector` pulls source windows, writing raw documents through `IVexRawDocumentSink` (Mongo-backed) with dedupe on digest.
3. **Verification**: signatures/attestations validated through `IVexSignatureVerifier`; metadata stored alongside raw records.
4. **Normalization**: format-specific `IVexNormalizer` instances translate raw payloads to canonical `VexClaim` batches.
5. **Consensus**: `VexConsensusResolver` (Core) consumes claims with policy weights supplied by `IVexPolicyEvaluator`, producing deterministic consensus entries and conflict annotations.
6. **Export**: query requests pass through `VexExportEngine`, generating `VexExportManifest` instances, caching by `VexQuerySignature`, and emitting artifacts for attestation/signature.
7. **Attestation & transparency** *(planned)*: `IVexAttestationClient` signs exports (in-toto/DSSE) and records bundles in Rekor v2.
* **OpenVEX** JSON documents (attested or raw).
* **CSAF VEX** 2.x (vendor PSIRTs and distros commonly publish CSAF).
* **CycloneDX VEX** 1.4+ (standalone VEX or embedded VEX blocks).
* **OCI-attached attestations** (VEX statements shipped as OCI referrers); optional connectors.
The Worker coordinates the long-running steps (fetch/verify/normalize/export), while the WebService exposes synchronous APIs for on-demand operations and status lookups.
All connectors register **source metadata**: provider identity, trust tier, signature expectations (PGP/cosign/PKI), fetch windows, rate limits, and time anchors.
## 4. Policy semantics
### 1.2 Canonical model (normalized)
- **Weights**: default tiers (`vendor=1.0`, `distro=0.9`, `platform=0.7`, `hub=0.5`, `attestation=0.6`) loaded via `VexPolicyOptions.Weights`, with per-provider overrides.
- **Justification gates**: policy enforces that `not_affected` claims must provide a recognized justification; rejected claims are preserved as conflicts with reason metadata.
- **Diagnostics**: policy snapshots carry structured issues for misconfigurations (out-of-range weights, empty overrides) surfaced to operators via logs and future CLI/Web endpoints.
Every incoming statement becomes a set of **VexClaim** records:
Policy snapshots are immutable and versioned so consensus records capture the policy revision used during evaluation.
```
VexClaim
- providerId // 'redhat', 'suse', 'ubuntu', 'github', 'vendorX'
- vulnId // 'CVE-2025-12345', 'GHSA-xxxx', canonicalized
- productKey // canonical product identity (see §2.2)
- status // affected | not_affected | fixed | under_investigation
- justification? // for 'not_affected'/'affected' where provided
- introducedVersion? // semantics per provider (range or exact)
- fixedVersion? // where provided (range or exact)
- lastObserved // timestamp from source or fetch time
- provenance // doc digest, signature status, fetch URI, line/offset anchors
- evidence[] // raw source snippets for explainability
- supersedes? // optional cross-doc chain (docDigest → docDigest)
```
## 5. Determinism & caching
### 1.3 Exports (consumption)
- JSON serialization uses `VexCanonicalJsonSerializer`, enforcing property ordering and camelCase naming for reproducible snapshots and test fixtures.
- `VexQuerySignature` produces canonical filter/order strings and SHA-256 digests, enabling cache keys shared across services.
- Export manifests reuse cached artifacts when the same signature/format is requested unless `ForceRefresh` is explicitly set.
- For scoring multiple sources on the same VEX topic, see `VEXER_SCORRING.md`.
* **VexConsensus** per `(vulnId, productKey)` with:
## 6. Observability & offline posture
* `rollupStatus` (after policy weights/justification gates),
* `sources[]` (winning + losing claims with weights & reasons),
* `policyRevisionId` (identifier of the Vexer policy used),
* `consensusDigest` (stable SHA-256 over canonical JSON).
* **Raw claims** export for auditing (unchanged, with provenance).
* **Provider snapshots** (per source, last N days) for operator debugging.
* **Index** optimized for backend joins: `(productKey, vulnId) → (status, confidence, sourceSet)`.
- Structured logs (`ILogger`) capture correlation IDs, query signatures, provider IDs, and policy revisions. Metrics/OTel instrumentation will mirror Feedser once tracing hooks are added.
- Offline-first: connectors, policy bundles, and export caches can be bundled inside the Offline Kit; no mandatory outbound calls beyond configured provider allowlists.
- Operator tooling (CLI/WebService) will expose diagnostics (policy issues, verification failures, cache status) so air-gapped deployments maintain visibility without external telemetry.
All exports are **deterministic**, and (optionally) **attested** via DSSE and logged to Rekor v2.
## 7. Roadmap highlights
---
- Complete storage mappings for providers/consensus/cache and add migrations/indices per collection.
- Implement Rekor/in-toto attestation clients and wire export engine to produce signed bundles.
- Build WebService endpoints (`/vexer/status`, `/vexer/claims`, `/vexer/exports`) plus CLI verbs mirroring Feedser patterns.
- Provide CSAF, CycloneDX VEX, and OpenVEX normalizers along with vendor-specific connectors (Red Hat, Cisco, SUSE, MSRC, Oracle, Ubuntu, OCI attestation).
- Extend policy diagnostics with schema validation, change tracking, and operator-facing diff reports.
- Mongo bootstrapper runs ordered migrations (`vex.migrations`) to ensure indexes for raw documents, providers, consensus snapshots, exports, and cache entries.
## 2) Identity model — products & joins
## Appendix A: Policy diagnostics workflow
### 2.1 Vuln identity
- `StellaOps.Vexer.Policy` now exposes `IVexPolicyDiagnostics`, producing deterministic diagnostics reports with timestamp, severity counts, active provider overrides, and the full issue list surfaced by `IVexPolicyProvider`.
- CLI/WebService layers should call `IVexPolicyDiagnostics.GetDiagnostics()` to display operator-friendly summaries (`vexer policy diagnostics` and `/vexer/policy/diagnostics` are the planned entry points).
- Recommendations in the report guide operators to resolve blocking errors, review warnings, and audit override usage before consensus runs—embed them directly in UX copy instead of re-deriving logic.
- Export/consensus telemetry should log the diagnostic `Version` alongside `policyRevisionId` so dashboards can correlate policy changes with consensus decisions.
- Offline installations can persist the diagnostics report (JSON) in the Offline Kit to document policy headroom during audits; the output is deterministic and diff-friendly.
- Use `VexPolicyBinder` when ingesting operator-supplied YAML/JSON bundles; it normalizes weight/override values, reports deterministic issues, and returns the consensus-ready `VexConsensusPolicyOptions` used by `VexPolicyProvider`.
- Reload telemetry emits `vex.policy.reloads` (tags: `revision`, `version`, `issues`) whenever a new digest is observed—feed this into dashboards to correlate policy changes with consensus outcomes.
* Accepts **CVE**, **GHSA**, vendor IDs (MSRC, RHSA…), distro IDs (DSA/USN/RHSA…) — normalized to `vulnId` with alias sets.
* **Alias graph** maintained (from Feedser) to map vendor/distro IDs → CVE (primary) and to **GHSA** where applicable.
### 2.2 Product identity (`productKey`)
* **Primary:** `purl` (Package URL).
* **Secondary links:** `cpe`, **OS package NVRA/EVR**, NuGet/Maven/Golang identity, and **OS package name** when purl unavailable.
* **Fallback:** `oci:<registry>/<repo>@<digest>` for image-level VEX.
* **Special cases:** kernel modules, firmware, platforms → provider-specific mapping helpers (connector captures provider's product taxonomy → canonical `productKey`).
> Vexer does not invent identities. If a provider cannot be mapped to purl/CPE/NVRA deterministically, we keep the native **product string** and mark the claim as **non-joinable**; the backend will ignore it unless a policy explicitly whitelists that provider mapping.
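
The precedence described in this section could be expressed as a small pure function. This is a sketch only: `deriveProductKey` and the `ProductHints` shape are hypothetical, and treating a CPE as the key when no purl exists is an assumption (the text lists CPE as a secondary link, not necessarily as a key).

```typescript
// Sketch only: illustrates purl-first precedence with an explicit non-joinable fallback.
interface ProductHints {
  purl?: string;
  cpe?: string;
  ociRef?: string; // "<registry>/<repo>@<digest>" for image-level VEX
  nativeName: string; // provider's own product string, always kept
}

interface ProductIdentity {
  productKey: string;
  joinable: boolean;
}

function deriveProductKey(hints: ProductHints): ProductIdentity {
  if (hints.purl) return { productKey: hints.purl, joinable: true };
  // Assumption: fall back to CPE as the key when no purl can be derived.
  if (hints.cpe) return { productKey: hints.cpe, joinable: true };
  if (hints.ociRef) return { productKey: `oci:${hints.ociRef}`, joinable: true };
  // No deterministic mapping: keep the native string and flag it non-joinable.
  return { productKey: hints.nativeName, joinable: false };
}
```

The function never fabricates an identity: when nothing maps, the provider's own string is returned unchanged with `joinable: false`.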
---
## 3) Storage schema (MongoDB)
Database: `vexer`
### 3.1 Collections
**`vex.providers`**
```
_id: providerId
name, homepage, contact
trustTier: enum {vendor, distro, platform, hub, attestation}
signaturePolicy: { type: pgp|cosign|x509|none, keys[], certs[], cosignKeylessRoots[] }
fetch: { baseUrl, kind: http|oci|file, rateLimit, etagSupport, windowDays }
enabled: bool
createdAt, modifiedAt
```
**`vex.raw`** (immutable raw documents)
```
_id: sha256(doc bytes)
providerId
uri
ingestedAt
contentType
sig: { verified: bool, method: pgp|cosign|x509|none, keyId|certSubject, bundle? }
payload: GridFS pointer (if large)
disposition: kept|replaced|superseded
correlation: { replaces?: sha256, replacedBy?: sha256 }
```
**`vex.claims`** (normalized rows; dedupe on providerId+vulnId+productKey+docDigest)
```
_id
providerId
vulnId
productKey
status
justification?
introducedVersion?
fixedVersion?
lastObserved
docDigest
provenance { uri, line?, pointer?, signatureState }
evidence[] { key, value, locator }
indices:
- {vulnId:1, productKey:1}
- {providerId:1, lastObserved:-1}
- {status:1}
- text index (optional) on evidence.value for debugging
```
**`vex.consensus`** (rollups)
```
_id: sha256(canonical(vulnId, productKey, policyRevision))
vulnId
productKey
rollupStatus
sources[]: [
{ providerId, status, justification?, weight, lastObserved, accepted:bool, reason }
]
policyRevisionId
evaluatedAt
consensusDigest // same as _id
indices:
- {vulnId:1, productKey:1}
- {policyRevisionId:1, evaluatedAt:-1}
```
**`vex.exports`** (manifest of emitted artifacts)
```
_id
querySignature
format: raw|consensus|index
artifactSha256
rekor { uuid, index, url }?
createdAt
policyRevisionId
cacheable: bool
```
**`vex.cache`**
```
querySignature -> exportId (for fast reuse)
ttl, hits
```
**`vex.migrations`**
* ordered migrations applied at bootstrap to ensure indexes.
### 3.2 Indexing strategy
* Hot path queries use exact `(vulnId, productKey)` and time-bounded windows; compound indexes cover both.
* Providers list view by `lastObserved` for monitoring staleness.
* `vex.consensus` keyed by `(vulnId, productKey, policyRevision)` for deterministic reuse.
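
Deterministic reuse of `vex.consensus` depends on a stable `canonical(...)` encoding before hashing. A minimal sketch of one such encoding (recursively sorted object keys, no whitespace) follows; it is an assumption about what "canonical" could mean here, not the behaviour of `VexCanonicalJsonSerializer`, and the SHA-256 step over the resulting UTF-8 bytes is omitted.

```typescript
// Sketch only: one possible canonical JSON form (sorted keys, compact output).
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function canonicalJson(value: Json): string {
  if (value === null || typeof value !== 'object') {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    // Array order is significant and preserved.
    return '[' + value.map((item) => canonicalJson(item)).join(',') + ']';
  }
  // Object keys are sorted by UTF-16 code unit order for a stable byte stream.
  const entries = Object.entries(value).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  return '{' + entries.map(([k, v]) => JSON.stringify(k) + ':' + canonicalJson(v)).join(',') + '}';
}

// Pre-image for the consensus `_id`; hashing is left to the caller.
function consensusIdPreimage(vulnId: string, productKey: string, policyRevisionId: string): string {
  return canonicalJson({ vulnId, productKey, policyRevisionId });
}
```

Because key order no longer depends on how the object was constructed, two evaluations of the same `(vulnId, productKey, policyRevision)` produce the same pre-image and therefore the same digest.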
---
## 4) Ingestion pipeline
### 4.1 Connector contract
```csharp
public interface IVexConnector
{
string ProviderId { get; }
Task FetchAsync(VexConnectorContext ctx, CancellationToken ct); // raw docs
Task NormalizeAsync(VexConnectorContext ctx, CancellationToken ct); // raw -> VexClaim[]
}
```
* **Fetch** must implement: window scheduling, conditional GET (ETag/If-Modified-Since), rate limiting, retry/backoff.
* **Normalize** parses the format, validates schema, maps product identities deterministically, emits `VexClaim` records with **provenance**.
### 4.2 Signature verification (per provider)
* **cosign (keyless or keyful)** for OCI referrers or HTTP-served JSON with Sigstore bundles.
* **PGP** (provider keyrings) for distro/vendor feeds that sign docs.
* **x509** (mutual TLS / provider-pinned certs) where applicable.
* Signature state is stored on **vex.raw.sig** and copied into **provenance.signatureState** on claims.
> Claims from sources failing signature policy are marked `"signatureState.verified=false"` and **policy** can down-weight or ignore them.
### 4.3 Time discipline
* For each doc, prefer **provider's document timestamp**; if absent, use fetch time.
* Claims carry `lastObserved` which drives **tie-breaking** within equal weight tiers.
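
The two rules above can be sketched as a pair of small functions. The names `resolveLastObserved` and `breakTie` are hypothetical, timestamps are assumed to be ISO 8601 strings, and the final fallback on `providerId` is an added assumption to keep the result deterministic when timestamps are identical.

```typescript
// Sketch only: timestamp selection and tie-breaking between equally weighted claims.
interface TimedClaim {
  providerId: string;
  lastObserved: string; // ISO 8601, e.g. "2025-10-01T12:00:00Z"
}

// Prefer the provider's document timestamp; fall back to fetch time.
function resolveLastObserved(documentTimestamp: string | undefined, fetchedAt: string): string {
  return documentTimestamp ?? fetchedAt;
}

// For two claims of equal weight, the more recently observed one wins.
function breakTie<T extends TimedClaim>(a: T, b: T): T {
  const ta = Date.parse(a.lastObserved);
  const tb = Date.parse(b.lastObserved);
  if (ta !== tb) return ta > tb ? a : b;
  // Assumption: identical timestamps fall back to providerId order for determinism.
  return a.providerId <= b.providerId ? a : b;
}
```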
---
## 5) Normalization: product & status semantics
### 5.1 Product mapping
* **purl** first; **cpe** second; OS package NVRA/EVR mapping helpers (distro connectors) produce purls via canonical tables (e.g., rpm→purl:rpm, deb→purl:deb).
* Where a provider publishes **platform-level** VEX (e.g., “RHEL 9 not affected”), connectors expand it via known product inventory rules (e.g., map to sets of packages/components shipped in the platform). Expansion tables are versioned and kept per provider; every expansion emits **evidence** indicating the rule applied.
* If expansion would be speculative, the claim remains **platform-scoped** with `productKey="platform:redhat:rhel:9"` and is flagged **non-joinable**; the backend can decide to use platform VEX only when Scanner proves the platform runtime.
### 5.2 Status + justification mapping
* Canonical **status**: `affected | not_affected | fixed | under_investigation`.
* **Justifications** normalized to a controlled vocabulary (CISA-aligned), e.g.:
  * `component_not_present`
  * `vulnerable_code_not_in_execute_path`
  * `vulnerable_configuration_unused`
  * `inline_mitigation_applied`
  * `fix_available` (with `fixedVersion`)
  * `under_investigation`
* Providers with free-text justifications are mapped by deterministic tables; raw text preserved as `evidence`.
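A deterministic free-text mapping could look roughly like the following Python sketch. The table rows and the function name are hypothetical and chosen only for illustration; a real provider table would be curated and versioned per provider. The first matching row wins, so the same input always yields the same code, and the raw text is returned unchanged so it can be stored as evidence.

```python
from typing import List, Optional, Tuple

# Hypothetical mapping table for one provider: (lower-case phrase, canonical code).
# Order matters: the first matching row wins, which keeps the mapping deterministic.
JUSTIFICATION_TABLE: List[Tuple[str, str]] = [
    ("component is not present", "component_not_present"),
    ("not in the execute path", "vulnerable_code_not_in_execute_path"),
    ("configuration is not used", "vulnerable_configuration_unused"),
    ("mitigation", "inline_mitigation_applied"),
]


def normalize_justification(raw_text: str) -> Tuple[Optional[str], str]:
    """Return (canonical code or None, raw text preserved as evidence)."""
    folded = " ".join(raw_text.lower().split())  # case-fold and collapse whitespace
    for phrase, code in JUSTIFICATION_TABLE:
        if phrase in folded:
            return code, raw_text
    return None, raw_text
```

Returning `None` for unmatched text leaves the decision (reject, or keep the claim without a justification) to the caller and to policy.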
---
## 6) Consensus algorithm
**Goal:** produce a **stable**, explainable `rollupStatus` per `(vulnId, productKey)` given possibly conflicting claims.
### 6.1 Inputs
* Set **S** of `VexClaim` for the key.
* **Vexer policy snapshot**:
  * **weights** per provider tier and per-provider overrides.
  * **justification gates** (e.g., require justification for `not_affected` to be acceptable).
  * **minEvidence** rules (e.g., `not_affected` must come from ≥1 vendor or 2 distros).
  * **signature requirements** (e.g., require verified signature for `fixed` to be considered).
### 6.2 Steps
1. **Filter invalid** claims by signature policy & justification gates → set `S'`.
2. **Score** each claim:
`score = weight(provider) * freshnessFactor(lastObserved)` where freshnessFactor ∈ [0.8, 1.0] for staleness decay (configurable; small effect).
3. **Aggregate** scores per status: `W(status) = Σ score(claims with that status)`.
4. **Pick** `rollupStatus = argmax_status W(status)`.
5. **Tie-breakers** (in order):
   * Higher **max single** provider score wins (vendor > distro > platform > hub).
   * More **recent** lastObserved wins.
   * Fixed status precedence (`fixed` > `not_affected` > `under_investigation` > `affected`) as the final, deterministic tie-breaker.
6. **Explain**: mark accepted sources (`accepted=true; reason="weight"`/`"freshness"`), mark rejected sources with explicit `reason` (`"insufficient_justification"`, `"signature_unverified"`, `"lower_weight"`).
> The algorithm is **pure** given S and policy snapshot; result is reproducible and hashed into `consensusDigest`.
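The steps above can be sketched in Python as follows. This is a simplified illustration, not the Vexer implementation: the `Claim` shape, the linear freshness decay with a 30-day horizon, and the rounding used to keep float noise out of ties are all assumptions, and the `minEvidence` rules and the per-source explanation records are omitted.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterable, Optional, Tuple

# Final tie-break order from step 5 (earlier entries win).
STATUS_PRECEDENCE = ["fixed", "not_affected", "under_investigation", "affected"]


@dataclass(frozen=True)
class Claim:
    provider: str
    status: str
    weight: float                 # provider weight, already resolved from policy
    last_observed: datetime       # timezone-aware UTC
    verified: bool = True         # signature state
    justification: Optional[str] = None


def freshness_factor(last_observed: datetime, now: datetime,
                     max_age_days: float = 30.0) -> float:
    """Assumed linear decay: 1.0 when fresh, 0.8 at or beyond max_age_days."""
    age_days = max((now - last_observed).total_seconds() / 86400.0, 0.0)
    return 1.0 - 0.2 * min(age_days / max_age_days, 1.0)


def rollup(claims: Iterable[Claim], now: datetime,
           require_justification: bool = True,
           signature_required_for_fixed: bool = True
           ) -> Tuple[Optional[str], Dict[str, str]]:
    """Return (rollupStatus or None, {provider: rejection reason})."""
    accepted, rejected = [], {}
    # Step 1: apply justification gates and signature policy.
    for c in claims:
        if require_justification and c.status == "not_affected" and not c.justification:
            rejected[c.provider] = "insufficient_justification"
        elif signature_required_for_fixed and c.status == "fixed" and not c.verified:
            rejected[c.provider] = "signature_unverified"
        else:
            accepted.append(c)
    if not accepted:
        return None, rejected

    totals: Dict[str, float] = {}
    best: Dict[str, float] = {}
    newest: Dict[str, datetime] = {}
    for c in accepted:
        # Step 2: score each claim; rounding keeps float noise from deciding ties.
        score = round(c.weight * freshness_factor(c.last_observed, now), 9)
        # Step 3: aggregate per status, tracking the tie-break inputs as we go.
        totals[c.status] = round(totals.get(c.status, 0.0) + score, 9)
        best[c.status] = max(best.get(c.status, 0.0), score)
        newest[c.status] = max(newest.get(c.status, c.last_observed), c.last_observed)

    # Steps 4-5: argmax with the documented tie-breakers, in order.
    def rank(status: str):
        return (totals[status], best[status], newest[status],
                -STATUS_PRECEDENCE.index(status))

    return max(totals, key=rank), rejected


if __name__ == "__main__":
    now = datetime(2025, 10, 17, tzinfo=timezone.utc)
    sample = [
        Claim("redhat", "not_affected", 1.0, now, justification="component_not_present"),
        Claim("hub", "affected", 0.5, now),
    ]
    print(rollup(sample, now))
```

Because the function depends only on its arguments (including `now`), the same claim set and policy flags always produce the same result, which is the property the digest relies on.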
---
## 7) Query & export APIs
All endpoints are versioned under `/api/v1/vex`.
### 7.1 Query (online)
```
POST /claims/search
body: { vulnIds?: string[], productKeys?: string[], providers?: string[], since?: timestamp, limit?: int, pageToken?: string }
→ { claims[], nextPageToken? }
POST /consensus/search
body: { vulnIds?: string[], productKeys?: string[], policyRevisionId?: string, since?: timestamp, limit?: int, pageToken?: string }
→ { entries[], nextPageToken? }
POST /resolve
body: { purls: string[], vulnIds: string[], policyRevisionId?: string }
→ { results: [ { vulnId, productKey, rollupStatus, sources[] } ] }
```
### 7.2 Exports (cacheable snapshots)
```
POST /exports
body: { signature: { vulnFilter?, productFilter?, providers?, since? }, format: raw|consensus|index, policyRevisionId?: string, force?: bool }
→ { exportId, artifactSha256, rekor? }
GET /exports/{exportId} → bytes (application/json or binary index)
GET /exports/{exportId}/meta → { signature, policyRevisionId, createdAt, artifactSha256, rekor? }
```
### 7.3 Provider operations
```
GET /providers → provider list & signature policy
POST /providers/{id}/refresh → trigger fetch/normalize window
GET /providers/{id}/status → last fetch, doc counts, signature stats
```
**Auth:** service-to-service via Authority tokens; operator operations via UI/CLI with RBAC.
---
## 8) Attestation integration
* Exports can be **DSSE-signed** via **Signer** and logged to **Rekor v2** via **Attestor** (optional but recommended for regulated pipelines).
* `vex.exports.rekor` stores `{uuid, index, url}` when present.
* **Predicate type**: `https://stella-ops.org/attestations/vex-export/1` with fields:
  * `querySignature`, `policyRevisionId`, `artifactSha256`, `createdAt`.
---
## 9) Configuration (YAML)
```yaml
vexer:
  mongo: { uri: "mongodb://mongo/vexer" }
  s3:
    endpoint: http://minio:9000
    bucket: stellaops
  policy:
    weights:
      vendor: 1.0
      distro: 0.9
      platform: 0.7
      hub: 0.5
      attestation: 0.6
    providerOverrides:
      redhat: 1.0
      suse: 0.95
    requireJustificationForNotAffected: true
    signatureRequiredForFixed: true
    minEvidence:
      not_affected:
        vendorOrTwoDistros: true
  connectors:
    - providerId: redhat
      kind: csaf
      baseUrl: https://access.redhat.com/security/data/csaf/v2/
      signaturePolicy: { type: pgp, keys: [ "…redhat-pgp-key…" ] }
      windowDays: 7
    - providerId: suse
      kind: csaf
      baseUrl: https://ftp.suse.com/pub/projects/security/csaf/
      signaturePolicy: { type: pgp, keys: [ "…suse-pgp-key…" ] }
    - providerId: ubuntu
      kind: openvex
      baseUrl: https://…/vex/
      signaturePolicy: { type: none }
    - providerId: vendorX
      kind: cyclonedx-vex
      ociRef: ghcr.io/vendorx/vex@sha256:…
      signaturePolicy: { type: cosign, cosignKeylessRoots: [ "sigstore-root" ] }
```
---
## 10) Security model
* **Input signature verification** enforced per provider policy (PGP, cosign, x509).
* **Connector allowlists**: outbound fetch constrained to configured domains.
* **Tenant isolation**: per-tenant DB prefixes or separate DBs; per-tenant S3 prefixes; per-tenant policies.
* **AuthN/Z**: Authority-issued OpToks; RBAC roles (`vex.read`, `vex.admin`, `vex.export`).
* **No secrets in logs**; deterministic logging contexts include providerId, docDigest, claim keys.
---
## 11) Performance & scale
* **Targets:**
  * Normalize 10k VEX claims/minute/core.
  * Consensus compute ≤50ms for 1k unique `(vuln, product)` pairs in hot cache.
  * Export (consensus) 1M rows in ≤60s on 8 cores with streaming writer.
* **Scaling:**
  * WebService handles control APIs; **Worker** background services (same image) execute fetch/normalize in parallel with rate limits; Mongo writes batched; upserts by natural keys.
  * Exports stream straight to S3 (MinIO) with rolling buffers.
* **Caching:**
  * `vex.cache` maps query signatures → export; TTL to avoid stampedes; optimistic reuse unless `force`.
---
## 12) Observability
* **Metrics:**
  * `vex.ingest.docs_total{provider}`
  * `vex.normalize.claims_total{provider}`
  * `vex.signature.failures_total{provider,method}`
  * `vex.consensus.conflicts_total{vulnId}`
  * `vex.exports.bytes{format}` / `vex.exports.latency_seconds`
* **Tracing:** spans for fetch, verify, parse, map, consensus, export.
* **Dashboards:** provider staleness, top conflicting vulns/components, signature posture, export cache hit rate.
---
## 13) Testing matrix
* **Connectors:** golden raw docs → deterministic claims (fixtures per provider/format).
* **Signature policies:** valid/invalid PGP/cosign/x509 samples; ensure rejects are recorded but not accepted.
* **Normalization edge cases:** platform-only claims, free-text justifications, non-purl products.
* **Consensus:** conflict scenarios across tiers; check tie-breakers; justification gates.
* **Performance:** 1M-row export timing; memory ceilings; stream correctness.
* **Determinism:** same inputs + policy → identical `consensusDigest` and export bytes.
* **API contract tests:** pagination, filters, RBAC, rate limits.
---
## 14) Integration points
* **Backend Policy Engine** (in Scanner.WebService): calls `POST /resolve` with batched `(purl, vulnId)` pairs to fetch `rollupStatus + sources`.
* **Feedser**: provides alias graph (CVE↔vendor IDs) and may supply VEX-adjacent metadata (e.g., KEV flag) for policy escalation.
* **UI**: VEX explorer screens use `/claims/search` and `/consensus/search`; show conflicts & provenance.
* **CLI**: `stellaops vex export --consensus --since 7d --out vex.json` for audits.
---
## 15) Failure modes & fallback
* **Provider unreachable:** stale thresholds trigger warnings; policy can down-weight stale providers automatically (freshness factor).
* **Signature outage:** continue to ingest but mark `signatureState.verified=false`; consensus will likely exclude or down-weight per policy.
* **Schema drift:** unknown fields are preserved as `evidence`; normalization rejects only on **invalid identity** or **status**.
---
## 16) Rollout plan (incremental)
1. **MVP**: OpenVEX + CSAF connectors for 3 major providers (e.g., Red Hat/SUSE/Ubuntu), normalization + consensus + `/resolve`.
2. **Signature policies**: PGP for distros; cosign for OCI.
3. **Exports + optional attestation**.
4. **CycloneDX VEX** connectors; platform claim expansion tables; UI explorer.
5. **Scale hardening**: export indexes; conflict analytics.
---
## 17) Appendix — canonical JSON (stable ordering)
All exports and consensus entries are serialized via `VexCanonicalJsonSerializer`:
* UTF-8 without BOM;
* keys sorted (ASCII);
* arrays sorted by `(providerId, vulnId, productKey, lastObserved)` unless semantic order mandated;
* timestamps in `YYYY-MM-DDThh:mm:ssZ`;
* no insignificant whitespace.
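For readers who want a feel for these rules, here is a small Python approximation. It is not `VexCanonicalJsonSerializer`; the helper names are invented for this example, and the use of SHA-256 for the digest is an assumption, since the hash algorithm is not stated here.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


def format_timestamp(value: datetime) -> str:
    """Render a timezone-aware datetime as UTC, second precision, Z suffix."""
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def sort_claims(claims: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Default array order: (providerId, vulnId, productKey, lastObserved)."""
    return sorted(claims, key=lambda c: (c["providerId"], c["vulnId"],
                                         c["productKey"], c["lastObserved"]))


def canonical_json(value: Any) -> bytes:
    """Sorted keys, no insignificant whitespace, UTF-8 without BOM."""
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def digest(value: Any) -> str:
    """SHA-256 over the canonical bytes (the hash choice is an assumption)."""
    return hashlib.sha256(canonical_json(value)).hexdigest()
```

Two logically equal documents that differ only in key order produce identical bytes and therefore identical digests, which is what makes export artifacts comparable across runs.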
This architecture keeps Vexer aligned with StellaOps' deterministic, offline-operable design while layering VEX-specific consensus and attestation capabilities on top of the Feedser foundations.


@@ -0,0 +1,451 @@
# component_architecture_zastava.md — **StellaOps Zastava** (2025Q4)
> **Scope.** Implementationready architecture for **Zastava**: the **runtime inspector/enforcer** that watches real workloads, detects drift from the scanned baseline, verifies image/SBOM/attestation posture, and (optionally) **admits/blocks** deployments. Includes Kubernetes & plainDocker topologies, data contracts, APIs, security posture, performance targets, test matrices, and failure modes.
---
## 0) Mission & boundaries
**Mission.** Give operators **ground truth** from running environments and a **fast guardrail** before workloads land:
* **Observer:** inventory containers, entrypoints actually executed, and DSOs actually loaded; verify **image signature**, **SBOM referrers**, and **attestation** presence; detect **drift** (unexpected processes/paths) and **policy violations**; publish **runtime events** to Scanner.WebService.
* **Admission (optional):** Kubernetes ValidatingAdmissionWebhook that enforces minimal posture (signed images, SBOM availability, known base images, policy PASS) **pre-flight**.
**Boundaries.**
* Zastava **does not** compute SBOMs and does not sign; it **consumes** Scanner/WebService outputs and **enforces** backend policy verdicts.
* Zastava can **request** a delta scan when the baseline is missing/stale, but scanning is done by **Scanner.Worker**.
* On non-K8s Docker hosts, Zastava runs as a host service with **observer-only** features.
---
## 1) Topology & processes
### 1.1 Components (Kubernetes)
```
stellaops/zastava-observer # DaemonSet on every node (read-only host mounts)
stellaops/zastava-webhook # ValidatingAdmissionWebhook (Deployment, 2+ replicas)
```
### 1.2 Components (Docker/VM)
```
stellaops/zastava-agent # System service; watch Docker events; observer only
```
### 1.3 Dependencies
* **Authority** (OIDC): short OpToks (DPoP/mTLS) for API calls to Scanner.WebService.
* **Scanner.WebService**: `/runtime/events` ingestion; `/policy/runtime` fetch.
* **OCI Registry** (optional): for direct referrers/sig checks if not delegated to backend.
* **Container runtime**: containerd/CRI-O/Docker (read interfaces only).
* **Kubernetes API** (watch Pods in cluster; validating webhook).
* **Host mounts** (K8s DaemonSet): `/proc`, `/var/lib/containerd` (or CRI-O), `/run/containerd/containerd.sock` (optional, read-only).
---
## 2) Data contracts
### 2.1 Runtime event (observer → Scanner.WebService)
```json
{
  "eventId": "9f6a…",
  "when": "2025-10-17T12:34:56Z",
  "kind": "CONTAINER_START|CONTAINER_STOP|DRIFT|POLICY_VIOLATION|ATTESTATION_STATUS",
  "tenant": "tenant-01",
  "node": "ip-10-0-1-23",
  "runtime": { "engine": "containerd", "version": "1.7.19" },
  "workload": {
    "platform": "kubernetes",
    "namespace": "payments",
    "pod": "api-7c9fbbd8b7-ktd84",
    "container": "api",
    "containerId": "containerd://...",
    "imageRef": "ghcr.io/acme/api@sha256:abcd…",
    "owner": { "kind": "Deployment", "name": "api" }
  },
  "process": {
    "pid": 12345,
    "entrypoint": ["/entrypoint.sh", "--serve"],
    "entryTrace": [
      {"file":"/entrypoint.sh","line":3,"op":"exec","target":"/usr/bin/python3"},
      {"file":"<argv>","op":"python","target":"/opt/app/server.py"}
    ]
  },
  "loadedLibs": [
    { "path": "/lib/x86_64-linux-gnu/libssl.so.3", "inode": 123456, "sha256": "…"},
    { "path": "/usr/lib/x86_64-linux-gnu/libcrypto.so.3", "inode": 123457, "sha256": "…"}
  ],
  "posture": {
    "imageSigned": true,
    "sbomReferrer": "present|missing",
    "attestation": { "uuid": "rekor-uuid", "verified": true }
  },
  "delta": {
    "baselineImageDigest": "sha256:abcd…",
    "changedFiles": ["/opt/app/server.py"], // optional quick signal
    "newBinaries": [{ "path":"/usr/local/bin/helper","sha256":"…" }]
  },
  "evidence": [
    {"signal":"procfs.maps","value":"/lib/.../libssl.so.3@0x7f..."},
    {"signal":"cri.task.inspect","value":"pid=12345"},
    {"signal":"registry.referrers","value":"sbom: application/vnd.cyclonedx+json"}
  ]
}
```
### 2.2 Admission decision (webhook → API server)
```json
{
  "admissionId": "…",
  "namespace": "payments",
  "podSpecDigest": "sha256:…",
  "images": [
    {
      "name": "ghcr.io/acme/api:1.2.3",
      "resolved": "ghcr.io/acme/api@sha256:abcd…",
      "signed": true,
      "hasSbomReferrers": true,
      "policyVerdict": "pass|warn|fail",
      "reasons": ["unsigned base image", "missing SBOM"]
    }
  ],
  "decision": "Allow|Deny",
  "ttlSeconds": 300
}
```
---
## 3) Observer — node agent (DaemonSet)
### 3.1 Responsibilities
* **Watch** container lifecycle (start/stop) via CRI (`/run/containerd/containerd.sock` gRPC, read-only), falling back to tailing `/var/log/containers/*.log`.
* **Resolve** container → image digest, mount point rootfs.
* **Trace entrypoint**: attach a **short-lived** nsenter/exec to PID 1 in the container, parse the shell for the `exec` chain (bounded depth), and record the **terminal program**.
* **Sample loaded libs**: read `/proc/<pid>/maps` and `exe` symlink to collect **actually loaded** DSOs; compute **sha256** for each mapped file (bounded count/size).
* **Posture check** (cheap):
  * Image signature presence (if cosign policies are local; else ask backend).
  * SBOM **referrers** presence (HEAD to registry, optional).
  * Rekor UUID known (query Scanner.WebService by image digest).
* **Publish runtime events** to Scanner.WebService `/runtime/events` (batch & compress).
* **Request delta scan** if: no SBOM in catalog OR base differs from known baseline.
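To make the loaded-library sampling concrete, the Python sketch below extracts file-backed executable mappings from the text of a `/proc/<pid>/maps` file. Treating only executable mappings as "loaded code" is a simplifying assumption for this example, the function name is hypothetical, and the result also includes the main executable, not just shared objects. Hashing is left out; a real observer would additionally enforce its byte budget while hashing.

```python
from typing import List


def mapped_files(maps_text: str, max_libs: int = 256) -> List[str]:
    """Return distinct file paths backing executable mappings, in first-seen order."""
    paths: List[str] = []
    for line in maps_text.splitlines():
        # Columns: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue                      # anonymous mapping, no backing file
        perms, path = fields[1], fields[5].strip()
        if "x" not in perms:
            continue                      # data-only mapping
        if not path.startswith("/"):
            continue                      # pseudo entries such as [vdso] or [heap]
        if path not in paths:
            paths.append(path)
            if len(paths) >= max_libs:
                break                     # hard cap, mirrors the bounded-count rule
    return paths
```

Inside the DaemonSet the text would be read from the host-mounted procfs (for example `/host/proc/<pid>/maps`), and paths would then be resolved against the container's rootfs before hashing.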
### 3.2 Privileges & mounts (K8s)
* **SecurityContext:** `runAsUser: 0`, `readOnlyRootFilesystem: true`, `allowPrivilegeEscalation: false`.
* **Capabilities:** `CAP_SYS_PTRACE` (optional if using nsenter trace), `CAP_DAC_READ_SEARCH`.
* **Host mounts (read-only):**
  * `/proc` (host) → `/host/proc`
  * `/run/containerd/containerd.sock` (or CRI-O socket)
  * `/var/lib/containerd/io.containerd.runtime.v2.task` (rootfs paths & pids)
* **Networking:** cluster-internal egress to Scanner.WebService only.
* **Rate limits:** hard caps for bytes hashed and file count per container to avoid noisy tenants.
### 3.3 Event batching
* Buffer NDJSON; flush by **N events** or **2s**.
* Backpressure: local disk ring buffer (50MB default) if Scanner is temporarily unavailable; drop oldest after cap with **metrics** and **warning** event.
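The batching and backpressure rules can be modelled with a small Python class. This is a sketch under stated assumptions: the spool is an in-memory stand-in for the on-disk ring buffer, the default of 100 events per batch is invented (the text only says "N events"), and the class and method names are hypothetical.

```python
import json
from collections import deque
from typing import Deque, List, Optional


class EventBatcher:
    """Buffers events as NDJSON; flushes on count or age; spools failed batches."""

    def __init__(self, max_events: int = 100, max_age_s: float = 2.0,
                 spool_cap_bytes: int = 50 * 1024 * 1024) -> None:
        self.max_events = max_events
        self.max_age_s = max_age_s
        self.spool_cap_bytes = spool_cap_bytes
        self.pending: List[dict] = []
        self.first_at: Optional[float] = None
        self.spool: Deque[bytes] = deque()   # undelivered batches, oldest first
        self.spool_bytes = 0
        self.dropped_batches = 0             # would be exported as a metric

    def add(self, event: dict, now: float) -> Optional[bytes]:
        """Queue one event; return an NDJSON batch if a flush is due."""
        if not self.pending:
            self.first_at = now
        self.pending.append(event)
        return self.maybe_flush(now)

    def maybe_flush(self, now: float) -> Optional[bytes]:
        """Flush when the batch is full or its oldest event is too old."""
        if not self.pending or self.first_at is None:
            return None
        full = len(self.pending) >= self.max_events
        aged = now - self.first_at >= self.max_age_s
        if not (full or aged):
            return None
        lines = [json.dumps(e, separators=(",", ":")) + "\n" for e in self.pending]
        self.pending, self.first_at = [], None
        return "".join(lines).encode("utf-8")

    def spool_failed(self, body: bytes) -> None:
        """Keep a batch the backend did not accept; drop oldest beyond the cap."""
        self.spool.append(body)
        self.spool_bytes += len(body)
        while self.spool_bytes > self.spool_cap_bytes and self.spool:
            oldest = self.spool.popleft()
            self.spool_bytes -= len(oldest)
            self.dropped_batches += 1
```

Passing `now` explicitly keeps the class free of wall-clock calls, which makes the flush and drop behaviour easy to test deterministically.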
---
## 4) Admission Webhook (Kubernetes)
### 4.1 Gate criteria
Configurable policy (fetched from backend and cached):
* **Image signature**: must be cosign-verifiable to configured key(s) or keyless identities.
* **SBOM availability**: at least one **CycloneDX** referrer or **Scanner.WebService** catalog entry.
* **Scanner policy verdict**: backend `PASS` required for namespaces/labels matching rules; allow `WARN` if configured.
* **Registry allowlists/denylists**.
* **Tag bans** (e.g., `:latest`).
* **Base image allowlists** (by digest).
### 4.2 Flow
```mermaid
sequenceDiagram
autonumber
participant K8s as API Server
participant WH as Zastava Webhook
participant SW as Scanner.WebService
K8s->>WH: AdmissionReview(Pod)
WH->>WH: Resolve images to digests (remote HEAD/pull if needed)
WH->>SW: POST /policy/runtime { digests, namespace, labels }
SW-->>WH: { per-image: {signed, hasSbom, verdict, reasons}, ttl }
alt All pass
WH-->>K8s: AdmissionResponse(Allow, ttl)
else Any fail (enforce=true)
WH-->>K8s: AdmissionResponse(Deny, message)
end
```
**Caching:** Per-digest results are cached for `ttlSeconds` (default 300 s). **Fail-open** or **fail-closed** behavior is configurable per namespace.
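The interaction between the per-digest cache and the fail-open/fail-closed setting is sketched below in Python. The class name, the injected `fetch_verdicts` callable, and the explicit `now` parameter are assumptions for illustration; the response shape (`ttlSeconds`, `results[image].policyVerdict`) follows the policy decision API described in this document.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class AdmissionGate:
    """Caches per-digest verdicts and applies fail-open/closed on backend errors."""

    def __init__(self, fetch_verdicts: Callable[[str, List[str]], dict],
                 fail_open_namespaces: Iterable[str] = (),
                 allow_warn: bool = False) -> None:
        self._fetch = fetch_verdicts
        self._fail_open = set(fail_open_namespaces)
        self._accepted = {"pass", "warn"} if allow_warn else {"pass"}
        self._cache: Dict[str, Tuple[float, str]] = {}  # image -> (expires_at, verdict)

    def decide(self, namespace: str, images: List[str], now: float) -> str:
        verdicts: Dict[str, str] = {}
        stale: List[str] = []
        for image in images:
            cached = self._cache.get(image)
            if cached and cached[0] > now:
                verdicts[image] = cached[1]      # warm cache hit
            else:
                stale.append(image)
        if stale:
            try:
                response = self._fetch(namespace, stale)
            except Exception:
                # Backend unreachable: the namespace configuration decides.
                return "Allow" if namespace in self._fail_open else "Deny"
            ttl = response.get("ttlSeconds", 300)
            for image, result in response["results"].items():
                verdict = result["policyVerdict"]
                self._cache[image] = (now + ttl, verdict)
                verdicts[image] = verdict
        # Any image without an accepted verdict (or without any verdict) denies.
        if all(verdicts.get(image) in self._accepted for image in images):
            return "Allow"
        return "Deny"
```

Only images missing from the cache are sent to the backend, so a Pod whose images were all seen within the TTL is admitted without a round-trip.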
### 4.3 TLS & HA
* Webhook has its own **serving cert** signed by the cluster CA (or a custom cert with its CA bundle set in the webhook configuration).
* Deployment ≥ 2 replicas; **leaderless**; stateless.
---
## 5) Backend integration (Scanner.WebService)
### 5.1 Ingestion endpoint
`POST /api/v1/scanner/runtime/events` *(OpTok + DPoP/mTLS)*
* Validates event schema; enforces rate caps by tenant/node; persists to **Mongo** (`runtime.events` capped collection or regular with TTL).
* Performs **correlation**:
  * Attach nearest **image SBOM** (inventory/usage) and **BOM-Index** if known.
  * If unknown/missing, schedule **delta scan** and return `202 Accepted`.
* Emits **derived signals** (usedByEntrypoint per component based on `/proc/<pid>/maps`).
### 5.2 Policy decision API (for webhook)
`POST /api/v1/scanner/policy/runtime`
Request:
```json
{
  "namespace": "payments",
  "labels": { "app": "api", "env": "prod" },
  "images": ["ghcr.io/acme/api@sha256:...", "ghcr.io/acme/nginx@sha256:..."]
}
```
Response:
```json
{
  "ttlSeconds": 300,
  "results": {
    "ghcr.io/acme/api@sha256:...": {
      "signed": true,
      "hasSbom": true,
      "policyVerdict": "pass",
      "reasons": [],
      "rekor": { "uuid": "..." }
    },
    "ghcr.io/acme/nginx@sha256:...": {
      "signed": false,
      "hasSbom": false,
      "policyVerdict": "fail",
      "reasons": ["unsigned", "missing SBOM"]
    }
  }
}
```
---
## 6) Configuration (YAML)
```yaml
zastava:
  mode:
    observer: true
    webhook: true
  authority:
    issuer: "https://authority.internal"
    aud: ["scanner","zastava"]          # tokens for backend and self-id
  backend:
    url: "https://scanner-web.internal"
    connectTimeoutMs: 500
    requestTimeoutMs: 1500
    retry: { attempts: 3, backoffMs: 200 }
  runtime:
    engine: "auto"                      # containerd|cri-o|docker|auto
    procfs: "/host/proc"
    collect:
      entryTrace: true
      loadedLibs: true
      maxLibs: 256
      maxHashBytesPerContainer: 64_000_000
      maxDepth: 48
  admission:
    enforce: true
    failOpenNamespaces: ["dev", "test"]
    verify:
      imageSignature: true
      sbomReferrer: true
      scannerPolicyPass: true
    cacheTtlSeconds: 300
    resolveTags: true                   # do remote digest resolution for tag-only images
  limits:
    eventsPerSecond: 50
    burst: 200
    perNodeQueue: 10_000
  security:
    mounts:
      containerdSock: "/run/containerd/containerd.sock:ro"
      proc: "/proc:/host/proc:ro"
      runtimeState: "/var/lib/containerd:ro"
```
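One plausible reading of the `limits.eventsPerSecond` and `limits.burst` settings is a token bucket: the bucket holds up to `burst` tokens and refills at `eventsPerSecond`. The Python sketch below illustrates that interpretation only; the configuration does not say which algorithm the observer actually uses, and the class name is hypothetical.

```python
from typing import Optional


class TokenBucket:
    """Token bucket: refills at rate_per_s, never holds more than burst tokens."""

    def __init__(self, rate_per_s: float = 50.0, burst: float = 200.0) -> None:
        self.rate_per_s = float(rate_per_s)
        self.capacity = float(burst)
        self.tokens = float(burst)           # start full so a cold start can burst
        self.updated: Optional[float] = None

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Return True and spend tokens if the event may be emitted at time `now`."""
        if self.updated is not None:
            elapsed = max(now - self.updated, 0.0)
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_s)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Events rejected by `allow` would then be coalesced or dropped according to the failure-mode table, rather than queued without bound.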
---
## 7) Security posture
* **AuthN/Z**: Authority OpToks (DPoP preferred) to backend; webhook does **not** require client auth from API server (K8s handles).
* **Least privileges**: read-only host mounts; optional `CAP_SYS_PTRACE`; **no** host networking; **no** write mounts.
* **Isolation**: never exec untrusted code; nsenter only to **read** `/proc/<pid>`.
* **Data minimization**: do not exfiltrate env vars or command arguments unless policy explicitly enables diagnostic mode.
* **Rate limiting**: per-node caps; per-tenant caps at backend.
* **Hard caps**: bytes hashed, files inspected, depth of shell parsing.
---
## 8) Metrics, logs, tracing
**Observer**
* `zastava.events_emitted_total{kind}`
* `zastava.proc_maps_samples_total{result}`
* `zastava.entrytrace_depth{p99}`
* `zastava.hash_bytes_total`
* `zastava.buffer_drops_total`
**Webhook**
* `zastava.admission_requests_total{decision}`
* `zastava.admission_latency_seconds`
* `zastava.cache_hits_total`
* `zastava.backend_failures_total`
**Logs** (structured): node, pod, image digest, decision, reasons.
**Tracing**: spans for observe→batch→post; webhook request→resolve→respond.
---
## 9) Performance & scale targets
* **Observer**: ≤ **30ms** to sample `/proc/<pid>/maps` and compute quick hashes for ≤ 64 files; ≤ **200ms** for full library set (256 libs).
* **Webhook**: P95 ≤ **8ms** with warm cache; ≤ **50ms** with one backend round-trip.
* **Throughput**: 1k admission requests/min/replica; 5k runtime events/min/node with batching.
---
## 10) Drift detection model
**Signals**
* **Process drift**: terminal program differs from **EntryTrace** baseline.
* **Library drift**: loaded DSOs not present in **Usage** SBOM view.
* **Filesystem drift**: new executable files under `/usr/local/bin`, `/opt`, `/app` with **mtime** after image creation.
* **Network drift** (optional): listening sockets on unexpected ports (from policy).
**Action**
* Emit `DRIFT` event with evidence; backend can **auto-queue** a delta scan; policy may **escalate** to alert/block (Admission cannot block already-running pods; rely on K8s policies/PodSecurity or operator action).
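Three of the signals above reduce to simple comparisons, as the Python sketch below suggests. The function names and the input shapes (plain path lists, and `(path, mtime, is_executable)` tuples with epoch-second timestamps) are assumptions for this example; network drift is omitted.

```python
from typing import Iterable, List, Sequence, Tuple


def process_drift(terminal_program: str, baseline_program: str) -> bool:
    """Process drift: the program actually running differs from the baseline."""
    return terminal_program != baseline_program


def library_drift(loaded: Iterable[str], usage_sbom_paths: Iterable[str]) -> List[str]:
    """Library drift: loaded DSOs that the Usage SBOM view does not list."""
    return sorted(set(loaded) - set(usage_sbom_paths))


def filesystem_drift(files: Iterable[Tuple[str, float, bool]], image_created: float,
                     watched: Sequence[str] = ("/usr/local/bin", "/opt", "/app")
                     ) -> List[str]:
    """Filesystem drift: executables under watched roots modified after image build."""
    drifted = []
    for path, mtime, is_executable in files:
        under_watch = any(path == root or path.startswith(root + "/") for root in watched)
        if is_executable and under_watch and mtime > image_created:
            drifted.append(path)
    return sorted(drifted)
```

Sorting the results keeps the evidence attached to a `DRIFT` event stable across runs, which helps when events are deduplicated downstream.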
---
## 11) Test matrix
* **Engines**: containerd, CRI-O, Docker; ensure PID resolution and rootfs mapping.
* **EntryTrace**: bash features (case, if, run-parts, `.`/`source`), language launchers (python/node/java).
* **Procfs**: multiple arches, musl/glibc images; static binaries (maps minimal).
* **Admission**: unsigned images, missing SBOM referrers, tag-only images, digest resolution, backend latency, cache TTL.
* **Perf/soak**: 500 Pods/node churn; webhook under HPA growth.
* **Security**: privilege-escalation attempts blocked, read-only mounts enforced, rate-limit abuse.
* **Failure injection**: backend down (observer buffers, webhook fail-open/closed), registry throttling, containerd socket unavailable.
---
## 12) Failure modes & responses
| Condition | Observer behavior | Webhook behavior |
| ------------------------------- | ---------------------------------------------- | ------------------------------------------------------ |
| Backend unreachable             | Buffer to disk; drop after cap; emit metric    | **Fail-open/closed** per namespace config              |
| PID vanished mid-sample         | Retry once; emit partial evidence              | N/A                                                    |
| CRI socket missing | Fallback to K8s events only (reduced fidelity) | N/A |
| Registry digest resolve blocked | Defer to backend; mark `resolve=unknown` | Deny or allow per `resolveTags` & `failOpenNamespaces` |
| Excessive events | Apply local rate limit, coalesce | N/A |
---
## 13) Deployment notes (K8s)
**DaemonSet (snippet):**
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata: { name: zastava-observer, namespace: stellaops }
spec:
  template:
    spec:
      serviceAccountName: zastava
      hostPID: true
      containers:
        - name: observer
          image: stellaops/zastava-observer:2.3
          securityContext:
            runAsUser: 0
            readOnlyRootFilesystem: true
            allowPrivilegeEscalation: false
            capabilities: { add: ["SYS_PTRACE","DAC_READ_SEARCH"] }
          volumeMounts:
            - { name: proc, mountPath: /host/proc, readOnly: true }
            - { name: containerd-sock, mountPath: /run/containerd/containerd.sock, readOnly: true }
            - { name: containerd-state, mountPath: /var/lib/containerd, readOnly: true }
      volumes:
        - { name: proc, hostPath: { path: /proc } }
        - { name: containerd-sock, hostPath: { path: /run/containerd/containerd.sock } }
        - { name: containerd-state, hostPath: { path: /var/lib/containerd } }
```
**Webhook (snippet):**
```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
webhooks:
  - name: gate.zastava.stella-ops.org
    admissionReviewVersions: ["v1"]
    sideEffects: None
    failurePolicy: Ignore   # or Fail
    rules:
      - operations: ["CREATE","UPDATE"]
        apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
    clientConfig:
      service:
        namespace: stellaops
        name: zastava-webhook
        path: /admit
      caBundle: <base64 CA>
```
---
## 14) Implementation notes
* **Language**: Rust (observer) for low-latency `/proc` parsing; Go/.NET viable too. Webhook can be .NET 10 for parity with backend.
* **CRI drivers**: pluggable (`containerd`, `cri-o`, `docker`). Prefer CRI over parsing logs.
* **Shell parser**: reuse Scanner.EntryTrace grammar for consistent results (compile to WASM if observer is Rust/Go).
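* **Hashing**: `BLAKE3` for fast local change detection, with `sha256` computed for any digest that is reported or compared against SBOM data. A BLAKE3 digest cannot be converted into a SHA-256 digest, so the file is hashed again (or `sha256` is computed directly when budget allows).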
* **Resilience**: never block container start; observer is **passive**; only webhook decides allow/deny.
---
## 15) Roadmap
* **eBPF** option for syscall/library load tracing (kernel-level, opt-in).
* **Windows containers** support (ETW providers, loaded modules).
* **Network posture** checks: listening ports vs policy.
* **Live used-by-entrypoint synthesis**: send compact bitset diff to backend to tighten Usage view.
* **Admission dry-run** dashboards (simulate block lists before enforcing).


@@ -31,17 +31,26 @@ Everything here is opensource and versioned— when you check out a git ta
- **03[Vision & Roadmap](03_VISION.md)**
- **04[Feature Matrix](04_FEATURE_MATRIX.md)**
### Reference & concepts
- **05[System Requirements Specification](05_SYSTEM_REQUIREMENTS_SPEC.md)**
- **07[HighLevel Architecture](40_ARCHITECTURE_OVERVIEW.md)**
- **08Module Specifications**
- [README](08_MODULE_SPECIFICATIONS/README.md)
- [`backend_api.md`](08_MODULE_SPECIFICATIONS/backend_api.md)
- [`zastava_scanner.md`](08_MODULE_SPECIFICATIONS/zastava_scanner.md)
- [`registry_scanner.md`](08_MODULE_SPECIFICATIONS/registry_scanner.md)
- [`nightly_scheduler.md`](08_MODULE_SPECIFICATIONS/nightly_scheduler.md)
### Reference & concepts
- **05[System Requirements Specification](05_SYSTEM_REQUIREMENTS_SPEC.md)**
- **07[HighLevel Architecture](07_HIGH_LEVEL_ARCHITECTURE.md)**
- **08Module Architecture Dossiers**
- [Scanner](ARCHITECTURE_SCANNER.md)
- [Feedser](ARCHITECTURE_FEEDSER.md)
- [Vexer](ARCHITECTURE_VEXER.md)
- [Signer](ARCHITECTURE_SIGNER.md)
- [Attestor](ARCHITECTURE_ATTESTOR.md)
- [Authority](ARCHITECTURE_AUTHORITY.md)
- [CLI](ARCHITECTURE_CLI.md)
- [WebUI](ARCHITECTURE_UI.md)
- [Zastava Runtime](ARCHITECTURE_ZASTAVA.md)
- [Release & Operations](ARCHITECTURE_DEVOPS.md)
- **09[API&CLI Reference](09_API_CLI_REFERENCE.md)**
- **10[Plugin SDK Guide](10_PLUGIN_SDK_GUIDE.md)**
- **10[Feedser CLI Quickstart](10_FEEDSER_CLI_QUICKSTART.md)**
- **30[Vexer Connector Packaging Guide](dev/30_VEXER_CONNECTOR_GUIDE.md)**
- **30Developer Templates**
- [Vexer Connector Skeleton](dev/templates/vexer-connector/)
- **11[Authority Service](11_AUTHORITY.md)**
- **11[Data Schemas](11_DATA_SCHEMAS.md)**
- **12[Performance Workbook](12_PERFORMANCE_WORKBOOK.md)**
@@ -57,7 +66,7 @@ Everything here is opensource and versioned— when you check out a git ta
- **21[Install Guide](21_INSTALL_GUIDE.md)**
- **22[CI/CD Recipes Library](ci/20_CI_RECIPES.md)**
- **23[FAQ](23_FAQ_MATRIX.md)**
- **24[Offline Update Kit Admin Guide](24_OUK_ADMIN_GUIDE.md)**
- **24[Offline Update Kit Admin Guide](24_OFFLINE_KIT.md)**
- **25[Feedser Apple Connector Operations](ops/feedser-apple-operations.md)**
- **26[Authority Key Rotation Playbook](ops/authority-key-rotation.md)**
- **27[Feedser CCCS Connector Operations](ops/feedser-cccs-operations.md)**


@@ -2,6 +2,7 @@
| ID | Status | Owner(s) | Depends on | Description | Exit Criteria |
|----|--------|----------|------------|-------------|---------------|
| DOC7.README-INDEX | DONE (2025-10-17) | Docs Guild | — | Refresh index docs (docs/README.md + root README) after architecture dossier split and Offline Kit overhaul. | ✅ ToC reflects new component architecture docs; ✅ root README highlights updated doc set; ✅ Offline Kit guide linked correctly. |
| DOC4.AUTH-PDG | REVIEW | Docs Guild, Plugin Team | PLG6.DOC | Copy-edit `docs/dev/31_AUTHORITY_PLUGIN_DEVELOPER_GUIDE.md`, export lifecycle diagram, add LDAP RFC cross-link. | ✅ PR merged with polish; ✅ Diagram committed; ✅ Slack handoff posted. |
| DOC1.AUTH | DONE (2025-10-12) | Docs Guild, Authority Core | CORE5B.DOC | Draft `docs/11_AUTHORITY.md` covering architecture, configuration, bootstrap flows. | ✅ Architecture + config sections approved by Core; ✅ Samples reference latest options; ✅ Offline note added. |
| DOC3.Feedser-Authority | DONE (2025-10-12) | Docs Guild, DevEx | FSR4 | Polish operator/runbook sections (DOC3/DOC5) to document Feedser authority rollout, bypass logging, and enforcement checklist. | ✅ DOC3/DOC5 updated with audit runbook references; ✅ enforcement deadline highlighted; ✅ Docs guild sign-off. |


@@ -0,0 +1,220 @@
# Vexer Connector Packaging Guide
> **Audience:** teams implementing new Vexer provider plugins (CSAF feeds,
> OpenVEX attestations, etc.)
> **Prerequisites:** read `docs/ARCHITECTURE_VEXER.md` and the module
> `AGENTS.md` in `src/StellaOps.Vexer.Connectors.Abstractions/`.
The Vexer connector SDK gives you:
- `VexConnectorBase`: deterministic logging, SHA-256 helpers, time provider.
- `VexConnectorOptionsBinder`: strongly typed YAML/JSON configuration binding.
- `IVexConnectorOptionsValidator<T>`: custom validation hooks (offline defaults, auth invariants).
- `VexConnectorDescriptor` & metadata helpers for consistent telemetry.
This guide explains how to package a connector so the Vexer Worker/WebService
can load it via the plugin host.
---
## 1. Project layout
Start from the template under
`docs/dev/templates/vexer-connector/`. It contains:
```
Vexer.MyConnector/
├── src/
│   ├── Vexer.MyConnector.csproj
│   ├── MyConnectorOptions.cs
│   ├── MyConnector.cs
│   └── MyConnectorPlugin.cs
└── manifest/
    └── connector.manifest.yaml
```
Key points:
- Target `net10.0`, enable `TreatWarningsAsErrors`, reference the
`StellaOps.Vexer.Connectors.Abstractions` project (or NuGet once published).
- Keep project ID prefix `StellaOps.Vexer.Connectors.<Provider>` so the
plugin loader can discover it with the default search pattern.
### 1.1 csproj snippet
```xml
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net10.0</TargetFramework>
<Nullable>enable</Nullable>
<ImplicitUsings>enable</ImplicitUsings>
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
</PropertyGroup>
<ItemGroup>
<ProjectReference Include="..\..\..\src\StellaOps.Vexer.Connectors.Abstractions\StellaOps.Vexer.Connectors.Abstractions.csproj" />
</ItemGroup>
</Project>
```
Adjust the `ProjectReference` for your checkout (or switch to a NuGet package
once published).
---
## 2. Implement the connector
1. **Options model**: create an options POCO with data-annotation attributes.
   Bind it via `VexConnectorOptionsBinder.Bind<TOptions>` in your connector
   constructor or `ValidateAsync`.
2. **Validator**: implement `IVexConnectorOptionsValidator<TOptions>` to add
   complex checks (e.g., ensure both `clientId` and `clientSecret` are present).
3. **Connector**: inherit from `VexConnectorBase`. Implement:
   - `ValidateAsync`: run binder/validators, log configuration summary.
   - `FetchAsync`: stream raw documents to `context.RawSink`.
   - `NormalizeAsync`: convert raw documents into `VexClaimBatch` via
     format-specific normalizers (`context.Normalizers`).
4. **Plugin adapter**: expose the connector via a plugin entry point so the
   host can instantiate it.
### 2.1 Options binding example
```csharp
using System.ComponentModel.DataAnnotations; // [Required], [Url], [Range]

public sealed class MyConnectorOptions
{
    [Required]
    [Url]
    public string CatalogUri { get; set; } = default!;

    [Required]
    public string ApiKey { get; set; } = default!;

    [Range(1, 64)]
    public int MaxParallelRequests { get; set; } = 4;
}

public sealed class MyConnectorOptionsValidator : IVexConnectorOptionsValidator<MyConnectorOptions>
{
    public void Validate(VexConnectorDescriptor descriptor, MyConnectorOptions options, IList<string> errors)
    {
        // Reject plain-HTTP catalogs; errors are collected rather than thrown.
        if (!options.CatalogUri.StartsWith("https://", StringComparison.OrdinalIgnoreCase))
        {
            errors.Add("CatalogUri must use HTTPS.");
        }
    }
}
```
Bind inside the connector:
```csharp
private readonly MyConnectorOptions _options;

public MyConnector(VexConnectorDescriptor descriptor, ILogger<MyConnector> logger, TimeProvider timeProvider)
    : base(descriptor, logger, timeProvider)
{
    // Placeholder binding: real settings arrive from the orchestrator in
    // ValidateAsync, and validators are normally registered via DI.
    _options = VexConnectorOptionsBinder.Bind<MyConnectorOptions>(
        descriptor,
        VexConnectorSettings.Empty,
        validators: new[] { new MyConnectorOptionsValidator() });
}
```
Replace `VexConnectorSettings.Empty` with the actual settings from context
inside `ValidateAsync`.
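
To see what the data annotations contribute on their own, the following sketch validates an options object with the standard `System.ComponentModel.DataAnnotations.Validator` and then applies the HTTPS rule by hand. It is an illustration only: `SketchOptions` and `OptionsCheck` are hypothetical names, and how `VexConnectorOptionsBinder` evaluates annotations internally is not shown in this guide, so the real binder may behave differently.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Hypothetical copy of the options shape, kept local so the sketch is self-contained.
public sealed class SketchOptions
{
    [Required]
    [Url]
    public string CatalogUri { get; set; } = default!;

    [Required]
    public string ApiKey { get; set; } = default!;

    [Range(1, 64)]
    public int MaxParallelRequests { get; set; } = 4;
}

public static class OptionsCheck
{
    // Collects every validation message; an empty list means the options passed.
    public static List<string> Validate(SketchOptions options)
    {
        var results = new List<ValidationResult>();

        // validateAllProperties: true makes [Url] and [Range] run as well;
        // with the default (false) only [Required] is evaluated.
        Validator.TryValidateObject(
            options, new ValidationContext(options), results, validateAllProperties: true);

        var errors = new List<string>();
        foreach (var result in results)
        {
            errors.Add(result.ErrorMessage ?? "Invalid value.");
        }

        // Custom rule in the spirit of the validator shown above.
        if (options.CatalogUri is { } uri &&
            !uri.StartsWith("https://", StringComparison.OrdinalIgnoreCase))
        {
            errors.Add("CatalogUri must use HTTPS.");
        }

        return errors;
    }

    public static void Main()
    {
        var options = new SketchOptions
        {
            CatalogUri = "http://example.invalid/catalog", // well-formed URL, but not HTTPS
            MaxParallelRequests = 0,                        // outside [1, 64]; ApiKey left unset
        };

        foreach (var error in Validate(options))
        {
            Console.WriteLine(error);
        }
    }
}
```

Collecting every message before failing, rather than stopping at the first problem, lets an operator fix a misconfigured connector in one pass.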
---
## 3. Plugin adapter & manifest
Create a simple plugin class that implements
`StellaOps.Plugin.IConnectorPlugin`. The Worker/WebService plugin host uses
this contract today.
```csharp
public sealed class MyConnectorPlugin : IConnectorPlugin
{
    private static readonly VexConnectorDescriptor Descriptor =
        new("vexer:my-provider", VexProviderKind.Vendor, "My Provider VEX");

    public string Name => Descriptor.DisplayName;

    public bool IsAvailable(IServiceProvider services) => true; // inject feature flags if needed

    public IFeedConnector Create(IServiceProvider services)
    {
        var logger = services.GetRequiredService<ILogger<MyConnector>>();
        var timeProvider = services.GetRequiredService<TimeProvider>();
        return new MyConnector(Descriptor, logger, timeProvider);
    }
}
```
> **Note:** the Vexer Worker currently instantiates connectors through the
> shared `IConnectorPlugin` contract. Once a dedicated Vexer plugin interface
> lands, you simply swap the base interface; the descriptor/connector code
> remains unchanged.
Provide a manifest describing the assembly for operational tooling:
```yaml
# manifest/connector.manifest.yaml
id: vexer-my-provider
assembly: StellaOps.Vexer.Connectors.MyProvider.dll
entryPoint: StellaOps.Vexer.Connectors.MyProvider.MyConnectorPlugin
description: >
  Official VEX feed for ExampleCorp products (CSAF JSON, daily updates).
tags:
  - vexer
  - csaf
  - vendor
Store manifests under `/opt/stella/vexer/plugins/<connector>/manifest/` in
production so the deployment tooling can inventory and verify plugins.
---
## 4. Packaging workflow
1. `dotnet publish -c Release` → copy the published DLLs to
`/opt/stella/vexer/plugins/<Provider>/`.
2. Place `connector.manifest.yaml` next to the binaries.
3. Restart the Vexer Worker or WebService (hot reload not supported yet).
4. Verify logs: `VEX-ConnectorLoader` should list the connector descriptor.
### 4.1 Offline kits
- Add the connector folder (binaries + manifest) to the Offline Kit bundle.
- Include a `settings.sample.yaml` demonstrating offline-friendly defaults.
- Document any external dependencies (e.g., SHA mirrors) in the manifest `notes`
field.
---
## 5. Testing checklist
- Unit tests around options binding & validators.
- Integration tests (future `StellaOps.Vexer.Connectors.Abstractions.Tests`)
verifying deterministic logging scopes:
`logger.BeginScope` should produce `vex.connector.id`, `vex.connector.kind`,
and `vex.connector.operation`.
- Deterministic SHA tests: repeated `CreateRawDocument` calls with identical
content must return the same digest.
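
As a starting point for the deterministic digest check, the sketch below hashes identical payloads twice and compares the results. It is an assumption-laden illustration: `DigestCheck.Sha256Hex` is a hypothetical helper, and this guide does not show how `CreateRawDocument` derives its digest, so a real test should call `CreateRawDocument` itself and compare the digests it returns.

```csharp
using System;
using System.Security.Cryptography;

public static class DigestCheck
{
    // Hypothetical helper: lower-case hex SHA-256 of the raw payload bytes.
    public static string Sha256Hex(ReadOnlySpan<byte> content) =>
        Convert.ToHexString(SHA256.HashData(content)).ToLowerInvariant();

    public static void Main()
    {
        byte[] payload = { 0x7B, 0x7D }; // "{}", the same bytes the template emits

        var first = Sha256Hex(payload);
        var second = Sha256Hex((byte[])payload.Clone()); // distinct array, identical content

        if (first != second)
        {
            throw new InvalidOperationException("Digest is not deterministic.");
        }

        Console.WriteLine(first);
    }
}
```

Hashing a cloned array rather than the same instance guards against an implementation that accidentally keys on object identity instead of content.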
---
## 6. Reference template
See `docs/dev/templates/vexer-connector/` for the full quickstart including:
- Sample options class + validator.
- Connector implementation inheriting from `VexConnectorBase`.
- Plugin adapter + manifest.
Copy the directory, rename namespaces/IDs, then iterate on provider-specific
logic.
---
*Last updated: 2025-10-17*

@@ -0,0 +1,8 @@
id: vexer-my-provider
assembly: StellaOps.Vexer.Connectors.MyProvider.dll
entryPoint: StellaOps.Vexer.Connectors.MyProvider.MyConnectorPlugin
description: |
  Example connector template. Replace metadata before shipping.
tags:
  - vexer
  - template

@@ -0,0 +1,72 @@
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Runtime.CompilerServices;
using Microsoft.Extensions.Logging;
using StellaOps.Vexer.Connectors.Abstractions;
using StellaOps.Vexer.Core;
namespace StellaOps.Vexer.Connectors.MyProvider;
public sealed class MyConnector : VexConnectorBase
{
    private readonly IEnumerable<IVexConnectorOptionsValidator<MyConnectorOptions>> _validators;
    private MyConnectorOptions? _options;

    public MyConnector(VexConnectorDescriptor descriptor, ILogger<MyConnector> logger, TimeProvider timeProvider, IEnumerable<IVexConnectorOptionsValidator<MyConnectorOptions>> validators)
        : base(descriptor, logger, timeProvider)
    {
        _validators = validators;
    }

    public override ValueTask ValidateAsync(VexConnectorSettings settings, CancellationToken cancellationToken)
    {
        _options = VexConnectorOptionsBinder.Bind(
            Descriptor,
            settings,
            validators: _validators);

        LogConnectorEvent(LogLevel.Information, "validate", "MyConnector configuration loaded.",
            new Dictionary<string, object?>
            {
                ["catalogUri"] = _options.CatalogUri,
                ["maxParallelRequests"] = _options.MaxParallelRequests,
            });

        return ValueTask.CompletedTask;
    }

    public override IAsyncEnumerable<VexRawDocument> FetchAsync(VexConnectorContext context, CancellationToken cancellationToken)
    {
        // Checked here (not inside the iterator) so the failure surfaces immediately.
        if (_options is null)
        {
            throw new InvalidOperationException("Connector not validated.");
        }

        return FetchInternalAsync(context, cancellationToken);
    }

    private async IAsyncEnumerable<VexRawDocument> FetchInternalAsync(VexConnectorContext context, [EnumeratorCancellation] CancellationToken cancellationToken)
    {
        LogConnectorEvent(LogLevel.Information, "fetch", "Fetching catalog window...");

        // Replace with real HTTP logic.
        await Task.Delay(10, cancellationToken);

        var metadata = BuildMetadata(builder => builder
            .Add("sourceUri", _options!.CatalogUri)
            .Add("window", context.Since?.ToString("O") ?? "full"));

        // `_options!` is needed here too: null-state learned inside the lambda
        // above does not flow out, and warnings are treated as errors.
        yield return CreateRawDocument(
            VexDocumentFormat.CsafJson,
            new Uri($"{_options!.CatalogUri.TrimEnd('/')}/sample.json"),
            new byte[] { 0x7B, 0x7D },
            metadata);
    }

    public override ValueTask<VexClaimBatch> NormalizeAsync(VexRawDocument document, CancellationToken cancellationToken)
    {
        var claims = ImmutableArray<VexClaim>.Empty;
        var diagnostics = ImmutableDictionary<string, string>.Empty;
        return ValueTask.FromResult(new VexClaimBatch(document, claims, diagnostics));
    }
}

@@ -0,0 +1,16 @@
using System.ComponentModel.DataAnnotations;
namespace StellaOps.Vexer.Connectors.MyProvider;
public sealed class MyConnectorOptions
{
    [Required]
    [Url]
    public string CatalogUri { get; set; } = default!;

    [Required]
    public string ApiKey { get; set; } = default!;

    [Range(1, 32)]
    public int MaxParallelRequests { get; set; } = 4;
}

@@ -0,0 +1,15 @@
using System.Collections.Generic;
using StellaOps.Vexer.Connectors.Abstractions;
namespace StellaOps.Vexer.Connectors.MyProvider;
public sealed class MyConnectorOptionsValidator : IVexConnectorOptionsValidator<MyConnectorOptions>
{
    public void Validate(VexConnectorDescriptor descriptor, MyConnectorOptions options, IList<string> errors)
    {
        if (!options.CatalogUri.StartsWith("https://", StringComparison.OrdinalIgnoreCase))
        {
            errors.Add("CatalogUri must use HTTPS.");
        }
    }
}

@@ -0,0 +1,27 @@
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using StellaOps.Plugin;
using StellaOps.Vexer.Connectors.Abstractions;
using StellaOps.Vexer.Core;
namespace StellaOps.Vexer.Connectors.MyProvider;
public sealed class MyConnectorPlugin : IConnectorPlugin
{
    private static readonly VexConnectorDescriptor Descriptor = new(
        id: "vexer:my-provider",
        kind: VexProviderKind.Vendor,
        displayName: "My Provider VEX");

    public string Name => Descriptor.DisplayName;

    public bool IsAvailable(IServiceProvider services) => true;

    public IFeedConnector Create(IServiceProvider services)
    {
        var logger = services.GetRequiredService<ILogger<MyConnector>>();
        var timeProvider = services.GetRequiredService<TimeProvider>();
        var validators = services.GetServices<IVexConnectorOptionsValidator<MyConnectorOptions>>();
        return new MyConnector(Descriptor, logger, timeProvider, validators);
    }
}

@@ -0,0 +1,12 @@
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>net10.0</TargetFramework>
    <Nullable>enable</Nullable>
    <ImplicitUsings>enable</ImplicitUsings>
    <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
  </PropertyGroup>
  <ItemGroup>
    <!-- Adjust the relative path when copying this template into a repo -->
    <ProjectReference Include="..\..\..\..\src\StellaOps.Vexer.Connectors.Abstractions\StellaOps.Vexer.Connectors.Abstractions.csproj" />
  </ItemGroup>
</Project>

@@ -1,6 +1,6 @@
# Feedser CERT-Bund Connector Operations
_Last updated: 2025-10-15_
_Last updated: 2025-10-17_
Germany's Federal Office for Information Security (BSI) operates the Warn- und Informationsdienst (WID) portal. The Feedser CERT-Bund connector (`source:cert-bund:*`) ingests the public RSS feed, hydrates the portal's JSON detail endpoint, and maps the result into canonical advisories while preserving the original German content.
@@ -96,18 +96,30 @@ curl -s -b cookies.txt \
Iterate `page` until the response `content` array is empty. Pages 0–9 currently cover 2014→present. Persist JSON responses (plus SHA-256) for Offline Kit parity.
> **Shortcut** run `python tools/certbund_offline_snapshot.py --output seed-data/cert-bund`
> to bootstrap the session, capture the paginated search responses, and regenerate
> the manifest/checksum files automatically. Supply `--cookie-file` and `--xsrf-token`
> if the portal requires a browser-derived session (see options via `--help`).
### 3.3 Export bundles
```bash
curl -s -b cookies.txt \
-H "Accept: application/json" \
-H "X-XSRF-TOKEN: ${XSRF}" \
"https://wid.cert-bund.de/portal/api/securityadvisory/export?format=json&from=2020-01-01" \
> certbund-2020-2025.json
python tools/certbund_offline_snapshot.py \
--output seed-data/cert-bund \
--start-year 2014 \
--end-year "$(date -u +%Y)"
```
Split long ranges per year and record provenance (`from`, `to`, SHA, capturedAt). Feedser can ingest these JSON payloads directly when operating offline.
Task `FEEDCONN-CERTBUND-02-009` tracks turning this workflow into a shipped Offline Kit artefact with manifests and documentation updates—coordinate with the Docs guild before publishing.
The helper stores yearly exports under `seed-data/cert-bund/export/`,
captures paginated search snapshots in `seed-data/cert-bund/search/`,
and generates the manifest + SHA files in `seed-data/cert-bund/manifest/`.
Split ranges according to your compliance window (default: one file per
calendar year). Feedser can ingest these JSON payloads directly when
operating offline.
> When automatic bootstrap fails (e.g. portal introduces CAPTCHA), run the
> manual `curl` flow above, then rerun the helper with `--skip-fetch` to
> rebuild the manifest from the existing files.
### 3.4 Connector-driven catch-up