docs consolidation

StellaOps Bot
2025-12-24 12:38:14 +02:00
parent 7503c19b8f
commit 9a08d10b89
215 changed files with 2188 additions and 9623 deletions


@@ -1,8 +1,48 @@
# Platform topology (detailed)
This document provides a clean, audit-friendly view of StellaOps platform topology without relying on fragile ASCII diagrams. For module-specific details (APIs, schemas, operations), use `docs/modules/`.
This document provides a comprehensive view of StellaOps platform topology. For module-specific details (APIs, schemas, operations), see `docs/modules/`.
## Layers
## Component topology (quick reference)
```
CLIENT LAYER
├─ stella CLI → Gateway (JWT + DPoP auth)
├─ Web UI (Angular) → Gateway (JWT + DPoP auth)
├─ CI/CD Pipelines → Gateway (JWT + DPoP auth)
└─ Zastava Observer → Scanner (runtime scans)
INFRASTRUCTURE (REQUIRED)
├─ PostgreSQL v16+ → Primary database (ALL services)
├─ Valkey v8.0 → Cache, DPoP, queues, events
└─ RustFS → Object storage (S3 API)
INFRASTRUCTURE (OPTIONAL)
└─ NATS JetStream → Alternative messaging (Valkey is default)
GATEWAY LAYER
└─ Gateway.WebService → Auth, routing, rate limiting
AUTH & CRYPTO
├─ Authority → OAuth2/OIDC, OpTok issuance
├─ Signer → DSSE signing (FIPS/GOST/SM)
└─ Attestor → Rekor v2 transparency log
CORE ENGINES
├─ Scanner.WebService → Scan orchestration
├─ Scanner.Worker → Image analysis, SBOM generation
├─ Concelier.WebService → Advisory ingestion (NVD, Red Hat, etc.)
├─ Excititor.WebService → VEX ingestion + consensus
├─ Policy.Gateway → OPA/Rego policy evaluation
├─ Scheduler.WebService → Re-scan orchestration
├─ Notify.WebService → Notification orchestration
├─ Notify.Worker → Slack/Teams/Email delivery
└─ Orchestrator.WebService → DAG workflows, pack runs
SUPPORTING
└─ IssuerDirectory → VEX issuer trust registry
```
## Layers (tabular reference)
| Layer | Primary components | Responsibility |
| --- | --- | --- |
@@ -12,6 +52,108 @@ This document provides a clean, audit-friendly view of StellaOps platform topolo
| Core engines | Scanner, Concelier, Excititor, Policy, Scheduler, Notify, Orchestrator | Scanning, ingestion, verdicts, orchestration, notifications, exports. |
| Data plane | PostgreSQL, Valkey, RustFS (S3), optional NATS | Persistent state, queues/streams, artifact storage, optional alternative messaging. |
## Service categories (detailed)
| Category | Services | Purpose |
|----------|----------|---------|
| **Gateway** | Gateway.WebService | API routing, auth enforcement |
| **Auth & Security** | Authority, Signer, Attestor | OAuth2, signing, transparency |
| **Scanning** | Scanner.Web, Scanner.Worker | Container analysis, SBOM |
| **Advisory** | Concelier.Web, Concelier.Worker | Vulnerability ingestion |
| **VEX** | Excititor.Web, Excititor.Worker | Exploitability statements |
| **Policy** | Policy.Gateway, Policy Engine | OPA/Rego evaluation |
| **Orchestration** | Scheduler, Orchestrator | Job coordination |
| **Notifications** | Notify.Web, Notify.Worker | Delivery to Slack/Teams/Email |
## Layered architecture diagram
```
┌─────────────────────────────────────────────────────────────────────┐
│ USER EXPERIENCE │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Gateway │ │ Web (UI) │ │ CLI │ │
│ │ (API Router) │ │ (Angular v17)│ │(Multi-plat) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│ DATA & EXPORT │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ExportCenter │ │EvidenceLocker│ │FindingsLedger│ │
│ │(SARIF/SBOM) │ │(Artifacts) │ │(Audit Trail) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│ EVENTS & NOTIFICATIONS │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Notify │ │ Notifier │ │TimelineIndex │ │
│ │(Slack/Teams) │ │ (Advanced) │ │ (Events) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│ ORCHESTRATION & WORKFLOW │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Scheduler │ │ Orchestrator │ │ TaskRunner │ │
│ │(Job Sched) │ │(Coordinator) │ │(Executor) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│ SCANNING & ANALYSIS │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │Scanner.Web │ │Scanner.Worker│ │ AdvisoryAI │ │
│ │(API/Control) │ │(Analyzers) │ │(ML Analysis) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ RiskEngine │ │ Policy │ │
│ │ (Scoring) │ │ (Engine) │ │
│ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│ INGESTION & AGGREGATION │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Concelier │ │ Excititor │ │IssuerDirectry│ │
│ │(Advisories) │ │ (VEX) │ │(CSAF Pubshrs)│ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│ AUTHENTICATION & SIGNING │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Authority │─▶│ Signer │─▶│ Attestor │ │
│ │ (OAuth2/OIDC)│ │(DSSE/PKIX) │ │(in-toto/DSSE)│ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│ INFRASTRUCTURE LAYER │
│ ┌──────────────────┐ ┌──────────────────┐ ┌─────────────────┐ │
│ │ PostgreSQL │ │ Valkey │ │ RustFS │ │
│ │ (v16+ ONLY) │ │ (Redis-compat) │ │ (S3-like API) │ │
│ │ │ │ - Caching │ │ - Artifacts │ │
│ │ All services use │ │ - DPoP nonces │ │ - SBOMs │ │
│ │ PostgreSQL for │ │ - Event queues │ │ - Signatures │ │
│ │ persistent data │ │ - Rate limiting│ │ │ │
│ └──────────────────┘ └──────────────────┘ └─────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ Optional: NATS JetStream (alternative transport for queues) │ │
│ │ Only used if explicitly configured in appsettings │ │
│ └──────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
```
## Notes
- Module dossiers live under `docs/modules/<module>/architecture.md`.
- Deployment defaults (ports, profile overlays, pinned digests) live under `deploy/` (`deploy/compose/`, `deploy/helm/`, `deploy/releases/`).


@@ -7,7 +7,7 @@ This document describes the canonical end-to-end flows at a level useful for deb
1. **Client -> Gateway**: submit scan request (authenticated; tenant-scoped).
2. **Gateway -> Scanner.WebService**: route request after auth/rate-limit checks.
3. **Scanner.WebService -> PostgreSQL**: persist scan manifest and initial status.
4. **Scanner.WebService -> queue/stream**: enqueue a scan job (Valkey streams by default; optional alternative transports exist).
4. **Scanner.WebService -> queue/stream**: enqueue a scan job (transport is profile/config dependent; for example Valkey streams or NATS).
5. **Scanner.Worker -> queue/stream**: claim job, pull image, extract layers, run analyzers.
6. **Scanner.Worker -> RustFS/S3**: write SBOM fragments, composed SBOMs, and other scan artifacts.
7. **Scanner.Worker -> Concelier**: query linksets / observations needed for evaluation (deployment-dependent).
@@ -15,10 +15,121 @@ This document describes the canonical end-to-end flows at a level useful for deb
9. **Scanner.WebService -> Policy**: request verdict evaluation using SBOM + advisory + VEX + policy inputs.
10. **Scanner.WebService -> Signer / Attestor (optional)**: create DSSE/in-toto evidence bundles and (optionally) attach transparency receipts.
11. **Scanner.WebService -> events stream**: publish completion events for notifications and downstream consumers.
12. **Notify.WebService/Worker -> channels**: render and deliver notifications with idempotency tracking.
12. **Notification engine -> channels**: render and deliver notifications with idempotency tracking.
Offline note: for air-gapped deployments, step 6 writes to local object storage and step 7 relies on offline mirrors/bundles rather than public feeds. See `docs/24_OFFLINE_KIT.md` and `docs/airgap/overview.md`.
### Scan execution sequence diagram
```
┌──────────────────────────────────────────────────────────────────────────────────┐
│ 1. CLIENT REQUEST (CLI or Web UI) │
│ $ stella scan docker://alpine:latest --sbom-format=spdx │
└───────────────────────────────────┬──────────────────────────────────────────────┘
│ HTTPS
┌──────────────────────────────────────────────────────────────────────────────────┐
│ 2. GATEWAY (API Router) │
│ - Terminates TLS │
│ - Routes to appropriate backend service │
│ - Load balancing (if multiple instances) │
└───────────────────────────────────┬──────────────────────────────────────────────┘
│ HTTP (internal)
┌──────────────────────────────────────────────────────────────────────────────────┐
│ 3. AUTHORITY (Authentication) │
│ - Validates OAuth2 access token (DPoP-bound) │
│ - Checks DPoP proof against Valkey nonce cache │
│ - Returns user identity and scopes │
│ │
│ ┌─────────────┐ │
│ │ Valkey │◀── DPoP nonce validation (GET/SET) │
│ │ (Cache) │ │
│ └─────────────┘ │
│ ┌─────────────┐ │
│ │ PostgreSQL │◀── User/client lookup (SELECT) │
│ └─────────────┘ │
└───────────────────────────────────┬──────────────────────────────────────────────┘
│ Authenticated request
┌──────────────────────────────────────────────────────────────────────────────────┐
│ 4. SCANNER.WEB (Scan API Controller) │
│ - Validates scan request parameters │
│ - Creates scan job record in PostgreSQL │
│ - Enqueues scan job to Valkey queue (default) or NATS (if configured) │
│ │
│ ┌─────────────┐ │
│ │ PostgreSQL │◀── INSERT scan_jobs (job_id, image_ref, status='pending') │
│ └─────────────┘ │
│ ┌─────────────┐ │
│ │ Valkey │◀── XADD scanner:jobs (enqueue job message) │
│ │ (Queue) │ │
│ └─────────────┘ │
│ │
│ Returns: HTTP 202 Accepted { "job_id": "scan-abc123", "status": "queued" } │
└───────────────────────────────────┬──────────────────────────────────────────────┘
│ (Client polls for status)
┌──────────────────────────────────────────────────────────────────────────────────┐
│ 5. SCANNER.WORKER (Background Processor) │
│ - Consumes job from Valkey queue (XREADGROUP scanner:jobs) │
│ - Updates job status to 'running' │
│ - Downloads container image from registry │
│ - Executes analyzers (OS packages, language deps, files) │
│ - Generates SBOM (SPDX/CycloneDX) │
│ - Stores artifacts to RustFS │
│ │
│ ┌─────────────┐ │
│ │ PostgreSQL │◀── UPDATE scan_jobs SET status='running' │
│ │ │◀── INSERT sbom_documents, packages, vulnerabilities │
│ │ │◀── UPDATE scan_jobs SET status='completed' │
│ └─────────────┘ │
│ ┌─────────────┐ │
│ │ RustFS │◀── PUT /artifacts/scan-abc123/sbom.spdx.json │
│ │ (S3 API) │◀── PUT /artifacts/scan-abc123/image-layers.tar.gz │
│ └─────────────┘ │
│ ┌─────────────┐ │
│ │ Valkey │◀── XADD scanner:events (publish scan.completed event) │
│ │(Event Stream│ │
│ └─────────────┘ │
└───────────────────────────────────┬──────────────────────────────────────────────┘
│ Event published
┌──────────────────────────────────────────────────────────────────────────────────┐
│ 6. EVENT PROPAGATION (Valkey Streams) │
│ │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │ Valkey Event Stream: "scanner:events" │ │
│ │ Event: { "type": "scan.completed", "job_id": "scan-abc123", ... } │ │
│ └───────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌──────────────┼──────────────┬───────────────┐ │
│ ▼ ▼ ▼ ▼ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Notify │ │Timeline │ │ Policy │ │ Export │ │
│ │ Worker │ │ Indexer │ │ Engine │ │ Center │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ │ │ │ │ │
│ │ (all subscribe to scanner:events via XREADGROUP) │
└─────────┼───────────────┼──────────────┼───────────────┼─────────────────────────┘
│ │ │ │
▼ ▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────┐ ┌──────────────┐
│ 7a. NOTIFY │ │ 7b. TIMELINE │ │ 7c.POLICY│ │ 7d. EXPORT │
│ │ │ INDEXER │ │ ENGINE │ │ CENTER │
│ - Query scan │ │ │ │ │ │ │
│ results │ │ - Index event│ │ - Eval │ │ - Generate │
│ - Check user │ │ timeline │ │ policy │ │ SARIF │
│ notif prefs│ │ - Store in │ │ rules │ │ - Export to │
│ - Send Slack │ │ PostgreSQL │ │ - Block/ │ │ external │
│ message │ │ │ │ Allow │ │ systems │
│ │ │ │ │ │ │ │
│ PostgreSQL ◀─┤ │ PostgreSQL ◀─┤ │PostgreSQL│ │ RustFS ◀─┤
│ (user prefs) │ │ (timeline) │ │(policies)│ │ (exports) │
└──────────────┘ └──────────────┘ └──────────┘ └──────────────┘
```
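The enqueue/claim/publish steps in the diagram above can be sketched with a toy in-memory stream. This is a minimal illustration, not the Scanner implementation: `InMemoryStream`, `submit_scan`, `run_worker_once`, and the `scanner-workers` group name are hypothetical, and the class only mimics the consumer-group behaviour of `XADD`/`XREADGROUP`/`XACK` (each group sees every entry once; consumers in the same group share entries).

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class InMemoryStream:
    """Toy stand-in for a stream with consumer groups (not a real Valkey client)."""
    entries: list = field(default_factory=list)  # [(entry_id, payload)]
    cursors: dict = field(default_factory=lambda: defaultdict(int))   # group -> next index
    pending: dict = field(default_factory=lambda: defaultdict(dict))  # group -> {id: consumer}

    def xadd(self, payload: dict) -> str:
        entry_id = f"{len(self.entries) + 1}-0"
        self.entries.append((entry_id, payload))
        return entry_id

    def xreadgroup(self, group: str, consumer: str, count: int = 1) -> list:
        # Hand out entries this group has not seen yet and mark them pending.
        start = self.cursors[group]
        batch = self.entries[start:start + count]
        self.cursors[group] = start + len(batch)
        for entry_id, _ in batch:
            self.pending[group][entry_id] = consumer
        return batch

    def xack(self, group: str, entry_id: str) -> bool:
        return self.pending[group].pop(entry_id, None) is not None


def submit_scan(jobs: InMemoryStream, db: dict, image_ref: str) -> dict:
    """Step 4: persist the job record, enqueue it, answer with a 202-style body."""
    job_id = f"scan-{len(db) + 1}"
    db[job_id] = {"image_ref": image_ref, "status": "pending"}
    jobs.xadd({"job_id": job_id})
    return {"job_id": job_id, "status": "queued"}


def run_worker_once(jobs: InMemoryStream, events: InMemoryStream, db: dict, consumer: str):
    """Step 5: claim one job, update its status, publish the completion event."""
    for entry_id, message in jobs.xreadgroup("scanner-workers", consumer):
        job = db[message["job_id"]]
        job["status"] = "running"
        # Image pull, analyzers, and SBOM generation would happen here.
        job["status"] = "completed"
        events.xadd({"type": "scan.completed", "job_id": message["job_id"]})
        jobs.xack("scanner-workers", entry_id)
        return message["job_id"]
    return None
```

Because each downstream consumer reads `events` under its own group name, Notify, the timeline indexer, and the others each receive the same `scan.completed` entry independently, which is the fan-out shown in step 6.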
## 2) Advisory ingestion (delta-driven)
1. **Concelier.Worker** fetches advisories from configured sources (mirrors first; no hidden outbound calls in air-gap profiles).
@@ -43,6 +154,111 @@ Offline note: for air-gapped deployments, step 6 writes to local object storage
## 5) Notification delivery
1. **Notify.WebService** consumes platform events (scan completed, advisory delta, etc.).
2. **Notify.WebService -> queue/stream** enqueues delivery tasks with idempotency keys.
3. **Notify.Worker -> channels** delivers (email/chat/webhook), records results, and retries with deterministic backoff rules.
1. **Notification engine** consumes platform events (scan completed, advisory delta, etc.).
2. **Notification engine -> queue/stream** enqueues delivery tasks with idempotency keys (when a worker model is used).
3. **Delivery workers -> channels** deliver (email/chat/webhook), record results, and retry with deterministic backoff rules.
### Notification flow diagram (vulnerability alert)
```
┌──────────────────────────────────────────────────────────────────────────────────┐
│ TRIGGER: New critical CVE detected in existing scan │
│ Source: Concelier advisory ingestion │
└───────────────────────────────────┬──────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────────────────┐
│ CONCELIER.WORKER (Advisory Processor) │
│ │
│ 1. Ingest new advisory from NVD/OSV/CSAF │
│ ┌─────────────┐ │
│ │ PostgreSQL │◀── INSERT INTO advisories (cve_id, severity, ...) │
│ └─────────────┘ │
│ │
│ 2. Match advisory against existing scans (PURL/CPE matching) │
│ ┌─────────────┐ │
│ │ PostgreSQL │◀── SELECT scans WHERE package_purl IN (affected_purls) │
│ └─────────────┘ │
│ │
│ 3. Publish drift event to Valkey │
│ ┌─────────────┐ │
│ │ Valkey │◀── XADD concelier:drift (new vulnerability found) │
│ └─────────────┘ │
└───────────────────────────────────┬──────────────────────────────────────────────┘
│ Event published
┌──────────────────────────────────────────────────────────────────────────────────┐
│ NOTIFY.WORKER (Notification Processor) │
│ │
│ 1. Consume drift event from Valkey stream │
│ ┌─────────────┐ │
│ │ Valkey │◀── XREADGROUP concelier:drift notify-workers │
│ └─────────────┘ │
│ │
│ 2. Query user notification preferences │
│ ┌─────────────┐ │
│ │ PostgreSQL │◀── SELECT * FROM user_notification_preferences │
│ │ │ WHERE user_id = scan_owner AND channel = 'slack' │
│ └─────────────┘ │
│ │
│ 3. Render notification template │
│ Template: "New critical CVE-2024-1234 affects alpine:latest scan" │
│ │
│ 4. Deliver notification via configured channels │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ External APIs │ │
│ │ - POST https://hooks.slack.com/services/T00/B00/xxx │ │
│ │ - POST https://graph.microsoft.com/v1.0/teams/channels │ │
│ │ - SMTP send (email) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ 5. Store delivery receipt in PostgreSQL │
│ ┌─────────────┐ │
│ │ PostgreSQL │◀── INSERT INTO notification_deliveries (status, ...) │
│ └─────────────┘ │
└──────────────────────────────────────────────────────────────────────────────────┘
```
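The idempotency keys and deterministic backoff mentioned in the delivery steps of section 5 can be sketched as follows. Everything here is an assumption for illustration: the key derivation (SHA-256 over event, channel, and recipient), the backoff constants, and the `deliver` helper are hypothetical, and the sketch records the wait schedule instead of sleeping so that it stays fast to run.

```python
import hashlib


def idempotency_key(event_id: str, channel: str, recipient: str) -> str:
    """Same event + channel + recipient always maps to the same key."""
    raw = "\x1f".join((event_id, channel, recipient)).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def backoff_seconds(attempt: int, base: float = 2.0, cap: float = 300.0) -> float:
    """Deterministic (no jitter) exponential backoff: base * 2**(attempt - 1), capped."""
    return min(cap, base * 2 ** (attempt - 1))


def deliver(event_id, channel, recipient, send, receipts, max_attempts=3):
    """Deliver at most once per idempotency key; `receipts` stands in for the receipt table."""
    key = idempotency_key(event_id, channel, recipient)
    previous = receipts.get(key)
    if previous is not None and previous["status"] == "delivered":
        return previous  # duplicate event: do not send again

    waits = []
    for attempt in range(1, max_attempts + 1):
        try:
            send(channel, recipient)
        except ConnectionError:
            # A real worker would wait this long before the next attempt.
            waits.append(backoff_seconds(attempt))
            continue
        receipts[key] = {"status": "delivered", "attempts": attempt, "waits": waits}
        return receipts[key]

    receipts[key] = {"status": "failed", "attempts": max_attempts, "waits": waits}
    return receipts[key]
```

Keeping the backoff free of random jitter makes retry timing reproducible from the attempt number alone, which is one way to read "deterministic backoff rules"; the actual rules are defined by the Notify module.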
## 6) Export flow (SBOM distribution)
```
┌──────────────────────────────────────────────────────────────────────────────────┐
│ EXPORT REQUEST: GET /api/v1/scans/{scan_id}/export?format=spdx │
└───────────────────────────────────┬──────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────────────────┐
│ SCANNER.WEB or EXPORT CENTER │
│ │
│ 1. Query scan metadata from PostgreSQL │
│ ┌─────────────┐ │
│ │ PostgreSQL │◀── SELECT * FROM scan_jobs WHERE job_id = $1 │
│ │ │◀── SELECT * FROM sbom_documents WHERE scan_id = $1 │
│ └─────────────┘ │
│ │
│ 2. Retrieve SBOM artifact from RustFS │
│ ┌─────────────┐ │
│ │ RustFS │◀── GET /artifacts/scan-abc123/sbom.spdx.json │
│ └─────────────┘ │
│ │
│ 3. Sign SBOM with Signer service │
│ ┌─────────────┐ │
│ │ Signer │◀── POST /api/v1/sign (SBOM payload) │
│ │ │──▶ Returns: DSSE envelope with signature │
│ └─────────────┘ │
│ │
│ 4. Create in-toto attestation with Attestor │
│ ┌─────────────┐ │
│ │ Attestor │◀── POST /api/v1/attest (signed SBOM) │
│ │ │──▶ Returns: in-toto attestation bundle │
│ └─────────────┘ │
│ │
│ 5. Store final bundle to RustFS │
│ ┌─────────────┐ │
│ │ RustFS │◀── PUT /artifacts/scan-abc123/bundle.jsonl │
│ └─────────────┘ │
│ │
│ 6. Return signed bundle to client │
│ Returns: HTTP 200 OK (application/vnd.in-toto+json) │
└──────────────────────────────────────────────────────────────────────────────────┘
```
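Step 3 of the export flow returns a DSSE envelope. The sketch below shows the general shape of such an envelope and the DSSE pre-authentication encoding (PAE) that is signed, as defined by the public DSSE specification. It is not the Signer service's code: HMAC-SHA256 is used purely as a stand-in signature so the example runs with the standard library, and `sign_envelope`, `verify_envelope`, and the key handling are hypothetical.

```python
import base64
import hashlib
import hmac


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the exact bytes that get signed."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def sign_envelope(payload: bytes, payload_type: str, key: bytes, keyid: str) -> dict:
    # Stand-in signature: a real signer would use an asymmetric key here.
    sig = hmac.new(key, pae(payload_type, payload), hashlib.sha256).digest()
    return {
        "payload": base64.b64encode(payload).decode("ascii"),
        "payloadType": payload_type,
        "signatures": [{"keyid": keyid, "sig": base64.b64encode(sig).decode("ascii")}],
    }


def verify_envelope(envelope: dict, key: bytes) -> bytes:
    """Return the decoded payload if any signature matches, else raise ValueError."""
    payload = base64.b64decode(envelope["payload"])
    expected = hmac.new(key, pae(envelope["payloadType"], payload), hashlib.sha256).digest()
    for entry in envelope["signatures"]:
        if hmac.compare_digest(base64.b64decode(entry["sig"]), expected):
            return payload
    raise ValueError("no valid signature")
```

Signing the PAE rather than the raw payload binds the `payloadType` to the signature, so an envelope cannot be relabelled as a different document type without invalidating it.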


@@ -10,6 +10,18 @@ All externally reachable services are expected to enforce:
3. Scope-based authorization (least privilege).
4. Tenant isolation: requests and data access are filtered by tenant context.
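As a rough illustration of items 3 and 4 above, a handler can check the required scope first and then filter every result by the caller's tenant. The `Caller` type, the `scanner:read` scope name, and the record layout are hypothetical; the authoritative scope catalogue is in the security documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caller:
    subject: str
    tenant: str
    scopes: frozenset


class Forbidden(Exception):
    """Raised when the caller lacks the scope required by an operation."""


def require_scope(caller: Caller, scope: str) -> None:
    # Least privilege: each operation names exactly the scope it needs.
    if scope not in caller.scopes:
        raise Forbidden(f"missing scope: {scope}")


def list_scans(caller: Caller, scans: list) -> list:
    require_scope(caller, "scanner:read")  # hypothetical scope name
    # Tenant isolation: never return rows that belong to another tenant.
    return [scan for scan in scans if scan["tenant"] == caller.tenant]
```

Filtering by tenant inside the data-access function, rather than in each caller, keeps the isolation rule in one place.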
### Hard gates (typical examples)
Exact gates are module-specific, but common patterns include:
- **Authority**: nonce-based sender constraints (DPoP), strict token lifetimes, tenant-scoped issuance, and rate limiting.
- **Signing/attestation services**: narrow scopes, service identity requirements (often mTLS), and verification of the artifact being signed/attested (for example digest checks) before producing evidence.
Authoritative references:
- `docs/security/scopes-and-roles.md`
- `docs/modules/authority/architecture.md`
- `docs/modules/signer/architecture.md`
- `docs/modules/attestor/architecture.md`
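The nonce-based gate listed for Authority can be pictured as a single-use, time-limited nonce store. This is only a generic sketch under stated assumptions: it does not validate DPoP proofs, the `NonceStore` class and its TTL are hypothetical, and an in-memory dict stands in for the shared cache.

```python
import secrets
import time


class NonceStore:
    """Issues nonces that can be consumed exactly once before they expire."""

    def __init__(self, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock   # injectable so expiry can be tested without sleeping
        self._expiry = {}     # nonce -> expiry timestamp

    def issue(self) -> str:
        nonce = secrets.token_urlsafe(16)
        self._expiry[nonce] = self._clock() + self._ttl
        return nonce

    def consume(self, nonce: str) -> bool:
        # pop() removes the nonce even when it has expired, so a replay always fails.
        expires_at = self._expiry.pop(nonce, None)
        return expires_at is not None and self._clock() <= expires_at
```

In a multi-instance deployment the check-and-remove step has to be atomic in the shared cache; the dict `pop` here only models that behaviour within a single process.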
## Network segmentation (typical deployment)
- **Front door / ingress**: TLS termination, rate limiting, and WAF controls.