docs: Add comprehensive component architecture documentation

Created detailed architectural documentation showing component interactions,
communication patterns, and data flows across all StellaOps services.

## New Documentation

**docs/ARCHITECTURE_DETAILED.md** - Comprehensive architecture guide:
- Component topology diagram (all 36+ services)
- Infrastructure layer details (PostgreSQL, Valkey, RustFS, NATS)
- Service-by-service catalog with responsibilities
- Communication patterns with WHY (business purpose)
- 5 detailed data flow diagrams:
  1. Scan Request Flow (CLI → Scanner → Worker → Policy → Signer → Attestor → Notify)
  2. Advisory Update Flow (Concelier → Scheduler → Scanner re-evaluation)
  3. VEX Update Flow (Excititor → IssuerDirectory → Scheduler → Policy)
  4. Notification Delivery Flow (Scanner → Valkey → Notify → Slack/Teams/Email)
  5. Policy Evaluation Flow (Scanner → Policy.Gateway → OPA → PostgreSQL replication)
- Database schema isolation details per service
- Security boundaries and authentication flows

## Updated Documentation

**docs/DEVELOPER_ONBOARDING.md**:
- Added link to detailed architecture
- Simplified overview with component categories
- Quick reference topology tree

**docs/07_HIGH_LEVEL_ARCHITECTURE.md**:
- Updated infrastructure requirements section
- Clarified PostgreSQL as ONLY database
- Emphasized Valkey as REQUIRED (not optional)
- Marked NATS as optional (Valkey is default transport)
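
The requirement levels above can be sketched as a startup check. This is a minimal illustration, not the StellaOps configuration schema: `InfraConfig`, its field names, and the error strings are hypothetical, and only the rules themselves (PostgreSQL and Valkey required, Valkey Streams as the default transport, NATS opt-in) come from the updated document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InfraConfig:
    """Hypothetical settings object; field names are illustrative only."""
    postgres_dsn: Optional[str] = None
    valkey_url: Optional[str] = None
    queue_transport: str = "valkey"   # Valkey Streams is the default transport
    nats_url: Optional[str] = None    # only consulted when transport is "nats"


def validate(cfg: InfraConfig) -> List[str]:
    """Return human-readable errors; an empty list means the config is usable."""
    errors: List[str] = []
    if not cfg.postgres_dsn:
        errors.append("PostgreSQL is required")
    # Valkey stays required even when NATS carries the queues, because it
    # also backs the cache, DPoP nonces, and rate limiting.
    if not cfg.valkey_url:
        errors.append("Valkey is required")
    if cfg.queue_transport not in ("valkey", "nats"):
        errors.append(f"Unknown queue transport: {cfg.queue_transport}")
    elif cfg.queue_transport == "nats" and not cfg.nats_url:
        errors.append("NATS transport selected but nats_url is not set")
    return errors
```

Leaving `queue_transport` at its default never touches `nats_url`, which mirrors the opt-in wording in the updated section.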

**docs/README.md**:
- Added link to detailed architecture in navigation

## Key Architectural Insights Documented

**Communication Patterns:**
- 11 communication steps in scan flow (Gateway → Scanner → Valkey → Worker → Concelier → Policy → Signer → Attestor → Valkey → Notify → Slack)
- PostgreSQL logical replication (advisory_raw_stream, vex_raw_stream → Policy Engine)
- Valkey Streams for async job queuing (XADD/XREADGROUP pattern)
- HTTP webhooks for delta events (Concelier/Excititor → Scheduler)
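
To make the XADD/XREADGROUP pattern concrete, here is a small in-memory model of one stream with consumer groups. It is a teaching sketch rather than a Valkey client: the `MiniStream` class, the group name, and the job fields are hypothetical. It only mimics the delivery semantics of reading with the `>` ID, where each new entry goes to exactly one consumer per group and stays pending until acknowledged. A real deployment would also create the group first with `XGROUP CREATE`.

```python
from collections import defaultdict


class MiniStream:
    """In-memory stand-in for a single stream (not a Valkey client)."""

    def __init__(self):
        self._entries = []                  # ordered list of (entry_id, fields)
        self._cursor = defaultdict(int)     # group -> index of next undelivered entry
        self._pending = defaultdict(dict)   # group -> {entry_id: consumer}

    def xadd(self, fields):
        """Append an entry and return its ID (simplified to '<sequence>-0')."""
        entry_id = f"{len(self._entries) + 1}-0"
        self._entries.append((entry_id, dict(fields)))
        return entry_id

    def xreadgroup(self, group, consumer, count=1):
        """Deliver up to `count` entries not yet delivered to this group."""
        start = self._cursor[group]
        batch = self._entries[start:start + count]
        self._cursor[group] = start + len(batch)
        for entry_id, _ in batch:
            self._pending[group][entry_id] = consumer
        return batch

    def xack(self, group, entry_id):
        """Acknowledge an entry; returns 1 if it was pending, else 0."""
        return 1 if self._pending[group].pop(entry_id, None) is not None else 0

    def pending(self, group):
        """Entries delivered to the group but not yet acknowledged."""
        return dict(self._pending[group])


stream = MiniStream()
stream.xadd({"image": "registry.example/app:1"})
stream.xadd({"image": "registry.example/app:2"})
first = stream.xreadgroup("scan-workers", "worker-1")
second = stream.xreadgroup("scan-workers", "worker-2")
```

Two workers in the same group split the jobs between them, while a second group reading the same stream would see every entry again from the start.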

**Security Boundaries:**
- Authority issues OpToks with DPoP binding (RFC 9449)
- Signer enforces PoE validation + scanner digest verification
- All services validate JWT + DPoP on every request
- Tenant isolation via tenant_id in all PostgreSQL queries
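
One piece of the DPoP check, the binding between an access token and the client's key, can be sketched with the standard library alone. RFC 9449 binds a token through a `cnf.jkt` claim holding the SHA-256 JWK thumbprint (RFC 7638) of the proof key. The function names and sample key values below are illustrative, and the sketch deliberately leaves out everything else a real validator must do: verifying the proof's signature and checking `htm`, `htu`, `iat`, `jti`, and any nonce.

```python
import base64
import hashlib
import hmac
import json

# Required JWK members per key type (RFC 7638, plus RFC 8037 for OKP keys).
_REQUIRED_MEMBERS = {
    "EC": ("crv", "kty", "x", "y"),
    "RSA": ("e", "kty", "n"),
    "OKP": ("crv", "kty", "x"),
}


def jwk_thumbprint(jwk: dict) -> str:
    """Base64url (unpadded) SHA-256 thumbprint of a public JWK."""
    members = _REQUIRED_MEMBERS[jwk["kty"]]
    # Canonical form: required members only, sorted keys, no whitespace.
    canonical = json.dumps(
        {name: jwk[name] for name in members},
        separators=(",", ":"),
        sort_keys=True,
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def token_bound_to_key(access_token_claims: dict, proof_jwk: dict) -> bool:
    """True when the token's cnf.jkt matches the key from the DPoP proof header."""
    confirmation = access_token_claims.get("cnf")
    expected = confirmation.get("jkt") if isinstance(confirmation, dict) else None
    if not isinstance(expected, str):
        return False
    actual = jwk_thumbprint(proof_jwk)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), actual.encode("utf-8"))
```

Because the thumbprint covers only the required members in a fixed order, extra JWK fields such as `use` or `kid` do not change the result.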

**Database Patterns:**
- 8 dedicated PostgreSQL schemas (authority, scanner, vuln, vex, scheduler, notify, policy, orchestrator)
- Append-only advisory/VEX storage (AOC - Aggregation-Only Contract)
- BOM-Index for impact selection (CVE → PURL → image mapping)
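
The CVE → PURL → image mapping amounts to an inverted index, which the following sketch illustrates. The `BomIndex` class, its method names, and the sample identifiers are hypothetical, and matching is by exact PURL string. A real index would presumably resolve version ranges, which this example does not attempt.

```python
from collections import defaultdict


class BomIndex:
    """Inverted index from package URL (PURL) to the images that contain it."""

    def __init__(self):
        self._images_by_purl = defaultdict(set)

    def add_sbom(self, image_digest, purls):
        """Register every component of one image's SBOM."""
        for purl in purls:
            self._images_by_purl[purl].add(image_digest)

    def impacted_images(self, affected_purls):
        """Union of all images containing any of the affected packages."""
        impacted = set()
        for purl in affected_purls:
            impacted |= self._images_by_purl.get(purl, set())
        return impacted


def select_for_cve(index, advisories, cve_id):
    """Resolve CVE -> PURLs via the advisory map, then PURLs -> images."""
    return index.impacted_images(advisories.get(cve_id, ()))


index = BomIndex()
index.add_sbom("sha256:aaa", ["pkg:npm/lodash@4.17.20", "pkg:npm/express@4.18.2"])
index.add_sbom("sha256:bbb", ["pkg:npm/express@4.18.2"])
advisories = {"CVE-0000-0001": ["pkg:npm/lodash@4.17.20"]}  # made-up CVE ID
to_rescan = select_for_cve(index, advisories, "CVE-0000-0001")
```

Only the images returned by the lookup need re-evaluation when an advisory changes, which is the point of impact selection.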

This documentation shows which service calls which, why they communicate,
what data flows through the system, and how security is enforced at each
layer.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
master
2025-12-23 11:05:55 +02:00
parent 21337f4de6
commit 396e9b75a4
6 changed files with 1596 additions and 9 deletions


@@ -50,15 +50,29 @@
| **Web UI** | `stellaops/ui` | Angular app for scans, diffs, policy, VEX, vulnerability triage (artifact-first), audit bundles, **Scheduler**, **Notify**, runtime, reports. | Stateless. |
| **StellaOps.Cli** | `stellaops/cli` | CLI for init/scan/export/diff/policy/report/verify; Buildx helper; **schedule** and **notify** verbs. | Local/CI. |
### 1.2 Third-party (self-hosted)
### 1.2 Infrastructure Requirements
* **Fulcio** (Sigstore CA) — issues short-lived signing certs (keyless).
* **Rekor v2** (tile-backed transparency log).
* **RustFS** — offline-first object store with deterministic REST API; S3/MinIO compatibility layer available for legacy deployments.
* **PostgreSQL** (≥16) — primary control-plane storage with per-module schema isolation (authority, vuln, vex, scheduler, notify, policy, concelier). See [Database Architecture](#database-architecture-postgresql).
* **Valkey** (≥8.0) — Redis-compatible cache for DPoP nonces, event streams, queues, and rate limiting.
* **Queue** — Valkey Streams (default); NATS JetStream available as optional transport (opt-in only).
* **OCI Registry** — must support **Referrers API** (discover SBOMs/signatures).
**REQUIRED Infrastructure:**
* **PostgreSQL** (≥16) — **ONLY** database for all persistent data. Per-module schema isolation (authority, vuln, vex, scanner, scheduler, notify, policy, orchestrator). See [Database Architecture](#database-architecture-postgresql).
* **Valkey** (≥8.0) — Redis-compatible cache, DPoP nonces (RFC 9449), event streams, job queues, rate limiting. **REQUIRED** for platform operation.
* **RustFS** — S3-compatible object storage for SBOM artifacts, proof bundles, and scan evidence. HTTP API with deterministic responses.
**OPTIONAL Infrastructure:**
* **NATS JetStream** — Alternative messaging transport (Valkey Streams is default). Opt-in only via configuration.
**External Dependencies:**
* **OCI Registry** — Must support **Referrers API** (discover SBOMs/signatures).
* **Fulcio** (Sigstore CA) — Issues short-lived signing certs (keyless signing). Optional if using KMS keys.
* **Rekor v2** — Tile-backed transparency log. Optional if `OFFLINEKIT_ENABLED=true` (airgap mode).
**Architecture Note:**
- PostgreSQL is the ONLY database (MongoDB fully removed as of 2025-12-23)
- Valkey replaces Redis (drop-in compatible, but required)
- RustFS is primary object storage (MinIO removed)
- NATS is OPTIONAL, not required (Valkey Streams handle queuing)
### 1.3 Cloud licensing (StellaOps)

File diff suppressed because it is too large


@@ -19,7 +19,62 @@
StellaOps is a deterministic SBOM + VEX platform built as a microservices architecture with 36+ services organized into functional domains.
### Runtime Topology - High-Level
**📖 For detailed component architecture with communication patterns, see [ARCHITECTURE_DETAILED.md](./ARCHITECTURE_DETAILED.md)**
### Quick Reference - Component Topology
```
CLIENT LAYER
├─ stella CLI → Gateway (JWT + DPoP auth)
├─ Web UI (Angular) → Gateway (JWT + DPoP auth)
├─ CI/CD Pipelines → Gateway (JWT + DPoP auth)
└─ Zastava Observer → Scanner (runtime scans)
INFRASTRUCTURE (REQUIRED)
├─ PostgreSQL v16+ → Primary database (ALL services)
├─ Valkey v8.0 → Cache, DPoP, queues, events
└─ RustFS → Object storage (S3 API)
INFRASTRUCTURE (OPTIONAL)
└─ NATS JetStream → Alternative messaging (Valkey is default)
GATEWAY LAYER
└─ Gateway.WebService → Auth, routing, rate limiting
AUTH & CRYPTO
├─ Authority → OAuth2/OIDC, OpTok issuance
├─ Signer → DSSE signing (FIPS/GOST/SM)
└─ Attestor → Rekor v2 transparency log
CORE ENGINES
├─ Scanner.WebService → Scan orchestration
├─ Scanner.Worker → Image analysis, SBOM generation
├─ Concelier.WebService → Advisory ingestion (NVD, Red Hat, etc.)
├─ Excititor.WebService → VEX ingestion + consensus
├─ Policy.Gateway → OPA/Rego policy evaluation
├─ Scheduler.WebService → Re-scan orchestration
├─ Notify.WebService → Notification orchestration
├─ Notify.Worker → Slack/Teams/Email delivery
└─ Orchestrator.WebService → DAG workflows, pack runs
SUPPORTING
└─ IssuerDirectory → VEX issuer trust registry
```
### Service Categories
| Category | Services | Purpose |
|----------|----------|---------|
| **Gateway** | Gateway.WebService | API routing, auth enforcement |
| **Auth & Security** | Authority, Signer, Attestor | OAuth2, signing, transparency |
| **Scanning** | Scanner.Web, Scanner.Worker | Container analysis, SBOM |
| **Advisory** | Concelier.Web, Concelier.Worker | Vulnerability ingestion |
| **VEX** | Excititor.Web, Excititor.Worker | Exploitability statements |
| **Policy** | Policy.Gateway, Policy Engine | OPA/Rego evaluation |
| **Orchestration** | Scheduler, Orchestrator | Job coordination |
| **Notifications** | Notify.Web, Notify.Worker | Delivery to Slack/Teams/Email |
### Runtime Topology - Infrastructure Dependencies
```
┌─────────────────────────────────────────────────────────────────────┐