Files
git.stella-ops.org/docs/architecture/overview.md
root 68da90a11a
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
Restructure solution layout by module
2025-10-28 15:10:40 +02:00

169 lines
9.4 KiB
Markdown
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# StellaOps Architecture Overview (Sprint19)
> **Ownership:** Architecture Guild • Docs Guild
> **Audience:** Service owners, platform engineers, solution architects
> **Related:** [High-Level Architecture](../07_HIGH_LEVEL_ARCHITECTURE.md), [Concelier Architecture](../ARCHITECTURE_CONCELIER.md), [Policy Engine Architecture](policy-engine.md), [Aggregation-Only Contract](../ingestion/aggregation-only-contract.md)
This dossier summarises the end-to-end runtime topology after the Aggregation-Only Contract (AOC) rollout. It highlights where raw facts live, how ingest services enforce guardrails, and how downstream components consume those facts to derive policy decisions and user-facing experiences.
---
## 1·System landscape
```mermaid
graph TD
subgraph Edge["Clients & Automation"]
CLI[stella CLI]
UI[Console SPA]
APIClients[CI / API Clients]
end
Gateway[API Gateway<br/>(JWT + DPoP scopes)]
subgraph Scanner["Fact Collection"]
ScannerWeb[Scanner.WebService]
ScannerWorkers[Scanner.Workers]
Agent[Agent Runtime]
end
subgraph Ingestion["Aggregation-Only Ingestion (AOC)"]
Concelier[Concelier.WebService]
Excititor[Excititor.WebService]
RawStore[(MongoDB<br/>advisory_raw / vex_raw)]
end
subgraph Derivation["Policy & Overlay"]
Policy[Policy Engine]
Scheduler[Scheduler Services]
Notify[Notifier]
end
subgraph Experience["UX & Export"]
UIService[Console Backend]
Exporters[Export / Offline Kit]
end
Observability[Telemetry Stack]
CLI --> Gateway
UI --> Gateway
APIClients --> Gateway
Gateway --> ScannerWeb
ScannerWeb --> ScannerWorkers
ScannerWorkers --> Concelier
ScannerWorkers --> Excititor
Concelier --> RawStore
Excititor --> RawStore
RawStore --> Policy
Policy --> Scheduler
Policy --> Notify
Policy --> UIService
Scheduler --> UIService
UIService --> Exporters
Exporters --> CLI
Exporters --> Offline[Offline Kit]
Observability -.-> ScannerWeb
Observability -.-> Concelier
Observability -.-> Excititor
Observability -.-> Policy
Observability -.-> Scheduler
Observability -.-> Notify
```
Key boundaries:
- **AOC border.** Everything inside the Ingestion subgraph writes only immutable raw facts plus link hints. Derived severity, consensus, and risk remain outside the border.
- **Policy-only derivation.** Policy Engine materialises `effective_finding_*` collections and emits overlays; other services consume but never mutate them.
- **Tenant enforcement.** Authority-issued DPoP scopes flow through Gateway to every service; raw stores and overlays include `tenant` strictly.
---
## 2·Aggregation-Only Contract focus
### 2.1 Responsibilities at the boundary
| Area | Services | Responsibilities under AOC | Forbidden under AOC |
|------|----------|-----------------------------|---------------------|
| **Ingestion (Concelier / Excititor)** | `StellaOps.Concelier.WebService`, `StellaOps.Excititor.WebService` | Fetch upstream advisories/VEX, verify signatures, compute linksets, append immutable documents to `advisory_raw` / `vex_raw`, emit observability signals, expose raw read APIs. | Computing severity, consensus, suppressions, or policy hints; merging upstream sources into a single derived record; mutating existing documents. |
| **Policy & Overlay** | `StellaOps.Policy.Engine`, Scheduler | Join SBOM inventory with raw advisories/VEX, evaluate policies, issue `effective_finding_*` overlays, drive remediation workflows. | Writing to raw collections; bypassing guard scopes; running without recorded provenance. |
| **Experience layers** | Console, CLI, Exporters | Surface raw facts + policy overlays; run `stella aoc verify`; render AOC dashboards and reports. | Accepting ingestion payloads that lack provenance or violate guard results. |
### 2.2 Raw stores
| Collection | Purpose | Key fields | Notes |
|------------|---------|------------|-------|
| `advisory_raw` | Immutable vendor/ecosystem advisory documents. | `_id`, `tenant`, `source.*`, `upstream.*`, `content.raw`, `linkset`, `supersedes`. | Idempotent by `(source.vendor, upstream.upstream_id, upstream.content_hash)`. |
| `vex_raw` | Immutable vendor VEX statements. | Mirrors `advisory_raw`; `identifiers.statements` summarises affected components. | Maintains supersedes chain identical to advisory flow. |
| Change streams (`advisory_raw_stream`, `vex_raw_stream`) | Feed Policy Engine and Scheduler. | `operationType`, `documentKey`, `fullDocument`, `tenant`, `traceId`. | Scope filtered per tenant before delivery. |
### 2.3 Guarded ingestion sequence
```mermaid
sequenceDiagram
participant Upstream as Upstream Source
participant Connector as Concelier/Excititor Connector
participant Guard as AOCWriteGuard
participant Mongo as MongoDB (advisory_raw / vex_raw)
participant Stream as Change Stream
participant Policy as Policy Engine
Upstream-->>Connector: CSAF / OSV / VEX document
Connector->>Connector: Normalize transport, compute content_hash
Connector->>Guard: Candidate raw doc (source + upstream + content + linkset)
Guard-->>Connector: ERR_AOC_00x on violation
Guard->>Mongo: Append immutable document (with tenant & supersedes)
Mongo-->>Stream: Change event (tenant scoped)
Stream->>Policy: Raw delta payload
Policy->>Policy: Evaluate policies, compute effective findings
```
---
### 2.4 Authority scopes & tenancy
| Scope | Holder | Purpose | Notes |
|-------|--------|---------|-------|
| `advisory:ingest` / `vex:ingest` | Concelier / Excititor collectors | Append raw documents through ingestion endpoints. | Paired with tenant claims; requests without tenant are rejected. |
| `advisory:read` / `vex:read` | DevOps verify identity, CLI | Run `stella aoc verify` or call `/aoc/verify`. | Read-only; cannot mutate raw docs. |
| `effective:write` | Policy Engine | Materialise `effective_finding_*` overlays. | Only Policy Engine identity may hold; ingestion contexts receive `ERR_AOC_006` if they attempt. |
| `findings:read` | Console, CLI, exports | Consume derived findings. | Enforced by Gateway and downstream services. |
---
## 3·Data & control flow highlights
1. **Ingestion:** Concelier / Excititor connectors fetch upstream documents, compute linksets, and hand payloads to `AOCWriteGuard`. Guards validate schema, provenance, forbidden fields, supersedes pointers, and append-only rules before writing to Mongo.
2. **Verification:** `stella aoc verify` (CLI/CI) and `/aoc/verify` endpoints replay guard checks against stored documents, mapping `ERR_AOC_00x` codes to exit codes for automation.
3. **Policy evaluation:** Mongo change streams deliver tenant-scoped raw deltas. Policy Engine joins SBOM inventory (via BOM Index), executes deterministic policies, writes overlays, and emits events to Scheduler/Notify.
4. **Experience surfaces:** Console renders an AOC dashboard showing ingestion latency, guard violations, and supersedes depth. CLI exposes raw-document fetch helpers for auditing. Offline Kit bundles raw collections alongside guard configs to keep air-gapped installs verifiable.
5. **Observability:** All services emit `ingestion_write_total`, `aoc_violation_total{code}`, `ingestion_latency_seconds`, and trace spans `ingest.fetch`, `ingest.transform`, `ingest.write`, `aoc.guard`. Logs correlate via `traceId`, `tenant`, `source.vendor`, and `content_hash`.
---
## 4·Offline & disaster readiness
- **Offline Kit:** Packages raw Mongo snapshots (`advisory_raw`, `vex_raw`) plus guard configuration and CLI verifier binaries so air-gapped sites can re-run AOC checks before promotion.
- **Recovery:** Supersedes chains allow rollback to prior revisions without mutating documents. Disaster exercises must rehearse restoring from snapshot, replaying change streams into Policy Engine, and re-validating guard compliance.
- **Migration:** Legacy normalised fields are moved to temporary views during cutover; ingestion runtime removes writes once guard-enforced path is live (see [Migration playbook](../ingestion/aggregation-only-contract.md#8-migration-playbook)).
---
## 5·References
- [Aggregation-Only Contract reference](../ingestion/aggregation-only-contract.md)
- [Concelier architecture](../ARCHITECTURE_CONCELIER.md)
- [Excititor architecture](../ARCHITECTURE_EXCITITOR.md)
- [Policy Engine architecture](policy-engine.md)
- [Authority service](../ARCHITECTURE_AUTHORITY.md)
- [Observability standards (upcoming)](../observability/policy.md) interim reference for telemetry naming.
---
## 6·Compliance checklist
- [ ] AOC guard enabled for all Concelier and Excititor write paths in production.
- [ ] Mongo schema validators deployed for `advisory_raw` and `vex_raw`; change streams scoped per tenant.
- [ ] Authority scopes (`advisory:*`, `vex:*`, `effective:*`) configured in Gateway and validated via integration tests.
- [ ] `stella aoc verify` wired into CI/CD pipelines with seeded violation fixtures.
- [ ] Console AOC dashboard and CLI documentation reference the new ingestion contract.
- [ ] Offline Kit bundles include guard configs, verifier tooling, and documentation updates.
- [ ] Observability dashboards include violation, latency, and supersedes depth metrics with alert thresholds.
---
*Last updated: 2025-10-26 (Sprint19).*