StellaOps Architecture Overview (Sprint 19)

Ownership: Architecture Guild • Docs Guild
Audience: Service owners, platform engineers, solution architects
Related: High-Level Architecture, Concelier Architecture, Policy Engine Architecture, Aggregation-Only Contract

This dossier summarises the end-to-end runtime topology after the Aggregation-Only Contract (AOC) rollout. It highlights where raw facts live, how ingest services enforce guardrails, and how downstream components consume those facts to derive policy decisions and user-facing experiences.


1 · System landscape

graph TD
    subgraph Edge["Clients & Automation"]
        CLI[stella CLI]
        UI[Console SPA]
        APIClients[CI / API Clients]
    end
    Gateway["API Gateway<br/>(JWT + DPoP scopes)"]
    subgraph Scanner["Fact Collection"]
        ScannerWeb[Scanner.WebService]
        ScannerWorkers[Scanner.Workers]
        Agent[Agent Runtime]
    end
    subgraph Ingestion["Aggregation-Only Ingestion (AOC)"]
        Concelier[Concelier.WebService]
        Excititor[Excititor.WebService]
        RawStore[(MongoDB<br/>advisory_raw / vex_raw)]
    end
    subgraph Derivation["Policy & Overlay"]
        Policy[Policy Engine]
        Scheduler[Scheduler Services]
        Notify[Notifier]
    end
    subgraph Experience["UX & Export"]
        UIService[Console Backend]
        Exporters[Export / Offline Kit]
    end
    Observability[Telemetry Stack]

    CLI --> Gateway
    UI --> Gateway
    APIClients --> Gateway
    Gateway --> ScannerWeb
    ScannerWeb --> ScannerWorkers
    ScannerWorkers --> Concelier
    ScannerWorkers --> Excititor
    Concelier --> RawStore
    Excititor --> RawStore
    RawStore --> Policy
    Policy --> Scheduler
    Policy --> Notify
    Policy --> UIService
    Scheduler --> UIService
    UIService --> Exporters
    Exporters --> CLI
    Exporters --> Offline[Offline Kit]
    Observability -.-> ScannerWeb
    Observability -.-> Concelier
    Observability -.-> Excititor
    Observability -.-> Policy
    Observability -.-> Scheduler
    Observability -.-> Notify

Key boundaries:

  • AOC border. Everything inside the Ingestion subgraph writes only immutable raw facts plus link hints. Derived severity, consensus, and risk remain outside the border.
  • Policy-only derivation. Policy Engine materialises effective_finding_* collections and emits overlays; other services consume but never mutate them.
  • Tenant enforcement. Authority-issued DPoP scopes flow through Gateway to every service; every document in the raw stores and overlays must carry a tenant field.

2 · Aggregation-Only Contract focus

2.1 Responsibilities at the boundary

| Area | Services | Responsibilities under AOC | Forbidden under AOC |
|------|----------|----------------------------|---------------------|
| Ingestion (Concelier / Excititor) | `StellaOps.Concelier.WebService`, `StellaOps.Excititor.WebService` | Fetch upstream advisories/VEX, verify signatures, compute linksets, append immutable documents to `advisory_raw` / `vex_raw`, emit observability signals, expose raw read APIs. | Computing severity, consensus, suppressions, or policy hints; merging upstream sources into a single derived record; mutating existing documents. |
| Policy & Overlay | `StellaOps.Policy.Engine`, Scheduler | Join SBOM inventory with raw advisories/VEX, evaluate policies, issue `effective_finding_*` overlays, drive remediation workflows. | Writing to raw collections; bypassing guard scopes; running without recorded provenance. |
| Experience layers | Console, CLI, Exporters | Surface raw facts + policy overlays; run `stella aoc verify`; render AOC dashboards and reports. | Accepting ingestion payloads that lack provenance or violate guard results. |
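
As an illustration of the "Forbidden under AOC" column for ingestion, the following is a minimal sketch of a check that rejects derived fields on a candidate raw document. The key names in `FORBIDDEN_KEYS`, the `OPAQUE_PATHS` set, and the function name are hypothetical; the actual guard's field list is not specified in this document.

```python
from typing import Any

# Hypothetical deny-list based on the "Forbidden under AOC" column;
# the real guard's field names are not specified in this document.
FORBIDDEN_KEYS = {"severity", "consensus", "suppressions", "policy_hints"}

# Upstream payloads are stored verbatim, so this subtree is not inspected.
OPAQUE_PATHS = {"content"}


def find_forbidden_keys(doc: dict[str, Any], prefix: str = "") -> list[str]:
    """Return dotted paths of forbidden derived fields outside the raw payload."""
    hits: list[str] = []
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if path in OPAQUE_PATHS:
            continue
        if key in FORBIDDEN_KEYS:
            hits.append(path)
        elif isinstance(value, dict):
            hits.extend(find_forbidden_keys(value, prefix=f"{path}."))
    return hits


candidate = {
    "tenant": "tenant-a",
    "content": {"raw": {"severity": "high"}},  # upstream's own field, left untouched
    "linkset": {},
    "severity": "high",  # derived field added by a connector: not allowed
}
print(find_forbidden_keys(candidate))  # ['severity']
```

The sketch skips `content` because upstream advisories may legitimately carry their own severity; only fields the ingestion service would add itself are flagged.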

2.2 Raw stores

| Collection | Purpose | Key fields | Notes |
|------------|---------|------------|-------|
| `advisory_raw` | Immutable vendor/ecosystem advisory documents. | `_id`, `tenant`, `source.*`, `upstream.*`, `content.raw`, `linkset`, `supersedes`. | Idempotent by `(source.vendor, upstream.upstream_id, upstream.content_hash)`. |
| `vex_raw` | Immutable vendor VEX statements. | Mirrors `advisory_raw`; `identifiers.statements` summarises affected components. | Maintains supersedes chain identical to advisory flow. |
| Change streams (`advisory_raw_stream`, `vex_raw_stream`) | Feed Policy Engine and Scheduler. | `operationType`, `documentKey`, `fullDocument`, `tenant`, `traceId`. | Scope filtered per tenant before delivery. |
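
The idempotency rule for `advisory_raw` can be sketched as follows. This is an in-memory stand-in, not the MongoDB implementation: the class and helper names are hypothetical, and SHA-256 is an assumption because the document does not name the hash algorithm behind `content_hash`.

```python
import hashlib
from typing import Any


def compute_content_hash(raw: bytes) -> str:
    # SHA-256 is an assumption; the document does not name the hash algorithm.
    return "sha256:" + hashlib.sha256(raw).hexdigest()


def idempotency_key(doc: dict[str, Any]) -> tuple[str, str, str]:
    """Build the (source.vendor, upstream.upstream_id, upstream.content_hash) key."""
    return (
        doc["source"]["vendor"],
        doc["upstream"]["upstream_id"],
        doc["upstream"]["content_hash"],
    )


class InMemoryRawStore:
    """Append-only stand-in for advisory_raw; existing entries are never replaced."""

    def __init__(self) -> None:
        self._docs: dict[tuple[str, str, str], dict[str, Any]] = {}

    def append(self, doc: dict[str, Any]) -> bool:
        key = idempotency_key(doc)
        if key in self._docs:
            return False  # same source, upstream id, and bytes: nothing to write
        self._docs[key] = doc
        return True

    def __len__(self) -> int:
        return len(self._docs)


def make_doc(vendor: str, upstream_id: str, raw: bytes) -> dict[str, Any]:
    # Illustrative document shape using only fields named in the table.
    return {
        "tenant": "tenant-a",
        "source": {"vendor": vendor},
        "upstream": {"upstream_id": upstream_id, "content_hash": compute_content_hash(raw)},
        "content": {"raw": raw.decode("utf-8")},
    }


store = InMemoryRawStore()
first = store.append(make_doc("example-vendor", "ADV-1", b'{"rev": 1}'))
replay = store.append(make_doc("example-vendor", "ADV-1", b'{"rev": 1}'))
revised = store.append(make_doc("example-vendor", "ADV-1", b'{"rev": 2}'))
print(first, replay, revised, len(store))  # True False True 2
```

Re-fetching identical upstream bytes is a no-op, while a changed payload produces a new document rather than an update to the old one.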

2.3 Guarded ingestion sequence

sequenceDiagram
    participant Upstream as Upstream Source
    participant Connector as Concelier/Excititor Connector
    participant Guard as AOCWriteGuard
    participant Mongo as MongoDB (advisory_raw / vex_raw)
    participant Stream as Change Stream
    participant Policy as Policy Engine

    Upstream-->>Connector: CSAF / OSV / VEX document
    Connector->>Connector: Normalize transport, compute content_hash
    Connector->>Guard: Candidate raw doc (source + upstream + content + linkset)
    Guard-->>Connector: ERR_AOC_00x on violation
    Guard->>Mongo: Append immutable document (with tenant & supersedes)
    Mongo-->>Stream: Change event (tenant scoped)
    Stream->>Policy: Raw delta payload
    Policy->>Policy: Evaluate policies, compute effective findings
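
The sequence above can be approximated in a few lines. This is a sketch under stated assumptions: `IngestPipeline`, `AocViolation`, and the required-field list are hypothetical names, the list-backed store stands in for MongoDB, and the literal `ERR_AOC_00x` is kept as the document's placeholder rather than a specific code.

```python
from typing import Any, Callable

# Fields named in the sequence diagram; the real schema is likely richer.
REQUIRED_FIELDS = ("tenant", "source", "upstream", "content", "linkset")

Event = dict[str, Any]
Handler = Callable[[Event], None]


class AocViolation(Exception):
    """Raised by the guard step; carries the violation code."""

    def __init__(self, code: str, detail: str) -> None:
        super().__init__(f"{code}: {detail}")
        self.code = code


class IngestPipeline:
    def __init__(self) -> None:
        self.raw_store: list[dict[str, Any]] = []  # stand-in for advisory_raw / vex_raw
        self._subscribers: dict[str, list[Handler]] = {}

    def subscribe(self, tenant: str, handler: Handler) -> None:
        self._subscribers.setdefault(tenant, []).append(handler)

    def ingest(self, candidate: dict[str, Any]) -> None:
        # Guard: reject before anything is written.
        missing = [name for name in REQUIRED_FIELDS if name not in candidate]
        if missing:
            raise AocViolation("ERR_AOC_00x", f"missing fields: {missing}")
        # Append: store a copy so later changes to the candidate do not leak in.
        stored = dict(candidate)
        self.raw_store.append(stored)
        # Change event: delivered only to handlers registered for the same tenant.
        event: Event = {
            "operationType": "insert",
            "tenant": stored["tenant"],
            "fullDocument": stored,
        }
        for handler in self._subscribers.get(stored["tenant"], []):
            handler(event)


pipeline = IngestPipeline()
tenant_a_events: list[Event] = []
tenant_b_events: list[Event] = []
pipeline.subscribe("tenant-a", tenant_a_events.append)
pipeline.subscribe("tenant-b", tenant_b_events.append)

pipeline.ingest({"tenant": "tenant-a", "source": {}, "upstream": {}, "content": {}, "linkset": {}})
try:
    pipeline.ingest({"tenant": "tenant-a", "source": {}})
except AocViolation as violation:
    rejected_code = violation.code

print(len(pipeline.raw_store), len(tenant_a_events), len(tenant_b_events), rejected_code)
```

The rejected candidate never reaches the store or the stream, which is the property the guard step exists to provide.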

2.4 Authority scopes & tenancy

| Scope | Holder | Purpose | Notes |
|-------|--------|---------|-------|
| `advisory:ingest` / `vex:ingest` | Concelier / Excititor collectors | Append raw documents through ingestion endpoints. | Paired with tenant claims; requests without tenant are rejected. |
| `advisory:read` / `vex:read` | DevOps verify identity, CLI | Run `stella aoc verify` or call `/aoc/verify`. | Read-only; cannot mutate raw docs. |
| `effective:write` | Policy Engine | Materialise `effective_finding_*` overlays. | Only Policy Engine identity may hold; ingestion contexts receive `ERR_AOC_006` if they attempt. |
| `findings:read` | Console, CLI, exports | Consume derived findings. | Enforced by Gateway and downstream services. |
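
A minimal sketch of how the rules in this table might combine at a service boundary. The `authorize` function and its return strings are hypothetical; only the scope names, the tenant requirement, and the `ERR_AOC_006` outcome for ingestion contexts come from the table.

```python
from typing import AbstractSet, Optional

INGEST_SCOPES = frozenset({"advisory:ingest", "vex:ingest"})


def authorize(scopes: AbstractSet[str], tenant: Optional[str], required: str) -> str:
    """Return "ok" or a rejection reason for a request needing `required`."""
    if not tenant:
        return "rejected: missing tenant"
    # Ingestion contexts may never materialise overlays, whatever else they hold.
    if required == "effective:write" and scopes & INGEST_SCOPES:
        return "ERR_AOC_006"
    if required not in scopes:
        return "rejected: missing scope"
    return "ok"


print(authorize({"advisory:ingest"}, "tenant-a", "advisory:ingest"))  # ok
print(authorize({"advisory:ingest"}, None, "advisory:ingest"))        # rejected: missing tenant
print(authorize({"advisory:ingest"}, "tenant-a", "effective:write"))  # ERR_AOC_006
print(authorize({"effective:write"}, "tenant-a", "effective:write"))  # ok
print(authorize({"findings:read"}, "tenant-a", "effective:write"))    # rejected: missing scope
```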

3 · Data & control flow highlights

  1. Ingestion: Concelier / Excititor connectors fetch upstream documents, compute linksets, and hand payloads to AOCWriteGuard. Guards validate schema, provenance, forbidden fields, supersedes pointers, and append-only rules before writing to Mongo.
  2. Verification: stella aoc verify (CLI/CI) and /aoc/verify endpoints replay guard checks against stored documents, mapping ERR_AOC_00x codes to exit codes for automation.
  3. Policy evaluation: Mongo change streams deliver tenant-scoped raw deltas. Policy Engine joins SBOM inventory (via BOM Index), executes deterministic policies, writes overlays, and emits events to Scheduler/Notify.
  4. Experience surfaces: Console renders an AOC dashboard showing ingestion latency, guard violations, and supersedes depth. CLI exposes raw-document fetch helpers for auditing. Offline Kit bundles raw collections alongside guard configs to keep air-gapped installs verifiable.
  5. Observability: All services emit ingestion_write_total, aoc_violation_total{code}, ingestion_latency_seconds, and trace spans ingest.fetch, ingest.transform, ingest.write, aoc.guard. Logs correlate via traceId, tenant, source.vendor, and content_hash.
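
The verification step (item 2) and the per-code violation counter (item 5) can be sketched together. The exit-code convention here (0 for clean, 1 for any violation) is an assumption, as are the function names; the document does not list the actual mapping used by `stella aoc verify`.

```python
from collections import Counter
from typing import Callable, Iterable, Optional

# A check returns a violation code, or None when the document passes.
Check = Callable[[dict], Optional[str]]


def require_tenant(doc: dict) -> Optional[str]:
    # "ERR_AOC_00x" is the document's own placeholder; no specific code is assumed.
    return None if doc.get("tenant") else "ERR_AOC_00x"


def verify(docs: Iterable[dict], checks: Iterable[Check]) -> tuple[int, Counter]:
    """Replay checks over stored documents and tally violations by code."""
    checks = list(checks)  # materialise once so every document sees every check
    violations: Counter = Counter()  # mirrors aoc_violation_total{code}
    for doc in docs:
        for check in checks:
            code = check(doc)
            if code is not None:
                violations[code] += 1
    exit_code = 1 if violations else 0  # assumed convention, see lead-in
    return exit_code, violations


stored = [{"_id": "a", "tenant": "tenant-a"}, {"_id": "b"}]
exit_code, violations = verify(stored, [require_tenant])
print(exit_code, dict(violations))  # 1 {'ERR_AOC_00x': 1}
```

Returning the tally alongside the exit code lets a CI job fail fast while still reporting which codes fired.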

4 · Offline & disaster readiness

  • Offline Kit: Packages raw Mongo snapshots (advisory_raw, vex_raw) plus guard configuration and CLI verifier binaries so air-gapped sites can re-run AOC checks before promotion.
  • Recovery: Supersedes chains allow rollback to prior revisions without mutating documents. Disaster exercises must rehearse restoring from snapshot, replaying change streams into Policy Engine, and re-validating guard compliance.
  • Migration: Legacy normalised fields are moved to temporary views during cutover; the ingestion runtime stops writing them once the guard-enforced path is live (see Migration playbook).
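
To illustrate how a supersedes chain supports rollback without mutation, the sketch below walks the chain from the newest revision to the oldest. It assumes `supersedes` holds the `_id` of the previous revision (or is absent or `None` for the first one); the document does not spell out the pointer's exact shape, and the function name is hypothetical.

```python
from typing import Any, Optional


def supersedes_chain(docs_by_id: dict[str, dict[str, Any]], head_id: str) -> list[str]:
    """Return document ids from the newest revision back to the oldest."""
    chain: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = head_id
    while current is not None:
        if current in seen:
            raise ValueError(f"supersedes cycle at {current}")
        seen.add(current)
        chain.append(current)
        current = docs_by_id[current].get("supersedes")
    return chain


docs = {
    "rev-1": {"_id": "rev-1", "supersedes": None},
    "rev-2": {"_id": "rev-2", "supersedes": "rev-1"},
    "rev-3": {"_id": "rev-3", "supersedes": "rev-2"},
}
chain = supersedes_chain(docs, "rev-3")
rollback_target = chain[1] if len(chain) > 1 else None  # prior revision, read-only
depth = len(chain) - 1  # one possible reading of "supersedes depth"
print(chain, rollback_target, depth)  # ['rev-3', 'rev-2', 'rev-1'] rev-2 2
```

Rollback here means selecting an earlier id to serve; no stored document is edited or deleted.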

5 · References


6 · Compliance checklist

  • AOC guard enabled for all Concelier and Excititor write paths in production.
  • Mongo schema validators deployed for advisory_raw and vex_raw; change streams scoped per tenant.
  • Authority scopes (advisory:*, vex:*, effective:*) configured in Gateway and validated via integration tests.
  • stella aoc verify wired into CI/CD pipelines with seeded violation fixtures.
  • Console AOC dashboard and CLI documentation reference the new ingestion contract.
  • Offline Kit bundles include guard configs, verifier tooling, and documentation updates.
  • Observability dashboards include violation, latency, and supersedes depth metrics with alert thresholds.

Last updated: 2025-10-26 (Sprint 19).