Add integration tests for migration categories and execution

- Implemented MigrationCategoryTests to validate migration categorization for startup, release, seed, and data migrations.
- Added tests for edge cases, including null, empty, and whitespace migration names.
- Created StartupMigrationHostTests to verify the behavior of the migration host with real PostgreSQL instances using Testcontainers.
- Included tests for migration execution, schema creation, and handling of pending release migrations.
- Added SQL migration files for testing: creating a test table, adding a column, a release migration, and seeding data.
This commit is contained in:
master
2025-12-04 19:10:54 +02:00
parent 600f3a7a3c
commit 75f6942769
301 changed files with 32810 additions and 1128 deletions


@@ -1,6 +1,6 @@
# High-Level Architecture — **StellaOps** (Consolidated • 2025-Q4)
> **Want the 10-minute tour?** See [`high-level-architecture.md`](high-level-architecture.md); this file retains the exhaustive reference.
# High-Level Architecture — **StellaOps** (Consolidated • 2025-Q4)
> **Want the 10-minute tour?** See [`high-level-architecture.md`](high-level-architecture.md); this file retains the exhaustive reference.
> **Purpose.** A complete, implementation-ready map of StellaOps: product vision, all runtime components, trust boundaries, tokens/licensing, control/data flows, storage, APIs, security, scale, DevOps, and verification logic.
> **Scope.** This file **replaces** the separate `components.md`; all component details now live here.
@@ -14,14 +14,14 @@
**Operating principles.**
* **Scanner-owned SBOMs.** We generate our own BOMs; we do not warehouse third-party SBOM content (we can **link** to attested SBOMs).
* **Deterministic evidence.** Facts come from package DBs, installed metadata, linkers, and verified attestations; no fuzzy guessing in the core.
* **Per-layer caching.** Cache fragments by **layer digest** and compose image SBOMs via **CycloneDX BOM-Link** / **SPDX ExternalRef**.
* **Inventory vs Usage.** Always record the full **inventory** of what exists; separately present **usage** (entrypoint closure + loaded libs).
* **Backend decides.** PASS/FAIL is produced by **Policy** + **VEX** + **Advisories**. The scanner reports facts.
* **Attest or it didn't happen.** Every export is signed as **in-toto/DSSE** and logged in **Rekor v2**.
* **Hybrid reachability attestations.** Every reachability graph ships with a graph-level DSSE (mandatory) plus optional edge-bundle DSSEs for runtime/init/contested edges; Policy/Signals consume graph DSSE as baseline and edge bundles for quarantine/disputes.
* **Sovereign-ready.** Cloud is used only for licensing and optional endorsement; everything else is first-party and self-hostable.
* **Competitive clarity.** Moats: deterministic replay, hybrid reachability proofs, lattice VEX, sovereign crypto, proof graph; see `docs/market/competitive-landscape.md`.
* **Deterministic evidence.** Facts come from package DBs, installed metadata, linkers, and verified attestations; no fuzzy guessing in the core.
* **Per-layer caching.** Cache fragments by **layer digest** and compose image SBOMs via **CycloneDX BOM-Link** / **SPDX ExternalRef**.
* **Inventory vs Usage.** Always record the full **inventory** of what exists; separately present **usage** (entrypoint closure + loaded libs).
* **Backend decides.** PASS/FAIL is produced by **Policy** + **VEX** + **Advisories**. The scanner reports facts.
* **Attest or it didn't happen.** Every export is signed as **in-toto/DSSE** and logged in **Rekor v2**.
* **Hybrid reachability attestations.** Every reachability graph ships with a graph-level DSSE (mandatory) plus optional edge-bundle DSSEs for runtime/init/contested edges; Policy/Signals consume graph DSSE as baseline and edge bundles for quarantine/disputes.
* **Sovereign-ready.** Cloud is used only for licensing and optional endorsement; everything else is first-party and self-hostable.
* **Competitive clarity.** Moats: deterministic replay, hybrid reachability proofs, lattice VEX, sovereign crypto, proof graph; see `docs/market/competitive-landscape.md`.
---
@@ -53,8 +53,9 @@
* **Fulcio** (Sigstore CA) — issues short-lived signing certs (keyless).
* **Rekor v2** (tile-backed transparency log).
* **RustFS** — offline-first object store with deterministic REST API (S3/MinIO fallback available for legacy installs).
* **MongoDB** — catalog, advisories, VEX, scheduler, notify.
* **RustFS** — offline-first object store with deterministic REST API (S3/MinIO fallback available for legacy installs).
* **PostgreSQL** (≥15) — control-plane storage with per-module schema isolation (auth, vuln, vex, scheduler, notify, policy). See [Database Architecture](#database-architecture-postgresql).
* **MongoDB** (≥7) — legacy catalog support; being phased out in favor of PostgreSQL for control-plane domains.
* **Queue** — Redis Streams / NATS / RabbitMQ (pluggable).
* **OCI Registry** — must support **Referrers API** (discover SBOMs/signatures).
@@ -85,7 +86,7 @@ flowchart LR
ATT[Attestor\n(Rekor v2 submit/verify)]
UI[Web UI (Angular)]
Z[Zastava\n(Runtime Inspector/Enforcer)]
RFS[(RustFS object store)]
RFS[(RustFS object store)]
MGO[(MongoDB)]
QUE[(Queue/Streams)]
end
@@ -98,7 +99,7 @@ flowchart LR
CLI -->|scan/build| SW
SW -->|jobs| QUE
QUE --> WK
WK --> RFS
WK --> RFS
SW --> MGO
CONC --> MGO
EXC --> MGO
@@ -229,13 +230,13 @@ LS --> IA: PoE (mTLS client cert or JWT with cnf=K_inst), CRL/OCSP/introspect
---
## 6) Storage & catalogs (RustFS/Mongo)
**RustFS layout (default)**
## 6) Storage & catalogs (RustFS/PostgreSQL)
```
rustfs://stellaops/
layers/<sha256>/sbom.cdx.json.zst
**RustFS layout (default)**
```
rustfs://stellaops/
layers/<sha256>/sbom.cdx.json.zst
layers/<sha256>/sbom.spdx.json.zst
images/<imgDigest>/inventory.cdx.pb
images/<imgDigest>/usage.cdx.pb
@@ -243,16 +244,62 @@ rustfs://stellaops/
attest/<artifactSha256>.dsse.json
```
**Catalog (Mongo)**
### Database Architecture (PostgreSQL)
* `artifacts` (type/format/sha/size/rekor/ttl/immutable/refCount/createdAt)
* `images`, `layers`, `links`, `lifecycleRules`
* **Scheduler:** `schedules`, `runs`, `locks`, `impact_cursors`
* **Notify:** `rules`, `deliveries`, `channels`, `templates`
StellaOps uses PostgreSQL for all control-plane data with **per-module schema isolation**. Each module owns and manages only its own schema, ensuring clear ownership and independent migration lifecycles.
**Schema topology:**
```
┌─────────────────────────────────────────────────────────────────┐
│ PostgreSQL Cluster │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ stellaops (database) ││
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ││
│ │ │ auth │ │ vuln │ │ vex │ │scheduler│ ││
│ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ ││
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ││
│ │ │ notify │ │ policy │ │ audit │ ││
│ │ └─────────┘ └─────────┘ └─────────┘ ││
│ └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘
```
**Schema ownership:**
| Schema | Owner Module | Purpose |
|--------|--------------|---------|
| `auth` | Authority | Identity, authentication, authorization, licensing, sessions |
| `vuln` | Concelier | Vulnerability advisories, CVSS, affected packages, sources |
| `vex` | Excititor | VEX statements, graphs, observations, evidence, consensus |
| `scheduler` | Scheduler | Jobs, triggers, workers, locks, execution history |
| `notify` | Notify | Channels, templates, rules, deliveries, escalations |
| `policy` | Policy | Policy packs, rules, risk profiles, evaluations |
| `audit` | Shared | Cross-cutting audit log (optional) |
**Key design principles:**
1. **Module isolation** — Each module controls only its own schema. Cross-schema queries are rare and explicitly documented.
2. **Multi-tenancy** — Single database, single schema set, `tenant_id` column on all tenant-scoped tables with row-level security.
3. **Forward-only migrations** — No down migrations; fixes are applied as new forward migrations.
4. **Advisory lock coordination** — Startup migrations use `pg_try_advisory_lock(hashtext('schema_name'))` to prevent concurrent execution.
5. **Air-gap compatible** — All migrations embedded in assemblies, no external network dependencies.
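
The advisory-lock coordination in principle 4 can be sketched as follows. This is a minimal illustration rather than the StellaOps implementation: the helper names are hypothetical, the `cursor` is assumed to be any DB-API cursor that uses the `%s` parameter style (as psycopg does), and returning `False` when another instance holds the lock is an assumption about the desired behavior.

```python
from typing import Any, Callable


def try_acquire_migration_lock(cursor: Any, schema_name: str) -> bool:
    """Attempt the per-schema advisory lock without blocking."""
    # hashtext() turns the schema name into the integer key expected by
    # pg_try_advisory_lock(), which returns true/false immediately.
    cursor.execute("SELECT pg_try_advisory_lock(hashtext(%s))", (schema_name,))
    row = cursor.fetchone()
    return bool(row and row[0])


def release_migration_lock(cursor: Any, schema_name: str) -> None:
    """Release the session-level advisory lock taken above."""
    cursor.execute("SELECT pg_advisory_unlock(hashtext(%s))", (schema_name,))


def run_startup_migrations(
    cursor: Any,
    schema_name: str,
    apply_pending: Callable[[Any], None],
) -> bool:
    """Run apply_pending under the lock; return False if the lock is busy."""
    if not try_acquire_migration_lock(cursor, schema_name):
        # Assumption: a busy lock means another instance is migrating,
        # so this instance skips instead of waiting.
        return False
    try:
        apply_pending(cursor)
        return True
    finally:
        release_migration_lock(cursor, schema_name)
```

Because the lock key is derived from the schema name, two modules migrating different schemas do not block each other, while two replicas of the same module do.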
**Migration categories:**
| Category | Prefix | Execution | Description |
|----------|--------|-----------|-------------|
| Startup (A) | `001-099` | Automatic at boot | Non-breaking DDL (CREATE IF NOT EXISTS, ADD COLUMN nullable) |
| Release (B) | `100-199` | Manual via CLI | Breaking changes (DROP, ALTER TYPE), require maintenance window |
| Seed | `S001-S999` | After schema | Reference data with ON CONFLICT DO NOTHING |
| Data (C) | `DM001-DM999` | Background job | Batched data transformations, resumable |
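
The prefix rules in the table above can be illustrated with a small classifier. This is a sketch under stated assumptions, not the code exercised by `MigrationCategoryTests`: the function and enum names are hypothetical, the example file names are invented, and raising `ValueError` for null, empty, whitespace, or out-of-range names is an assumed policy.

```python
import re
from enum import Enum
from typing import Optional


class MigrationCategory(Enum):
    STARTUP = "startup"   # 001-099, automatic at boot
    RELEASE = "release"   # 100-199, manual via CLI
    SEED = "seed"         # S001-S999, after schema
    DATA = "data"         # DM001-DM999, background job


# Each pattern captures exactly three digits and rejects a fourth.
_DATA = re.compile(r"^DM(\d{3})(?!\d)")
_SEED = re.compile(r"^S(\d{3})(?!\d)")
_NUMERIC = re.compile(r"^(\d{3})(?!\d)")


def classify_migration(name: Optional[str]) -> MigrationCategory:
    """Map a migration name to its category by prefix."""
    if name is None or not name.strip():
        # Assumed policy: blank names are rejected, not defaulted.
        raise ValueError("migration name must be non-empty")
    stem = name.strip()

    match = _DATA.match(stem)
    if match and 1 <= int(match.group(1)) <= 999:
        return MigrationCategory.DATA

    match = _SEED.match(stem)
    if match and 1 <= int(match.group(1)) <= 999:
        return MigrationCategory.SEED

    match = _NUMERIC.match(stem)
    if match:
        number = int(match.group(1))
        if 1 <= number <= 99:
            return MigrationCategory.STARTUP
        if 100 <= number <= 199:
            return MigrationCategory.RELEASE

    raise ValueError(f"unrecognized migration prefix: {name!r}")
```

For example, `classify_migration("001_create_test_table.sql")` yields `MigrationCategory.STARTUP`, while a `200_...` name falls outside every documented range and is rejected.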
**Detailed documentation:** See [`docs/db/`](db/README.md) for full specification, coding rules, and phase-by-phase conversion tasks.
**Retention**
* RustFS applies retention via `X-RustFS-Retain-Seconds`; Scanner.WebService GC decrements `refCount` and deletes unreferenced metadata; S3/MinIO fallback retains native Object Lock when enabled.
* RustFS applies retention via `X-RustFS-Retain-Seconds`; Scanner.WebService GC decrements `refCount` and deletes unreferenced metadata; S3/MinIO fallback retains native Object Lock when enabled.
* PostgreSQL retention managed via time-based partitioning for high-volume tables (runs, execution_logs) with monthly partition drops.
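
As a rough illustration of the monthly partition drops mentioned above, the sketch below selects which partitions fall outside a retention window. The partition naming scheme (`<table>_yYYYYmMM`), the function names, and the rule "keep the current month plus the previous `retain_months` months" are all assumptions made for the example; the source does not specify them.

```python
import re
from datetime import date
from typing import Iterable, List

# Hypothetical naming scheme, e.g. "runs_y2025m01".
_PARTITION = re.compile(r"(.+)_y(\d{4})m(\d{2})")


def month_index(year: int, month: int) -> int:
    """Number months consecutively so windows are plain integer math."""
    return year * 12 + (month - 1)


def partitions_to_drop(
    names: Iterable[str], today: date, retain_months: int
) -> List[str]:
    """Return partitions older than the retention window, sorted by name."""
    # Oldest month that is still kept.
    cutoff = month_index(today.year, today.month) - retain_months
    expired = []
    for name in names:
        match = _PARTITION.fullmatch(name)
        if match is None:
            continue  # Not a monthly partition under the assumed scheme.
        if month_index(int(match.group(2)), int(match.group(3))) < cutoff:
            expired.append(name)
    return sorted(expired)
```

With `today = date(2025, 12, 4)` and `retain_months = 3`, partitions for September through December 2025 are kept and anything earlier is returned for dropping.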
---
@@ -376,36 +423,36 @@ Binary header + purl table + roaring bitmaps; optional `usedByEntrypoint` flags
* **Community** (public registry): throttled, non-attesting.
* **Authorized** (private registry): full speed, DSSE enabled.
* **Client update flow:** containers self-verify signatures at boot; report version; **Signer** enforces `valid_release_year` / `max_version` from PoE before signing.
* **Compose skeleton:**
* **Client update flow:** containers self-verify signatures at boot; report version; **Signer** enforces `valid_release_year` / `max_version` from PoE before signing.
* **Compose skeleton:**
```yaml
services:
authority: { image: stellaops/authority }
authority: { image: stellaops/authority, depends_on: [postgres] }
fulcio: { image: sigstore/fulcio }
rekor: { image: sigstore/rekor-v2 }
minio: { image: minio/minio, command: server /data --console-address ":9001" }
mongo: { image: mongo:7 }
postgres: { image: postgres:15-alpine, environment: { POSTGRES_DB: stellaops, POSTGRES_USER: stellaops } }
signer: { image: stellaops/signer, depends_on: [authority, fulcio] }
attestor: { image: stellaops/attestor, depends_on: [rekor, signer] }
scanner-web: { image: stellaops/scanner-web, depends_on: [mongo, minio, signer, attestor] }
scanner-web: { image: stellaops/scanner-web, depends_on: [postgres, minio, signer, attestor] }
scanner-worker: { image: stellaops/scanner-worker, deploy: { replicas: 4 }, depends_on: [scanner-web] }
concelier: { image: stellaops/concelier-web, depends_on: [mongo] }
excititor: { image: stellaops/excititor-web, depends_on: [mongo] }
scheduler-web: { image: stellaops/scheduler-web, depends_on: [mongo] }
concelier: { image: stellaops/concelier-web, depends_on: [postgres] }
excititor: { image: stellaops/excititor-web, depends_on: [postgres] }
scheduler-web: { image: stellaops/scheduler-web, depends_on: [postgres] }
scheduler-worker:{ image: stellaops/scheduler-worker, deploy: { replicas: 2 }, depends_on: [scheduler-web] }
notify-web: { image: stellaops/notify-web, depends_on: [mongo] }
notify-web: { image: stellaops/notify-web, depends_on: [postgres] }
notify-worker: { image: stellaops/notify-worker, deploy: { replicas: 2 }, depends_on: [notify-web] }
ui: { image: stellaops/ui, depends_on: [scanner-web, concelier, excititor, scheduler-web, notify-web] }
```
* **Binary prerequisites (offline-first):**
* Single curated NuGet location: `local-nugets/` holds the `.nupkg` feed (hashed in `manifest.json`) and the restore output (`local-nugets/packages`, configured via `NuGet.config`).
* Non-NuGet binaries (plugins/CLIs/tools) are catalogued with SHA-256 in `vendor/manifest.json`; air-gap bundles are registered in `offline/feeds/manifest.json`.
* CI guard: `scripts/verify-binaries.sh` blocks binaries outside approved roots; offline restores use `dotnet restore --source local-nugets` with `OFFLINE=1` (override via `ALLOW_REMOTE=1`).
ui: { image: stellaops/ui, depends_on: [scanner-web, concelier, excititor, scheduler-web, notify-web] }
```
* **Backups:** Mongo dumps; RustFS snapshots (or S3 versioning when fallback driver is used); Rekor v2 DB snapshots; JWKS/Fulcio/KMS key rotation.
* **Binary prerequisites (offline-first):**
* Single curated NuGet location: `local-nugets/` holds the `.nupkg` feed (hashed in `manifest.json`) and the restore output (`local-nugets/packages`, configured via `NuGet.config`).
* Non-NuGet binaries (plugins/CLIs/tools) are catalogued with SHA-256 in `vendor/manifest.json`; air-gap bundles are registered in `offline/feeds/manifest.json`.
* CI guard: `scripts/verify-binaries.sh` blocks binaries outside approved roots; offline restores use `dotnet restore --source local-nugets` with `OFFLINE=1` (override via `ALLOW_REMOTE=1`).
* **Backups:** Mongo dumps; RustFS snapshots (or S3 versioning when fallback driver is used); Rekor v2 DB snapshots; JWKS/Fulcio/KMS key rotation.
* **Ops runbooks:** Scheduler catch-up after Concelier/Excititor recovery; connector key rotation (Slack/Teams/SMTP).
* **SLOs & alerts:** lag between Concelier/Excititor export and first rescan verdict; delivery failure rates by channel.
@@ -418,7 +465,7 @@ services:
* **Notify metrics:** `notify.sent_total{channel}`, `notify.dropped_total{reason}`, `notify.digest_coalesced_total`, `notify.latency_ms`.
* **Tracing:** per-stage spans; correlation IDs across Scanner→Signer→Attestor and Concelier/Excititor→Scheduler→Scanner→Notify.
* **Audit logs:** every signing records `license_id`, `image_digest`, `policy_digest`, and Rekor UUID; Scheduler records who scheduled what; Notify records where, when, and why messages were sent or deduped.
* **Compliance:** RustFS retention headers (or MinIO Object Lock when operating in S3 mode) keep immutable artifacts tamper-resistant; reproducible outputs via policy digest + SBOM digest in predicate.
* **Compliance:** RustFS retention headers (or MinIO Object Lock when operating in S3 mode) keep immutable artifacts tamper-resistant; reproducible outputs via policy digest + SBOM digest in predicate.
---