feat: Add new projects to solution and implement contract testing documentation

- Added "StellaOps.Policy.Engine", "StellaOps.Cartographer", and "StellaOps.SbomService" projects to the StellaOps solution.
- Created AGENTS.md to outline the Contract Testing Guild Charter, detailing mission, scope, and definition of done.
- Established TASKS.md for the Contract Testing Task Board, outlining tasks for Sprint 62 and Sprint 63 related to mock servers and replay testing.
2025-10-27 07:57:55 +02:00
parent 1e41ba7ffa
commit 651b8e0fa3
355 changed files with 17276 additions and 1160 deletions

@@ -0,0 +1,429 @@
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
---
# Epic 16: Air-Gapped Mode
**Short name:** Air-Gapped Mode
**Primary components:** Web Services API, Console, CLI, Orchestrator, Task Runner, Conseiller (Feedser), Excitator (VEXer), Policy Engine, Findings Ledger, Export Center, Authority & Tenancy, Notifications, Observability & Forensics
**Surfaces:** offline bootstrap, update ingestion via mirror bundles, sealed egress, deterministic jobs, offline advisories/VEX, offline policy packs, offline notifications, evidence exports
**Dependencies:** Export Center, Containerized Distribution, Authority-Backed Scopes & Tenancy, Observability & Forensics, Policy Studio
**AOC ground rule reminder:** Conseiller and Excitator aggregate and link advisories/VEX. They never merge or mutate source records. Air-Gapped Mode must preserve this invariant even when mirroring and importing updates.
---
## 1) What it is
A fully supported operating profile where StellaOps runs in a disconnected environment with:
* **Zero external egress** from platform services and jobs.
* **Deterministic inputs** provided via signed, offline **Mirror Bundles** (advisories, VEX, policy packs, vendor feeds, Stella metadata, container images, dashboards).
* **Offline bootstrap** for images and charts, plus reproducible configuration and cryptographically verifiable updates.
* **Graceful feature degradation** with explicit UX: features that require external connectivity are either backed by local artifacts or clearly disabled with an explanation.
* **Auditable import/export** including provenance attestations, evidence bundles, and chain-of-custody for all offline exchanges.
Air-Gapped Mode is selectable at install time and enforceable at runtime. When enabled, all components operate under an “egress sealed” policy and only consume data from local stores.
---
## 2) Why
Many users operate in classified, regulated, or high-sensitivity networks where egress is prohibited. They still need SBOM analysis, policy evaluation, advisory/VEX mapping, and reporting. Air-Gapped Mode provides the same core outcomes with verifiable offline inputs and explicit operational guardrails.
---
## 3) How it should work
### 3.1 Modes and lifecycle
* **Connected Mode:** normal operation; can create Mirror Bundles on a staging host.
* **Sealed Air-Gapped Mode:** platform enforces no egress. Only local resources are allowed.
* **Transition flow:**
1. Prepare an offline **Bootstrap Pack** with all container images, Helm/compose charts, seed database, and initial Mirror Bundle.
2. Install in the air-gapped enclave and **seal** egress.
3. Periodically import new **Mirror Bundles** via removable media.
4. Export evidence/reports as needed.
### 3.2 Egress sealing
* **Static guardrails:**
* Platform flag `STELLA_AIRGAP=sealed` and database feature flag `env.mode='sealed'`.
* NetworkPolicy/iptables/eBPF deny-all egress for namespaces/pods except loopback and the internal object store.
* Outbound DNS blocked.
* HTTP clients in code use a single `EgressPolicy` facade. When sealed, it refuses direct network calls and returns a typed error with remediation (“import a Mirror Bundle”).
* **Verification:** `GET /system/airgap/status` returns `sealed: true|false`, current policy hash, and last import timestamp. CLI prints a warning if the environment is not sealed in a declared air-gapped install.
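A minimal sketch of the facade behavior described above, assuming a host allowlist limited to loopback and the internal object store. The class names, the `objectstore.internal` hostname, and the `check` method are hypothetical illustrations, not the platform's actual API; only the error code and remediation text come from this document.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


class EgressBlockedError(Exception):
    """Typed error carrying the standardized sealed-mode fields."""

    def __init__(self, destination: str):
        self.code = "AIRGAP_EGRESS_BLOCKED"
        self.remediation = "Run: stella airgap import bundle.tgz"
        super().__init__(f"Egress is sealed; refused call to {destination}")


@dataclass(frozen=True)
class EgressPolicy:
    sealed: bool
    # Hypothetical allowlist: loopback and the internal object store only.
    allowed_hosts: frozenset = frozenset(
        {"localhost", "127.0.0.1", "objectstore.internal"}
    )

    def check(self, url: str) -> None:
        """Raise EgressBlockedError if the URL may not be reached while sealed."""
        host = urlparse(url).hostname or ""
        if self.sealed and host not in self.allowed_hosts:
            raise EgressBlockedError(url)


policy = EgressPolicy(sealed=True)
policy.check("http://objectstore.internal/bucket/key")  # allowed destination
try:
    policy.check("https://example.com/feed.json")
except EgressBlockedError as err:
    print(err.code, "-", err.remediation)
```

Routing every outbound call through one `check` gives the CI lint and the runtime guardrail a single place to enforce the sealed state.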
### 3.3 Trusted time
* Air-gapped systems cannot reach NTP servers. Each Mirror Bundle includes a **signed time token** (Roughtime-style or RFC 3161) from a trusted authority. On import, the platform stores `time_anchor` for drift calculations and staleness checks.
* If time drift exceeds the policy threshold, the UI shows “stale view” badges and some jobs are blocked until a new bundle provides a fresh anchor.
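The staleness check can be sketched as follows. This assumes that "drift" is measured as the time elapsed between the signed anchor and the local clock, and that policy supplies two thresholds. The function name, the state labels, and the threshold values are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def anchor_state(anchor: datetime, now: datetime,
                 stale_after: timedelta, block_after: timedelta) -> str:
    """Classify the view as 'fresh', 'stale', or 'blocked' from the anchor age."""
    age = now - anchor
    if age < timedelta(0):
        # The local clock is behind a signed timestamp, so it cannot be trusted.
        return "blocked"
    if age > block_after:
        return "blocked"
    if age > stale_after:
        return "stale"
    return "fresh"


anchor = datetime(2025, 10, 1, tzinfo=timezone.utc)
now = datetime(2025, 10, 20, tzinfo=timezone.utc)
# 19 days since the anchor: past the 14-day badge threshold, under the 30-day block.
print(anchor_state(anchor, now, timedelta(days=14), timedelta(days=30)))  # stale
```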
### 3.4 Mirror Bundles (offline updates)
* **Content types:**
* Public advisories (OSV, GHSA, vendor advisories), NVD mappings, CPE/Package metadata.
* VEX statements from vendors/communities.
* Policy packs (templates, baselines, versioned rule sets).
* StellaOps engine metadata and schema migrations.
* Optional: **OCI image set** for platform and recommended runners.
* Optional: dashboards and alert rule packs.
* **Format:** a TUF-like layout:
```
root.json, snapshot.json, timestamp.json, targets/
advisories/*.jsonl.zst
vex/*.jsonl.zst
policy/*.tar.zst
images/* (OCI layout or oci-archive)
meta/engine/*.tgz
meta/time-anchor.json (signed)
```
* **Integrity & trust:**
* DSSE-signed target manifests.
* Root of trust rotated via `root.json` within strict policy; rotation requires manual dual approval in sealed mode.
* Each content artifact has a content digest and a **Merkle root** for the overall bundle.
* **Creation:** in connected networks, `stella mirror create --content advisories,vex,policy,images --since 2025-01-01 --out bundle.tgz`.
* **Import:** in the air-gapped environment, `stella airgap import bundle.tgz`. The importer verifies DSSE, TUF metadata, Merkle root, then writes to local object store and updates catalog tables.
* **Idempotence:** imports are content-addressed; re-imports deduplicate.
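The Merkle check and content-addressed deduplication can be sketched as below. The tree construction (sorted leaf digests, pairwise SHA-256, odd node carried up) is an assumption, since this document does not fix one. DSSE and TUF verification are deliberately omitted, and the function names and in-memory `store` are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def merkle_root(leaf_digests: list[str]) -> str:
    """Fold sorted hex digests pairwise into one root; an odd node is carried up."""
    if not leaf_digests:
        return sha256_hex(b"")
    level = sorted(leaf_digests)
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            if len(pair) == 1:
                nxt.append(pair[0])
            else:
                nxt.append(sha256_hex(bytes.fromhex(pair[0] + pair[1])))
        level = nxt
    return level[0]


def import_artifacts(store: dict[str, bytes], artifacts: dict[str, bytes],
                     expected_root: str) -> int:
    """Verify the bundle root, then store blobs by digest; return how many were new."""
    digests = {sha256_hex(blob): blob for blob in artifacts.values()}
    if merkle_root(list(digests)) != expected_root:
        raise ValueError("Merkle root mismatch; bundle rejected")
    new = 0
    for digest, blob in digests.items():
        if digest not in store:  # content-addressed: known digests are skipped
            store[digest] = blob
            new += 1
    return new


bundle = {"advisories/osv.jsonl.zst": b"a", "vex/vendor.jsonl.zst": b"b"}
root = merkle_root([sha256_hex(b) for b in bundle.values()])
store: dict[str, bytes] = {}
print(import_artifacts(store, bundle, root))  # 2 (first import)
print(import_artifacts(store, bundle, root))  # 0 (re-import deduplicates)
```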
### 3.5 Deterministic jobs and sources
* **Allowed sources:** filesystem, internal object store, tenant private registry, and pre-approved connectors that don't require external egress.
* **Disallowed in sealed mode:** remote package registries, web scrapers, outbound webhooks, cloud KMS unless on the enclave network.
* **Runner policy:** the Task Runner verifies job descriptors contain no network calls unless marked `internal:` with allowlisted destinations. Violations fail at plan time with an explainable error.
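A sketch of the plan-time check, assuming a job descriptor is a list of steps that each declare their network destinations. The descriptor shape, the `network` key, and the allowlisted destination names are hypothetical; only the `internal:` marker comes from this document.

```python
# Hypothetical allowlisted destinations for sealed mode.
ALLOWLIST = {"internal:objectstore", "internal:registry"}


def validate_plan(steps: list[dict]) -> list[str]:
    """Return one explainable violation per disallowed network destination."""
    violations = []
    for step in steps:
        for dest in step.get("network", []):
            if not dest.startswith("internal:"):
                violations.append(
                    f"step '{step['name']}': '{dest}' is not marked internal:"
                )
            elif dest not in ALLOWLIST:
                violations.append(
                    f"step '{step['name']}': '{dest}' is not allowlisted"
                )
    return violations


plan = [
    {"name": "fetch-sbom", "network": ["internal:objectstore"]},
    {"name": "resolve-deps", "network": ["https://registry.example.com"]},
]
for problem in validate_plan(plan):
    print(problem)
```

Collecting every violation before failing lets the runner report the whole plan's problems in one pass instead of one per retry.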
### 3.6 Conseiller and Excitator in air-gap
* **Conseiller (Feedser):** ingests advisories only from imported bundles or tenant local feeds. It preserves source identities and never merges. Linkage uses bundle-provided cross-refs and local heuristics.
* **Excitator (VEXer):** imports VEX records as-is, links them to components and advisories, and records the origin bundle and statement digests. Consensus Lens (Epic 7) operates offline across the imported sources.
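The link-without-merge invariant can be illustrated with a small sketch. The record shape, the `aliases` field used for cross-references, and the sample identifiers are made up for illustration. Source records are immutable, and the link index only points at them by source, bundle, and digest.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRecord:
    source: str      # e.g. "osv" or "ghsa"
    bundle_id: str   # origin bundle
    payload: str     # raw statement exactly as imported, never rewritten

    @property
    def digest(self) -> str:
        return hashlib.sha256(self.payload.encode("utf-8")).hexdigest()


def link_by_alias(records):
    """Map each alias to (source, bundle_id, digest) of every record mentioning it."""
    links = {}
    for rec in records:
        for alias in json.loads(rec.payload).get("aliases", []):
            links.setdefault(alias, []).append(
                (rec.source, rec.bundle_id, rec.digest)
            )
    return links


records = [
    SourceRecord("osv", "bundle-01", '{"id": "OSV-1", "aliases": ["CVE-2025-0001"]}'),
    SourceRecord("ghsa", "bundle-01", '{"id": "GHSA-x", "aliases": ["CVE-2025-0001"]}'),
]
links = link_by_alias(records)
print(len(links["CVE-2025-0001"]))  # 2: both sources stay distinct
```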
### 3.7 Policy Engine and Studio
* Policy packs are versioned and imported via bundles.
* Simulation and authoring work locally. Exports of new or updated policies can be packaged as **Policy Sub-Bundles** for transfer back to connected environments if needed.
* Engine shows which rules depend on external evidence and how they degrade in sealed mode (e.g., “No external EPSS; using cached percentile from last bundle.”).
### 3.8 Notifications in sealed mode
* Default to **local delivery** only: SMTP relay inside enclave, syslog, file sink.
* External webhooks are disabled.
* Notification templates show “air-gap compliant channel” tags to avoid misconfiguration.
### 3.9 Observability & Forensics
* Traces, logs, metrics remain local.
* Evidence Locker supports **portable evidence packages** for cross-domain transfer: `stella forensic snapshot create --portable`.
* Importing an evidence bundle in another enclave verifies signatures and maintains chain-of-custody.
### 3.10 Console and CLI behavior
* Console shows a prominent **Air-Gapped: Sealed** badge with last import time and staleness indicators for advisories, VEX, and policy packs.
* CLI commands gain `--sealed` awareness: any operation that would egress prints a refusal with remediation suggesting the appropriate import.
### 3.11 Multi-tenant and scope
* Tenancy works unchanged. Bundle imports can target:
* `--tenant-global`: shared catalogs (advisories, VEX, policy baselines).
* `--tenant=<id>`: tenant-specific content (e.g., private advisories).
* Authority scopes gain `airgap:import`, `airgap:status:read`, `airgap:seal` (admin-only).
### 3.12 Feature degradation matrix
* **AI Assistant:** offline variants use local models if installed; otherwise the feature is disabled with a message.
* **External reputation feeds (e.g., EPSS-like):** replaced by cached values from the bundle.
* **Container base image lookups:** rely on imported metadata or tenant private registry.
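The matrix lends itself to a small lookup that the Console and APIs could share. The sketch below covers two of the rows above. The enum, function name, and input flags are hypothetical, and disabling the reputation feed when no cached values exist is an assumption the document does not state.

```python
from enum import Enum


class State(Enum):
    AVAILABLE = "available"
    DEGRADED = "degraded"
    DISABLED = "disabled"


def degradation_matrix(sealed: bool, local_model_installed: bool,
                       cached_scores: bool) -> dict:
    """Return the state of each feature for the current mode and local artifacts."""
    if not sealed:
        return {"ai_assistant": State.AVAILABLE, "reputation_feeds": State.AVAILABLE}
    return {
        # Offline variant runs only if a local model was installed.
        "ai_assistant": State.DEGRADED if local_model_installed else State.DISABLED,
        # Cached values from the last bundle stand in for the live feed.
        "reputation_feeds": State.DEGRADED if cached_scores else State.DISABLED,
    }


matrix = degradation_matrix(sealed=True, local_model_installed=False, cached_scores=True)
for feature, state in matrix.items():
    print(feature, state.value)
```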
---
## 4) Architecture
### 4.1 New modules
* `airgap/controller`
* Sealing state machine; status API; guardrails wiring into HTTP clients and runner.
* `airgap/importer`
* TUF/DSSE verification, Merkle validation, object store loader, catalog updater.
* `mirror/creator`
* Connected-side builder for bundles; content plugins for advisories/VEX/policy/images.
* `airgap/policy`
* Enforcement library exposing `EgressPolicy` facade and job plan validators.
* `airgap/time`
* Time anchor parser, drift checks, staleness annotations.
* `console/airgap`
* Sealed badge, import UI, staleness dashboards, degradation notices.
* `cli/airgap`
* `stella airgap seal|status|import|verify` commands; `stella mirror create|verify`.
### 4.2 Data model additions
* `airgap_state(id, sealed BOOLEAN, policy_hash TEXT, last_import_at TIMESTAMP, time_anchor JSONB)`
* `bundle_catalog(id, kind ENUM, merkle_root TEXT, dsse_signer TEXT, created_at TIMESTAMP, imported_at TIMESTAMP, scope ENUM('global','tenant'), tenant_id NULLABLE, labels JSONB)`
* `bundle_items(bundle_id, path TEXT, sha256 TEXT, size BIGINT, type TEXT, meta JSONB)`
* `import_audit(id, bundle_id, actor, tenant_scope, verify_result, trace_id, created_at)`
RLS: tenant-scoped rows when `scope='tenant'`; global rows readable only with `stella:airgap:status:read`.
### 4.3 Storage layout
Object store paths:
```
tenants/_global/mirror/<bundle_id>/targets/...
tenants/<tenant>/mirror/<bundle_id>/targets/...
tenants/_global/images/<digest>/...
```
Evidence locker remains separate. Imported images use **OCI layout** for local registry sync.
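As an illustration of the layout above, the path construction can be captured in two helpers. The function names and the tenant-id validation are assumptions; the path templates are taken from the listing.

```python
from typing import Optional


def mirror_prefix(bundle_id: str, tenant: Optional[str] = None) -> str:
    """Return the object store prefix for a bundle's targets (global if tenant is None)."""
    if tenant is not None and (not tenant or tenant == "_global" or "/" in tenant):
        # Reject ids that would collide with the global namespace or escape the prefix.
        raise ValueError(f"invalid tenant id: {tenant!r}")
    owner = "_global" if tenant is None else tenant
    return f"tenants/{owner}/mirror/{bundle_id}/targets/"


def image_prefix(digest: str) -> str:
    """Imported images are always stored under the global namespace."""
    return f"tenants/_global/images/{digest}/"


print(mirror_prefix("bundle-01"))          # tenants/_global/mirror/bundle-01/targets/
print(mirror_prefix("bundle-01", "acme"))  # tenants/acme/mirror/bundle-01/targets/
```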
### 4.4 Message topics
* `stella.<tenant>.airgap.imported` with bundle metadata.
* `stella.<tenant>.airgap.staleness` periodic events emitted for UX.
* `stella.<tenant>.policy.degraded` when rules fall back due to sealed mode.
---
## 5) APIs and contracts
### 5.1 Status and control
* `GET /system/airgap/status` → `{ sealed, policy_hash, last_import_at, time_anchor, drift_seconds, staleness: { advisories_days, vex_days, policy_days } }`
* `POST /system/airgap/seal` → seals environment; requires `stella:airgap:seal#tenant/<id or global>`.
* `POST /system/airgap/unseal` → only allowed if installed mode is not declared “permanently sealed” at bootstrap. Typically disabled.
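A sketch of how the status payload could be assembled from stored state. It assumes `drift_seconds` is the time elapsed since the signed anchor, that staleness is counted in whole days since each content type was last refreshed, and that `time_anchor` is reduced to an ISO timestamp for brevity. The function name and inputs are hypothetical; the field names follow the contract above.

```python
import json
from datetime import datetime, timezone


def build_status(sealed, policy_hash, last_import_at, anchor_time, last_updated, now):
    """Assemble the status payload from stored airgap state."""
    return {
        "sealed": sealed,
        "policy_hash": policy_hash,
        "last_import_at": last_import_at.isoformat(),
        "time_anchor": anchor_time.isoformat(),
        # Assumption: drift is reported as seconds elapsed since the signed anchor.
        "drift_seconds": int((now - anchor_time).total_seconds()),
        "staleness": {
            f"{kind}_days": (now - ts).days for kind, ts in last_updated.items()
        },
    }


now = datetime(2025, 10, 27, tzinfo=timezone.utc)
imported = datetime(2025, 10, 20, tzinfo=timezone.utc)
status = build_status(
    sealed=True,
    policy_hash="sha256:placeholder",  # illustrative value only
    last_import_at=imported,
    anchor_time=imported,
    last_updated={
        "advisories": imported,
        "vex": imported,
        "policy": datetime(2025, 10, 1, tzinfo=timezone.utc),
    },
    now=now,
)
print(json.dumps(status["staleness"]))
```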
### 5.2 Import & verify
* `POST /airgap/import` multipart or file reference → runs verify, writes catalog, returns bundle summary and warnings.
* `POST /airgap/verify` dry-run verification returning DSSE/TUF and Merkle results.
* `GET /airgap/bundles` list imported bundles with filters.
### 5.3 Conseiller/Excitator sources
* `POST /feeds/register` supports `kind=mirror` with `bundle_id` and paths; disallowed to point to external URLs in sealed mode.
* `GET /feeds/status` shows per-source staleness and last artifact version.
### 5.4 Errors
Standardized sealed-mode error:
```
{
  "code": "AIRGAP_EGRESS_BLOCKED",
  "message": "Egress is sealed. Import a Mirror Bundle with advisories.",
  "remediation": "Run: stella airgap import bundle.tgz",
  "trace_id": "..."
}
```
---
## 6) Documentation changes
Create or update:
1. `/docs/airgap/overview.md`
* Modes, lifecycle, responsibilities, threat model, what degrades.
2. `/docs/airgap/bootstrap.md`
* Offline Bootstrap Pack creation, validation, install steps for Helm/compose, local registry seeding.
3. `/docs/airgap/mirror-bundles.md`
* Bundle format, DSSE/TUF/Merkle, signed time, creation on connected host, import in sealed environment, rotation of roots.
4. `/docs/airgap/sealing-and-egress.md`
* Network policies, EgressPolicy facade, runner validation, verifying sealed status.
5. `/docs/airgap/staleness-and-time.md`
* Time anchor, drift, staleness budgets and UI behavior.
6. `/docs/airgap/operations.md`
* Periodic update cadence, runbooks, failure scenarios, disaster recovery.
7. `/docs/airgap/degradation-matrix.md`
* Feature map: available, degraded, disabled; with remediation.
8. `/docs/console/airgap.md`
* Status badges, import wizard, staleness indicators.
9. `/docs/cli/airgap.md`
* Commands, examples, exit codes.
10. `/docs/security/trust-and-signing.md`
* Roots of trust, key rotation, DSSE, TUF model.
11. `/docs/dev/airgap-contracts.md`
* EgressPolicy usage, testing patterns, sealed-mode CI gates.
Add the banner at the top of each page:
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
---
## 7) Implementation plan
### Phase 1 — Foundations
* Add `airgap/controller` with sealed state and status API.
* Integrate `EgressPolicy` facade in all outbound network call sites.
* Provide default NetworkPolicy/iptables templates and Helm values to block egress.
* Console shows sealed badge and status.
### Phase 2 — Mirror Bundles
* Implement `mirror/creator` in connected mode with content plugins.
* Implement `airgap/importer` with DSSE/TUF/Merkle verification and catalog updates.
* Export Center gains **Mirror bundle** build and verify commands (connected side).
### Phase 3 — Deterministic jobs
* Add job plan validation in the Task Runner.
* Restrict sources in sealed mode.
* Conseiller/Excitator add “mirror source” adapters.
### Phase 4 — Staleness and time
* Parse time anchors; enforce staleness budgets; add UI indicators and task refusal when budgets exceeded.
* Notifications for expiring anchors.
### Phase 5 — Degradation matrix and UX
* Wire feature flags and fallbacks in Console and APIs.
* Improve error messages with remediation guidance.
### Phase 6 — Evidence portability
* Portable evidence packages: export/import with full verification.
* Document crossdomain workflows.
---
## 8) Engineering tasks
**Airgap controller and sealing**
* [ ] Implement `airgap/controller` with persistent state and RBAC.
* [ ] Add `GET /system/airgap/status`, `POST /system/airgap/seal`.
* [ ] Provide cluster egress templates for Kubernetes and for docker-compose.
* [ ] Instrument startup checks to refuse running in sealed mode if egress rules aren't applied.
**EgressPolicy integration**
* [ ] Create `pkg/egress` facade and replace all direct HTTP client constructions in services.
* [ ] Add linter rule and CI check forbidding raw `http.NewClient` in server code.
* [ ] Add unit tests for sealed and unsealed behavior.
**Mirror bundles**
* [ ] Implement TUF/DSSE verifiers and Merkle root builder.
* [ ] Build content plugins: advisories, VEX, policy packs, images.
* [ ] Write `bundle_catalog` and `bundle_items` tables with RLS.
* [ ] CLI: `stella mirror create|verify`, `stella airgap import|verify`.
**Conseiller/Excitator**
* [ ] Add mirror adapters for read-only ingestion from bundle paths.
* [ ] Persist source digests and bundle IDs on each linked record.
* [ ] Unit tests to ensure no merge behavior is introduced by bundle ingestion.
**Policy Engine & Studio**
* [ ] Accept policy packs from bundles; track `policy_version` and `bundle_id`.
* [ ] Add degradation notices for rules requiring external reputation; provide cached fallbacks.
**Task Runner & Orchestrator**
* [ ] Plan-time validation against network calls; add `internal:` allowlist mapping.
* [ ] Emit sealed-mode violations to Timeline with remediation text.
**Console**
* [ ] Status panel: sealed badge, last import, staleness meters.
* [ ] Import wizard with verify results and catalog diff preview.
* [ ] Degradation matrix UI and contextual tooltips.
**Observability & Forensics**
* [ ] Mark sealed mode in telemetry attributes.
* [ ] Add portable evidence package export/import; verify on read.
**Authority & Tenancy**
* [ ] New scopes: `airgap:seal`, `airgap:import`, `airgap:status:read`.
* [ ] Audit import actions with actor and trace ID.
**Docs**
* [ ] Author all pages listed in section 6, including signed-time workflow diagrams.
* [ ] Insert banner statement in each page.
**Testing**
* [ ] Sealed-mode e2e: attempt egress; ensure refusal and remediation.
* [ ] Bundle import e2e: corrupt DSSE, wrong root, tampered artifact → rejected.
* [ ] Performance: large advisory bundle import within target time (see Acceptance).
* [ ] Time drift scenarios and staleness budget enforcement.
* [ ] Regression: ensure AOC rules unchanged in sealed mode.
---
## 9) Feature changes required in other components
* **Export Center:** add mirror bundle export profile, signed-time token inclusion, and portable evidence packages.
* **Notifications:** remove external webhooks by default in sealed mode; add local SMTP/syslog sinks.
* **CLI Parity:** ensure all admin and import operations are exposed; add sealed-mode safety prompts.
* **Containerized Distribution:** ship **Bootstrap Pack** that includes all images and charts in a single oci-archive set with index manifest.
* **Observability:** disable remote exporters; include local dashboards; mark sealed mode in UI.
* **Policy Studio:** enable offline authoring and export of policy sub-bundles.
* **VEX Consensus Lens:** ensure it operates solely on imported VEX statements; highlight coverage vs. stale.
> **Imposed rule reminder:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
---
## 10) Acceptance criteria
* Environment can be **sealed** and verified via API, CLI, and network policies.
* Import of a valid Mirror Bundle succeeds; DSSE, TUF, and Merkle validations recorded in `import_audit`.
* Conseiller and Excitator operate only on imported sources; linkage reflects original source identities.
* Policy packs are importable and versioned; rules that depend on external evidence show clear degradation.
* Large bundle (e.g., 8-12 GB with images) imports in under 20 minutes on SSD storage and indexes advisories in under 5 minutes on a 4-core node.
* Console displays sealed badge, last import, staleness, and degradation matrix.
* Attempted egress in sealed mode fails with `AIRGAP_EGRESS_BLOCKED` and remediation.
* Portable evidence packages export and verify across separate enclaves.
* All changes documented with the banner statement.
---
## 11) Risks and mitigations
* **Key management complexity:** rotate TUF roots with a dual-control workflow and explicit docs; fail-safe to the previous root if the rotation bundle is absent.
* **Staleness risk:** enforce budgets and block risk-critical jobs when expired; provide monitoring and notifications for impending staleness.
* **Operator error during import:** dry-run verification, diff preview of catalog changes, and ability to roll back via content address.
* **Hidden egress paths:** CI lints and runtime guardrails; network policies enforced at cluster layer.
* **Bundle size bloat:** Zstandard compression, delta bundles, and selective content flags for creation.
---
## 12) Philosophy
* **Predictable over perfect:** deterministic, explainable results beat unknown “live” results in sensitive networks.
* **Trust is earned:** every offline exchange is signed, verifiable, and auditable.
* **Degrade transparently:** when features reduce capability, explain it and guide remediation.
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.