docs consolidation

This commit is contained in:
master
2026-01-07 10:23:21 +02:00
parent 4789027317
commit 044cf0923c
515 changed files with 5460 additions and 5292 deletions

docs/technical/DATA_SCHEMAS.md Executable file

@@ -0,0 +1,557 @@
# Data Schemas & Persistence Contracts
*Audience:* backend developers, plugin authors, DB admins.
*Scope:* describes the **Valkey**, **PostgreSQL**, and on-disk blob shapes that power Stella Ops.
---
## 0 Document Conventions
* **CamelCase** for JSON.
* All timestamps are **RFC 3339 / ISO 8601** with `Z` (UTC).
* `⭑` = planned but *not* shipped yet (kept on Feature Matrix “To Do”).
---
## 1 SBOM Wrapper Envelope
Every SBOM blob (regardless of format) is stored on disk or in object storage with a *sidecar* JSON file that indexes it for the scanners.
#### 1.1 JSON Shape
```jsonc
{
"id": "sha256:417f…", // digest of the SBOM *file* itself
"imageDigest": "sha256:e2b9…", // digest of the original container image
"created": "2025-07-14T07:02:13Z",
"format": "trivy-json-v2", // NEW enum: trivy-json-v2 | spdx-json | cyclonedx-json
"layers": [
"sha256:d38b…", // layer digests (ordered)
"sha256:af45…"
],
"partial": false, // true => delta SBOM (only some layers)
"provenanceId": "prov_0291" // ⭑ link to SLSA attestation (Q1 2026)
}
```
* `format` (**NEW**): added to support **multiple SBOM formats**.
* `partial` (**NEW**): true when generated via the **delta SBOM** flow (§1.3).
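
A minimal sketch of how a wrapper with the shape above could be assembled. The helper name `build_wrapper` and its parameters are hypothetical and not part of the Stella Ops code base; the planned `provenanceId` field is left out because it has not shipped.

```python
import hashlib
from datetime import datetime, timezone

# Enum values taken from the JSON shape above.
ALLOWED_FORMATS = {"trivy-json-v2", "spdx-json", "cyclonedx-json"}


def build_wrapper(sbom_bytes, image_digest, layers, fmt, partial=False, created=None):
    """Return a sidecar document for one SBOM payload (hypothetical helper)."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported SBOM format: {fmt}")
    created = created or datetime.now(timezone.utc)
    return {
        # digest of the SBOM file itself, not of the image
        "id": "sha256:" + hashlib.sha256(sbom_bytes).hexdigest(),
        "imageDigest": image_digest,
        # RFC 3339 timestamp in UTC with a trailing Z
        "created": created.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "format": fmt,
        "layers": list(layers),  # caller-supplied order is preserved
        "partial": bool(partial),
    }
```

The result can be serialised with `json.dumps` and written next to the payload as `sbom.meta.json`.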
#### 1.2 Filesystem Layout
```
blobs/
├─ 417f… # digest prefix
│   ├─ sbom.json # payload (any format)
│   └─ sbom.meta.json # wrapper (shape above)
```
> **Note:** RustFS is the primary object store. An S3/MinIO compatibility layer is available for legacy deployments, and driver plugins support multiple backends.
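
As a rough illustration of the layout, the two file paths can be derived from the SBOM digest. The function name `blob_paths` and the `prefix_len` parameter are assumptions for this sketch; the document does not state how long the directory prefix is.

```python
from pathlib import PurePosixPath


def blob_paths(root, sbom_id, prefix_len=4):
    """Return (payload_path, wrapper_path) for an SBOM id such as 'sha256:417f...'.

    prefix_len is a hypothetical setting; the real prefix length is not
    specified in this document.
    """
    algo, _, hex_digest = sbom_id.partition(":")
    if algo != "sha256" or not hex_digest:
        raise ValueError(f"expected a sha256 digest, got: {sbom_id!r}")
    directory = PurePosixPath(root) / hex_digest[:prefix_len]
    return directory / "sbom.json", directory / "sbom.meta.json"
```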
#### 1.3 Delta SBOM Extension
When `partial: true`, *only* the missing layers have been scanned.
Merging logic inside the `scanning` module stitches the new data onto the cached full SBOM in Valkey.
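
The layer selection behind a delta scan can be sketched as a set difference against the cached layer set. `plan_delta_scan` is a hypothetical name, and the rule used here for `partial` (true only when some, but not all, layers need scanning) is an assumption.

```python
def plan_delta_scan(image_layers, cached_layers):
    """Decide which layers still need an SBOM (hypothetical helper).

    image_layers: ordered layer digests of the image.
    cached_layers: digests that already have SBOMs (the delta cache).
    """
    cached = set(cached_layers)
    # Keep the image's layer order for the layers that are still missing.
    missing = [layer for layer in image_layers if layer not in cached]
    # Assumption: a scan counts as a delta only when part of the image is cached.
    partial = 0 < len(missing) < len(image_layers)
    return {"missing": missing, "partial": partial}
```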
---
## 2 Valkey Keyspace
Valkey (Redis-compatible) provides cache, DPoP nonces, event streams, and queues for real-time messaging and rate limiting.
| Key pattern | Type | TTL | Purpose |
|-------------------------------------|---------|------|--------------------------------------------------|
| `scan:<digest>` | string | ∞ | Last scan JSON result (as returned by `/scan`) |
| `layers:<digest>` | set | 90d | Layers already possessing SBOMs (delta cache) |
| `policy:active` | string | ∞ | YAML **or** Rego ruleset |
| `quota:<token>` | string | *until next UTC midnight* | Per-token scan counter for Free tier ({{ quota_token }} scans). |
| `policy:history` | list | ∞ | Change audit IDs (see PostgreSQL) |
| `feed:nvd:json` | string | 24h | Normalised feed snapshot |
| `locator:<imageDigest>` | string | 30d | Maps image digest → sbomBlobId |
| `dpop:<jti>` | string | 5m | DPoP nonce cache (RFC 9449) for sender-constrained tokens |
| `events:*` | stream | 7d | Event streams for Scheduler/Notify (Valkey Streams) |
| `queue:*` | stream | — | Task queues (Scanner jobs, Notify deliveries) |
| `metrics:…` | various | — | Prom / OTLP runtime metrics |
> **Delta SBOM** uses `layers:*` to skip work in <20 ms.
> **Quota enforcement** increments `quota:<token>` atomically; when the {{ quota_token }} limit is exhausted, the API returns **429**.
> **DPoP & Events**: Valkey Streams support high-throughput, ordered event delivery for re-evaluation and notification triggers.
> **Alternative**: NATS JetStream can replace Valkey for queues (opt-in only; requires explicit configuration).
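
The quota behaviour can be sketched with an in-memory stand-in for the `quota:<token>` key. The class `InMemoryQuota`, the `limit` parameter, and the choice to reject once the counter exceeds the limit are assumptions for illustration. Against a real Valkey instance the increment and expiry would typically map to the `INCR` and `EXPIREAT` commands.

```python
import math
from datetime import datetime, timedelta, timezone


def next_utc_midnight(now):
    """Return the next UTC midnight after a timezone-aware datetime."""
    utc_now = now.astimezone(timezone.utc)
    return (utc_now + timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)


class InMemoryQuota:
    """Hypothetical stand-in for the quota:<token> counter."""

    def __init__(self, limit):
        self.limit = limit
        self._entries = {}  # key -> (count, expires_at)

    def hit(self, token, now):
        key = f"quota:{token}"
        count, expires_at = self._entries.get(key, (0, None))
        if expires_at is None or now >= expires_at:
            # Key missing or expired: start a new daily window.
            count, expires_at = 0, next_utc_midnight(now)
        count += 1
        self._entries[key] = (count, expires_at)
        if count > self.limit:
            retry_after = math.ceil((expires_at - now).total_seconds())
            return 429, {"Retry-After": str(retry_after)}
        return 200, {}
```

Callers pass timezone-aware datetimes so the reset lands on UTC midnight regardless of the host's local zone.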
---
## 3 PostgreSQL Tables
PostgreSQL is the canonical persistent store for long-term audit and history.
| Table | Shape (summary) | Indexes |
|--------------------|------------------------------------------------------------|-------------------------------------|
| `sbom_history` | Wrapper JSON + `replace_ts` on overwrite | `(image_digest)` `(created)` |
| `policy_versions` | `{id, yaml, rego, author_id, created}` | `(created)` |
| `attestations` ⭑ | SLSA provenance doc + Rekor log pointer | `(image_digest)` |
| `audit_log` | Fully rendered RFC 5424 entries (UI & CLI actions) | `(user_id)` `(ts)` |
Schema detail for **policy_versions**:
```jsonc
{
"_id": "6619e90b8c5e1f76",
"yaml": "version: 1.0\nrules:\n - …",
"rego": null, // filled when Rego uploaded
"authorId": "u_1021",
"created": "2025-07-14T08:15:04Z",
"comment": "Imported via API"
}
```
### 3.1 Scheduler Sprints 16 Artifacts
**Tables.** `schedules`, `runs`, `impact_snapshots`, `audit` (module-local). All rows use the canonical JSON emitted by `StellaOps.Scheduler.Models` so agents and fixtures remain deterministic.
Samples live under `samples/api/scheduler/` (e.g., `schedule.json`, `run.json`, `impact-set.json`, `audit.json`) and mirror the canonical serializer output shown below.
#### 3.1.1 Schedule (`schedules`)
```jsonc
{
"id": "sch_20251018a",
"tenantId": "tenant-alpha",
"name": "Nightly Prod",
"enabled": true,
"cronExpression": "0 2 * * *",
"timezone": "UTC",
"mode": "analysis-only",
"selection": {
"scope": "by-namespace",
"namespaces": ["team-a", "team-b"],
"repositories": ["app/service-api"],
"includeTags": ["canary", "prod"],
"labels": [{"key": "env", "values": ["prod", "staging"]}],
"resolvesTags": true
},
"onlyIf": {"lastReportOlderThanDays": 7, "policyRevision": "policy@42"},
"notify": {"onNewFindings": true, "minSeverity": "high", "includeKev": true},
"limits": {"maxJobs": 1000, "ratePerSecond": 25, "parallelism": 4},
"subscribers": ["notify.ops"],
"createdAt": "2025-10-18T22:00:00Z",
"createdBy": "svc_scheduler",
"updatedAt": "2025-10-18T22:00:00Z",
"updatedBy": "svc_scheduler"
}
```
*Constraints*: arrays are alphabetically sorted; `selection.tenantId` is optional, but when present it must match `tenantId`. Cron expressions are validated for newlines and length, and timezones are validated via `TimeZoneInfo`.
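
A minimal sketch of the array and tenant constraints, written in Python for illustration only. The canonical behaviour lives in `StellaOps.Scheduler.Models`; the function name `normalize_schedule` and the set of fields it sorts are assumptions.

```python
# Fields assumed to be plain string arrays in the selection block.
SORTED_FIELDS = ("namespaces", "repositories", "includeTags")


def normalize_schedule(schedule):
    """Return a copy with sorted arrays; reject a mismatched selector tenant."""
    selection = dict(schedule.get("selection", {}))
    selector_tenant = selection.get("tenantId")
    if selector_tenant is not None and selector_tenant != schedule["tenantId"]:
        raise ValueError("selection.tenantId must match tenantId")
    for field in SORTED_FIELDS:
        if field in selection:
            selection[field] = sorted(selection[field])
    if "labels" in selection:
        # Sort label entries by key and each entry's values alphabetically.
        selection["labels"] = sorted(
            (
                {"key": label["key"], "values": sorted(label["values"])}
                for label in selection["labels"]
            ),
            key=lambda label: label["key"],
        )
    normalized = dict(schedule)
    normalized["selection"] = selection
    if "subscribers" in normalized:
        normalized["subscribers"] = sorted(normalized["subscribers"])
    return normalized
```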
#### 3.1.2 Run (`runs`)
```jsonc
{
"_id": "run_20251018_0001",
"tenantId": "tenant-alpha",
"scheduleId": "sch_20251018a",
"trigger": "conselier",
"state": "running",
"stats": {
"candidates": 1280,
"deduped": 910,
"queued": 624,
"completed": 310,
"deltas": 42,
"newCriticals": 7,
"newHigh": 11,
"newMedium": 18,
"newLow": 6
},
"reason": {"conselierExportId": "exp-20251018-03"},
"createdAt": "2025-10-18T22:03:14Z",
"startedAt": "2025-10-18T22:03:20Z",
"finishedAt": null,
"error": null,
"deltas": [
{
"imageDigest": "sha256:a1b2c3",
"newFindings": 3,
"newCriticals": 1,
"newHigh": 1,
"newMedium": 1,
"newLow": 0,
"kevHits": ["CVE-2025-0002"],
"topFindings": [
{
"purl": "pkg:rpm/openssl@3.0.12-5.el9",
"vulnerabilityId": "CVE-2025-0002",
"severity": "critical",
"link": "https://ui.internal/scans/sha256:a1b2c3"
}
],
"attestation": {"uuid": "rekor-314", "verified": true},
"detectedAt": "2025-10-18T22:03:21Z"
}
]
}
```
Counters are clamped to ≥0, timestamps are converted to UTC, and delta arrays are sorted (critical → info severity precedence, then vulnerability id). Missing `deltas` implies "no change" snapshots.
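
The normalisation rules above can be sketched as three small helpers. The names are hypothetical, and the intermediate severity levels between `critical` and `info` are assumed from the counter names used in the sample.

```python
from datetime import datetime, timezone

# Assumed precedence; the text only states "critical -> info".
SEVERITY_ORDER = ["critical", "high", "medium", "low", "info"]


def clamp_stats(stats):
    """Clamp every counter to a minimum of zero."""
    return {name: max(0, int(value)) for name, value in stats.items()}


def to_utc(timestamp):
    """Convert an RFC 3339 timestamp with any offset to UTC with a trailing Z."""
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def sort_findings(findings):
    """Order findings by severity precedence, then by vulnerability id."""
    rank = {name: index for index, name in enumerate(SEVERITY_ORDER)}
    return sorted(
        findings,
        key=lambda f: (rank.get(f["severity"].lower(), len(rank)), f["vulnerabilityId"]),
    )
```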
#### 3.1.3 Impact Snapshot (`impact_snapshots`)
```jsonc
{
"selector": {
"scope": "all-images",
"tenantId": "tenant-alpha"
},
"images": [
{
"imageDigest": "sha256:f1e2d3",
"registry": "registry.internal",
"repository": "app/api",
"namespaces": ["team-a"],
"tags": ["prod"],
"usedByEntrypoint": true,
"labels": {"env": "prod"}
}
],
"usageOnly": true,
"generatedAt": "2025-10-18T22:02:58Z",
"total": 412,
"snapshotId": "impact-20251018-1"
}
```
Images are deduplicated and sorted by digest. Label keys are normalised to lowercase to avoid case-sensitive duplicates during reconciliation. `snapshotId` enables run planners to compare subsequent snapshots for drift.
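
A short sketch of the deduplication and label normalisation. `normalize_images` is a hypothetical name, and keeping the first occurrence of a duplicate digest or label key is an assumption; the text does not say which duplicate wins.

```python
def normalize_images(images):
    """Deduplicate by imageDigest, lowercase label keys, sort by digest."""
    by_digest = {}
    for image in images:
        digest = image["imageDigest"]
        if digest in by_digest:
            continue  # assumption: the first occurrence is kept
        labels = {}
        for key, value in image.get("labels", {}).items():
            # assumption: the first spelling of a key wins after lowercasing
            labels.setdefault(key.lower(), value)
        normalized = dict(image)
        normalized["labels"] = labels
        by_digest[digest] = normalized
    return [by_digest[digest] for digest in sorted(by_digest)]
```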
#### 3.1.4 Audit (`audit`)
```jsonc
{
"_id": "audit_169754",
"tenantId": "tenant-alpha",
"category": "scheduler",
"action": "pause",
"occurredAt": "2025-10-18T22:10:00Z",
"actor": {"actorId": "user_admin", "displayName": "Cluster Admin", "kind": "user"},
"scheduleId": "sch_20251018a",
"correlationId": "corr-123",
"metadata": {"details": "schedule paused", "reason": "maintenance"},
"message": "Paused via API"
}
```
Metadata keys are lowercased, first-writer wins (duplicates with different casing are ignored), and optional IDs (`scheduleId`, `runId`) are trimmed when empty. Use the canonical serializer when emitting events so audit digests remain reproducible.
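
The metadata rule can be sketched as follows. Both helper names are hypothetical, and treating "trimmed when empty" as "omitted from the record" is an assumption.

```python
def normalize_metadata(pairs):
    """Lowercase keys; the first writer wins when keys differ only by case."""
    result = {}
    for key, value in pairs:
        lowered = key.lower()
        if lowered not in result:
            result[lowered] = value
    return result


def trim_optional_ids(record, fields=("scheduleId", "runId")):
    """Drop optional id fields that are missing, empty, or whitespace only."""
    cleaned = dict(record)
    for field in fields:
        value = cleaned.get(field)
        if value is None or not str(value).strip():
            cleaned.pop(field, None)
    return cleaned
```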
#### 3.1.5 Run Summary (`run_summaries`)
Materialized view powering the Scheduler UI dashboards. Stores the latest roll-up per schedule/tenant, enabling quick “last run” banners and sparkline counters without scanning the full `runs` collection.
```jsonc
{
"tenantId": "tenant-alpha",
"scheduleId": "sch_20251018a",
"updatedAt": "2025-10-18T22:10:10Z",
"lastRun": {
"runId": "run_20251018_0001",
"trigger": "conselier",
"state": "completed",
"createdAt": "2025-10-18T22:03:14Z",
"startedAt": "2025-10-18T22:03:20Z",
"finishedAt": "2025-10-18T22:08:45Z",
"stats": {
"candidates": 1280,
"deduped": 910,
"queued": 0,
"completed": 910,
"deltas": 42,
"newCriticals": 7,
"newHigh": 11,
"newMedium": 18,
"newLow": 6
},
"error": null
},
"recent": [
{
"runId": "run_20251018_0001",
"trigger": "conselier",
"state": "completed",
"createdAt": "2025-10-18T22:03:14Z",
"startedAt": "2025-10-18T22:03:20Z",
"finishedAt": "2025-10-18T22:08:45Z",
"stats": {
"candidates": 1280,
"deduped": 910,
"queued": 0,
"completed": 910,
"deltas": 42,
"newCriticals": 7,
"newHigh": 11,
"newMedium": 18,
"newLow": 6
},
"error": null
},
{
"runId": "run_20251017_0003",
"trigger": "cron",
"state": "error",
"createdAt": "2025-10-17T22:01:02Z",
"startedAt": "2025-10-17T22:01:08Z",
"finishedAt": "2025-10-17T22:04:11Z",
"stats": {
"candidates": 1040,
"deduped": 812,
"queued": 0,
"completed": 640,
"deltas": 18,
"newCriticals": 2,
"newHigh": 4,
"newMedium": 7,
"newLow": 3
},
"error": "scanner timeout"
}
],
"counters": {
"total": 3,
"planning": 0,
"queued": 0,
"running": 0,
"completed": 1,
"error": 1,
"cancelled": 1,
"totalDeltas": 60,
"totalNewCriticals": 9,
"totalNewHigh": 15,
"totalNewMedium": 25,
"totalNewLow": 9
}
}
```
- `_id` combines `tenantId` and `scheduleId` (`tenant:schedule`).
- `recent` contains the 20 most recent runs ordered by `createdAt` (UTC). Updates replace the existing entry for a run to respect state transitions.
- `counters` aggregate over the retained window (20 runs) for quick trend indicators. Totals are recomputed after every update.
- Schedulers should call the projection service after every run state change so the cache mirrors planner/runner progress.
Sample file: `samples/api/scheduler/run-summary.json`.
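
The projection rules listed above can be sketched as a single update function. `project_run` is a hypothetical name and newest-first ordering is assumed from the sample; sorting on the raw `createdAt` string relies on all timestamps sharing the same UTC `Z` format.

```python
WINDOW = 20  # retained runs, per the notes above
STATES = ("planning", "queued", "running", "completed", "error", "cancelled")
TOTALS = {
    "deltas": "totalDeltas",
    "newCriticals": "totalNewCriticals",
    "newHigh": "totalNewHigh",
    "newMedium": "totalNewMedium",
    "newLow": "totalNewLow",
}


def project_run(recent, run):
    """Merge one run into the retained window and recompute the counters."""
    # Replace any existing entry for the same run so state transitions apply.
    merged = [entry for entry in recent if entry["runId"] != run["runId"]]
    merged.append(run)
    merged.sort(key=lambda entry: entry["createdAt"], reverse=True)
    merged = merged[:WINDOW]

    counters = {"total": len(merged), **{state: 0 for state in STATES}}
    for entry in merged:
        counters[entry["state"]] = counters.get(entry["state"], 0) + 1
    for source, target in TOTALS.items():
        counters[target] = sum(entry["stats"].get(source, 0) for entry in merged)
    return merged, counters
```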
---
## 4 Policy Schema (YAML v1.0)
Minimal viable grammar (subset of OSV-SCHEMA ideas).
```yaml
version: "1.0"
rules:
  - name: Block Critical
    severity: [Critical]
    action: block
  - name: Ignore Low Dev
    severity: [Low, None]
    environments: [dev, staging]
    action: ignore
    expires: "2026-01-01"
  - name: Escalate RegionalFeed High
    sources: [NVD, CNNVD, CNVD, ENISA, JVN, BDU]
    severity: [High, Critical]
    action: escalate
```
Validation is performed by the `policy:mapping.yaml` JSON Schema embedded in the backend.
Canonical schema source: `src/Policy/__Libraries/StellaOps.Policy/Schemas/policy-schema@1.json` (embedded into `StellaOps.Policy`).
`PolicyValidationCli` (see `src/Policy/__Libraries/StellaOps.Policy/PolicyValidationCli.cs`) provides the reusable command handler that the main CLI wires up; in the interim it can be invoked from a short host like:
```csharp
await new PolicyValidationCli().RunAsync(new PolicyValidationCliOptions
{
    Inputs = new[] { "policies/root.yaml" },
    Strict = true,
});
```
### 4.1 Rego Variant (Advanced, TODO)
*Accepted but stored as-is in the `rego` field.*
Evaluated via an internal **OPA** sidecar once the feature graduates from the TODO list.
### 4.2 Policy Scoring Config (JSON)
*Schema id.* `https://schemas.stella-ops.org/policy/policy-scoring-schema@1.json`
*Source.* `src/Policy/__Libraries/StellaOps.Policy/Schemas/policy-scoring-schema@1.json` (embedded in `StellaOps.Policy`), default fixture at `src/Policy/__Libraries/StellaOps.Policy/Schemas/policy-scoring-default.json`.
```jsonc
{
"version": "1.0",
"severityWeights": {"Critical": 90, "High": 75, "Unknown": 60, "...": 0},
"quietPenalty": 45,
"warnPenalty": 15,
"ignorePenalty": 35,
"trustOverrides": {"vendor": 1.0, "distro": 0.85},
"reachabilityBuckets": {"entrypoint": 1.0, "direct": 0.85, "runtime": 0.45, "unknown": 0.5},
"unknownConfidence": {
"initial": 0.8,
"decayPerDay": 0.05,
"floor": 0.2,
"bands": [
{"name": "high", "min": 0.65},
{"name": "medium", "min": 0.35},
{"name": "low", "min": 0.0}
]
}
}
```
Validation occurs alongside policy binding (`PolicyScoringConfigBinder`), producing deterministic digests via `PolicyScoringConfigDigest`. Bands are ordered descending by `min` so consumers can resolve confidence tiers deterministically. Reachability buckets are case-insensitive keys (`entrypoint`, `direct`, `indirect`, `runtime`, `unreachable`, `unknown`) with numeric multipliers (default ≤1.0).
**Runtime usage**
- `trustOverrides` are matched against `finding.tags` (`trust:<key>`) first, then `finding.source`/`finding.vendor`; missing keys default to `1.0`.
- `reachabilityBuckets` consume `finding.tags` with prefix `reachability:` (fallback `usage:` or `unknown`). Missing buckets fall back to `unknown` weight when present, otherwise `1.0`.
- Policy verdicts expose scoring inputs (`severityWeight`, `trustWeight`, `reachabilityWeight`, `baseScore`, penalties) plus unknown-state metadata (`unknownConfidence`, `unknownAgeDays`, `confidenceBand`) for auditability. See `samples/policy/policy-preview-unknown.json` and `samples/policy/policy-report-unknown.json` for offline reference payloads validated against the published schemas below.
Validate the samples locally with **Ajv** before publishing changes:
```bash
# install once per checkout (offline-safe):
npm install --no-save ajv-cli@5 ajv-formats@2
npx ajv validate --spec=draft2020 -c ajv-formats \
  -s docs/modules/policy/schemas/policy-preview-sample@1.json \
  -d samples/policy/policy-preview-unknown.json
npx ajv validate --spec=draft2020 -c ajv-formats \
  -s docs/modules/policy/schemas/policy-report-sample@1.json \
  -d samples/policy/policy-report-unknown.json
```
- Unknown confidence derives from `unknown-age-days:` (preferred) or `unknown-since:` + `observed-at:` tags; with no hints the engine keeps `initial` confidence. Values decay by `decayPerDay` down to `floor`, then resolve to the first matching `bands[].name`.
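
The decay and band lookup can be sketched as below. Linear decay per day is an assumption (the text only says values decay by `decayPerDay` down to `floor`), and `unknown_confidence` is a hypothetical name rather than the engine's API.

```python
def unknown_confidence(age_days, config):
    """Return (confidence, band) for a finding in the unknown state.

    age_days: days since the finding became unknown, or None when no hint exists.
    config: the unknownConfidence block of the scoring config.
    """
    if age_days is None:
        # No age hint: keep the initial confidence.
        confidence = config["initial"]
    else:
        # Assumption: linear decay, never dropping below the floor.
        decayed = config["initial"] - config["decayPerDay"] * max(0.0, age_days)
        confidence = max(config["floor"], decayed)
    # Bands are evaluated in descending order of min; the first match wins.
    bands = sorted(config["bands"], key=lambda band: band["min"], reverse=True)
    band = next((b["name"] for b in bands if confidence >= b["min"]), None)
    return confidence, band
```

With the default values shown in the config above, a finding with no age hint stays at `0.8` and resolves to the `high` band, while one that has been unknown for 30 days bottoms out at the `0.2` floor and resolves to `low`.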
---
## 5 SLSA Attestation Schema
Planned for Q1 2026 (kept here for early plugin authors).
```jsonc
{
"id": "prov_0291",
"imageDigest": "sha256:e2b9…",
"buildType": "https://slsa.dev/container/v1",
"builder": {
"id": "https://git.stella-ops.ru/ci/stella-runner@sha256:f7b7…"
},
"metadata": {
"invocation": {
"parameters": {"GIT_SHA": "f6a1…"},
"buildStart": "2025-07-14T06:59:17Z",
"buildEnd": "2025-07-14T07:01:22Z"
},
"completeness": {"parameters": true}
},
"materials": [
{"uri": "git+https://git…", "digest": {"sha1": "f6a1…"}}
],
"rekorLogIndex": 99817 // entry in local Rekor mirror
}
```
---
## 6 Notify Foundations (Rule · Channel · Event)
*Sprint 15 target:* canonically describe the Notify data shapes that the UI, workers, and storage consume. JSON Schemas live under `docs/modules/notify/resources/schemas/` and deterministic fixtures under `docs/modules/notify/resources/samples/`.
| Artifact | Schema | Sample |
|----------|--------|--------|
| **Rule** (catalogued routing logic) | `docs/modules/notify/resources/schemas/notify-rule@1.json` | `docs/modules/notify/resources/samples/notify-rule@1.sample.json` |
| **Channel** (delivery endpoint definition) | `docs/modules/notify/resources/schemas/notify-channel@1.json` | `docs/modules/notify/resources/samples/notify-channel@1.sample.json` |
| **Template** (rendering payload) | `docs/modules/notify/resources/schemas/notify-template@1.json` | `docs/modules/notify/resources/samples/notify-template@1.sample.json` |
| **Event envelope** (Notify ingest surface) | `docs/modules/notify/resources/schemas/notify-event@1.json` | `docs/modules/notify/resources/samples/notify-event@1.sample.json` |
### 6.1 Rule highlights (`notify-rule@1`)
* Keys are lowercased camelCase. `schemaVersion` (`notify.rule@1`), `ruleId`, `tenantId`, `name`, `match`, `actions`, `createdAt`, and `updatedAt` are mandatory.
* `match.eventKinds`, `match.verdicts`, and other array selectors are pre-sorted and case-normalized (e.g. `scanner.report.ready`).
* `actions[].throttle` serialises as an ISO 8601 duration (`PT5M`), mirroring worker backoff guardrails.
* `vex` gates let operators exclude accepted/not-affected justifications; omit the block to inherit default behaviour.
* Use `StellaOps.Notify.Models.NotifySchemaMigration.UpgradeRule(JsonNode)` when deserialising legacy payloads that might lack `schemaVersion` or retain older revisions.
* Soft deletions persist `deletedAt` in PostgreSQL (and disable the rule); repository queries automatically filter them.
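
For consumers outside .NET, a throttle value such as `PT5M` can be read with a small parser. This sketch covers only the hour/minute/second subset of ISO 8601 durations, and `parse_throttle` is a hypothetical helper, not part of `StellaOps.Notify.Models`.

```python
import re
from datetime import timedelta

# Time-only subset of ISO 8601 durations: PT[nH][nM][nS]
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$")


def parse_throttle(value):
    """Parse a duration like 'PT5M' or 'PT1H30M' into a timedelta."""
    match = _DURATION.match(value)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = match.groups()
    return timedelta(
        hours=int(hours or 0),
        minutes=int(minutes or 0),
        seconds=float(seconds or 0),
    )
```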
### 6.2 Channel highlights (`notify-channel@1`)
* `schemaVersion` is pinned to `notify.channel@1` and must accompany persisted documents.
* `type` matches plugin identifiers (`slack`, `teams`, `email`, `webhook`, `custom`).
* `config.secretRef` stores an external secret handle (Authority, Vault, K8s). Notify never persists raw credentials.
* Optional `config.limits.timeout` uses ISO 8601 durations identical to rule throttles; concurrency/RPM defaults apply when absent.
* `StellaOps.Notify.Models.NotifySchemaMigration.UpgradeChannel(JsonNode)` backfills the schema version when older documents omit it.
* Channels share the same soft-delete marker (`deletedAt`) so operators can restore prior configuration without purging history.
### 6.3 Event envelope (`notify-event@1`)
* Aligns with the platform event contract: `eventId` UUID, RFC 3339 `ts`, tenant isolation enforced.
* Enumerated `kind` covers the initial Notify surface (`scanner.report.ready`, `scheduler.rescan.delta`, `zastava.admission`, etc.).
* `scope.labels`/`scope.attributes` and top-level `attributes` mirror the metadata dictionaries workers surface for templating and audits.
* Notify workers use the same migration helper to wrap event payloads before template rendering, so schema additions remain additive.
### 6.4 Template highlights (`notify-template@1`)
* Carries the presentation key (`channelType`, `key`, `locale`) and the raw template body; `schemaVersion` is fixed to `notify.template@1`.
* `renderMode` enumerates supported engines (`markdown`, `html`, `adaptiveCard`, `plainText`, `json`) aligning with `NotifyTemplateRenderMode`.
* `format` signals downstream connector expectations (`slack`, `teams`, `email`, `webhook`, `json`).
* Upgrade legacy definitions with `NotifySchemaMigration.UpgradeTemplate(JsonNode)` to auto-apply the new schema version and ordering.
* Templates also record soft deletes via `deletedAt`; UI/API skip them by default while retaining revision history.
**Validation loop:**
```bash
# Validate Notify schemas and samples (matches Docs CI)
for schema in docs/modules/notify/resources/schemas/*.json; do
  npx ajv compile -c ajv-formats -s "$schema"
done
for sample in docs/modules/notify/resources/samples/*.sample.json; do
  schema="docs/modules/notify/resources/schemas/$(basename "${sample%.sample.json}").json"
  npx ajv validate -c ajv-formats -s "$schema" -d "$sample"
done
```
Integration tests can embed the sample fixtures to guarantee deterministic serialisation from the `StellaOps.Notify.Models` DTOs introduced in Sprint 15.
---
## 6 Validator Contracts
* For the SBOM wrapper: `ISbomValidator` (DLL plugin) must return a *typed* error list.
* For YAML policies: JSON Schema at `/schemas/policyv1.json`.
* For Rego: OPA `opa eval --fail-defined` under the hood.
* For **Free-tier quotas**: `IQuotaService` integration tests ensure `quota:<token>` resets at UTC midnight and produces correct `Retry-After` headers.
---
## 7 Migration Notes
1. **Add `format` column** to existing SBOM wrappers; default to `trivy-json-v2`.
2. **Populate `layers` & `partial`** via the backfill script (ships with the `stellopsctl migrate` wizard).
3. Policy YAML previously stored in Valkey → copy to PostgreSQL if persistence is enabled.
4. Prepare the `attestations` table (empty); it is safe to create in advance.
---
## 8 Open Questions / Future Work
* How to deduplicate *identical* Rego policies differing only in whitespace?
* Embed *GOST 34.11-2018* digests when users enable the Russian crypto suite?
* Should enterprise tiers share the same Valkey quota keys (Redis-compatible) or switch to a JWT claim `tier != Free` bypass?
* Evaluate a sliding-window quota instead of a strict daily reset.
* Consider a rate limit for `/layers/missing` to avoid brute-force enumeration.
---
## 9 Change Log
| Date | Note |
|------------|--------------------------------------------------------------------------------|
| 2025-07-14 | **Added:** `format`, `partial`, delta cache keys, YAML policy schema v1.0. |
| 2025-07-12 | **Initial public draft:** SBOM wrapper, Valkey keyspace (Redis-compatible), audit collections. |
---