# Data Schemas & Persistence Contracts

*Audience* – backend developers, plug‑in authors, DB admins.
*Scope* – describes **Redis**, **MongoDB** (optional), and on‑disk blob shapes that power Stella Ops.

---

## 0 Document Conventions

* **camelCase** for JSON.
* All timestamps are **RFC 3339 / ISO 8601** with `Z` (UTC).
* `⭑` = planned but *not* shipped yet (kept on Feature Matrix “To Do”).

---

## 1 SBOM Wrapper Envelope

Every SBOM blob (regardless of format) is stored on disk or in object storage with a *sidecar* JSON file that indexes it for the scanners.

#### 1.1 JSON Shape

```jsonc
{
  "id": "sha256:417f…",              // digest of the SBOM *file* itself
  "imageDigest": "sha256:e2b9…",     // digest of the original container image
  "created": "2025-07-14T07:02:13Z",
  "format": "trivy-json-v2",         // NEW enum: trivy-json-v2 | spdx-json | cyclonedx-json
  "layers": [
    "sha256:d38b…",                  // layer digests (ordered)
    "sha256:af45…"
  ],
  "partial": false,                  // true => delta SBOM (only some layers)
  "provenanceId": "prov_0291"        // ⭑ link to SLSA attestation (Q1‑2026)
}
```

* *`format`* **NEW** – added to support **multiple SBOM formats**.
* *`partial`* **NEW** – true when generated via the **delta SBOM** flow (§1.3).

#### 1.2 File‑system Layout

```
blobs/
 ├─ 417f…                # digest prefix
 │   ├─ sbom.json        # payload (any format)
 │   └─ sbom.meta.json   # wrapper (shape above)
```

> **Note** – blob storage can point at S3, MinIO, or plain disk; driver plug‑ins adapt.
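
To make the wrapper and layout concrete, here is a minimal Python sketch that writes a payload and its sidecar to plain disk. It is an illustration under stated assumptions, not the Stella Ops storage driver: `write_sbom_blob` and `prefix_len` are hypothetical names, and because the document does not say how long the "digest prefix" is, the sketch defaults to the full hex digest as the directory name.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def write_sbom_blob(root: Path, sbom_bytes: bytes, image_digest: str, fmt: str,
                    layers: list[str], partial: bool = False,
                    prefix_len: int = 64) -> Path:
    """Write sbom.json and its sbom.meta.json wrapper; return the blob directory."""
    hex_digest = hashlib.sha256(sbom_bytes).hexdigest()
    # Assumption: prefix_len is hypothetical; 64 uses the whole digest as the directory name.
    blob_dir = root / "blobs" / hex_digest[:prefix_len]
    blob_dir.mkdir(parents=True, exist_ok=True)

    wrapper = {
        "id": f"sha256:{hex_digest}",          # digest of the SBOM file itself
        "imageDigest": image_digest,
        "created": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "format": fmt,
        "layers": list(layers),                # order is preserved
        "partial": partial,
    }
    (blob_dir / "sbom.json").write_bytes(sbom_bytes)
    (blob_dir / "sbom.meta.json").write_text(json.dumps(wrapper, indent=2), encoding="utf-8")
    return blob_dir
```

The planned `provenanceId` field is left out on purpose, since it is marked ⭑ and not shipped yet.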

#### 1.3 Delta SBOM Extension

When `partial: true`, *only* the missing layers have been scanned.
Merging logic inside the `scanning` module stitches the new data onto the cached full SBOM in Redis.
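
The delta flow can be sketched in a few lines of Python. This is a hedged illustration only: `plan_delta_scan` and `merge_delta` are hypothetical helpers, and the per-layer SBOM shape (`{"layers": {digest: [components]}}`) is an assumption made so the merge is easy to follow; the real `scanning` module may organise its data differently.

```python
def plan_delta_scan(image_layers: list[str], cached_layers: set[str]) -> list[str]:
    """Return the layers that still need scanning, preserving image order."""
    return [layer for layer in image_layers if layer not in cached_layers]


def merge_delta(full_sbom: dict, delta_sbom: dict) -> dict:
    """Stitch a partial (delta) SBOM onto a cached full SBOM.

    Assumed shape: {"partial": bool, "layers": {layer_digest: [components...]}}.
    """
    if not delta_sbom.get("partial", False):
        return delta_sbom  # a full SBOM simply replaces the cached one
    merged_layers = dict(full_sbom.get("layers", {}))
    merged_layers.update(delta_sbom.get("layers", {}))  # newly scanned layers win
    return {"partial": False, "layers": merged_layers}
```

In this sketch, `cached_layers` stands in for the members of the Redis `layers:<digest>` set described in §2.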

---

## 2 Redis Keyspace

| Key pattern             | Type    | TTL                       | Purpose                                                         |
|-------------------------|---------|---------------------------|-----------------------------------------------------------------|
| `scan:<digest>`         | string  | ∞                         | Last scan JSON result (as returned by `/scan`)                  |
| `layers:<digest>`       | set     | 90d                       | Layers already possessing SBOMs (delta cache)                   |
| `policy:active`         | string  | ∞                         | YAML **or** Rego ruleset                                        |
| `quota:<token>`         | string  | *until next UTC midnight* | Per‑token scan counter for Free tier ({{ quota_token }} scans). |
| `policy:history`        | list    | ∞                         | Change audit IDs (see Mongo)                                    |
| `feed:nvd:json`         | string  | 24h                       | Normalised feed snapshot                                        |
| `locator:<imageDigest>` | string  | 30d                       | Maps image digest → sbomBlobId                                  |
| `metrics:…`             | various | n/a                       | Prom / OTLP runtime metrics                                     |

> **Delta SBOM** uses `layers:*` to skip work in <20 ms.
>
> **Quota enforcement** increments `quota:<token>` atomically; when the {{ quota_token }} limit is exhausted, the API returns **429**.
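
The daily quota behaviour can be modelled without a Redis server. The Python sketch below is an in-memory stand-in for the `quota:<token>` key (an increment plus an expiry at the next UTC midnight); `QuotaCounter` and `try_consume` are hypothetical names, and treating the limit as "reject once the counter exceeds it" is an assumption, since the exact comparison is not spelled out above.

```python
import math
from datetime import datetime, timedelta, timezone


def seconds_until_utc_midnight(now: datetime) -> int:
    """Seconds from `now` (timezone-aware) until the next 00:00:00 UTC."""
    now = now.astimezone(timezone.utc)
    next_midnight = (now + timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)
    return max(1, math.ceil((next_midnight - now).total_seconds()))


class QuotaCounter:
    """In-memory stand-in for the Redis `quota:<token>` counter."""

    def __init__(self, limit: int):
        self.limit = limit
        self._counts: dict[str, tuple[int, datetime]] = {}

    def try_consume(self, token: str, now: datetime) -> tuple[bool, int]:
        """Return (allowed, retry_after_seconds) for one scan request."""
        count, expires_at = self._counts.get(token, (0, now))
        if now >= expires_at:  # key missing or expired: start a new day
            count = 0
            expires_at = now + timedelta(seconds=seconds_until_utc_midnight(now))
        count += 1
        self._counts[token] = (count, expires_at)
        if count > self.limit:  # assumption: `limit` scans are allowed per day
            return False, seconds_until_utc_midnight(now)
        return True, 0
```

The second tuple element is the value a caller could place in a `Retry-After` header alongside the **429** response.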

---

## 3 MongoDB Collections (Optional)

Only enabled when `MONGO_URI` is supplied (for long‑term audit).

| Collection         | Shape (summary)                                            | Indexes                             |
|--------------------|------------------------------------------------------------|-------------------------------------|
| `sbom_history`     | Wrapper JSON + `replaceTs` on overwrite                    | `{imageDigest}` `{created}`         |
| `policy_versions`  | `{_id, yaml, rego, authorId, created}`                     | `{created}`                         |
| `attestations` ⭑   | SLSA provenance doc + Rekor log pointer                    | `{imageDigest}`                     |
| `audit_log`        | Fully rendered RFC 5424 entries (UI & CLI actions)         | `{userId}` `{ts}`                   |

Schema detail for **policy_versions**:

```jsonc
{
  "_id": "6619e90b8c5e1f76",
  "yaml": "version: 1.0\nrules:\n  - …",
  "rego": null,                    // filled when Rego uploaded
  "authorId": "u_1021",
  "created": "2025-07-14T08:15:04Z",
  "comment": "Imported via API"
}
```

### 3.1 Scheduler Sprint 16 Artifacts

**Collections.** `schedules`, `runs`, `impact_snapshots`, `audit` (module‑local). All documents reuse the canonical JSON emitted by `StellaOps.Scheduler.Models` so agents and fixtures remain deterministic.

Samples live under `samples/api/scheduler/` (e.g., `schedule.json`, `run.json`, `impact-set.json`, `audit.json`) and mirror the canonical serializer output shown below.

#### 3.1.1 Schedule (`schedules`)

```jsonc
{
  "_id": "sch_20251018a",
  "tenantId": "tenant-alpha",
  "name": "Nightly Prod",
  "enabled": true,
  "cronExpression": "0 2 * * *",
  "timezone": "UTC",
  "mode": "analysis-only",
  "selection": {
    "scope": "by-namespace",
    "namespaces": ["team-a", "team-b"],
    "repositories": ["app/service-api"],
    "includeTags": ["canary", "prod"],
    "labels": [{"key": "env", "values": ["prod", "staging"]}],
    "resolvesTags": true
  },
  "onlyIf": {"lastReportOlderThanDays": 7, "policyRevision": "policy@42"},
  "notify": {"onNewFindings": true, "minSeverity": "high", "includeKev": true},
  "limits": {"maxJobs": 1000, "ratePerSecond": 25, "parallelism": 4},
  "subscribers": ["notify.ops"],
  "createdAt": "2025-10-18T22:00:00Z",
  "createdBy": "svc_scheduler",
  "updatedAt": "2025-10-18T22:00:00Z",
  "updatedBy": "svc_scheduler"
}
```

*Constraints*: arrays are alphabetically sorted; `selection.tenantId` is optional, but when present it must match `tenantId`. Cron expressions are validated for newlines and length, and timezones are validated via `TimeZoneInfo`.
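
As a rough illustration of these constraints, the Python sketch below validates and normalises a schedule document. It is not the `StellaOps.Scheduler.Models` implementation: `normalize_schedule` and `MAX_CRON_LENGTH` are hypothetical, the length bound is invented for the example, "alphabetically sorted" is approximated with plain code-point ordering, and timezone validation is omitted.

```python
MAX_CRON_LENGTH = 128  # hypothetical bound; the real limit is not stated in this document


def normalize_schedule(doc: dict) -> dict:
    """Return a copy of `doc` with sorted selection arrays after basic validation."""
    cron = doc["cronExpression"]
    if "\n" in cron or "\r" in cron or not 0 < len(cron) <= MAX_CRON_LENGTH:
        raise ValueError("invalid cron expression")

    selection = dict(doc.get("selection", {}))
    selection_tenant = selection.get("tenantId")
    if selection_tenant is not None and selection_tenant != doc["tenantId"]:
        raise ValueError("selection.tenantId must match tenantId")

    # Assumption: only these string arrays are sorted; code-point order is used.
    for key in ("namespaces", "repositories", "includeTags"):
        if key in selection:
            selection[key] = sorted(selection[key])
    return {**doc, "selection": selection}
```

Sorting on write keeps two logically identical schedules byte-identical after serialisation, which is what makes fixtures comparable.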

#### 3.1.2 Run (`runs`)

```jsonc
{
  "_id": "run_20251018_0001",
  "tenantId": "tenant-alpha",
  "scheduleId": "sch_20251018a",
  "trigger": "feedser",
  "state": "running",
  "stats": {
    "candidates": 1280,
    "deduped": 910,
    "queued": 624,
    "completed": 310,
    "deltas": 42,
    "newCriticals": 7,
    "newHigh": 11,
    "newMedium": 18,
    "newLow": 6
  },
  "reason": {"feedserExportId": "exp-20251018-03"},
  "createdAt": "2025-10-18T22:03:14Z",
  "startedAt": "2025-10-18T22:03:20Z",
  "finishedAt": null,
  "error": null,
  "deltas": [
    {
      "imageDigest": "sha256:a1b2c3",
      "newFindings": 3,
      "newCriticals": 1,
      "newHigh": 1,
      "newMedium": 1,
      "newLow": 0,
      "kevHits": ["CVE-2025-0002"],
      "topFindings": [
        {
          "purl": "pkg:rpm/openssl@3.0.12-5.el9",
          "vulnerabilityId": "CVE-2025-0002",
          "severity": "critical",
          "link": "https://ui.internal/scans/sha256:a1b2c3"
        }
      ],
      "attestation": {"uuid": "rekor-314", "verified": true},
      "detectedAt": "2025-10-18T22:03:21Z"
    }
  ]
}
```

Counters are clamped to ≥0, timestamps are converted to UTC, and delta arrays are sorted (critical → info severity precedence, then vulnerability id). Missing `deltas` implies "no change" snapshots.
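
A small Python sketch of these normalisation rules follows. It is illustrative only: `normalize_run` and `to_utc` are hypothetical helpers, and the sketch assumes the "delta arrays" being sorted are each delta's `topFindings` list, which the sentence above does not state explicitly.

```python
from datetime import datetime, timezone

# Severity precedence from critical down to info.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}


def to_utc(ts: str) -> str:
    """Convert an RFC 3339 timestamp (with offset) to UTC with a trailing `Z`."""
    parsed = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def normalize_run(run: dict) -> dict:
    """Clamp counters, convert delta timestamps to UTC, and sort findings."""
    stats = {name: max(0, value) for name, value in run.get("stats", {}).items()}
    deltas = []
    for delta in run.get("deltas", []):
        findings = sorted(
            delta.get("topFindings", []),
            key=lambda f: (
                SEVERITY_ORDER.get(f["severity"].lower(), len(SEVERITY_ORDER)),
                f["vulnerabilityId"],
            ),
        )
        deltas.append({**delta, "topFindings": findings,
                       "detectedAt": to_utc(delta["detectedAt"])})
    return {**run, "stats": stats, "deltas": deltas}
```

Unknown severities sort after `info` in this sketch, so an unexpected value never jumps ahead of a critical finding.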

#### 3.1.3 Impact Snapshot (`impact_snapshots`)

```jsonc
{
  "selector": {
    "scope": "all-images",
    "tenantId": "tenant-alpha"
  },
  "images": [
    {
      "imageDigest": "sha256:f1e2d3",
      "registry": "registry.internal",
      "repository": "app/api",
      "namespaces": ["team-a"],
      "tags": ["prod"],
      "usedByEntrypoint": true,
      "labels": {"env": "prod"}
    }
  ],
  "usageOnly": true,
  "generatedAt": "2025-10-18T22:02:58Z",
  "total": 412,
  "snapshotId": "impact-20251018-1"
}
```

Images are deduplicated and sorted by digest. Label keys are normalised to lowercase to avoid case‑sensitive duplicates during reconciliation. `snapshotId` enables run planners to compare subsequent snapshots for drift.
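
The image normalisation described above could look roughly like the following Python sketch. `normalize_images` is a hypothetical helper; which duplicate survives is not specified in this document, so keeping the first occurrence of a digest is an assumption.

```python
def normalize_images(images: list[dict]) -> list[dict]:
    """Deduplicate images by digest, lowercase label keys, and sort by digest."""
    by_digest: dict[str, dict] = {}
    for image in images:
        labels = {key.lower(): value for key, value in image.get("labels", {}).items()}
        # Assumption: the first occurrence of a digest is the one that is kept.
        by_digest.setdefault(image["imageDigest"], {**image, "labels": labels})
    return [by_digest[digest] for digest in sorted(by_digest)]
```

Because the output order depends only on the digests, two snapshots of the same image set serialise identically, which is what makes drift comparison by `snapshotId` meaningful.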

#### 3.1.4 Audit (`audit`)

```jsonc
{
  "_id": "audit_169754",
  "tenantId": "tenant-alpha",
  "category": "scheduler",
  "action": "pause",
  "occurredAt": "2025-10-18T22:10:00Z",
  "actor": {"actorId": "user_admin", "displayName": "Cluster Admin", "kind": "user"},
  "scheduleId": "sch_20251018a",
  "correlationId": "corr-123",
  "metadata": {"details": "schedule paused", "reason": "maintenance"},
  "message": "Paused via API"
}
```

Metadata keys are lowercased, first‑writer wins (duplicates with different casing are ignored), and optional IDs (`scheduleId`, `runId`) are trimmed when empty. Use the canonical serializer when emitting events so audit digests remain reproducible.
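
The first‑writer‑wins rule is easy to get backwards, so here is a short Python sketch of it. The helper names are hypothetical, and "trimmed when empty" is interpreted here as "the field is omitted when it is missing or blank", which is an assumption about the serializer's behaviour.

```python
def normalize_metadata(pairs) -> dict:
    """Lowercase keys; the first value seen for a key wins."""
    result: dict[str, str] = {}
    for key, value in pairs:
        result.setdefault(key.lower(), value)  # later casing variants are ignored
    return result


def normalize_audit_event(event: dict) -> dict:
    """Return a copy of `event` with normalised metadata and optional IDs."""
    out = dict(event)
    out["metadata"] = normalize_metadata(event.get("metadata", {}).items())
    for optional_id in ("scheduleId", "runId"):
        value = out.get(optional_id)
        if value is None or not value.strip():
            out.pop(optional_id, None)  # assumption: empty IDs are dropped entirely
    return out
```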

---

## 4 Policy Schema (YAML v1.0)

Minimal viable grammar (subset of OSV‑SCHEMA ideas).

```yaml
version: "1.0"
rules:
  - name: Block Critical
    severity: [Critical]
    action: block
  - name: Ignore Low Dev
    severity: [Low, None]
    environments: [dev, staging]
    action: ignore
    expires: "2026-01-01"
  - name: Escalate RegionalFeed High
    sources: [NVD, CNNVD, CNVD, ENISA, JVN, BDU]
    severity: [High, Critical]
    action: escalate
```

Validation is performed by the `policy:mapping.yaml` JSON Schema embedded in the backend.

Canonical schema source: `src/StellaOps.Policy/Schemas/policy-schema@1.json` (embedded into `StellaOps.Policy`).
`PolicyValidationCli` (see `src/StellaOps.Policy/PolicyValidationCli.cs`) provides the reusable command handler that the main CLI wires up; in the interim it can be invoked from a short host like:

```csharp
await new PolicyValidationCli().RunAsync(new PolicyValidationCliOptions
{
    Inputs = new[] { "policies/root.yaml" },
    Strict = true,
});
```

### 4.1 Rego Variant (Advanced – TODO)

*Accepted but stored as‑is in the `rego` field.*
Evaluated via an internal **OPA** side‑car once the feature graduates from the TODO list.

### 4.2 Policy Scoring Config (JSON)

*Schema id.* `https://schemas.stella-ops.org/policy/policy-scoring-schema@1.json`

*Source.* `src/StellaOps.Policy/Schemas/policy-scoring-schema@1.json` (embedded in `StellaOps.Policy`), default fixture at `src/StellaOps.Policy/Schemas/policy-scoring-default.json`.

```jsonc
{
  "version": "1.0",
  "severityWeights": {"Critical": 90, "High": 75, "Unknown": 60, "...": 0},
  "quietPenalty": 45,
  "warnPenalty": 15,
  "ignorePenalty": 35,
  "trustOverrides": {"vendor": 1.0, "distro": 0.85},
  "reachabilityBuckets": {"entrypoint": 1.0, "direct": 0.85, "runtime": 0.45, "unknown": 0.5},
  "unknownConfidence": {
    "initial": 0.8,
    "decayPerDay": 0.05,
    "floor": 0.2,
    "bands": [
      {"name": "high", "min": 0.65},
      {"name": "medium", "min": 0.35},
      {"name": "low", "min": 0.0}
    ]
  }
}
```

Validation occurs alongside policy binding (`PolicyScoringConfigBinder`), producing deterministic digests via `PolicyScoringConfigDigest`. Bands are ordered descending by `min` so consumers can resolve confidence tiers deterministically. Reachability buckets are case-insensitive keys (`entrypoint`, `direct`, `indirect`, `runtime`, `unreachable`, `unknown`) with numeric multipliers (default ≤1.0).

**Runtime usage**

- `trustOverrides` are matched against `finding.tags` (`trust:<key>`) first, then `finding.source`/`finding.vendor`; missing keys default to `1.0`.
- `reachabilityBuckets` consume `finding.tags` with prefix `reachability:` (fallback `usage:` or `unknown`). Missing buckets fall back to `unknown` weight when present, otherwise `1.0`.
- Policy verdicts expose scoring inputs (`severityWeight`, `trustWeight`, `reachabilityWeight`, `baseScore`, penalties) plus unknown-state metadata (`unknownConfidence`, `unknownAgeDays`, `confidenceBand`) for auditability. See `samples/policy/policy-preview-unknown.json` for an end-to-end preview payload.
- Unknown confidence derives from `unknown-age-days:` (preferred) or `unknown-since:` + `observed-at:` tags; with no hints the engine keeps `initial` confidence. Values decay by `decayPerDay` down to `floor`, then resolve to the first matching `bands[].name`.
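
The unknown-confidence rule in the last bullet can be worked through with a short Python sketch. It assumes *linear* decay (`initial - decayPerDay * ageDays`, clamped at `floor`); the document only says values "decay by `decayPerDay`", so the real engine may use a different curve. `unknown_confidence` is a hypothetical function name.

```python
from typing import Optional


def unknown_confidence(cfg: dict, age_days: Optional[float]) -> tuple[float, str]:
    """Return (confidence, band name) for a finding in the unknown state."""
    uc = cfg["unknownConfidence"]
    if age_days is None:
        confidence = uc["initial"]  # no age hints: keep the initial confidence
    else:
        # Assumption: linear decay per day, never dropping below the floor.
        decayed = uc["initial"] - uc["decayPerDay"] * max(age_days, 0)
        confidence = max(uc["floor"], decayed)
    # Bands are evaluated in descending `min` order; the first match wins.
    for band in sorted(uc["bands"], key=lambda b: b["min"], reverse=True):
        if confidence >= band["min"]:
            return confidence, band["name"]
    raise ValueError("no band matched; expected a band with min 0.0")
```

With the default fixture values shown above, a finding with no age hints stays at 0.8 and lands in `high`, one that is four days old decays to roughly 0.6 (`medium`), and anything older than twelve days sits at the 0.2 floor (`low`).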

---

## 5 SLSA Attestation Schema ⭑

Planned for Q1‑2026 (kept here for early plug‑in authors).

```jsonc
{
  "id": "prov_0291",
  "imageDigest": "sha256:e2b9…",
  "buildType": "https://slsa.dev/container/v1",
  "builder": {
    "id": "https://git.stella-ops.ru/ci/stella-runner@sha256:f7b7…"
  },
  "metadata": {
    "invocation": {
      "parameters": {"GIT_SHA": "f6a1…"},
      "buildStart": "2025-07-14T06:59:17Z",
      "buildEnd": "2025-07-14T07:01:22Z"
    },
    "completeness": {"parameters": true}
  },
  "materials": [
    {"uri": "git+https://git…", "digest": {"sha1": "f6a1…"}}
  ],
  "rekorLogIndex": 99817              // entry in local Rekor mirror
}
```

---

## 6 Notify Foundations (Rule · Channel · Event)

*Sprint 15 target* – canonically describe the Notify data shapes that UI, workers, and storage consume. JSON Schemas live under `docs/notify/schemas/` and deterministic fixtures under `docs/notify/samples/`.

| Artifact | Schema | Sample |
|----------|--------|--------|
| **Rule** (catalogued routing logic) | `docs/notify/schemas/notify-rule@1.json` | `docs/notify/samples/notify-rule@1.sample.json` |
| **Channel** (delivery endpoint definition) | `docs/notify/schemas/notify-channel@1.json` | `docs/notify/samples/notify-channel@1.sample.json` |
| **Template** (rendering payload) | `docs/notify/schemas/notify-template@1.json` | `docs/notify/samples/notify-template@1.sample.json` |
| **Event envelope** (Notify ingest surface) | `docs/notify/schemas/notify-event@1.json` | `docs/notify/samples/notify-event@1.sample.json` |

### 6.1 Rule highlights (`notify-rule@1`)

* Keys are lower‑cased camelCase. `schemaVersion` (`notify.rule@1`), `ruleId`, `tenantId`, `name`, `match`, `actions`, `createdAt`, and `updatedAt` are mandatory.
* `match.eventKinds`, `match.verdicts`, and other array selectors are pre‑sorted and case‑normalized (e.g. `scanner.report.ready`).
* `actions[].throttle` serialises as ISO 8601 duration (`PT5M`), mirroring worker backoff guardrails.
* `vex` gates let operators exclude accepted/not‑affected justifications; omit the block to inherit default behaviour.
* Use `StellaOps.Notify.Models.NotifySchemaMigration.UpgradeRule(JsonNode)` when deserialising legacy payloads that might lack `schemaVersion` or retain older revisions.
* Soft deletions persist `deletedAt` in Mongo (and disable the rule); repository queries automatically filter them.

### 6.2 Channel highlights (`notify-channel@1`)

* `schemaVersion` is pinned to `notify.channel@1` and must accompany persisted documents.
* `type` matches plug‑in identifiers (`slack`, `teams`, `email`, `webhook`, `custom`).
* `config.secretRef` stores an external secret handle (Authority, Vault, K8s). Notify never persists raw credentials.
* Optional `config.limits.timeout` uses ISO 8601 durations identical to rule throttles; concurrency/RPM defaults apply when absent.
* `StellaOps.Notify.Models.NotifySchemaMigration.UpgradeChannel(JsonNode)` backfills the schema version when older documents omit it.
* Channels share the same soft-delete marker (`deletedAt`) so operators can restore prior configuration without purging history.

### 6.3 Event envelope (`notify-event@1`)

* Aligns with the platform event contract: `eventId` UUID, RFC 3339 `ts`, and enforced tenant isolation.
* Enumerated `kind` covers the initial Notify surface (`scanner.report.ready`, `scheduler.rescan.delta`, `zastava.admission`, etc.).
* `scope.labels`/`scope.attributes` and top-level `attributes` mirror the metadata dictionaries workers surface for templating and audits.
* Notify workers use the same migration helper to wrap event payloads before template rendering, so schema additions remain additive.

### 6.4 Template highlights (`notify-template@1`)

* Carries the presentation key (`channelType`, `key`, `locale`) and the raw template body; `schemaVersion` is fixed to `notify.template@1`.
* `renderMode` enumerates supported engines (`markdown`, `html`, `adaptiveCard`, `plainText`, `json`) aligning with `NotifyTemplateRenderMode`.
* `format` signals downstream connector expectations (`slack`, `teams`, `email`, `webhook`, `json`).
* Upgrade legacy definitions with `NotifySchemaMigration.UpgradeTemplate(JsonNode)` to auto-apply the new schema version and ordering.
* Templates also record soft deletes via `deletedAt`; UI/API skip them by default while retaining revision history.

**Validation loop:**

```bash
# Validate Notify schemas and samples (matches Docs CI)
for schema in docs/notify/schemas/*.json; do
  npx ajv compile -c ajv-formats -s "$schema"
done

for sample in docs/notify/samples/*.sample.json; do
  schema="docs/notify/schemas/$(basename "${sample%.sample.json}").json"
  npx ajv validate -c ajv-formats -s "$schema" -d "$sample"
done
```

Integration tests can embed the sample fixtures to guarantee deterministic serialisation from the `StellaOps.Notify.Models` DTOs introduced in Sprint 15.

---

## 7 Validator Contracts

* For SBOM wrapper – `ISbomValidator` (DLL plug‑in) must return a *typed* error list.
* For YAML policies – JSON‑Schema at `/schemas/policy-v1.json`.
* For Rego – OPA `opa eval --fail-defined` under the hood.
* For **Free‑tier quotas** – `IQuotaService` integration tests ensure `quota:<token>` resets at UTC midnight and produces correct `Retry-After` headers.

---

## 8 Migration Notes

1. **Add `format` column** to existing SBOM wrappers; default to `trivy-json-v2`.
2. **Populate `layers` & `partial`** via backfill script (ships with the `stellopsctl migrate` wizard).
3. Policy YAML previously stored in Redis → copy to Mongo if persistence is enabled.
4. Prepare `attestations` collection (empty) – safe to create in advance.

---

## 9 Open Questions / Future Work

* How to de‑duplicate *identical* Rego policies differing only in whitespace?
* Embed *GOST 34.11‑2018* digests when users enable the Russian crypto suite?
* Should enterprise tiers share the same Redis quota keys or switch to a JWT claim `tier != Free` bypass?
* Evaluate a sliding‑window quota instead of a strict daily reset.
* Consider a rate limit for `/layers/missing` to avoid brute‑force enumeration.

---

## 10 Change Log

| Date       | Note                                                                           |
|------------|--------------------------------------------------------------------------------|
| 2025‑07‑14 | **Added:** `format`, `partial`, delta cache keys, YAML policy schema v1.0.     |
| 2025‑07‑12 | **Initial public draft** – SBOM wrapper, Redis keyspace, audit collections.    |

---