feat: Implement Scheduler Worker Options and Planner Loop
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled

- Added `SchedulerWorkerOptions` class to encapsulate configuration for the scheduler worker.
- Introduced `PlannerBackgroundService` to manage the planner loop, fetching and processing planning runs.
- Created `PlannerExecutionService` to handle the execution logic for planning runs, including impact targeting and run persistence.
- Developed `PlannerExecutionResult` and `PlannerExecutionStatus` to standardize execution outcomes.
- Implemented validation logic within `SchedulerWorkerOptions` to ensure proper configuration.
- Added documentation for the planner loop and impact targeting features.
- Established health check endpoints and authentication mechanisms for the Signals service.
- Created unit tests for the Signals API to ensure proper functionality and response handling.
- Configured options for authority integration and fallback authentication methods.
This commit is contained in:
2025-10-27 09:46:31 +02:00
parent 96d52884e8
commit 730354a1af
135 changed files with 10721 additions and 946 deletions

View File

@@ -49,15 +49,17 @@ Authority persists every issued token in MongoDB so operators can audit or revok
### Expectations for resource servers
Resource servers (Concelier WebService, Backend, Agent) **must not** assume in-memory caches are authoritative. They should:
- cache `/jwks` and `/revocations/export` responses within configured lifetimes;
- cache `/jwks` and `/revocations/export` responses within configured lifetimes;
- honour `revokedReason` metadata when shaping audit trails;
- treat `status != "valid"` or missing tokens as immediate denial conditions.
- propagate the `tenant` claim (`X-Stella-Tenant` header in REST calls) and reject requests when the tenant supplied by Authority does not match the resource server's scope; Concelier and Excititor guard endpoints refuse cross-tenant tokens.
### Tenant propagation
- Client provisioning (bootstrap or plug-in) accepts a `tenant` hint. Authority normalises the value (`trim().ToLowerInvariant()`) and persists it alongside the registration. Clients without an explicit tenant remain global.
- Issued principals include the `stellaops:tenant` claim. `PersistTokensHandler` mirrors this claim into `authority_tokens.tenant`, enabling per-tenant revocation and reporting.
- Rate limiter metadata now tags requests with `authority.tenant`, unlocking per-tenant throughput metrics and diagnostic filters. Audit events (`authority.client_credentials.grant`, `authority.password.grant`, bootstrap flows) surface the tenant and login attempt documents index on `{tenant, occurredAt}` for quick queries.
- Client credentials that request `advisory:ingest`, `advisory:read`, `vex:ingest`, `vex:read`, or `aoc:verify` now fail fast when the client registration lacks a tenant hint. Issued tokens are re-validated against persisted tenant metadata, and Authority rejects any cross-tenant replay (`invalid_client`/`invalid_token`), ensuring aggregation-only workloads remain tenant-scoped.
- Password grant flows reuse the client registration's tenant and enforce the configured scope allow-list. Requested scopes outside that list (or mismatched tenants) trigger `invalid_scope`/`invalid_client` failures, ensuring cross-tenant access is denied before token issuance.
### Default service scopes
@@ -66,7 +68,7 @@ Resource servers (Concelier WebService, Backend, Agent) **must not** assume in-m
|----------------------|---------------------------------------|--------------------------------------|-------------------|-----------------|
| `concelier-ingest` | Concelier raw advisory ingestion | `advisory:ingest`, `advisory:read` | `dpop` | `tenant-default` |
| `excitor-ingest` | Excititor raw VEX ingestion | `vex:ingest`, `vex:read` | `dpop` | `tenant-default` |
| `aoc-verifier` | Aggregation-only contract verification | `aoc:verify` | `dpop` | `tenant-default` |
| `aoc-verifier` | Aggregation-only contract verification | `aoc:verify`, `advisory:read`, `vex:read` | `dpop` | `tenant-default` |
| `cartographer-service` | Graph snapshot construction | `graph:write`, `graph:read` | `dpop` | `tenant-default` |
| `graph-api` | Graph Explorer gateway/API | `graph:read`, `graph:export`, `graph:simulate` | `dpop` | `tenant-default` |
| `vuln-explorer-ui` | Vuln Explorer UI/API | `vuln:read` | `dpop` | `tenant-default` |
@@ -188,6 +190,13 @@ POST /internal/clients
For environments with multiple tenants, repeat the call per tenant-specific client (e.g. `concelier-tenant-a`, `concelier-tenant-b`) or append suffixes to the client identifier.
### Aggregation-only verification tokens
- Issue a dedicated client (e.g. `aoc-verifier`) with the scopes `aoc:verify`, `advisory:read`, and `vex:read` for each tenant that runs guard checks. Authority refuses to mint tokens for these scopes unless the client registration provides a tenant hint.
- The CLI (`stella aoc verify --tenant <tenant>`) and Console verification panel both call `/aoc/verify` on Concelier and Excititor. Tokens that omit the tenant claim or present a tenant that does not match the stored registration are rejected with `invalid_client`/`invalid_token`.
- Verification responses map guard failures to `ERR_AOC_00x` codes and Authority emits `authority.client_credentials.grant` + `authority.token.validate_access` audit records containing the tenant and scopes so operators can trace who executed a run.
- For air-gapped or offline replicas, pre-issue verification tokens per tenant and rotate them alongside ingest credentials; the guard endpoints never mutate data and remain safe to expose through the offline kit schedule.
## 7. Configuration Reference
| Section | Key | Description | Notes |

View File

@@ -1,12 +1,12 @@
# component_architecture_concelier.md — **StellaOps Concelier** (2025Q4)
# component_architecture_concelier.md — **StellaOps Concelier** (Sprint22)
> **Scope.** Implementationready architecture for **Concelier**: the vulnerability ingest/normalize/merge/export subsystem that produces deterministic advisory data for the Scanner + Policy + Excititor pipeline. Covers domain model, connectors, merge rules, storage schema, exports, APIs, performance, security, and test matrices.
> **Scope.** Implementation-ready architecture for **Concelier**: the advisory ingestion and Link-Not-Merge (LNM) observation pipeline that produces deterministic raw observations, correlation linksets, and evidence events consumed by Policy Engine, Console, CLI, and Export centers. Covers domain models, connectors, observation/linkset builders, storage schema, events, APIs, performance, security, and test matrices.
---
## 0) Mission & boundaries
**Mission.** Acquire authoritative **vulnerability advisories** (vendor PSIRTs, distros, OSS ecosystems, CERTs), normalize them into a **canonical model**, reconcile aliases and version ranges, and export **deterministic artifacts** (JSON, Trivy DB) for fast backend joins.
**Mission.** Acquire authoritative **vulnerability advisories** (vendor PSIRTs, distros, OSS ecosystems, CERTs), persist them as immutable **observations** under the Aggregation-Only Contract (AOC), construct **linksets** that correlate observations without merging or precedence, and export deterministic evidence bundles (JSON, Trivy DB, Offline Kit) for downstream policy evaluation and operator tooling.
**Boundaries.**
@@ -21,10 +21,12 @@
**Process shape:** single ASP.NET Core service `StellaOps.Concelier.WebService` hosting:
* **Scheduler** with distributed locks (Mongo backed).
* **Connectors** (fetch/parse/map).
* **Merger** (canonical record assembly + precedence).
* **Exporters** (JSON, Trivy DB).
* **Minimal REST** for health/status/trigger/export.
* **Connectors** (fetch/parse/map) that emit immutable observation candidates.
* **Observation writer** enforcing AOC invariants via `AOCWriteGuard`.
* **Linkset builder** that correlates observations into `advisory_linksets` and annotates conflicts.
* **Event publisher** emitting `advisory.observation.updated` and `advisory.linkset.updated` messages.
* **Exporters** (JSON, Trivy DB, Offline Kit slices) fed from observation/linkset stores.
* **Minimal REST** for health/status/trigger/export and observation/linkset reads.
**Scale:** HA by running N replicas; **locks** prevent overlapping jobs per source/exporter.
@@ -36,113 +38,96 @@
### 2.1 Core entities
**Advisory**
#### AdvisoryObservation
```
advisoryId // internal GUID
advisoryKey // stable string key (e.g., CVE-2025-12345 or vendor ID)
title // short title (best-of from sources)
summary // normalized summary (English; i18n optional)
published // earliest source timestamp
modified // latest source timestamp
severity // normalized {none, low, medium, high, critical}
cvss // {v2?, v3?, v4?} objects (vector, baseScore, severity, source)
exploitKnown // bool (e.g., KEV/active exploitation flags)
references[] // typed links (advisory, kb, patch, vendor, exploit, blog)
sources[] // provenance for traceability (doc digests, URIs)
```
**Alias**
```
advisoryId
scheme // CVE, GHSA, RHSA, DSA, USN, MSRC, etc.
value // e.g., "CVE-2025-12345"
```
**Affected**
```
advisoryId
productKey // canonical product identity (see 2.2)
rangeKind // semver | evr | nvra | apk | rpm | deb | generic | exact
introduced? // string (format depends on rangeKind)
fixed? // string (format depends on rangeKind)
lastKnownSafe? // optional explicit safe floor
arch? // arch or platform qualifier if source declares (x86_64, aarch64)
distro? // distro qualifier when applicable (rhel:9, debian:12, alpine:3.19)
ecosystem? // npm|pypi|maven|nuget|golang|…
notes? // normalized notes per source
```
**Reference**
```
advisoryId
url
kind // advisory | patch | kb | exploit | mitigation | blog | cvrf | csaf
sourceTag // e.g., vendor/redhat, distro/debian, oss/ghsa
```
**MergeEvent**
```
advisoryKey
beforeHash // canonical JSON hash before merge
afterHash // canonical JSON hash after merge
mergedAt
inputs[] // source doc digests that contributed
```
**AdvisoryStatement (event log)**
```
statementId // GUID (immutable)
vulnerabilityKey // canonical advisory key (e.g., CVE-2025-12345)
advisoryKey // merge snapshot advisory key (may reference variant)
statementHash // canonical hash of advisory payload
asOf // timestamp of snapshot (UTC)
recordedAt // persistence timestamp (UTC)
inputDocuments[] // document IDs contributing to the snapshot
payload // canonical advisory document (BSON / canonical JSON)
```
**AdvisoryConflict**
```
conflictId // GUID
vulnerabilityKey // canonical advisory key
conflictHash // deterministic hash of conflict payload
asOf // timestamp aligned with originating statement set
recordedAt // persistence timestamp
statementIds[] // related advisoryStatement identifiers
details // structured conflict explanation / merge reasoning
```
- `AdvisoryEventLog` (Concelier.Core) provides the public API for appending immutable statements/conflicts and querying replay history. Inputs are normalized by trimming and lower-casing `vulnerabilityKey`, serializing advisories with `CanonicalJsonSerializer`, and computing SHA-256 hashes (`statementHash`, `conflictHash`) over the canonical JSON payloads. Consumers can replay by key with an optional `asOf` filter to obtain deterministic snapshots ordered by `asOf` then `recordedAt`.
- Conflict explainers are serialized as deterministic `MergeConflictExplainerPayload` records (type, reason, source ranks, winning values); replay clients can parse the payload to render human-readable rationales without re-computing precedence.
- Concelier.WebService exposes the immutable log via `GET /concelier/advisories/{vulnerabilityKey}/replay[?asOf=UTC_ISO8601]`, returning the latest statements (with hex-encoded hashes) and any conflict explanations for downstream exporters and APIs.
**AdvisoryObservation (new in Sprint 24)**
```
observationId // deterministic id: {tenant}:{source}:{upstreamId}:{revision}
```jsonc
observationId // deterministic id: {tenant}:{source.vendor}:{upstreamId}:{revision}
tenant // issuing tenant (lower-case)
source{vendor,stream,api,collectorVersion}
source{
vendor, stream, api, collectorVersion
}
upstream{
upstreamId, documentVersion, contentHash,
fetchedAt, receivedAt, signature{present,format,keyId,signature}}
content{format,specVersion,raw,metadata}
linkset{aliases[], purls[], cpes[], references[{type,url}]}
upstreamId, documentVersion, fetchedAt, receivedAt,
contentHash, signature{present, format?, keyId?, signature?}
}
content{
format, specVersion, raw, metadata?
}
identifiers{
cve?, ghsa?, vendorIds[], aliases[]
}
linkset{
purls[], cpes[], aliases[], references[{type,url}],
reconciledFrom[]
}
createdAt // when Concelier recorded the observation
attributes // optional provenance metadata (e.g., batch, connector)
```
attributes // optional provenance metadata (batch ids, ingest cursor)
```jsonc
The observation is an immutable projection of the raw ingestion document (post provenance validation, pre-merge) that powers LinkNotMerge overlays and Vuln Explorer. Observations live in the `advisory_observations` collection, keyed by tenant + upstream identity. `linkset` provides normalized aliases/PURLs/CPES that downstream services (Graph/Vuln Explorer) join against without triggering merge logic. Concelier.Core exposes strongly-typed models (`AdvisoryObservation`, `AdvisoryObservationLinkset`, etc.) and a Mongo-backed store for filtered queries by tenant/alias; this keeps overlay consumers read-only while preserving AOC guarantees.
#### AdvisoryLinkset
**ExportState**
```jsonc
linksetId // sha256 over sorted (tenant, product/vuln tuple, observation ids)
tenant
key{
vulnerabilityId,
productKey,
confidence // low|medium|high
}
observations[] = [
{
observationId,
sourceVendor,
statement{
status?, severity?, references?, notes?
},
collectedAt
}
]
aliases{
primary,
others[]
}
purls[]
cpes[]
conflicts[]? // see AdvisoryLinksetConflict
createdAt
updatedAt
```jsonc
```
#### AdvisoryLinksetConflict
```jsonc
conflictId // deterministic hash
type // severity-mismatch | affected-range-divergence | reference-clash | alias-inconsistency | metadata-gap
field? // optional JSON pointer (e.g., /statement/severity/vector)
observations[] // per-source values contributing to the conflict
confidence // low|medium|high (heuristic weight)
detectedAt
```jsonc
#### ObservationEvent / LinksetEvent
```jsonc
eventId // ULID
tenant
type // advisory.observation.updated | advisory.linkset.updated
key{
observationId? // on observation event
linksetId? // on linkset event
vulnerabilityId?,
productKey?
}
delta{
added[], removed[], changed[] // normalized summary for consumers
}
hash // canonical hash of serialized delta payload
occurredAt
```jsonc
#### ExportState
```jsonc
exportKind // json | trivydb
baseExportId? // last full baseline
baseDigest? // digest of last full baseline
@@ -150,7 +135,9 @@ lastFullDigest? // digest of last full export
lastDeltaDigest? // digest of last delta export
cursor // per-kind incremental cursor
files[] // last manifest snapshot (path → sha256)
```
```jsonc
Legacy `Advisory`, `Affected`, and merge-centric entities remain in the repository for historical exports and replay but are being phased out as Link-Not-Merge takes over. New code paths must interact with `AdvisoryObservation` / `AdvisoryLinkset` exclusively and emit conflicts through the structured payloads described above.
### 2.2 Product identity (`productKey`)
@@ -193,7 +180,7 @@ public interface IFeedConnector {
Task ParseAsync(IServiceProvider sp, CancellationToken ct); // -> dto collection (validated)
Task MapAsync(IServiceProvider sp, CancellationToken ct); // -> advisory/alias/affected/reference
}
```
```jsonc
* **Fetch**: windowed (cursor), conditional GET (ETag/LastModified), retry/backoff, rate limiting.
* **Parse**: schema validation (JSON Schema, XSD/CSAF), content type checks; write **DTO** with normalized casing.
@@ -215,63 +202,106 @@ public interface IFeedConnector {
---
## 5) Merge engine
## 5) Observation & linkset pipeline
### 5.1 Keying & identity
> **Goal:** deterministically ingest raw documents into immutable observations, correlate them into evidence-rich linksets, and broadcast changes without precedence or mutation.
* Identity graph: **CVE** is primary node; vendor/distro IDs resolved via **Alias** edges (from connectors and Conceliers alias tables).
* `advisoryKey` is the canonical primary key (CVE if present, else vendor/distro key).
### 5.1 Observation flow
### 5.2 Merge algorithm (deterministic)
1. **Connector fetch/parse/map** connectors download upstream payloads, validate signatures, and map to DTOs (identifiers, references, raw payload, provenance).
2. **AOC guard** `AOCWriteGuard` verifies forbidden keys, provenance completeness, tenant claims, timestamp normalization, and content hash idempotency. Violations raise `ERR_AOC_00x` mapped to structured logs and metrics.
3. **Append-only write** observations insert into `advisory_observations`; duplicates by `(tenant, source.vendor, upstream.upstreamId, upstream.contentHash)` become no-ops; new content for same upstream id creates a supersedes chain.
4. **Change feed + event** Mongo change streams trigger `advisory.observation.updated@1` events with deterministic payloads (IDs, hash, supersedes pointer, linkset summary). Policy Engine, Offline Kit builder, and guard dashboards subscribe.
1. **Gather** all rows for `advisoryKey` (across sources).
2. **Select title/summary** by precedence source (vendor>distro>ecosystem>cert).
3. **Union aliases** (dedupe by scheme+value).
4. **Merge `Affected`** with rules:
### 5.2 Linkset correlation
* Prefer **vendor** ranges for vendor products; prefer **distro** for **distroshipped** packages.
* If both exist for same `productKey`, keep **both**; mark `sourceTag` and `precedence` so **Policy** can decide.
* Never collapse range semantics across different families (e.g., rpm EVR vs semver).
5. **CVSS/severity**: record all CVSS sets; compute **effectiveSeverity** = max (unless policy override).
6. **References**: union with type precedence (advisory > patch > kb > exploit > blog); dedupe by URL; preserve `sourceTag`.
7. Produce **canonical JSON**; compute **afterHash**; store **MergeEvent** with inputs and hashes.
1. **Queue** observation deltas enqueue correlation jobs keyed by `(tenant, vulnerabilityId, productKey)` candidates derived from identifiers + alias graph.
2. **Canonical grouping** builder resolves aliases using Conceliers alias store and deterministic heuristics (vendor > distro > cert), deriving normalized product keys (purl preferred) and confidence scores.
3. **Linkset materialization** `advisory_linksets` documents store sorted observation references, alias sets, product keys, range metadata, and conflict payloads. Writes are idempotent; unchanged hashes skip updates.
4. **Conflict detection** builder emits structured conflicts (`severity-mismatch`, `affected-range-divergence`, `reference-clash`, `alias-inconsistency`, `metadata-gap`). Conflicts carry per-observation values for explainability.
5. **Event emission** `advisory.linkset.updated@1` summarizes deltas (`added`, `removed`, `changed` observation IDs, conflict updates, confidence changes) and includes a canonical hash for replay validation.
> The merge is **pure** given inputs. Any change in inputs or precedence matrices changes the **hash** predictably.
### 5.3 Event contract
| Event | Schema | Notes |
|-------|--------|-------|
| `advisory.observation.updated@1` | `events/advisory.observation.updated@1.json` | Fired on new or superseded observations. Includes `observationId`, source metadata, `linksetSummary` (aliases/purls), supersedes pointer (if any), SHA-256 hash, and `traceId`. |
| `advisory.linkset.updated@1` | `events/advisory.linkset.updated@1.json` | Fired when correlation changes. Includes `linksetId`, `key{vulnerabilityId, productKey, confidence}`, observation deltas, conflicts, `updatedAt`, and canonical hash. |
Events are emitted via NATS (primary) and Redis Stream (fallback). Consumers acknowledge idempotently using the hash; duplicates are safe. Offline Kit captures both topics during bundle creation for air-gapped replay.
---
## 6) Storage schema (MongoDB)
**Collections & indexes**
### Collections & indexes (LNM path)
* `source` `{_id, type, baseUrl, enabled, notes}`
* `source_state` `{sourceName(unique), enabled, cursor, lastSuccess, backoffUntil, paceOverrides}`
* `document` `{_id, sourceName, uri, fetchedAt, sha256, contentType, status, metadata, gridFsId?, etag?, lastModified?}`
* `concelier.sources` `{_id, type, baseUrl, enabled, notes}` connector catalog.
* `concelier.source_state` `{sourceName(unique), enabled, cursor, lastSuccess, backoffUntil, paceOverrides}` run-state (TTL indexes on `backoffUntil`).
* `concelier.documents` `{_id, sourceName, uri, fetchedAt, sha256, contentType, status, metadata, gridFsId?, etag?, lastModified?}` raw payload registry.
* Indexes: `{sourceName:1, uri:1}` unique; `{fetchedAt:-1}` for recent fetches.
* `concelier.dto` `{_id, sourceName, documentId, schemaVer, payload, validatedAt}` normalized connector DTOs used for replay.
* Index: `{sourceName:1, documentId:1}`.
* `concelier.advisory_observations`
* Index: `{sourceName:1, uri:1}` unique, `{fetchedAt:-1}`
* `dto` `{_id, sourceName, documentId, schemaVer, payload, validatedAt}`
```
{
_id: "tenant:vendor:upstreamId:revision",
tenant,
source: { vendor, stream, api, collectorVersion },
upstream: { upstreamId, documentVersion, fetchedAt, receivedAt, contentHash, signature },
content: { format, specVersion, raw, metadata? },
identifiers: { cve?, ghsa?, vendorIds[], aliases[] },
linkset: { purls[], cpes[], aliases[], references[], reconciledFrom[] },
supersedes?: "prevObservationId",
createdAt,
attributes?: object
}
```
* Index: `{sourceName:1, documentId:1}`
* `advisory` `{_id, advisoryKey, title, summary, published, modified, severity, cvss, exploitKnown, sources[]}`
* Indexes: `{tenant:1, upstream.upstreamId:1}`, `{tenant:1, source.vendor:1, linkset.purls:1}`, `{tenant:1, linkset.aliases:1}`, `{tenant:1, createdAt:-1}`.
* `concelier.advisory_linksets`
* Index: `{advisoryKey:1}` unique, `{modified:-1}`, `{severity:1}`, text index (title, summary)
* `alias` `{advisoryId, scheme, value}`
```
{
_id: "sha256:...",
tenant,
key: { vulnerabilityId, productKey, confidence },
observations: [
{ observationId, sourceVendor, statement, collectedAt }
],
aliases: { primary, others: [] },
purls: [],
cpes: [],
conflicts: [],
createdAt,
updatedAt
}
```
* Index: `{scheme:1,value:1}`, `{advisoryId:1}`
* `affected` `{advisoryId, productKey, rangeKind, introduced?, fixed?, arch?, distro?, ecosystem?}`
* Indexes: `{tenant:1, key.vulnerabilityId:1, key.productKey:1}`, `{tenant:1, purls:1}`, `{tenant:1, aliases.primary:1}`, `{tenant:1, updatedAt:-1}`.
* `concelier.advisory_events`
* Index: `{productKey:1}`, `{advisoryId:1}`, `{productKey:1, rangeKind:1}`
* `reference` `{advisoryId, url, kind, sourceTag}`
```
{
_id: ObjectId,
tenant,
type: "advisory.observation.updated" | "advisory.linkset.updated",
key,
delta,
hash,
occurredAt
}
```
* Index: `{advisoryId:1}`, `{kind:1}`
* `merge_event` `{advisoryKey, beforeHash, afterHash, mergedAt, inputs[]}`
* Index: `{advisoryKey:1, mergedAt:-1}`
* `export_state` `{_id(exportKind), baseExportId?, baseDigest?, lastFullDigest?, lastDeltaDigest?, cursor, files[]}`
* TTL index on `occurredAt` (configurable retention), `{type:1, occurredAt:-1}` for replay.
* `concelier.export_state` `{_id(exportKind), baseExportId?, baseDigest?, lastFullDigest?, lastDeltaDigest?, cursor, files[]}`
* `locks` `{_id(jobKey), holder, acquiredAt, heartbeatAt, leaseMs, ttlAt}` (TTL cleans dead locks)
* `jobs` `{_id, type, args, state, startedAt, heartbeatAt, endedAt, error}`
**GridFS buckets**: `fs.documents` for raw payloads.
**Legacy collections** (`advisory`, `alias`, `affected`, `reference`, `merge_event`) remain read-only during the migration window to support back-compat exports. New code must not write to them; scheduled cleanup removes them after Link-Not-Merge GA.
**GridFS buckets**: `fs.documents` for raw payloads (immutable); `fs.exports` for historical JSON/Trivy archives.
---
@@ -287,7 +317,7 @@ public interface IFeedConnector {
* Builds Bolt DB archives compatible with Trivy; supports **full** and **delta** modes.
* In delta, unchanged blobs are reused from the base; metadata captures:
```
```json
{
"mode": "delta|full",
"baseExportId": "...",
@@ -409,7 +439,7 @@ concelier:
## 10) Security & compliance
* **Outbound allowlist** per connector (domains, protocols); proxy support; TLS pinning where possible.
* **Signature verification** for raw docs (PGP/cosign/x509) with results stored in `document.metadata.sig`. Docs failing verification may still be ingested but flagged; **merge** can downweight or ignore them by config.
* **Signature verification** for raw docs (PGP/cosign/x509) with results stored in `document.metadata.sig`. Docs failing verification may still be ingested but flagged; Policy Engine or downstream policy can down-weight them.
* **No secrets in logs**; auth material via `env:` or mounted files; HTTP redaction of `Authorization` headers.
* **Multitenant**: pertenant DBs or prefixes; pertenant S3 prefixes; tenantscoped API tokens.
* **Determinism**: canonical JSON writer; export digests stable across runs given same inputs.
@@ -419,8 +449,9 @@ concelier:
## 11) Performance targets & scale
* **Ingest**: ≥ 5k documents/min on 4 cores (CSAF/OpenVEX/JSON).
* **Normalize/map**: ≥ 50k `Affected` rows/min on 4 cores.
* **Merge**: ≤ 10ms P95 per advisory at steadystate updates.
* **Normalize/map**: ≥ 50k observation statements/min on 4 cores.
* **Observation write**: ≤ 5ms P95 per document (including guard + Mongo write).
* **Linkset build**: ≤ 15ms P95 per `(vulnerabilityId, productKey)` update, even with 20+ contributing observations.
* **Export**: 1M advisories JSON in ≤ 90s (streamed, zstd), Trivy DB in ≤ 60s on 8 cores.
* **Memory**: hard cap per job; chunked streaming writers; backpressure to avoid GC spikes.
@@ -435,11 +466,13 @@ concelier:
* `concelier.fetch.docs_total{source}`
* `concelier.fetch.bytes_total{source}`
* `concelier.parse.failures_total{source}`
* `concelier.map.affected_total{source}`
* `concelier.merge.changed_total`
* `concelier.map.statements_total{source}`
* `concelier.observations.write_total{result=ok|noop|error}`
* `concelier.linksets.updated_total{result=ok|skip|error}`
* `concelier.linksets.conflicts_total{type}`
* `concelier.export.bytes{kind}`
* `concelier.export.duration_seconds{kind}`
* **Tracing** around fetch/parse/map/merge/export.
* **Tracing** around fetch/parse/map/observe/linkset/export.
* **Logs**: structured with `source`, `uri`, `docDigest`, `advisoryKey`, `exportId`.
---
@@ -448,7 +481,7 @@ concelier:
* **Connectors:** fixture suites for each provider/format (happy path; malformed; signature fail).
* **Version semantics:** EVR vs dpkg vs semver edge cases (epoch bumps, tilde versions, prereleases).
* **Merge:** conflicting sources (vendor vs distro vs OSV); verify precedence & dual retention.
* **Linkset correlation:** multi-source conflicts (severity, range, alias) produce deterministic conflict payloads; ensure confidence scoring stable.
* **Export determinism:** byteforbyte stable outputs across runs; digest equality.
* **Performance:** soak tests with 1M advisories; cap memory; verify backpressure.
* **API:** pagination, filters, RBAC, error envelopes (RFC 7807).
@@ -470,7 +503,8 @@ concelier:
* **Trigger all sources:** `POST /api/v1/concelier/sources/*/trigger`
* **Force full export JSON:** `POST /api/v1/concelier/exports/json { "full": true, "force": true }`
* **Force Trivy DB delta publish:** `POST /api/v1/concelier/exports/trivy { "full": false, "publish": true }`
* **Inspect advisory:** `GET /api/v1/concelier/advisories?scheme=CVE&value=CVE-2025-12345`
* **Inspect observation:** `GET /api/v1/concelier/observations/{observationId}`
* **Query linkset:** `GET /api/v1/concelier/linksets?vulnerabilityId=CVE-2025-12345&productKey=pkg:rpm/redhat/openssl`
* **Pause noisy source:** `POST /api/v1/concelier/sources/osv/pause`
---
@@ -482,4 +516,3 @@ concelier:
3. **Attestation handoff**: integrate with **Signer/Attestor** (optional).
4. **Scale & diagnostics**: provider dashboards, staleness alerts, export cache reuse.
5. **Offline kit**: endtoend verified bundles for airgap.

View File

@@ -1,17 +1,17 @@
# component_architecture_excititor.md — **StellaOps Excititor** (2025Q4)
> **Scope.** This document specifies the **Excititor** service: its purpose, trust model, data structures, APIs, plugin contracts, storage schema, normalization/consensus algorithms, performance budgets, testing matrix, and how it integrates with Scanner, Policy, Concelier, and the attestation chain. It is implementationready.
# component_architecture_excititor.md — **StellaOps Excititor** (Sprint22)
> **Scope.** This document specifies the **Excititor** service: its purpose, trust model, data structures, observation/linkset pipelines, APIs, plug-in contracts, storage schema, performance budgets, testing matrix, and how it integrates with Concelier, Policy Engine, and evidence surfaces. It is implementation-ready.
---
## 0) Mission & role in the platform
**Mission.** Convert heterogeneous **VEX** statements (OpenVEX, CSAF VEX, CycloneDX VEX; vendor/distro/platform sources) into **canonical, queryable claims**; compute **deterministic consensus** per *(vuln, product)*; preserve **conflicts with provenance**; publish **stable, attestable exports** that the backend uses to suppress nonexploitable findings, prioritize remaining risk, and explain decisions.
**Mission.** Convert heterogeneous **VEX** statements (OpenVEX, CSAF VEX, CycloneDX VEX; vendor/distro/platform sources) into immutable **VEX observations**, correlate them into **linksets** that retain provenance/conflicts without precedence, and publish deterministic evidence exports and events that Policy Engine, Console, and CLI use to suppress or explain findings.
**Boundaries.**
* Excititor **does not** decide PASS/FAIL. It supplies **evidence** (statuses + justifications + provenance weights).
* Excititor preserves **conflicting claims** unchanged; consensus encodes how we would pick, but the raw set is always exportable.
* Excititor preserves **conflicting observations** unchanged; consensus (when enabled) merely annotates how policy might choose, but raw evidence remains exportable.
* VEX consumption is **backendonly**: Scanner never applies VEX. The backends **Policy Engine** asks Excititor for status evidence and then decides what to show.
---
@@ -27,38 +27,121 @@
All connectors register **source metadata**: provider identity, trust tier, signature expectations (PGP/cosign/PKI), fetch windows, rate limits, and time anchors.
### 1.2 Canonical model (normalized)
Every incoming statement becomes a set of **VexClaim** records:
```
VexClaim
- providerId // 'redhat', 'suse', 'ubuntu', 'github', 'vendorX'
- vulnId // 'CVE-2025-12345', 'GHSA-xxxx', canonicalized
- productKey // canonical product identity (see §2.2)
- status // affected | not_affected | fixed | under_investigation
- justification? // for 'not_affected'/'affected' where provided
- introducedVersion? // semantics per provider (range or exact)
- fixedVersion? // where provided (range or exact)
- lastObserved // timestamp from source or fetch time
- provenance // doc digest, signature status, fetch URI, line/offset anchors
- evidence[] // raw source snippets for explainability
- supersedes? // optional cross-doc chain (docDigest → docDigest)
```
### 1.3 Exports (consumption)
* **VexConsensus** per `(vulnId, productKey)` with:
* `rollupStatus` (after policy weights/justification gates),
* `sources[]` (winning + losing claims with weights & reasons),
* `policyRevisionId` (identifier of the Excititor policy used),
* `consensusDigest` (stable SHA256 over canonical JSON).
* **Raw claims** export for auditing (unchanged, with provenance).
* **Provider snapshots** (per source, last N days) for operator debugging.
* **Index** optimized for backend joins: `(productKey, vulnId) → (status, confidence, sourceSet)`.
All exports are **deterministic**, and (optionally) **attested** via DSSE and logged to Rekor v2.
### 1.2 Canonical model (observations & linksets)
#### VexObservation
```jsonc
observationId // {tenant}:{providerId}:{upstreamId}:{revision}
tenant
providerId // e.g., redhat, suse, ubuntu, osv
streamId // connector stream (csaf, openvex, cyclonedx, attestation)
upstream{
upstreamId,
documentVersion?,
fetchedAt,
receivedAt,
contentHash,
signature{present, format?, keyId?, signature?}
}
statements[
{
vulnerabilityId,
productKey,
status, // affected | not_affected | fixed | under_investigation
justification?,
introducedVersion?,
fixedVersion?,
lastObserved,
locator?, // JSON Pointer/line for provenance
evidence?[]
}
]
content{
format,
specVersion?,
raw
}
linkset{
aliases[], // CVE/GHSA/vendor IDs
purls[],
cpes[],
references[{type,url}],
reconciledFrom[]
}
supersedes?
createdAt
attributes?
```
#### VexLinkset
```jsonc
linksetId // sha256 over sorted (tenant, vulnId, productKey, observationIds)
tenant
key{
vulnerabilityId,
productKey,
confidence // low|medium|high
}
observations[] = [
{
observationId,
providerId,
status,
justification?,
introducedVersion?,
fixedVersion?,
evidence?,
collectedAt
}
]
aliases{
primary,
others[]
}
purls[]
cpes[]
conflicts[]? // see VexLinksetConflict
createdAt
updatedAt
```
#### VexLinksetConflict
```jsonc
conflictId
type // status-mismatch | justification-divergence | version-range-clash | non-joinable-overlap | metadata-gap
field? // optional pointer for UI rendering
statements[] // per-observation values with providerId + status/justification/version data
confidence
detectedAt
```
#### VexConsensus (optional)
```jsonc
consensusId // sha256(vulnerabilityId, productKey, policyRevisionId)
vulnerabilityId
productKey
rollupStatus // derived by Excititor policy adapter (linkset aware)
sources[] // observation references with weight, accepted flag, reason
policyRevisionId
evaluatedAt
consensusDigest
```
Consensus persists only when Excititor policy adapters require pre-computed rollups (e.g., Offline Kit). Policy Engine can also compute consensus on demand from linksets.
### 1.3 Exports & evidence bundles
* **Raw observations** — JSON tree per observation for auditing/offline.
* **Linksets** — grouped evidence for policy/Console/CLI consumption.
* **Consensus (optional)** — if enabled, mirrors existing API contracts.
* **Provider snapshots** — last N days of observations per provider to support diagnostics.
* **Index** — `(productKey, vulnerabilityId) → {status candidates, confidence, observationIds}` for high-speed joins.
All exports remain deterministic and, when configured, attested via DSSE + Rekor v2.
---
@@ -98,73 +181,106 @@ enabled: bool
createdAt, modifiedAt
```
**`vex.raw`** (immutable raw documents)
**`vex.raw`** (immutable raw documents)
```
_id: sha256(doc bytes)
providerId
uri
ingestedAt
contentType
sig: { verified: bool, method: pgp|cosign|x509|none, keyId|certSubject, bundle? }
payload: GridFS pointer (if large)
disposition: kept|replaced|superseded
correlation: { replaces?: sha256, replacedBy?: sha256 }
```
**`vex.observations`**
```
{
_id: "tenant:providerId:upstreamId:revision",
tenant,
providerId,
streamId,
upstream: { upstreamId, documentVersion?, fetchedAt, receivedAt, contentHash, signature },
statements: [
{
vulnerabilityId,
productKey,
status,
justification?,
introducedVersion?,
fixedVersion?,
lastObserved,
locator?,
evidence?
}
],
content: { format, specVersion?, raw },
linkset: { aliases[], purls[], cpes[], references[], reconciledFrom[] },
supersedes?,
createdAt,
attributes?
}
```
* Indexes: `{tenant:1, providerId:1, upstream.upstreamId:1}`, `{tenant:1, statements.vulnerabilityId:1}`, `{tenant:1, linkset.purls:1}`, `{tenant:1, createdAt:-1}`.
**`vex.linksets`**
```
{
_id: "sha256:...",
tenant,
key: { vulnerabilityId, productKey, confidence },
observations: [
{ observationId, providerId, status, justification?, introducedVersion?, fixedVersion?, evidence?, collectedAt }
],
aliases: { primary, others: [] },
purls: [],
cpes: [],
conflicts: [],
createdAt,
updatedAt
}
```
* Indexes: `{tenant:1, key.vulnerabilityId:1, key.productKey:1}`, `{tenant:1, purls:1}`, `{tenant:1, updatedAt:-1}`.
```
_id: sha256(doc bytes)
providerId
uri
ingestedAt
contentType
sig: { verified: bool, method: pgp|cosign|x509|none, keyId|certSubject, bundle? }
payload: GridFS pointer (if large)
disposition: kept|replaced|superseded
correlation: { replaces?: sha256, replacedBy?: sha256 }
```
**`vex.statements`** (immutable normalized rows; append-only event log)
```
_id: ObjectId
providerId
vulnId
productKey
status
justification?
introducedVersion?
fixedVersion?
lastObserved
docDigest
provenance { uri, line?, pointer?, signatureState }
evidence[] { key, value, locator }
signals? {
severity? { scheme, score?, label?, vector? }
kev?: bool
epss?: double
}
insertedAt
indices:
- {vulnId:1, productKey:1}
- {providerId:1, insertedAt:-1}
- {docDigest:1}
- {status:1}
- text index (optional) on evidence.value for debugging
```
**`vex.consensus`** (rollups)
```
_id: sha256(canonical(vulnId, productKey, policyRevision))
vulnId
productKey
rollupStatus
sources[]: [
{ providerId, status, justification?, weight, lastObserved, accepted:bool, reason }
]
policyRevisionId
evaluatedAt
signals? {
severity? { scheme, score?, label?, vector? }
kev?: bool
epss?: double
}
consensusDigest // same as _id
indices:
- {vulnId:1, productKey:1}
- {policyRevisionId:1, evaluatedAt:-1}
```
**`vex.exports`** (manifest of emitted artifacts)
**`vex.events`** (observation/linkset events, optional long retention)
```
{
_id: ObjectId,
tenant,
type: "vex.observation.updated" | "vex.linkset.updated",
key,
delta,
hash,
occurredAt
}
```
* Indexes: `{type:1, occurredAt:-1}`, TTL on `occurredAt` for configurable retention.
**`vex.consensus`** (optional rollups)
```
_id: sha256(canonical(vulnerabilityId, productKey, policyRevisionId))
vulnerabilityId
productKey
rollupStatus
sources[] // observation references with weights/reasons
policyRevisionId
evaluatedAt
signals? // optional severity/kev/epss hints
consensusDigest
```
* Indexes: `{vulnerabilityId:1, productKey:1}`, `{policyRevisionId:1, evaluatedAt:-1}`.
**`vex.exports`** (manifest of emitted artifacts)
```
_id
@@ -177,23 +293,15 @@ policyRevisionId
cacheable: bool
```
**`vex.cache`**
```
querySignature -> exportId (for fast reuse)
ttl, hits
```
**`vex.migrations`**
* ordered migrations applied at bootstrap to ensure indexes.
* `20251019-consensus-signals-statements` introduces the statements log indexes and the `policyRevisionId + evaluatedAt` lookup for consensus — rerun consensus writers once to hydrate newly persisted signals.
### 3.2 Indexing strategy
* Hot path queries use exact `(vulnId, productKey)` and timebounded windows; compound indexes cover both.
* Providers list view by `lastObserved` for monitoring staleness.
* `vex.consensus` keyed by `(vulnId, productKey, policyRevision)` for deterministic reuse.
**`vex.cache`** — observation/linkset export cache: `{querySignature, exportId, ttl, hits}`.
**`vex.migrations`** — ordered migrations ensuring new indexes (`20251027-linksets-introduced`, etc.).
### 3.2 Indexing strategy
* Hot path queries rely on `{tenant, key.vulnerabilityId, key.productKey}` covering linkset lookup.
* Observability queries use `{tenant, updatedAt}` to monitor staleness.
* Consensus (if enabled) keyed by `{vulnerabilityId, productKey, policyRevisionId}` for deterministic reuse.
---
@@ -202,30 +310,30 @@ ttl, hits
### 4.1 Connector contract
```csharp
public interface IVexConnector
{
string ProviderId { get; }
Task FetchAsync(VexConnectorContext ctx, CancellationToken ct); // raw docs
Task NormalizeAsync(VexConnectorContext ctx, CancellationToken ct); // raw -> VexClaim[]
}
```
* **Fetch** must implement: window scheduling, conditional GET (ETag/IfModifiedSince), rate limiting, retry/backoff.
* **Normalize** parses the format, validates schema, maps product identities deterministically, emits `VexClaim` records with **provenance**.
public interface IVexConnector
{
string ProviderId { get; }
Task FetchAsync(VexConnectorContext ctx, CancellationToken ct); // raw docs
Task NormalizeAsync(VexConnectorContext ctx, CancellationToken ct); // raw -> ObservationStatements[]
}
```
* **Fetch** must implement: window scheduling, conditional GET (ETag/IfModifiedSince), rate limiting, retry/backoff.
* **Normalize** parses the format, validates schema, maps product identities deterministically, emits observation statements with **provenance** metadata (locator, justification, version ranges).
### 4.2 Signature verification (per provider)
* **cosign (keyless or keyful)** for OCI referrers or HTTPserved JSON with Sigstore bundles.
* **PGP** (provider keyrings) for distro/vendor feeds that sign docs.
* **x509** (mutual TLS / providerpinned certs) where applicable.
* Signature state is stored on **vex.raw.sig** and copied into **provenance.signatureState** on claims.
> Claims from sources failing signature policy are marked `"signatureState.verified=false"` and **policy** can downweight or ignore them.
* Signature state is stored on **vex.raw.sig** and copied into `statements[].signatureState` so downstream policy can gate by verification result.
> Observation statements from sources failing signature policy are marked `"signatureState.verified=false"` and policy can down-weight or ignore them.
### 4.3 Time discipline
* For each doc, prefer **providers document timestamp**; if absent, use fetch time.
* Claims carry `lastObserved` which drives **tiebreaking** within equal weight tiers.
* Statements carry `lastObserved` which drives **tie-breaking** within equal weight tiers.
---
@@ -235,7 +343,7 @@ public interface IVexConnector
* **purl** first; **cpe** second; OS package NVRA/EVR mapping helpers (distro connectors) produce purls via canonical tables (e.g., rpm→purl:rpm, deb→purl:deb).
* Where a provider publishes **platformlevel** VEX (e.g., “RHEL 9 not affected”), connectors expand to known product inventory rules (e.g., map to sets of packages/components shipped in the platform). Expansion tables are versioned and kept per provider; every expansion emits **evidence** indicating the rule applied.
* If expansion would be speculative, the claim remains **platformscoped** with `productKey="platform:redhat:rhel:9"` and is flagged **nonjoinable**; backend can decide to use platform VEX only when Scanner proves the platform runtime.
* If expansion would be speculative, the statement remains **platform-scoped** with `productKey="platform:redhat:rhel:9"` and is flagged **non-joinable**; backend can decide to use platform VEX only when Scanner proves the platform runtime.
### 5.2 Status + justification mapping
@@ -254,11 +362,11 @@ public interface IVexConnector
## 6) Consensus algorithm
**Goal:** produce a **stable**, explainable `rollupStatus` per `(vulnId, productKey)` given possibly conflicting claims.
**Goal:** produce a **stable**, explainable `rollupStatus` per `(vulnId, productKey)` when consumers opt into Excititor-managed consensus derived from linksets.
### 6.1 Inputs
* Set **S** of `VexClaim` for the key.
* Set **S** of observation statements drawn from the current `VexLinkset` for `(tenant, vulnId, productKey)`.
* **Excititor policy snapshot**:
* **weights** per provider tier and per provider overrides.
@@ -268,19 +376,19 @@ public interface IVexConnector
### 6.2 Steps
1. **Filter invalid** claims by signature policy & justification gates → set `S'`.
2. **Score** each claim:
`score = weight(provider) * freshnessFactor(lastObserved)` where freshnessFactor ∈ [0.8, 1.0] for staleness decay (configurable; small effect).
3. **Aggregate** scores per status: `W(status) = Σ score(claims with that status)`.
1. **Filter invalid** statements by signature policy & justification gates → set `S'`.
2. **Score** each statement:
`score = weight(provider) * freshnessFactor(lastObserved)` where freshnessFactor ∈ [0.8, 1.0] for staleness decay (configurable; small effect). Observations lacking verified signatures receive policy-configured penalties.
3. **Aggregate** scores per status: `W(status) = Σ score(statements with that status)`.
4. **Pick** `rollupStatus = argmax_status W(status)`.
5. **Tiebreakers** (in order):
* Higher **max single** provider score wins (vendor > distro > platform > hub).
* More **recent** lastObserved wins.
* Deterministic lexicographic order of status (`fixed` > `not_affected` > `under_investigation` > `affected`) as final tiebreaker.
6. **Explain**: mark accepted sources (`accepted=true; reason="weight"`/`"freshness"`), mark rejected sources with explicit `reason` (`"insufficient_justification"`, `"signature_unverified"`, `"lower_weight"`).
6. **Explain**: mark accepted observations (`accepted=true; reason="weight"`/`"freshness"`/`"confidence"`) and rejected ones with explicit `reason` (`"insufficient_justification"`, `"signature_unverified"`, `"lower_weight"`, `"low_confidence_linkset"`).
> The algorithm is **pure** given S and policy snapshot; result is reproducible and hashed into `consensusDigest`.
> The algorithm is **pure** given `S` and policy snapshot; result is reproducible and hashed into `consensusDigest`.
---
@@ -291,9 +399,13 @@ All endpoints are versioned under `/api/v1/vex`.
### 7.1 Query (online)
```
POST /claims/search
body: { vulnIds?: string[], productKeys?: string[], providers?: string[], since?: timestamp, limit?: int, pageToken?: string }
→ { claims[], nextPageToken? }
POST /observations/search
body: { vulnIds?: string[], productKeys?: string[], providers?: string[], since?: timestamp, limit?: int, pageToken?: string }
→ { observations[], nextPageToken? }
POST /linksets/search
body: { vulnIds?: string[], productKeys?: string[], confidence?: string[], since?: timestamp, limit?: int, pageToken?: string }
→ { linksets[], nextPageToken? }
POST /consensus/search
body: { vulnIds?: string[], productKeys?: string[], policyRevisionId?: string, since?: timestamp, limit?: int, pageToken?: string }
@@ -301,7 +413,7 @@ POST /consensus/search
POST /excititor/resolve (scope: vex.read)
body: { productKeys?: string[], purls?: string[], vulnerabilityIds: string[], policyRevisionId?: string }
→ { policy, resolvedAt, results: [ { vulnerabilityId, productKey, status, sources[], conflicts[], decisions[], signals?, summary?, envelope: { artifact, contentSignature?, attestation?, attestationEnvelope?, attestationSignature? } } ] }
→ { policy, resolvedAt, results: [ { vulnerabilityId, productKey, status, observations[], conflicts[], linksetConfidence, consensus?, signals?, envelope? } ] }
```
### 7.2 Exports (cacheable snapshots)
@@ -407,17 +519,18 @@ Run the ingestion endpoint once after applying migration `20251019-consensus-sig
* **Connector allowlists**: outbound fetch constrained to configured domains.
* **Tenant isolation**: pertenant DB prefixes or separate DBs; pertenant S3 prefixes; pertenant policies.
* **AuthN/Z**: Authorityissued OpToks; RBAC roles (`vex.read`, `vex.admin`, `vex.export`).
* **No secrets in logs**; deterministic logging contexts include providerId, docDigest, claim keys.
* **No secrets in logs**; deterministic logging contexts include providerId, docDigest, observationId, and linksetId.
---
## 11) Performance & scale
* **Targets:**
* Normalize 10k VEX claims/minute/core.
* Consensus compute ≤ 50ms for 1k unique `(vuln, product)` pairs in hot cache.
* Export (consensus) 1M rows in ≤ 60s on 8 cores with streaming writer.
* **Targets:**
* Normalize 10k observation statements/minute/core.
* Linkset rebuild ≤ 20ms P95 for 1k unique `(vuln, product)` pairs in hot cache.
* Consensus (when enabled) compute ≤ 50ms for 1k unique `(vuln, product)` pairs.
* Export (observations + linksets) 1M rows in ≤60s on 8 cores with streaming writer.
* **Scaling:**
@@ -465,26 +578,29 @@ Excititor.Worker ships with a background refresh service that re-evaluates stale
## 12) Observability
* **Metrics:**
* `vex.ingest.docs_total{provider}`
* `vex.normalize.claims_total{provider}`
* `vex.signature.failures_total{provider,method}`
* `vex.consensus.conflicts_total{vulnId}`
* `vex.exports.bytes{format}` / `vex.exports.latency_seconds`
* **Tracing:** spans for fetch, verify, parse, map, consensus, export.
* **Dashboards:** provider staleness, top conflicting vulns/components, signature posture, export cache hitrate.
* **Metrics:**
* `vex.fetch.requests_total{provider}` / `vex.fetch.bytes_total{provider}`
* `vex.fetch.failures_total{provider,reason}` / `vex.signature.failures_total{provider,method}`
* `vex.normalize.statements_total{provider}`
* `vex.observations.write_total{result}`
* `vex.linksets.updated_total{result}` / `vex.linksets.conflicts_total{type}`
* `vex.consensus.rollup_total{status}` (when enabled)
* `vex.exports.bytes_total{format}` / `vex.exports.latency_seconds{format}`
* **Tracing:** spans for fetch, verify, parse, map, observe, linkset, consensus, export.
* **Dashboards:** provider staleness, linkset conflict hot spots, signature posture, export cache hit-rate.
---
## 13) Testing matrix
* **Connectors:** golden raw docs → deterministic claims (fixtures per provider/format).
* **Connectors:** golden raw docs → deterministic observation statements (fixtures per provider/format).
* **Signature policies:** valid/invalid PGP/cosign/x509 samples; ensure rejects are recorded but not accepted.
* **Normalization edge cases:** platformonly claims, freetext justifications, nonpurl products.
* **Consensus:** conflict scenarios across tiers; check tiebreakers; justification gates.
* **Performance:** 1Mrow export timing; memory ceilings; stream correctness.
* **Determinism:** same inputs + policy → identical `consensusDigest` and export bytes.
* **Normalization edge cases:** platform-scoped statements, free-text justifications, non-purl products.
* **Linksets:** conflict scenarios across tiers; verify confidence scoring + conflict payload stability.
* **Consensus (optional):** ensure tie-breakers honour policy weights/justification gates.
* **Performance:** 1M-row observation/linkset export timing; memory ceilings; stream correctness.
* **Determinism:** same inputs + policy → identical linkset hashes, conflict payloads, optional `consensusDigest`, and export bytes.
* **API contract tests:** pagination, filters, RBAC, rate limits.
---
@@ -493,8 +609,8 @@ Excititor.Worker ships with a background refresh service that re-evaluates stale
* **Backend Policy Engine** (in Scanner.WebService): calls `POST /excititor/resolve` (scope `vex.read`) with batched `(purl, vulnId)` pairs to fetch `rollupStatus + sources`.
* **Concelier**: provides alias graph (CVE↔vendor IDs) and may supply VEXadjacent metadata (e.g., KEV flag) for policy escalation.
* **UI**: VEX explorer screens use `/claims/search` and `/consensus/search`; show conflicts & provenance.
* **CLI**: `stellaops vex export --consensus --since 7d --out vex.json` for audits.
* **UI**: VEX explorer screens use `/observations/search`, `/linksets/search`, and `/consensus/search`; show conflicts & provenance.
* **CLI**: `stella vex linksets export --since 7d --out vex-linksets.json` (optionally `--include-consensus`) for audits and Offline Kit parity.
---

View File

@@ -58,6 +58,8 @@ Everything here is opensource and versioned— when you check out a git ta
- **10[Scanner Cache Configuration](dev/SCANNER_CACHE_CONFIGURATION.md)**
- **30[Excititor Connector Packaging Guide](dev/30_EXCITITOR_CONNECTOR_GUIDE.md)**
- **31[Aggregation-Only Contract Reference](ingestion/aggregation-only-contract.md)**
- **31[Advisory Observations & Linksets](advisories/aggregation.md)**
- **31[VEX Observations & Linksets](vex/aggregation.md)**
- **30Developer Templates**
- [Excititor Connector Skeleton](dev/templates/excititor-connector/)
- **11[Authority Service](11_AUTHORITY.md)**
@@ -65,28 +67,34 @@ Everything here is opensource and versioned— when you check out a git ta
- **12[Performance Workbook](12_PERFORMANCE_WORKBOOK.md)**
- **13[ReleaseEngineering Playbook](13_RELEASE_ENGINEERING_PLAYBOOK.md)**
- **20[CLI AOC Commands Reference](cli/cli-reference.md)**
- **20[Console CLI Parity Matrix](cli-vs-ui-parity.md)**
- **60[Policy Engine Overview](policy/overview.md)**
- **61[Policy DSL Grammar](policy/dsl.md)**
- **62[Policy Lifecycle & Approvals](policy/lifecycle.md)**
- **63[Policy Runs & Orchestration](policy/runs.md)**
- **64[Policy Engine REST API](api/policy.md)**
- **65[Policy CLI Guide](cli/policy.md)**
- **66[Policy Editor Workspace](ui/policy-editor.md)**
- **67[Policy Observability](observability/policy.md)**
- **68[Policy Governance & Least Privilege](security/policy-governance.md)**
- **69[Policy Examples](examples/policies/README.md)**
- **70[Policy FAQ](faq/policy-faq.md)**
- **71[Policy Run DTOs](../src/StellaOps.Scheduler.Models/docs/SCHED-MODELS-20-001-POLICY-RUNS.md)**
- **64[Policy Exception Effects](policy/exception-effects.md)**
- **65[Policy Engine REST API](api/policy.md)**
- **66[Policy CLI Guide](cli/policy.md)**
- **67[Policy Editor Workspace](ui/policy-editor.md)**
- **68[Policy Observability](observability/policy.md)**
- **69[Console Observability](observability/ui-telemetry.md)**
- **70[Policy Governance & Least Privilege](security/policy-governance.md)**
- **71[Policy Examples](examples/policies/README.md)**
- **72[Policy FAQ](faq/policy-faq.md)**
- **73[Policy Run DTOs](../src/StellaOps.Scheduler.Models/docs/SCHED-MODELS-20-001-POLICY-RUNS.md)**
- **30[Fixture Maintenance](dev/fixtures.md)**
### User & operator guides
- **14[Glossary](14_GLOSSARY_OF_TERMS.md)**
- **15[UI Guide](15_UI_GUIDE.md)**
- **16[Console AOC Dashboard](ui/console.md)**
- **16[Console Accessibility Guide](accessibility.md)**
- **17[Security Hardening Guide](17_SECURITY_HARDENING_GUIDE.md)**
- **17[Console Security Posture](security/console-security.md)**
- **18[Coding Standards](18_CODING_STANDARDS.md)**
- **19[TestSuite Overview](19_TEST_SUITE_OVERVIEW.md)**
- **21[Install Guide](21_INSTALL_GUIDE.md)**
- **21[Docker Install Recipes](install/docker.md)**
- **22[CI/CD Recipes Library](ci/20_CI_RECIPES.md)**
- **23[FAQ](23_FAQ_MATRIX.md)**
- **24[Offline Update Kit Admin Guide](24_OFFLINE_KIT.md)**

View File

@@ -128,11 +128,15 @@
| ID | Status | Owner(s) | Depends on | Description | Exit Criteria |
|----|--------|----------|------------|-------------|---------------|
| DOCS-LNM-22-001 | TODO | Docs Guild, Concelier Guild | CONCELIER-LNM-21-001..003 | Author `/docs/advisories/aggregation.md` covering observation vs linkset, conflict handling, AOC requirements, and reviewer checklist. | Doc merged with examples + checklist; lint passes. |
| DOCS-LNM-22-002 | TODO | Docs Guild, Excititor Guild | EXCITITOR-LNM-21-001..003 | Publish `/docs/vex/aggregation.md` describing VEX observation/linkset model, product matching, conflicts. | Doc merged with fixtures; checklist appended. |
| DOCS-LNM-22-003 | TODO | Docs Guild, BE-Base Platform Guild | WEB-LNM-21-001..003 | Update `/docs/api/advisories.md` and `/docs/api/vex.md` for new endpoints, parameters, errors, exports. | API docs aligned with OpenAPI; examples validated. |
| DOCS-LNM-22-001 | BLOCKED (2025-10-27) | Docs Guild, Concelier Guild | CONCELIER-LNM-21-001..003 | Author `/docs/advisories/aggregation.md` covering observation vs linkset, conflict handling, AOC requirements, and reviewer checklist. | Draft doc merged with examples + checklist; final sign-off blocked until Concelier schema/API tasks land. |
> Blocker (2025-10-27): `CONCELIER-LNM-21-001..003` still TODO; update doc + fixtures once schema/API implementations are available.
| DOCS-LNM-22-002 | BLOCKED (2025-10-27) | Docs Guild, Excititor Guild | EXCITITOR-LNM-21-001..003 | Publish `/docs/vex/aggregation.md` describing VEX observation/linkset model, product matching, conflicts. | Draft doc merged with fixtures; final approval blocked until Excititor observation/linkset work ships. |
> Blocker (2025-10-27): `EXCITITOR-LNM-21-001..003` remain TODO; refresh doc, fixtures, and examples post-implementation.
| DOCS-LNM-22-003 | BLOCKED (2025-10-27) | Docs Guild, BE-Base Platform Guild | WEB-LNM-21-001..003 | Update `/docs/api/advisories.md` and `/docs/api/vex.md` for new endpoints, parameters, errors, exports. | Draft pending gateway/API delivery; unblock once endpoints + OpenAPI specs are available. |
> Blocker (2025-10-27): `WEB-LNM-21-001..003` all TODO—no gateway endpoints/OpenAPI to document yet.
| DOCS-LNM-22-004 | TODO | Docs Guild, Policy Guild | POLICY-ENGINE-40-001 | Create `/docs/policy/effective-severity.md` detailing severity selection strategies from multiple sources. | Doc merged with policy examples; checklist included. |
| DOCS-LNM-22-005 | TODO | Docs Guild, UI Guild | UI-LNM-22-001..003 | Document `/docs/ui/evidence-panel.md` with screenshots, conflict badges, accessibility guidance. | UI doc merged; accessibility checklist completed. |
| DOCS-LNM-22-005 | BLOCKED (2025-10-27) | Docs Guild, UI Guild | UI-LNM-22-001..003 | Document `/docs/ui/evidence-panel.md` with screenshots, conflict badges, accessibility guidance. | Awaiting UI implementation to capture screenshots + flows; unblock once Evidence panel ships. |
> Blocker (2025-10-27): `UI-LNM-22-001..003` all TODO; documentation requires final UI states and accessibility audit artifacts.
## StellaOps Console (Sprint 23)
@@ -148,14 +152,18 @@
| DOCS-CONSOLE-23-008 | DONE (2025-10-26) | Docs Guild, Authority Guild | AUTH-CONSOLE-23-002, CONSOLE-FEAT-23-108 | Draft `/docs/ui/admin.md` describing users/roles, tenants, tokens, integrations, fresh-auth prompts, and RBAC mapping. | Doc merged with tables for scopes vs roles, screenshots, compliance checklist. |
| DOCS-CONSOLE-23-009 | DONE (2025-10-27) | Docs Guild, DevOps Guild | DOWNLOADS-CONSOLE-23-001, CONSOLE-FEAT-23-109 | Publish `/docs/ui/downloads.md` listing product images, commands, offline instructions, parity with CLI, and compliance checklist. | Doc merged; manifest sample included; copy-to-clipboard guidance documented; checklist complete. |
| DOCS-CONSOLE-23-010 | DONE (2025-10-27) | Docs Guild, Deployment Guild, Console Guild | DEVOPS-CONSOLE-23-002, CONSOLE-REL-23-301 | Write `/docs/deploy/console.md` (Helm, ingress, TLS, CSP, env vars, health checks) with compliance checklist. | Deploy doc merged; templates validated; CSP guidance included; checklist appended. |
| DOCS-CONSOLE-23-011 | DOING (2025-10-27) | Docs Guild, Deployment Guild | DOCS-CONSOLE-23-010 | Update `/docs/install/docker.md` to cover Console image, Compose/Helm usage, offline tarballs, parity with CLI. | Doc updated with new sections; commands validated; compliance checklist appended. |
| DOCS-CONSOLE-23-012 | TODO | Docs Guild, Security Guild | AUTH-CONSOLE-23-003, WEB-CONSOLE-23-002 | Publish `/docs/security/console-security.md` detailing OIDC flows, scopes, CSP, fresh-auth, evidence handling, and compliance checklist. | Security doc merged; threat model notes included; checklist appended. |
| DOCS-CONSOLE-23-013 | TODO | Docs Guild, Observability Guild | TELEMETRY-CONSOLE-23-001, CONSOLE-QA-23-403 | Write `/docs/observability/ui-telemetry.md` cataloguing metrics/logs/traces, dashboards, alerts, and feature flags. | Doc merged with instrumentation tables, dashboard screenshots, checklist appended. |
| DOCS-CONSOLE-23-014 | TODO | Docs Guild, Console Guild, CLI Guild | CONSOLE-DOC-23-502 | Maintain `/docs/cli-vs-ui-parity.md` matrix and integrate CI check guidance. | Matrix published with parity status, CI workflow documented, compliance checklist appended. |
| DOCS-CONSOLE-23-011 | DONE (2025-10-28) | Docs Guild, Deployment Guild | DOCS-CONSOLE-23-010 | Update `/docs/install/docker.md` to cover Console image, Compose/Helm usage, offline tarballs, parity with CLI. | Doc updated with new sections; commands validated; compliance checklist appended. |
| DOCS-CONSOLE-23-012 | DONE (2025-10-28) | Docs Guild, Security Guild | AUTH-CONSOLE-23-003, WEB-CONSOLE-23-002 | Publish `/docs/security/console-security.md` detailing OIDC flows, scopes, CSP, fresh-auth, evidence handling, and compliance checklist. | Security doc merged; threat model notes included; checklist appended. |
| DOCS-CONSOLE-23-013 | DONE (2025-10-28) | Docs Guild, Observability Guild | TELEMETRY-CONSOLE-23-001, CONSOLE-QA-23-403 | Write `/docs/observability/ui-telemetry.md` cataloguing metrics/logs/traces, dashboards, alerts, and feature flags. | Doc merged with instrumentation tables, dashboard screenshots, checklist appended. |
| DOCS-CONSOLE-23-014 | DONE (2025-10-28) | Docs Guild, Console Guild, CLI Guild | CONSOLE-DOC-23-502 | Maintain `/docs/cli-vs-ui-parity.md` matrix and integrate CI check guidance. | Matrix published with parity status, CI workflow documented, compliance checklist appended. |
> 2025-10-28: Install Docker guide references pending CLI commands (`stella downloads manifest`, `stella downloads mirror`, `stella console status`). Update once CLI parity lands.
| DOCS-CONSOLE-23-015 | TODO | Docs Guild, Architecture Guild | CONSOLE-CORE-23-001, WEB-CONSOLE-23-001 | Produce `/docs/architecture/console.md` describing frontend packages, data flow diagrams, SSE design, performance budgets. | Architecture doc merged with diagrams + compliance checklist; reviewers approve. |
| DOCS-CONSOLE-23-016 | TODO | Docs Guild, Accessibility Guild | CONSOLE-QA-23-402, CONSOLE-FEAT-23-102 | Refresh `/docs/accessibility.md` with Console-specific keyboard flows, color tokens, testing tools, and compliance checklist updates. | Accessibility doc updated; audits referenced; checklist appended. |
| DOCS-CONSOLE-23-016 | DONE (2025-10-28) | Docs Guild, Accessibility Guild | CONSOLE-QA-23-402, CONSOLE-FEAT-23-102 | Refresh `/docs/accessibility.md` with Console-specific keyboard flows, color tokens, testing tools, and compliance checklist updates. | Accessibility doc updated; audits referenced; checklist appended. |
> 2025-10-28: Added guide covering keyboard matrix, screen reader behaviour, colour/focus tokens, testing workflow, offline guidance, and compliance checklist.
| DOCS-CONSOLE-23-017 | TODO | Docs Guild, Console Guild | CONSOLE-FEAT-23-101..109 | Create `/docs/examples/ui-tours.md` providing triage, audit, policy rollout walkthroughs with annotated screenshots and GIFs. | UI tours doc merged; media assets stored; compliance checklist appended. |
| DOCS-LNM-22-006 | TODO | Docs Guild, Architecture Guild | CONCELIER-LNM-21-001..005, EXCITITOR-LNM-21-001..005 | Refresh `/docs/architecture/conseiller.md` and `/docs/architecture/excitator.md` describing observation/linkset pipelines and event contracts. | Architecture docs updated with diagrams; checklist appended. |
| DOCS-CONSOLE-23-018 | TODO | Docs Guild, Security Guild | DOCS-CONSOLE-23-012 | Execute console security compliance checklist and capture Security Guild sign-off in Sprint 23 log. | Checklist completed; findings addressed or tickets filed; sign-off noted in updates file. |
| DOCS-LNM-22-006 | DONE (2025-10-27) | Docs Guild, Architecture Guild | CONCELIER-LNM-21-001..005, EXCITITOR-LNM-21-001..005 | Refresh `/docs/architecture/conseiller.md` and `/docs/architecture/excitator.md` describing observation/linkset pipelines and event contracts. | Architecture docs updated with observation/linkset flow + event tables; revisit once service implementations land. |
> Follow-up: align diagrams/examples after `CONCELIER-LNM-21` & `EXCITITOR-LNM-21` work merges (currently TODO).
| DOCS-LNM-22-007 | TODO | Docs Guild, Observability Guild | CONCELIER-LNM-21-005, EXCITITOR-LNM-21-005, DEVOPS-LNM-22-002 | Publish `/docs/observability/aggregation.md` with metrics/traces/logs/SLOs. | Observability doc merged; dashboards referenced; checklist appended. |
| DOCS-LNM-22-008 | TODO | Docs Guild, DevOps Guild | MERGE-LNM-21-001, CONCELIER-LNM-21-102 | Write `/docs/migration/no-merge.md` describing migration plan, backfill steps, rollback, feature flags. | Migration doc approved by stakeholders; checklist appended. |
@@ -193,7 +201,7 @@
| DOCS-EXC-25-001 | TODO | Docs Guild, Governance Guild | WEB-EXC-25-001 | Author `/docs/governance/exceptions.md` covering lifecycle, scope patterns, examples, compliance checklist. | Doc merged; reviewers sign off; checklist included. |
| DOCS-EXC-25-002 | TODO | Docs Guild, Authority Core | AUTH-EXC-25-001 | Publish `/docs/governance/approvals-and-routing.md` detailing roles, routing matrix, MFA rules, audit trails. | Doc merged; routing examples validated; checklist appended. |
| DOCS-EXC-25-003 | TODO | Docs Guild, BE-Base Platform Guild | WEB-EXC-25-001..003 | Create `/docs/api/exceptions.md` with endpoints, payloads, errors, idempotency notes. | API doc aligned with OpenAPI; examples tested; checklist appended. |
| DOCS-EXC-25-004 | TODO | Docs Guild, Policy Guild | POLICY-ENGINE-70-001 | Document `/docs/policy/exception-effects.md` explaining evaluation order, conflicts, simulation. | Doc merged; tests cross-referenced; checklist appended. |
| DOCS-EXC-25-004 | DONE (2025-10-27) | Docs Guild, Policy Guild | POLICY-ENGINE-70-001 | Document `/docs/policy/exception-effects.md` explaining evaluation order, conflicts, simulation. | Doc merged; tests cross-referenced; checklist appended. |
| DOCS-EXC-25-005 | TODO | Docs Guild, UI Guild | UI-EXC-25-001..004 | Write `/docs/ui/exception-center.md` with UI walkthrough, badges, accessibility, shortcuts. | Doc merged with screenshots; accessibility checklist completed. |
| DOCS-EXC-25-006 | TODO | Docs Guild, DevEx/CLI Guild | CLI-EXC-25-001..002 | Update `/docs/cli/exceptions.md` covering command usage and exit codes. | CLI doc updated; examples validated; checklist appended. |
| DOCS-EXC-25-007 | TODO | Docs Guild, DevOps Guild | SCHED-WORKER-25-101, DEVOPS-GRAPH-24-003 | Publish `/docs/migration/exception-governance.md` describing cutover from legacy suppressions, notifications, rollback. | Migration doc approved; checklist included. |
@@ -249,20 +257,48 @@
| ID | Status | Owner(s) | Depends on | Description | Exit Criteria |
|----|--------|----------|------------|-------------|---------------|
| DOCS-POLICY-27-001 | TODO | Docs Guild, Policy Guild | REGISTRY-API-27-001, POLICY-ENGINE-27-001 | Publish `/docs/policy/studio-overview.md` covering lifecycle, roles, glossary, and compliance checklist. | Doc merged with diagrams + lifecycle table; checklist appended; stakeholders sign off. |
| DOCS-POLICY-27-002 | TODO | Docs Guild, Console Guild | CONSOLE-STUDIO-27-001 | Write `/docs/policy/authoring.md` detailing workspace templates, snippets, lint rules, IDE shortcuts, and best practices. | Authoring doc includes annotated screenshots, snippet catalog, compliance checklist. |
| DOCS-POLICY-27-003 | TODO | Docs Guild, Policy Registry Guild | REGISTRY-API-27-007 | Document `/docs/policy/versioning-and-publishing.md` (semver rules, attestations, rollback) with compliance checklist. | Doc merged with flow diagrams; attestation steps documented; checklist appended. |
| DOCS-POLICY-27-004 | TODO | Docs Guild, Scheduler Guild | REGISTRY-API-27-005, SCHED-WORKER-27-301 | Write `/docs/policy/simulation.md` covering quick vs batch sim, thresholds, evidence bundles, CLI examples. | Simulation doc includes charts, sample manifests, checklist appended. |
| DOCS-POLICY-27-005 | TODO | Docs Guild, Product Ops | REGISTRY-API-27-006 | Publish `/docs/policy/review-and-approval.md` with approver requirements, comments, webhooks, audit trail guidance. | Doc merged with role matrix + webhook schema; checklist appended. |
| DOCS-POLICY-27-006 | TODO | Docs Guild, Policy Guild | REGISTRY-API-27-008 | Author `/docs/policy/promotion.md` covering environments, canary, rollback, and monitoring steps. | Promotion doc includes examples + checklist; verified by Policy Ops. |
| DOCS-POLICY-27-007 | TODO | Docs Guild, DevEx/CLI Guild | CLI-POLICY-27-001..004 | Update `/docs/policy/cli.md` with new commands, JSON schemas, CI usage, and compliance checklist. | CLI doc merged with transcripts; schema references validated; checklist appended. |
| DOCS-POLICY-27-008 | TODO | Docs Guild, Policy Registry Guild | REGISTRY-API-27-001..008 | Publish `/docs/policy/api.md` describing Registry endpoints, request/response schemas, errors, and feature flags. | API doc aligned with OpenAPI; examples validated; checklist appended. |
| DOCS-POLICY-27-009 | TODO | Docs Guild, Security Guild | AUTH-POLICY-27-002 | Create `/docs/security/policy-attestations.md` covering signing, verification, key rotation, and compliance checklist. | Security doc approved by Security Guild; verifier steps documented; checklist appended. |
| DOCS-POLICY-27-010 | TODO | Docs Guild, Architecture Guild | REGISTRY-API-27-001, SCHED-WORKER-27-301 | Author `/docs/architecture/policy-registry.md` (service design, schemas, queues, failure modes) with diagrams and checklist. | Architecture doc merged; diagrams committed; checklist appended. |
| DOCS-POLICY-27-011 | TODO | Docs Guild, Observability Guild | DEVOPS-POLICY-27-004 | Publish `/docs/observability/policy-telemetry.md` with metrics/log tables, dashboards, alerts, and compliance checklist. | Observability doc merged; dashboards linked; checklist appended. |
| DOCS-POLICY-27-012 | TODO | Docs Guild, Ops Guild | DEPLOY-POLICY-27-002 | Write `/docs/runbooks/policy-incident.md` detailing rollback, freeze, forensic steps, notifications. | Runbook merged; rehearsal recorded; checklist appended. |
| DOCS-POLICY-27-013 | TODO | Docs Guild, Policy Guild | CONSOLE-STUDIO-27-001, REGISTRY-API-27-002 | Update `/docs/examples/policy-templates.md` with new templates, snippets, and sample policies. | Examples committed with commentary; lint passes; checklist appended. |
| DOCS-POLICY-27-014 | TODO | Docs Guild, Policy Registry Guild | REGISTRY-API-27-003, WEB-POLICY-27-001 | Refresh `/docs/aoc/aoc-guardrails.md` to include Studio-specific guardrails and validation scenarios. | Doc updated with Studio guardrails; compliance checklist appended. |
| DOCS-POLICY-27-001 | BLOCKED (2025-10-27) | Docs Guild, Policy Guild | REGISTRY-API-27-001, POLICY-ENGINE-27-001 | Publish `/docs/policy/studio-overview.md` covering lifecycle, roles, glossary, and compliance checklist. | Doc merged with diagrams + lifecycle table; checklist appended; stakeholders sign off. |
> Blocked by `REGISTRY-API-27-001` and `POLICY-ENGINE-27-001`; need spec + compile data.
> Blocker: Registry OpenAPI (`REGISTRY-API-27-001`) and policy compile enrichments (`POLICY-ENGINE-27-001`) are still TODO; need final interfaces before drafting overview.
| DOCS-POLICY-27-002 | BLOCKED (2025-10-27) | Docs Guild, Console Guild | CONSOLE-STUDIO-27-001 | Write `/docs/policy/authoring.md` detailing workspace templates, snippets, lint rules, IDE shortcuts, and best practices. | Authoring doc includes annotated screenshots, snippet catalog, compliance checklist. |
> Blocked by `CONSOLE-STUDIO-27-001` Studio authoring UI pending.
> Blocker: Console Studio authoring UI (`CONSOLE-STUDIO-27-001`) not implemented; awaiting UX to capture flows/snippets.
| DOCS-POLICY-27-003 | BLOCKED (2025-10-27) | Docs Guild, Policy Registry Guild | REGISTRY-API-27-007 | Document `/docs/policy/versioning-and-publishing.md` (semver rules, attestations, rollback) with compliance checklist. | Doc merged with flow diagrams; attestation steps documented; checklist appended. |
> Blocked by `REGISTRY-API-27-007` publish/sign pipeline outstanding.
> Blocker: Registry publish/sign workflow (`REGISTRY-API-27-007`) pending.
| DOCS-POLICY-27-004 | BLOCKED (2025-10-27) | Docs Guild, Scheduler Guild | REGISTRY-API-27-005, SCHED-WORKER-27-301 | Write `/docs/policy/simulation.md` covering quick vs batch sim, thresholds, evidence bundles, CLI examples. | Simulation doc includes charts, sample manifests, checklist appended. |
> Blocked by `REGISTRY-API-27-005`/`SCHED-WORKER-27-301` batch simulation not ready.
> Blocker: Batch simulation APIs/workers (`REGISTRY-API-27-005`, `SCHED-WORKER-27-301`) still TODO.
| DOCS-POLICY-27-005 | BLOCKED (2025-10-27) | Docs Guild, Product Ops | REGISTRY-API-27-006 | Publish `/docs/policy/review-and-approval.md` with approver requirements, comments, webhooks, audit trail guidance. | Doc merged with role matrix + webhook schema; checklist appended. |
> Blocked by `REGISTRY-API-27-006` review workflow not implemented.
> Blocker: Review workflow (`REGISTRY-API-27-006`) not landed.
| DOCS-POLICY-27-006 | BLOCKED (2025-10-27) | Docs Guild, Policy Guild | REGISTRY-API-27-008 | Author `/docs/policy/promotion.md` covering environments, canary, rollback, and monitoring steps. | Promotion doc includes examples + checklist; verified by Policy Ops. |
> Blocked by `REGISTRY-API-27-008` promotion APIs pending.
> Blocker: Promotion/canary APIs (`REGISTRY-API-27-008`) outstanding.
| DOCS-POLICY-27-007 | BLOCKED (2025-10-27) | Docs Guild, DevEx/CLI Guild | CLI-POLICY-27-001..004 | Update `/docs/policy/cli.md` with new commands, JSON schemas, CI usage, and compliance checklist. | CLI doc merged with transcripts; schema references validated; checklist appended. |
> Blocked by `CLI-POLICY-27-001..004` CLI commands missing.
> Blocker: Policy CLI commands (`CLI-POLICY-27-001..004`) yet to implement.
| DOCS-POLICY-27-008 | BLOCKED (2025-10-27) | Docs Guild, Policy Registry Guild | REGISTRY-API-27-001..008 | Publish `/docs/policy/api.md` describing Registry endpoints, request/response schemas, errors, and feature flags. | API doc aligned with OpenAPI; examples validated; checklist appended. |
> Blocked by `REGISTRY-API-27-001..008` OpenAPI + endpoints incomplete.
> Blocker: Registry OpenAPI/spec suite (`REGISTRY-API-27-001..008`) incomplete.
| DOCS-POLICY-27-009 | BLOCKED (2025-10-27) | Docs Guild, Security Guild | AUTH-POLICY-27-002 | Create `/docs/security/policy-attestations.md` covering signing, verification, key rotation, and compliance checklist. | Security doc approved by Security Guild; verifier steps documented; checklist appended. |
> Blocked by `AUTH-POLICY-27-002` signing enforcement pending.
> Blocker: Authority signing enforcement (`AUTH-POLICY-27-002`) pending.
| DOCS-POLICY-27-010 | BLOCKED (2025-10-27) | Docs Guild, Architecture Guild | REGISTRY-API-27-001, SCHED-WORKER-27-301 | Author `/docs/architecture/policy-registry.md` (service design, schemas, queues, failure modes) with diagrams and checklist. | Architecture doc merged; diagrams committed; checklist appended. |
> Blocked by `REGISTRY-API-27-001` & `SCHED-WORKER-27-301` need delivery.
> Blocker: Policy Registry schema/workers not delivered (see `REGISTRY-API-27-001`, `SCHED-WORKER-27-301`).
| DOCS-POLICY-27-011 | BLOCKED (2025-10-27) | Docs Guild, Observability Guild | DEVOPS-POLICY-27-004 | Publish `/docs/observability/policy-telemetry.md` with metrics/log tables, dashboards, alerts, and compliance checklist. | Observability doc merged; dashboards linked; checklist appended. |
> Blocked by `DEVOPS-POLICY-27-004` observability dashboards outstanding.
> Blocker: Observability dashboards (`DEVOPS-POLICY-27-004`) not built.
| DOCS-POLICY-27-012 | BLOCKED (2025-10-27) | Docs Guild, Ops Guild | DEPLOY-POLICY-27-002 | Write `/docs/runbooks/policy-incident.md` detailing rollback, freeze, forensic steps, notifications. | Runbook merged; rehearsal recorded; checklist appended. |
> Blocked by `DEPLOY-POLICY-27-002` incident runbook inputs pending.
> Blocker: Ops runbook inputs (`DEPLOY-POLICY-27-002`) pending.
| DOCS-POLICY-27-013 | BLOCKED (2025-10-27) | Docs Guild, Policy Guild | CONSOLE-STUDIO-27-001, REGISTRY-API-27-002 | Update `/docs/examples/policy-templates.md` with new templates, snippets, and sample policies. | Examples committed with commentary; lint passes; checklist appended. |
> Blocked by `CONSOLE-STUDIO-27-001`/`REGISTRY-API-27-002` templates missing.
> Blocker: Studio templates and registry storage (`CONSOLE-STUDIO-27-001`, `REGISTRY-API-27-002`) not available.
| DOCS-POLICY-27-014 | BLOCKED (2025-10-27) | Docs Guild, Policy Registry Guild | REGISTRY-API-27-003, WEB-POLICY-27-001 | Refresh `/docs/aoc/aoc-guardrails.md` to include Studio-specific guardrails and validation scenarios. | Doc updated with Studio guardrails; compliance checklist appended. |
> Blocked by `REGISTRY-API-27-003` & `WEB-POLICY-27-001` guardrails not implemented.
> Blocker: Registry compile pipeline/web proxy (`REGISTRY-API-27-003`, `WEB-POLICY-27-001`) outstanding.
## Vulnerability Explorer (Sprint 29)

131
docs/accessibility.md Normal file
View File

@@ -0,0 +1,131 @@
# StellaOps Console Accessibility Guide
> **Audience:** Accessibility Guild, Console Guild, Docs Guild, QA.
> **Scope:** Keyboard interaction model, screen-reader behaviour, colour & focus tokens, testing workflows, offline considerations, and compliance checklist for the StellaOps Console (Sprint23).
The console targets **WCAG2.2 AA** across all supported browsers (Chromium, Firefox ESR) and honours StellaOps sovereign/offline constraints. Every build must keep keyboard-only users, screen-reader users, and high-contrast operators productive without relying on third-party services.
---
## 1·Accessibility Principles
1. **Deterministic navigation** Focus order, shortcuts, and announcements remain stable across releases; URLs encode state for deep links.
2. **Keyboard-first design** Every actionable element is reachable via keyboard; shortcuts provide accelerators, and remapping is available via *Settings → Accessibility → Keyboard shortcuts*.
3. **Assistive technology parity** ARIA roles and live regions mirror visual affordances (status banners, SSE tickers, progress drawers). Screen readers receive polite/atomic updates to avoid chatter.
4. **Colour & contrast tokens** All palettes derive from design tokens that achieve ≥4.5:1 contrast (text) and ≥3:1 for graphical indicators; tokens pass automated contrast linting.
5. **Offline equivalence** Accessibility features (shortcuts, offline banners, focus restoration) behave the same in sealed environments, with guidance when actions require online authority.
---
## 2·Keyboard Interaction Map
### 2.1 Global shortcuts
| Action | Macs | Windows/Linux | Notes |
|--------|------|---------------|-------|
| Command palette | `⌘K` | `CtrlK` | Focuses palette search; respects tenant scope. |
| Tenant picker | `⌘T` | `CtrlT` | Opens modal; `Enter` confirms, `Esc` cancels. |
| Filter tray toggle | `⇧F` | `ShiftF` | Focus lands on first filter; `Tab` cycles filters before returning to page. |
| Saved view presets | `⌘1-9` | `Ctrl1-9` | Bound per tenant; missing preset triggers tooltip. |
| Keyboard reference | `?` | `?` | Opens overlay listing context-specific shortcuts; `Esc` closes. |
| Global search (context) | `/` | `/` | When the filter tray is closed, focuses inline search field. |
### 2.2 Module-specific shortcuts
| Module | Action | Macs | Windows/Linux | Notes |
|--------|--------|------|---------------|-------|
| Findings | Explain search | `⌘ /` | `Ctrl/` | Only when Explain drawer open; announces results via live region. |
| SBOM Explorer | Toggle overlays | `⌘G` | `CtrlG` | Persists per session (see `/docs/ui/sbom-explorer.md`). |
| Advisories & VEX | Provider filter | `⌘F` | `CtrlAltF` | Moves focus to provider chip row. |
| Runs | Refresh snapshot | `⌘R` | `CtrlR` | Soft refresh of SSE state; no full page reload. |
| Policies | Save draft | `⌘S` | `CtrlS` | Requires edit scope; exposes toast + status live update. |
| Downloads | Copy CLI command | `⇧D` | `ShiftD` | Copies manifest or export command; toast announces scope hints. |
All shortcuts are remappable. Remaps persist in IndexedDB (per tenant) and export as part of profile bundles so operators can restore preferences offline.
---
## 3·Screen Reader & Focus Behaviour
- **Skip navigation** Each route exposes a “Skip to content” link revealed on keyboard focus. Focus order: global header → page breadcrumb → action shelf → data grid/list → drawers/dialogs.
- **Live regions** Status ticker and SSE progress bars use `aria-live="polite"` with throttling to avoid flooding AT. Error toasts use `aria-live="assertive"` and auto-focus dismiss buttons.
- **Drawers & modals** Dialog components trap focus, support `Esc` to close, and restore focus to the launching control. Screen readers announce title + purpose.
- **Tables & grids** Large tables (Findings, SBOM inventory) switch to virtualised rows but retain ARIA grid semantics (`aria-rowcount`, `aria-colindex`). Column headers include sorting state via `aria-sort`.
- **Tenancy context** Tenant badge exposes `aria-describedby` linking to context summary (environment, offline snapshot). Switching tenant queues a polite announcement summarising new scope.
- **Command palette** Uses `role="dialog"` with search input labelled. Keyboard navigation within results uses `Up/Down`; screen readers announce result category + command.
- **Offline banner** When offline, a dismissible banner announces reason and includes instructions for CLI fallback. The banner has `role="status"` so it announces once without stealing focus.
---
## 4·Colour & Focus Tokens
Console consumes design tokens published by the Console Guild (tracked via CONSOLE-FEAT-23-102). Tokens live in the design system bundle (`ui/design/tokens/colors.json`, mirrored at build time). Key tokens:
| Token | Purpose | Contrast target |
|-------|---------|-----------------|
| `so-color-surface-base` | Primary surface/background | ≥4.5:1 against `so-color-text-primary`. |
| `so-color-surface-raised` | Cards, drawers, modals | ≥3:1 against surrounding surfaces. |
| `so-color-text-primary` | Default text colour | ≥4.5:1 against base surfaces. |
| `so-color-text-inverted` | Text on accent buttons | ≥4.5:1 against accent fills. |
| `so-color-accent-primary` | Action buttons, focus headings | ≥3:1 against surface. |
| `so-color-status-critical` | Error toasts, violation chips | ≥4.5:1 for text; `critical-bg` provides >3:1 on neutral surface. |
| `so-color-status-warning` | Warning banners | Meets 3:1 on surface and 4.5:1 for text overlays. |
| `so-color-status-success` | Success toasts, pass badges | ≥3:1 for iconography; text uses `text-primary`. |
| `so-focus-ring` | 2px outline used across focusable elements | 3:1 against both light/dark surfaces. |
Colour tokens undergo automated linting (**axe-core contrast checks** + custom luminance script) during build. Any new token must include dark/light variants and pass the token contract tests.
---
## 5·Testing Workflow
| Layer | Tooling | Frequency | Notes |
|-------|---------|-----------|-------|
| Component a11y | Storybook + axe-core addon | On PR (story CI) | Fails when axe detects violations. |
| Route regression | Playwright a11y sweep (`pnpm test:a11y`) | Nightly & release pipeline | Executes keyboard navigation, checks focus trap, runs Axe on key routes (Dashboard, Findings, SBOM, Admin). |
| Colour contrast lint | Token validator (`tools/a11y/check-contrast.ts`) | On token change | Guards design token updates. |
| CI parity | Pending `scripts/check-console-cli-parity.sh` (CONSOLE-DOC-23-502) | Release CI | Ensures CLI commands documented for parity features. |
| Screen-reader spot checks | Manual NVDA + VoiceOver scripts | Pre-release checklist | Scenarios: tenant switch, explain drawer, downloads parity copy. |
| Offline smoke | `stella offline kit import` + Playwright sealed-mode run | Prior to Offline Kit cut | Validates offline banners, disabled actions, keyboard flows without Authority. |
Accessibility QA (CONSOLE-QA-23-402) tracks failing scenarios via Playwright snapshots and publishes reports in the Downloads parity channel (`kind = "parity.report"` placeholder until CLI parity CI lands).
---
## 6·Offline & Internationalisation Considerations
- Offline mode surfaces staleness badges and disables remote-only palette entries; keyboard focus skips disabled controls.
- Saved shortcuts, presets, and remaps serialise into Offline Kit bundles so operators can restore preferences post-import.
- Locale switching (future feature flag) will load translations at runtime; ensure ARIA labels use i18n tokens rather than hard-coded strings.
- For sealed installs, guidance panels include CLI equivalents (`stella auth fresh-auth`, `stella runs export`) to unblock tasks when Authority is unavailable.
---
## 7·Compliance Checklist
- [ ] Keyboard shortcut matrix validated (default + remapped) and documented.
- [ ] Screen-reader pass recorded for tenant switch, Explain drawer, Downloads copy-to-clipboard.
- [ ] Colour tokens audited; contrast reports stored with release artifacts.
- [ ] Automated a11y pipelines (Storybook axe, Playwright a11y) green; failures feed the `#console-qa` channel.
- [ ] Offline kit a11y smoke executed before publishing each bundle.
- [ ] CLI parity gaps logged in `/docs/cli-vs-ui-parity.md`; UI callouts reference fallback commands until parity closes.
- [ ] Accessibility Guild sign-off captured in sprint log and release notes reference this guide.
- [ ] References cross-checked (`/docs/ui/navigation.md`, `/docs/ui/downloads.md`, `/docs/security/console-security.md`, `/docs/observability/ui-telemetry.md`).
---
## 8·References
- `/docs/ui/navigation.md` shortcut definitions, URL schema.
- `/docs/ui/downloads.md` CLI parity and offline copy workflows.
- `/docs/ui/console-overview.md` tenant model, filter behaviours.
- `/docs/security/console-security.md` security metrics and DPoP/fresh-auth requirements.
- `/docs/observability/ui-telemetry.md` telemetry metrics mapped to accessibility features.
- `/docs/cli-vs-ui-parity.md` parity status per console feature.
- `CONSOLE-QA-23-402` Accessibility QA backlog (Playwright + manual checks).
- `CONSOLE-FEAT-23-102` Design tokens & theming delivery.
---
*Last updated: 2025-10-28 (Sprint23).*

View File

@@ -0,0 +1,218 @@
# Advisory Observations & Linksets
> Imposed rule: Work of this type or tasks of this type on this component must also
> be applied everywhere else it should be applied.
The Link-Not-Merge (LNM) initiative replaces the legacy "merge" pipeline with
immutable observations and correlation linksets. This guide explains how
Concelier ingests advisory statements, preserves upstream truth, and produces
linksets that downstream services (Policy Engine, Vuln Explorer, Console) can
use without collapsing sources together.
---
## 1. Model overview
### 1.1 Observation lifecycle
1. **Ingest** Connectors fetch upstream payloads (CSAF, OSV, vendor feeds),
validate signatures, and drop any derived fields prohibited by the
Aggregation-Only Contract (AOC).
2. **Persist** Concelier writes immutable `advisory_observations` scoped by
`tenant`, `(source.vendor, upstreamId)`, and `contentHash`. Supersedes chains
capture revisions without mutating history.
3. **Expose** WebService surfaces paged/read APIs; Offline Kit snapshots
include the same documents for air-gapped installs.
Observation schema highlights:
```text
observationId = {tenant}:{source.vendor}:{upstreamId}:{revision}
tenant, source{vendor, stream, api, collectorVersion}
upstream{upstreamId, documentVersion, fetchedAt, receivedAt,
contentHash, signature{present, format, keyId, signature}}
content{format, specVersion, raw}
identifiers{cve?, ghsa?, aliases[], osvIds[]}
linkset{purls[], cpes[], aliases[], references[], conflicts[]?}
createdAt, attributes{batchId?, replayCursor?}
```
- **Immutable raw** (`content.raw`) mirrors upstream payloads exactly.
- **Provenance** (`source.*`, `upstream.*`) satisfies AOC guardrails and enables
cryptographic attestations.
- **Identifiers** retain lossless extracts (CVE, GHSA, vendor aliases) that seed
linksets.
- **Linkset** captures join hints but never merges or adds derived severity.
### 1.2 Linkset lifecycle
Linksets correlate observations that describe the same vulnerable product while
keeping each source intact.
1. **Seed** Observations emit normalized identifiers (`purl`, `cpe`,
`alias`) during ingestion.
2. **Correlate** Linkset builder groups observations by tenant, product
coordinates, and equivalence signals (PURL alias graph, CVE overlap, CVSS
vector equality, fuzzy titles).
3. **Annotate** Detected conflicts (severity disagreements, affected-range
mismatch, incompatible references) are recorded with structured payloads and
preserved for UI/API export.
4. **Persist** Results land in `advisory_linksets` with deterministic IDs
(`linksetId = {tenant}:{hash(aliases+purls+seedIds)}`) and append-only history
for reproducibility.
Linksets never suppress or prefer one source; they provide aligned evidence so
other services can apply policy.
---
## 2. Observation vs. linkset
- **Purpose**
- Observation: Immutable record per vendor and revision.
- Linkset: Correlates observations that share product identity.
- **Mutation**
- Observation: Append-only via supersedes chain.
- Linkset: Rebuilt deterministically from canonical signals.
- **Allowed fields**
- Observation: Raw payload, provenance, identifiers, join hints.
- Linkset: Observation references, normalized product metadata, conflicts.
- **Forbidden fields**
- Observation: Derived severity, policy status, opinionated dedupe.
- Linkset: Derived severity (conflicts recorded but unresolved).
- **Consumers**
- Observation: Evidence API, Offline Kit, CLI exports.
- Linkset: Policy Engine overlay, UI evidence panel, Vuln Explorer.
### 2.1 Example sequence
1. Red Hat PSIRT publishes RHSA-2025:1234 for OpenSSL; Concelier inserts an
observation for vendor `redhat` with `pkg:rpm/redhat/openssl@1.1.1w-12`.
2. NVD issues CVE-2025-0001; a second observation is inserted for vendor `nvd`.
3. Linkset builder runs, groups the two observations, records alias and PURL
overlap, and flags a CVSS disagreement (`7.5` vs `7.2`).
4. Policy Engine reads the linkset, recognises the severity variance, and relies
on configured rules to decide the effective output.
---
## 3. Conflict handling
Conflicts record disagreements without altering source payloads. The builder
emits structured entries:
```json
{
"type": "severity-mismatch",
"field": "cvss.baseScore",
"observations": [
{
"source": "redhat",
"value": "7.5",
"vector": "AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N"
},
{
"source": "nvd",
"value": "7.2",
"vector": "AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:N"
}
],
"confidence": "medium",
"detectedAt": "2025-10-27T14:00:00Z"
}
```
Supported conflict classes:
- `severity-mismatch` CVSS or qualitative severities differ.
- `affected-range-divergence` Product ranges, fixed versions, or platforms
disagree.
- `statement-disagreement` One observation declares `not_affected` while
another states `affected`.
- `reference-clash` URL or classifier collisions (for example, exploit URL vs
conflicting advisory).
- `alias-inconsistency` Aliases map to different canonical IDs (GHSA vs CVE).
- `metadata-gap` Required provenance missing on one source; logged as a
warning.
Conflict surfaces:
- WebService endpoints (`GET /advisories/linksets/{id}``conflicts[]`).
- UI evidence panel chips and conflict badges.
- CLI exports (JSON/OSV) exposed through LNM commands.
- Observability metrics (`advisory_linkset_conflicts_total{type}`).
---
## 4. AOC alignment
Observations and linksets must satisfy Aggregation-Only Contract invariants:
- **No derived severity** `content.raw` may include upstream severity, but the
observation body never injects or edits severity.
- **No merges** Each upstream document stays separate; linksets reference
observations via deterministic IDs.
- **Provenance mandatory** Missing `signature` or `source` metadata is an AOC
violation (`ERR_AOC_004`).
- **Idempotent writes** Duplicate `contentHash` yields a no-op; supersedes
pointer captures new revisions.
- **Deterministic output** Linkset builder sorts keys, normalizes timestamps
(UTC ISO-8601), and uses canonical JSON hashing.
Violations trigger guard errors (`ERR_AOC_00x`), emit `aoc_violation_total`
metrics, and block persistence until corrected.
---
## 5. Downstream consumption
- **Policy Engine** Computes effective severity and risk overlays from linkset
evidence and conflicts.
- **Console UI** Renders per-source statements, signed hashes, and conflict
banners inside the evidence panel.
- **CLI (`stella advisories linkset …`)** Exports observations and linksets as
JSON or OSV for offline triage.
- **Offline Kit** Shipping snapshots include observation and linkset
collections for air-gap parity.
- **Observability** Dashboards track ingestion latency, conflict counts, and
supersedes depth.
When adding new consumers, ensure they honour append-only semantics and do not
mutate observation or linkset collections.
---
## 6. Validation & testing
- **Unit tests** (`StellaOps.Concelier.Core.Tests`) validate schema guards,
deterministic linkset hashing, conflict detection fixtures, and supersedes
chains.
- **Mongo integration tests** (`StellaOps.Concelier.Storage.Mongo.Tests`) verify
indexes and idempotent writes under concurrency.
- **CLI smoke suites** confirm `stella advisories observations` and `stella
advisories linksets` export stable JSON.
- **Determinism checks** replay identical upstream payloads and assert that the
resulting observation and linkset documents match byte for byte.
- **Offline kit verification** simulates air-gapped bootstrap to confirm that
snapshots align with live data.
Add fixtures whenever a new conflict type or correlation signal is introduced.
Ensure canonical JSON serialization remains stable across .NET runtime updates.
---
## 7. Reviewer checklist
- Observation schema segment matches the latest `StellaOps.Concelier.Models`
contract.
- Linkset lifecycle covers correlation signals, conflict classes, and
deterministic IDs.
- AOC invariants are explicitly called out with violation codes.
- Examples include multi-source correlation plus conflict annotation.
- Downstream consumer guidance reflects active APIs and CLI features.
- Testing section lists required suites (Core, Storage, CLI, Offline).
- Imposed rule reminder is present at the top of the document.
Confirmed against Concelier Link-Not-Merge tasks:
`CONCELIER-LNM-21-001..005`, `CONCELIER-LNM-21-101..103`,
`CONCELIER-LNM-21-201..203`.

View File

@@ -117,10 +117,10 @@ sequenceDiagram
| Scope | Holder | Purpose | Notes |
|-------|--------|---------|-------|
| `advisory:write` / `vex:write` | Concelier / Excititor collectors | Append raw documents through ingestion endpoints. | Paired with tenant claims; requests without tenant are rejected. |
| `advisory:verify` / `vex:verify` | DevOps verify identity, CLI | Run `stella aoc verify` or call `/aoc/verify`. | Read-only; cannot mutate raw docs. |
| `advisory:ingest` / `vex:ingest` | Concelier / Excititor collectors | Append raw documents through ingestion endpoints. | Paired with tenant claims; requests without tenant are rejected. |
| `advisory:read` / `vex:read` | DevOps verify identity, CLI | Run `stella aoc verify` or call `/aoc/verify`. | Read-only; cannot mutate raw docs. |
| `effective:write` | Policy Engine | Materialise `effective_finding_*` overlays. | Only Policy Engine identity may hold; ingestion contexts receive `ERR_AOC_006` if they attempt. |
| `effective:read` | Console, CLI, exports | Consume derived findings. | Enforced by Gateway and downstream services. |
| `findings:read` | Console, CLI, exports | Consume derived findings. | Enforced by Gateway and downstream services. |
---

155
docs/cli-vs-ui-parity.md Normal file
View File

@@ -0,0 +1,155 @@
# Console CLI ↔ UI Parity Matrix
> **Audience:** Docs Guild, Console Guild, CLI Guild, DevOps automation.
> **Scope:** Track feature-level parity between the StellaOps Console and the `stella` CLI, surface pending work, and describe the parity CI check owned by CONSOLE-DOC-23-502.
Status key:
- **✅ Available** command exists in `StellaOps.Cli` and is documented.
- **🟡 In progress** command implemented but still under active delivery (task status `DOING`).
- **🟩 Planned** command specd but not yet implemented (task `TODO`).
- **⚪ UI-only** no CLI equivalent required.
- **🔴 Gap** CLI feature missing with no active task; file a task before sprint exit.
---
## 1·Navigation & Tenancy
| UI capability | CLI command(s) | Status | Notes / Tasks |
|---------------|----------------|--------|---------------|
| Login / token cache status (`/console/profile`) | `stella auth login`, `stella auth status`, `stella auth whoami` | ✅ Available | Command definitions in `CommandFactory.BuildAuthCommand`. |
| Fresh-auth challenge for sensitive actions | `stella auth fresh-auth` | ✅ Available | Referenced in `/docs/ui/admin.md`. |
| Tenant switcher (UI shell) | `--tenant` flag across CLI commands | ✅ Available | All multi-tenant commands require explicit `--tenant`. |
| Tenant creation / suspension | *(pending CLI)* | 🟩 Planned | No `stella auth tenant *` commands yet track via `CLI-TEN-47-001` (scopes & tenancy). |
---
## 2·Policies & Findings
| UI capability | CLI command(s) | Status | Notes / Tasks |
|---------------|----------------|--------|---------------|
| Policy simulation diff, explain | `stella policy simulate` | 🟡 In progress | Implementation present; task `CLI-POLICY-20-002` marked DOING. |
| Promote / activate policy | `stella policy promote`, `stella policy activate` | 🟩 Planned | Spec tracked under `CLI-POLICY-23-005`. |
| History & explain trees | `stella policy history`, `stella policy explain` | 🟩 Planned | `CLI-POLICY-23-006`. |
| Findings explorer export | `stella findings get`, `stella findings export` | 🟩 Planned | Part of `CLI-POLICY-20-003`. |
| Explain drawer JSON | `stella policy simulate --format json` | 🟡 In progress | Same command; JSON output flagged for CLI tests. |
---
## 3·Runs & Evidence
| UI capability | CLI command(s) | Status | Notes / Tasks |
|---------------|----------------|--------|---------------|
| Run retry / cancel | `stella runs retry`, `stella runs cancel` | 🟩 Planned | Included in export suite task `CLI-EXPORT-35-001`. |
| Manual run submit / preview | `stella runs submit`, `stella runs preview` | 🟩 Planned | `CLI-EXPORT-35-001`. |
| Evidence bundle export | `stella runs export --run <id> --bundle` | 🟩 Planned | `CLI-EXPORT-35-001`. |
| Run status polling | `stella runs status` | 🟩 Planned | Same task. |
---
## 4·Advisories, VEX, SBOM
| UI capability | CLI command(s) | Status | Notes / Tasks |
|---------------|----------------|--------|---------------|
| Advisory observations search | `stella vuln observations` | ✅ Available | Implemented via `BuildVulnCommand`. |
| Advisory linkset export | `stella advisory linkset show/export` | 🟩 Planned | `CLI-LNM-22-001`. |
| VEX observations / linksets | `stella vex obs get/linkset show` | 🟩 Planned | `CLI-LNM-22-002`. |
| SBOM overlay export | `stella sbom overlay apply/export` | 🟩 Planned | Scoped to upcoming SBOM CLI sprint (`SBOM-CONSOLE-23-001/002` + CLI backlog). |
---
## 5·Downloads & Offline Kit
| UI capability | CLI command(s) | Status | Notes / Tasks |
|---------------|----------------|--------|---------------|
| Manifest lookup (Console Downloads) | `stella downloads manifest show --artifact <id>` | 🟩 Planned | Delivered with `CONSOLE-DOC-23-502` + CLI parity commands. |
| Mirror digest to OCI archive | `stella downloads mirror --artifact <id> --to <target>` | 🟩 Planned | Same task bundle (`CONSOLE-DOC-23-502`). |
| Console health check | `stella console status --endpoint <url>` | 🟩 Planned | Tracked in `CONSOLE-DOC-23-502`; interim use `curl` as documented. |
| Offline kit import/export | `stella offline kit import`, `stella offline kit export` | ✅ Available | Implemented (see `CommandHandlers.HandleOfflineKitImportAsync/HandleOfflineKitPullAsync`). |
---
## 6·Admin & Security
| UI capability | CLI command(s) | Status | Notes / Tasks |
|---------------|----------------|--------|---------------|
| Client creation / rotation | `stella auth client create` *(planned)* | 🟩 Planned | Pending tenancy backlog `CLI-TEN-47-001`. |
| Token revoke | `stella auth revoke export/verify` | ✅ Available | Already implemented. |
| Audit export | `stella auth audit export` | 🟩 Planned | Needs CLI work item (Authority guild). |
| Signing key rotation | `stella auth signing rotate` | 🟩 Planned | To be added with AUTH-CONSOLE-23-003 follow-up. |
---
## 7·Telemetry & Observability
| UI capability | CLI command(s) | Status | Notes / Tasks |
|---------------|----------------|--------|---------------|
| Telemetry dashboard parity | `stella obs top`, `stella obs trace`, `stella obs logs` | 🟩 Planned | CLI observability epic (`CLI-OBS-51-001`, `CLI-OBS-52-001`). |
| Incident mode toggle | `stella obs incident-mode enable|disable|status` | 🟩 Planned | CLI task `CLI-OBS-55-001`. |
| Verify console telemetry health | `stella console status --telemetry` | 🟩 Planned | Part of `CONSOLE-DOC-23-502`. |
---
## 8·Parity Gaps & Follow-up
- **Tenant and client lifecycle CLI**: create/suspend tenants, manage clients. Coordinate with Authority CLI epic (`CLI-TEN-47-001`, `CLI-TEN-49-001`).
- **Downloads parity commands**: blocked on `CONSOLE-DOC-23-502` and DevOps pipeline `DOWNLOADS-CONSOLE-23-001`.
- **Policy promotion/history**: requires completion of CLI policy epic (`CLI-POLICY-23-005`/`23-006`).
- **Runs/evidence exports**: waiting on `CLI-EXPORT-35-001`.
- **Observability tooling**: deliver `stella obs` commands before enabling parity CI checks for telemetry.
Document updates should occur whenever a row changes status. When promoting a command from Planned → Available, ensure:
1. CLI command merged with help text.
2. Relevant UI doc references updated to remove “pending” callouts.
3. This matrix row status updated to ✅ and task IDs moved to release notes.
---
## 9·Parity CI Check (CONSOLE-DOC-23-502)
- **Owner:** Docs Guild + DevEx/CLI Guild.
- **Artefact:** Planned `.gitea/workflows/cli-parity-console.yml`.
- **What it does:** Runs `scripts/check-console-cli-parity.sh` (to be committed with the workflow) which:
1. Parses this matrix (YAML view exported from Markdown) to identify rows marked ✅.
2. Executes `stella --help` to confirm listed commands exist.
3. Optionally triggers smoke commands in sandbox mode (e.g., `stella policy simulate --help`).
- **Failure action:** Workflow fails when a listed command is missing or when a row marked ✅ still contains “pending” notes. Update the matrix or fix CLI implementation before merging.
Until the workflow lands, run the checker locally:
```bash
# Pending CONSOLE-DOC-23-502 placeholder command
./scripts/check-console-cli-parity.sh
```
The script should emit a parity report that feeds into the Downloads workspace (`kind = "parity.report"`).
---
## 10·Compliance checklist
- [ ] Matrix reflects latest command availability (statuses accurate, task IDs linked).
- [ ] Notes include owning backlog items for every 🟩 / 🟡 row.
- [ ] CLI commands marked ✅ have corresponding entries in `/docs/cli/*.md` or module-specific docs.
- [ ] CI parity workflow description kept in sync with CONSOLE-DOC-23-502 implementation.
- [ ] Downloads workspace links to latest parity report.
- [ ] Install / observability guides reference this matrix for pending CLI parity.
- [ ] Offline workflows capture CLI fallbacks when commands are pending.
- [ ] Docs Guild review recorded in sprint log once parity CI lands.
---
## 11·References
- `/docs/ui/*.md` per-surface UI parity callouts.
- `/docs/install/docker.md` CLI parity section for deployments.
- `/docs/observability/ui-telemetry.md` telemetry metrics referencing CLI checks.
- `/docs/security/console-security.md` security metrics & CLI parity expectations.
- `src/StellaOps.Cli/TASKS.md` authoritative status for CLI backlog.
- `/docs/updates/2025-10-28-docs-guild.md` coordination note for Authority/Security follow-up.
---
*Last updated: 2025-10-28 (Sprint23).*

View File

@@ -11,8 +11,9 @@ Both commands are designed to enforce the AOC guardrails documented in the [aggr
- CLI version: `stella`0.19.0 (AOC feature gate enabled).
- Required scopes (DPoP-bound):
- `advisory:verify` for Concelier sources.
- `vex:verify` for Excititor sources (optional but required for VEX checks).
- `advisory:read` for Concelier sources.
- `vex:read` for Excititor sources (optional but required for VEX checks).
- `aoc:verify` to invoke guard verification endpoints.
- `tenant:select` if your deployment uses tenant switching.
- Connectivity: direct access to Concelier/Excititor APIs or Offline Kit snapshot (see §4).
- Environment: set `STELLA_AUTHORITY_URL`, `STELLA_TENANT`, and export a valid OpTok via `stella auth login` or existing token cache.

View File

@@ -209,6 +209,13 @@ stella policy run cancel run:P-7:2025-10-26:auto
Replay downloads sealed bundle for deterministic verification.
### 4.4 Schema artefacts for CLI validation
- CI publishes canonical JSON Schema exports for `PolicyRunRequest`, `PolicyRunStatus`, `PolicyDiffSummary`, and `PolicyExplainTrace` as the `policy-schema-exports` artifact (see `.gitea/workflows/build-test-deploy.yml`).
- Each run writes the files to `artifacts/policy-schemas/<commit>/` and stores a unified diff (`policy-schema-diff.patch`) comparing them with the tracked baseline in `docs/schemas/`.
- Schema changes trigger an alert in Slack `#policy-engine` via the `POLICY_ENGINE_SCHEMA_WEBHOOK` secret so CLI maintainers know to refresh fixtures or validation rules.
- Consume these artefacts in CLI tests to keep payload validation aligned without committing generated files into the repo.
---
## 5·Findings & Explainability
@@ -273,7 +280,7 @@ All non-zero exits emit structured error envelope on stderr when `--format json`
- [ ] **Help text synced:** `stella policy --help` matches documented flags/examples (update during release pipeline).
- [ ] **Exit codes mapped:** Table above reflects CLI implementation and CI asserts mapping for `ERR_POL_*`.
- [ ] **JSON schemas verified:** Example payloads validated against OpenAPI/SDK contracts before publishing.
- [ ] **JSON schemas verified:** Example payloads validated against OpenAPI/SDK contracts before publishing. (_CI now exports canonical schemas as `policy-schema-exports`; wire tests to consume them._)
- [ ] **Scope guidance present:** Each command lists required Authority scopes.
- [ ] **Offline guidance included:** Sealed-mode steps and bundle workflows documented.
- [ ] **Cross-links tested:** Links to DSL, lifecycle, runs, and API docs resolve locally (`yarn docs:lint`).
@@ -281,4 +288,4 @@ All non-zero exits emit structured error envelope on stderr when `--format json`
---
*Last updated: 2025-10-26 (Sprint 20).*
*Last updated: 2025-10-27 (Sprint 20).*

View File

@@ -206,7 +206,7 @@ Troubleshooting steps:
- `deploy/helm/stellaops/values-*.yaml` - environment-specific overrides.
- `deploy/compose/docker-compose.console.yaml` - Compose bundle.
- `/docs/ui/downloads.md` - manifest and offline bundle guidance.
- `/docs/security/console-security.md` (pending) - CSP and Authority scopes.
- `/docs/security/console-security.md` - CSP and Authority scopes.
- `/docs/24_OFFLINE_KIT.md` - Offline kit packaging and verification.
- `/docs/ops/deployment-runbook.md` (pending) - wider platform deployment steps.
@@ -226,4 +226,3 @@ Troubleshooting steps:
---
*Last updated: 2025-10-27 (Sprint 23).*

View File

@@ -68,7 +68,7 @@ Ensure `AOC_VERIFIER_USER` exists in Authority with `aoc:verify` scope and no wr
clients:
- clientId: stella-aoc-verify
grantTypes: [client_credentials]
scopes: [aoc:verify, advisory:verify, vex:verify]
scopes: [aoc:verify, advisory:read, vex:read]
tenants: [default]
```

View File

@@ -17,9 +17,11 @@ The exporter builds against `StellaOps.Scheduler.Models` and emits:
- `policy-diff-summary.schema.json`
- `policy-explain-trace.schema.json`
The build pipeline (`.gitea/workflows/build-test-deploy.yml`, job **Export policy run schemas**) runs this script on every push and pull request. Exports land under `artifacts/policy-schemas/<commit>/`, are published as the `policy-schema-exports` artifact, and changes trigger a Slack post to `#policy-engine` via the `POLICY_ENGINE_SCHEMA_WEBHOOK` secret. A unified diff is stored alongside the exports for downstream consumers.
## CI integration checklist
- [ ] Invoke the script in the DevOps pipeline (see `DEVOPS-POLICY-20-004`).
- [ ] Publish the generated schemas as pipeline artifacts.
- [ ] Notify downstream consumers when schemas change (Slack `#policy-engine`, changelog snippet).
- [x] Invoke the script in the DevOps pipeline (see `DEVOPS-POLICY-20-004`).
- [x] Publish the generated schemas as pipeline artifacts.
- [x] Notify downstream consumers when schemas change (Slack `#policy-engine`, changelog snippet).
- [ ] Gate CLI validation once schema artifacts are available.

View File

@@ -5,6 +5,17 @@ metadata:
- serverless
- prod
- strict
exceptions:
effects:
- id: suppress-canary
name: Canary Freeze
effect: suppress
routingTemplate: secops-approvers
maxDurationDays: 14
routingTemplates:
- id: secops-approvers
authorityRouteId: governance.secops
requireMfa: true
rules:
- name: Block High And Above
severity: [High, Critical]

View File

@@ -124,7 +124,7 @@ Consumers should map these codes to CLI exit codes and structured log events so
- `POST /aoc/verify`: runs guard checks over recent documents and returns summary totals plus first violations.
- **Excititor ingestion** (`StellaOps.Excititor.WebService`) mirrors the same surface for VEX documents.
- **CLI workflows** (`stella aoc verify`, `stella sources ingest --dry-run`) surface pre-flight verification; documentation will live in `/docs/cli/` alongside Sprint 19 CLI updates.
- **Authority scopes**: new `advisory:write`, `advisory:verify`, `vex:write`, and `vex:verify` scopes enforce least privilege; see [Authority Architecture](../ARCHITECTURE_AUTHORITY.md) for scope grammar.
- **Authority scopes**: new `advisory:ingest`, `advisory:read`, `vex:ingest`, and `vex:read` scopes enforce least privilege; see [Authority Architecture](../ARCHITECTURE_AUTHORITY.md) for scope grammar.
## 7. Idempotency and Supersedes Rules
@@ -154,7 +154,7 @@ Consumers should map these codes to CLI exit codes and structured log events so
## 10. Security and Tenancy Checklist
- Enforce Authority scopes (`advisory:write`, `vex:write`, `advisory:verify`, `vex:verify`) and require tenant claims on every request.
- Enforce Authority scopes (`advisory:ingest`, `vex:ingest`, `advisory:read`, `vex:read`) and require tenant claims on every request.
- Maintain pinned trust stores for signature verification; capture verification result in metrics and logs.
- Ensure collectors never log secrets or raw authentication headers; redact tokens before persistence.
- Validate that Policy Engine remains the only identity with permission to write `effective_finding_*` documents.
@@ -173,4 +173,4 @@ Consumers should map these codes to CLI exit codes and structured log events so
---
*Last updated: 2025-10-26 (Sprint 19).*
*Last updated: 2025-10-27 (Sprint 19).*

207
docs/install/docker.md Normal file
View File

@@ -0,0 +1,207 @@
# StellaOps Console — Docker Install Recipes
> **Audience:** Deployment Guild, Console Guild, platform operators.
> **Scope:** Acquire the `stellaops/web-ui` image, run it with Compose or Helm, mirror it for airgapped environments, and keep parity with CLI workflows.
This guide focuses on the new **StellaOps Console** container. Start with the general [Installation Guide](../21_INSTALL_GUIDE.md) for shared prerequisites (Docker, registry access, TLS) and use the steps below to layer in the console.
---
## 1·Release artefacts
| Artefact | Source | Verification |
|----------|--------|--------------|
| Console image | `registry.stella-ops.org/stellaops/web-ui@sha256:<digest>` | Listed in `deploy/releases/<channel>.yaml` (`yq '.services[] | select(.name=="web-ui") | .image'`). Signed with Cosign (`cosign verify --key https://stella-ops.org/keys/cosign.pub …`). |
| Compose bundles | `deploy/compose/docker-compose.{dev,stage,prod,airgap}.yaml` | Each profile already includes a `web-ui` service pinned to the release digest. Run `docker compose --env-file <env> -f docker-compose.<profile>.yaml config` to confirm the digest matches the manifest. |
| Helm values | `deploy/helm/stellaops/values-*.yaml` (`services.web-ui`) | CI lints the chart; use `helm template` to confirm the rendered Deployment/Service carry the expected digest and env vars. |
| Offline artefact (preview) | Generated via `oras copy registry.stella-ops.org/stellaops/web-ui@sha256:<digest> oci-archive:stellaops-web-ui-<channel>.tar` | Record SHA-256 in the downloads manifest (`DOWNLOADS-CONSOLE-23-001`) and sign with Cosign before shipping in the Offline Kit. |
> **Tip:** Keep Compose/Helm digests in sync with the release manifest to preserve determinism. `deploy/tools/validate-profiles.sh` performs a quick cross-check.
---
## 2·Compose quickstart (connected host)
1. **Prepare workspace**
```bash
mkdir stella-console && cd stella-console
cp /path/to/repo/deploy/compose/env/dev.env.example .env
```
2. **Add console configuration** append the following to `.env` (adjust per environment):
```bash
CONSOLE_PUBLIC_BASE_URL=https://console.dev.stella-ops.local
CONSOLE_GATEWAY_BASE_URL=https://api.dev.stella-ops.local
AUTHORITY_ISSUER=https://authority.dev.stella-ops.local
AUTHORITY_CLIENT_ID=console-ui
AUTHORITY_SCOPES="ui.read ui.admin findings:read advisory:read vex:read aoc:verify"
AUTHORITY_DPOP_ENABLED=true
```
Optional extras from [`docs/deploy/console.md`](../deploy/console.md):
```bash
CONSOLE_FEATURE_FLAGS=runs,downloads,policies
CONSOLE_METRICS_ENABLED=true
CONSOLE_LOG_LEVEL=Information
```
3. **Verify bundle provenance**
```bash
cosign verify-blob \
--key https://stella-ops.org/keys/cosign.pub \
--signature /path/to/repo/deploy/compose/docker-compose.dev.yaml.sig \
/path/to/repo/deploy/compose/docker-compose.dev.yaml
```
4. **Launch infrastructure + console**
```bash
docker compose --env-file .env -f /path/to/repo/deploy/compose/docker-compose.dev.yaml up -d mongo minio
docker compose --env-file .env -f /path/to/repo/deploy/compose/docker-compose.dev.yaml up -d web-ui
```
The `web-ui` service exposes the console on port `8443` by default. Change the published port in the Compose file if you need to front it with an existing reverse proxy.
5. **Health check**
```bash
curl -k https://console.dev.stella-ops.local/health/ready
```
Expect `{"status":"Ready"}`. If the response is `401`, confirm Authority credentials and scopes.
---
## 3·Helm deployment (cluster)
1. **Create an overlay** (example `console-values.yaml`):
```yaml
global:
release:
version: "2025.10.0-edge"
services:
web-ui:
image: registry.stella-ops.org/stellaops/web-ui@sha256:38b225fa7767a5b94ebae4dae8696044126aac429415e93de514d5dd95748dcf
service:
port: 8443
env:
CONSOLE_PUBLIC_BASE_URL: "https://console.dev.stella-ops.local"
CONSOLE_GATEWAY_BASE_URL: "https://api.dev.stella-ops.local"
AUTHORITY_ISSUER: "https://authority.dev.stella-ops.local"
AUTHORITY_CLIENT_ID: "console-ui"
AUTHORITY_SCOPES: "ui.read ui.admin findings:read advisory:read vex:read aoc:verify"
AUTHORITY_DPOP_ENABLED: "true"
CONSOLE_FEATURE_FLAGS: "runs,downloads,policies"
CONSOLE_METRICS_ENABLED: "true"
```
2. **Render and validate**
```bash
helm template stella-console ./deploy/helm/stellaops -f console-values.yaml | \
grep -A2 'name: stellaops-web-ui' -A6 'image:'
```
3. **Deploy**
```bash
helm upgrade --install stella-console ./deploy/helm/stellaops \
-f deploy/helm/stellaops/values-dev.yaml \
-f console-values.yaml
```
4. **Post-deploy checks**
```bash
kubectl get pods -l app.kubernetes.io/name=stellaops-web-ui
kubectl port-forward deploy/stellaops-web-ui 8443:8443
curl -k https://localhost:8443/health/ready
```
---
## 4·Offline packaging
1. **Mirror the image to an OCI archive**
```bash
DIGEST=$(yq '.services[] | select(.name=="web-ui") | .image' deploy/releases/2025.10-edge.yaml | cut -d@ -f2)
oras copy registry.stella-ops.org/stellaops/web-ui@${DIGEST} \
oci-archive:stellaops-web-ui-2025.10.0.tar
shasum -a 256 stellaops-web-ui-2025.10.0.tar
```
2. **Sign the archive**
```bash
cosign sign-blob --key ~/keys/offline-kit.cosign \
--output-signature stellaops-web-ui-2025.10.0.tar.sig \
stellaops-web-ui-2025.10.0.tar
```
3. **Load in the air-gap**
```bash
docker load --input stellaops-web-ui-2025.10.0.tar
docker tag stellaops/web-ui@${DIGEST} registry.airgap.local/stellaops/web-ui:2025.10.0
```
4. **Update the Offline Kit manifest** (once the downloads pipeline lands):
```bash
jq '.artifacts.console.webUi = {
"digest": "sha256:'"${DIGEST#sha256:}"'",
"archive": "stellaops-web-ui-2025.10.0.tar",
"signature": "stellaops-web-ui-2025.10.0.tar.sig"
}' downloads/manifest.json > downloads/manifest.json.tmp
mv downloads/manifest.json.tmp downloads/manifest.json
```
Re-run `stella offline kit import downloads/manifest.json` to validate signatures inside the airgapped environment.
---
## 5·CLI parity
Console operations map directly to scriptable workflows:
| Action | CLI path |
|--------|----------|
| Fetch signed manifest entry | `stella downloads manifest show --artifact console/web-ui` *(CLI task `CONSOLE-DOC-23-502`, pending release)* |
| Mirror digest to OCI archive | `stella downloads mirror --artifact console/web-ui --to oci-archive:stellaops-web-ui.tar` *(planned alongside CLI AOC parity)* |
| Import offline kit | `stella offline kit import stellaops-web-ui-2025.10.0.tar` |
| Validate console health | `stella console status --endpoint https://console.dev.stella-ops.local` *(planned; fallback to `curl` as shown above)* |
Track progress for the CLI commands via `DOCS-CONSOLE-23-014` (CLI vs UI parity matrix).
---
## 6·Compliance checklist
- [ ] Image digest validated against the current release manifest.
- [ ] Compose/Helm deployments verified with `docker compose config` / `helm template`.
- [ ] Authority issuer, scopes, and DPoP settings documented and applied.
- [ ] Offline archive mirrored, signed, and recorded in the downloads manifest.
- [ ] CLI parity notes linked to the upcoming `docs/cli-vs-ui-parity.md` matrix.
- [ ] References cross-checked with `docs/deploy/console.md` and `docs/security/console-security.md`.
- [ ] Health checks documented for connected and air-gapped installs.
---
## 7·References
- `deploy/releases/<channel>.yaml` Release manifest (digests, SBOM metadata).
- `deploy/compose/README.md` Compose profile overview.
- `deploy/helm/stellaops/values-*.yaml` Helm defaults per environment.
- `/docs/deploy/console.md` Detailed environment variables, CSP, health checks.
- `/docs/security/console-security.md` Auth flows, scopes, DPoP, monitoring.
- `/docs/ui/downloads.md` Downloads manifest workflow and offline parity guidance.
---
*Last updated: 2025-10-28 (Sprint23).*

View File

@@ -0,0 +1,191 @@
# Console Observability
> **Audience:** Observability Guild, Console Guild, SRE/operators.
> **Scope:** Metrics, logs, traces, dashboards, alerting, feature flags, and offline workflows for the StellaOps Console (Sprint23).
> **Prerequisites:** Console deployed with metrics enabled (`CONSOLE_METRICS_ENABLED=true`) and OTLP exporters configured (`OTEL_EXPORTER_OTLP_*`).
---
## 1·Instrumentation Overview
- **Telemetry stack:** OpenTelemetry Web SDK (browser) + Console telemetry bridge → OTLP collector (Tempo/Prometheus/Loki). Server-side endpoints expose `/metrics` (Prometheus) and `/health/*`.
- **Sampling:** Front-end spans sample at 5% by default (`OTEL_TRACES_SAMPLER=parentbased_traceidratio`). Metrics are un-sampled; log sampling is handled per category (§3).
- **Correlation IDs:** Every API call carries `x-stellaops-correlation-id`; structured UI events mirror that value so operators can follow a request across gateway, backend, and UI.
- **Scope gating:** Operators need the `ui.telemetry` scope to view live charts in the Admin workspace; the scope also controls access to `/console/telemetry` SSE streams.
---
## 2·Metrics
### 2.1 Experience & Navigation
| Metric | Type | Labels | Notes |
|--------|------|--------|-------|
| `ui_route_render_seconds` | Histogram | `route`, `tenant`, `device` (`desktop`,`tablet`) | Time between route activation and first interactive paint. Target P95 ≤1.5s (cached). |
| `ui_request_duration_seconds` | Histogram | `service`, `method`, `status`, `tenant` | Gateway proxy timing for backend calls performed by the console. Alerts when backend latency degrades. |
| `ui_filter_apply_total` | Counter | `route`, `filter`, `tenant` | Increments when a global filter or context chip is applied. Used to track adoption of saved views. |
| `ui_tenant_switch_total` | Counter | `fromTenant`, `toTenant`, `trigger` (`picker`, `shortcut`, `link`) | Emitted after a successful tenant switch; correlates with Authority `ui.tenant.switch` logs. |
| `ui_offline_banner_seconds` | Histogram | `reason` (`authority`, `manifest`, `gateway`), `tenant` | Duration of offline banner visibility; integrate with air-gap SLAs. |
### 2.2 Security & Session
| Metric | Type | Labels | Notes |
|--------|------|--------|-------|
| `ui_dpop_failure_total` | Counter | `endpoint`, `reason` (`nonce`, `jkt`, `clockSkew`) | Raised when DPoP validation fails; pair with Authority audit trail. |
| `ui_fresh_auth_prompt_total` | Counter | `action` (`token.revoke`, `policy.activate`, `client.create`), `tenant` | Counts fresh-auth modals; backlog above baseline indicates workflow friction. |
| `ui_fresh_auth_failure_total` | Counter | `action`, `reason` (`timeout`,`cancelled`,`auth_error`) | Optional metric (set `CONSOLE_FRESH_AUTH_METRICS=true` when feature flag lands). |
### 2.3 Downloads & Offline Kit
| Metric | Type | Labels | Notes |
|--------|------|--------|-------|
| `ui_download_manifest_refresh_seconds` | Histogram | `tenant`, `channel` (`edge`,`stable`,`airgap`) | Time to fetch and verify downloads manifest. Target <3s. |
| `ui_download_export_queue_depth` | Gauge | `tenant`, `artifactType` (`sbom`,`policy`,`attestation`,`console`) | Mirrors `/console/downloads` queue depth; triggers when offline bundles lag. |
| `ui_download_command_copied_total` | Counter | `tenant`, `artifactType` | Increments when users copy CLI commands from the UI. Useful to observe CLI parity adoption. |
### 2.4 Telemetry Emission & Errors
| Metric | Type | Labels | Notes |
|--------|------|--------|-------|
| `ui_telemetry_batch_failures_total` | Counter | `transport` (`otlp-http`,`otlp-grpc`), `reason` | Emitted by OTLP bridge when batches fail. Enable via `CONSOLE_METRICS_VERBOSE=true`. |
| `ui_telemetry_queue_depth` | Gauge | `priority` (`normal`,`high`), `tenant` | Browser-side buffer depth; monitor for spikes under degraded collectors. |
> **Scraping tips:**
> - Enable `/metrics` via `CONSOLE_METRICS_ENABLED=true`.
> - Set `OTEL_EXPORTER_OTLP_ENDPOINT=https://otel.collector:4318` and relevant headers (`OTEL_EXPORTER_OTLP_HEADERS=authorization=Bearer <token>`).
> - For air-gapped sites, point the exporter to the Offline Kit collector (`localhost:4318`) and forward the metrics snapshot using `stella offline bundle metrics`.
---
## 3·Logs
- **Format:** JSON via Console log bridge; emitted to stdout and optional OTLP log exporter. Core fields: `timestamp`, `level`, `action`, `route`, `tenant`, `subject`, `correlationId`, `dpop.jkt`, `device`, `offlineMode`.
- **Categories:**
- `ui.action` general user interactions (route changes, command palette, filter updates). Sampled 50% by default; override with feature flag `telemetry.logVerbose`.
- `ui.tenant.switch` always logged; includes `fromTenant`, `toTenant`, `tokenId`, and Authority audit correlation.
- `ui.download.commandCopied` download commands copied; includes `artifactId`, `digest`, `manifestVersion`.
- `ui.security.anomaly` DPoP mismatches, tenant header errors, CSP violations (level = `Warning`).
- `ui.telemetry.failure` OTLP export errors; include `httpStatus`, `batchSize`, `retryCount`.
- **PII handling:** Full emails are scrubbed; only hashed values (`user:<sha256>`) appear unless `ui.admin` + fresh-auth were granted for the action (still redacted in logs).
- **Retention:** Recommended 14days for connected sites, 30days for sealed/air-gap audits. Ship logs to Loki/Elastic with ingest label `service="stellaops-web-ui"`.
---
## 4·Traces
- **Span names & attributes:**
- `ui.route.transition` wraps route navigation; attributes: `route`, `tenant`, `renderMillis`, `prefetchHit`.
- `ui.api.fetch` HTTP fetch to backend; attributes: `service`, `endpoint`, `status`, `networkTime`.
- `ui.sse.stream` Server-sent event subscriptions (status ticker, runs); attributes: `channel`, `connectedMillis`, `reconnects`.
- `ui.telemetry.batch` Browser OTLP flush; attributes: `batchSize`, `success`, `retryCount`.
- `ui.policy.action` Policy workspace actions (simulate, approve, activate) per `docs/ui/policy-editor.md`.
- **Propagation:** Spans use W3C `traceparent`; gateway echoes header to backend APIs so traces stitch across UI gateway service.
- **Sampling controls:** `OTEL_TRACES_SAMPLER_ARG` (ratio) and feature flag `telemetry.forceSampling` (sets to 100% for incident debugging).
- **Viewing traces:** Grafana Tempo or Jaeger via collector. Filter by `service.name = stellaops-console`. For cross-service debugging, filter on `correlationId` and `tenant`.
---
## 5·Dashboards
### 5.1 Experience Overview
Panels:
- Route render histogram (P50/P90/P99) by route.
- Backend call latency stacked by service (`ui_request_duration_seconds`).
- Offline banner duration trend (`ui_offline_banner_seconds`).
- Tenant switch volume vs failure rate (overlay `ui_dpop_failure_total`).
- Command palette usage (`ui_filter_apply_total` + `ui.action` log counts).
### 5.2 Downloads & Offline Kit
- Manifest refresh time chart (per channel).
- Export queue depth gauge with alert thresholds.
- CLI command adoption (bar chart per artifact type, using `ui_download_command_copied_total`).
- Offline parity banner occurrences (`downloads.offlineParity` flag from API derived metric).
- Last Offline Kit import timestamp (join with Downloads API metadata).
### 5.3 Security & Session
- Fresh-auth prompt counts vs success/fail ratios.
- DPoP failure stacked by reason.
- Tenant mismatch warnings (from `ui.security.anomaly` logs).
- Scope usage heatmap (derived from Authority audit events + UI logs).
- CSP violation counts (browser `securitypolicyviolation` listener forwarded to logs).
> Capture screenshots for Grafana once dashboards stabilise (`docs/assets/ui/observability/*.png`). Replace placeholders before releasing the doc.
---
## 6·Alerting
| Alert | Condition | Suggested Action |
|-------|-----------|------------------|
| **ConsoleLatencyHigh** | `ui_route_render_seconds_bucket{le="1.5"}` drops below 0.95 for 3 intervals | Inspect route splits, check backend latencies, review CDN cache. |
| **BackendLatencyHigh** | `ui_request_duration_seconds_sum / ui_request_duration_seconds_count` > 1s for any service | Correlate with gateway/service dashboards; escalate to owning guild. |
| **TenantSwitchFailures** | Increase in `ui_dpop_failure_total` or `ui.security.anomaly` (tenant mismatch) > 3/min | Validate Authority issuer, check clock skew, confirm tenant config. |
| **FreshAuthLoop** | `ui_fresh_auth_prompt_total` spikes with matching `ui_fresh_auth_failure_total` | Review Authority `/fresh-auth` endpoint, session timeout config, UX regressions. |
| **OfflineBannerLong** | `ui_offline_banner_seconds` P95 > 120s | Investigate Authority/gateway availability; verify Offline Kit freshness. |
| **DownloadsBacklog** | `ui_download_export_queue_depth` > 5 for 10min OR queue age > alert threshold | Ping Downloads service, ensure manifest pipeline (`DOWNLOADS-CONSOLE-23-001`) is healthy. |
| **TelemetryExportErrors** | `ui_telemetry_batch_failures_total` > 0 for ≥5min | Check collector health, credentials, or TLS trust. |
Integrate alerts with Notifier (`ui.alerts`) or existing Ops channels. Tag incidents with `component=console` for correlation.
---
## 7·Feature Flags & Configuration
| Flag / Env Var | Purpose | Default |
|----------------|---------|---------|
| `CONSOLE_FEATURE_FLAGS` | Enables UI modules (`runs`, `downloads`, `policies`, `telemetry`). Telemetry panel requires `telemetry`. | `runs,downloads,policies` |
| `CONSOLE_METRICS_ENABLED` | Exposes `/metrics` for Prometheus scrape. | `true` |
| `CONSOLE_METRICS_VERBOSE` | Emits additional batching metrics (`ui_telemetry_*`). | `false` |
| `CONSOLE_LOG_LEVEL` | Minimum log level (`Information`, `Debug`). Use `Debug` for incident sampling. | `Information` |
| `CONSOLE_METRICS_SAMPLING` *(planned)* | Controls front-end span sampling ratio. Document once released. | `0.05` |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | Collector URL; supports HTTPS. | unset |
| `OTEL_EXPORTER_OTLP_HEADERS` | Comma-separated headers (auth). | unset |
| `OTEL_EXPORTER_OTLP_INSECURE` | Allow HTTP (dev only). | `false` |
| `OTEL_SERVICE_NAME` | Service tag for traces/logs. Set to `stellaops-console`. | auto |
| `CONSOLE_TELEMETRY_SSE_ENABLED` | Enables `/console/telemetry` SSE feed for dashboards. | `true` |
Feature flag changes should be tracked in release notes and mirrored in `/docs/ui/navigation.md` (shortcuts may change when modules toggle).
---
## 8·Offline / Air-Gapped Workflow
- Mirror the console image and telemetry collector as part of the Offline Kit (see `/docs/install/docker.md` §4).
- Scrape metrics locally via `curl -k https://console.local/metrics > metrics.prom`; archive alongside logs for audits.
- Use `stella offline kit import` to keep the downloads manifest in sync; dashboards display staleness using `ui_download_manifest_refresh_seconds`.
- When collectors are unavailable, console queues OTLP batches (up to 5min) and exposes backlog through `ui_telemetry_queue_depth`; export queue metrics to prove no data loss.
- After reconnecting, run `stella console status --telemetry` *(CLI parity pending; see DOCS-CONSOLE-23-014)* or verify `ui_telemetry_batch_failures_total` resets to zero.
- Retain telemetry bundles for 30days per compliance guidelines; include Grafana JSON exports in audit packages.
---
## 9·Compliance Checklist
- [ ] `/metrics` scraped in staging & production; dashboards display `ui_route_render_seconds`, `ui_request_duration_seconds`, and downloads metrics.
- [ ] OTLP traces/logs confirmed end-to-end (collector, Tempo/Loki).
- [ ] Alert rules from §6 implemented in monitoring stack with runbooks linked.
- [ ] Feature flags documented and change-controlled; telemetry disabled only with approval.
- [ ] DPoP/fresh-auth anomalies correlated with Authority audit logs during drill.
- [ ] Offline capture workflow exercised; evidence stored in audit vault.
- [ ] Screenshots of Grafana dashboards committed once they stabilise (update references).
- [ ] Cross-links verified (`docs/deploy/console.md`, `docs/security/console-security.md`, `docs/ui/downloads.md`, `docs/ui/console-overview.md`).
---
## 10·References
- `/docs/deploy/console.md` Metrics endpoint, OTLP config, health checks.
- `/docs/security/console-security.md` Security metrics & alert hints.
- `/docs/ui/console-overview.md` Telemetry primitives and performance budgets.
- `/docs/ui/downloads.md` Downloads metrics and parity workflow.
- `/docs/observability/observability.md` Platform-wide practices.
- `/ops/telemetry-collector.md` & `/ops/telemetry-storage.md` Collector deployment.
- `/docs/install/docker.md` Compose/Helm environment variables.
---
*Last updated: 2025-10-28 (Sprint23).*

View File

@@ -0,0 +1,152 @@
# Policy Exception Effects
> **Audience:** Policy authors, reviewers, operators, and governance owners.
> **Scope:** How exception definitions are authored, resolved, and surfaced by the Policy Engine during evaluation, including precedence rules, metadata flow, and simulation/diff behaviour.
Exception effects let teams codify governed waivers without compromising determinism. This guide explains the artefacts involved, how the evaluator selects a single winning exception, and where downstream consumers observe the applied override.
---
## 1·Exception Building Blocks
| Artefact | Description |
|----------|-------------|
| **Exception Effect** | Declared inside a policy pack (`exceptions.effects`). Defines the override behaviour plus governance metadata. See effect fields in §2. |
| **Routing Template** | Optional mapping (`exceptions.routingTemplates`) used by Authority to route approvals/MFA. Effects reference templates by id. |
| **Exception Instance** | Stored outside the policy pack (Authority/API). Captures who requested the waiver, scope filters, metadata, and creation time. |
Effects are validated at bind time (`PolicyBinder`), while instances are ingested alongside policy evaluation inputs. Both are normalized to case-insensitive identifiers to avoid duplicate conflicts.
---
## 2·Effect Fields
| Field | Required | Purpose | Notes |
|-------|----------|---------|-------|
| `id` | ✅ | Stable identifier (`[A-Za-z0-9-_]+`). | Must be unique per policy pack. |
| `name` | — | Friendly label for consoles/reports. | Forwarded to verdict metadata if present. |
| `effect` | ✅ | Behaviour enum: `suppress`, `defer`, `downgrade`, `requireControl`. | Case-insensitive. |
| `downgradeSeverity` | ⚠️ | Target severity for `downgrade`. | Must map to DSL severities (`high`, `medium`, etc.). Validation enforced in `PolicyBinder` (`policy.exceptions.effect.downgrade.missingSeverity`). |
| `requiredControlId` | ⚠️ | Control catalogue key for `requireControl`. | Required when effect is `requireControl`. |
| `routingTemplate` | — | Connects to an Authority approval flow. | CLI/Console resolve to `authorityRouteId`. |
| `maxDurationDays` | — | Soft limit for temporary waivers. | Must be > 0 when provided. |
| `description` | — | Rich-text rationale. | Displayed in approvals centre (optional). |
Authoring invalid combinations returns structured errors with JSON paths, preventing packs from compiling (see `src/StellaOps.Policy.Tests/PolicyBinderTests.cs:33`). Routing templates additionally declare `authorityRouteId` and `requireMfa` flags for governance routing.
---
## 3·Exception Instances & Scope
Instances are resolved from Authority or API collections and injected into the evaluation context (`PolicyEvaluationExceptions`). Each instance contains:
| Field | Source | Usage |
|-------|--------|-------|
| `id` | Authority storage | Propagated to annotations and `appliedException.exceptionId`. |
| `effectId` | Links to pack-defined effect | Must resolve to a known effect; otherwise ignored. |
| `scope.ruleNames` | Optional list | Limits to specific rule identifiers. |
| `scope.severities` | Optional list (`severity.normalized`) | Normalized against the evaluators severity string. |
| `scope.sources` | Optional advisory sources (`GHSA`, `NVD`, …) | Compared against the advisory context. |
| `scope.tags` | Optional SBOM tags | Matched using `sbom.has_tag(...)`. |
| `createdAt` | RFC3339 UTC timestamp | Used as tie-breaker when specificity scores match. |
| `metadata` | Arbitrary key/value bag | Copied to verdict annotations (`exception.meta.*`). |
Scopes are case-insensitive and trimmed. Empty scopes behave as global waivers but still require routing and metadata supplied by Authority workflows.
---
## 4·Resolution & Specificity
Only one exception effect is applied per finding. Evaluation proceeds as follows:
1. Filter instances whose `effectId` resolves to a known effect.
2. Discard instances whose scope does not match the candidate finding (rule name, severity, advisory source, SBOM tags).
3. Score remaining instances for **specificity**:
- `ruleNames``1000 + (count × 25)`
- `severities``500 + (count × 10)`
- `sources``250 + (count × 10)`
- `tags``100 + (count × 5)`
4. Highest score wins. Ties fall back to the newest `createdAt`, then lexical `id` (stable sorting).
These rules guarantee deterministic selection even when multiple waivers overlap. See `src/StellaOps.Policy.Engine.Tests/PolicyEvaluatorTests.cs:209` for tie-break coverage.
---
## 5·Effect Behaviours
| Effect | Status impact | Severity impact | Warnings / metadata |
|--------|---------------|-----------------|---------------------|
| `suppress` | Forces status `suppressed`. | No change. | `exception.status=suppressed`. |
| `defer` | Forces status `deferred`. | No change. | `exception.status=deferred`. |
| `downgrade` | No change. | Sets severity to configured `downgradeSeverity`. | `exception.severity` annotation. |
| `requireControl` | No change. | No change. | Adds warning `Exception '<id>' requires control '<requiredControlId>'`. Annotation `exception.requiredControl`. |
All effects stamp shared annotations: `exception.id`, `exception.effectId`, `exception.effectType`, optional `exception.effectName`, optional `exception.routingTemplate`, plus `exception.maxDurationDays`. Instance metadata is surfaced both in annotations (`exception.meta.<key>`) and the structured `AppliedException.Metadata` payload for downstream APIs. Behaviour is validated by unit tests (`src/StellaOps.Policy.Engine.Tests/PolicyEvaluatorTests.cs:130` & `src/StellaOps.Policy.Engine.Tests/PolicyEvaluatorTests.cs:169`).
---
## 6·Explain, Simulation & Outputs
- **Explain traces / CLI simulate** Verdict payloads include `appliedException` capturing original vs applied status/severity, enabling diff visualisation in Console and CLI previews.
- **Annotations** Deterministic keys make it trivial for exports or alerting pipelines to flag waived findings.
- **Warnings** `requireControl` adds runtime warnings so operators can enforce completion of compensating controls.
- **Routing** When `routingTemplate` is populated, verdict metadata includes `routingTemplate`, allowing UI surfaces to deep-link into the approvals centre.
Example verdict excerpt (JSON):
```json
{
"status": "suppressed",
"severity": "Critical",
"annotations": {
"exception.id": "exc-001",
"exception.effectId": "suppress-critical",
"exception.effectType": "Suppress",
"exception.status": "suppressed",
"exception.meta.requestedBy": "alice"
},
"appliedException": {
"exceptionId": "exc-001",
"effectId": "suppress-critical",
"effectType": "Suppress",
"originalStatus": "blocked",
"appliedStatus": "suppressed",
"metadata": {
"effectName": "Rule Critical Suppress",
"requestedBy": "alice"
}
}
}
```
---
## 7·Operational Notes
- **Authoring** Policy packs must ship effect definitions before Authority can issue instances. CLI validation (`stella policy lint`) fails if required fields are missing.
- **Approvals & MFA** Effects referencing routing templates inherit `requireMfa` rules from `exceptions.routingTemplates`. Governance guidance in `/docs/11_GOVERNANCE.md` captures Authority approval flows and audit expectations.
- **Presence in exports** Even when an exception suppresses a finding, explain traces and effective findings retain the applied exception metadata for audit parity.
- **Determinism** Specificity scoring plus tie-breakers ensure repeatable outcomes across runs, supporting sealed/offline replay.
---
## 8·Testing References
- `src/StellaOps.Policy.Tests/PolicyBinderTests.cs:33` Validates schema rules for defining effects, routing templates, and downgrade guardrails.
- `src/StellaOps.Policy.Engine.Tests/PolicyEvaluatorTests.cs:130` Covers suppression, downgrade, and metadata propagation.
- `src/StellaOps.Policy.Engine.Tests/PolicyEvaluatorTests.cs:209` Confirms specificity ordering and metadata forwarding for competing exceptions.
---
## 9·Compliance Checklist
- [ ] **Effect catalogue maintained:** Each policy pack documents available effects and routing templates for auditors.
- [ ] **Authority alignment:** Approval routes in Authority mirror `routingTemplate` definitions and enforce MFA where required.
- [ ] **Explain coverage:** Console/CLI surfaces display `appliedException` details and `exception.*` annotations for every waived verdict.
- [ ] **Simulation parity:** `stella policy simulate` outputs include exception metadata, ensuring PR/CI reviews catch unintended waivers.
- [ ] **Audit retention:** Effective findings history retains `appliedException` payloads so exception lifecycle reviews remain replayable.
- [ ] **Tests locked:** Binder and evaluator tests covering exception paths remain green before publishing documentation updates.
---
*Last updated: 2025-10-27 (Sprint 25).*

View File

@@ -11,28 +11,29 @@ Authority issues short-lived tokens bound to tenants and scopes. Sprint19 int
| Scope | Surface | Purpose | Notes |
|-------|---------|---------|-------|
| `advisory:write` | Concelier ingestion APIs | Allows append-only writes to `advisory_raw`. | Granted to Concelier WebService and trusted connectors. Requires tenant claim. |
| `advisory:verify` | Concelier `/aoc/verify`, CLI, UI dashboard | Permits guard verification and access to violation summaries. | Read-only; used by `stella aoc verify` and console dashboard. |
| `vex:write` | Excititor ingestion APIs | Append-only writes to `vex_raw`. | Mirrors `advisory:write`. |
| `vex:verify` | Excititor `/aoc/verify`, CLI | Read-only verification of VEX ingestion. | Optional for environments without VEX feeds. |
| `graph:write` | Cartographer build pipeline | Enqueue graph build/overlay jobs. | Reserved for the Cartographer service identity; requires tenant claim. |
| `graph:read` | Graph API, Scheduler overlays, UI | Read graph projections/overlays. | Requires tenant claim; granted to Cartographer, Graph API, Scheduler. |
| `advisory:ingest` | Concelier ingestion APIs | Append-only writes to `advisory_raw` collections. | Requires tenant claim; blocked for global clients. |
| `advisory:read` | `/aoc/verify`, Concelier dashboards, CLI | Read-only access to stored advisories and guard results. | Needed alongside `aoc:verify` for CLI/console verification. |
| `vex:ingest` | Excititor ingestion APIs | Append-only writes to `vex_raw`. | Mirrors `advisory:ingest`; tenant required. |
| `vex:read` | `/aoc/verify`, Excititor dashboards, CLI | Read-only access to stored VEX material. | Pair with `aoc:verify` for guard checks. |
| `aoc:verify` | CLI/CI pipelines, Console verification jobs | Execute Aggregation-Only Contract guard runs. | Always issued with tenant; read-only combined with `advisory:read`/`vex:read`. |
| `graph:write` | Cartographer pipeline | Enqueue graph build/overlay jobs. | Reserved for Cartographer service identity; tenant required. |
| `graph:read` | Graph API, Scheduler overlays, UI | Read graph projections/overlays. | Tenant required; granted to Cartographer, Graph API, Scheduler. |
| `graph:export` | Graph export endpoints | Stream GraphML/JSONL artefacts. | UI/gateway automation only; tenant required. |
| `graph:simulate` | Policy simulation overlays | Trigger what-if overlays on graphs. | Restricted to automation; tenant required. |
| `effective:write` | Policy Engine | Allows creation/update of `effective_finding_*` collections. | **Only** the Policy Engine service client may hold this scope. |
| `effective:read` | Console, CLI, exports | Read derived findings. | Shared across tenants with role-based restrictions. |
| `aoc:dashboard` | Console UI | Access AOC dashboard resources. | Bundles `advisory:verify`/`vex:verify` by default; keep for UI RBAC group mapping. |
| `aoc:verify` | Automation service accounts | Execute verification via API without the full dashboard role. | For CI pipelines, offline kit validators. |
| Existing scopes | (e.g., `policy:*`, `sbom:*`) | Unchanged. | Review `/docs/security/policy-governance.md` for policy-specific scopes. |
| `effective:write` | Policy Engine | Create/update `effective_finding_*` collections. | **Only** the Policy Engine service client may hold this scope; tenant required. |
| `findings:read` | Console, CLI, exports | Read derived findings materialised by Policy Engine. | Shared across tenants with RBAC; tenant claim still enforced. |
| `vuln:read` | Vuln Explorer API/UI | Read normalized vulnerability data. | Tenant required. |
| Existing scopes | (e.g., `policy:*`, `concelier.jobs.trigger`) | Unchanged. | Review `/docs/security/policy-governance.md` for policy-specific scopes. |
### 1.1Scope bundles (roles)
- **`role/concelier-ingest`** → `advisory:write`, `advisory:verify`.
- **`role/excititor-ingest`** → `vex:write`, `vex:verify`.
- **`role/aoc-operator`** → `aoc:dashboard`, `aoc:verify`, `advisory:verify`, `vex:verify`.
- **`role/policy-engine`** → `effective:write`, `effective:read`.
- **`role/concelier-ingest`** → `advisory:ingest`, `advisory:read`.
- **`role/excititor-ingest`** → `vex:ingest`, `vex:read`.
- **`role/aoc-operator`** → `aoc:verify`, `advisory:read`, `vex:read`.
- **`role/policy-engine`** → `effective:write`, `findings:read`.
- **`role/cartographer-service`** → `graph:write`, `graph:read`.
- **`role/graph-gateway`** → `graph:read`, `graph:export`, `graph:simulate`.
- **`role/console`** → `advisory:read`, `vex:read`, `aoc:verify`, `findings:read`, `vuln:read`.
Roles are declared per tenant in `authority.yaml`:
@@ -41,11 +42,11 @@ tenants:
- name: default
roles:
concelier-ingest:
scopes: [advisory:write, advisory:verify]
scopes: [advisory:ingest, advisory:read]
aoc-operator:
scopes: [aoc:dashboard, aoc:verify, advisory:verify, vex:verify]
scopes: [aoc:verify, advisory:read, vex:read]
policy-engine:
scopes: [effective:write, effective:read]
scopes: [effective:write, findings:read]
```
---
@@ -62,10 +63,10 @@ Tokens now include:
Authority rejects requests when:
- `tenant` is missing while requesting `advisory:*`, `vex:*`, or `aoc:*` scopes.
- `tenant` is missing while requesting `advisory:ingest`, `advisory:read`, `vex:ingest`, `vex:read`, or `aoc:verify` scopes.
- `service_identity != policy-engine` but `effective:write` is present (`ERR_AOC_006` enforcement).
- `service_identity != cartographer` but `graph:write` is present (graph pipeline enforcement).
- Tokens attempt to combine `advisory:write` with `effective:write` (separation of duties).
- Tokens attempt to combine `advisory:ingest` with `effective:write` (separation of duties).
### 2.2Propagation
@@ -90,22 +91,30 @@ Add new scopes and optional claims transformations:
```yaml
security:
scopes:
- name: advisory:write
description: Concelier raw ingestion
- name: advisory:verify
description: Verify Concelier ingestion
- name: vex:write
- name: advisory:ingest
description: Concelier raw ingestion (append-only)
- name: advisory:read
description: Read Concelier advisories and guard verdicts
- name: vex:ingest
description: Excititor raw ingestion
- name: vex:verify
description: Verify Excititor ingestion
- name: aoc:dashboard
description: Access AOC UI dashboards
- name: vex:read
description: Read Excititor VEX records
- name: aoc:verify
description: Run AOC verification
- name: effective:write
description: Policy Engine materialisation
- name: effective:read
- name: findings:read
description: Read derived findings
- name: graph:write
description: Cartographer build submissions
- name: graph:read
description: Read graph overlays
- name: graph:export
description: Export graph artefacts
- name: graph:simulate
description: Run graph what-if simulations
- name: vuln:read
description: Read Vuln Explorer data
claimTransforms:
- match: { scope: "effective:write" }
require:
@@ -119,13 +128,13 @@ security:
Update service clients:
- `Concelier.WebService` → request `advisory:write`, `advisory:verify`.
- `Excititor.WebService` → request `vex:write`, `vex:verify`.
- `Policy.Engine` → request `effective:write`, `effective:read`; set `properties.serviceIdentity=policy-engine`.
- `Concelier.WebService` → request `advisory:ingest`, `advisory:read`.
- `Excititor.WebService` → request `vex:ingest`, `vex:read`.
- `Policy.Engine` → request `effective:write`, `findings:read`; set `properties.serviceIdentity=policy-engine`.
- `Cartographer.Service` → request `graph:write`, `graph:read`; set `properties.serviceIdentity=cartographer`.
- `Graph API Gateway` → request `graph:read`, `graph:export`, `graph:simulate`; tenant hint required.
- `Console` → request `aoc:dashboard`, `effective:read` plus existing UI scopes.
- `CLI automation` → request `aoc:verify`, `advisory:verify`, `vex:verify` as needed.
- `Console` → request `advisory:read`, `vex:read`, `aoc:verify`, `findings:read`, `vuln:read` plus existing UI scopes.
- `CLI automation` → request `aoc:verify`, `advisory:read`, `vex:read` as needed.
Client definition snippet:
@@ -133,11 +142,11 @@ Client definition snippet:
clients:
- clientId: concelier-web
grantTypes: [client_credentials]
scopes: [advisory:write, advisory:verify]
scopes: [advisory:ingest, advisory:read]
tenants: [default]
- clientId: policy-engine
grantTypes: [client_credentials]
scopes: [effective:write, effective:read]
scopes: [effective:write, findings:read]
properties:
serviceIdentity: policy-engine
- clientId: cartographer-service
@@ -152,7 +161,7 @@ clients:
## 4·Operational safeguards
- **Audit events:** Authority emits `authority.scope.granted` and `authority.scope.revoked` events with `scope` and `tenant`. Monitor for unexpected grants.
- **Rate limiting:** Apply stricter limits on `/token` endpoints for clients requesting `advisory:write` or `vex:write` to mitigate brute-force ingestion attempts.
- **Rate limiting:** Apply stricter limits on `/token` endpoints for clients requesting `advisory:ingest` or `vex:ingest` to mitigate brute-force ingestion attempts.
- **Incident response:** Link AOC alerts to Authority audit logs to confirm whether violations come from expected identities.
- **Rotation:** Rotate ingest client secrets alongside guard deployments; add rotation steps to `ops/authority-key-rotation.md`.
- **Testing:** Integration tests must fail if tokens lacking `tenant` attempt ingestion; add coverage in Concelier/Excititor smoke suites (see `CONCELIER-CORE-AOC-19-013`).
@@ -161,7 +170,7 @@ clients:
## 5·Offline & air-gap notes
- Offline Kit bundles include tenant-scoped service credentials. Ensure ingest bundles ship without `advisory:write` scopes unless strictly required.
- Offline Kit bundles include tenant-scoped service credentials. Ensure ingest bundles ship without `advisory:ingest` scopes unless strictly required.
- CLI verification in offline environments uses pre-issued `aoc:verify` tokens; document expiration and renewal processes.
- Authority replicas in air-gapped environments should restrict scope issuance to known tenants and log all `/token` interactions for later replay.
@@ -191,4 +200,4 @@ clients:
---
*Last updated: 2025-10-26 (Sprint19).*
*Last updated: 2025-10-27 (Sprint19).*

View File

@@ -0,0 +1,162 @@
# StellaOps Console Security Posture
> **Audience:** Security Guild, Console & Authority teams, deployment engineers.
> **Scope:** OIDC/DPoP flows, scope model, session controls, CSP and transport headers, evidence handling, offline posture, and monitoring expectations for the StellaOps Console (Sprint23).
The console is an Angular SPA fronted by the StellaOps Web gateway. It consumes Authority for identity, Concelier/Excititor for aggregation data, Policy Engine for findings, and Attestor for evidence bundles. This guide captures the security guarantees and required hardening so that the console can ship alongside the Aggregation-Only Contract (AOC) without introducing new attack surface.
---
## 1·Identity & Authentication
### 1.1 Authorization sequence
1. Browser→Authority uses **OAuth 2.1 Authorization Code + PKCE** (`S256`).
2. Upon code exchange the console requests a **DPoP-bound access token** (`aud=console`, `tenant=<id>`) with **120s TTL** and optional **rotating refresh token** (`rotate=true`).
3. Authority includes `cnf.jkt` for the ephemeral WebCrypto keypair; console stores the private key in **IndexedDB** (non-exportable) and keeps the public JWK in memory.
4. All API calls attach `Authorization: Bearer <token>` + `DPoP` proof header. Nonces from the gateway are replay-protected (`dpopt-nonce` header).
5. Tenanted API calls flow through the Web gateway which forwards `X-Stella-Tenant` and enforces tenancy headers. Missing or mismatched tenants trigger `403` with `ERR_TENANT_MISMATCH`.
### 1.2 Fresh-auth gating
- Sensitive actions (tenant edits, token revocation, policy promote, signing key rotation) call `Authority /fresh-auth` using `prompt=login` + `max_age=300`.
- Successful fresh-auth yields a **300s** scoped token (`fresh_auth=true`) stored only in memory; the UI disables guarded buttons when the timer expires.
- Audit events: `authority.fresh_auth.start`, `authority.fresh_auth.success`, `authority.fresh_auth.expired` (link to correlation IDs for the gated action).
### 1.3 Offline & sealed mode
- When `console.offlineMode=true` the console presents an offline banner and suppresses fresh-auth prompts, replacing them with CLI guidance (`stella auth fresh-auth --offline`).
- Offline mode requires pre-issued tenant-scoped tokens bundled with the Offline Kit; tokens must include `offline=true` claim and 15m TTL.
- Authority availability health is polled via `/api/console/status`. HTTP failures raise the offline banner and switch to read-only behaviour.
---
## 2·Session & Device Binding
- Access and refresh tokens live in memory; metadata (subject, tenant, expiry) persists in `sessionStorage` for reload continuity. **Never** store raw JWTs in `localStorage`.
- Inactivity timeout defaults to **15minutes**. Idle sessions trigger silent refresh; on failure the UI shows a modal requiring re-auth.
- Tokens are device-bound through DPoP; if a new device logs in, Authority revokes the previous DPoP key and emits `authority.token.binding_changed`.
- CSRF mitigations: bearer tokens plus DPoP remove cookie reliance. If cookies are required (e.g., same-origin analytics) they must be `HttpOnly`, `SameSite=Lax`, `Secure`.
- Browser hardening: enforce `Strict-Transport-Security`, `X-Content-Type-Options: nosniff`, `Referrer-Policy: no-referrer`, `Permissions-Policy: camera=(), microphone=(), geolocation=()`.
---
## 3·Authorization & Scope Model
The console client is registered in Authority as `console-ui` with scopes:
| Feature area | Required scopes | Notes |
|--------------|----------------|-------|
| Base navigation (Dashboard, Findings, SBOM, Runs) | `ui.read`, `findings:read`, `advisory:read`, `vex:read`, `aoc:verify` | `findings:read` enables Policy Engine overlays; `advisory:read`/`vex:read` load ingestion panes; `aoc:verify` allows on-demand guard runs. |
| Admin workspace | `ui.admin`, `authority:tenants.read`, `authority:tenants.write`, `authority:roles.read`, `authority:roles.write`, `authority:tokens.read`, `authority:tokens.revoke`, `authority:clients.read`, `authority:clients.write`, `authority:audit.read` | Scope combinations are tenant constrained. Role changes require fresh-auth. |
| Policy approvals | `policy:read`, `policy:review`, `policy:approve`, `policy:activate`, `policy:runs` | `policy:activate` gated behind fresh-auth. |
| Observability panes (status ticker, telemetry) | `ui.telemetry`, `scheduler:runs.read`, `advisory:read`, `vex:read` | `ui.telemetry` drives OTLP export toggles. |
| Downloads parity (SBOM, attestation) | `downloads:read`, `attestation:verify`, `sbom:export` | Console surfaces digests only; download links require CLI parity for write operations. |
Guidance:
- **Role mapping**: Provision Authority role `role/ui-console-admin` encapsulating the admin scopes above.
- **Tenant enforcement**: Gateway injects `X-Stella-Tenant` from token claims. Requests missing the header must be rejected by downstream services (Concelier, Excititor, Policy Engine) and logged.
- **Separation of duties**: Never grant `ui.admin` and `policy:approve` to the same human role without SOC sign-off; automation accounts should use least-privilege dedicated clients.
---
## 4·Transport, CSP & Browser Hardening
### 4.1 Gateway requirements
- TLS 1.2+ with modern cipher suites; enable HTTP/2 for SSE streams.
- Terminate TLS at the reverse proxy (Traefik, NGINX) and forward `X-Forwarded-*` headers (`ASPNETCORE_FORWARDEDHEADERS_ENABLED=true`).
- Rate-limit `/authorize` and `/token` according to [Authority rate-limit guidance](rate-limits.md).
### 4.2 Content Security Policy
Default CSP served by the console container:
```
default-src 'self';
connect-src 'self' https://*.stella-ops.local;
img-src 'self' data:;
script-src 'self';
style-src 'self' 'unsafe-inline';
font-src 'self';
frame-ancestors 'none';
```
Recommendations:
- Extend `connect-src` only for known internal APIs (e.g., telemetry collector). Use `console.config.cspOverrides` instead of editing NGINX directly.
- Enable **COOP/COEP** (`Cross-Origin-Opener-Policy: same-origin`, `Cross-Origin-Embedder-Policy: require-corp`) to support WASM policy previews.
- Use **Subresource Integrity (SRI)** hashes when adding third-party fonts or scripts.
- For embedded screenshots/GIFs sourced from Offline Kit, use `img-src 'self' data: blob:` and verify assets during build.
- Enforce `X-Frame-Options: DENY`, `X-XSS-Protection: 0`, and `Cache-Control: no-store` on JSON API responses (HTML assets remain cacheable).
### 4.3 SSE & WebSocket hygiene
- SSE endpoints (`/console/status/stream`, `/console/runs/{id}/events`) must set `Cache-Control: no-store` and disable proxy buffering.
- Gate SSE behind the same DPoP tokens; reject without `Authorization`.
- Proxy timeouts ≥60s to avoid disconnect storms; clients use exponential backoff with jitter.
---
## 5·Evidence & Data Handling
- **Evidence bundles**: Download links trigger `attestor.verify` or `downloads.manifest` APIs. The UI never caches bundle contents; it only surfaces SHA-256 digests and cosign signatures. Operators must use CLI to fetch the signed artefact.
- **Secrets**: UI redacts tokens, emails, and attachment paths in logs. Structured logs include only `subject`, `tenant`, `action`, `correlationId`.
- **Aggregation data**: Console honours Aggregation-Only contract—no client-side rewriting of Concelier/Excititor precedence. Provenance badges display source IDs and merge-event hashes.
- **PII minimisation**: User lists show minimal identity (display name, email hash). Full email addresses require `ui.admin` + fresh-auth.
- **Downloads parity**: Every downloadable artefact includes a CLI parity link (e.g., `stella downloads fetch --artifact <id>`). If CLI parity fails, the console displays a warning banner and links to troubleshooting docs.
---
## 6·Logging, Monitoring & Alerts
- Structured logs: `ui.action`, `tenantId`, `subject`, `scope`, `correlationId`, `dpop.jkt`. Log level `Information` for key actions; `Warning` for security anomalies (failed DPoP, tenant mismatch).
- Metrics (Prometheus): `ui_request_duration_seconds`, `ui_dpop_failure_total`, `ui_fresh_auth_prompt_total`, `ui_tenant_switch_total`, `ui_offline_banner_seconds`.
- Alerts:
1. **Fresh-auth failures** >5 per minute per tenant → security review.
2. **DPoP mismatches** sustained >1% of requests → potential replay attempt.
3. **Tenant mismatches** >0 triggers an audit incident (could indicate scope misconfiguration).
- Correlate with Authority audit events (`authority.scope.granted`, `authority.token.revoked`) and Concelier/Excititor ingestion logs to trace user impact.
---
## 7·Offline & Air-Gapped Posture
- Offline deployments require mirrored container images and Offline Kit manifest verification (see `/docs/deploy/console.md` §7).
- Console reads `offlineManifest.json` at boot to validate asset digests; mismatches block startup until the manifest is refreshed.
- Tenant and role edits queue change manifests for export; UI instructs operators to run `stella auth apply --bundle <file>` on the offline Authority host.
Evidence viewing remains read-only; download buttons provide scripts to export from local Attestor snapshots.
- Fresh-auth prompts display instructions for hardware-token usage on bastion hosts; system logs mark actions executed under offline fallback.
---
## 8·Threat Model Alignment
| Threat (Authority TM §5) | Console control |
|--------------------------|-----------------|
| Spoofed revocation bundle | Console verifies manifest signatures before showing revocation status; links to `stella auth revoke verify`. |
| Parameter tampering on `/token` | PKCE + DPoP enforced; console propagates correlation IDs so Authority logs can link anomalies. |
| Bootstrap invite replay | Admin UI surfaces invite status with expiry; fresh-auth required before issuing new invites. |
| Token replay by stolen agent | DPoP binding prevents reuse; console surfaces revocation latency warnings sourced from Zastava metrics. |
| Offline bundle tampering | Console refuses unsigned Offline Kit assets; prompts operators to re-import verified bundles. |
| Privilege escalation via plug-in overrides | Plug-in manifest viewer warns when a plug-in downgrades password policy; UI restricts plug-in activation to fresh-auth + `ui.admin` scoped users. |
Document gaps and remediation hooks in `SEC5.*` backlog as they are addressed.
---
## 9·Compliance checklist
- [ ] Authority client `console-ui` registered with PKCE, DPoP, tenant claim requirement, and scopes from §3.
- [ ] CSP enforced per §4 with overrides documented in deployment manifests.
- [ ] Fresh-auth timer (300s) validated for admin and policy actions; audit events captured.
- [ ] DPoP binding tested (replay attempt blocked; logs show `ui_dpop_failure_total` increment).
- [ ] Offline mode exercises performed (banner, CLI guidance, manifest verification).
- [ ] Evidence download parity verified with CLI scripts; console never caches sensitive artefacts.
- [ ] Monitoring dashboards show metrics and alerts outlined in §6; alert runbooks reviewed with Security Guild.
- [ ] Security review sign-off recorded in sprint log with links to Authority threat model references.
---
*Last updated: 2025-10-28 (Sprint23).*

View File

@@ -12,9 +12,10 @@ The Console AOC dashboard gives operators a live view of ingestion guardrails ac
- **Route:** `/console/sources` (dashboard) with contextual drawer routes `/console/sources/:sourceKey` and `/console/sources/:sourceKey/violations/:documentId`.
- **Feature flag:** `aocDashboard.enabled` (default `true` once Concelier WebService exposes `/aoc/verify`). Toggle is tenant-scoped to support phased rollout.
- **Scopes:**
- `ui.read` (base navigation) and `advisory:verify` to view ingestion stats/violations.
- `vex:verify` to see Excititor entries and run VEX verifications.
- `advisory:write` / `vex:write` **not** required; dashboard uses read-only APIs.
- `ui.read` (base navigation) plus `advisory:read` to view Concelier ingestion metrics/violations.
- `vex:read` to see Excititor entries and run VEX verifications.
- `aoc:verify` to trigger guard runs from the dashboard action bar.
- `advisory:ingest` / `vex:ingest` **not** required; the dashboard uses read-only APIs.
- **Tenancy:** All data is filtered by the active tenant selector. Switching tenants re-fetches tiles and drill-down tables with tenant-scoped tokens.
- **Back-end contracts:** Requires Concelier/Excititor 19.x (AOC guards enabled) and Authority scopes updated per [Authority service docs](../ARCHITECTURE_AUTHORITY.md#new-aoc-scopes).

View File

@@ -190,7 +190,7 @@ Telemetry entries include correlation IDs that match backend manifest refresh lo
- `/docs/ui/sbom-explorer.md` - export flows feeding the downloads queue.
- `/docs/ui/runs.md` - evidence bundle integration.
- `/docs/24_OFFLINE_KIT.md` - offline kit packaging and verification.
- `/docs/security/console-security.md` - scopes, CSP, and download token handling (pending).
- `/docs/security/console-security.md` - scopes, CSP, and download token handling.
- `/docs/cli-vs-ui-parity.md` - CLI equivalence checks (pending).
- `deploy/releases/*.yaml` - source of container digests mirrored into the manifest.

View File

@@ -0,0 +1,26 @@
# Docs Guild Update — 2025-10-28
## Console security posture draft
- Published `docs/security/console-security.md` covering console OIDC/DPoP flow, scope map, fresh-auth sequence, CSP defaults, evidence handling, and monitoring checklist.
- Authority owners (`AUTH-CONSOLE-23-003`) to verify `/fresh-auth` token semantics (120s OpTok, 300s fresh-auth window) and confirm scope bundles before closing the sprint task.
- Security Guild requested to execute the compliance checklist in §9 and record sign-off in SPRINT 23 log once alerts/dashboards are wired (metrics references: `ui_request_duration_seconds`, `ui_dpop_failure_total`, Grafana board `console-security.json`).
## Console CLI parity matrix
- Added `/docs/cli-vs-ui-parity.md` with feature-level status tracking (✅/🟡/🟩). Pending commands reference CLI backlog (`CLI-EXPORT-35-001`, `CLI-POLICY-23-005`, `CONSOLE-DOC-23-502`).
- DevEx/CLI Guild to wire parity CI workflow when CLI downloads commands ship; Downloads workspace already links to the forthcoming parity report slot.
## Accessibility refresh
- Published `/docs/accessibility.md` describing keyboard flows, screen-reader behaviour, colour tokens, testing rig (Storybook axe, Playwright a11y), and offline guidance.
- Accessibility Guild (CONSOLE-QA-23-402) to log the next Playwright a11y sweep results against the new checklist; design tokens follow-up tracked via CONSOLE-FEAT-23-102.
Artifacts:
- Doc: `/docs/security/console-security.md`
- Doc: `/docs/cli-vs-ui-parity.md`
- Doc: `/docs/accessibility.md`
- Sprint tracker: `SPRINTS.md` (DOCS-CONSOLE-23-012 now DONE)
cc: `@authority-core`, `@security-guild`, `@docs-guild`

229
docs/vex/aggregation.md Normal file
View File

@@ -0,0 +1,229 @@
# VEX Observations & Linksets
> Imposed rule: Work of this type or tasks of this type on this component must
> also be applied everywhere else it should be applied.
Link-Not-Merge brings the same immutable observation model to Excititor that
Concelier now uses for advisories. VEX statements are stored as append-only
observations; linksets correlate them, capture conflicts, and keep provenance so
Policy Engine and UI surfaces can explain decisions without collapsing sources.
---
## 1. Model overview
### 1.1 Observation lifecycle
1. **Ingest** Connectors fetch OpenVEX, CSAF VEX, CycloneDX VEX, or VEX
attestations, validate signatures, and strip any derived consensus data
forbidden by the Aggregation-Only Contract (AOC).
2. **Persist** Excititor writes immutable `vex_observations` keyed by tenant,
provider, upstream identifier, and `contentHash`. Supersedes chains record
revisions; the original payload is never mutated.
3. **Expose** WebService will surface paginated observation APIs and Offline
Kit snapshots mirror the same data for air-gapped sites.
Observation schema sketch (final shape lands with `EXCITITOR-LNM-21-001`):
```text
observationId = {tenant}:{providerId}:{upstreamId}:{revision}
tenant, providerId, streamId
upstream{ upstreamId, documentVersion, fetchedAt, receivedAt,
contentHash, signature{present, format?, keyId?, signature?} }
content{ format, specVersion, raw }
statements[
{ vulnerabilityId, productKey, status, justification?,
introducedVersion?, fixedVersion?, locator }
]
linkset{ purls[], cpes[], aliases[], references[],
reconciledFrom[], conflicts[]? }
attributes{ batchId?, replayCursor? }
createdAt
```
- **Raw payload** (`content.raw`) remains lossless (Relaxed Extended JSON).
- **Statements** provide normalized tuples for each claim contained in the
document, including justification and version hints.
- **Linkset** mirrors identifiers extracted during ingestion, retaining JSON
pointer metadata so audits can trace back to the source fragment.
### 1.2 Linkset lifecycle
Linksets correlate claims referring to the same `(vulnerabilityId, productKey)`
pair across providers.
1. **Seed** Observations push normalized identifiers (CVE, GHSA, vendor IDs)
plus canonical product keys (purl preferred, cpe fallback). Platform-scoped
statements remain marked `non_joinable`.
2. **Correlate** The linkset builder groups statements by tenant and identity,
combines alias graphs from Concelier, and uses justification/product overlap
to assign correlation confidence.
3. **Annotate** Conflicts (status disagreement, justification mismatch, range
inconsistencies) are recorded as structured entries.
4. **Persist** Results land in `vex_linksets` with deterministic IDs (hash of
sorted `(vulnerabilityId, productKey, observationIds)`) and append-only
history for replay/debugging.
Linksets never override statements or invent consensus; they simply align
evidence for Policy Engine and consumers.
---
## 2. Observation vs. linkset
- **Purpose**
- Observation: Immutable record of a single upstream VEX document.
- Linkset: Correlated evidence spanning observations that describe the same
product-vulnerability pair.
- **Mutation**
- Observation: Append-only via supersedes.
- Linkset: Regenerated deterministically by correlation jobs.
- **Allowed fields**
- Observation: Raw payload, provenance, normalized statement tuples, join
hints.
- Linkset: Observation references, statement IDs, confidence metrics, conflict
annotations.
- **Forbidden fields**
- Observation: Derived consensus, suppression flags, risk scores.
- Linkset: Derived severity or policy decisions (only evidence + conflicts).
- **Consumers**
- Observation: Evidence exports, Offline Kit mirrors, CLI raw dumps.
- Linkset: Policy Engine VEX overlay, Console evidence panes, Vuln Explorer.
### 2.1 Example sequence
1. Canonical vendor issues an attested OpenVEX declaring `CVE-2025-2222` as
`not_affected` for `pkg:rpm/redhat/openssl@1.1.1w-12`. Excititor inserts a
new observation referencing that statement.
2. Upstream CycloneDX VEX from a distro reports the same product as `affected`
with `under_investigation` justification.
3. Linkset builder groups both statements by alias overlap and product key,
setting confidence `high` because CVE and purl match.
4. Conflict annotation records `status-mismatch` and retains both justifications;
Policy Engine uses this to explain why suppression cannot proceed without
policy override.
---
## 3. Conflict handling
Structured conflicts capture disagreements without mutating source statements.
```json
{
"type": "status-mismatch",
"vulnerabilityId": "CVE-2025-2222",
"productKey": "pkg:rpm/redhat/openssl@1.1.1w-12",
"statements": [
{
"observationId": "tenant:redhat:openvex:3",
"providerId": "redhat",
"status": "not_affected",
"justification": "component_not_present"
},
{
"observationId": "tenant:ubuntu:cyclonedx:12",
"providerId": "ubuntu",
"status": "affected",
"justification": "under_investigation"
}
],
"confidence": "medium",
"detectedAt": "2025-10-27T14:30:00Z"
}
```
Conflict classes (tracked via `EXCITITOR-LNM-21-003`):
- `status-mismatch` Different statuses for the same pair (affected vs
not_affected vs fixed vs under_investigation).
- `justification-divergence` Same status but incompatible justifications or
missing justification where policy requires it.
- `version-range-clash` Introduced/fixed ranges contradict each other.
- `non-joinable-overlap` Platform-scoped statements collide with package
statements; flagged as warning but retained.
- `metadata-gap` Missing provenance/signature field on specific statements.
Conflicts surface through:
- `/vex/linksets/{id}` APIs (`conflicts[]` payload).
- Console evidence panels (badges + drawer detail).
- CLI exports (`stella vex linkset …` planned in `CLI-LNM-22-002`).
- Metrics dashboards (`vex_linkset_conflicts_total{type}`).
---
## 4. AOC alignment
- **Raw-first** `content.raw` and `statements[]` mirror upstream input; no
derived consensus or suppression values are written by ingestion.
- **No merges** Each upstream statement persists independently; linksets refer
back via `observationId`.
- **Provenance mandatory** Missing signature or source metadata yields
`ERR_AOC_004`; ingestion blocks until connectors fix the feed.
- **Idempotent writes** Duplicate `(providerId, upstreamId, contentHash)`
results in a no-op; revisions append with a `supersedes` pointer.
- **Deterministic output** Correlator sorts identifiers, normalizes timestamps
(UTC ISO-8601), and hashes canonical JSON to generate stable linkset IDs.
- **Scope-aware** Tenant claims enforced on write/read; Authority scopes
`vex:ingest` / `vex:read` are required (see `AUTH-AOC-22-001`).
Violations raise `ERR_AOC_00x`, emit `aoc_violation_total`, and prevent the data
from landing downstream.
---
## 5. Downstream consumption
- **Policy Engine** Evaluates VEX evidence alongside advisory linksets to gate
suppression, severity downgrades, or explainability.
- **Console UI** Evidence panel renders VEX statements grouped by provider and
highlights conflicts or missing signatures.
- **CLI** Planned commands export observations/linksets for offline analysis
(`CLI-LNM-22-002`).
- **Offline Kit** Bundled snapshots keep VEX data aligned with advisory
observations for air-gapped parity.
- **Observability** Dashboards track ingestion latency, conflict counts, and
supersedes depth per provider.
New consumers must treat both collections as read-only and preserve deterministic
ordering when caching.
---
## 6. Validation & testing
- **Unit tests** (`StellaOps.Excititor.Core.Tests`) to cover schema guards,
deterministic linkset hashing, conflict classification, and supersedes
behaviour.
- **Mongo integration tests** (`StellaOps.Excititor.Storage.Mongo.Tests`) to
verify indexes, shard keys, and idempotent writes across tenants.
- **CLI smoke suites** (`stella vex observations`, `stella vex linksets`) for
JSON determinism and exit code coverage.
- **Replay determinism** Feed identical upstream payloads twice and ensure
observation/linkset hashes match across runs.
- **Offline kit verification** Validate VEX exports packaged in Offline Kit
snapshots against live service outputs.
- **Fixture refresh** Samples (`SAMPLES-LNM-22-002`) must include multi-source
conflicts and justification variants used by docs and UI tests.
---
## 7. Reviewer checklist
- Observation schema aligns with `EXCITITOR-LNM-21-001` once the schema lands;
update references as soon as the final contract is published.
- Linkset lifecycle covers correlation signals (alias graphs, product keys,
justification rules) and deterministic ID strategy.
- Conflict classes include status, justification, version range, platform overlap
scenarios.
- AOC guardrails called out with relevant error codes and Authority scopes.
- Downstream consumer list matches active APIs/CLI features (update when
`CLI-LNM-22-002` and WebService endpoints ship).
- Validation section references Core, Storage, CLI, and Offline test suites plus
fixture requirements.
- Imposed rule reminder retained at top.
Dependencies outstanding (2025-10-27): `EXCITITOR-LNM-21-001..005` and
`EXCITITOR-LNM-21-101..102` are still TODO; revisit this document once schemas,
APIs, and fixtures are implemented.