Consolidate some modules; localization fixes, product advisories work, QA work

This commit is contained in:
master
2026-03-05 03:54:22 +02:00
parent 7bafcc3eef
commit 8e1cb9448d
3878 changed files with 72600 additions and 46861 deletions

View File

@@ -1,6 +1,6 @@
# Advisory Lens Architecture
> **Status: Production (Shared Library).** AdvisoryLens is a standalone deterministic library at `src/__Libraries/StellaOps.AdvisoryLens/`, **not** merged into AdvisoryAI. The two modules serve different purposes: AdvisoryLens provides pattern-based case matching without AI inference; AdvisoryAI provides LLM-powered advisory analysis with guardrails. They can be composed together but are architecturally independent. The library is currently available for integration but not yet referenced from any WebService `Program.cs`.
> **Status: Archived (2026-03-04).** AdvisoryLens is preserved under `src/__Libraries/_archived/StellaOps.AdvisoryLens/` with tests at `src/__Libraries/_archived/StellaOps.AdvisoryLens.Tests/`. It was archived in Sprint 217 after consumer verification confirmed zero production usage.
## Purpose
@@ -8,8 +8,8 @@ StellaOps.AdvisoryLens is a deterministic, offline-first library for semantic ca
## Scope
- Working directory: `src/__Libraries/StellaOps.AdvisoryLens/`
- Tests: `src/__Libraries/__Tests/StellaOps.AdvisoryLens.Tests/`
- Working directory: `src/__Libraries/_archived/StellaOps.AdvisoryLens/`
- Tests: `src/__Libraries/_archived/StellaOps.AdvisoryLens.Tests/`
- Integration entry point: `services.AddAdvisoryLens(...)`
## Models

View File

@@ -0,0 +1,47 @@
# Bench (Performance Benchmarks)
**Status:** Implemented
**Source:** `src/Bench/`
**Owner:** Platform Team
> **Note:** This folder documents **performance benchmarks**. For **competitive benchmarking** (accuracy comparison with other scanners), see [`../benchmark/`](../benchmark/).
## Purpose
Bench provides performance benchmark infrastructure for StellaOps modules. It measures throughput, latency, and resource usage to detect regressions and validate performance targets.
## Components
**Benchmark Projects:**
- `StellaOps.Bench.LinkNotMerge` - Link-Not-Merge correlation performance
- `StellaOps.Bench.LinkNotMerge.Vex` - LNM VEX statement performance
- `StellaOps.Bench.Notify` - Notification delivery throughput
- `StellaOps.Bench.PolicyEngine` - Policy evaluation performance
- `StellaOps.Bench.ScannerAnalyzers` - Language analyzer performance
## Scanner Vendor Parity Tracking
`StellaOps.Bench.ScannerAnalyzers` now supports vendor parity tracking for offline benchmark runs:
- Scenario-level vendor ingestion from JSON or SARIF artifacts (`vendorResults[]` in benchmark config).
- Optional Stella finding ingestion (`stellaFindingsPath`) for exact overlap comparisons.
- Deterministic parity outputs in benchmark JSON and Prometheus exports:
- overlap counts and percentages
- scanner-only / vendor-only counts
- parity score (Jaccard-style overlap over union)
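The parity outputs above can be sketched as set arithmetic over finding identifiers. This is a minimal illustration, not the benchmark's actual implementation; the function name and return keys are hypothetical.

```python
# Hypothetical sketch: parity metrics over two sets of finding identifiers.
def parity_metrics(stella_findings, vendor_findings):
    stella, vendor = set(stella_findings), set(vendor_findings)
    overlap = stella & vendor
    union = stella | vendor
    return {
        "overlap_count": len(overlap),
        "scanner_only": len(stella - vendor),
        "vendor_only": len(vendor - stella),
        # Jaccard-style score: overlap over union (0.0 when both sets are empty).
        "parity_score": len(overlap) / len(union) if union else 0.0,
    }
```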
## Usage
```bash
# Run all benchmarks
dotnet run -c Release --project src/Bench/StellaOps.Bench/LinkNotMerge/StellaOps.Bench.LinkNotMerge
# Run a single benchmark project
dotnet run -c Release --project src/Bench/StellaOps.Bench/Notify/StellaOps.Bench.Notify
```
## Related Documentation
- Competitive Benchmark: `../benchmark/architecture.md`
- Scanner: `../scanner/architecture.md`
- Policy: `../policy/architecture.md`
- Notify: `../notify/architecture.md`

View File

@@ -1,7 +1,8 @@
# Cartographer Module
**Status:** Implemented
**Source:** `src/Cartographer/`
**Status:** Archived (absorbed into Scanner in Sprint 201)
**Source (current):** `src/Scanner/StellaOps.Scanner.Cartographer/`
**Historical source:** `src/Cartographer/`
## Purpose
@@ -49,4 +50,4 @@ or promotion lanes; those are owned by Release Orchestrator ENVMGR/PROMOT.
## Current Status
Active development. Materializes immutable SBOM property graphs with overlay hydration, deterministic snapshots, and optimized tile serving for dependency navigation.
Archived as a standalone module. Active implementation lives under Scanner at `src/Scanner/StellaOps.Scanner.Cartographer/`.

View File

@@ -0,0 +1,40 @@
# DevPortal
> Developer portal for API documentation and SDK access.
## Purpose
DevPortal provides a unified developer experience for StellaOps API consumers. It hosts API documentation, SDK downloads, and developer guides.
## Quick Links
- [Guides](./guides/) - Developer guides and tutorials
## Status
| Attribute | Value |
|-----------|-------|
| **Maturity** | Beta |
| **Last Reviewed** | 2025-12-29 |
| **Maintainer** | Platform Guild |
## Key Features
- **API Documentation**: Interactive OpenAPI documentation
- **SDK Downloads**: Language-specific SDK packages
- **Developer Guides**: Integration tutorials and examples
- **API Playground**: Interactive API testing environment
## Dependencies
### Upstream (this module depends on)
- **Authority** - Developer authentication
- **Gateway** - API proxy and rate limiting
### Downstream (modules that depend on this)
- None (consumer-facing portal)
## Related Documentation
- [API Overview](../../api/overview.md)
- [CLI Reference](../../cli/command-reference.md)

View File

@@ -0,0 +1,80 @@
# Developer Portal Publishing Guide
Last updated: 2025-11-25
## Goals
- Publish the StellaOps Developer Portal consistently across connected and air-gapped environments.
- Produce deterministic artefacts (checksums, manifests) so releases are auditable and reproducible.
- Keep docs, API specs, and examples in sync with the CI pipelines that build the portal.
## Prerequisites
- Node.js 20.x + pnpm 9.x
- Docker / Podman (for static-site container image)
- Spectral lint baseline from `src/Api/StellaOps.Api.OpenApi` (optional, to embed OAS links)
- Access to `local-nugets/` cache and offline asset bundle (see Offline section)
## Build & Test (connected)
```bash
pnpm install --frozen-lockfile
pnpm lint # markdownlint/prettier/eslint as configured
pnpm build # generates static site into dist/
pnpm test # component/unit tests if configured
```
- Determinism: ensure `pnpm-lock.yaml` is committed; no timestamps in emitted HTML (set `SOURCE_DATE_EPOCH` if needed).
## Publish (connected)
1. Build the static site: `pnpm build` (or reuse CI artifact).
2. Create artefact bundle:
```bash
tar -C dist -czf out/devportal/site.tar.gz .
sha256sum out/devportal/site.tar.gz > out/devportal/site.tar.gz.sha256
```
3. Container image (optional):
```bash
docker build -t registry.example.com/stella/devportal:${VERSION} -f ops/devportal/Dockerfile .
docker push registry.example.com/stella/devportal:${VERSION}
```
4. Record manifest `out/devportal/manifest.json`:
```json
{
"version": "${VERSION}",
"checksum": "$(cat out/devportal/site.tar.gz.sha256 | awk '{print $1}')",
"build": {
"node": "20.x",
"pnpm": "9.x"
},
"timestamp": "${UTC_ISO8601}",
"source_commit": "$(git rev-parse HEAD)"
}
```
## Offline / Air-gap
- Use pre-seeded bundle `offline/devportal/site.tar.gz` with accompanying `.sha256` and `manifest.json`.
- Verify before use:
```bash
sha256sum -c offline/devportal/site.tar.gz.sha256
```
- Serve locally:
```bash
mkdir -p /srv/devportal && tar -C /srv/devportal -xzf offline/devportal/site.tar.gz
# then point nginx/caddy to /srv/devportal
```
- No external CDN references allowed; ensure assets are bundled and CSP is self-contained.
## Deployment targets
- **Kubernetes**: use the static-site container image with a read-only root filesystem; expose via ingress with TLS; set `ETAG`/`Last-Modified` headers from manifest.
- **Docker Compose**: mount `site.tar.gz` into a lightweight nginx container; sample compose snippet lives in `ops/deployment/devportal/docker-compose.devportal.yml` (to be authored alongside this doc).
- **File share**: extract bundle onto shared storage for disconnected viewing; keep manifest + checksum adjacent.
## Checks & Observability
- Lint/OAS links: run `pnpm lint` and optional `pnpm api:check` (if wired) to ensure embedded API links resolve.
- Availability: configure basic `/healthz` (static 200) and enable access logging at the reverse proxy.
- Integrity: serve checksums/manifest from `/meta` path for auditors; include build `source_commit` and `timestamp`.
## Release checklist
- [ ] `pnpm build` succeeds reproducibly.
- [ ] `site.tar.gz` + `.sha256` generated and verified.
- [ ] `manifest.json` populated with version, checksum, UTC timestamp, commit SHA.
- [ ] Offline bundle placed in `offline/devportal/` with checksums.
- [ ] Image (if used) pushed to registry and noted in release notes.
- [ ] Deployment target (K8s/Compose/File share) instructions updated if changed.

View File

@@ -0,0 +1,34 @@
# Excititor agent guide
## Mission
Excititor converts heterogeneous VEX feeds into raw observations and linksets that honour the Aggregation-Only Contract.
## Key docs
- [Module README](./README.md)
- [Architecture](./architecture.md)
- [Implementation plan](./implementation_plan.md)
- [Task board](./TASKS.md)
## How to get started
1. Open sprint file `/docs/implplan/SPRINT_*.md` and locate the stories referencing this module.
2. Review ./TASKS.md for local follow-ups and confirm status transitions (TODO → DOING → DONE/BLOCKED).
3. Read the architecture and README for domain context before editing code or docs.
4. Coordinate cross-module changes in the main /AGENTS.md description and through the sprint plan.
## Guardrails
- Honour the Aggregation-Only Contract where applicable (see ../../aoc/aggregation-only-contract.md).
- Preserve determinism: sort outputs, normalise timestamps (UTC ISO-8601), and avoid machine-specific artefacts.
- Keep Offline Kit parity in mind—document air-gapped workflows for any new feature.
- Update runbooks/observability assets when operational characteristics change.
## Required Reading
- `docs/modules/excititor/README.md`
- `docs/modules/excititor/architecture.md`
- `docs/modules/excititor/implementation_plan.md`
- `docs/modules/platform/architecture-overview.md`
## Working Agreement
1. Update task status to `DOING`/`DONE` in both the corresponding sprint file `/docs/implplan/SPRINT_*.md` and the local `TASKS.md` when you start or finish work.
2. Review this charter and the Required Reading documents before coding; confirm prerequisites are met.
3. Keep changes deterministic (stable ordering, timestamps, hashes) and align with offline/air-gap expectations.
4. Coordinate doc updates, tests, and cross-guild communication whenever contracts or workflows change.
5. Revert to `TODO` if you pause the task without shipping changes; leave notes in commit/PR descriptions for context.

View File

@@ -0,0 +1,76 @@
# StellaOps Excititor
Excititor converts heterogeneous VEX feeds into raw observations and linksets that honour the Aggregation-Only Contract.
## Latest updates (2025-12-05)
- Chunk API documentation remains blocked until CI is green and a pinned OpenAPI spec + deterministic samples are available.
- Sprint tracker `docs/implplan/SPRINT_0333_0001_0001_docs_modules_excititor.md` and module `TASKS.md` mirror status.
- Observability/runbook assets remain in `operations/observability.md` and `observability/` (timeline, locker manifests); dashboards stay offline-import friendly.
- Prior updates (2025-11-05): Link-Not-Merge readiness and consensus beta note (`../../implplan/archived/updates/2025-11-05-excitor-consensus-beta.md`), observability guide additions, DSSE packaging guidance, and Policy/CLI follow-ups tracked in SPRINT_200.
- Link-Not-Merge readiness: release note [Excitor consensus beta](../../implplan/archived/updates/2025-11-05-excitor-consensus-beta.md) captures how Excititor feeds power the consensus beta (sample payload in [consensus JSON](../../vex/consensus-json.md)).
- Added [observability guide](operations/observability.md) describing the evidence metrics emitted by `EXCITITOR-AIAI-31-003` (request counters, statement histogram, signature status, guard violations) so Ops/Lens can alert on misuse.
- README now points policy/UI teams to the upcoming consensus integration work.
- DSSE packaging for consensus bundles and Export Center hooks are documented in the [beta release note](../../implplan/archived/updates/2025-11-05-excitor-consensus-beta.md); operators mirroring Excititor exports must verify detached JWS artefacts (`bundle.json.jws`) alongside each bundle.
- Follow-ups called out in the release note (Policy weighting knobs `POLICY-ENGINE-30-101`, CLI verb `CLI-VEX-30-002`) remain in-flight and are tracked in `/docs/implplan/SPRINT_200_documentation_process.md`.
## Release references
- Consensus beta payload reference: [docs/vex/consensus-json.md](../../vex/consensus-json.md)
- Export Center offline packaging: [docs/modules/export-center/devportal-offline.md](../export-center/devportal-offline.md)
- Historical release log: [docs/implplan/archived/updates/](../../implplan/archived/updates/)
## Responsibilities
- Fetch OpenVEX/CSAF/CycloneDX statements via restart-only connectors.
- Store immutable VEX observations with full provenance.
- Publish linksets and events that drive policy suppression decisions.
- Provide deterministic exports for Offline Kit and downstream tooling.
## Key components
- `StellaOps.Excititor.WebService` scheduler/API host.
- Connector libraries under `StellaOps.Excititor.Connector.*`.
- Normalization helpers and exporters in `StellaOps.Excititor.*`.
## Integrations & dependencies
- Policy Engine for evidence queries.
- UI/CLI for conflict visibility and explanation.
- Notify for VEX-driven alerts.
## Operational notes
- PostgreSQL (schema `vex`) for observation storage and job metadata.
- Offline kit packaging aligned with Concelier merges.
- Connector-specific runbooks (see `docs/modules/concelier/operations/connectors`).
- Ubuntu CSAF provenance knobs: [`operations/ubuntu-csaf.md`](operations/ubuntu-csaf.md) captures TrustWeight/Tier, cosign, and fingerprint configuration for the sprint 120 enrichment.
## Backlog references
- DOCS-LNM-22-006 / DOCS-LNM-22-007 (shared with Concelier).
- CLI-EXC-25-001..002 follow-up for CLI parity.
## Epic alignment
- **Epic 1 AOC enforcement:** maintain immutable VEX observations, provenance, and AOC verifier coverage.
- **Epic 7 VEX Consensus Lens:** supply trustworthy raw inputs, trust metadata, and consensus hooks for the lens computations.
- **Epic 8 Advisory AI:** expose citation-ready VEX payloads for the advisory assistant pipeline.
## Implementation Status
### Objectives
- Maintain deterministic behaviour and offline parity across releases
- Keep documentation, telemetry, and runbooks aligned with the latest sprint outcomes
### Key Milestones
- **Epic 1 AOC enforcement:** enforce immutable VEX observation schema, provenance capture, and guardrails
- **Epic 7 VEX Consensus Lens:** provide lens-ready metadata (issuer trust, temporal scoping) and consensus APIs
- **Epic 8 Advisory AI:** guarantee citation-ready payloads and normalized context for AI summaries/explainers
### Recent Delivery Status
- Chunk API documentation remains blocked until CI is green and a pinned OpenAPI spec with deterministic samples is available
- Link-Not-Merge readiness and consensus beta completed with DSSE packaging guidance
- Observability guide additions and policy/CLI follow-ups tracked in sprint files
### Workstreams
- Backlog grooming: reconcile open stories with module roadmap
- Implementation: collaborate with service owners to land feature work
- Validation: extend tests/fixtures to preserve determinism and provenance requirements
### Coordination
- Review ./AGENTS.md before picking up new work
- Sync with cross-cutting teams noted in sprint files
- Update plan whenever scope, dependencies, or guardrails change

File diff suppressed because it is too large

View File

@@ -0,0 +1,43 @@
# Excititor Attestation Plan (Sprint 110)
## Goals
- Align Excititor chunk API and attestation envelopes with Evidence Locker contract.
- Provide offline-ready chunk submission/attestation flow for VEX evidence.
## Chunk API shape (`/vex/evidence/chunks`)
- POST body (NDJSON, deterministic order by `chunk_id`):
```json
{
"chunk_id": "uuid",
"tenant": "acme",
"source": "ghsa",
"schema": "stellaops.vex.chunk.v1",
"items": [ {"advisory_id":"GHSA-123","status":"affected","purl":"pkg:npm/foo@1.0.0"} ],
"provenance": {"fetched_at":"2025-11-20T00:00:00Z","artifact_sha":"abc"}
}
```
- At submission, Excititor returns `chunk_digest` (sha256 of canonical JSON) and queue id.
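The `chunk_digest` computation can be sketched as follows. The exact canonicalization rules (sorted keys, compact separators, UTF-8) are an assumption consistent with the DSSE bundling rules below; the function name is illustrative.

```python
import hashlib
import json

def chunk_digest(chunk: dict) -> str:
    # Assumed canonical form: sorted keys, no insignificant whitespace, UTF-8.
    canonical = json.dumps(chunk, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False).encode("utf-8")
    return "sha256:" + hashlib.sha256(canonical).hexdigest()
```

Because keys are sorted before hashing, two chunks with the same content but different key insertion order produce the same digest.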
## Attestation envelope
- Subject: `chunk_digest` from above.
- Predicates attached:
- `stellaops.vex.chunk.meta.v1` (tenant, source, schema version, item count).
- `stellaops.vex.chunk.integrity.v1` (sha256 per item block, canonical order).
- Optional `stellaops.transparency.v1` (Rekor UUID/logIndex) when online.
- Envelope format: DSSE using Evidence Locker provider registry; signing profile mirrors Evidence Locker bundle profile for tenant.
## DSSE bundling rules
- Deterministic JSON (sorted keys) before hashing.
- Canonical NDJSON for chunk payload; no gzip inside envelope.
- Attach verification report alongside attestation as `chunk-verify.json` (hashes + signature check results).
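The per-item hashes for the `stellaops.vex.chunk.integrity.v1` predicate could be computed like this. Treating "canonical order" as a sort over the canonical serialization of each item is an assumption; the helper name is hypothetical.

```python
import hashlib
import json

def item_hashes(items):
    # Canonicalize each item block, then fix the order deterministically
    # (assumed: sort by the canonical serialization itself).
    canonical_items = sorted(
        json.dumps(i, sort_keys=True, separators=(",", ":")) for i in items)
    return [hashlib.sha256(s.encode("utf-8")).hexdigest()
            for s in canonical_items]
```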
## Sample payloads
- `docs/modules/excititor/samples/chunk-sample.ndjson`
- `docs/modules/excititor/samples/chunk-attestation-sample.json`
## Integration points
- Evidence Locker contract v1 (see `docs/modules/evidence-locker/attestation-contract.md`).
- Concelier LNM schemas (observations remain aggregation-only; attestation is evidence, not merge).
## Ownership
- Excititor Guild (primary); Evidence Locker Guild reviewer.

View File

@@ -0,0 +1,13 @@
# Excititor Changes Log
This file records breaking or behavior-changing updates for the Excititor module.
Update this log whenever public contracts, schemas, or workflows change.
## Format
- Date (UTC)
- Change summary
- Impacted contracts or schemas
- Migration notes (if required)
## Entries
- 2026-01-30: Log initialized. No breaking changes recorded.

View File

@@ -0,0 +1,36 @@
# Connector signer metadata (v1.0.0)
**Scope.** Defines the canonical, offline-friendly metadata for Excititor connectors that validate signed feeds (MSRC CSAF, Oracle OVAL, Ubuntu OVAL, StellaOps mirror OpenVEX). The file is consumed by WebService/Worker composition roots and by Offline Kits to pin trust material deterministically.
**Location & format.**
- Schema: `docs/modules/excititor/schemas/connector-signer-metadata.schema.json` (JSON Schema 2020-12).
- Sample: `docs/modules/excititor/samples/connector-signer-metadata-sample.json` (aligns with schema).
- Expected production artifact: NDJSON or JSON stamped per release; store in offline kits alongside connector bundles.
## Required fields (summary)
- `schemaVersion` — must be `1.0.0`.
- `generatedAt` — ISO-8601 UTC timestamp for the metadata file.
- `connectors[]` — one entry per connector:
- `connectorId` — stable slug, e.g., `excititor-msrc-csaf`.
- `provider { name, slug }` — human label and slug.
  - `issuerTier` — `tier-0`, `tier-1`, `tier-2`, or `untrusted` (aligns with trust weighting).
- `signers[]` — one per signing path; each has `usage` (`csaf|oval|openvex|bulk-meta|attestation`) and `fingerprints[]` (algorithm + format + value). Optional `keyLocator` and `certificateChain` for offline key retrieval.
- `bundle` — reference to the sealed bundle containing the feed/signing material (`kind`: `oci-referrer|oci-tag|file|tuf`, plus `uri`, optional `digest`, `publishedAt`).
- Optional `validFrom`, `validTo`, `revoked`, `notes` for rollover and incident handling.
## Rollover / migration guidance
1) **Author the metadata** using the schema and place the JSON next to connector bundles in the offline kit (`out/connectors/<provider>/signer-metadata.json`).
2) **Validate** with `dotnet tool run njsonschema validate connector-signer-metadata.schema.json connector-signer-metadata.json` (or `ajv validate`).
3) **Wire connector code** to load the file on startup (Worker + WebService) and pin signers per `connectorId`; reject feeds whose fingerprints are absent or marked `revoked=true` or out of `validFrom/To` range.
- Connectors look for `STELLAOPS_CONNECTOR_SIGNER_METADATA_PATH` (absolute/relative) and enrich provenance metadata automatically when present.
4) **Rollover keys** by appending a new `signers` entry and setting a future `validFrom`; keep the previous signer until all mirrors have caught up. Use `issuerTier` downgrades to quarantine while keeping history.
5) **Mirror references**: store the referenced bundles/keys under OCI tags or TUF targets already shipped in the offline kit so no live network is required.
6) **Record decisions** in sprint Decisions & Risks when changing trust tiers or fingerprints; update this doc if formats change.
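Step 3's fail-closed pinning logic can be sketched against the metadata shape above. This is an illustration only: the function name is hypothetical, and comparing UTC ISO-8601 strings lexicographically is an implementation shortcut (valid because the format is fixed-width and UTC).

```python
def verify_feed_signer(metadata, connector_id, fingerprint, now):
    """Fail closed: accept only a known, unrevoked, in-window fingerprint."""
    for connector in metadata.get("connectors", []):
        if connector["connectorId"] != connector_id:
            continue
        for signer in connector.get("signers", []):
            if signer.get("revoked"):
                continue  # revoked signers are rejected outright
            # UTC ISO-8601 strings compare correctly as plain strings.
            if signer.get("validFrom", "") > now:
                continue
            if signer.get("validTo") and signer["validTo"] < now:
                continue
            if any(fp["value"] == fingerprint
                   for fp in signer.get("fingerprints", [])):
                return True
    return False  # absent fingerprint => reject the feed
```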
## Sample entries (non-production)
See `docs/modules/excititor/samples/connector-signer-metadata-sample.json` for MSRC, Oracle, Ubuntu, and StellaOps example entries. These fingerprints are illustrative only; replace with real values before shipping.
## Consumer expectations
- Deterministic: sort connectors alphabetically before persistence; avoid clock-based defaults.
- Offline-first: all `keyLocator`/`bundle.uri` values must resolve inside the air-gap kit (OCI/TUF/file).
- Observability: emit a structured warning when metadata is missing or stale (>7 days) and fail closed for missing signers.

View File

@@ -0,0 +1,112 @@
# Excititor Advisory-AI Evidence Contract (v1)
Updated: 2025-11-18 · Scope: EXCITITOR-AIAI-31-004 (Phase 119)
This note defines the deterministic, aggregation-only contract that Excititor exposes to Advisory AI and Lens consumers. It covers the `/v1/vex/evidence/chunks` NDJSON stream plus the projection rules for observation IDs, signatures, and provenance metadata.
## Goals
- **Deterministic & replayable**: stable ordering, no implicit clocks, fixed schemas.
- **Aggregation-only**: no consensus/inference; raw supplier statements plus signatures and AOC (Aggregation-Only Contract) guardrails.
- **Offline-friendly**: chunked NDJSON; no cross-tenant lookups; portable enough for mirror/air-gap bundles.
## Endpoint
- `GET /v1/vex/evidence/chunks`
- **Query**:
- `tenant` (required)
- `vulnerabilityId` (optional, repeatable) — CVE, GHSA, etc.
- `productKey` (optional, repeatable) — PURLish key used by Advisory AI.
- `cursor` (optional) — stable pagination token.
- `limit` (optional) — max records per stream chunk (default 500, max 2000).
- **Response**: `Content-Type: application/x-ndjson`
- Each line is a single evidence record (see schema below).
- Ordered by `(tenant, vulnerabilityId, productKey, observationId, statementId)` to stay deterministic.
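The fixed record ordering can be expressed as a tuple sort; a minimal sketch (helper name illustrative):

```python
def sort_evidence(records):
    # Fixed sort keeps the NDJSON stream deterministic and replayable.
    return sorted(records, key=lambda r: (
        r["tenant"], r["vulnerabilityId"], r["productKey"],
        r["observationId"], r["statementId"]))
```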
## Evidence record schema (NDJSON)
```json
{
"tenant": "acme",
"vulnerabilityId": "CVE-2024-1234",
"productKey": "pkg:pypi/django@3.2.24",
"observationId": "obs-3cf9d6e4-…",
"statementId": "stmt-9c1d…",
"source": {
"supplier": "upstream:osv",
"documentId": "osv:GHSA-xxxx-yyyy",
"retrievedAt": "2025-11-10T12:34:56Z",
"signatureStatus": "missing|unverified|verified"
},
"aoc": {
"violations": [
{ "code": "EVIDENCE_SIGNATURE_MISSING", "surface": "ingest" }
]
},
"evidence": {
"type": "vex.statement",
"payload": { "...supplier-normalized-fields..." }
},
"provenance": {
"hash": "sha256:...",
"canonicalUri": "https://mirror.example/bundles/…",
"bundleId": "mirror-bundle-001"
}
}
```
### Field notes
- `observationId` is stable and maps 1:1 to internal storage; Advisory AI must cite it when emitting narratives.
- `statementId` remains unique within an observation.
- `signatureStatus` is pass-through from ingest; no interpretation beyond `missing|unverified|verified`.
- `aoc.violations` enumerates guardrail violations without blocking delivery.
- `evidence.payload` is supplier-shaped; we **do not** merge or rank.
- `provenance.hash` is the SHA-256 of the supplier document bytes; `canonicalUri` points to the mirror bundle when available.
## Determinism rules
- Ordering: fixed sort above; pagination cursor is derived from the last emitted `(tenant, vulnerabilityId, productKey, observationId, statementId)`.
- Clocks: All timestamps are UTC ISO-8601 with `Z`.
- No server-generated randomness; record content is idempotent for identical upstream inputs.
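One way to derive the pagination cursor from the last emitted key tuple is to encode it as an opaque token. The base64-of-JSON encoding here is an assumption for illustration, not the documented token format.

```python
import base64
import json

FIELDS = ("tenant", "vulnerabilityId", "productKey",
          "observationId", "statementId")

def encode_cursor(last_record):
    # Opaque, stable token derived from the last emitted sort key.
    key = [last_record[f] for f in FIELDS]
    return base64.urlsafe_b64encode(json.dumps(key).encode()).decode()

def decode_cursor(token):
    return json.loads(base64.urlsafe_b64decode(token.encode()))
```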
## AOC guardrails
- Enforced surfaces: ingest, `/v1/vex/aoc/verify`, and chunk emission.
- Violations are reported via `aoc.violations` and metric `excititor.vex.aoc.guard_violations`.
- No statements are dropped due to AOC; consumers decide how to act.
## Telemetry (counters/logs-only until span sink arrives)
- `excititor.vex.chunks.requests` — by `tenant`, `outcome`, `truncated`.
- `excititor.vex.chunks.bytes` — histogram of NDJSON stream sizes.
- `excititor.vex.chunks.records` — histogram of records per stream.
- Existing observation metrics (`excititor.vex.observation.*`) remain unchanged.
## Error handling
- 400 for invalid tenant or mutually exclusive filters.
- 429 with `Retry-After` when throttle budgets exceeded.
- 503 on upstream store/transient failures; error responses never contain NDJSON.
## Offline / mirror readiness
- When mirror bundles are configured, `provenance.canonicalUri` points to the local bundle path; otherwise it is omitted.
- All payloads are side-effect free; no remote fetches occur while streaming.
## Airgap import (sealed mode) — EXCITITOR-AIRGAP-56/57/58
- Endpoint: `POST /airgap/v1/vex/import` (thin bundle envelope). Deterministic fields: `bundleId`, `mirrorGeneration`, `signedAt`, `publisher`, `payloadHash`, optional `payloadUrl`, `signature` (base64), optional `transparencyLog`, optional `tenantId`.
- Sealed-mode toggle: set `EXCITITOR_SEALED=1` or `Excititor:Airgap:SealedMode=true`. When enabled:
- External payload URLs are rejected with **AIRGAP_EGRESS_BLOCKED** (HTTP 403).
- Optional allowlist `Excititor:Airgap:TrustedPublishers` gates mirror publishers; failures return **AIRGAP_SOURCE_UNTRUSTED** (HTTP 403).
- Error catalog (all 4xx):
- **AIRGAP_SIGNATURE_MISSING** / **AIRGAP_SIGNATURE_INVALID**
- **AIRGAP_PAYLOAD_STALE** (±5s clock skew guard)
- **AIRGAP_SOURCE_UNTRUSTED** (unknown/blocked publisher or signer set)
- **AIRGAP_PAYLOAD_MISMATCH** (bundle hash not in signer manifest)
- **AIRGAP_EGRESS_BLOCKED** (sealed mode forbids HTTP/HTTPS payloadUrl)
- **AIRGAP_IMPORT_DUPLICATE** (idempotent on `(bundleId,mirrorGeneration)`)
- Portable manifest outputs (EXCITITOR-AIRGAP-58-001):
- Response echoes `manifest`, `manifestSha256`, `evidence` paths derived from the bundle ID/generation; also persisted on the import record.
- Evidence Locker linkage: `evidence/{bundleId}/{generation}/bundle.ndjson` path recorded for downstream replay/export.
- Timeline events (deterministic order, ISO timestamps):
- `airgap.import.started`, `airgap.import.completed`, `airgap.import.failed`
- Attributes: `{tenantId,bundleId,generation,stalenessSeconds?,errorCode?}`
- Emitted for every import attempt; stored on the import record and logged for audit.
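The sealed-mode checks above can be sketched as a first-error validator. This is a simplified illustration: real signature and payload-hash verification is omitted, and the function name and argument shapes are hypothetical.

```python
def check_import(envelope, sealed, trusted_publishers, seen):
    """Return the first applicable error code, or None when accepted."""
    if not envelope.get("signature"):
        return "AIRGAP_SIGNATURE_MISSING"
    if sealed and str(envelope.get("payloadUrl", "")).startswith(
            ("http://", "https://")):
        return "AIRGAP_EGRESS_BLOCKED"   # sealed mode forbids external URLs
    if trusted_publishers and envelope["publisher"] not in trusted_publishers:
        return "AIRGAP_SOURCE_UNTRUSTED"
    key = (envelope["bundleId"], envelope["mirrorGeneration"])
    if key in seen:
        return "AIRGAP_IMPORT_DUPLICATE"  # idempotent on (bundleId, generation)
    seen.add(key)
    return None
```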
## Samples
- NDJSON sample: `docs/modules/excititor/samples/chunks-sample.ndjson` (hashes in `.sha256`) aligned to the schema above.
## Versioning
- Contract version: `v1` (this document). Changes must be additive; breaking changes require `v2` path and updated doc.

View File

@@ -0,0 +1,87 @@
# Excititor Graph Overlay Contract (v1.0.0)
_Updated: 2025-12-10 | Owners: Excititor Core + UI Guilds | Scope: EXCITITOR-GRAPH-21-001..005, EXCITITOR-POLICY-20-001/002, EXCITITOR-RISK-66-001_
## Purpose
Defines the graph-ready overlay built from Link-Not-Merge observations/linksets so Console, Vuln Explorer, Policy, and Risk surfaces consume a single deterministic shape. This freezes the contract for Postgres materialization and cache APIs, unblocking Sprint 0120 tasks.
## Schema
- JSON Schema: `docs/modules/excititor/schemas/vex_overlay.schema.json` (draft 2020-12, schemaVersion `1.0.0`).
- Required fields: `schemaVersion`, `generatedAt`, `tenant`, `purl`, `advisoryId`, `source`, `status`, `observations[]`, `provenance`.
- Status enum: `affected|not_affected|under_investigation|fixed|unknown`.
- Ordering: observations are sorted by `source, advisoryId, fetchedAt` (Link-Not-Merge invariant) and emitted in that order. Overlays are returned in request PURL order, then by `advisoryId`, then `source`.
- Provenance: carries `linksetId`, `linksetHash`, `observationHashes[]`, optional `policyHash`, `sbomContextHash`, and `planCacheKey` for replay.
## Postgres materialization (IAppendOnlyLinksetStore)
- Table `vex_overlays` (materialized cache):
- Primary key: `(tenant, purl, advisory_id, source)`.
- Columns: `status`, `justifications` (jsonb), `conflicts` (jsonb), `observations` (jsonb), `provenance` (jsonb), `cached_at`, `ttl_seconds`, `schema_version`.
- Indexes: unique `(tenant, purl, advisory_id, source)`, plus `(tenant, cached_at)` for TTL sweeps.
- Overlay rows are regenerated when linkset hash or observation hash set changes; cache evictions use `cached_at + ttl_seconds`.
- Linksets and observation hashes come from the append-only linkset store (`IAppendOnlyLinksetStore`) to preserve Aggregation-Only Contract guarantees.
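The regenerate-vs-evict decision for a cached `vex_overlays` row can be sketched as follows (timestamps simplified to epoch seconds; the helper name and return values are illustrative):

```python
def overlay_action(row, linkset_hash, observation_hashes, now):
    """Decide whether a cached overlay row can be served or must be rebuilt."""
    stale = (row["provenance"]["linksetHash"] != linkset_hash
             or set(row["provenance"]["observationHashes"])
                != set(observation_hashes))
    if stale:
        return "regenerate"  # hash set changed: insert a new overlay version
    if now >= row["cached_at"] + row["ttl_seconds"]:
        return "evict"       # TTL sweep removes the row
    return "serve"
```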
## API shape (Graph/Vuln Explorer)
- Endpoint: `GET /v1/graph/overlays?purl=<purl>&purl=<purl>&includeJustifications=true|false`.
- Response items follow `vex_overlay.schema.json`; `cache` stanza signals `cached`, `cachedAt`, and `ttlSeconds`.
- Cursoring: stable order (input PURL list) with `nextPageToken` based on `(tenant, purl, advisoryId, source, generatedAt)`.
- Telemetry: `excititor.graph.overlays.cache{tenant,hit}` counter; `excititor.graph.overlays.latency_ms` histogram tagged with `cached`.
## Sample (abridged)
```json
{
"schemaVersion": "1.0.0",
"generatedAt": "2025-12-10T00:00:00Z",
"tenant": "tenant-default",
"purl": "pkg:maven/org.example/foo@1.2.3",
"advisoryId": "GHSA-xxxx-yyyy-zzzz",
"source": "ghsa",
"status": "affected",
"justifications": [
{
"kind": "known_affected",
"reason": "Upstream GHSA reports affected range <1.3.0.",
"evidence": ["concelier:ghsa:obs:6561e41b3e3f4a6e9d3b91c1"],
"weight": 0.8
}
],
"conflicts": [
{
"field": "affected.versions",
"reason": "vendor_range_differs",
"values": ["<1.2.0", "<=1.3.0"],
"sourceIds": ["concelier:redhat:obs:...","concelier:ghsa:obs:..."]
}
],
"observations": [
{
"id": "concelier:ghsa:obs:6561e41b3e3f4a6e9d3b91c1",
"contentHash": "sha256:1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd",
"fetchedAt": "2025-11-19T00:00:00Z"
}
],
"provenance": {
"linksetId": "concelier:ghsa:linkset:6561e41b3e3f4a6e9d3b91d0",
"linksetHash": "sha256:deaddeaddeaddeaddeaddeaddeaddeaddeaddeaddeaddeaddeaddeaddeaddead",
"observationHashes": ["sha256:1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd"],
"policyHash": "sha256:0f7c...9ad3",
"sbomContextHash": "sha256:421af53f9eeba6903098d292fbd56f98be62ea6130b5161859889bf11d699d18",
"planCacheKey": "tenant-default|pkg:maven/org.example/foo@1.2.3|GHSA-xxxx-yyyy-zzzz"
},
"cache": {
"cached": true,
"cachedAt": "2025-12-10T00:00:00Z",
"ttlSeconds": 300
}
}
```
## Validation & determinism
- Validate overlays against `vex_overlay.schema.json` in CI and during materialization; reject or warn when fields drift.
- Deterministic ordering: input PURL order, then `advisoryId`, then `source`; observation list sorted by `source, advisoryId, fetchedAt`.
- No mutation: overlays are append-only; regeneration inserts a new row/version, leaving prior cache entries for audit until TTL expires.
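The deterministic response ordering (input PURL order, then `advisoryId`, then `source`) can be sketched as a rank-based sort; the helper name is illustrative:

```python
def order_overlays(overlays, requested_purls):
    # Rank PURLs by their position in the request, then break ties
    # by advisoryId, then source.
    purl_rank = {p: i for i, p in enumerate(requested_purls)}
    return sorted(overlays, key=lambda o: (
        purl_rank[o["purl"]], o["advisoryId"], o["source"]))
```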
## Handoff
- Consumers (Console, Vuln Explorer, Policy Engine, Risk) should treat `vex_overlay.schema.json` as the authoritative contract.
- Offline kits must bundle the schema file and sample payloads under `docs/modules/excititor/samples/` with SHA256 manifests.
- Future schema versions must bump `schemaVersion` and add migration notes to this document and `docs/modules/excititor/architecture.md`.
- Policy and Risk surfaces in WebService now read overlays directly (with claim-store fallback for policy tests) to produce lookup and risk feeds; overlay cache/store are selected per tenant (in-memory by default, Postgres `vex.graph_overlays` when configured).

# Excititor Implementation Plan
## Purpose
Provide a living plan for Excititor deliverables, dependencies, and evidence.
## Active work
- Track current sprints under `docs/implplan/SPRINT_*.md` for this module.
- Update this file when new scoped work is approved.
## Near-term deliverables
- TBD (add when sprint is staffed).
## Dependencies
- `docs/modules/excititor/architecture.md`
- `docs/modules/excititor/README.md`
- `docs/modules/platform/architecture-overview.md`
## Evidence of completion
- Code changes under `src/Excititor/**`.
- Tests and fixtures under the module's `__Tests` / `__Libraries`.
- Docs and runbooks under `docs/modules/excititor/**`.
## Notes
- Keep deterministic and offline-first expectations aligned with module AGENTS.

# architecture_excititor_mirrors.md — Excititor Mirror Distribution
> **Status:** Draft (Sprint 7). Complements `docs/modules/excititor/architecture.md` by describing the mirror export surface exposed by `Excititor.WebService` and the configuration hooks used by operators and downstream mirrors.
---
## 0) Purpose
Excititor publishes canonical VEX consensus data. Operators (or StellaOps-managed mirrors) need a deterministic way to sync those exports into downstream environments. Mirror distribution provides:
* A declarative map of export bundles (`json`, `jsonl`, `openvex`, `csaf`) reachable via signed HTTP endpoints under `/excititor/mirror`.
* Thin quota/authentication controls on top of the existing export cache so mirrors cannot starve the web service.
* Stable payload shapes that downstream automation can monitor (index → fetch updates → download artifact → verify signature).
Mirror endpoints are intentionally **read-only**. Write paths (export generation, attestation, cache) remain the responsibility of the export pipeline.
---
## 1) Configuration model
The web service reads mirror configuration from `Excititor:Mirror` (YAML/JSON/appsettings). Each domain groups a set of exports that share rate limits and authentication rules.
```yaml
Excititor:
Mirror:
Domains:
- id: primary
displayName: Primary Mirror
requireAuthentication: false
maxIndexRequestsPerHour: 600
maxDownloadRequestsPerHour: 1200
exports:
- key: consensus
format: json
filters:
vulnId: CVE-2025-0001
productKey: pkg:test/demo
sort:
createdAt: false # descending
limit: 1000
- key: consensus-openvex
format: openvex
filters:
vulnId: CVE-2025-0001
```
### Root settings
| Field | Required | Description |
| --- | --- | --- |
| `outputRoot` | | Filesystem root where mirror artefacts are written. Defaults to the Excititor file-system artifact store root when omitted. |
| `directoryName` | | Optional subdirectory created under `outputRoot`; defaults to `mirror`. |
| `targetRepository` | | Hint propagated to manifests/index files indicating the operator-visible location (for example `s3://mirror/excititor`). |
| `signing` | | Bundle signing configuration. When enabled, the exporter emits a detached JWS (`bundle.json.jws`) alongside each domain bundle. |
`signing` supports the following fields:
| Field | Required | Description |
| --- | --- | --- |
| `enabled` | | Toggles detached signing for domain bundles. |
| `algorithm` | | Signing algorithm identifier (default `ES256`). |
| `keyId` | ✅ (when `enabled`) | Signing key identifier resolved via the configured crypto provider registry. |
| `provider` | | Optional provider hint when multiple registries are available. |
| `keyPath` | | Optional PEM path used to seed the provider when the key is not already loaded. |
### Domain field reference
| Field | Required | Description |
| --- | --- | --- |
| `id` | ✅ | Stable identifier. Appears in URLs (`/excititor/mirror/domains/{id}`) and download filenames. |
| `displayName` | | Human-friendly label surfaced in the `/domains` listing. Falls back to `id`. |
| `requireAuthentication` | | When `true` the service enforces that the caller is authenticated (Authority token). |
| `maxIndexRequestsPerHour` | | Per-domain quota for index endpoints. `0`/negative disables the guard. |
| `maxDownloadRequestsPerHour` | | Per-domain quota for artifact downloads. |
| `exports` | ✅ | Collection of export projections. |
Export-level fields:
| Field | Required | Description |
| --- | --- | --- |
| `key` | ✅ | Unique key within the domain. Used in URLs (`/exports/{key}`) and filenames/bundle entries. |
| `format` | ✅ | One of `json`, `jsonl`, `openvex`, `csaf`. Maps to `VexExportFormat`. |
| `filters` | | Key/value pairs executed via `VexQueryFilter`. Keys must match export data source columns (e.g., `vulnId`, `productKey`). |
| `sort` | | Key/boolean map (false = descending). |
| `limit`, `offset`, `view` | | Optional query bounds passed through to the export query. |
⚠️ **Misconfiguration:** invalid formats or missing keys cause the affected exports to be flagged via a `status` field in the index response; they are not exposed downstream.
---
## 2) HTTP surface
Routes are grouped under `/excititor/mirror`.
| Method | Path | Description |
| --- | --- | --- |
| `GET` | `/domains` | Returns configured domains with quota metadata. |
| `GET` | `/domains/{domainId}` | Domain detail (auth/quota + export keys). `404` for unknown domains. |
| `GET` | `/domains/{domainId}/index` | Lists exports with exportId, query signature, format, artifact digest, attestation metadata, and size. Applies index quota. |
| `GET` | `/domains/{domainId}/exports/{exportKey}` | Returns manifest metadata (single export). `404` if unknown/missing. |
| `GET` | `/domains/{domainId}/exports/{exportKey}/download` | Streams export content from the artifact store. Applies download quota. |
Responses are serialized via `VexCanonicalJsonSerializer`, ensuring stable ordering. Download responses include a `Content-Disposition` header naming the file `<domain>-<export>.<ext>`.
### Error handling
* `401` authentication required (`requireAuthentication=true`).
* `404` domain/export not found or manifest not persisted.
* `429` per-domain quota exceeded (`Retry-After` header set in seconds).
* `503` export misconfiguration (invalid format/query).
---
## 3) Rate limiting
`MirrorRateLimiter` implements a simple rolling 1-hour window using `IMemoryCache`. Each domain has two quotas:
* `index` scope → `maxIndexRequestsPerHour`
* `download` scope → `maxDownloadRequestsPerHour`
`0` or negative limits disable enforcement. Quotas are best-effort (per-instance). For HA deployments, configure sticky routing at the ingress or replace the limiter with a distributed implementation.
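A minimal per-instance rolling-window limiter with the semantics described above might look like the following. This is a sketch, not the `MirrorRateLimiter` implementation; the class name and method shape are illustrative.

```python
import time
from collections import defaultdict, deque

class RollingWindowLimiter:
    """Best-effort, per-instance limiter: a rolling 1-hour window per
    (domain, scope); zero or negative limits disable enforcement."""

    def __init__(self, window_seconds=3600):
        self.window = window_seconds
        self.hits = defaultdict(deque)  # (domain, scope) -> request timestamps

    def try_acquire(self, domain, scope, limit, now=None):
        if limit <= 0:
            return True  # enforcement disabled
        now = time.monotonic() if now is None else now
        q = self.hits[(domain, scope)]
        # Drop timestamps that have aged out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= limit:
            return False  # caller should respond 429 with Retry-After
        q.append(now)
        return True
```

As the text notes, state lives in process memory, so in HA deployments either pin clients to an instance at the ingress or back the counters with a shared store.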
---
## 4) Interaction with export pipeline
Mirror endpoints consume manifests produced by the export engine (`MongoVexExportStore`). They do **not** trigger new exports. Operators must configure connectors/exporters to keep targeted exports fresh (see `EXCITITOR-EXPORT-01-005/006/007`).
Recommended workflow:
1. Define export plans at the export layer (JSON/OpenVEX/CSAF).
2. Configure mirror domains mapping to those plans.
3. Downstream mirror automation:
* `GET /domains/{id}/index`
* Compare `exportId` / `consensusRevision`
* `GET /download` when new
* Verify digest + attestation
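The mirror-automation loop above can be sketched as follows. Assumptions are flagged inline: the index field names (`exports`, `exportId`, `key`, `artifactDigest`) are illustrative, since the exact index payload shape is defined by the service, and any HTTP client that returns a file-like response will do.

```python
import hashlib
import json
import urllib.request  # any HTTP client works; urllib shown for illustration

def sync_domain(base_url, domain_id, known_export_ids, fetch=urllib.request.urlopen):
    """Hypothetical mirror-sync pass: list the index, download exports we have
    not yet mirrored, and verify each artifact digest against its index entry."""
    with fetch(f"{base_url}/excititor/mirror/domains/{domain_id}/index") as resp:
        index = json.load(resp)
    for export in index["exports"]:
        if export["exportId"] in known_export_ids:
            continue  # already mirrored; exportId is the change marker
        url = (f"{base_url}/excititor/mirror/domains/{domain_id}"
               f"/exports/{export['key']}/download")
        with fetch(url) as resp:
            payload = resp.read()
        digest = "sha256:" + hashlib.sha256(payload).hexdigest()
        if digest != export["artifactDigest"]:
            raise ValueError(f"digest mismatch for {export['key']}")
        known_export_ids.add(export["exportId"])
```

Attestation verification (step 4 of the workflow) would follow the digest check, using the attestation metadata carried in the index entry.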
When the export engine runs, it materializes the following artefacts under `outputRoot/<directoryName>`:
- `index.json`: canonical index listing each configured domain, manifest/bundle descriptors (with SHA-256 digests), and available export keys.
- `<domain>/manifest.json`: per-domain summary with export metadata (query signature, consensus/score digests, source providers) and a descriptor pointing at the bundle.
- `<domain>/bundle.json`: canonical payload containing serialized consensus, score envelopes, and normalized VEX claims for the matching export definitions.
- `<domain>/bundle.json.jws`: optional detached JWS emitted when signing is enabled.
Downstream automation reads `manifest.json`/`bundle.json` directly, while `/excititor/mirror` endpoints stream the same artefacts through authenticated HTTP.
---
## 5) Operational guidance
* Track quota utilisation via HTTP 429 metrics (configure structured logging or OTEL counters when rate limiting triggers).
* Mirror domains can be deployed per tenant (e.g., `tenant-a`, `tenant-b`) with different auth requirements.
* Ensure the underlying artifact stores (`FileSystem`, `S3`, offline bundle) retain artefacts long enough for mirrors to sync.
* For air-gapped mirrors, combine mirror endpoints with the Offline Kit (see `docs/OFFLINE_KIT.md`).
---
## 6) Future alignment
* Replace manual export definitions with generated mirror bundle manifests once `EXCITITOR-EXPORT-01-007` ships.
* Extend `/index` payload with quiet-provenance when `EXCITITOR-EXPORT-01-006` adds that metadata.
* Integrate domain manifests with DevOps mirror profiles (`DEVOPS-MIRROR-08-001`) so helm/compose overlays can enable or disable domains declaratively.
---
## 7) Runbook & observability checklist (Sprint 22 demo refresh · 2025-11-07)
### Daily / on-call checks
1. **Index freshness**: watch `excitor_mirror_export_latency_seconds` (p95 < 180 s) grouped by `domainId`. If latency grows past 10 minutes, verify the export worker queue (`stellaops-export-worker` logs) and ensure PostgreSQL `vex.exports` has entries newer than `now()-10m`.
2. **Quota exhaustion**: alert on increases in `excitor_mirror_quota_exhausted_total{scope="download"}`. When triggered, inspect structured logs (`MirrorDomainId`, `QuotaScope`, `RemoteIp`) and either raise limits or throttle abusive clients.
3. **Bundle signature health**: the metric `excitor_mirror_bundle_signature_verified_total` should match download counts when signing is enabled. Deltas indicate missing `.jws` files; rebuild the bundle via the export job or copy artefacts from the authority mirror cache.
4. **HTTP errors**: dashboards should track 4xx/5xx rates split by route; repeated `503` statuses imply misconfigured exports. Check `mirror/index` logs for `status=misconfigured`.
### Incident steps
1. Use `GET /excititor/mirror/domains/{id}/index` to capture current manifests. Attach the response to the incident log for reproducibility.
2. For quota incidents, temporarily raise `maxIndexRequestsPerHour`/`maxDownloadRequestsPerHour` via the `Excititor:Mirror:Domains` config override, redeploy, then work with the consuming team on caching.
3. For stale exports, trigger the export job (`Excititor.ExportRunner`) and confirm the artefacts are written to `outputRoot/<domain>`.
4. Validate detached signatures with cosign, e.g. `cosign verify-blob --key <pubkey> --rekor-url <rekor> --signature <domain>/bundle.json.jws <domain>/bundle.json` (the blob is a positional argument; cosign's `--bundle` flag refers to a Sigstore bundle, not the mirror bundle).
### Logging fields (structured)
| Field | Description |
| --- | --- |
| `MirrorDomainId` | Domain handling the request (matches `id` in config). |
| `QuotaScope` | `index` / `download`, useful when alerting on quota events. |
| `ExportKey` | Included in download logs to pinpoint misconfigured exports. |
| `BundleDigest` | SHA-256 of the artefact; compare with index payload when debugging corruption. |
### OTEL signals
- **Counters:** `excitor.mirror.requests`, `excitor.mirror.quota_blocked`, `excitor.mirror.signature.failures`.
- **Histograms:** `excitor.mirror.download.duration`, `excitor.mirror.export.latency`.
- **Spans:** `mirror.index`, `mirror.download` include attributes `mirror.domain`, `mirror.export.key`, and `mirror.quota.remaining`.
Add these instruments via the `MirrorEndpoints` middleware; see `StellaOps.Excititor.WebService/Telemetry/MirrorMetrics.cs`.

# Excititor Locker Manifest (OBS-53-001)
Defines the manifest for evidence snapshots stored in Evidence Locker / sealed-mode bundles.
## Manifest structure
```json
{
"tenant": "default",
"manifestId": "locker:excititor:2025-11-23:0001",
"createdAt": "2025-11-23T23:10:00Z",
"items": [
{
"observationId": "vex:obs:sha256:...",
"providerId": "ubuntu-csaf",
"contentHash": "sha256:...",
"linksetId": "CVE-2024-0001:pkg:maven/org.demo/app@1.2.3",
"dsseEnvelopeHash": "sha256:...",
"provenance": {
"source": "mirror|ingest",
"mirrorGeneration": 12,
"exportCenterManifest": "sha256:..."
}
}
],
"merkleRoot": "sha256:...", // over `items[*].contentHash`
"signature": null, // populated in OBS-54-001 (DSSE)
"metadata": {"sealed": true}
}
```
## Rules
- `items` sorted by `observationId`, then `providerId`.
- `merkleRoot` uses SHA-256 over concatenated item hashes (stable order above).
- `signature` is a DSSE envelope (hash recorded in `dsseEnvelopeHash`) when OBS-54-001 is enabled; otherwise `null`.
- Manifests are immutable; version using `manifestId` suffix.
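The `merkleRoot` rule above can be sketched directly. Note that as specified it is a flat SHA-256 over the ordered, concatenated `contentHash` values rather than a hash tree; the sketch follows the stated rule.

```python
import hashlib

def compute_merkle_root(items):
    """Sort items by (observationId, providerId), then SHA-256 the
    concatenation of their contentHash values, per the rules above."""
    ordered = sorted(items, key=lambda i: (i["observationId"], i["providerId"]))
    joined = "".join(i["contentHash"] for i in ordered).encode("utf-8")
    return "sha256:" + hashlib.sha256(joined).hexdigest()
```

Because the sort is applied before hashing, the root is independent of the order in which items were collected, which is what makes replay verification deterministic.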
## Storage and replay
- Store manifests alongside payloads in object storage; key prefix: `locker/excititor/<tenant>/<manifestId>`.
- Replay tools must verify `merkleRoot` before loading payloads; reject if mismatched.

# Excititor Timeline Events (OBS-52-001)
Defines the event envelope for evidence timelines emitted by Excititor. All fields are aggregation-only; no consensus/merge logic.
## Envelope
```json
{
"type": "excititor.timeline.v1",
"tenant": "default",
"eventId": "urn:uuid:...",
"timestamp": "2025-11-23T23:10:00Z",
"traceId": "beefcafe...",
"spanId": "deadb33f...",
"source": "excititor.web",
"kind": "observation|linkset",
"action": "ingest|update|backfill|replay",
"observationId": "vex:obs:sha256:...",
"linksetId": "CVE-2024-0001:pkg:maven/org.demo/app@1.2.3",
"justifications": ["component_not_present"],
"conflicts": [
{"providerId": "suse-csaf", "status": "fixed", "justification": null}
],
"evidenceHash": "sha256:...", // content-addressed payload hash
"dsseEnvelopeHash": "sha256:...", // if attested (see OBS-54-001)
"metadata": {"connector": "ubuntu-csaf", "mirrorGeneration": 12}
}
```
## Semantics
- `eventId` is stable per write; retries reuse the same ID.
- `timestamp` must be UTC; derive from TimeProvider.
- `traceId`/`spanId` propagate ingestion traces; if tracing is disabled, set both to `null`.
- `kind` + `action` drive downstream storage and alerting.
- `evidenceHash` is the raw document hash; `dsseEnvelopeHash` appears only when OBS-54-001 is enabled.
## Determinism
- Sort `justifications` and `conflicts` ascending by providerId/status before emit.
- Emit at-most-once per storage write; idempotent consumers rely on `(eventId, tenant)`.
## Transport
- Default topic: `excititor.timeline.v1` (NATS/Valkey). Subject includes tenant: `excititor.timeline.v1.<tenant>`.
- Payload size should stay <32 KiB; truncate conflict arrays with `truncated=true` flag if needed (keep hash counts deterministic).
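The determinism and truncation rules can be combined into a single pre-emit step. This is a sketch; `max_conflicts` is an illustrative knob standing in for whatever size guard keeps the payload under 32 KiB.

```python
def prepare_timeline_event(event, max_conflicts=100):
    """Sort justifications ascending, sort conflicts by (providerId, status),
    and truncate oversized conflict arrays with a truncated=true flag."""
    event = dict(event)  # do not mutate the caller's envelope
    event["justifications"] = sorted(event.get("justifications", []))
    conflicts = sorted(
        event.get("conflicts", []),
        key=lambda c: (c["providerId"], c["status"]),
    )
    if len(conflicts) > max_conflicts:
        conflicts = conflicts[:max_conflicts]
        event["truncated"] = True
    event["conflicts"] = conflicts
    return event
```

Sorting before truncating matters: it guarantees the same conflicts survive truncation on every emit, so idempotent consumers keyed on `(eventId, tenant)` see a stable payload.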

# Using the Chunk API
Endpoint: `POST /vex/evidence/chunks`
- Content-Type: `application/x-ndjson`
- See schema: `docs/modules/excititor/schemas/vex-chunk-api.yaml`
Response: `202 Accepted`
```json
{ "chunk_digest": "sha256:…", "queue_id": "uuid" }
```
Operational notes
- Deterministic hashing: server recomputes `chunk_digest` from canonical JSON; mismatches return 400.
- Limits: default 500 items, max 2000 (aligned with Program.cs guard).
- Telemetry: metrics under `StellaOps.Excititor.Chunks` (see chunk-telemetry.md).
- Headers: correlation/trace headers echoed (`X-Stella-TraceId`, `X-Stella-CorrelationId`).
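The deterministic-hashing note can be illustrated with a canonical-JSON digest. The server's exact canonicalization may differ (for example, RFC 8785 JCS); the sketch below shows the idea using sorted keys and no insignificant whitespace.

```python
import hashlib
import json

def chunk_digest(chunk):
    """Serialize to canonical JSON (sorted keys, compact separators) and
    SHA-256 the UTF-8 bytes, mirroring the server-side recomputation."""
    canonical = json.dumps(chunk, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A client that computes its digest the same way can predict the `chunk_digest` echoed in the `202` response; a mismatch means the client's serialization is not canonical and the server will reject with 400.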
Example curl
```bash
curl -X POST https://excitor.local/vex/evidence/chunks \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/x-ndjson" \
--data-binary @docs/modules/excititor/samples/chunk-sample.ndjson
```

# Excititor Chunk Telemetry (Sprint 110)
## Metrics (Meter: `StellaOps.Excititor.Chunks`)
- `vex_chunks_ingested_total` (counter) — tags: `tenant`, `source`, `status` (`accepted|rejected`), `reason` (nullable for accepted). Increments per chunk submitted.
- `vex_chunks_item_count` (histogram, unit=items) — records item count per chunk.
- `vex_chunks_payload_bytes` (histogram, unit=bytes) — measured from NDJSON payload length.
- `vex_chunks_latency_ms` (histogram) — end-to-end ingestion latency per request.
## Logs
- `vex.chunk.ingest.accepted` — includes `chunk_id`, `tenant`, `source`, `item_count`, `chunk_digest`.
- `vex.chunk.ingest.rejected` — includes `chunk_id`, `tenant`, `source`, `reason`, validation errors (summarized).
## Wiring steps
1. Register `ChunkTelemetry` as singleton with shared `Meter` instance.
2. In `/vex/evidence/chunks` handler, compute `chunk_digest` deterministically from canonical JSON and emit counters/histograms via `ChunkTelemetry`.
3. Log using structured templates above; avoid request bodies in logs.
4. Expose metrics via default ASP.NET metrics export (Prometheus/OpenTelemetry) already configured in WebService.
## Determinism & offline posture
- Do not include host-specific paths or timestamps in metric dimensions.
- Histogram buckets: use standard OTEL defaults; no runtime-generated buckets.
- Keep meter name stable; adding new instruments requires version note in sprint Decisions & Risks.
## Ownership
- Implementer: Excititor Observability Guild
- Reviewers: Evidence Locker Guild (for parity with attestation metrics)

# Excititor Consensus Removal Runbook (AOC-19-004)
- **Date:** 2025-11-21
- **Scope:** EXCITITOR-CORE-AOC-19-004
- **Goal:** Eliminate legacy consensus/merged severity fields so Excititor remains aggregation-only.
## Cutover steps
1) **Freeze consensus refresh**`DisableConsensus=true` (default) forces refresh loop off. Keep this enabled during migration.
2) **Schema cleanup** — migrate collections to remove or null legacy fields:
- `vex_consensus` / `vex_consensus_holds`: drop/ignore fields `consensusDigest`, `policyVersion`, `policyRevisionId`, `policyDigest`, `summary`, `signals`, `status` (merged) once Policy takes over.
- `vex_observations` / materialized exports: ensure no merged severity/status fields are written.
- `vex_mirror` exports: stop emitting consensus JSON; retain raw observations only.
3) **Telemetry:** emit counter `excititor.ingest.consensus.disabled` (tags `tenant`, `source`, `connectorId`) once per batch to prove cutover.
4) **Guards:** AOC guards reject any incoming/derived field in `{mergedSeverity, consensusScore, computedStatus}`.
5) **Backfill:** run one-off job to set `consensusDisabled=true` on legacy records and remove merged fields without touching raw observations.
6) **Verification:** regression checklist (per tenant):
- No writes to `vex_consensus*` collections after cutover.
- Ingest + export fixtures show only raw observations/linksets; snapshots deterministic.
- Telemetry counter present; absence of consensus refresh logs.
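The guard in step 4 can be sketched as a recursive field check. The forbidden field names come from the runbook above; the walk logic and return shape are illustrative, not the actual AOC guard implementation.

```python
FORBIDDEN_FIELDS = {"mergedSeverity", "consensusScore", "computedStatus"}

def check_aoc(document):
    """Reject any record carrying merged/consensus fields anywhere in the
    payload; returns JSON-pointer-style paths to each violation."""
    violations = []

    def walk(node, path):
        if isinstance(node, dict):
            for key, value in node.items():
                if key in FORBIDDEN_FIELDS:
                    violations.append(f"{path}/{key}")
                walk(value, f"{path}/{key}")
        elif isinstance(node, list):
            for i, value in enumerate(node):
                walk(value, f"{path}/{i}")

    walk(document, "")
    return violations  # non-empty -> reject the batch
```

Returning paths rather than a boolean keeps rejection messages actionable for connector owners debugging upstream feeds.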
## Config
```
Excititor:Worker:
DisableConsensus: true # keep true post-cutover
```
## Test plan (after disk space is restored)
- Unit: AOC guard rejects merged fields.
- Integration (Mongo2Go): ingest batch containing merged fields → rejected; telemetry counter increments.
- Worker: start with DisableConsensus=true → consensus refresh loop does not schedule; log once at startup.

# Excititor Advisory-AI evidence APIs (projection + chunks)
> Covers the read-only evidence surfaces shipped in Sprints 119–120: `/v1/vex/observations/{vulnerabilityId}/{productKey}` and `/v1/vex/evidence/chunks`.
## Scope and determinism
- **Aggregation-only**: no consensus, severity merging, or reachability. Responses carry raw statements plus provenance/signature metadata.
- **Stable ordering**: both endpoints sort by `lastSeen` DESC; pagination uses a deterministic `limit`.
- **Limits**: observation projection default `limit=200`, max `500`; chunk stream default `limit=500`, max `2000`.
- **Tenancy**: reads respect `X-Stella-Tenant` when provided; otherwise fall back to `DefaultTenant` configuration.
- **Auth**: bearer token with `vex.read` scope required.
## `/v1/vex/observations/{vulnerabilityId}/{productKey}`
- **Response**: JSON object with `vulnerabilityId`, `productKey`, `generatedAt`, `totalCount`, `truncated`, `statements[]`.
- **Statement fields**: `observationId`, `providerId`, `status`, `justification`, `detail`, `firstSeen`, `lastSeen`, `scope{key,name,version,purl,cpe,componentIdentifiers[]}`, `anchors[]`, `document{digest,format,revision,sourceUri}`, `signature{type,keyId,issuer,verifiedAt}`.
- **Filters**:
- `providerId` (multi-valued, comma-separated)
- `status` (values in `VexClaimStatus`)
- `since` (ISO-8601, UTC)
- `limit` (ints within bounds)
- **Mapping back to storage**:
- `observationId` = `{providerId}:{document.digest}`
- `document.digest` locates the raw record in `vex_raw`.
- `anchors` contain JSON pointers/paragraph locators from source metadata.
Headers:
- `Excititor-Results-Truncated: true|false`
- `Excititor-Results-Total: <int>`
## `/v1/vex/evidence/chunks`
- **Query params**: `vulnerabilityId` (required), `productKey` (required), optional `providerId`, `status`, `since`, `limit`.
- **Limits**: default `limit=500`, max `2000`.
- **Response**: **NDJSON** stream; each line is a `VexEvidenceChunkResponse`.
- **Chunk fields**: `observationId`, `linksetId`, `vulnerabilityId`, `productKey`, `providerId`, `status`, `justification`, `detail`, `scopeScore` (from confidence or signals), `firstSeen`, `lastSeen`, `scope{...}`, `document{digest,format,sourceUri,revision}`, `signature{type,subject,issuer,keyId,verifiedAt,transparencyRef}`, `metadata` (flattened additionalMetadata).
- **Headers**: `Excititor-Results-Total`, `Excititor-Results-Truncated` (mirrors projection API naming).
- **Streaming guidance (SDK/clients)**:
- Use HTTP client that supports response streaming; read line-by-line and JSON-deserialize per line.
- Treat stream as an NDJSON list up to `limit`; no outer array.
- Back-off or paginate by adjusting `since` or narrowing providers/statuses.
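The streaming guidance above boils down to a line-by-line reader. This sketch accepts any iterable of text lines (for example, the line iterator of an HTTP client's streamed response) so it stays client-library-agnostic.

```python
import json

def read_chunks(lines, limit=None):
    """Deserialize one VexEvidenceChunkResponse per NDJSON line; there is
    no outer array. Blank lines are skipped and do not count toward limit."""
    emitted = 0
    for line in lines:
        if limit is not None and emitted >= limit:
            break
        line = line.strip()
        if line:
            yield json.loads(line)
            emitted += 1
```

Pairing this with the `Excititor-Results-Truncated` header tells the client whether to issue a follow-up request with a narrower `since`/`providerId` filter.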
OpenAPI (excerpt):
```yaml
paths:
/v1/vex/evidence/chunks:
get:
summary: Stream evidence chunks for a vulnerability/product
parameters:
- in: query
name: vulnerabilityId
schema: { type: string }
required: true
- in: query
name: productKey
schema: { type: string }
required: true
- in: query
name: providerId
schema: { type: string }
description: Comma-separated provider ids
- in: query
name: status
schema: { type: string }
description: Comma-separated VEX statuses
- in: query
name: since
schema: { type: string, format: date-time }
- in: query
name: limit
schema: { type: integer, minimum: 1, maximum: 2000, default: 500 }
responses:
"200":
description: NDJSON stream of VexEvidenceChunkResponse
headers:
Excititor-Results-Total: { schema: { type: integer } }
Excititor-Results-Truncated: { schema: { type: boolean } }
content:
application/x-ndjson:
schema:
type: string
description: One JSON object per line (VexEvidenceChunkResponse)
```
Example (curl):
```bash
curl -s -H "Authorization: Bearer <token>" \
-H "X-Stella-Tenant: acme" \
"https://exc.example.test/v1/vex/evidence/chunks?vulnerabilityId=CVE-2025-0001&productKey=pkg:docker/demo&limit=2" |
head -n 2
```
Sample NDJSON line:
```json
{"observationId":"provider-a:4d2f...","linksetId":"CVE-2025-0001:pkg:docker/demo","vulnerabilityId":"CVE-2025-0001","productKey":"pkg:docker/demo","providerId":"provider-a","status":"Affected","justification":"ComponentNotPresent","detail":"demo detail","scopeScore":0.9,"firstSeen":"2025-11-10T12:00:00Z","lastSeen":"2025-11-12T12:00:00Z","scope":{"key":"pkg:docker/demo","name":"demo","version":"1.0.0","purl":"pkg:docker/demo","cpe":null,"componentIdentifiers":["component-a"]},"document":{"digest":"sha256:e7...","format":"sbomcyclonedx","sourceUri":"https://example.test/vex.json","revision":"r1"},"signature":{"type":"cosign","subject":"demo","issuer":"issuer","keyId":"kid","verifiedAt":"2025-11-12T12:00:00Z","transparencyRef":null},"metadata":{}}
```
## `/v1/vex/attestations/{attestationId}`
- **Purpose**: Lookup attestation provenance (supplier ↔ observation/linkset ↔ product/vulnerability) without touching consensus.
- **Response**: `VexAttestationPayload` with fields:
- `attestationId`, `supplierId`, `observationId`, `linksetId`, `vulnerabilityId`, `productKey`, `justificationSummary`, `issuedAt`, `metadata{}`.
- **Semantics**:
- `attestationId` matches the export/attestation ID used when signing (Resolve/Worker flows).
- `observationId`/`linksetId` map back to evidence identifiers; clients can stitch provenance for citations.
- **Auth**: `vex.read` scope; tenant header optional (payloads are tenant-agnostic).
## Error model
- Standard API envelope with `ValidationProblem` for missing required params.
- `scope` failures return `403` with problem details.
- Tenancy parse failures return `400`.
## Backwards compatibility
- No legacy routes are deprecated by these endpoints; they are additive and remain aggregation-only.
## References
- Implementation: `src/Excititor/StellaOps.Excititor.WebService/Program.cs` (`/v1/vex/observations/**`, `/v1/vex/evidence/chunks`).
- Telemetry: `src/Excititor/StellaOps.Excititor.WebService/Telemetry/EvidenceTelemetry.cs` (`excititor.vex.observation.*`, `excititor.vex.chunks.*`).
- Data model: `src/Excititor/StellaOps.Excititor.WebService/Contracts/VexObservationContracts.cs`, `Contracts/VexEvidenceChunkContracts.cs`.

# Excititor · Graph Linkouts & Overlays — Implementation Notes (Graph 21-001/002/005/24-101/24-102)
- **Date:** 2025-11-21
- **Scope:** EXCITITOR-GRAPH-21-001, EXCITITOR-GRAPH-21-002, EXCITITOR-GRAPH-21-005
- **Status:** Implementation guidance (storage wiring pending).
## Endpoints
1) **Linkouts (21-001)**
- `POST /internal/graph/linkouts`
- Body: `tenant`, `purls[]` (max 500), `includeJustifications?`, `includeProvenance?`
- Response: ordered by input `purls`; each item includes `advisories[]` (`advisoryId`, `source`, `status`, `justification?`, `modifiedAt`, `evidenceHash`, `connectorId`, `dsseEnvelopeHash?`) plus `conflicts[]`; `notFound[]`.
2) **Overlays (21-002)**
- `GET /v1/graph/overlays?purl=<purl>&purl=<purl>&includeJustifications=true|false`
- Response per PURL: `summary` counts (`open`, `not_affected`, `under_investigation`, `no_statement`), `latestModifiedAt`, `justifications[]` (unique, sorted), `provenance` (`sources[]`, `lastEvidenceHash`), `cached`, `cacheAgeMs`.
3) **Status summaries (24-101)**
- `GET /v1/graph/status?purl=<purl>&purl=<purl>`
- Response mirrors overlay summaries but omits justification payloads; includes `sources[]`, `lastEvidenceHash`, `cached`, `cacheAgeMs`. Intended for Vuln Explorer status colouring.
4) **Batch observations for tooltips (24-102)**
- `GET /v1/graph/observations?purl=<purl>[&purl=...]&includeJustifications=true|false[&limitPerPurl=50][&cursor=<base64>]`
- Response per PURL: ordered `observations[]` (`observationId`, `advisoryId`, `status`, `justification?`, `providerId`, `modifiedAt`, `evidenceHash`, `dsseEnvelopeHash?`) plus `truncated`; top-level `nextCursor`, `hasMore` enable paging. Limits enforced per PURL and globally.
## Storage & Indexes (21-005)
- `vex_observations` indexes:
- `{ tenant: 1, component.purl: 1, advisoryId: 1, source: 1, modifiedAt: -1 }`
- Sparse `{ tenant: 1, component.purl: 1, status: 1 }`
- Optional materialized `vex_overlays` cache: unique `{ tenant: 1, purl: 1 }`, TTL on `cachedAt` driven by `excititor:graph:overlayTtlSeconds` (default 300s); payload must validate against `docs/modules/excititor/schemas/vex_overlay.schema.json` (schemaVersion 1.0.0). Bundle sample payload `docs/modules/excititor/samples/vex-overlay-sample.json` in Offline Kits.
## Determinism
- Ordering: input PURL order → `advisoryId` → `source` for linkouts; overlays follow input order.
- Truncation: max 200 advisories per PURL; when truncated, include `truncated: true` and `nextCursor` (`advisoryId`, `source`).
## Config knobs
- `excititor:graph:overlayTtlSeconds` (default 300)
- `excititor:graph:maxPurls` (default 500)
- `excititor:graph:maxAdvisoriesPerPurl` (default 200)
- `excititor:graph:maxTooltipItemsPerPurl` (default 50)
- `excititor:graph:maxTooltipTotal` (default 1000)
## Telemetry
- Counter `excititor.graph.linkouts.requests` tags: `tenant`, `includeJustifications`, `includeProvenance`.
- Counter `excititor.graph.overlays.cache` tags: `tenant`, `hit` (`true|false`).
- Histogram `excititor.graph.linkouts.latency.ms` tags: `tenant`.
## Steps to implement
- Bind `GraphOptions` to `Excititor:Graph`.
- Add endpoints to WebService with tenant guard; enforce limits.
- Implement overlay cache with deterministic sort; respect TTL; surface `cached` + `cacheAgeMs`.
- Backfill Mongo indexes above.
- Integration tests (WebApplicationFactory + Mongo2Go) for ordering, truncation, cache metadata, tenant isolation.

# Excititor Observability Guide
> Added 2025-11-14 alongside Sprint 119 (`EXCITITOR-AIAI-31-003`). Complements the AirGap/mirror runbooks under the same folder.
Excititor's evidence APIs now emit first-class OpenTelemetry metrics so Lens, Advisory AI, and Ops can detect misuse or missing provenance without paging through logs. This document lists the counters/histograms shipped by the WebService (`src/Excititor/StellaOps.Excititor.WebService`) and how to hook them into your exporters/dashboards.
## Telemetry prerequisites
- Enable `Excititor:Telemetry` in the service configuration (`appsettings.*`), ensuring **metrics** export is on. The WebService automatically adds the evidence meter (`StellaOps.Excititor.WebService.Evidence`) alongside the ingestion meter.
- Deploy at least one OTLP or console exporter (see `TelemetryExtensions.ConfigureExcititorTelemetry`). If your region lacks OTLP transport, fall back to scraping the console exporter for smoke tests.
- Coordinate with the Ops/Signals guild to provision the span/metric sinks referenced in `docs/modules/platform/architecture-overview.md#observability`.
## Metrics reference
| Metric | Type | Description | Key dimensions |
| --- | --- | --- | --- |
| `excititor.vex.observation.requests` | Counter | Number of `/v1/vex/observations/{vulnerabilityId}/{productKey}` requests handled. | `tenant`, `outcome` (`success`, `error`, `cancelled`), `truncated` (`true/false`) |
| `excititor.vex.observation.statement_count` | Histogram | Distribution of statements returned per observation projection request. | `tenant`, `outcome` |
| `excititor.vex.signature.status` | Counter | Signature status per statement (missing vs. unverified). | `tenant`, `status` (`missing`, `unverified`) |
| `excititor.vex.aoc.guard_violations` | Counter | Aggregated count of Aggregation-Only Contract violations detected by the WebService (ingest + `/v1/vex/aoc/verify`). | `tenant`, `surface` (`ingest`, `aoc_verify`, etc.), `code` (AOC error code) |
| `excititor.vex.chunks.requests` | Counter | Requests to `/v1/vex/evidence/chunks` (NDJSON stream). | `tenant`, `outcome` (`success`,`error`,`cancelled`), `truncated` (`true/false`) |
| `excititor.vex.chunks.bytes` | Histogram | Size of NDJSON chunk streams served (bytes). | `tenant`, `outcome` |
| `excititor.vex.chunks.records` | Histogram | Count of evidence records emitted per chunk stream. | `tenant`, `outcome` |
> All metrics originate from the `EvidenceTelemetry` helper (`src/Excititor/StellaOps.Excititor.WebService/Telemetry/EvidenceTelemetry.cs`). When disabled (telemetry off), the helper is inert.
### Dashboard hints
- **Advisory-AI readiness**: alert when `excititor.vex.signature.status{status="missing"}` spikes for a tenant, indicating connectors aren't supplying signatures.
- **Guardrail monitoring**: graph `excititor.vex.aoc.guard_violations` per `code` to catch upstream feed regressions before they pollute Evidence Locker or Lens caches.
- **Capacity planning**: histogram percentiles of `excititor.vex.observation.statement_count` feed API sizing (higher counts mean Advisory AI is requesting broad scopes).
## Operational steps
1. **Enable telemetry**: set `Excititor:Telemetry:EnableMetrics=true`, configure OTLP endpoints/headers as described in `TelemetryExtensions`.
2. **Add dashboards**: import panels referencing the metrics above (see Grafana JSON snippets in Ops repo once merged).
3. **Alerting**: add rules for high guard violation rates, missing signatures, and abnormal chunk bytes/record counts. Tie alerts back to connectors via tenant metadata.
4. **Post-deploy checks**: after each release, verify metrics emit by curling `/v1/vex/observations/...` and `/v1/vex/evidence/chunks`, watching the console exporter (dev) or OTLP (prod).
## SLOs (Sprint 119 OBS-51-001)
The following SLOs apply to Excititor evidence read paths when telemetry is enabled. Record them in the shared SLO registry and alert via the platform alertmanager.
| Surface | SLI | Target | Window | Burn alert | Notes |
| --- | --- | --- | --- | --- | --- |
| `/v1/vex/observations` | p95 latency | ≤ 450ms | 7d | 2% over 1h | Measured on successful responses only; tenant scoped. |
| `/v1/vex/observations` | freshness | ≥ 99% within 5min of upstream ingest | 7d | 5% over 4h | Derived from arrival minus `createdAt`; requires ingest clocks in UTC. |
| `/v1/vex/observations` | signature presence | ≥ 98% statements with signature present | 7d | 3% over 24h | Use `excititor.vex.signature.status{status="missing"}`. |
| `/v1/vex/evidence/chunks` | p95 stream duration | ≤ 600ms | 7d | 2% over 1h | From request start to last NDJSON write; excludes client disconnects. |
| `/v1/vex/evidence/chunks` | truncation rate | ≤ 1% truncated streams | 7d | 1% over 1h | `excititor.vex.chunks.requests` with `truncated=true`. |
| AOC guardrail | zero hard violations | 0 | continuous | immediate | Any `excititor.vex.aoc.guard_violations` with severity `error` pages ops. |
Implementation notes:
- Emit latency/freshness SLOs via OTEL views that pre-aggregate by tenant and route to the platform SLO backend; keep bucket boundaries aligned with 50/100/250/450/650/1000ms.
- Freshness SLI derived from ingest timestamps; ensure clocks are synchronized (NTP) and stored in UTC.
- For air-gapped deployments without OTEL sinks, scrape console exporter and push to offline Prometheus; same thresholds apply.
## Related documents
- `docs/modules/excititor/architecture.md`: API contract, AOC guardrails, connector responsibilities.
- `docs/modules/excititor/mirrors.md`: AirGap/mirror ingestion checklist (feeds into `EXCITITOR-AIRGAP-56/57`).
- `docs/modules/platform/architecture-overview.md#observability`: platform-wide telemetry guidance.


@@ -0,0 +1,39 @@
# Excititor Tenant Authority Client (AOC-19-013)
- **Date:** 2025-11-21
- **Scope:** EXCITITOR-CORE-AOC-19-013
- **Files:** `src/Excititor/StellaOps.Excititor.Worker/Auth/TenantAuthorityClientFactory.cs`
## Contract
- Every outbound Authority call must carry `X-Tenant` header and use tenant-specific base URL.
- Base URLs and optional client credentials are configured under `Excititor:Authority:` with per-tenant keys.
- Factory throws when tenant is missing or not configured to prevent cross-tenant leakage.
## Configuration shape
```json
{
"Excititor": {
"Authority": {
"BaseUrls": {
"alpha": "https://authority.alpha.local/",
"bravo": "https://authority.bravo.local/"
},
"ClientIds": {
"alpha": "alpha-client-id"
},
"ClientSecrets": {
"alpha": "alpha-secret"
}
}
}
}
```
## Implementation notes
- `TenantAuthorityClientFactory` (worker) enforces tenant presence and configured base URL; adds `Accept: application/json` and `X-Tenant` headers.
- Registered in DI via `Program.cs` with options binding to `Excititor:Authority`.
- Intended to be reused by WebService/Worker components once disk space block is resolved.
## Next steps
- Wire factory into services that call Authority (WebService + Worker jobs), replacing any tenant-agnostic HttpClient usages.
- Add integration tests to ensure cross-tenant calls reject when config missing or header mismatched.
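The fail-closed behavior the contract requires can be sketched in a few lines. This is a hypothetical Python analogue of `TenantAuthorityClientFactory`, not its implementation; the function and exception names are illustrative:

```python
# Hypothetical sketch of the tenant-scoping contract: resolve a per-tenant
# base URL, fail closed when the tenant is missing or unconfigured, and
# stamp the X-Tenant header on every outbound Authority call.

class TenantNotConfiguredError(Exception):
    pass

def build_request_headers(tenant: str,
                          base_urls: dict) -> tuple:
    """Return (base_url, headers) for the tenant, or raise to prevent leakage."""
    if not tenant:
        raise TenantNotConfiguredError("tenant is required")
    base_url = base_urls.get(tenant)
    if base_url is None:
        raise TenantNotConfiguredError(
            f"no Authority base URL configured for tenant '{tenant}'")
    return base_url, {"Accept": "application/json", "X-Tenant": tenant}
```

The important design point mirrored here is that an unknown tenant raises rather than falling back to a shared default URL.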


@@ -0,0 +1,66 @@
# Ubuntu CSAF connector runbook
> Updated 2025-11-09 alongside sprint 110/120 trust-provenance work.
## Purpose
- Ingest Ubuntu USN/CSAF statements via the restart-only connector (`StellaOps.Excititor.Connectors.Ubuntu.CSAF`).
- Preserve Aggregation-Only Contract guarantees while surfacing issuance provenance (`vex.provenance.*`) for VEX Lens and Policy Engine.
- Allow operators to tune trust weighting (tiers, fingerprints, cosign issuers) without recompiling the connector.
## Configuration keys
| Key | Default | Notes |
| --- | --- | --- |
| `Excititor:Connectors:Ubuntu:IndexUri` | `https://ubuntu.com/security/csaf/index.json` | Ubuntu CSAF index. Override only when mirroring the feed. |
| `...:Channels` | `["stable"]` | List of channel names to poll. Order preserved for deterministic cursoring. |
| `...:MetadataCacheDuration` | `4h` | How long to cache catalog metadata before re-fetching. |
| `...:PreferOfflineSnapshot` / `OfflineSnapshotPath` / `PersistOfflineSnapshot` | `false` / `null` / `true` | Enable when running from Offline Kit bundles. Snapshot path must be reachable/read-only under sealed deployments. |
| `...:TrustWeight` | `0.75` | Baseline trust weight (0-1). Lens multiplies this by freshness/justification modifiers. |
| `...:TrustTier` | `"distro"` | Friendly tier label surfaced via `vex.provenance.trust.tier` (e.g., `distro-trusted`, `community`). |
| `...:CosignIssuer` / `CosignIdentityPattern` | `null` | Supply when Ubuntu publishes cosign attestations (issuer URL and identity regex). Required together. |
| `...:PgpFingerprints` | `[]` | Ordered list of trusted PGP fingerprints. Emitted verbatim as `vex.provenance.pgp.fingerprints`. |
## Example `appsettings.json`
```jsonc
{
"Excititor": {
"Connectors": {
"Ubuntu": {
"IndexUri": "https://mirror.example.com/security/csaf/index.json",
"Channels": ["stable", "esm-apps"],
"TrustWeight": 0.82,
"TrustTier": "distro-trusted",
"CosignIssuer": "https://issuer.ubuntu.com",
"CosignIdentityPattern": "spiffe://ubuntu/vex/*",
"PgpFingerprints": [
"0123456789ABCDEF0123456789ABCDEF01234567",
"89ABCDEF0123456789ABCDEF0123456789ABCDEF"
],
"PreferOfflineSnapshot": true,
"OfflineSnapshotPath": "/opt/stella/offline/ubuntu/index.json"
}
}
}
}
```
## Environment variable cheatsheet
```
Excititor__Connectors__Ubuntu__TrustWeight=0.9
Excititor__Connectors__Ubuntu__TrustTier=distro-critical
Excititor__Connectors__Ubuntu__PgpFingerprints__0=AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Excititor__Connectors__Ubuntu__PgpFingerprints__1=BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
Excititor__Connectors__Ubuntu__CosignIssuer=https://issuer.ubuntu.com
Excititor__Connectors__Ubuntu__CosignIdentityPattern=spiffe://ubuntu/vex/*
```
## Operational checklist
1. **Before enabling**: import the Ubuntu PGP bundle (Offline Kit provides `certificates/ubuntu-vex.gpg`) and set the fingerprints so provenance metadata stays deterministic.
2. **Validate provenance output**: run `dotnet test src/Excititor/__Tests/StellaOps.Excititor.Connectors.Ubuntu.CSAF.Tests --filter FetchAsync_IngestsNewDocument` to ensure the connector emits the `vex.provenance.*` fields expected by VEX Lens.
3. **Monitor Lens weights**: Grafana panels `VEX Lens / Trust Inputs` show the weight/tier captured per provider. Ubuntu rows should reflect the configured `TrustWeight` and fingerprints.
4. **Rotate fingerprints**: update `PgpFingerprints` when Canonical rotates signing keys. Apply the change, restart Excititor workers, verify the provenance metadata, then trigger a targeted Lens recompute for Ubuntu issuers.
5. **Offline mode**: populate `OfflineSnapshotPath` via Offline Kit bundles before toggling `PreferOfflineSnapshot`. Keep snapshots in the sealed `/opt/stella/offline` hierarchy for auditability.
## Troubleshooting
- **Connector refuses to start**: check logs for `InvalidOperationException` referencing `CosignIssuer`/`CosignIdentityPattern` or a missing snapshot path; the validator enforces complete pairs and on-disk paths.
- **Lens still sees default weights**: confirm the Excititor deployment picked up the new settings (view `/excititor/health` JSON → `connectors.providers[].options`). Lens only overrides when the provenance payload includes `vex.provenance.trust.*` fields.
- **PGP mismatch alerts**: if Lens reports fingerprint mismatches, ensure the list ordering matches Canonical's published order; duplicates are trimmed, so provide each fingerprint once.
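The troubleshooting note about trimmed duplicates implies an order-preserving deduplication of the configured fingerprint list. A minimal sketch of that behavior (the connector's exact normalization may differ; fingerprints are otherwise emitted verbatim):

```python
# Order-preserving deduplication of configured PGP fingerprints, so the
# emitted vex.provenance.pgp.fingerprints list stays deterministic.
# Illustrative only; the connector may apply additional validation.

def normalize_fingerprints(fingerprints: list) -> list:
    seen = set()
    result = []
    for fp in fingerprints:
        if fp not in seen:
            seen.add(fp)
            result.append(fp)
    return result
```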


@@ -0,0 +1,18 @@
{
"subject_digest": "sha256:112233",
"predicates": {
"stellaops.vex.chunk.meta.v1": {
"tenant": "acme",
"source": "ghsa",
"schema": "stellaops.vex.chunk.v1",
"item_count": 1
},
"stellaops.vex.chunk.integrity.v1": {
"items": [
{"ordinal": 0, "sha256": "abc"}
]
}
},
"signing_profile": "sovereign-default",
"transparency": null
}


@@ -0,0 +1 @@
{"chunk_id":"11111111-2222-3333-4444-555555555555","tenant":"acme","source":"ghsa","schema":"stellaops.vex.chunk.v1","items":[{"advisory_id":"GHSA-123","status":"affected","purl":"pkg:npm/foo@1.0.0"}],"provenance":{"fetched_at":"2025-11-20T00:00:00Z","artifact_sha":"abc"}}


@@ -0,0 +1,2 @@
{"tenant":"demo","vulnerabilityId":"CVE-2024-1234","productKey":"pkg:pypi/django@3.2.24","observationId":"obs-001","statementId":"stmt-001","source":{"supplier":"upstream:osv","documentId":"osv:CVE-2024-1234","retrievedAt":"2025-11-18T12:00:00Z","signatureStatus":"missing"},"aoc":{"violations":[]},"evidence":{"type":"vex.statement","payload":{"status":"not_affected","justification":"component_not_present"}},"provenance":{"hash":"sha256:dummyhash","canonicalUri":null,"bundleId":null}}
{"tenant":"demo","vulnerabilityId":"CVE-2024-2345","productKey":"pkg:pypi/requests@2.31.0","observationId":"obs-002","statementId":"stmt-001","source":{"supplier":"upstream:osv","documentId":"osv:CVE-2024-2345","retrievedAt":"2025-11-18T12:05:00Z","signatureStatus":"unverified"},"aoc":{"violations":[{"code":"EVIDENCE_SIGNATURE_MISSING","surface":"ingest"}]},"evidence":{"type":"vex.statement","payload":{"status":"affected","impact":"info","details":"placeholder"}},"provenance":{"hash":"sha256:dummyhash2","canonicalUri":null,"bundleId":null}}


@@ -0,0 +1 @@
4d638b24d6f8f703bcbcac23a0185265f3db5defb9f3d7f33b7be7fccc0de738 docs/modules/excititor/samples/chunks-sample.ndjson


@@ -0,0 +1,93 @@
{
"schemaVersion": "1.0.0",
"generatedAt": "2025-11-20T00:00:00Z",
"connectors": [
{
"connectorId": "excititor:msrc",
"provider": { "name": "Microsoft Security Response Center", "slug": "msrc" },
"issuerTier": "tier-1",
"signers": [
{
"usage": "csaf",
"fingerprints": [
{"alg": "sha256", "format": "pgp", "value": "F1C3D9E4A7B28C5FD6E1A203B947C2A0C5D8BEEF"},
{"alg": "sha256", "format": "x509-spki", "value": "5A1F4C0E9B27D0C64EAC1F22C3F501AA9FCB77AC8B1D4F9F3EA7E6B4CE90F311"}
],
"keyLocator": "oci://mirror.stella.local/keys/msrc-csaf@sha256:793dd8a6..."
}
],
"bundle": {
"kind": "oci-referrer",
"uri": "oci://mirror.stella.local/msrc/csaf:2025-11-19",
"digest": "sha256:4b8c9fd6e479e1b6dcd2e7ed93a85c1c7d6052f7b4a6b83471e44f5c9c2a1f30",
"publishedAt": "2025-11-19T12:00:00Z"
},
"validFrom": "2025-11-01"
},
{
"connectorId": "excititor:oracle",
"provider": { "name": "Oracle", "slug": "oracle" },
"issuerTier": "tier-1",
"signers": [
{
"usage": "oval",
"fingerprints": [
{"alg": "sha256", "format": "x509-spki", "value": "6E3AC4A95BD5402F4C7E9B2371190E0F3B3C11C7B42B88652E7EE0F659A0D202"}
],
"keyLocator": "file://offline-kits/oracle/oval/signing-chain.pem",
"certificateChain": ["-----BEGIN CERTIFICATE-----\nMIID...oracle-root...\n-----END CERTIFICATE-----"]
}
],
"bundle": {
"kind": "file",
"uri": "file://offline-kits/oracle/oval/oval-feed-2025-11-18.tar.gz",
"digest": "sha256:b13b1b84af1da7ee3433e0c6c0cc28a8b5c7d3e52d93b9f86d4a4b0f1dcd8f05",
"publishedAt": "2025-11-18T09:30:00Z"
},
"validFrom": "2025-10-15"
},
{
"connectorId": "excititor:oci.openvex.attest",
"provider": { "name": "StellaOps Mirror", "slug": "stella-mirror" },
"issuerTier": "tier-0",
"signers": [
{
"usage": "openvex",
"fingerprints": [
{"alg": "sha256", "format": "cosign", "value": "a0c1d4e5f6b7982134d56789e0fab12345cdef6789abcdeffedcba9876543210"}
],
"keyLocator": "oci://mirror.stella.local/keys/stella-mirror-openvex:1",
"certificateChain": []
}
],
"bundle": {
"kind": "oci-tag",
"uri": "oci://mirror.stella.local/stellaops/openvex:2025-11-19",
"digest": "sha256:77f6c0b8f2c9845c7d0a4f3b783b0caf00cce6fb899319ff69cb941fe2c58010",
"publishedAt": "2025-11-19T15:00:00Z"
},
"validFrom": "2025-11-15"
},
{
"connectorId": "excititor:ubuntu",
"provider": { "name": "Ubuntu Security", "slug": "ubuntu" },
"issuerTier": "tier-2",
"signers": [
{
"usage": "oval",
"fingerprints": [
{"alg": "sha256", "format": "pgp", "value": "7D19E3B4A5F67C103CB0B4DE0FA28F90D6E4C1D2"}
],
"keyLocator": "tuf://mirror.stella.local/tuf/ubuntu/targets/oval-signing.pub"
}
],
"bundle": {
"kind": "tuf",
"uri": "tuf://mirror.stella.local/tuf/ubuntu/oval/targets/oval-2025-11-18.tar.gz",
"digest": "sha256:e41c4fc15132f8848e9924a1a0f1a247d3c56da87b7735b6c6d8cbe64f0f07e5",
"publishedAt": "2025-11-18T07:00:00Z"
},
"validFrom": "2025-11-01"
}
]
}


@@ -0,0 +1 @@
a2f0986d938d877adf01a76b7a9e79cc148f330e57348569619485feb994df1d connector-signer-metadata-sample.json


@@ -0,0 +1,50 @@
{
"schemaVersion": "1.0.0",
"generatedAt": "2025-12-10T00:00:00Z",
"tenant": "tenant-default",
"purl": "pkg:maven/org.example/foo@1.2.3",
"advisoryId": "GHSA-xxxx-yyyy-zzzz",
"source": "ghsa",
"status": "affected",
"justifications": [
{
"kind": "known_affected",
"reason": "Upstream GHSA reports affected range <1.3.0.",
"evidence": ["concelier:ghsa:obs:6561e41b3e3f4a6e9d3b91c1"],
"weight": 0.8
}
],
"conflicts": [
{
"field": "affected.versions",
"reason": "vendor_range_differs",
"values": ["<1.2.0", "<=1.3.0"],
"sourceIds": [
"concelier:redhat:obs:6561e41b3e3f4a6e9d3b91a1",
"concelier:ghsa:obs:6561e41b3e3f4a6e9d3b91c1"
]
}
],
"observations": [
{
"id": "concelier:ghsa:obs:6561e41b3e3f4a6e9d3b91c1",
"contentHash": "sha256:1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd",
"fetchedAt": "2025-11-19T00:00:00Z"
}
],
"provenance": {
"linksetId": "concelier:ghsa:linkset:6561e41b3e3f4a6e9d3b91d0",
"linksetHash": "sha256:deaddeaddeaddeaddeaddeaddeaddeaddeaddeaddeaddeaddeaddeaddeaddead",
"observationHashes": [
"sha256:1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd1234abcd"
],
"policyHash": "sha256:0f7c0f7c0f7c0f7c0f7c0f7c0f7c0f7c0f7c0f7c0f7c0f7c0f7c0f7c0f7c0f7c",
"sbomContextHash": "sha256:421af53f9eeba6903098d292fbd56f98be62ea6130b5161859889bf11d699d18",
"planCacheKey": "tenant-default|pkg:maven/org.example/foo@1.2.3|GHSA-xxxx-yyyy-zzzz"
},
"cache": {
"cached": true,
"cachedAt": "2025-12-10T00:00:00Z",
"ttlSeconds": 300
}
}


@@ -0,0 +1,125 @@
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://stellaops.dev/schemas/excititor/connector-signer-metadata.schema.json",
"title": "Excititor Connector Signer Metadata",
"type": "object",
"additionalProperties": false,
"required": ["schemaVersion", "generatedAt", "connectors"],
"properties": {
"schemaVersion": {
"type": "string",
"pattern": "^1\\.0\\.0$"
},
"generatedAt": {
"type": "string",
"format": "date-time"
},
"connectors": {
"type": "array",
"minItems": 1,
"items": {
"$ref": "#/$defs/connector"
}
}
},
"$defs": {
"connector": {
"type": "object",
"additionalProperties": false,
"required": [
"connectorId",
"provider",
"issuerTier",
"signers"
],
"properties": {
"connectorId": {
"type": "string",
        "pattern": "^[a-z0-9.:-]+$"
},
"provider": {
"type": "object",
"additionalProperties": false,
"required": ["name", "slug"],
"properties": {
"name": { "type": "string", "minLength": 3 },
"slug": { "type": "string", "pattern": "^[a-z0-9-]+$" }
}
},
"issuerTier": {
"type": "string",
"enum": ["tier-0", "tier-1", "tier-2", "untrusted"]
},
"signers": {
"type": "array",
"minItems": 1,
"items": { "$ref": "#/$defs/signer" }
},
"bundle": { "$ref": "#/$defs/bundleRef" },
"validFrom": { "type": "string", "format": "date" },
"validTo": { "type": "string", "format": "date" },
"revoked": { "type": "boolean", "default": false },
"notes": { "type": "string", "maxLength": 2000 }
}
},
"signer": {
"type": "object",
"additionalProperties": false,
"required": ["usage", "fingerprints"],
"properties": {
"usage": {
"type": "string",
"enum": ["csaf", "oval", "openvex", "bulk-meta", "attestation"]
},
"fingerprints": {
"type": "array",
"minItems": 1,
"items": { "$ref": "#/$defs/fingerprint" }
},
"keyLocator": {
"type": "string",
"description": "Path or URL (mirror/OCI/TUF) where the signing key or certificate chain can be retrieved in offline kits."
},
"certificateChain": {
"type": "array",
"items": { "type": "string" },
"description": "Optional PEM-encoded certificates for x509/cosign keys."
}
}
},
"fingerprint": {
"type": "object",
"additionalProperties": false,
"required": ["alg", "value"],
"properties": {
"alg": {
"type": "string",
"enum": ["sha256", "sha512", "sha1"]
},
"format": {
"type": "string",
"enum": ["pgp", "x509-spki", "x509-ski", "cosign", "pem"]
},
"value": {
"type": "string",
"minLength": 16,
"maxLength": 128
}
}
},
"bundleRef": {
"type": "object",
"additionalProperties": false,
"required": ["kind", "uri"],
"properties": {
"kind": {
"type": "string",
"enum": ["oci-referrer", "oci-tag", "file", "tuf"]
},
"uri": { "type": "string", "minLength": 8 },
"digest": { "type": "string", "minLength": 32 },
"publishedAt": { "type": "string", "format": "date-time" }
}
}
}
}


@@ -0,0 +1,305 @@
# Issuer Directory Contract v1.0.0
**Status:** APPROVED
**Version:** 1.0.0
**Effective:** 2025-12-19
**Owner:** VEX Lens Guild + Issuer Directory Guild
**Sprint:** SPRINT_0129_0001_0001 (unblocks VEXLENS-30-003)
---
## 1. Purpose
The Issuer Directory provides a registry of known VEX statement issuers with trust metadata, signing key information, and provenance tracking.
## 2. Data Model
### 2.1 Issuer Entity
```csharp
public sealed record Issuer
{
/// <summary>Unique issuer identifier (e.g., "vendor:redhat", "cert:cisa").</summary>
public required string IssuerId { get; init; }
/// <summary>Issuer category.</summary>
public required IssuerCategory Category { get; init; }
/// <summary>Display name.</summary>
public required string DisplayName { get; init; }
/// <summary>Trust tier assignment.</summary>
public required IssuerTrustTier TrustTier { get; init; }
/// <summary>Official website URL.</summary>
public string? WebsiteUrl { get; init; }
/// <summary>Security advisory feed URL.</summary>
public string? AdvisoryFeedUrl { get; init; }
/// <summary>Registered signing keys.</summary>
public ImmutableArray<SigningKeyInfo> SigningKeys { get; init; }
/// <summary>Products/ecosystems this issuer is authoritative for.</summary>
public ImmutableArray<string> AuthoritativeFor { get; init; }
/// <summary>When this issuer record was created.</summary>
public DateTimeOffset CreatedAt { get; init; }
/// <summary>When this issuer record was last updated.</summary>
public DateTimeOffset UpdatedAt { get; init; }
/// <summary>Whether issuer is active.</summary>
public bool IsActive { get; init; } = true;
}
```
### 2.2 Issuer Category
```csharp
public enum IssuerCategory
{
/// <summary>Software vendor/maintainer.</summary>
Vendor = 0,
/// <summary>Linux distribution.</summary>
Distribution = 1,
/// <summary>CERT/security response team.</summary>
Cert = 2,
/// <summary>Security research organization.</summary>
SecurityResearch = 3,
/// <summary>Community project.</summary>
Community = 4,
/// <summary>Commercial security vendor.</summary>
Commercial = 5
}
```
### 2.3 Signing Key Info
```csharp
public sealed record SigningKeyInfo
{
/// <summary>Key fingerprint (SHA-256).</summary>
public required string Fingerprint { get; init; }
/// <summary>Key type (pgp, x509, sigstore).</summary>
public required string KeyType { get; init; }
/// <summary>Key algorithm (rsa, ecdsa, ed25519).</summary>
public string? Algorithm { get; init; }
/// <summary>Key size in bits.</summary>
public int? KeySize { get; init; }
/// <summary>Key creation date.</summary>
public DateTimeOffset? CreatedAt { get; init; }
/// <summary>Key expiration date.</summary>
public DateTimeOffset? ExpiresAt { get; init; }
/// <summary>Whether key is currently valid.</summary>
public bool IsValid { get; init; } = true;
/// <summary>Public key location (URL or inline).</summary>
public string? PublicKeyUri { get; init; }
}
```
## 3. Pre-Registered Issuers
### 3.1 Authoritative Tier (Trust Tier 0)
| Issuer ID | Display Name | Category | Authoritative For |
|-----------|--------------|----------|-------------------|
| `vendor:redhat` | Red Hat Product Security | Vendor | `pkg:rpm/redhat/*`, `pkg:oci/registry.redhat.io/*` |
| `vendor:canonical` | Ubuntu Security Team | Distribution | `pkg:deb/ubuntu/*` |
| `vendor:debian` | Debian Security Team | Distribution | `pkg:deb/debian/*` |
| `vendor:suse` | SUSE Security Team | Distribution | `pkg:rpm/suse/*`, `pkg:rpm/opensuse/*` |
| `vendor:microsoft` | Microsoft Security Response | Vendor | `pkg:nuget/*` (Microsoft packages) |
| `vendor:oracle` | Oracle Security | Vendor | `pkg:maven/com.oracle.*/*` |
| `vendor:apache` | Apache Security Team | Community | `pkg:maven/org.apache.*/*` |
| `vendor:google` | Google Security Team | Vendor | `pkg:golang/google.golang.org/*` |
### 3.2 Trusted Tier (Trust Tier 1)
| Issuer ID | Display Name | Category |
|-----------|--------------|----------|
| `cert:cisa` | CISA | Cert |
| `cert:nist` | NIST NVD | Cert |
| `cert:github` | GitHub Security Advisories | SecurityResearch |
| `cert:snyk` | Snyk Security | Commercial |
| `research:oss-fuzz` | Google OSS-Fuzz | SecurityResearch |
### 3.3 Community Tier (Trust Tier 2)
| Issuer ID | Display Name | Category |
|-----------|--------------|----------|
| `community:osv` | OSV (Open Source Vulnerabilities) | Community |
| `community:vulndb` | VulnDB | Community |
## 4. API Endpoints
### 4.1 List Issuers
```
GET /api/v1/issuers
```
Query Parameters:
- `category`: Filter by category
- `trust_tier`: Filter by trust tier
- `active`: Filter by active status (default: true)
- `limit`: Max results (default: 100)
- `cursor`: Pagination cursor
### 4.2 Get Issuer
```
GET /api/v1/issuers/{issuerId}
```
### 4.3 Register Issuer (Admin)
```
POST /api/v1/issuers
Authorization: Bearer {admin_token}
{
"issuerId": "vendor:acme",
"category": "vendor",
"displayName": "ACME Security",
"trustTier": "trusted",
"websiteUrl": "https://security.acme.example",
"advisoryFeedUrl": "https://security.acme.example/feed.json",
"authoritativeFor": ["pkg:npm/@acme/*"]
}
```
### 4.4 Register Signing Key (Admin)
```
POST /api/v1/issuers/{issuerId}/keys
Authorization: Bearer {admin_token}
{
"fingerprint": "sha256:abc123...",
"keyType": "pgp",
"algorithm": "rsa",
"keySize": 4096,
"publicKeyUri": "https://security.acme.example/keys/signing.asc"
}
```
### 4.5 Lookup by Fingerprint
```
GET /api/v1/issuers/by-fingerprint/{fingerprint}
```
Returns the issuer associated with a signing key fingerprint.
## 5. Trust Tier Resolution
### 5.1 Automatic Assignment
When a VEX statement is received:
1. **Check signature:** If signed, lookup issuer by key fingerprint
2. **Check domain:** Match issuer by advisory feed domain
3. **Check authoritativeFor:** Match issuer by product PURL patterns
4. **Fallback:** Assign `Unknown` tier if no match
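The four-step resolution above can be sketched as a single function. This is an illustration of the precedence order only: the matching helpers (substring match on the feed domain, glob match for `authoritativeFor` PURL patterns) are simplifications, and the service implementation may be more elaborate:

```python
from fnmatch import fnmatch

# Illustrative resolution order: signature fingerprint first, then feed
# domain, then authoritativeFor PURL patterns, else Unknown.

def resolve_issuer(statement: dict, issuers: list) -> str:
    fp = statement.get("keyFingerprint")
    if fp:
        for issuer in issuers:
            if fp in issuer.get("fingerprints", []):
                return issuer["issuerId"]
    domain = statement.get("feedDomain")
    if domain:
        for issuer in issuers:
            if domain in issuer.get("advisoryFeedUrl", ""):
                return issuer["issuerId"]
    purl = statement.get("productPurl", "")
    for issuer in issuers:
        patterns = issuer.get("authoritativeFor", [])
        if any(fnmatch(purl, pattern) for pattern in patterns):
            return issuer["issuerId"]
    return "unknown"
```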
### 5.2 Override Rules
Operators can configure trust overrides:
```yaml
# etc/vexlens.yaml
issuer_overrides:
- issuer_id: "community:custom-feed"
trust_tier: "trusted" # Promote community to trusted
- issuer_id: "vendor:untrusted-vendor"
trust_tier: "community" # Demote vendor to community
```
## 6. Issuer Verification
### 6.1 PGP Signature Verification
```csharp
public interface IIssuerVerifier
{
/// <summary>
/// Verifies a VEX document signature against registered issuer keys.
/// </summary>
Task<IssuerVerificationResult> VerifyAsync(
byte[] documentBytes,
byte[] signatureBytes,
CancellationToken cancellationToken = default);
}
public sealed record IssuerVerificationResult
{
public bool IsValid { get; init; }
public string? IssuerId { get; init; }
public string? KeyFingerprint { get; init; }
public IssuerTrustTier? TrustTier { get; init; }
public string? VerificationError { get; init; }
}
```
### 6.2 Sigstore Verification
For Sigstore-signed documents:
1. Verify Rekor inclusion proof
2. Extract OIDC identity from certificate
3. Match identity to registered issuer
4. Return issuer info with trust tier
## 7. Database Schema
```sql
CREATE TABLE vex.issuers (
issuer_id TEXT PRIMARY KEY,
category TEXT NOT NULL,
display_name TEXT NOT NULL,
trust_tier INT NOT NULL DEFAULT 3,
website_url TEXT,
advisory_feed_url TEXT,
authoritative_for TEXT[] DEFAULT '{}',
is_active BOOLEAN DEFAULT TRUE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE vex.issuer_signing_keys (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
issuer_id TEXT NOT NULL REFERENCES vex.issuers(issuer_id),
fingerprint TEXT NOT NULL UNIQUE,
key_type TEXT NOT NULL,
algorithm TEXT,
key_size INT,
public_key_uri TEXT,
is_valid BOOLEAN DEFAULT TRUE,
created_at TIMESTAMPTZ,
expires_at TIMESTAMPTZ,
registered_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_issuer_signing_keys_fingerprint ON vex.issuer_signing_keys(fingerprint);
CREATE INDEX idx_issuers_trust_tier ON vex.issuers(trust_tier);
```
---
## Changelog
| Version | Date | Changes |
|---------|------|---------|
| 1.0.0 | 2025-12-19 | Initial release |


@@ -0,0 +1,82 @@
openapi: 3.1.0
info:
title: StellaOps Excititor Chunk API
version: "0.1.0"
description: |
Frozen for Sprint 110 (EXCITITOR-AIAI-31-002). Aligns with Evidence Locker attestation contract v1.
servers:
    - url: https://excititor.local
paths:
/vex/evidence/chunks:
post:
summary: Submit VEX evidence chunk (aggregation-only)
requestBody:
required: true
content:
application/x-ndjson:
schema:
$ref: '#/components/schemas/VexChunk'
responses:
'202':
description: Accepted for processing
content:
application/json:
schema:
type: object
required: [chunk_digest, queue_id]
properties:
chunk_digest:
type: string
description: sha256 of canonical chunk JSON
queue_id:
type: string
description: Background job identifier
'400':
description: Validation error
components:
schemas:
VexChunk:
type: object
required: [chunk_id, tenant, source, schema, items, provenance]
properties:
chunk_id:
type: string
format: uuid
tenant:
type: string
source:
type: string
description: feed id (e.g., ghsa, nvd)
schema:
type: string
enum: [stellaops.vex.chunk.v1]
items:
type: array
items:
type: object
required: [advisory_id, status, purl]
properties:
advisory_id:
type: string
status:
type: string
enum: [affected, unaffected, under_investigation, fixed, unknown]
purl:
type: string
justification:
type: string
last_observed:
type: string
format: date-time
provenance:
type: object
required: [fetched_at, artifact_sha]
properties:
fetched_at:
type: string
format: date-time
artifact_sha:
type: string
signature:
type: object
nullable: true


@@ -0,0 +1,271 @@
# VEX Normalization Contract v1.0.0
**Status:** APPROVED
**Version:** 1.0.0
**Effective:** 2025-12-19
**Owner:** VEX Lens Guild
**Sprint:** SPRINT_0129_0001_0001 (unblocks VEXLENS-30-001 through 30-011)
---
## 1. Purpose
This contract defines the normalization rules for VEX (Vulnerability Exploitability eXchange) documents from multiple sources into a canonical StellaOps internal representation.
## 2. Supported Input Formats
| Format | Version | Parser |
|--------|---------|--------|
| OpenVEX | 0.2.0+ | `OpenVexParser` |
| CycloneDX VEX | 1.5+ | `CycloneDxVexParser` |
| CSAF VEX | 2.0 | `CsafVexParser` |
## 3. Canonical Representation
### 3.1 NormalizedVexStatement
```csharp
public sealed record NormalizedVexStatement
{
/// <summary>Unique statement identifier (deterministic hash).</summary>
public required string StatementId { get; init; }
/// <summary>CVE or vulnerability identifier.</summary>
public required string VulnerabilityId { get; init; }
/// <summary>Normalized status (not_affected, affected, fixed, under_investigation).</summary>
public required VexStatus Status { get; init; }
/// <summary>Justification code (when status = not_affected).</summary>
public VexJustification? Justification { get; init; }
/// <summary>Human-readable impact statement.</summary>
public string? ImpactStatement { get; init; }
/// <summary>Action statement for remediation.</summary>
public string? ActionStatement { get; init; }
/// <summary>Products affected by this statement.</summary>
public required ImmutableArray<ProductIdentifier> Products { get; init; }
/// <summary>Source document metadata.</summary>
public required VexSourceMetadata Source { get; init; }
/// <summary>Statement timestamp (UTC, ISO-8601).</summary>
public required DateTimeOffset Timestamp { get; init; }
/// <summary>Issuer information.</summary>
public required IssuerInfo Issuer { get; init; }
}
```
### 3.2 VexStatus Enum
```csharp
public enum VexStatus
{
/// <summary>Product is not affected by the vulnerability.</summary>
NotAffected = 0,
/// <summary>Product is affected and vulnerable.</summary>
Affected = 1,
/// <summary>Product was affected but is now fixed.</summary>
Fixed = 2,
/// <summary>Impact is being investigated.</summary>
UnderInvestigation = 3
}
```
### 3.3 VexJustification Enum
```csharp
public enum VexJustification
{
/// <summary>Component is not present.</summary>
ComponentNotPresent = 0,
/// <summary>Vulnerable code is not present.</summary>
VulnerableCodeNotPresent = 1,
/// <summary>Vulnerable code is not in execute path.</summary>
VulnerableCodeNotInExecutePath = 2,
/// <summary>Vulnerable code cannot be controlled by adversary.</summary>
VulnerableCodeCannotBeControlledByAdversary = 3,
/// <summary>Inline mitigations exist.</summary>
InlineMitigationsAlreadyExist = 4
}
```
## 4. Normalization Rules
### 4.1 Status Mapping
| Source Format | Source Value | Normalized Status |
|---------------|--------------|-------------------|
| OpenVEX | `not_affected` | NotAffected |
| OpenVEX | `affected` | Affected |
| OpenVEX | `fixed` | Fixed |
| OpenVEX | `under_investigation` | UnderInvestigation |
| CycloneDX | `notAffected` | NotAffected |
| CycloneDX | `affected` | Affected |
| CycloneDX | `resolved` | Fixed |
| CycloneDX | `inTriage` | UnderInvestigation |
| CSAF | `not_affected` | NotAffected |
| CSAF | `known_affected` | Affected |
| CSAF | `fixed` | Fixed |
| CSAF | `under_investigation` | UnderInvestigation |
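For fixture spot-checks, the status table can be expressed as a plain lookup keyed by (format, source value); this is a convenience rendering of the table, not a parser:

```python
# The status mapping table as a lookup table. Keys are
# (source format, source value); values are the canonical status name.
STATUS_MAP = {
    ("openvex", "not_affected"): "NotAffected",
    ("openvex", "affected"): "Affected",
    ("openvex", "fixed"): "Fixed",
    ("openvex", "under_investigation"): "UnderInvestigation",
    ("cyclonedx", "notAffected"): "NotAffected",
    ("cyclonedx", "affected"): "Affected",
    ("cyclonedx", "resolved"): "Fixed",
    ("cyclonedx", "inTriage"): "UnderInvestigation",
    ("csaf", "not_affected"): "NotAffected",
    ("csaf", "known_affected"): "Affected",
    ("csaf", "fixed"): "Fixed",
    ("csaf", "under_investigation"): "UnderInvestigation",
}

def normalize_status(source_format: str, value: str) -> str:
    """Raises KeyError for unmapped (format, value) pairs."""
    return STATUS_MAP[(source_format.lower(), value)]
```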
### 4.2 Justification Mapping
| Source Format | Source Value | Normalized Justification |
|---------------|--------------|--------------------------|
| OpenVEX | `component_not_present` | ComponentNotPresent |
| OpenVEX | `vulnerable_code_not_present` | VulnerableCodeNotPresent |
| OpenVEX | `vulnerable_code_not_in_execute_path` | VulnerableCodeNotInExecutePath |
| OpenVEX | `vulnerable_code_cannot_be_controlled_by_adversary` | VulnerableCodeCannotBeControlledByAdversary |
| OpenVEX | `inline_mitigations_already_exist` | InlineMitigationsAlreadyExist |
| CycloneDX | Same as OpenVEX (camelCase) | Same mapping |
| CSAF | `component_not_present` | ComponentNotPresent |
| CSAF | `vulnerable_code_not_present` | VulnerableCodeNotPresent |
| CSAF | `vulnerable_code_not_in_execute_path` | VulnerableCodeNotInExecutePath |
| CSAF | `vulnerable_code_cannot_be_controlled_by_adversary` | VulnerableCodeCannotBeControlledByAdversary |
| CSAF | `inline_mitigations_already_exist` | InlineMitigationsAlreadyExist |
### 4.3 Product Identifier Normalization
Products are normalized to PURL (Package URL) format:
```
pkg:{ecosystem}/{namespace}/{name}@{version}?{qualifiers}#{subpath}
```
| Source | Extraction Method |
|--------|-------------------|
| OpenVEX | Direct from `product.id` if PURL, else construct from `product.identifiers` |
| CycloneDX | From `bom-ref` PURL or construct from `component.purl` |
| CSAF | From `product_id` → `product_identification_helper.purl` |
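A minimal builder for the PURL shape shown above can illustrate how the pieces assemble. Note this sketch skips the percent-encoding and case-folding rules of the full purl specification, so it is suitable for examples only:

```python
# Assemble pkg:{ecosystem}/{namespace}/{name}@{version}?{qualifiers}#{subpath}.
# Simplified: no percent-encoding or case normalization (see the purl spec).

def build_purl(ecosystem, name, version,
               namespace=None, qualifiers=None, subpath=None):
    purl = f"pkg:{ecosystem}/"
    if namespace:
        purl += f"{namespace}/"
    purl += f"{name}@{version}"
    if qualifiers:
        encoded = "&".join(f"{k}={v}" for k, v in sorted(qualifiers.items()))
        purl += f"?{encoded}"
    if subpath:
        purl += f"#{subpath}"
    return purl
```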
### 4.4 Statement ID Generation
Statement IDs are deterministic SHA-256 hashes:
```csharp
public static string GenerateStatementId(
string vulnerabilityId,
VexStatus status,
IEnumerable<string> productPurls,
string issuerId,
DateTimeOffset timestamp)
{
var input = $"{vulnerabilityId}|{status}|{string.Join(",", productPurls.OrderBy(p => p))}|{issuerId}|{timestamp:O}";
var hash = SHA256.HashData(Encoding.UTF8.GetBytes(input));
return $"stmt:{Convert.ToHexString(hash).ToLowerInvariant()[..32]}";
}
```
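For illustration, a minimal Python mirror of the hash above. Note that byte-identical IDs across implementations require reproducing the C# `timestamp:O` round-trip string exactly; the timestamp format used here is an assumption:

```python
import hashlib

def generate_statement_id(vulnerability_id, status, product_purls, issuer_id, timestamp_iso):
    # Mirrors the C# logic: PURLs are sorted so input order never changes the ID.
    joined = ",".join(sorted(product_purls))
    payload = f"{vulnerability_id}|{status}|{joined}|{issuer_id}|{timestamp_iso}"
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"stmt:{digest[:32]}"
```

The sort makes the ID stable: supplying the same products in a different order yields the same statement ID.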
## 5. Issuer Directory Integration
Normalized statements include issuer information from the Issuer Directory:
```csharp
public sealed record IssuerInfo
{
/// <summary>Issuer identifier (e.g., "vendor:redhat", "vendor:canonical").</summary>
public required string IssuerId { get; init; }
/// <summary>Display name.</summary>
public required string DisplayName { get; init; }
/// <summary>Trust tier (authoritative, trusted, community, unknown).</summary>
public required IssuerTrustTier TrustTier { get; init; }
/// <summary>Issuer's signing key fingerprints (if signed).</summary>
public ImmutableArray<string> SigningKeyFingerprints { get; init; }
}
public enum IssuerTrustTier
{
Authoritative = 0, // Vendor/maintainer of the product
Trusted = 1, // Known security research org
Community = 2, // Community contributor
Unknown = 3 // Unverified source
}
```
## 6. API Governance
### 6.1 Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/v1/vex/statements` | GET | Query normalized statements |
| `/api/v1/vex/statements/{id}` | GET | Get specific statement |
| `/api/v1/vex/normalize` | POST | Normalize a VEX document |
| `/api/v1/vex/issuers` | GET | List known issuers |
| `/api/v1/vex/issuers/{id}` | GET | Get issuer details |
### 6.2 Query Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| `vulnerability` | string | Filter by CVE/vulnerability ID |
| `product` | string | Filter by PURL (URL-encoded) |
| `status` | enum | Filter by VEX status |
| `issuer` | string | Filter by issuer ID |
| `since` | datetime | Statements after timestamp |
| `limit` | int | Max results (default: 100, max: 1000) |
| `cursor` | string | Pagination cursor |
### 6.3 Response Format
```json
{
"statements": [
{
"statementId": "stmt:a1b2c3d4e5f6...",
"vulnerabilityId": "CVE-2024-1234",
"status": "not_affected",
"justification": "vulnerable_code_not_in_execute_path",
"products": ["pkg:npm/lodash@4.17.21"],
"issuer": {
"issuerId": "vendor:lodash",
"displayName": "Lodash Maintainers",
"trustTier": "authoritative"
},
"timestamp": "2024-12-19T10:30:00Z"
}
],
"cursor": "next_page_token",
"total": 42
}
```
## 7. Precedence Rules
When multiple statements exist for the same vulnerability+product:
1. **Timestamp:** Later statements supersede earlier ones
2. **Trust Tier:** Higher trust tiers take precedence (Authoritative > Trusted > Community > Unknown)
3. **Specificity:** More specific product matches win (exact version > version range > package)
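The three rules compose into a single sort key. The tier and specificity rankings below are illustrative encodings, not the shipped enums:

```python
TRUST_RANK = {"authoritative": 0, "trusted": 1, "community": 2, "unknown": 3}
SPECIFICITY_RANK = {"exact_version": 0, "version_range": 1, "package": 2}

def pick_winner(statements):
    """Pick the winning statement: newest first, then higher trust, then more specific."""
    return min(
        statements,
        key=lambda s: (
            -s["timestamp"],                     # rule 1: later timestamp wins
            TRUST_RANK[s["trust_tier"]],         # rule 2: authoritative beats community
            SPECIFICITY_RANK[s["specificity"]],  # rule 3: exact version beats package-level
        ),
    )
```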
## 8. Validation
All normalized statements must pass:
1. `vulnerabilityId` matches CVE/GHSA/vendor pattern
2. `status` is a valid enum value
3. `products` contains at least one valid PURL
4. `timestamp` is valid ISO-8601 UTC
5. `issuer.issuerId` exists in Issuer Directory or is marked Unknown
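A minimal sketch of the checklist. The vulnerability-ID pattern and PURL test are stand-ins; the real validator also verifies the ISO-8601 timestamp and performs the Issuer Directory lookup:

```python
import re

VULN_ID = re.compile(r"^(CVE-\d{4}-\d{4,}|GHSA-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4})$")
VALID_STATUSES = {"affected", "not_affected", "fixed", "under_investigation"}

def validate_statement(stmt):
    """Return a list of human-readable validation errors (empty list = valid)."""
    errors = []
    if not VULN_ID.match(stmt.get("vulnerabilityId", "")):
        errors.append("vulnerabilityId does not match a known advisory pattern")
    if stmt.get("status") not in VALID_STATUSES:
        errors.append("status is not a valid VEX status")
    purls = stmt.get("products", [])
    if not purls or not all(p.startswith("pkg:") for p in purls):
        errors.append("products must contain at least one valid PURL")
    return errors
```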
---
## Changelog
| Version | Date | Changes |
|---------|------|---------|
| 1.0.0 | 2025-12-19 | Initial release |

---
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://stellaops.dev/schemas/excititor/vex_overlay.schema.json",
"title": "Excititor VEX Overlay",
"description": "Graph-ready overlay built from Link-Not-Merge observations and linksets. Immutable and append-only; ordered for deterministic pagination and caching.",
"type": "object",
"additionalProperties": false,
"required": [
"schemaVersion",
"generatedAt",
"tenant",
"purl",
"advisoryId",
"source",
"status",
"observations",
"provenance"
],
"properties": {
"schemaVersion": {
"type": "string",
"enum": ["1.0.0"]
},
"generatedAt": {
"type": "string",
"format": "date-time"
},
"tenant": {
"type": "string",
"description": "Tenant identifier used for storage partitioning."
},
"purl": {
"type": "string",
"description": "Normalized package URL for the component."
},
"advisoryId": {
"type": "string",
"description": "Upstream advisory identifier (e.g., GHSA, RHSA, CVE)."
},
"source": {
"type": "string",
"description": "Linkset source identifier (matches Concelier linkset source)."
},
"status": {
"type": "string",
"enum": [
"affected",
"not_affected",
"under_investigation",
"fixed",
"unknown"
]
},
"justifications": {
"type": "array",
"items": {
"type": "object",
"additionalProperties": false,
"required": ["kind", "reason"],
"properties": {
"kind": {
"type": "string",
"description": "Reason code aligned to VEX statement taxonomy."
},
"reason": {
"type": "string",
"description": "Human-readable justification text."
},
"evidence": {
"type": "array",
"items": {
"type": "string",
"description": "Observation or linkset id contributing to this justification."
}
},
"weight": {
"type": "number",
"minimum": 0,
"maximum": 1,
"description": "Optional confidence weight."
}
}
}
},
"conflicts": {
"type": "array",
"description": "Conflicts detected in linkset normalization.",
"items": {
"type": "object",
"additionalProperties": false,
"required": ["field", "reason"],
"properties": {
"field": { "type": "string" },
"reason": { "type": "string" },
"values": {
"type": "array",
"items": { "type": "string" }
},
"sourceIds": {
"type": "array",
"items": { "type": "string" }
}
}
}
},
"observations": {
"type": "array",
"description": "Ordered list of Link-Not-Merge observation references feeding this overlay.",
"items": {
"type": "object",
"additionalProperties": false,
"required": ["id", "contentHash", "fetchedAt"],
"properties": {
"id": { "type": "string" },
"contentHash": { "type": "string", "pattern": "^sha256:[A-Fa-f0-9]{64}$" },
"fetchedAt": { "type": "string", "format": "date-time" }
}
},
"minItems": 1
},
"provenance": {
"type": "object",
"additionalProperties": false,
"required": ["linksetId", "linksetHash", "observationHashes"],
"properties": {
"linksetId": { "type": "string" },
"linksetHash": { "type": "string", "pattern": "^sha256:[A-Fa-f0-9]{64}$" },
"observationHashes": {
"type": "array",
"items": { "type": "string", "pattern": "^sha256:[A-Fa-f0-9]{64}$" },
"minItems": 1
},
"policyHash": { "type": "string" },
"sbomContextHash": { "type": "string" },
"planCacheKey": { "type": "string" },
"generatedBy": { "type": "string" }
}
},
"cache": {
"type": "object",
"additionalProperties": false,
"properties": {
"cached": { "type": "boolean" },
"cachedAt": { "type": "string", "format": "date-time" },
"ttlSeconds": { "type": "integer", "minimum": 0 }
}
}
}
}

---
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://stellaops.dev/schemas/excititor/vex_raw.schema.json",
"title": "Excititor VEX Raw Document",
"$comment": "Note (2025-12): The gridFsObjectId field is legacy. Since Sprint 4400, all large content is stored in PostgreSQL with RustFS. This field exists only for backward compatibility with migrated data.",
"type": "object",
"additionalProperties": true,
"required": ["_id", "providerId", "format", "sourceUri", "retrievedAt", "digest"],
"properties": {
"_id": {
"type": "string",
"description": "Content-addressed digest; equals `digest`."
},
"providerId": { "type": "string", "minLength": 1 },
"format": { "type": "string", "enum": ["csaf", "cyclonedx", "openvex"] },
"sourceUri": { "type": "string", "minLength": 1 },
"retrievedAt": { "type": "string", "format": "date-time" },
"digest": { "type": "string", "minLength": 32 },
"content": {
"oneOf": [
{ "type": "string", "contentEncoding": "base64" },
{ "type": "string" }
],
"description": "Inline payload if below size threshold; may be empty when stored in RustFS (legacy: GridFS prior to Sprint 4400)."
},
"gridFsObjectId": {
"anyOf": [
{ "type": "string" },
{ "type": "null" }
]
},
"metadata": {
"type": "object",
"additionalProperties": { "type": "string" }
}
}
}

---
## Status
This document tracks the future-looking risk scoring model for Excititor. The calculation below is not active yet; Sprint 7 work will add the required schema fields, policy controls, and services. Until that ships, Excititor emits consensus statuses without numeric scores.
## Scoring model (target state)
**S = Gate(VEX_status) × W_trust(source) × [Severity_base × (1 + α·KEV + β·EPSS)]**
* **Gate(VEX_status)**: `affected`/`under_investigation` → 1, `not_affected`/`fixed` → 0. A trusted “not affected” or “fixed” still zeroes the score.
* **W_trust(source)**: normalized policy weight (baseline 0–1). Policies may opt into >1 boosts for signed vendor feeds once Phase 1 closes.
* **Severity_base**: canonical numeric severity from Concelier (CVSS or org-defined scale).
* **KEV flag**: 0/1 boost applied when the vulnerability appears in the CISA Known Exploited Vulnerabilities (KEV) catalog.
* **EPSS**: probability [0,1]; bounded multiplier.
* **α, β**: configurable coefficients (default α=0.25, β=0.5) stored in policy.
Safeguards: freeze boosts when product identity is unknown, clamp outputs ≥0, and log every factor in the audit trail.
## Implementation roadmap
| Phase | Scope | Artifacts |
| --- | --- | --- |
| **Phase 1 Schema foundations** | Extend Excititor consensus/claims and Concelier canonical advisories with severity, KEV, EPSS, and expose α/β + weight ceilings in policy. | Sprint 7 tasks `EXCITITOR-CORE-02-001`, `EXCITITOR-POLICY-02-001`, `EXCITITOR-STORAGE-02-001`, `FEEDCORE-ENGINE-07-001`. |
| **Phase 2 Deterministic score engine** | Implement a scoring component that executes alongside consensus and persists score envelopes with hashes. | Planned task `EXCITITOR-CORE-02-002` (backlog). |
| **Phase 3 Surfacing & enforcement** | Expose scores via WebService/CLI, integrate with Concelier noise priors, and enforce policy-based suppressions. | To be scheduled after Phase 2. |
## Policy controls (Phase 1)
Operators tune scoring inputs through the Excititor policy document:
```yaml
excititor:
policy:
weights:
vendor: 1.10 # per-tier weight
      ceiling: 1.40       # max clamp applied to tiers and overrides (1.0–5.0)
providerOverrides:
trusted.vendor: 1.35
scoring:
alpha: 0.30 # KEV boost coefficient (defaults to 0.25)
beta: 0.60 # EPSS boost coefficient (defaults to 0.50)
```
* All weights (tiers + overrides) are clamped to `[0, weights.ceiling]` with structured warnings when a value is out of range or not a finite number.
* `weights.ceiling` itself is constrained to `[1.0, 5.0]`, preserving prior behaviour when omitted.
* `scoring.alpha` / `scoring.beta` accept non-negative values up to 5.0; values outside the range fall back to defaults and surface diagnostics to operators.
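The clamping rules can be sketched as follows. The function names are hypothetical, and `weights.ceiling` is assumed to have been validated to `[1.0, 5.0]` elsewhere:

```python
import math

def clamp_weight(value, ceiling, default=1.0, warnings=None):
    """Clamp a tier/override weight to [0, ceiling]; fall back on non-finite input."""
    if not isinstance(value, (int, float)) or not math.isfinite(value):
        if warnings is not None:
            warnings.append(f"non-finite weight {value!r}; using default {default}")
        return default
    return min(max(float(value), 0.0), ceiling)

def resolve_coefficient(value, default):
    """alpha/beta accept [0, 5.0]; out-of-range values fall back to defaults."""
    if isinstance(value, (int, float)) and math.isfinite(value) and 0.0 <= value <= 5.0:
        return float(value)
    return default
```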
## Data model (after Phase 1)
```json
{
"vulnerabilityId": "CVE-2025-12345",
"product": "pkg:name@version",
"consensus": {
"status": "affected",
"policyRevisionId": "rev-12",
"policyDigest": "0D9AEC…"
},
"signals": {
"severity": {"scheme": "CVSS:3.1", "score": 7.5},
"kev": true,
"epss": 0.40
},
"policy": {
"weight": 1.15,
"alpha": 0.25,
"beta": 0.5
},
"score": {
    "value": 12.51,
"generatedAt": "2025-11-05T14:12:30Z",
"audit": [
"gate:affected",
"weight:1.15",
"severity:7.5",
"kev:1",
"epss:0.40"
]
}
}
```
## Operational guidance
* **Inputs**: Concelier delivers severity/KEV/EPSS via the advisory event log; Excititor connectors load VEX statements. Policy owns trust tiers and coefficients.
* **Processing**: the scoring engine (Phase 2) runs next to consensus, storing results with deterministic hashes so exports and attestations can reference them.
* **Consumption**: WebService/CLI will return consensus plus score; scanners may suppress findings only when policy-authorized VEX gating and signed score envelopes agree.
## Pseudocode (Phase 2 preview)
```python
def risk_score(gate, weight, severity, kev, epss, alpha, beta, freeze_boosts=False):
if gate == 0:
return 0
if freeze_boosts:
kev, epss = 0, 0
boost = 1 + alpha * kev + beta * epss
return max(0, weight * severity * boost)
```
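Plugging in the Phase 1 data-model values above (a minimal check using only the audit-trail factors):

```python
# Worked example with the data-model values (gate = 1 because status is "affected"):
weight, severity, kev, epss, alpha, beta = 1.15, 7.5, 1, 0.40, 0.25, 0.5
boost = 1 + alpha * kev + beta * epss      # 1 + 0.25 + 0.20 = 1.45
score = max(0, weight * severity * boost)  # 1.15 * 7.5 * 1.45
print(round(score, 2))                     # 12.51
```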
## FAQ
* **Can operators opt out?** Set α=β=0 or keep weights ≤1.0 via policy.
* **What about missing signals?** Treat them as zero and log the omission.
* **When will this ship?** Phase 1 is planned for Sprint 7; later phases depend on connector coverage and attestation delivery.

---
# VEX Trust Lattice Specification
> **Status**: Implementation Complete (Sprint 7100)
> **Version**: 1.0.0
> **Last Updated**: 2025-12-22
> **Source Advisory**: `docs/product/advisories/archived/22-Dec-2026 - Building a Trust Lattice for VEX Sources.md`
## 1. Overview
The VEX Trust Lattice provides a mathematically rigorous framework for converting heterogeneous VEX claims from multiple sources into a single, signed, reproducible verdict with a numeric confidence and a complete audit trail.
### Goals
1. **Explainability**: Every verdict includes a full breakdown of how it was computed
2. **Reproducibility**: Same inputs always produce identical verdicts (deterministic)
3. **Auditability**: Signed verdict manifests with pinned inputs for regulatory compliance
4. **Tunability**: Per-tenant, per-source trust configuration without code changes
### Non-Goals
- Real-time vulnerability detection (handled by Scanner)
- VEX document ingestion (handled by Excititor core)
- Policy enforcement (handled by Policy Engine)
---
## 2. Trust Vector Model
Each VEX source is assigned a 3-component trust vector scored in the range [0..1].
### 2.1 Provenance (P)
Measures cryptographic and process integrity of the source.
| Score | Description |
|-------|-------------|
| 1.00 | DSSE-signed, timestamped, Rekor/Git anchored, key in allow-list, rotation policy OK |
| 0.75 | DSSE-signed + public key known, but no transparency log |
| 0.40 | Unsigned but retrieved via authenticated, immutable artifact repo |
| 0.10 | Opaque/CSV/email/manual import |
### 2.2 Coverage (C)
Measures how well the statement's scope maps to the target asset.
| Score | Description |
|-------|-------------|
| 1.00 | Exact package + version/build digest + feature/flag context matched |
| 0.75 | Exact package + version range matched; partial feature context |
| 0.50 | Product-level only; maps via CPE/PURL family |
| 0.25 | Family-level heuristics; no version proof |
### 2.3 Replayability (R)
Measures whether the claim can be deterministically re-derived.
| Score | Description |
|-------|-------------|
| 1.00 | All inputs pinned (feeds, SBOM hash, ruleset hash, lattice version); replays byte-identical |
| 0.60 | Inputs mostly pinned; non-deterministic ordering tolerated but stable outcome |
| 0.20 | Ephemeral APIs; no snapshot |
### 2.4 Weight Configuration
The base trust score is computed as:
```
BaseTrust(S) = wP * P + wC * C + wR * R
```
**Default weights:**
- `wP = 0.45` (Provenance)
- `wC = 0.35` (Coverage)
- `wR = 0.20` (Replayability)
Weights are tunable per policy and sum to 1.0.
---
## 3. Claim Scoring
### 3.1 Base Trust Calculation
```csharp
double BaseTrust(double P, double C, double R, TrustWeights W)
=> W.wP * P + W.wC * C + W.wR * R;
```
### 3.2 Claim Strength Multipliers (M)
Each VEX claim carries a strength multiplier based on evidence quality:
| Strength | Value | Description |
|----------|-------|-------------|
| ExploitabilityWithReachability | 1.00 | Exploitability analysis + reachability proof subgraph provided |
| ConfigWithEvidence | 0.80 | Config/feature-flag reason with evidence |
| VendorBlanket | 0.60 | Vendor blanket statement |
| UnderInvestigation | 0.40 | "Under investigation" |
### 3.3 Freshness Decay (F)
Time-decay curve with configurable half-life:
```csharp
double Freshness(DateTime issuedAt, DateTime cutoff, double halfLifeDays = 90, double floor = 0.35)
{
var ageDays = (cutoff - issuedAt).TotalDays;
var decay = Math.Exp(-Math.Log(2) * ageDays / halfLifeDays);
return Math.Max(decay, floor);
}
```
**Parameters:**
- `halfLifeDays = 90` (default): Score halves every 90 days
- `floor = 0.35` (default): Minimum freshness unless revoked
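A quick numeric check of the decay curve (a Python port of the C# above, taking the age in days directly):

```python
import math

def freshness(age_days, half_life_days=90.0, floor=0.35):
    """Exponential half-life decay with a floor, mirroring the C# implementation."""
    decay = math.exp(-math.log(2) * age_days / half_life_days)
    return max(decay, floor)

print(round(freshness(30), 2))   # 0.79 -- the F used in the worked example below
print(round(freshness(365), 2))  # 0.35 -- old claims clamp to the floor
```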
### 3.4 ClaimScore Formula
```
ClaimScore = BaseTrust(S) * M * F
```
**Example calculation:**
```
Source: Red Hat (Vendor)
P = 0.90, C = 0.75, R = 0.60
BaseTrust = 0.45*0.90 + 0.35*0.75 + 0.20*0.60 = 0.405 + 0.2625 + 0.12 = 0.7875
Claim: ConfigWithEvidence (M = 0.80)
Freshness: 30 days old (F = 0.79)
ClaimScore = 0.7875 * 0.80 * 0.79 = 0.498
```
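The worked example can be reproduced in a few lines (a sketch; the weights are the documented defaults):

```python
def base_trust(p, c, r, wp=0.45, wc=0.35, wr=0.20):
    """BaseTrust(S) = wP*P + wC*C + wR*R with the default weights."""
    return wp * p + wc * c + wr * r

bt = base_trust(0.90, 0.75, 0.60)  # 0.7875
claim_score = bt * 0.80 * 0.79     # strength M = 0.80, freshness F = 0.79
print(round(claim_score, 3))       # 0.498
```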
---
## 4. Lattice Merge Algorithm
### 4.1 Partial Ordering
Claims are ordered by a tuple: `(scope_specificity, ClaimScore)`.
Scope specificity levels:
1. Exact digest match (highest)
2. Exact version match
3. Version range match
4. Product family match
5. Platform match (lowest)
### 4.2 Conflict Detection
Conflicts occur when claims for the same (CVE, Asset) have different statuses:
```csharp
bool HasConflict(IEnumerable<Claim> claims)
=> claims.Select(c => c.Status).Distinct().Count() > 1;
```
### 4.3 Conflict Penalty
When conflicts exist, apply a penalty to weaker/older claims:
```csharp
const double ConflictPenalty = 0.25;
if (HasConflict(claims))
{
var strongest = claims.OrderByDescending(c => c.Score).First();
foreach (var claim in claims.Where(c => c.Status != strongest.Status))
{
claim.AdjustedScore = claim.Score * (1 - ConflictPenalty);
}
}
```
### 4.4 Winner Selection
Final verdict is selected by:
```csharp
var winner = scored
.OrderByDescending(x => (x.Claim.ScopeSpecificity, x.AdjustedScore))
.First();
```
### 4.5 Audit Trail Generation
Every merge produces:
```csharp
public sealed record MergeResult
{
public VexStatus Status { get; init; }
public double Confidence { get; init; }
public ImmutableArray<VerdictExplanation> Explanations { get; init; }
public ImmutableArray<string> EvidenceRefs { get; init; }
public string PolicyHash { get; init; }
public string LatticeVersion { get; init; }
}
```
---
## 5. Policy Gates
Gates are evaluated after merge to enforce policy requirements.
### 5.1 MinimumConfidenceGate
Requires minimum confidence by environment for certain statuses.
```yaml
gates:
minimumConfidence:
enabled: true
thresholds:
production: 0.75
staging: 0.60
development: 0.40
applyToStatuses:
- not_affected
- fixed
```
**Behavior**: Fails if confidence < threshold for specified statuses.
### 5.2 UnknownsBudgetGate
Limits exposure to unknown/unscored dependencies.
```yaml
gates:
unknownsBudget:
enabled: true
maxUnknownCount: 5
maxCumulativeUncertainty: 2.0
```
**Behavior**: Fails if:
- `#unknown_deps > maxUnknownCount`, OR
- `sum(1 - ClaimScore) > maxCumulativeUncertainty`
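The two failure conditions can be sketched as follows (a hypothetical helper, assuming per-dependency claim scores are available):

```python
def unknowns_budget_passes(claim_scores, unknown_count, max_unknown=5, max_uncertainty=2.0):
    """Gate fails when too many unknown deps OR cumulative uncertainty exceeds the budget."""
    cumulative = sum(1.0 - s for s in claim_scores)  # each low score adds uncertainty
    return unknown_count <= max_unknown and cumulative <= max_uncertainty
```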
### 5.3 SourceQuotaGate
Prevents single-source dominance without corroboration.
```yaml
gates:
sourceQuota:
enabled: true
maxInfluencePercent: 60
corroborationDelta: 0.10
```
**Behavior**: Fails if single source influence > 60% AND no second source within delta=0.10.
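A sketch of the quota check. It assumes "influence" is each source's share of total claim-score mass and that corroboration means a second source within `delta` of the top score; the spec states the rule only informally:

```python
def source_quota_passes(influence_by_source, max_influence_pct=60, delta=0.10):
    """Fail only when one source dominates AND no second source corroborates within delta."""
    total = sum(influence_by_source.values())
    if total == 0:
        return True
    ranked = sorted(influence_by_source.values(), reverse=True)
    top_pct = 100.0 * ranked[0] / total
    if top_pct <= max_influence_pct:
        return True  # no single-source dominance
    corroborated = len(ranked) > 1 and (ranked[0] - ranked[1]) <= delta
    return corroborated
```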
### 5.4 ReachabilityRequirementGate
Requires reachability proof for critical vulnerabilities.
```yaml
gates:
reachabilityRequirement:
enabled: true
severityThreshold: CRITICAL
requiredForStatuses:
- not_affected
bypassReasons:
- component_not_present
```
**Behavior**: Fails if `not_affected` on CRITICAL CVE without reachability proof (unless bypass reason applies).
---
## 6. Deterministic Replay
### 6.1 Input Pinning
To guarantee "same inputs → same verdict", pin:
- SBOM digest(s)
- Vuln feed snapshot IDs
- VEX document digests
- Reachability graph IDs
- Policy file hash
- Lattice version
- Clock cutoff (evaluation timestamp)
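Pinning only helps if the manifest digest itself is computed deterministically. A sketch (hypothetical helper, assuming list-valued inputs are order-insensitive):

```python
import hashlib
import json

def manifest_input_digest(inputs):
    """Deterministic digest over pinned inputs: sorted keys, sorted lists, compact JSON."""
    canonical = {
        k: sorted(v) if isinstance(v, list) else v
        for k, v in sorted(inputs.items())
    }
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()
```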
### 6.2 Verdict Manifest
```json
{
"manifestId": "verd:tenant:asset:cve:1234567890",
"tenant": "acme-corp",
"assetDigest": "sha256:abc123...",
"vulnerabilityId": "CVE-2025-12345",
"inputs": {
"sbomDigests": ["sha256:..."],
"vulnFeedSnapshotIds": ["nvd:2025-12-22"],
"vexDocumentDigests": ["sha256:..."],
"reachabilityGraphIds": ["graph:..."],
"clockCutoff": "2025-12-22T12:00:00Z"
},
"result": {
"status": "not_affected",
"confidence": 0.82,
"explanations": [...]
},
"policyHash": "sha256:...",
"latticeVersion": "1.2.0",
"evaluatedAt": "2025-12-22T12:00:01Z",
"manifestDigest": "sha256:..."
}
```
### 6.3 Signing
Verdict manifests are signed using DSSE with predicate type:
```
https://stella-ops.org/attestations/vex-verdict/1
```
### 6.4 Replay Verification
```
POST /api/v1/authority/verdicts/{manifestId}/replay
Response:
{
"success": true,
"originalManifest": {...},
"replayedManifest": {...},
"differences": [],
"signatureValid": true
}
```
---
## 7. Configuration Reference
### Full Configuration Example
```yaml
# etc/trust-lattice.yaml
version: "1.0"
trustLattice:
weights:
provenance: 0.45
coverage: 0.35
replayability: 0.20
freshness:
halfLifeDays: 90
floor: 0.35
conflictPenalty: 0.25
defaults:
vendor:
provenance: 0.90
coverage: 0.70
replayability: 0.60
distro:
provenance: 0.80
coverage: 0.85
replayability: 0.60
internal:
provenance: 0.85
coverage: 0.95
replayability: 0.90
gates:
minimumConfidence:
enabled: true
thresholds:
production: 0.75
staging: 0.60
development: 0.40
unknownsBudget:
enabled: true
maxUnknownCount: 5
maxCumulativeUncertainty: 2.0
sourceQuota:
enabled: true
maxInfluencePercent: 60
corroborationDelta: 0.10
reachabilityRequirement:
enabled: true
severityThreshold: CRITICAL
```
---
## 8. API Reference
### Endpoints
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/v1/excititor/verdicts/{manifestId}` | Get verdict manifest |
| GET | `/api/v1/excititor/verdicts` | List verdicts (paginated) |
| POST | `/api/v1/authority/verdicts/{manifestId}/replay` | Verify replay |
| GET | `/api/v1/authority/verdicts/{manifestId}/download` | Download signed manifest |
See `docs/API_CLI_REFERENCE.md` for complete API documentation.
---
## 9. Examples
### Example 1: High-Confidence Verdict
**Input:**
- Red Hat VEX: `not_affected` with `component_not_present`
- Ubuntu VEX: `not_affected` with `component_not_present`
**Calculation:**
```
Red Hat: BaseTrust=0.78, M=0.80, F=0.95 → ClaimScore=0.59
Ubuntu: BaseTrust=0.72, M=0.80, F=0.90 → ClaimScore=0.52
No conflict (both agree)
Winner: Red Hat (higher score)
Confidence: 0.59
Gates: All pass (> 0.40 threshold)
```
### Example 2: Conflict Resolution
**Input:**
- Vendor VEX: `not_affected`
- Internal scan: `affected`
**Calculation:**
```
Vendor: ClaimScore=0.65
Internal: ClaimScore=0.55
Conflict detected → penalty applied
Internal adjusted: 0.55 * 0.75 = 0.41
Winner: Vendor
Confidence: 0.65
Note: Conflict recorded in audit trail
```
---
## 10. Implementation Reference
### 10.1 Source Files
| Component | Location |
|-----------|----------|
| TrustVector | `src/Excititor/__Libraries/StellaOps.Excititor.Core/TrustVector/TrustVector.cs` |
| TrustWeights | `src/Excititor/__Libraries/StellaOps.Excititor.Core/TrustVector/TrustWeights.cs` |
| ClaimStrength | `src/Excititor/__Libraries/StellaOps.Excititor.Core/TrustVector/ClaimStrength.cs` |
| FreshnessCalculator | `src/Excititor/__Libraries/StellaOps.Excititor.Core/TrustVector/FreshnessCalculator.cs` |
| DefaultTrustVectors | `src/Excititor/__Libraries/StellaOps.Excititor.Core/TrustVector/DefaultTrustVectors.cs` |
| ProvenanceScorer | `src/Excititor/__Libraries/StellaOps.Excititor.Core/TrustVector/ProvenanceScorer.cs` |
| CoverageScorer | `src/Excititor/__Libraries/StellaOps.Excititor.Core/TrustVector/CoverageScorer.cs` |
| ReplayabilityScorer | `src/Excititor/__Libraries/StellaOps.Excititor.Core/TrustVector/ReplayabilityScorer.cs` |
| SourceClassificationService | `src/Excititor/__Libraries/StellaOps.Excititor.Core/TrustVector/SourceClassificationService.cs` |
| ClaimScoreMerger | `src/Policy/__Libraries/StellaOps.Policy/TrustLattice/ClaimScoreMerger.cs` |
| MinimumConfidenceGate | `src/Policy/__Libraries/StellaOps.Policy/Gates/MinimumConfidenceGate.cs` |
| UnknownsBudgetGate | `src/Policy/__Libraries/StellaOps.Policy/Gates/UnknownsBudgetGate.cs` |
| SourceQuotaGate | `src/Policy/__Libraries/StellaOps.Policy/Gates/SourceQuotaGate.cs` |
| ReachabilityRequirementGate | `src/Policy/__Libraries/StellaOps.Policy/Gates/ReachabilityRequirementGate.cs` |
| TrustVectorCalibrator | `src/Excititor/__Libraries/StellaOps.Excititor.Core/Calibration/TrustVectorCalibrator.cs` |
### 10.2 Configuration Files
| File | Purpose |
|------|---------|
| `etc/trust-lattice.yaml.sample` | Trust vector weights, freshness parameters, default vectors |
| `etc/policy-gates.yaml.sample` | Gate thresholds and enable/disable flags |
| `etc/excititor-calibration.yaml.sample` | Calibration learning parameters |
### 10.3 Database Schema
- **Calibration manifests**: `src/Excititor/__Libraries/StellaOps.Excititor.Storage.Postgres/Migrations/002_calibration_schema.sql`
- **Verdict storage**: See Authority module for verdict manifest persistence
### 10.4 Test Coverage
| Test Suite | Location |
|------------|----------|
| TrustVector tests | `src/Excititor/__Tests/StellaOps.Excititor.Core.Tests/TrustVector/` |
| ClaimScoreMerger tests | `src/Policy/__Tests/StellaOps.Policy.Tests/TrustLattice/` |
| Gate tests | `src/Policy/__Tests/StellaOps.Policy.Tests/Gates/` |
| Calibration tests | `src/Excititor/__Tests/StellaOps.Excititor.Core.Tests/Calibration/` |
---
## Related Documentation
- [Excititor Architecture](./architecture.md)
- [Verdict Manifest Specification](../authority/verdict-manifest.md)
- [Policy Gates Configuration](../policy/architecture.md)
- [API Reference](../../API_CLI_REFERENCE.md)
---
*Document Version: 1.0.0*
*Sprint: 7100.0003.0002*
*Created: 2025-12-22*

---
# Excititor VEX Observation & Linkset APIs
> Implementation reference for Sprint 121 (`EXCITITOR-LNM-21-201`, `EXCITITOR-LNM-21-202`). Documents the REST endpoints implemented in `src/Excititor/StellaOps.Excititor.WebService/Endpoints/ObservationEndpoints.cs` and `LinksetEndpoints.cs`.
## Authentication & Headers
All endpoints require:
- **Authorization**: Bearer token with `vex.read` scope
- **X-Stella-Tenant**: Tenant identifier (required)
## /vex/observations
### List observations with filters
```
GET /vex/observations?vulnerabilityId=CVE-2024-0001&productKey=pkg:maven/org.demo/app@1.2.3&limit=50
GET /vex/observations?providerId=ubuntu-csaf&limit=50
```
**Query Parameters:**
- `vulnerabilityId` + `productKey` (required together) - Filter by vulnerability and product
- `providerId` - Filter by provider
- `limit` (optional, default: 50, max: 100) - Number of results
- `cursor` (optional) - Pagination cursor from previous response
**Response 200:**
```json
{
"items": [
{
"observationId": "vex:obs:sha256:abc123...",
"tenant": "default",
"providerId": "ubuntu-csaf",
"vulnerabilityId": "CVE-2024-0001",
"productKey": "pkg:maven/org.demo/app@1.2.3",
"status": "affected",
"createdAt": "2025-11-18T12:34:56Z",
"lastObserved": "2025-11-18T12:34:56Z",
"purls": ["pkg:maven/org.demo/app@1.2.3"]
}
],
"nextCursor": "MjAyNS0xMS0xOFQxMjozNDo1NlonfHZleDpvYnM6c2hhMjU2OmFiYzEyMy4uLg=="
}
```
**Error Responses:**
- `400 ERR_PARAMS` - At least one filter is required
- `400 ERR_TENANT` - X-Stella-Tenant header is required
- `403` - Missing required scope
### Get observation by ID
```
GET /vex/observations/{observationId}
```
**Response 200:**
```json
{
"observationId": "vex:obs:sha256:abc123...",
"tenant": "default",
"providerId": "ubuntu-csaf",
"streamId": "ubuntu-csaf-vex",
"upstream": {
"upstreamId": "USN-9999-1",
"documentVersion": "2024.10.22",
"fetchedAt": "2025-11-18T12:34:00Z",
"receivedAt": "2025-11-18T12:34:05Z",
"contentHash": "sha256:...",
"signature": {
"type": "cosign",
"keyId": "ubuntu-vex-prod",
"issuer": "https://token.actions.githubusercontent.com",
"verifiedAt": "2025-11-18T12:34:10Z"
}
},
"content": {
"format": "csaf",
"specVersion": "2.0"
},
"statements": [
{
"vulnerabilityId": "CVE-2024-0001",
"productKey": "pkg:maven/org.demo/app@1.2.3",
"status": "affected",
"lastObserved": "2025-11-18T12:34:56Z",
"locator": "#/statements/0",
"justification": "component_not_present",
"introducedVersion": null,
"fixedVersion": "1.2.4"
}
],
"linkset": {
"aliases": ["USN-9999-1"],
"purls": ["pkg:maven/org.demo/app@1.2.3"],
"cpes": [],
"references": [{"type": "advisory", "url": "https://ubuntu.com/security/notices/USN-9999-1"}]
},
"createdAt": "2025-11-18T12:34:56Z"
}
```
**Error Responses:**
- `404 ERR_NOT_FOUND` - Observation not found
### Count observations
```
GET /vex/observations/count
```
**Response 200:**
```json
{
"count": 12345
}
```
## /vex/linksets
### List linksets with filters
At least one filter is required: `vulnerabilityId`, `productKey`, `providerId`, or `hasConflicts=true`.
```
GET /vex/linksets?vulnerabilityId=CVE-2024-0001&limit=50
GET /vex/linksets?productKey=pkg:maven/org.demo/app@1.2.3&limit=50
GET /vex/linksets?providerId=ubuntu-csaf&limit=50
GET /vex/linksets?hasConflicts=true&limit=50
```
**Query Parameters:**
- `vulnerabilityId` - Filter by vulnerability ID
- `productKey` - Filter by product key
- `providerId` - Filter by provider
- `hasConflicts` - Filter to linksets with disagreements (true/false)
- `limit` (optional, default: 50, max: 100) - Number of results
- `cursor` (optional) - Pagination cursor
**Response 200:**
```json
{
"items": [
{
"linksetId": "sha256:tenant:CVE-2024-0001:pkg:maven/org.demo/app@1.2.3",
"tenant": "default",
"vulnerabilityId": "CVE-2024-0001",
"productKey": "pkg:maven/org.demo/app@1.2.3",
"providerIds": ["ubuntu-csaf", "suse-csaf"],
"statuses": ["affected", "fixed"],
"aliases": [],
"purls": [],
"cpes": [],
"references": [],
"disagreements": [
{
"providerId": "suse-csaf",
"status": "fixed",
"justification": null,
"confidence": 0.85
}
],
"observations": [
{"observationId": "vex:obs:...", "providerId": "ubuntu-csaf", "status": "affected", "confidence": 0.9},
{"observationId": "vex:obs:...", "providerId": "suse-csaf", "status": "fixed", "confidence": 0.85}
],
"createdAt": "2025-11-18T12:34:56Z"
}
],
"nextCursor": null
}
```
**Error Responses:**
- `400 ERR_AGG_PARAMS` - At least one filter is required
### Get linkset by ID
```
GET /vex/linksets/{linksetId}
```
**Response 200:**
```json
{
"linksetId": "sha256:...",
"tenant": "default",
"vulnerabilityId": "CVE-2024-0001",
"productKey": "pkg:maven/org.demo/app@1.2.3",
"providerIds": ["ubuntu-csaf", "suse-csaf"],
"statuses": ["affected", "fixed"],
"confidence": "low",
"hasConflicts": true,
"disagreements": [
{
"providerId": "suse-csaf",
"status": "fixed",
"justification": null,
"confidence": 0.85
}
],
"observations": [
{"observationId": "vex:obs:...", "providerId": "ubuntu-csaf", "status": "affected", "confidence": 0.9},
{"observationId": "vex:obs:...", "providerId": "suse-csaf", "status": "fixed", "confidence": 0.85}
],
"createdAt": "2025-11-18T12:00:00Z",
"updatedAt": "2025-11-18T12:34:56Z"
}
```
**Error Responses:**
- `400 ERR_AGG_PARAMS` - linksetId is required
- `404 ERR_AGG_NOT_FOUND` - Linkset not found
### Lookup linkset by vulnerability and product
```
GET /vex/linksets/lookup?vulnerabilityId=CVE-2024-0001&productKey=pkg:maven/org.demo/app@1.2.3
```
**Response 200:** Same as Get linkset by ID
**Error Responses:**
- `400 ERR_AGG_PARAMS` - vulnerabilityId and productKey are required
- `404 ERR_AGG_NOT_FOUND` - No linkset found for the specified vulnerability and product
### Count linksets
```
GET /vex/linksets/count
```
**Response 200:**
```json
{
"total": 5000,
"withConflicts": 127
}
```
### List linksets with conflicts (shorthand)
```
GET /vex/linksets/conflicts?limit=50
```
**Response 200:** Same format as List linksets
## Error Codes
| Code | Description |
|------|-------------|
| `ERR_PARAMS` | Missing or invalid query parameters (observations) |
| `ERR_TENANT` | X-Stella-Tenant header is required |
| `ERR_NOT_FOUND` | Observation not found |
| `ERR_AGG_PARAMS` | Missing or invalid query parameters (linksets) |
| `ERR_AGG_NOT_FOUND` | Linkset not found |
## Pagination
- Uses cursor-based pagination with base64-encoded `timestamp|id` cursors
- Default limit: 50, Maximum limit: 100
- Cursors are opaque; treat as strings and pass back unchanged
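A sketch of the cursor scheme, assuming `|` as the field separator per the format described above:

```python
import base64

def encode_cursor(timestamp_iso, last_id):
    """Pack the last-seen (timestamp, id) pair into an opaque base64 cursor."""
    return base64.b64encode(f"{timestamp_iso}|{last_id}".encode("utf-8")).decode("ascii")

def decode_cursor(cursor):
    """Unpack a cursor; the id may itself contain separators, so split only once."""
    timestamp_iso, last_id = base64.b64decode(cursor).decode("utf-8").split("|", 1)
    return timestamp_iso, last_id
```

Clients should never do this themselves; it only illustrates why cursors must be passed back unchanged.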
## Determinism
- Results are sorted by timestamp (descending), then by ID
- Array fields are sorted lexicographically
- Status enums are lowercase strings
## SDK Example (TypeScript)
```typescript
const listObservations = async (
baseUrl: string,
token: string,
tenant: string,
vulnerabilityId: string,
productKey: string
) => {
const params = new URLSearchParams({
vulnerabilityId,
productKey,
limit: "100"
});
const response = await fetch(`${baseUrl}/vex/observations?${params}`, {
headers: {
Authorization: `Bearer ${token}`,
"X-Stella-Tenant": tenant
}
});
if (!response.ok) {
const error = await response.json();
throw new Error(`${error.error.code}: ${error.error.message}`);
}
return response.json();
};
const getLinksetWithConflicts = async (
  baseUrl: string,
  token: string,
  tenant: string
) => {
  const response = await fetch(`${baseUrl}/vex/linksets/conflicts?limit=50`, {
    headers: {
      Authorization: `Bearer ${token}`,
      "X-Stella-Tenant": tenant
    }
  });
  if (!response.ok) {
    const error = await response.json();
    throw new Error(`${error.error.code}: ${error.error.message}`);
  }
  return response.json();
};
```
## Related Documentation
- `vex_observations.md` - VEX Observation domain model and storage schema
- `evidence-contract.md` - Evidence bundle format and attestation
- `AGENTS.md` - Component development guidelines


@@ -0,0 +1,232 @@
# VEX Observation Model (`vex_observations`)
> Authored 2025-11-14 for Sprint 120 (`EXCITITOR-LNM-21-001`). This document is the canonical schema description for Excititor's immutable observation records. It unblocks downstream documentation tasks (`DOCS-LNM-22-002`) and aligns the WebService/Worker data structures with PostgreSQL persistence.
Excititor ingests heterogeneous VEX statements, normalizes them under the Aggregation-Only Contract (AOC), and persists each normalized statement as a **VEX observation**. These observations are the source of truth for:
- Advisory AI citation APIs (`/v1/vex/observations/{vulnerabilityId}/{productKey}`)
- Graph/Vuln Explorer overlays (batch observation APIs)
- Evidence Locker + portable bundle manifests
- Policy Engine materialization and audit trails
All observation documents are immutable. New information creates a new observation record linked by `observationId`; supersedence happens through Graph/Lens layers, not by mutating this collection.
## Storage & routing
| Aspect | Value |
| --- | --- |
| Table | `vex_observations` (PostgreSQL) |
| Upstream generator | `VexObservationProjectionService` (WebService) and Worker normalization pipeline |
| Primary key | `{tenant, observationId}` |
| Required indexes | `{tenant, vulnerabilityId}`, `{tenant, productKey}`, `{tenant, document.digest}`, `{tenant, providerId, status}` |
| Source of truth for | `/v1/vex/observations`, Graph batch APIs, Excititor → Evidence Locker replication |
## Canonical document shape
```jsonc
{
"tenant": "default",
"observationId": "vex:obs:sha256:...",
"vulnerabilityId": "CVE-2024-12345",
"productKey": "pkg:maven/org.example/app@1.2.3",
"providerId": "ubuntu-csaf",
"status": "affected", // matches VexClaimStatus enum
"justification": {
"type": "component_not_present",
"reason": "Package not shipped in this profile",
"detail": "Binary not in base image"
},
"detail": "Free-form vendor detail",
"confidence": {
"score": 0.9,
"level": "high",
"method": "vendor"
},
"signals": {
"severity": {
"scheme": "cvss3.1",
"score": 7.8,
"label": "High",
"vector": "CVSS:3.1/..."
},
"kev": true,
"epss": 0.77
},
"scope": {
"key": "pkg:deb/ubuntu/apache2@2.4.58-1",
"purls": [
"pkg:deb/ubuntu/apache2@2.4.58-1",
"pkg:docker/example/app@sha256:..."
],
"cpes": ["cpe:2.3:a:apache:http_server:2.4.58:*:*:*:*:*:*:*"]
},
"anchors": [
"#/statements/0/justification",
"#/statements/0/detail"
],
"document": {
"format": "csaf",
"digest": "sha256:abc123...",
"revision": "2024-10-22T09:00:00Z",
"sourceUri": "https://ubuntu.com/security/notices/USN-0000-1",
"signature": {
"type": "cosign",
"issuer": "https://token.actions.githubusercontent.com",
"keyId": "ubuntu-vex-prod",
"verifiedAt": "2024-10-22T09:01:00Z",
"transparencyLogReference": "rekor://UUID",
"trust": {
"tenantId": "default",
"issuerId": "ubuntu",
"effectiveWeight": 0.9,
"tenantOverrideApplied": false,
"retrievedAtUtc": "2024-10-22T09:00:30Z"
}
}
},
"aoc": {
"guardVersion": "2024.10.0",
"violations": [], // non-empty -> stored + surfaced
"ingestedAt": "2024-10-22T09:00:05Z",
"retrievedAt": "2024-10-22T08:59:59Z"
},
"metadata": {
"provider-hint": "Mainline feed",
"source-channel": "mirror"
}
}
```
### Field notes
- **`tenant`** - logical tenant resolved by the WebService based on headers or default configuration.
- **`observationId`** - deterministic hash (sha256) over `{tenant, vulnerabilityId, productKey, providerId, statementDigest}`. Never reused.
- **`status` + `justification`** - follow the OpenVEX semantics enforced by `StellaOps.Excititor.Core.VexClaim`.
- **`scope`** - includes the canonical `key` plus normalized PURLs/CPEs; deterministic ordering.
- **`anchors`** - optional JSON-pointer hints pointing to the source document sections; stored as trimmed strings.
- **`document.signature`** - mirrors `VexSignatureMetadata`; empty if the upstream feed lacks signatures.
- **`aoc.violations`** - stored if the guard detected non-fatal issues; fatal issues never create an observation.
- **`metadata`** - reserved for deterministic provider hints; keys follow `vex.*` prefix guidance.
## Determinism & AOC guarantees
1. **Write-once** - once inserted, observation documents never change. New evidence creates a new `observationId`.
2. **Sorted collections** - arrays (`anchors`, `purls`, `cpes`) are sorted lexicographically before persistence.
3. **Guard metadata** - `aoc.guardVersion` records the guard library version (`docs/aoc/guard-library.md`), enabling audits.
4. **Signatures** - only verification metadata proven by the Worker is stored; the WebService never recomputes trust.
5. **Time normalization** - all timestamps are stored as UTC ISO-8601 strings (PostgreSQL `timestamptz`).
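The deterministic `observationId` described above can be sketched with Node's crypto module. The exact canonical serialization Excititor uses is not specified in this document; the newline-joined field list below is an illustrative assumption.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch: deterministic observationId as sha256 over the
// identifying fields. The real canonicalization may differ; only the
// write-once, never-reused property is what matters here.
const observationId = (
  tenant: string,
  vulnerabilityId: string,
  productKey: string,
  providerId: string,
  statementDigest: string
): string => {
  const canonical = [tenant, vulnerabilityId, productKey, providerId, statementDigest].join("\n");
  return `vex:obs:sha256:${createHash("sha256").update(canonical, "utf8").digest("hex")}`;
};
```

Given identical inputs, the same id is produced on every run, which is what makes replays and audits deterministic.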
## API mapping
| API | Source fields | Notes |
| --- | --- | --- |
| `GET /vex/observations` | `tenant`, `vulnerabilityId`, `productKey`, `providerId` | List observations with filters. Implemented in `ObservationEndpoints.cs`. |
| `GET /vex/observations/{observationId}` | `tenant`, `observationId` | Get single observation by ID with full detail. |
| `GET /vex/observations/count` | `tenant` | Count all observations for tenant. |
| `/v1/vex/observations/{vuln}/{product}` | `tenant`, `vulnerabilityId`, `productKey`, `scope`, `statements[]` | Response uses `VexObservationProjectionService` to render `statements`, `document`, and `signature` fields. |
| `/vex/aoc/verify` | `document.digest`, `providerId`, `aoc` | Replays guard validation for recent digests; guard violations here align with `aoc.violations`. |
| Evidence batch API (Graph) | `statements[]`, `scope`, `signals`, `anchors` | Format optimized for overlays; reduces `document` to digest/URI. |
## Related work
- `EXCITITOR-GRAPH-24-*` relies on this schema to build overlays.
- `DOCS-LNM-22-002` (Link-Not-Merge documentation) references this file.
- `EXCITITOR-ATTEST-73-*` uses `document.digest` + `signature` to embed provenance in attestation payloads.
---
## Rekor Transparency Log Linkage
**Sprint Reference**: `SPRINT_20260117_002_EXCITITOR_vex_rekor_linkage`
VEX observations can be attested to the Sigstore Rekor transparency log, providing an immutable, publicly verifiable record of when each observation was recorded. This supports:
- **Auditability**: Independent verification that an observation existed at a specific time
- **Non-repudiation**: Cryptographic proof of observation provenance
- **Supply chain compliance**: Evidence for regulatory and security requirements
- **Offline verification**: Stored inclusion proofs enable air-gapped verification
### Rekor Linkage Fields
The following fields are added to `vex_observations` when an observation is attested:
| Field | Type | Description |
|-------|------|-------------|
| `rekor_uuid` | TEXT | Rekor entry UUID (64-char hex) |
| `rekor_log_index` | BIGINT | Monotonically increasing log position |
| `rekor_integrated_time` | TIMESTAMPTZ | When entry was integrated into log |
| `rekor_log_url` | TEXT | Rekor server URL where submitted |
| `rekor_inclusion_proof` | JSONB | RFC 6962 inclusion proof for offline verification |
| `rekor_linked_at` | TIMESTAMPTZ | When linkage was recorded locally |
### Schema Extension
```sql
-- V20260117__vex_rekor_linkage.sql
ALTER TABLE excititor.vex_observations
ADD COLUMN IF NOT EXISTS rekor_uuid TEXT,
ADD COLUMN IF NOT EXISTS rekor_log_index BIGINT,
ADD COLUMN IF NOT EXISTS rekor_integrated_time TIMESTAMPTZ,
ADD COLUMN IF NOT EXISTS rekor_log_url TEXT,
ADD COLUMN IF NOT EXISTS rekor_inclusion_proof JSONB,
ADD COLUMN IF NOT EXISTS rekor_linked_at TIMESTAMPTZ;
-- Indexes for Rekor queries (IF NOT EXISTS keeps the migration idempotent,
-- matching the ALTER TABLE statements above)
CREATE INDEX IF NOT EXISTS idx_vex_observations_rekor_uuid
    ON excititor.vex_observations(rekor_uuid)
    WHERE rekor_uuid IS NOT NULL;
CREATE INDEX IF NOT EXISTS idx_vex_observations_pending_rekor
    ON excititor.vex_observations(created_at)
    WHERE rekor_uuid IS NULL;
```
### API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/attestations/rekor/observations/{id}` | POST | Attest observation to Rekor |
| `/attestations/rekor/observations/batch` | POST | Batch attestation |
| `/attestations/rekor/observations/{id}/verify` | GET | Verify Rekor linkage |
| `/attestations/rekor/pending` | GET | List observations pending attestation |
### CLI Commands
```bash
# Show observation with Rekor details
stella vex observation show <id> --show-rekor
# Attest an observation to Rekor
stella vex observation attest <id> [--rekor-url URL]
# Verify Rekor linkage
stella vex observation verify-rekor <id> [--offline]
# List pending attestations
stella vex observation list-pending
```
### Inclusion Proof Structure
```jsonc
{
"treeSize": 1234567,
"rootHash": "base64-encoded-root-hash",
"logIndex": 12345,
"hashes": [
"base64-hash-1",
"base64-hash-2",
"base64-hash-3"
]
}
```
### Verification Modes
| Mode | Network | Use Case |
|------|---------|----------|
| Online | Required | Full verification against live Rekor |
| Offline | Not required | Verify using stored inclusion proof |
Offline mode uses the stored `rekor_inclusion_proof` to verify the Merkle path locally. This is essential for air-gapped environments.
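The local Merkle-path check can be sketched per the RFC 6962/9162 inclusion-proof algorithm (leaf hashes prefixed with `0x00`, interior nodes with `0x01`). This is a sketch only; production verification should go through a vetted Sigstore/Rekor client rather than hand-rolled code.

```typescript
import { createHash } from "node:crypto";

const sha256 = (...parts: Buffer[]): Buffer =>
  createHash("sha256").update(Buffer.concat(parts)).digest();

// RFC 6962 domain separation: 0x00 prefix for leaves, 0x01 for interior nodes.
const leafHash = (leaf: Buffer): Buffer => sha256(Buffer.from([0x00]), leaf);
const nodeHash = (l: Buffer, r: Buffer): Buffer => sha256(Buffer.from([0x01]), l, r);

// Offline inclusion-proof verification (RFC 9162 §2.1.3.2): walk the stored
// proof hashes from the leaf up and compare the computed root to rootHash.
function verifyInclusion(
  logIndex: number,
  treeSize: number,
  leaf: Buffer,
  proof: Buffer[],
  rootHash: Buffer
): boolean {
  if (logIndex >= treeSize) return false;
  let fn = logIndex;
  let sn = treeSize - 1;
  let r = leafHash(leaf);
  for (const p of proof) {
    if (sn === 0) return false;
    if (fn % 2 === 1 || fn === sn) {
      r = nodeHash(p, r); // sibling is on the left
      if (fn % 2 === 0) {
        // fn == sn with fn even: skip levels where this node has no sibling
        while (fn % 2 === 0 && fn !== 0) {
          fn >>= 1;
          sn >>= 1;
        }
      }
    } else {
      r = nodeHash(r, p); // sibling is on the right
    }
    fn >>= 1;
    sn >>= 1;
  }
  return sn === 0 && r.equals(rootHash);
}
```

The inputs map directly onto the stored linkage fields: `logIndex` from `rekor_log_index`, and `treeSize`, `hashes`, and `rootHash` from the `rekor_inclusion_proof` JSONB document.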


@@ -0,0 +1,40 @@
# Extensions (IDE Plugins)
> IDE integration plugins for Stella Ops, enabling release management and configuration validation from within VS Code and JetBrains IDEs.
## Purpose
Provides IDE integration for Stella Ops via VS Code and JetBrains plugins, allowing developers to manage releases, view environments, and validate configurations without leaving their editor. Extensions act as thin clients consuming existing Orchestrator and Router APIs, bringing operational visibility directly into the development workflow.
## Quick Links
- [Architecture](./architecture.md) - Technical design and implementation details
## Status
| Attribute | Value |
|-----------|-------|
| **Maturity** | Beta |
| **Source** | `src/Extensions/` |
## Key Features
- **VS Code extension:** Tree views for releases and environments, CodeLens annotations for `stella.yaml` files, command palette integration, status bar widgets
- **JetBrains plugin:** Tool windows with Releases/Environments/Deployments tabs, YAML annotator for configuration validation, status bar integration, action menus
- **Unified configuration:** Both plugins share the same Orchestrator API surface and authentication flow
- **Real-time updates:** Live status refresh for release pipelines and environment health
## Dependencies
### Upstream (this module depends on)
- **Orchestrator** - Release state, pipeline status, and environment data via HTTP API
- **Authority** - OAuth token-based authentication and scope enforcement
### Downstream (modules that depend on this)
- None (end-user development tools; no other modules consume Extensions)
## Related Documentation
- [Orchestrator](../orchestrator/) - Backend API consumed by extensions
- [Authority](../authority/) - Authentication provider
- [CLI](../cli/) - Command-line alternative for the same operations


@@ -0,0 +1,117 @@
# Extensions (IDE Plugins) Architecture
> Technical architecture for VS Code and JetBrains IDE plugins providing Stella Ops integration.
## Overview
The Extensions module consists of two independent IDE plugins that provide developer-facing integration with the Stella Ops platform. Both plugins are pure HTTP clients that consume the Orchestrator and Router APIs; they do not host any services, expose endpoints, or maintain local databases. Authentication is handled through OAuth tokens obtained from the Authority service.
## Design Principles
1. **Thin client** - Extensions contain no business logic; all state and decisions live in backend services
2. **Consistent experience** - Both plugins expose equivalent functionality despite different technology stacks
3. **Non-blocking** - All API calls are asynchronous; the IDE remains responsive during network operations
4. **Offline-tolerant** - Graceful degradation when the Stella Ops backend is unreachable
## Components
```
Extensions/
├── vscode-stella-ops/ # VS Code extension (TypeScript)
│ ├── src/
│ │ ├── extension.ts # Entry point and activation
│ │ ├── providers/
│ │ │ ├── ReleaseTreeProvider.ts # TreeView: releases
│ │ │ ├── EnvironmentTreeProvider.ts# TreeView: environments
│ │ │ └── CodeLensProvider.ts # CodeLens for stella.yaml
│ │ ├── commands/ # Command palette handlers
│ │ ├── views/
│ │ │ └── webview/ # Webview panels (detail views)
│ │ ├── statusbar/
│ │ │ └── StatusBarManager.ts # Status bar integration
│ │ └── api/
│ │ └── OrchestratorClient.ts # HTTP client for Orchestrator API
│ ├── package.json # Extension manifest
│ └── tsconfig.json
└── jetbrains-stella-ops/ # JetBrains plugin (Kotlin)
├── src/main/kotlin/
│ ├── toolwindow/
│ │ ├── ReleasesToolWindow.kt # Tool window: Releases tab
│ │ ├── EnvironmentsToolWindow.kt # Tool window: Environments tab
│ │ └── DeploymentsToolWindow.kt # Tool window: Deployments tab
│ ├── annotator/
│ │ └── StellaYamlAnnotator.kt # YAML file annotator
│ ├── actions/ # Action menu handlers
│ ├── statusbar/
│ │ └── StellaStatusBarWidget.kt # Status bar widget
│ └── api/
│ └── OrchestratorClient.kt # HTTP client for Orchestrator API
├── src/main/resources/
│ └── META-INF/plugin.xml # Plugin descriptor
└── build.gradle.kts
```
## Data Flow
```
[Developer IDE] --> [Extension/Plugin]
├── GET /api/v1/releases/* ──────> [Orchestrator API]
├── GET /api/v1/environments/* ──> [Orchestrator API]
├── POST /api/v1/promotions/* ──-> [Orchestrator API]
└── POST /oauth/token ──────────-> [Authority]
```
1. **Authentication:** On activation, the extension initiates an OAuth device-code or browser-redirect flow against Authority. The obtained access token is stored in the IDE's secure credential store (VS Code `SecretStorage`, JetBrains `PasswordSafe`).
2. **Data retrieval:** Tree views and tool windows issue HTTP GET requests to the Orchestrator API on initial load and on manual/timed refresh.
3. **Actions:** Approve/reject/promote commands issue HTTP POST requests to the Orchestrator release control endpoints.
4. **Configuration validation:** The CodeLens provider (VS Code) and YAML annotator (JetBrains) parse `stella.yaml` files locally and highlight configuration issues inline.
## VS Code Extension Details
### Tree Views
- **Releases:** Hierarchical view of releases grouped by environment, showing status, version, and promotion eligibility
- **Environments:** Flat list of configured environments with health indicators
### CodeLens
- Inline annotations above `stella.yaml` entries showing the current deployment status of the referenced release
- Click-to-promote actions directly from the YAML file
### Status Bar
- Compact widget showing the number of pending promotions and overall platform health
### Webview Panels
- Detail panels for release timelines, evidence summaries, and deployment logs
## JetBrains Plugin Details
### Tool Windows
- **Releases tab:** Table view of all releases with sortable columns (version, environment, status, timestamp)
- **Environments tab:** Environment cards with health status and current deployments
- **Deployments tab:** Active and recent deployment history with log links
### YAML Annotator
- Real-time validation of `stella.yaml` files with gutter icons and tooltip messages for configuration issues
### Action Menus
- Context-sensitive actions (promote, approve, reject) available from tool window rows and editor context menus
## Security Considerations
- **Token storage:** OAuth tokens are stored exclusively in the IDE's built-in secure credential store; never persisted to disk in plaintext
- **Scope enforcement:** Extensions request only the scopes necessary for read operations and promotions (`release:read`, `release:promote`, `env:read`)
- **TLS enforcement:** All HTTP communication uses HTTPS; certificate validation is not bypassed
- **No secrets in configuration:** The `stella.yaml` file contains no credentials; integration secrets are managed by the Authority and Integrations modules
## Performance Characteristics
- Tree view refresh is debounced to avoid excessive API calls (default: 30-second minimum interval)
- API responses are cached locally with short TTL (60 seconds) to reduce latency on repeated navigation
- Webview panels and tool windows load data lazily on first open
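The short-TTL response cache mentioned above can be sketched as a small wrapper with lazy eviction. This is a hypothetical illustration of the caching behavior, not the extensions' actual implementation; the injectable clock exists only to make the sketch testable.

```typescript
// Hypothetical sketch of the 60-second response cache described above.
class TtlCache<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();

  constructor(
    private readonly ttlMs: number,
    private readonly now: () => number = Date.now // injectable clock for tests
  ) {}

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // lazily evict stale entries on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

An extension would consult the cache before issuing a GET (keyed by URL) and fall through to the Orchestrator client on a miss, keeping repeated tree-view navigation cheap.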
## References
- [Module README](./README.md)
- [Orchestrator Architecture](../orchestrator/architecture.md)
- [Authority Architecture](../authority/architecture.md)


@@ -0,0 +1,43 @@
# Feedser
> Evidence collection library for backport detection and binary fingerprinting.
## Purpose
Feedser provides deterministic, cryptographic evidence collection for backport detection. It extracts patch signatures from unified diffs and binary fingerprints from compiled code to enable high-confidence vulnerability status determination for packages where upstream fixes have been backported by distro maintainers.
## Quick Links
- [Architecture](./architecture.md) - Technical design and implementation details
## Status
| Attribute | Value |
|-----------|-------|
| **Maturity** | Production |
| **Last Reviewed** | 2025-12-29 |
| **Maintainer** | Concelier Guild |
## Key Features
- **Patch Signature Extraction**: Parse unified diffs and extract normalized hunk signatures
- **Binary Fingerprinting**: TLSH fuzzy hashing and instruction sequence hashing
- **Four-Tier Proof System**: Supporting backport detection at multiple confidence levels
- **Deterministic Outputs**: Canonical JSON serialization with stable hashing
## Dependencies
### Upstream (this module depends on)
- None (library with no external service dependencies)
### Downstream (modules that depend on this)
- **Concelier** - ProofService layer consumes Feedser for backport evidence
- **Attestor** - Evidence storage for generated proofs
## Notes
Feedser is a **library**, not a standalone service. It does not expose REST APIs directly and does not make vulnerability decisions. It provides evidence that feeds into VEX statements and Policy Engine evaluation.
## Related Documentation
- [Concelier Architecture](../concelier/architecture.md)


@@ -0,0 +1,237 @@
# component_architecture_feedser.md - **Stella Ops Feedser** (2025Q4)
> Evidence collection library for backport detection and binary fingerprinting.
> **Scope.** Library architecture for **Feedser**: patch signature extraction, binary fingerprinting, and evidence collection supporting the four-tier backport proof system. Consumed primarily by Concelier's ProofService layer.
---
## 0) Mission & boundaries
**Mission.** Provide deterministic, cryptographic evidence collection for backport detection. Extract patch signatures from unified diffs and binary fingerprints from compiled code to enable high-confidence vulnerability status determination for packages where upstream fixes have been backported by distro maintainers.
**Boundaries.**
* Feedser is a **library**, not a standalone service. It does not expose REST APIs directly.
* Feedser **does not** make vulnerability decisions. It provides evidence that feeds into VEX statements and Policy Engine evaluation.
* Feedser **does not** store data. Storage is handled by consuming services (Concelier ProofService, Attestor).
* All outputs are **deterministic** with canonical JSON serialization and stable hashing.
---
## 1) Solution & project layout
```
src/Feedser/
├─ StellaOps.Feedser.Core/ # Patch signature extraction (HunkSig)
│ ├─ HunkSigExtractor.cs # Unified diff parser and normalizer
│ ├─ Models/
│ │ ├─ PatchSignature.cs # Deterministic patch identifier
│ │ ├─ HunkSignature.cs # Individual hunk with normalized content
│ │ └─ DiffParseResult.cs # Parse output with file paths and hunks
│ └─ Normalization/
│ └─ WhitespaceNormalizer.cs # Whitespace/comment stripping
├─ StellaOps.Feedser.BinaryAnalysis/ # Binary fingerprinting engine
│ ├─ BinaryFingerprintFactory.cs # Factory for fingerprinting strategies
│ ├─ IBinaryFingerprinter.cs # Fingerprinter interface
│ ├─ Models/
│ │ ├─ BinaryFingerprint.cs # Fingerprint record with method/value
│ │ └─ FingerprintMatchResult.cs # Match score and confidence
│ └─ Fingerprinters/
│ ├─ SimplifiedTlshFingerprinter.cs # TLSH fuzzy hashing
│ └─ InstructionHashFingerprinter.cs # Instruction sequence hashing
├─ plugins/
│ └─ concelier/ # Concelier integration plugin
└─ __Tests/
└─ StellaOps.Feedser.Core.Tests/ # Unit tests
```
---
## 2) External dependencies
* **Concelier ProofService** - Primary consumer; orchestrates four-tier evidence collection
* **Attestor ProofChain** - Consumes evidence for proof blob generation
* **.NET 10** - Runtime target
* No database dependencies (stateless library)
* No external network dependencies
---
## 3) Contracts & data model
### 3.1 Patch Signature (Tier 3 Evidence)
```csharp
public sealed record PatchSignature
{
public required string Id { get; init; } // Deterministic SHA256
public required string FilePath { get; init; } // Source file path
public required IReadOnlyList<HunkSignature> Hunks { get; init; }
public required string ContentHash { get; init; } // BLAKE3-256 of normalized content
public string? CommitId { get; init; } // Git commit SHA if available
public string? UpstreamCve { get; init; } // Associated CVE
}
public sealed record HunkSignature
{
public required int OldStart { get; init; }
public required int NewStart { get; init; }
public required string NormalizedContent { get; init; } // Whitespace-stripped
public required string ContentHash { get; init; }
}
```
### 3.2 Binary Fingerprint (Tier 4 Evidence)
```csharp
public sealed record BinaryFingerprint
{
public required string Method { get; init; } // tlsh, instruction_hash
public required string Value { get; init; } // Fingerprint value
public required string TargetPath { get; init; } // Binary file path
public string? FunctionName { get; init; } // Function if scoped
public required string Architecture { get; init; } // x86_64, aarch64, etc.
}
public sealed record FingerprintMatchResult
{
public required decimal Similarity { get; init; } // 0.0-1.0
public required decimal Confidence { get; init; } // 0.0-1.0
public required string Method { get; init; }
public required BinaryFingerprint Query { get; init; }
public required BinaryFingerprint Match { get; init; }
}
```
### 3.3 Evidence Tier Confidence Levels
| Tier | Evidence Type | Confidence Range | Description |
|------|--------------|------------------|-------------|
| 1 | Distro Advisory | 0.95-0.98 | Official vendor/distro statement |
| 2 | Changelog Mention | 0.75-0.85 | CVE mentioned in changelog |
| 3 | Patch Signature (HunkSig) | 0.85-0.95 | Normalized patch hash match |
| 4 | Binary Fingerprint | 0.55-0.85 | Compiled code similarity |
---
## 4) Core Components
### 4.1 HunkSigExtractor
Parses unified diff format and extracts normalized patch signatures:
```csharp
public interface IHunkSigExtractor
{
PatchSignature Extract(string unifiedDiff, string? commitId = null);
IReadOnlyList<PatchSignature> ExtractMultiple(string multiFileDiff);
}
```
**Normalization rules:**
- Strip leading/trailing whitespace
- Normalize line endings to LF
- Remove C-style comments (optional)
- Collapse multiple whitespace to single space
- Sort hunks by (file_path, old_start) for determinism
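The line-level normalization rules can be sketched as follows. Feedser implements this in C# (`WhitespaceNormalizer`); this TypeScript version is illustrative only, and the comment stripping shown handles single-line C-style comments, not multi-line blocks.

```typescript
// Illustrative sketch of the hunk normalization rules above.
const normalizeHunk = (content: string, stripComments = false): string =>
  content
    .replace(/\r\n?/g, "\n")           // normalize CRLF/CR line endings to LF
    .split("\n")
    .map((line) => {
      const withoutComments = stripComments
        ? line.replace(/\/\*.*?\*\//g, "").replace(/\/\/.*$/, "") // single-line comments only
        : line;
      return withoutComments
        .trim()                        // strip leading/trailing whitespace
        .replace(/\s+/g, " ");         // collapse whitespace runs to one space
    })
    .join("\n");
```

Because the same bytes always normalize to the same string, hashing the result yields the deterministic `ContentHash` used in `HunkSignature`.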
### 4.2 BinaryFingerprintFactory
Factory for creating fingerprinters based on binary type and analysis requirements:
```csharp
public interface IBinaryFingerprintFactory
{
IBinaryFingerprinter Create(FingerprintMethod method);
IReadOnlyList<IBinaryFingerprinter> GetAll();
}
public interface IBinaryFingerprinter
{
string Method { get; }
BinaryFingerprint Extract(ReadOnlySpan<byte> binary, string path);
FingerprintMatchResult Match(BinaryFingerprint query, BinaryFingerprint candidate);
}
```
**Fingerprinting methods:**
| Method | Description | Confidence | Use Case |
|--------|-------------|------------|----------|
| `tlsh` | TLSH fuzzy hash | 0.75-0.85 | General binary similarity |
| `instruction_hash` | Normalized instruction sequences | 0.55-0.75 | Function-level matching |
---
## 5) Integration with Concelier
Feedser is consumed via `StellaOps.Concelier.ProofService.BackportProofService`:
```
BackportProofService (Concelier)
├─ Tier 1: Query advisory_observations (distro advisories)
├─ Tier 2: Query changelogs via ISourceRepository
├─ Tier 3: Query patches via IPatchRepository + HunkSigExtractor
├─ Tier 4: Query binaries + BinaryFingerprintFactory
└─ Aggregate → ProofBlob with combined confidence score
```
The ProofService orchestrates evidence collection across all tiers and produces cryptographic proof blobs for downstream consumption.
---
## 6) Security & compliance
* **Determinism**: All outputs use canonical JSON with sorted keys, UTC timestamps
* **Tamper evidence**: BLAKE3-256 content hashes for all signatures
* **No secrets**: Library handles only public patch/binary data
* **Offline capable**: No network dependencies in core library
---
## 7) Performance targets
* **Patch extraction**: < 10ms for typical unified diff (< 1000 lines)
* **Binary fingerprinting**: < 100ms for 10MB ELF binary
* **Memory**: Streaming processing for large binaries; no full file buffering
* **Parallelism**: Thread-safe extractors; concurrent fingerprinting supported
---
## 8) Observability
Library consumers (ProofService) emit metrics:
* `feedser.hunk_extraction_duration_seconds`
* `feedser.binary_fingerprint_duration_seconds`
* `feedser.fingerprint_match_score{method}`
* `feedser.evidence_tier_confidence{tier}`
---
## 9) Testing matrix
* **Unit tests**: HunkSigExtractor parsing, normalization edge cases
* **Fingerprint tests**: Known binary pairs with expected similarity scores
* **Determinism tests**: Same input produces identical output across runs
* **Performance tests**: Large diff/binary processing within targets
---
## 10) Historical note
Concelier was formerly named "Feedser" (see `docs/airgap/airgap-mode.md`). The module was refactored:
- **Feedser** retained as evidence collection library
- **Concelier** became the advisory aggregation service consuming Feedser
---
## Related Documentation
* Concelier architecture: `../concelier/architecture.md`
* Attestor ProofChain: `../attestor/architecture.md`
* Backport proof system: `../../reachability/backport-proofs.md`


@@ -0,0 +1,49 @@
# Gateway
**Status:** Implemented
**Source:** `src/Gateway/`
**Owner:** Platform Team
## Purpose
Gateway provides API routing, authentication enforcement, and transport abstraction for StellaOps services. Acts as the single entry point for external clients with support for HTTP/HTTPS and transport-agnostic messaging via Router module.
## Components
**Services:**
- `StellaOps.Gateway.WebService` - API gateway with routing, middleware, and security
**Key Features:**
- Route configuration and service discovery
- Authorization middleware (Authority integration)
- Request/response transformation
- Rate limiting and throttling
- Transport abstraction (HTTP, TCP/TLS, UDP, RabbitMQ, Valkey)
## Configuration
See `etc/policy-gateway.yaml.sample` for gateway configuration examples.
Key settings:
- Service route mappings
- Authority issuer and audience configuration
- Transport protocols and endpoints
- Security policies and CORS settings
- Rate limiting rules
## Dependencies
- Authority (authentication and authorization)
- Router (transport-agnostic messaging)
- All backend services (routing targets)
## Related Documentation
- Architecture: `./architecture.md`
- Router Module: `../router/`
- Authority Module: `../authority/`
- API Reference: `../../API_CLI_REFERENCE.md`
## Current Status
Implemented with HTTP/HTTPS support. Integrated with Authority for token validation and authorization. Supports service routing and middleware composition.


@@ -0,0 +1,568 @@
# component_architecture_gateway.md — **Stella Ops Gateway** (Sprint 3600)
> Derived from Reference Architecture Advisory and Router Architecture Specification
> **Dual-location clarification (updated 2026-02-22).** Both `src/Gateway/` and `src/Router/` contain a project named `StellaOps.Gateway.WebService`. They are **different implementations** serving complementary roles:
> - **`src/Gateway/`** (this module) — the simplified HTTP ingress gateway focused on authentication, routing to microservices via binary protocol, and OpenAPI aggregation.
> - **`src/Router/`** — the evolved "Front Door" gateway with advanced features: configurable route tables (`GatewayRouteCatalog`), reverse proxy, SPA hosting, WebSocket support, Valkey messaging transport, and extended Authority integration.
>
> The Router version (`src/Router/`) appears to be the current canonical deployment target. This Gateway version may represent a simplified or legacy configuration. Operators should verify which is deployed in their environment. See also [Router Architecture](../router/architecture.md).
> **Scope.** The Gateway WebService is the single HTTP ingress point for all external traffic. It authenticates requests via Authority (DPoP/mTLS), routes to microservices via the Router binary protocol, aggregates OpenAPI specifications, and enforces tenant isolation.
> **Ownership:** Platform Guild
---
## 0) Mission & Boundaries
### What Gateway Does
- **HTTP Ingress**: Single entry point for all external HTTP/HTTPS traffic
- **Authentication**: DPoP and mTLS token validation via Authority integration
- **Routing**: Routes HTTP requests to microservices via binary protocol (TCP/TLS)
- **OpenAPI Aggregation**: Combines endpoint specs from all registered microservices
- **Health Aggregation**: Provides unified health status from downstream services
- **Rate Limiting**: Per-tenant and per-identity request throttling
- **Tenant Propagation**: Extracts tenant context and propagates to microservices
### What Gateway Does NOT Do
- **Business Logic**: No domain logic; pure routing and auth
- **Data Storage**: Stateless; no persistent state beyond connection cache
- **Direct Database Access**: Never connects to PostgreSQL directly
- **SBOM/VEX Processing**: Delegates to Scanner, Excititor, etc.
---
## 1) Solution & Project Layout
```
src/Gateway/
├── StellaOps.Gateway.WebService/
│ ├── StellaOps.Gateway.WebService.csproj
│ ├── Program.cs # DI bootstrap, transport init
│ ├── Dockerfile
│ ├── appsettings.json
│ ├── appsettings.Development.json
│ ├── Configuration/
│ │ ├── GatewayOptions.cs # All configuration options
│ │ └── TransportOptions.cs # TCP/TLS transport config
│ ├── Middleware/
│ │ ├── TenantMiddleware.cs # Tenant context extraction
│ │ ├── RequestRoutingMiddleware.cs # HTTP → binary routing
│ │ ├── SenderConstraintMiddleware.cs # DPoP/mTLS validation
│ │ ├── IdentityHeaderPolicyMiddleware.cs # Identity header sanitization
│ │ ├── CorrelationIdMiddleware.cs # Request correlation
│ │ └── HealthCheckMiddleware.cs # Health probe handling
│ ├── Services/
│ │ ├── GatewayHostedService.cs # Transport lifecycle
│ │ ├── OpenApiAggregationService.cs # Spec aggregation
│ │ └── HealthAggregationService.cs # Downstream health
│ └── Endpoints/
│ ├── HealthEndpoints.cs # /health/*, /metrics
│ └── OpenApiEndpoints.cs # /openapi.json, /openapi.yaml
```
### Dependencies
```xml
<ItemGroup>
<ProjectReference Include="..\..\__Libraries\StellaOps.Router.Gateway\..." />
<ProjectReference Include="..\..\__Libraries\StellaOps.Router.Transport.Tcp\..." />
<ProjectReference Include="..\..\__Libraries\StellaOps.Router.Transport.Tls\..." />
<ProjectReference Include="..\..\Auth\StellaOps.Auth.ServerIntegration\..." />
</ItemGroup>
```
---
## 2) External Dependencies
| Dependency | Purpose | Required |
|------------|---------|----------|
| **Authority** | OpTok validation, DPoP/mTLS | Yes |
| **Router.Gateway** | Routing state, endpoint discovery | Yes |
| **Router.Transport.Tcp** | Binary transport (dev) | Yes |
| **Router.Transport.Tls** | Binary transport (prod) | Yes |
| **Valkey/Redis** | Rate limiting state | Optional |
---
## 3) Contracts & Data Model
### Request Flow
```
┌──────────────┐ HTTPS ┌─────────────────┐ Binary ┌─────────────────┐
│ Client │ ─────────────► │ Gateway │ ────────────► │ Microservice │
│ (CLI/UI) │ │ WebService │ Frame │ (Scanner, │
│ │ ◄───────────── │ │ ◄──────────── │ Policy, etc) │
└──────────────┘ HTTPS └─────────────────┘ Binary └─────────────────┘
```
### Binary Frame Protocol
Gateway uses the Router binary protocol for internal communication:
| Frame Type | Purpose |
|------------|---------|
| HELLO | Microservice registration with endpoints |
| HEARTBEAT | Health check and latency measurement |
| REQUEST | HTTP request serialized to binary |
| RESPONSE | HTTP response serialized from binary |
| STREAM_DATA | Streaming response chunks |
| CANCEL | Request cancellation propagation |
### Endpoint Descriptor
```csharp
public sealed class EndpointDescriptor
{
public required string Method { get; init; } // GET, POST, etc.
public required string Path { get; init; } // /api/v1/scans/{id}
public required string ServiceName { get; init; } // scanner
public required string Version { get; init; } // 1.0.0
public TimeSpan DefaultTimeout { get; init; } // 30s
public bool SupportsStreaming { get; init; } // true for large responses
public IReadOnlyList<ClaimRequirement> RequiringClaims { get; init; }
}
```
### Routing State
```csharp
public interface IRoutingStateManager
{
ValueTask RegisterEndpointsAsync(ConnectionState conn, HelloPayload hello);
ValueTask<InstanceSelection?> SelectInstanceAsync(string method, string path);
ValueTask UpdateHealthAsync(ConnectionState conn, HeartbeatPayload heartbeat);
ValueTask DrainConnectionAsync(string connectionId);
}
```
---
## 4) REST API
Gateway exposes minimal management endpoints; all business APIs are routed to microservices.
### Health Endpoints
| Endpoint | Auth | Description |
|----------|------|-------------|
| `GET /health/live` | None | Liveness probe |
| `GET /health/ready` | None | Readiness probe |
| `GET /health/startup` | None | Startup probe |
| `GET /metrics` | None | Prometheus metrics |
### OpenAPI Endpoints
| Endpoint | Auth | Description |
|----------|------|-------------|
| `GET /openapi.json` | None | Aggregated OpenAPI 3.1.0 spec |
| `GET /openapi.yaml` | None | YAML format spec |
---
## 5) Execution Flow
### Request Routing
```mermaid
sequenceDiagram
participant C as Client
participant G as Gateway
participant A as Authority
participant M as Microservice
C->>G: HTTPS Request + DPoP Token
G->>A: Validate Token
A-->>G: Claims (sub, tid, scope)
G->>G: Select Instance (Method, Path)
G->>M: Binary REQUEST Frame
M-->>G: Binary RESPONSE Frame
G-->>C: HTTPS Response
```
### Microservice Registration
```mermaid
sequenceDiagram
participant M as Microservice
participant G as Gateway
M->>G: TCP/TLS Connect
M->>G: HELLO (ServiceName, Version, Endpoints)
G->>G: Register Endpoints
G-->>M: HELLO ACK
loop Every 10s
G->>M: HEARTBEAT
M-->>G: HEARTBEAT (latency, health)
G->>G: Update Health State
end
```
---
## 6) Instance Selection Algorithm
```csharp
public ValueTask<InstanceSelection?> SelectInstanceAsync(string method, string path)
{
// 1. Find all endpoints matching (method, path)
var candidates = _endpoints
.Where(e => e.Method == method && MatchPath(e.Path, path))
.ToList();
// 2. Filter by health
candidates = candidates
.Where(c => c.Health is InstanceHealthStatus.Healthy or InstanceHealthStatus.Degraded)
.ToList();
// 3. Region preference
var localRegion = candidates.Where(c => c.Region == _config.Region).ToList();
var neighborRegions = candidates.Where(c => _config.NeighborRegions.Contains(c.Region)).ToList();
var otherRegions = candidates.Except(localRegion).Except(neighborRegions).ToList();
var preferred = localRegion.Any() ? localRegion
: neighborRegions.Any() ? neighborRegions
: otherRegions;
// 4. Within tier: prefer lower latency, then most recent heartbeat
    var selected = preferred
        .OrderBy(c => c.AveragePingMs)
        .ThenByDescending(c => c.LastHeartbeatUtc)
        .FirstOrDefault();
    return ValueTask.FromResult(selected);
}
```
---
## 7) Configuration
```yaml
gateway:
node:
region: "eu1"
nodeId: "gw-eu1-01"
environment: "prod"
transports:
tcp:
enabled: true
port: 9100
maxConnections: 1000
receiveBufferSize: 65536
sendBufferSize: 65536
tls:
enabled: true
port: 9443
certificatePath: "/certs/gateway.pfx"
certificatePassword: "${GATEWAY_CERT_PASSWORD}"
clientCertificateMode: "RequireCertificate"
allowedClientCertificateThumbprints: []
routing:
defaultTimeout: "30s"
maxRequestBodySize: "100MB"
streamingEnabled: true
streamingBufferSize: 16384
neighborRegions: ["eu2", "us1"]
auth:
dpopEnabled: true
dpopMaxClockSkew: "60s"
mtlsEnabled: true
rateLimiting:
enabled: true
requestsPerMinute: 1000
burstSize: 100
redisConnectionString: "${REDIS_URL}" # Valkey (Redis-compatible)
openapi:
enabled: true
cacheTtlSeconds: 300
title: "Stella Ops API"
version: "1.0.0"
health:
heartbeatIntervalSeconds: 10
heartbeatTimeoutSeconds: 30
unhealthyThreshold: 3
```
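The `unhealthyThreshold` semantics above amount to a small per-connection state machine: an instance is marked unhealthy after three consecutive missed heartbeats and recovers on the next successful one. A minimal illustration (Python sketch, not the actual Gateway implementation; type and method names are assumptions):

```python
class HealthTracker:
    """Tracks consecutive missed heartbeats per connection (illustrative only)."""

    def __init__(self, unhealthy_threshold: int = 3):
        self.unhealthy_threshold = unhealthy_threshold
        self.missed: dict[str, int] = {}

    def heartbeat_received(self, connection_id: str) -> None:
        # A successful heartbeat resets the missed counter.
        self.missed[connection_id] = 0

    def heartbeat_missed(self, connection_id: str) -> None:
        self.missed[connection_id] = self.missed.get(connection_id, 0) + 1

    def is_healthy(self, connection_id: str) -> bool:
        return self.missed.get(connection_id, 0) < self.unhealthy_threshold
```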
---
## 8) Scale & Performance
| Metric | Target | Notes |
|--------|--------|-------|
| Routing latency (P50) | <2ms | Overhead only; excludes downstream |
| Routing latency (P99) | <5ms | Under normal load |
| Concurrent connections | 10,000 | Per gateway instance |
| Requests/second | 50,000 | Per gateway instance |
| Memory footprint | <512MB | Base; scales with connections |
### Scaling Strategy
- Horizontal scaling behind load balancer
- Sticky sessions NOT required (stateless)
- Regional deployment for latency optimization
- Rate limiting via distributed Valkey/Redis
---
## 9) Security Posture
### Authentication
| Method | Description |
|--------|-------------|
| DPoP | Proof-of-possession tokens from Authority |
| mTLS | Certificate-bound tokens for machine clients |
### Authorization
- Claims-based authorization per endpoint
- Required claims defined in endpoint descriptors
- Tenant isolation via `tid` claim
### Transport Security
| Component | Encryption |
|-----------|------------|
| Client → Gateway | TLS 1.3 (HTTPS) |
| Gateway → Microservices | TLS (prod), TCP (dev only) |
### Rate Limiting
Gateway uses the Router's dual-window rate limiting middleware with circuit breaker:
- **Instance-level** (in-memory): Per-router-instance limits using sliding window counters
- High-precision sub-second buckets for fair rate distribution
- No external dependencies; always available
- **Environment-level** (Valkey-backed): Cross-instance limits for distributed deployments
- Atomic Lua scripts for consistent counting across instances
- Circuit breaker pattern for fail-open behavior when Valkey is unavailable
- **Activation gate**: Environment-level checks only activate above traffic threshold (configurable)
- **Response headers**: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, Retry-After
Configuration via `appsettings.yaml`:
```yaml
rate_limiting:
process_back_pressure_when_more_than_per_5min: 5000
for_instance:
rules:
- max_requests: 100
per_seconds: 1
- max_requests: 1000
per_seconds: 60
for_environment:
valkey_connection: "localhost:6379"
rules:
- max_requests: 10000
per_seconds: 60
circuit_breaker:
failure_threshold: 3
timeout_seconds: 30
half_open_timeout: 10
```
---
## 10) Observability & Audit
### Metrics (Prometheus)
```
gateway_requests_total{service,method,path,status}
gateway_request_duration_seconds{service,method,path,quantile}
gateway_active_connections{service}
gateway_transport_frames_total{type}
gateway_auth_failures_total{reason}
gateway_rate_limit_exceeded_total{tenant}
```
### Traces (OpenTelemetry)
- Span per request: `gateway.route`
- Child span: `gateway.auth.validate`
- Child span: `gateway.transport.send`
### Logs (Structured)
```json
{
"timestamp": "2025-12-21T10:00:00Z",
"level": "info",
"message": "Request routed",
"correlationId": "abc123",
"tenantId": "tenant-1",
"method": "GET",
"path": "/api/v1/scans/xyz",
"service": "scanner",
"durationMs": 45,
"status": 200
}
```
---
## 11) Testing Matrix
| Test Type | Scope | Coverage Target |
|-----------|-------|-----------------|
| Unit | Routing algorithm, auth validation | 90% |
| Integration | Transport + routing flow | 80% |
| E2E | Full request path with mock services | Key flows |
| Performance | Latency, throughput, connection limits | SLO targets |
| Chaos | Connection failures, microservice crashes | Resilience |
### Test Fixtures
- `StellaOps.Router.Transport.InMemory` for transport mocking
- Mock Authority for auth testing
- `WebApplicationFactory` for integration tests
---
## 12) DevOps & Operations
### Deployment
```yaml
# Kubernetes deployment excerpt
apiVersion: apps/v1
kind: Deployment
metadata:
name: gateway
spec:
replicas: 3
template:
spec:
containers:
- name: gateway
image: stellaops/gateway:1.0.0
ports:
- containerPort: 8080 # HTTPS
- containerPort: 9443 # TLS (microservices)
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "512Mi"
cpu: "1000m"
livenessProbe:
httpGet:
path: /health/live
port: 8080
readinessProbe:
httpGet:
path: /health/ready
port: 8080
```
### SLOs
| SLO | Target | Measurement |
|-----|--------|-------------|
| Availability | 99.9% | Uptime over 30 days |
| Latency P99 | <50ms | Includes downstream |
| Error rate | <0.1% | 5xx responses |
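The availability SLO implies a concrete error budget; for 99.9% over 30 days the allowed downtime works out to roughly 43 minutes. A quick arithmetic check:

```python
def error_budget_minutes(slo: float, window_days: int) -> float:
    """Downtime allowed by an availability SLO over the given window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo)

# 99.9% over 30 days allows about 43.2 minutes of downtime
budget = error_budget_minutes(0.999, 30)
```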
---
## 13) Roadmap
| Feature | Sprint | Status |
|---------|--------|--------|
| Core implementation | 3600.0001.0001 | TODO |
| Performance Testing Pipeline | 038 | DONE |
| WebSocket support | Future | Planned |
| gRPC passthrough | Future | Planned |
| GraphQL aggregation | Future | Exploration |
---
## 14) Performance Testing Pipeline (k6 + Prometheus + Correlation IDs)
### Overview
The Gateway includes a comprehensive performance testing pipeline with k6 load tests,
Prometheus metric instrumentation, and Grafana dashboards for performance curve modelling.
### k6 Scenarios (A–G)
| Scenario | Purpose | VUs | Duration | Key Metric |
|----------|---------|-----|----------|------------|
| A – Health Baseline | Sub-ms health probe overhead | 10 | 1 min | P95 < 10 ms |
| B – OpenAPI Aggregation | Spec cache under concurrent readers | 50 | 75 s | P95 < 200 ms |
| C – Routing Throughput | Mixed-method routing at target RPS | 200 | 2 min | P50 < 2 ms, P99 < 5 ms |
| D – Correlation ID | Propagation overhead measurement | 20 | 1 min | P95 < 5 ms overhead |
| E – Rate Limit Boundary | Enforcement correctness at boundary | 100 | 1 min | Retry-After header |
| F – Connection Ramp | Transport saturation (ramp to 1000 VUs) | 1000 | 2 min | No 503 responses |
| G – Steady-State Soak | Memory leak / resource exhaustion | 50 | 10 min | Stable memory |
Run all scenarios:
```bash
k6 run --env BASE_URL=https://gateway.stella-ops.local src/Gateway/__Tests/load/gateway_performance.k6.js
```
Run a single scenario:
```bash
k6 run --env BASE_URL=https://gateway.stella-ops.local --env SCENARIO=scenario_c_routing_throughput src/Gateway/__Tests/load/gateway_performance.k6.js
```
### Performance Metrics (GatewayPerformanceMetrics)
Meter: `StellaOps.Gateway.Performance`
| Instrument | Type | Unit | Description |
|------------|------|------|-------------|
| `gateway.requests.total` | Counter | | Total requests processed |
| `gateway.errors.total` | Counter | | Errors (4xx/5xx) |
| `gateway.ratelimit.total` | Counter | | Rate-limited requests (429) |
| `gateway.request.duration` | Histogram | ms | Full request duration |
| `gateway.auth.duration` | Histogram | ms | Auth middleware duration |
| `gateway.transport.duration` | Histogram | ms | TCP/TLS transport duration |
| `gateway.routing.duration` | Histogram | ms | Instance selection duration |
### Grafana Dashboard
Dashboard: `devops/telemetry/dashboards/stella-ops-gateway-performance.json`
UID: `stella-ops-gateway-performance`
Panels:
1. **Overview row** – P50/P99 gauges, error rate, RPS
2. **Latency Distribution** – Percentile time series (overall + per-service)
3. **Throughput & Rate Limiting** – RPS by service, rate-limited requests by route
4. **Pipeline Breakdown** – Auth/Routing/Transport P95 breakdown, errors by status
5. **Connections & Resources** – Active connections, endpoints, memory usage
### C# Models
| Type | Purpose |
|------|---------|
| `GatewayPerformanceObservation` | Single request observation (all pipeline phases) |
| `PerformanceScenarioConfig` | Scenario definition with SLO thresholds |
| `PerformanceCurvePoint` | Aggregated window data with computed RPS/error rate |
| `PerformanceTestSummary` | Complete test run result with threshold violations |
| `GatewayPerformanceMetrics` | OTel service emitting Prometheus-compatible metrics |
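`PerformanceCurvePoint` aggregates observation windows into percentiles such as the P50/P99 values the dashboards display. The nearest-rank method commonly used for such summaries can be sketched as follows (illustrative; the actual aggregation logic in the C# models may differ):

```python
import math

def percentile(samples: list, p: float) -> float:
    """Nearest-rank percentile (p in [0, 100]) over a latency sample set."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(p / 100.0 * len(ordered))
    return ordered[max(rank, 1) - 1]

durations_ms = [1.2, 1.9, 2.1, 2.4, 3.0, 3.3, 4.8, 5.1, 9.7, 14.0]
p50 = percentile(durations_ms, 50)
p99 = percentile(durations_ms, 99)
```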
---
## 15) References
- Router Architecture: `docs/modules/router/architecture.md`
- Gateway Identity Header Policy: `docs/modules/gateway/identity-header-policy.md`
- OpenAPI Aggregation: `docs/modules/gateway/openapi.md`
- Router ASP.NET Endpoint Bridge: `docs/modules/router/aspnet-endpoint-bridge.md`
- Router Messaging (Valkey) Transport: `docs/modules/router/messaging-valkey-transport.md`
- Authority Integration: `docs/modules/authority/architecture.md`
- Reference Architecture: `docs/product/advisories/archived/2025-12-21-reference-architecture/`
---
**Last Updated**: 2025-12-21 (Sprint 3600)

# Gateway · Identity Header Policy for Router Dispatch
## Status
- **Implemented** in Sprint 8100.0011.0002.
- Middleware: `src/Gateway/StellaOps.Gateway.WebService/Middleware/IdentityHeaderPolicyMiddleware.cs`
- Last updated: 2025-12-24 (UTC).
## Why This Exists
The Gateway is the single HTTP ingress point and routes requests to internal microservices over Router transports. Many services (and legacy components) still rely on **header-based identity context** (tenant/scopes/actor) rather than (or in addition to) `HttpContext.User` claims.
This creates a hard security requirement:
- **Clients must never be able to inject/override “roles/scopes” headers** that the downstream service trusts.
- The Gateway must derive the effective identity from the validated JWT/JWK token (or explicit anonymous identity) and **overwrite** downstream identity headers accordingly.
## Implementation
The `IdentityHeaderPolicyMiddleware` (introduced in Sprint 8100.0011.0002) replaces the legacy middleware:
- ~~`src/Gateway/StellaOps.Gateway.WebService/Middleware/ClaimsPropagationMiddleware.cs`~~ (retired)
- ~~`src/Gateway/StellaOps.Gateway.WebService/Middleware/TenantMiddleware.cs`~~ (retired)
### Resolved issues
1) **Spoofing risk:** ✅ Fixed. Middleware uses "strip-and-overwrite" semantics—reserved headers are stripped before claims are extracted and downstream headers are written.
2) **Claim type mismatch:** ✅ Fixed. Middleware uses `StellaOpsClaimTypes.Tenant` (`stellaops:tenant`) with fallback to legacy `tid` claim.
3) **Scope claim mismatch:** ✅ Fixed. Middleware extracts scopes from both `scp` (individual claims) and `scope` (space-separated) claims.
4) **Docs alignment:** ✅ Reconciled in this sprint.
## Policy Goals
- **No client-controlled identity headers:** the Gateway rejects or strips identity headers coming from external clients.
- **Gateway-controlled propagation:** the Gateway sets downstream identity headers based on validated token claims or a defined anonymous identity.
- **Compatibility bridge:** support both `X-Stella-*` and `X-StellaOps-*` header naming during migration.
- **Determinism:** header values are canonicalized (whitespace, ordering) and do not vary across equivalent requests.
## Reserved Headers (Draft)
The following headers are considered **reserved identity context** and must not be trusted from external clients:
- Tenant / project:
- `X-StellaOps-Tenant`, `X-Stella-Tenant`
- `X-StellaOps-Project`, `X-Stella-Project`
- Scopes / roles:
- `X-StellaOps-Scopes`, `X-Stella-Scopes`
- Actor / subject (if used for auditing):
- `X-StellaOps-Actor`, `X-Stella-Actor`
- Token proof / confirmation (if propagated):
- `cnf`, `cnf.jkt`
**Internal/legacy pass-through keys to also treat as reserved:**
- `sub`, `scope`, `scp`, `tid` (legacy), `stellaops:tenant` (if ever used as a header key)
## Overwrite Rules (Draft)
For non-system paths (i.e., requests that will be routed to microservices):
1) **Strip** all reserved identity headers from the incoming request.
2) **Compute** effective identity from the authenticated principal:
- `sub` from JWT `sub` (`StellaOpsClaimTypes.Subject`)
- `tenant` from `stellaops:tenant` (`StellaOpsClaimTypes.Tenant`)
- `project` from `stellaops:project` (`StellaOpsClaimTypes.Project`) when present
- `scopes` from:
- `scp` claims (`StellaOpsClaimTypes.ScopeItem`) if present, else
- split `scope` (`StellaOpsClaimTypes.Scope`) by spaces
3) **Write** downstream headers (compat mode):
- Tenant:
- `X-StellaOps-Tenant: <tenant>`
- `X-Stella-Tenant: <tenant>` (optional during migration)
- Project (optional):
- `X-StellaOps-Project: <project>`
- `X-Stella-Project: <project>` (optional during migration)
- Scopes:
- `X-StellaOps-Scopes: <space-delimited scopes>`
- `X-Stella-Scopes: <space-delimited scopes>` (optional during migration)
- Actor:
- `X-StellaOps-Actor: <sub>` (unless another canonical actor claim is defined)
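The strip-and-overwrite rules above condense into a small transformation. This is an illustrative sketch (Python) rather than the middleware itself; header names follow the reserved list, and the scope fallback from `scp` claims to the space-delimited `scope` claim mirrors rule 2:

```python
RESERVED = {
    "x-stellaops-tenant", "x-stella-tenant",
    "x-stellaops-project", "x-stella-project",
    "x-stellaops-scopes", "x-stella-scopes",
    "x-stellaops-actor", "x-stella-actor",
}

def apply_identity_policy(headers: dict, claims: dict) -> dict:
    """Strip reserved identity headers, then overwrite from validated claims."""
    # 1) Strip: never trust client-supplied identity context.
    out = {k: v for k, v in headers.items() if k.lower() not in RESERVED}
    # 2) Compute: prefer individual `scp` claims, else split `scope` by spaces.
    scopes = claims.get("scp") or str(claims.get("scope", "")).split()
    # 3) Overwrite: canonical, deterministically ordered values.
    out["X-StellaOps-Tenant"] = str(claims.get("stellaops:tenant", ""))
    out["X-StellaOps-Actor"] = str(claims.get("sub", "anonymous"))
    out["X-StellaOps-Scopes"] = " ".join(sorted(scopes))
    return out
```

Sorting scopes before serialization is what makes the propagated headers deterministic across equivalent requests.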
### Anonymous mode
If `Gateway:Auth:AllowAnonymous=true` and the request is unauthenticated:
- Set an explicit anonymous identity so downstream services never interpret “missing header” as privileged:
- `X-StellaOps-Actor: anonymous`
- `X-StellaOps-Scopes: ` (empty) or `anonymous` (choose one and document)
- Tenant behavior must be explicitly defined:
- either reject routed requests without tenant context, or
- require a tenant header even in anonymous mode and treat it as *untrusted input* that only selects a tenancy partition (not privileges).
## Scope Override Header (Offline/Pre-prod)
Some legacy flows allow setting scopes via headers (for offline kits or pre-prod bundles).
Draft enforcement:
- Default: **forbid** client-provided scope headers (`X-StellaOps-Scopes`, `X-Stella-Scopes`) and return 403 (deterministic error code).
- Optional controlled override: allow only when `Gateway:Auth:AllowScopeHeader=true`, and only for explicitly allowed environments (offline/pre-prod).
- Even when allowed, the override must not silently escalate a request beyond what the token allows unless the request is explicitly unauthenticated and the environment is configured for offline operation.
## Implementation Details
### Middleware Registration
The middleware is registered in `Program.cs` after authentication:
```csharp
app.UseAuthentication();
app.UseMiddleware<SenderConstraintMiddleware>();
app.UseMiddleware<IdentityHeaderPolicyMiddleware>();
```
### Configuration
Options are configured via `GatewayOptions.Auth`:
```yaml
Gateway:
Auth:
EnableLegacyHeaders: true # Write X-Stella-* in addition to X-StellaOps-*
AllowScopeHeader: false # Forbid client scope headers (default)
```
### HttpContext.Items Keys
The middleware stores normalized identity in `HttpContext.Items` using `GatewayContextKeys`:
- `Gateway.TenantId` — extracted tenant identifier
- `Gateway.ProjectId` — extracted project identifier (optional)
- `Gateway.Actor` — subject/actor from claims or "anonymous"
- `Gateway.Scopes``HashSet<string>` of scopes
- `Gateway.IsAnonymous``bool` indicating anonymous request
- `Gateway.DpopThumbprint` — JKT from cnf claim (if present)
- `Gateway.CnfJson` — raw cnf claim JSON (if present)
### Tests
Located in `src/Gateway/__Tests/StellaOps.Gateway.WebService.Tests/Middleware/IdentityHeaderPolicyMiddlewareTests.cs`:
- ✅ Spoofed identity headers are stripped and overwritten
- ✅ Claim type mapping uses `StellaOpsClaimTypes.*` correctly
- ✅ Anonymous requests receive explicit `anonymous` identity
- ✅ Legacy headers are written when `EnableLegacyHeaders=true`
- ✅ Scopes are sorted deterministically
## Related Documents
- Gateway architecture: `docs/modules/gateway/architecture.md`
- Tenant auth contract (Web V): `docs/api/gateway/tenant-auth.md`
- Router ASP.NET bridge: `docs/modules/router/aspnet-endpoint-bridge.md`

# Gateway OpenAPI Implementation
This document describes the implementation architecture of OpenAPI document aggregation in the StellaOps Router Gateway.
## Architecture
The Gateway generates OpenAPI 3.1.0 documentation by aggregating schemas and endpoint metadata from connected microservices.
### Component Overview
```
┌─────────────────────────────────────────────────────────────────────┐
│ Gateway │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────┐ ┌────────────────────┐ │
│ │ ConnectionManager │───►│ InMemoryRoutingState│ │
│ │ │ │ │ │
│ │ - OnHelloReceived │ │ - Connections[] │ │
│ │ - OnConnClosed │ │ - Endpoints │ │
│ └──────────────────┘ │ - Schemas │ │
│ │ └─────────┬──────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────────┐ ┌────────────────────┐ │
│ │ OpenApiDocument │◄───│ GatewayOpenApi │ │
│ │ Cache │ │ DocumentCache │ │
│ │ │ │ │ │
│ │ - Invalidate() │ │ - TTL expiration │ │
│ └──────────────────┘ │ - ETag generation │ │
│ └─────────┬──────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────┐ │
│ │ OpenApiDocument │ │
│ │ Generator │ │
│ │ │ │
│ │ - GenerateInfo() │ │
│ │ - GeneratePaths() │ │
│ │ - GenerateTags() │ │
│ │ - GenerateSchemas()│ │
│ └─────────┬──────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────┐ │
│ │ ClaimSecurity │ │
│ │ Mapper │ │
│ │ │ │
│ │ - SecuritySchemes │ │
│ │ - SecurityRequire │ │
│ └────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────┘
```
### Components
| Component | File | Responsibility |
|-----------|------|----------------|
| `IOpenApiDocumentGenerator` | `OpenApi/IOpenApiDocumentGenerator.cs` | Interface for document generation |
| `OpenApiDocumentGenerator` | `OpenApi/OpenApiDocumentGenerator.cs` | Builds OpenAPI 3.1.0 JSON |
| `IGatewayOpenApiDocumentCache` | `OpenApi/IGatewayOpenApiDocumentCache.cs` | Interface for document caching |
| `GatewayOpenApiDocumentCache` | `OpenApi/GatewayOpenApiDocumentCache.cs` | TTL + invalidation caching |
| `ClaimSecurityMapper` | `OpenApi/ClaimSecurityMapper.cs` | Maps claims to OAuth2 scopes |
| `OpenApiEndpoints` | `OpenApi/OpenApiEndpoints.cs` | HTTP endpoint handlers |
| `OpenApiAggregationOptions` | `OpenApi/OpenApiAggregationOptions.cs` | Configuration options |
---
## OpenApiDocumentGenerator
Generates the complete OpenAPI 3.1.0 document from routing state.
### Process Flow
1. **Collect connections** from `IGlobalRoutingState`
2. **Generate info** section from `OpenApiAggregationOptions`
3. **Generate paths** by iterating all endpoints across connections
4. **Generate components** including schemas and security schemes
5. **Generate tags** from unique service names
### Schema Handling
Schemas are prefixed with service name to avoid naming conflicts:
```csharp
var prefixedId = $"{conn.Instance.ServiceName}_{schemaId}";
// billing_CreateInvoiceRequest
```
### Operation ID Generation
Operation IDs follow a consistent pattern:
```csharp
// Path separators are normalized away, e.g. POST /invoices on billing:
var operationId = $"{serviceName}_{path.Trim('/').Replace('/', '_')}_{method}";
// billing_invoices_POST
```
---
## GatewayOpenApiDocumentCache
Implements caching with TTL expiration and content-based ETags.
### Cache Behavior
| Trigger | Action |
|---------|--------|
| First request | Generate and cache document |
| Subsequent requests (within TTL) | Return cached document |
| TTL expired | Regenerate document |
| Connection added/removed | Invalidate cache |
### ETag Generation
ETags are computed from SHA256 hash of document content:
```csharp
var hash = SHA256.HashData(Encoding.UTF8.GetBytes(documentJson));
var etag = $"\"{Convert.ToHexString(hash)[..16]}\"";
```
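Clients can replay that ETag in `If-None-Match` so the Gateway answers `304 Not Modified` without re-serializing the cached document. A minimal sketch of the comparison (Python, illustrative only; the response-writing details are assumptions):

```python
import hashlib

def make_etag(document_json: str) -> str:
    """First 16 hex chars of SHA-256, quoted, mirroring the cache above."""
    digest = hashlib.sha256(document_json.encode("utf-8")).hexdigest().upper()
    return f'"{digest[:16]}"'

def respond(document_json: str, if_none_match) -> int:
    """Return 304 when the client's ETag still matches, else 200."""
    return 304 if if_none_match == make_etag(document_json) else 200
```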
### Thread Safety
The cache uses locking to ensure thread-safe regeneration:
```csharp
lock (_lock)
{
if (_cachedDocument is null || IsExpired())
{
RegenerateDocument();
}
}
```
---
## ClaimSecurityMapper
Maps endpoint claim requirements to OpenAPI security schemes.
### Security Scheme Generation
Always generates `BearerAuth` scheme. Generates `OAuth2` scheme only when endpoints have claim requirements:
```csharp
public static JsonObject GenerateSecuritySchemes(
IEnumerable<EndpointDescriptor> endpoints,
string tokenUrl)
{
var schemes = new JsonObject();
// Always add BearerAuth
schemes["BearerAuth"] = new JsonObject { ... };
// Collect scopes from all endpoints
var scopes = CollectScopes(endpoints);
// Add OAuth2 only if scopes exist
if (scopes.Count > 0)
{
schemes["OAuth2"] = GenerateOAuth2Scheme(tokenUrl, scopes);
}
return schemes;
}
```
### Per-Operation Security
Each endpoint with claims gets a security requirement:
```csharp
public static JsonArray GenerateSecurityRequirement(EndpointDescriptor endpoint)
{
if (endpoint.AllowAnonymous)
return new JsonArray(); // Anonymous endpoint
if (!endpoint.RequiresAuthentication && endpoint.RequiringClaims.Count == 0)
        return new JsonArray(); // No auth semantics published

    // Scopes derived from this endpoint's claim requirements.
    var scopes = CollectScopes(new[] { endpoint });

    return new JsonArray
    {
        new JsonObject
        {
            ["BearerAuth"] = new JsonArray(),
            ["OAuth2"] = new JsonArray(scopes.Select(s => (JsonNode)s).ToArray())
        }
}
};
}
```
### Router-specific OpenAPI extensions
Gateway now emits Router-specific extensions on each operation:
- `x-stellaops-gateway-auth`: effective authorization semantics projected from endpoint metadata.
- `allowAnonymous`
- `requiresAuthentication`
- `source` (`None`, `AspNetMetadata`, `YamlOverride`, `Hybrid`)
- optional `policies`, `roles`, `claimRequirements`
- `x-stellaops-timeout`: timeout semantics used by gateway dispatch.
- `effectiveSeconds`
- `source` (`endpoint`, `gatewayRouteDefault`, and capped variants)
- `endpointSeconds`, `gatewayRouteDefaultSeconds`, `gatewayGlobalCapSeconds` when available
- precedence list: endpoint override -> service default -> gateway route default -> gateway global cap
- `x-stellaops-timeout-seconds`: backward-compatible scalar alias for `effectiveSeconds`.
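The precedence chain for `x-stellaops-timeout` can be expressed as a small resolver. An illustrative sketch (Python); parameter names mirror the fields above but are assumptions, as is treating the global cap as the final fallback when nothing else is configured:

```python
def resolve_timeout_seconds(endpoint, service_default, gateway_route_default, gateway_global_cap):
    """endpoint override -> service default -> gateway route default, capped globally."""
    effective = next(
        (v for v in (endpoint, service_default, gateway_route_default) if v is not None),
        None,
    )
    if effective is None:
        # Nothing configured: fall back to the global cap (assumed behavior).
        return gateway_global_cap
    if gateway_global_cap is not None:
        effective = min(effective, gateway_global_cap)
    return effective
```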
---
## Configuration Reference
### OpenApiAggregationOptions
| Property | Type | Default | Description |
|----------|------|---------|-------------|
| `Title` | `string` | `"StellaOps Gateway API"` | API title |
| `Description` | `string` | `"Unified API..."` | API description |
| `Version` | `string` | `"1.0.0"` | API version |
| `ServerUrl` | `string` | `"/"` | Base server URL |
| `CacheTtlSeconds` | `int` | `60` | Cache TTL |
| `Enabled` | `bool` | `true` | Enable/disable |
| `LicenseName` | `string` | `"BUSL-1.1"` | License name |
| `ContactName` | `string?` | `null` | Contact name |
| `ContactEmail` | `string?` | `null` | Contact email |
| `TokenUrl` | `string` | `"/auth/token"` | OAuth2 token URL |
### YAML Configuration
```yaml
OpenApi:
Title: "My Gateway API"
Description: "Unified API for all microservices"
Version: "2.0.0"
ServerUrl: "https://api.example.com"
CacheTtlSeconds: 60
Enabled: true
LicenseName: "BUSL-1.1"
ContactName: "API Team"
ContactEmail: "api@example.com"
TokenUrl: "/auth/token"
```
---
## Service Registration
Services are registered via dependency injection in `ServiceCollectionExtensions`:
```csharp
services.Configure<OpenApiAggregationOptions>(
configuration.GetSection("OpenApi"));
services.AddSingleton<IOpenApiDocumentGenerator, OpenApiDocumentGenerator>();
services.AddSingleton<IGatewayOpenApiDocumentCache, GatewayOpenApiDocumentCache>();
```
Endpoints are mapped in `ApplicationBuilderExtensions`:
```csharp
app.MapGatewayOpenApiEndpoints();
```
---
## Cache Invalidation
The `ConnectionManager` invalidates the cache on connection changes:
```csharp
private Task HandleHelloReceivedAsync(ConnectionState state, HelloPayload payload)
{
_routingState.AddConnection(state);
_openApiCache?.Invalidate(); // Invalidate on new connection
return Task.CompletedTask;
}
private Task HandleConnectionClosedAsync(string connectionId)
{
_routingState.RemoveConnection(connectionId);
_openApiCache?.Invalidate(); // Invalidate on disconnect
return Task.CompletedTask;
}
```
---
## Extension Points
### Custom Routing Plugins
The Gateway supports custom routing plugins via `IRoutingPlugin`. While not directly related to OpenAPI, routing decisions can affect which endpoints are exposed.
### Future Enhancements
Potential extension points for future development:
- **Schema Transformers**: Modify schemas before aggregation
- **Tag Customization**: Custom tag generation logic
- **Response Examples**: Include example responses from connected services
- **Webhooks**: Notify external systems on document changes
---
## Testing
Unit tests are located in `src/Gateway/__Tests/StellaOps.Gateway.WebService.Tests/OpenApi/`:
| Test File | Coverage |
|-----------|----------|
| `OpenApiDocumentGeneratorTests.cs` | Document structure, schema merging, tag generation |
| `GatewayOpenApiDocumentCacheTests.cs` | TTL expiry, invalidation, ETag consistency |
| `ClaimSecurityMapperTests.cs` | Security scheme generation from claims |
### Test Patterns
```csharp
[Fact]
public void GenerateDocument_WithConnections_GeneratesPaths()
{
// Arrange
var endpoint = new EndpointDescriptor { ... };
var connection = CreateConnection("inventory", "1.0.0", endpoint);
_routingState.Setup(x => x.GetAllConnections()).Returns([connection]);
// Act
var document = _sut.GenerateDocument();
// Assert
var doc = JsonDocument.Parse(document);
doc.RootElement.GetProperty("paths")
.TryGetProperty("/api/items", out _)
.Should().BeTrue();
}
```
---
## See Also
- [Schema Validation](../router/schema-validation.md) - JSON Schema validation in microservices
- [OpenAPI Aggregation](../router/openapi-aggregation.md) - Configuration and usage guide
- [API Overview](../../api/overview.md) - General API conventions

# IssuerDirectory
**Status:** Implemented
**Source:** `src/IssuerDirectory/`
**Owner:** VEX Guild
## Purpose
IssuerDirectory maintains a trust registry of CSAF publishers and VEX statement issuers. Provides discovery, validation, and trust scoring for upstream vulnerability advisories and VEX statements.
## Components
**Services:**
- `StellaOps.IssuerDirectory` - Main service for issuer registry management and API
## Configuration
See `etc/issuer-directory.yaml.sample` for configuration options.
Key settings:
- PostgreSQL connection (schema: `issuer_directory`)
- Authority integration settings
- Issuer discovery endpoints
- Trust validation policies
- CSAF provider metadata validation
## Dependencies
- PostgreSQL (schema: `issuer_directory`)
- Authority (authentication)
- Concelier (consumes issuer metadata)
- VexHub (consumes issuer trust data)
- VexLens (trust scoring integration)
## Related Documentation
- Architecture: `./architecture.md`
- Concelier: `../concelier/`
- VexHub: `../vexhub/`
- VexLens: `../vex-lens/`
## Current Status
Implemented with CSAF publisher discovery and validation. Supports issuer metadata storage and trust registry queries. Integrated with VEX ingestion pipeline.

# Issuer Directory Architecture (ARCHIVED)
> **ARCHIVED by Sprint 216 (2026-03-04).** IssuerDirectory source ownership moved to the Authority domain.
> Current documentation: `docs/modules/authority/architecture.md` (sections 21.1-21.4).
> Source: `src/Authority/StellaOps.IssuerDirectory/`.
> **Status:** Initial service scaffold (Sprint 100 -- Identity & Signing)
## 1. Purpose
Issuer Directory centralises trusted VEX/CSAF publisher metadata so downstream services (VEX Lens, Excititor, Policy Engine) can resolve issuer identity, active keys, and trust weights. The initial milestone delivers tenant-scoped CRUD APIs with audit logging plus bootstrap import for CSAF publishers.
## 2. Runtime Topology
- **Service name:** `stellaops/issuer-directory`
- **Framework:** ASP.NET Core minimal APIs (`net10.0`)
- **Persistence:** PostgreSQL (`issuer_directory.issuers`, `issuer_directory.issuer_keys`, `issuer_directory.issuer_audit`)
- **AuthZ:** StellaOps resource server scopes (`issuer-directory:read`, `issuer-directory:write`, `issuer-directory:admin`)
- **Audit:** Every create/update/delete emits an audit record with actor, reason, and context.
- **Bootstrap:** On startup, the service imports `data/csaf-publishers.json` into the global tenant (`@global`) and records a `seeded` audit the first time each publisher is added.
- **Key lifecycle:** API validates Ed25519 public keys, X.509 certificates, and DSSE public keys, enforces future expiries, deduplicates fingerprints, and records audit entries for create/rotate/revoke actions.
```
Clients ──> Authority (DPoP/JWT) ──> IssuerDirectory WebService ──> PostgreSQL
└─> Audit sink (PostgreSQL)
```
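The expiry and fingerprint checks in the key lifecycle above can be sketched as follows. This is a minimal illustration, not the service's implementation: the hex SHA-256 fingerprint scheme and the helper names are assumptions.

```python
import hashlib
from datetime import datetime, timezone

def fingerprint(public_key_bytes: bytes) -> str:
    # Illustrative fingerprint scheme: hex SHA-256 of the raw key material.
    return hashlib.sha256(public_key_bytes).hexdigest()

def validate_new_key(public_key_bytes: bytes, expires_at: datetime,
                     existing_fingerprints: set) -> str:
    """Return the new key's fingerprint, or raise if the key is rejected."""
    # Enforce future expiry, as the key-lifecycle API does.
    if expires_at <= datetime.now(timezone.utc):
        raise ValueError("key expiry must be in the future")
    # Deduplicate by fingerprint across the issuer's existing keys.
    fp = fingerprint(public_key_bytes)
    if fp in existing_fingerprints:
        raise ValueError("duplicate key fingerprint")
    return fp
```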
## 3. Configuration
Configuration is resolved via `IssuerDirectoryWebServiceOptions` (section name `IssuerDirectory`). The default YAML sample lives at `etc/issuer-directory.yaml.sample` and exposes:
```yaml
IssuerDirectory:
telemetry:
minimumLogLevel: Information
authority:
enabled: true
issuer: https://authority.example.com/realms/stellaops
requireHttpsMetadata: true
audiences:
- stellaops-platform
readScope: issuer-directory:read
writeScope: issuer-directory:write
adminScope: issuer-directory:admin
tenantHeader: X-StellaOps-Tenant
seedCsafPublishers: true
csafSeedPath: data/csaf-publishers.json
Postgres:
connectionString: Host=localhost;Port=5432;Database=issuer_directory;Username=stellaops;Password=secret
schema: issuer_directory
issuersTable: issuers
issuerKeysTable: issuer_keys
auditTable: issuer_audit
```
## 4. API Surface (v0)
| Method | Route | Scope | Description |
|--------|-------|-------|-------------|
| `GET` | `/issuer-directory/issuers` | `issuer-directory:read` | List tenant issuers (optionally include global seeds). |
| `GET` | `/issuer-directory/issuers/{id}` | `issuer-directory:read` | Fetch a single issuer by identifier. |
| `POST` | `/issuer-directory/issuers` | `issuer-directory:write` | Create a tenant issuer. Requires `X-StellaOps-Tenant` header and optional `X-StellaOps-Reason`. |
| `PUT` | `/issuer-directory/issuers/{id}` | `issuer-directory:write` | Update issuer metadata/endpoints/tags. |
| `DELETE` | `/issuer-directory/issuers/{id}` | `issuer-directory:admin` | Delete issuer (records audit). |
| `GET` | `/issuer-directory/issuers/{id}/keys` | `issuer-directory:read` | List issuer keys (tenant + optional `@global` seeds). |
| `POST` | `/issuer-directory/issuers/{id}/keys` | `issuer-directory:write` | Add a signing key (validates format, deduplicates fingerprint, audits). |
| `POST` | `/issuer-directory/issuers/{id}/keys/{keyId}/rotate` | `issuer-directory:write` | Retire an active key and create a replacement atomically. |
| `DELETE` | `/issuer-directory/issuers/{id}/keys/{keyId}` | `issuer-directory:admin` | Revoke a key (status → revoked, audit logged). |
| `GET` | `/issuer-directory/issuers/{id}/trust` | `issuer-directory:read` | Retrieve tenant/global trust overrides with effective weight. |
| `PUT` | `/issuer-directory/issuers/{id}/trust` | `issuer-directory:write` | Set or update a tenant trust override; reason may be supplied in body/header. |
| `DELETE` | `/issuer-directory/issuers/{id}/trust` | `issuer-directory:admin` | Remove a tenant trust override (falls back to global/default weight). |
All write/delete operations accept an optional audit reason header (`X-StellaOps-Reason`) which is persisted alongside trust override changes.
Payloads follow the contract in `Contracts/IssuerDtos.cs` and align with domain types (`IssuerRecord`, `IssuerMetadata`, `IssuerEndpoint`).
## 5. Dependencies & Reuse
- `StellaOps.IssuerDirectory.Core` — domain model (`IssuerRecord`, `IssuerKeyRecord`) + application services.
- `StellaOps.IssuerDirectory.Infrastructure` — PostgreSQL persistence, audit sink, seed loader.
- `StellaOps.IssuerDirectory.WebService` — minimal API host, authentication wiring.
- Shared libraries: `StellaOps.Configuration`, `StellaOps.Auth.ServerIntegration`.
## 6. Testing
- Unit coverage for issuer CRUD (`IssuerDirectoryServiceTests`) and key lifecycle (`IssuerKeyServiceTests`) in `StellaOps.IssuerDirectory.Core.Tests`.
- Test infrastructure leverages `FakeTimeProvider` for deterministic timestamps and in-memory fakes for repository + audit sink.
## 7. Observability
- **Metrics.** `issuer_directory_changes_total` (labels: `tenant`, `issuer`, `action`) tracks issuer create/update/delete events; `issuer_directory_key_operations_total` (labels: `tenant`, `issuer`, `operation`, `key_type`) covers key create/rotate/revoke flows; `issuer_directory_key_validation_failures_total` (labels: `tenant`, `issuer`, `reason`) captures validation/verification failures. The WebService exports these via OpenTelemetry (`StellaOps.IssuerDirectory` meter).
- **Logs.** Service-level `ILogger` instrumentation records structured entries for issuer CRUD, key lifecycle operations, and validation failures; audit logs remain the authoritative trail.
## 8. Roadmap (next milestones)
1. **Key management APIs (ISSUER-30-002)** — manage signing keys, enforce expiry, integrate with KMS.
2. **Trust weight overrides (ISSUER-30-003)** — expose policy-friendly trust weighting with audit trails.
3. **SDK integration (ISSUER-30-004)** — supply cached issuer metadata to VEX Lens and Excititor clients.
4. **Observability & Ops (ISSUER-30-005/006)** — metrics, dashboards, deployment automation, offline kit.
## 9. Operations & runbooks
- [Deployment guide](operations/deployment.md)
- [Backup & restore](operations/backup-restore.md)
- [Offline kit notes](operations/offline-kit.md)
---
*Document owner: Issuer Directory Guild*

# Issuer Directory Backup & Restore
## Scope
- **Applies to:** Issuer Directory when deployed via Docker Compose (`devops/compose/docker-compose.*.yaml`) or the Helm chart (`devops/helm/stellaops`).
- **Artifacts covered:** PostgreSQL database `issuer_directory`, service configuration (`etc/issuer-directory.yaml`), CSAF seed file (`data/csaf-publishers.json`), and secret material for the PostgreSQL connection string.
- **Frequency:** Take a hot backup before every upgrade and at least daily in production. Keep encrypted copies off-site/air-gapped according to your compliance program.
## Inventory checklist
| Component | Location (Compose default) | Notes |
| --- | --- | --- |
| PostgreSQL data | `postgres-data` volume (`/var/lib/docker/volumes/.../postgres-data`) | Contains `issuers`, `issuer_keys`, `issuer_trust_overrides`, and `issuer_audit` tables in the `issuer_directory` schema. |
| Configuration | `etc/issuer-directory.yaml` | Mounted read-only at `/etc/issuer-directory.yaml` inside the container. |
| CSAF seed file | `src/IssuerDirectory/StellaOps.IssuerDirectory/data/csaf-publishers.json` | Ensure customised seeds are part of the backup; regenerate if you ship regional overrides. |
| PostgreSQL secret | `.env` entry `ISSUER_DIRECTORY_POSTGRES_CONNECTION_STRING` or secret store export | Required to restore connectivity; treat as sensitive. |
> **Tip:** Export the secret via `kubectl get secret issuer-directory-secrets -o yaml` (sanitize before storage) or copy the Compose `.env` file into an encrypted vault. For PostgreSQL credentials, consider using `pg_dump` with connection info from environment variables.
## Hot backup (no downtime)
1. **Create output directory**
```bash
BACKUP_DIR=backup/issuer-directory/$(date +%Y-%m-%dT%H%M%S)
mkdir -p "$BACKUP_DIR"
```
2. **Dump PostgreSQL tables**
```bash
STAMP=$(date -u +%Y%m%dT%H%M%SZ)
docker compose -f devops/compose/docker-compose.prod.yaml exec postgres \
  pg_dump --format=custom --compress=9 \
  --file=/dump/issuer-directory-$STAMP.dump \
  --schema=issuer_directory issuer_directory
docker compose -f devops/compose/docker-compose.prod.yaml cp \
  postgres:/dump/issuer-directory-$STAMP.dump "$BACKUP_DIR/"
```
For Kubernetes, run the same `pg_dump` command inside the `stellaops-postgres` pod and copy the archive via `kubectl cp`.
3. **Capture configuration and seeds**
```bash
cp etc/issuer-directory.yaml "$BACKUP_DIR/"
cp src/IssuerDirectory/StellaOps.IssuerDirectory/data/csaf-publishers.json "$BACKUP_DIR/"
```
4. **Capture secrets**
```bash
grep '^ISSUER_DIRECTORY_POSTGRES_CONNECTION_STRING=' dev.env > "$BACKUP_DIR/issuer-directory.postgres.secret"
chmod 600 "$BACKUP_DIR/issuer-directory.postgres.secret"
```
5. **Generate checksums and encrypt**
```bash
(cd "$BACKUP_DIR" && sha256sum * > SHA256SUMS)
tar czf "$BACKUP_DIR.tar.gz" -C "$BACKUP_DIR" .
age -r you@example.org "$BACKUP_DIR.tar.gz" > "$BACKUP_DIR.tar.gz.age"
```
## Cold backup (planned downtime)
1. Notify stakeholders and pause automation calling the API.
2. Stop services:
```bash
docker compose -f devops/compose/docker-compose.prod.yaml down issuer-directory
```
(For Helm: `kubectl scale deploy stellaops-issuer-directory --replicas=0`.)
3. Snapshot volumes:
```bash
docker run --rm -v postgres-data:/data \
-v "$(pwd)":/backup busybox tar czf /backup/postgres-data-$(date +%Y%m%d).tar.gz -C /data .
```
4. Copy configuration, seeds, and secrets as in the hot backup.
5. Restart services and confirm `/health/live` returns `200 OK`.
## Restore procedure
1. **Provision clean volumes**
- Compose: `docker volume rm postgres-data` (optional) then `docker compose up -d postgres`.
- Helm: delete the PostgreSQL PVC or attach a fresh volume snapshot.
2. **Restore PostgreSQL**
```bash
docker compose exec -T postgres \
pg_restore --format=custom --clean --if-exists \
--dbname=issuer_directory < issuer-directory-YYYYMMDDTHHMMSSZ.dump
```
3. **Restore configuration/secrets**
- Copy `issuer-directory.yaml` into `etc/`.
- Reapply the secret: `kubectl apply -f issuer-directory-secret.yaml` or repopulate `.env`.
4. **Restore CSAF seeds** (optional)
- If you maintain a customised seed file, copy it back before starting the container. Otherwise the bundled file will be used.
5. **Start services**
```bash
docker compose up -d issuer-directory
# or
kubectl scale deploy stellaops-issuer-directory --replicas=1
```
6. **Validate**
- `curl -fsSLk https://localhost:8447/health/live`
- Issue an access token and list issuers to confirm results.
- Check PostgreSQL counts match expectations (`SELECT COUNT(*) FROM issuer_directory.issuers;`, etc.).
- Confirm Prometheus scrapes `issuer_directory_changes_total` and `issuer_directory_key_operations_total` for the tenants you restored.
## Disaster recovery notes
- **Retention:** Maintain 30 daily + 12 monthly archives. Store copies in geographically separate, access-controlled vaults.
- **Audit reconciliation:** Ensure `issuer_audit` entries cover the restore window; export them for compliance.
- **Seed replay:** If the CSAF seed file was lost, set `ISSUER_DIRECTORY_SEED_CSAF=true` for the first restart to rehydrate the global tenant.
- **Testing:** Run quarterly restore drills in a staging environment to validate procedure drift.
## Verification checklist
- [ ] `/health/live` returns `200 OK`.
- [ ] PostgreSQL tables (`issuers`, `issuer_keys`, `issuer_trust_overrides`) have expected counts.
- [ ] `issuer_directory_changes_total`, `issuer_directory_key_operations_total`, and `issuer_directory_key_validation_failures_total` metrics resume within 1 minute.
- [ ] Audit entries exist for post-restore CRUD activity.
- [ ] Client integrations (VEX Lens, Excititor) resolve issuers successfully.

# Issuer Directory Deployment Guide
## Scope
- **Applies to:** Issuer Directory WebService (`stellaops/issuer-directory-web`) running via the provided Docker Compose bundles (`devops/compose/docker-compose.*.yaml`) or the Helm chart (`devops/helm/stellaops`).
- **Covers:** Environment prerequisites, secret handling, Compose + Helm rollout steps, and post-deploy verification.
- **Audience:** Platform/DevOps engineers responsible for Identity & Signing sprint deliverables.
## 1 · Prerequisites
- Authority must be running and reachable at the issuer URL you configure (default Compose host: `https://authority:8440`).
- PostgreSQL 16+ with credentials for the `issuer_directory` database (Compose defaults to the user defined in `.env`).
- Network access to Authority, PostgreSQL, and (optionally) Prometheus if you scrape metrics.
- Issuer Directory configuration file `etc/issuer-directory.yaml` checked and customised for your environment (tenant header, audiences, telemetry level, CSAF seed path).
> **Secrets:** Use `etc/secrets/issuer-directory.postgres.secret.example` as a template. Store the real connection string in an untracked file or secrets manager and reference it via environment variables (`ISSUER_DIRECTORY_POSTGRES_CONNECTION_STRING`) rather than committing credentials.
## 2 · Deploy with Docker Compose
1. **Prepare environment variables**
```bash
cp devops/compose/env/dev.env.example dev.env
cp etc/secrets/issuer-directory.postgres.secret.example issuer-directory.postgres.env
# Edit dev.env and issuer-directory.postgres.env with production-ready secrets.
```
2. **Inspect the merged configuration**
```bash
docker compose \
--env-file dev.env \
--env-file issuer-directory.postgres.env \
-f devops/compose/docker-compose.dev.yaml config
```
The command confirms the new `issuer-directory` service resolves the port (`${ISSUER_DIRECTORY_PORT:-8447}`) and the PostgreSQL connection string is in place.
3. **Launch the stack**
```bash
docker compose \
--env-file dev.env \
--env-file issuer-directory.postgres.env \
-f devops/compose/docker-compose.dev.yaml up -d issuer-directory
```
Compose automatically mounts `../../etc/issuer-directory.yaml` into the container at `/etc/issuer-directory.yaml`, seeds CSAF publishers, and exposes the API on `https://localhost:8447`.
### Compose environment variables
| Variable | Purpose | Default |
| --- | --- | --- |
| `ISSUER_DIRECTORY_PORT` | Host port that maps to container port `8080`. | `8447` |
| `ISSUER_DIRECTORY_POSTGRES_CONNECTION_STRING` | Injected into `ISSUERDIRECTORY__POSTGRES__CONNECTIONSTRING`; should contain credentials. | `Host=postgres;Port=5432;Database=issuer_directory;Username=${POSTGRES_USER};Password=${POSTGRES_PASSWORD}` |
| `ISSUER_DIRECTORY_SEED_CSAF` | Toggles CSAF bootstrap on startup. Set to `false` after the first production import if you manage issuers manually. | `true` |
4. **Smoke test**
```bash
curl -k https://localhost:8447/health/live
stellaops-cli issuer-directory issuers list \
--base-url https://localhost:8447 \
--tenant demo \
--access-token "$(stellaops-cli auth token issue --scope issuer-directory:read)"
```
5. **Upgrade & rollback**
- Update Compose images to the desired release manifest (`deploy/releases/*.yaml`), re-run `docker compose config`, then `docker compose up -d`.
- Rollbacks follow the same steps with the previous manifest. PostgreSQL schemas are backwards compatible within `2025.10.x`.
## 3 · Deploy with Helm
1. **Create or update the secret**
```bash
kubectl create secret generic issuer-directory-secrets \
--from-literal=ISSUERDIRECTORY__POSTGRES__CONNECTIONSTRING='Host=stellaops-postgres;Port=5432;Database=issuer_directory;Username=stellaops;Password=<password>' \
--dry-run=client -o yaml | kubectl apply -f -
```
Add optional overrides (e.g. `ISSUERDIRECTORY__AUTHORITY__ISSUER`) if your Authority issuer differs from the default.
2. **Template for validation**
```bash
helm template issuer-directory devops/helm/stellaops \
-f devops/helm/stellaops/values-prod.yaml \
--set services.issuer-directory.env.ISSUERDIRECTORY__AUTHORITY__ISSUER=https://authority.prod.stella-ops.org \
> /tmp/issuer-directory.yaml
```
3. **Install / upgrade**
```bash
helm upgrade --install stellaops devops/helm/stellaops \
-f devops/helm/stellaops/values-prod.yaml \
--set services.issuer-directory.env.ISSUERDIRECTORY__AUTHORITY__ISSUER=https://authority.prod.stella-ops.org
```
The chart provisions:
- ConfigMap `stellaops-issuer-directory-config` with `IssuerDirectory` settings.
- Deployment `stellaops-issuer-directory` with readiness/liveness probes on `/health/live`.
- Service on port `8080` (ClusterIP by default).
4. **Expose for operators (optional)**
- Use an Ingress/HTTPRoute to publish `https://issuer-directory.<env>.stella-ops.org`.
- Ensure the upstream includes DPoP headers if proxied through an API gateway.
5. **Post-deploy validation**
```bash
kubectl exec deploy/stellaops-issuer-directory -- \
curl -sf http://127.0.0.1:8080/health/live
kubectl logs deploy/stellaops-issuer-directory | grep 'IssuerDirectory PostgreSQL connected'
```
Prometheus should begin scraping `issuer_directory_changes_total` and related metrics (labels: `tenant`, `issuer`, `action`).
## 4 · Operational checklist
- **Secrets:** Connection strings live in `issuer-directory-secrets` (Helm) or an `.env` file stored in your secrets vault (Compose). Rotate credentials via secret update + pod restart.
- **Audit streams:** Confirm the `issuer_directory.issuer_audit` table receives entries when CRUD operations run; export logs for compliance.
- **Tenants:** The service enforces the `X-StellaOps-Tenant` header. For multi-tenant staging, configure the reverse proxy to inject the correct tenant or issue scoped tokens.
- **CSAF seeds:** `ISSUER_DIRECTORY_SEED_CSAF=true` replays `data/csaf-publishers.json` on startup. Set to `false` once production tenants are fully managed, or override `csafSeedPath` with a curated bundle.
- **Release alignment:** Before promotion, run `deploy/tools/validate-profiles.sh` to lint Compose/Helm bundles, then verify the new `issuer-directory-web` entry in `deploy/releases/2025.10-edge.yaml` (or the relevant manifest) matches the channel you intend to ship.

# Issuer Directory Offline Kit Notes
## Purpose
Operators bundling StellaOps for fully disconnected environments must include the Issuer Directory service so VEX Lens, Excititor, and Policy Engine can resolve trusted issuers without reaching external registries.
## 1 · Bundle contents
Include the following artefacts in your Offline Update Kit staging tree:
| Path (within kit) | Source | Notes |
| --- | --- | --- |
| `images/issuer-directory-web.tar` | `registry.stella-ops.org/stellaops/issuer-directory-web` (digest from `deploy/releases/<channel>.yaml`) | Export with `crane pull --format=tar` or `skopeo copy docker://... oci:...`. |
| `config/issuer-directory/issuer-directory.yaml` | `etc/issuer-directory.yaml` (customised) | Replace Authority issuer, tenant header, and log level as required. |
| `config/issuer-directory/csaf-publishers.json` | `src/IssuerDirectory/StellaOps.IssuerDirectory/data/csaf-publishers.json` or regional override | Operators can edit before import to add private publishers. |
| `secrets/issuer-directory/connection.env` | Secure secret store export (`ISSUER_DIRECTORY_POSTGRES_CONNECTION_STRING=`) | Encrypt at rest; Offline Kit importer places it in the Compose/Helm secret. |
| `env/issuer-directory.env` (optional) | Curated `.env` snippet (for example `ISSUER_DIRECTORY_SEED_CSAF=false`) | Helps operators disable reseeding after their first import without editing the main profile. |
| `docs/issuer-directory/deployment.md` | `docs/modules/issuer-directory/operations/deployment.md` | Ship alongside kit documentation for operators. |
> **Image digests:** Update `deploy/releases/2025.10-edge.yaml` (or the relevant manifest) with the exact digest before building the kit so `offline-manifest.json` can assert integrity.
## 2 · Compose (air-gapped) deployment
1. Load images locally on the target:
```bash
docker load < images/issuer-directory-web.tar
```
2. Copy Compose artefacts:
```bash
cp devops/compose/docker-compose.airgap.yaml .
cp devops/compose/env/airgap.env.example airgap.env
cp secrets/issuer-directory/connection.env issuer-directory.postgres.env
```
3. Update `airgap.env` with site-specific values (Authority issuer, tenant, ports) and remove outbound endpoints.
4. Bring up the service:
```bash
docker compose \
--env-file airgap.env \
--env-file issuer-directory.postgres.env \
-f docker-compose.airgap.yaml up -d issuer-directory
```
5. Verify via `curl -k https://issuer-directory.airgap.local:8447/health/live`.
## 3 · Kubernetes (air-gapped) deployment
1. Pre-load the OCI image into your local registry mirror and update `values-airgap.yaml` to reference it.
2. Apply the secret bundled in the kit:
```bash
kubectl apply -f secrets/issuer-directory/connection-secret.yaml
```
(Generate this file during packaging with `kubectl create secret generic issuer-directory-secrets ... --dry-run=client -o yaml`.)
3. Install/upgrade the chart:
```bash
helm upgrade --install stellaops devops/helm/stellaops \
-f devops/helm/stellaops/values-airgap.yaml \
--set services.issuer-directory.env.ISSUERDIRECTORY__AUTHORITY__ISSUER=https://authority.airgap.local/realms/stellaops
```
4. Confirm `issuer_directory_changes_total` is visible in your offline Prometheus stack.
## 4 · Import workflow summary
1. Run `ops/offline-kit/build_offline_kit.py` with the additional artefacts noted above.
2. Sign the resulting tarball and manifest (Cosign) and record the SHA-256 in the release notes.
3. At the destination:
```bash
stellaops-cli offline kit import \
--bundle stella-ops-offline-kit-<version>-airgap.tar.gz \
--destination /opt/stellaops/offline-kit
```
4. Follow the Compose or Helm path depending on your topology.
## 5 · Post-import validation
- [ ] `docker images | grep issuer-directory` (Compose) or `kubectl get deploy stellaops-issuer-directory` (Helm) shows the expected version.
- [ ] `csaf-publishers.json` in the container matches the offline bundle (hash check).
- [ ] `/issuer-directory/issuers` returns global seed issuers (requires token with `issuer-directory:read` scope).
- [ ] Audit table (`issuer_audit`) receives entries when you create/update issuers offline.
- [ ] Offline kit manifest (`offline-manifest.json`) lists `images/issuer-directory-web.tar` and `config/issuer-directory/issuer-directory.yaml` with SHA-256 values you recorded during packaging.
- [ ] Prometheus in the offline environment reports `issuer_directory_changes_total` for the tenants imported from the kit.

# OpsMemory Module
> **Decision Ledger for Playbook Learning**
OpsMemory is a structured ledger of prior security decisions and their outcomes. It enables playbook learning: understanding which decisions led to good outcomes, and surfacing institutional knowledge for similar situations.
## What OpsMemory Is
- **Decision + Outcome pairs**: Every security decision is recorded with its eventual outcome
- **Success/failure classification**: Learn what worked and what didn't
- **Similar situation matching**: Find past decisions in comparable scenarios
- **Playbook suggestions**: Surface recommendations based on historical success
## What OpsMemory Is NOT
- ❌ Chat history (that's conversation storage)
- ❌ Audit logs (that's the Timeline)
- ❌ VEX statements (that's Excititor)
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ OpsMemory Service │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌──────────────────┐ ┌───────────────┐ │
│ │ Decision │ │ Playbook │ │ Outcome │ │
│ │ Recording │ │ Suggestion │ │ Tracking │ │
│ └──────┬──────┘ └────────┬─────────┘ └───────┬───────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ IOpsMemoryStore │ │
│ │ (PostgreSQL with similarity vectors) │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
```
## Core Components
### OpsMemoryRecord
The core data structure capturing a decision and its context:
```json
{
"memoryId": "mem-abc123",
"tenantId": "tenant-xyz",
"recordedAt": "2026-01-07T12:00:00Z",
"situation": {
"cveId": "CVE-2023-44487",
"component": "pkg:npm/http2@1.0.0",
"severity": "high",
"reachability": "reachable",
"epssScore": 0.97,
"isKev": true,
"contextTags": ["production", "external-facing", "payment-service"]
},
"decision": {
"action": "Remediate",
"rationale": "KEV + reachable + payment service = immediate remediation",
"decidedBy": "security-team",
"decidedAt": "2026-01-07T12:00:00Z",
"policyReference": "policy/critical-kev.rego"
},
"outcome": {
"status": "Success",
"resolutionTime": "4:30:00",
"lessonsLearned": "Upgrade was smooth, no breaking changes",
"recordedAt": "2026-01-07T16:30:00Z"
}
}
```
### Decision Actions
| Action | Description |
|--------|-------------|
| `Accept` | Accept the risk (no action) |
| `Remediate` | Upgrade/patch the component |
| `Quarantine` | Isolate the component |
| `Mitigate` | Apply compensating controls (WAF, config) |
| `Defer` | Defer for later review |
| `Escalate` | Escalate to security team |
| `FalsePositive` | Mark as not applicable |
### Outcome Status
| Status | Description |
|--------|-------------|
| `Success` | Decision led to successful resolution |
| `PartialSuccess` | Decision led to partial resolution |
| `Ineffective` | Decision was ineffective |
| `NegativeOutcome` | Decision led to negative consequences |
| `Pending` | Outcome still pending |
## API Reference
### Record a Decision
```http
POST /api/v1/opsmemory/decisions
Content-Type: application/json
{
"tenantId": "tenant-xyz",
"cveId": "CVE-2023-44487",
"componentPurl": "pkg:npm/http2@1.0.0",
"severity": "high",
"reachability": "reachable",
"epssScore": 0.97,
"action": "Remediate",
"rationale": "KEV + reachable + payment service",
"decidedBy": "alice@example.com",
"contextTags": ["production", "payment-service"]
}
```
**Response:**
```json
{
"memoryId": "abc123def456",
"recordedAt": "2026-01-07T12:00:00Z"
}
```
### Record an Outcome
```http
POST /api/v1/opsmemory/decisions/{memoryId}/outcome?tenantId=tenant-xyz
Content-Type: application/json
{
"status": "Success",
"resolutionTimeMinutes": 270,
"lessonsLearned": "Upgrade was smooth, no breaking changes",
"recordedBy": "alice@example.com"
}
```
### Get Playbook Suggestions
```http
GET /api/v1/opsmemory/suggestions?tenantId=tenant-xyz&cveId=CVE-2024-1234&severity=high&reachability=reachable
```
**Response:**
```json
{
"suggestions": [
{
"suggestedAction": "Remediate",
"confidence": 0.87,
"rationale": "87% confidence based on 15 similar past decisions. Remediation succeeded in 93% of high-severity reachable vulnerabilities.",
"successRate": 0.93,
"similarDecisionCount": 15,
"averageResolutionTimeMinutes": 180,
"evidence": [
{
"memoryId": "abc123",
"similarity": 0.92,
"action": "Remediate",
"outcome": "Success",
"cveId": "CVE-2023-44487"
}
],
"matchingFactors": [
"Same severity: high",
"Same reachability: Reachable",
"Both are KEV",
"Shared context: production"
]
}
],
"analyzedRecords": 15,
"topSimilarity": 0.92
}
```
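The `confidence` and `successRate` fields can be derived from the matched records along these lines. The formula below (action agreement times success rate among the majority action) is an illustrative assumption, not the shipped algorithm.

```python
from collections import Counter

def suggest(records: list, min_similarity: float = 0.6):
    """records: [{'action': ..., 'outcome': ..., 'similarity': ...}, ...]"""
    similar = [r for r in records if r["similarity"] >= min_similarity]
    if not similar:
        return None
    # Majority action among sufficiently similar past decisions.
    action, count = Counter(r["action"] for r in similar).most_common(1)[0]
    matching = [r for r in similar if r["action"] == action]
    successes = sum(1 for r in matching if r["outcome"] == "Success")
    success_rate = successes / len(matching)
    # Confidence grows with both agreement and historical success (illustrative).
    confidence = round((count / len(similar)) * success_rate, 2)
    return {
        "suggestedAction": action,
        "successRate": success_rate,
        "confidence": confidence,
        "similarDecisionCount": len(similar),
    }
```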
### Query Past Decisions
```http
GET /api/v1/opsmemory/decisions?tenantId=tenant-xyz&action=Remediate&pageSize=20
```
### Get Statistics
```http
GET /api/v1/opsmemory/stats?tenantId=tenant-xyz
```
**Response:**
```json
{
"tenantId": "tenant-xyz",
"totalDecisions": 1250,
"decisionsWithOutcomes": 980,
"successRate": 0.87
}
```
## Similarity Algorithm
OpsMemory uses a 50-dimensional vector to represent each security situation:
| Dimensions | Feature |
|------------|---------|
| 0-9 | CVE category (memory, injection, auth, crypto, dos, etc.) |
| 10-14 | Severity (none, low, medium, high, critical) |
| 15-18 | Reachability (unknown, reachable, not-reachable, potential) |
| 19-23 | EPSS band (0-0.2, 0.2-0.4, 0.4-0.6, 0.6-0.8, 0.8-1.0) |
| 24-28 | CVSS band (0-2, 2-4, 4-6, 6-8, 8-10) |
| 29 | KEV flag |
| 30-39 | Component type (npm, maven, pypi, nuget, go, cargo, etc.) |
| 40-49 | Context tags (production, external-facing, payment, etc.) |
Similarity is computed using **cosine similarity** between normalized vectors.
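A minimal sketch of the encoding and comparison, assuming one-hot segments laid out per the table above (the segment helpers and band arithmetic are illustrative, covering only the severity, reachability, EPSS, and KEV dimensions):

```python
import math

SEVERITIES = ["none", "low", "medium", "high", "critical"]             # dims 10-14
REACHABILITY = ["unknown", "reachable", "not-reachable", "potential"]  # dims 15-18

def encode(severity: str, reachability: str, epss: float, is_kev: bool) -> list:
    vec = [0.0] * 50
    vec[10 + SEVERITIES.index(severity)] = 1.0
    vec[15 + REACHABILITY.index(reachability)] = 1.0
    vec[19 + min(int(epss * 5), 4)] = 1.0   # EPSS band 0.2-wide, dims 19-23
    vec[29] = 1.0 if is_kev else 0.0        # KEV flag
    return vec

def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0
```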
## Integration Points
### Decision Recording Hook
OpsMemory integrates with the Findings Ledger to automatically capture decisions:
```csharp
public class OpsMemoryHook : IDecisionHook
{
public async Task OnDecisionRecordedAsync(FindingDecision decision)
{
var record = new OpsMemoryRecord
{
TenantId = decision.TenantId,
Situation = ExtractSituation(decision),
Decision = ExtractDecision(decision)
};
// Fire-and-forget to not block the decision flow
_ = _store.RecordDecisionAsync(record);
}
}
```
### Outcome Tracking
The OutcomeTrackingService monitors for resolution events and prompts users:
1. **Auto-detect resolution**: When a finding is marked resolved
2. **Calculate resolution time**: Time from decision to resolution
3. **Prompt for classification**: Ask user about outcome quality
4. **Link to original decision**: Update the OpsMemory record
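Step 2 (resolution-time calculation) reduces to a timestamp difference. This sketch reproduces the worked example above (decision at 12:00Z, outcome at 16:30Z, 270 minutes); the function name is illustrative.

```python
from datetime import datetime

def resolution_minutes(decided_at: str, resolved_at: str) -> int:
    """Whole minutes between the recorded decision and the resolution event."""
    # Normalize the trailing 'Z' so fromisoformat works on older Pythons too.
    start = datetime.fromisoformat(decided_at.replace("Z", "+00:00"))
    end = datetime.fromisoformat(resolved_at.replace("Z", "+00:00"))
    return int((end - start).total_seconds() // 60)
```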
## Configuration
```yaml
opsmemory:
connectionString: "Host=localhost;Database=stellaops"
similarity:
minThreshold: 0.6 # Minimum similarity for suggestions
maxResults: 10 # Maximum similar records to analyze
suggestions:
maxSuggestions: 3 # Maximum suggestions to return
minConfidence: 0.5 # Minimum confidence threshold
outcomeTracking:
autoPromptDelay: 24h # Delay before prompting for outcome
reminderInterval: 7d # Reminder interval for pending outcomes
```
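The `24h`/`7d` shorthand in `outcomeTracking` needs parsing on load. A minimal parser, assuming only `s`/`m`/`h`/`d` suffixes are accepted (an assumption; the full grammar isn't documented here):

```python
from datetime import timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}

def parse_duration(value: str) -> timedelta:
    """Parse config shorthand like '24h' or '7d' into a timedelta."""
    unit = value[-1]
    if unit not in _UNITS:
        raise ValueError(f"unknown duration unit: {unit!r}")
    return timedelta(**{_UNITS[unit]: int(value[:-1])})
```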
## Database Schema
```sql
CREATE SCHEMA IF NOT EXISTS opsmemory;
CREATE TABLE opsmemory.decisions (
memory_id TEXT PRIMARY KEY,
tenant_id TEXT NOT NULL,
recorded_at TIMESTAMPTZ NOT NULL,
-- Situation (JSONB for flexibility)
situation JSONB NOT NULL,
-- Decision (JSONB)
decision JSONB NOT NULL,
-- Outcome (nullable, updated later)
outcome JSONB,
-- Similarity vector (array for simple cosine similarity)
similarity_vector REAL[] NOT NULL
);
CREATE INDEX idx_decisions_tenant ON opsmemory.decisions(tenant_id);
CREATE INDEX idx_decisions_recorded ON opsmemory.decisions(recorded_at DESC);
CREATE INDEX idx_decisions_cve ON opsmemory.decisions((situation->>'cveId'));
```
## Best Practices
### Recording Decisions
1. **Include context tags**: The more context, the better similarity matching
2. **Document rationale**: Future users benefit from understanding why
3. **Reference policies**: Link to the policy that guided the decision
### Recording Outcomes
1. **Be timely**: Record outcomes as soon as resolution is confirmed
2. **Be honest**: Failed decisions are valuable learning data
3. **Add lessons learned**: Help future users avoid pitfalls
### Using Suggestions
1. **Review evidence**: Look at the similar past decisions
2. **Check matching factors**: Ensure the situations are truly comparable
3. **Trust but verify**: Suggestions are guidance, not mandates
## Related Modules
- [Findings Ledger](../findings-ledger/README.md) - Source of decision events
- [Timeline](../timeline-indexer/README.md) - Audit trail
- [Excititor](../excititor/README.md) - VEX statement management
- [Risk Engine](../risk-engine/README.md) - Risk scoring

# OpsMemory Architecture
> **Technical deep-dive into the Decision Ledger**
## Overview
OpsMemory provides a structured approach to organizational learning from security decisions. It captures the complete lifecycle of a decision - from the situation context through the action taken to the eventual outcome.
## Design Principles
### 1. Determinism First
All operations produce deterministic, reproducible results:
- Similarity vectors are computed from stable inputs
- Confidence scores use fixed formulas
- No randomness in suggestion ranking
### 2. Multi-Tenant Isolation
Every operation is scoped to a tenant:
- Records cannot be accessed across tenants
- Similarity search is tenant-isolated
- Statistics are per-tenant
### 3. Fire-and-Forget Integration
Decision recording is async and non-blocking:
- UI decisions complete immediately
- OpsMemory recording happens in background
- Failures don't affect the primary flow
### 4. Offline Capable
All features work without network access:
- Local PostgreSQL storage
- No external API dependencies
- Self-contained similarity computation
## Component Architecture
```
┌────────────────────────────────────────────────────────────────────┐
│ WebService Layer │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ OpsMemoryEndpoints │ │
│ │ POST /decisions GET /decisions GET /suggestions GET /stats│ │
│ └──────────────────────────────────────────────────────────────┘ │
└────────────────────────────────┬───────────────────────────────────┘
┌────────────────────────────────┼───────────────────────────────────┐
│ Service Layer │
│ ┌─────────────────┐ ┌─────────────────┐ ┌────────────────────┐ │
│ │ PlaybookSuggest │ │ OutcomeTracking │ │ SimilarityVector │ │
│ │ Service │ │ Service │ │ Generator │ │
│ └────────┬────────┘ └────────┬────────┘ └─────────┬──────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ IOpsMemoryStore │ │
│ └──────────────────────────────────────────────────────────────┘ │
└────────────────────────────────┬───────────────────────────────────┘
┌────────────────────────────────┼───────────────────────────────────┐
│ Storage Layer │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ PostgresOpsMemoryStore │ │
│ │ - Decision CRUD │ │
│ │ - Outcome updates │ │
│ │ - Similarity search (array-based cosine) │ │
│ │ - Query with pagination │ │
│ │ - Statistics aggregation │ │
│ └──────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────┘
```
## Data Model
### OpsMemoryRecord
The core aggregate containing all decision information:
```csharp
public sealed record OpsMemoryRecord
{
public required string MemoryId { get; init; }
public required string TenantId { get; init; }
public required DateTimeOffset RecordedAt { get; init; }
public required SituationContext Situation { get; init; }
public required DecisionRecord Decision { get; init; }
public OutcomeRecord? Outcome { get; init; }
public ImmutableArray<float> SimilarityVector { get; init; }
}
```
### SituationContext
Captures the security context at decision time:
```csharp
public sealed record SituationContext
{
public string? CveId { get; init; }
public string? Component { get; init; } // PURL
public string? Severity { get; init; } // low/medium/high/critical
public ReachabilityStatus Reachability { get; init; }
public double? EpssScore { get; init; } // 0-1
public double? CvssScore { get; init; } // 0-10
public bool IsKev { get; init; }
public ImmutableArray<string> ContextTags { get; init; }
}
```
### DecisionRecord
The action taken and why:
```csharp
public sealed record DecisionRecord
{
public required DecisionAction Action { get; init; }
public required string Rationale { get; init; }
public required string DecidedBy { get; init; }
public required DateTimeOffset DecidedAt { get; init; }
public string? PolicyReference { get; init; }
public MitigationDetails? Mitigation { get; init; }
}
```
### OutcomeRecord
The result of the decision:
```csharp
public sealed record OutcomeRecord
{
public required OutcomeStatus Status { get; init; }
public TimeSpan? ResolutionTime { get; init; }
public string? ActualImpact { get; init; }
public string? LessonsLearned { get; init; }
public required string RecordedBy { get; init; }
public required DateTimeOffset RecordedAt { get; init; }
}
```
## Similarity Algorithm
### Vector Generation
The `SimilarityVectorGenerator` creates 50-dimensional feature vectors:
```
Vector Layout:
[0-9] : CVE category one-hot (memory, injection, auth, crypto, dos,
info-disclosure, privilege-escalation, xss, path-traversal, other)
[10-14] : Severity one-hot (none, low, medium, high, critical)
[15-18] : Reachability one-hot (unknown, reachable, not-reachable, potential)
[19-23] : EPSS band one-hot (0-0.2, 0.2-0.4, 0.4-0.6, 0.6-0.8, 0.8-1.0)
[24-28] : CVSS band one-hot (0-2, 2-4, 4-6, 6-8, 8-10)
[29] : KEV flag (0 or 1)
[30-39] : Component type one-hot (npm, maven, pypi, nuget, go, cargo,
deb, rpm, apk, other)
[40-49] : Context tag presence (production, development, staging,
external-facing, internal, payment, auth, data, api, frontend)
```
### Cosine Similarity
Similarity between vectors A and B:
```
similarity = (A · B) / (||A|| × ||B||)
```
Where `A · B` is the dot product and `||A||` is the L2 norm.
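The formula above translates directly into code; this sketch guards against the zero-vector case (returning 0.0), an edge-case convention assumed here rather than taken from the source:

```python
import math

def cosine_similarity(a, b):
    # similarity = (A . B) / (||A|| * ||B||); 0.0 when either vector is all zeros
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```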
### CVE Classification
CVEs are classified by analyzing keywords in the CVE ID and description:
| Category | Keywords |
|----------|----------|
| memory | buffer, overflow, heap, stack, use-after-free |
| injection | sql, command, code injection, ldap |
| auth | authentication, authorization, bypass |
| crypto | cryptographic, encryption, key |
| dos | denial of service, resource exhaustion |
| info-disclosure | information disclosure, leak |
| privilege-escalation | privilege escalation, elevation |
| xss | cross-site scripting, xss |
| path-traversal | path traversal, directory traversal |
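A minimal first-match classifier over the keyword table might look like the sketch below. The real implementation may weight categories or scan in a different order; this merely demonstrates the lookup shape.

```python
# Hypothetical keyword classifier mirroring the table above.
CATEGORY_KEYWORDS = {
    "memory": ["buffer", "overflow", "heap", "stack", "use-after-free"],
    "injection": ["sql", "command", "code injection", "ldap"],
    "auth": ["authentication", "authorization", "bypass"],
    "crypto": ["cryptographic", "encryption", "key"],
    "dos": ["denial of service", "resource exhaustion"],
    "info-disclosure": ["information disclosure", "leak"],
    "privilege-escalation": ["privilege escalation", "elevation"],
    "xss": ["cross-site scripting", "xss"],
    "path-traversal": ["path traversal", "directory traversal"],
}

def classify_cve(description):
    text = description.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(kw in text for kw in keywords):
            return category
    return "other"
```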
## Playbook Suggestion Algorithm
### Confidence Calculation
```csharp
confidence = baseSimilarity
× successRateBonus
× recencyBonus
× evidenceCountBonus
```
Where:
- `baseSimilarity`: Highest similarity score from matching records
- `successRateBonus`: `1 + (successRate - 0.5) * 0.5` (rewards high success rate)
- `recencyBonus`: more recent decisions are weighted higher
- `evidenceCountBonus`: more supporting records yield higher confidence
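Only the success-rate bonus formula is fixed by the spec above; the recency and evidence-count shapes below are illustrative assumptions to make the composition concrete:

```python
# Sketch of the documented confidence formula. success_bonus matches the
# spec; recency_bonus and evidence_bonus shapes are assumed, not specified.
def confidence(base_similarity, success_rate, days_since_last, evidence_count):
    success_bonus = 1 + (success_rate - 0.5) * 0.5
    recency_bonus = 1.0 if days_since_last <= 30 else 0.9   # assumed shape
    evidence_bonus = min(1.0 + 0.02 * evidence_count, 1.2)  # assumed cap
    return base_similarity * success_bonus * recency_bonus * evidence_bonus
```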
### Suggestion Ranking
1. Group past decisions by action taken
2. For each action, calculate:
- Average similarity of records with that action
- Success rate for that action
- Number of similar decisions
3. Compute confidence score
4. Rank by confidence descending
5. Return top N suggestions
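The five steps above can be sketched as follows, using `(action, similarity, succeeded)` tuples as a stand-in for past records; the scoring here applies only the specified success-rate bonus and omits the recency and evidence bonuses for brevity:

```python
from collections import defaultdict

# Hypothetical ranking over past records: (action, similarity, succeeded).
def rank_actions(records, top_n=3):
    grouped = defaultdict(list)
    for action, similarity, succeeded in records:
        grouped[action].append((similarity, succeeded))
    suggestions = []
    for action, entries in grouped.items():
        avg_sim = sum(s for s, _ in entries) / len(entries)
        success_rate = sum(1 for _, ok in entries if ok) / len(entries)
        score = avg_sim * (1 + (success_rate - 0.5) * 0.5)
        suggestions.append((action, score, len(entries)))
    suggestions.sort(key=lambda s: s[1], reverse=True)  # confidence descending
    return suggestions[:top_n]
```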
### Rationale Generation
Rationales are generated programmatically:
```
"{confidence}% confidence based on {count} similar past decisions.
{action} succeeded in {successRate}% of {factors}."
```
## Storage Design
### PostgreSQL Schema
```sql
CREATE TABLE opsmemory.decisions (
memory_id TEXT PRIMARY KEY,
tenant_id TEXT NOT NULL,
recorded_at TIMESTAMPTZ NOT NULL,
-- Denormalized situation fields for indexing
cve_id TEXT,
component TEXT,
severity TEXT,
-- Full data as JSONB
situation JSONB NOT NULL,
decision JSONB NOT NULL,
outcome JSONB,
-- Similarity vector as array (not pgvector)
similarity_vector REAL[] NOT NULL
);
-- Indexes
CREATE INDEX idx_decisions_tenant ON opsmemory.decisions(tenant_id);
CREATE INDEX idx_decisions_recorded ON opsmemory.decisions(recorded_at DESC);
CREATE INDEX idx_decisions_cve ON opsmemory.decisions(cve_id) WHERE cve_id IS NOT NULL;
CREATE INDEX idx_decisions_component ON opsmemory.decisions(component) WHERE component IS NOT NULL;
```
### Why Not pgvector?
The current implementation uses PostgreSQL arrays instead of pgvector:
1. **Simpler deployment**: No extension installation required
2. **Smaller dataset**: OpsMemory is per-org, not global
3. **Adequate performance**: Array operations are fast enough for <100K records
4. **Future option**: Can migrate to pgvector if needed
### Cosine Similarity in SQL
```sql
-- Cosine similarity between query vector and stored vectors
SELECT memory_id,
(
SELECT SUM(a * b)
FROM UNNEST(similarity_vector, @query_vector) AS t(a, b)
) / (
SQRT((SELECT SUM(a * a) FROM UNNEST(similarity_vector) AS t(a))) *
SQRT((SELECT SUM(b * b) FROM UNNEST(@query_vector) AS t(b)))
) AS similarity
FROM opsmemory.decisions
WHERE tenant_id = @tenant_id
ORDER BY similarity DESC
LIMIT @top_k;
```
## API Design
### Endpoint Overview
| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/v1/opsmemory/decisions` | Record a new decision |
| GET | `/api/v1/opsmemory/decisions/{id}` | Get decision details |
| POST | `/api/v1/opsmemory/decisions/{id}/outcome` | Record outcome |
| GET | `/api/v1/opsmemory/suggestions` | Get playbook suggestions |
| GET | `/api/v1/opsmemory/decisions` | Query past decisions |
| GET | `/api/v1/opsmemory/stats` | Get statistics |
### Request/Response DTOs
The API uses string-based DTOs that convert to/from internal enums:
```csharp
// API accepts strings
public record RecordDecisionRequest
{
public required string Action { get; init; } // "Remediate", "Accept", etc.
public string? Reachability { get; init; } // "reachable", "not-reachable"
}
// Internal uses enums
public enum DecisionAction { Accept, Remediate, Quarantine, ... }
public enum ReachabilityStatus { Unknown, Reachable, NotReachable, Potential }
```
## Testing Strategy
### Unit Tests (26 tests)
**SimilarityVectorGeneratorTests:**
- Vector dimension validation
- Feature encoding (severity, reachability, EPSS, CVSS, KEV)
- Component type classification
- Context tag encoding
- Vector normalization
- Cosine similarity computation
- Matching factor detection
**PlaybookSuggestionServiceTests:**
- Empty history handling
- Single record suggestions
- Multiple record ranking
- Confidence calculation
- Rationale generation
- Evidence linking
### Integration Tests (5 tests)
**PostgresOpsMemoryStoreTests:**
- Decision persistence and retrieval
- Outcome updates
- Tenant isolation
- Query filtering
- Statistics calculation
## Performance Considerations
### Indexing Strategy
- Primary key on `memory_id` for direct lookups
- Index on `tenant_id` for isolation
- Index on `recorded_at` for recent-first queries
- Partial indexes on `cve_id` and `component` for filtered queries
### Query Optimization
- Limit similarity search to last N days by default
- Return only top-K similar records
- Use cursor-based pagination for large result sets
### Caching
Currently no caching (records are infrequently accessed). Future options:
- Cache similarity vectors in memory
- Cache recent suggestions per tenant
- Use read replicas for heavy read loads
## Future Enhancements
### pgvector Migration
If dataset grows significantly:
1. Install pgvector extension
2. Add vector column with IVFFlat index
3. Replace array-based similarity with vector operations
4. ~100x speedup for large datasets
### ML-Based Suggestions
Replace rule-based confidence with ML model:
1. Train on historical decision-outcome pairs
2. Include more features (time of day, team, etc.)
3. Use gradient boosting or neural network
4. Continuous learning from new outcomes
### Outcome Prediction
Predict outcome before decision is made:
1. Use past outcomes as training data
2. Predict success probability per action
3. Show predicted outcomes in UI
4. Track prediction accuracy over time

# OpsMemory Chat Integration
> **Connecting Decision Memory to AI-Assisted Workflows**
## Overview
The OpsMemory Chat Integration connects organizational decision memory to AdvisoryAI Chat, enabling:
1. **Context Enrichment**: Past relevant decisions surface automatically in chat
2. **Decision Recording**: New decisions from chat actions are auto-recorded
3. **Feedback Loop**: Outcomes improve future AI suggestions
4. **Object Linking**: Structured references to decisions, issues, and tactics
## Architecture
```
┌──────────────────────────────────────────────────────────────────────┐
│ Chat Session │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ User: "What should we do about CVE-2023-44487?" │ │
│ └────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ OpsMemoryChatProvider.EnrichContextAsync() │ │
│ │ → Query similar past decisions │ │
│ │ → Include known issues and tactics │ │
│ │ → Return top-3 with outcomes │ │
│ └────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ Prompt Assembly (via AdvisoryAiPromptContextEnricher) │ │
│ │ System: "Previous similar situations..." │ │
│ │ - CVE-2022-41903 (same category): Accepted, SUCCESS │ │
│ │ - CVE-2023-1234 (similar severity): Quarantined, SUCCESS │ │
│ │ Known Issues: [ops-mem:issue-xyz123] may apply │ │
│ └────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ Assistant Response with Object Links: │ │
│ │ "Based on 3 similar past decisions [ops-mem:dec-abc123]..." │ │
│ │ [Accept Risk]{action:approve,cve_id=CVE-2023-44487} │ │
│ └────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ (if action executed) │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ OpsMemoryDecisionRecorder.RecordFromActionAsync() │ │
│ │ → Extract situation from chat context │ │
│ │ → Record decision with action, rationale │ │
│ │ → Link to Run attestation │ │
│ └────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────┘
```
## Core Components
### IOpsMemoryChatProvider
The main interface for chat context enrichment:
```csharp
public interface IOpsMemoryChatProvider
{
/// <summary>
/// Enriches chat context with relevant past decisions.
/// </summary>
Task<OpsMemoryChatContext> EnrichContextAsync(
ChatEnrichmentRequest request,
CancellationToken ct = default);
/// <summary>
/// Records a decision made during a chat session.
/// </summary>
Task RecordDecisionAsync(
ChatDecisionRecord record,
CancellationToken ct = default);
}
```
**Location:** `src/AdvisoryAI/StellaOps.AdvisoryAI/Chat/Integration/IOpsMemoryChatProvider.cs`
### OpsMemoryChatProvider
Implementation that queries OpsMemory and formats results for chat:
- **Similarity Search**: Finds past decisions with similar CVE/severity/category
- **Known Issues**: Includes relevant documented issues
- **Tactics**: Surfaces applicable response tactics
- **Fire-and-Forget Recording**: Async decision capture without blocking UX
**Location:** `src/AdvisoryAI/StellaOps.AdvisoryAI/Chat/Integration/OpsMemoryChatProvider.cs`
### AdvisoryAiPromptContextEnricher
Transforms OpsMemory context into AI prompt format:
```csharp
public interface IAdvisoryAiPromptContextEnricher
{
/// <summary>
/// Enriches AI prompt with OpsMemory context.
/// </summary>
Task<PromptEnrichmentResult> EnrichAsync(
PromptEnrichmentRequest request,
CancellationToken ct = default);
}
```
**Location:** `src/AdvisoryAI/StellaOps.AdvisoryAI/Chat/Integration/AdvisoryAiPromptContextEnricher.cs`
## Object Link Format
OpsMemory uses structured object links for cross-referencing:
| Type | Format | Example |
|------|--------|---------|
| Decision | `[ops-mem:dec-{id}]` | `[ops-mem:dec-abc12345]` |
| Known Issue | `[ops-mem:issue-{id}]` | `[ops-mem:issue-xyz98765]` |
| Tactic | `[ops-mem:tactic-{id}]` | `[ops-mem:tactic-respond-001]` |
| Playbook | `[ops-mem:playbook-{id}]` | `[ops-mem:playbook-log4j-response]` |
### Link Resolution
The `OpsMemoryLinkResolver` resolves object links to display text and URLs:
```csharp
public interface IObjectLinkResolver
{
/// <summary>
/// Resolves an object link to display information.
/// </summary>
Task<ObjectLinkResolution?> ResolveAsync(
string objectLink,
CancellationToken ct = default);
}
```
**Example Resolution:**
```
Input: [ops-mem:dec-abc12345]
Output:
- DisplayText: "Accept Risk decision for CVE-2022-41903"
- Url: "/opsmemory/decisions/abc12345"
- Metadata: { outcome: "SUCCESS", actor: "security-team" }
```
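Extracting these links from assistant responses is a simple pattern match over the `[ops-mem:<type>-<id>]` grammar in the table above. A sketch (the actual `OpsMemoryLinkResolver` parsing rules may be stricter):

```python
import re

# Hypothetical parser for [ops-mem:<type>-<id>] object links.
LINK_RE = re.compile(r"\[ops-mem:(dec|issue|tactic|playbook)-([A-Za-z0-9-]+)\]")

def extract_links(text):
    """Return (type, id) pairs for every object link found in the text."""
    return [(m.group(1), m.group(2)) for m in LINK_RE.finditer(text)]
```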
## Configuration
```yaml
AdvisoryAI:
Chat:
OpsMemory:
# Enable OpsMemory integration
Enabled: true
# Maximum number of similar decisions to surface
MaxSuggestions: 3
# Minimum similarity score (0.0-1.0)
MinSimilarity: 0.5
# Include known issues in context
IncludeKnownIssues: true
# Include response tactics
IncludeTactics: true
# Automatically record decisions from actions
RecordDecisions: true
OpsMemory:
Integration:
# Link recorded decisions to AI Run attestations
AttestationLinking: true
# Don't block chat flow on recording
FireAndForget: true
```
## Known Issues and Tactics
### Known Issues
Document common false positives or expected behaviors:
```csharp
public interface IKnownIssueStore
{
Task<KnownIssue?> GetByIdAsync(string id, CancellationToken ct);
Task<IReadOnlyList<KnownIssue>> SearchAsync(
KnownIssueSearchRequest request, CancellationToken ct);
}
```
**Example Known Issue:**
```json
{
"id": "issue-log4j-test-code",
"title": "Log4j in test dependencies",
"description": "Log4j detected in test-scope dependencies is not exploitable in production",
"applies_to": {
"cve_pattern": "CVE-2021-44228",
"scope": "test"
},
"recommended_action": "accept_risk",
"status": "active"
}
```
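Matching a known issue against a finding amounts to checking its `applies_to` constraints. The sketch below mirrors the JSON fields above but treats `cve_pattern` as an exact match, since the real pattern semantics are not specified here:

```python
# Sketch of matching a known issue's applies_to block against a finding;
# field names mirror the JSON example, matching semantics are assumed.
def issue_applies(issue, cve_id, scope):
    if issue.get("status") != "active":
        return False
    applies = issue.get("applies_to", {})
    pattern = applies.get("cve_pattern")
    if pattern and pattern != cve_id:   # exact match assumed
        return False
    required_scope = applies.get("scope")
    if required_scope and required_scope != scope:
        return False
    return True
```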
### Response Tactics
Pre-defined response strategies for common situations:
```csharp
public interface ITacticStore
{
Task<Tactic?> GetByIdAsync(string id, CancellationToken ct);
Task<IReadOnlyList<Tactic>> GetMatchingTacticsAsync(
TacticMatchRequest request, CancellationToken ct);
}
```
**Example Tactic:**
```json
{
"id": "tactic-quarantine-critical",
"name": "Quarantine Critical Vulnerabilities",
"trigger": {
"severity": ["CRITICAL"],
"reachability": ["REACHABLE", "UNKNOWN"]
},
"steps": [
"Block deployment to production",
"Notify security team",
"Schedule remediation within 24h"
],
"automation": {
"action": "quarantine",
"notify_channel": "#security-alerts"
}
}
```
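Trigger evaluation for tactics is a membership test over the situation's severity and reachability. A minimal sketch, assuming each trigger field is an allow-list (the real `ITacticStore` matching may differ):

```python
# Sketch of tactic trigger matching against a finding's attributes.
def tactic_matches(tactic, severity, reachability):
    trigger = tactic.get("trigger", {})
    severities = trigger.get("severity")
    if severities and severity not in severities:
        return False
    reach = trigger.get("reachability")
    if reach and reachability not in reach:
        return False
    return True
```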
## Integration Points
### AdvisoryAI Integration
Register OpsMemory integration during startup:
```csharp
services.AddAdvisoryAIOpsMemoryIntegration(options =>
{
options.Enabled = true;
options.MaxSuggestions = 3;
options.IncludeKnownIssues = true;
options.IncludeTactics = true;
});
```
### Chat Flow Integration
The integration hooks into the chat pipeline:
1. **Pre-Prompt**: `AdvisoryAiPromptContextEnricher` adds OpsMemory context
2. **Response**: AI references past decisions with object links
3. **Post-Action**: `OpsMemoryChatProvider.RecordDecisionAsync` captures the decision
## Best Practices
### Tuning Similarity Threshold
- **0.3-0.5**: Broader matches, may include less relevant decisions
- **0.5-0.7**: Balanced - recommended starting point
- **0.7+**: Strict matching, only very similar situations
### Recording Quality Decisions
For decisions to be useful for future suggestions:
1. **Include Context**: CVE ID, severity, package information
2. **Clear Rationale**: Why this action was chosen
3. **Track Outcomes**: Update with SUCCESS/FAILURE after implementation
### Managing Known Issues
- Review quarterly for relevance
- Archive issues for CVEs that are fully remediated
- Keep issue descriptions actionable
## Testing
### Unit Tests
Located in `src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/Chat/Integration/`:
- `OpsMemoryChatProviderTests.cs` - Provider functionality
- `AdvisoryAiPromptContextEnricherTests.cs` - Prompt enrichment
### Integration Tests
Located in `src/OpsMemory/__Tests/StellaOps.OpsMemory.Tests/Integration/`:
- `OpsMemoryChatProviderIntegrationTests.cs` - Full flow with PostgreSQL
## Related Documentation
- [OpsMemory Architecture](architecture.md) - Core OpsMemory design
- [AdvisoryAI Architecture](../advisory-ai/architecture.md) - AI assistant design
- [Decision Recording API](../../api/opsmemory.md) - REST API reference
---
_Last updated: 10-Jan-2026_

# Packs Registry
> Task packs registry and distribution service.
## Purpose
PacksRegistry provides a centralized registry for distributable task packs, policy packs, and analyzer bundles. It enables versioned pack management with integrity verification and air-gap support.
## Quick Links
- [Architecture](./architecture.md) - Technical design and implementation details
- [Guides](./guides/) - Usage and configuration guides
## Status
| Attribute | Value |
|-----------|-------|
| **Maturity** | Production |
| **Last Reviewed** | 2025-12-29 |
| **Maintainer** | Platform Guild |
## Key Features
- **Centralized Registry**: Store and manage task packs, policy packs, and analyzer bundles
- **Versioned Management**: Semantic versioning with upgrade/downgrade support
- **Content-Addressed**: All packs are content-addressed with integrity verification
- **Offline Distribution**: Bundle export for air-gapped environments
## Dependencies
### Upstream (this module depends on)
- **PostgreSQL** - Pack metadata storage
- **RustFS/S3** - Pack content storage
- **Authority** - Authentication and authorization
### Downstream (modules that depend on this)
- **TaskRunner** - Consumes packs for execution
## Configuration
```yaml
packs_registry:
storage_backend: rustfs # or s3
max_pack_size_mb: 100
```
## Related Documentation
- [TaskRunner Architecture](../task-runner/architecture.md)

# component_architecture_packsregistry.md - **Stella Ops PacksRegistry** (2025Q4)
> Task packs registry and distribution service.
> **Scope.** Implementation-ready architecture for **PacksRegistry**: the registry for task packs, policy packs, and analyzer packs that can be distributed to TaskRunner instances.
---
## 0) Mission & boundaries
**Mission.** Provide a **centralized registry** for distributable task packs, policy packs, and analyzer bundles. Enable versioned pack management with integrity verification and air-gap support.
**Boundaries.**
* PacksRegistry **stores and distributes** packs; it does not execute them.
* Pack execution is handled by **TaskRunner**.
* All packs are **content-addressed** with integrity verification.
* Supports **offline distribution** via bundle export.
---
## 1) Solution & project layout
```
src/PacksRegistry/StellaOps.PacksRegistry/
├─ StellaOps.PacksRegistry.Core/ # Pack models, validation
├─ StellaOps.PacksRegistry.Infrastructure/ # Storage, distribution
├─ StellaOps.PacksRegistry.Persistence.EfCore/ # EF Core persistence
├─ StellaOps.PacksRegistry.WebService/ # REST API
├─ StellaOps.PacksRegistry.Worker/ # Background processing
└─ StellaOps.PacksRegistry.Tests/
src/PacksRegistry/__Libraries/
└─ StellaOps.PacksRegistry.Persistence/ # Persistence abstractions
```
---
## 2) External dependencies
* **PostgreSQL** - Pack metadata storage
* **RustFS/S3** - Pack content storage
* **Authority** - Authentication and authorization
* **TaskRunner** - Pack consumer
---
## 3) Contracts & data model
### 3.1 Pack
```json
{
"packId": "policy-baseline-v2",
"version": "2.1.0",
"type": "policy",
"name": "Baseline Security Policy",
"description": "Standard security policy pack",
"digest": "sha256:abc123...",
"size": 45678,
"publishedAt": "2025-01-15T10:30:00Z",
"author": "stellaops",
"dependencies": [],
"metadata": {
"minRunnerVersion": "1.5.0"
}
}
```
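Content-addressing means the `digest` field is authoritative: a consumer recomputes the hash of the downloaded blob and compares it to the recorded `sha256:...` value before use. A sketch of that check (helper name is illustrative):

```python
import hashlib

# Verify that a pack blob matches its recorded "sha256:..." digest
# before serving or executing it.
def verify_digest(content: bytes, recorded: str) -> bool:
    algo, _, expected = recorded.partition(":")
    if algo != "sha256":
        return False
    return hashlib.sha256(content).hexdigest() == expected
```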
### 3.2 Pack Types
| Type | Description |
|------|-------------|
| `policy` | Policy rule packs |
| `analyzer` | Scanner analyzer packs |
| `task` | TaskRunner task definitions |
| `bundle` | Composite packs |
---
## 4) REST API
```
GET /packs → { packs: PackSummary[] }
GET /packs/{id} → { pack: Pack }
GET /packs/{id}/versions → { versions: Version[] }
GET /packs/{id}/{version} → binary content
POST /packs { manifest, content } → { packId }
DELETE /packs/{id}/{version} → { deleted: bool }
GET /healthz | /readyz | /metrics
```
---
## Related Documentation
* TaskRunner: `../taskrunner/architecture.md`
* Policy: `../policy/architecture.md`

> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
# Task Pack Authoring Guide
This guide teaches engineers how to design, validate, and publish Task Packs that align with the Sprint43 specification. Follow these steps to ensure deterministic behaviour, secure approvals, and smooth hand-off to operators.
---
## 1·Prerequisites
- StellaOps CLI `>= 2025.10.0` with pack commands enabled.
- Authority client configured with `packs.write` (publish) and `packs.run` (local testing) scopes.
- Access to Task Runner staging environment for validation runs.
- Familiarity with the [Task Pack Specification](spec.md) and [Packs Registry](registry.md).
- Optional: connection to DevOps staging registry or Offline Kit mirror for publishing.
---
## 2·Design Checklist
1. **Define objective.** Document the operational need, inputs, expected outputs, and rollback strategy.
2. **Identify approvals.** Determine which scopes/roles must sign off (`packs.approve` assignments).
3. **Plan security posture.** Limit secrets usage, set tenant visibility, and note network constraints (sealed mode).
4. **Model observability.** Decide which metrics, logs, and evidence artifacts are critical for post-run audits.
5. **Reuse libraries.** Prefer built-in modules or shared pack fragments to reduce drift.
Capture the above in `docs/summary.md` (optional but recommended) for future maintainers.
---
## 3·Authoring Workflow
### 3.1 Scaffold project
```bash
mkdir my-pack
cd my-pack
stella pack init --name sbom-remediation
```
`stella pack init` creates baseline files:
- `pack.yaml` with metadata placeholders.
- `schemas/inputs.schema.json` (sample).
- `docs/usage.md` (template for human instructions).
- `.packignore` to exclude build artifacts.
### 3.2 Define inputs & schemas
- Use JSON Schema (`draft-2020-12`) for input validation.
- Avoid optional inputs unless there is a deterministic default.
- Store schemas under `schemas/` and reference via relative paths.
### 3.3 Compose steps
- Break workflow into small deterministic steps.
- Name each step with stable `id`.
- Wrap scripts/tools using built-in modules; copy scripts to `assets/` if necessary.
- Use `when` expressions for branch logic; ensure expressions rely solely on inputs or previous outputs.
- For loops, adopt `map` with capped iteration count; avoid data-dependent randomness.
### 3.4 Configure approvals
- Add `spec.approvals` entries for each required review.
- Capture the metadata Authority enforces: `runId`, `gateId`, and `planHash` should be documented so approvers can pass them through `stella pack approve --pack-run-id/--pack-gate-id/--pack-plan-hash` (see `docs/modules/packs-registry/guides/runbook.md#4-approvals-workflow`).
- Provide informative `reasonTemplate` with placeholders.
- Set `expiresAfter` to match operational policy (e.g., 4h for security reviews).
- Document fallback contacts in `docs/runbook.md`.
### 3.5 Manage secrets
- Declare secrets under `spec.secrets`.
- Reference secrets via expressions (e.g., `{{ secrets.jiraToken.value }}`) inside modules that support secure injection.
- Never bake secrets or tokens into pack assets.
- If a secret is optional, set `optional: true` and handle its absence in the step logic.
### 3.6 Document outputs
- List expected artifacts under `spec.outputs`.
- Include human-friendly docs (Markdown) describing each output and how to access it through CLI or Console.
---
## 4·Validation
### 4.1 Static validation
```bash
stella pack validate
```
Checks performed:
- Schema compliance (YAML, JSON Schema).
- Determinism guard (forbidden functions, clock usage, network allowlist).
- Reference integrity (assets, schemas, documentation).
- Approval/secret scope availability.
### 4.2 Simulation & plan hash
```bash
stella pack plan --inputs samples/inputs.json --output .artifacts/plan.json
stella pack simulate --inputs samples/inputs.json --output .artifacts/sim.json
```
- Review plan graph to ensure step ordering and gating align with expectations.
- Store simulation output with pack metadata for future audits.
### 4.3 Local rehearsal
```bash
stella pack run \
--inputs samples/inputs.json \
--secrets jiraToken=@secrets/jira.txt \
--dry-run
```
- Use `--dry-run` to verify approvals and outputs without side effects.
- Real runs require `packs.run` and all approval gates satisfied (e.g., via CLI prompts or Console).
### 4.4 Unit tests (optional but encouraged)
- Create a `tests/` folder with CLI-driven regression scripts (e.g., using `stella pack plan` + `jq` assertions).
- Integrate into CI pipelines; ensure tests run offline using cached assets.
---
## 5·Publishing
### 5.1 Build bundle
```bash
stella pack build \
--output dist/sbom-remediation-1.3.0.stella-pack.tgz \
--manifest pack.yaml
```
### 5.2 Sign bundle
```bash
cosign sign-blob \
--yes \
--output-signature dist/sbom-remediation-1.3.0.sig \
dist/sbom-remediation-1.3.0.stella-pack.tgz
```
Store signature alongside bundle; DSSE optional but recommended (see [security guidance](../security/pack-signing-and-rbac.md)).
### 5.3 Publish to registry
```bash
stella pack push \
registry.stella-ops.org/packs/sbom-remediation:1.3.0 \
--bundle dist/sbom-remediation-1.3.0.stella-pack.tgz \
--signature dist/sbom-remediation-1.3.0.sig
```
Registry verifies signature, stores provenance, and updates index.
### 5.4 Offline distribution
- Export bundle + signature + provenance into Offline Kit using `stella pack bundle export`.
- Update mirror manifest (`manifest/offline-manifest.json`) with new pack entries.
---
## 6·Versioning & Compatibility
- Follow SemVer (increment major when breaking schema/behaviour).
- Document compatibility in `docs/compatibility.md` (recommended).
- Registry retains immutable history; use `metadata.deprecated: true` to indicate retirement.
---
## 7·Best Practices
- **Keep steps idempotent.** Support manual retries without side effects.
- **Surface evidence early.** Export intermediate artifacts (plans, logs) for operators.
- **Localize messages.** Provide `locales/en-US.json` for CLI/Console strings (Sprint43 requirement).
- **Avoid long-running commands.** Split heavy tasks into smaller steps with progress telemetry.
- **Guard network usage.** Use `when: "{{ env.isSealed }}"` to block disallowed network operations or provide offline instructions.
- **Document fallbacks.** Include manual recovery instructions in `docs/runbook.md`.
---
## 8·Hand-off & Review
- Submit PR including pack bundle metadata, docs, and validation evidence.
- Request review from Task Runner + Security + DevOps stakeholders.
- Attach `stella pack plan` output and signature digest to review notes.
- After approval, update change log (`docs/CHANGELOG.md`) and notify Task Runner operations.
---
## 9·Compliance Checklist
- [ ] Metadata, inputs, steps, approvals, secrets, and outputs defined per spec.
- [ ] Schemas provided for all object inputs and outputs.
- [ ] Determinism validation (`stella pack validate`) executed with evidence stored.
- [ ] Plan + simulation artifacts committed in `.artifacts/` or CI evidence store.
- [ ] Bundle signed (cosign/DSSE) and signature recorded.
- [ ] Runbook and troubleshooting notes documented.
- [ ] Offline distribution steps prepared (bundle export + manifest update).
- [ ] Imposed rule reminder retained at top.
---
*Last updated: 2025-10-27 (Sprint43).*

> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
# Packs Registry Architecture & Operations
The Packs Registry stores, verifies, and serves Task Pack bundles across environments. It integrates with Authority for RBAC, Task Runner for execution, DevOps for release automation, and Offline Kit for air-gapped distribution.
---
## 1·Service Overview
- **Service name:** `StellaOps.PacksRegistry`
- **Interfaces:** REST/GraphQL API, OCI-compatible registry endpoints, event streams for mirroring.
- **Data stores:** PostgreSQL (`packs`, `pack_versions`, `pack_provenance` tables), object storage (bundle blobs, signatures), timeline events.
- **Dependencies:** Authority scopes (`packs.*`), Export Center (manifests), DevOps signing service, Notifications (optional).
---
## 2·Core Concepts
| Concept | Description |
|---------|-------------|
| **Pack record** | Immutable entry representing a pack version; includes metadata, digest, signatures, tenant visibility. |
| **Channel** | Logical distribution channel (`stable`, `edge`, `beta`, custom). Controls mirroring/promotion flows. |
| **Provenance** | DSSE statements + SBOM linking pack bundle to source repo, CLI build, and Task Runner compatibility. |
| **Mirroring policy** | Rules specifying which packs replicate to downstream registries or Offline Kit bundles. |
| **Audit trail** | Append-only log capturing publish/update/delete actions, approvals, and policy evaluations. |
---
## 3·API Surface
### 3.1 REST Endpoints
| Method | Path | Description | Scopes |
|--------|------|-------------|--------|
| `GET` | `/api/packs` | List packs with filters (`name`, `channel`, `tenant`, `tag`). | `packs.read` |
| `GET` | `/api/packs/{packId}/versions` | List versions with metadata, provenance. | `packs.read` |
| `GET` | `/api/packs/{packId}/versions/{version}` | Retrieve manifest, signatures, compatibility matrix. | `packs.read` |
| `POST` | `/api/packs/{packId}/versions` | Publish new version (bundle upload or OCI reference). | `packs.write` |
| `POST` | `/api/packs/{packId}/promote` | Promote version between channels (edge→stable). | `packs.write` + approval policy |
| `DELETE` | `/api/packs/{packId}/versions/{version}` | Deprecate version (soft delete, immutability preserved). | `packs.write` |
| `GET` | `/api/packs/{packId}/events` | Stream audit events (SSE). | `packs.read` |
### 3.2 OCI Endpoints
The registry exposes OCI-compatible endpoints (`/v2/<namespace>/<pack>/...`) supporting:
- `PUT`/`PATCH`/`GET` for manifests and blobs.
- Content-addressed digests using SHA-256.
- Annotations for pack metadata (`org.opencontainers.image.title`, `io.stellaops.pack.metadata`).
### 3.3 GraphQL (Optional)
GraphQL endpoint (`/api/graphql`) enables advanced queries (filter by approvals, tags, compatibility). Under active development; reference API schema once published.
---
## 4 · Publishing Workflow
1. CLI/CI calls `POST /api/packs/{id}/versions` with signed bundle.
2. Registry verifies:
- Manifest schema compliance.
- Signature (cosign/DSSE) validity.
- Authority scopes (`packs.write`).
- Tenant visibility constraints.
3. On success, registry stores bundle, provenance, and emits event (`pack.version.published`).
4. Optional promotion requires additional approvals or integration with DevOps release boards.
All actions recorded in audit log:
```json
{
"id": "evt_01HF...",
"type": "pack.version.published",
"packId": "sbom-remediation",
"version": "1.3.0",
"actor": "user:alice",
"tenant": "west-prod",
"source": "cli/2025.10.0",
"signatures": ["sha256:..."],
"metadataHash": "sha256:..."
}
```
---
## 5 · Mirroring & Offline Support
- **Automatic mirroring:** Configure policies to push packs to secondary registries (edge clusters, regional mirrors) or object stores.
- **Offline Kit integration:** `ops/offline-kit` pipeline pulls packs matching specified channels and writes them to `offline/packs/manifest.json` with signatures.
- **Checksum manifest:** Registry maintains `digestmap.json` listing pack digests + signatures; offline installers verify before import.
- **Sealed mode:** Registry can operate in read-only mode for sealed environments; publishing disabled except via offline import command (`stella pack mirror import`).
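The checksum verification step lends itself to a simple fail-closed loop. A minimal sketch (Python for illustration; the flat `{path: "sha256:..."}` layout assumed for `digestmap.json` is an assumption, the real schema may differ):

```python
import hashlib

def sha256_of(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()

def verify_digestmap(digestmap: dict, read_blob) -> list:
    # Return every path whose current bytes no longer match the recorded
    # digest. `read_blob(path) -> bytes` abstracts storage so the same
    # check works for removable media, mirrors, or object stores.
    return [path for path, expected in sorted(digestmap.items())
            if sha256_of(read_blob(path)) != expected]
```

An offline installer would refuse the import whenever the returned list is non-empty.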
---
## 6 · Security & Compliance
- Enforce Authority scopes; tokens without tenant or required scope are rejected (`ERR_PACK_SCOPE`).
- Signatures verified using trusted Fulcio/KMS roots; optional mirror trust bundles configured via `registry.trustBundle`.
- RBAC mapping:
| Role | Scopes | Capabilities |
|------|--------|--------------|
| `PackViewer` | `packs.read` | Browse, fetch manifests/bundles. |
| `PackPublisher` | `packs.read`, `packs.write` | Publish/promote, manage channels (subject to policy). |
| `PackApprover` | `packs.read`, `packs.approve` | Approve promotions, override tenant visibility (with audit logging). |
| `PackOperator` | `packs.read`, `packs.run` | Execute packs (via CLI/Task Runner). |
- Audit events forwarded to Authority + Evidence Locker.
- Built-in malware/secret scanning runs on bundle upload (configurable via DevOps pipeline).
See [pack signing & RBAC guidance](../security/pack-signing-and-rbac.md) for deeper controls.
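The fail-closed scope check described above reduces to a few lines. An illustrative Python sketch (the actual middleware lives in the .NET service):

```python
def authorize(token_scopes: set, token_tenant: str,
              required_scope: str, tenant: str) -> None:
    # Fail closed: a tenant mismatch and a missing scope both map to the
    # same ERR_PACK_SCOPE rejection, so callers cannot probe which
    # condition failed.
    if token_tenant != tenant or required_scope not in token_scopes:
        raise PermissionError("ERR_PACK_SCOPE")
```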
---
## 7 · Observability
- Metrics (`registry` namespace):
- `pack_publish_total{result}` success/failure counts.
- `pack_signature_verify_seconds` verification latency.
- `pack_channel_promotions_total` promotions per channel.
- `pack_mirror_queue_depth` pending mirror jobs.
- Logs (structured JSON with `packId`, `version`, `actor`, `tenant`, `digest`).
- Traces instrument bundle verification, storage writes, and mirror pushes.
- Alerting suggestions:
- Publish failure rate >5% (5m window) triggers DevOps escalation.
- Mirror lag >15m surfaces to Ops dashboard.
---
## 8 · Schema & Metadata Extensions
- Default metadata stored under `metadata.*` from manifest.
- Registry supplements with:
- `compatibility.cli` (supported CLI versions).
- `compatibility.runner` (Task Runner build requirements).
- `provenance.attestations[]` (URIs).
- `channels[]` (current channel assignments).
- `tenantVisibility[]`.
- `deprecated` flag + replacement hints.
Extensions must be deterministic and derived from signed bundle data.
---
## 9 · Operations
- **Backups:** Daily snapshots of PostgreSQL tables + object storage, retained for 30 days.
- **Retention:** Old versions retained indefinitely; mark as `deprecated` instead of deleting.
- **Maintenance:**
- Run `registry vacuum` weekly to prune orphaned blobs.
- Rotate signing keys per security policy (document in `pack-signing-and-rbac`).
- Validate trust bundles quarterly.
- **Disaster recovery:**
- Restore database + object storage.
- Rebuild OCI indexes (`registry rebuild-index`).
- Replay audit events for downstream systems.
---
## 10 · Compliance Checklist
- [ ] REST + OCI endpoints documented with required scopes.
- [ ] Publishing flow covers signature verification, audit logging, and promotion policies.
- [ ] Mirroring/offline strategy recorded (policies, manifests, sealed mode notes).
- [ ] RBAC roles and scope mapping defined.
- [ ] Observability metrics, logs, and alerts described.
- [ ] Operations guidance covers backups, rotation, disaster recovery.
- [ ] Imposed rule reminder included at top of document.
## 11 · TP Gap Remediation (2025-12)
- **Signed registry record (TP7):** Every pack version stores DSSE envelopes for bundle + attestation, SBOM path, and revocation list reference. Imports fail-closed when signatures or revocation proofs are missing.
- **Offline bundle schema (TP8):** Registry exports offline artefacts that must satisfy `docs/modules/packs-registry/guides/packs-offline-bundle.schema.json`; publish pipeline invokes `scripts/packs/verify_offline_bundle.py --require-dsse` before promotion.
- **Hash ledger (TP1/TP2):** Publish step writes `hashes[]` (sha256) for manifest, canonical plan, `inputs.lock`, approvals ledger, SBOM, and revocations; digests surface in audit events and `digestmap.json`.
- **Sandbox + quotas (TP6):** Registry metadata carries `sandbox.mode`, explicit egress allowlists, CPU/memory limits, and quota seconds; Task Runner refuses packs missing these fields.
- **SLO + alerting (TP9):** Pack metadata includes SLOs (`runP95Seconds`, `approvalP95Seconds`, `maxQueueDepth`); registry emits metrics/alerts when declared SLOs are exceeded during publish/import flows.
- **Fail-closed imports (TP10):** Import/mirror paths abort when DSSE, hash entries, or revocation files are absent or stale, returning actionable error codes for CLI/Task Runner.
- **Approval ledger schema:** Registry exposes `docs/modules/packs-registry/guides/approvals-ledger.schema.json` for DSSE approval records (planHash must be `sha256:<64-hex>`); import validation rejects non-conforming ledgers.
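The `sha256:<64-hex>` constraint on `planHash` reduces to a single pattern check; a sketch of what import validation might apply:

```python
import re

PLAN_HASH_RE = re.compile(r"^sha256:[0-9a-f]{64}$")

def is_valid_plan_hash(value: str) -> bool:
    # Lowercase hex only; uppercase, truncated, or wrong-algorithm
    # digests all fail closed.
    return PLAN_HASH_RE.match(value) is not None
```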
---
*Last updated: 2025-12-05 (Sprint0157-0001-0001 TaskRunner I).*

> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
# Task Pack Operations Runbook
This runbook guides SREs and on-call engineers through executing, monitoring, and troubleshooting Task Packs using the Task Runner service, Packs Registry, and StellaOps CLI. It aligns with Sprint 43 deliverables (approvals workflow, notifications, chaos resilience).
---
## 1 · Quick Reference
| Action | Command / UI | Notes |
|--------|--------------|-------|
| Validate pack | `stella pack validate --bundle <file>` | Run before publishing or importing. |
| Plan pack run | `stella pack plan --inputs inputs.json` | Outputs plan hash, required approvals, secret summary. |
| Execute pack | `stella pack run --pack <id>:<version>` | Streams logs; prompts for secrets/approvals if allowed. |
| Approve gate | Console notifications or `stella pack approve --run <id> --gate <gate>` | Requires `packs.approve`. |
| View run | Console `/console/packs/runs/:id` or `stella pack runs show <id>` | SSE stream available for live status. |
| Export evidence | `stella pack runs export --run <id>` | Produces bundle with plan, logs, artifacts, attestations. |
---
## 2 · Run Lifecycle
1. **Submission**
- CLI/Orchestrator submits run with inputs, pack version, tenant context.
- Task Runner validates pack hash, scopes, sealed-mode constraints.
2. **Plan & Simulation**
- Runner caches plan graph; optional simulation diff recorded.
3. **Approvals**
- Gates emit notifications (`NOTIFY-SVC-40-001`).
- Approvers can approve/resume via CLI, Console, or API.
4. **Execution**
- Steps executed per plan (sequential/parallel).
- Logs streamed via SSE (`/task-runner/runs/{id}/logs`).
5. **Evidence & Attestation**
- On completion, DSSE attestation + evidence bundle stored.
- Exports available via Export Center.
6. **Cleanup**
- Artifacts retained per retention policy (default 30d).
- Mirror pack run manifest to Offline Kit if configured.
---
## 3 · Monitoring & Telemetry
- **Metrics dashboards:** `task-runner` Grafana board.
- `pack_run_active` active runs per tenant.
- `pack_step_duration_seconds` histograms per step type.
- `pack_gate_wait_seconds` approval wait time (alert >30m).
- `pack_run_success_ratio` success vs failure rate.
- **Logs:** Search by `runId`, `packId`, `tenant`, `stepId`.
- **Traces:** Query `taskrunner.run` span in Tempo/Jaeger.
- **Notifications:** Subscribe to `pack.run.*` topics via Notifier for Slack/email/PagerDuty hooks.
Observability configuration referenced in Task Runner tasks (OBS-50-001..55-001).
---
## 4 · Approvals Workflow
- Approvals may be requested via Console banner, CLI prompt, or email/Slack.
- Approver roles: `packs.approve` + tenant membership.
- CLI command:
```bash
stella pack approve \
--run run:tenant:timestamp \
--gate security-review \
--comment "Validated remediation scope; proceeding."
```
- Metadata parameters are mandatory: `--pack-run-id`, `--pack-gate-id`, and `--pack-plan-hash` map 1:1 to the Authority token parameters (`pack_run_id`, `pack_gate_id`, `pack_plan_hash`). The CLI resolves sensible defaults from `stella pack plan`, but operators can override them explicitly for out-of-band runs. Authority `/token` rejects `packs.approve` requests missing any of these fields and records the failure in `authority.pack_scope_violation`. Keep this section (and `docs/security/pack-signing-and-rbac.md`) handy; the Authority team references it as the canonical procedure.
- Auto-expiry triggers run cancellation (configurable per gate).
- Approval events logged and included in evidence bundle.
---
## 5 · Secrets Handling
- Secrets retrieved via Authority secure channel or CLI profile.
- Task Runner injects secrets into isolated environment variables or temp files (auto-shredded).
- Logs redact secrets; evidence bundles include only secret metadata (name, scope, last four characters).
- For sealed mode, secrets must originate from sealed vault (configured via `TASKRUNNER_SEALED_VAULT_URL`).
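The metadata-only view that reaches the evidence bundle can be sketched in a few lines (illustrative Python; the field names are assumptions):

```python
def redact_secret(name: str, scope: str, value: str) -> dict:
    # Evidence bundles carry the secret's identity, never its value.
    # Very short secrets get a fixed mask so the suffix reveals nothing.
    last_four = value[-4:] if len(value) >= 8 else "****"
    return {"name": name, "scope": scope, "lastFour": last_four}
```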
---
## 6 · Failure Recovery
| Scenario | Symptom | Resolution |
|----------|---------|------------|
| **Plan hash mismatch** | Run aborted with `ERR_PACK_HASH_MISMATCH`. | Re-run `stella pack plan`; ensure pack not modified post-plan. |
| **Approval timeout** | `ERR_PACK_APPROVAL_TIMEOUT`. | Requeue run with extended TTL or escalate to approver; verify notifications delivered. |
| **Secret missing** | Run fails at injection step. | Provide secret via CLI (`--secrets`) or configure profile; check Authority scope. |
| **Network blocked (sealed)** | `ERR_PACK_NETWORK_BLOCKED`. | Update pack to avoid external calls or whitelist domain via AirGap policy. |
| **Artifact upload failure** | Evidence missing, logs show storage errors. | Retry run with `--resume` (if supported); verify object storage health. |
| **Runner chaos trigger** | Run paused with chaos event note. | Review chaos test plan; resume if acceptable or cancel run. |
`stella pack runs resume --run <id>` resumes paused runs post-remediation (approvals or transient failures).
---
## 7 · Chaos & Resilience
- Chaos hooks pause runs, drop network, or delay approvals to test resilience.
- Track chaos events via `pack.chaos.injected` timeline entries.
- Post-chaos, ensure metrics return to baseline; record findings in Ops log.
---
## 8 · Offline & Air-Gapped Execution
- Use `stella pack mirror pull` to import packs into sealed registry.
- CLI caches bundles under `~/.stella/packs/` for offline runs.
- Approvals require offline process:
- Generate approval request bundle (`stella pack approve --offline-request`).
- Approver signs bundle using offline CLI.
- Import approval via `stella pack approve --offline-response`.
- Evidence bundles exported to removable media; verify checksums before upload to online systems.
---
## 9 · Runbooks for Common Packs
Maintain per-pack playbooks in `docs/modules/packs-registry/guides/runbook/<pack-name>.md`. Include:
- Purpose and scope.
- Required inputs and secrets.
- Approval stakeholders.
- Pre-checks and post-checks.
- Rollback procedures.
The Docs Guild can use this root runbook as a template.
---
## 10 · Escalation Matrix
| Issue | Primary | Secondary | Notes |
|-------|---------|-----------|-------|
| Pack validation errors | DevEx/CLI Guild | Task Runner Guild | Provide pack bundle + validation output. |
| Approval pipeline failure | Task Runner Guild | Authority Core | Confirm scope/role mapping. |
| Registry outage | Packs Registry Guild | DevOps Guild | Use mirror fallback if possible. |
| Evidence integrity issues | Evidence Locker Guild | Security Guild | Validate DSSE attestations, escalate if tampered. |
Escalations must include run ID, tenant, pack version, plan hash, and timestamps.
---
## 11 · Compliance Checklist
- [ ] Run lifecycle documented (submission → evidence).
- [ ] Monitoring metrics, logs, traces, and notifications captured.
- [ ] Approvals workflow instructions provided (CLI + Console).
- [ ] Secret handling, sealed-mode constraints, and offline process described.
- [ ] Failure scenarios + recovery steps listed.
- [ ] Chaos/resilience guidance included.
- [ ] Escalation matrix defined.
- [ ] Imposed rule reminder included at top.
---
*Last updated: 2025-10-27 (Sprint 43).*

> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
# Task Pack Specification (Sprint 43 Draft)
The Task Pack specification defines a deterministic, auditable format that enables operators to encode multi-step maintenance, validation, and deployment workflows. Packs are executed by the Task Runner service, distributed through the Packs Registry, and invoked via the StellaOps CLI (`stella pack ...`) or Orchestrator integrations.
---
## 1 · Goals & Scope
- **Deterministic execution.** Identical inputs yield identical run graphs, output manifests, and evidence bundles across environments (online, sealed, or offline).
- **Secure-by-default.** Pack metadata must capture provenance, signatures, RBAC requirements, and secret usage; execution enforces tenant scopes and approvals.
- **Portable.** Packs are distributed as signed OCI artifacts or tarballs that work in connected and air-gapped deployments, including Offline Kit mirrors.
- **Composable.** Packs can reference reusable steps, expressions, and shared libraries without sacrificing determinism or auditability.
Non-goals: full-blown workflow orchestration, unbounded scripting, or remote code injection. All logic is declarative and constrained to Task Runner capabilities.
---
## 2 · Terminology
| Term | Definition |
|------|------------|
| **Pack manifest** | Primary YAML document (`pack.yaml`) describing metadata, inputs, steps, policies, and evidence expectations. |
| **Step** | Atomic unit of work executed by Task Runner (e.g., command, API call, policy gate, approval). Steps can be sequential or parallel. |
| **Expression** | Deterministic evaluation (JMESPath-like) used for branching, templating, and conditionals. |
| **Policy gate** | Declarative rule that blocks execution until conditions are met (e.g., approval recorded, external signal received). |
| **Artifact** | File, JSON blob, or OCI object produced by a step, referenced in manifests and evidence bundles. |
| **Pack bundle** | Distribution archive (`.stella-pack.tgz` or OCI ref) containing manifest, assets, schemas, and provenance metadata. |
---
## 3 · Pack Layout
```
my-pack/
├─ pack.yaml # Required manifest
├─ assets/ # Optional static assets (scripts, templates)
├─ schemas/ # JSON schemas for inputs/outputs
├─ docs/ # Markdown docs rendered in Console/CLI help
├─ provenance/ # DSSE statements, SBOM, attestations
└─ README.md # Author-facing summary (optional)
```
Publishing via Packs Registry or OCI ensures the directory is canonical and hashed.
---
## 4 · Manifest Schema (v1.0)
```yaml
apiVersion: stellaops.io/pack.v1
kind: TaskPack
metadata:
name: sbom-remediation
version: 1.3.0
description: >
Audit SBOM drift, quiet high-risk findings, and export mitigation evidence.
tags: [sbom, remediation, policy]
tenantVisibility: ["west-prod", "east-stage"] # optional allowlist
maintainers:
- name: Jane Doe
email: jane@example.com
license: BUSL-1.1
annotations:
imposedRuleReminder: true
spec:
inputs:
- name: sbomBundle
type: object
schema: schemas/sbom-bundle.schema.json
required: true
- name: dryRun
type: boolean
default: false
secrets:
- name: jiraToken
scope: packs.run # Authority scope required
description: Optional token for ticket automation
approvals:
- id: security-review
grants: ["packs.approve"]
expiresAfter: PT4H
reasonTemplate: "Approve remediation for SBOM {{ inputs.sbomBundle.metadata.image }}"
steps:
- id: validate-input
run:
uses: builtin:validate-schema
with:
target: "{{ inputs.sbomBundle }}"
schema: schemas/sbom-bundle.schema.json
- id: plan-remediation
when: "{{ not inputs.dryRun }}"
run:
uses: builtin:policy-simulate
with:
sbom: "{{ inputs.sbomBundle }}"
policy: "policies/remediation.yaml"
- id: approval-gate
gate:
approval: security-review
message: "Security must approve remediation before changes apply."
- id: apply-remediation
run:
uses: builtin:cli-command
with:
command: ["stella", "policy", "promote", "--from-pack"]
- id: export-evidence
run:
uses: builtin:evidence-export
with:
includeArtifacts: ["{{ steps.plan-remediation.outputs.planPath }}"]
outputs:
- name: evidenceBundle
type: file
path: "{{ steps.export-evidence.outputs.bundlePath }}"
success:
message: "Remediation applied; evidence bundle ready."
failure:
retries:
maxAttempts: 1
backoffSeconds: 0
message: "Remediation failed; see evidence bundle for context."
```
### 4.1 Field Summary
| Field | Description | Requirements |
|-------|-------------|--------------|
| `metadata` | Human-facing metadata; used for registry listings and RBAC hints. | `name` (DNS-1123), `version` (SemVer), `description` ≤ 2048 chars. |
| `spec.inputs` | Declarative inputs validated at plan time. | Must include type; custom schema optional but recommended. |
| `spec.secrets` | Secrets requested at runtime; never stored in pack bundle. | Each secret references Authority scope; CLI prompts or injects from profiles. |
| `spec.approvals` | Named approval gates with required grants and TTL. | ID unique per pack; `grants` map to Authority roles. Approval metadata (`runId`, `gateId`, `planHash`) feeds Authority's `pack_run_id`/`pack_gate_id`/`pack_plan_hash` parameters (see `docs/modules/packs-registry/guides/runbook.md#4-approvals-workflow`). |
| `spec.steps` | Execution graph; each step is `run`, `gate`, `parallel`, or `map`. | Steps must declare deterministic `uses` module and `id`. |
| `spec.outputs` | Declared artifacts for downstream automation. | `type` can be `file`, `object`, or `url`; path/expression required. |
| `success` / `failure` | Messages + retry policy. | `failure.retries.maxAttempts` + `backoffSeconds` default to 0. |
---
## 5 · Step Types
| Type | Schema | Notes |
|------|--------|-------|
| `run` | Executes a built-in module (`builtin:*`) or registry-provided module. | Modules must be deterministic, side-effect constrained, and versioned. |
| `parallel` | Executes sub-steps concurrently; `maxParallel` optional. | Results aggregated; failures trigger abort unless `continueOnError`. |
| `map` | Iterates over deterministic list; each iteration spawns sub-step. | Sequence derived from expression result; ordering stable. |
| `gate.approval` | Blocks until approval recorded with required grants. | Supports `autoExpire` to cancel run on timeout. |
| `gate.policy` | Calls Policy Engine to ensure criteria met (e.g., no critical findings). | Fails run if gate not satisfied. |
`when` expressions must be pure (no side effects) and rely only on declared inputs or prior outputs.
---
## 6 · Determinism & Validation
1. **Plan phase** (`stella pack plan`, `TaskRunner.Plan` API) parses manifest, resolves expressions, validates schemas, and emits canonical graph with hash.
2. **Simulation** compares plan vs dry-run results, capturing differences in `planDiff`. Required for approvals in sealed environments.
3. **Execution** uses plan hash to ensure runtime graph matches simulation. Divergence aborts run.
4. **Evidence**: Task Runner emits DSSE attestation referencing plan hash, input digests, and output artifacts.
Validation pipeline:
```text
pack.yaml ──▶ schema validation ──▶ expression audit ──▶ determinism guard ──▶ signing
```
Packs must pass CLI validation before publishing.
### 6.1 · TP Gap Remediation (2025-12)
- **Canonical plan hash (TP1):** Compute `plan.hash` as `sha256:<64-hex>` over canonical JSON (`plan.canonicalPlanPath`) with sorted keys and normalized numbers/booleans. The canonical plan file ships in offline bundles.
- **Inputs lock (TP2):** CLI emits `inputs.lock` capturing resolved inputs and redacted secret placeholders; hashed via `hashes[]` and included in evidence bundles.
- **Approval ledger DSSE (TP3):** Approval responses are DSSE-signed ledgers embedding `runId`, `gateId`, `planHash`, and tenant context; Task Runner rejects approvals without matching plan hash.
- **Secret redaction (TP4):** `security.secretsRedactionPolicy` defines hashing/redaction for secrets and PII; transcripts/evidence must reference this policy.
- **Deterministic RNG/time (TP5):** RNG seed is derived from `plan.hash`; timestamps use UTC ISO-8601; log ordering is monotonic.
- **Sandbox + egress quotas (TP6):** Packs declare `sandbox.mode`, explicit `egressAllowlist`, CPU/memory limits, and optional `quotaSeconds`; missing fields cause fail-closed refusal.
- **Registry signing + revocation (TP7):** Bundles carry SBOM + DSSE envelopes and reference a revocation list enforced during registry import.
- **Offline bundle schema + verifier (TP8):** Offline exports must satisfy `docs/modules/packs-registry/guides/packs-offline-bundle.schema.json` and pass `scripts/packs/verify_offline_bundle.py --require-dsse`.
- **SLO + alerting (TP9):** Manifests declare `slo.runP95Seconds`, `slo.approvalP95Seconds`, `slo.maxQueueDepth`, and optional `slo.alertRules`; telemetry enforces and alerts on breaches.
- **Fail-closed gates (TP10):** Approval/policy/timeline gates fail closed when DSSE, hash entries, or quotas are missing/expired; CLI surfaces remediation hints.
- **Approval ledger schema:** Approval decisions must conform to `docs/modules/packs-registry/guides/approvals-ledger.schema.json`; planHash is `sha256:<64-hex>` and DSSE envelopes must reference ledger digest.
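The TP1 canonicalisation above can be illustrated in a few lines (Python sketch; the shipped implementation is .NET, and full number normalisation is more involved than `json.dumps` shows):

```python
import hashlib
import json

def canonical_plan_hash(plan: dict) -> str:
    # Sorted keys + compact separators yield one byte stream per logical
    # plan, regardless of key insertion order. (Real canonical JSON also
    # normalises number forms; omitted here for brevity.)
    canonical = json.dumps(plan, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two plans that differ only in key ordering hash identically, which is what lets Task Runner compare a runtime graph against the plan-phase hash.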
---
## 7 · Signatures & Provenance
- Pack bundles are signed with **cosign** (keyless Fulcio/KMS supported) and optionally DSSE envelopes.
- `provenance/` directory stores signed statements (SLSA Build L1+) linking source repo, CI run, and manifest hash.
- Registry verifies signatures on push/pull; Task Runner refuses unsigned packs unless in development mode.
- Attestations include:
- Pack manifest digest (`sha256`)
- Pack bundle digest
- Build metadata (`git.ref`, `ci.workflow`, `cli.version`)
---
## 8 · RBAC & Scopes
Authority scopes introduced by `AUTH-PACKS-41-001`:
| Scope | Purpose |
|-------|---------|
| `packs.read` | Discover packs, download manifests. |
| `packs.write` | Publish/update packs in registry (requires signature). |
| `packs.run` | Execute packs via CLI/Task Runner. |
| `packs.approve` | Fulfil approval gates defined in packs. |
Task Runner enforces scopes per tenant; pack metadata may further restrict tenant visibility (`metadata.tenantVisibility`).
---
## 9 · Observability & Evidence
- Metrics: `pack_run_duration_seconds`, `pack_step_retry_total`, `pack_gate_wait_seconds`.
- Logs: Structured JSON per step with scrubbed inputs (`secretMask` applied).
- Timeline events: `pack.started`, `pack.approval.requested`, `pack.approval.granted`, `pack.completed`.
- Evidence bundle includes:
- Plan manifest (canonical JSON)
- Step transcripts (redacted)
- Artifacts manifest (sha256, size)
- Attestations references
---
## 10 · Compatibility Matrix
| CLI Version | Pack API | Task Runner | Notes |
|-------------|----------|-------------|-------|
| 2025.10.x | `pack.v1` | Runner build `>=2025.10.0` | Approvals optional, loops disabled. |
| 2025.12.x | `pack.v1` | Runner build `>=2025.12.0` | Approvals resume, secrets injection, localization strings. |
| Future | `pack.v2` | TBD | Will introduce typed outputs & partial replay (track in Epic 13). |
CLI enforces compatibility: running a pack with unsupported features yields `ERR_PACK_UNSUPPORTED`.
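A sketch of the version gate the matrix implies (the minimum-build value below is taken from the first table row; the gating logic itself is an assumption):

```python
MIN_RUNNER_BUILD = {"pack.v1": (2025, 10, 0)}  # baseline from the matrix

def runner_supports(pack_api: str, runner_build: str) -> bool:
    # Unknown pack APIs fail closed, mirroring ERR_PACK_UNSUPPORTED.
    minimum = MIN_RUNNER_BUILD.get(pack_api)
    if minimum is None:
        return False
    build = tuple(int(part) for part in runner_build.split("."))
    return build >= minimum
```

A fuller check would also gate individual features (approvals resume, secrets injection) on the 2025.12.x row.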
---
## 11 · Publishing Workflow
1. Author pack (`pack.yaml`, assets, docs).
2. Run `stella pack validate` (schema + determinism).
3. Generate bundle: `stella pack build --output my-pack.stella-pack.tgz`.
4. Sign: `cosign sign-blob my-pack.stella-pack.tgz`.
5. Publish: `stella pack push registry.example.com/org/my-pack:1.3.0`.
6. Registry verifies signature, records provenance, and exposes pack via API.
---
## 12 · Compliance Checklist
- [ ] Manifest schema documented for all fields, including approvals, secrets, and outputs.
- [ ] Determinism requirements outlined with plan/simulate semantics and CLI validation steps.
- [ ] Signing + provenance expectations spelled out with cosign/DSSE references.
- [ ] RBAC scopes (`packs.*`) and tenant visibility rules captured.
- [ ] Observability (metrics, logs, evidence) described for Task Runner integrations.
- [ ] Compatibility matrix enumerates CLI/Runner requirements.
- [ ] Publishing workflow documented with CLI commands.
- [ ] Imposed rule reminder included at top of document.
---
*Last updated: 2025-12-05 (Sprint0157-0001-0001 TaskRunner I).*

# Provenance
> Provenance attestation library for SLSA/DSSE compliance.
## Purpose
Provenance provides deterministic, verifiable provenance attestations for all StellaOps artifacts. It enables SLSA compliance through DSSE statement generation, Merkle tree construction, and cryptographic verification.
## Quick Links
- [Architecture](./architecture.md) - Technical design and implementation details
- [Guides](./guides/) - Attestation generation guides
## Status
| Attribute | Value |
|-----------|-------|
| **Maturity** | Production |
| **Last Reviewed** | 2025-12-29 |
| **Maintainer** | Security Guild |
## Key Features
- **DSSE Statement Generation**: Build provenance attestations per DSSE spec
- **SLSA Compliance**: Support for SLSA build predicates
- **Merkle Tree Construction**: Content-addressed integrity verification
- **Promotion Attestations**: Track artifact promotions across environments
- **Verification Harness**: Validate attestation chains
## Dependencies
### Upstream (this module depends on)
- **Signer/KMS** - Key management for signing (delegated)
### Downstream (modules that depend on this)
- **Attestor** - Stores generated attestations
- **EvidenceLocker** - Evidence bundle attestations
- **ExportCenter** - Export attestations
## Notes
Provenance is a **library**, not a standalone service. It does not:
- Store attestations (handled by Attestor and EvidenceLocker)
- Hold signing keys (delegated to Signer/KMS)
All attestation outputs are deterministic with canonical JSON serialization.
## Related Documentation
- [Attestor Architecture](../attestor/architecture.md)
- [DSSE Specification](../../security/trust-and-signing.md)

# component_architecture_provenance.md - **Stella Ops Provenance** (2025 Q4)
> Provenance attestation library for SLSA/DSSE compliance.
> **Scope.** Library architecture for **Provenance**: shared libraries and tooling for generating, signing, and verifying provenance attestations (DSSE/SLSA). Used by evidence bundles, exports, and timeline verification flows.
---
## 0) Mission & boundaries
**Mission.** Provide **deterministic, verifiable provenance attestations** for all StellaOps artifacts. Enable SLSA compliance through DSSE statement generation, Merkle tree construction, and cryptographic verification.
**Boundaries.**
* Provenance is a **library**, not a standalone service.
* Provenance **does not** store attestations. Storage is handled by Attestor and EvidenceLocker.
* Provenance **does not** hold signing keys. Key management is delegated to Signer/KMS.
* All attestation outputs are **deterministic** with canonical JSON serialization.
---
## 1) Solution & project layout
```
src/Provenance/
├─ StellaOps.Provenance.Attestation/ # Core attestation library
│ ├─ AGENTS.md # Guild charter
│ ├─ PromotionAttestation.cs # Promotion statement builder
│ ├─ BuildModels.cs # SLSA build predicate models
│ ├─ Signers.cs # Signer abstractions
│ ├─ Verification.cs # Verification harness
│ └─ Hex.cs # Hex encoding utilities
├─ StellaOps.Provenance.Attestation.Tool/ # CLI tool for attestation
│ ├─ Program.cs
│ └─ README.md
└─ __Tests/
└─ StellaOps.Provenance.Attestation.Tests/
├─ PromotionAttestationBuilderTests.cs
├─ VerificationTests.cs
├─ SignersTests.cs
├─ MerkleTreeTests.cs
└─ RotatingSignerTests.cs
```
---
## 2) External dependencies
* **Signer** - Cryptographic signing operations
* **Cryptography** - Hash computation, signature algorithms
* **EvidenceLocker** - Consumes attestations for storage
* **ExportCenter** - Attaches attestations to export bundles
* **.NET 10** - Runtime target
---
## 3) Contracts & data model
### 3.1 DSSE Statement
Dead Simple Signing Envelope (DSSE) format:
```json
{
"payloadType": "application/vnd.in-toto+json",
"payload": "<base64-encoded-statement>",
"signatures": [
{
"keyid": "sha256:abc123...",
"sig": "<base64-signature>"
}
]
}
```
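What gets signed is not the raw payload but its pre-authentication encoding (PAE) from the DSSE spec. A minimal sketch of envelope construction (Python; the `sign` callback stands in for the Signer/KMS delegate):

```python
import base64

def pae(payload_type: str, payload: bytes) -> bytes:
    # DSSE v1 pre-authentication encoding: length-prefixed fields keep
    # payloadType and payload from being confused with one another.
    return b" ".join([
        b"DSSEv1",
        str(len(payload_type)).encode(), payload_type.encode(),
        str(len(payload)).encode(), payload,
    ])

def make_envelope(payload: bytes, payload_type: str, sign, keyid: str) -> dict:
    sig = sign(pae(payload_type, payload))
    return {
        "payloadType": payload_type,
        "payload": base64.b64encode(payload).decode(),
        "signatures": [{"keyid": keyid,
                        "sig": base64.b64encode(sig).decode()}],
    }
```

Verifiers recompute the PAE from the decoded payload and check the signature against it, never against the base64 text.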
### 3.2 SLSA Provenance Predicate
```json
{
"_type": "https://in-toto.io/Statement/v1",
"subject": [
{
"name": "pkg:oci/scanner@sha256:abc123",
"digest": { "sha256": "abc123..." }
}
],
"predicateType": "https://slsa.dev/provenance/v1",
"predicate": {
"buildDefinition": {
"buildType": "https://stellaops.dev/build/v1",
"externalParameters": {},
"internalParameters": {},
"resolvedDependencies": []
},
"runDetails": {
"builder": {
"id": "https://stellaops.dev/builders/scanner"
},
"metadata": {
"invocationId": "build-2025-01-15-abc123",
"startedOn": "2025-01-15T10:30:00Z",
"finishedOn": "2025-01-15T10:35:00Z"
}
}
}
}
```
### 3.3 Promotion Attestation
For artifact promotion across environments:
```csharp
public sealed class PromotionAttestation
{
public required string ArtifactDigest { get; init; }
public required string SourceEnvironment { get; init; }
public required string TargetEnvironment { get; init; }
public required DateTimeOffset PromotedAt { get; init; }
public required string ApprovedBy { get; init; }
public required IReadOnlyList<string> PolicyDigests { get; init; }
public string? MerkleRoot { get; init; }
}
```
---
## 4) Core Components
### 4.1 Signer Abstractions
```csharp
public interface IAttestationSigner
{
string KeyId { get; }
string Algorithm { get; }
Task<byte[]> SignAsync(
ReadOnlyMemory<byte> payload,
CancellationToken ct);
}
public interface IRotatingSigner : IAttestationSigner
{
DateTimeOffset KeyNotBefore { get; }
DateTimeOffset KeyNotAfter { get; }
Task<IAttestationSigner> GetCurrentSignerAsync(CancellationToken ct);
}
```
### 4.2 Verification Harness
```csharp
public interface IAttestationVerifier
{
Task<VerificationResult> VerifyAsync(
DsseEnvelope envelope,
VerificationOptions options,
CancellationToken ct);
}
public sealed record VerificationResult
{
public required bool IsValid { get; init; }
public required string KeyId { get; init; }
public DateTimeOffset? SignedAt { get; init; }
public IReadOnlyList<string>? Warnings { get; init; }
public string? ErrorMessage { get; init; }
}
```
### 4.3 Merkle Tree Utilities
For evidence chain verification:
```csharp
public static class MerkleTree
{
public static string ComputeRoot(IEnumerable<string> leaves);
public static IReadOnlyList<string> ComputePath(
IReadOnlyList<string> leaves,
int leafIndex);
public static bool VerifyPath(
string leaf,
IReadOnlyList<string> path,
string root);
}
```
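The `ComputeRoot` contract can be illustrated with a short sketch. The pairing and odd-level duplication rules below are assumptions for illustration; the library's actual hashing scheme is not specified here:

```python
import hashlib

def _h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def compute_root(leaves: list) -> str:
    """Pairwise-hash hex leaves up to a single root, duplicating the last
    node when a level has an odd count (illustrative assumption)."""
    if not leaves:
        return _h(b"")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate last node on odd levels
        level = [_h((level[i] + level[i + 1]).encode())
                 for i in range(0, len(level), 2)]
    return level[0]
```

Recomputing the root from the same leaves always yields the same digest, which is what makes it usable for evidence-chain verification.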
---
## 5) CLI Tool
`StellaOps.Provenance.Attestation.Tool` provides CLI commands:
```bash
# Generate provenance attestation
provenance-tool generate \
--subject "pkg:oci/scanner@sha256:abc123" \
--builder "stellaops/ci" \
--output attestation.json
# Sign attestation
provenance-tool sign \
--input attestation.json \
--key-id "kms://keys/signing-key" \
--output attestation.dsse.json
# Verify attestation
provenance-tool verify \
--input attestation.dsse.json \
--trust-root trust-bundle.json
# Generate promotion attestation
provenance-tool promote \
--artifact "sha256:abc123" \
--from staging \
--to production \
--approver "user@example.com"
```
---
## 6) Security & compliance
* **SLSA L3 compliance**: Build provenance with hermetic builds
* **Key rotation**: RotatingSigner supports key rotation with overlap
* **Determinism**: Canonical JSON ensures reproducible digests
* **Offline verification**: Trust bundles for air-gapped verification
* **Threat model**: Reviewed before each release
---
## 7) Performance targets
* **Statement generation**: < 10ms for typical attestation
* **Signing**: Depends on KMS (target < 100ms for HSM)
* **Verification**: < 50ms for single signature
* **Merkle root**: < 100ms for 10,000 leaves
---
## 8) Testing matrix
* **Serialization tests**: Deterministic JSON output across runs
* **Signing tests**: Round-trip sign/verify
* **Merkle tests**: Path generation and verification
* **Rotation tests**: Key rotation with overlap handling
* **Integration tests**: Full attestation flow with mock KMS
---
## 9) Sample Artifacts
Samples committed under `samples/provenance/`:
```
samples/provenance/
├─ slsa-provenance-v1.json # Sample SLSA statement
├─ promotion-attestation.json # Sample promotion
├─ trust-bundle.json # Sample trust root
└─ verify-example.sh # Verification script
```
---
## 10) Integration Points
### 10.1 EvidenceLocker
Evidence bundles include attestations:
```json
{
"bundleId": "eb-2025-01-15-abc123",
"attestations": [
{
"type": "slsa-provenance",
"dsse": { /* DSSE envelope */ }
}
]
}
```
### 10.2 ExportCenter
Exports attach attestations to manifests:
```json
{
"exportId": "export-abc123",
"manifest": { /* export manifest */ },
"attestation": { /* DSSE envelope */ }
}
```
### 10.3 CLI
Scanner and other tools generate attestations:
```bash
stella scan image:tag --attest --output sbom.cdx.json
# Produces sbom.cdx.json + sbom.cdx.json.dsse
```
---
## Related Documentation
* Attestor architecture: `../attestor/architecture.md`
* Signer architecture: `../signer/architecture.md`
* EvidenceLocker: `../evidence-locker/architecture.md`
* SLSA specification: https://slsa.dev/provenance/v1
* DSSE specification: https://github.com/secure-systems-lab/dsse

# Inline DSSE Provenance
> **Status:** Draft. Aligns with the November 2025 advisory "store DSSE attestation refs inline on every SBOM/VEX event node."
> **Owners:** Authority Guild · Feedser Guild · Platform Guild · Docs Guild.
This document defines how Stella Ops records provenance for SBOM, VEX, scan, and derived events: every event node in the PostgreSQL event store includes DSSE + Rekor references and verification metadata so audits and replay become first-class queries.
---
## 1. Event patch (PostgreSQL schema)
```jsonc
{
"_id": "evt_...",
"kind": "SBOM|VEX|SCAN|INGEST|DERIVED",
"subject": {
"purl": "pkg:nuget/example@1.2.3",
"digest": { "sha256": "..." },
"version": "1.2.3"
},
"provenance": {
"dsse": {
"envelopeDigest": "sha256:...",
"payloadType": "application/vnd.in-toto+json",
"key": {
"keyId": "cosign:SHA256-PKIX:ABC...",
"issuer": "fulcio",
"algo": "ECDSA"
},
"rekor": {
"logIndex": 1234567,
"uuid": "b3f0...",
"integratedTime": 1731081600,
"mirrorSeq": 987654 // optional
},
"chain": [
{ "type": "build", "id": "att:build#...", "digest": "sha256:..." },
{ "type": "sbom", "id": "att:sbom#...", "digest": "sha256:..." }
]
}
},
"trust": {
"verified": true,
"verifier": "Authority@stella",
"witnesses": 1,
"policyScore": 0.92
},
"ts": "2025-11-11T12:00:00Z"
}
```
### Key fields
| Field | Description |
|-------|-------------|
| `provenance.dsse.envelopeDigest` | SHA-256 of the DSSE envelope (not payload). |
| `provenance.dsse.payloadType` | Usually `application/vnd.in-toto+json`. |
| `provenance.dsse.key` | Key fingerprint / issuer / algorithm. |
| `provenance.dsse.rekor` | Rekor transparency log metadata (index, UUID, integrated time). |
| `provenance.dsse.chain` | Optional chain of dependent attestations (build → sbom → scan). |
| `trust.*` | Result of local verification (DSSE signature, Rekor proof, policy). |
---
## 2. Write path (ingest flow)
1. **Obtain provenance metadata** for each attested artifact (build, SBOM, VEX, scan). The CI script (`scripts/publish_attestation_with_provenance.sh`) captures `envelopeDigest`, Rekor `logIndex`/`uuid`, and key info.
2. **Authority/Feedser** verify the DSSE + Rekor proof (local cosign/rekor libs or the Signer service) and set `trust.verified = true`, `trust.verifier = "Authority@stella"`, `trust.witnesses = 1`.
3. **Attach** the provenance block before appending the event to PostgreSQL, using `StellaOps.Provenance.Postgres` helpers.
4. **Backfill** historical events by resolving known subjects → attestation digests and running an update script.
### 2.1 Supplying metadata from Concelier statements
Concelier ingestion jobs can now inline provenance when they create advisory statements. Add an `AdvisoryProvenance` entry with `kind = "dsse"` (or `dsse-metadata` / `attestation-dsse`) and set `value` to the same JSON emitted by the CI snippet. `AdvisoryEventLog` and `AdvisoryMergeService` automatically parse that entry, hydrate `AdvisoryStatementInput.Provenance/Trust`, and persist the metadata alongside the statement.
```json
{
"source": "attestor",
"kind": "dsse",
"value": "{ \"dsse\": { \"envelopeDigest\": \"sha256:…\", \"payloadType\": \"application/vnd.in-toto+json\" }, \"trust\": { \"verified\": true, \"verifier\": \"Authority@stella\" } }",
"recordedAt": "2025-11-10T00:00:00Z"
}
```
Providing the metadata during ingestion keeps new statements self-contained and reduces the surface that the `/events/statements/{statementId}/provenance` endpoint needs to backfill later.
Reference helper: `src/__Libraries/StellaOps.Provenance.Postgres/ProvenancePostgresExtensions.cs`.
---
### 2.2 Advisory AI structured chunk schema (GHSA/Cisco parity)
Advisory AI consumes the canonical `Advisory` aggregate and emits structured chunks that mirror GHSA GraphQL and Cisco PSIRT provenance anchors. The response contract is:
```jsonc
{
"advisoryKey": "CVE-2025-0001",
"fingerprint": "<sha256 of canonical advisory>",
"total": 3,
"truncated": false,
"entries": [
{
"type": "workaround", // sorted by (type, observationPath, documentId)
"chunkId": "c0ffee12", // sha256(advisory.observationId + observationPath)[:16]
"content": { /* structured field */ },
"provenance": {
"documentId": "tenant-a:chunk:newest", // PostgreSQL id of backing observation
"observationPath": "/references/0", // JSON Pointer into the observation
"source": "nvd",
"kind": "workaround",
"value": "tenant-a:chunk:newest",
"recordedAt": "2025-01-07T00:00:00Z",
"fieldMask": ["/references/0"]
}
}
]
}
```
Determinism requirements:
- Order entries by `(type, observationPath, documentId)` to keep cache keys stable across nodes.
- Always include the advisory `fingerprint` in cache keys and responses.
- Preserve observation-level provenance by emitting both `documentId` and `observationPath` under `provenance`.
These anchors let Attestor/Console deep-link evidence and allow offline mirrors to prove origin without merging transforms.
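The `chunkId` derivation and sort order above can be sketched directly (the function names are illustrative):

```python
import hashlib

def chunk_id(observation_id: str, observation_path: str) -> str:
    # chunkId = first 16 hex chars of sha256(observationId + observationPath)
    return hashlib.sha256(
        (observation_id + observation_path).encode()).hexdigest()[:16]

def sort_key(entry: dict):
    # Deterministic ordering: (type, observationPath, documentId)
    p = entry["provenance"]
    return (entry["type"], p["observationPath"], p["documentId"])
```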
---
## 3. CI/CD snippet
See `scripts/publish_attestation_with_provenance.sh`:
```bash
rekor-cli upload --rekor_server "$REKOR_URL" \
--artifact "$ATTEST_PATH" --type dsse --format json > rekor-upload.json
LOG_INDEX=$(jq '.LogIndex' rekor-upload.json)
UUID=$(jq -r '.UUID' rekor-upload.json)
ENVELOPE_SHA256=$(sha256sum "$ATTEST_PATH" | awk '{print $1}')
cat > provenance-meta.json <<EOF
{
"subject": { "imageRef": "$IMAGE_REF", "digest": { "sha256": "$IMAGE_DIGEST" } },
"dsse": {
"envelopeDigest": "sha256:$ENVELOPE_SHA256",
"payloadType": "application/vnd.in-toto+json",
"key": { "keyId": "$KEY_ID", "issuer": "$KEY_ISSUER", "algo": "$KEY_ALGO" },
"rekor": { "logIndex": $LOG_INDEX, "uuid": "$UUID", "integratedTime": $(jq '.IntegratedTime' rekor-upload.json) }
}
}
EOF
```
Feedser ingests this JSON and maps it to `DsseProvenance` + `TrustInfo`.
---
## 4. PostgreSQL indexes
Create indexes to keep provenance queries fast (PostgreSQL DDL):
```sql
-- events_by_subject_kind_provenance
CREATE INDEX events_by_subject_kind_provenance
ON events (subject_digest_sha256, kind, provenance_dsse_rekor_log_index);
-- events_unproven_by_kind
CREATE INDEX events_unproven_by_kind
ON events (kind, trust_verified, provenance_dsse_rekor_log_index);
-- events_by_rekor_logindex
CREATE INDEX events_by_rekor_logindex
ON events (provenance_dsse_rekor_log_index);
-- events_by_envelope_digest (partial index for non-null values)
CREATE INDEX events_by_envelope_digest
ON events (provenance_dsse_envelope_digest)
WHERE provenance_dsse_envelope_digest IS NOT NULL;
-- events_by_ts_kind_verified
CREATE INDEX events_by_ts_kind_verified
ON events (ts DESC, kind, trust_verified);
```
Deployment options:
- **Ops script:** `psql -d stellaops_db -f ops/postgres/indices/events_provenance_indices.sql`
- **C# helper:** `PostgresIndexes.EnsureEventIndexesAsync(connection, ct)`
This section was updated as part of `PROV-INDEX-401-030` (completed 2025-11-27).
---
## 5. Query recipes
* **All proven VEX for an image digest:**
```sql
SELECT * FROM events
WHERE kind = 'VEX'
AND subject_digest_sha256 = '<digest>'
AND provenance_dsse_rekor_log_index IS NOT NULL
AND trust_verified = true;
```
* **Compliance gap (unverified data used for decisions):**
```sql
SELECT kind, COUNT(*) as count
FROM events
WHERE kind IN ('VEX', 'SBOM', 'SCAN')
AND (trust_verified IS NOT TRUE
OR provenance_dsse_rekor_log_index IS NULL)
GROUP BY kind;
```
* **Replay slice:** filter for events where `provenance.dsse.chain` covers build → sbom → scan and export referenced attestation digests.
---
## 6. Policy gates
Examples:
```yaml
rules:
- id: GATE-PROVEN-VEX
when:
all:
- kind: "VEX"
- trust.verified: true
- key.keyId in VendorAllowlist
- rekor.integratedTime <= releaseFreeze
then:
decision: ALLOW
- id: BLOCK-UNPROVEN
when:
any:
- trust.verified != true
- provenance.dsse.rekor.logIndex missing
then:
decision: FAIL
reason: "Unproven evidence influences decision; require Rekor-backed attestation."
```
---
## 7. UI nudges
* **Provenance chip** on findings/events: `Verified • Rekor#1234567 • KeyID:cosign:...` (click → inclusion proof & DSSE preview).
* Facet filter: `Provenance = Verified / Missing / Key-Policy-Mismatch`.
---
## 8. Implementation tasks
| Task ID | Scope |
|---------|-------|
| `PROV-INLINE-401-028` | Extend Authority/Feedser write-paths to attach `provenance.dsse` + `trust` blocks using `StellaOps.Provenance.Postgres`. |
| `PROV-BACKFILL-401-029` | Backfill historical events with DSSE/Rekor refs based on existing attestation digests. |
| `PROV-INDEX-401-030` | Create PostgreSQL indexes and expose helper queries for audits. |
Keep this document updated when new attestation types or mirror/witness policies land.
---
## 9. Feedser API for provenance updates
Feedser exposes a lightweight endpoint for attaching provenance after an event is recorded:
```
POST /events/statements/{statementId}/provenance
Headers: X-Stella-Tenant, Authorization (if Authority is enabled)
Body: { "dsse": { ... }, "trust": { ... } }
```
The body matches the JSON emitted by `publish_attestation_with_provenance.sh`. Feedser validates the payload, ensures `trust.verified = true`, and then calls `AttachStatementProvenanceAsync` so the DSSE metadata lands inline on the target statement. Clients receive HTTP 202 on success, 400 on malformed input, and 404 if the statement id is unknown.
---
## 10. Backfill service
`EventProvenanceBackfillService` (`src/StellaOps.Events.Postgres/EventProvenanceBackfillService.cs`) orchestrates backfilling historical events with DSSE provenance metadata.
### 10.1 Components
| Class | Purpose |
|-------|---------|
| `IAttestationResolver` | Interface for resolving attestation metadata by subject digest. |
| `EventProvenanceBackfillService` | Queries unproven events, resolves attestations, updates events. |
| `StubAttestationResolver` | Test/development stub implementation. |
### 10.2 Usage
```csharp
var resolver = new MyAttestationResolver(rekorClient, attestationRepo);
var backfillService = new EventProvenanceBackfillService(postgresConnection, resolver);
// Count unproven events
var count = await backfillService.CountUnprovenEventsAsync(
new[] { "SBOM", "VEX", "SCAN" });
// Backfill with progress reporting
var progress = new Progress<BackfillResult>(r =>
Console.WriteLine($"{r.EventId}: {r.Status}"));
var summary = await backfillService.BackfillAllAsync(
kinds: new[] { "SBOM", "VEX", "SCAN" },
limit: 1000,
progress: progress);
Console.WriteLine($"Processed: {summary.TotalProcessed}");
Console.WriteLine($"Success: {summary.SuccessCount}");
Console.WriteLine($"Not found: {summary.NotFoundCount}");
Console.WriteLine($"Errors: {summary.ErrorCount}");
```
### 10.3 Implementing IAttestationResolver
Implementations should query the attestation store (Rekor, CAS, or local PostgreSQL) by subject digest:
```csharp
public class RekorAttestationResolver : IAttestationResolver
{
private readonly IRekorClient _rekor;
private readonly IAttestationRepository _attestations;
public async Task<AttestationResolution?> ResolveAsync(
string subjectDigestSha256,
string eventKind,
CancellationToken cancellationToken)
{
// Look up attestation by subject digest
var record = await _attestations.GetAsync(subjectDigestSha256, eventKind, cancellationToken);
if (record is null) return null;
// Fetch Rekor proof if available
var proof = await _rekor.GetProofAsync(record.RekorUuid, RekorBackend.Sigstore, cancellationToken);
return new AttestationResolution
{
Dsse = new DsseProvenance { /* ... */ },
Trust = new TrustInfo { Verified = true, Verifier = "Authority@stella" },
AttestationId = record.Id
};
}
}
```
### 10.4 Reference files
- `src/StellaOps.Events.Postgres/IAttestationResolver.cs`
- `src/StellaOps.Events.Postgres/EventProvenanceBackfillService.cs`
- `src/StellaOps.Events.Postgres/StubAttestationResolver.cs`
This section was added as part of `PROV-BACKFILL-401-029` (completed 2025-11-27).

# Provenance Backfill Plan (Sprint 401)
## Artifacts available
- Attestation inventory: `docs/modules/provenance/guides/attestation-inventory-2025-11-18.ndjson`
- Subject→Rekor map: `docs/modules/provenance/guides/subject-rekor-map-2025-11-18.json`
## Procedure (deterministic)
1) Load inventory NDJSON; validate UUID/ULID and digest formats.
2) For each record, resolve Rekor entry via the subject→Rekor map; if missing, record gap and skip write.
3) Emit backfilled events to the provenance store using `scripts/publish_attestation_with_provenance.sh --mode backfill` (add `--subject` and `--rekor` arguments) with sorted input to guarantee stable ordering.
4) Log every backfilled subject + Rekor digest pair to `logs/provenance-backfill-2025-11-18.ndjson` (UTC timestamps, ISO-8601).
5) Rerun until gaps are zero; then mark PROV-BACKFILL-401-029 DONE.
## Determinism
- Sort by subject, then rekorEntry before processing.
- Use canonical JSON writer for outputs; timestamps in UTC `O` format.
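A sketch of the determinism rules above, assuming records carry `subject` and `rekorEntry` fields; the `utc_o` helper only approximates .NET's round-trip `O` format:

```python
import json
from datetime import datetime, timezone

def canonical_json(obj) -> str:
    # Canonical writer: sorted keys, no insignificant whitespace.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))

def order_records(records):
    # Sort by subject, then rekorEntry, before processing.
    return sorted(records, key=lambda r: (r["subject"], r["rekorEntry"]))

def utc_o(dt: datetime) -> str:
    # Approximation of .NET's round-trip ("O") format for UTC timestamps.
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f0Z")
```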

# Provenance & Attestation Reference
> **Imposed rule:** All exported evidence must ship with DSSE + transparency proof bundles; unsigned or proof-less artifacts are rejected at ingress and may not be stored in the Evidence Locker.
This guide explains how StellaOps generates, signs, verifies, and distributes DSSE attestations for SBOMs, policy evaluations, and runtime evidence.
## 1. Attestation Workflow (online and offline)
1. **Producer** (Scanner, Policy Engine, runtime probes) emits a payload and a request to sign.
2. **Signer** authenticates the caller, validates supply-chain policy (release integrity, image pinning), then signs using keyless or tenant KMS keys.
3. **Attestor** wraps the payload in DSSE, records it in Rekor v2 (when online), persists the bundle plus inclusion proof, and exposes a verification package API.
4. **Export Center** and **Evidence Locker** embed the bundle and proof into export artifacts for offline replay; CLI retrieves the same package via `stella attest fetch`.
5. **Verifiers** (CLI, Policy Engine, auditors) validate signature roots, Rekor proof, and optional transparency witness endorsements.
## 2. DSSE Payload Types & Schemas
Supported payload types (all versioned and protobuf/JSON dual-encoded):
- `StellaOps.BuildProvenance@1`
- `StellaOps.SBOMAttestation@1`
- `StellaOps.ScanResults@1`
- `StellaOps.PolicyEvaluation@1`
- `StellaOps.VEXAttestation@1`
- `StellaOps.RiskProfileEvidence@1`
- `StellaOps.PromotionAttestation@1` (predicate `stella.ops/promotion@v1`, see `docs/release/promotion-attestations.md`)
Schema sources: `src/Attestor/StellaOps.Attestor.Types` and module dossiers. All payloads include:
- `subject` (digest + PURL/NEVRA coordinates)
- `timestamp` (UTC, ISO-8601)
- `producer` (service + version)
- `critical` block (policy version, scanner defs, reachability context)
- `materials` (SBOM/VEX references) and optional `auxiliary_proofs`
## 3. Signing & storage controls
- **Key policy:** Short-lived OIDC keyless by default; tenant KMS allowed; Ed25519 and ECDSA P-256 supported.
- **Inclusion:** Rekor v2 UUID + log index cached; when offline, the Attestor stamps a `transparency_pending` marker to be replayed later.
- **WORM:** Evidence Locker keeps immutable copies; retention and legal hold are enforced per tenant and surfaced in `docs/modules/evidence-locker/guides/evidence-locker.md`.
- **Redaction:** Sensitive fields (secrets, PII) must be excluded at payload creation; the signer refuses payloads marked `pii=true` without a redaction ticket.
## 4. Verification workflow
Command-line (online or offline bundle):
```sh
stella attest verify \
--bundle path/to/bundle.dsse.json \
--rekor-root pubkeys/rekor.pub \
--fulcio-root pubkeys/fulcio.pub \
--certificate-chain pubkeys/issuer-chain.pem
```
Verification steps performed by services and CLI:
- Validate DSSE signature against Fulcio/tenant roots and certificate policies.
- Confirm subject digest matches expected container/image/SBOM digest.
- Check Rekor inclusion proof and (if present) transparency witness signatures.
- Enforce freshness: reject bundles older than `attestation.max_age_days` (tenant policy).
- Record verification result into Timeline events for auditability.
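The freshness check in the list above reduces to a simple age comparison; this sketch assumes `max_age_days` has already been resolved from tenant policy:

```python
from datetime import datetime, timedelta, timezone

def is_fresh(signed_at: datetime, max_age_days: int, now=None) -> bool:
    """Freshness gate: reject bundles older than attestation.max_age_days."""
    now = now or datetime.now(timezone.utc)
    return now - signed_at <= timedelta(days=max_age_days)
```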
## 5. Offline / air-gap posture
- Export Center emits self-contained bundles (`*.dsse.json`, `rekor-proof.json`, `cert-chain.pem`) plus a verification manifest for deterministic replay.
- CLI `stella attest verify --bundle bundle.dsse.json --offline` skips Rekor lookups and relies on embedded proofs.
- When connectivity returns, the Attestor replays pending `transparency_pending` entries and updates Evidence Locker indexes; Timeline events capture the replay.
## 6. References
- `docs/modules/signer/architecture.md`
- `docs/modules/attestor/architecture.md`
- `docs/modules/export-center/architecture.md`
- `docs/modules/policy/architecture.md`
- `docs/modules/telemetry/architecture.md`
- `docs/modules/evidence-locker/guides/evidence-locker.md`
- `src/Provenance/StellaOps.Provenance.Attestation`

# Risk Engine
> Risk scoring runtime with pluggable providers and explainability.
## Purpose
RiskEngine computes deterministic, explainable risk scores for vulnerabilities by aggregating signals from multiple data sources (EPSS, CVSS, KEV, VEX, reachability). It produces audit trails and explainability payloads for every scoring decision.
## Quick Links
- [Architecture](./architecture.md) - Technical design and implementation details
- [Guides](./guides/) - Scoring configuration guides
- [Samples](./samples/) - Risk profile examples
## Status
| Attribute | Value |
|-----------|-------|
| **Maturity** | Production |
| **Last Reviewed** | 2025-12-29 |
| **Maintainer** | Policy Guild |
## Key Features
- **Pluggable Providers**: EPSS, CVSS+KEV, VEX status, fix availability providers
- **Deterministic Scoring**: Same inputs produce identical scores
- **Explainability**: Audit trails for every scoring decision
- **Offline Support**: Air-gapped operation via factor bundles
## Dependencies
### Upstream (this module depends on)
- **Concelier** - CVSS, KEV data
- **Excititor** - VEX status data
- **Signals** - Reachability data
- **Authority** - Authentication
### Downstream (modules that depend on this)
- **Policy Engine** - Consumes risk scores for policy evaluation
## Configuration
```yaml
risk_engine:
providers:
- epss
- cvss_kev
- vex_gate
- fix_exposure
cache_ttl_minutes: 60
```
## Notes
RiskEngine does not make PASS/FAIL decisions. It provides scores to the Policy Engine which makes enforcement decisions.
## Related Documentation
- [Policy Architecture](../policy/architecture.md)
- [Risk Scoring Contract](../../contracts/risk-scoring.md)

# component_architecture_riskengine.md - **Stella Ops RiskEngine** (2025Q4)
> Risk scoring runtime with pluggable providers and explainability.
> **Scope.** Implementation-ready architecture for **RiskEngine**: the scoring runtime that computes Risk Scoring Profiles across deployments while preserving provenance and explainability. Covers scoring workers, providers, caching, and integration with Policy Engine.
---
## 0) Mission & boundaries
**Mission.** Compute **deterministic, explainable risk scores** for vulnerabilities by aggregating signals from multiple data sources (EPSS, CVSS, KEV, VEX, reachability). Produce audit trails and explainability payloads for every scoring decision.
**Boundaries.**
* RiskEngine **does not** make PASS/FAIL decisions. It provides scores to the Policy Engine.
* RiskEngine **does not** own vulnerability data. It consumes from Concelier, Excititor, and Signals.
* Scoring is **deterministic**: same inputs produce identical scores.
* Supports **offline/air-gapped** operation via factor bundles.
---
## 1) Solution & project layout
```
src/RiskEngine/StellaOps.RiskEngine/
├─ StellaOps.RiskEngine.Core/ # Scoring orchestrators, provider contracts
│ ├─ Providers/
│ │ ├─ IRiskScoreProvider.cs # Provider interface
│ │ ├─ EpssProvider.cs # EPSS score provider
│ │ ├─ CvssKevProvider.cs # CVSS + KEV provider
│ │ ├─ VexGateProvider.cs # VEX status provider
│ │ ├─ FixExposureProvider.cs # Fix availability provider
│ │ └─ DefaultTransformsProvider.cs # Score transformations
│ ├─ Contracts/
│ │ ├─ ScoreRequest.cs # Scoring request DTO
│ │ └─ RiskScoreResult.cs # Scoring result with explanation
│ └─ Services/
│ ├─ RiskScoreWorker.cs # Scoring job executor
│ └─ RiskScoreQueue.cs # Job queue management
├─ StellaOps.RiskEngine.Infrastructure/ # Persistence, caching, connectors
│ └─ Stores/
│ └─ InMemoryRiskScoreResultStore.cs
├─ StellaOps.RiskEngine.WebService/ # REST API for jobs and results
│ └─ Program.cs
├─ StellaOps.RiskEngine.Worker/ # Background scoring workers
│ ├─ Program.cs
│ └─ Worker.cs
└─ StellaOps.RiskEngine.Tests/ # Unit and integration tests
```
---
## 2) External dependencies
* **PostgreSQL** - Score persistence and job state
* **Concelier** - Vulnerability advisory data, EPSS scores
* **Excititor** - VEX statements
* **Signals** - Reachability and runtime signals
* **Policy Engine** - Consumes risk scores for decision-making
* **Authority** - Authentication and authorization
* **Valkey/Redis** - Score caching (optional)
---
## 3) Contracts & data model
### 3.1 ScoreRequest
```csharp
public sealed record ScoreRequest
{
public required string VulnerabilityId { get; init; } // CVE or vuln ID
public required string ArtifactId { get; init; } // PURL or component ID
public string? TenantId { get; init; }
public string? ContextId { get; init; } // Scan or assessment ID
public IReadOnlyList<string>? EnabledProviders { get; init; }
}
```
### 3.2 RiskScoreResult
```csharp
public sealed record RiskScoreResult
{
public required string RequestId { get; init; }
public required decimal FinalScore { get; init; } // 0.0-10.0
public required string Tier { get; init; } // Critical/High/Medium/Low/Info
public required DateTimeOffset ComputedAt { get; init; }
public required IReadOnlyList<ProviderContribution> Contributions { get; init; }
public required ExplainabilityPayload Explanation { get; init; }
}
public sealed record ProviderContribution
{
public required string ProviderId { get; init; }
public required decimal RawScore { get; init; }
public required decimal Weight { get; init; }
public required decimal WeightedScore { get; init; }
public string? FactorSource { get; init; } // Where data came from
public DateTimeOffset? FactorTimestamp { get; init; } // When factor was computed
}
```
### 3.3 Provider Interface
```csharp
public interface IRiskScoreProvider
{
string ProviderId { get; }
decimal DefaultWeight { get; }
TimeSpan CacheTtl { get; }
Task<ProviderResult> ComputeAsync(
ScoreRequest request,
CancellationToken ct);
Task<bool> IsHealthyAsync(CancellationToken ct);
}
```
---
## 4) Score Providers
### 4.1 Built-in Providers
| Provider | Data Source | Weight | Description |
|----------|-------------|--------|-------------|
| `epss` | Concelier/EPSS | 0.25 | EPSS probability score (0-1 → 0-10) |
| `cvss-kev` | Concelier | 0.30 | CVSS base + KEV boost |
| `vex-gate` | Excititor | 0.20 | VEX status (affected/not_affected) |
| `fix-exposure` | Concelier | 0.15 | Fix availability window |
| `reachability` | Signals | 0.10 | Code path reachability |
### 4.2 Score Computation
```
FinalScore = Σ(provider.weight × provider.score) / Σ(provider.weight)
Tier mapping:
9.0-10.0 → Critical
7.0-8.9 → High
4.0-6.9 → Medium
1.0-3.9 → Low
0.0-0.9 → Info
```
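The aggregation formula and tier boundaries above translate directly into a small sketch (function names are illustrative):

```python
def final_score(contributions) -> float:
    """contributions: (weight, score) pairs. Weighted mean per the formula above."""
    total_weight = sum(w for w, _ in contributions)
    return sum(w * s for w, s in contributions) / total_weight

def tier(score: float) -> str:
    # Tier boundaries from the mapping above.
    if score >= 9.0:
        return "Critical"
    if score >= 7.0:
        return "High"
    if score >= 4.0:
        return "Medium"
    if score >= 1.0:
        return "Low"
    return "Info"
```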
### 4.3 Provider Data Sources
```csharp
public interface IEpssSources
{
Task<EpssScore?> GetScoreAsync(string cveId, CancellationToken ct);
}
public interface ICvssKevSources
{
Task<CvssData?> GetCvssAsync(string cveId, CancellationToken ct);
Task<bool> IsKevAsync(string cveId, CancellationToken ct);
}
```
---
## 4.4 Exploit Maturity Service
The **ExploitMaturityService** consolidates multiple exploitation signals into a unified maturity level for risk prioritization.
### Maturity Taxonomy
| Level | Description | Evidence |
|-------|-------------|----------|
| `Unknown` | No exploitation intelligence available | No signals or below thresholds |
| `Theoretical` | Exploit theoretically possible | Low EPSS (<10%) |
| `ProofOfConcept` | PoC exploit exists | Moderate EPSS (10-40%) |
| `Active` | Active exploitation observed | High EPSS (40-80%), in-the-wild reports |
| `Weaponized` | Weaponized exploit in campaigns | Very high EPSS (>80%), KEV listing |
### Signal Sources
```csharp
public interface IExploitMaturityService
{
Task<ExploitMaturityResult> AssessMaturityAsync(string cveId, CancellationToken ct);
Task<ExploitMaturityLevel?> GetMaturityLevelAsync(string cveId, CancellationToken ct);
Task<IReadOnlyList<MaturityHistoryEntry>> GetMaturityHistoryAsync(string cveId, CancellationToken ct);
}
```
**Signal aggregation:**
1. **EPSS** - Maps probability score to maturity level via thresholds
2. **KEV** - CISA Known Exploited Vulnerabilities → `Weaponized`
3. **InTheWild** - Threat intel feeds → `Active`
### EPSS Threshold Mapping
| EPSS Score | Maturity Level |
|------------|----------------|
| ≥ 0.80 | Weaponized |
| ≥ 0.40 | Active |
| ≥ 0.10 | ProofOfConcept |
| ≥ 0.01 | Theoretical |
| < 0.01 | Unknown |
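The threshold table above maps mechanically to a lookup (the function name is illustrative):

```python
def maturity_from_epss(epss: float) -> str:
    # Thresholds per the EPSS mapping table above.
    if epss >= 0.80:
        return "Weaponized"
    if epss >= 0.40:
        return "Active"
    if epss >= 0.10:
        return "ProofOfConcept"
    if epss >= 0.01:
        return "Theoretical"
    return "Unknown"
```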
### Exploit Maturity API Endpoints
```
GET /exploit-maturity/{cveId} → ExploitMaturityResult
GET /exploit-maturity/{cveId}/level → { level: "Active" }
GET /exploit-maturity/{cveId}/history → { entries: [...] }
POST /exploit-maturity/batch { cveIds: [...] } → { results: [...] }
```
---
## 5) REST API (RiskEngine.WebService)
All under `/api/v1/risk`. Auth: **OpTok**.
```
POST /scores { request: ScoreRequest } → { jobId }
GET /scores/{jobId} → { result: RiskScoreResult, status }
GET /scores/{jobId}/explain → { explanation: ExplainabilityPayload }
POST /batch { requests: ScoreRequest[] } → { batchId }
GET /batch/{batchId} → { results: RiskScoreResult[], status }
GET /providers → { providers: ProviderInfo[] }
GET /providers/{id}/health → { healthy: bool, lastCheck }
GET /healthz | /readyz | /metrics
```
---
## 6) Configuration (YAML)
```yaml
RiskEngine:
Postgres:
ConnectionString: "Host=postgres;Database=risk;..."
Cache:
Enabled: true
Provider: "valkey"
ConnectionString: "redis://valkey:6379" # Valkey (Redis-compatible)
DefaultTtl: "00:15:00"
Providers:
Epss:
Enabled: true
Weight: 0.25
CacheTtl: "01:00:00"
Source: "concelier"
CvssKev:
Enabled: true
Weight: 0.30
KevBoost: 2.0
VexGate:
Enabled: true
Weight: 0.20
NotAffectedScore: 0.0
AffectedScore: 10.0
FixExposure:
Enabled: true
Weight: 0.15
NoFixPenalty: 1.5
Reachability:
Enabled: true
Weight: 0.10
UnreachableDiscount: 0.5
Worker:
Concurrency: 4
BatchSize: 100
PollInterval: "00:00:05"
Offline:
FactorBundlePath: "/data/risk-factors"
AllowStaleData: true
MaxStalenessHours: 168
```
---
## 7) Security & compliance
* **AuthN/Z**: Authority-issued OpToks with `risk.score` scope
* **Tenant isolation**: Scores scoped by tenant ID
* **Audit trail**: All scoring decisions logged with inputs and factors
* **No PII**: Only vulnerability and artifact identifiers processed
---
## 8) Performance targets
* **Single score**: < 100ms P95 (cached factors)
* **Batch scoring**: < 500ms P95 for 100 items
* **Provider health check**: < 1s timeout
* **Cache hit rate**: > 80% for repeated CVEs
---
## 9) Observability
**Metrics:**
* `risk.scores.computed_total{tier,provider}`
* `risk.scores.duration_seconds`
* `risk.providers.health{provider,status}`
* `risk.cache.hits_total` / `risk.cache.misses_total`
* `risk.batch.size_histogram`
**Tracing:** Spans for each provider contribution, cache operations, and aggregation.
**Logs:** Structured logs with `cve_id`, `artifact_id`, `tenant`, `final_score`.
---
## 10) Testing matrix
* **Provider tests**: Each provider returns expected scores for fixture data
* **Aggregation tests**: Weighted combination produces correct final score
* **Determinism tests**: Same inputs produce identical scores
* **Cache tests**: Cache hit/miss behavior correct
* **Offline tests**: Factor bundles load and score correctly
* **Integration tests**: Full scoring pipeline with mocked data sources
---
## 11) Offline/Air-Gap Support
### Factor Bundles
Pre-computed factor data for offline operation:
```
/data/risk-factors/
├─ epss/
│ └─ epss-2025-01-15.json.gz
├─ cvss/
│ └─ cvss-2025-01-15.json.gz
├─ kev/
│ └─ kev-2025-01-15.json
└─ manifest.json
```
### Staleness Handling
When operating offline, scores include staleness indicators:
```json
{
"finalScore": 7.2,
"dataFreshness": {
"epss": { "age": "48h", "stale": false },
"kev": { "age": "24h", "stale": false }
}
}
```
---
## Related Documentation
* Policy scoring: `../policy/architecture.md`
* Concelier feeds: `../concelier/architecture.md`
* Excititor VEX: `../excititor/architecture.md`
* Signals reachability: `../signals/architecture.md`

View File

@@ -0,0 +1,296 @@
# Risk Engine FixChain Integration
> **Sprint:** SPRINT_20260110_012_007_RISK
> **Last Updated:** 10-Jan-2026
## Overview
The Risk Engine FixChain integration enables automatic risk score adjustment based on verified fix status from FixChain attestations. When a vulnerability has a verified fix, the risk score is reduced proportionally to the verification confidence level.
## Why This Matters
| Current State | With FixChain Integration |
|---------------|---------------------------|
| Risk scores ignore fix verification | Fix confidence reduces risk |
| Binary matches = always vulnerable | Verified fixes lower severity |
| No credit for patched backports | Backport fixes recognized |
| Manual risk exceptions needed | Automatic risk adjustment |
## Risk Adjustment Model
### Verdict to Risk Modifier Mapping
| Verdict | Confidence | Risk Modifier | Rationale |
|---------|------------|---------------|-----------|
| `fixed` | >= 95% | -80% to -90% | High-confidence verified fix |
| `fixed` | 85-95% | -60% to -80% | Verified fix, some uncertainty |
| `fixed` | 70-85% | -40% to -60% | Likely fixed, needs confirmation |
| `fixed` | 60-70% | -20% to -40% | Possible fix, low confidence |
| `fixed` | < 60% | 0% | Below threshold, no adjustment |
| `partial` | >= 60% | -25% to -50% | Partial fix applied |
| `inconclusive` | any | 0% | Cannot determine, conservative |
| `still_vulnerable` | any | 0% | No fix detected |
| No attestation | N/A | 0% | No verification performed |
### Modifier Formula
```
AdjustedRisk = BaseRisk * (1 - (Modifier * ConfidenceWeight))
Where:
Modifier = verdict-based modifier from table above
ConfidenceWeight = min(1.0, (Confidence - MinThreshold) / (1.0 - MinThreshold))
```
### Example Calculation
```
CVE-2024-0727 on pkg:deb/debian/openssl@3.0.11-1~deb12u2:
BaseRisk = 8.5 (HIGH)
FixChain Verdict = "fixed"
FixChain Confidence = 0.97
Modifier = 0.90 (high confidence tier)
ConfidenceWeight = (0.97 - 0.60) / (1.0 - 0.60) = 0.925
AdjustedRisk = 8.5 * (1 - 0.90 * 0.925) = 8.5 * 0.1675 = 1.42 (LOW)
```
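The calculation above can be sketched directly from the modifier formula (illustrative Python; the 0.60 minimum threshold and 0.90 high-confidence modifier mirror the tables in this document):

```python
# Sketch of the FixChain risk adjustment formula from this guide.
MIN_THRESHOLD = 0.60

def confidence_weight(confidence: float, min_threshold: float = MIN_THRESHOLD) -> float:
    # ConfidenceWeight = min(1.0, (Confidence - MinThreshold) / (1.0 - MinThreshold))
    return min(1.0, (confidence - min_threshold) / (1.0 - min_threshold))

def adjusted_risk(base_risk: float, modifier: float, confidence: float) -> float:
    if confidence < MIN_THRESHOLD:
        return base_risk  # below threshold: no adjustment
    return base_risk * (1 - modifier * confidence_weight(confidence))

# CVE-2024-0727 example: "fixed" verdict at 97% confidence.
print(round(adjusted_risk(8.5, 0.90, 0.97), 2))  # 1.42
```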
## Components
### IFixChainRiskProvider
Main interface for FixChain risk integration:
```csharp
public interface IFixChainRiskProvider
{
Task<FixVerificationStatus?> GetFixStatusAsync(
string cveId,
string binarySha256,
string? componentPurl = null,
CancellationToken ct = default);
double ComputeRiskAdjustment(FixVerificationStatus status);
FixChainRiskFactor CreateRiskFactor(FixVerificationStatus status);
}
```
### FixChainRiskProvider
Implementation that:
1. Queries the attestation store for FixChain predicates
2. Computes risk adjustment based on verdict and confidence
3. Creates structured risk factors for UI display
### IFixChainAttestationClient
Client for querying attestations:
```csharp
public interface IFixChainAttestationClient
{
Task<FixChainAttestationData?> GetFixChainAsync(
string cveId,
string binarySha256,
string? componentPurl = null,
CancellationToken ct = default);
Task<ImmutableArray<FixChainAttestationData>> GetForComponentAsync(
string componentPurl,
CancellationToken ct = default);
}
```
## Configuration
### YAML Configuration
```yaml
RiskEngine:
Providers:
FixChain:
Enabled: true
HighConfidenceThreshold: 0.95
MediumConfidenceThreshold: 0.85
LowConfidenceThreshold: 0.70
MinConfidenceThreshold: 0.60
FixedReduction: 0.90
PartialReduction: 0.50
MaxRiskReduction: 0.90
CacheMaxAgeHours: 24
```
### Service Registration
```csharp
services.AddOptions<FixChainRiskOptions>()
.Bind(config.GetSection("RiskEngine:Providers:FixChain"))
.ValidateDataAnnotations()
.ValidateOnStart();
services.AddSingleton<IFixChainRiskProvider, FixChainRiskProvider>();
services.AddHttpClient<IFixChainAttestationClient, FixChainAttestationClient>();
```
## Usage
### Getting Fix Status
```csharp
var provider = services.GetRequiredService<IFixChainRiskProvider>();
var status = await provider.GetFixStatusAsync(
"CVE-2024-0727",
binarySha256,
componentPurl);
if (status is not null)
{
var adjustment = provider.ComputeRiskAdjustment(status);
var adjustedRisk = baseRisk * adjustment;
}
```
### Creating Risk Factors
```csharp
var status = await provider.GetFixStatusAsync(cveId, binarySha256);
if (status is not null)
{
var factor = provider.CreateRiskFactor(status);
// For UI display
var display = factor.ToDisplay();
var badge = factor.ToBadge();
var summary = factor.ToSummary();
}
```
### Signal-Based Scoring
For batch processing via signals:
```csharp
var signals = new Dictionary<string, double>
{
[FixChainRiskProvider.SignalFixConfidence] = 0.95,
[FixChainRiskProvider.SignalFixStatus] = FixChainRiskProvider.EncodeStatus("fixed")
};
var request = new ScoreRequest("fixchain", subject, signals);
var adjustment = await provider.ScoreAsync(request, ct);
```
## Metrics
The integration exposes the following OpenTelemetry metrics:
| Metric | Type | Description |
|--------|------|-------------|
| `risk_fixchain_lookups_total` | Counter | Total attestation lookups |
| `risk_fixchain_hits_total` | Counter | Attestations found |
| `risk_fixchain_misses_total` | Counter | Lookups with no attestation |
| `risk_fixchain_cache_hits_total` | Counter | Lookups served from cache |
| `risk_fixchain_lookup_duration_seconds` | Histogram | Lookup duration |
| `risk_fixchain_adjustments_total` | Counter | Risk adjustments applied |
| `risk_fixchain_reduction_percent` | Histogram | Reduction percentage distribution |
| `risk_fixchain_errors_total` | Counter | Lookup errors |
### Recording Metrics
```csharp
// Automatically recorded by the provider, or manually:
FixChainRiskMetrics.RecordLookup(
found: true,
fromCache: false,
durationSeconds: 0.05,
verdict: "fixed");
FixChainRiskMetrics.RecordAdjustment(
verdict: FixChainVerdictStatus.Fixed,
confidence: 0.95m,
reductionPercent: 0.80);
```
## UI Integration
### Display Model
```csharp
var display = factor.ToDisplay();
// display.Label = "Fix Verification"
// display.Value = "Fixed (95% confidence)"
// display.Impact = -0.80
// display.ImpactDirection = "decrease"
// display.EvidenceRef = "fixchain://sha256:..."
// display.Details = { verdict, confidence, verified_at, ... }
```
### Badge Component
```csharp
var badge = factor.ToBadge();
// badge.Status = "Fixed"
// badge.Color = "green"
// badge.Icon = "check-circle"
// badge.Confidence = 0.95m
// badge.Tooltip = "Verified fix (95% confidence)"
```
## Testing
### Unit Tests
```csharp
[Fact]
public void FixedVerdict_HighConfidence_ReturnsLowRisk()
{
var provider = new FixChainRiskProvider(options);
var status = new FixVerificationStatus
{
Verdict = "fixed",
Confidence = 0.97m,
VerifiedAt = DateTimeOffset.UtcNow,
AttestationDigest = "sha256:test"
};
var adjustment = provider.ComputeRiskAdjustment(status);
adjustment.Should().BeLessThan(0.3);
}
```
### Integration Tests
```csharp
[Fact]
public async Task FullWorkflow_FixedVerdict_ReducesRisk()
{
var attestationClient = new InMemoryFixChainAttestationClient();
attestationClient.AddAttestation(cveId, binarySha256, attestation);
var provider = new FixChainRiskProvider(options, attestationClient, logger);
var status = await provider.GetFixStatusAsync(cveId, binarySha256);
status.Should().NotBeNull();
status!.Verdict.Should().Be("fixed");
}
```
## Decisions and Trade-offs
| Decision | Rationale |
|----------|-----------|
| Conservative thresholds | Start high, can lower based on accuracy data |
| No automatic upgrade | Inconclusive doesn't increase risk |
| Cache TTL 30 minutes | Balances freshness vs. performance |
| Attestation required | No reduction without verifiable evidence |
| Minimum confidence 60% | Below this, evidence is too weak for adjustment |
## Related Documentation
- [FixChain Attestation Predicate](../attestor/fix-chain-predicate.md)
- [Golden Set Schema](../binary-index/golden-set-schema.md)
- [Risk Engine Architecture](./architecture.md)


@@ -0,0 +1,45 @@
# Risk API
> Based on `CONTRACT-RISK-SCORING-002` (2025-12-05). Examples are frozen in `docs/modules/risk-engine/samples/api/risk-api-samples.json` with hashes in `SHA256SUMS`. Keep ETags and error payloads deterministic.
## Purpose
- Document risk-related endpoints for profile management, simulation, scoring results, explainability retrieval, and export.
## Scope & Audience
- Audience: API consumers, SDK authors, platform integrators.
- In scope: endpoint list, methods, request/response schemas, auth/tenancy headers, rate limits, feature flags, error model.
- Out of scope: console/UI workflow details (see `explainability.md`).
## Endpoints (v1)
- `POST /api/v1/risk/jobs` — submit scoring job (body: job request); returns `202` with `job_id` and `status` (`queued`). Sample: `risk-api-samples.json#submit_job_request`.
- `GET /api/v1/risk/jobs/{job_id}` — job status + results array (sample: `get_job_status`).
- `GET /api/v1/risk/explain/{job_id}` — explainability payload (sample references `../explain/explain-trace.json`). Optional `If-None-Match` for caching.
- `GET /api/v1/risk/profiles` — list profiles (tenant-filtered); includes `profile_hash`, `version`, `etag` (see error-catalog headers).
- `POST /api/v1/risk/profiles` — create/update profile with DSSE/attestation metadata; returns `201` with `etag`.
- `POST /api/v1/risk/simulations` — dry-run scoring with fixtures; returns explain + contributions without persisting results.
- `GET /api/v1/risk/export/{job_id}` — export bundle (JSON + CSV + manifest) for auditors.
- Feature flags: `risk.jobs`, `risk.explain`, `risk.simulations`, `risk.export` (toggle exposure per tenant).
## Auth & Tenancy
- Required headers: `X-Stella-Tenant`, `Authorization: Bearer <token>`, optional `X-Stella-Scope` for imposed rule reminders.
- Imposed rule reminder must be present in responses where tenant-bound resources are returned.
## Error Model
- Envelope: `code`, `message`, `correlation_id`, `severity`, `remediation`; sample catalog in `docs/modules/risk-engine/samples/api/error-catalog.json`.
- Rate-limit headers: `Retry-After`, `X-RateLimit-Remaining`; caching headers include `ETag` for explain/results/profile GETs.
## Determinism & Offline Posture
- Samples: `docs/modules/risk-engine/samples/api/risk-api-samples.json` (hashes in `SHA256SUMS`); explain sample reused via relative reference.
- No live dependencies; use frozen fixtures. Keep ordering of fields stable in docs and samples.
## Open Items
- Add ETag examples for profile list/create once generators emit them.
- Populate error/code catalog and SDK targets once available.
- Align feature flag names with deployment config.
## References
- `docs/modules/risk-engine/guides/overview.md`
- `docs/modules/risk-engine/guides/profiles.md`
- `docs/modules/risk-engine/guides/factors.md`
- `docs/modules/risk-engine/guides/formulas.md`
- `docs/modules/risk-engine/guides/explainability.md`


@@ -0,0 +1,817 @@
# EPSS v4 Integration Guide
## Overview
EPSS (Exploit Prediction Scoring System) v4 is a machine learning-based vulnerability scoring system developed by FIRST.org that predicts the probability a CVE will be exploited in the wild within the next 30 days. StellaOps integrates EPSS as a **probabilistic threat signal** alongside CVSS v4's **deterministic severity assessment**, enabling more accurate vulnerability prioritization.
**Key Concepts**:
- **EPSS Score**: Probability (0.0-1.0) that a CVE will be exploited in next 30 days
- **EPSS Percentile**: Ranking (0.0-1.0) of this CVE relative to all scored CVEs
- **Model Date**: Date for which EPSS scores were computed
- **Immutable at-scan**: EPSS evidence captured at scan time never changes (deterministic replay)
- **Current EPSS**: Live projection for triage (updated daily)
---
## EPSS Versioning Clarification
> **Note on "EPSS v4" Terminology**
>
> The term "EPSS v4" used in this document is a conceptual identifier aligning with CVSS v4 integration, **not** an official FIRST.org version number. FIRST.org's EPSS does not use explicit version numbers like "v1", "v2", etc.
>
> **How EPSS Versioning Actually Works:**
> - EPSS models are identified by **model_date** (e.g., `2025-12-16`)
> - Each daily CSV release represents a new model trained on updated threat data
> - The EPSS specification itself evolves without formal version increments
>
> **StellaOps Implementation:**
> - Tracks `model_date` for each EPSS score ingested
> - Does not assume a formal EPSS version number
> - Evidence replay uses the `model_date` from scan time
>
> For authoritative EPSS methodology, see: [FIRST.org EPSS Documentation](https://www.first.org/epss/)
---
## How EPSS Works
EPSS uses machine learning to predict exploitation probability based on:
1. **Vulnerability Characteristics**: CVSS metrics, CWE, affected products
2. **Social Signals**: Twitter/GitHub mentions, security blog posts
3. **Exploit Database Entries**: Exploit-DB, Metasploit, etc.
4. **Historical Exploitation**: Past exploitation patterns
EPSS is updated **daily** by FIRST.org based on fresh threat intelligence.
### EPSS vs CVSS
| Dimension | CVSS v4 | EPSS v4 |
|-----------|---------|---------|
| **Nature** | Deterministic severity | Probabilistic threat |
| **Scale** | 0.0-10.0 (severity) | 0.0-1.0 (probability) |
| **Update Frequency** | Static (per CVE version) | Daily (live threat data) |
| **Purpose** | Impact assessment | Likelihood assessment |
| **Source** | Vendor/NVD | FIRST.org ML model |
**Example**:
- **CVE-2024-1234**: CVSS 9.8 (Critical) + EPSS 0.01 (1st percentile)
- Interpretation: Severe impact if exploited, but very unlikely to be exploited
- Priority: **Medium** (deprioritize despite high CVSS)
- **CVE-2024-5678**: CVSS 6.5 (Medium) + EPSS 0.95 (98th percentile)
- Interpretation: Moderate impact, but actively being exploited
- Priority: **High** (escalate despite moderate CVSS)
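The two interpretations above can be sketched as a simple prioritization rule (illustrative Python; the band names and cut-offs are assumptions for this example, not StellaOps policy):

```python
# Toy prioritization combining severity (CVSS) and likelihood (EPSS percentile
# expressed as a fraction). Cut-offs are illustrative assumptions.
def priority(cvss: float, epss_percentile: float) -> str:
    high_severity = cvss >= 7.0
    high_likelihood = epss_percentile >= 0.90
    if high_severity and high_likelihood:
        return "critical"
    if high_likelihood:
        return "high"      # escalate despite moderate CVSS
    if high_severity:
        return "medium"    # deprioritize despite high CVSS
    return "low"

print(priority(9.8, 0.01))  # medium
print(priority(6.5, 0.98))  # high
```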
---
## Architecture Overview
### Data Flow
```
┌────────────────────────────────────────────────────────────────┐
│ EPSS Data Lifecycle in StellaOps │
└────────────────────────────────────────────────────────────────┘
1. INGESTION (Daily 00:05 UTC)
┌───────────────────┐
│ FIRST.org │ Daily CSV: epss_scores-YYYY-MM-DD.csv.gz
│ (300k CVEs) │ ~15MB compressed
└────────┬──────────┘
┌───────────────────────────────────────────────────────────┐
│ Concelier: EpssIngestJob │
│ - Download/Import CSV │
│ - Parse (handle # comment, validate bounds) │
│ - Bulk insert: epss_scores (partitioned by month) │
│ - Compute delta: epss_changes (flags for enrichment) │
│ - Upsert: epss_current (latest projection) │
│ - Emit event: "epss.updated" │
└────────┬──────────────────────────────────────────────────┘
[PostgreSQL: concelier.epss_*]
├─────────────────────────────┐
│ │
▼ ▼
2. AT-SCAN CAPTURE (Immutable Evidence)
┌────────────────────────────────────────────────────────────┐
│ Scanner: On new scan │
│ - Bulk query: epss_current for CVE list │
│ - Store immutable evidence: │
│ * epss_score_at_scan │
│ * epss_percentile_at_scan │
│ * epss_model_date_at_scan │
│ * epss_import_run_id_at_scan │
│ - Use in lattice decision (SR→CR if EPSS≥90th) │
└─────────────────────────────────────────────────────────────┘
3. LIVE ENRICHMENT (Existing Findings)
┌─────────────────────────────────────────────────────────────┐
│ Concelier: EpssEnrichmentJob (on "epss.updated") │
│ - Read: epss_changes WHERE flags IN (CROSSED_HIGH, BIG_JUMP)│
│ - Find impacted: vuln_instance_triage BY cve_id │
│ - Update: current_epss_score, current_epss_percentile │
│ - If priority band changed → emit "vuln.priority.changed" │
└────────┬────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Notify: On "vuln.priority.changed" │
│ - Check tenant notification rules │
│ - Send: Slack / Email / Teams / In-app │
│ - Payload: EPSS delta, threshold crossed │
└─────────────────────────────────────────────────────────────┘
4. POLICY SCORING
┌─────────────────────────────────────────────────────────────┐
│ Policy Engine: Risk Score Formula │
│ risk_score = (cvss/10) + epss_bonus + kev_bonus + reach_mult│
│ │
│ EPSS Bonus (Simple Profile): │
│ - Percentile ≥99th: +10% │
│ - Percentile ≥90th: +5% │
│ - Percentile ≥50th: +2% │
│ - Percentile <50th: 0% │
│ │
│ VEX Lattice Rules: │
│ - SR + EPSS≥90th → Escalate to CR (Confirmed Reachable) │
│ - DV + EPSS≥95th → Flag for review (vendor denial) │
│ - U + EPSS≥95th → Prioritize for reachability analysis │
└─────────────────────────────────────────────────────────────┘
```
### Database Schema
**Location**: `concelier` database
#### epss_import_runs (Provenance)
Tracks each EPSS import with full provenance for audit trail.
```sql
CREATE TABLE concelier.epss_import_runs (
import_run_id UUID PRIMARY KEY,
model_date DATE NOT NULL UNIQUE,
source_uri TEXT NOT NULL,
file_sha256 TEXT NOT NULL,
row_count INT NOT NULL,
model_version_tag TEXT NULL,
published_date DATE NULL,
status TEXT NOT NULL, -- IN_PROGRESS, SUCCEEDED, FAILED
created_at TIMESTAMPTZ NOT NULL
);
```
#### epss_scores (Time-Series, Partitioned)
Immutable append-only history of daily EPSS scores.
```sql
CREATE TABLE concelier.epss_scores (
model_date DATE NOT NULL,
cve_id TEXT NOT NULL,
epss_score DOUBLE PRECISION NOT NULL,
percentile DOUBLE PRECISION NOT NULL,
import_run_id UUID NOT NULL,
PRIMARY KEY (model_date, cve_id)
) PARTITION BY RANGE (model_date);
```
**Partitions**: Monthly (e.g., `epss_scores_2025_12`)
#### epss_current (Latest Projection)
Materialized view of latest EPSS score per CVE for fast lookups.
```sql
CREATE TABLE concelier.epss_current (
cve_id TEXT PRIMARY KEY,
epss_score DOUBLE PRECISION NOT NULL,
percentile DOUBLE PRECISION NOT NULL,
model_date DATE NOT NULL,
import_run_id UUID NOT NULL,
updated_at TIMESTAMPTZ NOT NULL
);
```
**Usage**: Scanner bulk queries this table for new scans.
#### epss_changes (Delta Tracking, Partitioned)
Tracks material EPSS changes for targeted enrichment.
```sql
CREATE TABLE concelier.epss_changes (
model_date DATE NOT NULL,
cve_id TEXT NOT NULL,
old_score DOUBLE PRECISION NULL,
new_score DOUBLE PRECISION NOT NULL,
delta_score DOUBLE PRECISION NULL,
old_percentile DOUBLE PRECISION NULL,
new_percentile DOUBLE PRECISION NOT NULL,
delta_percentile DOUBLE PRECISION NULL,
flags INT NOT NULL, -- Bitmask
PRIMARY KEY (model_date, cve_id)
) PARTITION BY RANGE (model_date);
```
**Flags** (bitmask):
- `1` = NEW_SCORED (CVE newly appeared)
- `2` = CROSSED_HIGH (percentile ≥95th)
- `4` = BIG_JUMP (|Δscore| ≥0.10)
- `8` = DROPPED_LOW (percentile <50th)
- `16` = SCORE_INCREASED
- `32` = SCORE_DECREASED
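A minimal sketch of decoding the `flags` bitmask above (illustrative Python):

```python
# Decode the epss_changes flags bitmask into its named components.
FLAGS = {
    1: "NEW_SCORED",
    2: "CROSSED_HIGH",
    4: "BIG_JUMP",
    8: "DROPPED_LOW",
    16: "SCORE_INCREASED",
    32: "SCORE_DECREASED",
}

def decode_flags(flags: int) -> list[str]:
    return [name for bit, name in sorted(FLAGS.items()) if flags & bit]

print(decode_flags(2 | 4 | 16))  # ['CROSSED_HIGH', 'BIG_JUMP', 'SCORE_INCREASED']
```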
---
## Configuration
### Scheduler Configuration
**File**: `etc/scheduler.yaml`
```yaml
scheduler:
jobs:
- name: epss.ingest
schedule: "0 5 0 * * *" # Daily at 00:05 UTC
worker: concelier
args:
source: online
date: null # Auto: yesterday
timeout: 600s
retry:
max_attempts: 3
backoff: exponential
```
### Concelier Configuration
**File**: `etc/concelier.yaml`
```yaml
concelier:
epss:
enabled: true
online_source:
base_url: "https://epss.empiricalsecurity.com/"
url_pattern: "epss_scores-{date:yyyy-MM-dd}.csv.gz"
timeout: 180s
bundle_source:
path: "/opt/stellaops/bundles/epss/"
thresholds:
high_percentile: 0.95 # Top 5%
high_score: 0.50 # 50% probability
big_jump_delta: 0.10 # 10 percentage points
low_percentile: 0.50 # Median
enrichment:
enabled: true
batch_size: 1000
flags_to_process:
- NEW_SCORED
- CROSSED_HIGH
- BIG_JUMP
```
### Scanner Configuration
**File**: `etc/scanner.yaml`
```yaml
scanner:
epss:
enabled: true
provider: postgres
cache_ttl: 3600
fallback_on_missing: unknown # Options: unknown, zero, skip
```
### Policy Configuration
**File**: `etc/policy.yaml`
```yaml
policy:
scoring:
epss:
enabled: true
profile: simple # Options: simple, advanced, custom
simple_bonuses:
percentile_99: 0.10 # +10%
percentile_90: 0.05 # +5%
percentile_50: 0.02 # +2%
lattice:
epss_escalation:
enabled: true
sr_to_cr_threshold: 0.90 # SR→CR if EPSS≥90th percentile
```
---
## Daily Operation
### Automated Ingestion
EPSS data is ingested automatically daily at **00:05 UTC** via Scheduler.
**Workflow**:
1. Scheduler triggers `epss.ingest` job at 00:05 UTC
2. Concelier downloads `epss_scores-YYYY-MM-DD.csv.gz` from FIRST.org
3. CSV parsed (leading comment line holds metadata; remaining rows hold scores)
4. Bulk insert into `epss_scores` partition (NpgsqlBinaryImporter)
5. Compute delta: `epss_changes` (compare vs `epss_current`)
6. Upsert `epss_current` (latest projection)
7. Emit `epss.updated` event
8. Enrichment job updates impacted vulnerability instances
9. Notifications sent if priority bands changed
**Monitoring**:
```bash
# Check latest model date
stellaops epss status
# Output:
# EPSS Status:
# Latest Model Date: 2025-12-16
# Import Time: 2025-12-17 00:07:32 UTC
# CVE Count: 231,417
# Staleness: FRESH (1 day)
```
### Manual Triggering
```bash
# Trigger manual ingest (force re-import)
stellaops concelier job trigger epss.ingest --date 2025-12-16 --force
# Backfill historical data (last 30 days)
stellaops epss backfill --from 2025-11-17 --to 2025-12-16
```
---
## Air-Gapped Operation
### Bundle Structure
EPSS data for offline deployments is packaged in risk bundles:
```
risk-bundle-2025-12-16/
├── manifest.json
├── epss/
│ ├── epss_scores-2025-12-16.csv.zst # ZSTD compressed
│ └── epss_metadata.json
├── kev/
│ └── kev-catalog.json
└── signatures/
└── bundle.dsse.json
```
### EPSS Metadata
**File**: `epss/epss_metadata.json`
```json
{
"model_date": "2025-12-16",
"model_version": "v2025.12.16",
"published_date": "2025-12-16",
"row_count": 231417,
"sha256": "abc123...",
"source_uri": "https://epss.empiricalsecurity.com/epss_scores-2025-12-16.csv.gz",
"created_at": "2025-12-16T00:00:00Z"
}
```
### Import Procedure
```bash
# 1. Transfer bundle to air-gapped system
scp risk-bundle-2025-12-16.tar.zst airgap-host:/opt/stellaops/bundles/
# 2. Import bundle
stellaops offline import --bundle /opt/stellaops/bundles/risk-bundle-2025-12-16.tar.zst
# 3. Verify import
stellaops epss status
# Output:
# EPSS Status:
# Latest Model Date: 2025-12-16
# Source: bundle://risk-bundle-2025-12-16
# CVE Count: 231,417
# Staleness: ACCEPTABLE (within 7 days)
```
### Update Cadence
**Recommended**:
- **Online**: Daily (automatic)
- **Air-gapped**: Weekly (manual bundle import)
**Staleness Thresholds**:
- **FRESH**: 1 day
- **ACCEPTABLE**: 7 days
- **STALE**: 14 days
- **VERY_STALE**: >14 days (alert, fallback to CVSS-only)
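A sketch classifying staleness by the thresholds above (illustrative Python; the exact boundary handling at 1/7/14 days is an assumption):

```python
# Map age in days to the staleness bands documented above.
def staleness(days: int) -> str:
    if days <= 1:
        return "FRESH"
    if days <= 7:
        return "ACCEPTABLE"
    if days <= 14:
        return "STALE"
    return "VERY_STALE"  # alert, fall back to CVSS-only

print(staleness(1), staleness(5), staleness(14), staleness(30))
```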
---
## Scanner Integration
### EPSS Evidence in Scan Findings
Every scan finding includes **immutable EPSS-at-scan** evidence:
```json
{
"finding_id": "CVE-2024-12345-pkg:npm/lodash@4.17.21",
"cve_id": "CVE-2024-12345",
"product": "pkg:npm/lodash@4.17.21",
"scan_id": "scan-abc123",
"scan_timestamp": "2025-12-17T10:30:00Z",
"evidence": {
"cvss_v4": {
"vector_string": "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:H/SI:H/SA:H",
"base_score": 9.3,
"severity": "CRITICAL"
},
"epss_at_scan": {
"epss_score": 0.42357,
"percentile": 0.88234,
"model_date": "2025-12-16",
"import_run_id": "550e8400-e29b-41d4-a716-446655440000"
},
"epss_current": {
"epss_score": 0.45123,
"percentile": 0.89456,
"model_date": "2025-12-17",
"delta_score": 0.02766,
"delta_percentile": 0.01222,
"trend": "RISING"
}
}
}
```
**Key Points**:
- **epss_at_scan**: Immutable, captured at scan time (deterministic replay)
- **epss_current**: Mutable, updated daily for live triage
- **Replay**: Historical scans always use `epss_at_scan` for consistent policy evaluation
### Bulk Query Optimization
Scanner queries EPSS for all CVEs in a single database call:
```sql
SELECT cve_id, epss_score, percentile, model_date, import_run_id
FROM concelier.epss_current
WHERE cve_id = ANY(@cve_ids);
```
**Performance**: <500ms for 10k CVEs (P95)
---
## Policy Engine Integration
### Risk Score Formula
**Simple Profile**:
```
risk_score = (cvss_base / 10) + epss_bonus + kev_bonus
```
**EPSS Bonus Table**:
| EPSS Percentile | Bonus | Rationale |
|----------------|-------|-----------|
| ≥99th | +10% | Top 1%; most likely to be exploited |
| ≥90th | +5% | Top 10%; high exploitation probability |
| ≥50th | +2% | Above median; moderate risk |
| <50th | 0% | Below median; no bonus |
**Advanced Profile**:
Adds:
- **KEV synergy**: If in KEV catalog → multiply EPSS bonus by 1.5
- **Uncertainty penalty**: Missing EPSS → -5%
- **Temporal decay**: EPSS >30 days stale → reduce bonus by 50%
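A sketch of the Simple-profile bonus with the Advanced-profile modifiers applied (illustrative Python; how the synergy, penalty, and decay compose is an assumption):

```python
# EPSS bonus per the Simple profile, with assumed Advanced-profile composition:
# KEV synergy multiplies the bonus, staleness halves it, missing EPSS penalizes.
def epss_bonus(percentile, in_kev: bool = False, stale_days: int = 0) -> float:
    if percentile is None:
        return -0.05          # uncertainty penalty: missing EPSS
    if percentile >= 0.99:
        bonus = 0.10
    elif percentile >= 0.90:
        bonus = 0.05
    elif percentile >= 0.50:
        bonus = 0.02
    else:
        bonus = 0.0
    if in_kev:
        bonus *= 1.5          # KEV synergy
    if stale_days > 30:
        bonus *= 0.5          # temporal decay
    return bonus

print(round(epss_bonus(0.995, in_kev=True), 3))  # 0.15
```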
### VEX Lattice Rules
**Escalation**:
- **SR (Static Reachable) + EPSS≥90th** → Auto-escalate to **CR (Confirmed Reachable)**
- Rationale: High exploit probability warrants confirmation
**Review Flags**:
- **DV (Denied by Vendor VEX) + EPSS≥95th** → Flag for manual review
- Rationale: Vendor denial contradicted by active exploitation signals
**Prioritization**:
- **U (Unknown) + EPSS≥95th** → Prioritize for reachability analysis
- Rationale: High exploit probability justifies effort
### SPL (Stella Policy Language) Syntax
```yaml
# Custom policy using EPSS
rules:
- name: high_epss_escalation
condition: |
epss.percentile >= 0.95 AND
lattice.state == "SR" AND
runtime.exposed == true
action: escalate_to_cr
reason: "High EPSS (top 5%) + Static Reachable + Runtime Exposed"
- name: epss_trend_alert
condition: |
epss.delta_score >= 0.10 AND
cvss.base_score >= 7.0
action: notify
channels: [slack, email]
reason: "EPSS jumped by 10+ points (was {epss.old_score}, now {epss.new_score})"
```
**Available Fields**:
- `epss.score` - Current EPSS score (0.0-1.0)
- `epss.percentile` - Current percentile (0.0-1.0)
- `epss.model_date` - Model date
- `epss.delta_score` - Change vs previous scan
- `epss.trend` - RISING, FALLING, STABLE
- `epss.at_scan.score` - Immutable score at scan time
- `epss.at_scan.percentile` - Immutable percentile at scan time
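A sketch of how `epss.trend` could be derived from the score delta (illustrative Python; the 0.01 dead-band threshold is an assumption):

```python
# Classify the EPSS trend from the change in score between model dates.
def trend(delta_score: float, threshold: float = 0.01) -> str:
    if delta_score > threshold:
        return "RISING"
    if delta_score < -threshold:
        return "FALLING"
    return "STABLE"

print(trend(0.02766), trend(-0.05), trend(0.004))  # RISING FALLING STABLE
```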
---
## Notification Integration
### Event: vuln.priority.changed
Emitted when EPSS change causes priority band shift.
**Payload**:
```json
{
"event_type": "vuln.priority.changed",
"vulnerability_id": "CVE-2024-12345",
"product_key": "pkg:npm/lodash@4.17.21",
"old_priority_band": "medium",
"new_priority_band": "high",
"reason": "EPSS percentile crossed 95th (was 88th, now 96th)",
"epss_change": {
"old_score": 0.42,
"new_score": 0.78,
"delta_score": 0.36,
"old_percentile": 0.88,
"new_percentile": 0.96,
"model_date": "2025-12-16"
}
}
```
### Notification Rules
**File**: `etc/notify.yaml`
```yaml
notify:
rules:
- name: epss_crossed_high
event_type: vuln.priority.changed
condition: "payload.epss_change.new_percentile >= 0.95"
channels: [slack, email]
template: epss_high_alert
digest: false # Immediate
- name: epss_big_jump
event_type: vuln.priority.changed
condition: "payload.epss_change.delta_score >= 0.10"
channels: [slack]
template: epss_rising_threat
digest: true
digest_time: "09:00" # Daily digest at 9 AM
```
### Slack Template Example
```
🚨 **High EPSS Alert**
**CVE**: CVE-2024-12345
**Product**: pkg:npm/lodash@4.17.21
**EPSS**: 0.78 (96th percentile) ⬆️ from 0.42 (88th percentile)
**Delta**: +0.36 (36 percentage points)
**Priority**: Medium → **High**
**Action Required**: Review and prioritize remediation.
[View in StellaOps →](https://stellaops.example.com/vulns/CVE-2024-12345)
```
---
## Troubleshooting
### EPSS Data Not Available
**Symptom**: Scans show "EPSS: N/A"
**Diagnosis**:
```bash
# Check EPSS status
stellaops epss status
# Check import runs
stellaops concelier jobs list --type epss.ingest --limit 10
```
**Resolution**:
1. **No imports**: Trigger manual ingest
```bash
stellaops concelier job trigger epss.ingest
```
2. **Import failed**: Check logs
```bash
stellaops concelier logs --job-id <id> --level ERROR
```
3. **FIRST.org down**: Use air-gapped bundle
```bash
stellaops offline import --bundle /path/to/risk-bundle.tar.zst
```
### Stale EPSS Data
**Symptom**: UI shows "EPSS stale (14 days)"
**Diagnosis**:
```sql
SELECT * FROM concelier.epss_model_staleness;
-- Output: days_stale: 14, staleness_status: STALE
```
**Resolution**:
1. **Online**: Check scheduler job status
```bash
stellaops scheduler jobs status epss.ingest
```
2. **Air-gapped**: Import fresh bundle
```bash
stellaops offline import --bundle /path/to/latest-bundle.tar.zst
```
3. **Fallback**: Disable EPSS temporarily (uses CVSS-only)
```yaml
# etc/scanner.yaml
scanner:
epss:
enabled: false
```
### High Memory Usage During Ingest
**Symptom**: Concelier worker OOM during EPSS ingest
**Diagnosis**:
```bash
# Check memory metrics
stellaops metrics query 'process_resident_memory_bytes{service="concelier"}'
```
**Resolution**:
1. **Increase worker memory limit**:
```yaml
# Kubernetes deployment
resources:
limits:
memory: 1Gi # Was 512Mi
```
2. **Verify streaming parser** (should not load full CSV into memory):
```bash
# Check logs for "EPSS CSV parsed: rows_yielded="
stellaops concelier logs --job-type epss.ingest | grep "CSV parsed"
```
---
## Best Practices
### 1. Combine Signals (Never Use EPSS Alone)
❌ **Don't**: `if epss > 0.95 then CRITICAL`
✅ **Do**: `if cvss >= 8.0 AND epss >= 0.95 AND runtime_exposed then CRITICAL`
### 2. Review High EPSS Manually
Manually review vulnerabilities with EPSS ≥95th percentile, especially if:
- CVSS is low (<7.0) but EPSS is high
- Vendor VEX denies exploitability but EPSS is high
### 3. Track Trends
Monitor EPSS changes over time:
- Rising EPSS → increasing threat
- Falling EPSS → threat subsiding
### 4. Update Regularly
- **Online**: Daily (automatic)
- **Air-gapped**: Weekly minimum, daily preferred
### 5. Verify During Audits
For compliance audits, use EPSS-at-scan (immutable) not current EPSS:
```sql
SELECT epss_score_at_scan, epss_model_date_at_scan
FROM scan_findings
WHERE scan_id = 'audit-scan-20251217';
```
---
## API Reference
### Query Current EPSS
```bash
# Single CVE
stellaops epss get CVE-2024-12345
# Output:
# CVE-2024-12345
# Score: 0.42357 (42.4% probability)
# Percentile: 88.2th
# Model Date: 2025-12-16
# Status: FRESH
```
### Batch Query
```bash
# From file
stellaops epss batch --file cves.txt --output epss-scores.json
# cves.txt:
# CVE-2024-1
# CVE-2024-2
# CVE-2024-3
```
### Query History
```bash
# Last 180 days
stellaops epss history CVE-2024-12345 --days 180 --format csv
# Output: epss-history-CVE-2024-12345.csv
# model_date,epss_score,percentile
# 2025-12-17,0.45123,0.89456
# 2025-12-16,0.42357,0.88234
# ...
```
### Top CVEs by EPSS
```bash
# Top 100
stellaops epss top --limit 100 --format table
# Output:
# Rank | CVE | Score | Percentile | CVSS
# -----|---------------|--------|------------|------
# 1 | CVE-2024-9999 | 0.9872 | 99.9th | 9.8
# 2 | CVE-2024-8888 | 0.9654 | 99.8th | 8.1
# ...
```
---
## References
- **FIRST EPSS Homepage**: https://www.first.org/epss/
- **EPSS Data & Stats**: https://www.first.org/epss/data_stats
- **EPSS API Docs**: https://www.first.org/epss/api
- **CVSS v4.0 Spec**: https://www.first.org/cvss/v4.0/specification-document
- **StellaOps Policy Guide**: `docs/policy/overview.md`
- **StellaOps Reachability Guide**: `docs/modules/scanner/reachability.md`
---
**Last Updated**: 2025-12-17
**Version**: 1.0
**Maintainer**: StellaOps Security Team


@@ -0,0 +1,290 @@
# EPSS Integration Guide
## Overview
EPSS (Exploit Prediction Scoring System) is a FIRST.org initiative that provides probability scores for vulnerability exploitation within 30 days. StellaOps integrates EPSS as a risk signal alongside CVSS and KEV (Known Exploited Vulnerabilities) to provide more accurate vulnerability prioritization.
## How EPSS Works
EPSS uses machine learning to predict the probability that a CVE will be exploited in the wild within the next 30 days. The model considers:
- Vulnerability characteristics (CVSS metrics, CWE, etc.)
- Social signals (Twitter mentions, GitHub issues, etc.)
- Exploit database entries
- Historical exploitation patterns
EPSS outputs two values:
- **Score** (0.0-1.0): Probability of exploitation in next 30 days
- **Percentile** (0-100): Ranking relative to all other CVEs
## How EPSS Affects Risk Scoring in StellaOps
### Combined Risk Formula
StellaOps combines CVSS, KEV, and EPSS signals into a unified risk score:
```
risk_score = clamp01(
(cvss / 10) + # Base severity (0-1)
kevBonus + # +0.20 if in CISA KEV
epssBonus # +0.02 to +0.10 based on percentile
)
```
### EPSS Bonus Thresholds
| EPSS Percentile | Bonus | Rationale |
|-----------------|-------|-----------|
| >= 99th | +10% | Top 1% most likely to be exploited; urgent priority |
| >= 90th | +5% | Top 10%; high exploitation probability |
| >= 50th | +2% | Above median; moderate additional risk |
| < 50th | 0% | Below median; no bonus applied |
### Example Calculations
| CVE | CVSS | KEV | EPSS Percentile | Risk Score |
|-----|------|-----|-----------------|------------|
| CVE-2024-1234 | 9.8 | Yes | 99.5th | 1.00 (clamped) |
| CVE-2024-5678 | 7.5 | No | 95th | 0.80 |
| CVE-2024-9012 | 6.0 | No | 60th | 0.62 |
| CVE-2024-3456 | 8.0 | No | 30th | 0.80 |
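Read literally, the formula and bonus tiers above reduce to a few lines. The sketch below is illustrative Python rather than the shipped C# provider (`CvssKevEpssProvider`); it reproduces the rows of the example table:

```python
def epss_bonus(percentile: float) -> float:
    """Map an EPSS percentile (0-100) to the documented bonus tiers."""
    if percentile >= 99:
        return 0.10
    if percentile >= 90:
        return 0.05
    if percentile >= 50:
        return 0.02
    return 0.0


def risk_score(cvss: float, in_kev: bool, epss_percentile: float) -> float:
    """risk_score = clamp01((cvss / 10) + kevBonus + epssBonus)."""
    raw = (cvss / 10.0) + (0.20 if in_kev else 0.0) + epss_bonus(epss_percentile)
    return max(0.0, min(1.0, raw))


# The rows from the example table:
assert risk_score(9.8, True, 99.5) == 1.00          # clamped from 1.28
assert round(risk_score(7.5, False, 95), 2) == 0.80
assert round(risk_score(6.0, False, 60), 2) == 0.62
assert round(risk_score(8.0, False, 30), 2) == 0.80
```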
## Implementation Reference
### IEpssSource Interface
```csharp
// Location: src/RiskEngine/StellaOps.RiskEngine/StellaOps.RiskEngine.Core/Providers/IEpssSources.cs
public interface IEpssSource
{
/// <summary>
/// Returns EPSS data for the given CVE identifier, or null if unknown.
/// </summary>
Task<EpssData?> GetEpssAsync(string cveId, CancellationToken cancellationToken);
}
public sealed record EpssData(double Score, double Percentile, DateTimeOffset? ModelVersion = null);
```
### Risk Providers
**EpssProvider** - Uses EPSS score directly as risk (0.0-1.0):
```csharp
// Location: src/RiskEngine/StellaOps.RiskEngine/StellaOps.RiskEngine.Core/Providers/EpssProvider.cs
public const string ProviderName = "epss";
```
**CvssKevEpssProvider** - Combined provider using all three signals:
```csharp
// Location: src/RiskEngine/StellaOps.RiskEngine/StellaOps.RiskEngine.Core/Providers/EpssProvider.cs
public const string ProviderName = "cvss-kev-epss";
```
## Policy Configuration
### Enabling EPSS Integration
```yaml
# etc/risk-engine.yaml
risk:
providers:
- name: cvss-kev-epss
enabled: true
priority: 1
epss:
enabled: true
source: database # or "api" for live FIRST API
cache_ttl: 24h
# Percentile-based bonus thresholds
thresholds:
- percentile: 99
bonus: 0.10
- percentile: 90
bonus: 0.05
- percentile: 50
bonus: 0.02
```
### Custom Threshold Configuration
Organizations can customize EPSS bonus thresholds based on their risk tolerance:
```yaml
# More aggressive (higher bonuses for high-risk vulns)
epss:
thresholds:
- percentile: 99
bonus: 0.15
- percentile: 95
bonus: 0.10
- percentile: 75
bonus: 0.05
# More conservative (smaller bonuses)
epss:
thresholds:
- percentile: 99
bonus: 0.05
- percentile: 95
bonus: 0.02
```
## EPSS in Lattice Decisions
EPSS influences VEX lattice state transitions for vulnerability triage:
| Current State | EPSS >= 90th Percentile | Recommended Action |
|---------------|-------------------------|-------------------|
| SR (Static Reachable) | Yes | Escalate to CR (Confirmed Reachable) priority |
| SU (Static Unreachable) | Yes | Flag for review - high exploit probability despite unreachable |
| DV (Denied by Vendor VEX) | Yes | Review denial validity - exploit activity contradicts vendor |
| U (Unknown) | Yes | Prioritize for reachability analysis |
### VEX Policy Example
```yaml
# etc/vex-policy.yaml
lattice:
transitions:
- from: SR
to: CR
condition:
epss_percentile: ">= 90"
action: auto_escalate
- from: SU
to: REVIEW
condition:
epss_percentile: ">= 95"
action: flag_for_review
reason: "High EPSS despite static unreachability"
```
## Offline EPSS Data
EPSS data is included in offline risk bundles for air-gapped environments.
### Bundle Structure
```
risk-bundle-2025-12-14/
├── manifest.json
├── kev/
│ └── kev-catalog.json
├── epss/
│ ├── epss-scores.csv.zst # Compressed EPSS data
│ └── epss-metadata.json # Model date, row count, checksum
└── signatures/
└── bundle.dsse.json
```
### EPSS Metadata
```json
{
"model_date": "2025-12-14",
"row_count": 248732,
"sha256": "abc123...",
"source": "first.org",
"created_at": "2025-12-14T00:00:00Z"
}
```
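As a sketch, verifying the recorded checksum during import looks like the following (illustrative Python, not the production importer; it assumes the digest in `epss-metadata.json` covers the on-disk data file being checked, whether compressed or already decompressed per the bundle convention):

```python
import hashlib
import json


def verify_epss_bundle(scores_path: str, metadata_path: str) -> bool:
    """Compare the SHA-256 of the EPSS data file against the digest
    recorded in epss-metadata.json; refuse the import on mismatch."""
    with open(metadata_path, encoding="utf-8") as fh:
        expected = json.load(fh)["sha256"]
    digest = hashlib.sha256()
    with open(scores_path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):  # stream in 1 MiB chunks
            digest.update(chunk)
    return digest.hexdigest() == expected
```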
### Importing Offline EPSS Data
```bash
# Import risk bundle (includes EPSS)
stellaops offline import --kit risk-bundle-2025-12-14.tar.zst
# Verify EPSS data imported
stellaops epss status
# Output:
# EPSS Data Status:
# Model Date: 2025-12-14
# CVE Count: 248,732
# Last Import: 2025-12-14T10:30:00Z
```
## Accuracy Considerations
| Metric | Value | Notes |
|--------|-------|-------|
| EPSS Coverage | ~95% of NVD CVEs | Some very new CVEs (<24h) not yet scored |
| Model Refresh | Daily | Scores can change day-to-day |
| Prediction Window | 30 days | Probability of exploit in next 30 days |
| Historical Accuracy | ~85% AUC | Based on FIRST published evaluations |
### Limitations
1. **New CVEs**: Very recent CVEs may not have EPSS scores yet
2. **Model Lag**: EPSS model updates daily; real-world exploit activity may be faster
3. **Zero-Days**: Pre-disclosure vulnerabilities cannot be scored
4. **Context Blind**: EPSS doesn't consider your specific environment
## Best Practices
1. **Combine Signals**: Always use EPSS alongside CVSS and KEV, not in isolation
2. **Review High EPSS**: Manually review vulnerabilities with EPSS >= 95th percentile
3. **Track Changes**: Monitor EPSS score changes over time for trending threats
4. **Update Regularly**: Keep EPSS data fresh (daily in online mode, weekly for offline)
5. **Verify High-Risk**: For critical decisions, verify EPSS data against FIRST API
## API Usage
### Query EPSS Score
```bash
# Get EPSS score for a specific CVE
stellaops epss get CVE-2024-12345
# Batch query
stellaops epss batch --file cves.txt --output epss-scores.json
```
### Programmatic Access
```csharp
// Using IEpssSource
var epssData = await epssSource.GetEpssAsync("CVE-2024-12345", cancellationToken);
if (epssData is not null)
{
Console.WriteLine($"Score: {epssData.Score:P2}");
Console.WriteLine($"Percentile: {epssData.Percentile:F1}th");
}
```
## Troubleshooting
### EPSS Data Not Available
```bash
# Check EPSS source status
stellaops epss status
# Force refresh from FIRST API
stellaops epss refresh --force
# Check for specific CVE
stellaops epss get CVE-2024-12345 --verbose
```
### Stale EPSS Data
If EPSS data is older than 7 days:
```bash
# Check staleness
stellaops epss check-staleness
# Import fresh bundle
stellaops offline import --kit latest-bundle.tar.zst
```
## References
- [FIRST EPSS Model](https://www.first.org/epss/)
- [EPSS API Documentation](https://www.first.org/epss/api)
- [EPSS FAQ](https://www.first.org/epss/faq)
- [StellaOps Risk Engine Architecture](../modules/risk-engine/architecture.md)


@@ -0,0 +1,35 @@
# Risk Explainability
> Source: `CONTRACT-RISK-SCORING-002` (2025-12-05). Fixtures live under `docs/modules/risk-engine/samples/explain/`; all hashes in `SHA256SUMS`. Keep outputs deterministic (frozen payloads, stable ordering).
## Purpose
- Show how the scoring engine produces per-factor contributions and traces that UI/CLI/export surfaces render for auditors and operators.
## Scope & Audience
- Audience: Console/CLI users, auditors, SREs.
- In scope: explainability payload shape, field meanings, provenance, UI/CLI mapping, offline/export behavior.
- Out of scope: formula math (see `formulas.md`), API specifics (see `api.md`).
## Payload Shape
- Envelope: `job_id`, `tenant_id`, `context_id`, `profile_id`, `profile_version`, `profile_hash`, `finding_id`, `raw_score`, `normalized_score`, `severity`, `signal_values{}`, `signal_contributions{}`, optional `override_applied`, `override_reason`, `gates_triggered[]`, `scored_at`, `provenance` (job hash + fixture hashes).
- Factor entries (from `signal_values`/`signal_contributions`): `name`, `source`, `type`, `path`, `raw_value`, `normalized_value`, `weight`, `contribution`, `provenance`.
- UI/CLI expectations: deterministic ordering (factor type → source → timestamp), highlight top contributors, show attestation status for each factor.
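A minimal sketch of that ordering contract (illustrative Python; the entry fields mirror the factor-entry shape listed above, with `timestamp` standing in for the entry's provenance timestamp):

```python
def order_factors(entries):
    """Deterministic rendering order: factor type, then source, then timestamp."""
    return sorted(entries, key=lambda e: (e["type"], e["source"], e["timestamp"]))


def top_contributors(entries, n=3):
    """Largest contributions first; ties fall back to the deterministic
    ordering above, so the highlighted set is stable across runs."""
    return sorted(order_factors(entries), key=lambda e: -e["contribution"])[:n]
```

Because `sorted` is stable, equal contributions inherit the type/source/timestamp order, which keeps explainability hashes repeatable across UI, CLI, and export renders.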
## UI/CLI Views
- Console: frame sample in `docs/modules/risk-engine/samples/explain/console-frame.json` shows top contributors, gate badges, and provenance hashes.
- CLI `stella risk explain job-001`: deterministic text fixture in `docs/modules/risk-engine/samples/explain/cli-explain.txt`; `--json` mirrors `explain-trace.json`.
- Export Center: embed explain payload + SHA256 manifest; CSV export keeps deterministic ordering.
## Determinism & Offline Posture
- Example payload: `docs/modules/risk-engine/samples/explain/explain-trace.json` (hash in `SHA256SUMS`).
- No live calls; all captures from frozen fixtures. Use exact ordering and timestamps when regenerating.
## Open Items
- Add schema file once JSON schema is frozen; update references accordingly.
## References
- `docs/modules/risk-engine/guides/overview.md`
- `docs/modules/risk-engine/guides/profiles.md`
- `docs/modules/risk-engine/guides/factors.md`
- `docs/modules/risk-engine/guides/formulas.md`
- `docs/modules/risk-engine/guides/api.md`


@@ -0,0 +1,46 @@
# Risk Factors
> Aligned to `CONTRACT-RISK-SCORING-002` (published 2025-12-05). Keep fixtures deterministic and offline-friendly.
## Purpose
- Catalog supported factors (exploit likelihood, VEX state, reachability, runtime facts, fix availability, asset criticality, provenance trust, tenant overrides) and how they normalize into risk math.
## Scope & Audience
- Audience: risk engineers, policy authors, platform SREs.
- In scope: factor definitions, required/optional fields, normalization rules, TTLs, provenance expectations.
- Out of scope: full formula math (see `formulas.md`), API wiring (see `api.md`).
## Factor Catalog (mirrors profile `signals[]`)
| Factor | Required fields | Optional fields | Notes |
| --- | --- | --- | --- |
| CVSS / exploit likelihood | `name`, `source`, `type:"numeric"`, `path`, `transform:"normalize_10"` | `unit:"score"`, `last_seen`, `confidence` | Normalize 0-10 to 0-1; clamp and keep original in provenance. |
| KEV flag | `name`, `source`, `type:"boolean"`, `path` | `last_seen` | Boolean boost; drives severity overrides/decisions. |
| Reachability | `name`, `source`, `type:"numeric"`, `path` | `unit:"score"`, `guards` | May fuse static reachability + runtime observation; ordered by entrypoint/path hash. |
| Runtime facts | `name`, `source`, `type:"categorical" or "numeric"`, `path` | `trace_id`, `span_id` | Includes host/container identity and provenance for runtime traces. |
| Fix availability | `name`, `source`, `type`, `path` | `mitigation`, `vendor_status` | Decay older advisories; keep mitigation text intact. |
| Asset criticality | `name`, `source`, `type`, `path` | `tenant_scope`, `owner` | Used as multiplier/guard in formulas. |
| Provenance trust | `name`, `source`, `type:"categorical"`, `path` | `key_id`, `chain_of_custody` | Gate low-trust inputs; must carry attestation hash. |
| Custom overrides | `name`, `source`, `type`, `path` | `override_reason`, `reviewer`, `expires_at` | Logged and expiring; surfaced in `signal_contributions`. |
## Normalization Rules
- Validate against profile `signals.type` and known transforms; reject unknown fields.
- Clamp numeric inputs to 0-1; record original value in provenance for audit.
- TTL/decay: apply per-factor defaults (pending payload fixtures); drop expired signals deterministically.
- Precedence: signed → unsigned; runtime → static; newer → older; when tied, lowest hash order.
Interim notes: follow legacy profile guidance — preserve provenance, never mutate source evidence, and keep ordering stable so explainability hashes are repeatable across UI/CLI/exports.
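The precedence chain can be sketched as a single sort key (illustrative Python; the field names `signed`, `origin`, `expires_at`, and `provenance_hash` stand in for the canonical signal envelope, which is still pending fixtures):

```python
def dedupe_signals(signals, now):
    """Keep one signal per factor name: signed over unsigned, runtime over
    static, newer over older, lowest provenance hash as the final tie-break."""
    def rank(signal):
        return (
            0 if signal["signed"] else 1,               # signed first
            0 if signal["origin"] == "runtime" else 1,  # then runtime over static
            -signal["timestamp"],                       # then newer first
            signal["provenance_hash"],                  # lowest hash wins ties
        )

    live = [s for s in signals if s["expires_at"] > now]  # drop expired deterministically
    chosen = {}
    for signal in sorted(live, key=rank):
        chosen.setdefault(signal["name"], signal)
    return chosen
```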
## Determinism & Ordering
- Sort factors by `factor_type` then `source` then `timestamp_utc`; deterministic hashing for fixtures.
- Record SHA256 for sample payloads in `docs/modules/risk-engine/samples/factors/SHA256SUMS` once provided.
## Open Items
- Sample payloads per factor for fixtures + hashes.
- TTL/decay parameters from Risk Engine Guild.
- Provenance attestation examples (signed runtime traces, KEV ingestion evidence).
## References
- `docs/modules/risk-engine/guides/overview.md`
- `docs/modules/risk-engine/guides/profiles.md`
- `docs/modules/risk-engine/guides/formulas.md`
- `docs/modules/risk-engine/guides/api.md`


@@ -0,0 +1,62 @@
# Risk Formulas
> Based on `CONTRACT-RISK-SCORING-002` (2025-12-05). Keep math examples deterministic with fixed fixtures.
## Purpose
- Describe how normalized factors combine into a 0-100 risk score with severity bands.
- Capture gating, weighting, normalization, and override rules.
## Scope & Audience
- Audience: risk engineers, policy authors, auditors.
- In scope: weighting strategies, aggregation functions, severity thresholds, gating rules, tie-breakers.
- Out of scope: full API payloads (see `api.md`), factor definitions (see `factors.md`).
## Formula Building Blocks
- Weighted sum with per-factor caps; enforce max contribution per family (exploitability, reachability, runtime).
- Base rule (contract): `raw_score = Σ(signal_value × weight)`, `normalized_score = clamp(raw_score, 0.0, 1.0)`.
- VEX gate: if `signals.HasVexDenial`, return `0.0` immediately (mitigated finding).
- CVSS + KEV provider: `score = clamp01((cvss/10) + (kev ? 0.2 : 0))`.
- Guard rails: hard gates when `(exploit_likelihood >= T1) AND (reachability >= T2)` or when provenance trust below minimum.
- Decay/time weighting: exponential decay for stale runtime/KEV signals; fresh VEX `not_affected` may down-weight exploit scores.
- Tenant/asset overrides: additive/override blocks with expiry; always logged in explainability output.
- Safety: divide-by-zero and null handling must be deterministic and reflected in explain trace.
## Severity Mapping
- Contract levels: `critical`, `high`, `medium`, `low`, `informational` (priority 1-5).
- Map `normalized_score` to bands per profile policy; include band rationale in explainability payload.
## Determinism
- Stable ordering of factors before aggregation.
- Use fixed precision (e.g., 4 decimals) before severity mapping; round, do not truncate.
- Hash fixtures and record SHA256 for every example payload in `docs/modules/risk-engine/samples/formulas/SHA256SUMS`.
Interim notes: mirror legacy rule — simulation and production must share the exact evaluation codepath; no per-environment divergences. Severity buckets must be deterministic and governed by Authority scopes.
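Putting the building blocks together (illustrative Python; the severity band thresholds are placeholders, since final per-profile thresholds remain an open item):

```python
BANDS = ((0.9, "critical"), (0.7, "high"), (0.4, "medium"), (0.1, "low"))


def evaluate(signal_values, weights, has_vex_denial=False, bands=BANDS):
    """raw_score = sum(signal * weight); a VEX denial gates to 0.0;
    clamp, round to 4 decimals, then map to a severity band."""
    if has_vex_denial:
        return 0.0, "informational"  # mitigated finding short-circuits
    raw = sum(float(signal_values[name]) * weight
              for name, weight in sorted(weights.items()))  # stable factor ordering
    normalized = round(max(0.0, min(1.0, raw)), 4)          # round, do not truncate
    for threshold, level in bands:
        if normalized >= threshold:
            return normalized, level
    return normalized, "informational"
```

With the contract example's inputs (`cvss` normalized to 0.75, `kev` true, `reachability` 0.9) and the default-profile weights, this sketch yields 0.87 in the `high` band; the frozen contract fixture also applies a `kev-boost` override, which this sketch omits.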
## Example (contract-aligned)
```json
{
"finding_id": "f-123",
"profile_id": "default-profile",
"profile_version": "1.0.0",
"raw_score": 0.75,
"normalized_score": 0.85,
"severity": "high",
"signal_values": { "cvss": 7.5, "kev": true, "reachability": 0.9 },
"signal_contributions": { "cvss": 0.4, "kev": 0.3, "reachability": 0.3 },
"override_applied": "kev-boost",
"override_reason": "Known Exploited Vulnerability",
"scored_at": "2025-12-05T00:00:02Z"
}
```
- CLI/Console screenshots pending telemetry assets (keep deterministic fixture IDs).
## Open Items
- Fixtures for jobs/results and explainability traces.
- Final per-profile severity thresholds (document once agreed).
- UI traces for console/CLI explainability views.
## References
- `docs/modules/risk-engine/guides/overview.md`
- `docs/modules/risk-engine/guides/profiles.md`
- `docs/modules/risk-engine/guides/factors.md`
- `docs/modules/risk-engine/guides/api.md`


@@ -0,0 +1,50 @@
# Risk Overview
> Source of truth: `CONTRACT-RISK-SCORING-002` (published 2025-12-05). Keep fixtures deterministic (UTC timestamps, stable ordering, sealed sample payloads) and avoid external assets.
## Purpose
- Explain the risk model at a glance: factors, formulas, scoring semantics (0-100), and severity bands.
- Show how risk flows through StellaOps services (ingest → evaluate → explain → export) and how provenance is preserved.
## Scope & Audience
- Audience: policy authors, risk engineers, auditors, and SREs consuming risk outputs.
- In scope: concepts, glossary, lifecycle, artifacts, cross-module data flow diagrams (add after schema approval).
- Out of scope: detailed factor math (goes to `formulas.md`), API specifics (goes to `api.md`).
## Core Concepts
- **Signal → evidence → factor:** raw events (scanner, VEX, runtime) become evidence once validated; evidence is normalized into factors listed under profile `signals[]`.
- **Profile vs. formula:** a profile bundles factor weights, thresholds, overrides, and severity mapping; formulas describe how weighted signals aggregate and when gates short-circuit.
- **Provenance:** every input keeps its attestation/signature and source hash; explainability echoes `profile_hash`, factor hashes, and job correlation IDs.
- **Explainability payloads:** UI/CLI show per-factor contributions (`signal_contributions`), source hashes, and rule gates; exports reuse the same envelope.
- **Determinism:** stable ordering (factor type → source → timestamp), UTC ISO-8601 timestamps, fixed precision math, sealed fixtures.
Profiles use normalized factors (exploit likelihood, KEV flag, reachability, runtime evidence, fix availability, asset criticality, provenance trust) to produce 0-1 scores mapped to severity buckets. Simulation and production share the exact code path.
## Lifecycle
1. **Job submit:** POST `/api/v1/risk/jobs` with `tenant_id`, `context_id`, `profile_id`, finding list; request is signed and queued.
2. **Evidence ingestion:** scanner surface + reachability graphs, Zastava runtime signals, VEX/KEV feeds, mirror bundles (offline).
3. **Normalization:** clamp units to 0-1, apply TTL/decay, dedupe by provenance hash, map to canonical factor catalog.
4. **Profile evaluation:** apply weighted sum and overrides; respect gates (e.g., KEV + reachability) and Authority-imposed rules.
5. **Severity assignment:** map `normalized_score` to severity levels (critical/high/medium/low/informational) with rationale.
6. **Explainability & observability:** emit per-factor contribution table, provenance pointers, evaluation latency metrics; surface via `/risk/jobs/{id}` and export bundles.
7. **Export/archival:** package explainability + profile version/hash for Findings Ledger/Export Center; mirror-friendly.
## Artifacts & Schemas
- Contract: `CONTRACT-RISK-SCORING-002` (2025-12-05) — risk scoring jobs, results, and profile model.
- Profile schema fields: `id`, `version`, `description`, optional `extends`, `signals[] {name, source, type, path, transform, unit}`, `weights{}`, `overrides{severity[], decisions[]}`, `metadata`, `provenance`.
- Job/result fields: `job_id`, `profile_hash`, `normalized_score`, `severity`, `signal_values`, `signal_contributions`, optional overrides and timestamps.
- Explainability envelope: reuse `signal_contributions` + `profile_hash`; store fixtures under `docs/modules/risk-engine/samples/explain/`.
## Determinism & Offline Posture
- Use frozen fixture sets with SHA256 tables; keep manifests in `docs/modules/risk-engine/samples/*/SHA256SUMS`.
- Regenerate examples via documented scripts only; no live network calls.
- Simulation, API, UI, and export consumers must share the same deterministic ordering and precision.
## Open Items
- Need real payload fixtures (jobs + explainability traces) and UI telemetry captures; placeholders remain in samples folders.
## References (to link once available)
- `docs/modules/risk-engine/guides/profiles.md`
- `docs/modules/risk-engine/guides/factors.md`
- `docs/modules/risk-engine/guides/formulas.md`
- `docs/modules/risk-engine/guides/api.md`


@@ -0,0 +1,85 @@
# Risk Profiles
> Contract source: `CONTRACT-RISK-SCORING-002` (published 2025-12-05). This file supersedes `docs/modules/risk-engine/guides/risk-profiles.md` once fixtures are added.
## Purpose
- Define how profiles group factors, weights, thresholds, and severity bands.
- Describe authoring, simulation, promotion, rollback, and provenance for profiles.
## Scope & Audience
- Audience: policy authors, risk engineers, platform SREs.
- Coverage: profile schema, lifecycle, governance, promotion paths, rollback, and observability hooks.
## Schema (from CONTRACT-RISK-SCORING-002)
- Required: `id`, `version`, `description`, `signals[]`, `weights`, `metadata`.
- `signals[]` fields: `name`, `source`, `type` (`numeric|boolean|categorical`), `path`, optional `transform`, optional `unit`.
- Overrides: `overrides.severity[] { when, set }`, `overrides.decisions[] { when, action, reason }`.
- Optional: `extends`, rollout flags, tenant overrides, `valid_from`/`valid_until`.
- Storage rules: immutable once promoted; each change creates a new version with DSSE envelope and SHA256 manifest entry (`docs/modules/risk-engine/samples/profiles/SHA256SUMS`).
### Example Profile (contract snippet)
```json
{
"id": "default-profile",
"version": "1.0.0",
"description": "Default risk profile for vulnerability prioritization",
"extends": "base-profile",
"signals": [
{ "name": "cvss", "source": "nvd", "type": "numeric", "path": "/cvss/base_score", "transform": "normalize_10", "unit": "score" },
{ "name": "kev", "source": "cisa", "type": "boolean", "path": "/kev/in_catalog" },
{ "name": "reachability", "source": "scanner", "type": "numeric", "path": "/reachability/score" }
],
"weights": { "cvss": 0.4, "kev": 0.3, "reachability": 0.3 },
"overrides": {
"severity": [{ "when": { "kev": true }, "set": "critical" }],
"decisions": [{ "when": { "kev": true, "reachability": { "$gt": 0.8 } }, "action": "deny", "reason": "KEV with high reachability" }]
},
"metadata": {}
}
```
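The `profile_hash` echoed in job results and explainability payloads can be derived from a deterministic serialization. The sketch below assumes sorted-keys compact JSON (the same shape `jq -S .` produces, up to whitespace); the authoritative canonicalization may differ:

```python
import hashlib
import json


def profile_hash(profile: dict) -> str:
    """SHA-256 over a deterministic serialization: sorted keys,
    compact separators, UTF-8 encoding."""
    canonical = json.dumps(profile, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Re-serializing the same profile in any key order yields the same hash, which is what makes the recorded `profile_hash` reproducible across UI, CLI, and export surfaces.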
### Severity Levels
| Level | Value | Priority |
| --- | --- | --- |
| Critical | `critical` | 1 |
| High | `high` | 2 |
| Medium | `medium` | 3 |
| Low | `low` | 4 |
| Informational | `informational` | 5 |
## Lifecycle (outline)
1. Authoring in Policy Studio (draft state)
2. Simulation against fixtures (deterministic inputs)
3. Review/approval workflow
4. Promotion to environments (dev → staging → prod)
5. Rollback hooks and audit trail
## Governance & Determinism
- Profiles stored with DSSE/signatures; fixtures recorded in `docs/modules/risk-engine/samples/profiles/SHA256SUMS`.
- Simulation and production share the same evaluation codepath; feature flags must be documented in `metadata.flags`.
- Offline posture: include profiles, fixtures, and explainability bundles inside mirror packages with manifest hashes.
## Explainability & Observability
- Per-factor contribution outputs (JSON) with stable ordering (factor type → source).
- Metrics: evaluation latency (p50/p95), cache hit ratio, factor coverage %, profile hit rate, failed provenance validations.
- Dashboards/alerts: to be filled when telemetry payloads arrive; reserve panels for gating violations and override usage.
## Open Items
- Add signed fixtures (profiles + hashes) under `docs/modules/risk-engine/samples/profiles/` once payloads arrive.
- Capture feature-flag list for registry alignment.
- Telemetry field list for dashboards/alerts.
- Finalize migration note when legacy `docs/modules/risk-engine/guides/risk-profiles.md` is archived.
## References
- `docs/modules/risk-engine/guides/overview.md`
- `docs/modules/risk-engine/guides/factors.md`
- `docs/modules/risk-engine/guides/formulas.md`
- `docs/modules/risk-engine/guides/explainability.md`
- `docs/modules/risk-engine/guides/api.md`
- Existing context: `docs/modules/risk-engine/guides/risk-profiles.md` (to reconcile once schema lands)
## Interim Notes (carried from legacy `docs/modules/risk-engine/guides/risk-profiles.md`)
- Profiles define how evidence (CVSS/EPSS-like exploit likelihood, KEV flags, VEX status, reachability, runtime evidence, fix availability, asset criticality, provenance trust) normalizes into a 0-100 score with severity buckets.
- Workflow highlights: author in Policy Studio → simulate with fixtures → activate in Policy Engine → explain outputs in CLI/Console → export for auditors via Export Center.
- Governance: draft/review/approval with DSSE/signatures; rollback hooks and promotion gates enforced by Authority scopes; determinism required (same codepath for simulation and production).
- Observability: record scoring latency, factor distribution, and profile usage; offline posture via mirror bundles with fixtures and hash manifests.


@@ -0,0 +1,57 @@
# Risk Scoring Profiles
## Overview
Risk Scoring Profiles define customizable formulas that convert raw evidence (CVSS, EPSS-like exploit likelihood, KEV exploited lists, VEX status, reachability, runtime evidence, fix availability, asset criticality, provenance trust) into normalized risk scores (0-100) with severity buckets. Profiles are authored in Policy Studio, simulated, versioned, and executed by the scoring engine with full explainability.
- **Primary components:** Policy Engine, Findings Ledger, Conseiller, Excitor, Console, Policy Studio, CLI, Export Center, Authority & Tenancy, Observability.
- **Surfaces:** policy documents, scoring engine, factor providers, explainability artefacts, APIs, CLI, UI.
Aggregation-Only Contract remains in force: Conseiller and Excitor never merge or mutate source records. Risk scoring consumes linked evidence and preserves provenance for explainability.
## Core workflow
1. **Profile authoring:** Policy Studio exposes declarative DSL to define factors, weights, thresholds, and severity buckets.
2. **Simulation:** operators preview profiles against historical findings/SBOMs, compare with existing policies, and inspect factor breakdowns.
3. **Activation:** Policy Engine evaluates profiles on change streams, producing scores and detailed factor contributions per finding and per asset.
4. **Explainability:** CLI/Console display math traces, provenance IDs, and rationale for each factor. Export Center packages reports for auditors.
5. **Versioning:** profiles carry semantic versions, promotion workflows, and rollback hooks; Authority scopes enforce who can publish or edit.
## Factor model
| Factor | Description | Typical signal source |
| --- | --- | --- |
| Exploit likelihood | EPSS/KEV or internal intel | Conseiller enrichment |
| VEX status | not_affected / affected / fixed | Excitor (VEX Lens) |
| Reachability | entrypoint closure, runtime observations | Scanner + Zastava |
| Fix availability | patch released, vendor guidance | Conseiller, Policy Engine |
| Asset criticality | business context, tenant overrides | Policy Studio inputs |
| Provenance trust | signed evidence, attestation status | Attestor, Authority |
Factors feed into a weighted scoring engine with per-factor contribution reporting.
## Governance & guardrails
- Profiles live in Policy Studio with draft/review/approval workflows.
- Policy Engine enforces deterministic evaluation; simulations and production runs share the same scoring code.
- CLI parity enables automated promotion, export/import, and simulation from pipelines.
- Observability records scoring latency, factor distribution, and profile usage.
- Offline support: profiles, factor plugins, and explain bundles ship inside mirror bundles for air-gapped environments.
## Deliverables
- Policy language reference and examples.
- Simulation APIs/CLI with diff output.
- Scoring engine implementation with explain traces and determinism checks.
- Console visualizations (severity heatmaps, contribution waterfalls).
- Export Center reports with risk scoring sections.
- Observability dashboards for profile health and scoring throughput.
## References
- Policy core: `docs/modules/policy/architecture.md`
- Findings ledger: `docs/modules/vuln-explorer/architecture.md`
- VEX consensus: `docs/modules/vex-lens/architecture.md`
- Offline operations: `docs/airgap/airgap-mode.md`


@@ -0,0 +1,8 @@
# Risk Samples Ingest Checklist (use when payloads arrive)
1) Drop payloads into the correct folder (`profiles/`, `factors/`, `explain/`, `api/`).
2) Normalize JSON deterministically (e.g., `jq -S .`) before hashing; keep UTC timestamps.
3) Run `sha256sum * > SHA256SUMS` in the target folder; keep file sorted.
4) Verify hashes: `sha256sum -c SHA256SUMS`.
5) Add a short README snippet in the sprint Execution Log noting files added and hashes updated.
6) Keep fixtures offline-only; no external calls or redactions after hashing.


@@ -0,0 +1,26 @@
# Risk Samples (fixtures layout)
Use this folder for frozen, deterministic fixtures once schemas and payloads arrive.
Structure (proposed):
- `profiles/` — profile JSON (DSSE-wrapped where applicable) + `SHA256SUMS`
- `factors/` — factor input payloads grouped by source (epss/, kev/, reachability/, runtime/), each with `SHA256SUMS`
- `explain/` — explainability outputs paired with inputs; include `SHA256SUMS`
- `api/` — request/response examples for risk endpoints; include `SHA256SUMS`
Rules:
- UTC timestamps; stable ordering of arrays/objects.
- No live calls; fixtures only.
- Record hashes via `sha256sum` and keep manifests alongside samples.
Quick receipt checklist (see `INGEST_CHECKLIST.md` for detail):
1) Normalize JSON with `jq -S .`
2) Update `SHA256SUMS` in the target folder
3) Verify with `sha256sum -c`
4) Log files + hashes in the sprint Execution Log
Manifests created:
- `profiles/SHA256SUMS`
- `factors/SHA256SUMS`
- `explain/SHA256SUMS`
- `api/SHA256SUMS`


@@ -0,0 +1,3 @@
Use the root `INGEST_CHECKLIST.md`.
Place request/response examples here; normalize with `jq -S .`, update `SHA256SUMS`, verify with `sha256sum -c`.
Include required headers; redact secrets; UTC timestamps only.


@@ -0,0 +1,3 @@
fe460af2699ce335199f6e26597bab4530c6f3f476d4b1f93526175597565d10 README.md
00f8dc4e466eb95c06545e6336d7b0866b53ac430335b7fd1b7889da13529b93 error-catalog.json
96926cd81dfb6ff02d62d1fde5d7b2b7b5b3950e50eb651e51b8ae3042ac9506 risk-api-samples.json


@@ -0,0 +1,13 @@
{
"errors": [
{"code": "risk.job.not_found", "message": "Risk job not found", "http_status": 404, "remediation": "Verify job_id"},
{"code": "risk.profile.invalid_signature", "message": "Profile DSSE signature failed", "http_status": 400, "remediation": "Re-sign profile and retry"},
{"code": "risk.job.rate_limited", "message": "Rate limit exceeded", "http_status": 429, "remediation": "Retry after backoff", "retry_after": 5},
{"code": "risk.tenant.scope_denied", "message": "Tenant scope not authorized", "http_status": 403, "remediation": "Provide required scope header"}
],
"headers": {
"etag": "\"risk-api-sample-etag\"",
"x-ratelimit-remaining": 99,
"retry-after": 5
}
}


@@ -0,0 +1,61 @@
{
"submit_job_request": {
"method": "POST",
"path": "/api/v1/risk/jobs",
"headers": {
"Content-Type": "application/json",
"X-Stella-Tenant": "tenant-default"
},
"body": {
"tenant_id": "tenant-default",
"context_id": "ctx-001",
"profile_id": "default-profile",
"findings": [
{
"finding_id": "finding-123",
"component_purl": "pkg:npm/lodash@4.17.20",
"advisory_id": "CVE-2024-1234",
"trigger": "created"
}
],
"priority": "normal",
"requested_at": "2025-12-05T00:00:00Z"
},
"response": {
"status": 202,
"body": {"job_id": "job-001", "status": "queued"}
}
},
"get_job_status": {
"method": "GET",
"path": "/api/v1/risk/jobs/job-001",
"response": {
"status": 200,
"body": {
"job_id": "job-001",
"status": "completed",
"results": [
{
"finding_id": "finding-123",
"profile_id": "default-profile",
"profile_version": "1.0.0",
"raw_score": 0.75,
"normalized_score": 0.85,
"severity": "high",
"signal_values": {"cvss": 7.5, "kev": true, "reachability": 0.9},
"signal_contributions": {"cvss": 0.4, "kev": 0.3, "reachability": 0.3},
"scored_at": "2025-12-05T00:00:02Z"
}
]
}
}
},
"get_explain": {
"method": "GET",
"path": "/api/v1/risk/explain/job-001",
"response": {
"status": 200,
"body_ref": "../explain/explain-trace.json"
}
}
}


@@ -0,0 +1,3 @@
Use the root `INGEST_CHECKLIST.md`.
Store explainability outputs paired with their inputs; normalize with `jq -S .`, update `SHA256SUMS`, verify with `sha256sum -c`.
Maintain ordering and UTC timestamps; no live data.
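For environments without `jq`, the normalize-and-checksum flow above can be approximated in Python. Note that `json.dumps(sort_keys=True)` sorts keys like `jq -S` but its whitespace differs, so pick one tool per repo; the filenames here are illustrative:

```python
import hashlib
import json

def normalize_json(text: str) -> str:
    """Key-sorted, indented serialization, roughly equivalent to `jq -S .`."""
    return json.dumps(json.loads(text), sort_keys=True, indent=2) + "\n"

def sha256sums_line(name: str, content: bytes) -> str:
    """One line in the `sha256sum -c`-compatible SHA256SUMS format."""
    return f"{hashlib.sha256(content).hexdigest()}  {name}"

normalized = normalize_json('{"b": 1, "a": 2}')
print(sha256sums_line("sample.json", normalized.encode()))
```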

@@ -0,0 +1,4 @@
fe460af2699ce335199f6e26597bab4530c6f3f476d4b1f93526175597565d10 README.md
abcacb431d35d649a0deae81aecce9996b28304da6342a083f9616af6b1ca6a2 cli-explain.txt
f3f1b41f5261f50f3fc104ebeeb2649cc9866d04f9634228778551e6c3364cb8 console-frame.json
1d2e56eebf0a266f80519f073e1db532c4a4f2d7fa604ea5c05d4e208719cc7c explain-trace.json

@@ -0,0 +1,15 @@
stella risk explain job-001 --tenant tenant-default
==================================================
Finding: finding-123
Profile: default-profile v1.0.0 (hash sha256:profilehash)
Score: 0.85 (HIGH)
Gates: kev_and_reachability
Contributions (ordered)
- cvss 0.40 raw=7.5 source=nvd prov=sha256:cvsshash
- kev 0.30 raw=true source=cisa prov=sha256:kevhash
- reachability 0.30 raw=0.9 source=scanner prov=sha256:reachhash
Overrides: kev-boost (Known Exploited Vulnerability)
Provenance: job sha256:jobhash | fixtures [sha256:cvsshash, sha256:kevhash, sha256:reachhash]
Timestamp: 2025-12-05T00:00:02Z

@@ -0,0 +1,22 @@
{
"frame_id": "console-explain-001",
"captured_at": "2025-12-05T00:05:00Z",
"ui_version": "1.0.0",
"tenant_id": "tenant-default",
"finding_id": "finding-123",
"profile_id": "default-profile",
"profile_hash": "sha256:profilehash",
"score": 0.85,
"severity": "high",
"gates": ["kev_and_reachability"],
"top_contributors": [
{"factor": "cvss", "contribution": 0.4, "raw": 7.5, "source": "nvd", "provenance": "sha256:cvsshash"},
{"factor": "kev", "contribution": 0.3, "raw": true, "source": "cisa", "provenance": "sha256:kevhash"},
{"factor": "reachability", "contribution": 0.3, "raw": 0.9, "source": "scanner", "provenance": "sha256:reachhash"}
],
"charts": {
"donut": {"critical": 0, "high": 1, "medium": 0, "low": 0, "informational": 0},
"stacked": [0.4, 0.3, 0.3]
},
"provenance": {"job_hash": "sha256:jobhash", "fixtures": ["sha256:cvsshash", "sha256:kevhash", "sha256:reachhash"]}
}
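A quick fixture sanity check: the stacked chart should mirror the ordered `top_contributors` contributions. That invariant is an assumption about intent, not a documented rule:

```python
import json
import math

# Trimmed copy of the console frame fields this check needs.
frame = json.loads("""
{"top_contributors": [
   {"factor": "cvss", "contribution": 0.4},
   {"factor": "kev", "contribution": 0.3},
   {"factor": "reachability", "contribution": 0.3}],
 "charts": {"stacked": [0.4, 0.3, 0.3]}}
""")

contribs = [c["contribution"] for c in frame["top_contributors"]]
assert all(math.isclose(a, b) for a, b in zip(contribs, frame["charts"]["stacked"]))
print("stacked chart matches contributors")
```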

@@ -0,0 +1,34 @@
{
"job_id": "job-001",
"tenant_id": "tenant-default",
"context_id": "ctx-001",
"profile_id": "default-profile",
"profile_version": "1.0.0",
"profile_hash": "sha256:profilehash",
"finding_id": "finding-123",
"raw_score": 0.75,
"normalized_score": 0.85,
"severity": "high",
"signal_values": {
"cvss": 7.5,
"kev": true,
"reachability": 0.9
},
"signal_contributions": {
"cvss": 0.4,
"kev": 0.3,
"reachability": 0.3
},
"override_applied": "kev-boost",
"override_reason": "Known Exploited Vulnerability",
"gates_triggered": ["kev_and_reachability"],
"scored_at": "2025-12-05T00:00:02Z",
"provenance": {
"job_hash": "sha256:jobhash",
"fixtures": [
"sha256:cvsshash",
"sha256:kevhash",
"sha256:reachhash"
]
}
}

@@ -0,0 +1,3 @@
Use the root `INGEST_CHECKLIST.md`.
Drop factor payloads by source (epss/, kev/, reachability/, runtime/), normalize with `jq -S .`, update `SHA256SUMS`, verify with `sha256sum -c`.
Keep UTC timestamps and no live data.

@@ -0,0 +1,2 @@
fe460af2699ce335199f6e26597bab4530c6f3f476d4b1f93526175597565d10 README.md
13cf45be5a287a38d000aff4db266616e765fc1acdc1df9f37b2e03eb729d1d2 factors-normalized.json

@@ -0,0 +1,44 @@
{
"profile_id": "default-profile",
"context_id": "ctx-001",
"factors": [
{
"name": "cvss",
"source": "nvd",
"type": "numeric",
"path": "/cvss/base_score",
"raw_value": 7.5,
"normalized_value": 0.75,
"weight": 0.4,
"contribution": 0.4,
"timestamp_utc": "2025-12-05T00:00:00Z",
"provenance": "sha256:cvsshash"
},
{
"name": "kev",
"source": "cisa",
"type": "boolean",
"path": "/kev/in_catalog",
"raw_value": true,
"normalized_value": 1.0,
"weight": 0.3,
"contribution": 0.3,
"timestamp_utc": "2025-12-05T00:00:00Z",
"provenance": "sha256:kevhash"
},
{
"name": "reachability",
"source": "scanner",
"type": "numeric",
"path": "/reachability/score",
"raw_value": 0.9,
"normalized_value": 0.9,
"weight": 0.3,
"contribution": 0.3,
"timestamp_utc": "2025-12-05T00:00:01Z",
"provenance": "sha256:reachhash"
}
],
"ordering": "factor_type->source->timestamp_utc",
"precision": 4
}
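The `ordering` and `precision` fields above imply a canonicalization step before checksumming. A sketch of that step, assuming the sort key is the literal (`type`, `source`, `timestamp_utc`) tuple (the fixture field is `type`, not `factor_type`) and rounding applies to the float fields:

```python
def canonicalize(factors, precision=4):
    """Sort factors deterministically and round floats to the fixed precision."""
    ordered = sorted(
        factors, key=lambda f: (f["type"], f["source"], f["timestamp_utc"])
    )
    for f in ordered:
        f["normalized_value"] = round(f["normalized_value"], precision)
        f["contribution"] = round(f["contribution"], precision)
    return ordered

sample = [
    {"name": "reachability", "type": "numeric", "source": "scanner",
     "timestamp_utc": "2025-12-05T00:00:01Z",
     "normalized_value": 0.9, "contribution": 0.3},
    {"name": "kev", "type": "boolean", "source": "cisa",
     "timestamp_utc": "2025-12-05T00:00:00Z",
     "normalized_value": 1.0, "contribution": 0.3},
]
print([f["name"] for f in canonicalize(sample)])  # ['kev', 'reachability']
```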

@@ -0,0 +1,8 @@
| Date (UTC) | Folder | Files added | SHA256SUMS updated | Notes |
| --- | --- | --- | --- | --- |
| 2025-__-__ | profiles/ | | yes/no | source + checklist step refs |
| 2025-__-__ | factors/ | | yes/no | source + checklist step refs |
| 2025-__-__ | explain/ | | yes/no | source + checklist step refs |
| 2025-__-__ | api/ | | yes/no | source + checklist step refs |

Instructions: copy a row per drop, fill in the actual date, list the filenames, mark whether `SHA256SUMS` was updated, and note the evidence source. Keep this file sorted by date for determinism.

@@ -0,0 +1,3 @@
Use the root `INGEST_CHECKLIST.md`.
Place profile JSON/DSSE here, normalize with `jq -S .`, update `SHA256SUMS`, and verify with `sha256sum -c`.
UTC timestamps only; no live data.

@@ -0,0 +1,2 @@
fe460af2699ce335199f6e26597bab4530c6f3f476d4b1f93526175597565d10 README.md
c8242d4051232152d024dd37324b346dcf019a5e46b7b82fae8349ad802affab default-profile.json

@@ -0,0 +1,18 @@
{
"id": "default-profile",
"version": "1.0.0",
"description": "Default risk profile for vulnerability prioritization",
"extends": "base-profile",
"signals": [
{ "name": "cvss", "source": "nvd", "type": "numeric", "path": "/cvss/base_score", "transform": "normalize_10", "unit": "score" },
{ "name": "kev", "source": "cisa", "type": "boolean", "path": "/kev/in_catalog" },
{ "name": "reachability", "source": "scanner", "type": "numeric", "path": "/reachability/score", "unit": "score" }
],
"weights": { "cvss": 0.4, "kev": 0.3, "reachability": 0.3 },
"overrides": {
"severity": [ { "when": { "kev": true }, "set": "critical" } ],
"decisions": [ { "when": { "kev": true, "reachability": { "$gt": 0.8 } }, "action": "deny", "reason": "KEV with high reachability" } ]
},
"metadata": { "author": "docs-guild", "created_at": "2025-12-05T00:00:00Z" },
"provenance": { "hash": "sha256:placeholder", "signed": false }
}
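The transform and override semantics above can be sketched as follows, assuming `normalize_10` divides a 0-10 score by 10 and clamps to [0, 1] (consistent with the 7.5 → 0.75 pair in the factors fixture); the real engine's evaluation may differ:

```python
def normalize_10(raw: float) -> float:
    """Assumed semantics of the "normalize_10" transform: raw/10, clamped."""
    return min(max(raw / 10.0, 0.0), 1.0)

def apply_overrides(signals: dict):
    """Evaluate the fixture's severity and decision override rules."""
    severity, decision = None, None
    if signals.get("kev"):
        severity = "critical"                     # severity override
        if signals.get("reachability", 0) > 0.8:  # decision override
            decision = ("deny", "KEV with high reachability")
    return severity, decision

print(normalize_10(7.5))  # 0.75
print(apply_overrides({"kev": True, "reachability": 0.9}))
# ('critical', ('deny', 'KEV with high reachability'))
```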

@@ -0,0 +1,39 @@
# Scheduler agent guide
## Mission
Scheduler detects advisory/VEX deltas, computes impact windows, and orchestrates re-evaluations across Scanner and Policy Engine. Docs in this directory are the front-door contract for contributors.
## Working directory
- `docs/modules/scheduler` (docs-only); code changes live under `src/Scheduler/**` but must be coordinated via sprint plans.
## Roles & owners
- **Docs author**: curates AGENTS/TASKS/runbooks; keeps determinism/offline guidance accurate.
- **Scheduler engineer (Worker/WebService)**: aligns implementation notes with architecture and ensures observability/runbook updates land with code.
- **Observability/Ops**: maintains dashboards/rules, documents operational SLOs and alert contracts.
## Required Reading
- `docs/modules/scheduler/README.md`
- `docs/modules/scheduler/architecture.md`
- `docs/modules/scheduler/implementation_plan.md`
- `docs/modules/platform/architecture-overview.md`
## How to work
1. Open relevant sprint file in `docs/implplan/SPRINT_*.md` and set task status to `DOING` there and in `docs/modules/scheduler/TASKS.md` before starting.
2. Confirm prerequisites above are read; note any missing contracts in sprint **Decisions & Risks**.
3. Keep outputs deterministic (stable ordering, UTC ISO-8601 timestamps, sorted lists) and offline-friendly (no external fetches without mirrors).
4. When changing behavior, update runbooks and observability assets in `./operations/`.
5. On completion, set status to `DONE` in both the sprint file and `TASKS.md`; if paused, revert to `TODO` and add a brief note.
## Guardrails
- Honour the Aggregation-Only Contract where applicable (see `../../aoc/aggregation-only-contract.md`).
- No undocumented schema or API contract changes; document deltas in architecture or implementation_plan.
- Keep Offline Kit parity—document air-gapped workflows for any new feature.
- Prefer deterministic fixtures and avoid machine-specific artefacts in examples.
## Testing & determinism expectations
- Examples and snippets should be reproducible; pin sample timestamps to UTC and sort collections.
- Observability examples must align with published metric names and labels; update `operations/worker-prometheus-rules.yaml` if alert semantics change.
## Status mirrors
- Sprint tracker: `/docs/implplan/SPRINT_*.md` (source of record for Delivery Tracker).
- Local tracker: `docs/modules/scheduler/TASKS.md` (mirrors sprint status; keep in sync).
