feat(rate-limiting): Implement core rate limiting functionality with configuration, decision-making, metrics, middleware, and service registration

- Add RateLimitConfig for configuration management with YAML binding support.
- Introduce RateLimitDecision to encapsulate the result of rate limit checks.
- Implement RateLimitMetrics for OpenTelemetry metrics tracking.
- Create RateLimitMiddleware for enforcing rate limits on incoming requests.
- Develop RateLimitService to orchestrate instance and environment rate limit checks.
- Add RateLimitServiceCollectionExtensions for dependency injection registration.
This commit is contained in: master
2025-12-17 18:02:37 +02:00
parent 394b57f6bf
commit 8bbfe4d2d2
211 changed files with 47179 additions and 1590 deletions

View File

@@ -27,7 +27,7 @@
* **Signer** (caller) — authenticated via **mTLS** and **Authority** OpToks.
* **Rekor v2** — tile-backed transparency log endpoint(s).
* **MinIO (S3)** — optional archive store for DSSE envelopes & verification bundles.
* **MongoDB** — local cache of `{uuid, index, proof, artifactSha256, bundleSha256}`; job state; audit.
* **PostgreSQL** — local cache of `{uuid, index, proof, artifactSha256, bundleSha256}`; job state; audit.
* **Redis** — dedupe/idempotency keys and short-lived rate-limit buckets.
* **Licensing Service (optional)** — “endorse” call for cross-log publishing when customer opts in.
@@ -109,48 +109,70 @@ The Attestor implements RFC 6962-compliant Merkle inclusion proof verification f
---
## 2) Data model (Mongo)
## 2) Data model (PostgreSQL)
Database: `attestor`
**Collections & schemas**
**Tables & schemas**
* `entries`
* `entries` table
```
{ _id: "<rekor-uuid>",
artifact: { sha256: "<sha256>", kind: "sbom|report|vex-export", imageDigest?, subjectUri? },
bundleSha256: "<sha256>", // canonicalized DSSE
index: <int>, // log index/sequence if provided by backend
proof: { // inclusion proof
checkpoint: { origin, size, rootHash, timestamp },
inclusion: { leafHash, path[] } // Merkle path (tiles)
},
log: { url, logId? },
createdAt, status: "included|pending|failed",
signerIdentity: { mode: "keyless|kms", issuer, san?, kid? }
}
```sql
CREATE TABLE attestor.entries (
id UUID PRIMARY KEY, -- rekor-uuid
artifact_sha256 TEXT NOT NULL,
artifact_kind TEXT NOT NULL, -- sbom|report|vex-export
artifact_image_digest TEXT,
artifact_subject_uri TEXT,
bundle_sha256 TEXT NOT NULL, -- canonicalized DSSE
log_index INTEGER, -- log index/sequence if provided by backend
proof_checkpoint JSONB, -- { origin, size, rootHash, timestamp }
proof_inclusion JSONB, -- { leafHash, path[] } Merkle path (tiles)
log_url TEXT,
log_id TEXT,
created_at TIMESTAMPTZ DEFAULT NOW(),
status TEXT NOT NULL, -- included|pending|failed
signer_identity JSONB -- { mode, issuer, san?, kid? }
);
```
* `dedupe`
* `dedupe` table
```
{ key: "bundle:<sha256>", rekorUuid, createdAt, ttlAt } // idempotency key
```sql
CREATE TABLE attestor.dedupe (
key TEXT PRIMARY KEY, -- bundle:<sha256> idempotency key
rekor_uuid UUID NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW(),
ttl_at TIMESTAMPTZ NOT NULL -- for scheduled cleanup
);
```
* `audit`
* `audit` table
```
{ _id, ts, caller: { cn, mTLSThumbprint, sub, aud }, // from mTLS + OpTok
action: "submit|verify|fetch",
artifactSha256, bundleSha256, rekorUuid?, index?, result, latencyMs, backend }
```sql
CREATE TABLE attestor.audit (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
ts TIMESTAMPTZ DEFAULT NOW(),
caller_cn TEXT,
caller_mtls_thumbprint TEXT,
caller_sub TEXT,
caller_aud TEXT,
action TEXT NOT NULL, -- submit|verify|fetch
artifact_sha256 TEXT,
bundle_sha256 TEXT,
rekor_uuid UUID,
log_index INTEGER,
result TEXT NOT NULL,
latency_ms INTEGER,
backend TEXT
);
```
Indexes:
* `entries` on `artifact.sha256`, `bundleSha256`, `createdAt`, and `{status:1, createdAt:-1}`.
* `dedupe.key` unique (TTL 24-48 h).
* `audit.ts` for timerange queries.
* `entries`: indexes on `artifact_sha256`, `bundle_sha256`, `created_at`, and composite `(status, created_at DESC)`.
* `dedupe`: unique index on `key`; scheduled job cleans rows where `ttl_at < NOW()` (24-48 h retention; see the sketch below).
* `audit`: index on `ts` for timerange queries.
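A minimal sketch of these indexes and the scheduled dedupe cleanup, assuming the table definitions above (index names are illustrative):

```sql
CREATE INDEX idx_entries_artifact_sha256 ON attestor.entries (artifact_sha256);
CREATE INDEX idx_entries_bundle_sha256   ON attestor.entries (bundle_sha256);
CREATE INDEX idx_entries_created_at      ON attestor.entries (created_at);
CREATE INDEX idx_entries_status_created  ON attestor.entries (status, created_at DESC);
CREATE INDEX idx_audit_ts                ON attestor.audit (ts);

-- dedupe.key is already unique via the primary key; a scheduled job replaces Mongo-style TTL indexes
DELETE FROM attestor.dedupe WHERE ttl_at < NOW();
```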
---
@@ -207,16 +229,100 @@ public interface IContentAddressedIdGenerator
### Predicate Types
The ProofChain library defines DSSE predicates for each attestation type:
The ProofChain library defines DSSE predicates for proof chain attestations. All predicates follow the in-toto Statement/v1 format.
| Predicate | Type URI | Purpose |
|-----------|----------|---------|
| `EvidencePredicate` | `stellaops.org/evidence/v1` | Scan evidence (findings, reachability) |
| `ReasoningPredicate` | `stellaops.org/reasoning/v1` | Exploitability reasoning |
| `VexPredicate` | `stellaops.org/vex-verdict/v1` | VEX status determination |
| `ProofSpinePredicate` | `stellaops.org/proof-spine/v1` | Complete proof bundle |
#### Predicate Type Registry
**Reference:** `src/Attestor/__Libraries/StellaOps.Attestor.ProofChain/`
| Predicate | Type URI | Purpose | Signer Role |
|-----------|----------|---------|-------------|
| **Evidence** | `evidence.stella/v1` | Raw evidence from scanner/ingestor (findings, reachability data) | Scanner/Ingestor key |
| **Reasoning** | `reasoning.stella/v1` | Policy evaluation trace with inputs and intermediate findings | Policy/Authority key |
| **VEX Verdict** | `cdx-vex.stella/v1` | VEX verdict with status, justification, and provenance | VEXer/Vendor key |
| **Proof Spine** | `proofspine.stella/v1` | Merkle-aggregated proof spine linking evidence to verdict | Authority key |
| **Verdict Receipt** | `verdict.stella/v1` | Final surfaced decision receipt with policy rule reference | Authority key |
| **SBOM Linkage** | `https://stella-ops.org/predicates/sbom-linkage/v1` | SBOM-to-component linkage metadata | Generator key |
#### Evidence Statement (`evidence.stella/v1`)
Captures raw evidence collected from scanners or vulnerability feeds.
| Field | Type | Description |
|-------|------|-------------|
| `source` | string | Scanner or feed name that produced this evidence |
| `sourceVersion` | string | Version of the source tool |
| `collectionTime` | DateTimeOffset | UTC timestamp when evidence was collected |
| `sbomEntryId` | string | Reference to the SBOM entry this evidence relates to |
| `vulnerabilityId` | string? | CVE or vulnerability identifier if applicable |
| `rawFinding` | object | Pointer to or inline representation of raw finding data |
| `evidenceId` | string | Content-addressed ID (`sha256:<hash>`) |
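An illustrative in-toto Statement carrying this predicate; all values are placeholders, and the subject/envelope wrapping follows the Statement/v1 format noted above:

```json
{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [
    { "name": "pkg:npm/lodash@4.17.21", "digest": { "sha256": "<artifact-sha256>" } }
  ],
  "predicateType": "evidence.stella/v1",
  "predicate": {
    "source": "osv",
    "sourceVersion": "1.2.3",
    "collectionTime": "2025-12-17T00:00:00Z",
    "sbomEntryId": "<sbom-entry-id>",
    "vulnerabilityId": "CVE-2025-12345",
    "rawFinding": { "ref": "<raw-finding-pointer>" },
    "evidenceId": "sha256:<hash>"
  }
}
```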
#### Reasoning Statement (`reasoning.stella/v1`)
Captures policy evaluation traces linking evidence to decisions.
| Field | Type | Description |
|-------|------|-------------|
| `sbomEntryId` | string | SBOM entry this reasoning applies to |
| `evidenceIds` | string[] | Evidence IDs considered in this reasoning |
| `policyVersion` | string | Version of the policy used for evaluation |
| `inputs` | object | Inputs to the reasoning process (evaluation time, thresholds, lattice rules) |
| `intermediateFindings` | object? | Intermediate findings from the evaluation |
| `reasoningId` | string | Content-addressed ID (`sha256:<hash>`) |
#### VEX Verdict Statement (`cdx-vex.stella/v1`)
Captures VEX status determinations with provenance.
| Field | Type | Description |
|-------|------|-------------|
| `sbomEntryId` | string | SBOM entry this verdict applies to |
| `vulnerabilityId` | string | CVE, GHSA, or other vulnerability identifier |
| `status` | string | VEX status: `not_affected`, `affected`, `fixed`, `under_investigation` |
| `justification` | string | Justification for the VEX status |
| `policyVersion` | string | Version of the policy used |
| `reasoningId` | string | Reference to the reasoning that led to this verdict |
| `vexVerdictId` | string | Content-addressed ID (`sha256:<hash>`) |
#### Proof Spine Statement (`proofspine.stella/v1`)
Merkle-aggregated proof bundle linking all chain components.
| Field | Type | Description |
|-------|------|-------------|
| `sbomEntryId` | string | SBOM entry this proof spine covers |
| `evidenceIds` | string[] | Sorted list of evidence IDs included in this proof bundle |
| `reasoningId` | string | Reasoning ID linking evidence to verdict |
| `vexVerdictId` | string | VEX verdict ID for this entry |
| `policyVersion` | string | Version of the policy used |
| `proofBundleId` | string | Content-addressed ID (`sha256:<merkle_root>`) |
#### Verdict Receipt Statement (`verdict.stella/v1`)
Final surfaced decision receipt with full provenance.
| Field | Type | Description |
|-------|------|-------------|
| `graphRevisionId` | string | Graph revision ID this verdict was computed from |
| `findingKey` | object | Finding key (sbomEntryId + vulnerabilityId) |
| `rule` | object | Policy rule that produced this verdict |
| `decision` | object | Decision made by the rule |
| `inputs` | object | Inputs used to compute this verdict |
| `outputs` | object | Outputs/references from this verdict |
| `createdAt` | DateTimeOffset | UTC timestamp when verdict was created |
#### SBOM Linkage Statement (`sbom-linkage/v1`)
SBOM-to-component linkage metadata.
| Field | Type | Description |
|-------|------|-------------|
| `sbom` | object | SBOM descriptor (id, format, specVersion, mediaType, sha256, location) |
| `generator` | object | Generator tool descriptor |
| `generatedAt` | DateTimeOffset | UTC timestamp when linkage was generated |
| `incompleteSubjects` | object[]? | Subjects that could not be fully resolved |
| `tags` | object? | Arbitrary tags for classification or filtering |
**Reference:** `src/Attestor/__Libraries/StellaOps.Attestor.ProofChain/Statements/`
---
@@ -354,7 +460,7 @@ The ProofChain library defines DSSE predicates for each attestation type:
### 4.5 Bulk verification
`POST /api/v1/rekor/verify:bulk` enqueues a verification job containing up to `quotas.bulk.maxItemsPerJob` items. Each item mirrors the single verification payload (uuid | artifactSha256 | subject+envelopeId, optional policyVersion/refreshProof). The handler persists a MongoDB job document (`bulk_jobs` collection) and returns `202 Accepted` with a job descriptor and polling URL.
`POST /api/v1/rekor/verify:bulk` enqueues a verification job containing up to `quotas.bulk.maxItemsPerJob` items. Each item mirrors the single verification payload (uuid | artifactSha256 | subject+envelopeId, optional policyVersion/refreshProof). The handler persists a PostgreSQL job record (`bulk_jobs` table) and returns `202 Accepted` with a job descriptor and polling URL.
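A sketch of the request body; the `items` wrapper and exact casing are illustrative, while the per-item fields follow the description above:

```json
{
  "items": [
    { "uuid": "<rekor-uuid>", "refreshProof": true },
    { "artifactSha256": "<sha256>", "policyVersion": "<policy-version>" },
    { "subject": "<subject>", "envelopeId": "<envelope-id>" }
  ]
}
```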
`GET /api/v1/rekor/verify:bulk/{jobId}` returns progress and per-item results (subject/uuid, status, issues, cached verification report if available). Jobs are tenant- and subject-scoped; only the initiating principal can read their progress.
@@ -405,7 +511,7 @@ The worker honours `bulkVerification.itemDelayMilliseconds` for throttling and r
## 7) Storage & archival
* **Entries** in Mongo provide a local ledger keyed by `rekorUuid` and **artifact sha256** for quick reverse lookups.
* **Entries** in PostgreSQL provide a local ledger keyed by `rekorUuid` and **artifact sha256** for quick reverse lookups.
* **S3 archival** (if enabled):
```
@@ -505,8 +611,8 @@ attestor:
mirror:
enabled: false
url: "https://rekor-v2.mirror"
mongo:
uri: "mongodb://mongo/attestor"
postgres:
connectionString: "Host=postgres;Port=5432;Database=attestor;Username=stellaops;Password=secret"
s3:
enabled: true
endpoint: "http://minio:9000"

View File

@@ -1,97 +1,97 @@
# Authority Backup & Restore Runbook
## Scope
- **Applies to:** StellaOps Authority deployments running the official `ops/authority/docker-compose.authority.yaml` stack or equivalent Kubernetes packaging.
- **Artifacts covered:** MongoDB (`stellaops-authority` database), Authority configuration (`etc/authority.yaml`), plugin manifests under `etc/authority.plugins/`, and signing key material stored in the `authority-keys` volume (defaults to `/app/keys` inside the container).
- **Frequency:** Run the full procedure prior to upgrades, before rotating keys, and at least once per 24h in production. Store snapshots in an encrypted, access-controlled vault.
## Inventory Checklist
| Component | Location (compose default) | Notes |
| --- | --- | --- |
| Mongo data | `mongo-data` volume (`/var/lib/docker/volumes/.../mongo-data`) | Contains all Authority collections (`AuthorityUser`, `AuthorityClient`, `AuthorityToken`, etc.). |
| Configuration | `etc/authority.yaml` | Mounted read-only into the container at `/etc/authority.yaml`. |
| Plugin manifests | `etc/authority.plugins/*.yaml` | Includes `standard.yaml` with `tokenSigning.keyDirectory`. |
| Signing keys | `authority-keys` volume -> `/app/keys` | Path is derived from `tokenSigning.keyDirectory` (defaults to `../keys` relative to the manifest). |
> **TIP:** Confirm the deployed key directory via `tokenSigning.keyDirectory` in `etc/authority.plugins/standard.yaml`; some installations relocate keys to `/var/lib/stellaops/authority/keys`.
## Hot Backup (no downtime)
1. **Create output directory:** `mkdir -p backup/$(date +%Y-%m-%d)` on the host.
2. **Dump Mongo:**
```bash
docker compose -f ops/authority/docker-compose.authority.yaml exec mongo \
mongodump --archive=/dump/authority-$(date +%Y%m%dT%H%M%SZ).gz \
--gzip --db stellaops-authority
docker compose -f ops/authority/docker-compose.authority.yaml cp \
mongo:/dump/authority-$(date +%Y%m%dT%H%M%SZ).gz backup/
```
The `mongodump` archive preserves indexes and can be restored with `mongorestore --archive --gzip`.
3. **Capture configuration + manifests:**
```bash
cp etc/authority.yaml backup/
rsync -a etc/authority.plugins/ backup/authority.plugins/
```
4. **Export signing keys:** the compose file maps `authority-keys` to a local Docker volume. Snapshot it without stopping the service:
```bash
docker run --rm \
-v authority-keys:/keys \
-v "$(pwd)/backup:/backup" \
busybox tar czf /backup/authority-keys-$(date +%Y%m%dT%H%M%SZ).tar.gz -C /keys .
```
5. **Checksum:** generate SHA-256 digests for every file and store them alongside the artefacts.
6. **Encrypt & upload:** wrap the backup folder using your secrets management standard (e.g., age, GPG) and upload to the designated offline vault.
## Cold Backup (planned downtime)
1. Notify stakeholders and drain traffic (CLI clients should refresh tokens afterwards).
2. Stop services:
```bash
docker compose -f ops/authority/docker-compose.authority.yaml down
```
3. Back up volumes directly using `tar`:
```bash
docker run --rm -v mongo-data:/data -v "$(pwd)/backup:/backup" \
busybox tar czf /backup/mongo-data-$(date +%Y%m%d).tar.gz -C /data .
docker run --rm -v authority-keys:/keys -v "$(pwd)/backup:/backup" \
busybox tar czf /backup/authority-keys-$(date +%Y%m%d).tar.gz -C /keys .
```
4. Copy configuration + manifests as in the hot backup (steps 3-6).
5. Restart services and verify health:
```bash
docker compose -f ops/authority/docker-compose.authority.yaml up -d
curl -fsS http://localhost:8080/ready
```
## Restore Procedure
1. **Provision clean volumes:** remove existing volumes if you're rebuilding a node (`docker volume rm mongo-data authority-keys`), then recreate the compose stack so empty volumes exist.
2. **Restore Mongo:**
```bash
docker compose exec -T mongo mongorestore --archive --gzip --drop < backup/authority-YYYYMMDDTHHMMSSZ.gz
```
Use `--drop` to replace collections; omit if doing a partial restore.
3. **Restore configuration/manifests:** copy `authority.yaml` and `authority.plugins/*` into place before starting the Authority container.
4. **Restore signing keys:** untar into the mounted volume:
```bash
docker run --rm -v authority-keys:/keys -v "$(pwd)/backup:/backup" \
busybox tar xzf /backup/authority-keys-YYYYMMDD.tar.gz -C /keys
```
Ensure file permissions remain `600` for private keys (`chmod -R 600`).
5. **Start services & validate:**
```bash
docker compose up -d
curl -fsS http://localhost:8080/health
```
# Authority Backup & Restore Runbook
## Scope
- **Applies to:** StellaOps Authority deployments running the official `ops/authority/docker-compose.authority.yaml` stack or equivalent Kubernetes packaging.
- **Artifacts covered:** PostgreSQL (`stellaops-authority` database), Authority configuration (`etc/authority.yaml`), plugin manifests under `etc/authority.plugins/`, and signing key material stored in the `authority-keys` volume (defaults to `/app/keys` inside the container).
- **Frequency:** Run the full procedure prior to upgrades, before rotating keys, and at least once per 24 h in production. Store snapshots in an encrypted, access-controlled vault.
## Inventory Checklist
| Component | Location (compose default) | Notes |
| --- | --- | --- |
| PostgreSQL data | `postgres-data` volume (`/var/lib/docker/volumes/.../postgres-data`) | Contains all Authority tables (`authority_user`, `authority_client`, `authority_token`, etc.). |
| Configuration | `etc/authority.yaml` | Mounted read-only into the container at `/etc/authority.yaml`. |
| Plugin manifests | `etc/authority.plugins/*.yaml` | Includes `standard.yaml` with `tokenSigning.keyDirectory`. |
| Signing keys | `authority-keys` volume -> `/app/keys` | Path is derived from `tokenSigning.keyDirectory` (defaults to `../keys` relative to the manifest). |
> **TIP:** Confirm the deployed key directory via `tokenSigning.keyDirectory` in `etc/authority.plugins/standard.yaml`; some installations relocate keys to `/var/lib/stellaops/authority/keys`.
## Hot Backup (no downtime)
1. **Create output directory:** `mkdir -p backup/$(date +%Y-%m-%d)` on the host.
2. **Dump PostgreSQL:**
```bash
docker compose -f ops/authority/docker-compose.authority.yaml exec postgres \
pg_dump -Fc -d stellaops-authority \
-f /dump/authority-$(date +%Y%m%dT%H%M%SZ).dump
docker compose -f ops/authority/docker-compose.authority.yaml cp \
postgres:/dump/authority-$(date +%Y%m%dT%H%M%SZ).dump backup/
```
The `pg_dump` archive preserves indexes and can be restored with `pg_restore`.
3. **Capture configuration + manifests:**
```bash
cp etc/authority.yaml backup/
rsync -a etc/authority.plugins/ backup/authority.plugins/
```
4. **Export signing keys:** the compose file maps `authority-keys` to a local Docker volume. Snapshot it without stopping the service:
```bash
docker run --rm \
-v authority-keys:/keys \
-v "$(pwd)/backup:/backup" \
busybox tar czf /backup/authority-keys-$(date +%Y%m%dT%H%M%SZ).tar.gz -C /keys .
```
5. **Checksum:** generate SHA-256 digests for every file and store them alongside the artefacts.
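   For example (file names follow the earlier steps; adjust to your actual backup layout):
   ```bash
   cd backup
   sha256sum authority-*.dump authority-keys-*.tar.gz authority.yaml authority.plugins/*.yaml > SHA256SUMS
   sha256sum -c SHA256SUMS   # verify before encrypting and uploading
   ```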
6. **Encrypt & upload:** wrap the backup folder using your secrets management standard (e.g., age, GPG) and upload to the designated offline vault.
## Cold Backup (planned downtime)
1. Notify stakeholders and drain traffic (CLI clients should refresh tokens afterwards).
2. Stop services:
```bash
docker compose -f ops/authority/docker-compose.authority.yaml down
```
3. Back up volumes directly using `tar`:
```bash
docker run --rm -v postgres-data:/data -v "$(pwd)/backup:/backup" \
busybox tar czf /backup/postgres-data-$(date +%Y%m%d).tar.gz -C /data .
docker run --rm -v authority-keys:/keys -v "$(pwd)/backup:/backup" \
busybox tar czf /backup/authority-keys-$(date +%Y%m%d).tar.gz -C /keys .
```
4. Copy configuration + manifests as in the hot backup (steps 3-6).
5. Restart services and verify health:
```bash
docker compose -f ops/authority/docker-compose.authority.yaml up -d
curl -fsS http://localhost:8080/ready
```
## Restore Procedure
1. **Provision clean volumes:** remove existing volumes if you're rebuilding a node (`docker volume rm postgres-data authority-keys`), then recreate the compose stack so empty volumes exist.
2. **Restore PostgreSQL:**
```bash
docker compose exec -T postgres pg_restore -d stellaops-authority --clean < backup/authority-YYYYMMDDTHHMMSSZ.dump
```
Use `--clean` to drop existing objects before restoring; omit if doing a partial restore.
3. **Restore configuration/manifests:** copy `authority.yaml` and `authority.plugins/*` into place before starting the Authority container.
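   For example, mirroring the hot-backup capture step (paths assume the compose layout used above):
   ```bash
   cp backup/authority.yaml etc/authority.yaml
   rsync -a backup/authority.plugins/ etc/authority.plugins/
   ```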
4. **Restore signing keys:** untar into the mounted volume:
```bash
docker run --rm -v authority-keys:/keys -v "$(pwd)/backup:/backup" \
busybox tar xzf /backup/authority-keys-YYYYMMDD.tar.gz -C /keys
```
Ensure file permissions remain `600` for private keys (`chmod -R 600`).
5. **Start services & validate:**
```bash
docker compose up -d
curl -fsS http://localhost:8080/health
```
6. **Validate JWKS and tokens:** call `/jwks` and issue a short-lived token via the CLI to confirm key material matches expectations. If the restored environment requires a fresh signing key, follow the rotation SOP in [`docs/11_AUTHORITY.md`](../../../11_AUTHORITY.md) using `ops/authority/key-rotation.sh` to invoke `/internal/signing/rotate`.
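   A quick smoke test of the restored key material (assumes the port mapping used in the health checks above and that `jq` is available):
   ```bash
   curl -fsS http://localhost:8080/jwks | jq '.keys[].kid'
   ```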
## Disaster Recovery Notes
- **Air-gapped replication:** replicate archives via the Offline Update Kit transport channels; never attach USB devices without scanning.
- **Retention:** maintain 30 daily snapshots + 12 monthly archival copies. Rotate encryption keys annually.
- **Key compromise:** if signing keys are suspected compromised, restore from the latest clean backup, rotate via OPS3 (see `ops/authority/key-rotation.sh` and [`docs/11_AUTHORITY.md`](../../../11_AUTHORITY.md)), and publish a revocation notice.
- **Mongo version:** keep dump/restore images pinned to the deployment version (compose uses `mongo:7`). Driver 3.5.0 requires MongoDB **4.2+**—clusters still on 4.0 must be upgraded before restore, and future driver releases will drop 4.0 entirely.
- **PostgreSQL version:** keep dump/restore images pinned to the deployment version (compose uses `postgres:16`). Npgsql 8.x requires PostgreSQL **12+**—clusters still on older versions must be upgraded before restore.
## Verification Checklist
- [ ] `/ready` reports all identity providers ready.
- [ ] OAuth flows issue tokens signed by the restored keys.
- [ ] `PluginRegistrationSummary` logs expected providers on startup.
- [ ] Revocation manifest export (`dotnet run --project src/Authority/StellaOps.Authority`) succeeds.
- [ ] Monitoring dashboards show metrics resuming (see OPS5 deliverables).

View File

@@ -20,19 +20,19 @@
## 1) Aggregation-Only Contract guardrails
**Epic1 distilled** — the service itself is the enforcement point for AOC. The guardrail checklist is embedded in code (`AOCWriteGuard`) and must be satisfied before any advisory hits Mongo:
**Epic 1 distilled** — the service itself is the enforcement point for AOC. The guardrail checklist is embedded in code (`AOCWriteGuard`) and must be satisfied before any advisory hits PostgreSQL:
1. **No derived semantics in ingestion.** The DTOs produced by connectors cannot contain severity, consensus, reachability, merged status, or fix hints. Roslyn analyzers (`StellaOps.AOC.Analyzers`) scan connectors and fail builds if forbidden properties appear.
2. **Immutable raw docs.** Every upstream advisory is persisted in `advisory_raw` with append-only semantics. Revisions produce new `_id`s via version suffix (`:v2`, `:v3`), linking back through `supersedes`.
2. **Immutable raw rows.** Every upstream advisory is persisted in `advisory_raw` with append-only semantics. Revisions produce new IDs via version suffix (`:v2`, `:v3`), linking back through `supersedes`.
3. **Mandatory provenance.** Collectors record `source`, `upstream` metadata (`document_version`, `fetched_at`, `received_at`, `content_hash`), and signature presence before writing.
4. **Linkset only.** Derived joins (aliases, PURLs, CPEs, references) are stored inside `linkset` and never mutate `content.raw`.
5. **Deterministic canonicalisation.** Writers use canonical JSON (sorted object keys, lexicographic arrays) ensuring identical inputs yield the same hashes/diff-friendly outputs.
6. **Idempotent upserts.** `(source.vendor, upstream.upstream_id, upstream.content_hash)` uniquely identify a document. Duplicate hashes short-circuit; new hashes create a new version (see the sketch after this list).
7. **Verifier & CI.** `StellaOps.AOC.Verifier` processes observation batches in CI and at runtime, rejecting writes lacking provenance, introducing unordered collections, or violating the schema.
> Feature toggle: set `concelier:features:noMergeEnabled=true` to disable the legacy Merge module and its `merge:reconcile` job once Link-Not-Merge adoption is complete (MERGE-LNM-21-002). Analyzer `CONCELIER0002` prevents new references to Merge DI helpers when this flag is enabled.
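A minimal sketch of the idempotency rule from item 6, using a hypothetical simplified table; the real `advisory_raw` schema is owned by the storage layer:

```sql
CREATE TABLE IF NOT EXISTS advisory_raw_sketch (
    id            TEXT PRIMARY KEY,      -- "tenant:vendor:upstreamId:revision"
    source_vendor TEXT NOT NULL,
    upstream_id   TEXT NOT NULL,
    content_hash  TEXT NOT NULL,
    doc           JSONB NOT NULL,
    supersedes    TEXT
);

CREATE UNIQUE INDEX IF NOT EXISTS ux_advisory_raw_identity
    ON advisory_raw_sketch (source_vendor, upstream_id, content_hash);

-- A duplicate content hash short-circuits to a no-op; a changed hash inserts a new
-- version row whose supersedes column points at the previous revision.
INSERT INTO advisory_raw_sketch (id, source_vendor, upstream_id, content_hash, doc, supersedes)
VALUES ('default:osv:GHSA-xxxx:v3', 'osv', 'GHSA-xxxx', 'sha256:<hash>', '{}'::jsonb,
        'advisory_raw:osv:GHSA-xxxx:v2')
ON CONFLICT (source_vendor, upstream_id, content_hash) DO NOTHING;
```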
### 1.1 Advisory raw document shape
```json
{
@@ -61,28 +61,28 @@
"spec_version": "1.6",
"raw": { /* unmodified upstream document */ }
},
"identifiers": {
"primary": "GHSA-xxxx-....",
"aliases": ["CVE-2025-12345", "GHSA-xxxx-...."]
},
"linkset": {
"purls": ["pkg:npm/lodash@4.17.21"],
"cpes": ["cpe:2.3:a:lodash:lodash:4.17.21:*:*:*:*:*:*:*"],
"references": [
{"type":"advisory","url":"https://..."},
{"type":"fix","url":"https://..."}
],
"reconciled_from": ["content.raw.affected.ranges", "content.raw.pkg"]
},
"advisory_key": "CVE-2025-12345",
"links": [
{"scheme":"CVE","value":"CVE-2025-12345"},
{"scheme":"GHSA","value":"GHSA-XXXX-...."},
{"scheme":"PRIMARY","value":"CVE-2025-12345"}
],
"supersedes": "advisory_raw:osv:GHSA-xxxx-....:v2",
"tenant": "default"
}
"identifiers": {
"primary": "GHSA-xxxx-....",
"aliases": ["CVE-2025-12345", "GHSA-xxxx-...."]
},
"linkset": {
"purls": ["pkg:npm/lodash@4.17.21"],
"cpes": ["cpe:2.3:a:lodash:lodash:4.17.21:*:*:*:*:*:*:*"],
"references": [
{"type":"advisory","url":"https://..."},
{"type":"fix","url":"https://..."}
],
"reconciled_from": ["content.raw.affected.ranges", "content.raw.pkg"]
},
"advisory_key": "CVE-2025-12345",
"links": [
{"scheme":"CVE","value":"CVE-2025-12345"},
{"scheme":"GHSA","value":"GHSA-XXXX-...."},
{"scheme":"PRIMARY","value":"CVE-2025-12345"}
],
"supersedes": "advisory_raw:osv:GHSA-xxxx-....:v2",
"tenant": "default"
}
```
### 1.2 Connector lifecycle
@@ -90,7 +90,7 @@
1. **Snapshot stage** — connectors fetch signed feeds or use offline mirrors keyed by `{vendor, stream, snapshot_date}`.
2. **Parse stage** — upstream payloads are normalised into strongly-typed DTOs with UTC timestamps.
3. **Guard stage** — DTOs run through `AOCWriteGuard` performing schema validation, forbidden-field checks, provenance validation, deterministic sorting, and `_id` computation.
4. **Write stage** — append-only Mongo insert; duplicate hash is ignored, changed hash creates a new version and emits `supersedes` pointer.
4. **Write stage** — append-only PostgreSQL insert; duplicate hash is ignored, changed hash creates a new version and emits `supersedes` pointer.
5. **Event stage** — DSSE-backed events `advisory.observation.updated` and `advisory.linkset.updated` notify downstream services (Policy, Export Center, CLI).
### 1.3 Export readiness
@@ -99,7 +99,7 @@ Concelier feeds Export Center profiles (Epic10) by:
- Maintaining canonical JSON exports with deterministic manifests (`export.json`) listing content hashes, counts, and `supersedes` chains.
- Producing Trivy DB-compatible artifacts (SQLite + metadata) packaged under `db/` with hash manifests.
- Surfacing mirror manifests that reference Mongo snapshot digests, enabling Offline Kit bundle verification.
- Surfacing mirror manifests that reference PostgreSQL snapshot digests, enabling Offline Kit bundle verification.
Running the same export job twice against the same snapshot must yield byte-identical archives and manifest hashes.
@@ -109,13 +109,13 @@ Running the same export job twice against the same snapshot must yield byte-iden
**Process shape:** single ASP.NET Core service `StellaOps.Concelier.WebService` hosting:
* **Scheduler** with distributed locks (Mongo backed).
* **Scheduler** with distributed locks (PostgreSQL backed; see the lease-acquisition sketch after this list).
* **Connectors** (fetch/parse/map) that emit immutable observation candidates.
* **Observation writer** enforcing AOC invariants via `AOCWriteGuard`.
* **Linkset builder** that correlates observations into `advisory_linksets` and annotates conflicts.
* **Event publisher** emitting `advisory.observation.updated` and `advisory.linkset.updated` messages.
* **Exporters** (JSON, Trivy DB, Offline Kit slices) fed from observation/linkset stores.
* **Minimal REST** for health/status/trigger/export, raw observation reads, and evidence retrieval (`GET /vuln/evidence/advisories/{advisory_key}`).
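One way such a lease-based lock can be acquired in PostgreSQL; column names are illustrative, loosely following the `locks` shape listed in §7, and the DDL lives with the storage layer:

```sql
-- Assumes a locks table with a unique job_key column.
-- The statement only takes over an existing lease when its TTL has expired;
-- the affected-row count tells the caller whether the lock was acquired.
INSERT INTO locks (job_key, holder, acquired_at, heartbeat_at, lease_ms, ttl_at)
VALUES ('export:json', 'worker-1', NOW(), NOW(), 60000, NOW() + INTERVAL '60 seconds')
ON CONFLICT (job_key) DO UPDATE
    SET holder       = EXCLUDED.holder,
        acquired_at  = NOW(),
        heartbeat_at = NOW(),
        ttl_at       = EXCLUDED.ttl_at
    WHERE locks.ttl_at < NOW();
```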
**Scale:** HA by running N replicas; **locks** prevent overlapping jobs per source/exporter.
@@ -123,7 +123,7 @@ Running the same export job twice against the same snapshot must yield byte-iden
## 3) Canonical domain model
> Stored in MongoDB (database `concelier`), serialized with a **canonical JSON** writer (stable order, camelCase, normalized timestamps).
> Stored in PostgreSQL (database `concelier`), serialized with a **canonical JSON** writer (stable order, camelCase, normalized timestamps).
### 2.1 Core entities
@@ -300,7 +300,7 @@ public interface IFeedConnector {
1. **Connector fetch/parse/map** — connectors download upstream payloads, validate signatures, and map to DTOs (identifiers, references, raw payload, provenance).
2. **AOC guard** — `AOCWriteGuard` verifies forbidden keys, provenance completeness, tenant claims, timestamp normalization, and content hash idempotency. Violations raise `ERR_AOC_00x` mapped to structured logs and metrics.
3. **Append-only write** — observations insert into `advisory_observations`; duplicates by `(tenant, source.vendor, upstream.upstreamId, upstream.contentHash)` become no-ops; new content for same upstream id creates a supersedes chain.
4. **Change feed + event** — Mongo change streams trigger `advisory.observation.updated@1` events with deterministic payloads (IDs, hash, supersedes pointer, linkset summary). Policy Engine, Offline Kit builder, and guard dashboards subscribe.
4. **Replication + event** — PostgreSQL logical replication triggers `advisory.observation.updated@1` events with deterministic payloads (IDs, hash, supersedes pointer, linkset summary). Policy Engine, Offline Kit builder, and guard dashboards subscribe.
### 5.2 Linkset correlation
@@ -321,9 +321,9 @@ Events are emitted via NATS (primary) and Redis Stream (fallback). Consumers ack
---
## 7) Storage schema (MongoDB)
## 7) Storage schema (PostgreSQL)
### Collections & indexes (LNM path)
### Tables & indexes (LNM path)
* `concelier.sources` `{_id, type, baseUrl, enabled, notes}` connector catalog.
* `concelier.source_state` `{sourceName(unique), enabled, cursor, lastSuccess, backoffUntil, paceOverrides}` run-state (TTL indexes on `backoffUntil`).
@@ -338,15 +338,15 @@ Events are emitted via NATS (primary) and Redis Stream (fallback). Consumers ack
_id: "tenant:vendor:upstreamId:revision",
tenant,
source: { vendor, stream, api, collectorVersion },
upstream: { upstreamId, documentVersion, fetchedAt, receivedAt, contentHash, signature },
content: { format, specVersion, raw, metadata? },
identifiers: { cve?, ghsa?, vendorIds[], aliases[] },
linkset: { purls[], cpes[], aliases[], references[], reconciledFrom[] },
rawLinkset: { aliases[], purls[], cpes[], references[], reconciledFrom[], notes? },
supersedes?: "prevObservationId",
createdAt,
attributes?: object
}
```
* Indexes: `{tenant:1, upstream.upstreamId:1}`, `{tenant:1, source.vendor:1, linkset.purls:1}`, `{tenant:1, linkset.aliases:1}`, `{tenant:1, createdAt:-1}`.
@@ -389,9 +389,9 @@ Events are emitted via NATS (primary) and Redis Stream (fallback). Consumers ack
* `locks` `{_id(jobKey), holder, acquiredAt, heartbeatAt, leaseMs, ttlAt}` (TTL cleans dead locks)
* `jobs` `{_id, type, args, state, startedAt, heartbeatAt, endedAt, error}`
**Legacy collections** (`advisory`, `alias`, `affected`, `reference`, `merge_event`) remain read-only during the migration window to support back-compat exports. New code must not write to them; scheduled cleanup removes them after Link-Not-Merge GA.
**Legacy tables** (`advisory`, `alias`, `affected`, `reference`, `merge_event`) remain read-only during the migration window to support back-compat exports. New code must not write to them; scheduled cleanup removes them after Link-Not-Merge GA.
**GridFS buckets**: `fs.documents` for raw payloads (immutable); `fs.exports` for historical JSON/Trivy archives.
**Object storage**: `documents` for raw payloads (immutable); `exports` for historical JSON/Trivy archives.
---
@@ -476,7 +476,8 @@ GET /affected?productKey=pkg:rpm/openssl&limit=100
```yaml
concelier:
mongo: { uri: "mongodb://mongo/concelier" }
postgres:
connectionString: "Host=postgres;Port=5432;Database=concelier;Username=stellaops;Password=stellaops"
s3:
endpoint: "http://minio:9000"
bucket: "stellaops-concelier"
@@ -540,12 +541,12 @@ concelier:
* **Ingest**: ≥ 5k documents/min on 4 cores (CSAF/OpenVEX/JSON).
* **Normalize/map**: ≥ 50k observation statements/min on 4 cores.
* **Observation write**: ≤ 5ms P95 per document (including guard + Mongo write).
* **Observation write**: ≤ 5 ms P95 per row (including guard + PostgreSQL write).
* **Linkset build**: ≤ 15ms P95 per `(vulnerabilityId, productKey)` update, even with 20+ contributing observations.
* **Export**: 1M advisories JSON in ≤ 90s (streamed, zstd), Trivy DB in ≤ 60s on 8 cores.
* **Memory**: hard cap per job; chunked streaming writers; backpressure to avoid GC spikes.
**Scale pattern**: add Concelier replicas; Mongo scaling via indices and read/write concerns; GridFS only for oversized docs.
**Scale pattern**: add Concelier replicas; PostgreSQL scaling via indices and read/write connection pooling; object storage for oversized docs.
---
@@ -556,13 +557,13 @@ concelier:
* `concelier.fetch.docs_total{source}`
* `concelier.fetch.bytes_total{source}`
* `concelier.parse.failures_total{source}`
* `concelier.map.statements_total{source}`
* `concelier.observations.write_total{result=ok|noop|error}`
* `concelier.linksets.updated_total{result=ok|skip|error}`
* `concelier.linksets.conflicts_total{type}`
* `concelier.export.bytes{kind}`
* `concelier.export.duration_seconds{kind}`
* `advisory_ai_chunk_requests_total{tenant,result,cache}` and `advisory_ai_guardrail_blocks_total{tenant,reason,cache}` instrument the `/advisories/{key}/chunks` surfaces that Advisory AI consumes. Cache hits now emit the same guardrail counters so operators can see blocked segments even when responses are served from cache.
* **Tracing** around fetch/parse/map/observe/linkset/export.
* **Logs**: structured with `source`, `uri`, `docDigest`, `advisoryKey`, `exportId`.
@@ -604,7 +605,7 @@ concelier:
1. **MVP**: Red Hat (CSAF), SUSE (CSAF), Ubuntu (USN JSON), OSV; JSON export.
2. **Add**: GHSA GraphQL, Debian (DSA HTML/JSON), Alpine secdb; Trivy DB export.
3. **Attestation handoff**: integrate with **Signer/Attestor** (optional).
- Advisory evidence attestation parameters and path rules are documented in `docs/modules/concelier/attestation.md`.
4. **Scale & diagnostics**: provider dashboards, staleness alerts, export cache reuse.
5. **Offline kit**: end-to-end verified bundles for air-gap.

View File

@@ -22,7 +22,7 @@
Excititor enforces the same ingestion covenant as Concelier, tailored to VEX payloads:
1. **Immutable `vex_raw` documents.** Upstream OpenVEX/CSAF/CycloneDX files are stored verbatim (`content.raw`) with provenance (`issuer`, `statement_id`, timestamps, signatures). Revisions append new versions linked by `supersedes`.
1. **Immutable `vex_raw` rows.** Upstream OpenVEX/CSAF/CycloneDX files are stored verbatim (`content.raw`) with provenance (`issuer`, `statement_id`, timestamps, signatures). Revisions append new versions linked by `supersedes`.
2. **No derived consensus at ingest time.** Fields such as `effective_status`, `merged_state`, `severity`, or reachability are forbidden. Roslyn analyzers and runtime guards block violations before writes.
3. **Linkset-only joins.** Product aliases, CVE keys, SBOM hints, and references live under `linkset`; ingestion must never mutate the underlying statement.
@@ -330,11 +330,11 @@ All exports remain deterministic and, when configured, attested via DSSE + Rekor
---
## 4) Storage schema (MongoDB)
## 4) Storage schema (PostgreSQL)
Database: `excititor`
### 3.1 Collections
### 3.1 Tables
**`vex.providers`**
@@ -357,7 +357,7 @@ uri
ingestedAt
contentType
sig: { verified: bool, method: pgp|cosign|x509|none, keyId|certSubject, bundle? }
payload: GridFS pointer (if large)
payload: object storage pointer (if large)
disposition: kept|replaced|superseded
correlation: { replaces?: sha256, replacedBy?: sha256 }
```
@@ -620,7 +620,8 @@ GET /providers/{id}/status → last fetch, doc counts, signature stats
```yaml
excititor:
mongo: { uri: "mongodb://mongo/excititor" }
postgres:
connectionString: "Host=postgres;Port=5432;Database=excititor;Username=stellaops;Password=stellaops"
s3:
endpoint: http://minio:9000
bucket: stellaops
@@ -703,7 +704,7 @@ Run the ingestion endpoint once after applying migration `20251019-consensus-sig
* **Scaling:**
* WebService handles control APIs; **Worker** background services (same image) execute fetch/normalize in parallel with rate limits; Mongo writes batched; upserts by natural keys.
* WebService handles control APIs; **Worker** background services (same image) execute fetch/normalize in parallel with rate limits; PostgreSQL writes batched; upserts by natural keys.
* Exports stream straight to S3 (MinIO) with rolling buffers.
* **Caching:**
@@ -760,7 +761,7 @@ Excititor.Worker ships with a background refresh service that re-evaluates stale
* **Dashboards:** provider staleness, linkset conflict hot spots, signature posture, export cache hit-rate.
* **Telemetry configuration:** `Excititor:Telemetry` toggles OpenTelemetry for the host (`Enabled`, `EnableTracing`, `EnableMetrics`, `ServiceName`, `OtlpEndpoint`, optional `OtlpHeaders` and `ResourceAttributes`). Point it at the collector profile listed in `docs/observability/observability.md` so Excititor's `ingestion_*` metrics land in the same Grafana dashboards as Concelier.
* **Health endpoint:** `/obs/excititor/health` (scope `vex.admin`) surfaces ingest/link/signature/conflict SLOs for Console + Grafana. Thresholds are configurable via `Excititor:Observability:*` (see `docs/observability/observability.md`).
* **Local replica set:** `tools/mongodb/local-mongo.sh start` downloads the vetted MongoDB binaries (6.0.x), boots a `rs0` single-node replica set, and prints the `EXCITITOR_TEST_MONGO_URI` export line so storage/integration tests can bypass Mongo2Go. `restart` restarts in-place, `clean` wipes the managed data/logs for deterministic runs, and `stop/status/logs` cover teardown/inspection.
* **Local database:** Use Docker Compose or `tools/postgres/local-postgres.sh start` to boot a PostgreSQL instance for storage/integration tests. `restart` restarts in-place, `clean` wipes the managed data/logs for deterministic runs, and `stop/status/logs` cover teardown/inspection. A short usage sketch follows this list.
* **API headers:** responses echo `X-Stella-TraceId` and `X-Stella-CorrelationId` to keep Console/Loki links deterministic; inbound correlation headers are preserved when present.
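Typical usage of the helper script, using only the subcommands listed above:

```bash
tools/postgres/local-postgres.sh start     # boot a local PostgreSQL for storage/integration tests
tools/postgres/local-postgres.sh status    # inspect the managed instance
tools/postgres/local-postgres.sh clean     # wipe data/logs for a deterministic rerun
tools/postgres/local-postgres.sh stop      # tear down
```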
---

View File

@@ -4,11 +4,11 @@
The Export Center is the dedicated service layer that packages StellaOps evidence and policy overlays into reproducible bundles. It runs as a multi-surface API backed by asynchronous workers and format adapters, enforcing Aggregation-Only Contract (AOC) guardrails while providing deterministic manifests, signing, and distribution paths.
## Runtime topology
- **Export Center API (`StellaOps.ExportCenter.WebService`).** Receives profile CRUD, export run requests, status queries, and download streams through the unified Web API gateway. Enforces tenant scopes, RBAC, quotas, and concurrency guards.
- **Export Center Worker (`StellaOps.ExportCenter.Worker`).** Dequeues export jobs from the Orchestrator, resolves selectors, invokes adapters, and writes manifests and bundle artefacts. Stateless; scales horizontally.
- **Backing stores.**
- MongoDB collections: `export_profiles`, `export_runs`, `export_inputs`, `export_distributions`, `export_events`.
- PostgreSQL tables: `export_profiles`, `export_runs`, `export_inputs`, `export_distributions`, `export_events`.
- Object storage bucket or filesystem for staging bundle payloads.
- Optional registry/object storage credentials injected via Authority-scoped secrets.
- **Integration peers.**
@@ -16,16 +16,16 @@ The Export Center is the dedicated service layer that packages StellaOps evidenc
- **Policy Engine** for deterministic policy snapshots and evaluated findings.
- **Orchestrator** for job scheduling, quotas, and telemetry fan-out.
- **Authority** for tenant-aware access tokens and KMS key references.
- **Console & CLI** as presentation surfaces consuming the API.
## Gap remediation (EC1-EC10)
- Schemas: publish signed `ExportProfile` + manifest schemas with selector validation; keep in repo alongside OpenAPI docs.
- Determinism: per-adapter ordering/compression rules with rerun-hash CI; pin Trivy DB schema versions.
- Provenance: DSSE/SLSA attestations with log metadata for every export run; include tenant IDs in predicates.
- Integrity: require checksum/signature headers and OCI annotations; mirror delta/tombstone rules documented for adapters.
- Security: cross-tenant exports denied by default; enforce approval tokens and encryption recipient validation.
- Offline parity: provide export-kit packaging + verify script for air-gap consumers; include fixtures under `src/ExportCenter/__fixtures`.
- Advisory link: see `docs/product-advisories/28-Nov-2025 - Export Center and Reporting Strategy.md` (EC1-EC10) for original requirements and keep it alongside sprint tasks for implementers.
## Job lifecycle
1. **Profile selection.** Operator or automation picks a profile (`json:raw`, `json:policy`, `trivy:db`, `trivy:java-db`, `mirror:full`, `mirror:delta`) and submits scope selectors (tenant, time window, products, SBOM subjects, ecosystems). See `docs/modules/export-center/profiles.md` for profile definitions and configuration fields.
@@ -58,7 +58,7 @@ Cancellation requests mark runs as `aborted` and cause workers to stop iterating
All endpoints require Authority-issued JWT + DPoP tokens with scopes `export:run`, `export:read`, and tenant claim alignment. Rate-limiting and quotas surface via `X-Stella-Quota-*` headers.
### Worker pipeline
- **Input resolvers.** Query Findings Ledger and Policy Engine using stable pagination (Mongo `_id` ascending, or resume tokens for change streams). Selector expressions compile into Mongo filter fragments and/or API query parameters.
- **Input resolvers.** Query Findings Ledger and Policy Engine using stable pagination (PostgreSQL `id` ascending, or cursor-based pagination). Selector expressions compile into PostgreSQL WHERE clauses and/or API query parameters.
- **Adapter host.** Adapter plugin loader (restart-time only) resolves profile variant to adapter implementation. Adapters present a deterministic `RunAsync(context)` contract with streaming writers and telemetry instrumentation.
- **Content writers.**
- JSON adapters emit `.jsonl.zst` files with canonical ordering (tenant, subject, document id).
@@ -75,40 +75,40 @@ All endpoints require Authority-issued JWT + DPoP tokens with scopes `export:run
| `export_profiles` | Profile definitions (kind, variant, config). | `_id`, `tenant`, `name`, `kind`, `variant`, `config_json`, `created_by`, `created_at`. | Config includes adapter parameters (included record types, compression, encryption). |
| `export_runs` | Run state machine and audit info. | `_id`, `profile_id`, `tenant`, `status`, `requested_by`, `selectors`, `policy_snapshot_id`, `started_at`, `completed_at`, `duration_ms`, `error_code`. | Immutable selectors; status transitions recorded in `export_events`. |
| `export_inputs` | Resolved input ranges. | `run_id`, `source`, `cursor`, `count`, `hash`. | Enables resumable retries and audit. |
| `export_distributions` | Distribution artefacts. | `run_id`, `type` (`http`, `oci`, `object`), `location`, `sha256`, `size_bytes`, `expires_at`. | `expires_at` used for retention policies and automatic pruning. |
| `export_events` | Timeline of state transitions and metrics. | `run_id`, `event_type`, `message`, `at`, `metrics`. | Feeds SSE stream and audit trails. |
## Audit bundles (immutable triage exports)
Audit bundles are a specialized Export Center output: a deterministic, immutable evidence pack for a single subject (and optional time window) suitable for audits and incident response.
- **Schema**: `docs/schemas/audit-bundle-index.schema.json` (bundle index/manifest with integrity hashes and referenced artefacts).
- **Core APIs**:
- `POST /v1/audit-bundles` - Create a new bundle (async generation).
- `GET /v1/audit-bundles` - List previously created bundles.
- `GET /v1/audit-bundles/{bundleId}` - Returns job metadata (`Accept: application/json`) or streams bundle bytes (`Accept: application/octet-stream`); see the retrieval sketch after this list.
- **Typical contents**: vuln reports, SBOM(s), VEX decisions, policy evaluations, and DSSE attestations, plus an integrity root hash and optional OCI reference.
- **Reference**: `docs/product-advisories/archived/27-Nov-2025-superseded/28-Nov-2025 - Vulnerability Triage UX & VEX-First Decisioning.md`.
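A minimal retrieval sketch using the content negotiation described above; the base URL, bundle ID, output filename, and auth handling are placeholders (requests also need the Authority-issued tokens noted elsewhere):

```bash
# Job metadata as JSON
curl -fsS -H "Accept: application/json" \
  "$EXPORT_CENTER_URL/v1/audit-bundles/<bundleId>"

# Bundle bytes as a stream
curl -fsS -H "Accept: application/octet-stream" \
  -o audit-bundle.bin \
  "$EXPORT_CENTER_URL/v1/audit-bundles/<bundleId>"
```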
## Adapter responsibilities
- **JSON (`json:raw`, `json:policy`).**
- Ensures canonical casing, timezone normalization, and linkset preservation.
- Policy variant embeds policy snapshot metadata (`policy_version`, `inputs_hash`, `decision_trace` fingerprint) and emits evaluated findings as separate files.
- Enforces AOC guardrails: no derived modifications to raw evidence fields.
- **Trivy (`trivy:db`, `trivy:java-db`).**
- Maps StellaOps advisory schema to Trivy DB format, handling namespace collisions and ecosystem-specific ranges.
- Validates compatibility against supported Trivy schema versions; run fails fast if mismatch.
- Emits optional manifest summarising package counts and severity distribution.
- **Mirror (`mirror:full`, `mirror:delta`).**
- Builds self-contained filesystem layout (`/manifests`, `/data/raw`, `/data/policy`, `/indexes`).
- Delta variant compares against base manifest (`base_export_id`) to write only changed artefacts; records `removed` entries for cleanup.
- Supports optional encryption of `/data` subtree (age/AES-GCM) with key wrapping stored in `provenance.json`.
- **DevPortal (`devportal:offline`).**
- Packages developer portal static assets, OpenAPI specs, SDK releases, and changelog content into a reproducible archive with manifest/checksum pairs.
- Emits `manifest.json`, `checksums.txt`, helper scripts, and a DSSE signature document (`manifest.dsse.json`) as described in [DevPortal Offline Bundle Specification](devportal-offline.md).
- Stores artefacts under `<storagePrefix>/<bundleId>/` and signs manifests via the Export Center signing adapter (HMAC-SHA256 v1, tenant scoped).
Adapters expose structured telemetry events (`adapter.start`, `adapter.chunk`, `adapter.complete`) with record counts and byte totals per chunk. Failures emit `adapter.error` with reason codes.
## Signing and provenance
- **Manifest schema.** `export.json` contains run metadata, profile descriptor, selector summary, counts, SHA-256 digests, compression hints, and distribution list. Deterministic field ordering and normalized timestamps.
@@ -122,11 +122,11 @@ Adapters expose structured telemetry events (`adapter.start`, `adapter.chunk`, `
- **Object storage.** Writes to tenant-prefixed paths (`s3://stella-exports/{tenant}/{run-id}/...`) with immutable retention policies. Retention scheduler purges expired runs based on profile configuration.
- **Offline Kit seeding.** Mirror bundles optionally staged into Offline Kit assembly pipelines, inheriting the same manifests and signatures.
## Observability
- **Metrics.** Emits `exporter_run_duration_seconds`, `exporter_run_bytes_total{profile}`, `exporter_run_failures_total{error_code}`, `exporter_active_runs{tenant}`, `exporter_distribution_push_seconds{type}`.
- **Logs.** Structured logs with fields `run_id`, `tenant`, `profile_kind`, `adapter`, `phase`, `correlation_id`, `error_code`. Phases include `plan`, `resolve`, `adapter`, `manifest`, `sign`, `distribute`.
- **Traces.** Optional OpenTelemetry spans (`export.plan`, `export.fetch`, `export.write`, `export.sign`, `export.distribute`) for cross-service correlation.
- **Dashboards & alerts.** DevOps pipeline seeds Grafana dashboards summarising throughput, size, failure ratios, and distribution latency. Alert thresholds: failure rate >5% per profile, median run duration >p95 baseline, signature verification failures >0. Runbook + dashboard stub for offline import: `operations/observability.md`, `operations/dashboards/export-center-observability.json`.
## Security posture
- Tenant claim enforced at every query and distribution path; cross-tenant selectors rejected unless explicit cross-tenant mirror feature toggled with signed approval.
@@ -139,7 +139,7 @@ Adapters expose structured telemetry events (`adapter.start`, `adapter.chunk`, `
- Packaged as separate API and worker containers. Helm chart and compose overlays define horizontal scaling, worker concurrency, queue leases, and object storage credentials.
- Requires Authority client credentials for KMS and optional registry credentials stored via sealed secrets.
- Offline-first deployments disable OCI distribution by default and provide local object storage endpoints; HTTP downloads served via internal gateway.
- Health endpoints: `/health/ready` validates Mongo connectivity, object storage access, adapter registry integrity, and KMS signer readiness.
- Health endpoints: `/health/ready` validates PostgreSQL connectivity, object storage access, adapter registry integrity, and KMS signer readiness.
## Compliance checklist
- [ ] Profiles and runs enforce tenant scoping; cross-tenant exports disabled unless approved.

View File

@@ -12,54 +12,54 @@
- `Advisory` and `VEXStatement` nodes linking to Concelier/Excititor records via digests.
- `PolicyVersion` nodes representing signed policy packs.
- **Edges:** directed, timestamped relationships such as `DEPENDS_ON`, `BUILT_FROM`, `DECLARED_IN`, `AFFECTED_BY`, `VEX_EXEMPTS`, `GOVERNS_WITH`, `OBSERVED_RUNTIME`. Each edge carries provenance (SRM hash, SBOM digest, policy run ID).
- **Overlays:** computed index tables providing fast access to reachability, blast radius, and differential views (e.g., `graph_overlay/vuln/{tenant}/{advisoryKey}`). Runtime endpoints emit overlays inline (`policy.overlay.v1`, `openvex.v1`) with deterministic overlay IDs (`sha256(tenant|nodeId|overlayKind)`) and sampled explain traces on policy overlays.
## 2) Pipelines
1. **Ingestion:** Cartographer/SBOM Service emit SBOM snapshots (`sbom_snapshot` events) captured by the Graph Indexer. Advisories/VEX from Concelier/Excititor generate edge updates, policy runs attach overlay metadata.
2. **ETL:** Normalises nodes/edges into canonical IDs, deduplicates, enforces tenant partitions, and writes to the graph store (planned: Neo4j-compatible or relational + adjacency lists in PostgreSQL).
3. **Overlay computation:** Batch workers build materialised views for frequently used queries (impact lists, saved queries, policy overlays) and store as immutable blobs for Offline Kit exports.
4. **Diffing:** `graph_diff` jobs compare two snapshots (e.g., pre/post deploy) and generate signed diff manifests for UI/CLI consumption.
5. **Analytics (Runtime & Signals 140.A):** background workers run Louvain-style clustering + degree/betweenness approximations on ingested graphs, emitting overlays per tenant/snapshot and writing cluster ids back to nodes when enabled.
## 3) APIs
- `POST /graph/search` — NDJSON node tiles with cursor paging, tenant + scope guards.
- `POST /graph/query` — NDJSON nodes/edges/stats/cursor with budgets (tiles/nodes/edges) and optional inline overlays (`includeOverlays=true`) emitting `policy.overlay.v1` and `openvex.v1` payloads; overlay IDs are `sha256(tenant|nodeId|overlayKind)`; policy overlay may include a sampled `explainTrace`.
- `POST /graph/paths` — bounded BFS (depth ≤6) returning path nodes/edges/stats; honours budgets and overlays.
- `POST /graph/diff` — compares `snapshotA` vs `snapshotB`, streaming node/edge added/removed/changed tiles plus stats; budget enforcement mirrors `/graph/query`.
- `POST /graph/export` — async job producing deterministic manifests (`sha256`, size, format) for `ndjson/csv/graphml/png/svg`; download via `/graph/export/{jobId}`.
- Legacy: `GET /graph/nodes/{id}`, `POST /graph/query/saved`, `GET /graph/impact/{advisoryKey}`, `POST /graph/overlay/policy` remain in spec but should align to the NDJSON surfaces above as they are brought forward.
## 4) Storage considerations
- Backed by either:
- **Document + adjacency** (Mongo collections `graph_nodes`, `graph_edges`, `graph_overlays`) with deterministic ordering and streaming exports.
- **Relational + adjacency** (PostgreSQL tables `graph_nodes`, `graph_edges`, `graph_overlays`) with deterministic ordering and streaming exports.
- Or **Graph DB** (e.g., Neo4j/Cosmos Gremlin) behind an abstraction layer; choice depends on deployment footprint.
- All storages require tenant partitioning, append-only change logs, and export manifests for Offline Kits.
## 5) Offline & export
- Each snapshot packages `nodes.jsonl`, `edges.jsonl`, `overlays/` plus manifest with hash, counts, and provenance. Export Center consumes these artefacts for graph-specific bundles.
- Saved queries and overlays include deterministic IDs so Offline Kit consumers can import and replay results.
- Runtime hosts register the SBOM ingest pipeline via `services.AddSbomIngestPipeline(...)`. Snapshot exports default to `./artifacts/graph-snapshots` but can be redirected with `STELLAOPS_GRAPH_SNAPSHOT_DIR` or the `SbomIngestOptions.SnapshotRootDirectory` callback.
- Analytics overlays are exported as NDJSON (`overlays/clusters.ndjson`, `overlays/centrality.ndjson`) ordered by node id; `overlays/manifest.json` mirrors snapshot id and counts for offline parity.
## 6) Observability
- Metrics: ingestion lag (`graph_ingest_lag_seconds`), node/edge counts, query latency per saved query, overlay generation duration.
- New analytics metrics: `graph_analytics_runs_total`, `graph_analytics_failures_total`, `graph_analytics_clusters_total`, `graph_analytics_centrality_total`, plus change-stream/backfill counters (`graph_changes_total`, `graph_backfill_total`, `graph_change_failures_total`, `graph_change_lag_seconds`).
- Logs: structured events for ETL stages and query execution (with trace IDs).
- Traces: ETL pipeline spans, query engine spans.
## 7) Rollout notes
- Phase 1: ingest SBOM + advisories, deliver impact queries.
- Phase 2: add VEX overlays, policy overlays, diff tooling.
- Phase 3: expose runtime/Zastava edges and AI-assisted recommendations (future).
### Local testing note
Set `STELLAOPS_TEST_MONGO_URI` to a reachable MongoDB instance before running `tests/Graph/StellaOps.Graph.Indexer.Tests`. The test harness falls back to `mongodb://127.0.0.1:27017`, then Mongo2Go, but the CI workflow requires the environment variable to be present to ensure upsert coverage runs against a managed database. Use `STELLAOPS_GRAPH_SNAPSHOT_DIR` (or the `AddSbomIngestPipeline` options callback) to control where graph snapshot artefacts land during local runs.
Set `STELLAOPS_TEST_POSTGRES_CONNECTION` to a reachable PostgreSQL instance before running `tests/Graph/StellaOps.Graph.Indexer.Tests`. The test harness falls back to `Host=127.0.0.1;Port=5432;Database=stellaops_test`, then Testcontainers for PostgreSQL, but the CI workflow requires the environment variable to be present to ensure upsert coverage runs against a managed database. Use `STELLAOPS_GRAPH_SNAPSHOT_DIR` (or the `AddSbomIngestPipeline` options callback) to control where graph snapshot artefacts land during local runs.
Refer to the module README and implementation plan for immediate context, and update this document once component boundaries and data flows are finalised.

View File

@@ -10,16 +10,16 @@ Issuer Directory centralises trusted VEX/CSAF publisher metadata so downstream s
- **Service name:** `stellaops/issuer-directory`
- **Framework:** ASP.NET Core minimal APIs (`net10.0`)
- **Persistence:** MongoDB (`issuer-directory.issuers`, `issuer-directory.issuer_keys`, `issuer-directory.issuer_audit`)
- **Persistence:** PostgreSQL (`issuer_directory.issuers`, `issuer_directory.issuer_keys`, `issuer_directory.issuer_audit`)
- **AuthZ:** StellaOps resource server scopes (`issuer-directory:read`, `issuer-directory:write`, `issuer-directory:admin`)
- **Audit:** Every create/update/delete emits an audit record with actor, reason, and context.
- **Bootstrap:** On startup, the service imports `data/csaf-publishers.json` into the global tenant (`@global`) and records a `seeded` audit the first time each publisher is added.
- **Key lifecycle:** API validates Ed25519 public keys, X.509 certificates, and DSSE public keys, enforces future expiries, deduplicates fingerprints, and records audit entries for create/rotate/revoke actions.
```
Clients ──> Authority (DPoP/JWT) ──> IssuerDirectory WebService ──> MongoDB
Clients ──> Authority (DPoP/JWT) ──> IssuerDirectory WebService ──> PostgreSQL
└─> Audit sink (Mongo)
└─> Audit sink (PostgreSQL)
```
## 3. Configuration
@@ -42,12 +42,12 @@ IssuerDirectory:
tenantHeader: X-StellaOps-Tenant
seedCsafPublishers: true
csafSeedPath: data/csaf-publishers.json
Mongo:
connectionString: mongodb://localhost:27017
database: issuer-directory
issuersCollection: issuers
issuerKeysCollection: issuer_keys
auditCollection: issuer_audit
Postgres:
connectionString: Host=localhost;Port=5432;Database=issuer_directory;Username=stellaops;Password=secret
schema: issuer_directory
issuersTable: issuers
issuerKeysTable: issuer_keys
auditTable: issuer_audit
```
## 4. API Surface (v0)
@@ -74,7 +74,7 @@ Payloads follow the contract in `Contracts/IssuerDtos.cs` and align with domain
## 5. Dependencies & Reuse
- `StellaOps.IssuerDirectory.Core` — domain model (`IssuerRecord`, `IssuerKeyRecord`) + application services.
- `StellaOps.IssuerDirectory.Infrastructure`MongoDB persistence, audit sink, seed loader.
- `StellaOps.IssuerDirectory.Infrastructure`PostgreSQL persistence, audit sink, seed loader.
- `StellaOps.IssuerDirectory.WebService` — minimal API host, authentication wiring.
- Shared libraries: `StellaOps.Configuration`, `StellaOps.Auth.ServerIntegration`.

View File

@@ -2,18 +2,18 @@
## Scope
- **Applies to:** Issuer Directory when deployed via Docker Compose (`deploy/compose/docker-compose.*.yaml`) or the Helm chart (`deploy/helm/stellaops`).
- **Artifacts covered:** MongoDB database `issuer-directory`, service configuration (`etc/issuer-directory.yaml`), CSAF seed file (`data/csaf-publishers.json`), and secret material for the Mongo connection string.
- **Artifacts covered:** PostgreSQL database `issuer_directory`, service configuration (`etc/issuer-directory.yaml`), CSAF seed file (`data/csaf-publishers.json`), and secret material for the PostgreSQL connection string.
- **Frequency:** Take a hot backup before every upgrade and at least daily in production. Keep encrypted copies off-site/air-gapped according to your compliance program.
## Inventory checklist
| Component | Location (Compose default) | Notes |
| --- | --- | --- |
| Mongo data | `mongo-data` volume (`/var/lib/docker/volumes/.../mongo-data`) | Contains `issuers`, `issuer_keys`, `issuer_trust_overrides`, and `issuer_audit` collections. |
| PostgreSQL data | `postgres-data` volume (`/var/lib/docker/volumes/.../postgres-data`) | Contains `issuers`, `issuer_keys`, `issuer_trust_overrides`, and `issuer_audit` tables in the `issuer_directory` schema. |
| Configuration | `etc/issuer-directory.yaml` | Mounted read-only at `/etc/issuer-directory.yaml` inside the container. |
| CSAF seed file | `src/IssuerDirectory/StellaOps.IssuerDirectory/data/csaf-publishers.json` | Ensure customised seeds are part of the backup; regenerate if you ship regional overrides. |
| Mongo secret | `.env` entry `ISSUER_DIRECTORY_MONGO_CONNECTION_STRING` or secret store export | Required to restore connectivity; treat as sensitive. |
| PostgreSQL secret | `.env` entry `ISSUER_DIRECTORY_POSTGRES_CONNECTION_STRING` or secret store export | Required to restore connectivity; treat as sensitive. |
> **Tip:** Export the secret via `kubectl get secret issuer-directory-secrets -o yaml` (sanitize before storage) or copy the Compose `.env` file into an encrypted vault.
> **Tip:** Export the secret via `kubectl get secret issuer-directory-secrets -o yaml` (sanitize before storage) or copy the Compose `.env` file into an encrypted vault. For PostgreSQL credentials, consider using `pg_dump` with connection info from environment variables.
## Hot backup (no downtime)
1. **Create output directory**
@@ -21,16 +21,17 @@
BACKUP_DIR=backup/issuer-directory/$(date +%Y-%m-%dT%H%M%S)
mkdir -p "$BACKUP_DIR"
```
2. **Dump Mongo collections**
2. **Dump PostgreSQL tables**
```bash
docker compose -f deploy/compose/docker-compose.prod.yaml exec mongo \
mongodump --archive=/dump/issuer-directory-$(date +%Y%m%dT%H%M%SZ).gz \
--gzip --db issuer-directory
docker compose -f deploy/compose/docker-compose.prod.yaml exec postgres \
pg_dump --format=custom --compress=9 \
--file=/dump/issuer-directory-$(date +%Y%m%dT%H%M%SZ).dump \
--schema=issuer_directory issuer_directory
docker compose -f deploy/compose/docker-compose.prod.yaml cp \
mongo:/dump/issuer-directory-$(date +%Y%m%dT%H%M%SZ).gz "$BACKUP_DIR/"
postgres:/dump/issuer-directory-$(date +%Y%m%dT%H%M%SZ).dump "$BACKUP_DIR/"
```
For Kubernetes, run the same `mongodump` command inside the `stellaops-mongo` pod and copy the archive via `kubectl cp`.
For Kubernetes, run the same `pg_dump` command inside the `stellaops-postgres` pod and copy the archive via `kubectl cp`.
3. **Capture configuration and seeds**
```bash
cp etc/issuer-directory.yaml "$BACKUP_DIR/"
@@ -38,8 +39,8 @@
```
4. **Capture secrets**
```bash
grep '^ISSUER_DIRECTORY_MONGO_CONNECTION_STRING=' dev.env > "$BACKUP_DIR/issuer-directory.mongo.secret"
chmod 600 "$BACKUP_DIR/issuer-directory.mongo.secret"
grep '^ISSUER_DIRECTORY_POSTGRES_CONNECTION_STRING=' dev.env > "$BACKUP_DIR/issuer-directory.postgres.secret"
chmod 600 "$BACKUP_DIR/issuer-directory.postgres.secret"
```
5. **Generate checksums and encrypt**
```bash
@@ -57,21 +58,21 @@
(For Helm: `kubectl scale deploy stellaops-issuer-directory --replicas=0`.)
3. Snapshot volumes:
```bash
docker run --rm -v mongo-data:/data \
-v "$(pwd)":/backup busybox tar czf /backup/mongo-data-$(date +%Y%m%d).tar.gz -C /data .
docker run --rm -v postgres-data:/data \
-v "$(pwd)":/backup busybox tar czf /backup/postgres-data-$(date +%Y%m%d).tar.gz -C /data .
```
4. Copy configuration, seeds, and secrets as in the hot backup.
5. Restart services and confirm `/health/live` returns `200 OK`.
## Restore procedure
1. **Provision clean volumes**
- Compose: `docker volume rm mongo-data` (optional) then `docker compose up -d mongo`.
- Helm: delete the Mongo PVC or attach a fresh volume snapshot.
2. **Restore Mongo**
- Compose: `docker volume rm postgres-data` (optional) then `docker compose up -d postgres`.
- Helm: delete the PostgreSQL PVC or attach a fresh volume snapshot.
2. **Restore PostgreSQL**
```bash
docker compose exec -T mongo \
mongorestore --archive \
--gzip --drop < issuer-directory-YYYYMMDDTHHMMSSZ.gz
docker compose exec -T postgres \
pg_restore --format=custom --clean --if-exists \
--dbname=issuer_directory < issuer-directory-YYYYMMDDTHHMMSSZ.dump
```
3. **Restore configuration/secrets**
- Copy `issuer-directory.yaml` into `etc/`.
@@ -87,7 +88,7 @@
6. **Validate**
- `curl -fsSL https://localhost:8447/health/live`
- Issue an access token and list issuers to confirm results.
- Check Mongo counts match expectations (`db.issuers.countDocuments()`, etc.).
- Check PostgreSQL counts match expectations (`SELECT COUNT(*) FROM issuer_directory.issuers;`, etc.).
- Confirm Prometheus scrapes `issuer_directory_changes_total` and `issuer_directory_key_operations_total` for the tenants you restored.
## Disaster recovery notes
@@ -98,7 +99,7 @@
## Verification checklist
- [ ] `/health/live` returns `200 OK`.
- [ ] Mongo collections (`issuers`, `issuer_keys`, `issuer_trust_overrides`) have expected counts.
- [ ] PostgreSQL tables (`issuers`, `issuer_keys`, `issuer_trust_overrides`) have expected counts.
- [ ] `issuer_directory_changes_total`, `issuer_directory_key_operations_total`, and `issuer_directory_key_validation_failures_total` metrics resume within 1 minute.
- [ ] Audit entries exist for post-restore CRUD activity.
- [ ] Client integrations (VEX Lens, Excititor) resolve issuers successfully.

View File

@@ -7,34 +7,34 @@
## 1 · Prerequisites
- Authority must be running and reachable at the issuer URL you configure (default Compose host: `https://authority:8440`).
- MongoDB 4.2+ with credentials for the `issuer-directory` database (Compose defaults to the root user defined in `.env`).
- Network access to Authority, MongoDB, and (optionally) Prometheus if you scrape metrics.
- PostgreSQL 14+ with credentials for the `issuer_directory` database (Compose defaults to the user defined in `.env`).
- Network access to Authority, PostgreSQL, and (optionally) Prometheus if you scrape metrics.
- Issuer Directory configuration file `etc/issuer-directory.yaml` checked and customised for your environment (tenant header, audiences, telemetry level, CSAF seed path).
> **Secrets:** Use `etc/secrets/issuer-directory.mongo.secret.example` as a template. Store the real connection string in an untracked file or secrets manager and reference it via environment variables (`ISSUER_DIRECTORY_MONGO_CONNECTION_STRING`) rather than committing credentials.
> **Secrets:** Use `etc/secrets/issuer-directory.postgres.secret.example` as a template. Store the real connection string in an untracked file or secrets manager and reference it via environment variables (`ISSUER_DIRECTORY_POSTGRES_CONNECTION_STRING`) rather than committing credentials.
## 2 · Deploy with Docker Compose
1. **Prepare environment variables**
```bash
cp deploy/compose/env/dev.env.example dev.env
cp etc/secrets/issuer-directory.mongo.secret.example issuer-directory.mongo.env
# Edit dev.env and issuer-directory.mongo.env with production-ready secrets.
cp etc/secrets/issuer-directory.postgres.secret.example issuer-directory.postgres.env
# Edit dev.env and issuer-directory.postgres.env with production-ready secrets.
```
2. **Inspect the merged configuration**
```bash
docker compose \
--env-file dev.env \
--env-file issuer-directory.mongo.env \
--env-file issuer-directory.postgres.env \
-f deploy/compose/docker-compose.dev.yaml config
```
The command confirms the new `issuer-directory` service resolves the port (`${ISSUER_DIRECTORY_PORT:-8447}`) and the Mongo connection string is in place.
The command confirms the new `issuer-directory` service resolves the port (`${ISSUER_DIRECTORY_PORT:-8447}`) and the PostgreSQL connection string is in place.
3. **Launch the stack**
```bash
docker compose \
--env-file dev.env \
--env-file issuer-directory.mongo.env \
--env-file issuer-directory.postgres.env \
-f deploy/compose/docker-compose.dev.yaml up -d issuer-directory
```
Compose automatically mounts `../../etc/issuer-directory.yaml` into the container at `/etc/issuer-directory.yaml`, seeds CSAF publishers, and exposes the API on `https://localhost:8447`.
@@ -43,7 +43,7 @@
| Variable | Purpose | Default |
| --- | --- | --- |
| `ISSUER_DIRECTORY_PORT` | Host port that maps to container port `8080`. | `8447` |
| `ISSUER_DIRECTORY_MONGO_CONNECTION_STRING` | Injected into `ISSUERDIRECTORY__MONGO__CONNECTIONSTRING`; should contain credentials. | `mongodb://${MONGO_INITDB_ROOT_USERNAME}:${MONGO_INITDB_ROOT_PASSWORD}@mongo:27017` |
| `ISSUER_DIRECTORY_POSTGRES_CONNECTION_STRING` | Injected into `ISSUERDIRECTORY__POSTGRES__CONNECTIONSTRING`; should contain credentials. | `Host=postgres;Port=5432;Database=issuer_directory;Username=${POSTGRES_USER};Password=${POSTGRES_PASSWORD}` |
| `ISSUER_DIRECTORY_SEED_CSAF` | Toggles CSAF bootstrap on startup. Set to `false` after the first production import if you manage issuers manually. | `true` |
4. **Smoke test**
@@ -63,7 +63,7 @@
1. **Create or update the secret**
```bash
kubectl create secret generic issuer-directory-secrets \
--from-literal=ISSUERDIRECTORY__MONGO__CONNECTIONSTRING='mongodb://stellaops:<password>@stellaops-mongo:27017' \
--from-literal=ISSUERDIRECTORY__POSTGRES__CONNECTIONSTRING='Host=stellaops-postgres;Port=5432;Database=issuer_directory;Username=stellaops;Password=<password>' \
--dry-run=client -o yaml | kubectl apply -f -
```
Add optional overrides (e.g. `ISSUERDIRECTORY__AUTHORITY__ISSUER`) if your Authority issuer differs from the default.
@@ -95,7 +95,7 @@
```bash
kubectl exec deploy/stellaops-issuer-directory -- \
curl -sf http://127.0.0.1:8080/health/live
kubectl logs deploy/stellaops-issuer-directory | grep 'IssuerDirectory Mongo connected'
kubectl logs deploy/stellaops-issuer-directory | grep 'IssuerDirectory PostgreSQL connected'
```
Prometheus should begin scraping `issuer_directory_changes_total` and related metrics (labels: `tenant`, `issuer`, `action`).

View File

@@ -10,7 +10,7 @@
* Notify **does not make policy decisions** and **does not rescan**; it **consumes** events from Scanner/Scheduler/Excititor/Concelier/Attestor/Zastava and routes them.
* Attachments are **links** (UI/attestation pages); Notify **does not** attach SBOMs or large blobs to messages.
* Secrets for channels (Slack tokens, SMTP creds) are **referenced**, not stored raw in Mongo.
* Secrets for channels (Slack tokens, SMTP creds) are **referenced**, not stored raw in the database.
* **2025-11-02 module boundary.** Maintain `src/Notify/` as the reusable notification toolkit (engine, storage, queue, connectors) and `src/Notifier/` as the Notifications Studio host that composes those libraries. Do not merge directories without an approved packaging RFC that covers build impacts, offline kit parity, and cross-module governance.
---
@@ -26,7 +26,6 @@ src/
├─ StellaOps.Notify.Engine/ # rules engine, templates, idempotency, digests, throttles
├─ StellaOps.Notify.Models/ # DTOs (Rule, Channel, Event, Delivery, Template)
├─ StellaOps.Notify.Storage.Postgres/ # canonical persistence (notify schema)
├─ StellaOps.Notify.Storage.Mongo/ # legacy shim kept only for data export/migrations
├─ StellaOps.Notify.Queue/ # bus client (Redis Streams/NATS JetStream)
└─ StellaOps.Notify.Tests.* # unit/integration/e2e
```
@@ -36,7 +35,7 @@ src/
* **Notify.WebService** (stateless API)
* **Notify.Worker** (horizontal scale)
**Dependencies**: Authority (OpToks; DPoP/mTLS), **PostgreSQL** (notify schema), Redis/NATS (bus), HTTP egress to Slack/Teams/Webhooks, SMTP relay for Email. MongoDB remains only for archival/export tooling until Phase 7 cleanup.
**Dependencies**: Authority (OpToks; DPoP/mTLS), **PostgreSQL** (notify schema), Redis/NATS (bus), HTTP egress to Slack/Teams/Webhooks, SMTP relay for Email.
> **Configuration.** Notify.WebService bootstraps from `notify.yaml` (see `etc/notify.yaml.sample`). Use `storage.driver: postgres` and provide `postgres.notify` options (`connectionString`, `schemaName`, pool sizing, timeouts). Authority settings follow the platform defaults—when running locally without Authority, set `authority.enabled: false` and supply `developmentSigningKey` so JWTs can be validated offline.
>
@@ -240,11 +239,11 @@ public interface INotifyConnector {
---
## 7) Data model (Mongo)
## 7) Data model (PostgreSQL)
Canonical JSON Schemas for rules/channels/events live in `docs/modules/notify/resources/schemas/`. Sample payloads intended for tests/UI mock responses are captured in `docs/modules/notify/resources/samples/`.
**Database**: `notify`
**Database**: `stellaops_notify` (PostgreSQL)
* `rules`
@@ -289,11 +288,11 @@ Canonical JSON Schemas for rules/channels/events live in `docs/modules/notify/re
Base path: `/api/v1/notify` (Authority OpToks; scopes: `notify.admin` for write, `notify.read` for view).
*All* REST calls require the tenant header `X-StellaOps-Tenant` (matches the canonical `tenantId` stored in Mongo). Payloads are normalised via `NotifySchemaMigration` before persistence to guarantee schema version pinning.
*All* REST calls require the tenant header `X-StellaOps-Tenant` (matches the canonical `tenantId` stored in PostgreSQL). Payloads are normalised via `NotifySchemaMigration` before persistence to guarantee schema version pinning.
Authentication today is stubbed with Bearer tokens (`Authorization: Bearer <token>`). When Authority wiring lands, this will switch to OpTok validation + scope enforcement, but the header contract will remain the same.
Service configuration exposes `notify:auth:*` keys (issuer, audience, signing key, scope names) so operators can wire the Authority JWKS or (in dev) a symmetric test key. `notify:storage:*` keys cover Mongo URI/database/collection overrides. Both sets are required for the new API surface.
Service configuration exposes `notify:auth:*` keys (issuer, audience, signing key, scope names) so operators can wire the Authority JWKS or (in dev) a symmetric test key. `notify:storage:*` keys cover PostgreSQL connection/schema overrides. Both sets are required for the new API surface.
Internal tooling can hit `/internal/notify/<entity>/normalize` to upgrade legacy JSON and return canonical output used in the docs fixtures.
@@ -347,7 +346,7 @@ Authority signs ack tokens using keys configured under `notifications.ackTokens`
* **Ingestor**: N consumers with per-key ordering (key = tenant|digest|namespace).
* **RuleMatcher**: loads active rules snapshot for tenant into memory; vectorized predicate check.
* **Throttle/Dedupe**: consult Redis + Mongo `throttles`; if hit → record `status=throttled`.
* **Throttle/Dedupe**: consult Redis + PostgreSQL `throttles`; if hit → record `status=throttled`.
* **DigestCoalescer**: append to open digest window or flush when timer expires.
* **Renderer**: select template (channel+locale), inject variables, enforce length limits, compute `bodyHash`.
* **Connector**: send; handle provider-specific rate limits and back-offs; `maxAttempts` with exponential jitter; overflow → DLQ (dead-letter topic) + UI surfacing.
@@ -367,7 +366,7 @@ Authority signs ack tokens using keys configured under `notifications.ackTokens`
## 11) Security & privacy
* **AuthZ**: all APIs require **Authority** OpToks; actions scoped by tenant.
* **Secrets**: `secretRef` only; Notify fetches justintime from Authority Secret proxy or K8s Secret (mounted). No plaintext secrets in Mongo.
* **Secrets**: `secretRef` only; Notify fetches just-in-time from Authority Secret proxy or K8s Secret (mounted). No plaintext secrets in the database.
* **Egress TLS**: validate SSL; pin domains per channel config; optional CA bundle override for on-prem SMTP.
* **Webhook signing**: HMAC or Ed25519 signatures in `X-StellaOps-Signature` + replay-window timestamp; include canonical body hash in header.
* **Redaction**: deliveries store **hashes** of bodies, not full payloads for chat/email to minimize PII retention (configurable).
@@ -456,7 +455,7 @@ notify:
| Invalid channel secret | Mark channel unhealthy; suppress sends; surface in UI |
| Rule explosion (matches everything) | Safety valve: per-tenant RPM caps; auto-pause rule after X drops; UI alert |
| Bus outage | Buffer to local queue (bounded); resume consuming when healthy |
| Mongo slowness | Fall back to Redis throttles; batch write deliveries; shed low-priority notifications |
| PostgreSQL slowness | Fall back to Redis throttles; batch write deliveries; shed low-priority notifications |
---
@@ -530,7 +529,7 @@ Bootstrap Pack. The artefacts live under `bootstrap/notify/` after running the
Offline Kit builder and include:
- `notify.yaml` — configuration derived from `etc/notify.airgap.yaml`, pointing
to the sealed MongoDB/Authority endpoints and loading connectors from the
to the sealed PostgreSQL/Authority endpoints and loading connectors from the
local plug-in directory.
- `notify-web.secret.example` — template for the Authority client secret,
intended to be renamed to `notify-web.secret` before deployment.

View File

@@ -43,7 +43,7 @@ graph TD
subgraph Ingestion["Aggregation-Only Ingestion (AOC)"]
Concelier[Concelier.WebService]
Excititor[Excititor.WebService]
RawStore[(MongoDB<br/>advisory_raw / vex_raw)]
RawStore[(PostgreSQL<br/>advisory_raw / vex_raw)]
end
subgraph Derivation["Policy & Overlay"]
Policy[Policy Engine]
@@ -106,7 +106,7 @@ Key boundaries:
|------------|---------|------------|-------|
| `advisory_raw` | Immutable vendor/ecosystem advisory documents. | `_id`, `tenant`, `source.*`, `upstream.*`, `content.raw`, `linkset`, `supersedes`. | Idempotent by `(source.vendor, upstream.upstream_id, upstream.content_hash)`. |
| `vex_raw` | Immutable vendor VEX statements. | Mirrors `advisory_raw`; `identifiers.statements` summarises affected components. | Maintains supersedes chain identical to advisory flow. |
| Change streams (`advisory_raw_stream`, `vex_raw_stream`) | Feed Policy Engine and Scheduler. | `operationType`, `documentKey`, `fullDocument`, `tenant`, `traceId`. | Scope filtered per tenant before delivery. |
| Logical replication (`advisory_raw_stream`, `vex_raw_stream`) | Feed Policy Engine and Scheduler. | `operationType`, `documentKey`, `fullDocument`, `tenant`, `traceId`. | Scope filtered per tenant before delivery. |
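Roughly, the replication feeds can be declared as publications over the raw tables; a minimal sketch (tenant scoping can be applied with PostgreSQL 15+ row filters on the publication or downstream in the consumer):

```sql
-- Sketch only: publications backing the raw-delta feeds consumed by
-- Policy Engine and Scheduler; names mirror the stream names above.
CREATE PUBLICATION advisory_raw_stream FOR TABLE advisory_raw;
CREATE PUBLICATION vex_raw_stream     FOR TABLE vex_raw;
```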
### 2.3 Guarded ingestion sequence
@@ -115,16 +115,16 @@ sequenceDiagram
participant Upstream as Upstream Source
participant Connector as Concelier/Excititor Connector
participant Guard as AOCWriteGuard
participant Mongo as MongoDB (advisory_raw / vex_raw)
participant Stream as Change Stream
participant PG as PostgreSQL (advisory_raw / vex_raw)
participant Stream as Logical Replication
participant Policy as Policy Engine
Upstream-->>Connector: CSAF / OSV / VEX document
Connector->>Connector: Normalize transport, compute content_hash
Connector->>Guard: Candidate raw doc (source + upstream + content + linkset)
Guard-->>Connector: ERR_AOC_00x on violation
Guard->>Mongo: Append immutable document (with tenant & supersedes)
Mongo-->>Stream: Change event (tenant scoped)
Guard->>PG: Append immutable row (with tenant & supersedes)
PG-->>Stream: Replication event (tenant scoped)
Stream->>Policy: Raw delta payload
Policy->>Policy: Evaluate policies, compute effective findings
```
@@ -144,9 +144,9 @@ sequenceDiagram
## 3·Data & control flow highlights
1. **Ingestion:** Concelier / Excititor connectors fetch upstream documents, compute linksets, and hand payloads to `AOCWriteGuard`. Guards validate schema, provenance, forbidden fields, supersedes pointers, and append-only rules before writing to Mongo.
1. **Ingestion:** Concelier / Excititor connectors fetch upstream documents, compute linksets, and hand payloads to `AOCWriteGuard`. Guards validate schema, provenance, forbidden fields, supersedes pointers, and append-only rules before writing to PostgreSQL.
2. **Verification:** `stella aoc verify` (CLI/CI) and `/aoc/verify` endpoints replay guard checks against stored documents, mapping `ERR_AOC_00x` codes to exit codes for automation.
3. **Policy evaluation:** Mongo change streams deliver tenant-scoped raw deltas. Policy Engine joins SBOM inventory (via BOM Index), executes deterministic policies, writes overlays, and emits events to Scheduler/Notify.
3. **Policy evaluation:** PostgreSQL logical replication delivers tenant-scoped raw deltas. Policy Engine joins SBOM inventory (via BOM Index), executes deterministic policies, writes overlays, and emits events to Scheduler/Notify.
4. **Experience surfaces:** Console renders an AOC dashboard showing ingestion latency, guard violations, and supersedes depth. CLI exposes raw-document fetch helpers for auditing. Offline Kit bundles raw collections alongside guard configs to keep air-gapped installs verifiable.
5. **Observability:** All services emit `ingestion_write_total`, `aoc_violation_total{code}`, `ingestion_latency_seconds`, and trace spans `ingest.fetch`, `ingest.transform`, `ingest.write`, `aoc.guard`. Logs correlate via `traceId`, `tenant`, `source.vendor`, and `content_hash`.
@@ -154,8 +154,8 @@ sequenceDiagram
## 4·Offline & disaster readiness
- **Offline Kit:** Packages raw Mongo snapshots (`advisory_raw`, `vex_raw`) plus guard configuration and CLI verifier binaries so air-gapped sites can re-run AOC checks before promotion.
- **Recovery:** Supersedes chains allow rollback to prior revisions without mutating documents. Disaster exercises must rehearse restoring from snapshot, replaying change streams into Policy Engine, and re-validating guard compliance.
- **Offline Kit:** Packages raw PostgreSQL snapshots (`advisory_raw`, `vex_raw`) plus guard configuration and CLI verifier binaries so air-gapped sites can re-run AOC checks before promotion.
- **Recovery:** Supersedes chains allow rollback to prior revisions without mutating rows. Disaster exercises must rehearse restoring from snapshot, replaying logical replication into Policy Engine, and re-validating guard compliance.
- **Migration:** Legacy normalised fields are moved to temporary views during cutover; ingestion runtime removes writes once guard-enforced path is live (see [Migration playbook](../../ingestion/aggregation-only-contract.md#8-migration-playbook)).
---
@@ -169,7 +169,7 @@ sequenceDiagram
3. `outputbundle.tar.zst` (SBOM, findings, VEX, logs, Merkle proofs).
Every artifact is signed with multi-profile keys (FIPS, GOST, SM, etc.) managed by Authority. See `docs/replay/DETERMINISTIC_REPLAY.md` §2–§5 for the full schema.
- **Reachability subtree:** When reachability recording is enabled, Scanner uploads graphs & runtime traces under `cas://replay/<scan-id>/reachability/graphs/` and `cas://replay/<scan-id>/reachability/traces/`. Manifest references (StellaOps.Replay.Core) bind these URIs along with analyzer hashes so Replay + Signals can rehydrate explainability evidence deterministically.
- **Storage tiers:** Primary storage is Mongo (`replay_runs`, `replay_subjects`) plus the CAS bucket. Evidence Locker mirrors bundles for long-term retention and legal hold workflows (`docs/modules/evidence-locker/architecture.md`). Offline kits package bundles under `offline/replay/<scan-id>` with detached DSSE envelopes for air-gapped verification.
- **Storage tiers:** Primary storage is PostgreSQL (`replay_runs`, `replay_subjects`) plus the CAS bucket. Evidence Locker mirrors bundles for long-term retention and legal hold workflows (`docs/modules/evidence-locker/architecture.md`). Offline kits package bundles under `offline/replay/<scan-id>` with detached DSSE envelopes for air-gapped verification.
- **APIs & ownership:** Scanner WebService produces the bundles via `record` mode, Scanner Worker emits Merkle metadata, Signer/Authority provide DSSE signatures, Attestor anchors manifests to Rekor, CLI/Evidence Locker handle retrieval, and Docs Guild maintains runbooks. Responsibilities are tracked in `docs/implplan/SPRINT_185_shared_replay_primitives.md` through `SPRINT_187_evidence_locker_cli_integration.md`.
- **Operational policies:** Retention defaults to 180 days for hot CAS storage and 2 years for cold Evidence Locker copies. Rotation and pruning follow the checklist in `docs/runbooks/replay_ops.md`.
@@ -193,7 +193,7 @@ sequenceDiagram
## 7·Compliance checklist
- [ ] AOC guard enabled for all Concelier and Excititor write paths in production.
- [ ] Mongo schema validators deployed for `advisory_raw` and `vex_raw`; change streams scoped per tenant.
- [ ] PostgreSQL schema constraints deployed for `advisory_raw` and `vex_raw`; logical replication scoped per tenant.
- [ ] Authority scopes (`advisory:*`, `vex:*`, `effective:*`) configured in Gateway and validated via integration tests.
- [ ] `stella aoc verify` wired into CI/CD pipelines with seeded violation fixtures.
- [ ] Console AOC dashboard and CLI documentation reference the new ingestion contract.

View File

@@ -49,13 +49,13 @@ graph TD
Materializer[Effective Findings Writer]
end
subgraph RawStores["Raw Stores (AOC)"]
AdvisoryRaw[(MongoDB<br/>advisory_raw)]
VexRaw[(MongoDB<br/>vex_raw)]
AdvisoryRaw[(PostgreSQL<br/>advisory_raw)]
VexRaw[(PostgreSQL<br/>vex_raw)]
end
subgraph Derived["Derived Stores"]
Mongo[(MongoDB<br/>policies / policy_runs / effective_finding_*)]
PG[(PostgreSQL<br/>policies / policy_runs / effective_finding_*)]
Blob[(Object Store / Evidence Locker)]
Queue[(Mongo Queue / NATS)]
Queue[(PostgreSQL Queue / NATS)]
end
Concelier[(Concelier APIs)]
Excititor[(Excititor APIs)]
@@ -75,12 +75,12 @@ graph TD
WorkerPool --> VexRaw
WorkerPool --> SBOM
WorkerPool --> Materializer
Materializer --> Mongo
Materializer --> PG
WorkerPool --> Blob
API --> Mongo
API --> PG
API --> Blob
API --> Authority
Orchestrator --> Mongo
Orchestrator --> PG
Authority --> API
```
@@ -88,14 +88,14 @@ Key notes:
- API host exposes lifecycle, run, simulate, findings endpoints with DPoP-bound OAuth enforcement.
- Orchestrator manages run scheduling/fairness; writes run tickets to queue, leases jobs to worker pool.
- Workers evaluate policies using cached IR; join external services via tenant-scoped clients; pull immutable advisories/VEX from the raw stores; write derived overlays to Mongo and optional explain bundles to blob storage.
- Workers evaluate policies using cached IR; join external services via tenant-scoped clients; pull immutable advisories/VEX from the raw stores; write derived overlays to PostgreSQL and optional explain bundles to blob storage.
- Observability (metrics/traces/logs) integrated via OpenTelemetry (not shown).
---
### 2.1·AOC inputs & immutability
- **Raw-only reads.** Evaluation workers access `advisory_raw` / `vex_raw` via tenant-scoped Mongo clients or the Concelier/Excititor raw APIs. No Policy Engine component is permitted to mutate these collections.
- **Raw-only reads.** Evaluation workers access `advisory_raw` / `vex_raw` via tenant-scoped PostgreSQL clients or the Concelier/Excititor raw APIs. No Policy Engine component is permitted to mutate these tables.
- **Guarded ingestion.** `AOCWriteGuard` rejects forbidden fields before data reaches the raw stores. Policy tests replay known `ERR_AOC_00x` violations to confirm ingestion compliance.
- **Change streams as contract.** Run orchestration stores resumable cursors for raw change streams. Replays of these cursors (e.g., after failover) must yield identical materialisation outcomes.
- **Derived stores only.** All severity, consensus, and suppression state lives in `effective_finding_*` collections and explain bundles owned by Policy Engine. Provenance fields link back to raw document IDs so auditors can trace every verdict.
@@ -107,13 +107,13 @@ Key notes:
| Module | Responsibility | Notes |
|--------|----------------|-------|
| **Configuration** (`Configuration/`) | Bind settings (Mongo URIs, queue options, service URLs, sealed mode), validate on start. | Strict schema; fails fast on missing secrets. |
| **Configuration** (`Configuration/`) | Bind settings (PostgreSQL connection strings, queue options, service URLs, sealed mode), validate on start. | Strict schema; fails fast on missing secrets. |
| **Authority Client** (`Authority/`) | Acquire tokens, enforce scopes, perform DPoP key rotation. | Only service identity uses `effective:write`. |
| **DSL Compiler** (`Dsl/`) | Parse, canonicalise, IR generation, checksum caching. | Uses Roslyn-like pipeline; caches by `policyId+version+hash`. |
| **Selection Layer** (`Selection/`) | Batch SBOM ↔ advisory ↔ VEX joiners; apply equivalence tables; support incremental cursors. | Deterministic ordering (SBOM → advisory → VEX). |
| **Evaluator** (`Evaluation/`) | Execute IR with first-match semantics, compute severity/trust/reachability weights, record rule hits. | Stateless; all inputs provided by selection layer. |
| **Signals** (`Signals/`) | Normalizes reachability, trust, entropy, uncertainty, runtime hits into a single dictionary passed to Evaluator; supplies default `unknown` values when signals missing. Entropy penalties are derived from Scanner `layer_summary.json`/`entropy.report.json` (K=0.5, cap=0.3, block at image opaque ratio > 0.15 w/ unknown provenance) and exported via `policy_entropy_penalty_value` / `policy_entropy_image_opaque_ratio`; SPL scope `entropy.*` exposes `penalty`, `image_opaque_ratio`, `blocked`, `warned`, `capped`, `top_file_opaque_ratio`. | Aligns with `signals.*` namespace in DSL. |
| **Materialiser** (`Materialization/`) | Upsert effective findings, append history, manage explain bundle exports. | Mongo transactions per SBOM chunk. |
| **Materialiser** (`Materialization/`) | Upsert effective findings, append history, manage explain bundle exports. | PostgreSQL transactions per SBOM chunk. |
| **Orchestrator** (`Runs/`) | Change-stream ingestion, fairness, retry/backoff, queue writer. | Works with Scheduler Models DTOs. |
| **API** (`Api/`) | Minimal API endpoints, DTO validation, problem responses, idempotency. | Generated clients for CLI/UI. |
| **Observability** (`Telemetry/`) | Metrics (`policy_run_seconds`, `rules_fired_total`), traces, structured logs. | Sampled rule-hit logs with redaction. |
@@ -183,7 +183,7 @@ Determinism guard instrumentation wraps the evaluator, rejecting access to forbi
- **Change streams:** Concelier and Excititor publish document changes to the scheduler queue (`policy.trigger.delta`). Payload includes `tenant`, `source`, `linkset digests`, `cursor`.
- **Orchestrator:** Maintains per-tenant backlog; merges deltas until time/size thresholds met, then enqueues `PolicyRunRequest`.
- **Queue:** Mongo queue with lease; each job assigned `leaseDuration`, `maxAttempts`.
- **Queue:** PostgreSQL queue with lease; each job assigned `leaseDuration`, `maxAttempts` (see the lease sketch after this list).
- **Workers:** Lease jobs, execute evaluation pipeline, report status (success/failure/canceled). Failures with recoverable errors requeue with backoff; determinism or schema violations mark job `failed` and raise incident event.
- **Fairness:** Round-robin per `{tenant, policyId}`; emergency jobs (`priority=emergency`) jump queue but limited via circuit breaker.
- **Replay:** On demand, orchestrator rehydrates run via stored cursors and exports sealed bundle for audit/CI determinism checks.
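A common shape for the lease is a single `UPDATE` over the queue table using `FOR UPDATE SKIP LOCKED`; the sketch below assumes a `policy_run_queue` table whose name and columns are illustrative only:

```sql
-- Sketch: lease one due job without blocking on rows other workers hold.
-- Table and column names are assumptions; :lease_duration is bound by the worker.
UPDATE policy_run_queue q
SET leased_until = now() + :lease_duration,
    attempts     = attempts + 1
WHERE q.job_id = (
    SELECT job_id
    FROM policy_run_queue
    WHERE leased_until IS NULL OR leased_until < now()
    ORDER BY priority DESC, enqueued_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING q.*;
```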

View File

@@ -11,7 +11,7 @@
## 2) Project layout
- `src/SbomService/StellaOps.SbomService` — REST API + event emitters + orchestrator integration.
- Storage: MongoDB collections (proposed)
- Storage: PostgreSQL tables (proposed)
- `sbom_snapshots` (immutable versions; tenant + artifact + digest + createdAt)
- `sbom_projections` (materialised views keyed by snapshotId, entrypoint/service node flags)
- `sbom_assets` (asset metadata, criticality/owner/env/exposure; append-only history)
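For orientation, a rough shape of the proposed `sbom_snapshots` table, following the fields listed above and the compound index called out in §10; all types and names remain proposals until the schema is finalised:

```sql
-- Proposed shape only; column types and constraint names are assumptions.
CREATE TABLE sbom_snapshots (
    snapshot_id     UUID PRIMARY KEY,
    tenant          TEXT NOT NULL,
    artifact_digest TEXT NOT NULL,
    sbom_version    TEXT NOT NULL,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (tenant, artifact_digest, sbom_version)   -- immutable versions per artifact
);
```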
@@ -66,7 +66,7 @@ Operational rules:
- `sbom.version.created` — emitted per new SBOM snapshot; payload: tenant, artifact digest, sbomVersion, projection hash, source bundle hash, import provenance; replay/backfill via outbox with watermark.
- `sbom.asset.updated` — emitted when asset metadata changes; idempotent payload keyed by `(tenant, assetId, version)`.
- Inventory/resolver feeds — queue/topic delivering `(artifact, purl, version, paths, runtime_flag, scope, nearest_safe_version)` for Vuln Explorer/Findings Ledger.
- Current implementation uses an in-memory event store/publisher (with clock abstraction) plus `/internal/sbom/events` + `/internal/sbom/events/backfill` to validate envelopes until the Mongo-backed outbox is wired.
- Current implementation uses an in-memory event store/publisher (with clock abstraction) plus `/internal/sbom/events` + `/internal/sbom/events/backfill` to validate envelopes until the PostgreSQL-backed outbox is wired.
- Entrypoint/service node overrides are exposed via `/entrypoints` (tenant-scoped) and should be mirrored into Cartographer relevance jobs when the outbox lands.
## 6) Determinism & offline posture
@@ -86,14 +86,14 @@ Operational rules:
- Logs: structured, include tenant + artifact digest + sbomVersion; classify ingest failures (schema, storage, orchestrator, validation).
- Alerts: backlog thresholds for outbox/event delivery; high latency on path/timeline endpoints.
## 9) Configuration (Mongo-backed catalog & lookup)
- Enable Mongo storage for `/console/sboms` and `/components/lookup` by setting `SbomService:Mongo:ConnectionString` (env: `SBOM_SbomService__Mongo__ConnectionString`).
- Optional overrides: `SbomService:Mongo:Database`, `SbomService:Mongo:CatalogCollection`, `SbomService:Mongo:ComponentLookupCollection`; defaults are `sbom_service`, `sbom_catalog`, `sbom_component_neighbors`.
## 9) Configuration (PostgreSQL-backed catalog & lookup)
- Enable PostgreSQL storage for `/console/sboms` and `/components/lookup` by setting `SbomService:PostgreSQL:ConnectionString` (env: `SBOM_SbomService__PostgreSQL__ConnectionString`).
- Optional overrides: `SbomService:PostgreSQL:Schema`, `SbomService:PostgreSQL:CatalogTable`, `SbomService:PostgreSQL:ComponentLookupTable`; defaults are `sbom_service`, `sbom_catalog`, `sbom_component_neighbors`.
- When the connection string is absent the service falls back to fixture JSON or deterministic in-memory seeds to keep air-gapped workflows alive.
## 10) Open questions / dependencies
- Confirm orchestrator pause/backfill contract (shared with Runtime & Signals 140-series).
- Finalise storage collection names and indexes (compound on tenant+artifactDigest+version, TTL for transient staging).
- Finalise storage table names and indexes (compound on tenant+artifactDigest+version, TTL for transient staging).
- Publish canonical LNM v1 fixtures and JSON schemas for projections and asset metadata.
- See `docs/modules/sbomservice/api/projection-read.md` for `/sboms/{snapshotId}/projection` (LNM v1, tenant-scoped, hash-returning).

View File

@@ -2,7 +2,7 @@
> Aligned with Epic6 Vulnerability Explorer and Epic10 Export Center.
> **Scope.** Implementation-ready architecture for the **Scanner** subsystem: WebService, Workers, analyzers, SBOM assembly (inventory & usage), per-layer caching, three-way diffs, artifact catalog (RustFS default + Mongo, S3-compatible fallback), attestation handoff, and scale/security posture. This document is the contract between the scanning plane and everything else (Policy, Excititor, Concelier, UI, CLI).
> **Scope.** Implementation-ready architecture for the **Scanner** subsystem: WebService, Workers, analyzers, SBOM assembly (inventory & usage), per-layer caching, three-way diffs, artifact catalog (RustFS default + PostgreSQL, S3-compatible fallback), attestation handoff, and scale/security posture. This document is the contract between the scanning plane and everything else (Policy, Excititor, Concelier, UI, CLI).
---
@@ -25,7 +25,7 @@ src/
├─ StellaOps.Scanner.WebService/ # REST control plane, catalog, diff, exports
├─ StellaOps.Scanner.Worker/ # queue consumer; executes analyzers
├─ StellaOps.Scanner.Models/ # DTOs, evidence, graph nodes, CDX/SPDX adapters
├─ StellaOps.Scanner.Storage/ # Mongo repositories; RustFS object client (default) + S3 fallback; ILM/GC
├─ StellaOps.Scanner.Storage/ # PostgreSQL repositories; RustFS object client (default) + S3 fallback; ILM/GC
├─ StellaOps.Scanner.Queue/ # queue abstraction (Redis/NATS/RabbitMQ)
├─ StellaOps.Scanner.Cache/ # layer cache; file CAS; bloom/bitmap indexes
├─ StellaOps.Scanner.EntryTrace/ # ENTRYPOINT/CMD → terminal program resolver (shell AST)
@@ -132,7 +132,7 @@ The DI extension (`AddScannerQueue`) wires the selected transport, so future add
* **OCI registry** with **Referrers API** (discover attached SBOMs/signatures).
* **RustFS** (default, offline-first) for SBOM artifacts; optional S3/MinIO compatibility retained for migration; **Object Lock** semantics emulated via retention headers; **ILM** for TTL.
* **MongoDB** for catalog, job state, diffs, ILM rules.
* **PostgreSQL** for catalog, job state, diffs, ILM rules.
* **Queue** (Redis Streams/NATS/RabbitMQ).
* **Authority** (on-prem OIDC) for **OpToks** (DPoP/mTLS).
* **Signer** + **Attestor** (+ **Fulcio/KMS** + **Rekor v2**) for DSSE + transparency.
@@ -167,7 +167,7 @@ The DI extension (`AddScannerQueue`) wires the selected transport, so future add
No confidences. Either a fact is proven with listed mechanisms, or it is not claimed.
### 3.2 Catalog schema (Mongo)
### 3.2 Catalog schema (PostgreSQL)
* `artifacts`
@@ -182,8 +182,8 @@ No confidences. Either a fact is proven with listed mechanisms, or it is not cla
* `links { fromType, fromDigest, artifactId }` // image/layer -> artifact
* `jobs { _id, kind, args, state, startedAt, heartbeatAt, endedAt, error }`
* `lifecycleRules { ruleId, scope, ttlDays, retainIfReferenced, immutable }`
* `ruby.packages { _id: scanId, imageDigest, generatedAtUtc, packages[] }` // decoded `RubyPackageInventory` documents for CLI/Policy reuse
* `bun.packages { _id: scanId, imageDigest, generatedAtUtc, packages[] }` // decoded `BunPackageInventory` documents for CLI/Policy reuse
* `ruby.packages { _id: scanId, imageDigest, generatedAtUtc, packages[] }` // decoded `RubyPackageInventory` rows for CLI/Policy reuse
* `bun.packages { _id: scanId, imageDigest, generatedAtUtc, packages[] }` // decoded `BunPackageInventory` rows for CLI/Policy reuse
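A rough relational rendering of two of the catalog tables above (field names follow the listing; types, keys, and snake_casing are assumptions, not the authoritative schema):

```sql
-- Sketch only.
CREATE TABLE jobs (
    id           UUID PRIMARY KEY,
    kind         TEXT NOT NULL,
    args         JSONB NOT NULL,
    state        TEXT NOT NULL,
    started_at   TIMESTAMPTZ,
    heartbeat_at TIMESTAMPTZ,
    ended_at     TIMESTAMPTZ,
    error        TEXT
);

CREATE TABLE links (
    from_type   TEXT NOT NULL,   -- image | layer
    from_digest TEXT NOT NULL,
    artifact_id UUID NOT NULL,
    PRIMARY KEY (from_type, from_digest, artifact_id)
);
```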
### 3.3 Object store layout (RustFS)
@@ -389,8 +389,8 @@ scanner:
queue:
kind: redis
url: "redis://queue:6379/0"
mongo:
uri: "mongodb://mongo/scanner"
postgres:
connectionString: "Host=postgres;Port=5432;Database=scanner;Username=stellaops;Password=stellaops"
s3:
endpoint: "http://minio:9000"
bucket: "stellaops"
@@ -493,7 +493,7 @@ scanner:
* **HA**: WebService horizontal scale; Workers autoscale by queue depth & CPU; distributed locks on layers.
* **Retention**: ILM rules per artifact class (`short`, `default`, `compliance`); **Object Lock** for compliance artifacts (reports, signed SBOMs).
* **Upgrades**: bump **cache schema** when analyzer outputs change; WebService triggers refresh of dependent artifacts.
* **Backups**: Mongo (daily dumps); RustFS snapshots (filesystem-level rsync/ZFS) or S3 versioning when legacy driver enabled; Rekor v2 DB snapshots.
* **Backups**: PostgreSQL (pg_dump daily); RustFS snapshots (filesystem-level rsync/ZFS) or S3 versioning when legacy driver enabled; Rekor v2 DB snapshots.
---

View File

@@ -0,0 +1,357 @@
# EPSS Integration Architecture
> **Advisory Source**: `docs/product-advisories/16-Dec-2025 - Merging EPSS v4 with CVSS v4 Frameworks.md`
> **Last Updated**: 2025-12-17
> **Status**: Approved for Implementation
---
## Executive Summary
EPSS (Exploit Prediction Scoring System) is a **probabilistic model** that estimates the likelihood a given CVE will be exploited in the wild over the next ~30 days. This document defines how StellaOps integrates EPSS as a first-class risk signal.
**Key Distinction**:
- **CVSS v4**: Deterministic measurement of *severity* (0-10)
- **EPSS**: Dynamic, data-driven *probability of exploitation* (0-1)
EPSS does **not** replace CVSS or VEX—it provides complementary probabilistic threat intelligence.
---
## 1. Design Principles
### 1.1 EPSS as Probabilistic Signal
| Signal Type | Nature | Source |
|-------------|--------|--------|
| CVSS v4 | Deterministic impact | NVD, vendor |
| EPSS | Probabilistic threat | FIRST daily feeds |
| VEX | Vendor intent | Vendor statements |
| Runtime context | Actual exposure | StellaOps scanner |
**Rule**: EPSS *modulates confidence*, never asserts truth.
### 1.2 Architectural Constraints
1. **Append-only time-series**: Never overwrite historical EPSS data
2. **Deterministic replay**: Every scan stores the EPSS snapshot reference used
3. **Idempotent ingestion**: Safe to re-run for same date
4. **Postgres as source of truth**: Valkey is optional cache only
5. **Air-gap compatible**: Manual import via signed bundles
---
## 2. Data Model
### 2.1 Core Tables
#### Import Provenance
```sql
CREATE TABLE epss_import_runs (
import_run_id UUID PRIMARY KEY,
model_date DATE NOT NULL,
source_uri TEXT NOT NULL,
retrieved_at TIMESTAMPTZ NOT NULL,
file_sha256 TEXT NOT NULL,
decompressed_sha256 TEXT NULL,
row_count INT NOT NULL,
model_version_tag TEXT NULL,
published_date DATE NULL,
status TEXT NOT NULL, -- SUCCEEDED / FAILED
error TEXT NULL,
UNIQUE (model_date)
);
```
#### Time-Series Scores (Partitioned)
```sql
CREATE TABLE epss_scores (
model_date DATE NOT NULL,
cve_id TEXT NOT NULL,
epss_score DOUBLE PRECISION NOT NULL,
percentile DOUBLE PRECISION NOT NULL,
import_run_id UUID NOT NULL REFERENCES epss_import_runs(import_run_id),
PRIMARY KEY (model_date, cve_id)
) PARTITION BY RANGE (model_date);
```
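Because the table is declared `PARTITION BY RANGE`, concrete partitions must exist before the first import; a minimal sketch assuming monthly partitions (naming and granularity are illustrative, and the `epss_changes` table below is handled the same way):

```sql
-- Illustrative monthly partition; an operator or migration job creates these
-- ahead of the model_date being ingested.
CREATE TABLE epss_scores_2025_12 PARTITION OF epss_scores
    FOR VALUES FROM ('2025-12-01') TO ('2026-01-01');
```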
#### Current Projection (Fast Lookup)
```sql
CREATE TABLE epss_current (
cve_id TEXT PRIMARY KEY,
epss_score DOUBLE PRECISION NOT NULL,
percentile DOUBLE PRECISION NOT NULL,
model_date DATE NOT NULL,
import_run_id UUID NOT NULL
);
CREATE INDEX idx_epss_current_score_desc ON epss_current (epss_score DESC);
CREATE INDEX idx_epss_current_percentile_desc ON epss_current (percentile DESC);
```
#### Change Detection
```sql
CREATE TABLE epss_changes (
model_date DATE NOT NULL,
cve_id TEXT NOT NULL,
old_score DOUBLE PRECISION NULL,
new_score DOUBLE PRECISION NOT NULL,
delta_score DOUBLE PRECISION NULL,
old_percentile DOUBLE PRECISION NULL,
new_percentile DOUBLE PRECISION NOT NULL,
flags INT NOT NULL, -- bitmask: NEW_SCORED, CROSSED_HIGH, BIG_JUMP
PRIMARY KEY (model_date, cve_id)
) PARTITION BY RANGE (model_date);
```
### 2.2 Flags Bitmask
| Flag | Value | Meaning |
|------|-------|---------|
| NEW_SCORED | 0x01 | CVE newly scored (not in previous day) |
| CROSSED_HIGH | 0x02 | Score crossed above high threshold |
| CROSSED_LOW | 0x04 | Score crossed below high threshold |
| BIG_JUMP_UP | 0x08 | Delta > 0.10 upward |
| BIG_JUMP_DOWN | 0x10 | Delta > 0.10 downward |
| TOP_PERCENTILE | 0x20 | Entered top 5% |
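Consumers test the bitmask with plain integer arithmetic; for example, a sketch that pulls yesterday's newly scored or threshold-crossing CVEs:

```sql
-- 1 = NEW_SCORED, 2 = CROSSED_HIGH; other flag values per the table above.
SELECT cve_id, old_score, new_score, delta_score, flags
FROM epss_changes
WHERE model_date = CURRENT_DATE - 1
  AND (flags & 3) <> 0
ORDER BY new_score DESC;
```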
---
## 3. Service Architecture
### 3.1 Component Responsibilities
```
┌─────────────────────────────────────────────────────────────────┐
│ EPSS DATA FLOW │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Scheduler │────►│ Concelier │────►│ Scanner │ │
│ │ (triggers) │ │ (ingest) │ │ (evidence) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌──────────────┐ │ │
│ │ │ Postgres │◄───────────┘ │
│ │ │ (truth) │ │
│ │ └──────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ Notify │◄────│ Excititor │ │
│ │ (alerts) │ │ (VEX tasks) │ │
│ └──────────────┘ └──────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
```
| Component | Responsibility |
|-----------|----------------|
| **Scheduler** | Triggers daily EPSS import job |
| **Concelier** | Downloads/imports EPSS, stores facts, computes delta, emits events |
| **Scanner** | Attaches EPSS-at-scan as immutable evidence, uses for scoring |
| **Excititor** | Creates VEX tasks when EPSS is high and VEX missing |
| **Notify** | Sends alerts on priority changes |
### 3.2 Event Flow
```
Scheduler
→ epss.ingest(date)
→ Concelier (ingest)
→ epss.updated
→ Notify (optional daily summary)
→ Concelier (enrichment)
→ vuln.priority.changed
→ Notify (targeted alerts)
→ Excititor (VEX task creation)
```
---
## 4. Ingestion Pipeline
### 4.1 Data Source
FIRST publishes daily CSV snapshots at:
```
https://epss.empiricalsecurity.com/epss_scores-YYYY-MM-DD.csv.gz
```
Each file contains ~300k CVE records with:
- `cve` - CVE ID
- `epss` - Score (0.00000-1.00000)
- `percentile` - Rank vs all CVEs
### 4.2 Ingestion Steps
1. **Scheduler** triggers daily job for date D
2. **Download** `epss_scores-D.csv.gz`
3. **Decompress** stream
4. **Parse** header comment for model version/date
5. **Validate** scores in [0,1], monotonic percentile
6. **Bulk load** into TEMP staging table
7. **Transaction**:
- Insert `epss_import_runs`
- Insert into `epss_scores` partition
- Compute `epss_changes` by comparing staging vs `epss_current`
- Upsert `epss_current`
- Enqueue `epss.updated` event
8. **Commit**
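A minimal sketch of the step-7 transaction, assuming the rows from step 6 sit in a temp table named `epss_staging` and that `:date`, `:run_id`, `:source_uri`, and `:sha256` are bound by the ingestion job (these names are illustrative):

```sql
BEGIN;

INSERT INTO epss_import_runs (import_run_id, model_date, source_uri, retrieved_at,
                              file_sha256, row_count, status)
VALUES (:run_id, :date, :source_uri, now(), :sha256,
        (SELECT count(*) FROM epss_staging), 'SUCCEEDED');

INSERT INTO epss_scores (model_date, cve_id, epss_score, percentile, import_run_id)
SELECT :date, cve_id, epss_score, percentile, :run_id
FROM epss_staging;

-- Only NEW_SCORED (0x01) is derived here; the remaining flags from section 2.2 follow the same CASE pattern
INSERT INTO epss_changes (model_date, cve_id, old_score, new_score, delta_score,
                          old_percentile, new_percentile, flags)
SELECT :date, s.cve_id, c.epss_score, s.epss_score, s.epss_score - c.epss_score,
       c.percentile, s.percentile,
       CASE WHEN c.cve_id IS NULL THEN 1 ELSE 0 END
FROM epss_staging s
LEFT JOIN epss_current c ON c.cve_id = s.cve_id
WHERE c.cve_id IS NULL OR s.epss_score IS DISTINCT FROM c.epss_score;

INSERT INTO epss_current (cve_id, epss_score, percentile, model_date, import_run_id)
SELECT cve_id, epss_score, percentile, :date, :run_id
FROM epss_staging
ON CONFLICT (cve_id) DO UPDATE
   SET epss_score    = EXCLUDED.epss_score,
       percentile    = EXCLUDED.percentile,
       model_date    = EXCLUDED.model_date,
       import_run_id = EXCLUDED.import_run_id;

COMMIT;
```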
### 4.3 Air-Gap Import
Accept local bundle containing:
- `epss_scores-YYYY-MM-DD.csv.gz`
- `manifest.json` with sha256, source attribution, DSSE signature
Same pipeline, with `source_uri = bundle://...`.
---
## 5. Enrichment Rules
### 5.1 New Scan Findings (Immutable)
Store EPSS "as-of" scan time:
```csharp
public record ScanEpssEvidence
{
public double EpssScoreAtScan { get; init; }
public double EpssPercentileAtScan { get; init; }
public DateOnly EpssModelDateAtScan { get; init; }
public Guid EpssImportRunIdAtScan { get; init; }
}
```
This supports deterministic replay even if EPSS changes later.
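Replay then reduces to a keyed lookup against the immutable time series, using the stored evidence fields as parameters:

```sql
-- Re-resolve the exact EPSS values a historical scan used
SELECT epss_score, percentile
FROM epss_scores
WHERE model_date = :epss_model_date_at_scan
  AND cve_id = :cve_id;
```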
### 5.2 Existing Findings (Live Triage)
Maintain mutable "current EPSS" on vulnerability instances:
- **scan_finding_evidence**: Immutable EPSS-at-scan
- **vuln_instance_triage**: Current EPSS + band (for live triage)
### 5.3 Efficient Delta Targeting
On `epss.updated(D)`:
1. Read `epss_changes` where flags indicate material change
2. Find impacted vulnerability instances by CVE
3. Update only those instances
4. Emit `vuln.priority.changed` only if band crossed
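Steps 1 and 2 collapse into a single join; `vuln_instance_triage(instance_id, cve_id, ...)` is an assumed name for the mutable triage projection from section 5.2:

```sql
-- Instances whose CVE had a material EPSS change on :date
SELECT t.instance_id, t.cve_id, ch.new_score, ch.new_percentile, ch.flags
FROM epss_changes ch
JOIN vuln_instance_triage t ON t.cve_id = ch.cve_id
WHERE ch.model_date = :date
  AND (ch.flags & (2 | 8 | 32)) <> 0;  -- CROSSED_HIGH | BIG_JUMP_UP | TOP_PERCENTILE
```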
---
## 6. Notification Policy
### 6.1 Default Thresholds
| Threshold | Default | Description |
|-----------|---------|-------------|
| HighPercentile | 0.95 | Top 5% of all CVEs |
| HighScore | 0.50 | 50% exploitation probability |
| BigJumpDelta | 0.10 | Meaningful daily change |
### 6.2 Trigger Conditions
1. **Newly scored** CVE in inventory AND `percentile >= HighPercentile`
2. Existing CVE **crosses above** HighPercentile or HighScore
3. Delta > BigJumpDelta AND CVE in runtime-exposed assets
All thresholds are org-configurable.
---
## 7. Trust Lattice Integration
### 7.1 Scoring Rule Example
```
IF cvss_base >= 8.0
AND epss_score >= 0.35
AND runtime_exposed = true
→ priority = IMMEDIATE_ATTENTION
```
### 7.2 Score Weights
| Factor | Default Weight | Range |
|--------|---------------|-------|
| CVSS | 0.25 | 0.0-1.0 |
| EPSS | 0.25 | 0.0-1.0 |
| Reachability | 0.25 | 0.0-1.0 |
| Freshness | 0.15 | 0.0-1.0 |
| Frequency | 0.10 | 0.0-1.0 |
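Assuming each factor is already normalised to [0, 1], the defaults imply a simple weighted sum; this is a sketch only (the Trust Lattice may combine factors non-linearly), and `finding_factors` is a hypothetical per-finding projection:

```sql
SELECT finding_id,
       0.25 * cvss_norm      -- CVSS base score scaled to [0,1]
     + 0.25 * epss_score
     + 0.25 * reachability
     + 0.15 * freshness
     + 0.10 * frequency AS priority_score
FROM finding_factors;
```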
---
## 8. API Surface
### 8.1 Internal API Endpoints
| Endpoint | Description |
|----------|-------------|
| `GET /epss/current?cve=...` | Bulk lookup current EPSS |
| `GET /epss/history?cve=...&days=180` | Historical time-series |
| `GET /epss/top?order=epss&limit=100` | Top CVEs by score |
| `GET /epss/changes?date=...` | Daily change report |
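The current-projection endpoints are backed by straightforward queries; a sketch of the lookup behind `GET /epss/top?order=epss&limit=100`, which `idx_epss_current_score_desc` serves directly:

```sql
SELECT cve_id, epss_score, percentile, model_date
FROM epss_current
ORDER BY epss_score DESC
LIMIT 100;
```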
### 8.2 UI Requirements
For each vulnerability instance:
- EPSS score + percentile
- Model date
- Trend delta vs previous scan date
- Filter chips: "High EPSS", "Rising EPSS", "High CVSS + High EPSS"
- Evidence panel showing EPSS-at-scan vs current EPSS
---
## 9. Implementation Checklist
### Phase 1: Data Foundation
- [ ] DB migrations: tables + partitions + indexes
- [ ] Concelier ingestion job: online download + bundle import
### Phase 2: Integration
- [ ] epss_current + epss_changes projection
- [ ] Scanner.WebService: attach EPSS-at-scan evidence
- [ ] Bulk lookup API
### Phase 3: Enrichment
- [ ] Concelier enrichment job: update triage projections
- [ ] Notify subscription to vuln.priority.changed
### Phase 4: UI/UX
- [ ] EPSS fields in vulnerability detail
- [ ] Filters and sort by exploit likelihood
- [ ] Trend visualization
### Phase 5: Operations
- [ ] Backfill tool (last 180 days)
- [ ] Ops runbook: schedules, manual re-run, air-gap import
---
## 10. Anti-Patterns to Avoid
| Anti-Pattern | Why It's Wrong |
|--------------|----------------|
| Storing only latest EPSS | Breaks auditability and replay |
| Mixing EPSS into CVE table | EPSS is signal, not vulnerability data |
| Treating EPSS as severity | EPSS is probability, not impact |
| Alerting on every daily fluctuation | Creates alert fatigue |
| Recomputing EPSS internally | Use FIRST's authoritative data |
---
## Related Documents
- [Unknowns API Documentation](../api/unknowns-api.md)
- [Score Replay API](../api/score-replay-api.md)
- [Trust Lattice Architecture](../modules/scanner/architecture.md)


@@ -26,7 +26,7 @@ src/
├─ StellaOps.Scheduler.Worker/ # planners + runners (N replicas)
├─ StellaOps.Scheduler.ImpactIndex/ # purl→images inverted index (roaring bitmaps)
├─ StellaOps.Scheduler.Models/ # DTOs (Schedule, Run, ImpactSet, Deltas)
├─ StellaOps.Scheduler.Storage.Postgres/ # schedules, runs, cursors, locks
├─ StellaOps.Scheduler.Queue/ # Redis Streams / NATS abstraction
├─ StellaOps.Scheduler.Tests.* # unit/integration/e2e
```
@@ -36,7 +36,7 @@ src/
* **Scheduler.WebService** (stateless)
* **Scheduler.Worker** (scaleout; planners + executors)
**Dependencies**: Authority (OpTok + DPoP/mTLS), Scanner.WebService, Concelier, Excititor, PostgreSQL, Redis/NATS, (optional) Notify.
---
@@ -52,7 +52,7 @@ src/
---
## 3) Data model (PostgreSQL)
**Database**: `scheduler`
@@ -111,7 +111,7 @@ Goal: translate **change keys** → **image sets** in **milliseconds**.
* `Contains[purl] → bitmap(imageIds)`
* `UsedBy[purl] → bitmap(imageIds)` (subset of Contains)
* Optionally keep **Owner maps**: `{imageId → {tenantId, namespaces[], repos[]}}` for selection filters.
* Persist in RocksDB/LMDB or Redis modules; cache hot shards in memory; snapshot to PostgreSQL for cold start.
**Update paths**:
@@ -298,8 +298,8 @@ scheduler:
queue:
kind: "redis" # or "nats"
url: "redis://redis:6379/4"
postgres:
connectionString: "Host=postgres;Port=5432;Database=scheduler;Username=stellaops;Password=stellaops"
impactIndex:
storage: "rocksdb" # "rocksdb" | "redis" | "memory"
warmOnStart: true
@@ -335,7 +335,7 @@ scheduler:
| Scanner under load (429) | Backoff with jitter; respect per-tenant/leaky bucket |
| Oversubscription (too many impacted) | Prioritize KEV/critical first; spillover to next window; UI banner shows backlog |
| Notify down | Buffer outbound events in queue (TTL 24h) |
| PostgreSQL slow | Cut batch sizes; sample logs; alert ops; don't drop runs unless critical |
---


@@ -20,17 +20,17 @@
## 1) Responsibilities (contract)
1. **Authenticate** caller with **OpTok** (Authority OIDC, DPoP or mTLS-bound).
2. **Authorize** scopes (`signer.sign`) + audience (`aud=signer`) + tenant/installation.
3. **Validate entitlement** via **PoE** (Proof-of-Entitlement) against Cloud Licensing `/license/introspect`.
4. **Verify release integrity** of the **scanner** image digest presented in the request: must be **cosign-signed** by StellaOps release key, discoverable via **OCI Referrers API**.
5. **Enforce plan & quotas** (concurrency/QPS/artifact size/rate caps).
6. **Mint signing identity**:
* **Keyless** (default): get a short-lived X.509 cert from **Fulcio** using the Signer's OIDC identity and sign the DSSE.
* **Keyful** (optional): sign with an HSM/KMS key.
7. **Return DSSE bundle** (subject digests + predicate + cert chain or KMS key id).
8. **Audit** every decision; expose metrics.
---
@@ -41,7 +41,7 @@
* **Fulcio** (Sigstore) *or* **KMS/HSM**: to obtain certs or perform signatures.
* **OCI Registry (Referrers API)**: to verify **scanner** image release signature.
* **Attestor**: downstream service that writes DSSE bundles to **Rekor v2**.
* **Config/state stores**: Redis (caches, rate buckets), PostgreSQL (audit log).
---
@@ -115,55 +115,55 @@ Errors (RFC7807):
* `400 invalid_request` (schema/predicate/type invalid)
* `500 signing_unavailable` (Fulcio/KMS outage)
### 3.2 `GET /verify/referrers?imageDigest=<sha256>`
Checks whether the **image** at digest is signed by **StellaOps release key**.
Response:
```json
{ "trusted": true, "signatures": [ { "type": "cosign", "digest": "sha256:...", "signedBy": "StellaOps Release 2027 Q2" } ] }
```
> **Note:** This endpoint is also used internally by Signer before issuing signatures.
### 3.3 Predicate catalog (Sprint 401 update)
Signer now enforces an allowlist of predicate identifiers:
| Predicate | Description | Producer |
|-----------|-------------|----------|
| `stella.ops/sbom@v1` | SBOM/report attestation (existing). | Scanner WebService. |
| `stella.ops/promotion@v1` | Promotion evidence (see `docs/release/promotion-attestations.md`). | DevOps/Export Center. |
| `stella.ops/vexDecision@v1` | OpenVEX decision for a single `(cve, product)` pair, including reachability evidence references. | Policy Engine / VEXer. |
Requests with unknown predicates receive `400 predicate_not_allowed`. Policy Engine must supply the OpenVEX JSON as the `predicate` body; Signer preserves payload bytes verbatim so DSSE digest = OpenVEX digest.
---
### KMS drivers (keyful mode)
Signer now ships five deterministic KMS adapters alongside the default keyless flow:
- `services.AddFileKms(...)` stores encrypted ECDSA material on disk for air-gapped or lab installs.
- `services.AddAwsKms(options => { options.Region = "us-east-1"; /* optional: options.Endpoint, UseFipsEndpoint */ });` delegates signing to AWS KMS, caches metadata/public keys offline, and never exports the private scalar. Rotation/revocation still run through AWS tooling (this library intentionally throws for those APIs so we do not paper over operator approvals).
- `services.AddGcpKms(options => { options.Endpoint = "kms.googleapis.com"; });` integrates with Google Cloud KMS asymmetric keys, auto-resolves the primary key version when callers omit a version, and verifies signatures locally with exported PEM material.
- `services.AddPkcs11Kms(options => { options.LibraryPath = "/opt/hsm/libpkcs11.so"; options.PrivateKeyLabel = "stella-attestor"; });` loads a PKCS#11 module, opens read-only sessions, signs digests via HSM mechanisms, and never hoists the private scalar into process memory.
- `services.AddFido2Kms(options => { options.CredentialId = "<base64url>"; options.PublicKeyPem = "-----BEGIN PUBLIC KEY-----..."; options.AuthenticatorFactory = sp => new WebAuthnAuthenticator(); });` routes signing to a WebAuthn/FIDO2 authenticator for dual-control or air-gap scenarios. The authenticator must supply the CTAP/WebAuthn plumbing; the library handles digesting, key material caching, and verification.
Cloud & hardware-backed drivers share a few invariants:
1. Hash payloads server-side (SHA-256) before invoking provider APIs, so signatures remain reproducible and digest inputs are observable in structured audit logs.
2. Cache metadata for the configurable window (default 5 min) and subject-public-key-info blobs for 10 min; tune these per sovereignty policy when running in sealed/offline environments.
3. Only expose public coordinates (`Qx`, `Qy`) to the host ― `KmsKeyMaterial.D` is blank for non-exportable keys so downstream code cannot accidentally persist secrets.
> **Security review checkpoint:** rotate/destroy remains an administrative action in the provider. Document those runbooks per tenant, and gate AWS/GCP traffic in sealed-mode via the existing egress allowlist. PKCS#11 loads native code, so keep library paths on the allowlist and validate HSM policies separately. FIDO2 authenticators expect an operator in the loop; plan for session timeouts and explicit audit fields when enabling interactive signing.
## 4) Validation pipeline (hot path)
```mermaid
sequenceDiagram
autonumber
{ "trusted": true, "signatures": [ { "type": "cosign", "digest": "sha256:...", "signedBy": "StellaOps Release 2027 Q2" } ] }
```
> **Note:** This endpoint is also used internally by Signer before issuing signatures.
### 3.3 Predicate catalog (Sprint401 update)
Signer now enforces an allowlist of predicate identifiers:
| Predicate | Description | Producer |
|-----------|-------------|----------|
| `stella.ops/sbom@v1` | SBOM/report attestation (existing). | Scanner WebService. |
| `stella.ops/promotion@v1` | Promotion evidence (see `docs/release/promotion-attestations.md`). | DevOps/Export Center. |
| `stella.ops/vexDecision@v1` | OpenVEX decision for a single `(cve, product)` pair, including reachability evidence references. | Policy Engine / VEXer. |
Requests with unknown predicates receive `400 predicate_not_allowed`. Policy Engine must supply the OpenVEX JSON as the `predicate` body; Signer preserves payload bytes verbatim so DSSE digest = OpenVEX digest.
---
### KMS drivers (keyful mode)
Signer now ships five deterministic KMS adapters alongside the default keyless flow:
- `services.AddFileKms(...)` stores encrypted ECDSA material on disk for air-gapped or lab installs.
- `services.AddAwsKms(options => { options.Region = "us-east-1"; /* optional: options.Endpoint, UseFipsEndpoint */ });` delegates signing to AWS KMS, caches metadata/public keys offline, and never exports the private scalar. Rotation/revocation still run through AWS tooling (this library intentionally throws for those APIs so we do not paper over operator approvals).
- `services.AddGcpKms(options => { options.Endpoint = "kms.googleapis.com"; });` integrates with Google Cloud KMS asymmetric keys, auto-resolves the primary key version when callers omit a version, and verifies signatures locally with exported PEM material.
- `services.AddPkcs11Kms(options => { options.LibraryPath = "/opt/hsm/libpkcs11.so"; options.PrivateKeyLabel = "stella-attestor"; });` loads a PKCS#11 module, opens read-only sessions, signs digests via HSM mechanisms, and never hoists the private scalar into process memory.
- `services.AddFido2Kms(options => { options.CredentialId = "<base64url>"; options.PublicKeyPem = "-----BEGIN PUBLIC KEY-----..."; options.AuthenticatorFactory = sp => new WebAuthnAuthenticator(); });` routes signing to a WebAuthn/FIDO2 authenticator for dual-control or air-gap scenarios. The authenticator must supply the CTAP/WebAuthn plumbing; the library handles digesting, key material caching, and verification.
Cloud & hardware-backed drivers share a few invariants:
1. Hash payloads server-side (SHA-256) before invoking provider APIs signatures remain reproducible and digest inputs are observable in structured audit logs.
2. Cache metadata for the configurable window (default 5min) and subject-public-key-info blobs for 10min; tune these per sovereignty policy when running in sealed/offline environments.
3. Only expose public coordinates (`Qx`, `Qy`) to the host ― `KmsKeyMaterial.D` is blank for non-exportable keys so downstream code cannot accidentally persist secrets.
> **Security review checkpoint:** rotate/destroy remains an administrative action in the provider. Document those runbooks per tenant, and gate AWS/GCP traffic in sealed-mode via the existing egress allowlist. PKCS#11 loads native code, so keep library paths on the allowlist and validate HSM policies separately. FIDO2 authenticators expect an operator in the loop; plan for session timeouts and explicit audit fields when enabling interactive signing.
## 4) Validation pipeline (hot path)
```mermaid
sequenceDiagram
autonumber
participant Client as Scanner.WebService
participant Auth as Authority (OIDC)
participant Sign as Signer
@@ -283,7 +283,7 @@ Per `license_id` (from PoE):
* PoE introspection cache (short TTL, e.g., 60-120 s).
* Release-verify cache (`scannerImageDigest` → { trusted, ts }).
* **Audit store** (PostgreSQL): `signer.audit_events`
```
{ _id, ts, tenantId, installationId, licenseId, customerId,


@@ -12,7 +12,7 @@
- **WebService** (`StellaOps.TaskRunner.WebService`) - HTTP API, plan hash validation, SSE log streaming, approval endpoints.
- **Worker** (`StellaOps.TaskRunner.Worker`) - run orchestration, retries/backoff, artifact capture, attestation generation.
- **Core** (`StellaOps.TaskRunner.Core`) - execution graph builder, simulation engine, step state machine, policy/approval gate abstractions.
- **Infrastructure** (`StellaOps.TaskRunner.Infrastructure`) - storage adapters (PostgreSQL, file), artifact/object store clients, evidence bundle writer.
## 3. Execution Phases
1. **Plan** - parse manifest, validate schema, resolve inputs/secrets, build execution graph, compute canonical `planHash` (SHA-256 over normalised graph).
@@ -29,7 +29,7 @@
- `POST /api/runs/{runId}/cancel` (`packs.run`) - cancel active run.
- TODO (Phase II): `GET /.well-known/openapi` (TASKRUN-OAS-61-002) after OAS publication.
## 5. Data Model (PostgreSQL, mirrors migration doc)
- **pack_runs**: `_id`, `planHash`, `plan`, `failurePolicy`, `requestedAt`, `createdAt`, `updatedAt`, `steps[]`, `tenantId`.
- **pack_run_logs**: `_id`, `runId`, `sequence` (monotonic), `timestamp` (UTC), `level`, `eventType`, `message`, `stepId?`, `metadata`.
- **pack_artifacts**: `_id`, `runId`, `name`, `type`, `sourcePath?`, `storedPath?`, `status`, `notes?`, `capturedAt`.
@@ -65,18 +65,17 @@
- **Export Center** - evidence bundles and manifests for offline/air-gapped export.
- **Orchestrator/CLI** - submission + resume flows; SSE log consumption.
## 11. Configuration (PostgreSQL example)
```json
\"TaskRunner\": {
\"Storage\": {
\"Mode\": \"mongo\",
\"Mongo\": {
\"ConnectionString\": \"mongodb://127.0.0.1:27017/taskrunner\",
\"Database\": \"taskrunner\",
\"RunsCollection\": \"pack_runs\",
\"LogsCollection\": \"pack_run_logs\",
\"ArtifactsCollection\": \"pack_artifacts\",
\"ApprovalsCollection\": \"pack_run_approvals\"
\"Mode\": \"postgresql\",
\"PostgreSQL\": {
\"ConnectionString\": \"Host=127.0.0.1;Database=taskrunner;Username=stellaops;Password=secret\",
\"RunsTable\": \"pack_runs\",
\"LogsTable\": \"pack_run_logs\",
\"ArtifactsTable\": \"pack_artifacts\",
\"ApprovalsTable\": \"pack_run_approvals\"
}
}
}


@@ -43,7 +43,7 @@
* **Vuln Explorer**: Enriches vulnerability data with VEX status.
* **Orchestrator**: Schedules consensus compute jobs for batch processing.
* **Authority**: Validates issuer trust and key fingerprints.
* **Config stores**: PostgreSQL (projections, issuer directory), Redis (caches).
---
@@ -168,7 +168,7 @@ vexlens:
projectionRetentionDays: 365
eventRetentionDays: 90
issuerDirectory:
source: postgresql # postgresql, file, api
refreshIntervalMinutes: 60
```


@@ -11,7 +11,7 @@
| Component | Requirement | Notes |
|-----------|-------------|-------|
| Runtime | .NET 10.0+ | LTS recommended |
| Database | PostgreSQL 15.0+ | For projections and issuer directory |
| Cache | Redis 7.0+ (optional) | For caching consensus results |
| Memory | 512MB minimum | 2GB recommended for production |
| CPU | 2 cores minimum | 4 cores for high throughput |
@@ -43,13 +43,12 @@ VEXLENS_TRUST_ALLOW_UNKNOWN_ISSUERS=true
VEXLENS_TRUST_UNKNOWN_ISSUER_PENALTY=0.5
# Storage
VEXLENS_STORAGE_POSTGRESQL_CONNECTION_STRING=Host=localhost;Database=vexlens;Username=stellaops;Password=secret
VEXLENS_STORAGE_PROJECTION_RETENTION_DAYS=365
VEXLENS_STORAGE_EVENT_RETENTION_DAYS=90
# Issuer Directory
VEXLENS_ISSUER_DIRECTORY_SOURCE=postgresql
VEXLENS_ISSUER_DIRECTORY_REFRESH_INTERVAL_MINUTES=60
# Observability
@@ -86,16 +85,15 @@ vexlens:
ProductAuthority: 0.05
storage:
postgresql:
connectionString: Host=localhost;Database=vexlens;Username=stellaops;Password=secret
projectionsTable: consensus_projections
issuersTable: issuers
projectionRetentionDays: 365
eventRetentionDays: 90
issuerDirectory:
source: postgresql
refreshIntervalMinutes: 60
seedFile: /etc/vexlens/issuers.json
@@ -126,7 +124,7 @@ docker run -d \
--name vexlens \
-p 8080:8080 \
-v /etc/vexlens:/etc/vexlens:ro \
-e VEXLENS_STORAGE_POSTGRESQL_CONNECTION_STRING="Host=postgres;Database=vexlens;Username=stellaops;Password=secret" \
stellaops/vexlens:latest
```
@@ -154,11 +152,11 @@ spec:
ports:
- containerPort: 8080
env:
- name: VEXLENS_STORAGE_POSTGRESQL_CONNECTION_STRING
valueFrom:
secretKeyRef:
name: vexlens-secrets
key: postgresql-connection-string
resources:
requests:
memory: "512Mi"
@@ -205,7 +203,7 @@ spec:
```bash
helm install vexlens stellaops/vexlens \
--namespace stellaops \
--set postgresql.connectionString="Host=postgres;Database=vexlens;Username=stellaops;Password=secret" \
--set replicas=2 \
--set resources.requests.memory=512Mi \
--set resources.limits.memory=2Gi
@@ -293,7 +291,7 @@ curl http://vexlens:8080/health/live
```bash
curl http://vexlens:8080/health/ready
# Response: {"status": "Healthy", "checks": {"postgresql": "Healthy", "issuerDirectory": "Healthy"}}
```
### 5.3 Detailed Health
@@ -358,11 +356,10 @@ groups:
### 7.1 Backup Projections
```bash
# PostgreSQL backup
pg_dump -h localhost -U stellaops -d vexlens \
-t consensus_projections \
-F c -f /backup/vexlens-projections-$(date +%Y%m%d).dump
```
### 7.2 Backup Issuer Directory
@@ -376,10 +373,9 @@ curl http://vexlens:8080/api/v1/vexlens/issuers?limit=1000 \
### 7.3 Restore
```bash
# Restore PostgreSQL
pg_restore -h localhost -U stellaops -d vexlens \
/backup/vexlens-projections-20251206.dump
# Re-seed issuers if needed
# Issuers are automatically loaded from seed file on startup
@@ -408,10 +404,10 @@ vexlens:
batchTimeoutMs: 50
storage:
postgresql:
# Connection pool
maxPoolSize: 100
minPoolSize: 10
caching:
enabled: true