feat(rate-limiting): Implement core rate limiting functionality with configuration, decision-making, metrics, middleware, and service registration
- Add RateLimitConfig for configuration management with YAML binding support.
- Introduce RateLimitDecision to encapsulate the result of rate limit checks.
- Implement RateLimitMetrics for OpenTelemetry metrics tracking.
- Create RateLimitMiddleware for enforcing rate limits on incoming requests.
- Develop RateLimitService to orchestrate instance and environment rate limit checks.
- Add RateLimitServiceCollectionExtensions for dependency injection registration.
@@ -20,19 +20,19 @@

## 1) Aggregation-Only Contract guardrails

**Epic 1 distilled**: the service itself is the enforcement point for AOC. The guardrail checklist is embedded in code (`AOCWriteGuard`) and must be satisfied before any advisory hits Mongo:
**Epic 1 distilled**: the service itself is the enforcement point for AOC. The guardrail checklist is embedded in code (`AOCWriteGuard`) and must be satisfied before any advisory hits PostgreSQL:

1. **No derived semantics in ingestion.** The DTOs produced by connectors cannot contain severity, consensus, reachability, merged status, or fix hints. Roslyn analyzers (`StellaOps.AOC.Analyzers`) scan connectors and fail builds if forbidden properties appear.
2. **Immutable raw docs.** Every upstream advisory is persisted in `advisory_raw` with append-only semantics. Revisions produce new `_id`s via version suffix (`:v2`, `:v3`), linking back through `supersedes`.
2. **Immutable raw rows.** Every upstream advisory is persisted in `advisory_raw` with append-only semantics. Revisions produce new IDs via version suffix (`:v2`, `:v3`), linking back through `supersedes`.
3. **Mandatory provenance.** Collectors record `source`, `upstream` metadata (`document_version`, `fetched_at`, `received_at`, `content_hash`), and signature presence before writing.
4. **Linkset only.** Derived joins (aliases, PURLs, CPEs, references) are stored inside `linkset` and never mutate `content.raw`.
5. **Deterministic canonicalisation.** Writers use canonical JSON (sorted object keys, lexicographic arrays) ensuring identical inputs yield the same hashes/diff-friendly outputs.
6. **Idempotent upserts.** `(source.vendor, upstream.upstream_id, upstream.content_hash)` uniquely identify a document. Duplicate hashes short-circuit; new hashes create a new version.
7. **Verifier & CI.** `StellaOps.AOC.Verifier` processes observation batches in CI and at runtime, rejecting writes lacking provenance, introducing unordered collections, or violating the schema.

> Feature toggle: set `concelier:features:noMergeEnabled=true` to disable the legacy Merge module and its `merge:reconcile` job once Link-Not-Merge adoption is complete (MERGE-LNM-21-002). Analyzer `CONCELIER0002` prevents new references to Merge DI helpers when this flag is enabled.
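
Guardrails 5 and 6 above can be illustrated with a small sketch. It is written in Python purely for illustration (the service itself is .NET) and assumes SHA-256 over UTF-8 canonical JSON with a `sha256:` prefix; the function names are hypothetical, and array ordering is deliberately left out because the exact sorting rules are not stated here.

```python
import hashlib
import json


def canonical_json(value):
    """Serialise with sorted object keys and no insignificant whitespace."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def content_hash(raw_document):
    """Hash the canonical form so key order in the input cannot change the result.

    The "sha256:" prefix is an assumption, not a documented format.
    """
    digest = hashlib.sha256(canonical_json(raw_document).encode("utf-8")).hexdigest()
    return f"sha256:{digest}"


def upsert_key(vendor, upstream_id, raw_document):
    """Build the (vendor, upstream_id, content_hash) identity from guardrail 6."""
    return (vendor, upstream_id, content_hash(raw_document))


# Same advisory, different key order: both map to the same identity.
doc_a = {"id": "GHSA-xxxx", "aliases": ["CVE-2025-12345"]}
doc_b = {"aliases": ["CVE-2025-12345"], "id": "GHSA-xxxx"}
same_identity = upsert_key("osv", "GHSA-xxxx", doc_a) == upsert_key("osv", "GHSA-xxxx", doc_b)
```

Because the hash is taken over the canonical form, a re-fetched document that differs only in key order would short-circuit as a duplicate instead of creating a new version.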

### 1.1 Advisory raw document shape

```json
{
@@ -61,28 +61,28 @@
    "spec_version": "1.6",
    "raw": { /* unmodified upstream document */ }
  },
  "identifiers": {
    "primary": "GHSA-xxxx-....",
    "aliases": ["CVE-2025-12345", "GHSA-xxxx-...."]
  },
  "linkset": {
    "purls": ["pkg:npm/lodash@4.17.21"],
    "cpes": ["cpe:2.3:a:lodash:lodash:4.17.21:*:*:*:*:*:*:*"],
    "references": [
      {"type":"advisory","url":"https://..."},
      {"type":"fix","url":"https://..."}
    ],
    "reconciled_from": ["content.raw.affected.ranges", "content.raw.pkg"]
  },
  "advisory_key": "CVE-2025-12345",
  "links": [
    {"scheme":"CVE","value":"CVE-2025-12345"},
    {"scheme":"GHSA","value":"GHSA-XXXX-...."},
    {"scheme":"PRIMARY","value":"CVE-2025-12345"}
  ],
  "supersedes": "advisory_raw:osv:GHSA-xxxx-....:v2",
  "tenant": "default"
}
```

### 1.2 Connector lifecycle

@@ -90,7 +90,7 @@

1. **Snapshot stage**: connectors fetch signed feeds or use offline mirrors keyed by `{vendor, stream, snapshot_date}`.
2. **Parse stage**: upstream payloads are normalised into strongly-typed DTOs with UTC timestamps.
3. **Guard stage**: DTOs run through `AOCWriteGuard` performing schema validation, forbidden-field checks, provenance validation, deterministic sorting, and `_id` computation.
4. **Write stage**: append-only Mongo insert; duplicate hash is ignored, changed hash creates a new version and emits `supersedes` pointer.
4. **Write stage**: append-only PostgreSQL insert; duplicate hash is ignored, changed hash creates a new version and emits `supersedes` pointer.
5. **Event stage**: DSSE-backed events `advisory.observation.updated` and `advisory.linkset.updated` notify downstream services (Policy, Export Center, CLI).
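
The write stage above can be sketched as an in-memory model. This is an illustration only, not the service's storage code: the class and method names are hypothetical, the ID layout mirrors the `advisory_raw:osv:...:v2` example, and numbering the first version `:v1` is an assumption.

```python
class AppendOnlyStore:
    """In-memory stand-in for an append-only advisory_raw table."""

    def __init__(self):
        self.rows = {}      # row id -> row (rows are never updated or deleted)
        self.history = {}   # (vendor, upstream_id) -> ordered list of row ids
        self.seen = set()   # (vendor, upstream_id, content_hash) already stored

    def write(self, vendor, upstream_id, content_hash, raw):
        """Insert a new version, or return None when the hash is a duplicate."""
        identity = (vendor, upstream_id, content_hash)
        if identity in self.seen:
            return None  # duplicate hash: ignored

        versions = self.history.setdefault((vendor, upstream_id), [])
        row_id = f"advisory_raw:{vendor}:{upstream_id}:v{len(versions) + 1}"
        row = {
            "id": row_id,
            "content_hash": content_hash,
            "raw": raw,
            # Point back at the previous version, if there is one.
            "supersedes": versions[-1] if versions else None,
        }
        self.rows[row_id] = row
        versions.append(row_id)
        self.seen.add(identity)
        return row


store = AppendOnlyStore()
first = store.write("osv", "GHSA-xxxx", "sha256:aaa", {"summary": "initial"})
duplicate = store.write("osv", "GHSA-xxxx", "sha256:aaa", {"summary": "initial"})
revised = store.write("osv", "GHSA-xxxx", "sha256:bbb", {"summary": "revised"})
```

In this sketch `duplicate` is `None`, while `revised` carries a `supersedes` pointer to the first row, which stays untouched.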

### 1.3 Export readiness

@@ -99,7 +99,7 @@ Concelier feeds Export Center profiles (Epic 10) by:

- Maintaining canonical JSON exports with deterministic manifests (`export.json`) listing content hashes, counts, and `supersedes` chains.
- Producing Trivy DB-compatible artifacts (SQLite + metadata) packaged under `db/` with hash manifests.
- Surfacing mirror manifests that reference Mongo snapshot digests, enabling Offline Kit bundle verification.
- Surfacing mirror manifests that reference PostgreSQL snapshot digests, enabling Offline Kit bundle verification.

Running the same export job twice against the same snapshot must yield byte-identical archives and manifest hashes.
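
One way to get byte-identical manifests is to derive every field from the snapshot content and to fix the ordering. The sketch below assumes a hypothetical manifest layout (`count` plus per-file `path`, `sha256`, `bytes`); it is not the actual `export.json` schema.

```python
import hashlib
import json


def build_manifest(files):
    """Build manifest bytes from a mapping of archive path -> file bytes.

    Entries are sorted by path and no timestamps or host details are
    included, so the same inputs always serialise to the same bytes.
    """
    entries = [
        {
            "path": path,
            "sha256": hashlib.sha256(files[path]).hexdigest(),
            "bytes": len(files[path]),
        }
        for path in sorted(files)
    ]
    manifest = {"count": len(entries), "entries": entries}
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")


# The same snapshot enumerated in a different order gives the same manifest.
run_1 = build_manifest({"a.json": b"{}", "b.json": b"[]"})
run_2 = build_manifest({"b.json": b"[]", "a.json": b"{}"})
manifest_hash = hashlib.sha256(run_1).hexdigest()
```

Anything that varies between runs, such as wall-clock time or directory enumeration order, would have to stay out of the manifest for the guarantee to hold.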

@@ -109,13 +109,13 @@ Running the same export job twice against the same snapshot must yield byte-iden

**Process shape:** single ASP.NET Core service `StellaOps.Concelier.WebService` hosting:

* **Scheduler** with distributed locks (Mongo backed).
* **Scheduler** with distributed locks (PostgreSQL backed).
* **Connectors** (fetch/parse/map) that emit immutable observation candidates.
* **Observation writer** enforcing AOC invariants via `AOCWriteGuard`.
* **Linkset builder** that correlates observations into `advisory_linksets` and annotates conflicts.
* **Event publisher** emitting `advisory.observation.updated` and `advisory.linkset.updated` messages.
* **Exporters** (JSON, Trivy DB, Offline Kit slices) fed from observation/linkset stores.
* **Minimal REST** for health/status/trigger/export, raw observation reads, and evidence retrieval (`GET /vuln/evidence/advisories/{advisory_key}`).

**Scale:** HA by running N replicas; **locks** prevent overlapping jobs per source/exporter.
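
How a lease-style lock keeps replicas from overlapping on the same job can be sketched as follows. This is a single-process model with hypothetical names and an injected clock; a database-backed version would need the check and the write to happen atomically, which the sketch does not attempt.

```python
class LeaseLocks:
    """Toy lease table: job key -> (holder, expiry time in seconds)."""

    def __init__(self):
        self.locks = {}

    def try_acquire(self, job_key, holder, now, lease_seconds):
        """Take or renew the lease; fail while another holder's lease is live."""
        current = self.locks.get(job_key)
        if current is not None:
            current_holder, expires_at = current
            if current_holder != holder and expires_at > now:
                return False
        # Free, expired, or already ours: (re)write the lease.
        self.locks[job_key] = (holder, now + lease_seconds)
        return True

    def release(self, job_key, holder):
        """Drop the lease, but only if this holder still owns it."""
        current = self.locks.get(job_key)
        if current is not None and current[0] == holder:
            del self.locks[job_key]


locks = LeaseLocks()
results = [
    locks.try_acquire("source:osv", "replica-a", now=0, lease_seconds=30),   # acquired
    locks.try_acquire("source:osv", "replica-b", now=10, lease_seconds=30),  # refused
    locks.try_acquire("source:osv", "replica-a", now=20, lease_seconds=30),  # heartbeat
    locks.try_acquire("source:osv", "replica-b", now=49, lease_seconds=30),  # still held
    locks.try_acquire("source:osv", "replica-b", now=51, lease_seconds=30),  # expired
]
```

The expiry is what lets another replica take over after a crash: a holder that stops renewing simply loses the lease once the lease time has passed.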

@@ -123,7 +123,7 @@ Running the same export job twice against the same snapshot must yield byte-iden

## 3) Canonical domain model

> Stored in MongoDB (database `concelier`), serialized with a **canonical JSON** writer (stable order, camelCase, normalized timestamps).
> Stored in PostgreSQL (database `concelier`), serialized with a **canonical JSON** writer (stable order, camelCase, normalized timestamps).

### 2.1 Core entities

@@ -300,7 +300,7 @@ public interface IFeedConnector {

1. **Connector fetch/parse/map**: connectors download upstream payloads, validate signatures, and map to DTOs (identifiers, references, raw payload, provenance).
2. **AOC guard**: `AOCWriteGuard` verifies forbidden keys, provenance completeness, tenant claims, timestamp normalization, and content hash idempotency. Violations raise `ERR_AOC_00x` mapped to structured logs and metrics.
3. **Append-only write**: observations insert into `advisory_observations`; duplicates by `(tenant, source.vendor, upstream.upstreamId, upstream.contentHash)` become no-ops; new content for same upstream id creates a supersedes chain.
4. **Change feed + event**: Mongo change streams trigger `advisory.observation.updated@1` events with deterministic payloads (IDs, hash, supersedes pointer, linkset summary). Policy Engine, Offline Kit builder, and guard dashboards subscribe.
4. **Replication + event**: PostgreSQL logical replication triggers `advisory.observation.updated@1` events with deterministic payloads (IDs, hash, supersedes pointer, linkset summary). Policy Engine, Offline Kit builder, and guard dashboards subscribe.
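
The guard step in this pipeline can be pictured as a pure function from a DTO to a list of violations. The sketch below covers only forbidden keys, provenance completeness, and the tenant claim, and only at the top level of the DTO. The key spellings and violation messages are assumptions; it does not reproduce `AOCWriteGuard` or the `ERR_AOC_00x` codes.

```python
# Hypothetical spellings of the derived fields that ingestion must not carry.
FORBIDDEN_KEYS = {"severity", "consensus", "reachability", "merged", "fix_hints"}
REQUIRED_UPSTREAM = ("document_version", "fetched_at", "received_at", "content_hash")


def check_observation(dto):
    """Return a list of violation messages; an empty list means the DTO passes."""
    violations = []

    for key in sorted(FORBIDDEN_KEYS.intersection(dto)):
        violations.append(f"forbidden field: {key}")

    if not dto.get("tenant"):
        violations.append("missing tenant")
    if not dto.get("source"):
        violations.append("missing source")

    upstream = dto.get("upstream") or {}
    for field in REQUIRED_UPSTREAM:
        if not upstream.get(field):
            violations.append(f"missing upstream.{field}")

    return violations


good = {
    "tenant": "default",
    "source": {"vendor": "osv"},
    "upstream": {
        "document_version": "1",
        "fetched_at": "2025-01-01T00:00:00Z",
        "received_at": "2025-01-01T00:00:05Z",
        "content_hash": "sha256:aaa",
    },
}
bad = {"tenant": "default", "severity": "high", "upstream": {"content_hash": "sha256:aaa"}}

good_result = check_observation(good)
bad_result = check_observation(bad)
```

Returning every violation at once, instead of stopping at the first, makes it easier to map a rejected write to structured logs and metrics.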

### 5.2 Linkset correlation

@@ -321,9 +321,9 @@ Events are emitted via NATS (primary) and Redis Stream (fallback). Consumers ack

---

## 7) Storage schema (MongoDB)
## 7) Storage schema (PostgreSQL)

### Collections & indexes (LNM path)
### Tables & indexes (LNM path)

* `concelier.sources` `{_id, type, baseUrl, enabled, notes}`: connector catalog.
* `concelier.source_state` `{sourceName(unique), enabled, cursor, lastSuccess, backoffUntil, paceOverrides}`: run-state (TTL indexes on `backoffUntil`).
@@ -338,15 +338,15 @@ Events are emitted via NATS (primary) and Redis Stream (fallback). Consumers ack
  _id: "tenant:vendor:upstreamId:revision",
  tenant,
  source: { vendor, stream, api, collectorVersion },
  upstream: { upstreamId, documentVersion, fetchedAt, receivedAt, contentHash, signature },
  content: { format, specVersion, raw, metadata? },
  identifiers: { cve?, ghsa?, vendorIds[], aliases[] },
  linkset: { purls[], cpes[], aliases[], references[], reconciledFrom[] },
  rawLinkset: { aliases[], purls[], cpes[], references[], reconciledFrom[], notes? },
  supersedes?: "prevObservationId",
  createdAt,
  attributes?: object
}
```

* Indexes: `{tenant:1, upstream.upstreamId:1}`, `{tenant:1, source.vendor:1, linkset.purls:1}`, `{tenant:1, linkset.aliases:1}`, `{tenant:1, createdAt:-1}`.

@@ -389,9 +389,9 @@ Events are emitted via NATS (primary) and Redis Stream (fallback). Consumers ack

* `locks` `{_id(jobKey), holder, acquiredAt, heartbeatAt, leaseMs, ttlAt}` (TTL cleans dead locks)
* `jobs` `{_id, type, args, state, startedAt, heartbeatAt, endedAt, error}`

**Legacy collections** (`advisory`, `alias`, `affected`, `reference`, `merge_event`) remain read-only during the migration window to support back-compat exports. New code must not write to them; scheduled cleanup removes them after Link-Not-Merge GA.
**Legacy tables** (`advisory`, `alias`, `affected`, `reference`, `merge_event`) remain read-only during the migration window to support back-compat exports. New code must not write to them; scheduled cleanup removes them after Link-Not-Merge GA.

**GridFS buckets**: `fs.documents` for raw payloads (immutable); `fs.exports` for historical JSON/Trivy archives.
**Object storage**: `documents` for raw payloads (immutable); `exports` for historical JSON/Trivy archives.

---

@@ -476,7 +476,8 @@ GET /affected?productKey=pkg:rpm/openssl&limit=100

```yaml
concelier:
  mongo: { uri: "mongodb://mongo/concelier" }
  postgres:
    connectionString: "Host=postgres;Port=5432;Database=concelier;Username=stellaops;Password=stellaops"
  s3:
    endpoint: "http://minio:9000"
    bucket: "stellaops-concelier"
@@ -540,12 +541,12 @@ concelier:

* **Ingest**: ≥ 5k documents/min on 4 cores (CSAF/OpenVEX/JSON).
* **Normalize/map**: ≥ 50k observation statements/min on 4 cores.
* **Observation write**: ≤ 5 ms P95 per document (including guard + Mongo write).
* **Observation write**: ≤ 5 ms P95 per row (including guard + PostgreSQL write).
* **Linkset build**: ≤ 15 ms P95 per `(vulnerabilityId, productKey)` update, even with 20+ contributing observations.
* **Export**: 1M advisories JSON in ≤ 90 s (streamed, zstd), Trivy DB in ≤ 60 s on 8 cores.
* **Memory**: hard cap per job; chunked streaming writers; backpressure to avoid GC spikes.

**Scale pattern**: add Concelier replicas; Mongo scaling via indices and read/write concerns; GridFS only for oversized docs.
**Scale pattern**: add Concelier replicas; PostgreSQL scaling via indices and read/write connection pooling; object storage for oversized docs.

---

@@ -556,13 +557,13 @@ concelier:
* `concelier.fetch.docs_total{source}`
* `concelier.fetch.bytes_total{source}`
* `concelier.parse.failures_total{source}`
* `concelier.map.statements_total{source}`
* `concelier.observations.write_total{result=ok|noop|error}`
* `concelier.linksets.updated_total{result=ok|skip|error}`
* `concelier.linksets.conflicts_total{type}`
* `concelier.export.bytes{kind}`
* `concelier.export.duration_seconds{kind}`
* `advisory_ai_chunk_requests_total{tenant,result,cache}` and `advisory_ai_guardrail_blocks_total{tenant,reason,cache}` instrument the `/advisories/{key}/chunks` surfaces that Advisory AI consumes. Cache hits now emit the same guardrail counters so operators can see blocked segments even when responses are served from cache.
* **Tracing** around fetch/parse/map/observe/linkset/export.
* **Logs**: structured with `source`, `uri`, `docDigest`, `advisoryKey`, `exportId`.

@@ -604,7 +605,7 @@ concelier:

1. **MVP**: Red Hat (CSAF), SUSE (CSAF), Ubuntu (USN JSON), OSV; JSON export.
2. **Add**: GHSA GraphQL, Debian (DSA HTML/JSON), Alpine secdb; Trivy DB export.
3. **Attestation hand‑off**: integrate with **Signer/Attestor** (optional).
   - Advisory evidence attestation parameters and path rules are documented in `docs/modules/concelier/attestation.md`.
4. **Scale & diagnostics**: provider dashboards, staleness alerts, export cache reuse.
5. **Offline kit**: end‑to‑end verified bundles for air‑gap.