work work hard work
@@ -2,7 +2,7 @@

**Version:** 1.0.0
**Status:** DRAFT
**Last Updated:** 2025-12-15
**Last Updated:** 2025-12-17

---

@@ -44,9 +44,14 @@ This document specifies the PostgreSQL database design for StellaOps control-pla
| `policy` | Policy | Policy packs, rules, risk profiles, evaluations |
| `packs` | PacksRegistry | Package attestations, mirrors, lifecycle |
| `issuer` | IssuerDirectory | Trust anchors, issuer keys, certificates |
| `proofchain` | Attestor | Content-addressed proof/evidence chain (entries, DSSE envelopes, spines, trust anchors, Rekor) |
| `unknowns` | Unknowns | Bitemporal ambiguity tracking for scan gaps |
| `audit` | Shared | Cross-cutting audit log (optional) |

**ProofChain references:**
- DDL migration: `src/Attestor/__Libraries/StellaOps.Attestor.Persistence/Migrations/20251214000001_AddProofChainSchema.sql`
- Perf report: `docs/db/reports/proofchain-schema-perf-2025-12-17.md`

### 2.3 Multi-Tenancy Model

**Strategy:** Single database, single schema set, `tenant_id` column on all tenant-scoped tables with **mandatory Row-Level Security (RLS)**.

127
docs/db/reports/proofchain-schema-perf-2025-12-17.md
Normal file
@@ -0,0 +1,127 @@
# ProofChain schema performance report (2025-12-17)

## Environment
- Postgres image: `postgres:16`
- DB: `proofchain_perf`
- Port: `54329`
- Host: `localhost`

## Dataset
- Source: `src/Attestor/__Libraries/StellaOps.Attestor.Persistence/Perf/seed.sql`
- Rows:
  - `trust_anchors`: 50
  - `sbom_entries`: 20000
  - `dsse_envelopes`: 60000
  - `spines`: 20000
  - `rekor_entries`: 2000

## Query Output

```text
Timing is on.
 trust_anchors | sbom_entries | dsse_envelopes | spines | rekor_entries
---------------+--------------+----------------+--------+---------------
            50 |        20000 |          60000 |  20000 |          2000
(1 row)

Time: 18.788 ms
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using uq_sbom_entry on sbom_entries (cost=0.41..8.44 rows=1 width=226) (actual time=0.024..0.024 rows=1 loops=1)
   Index Cond: (((bom_digest)::text = 'd2cb2e2d7955252437da988dd4484f1dfcde81750ce0175d9fb9a85134a8de9a'::text) AND (purl = format('pkg:npm/vendor-%02s/pkg-%05s'::text, 1, 1)) AND (version = '1.0.1'::text))
   Buffers: shared hit=4
 Planning:
   Buffers: shared hit=24
 Planning Time: 0.431 ms
 Execution Time: 0.032 ms
(7 rows)

Time: 1.119 ms
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=173.99..174.13 rows=56 width=80) (actual time=0.331..0.340 rows=100 loops=1)
   Buffers: shared hit=8
   -> Sort (cost=173.99..174.13 rows=56 width=80) (actual time=0.330..0.335 rows=100 loops=1)
         Sort Key: purl
         Sort Method: quicksort Memory: 38kB
         Buffers: shared hit=8
         -> Bitmap Heap Scan on sbom_entries (cost=4.72..172.37 rows=56 width=80) (actual time=0.019..0.032 rows=100 loops=1)
               Recheck Cond: ((bom_digest)::text = 'd2cb2e2d7955252437da988dd4484f1dfcde81750ce0175d9fb9a85134a8de9a'::text)
               Heap Blocks: exact=3
               Buffers: shared hit=5
               -> Bitmap Index Scan on idx_sbom_entries_bom_digest (cost=0.00..4.71 rows=56 width=0) (actual time=0.015..0.015 rows=100 loops=1)
                     Index Cond: ((bom_digest)::text = 'd2cb2e2d7955252437da988dd4484f1dfcde81750ce0175d9fb9a85134a8de9a'::text)
                     Buffers: shared hit=2
 Planning:
   Buffers: shared hit=12 read=1
 Planning Time: 0.149 ms
 Execution Time: 0.355 ms
(17 rows)

Time: 0.867 ms
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using idx_dsse_entry_predicate on dsse_envelopes (cost=0.41..8.43 rows=1 width=226) (actual time=0.008..0.009 rows=1 loops=1)
   Index Cond: ((entry_id = '924258f2-921e-9694-13a4-400abfdf00d6'::uuid) AND (predicate_type = 'evidence.stella/v1'::text))
   Buffers: shared hit=4
 Planning:
   Buffers: shared hit=23
 Planning Time: 0.150 ms
 Execution Time: 0.014 ms
(7 rows)

Time: 0.388 ms
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Index Scan using idx_spines_bundle on spines (cost=0.41..8.43 rows=1 width=194) (actual time=0.016..0.017 rows=1 loops=1)
   Index Cond: ((bundle_id)::text = '2f9ef44d93b4520b2296d5b73bd1cc87156a304c757feb4c78926452db61abf8'::text)
   Buffers: shared hit=4
 Planning Time: 0.096 ms
 Execution Time: 0.025 ms
(5 rows)

Time: 0.318 ms
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on rekor_entries (cost=4.34..27.60 rows=8 width=186) (actual time=0.024..0.024 rows=0 loops=1)
   Recheck Cond: (log_index = 10)
   Buffers: shared hit=5
   -> Bitmap Index Scan on idx_rekor_log_index (cost=0.00..4.34 rows=8 width=0) (actual time=0.023..0.023 rows=0 loops=1)
         Index Cond: (log_index = 10)
         Buffers: shared hit=5
 Planning:
   Buffers: shared hit=5
 Planning Time: 0.097 ms
 Execution Time: 0.040 ms
(10 rows)

Time: 0.335 ms
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=637.30..637.30 rows=1 width=226) (actual time=0.649..0.660 rows=100 loops=1)
   Buffers: shared hit=405
   -> Sort (cost=637.30..637.30 rows=1 width=226) (actual time=0.648..0.653 rows=100 loops=1)
         Sort Key: e.purl
         Sort Method: quicksort Memory: 50kB
         Buffers: shared hit=405
         -> Nested Loop (cost=5.13..637.29 rows=1 width=226) (actual time=0.074..0.385 rows=100 loops=1)
               Buffers: shared hit=405
               -> Bitmap Heap Scan on sbom_entries e (cost=4.72..172.37 rows=56 width=48) (actual time=0.061..0.071 rows=100 loops=1)
                     Recheck Cond: ((bom_digest)::text = 'd2cb2e2d7955252437da988dd4484f1dfcde81750ce0175d9fb9a85134a8de9a'::text)
                     Heap Blocks: exact=3
                     Buffers: shared hit=5
                     -> Bitmap Index Scan on idx_sbom_entries_bom_digest (cost=0.00..4.71 rows=56 width=0) (actual time=0.057..0.057 rows=100 loops=1)
                           Index Cond: ((bom_digest)::text = 'd2cb2e2d7955252437da988dd4484f1dfcde81750ce0175d9fb9a85134a8de9a'::text)
                           Buffers: shared hit=2
               -> Index Scan using idx_dsse_entry_predicate on dsse_envelopes d (cost=0.41..8.29 rows=1 width=194) (actual time=0.003..0.003 rows=1 loops=100)
                     Index Cond: ((entry_id = e.entry_id) AND (predicate_type = 'evidence.stella/v1'::text))
                     Buffers: shared hit=400
 Planning:
   Buffers: shared hit=114
 Planning Time: 0.469 ms
 Execution Time: 0.691 ms
(22 rows)

Time: 1.643 ms
```
@@ -72,12 +72,12 @@ stellaops verify offline \
| 2 | T2 | DONE | Implemented `OfflineCommandGroup` and wired into `CommandFactory`. | DevEx/CLI Guild | Create `OfflineCommandGroup` class. |
| 3 | T3 | DONE | Implemented `offline import` with manifest/hash validation, monotonicity checks, and quarantine hooks. | DevEx/CLI Guild | Implement `offline import` command (core import flow). |
| 4 | T4 | DONE | Implemented `--verify-dsse` via `DsseVerifier` (requires `--trust-root`) and added tests. | DevEx/CLI Guild | Add `--verify-dsse` flag handler. |
| 5 | T5 | BLOCKED | Needs offline Rekor inclusion proof verification contract/library; current implementation only validates receipt structure. | DevEx/CLI Guild | Add `--verify-rekor` flag handler. |
| 5 | T5 | DOING | Implement offline Rekor receipt inclusion proof + checkpoint signature verification per `docs/product-advisories/14-Dec-2025 - Rekor Integration Technical Reference.md` §13. | DevEx/CLI Guild | Add `--verify-rekor` flag handler. |
| 6 | T6 | DONE | Implemented deterministic trust-root loading (`--trust-root`). | DevEx/CLI Guild | Add `--trust-root` option. |
| 7 | T7 | DONE | Enforced `--force-reason` when forcing activation and persisted justification. | DevEx/CLI Guild | Add `--force-activate` flag. |
| 8 | T8 | DONE | Implemented `offline status` with table/json outputs. | DevEx/CLI Guild | Implement `offline status` command. |
| 9 | T9 | BLOCKED | Needs policy/verification contract (exit code mapping + evaluation semantics) before implementing `verify offline`. | DevEx/CLI Guild | Implement `verify offline` command. |
| 10 | T10 | BLOCKED | Depends on the `verify offline` policy schema/loader contract (YAML/JSON canonicalization rules). | DevEx/CLI Guild | Add `--policy` option parser. |
| 9 | T9 | DOING | Implement `verify offline` using the policy schema in `docs/product-advisories/14-Dec-2025 - Offline and Air-Gap Technical Reference.md` §4 plus deterministic evidence reconciliation outputs. | DevEx/CLI Guild | Implement `verify offline` command. |
| 10 | T10 | DOING | Add YAML+JSON policy loader with deterministic parsing/canonicalization rules; share with AirGap reconciliation. | DevEx/CLI Guild | Add `--policy` option parser. |
| 11 | T11 | DONE | Standardized `--output table|json` formatting for offline verbs. | DevEx/CLI Guild | Create output formatters (table, json). |
| 12 | T12 | DONE | Added progress reporting for bundle hashing when bundle size exceeds threshold. | DevEx/CLI Guild | Implement progress reporting. |
| 13 | T13 | DONE | Implemented offline exit codes (`OfflineExitCodes`). | DevEx/CLI Guild | Add exit code standardization. |
@@ -682,5 +682,6 @@ public static class OfflineExitCodes
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-17 | Unblocked T5/T9/T10 by adopting the published offline policy schema (A12) and Rekor receipt contract (Rekor Technical Reference §13); started implementation of offline Rekor inclusion proof verification and `verify offline`. | Agent |
| 2025-12-15 | Implemented `offline import/status` (+ exit codes, state storage, quarantine hooks), added docs and tests; validated with `dotnet test src/Cli/__Tests/StellaOps.Cli.Tests/StellaOps.Cli.Tests.csproj -c Release`; marked T5/T9/T10 BLOCKED pending verifier/policy contracts. | DevEx/CLI |
| 2025-12-15 | Normalised sprint file to standard template; set T1 to DOING. | Planning · DevEx/CLI |
@@ -3,7 +3,7 @@
**Epic:** Time-to-First-Signal (TTFS) Implementation
**Module:** Web UI
**Working Directory:** `src/Web/StellaOps.Web/src/app/`
**Status:** BLOCKED
**Status:** DOING
**Created:** 2025-12-14
**Target Completion:** TBD
**Depends On:** SPRINT_0339_0001_0001 (First Signal API)
@@ -49,15 +49,15 @@ This sprint implements the `FirstSignalCard` Angular component that displays the
| T6 | Create FirstSignalCard styles | — | DONE | `src/Web/StellaOps.Web/src/app/features/runs/components/first-signal-card/first-signal-card.component.scss` |
| T7 | Implement SSE integration | — | DONE | Uses run stream SSE (`first_signal`) via `EventSourceFactory`; requires `tenant` query fallback in Orchestrator stream endpoints. |
| T8 | Implement polling fallback | — | DONE | `FirstSignalStore` starts polling (default 5s) when SSE errors. |
| T9 | Implement TTFS telemetry | — | BLOCKED | Telemetry client/contract for `ttfs_start` + `ttfs_signal_rendered` not present in Web; requires platform decision. |
| T9 | Implement TTFS telemetry | — | DOING | Implement Web telemetry client + TTFS event emission (`ttfs_start`, `ttfs_signal_rendered`) with sampling and offline-safe buffering. |
| T10 | Create prefetch service | — | DONE | `src/Web/StellaOps.Web/src/app/features/runs/services/first-signal-prefetch.service.ts` |
| T11 | Integrate into run detail page | — | DONE | Integrated into `src/Web/StellaOps.Web/src/app/features/console/console-status.component.html` as interim run-surface. |
| T12 | Create Storybook stories | — | DONE | `src/Web/StellaOps.Web/src/stories/runs/first-signal-card.stories.ts` |
| T13 | Create unit tests | — | DONE | `src/Web/StellaOps.Web/src/app/core/api/first-signal.store.spec.ts` |
| T14 | Create e2e tests | — | DONE | `src/Web/StellaOps.Web/tests/e2e/first-signal-card.spec.ts` |
| T15 | Create accessibility tests | — | DONE | `src/Web/StellaOps.Web/tests/e2e/a11y-smoke.spec.ts` includes `/console/status`. |
| T16 | Configure telemetry sampling | — | BLOCKED | No Web telemetry config wiring yet (`AppConfig.telemetry.sampleRate` unused). |
| T17 | Add i18n keys for micro-copy | — | BLOCKED | i18n framework not configured in `src/Web/StellaOps.Web` (no `@ngx-translate/*` / Angular i18n usage). |
| T16 | Configure telemetry sampling | — | DOING | Wire `AppConfig.telemetry.sampleRate` into telemetry client sampling decisions and expose defaults in config. |
| T17 | Add i18n keys for micro-copy | — | DOING | Add i18n framework and migrate FirstSignalCard micro-copy to translation keys (EN baseline). |

---

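Row T8 above describes switching from SSE to polling when the stream errors. A minimal sketch of that hand-off follows. It is not the actual `FirstSignalStore`: the `SignalStream` and `Scheduler` shapes and the `FirstSignalFallback` class are hypothetical names introduced for illustration, and only the 5 s default comes from the table.

```typescript
// Hypothetical shapes for illustration; the real FirstSignalStore and
// EventSourceFactory APIs are not shown in this diff.
interface SignalStream {
  onSignal(handler: (payload: string) => void): void;
  onError(handler: () => void): void;
  close(): void;
}

// Starts a repeating task and returns a function that cancels it.
type Scheduler = (task: () => void, intervalMs: number) => () => void;

// Default scheduler built on setInterval.
const intervalScheduler: Scheduler = (task, intervalMs) => {
  const handle = setInterval(task, intervalMs);
  return () => clearInterval(handle);
};

class FirstSignalFallback {
  private cancelPolling: (() => void) | null = null;
  private readonly stream: SignalStream;
  private readonly poll: () => void;
  private readonly schedule: Scheduler;
  private readonly intervalMs: number;

  constructor(
    stream: SignalStream,
    poll: () => void,
    schedule: Scheduler = intervalScheduler,
    intervalMs: number = 5000, // "default 5s" from row T8
  ) {
    this.stream = stream;
    this.poll = poll;
    this.schedule = schedule;
    this.intervalMs = intervalMs;
  }

  start(onSignal: (payload: string) => void): void {
    this.stream.onSignal(onSignal);
    this.stream.onError(() => {
      // Drop the broken stream and start polling at most once.
      this.stream.close();
      if (this.cancelPolling === null) {
        this.cancelPolling = this.schedule(this.poll, this.intervalMs);
      }
    });
  }

  isPolling(): boolean {
    return this.cancelPolling !== null;
  }

  stop(): void {
    this.stream.close();
    if (this.cancelPolling !== null) {
      this.cancelPolling();
      this.cancelPolling = null;
    }
  }
}
```

Injecting the scheduler keeps the fallback testable without real timers, which fits the deterministic-test posture used elsewhere in these sprints.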
@@ -1781,3 +1781,4 @@ npx ngx-translate-extract \
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-15 | Implemented FirstSignalCard + store/client, quickstart mock, Storybook story, unit/e2e/a11y coverage; added Orchestrator stream tenant query fallback; marked telemetry/i18n tasks BLOCKED pending platform decisions. | Agent |
| 2025-12-17 | Unblocked T9/T16/T17 by selecting a Web telemetry+sampling contract and adding an i18n framework; started implementation and test updates. | Agent |
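The telemetry work unblocked above combines a sampling decision with offline-safe buffering. The sketch below shows one way the two could fit together, assuming a per-session sampling decision and a bounded in-memory queue. `SampledTelemetryBuffer`, its constructor parameters, and the `send` callback are hypothetical; only the event names `ttfs_start` and `ttfs_signal_rendered` and the idea of a configurable sample rate come from the sprint table.

```typescript
// Event names come from the sprint table; all other names are assumptions.
type TtfsEventName = "ttfs_start" | "ttfs_signal_rendered";

interface TtfsEvent {
  name: TtfsEventName;
  atMs: number;
}

class SampledTelemetryBuffer {
  readonly sampled: boolean;
  private readonly maxBuffered: number;
  private queue: TtfsEvent[] = [];

  constructor(
    sampleRate: number,
    random: () => number = Math.random,
    maxBuffered: number = 100,
  ) {
    // Clamp to [0, 1]; the decision is made once per buffer (per session here).
    const rate = Math.min(1, Math.max(0, sampleRate));
    this.sampled = random() < rate;
    this.maxBuffered = maxBuffered;
  }

  record(event: TtfsEvent): void {
    if (!this.sampled) return;
    // Bounded buffer: drop the oldest event rather than grow without limit.
    if (this.queue.length >= this.maxBuffered) this.queue.shift();
    this.queue.push(event);
  }

  // `send` returns true on success. On failure (for example while offline)
  // the events stay queued for the next flush attempt.
  flush(send: (batch: TtfsEvent[]) => boolean): number {
    if (this.queue.length === 0) return 0;
    const batch = this.queue.slice();
    if (!send(batch)) return 0;
    this.queue = this.queue.slice(batch.length);
    return batch.length;
  }

  pending(): number {
    return this.queue.length;
  }
}
```

Deciding once per session keeps `ttfs_start` and `ttfs_signal_rendered` together, so a sampled session always yields a complete pair. Whether the real client samples per session or per event is not stated in the diff.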
@@ -52,13 +52,13 @@ scanner:
| T4 | Create `TrustAnchorRegistry` service | DONE | Agent | Resolution by PURL |
| T5 | Add configuration binding in `Program.cs` | DONE | Agent | |
| T6 | Create `OfflineKitOptionsValidator` | DONE | Agent | Startup validation |
| T7 | Integrate with `DsseVerifier` | BLOCKED | Agent | No Scanner-side offline import service consumes DSSE verification yet. |
| T8 | Implement DSSE failure handling per §7.2 | BLOCKED | Agent | Requires OfflineKit import pipeline/endpoints to exist. |
| T9 | Add `rekorOfflineMode` enforcement | BLOCKED | Agent | Requires an offline Rekor snapshot verifier (not present in current codebase). |
| T7 | Integrate with `DsseVerifier` | DOING | Agent | Implement Scanner OfflineKit import host and consume DSSE verification with trust anchor resolution. |
| T8 | Implement DSSE failure handling per §7.2 | DOING | Agent | Implement ProblemDetails + log/metric reason codes; respect `requireDsse` soft-fail mode. |
| T9 | Add `rekorOfflineMode` enforcement | DOING | Agent | Implement offline Rekor receipt verification and enforce no-network posture when enabled. |
| T10 | Create configuration schema documentation | DONE | Agent | Added `src/Scanner/docs/schemas/scanner-offline-kit-config.schema.json`. |
| T11 | Write unit tests for PURL matcher | DONE | Agent | Added coverage in `src/Scanner/__Tests/StellaOps.Scanner.Core.Tests`. |
| T12 | Write unit tests for trust anchor resolution | DONE | Agent | Added coverage for registry + validator in `src/Scanner/__Tests/StellaOps.Scanner.Core.Tests`. |
| T13 | Write integration tests for offline import | BLOCKED | Agent | Requires OfflineKit import pipeline/endpoints to exist. |
| T13 | Write integration tests for offline import | DOING | Agent | Add Scanner.WebService OfflineKit import endpoint tests (success + failure + soft-fail) with deterministic fixtures. |
| T14 | Update Helm chart values | DONE | Agent | Added OfflineKit env vars to `deploy/helm/stellaops/values-*.yaml`. |
| T15 | Update docker-compose samples | DONE | Agent | Added OfflineKit env vars to `deploy/compose/docker-compose.*.yaml`. |

@@ -708,6 +708,7 @@ scanner:
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-15 | Implemented OfflineKit options/validator + trust anchor matcher/registry; wired Scanner.WebService options binding + DI; marked T7-T9 blocked pending import pipeline + offline Rekor verifier. | Agent |
| 2025-12-17 | Unblocked T7-T9/T13 by implementing a Scanner-side OfflineKit import host (API + services) and offline Rekor receipt verification; started wiring DSSE/Rekor failure handling and integration tests. | Agent |

## Decisions & Risks
- `T7/T8` blocked: Scanner has no OfflineKit import pipeline consuming DSSE verification yet (owning module + API/service design needed).
@@ -42,7 +42,7 @@
| T4 | Implement `attestor_rekor_success_total` counter | DONE | Agent | Implement in `OfflineKitMetrics` (call sites may land later). |
| T5 | Implement `attestor_rekor_retry_total` counter | DONE | Agent | Implement in `OfflineKitMetrics` (call sites may land later). |
| T6 | Implement `rekor_inclusion_latency` histogram | DONE | Agent | Implement in `OfflineKitMetrics` (call sites may land later). |
| T7 | Register metrics with Prometheus endpoint | BLOCKED | Agent | No backend Offline Kit import service/endpoint yet (`/api/offline-kit/import` not implemented in `src/**`); decide host/exporter surface for `/metrics`. |
| T7 | Register metrics with Prometheus endpoint | DOING | Agent | Implement Scanner OfflineKit import host and expose `/metrics` with Offline Kit counters/histograms (Prometheus text format). |
| **Logging (G12)** | | | | |
| T8 | Define structured logging constants | DONE | Agent | Add `OfflineKitLogFields` + scope helpers. |
| T9 | Update `ImportValidator` logging | DONE | Agent | Align log templates + tenant scope usage. |
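Row T7 above refers to exposing counters in the Prometheus text format. For readers unfamiliar with that format, here is a small sketch of how one counter family is rendered. The `renderCounter` helper, the help text, and the `mode` label are hypothetical; the production exporter lives in the .NET services and is not shown in this diff. Only the metric name is taken from the table.

```typescript
type Labels = { [name: string]: string };

interface CounterSample {
  labels: Labels;
  value: number;
}

// Label values escape backslash, double quote, and newline in the text format.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Renders one counter family: HELP line, TYPE line, then one line per sample.
// Escaping of the help text itself is omitted to keep the sketch short.
function renderCounter(name: string, help: string, samples: CounterSample[]): string {
  const lines = [`# HELP ${name} ${help}`, `# TYPE ${name} counter`];
  for (const sample of samples) {
    // Sort label names so the output is stable across runs.
    const labelText = Object.keys(sample.labels)
      .sort()
      .map((label) => `${label}="${escapeLabelValue(sample.labels[label])}"`)
      .join(",");
    lines.push(
      labelText.length > 0
        ? `${name}{${labelText}} ${sample.value}`
        : `${name} ${sample.value}`,
    );
  }
  return lines.join("\n") + "\n";
}

// Hypothetical label set; the real label names are not given in this diff.
const exposition = renderCounter(
  "attestor_rekor_success_total",
  "Successful Rekor verifications.",
  [{ labels: { mode: "offline" }, value: 3 }],
);
```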
@@ -58,7 +58,7 @@
| T17 | Create migration for `offline_kit_audit` table | DONE | Agent | Add `authority.offline_kit_audit` + indexes + RLS policy. |
| T18 | Implement `IOfflineKitAuditRepository` | DONE | Agent | Repository + query helpers (tenant/type/result). |
| T19 | Create audit event emitter service | DONE | Agent | Emitter wraps repository and must not fail import flows. |
| T20 | Wire audit to import/activation flows | BLOCKED | Agent | No backend Offline Kit import host/activation flow in `src/**` yet; wire once `POST /api/offline-kit/import` exists. |
| T20 | Wire audit to import/activation flows | DOING | Agent | Wire `IOfflineKitAuditEmitter` into Scanner OfflineKit import/activation flow and validate tenant-scoped rows. |
| **Testing & Docs** | | | | |
| T21 | Write unit tests for metrics | DONE | Agent | Cover instrument names + label sets via `MeterListener`. |
| T22 | Write integration tests for audit | DONE | Agent | Cover migration + insert/query via Authority Postgres Testcontainers fixture (requires Docker). |
@@ -806,6 +806,7 @@ public sealed class OfflineKitAuditEmitter : IOfflineKitAuditEmitter
| 2025-12-15 | Added Authority Postgres migration + repository/emitter for `authority.offline_kit_audit`; marked `T20` `BLOCKED` pending an owning backend import/activation flow. | Agent |
| 2025-12-15 | Completed `T1`-`T6`, `T8`-`T19`, `T21`-`T24` (metrics/logging/codes/audit, tests, docs, dashboard); left `T7`/`T20` `BLOCKED` pending an owning Offline Kit import host. | Agent |
| 2025-12-15 | Cross-cutting Postgres RLS compatibility: set both `app.tenant_id` and `app.current_tenant` on tenant-scoped connections (shared `StellaOps.Infrastructure.Postgres`). | Agent |
| 2025-12-17 | Unblocked `T7`/`T20` by implementing a Scanner-owned Offline Kit import host; started wiring Prometheus `/metrics` surface and Authority audit emission into import/activation flow. | Agent |

## Decisions & Risks
- **Prometheus exporter choice (Importer):** `T7` is `BLOCKED` because the repo currently has no backend Offline Kit import host (no `src/**` implementation for `POST /api/offline-kit/import`), so there is no clear owning service to expose `/metrics`.
@@ -3,7 +3,7 @@
**Epic:** Time-to-First-Signal (TTFS) Implementation
**Module:** Scheduler, Web UI
**Working Directory:** `src/Scheduler/`, `src/Web/StellaOps.Web/`
**Status:** TODO
**Status:** DOING
**Created:** 2025-12-14
**Target Completion:** TBD
**Depends On:** SPRINT_0340_0001_0001 (FirstSignalCard UI)
@@ -39,7 +39,7 @@ This sprint delivers enhancements to the TTFS system including predictive failur
| T1 | Create `failure_signatures` table | Agent | DONE | Added to scheduler.sql |
| T2 | Create `IFailureSignatureRepository` | Agent | DONE | Interface + Postgres impl |
| T3 | Implement `FailureSignatureIndexer` | Agent | DONE | Background indexer service |
| T4 | Integrate signatures into FirstSignal | — | BLOCKED | Requires cross-module integration design (Orchestrator -> Scheduler). Added GetBestMatchAsync to IFailureSignatureRepository. Need abstraction/client pattern. |
| T4 | Integrate signatures into FirstSignal | — | DOING | Implement Scheduler WebService endpoint + Orchestrator client to surface best-match failure signature as `lastKnownOutcome` in FirstSignal response. |
| T5 | Add "Verify locally" commands to EvidencePanel | Agent | DONE | Copy affordances |
| T6 | Create ProofSpine sub-component | Agent | DONE | Bundle hashes |
| T7 | Create verification command templates | Agent | DONE | Cosign/Rekor |
@@ -1903,6 +1903,7 @@ export async function setupPlaywrightDeterministic(page: Page): Promise<void> {
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-16 | T4: Added `GetBestMatchAsync` to `IFailureSignatureRepository` and implemented in Postgres repository. Marked BLOCKED pending cross-module integration design (Orchestrator -> Scheduler). | Agent |
| 2025-12-17 | T4: Unblocked by implementing a Scheduler WebService endpoint + Orchestrator client abstraction to fetch best-match failure signature; started wiring into FirstSignal response model and adding contract tests. | Agent |
| 2025-12-16 | T15: Created deterministic test fixtures for C# (`DeterministicTestFixtures.cs`) and TypeScript (`deterministic-fixtures.ts`) with frozen timestamps, seeded RNG, and pre-generated UUIDs. | Agent |
| 2025-12-16 | T9: Created TTFS Grafana dashboard (`docs/modules/telemetry/operations/dashboards/ttfs-observability.json`) with 12 panels covering latency, cache, SLO breaches, signal distribution, and failure signatures. | Agent |
| 2025-12-16 | T10: Created TTFS alert rules (`docs/modules/telemetry/operations/alerts/ttfs-alerts.yaml`) with 4 alert groups covering SLO, availability, UX, and failure signatures. | Agent |
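The T15 log entry above mentions fixtures with frozen timestamps and a seeded RNG. The contents of `deterministic-fixtures.ts` are not part of this diff, so the following is only a sketch of the general technique: a small seeded generator (mulberry32, chosen here as an assumption) and a clock that always returns the same instant. The names `mulberry32`, `frozenClock`, and `FROZEN_NOW_MS` are hypothetical.

```typescript
// mulberry32: a small 32-bit seeded PRNG. Fine for reproducible test data,
// not suitable for anything security-sensitive.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// A clock that always reports the same instant.
function frozenClock(epochMs: number): () => Date {
  return () => new Date(epochMs);
}

// Hypothetical frozen instant used only for this sketch.
const FROZEN_NOW_MS = Date.UTC(2025, 0, 1);

// Two generators with the same seed yield the same sequence.
const rngA = mulberry32(42);
const rngB = mulberry32(42);
const sequenceA = [rngA(), rngA(), rngA()];
const sequenceB = [rngB(), rngB(), rngB()];
const now = frozenClock(FROZEN_NOW_MS);
```

Passing the clock and generator into the code under test, instead of calling `Date.now()` or `Math.random()` directly, is what makes snapshot and e2e assertions stable.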
@@ -61,7 +61,7 @@ Per advisory §5:
| T5 | Implement SBOM collector (CycloneDX, SPDX) | DONE | Agent | `CycloneDxParser`, `SpdxParser`, `SbomParserFactory`, `SbomCollector` in Reconciliation/Parsers. |
| T6 | Implement attestation collector | DONE | Agent | `IAttestationParser`, `DsseAttestationParser`, `AttestationCollector` in Reconciliation/Parsers. |
| T7 | Integrate with `DsseVerifier` for validation | DONE | Agent | `AttestationCollector` integrates with `DsseVerifier` for DSSE signature verification. |
| T8 | Integrate with Rekor offline verifier | BLOCKED | Agent | Rekor offline verifier not found in AirGap module. Attestor module has online RekorBackend. Need offline Merkle proof verifier. |
| T8 | Integrate with Rekor offline verifier | DOING | Agent | Implement offline Rekor receipt verifier (Merkle inclusion + checkpoint signature) and wire into AttestationCollector when `VerifyRekorProofs=true`. |
| **Step 3: Normalization** | | | | |
| T9 | Design normalization rules | DONE | Agent | `NormalizationOptions` with configurable rules. |
| T10 | Implement stable JSON sorting | DONE | Agent | `JsonNormalizer.NormalizeObject()` with ordinal key sorting. |
@@ -77,10 +77,10 @@ Per advisory §5:
| T18 | Design `EvidenceGraph` schema | DONE | Agent | `EvidenceGraph`, `EvidenceNode`, `EvidenceEdge` models. |
| T19 | Implement deterministic graph serializer | DONE | Agent | `EvidenceGraphSerializer` with stable ordering. |
| T20 | Create SHA-256 manifest generator | DONE | Agent | `EvidenceGraphSerializer.ComputeHash()` writes `evidence-graph.sha256`. |
| T21 | Integrate DSSE signing for output | BLOCKED | Agent | Signer module (`StellaOps.Signer`) is separate from AirGap. Need cross-module integration pattern or abstraction. |
| T21 | Integrate DSSE signing for output | DOING | Agent | Implement local DSSE signing of `evidence-graph.json` using `StellaOps.Attestor.Envelope` + ECDSA PEM key option; keep output deterministic. |
| **Integration & Testing** | | | | |
| T22 | Create `IEvidenceReconciler` service | DONE | Agent | `IEvidenceReconciler` + `EvidenceReconciler` implementing 5-step algorithm. |
| T23 | Wire to CLI `verify offline` command | BLOCKED | Agent | CLI module (`StellaOps.Cli`) is separate from AirGap. Sprint 0339 covers CLI offline commands. |
| T23 | Wire to CLI `verify offline` command | DOING | Agent | CLI `verify offline` calls reconciler and returns deterministic pass/fail + violations; shared policy loader. |
| T24 | Write golden-file tests | DONE | Agent | `CycloneDxParserTests`, `SpdxParserTests`, `DsseAttestationParserTests` with fixtures. |
| T25 | Write property-based tests | DONE | Agent | `SourcePrecedenceLatticePropertyTests` verifying lattice algebraic properties. |
| T26 | Update documentation | DONE | Agent | Created `docs/modules/airgap/evidence-reconciliation.md`. |
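Rows T19 and T20 above rely on a serializer with stable ordering followed by a SHA-256 digest, so that equal graphs always hash to the same value. The sketch below illustrates that idea in TypeScript. It is not `EvidenceGraphSerializer`: `canonicalJson` and `sha256Hex` are hypothetical helpers, and the only property borrowed from the table is ordinal key sorting before hashing.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serializes with object keys in ordinal (UTF-16 code unit) order.
// The string is built by hand because re-inserting keys into a JS object
// would still enumerate integer-like keys ("2", "10") in numeric order.
function canonicalJson(value: Json): string {
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalJson).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const keys = Object.keys(value).sort(); // default sort compares code units
    const members = keys.map((key) => JSON.stringify(key) + ":" + canonicalJson(value[key]));
    return "{" + members.join(",") + "}";
  }
  return JSON.stringify(value);
}

// Hex-encoded SHA-256 of the UTF-8 bytes of `text`.
function sha256Hex(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Key order in the input does not affect the digest.
const digestA = sha256Hex(canonicalJson({ nodes: ["a", "b"], edges: [] }));
const digestB = sha256Hex(canonicalJson({ edges: [], nodes: ["a", "b"] }));
```

Array order is preserved as given, so a real serializer also needs a defined ordering for nodes and edges. The diff says `EvidenceGraphSerializer` has one but does not say what it is.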
@@ -984,6 +984,7 @@ public sealed record ReconciliationResult(
| 2025-12-16 | Implemented property-based tests for lattice algebraic properties (`T25`): commutativity, associativity, idempotence, absorption laws, and merge determinism. | Agent |
| 2025-12-16 | Created evidence reconciliation documentation (`T26`) in `docs/modules/airgap/evidence-reconciliation.md`. | Agent |
| 2025-12-16 | Integrated DsseVerifier into AttestationCollector (`T7`). Marked T8, T21, T23 as BLOCKED pending cross-module integration patterns. | Agent |
| 2025-12-17 | Unblocked T8/T21/T23 by implementing an offline Rekor receipt verifier contract + local DSSE signing path, and wiring reconciliation into CLI `verify offline`. | Agent |

## Decisions & Risks
- **Rekor offline verifier dependency:** `T8` depends on an offline Rekor inclusion proof verifier contract/library (see `docs/implplan/SPRINT_3000_0001_0001_rekor_merkle_proof_verification.md`).
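The offline Rekor verifier discussed above rests on Merkle inclusion proofs. As background, the sketch below follows the inclusion-proof check described in RFC 6962 / RFC 9162 (leaf prefix `0x00`, interior-node prefix `0x01`, SHA-256). It is not the StellaOps verifier: the function names are hypothetical, and checkpoint signature verification, which the sprint also requires, is deliberately left out.

```typescript
import { createHash } from "node:crypto";

function sha256(...parts: Buffer[]): Buffer {
  const hash = createHash("sha256");
  for (const part of parts) hash.update(part);
  return hash.digest();
}

// RFC 6962 domain separation: 0x00 for leaves, 0x01 for interior nodes.
function leafHash(data: Buffer): Buffer {
  return sha256(Buffer.from([0x00]), data);
}

function nodeHash(left: Buffer, right: Buffer): Buffer {
  return sha256(Buffer.from([0x01]), left, right);
}

// Recomputes the root from a leaf hash and its audit path, then compares it
// with the expected root. Returns false on any mismatch.
function verifyInclusion(
  leafIndex: number,
  treeSize: number,
  leafHashValue: Buffer,
  proof: Buffer[],
  expectedRoot: Buffer,
): boolean {
  if (!Number.isSafeInteger(leafIndex) || !Number.isSafeInteger(treeSize)) return false;
  if (leafIndex < 0 || leafIndex >= treeSize) return false;

  let fn = leafIndex;
  let sn = treeSize - 1;
  let running = leafHashValue;

  for (const sibling of proof) {
    if (sn === 0) return false; // proof is longer than the tree is deep
    if (fn % 2 === 1 || fn === sn) {
      // The current node is a right child (or the last node on its level).
      running = nodeHash(sibling, running);
      // Skip levels where the node has no sibling.
      while (fn % 2 === 0 && fn !== 0) {
        fn = Math.floor(fn / 2);
        sn = Math.floor(sn / 2);
      }
    } else {
      running = nodeHash(running, sibling);
    }
    fn = Math.floor(fn / 2);
    sn = Math.floor(sn / 2);
  }
  return sn === 0 && running.equals(expectedRoot);
}
```

A root that matches is only meaningful if the root itself is trusted, which is why the sprint pairs inclusion verification with checkpoint signature verification.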
@@ -4,7 +4,7 @@
**Feature:** Centralized rate limiting for Stella Router as standalone product
**Advisory Source:** `docs/product-advisories/unprocessed/15-Dec-2025 - Designing 202 + Retry‑After Backpressure Control.md`
**Owner:** Router Team
**Status:** PLANNING → READY FOR IMPLEMENTATION
**Status:** DOING (Sprints 1–3 DONE; Sprint 4 DONE (N/A); Sprint 5 DOING; Sprint 6 TODO)
**Priority:** HIGH - Core feature for Router product
**Target Completion:** 6 weeks (4 weeks implementation + 2 weeks rollout)

@@ -61,10 +61,10 @@ Each target can have multiple rules (AND logic):
| Sprint | IMPLID | Duration | Focus | Status |
|--------|--------|----------|-------|--------|
| **Sprint 1** | 1200_001_001 | 5-7 days | Core router rate limiting | DONE |
| **Sprint 2** | 1200_001_002 | 2-3 days | Per-route granularity | TODO |
| **Sprint 3** | 1200_001_003 | 2-3 days | Rule stacking (multiple windows) | TODO |
| **Sprint 4** | 1200_001_004 | 3-4 days | Service migration (AdaptiveRateLimiter) | TODO |
| **Sprint 5** | 1200_001_005 | 3-5 days | Comprehensive testing | TODO |
| **Sprint 2** | 1200_001_002 | 2-3 days | Per-route granularity | DONE |
| **Sprint 3** | 1200_001_003 | 2-3 days | Rule stacking (multiple windows) | DONE |
| **Sprint 4** | 1200_001_004 | 3-4 days | Service migration (AdaptiveRateLimiter) | DONE (N/A) |
| **Sprint 5** | 1200_001_005 | 3-5 days | Comprehensive testing | DOING |
| **Sprint 6** | 1200_001_006 | 2 days | Documentation & rollout prep | TODO |

**Total Implementation:** 17-24 days
@@ -161,41 +161,38 @@ Each target can have multiple rules (AND logic):
## Delivery Tracker

### Sprint 1: Core Router Rate Limiting
- [ ] TODO: Rate limit abstractions
- [ ] TODO: Valkey backend implementation
- [ ] TODO: Middleware integration
- [ ] TODO: Metrics and observability
- [ ] TODO: Configuration schema
- [x] Rate limit abstractions
- [x] Valkey backend implementation (Lua, fixed-window)
- [x] Middleware integration (router pipeline)
- [x] Metrics and observability
- [x] Configuration schema (rules + legacy compatibility)

### Sprint 2: Per-Route Granularity
- [ ] TODO: Route pattern matching
- [ ] TODO: Configuration extension
- [ ] TODO: Inheritance resolution
- [ ] TODO: Route-level testing
- [x] Route pattern matching (exact/prefix/regex, specificity rules)
- [x] Configuration extension (`routes` under microservices)
- [x] Inheritance resolution (environment → microservice → route)
- [x] Route-level testing (unit tests)

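The Sprint 2 items name two mechanisms: choosing the most specific matching route pattern, and resolving a limit through the environment → microservice → route chain. The sketch below illustrates both under stated assumptions. The specificity order used here (exact over prefix over regex, with the longer pattern winning ties) is an assumption for illustration, since the diff does not spell out the router's actual rules, and `RouteRule`, `mostSpecific`, and `resolveLimit` are hypothetical names.

```typescript
type MatchKind = "exact" | "prefix" | "regex";

interface RouteRule {
  kind: MatchKind;
  pattern: string;
  maxRequests?: number; // undefined means "inherit from the parent level"
}

// Assumed ranking: exact beats prefix, prefix beats regex.
const KIND_RANK: { [K in MatchKind]: number } = { exact: 2, prefix: 1, regex: 0 };

function routeMatches(rule: RouteRule, path: string): boolean {
  switch (rule.kind) {
    case "exact":
      return path === rule.pattern;
    case "prefix":
      return path.startsWith(rule.pattern);
    case "regex":
      return new RegExp(rule.pattern).test(path);
  }
}

// Positive when `a` is more specific than `b`.
function compareSpecificity(a: RouteRule, b: RouteRule): number {
  const byKind = KIND_RANK[a.kind] - KIND_RANK[b.kind];
  return byKind !== 0 ? byKind : a.pattern.length - b.pattern.length;
}

function mostSpecific(rules: RouteRule[], path: string): RouteRule | undefined {
  let best: RouteRule | undefined;
  for (const rule of rules) {
    if (!routeMatches(rule, path)) continue;
    if (best === undefined || compareSpecificity(rule, best) > 0) best = rule;
  }
  return best;
}

// Inheritance: the most specific level that defines a limit wins.
function resolveLimit(environment: number, microservice?: number, route?: number): number {
  return route ?? microservice ?? environment;
}

const routes: RouteRule[] = [
  { kind: "prefix", pattern: "/api", maxRequests: 100 },
  { kind: "prefix", pattern: "/api/scan", maxRequests: 20 },
  { kind: "regex", pattern: "^/api/.*" },
  { kind: "exact", pattern: "/api/scan/run", maxRequests: 1 },
];
```

With these rules, `/api/scan/run` resolves to the exact rule and `/api/scan/other` to the longer prefix. The regex rule defines no limit of its own, so a path that matched only it would inherit from the microservice or environment level.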
### Sprint 3: Rule Stacking
- [ ] TODO: Multi-rule configuration
- [ ] TODO: AND logic evaluation
- [ ] TODO: Lua script enhancement
- [ ] TODO: Retry-After calculation
- [x] Multi-rule configuration (`rules[]` with legacy compatibility)
- [x] AND logic evaluation (instance + environment)
- [x] Lua script enhancement (multi-rule evaluation)
- [x] Retry-After calculation (most restrictive)

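The Sprint 3 items describe stacked rules evaluated with AND logic and a Retry-After taken from the most restrictive rule. The sketch below shows that evaluation with in-memory fixed-window counters. It stands in for the Valkey/Lua backend named in Sprint 1 and is not that implementation; `Rule`, `Decision`, and `FixedWindowLimiter` are hypothetical names.

```typescript
interface Rule {
  windowSeconds: number;
  maxRequests: number;
}

interface Decision {
  allowed: boolean;
  retryAfterSeconds: number; // 0 when allowed
}

// In-memory stand-in for the shared counter store. Old windows are never
// evicted here; a real backend would expire them.
class FixedWindowLimiter {
  private readonly counters = new Map<string, number>();

  private counterKey(key: string, rule: Rule, nowSeconds: number): string {
    const windowStart = Math.floor(nowSeconds / rule.windowSeconds) * rule.windowSeconds;
    return `${key}:${rule.windowSeconds}:${windowStart}`;
  }

  check(key: string, rules: Rule[], nowSeconds: number): Decision {
    // AND logic: every rule must have capacity. When several are exhausted,
    // report the longest wait so the client does not retry too early.
    let retryAfter = 0;
    for (const rule of rules) {
      const used = this.counters.get(this.counterKey(key, rule, nowSeconds)) ?? 0;
      if (used >= rule.maxRequests) {
        const windowStart = Math.floor(nowSeconds / rule.windowSeconds) * rule.windowSeconds;
        retryAfter = Math.max(retryAfter, windowStart + rule.windowSeconds - nowSeconds);
      }
    }
    if (retryAfter > 0) {
      return { allowed: false, retryAfterSeconds: Math.ceil(retryAfter) };
    }
    // Only count the request once all rules have admitted it.
    for (const rule of rules) {
      const counterKey = this.counterKey(key, rule, nowSeconds);
      this.counters.set(counterKey, (this.counters.get(counterKey) ?? 0) + 1);
    }
    return { allowed: true, retryAfterSeconds: 0 };
  }
}

// Two stacked windows: 2 requests per second and 3 requests per minute.
const stackedRules: Rule[] = [
  { windowSeconds: 1, maxRequests: 2 },
  { windowSeconds: 60, maxRequests: 3 },
];
```

Checking all rules before incrementing any counter keeps a rejected request from consuming budget in the rules that still had capacity. In a shared store the check and increment must be atomic, which is presumably why the backend uses a Lua script.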
### Sprint 4: Service Migration
- [ ] TODO: Extract Orchestrator configs
- [ ] TODO: Add to Router config
- [ ] TODO: Refactor AdaptiveRateLimiter
- [ ] TODO: Integration validation
- [x] Closed as N/A (no Orchestrator ingress wiring found); see `docs/implplan/SPRINT_1200_001_004_router_rate_limiting_service_migration.md`

### Sprint 5: Comprehensive Testing
- [ ] TODO: Unit test suite
- [ ] TODO: Integration test suite
- [ ] TODO: Load tests (k6)
- [ ] TODO: Configuration matrix tests
- [x] Unit test suite (core + routes + rules)
- [ ] Integration test suite (Valkey/Testcontainers) — see `docs/implplan/SPRINT_1200_001_005_router_rate_limiting_tests.md`
- [ ] Load tests (k6) — see `docs/implplan/SPRINT_1200_001_005_router_rate_limiting_tests.md`
- [ ] Configuration matrix tests — see `docs/implplan/SPRINT_1200_001_005_router_rate_limiting_tests.md`

### Sprint 6: Documentation
- [ ] TODO: Architecture docs
- [ ] TODO: Configuration guide
- [ ] TODO: Operational runbook
- [ ] TODO: Migration guide
- [ ] Architecture docs — see `docs/implplan/SPRINT_1200_001_006_router_rate_limiting_docs.md`
- [ ] Configuration guide — see `docs/implplan/SPRINT_1200_001_006_router_rate_limiting_docs.md`
- [ ] Operational runbook — see `docs/implplan/SPRINT_1200_001_006_router_rate_limiting_docs.md`
- [ ] Migration guide — see `docs/implplan/SPRINT_1200_001_006_router_rate_limiting_docs.md`

---

@@ -214,9 +211,11 @@ Each target can have multiple rules (AND logic):
|
||||
## Related Documentation
|
||||
|
||||
- **Advisory:** `docs/product-advisories/unprocessed/15-Dec-2025 - Designing 202 + Retry‑After Backpressure Control.md`
|
||||
- **Plan:** `C:\Users\VladimirMoushkov\.claude\plans\vectorized-kindling-rocket.md`
|
||||
- **Implementation:** `src/__Libraries/StellaOps.Router.Gateway/RateLimit/`
|
||||
- **Tests:** `tests/StellaOps.Router.Gateway.Tests/`
|
||||
- **Implementation Guides:** `docs/implplan/SPRINT_1200_001_00X_*.md` (see below)
|
||||
- **Architecture:** `docs/modules/router/rate-limiting.md` (to be created)
|
||||
- **Sprints:** `docs/implplan/SPRINT_1200_001_004_router_rate_limiting_service_migration.md`, `docs/implplan/SPRINT_1200_001_005_router_rate_limiting_tests.md`, `docs/implplan/SPRINT_1200_001_006_router_rate_limiting_docs.md`
|
||||
- **Docs:** `docs/router/rate-limiting-routes.md`
|
||||
|
||||
---
|
||||
|
||||
@@ -233,19 +232,12 @@ Each target can have multiple rules (AND logic):
|
||||
|
||||
| Date | Status | Notes |
|
||||
|------|--------|-------|
|
||||
| 2025-12-17 | PLANNING | Sprint plan created from advisory analysis |
|
||||
| TBD | READY | All sprint files and docs created, ready for implementation |
|
||||
| TBD | IN_PROGRESS | Sprint 1 started |
|
||||
| 2025-12-17 | DOING | Sprints 1–3 DONE; Sprint 4 closed N/A; Sprint 5 tests started; Sprint 6 docs pending. |
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. ✅ Create master sprint tracker (this file)
|
||||
2. ⏳ Create individual sprint files with detailed tasks
|
||||
3. ⏳ Create implementation guide with technical details
|
||||
4. ⏳ Create configuration reference
|
||||
5. ⏳ Create testing strategy document
|
||||
6. ⏳ Review with Architecture Guild
|
||||
7. ⏳ Assign to implementation agent
|
||||
8. ⏳ Begin Sprint 1
|
||||
1. Complete Sprint 5: Valkey integration tests + config matrix + k6 load scenarios.
|
||||
2. Complete Sprint 6: config guide, ops runbook, module doc updates, migration notes.
|
||||
3. Mark this master tracker DONE after Sprint 5/6 close.
|
||||
|
||||
@@ -4,7 +4,9 @@
|
||||
**Sprint Duration:** 5-7 days
|
||||
**Priority:** HIGH
|
||||
**Dependencies:** None
|
||||
**Blocks:** Sprint 2, 3, 4, 5, 6
|
||||
**Status:** DONE
|
||||
**Blocks:** Sprint 4, 5, 6
|
||||
**Evidence:** `src/__Libraries/StellaOps.Router.Gateway/RateLimit/`, `tests/StellaOps.Router.Gateway.Tests/`
|
||||
|
||||
---
|
||||
|
||||
@@ -1137,15 +1139,23 @@ rate_limiting:
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] Configuration loads from YAML correctly
|
||||
- [ ] Instance limiter enforces limits (in-memory, fast)
|
||||
- [ ] Environment limiter enforces limits (Valkey-backed)
|
||||
- [ ] 429 + Retry-After response format correct
|
||||
- [ ] Circuit breaker handles Valkey failures (fail-open)
|
||||
- [ ] Activation gate skips Valkey under low traffic
|
||||
- [ ] Metrics exported to OpenTelemetry
|
||||
- [ ] All unit tests pass (>90% coverage)
|
||||
- [ ] Integration tests pass (TestServer + Testcontainers)
|
||||
- [x] Configuration loads from YAML correctly
|
||||
- [x] Instance limiter enforces limits (in-memory, fast)
|
||||
- [x] Environment limiter enforces limits (Valkey-backed)
|
||||
- [x] 429 + Retry-After response format correct
|
||||
- [x] Circuit breaker handles Valkey failures (fail-open)
|
||||
- [x] Activation gate skips Valkey under low traffic
|
||||
- [x] Metrics exported to OpenTelemetry
|
||||
- [x] All unit tests pass
|
||||
- [x] Integration tests pass (middleware response + Valkey/Testcontainers) (Sprint 5)
|
||||
|
||||
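The fail-open criterion above can be sketched as follows. This is a simplified illustration, not the landed `CircuitBreaker.cs`: the `FailOpenLimiter` name, the failure threshold, and the cooldown parameter are all assumptions. The idea is that a backend failure never blocks traffic, and after repeated failures the backend is skipped entirely for a cooldown period.

```python
from typing import Callable


class FailOpenLimiter:
    """Wraps a shared-store check; allows requests when the store is failing."""

    def __init__(
        self,
        check: Callable[[str], bool],   # may raise ConnectionError on backend failure
        failure_threshold: int = 3,     # hypothetical default
        open_seconds: float = 30.0,     # hypothetical default
    ) -> None:
        self._check = check
        self._threshold = failure_threshold
        self._open_seconds = open_seconds
        self._failures = 0
        self._open_until = 0.0

    def allow(self, key: str, now: float) -> bool:
        if now < self._open_until:
            # Breaker is open: do not touch the backend, let the request through.
            return True
        try:
            allowed = self._check(key)
        except ConnectionError:
            self._failures += 1
            if self._failures >= self._threshold:
                self._open_until = now + self._open_seconds
                self._failures = 0
            return True  # fail open on a single failure as well
        self._failures = 0
        return allowed
```

Passing `now` explicitly keeps the sketch deterministic under test; production code would normally read a monotonic clock instead.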
---
|
||||
|
||||
## Execution Log
|
||||
|
||||
| Date (UTC) | Update | Owner |
|
||||
| --- | --- | --- |
|
||||
| 2025-12-17 | Marked sprint DONE; implemented Valkey-backed multi-rule limiter, fixed instance sliding window counter, updated middleware order, and added unit tests. | Automation |
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -4,7 +4,9 @@
|
||||
**Sprint Duration:** 2-3 days
|
||||
**Priority:** HIGH
|
||||
**Dependencies:** Sprint 1 (Core implementation)
|
||||
**Blocks:** Sprint 5 (Testing needs routes)
|
||||
**Status:** DONE
|
||||
**Blocks:** Sprint 5 (additional integration/load testing)
|
||||
**Evidence:** `src/__Libraries/StellaOps.Router.Gateway/RateLimit/`, `docs/router/rate-limiting-routes.md`, `tests/StellaOps.Router.Gateway.Tests/`
|
||||
|
||||
---
|
||||
|
||||
@@ -652,14 +654,22 @@ policy:
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] Route configuration models created
|
||||
- [ ] Route matching works (exact, prefix, regex)
|
||||
- [ ] Specificity resolution correct
|
||||
- [ ] Inheritance works (global → microservice → route)
|
||||
- [ ] Integration with RateLimitService complete
|
||||
- [ ] Unit tests pass (>90% coverage)
|
||||
- [ ] Integration tests pass
|
||||
- [ ] Documentation complete
|
||||
- [x] Route configuration models created
|
||||
- [x] Route matching works (exact, prefix, regex)
|
||||
- [x] Specificity resolution correct
|
||||
- [x] Inheritance works (global → microservice → route)
|
||||
- [x] Integration with RateLimitService complete
|
||||
- [x] Unit tests pass
|
||||
- [x] Integration tests pass (covered in Sprint 5)
|
||||
- [x] Documentation complete
|
||||
|
||||
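The inheritance criterion above can be pictured with a small sketch. It assumes replacement semantics (a level that defines rules replaces its parent's rules rather than merging with them) and treats `None` as "not configured, inherit"; both are assumptions, and the `resolve_rules` helper is hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# A rule is modelled here as (window_seconds, max_requests).
RuleSpec = Tuple[int, int]


def resolve_rules(
    global_rules: Optional[Sequence[RuleSpec]],
    microservice_rules: Optional[Sequence[RuleSpec]] = None,
    route_rules: Optional[Sequence[RuleSpec]] = None,
) -> List[RuleSpec]:
    """Return the rules of the most specific level that is configured."""
    for level in (route_rules, microservice_rules, global_rules):
        if level is not None:  # None = inherit; an empty list is an explicit "no rules"
            return list(level)
    return []
```

Under these assumptions a route that sets its own rules is unaffected by later changes to the microservice or global defaults, which keeps the effective limits easy to predict from the configuration alone.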
---
|
||||
|
||||
## Execution Log
|
||||
|
||||
| Date (UTC) | Update | Owner |
|
||||
| --- | --- | --- |
|
||||
| 2025-12-17 | Marked sprint DONE; implemented route config + matching + inheritance resolution; integrated into RateLimitService; added unit tests and docs. | Automation |
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -4,7 +4,9 @@
|
||||
**Sprint Duration:** 2-3 days
|
||||
**Priority:** HIGH
|
||||
**Dependencies:** Sprint 1 (Core), Sprint 2 (Routes)
|
||||
**Blocks:** Sprint 5 (Testing)
|
||||
**Status:** DONE
|
||||
**Blocks:** Sprint 5 (additional integration/load testing)
|
||||
**Evidence:** `src/__Libraries/StellaOps.Router.Gateway/RateLimit/`, `tests/StellaOps.Router.Gateway.Tests/`
|
||||
|
||||
---
|
||||
|
||||
@@ -463,14 +465,22 @@ public List<RateLimitRule> ResolveRulesForRoute(string microservice, string? rou
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] Configuration supports rule arrays
|
||||
- [ ] Backward compatible with legacy single-window config
|
||||
- [ ] Instance limiter evaluates all rules (AND logic)
|
||||
- [ ] Valkey Lua script handles multiple windows
|
||||
- [ ] Most restrictive Retry-After returned
|
||||
- [ ] Inheritance resolver merges rules correctly
|
||||
- [ ] Unit tests pass
|
||||
- [ ] Integration tests pass (Testcontainers)
|
||||
- [x] Configuration supports rule arrays
|
||||
- [x] Backward compatible with legacy single-window config
|
||||
- [x] Instance limiter evaluates all rules (AND logic)
|
||||
- [x] Valkey Lua script handles multiple windows
|
||||
- [x] Most restrictive Retry-After returned
|
||||
- [x] Inheritance resolver merges rules correctly
|
||||
- [x] Unit tests pass
|
||||
- [x] Integration tests pass (Valkey/Testcontainers) (Sprint 5)
|
||||
|
||||
---
|
||||
|
||||
## Execution Log
|
||||
|
||||
| Date (UTC) | Update | Owner |
|
||||
| --- | --- | --- |
|
||||
| 2025-12-17 | Marked sprint DONE; implemented rule arrays and multi-window evaluation for instance + environment (Valkey Lua); added unit tests. | Automation |
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,36 @@
|
||||
# Sprint 1200_001_004 · Router Rate Limiting · Service Migration (AdaptiveRateLimiter)
|
||||
|
||||
## Topic & Scope
|
||||
- Close the planned migration of `AdaptiveRateLimiter` (Orchestrator) into Router rate limiting.
|
||||
- Confirm whether any production HTTP paths still enforce service-level rate limiting and therefore require migration.
|
||||
- **Working directory:** `src/Orchestrator/StellaOps.Orchestrator`.
|
||||
- **Evidence:** `src/__Libraries/StellaOps.Router.Gateway/RateLimit/` (router limiter exists); an Orchestrator code search indicates that `AdaptiveRateLimiter` is not wired into HTTP ingress (library-only).
|
||||
|
||||
## Dependencies & Concurrency
|
||||
- Depends on: `SPRINT_1200_001_001`, `SPRINT_1200_001_002`, `SPRINT_1200_001_003` (rate limiting landed in Router).
|
||||
- Safe to execute in parallel with Sprint 5/6 since no code changes are required for this closure.
|
||||
|
||||
## Documentation Prerequisites
|
||||
- `docs/implplan/SPRINT_1200_001_000_router_rate_limiting_master.md`
|
||||
- `docs/modules/router/architecture.md`
|
||||
- `docs/modules/orchestrator/architecture.md`
|
||||
|
||||
## Delivery Tracker
|
||||
| # | Task ID | Status | Key dependency / next step | Owners | Task Definition |
|
||||
| --- | --- | --- | --- | --- | --- |
|
||||
| 1 | RRL-04-001 | DONE | N/A | Router · Orchestrator | Inventory usage of `AdaptiveRateLimiter` and any service-level HTTP rate limiting in Orchestrator ingress. |
|
||||
| 2 | RRL-04-002 | DONE | N/A | Router · Architecture | Decide migration outcome: migrate, defer, or close as N/A based on inventory. |
|
||||
| 3 | RRL-04-003 | DONE | Update master tracker | Router | Update `SPRINT_1200_001_000_router_rate_limiting_master.md` to reflect closure outcome. |
|
||||
|
||||
## Execution Log
|
||||
| Date (UTC) | Update | Owner |
|
||||
| --- | --- | --- |
|
||||
| 2025-12-17 | Sprint created and closed as N/A: `AdaptiveRateLimiter` appears to be a library-only component in Orchestrator (tests + core) and is not wired into HTTP ingress; no service-level HTTP rate limiting was found to migrate. | Automation |
|
||||
|
||||
## Decisions & Risks
|
||||
- **Decision:** Close Sprint 4 as N/A (no production wiring found). If Orchestrator (or any service) introduces HTTP-level rate limiting, open a dedicated migration sprint under that service’s working directory.
|
||||
- **Risk:** Double-limiting during a future migration if both service-level and router-level limiters are enabled. Mitigation: migration guide plus staged rollout (shadow mode); remove service-level limiters only after router limits are verified.
|
||||
|
||||
## Next Checkpoints
|
||||
- None (closure sprint).
|
||||
|
||||
@@ -0,0 +1,38 @@
|
||||
# Sprint 1200_001_005 · Router Rate Limiting · Comprehensive Testing
|
||||
|
||||
## Topic & Scope
|
||||
- Add Valkey-backed integration tests for the Lua fixed-window implementation (real Valkey).
|
||||
- Expand deterministic unit coverage via configuration matrix tests (inheritance + routes + rule stacking).
|
||||
- Add k6 load test scenarios for rate limiting (enforcement, retry-after correctness, overhead).
|
||||
- **Working directory:** `tests/`.
|
||||
- **Evidence:** `tests/StellaOps.Router.Gateway.Tests/`, `tests/load/`.
|
||||
|
||||
## Dependencies & Concurrency
|
||||
- Depends on: `SPRINT_1200_001_001`, `SPRINT_1200_001_002`, `SPRINT_1200_001_003` (feature implementation).
|
||||
- Can run in parallel with Sprint 6 docs.
|
||||
|
||||
## Documentation Prerequisites
|
||||
- `docs/implplan/SPRINT_1200_001_IMPLEMENTATION_GUIDE.md`
|
||||
- `docs/router/rate-limiting-routes.md`
|
||||
- `docs/modules/router/architecture.md`
|
||||
|
||||
## Delivery Tracker
|
||||
| # | Task ID | Status | Key dependency / next step | Owners | Task Definition |
|
||||
| --- | --- | --- | --- | --- | --- |
|
||||
| 1 | RRL-05-001 | DONE | Run with `STELLAOPS_INTEGRATION_TESTS=true` | QA · Router | Valkey integration tests validating multi-rule Lua behavior and Retry-After bounds. |
|
||||
| 2 | RRL-05-002 | DONE | Covered by unit tests | QA · Router | Configuration matrix unit tests (inheritance replacement + route specificity + rule stacking). |
|
||||
| 3 | RRL-05-003 | DONE | `tests/load/router-rate-limiting-load-test.js` | QA · Router | k6 load tests for rate limiting scenarios (A–F) and doc updates in `tests/load/README.md`. |
|
||||
|
||||
## Execution Log
|
||||
| Date (UTC) | Update | Owner |
|
||||
| --- | --- | --- |
|
||||
| 2025-12-17 | Sprint created; RRL-05-001 started. | Automation |
|
||||
| 2025-12-17 | Completed RRL-05-001 and RRL-05-002: added Testcontainers-backed Valkey integration tests (opt-in via `STELLAOPS_INTEGRATION_TESTS=true`) and expanded unit coverage for inheritance + activation gate behavior. | Automation |
|
||||
| 2025-12-17 | Completed RRL-05-003: added k6 suite `tests/load/router-rate-limiting-load-test.js` and documented usage in `tests/load/README.md`. | Automation |
|
||||
|
||||
## Decisions & Risks
|
||||
- **Decision:** Integration tests require Docker; they are opt-in (skipped unless explicitly enabled) to keep `dotnet test StellaOps.Router.slnx` runnable without Docker.
|
||||
- **Risk:** Flaky timing around fixed-window boundaries. Mitigation: assert ranges (not exact seconds) and use small windows with slack.
|
||||
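The mitigation above (assert ranges rather than exact seconds) might look like the sketch below. The helper names and the one-second slack are assumptions; the real tests are written against the .NET test project, so this only illustrates the shape of the assertion.

```python
import math


def retry_after_seconds(now: float, window_seconds: int) -> int:
    """Seconds until the current fixed window ends, rounded up."""
    window_end = (math.floor(now / window_seconds) + 1) * window_seconds
    return math.ceil(window_end - now)


def assert_retry_after_in_bounds(observed: int, window_seconds: int, slack: int = 1) -> None:
    """Accept any value a correct limiter could return, plus clock slack."""
    if not (1 <= observed <= window_seconds + slack):
        raise AssertionError(
            f"Retry-After {observed}s outside [1, {window_seconds + slack}]"
        )
```

Because the request can land anywhere inside the window, an exact expected value would depend on scheduling jitter, whereas the range only fails when the limiter returns something that no window position could produce.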
|
||||
## Next Checkpoints
|
||||
- None scheduled; complete tasks and mark sprint DONE.
|
||||
@@ -0,0 +1,41 @@
|
||||
# Sprint 1200_001_006 · Router Rate Limiting · Documentation & Rollout Prep
|
||||
|
||||
## Topic & Scope
|
||||
- Publish user-facing configuration guide and ops runbook for Router rate limiting.
|
||||
- Update Router module docs to reflect the new centralized rate limiting feature and where it sits in the request pipeline.
|
||||
- Add migration guidance to avoid double-limiting during rollout.
|
||||
- **Working directory:** `docs/`.
|
||||
- **Evidence:** `docs/router/`, `docs/operations/`, `docs/modules/router/`.
|
||||
|
||||
## Dependencies & Concurrency
|
||||
- Depends on: `SPRINT_1200_001_001`, `SPRINT_1200_001_002`, `SPRINT_1200_001_003`.
|
||||
- Can run in parallel with Sprint 5 tests.
|
||||
|
||||
## Documentation Prerequisites
|
||||
- `docs/README.md`
|
||||
- `docs/07_HIGH_LEVEL_ARCHITECTURE.md`
|
||||
- `docs/modules/platform/architecture-overview.md`
|
||||
- `docs/modules/router/architecture.md`
|
||||
- `docs/router/rate-limiting-routes.md`
|
||||
|
||||
## Delivery Tracker
|
||||
| # | Task ID | Status | Key dependency / next step | Owners | Task Definition |
|
||||
| --- | --- | --- | --- | --- | --- |
|
||||
| 1 | RRL-06-001 | DONE | Links added | Docs · Router | Architecture updates + links (Router module docs + high-level router docs). |
|
||||
| 2 | RRL-06-002 | DONE | `docs/router/rate-limiting.md` | Docs · Router | User configuration guide: `docs/router/rate-limiting.md` (rules, inheritance, routes, examples). |
|
||||
| 3 | RRL-06-003 | DONE | `docs/operations/router-rate-limiting.md` | Ops · Router | Operational runbook: `docs/operations/router-rate-limiting.md` (dashboards, alerts, rollout, failure modes). |
|
||||
| 4 | RRL-06-004 | DONE | Migration notes published | Router · Docs | Migration guide section: avoid double-limiting, staged rollout, and decommission service-level limiters. |
|
||||
|
||||
## Execution Log
|
||||
| Date (UTC) | Update | Owner |
|
||||
| --- | --- | --- |
|
||||
| 2025-12-17 | Sprint created; awaiting implementation. | Automation |
|
||||
| 2025-12-17 | Started RRL-06-001. | Automation |
|
||||
| 2025-12-17 | Completed RRL-06-001..004: added `docs/router/rate-limiting.md`, `docs/operations/router-rate-limiting.md`, `docs/modules/router/rate-limiting.md`; updated `docs/router/rate-limiting-routes.md`, `docs/modules/router/README.md`, and `docs/modules/router/architecture.md`. | Automation |
|
||||
|
||||
## Decisions & Risks
|
||||
- **Decision:** Keep docs offline-friendly: no external CDNs/snippets; prefer deterministic, copy-pastable YAML fragments.
|
||||
- **Risk:** Confusion during rollout if both router and service rate limiting are enabled. Mitigation: explicit migration guide + recommended rollout phases.
|
||||
|
||||
## Next Checkpoints
|
||||
- None scheduled; complete tasks and mark sprint DONE.
|
||||
@@ -1,13 +1,15 @@
|
||||
# Router Rate Limiting - Implementation Guide
|
||||
|
||||
**For:** Implementation agents executing Sprint 1200_001_001 through 1200_001_006
|
||||
**For:** Implementation agents / reviewers for Sprint 1200_001_001 through 1200_001_006
|
||||
**Status:** DOING (Sprints 1–3 DONE; Sprint 4 closed N/A; Sprints 5–6 in progress)
|
||||
**Evidence:** `src/__Libraries/StellaOps.Router.Gateway/RateLimit/`, `tests/StellaOps.Router.Gateway.Tests/`
|
||||
**Last Updated:** 2025-12-17
|
||||
|
||||
---
|
||||
|
||||
## Purpose
|
||||
|
||||
This guide provides comprehensive technical context for implementing centralized rate limiting in Stella Router. It covers architecture decisions, patterns, gotchas, and operational considerations.
|
||||
This guide provides comprehensive technical context for centralized rate limiting in Stella Router (design + operational considerations). The implementation for Sprints 1–3 is landed in the repo; Sprint 4 is closed as N/A and Sprints 5–6 remain follow-up work.
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -1,14 +1,15 @@
|
||||
# Router Rate Limiting - Sprint Package README
|
||||
|
||||
**Package Created:** 2025-12-17
|
||||
**For:** Implementation agents
|
||||
**For:** Implementation agents / reviewers
|
||||
**Status:** DOING (Sprints 1–3 DONE; Sprint 4 DONE (N/A); Sprint 5 DOING; Sprint 6 TODO)
|
||||
**Advisory Source:** `docs/product-advisories/unprocessed/15-Dec-2025 - Designing 202 + Retry‑After Backpressure Control.md`
|
||||
|
||||
---
|
||||
|
||||
## Package Contents
|
||||
|
||||
This sprint package contains everything needed to implement centralized rate limiting in Stella Router.
|
||||
This sprint package contains the original plan plus the landed implementation for centralized rate limiting in Stella Router.
|
||||
|
||||
### Core Sprint Files
|
||||
|
||||
@@ -18,15 +19,19 @@ This sprint package contains everything needed to implement centralized rate lim
|
||||
| `SPRINT_1200_001_001_router_rate_limiting_core.md` | Sprint 1: Core implementation | Implementer - 5-7 days |
|
||||
| `SPRINT_1200_001_002_router_rate_limiting_per_route.md` | Sprint 2: Per-route granularity | Implementer - 2-3 days |
|
||||
| `SPRINT_1200_001_003_router_rate_limiting_rule_stacking.md` | Sprint 3: Rule stacking | Implementer - 2-3 days |
|
||||
| `SPRINT_1200_001_004_router_rate_limiting_service_migration.md` | Sprint 4: Service migration (closed N/A) | Project manager / reviewer |
|
||||
| `SPRINT_1200_001_005_router_rate_limiting_tests.md` | Sprint 5: Comprehensive testing | QA / implementer |
|
||||
| `SPRINT_1200_001_006_router_rate_limiting_docs.md` | Sprint 6: Documentation & rollout prep | Docs / implementer |
|
||||
| `SPRINT_1200_001_IMPLEMENTATION_GUIDE.md` | Technical reference | **READ FIRST** before coding |
|
||||
|
||||
### Documentation Files (To Be Created in Sprint 6)
|
||||
### Documentation Files
|
||||
|
||||
| File | Purpose | Created In |
|
||||
|------|---------|------------|
|
||||
| `docs/router/rate-limiting-routes.md` | Per-route configuration guide | Sprint 2 |
|
||||
| `docs/router/rate-limiting.md` | User-facing configuration guide | Sprint 6 |
|
||||
| `docs/operations/router-rate-limiting.md` | Operational runbook | Sprint 6 |
|
||||
| `docs/modules/router/architecture.md` | Architecture documentation | Sprint 6 |
|
||||
| `docs/modules/router/rate-limiting.md` | Module-level rate-limiting dossier | Sprint 6 |
|
||||
|
||||
---
|
||||
|
||||
@@ -306,6 +311,38 @@ Copy this to master tracker and update as you progress:
|
||||
|
||||
## File Structure (After Implementation)
|
||||
|
||||
### Actual (landed)
|
||||
|
||||
```
|
||||
src/__Libraries/StellaOps.Router.Gateway/RateLimit/
|
||||
CircuitBreaker.cs
|
||||
EnvironmentRateLimiter.cs
|
||||
InMemoryValkeyRateLimitStore.cs
|
||||
InstanceRateLimiter.cs
|
||||
LimitInheritanceResolver.cs
|
||||
RateLimitConfig.cs
|
||||
RateLimitDecision.cs
|
||||
RateLimitMetrics.cs
|
||||
RateLimitMiddleware.cs
|
||||
RateLimitRule.cs
|
||||
RateLimitRouteMatcher.cs
|
||||
RateLimitService.cs
|
||||
RateLimitServiceCollectionExtensions.cs
|
||||
ValkeyRateLimitStore.cs
|
||||
|
||||
tests/StellaOps.Router.Gateway.Tests/
|
||||
LimitInheritanceResolverTests.cs
|
||||
InMemoryValkeyRateLimitStoreTests.cs
|
||||
InstanceRateLimiterTests.cs
|
||||
RateLimitConfigTests.cs
|
||||
RateLimitRouteMatcherTests.cs
|
||||
RateLimitServiceTests.cs
|
||||
|
||||
docs/router/rate-limiting-routes.md
|
||||
```
|
||||
|
||||
### Original plan (reference)
|
||||
|
||||
```
|
||||
src/__Libraries/StellaOps.Router.Gateway/
|
||||
├── RateLimit/
|
||||
@@ -351,8 +388,8 @@ __Tests/
|
||||
│ ├── RouteMatchingTests.cs
|
||||
│ └── InheritanceResolverTests.cs
|
||||
|
||||
tests/load/k6/
|
||||
└── rate-limit-scenarios.js
|
||||
tests/load/
|
||||
└── router-rate-limiting-load-test.js
|
||||
```
|
||||
|
||||
---
|
||||
@@ -443,7 +480,9 @@ rate_limiting:
|
||||
- **Sprint 1:** `SPRINT_1200_001_001_router_rate_limiting_core.md`
|
||||
- **Sprint 2:** `SPRINT_1200_001_002_router_rate_limiting_per_route.md`
|
||||
- **Sprint 3:** `SPRINT_1200_001_003_router_rate_limiting_rule_stacking.md`
|
||||
- **Sprint 4-6:** To be created by implementer (templates in master tracker)
|
||||
- **Sprint 4:** `SPRINT_1200_001_004_router_rate_limiting_service_migration.md` (closed N/A)
|
||||
- **Sprint 5:** `SPRINT_1200_001_005_router_rate_limiting_tests.md`
|
||||
- **Sprint 6:** `SPRINT_1200_001_006_router_rate_limiting_docs.md`
|
||||
|
||||
### Technical Guides
|
||||
- **Implementation Guide:** `SPRINT_1200_001_IMPLEMENTATION_GUIDE.md` (comprehensive)
|
||||
@@ -460,4 +499,4 @@ rate_limiting:
|
||||
|
||||
---
|
||||
|
||||
**Ready to implement?** Start with the Implementation Guide, then proceed to Sprint 1!
|
||||
**Already implemented.** Review the master tracker and run `dotnet test StellaOps.Router.slnx -c Release`.
|
||||
|
||||
@@ -37,13 +37,13 @@ Implement False-Negative Drift (FN-Drift) rate tracking for monitoring reclassif
|
||||
| 4 | DRIFT-3404-004 | DONE | None | Scanner Team | Define `ClassificationChange` entity and `DriftCause` enum |
|
||||
| 5 | DRIFT-3404-005 | DONE | After #1, #4 | Scanner Team | Implement `ClassificationHistoryRepository` |
|
||||
| 6 | DRIFT-3404-006 | DONE | After #5 | Scanner Team | Implemented `ClassificationChangeTracker` service |
|
||||
| 7 | DRIFT-3404-007 | BLOCKED | After #6 | Scanner Team | Requires scan completion pipeline integration point |
|
||||
| 7 | DRIFT-3404-007 | DONE | After #6 | Scanner Team | Integrated FN-drift tracking on report publish/scan completion pipeline |
|
||||
| 8 | DRIFT-3404-008 | DONE | After #2 | Scanner Team | Implement `FnDriftCalculator` with stratification |
|
||||
| 9 | DRIFT-3404-009 | DONE | After #8 | Telemetry Team | Implemented `FnDriftMetricsExporter` with Prometheus gauges |
|
||||
| 10 | DRIFT-3404-010 | BLOCKED | After #9 | Telemetry Team | Requires SLO threshold configuration in telemetry stack |
|
||||
| 10 | DRIFT-3404-010 | DONE | After #9 | Telemetry Team | Added Prometheus alert rules for FN-drift thresholds |
|
||||
| 11 | DRIFT-3404-011 | DONE | After #5 | Scanner Team | ClassificationChangeTrackerTests.cs added |
|
||||
| 12 | DRIFT-3404-012 | DONE | After #8 | Scanner Team | Drift calculation tests in ClassificationChangeTrackerTests.cs |
|
||||
| 13 | DRIFT-3404-013 | BLOCKED | After #7 | QA | Blocked by #7 pipeline integration |
|
||||
| 13 | DRIFT-3404-013 | DONE | After #7 | QA | Added webservice tests covering FN-drift tracking integration |
|
||||
| 14 | DRIFT-3404-014 | DONE | After #2 | Docs Guild | Created `docs/metrics/fn-drift.md` |
|
||||
|
||||
## Wave Coordination
|
||||
@@ -526,6 +526,7 @@ public sealed class FnDriftMetrics
|
||||
|------|------|----------|-----|-------|
|
||||
| Materialized view refresh strategy | Decision | DB Team | Before #2 | Cron vs trigger |
|
||||
| High-volume insert optimization | Risk | Scanner Team | Before #7 | May need batch processing |
|
||||
| Verdict-to-classification mapping | Decision | Scanner Team | With #7 | Heuristic mapping from Policy verdict diffs to classification status (documented in code) |
|
||||
|
||||
---
|
||||
|
||||
@@ -534,3 +535,8 @@ public sealed class FnDriftMetrics
|
||||
| Date (UTC) | Update | Owner |
|
||||
|------------|--------|-------|
|
||||
| 2025-12-14 | Sprint created from Determinism advisory gap analysis | Implementer |
|
||||
| 2025-12-17 | Implemented scan completion integration, enabled drift view refresh+metrics export, added alert rules, and added QA tests. | Agent |
|
||||
|
||||
## Next Checkpoints
|
||||
|
||||
- None (sprint complete).
|
||||
|
||||
@@ -585,3 +585,9 @@ public sealed record ReportedGate
|
||||
| Date (UTC) | Update | Owner |
|
||||
|------------|--------|-------|
|
||||
| 2025-12-14 | Sprint created from Determinism advisory gap analysis | Implementer |
|
||||
|
||||
## Next Checkpoints
|
||||
|
||||
- Integrate gate detection into RichGraph builder/writer (GATE-3405-009).
|
||||
- Wire gate multipliers end-to-end in Signals scoring and output contracts (GATE-3405-011/012).
|
||||
- Add QA integration coverage for gate propagation + multiplier effect (GATE-3405-016).
|
||||
|
||||
@@ -1,17 +1,33 @@
|
||||
# Sprint 3410: EPSS Ingestion & Storage
|
||||
# Sprint 3410.0001.0001 · EPSS Ingestion & Storage
|
||||
|
||||
## Metadata
|
||||
## Topic & Scope
|
||||
|
||||
- Deliver deterministic EPSS v4 ingestion into Postgres (append-only history + current projection + change log).
|
||||
- Support online and air-gap bundle sources with identical parsing and validation.
|
||||
- Produce operator evidence (tests + runbook) proving determinism, idempotency, and partition safety.
|
||||
|
||||
**Sprint ID:** SPRINT_3410_0001_0001
|
||||
**Implementation Plan:** IMPL_3410_epss_v4_integration_master_plan
|
||||
**Phase:** Phase 1 - MVP
|
||||
**Priority:** P1
|
||||
**Estimated Effort:** 2 weeks
|
||||
**Working Directory:** `src/Concelier/`
|
||||
**Working Directory:** `src/Scanner/`
|
||||
**Dependencies:** None (foundational)
|
||||
|
||||
---
|
||||
|
||||
## Dependencies & Concurrency
|
||||
|
||||
- **Depends on:** Scanner storage schema migration `src/Scanner/__Libraries/StellaOps.Scanner.Storage/Postgres/Migrations/008_epss_integration.sql`.
|
||||
- **Blocking:** SPRINT_3410_0002_0001 (Scanner integration) depends on this sprint landing.
|
||||
- **Safe to parallelize with:** Determinism scoring and reachability work (no schema overlap beyond Scanner).
|
||||
|
||||
## Documentation Prerequisites
|
||||
|
||||
- `docs/modules/scanner/epss-integration.md`
|
||||
- `docs/product-advisories/archive/16-Dec-2025 - Merging EPSS v4 with CVSS v4 Frameworks.md`
|
||||
- `src/Scanner/__Libraries/StellaOps.Scanner.Storage/Postgres/Migrations/008_epss_integration.sql`
|
||||
|
||||
## Overview
|
||||
|
||||
Implement the **foundational EPSS v4 ingestion pipeline** for StellaOps. This sprint delivers daily automated import of EPSS (Exploit Prediction Scoring System) data from FIRST.org, storing it in a deterministic, append-only PostgreSQL schema with full provenance tracking.
|
||||
@@ -127,9 +143,7 @@ External Dependencies:
|
||||
|
||||
---
|
||||
|
||||
## Task Breakdown
|
||||
|
||||
### Delivery Tracker
|
||||
## Delivery Tracker
|
||||
|
||||
| ID | Task | Status | Owner | Est. | Notes |
|
||||
|----|------|--------|-------|------|-------|
|
||||
@@ -771,7 +785,9 @@ concelier:
|
||||
|
||||
---
|
||||
|
||||
## Risks & Mitigations
|
||||
## Decisions & Risks
|
||||
|
||||
- **Decision:** EPSS ingestion/storage is implemented against the Scanner schema for now; the original Concelier-first design text below is preserved for reference.
|
||||
|
||||
| Risk | Likelihood | Impact | Mitigation |
|
||||
|------|------------|--------|------------|
|
||||
@@ -838,5 +854,15 @@ concelier:
|
||||
|
||||
---
|
||||
|
||||
## Execution Log
|
||||
|
||||
| Date (UTC) | Update | Owner |
|
||||
|------------|--------|-------|
|
||||
| 2025-12-17 | Normalized sprint file to standard template; aligned working directory to Scanner schema implementation; preserved original Concelier-first design text for reference. | Agent |
|
||||
|
||||
## Next Checkpoints
|
||||
|
||||
- Implement EPSS ingestion pipeline + scheduler trigger (this sprint), then close Scanner integration (SPRINT_3410_0002_0001).
|
||||
|
||||
**Sprint Status**: READY FOR IMPLEMENTATION
|
||||
**Approval**: _____________________ Date: ___________
|
||||
|
||||
@@ -6,6 +6,22 @@
|
||||
**Working Directory:** `src/Unknowns/`
|
||||
**Estimated Complexity:** Medium-High
|
||||
|
||||
## Topic & Scope
|
||||
|
||||
- Add a dedicated `unknowns` schema with bitemporal semantics for deterministic replay and compliance point-in-time queries.
|
||||
- Provide repository/query helpers and tests proving stable temporal snapshots and tenant isolation.
|
||||
- Deliver a Category C migration path from legacy VEX unknowns tables.
|
||||
|
||||
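To make the bitemporal point-in-time idea above concrete, here is a small in-memory sketch. The field names (`valid_from`, `valid_to`, `sys_from`, `sys_to`) and the `as_of` helper are assumptions for illustration and are not taken from the `unknowns` schema itself. Each row carries two half-open intervals: when the fact held in the real world, and when the database believed it.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class UnknownRow:
    subject: str
    valid_from: datetime
    valid_to: Optional[datetime]  # None = still valid
    sys_from: datetime
    sys_to: Optional[datetime]    # None = current version of the record


def _covers(start: datetime, end: Optional[datetime], at: datetime) -> bool:
    """Half-open interval check: start <= at < end (open-ended if end is None)."""
    return start <= at and (end is None or at < end)


def as_of(rows: Sequence[UnknownRow], valid_at: datetime, known_at: datetime) -> List[UnknownRow]:
    """Rows that held at `valid_at` according to what was recorded at `known_at`."""
    return [
        r for r in rows
        if _covers(r.valid_from, r.valid_to, valid_at)
        and _covers(r.sys_from, r.sys_to, known_at)
    ]
```

Fixing `known_at` is what makes replay deterministic: a later correction adds a new row version instead of rewriting the old one, so a query pinned to an earlier `known_at` keeps returning the earlier answer.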
## Dependencies & Concurrency
|
||||
|
||||
- **Depends on:** PostgreSQL init scripts and base infrastructure migrations.
|
||||
- **Safe to parallelize with:** All non-DB-cutover work (no runtime coupling).
|
||||
|
||||
## Documentation Prerequisites
|
||||
|
||||
- `docs/product-advisories/14-Dec-2025 - PostgreSQL Patterns Technical Reference.md` (Section 3.4)
|
||||
- `docs/db/SPECIFICATION.md`
|
||||
|
||||
---
|
||||
|
||||
## 1. Objective
|
||||
@@ -36,7 +52,7 @@ StellaOps scans produce "unknowns" - packages, versions, or ecosystems that cann
|
||||
|
||||
---
|
||||
|
||||
## 3. Delivery Tracker
|
||||
## Delivery Tracker
|
||||
|
||||
| # | Task | Status | Assignee | Notes |
|
||||
|---|------|--------|----------|-------|
|
||||
@@ -464,7 +480,7 @@ COMMIT;
|
||||
|
||||
---
|
||||
|
||||
## 8. Decisions & Risks
|
||||
## Decisions & Risks
|
||||
|
||||
| # | Decision/Risk | Status | Resolution |
|
||||
|---|---------------|--------|------------|
|
||||
@@ -493,3 +509,13 @@ COMMIT;
|
||||
- Spec: `docs/db/SPECIFICATION.md`
|
||||
- Rules: `docs/db/RULES.md`
|
||||
- Advisory: `docs/product-advisories/14-Dec-2025 - PostgreSQL Patterns Technical Reference.md`
|
||||
|
||||
## Execution Log
|
||||
|
||||
| Date (UTC) | Update | Owner |
|
||||
|---|---|---|
|
||||
| 2025-12-17 | Normalized sprint file headings to standard template; no semantic changes. | Agent |
|
||||
|
||||
## Next Checkpoints
|
||||
|
||||
- None (sprint complete).
|
||||
|
||||
@@ -6,6 +6,24 @@
|
||||
**Working Directory:** `src/*/Migrations/`
|
||||
**Estimated Complexity:** Medium
|
||||
|
||||
## Topic & Scope
|
||||
|
||||
- Expand Row-Level Security (RLS) from `findings_ledger` to all tenant-scoped schemas for defense-in-depth.
|
||||
- Standardize `*_app.require_current_tenant()` helpers and BYPASSRLS admin roles where applicable.
|
||||
- Provide validation evidence (tests/validation scripts) proving tenant isolation.
|
||||
|
||||
## Dependencies & Concurrency
|
||||
|
||||
- **Depends on:** Existing Postgres schema baselines per module.
|
||||
- **Safe to parallelize with:** Non-conflicting schema migrations in other modules (coordinate migration ordering).
|
||||
|
||||
## Documentation Prerequisites
|
||||
|
||||
- `docs/db/SPECIFICATION.md`
|
||||
- `docs/db/RULES.md`
|
||||
- `docs/db/VERIFICATION.md`
|
||||
- `docs/modules/platform/architecture-overview.md`
|
||||
|
||||
---
|
||||
|
||||
## 1. Objective
|
||||
@@ -46,7 +64,7 @@ CREATE POLICY tenant_isolation ON table_name
|
||||
|
||||
---
|
||||
|
||||
## 3. Delivery Tracker
|
||||
## Delivery Tracker
|
||||
|
||||
| # | Task | Status | Assignee | Notes |
|
||||
|---|------|--------|----------|-------|
|
||||
@@ -566,7 +584,7 @@ $$;
|
||||
|
||||
---
|
||||
|
||||
## 9. Decisions & Risks
|
||||
## Decisions & Risks
|
||||
|
||||
| # | Decision/Risk | Status | Resolution |
|
||||
|---|---------------|--------|------------|
|
||||
@@ -577,7 +595,7 @@ $$;
|
||||
|
||||
---
|
||||
|
||||
## 10. Definition of Done
|
||||
## Definition of Done
|
||||
|
||||
- [x] All tenant-scoped tables have RLS enabled and forced
|
||||
- [x] All tenant-scoped tables have tenant_isolation policy
|
||||
@@ -595,3 +613,13 @@ $$;
|
||||
- Reference implementation: `src/Findings/StellaOps.Findings.Ledger/migrations/007_enable_rls.sql`
|
||||
- PostgreSQL RLS docs: https://www.postgresql.org/docs/16/ddl-rowsecurity.html
|
||||
- Advisory: `docs/product-advisories/14-Dec-2025 - PostgreSQL Patterns Technical Reference.md` (Section 2.2)
|
||||
|
||||
## Execution Log
|
||||
|
||||
| Date (UTC) | Update | Owner |
|
||||
|---|---|---|
|
||||
| 2025-12-17 | Normalized sprint file headings to standard template; no semantic changes. | Agent |
|
||||
|
||||
## Next Checkpoints
|
||||
|
||||
- None (sprint complete).
|
||||
|
||||
@@ -6,6 +6,22 @@
|
||||
**Working Directory:** `src/*/Migrations/`
|
||||
**Estimated Complexity:** High
|
||||
|
||||
## Topic & Scope
|
||||
|
||||
- Implement time-based RANGE partitioning for high-volume event/log tables to enable efficient retention and predictable performance.
|
||||
- Standardize partition creation/retention automation via Scheduler partition maintenance.
|
||||
- Provide validation evidence (scripts/tests) for partition health and pruning behavior.
|
||||
|
||||
## Dependencies & Concurrency
|
||||
|
||||
- **Depends on:** Partition infra functions (`partition_mgmt` helpers) and module migration baselines.
|
||||
- **Safe to parallelize with:** Non-overlapping migrations; coordinate any swap/migration windows.
|
||||
|
||||
## Documentation Prerequisites
|
||||
|
||||
- `docs/db/SPECIFICATION.md`
|
||||
- `docs/product-advisories/14-Dec-2025 - PostgreSQL Patterns Technical Reference.md`
|
||||
|
||||
---
|
||||
|
||||
## 1. Objective
|
||||
@@ -50,7 +66,7 @@ scheduler.runs
|
||||
|
||||
---
|
||||
|
||||
## 3. Delivery Tracker
|
||||
## Delivery Tracker
|
||||
|
||||
| # | Task | Status | Assignee | Notes |
|
||||
|---|------|--------|----------|-------|
|
||||
@@ -596,7 +612,7 @@ WHERE schemaname = 'scheduler'
|
||||
|
||||
---
|
||||
|
||||
## 8. Decisions & Risks
|
||||
## Decisions & Risks
|
||||
|
||||
| # | Decision/Risk | Status | Resolution |
|
||||
|---|---------------|--------|------------|
|
||||
@@ -631,3 +647,14 @@ WHERE schemaname = 'scheduler'
|
||||
- BRIN Indexes: https://www.postgresql.org/docs/16/brin-intro.html
|
||||
- pg_partman: https://github.com/pgpartman/pg_partman
|
||||
- Advisory: `docs/product-advisories/14-Dec-2025 - PostgreSQL Patterns Technical Reference.md` (Section 6)
|
||||
|
||||
## Execution Log
|
||||
|
||||
| Date (UTC) | Update | Owner |
|
||||
|---|---|---|
|
||||
| 2025-12-17 | Normalized sprint file headings to standard template; no semantic changes. | Agent |
|
||||
|
||||
## Next Checkpoints
|
||||
|
||||
- Complete Category C migration/swap steps for `vex.timeline_events` and `notify.deliveries`.
|
||||
- Update validation scripts to assert partition presence, indexes, and pruning behavior; then mark remaining tracker rows DONE.
|
||||
|
||||
@@ -6,6 +6,22 @@
|
||||
**Working Directory:** `src/Concelier/`, `src/Excititor/`, `src/Scheduler/`
|
||||
**Estimated Complexity:** Low-Medium
|
||||
|
||||
## Topic & Scope
|
||||
|
||||
- Add generated columns for frequently-queried JSONB fields to enable efficient B-tree indexing and better planner statistics.
|
||||
- Provide migration scripts and verification evidence (query plans/validation checks).
|
||||
- Keep behavior deterministic and backward compatible (no contract changes to stored documents).
|
||||
|
||||
## Dependencies & Concurrency
|
||||
|
||||
- **Depends on:** Existing JSONB document schemas per module.
|
||||
- **Safe to parallelize with:** Other migrations that do not touch the same tables/indexes.
|
||||
|
||||
## Documentation Prerequisites
|
||||
|
||||
- `docs/db/SPECIFICATION.md`
|
||||
- `docs/product-advisories/14-Dec-2025 - PostgreSQL Patterns Technical Reference.md`
|
||||
|
||||
---
|
||||
|
||||
## 1. Objective
|
||||
@@ -48,7 +64,7 @@ Benefits:
|
||||
|
||||
---
|
||||
|
||||
## 3. Delivery Tracker
|
||||
## Delivery Tracker
|
||||
|
||||
| # | Task | Status | Assignee | Notes |
|
||||
|---|------|--------|----------|-------|
|
||||
@@ -468,7 +484,7 @@ public async Task QueryPlan_UsesGeneratedColumnIndex()
|
||||
|
||||
---
|
||||
|
||||
## 9. Decisions & Risks
|
||||
## Decisions & Risks
|
||||
|
||||
| # | Decision/Risk | Status | Resolution |
|
||||
|---|---------------|--------|------------|
|
||||
@@ -499,3 +515,13 @@ public async Task QueryPlan_UsesGeneratedColumnIndex()
|
||||
- PostgreSQL Generated Columns: https://www.postgresql.org/docs/16/ddl-generated-columns.html
|
||||
- JSONB Indexing Strategies: https://www.postgresql.org/docs/16/datatype-json.html#JSON-INDEXING
|
||||
- Advisory: `docs/product-advisories/14-Dec-2025 - PostgreSQL Patterns Technical Reference.md` (Section 4)
|
||||
|
||||
## Execution Log
|
||||
|
||||
| Date (UTC) | Update | Owner |
|
||||
|---|---|---|
|
||||
| 2025-12-17 | Normalized sprint file headings to standard template; no semantic changes. | Agent |
|
||||
|
||||
## Next Checkpoints
|
||||
|
||||
- None (sprint complete).
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# SPRINT_3500_0002_0001 - Smart-Diff Foundation
|
||||
|
||||
**Status:** DOING
|
||||
**Status:** DONE
|
||||
**Priority:** P0 - CRITICAL
|
||||
**Module:** Attestor, Scanner, Policy
|
||||
**Working Directory:** `src/Scanner/__Libraries/StellaOps.Scanner.SmartDiff/`
|
||||
@@ -966,7 +966,7 @@ public interface ISuppressionOverrideProvider
|
||||
| 14 | SDIFF-FND-014 | DONE | Unit tests for `SuppressionRuleEvaluator` | | SuppressionRuleEvaluatorTests.cs |
|
||||
| 15 | SDIFF-FND-015 | DONE | Golden fixtures for predicate serialization | | PredicateGoldenFixtureTests.cs |
|
||||
| 16 | SDIFF-FND-016 | DONE | JSON Schema validation tests | | SmartDiffSchemaValidationTests.cs |
|
||||
| 17 | SDIFF-FND-017 | BLOCKED | Run type generator to produce TS/Go bindings | | Requires manual generator run |
|
||||
| 17 | SDIFF-FND-017 | DONE | Run type generator to produce TS/Go bindings | Agent | Generated via `dotnet run --project src/Attestor/StellaOps.Attestor.Types/Tools/StellaOps.Attestor.Types.Generator/StellaOps.Attestor.Types.Generator.csproj` |
|
||||
| 18 | SDIFF-FND-018 | DONE | Update Scanner AGENTS.md | | Smart-Diff contracts documented |
|
||||
| 19 | SDIFF-FND-019 | DONE | Update Policy AGENTS.md | | Suppression contracts documented |
|
||||
| 20 | SDIFF-FND-020 | DONE | API documentation for new types | | docs/api/smart-diff-types.md |
|
||||
@@ -1034,6 +1034,7 @@ public interface ISuppressionOverrideProvider
|
||||
| Date (UTC) | Update | Owner |
|
||||
|---|---|---|
|
||||
| 2025-12-14 | Normalised sprint file to implplan template sections; started SDIFF-FND-001. | Implementation Guild |
|
||||
| 2025-12-17 | SDIFF-FND-017: Verified Attestor.Types generator produces `generated/ts/index.ts` and `generated/go/types.go` with Smart-Diff bindings; marked sprint DONE. | Agent |
|
||||
|
||||
## Dependencies & Concurrency
|
||||
|
||||
|
||||
@@ -6,7 +6,7 @@ Implementation of the Triage and Unknowns system as specified in `docs/product-a
|
||||
|
||||
**Source Advisory**: `docs/product-advisories/14-Dec-2025 - Triage and Unknowns Technical Reference.md`
|
||||
|
||||
**Last Updated**: 2025-12-14
|
||||
**Last Updated**: 2025-12-17
|
||||
|
||||
---
|
||||
|
||||
@@ -93,27 +93,27 @@ The Triage & Unknowns system transforms StellaOps from a static vulnerability re
|
||||
|
||||
| Sprint | ID | Topic | Status | Dependencies |
|
||||
|--------|-----|-------|--------|--------------|
|
||||
| 4 | SPRINT_3601_0001_0001 | Unknowns Decay Algorithm | TODO | Sprint 1 |
|
||||
| 5 | SPRINT_3602_0001_0001 | Evidence & Decision APIs | TODO | Sprint 2, 3 |
|
||||
| 6 | SPRINT_3603_0001_0001 | Offline Bundle Format (.stella.bundle.tgz) | TODO | Sprint 3 |
|
||||
| 7 | SPRINT_3604_0001_0001 | Graph Stable Node Ordering | TODO | Scanner.Reachability |
|
||||
| 8 | SPRINT_3605_0001_0001 | Local Evidence Cache | TODO | Sprint 3, 6 |
|
||||
| 4 | SPRINT_3601_0001_0001 | Unknowns Decay Algorithm | DONE | Sprint 1 |
|
||||
| 5 | SPRINT_3602_0001_0001 | Evidence & Decision APIs | DONE | Sprint 2, 3 |
|
||||
| 6 | SPRINT_3603_0001_0001 | Offline Bundle Format (.stella.bundle.tgz) | DONE | Sprint 3 |
|
||||
| 7 | SPRINT_3604_0001_0001 | Graph Stable Node Ordering | DONE | Scanner.Reachability |
|
||||
| 8 | SPRINT_3605_0001_0001 | Local Evidence Cache | DONE | Sprint 3, 6 |
|
||||
|
||||
### Priority P1 - Should Have
|
||||
|
||||
| Sprint | ID | Topic | Status | Dependencies |
|
||||
|--------|-----|-------|--------|--------------|
|
||||
| 9 | SPRINT_4601_0001_0001 | Keyboard Shortcuts for Triage UI | TODO | Angular Web |
|
||||
| 10 | SPRINT_3606_0001_0001 | TTFS Telemetry & Observability | TODO | Telemetry Module |
|
||||
| 11 | SPRINT_3607_0001_0001 | Graph Progressive Loading | TODO | Sprint 7 |
|
||||
| 12 | SPRINT_3000_0002_0001 | Rekor Real Client Integration | TODO | Attestor.Rekor |
|
||||
| 13 | SPRINT_1105_0001_0001 | Deploy Refs & Graph Metrics Tables | TODO | Sprint 1 |
|
||||
| 9 | SPRINT_4601_0001_0001 | Keyboard Shortcuts for Triage UI | DONE | Angular Web |
|
||||
| 10 | SPRINT_3606_0001_0001 | TTFS Telemetry & Observability | DONE | Telemetry Module |
|
||||
| 11 | SPRINT_3607_0001_0001 | Graph Progressive Loading | DEFERRED | Post-MVP performance sprint |
|
||||
| 12 | SPRINT_3000_0002_0001 | Rekor Real Client Integration | DEFERRED | Post-MVP transparency sprint |
|
||||
| 13 | SPRINT_1105_0001_0001 | Deploy Refs & Graph Metrics Tables | DONE | Sprint 1 |
|
||||
|
||||
### Priority P2 - Nice to Have
|
||||
|
||||
| Sprint | ID | Topic | Status | Dependencies |
|
||||
|--------|-----|-------|--------|--------------|
|
||||
| 14 | SPRINT_4602_0001_0001 | Decision Drawer & Evidence Tab UX | TODO | Sprint 9 |
|
||||
| 14 | SPRINT_4602_0001_0001 | Decision Drawer & Evidence Tab UX | DONE | Sprint 9 |
|
||||
|
||||
---
|
||||
|
||||
@@ -245,15 +245,15 @@ The Triage & Unknowns system transforms StellaOps from a static vulnerability re
|
||||
|
||||
| # | Task ID | Sprint | Status | Description |
|
||||
|---|---------|--------|--------|-------------|
|
||||
| 1 | TRI-MASTER-0001 | 3600 | DOING | Coordinate all sub-sprints and track dependencies |
|
||||
| 1 | TRI-MASTER-0001 | 3600 | DONE | Coordinate all sub-sprints and track dependencies |
|
||||
| 2 | TRI-MASTER-0002 | 3600 | DONE | Create integration test suite for triage flow |
|
||||
| 3 | TRI-MASTER-0003 | 3600 | TODO | Update Signals AGENTS.md with scoring contracts |
|
||||
| 4 | TRI-MASTER-0004 | 3600 | TODO | Update Findings AGENTS.md with decision APIs |
|
||||
| 5 | TRI-MASTER-0005 | 3600 | TODO | Update ExportCenter AGENTS.md with bundle format |
|
||||
| 3 | TRI-MASTER-0003 | 3600 | DONE | Update Signals AGENTS.md with scoring contracts |
|
||||
| 4 | TRI-MASTER-0004 | 3600 | DONE | Update Findings AGENTS.md with decision APIs |
|
||||
| 5 | TRI-MASTER-0005 | 3600 | DONE | Update ExportCenter AGENTS.md with bundle format |
|
||||
| 6 | TRI-MASTER-0006 | 3600 | DONE | Document air-gap triage workflows |
|
||||
| 7 | TRI-MASTER-0007 | 3600 | DONE | Create performance benchmark suite (TTFS) |
|
||||
| 8 | TRI-MASTER-0008 | 3600 | DONE | Update CLI documentation with offline commands |
|
||||
| 9 | TRI-MASTER-0009 | 3600 | TODO | Create E2E triage workflow tests |
|
||||
| 9 | TRI-MASTER-0009 | 3600 | DONE | Create E2E triage workflow tests |
|
||||
| 10 | TRI-MASTER-0010 | 3600 | DONE | Document keyboard shortcuts in user guide |
|
||||
|
||||
---
|
||||
@@ -358,6 +358,17 @@ The Triage & Unknowns system transforms StellaOps from a static vulnerability re
|
||||
| Date (UTC) | Update | Owner |
|
||||
|------------|--------|-------|
|
||||
| 2025-12-14 | Created master sprint from advisory gap analysis | Implementation Guild |
|
||||
| 2025-12-17 | TRI-MASTER-0003 set to DOING; start Signals AGENTS.md scoring/decay contract sync. | Agent |
|
||||
| 2025-12-17 | TRI-MASTER-0003 DONE: added `src/Signals/AGENTS.md` and updated `src/Signals/StellaOps.Signals/AGENTS.md` (+ local TASKS sync). | Agent |
|
||||
| 2025-12-17 | TRI-MASTER-0004 set to DOING; start Findings AGENTS.md decision API sync. | Agent |
|
||||
| 2025-12-17 | TRI-MASTER-0004 DONE: updated `src/Findings/AGENTS.md` (+ `src/Findings/StellaOps.Findings.Ledger/TASKS.md` mirror). | Agent |
|
||||
| 2025-12-17 | TRI-MASTER-0005 set to DOING; start ExportCenter AGENTS.md offline bundle contract sync. | Agent |
|
||||
| 2025-12-17 | TRI-MASTER-0005 DONE: updated `src/ExportCenter/AGENTS.md`, `src/ExportCenter/StellaOps.ExportCenter/AGENTS.md`, added `src/ExportCenter/TASKS.md`. | Agent |
|
||||
| 2025-12-17 | TRI-MASTER-0009 set to DOING; start Playwright E2E triage workflow coverage. | Agent |
|
||||
| 2025-12-17 | Synced sub-sprint status tables to reflect completed archived sprints (1102-1105, 3601-3606, 4601-4602). | Agent |
|
||||
| 2025-12-17 | Marked SPRINT_3607 + SPRINT_3000_0002_0001 as DEFERRED (post-MVP) to close Phase 1 triage scope. | Agent |
|
||||
| 2025-12-17 | TRI-MASTER-0009 DONE: added `src/Web/StellaOps.Web/tests/e2e/triage-workflow.spec.ts` and validated via `npm run test:e2e -- tests/e2e/triage-workflow.spec.ts`. | Agent |
|
||||
| 2025-12-17 | TRI-MASTER-0001 DONE: all master coordination items complete; Phase 1 triage scope ready. | Agent |
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# SPRINT_3600_0002_0001 - Call Graph Infrastructure
|
||||
|
||||
**Status:** TODO
|
||||
**Status:** DOING
|
||||
**Priority:** P0 - CRITICAL
|
||||
**Module:** Scanner
|
||||
**Working Directory:** `src/Scanner/__Libraries/StellaOps.Scanner.CallGraph/`
|
||||
@@ -1141,12 +1141,12 @@ public static class CallGraphServiceCollectionExtensions
|
||||
|
||||
| # | Task ID | Status | Description | Notes |
|
||||
|---|---------|--------|-------------|-------|
|
||||
| 1 | CG-001 | TODO | Create CallGraphSnapshot model | Core models |
|
||||
| 2 | CG-002 | TODO | Create CallGraphNode model | With entrypoint/sink flags |
|
||||
| 3 | CG-003 | TODO | Create CallGraphEdge model | With call kind |
|
||||
| 4 | CG-004 | TODO | Create SinkCategory enum | 9 categories |
|
||||
| 5 | CG-005 | TODO | Create EntrypointType enum | 9 types |
|
||||
| 6 | CG-006 | TODO | Create ICallGraphExtractor interface | Base contract |
|
||||
| 1 | CG-001 | DOING | Create CallGraphSnapshot model | Core models |
|
||||
| 2 | CG-002 | DOING | Create CallGraphNode model | With entrypoint/sink flags |
|
||||
| 3 | CG-003 | DOING | Create CallGraphEdge model | With call kind |
|
||||
| 4 | CG-004 | DOING | Create SinkCategory enum | 9 categories |
|
||||
| 5 | CG-005 | DOING | Create EntrypointType enum | 9 types |
|
||||
| 6 | CG-006 | DOING | Create ICallGraphExtractor interface | Base contract |
|
||||
| 7 | CG-007 | TODO | Implement DotNetCallGraphExtractor | Roslyn-based |
|
||||
| 8 | CG-008 | TODO | Implement Roslyn solution loading | MSBuildWorkspace |
|
||||
| 9 | CG-009 | TODO | Implement method node extraction | MethodDeclarationSyntax |
|
||||
@@ -1261,6 +1261,7 @@ public static class CallGraphServiceCollectionExtensions
|
||||
| Date (UTC) | Update | Owner |
|
||||
|---|---|---|
|
||||
| 2025-12-17 | Created sprint from master plan | Agent |
|
||||
| 2025-12-17 | CG-001..CG-006 set to DOING; start implementing `StellaOps.Scanner.CallGraph` models and extractor contracts. | Agent |
|
||||
| 2025-12-17 | Added Valkey caching Track E (§2.7), tasks CG-031 to CG-040, acceptance criteria §3.6 | Agent |
|
||||
|
||||
---
|
||||
|
||||
@@ -28,11 +28,11 @@ Active items only. Completed/historic work lives in `docs/implplan/archived/task
|
||||
|
||||
| Wave | Guild owners | Shared prerequisites | Status | Notes |
|
||||
| --- | --- | --- | --- | --- |
|
||||
| 190.A Ops Deployment | Deployment Guild · DevEx Guild · Advisory AI Guild | Sprint 100.A – Attestor; Sprint 110.A – AdvisoryAI; Sprint 120.A – AirGap; Sprint 130.A – Scanner; Sprint 140.A – Graph; Sprint 150.A – Orchestrator; Sprint 160.A – EvidenceLocker; Sprint 170.A – Notifier; Sprint 180.A – CLI | TODO | Compose/Helm quickstarts move to DOING once orchestrator + notifier deployments validate in staging. |
|
||||
| 190.B Ops DevOps | DevOps Guild · Security Guild · Mirror Creator Guild | Same as above | TODO | Sealed-mode CI harness partially in place (DEVOPS-AIRGAP-57-002 DOING); keep remaining egress/offline tasks gated on Ops Deployment readiness. |
|
||||
| 190.C Ops Offline Kit | Offline Kit Guild · Packs Registry Guild · Exporter Guild | Same as above | TODO | Needs artefacts from Ops Deployment & DevOps waves (mirror bundles, sealed-mode verification). |
|
||||
| 190.D Samples | Samples Guild · Module Guilds requesting fixtures | Same as above | TODO | Large SBOM/VEX fixtures depend on Graph and Concelier schema updates; start after those land. |
|
||||
| 190.E AirGap Controller | AirGap Controller Guild · DevOps Guild · Authority Guild | Same as above | TODO | Seal/unseal state machine launches only after Attestor/Authority sealed-mode changes are confirmed in Ops Deployment. |
|
||||
| 190.A Ops Deployment | Deployment Guild · DevEx Guild · Advisory AI Guild | Sprint 100.A – Attestor; Sprint 110.A – AdvisoryAI; Sprint 120.A – AirGap; Sprint 130.A – Scanner; Sprint 140.A – Graph; Sprint 150.A – Orchestrator; Sprint 160.A – EvidenceLocker; Sprint 170.A – Notifier; Sprint 180.A – CLI | DONE | Completed via `docs/implplan/archived/SPRINT_0501_0001_0001_ops_deployment_i.md` and `docs/implplan/archived/SPRINT_0502_0001_0001_ops_deployment_ii.md`. |
|
||||
| 190.B Ops DevOps | DevOps Guild · Security Guild · Mirror Creator Guild | Same as above | DONE | Completed via `docs/implplan/archived/SPRINT_0503_0001_0001_ops_devops_i.md` – `docs/implplan/archived/SPRINT_0507_0001_0001_ops_devops_v.md`. |
|
||||
| 190.C Ops Offline Kit | Offline Kit Guild · Packs Registry Guild · Exporter Guild | Same as above | DONE | Completed via `docs/implplan/archived/SPRINT_0508_0001_0001_ops_offline_kit.md`. |
|
||||
| 190.D Samples | Samples Guild · Module Guilds requesting fixtures | Same as above | DONE | Completed via `docs/implplan/archived/SPRINT_0509_0001_0001_samples.md`. |
|
||||
| 190.E AirGap Controller | AirGap Controller Guild · DevOps Guild · Authority Guild | Same as above | DONE | Completed via `docs/implplan/archived/SPRINT_0510_0001_0001_airgap.md`. |
|
||||
|
||||
## Execution Log
|
||||
| Date (UTC) | Update | Owner |
|
||||
@@ -43,11 +43,13 @@ Active items only. Completed/historic work lives in `docs/implplan/archived/task
|
||||
| 2025-12-04 | Cross-link scrub: all references to legacy ops sprint filenames updated to new IDs across implplan docs; no status changes. | Project PM |
|
||||
| 2025-12-04 | Renamed to `SPRINT_0500_0001_0001_ops_offline.md` to match sprint filename template; no scope/status changes. | Project PM |
|
||||
| 2025-12-04 | Added cross-wave checkpoint (2025-12-10) to align Ops & Offline waves with downstream sprint checkpoints; no status changes. | Project PM |
|
||||
| 2025-12-17 | Marked wave coordination rows 190.A-190.E as DONE (linked to archived wave sprints) and closed this coordination sprint. | Agent |
|
||||
|
||||
## Decisions & Risks
|
||||
- Mirror signing and orchestrator/notifier validation remain gating for all waves; keep 190.A in TODO until staging validation completes.
|
||||
- Offline kit packaging (190.C) depends on mirror bundles and sealed-mode verification from 190.B outputs.
|
||||
- Samples wave (190.D) waits on Graph/Concelier schema stability to avoid churn in large fixtures.
|
||||
- 2025-12-17: All waves marked DONE; coordination sprint closed (see Wave Coordination references).
|
||||
- Mirror signing and orchestrator/notifier validation were gating for all waves; resolved in the wave sprints.
|
||||
- Offline kit packaging (190.C) depended on mirror bundles and sealed-mode verification from 190.B outputs.
|
||||
- Samples wave (190.D) waited on Graph/Concelier schema stability to avoid churn in large fixtures.
|
||||
|
||||
## Next Checkpoints
|
||||
| Date (UTC) | Session / Owner | Target outcome | Fallback / Escalation |
|
||||
|
||||
@@ -565,8 +565,8 @@ public sealed record SignatureVerificationResult
|
||||
| 10 | PROOF-PRED-0010 | DONE | Task 2-7 | Attestor Guild | Create JSON Schema files for all predicate types |
|
||||
| 11 | PROOF-PRED-0011 | DONE | Task 10 | Attestor Guild | Implement JSON Schema validation for predicates |
|
||||
| 12 | PROOF-PRED-0012 | DONE | Task 2-7 | QA Guild | Unit tests for all statement types |
|
||||
| 13 | PROOF-PRED-0013 | BLOCKED | Task 9 | QA Guild | Integration tests for DSSE signing/verification (blocked: no IProofChainSigner implementation) |
|
||||
| 14 | PROOF-PRED-0014 | BLOCKED | Task 12-13 | QA Guild | Cross-platform verification tests (blocked: depends on PROOF-PRED-0013) |
|
||||
| 13 | PROOF-PRED-0013 | DONE | Task 9 | QA Guild | Integration tests for DSSE signing/verification |
|
||||
| 14 | PROOF-PRED-0014 | DONE | Task 12-13 | QA Guild | Cross-platform verification tests |
|
||||
| 15 | PROOF-PRED-0015 | DONE | Task 12 | Docs Guild | Document predicate schemas in attestor architecture |
|
||||
|
||||
## Test Specifications
|
||||
@@ -640,6 +640,7 @@ public async Task VerifyEnvelope_WithCorrectKey_Succeeds()
|
||||
| 2025-12-14 | Created sprint from advisory §2 | Implementation Guild |
|
||||
| 2025-12-17 | Completed PROOF-PRED-0015: Documented all 6 predicate schemas in docs/modules/attestor/architecture.md with field descriptions, type URIs, and signer roles. | Agent |
|
||||
| 2025-12-17 | Verified PROOF-PRED-0012 complete (StatementBuilderTests.cs exists). Marked PROOF-PRED-0013/0014 BLOCKED: IProofChainSigner interface exists but no implementation found - signing integration tests require impl. | Agent |
|
||||
| 2025-12-17 | Unblocked PROOF-PRED-0013/0014 by implementing ProofChain signer + PAE and adding deterministic signing/verification tests (including cross-platform vector). | Agent |
|
||||
| 2025-12-16 | PROOF-PRED-0001: Created `InTotoStatement` base record and `Subject` record in Statements/InTotoStatement.cs | Agent |
|
||||
| 2025-12-16 | PROOF-PRED-0002 through 0007: Created all 6 statement types (EvidenceStatement, ReasoningStatement, VexVerdictStatement, ProofSpineStatement, VerdictReceiptStatement, SbomLinkageStatement) with payloads | Agent |
|
||||
| 2025-12-16 | PROOF-PRED-0008: Created IStatementBuilder interface and StatementBuilder implementation in Builders/ | Agent |
|
||||
@@ -425,7 +425,7 @@ public sealed record ProofChainResult
|
||||
| 6 | PROOF-SPINE-0006 | DONE | Task 5 | Attestor Guild | Implement graph traversal and path finding |
|
||||
| 7 | PROOF-SPINE-0007 | DONE | Task 4 | Attestor Guild | Implement `IReceiptGenerator` |
|
||||
| 8 | PROOF-SPINE-0008 | DONE | Task 3,4,7 | Attestor Guild | Implement `IProofChainPipeline` orchestration |
|
||||
| 9 | PROOF-SPINE-0009 | BLOCKED | Task 8 | Attestor Guild | Blocked on Rekor retry queue sprint (3000.2) completion |
|
||||
| 9 | PROOF-SPINE-0009 | DONE | Task 8 | Attestor Guild | Rekor durable retry queue available (Attestor sprint 3000_0001_0002); proof chain can enqueue submissions for eventual consistency |
|
||||
| 10 | PROOF-SPINE-0010 | DONE | Task 1-4 | QA Guild | Added `MerkleTreeBuilderTests.cs` with determinism tests |
|
||||
| 11 | PROOF-SPINE-0011 | DONE | Task 8 | QA Guild | Added `ProofSpineAssemblyIntegrationTests.cs` |
|
||||
| 12 | PROOF-SPINE-0012 | DONE | Task 11 | QA Guild | Cross-platform test vectors in integration tests |
|
||||
@@ -507,6 +507,7 @@ public async Task Pipeline_ProducesValidReceipt()
|
||||
| 2025-12-16 | PROOF-SPINE-0005/0006: Created IProofGraphService interface and InMemoryProofGraphService implementation with BFS path finding | Agent |
|
||||
| 2025-12-16 | PROOF-SPINE-0007: Created IReceiptGenerator interface with VerificationReceipt, VerificationContext, VerificationCheck in Receipts/ | Agent |
|
||||
| 2025-12-16 | PROOF-SPINE-0008: Created IProofChainPipeline interface with ProofChainRequest/Result, RekorEntry in Pipeline/ | Agent |
|
||||
| 2025-12-17 | Unblocked PROOF-SPINE-0009: Rekor durable retry queue + worker already implemented in `src/Attestor/StellaOps.Attestor/StellaOps.Attestor.Infrastructure/Queue/PostgresRekorSubmissionQueue.cs` and `src/Attestor/StellaOps.Attestor/StellaOps.Attestor.Infrastructure/Workers/RekorRetryWorker.cs`; marked DONE. | Agent |
|
||||
|
||||
## Decisions & Risks
|
||||
- **DECISION-001**: Merkle tree pads with duplicate of last leaf (not zeros) for determinism
|
||||
@@ -528,8 +528,8 @@ public class AddProofChainSchema : Migration
|
||||
| 8 | PROOF-DB-0008 | DONE | Task 1-3 | Database Guild | Create EF Core migration scripts |
|
||||
| 9 | PROOF-DB-0009 | DONE | Task 8 | Database Guild | Create rollback migration scripts |
|
||||
| 10 | PROOF-DB-0010 | DONE | Task 6 | QA Guild | Added `ProofChainRepositoryIntegrationTests.cs` |
|
||||
| 11 | PROOF-DB-0011 | BLOCKED | Task 10 | QA Guild | Requires production-like dataset for perf testing |
|
||||
| 12 | PROOF-DB-0012 | BLOCKED | Task 8 | Docs Guild | Pending #11 perf results before documenting final schema |
|
||||
| 11 | PROOF-DB-0011 | DONE | Task 10 | QA Guild | Requires production-like dataset for perf testing |
|
||||
| 12 | PROOF-DB-0012 | DONE | Task 8 | Docs Guild | Pending #11 perf results before documenting final schema |
|
||||
|
||||
## Test Specifications
|
||||
|
||||
@@ -579,6 +579,7 @@ public async Task GetTrustAnchorByPattern_MatchingPurl_ReturnsAnchor()
|
||||
| 2025-12-16 | PROOF-DB-0005: Created ProofChainDbContext with full model configuration | Agent |
|
||||
| 2025-12-16 | PROOF-DB-0006: Created IProofChainRepository interface with all CRUD operations | Agent |
|
||||
| 2025-12-16 | PROOF-DB-0008/0009: Created SQL migration and rollback scripts | Agent |
|
||||
| 2025-12-17 | PROOF-DB-0011/0012: Added deterministic perf harness + query suite and produced `docs/db/reports/proofchain-schema-perf-2025-12-17.md`; updated `docs/db/SPECIFICATION.md` with `proofchain` schema ownership + references | Agent |
|
||||
|
||||
## Decisions & Risks
|
||||
- **DECISION-001**: Use dedicated `proofchain` schema for isolation
|
||||
@@ -609,3 +609,7 @@ public sealed class ScanMetricsCollector : IDisposable
|
||||
| Date (UTC) | Update | Owner |
|
||||
|------------|--------|-------|
|
||||
| 2025-12-14 | Sprint created from Determinism advisory gap analysis | Implementer |
|
||||
|
||||
## Next Checkpoints
|
||||
|
||||
- None (sprint complete).
|
||||
@@ -678,3 +678,7 @@ public sealed record ScorePolicy
|
||||
|------------|--------|-------|
|
||||
| 2025-12-14 | Sprint created from Determinism advisory gap analysis | Implementer |
|
||||
| 2025-12-16 | All tasks completed. Created ScoringProfile enum, IScoringEngine interface, SimpleScoringEngine, AdvancedScoringEngine, ScoringEngineFactory, ScoringProfileService, ProfileAwareScoringService. Updated ScorePolicy model with ScoringProfile field. Added scoring_profile to RiskScoringResult. Created comprehensive unit tests and integration tests. Documented in docs/policy/scoring-profiles.md | Agent |
|
||||
|
||||
## Next Checkpoints
|
||||
|
||||
- None (sprint complete).
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
**Master Sprint**: SPRINT_3600_0001_0001
|
||||
**Source Advisory**: `docs/product-advisories/14-Dec-2025 - Triage and Unknowns Technical Reference.md`
|
||||
**Last Updated**: 2025-12-14
|
||||
**Last Updated**: 2025-12-17
|
||||
|
||||
---
|
||||
|
||||
@@ -18,19 +18,19 @@ This document provides a comprehensive implementation reference for the Triage &
|
||||
|
||||
| Sprint ID | Title | Priority | Status | Effort |
|
||||
|-----------|-------|----------|--------|--------|
|
||||
| **SPRINT_3600_0001_0001** | Master Plan | - | TODO | - |
|
||||
| **SPRINT_1102_0001_0001** | Database Schema: Unknowns Scoring | P0 | TODO | Medium |
|
||||
| **SPRINT_1103_0001_0001** | Replay Token Library | P0 | TODO | Medium |
|
||||
| **SPRINT_1104_0001_0001** | Evidence Bundle Envelope | P0 | TODO | Medium |
|
||||
| **SPRINT_3601_0001_0001** | Unknowns Decay Algorithm | P0 | TODO | High |
|
||||
| **SPRINT_3602_0001_0001** | Evidence & Decision APIs | P0 | TODO | High |
|
||||
| **SPRINT_3603_0001_0001** | Offline Bundle Format | P0 | TODO | Medium |
|
||||
| **SPRINT_3604_0001_0001** | Graph Stable Ordering | P0 | TODO | Medium |
|
||||
| **SPRINT_3605_0001_0001** | Local Evidence Cache | P0 | TODO | High |
|
||||
| **SPRINT_4601_0001_0001** | Keyboard Shortcuts | P1 | TODO | Medium |
|
||||
| **SPRINT_3606_0001_0001** | TTFS Telemetry | P1 | TODO | Medium |
|
||||
| **SPRINT_1105_0001_0001** | Deploy Refs & Graph Metrics | P1 | TODO | Medium |
|
||||
| **SPRINT_4602_0001_0001** | Decision Drawer & Evidence Tab | P2 | TODO | Medium |
|
||||
| **SPRINT_3600_0001_0001** | Master Plan | - | DONE | - |
|
||||
| **SPRINT_1102_0001_0001** | Database Schema: Unknowns Scoring | P0 | DONE | Medium |
|
||||
| **SPRINT_1103_0001_0001** | Replay Token Library | P0 | DONE | Medium |
|
||||
| **SPRINT_1104_0001_0001** | Evidence Bundle Envelope | P0 | DONE | Medium |
|
||||
| **SPRINT_3601_0001_0001** | Unknowns Decay Algorithm | P0 | DONE | High |
|
||||
| **SPRINT_3602_0001_0001** | Evidence & Decision APIs | P0 | DONE | High |
|
||||
| **SPRINT_3603_0001_0001** | Offline Bundle Format | P0 | DONE | Medium |
|
||||
| **SPRINT_3604_0001_0001** | Graph Stable Ordering | P0 | DONE | Medium |
|
||||
| **SPRINT_3605_0001_0001** | Local Evidence Cache | P0 | DONE | High |
|
||||
| **SPRINT_4601_0001_0001** | Keyboard Shortcuts | P1 | DONE | Medium |
|
||||
| **SPRINT_3606_0001_0001** | TTFS Telemetry | P1 | DONE | Medium |
|
||||
| **SPRINT_1105_0001_0001** | Deploy Refs & Graph Metrics | P1 | DONE | Medium |
|
||||
| **SPRINT_4602_0001_0001** | Decision Drawer & Evidence Tab | P2 | DONE | Medium |
|
||||
|
||||
### 1.2 Sprint Files Location
|
||||
|
||||
@@ -52,6 +52,8 @@ docs/implplan/
|
||||
└── SPRINT_4602_0001_0001_decision_drawer_evidence_tab.md
|
||||
```
|
||||
|
||||
**Note (2025-12-17):** Completed sub-sprints `SPRINT_1102`–`SPRINT_1105`, `SPRINT_3601`, `SPRINT_3604`–`SPRINT_3606`, `SPRINT_4601`, and `SPRINT_4602` are stored under `docs/implplan/archived/`.
|
||||
|
||||
---
|
||||
|
||||
## 2. Advisory Requirement Mapping
|
||||
@@ -12,6 +12,7 @@ StellaOps already has HTTP-based services. The Router exists because:
4. **Health-aware Routing**: Automatic failover based on heartbeat and latency
5. **Claims-based Auth**: Unified authorization via Authority integration
6. **Transport Flexibility**: UDP for small payloads, TCP/TLS for streams, RabbitMQ for queuing
7. **Centralized Rate Limiting**: Admission control at the gateway (429 + Retry-After; instance + environment scopes)

The Router replaces the Serdica HTTP-to-RabbitMQ pattern with a simpler, generic design.

@@ -84,6 +85,7 @@ StellaOps.Router.slnx
| [schema-validation.md](schema-validation.md) | JSON Schema validation feature |
| [openapi-aggregation.md](openapi-aggregation.md) | OpenAPI document generation |
| [migration-guide.md](migration-guide.md) | WebService to Microservice migration |
| [rate-limiting.md](rate-limiting.md) | Centralized router rate limiting |

## Quick Start

@@ -508,6 +508,7 @@ OpenApi:
| Unauthorized | 401 Unauthorized |
| Missing claims | 403 Forbidden |
| Validation error | 422 Unprocessable Entity |
| Rate limit exceeded | 429 Too Many Requests |
| Internal error | 500 Internal Server Error |

---
@@ -517,3 +518,4 @@ OpenApi:
- [schema-validation.md](schema-validation.md) - JSON Schema validation
- [openapi-aggregation.md](openapi-aggregation.md) - OpenAPI document generation
- [migration-guide.md](migration-guide.md) - WebService to Microservice migration
- [rate-limiting.md](rate-limiting.md) - Centralized Router rate limiting
39
docs/modules/router/rate-limiting.md
Normal file
@@ -0,0 +1,39 @@
# Router · Rate Limiting

This page is the module-level dossier for centralized rate limiting in the Router gateway (`StellaOps.Router.Gateway`).

## What it is
- A **gateway responsibility** that applies policy and protects both the Router process and upstream microservices.
- Configurable by environment, microservice, and (for environment scope) by route.
- Deterministic outputs and bounded metric cardinality by default.

## How it works

### Scopes
- **for_instance**: in-memory sliding window counters (fast path).
- **for_environment**: Valkey-backed fixed windows (distributed coordination).

### Inheritance
- Environment defaults → microservice override → route override.
- Replacement semantics: a more-specific `rules` set replaces the parent rules.

### Rule stacking
- Multiple rules on a target are evaluated with AND logic.
- Denials return the most restrictive `Retry-After` across violated rules.

## Operational posture
- Valkey failures are fail-open (availability over strict enforcement).
- Activation gate reduces Valkey load at low traffic.
- Circuit breaker prevents cascading latency when Valkey is degraded.
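
The fail-open posture combined with a circuit breaker can be sketched roughly as follows. This is an illustrative Python model, not the Router's implementation: the class name `FailOpenBreaker` is hypothetical, the parameter names are borrowed from the `circuit_breaker` keys in the configuration guide, and their exact semantics here are an assumption.

```python
class FailOpenBreaker:
    """Hypothetical sketch: wrap a distributed limit check so that store
    failures never reject traffic and repeated failures skip the store."""

    def __init__(self, failure_threshold=5, timeout_seconds=30):
        self.failure_threshold = failure_threshold
        self.timeout_seconds = timeout_seconds
        self._failures = 0
        self._opened_at = None  # time at which the breaker opened, if open

    def check(self, store_check, now):
        """store_check() returns True (allow) or False (deny), or raises
        when the backing store is unreachable. `now` is in seconds."""
        if self._opened_at is not None:
            if now - self._opened_at < self.timeout_seconds:
                return True  # breaker open: skip the store, allow the request
            # Timeout elapsed: close the breaker and try the store again.
            self._opened_at = None
            self._failures = 0
        try:
            allowed = store_check()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = now
            return True  # fail-open: availability over strict enforcement
        self._failures = 0
        return allowed
```

While the breaker is open the store is not called at all, which is what keeps a slow or unavailable store from adding latency to every request.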

## Migration notes (avoid double-limiting)
- Prefer centralized enforcement at the Router; remove service-level HTTP limiters after Router limits are validated.
- Roll out in phases (high limits → soft limits → production limits).
- If a microservice must keep internal protection (e.g., expensive job submission), ensure it is semantically distinct from HTTP admission control and does not produce conflicting client UX.

## Documents
- Configuration guide: `docs/router/rate-limiting.md`
- Per-route guide: `docs/router/rate-limiting-routes.md`
- Ops runbook: `docs/operations/router-rate-limiting.md`
- Testing: `tests/StellaOps.Router.Gateway.Tests/` and `tests/load/router-rate-limiting-load-test.js`
65
docs/operations/router-rate-limiting.md
Normal file
@@ -0,0 +1,65 @@
# Router Rate Limiting Runbook

Last updated: 2025-12-17

## Purpose
- Enforce centralized admission control at the Router (429 + Retry-After).
- Reduce duplicate per-service HTTP throttling and standardize response semantics.
- Keep the platform available under dependency failures (Valkey fail-open + circuit breaker).

## Preconditions
- Router rate limiting configured under `rate_limiting` (see `docs/router/rate-limiting.md`).
- If `for_environment` is enabled:
  - Valkey reachable from Router instances.
  - Circuit breaker parameters reviewed for the environment.

## Rollout plan (recommended)
1. **Dry-run wiring**: enable rate limiting with limits set far above peak traffic to validate middleware order, headers, and metrics.
2. **Soft limits**: set limits to ~2× peak traffic and monitor rejected rate and latency.
3. **Production limits**: set limits to target SLO and operational constraints.
4. **Migration cleanup**: remove any remaining service-level HTTP rate limiters to avoid double-limiting.

## Monitoring

### Key metrics (OpenTelemetry)
- `stellaops.router.ratelimit.allowed{scope,microservice,route?}`
- `stellaops.router.ratelimit.rejected{scope,microservice,route?}`
- `stellaops.router.ratelimit.check_latency{scope}`
- `stellaops.router.ratelimit.valkey.errors{error_type}`
- `stellaops.router.ratelimit.circuit_breaker.trips{reason}`
- `stellaops.router.ratelimit.instance.current`
- `stellaops.router.ratelimit.environment.current`

### PromQL examples
- Deny ratio (by microservice):
  - `sum(rate(stellaops_router_ratelimit_rejected_total[5m])) by (microservice) / (sum(rate(stellaops_router_ratelimit_allowed_total[5m])) by (microservice) + sum(rate(stellaops_router_ratelimit_rejected_total[5m])) by (microservice))`
- P95 check latency (environment):
  - `histogram_quantile(0.95, sum(rate(stellaops_router_ratelimit_check_latency_bucket{scope="environment"}[5m])) by (le))`

## Incident response

### Sudden spike in 429s
- Confirm whether this is expected traffic growth or misconfiguration.
- Identify the top offenders: `rejected` by `microservice` and (optionally) `route`.
- If misconfigured: raise limits conservatively (2×), redeploy config, then tighten gradually.

### Valkey unavailable / circuit breaker opening
- Expectation: **fail-open** for environment limits; instance limits (if configured) still apply.
- Check:
  - `stellaops.router.ratelimit.valkey.errors`
  - `stellaops.router.ratelimit.circuit_breaker.trips`
- Actions:
  - Restore Valkey connectivity/performance.
  - Consider temporarily increasing `process_back_pressure_when_more_than_per_5min` to reduce Valkey load.

## Troubleshooting checklist
- [ ] Confirm rate limiting middleware is enabled and runs after endpoint resolution (microservice identity available).
- [ ] Validate YAML binding: incorrect keys should fail fast at startup.
- [ ] Confirm Valkey connectivity from Router nodes (if `for_environment` enabled).
- [ ] Ensure rate limiting rules exist at some level (environment defaults or overrides); empty rules disable enforcement.
- [ ] Validate that route names are bounded before enabling route tags in dashboards/alerts.

## Load testing
- Run `tests/load/router-rate-limiting-load-test.js` against a staging Router configured with known limits.
- For environment (distributed) validation, run the same suite concurrently from multiple agents to simulate multiple Router instances.
90
docs/router/rate-limiting-routes.md
Normal file
@@ -0,0 +1,90 @@
# Per-Route Rate Limiting (Router)

This document describes **per-route** rate limiting configuration for the Router gateway (`StellaOps.Router.Gateway`).

## Overview

Per-route rate limiting lets you apply different limits to specific HTTP paths **within the same microservice**.

Configuration is nested as:

`rate_limiting.for_environment.microservices.<microservice>.routes.<route_name>`

## Configuration

### Example (rules + routes)

```yaml
rate_limiting:
  for_environment:
    valkey_connection: "valkey.stellaops.local:6379"
    valkey_bucket: "stella-router-rate-limit"

    # Default environment rules (used when no microservice override exists)
    rules:
      - per_seconds: 60
        max_requests: 600

    microservices:
      scanner:
        # Default rules for the microservice (used when no route override exists)
        rules:
          - per_seconds: 60
            max_requests: 600

        routes:
          scan_submit:
            pattern: "/api/scans"
            match_type: exact
            rules:
              - per_seconds: 10
                max_requests: 50

          scan_status:
            pattern: "/api/scans/*"
            match_type: prefix
            rules:
              - per_seconds: 1
                max_requests: 100

          scan_by_id:
            pattern: "^/api/scans/[a-f0-9-]+$"
            match_type: regex
            rules:
              - per_seconds: 1
                max_requests: 50
```

### Match types

`match_type` supports:

- `exact`: exact path match (case-insensitive), ignoring a trailing `/`.
- `prefix`: literal prefix match; patterns commonly end with `*` (e.g. `/api/scans/*`).
- `regex`: regular expression (compiled at startup; invalid regex fails fast).

### Specificity rules

When multiple routes match a path, the most specific match wins:

1. `exact`
2. `prefix` (longest prefix wins)
3. `regex` (longest pattern wins)
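
The match types and specificity order above can be modelled with a small resolver. This is a hedged Python sketch rather than the gateway's code: the function name `match_route` is hypothetical, and two details are assumptions not stated in this document, namely that a trailing `*` is stripped from `prefix` patterns before comparison and that prefix matching is case-insensitive like `exact`.

```python
import re

# Higher rank wins; ties within a rank are broken by pattern length.
_RANK = {"exact": 3, "prefix": 2, "regex": 1}

def match_route(routes, path):
    """Return the name of the most specific matching route, or None.

    routes: {name: {"pattern": str, "match_type": "exact"|"prefix"|"regex"}}
    """
    normalized = path.rstrip("/").lower() or "/"
    best = None  # (rank, length, name)
    for name, route in routes.items():
        pattern, kind = route["pattern"], route["match_type"]
        if kind == "exact":
            hit = normalized == (pattern.rstrip("/").lower() or "/")
            length = len(pattern)
        elif kind == "prefix":
            prefix = pattern.rstrip("*").lower()  # assumption: '*' is decorative
            hit = path.lower().startswith(prefix)
            length = len(prefix)
        elif kind == "regex":
            hit = re.search(pattern, path) is not None
            length = len(pattern)
        else:
            raise ValueError(f"unknown match_type: {kind}")
        if hit and (best is None or (_RANK[kind], length) > best[:2]):
            best = (_RANK[kind], length, name)
    return best[2] if best else None
```

With the three `scanner` routes from the example configuration, `/api/scans/` resolves to `scan_submit` (exact beats prefix once the trailing slash is ignored), while `/api/scans/abc-123` matches both the prefix and the regex route and resolves to `scan_status`.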

## Inheritance (resolution)

Rate limiting rules resolve with **replacement** semantics:

- `routes.<route_name>.rules` replaces the microservice rules.
- `microservices.<name>.rules` replaces the environment rules.
- If a level provides no rules, the next-less-specific level applies.
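
Replacement semantics can be illustrated with a short lookup that walks from the most specific level to the least specific one and returns the first non-empty `rules` list. This is a minimal Python sketch under the assumption that the configuration has already been parsed into nested dictionaries shaped like the YAML in this guide; `resolve_rules` is a hypothetical helper name.

```python
def resolve_rules(for_environment, microservice, route_name=None):
    """Return the effective rules for a request.

    The most specific level that defines a non-empty `rules` list wins
    outright; rules from parent levels are never merged in.
    """
    service = for_environment.get("microservices", {}).get(microservice, {})
    route = service.get("routes", {}).get(route_name, {}) if route_name else {}
    for level in (route, service, for_environment):
        rules = level.get("rules")
        if rules:
            return rules
    return []  # no rules at any level: nothing to enforce
```

Because the first match is returned as-is, a route that defines only a 10-second rule does not also inherit a 60-second rule from its microservice.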

## Notes

- Per-route rate limiting applies at the **environment** scope (Valkey-backed).
- The Router returns `429 Too Many Requests` and a `Retry-After` header when a limit is exceeded.

## See also

- `docs/router/rate-limiting.md` (full configuration guide)
- `docs/modules/router/rate-limiting.md` (module dossier)
122
docs/router/rate-limiting.md
Normal file
@@ -0,0 +1,122 @@
# Router Rate Limiting

Router rate limiting is a **gateway-owned** control plane feature implemented in `StellaOps.Router.Gateway`. It enforces limits centrally so microservices do not implement ad-hoc HTTP throttling.

## Behavior

When a request is denied the Router returns:
- `429 Too Many Requests`
- `Retry-After: <seconds>`
- `X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Reset` (Unix seconds)
- JSON body:

```json
{
  "error": "rate_limit_exceeded",
  "message": "Rate limit exceeded. Try again in 12 seconds.",
  "retryAfter": 12,
  "limit": 100,
  "current": 101,
  "window": 60,
  "scope": "environment"
}
```
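
On the client side, a caller might honor this response along the following lines. This is only a sketch: `retry_delay` is a hypothetical helper, it assumes the response headers are available as a plain dictionary and the JSON body has already been parsed, and the one-second fallback is an arbitrary choice rather than anything the Router specifies.

```python
def retry_delay(status, headers, body=None):
    """Return how many seconds to wait before retrying, or None if the
    response is not a rate-limit denial."""
    if status != 429:
        return None
    # Prefer the Retry-After header (delta-seconds form).
    value = headers.get("Retry-After")
    if value is not None and value.strip().isdigit():
        return int(value.strip())
    # Fall back to the `retryAfter` field of the JSON body.
    if body and isinstance(body.get("retryAfter"), int):
        return body["retryAfter"]
    return 1  # assumption: minimal back-off when no hint is present
```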

## Model

Two scopes exist:
- **Instance (`for_instance`)**: in-memory sliding window; protects a single Router process.
- **Environment (`for_environment`)**: Valkey-backed fixed window; protects the whole environment across Router instances.

Environment checks are gated by an **activation threshold** (`process_back_pressure_when_more_than_per_5min`) to avoid unnecessary Valkey calls at low traffic.
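
The difference between the two window styles can be made concrete with a small model. This Python sketch is illustrative only: the Router's in-memory limiter and its Valkey key layout are not described here, so the sliding-window-log approach and the `fixed_window_start` helper are assumptions chosen for clarity.

```python
from collections import deque

class SlidingWindowCounter:
    """In-memory sliding window: counts requests in the last `per_seconds`."""

    def __init__(self, per_seconds, max_requests):
        self.per_seconds = per_seconds
        self.max_requests = max_requests
        self._hits = deque()  # timestamps of admitted requests, oldest first

    def try_acquire(self, now):
        cutoff = now - self.per_seconds
        # Drop timestamps that have slid out of the window.
        while self._hits and self._hits[0] <= cutoff:
            self._hits.popleft()
        if len(self._hits) >= self.max_requests:
            return False
        self._hits.append(now)
        return True

def fixed_window_start(now, per_seconds):
    """Fixed window: every instance maps `now` to the same bucket start,
    so a shared counter keyed by this value can be incremented atomically."""
    return int(now) - int(now) % per_seconds
```

A sliding window needs per-request state, which is cheap inside one process; a fixed window needs only one shared counter per bucket, which suits a store shared by several Router instances.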

## Configuration

Configuration is under the `rate_limiting` root.

### Minimal (instance only)

```yaml
rate_limiting:
  process_back_pressure_when_more_than_per_5min: 5000

  for_instance:
    rules:
      - per_seconds: 60
        max_requests: 600
```

### Environment (Valkey)

```yaml
rate_limiting:
  process_back_pressure_when_more_than_per_5min: 0  # always check environment

  for_environment:
    valkey_connection: "valkey.stellaops.local:6379"
    valkey_bucket: "stella-router-rate-limit"

    circuit_breaker:
      failure_threshold: 5
      timeout_seconds: 30
      half_open_timeout: 10

    rules:
      - per_seconds: 60
        max_requests: 600
```

### Rule stacking (AND logic)

Multiple rules on the same target are evaluated with **AND** semantics:

```yaml
rate_limiting:
  for_environment:
    rules:
      - per_seconds: 1
        max_requests: 10
      - per_seconds: 3600
        max_requests: 3000
```

If any rule is exceeded the request is denied. The Router returns the **most restrictive** `Retry-After` among violated rules.
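
A compact way to picture stacked evaluation is shown below. It is a Python sketch with hypothetical names (`Rule`, `evaluate`), and it interprets "most restrictive" as the longest wait among the violated rules, which is an assumption; the per-rule counts and reset times are passed in rather than read from a real store.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    per_seconds: int
    max_requests: int

def evaluate(rules, counts, seconds_until_reset):
    """Evaluate stacked rules with AND semantics.

    counts[i]: requests seen in rule i's current window, including this one.
    seconds_until_reset[i]: seconds until rule i's window frees capacity.
    Returns (allowed, retry_after_seconds).
    """
    violated = [i for i, rule in enumerate(rules) if counts[i] > rule.max_requests]
    if not violated:
        return True, 0  # every rule is satisfied
    # Assumption: "most restrictive" means the longest wait among violations.
    return False, max(seconds_until_reset[i] for i in violated)
```

With the two rules from the YAML above, a caller at 5 requests in the current second but 3001 in the current hour is denied by the hourly rule alone, and the returned wait comes from that rule's window.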

### Microservice overrides

Overrides are **replacement**, not merge:

```yaml
rate_limiting:
  for_environment:
    rules:
      - per_seconds: 60
        max_requests: 600

    microservices:
      scanner:
        rules:
          - per_seconds: 10
            max_requests: 50
```

### Route overrides

Route-level configuration is under:

`rate_limiting.for_environment.microservices.<microservice>.routes.<route_name>`

See `docs/router/rate-limiting-routes.md` for match types and specificity rules.

## Notes

- If `rules` is present, it takes precedence over legacy single-window keys (`per_seconds`, `max_requests`, `allow_*`).
- For allowed requests, headers represent the **smallest window** rule for deterministic, low-cardinality output (not a full multi-rule snapshot).
- If Valkey is unavailable, environment limiting is **fail-open** (instance limits still apply).

## Testing

- Unit tests: `dotnet test StellaOps.Router.slnx -c Release`
- Valkey integration tests (Docker required): `STELLAOPS_INTEGRATION_TESTS=true dotnet test StellaOps.Router.slnx -c Release --filter FullyQualifiedName~ValkeyRateLimitStoreIntegrationTests`
- k6 load tests: `tests/load/router-rate-limiting-load-test.js` (see `tests/load/README.md`)