feat(rate-limiting): Implement core rate limiting functionality with configuration, decision-making, metrics, middleware, and service registration

- Add RateLimitConfig for configuration management with YAML binding support. - Introduce RateLimitDecision to encapsulate the result of rate limit checks. - Implement RateLimitMetrics for OpenTelemetry metrics tracking. - Create RateLimitMiddleware for enforcing rate limits on incoming requests. - Develop RateLimitService to orchestrate instance and environment rate limit checks. - Add RateLimitServiceCollectionExtensions for dependency injection registration.
2025-12-17 18:02:37 +02:00
parent 394b57f6bf
commit 8bbfe4d2d2
211 changed files with 47179 additions and 1590 deletions
--- a/docs/db/schemas/scanner_schema_specification.md
+++ b/docs/db/schemas/scanner_schema_specification.md
@@ -0,0 +1,468 @@
+# Scanner Schema Specification
+
+**Schema**: `scanner`
+**Owner**: Scanner.WebService
+**Purpose**: Scan orchestration, call-graphs, proof bundles, reachability analysis
+**Sprint**: SPRINT_3500_0002_0001, SPRINT_3500_0003_0002
+
+---
+
+## Overview
+
+The `scanner` schema contains all tables related to:
+1. Scan manifests and deterministic replay
+2. Proof bundles (content-addressed storage metadata)
+3. Call-graph nodes and edges (reachability analysis)
+4. Entrypoints (framework-specific entry discovery)
+5. Runtime samples (profiling data for reachability validation)
+
+**Design Principles**:
+- All tables use `scan_id` as primary partition key for scan isolation
+- Deterministic data only (no timestamps in core algorithms)
+- Content-addressed references (hashes, not paths)
+- Forward-only schema evolution
+
+---
+
+## Tables
+
+### 1. scan_manifest
+
+**Purpose**: Stores immutable scan manifests capturing all inputs for deterministic replay.
+
+**Schema**:
+
+| Column | Type | Nullable | Description |
+|--------|------|----------|-------------|
+| `scan_id` | `text` | NOT NULL | Primary key; UUID format |
+| `created_at_utc` | `timestamptz` | NOT NULL | Scan creation timestamp |
+| `artifact_digest` | `text` | NOT NULL | Image/artifact digest (sha256:...) |
+| `artifact_purl` | `text` | NULL | PURL identifier (pkg:oci/...) |
+| `scanner_version` | `text` | NOT NULL | Scanner.WebService version |
+| `worker_version` | `text` | NOT NULL | Scanner.Worker version |
+| `concelier_snapshot_hash` | `text` | NOT NULL | Concelier feed snapshot digest |
+| `excititor_snapshot_hash` | `text` | NOT NULL | Excititor VEX snapshot digest |
+| `lattice_policy_hash` | `text` | NOT NULL | Policy bundle digest |
+| `deterministic` | `boolean` | NOT NULL | Whether scan used deterministic mode |
+| `seed` | `bytea` | NOT NULL | 32-byte deterministic seed |
+| `knobs` | `jsonb` | NULL | Configuration knobs (depth limits, etc.) |
+| `manifest_hash` | `text` | NOT NULL | SHA-256 of canonical manifest JSON (UNIQUE) |
+| `manifest_json` | `jsonb` | NOT NULL | Canonical JSON manifest |
+| `manifest_dsse_json` | `jsonb` | NOT NULL | DSSE signature envelope |
+
+**Indexes**:
+
+```sql
+CREATE INDEX idx_scan_manifest_artifact ON scanner.scan_manifest(artifact_digest);
+CREATE INDEX idx_scan_manifest_snapshots ON scanner.scan_manifest(concelier_snapshot_hash, excititor_snapshot_hash);
+CREATE INDEX idx_scan_manifest_created ON scanner.scan_manifest(created_at_utc DESC);
+CREATE UNIQUE INDEX idx_scan_manifest_hash ON scanner.scan_manifest(manifest_hash);
+```
+
+**Constraints**:
+- `manifest_hash` format: `sha256:[0-9a-f]{64}`
+- `seed` must be exactly 32 bytes
+- `scan_id` format: UUID v4
+
+**Partitioning**: None (lookup table, <100k rows expected)
+
+**Retention**: 180 days (drop scans older than 180 days)
+
+---
+
+### 2. proof_bundle
+
+**Purpose**: Metadata for content-addressed proof bundles (zip archives).
+
+**Schema**:
+
+| Column | Type | Nullable | Description |
+|--------|------|----------|-------------|
+| `scan_id` | `text` | NOT NULL | Foreign key to `scan_manifest.scan_id` |
+| `root_hash` | `text` | NOT NULL | Merkle root hash of bundle contents |
+| `bundle_uri` | `text` | NOT NULL | File path or S3 URI to bundle zip |
+| `proof_root_dsse_json` | `jsonb` | NOT NULL | DSSE signature of root hash |
+| `created_at_utc` | `timestamptz` | NOT NULL | Bundle creation timestamp |
+
+**Primary Key**: `(scan_id, root_hash)`
+
+**Indexes**:
+
+```sql
+CREATE INDEX idx_proof_bundle_scan ON scanner.proof_bundle(scan_id);
+CREATE INDEX idx_proof_bundle_created ON scanner.proof_bundle(created_at_utc DESC);
+```
+
+**Constraints**:
+- `root_hash` format: `sha256:[0-9a-f]{64}`
+- `bundle_uri` must be accessible file path or S3 URI
+
+**Partitioning**: None (<100k rows expected)
+
+**Retention**: 365 days (compliance requirement for signed bundles)
+
+---
+
+### 3. cg_node (call-graph nodes)
+
+**Purpose**: Stores call-graph nodes (methods/functions) extracted from artifacts.
+
+**Schema**:
+
+| Column | Type | Nullable | Description |
+|--------|------|----------|-------------|
+| `scan_id` | `text` | NOT NULL | Partition key |
+| `node_id` | `text` | NOT NULL | Deterministic node ID (hash-based) |
+| `artifact_key` | `text` | NOT NULL | Artifact identifier (assembly name, JAR, etc.) |
+| `symbol_key` | `text` | NOT NULL | Canonical symbol name (Namespace.Type::Method) |
+| `visibility` | `text` | NOT NULL | `public`, `internal`, `private`, `unknown` |
+| `flags` | `integer` | NOT NULL | Bitfield: `IS_ENTRYPOINT_CANDIDATE=1`, `IS_VIRTUAL=2`, etc. |
+
+**Primary Key**: `(scan_id, node_id)`
+
+**Indexes**:
+
+```sql
+CREATE INDEX idx_cg_node_artifact ON scanner.cg_node(scan_id, artifact_key);
+CREATE INDEX idx_cg_node_symbol ON scanner.cg_node(scan_id, symbol_key);
+CREATE INDEX idx_cg_node_flags ON scanner.cg_node(scan_id, flags) WHERE (flags & 1) = 1; -- Entrypoint candidates
+```
+
+**Constraints**:
+- `node_id` format: `sha256:[0-9a-f]{64}` (deterministic hash)
+- `visibility` must be one of: `public`, `internal`, `private`, `unknown`
+
+**Partitioning**: Hash partition by `scan_id` (for scans with >100k nodes)
+
+**Retention**: 90 days (call-graphs recomputed on rescan)
+
+---
+
+### 4. cg_edge (call-graph edges)
+
+**Purpose**: Stores call-graph edges (invocations) between nodes.
+
+**Schema**:
+
+| Column | Type | Nullable | Description |
+|--------|------|----------|-------------|
+| `scan_id` | `text` | NOT NULL | Partition key |
+| `from_node_id` | `text` | NOT NULL | Caller node ID |
+| `to_node_id` | `text` | NOT NULL | Callee node ID |
+| `kind` | `smallint` | NOT NULL | `1=static`, `2=heuristic` |
+| `reason` | `smallint` | NOT NULL | `1=direct_call`, `2=virtual_call`, `3=reflection_string`, etc. |
+| `weight` | `real` | NOT NULL | Edge confidence weight (0.0-1.0) |
+
+**Primary Key**: `(scan_id, from_node_id, to_node_id, kind, reason)`
+
+**Indexes**:
+
+```sql
+CREATE INDEX idx_cg_edge_from ON scanner.cg_edge(scan_id, from_node_id);
+CREATE INDEX idx_cg_edge_to ON scanner.cg_edge(scan_id, to_node_id);
+CREATE INDEX idx_cg_edge_static ON scanner.cg_edge(scan_id, kind) WHERE kind = 1;
+CREATE INDEX idx_cg_edge_heuristic ON scanner.cg_edge(scan_id, kind) WHERE kind = 2;
+```
+
+**Constraints**:
+- `kind` must be 1 (static) or 2 (heuristic)
+- `reason` must be in range 1-10 (enum defined in code)
+- `weight` must be in range [0.0, 1.0]
+
+**Partitioning**: Hash partition by `scan_id` (for scans with >500k edges)
+
+**Retention**: 90 days
+
+**Notes**:
+- High-volume table (1M+ rows per large scan)
+- Use partial indexes for `kind` to optimize static-only queries
+- Consider GIN index on `(from_node_id, to_node_id)` for bidirectional BFS
+
+---
+
+### 5. entrypoint
+
+**Purpose**: Stores discovered entrypoints (HTTP routes, CLI commands, background jobs).
+
+**Schema**:
+
+| Column | Type | Nullable | Description |
+|--------|------|----------|-------------|
+| `scan_id` | `text` | NOT NULL | Partition key |
+| `node_id` | `text` | NOT NULL | Reference to `cg_node.node_id` |
+| `kind` | `text` | NOT NULL | `http`, `grpc`, `cli`, `job`, `event`, `unknown` |
+| `framework` | `text` | NOT NULL | `aspnetcore`, `spring`, `express`, etc. |
+| `route` | `text` | NULL | HTTP route pattern (e.g., `/api/orders/{id}`) |
+| `metadata` | `jsonb` | NULL | Framework-specific metadata |
+
+**Primary Key**: `(scan_id, node_id, kind, framework, route)`
+
+**Indexes**:
+
+```sql
+CREATE INDEX idx_entrypoint_scan ON scanner.entrypoint(scan_id);
+CREATE INDEX idx_entrypoint_kind ON scanner.entrypoint(scan_id, kind);
+CREATE INDEX idx_entrypoint_framework ON scanner.entrypoint(scan_id, framework);
+```
+
+**Constraints**:
+- `kind` must be one of: `http`, `grpc`, `cli`, `job`, `event`, `unknown`
+- `route` required for `kind='http'` or `kind='grpc'`
+
+**Partitioning**: None (<10k rows per scan)
+
+**Retention**: 90 days
+
+---
+
+### 6. runtime_sample
+
+**Purpose**: Stores runtime profiling samples (stack traces) for reachability validation.
+
+**Schema**:
+
+| Column | Type | Nullable | Description |
+|--------|------|----------|-------------|
+| `scan_id` | `text` | NOT NULL | Partition key (links to scan) |
+| `collected_at` | `timestamptz` | NOT NULL | Sample collection timestamp |
+| `env_hash` | `text` | NOT NULL | Environment hash (k8s ns+pod+container) |
+| `sample_id` | `bigserial` | NOT NULL | Auto-incrementing sample ID |
+| `timestamp` | `timestamptz` | NOT NULL | Sample timestamp |
+| `pid` | `integer` | NOT NULL | Process ID |
+| `thread_id` | `integer` | NOT NULL | Thread ID |
+| `frames` | `text[]` | NOT NULL | Array of node IDs (stack trace) |
+| `weight` | `real` | NOT NULL | Sample weight (1.0 for discrete samples) |
+
+**Primary Key**: `(scan_id, sample_id)`
+
+**Indexes**:
+
+```sql
+CREATE INDEX idx_runtime_sample_scan ON scanner.runtime_sample(scan_id, collected_at DESC);
+CREATE INDEX idx_runtime_sample_frames ON scanner.runtime_sample USING GIN(frames);
+CREATE INDEX idx_runtime_sample_env ON scanner.runtime_sample(scan_id, env_hash);
+```
+
+**Constraints**:
+- `frames` array length must be >0 and <1000
+- `weight` must be >0.0
+
+**Partitioning**: **TIME-BASED** (monthly partitions by `collected_at`)
+
+```sql
+CREATE TABLE scanner.runtime_sample_2025_01 PARTITION OF scanner.runtime_sample
+    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
+```
+
+**Retention**: 90 days (drop old partitions automatically)
+
+**Notes**:
+- **Highest volume table** (10M+ rows for long-running services)
+- GIN index on `frames[]` enables fast "find samples containing node X" queries
+- Partition pruning critical for performance
+
+---
+
+## Enums (Defined in Code)
+
+### cg_edge.kind
+
+| Value | Name | Description |
+|-------|------|-------------|
+| 1 | `static` | Statically proven call edge |
+| 2 | `heuristic` | Heuristic/inferred edge (reflection, DI, dynamic) |
+
+### cg_edge.reason
+
+| Value | Name | Description |
+|-------|------|-------------|
+| 1 | `direct_call` | Direct method invocation |
+| 2 | `virtual_call` | Virtual/interface dispatch |
+| 3 | `reflection_string` | Reflection with string name |
+| 4 | `di_binding` | Dependency injection registration |
+| 5 | `dynamic_import` | Dynamic module import (JS/Python) |
+| 6 | `delegate_invoke` | Delegate/lambda invocation |
+| 7 | `async_await` | Async method call |
+| 8 | `constructor` | Object constructor invocation |
+| 9 | `plt_got` | PLT/GOT indirect call (native binaries) |
+| 10 | `unknown` | Unknown edge type |
+
+### cg_node.flags (Bitfield)
+
+| Bit | Flag | Description |
+|-----|------|-------------|
+| 0 | `IS_ENTRYPOINT_CANDIDATE` | Node could be an entrypoint |
+| 1 | `IS_VIRTUAL` | Virtual or interface method |
+| 2 | `IS_ASYNC` | Async method |
+| 3 | `IS_CONSTRUCTOR` | Constructor method |
+| 4 | `IS_EXPORTED` | Publicly exported (for native binaries) |
+
+---
+
+## Schema Evolution
+
+### Migration Categories
+
+Per `docs/db/SPECIFICATION.md`:
+
+| Category | Prefix | Execution | Description |
+|----------|--------|-----------|-------------|
+| Startup (A) | `001-099` | Automatic at boot | Non-breaking DDL (CREATE IF NOT EXISTS) |
+| Release (B) | `100-199` | Manual via CLI | Breaking changes (requires maintenance window) |
+| Seed | `S001-S999` | After schema | Reference data with ON CONFLICT DO NOTHING |
+| Data (C) | `DM001-DM999` | Background job | Batched data transformations |
+
+### Upcoming Migrations
+
+| Migration | Category | Sprint | Description |
+|-----------|----------|--------|-------------|
+| `010_scanner_schema.sql` | Startup (A) | 3500.0002.0001 | Create scanner schema, scan_manifest, proof_bundle |
+| `011_call_graph_tables.sql` | Startup (A) | 3500.0003.0002 | Create cg_node, cg_edge, entrypoint |
+| `012_runtime_sample_partitions.sql` | Startup (A) | 3500.0003.0004 | Create runtime_sample with monthly partitions |
+| `S001_seed_edge_reasons.sql` | Seed | 3500.0003.0002 | Seed edge reason lookup table |
+
+---
+
+## Performance Considerations
+
+### Query Patterns
+
+**High-frequency queries**:
+
+1. **Scan manifest lookup by artifact**:
+   ```sql
+   SELECT * FROM scanner.scan_manifest
+   WHERE artifact_digest = $1
+   ORDER BY created_at_utc DESC LIMIT 1;
+   ```
+   - Index: `idx_scan_manifest_artifact`
+
+2. **Reachability BFS (forward)**:
+   ```sql
+   SELECT to_node_id FROM scanner.cg_edge
+   WHERE scan_id = $1 AND from_node_id = ANY($2) AND kind = 1;
+   ```
+   - Index: `idx_cg_edge_from`
+
+3. **Reachability BFS (backward)**:
+   ```sql
+   SELECT from_node_id FROM scanner.cg_edge
+   WHERE scan_id = $1 AND to_node_id = $2 AND kind = 1;
+   ```
+   - Index: `idx_cg_edge_to`
+
+4. **Find runtime samples containing node**:
+   ```sql
+   SELECT * FROM scanner.runtime_sample
+   WHERE scan_id = $1 AND $2 = ANY(frames);
+   ```
+   - Index: `idx_runtime_sample_frames` (GIN)
+
+### Index Maintenance
+
+**Reindex schedule**:
+- `cg_edge` indexes: Weekly (high churn)
+- `runtime_sample` GIN index: Monthly (after partition drops)
+
+**Vacuum**:
+- Autovacuum enabled for all tables
+- Manual VACUUM ANALYZE after bulk inserts (>1M rows)
+
+### Partition Management
+
+**Automated partition creation** (cron job):
+
+```sql
+-- Create next month's partition 7 days in advance
+CREATE TABLE IF NOT EXISTS scanner.runtime_sample_2025_02 PARTITION OF scanner.runtime_sample
+    FOR VALUES FROM ('2025-02-01') TO ('2025-03-01');
+```
+
+**Automated partition dropping** (90-day retention):
+
+```sql
+DROP TABLE IF EXISTS scanner.runtime_sample_2024_10; -- Older than 90 days
+```
+
+---
+
+## Compliance & Auditing
+
+### DSSE Signatures
+
+All proof bundles and manifests include DSSE signatures:
+- `manifest_dsse_json` in `scan_manifest`
+- `proof_root_dsse_json` in `proof_bundle`
+
+**Verification**:
+- Signatures verified on read using `IContentSigner.Verify`
+- Invalid signatures → reject proof bundle
+
+### Immutability
+
+**Immutable tables**:
+- `scan_manifest` — No updates allowed after insert
+- `proof_bundle` — No updates allowed after insert
+
+**Enforcement**: Application-level (no UPDATE grants in production)
+
+### Retention Policies
+
+| Table | Retention | Enforcement |
+|-------|-----------|-------------|
+| `scan_manifest` | 180 days | DELETE WHERE created_at_utc < NOW() - INTERVAL '180 days' |
+| `proof_bundle` | 365 days | DELETE WHERE created_at_utc < NOW() - INTERVAL '365 days' |
+| `cg_node` | 90 days | CASCADE delete on scan_manifest |
+| `cg_edge` | 90 days | CASCADE delete on scan_manifest |
+| `runtime_sample` | 90 days | DROP PARTITION (monthly) |
+
+---
+
+## Monitoring
+
+### Key Metrics
+
+1. **Table sizes**:
+   ```sql
+   SELECT schemaname, tablename, pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename))
+   FROM pg_tables WHERE schemaname = 'scanner';
+   ```
+
+2. **Index usage**:
+   ```sql
+   SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
+   FROM pg_stat_user_indexes
+   WHERE schemaname = 'scanner'
+   ORDER BY idx_scan DESC;
+   ```
+
+3. **Partition sizes**:
+   ```sql
+   SELECT tablename, pg_size_pretty(pg_total_relation_size('scanner.'||tablename))
+   FROM pg_tables
+   WHERE schemaname = 'scanner' AND tablename LIKE 'runtime_sample_%'
+   ORDER BY tablename DESC;
+   ```
+
+### Alerts
+
+- **Table growth**: Alert if `cg_edge` >10GB per scan
+- **Index bloat**: Alert if index size >2x expected
+- **Partition creation**: Alert if next month's partition not created 7 days in advance
+- **Vacuum lag**: Alert if last autovacuum >7 days
+
+---
+
+## References
+
+- `docs/07_HIGH_LEVEL_ARCHITECTURE.md` — Schema isolation design
+- `docs/db/SPECIFICATION.md` — Database specification
+- `docs/operations/postgresql-guide.md` — Operations guide
+- `SPRINT_3500_0002_0001_score_proofs_foundations.md` — Implementation sprint
+- `SPRINT_3500_0003_0002_reachability_dotnet_call_graphs.md` — Call-graph implementation
+
+---
+
+**Last Updated**: 2025-12-17
+**Schema Version**: 1.0
+**Next Review**: Sprint 3500.0003.0004 (partition strategy)