feat(metrics): Implement scan metrics repository and PostgreSQL integration
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
- Added IScanMetricsRepository interface for scan metrics persistence and retrieval.
- Implemented PostgresScanMetricsRepository for PostgreSQL database interactions, including methods for saving and retrieving scan metrics and execution phases.
- Introduced methods for obtaining TTE statistics and recent scans for tenants.
- Implemented deletion of old metrics for retention purposes.

test(tests): Add SCA Failure Catalogue tests for FC6-FC10

- Created ScaCatalogueDeterminismTests to validate determinism properties of SCA Failure Catalogue fixtures.
- Developed ScaFailureCatalogueTests to ensure correct handling of specific failure modes in the scanner.
- Included tests for manifest validation, file existence, and expected findings across multiple failure cases.

feat(telemetry): Integrate scan completion metrics into the pipeline

- Introduced IScanCompletionMetricsIntegration interface and ScanCompletionMetricsIntegration class to record metrics upon scan completion.
- Implemented proof coverage and TTE metrics recording with logging for scan completion summaries.
@@ -1173,6 +1173,67 @@ CREATE INDEX idx_metadata_active ON scheduler.runs USING GIN (stats)
WHERE state = 'completed';
```

### 6.4 Generated Columns for JSONB Hot Keys

For frequently queried JSONB fields, use PostgreSQL generated columns to enable efficient B-tree indexing and to give the query planner ordinary per-column statistics.

**Problem with expression indexes:**
```sql
-- Statistics exist only for this exact indexed expression (collected by ANALYZE once the index exists)
CREATE INDEX idx_format ON scanner.sbom_documents ((doc->>'bomFormat'));
-- Predicates that do not use the same expression get default selectivity estimates,
-- so the planner may misjudge cardinality and choose suboptimal plans
```

**Solution: Generated columns (PostgreSQL 12+):**
```sql
-- Add generated column that extracts JSONB field
ALTER TABLE scanner.sbom_documents
    ADD COLUMN bom_format TEXT GENERATED ALWAYS AS ((doc->>'bomFormat')) STORED;

-- Standard B-tree index with full statistics
CREATE INDEX idx_sbom_bom_format ON scanner.sbom_documents(bom_format);
```

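As a rough sketch of how the new column might be used and inspected: the query below filters on `bom_format` directly rather than on the JSONB expression, and the second statement reads the planner statistics for that column from the standard `pg_stats` view. The literal `'CycloneDX'` is only an illustrative value, and the statistics row appears only after `ANALYZE` has run on the table.

```sql
-- Filter on the generated column so the planner can match idx_sbom_bom_format
-- and use the column's own statistics. 'CycloneDX' is an illustrative value.
EXPLAIN
SELECT count(*)
FROM scanner.sbom_documents
WHERE bom_format = 'CycloneDX';

-- After ANALYZE, the generated column shows up in pg_stats like any other column.
SELECT attname, n_distinct, most_common_vals
FROM pg_stats
WHERE schemaname = 'scanner'
  AND tablename  = 'sbom_documents'
  AND attname    = 'bom_format';
```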
**Benefits:**
- **B-tree indexable**: A standard index on the generated column
- **Statistics**: `ANALYZE` collects distinct-value estimates, most-common values (MCV), and a histogram
- **Index-only scans**: The column can be a key or `INCLUDE` column in a covering index
- **Zero write-path changes**: PostgreSQL computes the column on insert and update; read queries filter on the new column to use its index

**When to use generated columns:**
- The field appears in more than 10% of queries against the table
- The field has more than 100 distinct values (worth collecting statistics)
- The field is used in JOIN conditions or GROUP BY
- Index-only scans would be beneficial

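One possible way to check the distinct-value criterion before promoting a key is to count its values directly. This is a minimal sketch that assumes the `scheduler.runs` table and its `stats` JSONB column used elsewhere in this section; on very large tables a sampled query may be preferable to a full scan.

```sql
-- Count distinct values of a candidate JSONB key before adding a generated column.
-- Rows where the key is absent yield NULL and are ignored by count(DISTINCT ...).
SELECT count(DISTINCT stats->>'findingCount') AS distinct_values,
       count(*)                               AS total_rows
FROM scheduler.runs;
```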
**Naming convention:**
```
<json_path_snake_case>
Examples:
doc->>'bomFormat'      → bom_format
raw->>'schemaVersion'  → schema_version
stats->>'findingCount' → finding_count
```

**Migration pattern:**
```sql
-- Step 1: Add generated column (a STORED column rewrites the table under an
-- ACCESS EXCLUSIVE lock, so run this in a low-traffic window)
ALTER TABLE scheduler.runs
    ADD COLUMN finding_count INT GENERATED ALWAYS AS ((stats->>'findingCount')::int) STORED;

-- Step 2: Create index concurrently (must run outside a transaction block)
CREATE INDEX CONCURRENTLY idx_runs_finding_count
    ON scheduler.runs(tenant_id, finding_count);

-- Step 3: Analyze for statistics
ANALYZE scheduler.runs;
```

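After a migration like the one above, it can be worth confirming the result from the catalogs. The following is a sketch using the standard `information_schema.columns` view and the `pg_index` catalog; the object names are the ones from the migration pattern. An index left with `indisvalid = false` usually means the concurrent build was interrupted and the index needs to be dropped and recreated.

```sql
-- Confirm the column is registered as a generated column and inspect its expression.
SELECT column_name, is_generated, generation_expression
FROM information_schema.columns
WHERE table_schema = 'scheduler'
  AND table_name   = 'runs'
  AND column_name  = 'finding_count';

-- Confirm the concurrent index build finished in a valid state.
SELECT c.relname AS index_name, i.indisvalid
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE c.relname = 'idx_runs_finding_count';
```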
**Reference implementations:**
- `src/Scheduler/...Storage.Postgres/Migrations/010_generated_columns_runs.sql`
- `src/Excititor/...Storage.Postgres/Migrations/004_generated_columns_vex.sql`
- `src/Concelier/...Storage.Postgres/Migrations/007_generated_columns_advisories.sql`

---

## 7. Partitioning Strategy
