Files
git.stella-ops.org/docs/db/schemas/scan-metrics.md
master 415eff1207
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
feat(metrics): Implement scan metrics repository and PostgreSQL integration
- Added IScanMetricsRepository interface for scan metrics persistence and retrieval.
- Implemented PostgresScanMetricsRepository for PostgreSQL database interactions, including methods for saving and retrieving scan metrics and execution phases.
- Introduced methods for obtaining TTE statistics and recent scans for tenants.
- Implemented deletion of old metrics for retention purposes.

test(tests): Add SCA Failure Catalogue tests for FC6-FC10

- Created ScaCatalogueDeterminismTests to validate determinism properties of SCA Failure Catalogue fixtures.
- Developed ScaFailureCatalogueTests to ensure correct handling of specific failure modes in the scanner.
- Included tests for manifest validation, file existence, and expected findings across multiple failure cases.

feat(telemetry): Integrate scan completion metrics into the pipeline

- Introduced IScanCompletionMetricsIntegration interface and ScanCompletionMetricsIntegration class to record metrics upon scan completion.
- Implemented proof coverage and TTE metrics recording with logging for scan completion summaries.
2025-12-16 14:00:35 +02:00

5.6 KiB

Scan Metrics Schema

Sprint: SPRINT_3406_0001_0001_metrics_tables
Task: METRICS-3406-013
Working Directory: src/Scanner/__Libraries/StellaOps.Scanner.Storage/Postgres/Migrations/

Overview

The scan metrics schema provides relational PostgreSQL tables for tracking Time-to-Evidence (TTE) and scan performance metrics. This is a hybrid approach where metrics are stored in PostgreSQL while replay manifests remain in the document store.

Tables

scanner.scan_metrics

Primary table for per-scan metrics.

Column Type Description
metrics_id UUID Primary key
scan_id UUID Unique scan identifier
tenant_id UUID Tenant identifier
surface_id UUID Optional attack surface identifier
artifact_digest TEXT Artifact content hash
artifact_type TEXT Type: oci_image, tarball, directory, other
replay_manifest_hash TEXT Reference to replay manifest in document store
findings_sha256 TEXT Findings content hash
vex_bundle_sha256 TEXT VEX bundle content hash
proof_bundle_sha256 TEXT Proof bundle content hash
sbom_sha256 TEXT SBOM content hash
policy_digest TEXT Policy version hash
feed_snapshot_id TEXT Feed snapshot identifier
started_at TIMESTAMPTZ Scan start time
finished_at TIMESTAMPTZ Scan completion time
total_duration_ms INT TTE in milliseconds (generated)
t_ingest_ms INT Ingest phase duration
t_analyze_ms INT Analyze phase duration
t_reachability_ms INT Reachability phase duration
t_vex_ms INT VEX phase duration
t_sign_ms INT Sign phase duration
t_publish_ms INT Publish phase duration
package_count INT Number of packages analyzed
finding_count INT Number of findings
vex_decision_count INT Number of VEX decisions
scanner_version TEXT Scanner version
scanner_image_digest TEXT Scanner container digest
is_replay BOOLEAN Replay mode flag
created_at TIMESTAMPTZ Record creation time

scanner.execution_phases

Detailed phase execution tracking.

Column Type Description
id BIGSERIAL Primary key
metrics_id UUID Foreign key to scan_metrics
phase_name TEXT Phase: ingest, analyze, reachability, vex, sign, publish, other
phase_order INT Execution order
started_at TIMESTAMPTZ Phase start time
finished_at TIMESTAMPTZ Phase completion time
duration_ms INT Duration in milliseconds (generated)
success BOOLEAN Phase success status
error_code TEXT Error code if failed
error_message TEXT Error message if failed
phase_metrics JSONB Phase-specific metrics

Views

scanner.scan_tte

Time-to-Evidence view with phase breakdowns.

SELECT
    metrics_id,
    scan_id,
    tte_ms,
    tte_seconds,
    ingest_percent,
    analyze_percent,
    reachability_percent,
    vex_percent,
    sign_percent,
    publish_percent
FROM scanner.scan_tte
WHERE tenant_id = :tenant_id;

scanner.tte_stats

Hourly TTE statistics with SLO compliance.

SELECT
    hour_bucket,
    scan_count,
    tte_avg_ms,
    tte_p50_ms,
    tte_p95_ms,
    slo_p50_compliance_percent,
    slo_p95_compliance_percent
FROM scanner.tte_stats
WHERE tenant_id = :tenant_id;

Functions

scanner.tte_percentile

Calculate TTE percentile for a tenant.

SELECT scanner.tte_percentile(
    p_tenant_id := :tenant_id,
    p_percentile := 0.95,
    p_since := NOW() - INTERVAL '7 days'
);

Indexes

Index Columns Purpose
idx_scan_metrics_tenant tenant_id Tenant queries
idx_scan_metrics_artifact artifact_digest Artifact lookups
idx_scan_metrics_started started_at Time-range queries
idx_scan_metrics_surface surface_id Surface queries
idx_scan_metrics_replay is_replay Filter replays
idx_scan_metrics_tenant_started tenant_id, started_at Compound tenant+time
idx_execution_phases_metrics metrics_id Phase lookups
idx_execution_phases_name phase_name Phase filtering

SLO Thresholds

Per the advisory section 13.1:

Metric Target
TTE P50 < 120 seconds
TTE P95 < 300 seconds

Usage Examples

Get TTE for recent scans

SELECT scan_id, tte_ms, tte_seconds
FROM scanner.scan_tte
WHERE tenant_id = :tenant_id
  AND NOT is_replay
ORDER BY started_at DESC
LIMIT 100;

Check SLO compliance

SELECT
    hour_bucket,
    slo_p50_compliance_percent,
    slo_p95_compliance_percent
FROM scanner.tte_stats
WHERE tenant_id = :tenant_id
  AND hour_bucket >= NOW() - INTERVAL '24 hours';

Phase breakdown analysis

SELECT
    phase_name,
    AVG(duration_ms) as avg_ms,
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY duration_ms) as p95_ms
FROM scanner.execution_phases ep
JOIN scanner.scan_metrics sm ON ep.metrics_id = sm.metrics_id
WHERE sm.tenant_id = :tenant_id
  AND sm.started_at >= NOW() - INTERVAL '7 days'
GROUP BY phase_name
ORDER BY phase_order;

Migration

Migration file: src/Scanner/__Libraries/StellaOps.Scanner.Storage/Postgres/Migrations/004_scan_metrics.sql

Apply with:

psql -d stellaops -f 004_scan_metrics.sql