# Analytics Query Library This document provides ready-to-use SQL queries for common analytics use cases. All queries are optimized for the analytics star schema. ## Executive Dashboard Queries ### 1. Top Supplier Concentration (Supply Chain Risk) Identifies suppliers with the highest component footprint, indicating supply chain concentration risk. ```sql -- Via stored procedure (recommended, optional environment filter) SELECT analytics.sp_top_suppliers(20, 'prod'); -- Direct query SELECT supplier, component_count, artifact_count, team_count, critical_vuln_count, high_vuln_count, environments FROM analytics.mv_supplier_concentration ORDER BY component_count DESC LIMIT 20; ``` **Use case**: Identify vendors that, if compromised, would affect the most artifacts. ### 2. License Risk Heatmap Shows distribution of components by license category for compliance review. ```sql -- Via stored procedure (optional environment filter) SELECT analytics.sp_license_heatmap('prod'); -- Direct query with grouping SELECT license_category, SUM(component_count) AS total_components, SUM(artifact_count) AS total_artifacts, COUNT(DISTINCT license_concluded) AS unique_licenses FROM analytics.mv_license_distribution GROUP BY license_category ORDER BY CASE license_category WHEN 'copyleft-strong' THEN 1 WHEN 'proprietary' THEN 2 WHEN 'unknown' THEN 3 WHEN 'copyleft-weak' THEN 4 ELSE 5 END; ``` **Use case**: Compliance review, identify components requiring legal review. ### 3. CVE Exposure Adjusted by VEX Shows true vulnerability exposure after applying VEX mitigations. ```sql -- Via stored procedure SELECT analytics.sp_vuln_exposure('prod', 'high'); -- Direct query showing VEX effectiveness (global view; use sp_vuln_exposure for environment filtering) SELECT vuln_id, severity::TEXT, cvss_score, epss_score, kev_listed, fix_available, raw_artifact_count AS total_affected, effective_artifact_count AS actually_affected, raw_artifact_count - effective_artifact_count AS vex_mitigated, ROUND(100.0 * (raw_artifact_count - effective_artifact_count) / NULLIF(raw_artifact_count, 0), 1) AS mitigation_rate FROM analytics.mv_vuln_exposure WHERE effective_artifact_count > 0 ORDER BY CASE severity WHEN 'critical' THEN 1 WHEN 'high' THEN 2 WHEN 'medium' THEN 3 ELSE 4 END, effective_artifact_count DESC LIMIT 50; ``` **Use case**: Show executives the "real" risk after VEX assessment. ### 4. Fixable Vulnerability Backlog Lists vulnerabilities that can be fixed today (fix available, not VEX-mitigated). ```sql -- Via stored procedure SELECT analytics.sp_fixable_backlog('prod'); -- Direct query with priority scoring SELECT a.name AS service, a.environment, a.team, c.name AS component, c.version AS current_version, cv.vuln_id, cv.severity::TEXT, cv.cvss_score, cv.epss_score, cv.fixed_version, cv.kev_listed, -- Priority score: higher = fix first ( CASE cv.severity WHEN 'critical' THEN 100 WHEN 'high' THEN 75 WHEN 'medium' THEN 50 ELSE 25 END + COALESCE(cv.epss_score * 100, 0) + (CASE WHEN cv.kev_listed THEN 50 ELSE 0 END) )::INT AS priority_score FROM analytics.component_vulns cv JOIN analytics.components c ON c.component_id = cv.component_id JOIN analytics.artifact_components ac ON ac.component_id = c.component_id JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id LEFT JOIN analytics.vex_overrides vo ON vo.artifact_id = a.artifact_id AND vo.vuln_id = cv.vuln_id AND vo.status = 'not_affected' AND vo.valid_from <= now() AND (vo.valid_until IS NULL OR vo.valid_until > now()) WHERE cv.affects = TRUE AND cv.fix_available = TRUE AND vo.override_id IS NULL AND a.environment = 'prod' ORDER BY priority_score DESC, a.name LIMIT 100; ``` **Use case**: Prioritize remediation work based on risk and fixability. ### 5. Build Integrity / Attestation Coverage Shows attestation gaps by environment and team. ```sql -- Via stored procedure SELECT analytics.sp_attestation_gaps('prod'); -- Direct query with gap analysis SELECT environment, team, total_artifacts, with_provenance, total_artifacts - with_provenance AS missing_provenance, provenance_pct, slsa_level_2_plus, slsa2_pct, with_sbom_attestation, with_vex_attestation FROM analytics.mv_attestation_coverage WHERE environment = 'prod' ORDER BY provenance_pct ASC; ``` **Use case**: Identify teams/environments not meeting attestation requirements. ## Trend Analysis Queries ### 6. Vulnerability Trend (30 Days) ```sql SELECT snapshot_date, environment, SUM(total_vulns) AS total_vulns, SUM(fixable_vulns) AS fixable_vulns, SUM(vex_mitigated) AS vex_mitigated, SUM(total_vulns) - SUM(vex_mitigated) AS net_exposure, SUM(kev_vulns) AS kev_vulns FROM analytics.daily_vulnerability_counts WHERE snapshot_date >= CURRENT_DATE - INTERVAL '30 days' GROUP BY snapshot_date, environment ORDER BY environment, snapshot_date; ``` ### 7. Vulnerability Trend by Severity ```sql SELECT snapshot_date, severity::TEXT, SUM(total_vulns) AS total_vulns FROM analytics.daily_vulnerability_counts WHERE snapshot_date >= CURRENT_DATE - INTERVAL '30 days' AND environment = 'prod' GROUP BY snapshot_date, severity ORDER BY snapshot_date, CASE severity WHEN 'critical' THEN 1 WHEN 'high' THEN 2 WHEN 'medium' THEN 3 ELSE 4 END; ``` ### 8. Component Growth Trend ```sql SELECT snapshot_date, environment, SUM(total_components) AS total_components, SUM(unique_suppliers) AS unique_suppliers FROM analytics.daily_component_counts WHERE snapshot_date >= CURRENT_DATE - INTERVAL '30 days' GROUP BY snapshot_date, environment ORDER BY environment, snapshot_date; ``` ## Deep-Dive Queries ### 9. Component Impact Analysis Find all artifacts affected by a specific component. ```sql SELECT a.name AS artifact, a.version, a.environment, a.team, ac.depth AS dependency_depth, ac.introduced_via FROM analytics.components c JOIN analytics.artifact_components ac ON ac.component_id = c.component_id JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id WHERE c.purl LIKE 'pkg:maven/org.apache.logging.log4j/log4j-core%' ORDER BY a.environment, a.name; ``` ### 10. CVE Impact Analysis Find all artifacts affected by a specific CVE. ```sql SELECT DISTINCT a.name AS artifact, a.version, a.environment, a.team, c.name AS component, c.version AS component_version, cv.cvss_score, cv.fixed_version, CASE WHEN vo.status = 'not_affected' THEN 'VEX Mitigated' WHEN cv.fix_available THEN 'Fix Available' ELSE 'Vulnerable' END AS status FROM analytics.component_vulns cv JOIN analytics.components c ON c.component_id = cv.component_id JOIN analytics.artifact_components ac ON ac.component_id = c.component_id JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id LEFT JOIN analytics.vex_overrides vo ON vo.artifact_id = a.artifact_id AND vo.vuln_id = cv.vuln_id AND vo.valid_from <= now() AND (vo.valid_until IS NULL OR vo.valid_until > now()) WHERE cv.vuln_id = 'CVE-2021-44228' ORDER BY a.environment, a.name; ``` ### 11. Supplier Vulnerability Profile Detailed vulnerability breakdown for a specific supplier. ```sql SELECT c.supplier_normalized AS supplier, c.name AS component, c.version, cv.vuln_id, cv.severity::TEXT, cv.cvss_score, cv.kev_listed, cv.fix_available, cv.fixed_version FROM analytics.components c JOIN analytics.component_vulns cv ON cv.component_id = c.component_id WHERE c.supplier_normalized = 'apache software foundation' AND cv.affects = TRUE ORDER BY CASE cv.severity WHEN 'critical' THEN 1 WHEN 'high' THEN 2 ELSE 3 END, cv.cvss_score DESC; ``` ### 12. License Compliance Report Components with concerning licenses in production. ```sql SELECT c.name AS component, c.version, c.license_concluded, c.license_category::TEXT, c.supplier_normalized AS supplier, COUNT(DISTINCT a.artifact_id) AS artifact_count, ARRAY_AGG(DISTINCT a.name ORDER BY a.name) AS affected_artifacts FROM analytics.components c JOIN analytics.artifact_components ac ON ac.component_id = c.component_id JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id WHERE c.license_category IN ('copyleft-strong', 'proprietary', 'unknown') AND a.environment = 'prod' GROUP BY c.component_id, c.name, c.version, c.license_concluded, c.license_category, c.supplier_normalized ORDER BY c.license_category, artifact_count DESC; ``` ### 13. MTTR Analysis Mean time to remediate by severity. ```sql SELECT cv.severity::TEXT, COUNT(*) AS remediated_vulns, AVG(EXTRACT(EPOCH FROM (vo.valid_from - cv.published_at)) / 86400)::NUMERIC(10,2) AS avg_days_to_mitigate, PERCENTILE_CONT(0.5) WITHIN GROUP ( ORDER BY EXTRACT(EPOCH FROM (vo.valid_from - cv.published_at)) / 86400 )::NUMERIC(10,2) AS median_days, PERCENTILE_CONT(0.9) WITHIN GROUP ( ORDER BY EXTRACT(EPOCH FROM (vo.valid_from - cv.published_at)) / 86400 )::NUMERIC(10,2) AS p90_days FROM analytics.component_vulns cv JOIN analytics.vex_overrides vo ON vo.vuln_id = cv.vuln_id AND vo.status = 'not_affected' AND vo.valid_from <= now() AND (vo.valid_until IS NULL OR vo.valid_until > now()) WHERE cv.published_at >= now() - INTERVAL '90 days' AND cv.published_at IS NOT NULL GROUP BY cv.severity ORDER BY CASE cv.severity WHEN 'critical' THEN 1 WHEN 'high' THEN 2 WHEN 'medium' THEN 3 ELSE 4 END; ``` ### 14. Transitive Dependency Risk Components introduced through transitive dependencies. ```sql SELECT c.name AS transitive_component, c.version, ac.introduced_via AS direct_dependency, ac.depth, COUNT(DISTINCT cv.vuln_id) AS vuln_count, SUM(CASE WHEN cv.severity = 'critical' THEN 1 ELSE 0 END) AS critical_count, COUNT(DISTINCT a.artifact_id) AS affected_artifacts FROM analytics.components c JOIN analytics.artifact_components ac ON ac.component_id = c.component_id JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id LEFT JOIN analytics.component_vulns cv ON cv.component_id = c.component_id AND cv.affects = TRUE WHERE ac.depth > 0 -- Transitive only AND a.environment = 'prod' GROUP BY c.component_id, c.name, c.version, ac.introduced_via, ac.depth HAVING COUNT(cv.vuln_id) > 0 ORDER BY critical_count DESC, vuln_count DESC LIMIT 50; ``` ### 15. VEX Effectiveness Report How effective is the VEX program at reducing noise? ```sql SELECT DATE_TRUNC('week', vo.created_at)::DATE AS week, COUNT(*) AS total_overrides, COUNT(*) FILTER (WHERE vo.status = 'not_affected') AS not_affected, COUNT(*) FILTER (WHERE vo.status = 'affected') AS confirmed_affected, COUNT(*) FILTER (WHERE vo.status = 'under_investigation') AS under_investigation, COUNT(*) FILTER (WHERE vo.status = 'fixed') AS marked_fixed, -- Noise reduction rate ROUND(100.0 * COUNT(*) FILTER (WHERE vo.status = 'not_affected') / NULLIF(COUNT(*), 0), 1) AS noise_reduction_pct FROM analytics.vex_overrides vo WHERE vo.created_at >= now() - INTERVAL '90 days' GROUP BY DATE_TRUNC('week', vo.created_at) ORDER BY week; ``` ## Performance Tips 1. **Use materialized views**: Queries prefixed with `mv_` are pre-computed and fast 2. **Add environment filter**: Most queries benefit from `WHERE environment = 'prod'` 3. **Use stored procedures**: `sp_*` functions return JSON and handle caching 4. **Limit results**: Always use `LIMIT` for large result sets 5. **Check refresh times**: Views are refreshed daily; data may be up to 24h stale ## Query Parameters Common filter parameters: | Parameter | Type | Example | Notes | |-----------|------|---------|-------| | `environment` | TEXT | `'prod'`, `'stage'` | Filter by deployment environment | | `team` | TEXT | `'platform'` | Filter by owning team | | `severity` | TEXT | `'critical'`, `'high'` | Minimum severity level | | `days` | INT | `30`, `90` | Lookback period | | `limit` | INT | `20`, `100` | Max results |