423 lines
12 KiB
Markdown
423 lines
12 KiB
Markdown
# Analytics Query Library
|
|
|
|
This document provides ready-to-use SQL queries for common analytics use cases. All queries are optimized for the analytics star schema.
|
|
|
|
## Executive Dashboard Queries
|
|
|
|
### 1. Top Supplier Concentration (Supply Chain Risk)
|
|
|
|
Identifies suppliers with the highest component footprint, indicating supply chain concentration risk.
|
|
|
|
```sql
|
|
-- Via stored procedure (recommended, optional environment filter)
|
|
SELECT analytics.sp_top_suppliers(20, 'prod');
|
|
|
|
-- Direct query
|
|
SELECT
|
|
supplier,
|
|
component_count,
|
|
artifact_count,
|
|
team_count,
|
|
critical_vuln_count,
|
|
high_vuln_count,
|
|
environments
|
|
FROM analytics.mv_supplier_concentration
|
|
ORDER BY component_count DESC
|
|
LIMIT 20;
|
|
```
|
|
|
|
**Use case**: Identify vendors that, if compromised, would affect the most artifacts.
|
|
|
|
### 2. License Risk Heatmap
|
|
|
|
Shows distribution of components by license category for compliance review.
|
|
|
|
```sql
|
|
-- Via stored procedure (optional environment filter)
|
|
SELECT analytics.sp_license_heatmap('prod');
|
|
|
|
-- Direct query with grouping
|
|
SELECT
|
|
license_category,
|
|
SUM(component_count) AS total_components,
|
|
SUM(artifact_count) AS total_artifacts,
|
|
COUNT(DISTINCT license_concluded) AS unique_licenses
|
|
FROM analytics.mv_license_distribution
|
|
GROUP BY license_category
|
|
ORDER BY
|
|
CASE license_category
|
|
WHEN 'copyleft-strong' THEN 1
|
|
WHEN 'proprietary' THEN 2
|
|
WHEN 'unknown' THEN 3
|
|
WHEN 'copyleft-weak' THEN 4
|
|
ELSE 5
|
|
END;
|
|
```
|
|
|
|
**Use case**: Compliance review, identify components requiring legal review.
|
|
|
|
### 3. CVE Exposure Adjusted by VEX
|
|
|
|
Shows true vulnerability exposure after applying VEX mitigations.
|
|
|
|
```sql
|
|
-- Via stored procedure
|
|
SELECT analytics.sp_vuln_exposure('prod', 'high');
|
|
|
|
-- Direct query showing VEX effectiveness (global view; use sp_vuln_exposure for environment filtering)
|
|
SELECT
|
|
vuln_id,
|
|
severity::TEXT,
|
|
cvss_score,
|
|
epss_score,
|
|
kev_listed,
|
|
fix_available,
|
|
raw_artifact_count AS total_affected,
|
|
effective_artifact_count AS actually_affected,
|
|
raw_artifact_count - effective_artifact_count AS vex_mitigated,
|
|
ROUND(100.0 * (raw_artifact_count - effective_artifact_count) / NULLIF(raw_artifact_count, 0), 1) AS mitigation_rate
|
|
FROM analytics.mv_vuln_exposure
|
|
WHERE effective_artifact_count > 0
|
|
ORDER BY
|
|
CASE severity
|
|
WHEN 'critical' THEN 1
|
|
WHEN 'high' THEN 2
|
|
WHEN 'medium' THEN 3
|
|
ELSE 4
|
|
END,
|
|
effective_artifact_count DESC
|
|
LIMIT 50;
|
|
```
|
|
|
|
**Use case**: Show executives the "real" risk after VEX assessment.
|
|
|
|
### 4. Fixable Vulnerability Backlog
|
|
|
|
Lists vulnerabilities that can be fixed today (fix available, not VEX-mitigated).
|
|
|
|
```sql
|
|
-- Via stored procedure
|
|
SELECT analytics.sp_fixable_backlog('prod');
|
|
|
|
-- Direct query with priority scoring
|
|
SELECT
|
|
a.name AS service,
|
|
a.environment,
|
|
a.team,
|
|
c.name AS component,
|
|
c.version AS current_version,
|
|
cv.vuln_id,
|
|
cv.severity::TEXT,
|
|
cv.cvss_score,
|
|
cv.epss_score,
|
|
cv.fixed_version,
|
|
cv.kev_listed,
|
|
-- Priority score: higher = fix first
|
|
(
|
|
CASE cv.severity
|
|
WHEN 'critical' THEN 100
|
|
WHEN 'high' THEN 75
|
|
WHEN 'medium' THEN 50
|
|
ELSE 25
|
|
END
|
|
+ COALESCE(cv.epss_score * 100, 0)
|
|
+ (CASE WHEN cv.kev_listed THEN 50 ELSE 0 END)
|
|
)::INT AS priority_score
|
|
FROM analytics.component_vulns cv
|
|
JOIN analytics.components c ON c.component_id = cv.component_id
|
|
JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
|
|
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
|
|
LEFT JOIN analytics.vex_overrides vo ON vo.artifact_id = a.artifact_id
|
|
AND vo.vuln_id = cv.vuln_id
|
|
AND vo.status = 'not_affected'
|
|
AND vo.valid_from <= now()
|
|
AND (vo.valid_until IS NULL OR vo.valid_until > now())
|
|
WHERE cv.affects = TRUE
|
|
AND cv.fix_available = TRUE
|
|
AND vo.override_id IS NULL
|
|
AND a.environment = 'prod'
|
|
ORDER BY priority_score DESC, a.name
|
|
LIMIT 100;
|
|
```
|
|
|
|
**Use case**: Prioritize remediation work based on risk and fixability.
|
|
|
|
### 5. Build Integrity / Attestation Coverage
|
|
|
|
Shows attestation gaps by environment and team.
|
|
|
|
```sql
|
|
-- Via stored procedure
|
|
SELECT analytics.sp_attestation_gaps('prod');
|
|
|
|
-- Direct query with gap analysis
|
|
SELECT
|
|
environment,
|
|
team,
|
|
total_artifacts,
|
|
with_provenance,
|
|
total_artifacts - with_provenance AS missing_provenance,
|
|
provenance_pct,
|
|
slsa_level_2_plus,
|
|
slsa2_pct,
|
|
with_sbom_attestation,
|
|
with_vex_attestation
|
|
FROM analytics.mv_attestation_coverage
|
|
WHERE environment = 'prod'
|
|
ORDER BY provenance_pct ASC;
|
|
```
|
|
|
|
**Use case**: Identify teams/environments not meeting attestation requirements.
|
|
|
|
## Trend Analysis Queries
|
|
|
|
### 6. Vulnerability Trend (30 Days)
|
|
|
|
```sql
|
|
SELECT
|
|
snapshot_date,
|
|
environment,
|
|
SUM(total_vulns) AS total_vulns,
|
|
SUM(fixable_vulns) AS fixable_vulns,
|
|
SUM(vex_mitigated) AS vex_mitigated,
|
|
SUM(total_vulns) - SUM(vex_mitigated) AS net_exposure,
|
|
SUM(kev_vulns) AS kev_vulns
|
|
FROM analytics.daily_vulnerability_counts
|
|
WHERE snapshot_date >= CURRENT_DATE - INTERVAL '30 days'
|
|
GROUP BY snapshot_date, environment
|
|
ORDER BY environment, snapshot_date;
|
|
```
|
|
|
|
### 7. Vulnerability Trend by Severity
|
|
|
|
```sql
|
|
SELECT
|
|
snapshot_date,
|
|
severity::TEXT,
|
|
SUM(total_vulns) AS total_vulns
|
|
FROM analytics.daily_vulnerability_counts
|
|
WHERE snapshot_date >= CURRENT_DATE - INTERVAL '30 days'
|
|
AND environment = 'prod'
|
|
GROUP BY snapshot_date, severity
|
|
ORDER BY snapshot_date,
|
|
CASE severity
|
|
WHEN 'critical' THEN 1
|
|
WHEN 'high' THEN 2
|
|
WHEN 'medium' THEN 3
|
|
ELSE 4
|
|
END;
|
|
```
|
|
|
|
### 8. Component Growth Trend
|
|
|
|
```sql
|
|
SELECT
|
|
snapshot_date,
|
|
environment,
|
|
SUM(total_components) AS total_components,
|
|
SUM(unique_suppliers) AS unique_suppliers
|
|
FROM analytics.daily_component_counts
|
|
WHERE snapshot_date >= CURRENT_DATE - INTERVAL '30 days'
|
|
GROUP BY snapshot_date, environment
|
|
ORDER BY environment, snapshot_date;
|
|
```
|
|
|
|
## Deep-Dive Queries
|
|
|
|
### 9. Component Impact Analysis
|
|
|
|
Find all artifacts affected by a specific component.
|
|
|
|
```sql
|
|
SELECT
|
|
a.name AS artifact,
|
|
a.version,
|
|
a.environment,
|
|
a.team,
|
|
ac.depth AS dependency_depth,
|
|
ac.introduced_via
|
|
FROM analytics.components c
|
|
JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
|
|
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
|
|
WHERE c.purl LIKE 'pkg:maven/org.apache.logging.log4j/log4j-core%'
|
|
ORDER BY a.environment, a.name;
|
|
```
|
|
|
|
### 10. CVE Impact Analysis
|
|
|
|
Find all artifacts affected by a specific CVE.
|
|
|
|
```sql
|
|
SELECT DISTINCT
|
|
a.name AS artifact,
|
|
a.version,
|
|
a.environment,
|
|
a.team,
|
|
c.name AS component,
|
|
c.version AS component_version,
|
|
cv.cvss_score,
|
|
cv.fixed_version,
|
|
CASE
|
|
WHEN vo.status = 'not_affected' THEN 'VEX Mitigated'
|
|
WHEN cv.fix_available THEN 'Fix Available'
|
|
ELSE 'Vulnerable'
|
|
END AS status
|
|
FROM analytics.component_vulns cv
|
|
JOIN analytics.components c ON c.component_id = cv.component_id
|
|
JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
|
|
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
|
|
LEFT JOIN analytics.vex_overrides vo ON vo.artifact_id = a.artifact_id
|
|
AND vo.vuln_id = cv.vuln_id
|
|
AND vo.valid_from <= now()
|
|
AND (vo.valid_until IS NULL OR vo.valid_until > now())
|
|
WHERE cv.vuln_id = 'CVE-2021-44228'
|
|
ORDER BY a.environment, a.name;
|
|
```
|
|
|
|
### 11. Supplier Vulnerability Profile
|
|
|
|
Detailed vulnerability breakdown for a specific supplier.
|
|
|
|
```sql
|
|
SELECT
|
|
c.supplier_normalized AS supplier,
|
|
c.name AS component,
|
|
c.version,
|
|
cv.vuln_id,
|
|
cv.severity::TEXT,
|
|
cv.cvss_score,
|
|
cv.kev_listed,
|
|
cv.fix_available,
|
|
cv.fixed_version
|
|
FROM analytics.components c
|
|
JOIN analytics.component_vulns cv ON cv.component_id = c.component_id
|
|
WHERE c.supplier_normalized = 'apache software foundation'
|
|
AND cv.affects = TRUE
|
|
ORDER BY
|
|
CASE cv.severity
|
|
WHEN 'critical' THEN 1
|
|
WHEN 'high' THEN 2
|
|
ELSE 3
|
|
END,
|
|
cv.cvss_score DESC;
|
|
```
|
|
|
|
### 12. License Compliance Report
|
|
|
|
Components with concerning licenses in production.
|
|
|
|
```sql
|
|
SELECT
|
|
c.name AS component,
|
|
c.version,
|
|
c.license_concluded,
|
|
c.license_category::TEXT,
|
|
c.supplier_normalized AS supplier,
|
|
COUNT(DISTINCT a.artifact_id) AS artifact_count,
|
|
ARRAY_AGG(DISTINCT a.name ORDER BY a.name) AS affected_artifacts
|
|
FROM analytics.components c
|
|
JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
|
|
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
|
|
WHERE c.license_category IN ('copyleft-strong', 'proprietary', 'unknown')
|
|
AND a.environment = 'prod'
|
|
GROUP BY c.component_id, c.name, c.version, c.license_concluded, c.license_category, c.supplier_normalized
|
|
ORDER BY c.license_category, artifact_count DESC;
|
|
```
|
|
|
|
### 13. MTTR Analysis
|
|
|
|
Mean time to remediate by severity.
|
|
|
|
```sql
|
|
SELECT
|
|
cv.severity::TEXT,
|
|
COUNT(*) AS remediated_vulns,
|
|
AVG(EXTRACT(EPOCH FROM (vo.valid_from - cv.published_at)) / 86400)::NUMERIC(10,2) AS avg_days_to_mitigate,
|
|
PERCENTILE_CONT(0.5) WITHIN GROUP (
|
|
ORDER BY EXTRACT(EPOCH FROM (vo.valid_from - cv.published_at)) / 86400
|
|
)::NUMERIC(10,2) AS median_days,
|
|
PERCENTILE_CONT(0.9) WITHIN GROUP (
|
|
ORDER BY EXTRACT(EPOCH FROM (vo.valid_from - cv.published_at)) / 86400
|
|
)::NUMERIC(10,2) AS p90_days
|
|
FROM analytics.component_vulns cv
|
|
JOIN analytics.vex_overrides vo ON vo.vuln_id = cv.vuln_id
|
|
AND vo.status = 'not_affected'
|
|
AND vo.valid_from <= now()
|
|
AND (vo.valid_until IS NULL OR vo.valid_until > now())
|
|
WHERE cv.published_at >= now() - INTERVAL '90 days'
|
|
AND cv.published_at IS NOT NULL
|
|
GROUP BY cv.severity
|
|
ORDER BY
|
|
CASE cv.severity
|
|
WHEN 'critical' THEN 1
|
|
WHEN 'high' THEN 2
|
|
WHEN 'medium' THEN 3
|
|
ELSE 4
|
|
END;
|
|
```
|
|
|
|
### 14. Transitive Dependency Risk
|
|
|
|
Components introduced through transitive dependencies.
|
|
|
|
```sql
|
|
SELECT
|
|
c.name AS transitive_component,
|
|
c.version,
|
|
ac.introduced_via AS direct_dependency,
|
|
ac.depth,
|
|
COUNT(DISTINCT cv.vuln_id) AS vuln_count,
|
|
SUM(CASE WHEN cv.severity = 'critical' THEN 1 ELSE 0 END) AS critical_count,
|
|
COUNT(DISTINCT a.artifact_id) AS affected_artifacts
|
|
FROM analytics.components c
|
|
JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
|
|
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
|
|
LEFT JOIN analytics.component_vulns cv ON cv.component_id = c.component_id AND cv.affects = TRUE
|
|
WHERE ac.depth > 0 -- Transitive only
|
|
AND a.environment = 'prod'
|
|
GROUP BY c.component_id, c.name, c.version, ac.introduced_via, ac.depth
|
|
HAVING COUNT(cv.vuln_id) > 0
|
|
ORDER BY critical_count DESC, vuln_count DESC
|
|
LIMIT 50;
|
|
```
|
|
|
|
### 15. VEX Effectiveness Report
|
|
|
|
How effective is the VEX program at reducing noise?
|
|
|
|
```sql
|
|
SELECT
|
|
DATE_TRUNC('week', vo.created_at)::DATE AS week,
|
|
COUNT(*) AS total_overrides,
|
|
COUNT(*) FILTER (WHERE vo.status = 'not_affected') AS not_affected,
|
|
COUNT(*) FILTER (WHERE vo.status = 'affected') AS confirmed_affected,
|
|
COUNT(*) FILTER (WHERE vo.status = 'under_investigation') AS under_investigation,
|
|
COUNT(*) FILTER (WHERE vo.status = 'fixed') AS marked_fixed,
|
|
-- Noise reduction rate
|
|
ROUND(100.0 * COUNT(*) FILTER (WHERE vo.status = 'not_affected') / NULLIF(COUNT(*), 0), 1) AS noise_reduction_pct
|
|
FROM analytics.vex_overrides vo
|
|
WHERE vo.created_at >= now() - INTERVAL '90 days'
|
|
GROUP BY DATE_TRUNC('week', vo.created_at)
|
|
ORDER BY week;
|
|
```
|
|
|
|
## Performance Tips
|
|
|
|
1. **Use materialized views**: Queries prefixed with `mv_` are pre-computed and fast
|
|
2. **Add environment filter**: Most queries benefit from `WHERE environment = 'prod'`
|
|
3. **Use stored procedures**: `sp_*` functions return JSON and handle caching
|
|
4. **Limit results**: Always use `LIMIT` for large result sets
|
|
5. **Check refresh times**: Views are refreshed daily; data may be up to 24h stale
|
|
|
|
## Query Parameters
|
|
|
|
Common filter parameters:
|
|
|
|
| Parameter | Type | Example | Notes |
|
|
|-----------|------|---------|-------|
|
|
| `environment` | TEXT | `'prod'`, `'stage'` | Filter by deployment environment |
|
|
| `team` | TEXT | `'platform'` | Filter by owning team |
|
|
| `severity` | TEXT | `'critical'`, `'high'` | Minimum severity level |
|
|
| `days` | INT | `30`, `90` | Lookback period |
|
|
| `limit` | INT | `20`, `100` | Max results |
|