license switch agpl -> busl1, sprints work, new product advisories
136
docs/modules/analytics/README.md
Normal file
@@ -0,0 +1,136 @@
|
||||
# Analytics Module
|
||||
|
||||
The Analytics module provides a star-schema data warehouse layer for SBOM and attestation data, enabling executive reporting, risk dashboards, and ad-hoc analysis.
|
||||
|
||||
## Overview
|
||||
|
||||
Stella Ops generates rich data through SBOM ingestion, vulnerability correlation, VEX assessments, and attestations. The Analytics module normalizes this data into a queryable warehouse schema optimized for:
|
||||
|
||||
- **Executive dashboards**: Risk posture, vulnerability trends, compliance status
|
||||
- **Supply chain analysis**: Supplier concentration, license distribution
|
||||
- **Security metrics**: CVE exposure, VEX effectiveness, MTTR tracking
|
||||
- **Attestation coverage**: SLSA compliance, provenance gaps
|
||||
|
||||
## Key Capabilities
|
||||
|
||||
| Capability | Description |
|
||||
|------------|-------------|
|
||||
| Unified component registry | Canonical component table with normalized suppliers and licenses |
|
||||
| Vulnerability correlation | Pre-joined component-vulnerability mapping with EPSS/KEV flags |
|
||||
| VEX-adjusted exposure | Vulnerability counts that respect VEX overrides |
|
||||
| Attestation tracking | Provenance and SLSA level coverage by environment/team |
|
||||
| Time-series rollups | Daily snapshots for trend analysis |
|
||||
| Materialized views | Pre-computed aggregations for dashboard performance |
|
||||
|
||||
## Data Model
|
||||
|
||||
### Star Schema Overview
|
||||
|
||||
```
                ┌─────────────────┐
                │    artifacts    │ (dimension)
                │  container/app  │
                └────────┬────────┘
                         │
          ┌──────────────┼───────────────┐
          │              │               │
┌─────────▼──────┐ ┌─────▼──────┐ ┌──────▼──────┐
│ artifact_      │ │attestations│ │vex_overrides│
│ components     │ │   (fact)   │ │   (fact)    │
│ (bridge)       │ └────────────┘ └─────────────┘
└─────────┬──────┘
          │
┌─────────▼──────┐
│ components     │ (dimension)
│ unified        │
│ registry       │
└─────────┬──────┘
          │
┌─────────▼──────┐
│ component_     │
│ vulns          │ (fact)
│ (bridge)       │
└────────────────┘
```
|
||||
|
||||
### Core Tables
|
||||
|
||||
| Table | Type | Purpose |
|
||||
|-------|------|---------|
|
||||
| `components` | Dimension | Unified component registry with PURL, supplier, license |
|
||||
| `artifacts` | Dimension | Container images and applications with SBOM metadata |
|
||||
| `artifact_components` | Bridge | Links artifacts to their SBOM components |
|
||||
| `component_vulns` | Fact | Component-to-vulnerability mapping |
|
||||
| `attestations` | Fact | Attestation metadata (provenance, SBOM, VEX) |
|
||||
| `vex_overrides` | Fact | VEX status overrides with justifications |
|
||||
| `raw_sboms` | Audit | Raw SBOM payloads for reprocessing |
|
||||
| `raw_attestations` | Audit | Raw DSSE envelopes for audit |
|
||||
| `daily_vulnerability_counts` | Rollup | Daily vuln aggregations |
|
||||
| `daily_component_counts` | Rollup | Daily component aggregations |
|
||||
|
||||
### Materialized Views
|
||||
|
||||
| View | Refresh | Purpose |
|
||||
|------|---------|---------|
|
||||
| `mv_supplier_concentration` | Daily | Top suppliers by component count |
|
||||
| `mv_license_distribution` | Daily | License category distribution |
|
||||
| `mv_vuln_exposure` | Daily | CVE exposure adjusted by VEX |
|
||||
| `mv_attestation_coverage` | Daily | Provenance/SLSA coverage by env/team |
|
||||
|
||||
## Quick Start
|
||||
|
||||
### Day-1 Queries
|
||||
|
||||
**Top supplier concentration (supply chain risk):**
|
||||
```sql
|
||||
SELECT * FROM analytics.sp_top_suppliers(20);
|
||||
```
|
||||
|
||||
**License risk heatmap:**
|
||||
```sql
|
||||
SELECT * FROM analytics.sp_license_heatmap();
|
||||
```
|
||||
|
||||
**CVE exposure adjusted by VEX:**
|
||||
```sql
|
||||
SELECT * FROM analytics.sp_vuln_exposure('prod', 'high');
|
||||
```
|
||||
|
||||
**Fixable vulnerability backlog:**
|
||||
```sql
|
||||
SELECT * FROM analytics.sp_fixable_backlog('prod');
|
||||
```
|
||||
|
||||
**Attestation coverage gaps:**
|
||||
```sql
|
||||
SELECT * FROM analytics.sp_attestation_gaps('prod');
|
||||
```
|
||||
|
||||
### API Endpoints
|
||||
|
||||
| Endpoint | Method | Description |
|
||||
|----------|--------|-------------|
|
||||
| `/api/analytics/suppliers` | GET | Supplier concentration data |
|
||||
| `/api/analytics/licenses` | GET | License distribution |
|
||||
| `/api/analytics/vulnerabilities` | GET | CVE exposure (VEX-adjusted) |
|
||||
| `/api/analytics/backlog` | GET | Fixable vulnerability backlog |
|
||||
| `/api/analytics/attestation-coverage` | GET | Attestation gaps |
|
||||
| `/api/analytics/trends/vulnerabilities` | GET | Vulnerability time-series |
|
||||
| `/api/analytics/trends/components` | GET | Component time-series |
|
||||
|
||||
## Architecture
|
||||
|
||||
See [architecture.md](./architecture.md) for detailed design decisions, data flow, and normalization rules.
|
||||
|
||||
## Schema Reference
|
||||
|
||||
See [analytics_schema.sql](../../db/analytics_schema.sql) for complete DDL including:
|
||||
- Table definitions with indexes
|
||||
- Normalization functions
|
||||
- Materialized views
|
||||
- Stored procedures
|
||||
- Refresh procedures
|
||||
|
||||
## Sprint Reference
|
||||
|
||||
Implementation tracked in: `docs/implplan/SPRINT_20260120_030_Platform_sbom_analytics_lake.md`
|
||||
270
docs/modules/analytics/architecture.md
Normal file
@@ -0,0 +1,270 @@
|
||||
# Analytics Module Architecture
|
||||
|
||||
## Design Philosophy
|
||||
|
||||
The Analytics module implements a **star-schema data warehouse** pattern optimized for analytical queries rather than transactional workloads. Key design principles:
|
||||
|
||||
1. **Separation of concerns**: Analytics schema is isolated from operational schemas (scanner, vex, proof_system)
|
||||
2. **Pre-computation**: Expensive aggregations computed in advance via materialized views
|
||||
3. **Audit trail**: Raw payloads preserved for reprocessing and compliance
|
||||
4. **Determinism**: All normalization functions are immutable and reproducible
|
||||
5. **Incremental updates**: Supports both full refresh and incremental ingestion
|
||||
|
||||
## Data Flow
|
||||
|
||||
```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Scanner   │     │  Concelier  │     │  Attestor   │
│   (SBOM)    │     │   (Vuln)    │     │   (DSSE)    │
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                   │                   │
       │ SBOM Ingested     │ Vuln Updated      │ Attestation Created
       ▼                   ▼                   ▼
┌──────────────────────────────────────────────────────┐
│              AnalyticsIngestionService               │
│  - Normalize components (PURL, supplier, license)    │
│  - Upsert to unified registry                        │
│  - Correlate with vulnerabilities                    │
│  - Store raw payloads                                │
└──────────────────────────────────────────────────────┘
                           │
                           ▼
┌──────────────────────────────────────────────────────┐
│                   analytics schema                   │
│ ┌──────────┐ ┌─────────┐ ┌─────────┐ ┌────────────┐  │
│ │components│ │artifacts│ │comp_vuln│ │attestations│  │
│ └──────────┘ └─────────┘ └─────────┘ └────────────┘  │
└──────────────────────────────────────────────────────┘
                           │
                           │ Daily refresh
                           ▼
┌──────────────────────────────────────────────────────┐
│                  Materialized Views                  │
│  mv_supplier_concentration | mv_license_distribution │
│  mv_vuln_exposure          | mv_attestation_coverage │
└──────────────────────────────────────────────────────┘
                           │
                           ▼
┌──────────────────────────────────────────────────────┐
│                Platform API Endpoints                │
│               (with 5-minute caching)                │
└──────────────────────────────────────────────────────┘
```
|
||||
|
||||
## Normalization Rules
|
||||
|
||||
### PURL Parsing
|
||||
|
||||
Package URLs (PURLs) are the canonical identifier for components. The `parse_purl()` function extracts:
|
||||
|
||||
| Field | Example | Notes |
|
||||
|-------|---------|-------|
|
||||
| `purl_type` | `maven`, `npm`, `pypi` | Ecosystem identifier |
|
||||
| `purl_namespace` | `org.apache.logging.log4j` | Group/org/scope (optional) |
|
||||
| `purl_name` | `log4j-core` | Package name |
|
||||
| `purl_version` | `2.17.1` | Version string |
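
The field split in the table above can be sketched in Python. This is a simplified illustration, not the `parse_purl()` SQL function from the schema: the `ParsedPurl` type is a hypothetical name, and qualifiers (`?...`) and subpaths (`#...`) are simply discarded.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote


class ParsedPurl(NamedTuple):
    """Hypothetical container for the four fields in the table above."""
    purl_type: str
    purl_namespace: Optional[str]
    purl_name: str
    purl_version: Optional[str]


def parse_purl(purl: str) -> ParsedPurl:
    """Split a PURL into type, namespace, name, and version (simplified)."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a package URL: {purl!r}")
    rest = purl[len("pkg:"):]
    rest = rest.split("#", 1)[0]  # drop the optional subpath
    rest = rest.split("?", 1)[0]  # drop the optional qualifiers
    segments = rest.strip("/").split("/")
    if len(segments) < 2:
        raise ValueError(f"PURL needs at least a type and a name: {purl!r}")
    # The last path segment carries the name and, after '@', the version.
    name, _, version = segments[-1].partition("@")
    # Everything between the type and the name is the namespace.
    namespace = "/".join(unquote(s) for s in segments[1:-1])
    return ParsedPurl(
        purl_type=segments[0].lower(),
        purl_namespace=namespace or None,
        purl_name=unquote(name),
        purl_version=unquote(version) or None,
    )
```

For example, `parse_purl("pkg:pypi/requests@2.31.0")` yields a record with no namespace, which matches the "(optional)" note in the table.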
|
||||
|
||||
### Supplier Normalization
|
||||
|
||||
The `normalize_supplier()` function standardizes supplier names for consistent grouping:
|
||||
|
||||
1. Convert to lowercase
|
||||
2. Trim whitespace
|
||||
3. Remove legal suffixes and any comma that precedes them: Inc., LLC, Ltd., Corp., GmbH, B.V., S.A., PLC, Co.
|
||||
4. Normalize internal whitespace
|
||||
|
||||
**Examples:**
|
||||
- `"Apache Software Foundation, Inc."` → `"apache software foundation"`
|
||||
- `"Google LLC"` → `"google"`
|
||||
- `" Microsoft Corp. "` → `"microsoft"`
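
The four steps can be sketched in Python as follows. This is an illustrative approximation rather than the SQL `normalize_supplier()` itself; the regular expression and the choice to strip stacked suffixes repeatedly (for example `Co., Ltd.`) are assumptions.

```python
import re

# Trailing legal suffix, optionally preceded by a comma and followed by a period.
_SUFFIX_RE = re.compile(r"[,\s]+(?:inc|llc|ltd|corp|gmbh|b\.v|s\.a|plc|co)\.?$")


def normalize_supplier(name: str) -> str:
    """Lowercase, trim, strip legal suffixes, and collapse whitespace."""
    s = name.lower().strip()
    # Remove suffixes until none is left, so "Acme Co., Ltd." becomes "acme".
    while True:
        stripped = _SUFFIX_RE.sub("", s)
        if stripped == s:
            break
        s = stripped
    # Collapse runs of internal whitespace into single spaces.
    return re.sub(r"\s+", " ", s).strip(" ,")
```

The suffix pattern requires a separator before the suffix, so a supplier whose whole name is `Co` or that merely ends in those letters (such as `Costco`) is left intact.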
|
||||
|
||||
### License Categorization
|
||||
|
||||
The `categorize_license()` function maps SPDX expressions to risk categories:
|
||||
|
||||
| Category | Examples | Risk Level |
|
||||
|----------|----------|------------|
|
||||
| `permissive` | MIT, Apache-2.0, BSD-3-Clause, ISC | Low |
|
||||
| `copyleft-weak` | LGPL-2.1, MPL-2.0, EPL-2.0 | Medium |
|
||||
| `copyleft-strong` | GPL-3.0, AGPL-3.0, SSPL | High |
|
||||
| `proprietary` | Proprietary, Commercial | Review Required |
|
||||
| `unknown` | Unrecognized expressions | Review Required |
|
||||
|
||||
**Special handling:**
|
||||
- GPL with exceptions (e.g., `GPL-2.0 WITH Classpath-exception-2.0`) → `copyleft-weak`
|
||||
- Dual-licensed (e.g., `MIT OR Apache-2.0`) → uses first match
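
A minimal Python sketch of this mapping is shown below. It is not the SQL `categorize_license()` function: the prefix-based family matching is an assumption, and "uses first match" is interpreted here as "the first `OR` operand that maps to a known category". `AND` expressions are not handled and fall through to `unknown`.

```python
import re

_PERMISSIVE = {"MIT", "APACHE-2.0", "BSD-3-CLAUSE", "ISC"}
_PROPRIETARY = {"PROPRIETARY", "COMMERCIAL"}
_WEAK_PREFIXES = ("LGPL-", "MPL-", "EPL-")
_STRONG_PREFIXES = ("GPL-", "AGPL-", "SSPL")


def _categorize_single(license_id: str) -> str:
    """Categorize one license identifier without operators."""
    lid = license_id.strip().strip("()").upper()
    if lid in _PERMISSIVE:
        return "permissive"
    if lid.startswith(_WEAK_PREFIXES):
        return "copyleft-weak"
    if lid.startswith(_STRONG_PREFIXES):
        return "copyleft-strong"
    if lid in _PROPRIETARY:
        return "proprietary"
    return "unknown"


def categorize_license(expression: str) -> str:
    """Map an SPDX-style expression to a risk category."""
    for operand in re.split(r"\s+OR\s+", expression.strip(), flags=re.IGNORECASE):
        parts = re.split(r"\s+WITH\s+", operand, maxsplit=1, flags=re.IGNORECASE)
        category = _categorize_single(parts[0])
        # A strong-copyleft license with an exception is downgraded to weak.
        if len(parts) == 2 and category == "copyleft-strong":
            category = "copyleft-weak"
        if category != "unknown":
            return category
    return "unknown"
```

Returning the first recognized operand keeps the result deterministic for a given expression, at the cost of depending on operand order: `GPL-3.0 OR MIT` would be reported as `copyleft-strong` under this interpretation.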
|
||||
|
||||
## Component Deduplication
|
||||
|
||||
Components are deduplicated by `(purl, hash_sha256)`:
|
||||
|
||||
1. If same PURL and hash: existing record updated (last_seen_at, counts)
|
||||
2. If same PURL but different hash: new record created (content changed under the same PURL, e.g. a rebuilt or repackaged release)
|
||||
3. If same hash but different PURL: new record (aliased package)
|
||||
|
||||
**Upsert pattern:**
|
||||
```sql
|
||||
INSERT INTO analytics.components (...)
|
||||
VALUES (...)
|
||||
ON CONFLICT (purl, hash_sha256) DO UPDATE SET
|
||||
last_seen_at = now(),
|
||||
sbom_count = components.sbom_count + 1,
|
||||
updated_at = now();
|
||||
```
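
The three deduplication cases can be illustrated with an in-memory stand-in for the `components` table. This is a sketch only: the `ComponentRecord` type and the starting value of `sbom_count` are assumptions, and a dictionary keyed by `(purl, hash_sha256)` plays the role of the unique constraint.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


@dataclass
class ComponentRecord:
    """Hypothetical subset of the columns touched by the upsert."""
    purl: str
    hash_sha256: str
    sbom_count: int
    last_seen_at: datetime


Registry = Dict[Tuple[str, str], ComponentRecord]


def upsert_component(
    registry: Registry,
    purl: str,
    hash_sha256: str,
    now: Optional[datetime] = None,
) -> ComponentRecord:
    """Insert a component, or bump its counters if the key already exists."""
    now = now or datetime.now(timezone.utc)
    key = (purl, hash_sha256)
    record = registry.get(key)
    if record is None:
        # New (purl, hash) pair: covers cases 2 and 3 in the list above.
        record = ComponentRecord(purl, hash_sha256, sbom_count=1, last_seen_at=now)
        registry[key] = record
    else:
        # Same PURL and hash: case 1, update in place.
        record.last_seen_at = now
        record.sbom_count += 1
    return record
```

Because the key is the pair rather than either field alone, a changed hash or a changed PURL always produces a new record instead of overwriting an existing one.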
|
||||
|
||||
## Vulnerability Correlation
|
||||
|
||||
When a component is upserted, the `VulnerabilityCorrelationService` queries Concelier for matching advisories:
|
||||
|
||||
1. Query by PURL type + namespace + name
|
||||
2. Filter by version range matching
|
||||
3. Upsert to `component_vulns` with severity, EPSS, KEV flags
|
||||
|
||||
**Version range matching** uses Concelier's existing logic to handle:
|
||||
- Semver ranges: `>=1.0.0 <2.0.0`
|
||||
- Exact versions: `1.2.3`
|
||||
- Wildcards: `1.x`
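
To make the three range forms concrete, here is a small Python matcher. It is not Concelier's implementation, which this document does not show: it handles purely numeric dotted versions only, and pre-release tags or ecosystem-specific ordering rules are out of scope.

```python
import operator
from typing import Tuple

# Longer symbols come first so ">=" is not mistaken for ">".
_OPERATORS = (
    (">=", operator.ge),
    ("<=", operator.le),
    (">", operator.gt),
    ("<", operator.lt),
    ("=", operator.eq),
)


def _parse(version: str) -> Tuple[int, ...]:
    return tuple(int(part) for part in version.split("."))


def _compare(op, left: Tuple[int, ...], right: Tuple[int, ...]) -> bool:
    # Pad with zeros so that 1.2 and 1.2.0 compare as equal.
    width = max(len(left), len(right))
    left += (0,) * (width - len(left))
    right += (0,) * (width - len(right))
    return op(left, right)


def version_matches(version: str, spec: str) -> bool:
    """Return True if `version` satisfies a range, exact, or wildcard spec."""
    candidate = _parse(version)
    spec = spec.strip()
    if spec.endswith((".x", ".*")):
        # Wildcard: compare only the fixed leading components.
        prefix = _parse(spec[:-2])
        return candidate[: len(prefix)] == prefix
    # Range: every space-separated comparator must hold.
    for token in spec.split():
        for symbol, op in _OPERATORS:
            if token.startswith(symbol):
                bound = _parse(token[len(symbol):])
                break
        else:
            # No operator prefix means an exact version.
            op, bound = operator.eq, _parse(token)
        if not _compare(op, candidate, bound):
            return False
    return True
```

With this sketch, `version_matches("1.5.0", ">=1.0.0 <2.0.0")` holds while `version_matches("2.0.0", ">=1.0.0 <2.0.0")` does not, because the upper bound is exclusive.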
|
||||
|
||||
## VEX Override Logic
|
||||
|
||||
The `mv_vuln_exposure` view implements VEX-adjusted counts:
|
||||
|
||||
```sql
|
||||
-- Effective count excludes artifacts with active VEX overrides
|
||||
COUNT(DISTINCT ac.artifact_id) FILTER (
|
||||
WHERE NOT EXISTS (
|
||||
SELECT 1 FROM analytics.vex_overrides vo
|
||||
WHERE vo.artifact_id = ac.artifact_id
|
||||
AND vo.vuln_id = cv.vuln_id
|
||||
AND vo.status = 'not_affected'
|
||||
AND (vo.valid_until IS NULL OR vo.valid_until > now())
|
||||
)
|
||||
) AS effective_artifact_count
|
||||
```
|
||||
|
||||
**Override validity:**
|
||||
- `valid_from`: When the override became effective
|
||||
- `valid_until`: Expiration (NULL = no expiration)
|
||||
- Only `status = 'not_affected'` reduces exposure counts
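
The same filter can be expressed in Python to show which overrides actually reduce the count. The `VexOverride` type is a hypothetical stand-in for a row of `vex_overrides`; like the SQL fragment above, the sketch checks `valid_until` but not `valid_from`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class VexOverride:
    """Hypothetical subset of the vex_overrides columns used by the view."""
    artifact_id: str
    vuln_id: str
    status: str
    valid_until: Optional[datetime] = None


def effective_artifact_count(
    artifact_ids: Iterable[str],
    vuln_id: str,
    overrides: Iterable[VexOverride],
    now: Optional[datetime] = None,
) -> int:
    """Count artifacts exposed to `vuln_id` after applying active overrides."""
    now = now or datetime.now(timezone.utc)
    suppressed = {
        o.artifact_id
        for o in overrides
        if o.vuln_id == vuln_id
        and o.status == "not_affected"  # other statuses never reduce exposure
        and (o.valid_until is None or o.valid_until > now)
    }
    return len(set(artifact_ids) - suppressed)
```

An expired `not_affected` override, or an override with any other status, leaves the artifact in the effective count.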
|
||||
|
||||
## Time-Series Rollups
|
||||
|
||||
Daily rollups computed by `compute_daily_rollups()`:
|
||||
|
||||
**Vulnerability counts** (per environment/team/severity):
|
||||
- `total_vulns`: All affecting vulnerabilities
|
||||
- `fixable_vulns`: Vulns with `fix_available = TRUE`
|
||||
- `vex_mitigated`: Vulns with active `not_affected` override
|
||||
- `kev_vulns`: Vulns in CISA KEV
|
||||
- `unique_cves`: Distinct CVE IDs
|
||||
- `affected_artifacts`: Artifacts containing affected components
|
||||
- `affected_components`: Components with affecting vulns
|
||||
|
||||
**Component counts** (per environment/team/license/type):
|
||||
- `total_components`: Distinct components
|
||||
- `unique_suppliers`: Distinct normalized suppliers
|
||||
|
||||
**Retention policy:** 90 days in hot storage; older data archived to cold storage.
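
As an illustration of how the vulnerability counters relate to each other, the sketch below aggregates one group (a single environment/team/severity combination) in Python. It is not `compute_daily_rollups()`: the `Finding` type is hypothetical, and it assumes one finding per (artifact, component, vulnerability) combination.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Finding:
    """Hypothetical row: one vulnerability affecting one component in one artifact."""
    artifact_id: str
    component_id: str
    vuln_id: str
    fix_available: bool
    kev_listed: bool
    vex_not_affected: bool


def rollup_vulnerability_counts(findings: Iterable[Finding]) -> Dict[str, int]:
    """Compute the per-group counters listed above for one snapshot."""
    rows = list(findings)
    return {
        "total_vulns": len(rows),
        "fixable_vulns": sum(1 for f in rows if f.fix_available),
        "vex_mitigated": sum(1 for f in rows if f.vex_not_affected),
        "kev_vulns": sum(1 for f in rows if f.kev_listed),
        # Distinct counts are not additive across groups.
        "unique_cves": len({f.vuln_id for f in rows}),
        "affected_artifacts": len({f.artifact_id for f in rows}),
        "affected_components": len({f.component_id for f in rows}),
    }
```

The distinct counters (`unique_cves`, `affected_artifacts`, `affected_components`) cannot be summed across groups without overcounting, which matters when trend queries aggregate the rollup tables further.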
|
||||
|
||||
## Materialized View Refresh
|
||||
|
||||
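All materialized views support `REFRESH MATERIALIZED VIEW CONCURRENTLY`, so dashboards can keep reading a view while it is refreshed (PostgreSQL requires a unique index on each view for this):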
|
||||
|
||||
```sql
|
||||
-- Refresh all views (run daily via pg_cron or Scheduler)
|
||||
SELECT analytics.refresh_all_views();
|
||||
```
|
||||
|
||||
**Refresh schedule (recommended):**
|
||||
- `mv_supplier_concentration`: 02:00 UTC daily
|
||||
- `mv_license_distribution`: 02:15 UTC daily
|
||||
- `mv_vuln_exposure`: 02:30 UTC daily
|
||||
- `mv_attestation_coverage`: 02:45 UTC daily
|
||||
- `compute_daily_rollups()`: 03:00 UTC daily
|
||||
|
||||
## Performance Considerations
|
||||
|
||||
### Indexing Strategy
|
||||
|
||||
| Table | Key Indexes | Query Pattern |
|
||||
|-------|-------------|---------------|
|
||||
| `components` | `purl`, `supplier_normalized`, `license_category` | Lookup, aggregation |
|
||||
| `artifacts` | `digest`, `environment`, `team` | Lookup, filtering |
|
||||
| `component_vulns` | `vuln_id`, `severity`, `fix_available` | Join, filtering |
|
||||
| `attestations` | `artifact_id`, `predicate_type` | Join, aggregation |
|
||||
| `vex_overrides` | `(artifact_id, vuln_id)`, `status` | Subquery exists |
|
||||
|
||||
### Query Performance Targets
|
||||
|
||||
| Query | Target | Notes |
|
||||
|-------|--------|-------|
|
||||
| `sp_top_suppliers(20)` | < 100ms | Uses materialized view |
|
||||
| `sp_license_heatmap()` | < 100ms | Uses materialized view |
|
||||
| `sp_vuln_exposure()` | < 200ms | Uses materialized view |
|
||||
| `sp_fixable_backlog()` | < 500ms | Live query with indexes |
|
||||
| `sp_attestation_gaps()` | < 100ms | Uses materialized view |
|
||||
|
||||
### Caching Strategy
|
||||
|
||||
Platform API endpoints use a 5-minute TTL cache:
|
||||
- Cache key: endpoint + query parameters
|
||||
- Invalidation: Time-based only (no event-driven invalidation)
|
||||
- Storage: Valkey (in-memory)
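
The caching behaviour described above can be sketched as follows. This is an in-process illustration only: a dictionary stands in for Valkey, and the `cache_key` and `TtlCache` names are hypothetical rather than taken from the Platform code.

```python
import time
from typing import Any, Callable, Dict, Mapping, Tuple
from urllib.parse import urlencode

TTL_SECONDS = 300  # 5-minute TTL, as stated above


def cache_key(endpoint: str, params: Mapping[str, Any]) -> str:
    """Build a key from the endpoint plus its query parameters, order-independent."""
    return f"{endpoint}?{urlencode(sorted(params.items()))}"


class TtlCache:
    """Time-based cache with no event-driven invalidation."""

    def __init__(self, ttl: float = TTL_SECONDS, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh
        value = compute()
        self._entries[key] = (now + self._ttl, value)
        return value
```

Sorting the parameters before encoding means `?environment=prod&severity=high` and `?severity=high&environment=prod` share one cache entry.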
|
||||
|
||||
## Security Considerations
|
||||
|
||||
### Schema Permissions
|
||||
|
||||
```sql
|
||||
-- Read-only role for dashboards
|
||||
GRANT USAGE ON SCHEMA analytics TO dashboard_reader;
|
||||
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO dashboard_reader;
|
||||
GRANT SELECT ON ALL SEQUENCES IN SCHEMA analytics TO dashboard_reader;
|
||||
|
||||
-- Write role for ingestion service
|
||||
GRANT USAGE ON SCHEMA analytics TO analytics_writer;
|
||||
GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA analytics TO analytics_writer;
|
||||
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA analytics TO analytics_writer;
|
||||
```
|
||||
|
||||
### Data Classification
|
||||
|
||||
| Table | Classification | Notes |
|
||||
|-------|----------------|-------|
|
||||
| `components` | Internal | Contains package names, versions |
|
||||
| `artifacts` | Internal | Contains image names, team names |
|
||||
| `component_vulns` | Internal | Vulnerability data (public CVEs) |
|
||||
| `vex_overrides` | Confidential | Contains justifications, operator IDs |
|
||||
| `raw_sboms` | Confidential | Full SBOM payloads |
|
||||
| `raw_attestations` | Confidential | Signed attestation envelopes |
|
||||
|
||||
### Audit Trail
|
||||
|
||||
All tables include `created_at` and `updated_at` timestamps. Raw payload tables (`raw_sboms`, `raw_attestations`) are append-only with content hashes for integrity verification.
|
||||
|
||||
## Integration Points
|
||||
|
||||
### Upstream Dependencies
|
||||
|
||||
| Service | Event | Action |
|
||||
|---------|-------|--------|
|
||||
| Scanner | SBOM ingested | Normalize and upsert components |
|
||||
| Concelier | Advisory updated | Re-correlate affected components |
|
||||
| Excititor | VEX observation | Create/update vex_overrides |
|
||||
| Attestor | Attestation created | Upsert attestation record |
|
||||
|
||||
### Downstream Consumers
|
||||
|
||||
| Consumer | Data | Endpoint |
|
||||
|----------|------|----------|
|
||||
| Console UI | Dashboard data | `/api/analytics/*` |
|
||||
| Export Center | Compliance reports | Direct DB query |
|
||||
| AdvisoryAI | Risk context | `/api/analytics/vulnerabilities` |
|
||||
|
||||
## Future Enhancements
|
||||
|
||||
1. **Partitioning**: Partition `daily_*` tables by date for faster queries and archival
|
||||
2. **Incremental refresh**: Implement incremental materialized view refresh for large datasets
|
||||
3. **Custom dimensions**: Support user-defined component groupings (business units, cost centers)
|
||||
4. **Predictive analytics**: Add ML-based risk prediction using historical trends
|
||||
5. **BI tool integration**: Direct connectors for Tableau, Looker, Metabase
|
||||
418
docs/modules/analytics/queries.md
Normal file
@@ -0,0 +1,418 @@
|
||||
# Analytics Query Library
|
||||
|
||||
This document provides ready-to-use SQL queries for common analytics use cases. All queries are optimized for the analytics star schema.
|
||||
|
||||
## Executive Dashboard Queries
|
||||
|
||||
### 1. Top Supplier Concentration (Supply Chain Risk)
|
||||
|
||||
Identifies suppliers with the highest component footprint, indicating supply chain concentration risk.
|
||||
|
||||
```sql
|
||||
-- Via stored procedure (recommended)
|
||||
SELECT * FROM analytics.sp_top_suppliers(20);
|
||||
|
||||
-- Direct query
|
||||
SELECT
|
||||
supplier,
|
||||
component_count,
|
||||
artifact_count,
|
||||
team_count,
|
||||
critical_vuln_count,
|
||||
high_vuln_count,
|
||||
environments
|
||||
FROM analytics.mv_supplier_concentration
|
||||
ORDER BY component_count DESC
|
||||
LIMIT 20;
|
||||
```
|
||||
|
||||
**Use case**: Identify vendors that, if compromised, would affect the most artifacts.
|
||||
|
||||
### 2. License Risk Heatmap
|
||||
|
||||
Shows distribution of components by license category for compliance review.
|
||||
|
||||
```sql
|
||||
-- Via stored procedure
|
||||
SELECT * FROM analytics.sp_license_heatmap();
|
||||
|
||||
-- Direct query with grouping
|
||||
SELECT
|
||||
license_category,
|
||||
SUM(component_count) AS total_components,
|
||||
SUM(artifact_count) AS total_artifacts,
|
||||
COUNT(DISTINCT license_concluded) AS unique_licenses
|
||||
FROM analytics.mv_license_distribution
|
||||
GROUP BY license_category
|
||||
ORDER BY
|
||||
CASE license_category
|
||||
WHEN 'copyleft-strong' THEN 1
|
||||
WHEN 'proprietary' THEN 2
|
||||
WHEN 'unknown' THEN 3
|
||||
WHEN 'copyleft-weak' THEN 4
|
||||
ELSE 5
|
||||
END;
|
||||
```
|
||||
|
||||
**Use case**: Support compliance reviews by identifying components that require legal review.
|
||||
|
||||
### 3. CVE Exposure Adjusted by VEX
|
||||
|
||||
Shows true vulnerability exposure after applying VEX mitigations.
|
||||
|
||||
```sql
|
||||
-- Via stored procedure
|
||||
SELECT * FROM analytics.sp_vuln_exposure('prod', 'high');
|
||||
|
||||
-- Direct query showing VEX effectiveness
|
||||
SELECT
|
||||
vuln_id,
|
||||
severity::TEXT,
|
||||
cvss_score,
|
||||
epss_score,
|
||||
kev_listed,
|
||||
fix_available,
|
||||
raw_artifact_count AS total_affected,
|
||||
effective_artifact_count AS actually_affected,
|
||||
raw_artifact_count - effective_artifact_count AS vex_mitigated,
|
||||
ROUND(100.0 * (raw_artifact_count - effective_artifact_count) / NULLIF(raw_artifact_count, 0), 1) AS mitigation_rate
|
||||
FROM analytics.mv_vuln_exposure
|
||||
WHERE effective_artifact_count > 0
|
||||
ORDER BY
|
||||
CASE severity
|
||||
WHEN 'critical' THEN 1
|
||||
WHEN 'high' THEN 2
|
||||
WHEN 'medium' THEN 3
|
||||
ELSE 4
|
||||
END,
|
||||
effective_artifact_count DESC
|
||||
LIMIT 50;
|
||||
```
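
The `mitigation_rate` expression guards against division by zero with `NULLIF`. The equivalent arithmetic in Python, as a sketch with a hypothetical function name, looks like this:

```python
from typing import Optional


def mitigation_rate(raw_artifact_count: int, effective_artifact_count: int) -> Optional[float]:
    """Percentage of affected artifacts removed by VEX, or None when nothing is affected."""
    if raw_artifact_count == 0:
        # Mirrors NULLIF(raw_artifact_count, 0): the SQL result is NULL.
        return None
    mitigated = raw_artifact_count - effective_artifact_count
    return round(100.0 * mitigated / raw_artifact_count, 1)
```

A CVE that touches 10 artifacts, 4 of which remain exposed after VEX, has 6 mitigated artifacts and therefore a mitigation rate of 60.0.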
|
||||
|
||||
**Use case**: Show executives the "real" risk after VEX assessment.
|
||||
|
||||
### 4. Fixable Vulnerability Backlog
|
||||
|
||||
Lists vulnerabilities that can be fixed today (fix available, not VEX-mitigated).
|
||||
|
||||
```sql
|
||||
-- Via stored procedure
|
||||
SELECT * FROM analytics.sp_fixable_backlog('prod');
|
||||
|
||||
-- Direct query with priority scoring
|
||||
SELECT
|
||||
a.name AS service,
|
||||
a.environment,
|
||||
a.team,
|
||||
c.name AS component,
|
||||
c.version AS current_version,
|
||||
cv.vuln_id,
|
||||
cv.severity::TEXT,
|
||||
cv.cvss_score,
|
||||
cv.epss_score,
|
||||
cv.fixed_version,
|
||||
cv.kev_listed,
|
||||
-- Priority score: higher = fix first
|
||||
(
|
||||
CASE cv.severity
|
||||
WHEN 'critical' THEN 100
|
||||
WHEN 'high' THEN 75
|
||||
WHEN 'medium' THEN 50
|
||||
ELSE 25
|
||||
END
|
||||
+ COALESCE(cv.epss_score * 100, 0)
|
||||
+ (CASE WHEN cv.kev_listed THEN 50 ELSE 0 END)
|
||||
)::INT AS priority_score
|
||||
FROM analytics.component_vulns cv
|
||||
JOIN analytics.components c ON c.component_id = cv.component_id
|
||||
JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
|
||||
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
|
||||
LEFT JOIN analytics.vex_overrides vo ON vo.artifact_id = a.artifact_id
|
||||
AND vo.vuln_id = cv.vuln_id
|
||||
AND vo.status = 'not_affected'
|
||||
AND (vo.valid_until IS NULL OR vo.valid_until > now())
|
||||
WHERE cv.affects = TRUE
|
||||
AND cv.fix_available = TRUE
|
||||
AND vo.override_id IS NULL
|
||||
AND a.environment = 'prod'
|
||||
ORDER BY priority_score DESC, a.name
|
||||
LIMIT 100;
|
||||
```
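
The priority score in the direct query is a sum of three terms. The Python sketch below reproduces that arithmetic so the weighting is easier to inspect; the function name is hypothetical, and Python's `round()` can differ from the SQL `::INT` cast on exact `.5` values.

```python
from typing import Optional

_SEVERITY_WEIGHT = {"critical": 100, "high": 75, "medium": 50}


def priority_score(severity: str, epss_score: Optional[float], kev_listed: bool) -> int:
    """Higher score means fix first: severity weight + EPSS percentage + KEV bonus."""
    base = _SEVERITY_WEIGHT.get(severity, 25)   # ELSE 25
    epss = (epss_score or 0.0) * 100            # COALESCE(epss_score * 100, 0)
    kev = 50 if kev_listed else 0               # KEV listing adds a flat 50
    return round(base + epss + kev)
```

Under this weighting a `high` finding that is KEV-listed (75 + 50) outranks a `critical` finding with no EPSS score and no KEV listing (100).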
|
||||
|
||||
**Use case**: Prioritize remediation work based on risk and fixability.
|
||||
|
||||
### 5. Build Integrity / Attestation Coverage
|
||||
|
||||
Shows attestation gaps by environment and team.
|
||||
|
||||
```sql
|
||||
-- Via stored procedure
|
||||
SELECT * FROM analytics.sp_attestation_gaps('prod');
|
||||
|
||||
-- Direct query with gap analysis
|
||||
SELECT
|
||||
environment,
|
||||
team,
|
||||
total_artifacts,
|
||||
with_provenance,
|
||||
total_artifacts - with_provenance AS missing_provenance,
|
||||
provenance_pct,
|
||||
slsa_level_2_plus,
|
||||
slsa2_pct,
|
||||
with_sbom_attestation,
|
||||
with_vex_attestation
|
||||
FROM analytics.mv_attestation_coverage
|
||||
WHERE environment = 'prod'
|
||||
ORDER BY provenance_pct ASC;
|
||||
```
|
||||
|
||||
**Use case**: Identify teams/environments not meeting attestation requirements.
|
||||
|
||||
## Trend Analysis Queries
|
||||
|
||||
### 6. Vulnerability Trend (30 Days)
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
snapshot_date,
|
||||
environment,
|
||||
SUM(total_vulns) AS total_vulns,
|
||||
SUM(fixable_vulns) AS fixable_vulns,
|
||||
SUM(vex_mitigated) AS vex_mitigated,
|
||||
SUM(total_vulns) - SUM(vex_mitigated) AS net_exposure,
|
||||
SUM(kev_vulns) AS kev_vulns
|
||||
FROM analytics.daily_vulnerability_counts
|
||||
WHERE snapshot_date >= CURRENT_DATE - INTERVAL '30 days'
|
||||
GROUP BY snapshot_date, environment
|
||||
ORDER BY environment, snapshot_date;
|
||||
```
|
||||
|
||||
### 7. Vulnerability Trend by Severity
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
snapshot_date,
|
||||
severity::TEXT,
|
||||
SUM(total_vulns) AS total_vulns
|
||||
FROM analytics.daily_vulnerability_counts
|
||||
WHERE snapshot_date >= CURRENT_DATE - INTERVAL '30 days'
|
||||
AND environment = 'prod'
|
||||
GROUP BY snapshot_date, severity
|
||||
ORDER BY snapshot_date,
|
||||
CASE severity
|
||||
WHEN 'critical' THEN 1
|
||||
WHEN 'high' THEN 2
|
||||
WHEN 'medium' THEN 3
|
||||
ELSE 4
|
||||
END;
|
||||
```
|
||||
|
||||
### 8. Component Growth Trend
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
snapshot_date,
|
||||
environment,
|
||||
SUM(total_components) AS total_components,
|
||||
SUM(unique_suppliers) AS unique_suppliers -- upper bound: a supplier present in several rollup groups is counted once per group
|
||||
FROM analytics.daily_component_counts
|
||||
WHERE snapshot_date >= CURRENT_DATE - INTERVAL '30 days'
|
||||
GROUP BY snapshot_date, environment
|
||||
ORDER BY environment, snapshot_date;
|
||||
```
|
||||
|
||||
## Deep-Dive Queries
|
||||
|
||||
### 9. Component Impact Analysis
|
||||
|
||||
Find all artifacts affected by a specific component.
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
a.name AS artifact,
|
||||
a.version,
|
||||
a.environment,
|
||||
a.team,
|
||||
ac.depth AS dependency_depth,
|
||||
ac.introduced_via
|
||||
FROM analytics.components c
|
||||
JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
|
||||
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
|
||||
WHERE c.purl LIKE 'pkg:maven/org.apache.logging.log4j/log4j-core%'
|
||||
ORDER BY a.environment, a.name;
|
||||
```
|
||||
|
||||
### 10. CVE Impact Analysis
|
||||
|
||||
Find all artifacts affected by a specific CVE.
|
||||
|
||||
```sql
|
||||
SELECT DISTINCT
|
||||
a.name AS artifact,
|
||||
a.version,
|
||||
a.environment,
|
||||
a.team,
|
||||
c.name AS component,
|
||||
c.version AS component_version,
|
||||
cv.cvss_score,
|
||||
cv.fixed_version,
|
||||
CASE
|
||||
WHEN vo.status = 'not_affected' THEN 'VEX Mitigated'
|
||||
WHEN cv.fix_available THEN 'Fix Available'
|
||||
ELSE 'Vulnerable'
|
||||
END AS status
|
||||
FROM analytics.component_vulns cv
|
||||
JOIN analytics.components c ON c.component_id = cv.component_id
|
||||
JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
|
||||
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
|
||||
LEFT JOIN analytics.vex_overrides vo ON vo.artifact_id = a.artifact_id
|
||||
AND vo.vuln_id = cv.vuln_id
|
||||
AND (vo.valid_until IS NULL OR vo.valid_until > now())
|
||||
WHERE cv.vuln_id = 'CVE-2021-44228'
|
||||
ORDER BY a.environment, a.name;
|
||||
```
|
||||
|
||||
### 11. Supplier Vulnerability Profile
|
||||
|
||||
Detailed vulnerability breakdown for a specific supplier.
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
c.supplier_normalized AS supplier,
|
||||
c.name AS component,
|
||||
c.version,
|
||||
cv.vuln_id,
|
||||
cv.severity::TEXT,
|
||||
cv.cvss_score,
|
||||
cv.kev_listed,
|
||||
cv.fix_available,
|
||||
cv.fixed_version
|
||||
FROM analytics.components c
|
||||
JOIN analytics.component_vulns cv ON cv.component_id = c.component_id
|
||||
WHERE c.supplier_normalized = 'apache software foundation'
|
||||
AND cv.affects = TRUE
|
||||
ORDER BY
|
||||
CASE cv.severity
|
||||
WHEN 'critical' THEN 1
|
||||
WHEN 'high' THEN 2
|
||||
ELSE 3
|
||||
END,
|
||||
cv.cvss_score DESC;
|
||||
```
|
||||
|
||||
### 12. License Compliance Report
|
||||
|
||||
Components with concerning licenses in production.
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
c.name AS component,
|
||||
c.version,
|
||||
c.license_concluded,
|
||||
c.license_category::TEXT,
|
||||
c.supplier_normalized AS supplier,
|
||||
COUNT(DISTINCT a.artifact_id) AS artifact_count,
|
||||
ARRAY_AGG(DISTINCT a.name) AS affected_artifacts
|
||||
FROM analytics.components c
|
||||
JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
|
||||
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
|
||||
WHERE c.license_category IN ('copyleft-strong', 'proprietary', 'unknown')
|
||||
AND a.environment = 'prod'
|
||||
GROUP BY c.component_id, c.name, c.version, c.license_concluded, c.license_category, c.supplier_normalized
|
||||
ORDER BY c.license_category, artifact_count DESC;
|
||||
```
|
||||
|
||||
### 13. MTTR Analysis
|
||||
|
||||
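Mean time to mitigate by severity, measured from advisory publication (`published_at`) to the start (`valid_from`) of a `not_affected` VEX override. This serves as a proxy for remediation time.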
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
cv.severity::TEXT,
|
||||
COUNT(*) AS remediated_vulns,
|
||||
AVG(EXTRACT(EPOCH FROM (vo.valid_from - cv.published_at)) / 86400)::NUMERIC(10,2) AS avg_days_to_mitigate,
|
||||
PERCENTILE_CONT(0.5) WITHIN GROUP (
|
||||
ORDER BY EXTRACT(EPOCH FROM (vo.valid_from - cv.published_at)) / 86400
|
||||
)::NUMERIC(10,2) AS median_days,
|
||||
PERCENTILE_CONT(0.9) WITHIN GROUP (
|
||||
ORDER BY EXTRACT(EPOCH FROM (vo.valid_from - cv.published_at)) / 86400
|
||||
)::NUMERIC(10,2) AS p90_days
|
||||
FROM analytics.component_vulns cv
|
||||
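-- Join through artifact_components so an override only pairs with components of its own artifact
JOIN analytics.artifact_components ac ON ac.component_id = cv.component_id
JOIN analytics.vex_overrides vo ON vo.vuln_id = cv.vuln_id
AND vo.artifact_id = ac.artifact_id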
|
||||
AND vo.status = 'not_affected'
|
||||
WHERE cv.published_at >= now() - INTERVAL '90 days'
|
||||
AND cv.published_at IS NOT NULL
|
||||
GROUP BY cv.severity
|
||||
ORDER BY
|
||||
CASE cv.severity
|
||||
WHEN 'critical' THEN 1
|
||||
WHEN 'high' THEN 2
|
||||
WHEN 'medium' THEN 3
|
||||
ELSE 4
|
||||
END;
|
||||
```
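
`PERCENTILE_CONT` interpolates linearly between neighbouring values, which is why the median and P90 can be fractional. A Python sketch of the same statistics follows; the function names are hypothetical, and the input is a list of day counts per mitigated vulnerability.

```python
from typing import Dict, Sequence


def percentile_cont(values: Sequence[float], fraction: float) -> float:
    """Continuous percentile with linear interpolation, like SQL PERCENTILE_CONT."""
    if not values:
        raise ValueError("percentile of an empty sequence is undefined")
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


def summarize_days(days: Sequence[float]) -> Dict[str, float]:
    """Average, median, and P90 of days-to-mitigate, rounded to two decimals."""
    return {
        "avg_days_to_mitigate": round(sum(days) / len(days), 2),
        "median_days": round(percentile_cont(days, 0.5), 2),
        "p90_days": round(percentile_cont(days, 0.9), 2),
    }
```

For the sample `[1, 2, 3, 4, 10]`, the P90 position is 0.9 × 4 = 3.6, so the result lies 60% of the way from 4 to 10, that is 7.6. The single outlier moves the P90 well above the median of 3.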
|
||||
|
||||
### 14. Transitive Dependency Risk
|
||||
|
||||
Components introduced through transitive dependencies.
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
c.name AS transitive_component,
|
||||
c.version,
|
||||
ac.introduced_via AS direct_dependency,
|
||||
ac.depth,
|
||||
COUNT(DISTINCT cv.vuln_id) AS vuln_count,
|
||||
COUNT(DISTINCT cv.vuln_id) FILTER (WHERE cv.severity = 'critical') AS critical_count,
|
||||
COUNT(DISTINCT a.artifact_id) AS affected_artifacts
|
||||
FROM analytics.components c
|
||||
JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
|
||||
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
|
||||
LEFT JOIN analytics.component_vulns cv ON cv.component_id = c.component_id AND cv.affects = TRUE
|
||||
WHERE ac.depth > 0 -- Transitive only
|
||||
AND a.environment = 'prod'
|
||||
GROUP BY c.component_id, c.name, c.version, ac.introduced_via, ac.depth
|
||||
HAVING COUNT(cv.vuln_id) > 0
|
||||
ORDER BY critical_count DESC, vuln_count DESC
|
||||
LIMIT 50;
|
||||
```
|
||||
|
||||
### 15. VEX Effectiveness Report
|
||||
|
||||
How effective is the VEX program at reducing noise?
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
DATE_TRUNC('week', vo.created_at)::DATE AS week,
|
||||
COUNT(*) AS total_overrides,
|
||||
COUNT(*) FILTER (WHERE vo.status = 'not_affected') AS not_affected,
|
||||
COUNT(*) FILTER (WHERE vo.status = 'affected') AS confirmed_affected,
|
||||
COUNT(*) FILTER (WHERE vo.status = 'under_investigation') AS under_investigation,
|
||||
COUNT(*) FILTER (WHERE vo.status = 'fixed') AS marked_fixed,
|
||||
-- Noise reduction rate
|
||||
ROUND(100.0 * COUNT(*) FILTER (WHERE vo.status = 'not_affected') / NULLIF(COUNT(*), 0), 1) AS noise_reduction_pct
|
||||
FROM analytics.vex_overrides vo
|
||||
WHERE vo.created_at >= now() - INTERVAL '90 days'
|
||||
GROUP BY DATE_TRUNC('week', vo.created_at)
|
||||
ORDER BY week;
|
||||
```
|
||||
|
||||
## Performance Tips
|
||||
|
||||
1. **Use materialized views**: Views prefixed with `mv_` are pre-computed and fast to query
|
||||
2. **Add environment filter**: Most queries benefit from `WHERE environment = 'prod'`
|
||||
3. **Use stored procedures**: `sp_*` functions return JSON and handle caching
|
||||
4. **Limit results**: Always use `LIMIT` for large result sets
|
||||
5. **Check refresh times**: Views are refreshed daily; data may be up to 24h stale
|
||||
|
||||
## Query Parameters
|
||||
|
||||
Common filter parameters:
|
||||
|
||||
| Parameter | Type | Example | Notes |
|
||||
|-----------|------|---------|-------|
|
||||
| `environment` | TEXT | `'prod'`, `'stage'` | Filter by deployment environment |
|
||||
| `team` | TEXT | `'platform'` | Filter by owning team |
|
||||
| `severity` | TEXT | `'critical'`, `'high'` | Minimum severity level |
|
||||
| `days` | INT | `30`, `90` | Lookback period |
|
||||
| `limit` | INT | `20`, `100` | Max results |
|
||||