Files
git.stella-ops.org/docs/modules/analytics/README.md
2026-01-22 19:08:46 +02:00

226 lines
9.0 KiB
Markdown

# Analytics Module
The Analytics module provides a star-schema data warehouse layer for SBOM and attestation data, enabling executive reporting, risk dashboards, and ad-hoc analysis.
## Overview
Stella Ops generates rich data through SBOM ingestion, vulnerability correlation, VEX assessments, and attestations. The Analytics module normalizes this data into a queryable warehouse schema optimized for:
- **Executive dashboards**: Risk posture, vulnerability trends, compliance status
- **Supply chain analysis**: Supplier concentration, license distribution
- **Security metrics**: CVE exposure, VEX effectiveness, MTTR tracking
- **Attestation coverage**: SLSA compliance, provenance gaps
## Key Capabilities
| Capability | Description |
|------------|-------------|
| Unified component registry | Canonical component table with normalized suppliers and licenses |
| Vulnerability correlation | Pre-joined component-vulnerability mapping with EPSS/KEV flags |
| VEX-adjusted exposure | Vulnerability counts that respect active VEX overrides (validity windows applied) |
| Attestation tracking | Provenance and SLSA level coverage by environment/team |
| Time-series rollups | Daily snapshots for trend analysis |
| Materialized views | Pre-computed aggregations for dashboard performance |
## Data Model
### Star Schema Overview
```
┌─────────────────┐
│ artifacts │ (dimension)
│ container/app │
└────────┬────────┘
┌──────────────┼──────────────┐
│ │ │
┌─────────▼──────┐ ┌─────▼─────┐ ┌──────▼──────┐
│ artifact_ │ │attestations│ │vex_overrides│
│ components │ │ (fact) │ │ (fact) │
│ (bridge) │ └───────────┘ └─────────────┘
└─────────┬──────┘
┌─────────▼──────┐
│ components │ (dimension)
│ unified │
│ registry │
└─────────┬──────┘
┌─────────▼──────┐
│ component_ │
│ vulns │ (fact)
│ (bridge) │
└────────────────┘
```
### Core Tables
| Table | Type | Purpose |
|-------|------|---------|
| `components` | Dimension | Unified component registry with PURL, supplier, license |
| `artifacts` | Dimension | Container images and applications with SBOM metadata |
| `artifact_components` | Bridge | Links artifacts to their SBOM components |
| `component_vulns` | Fact | Component-to-vulnerability mapping |
| `attestations` | Fact | Attestation metadata (provenance, SBOM, VEX) |
| `vex_overrides` | Fact | VEX status overrides with justifications |
| `raw_sboms` | Audit | Raw SBOM payloads for reprocessing |
| `raw_attestations` | Audit | Raw DSSE envelopes for audit |
| `daily_vulnerability_counts` | Rollup | Daily vuln aggregations |
| `daily_component_counts` | Rollup | Daily component aggregations |
Rollup retention is 90 days in hot storage. `compute_daily_rollups()` prunes
older rows after each run; archival follows operations runbooks.
Platform WebService can automate rollups + materialized view refreshes via
`PlatformAnalyticsMaintenanceService` (see `architecture.md` for schedule and
configuration).
Use `Platform:AnalyticsMaintenance:BackfillDays` to recompute the most recent
N days of rollups on the first maintenance run after downtime (set to `0` to disable).
### Materialized Views
| View | Refresh | Purpose |
|------|---------|---------|
| `mv_supplier_concentration` | Daily | Top suppliers by component count |
| `mv_license_distribution` | Daily | License category distribution |
| `mv_vuln_exposure` | Daily | CVE exposure adjusted by VEX |
| `mv_attestation_coverage` | Daily | Provenance/SLSA coverage by env/team |
Array-valued fields (for example `environments` and `ecosystems`) are ordered
alphabetically to keep analytics outputs deterministic.
## Quick Start
### Day-1 Queries
**Top supplier concentration (supply chain risk, optional environment filter):**
```sql
SELECT analytics.sp_top_suppliers(20, 'prod');
```
**License risk heatmap (optional environment filter):**
```sql
SELECT analytics.sp_license_heatmap('prod');
```
**CVE exposure adjusted by VEX:**
```sql
SELECT analytics.sp_vuln_exposure('prod', 'high');
```
**Fixable vulnerability backlog:**
```sql
SELECT analytics.sp_fixable_backlog('prod');
```
**Attestation coverage gaps:**
```sql
SELECT analytics.sp_attestation_gaps('prod');
```
### API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/analytics/suppliers` | GET | Supplier concentration data |
| `/api/analytics/licenses` | GET | License distribution |
| `/api/analytics/vulnerabilities` | GET | CVE exposure (VEX-adjusted) |
| `/api/analytics/backlog` | GET | Fixable vulnerability backlog |
| `/api/analytics/attestation-coverage` | GET | Attestation gaps |
| `/api/analytics/trends/vulnerabilities` | GET | Vulnerability time-series |
| `/api/analytics/trends/components` | GET | Component time-series |
All analytics endpoints require the `analytics.read` scope.
The platform metadata capability `analytics` reports whether analytics storage is configured.
#### Query Parameters
- `/api/analytics/suppliers`: `limit` (optional, default 20), `environment` (optional)
- `/api/analytics/licenses`: `environment` (optional)
- `/api/analytics/vulnerabilities`: `minSeverity` (optional, default `low`), `environment` (optional)
- `/api/analytics/backlog`: `environment` (optional)
- `/api/analytics/attestation-coverage`: `environment` (optional)
- `/api/analytics/trends/vulnerabilities`: `environment` (optional), `days` (optional, default 30)
- `/api/analytics/trends/components`: `environment` (optional), `days` (optional, default 30)
## Ingestion Configuration
Analytics ingestion runs inside the Platform WebService and subscribes to Scanner, Concelier, and Attestor streams. Configure ingestion via `Platform:AnalyticsIngestion`:
```yaml
Platform:
Storage:
PostgresConnectionString: "Host=...;Database=analytics;Username=...;Password=..."
AnalyticsIngestion:
Enabled: true
PostgresConnectionString: "" # optional; defaults to Platform:Storage
AllowedTenants: ["tenant-a", "tenant-b"]
Streams:
ScannerStream: "orchestrator:events"
ConcelierObservationStream: "concelier:advisory.observation.updated:v1"
ConcelierLinksetStream: "concelier:advisory.linkset.updated:v1"
AttestorStream: "attestor:events"
StartFromBeginning: false
Cas:
RootPath: "/var/lib/stellaops/cas"
DefaultBucket: "attestations"
Attestations:
BundleUriTemplate: "bundle:{digest}"
```
Bundle URI templates support:
- `{digest}` for the full digest string (for example `sha256:...`).
- `{hash}` for the raw hex digest (no algorithm prefix).
- `bundle:{digest}` which resolves to `cas://<DefaultBucket>/{digest}` by default.
- `file:/path/to/bundles/bundle-{hash}.json` for offline file ingestion.
For offline workflows, verify bundles with `stella bundle verify` before ingesting them.
## Console UI
SBOM Lake analytics are exposed in the Console under `Analytics > SBOM Lake` (`/analytics/sbom-lake`).
Console access requires `ui.read` plus `analytics.read` scopes.
Key UI features:
- Filters for environment, minimum severity, and time window.
- Panels for suppliers, licenses, vulnerability exposure, and attestation coverage.
- Trend views for vulnerabilities and components.
- Fixable backlog table with CSV export.
See [console.md](./console.md) for operator guidance and filter behavior.
## CLI Access
SBOM lake analytics are exposed via the CLI under `stella analytics sbom-lake`
(requires `analytics.read` scope).
```bash
# Top suppliers
stella analytics sbom-lake suppliers --limit 20
# Vulnerability exposure in prod (high+), CSV export
stella analytics sbom-lake vulnerabilities --environment prod --min-severity high --format csv --output vuln.csv
# 30-day trends for both series
stella analytics sbom-lake trends --days 30 --series all --format json
```
See `docs/modules/cli/guides/commands/analytics.md` for command-level details.
## Architecture
See [architecture.md](./architecture.md) for detailed design decisions, data flow, and normalization rules.
## Schema Reference
See [analytics_schema.sql](../../db/analytics_schema.sql) for complete DDL including:
- Table definitions with indexes
- Normalization functions
- Materialized views
- Stored procedures
- Refresh procedures
## Sprint Reference
Implementation tracked in:
- `docs/implplan/SPRINT_20260120_030_Platform_sbom_analytics_lake.md`
- `docs/implplan/SPRINT_20260120_032_Cli_sbom_analytics_cli.md`