tests fixes and sprints work
This commit is contained in:
@@ -7,7 +7,7 @@ The Analytics module implements a **star-schema data warehouse** pattern optimiz
|
||||
1. **Separation of concerns**: Analytics schema is isolated from operational schemas (scanner, vex, proof_system)
|
||||
2. **Pre-computation**: Expensive aggregations computed in advance via materialized views
|
||||
3. **Audit trail**: Raw payloads preserved for reprocessing and compliance
|
||||
4. **Determinism**: All normalization functions are immutable and reproducible
|
||||
4. **Determinism**: Normalization functions are immutable and reproducible; array aggregates are ordered for stable outputs
|
||||
5. **Incremental updates**: Supports both full refresh and incremental ingestion
|
||||
|
||||
## Data Flow
|
||||
@@ -120,10 +120,9 @@ When a component is upserted, the `VulnerabilityCorrelationService` queries Conc
|
||||
2. Filter by version range matching
|
||||
3. Upsert to `component_vulns` with severity, EPSS, KEV flags
|
||||
|
||||
**Version range matching** uses Concelier's existing logic to handle:
|
||||
- Semver ranges: `>=1.0.0 <2.0.0`
|
||||
- Exact versions: `1.2.3`
|
||||
- Wildcards: `1.x`
|
||||
**Version range matching** currently supports semver ranges and exact matches via
|
||||
`VersionRuleEvaluator`. Non-semver schemes fall back to exact string matches; wildcard
|
||||
and ecosystem-specific ranges require upstream normalization.
|
||||
|
||||
## VEX Override Logic
|
||||
|
||||
@@ -145,7 +144,21 @@ COUNT(DISTINCT ac.artifact_id) FILTER (
|
||||
**Override validity:**
|
||||
- `valid_from`: When the override became effective
|
||||
- `valid_until`: Expiration (NULL = no expiration)
|
||||
- Only `status = 'not_affected'` reduces exposure counts
|
||||
- Only `status = 'not_affected'` reduces exposure counts, and only when the override is active in its validity window.
|
||||
|
||||
## Attestation Ingestion
|
||||
|
||||
Attestation ingestion consumes Attestor Rekor entry events and expects Sigstore bundles
|
||||
or raw DSSE envelopes. The ingestion service:
|
||||
- Resolves bundle URIs using `BundleUriTemplate`; `bundle:{digest}` maps to
|
||||
`cas://<DefaultBucket>/{digest}` by default.
|
||||
- Decodes DSSE payloads, computes `dsse_payload_hash`, and records `predicate_uri` plus
|
||||
Rekor log metadata (`rekor_log_id`, `rekor_log_index`).
|
||||
- Uses in-toto `subject` digests to link artifacts when reanalysis hints are absent.
|
||||
- Maps predicate URIs into `analytics_attestation_type` values
|
||||
(`provenance`, `sbom`, `vex`, `build`, `scan`, `policy`).
|
||||
- Expands VEX statements into `vex_overrides` rows, one per product reference, and
|
||||
captures optional validity timestamps when provided.
|
||||
|
||||
## Time-Series Rollups
|
||||
|
||||
@@ -164,14 +177,14 @@ Daily rollups computed by `compute_daily_rollups()`:
|
||||
- `total_components`: Distinct components
|
||||
- `unique_suppliers`: Distinct normalized suppliers
|
||||
|
||||
**Retention policy:** 90 days in hot storage; older data archived to cold storage.
|
||||
**Retention policy:** 90 days in hot storage; `compute_daily_rollups()` prunes older rows and downstream jobs archive to cold storage.
|
||||
|
||||
## Materialized View Refresh
|
||||
|
||||
All materialized views support `REFRESH ... CONCURRENTLY` for zero-downtime updates:
|
||||
|
||||
```sql
|
||||
-- Refresh all views (run daily via pg_cron or Scheduler)
|
||||
-- Refresh all views (non-concurrent; run off-peak)
|
||||
SELECT analytics.refresh_all_views();
|
||||
```
|
||||
|
||||
@@ -182,6 +195,19 @@ SELECT analytics.refresh_all_views();
|
||||
- `mv_attestation_coverage`: 02:45 UTC daily
|
||||
- `compute_daily_rollups()`: 03:00 UTC daily
|
||||
|
||||
Platform WebService can run the daily rollup + refresh loop via
|
||||
`PlatformAnalyticsMaintenanceService`. Configure the schedule with:
|
||||
- `Platform:AnalyticsMaintenance:Enabled` (default `true`)
|
||||
- `Platform:AnalyticsMaintenance:IntervalMinutes` (default `1440`)
|
||||
- `Platform:AnalyticsMaintenance:RunOnStartup` (default `true`)
|
||||
- `Platform:AnalyticsMaintenance:ComputeDailyRollups` (default `true`)
|
||||
- `Platform:AnalyticsMaintenance:RefreshMaterializedViews` (default `true`)
|
||||
- `Platform:AnalyticsMaintenance:BackfillDays` (default `0`, set to `0` to disable; recompute the most recent N days on the first maintenance run)
|
||||
|
||||
The hosted service issues concurrent refresh statements directly for each view.
|
||||
Use a DB scheduler (pg_cron) or external orchestrator if you need the staggered
|
||||
per-view timing above.
|
||||
|
||||
## Performance Considerations
|
||||
|
||||
### Indexing Strategy
|
||||
@@ -198,9 +224,9 @@ SELECT analytics.refresh_all_views();
|
||||
|
||||
| Query | Target | Notes |
|
||||
|-------|--------|-------|
|
||||
| `sp_top_suppliers(20)` | < 100ms | Uses materialized view |
|
||||
| `sp_license_heatmap()` | < 100ms | Uses materialized view |
|
||||
| `sp_vuln_exposure()` | < 200ms | Uses materialized view |
|
||||
| `sp_top_suppliers(20, 'prod')` | < 100ms | Uses materialized view when env is null; env filter reads base tables |
|
||||
| `sp_license_heatmap('prod')` | < 100ms | Uses materialized view when env is null; env filter reads base tables |
|
||||
| `sp_vuln_exposure()` | < 200ms | Uses materialized view for global queries; environment filters read base tables |
|
||||
| `sp_fixable_backlog()` | < 500ms | Live query with indexes |
|
||||
| `sp_attestation_gaps()` | < 100ms | Uses materialized view |
|
||||
|
||||
@@ -246,12 +272,12 @@ All tables include `created_at` and `updated_at` timestamps. Raw payload tables
|
||||
|
||||
### Upstream Dependencies
|
||||
|
||||
| Service | Event | Action |
|
||||
|---------|-------|--------|
|
||||
| Scanner | SBOM ingested | Normalize and upsert components |
|
||||
| Concelier | Advisory updated | Re-correlate affected components |
|
||||
| Excititor | VEX observation | Create/update vex_overrides |
|
||||
| Attestor | Attestation created | Upsert attestation record |
|
||||
| Service | Event | Contract | Action |
|
||||
|---------|-------|----------|--------|
|
||||
| Scanner | SBOM report ready | `scanner.event.report.ready@1` (`docs/modules/signals/events/orchestrator-scanner-events.md`) | Normalize and upsert components |
|
||||
| Concelier | Advisory observation/linkset updated | `advisory.observation.updated@1` (`docs/modules/concelier/events/advisory.observation.updated@1.schema.json`), `advisory.linkset.updated@1` (`docs/modules/concelier/events/advisory.linkset.updated@1.md`) | Re-correlate affected components |
|
||||
| Excititor | VEX statement changes | `vex.statement.*` (`docs/modules/excititor/architecture.md`) | Create/update vex_overrides |
|
||||
| Attestor | Rekor entry logged | `rekor.entry.logged` (`docs/modules/attestor/architecture.md`) | Upsert attestation record |
|
||||
|
||||
### Downstream Consumers
|
||||
|
||||
|
||||
Reference in New Issue
Block a user