Documentation cleanup, sprint work and planning; migrate remaining non-EF DAL to EF

This commit is contained in:
master
2026-02-25 01:24:07 +02:00
parent b07d27772e
commit 4db038123b
9090 changed files with 4836 additions and 2909 deletions

View File

@@ -2,7 +2,7 @@
> **Sprint:** SPRINT_20260107_006_003 Task CH-016
> **Status:** Active
> **Last Updated:** 2026-01-13
> **Last Updated:** 2026-02-24
The AdvisoryAI Chat Interface provides a conversational experience for security operators to investigate vulnerabilities, understand findings, and take remediation actions—all grounded in internal evidence with citations.
@@ -29,11 +29,21 @@ The chat interface enables:
## API Reference
### Endpoint Families and Migration Timeline
- Canonical chat surface: `/api/v1/chat/*`
- Legacy compatibility surface: `/v1/advisory-ai/conversations*`
- Legacy sunset date (UTC): **December 31, 2026**
- Legacy responses emit migration headers:
- `Deprecation: true`
- `Sunset: Thu, 31 Dec 2026 23:59:59 GMT`
- `Link: </api/v1/chat/query>; rel="successor-version"`
### Create Conversation
Creates a new conversation session.
Required headers: `X-StellaOps-User`, `X-StellaOps-Client`, and either `X-StellaOps-Roles` (`chat:user` or `chat:admin`) or `X-StellaOps-Scopes` (`advisory:chat` or `advisory:run`).
Required headers: `X-StellaOps-User`, `X-StellaOps-Client`, and either `X-StellaOps-Roles` (`chat:user` or `chat:admin`) or `X-StellaOps-Scopes` (`advisory-ai:view`, `advisory-ai:operate`, `advisory-ai:admin`, plus legacy `advisory:chat` / `advisory:run` aliases).
```http
POST /v1/advisory-ai/conversations
@@ -88,6 +98,8 @@ X-StellaOps-Client: ui
}
```
`content` is the canonical add-turn payload field. A temporary compatibility shim still accepts legacy `message` input and maps it to `content`; empty/whitespace payloads return HTTP 400. Legacy `message` usage emits a warning header: `Warning: 299 - Legacy chat payload field 'message' is deprecated; use 'content'.`
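The shim behavior described above can be sketched as follows. This is a minimal illustration of the documented mapping rules, not the actual service code; the helper name and return shape are assumptions.

```python
# Illustrative sketch of the 'message' -> 'content' compatibility shim.
# Function name and return tuple are hypothetical; only the mapping rules
# (legacy field accepted with a warning, empty payloads rejected) come from
# the documentation above.
WARNING_HEADER = (
    "299 - Legacy chat payload field 'message' is deprecated; use 'content'."
)

def normalize_turn_payload(payload: dict) -> tuple[dict, list[str], int]:
    """Map legacy 'message' to canonical 'content'; reject empty payloads."""
    warnings: list[str] = []
    content = payload.get("content")
    if content is None and "message" in payload:
        content = payload["message"]      # temporary compatibility shim
        warnings.append(WARNING_HEADER)   # emitted as a Warning header
    if content is None or not content.strip():
        return {}, warnings, 400          # empty/whitespace -> HTTP 400
    return {"content": content}, warnings, 200
```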
**Response (Server-Sent Events):**
```
event: token
@@ -112,12 +124,14 @@ event: done
data: {"turnId": "turn-xyz", "groundingScore": 0.92}
```
Conversation add-turn responses now use the same grounded runtime path as the chat gateway. When runtime generation is unavailable, the service returns an explicit deterministic fallback response with metadata (no placeholder responses).
### Get Conversation
Retrieves a conversation with its history.
```http
GET /api/v1/advisory-ai/conversations/{conversationId}
GET /v1/advisory-ai/conversations/{conversationId}
Authorization: Bearer <token>
```
@@ -157,7 +171,7 @@ Authorization: Bearer <token>
Lists conversations for a tenant/user.
```http
GET /api/v1/advisory-ai/conversations?tenantId=tenant-123&userId=user-xyz&limit=20
GET /v1/advisory-ai/conversations?tenantId=tenant-123&userId=user-xyz&limit=20
Authorization: Bearer <token>
```
@@ -166,7 +180,7 @@ Authorization: Bearer <token>
Deletes a conversation and its history.
```http
DELETE /api/v1/advisory-ai/conversations/{conversationId}
DELETE /v1/advisory-ai/conversations/{conversationId}
Authorization: Bearer <token>
```
@@ -205,6 +219,9 @@ AI responses include object links that reference internal evidence. These links
| Attestation | `[attest:dsse:{digest}]` | `[attest:dsse:sha256:xyz]` | Link to DSSE attestation |
| Authority Key | `[auth:keys/{keyId}]` | `[auth:keys/gitlab-oidc]` | Link to signing key |
| Documentation | `[docs:{path}]` | `[docs:scopes/ci-webhook]` | Link to documentation |
| Finding | `[finding:{id}]` | `[finding:CVE-2024-12345]` | Link to finding detail |
| Scan | `[scan:{id}]` | `[scan:scan-2026-02-24-001]` | Link to scan detail |
| Policy | `[policy:{id}]` | `[policy:DENY-CRITICAL-PROD]` | Link to policy detail |
### Link Resolution
@@ -436,4 +453,3 @@ AdvisoryAI:
- [Deployment Guide](deployment.md)
- [Security Guardrails](/docs/security/assistant-guardrails.md)
- [Controlled Conversational Interface Advisory](../../../docs-archived/product/advisories/13-Jan-2026%20-%20Controlled%20Conversational%20Interface.md)

View File

@@ -130,6 +130,8 @@ Implemented in `src/AdvisoryAI/StellaOps.AdvisoryAI/KnowledgeSearch/KnowledgeSea
- Query telemetry:
- Unified search emits hashed query telemetry (`SHA-256` query hash, intent, domain weights, latency, top domains) via `IUnifiedSearchTelemetrySink`.
- Web fallback behavior: when unified search fails, `UnifiedSearchClient` falls back to legacy AKS (`/v1/advisory-ai/search`) and maps grouped legacy results into unified cards (`diagnostics.mode = legacy-fallback`).
- UI now shows an explicit degraded-mode banner for `legacy-fallback` / `fallback-empty` modes and clears it automatically on recovery.
- Degraded-mode enter/exit transitions emit analytics markers (`__degraded_mode_enter__`, `__degraded_mode_exit__`); server-side search history intentionally ignores `__*` synthetic markers.
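The server-side filtering rule for synthetic markers can be sketched as below; the function name is illustrative, only the "ignore `__*` markers" rule comes from the text above.

```python
# Hypothetical sketch: search history ignores synthetic analytics markers
# such as __degraded_mode_enter__ / __degraded_mode_exit__.
SYNTHETIC_PREFIX = "__"

def record_search_history(history: list, query: str) -> list:
    """Append a query to history unless it is a synthetic __* marker."""
    if query.startswith(SYNTHETIC_PREFIX):
        return history  # analytics markers are intentionally not persisted
    return history + [query]
```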
## Web behavior
Global search now consumes AKS and supports:
@@ -140,6 +142,7 @@ Global search now consumes AKS and supports:
- API: `Curl` (copy command).
- Doctor: `Run` (navigate to doctor and copy run command).
- `More` action for "show more like this" local query expansion.
- Search-quality metrics taxonomy is standardized on `query`, `click`, and `zero_result` event types (no legacy `search` event dependency in quality SQL).
## CLI behavior
AKS commands:

View File

@@ -1,225 +0,0 @@
# Analytics Module
The Analytics module provides a star-schema data warehouse layer for SBOM and attestation data, enabling executive reporting, risk dashboards, and ad-hoc analysis.
## Overview
Stella Ops generates rich data through SBOM ingestion, vulnerability correlation, VEX assessments, and attestations. The Analytics module normalizes this data into a queryable warehouse schema optimized for:
- **Executive dashboards**: Risk posture, vulnerability trends, compliance status
- **Supply chain analysis**: Supplier concentration, license distribution
- **Security metrics**: CVE exposure, VEX effectiveness, MTTR tracking
- **Attestation coverage**: SLSA compliance, provenance gaps
## Key Capabilities
| Capability | Description |
|------------|-------------|
| Unified component registry | Canonical component table with normalized suppliers and licenses |
| Vulnerability correlation | Pre-joined component-vulnerability mapping with EPSS/KEV flags |
| VEX-adjusted exposure | Vulnerability counts that respect active VEX overrides (validity windows applied) |
| Attestation tracking | Provenance and SLSA level coverage by environment/team |
| Time-series rollups | Daily snapshots for trend analysis |
| Materialized views | Pre-computed aggregations for dashboard performance |
## Data Model
### Star Schema Overview
```
┌─────────────────┐
│ artifacts │ (dimension)
│ container/app │
└────────┬────────┘
┌──────────────┼──────────────┐
│ │ │
┌─────────▼──────┐ ┌─────▼─────┐ ┌──────▼──────┐
│ artifact_ │ │attestations│ │vex_overrides│
│ components │ │ (fact) │ │ (fact) │
│ (bridge) │ └───────────┘ └─────────────┘
└─────────┬──────┘
┌─────────▼──────┐
│ components │ (dimension)
│ unified │
│ registry │
└─────────┬──────┘
┌─────────▼──────┐
│ component_ │
│ vulns │ (fact)
│ (bridge) │
└────────────────┘
```
### Core Tables
| Table | Type | Purpose |
|-------|------|---------|
| `components` | Dimension | Unified component registry with PURL, supplier, license |
| `artifacts` | Dimension | Container images and applications with SBOM metadata |
| `artifact_components` | Bridge | Links artifacts to their SBOM components |
| `component_vulns` | Fact | Component-to-vulnerability mapping |
| `attestations` | Fact | Attestation metadata (provenance, SBOM, VEX) |
| `vex_overrides` | Fact | VEX status overrides with justifications |
| `raw_sboms` | Audit | Raw SBOM payloads for reprocessing |
| `raw_attestations` | Audit | Raw DSSE envelopes for audit |
| `daily_vulnerability_counts` | Rollup | Daily vuln aggregations |
| `daily_component_counts` | Rollup | Daily component aggregations |
Rollup retention is 90 days in hot storage. `compute_daily_rollups()` prunes
older rows after each run; archival follows operations runbooks.
Platform WebService can automate rollups + materialized view refreshes via
`PlatformAnalyticsMaintenanceService` (see `architecture.md` for schedule and
configuration).
Use `Platform:AnalyticsMaintenance:BackfillDays` to recompute the most recent
N days of rollups on the first maintenance run after downtime (set to `0` to disable).
### Materialized Views
| View | Refresh | Purpose |
|------|---------|---------|
| `mv_supplier_concentration` | Daily | Top suppliers by component count |
| `mv_license_distribution` | Daily | License category distribution |
| `mv_vuln_exposure` | Daily | CVE exposure adjusted by VEX |
| `mv_attestation_coverage` | Daily | Provenance/SLSA coverage by env/team |
Array-valued fields (for example `environments` and `ecosystems`) are ordered
alphabetically to keep analytics outputs deterministic.
## Quick Start
### Day-1 Queries
**Top supplier concentration (supply chain risk, optional environment filter):**
```sql
SELECT analytics.sp_top_suppliers(20, 'prod');
```
**License risk heatmap (optional environment filter):**
```sql
SELECT analytics.sp_license_heatmap('prod');
```
**CVE exposure adjusted by VEX:**
```sql
SELECT analytics.sp_vuln_exposure('prod', 'high');
```
**Fixable vulnerability backlog:**
```sql
SELECT analytics.sp_fixable_backlog('prod');
```
**Attestation coverage gaps:**
```sql
SELECT analytics.sp_attestation_gaps('prod');
```
### API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/analytics/suppliers` | GET | Supplier concentration data |
| `/api/analytics/licenses` | GET | License distribution |
| `/api/analytics/vulnerabilities` | GET | CVE exposure (VEX-adjusted) |
| `/api/analytics/backlog` | GET | Fixable vulnerability backlog |
| `/api/analytics/attestation-coverage` | GET | Attestation gaps |
| `/api/analytics/trends/vulnerabilities` | GET | Vulnerability time-series |
| `/api/analytics/trends/components` | GET | Component time-series |
All analytics endpoints require the `analytics.read` scope.
The platform metadata capability `analytics` reports whether analytics storage is configured.
#### Query Parameters
- `/api/analytics/suppliers`: `limit` (optional, default 20), `environment` (optional)
- `/api/analytics/licenses`: `environment` (optional)
- `/api/analytics/vulnerabilities`: `minSeverity` (optional, default `low`), `environment` (optional)
- `/api/analytics/backlog`: `environment` (optional)
- `/api/analytics/attestation-coverage`: `environment` (optional)
- `/api/analytics/trends/vulnerabilities`: `environment` (optional), `days` (optional, default 30)
- `/api/analytics/trends/components`: `environment` (optional), `days` (optional, default 30)
## Ingestion Configuration
Analytics ingestion runs inside the Platform WebService and subscribes to Scanner, Concelier, and Attestor streams. Configure ingestion via `Platform:AnalyticsIngestion`:
```yaml
Platform:
Storage:
PostgresConnectionString: "Host=...;Database=analytics;Username=...;Password=..."
AnalyticsIngestion:
Enabled: true
PostgresConnectionString: "" # optional; defaults to Platform:Storage
AllowedTenants: ["tenant-a", "tenant-b"]
Streams:
ScannerStream: "orchestrator:events"
ConcelierObservationStream: "concelier:advisory.observation.updated:v1"
ConcelierLinksetStream: "concelier:advisory.linkset.updated:v1"
AttestorStream: "attestor:events"
StartFromBeginning: false
Cas:
RootPath: "/var/lib/stellaops/cas"
DefaultBucket: "attestations"
Attestations:
BundleUriTemplate: "bundle:{digest}"
```
Bundle URI templates support:
- `{digest}` for the full digest string (for example `sha256:...`).
- `{hash}` for the raw hex digest (no algorithm prefix).
- `bundle:{digest}` which resolves to `cas://<DefaultBucket>/{digest}` by default.
- `file:/path/to/bundles/bundle-{hash}.json` for offline file ingestion.
For offline workflows, verify bundles with `stella bundle verify` before ingesting them.
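The template expansion rules above can be sketched in a few lines. This is an illustration of the documented placeholder semantics, not the ingestion service's actual resolver.

```python
# Illustrative resolver for BundleUriTemplate placeholders:
# {digest} = full digest, {hash} = raw hex without algorithm prefix,
# and bundle: URIs map to cas://<DefaultBucket>/ by default.
def resolve_bundle_uri(template: str, digest: str, default_bucket: str) -> str:
    algo_sep = digest.find(":")
    raw_hex = digest[algo_sep + 1:] if algo_sep >= 0 else digest
    uri = template.replace("{digest}", digest).replace("{hash}", raw_hex)
    if uri.startswith("bundle:"):
        # bundle:{digest} resolves to cas://<DefaultBucket>/{digest}
        uri = f"cas://{default_bucket}/{uri[len('bundle:'):]}"
    return uri
```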
## Console UI
SBOM Lake analytics are exposed in the Console under `Analytics > SBOM Lake` (`/analytics/sbom-lake`).
Console access requires `ui.read` plus `analytics.read` scopes.
Key UI features:
- Filters for environment, minimum severity, and time window.
- Panels for suppliers, licenses, vulnerability exposure, and attestation coverage.
- Trend views for vulnerabilities and components.
- Fixable backlog table with CSV export.
See [console.md](./console.md) for operator guidance and filter behavior.
## CLI Access
SBOM lake analytics are exposed via the CLI under `stella analytics sbom-lake`
(requires `analytics.read` scope).
```bash
# Top suppliers
stella analytics sbom-lake suppliers --limit 20
# Vulnerability exposure in prod (high+), CSV export
stella analytics sbom-lake vulnerabilities --environment prod --min-severity high --format csv --output vuln.csv
# 30-day trends for both series
stella analytics sbom-lake trends --days 30 --series all --format json
```
See `docs/modules/cli/guides/commands/analytics.md` for command-level details.
## Architecture
See [architecture.md](./architecture.md) for detailed design decisions, data flow, and normalization rules.
## Schema Reference
See [analytics_schema.sql](../../db/analytics_schema.sql) for complete DDL including:
- Table definitions with indexes
- Normalization functions
- Materialized views
- Stored procedures
- Refresh procedures
## Sprint Reference
Implementation tracked in:
- `docs/implplan/SPRINT_20260120_030_Platform_sbom_analytics_lake.md`
- `docs/implplan/SPRINT_20260120_032_Cli_sbom_analytics_cli.md`

View File

@@ -1,298 +0,0 @@
# Analytics Module Architecture
> **Implementation Note:** Analytics is a cross-cutting feature integrated into the **Platform WebService** (`src/Platform/`). There is no standalone `src/Analytics/` module. Data ingestion pipelines span Scanner, Concelier, and Attestor modules. See [Platform Architecture](../platform/architecture-overview.md) for service-level integration details.
## Design Philosophy
The Analytics module implements a **star-schema data warehouse** pattern optimized for analytical queries rather than transactional workloads. Key design principles:
1. **Separation of concerns**: Analytics schema is isolated from operational schemas (scanner, vex, proof_system)
2. **Pre-computation**: Expensive aggregations computed in advance via materialized views
3. **Audit trail**: Raw payloads preserved for reprocessing and compliance
4. **Determinism**: Normalization functions are immutable and reproducible; array aggregates are ordered for stable outputs
5. **Incremental updates**: Supports both full refresh and incremental ingestion
## Data Flow
```
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Scanner │ │ Concelier │ │ Attestor │
│ (SBOM) │ │ (Vuln) │ │ (DSSE) │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │
│ SBOM Ingested │ Vuln Updated │ Attestation Created
▼ ▼ ▼
┌──────────────────────────────────────────────────────┐
│ AnalyticsIngestionService │
│ - Normalize components (PURL, supplier, license) │
│ - Upsert to unified registry │
│ - Correlate with vulnerabilities │
│ - Store raw payloads │
└──────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────┐
│ analytics schema │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌────────────┐ │
│ │components│ │artifacts│ │comp_vuln│ │attestations│ │
│ └─────────┘ └─────────┘ └─────────┘ └────────────┘ │
└──────────────────────────────────────────────────────┘
│ Daily refresh
┌──────────────────────────────────────────────────────┐
│ Materialized Views │
│ mv_supplier_concentration | mv_license_distribution │
│ mv_vuln_exposure | mv_attestation_coverage │
└──────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────┐
│ Platform API Endpoints │
│ (with 5-minute caching) │
└──────────────────────────────────────────────────────┘
```
## Normalization Rules
### PURL Parsing
Package URLs (PURLs) are the canonical identifier for components. The `parse_purl()` function extracts:
| Field | Example | Notes |
|-------|---------|-------|
| `purl_type` | `maven`, `npm`, `pypi` | Ecosystem identifier |
| `purl_namespace` | `org.apache.logging` | Group/org/scope (optional) |
| `purl_name` | `log4j-core` | Package name |
| `purl_version` | `2.17.1` | Version string |
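The extraction in the table above can be sketched in Python (the real `parse_purl()` is a SQL function). This simplified version ignores PURL qualifiers (`?...`) and subpaths (`#...`).

```python
# Simplified sketch of parse_purl() field extraction; ignores qualifiers
# and subpaths, which full PURL parsing would also handle.
def parse_purl(purl: str) -> dict:
    """Split pkg:type/namespace/name@version into the four fields."""
    rest = purl[len("pkg:"):]
    rest, _, version = rest.partition("@")
    parts = rest.split("/")
    return {
        "purl_type": parts[0],
        "purl_namespace": "/".join(parts[1:-1]) or None,  # optional
        "purl_name": parts[-1],
        "purl_version": version or None,
    }
```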
### Supplier Normalization
The `normalize_supplier()` function standardizes supplier names for consistent grouping:
1. Convert to lowercase
2. Trim whitespace
3. Remove legal suffixes: Inc., LLC, Ltd., Corp., GmbH, B.V., S.A., PLC, Co.
4. Normalize internal whitespace
**Examples:**
- `"Apache Software Foundation, Inc."` → `"apache software foundation"`
- `"Google LLC"` → `"google"`
- `" Microsoft Corp. "` → `"microsoft"`
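The four steps above can be sketched as follows (the real `normalize_supplier()` is a SQL function; the regex details here are an assumption).

```python
import re

# Sketch of the normalize_supplier() steps: lowercase, trim, drop a trailing
# legal suffix, collapse internal whitespace. Suffix list mirrors the doc.
LEGAL_SUFFIXES = r"(?:inc\.?|llc|ltd\.?|corp\.?|gmbh|b\.v\.|s\.a\.|plc|co\.?)"

def normalize_supplier(name: str) -> str:
    s = name.lower().strip()                                  # steps 1-2
    s = re.sub(r"[,\s]\s*" + LEGAL_SUFFIXES + r"$", "", s)    # step 3
    return re.sub(r"\s+", " ", s).strip()                     # step 4
```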
### License Categorization
The `categorize_license()` function maps SPDX expressions to risk categories:
| Category | Examples | Risk Level |
|----------|----------|------------|
| `permissive` | MIT, Apache-2.0, BSD-3-Clause, ISC | Low |
| `copyleft-weak` | LGPL-2.1, MPL-2.0, EPL-2.0 | Medium |
| `copyleft-strong` | GPL-3.0, AGPL-3.0, SSPL | High |
| `proprietary` | Proprietary, Commercial | Review Required |
| `unknown` | Unrecognized expressions | Review Required |
**Special handling:**
- GPL with exceptions (e.g., `GPL-2.0 WITH Classpath-exception-2.0`) → `copyleft-weak`
- Dual-licensed (e.g., `MIT OR Apache-2.0`) → uses first match
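The mapping and special cases can be sketched as below. The real `categorize_license()` is a SQL function; the example sets here cover only the licenses named in the table.

```python
# Sketch of categorize_license(): first-match wins for OR expressions,
# WITH exceptions downgrade to copyleft-weak. License sets are the table's
# examples, not the full SPDX coverage of the real function.
PERMISSIVE = {"MIT", "Apache-2.0", "BSD-3-Clause", "ISC"}
COPYLEFT_WEAK = {"LGPL-2.1", "MPL-2.0", "EPL-2.0"}
COPYLEFT_STRONG = {"GPL-2.0", "GPL-3.0", "AGPL-3.0", "SSPL"}
PROPRIETARY = {"Proprietary", "Commercial"}

def categorize_license(expr: str) -> str:
    first = expr.split(" OR ")[0].strip()   # dual-licensed: use first match
    if " WITH " in first:
        return "copyleft-weak"              # e.g. GPL + Classpath exception
    if first in PERMISSIVE:
        return "permissive"
    if first in COPYLEFT_WEAK:
        return "copyleft-weak"
    if first in COPYLEFT_STRONG:
        return "copyleft-strong"
    if first in PROPRIETARY:
        return "proprietary"
    return "unknown"
```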
## Component Deduplication
Components are deduplicated by `(purl, hash_sha256)`:
1. If same PURL and hash: existing record updated (last_seen_at, counts)
2. If same PURL but different hash: new record created (version change)
3. If same hash but different PURL: new record (aliased package)
**Upsert pattern:**
```sql
INSERT INTO analytics.components (...)
VALUES (...)
ON CONFLICT (purl, hash_sha256) DO UPDATE SET
last_seen_at = now(),
sbom_count = components.sbom_count + 1,
updated_at = now();
```
## Vulnerability Correlation
When a component is upserted, the `VulnerabilityCorrelationService` queries Concelier for matching advisories:
1. Query by PURL type + namespace + name
2. Filter by version range matching
3. Upsert to `component_vulns` with severity, EPSS, KEV flags
**Version range matching** currently supports semver ranges and exact matches via
`VersionRuleEvaluator`. Non-semver schemes fall back to exact string matches; wildcard
and ecosystem-specific ranges require upstream normalization.
## VEX Override Logic
The `mv_vuln_exposure` view implements VEX-adjusted counts:
```sql
-- Effective count excludes artifacts with active VEX overrides
COUNT(DISTINCT ac.artifact_id) FILTER (
WHERE NOT EXISTS (
SELECT 1 FROM analytics.vex_overrides vo
WHERE vo.artifact_id = ac.artifact_id
AND vo.vuln_id = cv.vuln_id
AND vo.status = 'not_affected'
AND (vo.valid_until IS NULL OR vo.valid_until > now())
)
) AS effective_artifact_count
```
**Override validity:**
- `valid_from`: When the override became effective
- `valid_until`: Expiration (NULL = no expiration)
- Only `status = 'not_affected'` reduces exposure counts, and only when the override is active in its validity window.
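The validity rule can be expressed as a small predicate. This is an illustrative transcription of the three bullets above, not service code.

```python
from datetime import datetime, timezone
from typing import Optional

# Sketch of the override-validity rule: only an active 'not_affected'
# override reduces exposure counts. NULL valid_until means no expiration.
def override_reduces_exposure(status: str,
                              valid_from: Optional[datetime],
                              valid_until: Optional[datetime],
                              now: datetime) -> bool:
    if status != "not_affected":
        return False
    if valid_from is not None and valid_from > now:
        return False  # not yet effective
    if valid_until is not None and valid_until <= now:
        return False  # expired
    return True
```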
## Attestation Ingestion
Attestation ingestion consumes Attestor Rekor entry events and expects Sigstore bundles
or raw DSSE envelopes. The ingestion service:
- Resolves bundle URIs using `BundleUriTemplate`; `bundle:{digest}` maps to
`cas://<DefaultBucket>/{digest}` by default.
- Decodes DSSE payloads, computes `dsse_payload_hash`, and records `predicate_uri` plus
Rekor log metadata (`rekor_log_id`, `rekor_log_index`).
- Uses in-toto `subject` digests to link artifacts when reanalysis hints are absent.
- Maps predicate URIs into `analytics_attestation_type` values
(`provenance`, `sbom`, `vex`, `build`, `scan`, `policy`).
- Expands VEX statements into `vex_overrides` rows, one per product reference, and
captures optional validity timestamps when provided.
## Time-Series Rollups
Daily rollups computed by `compute_daily_rollups()`:
**Vulnerability counts** (per environment/team/severity):
- `total_vulns`: All affecting vulnerabilities
- `fixable_vulns`: Vulns with `fix_available = TRUE`
- `vex_mitigated`: Vulns with active `not_affected` override
- `kev_vulns`: Vulns in CISA KEV
- `unique_cves`: Distinct CVE IDs
- `affected_artifacts`: Artifacts containing affected components
- `affected_components`: Components with affecting vulns
**Component counts** (per environment/team/license/type):
- `total_components`: Distinct components
- `unique_suppliers`: Distinct normalized suppliers
**Retention policy:** 90 days in hot storage; `compute_daily_rollups()` prunes older rows and downstream jobs archive to cold storage.
## Materialized View Refresh
All materialized views support `REFRESH ... CONCURRENTLY` for zero-downtime updates:
```sql
-- Refresh all views (non-concurrent; run off-peak)
SELECT analytics.refresh_all_views();
```
**Refresh schedule (recommended):**
- `mv_supplier_concentration`: 02:00 UTC daily
- `mv_license_distribution`: 02:15 UTC daily
- `mv_vuln_exposure`: 02:30 UTC daily
- `mv_attestation_coverage`: 02:45 UTC daily
- `compute_daily_rollups()`: 03:00 UTC daily
Platform WebService can run the daily rollup + refresh loop via
`PlatformAnalyticsMaintenanceService`. Configure the schedule with:
- `Platform:AnalyticsMaintenance:Enabled` (default `true`)
- `Platform:AnalyticsMaintenance:IntervalMinutes` (default `1440`)
- `Platform:AnalyticsMaintenance:RunOnStartup` (default `true`)
- `Platform:AnalyticsMaintenance:ComputeDailyRollups` (default `true`)
- `Platform:AnalyticsMaintenance:RefreshMaterializedViews` (default `true`)
- `Platform:AnalyticsMaintenance:BackfillDays` (default `0`, which disables backfill; set to N to recompute the most recent N days of rollups on the first maintenance run)
The hosted service issues concurrent refresh statements directly for each view.
Use a DB scheduler (pg_cron) or external orchestrator if you need the staggered
per-view timing above.
## Performance Considerations
### Indexing Strategy
| Table | Key Indexes | Query Pattern |
|-------|-------------|---------------|
| `components` | `purl`, `supplier_normalized`, `license_category` | Lookup, aggregation |
| `artifacts` | `digest`, `environment`, `team` | Lookup, filtering |
| `component_vulns` | `vuln_id`, `severity`, `fix_available` | Join, filtering |
| `attestations` | `artifact_id`, `predicate_type` | Join, aggregation |
| `vex_overrides` | `(artifact_id, vuln_id)`, `status` | Subquery exists |
### Query Performance Targets
| Query | Target | Notes |
|-------|--------|-------|
| `sp_top_suppliers(20, 'prod')` | < 100ms | Uses materialized view when env is null; env filter reads base tables |
| `sp_license_heatmap('prod')` | < 100ms | Uses materialized view when env is null; env filter reads base tables |
| `sp_vuln_exposure()` | < 200ms | Uses materialized view for global queries; environment filters read base tables |
| `sp_fixable_backlog()` | < 500ms | Live query with indexes |
| `sp_attestation_gaps()` | < 100ms | Uses materialized view |
### Caching Strategy
Platform API endpoints use a 5-minute TTL cache:
- Cache key: endpoint + query parameters
- Invalidation: Time-based only (no event-driven invalidation)
- Storage: Valkey (in-memory)
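A cache key built from endpoint plus query parameters could look like the sketch below; the exact key layout used by the service is not documented here, so this is an assumption.

```python
# Hypothetical cache-key builder for the 5-minute TTL cache described above.
# Parameters are sorted so equivalent queries share one cache entry.
def analytics_cache_key(endpoint: str, params: dict) -> str:
    qs = "&".join(f"{k}={params[k]}" for k in sorted(params))
    return f"{endpoint}?{qs}" if qs else endpoint
```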
## Security Considerations
### Schema Permissions
```sql
-- Read-only role for dashboards
GRANT USAGE ON SCHEMA analytics TO dashboard_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO dashboard_reader;
GRANT SELECT ON ALL SEQUENCES IN SCHEMA analytics TO dashboard_reader;
-- Write role for ingestion service
GRANT USAGE ON SCHEMA analytics TO analytics_writer;
GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA analytics TO analytics_writer;
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA analytics TO analytics_writer;
```
### Data Classification
| Table | Classification | Notes |
|-------|----------------|-------|
| `components` | Internal | Contains package names, versions |
| `artifacts` | Internal | Contains image names, team names |
| `component_vulns` | Internal | Vulnerability data (public CVEs) |
| `vex_overrides` | Confidential | Contains justifications, operator IDs |
| `raw_sboms` | Confidential | Full SBOM payloads |
| `raw_attestations` | Confidential | Signed attestation envelopes |
### Audit Trail
All tables include `created_at` and `updated_at` timestamps. Raw payload tables (`raw_sboms`, `raw_attestations`) are append-only with content hashes for integrity verification.
## Integration Points
### Upstream Dependencies
| Service | Event | Contract | Action |
|---------|-------|----------|--------|
| Scanner | SBOM report ready | `scanner.event.report.ready@1` (`docs/modules/signals/events/orchestrator-scanner-events.md`) | Normalize and upsert components |
| Concelier | Advisory observation/linkset updated | `advisory.observation.updated@1` (`docs/modules/concelier/events/advisory.observation.updated@1.schema.json`), `advisory.linkset.updated@1` (`docs/modules/concelier/events/advisory.linkset.updated@1.md`) | Re-correlate affected components |
| Excititor | VEX statement changes | `vex.statement.*` (`docs/modules/excititor/architecture.md`) | Create/update vex_overrides |
| Attestor | Rekor entry logged | `rekor.entry.logged` (`docs/modules/attestor/architecture.md`) | Upsert attestation record |
### Downstream Consumers
| Consumer | Data | Endpoint |
|----------|------|----------|
| Console UI | Dashboard data | `/api/analytics/*` |
| Export Center | Compliance reports | Direct DB query |
| AdvisoryAI | Risk context | `/api/analytics/vulnerabilities` |
## Future Enhancements
1. **Partitioning**: Partition `daily_*` tables by date for faster queries and archival
2. **Incremental refresh**: Implement incremental materialized view refresh for large datasets
3. **Custom dimensions**: Support user-defined component groupings (business units, cost centers)
4. **Predictive analytics**: Add ML-based risk prediction using historical trends
5. **BI tool integration**: Direct connectors for Tableau, Looker, Metabase

View File

@@ -1,64 +0,0 @@
# Analytics Console (SBOM Lake)
The Console exposes SBOM analytics lake data under `Analytics > SBOM Lake`.
This view is read-only and uses the analytics API endpoints documented in `docs/modules/analytics/README.md`.
## Access
- Route: `/analytics/sbom-lake`
- Required scopes: `ui.read` and `analytics.read`
- Console admin bundles: `role/analytics-viewer`, `role/analytics-operator`, `role/analytics-admin`
- Data freshness: the page surfaces the latest `dataAsOf` timestamp returned by the API.
## Filters
The SBOM Lake page supports three filters that round-trip via URL query parameters:
- Environment: `env` (optional, example: `Prod`)
- Minimum severity: `severity` (optional, example: `high`)
- Time window (days): `days` (optional, example: `90`)
When a filter changes, the Console reloads all panels using the updated parameters.
Supplier and license panels honor the environment filter alongside the other views.
## Panels
The dashboard presents four summary panels:
1. Supplier concentration (top suppliers by component count)
2. License distribution (license categories and counts)
3. Vulnerability exposure (top CVEs after VEX adjustments)
4. Attestation coverage (provenance and SLSA 2+ coverage)
Each panel shows a loading state, empty state, and summary counts.
## Trends
Two trend panels are included:
- Vulnerability trend: net exposure over the selected time window
- Component trend: total components and unique suppliers
The Console aggregates trend points by date and renders a simple bar chart plus a compact list.
## Fixable Backlog
The fixable backlog table lists vulnerabilities with fixes available, grouped by component and service.
The "Top backlog components" table derives a component summary from the same backlog data.
### CSV Export
The "Export backlog CSV" action downloads a deterministic, ordered CSV with:
- Service
- Component
- Version
- Vulnerability
- Severity
- Environment
- Fixed version
## Troubleshooting
- If panels show "No data", verify that the analytics schema and materialized views are populated.
- If an error banner appears, check the analytics API availability and ensure the tenant has `analytics.read`.

View File

@@ -1,422 +0,0 @@
# Analytics Query Library
This document provides ready-to-use SQL queries for common analytics use cases. All queries are optimized for the analytics star schema.
## Executive Dashboard Queries
### 1. Top Supplier Concentration (Supply Chain Risk)
Identifies suppliers with the highest component footprint, indicating supply chain concentration risk.
```sql
-- Via stored procedure (recommended, optional environment filter)
SELECT analytics.sp_top_suppliers(20, 'prod');
-- Direct query
SELECT
supplier,
component_count,
artifact_count,
team_count,
critical_vuln_count,
high_vuln_count,
environments
FROM analytics.mv_supplier_concentration
ORDER BY component_count DESC
LIMIT 20;
```
**Use case**: Identify vendors that, if compromised, would affect the most artifacts.
### 2. License Risk Heatmap
Shows distribution of components by license category for compliance review.
```sql
-- Via stored procedure (optional environment filter)
SELECT analytics.sp_license_heatmap('prod');
-- Direct query with grouping
SELECT
license_category,
SUM(component_count) AS total_components,
SUM(artifact_count) AS total_artifacts,
COUNT(DISTINCT license_concluded) AS unique_licenses
FROM analytics.mv_license_distribution
GROUP BY license_category
ORDER BY
CASE license_category
WHEN 'copyleft-strong' THEN 1
WHEN 'proprietary' THEN 2
WHEN 'unknown' THEN 3
WHEN 'copyleft-weak' THEN 4
ELSE 5
END;
```
**Use case**: Compliance review, identify components requiring legal review.
### 3. CVE Exposure Adjusted by VEX
Shows true vulnerability exposure after applying VEX mitigations.
```sql
-- Via stored procedure
SELECT analytics.sp_vuln_exposure('prod', 'high');
-- Direct query showing VEX effectiveness (global view; use sp_vuln_exposure for environment filtering)
SELECT
vuln_id,
severity::TEXT,
cvss_score,
epss_score,
kev_listed,
fix_available,
raw_artifact_count AS total_affected,
effective_artifact_count AS actually_affected,
raw_artifact_count - effective_artifact_count AS vex_mitigated,
ROUND(100.0 * (raw_artifact_count - effective_artifact_count) / NULLIF(raw_artifact_count, 0), 1) AS mitigation_rate
FROM analytics.mv_vuln_exposure
WHERE effective_artifact_count > 0
ORDER BY
CASE severity
WHEN 'critical' THEN 1
WHEN 'high' THEN 2
WHEN 'medium' THEN 3
ELSE 4
END,
effective_artifact_count DESC
LIMIT 50;
```
**Use case**: Show executives the "real" risk after VEX assessment.
### 4. Fixable Vulnerability Backlog
Lists vulnerabilities that can be fixed today (fix available, not VEX-mitigated).
```sql
-- Via stored procedure
SELECT analytics.sp_fixable_backlog('prod');
-- Direct query with priority scoring
SELECT
a.name AS service,
a.environment,
a.team,
c.name AS component,
c.version AS current_version,
cv.vuln_id,
cv.severity::TEXT,
cv.cvss_score,
cv.epss_score,
cv.fixed_version,
cv.kev_listed,
-- Priority score: higher = fix first
(
CASE cv.severity
WHEN 'critical' THEN 100
WHEN 'high' THEN 75
WHEN 'medium' THEN 50
ELSE 25
END
+ COALESCE(cv.epss_score * 100, 0)
+ (CASE WHEN cv.kev_listed THEN 50 ELSE 0 END)
)::INT AS priority_score
FROM analytics.component_vulns cv
JOIN analytics.components c ON c.component_id = cv.component_id
JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
LEFT JOIN analytics.vex_overrides vo ON vo.artifact_id = a.artifact_id
AND vo.vuln_id = cv.vuln_id
AND vo.status = 'not_affected'
AND vo.valid_from <= now()
AND (vo.valid_until IS NULL OR vo.valid_until > now())
WHERE cv.affects = TRUE
AND cv.fix_available = TRUE
AND vo.override_id IS NULL
AND a.environment = 'prod'
ORDER BY priority_score DESC, a.name
LIMIT 100;
```
**Use case**: Prioritize remediation work based on risk and fixability.
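The priority score from the query above transcribes directly to Python, which can be handy for scripting against the API; same weights, illustrative only.

```python
# Python transcription of the SQL priority score: severity weight
# + EPSS (scaled to 0-100) + a flat CISA KEV bonus. Higher = fix first.
SEVERITY_WEIGHT = {"critical": 100, "high": 75, "medium": 50}

def priority_score(severity, epss_score, kev_listed):
    score = SEVERITY_WEIGHT.get(severity, 25)   # ELSE 25 branch
    score += (epss_score or 0.0) * 100          # COALESCE(epss * 100, 0)
    score += 50 if kev_listed else 0            # KEV bonus
    return int(score)
```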
### 5. Build Integrity / Attestation Coverage
Shows attestation gaps by environment and team.
```sql
-- Via stored procedure
SELECT analytics.sp_attestation_gaps('prod');
-- Direct query with gap analysis
SELECT
environment,
team,
total_artifacts,
with_provenance,
total_artifacts - with_provenance AS missing_provenance,
provenance_pct,
slsa_level_2_plus,
slsa2_pct,
with_sbom_attestation,
with_vex_attestation
FROM analytics.mv_attestation_coverage
WHERE environment = 'prod'
ORDER BY provenance_pct ASC;
```
**Use case**: Identify teams/environments not meeting attestation requirements.
## Trend Analysis Queries
### 6. Vulnerability Trend (30 Days)
```sql
SELECT
snapshot_date,
environment,
SUM(total_vulns) AS total_vulns,
SUM(fixable_vulns) AS fixable_vulns,
SUM(vex_mitigated) AS vex_mitigated,
SUM(total_vulns) - SUM(vex_mitigated) AS net_exposure,
SUM(kev_vulns) AS kev_vulns
FROM analytics.daily_vulnerability_counts
WHERE snapshot_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY snapshot_date, environment
ORDER BY environment, snapshot_date;
```
### 7. Vulnerability Trend by Severity
```sql
SELECT
snapshot_date,
severity::TEXT,
SUM(total_vulns) AS total_vulns
FROM analytics.daily_vulnerability_counts
WHERE snapshot_date >= CURRENT_DATE - INTERVAL '30 days'
AND environment = 'prod'
GROUP BY snapshot_date, severity
ORDER BY snapshot_date,
CASE severity
WHEN 'critical' THEN 1
WHEN 'high' THEN 2
WHEN 'medium' THEN 3
ELSE 4
END;
```
### 8. Component Growth Trend
```sql
SELECT
snapshot_date,
environment,
SUM(total_components) AS total_components,
SUM(unique_suppliers) AS unique_suppliers
FROM analytics.daily_component_counts
WHERE snapshot_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY snapshot_date, environment
ORDER BY environment, snapshot_date;
```
## Deep-Dive Queries
### 9. Component Impact Analysis
Find all artifacts affected by a specific component.
```sql
SELECT
a.name AS artifact,
a.version,
a.environment,
a.team,
ac.depth AS dependency_depth,
ac.introduced_via
FROM analytics.components c
JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
WHERE c.purl LIKE 'pkg:maven/org.apache.logging.log4j/log4j-core%'
ORDER BY a.environment, a.name;
```
### 10. CVE Impact Analysis
Find all artifacts affected by a specific CVE.
```sql
SELECT DISTINCT
a.name AS artifact,
a.version,
a.environment,
a.team,
c.name AS component,
c.version AS component_version,
cv.cvss_score,
cv.fixed_version,
CASE
WHEN vo.status = 'not_affected' THEN 'VEX Mitigated'
WHEN cv.fix_available THEN 'Fix Available'
ELSE 'Vulnerable'
END AS status
FROM analytics.component_vulns cv
JOIN analytics.components c ON c.component_id = cv.component_id
JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
LEFT JOIN analytics.vex_overrides vo ON vo.artifact_id = a.artifact_id
AND vo.vuln_id = cv.vuln_id
AND vo.valid_from <= now()
AND (vo.valid_until IS NULL OR vo.valid_until > now())
WHERE cv.vuln_id = 'CVE-2021-44228'
ORDER BY a.environment, a.name;
```
### 11. Supplier Vulnerability Profile
Detailed vulnerability breakdown for a specific supplier.
```sql
SELECT
c.supplier_normalized AS supplier,
c.name AS component,
c.version,
cv.vuln_id,
cv.severity::TEXT,
cv.cvss_score,
cv.kev_listed,
cv.fix_available,
cv.fixed_version
FROM analytics.components c
JOIN analytics.component_vulns cv ON cv.component_id = c.component_id
WHERE c.supplier_normalized = 'apache software foundation'
AND cv.affects = TRUE
ORDER BY
CASE cv.severity
WHEN 'critical' THEN 1
WHEN 'high' THEN 2
ELSE 3
END,
cv.cvss_score DESC;
```
### 12. License Compliance Report
Components with concerning licenses in production.
```sql
SELECT
c.name AS component,
c.version,
c.license_concluded,
c.license_category::TEXT,
c.supplier_normalized AS supplier,
COUNT(DISTINCT a.artifact_id) AS artifact_count,
ARRAY_AGG(DISTINCT a.name ORDER BY a.name) AS affected_artifacts
FROM analytics.components c
JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
WHERE c.license_category IN ('copyleft-strong', 'proprietary', 'unknown')
AND a.environment = 'prod'
GROUP BY c.component_id, c.name, c.version, c.license_concluded, c.license_category, c.supplier_normalized
ORDER BY c.license_category, artifact_count DESC;
```
### 13. MTTR Analysis
Mean time to remediate by severity.
```sql
SELECT
cv.severity::TEXT,
COUNT(*) AS remediated_vulns,
AVG(EXTRACT(EPOCH FROM (vo.valid_from - cv.published_at)) / 86400)::NUMERIC(10,2) AS avg_days_to_mitigate,
PERCENTILE_CONT(0.5) WITHIN GROUP (
ORDER BY EXTRACT(EPOCH FROM (vo.valid_from - cv.published_at)) / 86400
)::NUMERIC(10,2) AS median_days,
PERCENTILE_CONT(0.9) WITHIN GROUP (
ORDER BY EXTRACT(EPOCH FROM (vo.valid_from - cv.published_at)) / 86400
)::NUMERIC(10,2) AS p90_days
FROM analytics.component_vulns cv
JOIN analytics.vex_overrides vo ON vo.vuln_id = cv.vuln_id
AND vo.status = 'not_affected'
AND vo.valid_from <= now()
AND (vo.valid_until IS NULL OR vo.valid_until > now())
WHERE cv.published_at >= now() - INTERVAL '90 days'
AND cv.published_at IS NOT NULL
GROUP BY cv.severity
ORDER BY
CASE cv.severity
WHEN 'critical' THEN 1
WHEN 'high' THEN 2
WHEN 'medium' THEN 3
ELSE 4
END;
```
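The avg/median/p90 aggregates map onto standard descriptive statistics. The sketch below is illustrative only; Python's `statistics.quantiles` with `method="inclusive"` uses the same linear interpolation as SQL's `PERCENTILE_CONT`:

```python
import statistics

def mttr_summary(days_to_mitigate: list[float]) -> dict[str, float]:
    """Avg, median, and p90 remediation time in days, as in the MTTR query."""
    # method="inclusive" linearly interpolates, like PERCENTILE_CONT(0.9)
    p90 = statistics.quantiles(days_to_mitigate, n=10, method="inclusive")[8]
    return {
        "avg_days": round(statistics.mean(days_to_mitigate), 2),
        "median_days": round(statistics.median(days_to_mitigate), 2),
        "p90_days": round(p90, 2),
    }

print(mttr_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
```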
### 14. Transitive Dependency Risk
Components introduced through transitive dependencies.
```sql
SELECT
c.name AS transitive_component,
c.version,
ac.introduced_via AS direct_dependency,
ac.depth,
COUNT(DISTINCT cv.vuln_id) AS vuln_count,
SUM(CASE WHEN cv.severity = 'critical' THEN 1 ELSE 0 END) AS critical_count,
COUNT(DISTINCT a.artifact_id) AS affected_artifacts
FROM analytics.components c
JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
LEFT JOIN analytics.component_vulns cv ON cv.component_id = c.component_id AND cv.affects = TRUE
WHERE ac.depth > 0 -- Transitive only
AND a.environment = 'prod'
GROUP BY c.component_id, c.name, c.version, ac.introduced_via, ac.depth
HAVING COUNT(cv.vuln_id) > 0
ORDER BY critical_count DESC, vuln_count DESC
LIMIT 50;
```
### 15. VEX Effectiveness Report
How effective is the VEX program at reducing noise?
```sql
SELECT
DATE_TRUNC('week', vo.created_at)::DATE AS week,
COUNT(*) AS total_overrides,
COUNT(*) FILTER (WHERE vo.status = 'not_affected') AS not_affected,
COUNT(*) FILTER (WHERE vo.status = 'affected') AS confirmed_affected,
COUNT(*) FILTER (WHERE vo.status = 'under_investigation') AS under_investigation,
COUNT(*) FILTER (WHERE vo.status = 'fixed') AS marked_fixed,
-- Noise reduction rate
ROUND(100.0 * COUNT(*) FILTER (WHERE vo.status = 'not_affected') / NULLIF(COUNT(*), 0), 1) AS noise_reduction_pct
FROM analytics.vex_overrides vo
WHERE vo.created_at >= now() - INTERVAL '90 days'
GROUP BY DATE_TRUNC('week', vo.created_at)
ORDER BY week;
```
## Performance Tips
1. **Use materialized views**: Queries prefixed with `mv_` are pre-computed and fast
2. **Add environment filter**: Most queries benefit from `WHERE environment = 'prod'`
3. **Use stored procedures**: `sp_*` functions return JSON and handle caching
4. **Limit results**: Always use `LIMIT` for large result sets
5. **Check refresh times**: Views are refreshed daily; data may be up to 24h stale
## Query Parameters
Common filter parameters:
| Parameter | Type | Example | Notes |
|-----------|------|---------|-------|
| `environment` | TEXT | `'prod'`, `'stage'` | Filter by deployment environment |
| `team` | TEXT | `'platform'` | Filter by owning team |
| `severity` | TEXT | `'critical'`, `'high'` | Minimum severity level |
| `days` | INT | `30`, `90` | Lookback period |
| `limit` | INT | `20`, `100` | Max results |

# Benchmark
> **Dual Purpose:** This documentation covers two aspects:
> - **Performance Benchmarking** (Production) — BenchmarkDotNet harnesses in `src/Bench/` for scanner, policy, and notification performance testing
> - **Competitive Benchmarking** (Planned) — Accuracy comparison framework in `src/Scanner/__Libraries/StellaOps.Scanner.Benchmark/`
**Status:** Implemented
**Source:** `src/Bench/`
**Owner:** Platform Team
## Purpose
Benchmark provides performance testing and regression analysis for StellaOps components. It ensures deterministic scan times, validates throughput, and profiles performance on critical paths (scanning, policy evaluation, SBOM generation).
## Components
**Services:**
- `StellaOps.Bench` - Benchmarking harness with BenchmarkDotNet integration
**Key Features:**
- Scanner performance benchmarks (per-analyzer, full-scan)
- Policy engine evaluation latency tests
- SBOM generation throughput tests
- Database query performance profiling
- Determinism validation (output stability)
## Configuration
Benchmark configuration via BenchmarkDotNet attributes and runtime parameters.
Key settings:
- Benchmark filters and categories
- Iterations and warmup counts
- Memory profiling and allocation tracking
- Export formats (JSON, HTML, Markdown)
## Dependencies
- BenchmarkDotNet framework
- Scanner (benchmark targets)
- Policy Engine (benchmark targets)
- SbomService (benchmark targets)
- Test fixtures and datasets
## Related Documentation
- Architecture: `./architecture.md`
- Scanner: `../scanner/`
- Policy: `../policy/`
- Operations: `./operations/` (if present)
## Current Status
Implemented with BenchmarkDotNet harness. Provides performance baselines for scanner analyzers, policy evaluation, and SBOM generation. Used for regression detection in CI/CD pipeline.
# Benchmark Module Architecture
## Overview
The Benchmark module provides infrastructure for validating and demonstrating Stella Ops' competitive advantages through automated comparison against other container security scanners (Trivy, Grype, Syft, etc.).
**Module Path**: `src/Scanner/__Libraries/StellaOps.Scanner.Benchmark/`
**Status**: PLANNED (Sprint 7000.0001.0001)
> **Note:** This module focuses on **competitive benchmarking** (accuracy comparison with other scanners). For **performance benchmarks** of StellaOps modules (LinkNotMerge, Notify, PolicyEngine, Scanner.Analyzers), see `src/Bench/`.
---
## Mission
Establish verifiable, reproducible benchmarks that:
1. Validate competitive claims with evidence
2. Detect regressions in accuracy or performance
3. Generate marketing-ready comparison materials
4. Provide ground-truth corpus for testing
---
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ Benchmark Module │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Corpus │ │ Harness │ │ Metrics │ │
│ │ Manager │───▶│ Runner │───▶│ Calculator │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │ │ │ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │Ground Truth │ │ Competitor │ │ Claims │ │
│ │ Manifest │ │ Adapters │ │ Index │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
```
---
## Components
### 1. Corpus Manager
**Namespace**: `StellaOps.Scanner.Benchmark.Corpus`
Manages the ground-truth corpus of container images with known vulnerabilities.
```csharp
public interface ICorpusManager
{
Task<Corpus> LoadCorpusAsync(string corpusPath, CancellationToken ct);
Task<CorpusImage> GetImageAsync(string digest, CancellationToken ct);
Task<GroundTruth> GetGroundTruthAsync(string digest, CancellationToken ct);
}
public record Corpus(
string Version,
DateTimeOffset CreatedAt,
ImmutableArray<CorpusImage> Images
);
public record CorpusImage(
string Digest,
string Name,
string Tag,
CorpusCategory Category,
GroundTruth GroundTruth
);
public record GroundTruth(
ImmutableArray<string> TruePositives,
ImmutableArray<string> KnownFalsePositives,
ImmutableArray<string> Notes
);
public enum CorpusCategory
{
BaseOS, // Alpine, Debian, Ubuntu, RHEL
ApplicationNode, // Node.js applications
ApplicationPython, // Python applications
ApplicationJava, // Java applications
ApplicationDotNet, // .NET applications
BackportScenario, // Known backported fixes
Unreachable // Known unreachable vulns
}
```
### 2. Harness Runner
**Namespace**: `StellaOps.Scanner.Benchmark.Harness`
Executes scans using Stella Ops and competitor tools.
```csharp
public interface IHarnessRunner
{
Task<BenchmarkRun> RunAsync(
Corpus corpus,
ImmutableArray<ITool> tools,
BenchmarkOptions options,
CancellationToken ct
);
}
public interface ITool
{
string Name { get; }
string Version { get; }
Task<ToolResult> ScanAsync(string imageRef, CancellationToken ct);
}
public record BenchmarkRun(
string RunId,
DateTimeOffset StartedAt,
DateTimeOffset CompletedAt,
ImmutableArray<ToolResult> Results
);
public record ToolResult(
string ToolName,
string ToolVersion,
string ImageDigest,
ImmutableArray<NormalizedFinding> Findings,
TimeSpan Duration
);
```
### 3. Competitor Adapters
**Namespace**: `StellaOps.Scanner.Benchmark.Adapters`
Normalize output from competitor tools.
```csharp
public interface ICompetitorAdapter : ITool
{
Task<ImmutableArray<NormalizedFinding>> ParseOutputAsync(
string output,
CancellationToken ct
);
}
// Implementations
public class TrivyAdapter : ICompetitorAdapter { }
public class GrypeAdapter : ICompetitorAdapter { }
public class SyftAdapter : ICompetitorAdapter { }
public class StellaOpsAdapter : ICompetitorAdapter { }
```
### 4. Metrics Calculator
**Namespace**: `StellaOps.Scanner.Benchmark.Metrics`
Calculate precision, recall, F1, and other metrics.
```csharp
public interface IMetricsCalculator
{
BenchmarkMetrics Calculate(
ToolResult result,
GroundTruth groundTruth
);
ComparativeMetrics Compare(
BenchmarkMetrics baseline,
BenchmarkMetrics comparison
);
}
public record BenchmarkMetrics(
int TruePositives,
int FalsePositives,
int TrueNegatives,
int FalseNegatives,
double Precision,
double Recall,
double F1Score,
ImmutableDictionary<string, BenchmarkMetrics> ByCategory
);
public record ComparativeMetrics(
string BaselineTool,
string ComparisonTool,
double PrecisionDelta,
double RecallDelta,
double F1Delta,
ImmutableArray<string> UniqueFindings,
ImmutableArray<string> MissedFindings
);
```
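The record above carries the derived ratios; for reference, a minimal Python sketch (illustrative, not the C# implementation) of how precision, recall, and F1 follow from the counts:

```python
def classification_metrics(tp: int, fp: int, fn: int) -> dict[str, float]:
    """Precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# A scanner that reports 3 real findings, 1 spurious one, and misses 1:
print(classification_metrics(tp=3, fp=1, fn=1))  # precision, recall, F1 all 0.75
```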
### 5. Claims Index
**Namespace**: `StellaOps.Scanner.Benchmark.Claims`
Manage verifiable claims with evidence links.
```csharp
public interface IClaimsIndex
{
Task<ImmutableArray<Claim>> GetAllClaimsAsync(CancellationToken ct);
Task<ClaimVerification> VerifyClaimAsync(string claimId, CancellationToken ct);
Task UpdateClaimsAsync(BenchmarkRun run, CancellationToken ct);
}
public record Claim(
string Id,
ClaimCategory Category,
string Statement,
string EvidencePath,
ClaimStatus Status,
DateTimeOffset LastVerified
);
public enum ClaimStatus { Pending, Verified, Published, Disputed, Resolved }
public record ClaimVerification(
string ClaimId,
bool IsValid,
string? Evidence,
string? FailureReason
);
```
---
## Data Flow
```
┌────────────────┐
│ Corpus Images │
│ (50+ images) │
└───────┬────────┘
│
▼
┌────────────────┐ ┌────────────────┐
│ Stella Ops Scan│ │ Trivy/Grype │
│ │ │ Scan │
└───────┬────────┘ └───────┬────────┘
│ │
▼ ▼
┌────────────────┐ ┌────────────────┐
│ Normalized │ │ Normalized │
│ Findings │ │ Findings │
└───────┬────────┘ └───────┬────────┘
│ │
└──────────┬───────────┘
│
▼
┌──────────────┐
│ Ground Truth │
│ Comparison │
└──────┬───────┘
│
▼
┌──────────────┐
│ Metrics │
│ (P/R/F1) │
└──────┬───────┘
│
▼
┌──────────────┐
│ Claims Index │
│ Update │
└──────────────┘
```
---
## Corpus Structure
```
bench/competitors/
├── corpus/
│ ├── manifest.json # Corpus metadata
│ ├── ground-truth/
│ │ ├── alpine-3.18.json # Per-image ground truth
│ │ ├── debian-bookworm.json
│ │ └── ...
│ └── images/
│ ├── base-os/
│ ├── applications/
│ └── edge-cases/
├── results/
│ ├── 2025-12-22/
│ │ ├── stellaops.json
│ │ ├── trivy.json
│ │ ├── grype.json
│ │ └── comparison.json
│ └── latest -> 2025-12-22/
└── fixtures/
└── adapters/ # Test fixtures for adapters
```
---
## Ground Truth Format
```json
{
"imageDigest": "sha256:abc123...",
"imageName": "alpine:3.18",
"category": "BaseOS",
"groundTruth": {
"truePositives": [
{
"cveId": "CVE-2024-1234",
"package": "openssl",
"version": "3.0.8",
"notes": "Fixed in 3.0.9"
}
],
"knownFalsePositives": [
{
"cveId": "CVE-2024-9999",
"package": "zlib",
"version": "1.2.13",
"reason": "Backported in alpine:3.18"
}
],
"expectedUnreachable": [
{
"cveId": "CVE-2024-5678",
"package": "curl",
"reason": "Vulnerable function not linked"
}
]
},
"lastVerified": "2025-12-01T00:00:00Z",
"verifiedBy": "security-team"
}
```
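Scoring a tool against this format is a set partition over reported CVE IDs. The sketch below is illustrative (a hypothetical helper, not the planned C# harness): anything reported outside the true-positive list counts as a false positive, which is exactly how a known backport like the `zlib` entry above penalizes a naive scanner.

```python
def score_against_ground_truth(
    found: set[str], true_positives: set[str]
) -> tuple[set[str], set[str], set[str]]:
    """Partition reported CVE IDs into (tp, fp, fn) against corpus ground truth."""
    tp = found & true_positives        # correctly reported
    fp = found - true_positives        # reported but not real (e.g. backported fixes)
    fn = true_positives - found        # real vulns the tool missed
    return tp, fp, fn

tp, fp, fn = score_against_ground_truth(
    found={"CVE-2024-1234", "CVE-2024-9999"},
    true_positives={"CVE-2024-1234", "CVE-2024-5678"},
)
print(tp, fp, fn)  # the backport CVE-2024-9999 lands in fp
```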
---
## CI Integration
### Workflow: `benchmark-vs-competitors.yml`
```yaml
name: Competitive Benchmark
on:
schedule:
- cron: '0 2 * * 0' # Weekly Sunday 2 AM
workflow_dispatch:
push:
paths:
- 'src/Scanner/__Libraries/StellaOps.Scanner.Benchmark/**'
- 'bench/competitors/**'
jobs:
benchmark:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Install competitor tools
run: |
# Install Trivy
curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh
# Install Grype
curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh
- name: Run benchmark
run: stella benchmark run --corpus bench/competitors/corpus --output bench/competitors/results/$(date +%Y-%m-%d)
- name: Update claims index
run: stella benchmark claims --output docs/claims-index.md
- name: Upload results
uses: actions/upload-artifact@v4
with:
name: benchmark-results
path: bench/competitors/results/
```
---
## CLI Commands
```bash
# Run full benchmark
stella benchmark run --corpus <path> --competitors trivy,grype,syft
# Verify a specific claim
stella benchmark verify <CLAIM_ID>
# Generate claims index
stella benchmark claims --output docs/claims-index.md
# Generate marketing battlecard
stella benchmark battlecard --output docs/product/battlecard.md
# Show comparison summary
stella benchmark summary --format table|json|markdown
```
---
## Testing
| Test Type | Location | Purpose |
|-----------|----------|---------|
| Unit | `StellaOps.Scanner.Benchmark.Tests/` | Adapter parsing, metrics calculation |
| Integration | `StellaOps.Scanner.Benchmark.Integration.Tests/` | Full benchmark flow |
| Golden | `bench/competitors/fixtures/` | Deterministic output verification |
---
## Security Considerations
1. **Competitor binaries**: Run in isolated containers, no network access during scan
2. **Corpus images**: Verified digests, no external pulls during benchmark
3. **Results**: Signed with DSSE before publishing
4. **Claims**: Require PR review before status change
---
## Dependencies
- `StellaOps.Scanner.Core` - Normalized finding models
- `StellaOps.Attestor.Dsse` - Result signing
- Docker - Competitor tool execution
- Ground-truth corpus (maintained separately)
---
## Related Documentation
- [Claims Index](../../claims-index.md)
- [Sprint 7000.0001.0001](../../implplan/SPRINT_7000_0001_0001_competitive_benchmarking.md)
- [Testing Strategy](../../implplan/SPRINT_5100_0000_0000_epic_summary.md)
---
*Document Version*: 1.0.0
*Created*: 2025-12-22
# CI Architecture
## Purpose
Describe CI workflows, triggers, and offline constraints for Stella Ops.
## Scope
- Gitea workflows and templates under `.gitea/`.
- DevOps scripts under `devops/scripts/` and `.gitea/scripts/`.
- Build and test policy docs under `docs/technical/cicd/`.
## Principles
- Deterministic and offline-first execution.
- Pinned tool versions with explicit provenance.
- Evidence logged to sprint Execution Log and audits.
## References
- `docs/technical/cicd/workflow-triggers.md`
- `docs/technical/cicd/release-pipelines.md`
- `docs/operations/devops/README.md`
# Eventing Module
> **Status: Draft/Planned.** The event envelope SDK is currently in design phase. Implementation is planned for the Timeline and TimelineIndexer modules and will be integrated across all services via `src/__Libraries/StellaOps.Eventing/`. No standalone `src/Eventing/` module exists.
## Related Documentation
- [Event Envelope Schema](event-envelope-schema.md)
- [Timeline UI](timeline-ui.md)
# Event Envelope Schema
> **Version:** 1.0.0
> **Status:** Draft
> **Sprint:** [SPRINT_20260107_003_001_LB](../../implplan/SPRINT_20260107_003_001_LB_event_envelope_sdk.md)
This document specifies the canonical event envelope schema for the StellaOps Unified Event Timeline.
---
## Overview
The event envelope provides a standardized format for all events emitted across StellaOps services. It enables:
- **Unified Timeline:** Cross-service correlation with HLC ordering
- **Deterministic Replay:** Reproducible event streams for forensics
- **Audit Compliance:** DSSE-signed event bundles for export
- **Causal Analysis:** Stage latency measurement and bottleneck identification
---
## Envelope Schema (v1)
### JSON Schema
```json
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://stellaops.org/schemas/timeline-event.v1.json",
"title": "TimelineEvent",
"description": "Canonical event envelope for StellaOps Unified Event Timeline",
"type": "object",
"required": [
"eventId",
"tHlc",
"tsWall",
"service",
"correlationId",
"kind",
"payload",
"payloadDigest",
"engineVersion",
"schemaVersion"
],
"properties": {
"eventId": {
"type": "string",
"description": "Deterministic event ID: SHA-256(correlationId || tHlc || service || kind)[0:32] hex",
"pattern": "^[a-f0-9]{32}$"
},
"tHlc": {
"type": "string",
"description": "HLC timestamp in sortable string format: <physicalTimeMs>:<logicalCounter>:<nodeId>",
"pattern": "^\\d+:\\d+:[a-zA-Z0-9_-]+$"
},
"tsWall": {
"type": "string",
"format": "date-time",
"description": "Wall-clock time in ISO 8601 format (informational only)"
},
"service": {
"type": "string",
"description": "Service name that emitted the event",
"enum": ["Scheduler", "AirGap", "Attestor", "Policy", "VexLens", "Scanner", "Concelier", "Platform"]
},
"traceParent": {
"type": ["string", "null"],
"description": "W3C Trace Context traceparent header",
"pattern": "^[0-9a-f]{2}-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$"
},
"correlationId": {
"type": "string",
"description": "Correlation ID linking related events (e.g., scanId, jobId, artifactDigest)"
},
"kind": {
"type": "string",
"description": "Event kind/type",
"enum": [
"ENQUEUE", "DEQUEUE", "EXECUTE", "COMPLETE", "FAIL",
"IMPORT", "EXPORT", "MERGE", "CONFLICT",
"ATTEST", "VERIFY",
"EVALUATE", "GATE_PASS", "GATE_FAIL",
"CONSENSUS", "OVERRIDE",
"SCAN_START", "SCAN_COMPLETE",
"EMIT", "ACK", "ERR"
]
},
"payload": {
"type": "string",
"description": "RFC 8785 canonicalized JSON payload"
},
"payloadDigest": {
"type": "string",
"description": "SHA-256 digest of payload as hex string",
"pattern": "^[a-f0-9]{64}$"
},
"engineVersion": {
"type": "object",
"description": "Engine/resolver version for reproducibility",
"required": ["engineName", "version", "sourceDigest"],
"properties": {
"engineName": {
"type": "string",
"description": "Name of the engine/service"
},
"version": {
"type": "string",
"description": "Semantic version string"
},
"sourceDigest": {
"type": "string",
"description": "SHA-256 digest of engine source/binary"
}
}
},
"dsseSig": {
"type": ["string", "null"],
"description": "Optional DSSE signature in format keyId:base64Signature"
},
"schemaVersion": {
"type": "integer",
"description": "Schema version for envelope evolution",
"const": 1
}
}
}
```
### C# Record Definition
```csharp
/// <summary>
/// Canonical event envelope for unified timeline.
/// </summary>
public sealed record TimelineEvent
{
/// <summary>
/// Deterministic event ID: SHA-256(correlationId || tHlc || service || kind)[0:32] hex.
/// NOT a random ULID - ensures replay determinism.
/// </summary>
[Required]
[RegularExpression("^[a-f0-9]{32}$")]
public required string EventId { get; init; }
/// <summary>
/// HLC timestamp from StellaOps.HybridLogicalClock library.
/// </summary>
[Required]
public required HlcTimestamp THlc { get; init; }
/// <summary>
/// Wall-clock time (informational only, not used for ordering).
/// </summary>
[Required]
public required DateTimeOffset TsWall { get; init; }
/// <summary>
/// Service name that emitted the event.
/// </summary>
[Required]
public required string Service { get; init; }
/// <summary>
/// W3C Trace Context traceparent for OpenTelemetry correlation.
/// </summary>
public string? TraceParent { get; init; }
/// <summary>
/// Correlation ID linking related events.
/// </summary>
[Required]
public required string CorrelationId { get; init; }
/// <summary>
/// Event kind (ENQUEUE, EXECUTE, ATTEST, etc.).
/// </summary>
[Required]
public required string Kind { get; init; }
/// <summary>
/// RFC 8785 canonicalized JSON payload.
/// </summary>
[Required]
public required string Payload { get; init; }
/// <summary>
/// SHA-256 digest of Payload.
/// </summary>
[Required]
public required byte[] PayloadDigest { get; init; }
/// <summary>
/// Engine version for reproducibility (per CLAUDE.md Rule 8.2.1).
/// </summary>
[Required]
public required EngineVersionRef EngineVersion { get; init; }
/// <summary>
/// Optional DSSE signature (keyId:base64Signature).
/// </summary>
public string? DsseSig { get; init; }
/// <summary>
/// Schema version (current: 1).
/// </summary>
public int SchemaVersion { get; init; } = 1;
}
public sealed record EngineVersionRef(
string EngineName,
string Version,
string SourceDigest);
```
---
## Field Specifications
### eventId
**Purpose:** Unique, deterministic identifier for each event.
**Computation:**
```csharp
public static string GenerateEventId(
string correlationId,
HlcTimestamp tHlc,
string service,
string kind)
{
using var hasher = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);
hasher.AppendData(Encoding.UTF8.GetBytes(correlationId));
hasher.AppendData(Encoding.UTF8.GetBytes(tHlc.ToSortableString()));
hasher.AppendData(Encoding.UTF8.GetBytes(service));
hasher.AppendData(Encoding.UTF8.GetBytes(kind));
var hash = hasher.GetHashAndReset();
return Convert.ToHexString(hash.AsSpan(0, 16)).ToLowerInvariant();
}
```
**Rationale:** Unlike ULID or UUID, this deterministic approach ensures that:
- The same event produces the same ID across replays
- Duplicate events can be detected and deduplicated
- Event ordering is verifiable
### tHlc
**Purpose:** Primary ordering timestamp using Hybrid Logical Clock.
**Format:** `<physicalTimeMs>:<logicalCounter>:<nodeId>`
**Example:** `1704585600000:42:scheduler-node-1`
**Ordering:** Lexicographic comparison produces correct temporal order:
1. Compare physical time (milliseconds since Unix epoch)
2. If equal, compare logical counter
3. If equal, compare node ID (for uniqueness)
**Implementation:** Uses existing `StellaOps.HybridLogicalClock.HlcTimestamp` type.
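A sketch of that three-step comparison (illustrative Python; the real implementation lives in `StellaOps.HybridLogicalClock`). Parsing the sortable string into a `(physical, logical, nodeId)` tuple makes the ordering explicit and independent of any zero-padding in the string encoding:

```python
def parse_hlc(t_hlc: str) -> tuple[int, int, str]:
    """Split '<physicalTimeMs>:<logicalCounter>:<nodeId>' into a comparable tuple."""
    physical, logical, node_id = t_hlc.split(":", 2)
    return (int(physical), int(logical), node_id)

a = parse_hlc("1704585600000:42:scheduler-node-1")
b = parse_hlc("1704585600000:43:scheduler-node-1")
assert a < b                                      # same millisecond: logical counter breaks the tie
assert parse_hlc("2:0:n") < parse_hlc("10:0:n")   # numeric, not raw string, comparison
```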
### tsWall
**Purpose:** Human-readable wall-clock timestamp for debugging.
**Format:** ISO 8601 with UTC timezone (e.g., `2026-01-07T12:00:00.000Z`)
**Important:** This field is **informational only**. Never use for ordering or comparison. The `tHlc` field is the authoritative timestamp.
### service
**Purpose:** Identifies the StellaOps service that emitted the event.
**Allowed Values:**
| Value | Description |
|-------|-------------|
| `Scheduler` | Job scheduling and queue management |
| `AirGap` | Offline/air-gap sync operations |
| `Attestor` | DSSE attestation and verification |
| `Policy` | Policy engine evaluation |
| `VexLens` | VEX consensus computation |
| `Scanner` | Container scanning |
| `Concelier` | Advisory ingestion |
| `Platform` | Console backend aggregation |
### traceParent
**Purpose:** W3C Trace Context correlation for OpenTelemetry integration.
**Format:** `00-{trace-id}-{span-id}-{trace-flags}`
**Example:** `00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01`
**Population:** Automatically captured from `Activity.Current?.Id` during event emission.
### correlationId
**Purpose:** Links related events across services.
**Common Patterns:**
| Pattern | Example | Usage |
|---------|---------|-------|
| Scan ID | `scan-abc123` | Container scan lifecycle |
| Job ID | `job-xyz789` | Scheduled job lifecycle |
| Artifact Digest | `sha256:abc...` | Artifact processing |
| Bundle ID | `bundle-def456` | Air-gap bundle operations |
### kind
**Purpose:** Categorizes the event type.
**Event Kinds by Service:**
| Service | Kinds |
|---------|-------|
| Scheduler | `ENQUEUE`, `DEQUEUE`, `EXECUTE`, `COMPLETE`, `FAIL` |
| AirGap | `IMPORT`, `EXPORT`, `MERGE`, `CONFLICT` |
| Attestor | `ATTEST`, `VERIFY` |
| Policy | `EVALUATE`, `GATE_PASS`, `GATE_FAIL` |
| VexLens | `CONSENSUS`, `OVERRIDE` |
| Scanner | `SCAN_START`, `SCAN_COMPLETE` |
| Generic | `EMIT`, `ACK`, `ERR` |
### payload
**Purpose:** Domain-specific event data.
**Requirements:**
1. **RFC 8785 Canonicalization:** Must use `CanonJson.Serialize()` from `StellaOps.Canonical.Json`
2. **No Non-Deterministic Fields:** No random IDs, current timestamps, or environment-specific data
3. **Bounded Size:** Payload should be < 1MB; use references for large data
**Example:**
```json
{
"artifactDigest": "sha256:abc123...",
"jobId": "job-xyz789",
"status": "completed",
"findingsCount": 42
}
```
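A minimal sketch of producing such a payload deterministically (illustrative Python, not the production path): `json.dumps` with sorted keys and compact separators approximates RFC 8785 for simple payloads, while full JCS also normalizes number and string forms, which `CanonJson.Serialize()` handles in the real pipeline:

```python
import hashlib
import json

def canonical_payload(payload: dict) -> tuple[str, str]:
    """Return (canonical JSON, SHA-256 hex digest) for a payload dict."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return canonical, digest

c1, d1 = canonical_payload({"status": "completed", "findingsCount": 42})
c2, d2 = canonical_payload({"findingsCount": 42, "status": "completed"})
assert c1 == c2 and d1 == d2   # key order no longer affects the digest
assert len(d1) == 64           # 64-char lowercase hex, as payloadDigest requires
```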
### payloadDigest
**Purpose:** Integrity verification of payload.
**Computation:**
```csharp
var digest = SHA256.HashData(Encoding.UTF8.GetBytes(payload));
```
**Format:** 64-character lowercase hex string.
### engineVersion
**Purpose:** Records the engine/resolver version for reproducibility verification (per CLAUDE.md Rule 8.2.1).
**Fields:**
| Field | Description | Example |
|-------|-------------|---------|
| `engineName` | Service/engine name | `"Scheduler"` |
| `version` | Semantic version | `"2.5.0"` |
| `sourceDigest` | Build artifact hash | `"sha256:abc..."` |
**Population:** Use `EngineVersionRef.FromAssembly(Assembly.GetExecutingAssembly())`.
### dsseSig
**Purpose:** Optional cryptographic signature for audit compliance.
**Format:** `{keyId}:{base64Signature}`
**Example:** `signing-key-001:MEUCIQD...`
**Integration:** Uses existing `StellaOps.Attestation.DsseHelper` for signature generation.
### schemaVersion
**Purpose:** Enables schema evolution without breaking compatibility.
**Current Value:** `1`
**Migration Strategy:** When schema changes:
1. Increment version number
2. Add migration logic for older versions
3. Document breaking changes
---
## Database Schema
```sql
CREATE SCHEMA IF NOT EXISTS timeline;
CREATE TABLE timeline.events (
event_id TEXT PRIMARY KEY,
t_hlc TEXT NOT NULL,
ts_wall TIMESTAMPTZ NOT NULL,
service TEXT NOT NULL,
trace_parent TEXT,
correlation_id TEXT NOT NULL,
kind TEXT NOT NULL,
payload JSONB NOT NULL,
payload_digest BYTEA NOT NULL,
engine_name TEXT NOT NULL,
engine_version TEXT NOT NULL,
engine_digest TEXT NOT NULL,
dsse_sig TEXT,
schema_version INTEGER NOT NULL DEFAULT 1,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Primary query: events by correlation, HLC ordered
CREATE INDEX idx_events_corr_hlc ON timeline.events (correlation_id, t_hlc);
-- Service-specific queries
CREATE INDEX idx_events_svc_hlc ON timeline.events (service, t_hlc);
-- Payload search (JSONB GIN index)
CREATE INDEX idx_events_payload ON timeline.events USING GIN (payload);
-- Kind filtering
CREATE INDEX idx_events_kind ON timeline.events (kind);
```
---
## Usage Examples
### Emitting an Event
```csharp
public class SchedulerService
{
private readonly ITimelineEventEmitter _emitter;
public async Task EnqueueJobAsync(Job job, CancellationToken ct)
{
// Business logic...
await _queue.EnqueueAsync(job, ct);
// Emit timeline event
await _emitter.EmitAsync(
correlationId: job.Id.ToString(),
kind: "ENQUEUE",
payload: new { jobId = job.Id, priority = job.Priority },
ct);
}
}
```
### Querying Timeline
```csharp
public async Task<IReadOnlyList<TimelineEvent>> GetJobTimelineAsync(
string jobId,
CancellationToken ct)
{
return await _timelineService.GetEventsAsync(
correlationId: jobId,
options: new TimelineQueryOptions
{
Services = ["Scheduler", "Attestor"],
Kinds = ["ENQUEUE", "EXECUTE", "COMPLETE", "ATTEST"]
},
ct);
}
```
---
## Compatibility Notes
### Relation to Existing HLC Infrastructure
This schema builds on the existing `StellaOps.HybridLogicalClock` library:
- Uses `HlcTimestamp` type directly
- Integrates with `IHybridLogicalClock.Tick()` for timestamp generation
- Compatible with air-gap merge algorithms
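A minimal sketch of how an emitter pairs HLC and wall-clock timestamps, assuming the `IHybridLogicalClock` interface from `StellaOps.HybridLogicalClock`; the property names on `TimelineEvent` are assumptions for illustration:

```csharp
// Sketch: each emitted event captures both the HLC timestamp (for causal
// ordering) and the wall-clock time (for human-readable display).
HlcTimestamp tHlc = _clock.Tick();          // monotonic, causality-preserving
DateTimeOffset tsWall = DateTimeOffset.UtcNow;

var timelineEvent = new TimelineEvent
{
    THlc = tHlc.ToString(),                 // e.g. "1704067200000:0:node1"
    TsWall = tsWall,
    // ...service, kind, payload, payload_digest, engine fields
};
```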
### Relation to Existing Replay Infrastructure
This schema integrates with `StellaOps.Replay.Core`:
- `KnowledgeSnapshot` can include timeline event references
- Replay uses `FakeTimeProvider` with HLC timestamps
- Verification compares payload digests
---
## References
- [SPRINT_20260107_003_000_INDEX](../../implplan/SPRINT_20260107_003_000_INDEX_unified_event_timeline.md) - Parent sprint index
- [SPRINT_20260105_002_000_INDEX](../../implplan/SPRINT_20260105_002_000_INDEX_hlc_audit_safe_ordering.md) - HLC foundation
- [RFC 8785](https://datatracker.ietf.org/doc/html/rfc8785) - JSON Canonicalization Scheme
- [W3C Trace Context](https://www.w3.org/TR/trace-context/) - Distributed tracing
- CLAUDE.md Section 8.2.1 - Engine version tracking
- CLAUDE.md Section 8.7 - RFC 8785 canonicalization

# Timeline UI Component
> **Module:** Eventing / Timeline
> **Status:** Implemented
> **Last Updated:** 2026-01-07
## Overview
The Timeline UI provides a visual representation of HLC-ordered events across StellaOps services. It enables operators to trace the causal flow of operations, identify bottlenecks, and investigate specific events with full evidence links.
## Features
### Causal Lanes Visualization
Events are displayed in swimlanes organized by service:
```
┌─────────────────────────────────────────────────────────────────────┐
│ HLC Timeline Axis │
│ |-------|-------|-------|-------|-------|-------|-------|-------> │
├─────────────────────────────────────────────────────────────────────┤
│ Scheduler [E]─────────[X]───────────────[C] │
├─────────────────────────────────────────────────────────────────────┤
│ AirGap [I]──────────[M] │
├─────────────────────────────────────────────────────────────────────┤
│ Attestor [A]──────────[V] │
├─────────────────────────────────────────────────────────────────────┤
│ Policy [G] │
└─────────────────────────────────────────────────────────────────────┘
```
Legend:
- **[E]** Enqueue - Job queued for processing
- **[X]** Execute - Job execution started
- **[C]** Complete - Job completed
- **[I]** Import - Data imported (e.g., SBOM, advisory)
- **[M]** Merge - Data merged
- **[A]** Attest - Attestation created
- **[V]** Verify - Attestation verified
- **[G]** Gate - Policy gate evaluated
### Critical Path Analysis
The critical path view shows the longest sequence of dependent operations:
- Color-coded by severity (green/yellow/red)
- Bottleneck stage highlighted
- Percentage of total duration shown
- Clickable stages for drill-down
### Event Detail Panel
Selected events display:
- Event ID and metadata
- HLC timestamp and wall-clock time
- Service and event kind
- JSON payload viewer
- Engine version information
- Evidence links (SBOM, VEX, Policy, Attestation)
### Filtering
Events can be filtered by:
- **Services**: Scheduler, AirGap, Attestor, Policy, Scanner, etc.
- **Event Kinds**: ENQUEUE, EXECUTE, COMPLETE, IMPORT, ATTEST, etc.
- **HLC Range**: From/To timestamps
Filter state is persisted in URL query parameters.
### Export
Timeline data can be exported as:
- **NDJSON**: Newline-delimited JSON (streaming-friendly)
- **JSON**: Standard JSON array
- **DSSE-signed**: Cryptographically signed bundles for auditing
## Usage
### Accessing the Timeline
Navigate to `/timeline/{correlationId}` where `correlationId` is the unique identifier for a scan, job, or workflow.
Example:
```
/timeline/scan-abc123-def456
```
### Keyboard Navigation
| Key | Action |
|-----|--------|
| Tab | Navigate between events |
| Enter/Space | Select focused event |
| Escape | Clear selection |
| Arrow keys | Scroll within panel |
### URL Parameters
| Parameter | Description | Example |
|-----------|-------------|---------|
| `services` | Comma-separated service filter | `?services=Scheduler,AirGap` |
| `kinds` | Comma-separated kind filter | `?kinds=EXECUTE,COMPLETE` |
| `fromHlc` | Start of HLC range | `?fromHlc=1704067200000:0:node1` |
| `toHlc` | End of HLC range | `?toHlc=1704153600000:0:node1` |
## Component Architecture
```
timeline/
├── components/
│ ├── causal-lanes/ # Swimlane visualization
│ ├── critical-path/ # Bottleneck bar chart
│ ├── event-detail-panel/ # Selected event details
│ ├── evidence-links/ # Links to SBOM/VEX/Policy
│ ├── export-button/ # Export dropdown
│ └── timeline-filter/ # Service/kind filters
├── models/
│ └── timeline.models.ts # TypeScript interfaces
├── pages/
│ └── timeline-page/ # Main page component
├── services/
│ └── timeline.service.ts # API client
└── timeline.routes.ts # Lazy-loaded routes
```
## API Integration
The Timeline UI integrates with the Timeline API:
| Endpoint | Description |
|----------|-------------|
| `GET /api/v1/timeline/{correlationId}` | Fetch events |
| `GET /api/v1/timeline/{correlationId}/critical-path` | Fetch critical path |
| `POST /api/v1/timeline/{correlationId}/export` | Initiate export |
| `GET /api/v1/timeline/export/{exportId}` | Check export status |
| `GET /api/v1/timeline/export/{exportId}/download` | Download bundle |
## Accessibility
The Timeline UI follows WCAG 2.1 AA guidelines:
- **Keyboard Navigation**: All interactive elements are focusable
- **Screen Readers**: ARIA labels on all regions and controls
- **Color Contrast**: 4.5:1 minimum contrast ratio
- **Focus Indicators**: Visible focus rings on all controls
- **Motion**: Respects `prefers-reduced-motion`
## Performance
- **Virtual Scrolling**: Handles 10K+ events efficiently
- **Lazy Loading**: Events loaded on-demand as user scrolls
- **Caching**: Recent queries cached to reduce API calls
- **Debouncing**: Filter changes debounced to avoid excessive requests
## Screenshots
### Timeline View
![Timeline View](./assets/timeline-view.png)
### Critical Path Analysis
![Critical Path](./assets/critical-path.png)
### Event Detail Panel
![Event Details](./assets/event-details.png)
## Related Documentation
- [Timeline API Reference](../../api/timeline-api.md)
- [HLC Clock Specification](../hlc/architecture.md)
- [Eventing SDK](../eventing/architecture.md)
- [Evidence Model](../../schemas/evidence.md)

# Evidence
**Status:** Design/Planning
**Source:** N/A (cross-cutting concept)
**Owner:** Platform Team
## Purpose
This document defines the unified evidence model for vulnerability findings across StellaOps. It provides canonical data structures for evidence capture, aggregation, and scoring used by the Signals, Policy Engine, and EvidenceLocker modules.
## Components
**Concept Documentation:**
- `unified-model.md` - Unified evidence data model specification
**Evidence Types:**
- Reachability evidence (call graph, data flow)
- Runtime evidence (eBPF traces, dynamic observations)
- Binary evidence (backport detection, fix validation)
- Exploit evidence (EPSS scores, KEV flags, exploit-db entries)
- VEX evidence (source trust, statement provenance)
- Mitigation evidence (active mitigations, compensating controls)
## Implementation Locations
Evidence structures are implemented across multiple modules:
- **Signals** - Evidence aggregation and normalization
- **Policy Engine** - Reachability analysis and evidence generation
- **EvidenceLocker** - Evidence storage and sealing
- **Scanner** - Binary and vulnerability evidence capture
- **Concelier** - Backport and exploit evidence enrichment
## Dependencies
- All evidence-producing modules (Scanner, Policy, Concelier, etc.)
- Signals (evidence aggregation)
- EvidenceLocker (evidence storage)
## Related Documentation
- Unified Model: `./unified-model.md`
- Signals: `../signals/`
- Policy: `../policy/`
- EvidenceLocker: `../evidence-locker/`
- Data Schemas: `../../11_DATA_SCHEMAS.md`
## Current Status
The evidence model is documented in `unified-model.md`. Implementation is distributed across the Signals (aggregation), Policy (reachability), EvidenceLocker (storage), and Scanner (capture) modules.

# Unified Evidence Model
> **Module:** `StellaOps.Evidence.Core`
> **Status:** Production
> **Owner:** Platform Guild
## Overview
The Unified Evidence Model provides a standardized interface (`IEvidence`) and implementation (`EvidenceRecord`) for representing evidence across all StellaOps modules. This enables:
- **Cross-module evidence linking**: Evidence from Scanner, Attestor, Excititor, and Policy modules share a common contract.
- **Content-addressed verification**: Evidence records are immutable and verifiable via deterministic hashing.
- **Unified storage**: A single `IEvidenceStore` interface abstracts persistence across modules.
- **Cryptographic attestation**: Multiple signatures from different signers (internal, vendor, CI, operator) can vouch for evidence.
## Core Types
### IEvidence Interface
```csharp
public interface IEvidence
{
string SubjectNodeId { get; } // Content-addressed subject
EvidenceType EvidenceType { get; } // Type discriminator
string EvidenceId { get; } // Computed hash identifier
ReadOnlyMemory<byte> Payload { get; } // Canonical JSON payload
IReadOnlyList<EvidenceSignature> Signatures { get; }
EvidenceProvenance Provenance { get; }
string? ExternalPayloadCid { get; } // For large payloads
string PayloadSchemaVersion { get; }
}
```
### EvidenceType Enum
The platform supports these evidence types:
| Type | Value | Description | Example Payload |
|------|-------|-------------|-----------------|
| `Reachability` | 1 | Call graph analysis | Paths, confidence, graph digest |
| `Scan` | 2 | Vulnerability finding | CVE, severity, affected package |
| `Policy` | 3 | Policy evaluation | Rule ID, verdict, inputs |
| `Artifact` | 4 | SBOM entry metadata | PURL, digest, build info |
| `Vex` | 5 | VEX statement | Status, justification, impact |
| `Epss` | 6 | EPSS score | Score, percentile, model date |
| `Runtime` | 7 | Runtime observation | eBPF/ETW traces, call frames |
| `Provenance` | 8 | Build provenance | SLSA attestation, builder info |
| `Exception` | 9 | Applied exception | Exception ID, reason, expiry |
| `Guard` | 10 | Guard/gate analysis | Gate type, condition, bypass |
| `Kev` | 11 | KEV status | In-KEV flag, added date |
| `License` | 12 | License analysis | SPDX ID, compliance status |
| `Dependency` | 13 | Dependency metadata | Graph edge, version range |
| `Custom` | 100 | User-defined | Schema-versioned custom payload |
### EvidenceRecord
The concrete implementation with deterministic identity:
```csharp
public sealed record EvidenceRecord : IEvidence
{
public static EvidenceRecord Create(
string subjectNodeId,
EvidenceType evidenceType,
ReadOnlyMemory<byte> payload,
EvidenceProvenance provenance,
string payloadSchemaVersion,
IReadOnlyList<EvidenceSignature>? signatures = null,
string? externalPayloadCid = null);
public bool VerifyIntegrity();
}
```
**EvidenceId Computation:**
The `EvidenceId` is a SHA-256 hash of the canonicalized fields using versioned prefixing:
```
EvidenceId = "evidence:" + CanonJson.HashVersionedPrefixed("IEvidence", "v1", {
SubjectNodeId,
EvidenceType,
PayloadHash,
Provenance.GeneratorId,
Provenance.GeneratorVersion,
Provenance.GeneratedAt (ISO 8601)
})
```
### EvidenceSignature
Cryptographic attestation by a signer:
```csharp
public sealed record EvidenceSignature
{
public required string SignerId { get; init; }
public required string Algorithm { get; init; } // ES256, RS256, EdDSA
public required string SignatureBase64 { get; init; }
public required DateTimeOffset SignedAt { get; init; }
public SignerType SignerType { get; init; }
public IReadOnlyList<string>? CertificateChain { get; init; }
}
```
**SignerType Values:**
- `Internal` (0): StellaOps service
- `Vendor` (1): External vendor/supplier
- `CI` (2): CI/CD pipeline
- `Operator` (3): Human operator
- `TransparencyLog` (4): Rekor/transparency log
- `Scanner` (5): Security scanner
- `PolicyEngine` (6): Policy engine
- `Unknown` (255): Unclassified
### EvidenceProvenance
Generation context:
```csharp
public sealed record EvidenceProvenance
{
public required string GeneratorId { get; init; }
public required string GeneratorVersion { get; init; }
public required DateTimeOffset GeneratedAt { get; init; }
public string? CorrelationId { get; init; }
public Guid? TenantId { get; init; }
// ... additional fields
}
```
## Adapters
Adapters convert module-specific evidence types to the unified `IEvidence` interface:
### Available Adapters
| Adapter | Source Module | Source Type | Target Evidence Types |
|---------|---------------|-------------|----------------------|
| `EvidenceBundleAdapter` | Scanner | `EvidenceBundle` | Reachability, Vex, Provenance, Scan |
| `EvidenceStatementAdapter` | Attestor | `EvidenceStatement` (in-toto) | Scan |
| `ProofSegmentAdapter` | Scanner | `ProofSegment` | Varies by segment type |
| `VexObservationAdapter` | Excititor | `VexObservation` | Vex, Provenance |
| `ExceptionApplicationAdapter` | Policy | `ExceptionApplication` | Exception |
### Adapter Interface
```csharp
public interface IEvidenceAdapter<TSource>
{
IReadOnlyList<IEvidence> Convert(
TSource source,
string subjectNodeId,
EvidenceProvenance provenance);
bool CanConvert(TSource source);
}
```
### Using Adapters
Adapters use **input DTOs** to avoid circular dependencies:
```csharp
// Using VexObservationAdapter
var adapter = new VexObservationAdapter();
var input = new VexObservationInput
{
ObservationId = "obs-001",
ProviderId = "nvd",
StreamId = "cve-feed",
// ... other fields from VexObservation
};
var provenance = new EvidenceProvenance
{
GeneratorId = "excititor-ingestor",
GeneratorVersion = "1.0.0",
GeneratedAt = DateTimeOffset.UtcNow
};
if (adapter.CanConvert(input))
{
IReadOnlyList<IEvidence> records = adapter.Convert(
input,
subjectNodeId: "sha256:abc123",
provenance);
}
```
## Evidence Store
### IEvidenceStore Interface
```csharp
public interface IEvidenceStore
{
Task<EvidenceRecord> StoreAsync(
EvidenceRecord record,
CancellationToken ct = default);
Task<IReadOnlyList<EvidenceRecord>> StoreBatchAsync(
IEnumerable<EvidenceRecord> records,
CancellationToken ct = default);
Task<EvidenceRecord?> GetByIdAsync(
string evidenceId,
CancellationToken ct = default);
Task<IReadOnlyList<EvidenceRecord>> GetBySubjectAsync(
string subjectNodeId,
EvidenceType? evidenceType = null,
CancellationToken ct = default);
Task<IReadOnlyList<EvidenceRecord>> GetByTypeAsync(
EvidenceType evidenceType,
int limit = 100,
CancellationToken ct = default);
Task<bool> ExistsAsync(
string evidenceId,
CancellationToken ct = default);
Task<bool> DeleteAsync(
string evidenceId,
CancellationToken ct = default);
}
```
### Implementations
- **`InMemoryEvidenceStore`**: Thread-safe in-memory store for testing and development.
- **`PostgresEvidenceStore`** (planned): Production store with tenant isolation and indexing.
## Usage Examples
### Creating Evidence
```csharp
var provenance = new EvidenceProvenance
{
GeneratorId = "scanner-service",
GeneratorVersion = "2.1.0",
GeneratedAt = DateTimeOffset.UtcNow,
TenantId = tenantId
};
// Serialize payload to canonical JSON
var payloadBytes = CanonJson.Canonicalize(new
{
cveId = "CVE-2024-1234",
severity = "HIGH",
affectedPackage = "pkg:npm/lodash@4.17.20"
});
var evidence = EvidenceRecord.Create(
subjectNodeId: "sha256:abc123def456...",
evidenceType: EvidenceType.Scan,
payload: payloadBytes,
provenance: provenance,
payloadSchemaVersion: "scan/v1");
```
### Storing and Retrieving
```csharp
var store = new InMemoryEvidenceStore();
// Store
await store.StoreAsync(evidence);
// Retrieve by ID
var retrieved = await store.GetByIdAsync(evidence.EvidenceId);
// Retrieve all evidence for a subject
var allForSubject = await store.GetBySubjectAsync(
"sha256:abc123def456...",
evidenceType: EvidenceType.Scan);
// Verify integrity
bool isValid = retrieved!.VerifyIntegrity();
```
### Cross-Module Evidence Linking
```csharp
// Scanner produces evidence bundle
var bundle = scanner.ProduceEvidenceBundle(target);
// Convert to unified evidence
var adapter = new EvidenceBundleAdapter();
var evidenceRecords = adapter.Convert(bundle, subjectNodeId, provenance);
// Store all records
await store.StoreBatchAsync(evidenceRecords);
// Later, any module can query by subject
var allEvidence = await store.GetBySubjectAsync(subjectNodeId);
// Filter by type
var reachabilityEvidence = allEvidence
.Where(e => e.EvidenceType == EvidenceType.Reachability);
var vexEvidence = allEvidence
.Where(e => e.EvidenceType == EvidenceType.Vex);
```
## Schema Versioning
Each evidence type payload has a schema version (`PayloadSchemaVersion`) for forward compatibility:
- `scan/v1`: Initial scan evidence schema
- `reachability/v1`: Reachability evidence schema
- `vex-statement/v1`: VEX statement evidence schema
- `proof-segment/v1`: Proof segment evidence schema
- `exception-application/v1`: Exception application schema
Consumers should check `PayloadSchemaVersion` before deserializing payloads to handle schema evolution.
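A consumer-side sketch of that check, assuming a hypothetical `ScanPayloadV1` DTO (the switch-on-version pattern is the point, not the specific types):

```csharp
// Sketch: gate payload deserialization on PayloadSchemaVersion.
IEvidence? evidence = await store.GetByIdAsync(evidenceId, ct);
if (evidence is null) return;

switch (evidence.PayloadSchemaVersion)
{
    case "scan/v1":
        var scan = JsonSerializer.Deserialize<ScanPayloadV1>(evidence.Payload.Span);
        // ...consume the v1 payload
        break;
    default:
        // Unknown or newer schema: skip, or route to a tolerant reader,
        // rather than failing the whole batch.
        _logger.LogWarning("Unsupported payload schema {Version}",
            evidence.PayloadSchemaVersion);
        break;
}
```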
## Integration Patterns
### Module Integration
Each module that produces evidence should:
1. Create an adapter if converting from module-specific types
2. Use `EvidenceRecord.Create()` for new evidence
3. Store evidence via `IEvidenceStore`
4. Include provenance with generator identification
### Verification Flow
```
1. Retrieve evidence by SubjectNodeId
2. Call VerifyIntegrity() to check EvidenceId
3. Verify signatures against known trust roots
4. Deserialize and validate payload against schema
```
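The four steps above can be sketched in code; `trustRoots.Verify(...)` is a hypothetical helper standing in for whatever trust-root verification the deployment uses:

```csharp
// Sketch of the verification flow (step numbers match the list above).
var records = await store.GetBySubjectAsync(subjectNodeId);   // step 1

foreach (var record in records)
{
    if (!record.VerifyIntegrity())                            // step 2: recompute EvidenceId
        throw new InvalidOperationException(
            $"Tampered evidence {record.EvidenceId}");

    foreach (var sig in record.Signatures)                    // step 3
    {
        if (!trustRoots.Verify(sig, record.Payload.Span))     // hypothetical helper
            throw new InvalidOperationException(
                $"Untrusted signature from {sig.SignerId}");
    }

    // step 4: deserialize per record.PayloadSchemaVersion and validate
}
```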
## Testing
The `StellaOps.Evidence.Core.Tests` project includes:
- **111 unit tests** covering:
- EvidenceRecord creation and hash computation
- InMemoryEvidenceStore CRUD operations
- All adapter conversions (VexObservation, ExceptionApplication, ProofSegment)
- Edge cases and error handling
Run tests:
```bash
dotnet test src/__Libraries/StellaOps.Evidence.Core.Tests/
```
## Related Documentation
- [Proof Chain Architecture](../attestor/proof-chain.md)
- [Evidence Bundle Design](../scanner/evidence-bundle.md)
- [VEX Observation Model](../excititor/vex-observation.md)
- [Policy Exceptions](../policy/exceptions.md)

# Facet
> Cryptographically sealed manifests for logical slices of container images.
## Purpose
The Facet Sealing subsystem provides cryptographically sealed manifests for logical slices of container images, enabling fine-grained drift detection, per-facet quota enforcement, and deterministic change tracking.
## Quick Links
- [Architecture](./architecture.md) - Technical design and implementation details
## Status
| Attribute | Value |
|-----------|-------|
| **Maturity** | Production |
| **Last Reviewed** | 2025-12-29 |
| **Maintainer** | Scanner Guild, Policy Guild |
## Key Features
- **Facet Types**: OS packages, language dependencies, binaries, configs, custom patterns
- **Cryptographic Sealing**: Each facet can be individually sealed with a cryptographic snapshot
- **Drift Detection**: Monitor changes between seals for compliance enforcement
- **Merkle Tree Structure**: Content-addressed storage with integrity verification
## Dependencies
### Upstream (this module depends on)
- **Scanner** - Facet extraction during image analysis
- **Attestor** - DSSE signing for sealed facets
### Downstream (modules that depend on this)
- **Policy** - Drift detection and quota enforcement
- **Replay** - Facet verification in replay workflows
## Related Documentation
- [Scanner Architecture](../scanner/architecture.md)
- [Replay Architecture](../replay/architecture.md)

# Facet Sealing Architecture
> **Status: Production (Cross-Module Library).** Facet Sealing is a fully implemented subsystem with its core library at `src/__Libraries/StellaOps.Facet/` (30 source files) and integration points spanning **Scanner** (extraction via `FacetSealExtractor`, storage via `PostgresFacetSealStore` in `scanner.facet_seals` table), **Policy** (drift and quota enforcement via `FacetQuotaGate`), **Zastava** (admission validation via `FacetAdmissionValidator`), and **CLI** (`seal`, `drift`, `vex-gen` commands). Comprehensive test coverage exists across 17 test files. This documentation covers the cross-cutting architecture.
> **Ownership:** Scanner Guild, Policy Guild
> **Audience:** Service owners, platform engineers, security architects
> **Related:** [Platform Architecture](../platform/architecture-overview.md), [Scanner Architecture](../scanner/architecture.md), [Replay Architecture](../replay/architecture.md), [Policy Engine](../policy/architecture.md)
This dossier describes the Facet Sealing subsystem, which provides cryptographically sealed manifests for logical slices of container images, enabling fine-grained drift detection, per-facet quota enforcement, and deterministic change tracking.
---
## 1. Overview
A **Facet** is a declared logical slice of a container image representing a cohesive set of files with shared characteristics:
| Facet Type | Description | Examples |
|------------|-------------|----------|
| `os` | Operating system packages | `/var/lib/dpkg/**`, `/var/lib/rpm/**` |
| `lang/<ecosystem>` | Language-specific dependencies | `node_modules/**`, `site-packages/**`, `vendor/**` |
| `binary` | Native binaries and shared libraries | `/usr/bin/*`, `/lib/**/*.so*` |
| `config` | Configuration files | `/etc/**`, `*.conf`, `*.yaml` |
| `custom` | User-defined patterns | Project-specific paths |
Each facet can be individually **sealed** (cryptographic snapshot) and monitored for **drift** (changes between seals).
---
## 2. System Landscape
```mermaid
graph TD
subgraph Scanner["Scanner Services"]
FE[FacetExtractor]
FH[FacetHasher]
MB[MerkleBuilder]
end
subgraph Storage["Facet Storage"]
FS[(PostgreSQL<br/>facet_seals)]
FC[(CAS<br/>facet_manifests)]
end
subgraph Policy["Policy & Enforcement"]
DC[DriftCalculator]
QE[QuotaEnforcer]
AV[AdmissionValidator]
end
subgraph Signing["Attestation"]
DS[DSSE Signer]
AT[Attestor]
end
subgraph CLI["CLI & Integration"]
SealCmd[stella seal]
DriftCmd[stella drift]
VexCmd[stella vex gen]
Zastava[Zastava Webhook]
end
FE --> FH
FH --> MB
MB --> DS
DS --> FS
DS --> FC
FS --> DC
DC --> QE
QE --> AV
AV --> Zastava
SealCmd --> FE
DriftCmd --> DC
VexCmd --> DC
```
---
## 3. Core Data Models
### 3.1 FacetDefinition
Declares a facet with its extraction patterns and quota constraints:
```csharp
public sealed record FacetDefinition
{
public required string FacetId { get; init; } // e.g., "os", "lang/node", "binary"
public required FacetType Type { get; init; } // OS, LangNode, LangPython, Binary, Config, Custom
public required ImmutableArray<string> IncludeGlobs { get; init; }
public ImmutableArray<string> ExcludeGlobs { get; init; } = [];
public FacetQuota? Quota { get; init; }
}
public enum FacetType
{
OS,
LangNode,
LangPython,
LangGo,
LangRust,
LangJava,
LangDotNet,
Binary,
Config,
Custom
}
```
### 3.2 FacetManifest
Per-facet file manifest with Merkle root:
```csharp
public sealed record FacetManifest
{
public required string FacetId { get; init; }
public required FacetType Type { get; init; }
public required ImmutableArray<FacetFileEntry> Files { get; init; }
public required string MerkleRoot { get; init; } // SHA-256 hex
public required int FileCount { get; init; }
public required long TotalBytes { get; init; }
public required DateTimeOffset ExtractedAt { get; init; }
public required string ExtractorVersion { get; init; }
}
public sealed record FacetFileEntry
{
public required string Path { get; init; } // Normalized POSIX path
public required string ContentHash { get; init; } // SHA-256 hex
public required long Size { get; init; }
public required string Mode { get; init; } // POSIX mode string "0644"
public required DateTimeOffset ModTime { get; init; } // Normalized to UTC
}
```
### 3.3 FacetSeal
DSSE-signed seal combining manifest with metadata:
```csharp
public sealed record FacetSeal
{
public required Guid SealId { get; init; }
public required string ImageRef { get; init; } // registry/repo:tag@sha256:...
public required string ImageDigest { get; init; } // sha256:...
public required FacetManifest Manifest { get; init; }
public required DateTimeOffset SealedAt { get; init; }
public required string SealedBy { get; init; } // Identity/service
public required FacetQuota? AppliedQuota { get; init; }
public required DsseEnvelope Envelope { get; init; }
}
```
### 3.4 FacetQuota
Per-facet change budget:
```csharp
public sealed record FacetQuota
{
public required string FacetId { get; init; }
public double MaxChurnPercent { get; init; } = 5.0; // 0-100
public int MaxChangedFiles { get; init; } = 50;
public int MaxAddedFiles { get; init; } = 25;
public int MaxRemovedFiles { get; init; } = 10;
public QuotaAction OnExceed { get; init; } = QuotaAction.Warn;
}
public enum QuotaAction
{
Warn, // Log warning, allow admission
Block, // Reject admission
RequireVex // Require VEX justification before admission
}
```
### 3.5 FacetDrift
Drift calculation result between two seals:
```csharp
public sealed record FacetDrift
{
public required string FacetId { get; init; }
public required Guid BaselineSealId { get; init; }
public required Guid CurrentSealId { get; init; }
public required ImmutableArray<DriftEntry> Added { get; init; }
public required ImmutableArray<DriftEntry> Removed { get; init; }
public required ImmutableArray<DriftEntry> Modified { get; init; }
public required DriftScore Score { get; init; }
public required QuotaVerdict QuotaVerdict { get; init; }
}
public sealed record DriftEntry
{
public required string Path { get; init; }
public string? OldHash { get; init; }
public string? NewHash { get; init; }
public long? OldSize { get; init; }
public long? NewSize { get; init; }
public DriftCause Cause { get; init; } = DriftCause.Unknown;
}
public enum DriftCause
{
Unknown,
PackageUpdate,
ConfigChange,
BinaryRebuild,
NewDependency,
RemovedDependency,
SecurityPatch
}
public sealed record DriftScore
{
public required int TotalChanges { get; init; }
public required double ChurnPercent { get; init; }
public required int AddedCount { get; init; }
public required int RemovedCount { get; init; }
public required int ModifiedCount { get; init; }
}
public sealed record QuotaVerdict
{
public required bool Passed { get; init; }
public required ImmutableArray<QuotaViolation> Violations { get; init; }
public required QuotaAction RecommendedAction { get; init; }
}
public sealed record QuotaViolation
{
public required string QuotaField { get; init; } // e.g., "MaxChurnPercent"
public required double Limit { get; init; }
public required double Actual { get; init; }
public required string Message { get; init; }
}
```
---
## 4. Component Architecture
### 4.1 FacetExtractor
Extracts file entries from container images based on facet definitions:
```csharp
public interface IFacetExtractor
{
Task<FacetManifest> ExtractAsync(
string imageRef,
FacetDefinition definition,
CancellationToken ct = default);
Task<ImmutableArray<FacetManifest>> ExtractAllAsync(
string imageRef,
ImmutableArray<FacetDefinition> definitions,
CancellationToken ct = default);
}
```
Implementation notes:
- Uses existing `ISurfaceReader` for container layer traversal
- Normalizes paths to POSIX format (forward slashes, no trailing slashes)
- Computes SHA-256 content hashes for each file
- Normalizes timestamps to UTC, mode to POSIX string
- Sorts files lexicographically for deterministic ordering
### 4.2 FacetHasher
Computes Merkle tree for facet file entries:
```csharp
public interface IFacetHasher
{
FacetMerkleResult ComputeMerkle(ImmutableArray<FacetFileEntry> files);
}
public sealed record FacetMerkleResult
{
public required string Root { get; init; }
public required ImmutableArray<string> LeafHashes { get; init; }
public required ImmutableArray<MerkleProofNode> Proof { get; init; }
}
```
Implementation notes:
- Leaf hash = SHA-256(path || contentHash || size || mode)
- Binary Merkle tree with lexicographic leaf ordering
- An empty facet produces a well-known empty root hash
- Proof enables verification of individual file membership
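A minimal sketch of the leaf-hash rule above. The exact byte layout and separators are an assumption here; `IFacetHasher` defines the authoritative encoding:

```csharp
// Sketch: leaf hash = SHA-256 over path || contentHash || size || mode.
// The "|" separator is illustrative, not the real wire format.
static string LeafHash(FacetFileEntry f)
{
    var bytes = Encoding.UTF8.GetBytes(
        $"{f.Path}|{f.ContentHash}|{f.Size}|{f.Mode}");
    return Convert.ToHexString(SHA256.HashData(bytes)).ToLowerInvariant();
}
```

Because leaves are ordered lexicographically by path, two manifests with identical file sets always produce the same Merkle root.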
### 4.3 FacetSealStore
PostgreSQL storage for sealed facet manifests:
```sql
-- Core seal storage
CREATE TABLE facet_seals (
seal_id UUID PRIMARY KEY,
tenant TEXT NOT NULL,
image_ref TEXT NOT NULL,
image_digest TEXT NOT NULL,
facet_id TEXT NOT NULL,
facet_type TEXT NOT NULL,
merkle_root TEXT NOT NULL,
file_count INTEGER NOT NULL,
total_bytes BIGINT NOT NULL,
sealed_at TIMESTAMPTZ NOT NULL,
sealed_by TEXT NOT NULL,
quota_json JSONB,
manifest_cas TEXT NOT NULL, -- CAS URI to full manifest
dsse_envelope JSONB NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
CONSTRAINT uq_facet_seal UNIQUE (tenant, image_digest, facet_id)
);
CREATE INDEX ix_facet_seals_image ON facet_seals (tenant, image_digest);
CREATE INDEX ix_facet_seals_merkle ON facet_seals (merkle_root);
-- Drift history
CREATE TABLE facet_drift_history (
drift_id UUID PRIMARY KEY,
tenant TEXT NOT NULL,
baseline_seal_id UUID NOT NULL REFERENCES facet_seals(seal_id),
current_seal_id UUID NOT NULL REFERENCES facet_seals(seal_id),
facet_id TEXT NOT NULL,
drift_score_json JSONB NOT NULL,
quota_verdict_json JSONB NOT NULL,
computed_at TIMESTAMPTZ NOT NULL,
CONSTRAINT uq_drift_pair UNIQUE (baseline_seal_id, current_seal_id)
);
```
### 4.4 DriftCalculator
Computes drift between baseline and current seals:
```csharp
public interface IDriftCalculator
{
Task<FacetDrift> CalculateAsync(
Guid baselineSealId,
Guid currentSealId,
CancellationToken ct = default);
Task<ImmutableArray<FacetDrift>> CalculateAllAsync(
string imageDigestBaseline,
string imageDigestCurrent,
CancellationToken ct = default);
}
```
Implementation notes:
- Retrieves manifests from CAS via seal metadata
- Performs set difference operations on file paths
- Detects modifications via content hash comparison
- Attributes drift causes where determinable (e.g., package manager metadata)
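The set-difference step can be sketched with plain dictionary operations over the two manifests (cause attribution and size deltas omitted):

```csharp
// Sketch: diff baseline vs current FacetManifest by normalized path.
var baseline = baselineManifest.Files.ToDictionary(f => f.Path);
var current  = currentManifest.Files.ToDictionary(f => f.Path);

var added    = current.Keys.Except(baseline.Keys).ToList();
var removed  = baseline.Keys.Except(current.Keys).ToList();
var modified = current.Keys.Intersect(baseline.Keys)
    .Where(p => current[p].ContentHash != baseline[p].ContentHash)
    .ToList();
```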
### 4.5 QuotaEnforcer
Evaluates drift against quota constraints:
```csharp
public interface IQuotaEnforcer
{
QuotaVerdict Evaluate(FacetDrift drift, FacetQuota quota);
Task<ImmutableArray<QuotaVerdict>> EvaluateAllAsync(
ImmutableArray<FacetDrift> drifts,
ImmutableDictionary<string, FacetQuota> quotas,
CancellationToken ct = default);
}
```
### 4.6 AdmissionValidator
Zastava webhook integration for admission control:
```csharp
public interface IFacetAdmissionValidator
{
Task<AdmissionResult> ValidateAsync(
AdmissionRequest request,
CancellationToken ct = default);
}
public sealed record AdmissionResult
{
public required bool Allowed { get; init; }
public string? Message { get; init; }
public ImmutableArray<QuotaViolation> Violations { get; init; } = [];
public string? RequiredVexStatement { get; init; }
}
```
---
## 5. DSSE Envelope Structure
Facet seals use DSSE (Dead Simple Signing Envelope) for cryptographic binding:
```json
{
"payloadType": "application/vnd.stellaops.facet-seal.v1+json",
"payload": "<base64url-encoded canonical JSON of FacetSeal>",
"signatures": [
{
"keyid": "sha256:abc123...",
"sig": "<base64url-encoded signature>"
}
]
}
```
Payload structure (canonical JSON, RFC 8785):
```json
{
"_type": "https://stellaops.io/FacetSeal/v1",
"facetId": "os",
"facetType": "OS",
"imageDigest": "sha256:abc123...",
"imageRef": "registry.example.com/app:v1.2.3",
"manifest": {
"extractedAt": "2026-01-05T10:00:00.000Z",
"extractorVersion": "1.0.0",
"fileCount": 1234,
"files": [
{
"contentHash": "sha256:...",
"mode": "0644",
"modTime": "2026-01-01T00:00:00.000Z",
"path": "/etc/os-release",
"size": 256
}
],
"merkleRoot": "sha256:def456...",
"totalBytes": 1048576
},
"quota": {
"maxAddedFiles": 25,
"maxChangedFiles": 50,
"maxChurnPercent": 5.0,
"maxRemovedFiles": 10,
"onExceed": "Warn"
},
"sealId": "550e8400-e29b-41d4-a716-446655440000",
"sealedAt": "2026-01-05T10:05:00.000Z",
"sealedBy": "scanner-worker-01"
}
```
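The mechanics of wrapping that payload can be sketched as follows. This Python sketch assumes the seal JSON has already been canonicalized per RFC 8785 (plain `json.dumps` with sorted keys stands in here) and elides the signing step itself; the `pae` function follows the pre-authentication encoding defined by the DSSE specification.

```python
# Sketch: wrap a facet-seal payload in a DSSE envelope. The PAE byte string
# (not the raw payload) is what a DSSE signer actually signs.
import base64
import json

def b64url(data: bytes) -> str:
    # DSSE uses base64url without padding for the payload field.
    return base64.urlsafe_b64encode(data).decode().rstrip("=")

def pae(payload_type: str, payload: bytes) -> bytes:
    # DSSE pre-authentication encoding:
    # "DSSEv1" SP LEN(type) SP type SP LEN(payload) SP payload
    return b" ".join([
        b"DSSEv1",
        str(len(payload_type)).encode(), payload_type.encode(),
        str(len(payload)).encode(), payload,
    ])

payload_type = "application/vnd.stellaops.facet-seal.v1+json"
seal = {"_type": "https://stellaops.io/FacetSeal/v1", "facetId": "os"}
# Stand-in for RFC 8785 canonicalization: compact separators, sorted keys.
payload = json.dumps(seal, separators=(",", ":"), sort_keys=True).encode()

envelope = {"payloadType": payload_type, "payload": b64url(payload), "signatures": []}
signing_input = pae(payload_type, payload)  # bytes handed to the signer
```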
---
## 6. Default Facet Definitions
Standard facet definitions applied when no custom configuration is provided:
```yaml
# Default facet configuration
facets:
- facetId: os
type: OS
includeGlobs:
- /var/lib/dpkg/**
- /var/lib/rpm/**
- /var/lib/pacman/**
- /var/lib/apk/**
- /var/cache/apt/**
- /etc/apt/**
- /etc/yum.repos.d/**
excludeGlobs:
- "**/*.log"
quota:
maxChurnPercent: 5.0
maxChangedFiles: 100
onExceed: Warn
- facetId: lang/node
type: LangNode
includeGlobs:
- "**/node_modules/**"
- "**/package.json"
- "**/package-lock.json"
- "**/yarn.lock"
- "**/pnpm-lock.yaml"
quota:
maxChurnPercent: 10.0
maxChangedFiles: 500
onExceed: RequireVex
- facetId: lang/python
type: LangPython
includeGlobs:
- "**/site-packages/**"
- "**/dist-packages/**"
- "**/requirements.txt"
- "**/Pipfile.lock"
- "**/poetry.lock"
quota:
maxChurnPercent: 10.0
maxChangedFiles: 200
onExceed: Warn
- facetId: lang/go
type: LangGo
includeGlobs:
- "**/go.mod"
- "**/go.sum"
- "**/vendor/**"
quota:
maxChurnPercent: 15.0
maxChangedFiles: 100
onExceed: Warn
- facetId: binary
type: Binary
includeGlobs:
- /usr/bin/*
- /usr/sbin/*
- /bin/*
- /sbin/*
- /usr/lib/**/*.so*
- /lib/**/*.so*
- /usr/local/bin/*
excludeGlobs:
- "**/*.py"
- "**/*.sh"
quota:
maxChurnPercent: 2.0
maxChangedFiles: 20
onExceed: Block
- facetId: config
type: Config
includeGlobs:
- /etc/**
- "**/*.conf"
- "**/*.cfg"
- "**/*.ini"
- "**/*.yaml"
- "**/*.yml"
- "**/*.json"
excludeGlobs:
- /etc/passwd
- /etc/shadow
- /etc/group
- "**/*.log"
quota:
maxChurnPercent: 20.0
maxChangedFiles: 50
onExceed: Warn
```
---
## 7. Integration Points
### 7.1 Scanner Integration
Scanner invokes facet extraction during scan:
```csharp
// In ScanOrchestrator
var facetDefs = await _facetConfigLoader.LoadAsync(scanRequest.FacetConfig, ct);
var manifests = await _facetExtractor.ExtractAllAsync(imageRef, facetDefs, ct);
foreach (var manifest in manifests)
{
var seal = await _facetSealer.SealAsync(manifest, scanRequest, ct);
await _facetSealStore.SaveAsync(seal, ct);
}
```
### 7.2 CLI Integration
```bash
# Seal all facets for an image
stella seal myregistry.io/app:v1.2.3 --output seals.json
# Seal specific facets
stella seal myregistry.io/app:v1.2.3 --facet os --facet lang/node
# Check drift between two image versions
stella drift myregistry.io/app:v1.2.3 myregistry.io/app:v1.2.4 --format json
# Generate VEX from drift
stella vex gen --from-drift myregistry.io/app:v1.2.3 myregistry.io/app:v1.2.4
```
### 7.3 Zastava Webhook Integration
```csharp
// In FacetAdmissionValidator
public async Task<AdmissionResult> ValidateAsync(AdmissionRequest request, CancellationToken ct)
{
// Find baseline seal (latest approved)
var baseline = await _sealStore.GetLatestApprovedAsync(request.ImageRef, ct);
if (baseline is null)
return AdmissionResult.Allowed("No baseline seal found, skipping facet check");
// Extract current facets
var currentManifests = await _extractor.ExtractAllAsync(request.ImageRef, _defaultFacets, ct);
// Calculate drift for each facet
var drifts = new List<FacetDrift>();
foreach (var manifest in currentManifests)
{
var baselineSeal = baseline.FirstOrDefault(s => s.FacetId == manifest.FacetId);
if (baselineSeal is not null)
{
var drift = await _driftCalculator.CalculateAsync(baselineSeal, manifest, ct);
drifts.Add(drift);
}
}
// Evaluate quotas
var violations = new List<QuotaViolation>();
QuotaAction maxAction = QuotaAction.Warn;
foreach (var drift in drifts)
{
var verdict = _quotaEnforcer.Evaluate(drift, drift.AppliedQuota);
if (!verdict.Passed)
{
violations.AddRange(verdict.Violations);
if (verdict.RecommendedAction > maxAction)
maxAction = verdict.RecommendedAction;
}
}
return maxAction switch
{
QuotaAction.Block => AdmissionResult.Denied(violations),
QuotaAction.RequireVex => AdmissionResult.RequiresVex(violations),
_ => AdmissionResult.Allowed(violations)
};
}
```
---
## 8. Observability
### 8.1 Metrics
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `facet_seal_total` | Counter | `tenant`, `facet_type`, `status` | Total seals created |
| `facet_seal_duration_seconds` | Histogram | `facet_type` | Time to create seal |
| `facet_drift_score` | Gauge | `tenant`, `facet_id`, `image` | Current drift score |
| `facet_quota_violations_total` | Counter | `tenant`, `facet_id`, `quota_field` | Quota violations |
| `facet_admission_decisions_total` | Counter | `tenant`, `decision`, `facet_id` | Admission decisions |
### 8.2 Traces
```
facet.extract - Facet file extraction from image
facet.hash - Merkle tree computation
facet.seal - DSSE signing
facet.drift.compute - Drift calculation
facet.quota.evaluate - Quota enforcement
facet.admission - Admission validation
```
### 8.3 Logs
Structured log fields:
- `facetId`: Facet identifier
- `imageRef`: Container image reference
- `imageDigest`: Image content digest
- `merkleRoot`: Facet Merkle root
- `driftScore`: Computed drift percentage
- `quotaVerdict`: Pass/fail status
---
## 9. Security Considerations
1. **Signature Verification**: All seals must be DSSE-signed with keys managed by Authority service
2. **Tenant Isolation**: Seals are scoped to tenants; cross-tenant access is prohibited
3. **Immutability**: Once created, seals cannot be modified; only superseded by new seals
4. **Audit Trail**: All seal operations are logged with correlation IDs
5. **Key Rotation**: Signing keys support rotation; old signatures remain valid with archived keys
---
## 10. References
- [DSSE Specification](https://github.com/secure-systems-lab/dsse)
- [RFC 8785 - JSON Canonicalization](https://tools.ietf.org/html/rfc8785)
- [Scanner Architecture](../scanner/architecture.md)
- [Attestor Architecture](../attestor/architecture.md)
- [Policy Engine Architecture](../policy/architecture.md)
- [Replay Architecture](../replay/architecture.md)
---
*Last updated: 2026-01-05*

# Provcache Module
> **Status: Implemented** — Core library shipped in Sprint 8200.0001.0001. API endpoints, caching, invalidation and write-behind queue are operational. Policy Engine integration pending architectural review.
> Provenance Cache — Maximizing Trust Evidence Density
## Overview
Provcache is a caching layer that maximizes "provenance density" — the amount of trustworthy evidence retained per byte — enabling faster security decisions, offline replays, and smaller air-gap bundles.
### Key Benefits
- **Trust Latency**: Warm cache lookups return in single-digit milliseconds
- **Bandwidth Efficiency**: Avoid re-fetching bulky SBOMs/attestations
- **Offline Operation**: Decisions usable without full SBOM/VEX payloads
- **Audit Transparency**: Full evidence chain verifiable via Merkle proofs
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ Policy Evaluator │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ VeriKey │───▶│ Provcache │───▶│ TrustLatticeEngine │ │
│ │ Builder │ │ Service │ │ (if cache miss) │ │
│ └─────────────┘ └─────────────┘ └─────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ Provcache Store │
│ ┌─────────────┐ ┌────────────────┐ │
│ │ Valkey │◀──▶│ Postgres │ │
│ │ (read-thru) │ │ (write-behind) │ │
│ └─────────────┘ └────────────────┘ │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ Evidence Chunk Store │
│ ┌─────────────────────────────────────┐│
│ │ prov_evidence_chunks (Postgres) ││
│ │ - Chunked SBOM/VEX/CallGraph ││
│ │ - Merkle tree verification ││
│ └─────────────────────────────────────┘│
└─────────────────────────────────────────┘
```
## Core Concepts
### VeriKey (Provenance Identity Key)
A composite hash that uniquely identifies a provenance decision context:
```
VeriKey = SHA256(
"v1|" || // Version prefix for compatibility
source_hash || // Image/artifact digest
"|" ||
sbom_hash || // Canonical SBOM hash
"|" ||
vex_hash_set_hash || // Sorted VEX statement hashes
"|" ||
merge_policy_hash || // PolicyBundle hash
"|" ||
signer_set_hash || // Signer certificate hashes
"|" ||
time_window // Epoch bucket
)
```
**Why each component?**
| Component | Purpose |
|-----------|---------|
| `source_hash` | Different artifacts → different keys |
| `sbom_hash` | SBOM changes (new packages) → new key |
| `vex_hash_set` | VEX updates → new key |
| `policy_hash` | Policy changes → new key |
| `signer_set_hash` | Key rotation → new key (security) |
| `time_window` | Temporal bucketing → controlled expiry |
#### VeriKey Composition Rules
1. **Hash Normalization**: All input hashes are normalized to lowercase with `sha256:` prefix stripped if present
2. **Set Hash Computation**: For VEX statements and signer certificates:
- Individual hashes are sorted lexicographically (ordinal)
- Sorted hashes are concatenated with `|` delimiter
- Result is SHA256-hashed
- Empty sets use well-known sentinels (`"empty-vex-set"`, `"empty-signer-set"`)
3. **Time Window Computation**: `floor(timestamp.Ticks / bucket.Ticks) * bucket.Ticks` in UTC ISO-8601 format
4. **Output Format**: `sha256:<64-char-lowercase-hex>`
#### Code Example
```csharp
var veriKey = new VeriKeyBuilder(options)
.WithSourceHash("sha256:abc123...") // Image digest
.WithSbomHash("sha256:def456...") // SBOM digest
.WithVexStatementHashes(["sha256:v1", "sha256:v2"]) // Sorted automatically
.WithMergePolicyHash("sha256:policy...") // Policy bundle
.WithCertificateHashes(["sha256:cert1"]) // Signer certs
.WithTimeWindow(DateTimeOffset.UtcNow) // Auto-bucketed
.Build();
// Returns: "sha256:789abc..."
```
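The composition rules above can also be traced end to end. The following Python sketch implements them literally (normalization, sorted set hashing with sentinels, hour bucketing, final digest); the exact delimiter and encoding of the concatenated parts are assumptions made for illustration, so the bytes hashed here need not match the C# builder's output.

```python
# Hypothetical end-to-end VeriKey composition per the rules above.
import hashlib
from datetime import datetime, timezone

def norm(h: str) -> str:
    # Rule 1: lowercase, strip a leading "sha256:" prefix if present.
    return h.lower().removeprefix("sha256:")

def set_hash(hashes: list[str], sentinel: str) -> str:
    # Rule 2: sort, join with "|", SHA256; empty sets use a sentinel.
    if not hashes:
        return hashlib.sha256(sentinel.encode()).hexdigest()
    joined = "|".join(sorted(norm(h) for h in hashes))
    return hashlib.sha256(joined.encode()).hexdigest()

def time_window(ts: datetime, bucket_seconds: int = 3600) -> str:
    # Rule 3: floor to the bucket boundary, emit UTC ISO-8601.
    floored = (int(ts.timestamp()) // bucket_seconds) * bucket_seconds
    return datetime.fromtimestamp(floored, tz=timezone.utc).isoformat()

def verikey(source, sbom, vex_hashes, policy, signer_hashes, ts) -> str:
    parts = "|".join([
        "v1", norm(source), norm(sbom),
        set_hash(vex_hashes, "empty-vex-set"),
        norm(policy),
        set_hash(signer_hashes, "empty-signer-set"),
        time_window(ts),
    ])
    # Rule 4: sha256:<64-char-lowercase-hex>
    return "sha256:" + hashlib.sha256(parts.encode()).hexdigest()
```

Note how normalization and bucketing make the key stable: mixed-case hashes, reordered VEX statements, and timestamps inside the same hour all produce the same VeriKey.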
### DecisionDigest
Canonicalized representation of an evaluation result:
```json
{
"digestVersion": "v1",
"veriKey": "sha256:abc123...",
"verdictHash": "sha256:def456...",
"proofRoot": "sha256:789abc...",
"replaySeed": {
"feedIds": ["cve-2024", "ghsa-2024"],
"ruleIds": ["default-policy-v2"]
},
"trustScore": 85,
"createdAt": "2025-12-24T12:00:00Z",
"expiresAt": "2025-12-25T12:00:00Z"
}
```
### Trust Score
A composite score (0-100) indicating decision confidence:
| Component | Weight | Calculation |
|-----------|--------|-------------|
| Reachability | 25% | Call graph coverage, entry points analyzed |
| SBOM Completeness | 20% | Package count, license data presence |
| VEX Coverage | 20% | Vendor statements, justifications |
| Policy Freshness | 15% | Time since last policy update |
| Signer Trust | 20% | Key age, reputation, chain validity |
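The roll-up implied by the weight table is a straightforward weighted sum. A minimal sketch, assuming each component has already been normalized to a 0-100 sub-score (how each sub-score is derived is defined by the calculations column above, not reproduced here):

```python
# Hypothetical trust-score roll-up using the documented weights.
WEIGHTS = {
    "reachability": 0.25,
    "sbom_completeness": 0.20,
    "vex_coverage": 0.20,
    "policy_freshness": 0.15,
    "signer_trust": 0.20,
}

def trust_score(components: dict[str, float]) -> int:
    # Weights must partition the whole score (sum to 1.0).
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
    total = sum(WEIGHTS[name] * components.get(name, 0.0) for name in WEIGHTS)
    return round(total)

# 0.25*90 + 0.20*80 + 0.20*85 + 0.15*70 + 0.20*95 = 85
score = trust_score({
    "reachability": 90, "sbom_completeness": 80, "vex_coverage": 85,
    "policy_freshness": 70, "signer_trust": 95,
})
```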
### Evidence Chunks
Large evidence (SBOM, VEX, call graphs) is stored in fixed-size chunks:
- **Default size**: 64 KB per chunk
- **Merkle verification**: Each chunk is a Merkle leaf
- **Lazy fetch**: Only fetch chunks needed for audit
- **LRU eviction**: Old chunks evicted under storage pressure
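The chunking and Merkle-leaf scheme can be sketched as follows. This Python illustration pairs fixed 64 KB slicing with a simple binary SHA-256 tree; the service's actual leaf/node domain separation and odd-level handling are not specified in this document, so the duplicate-last-node rule below is an assumption.

```python
# Hypothetical sketch: split evidence into fixed-size chunks and compute a
# Merkle root over them. Each chunk hash is a leaf; parents hash child pairs.
import hashlib

CHUNK_SIZE = 64 * 1024  # documented default: 64 KB

def chunk(blob: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    return [blob[i:i + size] for i in range(0, len(blob), size)] or [b""]

def merkle_root(chunks: list[bytes]) -> str:
    level = [hashlib.sha256(c).digest() for c in chunks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumed rule: duplicate last node
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return "sha256:" + level[0].hex()
```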
## API Reference
### Endpoints
| Method | Path | Description |
|--------|------|-------------|
| GET | `/v1/provcache/{veriKey}` | Lookup cached decision |
| POST | `/v1/provcache` | Store decision (idempotent) |
| POST | `/v1/provcache/invalidate` | Invalidate by pattern |
| GET | `/v1/proofs/{proofRoot}` | List evidence chunks |
| GET | `/v1/proofs/{proofRoot}/chunks/{index}` | Download chunk |
### Cache Lookup Flow
```mermaid
sequenceDiagram
participant Client
participant PolicyEngine
participant Provcache
participant Valkey
participant Postgres
participant TrustLattice
Client->>PolicyEngine: Evaluate(artifact)
PolicyEngine->>Provcache: Get(VeriKey)
Provcache->>Valkey: GET verikey
alt Cache Hit
Valkey-->>Provcache: DecisionDigest
Provcache-->>PolicyEngine: CacheResult(hit)
PolicyEngine-->>Client: Decision (cached)
else Cache Miss
Valkey-->>Provcache: null
Provcache->>Postgres: SELECT * FROM provcache_items
alt DB Hit
Postgres-->>Provcache: ProvcacheEntry
Provcache->>Valkey: SET (backfill)
Provcache-->>PolicyEngine: CacheResult(hit, source=postgres)
else DB Miss
Postgres-->>Provcache: null
Provcache-->>PolicyEngine: CacheResult(miss)
PolicyEngine->>TrustLattice: Evaluate
TrustLattice-->>PolicyEngine: EvaluationResult
PolicyEngine->>Provcache: Set(VeriKey, DecisionDigest)
Provcache->>Valkey: SET
Provcache->>Postgres: INSERT (async)
PolicyEngine-->>Client: Decision (computed)
end
end
```
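The same read-through/write-behind flow can be condensed into a few lines. In this sketch, dicts stand in for Valkey and Postgres and a plain list stands in for the write-behind queue (which in the real service is flushed asynchronously by a background worker); `ProvcacheSketch` is a hypothetical name.

```python
# Hypothetical sketch of the lookup flow: Valkey first, Postgres backfill on
# a warm-store hit, full evaluation plus write-behind enqueue on a true miss.
class ProvcacheSketch:
    def __init__(self, evaluate):
        self.valkey = {}        # hot cache
        self.postgres = {}      # durable store
        self.write_behind = []  # pending async inserts
        self.evaluate = evaluate

    def get(self, verikey: str) -> dict:
        if verikey in self.valkey:
            return {"hit": True, "source": "valkey", "digest": self.valkey[verikey]}
        if verikey in self.postgres:
            digest = self.postgres[verikey]
            self.valkey[verikey] = digest            # backfill the hot cache
            return {"hit": True, "source": "postgres", "digest": digest}
        digest = self.evaluate(verikey)              # true miss: evaluate
        self.valkey[verikey] = digest
        self.write_behind.append((verikey, digest))  # persisted asynchronously
        return {"hit": False, "source": "computed", "digest": digest}

    def flush(self):
        # Stand-in for the background write-behind worker.
        while self.write_behind:
            key, digest = self.write_behind.pop(0)
            self.postgres[key] = digest
```

After a flush, the entry survives a hot-cache eviction: the next lookup hits Postgres and backfills Valkey without re-evaluating.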
## Invalidation
> **See also**: [architecture.md](architecture.md#invalidation-mechanisms) for detailed invalidation flow diagrams.
### Automatic Invalidation Triggers
| Trigger | Event | Scope | Implementation |
|---------|-------|-------|----------------|
| Signer Revocation | `SignerRevokedEvent` | All entries with matching `signer_set_hash` | `SignerSetInvalidator` |
| Feed Epoch Advance | `FeedEpochAdvancedEvent` | Entries with older `feed_epoch` | `FeedEpochInvalidator` |
| Policy Update | `PolicyUpdatedEvent` | Entries with matching `policy_hash` | `PolicyHashInvalidator` |
| TTL Expiry | Background job | Entries past `expires_at` | `TtlExpirationService` |
### Invalidation Interfaces
```csharp
// Main invalidator interface
public interface IProvcacheInvalidator
{
Task<int> InvalidateAsync(
InvalidationCriteria criteria,
string reason,
string? correlationId = null,
CancellationToken cancellationToken = default);
}
// Revocation ledger for audit trail
public interface IRevocationLedger
{
Task RecordAsync(RevocationEntry entry, CancellationToken ct = default);
Task<IReadOnlyList<RevocationEntry>> GetEntriesSinceAsync(long sinceSeqNo, int limit = 1000, CancellationToken ct = default);
Task<RevocationLedgerStats> GetStatsAsync(CancellationToken ct = default);
}
```
### Manual Invalidation
```bash
# Invalidate by signer
POST /v1/provcache/invalidate
{
"by": "signer_set_hash",
"value": "sha256:revoked-signer...",
"reason": "key-compromise"
}
# Invalidate by policy
POST /v1/provcache/invalidate
{
"by": "policy_hash",
"value": "sha256:old-policy...",
"reason": "policy-update"
}
```
### Revocation Replay
Nodes can replay missed revocation events after restart or network partition:
```csharp
var replayService = services.GetRequiredService<IRevocationReplayService>();
var checkpoint = await replayService.GetCheckpointAsync();
var result = await replayService.ReplayFromAsync(
sinceSeqNo: checkpoint,
new RevocationReplayOptions { BatchSize = 1000 });
// result.EntriesReplayed, result.TotalInvalidations
```
## Air-Gap Integration
> **See also**: [architecture.md](architecture.md#air-gap-exportimport) for bundle format specification and architecture diagrams.
### Export Workflow
```bash
# Export minimal proof (digest only)
stella prov export --verikey sha256:abc123 --density lite
# Export with evidence chunks
stella prov export --verikey sha256:abc123 --density standard
# Export full evidence
stella prov export --verikey sha256:abc123 --density strict --sign
```
### Import Workflow
```bash
# Import and verify Merkle root
stella prov import --input proof.bundle
# Import with lazy chunk fetch (connected mode)
stella prov import --input proof-lite.json --lazy-fetch --backend https://api.stellaops.com
# Import with lazy fetch from file directory (sneakernet mode)
stella prov import --input proof-lite.json --lazy-fetch --chunks-dir /mnt/usb/evidence
```
### Density Levels
| Level | Contents | Size | Use Case | Lazy Fetch Support |
|-------|----------|------|----------|--------------------|
| `lite` | DecisionDigest + ProofRoot + Manifest | ~2 KB | Quick verification | Required |
| `standard` | + First N chunks (~10%) | ~200 KB | Normal audit | Partial (remaining chunks) |
| `strict` | + All chunks | Variable | Full compliance | Not needed |
### Lazy Evidence Fetching
For `lite` and `standard` density exports, missing chunks can be fetched on-demand:
```csharp
// HTTP fetcher (connected mode)
var httpFetcher = new HttpChunkFetcher(
new Uri("https://api.stellaops.com"), logger);
// File fetcher (air-gapped/sneakernet mode)
var fileFetcher = new FileChunkFetcher(
basePath: "/mnt/usb/evidence", logger);
// Orchestrate fetch + verify + store
var orchestrator = new LazyFetchOrchestrator(repository, logger);
var result = await orchestrator.FetchAndStoreAsync(
proofRoot: "sha256:...",
fetcher,
new LazyFetchOptions
{
VerifyOnFetch = true,
BatchSize = 100,
MaxChunks = 1000
});
```
### Sneakernet Export for Chunked Evidence
```csharp
// Export evidence chunks to file system for transport
await fileFetcher.ExportEvidenceChunksToFilesAsync(
manifest,
chunks,
outputDirectory: "/mnt/usb/evidence");
```
## Configuration
### C# Configuration Class
The `ProvcacheOptions` class (section name: `"Provcache"`) exposes the following settings:
| Property | Type | Default | Validation | Description |
|----------|------|---------|------------|-------------|
| `DefaultTtl` | `TimeSpan` | 24h | 1 min–7 d | Default time-to-live for cache entries |
| `MaxTtl` | `TimeSpan` | 7d | 1 min–30 d | Maximum allowed TTL regardless of request |
| `TimeWindowBucket` | `TimeSpan` | 1h | 1 min–24 h | Time window bucket for VeriKey computation |
| `ValkeyKeyPrefix` | `string` | `"stellaops:prov:"` | — | Key prefix for Valkey storage |
| `EnableWriteBehind` | `bool` | `true` | — | Enable async Postgres persistence |
| `WriteBehindFlushInterval` | `TimeSpan` | 5s | 1 s–5 min | Interval for flushing write-behind queue |
| `WriteBehindMaxBatchSize` | `int` | 100 | 1–10000 | Maximum batch size per flush |
| `WriteBehindQueueCapacity` | `int` | 10000 | 100–1M | Max queue capacity (blocks when full) |
| `WriteBehindMaxRetries` | `int` | 3 | 0–10 | Retry attempts for failed writes |
| `ChunkSize` | `int` | 65536 | 1 KB–1 MB | Evidence chunk size in bytes |
| `MaxChunksPerEntry` | `int` | 1000 | 1–100000 | Max chunks per cache entry |
| `AllowCacheBypass` | `bool` | `true` | — | Allow clients to force re-evaluation |
| `DigestVersion` | `string` | `"v1"` | — | Serialization version for digests |
| `HashAlgorithm` | `string` | `"SHA256"` | — | Hash algorithm for VeriKey/digest |
| `EnableValkeyCache` | `bool` | `true` | — | Enable Valkey layer (false = Postgres only) |
| `SlidingExpiration` | `bool` | `false` | — | Refresh TTL on cache hits |
### appsettings.json Example
```json
{
"Provcache": {
"DefaultTtl": "24:00:00",
"MaxTtl": "7.00:00:00",
"TimeWindowBucket": "01:00:00",
"ValkeyKeyPrefix": "stellaops:prov:",
"EnableWriteBehind": true,
"WriteBehindFlushInterval": "00:00:05",
"WriteBehindMaxBatchSize": 100,
"WriteBehindQueueCapacity": 10000,
"WriteBehindMaxRetries": 3,
"ChunkSize": 65536,
"MaxChunksPerEntry": 1000,
"AllowCacheBypass": true,
"DigestVersion": "v1",
"HashAlgorithm": "SHA256",
"EnableValkeyCache": true,
"SlidingExpiration": false
}
}
```
### YAML Example (Helm/Kubernetes)
```yaml
provcache:
# TTL configuration
defaultTtl: 24h
maxTtl: 168h # 7 days
timeWindowBucket: 1h
# Storage
valkeyKeyPrefix: "stellaops:prov:"
enableWriteBehind: true
writeBehindFlushInterval: 5s
writeBehindMaxBatchSize: 100
# Evidence chunking
chunkSize: 65536 # 64 KB
maxChunksPerEntry: 1000
# Behavior
allowCacheBypass: true
digestVersion: "v1"
```
### Dependency Injection Registration
```csharp
// In Program.cs or Startup.cs
services.AddProvcache(configuration);
// Or with explicit configuration
services.AddProvcache(options =>
{
options.DefaultTtl = TimeSpan.FromHours(12);
options.EnableWriteBehind = true;
options.WriteBehindMaxBatchSize = 200;
});
```
## Observability
### Metrics
| Metric | Type | Description |
|--------|------|-------------|
| `provcache_requests_total` | Counter | Total cache requests |
| `provcache_hits_total` | Counter | Cache hits |
| `provcache_misses_total` | Counter | Cache misses |
| `provcache_latency_seconds` | Histogram | Operation latency |
| `provcache_items_count` | Gauge | Current item count |
| `provcache_invalidations_total` | Counter | Invalidation count |
### Alerts
```yaml
# Low cache hit rate
- alert: ProvcacheLowHitRate
expr: rate(provcache_hits_total[5m]) / rate(provcache_requests_total[5m]) < 0.5
for: 10m
labels:
severity: warning
annotations:
summary: "Provcache hit rate below 50%"
# High invalidation rate
- alert: ProvcacheHighInvalidationRate
expr: rate(provcache_invalidations_total[5m]) > 100
for: 5m
labels:
severity: warning
annotations:
summary: "High cache invalidation rate"
```
## Security Considerations
### Signer-Aware Caching
The `signer_set_hash` is part of the VeriKey, ensuring:
- Key rotation → new cache entries
- Key revocation → immediate invalidation
- No stale decisions from compromised signers
### Merkle Verification
All evidence chunks are Merkle-verified:
- `ProofRoot` = Merkle root of all chunks
- Individual chunks verifiable without full tree
- Tamper detection on import
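Verifying a single chunk without the full tree works by recomputing the root from the chunk hash and a Merkle audit path (the sibling hashes along the route to the root). A minimal sketch follows; the concrete node ordering and any domain separation used by the service are assumptions here.

```python
# Hypothetical audit-path verification: hash the chunk, then fold in each
# sibling ("L" means the sibling is the left child) up to the root.
import hashlib

def verify_chunk(chunk: bytes, proof_root: str,
                 path: list[tuple[str, bytes]]) -> bool:
    node = hashlib.sha256(chunk).digest()
    for side, sibling in path:
        pair = sibling + node if side == "L" else node + sibling
        node = hashlib.sha256(pair).digest()
    return "sha256:" + node.hex() == proof_root
```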
### Audit Trail
All invalidations are logged to `prov_revocations` table:
```sql
SELECT * FROM provcache.prov_revocations
WHERE created_at > NOW() - INTERVAL '24 hours'
ORDER BY created_at DESC;
```
## Database Schema
### provcache_items
```sql
CREATE TABLE provcache.provcache_items (
verikey TEXT PRIMARY KEY,
digest_version TEXT NOT NULL,
verdict_hash TEXT NOT NULL,
proof_root TEXT NOT NULL,
replay_seed JSONB NOT NULL,
policy_hash TEXT NOT NULL,
signer_set_hash TEXT NOT NULL,
feed_epoch TEXT NOT NULL,
trust_score INTEGER NOT NULL,
hit_count BIGINT DEFAULT 0,
created_at TIMESTAMPTZ NOT NULL,
expires_at TIMESTAMPTZ NOT NULL,
updated_at TIMESTAMPTZ NOT NULL
);
```
### prov_evidence_chunks
```sql
CREATE TABLE provcache.prov_evidence_chunks (
chunk_id UUID PRIMARY KEY,
    proof_root TEXT NOT NULL, -- indexed lookup key; not a foreign key (proof_root is not unique in provcache_items)
chunk_index INTEGER NOT NULL,
chunk_hash TEXT NOT NULL,
blob BYTEA NOT NULL,
blob_size INTEGER NOT NULL,
content_type TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL
);
```
### prov_revocations
```sql
CREATE TABLE provcache.prov_revocations (
seq_no BIGSERIAL PRIMARY KEY,
revocation_id UUID NOT NULL UNIQUE,
revocation_type VARCHAR(32) NOT NULL, -- signer, feed_epoch, policy, explicit, expiration
revoked_key VARCHAR(512) NOT NULL,
reason VARCHAR(1024),
entries_invalidated INTEGER NOT NULL,
source VARCHAR(128) NOT NULL,
correlation_id VARCHAR(128),
revoked_at TIMESTAMPTZ NOT NULL,
metadata JSONB,
CONSTRAINT chk_revocation_type CHECK (
revocation_type IN ('signer', 'feed_epoch', 'policy', 'explicit', 'expiration')
)
);
CREATE INDEX idx_revocations_type ON provcache.prov_revocations(revocation_type);
CREATE INDEX idx_revocations_key ON provcache.prov_revocations(revoked_key);
CREATE INDEX idx_revocations_time ON provcache.prov_revocations(revoked_at);
```
## Implementation Status
### Completed (Sprint 8200.0001.0001 - Core Backend)
| Component | Path | Status |
|-----------|------|--------|
| Core Models | `src/__Libraries/StellaOps.Provcache/Models/` | ✅ Done |
| VeriKeyBuilder | `src/__Libraries/StellaOps.Provcache/VeriKeyBuilder.cs` | ✅ Done |
| DecisionDigest | `src/__Libraries/StellaOps.Provcache/DecisionDigest.cs` | ✅ Done |
| Caching Layer | `src/__Libraries/StellaOps.Provcache/Caching/` | ✅ Done |
| WriteBehindQueue | `src/__Libraries/StellaOps.Provcache/Persistence/` | ✅ Done |
| API Endpoints | `src/__Libraries/StellaOps.Provcache.Api/` | ✅ Done |
| Unit Tests (53) | `src/__Libraries/__Tests/StellaOps.Provcache.Tests/` | ✅ Done |
### Completed (Sprint 8200.0001.0002 - Invalidation & Air-Gap)
| Component | Path | Status |
|-----------|------|--------|
| Invalidation Interfaces | `src/__Libraries/StellaOps.Provcache/Invalidation/` | ✅ Done |
| Repository Invalidation Methods | `IEvidenceChunkRepository.Delete*Async()` | ✅ Done |
| Export Interfaces | `src/__Libraries/StellaOps.Provcache/Export/` | ✅ Done |
| IMinimalProofExporter | `Export/IMinimalProofExporter.cs` | ✅ Done |
| MinimalProofExporter | `Export/MinimalProofExporter.cs` | ✅ Done |
| Lazy Fetch - ILazyEvidenceFetcher | `LazyFetch/ILazyEvidenceFetcher.cs` | ✅ Done |
| Lazy Fetch - HttpChunkFetcher | `LazyFetch/HttpChunkFetcher.cs` | ✅ Done |
| Lazy Fetch - FileChunkFetcher | `LazyFetch/FileChunkFetcher.cs` | ✅ Done |
| Lazy Fetch - LazyFetchOrchestrator | `LazyFetch/LazyFetchOrchestrator.cs` | ✅ Done |
| Revocation - IRevocationLedger | `Revocation/IRevocationLedger.cs` | ✅ Done |
| Revocation - InMemoryRevocationLedger | `Revocation/InMemoryRevocationLedger.cs` | ✅ Done |
| Revocation - RevocationReplayService | `Revocation/RevocationReplayService.cs` | ✅ Done |
| ProvRevocationEntity | `Entities/ProvRevocationEntity.cs` | ✅ Done |
| Unit Tests (124 total) | `src/__Libraries/__Tests/StellaOps.Provcache.Tests/` | ✅ Done |
### Blocked
| Component | Reason |
|-----------|--------|
| Policy Engine Integration | `PolicyEvaluator` is `internal sealed`; requires architectural review to expose injection points for `IProvcacheService` |
| CLI e2e Tests | `AddSimRemoteCryptoProvider` method missing in CLI codebase |
### Pending
| Component | Sprint |
|-----------|--------|
| Authority Event Integration | 8200.0001.0002 (BLOCKED - Authority needs event publishing) |
| Concelier Event Integration | 8200.0001.0002 (BLOCKED - Concelier needs event publishing) |
| PostgresRevocationLedger | Future (requires EF Core integration) |
| UI Badges & Proof Tree | 8200.0001.0003 |
| Grafana Dashboards | 8200.0001.0003 |
## Implementation Sprints
| Sprint | Focus | Key Deliverables |
|--------|-------|------------------|
| [8200.0001.0001](../../implplan/SPRINT_8200_0001_0001_provcache_core_backend.md) | Core Backend | VeriKey, DecisionDigest, Valkey+Postgres, API |
| [8200.0001.0002](../../implplan/SPRINT_8200_0001_0002_provcache_invalidation_airgap.md) | Invalidation & Air-Gap | Signer revocation, feed epochs, CLI export/import |
| [8200.0001.0003](../../implplan/SPRINT_8200_0001_0003_provcache_ux_observability.md) | UX & Observability | UI badges, proof tree, Grafana, OCI attestation |
## Related Documentation
- **[Provcache Architecture Guide](architecture.md)** - Detailed architecture, invalidation flows, and API reference
- [Policy Engine Architecture](../policy/README.md)
- [TrustLattice Engine](../policy/design/policy-deterministic-evaluator.md)
- [Offline Kit Documentation](../../OFFLINE_KIT.md)
- [Air-Gap Controller](../airgap/README.md)
- [Authority Key Rotation](../authority/README.md)

# Provcache Architecture Guide
> **Status: Production (Shared Library Family).** Provcache is a mature, heavily-used shared library family — not a planned component. The implementation spans four libraries: `src/__Libraries/StellaOps.Provcache/` (77 core files: VeriKey, DecisionDigest, chunking, write-behind queue, invalidation, telemetry), `StellaOps.Provcache.Postgres/` (16 files: EF Core persistence with `provcache` schema), `StellaOps.Provcache.Valkey/` (hot-cache layer), and `StellaOps.Provcache.Api/` (HTTP endpoints). Consumed by 89+ files across Policy Engine, Concelier, ExportCenter, CLI, and other modules. Comprehensive test coverage (89 test files). Actively maintained with recent determinism refactoring (DET-005).
> Detailed architecture documentation for the Provenance Cache module
## Overview
Provcache provides a caching layer that maximizes "provenance density" — the amount of trustworthy evidence retained per byte. This document covers the internal architecture, invalidation mechanisms, air-gap support, and replay capabilities.
## Table of Contents
1. [Cache Architecture](#cache-architecture)
2. [Invalidation Mechanisms](#invalidation-mechanisms)
3. [Evidence Chunk Storage](#evidence-chunk-storage)
4. [Air-Gap Export/Import](#air-gap-exportimport)
5. [Lazy Evidence Fetching](#lazy-evidence-fetching)
6. [Revocation Ledger](#revocation-ledger)
7. [API Reference](#api-reference)
---
## Cache Architecture
### Storage Layers
```
┌───────────────────────────────────────────────────────────────┐
│ Application Layer │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────┐ │
│ │ VeriKey │───▶│ Provcache │───▶│ Policy Engine │ │
│ │ Builder │ │ Service │ │ (cache miss) │ │
│ └─────────────┘ └─────────────┘ └─────────────────┘ │
└───────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────┐
│ Caching Layer │
│ ┌─────────────────┐ ┌──────────────────────────┐ │
│ │ Valkey │◀───────▶│ PostgreSQL │ │
│ │ (read-through) │ │ (write-behind queue) │ │
│ │ │ │ │ │
│ │ • Hot cache │ │ • provcache_items │ │
│ │ • Sub-ms reads │ │ • prov_evidence_chunks │ │
│ │ • TTL-based │ │ • prov_revocations │ │
│ └─────────────────┘ └──────────────────────────┘ │
└───────────────────────────────────────────────────────────────┘
```
### Key Components
| Component | Purpose |
|-----------|---------|
| `IProvcacheService` | Main service interface for cache operations |
| `IProvcacheStore` | Storage abstraction (Valkey + Postgres) |
| `WriteBehindQueue` | Async persistence to Postgres |
| `IEvidenceChunker` | Splits large evidence into Merkle-verified chunks |
| `IRevocationLedger` | Audit trail for all invalidation events |
---
## Invalidation Mechanisms
Provcache supports multiple invalidation triggers to ensure cache consistency when upstream data changes.
### Automatic Invalidation
#### 1. Signer Revocation
When a signing key is compromised or rotated:
```
┌─────────────┐ SignerRevokedEvent ┌──────────────────┐
│ Authority │ ──────────────────────────▶│ SignerSet │
│ Module │ │ Invalidator │
└─────────────┘ └────────┬─────────┘
DELETE FROM provcache_items
WHERE signer_set_hash = ?
```
**Implementation**: `SignerSetInvalidator` subscribes to `SignerRevokedEvent` and invalidates all entries signed by the revoked key.
#### 2. Feed Epoch Advancement
When vulnerability feeds are updated:
```
┌─────────────┐ FeedEpochAdvancedEvent ┌──────────────────┐
│ Concelier │ ───────────────────────────▶│ FeedEpoch │
│ Module │ │ Invalidator │
└─────────────┘ └────────┬─────────┘
DELETE FROM provcache_items
WHERE feed_epoch < ?
```
**Implementation**: `FeedEpochInvalidator` compares epochs using semantic versioning or ISO timestamps.
#### 3. Policy Updates
When policy bundles change:
```
┌─────────────┐ PolicyUpdatedEvent ┌──────────────────┐
│ Policy │ ───────────────────────────▶│ PolicyHash │
│ Engine │ │ Invalidator │
└─────────────┘ └────────┬─────────┘
DELETE FROM provcache_items
WHERE policy_hash = ?
```
### Invalidation DI Wiring and Lifecycle
`AddProvcacheInvalidators()` registers the event-driven invalidation pipeline in dependency injection:
- Creates `IEventStream<SignerRevokedEvent>` using `IEventStreamFactory.Create<T>(new EventStreamOptions { StreamName = SignerRevokedEvent.StreamName })`
- Creates `IEventStream<FeedEpochAdvancedEvent>` using `IEventStreamFactory.Create<T>(new EventStreamOptions { StreamName = FeedEpochAdvancedEvent.StreamName })`
- Registers `SignerSetInvalidator` and `FeedEpochInvalidator` as singleton `IProvcacheInvalidator` implementations
- Registers `InvalidatorHostedService` as `IHostedService` to own invalidator startup/shutdown
`InvalidatorHostedService` starts all registered invalidators during host startup and stops them in reverse order during host shutdown. Each invalidator subscribes from `StreamPosition.End`, so only new events are consumed after process start.
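The lifecycle described above can be sketched compactly: start invalidators in registration order, stop them in reverse order on shutdown. This Python illustration uses hypothetical class names standing in for `InvalidatorHostedService` and the registered `IProvcacheInvalidator` implementations; event-stream subscription is elided.

```python
# Hypothetical sketch of the hosted-service start/stop ordering.
class Invalidator:
    def __init__(self, name: str):
        self.name = name
        self.running = False
    def start(self): self.running = True
    def stop(self): self.running = False

class InvalidatorHost:
    def __init__(self, invalidators):
        self.invalidators = list(invalidators)
        self.log = []  # ordered record of lifecycle events

    def start(self):
        for inv in self.invalidators:          # registration order
            inv.start()
            self.log.append(("start", inv.name))

    def stop(self):
        for inv in reversed(self.invalidators):  # reverse order
            inv.stop()
            self.log.append(("stop", inv.name))
```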
### Invalidation Recording
All invalidation events are recorded in the revocation ledger for audit and replay:
```csharp
public interface IProvcacheInvalidator
{
Task<int> InvalidateAsync(
InvalidationCriteria criteria,
string reason,
string? correlationId = null,
CancellationToken cancellationToken = default);
}
```
The ledger entry includes:
- Revocation type (signer, feed_epoch, policy, explicit)
- The revoked key
- Number of entries invalidated
- Timestamp and correlation ID for tracing
---
## Evidence Chunk Storage
Large evidence (SBOMs, VEX documents, call graphs) is stored in fixed-size chunks with Merkle tree verification.
### Chunking Process
```
┌─────────────────────────────────────────────────────────────────┐
│ Original Evidence │
│ [ 2.3 MB SPDX SBOM JSON ] │
└─────────────────────────────────────────────────────────────────┘
▼ IEvidenceChunker.ChunkAsync()
┌─────────────────────────────────────────────────────────────────┐
│ Chunk 0 (64KB) │ Chunk 1 (64KB) │ ... │ Chunk N (partial) │
│ hash: abc123 │ hash: def456 │ │ hash: xyz789 │
└─────────────────────────────────────────────────────────────────┘
▼ Merkle tree construction
┌─────────────────────────────────────────────────────────────────┐
│ Proof Root │
│ sha256:merkle_root_of_all_chunks │
└─────────────────────────────────────────────────────────────────┘
```
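The chunking and Merkle steps above can be sketched in Python. This is an illustrative model only — the real `IEvidenceChunker` is C#, and the exact tree construction (hash concatenation order, odd-node duplication) is an assumption here, not the module's specification:

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # 65536 bytes, matching the Provcache:ChunkSize default

def chunk_evidence(blob: bytes) -> list:
    """Split evidence into fixed-size chunks; the final chunk may be partial."""
    return [blob[i:i + CHUNK_SIZE] for i in range(0, len(blob), CHUNK_SIZE)]

def merkle_root(leaf_hashes: list) -> bytes:
    """Pairwise-hash leaf digests up to a single proof root.
    Odd levels duplicate their last node (an assumption, not the spec)."""
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

blob = b"\x00" * 2_752_512               # same totalSize as the bundle manifest example
chunks = chunk_evidence(blob)
hashes = [hashlib.sha256(c).digest() for c in chunks]
root = merkle_root(hashes)
print(len(chunks))                        # 42
```

Note that 2,752,512 bytes splits into exactly 42 full 64 KiB chunks, consistent with the `totalChunks`/`totalSize` values in the bundle format example.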
### Database Schema
```sql
CREATE TABLE provcache.prov_evidence_chunks (
chunk_id UUID PRIMARY KEY,
proof_root VARCHAR(128) NOT NULL,
chunk_index INTEGER NOT NULL,
chunk_hash VARCHAR(128) NOT NULL,
blob BYTEA NOT NULL,
blob_size INTEGER NOT NULL,
content_type VARCHAR(64) NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
CONSTRAINT uk_proof_chunk UNIQUE (proof_root, chunk_index)
);
CREATE INDEX idx_evidence_proof_root ON provcache.prov_evidence_chunks(proof_root);
```
### Paging API
Evidence can be retrieved in pages to manage memory:
```http
GET /api/v1/proofs/{proofRoot}?page=0&pageSize=10
```
Response includes chunk metadata without blob data, allowing clients to fetch specific chunks on demand.
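A hypothetical response body is shown below; the field names are illustrative (they follow the manifest fields of the bundle format) and the actual contract may differ:

```json
{
  "proofRoot": "sha256:789abc",
  "page": 0,
  "pageSize": 10,
  "totalChunks": 42,
  "totalSize": 2752512,
  "chunks": [
    {
      "index": 0,
      "hash": "sha256:abc123",
      "size": 65536,
      "contentType": "application/spdx+json"
    }
  ]
}
```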
---
## Air-Gap Export/Import
Provcache supports air-gapped environments through minimal proof bundles.
### Bundle Format (v1)
```json
{
"version": "v1",
"exportedAt": "2025-01-15T10:30:00Z",
"density": "standard",
"digest": {
"veriKey": "sha256:...",
"verdictHash": "sha256:...",
"proofRoot": "sha256:...",
"trustScore": 85
},
"manifest": {
"proofRoot": "sha256:...",
"totalChunks": 42,
"totalSize": 2752512,
"chunks": [...]
},
"chunks": [
{
"index": 0,
"data": "base64...",
"hash": "sha256:..."
}
],
"signature": {
"algorithm": "ECDSA-P256",
"signature": "base64...",
"signedAt": "2025-01-15T10:30:01Z"
}
}
```
### Density Levels
| Level | Contents | Typical Size | Use Case |
|-------|----------|--------------|----------|
| **Lite** | Digest + ProofRoot + Manifest | ~2 KB | Quick verification, requires lazy fetch for full evidence |
| **Standard** | + First 10% of chunks | ~200 KB | Normal audits, balance of size vs completeness |
| **Strict** | + All chunks | Variable | Full compliance, no network needed |
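The table above maps to a simple chunk-count rule. The Python sketch below assumes the "first 10%" cut-off rounds up, which is an implementation detail not specified here:

```python
import math

def chunks_in_bundle(total_chunks: int, density: str) -> int:
    """Chunk count embedded in a minimal proof bundle for each density level."""
    if density == "lite":
        return 0                                  # digest + manifest only
    if density == "standard":
        return math.ceil(total_chunks * 0.10)     # first 10% of chunks
    if density == "strict":
        return total_chunks                       # full evidence, no lazy fetch
    raise ValueError(f"unknown density: {density}")

print(chunks_in_bundle(42, "standard"))           # 5
```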
### Export Example
```csharp
var exporter = serviceProvider.GetRequiredService<IMinimalProofExporter>();
// Lite export (manifest only)
var liteBundle = await exporter.ExportAsync(
veriKey: "sha256:abc123",
new MinimalProofExportOptions { Density = ProofDensity.Lite });
// Signed strict export
var strictBundle = await exporter.ExportAsync(
veriKey: "sha256:abc123",
new MinimalProofExportOptions
{
Density = ProofDensity.Strict,
SignBundle = true,
Signer = signerInstance
});
```
### Import and Verification
```csharp
var result = await exporter.ImportAsync(bundle);
if (result.DigestVerified && result.ChunksVerified)
{
// Bundle is authentic
await provcache.UpsertAsync(result.Entry);
}
```
---
## Lazy Evidence Fetching
For lite bundles, missing chunks can be fetched on-demand from connected or file sources.
### Fetcher Architecture
```
┌──────────────────────┐
│ ILazyEvidenceFetcher │
└─────────┬────────────┘
┌─────┴─────┐
│ │
▼ ▼
┌─────────┐ ┌──────────┐
│ HTTP │ │ File │
│ Fetcher │ │ Fetcher │
└─────────┘ └──────────┘
```
### HTTP Fetcher (Connected Mode)
```csharp
var fetcher = new HttpChunkFetcher(
new Uri("https://api.stellaops.com"),
logger);
var orchestrator = new LazyFetchOrchestrator(repository, logger);
var result = await orchestrator.FetchAndStoreAsync(
proofRoot: "sha256:...",
fetcher,
new LazyFetchOptions
{
VerifyOnFetch = true,
BatchSize = 100
});
```
### File Fetcher (Sneakernet Mode)
For fully air-gapped environments:
1. Export full evidence to USB drive
2. Transport to isolated network
3. Import using file fetcher
```csharp
var fetcher = new FileChunkFetcher(
basePath: "/mnt/usb/evidence",
logger);
var result = await orchestrator.FetchAndStoreAsync(proofRoot, fetcher);
```
---
## Revocation Ledger
The revocation ledger provides a complete audit trail of all invalidation events.
### Schema
```sql
CREATE TABLE provcache.prov_revocations (
seq_no BIGSERIAL PRIMARY KEY,
revocation_id UUID NOT NULL,
revocation_type VARCHAR(32) NOT NULL,
revoked_key VARCHAR(512) NOT NULL,
reason VARCHAR(1024),
entries_invalidated INTEGER NOT NULL,
source VARCHAR(128) NOT NULL,
correlation_id VARCHAR(128),
revoked_at TIMESTAMPTZ NOT NULL,
metadata JSONB
);
```
### Replay for Catch-Up
After a restart or network partition, a node can replay missed revocations:
```csharp
var replayService = serviceProvider.GetRequiredService<IRevocationReplayService>();
// Get last checkpoint
var checkpoint = await replayService.GetCheckpointAsync();
// Replay from checkpoint
var result = await replayService.ReplayFromAsync(
sinceSeqNo: checkpoint,
new RevocationReplayOptions
{
BatchSize = 1000,
SaveCheckpointPerBatch = true
});
Console.WriteLine($"Replayed {result.EntriesReplayed} revocations, {result.TotalInvalidations} entries invalidated");
```
### Statistics
```csharp
var ledger = serviceProvider.GetRequiredService<IRevocationLedger>();
var stats = await ledger.GetStatsAsync();
// stats.TotalEntries - total revocation events
// stats.EntriesByType - breakdown by type (signer, feed_epoch, etc.)
// stats.TotalEntriesInvalidated - sum of all invalidated cache entries
```
---
## API Reference
### Evidence Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/v1/proofs/{proofRoot}` | GET | Get paged evidence chunks |
| `/api/v1/proofs/{proofRoot}/manifest` | GET | Get chunk manifest |
| `/api/v1/proofs/{proofRoot}/chunks/{index}` | GET | Get specific chunk |
| `/api/v1/proofs/{proofRoot}/verify` | POST | Verify Merkle proof |
### Invalidation Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/v1/provcache/invalidate` | POST | Manual invalidation |
| `/api/v1/provcache/revocations` | GET | List revocation history |
| `/api/v1/provcache/stats` | GET | Cache statistics |
### CLI Commands
```bash
# Export commands
stella prov export --verikey <key> --density <lite|standard|strict> [--output <file>] [--sign]
# Import commands
stella prov import <file> [--lazy-fetch] [--backend <url>] [--chunks-dir <path>]
# Verify commands
stella prov verify <file> [--signer-cert <cert>]
```
---
## Configuration
Key settings in `appsettings.json`:
```json
{
"Provcache": {
"ChunkSize": 65536,
"MaxChunksPerEntry": 1000,
"DefaultTtl": "24:00:00",
"EnableWriteBehind": true,
"WriteBehindFlushInterval": "00:00:05"
}
}
```
See [README.md](README.md) for full configuration reference.

# Provcache Metrics and Alerting Guide
This document describes the Prometheus metrics exposed by the Provcache layer and recommended alerting configurations.
## Overview
Provcache emits metrics for monitoring cache performance, hit rates, latency, and invalidation patterns. These metrics enable operators to:
- Track cache effectiveness
- Identify performance degradation
- Detect anomalous invalidation patterns
- Plan capacity for cache infrastructure
## Prometheus Metrics
### Request Counters
#### `provcache_requests_total`
Total number of cache requests.
| Label | Values | Description |
|-------|--------|-------------|
| `source` | `valkey`, `postgres` | Cache tier that handled the request |
| `result` | `hit`, `miss`, `expired` | Request outcome |
```promql
# Total requests per minute
rate(provcache_requests_total[1m])
# Hit rate percentage
sum(rate(provcache_requests_total{result="hit"}[5m])) /
sum(rate(provcache_requests_total[5m])) * 100
```
#### `provcache_hits_total`
Total cache hits (subset of requests with `result="hit"`).
| Label | Values | Description |
|-------|--------|-------------|
| `source` | `valkey`, `postgres` | Cache tier |
```promql
# Valkey vs Postgres hit ratio
sum(rate(provcache_hits_total{source="valkey"}[5m])) /
sum(rate(provcache_hits_total[5m])) * 100
```
#### `provcache_misses_total`
Total cache misses.
| Label | Values | Description |
|-------|--------|-------------|
| `reason` | `not_found`, `expired`, `invalidated` | Miss reason |
```promql
# Miss rate by reason
sum by (reason) (rate(provcache_misses_total[5m]))
```
### Latency Histogram
#### `provcache_latency_seconds`
Latency distribution for cache operations.
| Label | Values | Description |
|-------|--------|-------------|
| `operation` | `get`, `set`, `invalidate` | Operation type |
| `source` | `valkey`, `postgres` | Cache tier |
Buckets: `0.001, 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0`
```promql
# P50 latency for cache gets
histogram_quantile(0.50, rate(provcache_latency_seconds_bucket{operation="get"}[5m]))
# P95 latency
histogram_quantile(0.95, rate(provcache_latency_seconds_bucket{operation="get"}[5m]))
# P99 latency
histogram_quantile(0.99, rate(provcache_latency_seconds_bucket{operation="get"}[5m]))
```
### Gauge Metrics
#### `provcache_items_count`
Current number of items in cache.
| Label | Values | Description |
|-------|--------|-------------|
| `source` | `valkey`, `postgres` | Cache tier |
```promql
# Total cached items
sum(provcache_items_count)
# Items by tier
sum by (source) (provcache_items_count)
```
### Invalidation Metrics
#### `provcache_invalidations_total`
Total invalidation events.
| Label | Values | Description |
|-------|--------|-------------|
| `reason` | `signer_revoked`, `epoch_advanced`, `ttl_expired`, `manual` | Invalidation trigger |
```promql
# Invalidation rate by reason
sum by (reason) (rate(provcache_invalidations_total[5m]))
# Security-related invalidations
sum(rate(provcache_invalidations_total{reason="signer_revoked"}[5m]))
```
### Trust Score Metrics
#### `provcache_trust_score_average`
Gauge showing average trust score across cached decisions.
```promql
# Current average trust score
provcache_trust_score_average
```
#### `provcache_trust_score_bucket`
Histogram of trust score distribution.
Buckets: `20, 40, 60, 80, 100`
```promql
# Percentage of decisions with trust score above 80
(
  sum(rate(provcache_trust_score_bucket{le="+Inf"}[5m])) -
  sum(rate(provcache_trust_score_bucket{le="80"}[5m]))
) / sum(rate(provcache_trust_score_bucket{le="+Inf"}[5m])) * 100
```
---
## Grafana Dashboard
A pre-built dashboard is available at `deploy/grafana/dashboards/provcache-overview.json`.
### Panels
| Panel | Type | Description |
|-------|------|-------------|
| Cache Hit Rate | Gauge | Current hit rate percentage |
| Hit Rate Over Time | Time series | Hit rate trend |
| Latency Percentiles | Time series | P50, P95, P99 latency |
| Invalidation Rate | Time series | Invalidations per minute |
| Cache Size | Time series | Item count over time |
| Hits by Source | Pie chart | Valkey vs Postgres distribution |
| Entry Size Distribution | Histogram | Size of cached entries |
| Trust Score Distribution | Histogram | Decision trust scores |
### Importing the Dashboard
```bash
# Via Grafana HTTP API
curl -X POST http://grafana:3000/api/dashboards/db \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $GRAFANA_API_KEY" \
-d @deploy/grafana/dashboards/provcache-overview.json
# Via Helm (auto-provisioned)
# Dashboard is auto-imported when using StellaOps Helm chart
helm upgrade stellaops ./devops/helm/stellaops \
--set grafana.dashboards.provcache.enabled=true
```
---
## Alerting Rules
### Recommended Alerts
#### Low Cache Hit Rate
```yaml
alert: ProvcacheLowHitRate
expr: |
sum(rate(provcache_requests_total{result="hit"}[5m])) /
sum(rate(provcache_requests_total[5m])) < 0.7
for: 10m
labels:
severity: warning
annotations:
summary: "Provcache hit rate below 70%"
description: "Cache hit rate is {{ $value | humanizePercentage }}. Check for invalidation storms or cold cache."
```
#### Critical Hit Rate Drop
```yaml
alert: ProvcacheCriticalHitRate
expr: |
sum(rate(provcache_requests_total{result="hit"}[5m])) /
sum(rate(provcache_requests_total[5m])) < 0.5
for: 5m
labels:
severity: critical
annotations:
summary: "Provcache hit rate critically low"
description: "Cache hit rate is {{ $value | humanizePercentage }}. Immediate investigation required."
```
#### High Latency
```yaml
alert: ProvcacheHighLatency
expr: |
histogram_quantile(0.95, rate(provcache_latency_seconds_bucket{operation="get"}[5m])) > 0.1
for: 5m
labels:
severity: warning
annotations:
summary: "Provcache P95 latency above 100ms"
description: "P95 get latency is {{ $value | humanizeDuration }}. Check Valkey/Postgres performance."
```
#### Excessive Invalidations
```yaml
alert: ProvcacheInvalidationStorm
expr: |
sum(rate(provcache_invalidations_total[5m])) > 100
for: 5m
labels:
severity: warning
annotations:
summary: "Provcache invalidation rate spike"
description: "Invalidations at {{ $value }} per second. Check for feed epoch changes or revocations."
```
#### Signer Revocation Spike
```yaml
alert: ProvcacheSignerRevocations
expr: |
sum(rate(provcache_invalidations_total{reason="signer_revoked"}[5m])) > 10
for: 2m
labels:
severity: critical
annotations:
summary: "Signer revocation causing mass invalidation"
description: "{{ $value }} invalidations/sec due to signer revocation. Security event investigation required."
```
#### Cache Size Approaching Limit
```yaml
alert: ProvcacheSizeHigh
expr: |
sum(provcache_items_count) > 900000
for: 15m
labels:
severity: warning
annotations:
summary: "Provcache size approaching limit"
description: "Cache has {{ $value }} items. Consider scaling or tuning TTL."
```
#### Low Trust Scores
```yaml
alert: ProvcacheLowTrustScores
expr: |
provcache_trust_score_average < 60
for: 30m
labels:
severity: info
annotations:
summary: "Average trust score below 60"
description: "Average trust score is {{ $value }}. Review SBOM completeness and VEX coverage."
```
### AlertManager Configuration
```yaml
# alertmanager.yml
route:
group_by: ['alertname', 'severity']
group_wait: 30s
group_interval: 5m
repeat_interval: 4h
receiver: 'default-receiver'
routes:
- match:
severity: critical
receiver: 'pagerduty-critical'
- match:
alertname: ProvcacheSignerRevocations
receiver: 'security-team'
receivers:
- name: 'default-receiver'
slack_configs:
- channel: '#stellaops-alerts'
send_resolved: true
- name: 'pagerduty-critical'
pagerduty_configs:
- service_key: '<pagerduty-key>'
- name: 'security-team'
email_configs:
- to: 'security@example.com'
send_resolved: true
```
---
## Recording Rules
Pre-compute expensive queries for dashboard performance:
```yaml
# prometheus-rules.yml
groups:
- name: provcache-recording
interval: 30s
rules:
# Hit rate pre-computed
- record: provcache:hit_rate:5m
expr: |
sum(rate(provcache_requests_total{result="hit"}[5m])) /
sum(rate(provcache_requests_total[5m]))
# P95 latency pre-computed
- record: provcache:latency_p95:5m
expr: |
histogram_quantile(0.95, rate(provcache_latency_seconds_bucket{operation="get"}[5m]))
# Invalidation rate
- record: provcache:invalidation_rate:5m
expr: |
sum(rate(provcache_invalidations_total[5m]))
# Cache efficiency (hits per second vs misses)
- record: provcache:efficiency:5m
expr: |
sum(rate(provcache_hits_total[5m])) /
(sum(rate(provcache_hits_total[5m])) + sum(rate(provcache_misses_total[5m])))
```
---
## Operational Runbook
### Low Hit Rate Investigation
1. **Check invalidation metrics** — Is there an invalidation storm?
```promql
sum by (reason) (rate(provcache_invalidations_total[5m]))
```
2. **Check cache age** — Is the cache newly deployed (cold)?
```promql
sum(provcache_items_count)
```
3. **Check request patterns** — Are there many unique VeriKeys?
```promql
# High cardinality of unique requests suggests insufficient cache sharing
```
4. **Check TTL configuration** — Is TTL too aggressive?
- Review `Provcache:DefaultTtl` setting
- Consider increasing for stable workloads
### High Latency Investigation
1. **Check Valkey health**
```bash
valkey-cli -h valkey info stats
```
2. **Check Postgres connections**
```sql
SELECT count(*) FROM pg_stat_activity WHERE datname = 'stellaops';
```
3. **Check entry sizes**
```promql
histogram_quantile(0.95, rate(provcache_entry_size_bytes_bucket[5m]))
```
4. **Check network latency** between services
### Invalidation Storm Response
1. **Identify cause**
```promql
sum by (reason) (increase(provcache_invalidations_total[10m]))
```
2. **If epoch-related**: Expected during feed updates. Monitor duration.
3. **If signer-related**: Security event — escalate to security team.
4. **If manual**: Check audit logs for unauthorized invalidation.
---
## Related Documentation
- [Provcache Module README](../provcache/README.md) — Core concepts
- [Provcache Architecture](../provcache/architecture.md) — Technical details
- [Telemetry Architecture](../telemetry/architecture.md) — Observability patterns
- [Grafana Dashboard Guide](../../deploy/grafana/README.md) — Dashboard management

# Provcache OCI Attestation Verification Guide
This document describes how to verify Provcache decision attestations attached to OCI container images.
## Overview
StellaOps can attach provenance cache decisions to container images as OCI attestations. These attestations enable:
- **Supply chain verification** — Verify security decisions were made by trusted evaluators
- **Audit trails** — Retrieve the exact decision state at image push time
- **Policy gates** — Admission controllers can verify attestations before deployment
- **Offline verification** — Decisions verifiable without calling StellaOps services
## Attestation Format
### Predicate Type
```
stella.ops/provcache@v1
```
### Predicate Schema
```json
{
"_type": "stella.ops/provcache@v1",
"veriKey": "sha256:abc123...",
"decision": {
"digestVersion": "v1",
"verdictHash": "sha256:def456...",
"proofRoot": "sha256:789abc...",
"trustScore": 85,
"createdAt": "2025-12-24T12:00:00Z",
"expiresAt": "2025-12-25T12:00:00Z"
},
"inputs": {
"sourceDigest": "sha256:image...",
"sbomDigest": "sha256:sbom...",
"policyDigest": "sha256:policy...",
"feedEpoch": "2024-W52"
},
"verdicts": {
"CVE-2024-1234": "mitigated",
"CVE-2024-5678": "affected"
}
}
```
### Field Descriptions
| Field | Type | Description |
|-------|------|-------------|
| `_type` | string | Predicate type URI |
| `veriKey` | string | VeriKey hash identifying this decision context |
| `decision.digestVersion` | string | Decision digest schema version |
| `decision.verdictHash` | string | Hash of all verdicts |
| `decision.proofRoot` | string | Merkle proof root hash |
| `decision.trustScore` | number | Overall trust score (0-100) |
| `decision.createdAt` | string | ISO-8601 creation timestamp |
| `decision.expiresAt` | string | ISO-8601 expiry timestamp |
| `inputs.sourceDigest` | string | Container image digest |
| `inputs.sbomDigest` | string | SBOM document digest |
| `inputs.policyDigest` | string | Policy bundle digest |
| `inputs.feedEpoch` | string | Feed epoch identifier |
| `verdicts` | object | Map of CVE IDs to verdict status |
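As an illustration of how an admission gate might consume these fields, here is a hedged Python sketch that checks `trustScore` and `expiresAt` on an already-decoded predicate. Signature verification and payload decoding are out of scope here, and the threshold is an example value, not a product default:

```python
import json
from datetime import datetime, timezone

def check_predicate(predicate: dict, min_trust: int = 80) -> bool:
    """Admission-style check: trust score meets the threshold
    and the decision has not expired."""
    decision = predicate["decision"]
    expires = datetime.fromisoformat(decision["expiresAt"].replace("Z", "+00:00"))
    return (decision["trustScore"] >= min_trust
            and expires > datetime.now(timezone.utc))

predicate = json.loads("""{
  "_type": "stella.ops/provcache@v1",
  "decision": {"trustScore": 85, "expiresAt": "2099-01-01T00:00:00Z"}
}""")
print(check_predicate(predicate))
```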
---
## Verification with Cosign
### Prerequisites
```bash
# Install cosign
brew install cosign # macOS
# or
go install github.com/sigstore/cosign/v2/cmd/cosign@latest
```
### Basic Verification
```bash
# Verify attestation exists and is signed
cosign verify-attestation \
--type stella.ops/provcache@v1 \
registry.example.com/app:v1.2.3
```
### Verify with Identity Constraints
```bash
# Verify with signer identity (Fulcio)
cosign verify-attestation \
--type stella.ops/provcache@v1 \
--certificate-identity-regexp '.*@stellaops\.example\.com' \
--certificate-oidc-issuer https://auth.stellaops.example.com \
registry.example.com/app:v1.2.3
```
### Verify with Custom Trust Root
```bash
# Using enterprise CA
cosign verify-attestation \
--type stella.ops/provcache@v1 \
--certificate /path/to/enterprise-ca.crt \
--certificate-chain /path/to/ca-chain.crt \
registry.example.com/app:v1.2.3
```
### Extract Attestation Payload
```bash
# Get raw attestation JSON
cosign verify-attestation \
--type stella.ops/provcache@v1 \
--certificate-identity-regexp '.*@stellaops\.example\.com' \
--certificate-oidc-issuer https://auth.stellaops.example.com \
  registry.example.com/app:v1.2.3 | jq -r '.payload' | base64 -d | jq .
```
---
## Verification with StellaOps CLI
### Verify Attestation
```bash
# Verify using StellaOps CLI
stella verify attestation \
--image registry.example.com/app:v1.2.3 \
--type provcache
# Output:
# ✓ Attestation found: stella.ops/provcache@v1
# ✓ Signature valid (Fulcio)
# ✓ Trust score: 85
# ✓ Decision created: 2025-12-24T12:00:00Z
# ✓ Decision expires: 2025-12-25T12:00:00Z
```
### Verify with Policy Requirements
```bash
# Verify with minimum trust score
stella verify attestation \
--image registry.example.com/app:v1.2.3 \
--type provcache \
--min-trust-score 80
# Verify with freshness requirement
stella verify attestation \
--image registry.example.com/app:v1.2.3 \
--type provcache \
--max-age 24h
```
### Extract Decision Details
```bash
# Get full decision details
stella verify attestation \
--image registry.example.com/app:v1.2.3 \
--type provcache \
--output json | jq .
# Get specific fields
stella verify attestation \
--image registry.example.com/app:v1.2.3 \
--type provcache \
--output json | jq '.predicate.verdicts'
```
---
## Kubernetes Admission Control
### Gatekeeper Policy
```yaml
# constraint-template.yaml
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
name: provcacheattestation
spec:
crd:
spec:
names:
kind: ProvcacheAttestation
validation:
openAPIV3Schema:
type: object
properties:
minTrustScore:
type: integer
minimum: 0
maximum: 100
maxAgeHours:
type: integer
minimum: 1
targets:
- target: admission.k8s.gatekeeper.sh
rego: |
package provcacheattestation
violation[{"msg": msg}] {
container := input.review.object.spec.containers[_]
image := container.image
not has_valid_attestation(image)
msg := sprintf("Image %v missing valid provcache attestation", [image])
}
has_valid_attestation(image) {
attestation := get_attestation(image, "stella.ops/provcache@v1")
attestation.predicate.decision.trustScore >= input.parameters.minTrustScore
not is_expired(attestation.predicate.decision.expiresAt)
}
is_expired(expiry) {
time.parse_rfc3339_ns(expiry) < time.now_ns()
}
```
```yaml
# constraint.yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: ProvcacheAttestation
metadata:
name: require-provcache-attestation
spec:
match:
kinds:
- apiGroups: [""]
kinds: ["Pod"]
namespaces:
- production
parameters:
minTrustScore: 80
maxAgeHours: 48
```
### Kyverno Policy
```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: verify-provcache-attestation
spec:
validationFailureAction: enforce
background: true
rules:
- name: check-provcache-attestation
match:
any:
- resources:
kinds:
- Pod
verifyImages:
- imageReferences:
- "*"
attestations:
- predicateType: stella.ops/provcache@v1
conditions:
- all:
- key: "{{ decision.trustScore }}"
operator: GreaterThanOrEquals
value: 80
- key: "{{ decision.expiresAt }}"
operator: GreaterThan
                  value: "{{ time_now_utc() }}"
attestors:
- entries:
- keyless:
issuer: https://auth.stellaops.example.com
subject: ".*@stellaops\\.example\\.com"
```
---
## CI/CD Integration
### GitHub Actions
```yaml
# .github/workflows/verify-attestation.yml
name: Verify Provcache Attestation
on:
workflow_dispatch:
inputs:
image:
description: 'Image to verify'
required: true
jobs:
verify:
runs-on: ubuntu-latest
steps:
- name: Install cosign
uses: sigstore/cosign-installer@v3
- name: Verify attestation
run: |
cosign verify-attestation \
--type stella.ops/provcache@v1 \
--certificate-identity-regexp '.*@stellaops\.example\.com' \
--certificate-oidc-issuer https://auth.stellaops.example.com \
${{ inputs.image }}
- name: Check trust score
run: |
TRUST_SCORE=$(cosign verify-attestation \
--type stella.ops/provcache@v1 \
--certificate-identity-regexp '.*@stellaops\.example\.com' \
--certificate-oidc-issuer https://auth.stellaops.example.com \
${{ inputs.image }} | jq -r '.payload' | base64 -d | jq '.predicate.decision.trustScore')
if [ "$TRUST_SCORE" -lt 80 ]; then
echo "Trust score $TRUST_SCORE is below threshold (80)"
exit 1
fi
```
### GitLab CI
```yaml
# .gitlab-ci.yml
verify-attestation:
stage: verify
image: gcr.io/projectsigstore/cosign:latest
script:
- cosign verify-attestation
--type stella.ops/provcache@v1
--certificate-identity-regexp '.*@stellaops\.example\.com'
--certificate-oidc-issuer https://auth.stellaops.example.com
${CI_REGISTRY_IMAGE}:${CI_COMMIT_TAG}
rules:
- if: $CI_COMMIT_TAG
```
---
## Troubleshooting
### No Attestation Found
```bash
# List all attestations on image
cosign tree registry.example.com/app:v1.2.3
# Check if attestation was pushed
crane manifest registry.example.com/app:sha256-<digest>.att
```
### Signature Verification Failed
```bash
# Check certificate details
cosign verify-attestation \
--type stella.ops/provcache@v1 \
--output text \
registry.example.com/app:v1.2.3 2>&1 | grep -A5 "Certificate"
# Verify with verbose output
COSIGN_EXPERIMENTAL=1 cosign verify-attestation \
--type stella.ops/provcache@v1 \
registry.example.com/app:v1.2.3 -v
```
### Attestation Expired
```bash
# Check expiry timestamp
cosign verify-attestation \
--type stella.ops/provcache@v1 \
--certificate-identity-regexp '.*@stellaops\.example\.com' \
--certificate-oidc-issuer https://auth.stellaops.example.com \
registry.example.com/app:v1.2.3 | \
jq -r '.payload' | base64 -d | jq '.predicate.decision.expiresAt'
```
### Trust Score Below Threshold
```bash
# Check trust score breakdown
stella verify attestation \
--image registry.example.com/app:v1.2.3 \
--type provcache \
--output json | jq '.predicate.decision.trustScore'
# If score is low, check individual components:
# - SBOM completeness
# - VEX coverage
# - Reachability analysis
# - Policy freshness
# - Signer trust
```
---
## Security Considerations
### Key Management
- **Fulcio** — Ephemeral certificates tied to OIDC identity; recommended for public workflows
- **Enterprise CA** — Long-lived certificates for air-gapped environments
- **Self-signed** — Only for development/testing; not recommended for production
### Attestation Integrity
- Attestations are signed at push time
- Signature covers the entire predicate payload
- Modifying any field invalidates the signature
### Expiry Handling
- Attestations have `expiresAt` timestamps
- Expired attestations should be rejected by admission controllers
- Consider re-scanning images before deployment to get fresh attestations
### Verdict Reconciliation
- Verdicts in attestation reflect state at push time
- New vulnerabilities discovered after push won't appear
- Use `stella verify attestation --check-freshness` to compare against current feeds
---
## Related Documentation
- [Provcache Module README](./README.md) — Core concepts
- [Provcache Metrics and Alerting](./metrics-alerting.md) — Observability
- [Signer Module](../signer/architecture.md) — Signing infrastructure
- [Attestor Module](../attestor/architecture.md) — Attestation generation
- [OCI Artifact Spec](https://github.com/opencontainers/image-spec) — OCI standards
- [In-toto Attestation Spec](https://github.com/in-toto/attestation) — Attestation format
- [Sigstore Documentation](https://docs.sigstore.dev/) — Cosign and Fulcio

# Reachability Module Architecture
## Overview
The **Reachability** module provides a unified hybrid reachability analysis system that combines static call-graph analysis with runtime execution evidence to determine whether vulnerable code paths are actually exploitable in a given artifact. It serves as the **evidence backbone** for VEX (Vulnerability Exploitability eXchange) verdicts.
## Problem Statement
Vulnerability scanners generate excessive false positives:
- **Static analysis** over-approximates: flags code that is dead, feature-gated, or unreachable
- **Runtime analysis** under-approximates: misses rarely-executed but exploitable paths
- **No unified view** across static and runtime evidence sources
- **Symbol mismatch** between static extraction (Roslyn, ASM) and runtime observation (ETW, eBPF)
### Before Reachability Module
| Question | Answer Method | Limitation |
|----------|---------------|------------|
| Is CVE reachable statically? | Query ReachGraph | No runtime context |
| Was CVE executed at runtime? | Query Signals runtime facts | No static context |
| Should we mark CVE as NA? | Manual analysis | No evidence, no audit trail |
| What's the confidence? | Guesswork | No formal model |
### After Reachability Module
Single `IReachabilityIndex.QueryHybridAsync()` call returns:
- Lattice state (8-level certainty model)
- Confidence score (0.0-1.0)
- Evidence URIs (auditable, reproducible)
- Recommended VEX status + justification
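A hedged Python model of that result shape is sketched below. The real types are the C# records listed under Module Location, and the VEX status and justification strings follow OpenVEX conventions as an assumption:

```python
from dataclasses import dataclass, field

@dataclass
class HybridReachabilityResult:
    """Illustrative shape of a hybrid query result."""
    lattice_state: str            # one of the 8 lattice codes: U, SR, SU, RO, RU, CR, CU, X
    confidence: float             # 0.0 - 1.0, evidence-weighted
    evidence_uris: list = field(default_factory=list)   # auditable stella:// refs
    vex_status: str = "under_investigation"             # recommended VEX status
    justification: str = ""

result = HybridReachabilityResult(
    lattice_state="CU",
    confidence=0.95,
    evidence_uris=["stella://reachgraph/blake3:abc123",
                   "stella://signals/runtime/acme/sha256:abc"],
    vex_status="not_affected",
    justification="vulnerable_code_not_in_execute_path",
)
print(result.lattice_state, result.confidence)
```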
---
## Module Location
```
src/__Libraries/StellaOps.Reachability.Core/
├── IReachabilityIndex.cs # Main facade interface
├── ReachabilityIndex.cs # Implementation
├── HybridQueryOptions.cs # Query configuration
├── SymbolRef.cs # Symbol reference
├── StaticReachabilityResult.cs # Static query result
├── RuntimeReachabilityResult.cs # Runtime query result
├── HybridReachabilityResult.cs # Combined result
├── LatticeState.cs # 8-state lattice enum
├── ReachabilityLattice.cs # Lattice state machine
├── ConfidenceCalculator.cs # Evidence-weighted confidence
├── EvidenceUriBuilder.cs # stella:// URI construction
├── IReachGraphAdapter.cs # ReachGraph integration interface
├── ISignalsAdapter.cs # Signals integration interface
├── ServiceCollectionExtensions.cs # DI registration
├── Symbols/
│ ├── ISymbolCanonicalizer.cs # Symbol normalization interface
│ ├── SymbolCanonicalizer.cs # Implementation
│ ├── ISymbolNormalizer.cs # Normalizer interface
│ ├── CanonicalSymbol.cs # Canonicalized symbol
│ ├── RawSymbol.cs # Raw input symbol
│ ├── SymbolMatchResult.cs # Match result
│ ├── SymbolMatchOptions.cs # Matching configuration
│ ├── SymbolMatcher.cs # Symbol matching logic
│ ├── SymbolSource.cs # Source enum
│ ├── ProgrammingLanguage.cs # Language enum
│ ├── DotNetSymbolNormalizer.cs # .NET symbols
│ ├── JavaSymbolNormalizer.cs # Java symbols
│ ├── NativeSymbolNormalizer.cs # C/C++/Rust
│ └── ScriptSymbolNormalizer.cs # JS/Python/PHP
└── CveMapping/
├── ICveSymbolMappingService.cs # CVE-symbol mapping interface
├── CveSymbolMappingService.cs # Implementation
├── CveSymbolMapping.cs # Mapping record
├── VulnerableSymbol.cs # Vulnerable symbol record
├── MappingSource.cs # Source enum
├── VulnerabilityType.cs # Vulnerability type enum
├── PatchAnalysisResult.cs # Patch analysis result
├── IPatchSymbolExtractor.cs # Patch analysis interface
├── IOsvEnricher.cs # OSV enricher interface
├── GitDiffExtractor.cs # Git diff parsing
├── UnifiedDiffParser.cs # Unified diff format parser
├── FunctionBoundaryDetector.cs # Function boundary detection
└── OsvEnricher.cs # OSV API enrichment
```
---
## Core Concepts
### 1. Reachability Lattice (8-State Model)
The lattice provides mathematically sound evidence aggregation:
```
X (Contested)
/ \
/ \
CR (Confirmed CU (Confirmed
Reachable) Unreachable)
| \ / |
| \ / |
RO (Runtime RU (Runtime
Observed) Unobserved)
| |
| |
SR (Static SU (Static
Reachable) Unreachable)
\ /
\ /
U (Unknown)
```
| State | Code | Description | Confidence Base |
|-------|------|-------------|-----------------|
| Unknown | U | No analysis performed | 0.00 |
| Static Reachable | SR | Call graph shows path exists | 0.30 |
| Static Unreachable | SU | Call graph proves no path | 0.40 |
| Runtime Observed | RO | Symbol executed at runtime | 0.70 |
| Runtime Unobserved | RU | Observation window passed, no execution | 0.60 |
| Confirmed Reachable | CR | Multiple sources confirm reachability | 0.90 |
| Confirmed Unreachable | CU | Multiple sources confirm no reachability | 0.95 |
| Contested | X | Evidence conflict | 0.20 (requires review) |
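The following Python sketch illustrates one plausible way static and runtime verdicts combine into these states. The actual `ReachabilityLattice` transition rules are not reproduced here, so treat this mapping (in particular, which disagreements escalate to Contested) as an assumption:

```python
# Base confidence per lattice state, taken from the table above.
BASE_CONFIDENCE = {
    "U": 0.00, "SR": 0.30, "SU": 0.40, "RO": 0.70,
    "RU": 0.60, "CR": 0.90, "CU": 0.95, "X": 0.20,
}

def join(static, runtime):
    """Combine a static verdict ('reachable'/'unreachable'/None) with a
    runtime verdict ('observed'/'unobserved'/None) into a lattice state."""
    if static is None and runtime is None:
        return "U"
    if runtime is None:
        return "SR" if static == "reachable" else "SU"
    if static is None:
        return "RO" if runtime == "observed" else "RU"
    if static == "reachable" and runtime == "observed":
        return "CR"       # multiple sources confirm reachability
    if static == "unreachable" and runtime == "unobserved":
        return "CU"       # multiple sources confirm no reachability
    if static == "unreachable" and runtime == "observed":
        return "X"        # static proof contradicted by execution evidence
    return "RU"           # statically reachable but never executed in the window

state = join("unreachable", "unobserved")
print(state, BASE_CONFIDENCE[state])  # CU 0.95
```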
### 2. Symbol Canonicalization
Symbols from different sources must be normalized to enable matching:
| Source | Raw Format | Canonical Format |
|--------|-----------|------------------|
| Roslyn (.NET) | `StellaOps.Scanner.Core.SbomGenerator::GenerateAsync` | `stellaops.scanner.core/sbomgenerator/generateasync/(cancellationtoken)` |
| ASM (Java) | `org/apache/log4j/core/lookup/JndiLookup.lookup(Ljava/lang/String;)Ljava/lang/String;` | `org.apache.log4j.core.lookup/jndilookup/lookup/(string)` |
| eBPF (Native) | `_ZN4llvm12DenseMapBaseINS_...` | `llvm/densemapbase/operator[]/(keytype)` |
| ETW (.NET) | `MethodID=12345 ModuleID=67890` | (resolved via metadata) |
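The .NET row of the table implies a normalization rule that can be sketched as follows. This is a hypothetical reconstruction: namespace, type, method, and parameter list become lowercase `/`-separated segments, with the parameter list supplied separately (from metadata), since the raw Roslyn name omits it.

```python
# Hypothetical sketch of the .NET normalization rule implied by the table.
def canonicalize_dotnet(raw: str, params: list[str]) -> str:
    type_path, method = raw.split("::")
    namespace, _, type_name = type_path.rpartition(".")
    param_part = "(" + ",".join(p.lower() for p in params) + ")"
    return "/".join([namespace.lower(), type_name.lower(), method.lower(), param_part])
```

The Java and native normalizers follow the same shape but parse JVM descriptors and demangled C++ names respectively.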
### 3. CVE-Symbol Mapping
Maps CVE identifiers to specific vulnerable symbols:
```json
{
"cveId": "CVE-2021-44228",
"symbols": [
{
"canonicalId": "sha256:abc123...",
"displayName": "org.apache.log4j.core.lookup/jndilookup/lookup/(string)",
"type": "Sink",
"condition": "When lookup string contains ${jndi:...}"
}
],
"source": "PatchAnalysis",
"confidence": 0.98,
"patchCommitUrl": "https://github.com/apache/logging-log4j2/commit/abc123"
}
```
### 4. Evidence URIs
Standardized `stella://` URI scheme for evidence references:
| Pattern | Example |
|---------|---------|
| `stella://reachgraph/{digest}` | `stella://reachgraph/blake3:abc123` |
| `stella://reachgraph/{digest}/slice?symbol={id}` | `stella://reachgraph/blake3:abc123/slice?symbol=sha256:def` |
| `stella://signals/runtime/{tenant}/{artifact}` | `stella://signals/runtime/acme/sha256:abc` |
| `stella://cvemap/{cveId}` | `stella://cvemap/CVE-2021-44228` |
| `stella://attestation/{digest}` | `stella://attestation/sha256:sig789` |
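A minimal helper for composing one of the URI patterns above might look like this (a sketch; the production builder presumably validates digests and escapes all components):

```python
# Sketch: build a reachgraph slice URI per the pattern table.
# Colons are kept unescaped so digest prefixes like "blake3:" stay readable.
from urllib.parse import quote

def reachgraph_slice_uri(digest: str, symbol_id: str) -> str:
    return (
        f"stella://reachgraph/{quote(digest, safe=':')}"
        f"/slice?symbol={quote(symbol_id, safe=':')}"
    )
```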
---
## Architecture Diagram
```
┌─────────────────────────────────────────────────────────────────────────────────┐
│ Reachability Core Library │
├─────────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────────────────────────────────────────────────────────────────┐ │
│ │ IReachabilityIndex │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────────────────┐ │ │
│ │ │ QueryStaticAsync │ │ QueryRuntimeAsync│ │ QueryHybridAsync │ │ │
│ │ └────────┬────────┘ └────────┬────────┘ └────────────┬───────────────┘ │ │
│ └───────────┼────────────────────┼─────────────────────────┼────────────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐│
│ │ Internal Components ││
│ │ ││
│ │ ┌────────────────┐ ┌────────────────┐ ┌────────────────────────────┐ ││
│ │ │ Symbol │ │ CVE-Symbol │ │ Reachability │ ││
│ │ │ Canonicalizer │ │ Mapping │ │ Lattice │ ││
│ │ │ │ │ │ │ │ ││
│ │ │ ┌────────────┐ │ │ ┌────────────┐ │ │ ┌───────────────────────┐ │ ││
│ │ │ │.NET Norm. │ │ │ │PatchExtract│ │ │ │ State Machine │ │ ││
│ │ │ │Java Norm. │ │ │ │OSV Enrich │ │ │ │ Confidence Calc │ │ ││
│ │ │ │Native Norm.│ │ │ │DeltaSig │ │ │ │ Transition Rules │ │ ││
│ │ │ │Script Norm.│ │ │ │Manual Input│ │ │ └───────────────────────┘ │ ││
│ │ │ └────────────┘ │ │ └────────────┘ │ │ │ ││
│ │ └────────────────┘ └────────────────┘ └────────────────────────────┘ ││
│ │ ││
│ └──────────────────────────────────────────────────────────────────────────────┘│
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────────────────┐│
│ │ Evidence Layer ││
│ │ ││
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────────────────┐ ││
│ │ │ Evidence URI │ │ Evidence Bundle │ │ Evidence Attestation │ ││
│ │ │ Builder │ │ (Collection) │ │ Service (DSSE) │ ││
│ │ └─────────────────┘ └─────────────────┘ └─────────────────────────────┘ ││
│ │ ││
│ └──────────────────────────────────────────────────────────────────────────────┘│
│ │
└──────────────────────────────────────────────────────────────────────────────────┘
┌────────────────────────┼────────────────────────┐
│ │ │
▼ ▼ ▼
┌────────────────┐ ┌────────────────┐ ┌────────────────┐
│ ReachGraph │ │ Signals │ │ Policy Engine │
│ Adapter │ │ Adapter │ │ Adapter │
└───────┬────────┘ └───────┬────────┘ └───────┬────────┘
│ │ │
▼ ▼ ▼
┌────────────────┐ ┌────────────────┐ ┌────────────────┐
│ ReachGraph │ │ Signals │ │ Policy Engine │
│ WebService │ │ WebService │ │ (VEX Emit) │
└────────────────┘ └────────────────┘ └────────────────┘
```
---
## Data Flow
### Query Flow
```
1. Consumer calls IReachabilityIndex.QueryHybridAsync(symbol, artifact, options)
2. SymbolCanonicalizer normalizes input symbol to CanonicalSymbol
3. Parallel queries:
├── ReachGraphAdapter.QueryAsync() → StaticReachabilityResult
└── SignalsAdapter.QueryRuntimeFactsAsync() → RuntimeReachabilityResult
4. ReachabilityLattice computes combined state from evidence
5. ConfidenceCalculator applies evidence weights and guardrails
6. EvidenceBundle collects URIs for audit trail
7. Return HybridReachabilityResult with verdict recommendation
```
### Ingestion Flow (CVE Mapping)
```
1. Patch commit detected (Concelier, Feedser, or manual)
2. GitDiffExtractor parses diff to find changed functions
3. SymbolCanonicalizer normalizes extracted symbols
4. OsvEnricher adds context from OSV database
5. CveSymbolMappingService persists mapping with provenance
6. Mapping available for reachability queries
```
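Step 2 relies on a property of git-style unified diffs: each hunk header carries the enclosing function context after the second `@@`, which is enough to seed candidate vulnerable symbols. An illustrative sketch (not the actual `GitDiffExtractor`):

```python
# Sketch: pull function context strings out of unified-diff hunk headers.
import re

HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@ ?(.*)$")

def changed_function_contexts(diff_text: str) -> list[str]:
    contexts = []
    for line in diff_text.splitlines():
        m = HUNK.match(line)
        if m and m.group(1):  # skip hunks with no recorded context
            contexts.append(m.group(1))
    return contexts
```

The extracted context strings then go through symbol canonicalization (step 3) before OSV enrichment.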
---
## API Contracts
### IReachabilityIndex
```csharp
public interface IReachabilityIndex
{
/// <summary>
/// Query static reachability from call graph.
/// </summary>
Task<StaticReachabilityResult> QueryStaticAsync(
SymbolRef symbol,
string artifactDigest,
CancellationToken ct);
/// <summary>
/// Query runtime reachability from observed facts.
/// </summary>
Task<RuntimeReachabilityResult> QueryRuntimeAsync(
SymbolRef symbol,
string artifactDigest,
TimeSpan observationWindow,
CancellationToken ct);
/// <summary>
/// Query hybrid reachability combining static + runtime.
/// </summary>
Task<HybridReachabilityResult> QueryHybridAsync(
SymbolRef symbol,
string artifactDigest,
HybridQueryOptions options,
CancellationToken ct);
/// <summary>
/// Batch query for CVE vulnerability analysis.
/// </summary>
Task<IReadOnlyList<HybridReachabilityResult>> QueryBatchAsync(
IEnumerable<SymbolRef> symbols,
string artifactDigest,
HybridQueryOptions options,
CancellationToken ct);
/// <summary>
/// Get vulnerable symbols for a CVE.
/// </summary>
Task<CveSymbolMapping?> GetCveMappingAsync(
string cveId,
CancellationToken ct);
}
```
### Result Types
```csharp
public sealed record HybridReachabilityResult
{
public required SymbolRef Symbol { get; init; }
public required string ArtifactDigest { get; init; }
public required LatticeState LatticeState { get; init; }
public required double Confidence { get; init; }
public required StaticEvidence? StaticEvidence { get; init; }
public required RuntimeEvidence? RuntimeEvidence { get; init; }
public required VerdictRecommendation Verdict { get; init; }
public required ImmutableArray<string> EvidenceUris { get; init; }
public required DateTimeOffset ComputedAt { get; init; }
public required string ComputedBy { get; init; }
}
public sealed record VerdictRecommendation
{
public required VexStatus Status { get; init; }
public VexJustification? Justification { get; init; }
public required ConfidenceBucket ConfidenceBucket { get; init; }
public string? ImpactStatement { get; init; }
public string? ActionStatement { get; init; }
}
public enum LatticeState
{
Unknown = 0,
StaticReachable = 1,
StaticUnreachable = 2,
RuntimeObserved = 3,
RuntimeUnobserved = 4,
ConfirmedReachable = 5,
ConfirmedUnreachable = 6,
Contested = 7
}
```
---
## Integration Points
### Upstream (Data Sources)
| Module | Interface | Data |
|--------|-----------|------|
| ReachGraph | `IReachGraphSliceService` | Static call-graph nodes/edges |
| Signals | `IRuntimeFactsService` | Runtime method observations |
| Scanner.CallGraph | `ICallGraphExtractor` | Per-artifact call graphs |
| Feedser | `IBackportProofService` | Patch analysis results |
### Downstream (Consumers)
| Module | Interface | Usage |
|--------|-----------|-------|
| Policy Engine | `IReachabilityAwareVexEmitter` | VEX verdict with evidence |
| VexLens | `IReachabilityIndex` | Consensus enrichment |
| Web Console | REST API | Evidence panel display |
| CLI | `stella reachability` | Command-line queries |
| ExportCenter | `IReachabilityExporter` | Offline bundles |
---
## Storage
### PostgreSQL Schema
```sql
-- CVE-Symbol Mappings
CREATE TABLE reachability.cve_symbol_mappings (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL,
cve_id TEXT NOT NULL,
symbol_canonical_id TEXT NOT NULL,
symbol_display_name TEXT NOT NULL,
vulnerability_type TEXT NOT NULL,
condition TEXT,
source TEXT NOT NULL,
confidence DECIMAL(3,2) NOT NULL,
patch_commit_url TEXT,
delta_sig_digest TEXT,
extracted_at TIMESTAMPTZ NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
UNIQUE (tenant_id, cve_id, symbol_canonical_id)
);
-- Query Cache
CREATE TABLE reachability.query_cache (
cache_key TEXT PRIMARY KEY,
artifact_digest TEXT NOT NULL,
symbol_canonical_id TEXT NOT NULL,
lattice_state INTEGER NOT NULL,
confidence DECIMAL(3,2) NOT NULL,
result_json JSONB NOT NULL,
computed_at TIMESTAMPTZ NOT NULL,
expires_at TIMESTAMPTZ NOT NULL
);
-- Audit Log
CREATE TABLE reachability.query_audit_log (
id BIGSERIAL PRIMARY KEY,
tenant_id UUID NOT NULL,
query_type TEXT NOT NULL,
artifact_digest TEXT NOT NULL,
symbol_count INTEGER NOT NULL,
lattice_state INTEGER NOT NULL,
confidence DECIMAL(3,2) NOT NULL,
duration_ms INTEGER NOT NULL,
queried_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
```
### Valkey (Redis) Caching
| Key Pattern | TTL | Purpose |
|-------------|-----|---------|
| `reach:static:{artifact}:{symbol}` | 1h | Static query cache |
| `reach:runtime:{artifact}:{symbol}` | 5m | Runtime query cache |
| `reach:hybrid:{artifact}:{symbol}:{options_hash}` | 15m | Hybrid query cache |
| `cvemap:{cve_id}` | 24h | CVE mapping cache |
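Combining the key patterns above with the tenant-prefix rule from the Security section, key composition for the hybrid cache might be sketched as below. The options hash is an assumption here: a truncated SHA-256 over canonically serialized options.

```python
# Sketch: tenant-prefixed hybrid cache key with a deterministic options hash.
import hashlib
import json

def hybrid_cache_key(tenant: str, artifact: str, symbol: str, options: dict) -> str:
    # sort_keys makes the hash independent of dict insertion order
    opts = hashlib.sha256(
        json.dumps(options, sort_keys=True, separators=(",", ":")).encode("utf-8")
    ).hexdigest()[:16]
    return f"{tenant}:reach:hybrid:{artifact}:{symbol}:{opts}"
```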
---
## Determinism Guarantees
### Reproducibility Rules
1. **Canonical Symbol IDs:** SHA-256 of `purl|namespace|type|method|signature` (lowercase, sorted)
2. **Stable Lattice Transitions:** Deterministic state machine, no randomness
3. **Ordered Evidence:** Evidence URIs sorted lexicographically
4. **Time Injection:** All `ComputedAt` via `TimeProvider`
5. **Culture Invariance:** `InvariantCulture` for all string operations
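Rule 1 can be sketched as follows; the exact field serialization is an assumption, but the properties that matter (lowercasing before hashing, fixed `|` delimiter, SHA-256 digest) follow from the rule as stated:

```python
# Sketch of rule 1: canonical symbol ID as SHA-256 over lowercase '|'-joined fields.
import hashlib

def canonical_symbol_id(purl: str, namespace: str,
                        type_: str, method: str, signature: str) -> str:
    payload = "|".join(s.lower() for s in (purl, namespace, type_, method, signature))
    return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because every field is lowercased before hashing, symbols that differ only in casing across source tools map to the same ID.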
### Replay Verification
```csharp
public interface IReachabilityReplayService
{
Task<ReplayResult> ReplayAsync(
HybridReachabilityInputs inputs,
HybridReachabilityResult expected,
CancellationToken ct);
}
```
---
## Performance Characteristics
| Operation | Target P95 | Notes |
|-----------|-----------|-------|
| Static query (cached) | <10ms | Valkey hit |
| Static query (uncached) | <100ms | ReachGraph slice |
| Runtime query (cached) | <5ms | Valkey hit |
| Runtime query (uncached) | <50ms | Signals lookup |
| Hybrid query | <50ms | Parallel static + runtime |
| Batch query (100 symbols) | <500ms | Parallelized |
| CVE mapping lookup | <10ms | Cached |
| Symbol canonicalization | <1ms | In-memory |
---
## Security Considerations
### Access Control
| Operation | Required Scope |
|-----------|---------------|
| Query reachability | `reachability:read` |
| Ingest CVE mapping | `reachability:write` |
| Admin CVE mapping | `reachability:admin` |
| Export bundles | `reachability:export` |
### Tenant Isolation
- All queries filtered by `tenant_id`
- RLS policies on all tables
- Cache keys include tenant prefix
### Data Sensitivity
- Symbol names may reveal internal architecture
- Runtime traces expose execution patterns
- CVE mappings are security-sensitive
---
## Observability
### Metrics
| Metric | Type | Labels |
|--------|------|--------|
| `reachability_query_duration_seconds` | histogram | query_type, cache_hit |
| `reachability_lattice_state_total` | counter | state |
| `reachability_cache_hit_ratio` | gauge | cache_type |
| `reachability_cvemap_count` | gauge | source |
### Traces
| Span | Description |
|------|-------------|
| `reachability.query.static` | Static graph query |
| `reachability.query.runtime` | Runtime facts query |
| `reachability.query.hybrid` | Combined computation |
| `reachability.canonicalize` | Symbol normalization |
| `reachability.lattice.compute` | State calculation |
---
## Related Documentation
- [Product Advisory: Hybrid Reachability](../../product/advisories/09-Jan-2026%20-%20Hybrid%20Reachability%20and%20VEX%20Integration%20(Revised).md)
- [ReachGraph Architecture](../reach-graph/architecture.md)
- [Signals Architecture](../signals/architecture.md)
- [VexLens Architecture](../vex-lens/architecture.md)
- [Sprint Index](../../implplan/SPRINT_20260109_009_000_INDEX_hybrid_reachability.md)
---
_Last updated: 10-Jan-2026_

# SARIF Export Module Architecture
> **Implementation Status:**
> - SARIF 2.1.0 Models: **Implemented** (`src/Scanner/__Libraries/StellaOps.Scanner.Sarif/`)
> - Export Service: **Implemented**
> - SmartDiff Integration: **Implemented**
> - Fingerprint Generator: **Implemented**
> - GitHub Upload Client: **Planned**
>
> There is no standalone `src/SarifExport/` module; SARIF export is a capability within the Scanner module.
## Overview
The **SARIF Export** module provides SARIF 2.1.0 compliant output for StellaOps Scanner findings, enabling integration with GitHub Code Scanning, GitLab SAST, Azure DevOps, and other platforms that consume SARIF.
## Current State
| Component | Status | Location |
|-----------|--------|----------|
| SARIF 2.1.0 Models | **Implemented** | `Scanner.Sarif/Models/SarifModels.cs` |
| SmartDiff SARIF Generator | **Implemented** | `Scanner.SmartDiff/Output/SarifOutputGenerator.cs` |
| SmartDiff SARIF Endpoint | **Implemented** | `GET /smart-diff/scans/{scanId}/sarif` |
| Findings SARIF Mapper | **Implemented** | `Scanner.Sarif/SarifExportService.cs` |
| SARIF Rule Registry | **Implemented** | `Scanner.Sarif/Rules/SarifRuleRegistry.cs` |
| Fingerprint Generator | **Implemented** | `Scanner.Sarif/Fingerprints/FingerprintGenerator.cs` |
| GitHub Upload Client | **Not Implemented** | Proposed |
---
## Module Location
```
src/Scanner/__Libraries/StellaOps.Scanner.Sarif/
├── ISarifExportService.cs # Main export interface
├── SarifExportService.cs # Implementation (DONE)
├── SarifExportOptions.cs # Configuration (DONE)
├── FindingInput.cs # Input model (DONE)
├── Models/
│ └── SarifModels.cs # Complete SARIF 2.1.0 types (DONE)
├── Rules/
│ ├── ISarifRuleRegistry.cs # Rule registry interface (DONE)
│ └── SarifRuleRegistry.cs # 21 rules implemented (DONE)
└── Fingerprints/
├── IFingerprintGenerator.cs # Fingerprint interface (DONE)
└── FingerprintGenerator.cs # SHA-256 fingerprints (DONE)
```
---
## Existing SmartDiff SARIF Implementation
The SmartDiff module provides a reference implementation:
### SarifModels.cs (Existing)
```csharp
// Already implemented record types
public sealed record SarifLog(
string Version,
string Schema,
ImmutableArray<SarifRun> Runs);
public sealed record SarifRun(
SarifTool Tool,
ImmutableArray<SarifResult> Results,
ImmutableArray<SarifArtifact> Artifacts,
ImmutableArray<SarifVersionControlDetails> VersionControlProvenance,
ImmutableDictionary<string, object> Properties);
public sealed record SarifResult(
string RuleId,
int? RuleIndex,
SarifLevel Level,
SarifMessage Message,
ImmutableArray<SarifLocation> Locations,
ImmutableDictionary<string, string> Fingerprints,
ImmutableDictionary<string, string> PartialFingerprints,
ImmutableDictionary<string, object> Properties);
```
### SarifOutputGenerator.cs (Existing)
```csharp
// Existing generator for SmartDiff findings
public class SarifOutputGenerator
{
public SarifLog Generate(
IEnumerable<MaterialRiskChangeResult> changes,
SarifOutputOptions options);
}
```
---
## New Findings SARIF Architecture
### ISarifExportService
```csharp
namespace StellaOps.Scanner.Sarif;
/// <summary>
/// Service for exporting scanner findings to SARIF format.
/// </summary>
public interface ISarifExportService
{
/// <summary>
/// Export findings to SARIF 2.1.0 format.
/// </summary>
/// <param name="findings">Scanner findings to export.</param>
/// <param name="options">Export options.</param>
/// <param name="ct">Cancellation token.</param>
/// <returns>SARIF log document.</returns>
Task<SarifLog> ExportAsync(
IEnumerable<Finding> findings,
SarifExportOptions options,
CancellationToken ct);
/// <summary>
/// Export findings to SARIF JSON string.
/// </summary>
Task<string> ExportToJsonAsync(
IEnumerable<Finding> findings,
SarifExportOptions options,
CancellationToken ct);
/// <summary>
/// Export findings to SARIF JSON stream.
/// </summary>
Task ExportToStreamAsync(
IEnumerable<Finding> findings,
SarifExportOptions options,
Stream outputStream,
CancellationToken ct);
/// <summary>
/// Validate SARIF output against schema.
/// </summary>
Task<SarifValidationResult> ValidateAsync(
SarifLog log,
CancellationToken ct);
}
```
### SarifExportOptions
```csharp
namespace StellaOps.Scanner.Sarif;
/// <summary>
/// Options for SARIF export.
/// </summary>
public sealed record SarifExportOptions
{
/// <summary>Tool name in SARIF output.</summary>
public string ToolName { get; init; } = "StellaOps Scanner";
/// <summary>Tool version.</summary>
public required string ToolVersion { get; init; }
/// <summary>Tool information URI.</summary>
public string ToolUri { get; init; } = "https://stellaops.io/scanner";
/// <summary>Minimum severity to include.</summary>
public Severity? MinimumSeverity { get; init; }
/// <summary>Include reachability evidence in properties.</summary>
public bool IncludeReachability { get; init; } = true;
/// <summary>Include VEX status in properties.</summary>
public bool IncludeVexStatus { get; init; } = true;
/// <summary>Include EPSS scores in properties.</summary>
public bool IncludeEpss { get; init; } = true;
/// <summary>Include KEV status in properties.</summary>
public bool IncludeKev { get; init; } = true;
/// <summary>Include evidence URIs in properties.</summary>
public bool IncludeEvidenceUris { get; init; } = false;
/// <summary>Include attestation reference in run properties.</summary>
public bool IncludeAttestation { get; init; } = true;
/// <summary>Version control provenance.</summary>
public VersionControlInfo? VersionControl { get; init; }
/// <summary>Pretty-print JSON output.</summary>
public bool IndentedJson { get; init; } = false;
/// <summary>Category for GitHub upload (distinguishes multiple tools).</summary>
public string? Category { get; init; }
/// <summary>Base URI for source files.</summary>
public string? SourceRoot { get; init; }
}
public sealed record VersionControlInfo
{
public required string RepositoryUri { get; init; }
public required string RevisionId { get; init; }
public string? Branch { get; init; }
}
```
---
## Rule Registry
### ISarifRuleRegistry
```csharp
namespace StellaOps.Scanner.Sarif.Rules;
/// <summary>
/// Registry of SARIF rules for StellaOps findings.
/// </summary>
public interface ISarifRuleRegistry
{
/// <summary>Get rule by ID.</summary>
SarifRule? GetRule(string ruleId);
/// <summary>Get rule for finding type and severity.</summary>
SarifRule GetRuleForFinding(FindingType type, Severity severity);
/// <summary>Get all registered rules.</summary>
IReadOnlyList<SarifRule> GetAllRules();
/// <summary>Get rules by category.</summary>
IReadOnlyList<SarifRule> GetRulesByCategory(string category);
}
```
### Rule Definitions
```csharp
namespace StellaOps.Scanner.Sarif.Rules;
public static class VulnerabilityRules
{
public static readonly SarifRule Critical = new()
{
Id = "STELLA-VULN-001",
Name = "CriticalVulnerability",
ShortDescription = "Critical vulnerability detected (CVSS >= 9.0)",
FullDescription = "A critical severity vulnerability was detected. " +
"This may be a known exploited vulnerability (KEV) or " +
"have a CVSS score of 9.0 or higher.",
HelpUri = "https://stellaops.io/rules/STELLA-VULN-001",
DefaultLevel = SarifLevel.Error,
Properties = new Dictionary<string, object>
{
["precision"] = "high",
["problem.severity"] = "error",
["security-severity"] = "10.0",
["tags"] = new[] { "security", "vulnerability", "critical" }
}.ToImmutableDictionary()
};
public static readonly SarifRule High = new()
{
Id = "STELLA-VULN-002",
Name = "HighVulnerability",
ShortDescription = "High severity vulnerability detected (CVSS 7.0-8.9)",
FullDescription = "A high severity vulnerability was detected with " +
"CVSS score between 7.0 and 8.9.",
HelpUri = "https://stellaops.io/rules/STELLA-VULN-002",
DefaultLevel = SarifLevel.Error,
Properties = new Dictionary<string, object>
{
["precision"] = "high",
["problem.severity"] = "error",
["security-severity"] = "8.0",
["tags"] = new[] { "security", "vulnerability", "high" }
}.ToImmutableDictionary()
};
public static readonly SarifRule Medium = new()
{
Id = "STELLA-VULN-003",
Name = "MediumVulnerability",
ShortDescription = "Medium severity vulnerability detected (CVSS 4.0-6.9)",
HelpUri = "https://stellaops.io/rules/STELLA-VULN-003",
DefaultLevel = SarifLevel.Warning,
Properties = new Dictionary<string, object>
{
["precision"] = "high",
["problem.severity"] = "warning",
["security-severity"] = "5.5",
["tags"] = new[] { "security", "vulnerability", "medium" }
}.ToImmutableDictionary()
};
public static readonly SarifRule Low = new()
{
Id = "STELLA-VULN-004",
Name = "LowVulnerability",
ShortDescription = "Low severity vulnerability detected (CVSS < 4.0)",
HelpUri = "https://stellaops.io/rules/STELLA-VULN-004",
DefaultLevel = SarifLevel.Note,
Properties = new Dictionary<string, object>
{
["precision"] = "high",
["problem.severity"] = "note",
["security-severity"] = "2.0",
["tags"] = new[] { "security", "vulnerability", "low" }
}.ToImmutableDictionary()
};
// Reachability-enhanced rules
public static readonly SarifRule RuntimeReachable = new()
{
Id = "STELLA-VULN-005",
Name = "ReachableVulnerability",
ShortDescription = "Runtime-confirmed reachable vulnerability",
FullDescription = "A vulnerability with runtime-confirmed reachability. " +
"The vulnerable code path was observed during execution.",
HelpUri = "https://stellaops.io/rules/STELLA-VULN-005",
DefaultLevel = SarifLevel.Error,
Properties = new Dictionary<string, object>
{
["precision"] = "very-high",
["problem.severity"] = "error",
["security-severity"] = "9.5",
["tags"] = new[] { "security", "vulnerability", "reachable", "runtime" }
}.ToImmutableDictionary()
};
}
```
---
## Fingerprint Generation
### IFingerprintGenerator
```csharp
namespace StellaOps.Scanner.Sarif.Fingerprints;
/// <summary>
/// Generates deterministic fingerprints for SARIF deduplication.
/// </summary>
public interface IFingerprintGenerator
{
/// <summary>
/// Generate primary fingerprint for a finding.
/// </summary>
string GeneratePrimary(Finding finding, FingerprintStrategy strategy);
/// <summary>
/// Generate partial fingerprints for GitHub fallback.
/// </summary>
ImmutableDictionary<string, string> GeneratePartial(
Finding finding,
string? sourceContent);
}
public enum FingerprintStrategy
{
/// <summary>Hash of ruleId + purl + vulnId + artifactDigest.</summary>
Standard,
/// <summary>Hash including file location for source-level findings.</summary>
WithLocation,
/// <summary>Hash including content hash for maximum stability.</summary>
ContentBased
}
```
### Implementation
```csharp
public class FingerprintGenerator : IFingerprintGenerator
{
public string GeneratePrimary(Finding finding, FingerprintStrategy strategy)
{
var input = strategy switch
{
FingerprintStrategy.Standard => string.Join("|",
finding.RuleId,
finding.ComponentPurl,
finding.VulnerabilityId ?? "",
finding.ArtifactDigest),
FingerprintStrategy.WithLocation => string.Join("|",
finding.RuleId,
finding.ComponentPurl,
finding.VulnerabilityId ?? "",
finding.ArtifactDigest,
finding.FilePath ?? "",
finding.LineNumber?.ToString(CultureInfo.InvariantCulture) ?? ""),
FingerprintStrategy.ContentBased => string.Join("|",
finding.RuleId,
finding.ComponentPurl,
finding.VulnerabilityId ?? "",
finding.ContentHash ?? finding.ArtifactDigest),
_ => throw new ArgumentOutOfRangeException(nameof(strategy))
};
return ComputeSha256(input);
}
public ImmutableDictionary<string, string> GeneratePartial(
Finding finding,
string? sourceContent)
{
var partial = new Dictionary<string, string>();
// Line hash for GitHub deduplication
if (!string.IsNullOrEmpty(sourceContent) && finding.LineNumber.HasValue)
{
var lines = sourceContent.Split('\n');
if (finding.LineNumber.Value <= lines.Length)
{
var line = lines[finding.LineNumber.Value - 1];
partial["primaryLocationLineHash"] = ComputeSha256(line.Trim());
}
}
return partial.ToImmutableDictionary();
}
private static string ComputeSha256(string input)
{
var bytes = SHA256.HashData(Encoding.UTF8.GetBytes(input));
return Convert.ToHexString(bytes).ToLowerInvariant();
}
}
```
---
## Severity Mapping
```csharp
public static class SeverityMapper
{
public static SarifLevel MapToSarifLevel(Severity severity, bool isReachable = false)
{
        // Reachable vulnerabilities at medium severity or above escalate to error
if (isReachable && severity >= Severity.Medium)
return SarifLevel.Error;
return severity switch
{
Severity.Critical => SarifLevel.Error,
Severity.High => SarifLevel.Error,
Severity.Medium => SarifLevel.Warning,
Severity.Low => SarifLevel.Note,
Severity.Info => SarifLevel.Note,
_ => SarifLevel.None
};
}
public static double MapToSecuritySeverity(double cvssScore)
{
// GitHub uses security-severity for ordering
// Map CVSS 0-10 scale directly
return Math.Clamp(cvssScore, 0.0, 10.0);
}
}
```
---
## Determinism Requirements
Following CLAUDE.md rules:
1. **Canonical JSON:** RFC 8785 sorted keys, no nulls
2. **Stable Rule Ordering:** Rules sorted by ID
3. **Stable Result Ordering:** Results sorted by (ruleId, location, fingerprint)
4. **Time Injection:** Use `TimeProvider` for timestamps
5. **Culture Invariance:** `InvariantCulture` for all string operations
6. **Immutable Collections:** All outputs use `ImmutableArray`, `ImmutableDictionary`
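The effect of rules 1-3 on serialization can be sketched as below. `json.dumps` with sorted keys approximates RFC 8785 for the simple types involved; a full JCS implementation additionally normalizes number formatting.

```python
# Sketch: sorted-key, null-free canonical JSON (approximates RFC 8785).
import json

def canonical_json(obj):
    def strip_nulls(v):
        if isinstance(v, dict):
            return {k: strip_nulls(x) for k, x in v.items() if x is not None}
        if isinstance(v, list):
            return [strip_nulls(x) for x in v]
        return v
    return json.dumps(strip_nulls(obj), sort_keys=True, separators=(",", ":"))
```

Two exports of the same findings therefore produce byte-identical SARIF, which keeps fingerprints and diff tooling stable.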
---
## API Endpoints
### Scanner Export Endpoints
```csharp
public static class SarifExportEndpoints
{
public static void MapSarifEndpoints(this IEndpointRouteBuilder app)
{
var group = app.MapGroup("/v1/scans/{scanId}/exports")
.RequireAuthorization("scanner:read");
// SARIF export
group.MapGet("/sarif", ExportSarif)
.WithName("ExportScanSarif")
.Produces<string>(StatusCodes.Status200OK, "application/sarif+json");
// SARIF with options
group.MapPost("/sarif", ExportSarifWithOptions)
.WithName("ExportScanSarifWithOptions")
.Produces<string>(StatusCodes.Status200OK, "application/sarif+json");
}
    private static async Task<IResult> ExportSarif(
        Guid scanId,
        [FromQuery] string? minSeverity,
        ISarifExportService sarifService,
        IFindingsService findingsService,
        CancellationToken ct,
        [FromQuery] bool pretty = false,
        [FromQuery] bool includeReachability = true)
{
var findings = await findingsService.GetByScanIdAsync(scanId, ct);
var options = new SarifExportOptions
{
ToolVersion = GetToolVersion(),
MinimumSeverity = ParseSeverity(minSeverity),
IncludeReachability = includeReachability,
IndentedJson = pretty
};
var json = await sarifService.ExportToJsonAsync(findings, options, ct);
return Results.Content(json, "application/sarif+json");
}
}
```
---
## Integration with GitHub
See `src/Integrations/__Plugins/StellaOps.Integrations.Plugin.GitHubApp/` for GitHub connector.
A proposed GitHub Code Scanning client would extend the existing infrastructure:
```csharp
public interface IGitHubCodeScanningClient
{
/// <summary>Upload SARIF to GitHub Code Scanning.</summary>
Task<SarifUploadResult> UploadSarifAsync(
string owner,
string repo,
SarifUploadRequest request,
CancellationToken ct);
/// <summary>Get upload status.</summary>
Task<SarifUploadStatus> GetUploadStatusAsync(
string owner,
string repo,
string sarifId,
CancellationToken ct);
/// <summary>List code scanning alerts.</summary>
Task<IReadOnlyList<CodeScanningAlert>> ListAlertsAsync(
string owner,
string repo,
AlertFilter? filter,
CancellationToken ct);
}
```
---
## Performance Targets
| Operation | Target P95 | Notes |
|-----------|-----------|-------|
| Export 100 findings | < 100ms | In-memory |
| Export 10,000 findings | < 5s | Streaming |
| SARIF serialization | < 50ms/MB | RFC 8785 |
| Schema validation | < 200ms | JSON Schema |
| Fingerprint generation | < 1ms/finding | SHA-256 |
---
## Related Documentation
- [Product Advisory](../../product/advisories/09-Jan-2026%20-%20GitHub%20Code%20Scanning%20Integration%20(Revised).md)
- [SARIF 2.1.0 Specification](https://docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html)
- [GitHub SARIF Support](https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning)
- [Existing SmartDiff SARIF](../../../src/Scanner/__Libraries/StellaOps.Scanner.SmartDiff/Output/)
---
_Last updated: 09-Jan-2026_

# Evidence Panel Component
> **Sprint:** SPRINT_20260107_006_001_FE
> **Module:** Triage UI
> **Version:** 1.0.0
## Overview
The Evidence Panel provides a unified tabbed interface for viewing all evidence related to a security finding. It consolidates five categories of evidence:
1. **Provenance** - DSSE attestation chain, signer identity, Rekor transparency
2. **Reachability** - Code path analysis showing if vulnerability is reachable
3. **Diff** - Source code changes introducing the vulnerability
4. **Runtime** - Runtime telemetry and execution evidence
5. **Policy** - OPA/Rego policy decisions and lattice trace
## Architecture
```
┌──────────────────────────────────────────────────────────────────────┐
│ TabbedEvidencePanelComponent │
├──────────────────────────────────────────────────────────────────────┤
│ [Provenance] [Reachability] [Diff] [Runtime] [Policy] │
├──────────────────────────────────────────────────────────────────────┤
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Tab Content (lazy-loaded) │ │
│ │ │ │
│ │ ProvenanceTabComponent / ReachabilityTabComponent / etc. │ │
│ └──────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────┘
```
## Components
### TabbedEvidencePanelComponent
**Selector:** `app-tabbed-evidence-panel`
**Inputs:**
- `findingId: string` - The finding ID to load evidence for
**Outputs:**
- `tabChange: EventEmitter<EvidenceTabType>` - Emitted when tab changes
**Usage:**
```html
<app-tabbed-evidence-panel
[findingId]="selectedFindingId"
(tabChange)="onTabChange($event)"
/>
```
### ProvenanceTabComponent
Displays DSSE attestation information including:
- DSSE verification badge (verified/partial/missing)
- Attestation chain visualization (build → scan → triage → policy)
- Signer identity and key information
- Rekor log index with verification link
- Collapsible in-toto statement JSON
### DsseBadgeComponent
**Selector:** `app-dsse-badge`
Displays the DSSE verification status as a badge.
**Inputs:**
- `status: DsseBadgeStatus` - 'verified' | 'partial' | 'missing'
- `details?: DsseVerificationDetails` - Additional verification details
- `showTooltip?: boolean` - Show tooltip on hover (default: true)
- `animate?: boolean` - Enable hover animations (default: true)
**States:**
| State | Color | Icon | Meaning |
|-------|-------|------|---------|
| verified | Green | ✓ | Full DSSE chain verified |
| partial | Amber | ⚠ | Some attestations missing |
| missing | Red | ✗ | No valid attestation |
### AttestationChainComponent
**Selector:** `app-attestation-chain`
Visualizes the attestation chain as connected nodes.
**Inputs:**
- `nodes: AttestationChainNode[]` - Chain nodes to display
**Outputs:**
- `nodeClick: EventEmitter<AttestationChainNode>` - Emitted on node click
### PolicyTabComponent
Displays policy evaluation details including:
- Verdict badge (ALLOW/DENY/QUARANTINE/REVIEW)
- OPA/Rego rule path that matched
- K4 lattice merge trace visualization
- Counterfactual analysis ("What would change verdict?")
- Policy version and editor link
### ReachabilityTabComponent
Integrates the existing `ReachabilityContextComponent` with:
- Summary header with status badge
- Confidence percentage display
- Entry points list
- Link to full graph view
## Services
### EvidenceTabService
**Path:** `services/evidence-tab.service.ts`
Fetches evidence data for each tab with caching.
```typescript
interface EvidenceTabService {
getProvenanceEvidence(findingId: string, forceRefresh?: boolean): Observable<LoadState<ProvenanceEvidence>>;
getReachabilityEvidence(findingId: string, forceRefresh?: boolean): Observable<LoadState<ReachabilityData>>;
getDiffEvidence(findingId: string, forceRefresh?: boolean): Observable<LoadState<DiffEvidence>>;
getRuntimeEvidence(findingId: string, forceRefresh?: boolean): Observable<LoadState<RuntimeEvidence>>;
getPolicyEvidence(findingId: string, forceRefresh?: boolean): Observable<LoadState<PolicyEvidence>>;
clearCache(findingId?: string): void;
}
```
### TabUrlPersistenceService
**Path:** `services/tab-url-persistence.service.ts`
Manages URL query param persistence for selected tab.
```typescript
interface TabUrlPersistenceService {
readonly selectedTab$: Observable<EvidenceTabType>;
getCurrentTab(): EvidenceTabType;
setTab(tab: EvidenceTabType): void;
navigateToTab(tab: EvidenceTabType): void;
}
```
## Keyboard Shortcuts
| Key | Action |
|-----|--------|
| `1` | Go to Provenance tab |
| `2` | Go to Reachability tab |
| `3` | Go to Diff tab |
| `4` | Go to Runtime tab |
| `5` | Go to Policy tab |
| `→` | Next tab |
| `←` | Previous tab |
| `Home` | First tab |
| `End` | Last tab |
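The shortcut table above reduces to a pure key-to-tab mapping. A minimal sketch follows; the tab order and key bindings come from this document, while the function name and the choice to wrap arrow navigation at the ends are assumptions:

```typescript
// Illustrative keyboard-navigation reducer for the shortcut table above.
// Arrow-key wrap-around at the first/last tab is an assumed behavior.
const TAB_ORDER = ['provenance', 'reachability', 'diff', 'runtime', 'policy'] as const;
type EvidenceTabType = (typeof TAB_ORDER)[number];

function nextTabForKey(key: string, current: EvidenceTabType): EvidenceTabType {
  const index = TAB_ORDER.indexOf(current);
  switch (key) {
    case 'ArrowRight':
      return TAB_ORDER[(index + 1) % TAB_ORDER.length];
    case 'ArrowLeft':
      return TAB_ORDER[(index + TAB_ORDER.length - 1) % TAB_ORDER.length];
    case 'Home':
      return TAB_ORDER[0];
    case 'End':
      return TAB_ORDER[TAB_ORDER.length - 1];
    default: {
      // Digit keys 1-5 jump directly to a tab; anything else is a no-op.
      const digit = Number.parseInt(key, 10);
      return digit >= 1 && digit <= TAB_ORDER.length ? TAB_ORDER[digit - 1] : current;
    }
  }
}
```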
## URL Persistence
The selected tab is persisted in the URL query string:
```
/triage/findings/CVE-2024-1234?tab=provenance
/triage/findings/CVE-2024-1234?tab=reachability
/triage/findings/CVE-2024-1234?tab=diff
/triage/findings/CVE-2024-1234?tab=runtime
/triage/findings/CVE-2024-1234?tab=policy
```
This enables:
- Deep linking to specific evidence
- Browser history navigation
- Sharing links with colleagues
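Resolving the `tab` query parameter on load can be sketched as below. The tab names and the `?tab=` key come from this document; falling back to the Provenance tab for missing or unknown values is an assumption:

```typescript
// Sketch: resolve the `tab` query parameter to a known evidence tab.
// Falling back to 'provenance' for unknown values is an assumed default.
const EVIDENCE_TABS = ['provenance', 'reachability', 'diff', 'runtime', 'policy'] as const;
type EvidenceTab = (typeof EVIDENCE_TABS)[number];

function resolveTabFromUrl(url: string): EvidenceTab {
  // A dummy base lets URL parse app-relative paths like /triage/findings/...
  const tab = new URL(url, 'https://example.invalid').searchParams.get('tab');
  return (EVIDENCE_TABS as readonly string[]).includes(tab ?? '')
    ? (tab as EvidenceTab)
    : 'provenance';
}
```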
## Data Models
### ProvenanceEvidence
```typescript
interface ProvenanceEvidence {
dsseStatus: DsseBadgeStatus;
dsseDetails?: DsseVerificationDetails;
attestationChain: AttestationChainNode[];
signer?: SignerInfo;
rekorLogIndex?: number;
rekorVerifyUrl?: string;
inTotoStatement?: object;
}
```
### AttestationChainNode
```typescript
interface AttestationChainNode {
id: string;
type: 'build' | 'scan' | 'triage' | 'policy' | 'custom';
label: string;
status: 'verified' | 'pending' | 'missing' | 'failed';
predicateType?: string;
digest?: string;
timestamp?: string;
signer?: string;
details?: AttestationDetails;
}
```
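One plausible way the per-node statuses roll up into the panel-level DSSE badge is sketched below. This rollup rule is an assumption for illustration, not the shipped logic — the source does not specify how chain nodes map to the badge:

```typescript
// Assumed rollup rule (illustrative only): 'verified' when every node is
// verified, 'missing' when none are, 'partial' otherwise.
type NodeStatus = 'verified' | 'pending' | 'missing' | 'failed';

interface ChainNodeLike {
  status: NodeStatus;
}

function rollupChainStatus(nodes: ChainNodeLike[]): 'verified' | 'partial' | 'missing' {
  if (nodes.length === 0) return 'missing';
  const verified = nodes.filter((n) => n.status === 'verified').length;
  if (verified === nodes.length) return 'verified';
  return verified > 0 ? 'partial' : 'missing';
}
```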
### PolicyEvidence
```typescript
interface PolicyEvidence {
verdict: PolicyVerdict;
rulePath?: string;
latticeTrace?: LatticeTraceStep[];
counterfactuals?: PolicyCounterfactual[];
policyVersion?: string;
policyDigest?: string;
policyEditorUrl?: string;
evaluatedAt?: string;
}
```
## Accessibility
The Evidence Panel follows WAI-ARIA tabs pattern:
- `role="tablist"` on tab navigation
- `role="tab"` on each tab button
- `role="tabpanel"` on each panel
- `aria-selected` indicates active tab
- `aria-controls` links tabs to panels
- `aria-labelledby` links panels to tabs
- `tabindex` management for keyboard navigation
- Screen reader announcements on tab change
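The ARIA wiring above follows the roving-tabindex pattern: only the active tab sits in the tab sequence, and `aria-controls`/`aria-labelledby` pair each tab with its panel. A sketch of the per-tab attribute computation, with illustrative `tab-*`/`panel-*` id conventions:

```typescript
// Sketch of roving-tabindex attributes per the WAI-ARIA tabs pattern.
// The id naming scheme (tab-*/panel-*) is an assumption for illustration.
interface TabAria {
  role: 'tab';
  id: string;
  'aria-selected': 'true' | 'false';
  'aria-controls': string;
  tabindex: 0 | -1;
}

function tabAria(tab: string, selected: boolean): TabAria {
  return {
    role: 'tab',
    id: `tab-${tab}`,
    'aria-selected': selected ? 'true' : 'false',
    'aria-controls': `panel-${tab}`,
    // Roving tabindex: only the active tab is reachable via Tab;
    // the rest are focused programmatically via arrow keys.
    tabindex: selected ? 0 : -1,
  };
}
```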
## Testing
### Unit Tests
Located in `evidence-panel/*.spec.ts`:
- Tab navigation behavior
- DSSE badge states and styling
- Attestation chain rendering
- Keyboard navigation
- URL persistence
- Loading/error states
### E2E Tests
Located in `e2e/evidence-panel.e2e.spec.ts`:
- Full tab switching workflow
- Evidence loading and display
- Copy JSON functionality
- URL persistence across reloads
- Accessibility compliance
## API Dependencies
The Evidence Panel depends on these API endpoints:
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/evidence/provenance/{findingId}` | GET | Fetch provenance data |
| `/api/evidence/reachability/{findingId}` | GET | Fetch reachability data |
| `/api/evidence/diff/{findingId}` | GET | Fetch diff data |
| `/api/evidence/runtime/{findingId}` | GET | Fetch runtime data |
| `/api/evidence/policy/{findingId}` | GET | Fetch policy data |
See [Evidence API Reference](../../../api/evidence-api.md) for details.
## Screenshots
### Provenance Tab
![Provenance Tab](../../assets/screenshots/evidence-provenance.png)
### Policy Tab with Lattice Trace
![Policy Tab](../../assets/screenshots/evidence-policy.png)
### Attestation Chain Expanded
![Attestation Chain](../../assets/screenshots/attestation-chain.png)
## Changelog
| Version | Date | Changes |
|---------|------|---------|
| 1.0.0 | 2026-01-09 | Initial implementation |

- Agent download and registration flow.
* **Models**: `integration.models.ts` defines `IntegrationDraft`, `IntegrationProvider`, `WizardStep`, `PreflightCheck`, `AuthMethod`, and provider constants.
### 3.12 Advisor (Ask Stella)
* **Chat panel** scoped to the current artifact, CVE, or release, with citations and evidence chips.
* **Citations and Evidence** drawer lists object refs (SBOM, VEX, scan IDs) and hashes.
* **Action confirmation** modal required for any tool action; disabled when policy denies.
* **Budget indicators** show quota or token budget exhaustion with retry hints.
### 3.13 Global Search and Assistant Bridge
* **Search -> assistant handoff**: result cards and synthesis panel expose `Ask AI` actions that route to `/security/triage?openChat=true` and seed chat context through `SearchChatContextService`.
* **Assistant host**: `/security/triage` mounts `SecurityTriageChatHostComponent`, which consumes `openChat` intent deterministically and opens the chat drawer in the primary shell.
* **Assistant -> search return**: assistant responses expose `Search for more` and `Search related` actions; these populate global search query/domain context and focus the search surface.
* **Fallback transparency**: when unified search drops to legacy fallback, global search displays an explicit degraded banner and emits enter/exit telemetry markers for operator visibility.
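The search-to-assistant handoff above can be sketched as a URL builder. The `/security/triage` route and `openChat=true` flag come from this document; the `q` parameter for seeding the query is hypothetical, since the actual context seeding goes through `SearchChatContextService`:

```typescript
// Illustrative handoff URL builder. The route and openChat flag are from the
// docs; the `q` query key is an assumed stand-in for SearchChatContextService.
function buildAskAiHandoffUrl(query: string): string {
  const params = new URLSearchParams({ openChat: 'true' });
  const trimmed = query.trim();
  if (trimmed.length > 0) params.set('q', trimmed);
  return `/security/triage?${params.toString()}`;
}
```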
---