Tests, fixes, and sprint work

This commit is contained in:
master
2026-01-22 19:08:46 +02:00
parent c32fff8f86
commit 726d70dc7f
881 changed files with 134434 additions and 6228 deletions

View File

@@ -39,6 +39,7 @@ Key settings:
- `subject`: sha256 (+ optional sha512) digest of the bundle target.
- `timestamps`: RFC3161/eIDAS timestamp entries with TSA chain/OCSP/CRL refs.
- `rekorProofs`: entry body/inclusion proof paths plus signed entry timestamp for offline verification.
- Inline artifacts (no `path`) are capped at 4 MiB; larger artifacts are written under `artifacts/`.
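The inline cap above can be sketched as a simple placement rule. This is an illustrative Python sketch, not the actual writer implementation; the entry field names (`inline`, `path`) are assumptions.

```python
import base64

MAX_INLINE_BYTES = 4 * 1024 * 1024  # 4 MiB cap on inline artifacts

def place_artifact(name: str, data: bytes) -> dict:
    """Return a manifest entry: inline payload under the cap, path reference above it."""
    if len(data) <= MAX_INLINE_BYTES:
        return {"name": name, "inline": base64.b64encode(data).decode("ascii")}
    # Larger artifacts are written under artifacts/ and referenced by path.
    return {"name": name, "path": f"artifacts/{name}"}
```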
## Dependencies
@@ -55,6 +56,63 @@ Key settings:
- Mirror: `../mirror/`
- ExportCenter: `../export-center/`
## Evidence Bundles for Air-Gapped Verification
The AirGap module supports golden corpus evidence bundles for offline verification of patch provenance. These bundles enable auditors to verify security patch status without network access.
### Bundle Contents
Evidence bundles follow the OCI format and contain:
- Pre/post binaries with debug symbols
- Canonical SBOM for each binary
- DSSE delta-sig predicate proving patch status
- Build provenance (if available from buildinfo)
- RFC 3161 timestamps for each signed artifact
- Validation run results and KPIs
### Bundle Export
```bash
stella groundtruth bundle export \
--packages openssl,zlib,glibc \
--distros debian,fedora \
--output symbol-bundle.tar.gz \
--sign-with cosign
```
### Bundle Import and Verification
```bash
stella groundtruth bundle import \
--input symbol-bundle.tar.gz \
--verify-signature \
--trusted-keys /etc/stellaops/trusted-keys.pub \
--output verification-report.md
```
### Standalone Verifier
For air-gapped environments without the full Stella Ops stack, use the standalone verifier:
```bash
stella-verifier verify \
--bundle evidence-bundle.oci.tar \
--trusted-keys trusted-keys.pub \
--trust-profile eu-eidas.trustprofile.json \
--output report.json
```
Exit codes:
- `0`: All verifications passed
- `1`: One or more verifications failed
- `2`: Invalid input or configuration error
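In automation, the exit codes above can be mapped to outcomes with a thin wrapper. A minimal sketch, assuming `stella-verifier` is invoked as an external command; the wrapper itself is illustrative.

```python
import subprocess
import sys

EXIT_MEANINGS = {
    0: "all verifications passed",
    1: "one or more verifications failed",
    2: "invalid input or configuration error",
}

def run_verifier(argv: list[str]) -> tuple[int, str]:
    """Run the verifier command and map its exit code to the documented meaning."""
    rc = subprocess.run(argv).returncode
    return rc, EXIT_MEANINGS.get(rc, "unknown exit code")
```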
### Related Documentation
- [Golden Corpus Layout](../binary-index/golden-corpus-layout.md)
- [Golden Corpus Maintenance](../binary-index/golden-corpus-maintenance.md)
- [Golden Corpus Operations Runbook](../../runbooks/golden-corpus-operations.md)
## Current Status
Implemented with Controller for snapshot export and Importer for secure ingestion. Staleness policies enforce time-bound validity. Integrated with ExportCenter for bundle packaging and all data modules for content export/import.

View File

@@ -17,7 +17,7 @@ Stella Ops generates rich data through SBOM ingestion, vulnerability correlation
|------------|-------------|
| Unified component registry | Canonical component table with normalized suppliers and licenses |
| Vulnerability correlation | Pre-joined component-vulnerability mapping with EPSS/KEV flags |
| VEX-adjusted exposure | Vulnerability counts that respect VEX overrides |
| VEX-adjusted exposure | Vulnerability counts that respect active VEX overrides (validity windows applied) |
| Attestation tracking | Provenance and SLSA level coverage by environment/team |
| Time-series rollups | Daily snapshots for trend analysis |
| Materialized views | Pre-computed aggregations for dashboard performance |
@@ -68,6 +68,14 @@ Stella Ops generates rich data through SBOM ingestion, vulnerability correlation
| `daily_vulnerability_counts` | Rollup | Daily vuln aggregations |
| `daily_component_counts` | Rollup | Daily component aggregations |
Rollup retention is 90 days in hot storage. `compute_daily_rollups()` prunes
older rows after each run; archival follows operations runbooks.
Platform WebService can automate rollups + materialized view refreshes via
`PlatformAnalyticsMaintenanceService` (see `architecture.md` for schedule and
configuration).
Use `Platform:AnalyticsMaintenance:BackfillDays` to recompute the most recent
N days of rollups on the first maintenance run after downtime (set to `0` to disable).
### Materialized Views
| View | Refresh | Purpose |
@@ -77,33 +85,36 @@ Stella Ops generates rich data through SBOM ingestion, vulnerability correlation
| `mv_vuln_exposure` | Daily | CVE exposure adjusted by VEX |
| `mv_attestation_coverage` | Daily | Provenance/SLSA coverage by env/team |
Array-valued fields (for example `environments` and `ecosystems`) are ordered
alphabetically to keep analytics outputs deterministic.
## Quick Start
### Day-1 Queries
**Top supplier concentration (supply chain risk):**
**Top supplier concentration (supply chain risk, optional environment filter):**
```sql
SELECT * FROM analytics.sp_top_suppliers(20);
SELECT analytics.sp_top_suppliers(20, 'prod');
```
**License risk heatmap:**
**License risk heatmap (optional environment filter):**
```sql
SELECT * FROM analytics.sp_license_heatmap();
SELECT analytics.sp_license_heatmap('prod');
```
**CVE exposure adjusted by VEX:**
```sql
SELECT * FROM analytics.sp_vuln_exposure('prod', 'high');
SELECT analytics.sp_vuln_exposure('prod', 'high');
```
**Fixable vulnerability backlog:**
```sql
SELECT * FROM analytics.sp_fixable_backlog('prod');
SELECT analytics.sp_fixable_backlog('prod');
```
**Attestation coverage gaps:**
```sql
SELECT * FROM analytics.sp_attestation_gaps('prod');
SELECT analytics.sp_attestation_gaps('prod');
```
### API Endpoints
@@ -118,6 +129,82 @@ SELECT * FROM analytics.sp_attestation_gaps('prod');
| `/api/analytics/trends/vulnerabilities` | GET | Vulnerability time-series |
| `/api/analytics/trends/components` | GET | Component time-series |
All analytics endpoints require the `analytics.read` scope.
The platform metadata capability `analytics` reports whether analytics storage is configured.
#### Query Parameters
- `/api/analytics/suppliers`: `limit` (optional, default 20), `environment` (optional)
- `/api/analytics/licenses`: `environment` (optional)
- `/api/analytics/vulnerabilities`: `minSeverity` (optional, default `low`), `environment` (optional)
- `/api/analytics/backlog`: `environment` (optional)
- `/api/analytics/attestation-coverage`: `environment` (optional)
- `/api/analytics/trends/vulnerabilities`: `environment` (optional), `days` (optional, default 30)
- `/api/analytics/trends/components`: `environment` (optional), `days` (optional, default 30)
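Since every parameter above is optional, clients should omit unset values rather than send empty strings. A small sketch of composing a request URL (the host is illustrative):

```python
from urllib.parse import urlencode

def analytics_url(base: str, path: str, **params) -> str:
    """Build an analytics endpoint URL, omitting parameters left unset (None)."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base}{path}?{query}" if query else f"{base}{path}"

url = analytics_url(
    "https://stellaops.example.internal",  # illustrative host
    "/api/analytics/trends/vulnerabilities",
    environment="prod",
    days=30,
)
```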
## Ingestion Configuration
Analytics ingestion runs inside the Platform WebService and subscribes to Scanner, Concelier, and Attestor streams. Configure ingestion via `Platform:AnalyticsIngestion`:
```yaml
Platform:
Storage:
PostgresConnectionString: "Host=...;Database=analytics;Username=...;Password=..."
AnalyticsIngestion:
Enabled: true
PostgresConnectionString: "" # optional; defaults to Platform:Storage
AllowedTenants: ["tenant-a", "tenant-b"]
Streams:
ScannerStream: "orchestrator:events"
ConcelierObservationStream: "concelier:advisory.observation.updated:v1"
ConcelierLinksetStream: "concelier:advisory.linkset.updated:v1"
AttestorStream: "attestor:events"
StartFromBeginning: false
Cas:
RootPath: "/var/lib/stellaops/cas"
DefaultBucket: "attestations"
Attestations:
BundleUriTemplate: "bundle:{digest}"
```
Bundle URI templates support:
- `{digest}` for the full digest string (for example `sha256:...`).
- `{hash}` for the raw hex digest (no algorithm prefix).
- `bundle:{digest}` which resolves to `cas://<DefaultBucket>/{digest}` by default.
- `file:/path/to/bundles/bundle-{hash}.json` for offline file ingestion.
For offline workflows, verify bundles with `stella bundle verify` before ingesting them.
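The template substitutions above can be sketched as follows; this mirrors the documented behavior (`{digest}`, `{hash}`, and the `bundle:` → `cas://<DefaultBucket>/` default) but is illustrative, not the ingestion service's code.

```python
def resolve_bundle_uri(template: str, digest: str, default_bucket: str = "attestations") -> str:
    """Expand {digest}/{hash} placeholders and map the bundle: scheme to CAS storage."""
    raw_hex = digest.split(":", 1)[1] if ":" in digest else digest
    uri = template.replace("{digest}", digest).replace("{hash}", raw_hex)
    if uri.startswith("bundle:"):
        # bundle:{digest} resolves to cas://<DefaultBucket>/{digest} by default.
        uri = f"cas://{default_bucket}/" + uri[len("bundle:"):]
    return uri
```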
## Console UI
SBOM Lake analytics are exposed in the Console under `Analytics > SBOM Lake` (`/analytics/sbom-lake`).
Console access requires `ui.read` plus `analytics.read` scopes.
Key UI features:
- Filters for environment, minimum severity, and time window.
- Panels for suppliers, licenses, vulnerability exposure, and attestation coverage.
- Trend views for vulnerabilities and components.
- Fixable backlog table with CSV export.
See [console.md](./console.md) for operator guidance and filter behavior.
## CLI Access
SBOM lake analytics are exposed via the CLI under `stella analytics sbom-lake`
(requires `analytics.read` scope).
```bash
# Top suppliers
stella analytics sbom-lake suppliers --limit 20
# Vulnerability exposure in prod (high+), CSV export
stella analytics sbom-lake vulnerabilities --environment prod --min-severity high --format csv --output vuln.csv
# 30-day trends for both series
stella analytics sbom-lake trends --days 30 --series all --format json
```
See `docs/modules/cli/guides/commands/analytics.md` for command-level details.
## Architecture
See [architecture.md](./architecture.md) for detailed design decisions, data flow, and normalization rules.
@@ -133,4 +220,6 @@ See [analytics_schema.sql](../../db/analytics_schema.sql) for complete DDL inclu
## Sprint Reference
Implementation tracked in: `docs/implplan/SPRINT_20260120_030_Platform_sbom_analytics_lake.md`
Implementation tracked in:
- `docs/implplan/SPRINT_20260120_030_Platform_sbom_analytics_lake.md`
- `docs/implplan/SPRINT_20260120_032_Cli_sbom_analytics_cli.md`

View File

@@ -7,7 +7,7 @@ The Analytics module implements a **star-schema data warehouse** pattern optimiz
1. **Separation of concerns**: Analytics schema is isolated from operational schemas (scanner, vex, proof_system)
2. **Pre-computation**: Expensive aggregations computed in advance via materialized views
3. **Audit trail**: Raw payloads preserved for reprocessing and compliance
4. **Determinism**: All normalization functions are immutable and reproducible
4. **Determinism**: Normalization functions are immutable and reproducible; array aggregates are ordered for stable outputs
5. **Incremental updates**: Supports both full refresh and incremental ingestion
## Data Flow
@@ -120,10 +120,9 @@ When a component is upserted, the `VulnerabilityCorrelationService` queries Conc
2. Filter by version range matching
3. Upsert to `component_vulns` with severity, EPSS, KEV flags
**Version range matching** uses Concelier's existing logic to handle:
- Semver ranges: `>=1.0.0 <2.0.0`
- Exact versions: `1.2.3`
- Wildcards: `1.x`
**Version range matching** currently supports semver ranges and exact matches via
`VersionRuleEvaluator`. Non-semver schemes fall back to exact string matches; wildcard
and ecosystem-specific ranges require upstream normalization.
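The matching behavior described above (semver range clauses, exact-string fallback for non-semver schemes) can be sketched as follows. This is an illustration of the rules, not the `VersionRuleEvaluator` implementation.

```python
import re

SEMVER = re.compile(r"^\d+\.\d+\.\d+$")

def parse_semver(v: str) -> tuple[int, int, int]:
    major, minor, patch = v.split(".")
    return int(major), int(minor), int(patch)

def matches(version: str, rule: str) -> bool:
    """Semver ranges like '>=1.0.0 <2.0.0' and exact matches; non-semver falls back to string equality."""
    if not SEMVER.match(version):
        return version == rule  # non-semver scheme: exact string match only
    if SEMVER.match(rule):
        return version == rule  # exact semver match
    v = parse_semver(version)
    for clause in rule.split():
        op, bound = re.match(r"^(>=|<=|<|>|=)?(.+)$", clause).groups()
        b = parse_semver(bound)
        ok = {">=": v >= b, "<=": v <= b, "<": v < b, ">": v > b, "=": v == b, None: v == b}[op]
        if not ok:
            return False
    return True
```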
## VEX Override Logic
@@ -145,7 +144,21 @@ COUNT(DISTINCT ac.artifact_id) FILTER (
**Override validity:**
- `valid_from`: When the override became effective
- `valid_until`: Expiration (NULL = no expiration)
- Only `status = 'not_affected'` reduces exposure counts
- Only `status = 'not_affected'` reduces exposure counts, and only when the override is active in its validity window.
## Attestation Ingestion
Attestation ingestion consumes Attestor Rekor entry events and expects Sigstore bundles
or raw DSSE envelopes. The ingestion service:
- Resolves bundle URIs using `BundleUriTemplate`; `bundle:{digest}` maps to
`cas://<DefaultBucket>/{digest}` by default.
- Decodes DSSE payloads, computes `dsse_payload_hash`, and records `predicate_uri` plus
Rekor log metadata (`rekor_log_id`, `rekor_log_index`).
- Uses in-toto `subject` digests to link artifacts when reanalysis hints are absent.
- Maps predicate URIs into `analytics_attestation_type` values
(`provenance`, `sbom`, `vex`, `build`, `scan`, `policy`).
- Expands VEX statements into `vex_overrides` rows, one per product reference, and
captures optional validity timestamps when provided.
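Computing `dsse_payload_hash` can be sketched as decoding the envelope's base64 payload and hashing it. The envelope field names follow the DSSE envelope format; the hash prefix convention is an assumption.

```python
import base64
import hashlib
import json

def dsse_payload_hash(envelope_json: str) -> str:
    """Decode the DSSE envelope's base64 payload and hash it for dsse_payload_hash."""
    envelope = json.loads(envelope_json)
    payload = base64.b64decode(envelope["payload"])
    return "sha256:" + hashlib.sha256(payload).hexdigest()
```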
## Time-Series Rollups
@@ -164,14 +177,14 @@ Daily rollups computed by `compute_daily_rollups()`:
- `total_components`: Distinct components
- `unique_suppliers`: Distinct normalized suppliers
**Retention policy:** 90 days in hot storage; older data archived to cold storage.
**Retention policy:** 90 days in hot storage; `compute_daily_rollups()` prunes older rows and downstream jobs archive to cold storage.
## Materialized View Refresh
All materialized views support `REFRESH ... CONCURRENTLY` for zero-downtime updates:
```sql
-- Refresh all views (run daily via pg_cron or Scheduler)
-- Refresh all views (non-concurrent; run off-peak)
SELECT analytics.refresh_all_views();
```
@@ -182,6 +195,19 @@ SELECT analytics.refresh_all_views();
- `mv_attestation_coverage`: 02:45 UTC daily
- `compute_daily_rollups()`: 03:00 UTC daily
Platform WebService can run the daily rollup + refresh loop via
`PlatformAnalyticsMaintenanceService`. Configure the schedule with:
- `Platform:AnalyticsMaintenance:Enabled` (default `true`)
- `Platform:AnalyticsMaintenance:IntervalMinutes` (default `1440`)
- `Platform:AnalyticsMaintenance:RunOnStartup` (default `true`)
- `Platform:AnalyticsMaintenance:ComputeDailyRollups` (default `true`)
- `Platform:AnalyticsMaintenance:RefreshMaterializedViews` (default `true`)
- `Platform:AnalyticsMaintenance:BackfillDays` (default `0` = disabled; when set to N, recomputes the most recent N days of rollups on the first maintenance run)
The hosted service issues concurrent refresh statements directly for each view.
Use a DB scheduler (pg_cron) or external orchestrator if you need the staggered
per-view timing above.
## Performance Considerations
### Indexing Strategy
@@ -198,9 +224,9 @@ SELECT analytics.refresh_all_views();
| Query | Target | Notes |
|-------|--------|-------|
| `sp_top_suppliers(20)` | < 100ms | Uses materialized view |
| `sp_license_heatmap()` | < 100ms | Uses materialized view |
| `sp_vuln_exposure()` | < 200ms | Uses materialized view |
| `sp_top_suppliers(20, 'prod')` | < 100ms | Uses materialized view when environment is null; environment filter reads base tables |
| `sp_license_heatmap('prod')` | < 100ms | Uses materialized view when environment is null; environment filter reads base tables |
| `sp_vuln_exposure()` | < 200ms | Uses materialized view when environment is null; environment filter reads base tables |
| `sp_fixable_backlog()` | < 500ms | Live query with indexes |
| `sp_attestation_gaps()` | < 100ms | Uses materialized view |
@@ -246,12 +272,12 @@ All tables include `created_at` and `updated_at` timestamps. Raw payload tables
### Upstream Dependencies
| Service | Event | Action |
|---------|-------|--------|
| Scanner | SBOM ingested | Normalize and upsert components |
| Concelier | Advisory updated | Re-correlate affected components |
| Excititor | VEX observation | Create/update vex_overrides |
| Attestor | Attestation created | Upsert attestation record |
| Service | Event | Contract | Action |
|---------|-------|----------|--------|
| Scanner | SBOM report ready | `scanner.event.report.ready@1` (`docs/modules/signals/events/orchestrator-scanner-events.md`) | Normalize and upsert components |
| Concelier | Advisory observation/linkset updated | `advisory.observation.updated@1` (`docs/modules/concelier/events/advisory.observation.updated@1.schema.json`), `advisory.linkset.updated@1` (`docs/modules/concelier/events/advisory.linkset.updated@1.md`) | Re-correlate affected components |
| Excititor | VEX statement changes | `vex.statement.*` (`docs/modules/excititor/architecture.md`) | Create/update vex_overrides |
| Attestor | Rekor entry logged | `rekor.entry.logged` (`docs/modules/attestor/architecture.md`) | Upsert attestation record |
### Downstream Consumers

View File

@@ -0,0 +1,64 @@
# Analytics Console (SBOM Lake)
The Console exposes SBOM analytics lake data under `Analytics > SBOM Lake`.
This view is read-only and uses the analytics API endpoints documented in `docs/modules/analytics/README.md`.
## Access
- Route: `/analytics/sbom-lake`
- Required scopes: `ui.read` and `analytics.read`
- Console admin bundles: `role/analytics-viewer`, `role/analytics-operator`, `role/analytics-admin`
- Data freshness: the page surfaces the latest `dataAsOf` timestamp returned by the API.
## Filters
The SBOM Lake page supports three filters that round-trip via URL query parameters:
- Environment: `env` (optional, example: `Prod`)
- Minimum severity: `severity` (optional, example: `high`)
- Time window (days): `days` (optional, example: `90`)
When a filter changes, the Console reloads all panels using the updated parameters.
Supplier and license panels honor the environment filter alongside the other views.
## Panels
The dashboard presents four summary panels:
1. Supplier concentration (top suppliers by component count)
2. License distribution (license categories and counts)
3. Vulnerability exposure (top CVEs after VEX adjustments)
4. Attestation coverage (provenance and SLSA 2+ coverage)
Each panel shows a loading state, empty state, and summary counts.
## Trends
Two trend panels are included:
- Vulnerability trend: net exposure over the selected time window
- Component trend: total components and unique suppliers
The Console aggregates trend points by date and renders a simple bar chart plus a compact list.
## Fixable Backlog
The fixable backlog table lists vulnerabilities with fixes available, grouped by component and service.
The "Top backlog components" table derives a component summary from the same backlog data.
### CSV Export
The "Export backlog CSV" action downloads a deterministic, ordered CSV with:
- Service
- Component
- Version
- Vulnerability
- Severity
- Environment
- Fixed version
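A deterministic, ordered export can be sketched by sorting rows on the full column tuple before writing; this is an illustration of the property, not the Console's exporter.

```python
import csv
import io

COLUMNS = ["Service", "Component", "Version", "Vulnerability",
           "Severity", "Environment", "Fixed version"]

def backlog_csv(rows: list[dict]) -> str:
    """Emit rows sorted on all columns so repeated exports are byte-identical."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in sorted(rows, key=lambda r: tuple(r.get(c, "") for c in COLUMNS)):
        writer.writerow(row)
    return out.getvalue()
```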
## Troubleshooting
- If panels show "No data", verify that the analytics schema and materialized views are populated.
- If an error banner appears, check the analytics API availability and ensure the tenant has `analytics.read`.

View File

@@ -9,8 +9,8 @@ This document provides ready-to-use SQL queries for common analytics use cases.
Identifies suppliers with the highest component footprint, indicating supply chain concentration risk.
```sql
-- Via stored procedure (recommended)
SELECT * FROM analytics.sp_top_suppliers(20);
-- Via stored procedure (recommended, optional environment filter)
SELECT analytics.sp_top_suppliers(20, 'prod');
-- Direct query
SELECT
@@ -33,8 +33,8 @@ LIMIT 20;
Shows distribution of components by license category for compliance review.
```sql
-- Via stored procedure
SELECT * FROM analytics.sp_license_heatmap();
-- Via stored procedure (optional environment filter)
SELECT analytics.sp_license_heatmap('prod');
-- Direct query with grouping
SELECT
@@ -62,9 +62,9 @@ Shows true vulnerability exposure after applying VEX mitigations.
```sql
-- Via stored procedure
SELECT * FROM analytics.sp_vuln_exposure('prod', 'high');
SELECT analytics.sp_vuln_exposure('prod', 'high');
-- Direct query showing VEX effectiveness
-- Direct query showing VEX effectiveness (global view; use sp_vuln_exposure for environment filtering)
SELECT
vuln_id,
severity::TEXT,
@@ -97,7 +97,7 @@ Lists vulnerabilities that can be fixed today (fix available, not VEX-mitigated)
```sql
-- Via stored procedure
SELECT * FROM analytics.sp_fixable_backlog('prod');
SELECT analytics.sp_fixable_backlog('prod');
-- Direct query with priority scoring
SELECT
@@ -130,6 +130,7 @@ JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
LEFT JOIN analytics.vex_overrides vo ON vo.artifact_id = a.artifact_id
AND vo.vuln_id = cv.vuln_id
AND vo.status = 'not_affected'
AND vo.valid_from <= now()
AND (vo.valid_until IS NULL OR vo.valid_until > now())
WHERE cv.affects = TRUE
AND cv.fix_available = TRUE
@@ -147,7 +148,7 @@ Shows attestation gaps by environment and team.
```sql
-- Via stored procedure
SELECT * FROM analytics.sp_attestation_gaps('prod');
SELECT analytics.sp_attestation_gaps('prod');
-- Direct query with gap analysis
SELECT
@@ -267,6 +268,7 @@ JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
LEFT JOIN analytics.vex_overrides vo ON vo.artifact_id = a.artifact_id
AND vo.vuln_id = cv.vuln_id
AND vo.valid_from <= now()
AND (vo.valid_until IS NULL OR vo.valid_until > now())
WHERE cv.vuln_id = 'CVE-2021-44228'
ORDER BY a.environment, a.name;
@@ -312,7 +314,7 @@ SELECT
c.license_category::TEXT,
c.supplier_normalized AS supplier,
COUNT(DISTINCT a.artifact_id) AS artifact_count,
ARRAY_AGG(DISTINCT a.name) AS affected_artifacts
ARRAY_AGG(DISTINCT a.name ORDER BY a.name) AS affected_artifacts
FROM analytics.components c
JOIN analytics.artifact_components ac ON ac.component_id = c.component_id
JOIN analytics.artifacts a ON a.artifact_id = ac.artifact_id
@@ -340,6 +342,8 @@ SELECT
FROM analytics.component_vulns cv
JOIN analytics.vex_overrides vo ON vo.vuln_id = cv.vuln_id
AND vo.status = 'not_affected'
AND vo.valid_from <= now()
AND (vo.valid_until IS NULL OR vo.valid_until > now())
WHERE cv.published_at >= now() - INTERVAL '90 days'
AND cv.published_at IS NOT NULL
GROUP BY cv.severity

View File

@@ -14,7 +14,7 @@ StellaOps SBOM interoperability tests ensure compatibility with third-party secu
| SPDX | 3.0.1 | ✅ Supported | 95%+ |
Notes:
- SPDX 3.0.1 generation currently emits JSON-LD `@context`, `spdxVersion`, core document/package/relationship elements, software package/file/snippet metadata, build profile elements with output relationships, security vulnerabilities with assessment relationships, verifiedUsing hashes/signatures, and external references/identifiers. Full profile coverage is tracked in SPRINT_20260119_014.
- SPDX 3.0.1 generation currently emits:
  - JSON-LD `@context`, `spdxVersion`, and core document/package/relationship elements (including agent/tool elements for creationInfo)
  - software package/file/snippet metadata
  - build profile elements with output relationships
  - security vulnerabilities with assessment relationships
  - licensing license elements with declared/concluded relationships
  - AI AIPackage metadata (autonomy, domain, metrics, safety risk assessment)
  - Dataset package metadata (type, collection, preprocessing, availability)
  - verifiedUsing hashes/signatures
  - external references/identifiers (including externalRef contentType when available)
  - namespaceMap/imports for cross-document references
  - extension metadata via SbomExtension namespace/properties on document/component/vulnerability elements
  - Lite profile output (opt-in via SpdxWriterOptions.UseLiteProfile)

  Full profile coverage is tracked in SPRINT_20260119_014.
### Third-Party Tools

View File

@@ -29,11 +29,14 @@ Use the bundle verification flow aligned to domain operations:
```bash
stella bundle verify --bundle /path/to/bundle --offline --trust-root /path/to/tsa-root.pem --rekor-checkpoint /path/to/checkpoint.json
stella bundle verify --bundle /path/to/bundle --offline --signer /path/to/report-key.pem --signer-cert /path/to/report-cert.pem
```
Notes:
- Offline mode fails closed when revocation evidence is missing or invalid.
- Trust roots must be provided locally; no network fetches are allowed.
- When `--signer` is set, a DSSE report is written to `out/verification.report.json`.
- Signed report metadata includes `verifier.algo`, `verifier.cert`, `signed_at`.
## 4. Verification Behavior

View File

@@ -1239,7 +1239,183 @@ binaryindex:
---
## 10. References
## 10. Golden Corpus for Patch Provenance
> **Sprint:** SPRINT_20260121_034/035/036 - Golden Corpus Implementation
The BinaryIndex module supports a **golden corpus** of patch-paired artifacts that enables offline SBOM reproducibility and binary-level patch provenance verification.
### 10.1 Corpus Purpose
The golden corpus provides:
- **Auditor-ready evidence bundles** for air-gapped customers
- **Regression testing** for binary matching accuracy
- **Proof of patch status** independent of package metadata
### 10.2 Corpus Sources
| Source | Type | Purpose |
|--------|------|---------|
| Debian Security Tracker / DSAs | Advisory | Primary advisory linkage |
| Debian Snapshot | Binary archive | Pre/post patch binary pairs |
| Ubuntu Security Notices | Advisory | Ubuntu-specific advisories |
| Alpine secdb | Advisory | Alpine YAML advisories |
| OSV dump | Unified schema | Cross-reference and commit ranges |
### 10.2.1 Symbol Source Connectors
> **Sprint:** SPRINT_20260121_035_BinaryIndex_golden_corpus_connectors_cli
The corpus ingestion layer uses pluggable connectors to retrieve symbols and metadata from upstream sources:
| Connector ID | Implementation | Protocol | Data Retrieved |
|--------------|----------------|----------|----------------|
| `debuginfod-fedora` | `DebuginfodConnector` | debuginfod HTTP | ELF debug symbols by Build-ID |
| `debuginfod-ubuntu` | `DebuginfodConnector` | debuginfod HTTP | ELF debug symbols by Build-ID |
| `ddeb-ubuntu` | `DdebConnector` | APT/HTTP | `.ddeb` debug packages |
| `buildinfo-debian` | `BuildinfoConnector` | HTTP | `.buildinfo` reproducibility records |
| `secdb-alpine` | `AlpineSecDbConnector` | Git/HTTP | `secfixes` YAML from APKBUILD |
**Connector Interface:**
```csharp
public interface ISymbolSourceConnector
{
string ConnectorId { get; }
string DisplayName { get; }
string[] SupportedDistros { get; }
Task<ConnectorStatus> GetStatusAsync(CancellationToken ct);
Task SyncAsync(SyncOptions options, CancellationToken ct);
Task<SymbolLookupResult?> LookupByBuildIdAsync(string buildId, CancellationToken ct);
Task<IAsyncEnumerable<SymbolRecord>> SearchAsync(SymbolSearchQuery query, CancellationToken ct);
}
```
**Debuginfod Connector:**
The `DebuginfodConnector` implements the [debuginfod protocol](https://sourceware.org/elfutils/Debuginfod.html) for retrieving debug symbols:
- Endpoint: `GET /buildid/<build-id>/debuginfo`
- Supports federated queries across multiple debuginfod servers
- Caches retrieved symbols in RustFS blob storage
- Rate-limited to respect upstream server policies
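The request path follows the debuginfod protocol cited above. A minimal sketch of URL construction and federated fallback order (hosts shown are public debuginfod servers, used here only for illustration; no HTTP call is made):

```python
def debuginfod_url(server: str, build_id: str) -> str:
    """debuginfod protocol: GET /buildid/<build-id>/debuginfo."""
    return f"{server.rstrip('/')}/buildid/{build_id}/debuginfo"

# A federated lookup would try each configured server in order:
SERVERS = ["https://debuginfod.fedoraproject.org", "https://debuginfod.ubuntu.com"]
candidates = [debuginfod_url(s, "b5381a457906d279073822a5ceb24c4bfef94ddb") for s in SERVERS]
```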
**Ubuntu ddeb Connector:**
The `DdebConnector` retrieves Ubuntu debug symbol packages (`.ddeb`):
- Sources: `ddebs.ubuntu.com` mirror
- Indexes: Reads `Packages.xz` for package metadata
- Extraction: Unpacks `.ddeb` AR archives to extract DWARF symbols
- Mapping: Links debug symbols to binary packages via Build-ID
**Debian Buildinfo Connector:**
The `BuildinfoConnector` retrieves Debian buildinfo files for reproducibility verification:
- Source: `buildinfos.debian.net` and snapshot archives
- Purpose: Provides build environment metadata for reproducible builds
- Fields extracted: `Build-Date`, `Build-Architecture`, `Checksums-Sha256`
- Integration: Cross-references with binary packages for provenance
**Alpine SecDB Connector:**
The `AlpineSecDbConnector` parses Alpine's security database:
- Source: `secfixes` blocks in APKBUILD files
- Repository: `alpine/aports` Git repository
- Format: YAML blocks mapping CVEs to fixed versions
- Example:
```yaml
secfixes:
3.0.11-r0:
- CVE-2024-0727
- CVE-2024-0728
```
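The `secfixes` block above has a simple two-level shape (fixed version → list of CVEs). A hand-rolled sketch of extracting it follows; the real connector presumably uses a proper YAML parser, so treat this as illustrative only.

```python
def parse_secfixes(text: str) -> dict[str, list[str]]:
    """Map fixed package versions to the CVEs they resolve, from a secfixes block."""
    fixes: dict[str, list[str]] = {}
    current = None
    in_block = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "secfixes:":
            in_block = True
            continue
        if not in_block or not stripped:
            continue
        if stripped.startswith("- ") and current:
            fixes[current].append(stripped[2:].strip())  # CVE id under the current version
        elif stripped.endswith(":"):
            current = stripped[:-1]  # new fixed-version key
            fixes[current] = []
    return fixes
```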
**OSV Dump Parser:**
The `OsvDumpParser` processes Google OSV database dumps for advisory cross-correlation:
- Source: `osv.dev` bulk exports (JSON)
- Purpose: CVE → commit range extraction for patch identification
- Cross-reference: Correlates OSV entries with distribution advisories
- Inconsistency detection: Identifies discrepancies between OSV and distro advisories
```csharp
public interface IOsvDumpParser
{
IAsyncEnumerable<OsvParsedEntry> ParseDumpAsync(Stream osvDumpStream, CancellationToken ct);
OsvCveIndex BuildCveIndex(IEnumerable<OsvParsedEntry> entries);
IEnumerable<AdvisoryCorrelation> CrossReferenceWithExternal(
OsvCveIndex osvIndex,
IEnumerable<ExternalAdvisory> externalAdvisories);
IEnumerable<AdvisoryInconsistency> DetectInconsistencies(
IEnumerable<AdvisoryCorrelation> correlations);
}
```
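The cross-reference and inconsistency-detection steps the interface describes can be sketched as a CVE-keyed join. The record shapes (`aliases`, `fixed_version`, `cve`) are assumptions for illustration, not the parser's actual models.

```python
def cross_reference(osv_entries: list[dict], distro_advisories: list[dict]) -> list[dict]:
    """Join OSV entries to distro advisories on CVE id and flag fixed-version disagreements."""
    osv_by_cve = {alias: e for e in osv_entries for alias in e.get("aliases", [])}
    correlations = []
    for adv in distro_advisories:
        osv = osv_by_cve.get(adv["cve"])
        if osv is None:
            continue  # no OSV coverage for this advisory
        correlations.append({
            "cve": adv["cve"],
            "distro_fixed": adv["fixed_version"],
            "osv_fixed": osv.get("fixed_version"),
            "inconsistent": osv.get("fixed_version") not in (None, adv["fixed_version"]),
        })
    return correlations
```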
**CLI Access:**
All connectors are manageable via the `stella groundtruth sources` CLI commands:
```bash
# List all connectors
stella groundtruth sources list
# Sync specific connector
stella groundtruth sources sync --source buildinfo-debian --full
# Enable/disable connectors
stella groundtruth sources enable ddeb-ubuntu
stella groundtruth sources disable debuginfod-fedora
```
See [Ground-Truth CLI Guide](../cli/guides/ground-truth-cli.md) for complete CLI documentation.
### 10.3 Key Performance Indicators
| KPI | Target | Description |
|-----|--------|-------------|
| Per-function match rate | >= 90% | Functions matched in post-patch binary |
| False-negative patch detection | <= 5% | Patched functions incorrectly classified as unpatched |
| SBOM canonical-hash stability | 3/3 | Determinism across independent runs |
| Binary reconstruction equivalence | Trend | Rebuilt binary matches original |
| End-to-end verify time (p95, cold) | Trend | Offline verification performance |
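The first two KPIs reduce to simple ratios against their targets. A sketch of evaluating them from raw counts (the function and field names are illustrative):

```python
def kpi_report(matched: int, total_functions: int, missed_patched: int, total_patched: int) -> dict:
    """Per-function match rate and false-negative patch-detection rate against KPI targets."""
    match_rate = matched / total_functions
    fn_rate = missed_patched / total_patched
    return {
        "match_rate": match_rate,
        "match_rate_ok": match_rate >= 0.90,          # target: >= 90%
        "false_negative_rate": fn_rate,
        "false_negative_ok": fn_rate <= 0.05,         # target: <= 5%
    }
```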
### 10.4 Validation Harness
The validation harness (`IValidationHarness`) orchestrates end-to-end verification:
```
Binary Pair (pre/post) → Symbol Recovery → IR Lifting → Fingerprinting → Matching → Metrics
```
### 10.5 Evidence Bundle Format
Evidence bundles follow OCI/ORAS conventions:
```
<pkg>-<advisory>-bundle.oci.tar
├── manifest.json # OCI manifest
└── blobs/
├── sha256:<sbom> # Canonical SBOM
├── sha256:<pre-bin> # Pre-fix binary
├── sha256:<post-bin> # Post-fix binary
├── sha256:<delta-sig> # DSSE delta-sig predicate
└── sha256:<timestamp> # RFC 3161 timestamp
```
### 10.6 Related Documentation
- [Golden Corpus KPIs](../../benchmarks/golden-corpus-kpis.md)
- [Golden Corpus Seed List](../../benchmarks/golden-corpus-seed-list.md)
- [Ground-Truth Corpus Specification](../../benchmarks/ground-truth-corpus.md)
---
## 11. References
- Advisory: `docs/product/advisories/21-Dec-2025 - Mapping Evidence Within Compiled Binaries.md`
- Scanner Native Analysis: `src/Scanner/StellaOps.Scanner.Analyzers.Native/`
@@ -1248,8 +1424,9 @@ binaryindex:
- **Semantic Diffing Sprint:** `docs/implplan/SPRINT_20260105_001_001_BINDEX_semdiff_ir_semantics.md`
- **Semantic Library:** `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/`
- **Semantic Tests:** `src/BinaryIndex/__Tests/StellaOps.BinaryIndex.Semantic.Tests/`
- **Golden Corpus Sprints:** `docs/implplan/SPRINT_20260121_034_BinaryIndex_golden_corpus_foundation.md`
---
*Document Version: 1.1.1*
*Last Updated: 2026-01-14*
*Document Version: 1.2.0*
*Last Updated: 2026-01-21*

View File

@@ -0,0 +1,347 @@
# Golden Corpus Folder Layout
Sprint: SPRINT_20260121_036_BinaryIndex_golden_corpus_bundle_verification
Task: GCB-006 - Document corpus folder layout and maintenance procedures
## Overview
The golden corpus is a curated dataset of pre/post security patch binary pairs used for:
- Validating binary matching algorithms
- Benchmarking reproducibility verification
- Training machine learning models for function identification
- Generating audit-ready evidence bundles
## Root Layout
```
golden-corpus/
├── corpus/ # Security pairs organized by distro
│ ├── debian/
│ ├── ubuntu/
│ └── alpine/
├── mirrors/ # Local mirrors of upstream sources
│ ├── debian/
│ ├── ubuntu/
│ ├── alpine/
│ └── osv/
├── harness/ # Build and verification tooling
│ ├── chroots/
│ ├── lifter-matcher/
│ ├── sbom-canonicalizer/
│ └── verifier/
├── evidence/ # Generated evidence bundles
│ └── <pkg>-<advisory>-bundle.oci.tar
└── bench/ # Benchmark data and baselines
├── baselines/
└── results/
```
## Corpus Directory Structure
Each security pair follows a consistent structure:
```
corpus/<distro>/<package>/<advisory-id>/
├── pre/ # Pre-patch (vulnerable) artifacts
│ ├── src/ # Source code
│ │ ├── *.tar.gz # Original source tarball
│ │ ├── debian/ # Packaging metadata
│ │ └── buildinfo # Build reproducibility info
│ └── debs/ # Built binaries
│ ├── *.deb # Binary packages
│ ├── *.ddeb # Debug symbols
│ └── buildlog # Build log
├── post/ # Post-patch (fixed) artifacts
│ ├── src/
│ └── debs/
└── metadata/
├── advisory.json # Advisory details
├── osv.json # OSV format vulnerability
├── pair-manifest.json # Pair configuration
└── ground-truth.json # Function-level ground truth
```
### Debian Example
```
corpus/debian/openssl/DSA-5678-1/
├── pre/
│ ├── src/
│ │ ├── openssl_3.0.10.orig.tar.gz
│ │ ├── openssl_3.0.10-1.debian.tar.xz
│ │ ├── openssl_3.0.10-1.dsc
│ │ └── openssl_3.0.10-1.buildinfo
│ └── debs/
│ ├── libssl3_3.0.10-1_amd64.deb
│ ├── libssl3-dbgsym_3.0.10-1_amd64.ddeb
│ └── build.log
├── post/
│ ├── src/
│ │ ├── openssl_3.0.11.orig.tar.gz
│ │ ├── openssl_3.0.11-1.debian.tar.xz
│ │ └── ...
│ └── debs/
│ └── ...
└── metadata/
├── advisory.json
└── ground-truth.json
```
### Ubuntu Example
```
corpus/ubuntu/curl/USN-1234-1/
├── pre/
│ ├── src/
│ │ └── curl_8.4.0-1ubuntu1.tar.xz
│ └── debs/
│ └── libcurl4_8.4.0-1ubuntu1_amd64.deb
├── post/
│ └── ...
└── metadata/
├── advisory.json
└── usn.json
```
### Alpine Example
```
corpus/alpine/zlib/CVE-2022-37434/
├── pre/
│ ├── src/
│ │ └── APKBUILD
│ └── apks/
│ └── zlib-1.2.12-r2.apk
├── post/
│ └── ...
└── metadata/
└── secdb-entry.json
```
## Mirrors Directory Structure
Local mirrors cache upstream artifacts for offline operation:
```
mirrors/
├── debian/
│ ├── archive/ # snapshot.debian.org mirrors
│ │ └── pool/main/o/openssl/
│ ├── snapshot/ # Point-in-time snapshots
│ │ └── 20260101T000000Z/
│ └── buildinfo/ # buildinfos.debian.net cache
│ └── <source-name>/
├── ubuntu/
│ ├── archive/ # archive.ubuntu.com mirrors
│ ├── usn-index/ # USN metadata
│ │ └── usn-db.json
│ └── launchpad/ # Build logs from Launchpad
├── alpine/
│ ├── packages/ # Alpine package mirror
│ └── secdb/ # Security database
│ └── community.json
└── osv/
├── all.zip # Full OSV database
└── debian/ # Distro-specific extracts
```
## Harness Directory Structure
Build and verification tooling:
```
harness/
├── chroots/ # Build environments
│ ├── debian-bookworm-amd64/
│ ├── debian-bullseye-amd64/
│ ├── ubuntu-noble-amd64/
│ └── alpine-3.19-amd64/
├── lifter-matcher/ # Binary analysis tools
│ ├── ghidra/ # Ghidra installation
│ ├── bsim-server/ # BSim database server
│ └── semantic-diffing/ # Semantic diff tools
├── sbom-canonicalizer/ # SBOM normalization
│ └── config/
└── verifier/ # Standalone verifier
├── stella-verifier # Verifier binary
└── trust-profiles/ # Trust profiles
```
## Evidence Directory Structure
Generated bundles for audit/compliance:
```
evidence/
├── openssl-DSA-5678-1-bundle.oci.tar
├── curl-USN-1234-1-bundle.oci.tar
└── manifests/
└── inventory.json
```
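The `manifests/inventory.json` index can be rebuilt from the bundle files on disk. A sketch (the JSON shape shown is illustrative; the real inventory schema may carry more fields):

```bash
# build-inventory.sh - regenerate an inventory of evidence bundles (sketch).
# Emits one {file, sha256} record per *.oci.tar in the evidence directory.
build_inventory() {
  evidence_dir=$1
  printf '{"bundles": ['
  first=1
  for f in "$evidence_dir"/*.oci.tar; do
    [ -e "$f" ] || continue
    [ "$first" -eq 1 ] || printf ', '
    first=0
    printf '{"file": "%s", "sha256": "%s"}' \
      "$(basename "$f")" "$(sha256sum "$f" | cut -d' ' -f1)"
  done
  printf ']}\n'
}

demo=$(mktemp -d)
printf 'payload' > "$demo/openssl-DSA-5678-1-bundle.oci.tar"
build_inventory "$demo"    # in production: > evidence/manifests/inventory.json
rm -rf "$demo"
```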
### Bundle Internal Structure (OCI Format)
```
openssl-DSA-5678-1-bundle.oci.tar/
├── oci-layout # OCI layout version
├── index.json # OCI index with referrers
├── blobs/
│ └── sha256/
│ ├── <manifest> # Bundle manifest
│ ├── <sbom-pre> # Pre-patch SBOM
│ ├── <sbom-post> # Post-patch SBOM
│ ├── <binary-pre> # Pre-patch binary
│ ├── <binary-post> # Post-patch binary
│ ├── <delta-sig> # DSSE delta-sig predicate
│ ├── <provenance> # Build provenance
│ └── <timestamp> # RFC 3161 timestamp
└── manifest.json # Signed bundle manifest
```
## Bench Directory Structure
Benchmark data and KPI baselines:
```
bench/
├── baselines/
│ ├── current.json # Active KPI baseline
│ └── archive/ # Historical baselines
│ ├── baseline-20260115.json
│ └── baseline-20260108.json
├── results/
│ ├── 20260122120000.json # Validation run results
│ └── ...
└── reports/
└── regression-report-*.md
```
### Baseline File Format
```json
{
"baselineId": "baseline-20260122120000",
"createdAt": "2026-01-22T12:00:00Z",
"source": "abc123def456",
"description": "Post-semantic-diffing-v2 baseline",
"precision": 0.95,
"recall": 0.92,
"falseNegativeRate": 0.08,
"deterministicReplayRate": 1.0,
"ttfrpP95Ms": 150,
"additionalKpis": {}
}
```
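The regression gate implied by this format can be sketched in shell. This sketch parses the flat, line-per-field JSON shown above with `sed` and applies a 1 pp tolerance on precision and recall; a real implementation would use a JSON parser (e.g. `jq`), and the tolerance value is an assumption:

```bash
# baseline-gate.sh - compare validation results against the active baseline (sketch).
# field FILE KEY extracts a numeric field from flat, line-per-field JSON.
field() { sed -n "s/.*\"$2\": *\([0-9.]*\).*/\1/p" "$1"; }

gate() {
  results=$1; baseline=$2
  awk -v rp="$(field "$results" precision)" -v bp="$(field "$baseline" precision)" \
      -v rr="$(field "$results" recall)"    -v br="$(field "$baseline" recall)" \
      'BEGIN { exit !(rp >= bp - 0.01 && rr >= br - 0.01) }'
}

# Demo with inline fixtures.
tmp=$(mktemp -d)
printf '{\n"precision": 0.95,\n"recall": 0.92\n}\n' > "$tmp/current.json"
printf '{\n"precision": 0.96,\n"recall": 0.93\n}\n' > "$tmp/results.json"
gate "$tmp/results.json" "$tmp/current.json" && echo "PASS"   # prints "PASS"
rm -rf "$tmp"
```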
## File Naming Conventions
| Type | Pattern | Example |
|------|---------|---------|
| Advisory ID (Debian) | `DSA-<number>-<revision>` | `DSA-5678-1` |
| Advisory ID (Ubuntu) | `USN-<number>-<revision>` | `USN-1234-1` |
| Advisory ID (Alpine) | `CVE-<year>-<number>` | `CVE-2022-37434` |
| Bundle file | `<pkg>-<advisory>-bundle.oci.tar` | `openssl-DSA-5678-1-bundle.oci.tar` |
| Baseline file | `baseline-<timestamp>.json` | `baseline-20260122120000.json` |
| Results file | `<timestamp>.json` | `20260122120000.json` |
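These patterns can be enforced mechanically at ingestion time. A sketch using `grep -E` (the regexes are a direct transcription of the table and may need tightening for edge cases):

```bash
# validate-name.sh - check artifact names against the naming conventions (sketch).
valid_name() {
  kind=$1; name=$2
  case $kind in
    advisory) echo "$name" | grep -Eq '^((DSA|USN)-[0-9]+-[0-9]+|CVE-[0-9]{4}-[0-9]+)$' ;;
    bundle)   echo "$name" | grep -Eq '^[a-z0-9.+-]+-((DSA|USN)-[0-9]+-[0-9]+|CVE-[0-9]{4}-[0-9]+)-bundle\.oci\.tar$' ;;
    baseline) echo "$name" | grep -Eq '^baseline-[0-9]{8,14}\.json$' ;;
    results)  echo "$name" | grep -Eq '^[0-9]{14}\.json$' ;;
    *) return 2 ;;
  esac
}

valid_name bundle "openssl-DSA-5678-1-bundle.oci.tar" && echo "bundle name ok"
valid_name results "20260122120000.json" && echo "results name ok"
```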
## Metadata Files
### advisory.json
```json
{
"advisoryId": "DSA-5678-1",
"cves": ["CVE-2024-1234", "CVE-2024-5678"],
"package": "openssl",
"vulnerableVersions": ["3.0.10-1"],
"fixedVersions": ["3.0.11-1"],
"severity": "high",
"publishedAt": "2024-11-15T00:00:00Z",
"summary": "Multiple vulnerabilities in OpenSSL"
}
```
### pair-manifest.json
```json
{
"pairId": "openssl-DSA-5678-1",
"package": "openssl",
"distribution": "debian",
"suite": "bookworm",
"architecture": "amd64",
"preVersion": "3.0.10-1",
"postVersion": "3.0.11-1",
"binaries": [
"libssl3",
"libcrypto3"
],
"createdAt": "2026-01-15T10:00:00Z",
"validatedAt": "2026-01-22T12:00:00Z"
}
```
### ground-truth.json
```json
{
"pairId": "openssl-DSA-5678-1",
"binary": "libcrypto.so.3",
"functions": [
{
"name": "EVP_DigestInit_ex",
"preAddress": "0x12345",
"postAddress": "0x12347",
"status": "modified",
"confidence": 1.0
},
{
"name": "EVP_DigestUpdate",
"preAddress": "0x12400",
"postAddress": "0x12400",
"status": "unchanged",
"confidence": 1.0
}
],
"metadata": {
"generatedBy": "manual-annotation",
"reviewedBy": "security-team",
"reviewedAt": "2026-01-20T14:00:00Z"
}
}
```
## Access Patterns
### Read-Only Access
- Validation harness reads corpus pairs
- CI reads baselines for regression checks
- Auditors read evidence bundles
### Write Access
- Corpus ingestion adds new pairs
- Baseline update writes new baseline files
- Bundle export creates evidence bundles
### Sync Access
- Mirror sync updates upstream caches
- Scheduled jobs refresh OSV database
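This read/write split can be approximated with POSIX permissions. A sketch (the exact mode split is an assumption; group ownership changes need root and are omitted):

```bash
# set-corpus-perms.sh - apply the read/write split described above (sketch).
set_perms() {
  root=$1
  # Read-mostly trees: corpus pairs, baselines, evidence bundles.
  chmod -R a+rX,go-w "$root/corpus" "$root/bench/baselines" "$root/evidence" 2>/dev/null
  # Writable trees: mirror sync and validation runs write here.
  chmod -R ug+rwX,o+rX "$root/mirrors" "$root/bench/results" 2>/dev/null
}

demo=$(mktemp -d)
mkdir -p "$demo/corpus" "$demo/bench/baselines" "$demo/bench/results" "$demo/mirrors" "$demo/evidence"
set_perms "$demo"
ls -ld "$demo/corpus" >/dev/null && echo "permissions applied"
rm -rf "$demo"
```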
## Storage Requirements
| Component | Typical Size | Growth Rate |
|-----------|--------------|-------------|
| Corpus (per pair) | 50-500 MB | N/A |
| Mirrors (Debian) | 10-50 GB | Monthly |
| Mirrors (Ubuntu) | 5-20 GB | Monthly |
| Mirrors (Alpine) | 1-5 GB | Monthly |
| OSV Database | 500 MB | Weekly |
| Evidence bundles | 100-500 MB each | Per pair |
| Baselines | < 10 KB each | Per run |
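Actual usage can be compared against these estimates with a small `du`-based report; the directory names follow the layout above:

```bash
# corpus-usage.sh - report per-component disk usage (sketch).
usage_report() {
  root=$1
  for component in corpus mirrors harness evidence bench; do
    if [ -d "$root/$component" ]; then
      printf '%-10s %s\n' "$component" "$(du -sh "$root/$component" | cut -f1)"
    fi
  done
}

usage_report "${CORPUS_ROOT:-/data/golden-corpus}"
```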
## Related Documentation
- [Ground Truth Corpus Overview](ground-truth-corpus.md)
- [Golden Corpus Maintenance](golden-corpus-maintenance.md)
- [Corpus Ingestion Operations](corpus-ingestion-operations.md)
- [Golden Corpus Operations Runbook](../../runbooks/golden-corpus-operations.md)


@@ -0,0 +1,492 @@
# Golden Corpus Maintenance
Sprint: SPRINT_20260121_036_BinaryIndex_golden_corpus_bundle_verification
Task: GCB-006 - Document corpus folder layout and maintenance procedures
## Overview
This document describes maintenance procedures for the golden corpus, including:
- Mirror synchronization
- Baseline management
- Evidence bundle generation
- Health monitoring
## Mirror Synchronization
### Automated Sync Schedule
Mirror sync should be automated via cron jobs or CI scheduled workflows.
#### Recommended Schedule
| Mirror | Frequency | Rationale |
|--------|-----------|-----------|
| Debian archive | Daily | Security updates published daily |
| Debian buildinfo | Daily | Matches archive updates |
| Ubuntu archive | Daily | Security updates published daily |
| Ubuntu USN index | Hourly | USN metadata changes frequently |
| Alpine secdb | Daily | Less frequent updates |
| OSV database | Hourly | Aggregates multiple sources |
### Sync Scripts
#### Debian Mirror Sync
```bash
#!/bin/bash
# sync-debian-mirrors.sh
# Syncs Debian archives and buildinfo
set -euo pipefail
MIRRORS_ROOT="${MIRRORS_ROOT:-/data/golden-corpus/mirrors}"
DEBIAN_MIRROR="${DEBIAN_MIRROR:-https://snapshot.debian.org}"
BUILDINFO_URL="${BUILDINFO_URL:-https://buildinfos.debian.net}"
# Packages to mirror (security-relevant)
PACKAGES=(openssl curl zlib glibc libxml2 libpng)
# Sync source packages
for pkg in "${PACKAGES[@]}"; do
echo "Syncing Debian sources for: $pkg"
# Create package directory
mkdir -p "$MIRRORS_ROOT/debian/archive/pool/main/${pkg:0:1}/$pkg"
# Download available versions
rsync -avz --progress \
"rsync://snapshot.debian.org/snapshot/debian/pool/main/${pkg:0:1}/$pkg/" \
"$MIRRORS_ROOT/debian/archive/pool/main/${pkg:0:1}/$pkg/"
done
# Sync buildinfo files
for pkg in "${PACKAGES[@]}"; do
echo "Syncing buildinfo for: $pkg"
mkdir -p "$MIRRORS_ROOT/debian/buildinfo/$pkg"
# Use wget to fetch buildinfo index and files
wget -r -np -nH --cut-dirs=2 -P "$MIRRORS_ROOT/debian/buildinfo/$pkg" \
"$BUILDINFO_URL/api/v1/buildinfo/$pkg/" || true
done
echo "Debian mirror sync complete"
date > "$MIRRORS_ROOT/debian/.last-sync"
```
#### Ubuntu Mirror Sync
```bash
#!/bin/bash
# sync-ubuntu-mirrors.sh
# Syncs Ubuntu archives and USN metadata
set -euo pipefail
MIRRORS_ROOT="${MIRRORS_ROOT:-/data/golden-corpus/mirrors}"
UBUNTU_ARCHIVE="https://archive.ubuntu.com/ubuntu"
USN_API="https://ubuntu.com/security/notices.json"
# Sync USN database
echo "Syncing Ubuntu USN database..."
mkdir -p "$MIRRORS_ROOT/ubuntu/usn-index"
curl -sSL "$USN_API" -o "$MIRRORS_ROOT/ubuntu/usn-index/usn-db.json.tmp"
mv "$MIRRORS_ROOT/ubuntu/usn-index/usn-db.json.tmp" "$MIRRORS_ROOT/ubuntu/usn-index/usn-db.json"
# Sync packages (similar to Debian)
PACKAGES=(openssl curl zlib1g libxml2)
for pkg in "${PACKAGES[@]}"; do
echo "Syncing Ubuntu sources for: $pkg"
mkdir -p "$MIRRORS_ROOT/ubuntu/archive/pool/main/${pkg:0:1}/$pkg"
# ... sync logic
done
echo "Ubuntu mirror sync complete"
date > "$MIRRORS_ROOT/ubuntu/.last-sync"
```
#### Alpine SecDB Sync
```bash
#!/bin/bash
# sync-alpine-secdb.sh
# Syncs Alpine security database
set -euo pipefail
MIRRORS_ROOT="${MIRRORS_ROOT:-/data/golden-corpus/mirrors}"
ALPINE_SECDB="https://secdb.alpinelinux.org"
mkdir -p "$MIRRORS_ROOT/alpine/secdb"
# Download all security databases
for branch in v3.17 v3.18 v3.19 v3.20 edge; do
for repo in main community; do
echo "Syncing Alpine secdb: $branch/$repo"
curl -sSL "$ALPINE_SECDB/$branch/$repo.json" \
-o "$MIRRORS_ROOT/alpine/secdb/${branch}-${repo}.json" || true
done
done
echo "Alpine secdb sync complete"
date > "$MIRRORS_ROOT/alpine/.last-sync"
```
#### OSV Database Sync
```bash
#!/bin/bash
# sync-osv.sh
# Syncs OSV vulnerability database
set -euo pipefail
MIRRORS_ROOT="${MIRRORS_ROOT:-/data/golden-corpus/mirrors}"
OSV_URL="https://osv-vulnerabilities.storage.googleapis.com"
mkdir -p "$MIRRORS_ROOT/osv"
# Download full database
echo "Downloading OSV all.zip..."
curl -sSL "$OSV_URL/all.zip" -o "$MIRRORS_ROOT/osv/all.zip.tmp"
mv "$MIRRORS_ROOT/osv/all.zip.tmp" "$MIRRORS_ROOT/osv/all.zip"
# Extract ecosystem-specific databases
for ecosystem in Debian Ubuntu Alpine; do
mkdir -p "$MIRRORS_ROOT/osv/$ecosystem"
unzip -o -q "$MIRRORS_ROOT/osv/all.zip" "$ecosystem/*" -d "$MIRRORS_ROOT/osv/" || true
done
echo "OSV sync complete"
date > "$MIRRORS_ROOT/osv/.last-sync"
```
### Cron Configuration
```cron
# /etc/cron.d/golden-corpus-sync
# Mirror sync jobs
0 */4 * * * corpus /opt/golden-corpus/scripts/sync-debian-mirrors.sh >> /var/log/corpus/debian-sync.log 2>&1
0 */4 * * * corpus /opt/golden-corpus/scripts/sync-ubuntu-mirrors.sh >> /var/log/corpus/ubuntu-sync.log 2>&1
0 6 * * * corpus /opt/golden-corpus/scripts/sync-alpine-secdb.sh >> /var/log/corpus/alpine-sync.log 2>&1
0 * * * * corpus /opt/golden-corpus/scripts/sync-osv.sh >> /var/log/corpus/osv-sync.log 2>&1
# Health check
*/15 * * * * corpus /opt/golden-corpus/scripts/check-mirror-health.sh >> /var/log/corpus/health.log 2>&1
```
## Baseline Management
### When to Update Baselines
Update the KPI baseline when:
1. Algorithm improvements are merged (expected KPI improvement)
2. New corpus pairs are added (may change baseline metrics)
3. False positives/negatives are corrected in ground truth
4. Analysis tools receive major version upgrades
### Baseline Update Procedure
#### 1. Run Full Validation
```bash
# Run validation on the full corpus
stella groundtruth validate run \
--matcher semantic-diffing \
--output bench/results/$(date +%Y%m%d%H%M%S).json \
--verbose
```
#### 2. Review Results
```bash
# Check metrics
stella groundtruth validate metrics --run-id latest
# Compare against current baseline
stella groundtruth validate check \
--results bench/results/latest.json \
--baseline bench/baselines/current.json
```
#### 3. Update Baseline
Only if regression check passes or improvements are expected:
```bash
# Archive current baseline
cp bench/baselines/current.json \
bench/baselines/archive/baseline-$(date +%Y%m%d).json
# Update baseline
stella groundtruth baseline update \
--from-results bench/results/latest.json \
--output bench/baselines/current.json \
--description "Post algorithm-v2.3 update" \
--source "$(git rev-parse HEAD)"
```
#### 4. Commit and Document
```bash
# Commit the baseline update
git add bench/baselines/
git commit -m "chore(bench): update golden corpus baseline
Reason: Algorithm v2.3 improvements
Previous baseline: baseline-20260115.json
Metrics:
- Precision: 0.95 -> 0.97 (+2pp)
- Recall: 0.92 -> 0.94 (+2pp)
- FN Rate: 0.08 -> 0.06 (-2pp)
- Determinism: 100%
- TTFRP p95: 150ms -> 140ms (-7%)"
git push
```
### Baseline Rollback
If a baseline update causes issues:
```bash
# Restore previous baseline
cp bench/baselines/archive/baseline-20260115.json \
bench/baselines/current.json
git add bench/baselines/current.json
git commit -m "revert(bench): rollback baseline to 20260115"
git push
```
## Evidence Bundle Generation
### Manual Bundle Export
```bash
# Export bundle for specific packages
stella groundtruth bundle export \
--packages openssl,curl,zlib \
--distros debian,ubuntu \
--output evidence/security-bundle-$(date +%Y%m%d).tar.gz \
--sign-with-cosign \
--include-debug \
--include-kpis \
--include-timestamps
```
### Automated Bundle Generation
Schedule bundle generation for compliance reporting:
```bash
#!/bin/bash
# generate-compliance-bundles.sh
# Run monthly for audit evidence
set -euo pipefail
EVIDENCE_DIR="/data/golden-corpus/evidence"
MONTH=$(date +%Y%m)
# Generate bundles for each distro
for distro in debian ubuntu alpine; do
stella groundtruth bundle export \
--distros "$distro" \
--packages all \
--output "$EVIDENCE_DIR/$distro-bundle-$MONTH.tar.gz" \
--sign-with-cosign \
--include-kpis \
--include-timestamps
done
# Create manifest
echo "{\"month\": \"$MONTH\", \"bundles\": [\"debian\", \"ubuntu\", \"alpine\"]}" \
> "$EVIDENCE_DIR/manifest-$MONTH.json"
```
### Bundle Verification
Always verify bundles after generation:
```bash
# Verify bundle integrity
stella groundtruth bundle import \
--input evidence/security-bundle-20260122.tar.gz \
--verify \
--trusted-keys /etc/stellaops/trusted-keys.pub \
--trust-profile /etc/stellaops/trust-profiles/global.json \
--output verification-report.md
```
## Health Monitoring
### Doctor Checks
Run Doctor checks regularly to validate corpus health:
```bash
# Run all corpus-related checks
stella doctor --check "check.binaryanalysis.corpus.*"
# Specific checks
stella doctor --check check.binaryanalysis.corpus.mirror.freshness
stella doctor --check check.binaryanalysis.corpus.kpi.baseline
stella doctor --check check.binaryanalysis.debuginfod.availability
```
### Health Check Script
```bash
#!/bin/bash
# check-mirror-health.sh
# Validates mirror freshness and connectivity
set -euo pipefail
MIRRORS_ROOT="${MIRRORS_ROOT:-/data/golden-corpus/mirrors}"
STALE_THRESHOLD_DAYS=7
ALERTS=""
check_mirror() {
local mirror_name=$1
local last_sync_file=$2
local max_age=$3
if [[ ! -f "$last_sync_file" ]]; then
ALERTS+="CRITICAL: $mirror_name has never been synced\n"
return
fi
local last_sync=$(cat "$last_sync_file")
local last_sync_epoch=$(date -d "$last_sync" +%s)
local now_epoch=$(date +%s)
local age_days=$(( (now_epoch - last_sync_epoch) / 86400 ))
if [[ $age_days -gt $max_age ]]; then
ALERTS+="WARNING: $mirror_name is $age_days days old (threshold: $max_age)\n"
fi
}
# Check each mirror
check_mirror "Debian" "$MIRRORS_ROOT/debian/.last-sync" $STALE_THRESHOLD_DAYS
check_mirror "Ubuntu" "$MIRRORS_ROOT/ubuntu/.last-sync" $STALE_THRESHOLD_DAYS
check_mirror "Alpine" "$MIRRORS_ROOT/alpine/.last-sync" $STALE_THRESHOLD_DAYS
check_mirror "OSV" "$MIRRORS_ROOT/osv/.last-sync" 1  # OSV syncs hourly; alert if more than 1 day old
# Check connectivity
for url in \
"https://snapshot.debian.org" \
"https://buildinfos.debian.net" \
"https://ubuntu.com/security/notices.json" \
"https://secdb.alpinelinux.org"; do
if ! curl -sSf --connect-timeout 5 "$url" > /dev/null 2>&1; then
ALERTS+="ERROR: Cannot reach $url\n"
fi
done
# Report results
if [[ -n "$ALERTS" ]]; then
echo -e "Golden Corpus Health Issues:\n$ALERTS"
# Send alert (customize for your alerting system)
# curl -X POST -d "$ALERTS" https://alerts.example.com/webhook
exit 1
fi
echo "All mirrors healthy at $(date)"
```
### Monitoring Metrics
Export these metrics to your monitoring system:
| Metric | Description | Alert Threshold |
|--------|-------------|-----------------|
| `corpus.mirrors.age_seconds` | Time since last mirror sync | > 7 days |
| `corpus.pairs.total` | Total number of security pairs | N/A (info) |
| `corpus.validation.precision` | Latest precision rate | < baseline - 0.01 |
| `corpus.validation.recall` | Latest recall rate | < baseline - 0.01 |
| `corpus.validation.determinism` | Deterministic replay rate | < 1.0 |
| `corpus.bundle.count` | Number of evidence bundles | N/A (info) |
| `corpus.baseline.age_days` | Days since baseline update | > 30 days |
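The mirror-age metric can be derived directly from the `.last-sync` markers written by the sync scripts. A sketch in Prometheus textfile-collector format (it assumes GNU `date -d`, as the health check script does; in production, redirect the output into the node_exporter textfile directory):

```bash
# export-corpus-metrics.sh - emit corpus_mirror_age_seconds per mirror (sketch).
emit_mirror_age() {
  mirrors_root=$1
  now=$(date +%s)
  echo "# TYPE corpus_mirror_age_seconds gauge"
  for sync_file in "$mirrors_root"/*/.last-sync; do
    [ -f "$sync_file" ] || continue
    mirror=$(basename "$(dirname "$sync_file")")
    age=$(( now - $(date -d "$(cat "$sync_file")" +%s) ))
    printf 'corpus_mirror_age_seconds{mirror="%s"} %d\n' "$mirror" "$age"
  done
}

demo=$(mktemp -d)
mkdir -p "$demo/debian"
date > "$demo/debian/.last-sync"
emit_mirror_age "$demo"
rm -rf "$demo"
```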
### Prometheus Metrics Example
```yaml
# prometheus-corpus-metrics.yaml
groups:
- name: golden-corpus
rules:
- alert: CorpusMirrorStale
expr: corpus_mirror_age_seconds > 604800 # 7 days
labels:
severity: warning
annotations:
summary: "Corpus mirror {{ $labels.mirror }} is stale"
- alert: CorpusRegressionDetected
expr: corpus_validation_precision < corpus_baseline_precision - 0.01
labels:
severity: critical
annotations:
summary: "Precision regression detected in golden corpus validation"
- alert: CorpusDeterminismFailure
expr: corpus_validation_determinism < 1.0
labels:
severity: critical
annotations:
summary: "Non-deterministic replay detected"
```
## Cleanup and Archival
### Archive Old Results
```bash
#!/bin/bash
# archive-old-results.sh
# Archives results older than 90 days
set -euo pipefail
RESULTS_DIR="/data/golden-corpus/bench/results"
ARCHIVE_DIR="/data/golden-corpus/bench/archive"
AGE_DAYS=90
mkdir -p "$ARCHIVE_DIR"
find "$RESULTS_DIR" -name "*.json" -mtime +$AGE_DAYS -exec \
mv {} "$ARCHIVE_DIR/" \;
# Compress archived results by month
cd "$ARCHIVE_DIR"
for month in $(ls *.json 2>/dev/null | cut -c1-6 | sort -u); do
tar -czf "results-$month.tar.gz" "${month}"*.json && \
rm -f "${month}"*.json
done
```
### Prune Old Baselines
Keep only the last N baselines:
```bash
#!/bin/bash
# prune-baselines.sh
# Keeps only the 10 most recent baseline archives
BASELINE_ARCHIVE="/data/golden-corpus/bench/baselines/archive"
KEEP_COUNT=10
cd "$BASELINE_ARCHIVE"
ls -t baseline-*.json | tail -n +$((KEEP_COUNT + 1)) | xargs -r rm -f
```
## Related Documentation
- [Golden Corpus Folder Layout](golden-corpus-layout.md)
- [Ground Truth Corpus Overview](ground-truth-corpus.md)
- [Golden Corpus Operations Runbook](../../runbooks/golden-corpus-operations.md)


@@ -23,10 +23,12 @@ The `stella` CLI is the operator-facing Swiss army knife for scans, exports, pol
- Versioned command docs in `docs/modules/cli/guides`.
- Plugin catalogue in `plugins/cli/**` (restart-only).
## Related resources
- ./guides/20_REFERENCE.md
- ./guides/cli-reference.md
- ./guides/commands/analytics.md
- ./guides/policy.md
- ./guides/trust-profiles.md
## Backlog references
- DOCS-CLI-OBS-52-001 / DOCS-CLI-FORENSICS-53-001 in ../../TASKS.md.


@@ -51,10 +51,11 @@ Status key:
| UI capability | CLI command(s) | Status | Notes / Tasks |
|---------------|----------------|--------|---------------|
| Advisory observations search | `stella vuln observations` | ✅ Available | Implemented via `BuildVulnCommand`. |
| Advisory linkset export | `stella advisory linkset show/export` | 🟩 Planned | `CLI-LNM-22-001`. |
| VEX observations / linksets | `stella vex obs get/linkset show` | 🟩 Planned | `CLI-LNM-22-002`. |
| SBOM overlay export | `stella sbom overlay apply/export` | 🟩 Planned | Scoped to upcoming SBOM CLI sprint (`SBOM-CONSOLE-23-001/002` + CLI backlog). |
| SBOM Lake analytics (`/analytics/sbom-lake`) | `stella analytics sbom-lake <subcommand>` | ✅ Available | CLI guide at `docs/modules/cli/guides/commands/analytics.md` (SPRINT_20260120_032). |
---
@@ -151,5 +152,5 @@ The script should emit a parity report that feeds into the Downloads workspace (
---
*Last updated: 2026-01-20 (Sprint 20260120).*


@@ -1,5 +1,5 @@
version: 1
generated: 2026-01-20T00:00:00Z
compatibility:
policy: "SemVer-like: commands/flags/exitCodes are backwards compatible within major version."
deprecation:
@@ -38,6 +38,108 @@ commands:
0: success
4: auth-misconfigured
5: token-invalid
- name: analytics
subcommands:
- name: sbom-lake
subcommands:
- name: suppliers
formats: [table, json, csv]
flags:
- name: environment
required: false
- name: limit
required: false
- name: format
required: false
values: [table, json, csv]
- name: output
required: false
exitCodes:
0: success
1: error
- name: licenses
formats: [table, json, csv]
flags:
- name: environment
required: false
- name: limit
required: false
- name: format
required: false
values: [table, json, csv]
- name: output
required: false
exitCodes:
0: success
1: error
- name: vulnerabilities
formats: [table, json, csv]
flags:
- name: environment
required: false
- name: min-severity
required: false
values: [critical, high, medium, low]
- name: limit
required: false
- name: format
required: false
values: [table, json, csv]
- name: output
required: false
exitCodes:
0: success
1: error
- name: backlog
formats: [table, json, csv]
flags:
- name: environment
required: false
- name: limit
required: false
- name: format
required: false
values: [table, json, csv]
- name: output
required: false
exitCodes:
0: success
1: error
- name: attestation-coverage
formats: [table, json, csv]
flags:
- name: environment
required: false
- name: limit
required: false
- name: format
required: false
values: [table, json, csv]
- name: output
required: false
exitCodes:
0: success
1: error
- name: trends
formats: [table, json, csv]
flags:
- name: environment
required: false
- name: days
required: false
- name: series
required: false
values: [vulnerabilities, components, all]
- name: limit
required: false
- name: format
required: false
values: [table, json, csv]
- name: output
required: false
exitCodes:
0: success
1: error
telemetry:
defaultEnabled: false
envVars:


@@ -0,0 +1,47 @@
# stella analytics - Command Guide
## Commands
- `stella analytics sbom-lake suppliers [--environment <env>] [--limit <n>] [--format table|json|csv] [--output <path>]`
- `stella analytics sbom-lake licenses [--environment <env>] [--limit <n>] [--format table|json|csv] [--output <path>]`
- `stella analytics sbom-lake vulnerabilities [--environment <env>] [--min-severity <level>] [--limit <n>] [--format table|json|csv] [--output <path>]`
- `stella analytics sbom-lake backlog [--environment <env>] [--limit <n>] [--format table|json|csv] [--output <path>]`
- `stella analytics sbom-lake attestation-coverage [--environment <env>] [--limit <n>] [--format table|json|csv] [--output <path>]`
- `stella analytics sbom-lake trends [--environment <env>] [--days <n>] [--series vulnerabilities|components|all] [--limit <n>] [--format table|json|csv] [--output <path>]`
## Flags (common)
- `--format`: Output format for rendering (`table`, `json`, `csv`).
- `--output`: Write output to a file path instead of stdout.
- `--limit`: Cap the number of rows returned.
- `--environment`: Filter by environment name.
## SBOM lake notes
- Endpoints require the `analytics.read` scope.
- `--min-severity` accepts `critical`, `high`, `medium`, `low`.
- `--series` controls trend output (`vulnerabilities`, `components`, `all`).
- Tables use deterministic ordering (severity and counts first, then names).
## Examples
```bash
# Top suppliers
stella analytics sbom-lake suppliers --limit 20
# License distribution as CSV (prod)
stella analytics sbom-lake licenses --environment prod --format csv --output licenses.csv
# Vulnerability exposure in prod (high+)
stella analytics sbom-lake vulnerabilities --environment prod --min-severity high
# Fixable backlog with table output
stella analytics sbom-lake backlog --environment prod --limit 50
# Attestation coverage in staging, JSON output
stella analytics sbom-lake attestation-coverage --environment stage --format json
# 30-day trend snapshot (both series)
stella analytics sbom-lake trends --days 30 --series all --format csv --output trends.csv
```
## Offline/verification note
- If analytics exports arrive via offline bundles, verify the bundle first with
`stella bundle verify` before importing data into downstream reports.


@@ -16,6 +16,7 @@ graph TD
CLI --> EXPLAIN[Explainability]
CLI --> VEX[VEX & Decisioning]
CLI --> SBOM[SBOM Operations]
CLI --> ANALYTICS[Analytics & Insights]
CLI --> REPORT[Reporting & Export]
CLI --> OFFLINE[Offline Operations]
CLI --> SYSTEM[System & Config]
@@ -742,6 +743,601 @@ stella sbom merge --sbom <path1> --sbom <path2> [--output <path>] [--verbose]
---
## Analytics Commands
### stella analytics sbom-lake
Query SBOM lake analytics views (suppliers, licenses, vulnerabilities, backlog,
attestation coverage, trends).
**Usage:**
```bash
stella analytics sbom-lake <subcommand> [options]
```
**Subcommands:**
- `suppliers` - Supplier concentration
- `licenses` - License distribution
- `vulnerabilities` - CVE exposure (VEX-adjusted)
- `backlog` - Fixable vulnerability backlog
- `attestation-coverage` - Provenance/SLSA coverage
- `trends` - Time-series trends (vulnerabilities/components)
**Common options:**
| Option | Description |
|--------|-------------|
| `--environment <env>` | Filter to a specific environment |
| `--min-severity <level>` | Minimum severity (`critical`, `high`, `medium`, `low`) |
| `--days <n>` | Lookback window in days (trends only) |
| `--series <name>` | Trend series (`vulnerabilities`, `components`, `all`) |
| `--limit <n>` | Maximum number of rows |
| `--format <fmt>` | Output format: `table`, `json`, `csv` |
| `--output <path>` | Output file path |
**Example:**
```bash
stella analytics sbom-lake vulnerabilities --environment prod --min-severity high --format csv --output vuln.csv
```
---
## Ground-Truth Corpus Commands
### stella groundtruth
Manage ground-truth corpus for patch-paired binary verification. The corpus supports
precision validation of security advisories by maintaining symbol and binary pairs
from upstream sources.
**Sprint:** SPRINT_20260121_035_BinaryIndex_golden_corpus_connectors_cli
**Usage:**
```bash
stella groundtruth <subcommand> [options]
```
**Subcommands:**
- `sources` - Manage symbol source connectors
- `symbols` - Query and search symbols in the corpus
- `pairs` - Manage security pairs (vuln/patch binary pairs)
- `validate` - Run validation and view metrics
---
### stella groundtruth sources
Manage upstream symbol source connectors.
**Usage:**
```bash
stella groundtruth sources <command> [options]
```
**Subcommands:**
#### stella groundtruth sources list
List available symbol source connectors.
```bash
stella groundtruth sources list [--output-format table|json] [--verbose]
```
**Output:**
```
ID Display Name Status Last Sync
------------------------------------------------------------------------------------------
debuginfod-fedora Fedora Debuginfod Enabled 2026-01-22T10:00:00Z
debuginfod-ubuntu Ubuntu Debuginfod Enabled 2026-01-22T10:00:00Z
ddeb-ubuntu Ubuntu ddebs Enabled 2026-01-22T09:30:00Z
buildinfo-debian Debian Buildinfo Enabled 2026-01-22T08:00:00Z
secdb-alpine Alpine SecDB Enabled 2026-01-22T06:00:00Z
```
#### stella groundtruth sources enable
Enable a symbol source connector.
```bash
stella groundtruth sources enable <source> [--verbose]
```
**Arguments:**
- `<source>` - Source connector ID (e.g., `debuginfod-fedora`)
**Example:**
```bash
stella groundtruth sources enable debuginfod-fedora
```
#### stella groundtruth sources disable
Disable a symbol source connector.
```bash
stella groundtruth sources disable <source> [--verbose]
```
#### stella groundtruth sources sync
Synchronize symbol sources from upstream.
```bash
stella groundtruth sources sync [--source <id>] [--full] [--verbose]
```
**Options:**
| Option | Description |
|--------|-------------|
| `--source <id>` | Source connector ID (all if not specified) |
| `--full` | Perform a full sync instead of incremental |
**Example:**
```bash
# Incremental sync of all sources
stella groundtruth sources sync
# Full sync of Debian buildinfo
stella groundtruth sources sync --source buildinfo-debian --full
```
---
### stella groundtruth symbols
Query and search symbols in the corpus.
**Usage:**
```bash
stella groundtruth symbols <command> [options]
```
#### stella groundtruth symbols lookup
Look up symbols by debug ID (build-id).
```bash
stella groundtruth symbols lookup --debug-id <id> [--output-format table|json] [--verbose]
```
**Options:**
| Option | Alias | Description | Required |
|--------|-------|-------------|----------|
| `--debug-id` | `-d` | Debug ID (build-id) to look up | Yes |
| `--output-format` | `-O` | Output format: `table`, `json` | No |
**Example:**
```bash
stella groundtruth symbols lookup --debug-id 7f8a9b2c4d5e6f1a --output-format json
```
**Output (table):**
```
Binary: libcrypto.so.3
Architecture: x86_64
Distribution: debian-bookworm
Package: openssl@3.0.11-1
Symbol Count: 4523
Sources: debuginfod-fedora, buildinfo-debian
```
#### stella groundtruth symbols search
Search symbols by package or distribution.
```bash
stella groundtruth symbols search [--package <name>] [--distro <distro>] [--limit <n>] [--output-format table|json] [--verbose]
```
**Options:**
| Option | Alias | Description | Default |
|--------|-------|-------------|---------|
| `--package` | `-p` | Package name to search for | - |
| `--distro` | | Distribution filter (debian, ubuntu, alpine) | - |
| `--limit` | `-l` | Maximum results | 20 |
**Example:**
```bash
stella groundtruth symbols search --package openssl --distro debian --limit 50
```
---
### stella groundtruth pairs
Manage security pairs (vulnerable/patched binary pairs) in the corpus.
**Usage:**
```bash
stella groundtruth pairs <command> [options]
```
#### stella groundtruth pairs create
Create a new security pair.
```bash
stella groundtruth pairs create --cve <cve-id> --vuln-pkg <pkg=ver> --patch-pkg <pkg=ver> [--distro <distro>] [--verbose]
```
**Options:**
| Option | Description | Required |
|--------|-------------|----------|
| `--cve` | CVE identifier | Yes |
| `--vuln-pkg` | Vulnerable package (name=version) | Yes |
| `--patch-pkg` | Patched package (name=version) | Yes |
| `--distro` | Distribution (e.g., `debian-bookworm`) | No |
**Example:**
```bash
stella groundtruth pairs create \
--cve CVE-2024-1234 \
--vuln-pkg openssl=3.0.10-1 \
--patch-pkg openssl=3.0.11-1 \
--distro debian-bookworm
```
#### stella groundtruth pairs list
List security pairs in the corpus.
```bash
stella groundtruth pairs list [--cve <pattern>] [--package <name>] [--limit <n>] [--output-format table|json] [--verbose]
```
**Options:**
| Option | Alias | Description | Default |
|--------|-------|-------------|---------|
| `--cve` | | Filter by CVE (supports wildcards: `CVE-2024-*`) | - |
| `--package` | `-p` | Filter by package name | - |
| `--limit` | `-l` | Maximum results | 50 |
**Example:**
```bash
stella groundtruth pairs list --cve CVE-2024-* --package openssl --limit 100
```
**Output:**
```
Pair ID CVE Package Vuln Version Patch Version
-------------------------------------------------------------------------------
pair-001 CVE-2024-1234 openssl 3.0.10-1 3.0.11-1
pair-002 CVE-2024-5678 curl 8.4.0-1 8.5.0-1
```
#### stella groundtruth pairs delete
Delete a security pair from the corpus.
```bash
stella groundtruth pairs delete <pair-id> [--force] [--verbose]
```
**Options:**
| Option | Alias | Description |
|--------|-------|-------------|
| `--force` | `-f` | Skip confirmation prompt |
---
### stella groundtruth validate
Run validation harness against security pairs.
**Usage:**
```bash
stella groundtruth validate <command> [options]
```
#### stella groundtruth validate run
Run validation on security pairs.
```bash
stella groundtruth validate run [--pairs <pattern>] [--matcher <type>] [--output <path>] [--parallel <n>] [--verbose]
```
**Options:**
| Option | Alias | Description | Default |
|--------|-------|-------------|---------|
| `--pairs` | `-p` | Pair filter pattern (e.g., `openssl:CVE-2024-*`) | all |
| `--matcher` | `-m` | Matcher type: `semantic-diffing`, `hash-based`, `hybrid` | `semantic-diffing` |
| `--output` | `-o` | Output file for validation report | - |
| `--parallel` | | Maximum parallel validations | 4 |
**Example:**
```bash
stella groundtruth validate run \
--pairs "openssl:CVE-2024-*" \
--matcher semantic-diffing \
--parallel 8 \
--output validation-report.md
```
**Output:**
```
Validating pairs: 10/10
Validation complete. Run ID: vr-20260122100532
Function Match Rate: 94.2%
False-Negative Rate: 2.1%
SBOM Hash Stability: 3/3
Report written to: validation-report.md
```
#### stella groundtruth validate metrics
View metrics for a validation run.
```bash
stella groundtruth validate metrics --run-id <id> [--output-format table|json] [--verbose]
```
**Options:**
| Option | Alias | Description | Required |
|--------|-------|-------------|----------|
| `--run-id` | `-r` | Validation run ID | Yes |
**Example:**
```bash
stella groundtruth validate metrics --run-id vr-20260122100532 --output-format json
```
**Output (table):**
```
Run ID: vr-20260122100532
Duration: 2026-01-22T10:00:00Z - 2026-01-22T10:15:32Z
Pairs: 48/50 successful
Function Match Rate: 94.2%
False-Negative Rate: 2.1%
SBOM Hash Stability: 3/3
Verify Time (p50/p95): 423ms / 1.2s
```
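The JSON output can feed automated quality gates. A minimal sketch, assuming the payload exposes a `functionMatchRate` field (the exact schema may differ), using a sample payload in place of a live call:

```bash
# Sample payload standing in for:
#   stella groundtruth validate metrics --run-id <id> --output-format json
# `functionMatchRate` is an assumed field name.
METRICS_JSON='{"functionMatchRate":94.2,"falseNegativeRate":2.1}'

# Extract the rate without jq, then gate on a 90% floor.
RATE=$(printf '%s' "$METRICS_JSON" | sed -n 's/.*"functionMatchRate":\([0-9.]*\).*/\1/p')
if awk "BEGIN { exit !($RATE >= 90) }"; then
  echo "match rate OK: $RATE"
else
  echo "match rate below threshold: $RATE" >&2
  exit 1
fi
```

In a real pipeline, replace `METRICS_JSON` with the captured command output.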
#### stella groundtruth validate export
Export validation report.
```bash
stella groundtruth validate export --run-id <id> --output <path> [--format <fmt>] [--verbose]
```
**Options:**
| Option | Alias | Description | Default |
|--------|-------|-------------|---------|
| `--run-id` | `-r` | Validation run ID | (required) |
| `--output` | `-o` | Output file path | (required) |
| `--format` | `-f` | Export format: `markdown`, `html`, `json` | `markdown` |
**Example:**
```bash
stella groundtruth validate export \
--run-id vr-20260122100532 \
--format markdown \
--output validation-report.md
```
**See Also:** [Ground-Truth CLI Guide](../ground-truth-cli.md)
---
### stella groundtruth bundle
Manage evidence bundles for offline verification of patch provenance.
**Sprint:** SPRINT_20260121_036_BinaryIndex_golden_corpus_bundle_verification
**Usage:**
```bash
stella groundtruth bundle <command> [options]
```
**Subcommands:**
- `export` - Create evidence bundles for air-gapped environments
- `import` - Import and verify evidence bundles
#### stella groundtruth bundle export
Export evidence bundles containing pre/post binaries, SBOMs, delta-sig predicates, and timestamps.
```bash
stella groundtruth bundle export [options]
```
**Options:**
| Option | Description | Required |
|--------|-------------|----------|
| `--packages <list>` | Comma-separated package names (e.g., `openssl,curl`) | Yes |
| `--distros <list>` | Comma-separated distributions (e.g., `debian,ubuntu`) | Yes |
| `--output <path>` | Output bundle path (.tar.gz or .oci.tar) | Yes |
| `--sign-with <signer>` | Signing method: `cosign`, `sigstore`, `none` | No |
| `--include-debug` | Include debug symbols | No |
| `--include-kpis` | Include KPI validation results | No |
| `--include-timestamps` | Include RFC 3161 timestamps | No |
**Example:**
```bash
stella groundtruth bundle export \
--packages openssl,zlib,glibc \
--distros debian,fedora \
--output evidence/security-bundle.tar.gz \
--sign-with cosign \
--include-debug \
--include-kpis \
--include-timestamps
```
**Exit Codes:**
- `0` - Bundle created successfully
- `1` - Bundle creation failed
- `2` - Invalid input or configuration error
#### stella groundtruth bundle import
Import and verify evidence bundles in air-gapped environments.
```bash
stella groundtruth bundle import [options]
```
**Options:**
| Option | Description | Required |
|--------|-------------|----------|
| `--input <path>` | Input bundle path | Yes |
| `--verify-signature` | Verify bundle signatures | No |
| `--trusted-keys <path>` | Path to trusted public keys | No |
| `--trust-profile <path>` | Trust profile for verification | No |
| `--output <path>` | Output verification report | No |
| `--format <fmt>` | Report format: `markdown`, `json`, `html` | No |
**Example:**
```bash
stella groundtruth bundle import \
--input symbol-bundle.tar.gz \
--verify-signature \
--trusted-keys /etc/stellaops/trusted-keys.pub \
--trust-profile /etc/stellaops/trust-profiles/global.json \
--output verification-report.md
```
**Verification Steps:**
1. Validate bundle manifest signature
2. Verify all blob digests match manifest
3. Validate DSSE envelope signatures against trusted keys
4. Verify RFC 3161 timestamps against trusted TSA certificates
5. Run IR matcher to confirm patched functions
6. Verify SBOM canonical hash matches signed predicate
7. Output verification report with KPI line items
**Exit Codes:**
- `0` - All verifications passed
- `1` - One or more verifications failed
- `2` - Invalid input or configuration error
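In CI, these exit codes can drive messaging directly. A small sketch mapping the documented codes to human-readable messages (wiring to `stella` is described in the note below):

```bash
# Map documented `bundle import` exit codes to human-readable messages.
describe_exit() {
  case "$1" in
    0) echo "all verifications passed" ;;
    1) echo "one or more verifications failed" ;;
    2) echo "invalid input or configuration error" ;;
    *) echo "unknown exit code: $1" ;;
  esac
}

describe_exit 0
```

In a pipeline, run the import, capture `$?`, pass it to `describe_exit`, and re-exit with the original code so downstream stages see the real status.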
---
### stella groundtruth validate check
Check KPI regression against baseline thresholds.
**Sprint:** SPRINT_20260121_036_BinaryIndex_golden_corpus_bundle_verification
```bash
stella groundtruth validate check [options]
```
**Options:**
| Option | Description | Default |
|--------|-------------|---------|
| `--results <path>` | Path to validation results JSON | (required) |
| `--baseline <path>` | Path to baseline JSON | (required) |
| `--precision-threshold <pp>` | Max precision drop (percentage points) | 0.01 |
| `--recall-threshold <pp>` | Max recall drop (percentage points) | 0.01 |
| `--fn-rate-threshold <pp>` | Max FN rate increase (percentage points) | 0.01 |
| `--determinism-threshold <rate>` | Min determinism rate | 1.0 |
| `--ttfrp-threshold <pct>` | Max TTFRP p95 increase (percentage) | 0.20 |
| `--output <path>` | Output report path | stdout |
| `--format <fmt>` | Report format: `markdown`, `json` | `markdown` |
**Example:**
```bash
stella groundtruth validate check \
--results bench/results/20260122.json \
--baseline bench/baselines/current.json \
--precision-threshold 0.01 \
--recall-threshold 0.01 \
--fn-rate-threshold 0.01 \
--determinism-threshold 1.0 \
--output regression-report.md
```
**Regression Gates:**
| Metric | Threshold | Action |
|--------|-----------|--------|
| Precision | Drops > threshold | Fail |
| Recall | Drops > threshold | Fail |
| False-negative rate | Increases > threshold | Fail |
| Deterministic replay | Drops below threshold | Fail |
| TTFRP p95 | Increases > threshold | Warn |
**Exit Codes:**
- `0` - All gates passed
- `1` - One or more gates failed
- `2` - Invalid input or configuration error
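Each gate reduces to a simple comparison. A sketch of the precision gate, treating thresholds as absolute rate deltas (as the defaults above suggest) with illustrative values:

```bash
# Illustrative KPI values; in practice these come from the results/baseline JSON.
baseline_precision=0.9500
current_precision=0.9380
threshold=0.01

# Fail when the drop exceeds the threshold.
if awk "BEGIN { exit !(($baseline_precision - $current_precision) > $threshold) }"; then
  echo "FAIL: precision dropped more than $threshold"
else
  echo "PASS"
fi
```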
---
### stella groundtruth baseline
Manage KPI baselines for regression detection.
**Sprint:** SPRINT_20260121_036_BinaryIndex_golden_corpus_bundle_verification
**Usage:**
```bash
stella groundtruth baseline <command> [options]
```
**Subcommands:**
- `update` - Update baseline from validation results
- `show` - Display baseline contents
#### stella groundtruth baseline update
Update baseline from validation results.
```bash
stella groundtruth baseline update [options]
```
**Options:**
| Option | Description | Required |
|--------|-------------|----------|
| `--from-results <path>` | Path to validation results JSON | Yes |
| `--output <path>` | Output baseline path | Yes |
| `--description <text>` | Description for the baseline update | No |
| `--source <commit>` | Source commit SHA for traceability | No |
**Example:**
```bash
stella groundtruth baseline update \
--from-results bench/results/20260122.json \
--output bench/baselines/current.json \
--description "Post algorithm-v2.3 update" \
--source "$(git rev-parse HEAD)"
```
#### stella groundtruth baseline show
Display baseline contents.
```bash
stella groundtruth baseline show --baseline <path> [--format table|json]
```
**Options:**
| Option | Description | Default |
|--------|-------------|---------|
| `--baseline <path>` | Path to baseline JSON | (required) |
| `--format` | Output format: `table`, `json` | `table` |
**Output (table):**
```
Baseline ID: baseline-20260122120000
Created: 2026-01-22T12:00:00Z
Source: abc123def456
Description: Post-semantic-diffing-v2 baseline
KPIs:
Precision: 0.9500
Recall: 0.9200
False Negative Rate: 0.0800
Determinism: 1.0000
TTFRP p95: 150ms
```
**See Also:** [Ground-Truth CLI Guide](../ground-truth-cli.md)
---
## Reporting & Export Commands
### stella report
# Ground-Truth Corpus CLI Guide
**Sprint:** SPRINT_20260121_035_BinaryIndex_golden_corpus_connectors_cli
## Overview
The `stella groundtruth` command group provides CLI access to the ground-truth corpus for patch-paired binary verification. This corpus enables precision validation of security advisories by maintaining symbol and binary pairs from upstream distribution sources.
## Use Cases
- **Security teams**: Validate patch presence in production binaries
- **Compliance auditors**: Generate evidence bundles for air-gapped verification
- **DevSecOps**: Integrate corpus validation into CI/CD pipelines
- **Researchers**: Query symbol databases for vulnerability analysis
## Prerequisites
- Stella CLI installed and configured
- Backend connectivity to Platform service (or offline bundle)
- For sync operations: network access to upstream sources
## Command Structure
```
stella groundtruth
├── sources # Manage symbol source connectors
│ ├── list # List available connectors
│ ├── enable # Enable a connector
│ ├── disable # Disable a connector
│ └── sync # Sync from upstream
├── symbols # Query symbols in corpus
│ ├── lookup # Lookup by debug ID
│ └── search # Search by package/distro
├── pairs # Manage security pairs
│ ├── create # Create vuln/patch pair
│ ├── list # List existing pairs
│ └── delete # Remove a pair
└── validate # Run validation harness
├── run # Execute validation
├── metrics # View run metrics
└── export # Export report
```
## Source Connectors
The ground-truth corpus ingests data from multiple upstream sources:
| Connector ID | Distribution | Data Type | Description |
|--------------|--------------|-----------|-------------|
| `debuginfod-fedora` | Fedora | Debug symbols | ELF debuginfo via debuginfod protocol |
| `debuginfod-ubuntu` | Ubuntu | Debug symbols | ELF debuginfo via debuginfod protocol |
| `ddeb-ubuntu` | Ubuntu | Debug packages | `.ddeb` debug symbol packages |
| `buildinfo-debian` | Debian | Build metadata | `.buildinfo` reproducibility records |
| `secdb-alpine` | Alpine | Security DB | `secfixes` YAML from APKBUILD |
### List Sources
```bash
stella groundtruth sources list
# Output:
ID Display Name Status Last Sync
------------------------------------------------------------------------------------------
debuginfod-fedora Fedora Debuginfod Enabled 2026-01-22T10:00:00Z
debuginfod-ubuntu Ubuntu Debuginfod Enabled 2026-01-22T10:00:00Z
ddeb-ubuntu Ubuntu ddebs Enabled 2026-01-22T09:30:00Z
buildinfo-debian Debian Buildinfo Enabled 2026-01-22T08:00:00Z
secdb-alpine Alpine SecDB Enabled 2026-01-22T06:00:00Z
```
### Enable/Disable Sources
```bash
# Enable a source connector
stella groundtruth sources enable debuginfod-fedora
# Disable a source connector (stops future syncs)
stella groundtruth sources disable debuginfod-fedora
```
### Sync Sources
```bash
# Incremental sync of all enabled sources
stella groundtruth sources sync
# Full sync of a specific source
stella groundtruth sources sync --source buildinfo-debian --full
# Sync with verbose output
stella groundtruth sources sync --source ddeb-ubuntu -v
```
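Recurring syncs are typically scheduled outside the CLI. A hypothetical crontab entry (log path illustrative) that runs an incremental sync nightly at 02:00:

```bash
# m h dom mon dow  command
0 2 * * * stella groundtruth sources sync >> /var/log/stellaops/corpus-sync.log 2>&1
```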
## Symbol Operations
### Lookup by Debug ID
Query symbols using the ELF GNU Build-ID or equivalent identifier:
```bash
# Lookup by build-id
stella groundtruth symbols lookup --debug-id 7f8a9b2c4d5e6f1a
# JSON output
stella groundtruth symbols lookup --debug-id 7f8a9b2c4d5e6f1a --output-format json
```
**Example output:**
```
Binary: libcrypto.so.3
Architecture: x86_64
Distribution: debian-bookworm
Package: openssl@3.0.11-1
Symbol Count: 4523
Sources: debuginfod-fedora, buildinfo-debian
```
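On a live system the debug ID usually comes from the binary itself via `readelf -n` (binutils). A sketch using a sample note line in place of a real binary so it runs anywhere:

```bash
# Sample `readelf -n` line; on a real host use: readelf -n /path/to/binary
NOTE_LINE='    Build ID: 7f8a9b2c4d5e6f1a'

# Extract the hex build-id and feed it to the corpus lookup.
BUILD_ID=$(printf '%s\n' "$NOTE_LINE" | sed -n 's/.*Build ID: \([0-9a-f]*\).*/\1/p')
echo "build-id: $BUILD_ID"
# stella groundtruth symbols lookup --debug-id "$BUILD_ID"
```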
### Search Symbols
Search across the corpus by package name or distribution:
```bash
# Search by package
stella groundtruth symbols search --package openssl
# Filter by distribution
stella groundtruth symbols search --package openssl --distro debian
# Limit results
stella groundtruth symbols search --package curl --limit 100
```
## Security Pairs
Security pairs link vulnerable and patched binary versions for a specific CVE.
### Create a Pair
```bash
stella groundtruth pairs create \
--cve CVE-2024-1234 \
--vuln-pkg openssl=3.0.10-1 \
--patch-pkg openssl=3.0.11-1 \
--distro debian-bookworm
```
### List Pairs
```bash
# List all pairs
stella groundtruth pairs list
# Filter by CVE pattern
stella groundtruth pairs list --cve "CVE-2024-*"
# Filter by package
stella groundtruth pairs list --package openssl --limit 50
# JSON output
stella groundtruth pairs list --output-format json
```
**Example output:**
```
Pair ID CVE Package Vuln Version Patch Version
-------------------------------------------------------------------------------
pair-001 CVE-2024-1234 openssl 3.0.10-1 3.0.11-1
pair-002 CVE-2024-5678 curl 8.4.0-1 8.5.0-1
```
### Delete a Pair
```bash
# Delete with confirmation prompt
stella groundtruth pairs delete pair-001
# Skip confirmation
stella groundtruth pairs delete pair-001 --force
```
## Validation Harness
The validation harness runs end-to-end verification against security pairs.
### Run Validation
```bash
# Validate all pairs
stella groundtruth validate run
# Validate specific pairs (pattern match)
stella groundtruth validate run --pairs "openssl:CVE-2024-*"
# Use specific matcher
stella groundtruth validate run --matcher semantic-diffing
# Parallel validation with report output
stella groundtruth validate run \
--pairs "curl:*" \
--parallel 8 \
--output validation-report.md
```
**Matcher types:**
| Matcher | Description |
|---------|-------------|
| `semantic-diffing` | IR-level semantic comparison (default) |
| `hash-based` | Function hash matching |
| `hybrid` | Combined semantic + hash approach |
### View Metrics
```bash
stella groundtruth validate metrics --run-id vr-20260122100532
# JSON output
stella groundtruth validate metrics --run-id vr-20260122100532 --output-format json
```
**Example output:**
```
Run ID: vr-20260122100532
Duration: 2026-01-22T10:00:00Z - 2026-01-22T10:15:32Z
Pairs: 48/50 successful
Function Match Rate: 94.2%
False-Negative Rate: 2.1%
SBOM Hash Stability: 3/3
Verify Time (p50/p95): 423ms / 1.2s
```
### Export Reports
```bash
# Export as Markdown
stella groundtruth validate export \
--run-id vr-20260122100532 \
--format markdown \
--output report.md
# Export as HTML
stella groundtruth validate export \
--run-id vr-20260122100532 \
--format html \
--output report.html
# Export as JSON (machine-readable)
stella groundtruth validate export \
--run-id vr-20260122100532 \
--format json \
--output report.json
```
## CI/CD Integration
### GitHub Actions Example
```yaml
name: Corpus Validation
on:
  schedule:
    - cron: '0 6 * * 1' # Weekly on Monday
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - name: Sync corpus sources
        run: stella groundtruth sources sync
      - name: Run validation
        run: |
          stella groundtruth validate run \
            --matcher semantic-diffing \
            --parallel 4 \
            --output validation-${{ github.run_id }}.md | tee validate.log
          # Capture the run ID that `validate run` prints for the next step.
          sed -n 's/.*Run ID: //p' validate.log > run-id.txt
      - name: Check metrics
        run: |
          MATCH_RATE=$(stella groundtruth validate metrics --run-id "$(cat run-id.txt)" --output-format json | jq '.functionMatchRate')
          if (( $(echo "$MATCH_RATE < 90" | bc -l) )); then
            echo "Match rate below threshold: $MATCH_RATE%"
            exit 1
          fi
```
### GitLab CI Example
```yaml
corpus-validation:
  stage: verify
  script:
    - stella groundtruth sources sync --source buildinfo-debian
    - stella groundtruth validate run --pairs "openssl:*" --output report.md
  artifacts:
    paths:
      - report.md
    expire_in: 1 week
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
## Offline Usage
For air-gapped environments, use offline bundles:
```bash
# Export corpus for offline use
stella bundle export \
--include-corpus \
--output corpus-bundle-$(date +%F).tar.gz
# Import on air-gapped system
stella bundle import --package corpus-bundle-2026-01-22.tar.gz
# Run validation offline
stella groundtruth validate run --offline
```
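Before importing on the air-gapped side, it is worth verifying the bundle's integrity with a detached checksum (independent of signature verification). A self-contained sketch with a throwaway file standing in for the real bundle:

```bash
# Create a stand-in bundle so the sketch is self-contained.
tmp=$(mktemp -d)
echo "bundle-bytes" > "$tmp/corpus-bundle.tar.gz"

# Connected side: record the checksum next to the bundle.
( cd "$tmp" && sha256sum corpus-bundle.tar.gz > corpus-bundle.tar.gz.sha256 )

# Air-gapped side: verify before `stella bundle import`.
( cd "$tmp" && sha256sum -c corpus-bundle.tar.gz.sha256 )
```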
## Troubleshooting
### Common Issues
**Sync fails with network error:**
```bash
# Check source status
stella groundtruth sources list
# Retry with verbose output
stella groundtruth sources sync --source debuginfod-ubuntu -v
```
**Symbol lookup returns no results:**
```bash
# Verify debug-id format (hex string)
stella groundtruth symbols lookup --debug-id abc123 -v
# Try searching by package instead
stella groundtruth symbols search --package libcrypto
```
**Validation metrics show low match rate:**
- Check that both vuln and patch binaries are present in corpus
- Verify symbol sources are synced and enabled
- Consider using `hybrid` matcher for complex cases
## See Also
- [CLI Command Reference](commands/reference.md#ground-truth-corpus-commands)
- [BinaryIndex Architecture](../../binary-index/architecture.md)
- [Golden Corpus KPIs](../../benchmarks/golden-corpus-kpis.md)
- [Air-Gap Bundle Guide](../../modules/airgap/README.md)
# Trust Profiles
Trust profiles are offline trust-store templates for bundle verification. They define trust roots, Rekor public keys, and TSA roots in a single file so operators can apply a profile into a local trust store.
Default profile location:
- `etc/trust-profiles/*.trustprofile.json`
- Assets referenced by profiles live under `etc/trust-profiles/assets/`
Profile structure (summary):
- `profileId`: stable identifier (used by CLI commands)
- `trustRoots[]`: signing trust roots (PEM files)
- `rekorKeys[]`: Rekor public keys for offline inclusion proof verification
- `tsaRoots[]`: TSA roots for RFC3161 verification
- `metadata`: optional compliance metadata
CLI usage:
- `stella trust-profile list`
- `stella trust-profile show <profile-id>`
- `stella trust-profile apply <profile-id> --output <dir>`
Profile lookup overrides:
- `--profiles-dir <path>` to point at a custom profiles directory
- `STELLAOPS_TRUST_PROFILES` environment variable for default lookup
Apply output:
- `trust-manifest.json` (trust roots manifest for offline verification)
- `trust-profile.json` (resolved profile copy)
- `trust-root.pem` (combined trust roots for CLI verification)
- `trust-roots/`, `rekor/`, `tsa/` folders with PEM assets
Example apply workflow:
1. `stella trust-profile apply global --output ./trust-store`
2. `stella bundle verify --trust-root ./trust-store/trust-root.pem`
Note:
- Default profiles ship with placeholder roots for scaffolding only. Replace them with compliance-approved roots before production use.
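A hypothetical profile illustrating the shape summarized above (field names follow the summary; the exact schema and asset paths may differ):

```json
{
  "profileId": "global",
  "trustRoots": ["assets/roots/signing-root.pem"],
  "rekorKeys": ["assets/rekor/rekor.pub"],
  "tsaRoots": ["assets/tsa/tsa-root.pem"],
  "metadata": {
    "description": "Scaffolding profile with placeholder roots"
  }
}
```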
---
Concelier normalizes incoming CycloneDX 1.7 and SPDX 3.0.1 documents into the internal `ParsedSbom` model for matching and downstream analysis.
Current extraction coverage (SPRINT_20260119_015):
- Document metadata: format, specVersion, serialNumber, created, name, profiles, sbomType, namespace/imports
- Components: bomRef, type, name, version, purl, cpe, hashes (including SPDX verifiedUsing), license IDs/expressions, license text (base64 decode), external references, properties, scope/modified, supplier/manufacturer, evidence, pedigree, cryptoProperties, modelCard (CycloneDX), swid (CycloneDX), SPDX AI model parameters, SPDX dataset metadata, SPDX file/snippet properties
- Licensing: SPDX Licensing profile elements (listed/custom licenses, license additions, AND/OR/WITH/or-later operators), with OSI/FSF flags and deprecated IDs captured
- Dependencies: component dependency edges (CycloneDX dependencies, SPDX relationships; DependencyOf is inverted to DependsOn)
- Vulnerabilities: CycloneDX embedded vulnerabilities (ratings, affects, VEX analysis), SPDX Security profile vulnerabilities + VEX assessments
- Services: endpoints, authentication, crossesTrustBoundary, data flows, licenses, external references (CycloneDX)
- Formulation: components, workflows, tasks, properties (CycloneDX)
- Declarations/definitions: attestations, affirmations, standards, signatures (CycloneDX)
- Compositions/annotations (CycloneDX)
- Build metadata: buildId, buildType, timestamps, config source, environment, parameters (SPDX)
- Document properties
Notes:
- License expressions can be validated against embedded SPDX license/exception lists via `ILicenseExpressionValidator`.
- Matching currently uses PURL and CPE; additional fields are stored for downstream consumers.
## VEX consumption
When SBOM vulnerabilities include embedded VEX analysis, Concelier consumes the statements
to filter or annotate advisory matches. NotAffected statements can be filtered when policy
allows, and trust evaluation checks timestamps, signatures (when provided), and justification
requirements for not-affected claims.
Configuration (YAML or JSON), loaded from `Concelier:VexConsumption:PolicyPath`:
```yaml
vexConsumptionPolicy:
  trustEmbeddedVex: true
  minimumTrustLevel: Unverified
  filterNotAffected: true
  signatureRequirements:
    requireSignedVex: false
    trustedSigners:
      - "https://example.com/keys/vex-signer"
  timestampRequirements:
    maxAgeHours: 720
    requireTimestamp: true
  conflictResolution:
    strategy: mostRecent
    logConflicts: true
  mergePolicy:
    mode: union
    externalSources:
      - type: repository
        url: "https://vex.example.com/api"
  justificationRequirements:
    requireJustificationForNotAffected: true
    acceptedJustifications:
      - component_not_present
      - vulnerable_code_not_present
      - vulnerable_code_not_in_execute_path
      - inline_mitigations_already_exist
```
Reports are emitted via `VexConsumptionReporter` in JSON, SARIF, and text formats.
Runtime overrides can be supplied via `Concelier:VexConsumption` (Enabled, IgnoreVex,
PolicyPath, TrustEmbeddedVex, MinimumTrustLevel, FilterNotAffected, ExternalVexSources).
```sql
CREATE TABLE vuln.sbom_registry (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    artifact_id TEXT NOT NULL,
    digest TEXT NOT NULL,
    format TEXT NOT NULL CHECK (format IN ('cyclonedx', 'spdx')),
    spec_version TEXT NOT NULL,
    primary_name TEXT,
    primary_version TEXT,
    component_count INT NOT NULL DEFAULT 0,
    affected_count INT NOT NULL DEFAULT 0,
    source TEXT NOT NULL,
    tenant_id TEXT,
    registered_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    last_matched_at TIMESTAMPTZ,
    CONSTRAINT uq_sbom_registry_digest UNIQUE (digest)
);

CREATE TABLE vuln.sbom_canonical_match (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    sbom_id UUID NOT NULL REFERENCES vuln.sbom_registry(id),
    canonical_id UUID NOT NULL REFERENCES vuln.advisory_canonical(id),
    purl TEXT NOT NULL,
    match_method TEXT NOT NULL,
    confidence NUMERIC(3,2) NOT NULL DEFAULT 1.0,
    is_reachable BOOLEAN NOT NULL DEFAULT false,
    is_deployed BOOLEAN NOT NULL DEFAULT false,
    matched_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    CONSTRAINT uq_sbom_canonical_match UNIQUE (sbom_id, canonical_id, purl)
);

CREATE TABLE concelier.sbom_documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    serial_number TEXT NOT NULL,
    artifact_digest TEXT,
    format TEXT NOT NULL CHECK (format IN ('cyclonedx', 'spdx')),
    spec_version TEXT NOT NULL,
    component_count INT NOT NULL DEFAULT 0,
    service_count INT NOT NULL DEFAULT 0,
    vulnerability_count INT NOT NULL DEFAULT 0,
    has_crypto BOOLEAN NOT NULL DEFAULT false,
    has_services BOOLEAN NOT NULL DEFAULT false,
    has_vulnerabilities BOOLEAN NOT NULL DEFAULT false,
    license_ids TEXT[] NOT NULL DEFAULT '{}',
    license_expressions TEXT[] NOT NULL DEFAULT '{}',
    sbom_json JSONB NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    CONSTRAINT uq_concelier_sbom_serial UNIQUE (serial_number),
    CONSTRAINT uq_concelier_sbom_artifact UNIQUE (artifact_digest)
);
```
---
Provide a single, deterministic aggregation layer for cross-service UX workflows.
- Persist dashboard personalization and layout preferences.
- Provide global search aggregation across entities.
- Surface platform metadata for UI bootstrapping (version, build, offline status).
- Expose analytics lake aggregates for SBOM, vulnerability, and attestation reporting.
## API surface (v1)
### Metadata
- GET `/api/v1/platform/metadata`
- Response includes a capabilities list for UI bootstrapping; analytics capability is reported only when analytics storage is configured.
### Analytics (SBOM lake)
- GET `/api/analytics/suppliers`
- GET `/api/analytics/licenses`
- GET `/api/analytics/vulnerabilities`
- GET `/api/analytics/backlog`
- GET `/api/analytics/attestation-coverage`
- GET `/api/analytics/trends/vulnerabilities`
- GET `/api/analytics/trends/components`
## Data model
- `platform.dashboard_preferences` (dashboard layout, widgets, filters)
- Preferences: `ui.preferences.read`, `ui.preferences.write`
- Search: `search.read` plus downstream service scopes (`findings:read`, `policy:read`, etc.)
- Metadata: `platform.metadata.read`
- Analytics: `analytics.read`
## Determinism and offline posture
- Stable ordering with explicit sort keys and deterministic tiebreakers.
- All timestamps in UTC ISO-8601.
- Cache last-known snapshots for offline rendering with "data as of" markers.
## Analytics ingestion configuration
Analytics ingestion runs inside the Platform WebService and subscribes to Scanner,
Concelier, and Attestor streams. Configure ingestion with `Platform:AnalyticsIngestion`:
```yaml
Platform:
  AnalyticsIngestion:
    Enabled: true
    PostgresConnectionString: "" # optional; defaults to Platform:Storage
    AllowedTenants: ["tenant-a"]
    Streams:
      ScannerStream: "orchestrator:events"
      ConcelierObservationStream: "concelier:advisory.observation.updated:v1"
      ConcelierLinksetStream: "concelier:advisory.linkset.updated:v1"
      AttestorStream: "attestor:events"
      StartFromBeginning: false
    Cas:
      RootPath: "/var/lib/stellaops/cas"
      DefaultBucket: "attestations"
    Attestations:
      BundleUriTemplate: "bundle:{digest}"
`BundleUriTemplate` supports `{digest}` and `{hash}` placeholders. The `bundle:` scheme
maps to `cas://<DefaultBucket>/{digest}` by default. Verify offline bundles with
`stella bundle verify` before ingestion.
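A sketch of the placeholder substitution (template and digest values are illustrative):

```bash
TEMPLATE='bundle:{digest}'
DIGEST='sha256:abc123'

# Substitute the {digest} placeholder; the bundle: scheme then resolves to
# cas://<DefaultBucket>/<digest> per the default mapping described above.
URI=$(printf '%s' "$TEMPLATE" | sed "s/{digest}/$DIGEST/")
echo "$URI"
```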
## Analytics maintenance configuration
Analytics rollups + materialized view refreshes are driven by
`PlatformAnalyticsMaintenanceService` when analytics storage is configured.
Use `BackfillDays` to recompute recent rollups on the first maintenance run (set to `0` to disable).
```yaml
Platform:
  Storage:
    PostgresConnectionString: "Host=...;Database=...;Username=...;Password=..."
  AnalyticsMaintenance:
    Enabled: true
    RunOnStartup: true
    IntervalMinutes: 1440
    ComputeDailyRollups: true
    RefreshMaterializedViews: true
    BackfillDays: 7
## Observability
- Metrics: `platform.aggregate.latency_ms`, `platform.aggregate.errors_total`, `platform.aggregate.cache_hits_total`
---
The service operates strictly downstream of the **Aggregation-Only Contract (AOC)**.
- Compile and evaluate `stella-dsl@1` policy packs into deterministic verdicts.
- Join SBOM inventory, Concelier advisories, and Excititor VEX evidence via canonical linksets and equivalence tables.
- Evaluate SBOM license expressions against policy (SPDX AND/OR/WITH/+), emitting compliance findings and attribution requirements for gate decisions.
- Materialise effective findings (`effective_finding_{policyId}`) with append-only history and produce explain traces.
- Emit CVSS v4.0 receipts with canonical hashing and policy replay/backfill rules; store tenant-scoped receipts with RBAC; export receipts deterministically (UTC/fonts/order) and flag v3.1→v4.0 conversions (see Sprint 0190 CVSS-GAPS-190-014 / `docs/modules/policy/cvss-v4.md`).
- Emit per-finding OpenVEX decisions anchored to reachability evidence, forward them to Signer/Attestor for DSSE/Rekor, and publish the resulting artifacts for bench/verification consumers.
**Usage in policies:**
Determinization scores are exposed to SPL policies via the `signals.trust.*` and `signals.uncertainty.*` namespaces. Use `signals.uncertainty.entropy` to access entropy values and `signals.trust.score` for aggregated trust scores that combine VEX, reachability, runtime, and other signals with decay/weighting.
### 3.2 - License compliance configuration
License compliance evaluation runs during SBOM evaluation when enabled in
`licenseCompliance` settings.
```json
{
  "licenseCompliance": {
    "enabled": true,
    "policyPath": "policies/license-policy.yaml"
  }
}
```
- `sbom.license` exposes the compliance report (findings, conflicts, inventory).
- `sbom.license_status` exposes `pass`, `warn`, or `fail` (or `unknown` when disabled).
- Failures set the policy verdict status to `blocked` and emit `license.*` annotations.
- Trademark notice obligations are tracked alongside attribution requirements and produce warn-level findings.
- License compliance reports support JSON, text/markdown/html, legal-review, and PDF outputs.
- Category breakdown includes percent totals and chart renderings (ASCII chart in text/markdown/legal-review/PDF, pie chart in HTML).
### 3.3 - NTIA compliance configuration
NTIA minimum-elements validation runs when enabled under `ntiaCompliance`.
```json
{
  "ntiaCompliance": {
    "enabled": true,
    "enforceGate": false,
    "policyPath": "policies/ntia-policy.yaml"
  }
}
```
- `sbom.ntia` exposes NTIA compliance details (elements, findings, supplier status).
- `sbom.ntia_status` exposes `pass`, `warn`, `fail`, or `unknown`.
- NTIA compliance can be configured as an advisory-only check or a release gate via `enforceGate`.
- The NTIA policy supports element selection, supplier validation (placeholder patterns, trusted/blocked lists), and framework-specific requirements.
- Reports support JSON, text/markdown/html, and PDF output for regulatory submissions.
---
## 4 · Data Model & Persistence
### 4.1 Collections
---
### Run All Tests
```bash
dotnet test src/ReleaseOrchestrator/StellaOps.ReleaseOrchestrator.sln
```
### Run Only Unit Tests
```bash
dotnet test src/ReleaseOrchestrator/StellaOps.ReleaseOrchestrator.sln --filter "Category=Unit"
```
### Run Only Integration Tests
```bash
dotnet test src/ReleaseOrchestrator/StellaOps.ReleaseOrchestrator.sln --filter "Category=Integration"
```
### Run Specific Test Class
```bash
dotnet test --filter "FullyQualifiedName~PromotionValidatorTests"
```
### Run with Coverage
```bash
dotnet test src/ReleaseOrchestrator/StellaOps.ReleaseOrchestrator.sln --collect:"XPlat Code Coverage"
```
---

**Boundaries.**
* Scanner **does not** produce PASS/FAIL. The backend (Policy + Excititor + Concelier) decides presentation and verdicts.
* Scanner **does not** keep third-party SBOM warehouses. It may **bind** to existing attestations for exact hashes.
* Core analyzers are **deterministic** (no fuzzy identity). Optional heuristic plug-ins (e.g., patch-presence) run under explicit flags and never contaminate the core SBOM.
SBOM dependency reachability inference uses dependency graphs to reduce false positives and
apply reachability-aware severity adjustments. See `src/Scanner/docs/sbom-reachability-filtering.md`
for policy configuration and reporting expectations.
---
## 1) Solution & project layout
The emitted `buildId` metadata is preserved in component hashes, diff payloads, and `/policy/runtime` responses so operators can pivot from SBOM entries → runtime events → `debug/.build-id/<aa>/<rest>.debug` within the Offline Kit or release bundle.
### 5.5.1 Service security analysis (Sprint 20260119_016)
When an SBOM path is provided, the worker runs the `service-security` stage to parse CycloneDX services and emit a deterministic report covering:
- Endpoint scheme hygiene (HTTP/WS/plaintext protocol detection).
- Authentication and trust-boundary enforcement.
- Sensitive data flow exposure and unencrypted transfers.
- Deprecated service versions and rate-limiting metadata gaps.
Inputs are passed via scan metadata (`sbom.path` or `sbomPath`, plus `sbom.format`). The report is attached as a surface observation payload (`service-security.report`) and keyed in the analysis store for downstream policy and report assembly. See `src/Scanner/docs/service-security.md` for the policy schema and output formats.
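For example, a scan request could carry the SBOM reference in its metadata map. The envelope shape here is an illustrative assumption; only the key names (`sbom.path`, `sbom.format`) come from the documentation above:

```json
{
  "metadata": {
    "sbom.path": "/scans/acme-api/sbom.cdx.json",
    "sbom.format": "cyclonedx-json"
  }
}
```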
### 5.5.2 CBOM crypto analysis (Sprint 20260119_017)
When an SBOM includes CycloneDX `cryptoProperties`, the worker runs the `crypto-analysis` stage to produce a crypto inventory and compliance findings for weak algorithms, short keys, deprecated protocol versions, certificate hygiene, and post-quantum readiness. The report is attached as a surface observation payload (`crypto-analysis.report`) and keyed in the analysis store for downstream evidence workflows. See `src/Scanner/docs/crypto-analysis.md` for the policy schema and inventory export formats.
### 5.5.3 AI/ML supply chain security (Sprint 20260119_018)
When an SBOM includes CycloneDX `modelCard` or SPDX AI profile data, the worker runs the `ai-ml-security` stage to evaluate model governance readiness. The report covers model card completeness, training data provenance, bias/fairness checks, safety risk assessment coverage, and provenance verification. The report is attached as a surface observation payload (`ai-ml-security.report`) and keyed in the analysis store for policy evaluation and audit trails. See `src/Scanner/docs/ai-ml-security.md` for policy schema, CLI toggles, and binary analysis conventions.
### 5.5.4 Build provenance verification (Sprint 20260119_019)
When an SBOM includes CycloneDX formulation or SPDX build profile data, the worker runs the `build-provenance` stage to verify provenance completeness, builder trust, source integrity, hermetic build requirements, and optional reproducibility checks. The report is attached as a surface observation payload (`build-provenance.report`) and keyed in the analysis store for policy enforcement and audit evidence. See `src/Scanner/docs/build-provenance.md` for policy schema, CLI toggles, and report formats.
### 5.5.5 SBOM dependency reachability (Sprint 20260119_022)
When configured, the worker runs the `reachability-analysis` stage to infer dependency reachability from SBOM graphs and optionally refine it with a `richgraph-v1` call graph. Advisory matches are filtered or severity-adjusted using `VulnerabilityReachabilityFilter`, with false-positive reduction metrics recorded for auditability. The stage attaches:
- `reachability.report` (JSON) for component and vulnerability reachability.
- `reachability.report.sarif` (SARIF 2.1.0) for toolchain export.
- `reachability.graph.dot` (GraphViz) for dependency visualization.
Configuration lives in `src/Scanner/docs/sbom-reachability-filtering.md`, including policy schema, metadata keys, and report outputs.
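The DOT export can be rendered with standard GraphViz tooling. A trimmed sketch of what `reachability.graph.dot` might contain (node names and edge/label attributes are illustrative, not the actual export schema):

```dot
digraph reachability {
  "app@1.0.0"    -> "libfoo@2.3.1" [label="runtime"];
  "libfoo@2.3.1" -> "libbar@0.9.0" [label="transitive"];
  "libbar@0.9.0" [color=gray, label="libbar@0.9.0\n(unreachable)"];
}
```

Render it with, e.g., `dot -Tsvg reachability.graph.dot -o reachability.svg`.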
### 5.6 DSSE attestation (via Signer/Attestor)
* WebService constructs **predicate** with `image_digest`, `stellaops_version`, `license_id`, `policy_digest?` (when emitting **final reports**), timestamps.
* Calls **Signer** (requires **OpTok + PoE**); Signer verifies **entitlement + scanner image integrity** and returns **DSSE bundle**.