Implement ledger metrics for observability and add tests for Ruby packages endpoints

- Added `LedgerMetrics` class to record write latency and total events for ledger operations.
- Created comprehensive tests for Ruby packages endpoints, covering scenarios for missing inventory, successful retrieval, and identifier handling.
- Introduced `TestSurfaceSecretsScope` for managing environment variables during tests.
- Developed `ProvenanceMongoExtensions` for attaching DSSE provenance and trust information to event documents.
- Implemented `EventProvenanceWriter` and `EventWriter` classes for managing event provenance in MongoDB.
- Established MongoDB indexes for efficient querying of events based on provenance and trust.
- Added models and JSON parsing logic for DSSE provenance and trust information.
2025-11-13 09:29:09 +02:00
parent 151f6b35cc
commit 61f963fd52
101 changed files with 5881 additions and 1776 deletions
--- a/docs/modules/excititor/README.md
+++ b/docs/modules/excititor/README.md
@@ -4,6 +4,7 @@ Excititor converts heterogeneous VEX feeds into raw observations and linksets th

 ## Latest updates (2025-11-05)
 - Link-Not-Merge readiness: release note [Excitor consensus beta](../../updates/2025-11-05-excitor-consensus-beta.md) captures how Excititor feeds power the Excititor consensus beta (sample payload in [consensus JSON](../../vex/consensus-json.md)).
+- Added [observability guide](operations/observability.md) describing the evidence metrics emitted by `EXCITITOR-AIAI-31-003` (request counters, statement histogram, signature status, guard violations) so Ops/Lens can alert on misuse.
 - README now points policy/UI teams to the upcoming consensus integration work.
 - DSSE packaging for consensus bundles and Export Center hooks are documented in the [beta release note](../../updates/2025-11-05-excitor-consensus-beta.md); operators mirroring Excititor exports must verify detached JWS artefacts (`bundle.json.jws`) alongside each bundle.
 - Follow-ups called out in the release note (Policy weighting knobs `POLICY-ENGINE-30-101`, CLI verb `CLI-VEX-30-002`) remain in-flight and are tracked in `/docs/implplan/SPRINT_200_documentation_process.md`.
--- a/docs/modules/excititor/architecture.md
+++ b/docs/modules/excititor/architecture.md
@@ -2,7 +2,7 @@

 > Consolidates the VEX ingestion guardrails from Epic 1 with consensus and AI-facing requirements from Epics 7 and 8. This is the authoritative architecture record for Excititor.

-> **Scope.** This document specifies the **Excititor** service: its purpose, trust model, data structures, observation/linkset pipelines, APIs, plug-in contracts, storage schema, performance budgets, testing matrix, and how it integrates with Concelier, Policy Engine, and evidence surfaces. It is implementation-ready.
+> **Scope.** This document specifies the **Excititor** service: its purpose, trust model, data structures, observation/linkset pipelines, APIs, plug-in contracts, storage schema, performance budgets, testing matrix, and how it integrates with Concelier, Policy Engine, and evidence surfaces. It is implementation-ready. The immutable observation store schema lives in [`vex_observations.md`](./vex_observations.md).

 ---

--- a/docs/modules/excititor/operations/observability.md
+++ b/docs/modules/excititor/operations/observability.md
@@ -0,0 +1,41 @@
+# Excititor Observability Guide
+
+> Added 2025-11-14 alongside Sprint 119 (`EXCITITOR-AIAI-31-003`). Complements the AirGap/mirror runbooks under the same folder.
+
+Excititor’s evidence APIs now emit first-class OpenTelemetry metrics so Lens, Advisory AI, and Ops can detect misuse or missing provenance without paging through logs. This document lists the counters/histograms shipped by the WebService (`src/Excititor/StellaOps.Excititor.WebService`) and how to hook them into your exporters/dashboards.
+
+## Telemetry prerequisites
+
+- Enable `Excititor:Telemetry` in the service configuration (`appsettings.*`), ensuring **metrics** export is on. The WebService automatically adds the evidence meter (`StellaOps.Excititor.WebService.Evidence`) alongside the ingestion meter.
+- Deploy at least one OTLP or console exporter (see `TelemetryExtensions.ConfigureExcititorTelemetry`). If your region lacks OTLP transport, fall back to scraping the console exporter for smoke tests.
+- Coordinate with the Ops/Signals guild to provision the span/metric sinks referenced in `docs/modules/platform/architecture-overview.md#observability`.
+
+## Metrics reference
+
+| Metric | Type | Description | Key dimensions |
+| --- | --- | --- | --- |
+| `excititor.vex.observation.requests` | Counter | Number of `/v1/vex/observations/{vulnerabilityId}/{productKey}` requests handled. | `tenant`, `outcome` (`success`, `error`, `cancelled`), `truncated` (`true/false`) |
+| `excititor.vex.observation.statement_count` | Histogram | Distribution of statements returned per observation projection request. | `tenant`, `outcome` |
+| `excititor.vex.signature.status` | Counter | Signature status per statement (missing vs. unverified). | `tenant`, `status` (`missing`, `unverified`) |
+| `excititor.vex.aoc.guard_violations` | Counter | Aggregated count of Aggregation-Only Contract violations detected by the WebService (ingest + `/vex/aoc/verify`). | `tenant`, `surface` (`ingest`, `aoc_verify`, etc.), `code` (AOC error code) |
+
+> All metrics originate from the `EvidenceTelemetry` helper (`src/Excititor/StellaOps.Excititor.WebService/Telemetry/EvidenceTelemetry.cs`). When telemetry is disabled, the helper is inert.
+
+### Dashboard hints
+
+- **Advisory-AI readiness** – alert when `excititor.vex.signature.status{status="missing"}` spikes for a tenant, indicating connectors aren’t supplying signatures.
+- **Guardrail monitoring** – graph `excititor.vex.aoc.guard_violations` per `code` to catch upstream feed regressions before they pollute Evidence Locker or Lens caches.
+- **Capacity planning** – histogram percentiles of `excititor.vex.observation.statement_count` feed API sizing (higher counts mean Advisory AI is requesting broad scopes).
+
+## Operational steps
+
+1. **Enable telemetry**: set `Excititor:Telemetry:EnableMetrics=true`, configure OTLP endpoints/headers as described in `TelemetryExtensions`.
+2. **Add dashboards**: import panels referencing the metrics above (see Grafana JSON snippets in Ops repo once merged).
+3. **Alerting**: add rules for high guard violation rates and missing signatures. Tie alerts back to connectors via tenant metadata.
+4. **Post-deploy checks**: after each release, verify that metrics are emitted by curling `/v1/vex/observations/...` and watching the console exporter (dev) or the OTLP sink (prod).
+
+## Related documents
+
+- `docs/modules/excititor/architecture.md` – API contract, AOC guardrails, connector responsibilities.
+- `docs/modules/excititor/mirrors.md` – AirGap/mirror ingestion checklist (feeds into `EXCITITOR-AIRGAP-56/57`).
+- `docs/modules/platform/architecture-overview.md#observability` – platform-wide telemetry guidance.
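Note on the metrics reference in `observability.md`: the sketch below models the counters and their dimensions with an in-memory stand-in, not the OpenTelemetry SDK and not the `EvidenceTelemetry` helper. The metric names and dimension keys are taken from the table; the `MetricStore` class, its methods, and the `missing_signature_ratio` helper are hypothetical and only illustrate how a per-tenant "missing signature" alert might be derived from `excititor.vex.signature.status`.

```python
from collections import Counter


class MetricStore:
    """Hypothetical in-memory stand-in for a metrics backend (not the OpenTelemetry API)."""

    def __init__(self):
        self._counters = Counter()

    def add(self, name, value=1, **dimensions):
        # Key each series by metric name plus its sorted dimension pairs.
        key = (name, tuple(sorted(dimensions.items())))
        self._counters[key] += value

    def total(self, name, **dimensions):
        # Sum every series of `name` whose dimensions include the given pairs.
        wanted = set(dimensions.items())
        return sum(
            count
            for (metric, dims), count in self._counters.items()
            if metric == name and wanted <= set(dims)
        )


def missing_signature_ratio(store, tenant):
    """Share of signature-status events for a tenant that are `missing`."""
    name = "excititor.vex.signature.status"
    total = store.total(name, tenant=tenant)
    if total == 0:
        return 0.0
    return store.total(name, tenant=tenant, status="missing") / total


store = MetricStore()
store.add("excititor.vex.observation.requests",
          tenant="default", outcome="success", truncated="false")
store.add("excititor.vex.signature.status", 3, tenant="default", status="missing")
store.add("excititor.vex.signature.status", 1, tenant="default", status="unverified")
print(missing_signature_ratio(store, "default"))  # 3 of 4 events are missing -> 0.75
```

In a real deployment the ratio would normally be computed by the alerting backend from the exported series; the threshold at which it counts as a "spike" is left to the operator.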
--- a/docs/modules/excititor/vex_observations.md
+++ b/docs/modules/excititor/vex_observations.md
@@ -0,0 +1,131 @@
+# VEX Observation Model (`vex_observations`)
+
+> Authored 2025-11-14 for Sprint 120 (`EXCITITOR-LNM-21-001`). This document is the canonical schema description for Excititor’s immutable observation records. It unblocks downstream documentation tasks (`DOCS-LNM-22-002`) and aligns the WebService/Worker data structures with Mongo persistence.
+
+Excititor ingests heterogeneous VEX statements, normalizes them under the Aggregation-Only Contract (AOC), and persists each normalized statement as a **VEX observation**. These observations are the source of truth for:
+
+- Advisory AI citation APIs (`/v1/vex/observations/{vulnerabilityId}/{productKey}`)
+- Graph/Vuln Explorer overlays (batch observation APIs)
+- Evidence Locker + portable bundle manifests
+- Policy Engine materialization and audit trails
+
+All observation documents are immutable. New information creates a new observation record linked by `observationId`; supersedence happens through Graph/Lens layers, not by mutating this collection.
+
+## Storage & routing
+
+| Aspect | Value |
+| --- | --- |
+| Collection | `vex_observations` (Mongo) |
+| Upstream generator | `VexObservationProjectionService` (WebService) and Worker normalization pipeline |
+| Primary key | `{tenant, observationId}` |
+| Required indexes | `{tenant, vulnerabilityId}`, `{tenant, productKey}`, `{tenant, document.digest}`, `{tenant, providerId, status}` |
+| Source of truth for | `/v1/vex/observations`, Graph batch APIs, Excititor → Evidence Locker replication |
+
+## Canonical document shape
+
+```jsonc
+{
+  "tenant": "default",
+  "observationId": "vex:obs:sha256:...",
+  "vulnerabilityId": "CVE-2024-12345",
+  "productKey": "pkg:maven/org.example/app@1.2.3",
+  "providerId": "ubuntu-csaf",
+  "status": "affected",                // matches VexClaimStatus enum
+  "justification": {
+    "type": "component_not_present",
+    "reason": "Package not shipped in this profile",
+    "detail": "Binary not in base image"
+  },
+  "detail": "Free-form vendor detail",
+  "confidence": {
+    "score": 0.9,
+    "level": "high",
+    "method": "vendor"
+  },
+  "signals": {
+    "severity": {
+      "scheme": "cvss3.1",
+      "score": 7.8,
+      "label": "High",
+      "vector": "CVSS:3.1/..."
+    },
+    "kev": true,
+    "epss": 0.77
+  },
+  "scope": {
+    "key": "pkg:deb/ubuntu/apache2@2.4.58-1",
+    "purls": [
+      "pkg:deb/ubuntu/apache2@2.4.58-1",
+      "pkg:docker/example/app@sha256:..."
+    ],
+    "cpes": ["cpe:2.3:a:apache:http_server:2.4.58:*:*:*:*:*:*:*"]
+  },
+  "anchors": [
+    "#/statements/0/justification",
+    "#/statements/0/detail"
+  ],
+  "document": {
+    "format": "csaf",
+    "digest": "sha256:abc123...",
+    "revision": "2024-10-22T09:00:00Z",
+    "sourceUri": "https://ubuntu.com/security/notices/USN-0000-1",
+    "signature": {
+      "type": "cosign",
+      "issuer": "https://token.actions.githubusercontent.com",
+      "keyId": "ubuntu-vex-prod",
+      "verifiedAt": "2024-10-22T09:01:00Z",
+      "transparencyLogReference": "rekor://UUID",
+      "trust": {
+        "tenantId": "default",
+        "issuerId": "ubuntu",
+        "effectiveWeight": 0.9,
+        "tenantOverrideApplied": false,
+        "retrievedAtUtc": "2024-10-22T09:00:30Z"
+      }
+    }
+  },
+  "aoc": {
+    "guardVersion": "2024.10.0",
+    "violations": [],                    // non-empty -> stored + surfaced
+    "ingestedAt": "2024-10-22T09:00:05Z",
+    "retrievedAt": "2024-10-22T08:59:59Z"
+  },
+  "metadata": {
+    "provider-hint": "Mainline feed",
+    "source-channel": "mirror"
+  }
+}
+```
+
+### Field notes
+
+- **`tenant`** – logical tenant resolved by WebService based on headers or default configuration.
+- **`observationId`** – deterministic hash (sha256) over `{tenant, vulnerabilityId, productKey, providerId, statementDigest}`. Never reused.
+- **`status` + `justification`** – follow the OpenVEX semantics enforced by `StellaOps.Excititor.Core.VexClaim`.
+- **`scope`** – includes canonical `key` plus normalized PURLs/CPEs; deterministic ordering.
+- **`anchors`** – optional JSON-pointer hints pointing to the source document sections; stored as trimmed strings.
+- **`document.signature`** – mirrors `VexSignatureMetadata`; empty if upstream feed lacks signatures.
+- **`aoc.violations`** – stored if the guard detected non-fatal issues; fatal issues never create an observation.
+- **`metadata`** – reserved for deterministic provider hints; keys follow `vex.*` prefix guidance.
+
+## Determinism & AOC guarantees
+
+1. **Write-once** – once inserted, observation documents never change. New evidence creates a new `observationId`.
+2. **Sorted collections** – arrays (`anchors`, `purls`, `cpes`) are sorted lexicographically before persistence.
+3. **Guard metadata** – `aoc.guardVersion` records the guard library version (`docs/aoc/guard-library.md`), enabling audits.
+4. **Signatures** – only verification metadata proven by the Worker is stored; WebService never recomputes trust.
+5. **Time normalization** – all timestamps stored as UTC ISO-8601 strings (Mongo `DateTime`).
+
+## API mapping
+
+| API | Source fields | Notes |
+| --- | --- | --- |
+| `/v1/vex/observations/{vuln}/{product}` | `tenant`, `vulnerabilityId`, `productKey`, `scope`, `statements[]` | Response uses `VexObservationProjectionService` to render `statements`, `document`, and `signature` fields. |
+| `/vex/aoc/verify` | `document.digest`, `providerId`, `aoc` | Replays guard validation for recent digests; guard violations here align with `aoc.violations`. |
+| Evidence batch API (Graph) | `statements[]`, `scope`, `signals`, `anchors` | Format optimized for overlays; reduces `document` to digest/URI. |
+
+## Related work
+
+- `EXCITITOR-GRAPH-24-*` relies on this schema to build overlays.
+- `DOCS-LNM-22-002` (Link-Not-Merge documentation) references this file.
+- `EXCITITOR-ATTEST-73-*` uses `document.digest` + `signature` to embed provenance in attestation payloads.
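Note on the determinism rules in `vex_observations.md`: the following is a minimal sketch of how a deterministic `observationId` and sorted collections could be produced. It is not the Excititor implementation. The five hash inputs and the `vex:obs:sha256:` prefix come from the document; the canonical JSON serialization, the code-point sort order, and the function names `observation_id` and `normalize_collections` are assumptions made for illustration.

```python
import hashlib
import json


def observation_id(tenant, vulnerability_id, product_key, provider_id, statement_digest):
    """Derive a deterministic identifier from the five documented inputs.

    Assumption: the inputs are serialized as canonical JSON (sorted keys,
    no extra whitespace) before hashing; the real canonical form may differ.
    """
    payload = json.dumps(
        {
            "tenant": tenant,
            "vulnerabilityId": vulnerability_id,
            "productKey": product_key,
            "providerId": provider_id,
            "statementDigest": statement_digest,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"vex:obs:sha256:{digest}"


def normalize_collections(observation):
    """Return a copy with `anchors`, `scope.purls`, and `scope.cpes` sorted.

    Assumption: "lexicographic" means plain code-point ordering.
    """
    result = dict(observation)
    result["anchors"] = sorted(observation.get("anchors", []))
    scope = dict(observation.get("scope", {}))
    for field in ("purls", "cpes"):
        scope[field] = sorted(scope.get(field, []))
    result["scope"] = scope
    return result


# The same inputs always yield the same identifier.
first = observation_id("default", "CVE-2024-12345",
                       "pkg:maven/org.example/app@1.2.3", "ubuntu-csaf", "sha256:abc123")
second = observation_id("default", "CVE-2024-12345",
                        "pkg:maven/org.example/app@1.2.3", "ubuntu-csaf", "sha256:abc123")
print(first == second)  # True
```

Because the identifier depends on `statementDigest`, new evidence for the same vulnerability and product produces a new `observationId` rather than overwriting an existing record, which matches the write-once rule.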