feat: Add initial implementation of Vulnerability Resolver Jobs

- Created project for StellaOps.Scanner.Analyzers.Native.Tests with necessary dependencies.
- Documented roles and guidelines in AGENTS.md for Scheduler module.
- Implemented IResolverJobService interface and InMemoryResolverJobService for handling resolver jobs.
- Added ResolverBacklogNotifier and ResolverBacklogService for monitoring job metrics.
- Developed API endpoints for managing resolver jobs and retrieving metrics.
- Defined models for resolver job requests and responses.
- Integrated dependency injection for resolver job services.
- Implemented ImpactIndexSnapshot for persisting impact index data.
- Introduced SignalsScoringOptions for configurable scoring weights in reachability scoring.
- Added unit tests for ReachabilityScoringService and RuntimeFactsIngestionService.
- Created dotnet-filter.sh script to handle command-line arguments for dotnet.
- Established nuget-prime project for managing package downloads.
2025-11-18 07:52:15 +02:00
parent e69b57d467
commit 8355e2ff75
299 changed files with 13293 additions and 2444 deletions
--- a/docs/modules/excititor/evidence-contract.md
+++ b/docs/modules/excititor/evidence-contract.md
@@ -0,0 +1,89 @@
+# Excititor Advisory-AI Evidence Contract (v1)
+
+Updated: 2025-11-18 · Scope: EXCITITOR-AIAI-31-004 (Phase 119)
+
+This note defines the deterministic, aggregation-only contract that Excititor exposes to Advisory AI and Lens consumers. It covers the `/v1/vex/evidence/chunks` NDJSON stream plus the projection rules for observation IDs, signatures, and provenance metadata.
+
+## Goals
+- **Deterministic & replayable**: stable ordering, no implicit clocks, fixed schemas.
+- **Aggregation-only**: no consensus/inference; raw supplier statements plus signatures and AOC (Aggregation-Only Contract) guardrails.
+- **Offline-friendly**: chunked NDJSON; no cross-tenant lookups; portable enough for mirror/air-gap bundles.
+
+## Endpoint
+- `GET /v1/vex/evidence/chunks`
+  - **Query**:
+    - `tenant` (required)
+    - `vulnerabilityId` (optional, repeatable) — CVE, GHSA, etc.
+    - `productKey` (optional, repeatable) — PURLish key used by Advisory AI.
+    - `cursor` (optional) — stable pagination token.
+    - `limit` (optional) — max records per stream chunk (default 500, max 2000).
+  - **Response**: `Content-Type: application/x-ndjson`
+    - Each line is a single evidence record (see schema below).
+    - Ordered by `(tenant, vulnerabilityId, productKey, observationId, statementId)` to stay deterministic.
+
+## Evidence record schema (NDJSON)
+```json
+{
+  "tenant": "acme",
+  "vulnerabilityId": "CVE-2024-1234",
+  "productKey": "pkg:pypi/django@3.2.24",
+  "observationId": "obs-3cf9d6e4-…",
+  "statementId": "stmt-9c1d…",
+  "source": {
+    "supplier": "upstream:osv",
+    "documentId": "osv:GHSA-xxxx-yyyy",
+    "retrievedAt": "2025-11-10T12:34:56Z",
+    "signatureStatus": "missing|unverified|verified"
+  },
+  "aoc": {
+    "violations": [
+      { "code": "EVIDENCE_SIGNATURE_MISSING", "surface": "ingest" }
+    ]
+  },
+  "evidence": {
+    "type": "vex.statement",
+    "payload": { "...supplier-normalized-fields..." }
+  },
+  "provenance": {
+    "hash": "sha256:...",
+    "canonicalUri": "https://mirror.example/bundles/…",
+    "bundleId": "mirror-bundle-001"
+  }
+}
+```
+
+### Field notes
+- `observationId` is stable and maps 1:1 to internal storage; Advisory AI must cite it when emitting narratives.
+- `statementId` remains unique within an observation.
+- `signatureStatus` is passed through from ingest unchanged; the only values are `missing`, `unverified`, and `verified`.
+- `aoc.violations` enumerates guardrail violations without blocking delivery.
+- `evidence.payload` is supplier-shaped; we **do not** merge or rank.
+- `provenance.hash` is the SHA-256 of the supplier document bytes; `canonicalUri` points to the mirror bundle when available.
+
+## Determinism rules
+- Ordering: fixed sort above; pagination cursor is derived from the last emitted `(tenant, vulnerabilityId, productKey, observationId, statementId)`.
+- Clocks: All timestamps are UTC ISO-8601 with `Z`.
+- No server-generated randomness; identical upstream inputs produce identical record content.
+
+## AOC guardrails
+- Enforced surfaces: ingest, `/v1/vex/aoc/verify`, and chunk emission.
+- Violations are reported via `aoc.violations` and metric `excititor.vex.aoc.guard_violations`.
+- No statements are dropped due to AOC; consumers decide how to act.
+
+## Telemetry (metrics/logs-only until span sink arrives)
+- `excititor.vex.chunks.requests` — by `tenant`, `outcome`, `truncated`.
+- `excititor.vex.chunks.bytes` — histogram of NDJSON stream sizes.
+- `excititor.vex.chunks.records` — histogram of records per stream.
+- Existing observation metrics (`excititor.vex.observation.*`) remain unchanged.
+
+## Error handling
+- 400 for invalid tenant or mutually exclusive filters.
+- 429 with `Retry-After` when throttle budgets are exceeded.
+- 503 on upstream store/transient failures; error responses never carry an NDJSON body.
+
+## Offline / mirror readiness
+- When mirror bundles are configured, `provenance.canonicalUri` points to the local bundle path; otherwise it is omitted.
+- All payloads are side-effect free; no remote fetches occur while streaming.
+
+## Versioning
+- Contract version: `v1` (this document). Changes must be additive; breaking changes require a `v2` path and an updated document.
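
A consumer-side check of the ordering and pass-through rules in the evidence contract above could look roughly like the sketch below. It is illustrative only and not part of this commit: Python is used for brevity, every function name and sample value is hypothetical, and it assumes the server's sort matches Python's code-point string comparison, which the contract does not state. It reads only the fields named in the schema, verifies the `(tenant, vulnerabilityId, productKey, observationId, statementId)` order, and tallies `signatureStatus` and `aoc.violations` without dropping any record.

```python
import json
from collections import Counter

# Sort key taken from the contract's ordering rule.
ORDER_FIELDS = ("tenant", "vulnerabilityId", "productKey", "observationId", "statementId")


def parse_ndjson(text):
    """Parse an NDJSON body into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def sort_key(record):
    """Return the five-field ordering tuple for one evidence record."""
    return tuple(record[field] for field in ORDER_FIELDS)


def is_deterministically_ordered(records):
    """True when records are non-decreasing by the contract's sort key.

    Assumption: plain code-point string comparison matches the server's sort.
    """
    keys = [sort_key(r) for r in records]
    return all(a <= b for a, b in zip(keys, keys[1:]))


def summarize(records):
    """Count signature statuses and AOC violation codes; no record is dropped."""
    statuses = Counter(r["source"]["signatureStatus"] for r in records)
    violations = Counter(
        v["code"]
        for r in records
        for v in r.get("aoc", {}).get("violations", [])
    )
    return statuses, violations


def make_record(statement_id, status, violation_codes=()):
    """Build a hypothetical record holding only the fields this sketch reads."""
    return {
        "tenant": "acme",
        "vulnerabilityId": "CVE-2024-1234",
        "productKey": "pkg:pypi/django@3.2.24",
        "observationId": "obs-example",
        "statementId": statement_id,
        "source": {"signatureStatus": status},
        "aoc": {"violations": [{"code": c, "surface": "ingest"} for c in violation_codes]},
    }


# Invented two-line NDJSON body standing in for a real chunk stream.
SAMPLE = "\n".join(
    json.dumps(r)
    for r in (
        make_record("stmt-a", "missing", ["EVIDENCE_SIGNATURE_MISSING"]),
        make_record("stmt-b", "verified"),
    )
)

if __name__ == "__main__":
    records = parse_ndjson(SAMPLE)
    print(is_deterministically_ordered(records))
    print(summarize(records))
```

Because the contract never drops statements on AOC violations, the sketch only counts them; how to act on a `missing` or `unverified` signature is left to the consumer.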
--- a/docs/modules/excititor/operations/observability.md
+++ b/docs/modules/excititor/operations/observability.md
@@ -17,7 +17,10 @@ Excititor’s evidence APIs now emit first-class OpenTelemetry metrics so Lens,
 | `excititor.vex.observation.requests` | Counter | Number of `/v1/vex/observations/{vulnerabilityId}/{productKey}` requests handled. | `tenant`, `outcome` (`success`, `error`, `cancelled`), `truncated` (`true/false`) |
 | `excititor.vex.observation.statement_count` | Histogram | Distribution of statements returned per observation projection request. | `tenant`, `outcome` |
 | `excititor.vex.signature.status` | Counter | Signature status per statement (missing vs. unverified). | `tenant`, `status` (`missing`, `unverified`) |
-| `excititor.vex.aoc.guard_violations` | Counter | Aggregated count of Aggregation-Only Contract violations detected by the WebService (ingest + `/vex/aoc/verify`). | `tenant`, `surface` (`ingest`, `aoc_verify`, etc.), `code` (AOC error code) |
+| `excititor.vex.aoc.guard_violations` | Counter | Aggregated count of Aggregation-Only Contract violations detected by the WebService (ingest + `/v1/vex/aoc/verify`). | `tenant`, `surface` (`ingest`, `aoc_verify`, etc.), `code` (AOC error code) |
+| `excititor.vex.chunks.requests` | Counter | Requests to `/v1/vex/evidence/chunks` (NDJSON stream). | `tenant`, `outcome` (`success`, `error`, `cancelled`), `truncated` (`true/false`) |
+| `excititor.vex.chunks.bytes` | Histogram | Size of NDJSON chunk streams served (bytes). | `tenant`, `outcome` |
+| `excititor.vex.chunks.records` | Histogram | Count of evidence records emitted per chunk stream. | `tenant`, `outcome` |

 > All metrics originate from the `EvidenceTelemetry` helper (`src/Excititor/StellaOps.Excititor.WebService/Telemetry/EvidenceTelemetry.cs`). When disabled (telemetry off), the helper is inert.

@@ -31,8 +34,8 @@ Excititor’s evidence APIs now emit first-class OpenTelemetry metrics so Lens,

 1. **Enable telemetry**: set `Excititor:Telemetry:EnableMetrics=true`, configure OTLP endpoints/headers as described in `TelemetryExtensions`.
 2. **Add dashboards**: import panels referencing the metrics above (see Grafana JSON snippets in Ops repo once merged).
-3. **Alerting**: add rules for high guard violation rates and missing signatures. Tie alerts back to connectors via tenant metadata.
-4. **Post-deploy checks**: after each release, verify metrics emit by curling `/v1/vex/observations/...`, watching the console exporter (dev) or OTLP (prod).
+3. **Alerting**: add rules for high guard violation rates, missing signatures, and abnormal chunk bytes/record counts. Tie alerts back to connectors via tenant metadata.
+4. **Post-deploy checks**: after each release, verify that metrics are emitted by curling `/v1/vex/observations/...` and `/v1/vex/evidence/chunks`, then watching the console exporter (dev) or OTLP (prod).

 ## Related documents