Excititor Observability Guide
Added 2025-11-14 alongside Sprint 119 (EXCITITOR-AIAI-31-003). Complements the AirGap/mirror runbooks under the same folder.
Excititor’s evidence APIs now emit first-class OpenTelemetry metrics so Lens, Advisory AI, and Ops can detect misuse or missing provenance without paging through logs. This document lists the counters/histograms shipped by the WebService (src/Excititor/StellaOps.Excititor.WebService) and how to hook them into your exporters/dashboards.
Telemetry prerequisites
- Enable Excititor:Telemetry in the service configuration (appsettings.*), ensuring metrics export is on. The WebService automatically adds the evidence meter (StellaOps.Excititor.WebService.Evidence) alongside the ingestion meter.
- Deploy at least one OTLP or console exporter (see TelemetryExtensions.ConfigureExcititorTelemetry). If your region lacks OTLP transport, fall back to scraping the console exporter for smoke tests.
- Coordinate with the Ops/Signals guild to provision the span/metric sinks referenced in docs/modules/platform/architecture-overview.md#observability.
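Under standard .NET configuration conventions, a colon-delimited key such as Excititor:Telemetry:EnableMetrics (the switch named under Operational steps) corresponds to nested JSON objects in appsettings.*. The sketch below assumes the service binds its options that way and prints the matching fragment; `expand_key` is a hypothetical helper for illustration, not part of Excititor.

```python
import json


def expand_key(key: str, value):
    """Turn a colon-delimited configuration key into nested dictionaries."""
    nested = value
    # Wrap from the innermost segment outwards.
    for part in reversed(key.split(":")):
        nested = {part: nested}
    return nested


# Key taken from this guide; the surrounding file layout is an assumption.
fragment = expand_key("Excititor:Telemetry:EnableMetrics", True)
print(json.dumps(fragment, indent=2))
```

Any other telemetry settings (OTLP endpoint, headers) are not shown because their key names are not listed in this guide; consult TelemetryExtensions for those.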
Metrics reference
| Metric | Type | Description | Key dimensions |
|---|---|---|---|
| excititor.vex.observation.requests | Counter | Number of /v1/vex/observations/{vulnerabilityId}/{productKey} requests handled. | tenant, outcome (success, error, cancelled), truncated (true/false) |
| excititor.vex.observation.statement_count | Histogram | Distribution of statements returned per observation projection request. | tenant, outcome |
| excititor.vex.signature.status | Counter | Signature status per statement (missing vs. unverified). | tenant, status (missing, unverified) |
| excititor.vex.aoc.guard_violations | Counter | Aggregated count of Aggregation-Only Contract violations detected by the WebService (ingest + /v1/vex/aoc/verify). | tenant, surface (ingest, aoc_verify, etc.), code (AOC error code) |
| excititor.vex.chunks.requests | Counter | Requests to /v1/vex/evidence/chunks (NDJSON stream). | tenant, outcome (success, error, cancelled), truncated (true/false) |
| excititor.vex.chunks.bytes | Histogram | Size of NDJSON chunk streams served (bytes). | tenant, outcome |
| excititor.vex.chunks.records | Histogram | Count of evidence records emitted per chunk stream. | tenant, outcome |
All metrics originate from the EvidenceTelemetry helper (src/Excititor/StellaOps.Excititor.WebService/Telemetry/EvidenceTelemetry.cs). When disabled (telemetry off), the helper is inert.
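When wiring dashboards or alert rules, it can help to check that each query only filters on dimensions the metric actually carries. The following is a minimal sketch that transcribes the reference table into a lookup and reports undocumented labels; `METRICS` and `unknown_labels` are hypothetical names for illustration and do not exist in the Excititor codebase.

```python
# Catalog transcribed from the metrics reference: name -> (type, dimension keys).
METRICS = {
    "excititor.vex.observation.requests": ("counter", {"tenant", "outcome", "truncated"}),
    "excititor.vex.observation.statement_count": ("histogram", {"tenant", "outcome"}),
    "excititor.vex.signature.status": ("counter", {"tenant", "status"}),
    "excititor.vex.aoc.guard_violations": ("counter", {"tenant", "surface", "code"}),
    "excititor.vex.chunks.requests": ("counter", {"tenant", "outcome", "truncated"}),
    "excititor.vex.chunks.bytes": ("histogram", {"tenant", "outcome"}),
    "excititor.vex.chunks.records": ("histogram", {"tenant", "outcome"}),
}


def unknown_labels(metric, labels):
    """Return the labels a query uses that the metric does not document."""
    if metric not in METRICS:
        raise KeyError(f"undocumented metric: {metric}")
    _, dimensions = METRICS[metric]
    return set(labels) - dimensions


# A query filtering chunk bytes by 'truncated' would never match anything.
print(unknown_labels("excititor.vex.chunks.bytes", {"tenant", "truncated"}))
```

Note that exporters may rename metrics or labels (for example, replacing dots with underscores), so the names in a given backend can differ from the ones listed here.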
Dashboard hints
- Advisory-AI readiness – alert when excititor.vex.signature.status{status="missing"} spikes for a tenant, indicating connectors aren’t supplying signatures.
- Guardrail monitoring – graph excititor.vex.aoc.guard_violations per code to catch upstream feed regressions before they pollute Evidence Locker or Lens caches.
- Capacity planning – histogram percentiles of excititor.vex.observation.statement_count feed API sizing (higher counts mean Advisory AI is requesting broad scopes).
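The spike alert and the percentile view above can be prototyped offline before committing to backend-specific rules. This is a rough sketch, assuming you have already exported per-interval counter increases for one tenant and raw per-request statement counts (a metrics backend would normally derive percentiles from histogram buckets instead); the `factor` and `min_events` thresholds are illustrative values, not recommendations from the Excititor team.

```python
from statistics import mean, quantiles


def is_spike(increments, factor=3.0, min_events=10):
    """Flag the latest interval when it exceeds `factor` times the prior mean.

    `increments` holds per-interval increases of a counter for one tenant,
    oldest first. Small absolute counts are ignored via `min_events`.
    """
    if len(increments) < 2:
        return False
    *history, latest = increments
    baseline = mean(history)
    return latest >= min_events and latest > factor * baseline


def percentile(values, pct):
    """Return the pct-th percentile (1-99) using inclusive interpolation."""
    cuts = quantiles(values, n=100, method="inclusive")
    return cuts[pct - 1]


# Missing-signature increments for one tenant: steady, then a jump.
print(is_spike([2, 3, 2, 40]))
# Statement counts per observation request (synthetic data).
print(percentile(list(range(1, 102)), 95))
```

Comparing against a trailing mean keeps the rule tenant-relative, so a tenant that always has some missing signatures does not alert continuously.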
Operational steps
- Enable telemetry: set Excititor:Telemetry:EnableMetrics=true and configure OTLP endpoints/headers as described in TelemetryExtensions.
- Add dashboards: import panels referencing the metrics above (see Grafana JSON snippets in Ops repo once merged).
- Alerting: add rules for high guard violation rates, missing signatures, and abnormal chunk bytes/record counts. Tie alerts back to connectors via tenant metadata.
- Post-deploy checks: after each release, verify that metrics are emitted by curling /v1/vex/observations/... and /v1/vex/evidence/chunks, watching the console exporter (dev) or OTLP (prod).
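The post-deploy check can be scripted rather than run by hand. Below is a minimal sketch using only the Python standard library; the base URL, the example vulnerability ID and product key, the use of plain GET requests, and the percent-encoding of path segments are all assumptions, and any tenant or auth headers your deployment requires must be supplied by the caller. The helper names are hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def observation_url(base_url, vulnerability_id, product_key):
    """Build /v1/vex/observations/{vulnerabilityId}/{productKey}."""
    # Percent-encode both segments; product keys may contain '/' or ':'.
    return (f"{base_url.rstrip('/')}/v1/vex/observations/"
            f"{quote(vulnerability_id, safe='')}/{quote(product_key, safe='')}")


def chunks_url(base_url):
    """Build the NDJSON evidence chunk endpoint URL."""
    return f"{base_url.rstrip('/')}/v1/vex/evidence/chunks"


def smoke_check(urls, headers=None, timeout=10):
    """Issue GET requests and return {url: HTTP status or error text}."""
    results = {}
    for url in urls:
        try:
            with urlopen(Request(url, headers=headers or {}), timeout=timeout) as resp:
                results[url] = resp.status
        except OSError as exc:  # URLError and HTTPError both derive from OSError
            results[url] = str(exc)
    return results


if __name__ == "__main__":
    base = "https://excititor.example.internal"  # hypothetical host
    targets = [
        observation_url(base, "CVE-2024-0001", "pkg:npm/example@1.0.0"),
        chunks_url(base),
    ]
    for url, outcome in smoke_check(targets).items():
        print(url, outcome)
```

After the requests complete, confirm in the console exporter (dev) or the OTLP sink (prod) that excititor.vex.observation.requests and excititor.vex.chunks.requests were recorded; the script only exercises the endpoints and does not read metrics itself.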
Related documents
- docs/modules/excititor/architecture.md – API contract, AOC guardrails, connector responsibilities.
- docs/modules/excititor/mirrors.md – AirGap/mirror ingestion checklist (feeds into EXCITITOR-AIRGAP-56/57).
- docs/modules/platform/architecture-overview.md#observability – platform-wide telemetry guidance.