Aggregation Observability
Last updated: 2025-11-25 (Docs Tasks Md.V · DOCS-LNM-22-007)
Covers metrics, traces, and logs for Link-Not-Merge (LNM) aggregation and evidence pipelines.
Metrics
- aggregation_ingest_latency_seconds (histogram): end-to-end ingest latency per statement; labels: tenant, source, status.
- aggregation_conflict_total (counter): conflicts encountered; labels: tenant, advisory, product, reason.
- aggregation_overlay_cache_hits_total / _misses_total: overlay cache effectiveness; labels: tenant, cache.
- aggregation_vex_gate_total: VEX gating outcomes; labels: tenant, status (affected, not_affected, unknown).
- aggregation_queue_depth (gauge): pending statements per tenant.
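To illustrate how the hit and miss counters combine into a cache hit rate, here is a minimal sketch using an in-memory stand-in for a metrics registry. The helper names (`inc`, `total`, `overlay_hit_ratio`) and the label values are hypothetical, and `aggregation_overlay_cache_misses_total` is assumed to be the expansion of the `_misses_total` shorthand above; a real deployment would presumably use a Prometheus client library instead.

```python
from collections import Counter

# In-memory stand-in for a metrics registry (assumption: illustration only).
counters = Counter()


def inc(name, **labels):
    """Increment the series identified by metric name plus its label pairs."""
    key = (name, tuple(sorted(labels.items())))
    counters[key] += 1


def total(name, **labels):
    """Sum a metric across every series whose labels include the given pairs."""
    wanted = set(labels.items())
    return sum(
        value
        for (metric, series_labels), value in counters.items()
        if metric == name and wanted <= set(series_labels)
    )


def overlay_hit_ratio(tenant):
    """Return hits / (hits + misses) for one tenant, or None with no lookups."""
    hits = total("aggregation_overlay_cache_hits_total", tenant=tenant)
    misses = total("aggregation_overlay_cache_misses_total", tenant=tenant)
    lookups = hits + misses
    return hits / lookups if lookups else None
```

Summing across the `cache` label gives a per-tenant ratio; passing `cache=...` to `total` would narrow it to a single cache.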
Traces
- Span name aggregation.process with attributes: tenant, advisory, product, vex_status, source_kind, overlay_version, cache_hit (bool).
- Link to upstream ingest span (traceparent forwarded by Excititor/Concelier).
- Export to OTLP; sampling default 10% outside prod, 100% for status=error.
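The following sketch shows one way a forwarded traceparent header could be parsed and a sampling decision made. The header layout follows the W3C Trace Context format (version, trace-id, parent-id, flags); the function names and the trace-id-based ratio check are assumptions for illustration, not the pipeline's actual sampler.

```python
import re

# Four dash-separated lowercase-hex fields: version, trace-id, parent-id, flags.
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(header):
    """Return the trace context as a dict, or None if the header is invalid."""
    match = TRACEPARENT_RE.match(header.strip())
    if not match:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero ids are invalid per the W3C format.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


def should_export(trace_id, status, ratio=0.10):
    """Always keep error spans; otherwise keep roughly `ratio` of traces.

    The decision is derived from the low 64 bits of the trace id, so every
    span of a given trace gets the same answer (hypothetical scheme).
    """
    if status == "error":
        return True
    return int(trace_id[-16:], 16) / 2**64 < ratio
```

Deriving the decision from the trace id rather than a random draw keeps a trace either fully exported or fully dropped across services that share the same ratio.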
Logs
Structured JSON with fields: tenant, advisory, product, vex_status, decision (merged|suppressed|dropped), reason, duration_ms, trace_id.
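As a rough sketch of what emitting such a record might look like, the helper below serialises one log line with exactly the listed fields and rejects an unknown decision value. The function name `format_log` and any field values passed to it are hypothetical.

```python
import json

# Field names and decision values are taken from the list above.
LOG_FIELDS = (
    "tenant", "advisory", "product", "vex_status",
    "decision", "reason", "duration_ms", "trace_id",
)
DECISIONS = {"merged", "suppressed", "dropped"}


def format_log(**fields):
    """Serialise one aggregation log record as a compact JSON line."""
    missing = [name for name in LOG_FIELDS if name not in fields]
    if missing:
        raise ValueError(f"missing log fields: {missing}")
    if fields["decision"] not in DECISIONS:
        raise ValueError(f"unknown decision: {fields['decision']!r}")
    # Emit fields in a fixed order so lines are easy to diff and grep.
    record = {name: fields[name] for name in LOG_FIELDS}
    return json.dumps(record, separators=(",", ":"))
```

Carrying trace_id in every record is what lets a log line be joined back to its aggregation.process span.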
SLOs
- Ingest latency: p95 < 500ms per statement (steady state).
- Cache hit rate: >80% for overlays; alerts when below for 15 minutes.
- Error rate: <0.1% over 10 minute window.
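The three targets can be checked against raw samples as in the sketch below. It assumes latencies are available as a list of seconds and uses a nearest-rank p95; in practice the percentile would more likely be estimated from histogram buckets, so treat this as an illustration of the thresholds only. The function names are hypothetical.

```python
def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    # Integer form of ceil(0.95 * n), avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def slo_report(latencies_s, cache_hits, cache_misses, errors, total):
    """Return a pass/fail flag for each SLO listed above."""
    lookups = cache_hits + cache_misses
    hit_rate = cache_hits / lookups if lookups else 0.0
    error_rate = errors / total if total else 0.0
    return {
        "ingest_latency": p95(latencies_s) < 0.5,  # p95 < 500ms
        "cache_hit_rate": hit_rate > 0.80,         # > 80% for overlays
        "error_rate": error_rate < 0.001,          # < 0.1%
    }
```

The caller is responsible for restricting the inputs to the relevant window, for example the last 10 minutes for the error rate.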
Alerts
- HighConflictRate: aggregation_conflict_total delta > 100/minute per tenant.
- QueueBacklog: aggregation_queue_depth > 10k for 5 minutes.
- LowCacheHit: overlay cache hit rate < 60% for 10 minutes.
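The "for N minutes" conditions mean a threshold has to be breached continuously before the alert fires. Below is a simplified sketch of that check over timestamped samples; it only approximates how an alerting engine evaluates a hold duration, and the function name and sample layout are assumptions.

```python
def breached_for(samples, threshold, duration_s, above=True):
    """Return True if every sample in the trailing window breaches threshold.

    `samples` is a time-ordered list of (unix_seconds, value) pairs. The
    window ends at the newest sample and spans `duration_s` seconds.
    """
    if not samples:
        return False
    window_start = samples[-1][0] - duration_s
    # Without history covering the whole window, do not fire.
    if samples[0][0] > window_start:
        return False
    window = [value for ts, value in samples if ts >= window_start]
    if above:
        return all(value > threshold for value in window)
    return all(value < threshold for value in window)
```

Under these assumptions, QueueBacklog corresponds to `breached_for(depth_samples, 10_000, 300)` and LowCacheHit to `breached_for(hit_rate_samples, 0.60, 600, above=False)`.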
Offline/air-gap considerations
- Export metrics to local Prometheus scrape; no external sinks.
- Trace sampling and log retention configured via environment without needing control-plane access.
- Deterministic ordering preserved; cache warmers seeded from bundled fixtures.
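Environment-driven configuration could look like the sketch below. The variable names `AGGREGATION_TRACE_SAMPLE_RATIO` and `AGGREGATION_LOG_RETENTION_DAYS` are hypothetical placeholders, since this document does not name the actual settings; the 10% default mirrors the sampling default in the Traces section, and no retention default is assumed.

```python
import os


def load_observability_config(env=os.environ):
    """Read trace sampling and log retention from environment variables.

    Both variable names are hypothetical; substitute the deployment's own.
    """
    ratio = float(env.get("AGGREGATION_TRACE_SAMPLE_RATIO", "0.10"))
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("trace sample ratio must be between 0 and 1")

    raw_retention = env.get("AGGREGATION_LOG_RETENTION_DAYS")
    retention_days = int(raw_retention) if raw_retention is not None else None
    if retention_days is not None and retention_days < 1:
        raise ValueError("log retention must be at least 1 day")

    return {"trace_sample_ratio": ratio, "log_retention_days": retention_days}
```

Because everything is read locally at startup, nothing here requires a control-plane call, which is the property an air-gapped install depends on.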