docs consolidation work

This commit is contained in:
StellaOps Bot
2025-12-25 10:53:53 +02:00
parent b9f71fc7e9
commit deb82b4f03
117 changed files with 852 additions and 847 deletions

View File

@@ -3,7 +3,7 @@
> **Audience:** Observability Guild, Concelier/Excititor SREs, platform operators.
> **Scope:** Metrics, traces, logs, dashboards, and runbooks introduced as part of the Aggregation-Only Contract (AOC) rollout (Sprint19).
This guide captures the canonical signals emitted by Concelier and Excititor once AOC guards are active. It explains how to consume the metrics in dashboards, correlate traces/logs for incident triage, and operate in offline environments. Pair this guide with the [AOC reference](../ingestion/aggregation-only-contract.md) and [architecture overview](../modules/platform/architecture-overview.md).
This guide captures the canonical signals emitted by Concelier and Excititor once AOC guards are active. It explains how to consume the metrics in dashboards, correlate traces/logs for incident triage, and operate in offline environments. Pair this guide with the [AOC reference](../aoc/aggregation-only-contract.md) and [architecture overview](../modules/platform/architecture-overview.md).
---
@@ -215,7 +215,7 @@ Update `docs/assets/dashboards/` with screenshots when Grafana capture pipeline
## 7·References
- [Aggregation-Only Contract reference](../ingestion/aggregation-only-contract.md)
- [Aggregation-Only Contract reference](../aoc/aggregation-only-contract.md)
- [Architecture overview](../modules/platform/architecture-overview.md)
- [Console guide](../15_UI_GUIDE.md)
- [CLI AOC commands](../modules/cli/guides/cli-reference.md)

View File

@@ -78,7 +78,7 @@
- `ui.api.fetch` HTTP fetch to backend; attributes: `service`, `endpoint`, `status`, `networkTime`.
- `ui.sse.stream` Server-sent event subscriptions (status ticker, runs); attributes: `channel`, `connectedMillis`, `reconnects`.
- `ui.telemetry.batch` Browser OTLP flush; attributes: `batchSize`, `success`, `retryCount`.
- `ui.policy.action` Policy workspace actions (simulate, approve, activate) per `docs/15_UI_GUIDE.md`.
- `ui.policy.action` Policy workspace actions (simulate, approve, activate) per `docs/15_UI_GUIDE.md`.
- **Propagation:** Spans use W3C `traceparent`; gateway echoes header to backend APIs so traces stitch across UI gateway service.
- **Sampling controls:** `OTEL_TRACES_SAMPLER_ARG` (ratio) and feature flag `telemetry.forceSampling` (sets to 100% for incident debugging).
- **Viewing traces:** Grafana Tempo or Jaeger via collector. Filter by `service.name = stellaops-console`. For cross-service debugging, filter on `correlationId` and `tenant`.
@@ -147,13 +147,13 @@ Integrate alerts with Notifier (`ui.alerts`) or existing Ops channels. Tag incid
| `OTEL_SERVICE_NAME` | Service tag for traces/logs. Set to `stellaops-console`. | auto |
| `CONSOLE_TELEMETRY_SSE_ENABLED` | Enables `/console/telemetry` SSE feed for dashboards. | `true` |
Feature flag changes should be tracked in release notes and mirrored in `docs/15_UI_GUIDE.md` (navigation and workflow expectations).
Feature flag changes should be tracked in release notes and mirrored in `docs/15_UI_GUIDE.md` (navigation and workflow expectations).
---
## 8·Offline / Air-Gapped Workflow
- Mirror the console image and telemetry collector as part of the Offline Kit (see `/docs/install/docker.md` §4).
- Mirror the console image and telemetry collector as part of the Offline Kit (see `/docs/operations/console-docker-install.md` §4).
- Scrape metrics locally via `curl -k https://console.local/metrics > metrics.prom`; archive alongside logs for audits.
- Use `stella offline kit import` to keep the downloads manifest in sync; dashboards display staleness using `ui_download_manifest_refresh_seconds`.
- When collectors are unavailable, console queues OTLP batches (up to 5min) and exposes backlog through `ui_telemetry_queue_depth`; export queue metrics to prove no data loss.
@@ -171,7 +171,7 @@ Feature flag changes should be tracked in release notes and mirrored in `docs/15
- [ ] DPoP/fresh-auth anomalies correlated with Authority audit logs during drill.
- [ ] Offline capture workflow exercised; evidence stored in audit vault.
- [ ] Screenshots of Grafana dashboards committed once they stabilise (update references).
- [ ] Cross-links verified (`docs/deploy/console.md`, `docs/security/console-security.md`, `docs/15_UI_GUIDE.md`).
- [ ] Cross-links verified (`docs/deploy/console.md`, `docs/security/console-security.md`, `docs/15_UI_GUIDE.md`).
---
@@ -179,10 +179,10 @@ Feature flag changes should be tracked in release notes and mirrored in `docs/15
- `/docs/deploy/console.md` Metrics endpoint, OTLP config, health checks.
- `/docs/security/console-security.md` Security metrics & alert hints.
- `docs/15_UI_GUIDE.md` Console workflows and offline posture.
- `docs/15_UI_GUIDE.md` Console workflows and offline posture.
- `/docs/observability/observability.md` Platform-wide practices.
- `/ops/telemetry-collector.md` & `/ops/telemetry-storage.md` Collector deployment.
- `/docs/install/docker.md` Compose/Helm environment variables.
- `/docs/operations/console-docker-install.md` Compose/Helm environment variables.
---