Files
git.stella-ops.org/docs/observability/cli-incident-toggle-12-001.md
master 10212d67c0
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
api-governance / spectral-lint (push) Has been cancelled
Refactor code structure for improved readability and maintainability; removed redundant code blocks and optimized function calls.
2025-11-20 07:50:52 +02:00

1.5 KiB

CLI incident toggle contract (CLI-OBS-12-001)

Goal: define a deterministic CLI flag and config surface to enter/exit incident mode, required by TELEMETRY-OBS-55-001/56-001.

Flags and config

  • CLI flag: --incident-mode (bool). Defaults to false.
  • Config key: Telemetry:Incident:Enabled (bool) and Telemetry:Incident:TTL (TimeSpan).
  • When both flag and config specified, flag wins (opt-in only; cannot disable if config enables and flag present).

Effects when enabled

  • Increase sampling rate ceiling to 100% for telemetry within the process.
  • Add tag incident=true to logs/metrics/traces.
  • Shorten exporter/reporting flush interval to 5s; disable external exporters when Sealed=true.
  • Emit activation audit event telemetry.incident.activated with fields {tenant, actor, source, expires_at}.

Persistence

  • Incident flag runtime value stored in local state file ~/.stellaops/incident-mode.json with fields {enabled, set_at, expires_at, actor} for offline continuity.
  • File is tenant-scoped; permissions 0600.

Expiry / TTL

  • Default TTL: 30 minutes unless Telemetry:Incident:TTL provided.
  • On expiry, emit telemetry.incident.expired audit event.

Validation expectations

  • CLI should refuse --incident-mode if --sealed is set and external exporters are configured (must drop exporters first).
  • Unit tests to cover precedence (flag over config), TTL expiry, state file perms, and audit emissions.

Provenance

  • Authored 2025-11-20 to unblock PREP-CLI-OBS-12-001 and TELEMETRY-OBS-55-001.