# Telemetry Storage Deployment (DEVOPS-OBS-50-002)

> **Audience:** DevOps Guild, Observability Guild
>
> **Scope:** Prometheus (metrics), Tempo (traces), Loki (logs) storage backends with tenant isolation, TLS, retention policies, and Authority integration.

---

## 1. Components & Ports

| Service | Port | Purpose | TLS |
|-----------|------|---------|-----|
| Prometheus | 9090 | Metrics API / alerting | Client auth (mTLS) to scrape collector |
| Tempo | 3200 | Trace ingest + API | mTLS (client cert required) |
| Loki | 3100 | Log ingest + API | mTLS (client cert required) |

The collector forwards OTLP traffic to Tempo (traces), Prometheus scrapes the collector’s `/metrics` endpoint, and Loki is used for log search.

---

## 2. Local validation (Compose)

```bash
./ops/devops/telemetry/generate_dev_tls.sh
cd deploy/compose
# Start collector + storage stack
docker compose -f docker-compose.telemetry.yaml up -d
docker compose -f docker-compose.telemetry-storage.yaml up -d
python ../../ops/devops/telemetry/smoke_otel_collector.py --host localhost
```

Configuration files live in `deploy/telemetry/storage/`. Adjust the overrides before shipping to staging/production.

---

## 3. Kubernetes blueprint

Deploy Prometheus, Tempo, and Loki to the `observability` namespace.
The Helm values snippet below illustrates the key settings (charts are not yet versioned; define them in the observability repo):

```yaml
prometheus:
  server:
    extraFlags:
      - web.enable-lifecycle
    persistentVolume:
      enabled: true
      size: 200Gi
  additionalScrapeConfigsSecret: stellaops-prometheus-scrape
  extraSecretMounts:
    - name: otel-mtls
      secretName: stellaops-otel-tls-stage
      mountPath: /etc/telemetry/tls
      readOnly: true
    - name: otel-token
      secretName: stellaops-prometheus-token
      mountPath: /etc/telemetry/auth
      readOnly: true

loki:
  auth_enabled: true
  singleBinary:
    replicas: 2
  storage:
    type: filesystem
  existingSecretForTls: stellaops-otel-tls-stage
  runtimeConfig:
    configMap:
      name: stellaops-loki-tenant-overrides

tempo:
  server:
    http_listen_port: 3200
  storage:
    trace:
      backend: s3
      s3:
        endpoint: tempo-minio.observability.svc:9000
        bucket: tempo-traces
  multitenancyEnabled: true
  extraVolumeMounts:
    - name: otel-mtls
      mountPath: /etc/telemetry/tls
      readOnly: true
    - name: tempo-tenant-overrides
      mountPath: /etc/telemetry/tenants
      readOnly: true
```

### Staging bootstrap commands

```bash
kubectl create namespace observability --dry-run=client -o yaml | kubectl apply -f -

# TLS material (generated via ops/devops/telemetry/generate_dev_tls.sh or from PKI)
kubectl -n observability create secret generic stellaops-otel-tls-stage \
  --from-file=tls.crt=collector-stage.crt \
  --from-file=tls.key=collector-stage.key \
  --from-file=ca.crt=collector-ca.crt

# Prometheus bearer token issued by Authority (scope obs:read)
kubectl -n observability create secret generic stellaops-prometheus-token \
  --from-file=token=prometheus-stage.token

# Tenant overrides
kubectl -n observability create configmap stellaops-loki-tenant-overrides \
  --from-file=overrides.yaml=deploy/telemetry/storage/tenants/loki-overrides.yaml

kubectl -n observability create configmap tempo-tenant-overrides \
  --from-file=tempo-overrides.yaml=deploy/telemetry/storage/tenants/tempo-overrides.yaml

# Additional scrape config referencing the collector service
kubectl -n observability create secret generic stellaops-prometheus-scrape \
  --from-file=prometheus-additional.yaml=deploy/telemetry/storage/prometheus.yaml
```

Provision the following secrets/configs (names can be overridden via Helm values):

| Name | Type | Notes |
|------|------|-------|
| `stellaops-otel-tls-stage` | Secret | Shared CA + server cert/key for collector/storage mTLS. |
| `stellaops-prometheus-token` | Secret | Bearer token minted by Authority (`obs:read`). |
| `stellaops-loki-tenant-overrides` | ConfigMap | Text from `deploy/telemetry/storage/tenants/loki-overrides.yaml`. |
| `tempo-tenant-overrides` | ConfigMap | Text from `deploy/telemetry/storage/tenants/tempo-overrides.yaml`. |

---

## 4. Authority & tenancy integration

1. Create Authority clients for each backend (`observability-prometheus`, `observability-loki`, `observability-tempo`).

   ```bash
   stella authority client create observability-prometheus \
     --scopes obs:read \
     --audience observability --description "Prometheus collector scrape"

   stella authority client create observability-loki \
     --scopes obs:logs timeline:read \
     --audience observability --description "Loki ingestion"

   stella authority client create observability-tempo \
     --scopes obs:traces \
     --audience observability --description "Tempo ingestion"
   ```

2. Mint tokens/credentials and store them in the secrets above (see staging bootstrap commands). Example:

   ```bash
   stella authority token issue observability-prometheus --ttl 30d > prometheus-stage.token
   ```

3. Update ingress/gateway policies to forward `X-StellaOps-Tenant` into Loki/Tempo so tenant headers propagate end-to-end, and ensure each workload sets `tenant.id` attributes (see `docs/observability/observability.md`).

---

## 5. Retention & isolation

- Adjust `deploy/telemetry/storage/tenants/*.yaml` to set per-tenant retention and ingestion limits.
- Configure object storage (S3, GCS, Azure Blob) when moving beyond filesystem storage.
- For air-gapped deployments, mirror the telemetry bundle using `ops/devops/telemetry/package_offline_bundle.py` and import inside the Offline Kit staging directory.

---

## 6. Operational checklist

- [ ] Certificates rotated and secrets updated.
- [ ] Prometheus scrape succeeds (`curl -sk --cert client.crt --key client.key https://collector:9464`).
- [ ] Tempo and Loki report tenant activity (`/api/status`).
- [ ] Retention policy tested by uploading sample data and verifying expiry.
- [ ] Alerts wired into SLO evaluator (DEVOPS-OBS-51-001).

---

## 7. References

- `deploy/telemetry/storage/README.md`
- `deploy/compose/docker-compose.telemetry-storage.yaml`
- `docs/ops/telemetry-collector.md`
- `docs/observability/observability.md`
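
---

## 8. Appendix: scripted scrape check

The mTLS scrape check from the operational checklist (section 6) can also be scripted. The following is a minimal Python sketch and not part of the repository. The function names are hypothetical, the file paths assume the secret mounts and key names from section 3 (`/etc/telemetry/tls`, `/etc/telemetry/auth/token`), and sending the Authority token as a bearer header is an assumption about how the collector endpoint is protected.

```python
import ssl
import urllib.request
from pathlib import Path
from typing import Optional

# Assumed locations, based on the secret mounts and key names in section 3.
TLS_DIR = Path("/etc/telemetry/tls")
TOKEN_FILE = Path("/etc/telemetry/auth/token")


def build_mtls_context(tls_dir: Path = TLS_DIR) -> ssl.SSLContext:
    """Trust only the shared CA and present the client certificate."""
    ctx = ssl.create_default_context(
        ssl.Purpose.SERVER_AUTH, cafile=str(tls_dir / "ca.crt")
    )
    ctx.load_cert_chain(
        certfile=str(tls_dir / "tls.crt"), keyfile=str(tls_dir / "tls.key")
    )
    return ctx


def build_request(url: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request, adding a bearer token when one is supplied."""
    headers = {}
    if token:
        # Token files usually end with a newline; strip it before use.
        headers["Authorization"] = f"Bearer {token.strip()}"
    return urllib.request.Request(url, headers=headers)


def scrape_ok(
    url: str,
    ctx: ssl.SSLContext,
    token: Optional[str] = None,
    timeout: float = 5.0,
) -> bool:
    """Return True when the endpoint answers HTTP 200 over mTLS."""
    try:
        with urllib.request.urlopen(
            build_request(url, token), timeout=timeout, context=ctx
        ) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection failures, TLS errors, and HTTP error statuses.
        return False
```

Calling `scrape_ok("https://collector:9464", build_mtls_context(), TOKEN_FILE.read_text())` mirrors the `curl` check, with one difference: it verifies the server certificate against `ca.crt` rather than skipping verification with `-k`, so the certificate must be valid for the host name used in the URL.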