Implement ledger metrics for observability and add tests for Ruby packages endpoints

- Added `LedgerMetrics` class to record write latency and total events for ledger operations.
- Created comprehensive tests for Ruby packages endpoints, covering scenarios for missing inventory, successful retrieval, and identifier handling.
- Introduced `TestSurfaceSecretsScope` for managing environment variables during tests.
- Developed `ProvenanceMongoExtensions` for attaching DSSE provenance and trust information to event documents.
- Implemented `EventProvenanceWriter` and `EventWriter` classes for managing event provenance in MongoDB.
- Established MongoDB indexes for efficient querying of events based on provenance and trust.
- Added models and JSON parsing logic for DSSE provenance and trust information.
2025-11-13 09:29:09 +02:00
parent 151f6b35cc
commit 61f963fd52
101 changed files with 5881 additions and 1776 deletions
--- a/docs/modules/findings-ledger/deployment.md
+++ b/docs/modules/findings-ledger/deployment.md
@@ -0,0 +1,129 @@
+# Findings Ledger Deployment & Operations Guide
+
+> **Applies to** `StellaOps.Findings.Ledger` writer + projector services (Sprint 120).  
+> **Audience** Platform/DevOps engineers bringing up Findings Ledger across dev/stage/prod and air-gapped sites.
+
+## 1. Prerequisites
+
+| Component | Requirement |
+| --- | --- |
+| Database | PostgreSQL 14+ with `citext`, `uuid-ossp`, `pgcrypto`, and `pg_partman`. Provision dedicated database/user per environment. |
+| Storage | Minimum 200 GB SSD per production environment (ledger + projection + Merkle tables). |
+| TLS & identity | Authority reachable for service-to-service JWTs; mTLS optional but recommended. |
+| Secrets | Store the DB connection string, encryption keys (`LEDGER__ATTACHMENTS__ENCRYPTIONKEY`), and signing credentials for Merkle anchoring in a secrets manager. |
+| Observability | OTLP collector endpoint (or Loki/Prometheus endpoints) configured; see `docs/modules/findings-ledger/observability.md`. |
+
+## 2. Docker Compose deployment
+
+1. **Create env files**
+   ```bash
+   cp deploy/compose/env/ledger.env.example ledger.env
+   cp etc/secrets/ledger.postgres.secret.example ledger.postgres.env
+   # Populate LEDGER__DB__CONNECTIONSTRING, LEDGER__ATTACHMENTS__ENCRYPTIONKEY, etc.
+   ```
+2. **Add ledger service overlay** (append to the Compose file in use, e.g. `docker-compose.prod.yaml`):
+   ```yaml
+   services:
+     findings-ledger:
+       image: stellaops/findings-ledger:${STELLA_VERSION:-2025.11.0}
+       restart: unless-stopped
+       env_file:
+         - ledger.env
+         - ledger.postgres.env
+       environment:
+         ASPNETCORE_URLS: http://0.0.0.0:8080
+         LEDGER__DB__CONNECTIONSTRING: ${LEDGER__DB__CONNECTIONSTRING}
+         LEDGER__OBSERVABILITY__ENABLED: "true"
+         LEDGER__MERKLE__ANCHORINTERVAL: "00:05:00"
+       ports:
+         - "8188:8080"
+       depends_on:
+         - postgres
+       volumes:
+         - ./etc/ledger/appsettings.json:/app/appsettings.json:ro
+   ```
+3. **Run migrations then start services**
+   ```bash
+   dotnet run --project src/Findings/StellaOps.Findings.Ledger.Migrations \
+     -- --connection "$LEDGER__DB__CONNECTIONSTRING"
+
+   docker compose --env-file ledger.env --env-file ledger.postgres.env \
+     -f deploy/compose/docker-compose.prod.yaml up -d findings-ledger
+   ```
+4. **Smoke test**
+   ```bash
+   curl -sf http://localhost:8188/health/ready
+   curl -sf http://localhost:8188/metrics | grep ledger_write_latency_seconds
+   ```
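+
+The single-shot smoke test above can fail if the container is still starting. A minimal sketch of a retry wrapper follows; the `retry` helper name, the attempt count, and the delay are illustrative assumptions rather than part of the Findings Ledger tooling.
+
+```bash
+# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
+# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
+# Hypothetical helper for illustration; not shipped with the ledger.
+retry() {
+  local attempts=$1 delay=$2 i=1
+  shift 2
+  while [ "$i" -le "$attempts" ]; do
+    if "$@"; then
+      return 0
+    fi
+    # Sleep only between attempts, not after the final failure.
+    if [ "$i" -lt "$attempts" ]; then
+      sleep "$delay"
+    fi
+    i=$((i + 1))
+  done
+  return 1
+}
+```
+
+For example, `retry 30 2 curl -sf http://localhost:8188/health/ready` polls the readiness endpoint from step 4 for roughly a minute before giving up.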
+
+## 3. Helm deployment
+
+1. **Create secret**
+   ```bash
+   kubectl create secret generic findings-ledger-secrets \
+     --from-literal=LEDGER__DB__CONNECTIONSTRING="$CONN_STRING" \
+     --from-literal=LEDGER__ATTACHMENTS__ENCRYPTIONKEY="$ENC_KEY" \
+     --dry-run=client -o yaml | kubectl apply -f -
+   ```
+2. **Helm values excerpt**
+   ```yaml
+   services:
+     findingsLedger:
+       enabled: true
+       image:
+         repository: stellaops/findings-ledger
+         tag: 2025.11.0
+       envFromSecrets:
+         - name: findings-ledger-secrets
+       env:
+         LEDGER__OBSERVABILITY__ENABLED: "true"
+         LEDGER__MERKLE__ANCHORINTERVAL: "00:05:00"
+       resources:
+         requests: { cpu: "500m", memory: "1Gi" }
+         limits:   { cpu: "2",    memory: "4Gi" }
+       probes:
+         readinessPath: /health/ready
+         livenessPath: /health/live
+   ```
+3. **Install/upgrade**
+   ```bash
+   helm upgrade --install stellaops deploy/helm/stellaops \
+     -f deploy/helm/stellaops/values-prod.yaml
+   ```
+4. **Verify**
+   ```bash
+   kubectl logs deploy/stellaops-findings-ledger | grep "Ledger started"
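+   kubectl port-forward svc/stellaops-findings-ledger 8080 &
+   sleep 2  # give the port-forward a moment to bind before probing
+   curl -sf http://127.0.0.1:8080/metrics | head
+   kill $!  # stop the background port-forward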
+   ```
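+
+Step 1 reads `$CONN_STRING` and `$ENC_KEY` from the shell; if either is unset, the secret is created with an empty value. A minimal sketch of a preflight check, assuming a hypothetical `require_env` helper that is not part of the StellaOps tooling:
+
+```bash
+# require_env NAME [NAME...]
+# Returns non-zero and reports each variable that is unset or empty.
+# Hypothetical helper for illustration.
+require_env() {
+  local name value missing=0
+  for name in "$@"; do
+    # Indirect lookup of the variable named by $name (portable alternative
+    # to bash's ${!name}); only pass trusted, literal variable names.
+    eval "value=\${$name:-}"
+    if [ -z "$value" ]; then
+      echo "missing required variable: $name" >&2
+      missing=1
+    fi
+  done
+  return "$missing"
+}
+```
+
+Running `require_env CONN_STRING ENC_KEY || exit 1` before `kubectl create secret` stops the rollout early instead of deploying a pod that fails at startup.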
+
+## 4. Backups & restores
+
+| Task | Command / guidance |
+| --- | --- |
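+| Online backup | `pg_dump -Fc --dbname="$LEDGER_DB" --file ledger-$(date -u +%Y%m%d).dump` for daily full logical dumps. WAL is captured separately through continuous archiving, not by `pg_dump`. |
+| Point-in-time recovery | Enable WAL archiving and keep a physical base backup (for example via `pg_basebackup`), since a `pg_dump` archive cannot serve as the base for WAL replay; document the target `recovery_target_time`. |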
+| Projection rebuild | After restore, run `dotnet run --project tools/LedgerReplayHarness -- --connection "$LEDGER_DB" --tenant all` to regenerate projections and verify hashes. |
+| Evidence bundles | Store Merkle root anchors + replay DSSE bundles alongside DB backups for audit parity. |
+
+## 5. Offline / air-gapped workflow
+
+- Use `stella ledger observability snapshot --out offline/ledger/metrics.tar.gz` before exporting Offline Kits. Include:
+  - `ledger_write_latency_seconds` summaries
+  - `ledger_merkle_anchor_duration_seconds` histogram
+  - Latest `ledger_merkle_roots` rows (export via `psql \copy`)
+- Package ledger service binaries + migrations using `ops/offline-kit/build_offline_kit.py --include ledger`.
+- Document sealed-mode restrictions: disable outbound attachments unless egress policy allows Evidence Locker endpoints; set `LEDGER__ATTACHMENTS__ALLOWEGRESS=false`.
+
+## 6. Post-deploy checklist
+
+- [ ] Health + metrics endpoints respond.
+- [ ] Merkle anchors writing to `ledger_merkle_roots`.
+- [ ] Projection lag < 30 s (`ledger_projection_lag_seconds`).
+- [ ] Grafana dashboards imported under “Findings Ledger”.
+- [ ] Backups scheduled + restore playbook tested.
+- [ ] Offline snapshot taken (air-gapped sites).
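+
+The projection-lag item can be checked from the metrics endpoint. The sketch below assumes `ledger_projection_lag_seconds` is exposed as plain gauge samples without timestamps in the Prometheus text format; the helper names `max_metric` and `below_limit` are hypothetical.
+
+```bash
+# max_metric NAME
+# Reads Prometheus text exposition on stdin and prints the largest sample
+# value for NAME across all label sets. Fails if NAME is absent.
+max_metric() {
+  awk -v name="$1" '
+    /^#/ { next }                      # skip HELP/TYPE comment lines
+    {
+      metric = $1
+      sub(/[{].*/, "", metric)         # drop the label set, keep the name
+      if (metric == name) {
+        v = $NF + 0                    # last field is the value (no timestamp assumed)
+        if (!seen || v > max) { max = v; seen = 1 }
+      }
+    }
+    END { if (seen) print max; else exit 1 }
+  '
+}
+
+# below_limit VALUE LIMIT
+# Succeeds when VALUE is numerically less than LIMIT.
+below_limit() {
+  awk -v v="$1" -v limit="$2" 'BEGIN { exit !(v + 0 < limit + 0) }'
+}
+```
+
+A possible use is `lag=$(curl -sf http://localhost:8188/metrics | max_metric ledger_projection_lag_seconds) && below_limit "$lag" 30`, which succeeds only when every reported lag is under the 30 s threshold.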
+
+---
+
+*Draft prepared 2025-11-13 for LEDGER-29-009/LEDGER-AIRGAP-56-001 planning. Update once Compose/Helm overlays are merged.*