Findings Ledger Deployment & Operations Guide
Applies to: StellaOps.Findings.Ledger writer + projector services (Sprint 120).
Audience: Platform/DevOps engineers bringing up Findings Ledger across dev/stage/prod and air-gapped sites.
1. Prerequisites
| Component | Requirement |
|---|---|
| Database | PostgreSQL 14+ with citext, uuid-ossp, pgcrypto, and pg_partman. Provision dedicated database/user per environment. |
| Storage | Minimum 200 GB SSD per production environment (ledger + projection + Merkle tables). |
| TLS & identity | Authority reachable for service-to-service JWTs; mTLS optional but recommended. |
| Secrets | Store the DB connection string, encryption keys (LEDGER__ATTACHMENTS__ENCRYPTIONKEY), and signing credentials for Merkle anchoring in a secrets manager. |
| Observability | OTLP collector endpoint (or Loki/Prometheus endpoints) configured; see docs/modules/findings-ledger/observability.md. |
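The extension requirement in the table can be checked before running migrations. The following is a minimal sketch, not part of the Findings Ledger tooling: it assumes the installed extension names have already been read from the target database (for example with `SELECT extname FROM pg_extension;`), and the function name `missing_extensions` is hypothetical.

```python
# Extensions listed in the Prerequisites table.
REQUIRED_EXTENSIONS = {"citext", "uuid-ossp", "pgcrypto", "pg_partman"}


def missing_extensions(installed):
    """Return the required extensions absent from `installed`, sorted by name."""
    present = {name.strip().lower() for name in installed}
    return sorted(REQUIRED_EXTENSIONS - present)


if __name__ == "__main__":
    # Example input: names as returned by `SELECT extname FROM pg_extension;`
    print(missing_extensions(["plpgsql", "citext", "pgcrypto"]))
```

An empty result means every extension named in the table is present; anything else is the list to install before the migration step.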
2. Docker Compose deployment
- Create env files

      cp deploy/compose/env/ledger.env.example ledger.env
      cp etc/secrets/ledger.postgres.secret.example ledger.postgres.env
      # Populate LEDGER__DB__CONNECTIONSTRING, LEDGER__ATTACHMENTS__ENCRYPTIONKEY, etc.

- Add ledger service overlay (append to the Compose file in use, e.g. docker-compose.prod.yaml):

      services:
        findings-ledger:
          image: stellaops/findings-ledger:${STELLA_VERSION:-2025.11.0}
          restart: unless-stopped
          env_file:
            - ledger.env
            - ledger.postgres.env
          environment:
            ASPNETCORE_URLS: http://0.0.0.0:8080
            LEDGER__DB__CONNECTIONSTRING: ${LEDGER__DB__CONNECTIONSTRING}
            LEDGER__OBSERVABILITY__ENABLED: "true"
            LEDGER__MERKLE__ANCHORINTERVAL: "00:05:00"
          ports:
            - "8188:8080"
          depends_on:
            - postgres
          volumes:
            - ./etc/ledger/appsettings.json:/app/appsettings.json:ro

- Run migrations then start services

      dotnet run --project src/Findings/StellaOps.Findings.Ledger.Migrations \
        -- --connection "$LEDGER__DB__CONNECTIONSTRING"
      docker compose --env-file ledger.env --env-file ledger.postgres.env \
        -f deploy/compose/docker-compose.prod.yaml up -d findings-ledger

- Smoke test

      curl -sf http://localhost:8188/health/ready
      curl -sf http://localhost:8188/metrics | grep ledger_write_latency_seconds
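Right after `docker compose up -d`, the readiness endpoint may refuse connections for a few seconds, so a single `curl` can fail spuriously. A scripted smoke test could poll instead. The sketch below is illustrative only: the helper names, the ten attempts, and the two-second delay are assumptions, not values taken from the service.

```python
import time
import urllib.error
import urllib.request


def http_ready(url, timeout=3.0):
    """Return True when `url` answers with a 2xx status, roughly like `curl -sf`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_ready(probe, attempts=10, delay=2.0, sleep=time.sleep):
    """Call `probe()` up to `attempts` times, pausing `delay` seconds between tries.

    Returns the number of tries it took, or 0 if the probe never succeeded.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)
    return 0


# Usage against the Compose port mapping shown above:
#   tries = wait_until_ready(lambda: http_ready("http://localhost:8188/health/ready"))
#   if not tries:
#       raise SystemExit("findings-ledger did not become ready")
```

Passing the probe and the sleep function as parameters keeps the retry loop testable without a running service.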
3. Helm deployment
- Create secret

      kubectl create secret generic findings-ledger-secrets \
        --from-literal=LEDGER__DB__CONNECTIONSTRING="$CONN_STRING" \
        --from-literal=LEDGER__ATTACHMENTS__ENCRYPTIONKEY="$ENC_KEY" \
        --dry-run=client -o yaml | kubectl apply -f -

- Helm values excerpt

      services:
        findingsLedger:
          enabled: true
          image:
            repository: stellaops/findings-ledger
            tag: 2025.11.0
          envFromSecrets:
            - name: findings-ledger-secrets
          env:
            LEDGER__OBSERVABILITY__ENABLED: "true"
            LEDGER__MERKLE__ANCHORINTERVAL: "00:05:00"
          resources:
            requests: { cpu: "500m", memory: "1Gi" }
            limits: { cpu: "2", memory: "4Gi" }
          probes:
            readinessPath: /health/ready
            livenessPath: /health/live

- Install/upgrade

      helm upgrade --install stellaops deploy/helm/stellaops \
        -f deploy/helm/stellaops/values-prod.yaml

- Verify

      kubectl logs deploy/stellaops-findings-ledger | grep "Ledger started"
      kubectl port-forward svc/stellaops-findings-ledger 8080 &
      curl -sf http://127.0.0.1:8080/metrics | head
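Both the Compose overlay and the Helm values pass settings as `LEDGER__...` environment variables. In ASP.NET Core's environment-variable configuration provider, a double underscore is the hierarchy separator and is equivalent to `:` in a configuration key. The sketch below shows that mapping and reads the `"00:05:00"` anchor interval as hours:minutes:seconds. It assumes the service reads these variables without a prefix, which the guide does not state; the helper names are hypothetical, and `parse_hms` handles only the plain `hh:mm:ss` form.

```python
from datetime import timedelta


def env_to_config_key(name):
    """Translate an environment variable name into a colon-delimited config key."""
    return name.replace("__", ":")


def ledger_settings(environ):
    """Collect LEDGER__* variables from a mapping as {config key: value}."""
    return {
        env_to_config_key(name): value
        for name, value in environ.items()
        if name.startswith("LEDGER__")
    }


def parse_hms(value):
    """Parse a plain hh:mm:ss string such as "00:05:00" into a timedelta."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


if __name__ == "__main__":
    env = {
        "ASPNETCORE_URLS": "http://0.0.0.0:8080",
        "LEDGER__OBSERVABILITY__ENABLED": "true",
        "LEDGER__MERKLE__ANCHORINTERVAL": "00:05:00",
    }
    for key, value in sorted(ledger_settings(env).items()):
        print(key, "=", value)
    print("anchor interval:", parse_hms(env["LEDGER__MERKLE__ANCHORINTERVAL"]))
```

Read this way, `LEDGER__MERKLE__ANCHORINTERVAL: "00:05:00"` requests a Merkle anchor every five minutes.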
4. Backups & restores
| Task | Command / guidance |
|---|---|
| Online backup | pg_dump -Fc --dbname="$LEDGER_DB" --file ledger-$(date -u +%Y%m%d).dump (take full dumps daily and pair them with hourly WAL archiving). |
| Point-in-time recovery | Enable WAL archiving; document target recovery_target_time. |
| Projection rebuild | After restore, run dotnet run --project tools/LedgerReplayHarness -- --connection "$LEDGER_DB" --tenant all to regenerate projections and verify hashes. |
| Evidence bundles | Store Merkle root anchors + replay DSSE bundles alongside DB backups for audit parity. |
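When the online backup runs from a scheduler rather than an interactive shell, the command line can be assembled in code. This is a minimal sketch under that assumption: the function names are hypothetical, and the flags simply mirror the `pg_dump` invocation in the table.

```python
from datetime import datetime, timezone


def dump_filename(now=None):
    """Name the dump ledger-YYYYMMDD.dump from the UTC date, like `date -u +%Y%m%d`."""
    now = now or datetime.now(timezone.utc)
    return "ledger-%s.dump" % now.astimezone(timezone.utc).strftime("%Y%m%d")


def pg_dump_command(dbname, now=None):
    """Build the argument list for the custom-format dump shown in the table."""
    return ["pg_dump", "-Fc", "--dbname=" + dbname, "--file", dump_filename(now)]


if __name__ == "__main__":
    # Placeholder connection string; real values belong in the secrets manager.
    print(pg_dump_command("postgresql://ledger@db/ledger"))
```

The resulting list can be passed to `subprocess.run(..., check=True)`, which avoids shell quoting problems with connection strings that contain special characters.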
5. Offline / air-gapped workflow
- Use stella ledger observability snapshot --out offline/ledger/metrics.tar.gz before exporting Offline Kits. Include:
  - ledger_write_latency_seconds summaries
  - ledger_merkle_anchor_duration_seconds histogram
  - Latest ledger_merkle_roots rows (export via psql \copy)
- Package ledger service binaries + migrations using ops/offline-kit/build_offline_kit.py --include ledger.
- Document sealed-mode restrictions: disable outbound attachments unless egress policy allows Evidence Locker endpoints; set LEDGER__ATTACHMENTS__ALLOWEGRESS=false.
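The sealed-mode setting can be verified as part of kit validation. The sketch below is an assumption-laden illustration rather than an existing check: it treats an unset variable as a problem because the guide does not say what the service defaults to, and the function name is hypothetical.

```python
def sealed_mode_violations(environ):
    """Return human-readable problems with the sealed-mode settings in `environ`."""
    problems = []
    value = environ.get("LEDGER__ATTACHMENTS__ALLOWEGRESS")
    if value is None:
        # Assumption: require an explicit setting instead of relying on a default.
        problems.append("LEDGER__ATTACHMENTS__ALLOWEGRESS is not set")
    elif value.strip().lower() != "false":
        problems.append("LEDGER__ATTACHMENTS__ALLOWEGRESS must be false, got %r" % value)
    return problems


if __name__ == "__main__":
    import os

    for problem in sealed_mode_violations(os.environ):
        print("sealed-mode check:", problem)
```

Sites whose egress policy does allow Evidence Locker endpoints would skip or relax this check.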
6. Post-deploy checklist
- Health + metrics endpoints respond.
- Merkle anchors writing to ledger_merkle_roots.
- Projection lag < 30 s (ledger_projection_lag_seconds).
- Grafana dashboards imported under “Findings Ledger”.
- Backups scheduled + restore playbook tested.
- Offline snapshot taken (air-gapped sites).
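The projection-lag item lends itself to an automated check against the `/metrics` output. The sketch below parses Prometheus text exposition and applies the 30 s limit from the checklist. It assumes `ledger_projection_lag_seconds` is exported as a gauge, possibly with one sample per label set; the `tenant` label in the example is hypothetical, and the helper names are not part of the service.

```python
def metric_samples(text, name):
    """Return the float value of every sample of `name` in Prometheus text format."""
    values = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        if "}" in line:
            # Labelled sample: name{labels} value [timestamp]
            head, _, tail = line.rpartition("}")
            metric = head.split("{", 1)[0]
        else:
            # Unlabelled sample: name value [timestamp]
            metric, _, tail = line.partition(" ")
        fields = tail.split()
        if metric == name and fields:
            values.append(float(fields[0]))
    return values


def projection_lag_ok(text, threshold=30.0):
    """True when at least one lag sample exists and every sample is below `threshold`."""
    samples = metric_samples(text, "ledger_projection_lag_seconds")
    return bool(samples) and all(value < threshold for value in samples)


if __name__ == "__main__":
    sample = (
        "# TYPE ledger_projection_lag_seconds gauge\n"
        'ledger_projection_lag_seconds{tenant="acme"} 4.2\n'
        'ledger_projection_lag_seconds{tenant="beta"} 12\n'
    )
    print(projection_lag_ok(sample))
```

A missing metric counts as a failure here, so an exporter that is down does not pass the checklist silently.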
Draft prepared 2025-11-13 for LEDGER-29-009/LEDGER-AIRGAP-56-001 planning. Update once Compose/Helm overlays are merged.