Implement ledger metrics for observability and add tests for Ruby packages endpoints
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled

- Added `LedgerMetrics` class to record write latency and total events for ledger operations.
- Created comprehensive tests for Ruby packages endpoints, covering scenarios for missing inventory, successful retrieval, and identifier handling.
- Introduced `TestSurfaceSecretsScope` for managing environment variables during tests.
- Developed `ProvenanceMongoExtensions` for attaching DSSE provenance and trust information to event documents.
- Implemented `EventProvenanceWriter` and `EventWriter` classes for managing event provenance in MongoDB.
- Established MongoDB indexes for efficient querying of events based on provenance and trust.
- Added models and JSON parsing logic for DSSE provenance and trust information.
This commit is contained in:
master
2025-11-13 09:29:09 +02:00
parent 151f6b35cc
commit 61f963fd52
101 changed files with 5881 additions and 1776 deletions

View File

@@ -0,0 +1,129 @@
# Findings Ledger Deployment & Operations Guide
> **Applies to** `StellaOps.Findings.Ledger` writer + projector services (Sprint120).
> **Audience** Platform/DevOps engineers bringing up Findings Ledger across dev/stage/prod and air-gapped sites.
## 1. Prerequisites
| Component | Requirement |
| --- | --- |
| Database | PostgreSQL 14+ with `citext`, `uuid-ossp`, `pgcrypto`, and `pg_partman`. Provision dedicated database/user per environment. |
| Storage | Minimum 200GB SSD per production environment (ledger + projection + Merkle tables). |
| TLS & identity | Authority reachable for service-to-service JWTs; mTLS optional but recommended. |
| Secrets | Store DB connection string, encryption keys (`LEDGER__ATTACHMENTS__ENCRYPTIONKEY`), signing credentials for Merkle anchoring in secrets manager. |
| Observability | OTLP collector endpoint (or Loki/Prometheus endpoints) configured; see `docs/modules/findings-ledger/observability.md`. |
## 2. Docker Compose deployment
1. **Create env files**
```bash
cp deploy/compose/env/ledger.env.example ledger.env
cp etc/secrets/ledger.postgres.secret.example ledger.postgres.env
# Populate LEDGER__DB__CONNECTIONSTRING, LEDGER__ATTACHMENTS__ENCRYPTIONKEY, etc.
```
2. **Add ledger service overlay** (append to the Compose file in use, e.g. `docker-compose.prod.yaml`):
```yaml
services:
findings-ledger:
image: stellaops/findings-ledger:${STELLA_VERSION:-2025.11.0}
restart: unless-stopped
env_file:
- ledger.env
- ledger.postgres.env
environment:
ASPNETCORE_URLS: http://0.0.0.0:8080
LEDGER__DB__CONNECTIONSTRING: ${LEDGER__DB__CONNECTIONSTRING}
LEDGER__OBSERVABILITY__ENABLED: "true"
LEDGER__MERKLE__ANCHORINTERVAL: "00:05:00"
ports:
- "8188:8080"
depends_on:
- postgres
volumes:
- ./etc/ledger/appsettings.json:/app/appsettings.json:ro
```
3. **Run migrations then start services**
```bash
dotnet run --project src/Findings/StellaOps.Findings.Ledger.Migrations \
-- --connection "$LEDGER__DB__CONNECTIONSTRING"
docker compose --env-file ledger.env --env-file ledger.postgres.env \
-f deploy/compose/docker-compose.prod.yaml up -d findings-ledger
```
4. **Smoke test**
```bash
curl -sf http://localhost:8188/health/ready
curl -sf http://localhost:8188/metrics | grep ledger_write_latency_seconds
```
## 3. Helm deployment
1. **Create secret**
```bash
kubectl create secret generic findings-ledger-secrets \
--from-literal=LEDGER__DB__CONNECTIONSTRING="$CONN_STRING" \
--from-literal=LEDGER__ATTACHMENTS__ENCRYPTIONKEY="$ENC_KEY" \
--dry-run=client -o yaml | kubectl apply -f -
```
2. **Helm values excerpt**
```yaml
services:
findingsLedger:
enabled: true
image:
repository: stellaops/findings-ledger
tag: 2025.11.0
envFromSecrets:
- name: findings-ledger-secrets
env:
LEDGER__OBSERVABILITY__ENABLED: "true"
LEDGER__MERKLE__ANCHORINTERVAL: "00:05:00"
resources:
requests: { cpu: "500m", memory: "1Gi" }
limits: { cpu: "2", memory: "4Gi" }
probes:
readinessPath: /health/ready
livenessPath: /health/live
```
3. **Install/upgrade**
```bash
helm upgrade --install stellaops deploy/helm/stellaops \
-f deploy/helm/stellaops/values-prod.yaml
```
4. **Verify**
```bash
kubectl logs deploy/stellaops-findings-ledger | grep "Ledger started"
kubectl port-forward svc/stellaops-findings-ledger 8080 &
curl -sf http://127.0.0.1:8080/metrics | head
```
## 4. Backups & restores
| Task | Command / guidance |
| --- | --- |
| Online backup | `pg_dump -Fc --dbname="$LEDGER_DB" --file ledger-$(date -u +%Y%m%d).dump` (run hourly for WAL + daily full dumps). |
| Point-in-time recovery | Enable WAL archiving; document target `recovery_target_time`. |
| Projection rebuild | After restore, run `dotnet run --project tools/LedgerReplayHarness -- --connection "$LEDGER_DB" --tenant all` to regenerate projections and verify hashes. |
| Evidence bundles | Store Merkle root anchors + replay DSSE bundles alongside DB backups for audit parity. |
## 5. Offline / air-gapped workflow
- Use `stella ledger observability snapshot --out offline/ledger/metrics.tar.gz` before exporting Offline Kits. Include:
- `ledger_write_latency_seconds` summaries
- `ledger_merkle_anchor_duration_seconds` histogram
- Latest `ledger_merkle_roots` rows (export via `psql \copy`)
- Package ledger service binaries + migrations using `ops/offline-kit/build_offline_kit.py --include ledger`.
- Document sealed-mode restrictions: disable outbound attachments unless egress policy allows Evidence Locker endpoints; set `LEDGER__ATTACHMENTS__ALLOWEGRESS=false`.
## 6. Post-deploy checklist
- [ ] Health + metrics endpoints respond.
- [ ] Merkle anchors writing to `ledger_merkle_roots`.
- [ ] Projection lag < 30s (`ledger_projection_lag_seconds`).
- [ ] Grafana dashboards imported under “Findings Ledger”.
- [ ] Backups scheduled + restore playbook tested.
- [ ] Offline snapshot taken (air-gapped sites).
---
*Draft prepared 2025-11-13 for LEDGER-29-009/LEDGER-AIRGAP-56-001 planning. Update once Compose/Helm overlays are merged.*