Implement ledger metrics for observability and add tests for Ruby packages endpoints
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
- Added `LedgerMetrics` class to record write latency and total events for ledger operations.
- Created comprehensive tests for Ruby packages endpoints, covering scenarios for missing inventory, successful retrieval, and identifier handling.
- Introduced `TestSurfaceSecretsScope` for managing environment variables during tests.
- Developed `ProvenanceMongoExtensions` for attaching DSSE provenance and trust information to event documents.
- Implemented `EventProvenanceWriter` and `EventWriter` classes for managing event provenance in MongoDB.
- Established MongoDB indexes for efficient querying of events based on provenance and trust.
- Added models and JSON parsing logic for DSSE provenance and trust information.
86
docs/modules/findings-ledger/replay-harness.md
Normal file
@@ -0,0 +1,86 @@
# Findings Ledger Replay & Determinism Harness (LEDGER-29-008)

> **Audience:** Findings Ledger Guild · QA Guild · Policy Guild
> **Purpose:** Define the reproducible harness for 5 M findings/tenant replay tests and determinism validation required by LEDGER-29-008.

## 1. Goals
- Reproduce ledger + projection state from canonical event fixtures with byte-for-byte determinism.
- Stress test writer/projector throughput at ≥5 M findings per tenant, capturing CPU/memory/latency profiles.
- Produce signed reports (DSSE) that CI and auditors can review before shipping.

## 2. Architecture

```
Fixtures (.ndjson) → Harness Runner → Ledger Writer API → Postgres Ledger DB
                                      ↘ Projector (same DB) ↘ Metrics snapshot
```

- **Fixtures:** `fixtures/ledger/*.ndjson`, sorted by `sequence_no`, containing canonical JSON envelopes with precomputed hashes.
- **Runner:** `tools/LedgerReplayHarness` (console app) feeds events, waits for projector catch-up, and verifies projection hashes.
- **Validation:** After replay, the runner re-reads ledger/projection tables, recomputes hashes, and compares to fixture expectations.
- **Reporting:** Generates `harness-report.json` with metrics (latency histogram, insertion throughput, projection lag) plus a DSSE signature.
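The fixture contract described above (sorted by `sequence_no`, with precomputed hashes) can be checked before any event is fed to the writer. The following is a minimal sketch in Python, used here only for illustration since the runner itself is a console app under `tools/LedgerReplayHarness`. It assumes each NDJSON line is an envelope carrying `sequence_no`, `event_hash`, and a hypothetical `payload` field, and that the hash is SHA-256 over sorted-key, compact JSON. The real canonicalization rules and envelope layout are not specified in this document.

```python
import hashlib
import json


def canonical_hash(payload: dict) -> str:
    """SHA-256 over sorted-key, compact JSON (an assumed canonical form)."""
    encoded = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def check_fixture_lines(lines):
    """Return a list of problems found in an iterable of NDJSON lines.

    Checks two things per envelope: sequence_no strictly increases, and the
    recomputed hash of the (hypothetical) `payload` field equals `event_hash`.
    """
    problems = []
    previous_seq = None
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines between records
        envelope = json.loads(line)
        seq = envelope["sequence_no"]
        if previous_seq is not None and seq <= previous_seq:
            problems.append(
                f"line {line_no}: sequence_no {seq} is not after {previous_seq}"
            )
        previous_seq = seq
        if canonical_hash(envelope["payload"]) != envelope["event_hash"]:
            problems.append(f"line {line_no}: event_hash mismatch")
    return problems
```

Passing an open file handle (for example `open("fixtures/ledger/tenant-a.ndjson", encoding="utf-8")`) streams the fixture line by line, which matters at the 5 M event scale because the file never has to fit in memory.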

## 3. CLI usage

```bash
dotnet run --project tools/LedgerReplayHarness \
  -- --fixture fixtures/ledger/tenant-a.ndjson \
  --connection "Host=postgres;Username=stellaops;Password=***;Database=findings_ledger" \
  --tenant tenant-a \
  --maxParallel 8 \
  --report out/harness/tenant-a-report.json
```

Options:

| Option | Description |
| --- | --- |
| `--fixture` | Path to NDJSON file (supports multiple). |
| `--connection` | Postgres connection string (writer + projector share). |
| `--tenant` | Tenant identifier; harness ensures partitions exist. |
| `--maxParallel` | Batch concurrency (default 4). |
| `--report` | Output path for report JSON; `.sig` generated alongside. |
| `--metrics-endpoint` | Optional Prometheus scrape URI for live metrics snapshot. |

## 4. Verification steps

1. **Hash validation:** Recompute `event_hash` for each appended event and ensure it matches the fixture value.
2. **Sequence integrity:** Confirm that sequences are gapless per chain; the harness aborts on mismatch.
3. **Projection determinism:** Compare the projector-derived `cycle_hash` with the expected value from fixture metadata.
4. **Performance:** Capture P50/P95 latencies for `ledger_write_latency_seconds` and ensure targets (<120 ms P95) are met.
5. **Resource usage:** Sample CPU/memory via `dotnet-counters` or `kubectl top` and store the samples in the report.
6. **Merkle root check:** Rebuild the Merkle tree from events and ensure the root equals the database `ledger_merkle_roots` entry.
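Steps 2 and 6 can be sketched as follows, in Python for illustration only. The `chain_id` field name is hypothetical, and the tree shape is an assumption: leaves are the hex event hashes, each parent is SHA-256 of its two children concatenated, and an unpaired node is carried up unchanged. The ledger's actual Merkle construction is not described here, so the root produced by this sketch would only match `ledger_merkle_roots` if the ledger happens to use the same rules.

```python
import hashlib
from collections import defaultdict


def find_sequence_gaps(events):
    """Report chains whose sequence numbers are not gapless.

    Each event is a dict with `chain_id` (hypothetical name) and `sequence_no`.
    Returns {chain_id: (expected, found)} for the first break in each chain.
    """
    by_chain = defaultdict(list)
    for event in events:
        by_chain[event["chain_id"]].append(event["sequence_no"])

    gaps = {}
    for chain_id, numbers in by_chain.items():
        numbers.sort()
        expected = numbers[0]
        for found in numbers:
            if found != expected:  # covers both gaps and duplicates
                gaps[chain_id] = (expected, found)
                break
            expected += 1
    return gaps


def merkle_root(leaf_hashes_hex):
    """Fold hex leaf hashes pairwise with SHA-256 until one root remains."""
    if not leaf_hashes_hex:
        raise ValueError("cannot build a Merkle tree without leaves")
    level = [bytes.fromhex(h) for h in leaf_hashes_hex]
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            next_level.append(hashlib.sha256(level[i] + level[i + 1]).digest())
        if len(level) % 2 == 1:
            next_level.append(level[-1])  # unpaired node is carried up as-is
        level = next_level
    return level[0].hex()
```

A harness following step 2 would abort as soon as `find_sequence_gaps` returns a non-empty mapping, before spending time on the Merkle rebuild.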

## 5. Output report schema

```json
{
  "tenant": "tenant-a",
  "fixtures": ["fixtures/ledger/tenant-a.ndjson"],
  "eventsWritten": 5123456,
  "durationSeconds": 1422.4,
  "latencyP95Ms": 108.3,
  "projectionLagMaxSeconds": 18.2,
  "cpuPercentMax": 72.5,
  "memoryMbMax": 3580,
  "merkleRoot": "3f1a…",
  "status": "pass",
  "timestamp": "2025-11-13T11:45:00Z"
}
```

The harness writes `harness-report.json` plus `harness-report.json.sig` (DSSE) and `metrics-snapshot.prom` for archival.
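For reviewers checking the signed report, it helps to know which bytes a DSSE signature covers. The sketch below, in Python for illustration, implements the pre-authentication encoding (PAE) from the DSSE specification and a simple check that an envelope's embedded payload is byte-identical to the report file. It assumes the `.sig` file holds a standard DSSE JSON envelope with a base64 `payload` field, which this document does not confirm. Signature verification itself is left out because the key type and algorithm used by the harness are not stated.

```python
import base64
import json


def dsse_pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the exact bytes a signature covers.

    PAE = "DSSEv1" SP len(type) SP type SP len(payload) SP payload,
    with lengths written as ASCII decimal byte counts.
    """
    type_bytes = payload_type.encode("utf-8")
    return b" ".join(
        [
            b"DSSEv1",
            str(len(type_bytes)).encode("ascii"),
            type_bytes,
            str(len(payload)).encode("ascii"),
            payload,
        ]
    )


def envelope_matches_report(envelope_json: str, report_bytes: bytes) -> bool:
    """Check that the envelope's payload is byte-identical to the report file.

    Assumes a DSSE envelope shape: {"payloadType", "payload", "signatures"}.
    """
    envelope = json.loads(envelope_json)
    return base64.b64decode(envelope["payload"]) == report_bytes
```

A verifier would pass the result of `dsse_pae` to whatever signature check matches the signing key, rather than verifying over the raw report bytes.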

## 6. CI integration
- A new pipeline job, `ledger-replay-harness`, runs nightly with a reduced dataset (1 M findings) to detect regressions quickly.
- The full 5 M run executes weekly and before releases; artifacts are uploaded to `out/qa/findings-ledger/`.
- Gates: merges are blocked if the harness `status != pass` or latencies exceed thresholds.
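The gate in the last bullet could be evaluated from the report JSON alone. Here is a minimal sketch in Python, for illustration only; it uses the `status` and `latencyP95Ms` fields from the report schema and the <120 ms P95 target from the verification steps. The function name and the choice to treat a missing latency value as a failure are assumptions, not the pipeline's documented behavior.

```python
P95_LIMIT_MS = 120.0  # target from the verification steps (<120 ms P95)


def gate_failures(report: dict, p95_limit_ms: float = P95_LIMIT_MS) -> list:
    """Return human-readable reasons the merge gate should fail (empty if it passes)."""
    failures = []
    status = report.get("status")
    if status != "pass":
        failures.append(f"status is {status!r}, expected 'pass'")
    p95 = report.get("latencyP95Ms")
    # A missing value is treated as a failure so a truncated report cannot pass.
    if p95 is None or p95 >= p95_limit_ms:
        failures.append(f"latencyP95Ms {p95} is not below {p95_limit_ms} ms")
    return failures
```

A CI step would load the report with `json.load`, print each returned reason, and exit non-zero when the list is non-empty.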

## 7. Air-gapped execution
- Include fixtures + harness binaries inside the Offline Kit under `offline/ledger/replay/`.
- Provide a `run-harness.sh` script that sets env vars, executes the runner, and exports reports.
- Operators attach signed reports to audit trails, verifying hashed fixtures before import.

---

*Draft prepared 2025-11-13 for LEDGER-29-008. Update when CLI options or thresholds change.*