git.stella-ops.org/docs/runbooks/reachability-runtime.md
StellaOps Bot d63af51f84
2025-11-26 20:23:28 +02:00

# Reachability Runtime Ingestion Runbook
> **Imposed rule:** Runtime traces must never bypass CAS/DSSE verification; ingest only CAS-addressed NDJSON with hashes logged to Timeline and Evidence Locker.

This runbook guides operators through ingesting runtime reachability evidence (EntryTrace, probes, Signals ingestion) and wiring it into the reachability evidence chain.
## 1. Prerequisites
- Services: `Signals` API, `Zastava Observer` (or other probes), `Evidence Locker`, optional `Attestor` for DSSE.
- Reachability schema: `docs/reachability/function-level-evidence.md`, `docs/reachability/evidence-schema.md`.
- CAS: configured bucket/path for `cas://reachability/runtime/*` and `.../graphs/*`.
- Time sync: AirGap Time anchor if sealed; otherwise NTP with drift < 200 ms.
## 2. Ingestion workflow (online)
1) **Capture traces** from Observer/probes as gzipped NDJSON (`runtime-trace.ndjson.gz`); each record carries `symbol_id`, `purl`, `timestamp`, `pid`, `container`, and `count`.
2) **Stage to CAS**: upload the file, record its `sha256`, and store it at `cas://reachability/runtime/<sha256>`.
3) **Optionally sign**: wrap the CAS digest in DSSE (`stella attest runtime --bundle runtime.dsse.json`).
4) **Ingest** via Signals API:
   ```sh
   # -i prints the response headers listed below.
   curl -i -H "X-Stella-Tenant: acme" \
     -H "Content-Type: application/x-ndjson" \
     --data-binary @runtime-trace.ndjson.gz \
     "https://signals.example/api/v1/runtime-facts?graph_hash=<graph>"
   ```
   Headers returned: `Content-SHA256`, `X-Graph-Hash`, `X-Ingest-Id`.
5) **Emit timeline**: ensure a Timeline event `reach.runtime.ingested` is emitted, carrying the CAS digest and ingest id.
6) **Verify**: run `stella graph verify --runtime runtime-trace.ndjson.gz --graph <graph_hash>` to confirm that the edges were mapped.
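
The staging step (step 2) can be sketched in Python as below. This is a minimal illustration, not the platform's own uploader: the helper names `sha256_file` and `cas_uri_for` are hypothetical, and only the `cas://reachability/runtime/` prefix comes from this runbook. The digest is taken over the gzipped file as stored, matching the `sha256sum runtime-trace.ndjson.gz` check used in troubleshooting.

```python
import hashlib

# Prefix taken from this runbook; everything else here is illustrative.
CAS_RUNTIME_PREFIX = "cas://reachability/runtime/"


def sha256_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so large traces are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cas_uri_for(path):
    """Return (sha256 hex digest, CAS URI) for a trace file about to be staged."""
    digest = sha256_file(path)
    return digest, CAS_RUNTIME_PREFIX + digest
```

Recording the digest before upload gives a value to compare against the `Content-SHA256` header returned by the ingest call.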
## 3. Ingestion workflow (air-gap)
1) Receive the runtime bundle containing `runtime-trace.ndjson.gz`, `manifest.json` (hashes), and an optional DSSE bundle.
2) Validate the file hashes against the manifest; if a DSSE bundle is present, verify it as well.
3) Import into CAS path `cas://reachability/runtime/<sha256>` using the offline loader.
4) Run the Signals offline ingest tool:
   ```sh
   signals-offline ingest-runtime \
     --tenant acme \
     --graph-hash <graph_hash> \
     --runtime runtime-trace.ndjson.gz \
     --manifest manifest.json
   ```
5) Export the ingest receipt and add it to Evidence Locker; update Timeline when reconnected.
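
The hash validation in step 2 might look like the following sketch. The layout of `manifest.json` is not specified in this runbook, so the code assumes a hypothetical shape, `{"files": {"<name>": "<sha256 hex>"}}`; adjust the parsing to the real manifest schema. The function name `verify_bundle` is likewise illustrative.

```python
import hashlib
import json
from pathlib import Path


def verify_bundle(bundle_dir):
    """Return a list of (filename, problem) tuples; an empty list means every hash matched.

    ASSUMPTION: manifest.json has the shape {"files": {"<name>": "<sha256 hex>"}}.
    The actual manifest format may differ.
    """
    bundle = Path(bundle_dir)
    manifest = json.loads((bundle / "manifest.json").read_text(encoding="utf-8"))
    problems = []
    for name, expected in manifest["files"].items():
        target = bundle / name
        if not target.is_file():
            problems.append((name, "missing"))
            continue
        actual = hashlib.sha256(target.read_bytes()).hexdigest()
        if actual != expected.lower():
            problems.append((name, "hash mismatch"))
    return problems
```

Any non-empty result should stop the import before the bundle reaches CAS, in line with the imposed rule at the top of this runbook.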
## 4. Checks & alerts
- **Drift**: block ingest if the time anchor's age exceeds the configured budget; surface `staleness_seconds`.
- **Hash mismatch**: fail ingest; write a `runtime.ingest.failed` event with the reason.
- **Orphan traces**: if no matching `graph_hash` exists, queue the trace for retry and alert on the `reachability.orphan_traces` counter.
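
As an illustration of the drift check, the sketch below compares the time anchor's age with a budget and returns the value to surface as `staleness_seconds`. The function name and signature are assumptions made for this example, not the Signals implementation.

```python
from datetime import datetime, timezone


def check_time_anchor(anchor_time, budget_seconds, now=None):
    """Return (allowed, staleness_seconds) for a timezone-aware anchor timestamp.

    Ingest is allowed only while the anchor's age stays within the budget.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    staleness_seconds = (now - anchor_time).total_seconds()
    return staleness_seconds <= budget_seconds, staleness_seconds
```

Passing `now` explicitly keeps the check deterministic in tests and lets a sealed environment supply its own trusted clock reading.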
## 5. Troubleshooting
- **400 Bad Request**: validate NDJSON schema; run `scripts/reachability/validate_runtime_trace.py`.
- **Hash mismatch**: recompute `sha256sum runtime-trace.ndjson.gz`; compare to manifest.
- **Missing symbols**: ensure the symbol manifest has been ingested (see `docs/specs/symbols/SYMBOL_MANIFEST_v1.md`), then rerun `stella graph verify`.
- **High drift**: refresh time anchor (AirGap Time service) or resync NTP; retry ingest.
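
For a quick local look at a trace that triggers a 400, a check along these lines can help. It is a simplified stand-in written for this runbook, not the contents of `scripts/reachability/validate_runtime_trace.py`: it only confirms that each line is a JSON object containing the fields named in section 2, and `find_invalid_lines` is a hypothetical name.

```python
import gzip
import json

# Field names taken from the capture step in section 2 of this runbook.
REQUIRED_FIELDS = ("symbol_id", "purl", "timestamp", "pid", "container", "count")


def find_invalid_lines(path):
    """Return a list of (line_number, reason) for records that fail basic checks."""
    errors = []
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                errors.append((lineno, f"invalid JSON: {exc.msg}"))
                continue
            if not isinstance(record, dict):
                errors.append((lineno, "record is not a JSON object"))
                continue
            missing = [name for name in REQUIRED_FIELDS if name not in record]
            if missing:
                errors.append((lineno, "missing fields: " + ", ".join(missing)))
    return errors
```

An empty result means only that the basic shape is right; the repository script remains the reference for full schema validation.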
## 6. Artefact checklist
- `runtime-trace.ndjson.gz` (or `.json`), `sha256` recorded.
- Optional `runtime.dsse.json` DSSE bundle.
- Ingest receipt (ingest id, graph hash, CAS digest, tenant).
- Timeline event `reach.runtime.ingested` and Evidence Locker record (bundle + receipt).
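
One way to picture the ingest receipt is as a small immutable record with the four fields listed above. The class and JSON key names below are hypothetical; the real receipt format is defined by the Signals tooling, not by this sketch.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class IngestReceipt:
    """Illustrative receipt holding the four items from the checklist."""
    ingest_id: str
    graph_hash: str
    cas_digest: str
    tenant: str

    def to_json(self):
        # Sorted keys and fixed separators give byte-stable output,
        # so the serialized receipt can itself be hashed reproducibly.
        return json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
```

A stable serialization matters here because the receipt is stored in Evidence Locker alongside the bundle, and a reproducible byte form makes later comparison straightforward.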
## 7. References
- `docs/reachability/DELIVERY_GUIDE.md`
- `docs/reachability/function-level-evidence.md`
- `docs/reachability/evidence-schema.md`
- `docs/specs/symbols/SYMBOL_MANIFEST_v1.md`