git.stella-ops.org/docs/runbooks/reachability-runtime.md
StellaOps Bot d63af51f84
2025-11-26 20:23:28 +02:00

Reachability Runtime Ingestion Runbook

Imposed rule: Runtime traces must never bypass CAS/DSSE verification; ingest only CAS-addressed NDJSON with hashes logged to Timeline and Evidence Locker.

This runbook guides operators through ingesting runtime reachability evidence (EntryTrace, probes, Signals ingestion) and wiring it into the reachability evidence chain.

1. Prerequisites

  • Services: Signals API, Zastava Observer (or other probes), Evidence Locker, optional Attestor for DSSE.
  • Reachability schema: docs/reachability/function-level-evidence.md, docs/reachability/evidence-schema.md.
  • CAS: configured bucket/path for cas://reachability/runtime/* and .../graphs/*.
  • Time sync: AirGap Time anchor if sealed; otherwise NTP with drift < 200 ms.

2. Ingestion workflow (online)

  1. Capture traces from Observer/probes as gzip-compressed NDJSON (runtime-trace.ndjson.gz); each record carries symbol_id, purl, timestamp, pid, container, count.
  2. Stage to CAS: upload the file, record its sha256, and store it at cas://reachability/runtime/<sha256>.
  3. Optionally sign: wrap CAS digest in DSSE (stella attest runtime --bundle runtime.dsse.json).
  4. Ingest via Signals API:
    curl -H "X-Stella-Tenant: acme" \
         -H "Content-Type: application/x-ndjson" \
         --data-binary @runtime-trace.ndjson.gz \
         "https://signals.example/api/v1/runtime-facts?graph_hash=<graph>"
    
    Headers returned: Content-SHA256, X-Graph-Hash, X-Ingest-Id.
  5. Emit timeline: ensure a Timeline event reach.runtime.ingested is recorded with the CAS digest and ingest id.
  6. Verify: run stella graph verify --runtime runtime-trace.ndjson.gz --graph <graph_hash> to confirm that the runtime edges map onto the graph.
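Steps 1 and 2 can be sketched in Python as follows. This is a minimal illustration, not the Stella Ops tooling: the helper names (write_trace, sha256_file, cas_uri) and the sample record values are hypothetical. Only the field names and the cas://reachability/runtime/<sha256> layout come from this runbook. The upload to the CAS bucket is deployment-specific and is left out.

```python
import gzip
import hashlib
import json
import os
import tempfile

# Fields the runbook lists for each runtime trace record.
TRACE_FIELDS = ("symbol_id", "purl", "timestamp", "pid", "container", "count")


def write_trace(records, path):
    """Write records as gzip-compressed NDJSON (one JSON object per line)."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for rec in records:
            missing = [f for f in TRACE_FIELDS if f not in rec]
            if missing:
                raise ValueError(f"record missing fields: {missing}")
            fh.write(json.dumps(rec, sort_keys=True) + "\n")


def sha256_file(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 so large traces need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cas_uri(sha256_hex):
    """Build the CAS address this runbook uses for runtime traces."""
    return f"cas://reachability/runtime/{sha256_hex}"


# Hypothetical sample record; the values are illustrative only.
sample = [{
    "symbol_id": "sym:example",
    "purl": "pkg:npm/example@1.0.0",
    "timestamp": "2025-11-26T18:23:28Z",
    "pid": 4242,
    "container": "web-1",
    "count": 3,
}]

workdir = tempfile.mkdtemp()
trace_path = os.path.join(workdir, "runtime-trace.ndjson.gz")
write_trace(sample, trace_path)
digest = sha256_file(trace_path)
print(cas_uri(digest))
```

The digest covers the compressed bytes, because that is the file uploaded in step 4. gzip embeds a modification time in its header by default, so recompressing the same records usually yields a different digest; hash the exact file you stage rather than a regenerated copy.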

3. Ingestion workflow (air-gap)

  1. Receive runtime bundle containing runtime-trace.ndjson.gz, manifest.json (hashes), optional DSSE.
  2. Validate the file hashes against the manifest; if the bundle includes a DSSE envelope, verify it as well.
  3. Import into CAS path cas://reachability/runtime/<sha256> using offline loader.
  4. Run Signals offline ingest tool:
    signals-offline ingest-runtime \
      --tenant acme \
      --graph-hash <graph_hash> \
      --runtime runtime-trace.ndjson.gz \
      --manifest manifest.json
    
  5. Export the ingest receipt and add it to Evidence Locker; update Timeline once connectivity is restored.
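The hash validation in step 2 might look like the sketch below. The manifest.json schema is not specified in this runbook, so the layout used here ({"files": {"<name>": "<sha256 hex>"}}) is an assumption, and verify_bundle is a hypothetical helper rather than part of signals-offline. DSSE verification is out of scope for this sketch.

```python
import hashlib
import json
import os
import tempfile


def sha256_file(path):
    """Return the hex SHA-256 of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(bundle_dir, manifest_name="manifest.json"):
    """Return {filename: bool} for every manifest entry; missing files fail.

    Assumes a manifest shaped like {"files": {"name": "sha256hex"}}.
    """
    with open(os.path.join(bundle_dir, manifest_name), encoding="utf-8") as fh:
        manifest = json.load(fh)
    results = {}
    for name, expected in manifest["files"].items():
        path = os.path.join(bundle_dir, name)
        if not os.path.isfile(path):
            results[name] = False
            continue
        results[name] = sha256_file(path) == expected.lower()
    return results


# Build a throwaway bundle, verify it, then tamper with it and verify again.
bundle = tempfile.mkdtemp()
trace = os.path.join(bundle, "runtime-trace.ndjson.gz")
with open(trace, "wb") as fh:
    fh.write(b"placeholder bytes standing in for a real trace")
with open(os.path.join(bundle, "manifest.json"), "w", encoding="utf-8") as fh:
    json.dump({"files": {"runtime-trace.ndjson.gz": sha256_file(trace)}}, fh)

ok_before = verify_bundle(bundle)
with open(trace, "ab") as fh:
    fh.write(b"tampered")
ok_after = verify_bundle(bundle)
print(ok_before, ok_after)
```

Any False entry should stop the import before the file reaches the CAS path, in line with the rule that traces never bypass verification.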

4. Checks & alerts

  • Drift: block ingest if the time anchor's age exceeds the configured budget; surface staleness_seconds.
  • Hash mismatch: fail ingest; write a runtime.ingest.failed event with the reason.
  • Orphan traces: if no graph matches graph_hash, queue the trace for retry, increment the reachability.orphan_traces counter, and alert on it.
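The three checks can be read as a single gate in front of ingest. The sketch below is one possible arrangement, not the Signals implementation: the Decision and IngestRequest types, the evaluate function, and the order in which the checks run are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    BLOCK_STALE_ANCHOR = "block_stale_anchor"
    FAIL_HASH_MISMATCH = "fail_hash_mismatch"
    QUEUE_ORPHAN = "queue_orphan"


@dataclass
class IngestRequest:
    declared_sha256: str    # digest recorded when the trace was staged
    actual_sha256: str      # digest recomputed at ingest time
    graph_hash: str
    anchor_time: datetime   # when the time anchor was last refreshed


def staleness_seconds(anchor_time, now):
    """Age of the time anchor in seconds, never negative."""
    return max(0.0, (now - anchor_time).total_seconds())


def evaluate(req, known_graphs, drift_budget_s, now):
    """Apply the drift, hash, and orphan checks (order is an assumption)."""
    stale = staleness_seconds(req.anchor_time, now)
    if stale > drift_budget_s:
        return Decision.BLOCK_STALE_ANCHOR, stale
    if req.declared_sha256.lower() != req.actual_sha256.lower():
        return Decision.FAIL_HASH_MISMATCH, stale
    if req.graph_hash not in known_graphs:
        return Decision.QUEUE_ORPHAN, stale
    return Decision.ACCEPT, stale


now = datetime(2025, 11, 26, 18, 30, tzinfo=timezone.utc)
fresh = now - timedelta(seconds=60)
old = now - timedelta(hours=2)
graphs = {"graph-a"}

accepted = evaluate(IngestRequest("ab", "AB", "graph-a", fresh), graphs, 300, now)
stale = evaluate(IngestRequest("ab", "ab", "graph-a", old), graphs, 300, now)
mismatch = evaluate(IngestRequest("ab", "cd", "graph-a", fresh), graphs, 300, now)
orphan = evaluate(IngestRequest("ab", "ab", "graph-z", fresh), graphs, 300, now)
print(accepted, stale, mismatch, orphan)
```

Returning the staleness value alongside the decision lets the caller surface staleness_seconds whether or not the ingest was blocked.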

5. Troubleshooting

  • 400 Bad Request: validate NDJSON schema; run scripts/reachability/validate_runtime_trace.py.
  • Hash mismatch: recompute the digest with sha256sum runtime-trace.ndjson.gz and compare it to the manifest entry.
  • Missing symbols: ensure symbol manifest ingested (see docs/specs/symbols/SYMBOL_MANIFEST_v1.md); rerun stella graph verify.
  • High drift: refresh time anchor (AirGap Time service) or resync NTP; retry ingest.
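For a 400 Bad Request, a quick local check of the trace can narrow down the offending lines before retrying. The sketch below is not scripts/reachability/validate_runtime_trace.py and does not reproduce its rules. It checks only that the six fields named in this runbook are present, and the expected types (for example, pid and count as integers, timestamp as a string) are assumptions. Consult docs/reachability/evidence-schema.md for the authoritative schema.

```python
import gzip
import io
import json

# Field names come from the runbook; the types are assumptions for illustration.
REQUIRED = {
    "symbol_id": str,
    "purl": str,
    "timestamp": str,
    "pid": int,
    "container": str,
    "count": int,
}


def validate_lines(lines):
    """Yield (line_number, message) for every problem found."""
    for lineno, raw in enumerate(lines, start=1):
        if not raw.strip():
            continue  # tolerate blank lines
        try:
            rec = json.loads(raw)
        except json.JSONDecodeError as exc:
            yield lineno, f"invalid JSON: {exc.msg}"
            continue
        if not isinstance(rec, dict):
            yield lineno, "record is not a JSON object"
            continue
        for field, typ in REQUIRED.items():
            if field not in rec:
                yield lineno, f"missing field {field}"
            elif not isinstance(rec[field], typ) or isinstance(rec[field], bool):
                # bool is a subclass of int in Python, so reject it explicitly.
                yield lineno, f"field {field} should be {typ.__name__}"


def validate_gz(data):
    """Validate gzip-compressed NDJSON held in memory."""
    with gzip.open(io.BytesIO(data), "rt", encoding="utf-8") as fh:
        return list(validate_lines(fh))


good = ('{"symbol_id": "sym:a", "purl": "pkg:npm/a@1.0.0", '
        '"timestamp": "2025-11-26T18:23:28Z", "pid": 1, "container": "c1", "count": 2}')
bad = ('{"symbol_id": "sym:b", "purl": "pkg:npm/b@1.0.0", '
       '"pid": "7", "container": "c1", "count": 1}')
payload = gzip.compress((good + "\n" + bad + "\n{oops\n").encode("utf-8"))
problems = validate_gz(payload)
for lineno, message in problems:
    print(f"line {lineno}: {message}")
```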

6. Artefact checklist

  • runtime-trace.ndjson.gz (or .json), sha256 recorded.
  • Optional runtime.dsse.json DSSE bundle.
  • Ingest receipt (ingest id, graph hash, CAS digest, tenant).
  • Timeline event reach.runtime.ingested and Evidence Locker record (bundle + receipt).
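The ingest receipt can be modelled as a small immutable record built from the response headers listed in section 2. This is a sketch under assumptions: IngestReceipt and receipt_from_headers are hypothetical names, the JSON layout is illustrative, and it assumes that Content-SHA256 carries the same hex digest that was staged in CAS.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class IngestReceipt:
    ingest_id: str
    graph_hash: str
    cas_digest: str
    tenant: str

    @property
    def cas_uri(self):
        return f"cas://reachability/runtime/{self.cas_digest}"


def receipt_from_headers(headers, tenant, expected_sha256):
    """Build a receipt from response headers, refusing a digest mismatch."""
    lowered = {k.lower(): v for k, v in headers.items()}
    reported = lowered["content-sha256"].lower()
    if reported != expected_sha256.lower():
        raise ValueError("server-reported digest differs from the staged CAS digest")
    return IngestReceipt(
        ingest_id=lowered["x-ingest-id"],
        graph_hash=lowered["x-graph-hash"],
        cas_digest=reported,
        tenant=tenant,
    )


# Hypothetical header values for illustration.
staged = "ab" * 32
headers = {
    "Content-SHA256": staged.upper(),
    "X-Graph-Hash": "graph-a",
    "X-Ingest-Id": "ingest-0001",
}
receipt = receipt_from_headers(headers, tenant="acme", expected_sha256=staged)
receipt_json = json.dumps(asdict(receipt), sort_keys=True)
print(receipt_json)
print(receipt.cas_uri)
```

Serialising with sorted keys keeps the receipt byte-stable, which helps if the receipt itself is later hashed or stored alongside the bundle in Evidence Locker.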

7. References

  • docs/reachability/DELIVERY_GUIDE.md
  • docs/reachability/function-level-evidence.md
  • docs/reachability/evidence-schema.md
  • docs/specs/symbols/SYMBOL_MANIFEST_v1.md