git.stella-ops.org/docs/reachability/reachability.md
2025-11-26 07:47:08 +02:00

Reachability · Runtime + Static Union (v0.1)

What this covers

  • End-to-end flow for combining static callgraphs (Scanner) and runtime traces (Zastava) into replayable reachability bundles.
  • Storage layout (CAS namespaces), manifest fields, and Signals APIs that consume/emit reachability facts.
  • How unknowns/pressure and scoring are derived so Policy/UI can explain outcomes.

Pipeline (at a glance)

  1. Scanner emits language-specific callgraphs as richgraph-v1 and packs them into CAS under reachability_graphs/<digest>.tar.zst with manifest meta.json.
  2. Zastava Observer streams NDJSON runtime facts (symbol_id, code_id, hit_count, loader_base, cas_uri) to Signals via POST /signals/runtime-facts or POST /signals/runtime-facts/ndjson.
  3. Union bundles (runtime + static) are uploaded as ZIP archives to POST /signals/reachability/union with an optional X-Analysis-Id header; Signals stores them under reachability_graphs/{analysisId}/.
  4. Signals scoring consumes union data + runtime facts, computes per-target states (bucket, weight, confidence, score), fact-level score, unknowns pressure, and publishes signals.fact.updated@v1 events.
  5. Replay records provenance: reachability section in replay manifest lists CAS URIs (graphs + runtime traces), namespaces, analyzer/version, callgraphIds, and the shared analysisId.
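
As an illustration of step 2, the following minimal sketch serializes runtime facts as NDJSON (one JSON object per line). The field names come from the pipeline description above; the sample values, the value types (for example loader_base as a hex string), and the helper name to_ndjson are assumptions made for this example only, not the Zastava implementation.

```python
import json


def to_ndjson(facts):
    """Serialize an iterable of dicts as NDJSON: one compact JSON object per line."""
    return "".join(
        json.dumps(fact, separators=(",", ":"), sort_keys=True) + "\n"
        for fact in facts
    )


# Illustrative values only; real identifiers and URIs will differ.
facts = [
    {
        "symbol_id": "sym:example.main",
        "code_id": "code:example-1",
        "hit_count": 3,
        "loader_base": "0x7f0000000000",
        "cas_uri": "cas://runtime_traces/<hh>/<sha>.tar.zst",
    },
    {
        "symbol_id": "sym:example.helper",
        "code_id": "code:example-1",
        "hit_count": 1,
        "loader_base": "0x7f0000000000",
        "cas_uri": "cas://runtime_traces/<hh>/<sha>.tar.zst",
    },
]

payload = to_ndjson(facts)
```

Each line of payload parses independently with json.loads, which is the property that lets the endpoint process the stream record by record.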

Storage & CAS namespaces

  • Static graphs: cas://reachability_graphs/<hh>/<sha>.tar.zst (meta.json + graph files).
  • Runtime traces: cas://runtime_traces/<hh>/<sha>.tar.zst (NDJSON or zipped stream).
  • Replay manifest now includes analysisId to correlate graphs/traces; each reference also carries namespace and callgraphId (static) for unambiguous replay.
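
A deterministic CAS path can be derived from the archive content alone. The sketch below assumes that <sha> is the lowercase hex SHA-256 of the archive bytes and that <hh> is its first two hex characters; both readings are assumptions about the placeholders above, and cas_uri is a hypothetical helper name.

```python
import hashlib


def cas_uri(namespace: str, archive_bytes: bytes) -> str:
    """Build a content-addressed URI for an archive.

    Assumes <sha> = SHA-256 hex digest and <hh> = its first two characters.
    """
    sha = hashlib.sha256(archive_bytes).hexdigest()
    return f"cas://{namespace}/{sha[:2]}/{sha}.tar.zst"


graph_uri = cas_uri("reachability_graphs", b"example graph archive")
trace_uri = cas_uri("runtime_traces", b"example trace archive")
```

Because the URI depends only on the namespace and the bytes, re-running the same analysis over identical inputs yields the same reference, which is what replay relies on.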

Signals API quick reference

  • POST /signals/runtime-facts — structured request body; recomputes reachability.
  • POST /signals/runtime-facts/ndjson — streaming NDJSON/gzip; requires callgraphId header params.
  • POST /signals/reachability/union — upload ZIP bundle; optional X-Analysis-Id.
  • GET /signals/reachability/union/{analysisId}/meta — returns meta.json.
  • GET /signals/reachability/union/{analysisId}/files/{fileName} — download bundled graph/trace files.
  • GET /signals/facts/{subjectKey} — fetch latest reachability fact (includes unknowns counters and targets).
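
A union upload request might be assembled as follows. This is a sketch that only builds the request without sending it. The base URL is a placeholder, the Content-Type value is an assumption, and the helper names are hypothetical; only the paths and the X-Analysis-Id header come from the reference above.

```python
import urllib.request

BASE_URL = "https://signals.example.internal"  # hypothetical host


def build_union_upload(zip_bytes, analysis_id=None):
    """Prepare a POST /signals/reachability/union request for a ZIP bundle."""
    headers = {"Content-Type": "application/zip"}  # assumed content type
    if analysis_id is not None:
        headers["X-Analysis-Id"] = analysis_id  # optional per the API reference
    return urllib.request.Request(
        url=f"{BASE_URL}/signals/reachability/union",
        data=zip_bytes,
        headers=headers,
        method="POST",
    )


def meta_url(analysis_id):
    """URL of the meta.json endpoint for a stored union bundle."""
    return f"{BASE_URL}/signals/reachability/union/{analysis_id}/meta"


request = build_union_upload(b"PK\x05\x06" + b"\x00" * 18, analysis_id="analysis-1")
```

Passing request to urllib.request.urlopen would send it; authentication and error handling are deliberately left out of the sketch.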

Scoring and unknowns

  • Buckets (default weights): entrypoint 1.0, direct 0.85, runtime 0.45, unknown 0.5, unreachable 0.0.
  • Confidence: starts from a reachable or unreachable base value, adds a runtime bonus, and is clamped between Min and Max (defaults 0.05 and 0.99).
  • Unknowns: Signals counts unresolved symbols/edges per subject; UnknownsPressure = unknowns / (states + unknowns) (capped). Fact score is reduced by UnknownsPenaltyCeiling (default 0.35) × pressure.
  • Events: signals.fact.updated@v1 now emits unknownsCount and unknownsPressure plus bucket/weight/stateCount/targets.
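
The arithmetic above can be sketched as follows. Several details are assumptions, not confirmed behaviour of ReachabilityScoringService: the runtime bonus is treated as additive, the pressure cap is taken to be 1.0 (the source only says "capped"), and "reduced by" is read multiplicatively as score × (1 − ceiling × pressure). A subtractive reading (score − ceiling × pressure) is equally plausible from the text.

```python
# Default bucket weights as listed above.
BUCKET_WEIGHTS = {
    "entrypoint": 1.0,
    "direct": 0.85,
    "runtime": 0.45,
    "unknown": 0.5,
    "unreachable": 0.0,
}


def confidence(base, runtime_bonus=0.0, minimum=0.05, maximum=0.99):
    """Clamp base + bonus into [minimum, maximum] (additive bonus is an assumption)."""
    return max(minimum, min(maximum, base + runtime_bonus))


def unknowns_pressure(unknowns, states, cap=1.0):
    """unknowns / (states + unknowns), capped; the cap value is an assumption."""
    total = states + unknowns
    if total == 0:
        return 0.0
    return min(unknowns / total, cap)


def penalized_score(raw_score, unknowns, states, penalty_ceiling=0.35):
    """Apply the unknowns penalty (multiplicative reading, see lead-in)."""
    pressure = unknowns_pressure(unknowns, states)
    return raw_score * (1.0 - penalty_ceiling * pressure)


# One unresolved symbol next to three resolved states gives pressure 0.25.
pressure = unknowns_pressure(unknowns=1, states=3)
score = penalized_score(raw_score=0.8, unknowns=1, states=3)
```

With these inputs the penalty factor is 1 − 0.35 × 0.25 = 0.9125, so a raw score of 0.8 becomes 0.73 under the multiplicative reading.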

Replay contract changes (v0.1 add-ons)

  • reachability.analysisId (string, optional) — ties to Signals union ingest.
  • Graph refs include namespace, callgraphId, analyzer, version, sha256, casUri.
  • Runtime trace refs include namespace, recordedAt, sha256, casUri.
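
For illustration, a reachability section carrying these fields could look like the structure below, with a small check for missing reference fields. The container keys graphs and runtimeTraces, the helper missing_fields, and every value shown are hypothetical; only the per-reference field names and analysisId come from the list above. The authoritative layout is the schema listed under References.

```python
GRAPH_REF_FIELDS = {"namespace", "callgraphId", "analyzer", "version", "sha256", "casUri"}
TRACE_REF_FIELDS = {"namespace", "recordedAt", "sha256", "casUri"}


def missing_fields(ref, required):
    """Return the required field names absent from a reference, sorted."""
    return sorted(required - set(ref))


# Placeholder values; "graphs" and "runtimeTraces" are assumed key names.
reachability = {
    "analysisId": "analysis-1",
    "graphs": [
        {
            "namespace": "reachability_graphs",
            "callgraphId": "callgraph-1",
            "analyzer": "example-analyzer",
            "version": "0.0.0",
            "sha256": "<sha>",
            "casUri": "cas://reachability_graphs/<hh>/<sha>.tar.zst",
        }
    ],
    "runtimeTraces": [
        {
            "namespace": "runtime_traces",
            "recordedAt": "2025-01-01T00:00:00Z",
            "sha256": "<sha>",
            "casUri": "cas://runtime_traces/<hh>/<sha>.tar.zst",
        }
    ],
}

problems = [
    missing_fields(ref, GRAPH_REF_FIELDS) for ref in reachability["graphs"]
] + [
    missing_fields(ref, TRACE_REF_FIELDS) for ref in reachability["runtimeTraces"]
]
```

An empty list for every reference means the section carries all fields named in the contract.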

Operator checklist

  • Use deterministic CAS paths; never embed absolute file paths.
  • When emitting runtime NDJSON, include loader_base and code_id when available for de-dup.
  • Ensure analysisId is propagated from Scanner/Zastava into Signals ingest to keep replay manifests linked.
  • Keep feeds frozen for reproducibility; avoid external downloads in union preparation.

References

  • Schema: docs/reachability/runtime-static-union-schema.md
  • Delivery guide: docs/reachability/DELIVERY_GUIDE.md
  • Unknowns registry & scoring: Signals code (ReachabilityScoringService, UnknownsIngestionService) and events doc docs/signals/events-24-005.md.