Reachability · Runtime + Static Union (v0.1)
What this covers
- End-to-end flow for combining static callgraphs (Scanner) and runtime traces (Zastava) into replayable reachability bundles.
- Storage layout (CAS namespaces), manifest fields, and Signals APIs that consume/emit reachability facts.
- How unknowns/pressure and scoring are derived so Policy/UI can explain outcomes.
Pipeline (at a glance)
- Scanner emits language-specific callgraphs as `richgraph-v1` and packs them into CAS under `reachability_graphs/<digest>.tar.zst` with manifest `meta.json`.
- Zastava Observer streams NDJSON runtime facts (`symbol_id`, `code_id`, `hit_count`, `loader_base`, `cas_uri`) to Signals `POST /signals/runtime-facts` or `/runtime-facts/ndjson`.
- Union bundles (runtime + static) are uploaded as ZIP to `POST /signals/reachability/union` with optional `X-Analysis-Id`; Signals stores them under `reachability_graphs/{analysisId}/`.
- Signals scoring consumes union data + runtime facts, computes per-target states (bucket, weight, confidence, score), fact-level score, and unknowns pressure, and publishes `signals.fact.updated@v1` events.
- Replay records provenance: the reachability section in the replay manifest lists CAS URIs (graphs + runtime traces), namespaces, analyzer/version, callgraphIds, and the shared `analysisId`.
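To make the runtime-facts step concrete, here is a minimal sketch of how a producer might serialize facts as NDJSON. The field names come from the pipeline description above; the helper name `to_ndjson`, the sample values, and the choice to omit unavailable fields are assumptions for illustration, not the Zastava Observer implementation.

```python
import json

# Field names are taken from the pipeline description; everything else is illustrative.
RUNTIME_FACT_FIELDS = ("symbol_id", "code_id", "hit_count", "loader_base", "cas_uri")

def to_ndjson(facts):
    """Serialize runtime facts as NDJSON: one compact JSON object per line."""
    lines = []
    for fact in facts:
        # Keep only the documented fields and drop ones that are not available.
        record = {k: fact[k] for k in RUNTIME_FACT_FIELDS if fact.get(k) is not None}
        # sort_keys keeps the output deterministic across runs.
        lines.append(json.dumps(record, sort_keys=True, separators=(",", ":")))
    return "".join(line + "\n" for line in lines)

# Hypothetical sample facts.
sample_facts = [
    {"symbol_id": "sym:example.Main.run", "code_id": "code:abc", "hit_count": 3,
     "loader_base": "0x7f0000000000", "cas_uri": None},
    {"symbol_id": "sym:example.Util.parse", "hit_count": 1},
]
ndjson_payload = to_ndjson(sample_facts)
```

Sorting keys and using compact separators keeps the payload byte-stable, which fits the deterministic, replayable intent of the bundles.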
Storage & CAS namespaces
- Static graphs: `cas://reachability_graphs/<hh>/<sha>.tar.zst` (meta.json + graph files).
- Runtime traces: `cas://runtime_traces/<hh>/<sha>.tar.zst` (NDJSON or zipped stream).
- Replay manifest now includes `analysisId` to correlate graphs/traces; each reference also carries `namespace` and `callgraphId` (static) for unambiguous replay.
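The CAS layout above can be sketched as a small path builder. This assumes that `<sha>` is the lowercase hex SHA-256 of the archive bytes and that `<hh>` is its first two hex characters; neither detail is stated in this document, and `cas_uri` is a hypothetical helper name.

```python
import hashlib

def cas_uri(namespace, archive_bytes):
    """Build a content-addressed URI for a .tar.zst archive's bytes."""
    sha = hashlib.sha256(archive_bytes).hexdigest()
    shard = sha[:2]  # Assumption: <hh> is the first two hex characters of <sha>.
    return f"cas://{namespace}/{shard}/{sha}.tar.zst"

# The same bytes always map to the same URI, whichever namespace is used.
graph_uri = cas_uri("reachability_graphs", b"example archive bytes")
trace_uri = cas_uri("runtime_traces", b"example archive bytes")
```

Because the URI depends only on the namespace and the content hash, it never embeds a host-specific or absolute file path.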
Signals API quick reference
- `POST /signals/runtime-facts`: structured request body; recomputes reachability.
- `POST /signals/runtime-facts/ndjson`: streaming NDJSON/gzip; requires `callgraphId` header params.
- `POST /signals/reachability/union`: upload ZIP bundle; optional `X-Analysis-Id`.
- `GET /signals/reachability/union/{analysisId}/meta`: returns meta.json.
- `GET /signals/reachability/union/{analysisId}/files/{fileName}`: download bundled graph/trace files.
- `GET /signals/facts/{subjectKey}`: fetch latest reachability fact (includes unknowns counters and targets).
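As a rough illustration of the union endpoints, the sketch below prepares an upload request without sending it. The base URL, the `application/zip` content type, and the assumption that the ZIP is sent as the raw request body are not confirmed by this document; `build_union_upload` and `union_meta_url` are hypothetical helper names.

```python
import urllib.request

# Hypothetical base URL; replace with the address of your Signals deployment.
SIGNALS_BASE_URL = "http://signals.example.internal"

def build_union_upload(zip_bytes, analysis_id=None):
    """Prepare (but do not send) a union bundle upload request."""
    headers = {"Content-Type": "application/zip"}  # Assumed content type.
    if analysis_id is not None:
        headers["X-Analysis-Id"] = analysis_id  # Optional per the quick reference.
    return urllib.request.Request(
        url=f"{SIGNALS_BASE_URL}/signals/reachability/union",
        data=zip_bytes,
        headers=headers,
        method="POST",
    )

def union_meta_url(analysis_id):
    """URL of the meta.json endpoint for a stored union bundle."""
    return f"{SIGNALS_BASE_URL}/signals/reachability/union/{analysis_id}/meta"

# Placeholder bytes standing in for a real ZIP bundle.
upload_request = build_union_upload(b"placeholder-zip-bytes", analysis_id="analysis-001")
```

Passing the prepared request to `urllib.request.urlopen` would perform the upload; supplying the same `analysisId` on the later `GET` calls retrieves the stored `meta.json` and files.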
Scoring and unknowns
- Buckets (default weights): entrypoint 1.0, direct 0.85, runtime 0.45, unknown 0.5, unreachable 0.0.
- Confidence: starts from a base value that differs for reachable and unreachable targets, adds a runtime bonus, and is clamped between Min/Max (defaults 0.05–0.99).
- Unknowns: Signals counts unresolved symbols/edges per subject; `UnknownsPressure = unknowns / (states + unknowns)` (capped). Fact score is reduced by `UnknownsPenaltyCeiling` (default 0.35) × pressure.
- Events: `signals.fact.updated@v1` now emits `unknownsCount` and `unknownsPressure` plus bucket/weight/stateCount/targets.
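The unknowns arithmetic can be worked through with a short sketch. The default weights, the 0.35 ceiling, and the 0.05 to 0.99 clamp come from the list above. The cap of 1.0 on pressure and the multiplicative form of the reduction are assumptions, since this document only says the score is "reduced by" ceiling × pressure; `ReachabilityScoringService` remains the authoritative source.

```python
# Defaults quoted in this section.
BUCKET_WEIGHTS = {
    "entrypoint": 1.0, "direct": 0.85, "runtime": 0.45,
    "unknown": 0.5, "unreachable": 0.0,
}
UNKNOWNS_PENALTY_CEILING = 0.35
CONFIDENCE_MIN, CONFIDENCE_MAX = 0.05, 0.99

def unknowns_pressure(unknowns, states, cap=1.0):
    """UnknownsPressure = unknowns / (states + unknowns), capped (cap value assumed)."""
    total = states + unknowns
    if total == 0:
        return 0.0  # No states and no unknowns: treat as zero pressure.
    return min(unknowns / total, cap)

def clamp_confidence(value):
    """Clamp a confidence value into [CONFIDENCE_MIN, CONFIDENCE_MAX]."""
    return max(CONFIDENCE_MIN, min(CONFIDENCE_MAX, value))

def penalized_score(raw_score, pressure):
    """Assumed multiplicative reduction: lose up to the ceiling fraction of the score."""
    return raw_score * (1.0 - UNKNOWNS_PENALTY_CEILING * pressure)

# Worked example: 2 unresolved symbols alongside 6 resolved states.
pressure = unknowns_pressure(unknowns=2, states=6)  # 2 / 8 = 0.25
score = penalized_score(0.8, pressure)              # 0.8 * (1 - 0.35 * 0.25)
```

Under these assumptions, a subject with no unknowns keeps its full score, and one made up entirely of unknowns loses at most 35% of it.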
Replay contract changes (v0.1 add-ons)
- `reachability.analysisId` (string, optional): ties to Signals union ingest.
- Graph refs include `namespace`, `callgraphId`, analyzer, version, sha256, casUri.
- Runtime trace refs include `namespace`, recordedAt, sha256, casUri.
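A reachability section carrying these fields might look roughly like the structure below. The per-reference field names are the ones listed above; the container keys `graphs` and `runtimeTraces`, all sample values, and the `missing_fields` helper are hypothetical. The schema document listed under References is the authority on the real layout.

```python
import json

GRAPH_REF_FIELDS = ("namespace", "callgraphId", "analyzer", "version", "sha256", "casUri")
TRACE_REF_FIELDS = ("namespace", "recordedAt", "sha256", "casUri")

# Placeholder digest (SHA-256 of empty input), used only to keep the sample well-formed.
SAMPLE_SHA = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

reachability_section = {
    "analysisId": "analysis-001",
    "graphs": [{  # Hypothetical container key.
        "namespace": "reachability_graphs",
        "callgraphId": "cg-example-1",
        "analyzer": "example-analyzer",
        "version": "0.1.0",
        "sha256": SAMPLE_SHA,
        "casUri": f"cas://reachability_graphs/e3/{SAMPLE_SHA}.tar.zst",
    }],
    "runtimeTraces": [{  # Hypothetical container key.
        "namespace": "runtime_traces",
        "recordedAt": "2025-01-01T00:00:00Z",
        "sha256": SAMPLE_SHA,
        "casUri": f"cas://runtime_traces/e3/{SAMPLE_SHA}.tar.zst",
    }],
}

def missing_fields(ref, required):
    """Return the required field names that are absent from a reference."""
    return [name for name in required if name not in ref]

manifest_json = json.dumps({"reachability": reachability_section}, indent=2, sort_keys=True)
```

A check like `missing_fields` is one cheap way to reject incomplete references before a manifest is written.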
Operator checklist
- Use deterministic CAS paths; never embed absolute file paths.
- When emitting runtime NDJSON, include `loader_base` and `code_id` when available for de-dup.
- Ensure `analysisId` is propagated from Scanner/Zastava into Signals ingest to keep replay manifests linked.
- Keep feeds frozen for reproducibility; avoid external downloads in union preparation.
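To show why `loader_base` and `code_id` help with de-duplication, here is one possible merge strategy. The key `(symbol_id, code_id, loader_base)` and the decision to sum `hit_count` across duplicates are assumptions for illustration; this document does not specify how Signals de-duplicates facts.

```python
def dedup_facts(facts):
    """Merge runtime facts that share the assumed identity key, summing hit counts."""
    merged = {}
    for fact in facts:
        # Assumed identity key; missing optional fields fall back to None.
        key = (fact["symbol_id"], fact.get("code_id"), fact.get("loader_base"))
        if key in merged:
            merged[key]["hit_count"] += fact.get("hit_count", 0)
        else:
            # Copy the fact so the caller's input is not mutated.
            merged[key] = dict(fact, hit_count=fact.get("hit_count", 0))
    # Dicts preserve insertion order, so first-seen order is kept.
    return list(merged.values())

# Hypothetical sample: the first two facts describe the same loaded symbol.
raw_facts = [
    {"symbol_id": "sym:a", "code_id": "code:1", "loader_base": "0x1000", "hit_count": 2},
    {"symbol_id": "sym:a", "code_id": "code:1", "loader_base": "0x1000", "hit_count": 5},
    {"symbol_id": "sym:a", "code_id": "code:2", "loader_base": "0x1000", "hit_count": 1},
]
unique_facts = dedup_facts(raw_facts)
```

Without `code_id`, the third fact in this sample would be indistinguishable from the first two, which is why the checklist asks for both fields whenever they are available.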
References
- Schema: `docs/reachability/runtime-static-union-schema.md`
- Delivery guide: `docs/reachability/DELIVERY_GUIDE.md`
- Unknowns registry & scoring: Signals code (`ReachabilityScoringService`, `UnknownsIngestionService`) and events doc `docs/signals/events-24-005.md`.