ReachGraph Module Architecture

Overview

The ReachGraph module provides a unified store for reachability subgraphs, enabling fast, deterministic, audit-ready answers to "exactly why a dependency is reachable." It consolidates data from Scanner, Signals, and Attestor into content-addressed artifacts with edge-level explainability.

Problem Statement

Before ReachGraph, reachability data was scattered across multiple modules:

| Module | Data | Limitation |
| --- | --- | --- |
| Scanner.CallGraph | CallGraphSnapshot | No unified query API |
| Signals | ReachabilityFactDocument | Runtime-focused, not auditable |
| Attestor | PoE JSON | Per-CVE only, no slice queries |
| Graph | Generic nodes/edges | Not optimized for "why reachable?" |

Result: Answering "why is lodash reachable?" required querying multiple systems with no guarantee of consistency or auditability.

Solution

ReachGraph provides:

  1. Unified Schema: Extends PoE subgraph format with edge explainability
  2. Content-Addressed Store: All artifacts identified by BLAKE3 digest
  3. Slice Query API: Fast queries by package, CVE, entrypoint, or file
  4. Deterministic Replay: Verify that the same inputs produce the same graph
  5. DSSE Signing: Offline-verifiable proofs

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                       Consumers                                  │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐        │
│  │  Policy  │  │   Web    │  │   CLI    │  │ Export   │        │
│  │  Engine  │  │ Console  │  │          │  │ Center   │        │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘        │
└───────┼─────────────┼─────────────┼─────────────┼───────────────┘
        │             │             │             │
        └─────────────┴──────┬──────┴─────────────┘
                             │
┌────────────────────────────▼────────────────────────────────────┐
│                    ReachGraph WebService                         │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │  REST API                                                 │   │
│  │  POST /v1/reachgraphs           GET /v1/reachgraphs/{d}  │   │
│  │  GET  /v1/reachgraphs/{d}/slice POST /v1/reachgraphs/replay│  │
│  └──────────────────────────────────────────────────────────┘   │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │  Slice Query Engine                                       │   │
│  │  - Package slice (by PURL)                                │   │
│  │  - CVE slice (paths to vulnerable sinks)                  │   │
│  │  - Entrypoint slice (reachable from entry)                │   │
│  │  - File slice (changed file impact)                       │   │
│  └──────────────────────────────────────────────────────────┘   │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │  Replay Driver                                            │   │
│  │  - Rebuild graph from inputs                              │   │
│  │  - Verify digest matches                                  │   │
│  │  - Log for audit trail                                    │   │
│  └──────────────────────────────────────────────────────────┘   │
└─────────────────────────────┬───────────────────────────────────┘
                              │
┌─────────────────────────────▼───────────────────────────────────┐
│                    ReachGraph Core Library                       │
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐    │
│  │     Schema     │  │ Serialization  │  │    Signing     │    │
│  │                │  │                │  │                │    │
│  │ ReachGraphMin  │  │ Canonical JSON │  │ DSSE Wrapper   │    │
│  │ EdgeExplanation│  │ BLAKE3 Digest  │  │ Attestor Int.  │    │
│  │ Provenance     │  │ Compression    │  │                │    │
│  └────────────────┘  └────────────────┘  └────────────────┘    │
└─────────────────────────────┬───────────────────────────────────┘
                              │
┌─────────────────────────────▼───────────────────────────────────┐
│                    Persistence Layer                             │
│  ┌────────────────────────┐  ┌────────────────────────┐        │
│  │      PostgreSQL        │  │        Valkey          │        │
│  │                        │  │                        │        │
│  │  reachgraph.subgraphs  │  │  Hot slice cache       │        │
│  │  reachgraph.slice_cache│  │  (30min TTL)           │        │
│  │  reachgraph.replay_log │  │                        │        │
│  └────────────────────────┘  └────────────────────────┘        │
└─────────────────────────────────────────────────────────────────┘
                              ▲
                              │
┌─────────────────────────────┴───────────────────────────────────┐
│                       Producers                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          │
│  │   Scanner    │  │   Signals    │  │   Attestor   │          │
│  │  CallGraph   │  │ RuntimeFacts │  │     PoE      │          │
│  └──────────────┘  └──────────────┘  └──────────────┘          │
└─────────────────────────────────────────────────────────────────┘

Data Model

ReachGraphMinimal (v1)

The core schema extends the PoE predicate format:

{
  "schemaVersion": "reachgraph.min@v1",
  "artifact": {
    "name": "svc.payments",
    "digest": "sha256:abc123...",
    "env": ["linux/amd64"]
  },
  "scope": {
    "entrypoints": ["/app/bin/svc"],
    "selectors": ["prod"],
    "cves": ["CVE-2024-1234"]
  },
  "nodes": [...],
  "edges": [...],
  "provenance": {...},
  "signatures": [...]
}

Edge Explainability

Every edge carries metadata explaining why it exists:

| Type | Description | Example Guard |
| --- | --- | --- |
| Import | Static import | - |
| DynamicLoad | Runtime load | - |
| Reflection | Reflective call | - |
| EnvGuard | Env variable check | DEBUG=true |
| FeatureFlag | Feature flag | FEATURE_X=enabled |
| PlatformArch | Platform guard | os=linux |
| LoaderRule | PLT/IAT/GOT | RTLD_LAZY |
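
To make the table concrete, here is a minimal Python sketch of how one explained edge could be modelled. The seven type names come from the table. The `Edge` class, its field names, and the `explain` helper are hypothetical and are not taken from the ReachGraph schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EdgeExplanationType(Enum):
    # The seven edge types listed in the table above.
    IMPORT = "Import"
    DYNAMIC_LOAD = "DynamicLoad"
    REFLECTION = "Reflection"
    ENV_GUARD = "EnvGuard"
    FEATURE_FLAG = "FeatureFlag"
    PLATFORM_ARCH = "PlatformArch"
    LOADER_RULE = "LoaderRule"


@dataclass(frozen=True)
class Edge:
    # Field names are hypothetical, chosen for illustration only.
    source: str
    target: str
    why: EdgeExplanationType
    guard: Optional[str] = None  # e.g. "DEBUG=true"; None when unguarded


def explain(edge: Edge) -> str:
    """Render one hop as a human-readable explanation."""
    text = f"{edge.source} -> {edge.target} ({edge.why.value}"
    if edge.guard is not None:
        text += f", guard: {edge.guard}"
    return text + ")"
```

For example, `explain(Edge("main", "debug_hook", EdgeExplanationType.ENV_GUARD, "DEBUG=true"))` yields `main -> debug_hook (EnvGuard, guard: DEBUG=true)`.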

Content Addressing

All artifacts are identified by BLAKE3-256 digest:

  • Computed from canonical JSON (sorted keys, no nulls)
  • Signatures excluded from hash computation
  • Enables idempotent upserts and cache keying

API Design

Core Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /v1/reachgraphs | Upsert subgraph (idempotent) |
| GET | /v1/reachgraphs/{digest} | Get full subgraph |
| GET | /v1/reachgraphs/{digest}/slice | Query slice |
| POST | /v1/reachgraphs/replay | Verify determinism |

Slice Query Types

  1. Package Slice (?q=pkg:npm/lodash@4.17.21)

    • Returns subgraph containing package and neighbors
    • Configurable depth and direction
  2. CVE Slice (?cve=CVE-2024-1234)

    • Returns all paths from entrypoints to vulnerable sinks
    • Includes edge explanations for each hop
  3. Entrypoint Slice (?entrypoint=/app/bin/svc)

    • Returns everything reachable from entry
    • Optionally filtered to paths reaching sinks
  4. File Slice (?file=src/**/*.ts)

    • Returns impact of changed files
    • Useful for PR-based analysis
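
As a rough client-side illustration, the sketch below builds slice URLs for the four query types. The parameter names (`q`, `cve`, `entrypoint`, `file`) and the path come from this section. The base URL, the `blake3:` digest prefix, the percent-encoding of the digest, and the `slice_url` helper are assumptions. Depth and direction parameters are omitted because their names are not given here.

```python
from urllib.parse import quote, urlencode

# Query parameter names taken from the slice types above.
SLICE_PARAMS = {
    "package": "q",
    "cve": "cve",
    "entrypoint": "entrypoint",
    "file": "file",
}


def slice_url(base_url: str, digest: str, kind: str, value: str) -> str:
    """Build a GET URL for /v1/reachgraphs/{digest}/slice."""
    if kind not in SLICE_PARAMS:
        raise ValueError(f"unknown slice kind: {kind}")
    # Assumption: the digest is percent-encoded as a single path segment.
    path = f"/v1/reachgraphs/{quote(digest, safe='')}/slice"
    query = urlencode({SLICE_PARAMS[kind]: value})
    return f"{base_url.rstrip('/')}{path}?{query}"


# Hypothetical host and digest, for illustration only.
print(slice_url("https://reachgraph.example.internal", "blake3:abc123",
                "package", "pkg:npm/lodash@4.17.21"))
```

`urlencode` percent-encodes the PURL, so characters such as `:`, `/`, and `@` are safe to pass as a single query value.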

Integration Points

Upstream (Data Producers)

  • Scanner.CallGraph: Produces nodes and edges with edge explanations
  • Signals: Provides runtime confirmation of reachability
  • Attestor: DSSE signing integration

Downstream (Data Consumers)

  • Policy Engine: ReachabilityRequirementGate queries slices
  • Web Console: "Why Reachable?" panel displays paths
  • CLI: stella reachgraph slice/replay commands
  • ExportCenter: Includes subgraphs in evidence bundles

Determinism Guarantees

  1. Canonical Serialization

    • Sorted object keys (lexicographic)
    • Sorted arrays by deterministic field
    • UTC ISO-8601 timestamps
    • No null fields (omit when null)
  2. Replay Verification

    • POST /v1/reachgraphs/replay rebuilds from inputs
    • Returns {match: true} if digests match
    • Logs all attempts for audit trail
  3. Content Addressing

    • Same content always produces same digest
    • Enables cache keying and deduplication
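
The serialization rules matter because a replay only matches if incidental differences, such as array order or local time zones, never reach the serialized bytes. Below is a small Python sketch of that normalization. The `from`/`to` sort key, the `builtAt` field, and the function names are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone


def to_utc_iso(ts: datetime) -> str:
    """Render a timezone-aware datetime as UTC ISO-8601 with a 'Z' suffix."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def normalize_edges(edges: list) -> list:
    """Sort edges by a deterministic key so input order never matters."""
    # Assumption: (from, to) is the sort key; the real field may differ.
    return sorted(edges, key=lambda e: (e["from"], e["to"]))


def serialize(edges: list, built_at: datetime) -> str:
    """Serialize with sorted arrays, sorted keys, and a UTC timestamp."""
    doc = {"edges": normalize_edges(edges), "builtAt": to_utc_iso(built_at)}
    return json.dumps(doc, sort_keys=True, separators=(",", ":"))
```

Two builds that discover the same edges in a different order, or stamp the same instant in different time zones, serialize identically and therefore hash to the same digest.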

Performance Characteristics

| Operation | Target Latency | Notes |
| --- | --- | --- |
| Slice query | P95 < 200ms | Cached in Valkey |
| Full graph retrieval | P95 < 500ms | Compressed storage |
| Upsert | P95 < 1s | Idempotent, gzip compression |
| Replay | P95 < 5s | Depends on input size |

Security Considerations

  1. Tenant Isolation: PostgreSQL row-level security (RLS) policies enforce isolation at the database level
  2. Rate Limiting: 100 req/min reads, 20 req/min writes
  3. DSSE Signing: All artifacts verifiable offline
  4. Input Validation: Schema validation on all requests

Unified Query Interface

The ReachGraph module exposes a Unified Reachability Query API that provides a single facade for static, runtime, and hybrid queries.

API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /v1/reachability/static | POST | Query static reachability from call graph analysis |
| /v1/reachability/runtime | POST | Query runtime reachability from observed execution facts |
| /v1/reachability/hybrid | POST | Combine static and runtime for best-effort verdict |
| /v1/reachability/batch | POST | Batch query for CVE vulnerability analysis |

Adapters

The unified query interface is backed by two adapters:

  1. ReachGraphStoreAdapter: Implements IReachGraphAdapter from StellaOps.Reachability.Core

    • Queries static reachability from stored call graphs
    • Uses BFS from entrypoints to target symbols
    • Returns StaticReachabilityResult with distance, path, and evidence URIs
  2. InMemorySignalsAdapter: Implements ISignalsAdapter from StellaOps.Reachability.Core

    • Queries runtime observation facts
    • Supports observation window filtering
    • Returns RuntimeReachabilityResult with hit count, contexts, and evidence URIs
    • Note: Production deployments should integrate with the actual Signals runtime service
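
The static adapter's BFS from entrypoints to a target symbol can be sketched as follows. This is a generic breadth-first search in Python, not the adapter's actual code. The edge-list input and the tuple return shape are assumptions standing in for the stored call graph and `StaticReachabilityResult`.

```python
from collections import deque


def static_reachability(edges, entrypoints, target):
    """BFS from entrypoints to target; returns (is_reachable, distance, path)."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    # parent doubles as the visited set; entrypoints have no parent.
    parent = {entry: None for entry in entrypoints}
    queue = deque(entrypoints)
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk parent links back to the entrypoint to recover the path.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            path.reverse()
            return True, len(path) - 1, path
        for nxt in adjacency.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return False, None, []
```

Because BFS explores nodes in order of hop count, the first path found to the target is a shortest one, which makes the reported distance well defined.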

Hybrid Query Flow

┌────────────────┐
│  Hybrid Query  │
│   Request      │
└───────┬────────┘
        │
        ▼
┌───────────────────────────────────────────┐
│         ReachabilityIndex Facade          │
│  (StellaOps.Reachability.Core)            │
└───────┬───────────────────────┬───────────┘
        │                       │
        ▼                       ▼
┌───────────────┐       ┌───────────────┐
│ ReachGraph    │       │ Signals       │
│ StoreAdapter  │       │ Adapter       │
└───────┬───────┘       └───────┬───────┘
        │                       │
        ▼                       ▼
┌───────────────┐       ┌───────────────┐
│ PostgreSQL +  │       │ Runtime Facts │
│ Valkey Cache  │       │ (In-Memory)   │
└───────────────┘       └───────────────┘

Query Models

SymbolRef - Identifies a code symbol:

{
  "namespace": "System.Net.Http",
  "typeName": "HttpClient",
  "memberName": "GetAsync"
}

StaticReachabilityResult:

{
  "symbol": { "namespace": "...", "typeName": "...", "memberName": "..." },
  "artifactDigest": "sha256:abc123...",
  "isReachable": true,
  "distanceFromEntrypoint": 3,
  "path": ["entry -> A -> B -> target"],
  "evidenceUris": ["stella:evidence/reachgraph/sha256:abc123/symbol:..."]
}

RuntimeReachabilityResult:

{
  "symbol": { ... },
  "artifactDigest": "sha256:abc123...",
  "wasObserved": true,
  "hitCount": 1250,
  "firstSeen": "2025-06-10T08:00:00Z",
  "lastSeen": "2025-06-15T12:00:00Z",
  "contexts": [{ "environment": "production", "service": "api-gateway" }],
  "evidenceUris": ["stella:evidence/signals/sha256:abc123/symbol:..."]
}

HybridReachabilityResult:

{
  "symbol": { ... },
  "artifactDigest": "sha256:abc123...",
  "staticResult": { ... },
  "runtimeResult": { ... },
  "confidence": 0.92,
  "verdict": "reachable",
  "reasoning": "Static analysis shows 3-hop path; runtime confirms 1250 observations"
}

Lattice Triage Service

Overview

The Lattice Triage Service provides a workflow-oriented surface on top of the 8-state reachability lattice, enabling operators to visualise lattice states, apply evidence, perform manual overrides, and maintain a full audit trail of every state transition.

Library: StellaOps.Reachability.Core

Namespace: StellaOps.Reachability.Core

Models

| Type | Purpose |
| --- | --- |
| LatticeTriageEntry | Per-(component, CVE) snapshot: current state, confidence, VEX status, full transition history. Content-addressed EntryId (triage:sha256:…). Computed RequiresReview / HasOverride. |
| LatticeTransitionRecord | Immutable log entry per state change: from/to state, confidence before/after, trigger, reason, actor, evidence digests, timestamp. Computed IsManualOverride. |
| LatticeTransitionTrigger | Enum: StaticAnalysis, RuntimeObservation, ManualOverride, SystemReset, AutomatedRule. Serialised as strings via JsonStringEnumConverter. |
| LatticeOverrideRequest | Operator request to force a target state with reason, actor, and evidence digests. |
| LatticeOverrideResult | Outcome of an override: applied flag, updated entry, transition, optional warning. |
| LatticeTriageQuery | Filters: State?, RequiresReview?, ComponentPurlPrefix?, Cve?, Limit (default 100), Offset. |

Service Interface (ILatticeTriageService)

| Method | Description |
| --- | --- |
| GetOrCreateEntryAsync(purl, cve) | Returns existing entry or creates one at Unknown state. |
| ApplyEvidenceAsync(purl, cve, evidenceType, digests, actor, reason) | Delegates to ReachabilityLattice.ApplyEvidence, records transition. |
| OverrideStateAsync(request) | Forces target state via Reset + ForceState sequence. Warns when overriding Confirmed* states. |
| ListAsync(query) | Filters + pages entries; ordered by UpdatedAt descending. |
| GetHistoryAsync(purl, cve) | Returns full transition log for an entry. |
| ResetAsync(purl, cve, actor, reason) | Resets entry to Unknown, records SystemReset transition. |
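
The ListAsync behaviour (filter, order by UpdatedAt descending, then page) could look roughly like the Python sketch below. The filter names and the default limit of 100 mirror LatticeTriageQuery. The dataclasses and the `list_entries` function are hypothetical simplifications, not a port of the service.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TriageEntry:
    purl: str
    cve: str
    state: str
    requires_review: bool
    updated_at: datetime


@dataclass
class TriageQuery:
    state: Optional[str] = None
    requires_review: Optional[bool] = None
    component_purl_prefix: Optional[str] = None
    cve: Optional[str] = None
    limit: int = 100  # default from the models table
    offset: int = 0


def list_entries(entries, query):
    """Apply filters, order by updated_at descending, then page."""
    def keep(e):
        return (
            (query.state is None or e.state == query.state)
            and (query.requires_review is None
                 or e.requires_review == query.requires_review)
            and (query.component_purl_prefix is None
                 or e.purl.startswith(query.component_purl_prefix))
            and (query.cve is None or e.cve == query.cve)
        )

    matched = sorted(filter(keep, entries),
                     key=lambda e: e.updated_at, reverse=True)
    return matched[query.offset:query.offset + query.limit]
```

Sorting before slicing keeps pages stable for a fixed data set: page two always continues where page one ended.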

VEX Status Mapping

| Lattice State | VEX Status |
| --- | --- |
| Unknown, StaticReachable, Contested | under_investigation |
| StaticUnreachable, RuntimeUnobserved, ConfirmedUnreachable | not_affected |
| RuntimeObserved, ConfirmedReachable | affected |
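
The mapping is a plain lookup over the eight lattice states. A minimal Python rendering follows. The state and status strings are taken from the table. The `vex_status` helper name and the error handling are assumptions.

```python
VEX_STATUS_BY_STATE = {
    "Unknown": "under_investigation",
    "StaticReachable": "under_investigation",
    "Contested": "under_investigation",
    "StaticUnreachable": "not_affected",
    "RuntimeUnobserved": "not_affected",
    "ConfirmedUnreachable": "not_affected",
    "RuntimeObserved": "affected",
    "ConfirmedReachable": "affected",
}


def vex_status(state: str) -> str:
    """Map a lattice state to its VEX status; reject unknown states."""
    try:
        return VEX_STATUS_BY_STATE[state]
    except KeyError:
        raise ValueError(f"unknown lattice state: {state}") from None
```

Note that StaticReachable alone maps to under_investigation. Only runtime observation or a confirmed state yields affected.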

Manual Override Behaviour

When an operator overrides state, the service:

  1. Resets the lattice to Unknown.
  2. Applies the minimal evidence sequence to reach the target state (e.g., ConfirmedReachable = StaticReachable + RuntimeObserved).
  3. Sets confidence to the midpoint of the target state's confidence range.
  4. Returns a warning when overriding from ConfirmedReachable or ConfirmedUnreachable, since these are high-certainty states.
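
The four steps can be sketched in Python as shown below. Only the ConfirmedReachable evidence sequence and the warning rule come from this section. The confidence ranges, the StaticReachable sequence, the record field names, and the `override_state` function are hypothetical placeholders, since the real values live in the lattice implementation.

```python
# Hypothetical confidence ranges; the real ranges are not given here.
CONFIDENCE_RANGES = {
    "StaticReachable": (0.25, 0.5),
    "ConfirmedReachable": (0.75, 1.0),
}

# ConfirmedReachable's sequence is from the text; the other is assumed.
MINIMAL_EVIDENCE = {
    "StaticReachable": ["StaticReachable"],
    "ConfirmedReachable": ["StaticReachable", "RuntimeObserved"],
}

HIGH_CERTAINTY = {"ConfirmedReachable", "ConfirmedUnreachable"}


def override_state(current_state, target_state, actor, reason):
    """Return (transition_record, warning) for a manual override."""
    low, high = CONFIDENCE_RANGES[target_state]
    warning = None
    if current_state in HIGH_CERTAINTY:
        warning = f"Overriding high-certainty state {current_state}"
    transition = {
        "from": current_state,
        "resetTo": "Unknown",  # step 1: reset before re-applying evidence
        "evidenceSequence": MINIMAL_EVIDENCE[target_state],  # step 2
        "to": target_state,
        "confidenceAfter": (low + high) / 2,  # step 3: range midpoint
        "trigger": "ManualOverride",
        "actor": actor,
        "reason": reason,
    }
    return transition, warning  # step 4: warning is None unless high-certainty
```

With these placeholder ranges, overriding to ConfirmedReachable sets the confidence to 0.875, and doing so from ConfirmedUnreachable also returns a warning.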

DI Registration

AddReachabilityCore() registers ILatticeTriageService → LatticeTriageService (singleton, via TryAddSingleton).

Observability (OTel Metrics)

Meter: StellaOps.Reachability.Core.Triage

| Metric | Type | Description |
| --- | --- | --- |
| reachability.triage.entries_created | Counter | Entries created |
| reachability.triage.evidence_applied | Counter | Evidence applications |
| reachability.triage.overrides_applied | Counter | Manual overrides |
| reachability.triage.resets | Counter | Lattice resets |
| reachability.triage.contested | Counter | Contested state transitions |

Test Coverage

22 tests in StellaOps.Reachability.Core.Tests/LatticeTriageServiceTests.cs:

  • Entry creation (new, idempotent, distinct keys)
  • Evidence application (static→reachable, confirmed paths, conflicting→contested, digest recording)
  • Override (target state, warnings on confirmed, HasOverride flag)
  • Listing with filters (state, review, PURL prefix)
  • History retrieval
  • Reset transitions
  • VEX mapping (theory test)
  • Edge-case validation (null PURL, empty reason)

Last updated: 2026-02-08