StellaOps Bot 0d4a986b7b archive advisories

2025-11-23 23:44:35 +02:00

Here’s a quick, beginner‑friendly rundown of reachability analysis and how Stella Ops can leapfrog today’s tools with deterministic, signed call‑graphs and in‑toto attestations—plus a concrete action plan.

Why reachability matters (in plain terms)

Most scanners list every CVE in your code or containers. That’s noisy.
Reachability analysis asks: “Is the vulnerable function actually callable in this artifact, on this path, in this runtime?”
Modern vendors enrich this with context (code → cloud links, runtime traces, attack paths), but they usually infer paths and don’t cryptographically prove them.

Where today’s market sits

Commercial/OSS tools emphasize prioritization signals and partial reachability (select languages, heuristics, runtime traces).
What’s typically missing: deterministic graphs per artifact, stable IDs for nodes/edges, and signed, auditable proofs for “why” a vuln is reachable (or not).

Stella Ops: how to lead (moats → outcomes)

Deterministic call‑graphs per artifact
- Build language/binary call‑graphs with stable node IDs (e.g., purl + symbol signature + build‑ID) and edge determinism (repeatable from the same inputs).
Signed edges with DSSE / in‑toto
- Each edge (caller → callee) becomes a tiny attestation: who computed it, inputs (hashes), tool version, and policy context.
- Publish summaries to Rekor (or a mirror) to get a tamper‑evident audit log.
Explainability by default
- Every finding carries its “why path” (edge chain), VEX gate decisions, and (for containers) layer‑diff cause (which layer introduced the symbol/file).
VEX with proof
- When you say “not affected,” attach the exact proof trail (non‑reachable edge chain, eliminated by policy, or sandboxed at runtime), not just a boolean.
Outcome targets (internal, program goals)
- ≥40% fewer noisy vulns presented to engineers.
- ≥25% faster triage thanks to inspectable “why” paths and signed evidence.

Minimal architecture (clean and pragmatic)

Sbomer/Scanner: produce SBOM + symbol maps + per‑layer file indexes; embed build‑IDs (ELF/PE/Mach‑O) and purls.
Cartographer: construct deterministic call‑graphs (per language/binary) → emit EdgeList.jsonl with stable IDs.
Attestor: wrap edges into DSSE in‑toto attestations; batch‑sign; push digests to Rekor.
Vexer: evaluate policies (trust lattice) to yield VEX with linked edge‑proofs.
Ledger: retain proofs; sync to Rekor mirror for sovereignty/offline.

Practical specs to hand to engineers (short version)

Stable IDs
- Node: purl@version!build-id!symbol-signature (lang: fully qualified name; binary: objfile:offset+size, demangled name if available).
- Edge: SHA256(nodeA||nodeB||tool-version||inputs-hash).
Determinism
- Pin parsers, symbolizers, compilers, and analysis flags.
- Emit manifest.lock (feeds, rules, tool hashes) to replay any scan identically.
Signing
- DSSE envelope with: tool version, inputs (file hashes), graph slice, time, signer.
- Store full attestations in object store; Rekor receives the envelope digest and returns an inclusion proof.
Explainability payload (attach to each finding)
- call_chain[], per‑edge evidence (file, line, symbol, layer), VEX gate decisions, and counter‑evidence for “not affected”.
Container layer attribution
- Track file provenance to the exact layer; show “introduced in layer X from base Y”.
APIs
- POST /graph/edges: attest (idempotent; same inputs → same edge IDs).
- GET /findings/:id/proof returns the call‑chain + Rekor inclusion proofs.
- GET /vex/:artifact streams VEX with embedded proofs.

Quick wins (do these first)

Add Build‑ID capture to your scanners for ELF/PE/Mach‑O; normalize to nodes with stable IDs.
Ship a Graph Determinism Manifest (hashes of inputs + toolchain) per build.
Start edge attestations for 1–2 ecosystems (e.g., PHP/JS) and container layer provenance.
Integrate Rekor logging (digest‑only is fine to start); keep full DSSE envelopes in your bucket.
Turn on explainable VEX: every verdict must have a machine‑readable why‑path.

How this helps day‑to‑day

Security and developers see only reachable, explainable vulns with a clickable proof path.
Auditors get cryptographic evidence (DSSE + Rekor inclusion) that triage decisions weren’t hand‑waved.
Ops can trace “which layer introduced this risk” and fix the real source quickly.

If you want, I can turn this into:

a one‑page product brief for customers,
an engineer‑ready spec (JSON schemas + example DSSE envelopes),
or a roadmap broken into 30‑60‑90‑day milestones.

Below is a very detailed engineering spec for Stella Ops’ reachability and explainability system, plus explicit “why this matters” notes under each major group so your documentation writer can easily translate into customer‑facing language.

You can copy this into an internal design doc and keep the “Why” paragraphs verbatim or adapt them into more marketing‑friendly copy.

0. Scope & Goals

This spec covers:

Deterministic call‑graph construction per artifact
Stable node/edge identities
DSSE / in‑toto edge attestations and Rekor logging
VEX engine that uses those graphs as proof
Proof & explainability APIs (call‑chains, layer attribution, etc.)
Determinism manifest & replay for audits
Container layer provenance integration

Non‑goals (for this document):

UI/UX design
NVD/OSV ingestion pipeline details
AuthN/AuthZ and tenancy model (assume existing platform patterns)

We’ll use RFC‑style language:

MUST / MUST NOT
SHOULD / SHOULD NOT
MAY

1. Core Concepts (Global)

These terms are referenced throughout the spec.

1.1 Artifact

Definition: A concrete software unit that can be scanned and deployed.
- Examples: container image, VM image, binary, library, zip, lambda bundle.
Identity:
- artifact_id (string, globally unique)
- artifact_hash (sha256 of canonical bytes)
- artifact_kind ("container" | "binary" | "library" | "archive" | "source_bundle")

Why this matters (docs hook): Clients need to see evidence tied to exact software units (e.g., “this image:tag”). Stable artifact identities make proofs auditable and prevent confusion between versions.

1.2 Node, Edge, Call Graph

Node: A callable unit (function, method, exported symbol, entrypoint) in an artifact.
Edge: A possible call from one node to another in that artifact.
Call Graph: The directed graph G = (V, E) of nodes V and edges E.

We distinguish:

Static call‑graph: Derived from code/binaries (no runtime traces).
Augmented call‑graph: Static graph plus optional runtime evidence (future extension; not required by this spec but must be accommodated by the schema).

Why this matters (docs hook): Call‑graphs are the backbone of reachability. Without them, tools can only guess whether a vulnerable function is used. Graphs make that decision explainable and repeatable.

1.3 Determinism

Deterministic analysis: For the same inputs (artifact bytes + config + tool versions), the system MUST produce:

The same set of nodes and edges,
The same IDs for nodes and edges,
The same graph hash / revision ID,
The same attestations (payload-wise).

Determinism is enforced via:

Strict canonicalization rules (sorting, formatting),
Fixed analysis options locked per version,
Explicit recording of all inputs in a determinism manifest.

Why this matters (docs hook): Determinism turns vulnerability triage from a “black box guess” into a reproducible math problem. An auditor can rerun analysis and verify the same results and proofs.

1.4 Attestations & Ledger

Attestation: A signed statement about artifacts, edges, or findings.
DSSE envelope: The signing wrapper.
in‑toto Statement: Typed payload inside the envelope.
Rekor: Transparency log where attestation digests are recorded.

Why this matters (docs hook): Signed attestations create a chain of custody for analysis results. Rekor logging means tampering is detectable and clients can verify that Stella Ops didn’t silently change history.

1.5 VEX & Proofs

VEX: Vulnerability Exploitability eXchange statement per vulnerability–artifact pair.
- Key fields: vulnerability_id, status, justification, proof_ref.
Proof: A structured explanation referencing call‑graph edges and attestations that show why a vulnerability is or is not exploitable.

Why this matters (docs hook): A VEX without proof is a “trust me” statement. A VEX with proof is a verifiable, cryptographically linked explanation that customers and auditors can inspect.

2. Artifact Identity & Metadata Service

This service normalizes artifact identity and provides a foundation for graph and attestation linkage.

2.1 Functional Requirements

FR‑A1: Artifact Registration

The system MUST expose POST /v1/artifacts to register an artifact.
Request (simplified):

{
  "kind": "container",
  "coordinates": {
    "image_ref": "gcr.io/acme/api@sha256:abc123...",
    "tag": "v1.2.3"
  },
  "hash": {
    "algo": "sha256",
    "value": "..."
  },
  "build_metadata": {
    "build_id": "build-2025-01-01-1234",
    "ci_run_id": "ci-abc-123",
    "timestamp": "2025-01-01T12:34:56Z"
  },
  "sbom_ref": {
    "format": "spdx-json",
    "location": "s3://.../api-v1.2.3.spdx.json"
  }
}

Response:

{
  "artifact_id": "art-01HXXXXX...",
  "artifact_hash": "sha256:...",
  "kind": "container"
}

FR‑A2: Artifact Uniqueness

artifact_id MUST be stable for an (artifact_kind, artifact_hash) pair.
Registering the same (kind, hash) again MUST return the existing artifact_id.
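
The uniqueness rule in FR‑A2 can be sketched as an idempotent lookup keyed on (kind, hash). This is a minimal in-memory illustration, not the service's storage design: the `ArtifactRegistry` class and the UUID-based ID format are assumptions made for the example.

```python
import uuid


class ArtifactRegistry:
    """Hypothetical in-memory stand-in for the artifact store."""

    def __init__(self) -> None:
        self._by_key: dict[tuple[str, str], str] = {}

    def register(self, kind: str, artifact_hash: str) -> tuple[str, bool]:
        """Return (artifact_id, created).

        Registering the same (kind, hash) pair again returns the existing
        artifact_id instead of minting a new one.
        """
        key = (kind, artifact_hash)
        existing = self._by_key.get(key)
        if existing is not None:
            return existing, False
        # Assumption: any globally unique ID scheme works here; uuid4 is a placeholder.
        artifact_id = "art-" + uuid.uuid4().hex
        self._by_key[key] = artifact_id
        return artifact_id, True
```

A real implementation would enforce the same rule with a unique constraint on (kind, hash) in the backing database so that concurrent registrations cannot create two IDs.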

FR‑A3: Build‑ID & Binary Identity Capture

For binaries within artifacts:

The scanner MUST extract:
- build_id or equivalent (ELF .note.gnu.build-id, PE debug GUID, Mach‑O UUID),
- File path in the artifact,
- File hash.
For each binary, the service MUST assign a binary_id:

{
  "binary_id": "bin-01HYYY...",
  "artifact_id": "art-01HXXXXX...",
  "path": "/usr/local/bin/server",
  "file_hash": "sha256:...",
  "build_id": "0x1234abcd..."
}

FR‑A4: SBOM Integration

The service MUST support linking SBOM components (e.g., purl) to binaries and files:
- component_id ⇔ binary_id and/or file.

Why this matters (docs hook): Having a clean artifact_id and binary_id abstraction allows Stella Ops to say “this edge comes from this binary in this image” with zero ambiguity. It’s the basis for trustable, cross‑tool correlations (SBOM ↔ graph ↔ VEX).

3. Node & Edge Identity (Call‑Graph Model)

This is the core of determinism and explainability.

3.1 Node Identity

FR‑N1: Node Canonical Tuple

Each node MUST be defined by a canonical tuple:

artifact_id
binary_id (nullable for source‑level nodes)
language ("php" | "js" | "java" | "go" | "c" | "cpp" | "dotnet" | ...)
symbol_kind ("function" | "method" | "constructor" | "lambda" | "entrypoint" | "indirect_stub")
symbol_signature:
- Language‑specific, fully qualified form
- Examples:
  - PHP: \Acme\Service\UserService::findByEmail(string): User
  - Java: com.acme.UserService.findByEmail(Ljava/lang/String;)Lcom/acme/User;
  - C: acme_user_find_by_email(const char *email)
source_location OR binary_offset:
- source_location:
  - file_path (normalized),
  - line (int),
  - column (optional).
- binary_offset:
  - section (e.g., .text),
  - offset (hex),
  - size (bytes).

FR‑N2: NodeID Derivation

A node_id MUST be derived as:

canonical_string = JSONCanonicalize({
  "artifact_id": "...",
  "binary_id": "...",
  "language": "...",
  "symbol_kind": "...",
  "symbol_signature": "...",
  "source_location" | "binary_offset": ...
})

node_id = "node-" + hex(sha256(canonical_string))

JSONCanonicalize:
- MUST be deterministic (sorted keys, no extraneous whitespace, fixed number formats).
- Exact algorithm MUST be documented and versioned.
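
As a rough sketch of FR‑N2, the derivation could look like the following. The `json_canonicalize` helper here is only a stand-in (sorted keys, compact separators, UTF‑8); it is not a full implementation of a standard such as RFC 8785, and the spec still needs to pin the exact algorithm. The example tuple values are illustrative.

```python
import hashlib
import json


def json_canonicalize(obj) -> bytes:
    """Stand-in canonicalizer: sorted keys, no extra whitespace, UTF-8 bytes."""
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def node_id(canonical_tuple: dict) -> str:
    """Derive node_id as "node-" + hex(sha256(canonical_string))."""
    return "node-" + hashlib.sha256(json_canonicalize(canonical_tuple)).hexdigest()


# Illustrative source-level node (binary_id is null for source-level nodes).
example = {
    "artifact_id": "art-example",
    "binary_id": None,
    "language": "php",
    "symbol_kind": "method",
    "symbol_signature": "\\Acme\\Service\\UserService::findByEmail(string): User",
    "source_location": {"file_path": "/app/src/UserService.php", "line": 42},
}
example_id = node_id(example)
```

Because keys are sorted before hashing, the order in which an analyzer populates the tuple does not affect the resulting ID.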

FR‑N3: Stability

For a given canonical tuple, the same node_id MUST be produced across runs, machines, and time.

Why this matters (docs hook): Stable node IDs allow Stella Ops to “bookmark” a function across scans, policies, and attestations. It means a client or auditor can see the exact same identifier referenced in SBOMs, VEX, and evidence—even years later.

3.2 Edge Identity

FR‑E1: Edge Canonical Tuple

Each edge MUST be defined by:

caller_node_id
callee_node_id
edge_kind ("static" | "virtual" | "interface" | "dynamic" | "indirect")
evidence:
- origin ("static_analysis" | "runtime_trace"),
- tool_version,
- analysis_profile (e.g., optimization level, language mode),
- confidence (0–1 decimal),
- optional metadata (language‑specific details).

FR‑E2: EdgeID Derivation

edge_id MUST be derived as:

canonical_edge_string = JSONCanonicalize({
  "caller": "node-...",
  "callee": "node-...",
  "edge_kind": "...",
  "analysis_tool_version": "...",
  "analysis_profile": "..."
})

edge_id = "edge-" + hex(sha256(canonical_edge_string))

FR‑E3: Edge Determinism

For identical inputs (artifact, analysis config, tool version) the same set of (edge_id, fields) MUST be produced.
Sorting rules:
- All edges MUST be emitted in lexicographic order of edge_id.
- This ordering MUST be used when computing graph hashes and signing batches.
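
FR‑E2 and FR‑E3 together might be sketched as below: derive each edge_id from the canonical fields, then emit edges in lexicographic edge_id order. The `emit_edges` helper and its de-duplication of identical edges are assumptions for illustration; the canonicalizer is the same simplified stand-in, not a pinned algorithm.

```python
import hashlib
import json


def _canon(obj) -> bytes:
    # Simplified canonical form: sorted keys, compact separators.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def edge_id(caller: str, callee: str, edge_kind: str,
            tool_version: str, analysis_profile: str) -> str:
    """Derive edge_id as "edge-" + hex(sha256(canonical_edge_string))."""
    payload = {
        "caller": caller,
        "callee": callee,
        "edge_kind": edge_kind,
        "analysis_tool_version": tool_version,
        "analysis_profile": analysis_profile,
    }
    return "edge-" + hashlib.sha256(_canon(payload)).hexdigest()


def emit_edges(raw_edges: list[dict]) -> list[dict]:
    """Attach edge_id to each edge and return them sorted by edge_id.

    Assumption: edges that hash to the same edge_id are collapsed into one.
    """
    by_id = {}
    for e in raw_edges:
        ev = e["evidence"]
        eid = edge_id(e["caller_node_id"], e["callee_node_id"], e["edge_kind"],
                      ev["tool_version"], ev["analysis_profile"])
        by_id[eid] = {**e, "edge_id": eid}
    return [by_id[k] for k in sorted(by_id)]
```

Sorting at emission time means the discovery order inside the analyzer (thread scheduling, directory walk order) cannot leak into graph hashes or signed batches.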

Why this matters (docs hook): Stable edges make “reachable” vs “not reachable” mathematically testable. If a client re‑runs analysis, they can verify they get the same edges and therefore the same reachability verdicts—building confidence in the tool’s fairness and repeatability.

3.3 Graph Revision & Manifest

FR‑G1: Graph Revision ID

For an artifact’s graph:

graph_canonical = JSONCanonicalize({
  "nodes": [sorted by node_id],
  "edges": [sorted by edge_id]
})

graph_revision_id = "graph-" + hex(sha256(graph_canonical))
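
A possible reading of FR‑G1 in code, assuming nodes and edges are plain dicts that already carry `node_id` and `edge_id`. The canonical form below is a simplified stand-in for the versioned JSONCanonicalize algorithm.

```python
import hashlib
import json


def graph_revision_id(nodes: list[dict], edges: list[dict]) -> str:
    """Hash the sorted node and edge lists into a single revision ID."""
    canonical = {
        "nodes": sorted(nodes, key=lambda n: n["node_id"]),
        "edges": sorted(edges, key=lambda e: e["edge_id"]),
    }
    blob = json.dumps(canonical, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "graph-" + hashlib.sha256(blob).hexdigest()
```

Any change to a node or edge field, or the addition or removal of a single edge, yields a different revision ID, which is what makes the ID usable as a replay check.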

FR‑G2: Determinism Manifest

The system MUST generate a determinism manifest per graph:

{
  "graph_revision_id": "graph-...",
  "artifact_id": "art-...",
  "inputs": {
    "artifact_hash": "sha256:...",
    "sbom_hash": "sha256:...",
    "scanner_version": "1.4.0",
    "cartographer_version": "2.1.3",
    "config": {
      "languages": ["php", "js"],
      "analysis_profile": "default"
    }
  },
  "toolchain_hashes": {
    "php_parser": "sha256:...",
    "js_parser": "sha256:..."
  },
  "timestamp": "2025-01-01T12:34:56Z"
}

Why this matters (docs hook): The manifest is the “recipe card” for the call‑graph. If a regulator or customer wants to re‑bake the same result, they use this recipe and confirm they get the same graph hash.

4. Cartographer (Call‑Graph Builder) Service

4.1 Responsibilities

Parse artifacts and binaries.
Extract functions/symbols as nodes.
Generate edges with deterministic algorithms.
Produce EdgeList and NodeList structures plus graph revision and determinism manifest.

4.2 Inputs & Outputs

FR‑C1: API to Trigger Analysis

POST /v1/graphs:analyze

Request:

{
  "artifact_id": "art-01HXXXXX...",
  "analysis_profile": "default",
  "languages": ["php", "js", "c"],
  "options": {
    "include_runtime_traces": false
  }
}

Response:

{
  "graph_id": "graph-job-01ABC...",
  "artifact_id": "art-...",
  "status": "queued"
}

FR‑C2: Graph Retrieval

GET /v1/graphs/{graph_id}

Response:

{
  "graph_revision_id": "graph-...",
  "artifact_id": "art-...",
  "nodes": [
    {
      "node_id": "node-...",
      "artifact_id": "art-...",
      "binary_id": "bin-...",
      "language": "php",
      "symbol_kind": "function",
      "symbol_signature": "\\Acme\\UserService::findByEmail(string): User",
      "source_location": {
        "file_path": "/app/src/UserService.php",
        "line": 42,
        "column": 3
      }
    }
    // ...
  ],
  "edges": [
    {
      "edge_id": "edge-...",
      "caller_node_id": "node-...",
      "callee_node_id": "node-...",
      "edge_kind": "static",
      "evidence": {
        "origin": "static_analysis",
        "tool_version": "2.1.3",
        "analysis_profile": "default",
        "confidence": 1.0
      }
    }
    // ...
  ],
  "determinism_manifest": { ... }
}

FR‑C3: Language Coverage

Cartographer MUST support at least:
- PHP, JavaScript, TypeScript, Java, Go, C/C++, .NET, Python.
Each language implementation MUST:
- Use pinned parser versions,
- Use deterministic traversal order (e.g., AST preorder + sorted children),
- Emit nodes/edges consistent with the canonical schema.

FR‑C4: Error Handling

If analysis fails, status MUST be one of: "failed_parse", "failed_analysis", "unsupported_language", with a machine‑readable error code and human‑readable message.

Why this matters (docs hook): Cartographer is where Stella Ops’ “magic” becomes structured data. The deterministic API and schemas mean customers can export, inspect, and even combine these graphs with their own tooling.

5. Edge Attestations (Attestor Service)

The Attestor turns graph edges into cryptographically verifiable DSSE/in‑toto attestations and logs them to Rekor.

5.1 Attestation Content

FR‑T1: in‑toto Statement Format

Each attestation payload MUST conform to in‑toto Statement:

{
  "_type": "https://in-toto.io/Statement/v0.1",
  "subject": [
    {
      "name": "artifact:art-01HXXXXX...",
      "digest": {
        "sha256": "..."
      }
    }
  ],
  "predicateType": "https://stella.dev/attestations/callgraph-edges/v1",
  "predicate": {
    "graph_revision_id": "graph-...",
    "edges": [
      {
        "edge_id": "edge-...",
        "caller_node_id": "node-...",
        "callee_node_id": "node-...",
        "edge_kind": "static",
        "evidence": {
          "origin": "static_analysis",
          "tool_version": "2.1.3",
          "analysis_profile": "default",
          "confidence": 1.0
        }
      }
    ],
    "analysis_metadata": {
      "cartographer_version": "2.1.3",
      "options": {
        "languages": ["php", "js"],
        "analysis_profile": "default"
      },
      "timestamp": "2025-01-01T12:34:56Z"
    }
  }
}

FR‑T2: DSSE Envelope

The Statement MUST be wrapped in a DSSE envelope:

{
  "payloadType": "application/vnd.in-toto+json",
  "payload": "base64url(<Statement JSON>)",
  "signatures": [
    {
      "keyid": "k-01KEY...",
      "sig": "base64url(signature-bytes)"
    }
  ]
}
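
The envelope above can be assembled roughly as follows. DSSE signs a pre-authentication encoding (PAE) of the payload type and payload rather than the raw payload; the `pae` function follows that encoding. Everything else is an assumption for illustration: `make_envelope` is a hypothetical helper, the payload serialization is a simplified canonical form, and `demo_sign` uses HMAC‑SHA256 purely as a stand-in so the sketch runs with the standard library. A real Attestor would call a KMS/HSM-backed Ed25519 signer instead.

```python
import base64
import hashlib
import hmac
import json


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the byte string that gets signed."""
    t = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(t), t, len(payload), payload)


def make_envelope(statement: dict, payload_type: str, keyid: str, sign) -> dict:
    """Serialize the Statement, sign PAE(type, payload), and wrap in an envelope.

    `sign` is any callable taking bytes and returning signature bytes.
    """
    payload = json.dumps(statement, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = sign(pae(payload_type, payload))
    return {
        "payloadType": payload_type,
        "payload": base64.urlsafe_b64encode(payload).decode("ascii"),
        "signatures": [
            {"keyid": keyid, "sig": base64.urlsafe_b64encode(signature).decode("ascii")}
        ],
    }


def demo_sign(message: bytes) -> bytes:
    """Stand-in signer for demonstration only (NOT Ed25519, NOT for production)."""
    return hmac.new(b"demo-key-not-for-production", message, hashlib.sha256).digest()
```

Passing the signer as a callable keeps key handling out of the envelope code, which fits the requirement that private keys stay inside the KMS/HSM.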

FR‑T3: Signing Requirements

Signatures SHOULD use Ed25519; another modern scheme MAY be used, and the algorithm MUST be recorded in either case.
Private keys MUST be held in a KMS/HSM; key material MUST NOT reside in application memory except transiently during signing.
Key metadata:
- keyid,
- algorithm,
- valid_from, valid_until.
Keys MUST be rotated regularly; new keys MUST NOT invalidate old attestations (verification MUST support multiple keys).

FR‑T4: Batching Strategy

Attestor MAY batch up to N edges per attestation (configurable, default 10,000) for performance.
Each attestation MUST:
- Only contain edges from a single graph_revision_id.
- Preserve deterministic ordering of edges by edge_id.
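
The batching rule in FR‑T4 might be implemented along these lines. The `batch_edges` generator is a hypothetical helper; it assumes every edge passed in belongs to the given graph_revision_id.

```python
from typing import Iterator


def batch_edges(edges: list[dict], graph_revision_id: str,
                batch_size: int = 10_000) -> Iterator[dict]:
    """Yield attestation-sized batches of edges, ordered by edge_id.

    Each batch carries exactly one graph_revision_id, so a single
    attestation never mixes edges from different graph revisions.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ordered = sorted(edges, key=lambda e: e["edge_id"])
    for start in range(0, len(ordered), batch_size):
        yield {
            "graph_revision_id": graph_revision_id,
            "edges": ordered[start:start + batch_size],
        }
```

Since the edge order is fixed before slicing, the same graph and batch size always produce the same batch boundaries, which keeps attestation payloads reproducible on replay.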

FR‑T5: Rekor Logging

For each DSSE envelope:
- Compute digest of the envelope.
- Submit to Rekor (entry kind: dsse or intoto).
- Persist in internal Ledger:
  - rekor_log_index,
  - rekor_uuid,
  - rekor_inclusion_proof.

Why this matters (docs hook): Attestations let Stella Ops prove that the call‑graph wasn’t tampered with after analysis. Clients can independently verify signatures and Rekor proofs to ensure that evidence is authentic and unmodified.

5.2 Attestor API

FR‑T6: Trigger Attestation

POST /v1/attestations/graphs/{graph_revision_id}

Request:

{
  "artifact_id": "art-01HXXXXX...",
  "batch_size": 10000
}

Response:

{
  "attestation_batch_id": "att-batch-01...",
  "graph_revision_id": "graph-...",
  "status": "in_progress"
}

FR‑T7: Attestation Status & Retrieval

GET /v1/attestations/batches/{attestation_batch_id}

Response:

{
  "attestation_batch_id": "att-batch-01...",
  "graph_revision_id": "graph-...",
  "status": "completed",
  "attestations": [
    {
      "attestation_id": "att-01...",
      "dsse_envelope_ref": "s3://.../att-01.json",
      "rekor_log_index": 12345,
      "rekor_uuid": "..."
    }
  ]
}

Why this matters (docs hook): A clean attestation API allows clients and integrators to subscribe to evidence as a first‑class output (not just raw JSON). This is crucial for advanced audit and compliance workflows.

6. Container Layer Provenance

Reachability proofs should show which container layer introduced a function or vulnerability.

6.1 Layer Indexing

FR‑L1: Layer Metadata

For container artifacts:

The scanner MUST extract the image’s layer list and config.
For each layer:

{
  "layer_digest": "sha256:layer1...",
  "index": 0,
  "created_by": "FROM ubuntu:22.04",
  "size_bytes": 123456
}

FR‑L2: File Provenance

Build a file tree mapping file_path → origin_layer_digest.
For each file, also track:
- file_hash,
- mode, uid, gid.
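
One way to build the FR‑L2 mapping is to replay layers base-first and let later layers override earlier ones. This sketch makes several simplifying assumptions: each layer is given as a dict of absolute path → file hash, the "origin" of a file is the last layer that wrote it, only plain OCI whiteout files (`.wh.<name>`) are handled, and opaque-directory whiteouts plus mode/uid/gid tracking are left out. The `build_file_provenance` name and input shape are hypothetical.

```python
WHITEOUT_PREFIX = ".wh."


def build_file_provenance(layers: list[dict]) -> dict:
    """Map file_path -> {origin_layer_digest, file_hash} for the final image.

    `layers` is ordered base-first; each item looks like
    {"layer_digest": "sha256:...", "files": {"/abs/path": "sha256:..."}}.
    """
    provenance: dict[str, dict] = {}
    for layer in layers:
        files = layer["files"]
        # Apply whiteouts first: they delete entries contributed by lower layers.
        for path in files:
            directory, _, name = path.rpartition("/")
            if name.startswith(WHITEOUT_PREFIX):
                provenance.pop(f"{directory}/{name[len(WHITEOUT_PREFIX):]}", None)
        # Then record files added or overwritten by this layer.
        for path, file_hash in files.items():
            if path.rpartition("/")[2].startswith(WHITEOUT_PREFIX):
                continue
            provenance[path] = {
                "origin_layer_digest": layer["layer_digest"],
                "file_hash": file_hash,
            }
    return provenance
```

With this mapping in place, resolving a node's `source_location.file_path` to its layer is a single dictionary lookup.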

FR‑L3: Linking Nodes to Layers

When creating nodes, if source_location.file_path is inside a container layer:
- Resolve the path to its origin_layer_digest.
Each node MUST have:

"container_origin": {
  "layer_digest": "sha256:layerX...",
  "path_in_layer": "/app/src/UserService.php"
}

Why this matters (docs hook): Layer provenance lets Stella Ops say “this issue comes from the base image” or “this function was added by your team’s Dockerfile step.” This makes it clear where to fix the problem and who owns it.

7. VEX & Policy Engine

The VEX engine converts vulnerabilities and call‑graphs into status + proof decisions.

7.1 Internal Vulnerability Model

FR‑V1: Vulnerability Record

Internally, for each vulnerability:

{
  "vulnerability_id": "CVE-2025-12345",
  "source": "OSV",
  "affected_symbols": [
    {
      "language": "php",
      "symbol_signature": "\\Vendor\\Lib\\Foo::dangerous()",
      "matching_rule": "exact"
    }
  ],
  "severity": "HIGH",
  "feeds": ["osv", "nvd"],
  "metadata": { ... }
}

FR‑V2: Component Mapping

Use SBOM + artifact metadata to map vulnerability_id to:
- artifact_id,
- component_id (library/package),
- potential source/binary files.

7.2 Reachability Evaluation

FR‑V3: Node Matching

For each (artifact_id, vulnerability_id):

Map affected_symbols to node_ids in the artifact’s graph using:
- language,
- symbol_signature,
- optional fuzzy/heuristic matching rules.
Record the set Sinks = { node_id_1, node_id_2, ... }.
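
For the exact-match case, FR‑V3 reduces to a lookup on (language, symbol_signature). The sketch below assumes the record shapes shown in FR‑V1 and FR‑C2 and deliberately skips any non-exact `matching_rule`, since fuzzy matching rules are not defined in this spec; `match_sinks` is a hypothetical name.

```python
def match_sinks(affected_symbols: list[dict], nodes: list[dict]) -> set[str]:
    """Return the node_ids whose language and signature match an affected symbol."""
    # Index nodes once so matching is a dictionary lookup per symbol.
    index: dict[tuple[str, str], set[str]] = {}
    for node in nodes:
        key = (node["language"], node["symbol_signature"])
        index.setdefault(key, set()).add(node["node_id"])

    sinks: set[str] = set()
    for sym in affected_symbols:
        if sym.get("matching_rule", "exact") != "exact":
            continue  # Fuzzy/heuristic rules are out of scope for this sketch.
        sinks |= index.get((sym["language"], sym["symbol_signature"]), set())
    return sinks
```

An empty result does not by itself mean "not affected": it may also indicate that the vulnerable symbol was inlined, renamed, or missed by the analyzer, which is one reason the reachability result has an `unknown` state.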

FR‑V4: Entry Points

Define entry points set Entrypoints:

Defaults:
- For containers: common main/server functions, binaries invoked by entrypoint/cmd.
- For libraries: exported funcs/classes.
MUST be configurable via policy.

FR‑V5: Path Search

For each sink in Sinks:
- Run a bounded graph traversal (e.g., BFS or DFS) backwards from sink to see if any entrypoint ∈ Entrypoints is reachable.
- Limit depth and time by configuration, but record those limits in the proof.

FR‑V6: Reachability Result

For each (artifact_id, vulnerability_id):

reachable: if any path entrypoint → ... → sink exists.
not_reachable: if no path is found within bounds and analysis coverage is sufficient.
unknown: if analysis bounds or gaps prevent a confident answer.
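
FR‑V5 and FR‑V6 can be sketched together as a depth-bounded backwards breadth-first search that returns one of the three states. Assumptions in this sketch: edges are dicts with `caller_node_id` and `callee_node_id`, the time limit is omitted (only `max_depth` is enforced), and `coverage_sufficient` is supplied by the caller because the spec does not define how coverage is measured.

```python
from collections import deque


def evaluate_reachability(edges, entrypoints, sinks,
                          max_depth=50, coverage_sufficient=True) -> str:
    """Return "reachable", "not_reachable", or "unknown"."""
    # Reverse adjacency: callee -> set of callers, so we can walk backwards.
    callers: dict[str, set[str]] = {}
    for e in edges:
        callers.setdefault(e["callee_node_id"], set()).add(e["caller_node_id"])

    entry = set(entrypoints)
    truncated = False
    for sink in sorted(sinks):
        seen = {sink}
        queue = deque([(sink, 0)])
        while queue:
            node, depth = queue.popleft()
            if node in entry:
                return "reachable"
            if depth == max_depth:
                # Bound hit with unexplored callers: we cannot claim "not_reachable".
                if callers.get(node, set()) - seen:
                    truncated = True
                continue
            for caller in sorted(callers.get(node, ())):
                if caller not in seen:
                    seen.add(caller)
                    queue.append((caller, depth + 1))

    if truncated or not coverage_sufficient:
        return "unknown"
    return "not_reachable"
```

Searching backwards from the sink usually touches far fewer nodes than a forward search from every entrypoint, because most of the graph never calls into the vulnerable function.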

Why this matters (docs hook): This is the step that turns thousands of “you have a CVE” alerts into a small list of actually exploitable issues with a clear path, massively reducing noise for engineers.

7.3 VEX Statement & Proof Linking

FR‑V7: Internal VEX Record

For each (artifact_id, vulnerability_id):

{
  "vex_id": "vex-01...",
  "artifact_id": "art-...",
  "vulnerability_id": "CVE-2025-12345",
  "status": "affected" | "not_affected" | "fixed" | "under_investigation",
  "justification": "reachability_not_detected" | "component_not_present" | "vulnerable_code_not_in_execute_path" | ...,
  "proof_ref": "proof-01...",
  "generated_at": "2025-01-01T12:34:56Z",
  "graph_revision_id": "graph-...",
  "policy_version": "pol-2025.01"
}

FR‑V8: Proof Bundles

proof-01... MUST refer to a Proof Bundle:

{
  "proof_id": "proof-01...",
  "artifact_id": "art-...",
  "vulnerability_id": "CVE-2025-12345",
  "graph_revision_id": "graph-...",
  "kind": "reachable" | "unreachable" | "unknown",
  "paths": [
    {
      "path_id": "path-01...",
      "entrypoint_node_id": "node-...",
      "sink_node_id": "node-...",
      "edge_ids": [
        "edge-...",
        "edge-..."
      ],
      "attestation_refs": [
        {
          "edge_id": "edge-...",
          "attestation_id": "att-01...",
          "rekor_log_index": 12345
        }
      ]
    }
  ],
  "analysis_limits": {
    "max_depth": 50,
    "max_time_ms": 5000,
    "coverage_notes": "no runtime traces included"
  }
}
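
A consumer of a Proof Bundle may want to sanity-check a path before trusting it. The sketch below checks that `edge_ids` form a contiguous chain from the entrypoint to the sink and lists edges that lack an attestation reference. Both helpers are hypothetical, and treating a missing attestation reference as something worth reporting is an assumption; the spec above does not state that every edge in a path must be referenced.

```python
def path_is_contiguous(path: dict, edges_by_id: dict) -> bool:
    """True if the edge chain starts at the entrypoint and ends at the sink."""
    current = path["entrypoint_node_id"]
    for eid in path["edge_ids"]:
        edge = edges_by_id.get(eid)
        if edge is None or edge["caller_node_id"] != current:
            return False
        current = edge["callee_node_id"]
    return current == path["sink_node_id"]


def unattested_edges(path: dict) -> list[str]:
    """Return edge_ids in the path that have no entry in attestation_refs."""
    attested = {ref["edge_id"] for ref in path.get("attestation_refs", [])}
    return [eid for eid in path["edge_ids"] if eid not in attested]
```

This only validates the structure of the proof; verifying the referenced DSSE signatures and Rekor inclusion proofs is a separate step.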

Why this matters (docs hook): The VEX record gives the “headline” (affected or not), while the Proof Bundle is the “story behind the headline.” Clients can inspect exact call‑chains and even verify the underlying attestations if they choose.

8. Proof & Explainability API

This is what customers and UIs will query to display “why is this vulnerability shown (or suppressed)?”

8.1 Findings API

FR‑P1: List Findings

GET /v1/artifacts/{artifact_id}/findings

Response:

{
  "artifact_id": "art-...",
  "findings": [
    {
      "finding_id": "find-01...",
      "vulnerability_id": "CVE-2025-12345",
      "status": "affected",
      "severity": "HIGH",
      "vex_id": "vex-01...",
      "proof_id": "proof-01...",
      "component_id": "pkg:composer/acme/foo@1.2.3",
      "summary": "Reachable from /app/index.php via Acme\\Controller::handle()."
    }
  ]
}

FR‑P2: Proof Retrieval

GET /v1/findings/{finding_id}/proof

Response:

{
  "finding_id": "find-01...",
  "proof": {
    "proof_id": "proof-01...",
    "kind": "reachable",
    "paths": [
      {
        "path_id": "path-01...",
        "entrypoint": {
          "node_id": "node-...",
          "symbol_signature": "\\App\\Controller::handle()",
          "source_location": {
            "file_path": "/app/src/Controller.php",
            "line": 10
          },
          "container_origin": {
            "layer_digest": "sha256:layer0...",
            "index": 0
          }
        },
        "sink": {
          "node_id": "node-...",
          "symbol_signature": "\\Vendor\\Lib\\Foo::dangerous()",
          "source_location": {
            "file_path": "/vendor/vendor/lib/Foo.php",
            "line": 99
          },
          "container_origin": {
            "layer_digest": "sha256:layer1...",
            "index": 1
          }
        },
        "edge_chain": [
          {
            "edge_id": "edge-...",
            "caller_node_id": "node-...",
            "callee_node_id": "node-...",
            "evidence_summary": {
              "origin": "static_analysis",
              "confidence": 1.0
            },
            "attestation_ref": {
              "attestation_id": "att-01...",
              "rekor_log_index": 12345
            }
          }
        ]
      }
    ],
    "analysis_limits": { ... }
  }
}

FR‑P3: Negative Proofs

For status = "not_affected", kind MUST be "unreachable" and paths MAY be omitted. Instead, include:

"unreachable_reason": {
  "no_paths_found": true,
  "graph_coverage": "full",
  "notes": "All entrypoints explored up to depth 50; no path to sink."
}

Why this matters (docs hook): This API lets Stella Ops show exactly why something is prioritized or suppressed. It’s the difference between “we think this is safe” and “here’s the graph that proves why it’s safe.”

9. Determinism & Replay Guarantees

9.1 Re‑analysis

FR‑D1: Replay Endpoint

POST /v1/graphs/{graph_revision_id}:replay

MUST re‑run Cartographer with the same determinism manifest.
MUST produce:
- Same graph_revision_id,
- Same nodes / edges (including IDs),
- Same attestation payloads (signatures will differ if keys were rotated, but payload digests MUST match).

FR‑D2: Audit Report

GET /v1/graphs/{graph_revision_id}/audit-report

MUST include:
- Original determinism manifest,
- Replay comparison (equal / not equal),
- Differences if any (for internal troubleshooting).
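
The replay comparison in the audit report could be computed roughly as follows, assuming both graphs use the response shape from FR‑C2. The `compare_replay` function and the field names of its result are illustrative, not a defined report schema.

```python
def compare_replay(original: dict, replayed: dict) -> dict:
    """Compare an original graph with its replay and summarize differences."""
    differences: dict[str, list[str]] = {}
    for key, id_field in (("nodes", "node_id"), ("edges", "edge_id")):
        before = {item[id_field] for item in original.get(key, [])}
        after = {item[id_field] for item in replayed.get(key, [])}
        differences[f"{key}_missing"] = sorted(before - after)
        differences[f"{key}_added"] = sorted(after - before)

    same_revision = original["graph_revision_id"] == replayed["graph_revision_id"]
    equal = same_revision and not any(differences.values())
    return {
        "graph_revision_id": original["graph_revision_id"],
        "replayed_graph_revision_id": replayed["graph_revision_id"],
        "replay_comparison": "equal" if equal else "not equal",
        "differences": differences,
    }
```

Listing the missing and added IDs, rather than only reporting a hash mismatch, gives engineers a starting point for tracking down the source of non-determinism.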

Why this matters (docs hook): Replay is key for regulated customers: they can prove that Stella Ops would make the same decision again given the same inputs, supporting audit trails and incident investigations.

10. Non‑Functional Requirements

10.1 Performance & Scale

FR‑NF1: Edge Volume

System MUST handle at least:
- 10^6 edges per artifact,
- 10^8 edges per day (multi‑tenant).

FR‑NF2: Latency Targets

For medium artifacts (≤100k edges):
- Call‑graph build: p95 ≤ 60s.
- Attestation batching + Rekor: p95 ≤ 120s (background).

FR‑NF3: Storage

Graphs, nodes, edges, and attestations MUST be stored in a way that:
- Allows efficient lookup by:
  - artifact_id,
  - vulnerability_id,
  - node_id / edge_id,
  - graph_revision_id.
- Supports retention policies (e.g., keep graphs for N years).

10.2 Security

FR‑NF4: Data Integrity

All evidence blobs (graphs, attestations, manifests) MUST be stored alongside a cryptographic hash (e.g., SHA-256) of their contents.
Any corruption MUST be detectable via hash mismatch.

FR‑NF5: Tenant Isolation

Multi‑tenant stores MUST enforce separation such that:
- Evidence from tenant A cannot be read by tenant B.
- Rekor remains public, but tenant‑specific metadata stays private.

Why this matters (docs hook): Non‑functional guarantees are what make the system usable in real life: fast enough to run in CI, safe enough to satisfy security teams, and scalable enough to handle entire fleets.

11. Documentation Hooks Summary (for your writer)

Here’s a quick mapping from technical feature → story you can tell clients:

Artifact & Binary IDs

“Every finding is anchored to a precise artifact and binary, so you always know exactly what code we’re talking about.”
Deterministic node/edge IDs & graph hash

“If you or your auditor re‑run the analysis, you’ll get the same graph and the same decisions, byte for byte.”
DSSE + in‑toto attestations, Rekor logging

“We sign and transparently log our analysis so nobody—including us—can silently change the evidence later.”
Container layer provenance

“We can tell you whether a risk comes from your base image, your Dockerfile, or a specific library update.”
VEX with proof bundles

“For every ‘not affected’ verdict, we include a proof that shows how we know the vulnerable code is unreachable.”
Proof & explainability API

“You can always drill down from a red vulnerability indicator to the exact call‑chain that justifies it.”
Replay & determinism manifest

“Your compliance team can rerun our analysis later and confirm we’d make the same call, which is critical for audits and incident reviews.”

If you’d like next, I can:

Turn this into a formal RFC‑style doc with numbered sections and changelog, or
Draft customer‑facing docs (1–2 pages) that explain these same features in non‑technical language.

Below is what you asked for in one shot:

An internal, numbered RFC-style engineering spec (you can drop this into docs/rfcs/RFC-XXXX-deterministic-reachability.md).
A concise customer-facing explainer your documentation/marketing can turn into website copy, whitepapers, or slideware.

You can adapt names (IDs, paths, versions) to your internal conventions.

Part 1 — Internal RFC: Deterministic Reachability & Proof System

RFC ID: RFC-2025-DRP
Title: Deterministic Reachability & Proof System for Stella Ops
Status: Draft
Owners: Platform / Scanner / Graph Guilds
Target Release: Q2–Q3 2026
Affected Components: Scanner, Sbomer, Cartographer, Attestor, Vexer, Ledger, APIs

0. Abstract

This RFC defines the deterministic reachability and proof system for Stella Ops. It standardizes:

Deterministic call-graph construction per artifact
Stable identities for nodes, edges, and graphs
DSSE / in-toto attestations for call-graph edges logged to Rekor
A VEX and policy engine that uses graphs as cryptographic proof
Explainability APIs that expose call-chains, layer provenance, and analysis limits
Determinism manifests and replay capabilities for audits

The objective is to move from heuristic, non-repeatable vulnerability prioritization to a cryptographically verifiable, replayable reachability model.

1. Motivation

Current vulnerability scanners and “reachability” features in the market:

Rely on heuristics or partial coverage for call-graphs
Are typically non-deterministic (different runs = different results)
Provide limited or opaque explainability (“we think this is reachable”)
Do not provide cryptographic guarantees of integrity (no DSSE/in-toto chain)
Do not systematically link reachability to container layer provenance

Stella Ops aims to:

Reduce noise (unreachable CVEs) with strong, repeatable evidence.
Provide proof (signed attestations, Rekor logs) that decisions have not been tampered with.
Support audits via determinism manifests and replay.
Clarify responsibility (base image vs application image layers) via layer provenance.

This RFC describes the engineering design to achieve these goals.

2. Goals and Non-Goals

2.1 Goals

G-1: Deterministic call-graphs per artifact with stable node, edge, and graph IDs.
G-2: DSSE / in-toto attestations over call-graph edges, with digests logged to Rekor.
G-3: VEX decisions driven by graph reachability, not only presence of vulnerable components.
G-4: Explainable findings: each vulnerability has an attached proof bundle (paths, edges, attestations, layer provenance).
G-5: Determinism manifest and replay endpoints for audit and forensic workflows.
G-6: Integration with existing Stella Ops components (Scanner, Sbomer, Ledger, etc.) with minimal disruption.

2.2 Non-Goals

N-1: UI/UX design details (dashboards, widgets, styling) are outside scope.
N-2: Feed ingestion for vulnerability data (NVD, OSV, vendor advisories) is treated as given.
N-3: AuthN/AuthZ and multi-tenant boundaries are assumed to follow the existing platform.
N-4: Runtime trace ingestion is considered a later enhancement; this RFC prioritizes static call-graphs.

3. Terminology

Artifact – Scannable unit: container image, VM image, binary, library, archive, or source bundle.
Binary – Executable or library file within an artifact (ELF, PE, Mach-O, etc.).
Node – Callable unit (function, method, entrypoint, exported symbol) within an artifact.
Edge – Possible call from one node to another.
Call-graph – Directed graph G = (V, E) of nodes V and edges E.
Graph Revision – Canonical hash of the node and edge sets for an artifact.
Determinism Manifest – Structured record capturing all inputs and versions needed to reproduce a graph.
Attestation – DSSE-wrapped in-toto statement describing edges for a given graph revision.
Rekor – Transparency log where DSSE envelope digests are recorded.
VEX – Vulnerability Exploitability eXchange statement for a given vulnerability–artifact pair.
Proof Bundle – Structured, machine-readable explanation of how a VEX verdict was obtained (paths, limits, attestations, etc.).

4. Architecture Overview

Components and responsibilities:

Artifact Service
- Normalizes artifacts, assigns artifact_id, manages binary identities and SBOM links.
Cartographer Service
- Parses artifacts and binaries.
- Builds deterministic call-graphs (nodes + edges).
- Produces graph_revision_id and determinism manifest.
Attestor Service
- Consumes graphs and emits DSSE / in-toto attestations over edges.
- Logs attestation digests to Rekor.
- Stores attestation metadata in Ledger.
VEX Engine
- Maps vulnerabilities to nodes via SBOM + symbol mapping.
- Performs reachability analysis from entrypoints to vulnerable sinks.
- Emits VEX records and associated proof bundles.
Proof & Explainability API
- Exposes findings, proofs, call-chains, and layer provenance.
Container Layer Provenance
- Maps files and nodes to container layers and build steps.
Ledger
- Persists graphs, determinism manifests, attestations, and proofs.
- Supports replay and audit queries.

5. Data Model

5.1 Artifact and Binary Identity

artifact_id
- Globally unique ID for (kind, artifact_hash).
- Idempotent: same (kind, hash) → same artifact_id.
artifact record (logical view):

{
  "artifact_id": "art-01HXXXXX...",
  "kind": "container",
  "artifact_hash": "sha256:...",
  "coordinates": {
    "image_ref": "gcr.io/acme/api@sha256:...",
    "tag": "v1.2.3"
  },
  "build_metadata": {
    "build_id": "build-2025-01-01-1234",
    "ci_run_id": "ci-abc-123",
    "timestamp": "2025-01-01T12:34:56Z"
  },
  "sbom_ref": {
    "format": "spdx-json",
    "location": "s3://.../api-v1.2.3.spdx.json",
    "hash": "sha256:..."
  }
}

binary_id
- Assigned for each binary discovered within an artifact.

{
  "binary_id": "bin-01HYYY...",
  "artifact_id": "art-01HXXXXX...",
  "path": "/usr/local/bin/server",
  "file_hash": "sha256:...",
  "build_id": "0x1234abcd..."
}

Rationale: Stable IDs for artifacts and binaries enable precise binding between SBOM components, graph nodes, and VEX decisions, and are essential for cross-run comparability and auditability.

5.2 Node Model

Each node is defined by a canonical tuple:

artifact_id
binary_id (nullable for source-level nodes)
language
symbol_kind (function, method, constructor, lambda, entrypoint, etc.)
symbol_signature (language-specific, fully qualified)
source_location OR binary_offset
Optional container_origin (for containerized artifacts; see §6.4)

Example:

{
  "node_id": "node-...",
  "artifact_id": "art-...",
  "binary_id": "bin-...",
  "language": "php",
  "symbol_kind": "method",
  "symbol_signature": "\\Acme\\Service\\UserService::findByEmail(string): User",
  "source_location": {
    "file_path": "/app/src/Service/UserService.php",
    "line": 42,
    "column": 3
  },
  "container_origin": {
    "layer_digest": "sha256:layer1...",
    "path_in_layer": "/app/src/Service/UserService.php"
  }
}

Node ID Derivation

canonical_node_string = JSONCanonicalize({
  "artifact_id": "...",
  "binary_id": "...",
  "language": "...",
  "symbol_kind": "...",
  "symbol_signature": "...",
  "location": { ... } // source_location or binary_offset
})

node_id = "node-" + hex(sha256(canonical_node_string))

JSONCanonicalize MUST be deterministic:
- Sorted keys
- Fixed number formats
- Normalized file paths and line/column representation

Rationale: Deterministic node IDs allow persistent references to functions across scans, policies, and proofs. This is critical for long-term VEX stability and cross-tool integration.
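
The derivation above can be sketched in Python. This is a minimal illustration under stated assumptions, not Cartographer's implementation: `canonicalize` only approximates JSONCanonicalize by using `json.dumps` with sorted keys and compact separators (a complete scheme would also pin number formatting and path normalization), and the `node_id` helper and the sample values are hypothetical.

```python
import hashlib
import json


def canonicalize(obj) -> str:
    """Approximate JSONCanonicalize: sorted keys, no insignificant whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def node_id(artifact_id, binary_id, language, symbol_kind, symbol_signature, location) -> str:
    """Compute "node-" + hex(sha256(canonical_node_string)) from the canonical tuple."""
    canonical_node_string = canonicalize({
        "artifact_id": artifact_id,
        "binary_id": binary_id,  # None for source-level nodes
        "language": language,
        "symbol_kind": symbol_kind,
        "symbol_signature": symbol_signature,
        "location": location,  # source_location or binary_offset
    })
    return "node-" + hashlib.sha256(canonical_node_string.encode("utf-8")).hexdigest()


# The same location with keys in a different order must yield the same ID.
loc_a = {"file_path": "/app/src/Service/UserService.php", "line": 42, "column": 3}
loc_b = {"column": 3, "line": 42, "file_path": "/app/src/Service/UserService.php"}
sig = "\\Acme\\Service\\UserService::findByEmail(string): User"
id_a = node_id("art-1", "bin-1", "php", "method", sig, loc_a)
id_b = node_id("art-1", "bin-1", "php", "method", sig, loc_b)
```

Because the hash covers the location, moving a function to another line changes its ID in this sketch; whether that is desirable depends on how the location is normalized.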

5.3 Edge Model

Canonical fields:

edge_id
caller_node_id
callee_node_id
edge_kind – "static" | "virtual" | "interface" | "dynamic" | "indirect"
evidence:
- origin – "static_analysis" | "runtime_trace" (runtime is future extension)
- tool_version – Cartographer version
- analysis_profile – Profile name (e.g. "default", "aggressive")
- confidence – Float in [0,1]
- metadata – Language-specific info (dispatch tables, type information, etc.)

Edge ID Derivation

canonical_edge_string = JSONCanonicalize({
  "caller": "node-...",
  "callee": "node-...",
  "edge_kind": "...",
  "analysis_tool_version": "...",
  "analysis_profile": "..."
})

edge_id = "edge-" + hex(sha256(canonical_edge_string))

Edges MUST be emitted sorted by edge_id.

Rationale: Stable edges, combined with deterministic sorting, are necessary to ensure that graph hashes and attestations remain identical across re-runs with the same inputs.

5.4 Graph Revision & Determinism Manifest

Graph canonical representation:

graph_canonical = JSONCanonicalize({
  "nodes": [ ... ], // sorted by node_id
  "edges": [ ... ]  // sorted by edge_id
})

graph_revision_id = "graph-" + hex(sha256(graph_canonical))

Determinism manifest:

{
  "graph_revision_id": "graph-...",
  "artifact_id": "art-...",
  "inputs": {
    "artifact_hash": "sha256:...",
    "sbom_hash": "sha256:...",
    "scanner_version": "1.4.0",
    "cartographer_version": "2.1.3",
    "config": {
      "languages": ["php", "js"],
      "analysis_profile": "default"
    }
  },
  "toolchain_hashes": {
    "php_parser": "sha256:...",
    "js_parser": "sha256:..."
  },
  "timestamp": "2025-01-01T12:34:56Z"
}

Rationale: The manifest is the “recipe” to regenerate the same graph. It is required for replay, forensic analysis, and third-party audits.
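
A short Python sketch of the edge and graph hashing may help show why sorting matters. It assumes the same simplified canonicalization as a plain `json.dumps` with sorted keys; the helper names and the tiny sample graph are hypothetical, and real nodes and edges would carry their full canonical content.

```python
import hashlib
import json


def canonicalize(obj) -> str:
    """Simplified stand-in for JSONCanonicalize (sorted keys, compact form)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))


def edge_id(caller, callee, edge_kind, tool_version, profile) -> str:
    """Compute "edge-" + hex(sha256(canonical_edge_string))."""
    canonical_edge_string = canonicalize({
        "caller": caller,
        "callee": callee,
        "edge_kind": edge_kind,
        "analysis_tool_version": tool_version,
        "analysis_profile": profile,
    })
    return "edge-" + hashlib.sha256(canonical_edge_string.encode("utf-8")).hexdigest()


def graph_revision_id(nodes, edges) -> str:
    """Hash the graph with nodes sorted by node_id and edges sorted by edge_id."""
    graph_canonical = canonicalize({
        "nodes": sorted(nodes, key=lambda n: n["node_id"]),
        "edges": sorted(edges, key=lambda e: e["edge_id"]),
    })
    return "graph-" + hashlib.sha256(graph_canonical.encode("utf-8")).hexdigest()


nodes = [{"node_id": f"node-{i}"} for i in range(4)]
edges = [
    {
        "edge_id": edge_id(caller, callee, "static", "2.1.3", "default"),
        "caller_node_id": caller,
        "callee_node_id": callee,
    }
    for caller, callee in [("node-0", "node-1"), ("node-1", "node-2"), ("node-0", "node-3")]
]

# Discovery order differs between the two runs, the revision ID does not.
rev_first = graph_revision_id(nodes, edges)
rev_second = graph_revision_id(list(reversed(nodes)), list(reversed(edges)))
```

Note that the tool version is part of the edge hash, so upgrading Cartographer changes every edge_id and therefore the graph_revision_id, even when the discovered calls are identical.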

6. Service APIs

6.1 Artifact Service

POST /v1/artifacts

Registers or retrieves an artifact.
Idempotent on (kind, artifact_hash).

Request (simplified):

{
  "kind": "container",
  "coordinates": {
    "image_ref": "gcr.io/acme/api@sha256:...",
    "tag": "v1.2.3"
  },
  "hash": {
    "algo": "sha256",
    "value": "..."
  },
  "build_metadata": { ... },
  "sbom_ref": { ... }
}

Response:

{
  "artifact_id": "art-01HXXXXX...",
  "artifact_hash": "sha256:...",
  "kind": "container"
}

Binary registration is normally performed automatically by Scanner/Cartographer, but logically lives under Artifact Service.

6.2 Cartographer (Call-Graph Builder)

POST /v1/graphs:analyze

Request:

{
  "artifact_id": "art-01HXXXXX...",
  "analysis_profile": "default",
  "languages": ["php", "js", "c"],
  "options": {
    "include_runtime_traces": false
  }
}

Response (job):

{
  "graph_id": "graph-job-01ABC...",
  "artifact_id": "art-...",
  "status": "queued"
}

GET /v1/graphs/{graph_id}

Response (normalized):

{
  "graph_revision_id": "graph-...",
  "artifact_id": "art-...",
  "nodes": [ ... ],
  "edges": [ ... ],
  "determinism_manifest": { ... }
}

Requirements:

Language coverage: PHP, JS/TS, Java, Go, C/C++, .NET, Python.
Deterministic parsing and traversal (AST order, sorted children, pinned parser versions).
Error modes: failed_parse, failed_analysis, unsupported_language with machine-readable codes.

6.3 Attestor Service

Payload format

in-toto Statement:

{
  "_type": "https://in-toto.io/Statement/v0.1",
  "subject": [
    {
      "name": "artifact:art-01HXXXXX...",
      "digest": { "sha256": "..." }
    }
  ],
  "predicateType": "https://stella.dev/attestations/callgraph-edges/v1",
  "predicate": {
    "graph_revision_id": "graph-...",
    "edges": [
      {
        "edge_id": "edge-...",
        "caller_node_id": "node-...",
        "callee_node_id": "node-...",
        "edge_kind": "static",
        "evidence": {
          "origin": "static_analysis",
          "tool_version": "2.1.3",
          "analysis_profile": "default",
          "confidence": 1.0
        }
      }
    ],
    "analysis_metadata": {
      "cartographer_version": "2.1.3",
      "options": {
        "languages": ["php", "js"],
        "analysis_profile": "default"
      },
      "timestamp": "2025-01-01T12:34:56Z"
    }
  }
}

Wrapped in DSSE envelope:

{
  "payloadType": "application/vnd.in-toto+json",
  "payload": "base64url(<Statement JSON>)",
  "signatures": [
    {
      "keyid": "k-01KEY...",
      "sig": "base64url(signature-bytes)"
    }
  ]
}

Signing:

Default: Ed25519 (others may be added; must be recorded).
Private keys in KMS/HSM.
Key rotation supported without invalidating old attestations (verification must handle multiple keys).
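
As an illustration of what actually gets signed, the sketch below builds a DSSE envelope in Python. DSSE signs the pre-authentication encoding (PAE) of the payload type and payload rather than the raw JSON. Everything else here is an assumption for the example: the helper names are hypothetical, and because the Python standard library has no Ed25519, the demo signer uses HMAC-SHA256 purely as a stand-in; a real Attestor would call its KMS/HSM-backed Ed25519 key instead.

```python
import base64
import hashlib
import hmac
import json


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the exact bytes passed to the signer."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %s %d %s" % (len(type_bytes), type_bytes, len(payload), payload)


def make_envelope(statement: dict, payload_type: str, keyid: str, sign) -> dict:
    """Serialize the statement, sign PAE(type, payload), and wrap it in an envelope."""
    payload = json.dumps(statement, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = sign(pae(payload_type, payload))
    return {
        "payloadType": payload_type,
        "payload": base64.urlsafe_b64encode(payload).decode("ascii"),
        "signatures": [
            {"keyid": keyid, "sig": base64.urlsafe_b64encode(signature).decode("ascii")}
        ],
    }


def payload_digest(envelope: dict) -> str:
    """sha256 over the decoded payload; independent of which key signed it."""
    raw = base64.urlsafe_b64decode(envelope["payload"])
    return "sha256:" + hashlib.sha256(raw).hexdigest()


def demo_sign(message: bytes) -> bytes:
    """Stand-in signer for this demo ONLY (HMAC-SHA256, not Ed25519)."""
    return hmac.new(b"demo-key-not-for-production", message, hashlib.sha256).digest()


statement = {
    "_type": "https://in-toto.io/Statement/v0.1",
    "predicate": {"graph_revision_id": "graph-demo", "edges": []},
}
envelope = make_envelope(statement, "application/vnd.in-toto+json", "k-demo", demo_sign)
```

Since the payload digest does not depend on the signature, it is a natural value to compare when a replay is signed with a rotated key.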

Batching:

Up to N edges per attestation (default: 10,000).
Single graph_revision_id per attestation.
Edges within attestation sorted by edge_id.
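
The batching rules can be sketched as a small Python helper. The function name `batch_edges` is hypothetical; the only behavior taken from the text is the size cap and the edge_id ordering.

```python
def batch_edges(edges, max_per_batch=10_000):
    """Split edges into attestation-sized batches, sorted by edge_id throughout."""
    if max_per_batch < 1:
        raise ValueError("max_per_batch must be at least 1")
    ordered = sorted(edges, key=lambda e: e["edge_id"])
    return [ordered[i:i + max_per_batch] for i in range(0, len(ordered), max_per_batch)]


# 25 edges arriving in reverse order, batched 10 at a time for the demo.
demo_edges = [{"edge_id": f"edge-{i:03d}"} for i in reversed(range(25))]
demo_batches = batch_edges(demo_edges, max_per_batch=10)
```

Sorting before splitting means the same edge always lands in the same batch for a given graph revision, which keeps attestation payloads reproducible.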

Rekor:

Submit DSSE envelope digest to Rekor (entry kind dsse or intoto).
Persist:
- attestation_id
- rekor_log_index
- rekor_uuid
- rekor_inclusion_proof

POST /v1/attestations/graphs/{graph_revision_id}

Response:

{
  "attestation_batch_id": "att-batch-01...",
  "graph_revision_id": "graph-...",
  "status": "in_progress"
}

GET /v1/attestations/batches/{attestation_batch_id}

Response:

{
  "attestation_batch_id": "att-batch-01...",
  "graph_revision_id": "graph-...",
  "status": "completed",
  "attestations": [
    {
      "attestation_id": "att-01...",
      "dsse_envelope_ref": "s3://.../att-01.json",
      "rekor_log_index": 12345,
      "rekor_uuid": "..."
    }
  ]
}

6.4 Container Layer Provenance

Scanner MUST:

Parse image configuration and layer chain.
Emit per-layer metadata:

{
  "layer_digest": "sha256:layer0...",
  "index": 0,
  "created_by": "FROM ubuntu:22.04",
  "size_bytes": 123456
}

File provenance:

Map file_path to origin_layer_digest, file_hash, and mode/ownership.
When building nodes, resolve source_location.file_path to origin_layer_digest whenever possible.

This yields:

"container_origin": {
  "layer_digest": "sha256:layerX...",
  "index": 1,
  "path_in_layer": "/app/src/UserService.php"
}
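
One way to resolve a file to its origin layer is sketched below in Python. The input shape (a `files` map per layer) and the helper name are assumptions made for the example, and the sketch deliberately ignores whiteout entries, so a file deleted in a later layer would still resolve to the layer that last wrote it.

```python
def resolve_container_origin(layers, file_path):
    """Return the container_origin for file_path, or None if no layer contains it.

    The topmost (highest index) layer that wrote the path wins, because that
    layer's content is what ends up in the final filesystem.
    """
    origin = None
    for layer in sorted(layers, key=lambda l: l["index"]):
        if file_path in layer["files"]:
            origin = {
                "layer_digest": layer["layer_digest"],
                "index": layer["index"],
                "path_in_layer": file_path,
            }
    return origin


demo_layers = [
    {"layer_digest": "sha256:layer0", "index": 0,
     "files": {"/usr/lib/libfoo.so": "sha256:aaa"}},
    {"layer_digest": "sha256:layer1", "index": 1,
     "files": {"/app/src/UserService.php": "sha256:bbb"}},
    {"layer_digest": "sha256:layer2", "index": 2,
     "files": {"/app/src/UserService.php": "sha256:ccc"}},
]
origin = resolve_container_origin(demo_layers, "/app/src/UserService.php")
```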

6.5 VEX Engine & Policy

Vulnerability representation:

{
  "vulnerability_id": "CVE-2025-12345",
  "source": "OSV",
  "affected_symbols": [
    {
      "language": "php",
      "symbol_signature": "\\Vendor\\Lib\\Foo::dangerous()",
      "matching_rule": "exact"
    }
  ],
  "severity": "HIGH",
  "feeds": ["osv", "nvd"],
  "metadata": { }
}

Process for (artifact_id, vulnerability_id):

Node Matching
- Map affected_symbols to node_ids via language + signature + heuristic matching as needed.
- Define Sinks = { node_id_1, node_id_2, ... }.
Entrypoint Resolution
- Default Entrypoints from:
  - Container entrypoint/cmd and their transitive calls.
  - Exported/public API functions where appropriate.
- Allow policy override (e.g. custom entrypoint list per app).
Reachability Search
- Reverse traversal from each sink towards entrypoints.
- Bounded BFS/DFS (configurable max_depth, max_time_ms).
- Record explored nodes and edges for proof building.
Result Classification
- reachable: at least one entrypoint → ... → sink path found.
- not_reachable: no path within bounds, with sufficient graph coverage.
- unknown: bounds or missing data prevent a confident classification.
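
The search and classification steps above can be sketched as a bounded reverse BFS in Python. This is a simplified illustration: node IDs are plain strings, edges are `(caller, callee)` pairs, only `max_depth` is enforced (no `max_time_ms`), and the `coverage_complete` flag is a hypothetical stand-in for whatever graph-coverage signal the engine uses.

```python
from collections import deque


def classify(edges, entrypoints, sinks, max_depth=50, coverage_complete=True):
    """Walk callers backwards from each sink; return (classification, path or None)."""
    callers = {}
    for caller, callee in edges:
        callers.setdefault(callee, []).append(caller)
    for caller_list in callers.values():
        caller_list.sort()  # fixed visiting order keeps the result deterministic

    entry_set = set(entrypoints)
    truncated = False
    for sink in sorted(sinks):
        next_hop = {sink: None}  # node -> next node on the way to the sink
        queue = deque([(sink, 0)])
        while queue:
            node, depth = queue.popleft()
            if node in entry_set:
                path = []
                while node is not None:  # rebuild entrypoint -> ... -> sink
                    path.append(node)
                    node = next_hop[node]
                return "reachable", path
            if depth == max_depth:
                # Unvisited callers beyond the bound mean the search was cut short.
                if any(c not in next_hop for c in callers.get(node, [])):
                    truncated = True
                continue
            for c in callers.get(node, []):
                if c not in next_hop:
                    next_hop[c] = node
                    queue.append((c, depth + 1))

    if truncated or not coverage_complete:
        return "unknown", None
    return "not_reachable", None


demo_edges = [("main", "a"), ("a", "b"), ("b", "vuln"), ("orphan", "dead")]
```

For example, `classify(demo_edges, ["main"], ["vuln"])` finds the path through `a` and `b`, while lowering `max_depth` to 1 stops before reaching `main` and yields `unknown` rather than a false `not_reachable`.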

VEX record:

{
  "vex_id": "vex-01...",
  "artifact_id": "art-...",
  "vulnerability_id": "CVE-2025-12345",
  "status": "affected" | "not_affected" | "fixed" | "under_investigation",
  "justification": "reachability_not_detected" | "component_not_present" | "vulnerable_code_not_in_execute_path" | "...",
  "proof_ref": "proof-01...",
  "generated_at": "2025-01-01T12:34:56Z",
  "graph_revision_id": "graph-...",
  "policy_version": "pol-2025.01"
}
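
A possible mapping from the reachability classification to a VEX record is sketched below in Python. The mapping itself is an assumption made for illustration (the RFC does not fix it): `reachable` becomes `affected`, `not_reachable` becomes `not_affected` with the `vulnerable_code_not_in_execute_path` justification, and `unknown` becomes `under_investigation`. The helper name is hypothetical and `vex_id` assignment is left to the caller.

```python
# Assumed mapping: classification -> (status, justification or None)
STATUS_BY_CLASSIFICATION = {
    "reachable": ("affected", None),
    "not_reachable": ("not_affected", "vulnerable_code_not_in_execute_path"),
    "unknown": ("under_investigation", None),
}


def build_vex_record(artifact_id, vulnerability_id, classification, proof_ref,
                     graph_revision_id, policy_version, generated_at):
    """Build a VEX record dict; the timestamp is passed in so output is repeatable."""
    if classification not in STATUS_BY_CLASSIFICATION:
        raise ValueError(f"unsupported classification: {classification}")
    status, justification = STATUS_BY_CLASSIFICATION[classification]
    record = {
        "artifact_id": artifact_id,
        "vulnerability_id": vulnerability_id,
        "status": status,
        "proof_ref": proof_ref,
        "generated_at": generated_at,
        "graph_revision_id": graph_revision_id,
        "policy_version": policy_version,
    }
    if justification is not None:
        record["justification"] = justification  # only set for not_affected
    return record


demo_record = build_vex_record(
    "art-demo", "CVE-2025-12345", "not_reachable", "proof-demo",
    "graph-demo", "pol-2025.01", "2025-01-01T12:34:56Z",
)
```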

6.6 Proof Bundles & Explainability API

Proof bundle:

{
  "proof_id": "proof-01...",
  "artifact_id": "art-...",
  "vulnerability_id": "CVE-2025-12345",
  "graph_revision_id": "graph-...",
  "kind": "reachable" | "not_reachable" | "unknown",
  "paths": [
    {
      "path_id": "path-01...",
      "entrypoint_node_id": "node-...",
      "sink_node_id": "node-...",
      "edge_ids": ["edge-1...", "edge-2..."],
      "attestation_refs": [
        {
          "edge_id": "edge-1...",
          "attestation_id": "att-01...",
          "rekor_log_index": 12345
        }
      ]
    }
  ],
  "analysis_limits": {
    "max_depth": 50,
    "max_time_ms": 5000,
    "coverage_notes": "no runtime traces included"
  },
  "unreachable_reason": {
    "no_paths_found": true,
    "graph_coverage": "full",
    "notes": "All entrypoints explored up to depth 50; no path to sink."
  }
}
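
A consumer of a proof bundle might sanity-check each path before trusting it. The Python sketch below checks that the listed edges form an unbroken chain from the entrypoint to the sink. The helper name and the `edges_by_id` lookup are hypothetical, and the rule that every edge must have an attestation reference is an assumption of this example rather than something the RFC states.

```python
def validate_path(path, edges_by_id):
    """Return a list of problems; an empty list means the path is well-formed."""
    edge_ids = path["edge_ids"]
    if not edge_ids:
        return ["path has no edges"]
    unknown = [e for e in edge_ids if e not in edges_by_id]
    if unknown:
        return [f"unknown edge: {e}" for e in unknown]

    problems = []
    chain = [edges_by_id[e] for e in edge_ids]
    if chain[0]["caller_node_id"] != path["entrypoint_node_id"]:
        problems.append("first edge does not start at the entrypoint")
    if chain[-1]["callee_node_id"] != path["sink_node_id"]:
        problems.append("last edge does not end at the sink")
    for prev, nxt in zip(chain, chain[1:]):
        if prev["callee_node_id"] != nxt["caller_node_id"]:
            problems.append(f"gap between {prev['edge_id']} and {nxt['edge_id']}")

    # Assumed rule: each edge on the path needs an attestation reference.
    attested = {ref["edge_id"] for ref in path.get("attestation_refs", [])}
    for e in edge_ids:
        if e not in attested:
            problems.append(f"edge without attestation reference: {e}")
    return problems


demo_edges_by_id = {
    "edge-1": {"edge_id": "edge-1", "caller_node_id": "node-main", "callee_node_id": "node-a"},
    "edge-2": {"edge_id": "edge-2", "caller_node_id": "node-a", "callee_node_id": "node-vuln"},
}
demo_path = {
    "entrypoint_node_id": "node-main",
    "sink_node_id": "node-vuln",
    "edge_ids": ["edge-1", "edge-2"],
    "attestation_refs": [
        {"edge_id": "edge-1", "attestation_id": "att-demo", "rekor_log_index": 1},
        {"edge_id": "edge-2", "attestation_id": "att-demo", "rekor_log_index": 1},
    ],
}
```

This only checks structural consistency; verifying the signatures and Rekor inclusion proofs behind each reference is a separate step.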

API:

GET /v1/artifacts/{artifact_id}/findings – Lists findings and high-level summaries.
GET /v1/findings/{finding_id}/proof – Returns the full proof bundle as above, including path details, node metadata (symbol signatures, source location, container layers), and attestation pointers.

7. Determinism & Replay

Replay Endpoint

POST /v1/graphs/{graph_revision_id}:replay

Uses determinism manifest to re-run Cartographer.
Expected outputs:
- Same graph_revision_id.
- Identical sets of nodes and edges (IDs and canonical content).
- For attestations, payload digests MUST match (signatures may differ due to key rotation).

Audit Report

GET /v1/graphs/{graph_revision_id}/audit-report

Includes:
- Original determinism manifest.
- Replay manifest.
- Comparison result: match / mismatch.
- If mismatch: enumerated differences (nodes, edges, toolchain or config drift).
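
The comparison behind the audit report could look roughly like the Python sketch below. It assumes both the original and the replayed graph are available in the shape returned by GET /v1/graphs/{graph_id}; the function name and the layout of the `differences` object are hypothetical.

```python
def compare_graphs(original, replay):
    """Compare an original graph with its replay and list any differences."""
    differences = {}
    for key, id_field in (("nodes", "node_id"), ("edges", "edge_id")):
        before = {item[id_field] for item in original[key]}
        after = {item[id_field] for item in replay[key]}
        if before != after:
            differences[key] = {
                "missing_in_replay": sorted(before - after),
                "added_in_replay": sorted(after - before),
            }

    # Toolchain or config drift shows up as differing manifest inputs.
    original_inputs = original["determinism_manifest"]["inputs"]
    replay_inputs = replay["determinism_manifest"]["inputs"]
    drift = sorted(
        k for k in set(original_inputs) | set(replay_inputs)
        if original_inputs.get(k) != replay_inputs.get(k)
    )
    if drift:
        differences["input_drift"] = drift

    same_revision = original["graph_revision_id"] == replay["graph_revision_id"]
    result = "match" if same_revision and not differences else "mismatch"
    return {"result": result, "differences": differences}


demo_original = {
    "graph_revision_id": "graph-aaa",
    "nodes": [{"node_id": "node-1"}, {"node_id": "node-2"}],
    "edges": [{"edge_id": "edge-1"}],
    "determinism_manifest": {"inputs": {"cartographer_version": "2.1.3"}},
}
demo_replay = {
    "graph_revision_id": "graph-bbb",
    "nodes": [{"node_id": "node-1"}],
    "edges": [{"edge_id": "edge-1"}],
    "determinism_manifest": {"inputs": {"cartographer_version": "2.1.4"}},
}
demo_report = compare_graphs(demo_original, demo_replay)
```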

8. Security and Compliance Considerations

All graphs, manifests, and attestations must be integrity-checked (hashes) before use.
Key management for signing must follow internal KMS/HSM best practices.
Attestations and proofs may contain code locations but must not leak customer secrets beyond configured scopes.
Multi-tenant isolation rules apply to all stored artifacts, graphs, and proofs; Rekor is public by design, but tenant IDs and internal references must not leak into Rekor payloads.

9. Performance & Scalability

Baseline targets (initial):

Per artifact up to 10^6 edges.
Platform capacity 10^8 edges/day across tenants.
p95 for medium artifacts (≤100k edges):
- Graph build ≤ 60s.
- Attestation batching + Rekor ≤ 120s (async).

Storage:

Graph data indexed by artifact_id, graph_revision_id.
Edges indexable by edge_id and caller_node_id/callee_node_id.
Proof bundles indexable by artifact_id, vulnerability_id, and vex_id.

10. Operational Concerns

Feature flags for:
- Graph-based VEX decisions (on/off).
- Attestations + Rekor logging (on/off; some customers may not allow external logs).
- Determinism replay endpoints (may be limited to Enterprise tiers).
Monitoring:
- Graph build failure rate, latency histograms.
- Attestation creation and Rekor submission errors.
- Replay mismatch rate.

11. Alternatives Considered

Non-deterministic traversal with approximate edges – rejected due to poor auditability.
Runtime-only reachability – rejected as primary mechanism due to incomplete coverage (not all paths exercised in tests).
Storing unsigned graph data only – rejected because it does not meet integrity or non-repudiation goals.

12. Rollout Plan (High Level)

Phase 1 – Internal graphs only
- Implement Cartographer determinism + manifest.
- No attestations; internal verification.
Phase 2 – Attestations & Rekor
- Integrate Attestor, DSSE, and Rekor logging.
- Build internal tools to inspect and verify attestations.
Phase 3 – VEX with proof bundles
- Switch VEX decisions for selected tenants to graph-based reachability.
- Expose Proof & Explainability API.
Phase 4 – Replay & audit workflows
- Make replay endpoints GA for Enterprise customers.
- Add reporting and export features.

13. Changelog

v0.1 (Draft) – Initial RFC, definitions of node/edge identities, graph determinism, VEX and proof model.
v0.2 (Planned) – Add runtime traces integration and trust lattice hooks.
v1.0 (Planned) – Marked stable once at least two major tenants successfully use reachability-based VEX in production.

Part 2 — Customer-Facing Explainer (Draft Copy)

You can give this to your documentation / marketing team to adapt into website, PDF, or slide decks.

Title: Deterministic Reachability & Cryptographic Proofs in Stella Ops

Most vulnerability scanners share the same problem: they report every known CVE in your code and containers, but they cannot reliably tell you which ones are actually exploitable in your environment. Security teams are left drowning in noise, and auditors have to trust whatever the tool claims without real evidence.

Stella Ops takes a different approach. We build deterministic call-graphs for every artifact you scan and then attach cryptographic proofs to our reachability analysis and VEX decisions.

1. From “You Have a CVE” to “Here Is the Reachable Path”

A call-graph is a map of which functions can call which other functions inside your containers, binaries, and services. Using language-specific analysis pipelines, Stella Ops:

Identifies every callable function or method (nodes).
Establishes possible calls between them (edges).
Computes whether a vulnerable function can actually be reached from any real entry point in your application.

The result is a clear distinction between:

Vulnerabilities that are reachable and worth urgent remediation.
Vulnerabilities that are present but not reachable under any execution path we can construct.
Vulnerabilities that are unknown (e.g. where analysis limits or missing information prevent a confident verdict).

This dramatically reduces noise for your developers and AppSec teams.

2. Deterministic, Replayable Results

Most tools behave like a black box: re-running analysis on the same artifact may yield slightly different results over time. That is not acceptable in regulated or high-assurance environments.

Stella Ops ensures that:

For the same inputs (artifact, configuration, tool versions), we always produce the same call-graph.
Every function and edge gets a stable identifier.
The entire graph is summarized into a graph hash and a determinism manifest that records exactly which tools and configurations were used.

If you or your auditor re-run the analysis in the future, you can confirm that:

You get the same graph hash.
The same nodes and edges are present.
Our decisions are reproducible, not heuristic.

3. Cryptographic Evidence: DSSE, in-toto, and Rekor

We do not expect you to just trust our analysis; we sign it.

For each call-graph, Stella Ops creates DSSE-wrapped in-toto attestations that describe the edges (who calls whom) for that artifact. These attestations are:

Digitally signed using modern cryptography.
Logged to a transparency ledger (Rekor) so that any tampering is publicly detectable.
Stored in your Stella Ops Ledger so you can retrieve and verify them later.

This gives you a verifiable chain of custody:

From artifact and SBOM.
To call-graph construction.
To reachability analysis and VEX decisions.

4. VEX With Real Proof, Not Just a Checkbox

Stella Ops emits VEX statements for each vulnerability–artifact pair. Unlike most tools, every VEX decision is backed by a proof bundle:

For “affected” findings, we show the exact call-chain from a real entrypoint (e.g. /app/index.php or a REST controller) to the vulnerable function, along with references to the signed attestations that prove each edge.
For “not affected” findings, we document that we explored all configured entrypoints to a defined depth and found no path to the vulnerable function. We also include explicit notes about analysis limits and coverage.

In other words, every VEX verdict comes with a “why,” not just a “yes/no.”

5. Understanding Where Vulnerabilities Come From

Many issues originate in:

Base images
Shared layers used by multiple teams
Vendor-provided components

Stella Ops tracks container layer provenance for each file and function. For any finding, we can show:

Which layer introduced the vulnerable function.
Whether it came from a base image or from your own Dockerfile steps.
The exact file and path in that layer.

This makes it much easier to assign ownership (“platform team vs application team”) and to fix the problem at the right source (updating a base image vs patching local code).

6. Audit-Ready: Replay and Forensics

For regulated industries, it is not enough to know that a decision was made; you must be able to prove it was made correctly and could be made again in the same way.

Stella Ops provides:

A determinism manifest per analysis run, listing all inputs and tool versions.
A replay capability: re-run the analysis and verify that you obtain the same graph and the same vulnerability verdicts.
Exportable reports that link vulnerabilities, VEX decisions, call-chains, and attestations.

This supports internal audits, external regulators, and incident response activities where accurate historical reconstruction is essential.

7. Practical Benefits for Your Teams

By combining deterministic graphs, cryptographic attestations, and explainable VEX, Stella Ops:

Reduces the number of vulnerabilities engineers must look at by focusing on actually reachable issues.
Gives security teams hard evidence they can bring to auditors and management.
Helps operations and platform teams quickly see whether an issue belongs to a base image, a shared library, or a specific application.
Makes vulnerability management more like an engineering discipline and less like guesswork.

If you want, I can next reshape these into:

A shorter, 1-page executive summary, or
A developer-facing “How Stella Ops Reachability Works” guide with diagrams and example JSON snippets for your docs site.
