git.stella-ops.org/20-Nov-2026 - Branch · Reachability & Moat Watch — Verified 2025 Updates.md at 522fff73cd1dea85bfa61dba58a8f1805abcd0b4 - git.stella-ops.org

feat: Add comprehensive documentation for binary reachability with PURL-resolved edges

- Introduced a detailed specification for encoding binary reachability that integrates call graphs with SBOMs.
- Defined a minimal data model including nodes, edges, and SBOM components.
- Outlined a step-by-step guide for building the reachability graph in a C#-centric manner.
- Established core domain models, including enumerations for binary formats and symbol kinds.
- Created a public API for the binary reachability service, including methods for graph building and serialization.
- Specified SBOM component resolution and binary parsing abstractions for PE, ELF, and Mach-O formats.
- Enhanced symbol normalization and digesting processes to ensure deterministic signatures.
- Included error handling, logging, and a high-level test plan to ensure robustness and correctness.
- Added non-functional requirements to guide performance, memory usage, and thread safety.

2025-11-20 23:16:02 +02:00


Short answer: Yes, we have more than enough public information and ecosystem knowledge to design better reachability graphs for PHP and JavaScript than what Snyk/Semgrep expose today—especially in terms of openness, precision, and determinism. What we do not have is their exact proprietary heuristics, but we don’t need those to surpass them architecturally.

Let me break it down in the dimensions that matter for Stella Ops.

1. What we concretely know from Snyk & Semgrep

From public material we can infer the shape of their systems:

Snyk
- Builds a call graph of the application + dependencies to decide if vulnerable functions are on an execution path from “entry points” (e.g., HTTP handlers, CLI entry, etc.). (Snyk)
- For its “Reachable Vulnerabilities” feature, Snyk explicitly states that it ingests your repo, builds a call graph, then discards source and keeps only the graph + function names. (docs.snyk.io)
- Combines SCA with static analysis and uses reachability as a factor in “risk score” / prioritization. (docs.snyk.io)
Semgrep (Supply Chain)
- Reachability is computed by correlating manifests/lockfiles with static analysis of the code to see whether vulnerable components are actually used. (semgrep.dev)
- Uses a dependency graph for supply chain (including transitive deps) and classifies findings as “always reachable / conditionally reachable / needs review / no reachability analysis.” (semgrep.dev)
- For PHP specifically, they now advertise reachability as GA in Supply Chain (we saw that in your earlier search). This tells us they do at least basic call-graph level reasoning + data flow for PHP.

Conceptually, that already gives us the core primitives:

Call graphs (application + dependencies).
Entry point modeling.
Mapping vulnerable symbols (functions/methods/routes) to nodes in that graph.
Reachability classification at the level of “reachable / no-path / conditional / not analyzed”.

We also have additional public references (Endor Labs, Coana, GitLab, GitHub, etc.) that all describe more or less the same model: build call graphs or code property graphs and do forward/backward reachability over them. (endorlabs.com)

So: the algorithmic space is well-documented. The secret sauce is mostly heuristics and engineering, not unknown math.

2. Where the gaps actually are

What we do not get from Snyk/Semgrep publicly:

Concrete internal call-graph algorithms and framework models (how they resolve dynamic imports, reflection, magic in PHP, complex JS bundler semantics).
Their framework-specific “entry point catalogs” (e.g., mapping Express/Koa/NestJS/Next.js routes, Laravel/Symfony/WordPress hooks, etc.).
Their internal tuning of false-positive / false-negative trade-offs per language and framework.
Their private benchmarks and labeled datasets.

That means we cannot “clone Snyk’s reachability,” but we absolutely can design:

A better graph spec.
A more transparent and deterministic pipeline.
Stronger binary + container + SBOM/VEX integration.

Which is exactly aligned with your Stella Ops vision.

3. For PHP & JavaScript specifically: can we beat them?

For graph quality and expressiveness, yes, we can.

JavaScript / TypeScript

Existing tools face these pain points:

Highly dynamic imports (require(...), import(), bundlers).
Multiple module systems (CJS, ESM, UMD), tree-shaking, dead code elimination.
Framework magic (Next.js, React SSR, Express middlewares, serverless handlers).

Public info shows Snyk builds a call graph and analyzes execution paths, but details on how they handle all JS edge cases are abstracted away. (Snyk)

What we can do better in Stella Ops graphs:

First-class “resolution nodes”:
- Represent module resolution, bundler steps, and dynamic import decisions as explicit nodes/edges in the graph.
- This makes ambiguity visible instead of hidden inside a heuristic.
Framework contracts:
- Have pluggable “route/handler mappers” per framework (Express, Nest, Next, Fastify, serverless wrappers) so entry points are explicit graph roots, not magic.
Multiple call-graph layers:
- Source-level graph (TS/JS).
- Bundled output graph (Webpack/Vite/Rollup).
- Runtime-inferred hints (if we later choose to add traces), all merged into a unified reachability graph with provenance tags.

If we design our graph format to preserve all uncertainty explicitly (e.g., edges tagged as “over-approximate”, “dynamic-guess”, “runtime-confirmed”), we will have better analytical quality even if raw coverage is comparable.

PHP

Semgrep now has PHP reachability GA in Supply Chain, but again we only see the outcomes, not the internal graph model. (DEV Community)

We can exploit known pain points in PHP:

Dynamic includes / autoloaders.
Magic methods, dynamic dispatch, frameworks like Laravel/Symfony/WordPress/Drupal.
Templating / view layers that act as “hidden” entry points.

Improvements in the Stella Ops model:

Autoloader-aware graph layer:
- Model Composer autoloading rules explicitly; edges from composer.json and PSR-4/PSR-0 rules into the graph.
Framework profiles:
- For Laravel/Symfony/etc., we ship profiles that define how controllers, routes, middlewares, commands, and events are wired. Those profiles become graph generators, not just regex signatures.
Source-to-SBOM linkage:
- Nodes are annotated with PURLs and SBOM component IDs, so you get reachability graph edges directly against SBOM + VEX.

Again, even without their internals, we can design a richer, more transparent graph representation.

4. How Stella Ops can clearly surpass them (graph-wise)

Given your existing roadmap (SBOM spine, deterministic replay, lattice policies), we can deliberately design a reachability graph system that outclasses them on these axes:

Open, documented graph spec
- Define a “Reachability Graph Manifest”:
  - Nodes: functions/methods/routes/files/modules + dependency components (PURLs).
  - Edges: call edges, data-flow edges, dependency edges, “resolution” edges.
  - Metadata: language, framework, hashes, provenance, SBOM linkage.
- Publish it so others can generate/consume the same graphs.
Deterministic, replayable scans
- Every scan is defined by:
  - Exact sources (hashes).
  - Analyzer version.
  - Ruleset + framework profiles.
- Result: any reachability verdict can be re-computed bit-for-bit later.
PURL-level edges for supply chain
- Reachability graph includes direct edges:
  - app:function → package:function → CVE.
- This is exactly what most tools conceptually do, but we make it explicit and exportable.
Rich status model beyond “reachable / not”
- Adopt and extend Semgrep-like statuses: always_reachable, conditionally_reachable, unreachable, not_analyzed, ambiguous.
- Add:
  - Confidence levels.
  - Types of ambiguity (dynamic dispatch, reflection, unresolved include, etc.).
Integration with lattice / policy engine
- You already plan a Trust Algebra / lattice engine.
- The reachability graph becomes a first-class input:
  - Policies like “treat conditionally_reachable through untrusted input as High Risk; treat always_reachable behind auth as Medium; treat unreachable but exploitable via RCE as Special Case”.
Offline, sovereign, binary-aware
- Most commercial tools are SaaS and source-centric.
- We can:
  - Run completely offline.
  - Attach reachability graphs to container images, SBOMs, and in-toto attestations.
  - Extend to compiled artifacts later (ELF/PE/Mach-O, which you already plan).

All of this is achievable with publicly available theory + our own engineering, independent of Snyk/Semgrep proprietary code.

5. Practical conclusion

So, to answer your question precisely:

We do not have enough information to duplicate Snyk or Semgrep’s internal implementations.
We absolutely have enough information (from their docs + the broader reachability literature + open tools) to:
- Design a more expressive, auditable reachability graph model for PHP and JavaScript.
- Implement our own call-graph construction and reachability analysis that is:
  - Deterministic,
  - SBOM/VEX-native,
  - Framework-aware,
  - And aligned with Stella Ops’ offline/attested architecture.

If you want, as a next step I can sketch a concrete “Reachability Graph v1” JSON schema for PHP/JS with PURL edges and confidence annotations that can plug directly into your Scanner/WebService and SBOM spine.

Good, let’s turn this into something a mid-level engineer can actually implement for PHP without guesswork.

Below is a concrete, implementation-ready reachability spec for PHP v1, structured so you can give it directly to an engineer as requirements.

1. Scope of PHP Reachability v1

Goal

Given:

A PHP project (source code),
composer.json + composer.lock,
A list of vulnerable symbols (e.g., FQNs from a vulnerability DB, each tied to a PURL),

produce:

A call graph of PHP functions/methods (with nodes and edges).
A mapping between nodes and dependency components (PURLs).
A reachability report per vulnerable symbol:
- reachable, maybe_reachable, unreachable, not_analyzed
- With example call paths when reachable.

Deliberate limitations of v1 (explicit, to keep it doable):

Supported code:
- Plain PHP functions.
- Class methods (normal, static).
Supported calls:
- Direct function calls: foo()
- Method calls: $obj->bar(), Foo::bar()
Supported resolution features:
- Namespaces + use imports.
- Composer autoload mapping (PSR-4/0, classmap) from composer.json.
Not fully supported in v1 (see section 5 for how each case is handled):
- Dynamic function names ($fn()).
- Dynamic method calls ($obj->$name()).
- Heavy reflection magic.
- Complex framework containers (Laravel, Symfony DI) – reserved for v2.

2. Reachability Graph Document (JSON)

The main artifact is a graph document. One file per scan:

{
  "schemaVersion": "1.0.0",
  "language": "php",
  "project": {
    "projectId": "my-app",
    "rootDir": "/src/app", 
    "hash": "sha256:..."
  },
  "components": [
    {
      "id": "comp-1",
      "purl": "pkg:composer/vendor/lib-a@1.2.3",
      "name": "vendor/lib-a",
      "version": "1.2.3"
    }
  ],
  "nodes": [],
  "edges": [],
  "vulnerabilities": [],
  "reachabilityResults": []
}

2.1 Node model

Every node is a callable (function or method) or an entry point.

{
  "id": "node-uuid-or-hash",
  "kind": "function | method | entrypoint",
  "name": "index",
  "fqn": "\\App\\Controller\\HomeController::index",
  "file": "src/Controller/HomeController.php",
  "line": 42,
  "componentId": "comp-1",       
  "purl": "pkg:composer/vendor/lib-a@1.2.3", 
  "entryPointType": "http_route | cli | unknown | null",
  "extras": {
    "namespace": "\\App\\Controller",
    "className": "HomeController",
    "visibility": "public | protected | private | null"
  }
}

Rules for node creation

Function node
- kind = "function"
- fqn = \Namespace\functionName
Method node
- kind = "method"
- fqn = \Namespace\ClassName::methodName
Entrypoint node
- kind = "entrypoint"
- entryPointType set accordingly (may be unknown initially).
- Typically represents:
  - public/index.php
  - bin/console commands, etc.
- Entrypoints can either:
  - Be separate nodes that call real functions/methods, or
  - Be the same node as a method/function flagged as entrypoint.
- For v1, keep it simple: use separate entrypoint nodes that call “real” nodes.
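
The FQN formats above, together with the "node-uuid-or-hash" id, can be sketched in TypeScript as follows. This is a minimal illustration, not part of the spec: the helper names are hypothetical, the `entry:` prefix is borrowed from the example path in section 7, and the djb2-style 32-bit hash is only a dependency-free stand-in for a real digest such as SHA-256.

```typescript
// Hypothetical helpers: only the FQN formats come from the node rules above.
type NodeKind = "function" | "method" | "entrypoint";

interface CallableRef {
  kind: NodeKind;
  namespace: string;   // e.g. "App\\Controller"; "" for the global namespace
  className?: string;  // set for methods only
  name: string;        // function/method name, or file path for an entrypoint
}

// Build the fully qualified name: \Ns\fn, \Ns\Class::method, or entry:<file>.
function buildFqn(ref: CallableRef): string {
  if (ref.kind === "entrypoint") {
    return "entry:" + ref.name;
  }
  const ns = ref.namespace.replace(/^\\+|\\+$/g, ""); // trim surrounding backslashes
  const prefix = ns === "" ? "\\" : "\\" + ns + "\\";
  return ref.kind === "method"
    ? prefix + (ref.className || "") + "::" + ref.name
    : prefix + ref.name;
}

// Deterministic id derived from kind + FQN (djb2-style hash, assumption:
// a production implementation would use a cryptographic digest instead).
function stableNodeId(kind: NodeKind, fqn: string): string {
  const key = kind + "|" + fqn;
  let hash = 5381;
  for (let i = 0; i < key.length; i++) {
    hash = ((hash << 5) + hash + key.charCodeAt(i)) >>> 0; // hash * 33 + code, mod 2^32
  }
  return "node-" + ("00000000" + hash.toString(16)).slice(-8);
}
```

Deriving the id from the FQN rather than from a random UUID keeps node ids identical across repeated scans of the same sources.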

2.2 Edge model

Edges capture relationships in the graph.

{
  "id": "edge-uuid-or-hash",
  "from": "node-id-1",
  "to": "node-id-2",
  "type": "call | include | autoload | entry_call",
  "confidence": "high | medium | low",
  "extras": {
    "callExpression": "Foo::bar($x)",
    "file": "src/Controller/HomeController.php",
    "line": 50
  }
}

Edge types (v1)

call: From a function/method to another function/method (resolved).
include: From a file-level node or entrypoint to nodes defined in the included file (optional for v1; can be “expanded” by treating all included definitions as reachable).
autoload: From a usage site to the class definition when resolved via Composer autoload (optional to expose as a separate edge type; good for debugging).
entry_call: From an entrypoint node to the first callable(s) it invokes.

For v1, an engineer can implement only call + entry_call and treat include/autoload as internal mechanics that result in call edges.

2.3 Vulnerabilities model

Input from your vulnerability database (or later from VEX) mapped into the graph:

{
  "id": "CVE-2020-1234",
  "source": "internal-db-or-nvd-id",
  "componentPurl": "pkg:composer/vendor/lib-a@1.2.3",
  "symbolFqn": "\\Vendor\\LibA\\Foo::dangerousMethod",
  "symbolKind": "method | function",
  "severity": "critical | high | medium | low",
  "extras": {
    "description": "RCE in Foo::dangerousMethod",
    "range": ">=1.0.0,<1.2.5"
  }
}

At graph build time, you pre-resolve symbolFqn to node.id where possible and record it in extras.

3. Reachability Results Structure

Once you have the graph and the vulnerability list, you run reachability and produce:

{
  "vulnerabilityId": "CVE-2020-1234",
  "componentPurl": "pkg:composer/vendor/lib-a@1.2.3",
  "symbolFqn": "\\Vendor\\LibA\\Foo::dangerousMethod",
  "targetNodeId": "node-123",
  "status": "reachable | maybe_reachable | unreachable | not_analyzed",
  "reason": "short explanation string",
  "paths": [
    ["entry-node-1", "node-10", "node-20", "node-123"]
  ],
  "analysisMeta": {
    "algorithmVersion": "1.0.0",
    "maxDepth": 100,
    "timestamp": "2025-11-20T19:30:00Z"
  }
}

Status semantics:

reachable: At least one concrete call path from an entrypoint node to targetNodeId uses only confidence = high edges.
maybe_reachable: A path exists, but every path contains at least one edge with confidence = medium or low (dynamic call, unresolved class alias, etc.).
unreachable: No path exists from any entrypoint to the target node in the constructed graph.
not_analyzed: We failed to build a node for the symbol or the analysis failed (parse errors, missing source, etc.).

4. Analysis Pipeline Spec (Step-by-Step)

This is the part a mid-level engineer can follow as tasks.

4.1 Inputs

Directory with PHP code (/app).
composer.json, composer.lock.
List of vulnerabilities (as above).
Optional SBOM mapping PURLs to file paths (if you have it; otherwise use Composer metadata only).

4.2 Step 1 – Parse Composer Metadata & Build Components

Read composer.lock.
For each package in "packages":
- Build purl like: pkg:composer/<name>@<version>
- Create components[] entry (with generated componentId).
For the root project, create one component (e.g., app) with purl = null or a synthetic one (pkg:composer/mycompany/myapp@dev).

Output:

components[] array.
componentIndex: map from package name to componentId.
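
As a rough sketch of Step 1, the following TypeScript builds `components[]` and `componentIndex` from an already parsed composer.lock. It assumes only the `packages` array mentioned above is read; the `comp-N` id scheme, the `app` root name, and the function name are illustrative choices, not requirements.

```typescript
interface LockPackage { name: string; version: string; }
interface ComposerLock { packages: LockPackage[]; }

interface Component {
  id: string;
  purl: string | null;
  name: string;
  version: string | null;
}

interface ComponentSet {
  components: Component[];
  componentIndex: Record<string, string>; // package name -> componentId
}

// rootPurl may be null or a synthetic PURL, as described for the root project.
function buildComponents(lock: ComposerLock, rootPurl: string | null): ComponentSet {
  const components: Component[] = [
    { id: "comp-root", purl: rootPurl, name: "app", version: null },
  ];
  const componentIndex: Record<string, string> = {};

  // Sort by name so component ids do not depend on lockfile ordering.
  const sorted = lock.packages
    .slice()
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));

  sorted.forEach((pkg, i) => {
    const id = "comp-" + (i + 1);
    components.push({
      id,
      purl: "pkg:composer/" + pkg.name + "@" + pkg.version,
      name: pkg.name,
      version: pkg.version,
    });
    componentIndex[pkg.name] = id;
  });

  return { components, componentIndex };
}
```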

4.3 Step 2 – PHP AST & Symbol Table

Use a standard AST library (e.g., nikic/php-parser) – explicitly allowed and expected.

For each PHP file in:

application source dirs (e.g. src/, app/),
vendor dirs (if you choose to parse vendor code; v1 may do that only for needed components):

Perform:

Parse file → AST.
Extract:
- File namespace.
- use imports (class aliases).
- Function definitions: name, line.
- Class definitions: name, namespace, methods.
Build symbol table:

// conceptual structure:
class SymbolTable {
    // Fully qualified function name → node meta, e.g. "\Ns\functionName"
    public array $functionsByFqn = [];
    // Fully qualified method name → node meta, e.g. "\Ns\Class::method"
    public array $methodsByFqn = [];
}

Determine componentId for each file:
- If path under vendor/vendor-name/package-name/ → map to that Composer package → componentId.
- Else → root app component.
Create nodes:

For each function:
- Node kind = "function".
For each method:
- Node kind = "method".

Assign id, file, line, fqn, componentId, purl.

Output:

nodes[] with all functions/methods.
symbolTable (for resolving calls).
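
The file-to-component rule in this step (a path under vendor/vendor-name/package-name/ maps to that Composer package, everything else to the root app) might look like this in TypeScript. Returning `null` for vendor files that match no known package is an assumption of this sketch; the spec does not say how to treat them.

```typescript
// componentIndex maps Composer package names ("vendor-name/package-name")
// to component ids, as produced in Step 1.
function componentIdForFile(
  relPath: string,
  componentIndex: Record<string, string>,
  rootComponentId: string
): string | null {
  const parts = relPath.replace(/\\/g, "/").split("/");

  // Anything outside vendor/ belongs to the root application component.
  if (parts[0] !== "vendor") {
    return rootComponentId;
  }
  // vendor/<vendor-name>/<package-name>/<file...> needs at least four segments;
  // files such as vendor/autoload.php are not inside a package.
  if (parts.length < 4) {
    return null;
  }
  const packageName = parts[1] + "/" + parts[2];
  return Object.prototype.hasOwnProperty.call(componentIndex, packageName)
    ? componentIndex[packageName]
    : null; // assumption: unknown vendor packages are left unmapped
}
```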

4.4 Step 3 – Entrypoint Detection

v1 simple rules:

The following are considered entrypoint files:
- public/index.php
- index.php in project root
- Files under bin/ or cli/ with a #!/usr/bin/env php shebang
For each entrypoint file:
- Create an entrypoint node with:
  - file = that file
  - entryPointType = "http_route" (for public/index.php) or "cli" (for bin/*) or "unknown".
- Add to nodes[].
Later, when scanning each entrypoint file’s AST, you will create entry_call edges from the entrypoint node to the first layer of call targets inside that file.

Output:

Additional entrypoint nodes.
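
A minimal TypeScript sketch of these file-pattern rules, assuming the caller supplies the project-relative path and the first line of each file. Classifying files under cli/ as "cli" and a root index.php as "unknown" is one reading of the rules above, not something the spec fixes; the function name is hypothetical.

```typescript
type EntryPointType = "http_route" | "cli" | "unknown";

// Returns the entryPointType for an entrypoint file, or null for ordinary files.
function detectPhpEntrypoint(relPath: string, firstLine: string): EntryPointType | null {
  const path = relPath.replace(/\\/g, "/");

  if (path === "public/index.php") {
    return "http_route";
  }
  if (path === "index.php") {
    return "unknown"; // root index.php: an entrypoint, but its type is not known
  }
  const inCliDir = path.indexOf("bin/") === 0 || path.indexOf("cli/") === 0;
  if (inCliDir && firstLine.trim() === "#!/usr/bin/env php") {
    return "cli";
  }
  return null;
}
```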

4.5 Step 4 – Call Graph Construction

For each parsed file:

Traverse AST for call expressions:
- foo() → candidate function call.
- $obj->bar() → instance method call.
- Foo::bar() → static method call.
Resolve function calls:

Given:
- Called name (may be qualified, relative, or unqualified).
- Current file namespace.
Resolution rules:
- If fully qualified (starts with \): use directly as FQN.
- Else:
  - Check use imports for alias match.
  - If no alias, prepend current namespace.
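- Look up FQN in symbolTable.functionsByFqn (or methodsByFqn for method targets).
- For an unqualified function name that is not found in the current namespace, retry as the global function \name; PHP applies the same fallback at runtime.
- If found → resolved call with confidence = "high".
- If not found → mark confidence = "low" and either set the edge's to field to a synthetic node id such as unknown, or skip creating the edge in v1 (implementation choice; recommended: create the edge to a special unknown node).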
Resolve method calls $obj->bar() (v1 simplified):
- Assume dynamic instance type is not known statically → resolution is ambiguous.
- For v1, treat these as:
  - confidence = "medium" and:
    - If $obj variable has a clear new ClassName assignment in the same function, try to infer class and use same resolution rules as static calls.
    - Otherwise, create edges from calling node to all methods named bar in any class inside the same component.
- This is over-approximate but conservative.
Resolve static method calls Foo::bar():
- Resolve Foo to FQN using namespace + imports (same as functions).
- Build FQN \Ns\Foo::bar.
- Look up in symbolTable.methodsByFqn.
- Mark confidence = "high" when resolved.
Connect entrypoints:
- For each entrypoint file:
  - Identify top-level calls in that file (same rules as above).
  - Edges:
    - type = "entry_call"
    - from = entrypointNodeId
    - to = resolved callee node

Output:

edges[] with call and entry_call edges.
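
The name-resolution rules for function calls and static method calls can be sketched as below. This is a simplified illustration under stated assumptions: all names (`FileContext`, `qualify`, `UNKNOWN_NODE_ID`, ...) are hypothetical, lookups are case-sensitive even though PHP treats function and class names case-insensitively, and unqualified function calls fall back to the global namespace the way PHP does at runtime.

```typescript
type Confidence = "high" | "medium" | "low";

interface FileContext {
  namespace: string;                     // "App\\Controller", or "" for global
  classImports: Record<string, string>;  // alias -> FQN, from `use` statements
}

interface Resolution { fqn: string; nodeId: string; confidence: Confidence; }

const UNKNOWN_NODE_ID = "unknown"; // synthetic node for unresolved callees

function has(table: Record<string, string>, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(table, key);
}

function inNamespace(name: string, ctx: FileContext): string {
  return ctx.namespace === "" ? "\\" + name : "\\" + ctx.namespace + "\\" + name;
}

// Fully qualified -> as is; first segment aliased by `use` -> substitute;
// otherwise prepend the current namespace.
function qualify(name: string, ctx: FileContext): string {
  if (name.charAt(0) === "\\") return name;
  const segments = name.split("\\");
  if (has(ctx.classImports, segments[0])) {
    segments[0] = ctx.classImports[segments[0]];
    return segments.join("\\");
  }
  return inNamespace(name, ctx);
}

function resolveFunctionCall(
  name: string, ctx: FileContext, functionsByFqn: Record<string, string>
): Resolution {
  const unqualified = name.indexOf("\\") === -1;
  // Unqualified calls: current namespace first, then the global function.
  const candidates = unqualified ? [inNamespace(name, ctx), "\\" + name] : [qualify(name, ctx)];
  for (const fqn of candidates) {
    if (has(functionsByFqn, fqn)) {
      return { fqn, nodeId: functionsByFqn[fqn], confidence: "high" };
    }
  }
  return { fqn: candidates[0], nodeId: UNKNOWN_NODE_ID, confidence: "low" };
}

function resolveStaticCall(
  className: string, method: string, ctx: FileContext, methodsByFqn: Record<string, string>
): Resolution {
  const fqn = qualify(className, ctx) + "::" + method;
  return has(methodsByFqn, fqn)
    ? { fqn, nodeId: methodsByFqn[fqn], confidence: "high" }
    : { fqn, nodeId: UNKNOWN_NODE_ID, confidence: "low" };
}
```

Routing unresolved callees to a single synthetic node keeps the gap visible in the graph instead of silently dropping the edge.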

4.6 Step 5 – Map Vulnerabilities to Nodes

For each vulnerability:

If symbolFqn is not null:
- If symbolKind == "method" → look into symbolTable.methodsByFqn.
- If symbolKind == "function" → symbolTable.functionsByFqn.
If found → record targetNodeId in a lookup: vulnId → nodeId.
If not found → status will later become not_analyzed.

4.7 Step 6 – Reachability Algorithm

Core logic: a multi-source BFS (or DFS) over the call graph, started from all entrypoints at once.

Pre-compute entry roots:

entryNodes = ids of all nodes with kind = "entrypoint".

Algorithm (BFS from all entrypoints):

Sketch in PHP (Graph and ReachabilityContext are assumed helper classes). A node is re-queued when a path with strictly higher confidence reaches it, so the recorded confidence is the best one over all paths, as the status semantics require:

function computeReachability(Graph $graph, array $entryNodes): ReachabilityContext {
    $queue = new SplQueue();
    $visited = [];                  // nodeId => true
    $predecessor = [];              // nodeId => parent nodeId on the best path found so far
    $edgeConfidenceOnPath = [];     // nodeId => best path confidence: "high" | "medium" | "low"

    foreach ($entryNodes as $entryId) {
        $queue->enqueue($entryId);
        $visited[$entryId] = true;
        $edgeConfidenceOnPath[$entryId] = "high";
    }

    while (!$queue->isEmpty()) {
        $current = $queue->dequeue();

        foreach ($graph->outEdges($current) as $edge) {
            if ($edge->type !== 'call' && $edge->type !== 'entry_call') {
                continue;
            }

            $next = $edge->to;

            // confidence of the path through $current (weakest edge on the path wins)
            $candidate = minConfidence($edgeConfidenceOnPath[$current], $edge->confidence);

            // skip unless this is the first visit or a strictly better path
            if (isset($visited[$next])
                && confidenceRank($candidate) <= confidenceRank($edgeConfidenceOnPath[$next])) {
                continue;
            }

            $visited[$next] = true;
            $predecessor[$next] = $current;
            $edgeConfidenceOnPath[$next] = $candidate;

            $queue->enqueue($next);
        }
    }

    return new ReachabilityContext($visited, $predecessor, $edgeConfidenceOnPath);
}

function confidenceRank(string $c): int {
    return ["high" => 3, "medium" => 2, "low" => 1][$c];
}

function minConfidence(string $a, string $b): string {
    return (confidenceRank($a) <= confidenceRank($b)) ? $a : $b;
}

Classify each vulnerability:

For each vulnerability:

If targetNodeId is missing → status = "not_analyzed".
Else if targetNodeId is not in visited → status = "unreachable".
Else:
- Let conf = edgeConfidenceOnPath[targetNodeId].
- If conf == "high" → status = "reachable".
- If conf == "medium" or "low" → status = "maybe_reachable".

Path reconstruction:

To generate one example path:

function reconstructPath(array $predecessor, string $targetId): array {
    $path = [];
    $current = $targetId;
    while (isset($predecessor[$current])) {
        array_unshift($path, $current);
        $current = $predecessor[$current];
    }
    array_unshift($path, $current); // entrypoint at start
    return $path;
}

Store that path array in reachabilityResults[].paths[].
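
To see the status semantics from section 3 on a concrete graph, here is a small TypeScript sketch over in-memory edges shaped like the edge model. It re-queues a node whenever a strictly higher-confidence path reaches it, so a short low-confidence path cannot mask a longer all-high path. All type and function names are illustrative assumptions.

```typescript
type Confidence = "high" | "medium" | "low";
type Status = "reachable" | "maybe_reachable" | "unreachable" | "not_analyzed";

interface Edge { from: string; to: string; type: string; confidence: Confidence; }
interface Reach { best: Record<string, Confidence>; pred: Record<string, string>; }

const RANK: Record<Confidence, number> = { high: 3, medium: 2, low: 1 };

function weaker(a: Confidence, b: Confidence): Confidence {
  return RANK[a] <= RANK[b] ? a : b;
}

function computeReachability(edges: Edge[], entryIds: string[]): Reach {
  const out: Record<string, Edge[]> = Object.create(null);
  for (const e of edges) {
    if (e.type !== "call" && e.type !== "entry_call") continue;
    (out[e.from] = out[e.from] || []).push(e);
  }
  const best: Record<string, Confidence> = Object.create(null); // best path confidence per node
  const pred: Record<string, string> = Object.create(null);     // parent on that best path
  const queue: string[] = [];
  for (const id of entryIds) { best[id] = "high"; queue.push(id); }

  for (let head = 0; head < queue.length; head++) {
    const current = queue[head];
    for (const e of out[current] || []) {
      const candidate = weaker(best[current], e.confidence);
      const known = best[e.to];
      // Skip unless this is the first visit or a strictly better path.
      if (known !== undefined && RANK[candidate] <= RANK[known]) continue;
      best[e.to] = candidate;
      pred[e.to] = current;
      queue.push(e.to);
    }
  }
  return { best, pred };
}

function classify(targetId: string | null, r: Reach): Status {
  if (targetId === null) return "not_analyzed";
  const conf = r.best[targetId];
  if (conf === undefined) return "unreachable";
  return conf === "high" ? "reachable" : "maybe_reachable";
}

function examplePath(targetId: string, r: Reach): string[] {
  const path = [targetId];
  let current = targetId;
  while (r.pred[current] !== undefined) {
    current = r.pred[current];
    path.unshift(current);
  }
  return path; // entrypoint first, target last
}
```

Each node can be re-queued at most twice (low to medium to high), so the traversal stays linear in the number of edges up to a small constant factor.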

5. Handling PHP “messy bits” (v1 rules)

This is where we mark things as maybe instead of pretending we know.

Dynamic function names $fn():
- Create no edges by default in v1.
- Optionally, if $fn is a constant string literal obvious in the same function, treat as a normal call.
- Otherwise: leave it out and accept that some cases will be missed → vulnerability may be marked unreachable but flagged with analysisMeta.dynamicCallsIgnored = true.
Dynamic methods $obj->$method():
- Same principle as above.
Reflection / call_user_func / call_user_func_array:
- v1: do not try to resolve.
- Optional: track the call sites; mark their outgoing edges as confidence = "low" and connect to all functions/methods of that name when the name is a string literal.
Includes (include, require, require_once, include_once):
- v1 simplest rule:
  - Treat the included file as fully reachable from the including file.
  - Pseudo-implementation: when building symbol table, everything defined in the included file is considered potentially called by the including file’s entrypoint logic.
- Implementation shortcut:
  - For the first version, you can even skip modeling edges, and instead mark all nodes in included files as “reachable from the entrypoint” if included directly by an entrypoint file. Later refine.

6. What the engineer actually builds (modules & tasks)

You can frame it to them like this:

Module PhpProjectLoader
- Reads project root, finds composer.json, composer.lock.
- Produces components[] and mapping from file-path → componentId.
Module PhpAstIndexer
- Uses nikic/php-parser.
- For each .php file:
  - Produces entries in symbolTable.
  - Produces base nodes[] (functions, methods).
- Creates entrypoint nodes based on known file patterns.
Module PhpCallGraphBuilder
- Walks AST again:
  - For each callable body, finds call expressions.
  - Resolves calls via symbolTable.
  - Produces edges[].
Module PhpReachabilityEngine
- Runs BFS from entrypoints.
- Classifies per-vulnerability reachability.
Module GraphSerializer
- Assembles everything into the JSON schema described in sections 2–3.

Each module is testable with small sample projects.

7. Minimal working example (very small)

Project:

<?php
// public/index.php
require __DIR__ . '/../vendor/autoload.php'; // Composer autoloader; loads Vendor\LibA\Foo on demand
require __DIR__ . '/../src/Controller/HomeController.php';

use App\Controller\HomeController;

$controller = new HomeController();
$controller->index();

<?php
// src/Controller/HomeController.php
namespace App\Controller;

use Vendor\LibA\Foo;

class HomeController {
    public function index() {
        $foo = new Foo();
        $foo->dangerousMethod();
    }
}

Vulnerability:

{
  "id": "CVE-2020-1234",
  "componentPurl": "pkg:composer/vendor/lib-a@1.2.3",
  "symbolFqn": "\\Vendor\\LibA\\Foo::dangerousMethod",
  "symbolKind": "method"
}

Expected reachability path (conceptually):

[
  "entry:public/index.php",
  "\\App\\Controller\\HomeController::index",
  "\\Vendor\\LibA\\Foo::dangerousMethod"
]

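Status: reachable with confidence = high, provided the two instance calls are resolved by inferring the receiver class from the new expression in the same scope and those inferred edges are given confidence = high. If every instance call is kept at confidence = medium, as the rule in section 4.5 reads literally, the result is maybe_reachable instead.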

If you’d like, as a next step I can:

Turn this into a formal JSON Schema file (reachability-php-graph.schema.json) and
Propose a directory layout + interfaces in C#/.NET 10 for StellaOps.Scanner.Php so you can drop it straight into the repo.

Here is a JavaScript/TypeScript reachability spec that a mid-level engineer can actually implement, but which is still “best in class” in terms of clarity, determinism, and extensibility.

I’ll mirror the PHP structure you already have so Scanner/WebService and Sbomer can treat them uniformly.

1. Scope of JS Reachability v1

Goal

Given:

A JS/TS project (Node-centric),
package.json + lockfile (package-lock.json / yarn.lock / pnpm-lock.yaml),
A list of vulnerable symbols (tied to npm PURLs),

produce:

A function-level call graph (nodes + edges).
Mapping of nodes to components (pkg:npm/... PURLs).
A reachability verdict for each vulnerable symbol:
- reachable, maybe_reachable, unreachable, not_analyzed
- With at least one example call path when reachable/maybe_reachable.

Deliberate v1 constraints

To keep it very implementable:

Target runtime: Node.js (server-side).
Source: TypeScript + JavaScript in one unified analysis.
- Use TypeScript compiler with allowJs: true so JS and TS share the same Program.
Modules:
- ES Modules (import/export).
- CommonJS (require, module.exports, exports).
Supported calls:
- Direct calls: foo().
- Method calls: obj.method(), Class.method().
Bundlers (Webpack, Vite, etc.): out of scope v1 (treat source before bundling).
Dynamic features (handled conservatively, see below):
- eval, Function constructor, dynamic imports, obj[methodName](), etc.

2. Reachability Graph Document (JSON)

Same high-level shape as PHP, but annotated for JS/TS.

{
  "schemaVersion": "1.0.0",
  "language": "javascript",
  "project": {
    "projectId": "my-node-app",
    "rootDir": "/app",
    "hash": "sha256:..."
  },
  "components": [],
  "nodes": [],
  "edges": [],
  "vulnerabilities": [],
  "reachabilityResults": []
}

2.1 Components

Each npm package (including the root app) is a component.

{
  "id": "comp-1",
  "purl": "pkg:npm/express@4.19.2",
  "name": "express",
  "version": "4.19.2",
  "isRoot": false,
  "extras": {
    "resolvedPath": "node_modules/express"
  }
}

For the root project, you can use:

{
  "id": "comp-root",
  "purl": "pkg:npm/my-company-my-app@1.0.0",
  "name": "my-company-my-app",
  "version": "1.0.0",
  "isRoot": true
}

A mid-level engineer can easily build this from package.json + the chosen lockfile.
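
A possible sketch of that step, assuming an npm package-lock.json with lockfileVersion 2 or 3, where the `packages` map is keyed by install path and the empty key is the root project. Scoped names carry the scope as the PURL namespace with `@` percent-encoded. The `comp-N` ids and the choice to skip entries without a version or outside node_modules are illustrative simplifications.

```typescript
interface LockEntry { name?: string; version?: string; }
interface PackageLock { packages: Record<string, LockEntry>; }

interface JsComponent {
  id: string;
  purl: string;
  name: string;
  version: string;
  isRoot: boolean;
  extras?: { resolvedPath: string };
}

function npmPurl(name: string, version: string): string {
  // "@scope/pkg" becomes "%40scope/pkg": the scope is the PURL namespace.
  const encoded = name.charAt(0) === "@" ? "%40" + name.slice(1) : name;
  return "pkg:npm/" + encoded + "@" + version;
}

function componentsFromLock(lock: PackageLock): JsComponent[] {
  const marker = "node_modules/";
  const components: JsComponent[] = [];
  let counter = 0;

  // Sorted keys give stable component ids regardless of lockfile ordering.
  for (const path of Object.keys(lock.packages).sort()) {
    const entry = lock.packages[path];
    if (!entry.version) continue; // e.g. workspace links carry no version

    if (path === "") {
      const rootName = entry.name || "root";
      components.push({
        id: "comp-root", purl: npmPurl(rootName, entry.version),
        name: rootName, version: entry.version, isRoot: true,
      });
      continue;
    }
    const idx = path.lastIndexOf(marker);
    if (idx === -1) continue; // not under node_modules: skipped in this sketch
    const name = path.slice(idx + marker.length); // handles nested node_modules too
    counter += 1;
    components.push({
      id: "comp-" + counter, purl: npmPurl(name, entry.version),
      name, version: entry.version, isRoot: false,
      extras: { resolvedPath: path },
    });
  }
  return components;
}
```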

2.2 Nodes (callables & entrypoints)

Every node is a callable or an entrypoint.

{
  "id": "node-uuid-or-hash",
  "kind": "function | method | arrow | class_constructor | entrypoint",
  "name": "handleRequest",
  "fqn": "src/controllers/userController.ts::handleRequest",
  "file": "src/controllers/userController.ts",
  "line": 42,
  "componentId": "comp-root",
  "purl": "pkg:npm/my-company-my-app@1.0.0",
  "exportName": "handleRequest",
  "exportKind": "named | default | none",
  "className": "UserController",
  "entryPointType": "http_route | cli | worker | unknown | null",
  "extras": {
    "isAsync": true,
    "isRouteHandler": true
  }
}

Rules for node creation

Function node
- kind = "function" for function foo() {} and export function foo() {}.
- fqn = <relative-file-path>::foo.
Arrow function node
- kind = "arrow" when it is used as a callback that matters (e.g. Express handler).
- Option: generate synthetic name file.ts::<line>:<column>.
Method node
- kind = "method" for class methods.
- fqn = <file>::ClassName.methodName.
Class constructor node
- kind = "class_constructor" for constructor() if you want constructor-level analysis.
Entrypoint node
- kind = "entrypoint".
- entryPointType according to detection rules (see §4).
- fqn = <file>::<entrypoint-label>, e.g. src/server.ts::node-entry.

You don’t need to over-engineer FQNs; they just need to be stable and unique.
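
One way to keep FQNs stable and unique is to derive them purely from file path, name, and position, as in this sketch. The `ClassName.constructor` form for constructors and the helper name are assumptions; the other formats follow the rules above.

```typescript
type JsNodeKind = "function" | "method" | "arrow" | "class_constructor" | "entrypoint";

interface JsCallable {
  kind: JsNodeKind;
  file: string;        // project-relative path
  name?: string;       // function/method name or entrypoint label
  className?: string;  // for methods and constructors
  line?: number;       // used for anonymous arrow functions
  column?: number;
}

function jsFqn(c: JsCallable): string {
  switch (c.kind) {
    case "method":
      return c.file + "::" + c.className + "." + c.name;
    case "class_constructor":
      return c.file + "::" + c.className + ".constructor"; // assumed format
    case "arrow":
      // Named when assigned to a variable, otherwise a synthetic <line>:<column> name.
      return c.name ? c.file + "::" + c.name : c.file + "::" + c.line + ":" + c.column;
    default:
      return c.file + "::" + c.name; // functions and entrypoints
  }
}
```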

2.3 Edges

Edges model function/method/module relationships.

{
  "id": "edge-uuid-or-hash",
  "from": "node-id-1",
  "to": "node-id-2",
  "type": "call | entry_call | import | export",
  "confidence": "high | medium | low",
  "extras": {
    "callExpression": "userController.handleRequest(req, res)",
    "file": "src/routes/userRoutes.ts",
    "line": 30
  }
}

For reachability v1, only call and entry_call are required. import/export edges are useful for debugging but not strictly necessary for BFS reachability.

2.4 Vulnerabilities

Library-level vulnerabilities are described in terms of npm PURL and symbol.

{
  "id": "CVE-2020-1234",
  "source": "internal-db-or-nvd-id",
  "componentPurl": "pkg:npm/some-lib@1.2.3",
  "packageName": "some-lib",
  "symbolExportName": "dangerousFunction",
  "symbolKind": "function | method",
  "severity": "critical",
  "extras": {
    "description": "Prototype pollution in dangerousFunction",
    "range": ">=1.0.0 <1.2.5"
  }
}

At graph-build time, you pre-resolve symbolExportName → node.id where possible.

2.5 Reachability Results

Same shape as for PHP, except that symbolExportName takes the place of symbolFqn.

{
  "vulnerabilityId": "CVE-2020-1234",
  "componentPurl": "pkg:npm/some-lib@1.2.3",
  "symbolExportName": "dangerousFunction",
  "targetNodeId": "node-123",
  "status": "reachable | maybe_reachable | unreachable | not_analyzed",
  "reason": "short explanation",
  "paths": [
    ["entry-node-1", "node-20", "node-50", "node-123"]
  ],
  "analysisMeta": {
    "algorithmVersion": "1.0.0",
    "maxDepth": 200,
    "timestamp": "2025-11-20T19:30:00Z"
  }
}

3. Module & Symbol Resolution (JS/TS specifics)

Backend: TypeScript compiler API with allowJs: true.

3.1 Build TS Program

Generate a tsconfig.reachability.json with:
- allowJs: true
- checkJs: true
- moduleResolution: "node" or "bundler" depending on project.
- rootDir set to project root.
Use TS API to create Program.
Use TypeChecker to resolve symbols where possible.

This gives you:

File list (including JS/TS).
Symbols for exports/imports.
Class and function definitions.

3.2 Export indexing per module

For each source file:

Enumerate:
- export function foo() {}
- export const bar = () => {}
- export default function () {} / export default class {}.
- export { foo } statements.
- module.exports = ... / exports.foo = ... (handle as CommonJS exports).

Build an index:

interface ExportedSymbol {
  moduleFile: string;        // relative path
  exportName: string;        // "foo", "default"
  nodeId: string;            // ID in nodes[]
}

3.3 Import resolution

For each ImportDeclaration:

import { foo as localFoo } from 'some-lib'
- Map localFoo → (module='some-lib', exportName='foo').
import foo from 'some-lib'
- Map foo → (module='some-lib', exportName='default').
import * as lib from 'some-lib'
- Map namespace lib → (module='some-lib', exportName='*').

For CommonJS:

const x = require('some-lib')
- Map x → (module='some-lib', exportName='*').
const { dangerousFunction } = require('some-lib')
- Map dangerousFunction → (module='some-lib', exportName='dangerousFunction').

Later, when you see calls, you use this mapping.
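The mapping above can be held in a small per-file alias table. The following is a minimal sketch, assuming the bindings have already been extracted from the AST; the names ImportBinding, AliasTable and resolveCallee are hypothetical and not part of the spec.

```typescript
// Hypothetical shapes for illustration only.
interface ImportBinding {
  module: string;      // e.g. "some-lib"
  exportName: string;  // "foo", "default", or "*" for a whole-module binding
}

type AliasTable = Map<string, ImportBinding>;

// Resolve a callee written as "name" or "name.member" to (module, exportName).
// Deeper property chains are deliberately left unresolved in this sketch.
function resolveCallee(table: AliasTable, callee: string): ImportBinding | undefined {
  const parts = callee.split(".");
  if (parts.length > 2) return undefined;

  const binding = table.get(parts[0] ?? "");
  if (binding === undefined) return undefined;

  const member = parts[1];
  // Bare call: the local name itself is the imported binding.
  if (member === undefined) return binding;

  // Property call: only meaningful on a namespace / require binding.
  if (binding.exportName !== "*") return undefined;
  return { module: binding.module, exportName: member };
}

// Alias table for the import forms listed above.
const aliases: AliasTable = new Map<string, ImportBinding>([
  ["localFoo", { module: "some-lib", exportName: "foo" }], // import { foo as localFoo }
  ["foo", { module: "some-lib", exportName: "default" }],  // import foo from 'some-lib'
  ["lib", { module: "some-lib", exportName: "*" }],        // import * as lib / require()
]);
```

With this table, a call to lib.dangerousFunction() resolves to the export dangerousFunction of some-lib, while a call through an unknown local name stays unresolved.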

4. Entrypoint Detection (Node-centric)

v1 rules that are easy to implement:

CLI entrypoints
- Files listed in bin section of package.json.
- Files with #!/usr/bin/env node shebang.
- Node:
  - kind = "entrypoint",
  - entryPointType = "cli".
Server entrypoints
- Heuristic: look for src/server.ts, src/index.ts, index.js at project root.
- Mark them as entrypoint with entryPointType = "http_route".
Framework routes (Express v1)
- Pattern: const app = express(); app.get('/path', handler):
  - handler can be:
    - Identifier (function name),
    - Arrow function,
    - Function expression.
For each such route:
- Create an entrypoint node per route or mark handler callable as reachable from server entrypoint:
  - Easiest v1: create entry_call edge:
    - From server entrypoint node (e.g., file src/server.ts) to handler node.
    - Mark handler node extras.isRouteHandler = true.

You do not have to model individual HTTP methods or paths semantically in v1; just treat each handler as a reachable entrypoint into business logic.
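The CLI and server rules above can be sketched as a pure function over package.json and file contents. This is an illustration under assumptions: the PackageJsonLike and EntrypointNode shapes, the `entry:<file>` ID format, and the in-memory `files` map are hypothetical, and Express route detection is left out because it needs the AST.

```typescript
// Hypothetical shapes; only the fields needed for the v1 rules.
interface PackageJsonLike {
  bin?: string | Record<string, string>;
}

interface EntrypointNode {
  id: string;
  kind: "entrypoint";
  entryPointType: "cli" | "http_route";
  file: string;
}

// Heuristic server files from the rules above.
const SERVER_CANDIDATES = ["src/server.ts", "src/index.ts", "index.js"];

// `files` maps relative path -> file content.
function detectEntrypoints(pkg: PackageJsonLike, files: Record<string, string>): EntrypointNode[] {
  const found: Record<string, "cli" | "http_route"> = {};

  // Rule 1: files listed in the bin section of package.json.
  const bin = pkg.bin;
  const binPaths: string[] =
    typeof bin === "string" ? [bin] : bin ? Object.keys(bin).map((name) => bin[name]) : [];
  for (const p of binPaths) {
    const file = p.replace(/^\.\//, "");
    if (files[file] !== undefined) found[file] = "cli";
  }

  // Rule 2: files whose first line is a node shebang.
  for (const file of Object.keys(files)) {
    const firstLine = files[file].split("\n", 1)[0] ?? "";
    if (/^#!.*\bnode\b/.test(firstLine)) found[file] = "cli";
  }

  // Rule 3: conventional server files, unless already classified as CLI.
  for (const file of SERVER_CANDIDATES) {
    if (files[file] !== undefined && found[file] === undefined) found[file] = "http_route";
  }

  // Sort by path so the output is deterministic across runs.
  return Object.keys(found)
    .sort()
    .map((file) => ({
      id: `entry:${file}`,
      kind: "entrypoint" as const,
      entryPointType: found[file],
      file,
    }));
}
```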

5. Call Graph Construction

This is the heart of the algorithm.

5.1 Node creation (summary)

While visiting AST:

For each:
- FunctionDeclaration
- MethodDeclaration
- ArrowFunction that is either:
  - exported, or
  - assigned to a variable that is used as a callback/handler)
Create a node.

Tie each node to:

file (relative path),
line (start line),
componentId (from mapping file path → package),
optional exportName (if exported from module).

5.2 Call extraction rules

For each function/method body (i.e., node):

5.2.1 Direct calls: `foo()`

If callee is an identifier foo:
1. Check if foo is a local function in the same file.
2. If not, check import alias table:
  - If foo maps to (module='pkg', exportName='bar'), then:
    - Resolve to exported symbol for pkg + bar if you have its sources.
    - If library source not indexed, create a synthetic node for that library export (optional).
3. If resolved, add edge:
  - type = "call",
  - confidence = "high".

5.2.2 Property calls: `obj.method()`

If callee is obj.method(...):
1. If obj is an imported namespace:
  - e.g. import * as lib from 'some-lib'; lib.dangerousFunction().
  - Then treat:
    - module='some-lib', exportName='dangerousFunction'.
    - Edge confidence = "high".
2. If obj is created via new ClassName() where ClassName is known:
  - Use TypeScript type checker or simple pattern:
    - Look for const obj = new ClassName(...) in same function.
    - Map to method ClassName.method.
  - Edge confidence = "high".
3. Else:
  - As a v1 heuristic, you do not spread to everything; instead:
    - Either:
      - Skip edge and lose some coverage, or
      - Add confidence = "medium" edge from current node to all methods called method in the same component.
    - Recommended: medium-confidence to all same-name methods in same component (conservative, but safe).
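The known-class case and the recommended fallback can be sketched as follows. This assumes the receiver's class has already been inferred (or not) by an earlier step; MethodNode, CallEdge and resolvePropertyCall are hypothetical names, and inherited methods are not handled.

```typescript
type Confidence = "high" | "medium" | "low";

// Hypothetical minimal shapes for this sketch.
interface MethodNode {
  id: string;
  componentId: string;
  className: string;
  methodName: string;
}

interface CallEdge {
  from: string;
  to: string;
  type: "call";
  confidence: Confidence;
}

function resolvePropertyCall(
  callerId: string,
  callerComponentId: string,
  methodName: string,
  receiverClass: string | undefined, // undefined when the receiver type is unknown
  methods: MethodNode[]
): CallEdge[] {
  const edge = (to: string, confidence: Confidence): CallEdge => ({
    from: callerId,
    to,
    type: "call",
    confidence,
  });

  if (receiverClass !== undefined) {
    // Case 2: receiver class is known, so link to ClassName.method only.
    return methods
      .filter((m) => m.className === receiverClass && m.methodName === methodName)
      .map((m) => edge(m.id, "high"));
  }

  // Case 3: unknown receiver, so fan out to same-name methods in the caller's component.
  return methods
    .filter((m) => m.componentId === callerComponentId && m.methodName === methodName)
    .map((m) => edge(m.id, "medium"));
}
```

The fallback stays inside the caller's component on purpose: fanning out across every component would connect unrelated packages that merely share a method name.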

5.2.3 CommonJS require patterns

const x = require('some-lib'); x.dangerousFunction():
- Track variable → module mapping from require.
- When you see x.something():
  - module='some-lib', exportName='something'.
  - confidence = "medium" (less structured than ES import).

5.2.4 Dynamic imports & very dynamic calls

await import('some-lib'), obj[methodName](), eval, Function, etc.:

v1 policy (simple and honest):
- Do not create specific edges unless:
  - The target module name is a string literal and the method name is a string literal in same expression.
- Otherwise:
  - Optionally create a single edge from current node to a special node-unknown with confidence = "low".
  - This preserves a record that “something dynamic happens here” without lying.

6. Mapping Nodes to Components (PURLs)

Using the filesystem:

If file path begins with node_modules/<pkgName>/...:
- Map that file to component with name = pkgName and the version from lockfile.
All other files belong to the root component (the app) or to a local “workspace” package if you support monorepos later.

Each node inherits componentId from its file. Each component has a purl:

pkg:npm/<name>@<version>.

This is how you connect reachability to SBOM/VEX later.
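A minimal sketch of the path-to-package and PURL steps follows. Two details go beyond the text above and are assumptions of this sketch: scoped packages (@scope/name) are handled, and for nested node_modules the innermost package owns the file. The function names are hypothetical; looking up the version in the lockfile is not shown.

```typescript
// Returns the npm package name that owns a file, or undefined for app files.
function packageNameFromPath(filePath: string): string | undefined {
  const segments = filePath.split("/");
  // Use the innermost node_modules so nested dependencies map to themselves.
  const idx = segments.lastIndexOf("node_modules");
  if (idx === -1) return undefined;

  const first = segments[idx + 1];
  if (!first) return undefined;

  if (first.charAt(0) === "@") {
    // Scoped package: the name spans two path segments.
    const second = segments[idx + 2];
    return second ? `${first}/${second}` : undefined;
  }
  return first;
}

// Builds pkg:npm/<name>@<version>. For scoped names the purl convention is to
// percent-encode the leading '@' of the scope as %40.
function toNpmPurl(name: string, version: string): string {
  if (name.charAt(0) === "@") {
    return `pkg:npm/%40${name.slice(1)}@${version}`;
  }
  return `pkg:npm/${name}@${version}`;
}
```

For example, node_modules/some-lib/index.js maps to some-lib, and with version 1.2.3 from the lockfile the component PURL is pkg:npm/some-lib@1.2.3.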

7. Vulnerability → Node mapping

Given a vulnerability:

{
  "componentPurl": "pkg:npm/some-lib@1.2.3",
  "packageName": "some-lib",
  "symbolExportName": "dangerousFunction"
}

Steps:

Find componentId by matching componentPurl or packageName.
In that component, find node(s) where:
- exportName == "dangerousFunction", or
- For CommonJS, any top-level function marked as part of the module’s exports under that name.
If found:
- targetNodeId = node.id.
If not:
- Leave targetNodeId unset; the vulnerability is then classified as not_analyzed in §8.
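These steps can be sketched as a single lookup. The shapes below are hypothetical and reduced to the fields the steps mention; an empty result corresponds to the "not found" branch.

```typescript
// Hypothetical minimal shapes for this sketch.
interface Component {
  id: string;
  purl: string;
  name: string;
}

interface GraphNode {
  id: string;
  componentId: string;
  exportName?: string;
}

interface VulnerabilityInput {
  componentPurl: string;
  packageName: string;
  symbolExportName: string;
}

// Returns the IDs of all nodes that export the vulnerable symbol.
function mapVulnerability(
  vuln: VulnerabilityInput,
  components: Component[],
  nodes: GraphNode[]
): string[] {
  // Prefer an exact PURL match; fall back to the package name.
  const component =
    components.find((c) => c.purl === vuln.componentPurl) ??
    components.find((c) => c.name === vuln.packageName);
  if (component === undefined) return [];

  return nodes
    .filter((n) => n.componentId === component.id && n.exportName === vuln.symbolExportName)
    .map((n) => n.id);
}
```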

8. Reachability Algorithm (BFS)

Exactly like PHP v1, but now over JS nodes.

Pre-compute:

entryNodes = all nodes where kind = "entrypoint".

Compute reachable set:

function computeReachability(graph: Graph, entryNodes: string[]): ReachabilityContext {
  const queue: string[] = [];
  const visited: Record<string, boolean> = {};
  const predecessor: Record<string, string | undefined> = {};
  const edgeConfidenceOnPath: Record<string, "high" | "medium" | "low"> = {};

  for (const entry of entryNodes) {
    queue.push(entry);
    visited[entry] = true;
    edgeConfidenceOnPath[entry] = "high";
  }

  while (queue.length > 0) {
    const current = queue.shift()!;

    for (const edge of graph.outEdges(current)) {
      if (edge.type !== "call" && edge.type !== "entry_call") continue;

      const next = edge.to;
      if (visited[next]) continue;

      visited[next] = true;
      predecessor[next] = current;

      const prevConf = edgeConfidenceOnPath[current] ?? "high";
      const edgeConf = edge.confidence;
      edgeConfidenceOnPath[next] = minConfidence(prevConf, edgeConf);

      queue.push(next);
    }
  }

  return { visited, predecessor, edgeConfidenceOnPath };
}

function minConfidence(a: "high" | "medium" | "low",
                       b: "high" | "medium" | "low"): "high" | "medium" | "low" {
  const order: Record<string, number> = { high: 3, medium: 2, low: 1 };
  return order[a] <= order[b] ? a : b;
}

Classify per vulnerability:

For each vulnerability with targetNodeId:

If missing → status = "not_analyzed".
If targetNodeId not in visited → status = "unreachable".
Otherwise:
- conf = edgeConfidenceOnPath[targetNodeId].
- If conf == "high" → status = "reachable".
- Else (medium or low) → status = "maybe_reachable".
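The classification rules can be sketched as a small function over the BFS output. The function name classify is hypothetical. One caveat worth keeping in mind: because the BFS above visits each node once, edgeConfidenceOnPath holds the confidence of the first path discovered, so a target may be reported as maybe_reachable even when a longer all-high path exists.

```typescript
type Confidence = "high" | "medium" | "low";
type Status = "reachable" | "maybe_reachable" | "unreachable" | "not_analyzed";

function classify(
  targetNodeId: string | undefined,
  visited: Record<string, boolean>,
  edgeConfidenceOnPath: Record<string, Confidence>
): Status {
  // No node could be mapped for the vulnerable symbol.
  if (targetNodeId === undefined) return "not_analyzed";

  // The target exists in the graph but no entrypoint reaches it.
  if (!visited[targetNodeId]) return "unreachable";

  // Reached: only an all-high path counts as plainly reachable.
  return edgeConfidenceOnPath[targetNodeId] === "high" ? "reachable" : "maybe_reachable";
}
```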

Path reconstruction:

Same as PHP:

function reconstructPath(predecessor: Record<string, string | undefined>,
                         targetId: string): string[] {
  const path: string[] = [];
  let current: string | undefined = targetId;

  while (current !== undefined) {
    path.unshift(current);
    current = predecessor[current];
  }

  return path;
}

Store at least one path in paths[].

9. Handling JS “messy bits” (v1 rules)

You want to be honest, not magical. So:

eval, new Function, dynamic import with non-literal arguments
- Do not pretend you know where control goes.
- Either:
  - Ignore for graph (recommended v1), or
  - Edge to node-unknown with confidence="low".
- Mark in analysisMeta that dynamic features were detected.
obj[methodName]() with unknown methodName
- If methodName is string literal and obj is clearly typed, you can resolve.
- Otherwise: no edges (or low-confidence to node-unknown).
No source for library
- If you do not index node_modules, you cannot trace inside vulnerable library.
- The analysis is still useful: all we need is a “synthetic” node for the library’s exported symbol:
  - Create a synthetic node representing some-lib::dangerousFunction and attach all calls to it.
  - That node gets componentId for some-lib.
  - Reachability is still valid (we do not need the internal implementation for SCA).

10. Implementation plan for a mid-level engineer

Assume this runs in a Node.js/TypeScript container that Scanner calls, returning JSON.

10.1 Modules to build

JsProjectLoader
- Reads package.json + lockfile.
- Builds components[] (npm packages + root app).
- Maps file paths → componentId.
TsProgramBuilder
- Generates tsconfig.reachability.json.
- Creates TS Program with allowJs: true.
- Exposes sourceFiles and typeChecker.
JsSymbolIndexer
- Walks all source files.
- Indexes:
  - Exported functions/classes.
  - Imported bindings / requires.
- Creates base nodes[] and export index.
JsEntrypointDetector
- Reads package.json for bin and main entry.
- Applies server/Express heuristics.
- Adds entrypoint nodes.
JsCallGraphBuilder
- For each function/method node:
  - Traverses its AST.
  - Emits call edges as per §5.
- Emits entry_call edges for server/route wiring.
VulnerabilityMapper
- Takes vulnerability input (package + symbolExportName).
- Resolves them to targetNodeId.
ReachabilityEngine
- Implements BFS algorithm.
- Produces reachabilityResults[] with paths.
GraphSerializer
- Packs everything into the JSON structure described in §§2–3.

10.2 Minimal example (to sanity check)

Project snippet:

// src/server.ts
import express from 'express';
import { dangerousFunction } from 'some-lib';

const app = express();

app.get('/test', (req, res) => {
  dangerousFunction(req.query.input);
  res.send('ok');
});

app.listen(3000);

Vulnerability:

{
  "id": "CVE-2020-1234",
  "componentPurl": "pkg:npm/some-lib@1.2.3",
  "packageName": "some-lib",
  "symbolExportName": "dangerousFunction"
}

Expected graph & result:

Entrypoint node:
- src/server.ts::node-entry (or similar).
Handler arrow function node for /test.
Node for some-lib::dangerousFunction (real or synthetic).

Reachability path:

[
  "entry:src/server.ts::node-entry",
  "node-handler-/test",
  "node-some-lib::dangerousFunction"
]

Status: reachable, confidence = high.

If you want, next iteration I can:

Turn this into a formal JSON Schema (reachability-js-graph.schema.json),
And sketch the exact Node/TS code structure (file layout, TS interfaces) for the container that StellaOps.Scanner would call.

You already have a very solid v1 for both PHP and JS. The next step is not to add “more stuff” randomly, but to make the specs:
More uniform (one core reachability model, multiple languages).
More honest about uncertainty.
More useful for scoring, policy, and SBOM/VEX.

Here is what I would improve.

1. Cross‑language improvements (applies to both PHP & JS)

1.1 Unify under a single core schema

Right now PHP and JS are parallel but not explicitly unified. I would define:

A language‑agnostic core:
- Node (id, kind, file, line, componentId, purl, tags).
- Edge (id, from, to, type, confidence, tags).
- Vulnerability (id, componentPurl, symbolId or symbolFqn, severity, tags).
- ReachabilityResult (vulnId, targetNodeId, status, paths[], analysisMeta).
A language extension block:
- phpExtras (namespace, className, visibility, etc.).
- jsExtras (exportName, exportKind, isAsync, etc.).

This gives you one “Reachability Graph 1.x” spec with per‑language specialisation instead of two separate specs.

1.2 Stronger identity & hashing rules

Make node and edge IDs deterministic and explicitly specified:

Node ID derived from:
- language, componentId, file, fqn, kind → sha256 truncated.
Edge ID derived from:
- from, to, type, file, line.

Benefits:

Stable IDs across runs for the same code → easy diffing, caching, incremental scans.
Downstream tools (policy engine, UI) can key on IDs confidently.
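One way to derive such IDs is sketched below for nodes. The canonical encoding (a JSON array of the five fields), the "node-" prefix, and the truncation to 16 hex characters are assumptions of this sketch; the spec would need to fix them normatively.

```typescript
import { createHash } from "node:crypto";

// The identity fields listed above for a node.
interface NodeIdentity {
  language: string;
  componentId: string;
  file: string;
  fqn: string;
  kind: string;
}

function deterministicNodeId(n: NodeIdentity): string {
  // A JSON array gives an unambiguous, order-fixed encoding of the fields,
  // so "a" + "bc" can never collide with "ab" + "c".
  const canonical = JSON.stringify([n.language, n.componentId, n.file, n.fqn, n.kind]);
  const digest = createHash("sha256").update(canonical, "utf8").digest("hex");
  // Truncation length is an assumption; longer prefixes lower collision risk.
  return "node-" + digest.slice(0, 16);
}
```

Note that line is not part of the identity, so moving a function within a file keeps its ID stable.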

1.3 Multi‑axis confidence instead of a single label

Replace the single confidence enum with multi‑axis confidence:

"confidence": {
  "resolution": "high|medium|low",   // how well we resolved the callee
  "typeInference": "high|medium|low",
  "controlFlow": "high|medium|low"
}

And define:

pathConfidence = min of all axes along the path.
status still uses reachable / maybe_reachable / etc., but you retain the underlying breakdown for scoring and debugging.
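The pathConfidence rule can be sketched as a fold over the edges of a path. The names Level, EdgeConfidence and pathConfidence are hypothetical, and treating an empty path as "high" (the target is itself an entrypoint) is an assumption of this sketch.

```typescript
type Level = "high" | "medium" | "low";

// The three axes from the confidence object above.
interface EdgeConfidence {
  resolution: Level;
  typeInference: Level;
  controlFlow: Level;
}

const RANK: Record<Level, number> = { low: 1, medium: 2, high: 3 };

function minLevel(a: Level, b: Level): Level {
  return RANK[a] <= RANK[b] ? a : b;
}

// Minimum over every axis of every edge along the path.
function pathConfidence(edges: EdgeConfidence[]): Level {
  let result: Level = "high";
  for (const e of edges) {
    result = minLevel(result, minLevel(e.resolution, minLevel(e.typeInference, e.controlFlow)));
  }
  return result;
}
```

A path with one edge whose typeInference is "medium" and all other axes "high" therefore has pathConfidence "medium", while the per-axis values remain available for scoring.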

1.4 Path conditions and guards (lightweight)

Introduce optional path condition annotations on edges:

"extras": {
  "guard": "if ($userIsLoggedIn)",
  "guardType": "auth | feature_flag | input_validation | unknown"
}

You do not need full symbolic execution. A simple heuristic suffices:

Detect if (...) around the call and capture the textual condition.
Categorize by simple patterns (presence of isAdmin, feature, flag, etc.).

Later, the Trust Algebra can say: “reachable only under feature flag + behind auth → downgrade risk.”
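The pattern-based categorization can be sketched as a few keyword checks on the captured condition text. The keyword lists below are illustrative assumptions, not a vetted rule set, and substring matching of this kind will misclassify some conditions (for example, "author" contains "auth").

```typescript
type GuardType = "auth" | "feature_flag" | "input_validation" | "unknown";

// Classify the textual condition captured from an enclosing if (...).
function classifyGuard(condition: string): GuardType {
  const c = condition.toLowerCase();
  // Keyword lists are assumptions for illustration only.
  if (/(isadmin|loggedin|auth|role|permission)/.test(c)) return "auth";
  if (/(feature|flag|toggle)/.test(c)) return "feature_flag";
  if (/(valid|sanitiz|is_numeric)/.test(c)) return "input_validation";
  return "unknown";
}
```

Anything that does not match stays "unknown" rather than being guessed, which keeps the annotation consistent with the rest of the spec's stance on uncertainty.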

1.5 Partial coverage & truncation flags

Make the graph self‑describing about its limitations:

At graph level:

"analysisMeta": {
  "languages": ["php"],
  "vendorCodeParsed": true,
  "dynamicFeaturesHandled": ["dynamic-includes-partial", "reflection-ignored"],
  "maxNodes": 500000,
  "truncated": false
}

Per‑node or per‑file:

"extras": {
  "parseErrors": false,
  "analysisSkippedReason": null
}

Per‑vulnerability:

Add coverageStatus: full, partial, unknown to complement status.

This avoids a common trap: tools silently dropping edges/nodes and still reporting “unreachable.”

1.6 First‑class SBOM/VEX linkage

You already include PURLs. Go one step further:

componentId links to:
- bom-ref (CycloneDX) or SPDXID (SPDX) if available.
vulnerabilityId links to:
- vexRef in any existing VEX document.

This allows:

A VEX producer to say “not affected / affected but not exploited” with explicit reference to the reachability graph and specific targetNodeIds.

2. PHP‑specific improvements

2.1 Autoloader‑aware edges as first‑class concept

Right now autoload is mostly implicit. Make it explicit and deterministic:

During Composer metadata processing, build:
- Autoload map: FQN class → file.
Add autoload edges:
- From “usage site” node (where new ClassName() first appears) to a file‑level node representing the defining file.

Why it helps:

Clarifies how classes were resolved (or not).
Easier to debug “class not found” vs “we never parsed vendor code.”

2.2 More precise includes / requires

Upgrade the naive rule “everything in included file is reachable”:

Represent each file as a special node kind="file".
include / require statements produce include edges from current node/file to the file node.
Then:
- All functions/methods defined in that file get define_in edges from file node.
- A separate simple pass marks them reachable from that file’s callers.

Add a nuance:

If the include path is static and resolved at scan time → resolution.high.
If dynamic (e.g., include $baseDir.'/file.php';) → resolution.medium or low.

2.3 Better dynamic dispatch handling for methods

Current v1 rule (“connect to all methods with that name in the component”) is safe but noisy.

Refinement:

Use local type inference in the same function/method:
- $x = new Foo(); $x->bar(); → high resolution.
- $x = factory(); $x->bar();:
  - If factory returns a union of known types, edges to those types with resolution.medium.
Introduce a tag on edges:
- extras.dispatchKind = "static" | "local-new" | "factory-heuristic" | "unknown".

This preserves the safety of your current design but cuts down false positives for common patterns.

2.4 Framework‑aware entrypoints (v2, but spec‑ready now)

Extend entryPointType with framework flavors, even if initial implementation is shallow:

laravel_http, symfony_http, wordpress_hook, drupal_hook, etc.

And allow:

"extras": {
  "framework": "laravel",
  "route": "GET /users",
  "hookName": "init"
}

You do not have to implement every framework in v1, but the spec should allow these so you can ship small, incremental framework profiles without changing the schema.

3. JavaScript/TypeScript‑specific improvements

3.1 Explicit async / event‑loop edges

Today all calls are treated uniformly. For JS/TS, you should model:

setTimeout, setInterval, setImmediate, queueMicrotask, process.nextTick, Promise.then/catch/finally, event emitters.

Two improvements:

Additional edge types:
- async_call, event_callback, timer_callback.
Node extras:
- extras.trigger = "timer" | "promise" | "event" | "unknown".

This lets you later express policies like: “reachable only via a rarely used cron‑like timer” vs “reachable via normal HTTP request.”
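A name-based first pass at choosing the edge type could look like the sketch below. The assignment of each API to an edge type is an assumption of this sketch, and matching on method names alone cannot confirm that the receiver really is a Promise or an event emitter; a real implementation would check that with the type checker.

```typescript
type AsyncEdgeType = "async_call" | "event_callback" | "timer_callback";

const TIMER_CALLEES = ["setTimeout", "setInterval", "setImmediate"];
const MICROTASK_CALLEES = ["queueMicrotask", "process.nextTick"];
const PROMISE_METHODS = ["then", "catch", "finally"];
const EVENT_METHODS = ["on", "once", "addListener", "addEventListener"];

// `callee` is the textual callee, e.g. "setTimeout" or "emitter.on".
// Returns undefined for ordinary synchronous calls.
function asyncEdgeTypeFor(callee: string): AsyncEdgeType | undefined {
  if (TIMER_CALLEES.indexOf(callee) !== -1) return "timer_callback";
  if (MICROTASK_CALLEES.indexOf(callee) !== -1) return "async_call";

  const dot = callee.lastIndexOf(".");
  if (dot === -1) return undefined; // plain identifier, not a method call

  const method = callee.slice(dot + 1);
  if (PROMISE_METHODS.indexOf(method) !== -1) return "async_call";
  if (EVENT_METHODS.indexOf(method) !== -1) return "event_callback";
  return undefined;
}
```

The edge then runs from the function containing the call to the callback passed as an argument, with the returned value as its type.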

3.2 Bundler awareness (but spec‑only in v1)

Even if v1 implementation ignores bundlers, the spec should anticipate them:

Allow a bundle mapping block:

"bundles": [
  {
    "id": "bundle-main",
    "tool": "webpack",
    "inputFiles": ["src/index.ts", "src/server.ts"],
    "outputFiles": ["dist/main.js"]
  }
]

Optionally, allow edges:
- type = "bundle_map" from source file nodes to bundled file nodes.

You can attach reachability graphs to either pre‑bundle or post‑bundle views later, without breaking the schema.

3.3 Stronger TypeScript‑based resolution

Encode the fact that a call was resolved using TS type information vs heuristic:

On edges, add:

"extras": {
  "resolutionStrategy": "ts-typechecker | local-scope | require-heuristic | unresolved"
}

This provides a clear line between “hard” and “soft” links for the scoring engine and for debugging why something is maybe_reachable.

3.4 Workspace / monorepo semantics

Support Yarn / pnpm / npm workspaces at the schema level:

Allow components to have:

"extras": {
  "workspace": "packages/service-a",
  "isWorkspaceRoot": false
}

And support edges:

type = "workspace_dep" for internal package imports.

This makes it straightforward to see when a vulnerable library is pulled via an internal package boundary, which is common in large JS monorepos.

4. Operational & lifecycle improvements

4.1 Explicit incremental scan support

Add an optional delta section so a scanner can emit only changes:

"delta": {
  "baseGraphHash": "sha256:...",
  "addedNodes": [...],
  "removedNodeIds": [...],
  "addedEdges": [...],
  "removedEdgeIds": [...]
}

This is particularly valuable for large repos where full graphs are costly and CI needs fast turnaround.
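Given deterministic IDs, producing the delta section reduces to set differences on IDs. This is a minimal sketch with hypothetical names (GraphSnapshot, computeDelta); it does not detect a node or edge whose ID is unchanged but whose other fields differ, which a full implementation would have to handle separately.

```typescript
interface HasId {
  id: string;
}

interface GraphSnapshot<N extends HasId, E extends HasId> {
  nodes: N[];
  edges: E[];
}

interface GraphDelta<N extends HasId, E extends HasId> {
  baseGraphHash: string;
  addedNodes: N[];
  removedNodeIds: string[];
  addedEdges: E[];
  removedEdgeIds: string[];
}

function computeDelta<N extends HasId, E extends HasId>(
  base: GraphSnapshot<N, E>,
  next: GraphSnapshot<N, E>,
  baseGraphHash: string // hash of the base graph, computed elsewhere
): GraphDelta<N, E> {
  const ids = (items: HasId[]): Set<string> => new Set(items.map((i) => i.id));
  const baseNodeIds = ids(base.nodes);
  const nextNodeIds = ids(next.nodes);
  const baseEdgeIds = ids(base.edges);
  const nextEdgeIds = ids(next.edges);

  return {
    baseGraphHash,
    addedNodes: next.nodes.filter((n) => !baseNodeIds.has(n.id)),
    removedNodeIds: base.nodes.filter((n) => !nextNodeIds.has(n.id)).map((n) => n.id),
    addedEdges: next.edges.filter((e) => !baseEdgeIds.has(e.id)),
    removedEdgeIds: base.edges.filter((e) => !nextEdgeIds.has(e.id)).map((e) => e.id),
  };
}
```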

4.2 Test / non‑prod code classification

Mark nodes/edges originating from tests or non‑prod code:

extras.codeRole = "prod | test | devtool | unknown".

Entry points from test runners (e.g., PHPUnit, Jest, Mocha) should either be:

Ignored (default), or
Explicitly flagged as entryPointType = "test" so policies can decide whether to count that reachability.

4.3 Normative definitions of statuses

You already use reachable, maybe_reachable, unreachable, not_analyzed. Make the semantics normative in the spec:

Tie reachable / maybe_reachable to:
- Existence of a path from at least one recognized entrypoint.
- Minimum pathConfidence thresholds.
Require that tools distinguish:
- “No path in the graph” vs “graph incomplete here.”

This allows multiple tools to implement the spec and still produce comparable, auditable results.

If you want, the next concrete step could be:

A “Reachability Graph 1.1” document that:
- Extracts the shared core,
- Adds multi‑axis confidence,
- Adds partial‑coverage metadata,
- Extends the enums for edge types and entrypoint types for PHP/JS.

That gives your team a clean target for implementation without materially increasing complexity for a mid‑level engineer.
