Callgraph Schema Reference

This document describes the stella.callgraph.v1 schema used for representing call graphs in StellaOps.

Schema Version

Current Version: stella.callgraph.v1

All call graphs should include the schema field set to stella.callgraph.v1. Legacy call graphs without this field are automatically migrated on ingestion.

Document Structure

A CallgraphDocument contains the following top-level fields:

| Field | Type | Required | Description |
|---|---|---|---|
| schema | string | Yes | Schema identifier: stella.callgraph.v1 |
| scanKey | string | No | Scan context identifier |
| language | CallgraphLanguage | No | Primary language of the call graph |
| artifacts | CallgraphArtifact[] | No | Artifacts included in the graph |
| nodes | CallgraphNode[] | Yes | Graph nodes representing symbols |
| edges | CallgraphEdge[] | Yes | Call edges between nodes |
| entrypoints | CallgraphEntrypoint[] | No | Discovered entrypoints |
| metadata | CallgraphMetadata | No | Graph-level metadata |
| id | string | Yes | Unique graph identifier |
| component | string | No | Component name |
| version | string | No | Component version |
| ingestedAt | DateTimeOffset | No | Ingestion timestamp (ISO 8601) |
| graphHash | string | No | Content hash for deduplication |
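
As a quick illustration of the table above, the following sketch checks a document for the four top-level fields marked as required. The helper name `missing_required_fields` and the sample identifier `graph-001` are hypothetical and not part of the schema.

```python
# Top-level fields marked "Required: Yes" in the table above.
REQUIRED_TOP_LEVEL = ("schema", "id", "nodes", "edges")


def missing_required_fields(doc):
    """Return the required top-level fields that are absent from doc."""
    return [field for field in REQUIRED_TOP_LEVEL if field not in doc]


# Smallest document that carries every required field.
# "graph-001" is a made-up identifier used only for illustration.
minimal = {
    "schema": "stella.callgraph.v1",
    "id": "graph-001",
    "nodes": [],
    "edges": [],
}

print(missing_required_fields(minimal))        # []
print(missing_required_fields({"nodes": []}))  # ['schema', 'id', 'edges']
```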

Legacy Fields

These fields are preserved for backward compatibility:

| Field | Type | Description |
|---|---|---|
| languageString | string | Legacy language string |
| roots | CallgraphRoot[] | Legacy root/entrypoint representation |
| schemaVersion | string | Legacy schema version field |

Enumerations

CallgraphLanguage

Supported languages for call graph analysis:

| Value | Description |
|---|---|
| Unknown | Language not determined |
| DotNet | .NET (C#, F#, VB.NET) |
| Java | Java and JVM languages |
| Node | Node.js / JavaScript / TypeScript |
| Python | Python |
| Go | Go |
| Rust | Rust |
| Ruby | Ruby |
| Php | PHP |
| Binary | Native binary (ELF, PE) |
| Swift | Swift |
| Kotlin | Kotlin |

SymbolVisibility

Access visibility levels for symbols:

| Value | Description |
|---|---|
| Unknown | Visibility not determined |
| Public | Publicly accessible |
| Internal | Internal to assembly/module |
| Protected | Protected (subclass accessible) |
| Private | Private to containing type |

EdgeKind

Edge classification based on analysis confidence:

| Value | Description | Confidence |
|---|---|---|
| Static | Statically determined call | High |
| Heuristic | Heuristically inferred | Medium |
| Runtime | Runtime-observed edge | Highest |
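
One way a consumer might use the confidence column is to keep the strongest evidence when several edges connect the same pair of nodes. The sketch below assumes a numeric ranking (`KIND_RANK`) that mirrors the table's ordering; the schema names the levels but assigns no numbers, so the ranks and the helper `strongest_edge` are illustrative only.

```python
# Illustrative ranks only: Runtime (Highest) > Static (High) > Heuristic (Medium).
KIND_RANK = {"Heuristic": 1, "Static": 2, "Runtime": 3}


def strongest_edge(edges):
    """Return the edge whose kind has the highest confidence rank."""
    return max(edges, key=lambda edge: KIND_RANK.get(edge.get("kind"), 0))


edges = [
    {"sourceId": "n001", "targetId": "n002", "kind": "Heuristic"},
    {"sourceId": "n001", "targetId": "n002", "kind": "Runtime"},
    {"sourceId": "n001", "targetId": "n002", "kind": "Static"},
]

print(strongest_edge(edges)["kind"])  # Runtime
```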

EdgeReason

Reason codes explaining why an edge exists (critical for explainability):

| Value | Description | Typical Kind |
|---|---|---|
| DirectCall | Direct method/function call | Static |
| VirtualCall | Virtual/interface dispatch | Static |
| ReflectionString | Reflection-based invocation | Heuristic |
| DiBinding | Dependency injection binding | Heuristic |
| DynamicImport | Dynamic import/require | Heuristic |
| NewObj | Constructor/object instantiation | Static |
| DelegateCreate | Delegate/function pointer creation | Static |
| AsyncContinuation | Async/await continuation | Static |
| EventHandler | Event handler subscription | Heuristic |
| GenericInstantiation | Generic type instantiation | Static |
| NativeInterop | Native interop (P/Invoke, JNI, FFI) | Static |
| RuntimeMinted | Runtime-minted edge from execution | Runtime |
| Unknown | Reason could not be determined | - |
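
The "Typical Kind" column lends itself to a lightweight lint that flags edges whose kind differs from what their reason usually implies. The mapping below is transcribed from the table; the helper `kind_is_typical` is hypothetical, and since the column describes typical rather than mandatory pairings, a mismatch is a hint for review, not an error.

```python
# Reason -> typical kind, transcribed from the EdgeReason table.
# None means the table lists no typical kind ("-").
TYPICAL_KIND = {
    "DirectCall": "Static",
    "VirtualCall": "Static",
    "ReflectionString": "Heuristic",
    "DiBinding": "Heuristic",
    "DynamicImport": "Heuristic",
    "NewObj": "Static",
    "DelegateCreate": "Static",
    "AsyncContinuation": "Static",
    "EventHandler": "Heuristic",
    "GenericInstantiation": "Static",
    "NativeInterop": "Static",
    "RuntimeMinted": "Runtime",
    "Unknown": None,
}


def kind_is_typical(edge):
    """Return True when the edge's kind matches the typical kind for its reason."""
    expected = TYPICAL_KIND.get(edge.get("reason"))
    # No expectation recorded: nothing to compare against.
    return expected is None or edge.get("kind") == expected


print(kind_is_typical({"kind": "Static", "reason": "DirectCall"}))  # True
print(kind_is_typical({"kind": "Static", "reason": "DiBinding"}))   # False
```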

EntrypointKind

Types of entrypoints:

| Value | Description |
|---|---|
| Unknown | Type not determined |
| Http | HTTP endpoint |
| Grpc | gRPC endpoint |
| Cli | CLI command handler |
| Job | Background job |
| Event | Event handler |
| MessageQueue | Message queue consumer |
| Timer | Timer/scheduled task |
| Test | Test method |
| Main | Main entry point |
| ModuleInit | Module initializer |
| StaticConstructor | Static constructor |

EntrypointFramework

Frameworks that expose entrypoints:

| Value | Description | Language |
|---|---|---|
| Unknown | Framework not determined | - |
| AspNetCore | ASP.NET Core | DotNet |
| MinimalApi | ASP.NET Core Minimal APIs | DotNet |
| Spring | Spring Framework | Java |
| SpringBoot | Spring Boot | Java |
| Express | Express.js | Node |
| Fastify | Fastify | Node |
| NestJs | NestJS | Node |
| FastApi | FastAPI | Python |
| Flask | Flask | Python |
| Django | Django | Python |
| Rails | Ruby on Rails | Ruby |
| Gin | Gin | Go |
| Echo | Echo | Go |
| Actix | Actix Web | Rust |
| Rocket | Rocket | Rust |
| AzureFunctions | Azure Functions | Multi |
| AwsLambda | AWS Lambda | Multi |
| CloudFunctions | Google Cloud Functions | Multi |

EntrypointPhase

Execution phase for entrypoints:

| Value | Description |
|---|---|
| ModuleInit | Module/assembly initialization |
| AppStart | Application startup (Main) |
| Runtime | Runtime request handling |
| Shutdown | Shutdown/cleanup handlers |

Node Structure

A CallgraphNode represents a symbol (method, function, type) in the call graph:

{
  "id": "n001",
  "nodeId": "n001",
  "name": "GetWeatherForecast",
  "kind": "method",
  "namespace": "SampleApi.Controllers",
  "file": "WeatherForecastController.cs",
  "line": 15,
  "symbolKey": "SampleApi.Controllers.WeatherForecastController::GetWeatherForecast()",
  "artifactKey": "SampleApi.dll",
  "visibility": "Public",
  "isEntrypointCandidate": true,
  "attributes": {
    "returnType": "IEnumerable<WeatherForecast>",
    "httpMethod": "GET",
    "route": "/weatherforecast"
  },
  "flags": 3
}

Node Fields

| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Unique identifier within the graph |
| nodeId | string | No | Alias for id (v1 schema convention) |
| name | string | Yes | Human-readable symbol name |
| kind | string | Yes | Symbol kind (method, function, class) |
| namespace | string | No | Namespace or module path |
| file | string | No | Source file path |
| line | int | No | Source line number |
| symbolKey | string | No | Canonical symbol key (v1) |
| artifactKey | string | No | Reference to containing artifact |
| visibility | SymbolVisibility | No | Access visibility |
| isEntrypointCandidate | bool | No | Whether node is an entrypoint candidate |
| purl | string | No | Package URL for external packages |
| symbolDigest | string | No | Content-addressed symbol digest |
| attributes | object | No | Additional attributes |
| flags | int | No | Bitmask for efficient filtering |

Symbol Key Format

The symbolKey follows a canonical format:

{Namespace}.{Type}[`Arity][+Nested]::{Method}[`Arity]({ParamTypes})

Examples:

  • System.String::Concat(string, string)
  • MyApp.Controllers.UserController::GetUser(int)
  • System.Collections.Generic.List`1::Add(T)
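
To make the format concrete, here is a minimal sketch of a parser for symbol keys. The regular expression and the helper `parse_symbol_key` are illustrative assumptions rather than the StellaOps implementation, and the parameter split is naive: it would mis-handle parameter types that themselves contain commas.

```python
import re

# Namespace.Type[`Arity][+Nested] :: Method[`Arity] ( ParamTypes )
SYMBOL_KEY_RE = re.compile(
    r"^(?P<type_path>[^:()]+)"        # everything before "::"
    r"::"
    r"(?P<method>[^`(]+)"             # method name
    r"(?:`(?P<method_arity>\d+))?"    # optional generic arity on the method
    r"\((?P<params>.*)\)$"            # parameter list, possibly empty
)


def parse_symbol_key(key):
    """Split a symbol key into its parts, or return None if it does not match."""
    match = SYMBOL_KEY_RE.match(key)
    if match is None:
        return None
    # The last dot separates the namespace from the (possibly nested) type.
    namespace, _, type_name = match.group("type_path").rpartition(".")
    params = [p.strip() for p in match.group("params").split(",") if p.strip()]
    return {
        "namespace": namespace,
        "type": type_name,
        "method": match.group("method"),
        "method_arity": int(match.group("method_arity") or 0),
        "params": params,
    }


print(parse_symbol_key("System.String::Concat(string, string)"))
print(parse_symbol_key("System.Collections.Generic.List`1::Add(T)"))
```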

Edge Structure

A CallgraphEdge represents a call relationship between two symbols:

{
  "sourceId": "n001",
  "targetId": "n002",
  "from": "n001",
  "to": "n002",
  "type": "call",
  "kind": "Static",
  "reason": "DirectCall",
  "weight": 1.0,
  "offset": 42,
  "isResolved": true,
  "provenance": "static-analysis"
}

Edge Fields

| Field | Type | Required | Description |
|---|---|---|---|
| sourceId | string | Yes | Source node ID (caller) |
| targetId | string | Yes | Target node ID (callee) |
| from | string | No | Alias for sourceId (v1) |
| to | string | No | Alias for targetId (v1) |
| type | string | No | Legacy edge type |
| kind | EdgeKind | No | Edge classification |
| reason | EdgeReason | No | Reason for edge existence |
| weight | double | No | Confidence weight (0.0-1.0) |
| offset | int | No | IL/bytecode offset |
| isResolved | bool | No | Whether target was fully resolved |
| provenance | string | No | Provenance information |
| candidates | string[] | No | Virtual dispatch candidates |
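
Because from and to are aliases for sourceId and targetId, a reader of edge data may want to normalise on the canonical names before further processing. The sketch below shows one possible approach; the helper `normalize_edge`, and the choice to raise an error when neither spelling is present, are assumptions.

```python
def normalize_edge(edge):
    """Return a copy of edge with sourceId/targetId filled from their aliases."""
    out = dict(edge)
    source = out.get("sourceId") or out.get("from")
    target = out.get("targetId") or out.get("to")
    if source is None or target is None:
        raise ValueError("edge has neither sourceId/targetId nor from/to")
    out["sourceId"] = source
    out["targetId"] = target
    return out


edge = normalize_edge({"from": "n001", "to": "n002", "kind": "Static"})
print(edge["sourceId"], edge["targetId"])  # n001 n002
```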

Entrypoint Structure

A CallgraphEntrypoint represents a discovered entrypoint:

{
  "nodeId": "n001",
  "kind": "Http",
  "route": "/api/users/{id}",
  "httpMethod": "GET",
  "framework": "AspNetCore",
  "source": "attribute",
  "phase": "Runtime",
  "order": 0
}

Entrypoint Fields

| Field | Type | Required | Description |
|---|---|---|---|
| nodeId | string | Yes | Reference to the node |
| kind | EntrypointKind | Yes | Type of entrypoint |
| route | string | No | HTTP route pattern |
| httpMethod | string | No | HTTP method (GET, POST, etc.) |
| framework | EntrypointFramework | No | Framework exposing the entrypoint |
| source | string | No | Discovery source |
| phase | EntrypointPhase | No | Execution phase |
| order | int | No | Deterministic ordering |

Determinism Requirements

For reproducible analysis, call graphs must be deterministic:

  1. Stable Ordering

    • Nodes must be sorted by id (ordinal string comparison)
    • Edges must be sorted by sourceId, then targetId
    • Entrypoints must be sorted by order
  2. Enum Serialization

    • All enums serialize as camelCase strings
    • Example: EdgeReason.DirectCall → "directCall"
  3. Timestamps

    • All timestamps must be UTC ISO 8601 format
    • Example: 2025-01-15T10:00:00Z
  4. Content Hashing

    • The graphHash field should contain a stable content hash
    • Hash algorithm: SHA-256
    • Format: sha256:{hex-digest}
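
The four requirements above can be sketched as follows. Several details are assumptions that this document does not specify: which bytes are hashed (here, compact JSON with sorted keys, excluding graphHash and ingestedAt), the tie-break on nodeId for entrypoints that share an order, and the first-letter lowercasing used for enum names. Python compares strings by code point, which agrees with ordinal comparison for ASCII identifiers.

```python
import hashlib
import json
from datetime import datetime, timezone


def canonicalize(doc):
    """Return a copy of doc with nodes, edges, and entrypoints in stable order."""
    out = dict(doc)
    out["nodes"] = sorted(doc.get("nodes", []), key=lambda n: n["id"])
    out["edges"] = sorted(
        doc.get("edges", []), key=lambda e: (e["sourceId"], e["targetId"])
    )
    if "entrypoints" in doc:
        # Tie-break on nodeId is an assumption; the rule only names "order".
        out["entrypoints"] = sorted(
            doc["entrypoints"], key=lambda p: (p.get("order", 0), p["nodeId"])
        )
    return out


def enum_to_json(name):
    """Lowercase the first letter, e.g. DirectCall -> directCall."""
    return name[:1].lower() + name[1:]


def utc_timestamp(moment):
    """Format a timezone-aware datetime as UTC ISO 8601 with a Z suffix."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def graph_hash(doc):
    """Compute sha256:{hex-digest} over an assumed canonical JSON form."""
    body = {
        key: value
        for key, value in canonicalize(doc).items()
        if key not in ("graphHash", "ingestedAt")  # assumed exclusions
    }
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


print(enum_to_json("DirectCall"))  # directCall
print(utc_timestamp(datetime(2025, 1, 15, 10, 0, 0, tzinfo=timezone.utc)))
```

With this approach, two documents that differ only in the order of their nodes and edges produce the same hash.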

Schema Migration

Legacy call graphs without the schema field are automatically migrated:

  1. Schema Field: Set to stella.callgraph.v1
  2. Language Parsing: String language converted to CallgraphLanguage enum
  3. Visibility Inference: Inferred from symbol key patterns:
    • Contains `.Internal.` → Internal
    • Contains `._` or `<` → Private
    • Default → Public
  4. Edge Reason Inference: Based on legacy type field:
    • call, direct → DirectCall
    • virtual, callvirt → VirtualCall
    • newobj → NewObj
    • etc.
  5. Entrypoint Inference: Built from legacy roots and candidate nodes
  6. Symbol Key Generation: Built from namespace and name if missing
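
A rough sketch of steps 1, 3, and 4 is shown below. It covers only the mappings listed above; the remaining legacy type mappings are not enumerated in this document, so unlisted types fall back to Unknown here as an assumption. The order of the visibility checks, the lowercasing of the legacy type, and the helper names are also assumptions. Steps 2, 5, and 6 are omitted because their exact rules are not described.

```python
SCHEMA_V1 = "stella.callgraph.v1"

# Only the legacy type values named in the list above.
LEGACY_TYPE_TO_REASON = {
    "call": "DirectCall",
    "direct": "DirectCall",
    "virtual": "VirtualCall",
    "callvirt": "VirtualCall",
    "newobj": "NewObj",
}


def infer_visibility(symbol_key):
    """Apply the symbol-key patterns in the order they are listed."""
    if ".Internal." in symbol_key:
        return "Internal"
    if "._" in symbol_key or "<" in symbol_key:
        return "Private"
    return "Public"


def infer_reason(legacy_type):
    """Map a legacy edge type to an EdgeReason; unlisted types become Unknown."""
    return LEGACY_TYPE_TO_REASON.get((legacy_type or "").lower(), "Unknown")


def migrate(doc):
    """Return doc unchanged if it has a schema field, else a migrated copy."""
    if "schema" in doc:
        return doc
    out = dict(doc, schema=SCHEMA_V1)
    out["nodes"] = [
        {**n, "visibility": n.get("visibility")
            or infer_visibility(n.get("symbolKey") or "")}
        for n in doc.get("nodes", [])
    ]
    out["edges"] = [
        {**e, "reason": e.get("reason") or infer_reason(e.get("type"))}
        for e in doc.get("edges", [])
    ]
    return out


legacy = {
    "nodes": [{"id": "n1", "symbolKey": "MyApp.Internal.Helper::Run()"}],
    "edges": [{"sourceId": "n1", "targetId": "n1", "type": "callvirt"}],
}
migrated = migrate(legacy)
print(migrated["schema"], migrated["nodes"][0]["visibility"],
      migrated["edges"][0]["reason"])
```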

Validation Rules

Call graphs are validated against these rules:

  1. All node id values must be unique
  2. All edge sourceId and targetId must reference existing nodes
  3. All entrypoint nodeId must reference existing nodes
  4. Edge weight must be between 0.0 and 1.0
  5. Artifacts referenced by nodes must exist in the artifacts list
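
The rules above translate fairly directly into a checker that collects error messages. This is a sketch, not the StellaOps validator: the function name `validate_callgraph` is hypothetical, and because the structure of CallgraphArtifact is not described in this document, the code assumes each artifact carries an `artifactKey` field matching the one on nodes.

```python
def validate_callgraph(doc):
    """Return a list of human-readable rule violations (empty when valid)."""
    errors = []

    # Rule 1: node ids are unique.
    node_ids = set()
    for node in doc.get("nodes", []):
        if node["id"] in node_ids:
            errors.append(f"duplicate node id: {node['id']}")
        node_ids.add(node["id"])

    # Rules 2 and 4: edge endpoints exist, weight lies in [0.0, 1.0].
    for index, edge in enumerate(doc.get("edges", [])):
        for field in ("sourceId", "targetId"):
            if edge.get(field) not in node_ids:
                errors.append(f"edge {index}: {field} references unknown node")
        weight = edge.get("weight")
        if weight is not None and not 0.0 <= weight <= 1.0:
            errors.append(f"edge {index}: weight {weight} outside 0.0-1.0")

    # Rule 3: entrypoints reference existing nodes.
    for index, entrypoint in enumerate(doc.get("entrypoints", [])):
        if entrypoint.get("nodeId") not in node_ids:
            errors.append(f"entrypoint {index}: nodeId references unknown node")

    # Rule 5: referenced artifacts exist (artifact field name is assumed).
    artifact_keys = {a.get("artifactKey") for a in doc.get("artifacts", [])}
    for node in doc.get("nodes", []):
        key = node.get("artifactKey")
        if key is not None and key not in artifact_keys:
            errors.append(f"node {node['id']}: unknown artifact {key}")

    return errors


broken = {
    "nodes": [{"id": "n001", "artifactKey": "Missing.dll"}, {"id": "n001"}],
    "edges": [{"sourceId": "n001", "targetId": "n999", "weight": 1.5}],
    "entrypoints": [{"nodeId": "n404", "kind": "Http"}],
}
for message in validate_callgraph(broken):
    print(message)
```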

Golden Fixtures

Reference fixtures for testing are located at: tests/reachability/fixtures/callgraph-schema-v1/

| Fixture | Description |
|---|---|
| dotnet-aspnetcore-minimal.json | ASP.NET Core application |
| java-spring-boot.json | Spring Boot application |
| node-express-api.json | Express.js API |
| go-gin-api.json | Go Gin API |
| legacy-no-schema.json | Legacy format for migration testing |
| all-edge-reasons.json | All 13 edge reason codes |
| all-visibility-levels.json | All 5 visibility levels |