# Callgraph Schema Reference

This document describes the `stella.callgraph.v1` schema used for representing call graphs in StellaOps.

## Schema Version

**Current Version:** `stella.callgraph.v1`

All call graphs should include the `schema` field set to `stella.callgraph.v1`. Legacy call graphs without this field are automatically migrated on ingestion.

## Document Structure

A `CallgraphDocument` contains the following top-level fields:

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `schema` | string | Yes | Schema identifier: `stella.callgraph.v1` |
| `scanKey` | string | No | Scan context identifier |
| `language` | CallgraphLanguage | No | Primary language of the call graph |
| `artifacts` | CallgraphArtifact[] | No | Artifacts included in the graph |
| `nodes` | CallgraphNode[] | Yes | Graph nodes representing symbols |
| `edges` | CallgraphEdge[] | Yes | Call edges between nodes |
| `entrypoints` | CallgraphEntrypoint[] | No | Discovered entrypoints |
| `metadata` | CallgraphMetadata | No | Graph-level metadata |
| `id` | string | Yes | Unique graph identifier |
| `component` | string | No | Component name |
| `version` | string | No | Component version |
| `ingestedAt` | DateTimeOffset | No | Ingestion timestamp (ISO 8601) |
| `graphHash` | string | No | Content hash for deduplication |

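As a rough illustration of the table above, the following Python sketch builds a minimal document that carries only the required top-level fields and reports any that are missing. The helper name `missing_required_fields` and the identifier `graph-001` are hypothetical and are not part of the schema or of StellaOps.

```python
# Required top-level fields, taken from the Document Structure table.
REQUIRED_TOP_LEVEL = ("schema", "id", "nodes", "edges")


def missing_required_fields(doc: dict) -> list:
    """Return the required top-level fields that are absent from doc."""
    return [name for name in REQUIRED_TOP_LEVEL if name not in doc]


# Smallest document that satisfies the "Required" column.
minimal_doc = {
    "schema": "stella.callgraph.v1",
    "id": "graph-001",  # hypothetical identifier
    "nodes": [{"id": "n001", "name": "Main", "kind": "method"}],
    "edges": [],
}

print(missing_required_fields(minimal_doc))    # []
print(missing_required_fields({"nodes": []}))  # ['schema', 'id', 'edges']
```
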
### Legacy Fields

These fields are preserved for backward compatibility:

| Field | Type | Description |
|-------|------|-------------|
| `languageString` | string | Legacy language string |
| `roots` | CallgraphRoot[] | Legacy root/entrypoint representation |
| `schemaVersion` | string | Legacy schema version field |

## Enumerations

### CallgraphLanguage

Supported languages for call graph analysis:

| Value | Description |
|-------|-------------|
| `Unknown` | Language not determined |
| `DotNet` | .NET (C#, F#, VB.NET) |
| `Java` | Java and JVM languages |
| `Node` | Node.js / JavaScript / TypeScript |
| `Python` | Python |
| `Go` | Go |
| `Rust` | Rust |
| `Ruby` | Ruby |
| `Php` | PHP |
| `Binary` | Native binary (ELF, PE) |
| `Swift` | Swift |
| `Kotlin` | Kotlin |

### SymbolVisibility

Access visibility levels for symbols:

| Value | Description |
|-------|-------------|
| `Unknown` | Visibility not determined |
| `Public` | Publicly accessible |
| `Internal` | Internal to assembly/module |
| `Protected` | Protected (subclass accessible) |
| `Private` | Private to containing type |

### EdgeKind

Edge classification based on analysis confidence:

| Value | Description | Confidence |
|-------|-------------|------------|
| `Static` | Statically determined call | High |
| `Heuristic` | Heuristically inferred | Medium |
| `Runtime` | Runtime-observed edge | Highest |

### EdgeReason

Reason codes explaining why an edge exists (critical for explainability):

| Value | Description | Typical Kind |
|-------|-------------|--------------|
| `DirectCall` | Direct method/function call | Static |
| `VirtualCall` | Virtual/interface dispatch | Static |
| `ReflectionString` | Reflection-based invocation | Heuristic |
| `DiBinding` | Dependency injection binding | Heuristic |
| `DynamicImport` | Dynamic import/require | Heuristic |
| `NewObj` | Constructor/object instantiation | Static |
| `DelegateCreate` | Delegate/function pointer creation | Static |
| `AsyncContinuation` | Async/await continuation | Static |
| `EventHandler` | Event handler subscription | Heuristic |
| `GenericInstantiation` | Generic type instantiation | Static |
| `NativeInterop` | Native interop (P/Invoke, JNI, FFI) | Static |
| `RuntimeMinted` | Runtime-minted edge from execution | Runtime |
| `Unknown` | Reason could not be determined | - |

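The "Typical Kind" column can be read as a lookup from reason to expected `EdgeKind`. The Python sketch below encodes that column and flags edges whose `kind` differs from the typical one for their `reason`. It is only an illustration: the function name `atypical_edges` is hypothetical, the enum values are spelled as they appear in the tables, and nothing in the schema says that an atypical pairing is invalid.

```python
# Reason -> typical kind, copied from the EdgeReason table.
# "Unknown" has no typical kind, so it is left out.
TYPICAL_KIND = {
    "DirectCall": "Static",
    "VirtualCall": "Static",
    "ReflectionString": "Heuristic",
    "DiBinding": "Heuristic",
    "DynamicImport": "Heuristic",
    "NewObj": "Static",
    "DelegateCreate": "Static",
    "AsyncContinuation": "Static",
    "EventHandler": "Heuristic",
    "GenericInstantiation": "Static",
    "NativeInterop": "Static",
    "RuntimeMinted": "Runtime",
}


def atypical_edges(edges):
    """Return edges whose kind differs from the typical kind for their reason."""
    flagged = []
    for edge in edges:
        expected = TYPICAL_KIND.get(edge.get("reason"))
        kind = edge.get("kind")
        # Skip edges with no known reason or no kind at all.
        if expected is not None and kind is not None and kind != expected:
            flagged.append(edge)
    return flagged


edges = [
    {"sourceId": "n001", "targetId": "n002", "kind": "Static", "reason": "DirectCall"},
    {"sourceId": "n002", "targetId": "n003", "kind": "Static", "reason": "DiBinding"},
    {"sourceId": "n003", "targetId": "n004", "kind": "Heuristic", "reason": "Unknown"},
]
print([e["targetId"] for e in atypical_edges(edges)])  # ['n003']
```
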
### EntrypointKind

Types of entrypoints:

| Value | Description |
|-------|-------------|
| `Unknown` | Type not determined |
| `Http` | HTTP endpoint |
| `Grpc` | gRPC endpoint |
| `Cli` | CLI command handler |
| `Job` | Background job |
| `Event` | Event handler |
| `MessageQueue` | Message queue consumer |
| `Timer` | Timer/scheduled task |
| `Test` | Test method |
| `Main` | Main entry point |
| `ModuleInit` | Module initializer |
| `StaticConstructor` | Static constructor |

### EntrypointFramework

Frameworks that expose entrypoints:

| Value | Description | Language |
|-------|-------------|----------|
| `Unknown` | Framework not determined | - |
| `AspNetCore` | ASP.NET Core | DotNet |
| `MinimalApi` | ASP.NET Core Minimal APIs | DotNet |
| `Spring` | Spring Framework | Java |
| `SpringBoot` | Spring Boot | Java |
| `Express` | Express.js | Node |
| `Fastify` | Fastify | Node |
| `NestJs` | NestJS | Node |
| `FastApi` | FastAPI | Python |
| `Flask` | Flask | Python |
| `Django` | Django | Python |
| `Rails` | Ruby on Rails | Ruby |
| `Gin` | Gin | Go |
| `Echo` | Echo | Go |
| `Actix` | Actix Web | Rust |
| `Rocket` | Rocket | Rust |
| `AzureFunctions` | Azure Functions | Multi |
| `AwsLambda` | AWS Lambda | Multi |
| `CloudFunctions` | Google Cloud Functions | Multi |

### EntrypointPhase

Execution phase for entrypoints:

| Value | Description |
|-------|-------------|
| `ModuleInit` | Module/assembly initialization |
| `AppStart` | Application startup (Main) |
| `Runtime` | Runtime request handling |
| `Shutdown` | Shutdown/cleanup handlers |

## Node Structure

A `CallgraphNode` represents a symbol (method, function, type) in the call graph:

```json
{
  "id": "n001",
  "nodeId": "n001",
  "name": "GetWeatherForecast",
  "kind": "method",
  "namespace": "SampleApi.Controllers",
  "file": "WeatherForecastController.cs",
  "line": 15,
  "symbolKey": "SampleApi.Controllers.WeatherForecastController::GetWeatherForecast()",
  "artifactKey": "SampleApi.dll",
  "visibility": "Public",
  "isEntrypointCandidate": true,
  "attributes": {
    "returnType": "IEnumerable<WeatherForecast>",
    "httpMethod": "GET",
    "route": "/weatherforecast"
  },
  "flags": 3
}
```

### Node Fields

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `id` | string | Yes | Unique identifier within the graph |
| `nodeId` | string | No | Alias for id (v1 schema convention) |
| `name` | string | Yes | Human-readable symbol name |
| `kind` | string | Yes | Symbol kind (method, function, class) |
| `namespace` | string | No | Namespace or module path |
| `file` | string | No | Source file path |
| `line` | int | No | Source line number |
| `symbolKey` | string | No | Canonical symbol key (v1) |
| `artifactKey` | string | No | Reference to containing artifact |
| `visibility` | SymbolVisibility | No | Access visibility |
| `isEntrypointCandidate` | bool | No | Whether node is an entrypoint candidate |
| `purl` | string | No | Package URL for external packages |
| `symbolDigest` | string | No | Content-addressed symbol digest |
| `attributes` | object | No | Additional attributes |
| `flags` | int | No | Bitmask for efficient filtering |

### Symbol Key Format

The `symbolKey` follows a canonical format:

```
{Namespace}.{Type}[`Arity][+Nested]::{Method}[`Arity]({ParamTypes})
```

Examples:
- `System.String::Concat(string, string)`
- `MyApp.Controllers.UserController::GetUser(int)`
- ``System.Collections.Generic.List`1::Add(T)``

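To make the format concrete, here is a small Python sketch that assembles a key from its parts and reproduces the examples above. The function `format_symbol_key` and its parameter names are hypothetical, and joining parameter types with `", "` is an assumption taken from the first example rather than a documented rule.

```python
def format_symbol_key(namespace, type_name, method, param_types=(),
                      type_arity=0, nested=(), method_arity=0):
    """Build a key shaped like {Namespace}.{Type}[`Arity][+Nested]::{Method}[`Arity]({ParamTypes})."""
    type_part = f"{namespace}.{type_name}" if namespace else type_name
    if type_arity:
        type_part += f"`{type_arity}"      # generic arity of the type
    for nested_name in nested:
        type_part += f"+{nested_name}"     # nested types are joined with '+'
    method_part = method
    if method_arity:
        method_part += f"`{method_arity}"  # generic arity of the method
    return f"{type_part}::{method_part}({', '.join(param_types)})"


print(format_symbol_key("System", "String", "Concat", ["string", "string"]))
# System.String::Concat(string, string)
print(format_symbol_key("System.Collections.Generic", "List", "Add", ["T"], type_arity=1))
# System.Collections.Generic.List`1::Add(T)
```
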
## Edge Structure

A `CallgraphEdge` represents a call relationship between two symbols:

```json
{
  "sourceId": "n001",
  "targetId": "n002",
  "from": "n001",
  "to": "n002",
  "type": "call",
  "kind": "Static",
  "reason": "DirectCall",
  "weight": 1.0,
  "offset": 42,
  "isResolved": true,
  "provenance": "static-analysis"
}
```

### Edge Fields

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `sourceId` | string | Yes | Source node ID (caller) |
| `targetId` | string | Yes | Target node ID (callee) |
| `from` | string | No | Alias for sourceId (v1) |
| `to` | string | No | Alias for targetId (v1) |
| `type` | string | No | Legacy edge type |
| `kind` | EdgeKind | No | Edge classification |
| `reason` | EdgeReason | No | Reason for edge existence |
| `weight` | double | No | Confidence weight (0.0-1.0) |
| `offset` | int | No | IL/bytecode offset |
| `isResolved` | bool | No | Whether target was fully resolved |
| `provenance` | string | No | Provenance information |
| `candidates` | string[] | No | Virtual dispatch candidates |

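Because `from` and `to` are aliases for `sourceId` and `targetId`, a consumer may want to normalize edges before using them. The Python sketch below shows one possible way to do that. The function name `normalize_edge` is hypothetical, and letting `sourceId`/`targetId` win when both spellings are present is an assumption; the schema description does not state a precedence.

```python
def normalize_edge(edge: dict) -> dict:
    """Return a copy of edge with sourceId/targetId filled from from/to when absent."""
    normalized = dict(edge)
    # Assumption: the canonical names take precedence over their aliases.
    if "sourceId" not in normalized and "from" in normalized:
        normalized["sourceId"] = normalized["from"]
    if "targetId" not in normalized and "to" in normalized:
        normalized["targetId"] = normalized["to"]
    return normalized


alias_only = {"from": "n001", "to": "n002", "kind": "Static"}
result = normalize_edge(alias_only)
print(result["sourceId"], result["targetId"])  # n001 n002
```
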
## Entrypoint Structure

A `CallgraphEntrypoint` represents a discovered entrypoint:

```json
{
  "nodeId": "n001",
  "kind": "Http",
  "route": "/api/users/{id}",
  "httpMethod": "GET",
  "framework": "AspNetCore",
  "source": "attribute",
  "phase": "Runtime",
  "order": 0
}
```

### Entrypoint Fields

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `nodeId` | string | Yes | Reference to the node |
| `kind` | EntrypointKind | Yes | Type of entrypoint |
| `route` | string | No | HTTP route pattern |
| `httpMethod` | string | No | HTTP method (GET, POST, etc.) |
| `framework` | EntrypointFramework | No | Framework exposing the entrypoint |
| `source` | string | No | Discovery source |
| `phase` | EntrypointPhase | No | Execution phase |
| `order` | int | No | Deterministic ordering |

## Determinism Requirements

For reproducible analysis, call graphs must be deterministic:

1. **Stable Ordering**
   - Nodes must be sorted by `id` (ordinal string comparison)
   - Edges must be sorted by `sourceId`, then `targetId`
   - Entrypoints must be sorted by `order`

2. **Enum Serialization**
   - All enums serialize as camelCase strings
   - Example: `EdgeReason.DirectCall` → `"directCall"`

3. **Timestamps**
   - All timestamps must be UTC ISO 8601 format
   - Example: `2025-01-15T10:00:00Z`

4. **Content Hashing**
   - The `graphHash` field should contain a stable content hash
   - Hash algorithm: SHA-256
   - Format: `sha256:{hex-digest}`

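The ordering, enum, and hashing rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the StellaOps implementation: the function names are hypothetical, and the document does not specify which bytes are hashed, so the sketch assumes compact JSON with sorted keys, encoded as UTF-8, with the `graphHash` field itself excluded. Python compares strings by code point, which agrees with ordinal comparison for ASCII identifiers.

```python
import hashlib
import json


def canonicalize(doc: dict) -> dict:
    """Return a copy of doc with nodes, edges and entrypoints in stable order."""
    out = dict(doc)
    out["nodes"] = sorted(doc["nodes"], key=lambda n: n["id"])
    out["edges"] = sorted(doc["edges"], key=lambda e: (e["sourceId"], e["targetId"]))
    if "entrypoints" in doc:
        out["entrypoints"] = sorted(doc["entrypoints"], key=lambda e: e.get("order", 0))
    return out


def enum_to_wire(name: str) -> str:
    """Lower-case the first letter: 'DirectCall' -> 'directCall'."""
    return name[:1].lower() + name[1:]


def graph_hash(doc: dict) -> str:
    """Compute 'sha256:{hex-digest}' over an assumed canonical JSON form."""
    payload = {k: v for k, v in canonicalize(doc).items() if k != "graphHash"}
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":"),
                         ensure_ascii=False).encode("utf-8")
    return "sha256:" + hashlib.sha256(encoded).hexdigest()


nodes = [{"id": "n002", "name": "B", "kind": "method"},
         {"id": "n001", "name": "A", "kind": "method"}]
edges = [{"sourceId": "n002", "targetId": "n001"},
         {"sourceId": "n001", "targetId": "n002"}]
doc_a = {"schema": "stella.callgraph.v1", "id": "g1", "nodes": nodes, "edges": edges}
doc_b = {"schema": "stella.callgraph.v1", "id": "g1",
         "nodes": nodes[::-1], "edges": edges[::-1]}

print(graph_hash(doc_a) == graph_hash(doc_b))  # True
print(enum_to_wire("DirectCall"))              # directCall
```

Two documents that differ only in element order hash to the same value, which is the property that deduplication by `graphHash` relies on.
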
## Schema Migration

Legacy call graphs without the `schema` field are automatically migrated:

1. **Schema Field**: Set to `stella.callgraph.v1`
2. **Language Parsing**: String language converted to `CallgraphLanguage` enum
3. **Visibility Inference**: Inferred from symbol key patterns:
   - Contains `.Internal.` → `Internal`
   - Contains `._` or `<` → `Private`
   - Default → `Public`
4. **Edge Reason Inference**: Based on legacy `type` field:
   - `call`, `direct` → `DirectCall`
   - `virtual`, `callvirt` → `VirtualCall`
   - `newobj` → `NewObj`
   - etc.
5. **Entrypoint Inference**: Built from legacy `roots` and candidate nodes
6. **Symbol Key Generation**: Built from namespace and name if missing

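Steps 3 and 4 are simple pattern and lookup rules, sketched below in Python. This is not the code in `CallgraphSchemaMigrator.cs`; the function names are hypothetical. The sketch assumes the visibility patterns are checked in the order listed, that legacy `type` values are matched case-insensitively, and that unlisted values fall back to `Unknown`. Only the mappings spelled out above are included, since the remaining ones are elided in the list.

```python
def infer_visibility(symbol_key: str) -> str:
    """Infer SymbolVisibility from patterns in the symbol key (first match wins)."""
    if ".Internal." in symbol_key:
        return "Internal"
    if "._" in symbol_key or "<" in symbol_key:
        return "Private"
    return "Public"


# Only the legacy type values that the migration list names explicitly.
LEGACY_TYPE_TO_REASON = {
    "call": "DirectCall",
    "direct": "DirectCall",
    "virtual": "VirtualCall",
    "callvirt": "VirtualCall",
    "newobj": "NewObj",
}


def infer_edge_reason(legacy_type) -> str:
    """Map a legacy edge type to an EdgeReason; unlisted values give 'Unknown' (assumption)."""
    return LEGACY_TYPE_TO_REASON.get((legacy_type or "").lower(), "Unknown")


print(infer_visibility("MyApp.Internal.Helper::Run()"))                    # Internal
print(infer_visibility("MyApp.Program::<Main>b__0()"))                     # Private
print(infer_visibility("MyApp.Controllers.UserController::GetUser(int)"))  # Public
print(infer_edge_reason("callvirt"))                                       # VirtualCall
```
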
## Validation Rules

Call graphs are validated against these rules:

1. All node `id` values must be unique
2. All edge `sourceId` and `targetId` must reference existing nodes
3. All entrypoint `nodeId` must reference existing nodes
4. Edge `weight` must be between 0.0 and 1.0
5. Artifacts referenced by nodes must exist in the `artifacts` list

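A minimal Python sketch of rules 1 to 4 is shown below. The function name `validate_callgraph` and the error message wording are hypothetical, and the weight bounds are assumed to be inclusive. Rule 5 is left out because the shape of `CallgraphArtifact` is not described in this document.

```python
def validate_callgraph(doc: dict) -> list:
    """Return a list of human-readable violations of rules 1-4 (empty if none)."""
    errors = []

    # Rule 1: node ids must be unique.
    seen = set()
    for node in doc.get("nodes", []):
        if node["id"] in seen:
            errors.append(f"duplicate node id: {node['id']}")
        seen.add(node["id"])

    for index, edge in enumerate(doc.get("edges", [])):
        # Rule 2: both endpoints must reference existing nodes.
        for field in ("sourceId", "targetId"):
            if edge.get(field) not in seen:
                errors.append(f"edge {index}: {field} references unknown node {edge.get(field)!r}")
        # Rule 4: weight, when present, must lie in [0.0, 1.0] (inclusive bounds assumed).
        weight = edge.get("weight")
        if weight is not None and not 0.0 <= weight <= 1.0:
            errors.append(f"edge {index}: weight {weight} out of range")

    # Rule 3: entrypoints must reference existing nodes.
    for index, entrypoint in enumerate(doc.get("entrypoints", [])):
        if entrypoint.get("nodeId") not in seen:
            errors.append(f"entrypoint {index}: nodeId references unknown node {entrypoint.get('nodeId')!r}")

    return errors


bad_doc = {
    "nodes": [{"id": "n001"}, {"id": "n001"}],
    "edges": [{"sourceId": "n001", "targetId": "n999", "weight": 1.5}],
    "entrypoints": [{"nodeId": "n001", "kind": "Http"}],
}
for problem in validate_callgraph(bad_doc):
    print(problem)
```
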
## Golden Fixtures

Reference fixtures for testing are located at:
`tests/reachability/fixtures/callgraph-schema-v1/`

| Fixture | Description |
|---------|-------------|
| `dotnet-aspnetcore-minimal.json` | ASP.NET Core application |
| `java-spring-boot.json` | Spring Boot application |
| `node-express-api.json` | Express.js API |
| `go-gin-api.json` | Go Gin API |
| `legacy-no-schema.json` | Legacy format for migration testing |
| `all-edge-reasons.json` | All 13 edge reason codes |
| `all-visibility-levels.json` | All 5 visibility levels |

## Related Documentation

- [Reachability Analysis Technical Reference](../reachability/README.md)
- [Schema Migration Implementation](../../src/Signals/StellaOps.Signals/Parsing/CallgraphSchemaMigrator.cs)
- [SPRINT_1100: CallGraph Schema Enhancement](../implplan/SPRINT_1100_0001_0001_callgraph_schema_enhancement.md)