Edge Explainability Schema

Last updated: 2025-12-13. Owner: Scanner Guild + Policy Guild.

This document defines the edge explainability schema addressing gaps EG1-EG10 from the November 2025 product findings. It specifies the canonical format for call edge evidence, reason codes, confidence rubrics, and propagation into explanation graphs and VEX.

1. Overview

Edge explainability provides detailed rationale for each call edge in the reachability graph. Every edge includes:

Reason code: Why this edge was detected (e.g., bytecode-invoke, plt-stub, indirect-target)
Confidence score: Certainty that the edge exists, from 0.0 to 1.0
Evidence sources: Detectors and rules that contributed to edge discovery
Provenance: Analyzer version, detection timestamp, and input artifacts

2. Gap Resolutions

EG1: Reason Enum Governance

Standard reason codes:

Code	Category	Description	Example
`bytecode-invoke`	Static	Bytecode invocation instruction	Java `invokevirtual`, .NET `call`
`bytecode-field`	Static	Field access leading to call	Static initializer
`import-symbol`	Static	Import table reference	ELF `.dynsym`, PE imports
`plt-stub`	Static	PLT/GOT indirection	`printf@plt`
`reloc-target`	Static	Relocation target	`.rela.dyn` entries
`indirect-target`	Heuristic	Indirect call target analysis	CFG-based
`init-array`	Static	Constructor/initializer array	`.init_array`, `DT_INIT`
`fini-array`	Static	Destructor/finalizer array	`.fini_array`, `DT_FINI`
`vtable-slot`	Heuristic	Virtual method dispatch	C++ vtable
`reflection-invoke`	Heuristic	Reflective method invocation	`Method.invoke()`
`runtime-observed`	Runtime	Runtime probe observation	JFR, eBPF
`user-annotated`	Manual	User-provided edge	Policy override

Governance rules:

New reason codes require an RFC and review by the Scanner Guild
Deprecated codes remain valid for 2 major versions
Custom codes use the custom: prefix (e.g., custom:my-analyzer)
Codes are case-insensitive, normalized to lowercase
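
The normalization and custom-prefix rules above can be sketched as a small validator. This is a minimal illustration, not the registry implementation: the function name `normalize_reason` and the allowed character set for custom suffixes are assumptions that the rules above do not specify.

```python
import re

# Standard codes copied from the reason-code table above.
STANDARD_REASONS = frozenset({
    "bytecode-invoke", "bytecode-field", "import-symbol", "plt-stub",
    "reloc-target", "indirect-target", "init-array", "fini-array",
    "vtable-slot", "reflection-invoke", "runtime-observed", "user-annotated",
})

CUSTOM_PREFIX = "custom:"
# Assumption: custom suffixes are limited to lowercase letters, digits, hyphens.
_CUSTOM_SUFFIX = re.compile(r"[a-z0-9][a-z0-9-]*")


def normalize_reason(code: str) -> str:
    """Return the lowercase form of a reason code, or raise ValueError."""
    normalized = code.strip().lower()
    if normalized in STANDARD_REASONS:
        return normalized
    if normalized.startswith(CUSTOM_PREFIX):
        suffix = normalized[len(CUSTOM_PREFIX):]
        if _CUSTOM_SUFFIX.fullmatch(suffix):
            return normalized
    raise ValueError(f"unknown reason code: {code!r}")
```

Deprecated codes are not modeled here; a fuller version would presumably consult the `deprecated` flag in the registry shown next.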

Code registry:

{
  "schema": "stellaops.edge.reason.registry@v1",
  "version": "2025-12-13",
  "reasons": [
    {
      "code": "bytecode-invoke",
      "category": "static",
      "description": "Bytecode invocation instruction",
      "languages": ["java", "dotnet"],
      "confidence_range": [0.9, 1.0],
      "deprecated": false
    }
  ]
}

EG2: Canonical Edge Schema with Hash Rules

Edge schema:

{
  "edge_id": "edge:sha256:{hex}",
  "from": "sym:java:...",
  "to": "sym:java:...",
  "kind": "call",
  "reason": "bytecode-invoke",
  "confidence": 0.95,
  "evidence": [
    {
      "source": "detector:java-bytecode-analyzer",
      "rule_id": "invoke-virtual",
      "rule_version": "1.0.0",
      "location": {
        "file": "com/example/Foo.class",
        "offset": 1234,
        "instruction": "invokevirtual #42"
      },
      "timestamp": "2025-12-13T10:00:00Z"
    }
  ],
  "attributes": {
    "virtual": true,
    "polymorphic_targets": 3
  }
}

Hash computation:

edge_id = "edge:sha256:" + lowercase_hex(sha256(
  canonical_json({
    "from": edge.from,
    "to": edge.to,
    "kind": edge.kind,
    "reason": edge.reason
  })
))

Canonicalization:

Use only from, to, kind, reason for hash (not confidence or evidence)
Sort JSON keys alphabetically
No whitespace, UTF-8 encoding
Hash digest is lowercase hex with sha256: prefix, giving edge:sha256:{hex}
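
As a sketch of these rules, the edge ID can be computed in Python as follows. The function name `compute_edge_id` is hypothetical, and `json.dumps` with sorted keys and compact separators stands in for the canonical JSON step; details the rules above leave open (for example, string escaping) may differ in the real implementation. The `edge:sha256:{hex}` layout follows the edge schema example.

```python
import hashlib
import json


def compute_edge_id(edge: dict) -> str:
    """Derive an edge ID from the four identity fields only."""
    # Confidence, evidence, and attributes are deliberately excluded.
    identity = {key: edge[key] for key in ("from", "to", "kind", "reason")}
    # Sorted keys, no whitespace, non-ASCII kept as-is for UTF-8 encoding.
    canonical = json.dumps(
        identity, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"edge:sha256:{digest}"
```

Because confidence and evidence are left out of the hash input, re-scoring an edge or attaching more evidence does not change its ID.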

EG3: Evidence Limits/Redaction

Evidence limits:

Element	Default Limit	Configurable
Evidence entries per edge	10	Yes
Location detail fields	5	Yes
Instruction preview length	100 chars	Yes
File path depth	10 segments	No

Redaction rules:

Category	Redaction	Example
File paths	Normalize	`/home/user/...` -> `{PROJECT}/...`
Bytecode offsets	Keep	Offsets are not PII
Instruction text	Truncate	First 100 chars
Source line content	Omit	Not included by default

Truncation behavior:

{
  "evidence_truncated": true,
  "evidence_count": 15,
  "evidence_shown": 10,
  "full_evidence_uri": "cas://edges/evidence/sha256:..."
}
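
One way to apply the default limits and emit the truncation fields is sketched below. The function name `limit_evidence` and the decision to return the kept entries under an `evidence` key alongside the truncation fields are assumptions; the caller is expected to supply the `full_evidence_uri`, since how that URI is produced is not described here.

```python
from typing import Any

DEFAULT_MAX_EVIDENCE = 10      # "Evidence entries per edge" default
DEFAULT_MAX_INSTRUCTION = 100  # "Instruction preview length" default


def limit_evidence(
    evidence: list[dict[str, Any]],
    full_evidence_uri: str,
    max_entries: int = DEFAULT_MAX_EVIDENCE,
    max_instruction: int = DEFAULT_MAX_INSTRUCTION,
) -> dict[str, Any]:
    """Cap the evidence list and truncate instruction previews."""
    shown = []
    for entry in evidence[:max_entries]:
        entry = dict(entry)  # shallow copy so the input is not mutated
        if "location" in entry:
            location = dict(entry["location"])
            if "instruction" in location:
                location["instruction"] = location["instruction"][:max_instruction]
            entry["location"] = location
        shown.append(entry)

    result: dict[str, Any] = {"evidence": shown}
    if len(evidence) > max_entries:
        # Only emit the truncation fields when entries were actually dropped.
        result.update({
            "evidence_truncated": True,
            "evidence_count": len(evidence),
            "evidence_shown": len(shown),
            "full_evidence_uri": full_evidence_uri,
        })
    return result
```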

EG4: Confidence Rubric

Confidence scale:

Level	Range	Description	Typical Sources
`certain`	1.0	Definite edge	Direct bytecode invoke
`high`	0.85-0.99	Very likely	Import table, PLT
`medium`	0.5-0.84	Probable	Indirect analysis, vtable
`low`	0.2-0.49	Possible	Heuristic carving
`unknown`	0.0-0.19	Speculative	User annotation, fallback

Confidence computation:

edge.confidence = base_confidence(reason) * evidence_boost(evidence_count) * target_resolution_factor

Base confidence by reason:

Reason	Base Confidence
`bytecode-invoke`	0.98
`import-symbol`	0.95
`plt-stub`	0.92
`reloc-target`	0.90
`init-array`	0.95
`vtable-slot`	0.75
`indirect-target`	0.60
`reflection-invoke`	0.50
`runtime-observed`	0.99
`user-annotated`	0.80
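
The formula and rubric can be combined into a short sketch. The definitions of `evidence_boost` and `target_resolution_factor` are not given in this document, so they are taken here as plain multipliers supplied by the caller. Clamping the product to [0, 1] and treating the level ranges as lower-bound thresholds (so that values such as 0.995 still map to a level) are assumptions as well.

```python
# Base confidence values copied from the table above. Reason codes that the
# table does not list (e.g. bytecode-field, fini-array) raise KeyError.
BASE_CONFIDENCE = {
    "bytecode-invoke": 0.98,
    "import-symbol": 0.95,
    "plt-stub": 0.92,
    "reloc-target": 0.90,
    "init-array": 0.95,
    "vtable-slot": 0.75,
    "indirect-target": 0.60,
    "reflection-invoke": 0.50,
    "runtime-observed": 0.99,
    "user-annotated": 0.80,
}


def edge_confidence(
    reason: str,
    evidence_boost: float = 1.0,
    target_resolution_factor: float = 1.0,
) -> float:
    """base_confidence * evidence_boost * target_resolution_factor, clamped."""
    score = BASE_CONFIDENCE[reason] * evidence_boost * target_resolution_factor
    return max(0.0, min(1.0, score))  # assumption: scores stay within [0, 1]


def confidence_level(score: float) -> str:
    """Map a numeric score onto the rubric levels."""
    if score >= 1.0:
        return "certain"
    if score >= 0.85:
        return "high"
    if score >= 0.5:
        return "medium"
    if score >= 0.2:
        return "low"
    return "unknown"
```

With both multipliers left at 1.0, `bytecode-invoke` scores 0.98 and lands in the `high` band, while `indirect-target` scores 0.60 and lands in `medium`.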

EG5: Detector/Rule Provenance

Provenance schema:

{
  "provenance": {
    "analyzer": {
      "name": "scanner.java",
      "version": "1.2.0",
      "digest": "sha256:..."
    },
    "detector": {
      "name": "java-bytecode-analyzer",
      "version": "2.0.0",
      "rule_set": "default"
    },
    "rule": {
      "id": "invoke-virtual",
      "version": "1.0.0",
      "description": "Detect invokevirtual bytecode instructions"
    },
    "input_artifacts": [
      {"type": "jar", "digest": "sha256:...", "path": "lib/app.jar"}
    ],
    "detected_at": "2025-12-13T10:00:00Z"
  }
}

Provenance requirements:

All edges must include analyzer provenance
Detector/rule provenance required for non-runtime edges
Input artifact digests enable reproducibility
Detection timestamp uses UTC ISO-8601
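
A validator for these requirements might look like the sketch below. The function name `provenance_errors` is hypothetical, and treating `runtime-observed` as the only runtime reason follows the category column of the reason-code table rather than an explicit rule stated here.

```python
from datetime import datetime, timedelta


def provenance_errors(provenance: dict, reason: str) -> list[str]:
    """Return a list of requirement violations (empty when valid)."""
    errors = []
    if not provenance.get("analyzer"):
        errors.append("missing analyzer provenance")

    # Assumption: "runtime edges" are those with reason runtime-observed.
    if reason != "runtime-observed":
        for key in ("detector", "rule"):
            if not provenance.get(key):
                errors.append(f"missing {key} provenance")

    detected_at = str(provenance.get("detected_at", ""))
    try:
        # Older fromisoformat versions reject a trailing "Z", so map it first.
        parsed = datetime.fromisoformat(detected_at.replace("Z", "+00:00"))
    except ValueError:
        errors.append("detected_at is not ISO-8601")
    else:
        # Naive timestamps have no offset and are rejected as non-UTC.
        if parsed.utcoffset() != timedelta(0):
            errors.append("detected_at is not UTC")
    return errors
```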

EG6: API/CLI Parity

API endpoints:

Method	Path	Description
`GET`	`/api/edges/{edgeId}`	Get edge details
`GET`	`/api/edges?graph_hash=...`	List edges for graph
`GET`	`/api/edges/{edgeId}/evidence`	Get full evidence
`POST`	`/api/edges/search`	Search edges by criteria

CLI commands:

# List edges for a graph
stella edge list --graph blake3:a1b2c3d4...

# Get edge details
stella edge show --id edge:sha256:...

# Search edges
stella edge search --from "sym:java:..." --reason bytecode-invoke

# Export edges
stella edge export --graph blake3:... --output ./edges.ndjson

Output parity:

API and CLI return identical JSON structure
CLI supports --json for machine-readable output
Both support filtering by reason, confidence, from/to
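
For offline work on an exported file, the same filters can be approximated client-side. This sketch assumes each NDJSON line is one edge object in the EG2 schema, which the export command above does not confirm; the function name `filter_edges` and its parameter names are hypothetical and do not mirror actual API or CLI flags.

```python
import json
from typing import Iterable, Iterator, Optional


def filter_edges(
    lines: Iterable[str],
    reason: Optional[str] = None,
    min_confidence: float = 0.0,
    from_symbol: Optional[str] = None,
    to_symbol: Optional[str] = None,
) -> Iterator[dict]:
    """Yield edges from NDJSON lines that match every given filter."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        edge = json.loads(line)
        if reason is not None and edge.get("reason") != reason:
            continue
        if edge.get("confidence", 0.0) < min_confidence:
            continue
        if from_symbol is not None and edge.get("from") != from_symbol:
            continue
        if to_symbol is not None and edge.get("to") != to_symbol:
            continue
        yield edge
```

Since an open file iterates line by line, `filter_edges(open("edges.ndjson"), reason="plt-stub")` would stream matches without loading the whole export.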

EG7: Deterministic Fixtures

Fixture location:

tests/Edge/
  fixtures/
    bytecode-invoke.json
    plt-stub.json
    vtable-dispatch.json
    init-array-constructor.json
    runtime-observed.json
  golden/
    bytecode-invoke.golden.json
    graph-with-edges.golden.json

datasets/edges/
  schema/
    edge.schema.json
    reason-registry.json
  samples/
    java-spring-boot/
      edges.ndjson
      expected-hashes.txt

Fixture requirements:

Each reason code has at least one fixture
Fixtures include expected edge_id hash
Golden outputs frozen after review
CI verifies hash stability

EG8: Propagation into Explanation Graphs/VEX

Explanation graph inclusion:

{
  "explanation": {
    "path": [
      {
        "node": "sym:java:main...",
        "outgoing_edge": {
          "edge_id": "edge:sha256:...",
          "to": "sym:java:handler...",
          "reason": "bytecode-invoke",
          "confidence": 0.98
        }
      },
      {
        "node": "sym:java:handler...",
        "outgoing_edge": {
          "edge_id": "edge:sha256:...",
          "to": "sym:java:log4j...",
          "reason": "bytecode-invoke",
          "confidence": 0.95
        }
      }
    ],
    "aggregate_path_confidence": 0.93
  }
}

VEX evidence format:

{
  "stellaops:reachability": {
    "path_edges": [
      {"edge_id": "edge:sha256:...", "reason": "bytecode-invoke", "confidence": 0.98},
      {"edge_id": "edge:sha256:...", "reason": "bytecode-invoke", "confidence": 0.95}
    ],
    "weakest_edge": {
      "edge_id": "edge:sha256:...",
      "reason": "bytecode-invoke",
      "confidence": 0.95
    },
    "aggregate_confidence": 0.93
  }
}
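
The aggregation rule is not stated explicitly, but the example values are consistent with multiplying per-edge confidences (0.98 × 0.95 = 0.931, shown as 0.93) and taking the minimum-confidence edge as `weakest_edge`. Under that assumption, the `stellaops:reachability` payload could be assembled as in this sketch; the function name `summarize_path` and the rounding to two decimals are illustrative choices.

```python
import math


def summarize_path(path_edges: list[dict]) -> dict:
    """Build the reachability summary for a non-empty list of path edges."""
    if not path_edges:
        raise ValueError("path must contain at least one edge")
    # The first edge with the lowest confidence is reported as the weakest.
    weakest = min(path_edges, key=lambda edge: edge["confidence"])
    # Assumption: path confidence is the product of edge confidences.
    aggregate = math.prod(edge["confidence"] for edge in path_edges)
    return {
        "path_edges": list(path_edges),
        "weakest_edge": weakest,
        "aggregate_confidence": round(aggregate, 2),
    }
```

A product penalizes long paths, since every additional uncertain edge lowers the aggregate, whereas the weakest edge points reviewers at the single link most worth verifying.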

EG9: Localization Guidance

Localizable elements:

Element	Localization	Example
Reason code display	Message catalog	`bytecode-invoke` -> "Bytecode method call"
Confidence level	Message catalog	`high` -> "High confidence"
Evidence descriptions	Template	"Detected at offset {offset} in {file}"
Error messages	Message catalog	Standard error codes

Message catalog structure:

{
  "locale": "en-US",
  "messages": {
    "edge.reason.bytecode-invoke": "Bytecode method call",
    "edge.reason.plt-stub": "PLT/GOT library call",
    "edge.confidence.high": "High confidence ({0:P0})",
    "edge.evidence.location": "Detected at offset {offset} in {file}"
  }
}
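
A catalog lookup with fallback to the default locale can be sketched as follows. The function name `localize` and the fallback order (requested locale, then en-US, then the raw key) are assumptions. The `{0:P0}` placeholder in `edge.confidence.high` is a .NET percent format specifier that Python's `str.format` cannot interpret, so that entry is left out of this sketch.

```python
DEFAULT_LOCALE = "en-US"

# Entries copied from the catalog above, minus the .NET-formatted one.
CATALOGS = {
    "en-US": {
        "edge.reason.bytecode-invoke": "Bytecode method call",
        "edge.reason.plt-stub": "PLT/GOT library call",
        "edge.evidence.location": "Detected at offset {offset} in {file}",
    },
}


def localize(key: str, locale: str = DEFAULT_LOCALE, **params: object) -> str:
    """Resolve a message key, falling back to en-US, then to the key itself."""
    template = CATALOGS.get(locale, {}).get(key)
    if template is None:
        template = CATALOGS[DEFAULT_LOCALE].get(key)
    if template is None:
        return key  # unknown keys are shown raw rather than raising
    return template.format(**params)
```

For example, `localize("edge.evidence.location", offset=1234, file="com/example/Foo.class")` fills the named placeholders from the evidence location.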

Supported locales:

en-US (default)
Additional locales via contribution

EG10: Backfill Plan

Backfill strategy:

Phase 1: Add reason codes to new edges (no backfill needed)
Phase 2: Run detector upgrade on graphs without reason codes
Phase 3: Mark old graphs as requires_reanalysis in metadata

Migration script:

# Preview (dry run):
stella edge backfill --graph blake3:... --dry-run

# Output:
#   Graph: blake3:a1b2c3d4...
#   Edges without reason: 1234
#   Edges to update: 1234
#
#   Dry run - no changes made.

# Execute:
stella edge backfill --graph blake3:... --execute

Backfill metadata:

{
  "backfill": {
    "status": "complete",
    "original_analyzer_version": "1.0.0",
    "backfill_analyzer_version": "1.2.0",
    "backfilled_at": "2025-12-13T10:00:00Z",
    "edges_updated": 1234
  }
}

3. Related Documentation

richgraph-v1 Contract - Graph schema specification
Function-Level Evidence - Evidence chain guide
Explainability Schema - Explanation format
Hybrid Attestation - Edge bundle DSSE

Last updated: 2025-12-13. See Sprint 0401 EDGE-GAPS-401-065 for change history.
