# Edge Explainability Schema

_Last updated: 2025-12-13. Owner: Scanner Guild + Policy Guild._

This document defines the edge explainability schema addressing gaps EG1-EG10 from the November 2025 product findings. It specifies the canonical format for call edge evidence, reason codes, confidence rubrics, and propagation into explanation graphs and VEX.

---

## 1. Overview

Edge explainability provides detailed rationale for each call edge in the reachability graph. Every edge includes:

- **Reason code:** Why this edge was detected (e.g., `bytecode-invoke`, `plt-stub`, `indirect-target`)
- **Confidence score:** Certainty of the edge's existence
- **Evidence sources:** Detectors and rules that contributed to edge discovery
- **Provenance:** Analyzer version, detection timestamp, and input artifacts

---

## 2. Gap Resolutions

### EG1: Reason Enum Governance

**Standard reason codes:**

| Code | Category | Description | Example |
|------|----------|-------------|---------|
| `bytecode-invoke` | Static | Bytecode invocation instruction | Java `invokevirtual`, .NET `call` |
| `bytecode-field` | Static | Field access leading to call | Static initializer |
| `import-symbol` | Static | Import table reference | ELF `.dynsym`, PE imports |
| `plt-stub` | Static | PLT/GOT indirection | `printf@plt` |
| `reloc-target` | Static | Relocation target | `.rela.dyn` entries |
| `indirect-target` | Heuristic | Indirect call target analysis | CFG-based |
| `init-array` | Static | Constructor/initializer array | `.init_array`, `DT_INIT` |
| `fini-array` | Static | Destructor/finalizer array | `.fini_array`, `DT_FINI` |
| `vtable-slot` | Heuristic | Virtual method dispatch | C++ vtable |
| `reflection-invoke` | Heuristic | Reflective method invocation | `Method.invoke()` |
| `runtime-observed` | Runtime | Runtime probe observation | JFR, eBPF |
| `user-annotated` | Manual | User-provided edge | Policy override |

**Governance rules:**

1. New reason codes require an RFC and review by the Scanner Guild
2. Deprecated codes remain valid for 2 major versions
3. Custom codes use the `custom:` prefix (e.g., `custom:my-analyzer`)
4. Codes are case-insensitive and are normalized to lowercase

**Code registry:**

```json
{
  "schema": "stellaops.edge.reason.registry@v1",
  "version": "2025-12-13",
  "reasons": [
    {
      "code": "bytecode-invoke",
      "category": "static",
      "description": "Bytecode invocation instruction",
      "languages": ["java", "dotnet"],
      "confidence_range": [0.9, 1.0],
      "deprecated": false
    }
  ]
}
```

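Governance rules 3 and 4 can be illustrated with a small normalizer. This is a minimal sketch, not the Scanner's implementation: the function name `normalize_reason` and the allowed character set for custom suffixes are assumptions made for illustration; only the standard codes, the `custom:` prefix, and the lowercase rule come from this document.

```python
import re

# Standard codes copied from the table in this section.
STANDARD_REASONS = frozenset({
    "bytecode-invoke", "bytecode-field", "import-symbol", "plt-stub",
    "reloc-target", "indirect-target", "init-array", "fini-array",
    "vtable-slot", "reflection-invoke", "runtime-observed", "user-annotated",
})

# Assumption: the document only fixes the `custom:` prefix, so the suffix
# pattern below is a hypothetical choice, not part of the schema.
_CUSTOM_SUFFIX = re.compile(r"[a-z0-9][a-z0-9._-]*")


def normalize_reason(raw: str) -> str:
    """Lowercase a reason code and check it is standard or `custom:`-prefixed."""
    code = raw.strip().lower()
    if code in STANDARD_REASONS:
        return code
    prefix, sep, suffix = code.partition(":")
    if prefix == "custom" and sep and _CUSTOM_SUFFIX.fullmatch(suffix):
        return code
    raise ValueError(f"unknown reason code: {raw!r}")
```

Rejecting unknown codes outright is one possible policy; a consumer that must tolerate newer registries could instead pass unknown codes through unchanged.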
### EG2: Canonical Edge Schema with Hash Rules

**Edge schema:**

```json
{
  "edge_id": "edge:sha256:{hex}",
  "from": "sym:java:...",
  "to": "sym:java:...",
  "kind": "call",
  "reason": "bytecode-invoke",
  "confidence": 0.95,
  "evidence": [
    {
      "source": "detector:java-bytecode-analyzer",
      "rule_id": "invoke-virtual",
      "rule_version": "1.0.0",
      "location": {
        "file": "com/example/Foo.class",
        "offset": 1234,
        "instruction": "invokevirtual #42"
      },
      "timestamp": "2025-12-13T10:00:00Z"
    }
  ],
  "attributes": {
    "virtual": true,
    "polymorphic_targets": 3
  }
}
```

**Hash computation:**

```
edge_id = "edge:sha256:" + lowercase_hex(sha256(
  canonical_json({
    "from": edge.from,
    "to": edge.to,
    "kind": edge.kind,
    "reason": edge.reason
  })
))
```

**Canonicalization:**

1. Use only `from`, `to`, `kind`, and `reason` for the hash (not confidence or evidence)
2. Sort JSON keys alphabetically
3. Emit no insignificant whitespace; encode as UTF-8
4. Render the digest as lowercase hex with a `sha256:` prefix, giving `edge:sha256:{hex}`

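The hash rules above can be sketched in a few lines. This is a hedged illustration rather than the reference implementation: `compute_edge_id` is a hypothetical name, and Python's `json.dumps` with sorted keys and compact separators stands in for `canonical_json`, since this document does not name a specific canonicalization standard.

```python
import hashlib
import json


def compute_edge_id(edge: dict) -> str:
    """Derive an edge_id from the four identity fields only."""
    # Rule 1: confidence, evidence, and attributes never enter the hash.
    identity = {key: edge[key] for key in ("from", "to", "kind", "reason")}
    # Rules 2 and 3: sorted keys, no insignificant whitespace, UTF-8 bytes.
    # Assumption: ensure_ascii=False keeps non-ASCII symbol names as raw UTF-8.
    canonical = json.dumps(
        identity, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    # Rule 4: hexdigest() is already lowercase hex.
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"edge:sha256:{digest}"
```

Because only the identity fields are hashed, re-scoring an edge or attaching more evidence leaves its `edge_id` unchanged, while changing the reason code produces a different edge.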
### EG3: Evidence Limits/Redaction

**Evidence limits:**

| Element | Default Limit | Configurable |
|---------|---------------|--------------|
| Evidence entries per edge | 10 | Yes |
| Location detail fields | 5 | Yes |
| Instruction preview length | 100 chars | Yes |
| File path depth | 10 segments | No |

**Redaction rules:**

| Category | Redaction | Example |
|----------|-----------|---------|
| File paths | Normalize | `/home/user/...` -> `{PROJECT}/...` |
| Bytecode offsets | Keep | Offsets are not PII |
| Instruction text | Truncate | First 100 chars |
| Source line content | Omit | Not included by default |

**Truncation behavior:**

```json
{
  "evidence_truncated": true,
  "evidence_count": 15,
  "evidence_shown": 10,
  "full_evidence_uri": "cas://edges/evidence/sha256:..."
}
```

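As an illustration of how the limits and the truncation marker fit together, the sketch below caps the evidence list and the instruction preview using the default limits from the table. The function name `truncate_evidence`, the placement of the marker fields next to the `evidence` array, and the caller-supplied `full_evidence_uri` are assumptions made for this example.

```python
from typing import Optional


def truncate_evidence(
    evidence: list,
    limit: int = 10,
    preview_chars: int = 100,
    full_evidence_uri: Optional[str] = None,
) -> dict:
    """Cap evidence entries and instruction previews, flagging any truncation."""
    shown = []
    for entry in evidence[:limit]:
        entry = dict(entry)  # copy so the caller's data is left untouched
        if "location" in entry:
            location = dict(entry["location"])
            if "instruction" in location:
                location["instruction"] = location["instruction"][:preview_chars]
            entry["location"] = location
        shown.append(entry)

    result = {"evidence": shown}
    if len(evidence) > limit:
        result["evidence_truncated"] = True
        result["evidence_count"] = len(evidence)
        result["evidence_shown"] = len(shown)
        if full_evidence_uri is not None:
            # Assumption: the caller already stored the full list and knows its URI.
            result["full_evidence_uri"] = full_evidence_uri
    return result
```

With 15 entries and the default limit, the result carries `evidence_count` 15 and `evidence_shown` 10, matching the example above.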
### EG4: Confidence Rubric

**Confidence scale:**

| Level | Range | Description | Typical Sources |
|-------|-------|-------------|-----------------|
| `certain` | 1.0 | Definite edge | Direct bytecode invoke |
| `high` | 0.85-0.99 | Very likely | Import table, PLT |
| `medium` | 0.5-0.84 | Probable | Indirect analysis, vtable |
| `low` | 0.2-0.49 | Possible | Heuristic carving |
| `unknown` | 0.0-0.19 | Speculative | User annotation, fallback |

**Confidence computation:**

```
edge.confidence = base_confidence(reason) * evidence_boost(evidence_count) * target_resolution_factor
```

**Base confidence by reason:**

| Reason | Base Confidence |
|--------|-----------------|
| `bytecode-invoke` | 0.98 |
| `import-symbol` | 0.95 |
| `plt-stub` | 0.92 |
| `reloc-target` | 0.90 |
| `init-array` | 0.95 |
| `vtable-slot` | 0.75 |
| `indirect-target` | 0.60 |
| `reflection-invoke` | 0.50 |
| `runtime-observed` | 0.99 |
| `user-annotated` | 0.80 |

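The rubric can be sketched as a lookup plus a multiplication. This document does not define `evidence_boost` or `target_resolution_factor`, so the sketch below takes both as plain numbers supplied by the caller. Clamping the product to [0, 1] and treating each level's lower bound as inclusive (which closes the small gaps between the listed ranges) are assumptions of this example.

```python
# Base values copied from the table in this section.
BASE_CONFIDENCE = {
    "bytecode-invoke": 0.98,
    "import-symbol": 0.95,
    "plt-stub": 0.92,
    "reloc-target": 0.90,
    "init-array": 0.95,
    "vtable-slot": 0.75,
    "indirect-target": 0.60,
    "reflection-invoke": 0.50,
    "runtime-observed": 0.99,
    "user-annotated": 0.80,
}


def edge_confidence(
    reason: str,
    evidence_boost: float = 1.0,
    target_resolution_factor: float = 1.0,
) -> float:
    """Multiply the three factors; clamp to [0, 1] (the clamp is an assumption)."""
    score = BASE_CONFIDENCE[reason] * evidence_boost * target_resolution_factor
    return max(0.0, min(1.0, score))


def confidence_level(score: float) -> str:
    """Map a numeric score onto the named levels of the confidence scale."""
    if score >= 1.0:
        return "certain"
    if score >= 0.85:
        return "high"
    if score >= 0.5:
        return "medium"
    if score >= 0.2:
        return "low"
    return "unknown"
```

For example, `confidence_level(edge_confidence("vtable-slot"))` yields `medium`, in line with the "Indirect analysis, vtable" row of the scale.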
### EG5: Detector/Rule Provenance

**Provenance schema:**

```json
{
  "provenance": {
    "analyzer": {
      "name": "scanner.java",
      "version": "1.2.0",
      "digest": "sha256:..."
    },
    "detector": {
      "name": "java-bytecode-analyzer",
      "version": "2.0.0",
      "rule_set": "default"
    },
    "rule": {
      "id": "invoke-virtual",
      "version": "1.0.0",
      "description": "Detect invokevirtual bytecode instructions"
    },
    "input_artifacts": [
      {"type": "jar", "digest": "sha256:...", "path": "lib/app.jar"}
    ],
    "detected_at": "2025-12-13T10:00:00Z"
  }
}
```

**Provenance requirements:**

1. All edges must include analyzer provenance
2. Detector/rule provenance required for non-runtime edges
3. Input artifact digests enable reproducibility
4. Detection timestamp uses UTC ISO-8601

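The four requirements above lend themselves to a small validation pass. The following is a sketch under stated assumptions: `provenance_problems` is a hypothetical helper, "non-runtime" is read as any reason other than `runtime-observed`, and the function inspects the inner `provenance` object from the schema above.

```python
from datetime import datetime, timedelta


def provenance_problems(reason: str, provenance: dict) -> list:
    """Return human-readable violations of the provenance requirements."""
    problems = []
    # Requirement 1: analyzer provenance is always mandatory.
    if "analyzer" not in provenance:
        problems.append("missing analyzer provenance")
    # Requirement 2 (assumed reading): only runtime-observed edges may omit these.
    if reason != "runtime-observed":
        for section in ("detector", "rule"):
            if section not in provenance:
                problems.append(f"missing {section} provenance")
    # Requirement 3: every listed input artifact should carry a digest.
    for artifact in provenance.get("input_artifacts", []):
        if not artifact.get("digest"):
            problems.append(f"input artifact without digest: {artifact.get('path')}")
    # Requirement 4: timestamp must parse as ISO-8601 and be in UTC.
    detected_at = provenance.get("detected_at", "")
    try:
        # Older Python versions reject a trailing "Z", so rewrite it first.
        parsed = datetime.fromisoformat(detected_at.replace("Z", "+00:00"))
        if parsed.utcoffset() != timedelta(0):
            problems.append("detected_at is not UTC")
    except ValueError:
        problems.append("detected_at is not ISO-8601")
    return problems
```

An empty list means the provenance block satisfies all four checks as interpreted here.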
### EG6: API/CLI Parity

**API endpoints:**

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/api/edges/{edgeId}` | Get edge details |
| `GET` | `/api/edges?graph_hash=...` | List edges for graph |
| `GET` | `/api/edges/{edgeId}/evidence` | Get full evidence |
| `POST` | `/api/edges/search` | Search edges by criteria |

**CLI commands:**

```bash
# List edges for a graph
stella edge list --graph blake3:a1b2c3d4...

# Get edge details
stella edge show --id edge:sha256:...

# Search edges
stella edge search --from "sym:java:..." --reason bytecode-invoke

# Export edges
stella edge export --graph blake3:... --output ./edges.ndjson
```

**Output parity:**

- API and CLI return identical JSON structure
- CLI supports `--json` for machine-readable output
- Both support filtering by reason, confidence, from/to

### EG7: Deterministic Fixtures

**Fixture location:**

```
tests/Edge/
  fixtures/
    bytecode-invoke.json
    plt-stub.json
    vtable-dispatch.json
    init-array-constructor.json
    runtime-observed.json
  golden/
    bytecode-invoke.golden.json
    graph-with-edges.golden.json

datasets/edges/
  schema/
    edge.schema.json
    reason-registry.json
  samples/
    java-spring-boot/
      edges.ndjson
      expected-hashes.txt
```

**Fixture requirements:**

1. Each reason code has at least one fixture
2. Fixtures include expected `edge_id` hash
3. Golden outputs frozen after review
4. CI verifies hash stability

### EG8: Propagation into Explanation Graphs/VEX

**Explanation graph inclusion:**

```json
{
  "explanation": {
    "path": [
      {
        "node": "sym:java:main...",
        "outgoing_edge": {
          "edge_id": "edge:sha256:...",
          "to": "sym:java:handler...",
          "reason": "bytecode-invoke",
          "confidence": 0.98
        }
      },
      {
        "node": "sym:java:handler...",
        "outgoing_edge": {
          "edge_id": "edge:sha256:...",
          "to": "sym:java:log4j...",
          "reason": "bytecode-invoke",
          "confidence": 0.95
        }
      }
    ],
    "aggregate_path_confidence": 0.93
  }
}
```

**VEX evidence format:**

```json
{
  "stellaops:reachability": {
    "path_edges": [
      {"edge_id": "edge:sha256:...", "reason": "bytecode-invoke", "confidence": 0.98},
      {"edge_id": "edge:sha256:...", "reason": "bytecode-invoke", "confidence": 0.95}
    ],
    "weakest_edge": {
      "edge_id": "edge:sha256:...",
      "reason": "bytecode-invoke",
      "confidence": 0.95
    },
    "aggregate_confidence": 0.93
  }
}
```

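This document does not state how the aggregate is derived, but the sample values are consistent with multiplying the per-edge confidences (0.98 × 0.95 = 0.931, shown as 0.93) and taking the lowest-confidence edge as `weakest_edge`. The sketch below assumes exactly that rule and a two-decimal rounding; `summarize_path` is a hypothetical helper name.

```python
import math


def summarize_path(path_edges: list) -> dict:
    """Build the reachability summary from a non-empty list of path edges."""
    # Assumption: the weakest edge is the one with the lowest confidence.
    weakest = min(path_edges, key=lambda edge: edge["confidence"])
    # Assumption: the aggregate is the product of edge confidences, rounded to 2 places.
    aggregate = math.prod(edge["confidence"] for edge in path_edges)
    return {
        "path_edges": path_edges,
        "weakest_edge": weakest,
        "aggregate_confidence": round(aggregate, 2),
    }
```

Under a product rule, longer paths can only lower the aggregate, so a single medium-confidence edge bounds the confidence of every path that passes through it.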
### EG9: Localization Guidance

**Localizable elements:**

| Element | Localization | Example |
|---------|--------------|---------|
| Reason code display | Message catalog | `bytecode-invoke` -> "Bytecode method call" |
| Confidence level | Message catalog | `high` -> "High confidence" |
| Evidence descriptions | Template | "Detected at offset {offset} in {file}" |
| Error messages | Message catalog | Standard error codes |

**Message catalog structure:**

```json
{
  "locale": "en-US",
  "messages": {
    "edge.reason.bytecode-invoke": "Bytecode method call",
    "edge.reason.plt-stub": "PLT/GOT library call",
    "edge.confidence.high": "High confidence ({0:P0})",
    "edge.evidence.location": "Detected at offset {offset} in {file}"
  }
}
```

**Supported locales:**

- `en-US` (default)
- Additional locales via contribution

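A catalog lookup with named placeholders could be sketched as follows. The `localize` helper and its fall-back-to-the-raw-code behaviour are assumptions for illustration. The `edge.confidence.high` entry is left out of the sample catalog because its `{0:P0}` placeholder is a .NET-style format specifier that Python's `str.format` does not accept.

```python
# Subset of the catalog above, limited to entries with named placeholders.
CATALOG = {
    "locale": "en-US",
    "messages": {
        "edge.reason.bytecode-invoke": "Bytecode method call",
        "edge.reason.plt-stub": "PLT/GOT library call",
        "edge.evidence.location": "Detected at offset {offset} in {file}",
    },
}


def localize(catalog: dict, key: str, fallback: str, **params) -> str:
    """Render a catalog message, or return the fallback if the key is missing."""
    template = catalog["messages"].get(key)
    if template is None:
        # Assumption: an untranslated key degrades to the caller's fallback text.
        return fallback
    return template.format(**params)
```

For instance, `localize(CATALOG, "edge.evidence.location", "", offset=1234, file="com/example/Foo.class")` renders the evidence template with the location values from the EG2 sample edge.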
### EG10: Backfill Plan

**Backfill strategy:**

1. **Phase 1:** Add reason codes to new edges (no backfill needed)
2. **Phase 2:** Run detector upgrade on graphs without reason codes
3. **Phase 3:** Mark old graphs as `requires_reanalysis` in metadata

**Migration script:**

```bash
# Preview the changes without applying them
stella edge backfill --graph blake3:... --dry-run

# Output:
# Graph: blake3:a1b2c3d4...
# Edges without reason: 1234
# Edges to update: 1234
#
# Dry run - no changes made.

# Execute:
stella edge backfill --graph blake3:... --execute
```

**Backfill metadata:**

```json
{
  "backfill": {
    "status": "complete",
    "original_analyzer_version": "1.0.0",
    "backfill_analyzer_version": "1.2.0",
    "backfilled_at": "2025-12-13T10:00:00Z",
    "edges_updated": 1234
  }
}
```

---

## 3. Related Documentation

- [richgraph-v1 Contract](../contracts/richgraph-v1.md) - Graph schema specification
- [Function-Level Evidence](./function-level-evidence.md) - Evidence chain guide
- [Explainability Schema](./explainability-schema.md) - Explanation format
- [Hybrid Attestation](./hybrid-attestation.md) - Edge bundle DSSE

---

_Last updated: 2025-12-13. See Sprint 0401 EDGE-GAPS-401-065 for change history._