# Binary Reachability Schema

_Last updated: 2025-12-13. Owner: Scanner Guild + Attestor Guild._

This document defines the binary reachability schema addressing gaps BR1-BR10 from the November 2025 product findings. It specifies DSSE predicate formats, edge hash recipes, binary evidence requirements, build-id handling, and Sigstore integration.

---

## 1. Overview

Binary reachability extends the function-level evidence chain to native executables (ELF, PE, Mach-O). Key challenges addressed:

- **Stripped binaries:** Symbol recovery using `code_id` + `code_block_hash`
- **Build variants:** Handling multiple builds from same source
- **Large graphs:** Chunking and size limits for DSSE/Rekor
- **Offline verification:** Air-gapped attestation workflows

---

## 2. Gap Resolutions

### BR1: Canonical DSSE/Predicate Schemas

**Binary graph predicate:**

```
stella.ops/binaryGraph@v1
```

**Predicate schema:**

```json
{
  "_type": "https://stellaops.dev/predicates/binaryGraph/v1",
  "subject": [
    {
      "name": "graph",
      "digest": {"blake3": "a1b2c3d4e5f6..."}
    }
  ],
  "predicate": {
    "analyzer": {
      "name": "scanner.native",
      "version": "1.2.0",
      "toolchain": "ghidra-11.2"
    },
    "binary": {
      "format": "ELF",
      "arch": "x86_64",
      "file_hash": "sha256:...",
      "build_id": "gnu-build-id:5f0c7c3c..."
    },
    "graph_stats": {
      "node_count": 1247,
      "edge_count": 3891,
      "root_count": 5
    },
    "evidence": {
      "symbols_source": "DWARF",
      "stripped_symbols": 58,
      "heuristic_symbols": 12
    },
    "created_at": "2025-12-13T10:00:00Z"
  }
}
```

**Edge bundle predicate:**

```
stella.ops/binaryEdgeBundle@v1
```

```json
{
  "_type": "https://stellaops.dev/predicates/binaryEdgeBundle/v1",
  "subject": [
    {
      "name": "edges",
      "digest": {"sha256": "..."}
    }
  ],
  "predicate": {
    "graph_hash": "blake3:a1b2c3d4...",
    "bundle_id": "bundle:001",
    "bundle_reason": "init_array",
    "edge_count": 128,
    "edges": [
      {
        "from": "sym:binary:...",
        "to": "sym:binary:...",
        "reason": "init-array",
        "confidence": 0.95
      }
    ]
  }
}
```

### BR2: Edge Hash Recipe

**Binary edge hash computation:**

```
edge_id = "edge:" + sha256(
  canonical_json({
    "from": edge.from,
    "to": edge.to,
    "kind": edge.kind,
    "reason": edge.reason,
    "binary_hash": binary.file_hash  // Binary context included
  })
)
```

**Hash includes binary context:** Unlike managed code edges, binary edges include `binary_hash` in the hash computation to distinguish edges from different binaries with identical symbol names.

**Canonicalization:**

1. Keys: `binary_hash`, `from`, `kind`, `reason`, `to` (alphabetical)
2. No whitespace, UTF-8 encoding
3. Lowercase hex for all hashes

### BR3: Required Binary Evidence with CAS Refs

**Required evidence per node:**

| Evidence Type | Required | CAS Storage |
|---------------|----------|-------------|
| File hash | Yes | N/A (inline) |
| Build ID | Conditional | N/A (inline) |
| Symbol source | Yes | N/A (inline) |
| Code block hash | For stripped | `cas://binary/blocks/{sha256}` |
| Disassembly | Optional | `cas://binary/disasm/{sha256}` |
| CFG | Optional | `cas://binary/cfg/{sha256}` |

**Evidence schema:**

```json
{
  "binary_evidence": {
    "file_hash": "sha256:...",
    "build_id": "gnu-build-id:5f0c7c3c...",
    "symbol_source": "DWARF",
    "symbol_confidence": 0.95,
    "code_block_hash": "sha256:deadbeef...",
    "code_block_uri": "cas://binary/blocks/sha256:deadbeef...",
    "disassembly_uri": "cas://binary/disasm/sha256:...",
    "cfg_uri": "cas://binary/cfg/sha256:..."
  }
}
```

**CAS layout:**

```
cas://binary/
  blocks/{sha256}/     # Code block bytes
  disasm/{sha256}/     # Disassembly JSON
  cfg/{sha256}/        # Control flow graph
  symbols/{sha256}/    # Symbol table extract
```

### BR4: Build-ID/Variant Rules

**Build-ID sources:**

| Format | Build-ID Source | Example |
|--------|-----------------|---------|
| ELF | `.note.gnu.build-id` | `gnu-build-id:5f0c7c3c...` |
| PE | Debug GUID | `pe-guid:12345678-1234-...` |
| Mach-O | `LC_UUID` | `macho-uuid:12345678...` |

**Fallback when build-ID absent:**

```json
{
  "build_id": null,
  "build_id_fallback": {
    "method": "file_hash",
    "value": "sha256:...",
    "confidence": 0.7
  }
}
```

**Variant handling:**

Multiple binaries from same source (debug/release, different arch):

```json
{
  "variant_group": "sha256:source_hash...",
  "variants": [
    {"build_id": "gnu-build-id:aaa...", "variant_type": "release-x86_64"},
    {"build_id": "gnu-build-id:bbb...", "variant_type": "debug-x86_64"},
    {"build_id": "gnu-build-id:ccc...", "variant_type": "release-aarch64"}
  ]
}
```

### BR5: Policy Hash Governance

**Policy version binding:**

Binary reachability graphs are bound to a policy version:

```json
{
  "policy_binding": {
    "policy_digest": "sha256:...",
    "policy_version": "P-7:v4",
    "bound_at": "2025-12-13T10:00:00Z",
    "binding_mode": "strict"
  }
}
```

**Binding modes:**

| Mode | Behavior |
|------|----------|
| `strict` | Graph invalid if policy changes |
| `forward` | Graph valid with newer policy versions |
| `any` | Graph valid with any policy version |

**Governance rules:**

1. Production graphs use `strict` binding
2. Test graphs may use `forward`
3. Policy hash computed from canonical DSL
4. Binding stored in graph metadata

### BR6: Sigstore Bundle/Log Routing

**Sigstore integration:**

```json
{
  "sigstore": {
    "bundle_type": "hashedrekord",
    "log_index": 12345678,
    "log_id": "rekor.sigstore.dev",
    "inclusion_proof": {
      "log_index": 12345678,
      "root_hash": "sha256:...",
      "tree_size": 98765432,
      "hashes": ["sha256:...", "sha256:..."]
    },
    "signed_entry_timestamp": "base64:..."
  }
}
```

**Log routing:**

| Evidence Type | Log | Notes |
|---------------|-----|-------|
| Graph DSSE | Rekor (public) | Always |
| Edge bundle DSSE | Rekor (capped) | Configurable limit |
| Code block | No log | CAS only |
| CFG/Disasm | No log | CAS only |

**Offline mode:**

When Rekor is unavailable:

```json
{
  "sigstore": {
    "mode": "offline",
    "checkpoint": {
      "origin": "rekor.sigstore.dev",
      "checkpoint_data": "base64:...",
      "captured_at": "2025-12-13T10:00:00Z"
    },
    "deferred_submission": true
  }
}
```

### BR7: Idempotent Submission Keys

**Submission key format:**

```
submit:{tenant}:{binary_hash}:{graph_hash}:{timestamp_hour}
```

**Idempotency rules:**

1. Same key returns existing entry (no duplicate)
2. Key includes hour-granularity timestamp for rate limiting
3. Different graphs from same binary produce different keys
4. A retry within the same hour bucket (`timestamp_hour`) reuses the same key; a retry that crosses into the next hour produces a new key

**Implementation:**

```json
{
  "submission": {
    "key": "submit:acme:sha256:abc...:blake3:def...:2025121310",
    "status": "accepted",
    "existing_entry": false,
    "log_index": 12345678
  }
}
```

### BR8: Size/Chunking Limits

**Size limits:**

| Element | Limit | Action on Exceed |
|---------|-------|------------------|
| Graph JSON | 10 MB | Chunk nodes/edges |
| Edge bundle | 512 edges | Split bundles |
| DSSE payload | 1 MB | Compress/chunk |
| Rekor entry | 100 KB | Reference CAS |

**Chunking strategy:**

For large graphs (>10 MB):

```json
{
  "chunked_graph": {
    "chunk_count": 5,
    "chunks": [
      {"chunk_id": "chunk:001", "uri": "cas://graphs/chunks/001", "hash": "blake3:..."},
      {"chunk_id": "chunk:002", "uri": "cas://graphs/chunks/002", "hash": "blake3:..."}
    ],
    "assembly_order": ["chunk:001", "chunk:002", ...],
    "assembled_hash": "blake3:..."
  }
}
```

**Compression:**

- Graph JSON: gzip before DSSE
- CAS storage: Raw JSON (indexed)
- Rekor payload: DSSE references CAS

### BR9: API/CLI/UI Surfacing

**API endpoints:**

| Method | Path | Description |
|--------|------|-------------|
| `POST` | `/api/binary/graphs` | Submit binary graph |
| `GET` | `/api/binary/graphs/{hash}` | Get graph details |
| `GET` | `/api/binary/graphs/{hash}/edges` | List edges |
| `GET` | `/api/binary/symbols/{symbolId}` | Get symbol details |
| `POST` | `/api/binary/verify` | Verify graph attestation |

**CLI commands:**

```bash
# Submit binary graph
stella binary submit --graph ./richgraph.json --binary ./app

# Get graph info
stella binary info --hash blake3:a1b2c3d4...

# List symbols
stella binary symbols --hash blake3:... --stripped-only

# Verify attestation
stella binary verify --graph ./richgraph.json --dsse ./richgraph.dsse
```

**UI components:**

- Binary graph visualization with zoom/pan
- Symbol table with search/filter
- Edge explorer with confidence highlighting
- Attestation status badges
- Build variant selector

### BR10: Binary Fixtures

**Fixture location:**

```
tests/Binary/
  fixtures/
    elf-x86_64-with-debug/
      binary.elf
      graph.json
      expected-hashes.txt
    elf-stripped/
      binary.elf
      graph.json
      expected-hashes.txt
    pe-x64-with-pdb/
      binary.exe
      graph.json
      expected-hashes.txt
  golden/
    elf-x86_64.golden.json
    pe-x64.golden.json

datasets/binary/
  schema/
    binary-graph.schema.json
    binary-edge.schema.json
  samples/
    openssl-1.1.1/
      libssl.so
      graph.json
      edges.ndjson
```

**Fixture requirements:**

1. Each binary format has at least one fixture
2. Stripped and debug variants for each format
3. Expected hashes verified by CI
4. Golden outputs include DSSE envelopes
5. Fixtures reproducible from source (where legal)

**Test categories:**

1. **Hash stability:** Same binary produces same graph hash
2. **Build-ID extraction:** Correct build-ID parsing per format
3. **Symbol recovery:** DWARF/PDB parsing accuracy
4. **Stripped handling:** Code block hash computation
5. **Chunking:** Large graph splitting and reassembly
6. **DSSE signing:** Envelope creation and verification
7. **Rekor integration:** Submission and verification

---

## 3. Implementation Status

| Component | Location | Status |
|-----------|----------|--------|
| ELF parser | `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Native` | Implemented |
| PE parser | `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Native` | Implemented |
| DSSE predicates | `src/Signer/StellaOps.Signer/PredicateTypes.cs` | Implemented |
| CAS storage | `src/Scanner/__Libraries/StellaOps.Scanner.Reachability` | Partial |
| Rekor integration | `src/Attestor/StellaOps.Attestor` | Implemented |
| CLI commands | `src/Cli/StellaOps.Cli` | Planned |
| UI components | `src/Web/StellaOps.Web` | Implemented |

---

## 4. Related Documentation

- [richgraph-v1 Contract](../contracts/richgraph-v1.md) - Graph schema specification
- [Function-Level Evidence](./function-level-evidence.md) - Evidence chain guide
- [Edge Explainability](./edge-explainability-schema.md) - Edge reason codes
- [Hybrid Attestation](./hybrid-attestation.md) - Graph and edge-bundle DSSE
- [Native Analyzer Tests](../../src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Native.Tests/Reachability/) - Test fixtures

---

_Last updated: 2025-12-13. See Sprint 0401 BINARY-GAPS-401-066 for change history._
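
As an illustration of the BR2 edge hash recipe, the following Python sketch computes an `edge_id` from the five fields listed under BR2. It is not the Scanner implementation. It assumes that `canonical_json` can be approximated by `json.dumps` with sorted keys and compact separators; the production canonicalizer may differ in string escaping or number formatting. The helper names, the sample symbol IDs, and the `kind` value `"call"` are hypothetical, and normalizing `binary_hash` to lowercase on input is likewise an assumption about how rule 3 is enforced.

```python
import hashlib
import json


def canonical_json(obj: dict) -> bytes:
    """Approximate canonical form: sorted keys, no whitespace, UTF-8."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def edge_id(edge: dict, binary_file_hash: str) -> str:
    """Build the BR2 edge identifier, including the binary context."""
    payload = {
        "from": edge["from"],
        "to": edge["to"],
        "kind": edge["kind"],
        "reason": edge["reason"],
        # Assumption: callers may pass mixed-case hex, so normalize it here.
        "binary_hash": binary_file_hash.lower(),
    }
    # hexdigest() already returns lowercase hex.
    return "edge:" + hashlib.sha256(canonical_json(payload)).hexdigest()


# Hypothetical sample edge; real symbol IDs come from the analyzer.
sample_edge = {
    "from": "sym:binary:main",
    "to": "sym:binary:init_tls",
    "kind": "call",
    "reason": "init-array",
}
print(edge_id(sample_edge, "sha256:abc123"))
```

Because the binary hash is part of the hashed payload, the same `from`/`to` pair in two different binaries yields two different identifiers, which is the behavior BR2 describes.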