# Binary Reachability Schema

_Last updated: 2025-12-13. Owner: Scanner Guild + Attestor Guild._

This document defines the binary reachability schema addressing gaps BR1-BR10 from the November 2025 product findings. It specifies DSSE predicate formats, edge hash recipes, binary evidence requirements, build-id handling, and Sigstore integration.

---

## 1. Overview

Binary reachability extends the function-level evidence chain to native executables (ELF, PE, Mach-O). Key challenges addressed:

- **Stripped binaries:** Symbol recovery using `code_id` + `code_block_hash`
- **Build variants:** Handling multiple builds from the same source
- **Large graphs:** Chunking and size limits for DSSE/Rekor
- **Offline verification:** Air-gapped attestation workflows

---

## 2. Gap Resolutions

### BR1: Canonical DSSE/Predicate Schemas

**Binary graph predicate:**

```
stella.ops/binaryGraph@v1
```

**Predicate schema:**

```json
{
  "_type": "https://stellaops.dev/predicates/binaryGraph/v1",
  "subject": [
    {
      "name": "graph",
      "digest": {"blake3": "a1b2c3d4e5f6..."}
    }
  ],
  "predicate": {
    "analyzer": {
      "name": "scanner.native",
      "version": "1.2.0",
      "toolchain": "ghidra-11.2"
    },
    "binary": {
      "format": "ELF",
      "arch": "x86_64",
      "file_hash": "sha256:...",
      "build_id": "gnu-build-id:5f0c7c3c..."
    },
    "graph_stats": {
      "node_count": 1247,
      "edge_count": 3891,
      "root_count": 5
    },
    "evidence": {
      "symbols_source": "DWARF",
      "stripped_symbols": 58,
      "heuristic_symbols": 12
    },
    "created_at": "2025-12-13T10:00:00Z"
  }
}
```

**Edge bundle predicate:**

```
stella.ops/binaryEdgeBundle@v1
```

```json
{
  "_type": "https://stellaops.dev/predicates/binaryEdgeBundle/v1",
  "subject": [
    {
      "name": "edges",
      "digest": {"sha256": "..."}
    }
  ],
  "predicate": {
    "graph_hash": "blake3:a1b2c3d4...",
    "bundle_id": "bundle:001",
    "bundle_reason": "init_array",
    "edge_count": 128,
    "edges": [
      {
        "from": "sym:binary:...",
        "to": "sym:binary:...",
        "reason": "init-array",
        "confidence": 0.95
      }
    ]
  }
}
```

### BR2: Edge Hash Recipe

**Binary edge hash computation:**

```
edge_id = "edge:" + sha256(
  canonical_json({
    "from": edge.from,
    "to": edge.to,
    "kind": edge.kind,
    "reason": edge.reason,
    "binary_hash": binary.file_hash  // Binary context included
  })
)
```

**Hash includes binary context:**

Unlike managed code edges, binary edges include `binary_hash` in the hash computation to distinguish edges from different binaries with identical symbol names.

**Canonicalization:**

1. Keys: `binary_hash`, `from`, `kind`, `reason`, `to` (alphabetical)
2. No whitespace, UTF-8 encoding
3. Lowercase hex for all hashes
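
The recipe and canonicalization rules above can be sketched in Python as follows. This is a minimal illustration, not the analyzer's implementation: the function names are hypothetical, the digest is assumed to be emitted as lowercase hex, and `json.dumps` with sorted keys and compact separators is assumed to be an acceptable stand-in for `canonical_json` on these flat, string-valued objects.

```python
import hashlib
import json


def canonical_json(obj: dict) -> bytes:
    """Sorted keys, no whitespace, UTF-8 (canonicalization rules 1 and 2)."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def edge_id(edge: dict, binary_file_hash: str) -> str:
    """Compute an edge ID that is bound to the containing binary."""
    payload = {
        "from": edge["from"],
        "to": edge["to"],
        "kind": edge["kind"],
        "reason": edge["reason"],
        # Rule 3: hashes are lowercase hex. Lowercasing the whole string
        # assumes the "sha256:" style prefix is already lowercase.
        "binary_hash": binary_file_hash.lower(),
    }
    return "edge:" + hashlib.sha256(canonical_json(payload)).hexdigest()


if __name__ == "__main__":
    # Symbol IDs and the "call" kind are made-up placeholders.
    example = {"from": "sym:binary:main", "to": "sym:binary:init",
               "kind": "call", "reason": "init-array"}
    print(edge_id(example, "sha256:ABCDEF"))
```

Because `binary_hash` is part of the hashed payload, the same `from`/`to` pair in two different binaries yields two different edge IDs.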

### BR3: Required Binary Evidence with CAS Refs

**Required evidence per node:**

| Evidence Type | Required | CAS Storage |
|---------------|----------|-------------|
| File hash | Yes | N/A (inline) |
| Build ID | Conditional | N/A (inline) |
| Symbol source | Yes | N/A (inline) |
| Code block hash | For stripped | `cas://binary/blocks/{sha256}` |
| Disassembly | Optional | `cas://binary/disasm/{sha256}` |
| CFG | Optional | `cas://binary/cfg/{sha256}` |

**Evidence schema:**

```json
{
  "binary_evidence": {
    "file_hash": "sha256:...",
    "build_id": "gnu-build-id:5f0c7c3c...",
    "symbol_source": "DWARF",
    "symbol_confidence": 0.95,
    "code_block_hash": "sha256:deadbeef...",
    "code_block_uri": "cas://binary/blocks/sha256:deadbeef...",
    "disassembly_uri": "cas://binary/disasm/sha256:...",
    "cfg_uri": "cas://binary/cfg/sha256:..."
  }
}
```

**CAS layout:**

```
cas://binary/
  blocks/{sha256}/     # Code block bytes
  disasm/{sha256}/     # Disassembly JSON
  cfg/{sha256}/        # Control flow graph
  symbols/{sha256}/    # Symbol table extract
```
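
As a rough sketch of how a content-addressed reference could be derived for a stripped function's code block, the Python below hashes the bytes and builds the URI. The helper names are hypothetical, and the `sha256:` prefix inside the URI follows the evidence schema example above; the layout listing writes `{sha256}` without saying whether the prefix is part of the path segment, so treat that as an assumption.

```python
import hashlib

# Sub-trees named in the CAS layout above.
CAS_KINDS = {"blocks", "disasm", "cfg", "symbols"}


def cas_uri(kind: str, content: bytes) -> str:
    """Build a cas://binary/... URI addressed by the SHA-256 of the content."""
    if kind not in CAS_KINDS:
        raise ValueError(f"unknown CAS kind: {kind}")
    digest = hashlib.sha256(content).hexdigest()
    return f"cas://binary/{kind}/sha256:{digest}"


def stripped_node_evidence(file_hash: str, code_block: bytes) -> dict:
    """Minimal evidence record for a node with no recovered symbol."""
    block_digest = "sha256:" + hashlib.sha256(code_block).hexdigest()
    return {
        "file_hash": file_hash,
        "build_id": None,  # conditional: may be absent
        "code_block_hash": block_digest,
        "code_block_uri": cas_uri("blocks", code_block),
    }
```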

### BR4: Build-ID/Variant Rules

**Build-ID sources:**

| Format | Build-ID Source | Example |
|--------|-----------------|---------|
| ELF | `.note.gnu.build-id` | `gnu-build-id:5f0c7c3c...` |
| PE | Debug GUID | `pe-guid:12345678-1234-...` |
| Mach-O | `LC_UUID` | `macho-uuid:12345678...` |

**Fallback when build-ID absent:**

```json
{
  "build_id": null,
  "build_id_fallback": {
    "method": "file_hash",
    "value": "sha256:...",
    "confidence": 0.7
  }
}
```
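
The fallback rule can be expressed as a small selection function. The sketch below is illustrative only: the function name is hypothetical, and the `0.7` confidence is simply copied from the example above rather than taken from a documented constant.

```python
from typing import Optional

# Confidence value taken from the fallback example; assumed to be fixed.
FALLBACK_CONFIDENCE = 0.7


def build_identity(build_id: Optional[str], file_hash: str) -> dict:
    """Prefer the embedded build-ID; fall back to the file hash when absent."""
    if build_id:
        return {"build_id": build_id}
    return {
        "build_id": None,
        "build_id_fallback": {
            "method": "file_hash",
            "value": file_hash,
            "confidence": FALLBACK_CONFIDENCE,
        },
    }
```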

**Variant handling:**

Multiple binaries built from the same source (debug/release, different architectures) are grouped:

```json
{
  "variant_group": "sha256:source_hash...",
  "variants": [
    {"build_id": "gnu-build-id:aaa...", "variant_type": "release-x86_64"},
    {"build_id": "gnu-build-id:bbb...", "variant_type": "debug-x86_64"},
    {"build_id": "gnu-build-id:ccc...", "variant_type": "release-aarch64"}
  ]
}
```

### BR5: Policy Hash Governance

**Policy version binding:**

Binary reachability graphs are bound to a policy version:

```json
{
  "policy_binding": {
    "policy_digest": "sha256:...",
    "policy_version": "P-7:v4",
    "bound_at": "2025-12-13T10:00:00Z",
    "binding_mode": "strict"
  }
}
```

**Binding modes:**

| Mode | Behavior |
|------|----------|
| `strict` | Graph invalid if policy changes |
| `forward` | Graph valid with newer policy versions |
| `any` | Graph valid with any policy version |

**Governance rules:**

1. Production graphs use `strict` binding
2. Test graphs may use `forward`
3. The policy hash is computed from the canonical DSL
4. The binding is stored in graph metadata
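
One possible reading of the three binding modes is sketched below. Several details are assumptions rather than documented behavior: the version string is taken to have the shape `<policy-id>:v<revision>` (as in `P-7:v4`), `strict` is interpreted as "the policy digest must match exactly", and `forward` is interpreted as "same policy ID with an equal or higher revision". The function names are hypothetical.

```python
from typing import Tuple


def parse_policy_version(version: str) -> Tuple[str, int]:
    """Split 'P-7:v4' into ('P-7', 4). Assumed format, not a documented one."""
    policy_id, _, revision = version.rpartition(":v")
    return policy_id, int(revision)


def graph_is_valid(binding: dict, current_digest: str, current_version: str) -> bool:
    """Check a graph's policy binding against the currently active policy."""
    mode = binding["binding_mode"]
    if mode == "any":
        return True
    if mode == "strict":
        # Any change to the policy changes its digest.
        return binding["policy_digest"] == current_digest
    if mode == "forward":
        bound_id, bound_rev = parse_policy_version(binding["policy_version"])
        cur_id, cur_rev = parse_policy_version(current_version)
        return bound_id == cur_id and cur_rev >= bound_rev
    raise ValueError(f"unknown binding mode: {mode}")
```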

### BR6: Sigstore Bundle/Log Routing

**Sigstore integration:**

```json
{
  "sigstore": {
    "bundle_type": "hashedrekord",
    "log_index": 12345678,
    "log_id": "rekor.sigstore.dev",
    "inclusion_proof": {
      "log_index": 12345678,
      "root_hash": "sha256:...",
      "tree_size": 98765432,
      "hashes": ["sha256:...", "sha256:..."]
    },
    "signed_entry_timestamp": "base64:..."
  }
}
```

**Log routing:**

| Evidence Type | Log | Notes |
|---------------|-----|-------|
| Graph DSSE | Rekor (public) | Always |
| Edge bundle DSSE | Rekor (capped) | Configurable limit |
| Code block | No log | CAS only |
| CFG/Disasm | No log | CAS only |

**Offline mode:**

When Rekor is unavailable:

```json
{
  "sigstore": {
    "mode": "offline",
    "checkpoint": {
      "origin": "rekor.sigstore.dev",
      "checkpoint_data": "base64:...",
      "captured_at": "2025-12-13T10:00:00Z"
    },
    "deferred_submission": true
  }
}
```

### BR7: Idempotent Submission Keys

**Submission key format:**

```
submit:{tenant}:{binary_hash}:{graph_hash}:{timestamp_hour}
```

**Idempotency rules:**

1. Same key returns existing entry (no duplicate)
2. Key includes hour-granularity timestamp for rate limiting
3. Different graphs from same binary produce different keys
4. Retry within 1 hour uses same key

**Implementation:**

```json
{
  "submission": {
    "key": "submit:acme:sha256:abc...:blake3:def...:2025121310",
    "status": "accepted",
    "existing_entry": false,
    "log_index": 12345678
  }
}
```
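
A minimal sketch of key construction under the format above. The `YYYYMMDDHH` bucket matches the `2025121310` suffix in the example; using UTC for the bucket is an assumption, and the function name is hypothetical.

```python
from datetime import datetime, timezone


def submission_key(tenant: str, binary_hash: str, graph_hash: str,
                   when: datetime) -> str:
    """Build an idempotency key bucketed by clock hour (UTC assumed).

    `when` should be timezone-aware; naive datetimes would be interpreted
    as local time by astimezone().
    """
    hour_bucket = when.astimezone(timezone.utc).strftime("%Y%m%d%H")
    return f"submit:{tenant}:{binary_hash}:{graph_hash}:{hour_bucket}"


if __name__ == "__main__":
    ts = datetime(2025, 12, 13, 10, 5, tzinfo=timezone.utc)
    print(submission_key("acme", "sha256:abc", "blake3:def", ts))
```

With this bucketing, two attempts share a key only when they fall in the same clock hour, so a retry that crosses an hour boundary produces a new key even if less than 60 minutes have elapsed.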

### BR8: Size/Chunking Limits

**Size limits:**

| Element | Limit | Action on Exceed |
|---------|-------|------------------|
| Graph JSON | 10 MB | Chunk nodes/edges |
| Edge bundle | 512 edges | Split bundles |
| DSSE payload | 1 MB | Compress/chunk |
| Rekor entry | 100 KB | Reference CAS |
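
The "split bundles" action for the 512-edge limit could look roughly like the sketch below. The function name is hypothetical, and the sequential `bundle:NNN` numbering is inferred from the `bundle:001` example in BR1 rather than from a stated rule.

```python
from typing import Iterator, List

MAX_EDGES_PER_BUNDLE = 512  # limit from the size table


def split_into_bundles(edges: List[dict], graph_hash: str,
                       limit: int = MAX_EDGES_PER_BUNDLE) -> Iterator[dict]:
    """Yield edge-bundle predicate bodies with at most `limit` edges each."""
    for number, start in enumerate(range(0, len(edges), limit), start=1):
        chunk = edges[start:start + limit]
        yield {
            "graph_hash": graph_hash,
            "bundle_id": f"bundle:{number:03d}",  # assumed numbering scheme
            "edge_count": len(chunk),
            "edges": chunk,
        }
```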

**Chunking strategy:**

For large graphs (>10 MB):

```json
{
  "chunked_graph": {
    "chunk_count": 5,
    "chunks": [
      {"chunk_id": "chunk:001", "uri": "cas://graphs/chunks/001", "hash": "blake3:..."},
      {"chunk_id": "chunk:002", "uri": "cas://graphs/chunks/002", "hash": "blake3:..."}
    ],
    "assembly_order": ["chunk:001", "chunk:002", ...],
    "assembled_hash": "blake3:..."
  }
}
```
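
Reassembly on the consumer side might be sketched as below. Two things here are assumptions: that the assembled graph is the plain byte concatenation of chunks in `assembly_order`, and the digest function. The schema uses BLAKE3, which is not in the Python standard library, so the sketch takes the digest as a parameter and defaults to a SHA-256 stand-in. `fetch` is a hypothetical callable that resolves a CAS URI to bytes.

```python
import hashlib
from typing import Callable


def sha256_digest(data: bytes) -> str:
    """Stand-in digest for illustration; the schema itself specifies blake3."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def assemble_graph(manifest: dict, fetch: Callable[[str], bytes],
                   digest: Callable[[bytes], str] = sha256_digest) -> bytes:
    """Fetch chunks in assembly order, verifying each hash and the total."""
    by_id = {chunk["chunk_id"]: chunk for chunk in manifest["chunks"]}
    parts = []
    for chunk_id in manifest["assembly_order"]:
        chunk = by_id[chunk_id]
        data = fetch(chunk["uri"])
        if digest(data) != chunk["hash"]:
            raise ValueError(f"hash mismatch for {chunk_id}")
        parts.append(data)
    assembled = b"".join(parts)  # assumed assembly rule: concatenation
    if digest(assembled) != manifest["assembled_hash"]:
        raise ValueError("assembled hash mismatch")
    return assembled
```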

**Compression:**

- Graph JSON: gzip before DSSE
- CAS storage: Raw JSON (indexed)
- Rekor payload: DSSE references CAS

### BR9: API/CLI/UI Surfacing

**API endpoints:**

| Method | Path | Description |
|--------|------|-------------|
| `POST` | `/api/binary/graphs` | Submit binary graph |
| `GET` | `/api/binary/graphs/{hash}` | Get graph details |
| `GET` | `/api/binary/graphs/{hash}/edges` | List edges |
| `GET` | `/api/binary/symbols/{symbolId}` | Get symbol details |
| `POST` | `/api/binary/verify` | Verify graph attestation |

**CLI commands:**

```bash
# Submit binary graph
stella binary submit --graph ./richgraph.json --binary ./app

# Get graph info
stella binary info --hash blake3:a1b2c3d4...

# List symbols
stella binary symbols --hash blake3:... --stripped-only

# Verify attestation
stella binary verify --graph ./richgraph.json --dsse ./richgraph.dsse
```

**UI components:**

- Binary graph visualization with zoom/pan
- Symbol table with search/filter
- Edge explorer with confidence highlighting
- Attestation status badges
- Build variant selector

### BR10: Binary Fixtures

**Fixture location:**

```
tests/Binary/
  fixtures/
    elf-x86_64-with-debug/
      binary.elf
      graph.json
      expected-hashes.txt
    elf-stripped/
      binary.elf
      graph.json
      expected-hashes.txt
    pe-x64-with-pdb/
      binary.exe
      graph.json
      expected-hashes.txt
  golden/
    elf-x86_64.golden.json
    pe-x64.golden.json

datasets/binary/
  schema/
    binary-graph.schema.json
    binary-edge.schema.json
  samples/
    openssl-1.1.1/
      libssl.so
      graph.json
      edges.ndjson
```

**Fixture requirements:**

1. Each binary format has at least one fixture
2. Stripped and debug variants for each format
3. Expected hashes verified by CI
4. Golden outputs include DSSE envelopes
5. Fixtures reproducible from source (where legal)

**Test categories:**

1. **Hash stability:** Same binary produces same graph hash
2. **Build-ID extraction:** Correct build-ID parsing per format
3. **Symbol recovery:** DWARF/PDB parsing accuracy
4. **Stripped handling:** Code block hash computation
5. **Chunking:** Splitting and reassembly of large graphs
6. **DSSE signing:** Envelope creation and verification
7. **Rekor integration:** Submission and verification

---

## 3. Implementation Status

| Component | Location | Status |
|-----------|----------|--------|
| ELF parser | `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Native` | Implemented |
| PE parser | `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Native` | Implemented |
| DSSE predicates | `src/Signer/StellaOps.Signer/PredicateTypes.cs` | Implemented |
| CAS storage | `src/Scanner/__Libraries/StellaOps.Scanner.Reachability` | Partial |
| Rekor integration | `src/Attestor/StellaOps.Attestor` | Implemented |
| CLI commands | `src/Cli/StellaOps.Cli` | Planned |
| UI components | `src/UI/StellaOps.UI` | Planned |

---

## 4. Related Documentation

- [richgraph-v1 Contract](../contracts/richgraph-v1.md) - Graph schema specification
- [Function-Level Evidence](./function-level-evidence.md) - Evidence chain guide
- [Edge Explainability](./edge-explainability-schema.md) - Edge reason codes
- [Hybrid Attestation](./hybrid-attestation.md) - Graph and edge-bundle DSSE
- [Native Analyzer Tests](../../src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Native.Tests/Reachability/) - Test fixtures

---

_Last updated: 2025-12-13. See Sprint 0401 BINARY-GAPS-401-066 for change history._