up
Some checks failed
AOC Guard CI / aoc-guard (push) Has been cancelled
AOC Guard CI / aoc-verify (push) Has been cancelled
Docs CI / lint-and-preview (push) Has been cancelled
Notify Smoke Test / Notify Unit Tests (push) Has been cancelled
Notify Smoke Test / Notifier Service Tests (push) Has been cancelled
Notify Smoke Test / Notification Smoke Test (push) Has been cancelled
Policy Lint & Smoke / policy-lint (push) Has been cancelled
Scanner Analyzers / Discover Analyzers (push) Has been cancelled
Scanner Analyzers / Build Analyzers (push) Has been cancelled
Scanner Analyzers / Test Language Analyzers (push) Has been cancelled
Scanner Analyzers / Validate Test Fixtures (push) Has been cancelled
Scanner Analyzers / Verify Deterministic Output (push) Has been cancelled
Signals CI & Image / signals-ci (push) Has been cancelled
Signals Reachability Scoring & Events / reachability-smoke (push) Has been cancelled
Signals Reachability Scoring & Events / sign-and-upload (push) Has been cancelled
Manifest Integrity / Validate Schema Integrity (push) Has been cancelled
Manifest Integrity / Validate Contract Documents (push) Has been cancelled
Manifest Integrity / Validate Pack Fixtures (push) Has been cancelled
Manifest Integrity / Audit SHA256SUMS Files (push) Has been cancelled
Manifest Integrity / Verify Merkle Roots (push) Has been cancelled
devportal-offline / build-offline (push) Has been cancelled
Mirror Thin Bundle Sign & Verify / mirror-sign (push) Has been cancelled
docs/contracts/buildid-propagation.md
@@ -0,0 +1,301 @@
# CONTRACT-BUILDID-PROPAGATION-401: Build-ID and Code-ID Propagation

> **Status:** Published
> **Version:** 1.0.0
> **Published:** 2025-12-13
> **Owners:** Scanner Guild, Signals Guild, BE-Base Platform Guild
> **Unblocks:** SCANNER-BUILDID-401-035, SCANNER-INITROOT-401-036, and downstream tasks

## Overview

This contract defines how GNU build-id (ELF), PE GUID, and Mach-O UUID propagate through the reachability pipeline from Scanner to SBOM, Signals, and runtime facts. It ensures consistent identification of binaries across components for deterministic symbol resolution and replay.

---

## 1. Build-ID Sources and Formats

### 1.1 Per-Format Extraction

| Binary Format | Build-ID Source | Prefix | Example |
|---------------|-----------------|--------|---------|
| ELF | `.note.gnu.build-id` | `gnu-build-id:` | `gnu-build-id:5f0c7c3cab2eb9bc...` |
| PE (Windows) | Debug GUID from PE header | `pe-guid:` | `pe-guid:12345678-1234-1234-1234-123456789abc` |
| Mach-O | `LC_UUID` load command | `macho-uuid:` | `macho-uuid:12345678123412341234123456789abc` |

### 1.2 Canonical Format

```
build_id = "{prefix}{hex_lowercase}"
```

- Hex encoding: lowercase, no separators (except PE GUID retains dashes)
- Minimum length: 16 bytes (32 hex chars) for ELF/Mach-O
- PE GUID: Standard GUID format with dashes
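
The formatting rules above can be sketched as a small helper. This is a minimal illustration, not the Scanner implementation: the function name `canonical_build_id`, the `fmt` keys, and the assumption that the PE GUID arrives as 16 raw bytes in the Windows little-endian field layout are all hypothetical.

```python
import uuid

# Prefixes from the table in section 1.1; the dictionary keys are illustrative.
PREFIXES = {"elf": "gnu-build-id:", "pe": "pe-guid:", "macho": "macho-uuid:"}


def canonical_build_id(fmt: str, raw: bytes) -> str:
    """Render raw identifier bytes in the canonical form of section 1.2."""
    prefix = PREFIXES[fmt]
    if fmt == "pe":
        # Assumption: `raw` is the 16-byte GUID in little-endian field order.
        # str(UUID) yields the lowercase dashed form.
        return prefix + str(uuid.UUID(bytes_le=raw))
    if len(raw) < 16:
        raise ValueError("ELF/Mach-O identifiers need at least 16 bytes")
    # bytes.hex() is lowercase with no separators.
    return prefix + raw.hex()
```

For example, `canonical_build_id("macho", bytes.fromhex("12345678123412341234123456789abc"))` yields the Mach-O example from the table.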

### 1.3 Fallback When Build-ID Absent

When build-id is not present (stripped binaries, older toolchains):

```json
{
  "build_id": null,
  "build_id_fallback": {
    "method": "file_hash",
    "value": "sha256:...",
    "confidence": 0.7
  }
}
```

**Fallback chain:**
1. `file_hash` - SHA-256 of entire binary file (confidence: 0.7)
2. `code_section_hash` - SHA-256 of .text section (confidence: 0.6)
3. `path_hash` - SHA-256 of file path (confidence: 0.3, last resort)
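
A minimal sketch of the fallback chain follows. The contract does not say when a step is skipped, so this example assumes each step is tried in order and skipped only when its input is unavailable (`None`); the function name and parameters are hypothetical.

```python
import hashlib
from typing import Optional


def build_id_fallback(file_bytes: Optional[bytes],
                      text_section: Optional[bytes],
                      path: str) -> dict:
    """Return a record shaped like the JSON example in section 1.3."""
    # (method, input bytes, confidence) in the order of the fallback chain.
    candidates = [
        ("file_hash", file_bytes, 0.7),
        ("code_section_hash", text_section, 0.6),
        ("path_hash", path.encode("utf-8"), 0.3),  # last resort, always available
    ]
    for method, data, confidence in candidates:
        if data is not None:
            value = "sha256:" + hashlib.sha256(data).hexdigest()
            return {
                "build_id": None,
                "build_id_fallback": {
                    "method": method,
                    "value": value,
                    "confidence": confidence,
                },
            }
    raise AssertionError("unreachable: path_hash is always computable")
```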

---

## 2. Code-ID for Name-less Symbols

### 2.1 Purpose

`code_id` provides stable identification for symbols in stripped binaries where the symbol name is unavailable.

### 2.2 Format

```
code_id = "code:{lang}:{base64url_sha256}"
```

**Canonical tuple for binary symbols:**
```
{format}\0{build_id_or_file_hash}\0{section}\0{addr}\0{size}\0{code_block_hash}
```
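
One plausible reading of the format above is sketched here. Several details are assumptions, not stated by the contract: that the SHA-256 is taken over the UTF-8 encoding of the NUL-joined tuple, that `addr` and `size` are already rendered as strings by the caller, and that base64url padding is stripped (the validation regex in section 8 does not admit `=`). The function name is hypothetical.

```python
import base64
import hashlib


def code_id(lang: str, fmt: str, build_id_or_file_hash: str, section: str,
            addr: str, size: str, code_block_hash: str) -> str:
    """Derive a code_id from the canonical tuple (illustrative only)."""
    fields = [fmt, build_id_or_file_hash, section, addr, size, code_block_hash]
    # Join the tuple fields with NUL separators, as in the canonical tuple.
    tuple_bytes = "\0".join(fields).encode("utf-8")
    digest = hashlib.sha256(tuple_bytes).digest()
    # URL-safe base64 without '=' padding (assumption, see lead-in).
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"code:{lang}:{encoded}"
```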

### 2.3 Code Block Hash

For stripped functions, compute hash of the code bytes:

```
code_block_hash = "sha256:" + hex(SHA256(code_bytes[addr:addr+size]))
```
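
The expression above translates almost directly into Python. This sketch assumes `addr` is a byte offset into the `code_bytes` buffer rather than a virtual address; the bounds check is an addition for illustration.

```python
import hashlib


def code_block_hash(code_bytes: bytes, addr: int, size: int) -> str:
    """Hash the `size` bytes of code starting at offset `addr`."""
    block = code_bytes[addr:addr + size]
    if addr < 0 or len(block) != size:
        # Slicing silently truncates, so reject out-of-range requests explicitly.
        raise ValueError("code block extends past the end of the buffer")
    return "sha256:" + hashlib.sha256(block).hexdigest()
```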

---

## 3. Cross-RID (Runtime Identifier) Mapping

### 3.1 Problem Statement

Different platform builds (linux-x64, win-x64, osx-arm64) of the same source code produce different binaries with different build-ids. Runtime facts from one platform must map to the correct binary variant.

### 3.2 Variant Group

Binaries from the same source are grouped by source digest:

```json
{
  "variant_group": {
    "source_digest": "sha256:...",
    "variants": [
      {
        "rid": "linux-x64",
        "build_id": "gnu-build-id:aaa...",
        "file_hash": "sha256:..."
      },
      {
        "rid": "win-x64",
        "build_id": "pe-guid:bbb...",
        "file_hash": "sha256:..."
      },
      {
        "rid": "osx-arm64",
        "build_id": "macho-uuid:ccc...",
        "file_hash": "sha256:..."
      }
    ]
  }
}
```

### 3.3 Runtime Fact Correlation

When Signals ingests runtime facts:

1. Extract `build_id` from runtime event
2. Look up variant group containing this build_id
3. Correlate with richgraph nodes having matching `build_id`
4. If no match, fall back to `code_id` + `code_block_hash` matching
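
The four steps above can be sketched over plain dictionaries. This is illustrative only: the function name, the in-memory lists, and the return shape are hypothetical, and the fallback assumes the event's `symbol` object carries `code_id` and `code_block_hash`, which the example event in section 5.1 does not show.

```python
def correlate(event: dict, variant_groups: list, nodes: list) -> dict:
    """Correlate one runtime event with richgraph nodes (illustrative)."""
    # Step 1: extract build_id from the runtime event.
    build_id = event.get("binary", {}).get("build_id")

    # Step 2: look up the variant group containing this build_id, if any.
    group = None
    if build_id is not None:
        for candidate in variant_groups:
            variants = candidate.get("variants", [])
            if any(v.get("build_id") == build_id for v in variants):
                group = candidate
                break

    # Step 3: correlate with nodes that carry the same build_id.
    if build_id is not None:
        matched = [n for n in nodes if n.get("build_id") == build_id]
        if matched:
            return {"matched_by": "build_id", "group": group, "nodes": matched}

    # Step 4: fall back to code_id + code_block_hash (assumed event fields).
    symbol = event.get("symbol", {})
    cid, block = symbol.get("code_id"), symbol.get("code_block_hash")
    matched = [
        n for n in nodes
        if cid is not None
        and n.get("code_id") == cid
        and n.get("code_block_hash") == block
    ]
    return {"matched_by": "code_id" if matched else None,
            "group": group, "nodes": matched}
```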

---

## 4. SBOM Integration

### 4.1 CycloneDX 1.6 Properties

Build-ID propagates to SBOM via component properties:

```json
{
  "type": "library",
  "name": "libssl.so.3",
  "version": "3.0.11",
  "properties": [
    {"name": "stellaops:build-id", "value": "gnu-build-id:5f0c7c3c..."},
    {"name": "stellaops:code-id", "value": "code:binary:abc123..."},
    {"name": "stellaops:file-hash", "value": "sha256:..."}
  ]
}
```
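
As a rough illustration of how an emitter might assemble the `properties` array shown above, the helper below builds the three entries. The function name is hypothetical, and omitting a property whose value is missing is an assumption; the contract does not specify that behaviour.

```python
from typing import Optional


def stellaops_properties(build_id: Optional[str] = None,
                         code_id: Optional[str] = None,
                         file_hash: Optional[str] = None) -> list:
    """Build the CycloneDX component properties from section 4.1."""
    pairs = [
        ("stellaops:build-id", build_id),
        ("stellaops:code-id", code_id),
        ("stellaops:file-hash", file_hash),
    ]
    # Assumption: a property is omitted when its value is unavailable.
    return [{"name": name, "value": value}
            for name, value in pairs if value is not None]
```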

### 4.2 SPDX 3.0 Integration

Build-ID maps to SPDX external references:

```json
{
  "spdxId": "SPDXRef-libssl",
  "externalRef": {
    "referenceCategory": "PERSISTENT-ID",
    "referenceType": "gnu-build-id",
    "referenceLocator": "gnu-build-id:5f0c7c3c..."
  }
}
```

---

## 5. Signals Runtime Facts Schema

### 5.1 Runtime Event with Build-ID

```json
{
  "event_type": "function_hit",
  "timestamp": "2025-12-13T10:00:00Z",
  "binary": {
    "path": "/usr/lib/x86_64-linux-gnu/libssl.so.3",
    "build_id": "gnu-build-id:5f0c7c3c...",
    "file_hash": "sha256:..."
  },
  "symbol": {
    "name": "SSL_read",
    "address": "0x12345678",
    "symbol_id": "sym:binary:..."
  },
  "context": {
    "pid": 12345,
    "container_id": "abc123..."
  }
}
```

### 5.2 Ingestion Endpoint

```
POST /signals/runtime-facts
Content-Type: application/x-ndjson
Content-Encoding: gzip

{"event_type":"function_hit","binary":{"build_id":"gnu-build-id:..."},...}
{"event_type":"function_hit","binary":{"build_id":"gnu-build-id:..."},...}
```
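
A client preparing a request for this endpoint needs one JSON object per line, gzip-compressed. The sketch below builds only the headers and body using the standard library; the function name is hypothetical, and sending the request (host, authentication, HTTP client) is out of scope and not specified by this contract.

```python
import gzip
import json


def encode_runtime_facts(events: list) -> tuple:
    """Return (headers, body) for POST /signals/runtime-facts."""
    # NDJSON: one compact JSON document per line, each newline-terminated.
    ndjson = "".join(
        json.dumps(event, separators=(",", ":")) + "\n" for event in events
    )
    body = gzip.compress(ndjson.encode("utf-8"))
    headers = {
        "Content-Type": "application/x-ndjson",
        "Content-Encoding": "gzip",
    }
    return headers, body
```

Decompressing `body` and splitting on newlines recovers the original events, which is a convenient round-trip check in tests.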

---

## 6. RichGraph Integration

### 6.1 Node with Build-ID

```json
{
  "id": "sym:binary:...",
  "symbol_id": "sym:binary:...",
  "lang": "binary",
  "kind": "function",
  "display": "SSL_read",
  "build_id": "gnu-build-id:5f0c7c3c...",
  "code_id": "code:binary:...",
  "code_block_hash": "sha256:...",
  "purl": "pkg:deb/debian/libssl3@3.0.11"
}
```

### 6.2 CAS Evidence Storage

```
cas://binary/
  by-build-id/{build_id}/     # Index by build-id
    graph.json                # Associated graph
    symbols.json              # Symbol table
  by-code-id/{code_id}/       # Index by code-id
    block.bin                 # Code block bytes
    disasm.json               # Disassembly
```

---

## 7. Implementation Requirements

### 7.1 Scanner Changes

| Component | Change | Priority |
|-----------|--------|----------|
| ELF parser | Extract `.note.gnu.build-id` | P0 |
| PE parser | Extract Debug GUID | P0 |
| Mach-O parser | Extract `LC_UUID` | P0 |
| RichGraphBuilder | Populate `build_id` field on nodes | P0 |
| SBOM emitters | Add `stellaops:build-id` property | P1 |

### 7.2 Signals Changes

| Component | Change | Priority |
|-----------|--------|----------|
| Runtime facts ingestion | Parse and index `build_id` | P0 |
| Scoring service | Correlate by `build_id` then `code_id` | P0 |
| Store repository | Add `build_id` index | P1 |

### 7.3 CLI/UI Changes

| Component | Change | Priority |
|-----------|--------|----------|
| `stella graph explain` | Show build_id in output | P1 |
| UI symbol drawer | Display build_id with copy button | P1 |

---

## 8. Validation Rules

1. `build_id` must match regex: `^(gnu-build-id|pe-guid|macho-uuid):[a-f0-9-]+$`
2. `code_id` must match regex: `^code:[a-z]+:[A-Za-z0-9_-]+$`
3. When `build_id` is null, `build_id_fallback` must be present
4. `code_block_hash` required when `build_id` is null and symbol is stripped
5. Variant group `source_digest` must be consistent across all variants
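
Rules 1 through 4 can be checked with a few lines of Python. This is a sketch under stated assumptions: the record is a flat dictionary, the `stripped` flag is a hypothetical field used to express "symbol is stripped", and rule 5 is left out because it applies to variant groups rather than single records.

```python
import re

# Patterns copied from rules 1 and 2.
BUILD_ID_RE = re.compile(r"^(gnu-build-id|pe-guid|macho-uuid):[a-f0-9-]+$")
CODE_ID_RE = re.compile(r"^code:[a-z]+:[A-Za-z0-9_-]+$")


def validate_record(record: dict) -> list:
    """Return a list of rule violations; an empty list means valid."""
    errors = []
    build_id = record.get("build_id")
    if build_id is None:
        # Rule 3: a null build_id requires a fallback record.
        if not record.get("build_id_fallback"):
            errors.append("build_id_fallback required when build_id is null")
        # Rule 4: `stripped` is a hypothetical flag for this sketch.
        if record.get("stripped") and not record.get("code_block_hash"):
            errors.append("code_block_hash required for stripped symbol")
    elif not BUILD_ID_RE.fullmatch(build_id):
        # fullmatch avoids `$` accepting a trailing newline.
        errors.append("build_id does not match the required pattern")
    code_id = record.get("code_id")
    if code_id is not None and not CODE_ID_RE.fullmatch(code_id):
        errors.append("code_id does not match the required pattern")
    return errors
```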

---

## 9. Test Fixtures

Location: `tests/Binary/fixtures/build-id/`

| Fixture | Description |
|---------|-------------|
| `elf-with-buildid/` | ELF binary with GNU build-id |
| `elf-stripped/` | ELF stripped, fallback to code-id |
| `pe-with-guid/` | PE binary with Debug GUID |
| `macho-with-uuid/` | Mach-O binary with LC_UUID |
| `variant-group/` | Same source, multiple RIDs |

---

## 10. Related Contracts

- [richgraph-v1](./richgraph-v1.md) - Graph schema with build_id field
- [Binary Reachability](../reachability/binary-reachability-schema.md) - Binary evidence schema
- [Symbol Manifest](../specs/SYMBOL_MANIFEST_v1.md) - Symbol identification

---

## Changelog

| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0.0 | 2025-12-13 | Scanner Guild | Initial contract for build-id propagation |