save changes
@@ -37,6 +37,7 @@ Key features:
## Related Documentation

- Architecture: `./architecture.md`
- Hybrid Diff Stack: `./hybrid-diff-stack.md`
- High-Level Architecture: `../../ARCHITECTURE_OVERVIEW.md`
- Scanner Architecture: `../scanner/architecture.md`
- Concelier Architecture: `../concelier/architecture.md`

@@ -63,7 +64,7 @@ A major enhancement to BinaryIndex is planned to enable **semantic-level binary
| **Phase 1** | IR-Level Semantic Analysis | +15% accuracy on optimized binaries | Planned |
| **Phase 2** | Function Behavior Corpus | +10% coverage on stripped binaries | Planned |
| **Phase 3** | Ghidra Integration | +5% edge case handling | Planned |
-| **Phase 4** | Decompiler & ML Similarity | +10% obfuscation resilience | Planned |
+| **Phase 4** | Decompiler and ML Similarity | +10% obfuscation resilience | Planned |

### New Libraries (Planned)

@@ -3,7 +3,7 @@
> **Ownership:** Scanner Guild + Concelier Guild
> **Status:** DRAFT
> **Version:** 1.0.0
-> **Related:** [High-Level Architecture](../../ARCHITECTURE_OVERVIEW.md), [Scanner Architecture](../scanner/architecture.md), [Concelier Architecture](../concelier/architecture.md)
+> **Related:** [High-Level Architecture](../../ARCHITECTURE_OVERVIEW.md), [Scanner Architecture](../scanner/architecture.md), [Concelier Architecture](../concelier/architecture.md), [Hybrid Diff Stack](./hybrid-diff-stack.md)

---

@@ -1774,3 +1774,4 @@ inside `AddNormalizationPipelines()` in `ServiceCollectionExtensions.cs`.

*Document Version: 1.5.0*
*Last Updated: 2026-02-12*

163
docs/modules/binary-index/hybrid-diff-stack.md
Normal file
@@ -0,0 +1,163 @@
# Hybrid Diff Stack Architecture (Source -> Symbols -> Normalized Bytes)

> Status: Planned (advisory translation, 2026-02-16)
> Module: BinaryIndex with cross-module contracts (Symbols, EvidenceLocker, Policy, Attestor, ReleaseOrchestrator)

## 1. Objective

Produce compact, auditable patch artifacts that preserve developer intent and
binary truth at the same time:

- Source-level intent: semantic edit scripts anchored to classes/functions.
- Build-level mapping: symbol map linked to immutable build identity.
- Binary-level patching: normalization-first per-symbol deltas.
- Release evidence: DSSE-signed contract consumed by policy and replay.

## 2. Current implementation baseline

Implemented today:

- ELF normalization passes and deterministic delta hash generation.
- DeltaSig predicate contracts (v1 and v2) with CLI author/sign/verify flows.
- Symbol manifest model with debug id, code id, source paths, and line data.

Gaps for full advisory scope:

- No AST semantic edit script artifact pipeline in current release workflow.
- No canonical builder output for source-range to symbol-address map as a
  first-class build artifact contract.
- No end-to-end "source edits -> symbol patch plan -> normalized deltas"
  bundle schema consumed by release policy.
- Existing function delta composition still contains placeholder address/size
  behavior in parts of DeltaSig generation.

## 3. Target contracts

### 3.1 Source semantic edit script (`semantic_edit_script.json`)

Required fields:

- `schemaVersion`
- `sourceTreeDigest`
- `edits[]` where each edit includes:
  - `editType`: `add|remove|move|update|rename`
  - `nodeKind`: `class|method|field|import|statement`
  - `nodePath`: stable language-specific path
  - `anchor`: symbol-like identifier (for example `Namespace.Type.Method`)
  - `pre` and `post` source spans and digests

Determinism rules:

- Stable sort by file path, then node path.
- Stable source digests and normalized paths.

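The determinism rules above can be illustrated with a minimal Python sketch. It is not the planned implementation: the helper names `canonical_edit_script` and `edits_digest`, the `file` key inside each span, and the choice of SHA-256 over compact sorted-key JSON are all assumptions made for illustration.

```python
import hashlib
import json


def canonical_edit_script(script: dict) -> bytes:
    """Serialize an edit script so equal content always yields equal bytes."""

    def sort_key(edit: dict) -> tuple:
        # Assumption: each span carries a normalized `file` path; removals
        # have no `post` span, so fall back to `pre`.
        span = edit.get("post") or edit.get("pre") or {}
        return (span.get("file", ""), edit["nodePath"])

    # Stable sort by file path, then node path.
    ordered = dict(script, edits=sorted(script["edits"], key=sort_key))
    # Sorted keys and fixed separators remove formatting differences.
    return json.dumps(ordered, sort_keys=True, separators=(",", ":")).encode("utf-8")


def edits_digest(script: dict) -> str:
    """SHA-256 digest of the canonical form (hypothetical digest format)."""
    return "sha256:" + hashlib.sha256(canonical_edit_script(script)).hexdigest()
```

With this shape, two scripts that list the same edits in a different order produce the same digest, which is the property a replay or policy consumer would rely on.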
### 3.2 Symbol map (`symbol_map.json`)

Produced during build from DWARF/PDB + build metadata.

Required fields:

- `schemaVersion`
- `buildId`
- `binaryDigest`
- `symbols[]`:
  - `name`
  - `kind` (`function|object|section`)
  - `addressStart` and `addressEnd`
  - `section`
  - `sourceRanges[]` (`file`, `lineStart`, `lineEnd`)

Determinism rules:

- Symbol ordering by address then name.
- Build id must match attestation subject.

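As a rough sketch of how the two symbol-map rules could be checked, the Python below orders symbols by address then name and compares `buildId` against a value taken from the attestation. The function names are hypothetical, addresses are assumed to be integers, and how the build id is extracted from the attestation subject is left to the caller.

```python
def order_symbols(symbols: list) -> list:
    """Sort by start address, then by name to break ties between aliases."""
    return sorted(symbols, key=lambda s: (s["addressStart"], s["name"]))


def check_symbol_map(symbol_map: dict, attested_build_id: str) -> list:
    """Return a list of violations; an empty list means the map passes."""
    problems = []
    # Assumption: the caller has already extracted the build id from the
    # attestation subject.
    if symbol_map["buildId"] != attested_build_id:
        problems.append("buildId does not match attestation subject")
    symbols = symbol_map["symbols"]
    if symbols != order_symbols(symbols):
        problems.append("symbols are not ordered by address then name")
    for sym in symbols:
        if sym["addressEnd"] < sym["addressStart"]:
            problems.append(f"{sym['name']}: addressEnd precedes addressStart")
    return problems
```

Returning a list of problems rather than raising on the first one keeps the check usable as audit output.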
### 3.3 Symbol patch plan (`symbol_patch_plan.json`)

Joins source edits with concrete symbols.

Required fields:

- `schemaVersion`
- `buildIdBefore` and `buildIdAfter`
- `editsDigest`
- `symbolMapDigestBefore` and `symbolMapDigestAfter`
- `changes[]`:
  - `symbol`
  - `changeType` (`added|removed|modified|moved`)
  - `astAnchors[]`
  - `preHash` and `postHash`
  - `deltaRef`

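One way the join between edits and symbols might work is by overlapping source line ranges. The Python sketch below is illustrative only. It assumes edit spans use the same `file`, `lineStart`, `lineEnd` shape as `sourceRanges[]`, and it joins against a single symbol map, whereas a real plan would consult both the before and after maps. The function names are hypothetical.

```python
def ranges_overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """Inclusive line ranges overlap when each starts before the other ends."""
    return a_start <= b_end and b_start <= a_end


def join_edits_to_symbols(edits: list, symbols: list) -> list:
    """Group edit anchors under every symbol whose source ranges they touch."""
    anchors_by_symbol = {}
    for edit in edits:
        # Assumption: spans share the file/lineStart/lineEnd shape of sourceRanges[].
        span = edit.get("post") or edit.get("pre")
        if span is None:
            continue
        for sym in symbols:
            if any(
                rng["file"] == span["file"]
                and ranges_overlap(span["lineStart"], span["lineEnd"],
                                   rng["lineStart"], rng["lineEnd"])
                for rng in sym["sourceRanges"]
            ):
                anchors_by_symbol.setdefault(sym["name"], set()).add(edit["anchor"])
    # Sorted output keeps the resulting plan deterministic.
    return [
        {"symbol": name, "astAnchors": sorted(anchors)}
        for name, anchors in sorted(anchors_by_symbol.items())
    ]
```

Each returned entry carries only the `symbol` and `astAnchors[]` parts of a `changes[]` item. Hashes, `changeType`, and `deltaRef` would come from the binary diff stage.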
### 3.4 Patch manifest (`patch_manifest.json`)

Binds per-symbol normalized deltas to evidence and policy.

Required fields:

- `schemaVersion`
- `buildId`
- `normalizationRecipeId`
- `patches[]`:
  - `symbol`
  - `addressRange`
  - `deltaDigest`
  - `pre` (`size`, `hash`)
  - `post` (`size`, `hash`)
- `attestation` (`predicateType`, `dsseDigest`)

## 4. Evidence and policy integration

EvidenceLocker stores four linked artifacts per release comparison:

1. semantic edit script
2. symbol maps (before/after)
3. symbol patch plan
4. normalized patch manifest + delta blobs

Policy hooks:

- Allowlist/denylist by namespace or symbol path.
- Max function-count and max byte budget controls.
- API surface change checks.
- Hot-path and cryptography namespace protection rules.

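To make the first two policy hooks concrete, here is a small Python sketch that evaluates a denylist and the count and byte budgets over `patches[]` entries. The function name, the glob-style pattern syntax, and the choice to measure the byte budget against `post.size` are assumptions, not part of the contract.

```python
from fnmatch import fnmatchcase


def evaluate_patch_policy(patches, denylist=(), max_functions=None, max_bytes=None):
    """Return policy violations for a list of `patches[]` entries."""
    violations = []
    for patch in patches:
        # Assumption: denylist entries are glob patterns over the symbol path.
        if any(fnmatchcase(patch["symbol"], pattern) for pattern in denylist):
            violations.append(f"denied symbol: {patch['symbol']}")
    # Assumption: every patch entry counts as one function.
    if max_functions is not None and len(patches) > max_functions:
        violations.append(f"function count {len(patches)} exceeds {max_functions}")
    # Assumption: the byte budget is measured against post-patch symbol sizes.
    total_bytes = sum(patch["post"]["size"] for patch in patches)
    if max_bytes is not None and total_bytes > max_bytes:
        violations.append(f"patched bytes {total_bytes} exceed {max_bytes}")
    return violations
```

A cryptography protection rule could then be expressed as a denylist pattern such as `App.Crypto.*`, where the namespace is purely illustrative.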
## 5. Verifier contract (Attestor/Doctor)

Verifier must prove all of the following before promotion:

- Build-id and subject digest alignment.
- Re-normalization of target binary with matching recipe id.
- Dry-run delta application succeeds within declared symbol boundaries.
- Resulting hashes equal manifest `post` values.
- AST anchors reconcile to changed symbols in symbol patch plan.
- DSSE signatures and transparency references validate per policy.

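The boundary and hash checks in the list above might look like the following Python sketch for a single manifest entry. It assumes `addressRange` is a half-open `[start, end)` pair of integers and that hashes are SHA-256 strings prefixed with `sha256:`. Neither detail is fixed by this document, and `verify_patch_result` is a hypothetical name.

```python
import hashlib


def verify_patch_result(patch: dict, patched_bytes: bytes) -> list:
    """Compare dry-run output for one symbol against its manifest entry."""
    failures = []
    # Assumption: `addressRange` is a half-open [start, end) pair of integers.
    start, end = patch["addressRange"]
    if len(patched_bytes) > end - start:
        failures.append("patched bytes exceed declared symbol boundary")
    if len(patched_bytes) != patch["post"]["size"]:
        failures.append("size differs from manifest post.size")
    # Assumption: manifest hashes are SHA-256 with a "sha256:" prefix.
    digest = "sha256:" + hashlib.sha256(patched_bytes).hexdigest()
    if digest != patch["post"]["hash"]:
        failures.append("hash differs from manifest post.hash")
    return failures
```

The sketch covers only two of the six checks. Signature validation, re-normalization, and anchor reconciliation need the real DSSE and DeltaSig tooling.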
## 6. Integration boundaries

Builder step (CI): emit symbol map and normalized segments.

ReleaseOrchestrator step: combine source edits, symbol maps, and normalized
bytes into patch plan and manifest.

BinaryIndex/DeltaSig: own normalization and per-symbol diff generation.

Attestor/Doctor: own verification and attestation checks.

EvidenceLocker: own storage schema and query surfaces.

Policy: consume summarized patch-plan metrics and rule evaluations.

## 7. Implementation tracker

Execution is tracked in:

- `docs/implplan/SPRINT_20260216_001_BinaryIndex_hybrid_diff_patch_pipeline.md`

## 8. Related documents

- `docs/hybrid-diff-patching.md`
- `docs/modules/binary-index/semantic-diffing.md`
- `docs/modules/binary-index/deltasig-v2-schema.md`
- `docs/modules/scanner/binary-diff-attestation.md`
- `docs/modules/evidence-locker/guides/evidence-pack-schema.md`

@@ -49,6 +49,7 @@ Key settings:
## Related Documentation

- Architecture: `./architecture.md`
- Contract: `./contracts/ebpf-micro-witness-determinism-profile.md`
- Policy Engine: `../policy/`
- VexLens: `../vex-lens/`
- High-Level Architecture: `../../ARCHITECTURE_OVERVIEW.md`

@@ -0,0 +1,124 @@
# eBPF Micro-Witness Determinism Profile v1.0.0

**Status:** PLANNED
**Version:** 1.0.0
**Effective:** 2026-02-16
**Owner:** Signals Guild + Scanner Guild + Attestor Guild + Evidence Locker Guild
**Sprint:** `docs/implplan/SPRINT_20260216_001_Signals_ebpf_micro_witness_determinism_profile.md`

---

## 1. Purpose

This profile defines the minimum deterministic contract for runtime eBPF "micro-witnesses" so replay yields the same symbolized result across distros/toolchains and in offline environments.

---

## 2. Contract Scope

- Runtime collection and BTF selection (`Signals`).
- Runtime witness payload schema and signing (`Scanner`).
- DSSE and transparency evidence shape (`Attestor`).
- Portable storage/export/indexing (`Evidence Locker`).

---

## 3. Runtime Loader Contract (BTF Selection)

### 3.1 Selection order (mandatory)

1. `/sys/kernel/btf/vmlinux`
2. configured full-kernel BTF path (for example distro debug package path)
3. split-BTF selected by `{kernel_release, arch}`

### 3.2 Required emitted metadata

```json
{
  "kernel_release": "6.8.0-45-generic",
  "kernel_arch": "x86_64",
  "btf": {
    "source_kind": "kernel|external-vmlinux|split-btf",
    "source_path": "/sys/kernel/btf/vmlinux",
    "source_digest": "sha256:...",
    "selection_reason": "kernel_btf_present"
  }
}
```

`source_path` and `source_digest` are mandatory for deterministic replay.

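The selection order and emitted metadata can be sketched in Python as follows. This is a minimal illustration, not the collector's code. `select_btf` and `sha256_file` are hypothetical helpers, the split-BTF file naming (`<kernel_release>-<arch>.btf`) is invented for the example, and every `selection_reason` value other than `kernel_btf_present` is an assumption.

```python
import hashlib
import os

KERNEL_BTF = "/sys/kernel/btf/vmlinux"


def sha256_file(path: str) -> str:
    """Stream a file through SHA-256 and return a prefixed hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return "sha256:" + h.hexdigest()


def select_btf(kernel_release, arch, configured_path=None, split_btf_dir=None,
               kernel_btf=KERNEL_BTF):
    """Walk the mandatory order and return metadata for the first BTF found."""
    candidates = [("kernel", kernel_btf, "kernel_btf_present")]
    if configured_path:
        # Hypothetical reason string; only "kernel_btf_present" appears in the profile.
        candidates.append(("external-vmlinux", configured_path, "configured_btf_present"))
    if split_btf_dir:
        # Hypothetical layout: one split-BTF file per {kernel_release, arch}.
        split_path = os.path.join(split_btf_dir, f"{kernel_release}-{arch}.btf")
        candidates.append(("split-btf", split_path, "split_btf_matched"))
    for kind, path, reason in candidates:
        if os.path.isfile(path):
            return {
                "kernel_release": kernel_release,
                "kernel_arch": arch,
                "btf": {
                    "source_kind": kind,
                    "source_path": path,
                    "source_digest": sha256_file(path),
                    "selection_reason": reason,
                },
            }
    return None  # No BTF source available; the caller decides how to fail.
```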

---

## 4. Deterministic Symbolization Contract

Each runtime witness must carry deterministic symbolization inputs:

```json
{
  "symbolization": {
    "build_id": "gnu-build-id:...",
    "debug_artifact_uri": "cas://symbols/by-build-id/gnu-build-id:.../artifact.debug",
    "symbol_table_uri": "cas://symbols/by-build-id/gnu-build-id:.../symtab.json",
    "symbolizer": {
      "name": "llvm-symbolizer",
      "version": "18.1.7",
      "digest": "sha256:..."
    },
    "libc_variant": "glibc|musl",
    "sysroot_digest": "sha256:..."
  }
}
```

At least one of `debug_artifact_uri` or `symbol_table_uri` must be present.

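A validator for this tuple could be as small as the Python sketch below. Treating every field in the example other than the two URIs as required is an assumption drawn from the example shape, and `validate_symbolization` is a hypothetical name.

```python
def validate_symbolization(witness: dict) -> list:
    """Return validation errors for a witness's `symbolization` block."""
    sym = witness.get("symbolization") or {}
    errors = []
    # Assumption: every field in the example other than the two URIs is required.
    for field in ("build_id", "symbolizer", "libc_variant", "sysroot_digest"):
        if not sym.get(field):
            errors.append(f"missing symbolization.{field}")
    # Stated rule: at least one of the two artifact URIs must be present.
    if not (sym.get("debug_artifact_uri") or sym.get("symbol_table_uri")):
        errors.append("need debug_artifact_uri or symbol_table_uri")
    if sym.get("libc_variant") not in (None, "glibc", "musl"):
        errors.append("libc_variant must be glibc or musl")
    return errors
```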

---

## 5. Witness Packaging Contract

Each micro-witness must be exportable as:

1. `trace.json` (canonical payload)
2. `trace.dsse.json` (DSSE envelope)
3. `trace.sigstore.json` (Sigstore bundle with signature/cert/transparency proof)

Offline verification must use only bundle-contained material (no network dependency).

---

## 6. Evidence Locker Index Contract

Evidence Locker must index runtime witness artifacts by:

- `build_id`
- `kernel_release`
- `probe_id`
- `policy_run_id`

These keys are required for deterministic replay lookup and audit search.

---

## 7. Validation Matrix (minimum)

- Kernel matrix: at least 3 supported kernel lines.
- libc matrix: glibc + musl.
- Verification modes: online + offline.
- Determinism check: byte-identical replayed frame output for fixed input evidence.

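The determinism check in the last bullet can be expressed as a small test helper. The Python sketch below assumes a hypothetical `replay` callable that takes the fixed evidence bytes and returns the symbolized frame output as bytes. The real replay entry point is not defined in this profile.

```python
import hashlib


def replay_is_deterministic(replay, evidence: bytes, runs: int = 3) -> bool:
    """Replay the same evidence several times and require identical bytes."""
    # Hashing each output means byte-identical runs collapse to one digest.
    digests = {hashlib.sha256(replay(evidence)).hexdigest() for _ in range(runs)}
    return len(digests) == 1
```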

---

## 8. Confirmed Gaps (2026-02-16 Baseline)

- Hard BTF dependency with no split-BTF fallback metadata contract in collector:
  - `src/Signals/__Libraries/StellaOps.Signals.Ebpf/Services/RuntimeSignalCollector.cs`
- Probe load path is simulated and does not record selected BTF source:
  - `src/Signals/__Libraries/StellaOps.Signals.Ebpf/Probes/CoreProbeLoader.cs`
- Runtime witness payload lacks required symbolization tuple fields:
  - `src/Scanner/__Libraries/StellaOps.Scanner.Reachability/Witnesses/PathWitness.cs`
  - `src/Scanner/__Libraries/StellaOps.Scanner.Reachability/Witnesses/RuntimeObservation.cs`
- Runtime witness generator implementation is missing:
  - `src/Scanner/__Libraries/StellaOps.Scanner.Reachability/Witnesses/IRuntimeWitnessGenerator.cs`
- Sigstore bundle (`trace.sigstore.json`) is not yet standardized in witness storage/export:
  - `src/Scanner/__Libraries/StellaOps.Scanner.Storage/Postgres/Migrations/013_witness_storage.sql`
  - `src/EvidenceLocker/__Libraries/StellaOps.EvidenceLocker.Export/Models/BundleManifest.cs`