# Hybrid Diff Stack Architecture (Source -> Symbols -> Normalized Bytes)

> Status: Implemented in BinaryIndex DeltaSig (2026-02-16)
> Module: BinaryIndex with cross-module contracts (Symbols, EvidenceLocker, Policy, Attestor, ReleaseOrchestrator)

## 1. Objective

Produce compact, auditable patch artifacts that preserve developer intent and
binary truth at the same time:

- Source-level intent: semantic edit scripts anchored to classes/functions.
- Build-level mapping: symbol map linked to immutable build identity.
- Binary-level patching: normalization-first per-symbol deltas.
- Release evidence: DSSE-signed contract consumed by policy and replay.

## 2. Implementation baseline (2026-02-16)

Implemented in `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/`:

- Hybrid artifact contracts: `semantic_edit_script`, `symbol_map`,
  `symbol_patch_plan`, and `patch_manifest` (`HybridDiffContracts.cs`).
- Deterministic artifact composer with digest linking and manifest generation
  (`HybridDiffComposer.cs`).
- DeltaSig generation now emits function deltas from symbol-map/signature
  boundaries (address, section, size) instead of placeholder derivations.
- DeltaSig predicates include optional `hybridDiff` evidence bundle with linked
  digests (`Attestation/DeltaSigPredicate.cs`, `DeltaSigService.cs`).
- Verifier fail-closed checks for hybrid artifact digest/linkage mismatches and
  boundary/hash reconciliation in dry verification (`DeltaSigService.VerifyAsync`).
- Policy hooks for hybrid evidence requirements, AST anchor requirements,
  namespace restrictions, and patch-manifest byte budgets
  (`DeltaSigPolicyOptions`, `DeltaSigService.EvaluatePolicy`).
- Binary resolution API evidence (`VulnResolutionResponse.Evidence`) now projects
  deterministic `hybridDiff` payloads for both live lookups and cache hits so
  the Web evidence drawer can render semantic edit counts, symbol patch plans,
  manifest summaries, and digest chains from a single response.

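
To make "digest linking" concrete, here is a minimal Python sketch of how a composer could hash each artifact over a stable encoding and how a verifier could fail closed on a mismatch. The canonical form (sorted keys, compact separators), the `sha256:` prefix, and the helper names `canonical_digest`, `link_artifacts`, and `links_match` are assumptions for illustration; none of them are taken from `HybridDiffComposer.cs`.

```python
import hashlib
import json


def canonical_digest(artifact: dict) -> str:
    """Digest an artifact over a stable byte encoding."""
    # Assumption: sorted keys + compact separators approximate the canonical form.
    encoded = json.dumps(artifact, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(encoded).hexdigest()


def link_artifacts(edit_script: dict, map_before: dict, map_after: dict) -> dict:
    """Return the digest links a downstream patch plan would carry."""
    return {
        "editsDigest": canonical_digest(edit_script),
        "symbolMapDigestBefore": canonical_digest(map_before),
        "symbolMapDigestAfter": canonical_digest(map_after),
    }


def links_match(plan: dict, edit_script: dict, map_before: dict, map_after: dict) -> bool:
    """Fail closed: a missing or differing digest makes the linkage invalid."""
    expected = link_artifacts(edit_script, map_before, map_after)
    return all(plan.get(key) == value for key, value in expected.items())
```

Because the digest is computed over a key-sorted encoding, two semantically identical artifacts produce the same digest regardless of the order in which their fields were written.
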
Current constraints:

- Source semantic edits are deterministic text/symbol heuristics, not a full
  language-specific AST adapter.
- Symbol maps come from provided build manifests/maps when available; otherwise
  deterministic fallback maps are synthesized from signatures.
- Delta application dry-run remains boundary/hash level verification; byte-level
  patch replay engine integration is still a separate Attestor/Doctor concern.

## 3. Target contracts

### 3.1 Source semantic edit script (`semantic_edit_script.json`)

Required fields:

- `schemaVersion`
- `sourceTreeDigest`
- `edits[]` where each edit includes:
  - `editType`: `add|remove|move|update|rename`
  - `nodeKind`: `class|method|field|import|statement`
  - `nodePath`: stable language-specific path
  - `anchor`: symbol-like identifier (for example `Namespace.Type.Method`)
  - `pre` and `post` source spans and digests

Determinism rules:

- Stable sort by file path, then node path.
- Stable source digests and normalized paths.

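
The determinism rules above can be sketched in Python as follows. This is an illustration only: the per-edit `file` key, the forward-slash path normalization, and the line-ending normalization before hashing are assumptions, since the contract does not spell out where the file path lives or how digests are normalized.

```python
import hashlib
import posixpath


def normalize_path(path: str) -> str:
    """Use forward slashes and collapse redundant segments."""
    return posixpath.normpath(path.replace("\\", "/"))


def span_digest(text: str) -> str:
    """Hash a source span so the digest does not depend on the platform."""
    # Assumption: CRLF is folded to LF before hashing.
    normalized = text.replace("\r\n", "\n")
    return "sha256:" + hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def order_edits(edits: list) -> list:
    """Normalize paths, then sort by file path and node path."""
    # `file` is a hypothetical key used here to carry the edit's file path.
    normalized = [{**edit, "file": normalize_path(edit["file"])} for edit in edits]
    # sorted() is stable, so edits with equal keys keep their input order.
    return sorted(normalized, key=lambda edit: (edit["file"], edit["nodePath"]))
```

Sorting on the normalized path rather than the raw one keeps the output identical whether the edit script was produced on Windows or Linux.
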
### 3.2 Symbol map (`symbol_map.json`)

Produced during build from DWARF/PDB + build metadata.

Required fields:

- `schemaVersion`
- `buildId`
- `binaryDigest`
- `symbols[]`:
  - `name`
  - `kind` (`function|object|section`)
  - `addressStart` and `addressEnd`
  - `section`
  - `sourceRanges[]` (`file`, `lineStart`, `lineEnd`)

Determinism rules:

- Symbol ordering by address then name.
- Build id must match attestation subject.

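
A small Python sketch of the two symbol-map rules, under stated assumptions: "address" is taken to mean `addressStart`, addresses are integers, and the function names `order_symbols` and `check_symbol_map` are hypothetical.

```python
def order_symbols(symbols: list) -> list:
    """Order symbols by start address, breaking ties by name."""
    return sorted(symbols, key=lambda s: (s["addressStart"], s["name"]))


def check_symbol_map(symbol_map: dict, subject_build_id: str) -> list:
    """Return a list of problems; an empty list means the map looks consistent."""
    problems = []
    if symbol_map["buildId"] != subject_build_id:
        problems.append("buildId does not match attestation subject")
    symbols = symbol_map["symbols"]
    if symbols != order_symbols(symbols):
        problems.append("symbols are not ordered by address then name")
    for symbol in symbols:
        # Holds whether addressEnd is inclusive or exclusive.
        if symbol["addressEnd"] < symbol["addressStart"]:
            problems.append(f"{symbol['name']}: addressEnd precedes addressStart")
    return problems
```
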
### 3.3 Symbol patch plan (`symbol_patch_plan.json`)

Joins source edits with concrete symbols.

Required fields:

- `schemaVersion`
- `buildIdBefore` and `buildIdAfter`
- `editsDigest`
- `symbolMapDigestBefore` and `symbolMapDigestAfter`
- `changes[]`:
  - `symbol`
  - `changeType` (`added|removed|modified|moved`)
  - `astAnchors[]`
  - `preHash` and `postHash`
  - `deltaRef`

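
As a rough illustration of the join, the Python sketch below derives `changes[]` from per-symbol hashes before and after a build plus the anchors found in the edit script. It rests on several assumptions: an edit is matched to a symbol only when its `anchor` equals the symbol name exactly, `moved` detection is left out because it needs address comparison, and the `deltaRef` value format is hypothetical.

```python
def build_changes(before: dict, after: dict, edits: list) -> list:
    """Join symbol hashes (name -> hash) with source edit anchors."""
    anchored = {edit["anchor"] for edit in edits}
    changes = []
    # Sorted iteration keeps the output deterministic.
    for symbol in sorted(set(before) | set(after)):
        pre_hash, post_hash = before.get(symbol), after.get(symbol)
        if pre_hash == post_hash:
            continue  # unchanged symbol, nothing to record
        if pre_hash is None:
            change_type = "added"
        elif post_hash is None:
            change_type = "removed"
        else:
            change_type = "modified"
        changes.append({
            "symbol": symbol,
            "changeType": change_type,
            # Empty list means no source edit explains this binary change.
            "astAnchors": [symbol] if symbol in anchored else [],
            "preHash": pre_hash,
            "postHash": post_hash,
            "deltaRef": f"delta:{symbol}",  # hypothetical reference format
        })
    return changes
```

A change with an empty `astAnchors` list is the case a policy with AST anchor requirements would want to flag.
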
### 3.4 Patch manifest (`patch_manifest.json`)

Binds per-symbol normalized deltas to evidence and policy.

Required fields:

- `schemaVersion`
- `buildId`
- `normalizationRecipeId`
- `patches[]`:
  - `symbol`
  - `addressRange`
  - `deltaDigest`
  - `pre` (`size`, `hash`)
  - `post` (`size`, `hash`)
- `attestation` (`predicateType`, `dsseDigest`)

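
For orientation, a `patches[]` entry could be assembled along these lines. This Python sketch assumes SHA-256 with a `sha256:` prefix, an `addressRange` expressed as a `start`/`end` object, and that `pre`/`post` describe the normalized bytes of the symbol; the contract above does not fix any of these details.

```python
import hashlib


def describe(data: bytes) -> dict:
    """Size and hash of a normalized byte range."""
    return {"size": len(data), "hash": "sha256:" + hashlib.sha256(data).hexdigest()}


def build_patch_entry(symbol: str, start: int, end: int,
                      pre_bytes: bytes, post_bytes: bytes, delta_blob: bytes) -> dict:
    """Assemble one manifest entry from normalized bytes and the delta blob."""
    return {
        "symbol": symbol,
        "addressRange": {"start": start, "end": end},  # hypothetical shape
        "deltaDigest": describe(delta_blob)["hash"],
        "pre": describe(pre_bytes),
        "post": describe(post_bytes),
    }
```
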
## 4. Evidence and policy integration

EvidenceLocker stores four linked artifacts per release comparison:

1. semantic edit script
2. symbol maps (before/after)
3. symbol patch plan
4. normalized patch manifest + delta blobs

Policy hooks:

- Allowlist/denylist by namespace or symbol path.
- Max function-count and max byte budget controls.
- API surface change checks.
- Hot-path and cryptography namespace protection rules.

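
The namespace and budget hooks can be sketched as a pure function over summarized patch metrics. Everything named here is hypothetical: `PolicyOptions` is not the real `DeltaSigPolicyOptions` shape, and the per-patch `deltaSize` metric is an assumed input. API surface and hot-path checks are omitted.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PolicyOptions:
    """Hypothetical options for illustration only."""
    denied_namespace_prefixes: Sequence[str] = ()
    max_changed_functions: Optional[int] = None
    max_patch_bytes: Optional[int] = None


def evaluate_policy(patches: list, options: PolicyOptions) -> list:
    """Return human-readable violations; an empty list means the patch set passes."""
    violations = []
    for patch in patches:
        for prefix in options.denied_namespace_prefixes:
            if patch["symbol"].startswith(prefix):
                violations.append(f"{patch['symbol']}: namespace {prefix!r} is denied")
    count = len(patches)
    if options.max_changed_functions is not None and count > options.max_changed_functions:
        violations.append(f"{count} changed functions exceeds {options.max_changed_functions}")
    total = sum(patch["deltaSize"] for patch in patches)
    if options.max_patch_bytes is not None and total > options.max_patch_bytes:
        violations.append(f"{total} patch bytes exceeds budget {options.max_patch_bytes}")
    return violations
```

Returning every violation instead of stopping at the first one gives reviewers the full picture in a single evaluation.
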
## 5. Verifier contract (Attestor/Doctor)

Verifier must prove all of the following before promotion:

- Build-id and subject digest alignment.
- Re-normalization of target binary with matching recipe id.
- Dry-run delta application succeeds within declared symbol boundaries.
- Resulting hashes equal manifest `post` values.
- AST anchors reconcile to changed symbols in symbol patch plan.
- DSSE signatures and transparency references validate per policy.

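
A fail-closed check for the boundary and hash items might look like the Python sketch below. It assumes the dry-run result is already available as bytes, that `addressRange` is a `start`/`end` object with an exclusive `end`, and that hashes are SHA-256 with a `sha256:` prefix. The function name `verify_patch` is hypothetical, and signature, anchor, and recipe checks are out of scope here.

```python
import hashlib


def sha256_tag(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify_patch(patch: dict, pre_bytes: bytes, post_bytes: bytes) -> list:
    """Return failure reasons; only an empty list allows promotion."""
    failures = []
    try:
        start = patch["addressRange"]["start"]
        end = patch["addressRange"]["end"]
        if sha256_tag(pre_bytes) != patch["pre"]["hash"]:
            failures.append("pre hash mismatch")
        if len(post_bytes) > end - start:
            failures.append("result exceeds declared symbol boundary")
        if len(post_bytes) != patch["post"]["size"] or sha256_tag(post_bytes) != patch["post"]["hash"]:
            failures.append("post size/hash mismatch")
    except (KeyError, TypeError) as exc:
        # Fail closed: a malformed entry is a failure, never a pass.
        failures.append(f"malformed patch entry: {exc!r}")
    return failures
```

Treating malformed or incomplete entries as failures is what makes the check fail closed: the verifier never passes a patch it could not fully evaluate.
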
## 6. Integration boundaries

Builder step (CI): emit symbol map and normalized segments.

ReleaseOrchestrator step: combine source edits, symbol maps, and normalized
bytes into patch plan and manifest.

BinaryIndex/DeltaSig: own normalization and per-symbol diff generation.

Attestor/Doctor: own verification and attestation checks.

EvidenceLocker: own storage schema and query surfaces.

Policy: consume summarized patch-plan metrics and rule evaluations.

## 7. Implementation tracker

Execution is tracked in:

- `docs/implplan/SPRINT_20260216_001_BinaryIndex_hybrid_diff_patch_pipeline.md`

## 8. Related documents

- `docs/hybrid-diff-patching.md`
- `docs/modules/binary-index/semantic-diffing.md`
- `docs/modules/binary-index/deltasig-v2-schema.md`
- `docs/modules/scanner/binary-diff-attestation.md`
- `docs/modules/evidence-locker/guides/evidence-pack-schema.md`