Hybrid Diff Stack Architecture (Source -> Symbols -> Normalized Bytes)
Status: Implemented in BinaryIndex DeltaSig (2026-02-16)
Module: BinaryIndex with cross-module contracts (Symbols, EvidenceLocker, Policy, Attestor, ReleaseOrchestrator)
1. Objective
Produce compact, auditable patch artifacts that preserve developer intent and binary truth at the same time:
- Source-level intent: semantic edit scripts anchored to classes/functions.
- Build-level mapping: symbol map linked to immutable build identity.
- Binary-level patching: normalization-first per-symbol deltas.
- Release evidence: DSSE-signed contract consumed by policy and replay.
2. Implementation baseline (2026-02-16)
Implemented in src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/:
- Hybrid artifact contracts: semantic_edit_script, symbol_map, symbol_patch_plan, and patch_manifest (HybridDiffContracts.cs).
- Deterministic artifact composer with digest linking and manifest generation (HybridDiffComposer.cs).
- DeltaSig generation now emits function deltas from symbol-map/signature boundaries (address, section, size) instead of placeholder derivations.
- DeltaSig predicates include optional hybridDiff evidence bundle with linked digests (Attestation/DeltaSigPredicate.cs, DeltaSigService.cs).
- Verifier fail-closed checks for hybrid artifact digest/linkage mismatches and boundary/hash reconciliation in dry verification (DeltaSigService.VerifyAsync).
- Policy hooks for hybrid evidence requirements, AST anchor requirements, namespace restrictions, and patch-manifest byte budgets (DeltaSigPolicyOptions, DeltaSigService.EvaluatePolicy).
- Binary resolution API evidence (VulnResolutionResponse.Evidence) now projects deterministic hybridDiff payloads for both live lookups and cache hits so the Web evidence drawer can render semantic edit counts, symbol patch plans, manifest summaries, and digest chains from a single response.
Current constraints:
- Source semantic edits are deterministic text/symbol heuristics, not a full language-specific AST adapter.
- Symbol maps come from provided build manifests/maps when available; otherwise deterministic fallback maps are synthesized from signatures.
- Delta application dry-run remains boundary/hash-level verification; integration with a byte-level patch replay engine is still a separate Attestor/Doctor concern.
3. Target contracts
3.1 Source semantic edit script (semantic_edit_script.json)
Required fields:
- schemaVersion
- sourceTreeDigest
- edits[] where each edit includes:
  - editType: add|remove|move|update|rename
  - nodeKind: class|method|field|import|statement
  - nodePath: stable language-specific path
  - anchor: symbol-like identifier (for example Namespace.Type.Method)
  - pre and post source spans and digests
Determinism rules:
- Stable sort by file path, then node path.
- Stable source digests and normalized paths.
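The determinism rules above can be illustrated with a minimal Python sketch. It is not the DeltaSig implementation: the helper names (normalize_path, canonical_edits, script_digest), the per-edit file field, and the choice of SHA-256 over compact sorted-key JSON are all assumptions made for illustration.

```python
import hashlib
import json
import posixpath


def normalize_path(path: str) -> str:
    """Use forward slashes and collapse '.', '..' and duplicate separators."""
    return posixpath.normpath(path.replace("\\", "/"))


def canonical_edits(edits):
    """Normalize each edit's path, then sort by (file path, node path)."""
    normalized = [dict(e, file=normalize_path(e["file"])) for e in edits]
    return sorted(normalized, key=lambda e: (e["file"], e["nodePath"]))


def script_digest(edits) -> str:
    """Digest a canonical JSON encoding (sorted keys, no extra whitespace)."""
    payload = json.dumps(canonical_edits(edits), sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical edits; the 'file' field is assumed, not part of the listed contract.
edits_a = [
    {"file": "src\\Lib\\B.cs", "nodePath": "Ns.B.Run", "editType": "update"},
    {"file": "src/Lib/A.cs", "nodePath": "Ns.A.Init", "editType": "add"},
]
edits_b = list(reversed(edits_a))

# Input order and path separators do not affect the digest.
same_digest = script_digest(edits_a) == script_digest(edits_b)
```

Because sorting and path normalization happen before hashing, two producers that discover the same edits in a different order should arrive at the same digest.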
3.2 Symbol map (symbol_map.json)
Produced during build from DWARF/PDB + build metadata.
Required fields:
- schemaVersion
- buildId
- binaryDigest
- symbols[]:
  - name
  - kind (function|object|section)
  - addressStart and addressEnd
  - section
  - sourceRanges[] (file, lineStart, lineEnd)
Determinism rules:
- Symbol ordering by address then name.
- Build id must match attestation subject.
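As a rough sketch of these two rules, the Python below orders symbols by address and then name, and rejects a symbol map whose build id differs from the attestation subject. The function names and sample values are hypothetical, and it assumes addresses are stored as integers; if they were hex strings they would need parsing first so that the sort is numeric rather than lexical.

```python
def order_symbols(symbols):
    """Sort by start address, breaking ties by symbol name."""
    return sorted(symbols, key=lambda s: (s["addressStart"], s["name"]))


def assert_build_id(symbol_map, subject_build_id):
    """Raise if the symbol map was produced for a different build."""
    if symbol_map["buildId"] != subject_build_id:
        raise ValueError(
            f"buildId {symbol_map['buildId']!r} does not match "
            f"attestation subject {subject_build_id!r}"
        )


# Hypothetical symbol map with two symbols sharing one address.
symbol_map = {
    "schemaVersion": "1",
    "buildId": "build-abc",
    "symbols": [
        {"name": "Ns.B.Run", "addressStart": 0x2000},
        {"name": "Ns.Z.Alias", "addressStart": 0x1000},
        {"name": "Ns.A.Init", "addressStart": 0x1000},
    ],
}

ordered_names = [s["name"] for s in order_symbols(symbol_map["symbols"])]
assert_build_id(symbol_map, "build-abc")  # passes silently when ids match
```

The name tie-break matters for aliased symbols that share an address; without it, the output order would depend on the order in which the debug information was read.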
3.3 Symbol patch plan (symbol_patch_plan.json)
Joins source edits with concrete symbols.
Required fields:
- schemaVersion
- buildIdBefore and buildIdAfter
- editsDigest
- symbolMapDigestBefore and symbolMapDigestAfter
- changes[]:
  - symbol
  - changeType (added|removed|modified|moved)
  - astAnchors[]
  - preHash and postHash
  - deltaRef
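One way to picture the join is the Python sketch below, which derives changes[] entries from before/after symbol tables and attaches AST anchors from the source edits. The classification rules (a hash difference means modified, an address-only difference means moved), the per-symbol hash field, and all function names are assumptions for illustration and are not taken from HybridDiffComposer.cs.

```python
def classify(before, after):
    """Return a changeType, or None when the symbol is unchanged."""
    if before is None:
        return "added"
    if after is None:
        return "removed"
    if before["hash"] != after["hash"]:
        return "modified"
    if before["addressStart"] != after["addressStart"]:
        return "moved"
    return None


def build_changes(symbols_before, symbols_after, edits):
    """Join source edits to changed symbols by matching anchor to symbol name."""
    anchors = {}
    for edit in edits:
        anchors.setdefault(edit["anchor"], []).append(edit["nodePath"])

    changes = []
    # Iterate names in sorted order so the output is deterministic.
    for name in sorted(set(symbols_before) | set(symbols_after)):
        before = symbols_before.get(name)
        after = symbols_after.get(name)
        change_type = classify(before, after)
        if change_type is None:
            continue
        changes.append({
            "symbol": name,
            "changeType": change_type,
            "astAnchors": sorted(anchors.get(name, [])),
            "preHash": before["hash"] if before else None,
            "postHash": after["hash"] if after else None,
        })
    return changes


# Hypothetical inputs keyed by symbol name.
symbols_before = {
    "Ns.A.Init": {"addressStart": 0x1000, "hash": "h1"},
    "Ns.B.Run": {"addressStart": 0x2000, "hash": "h2"},
    "Ns.C.Old": {"addressStart": 0x3000, "hash": "h3"},
}
symbols_after = {
    "Ns.A.Init": {"addressStart": 0x1000, "hash": "h1b"},
    "Ns.B.Run": {"addressStart": 0x2400, "hash": "h2"},
    "Ns.D.New": {"addressStart": 0x3000, "hash": "h4"},
}
edits = [{"anchor": "Ns.A.Init", "nodePath": "Ns/A/Init"}]

changes = build_changes(symbols_before, symbols_after, edits)
```

In this sketch a changed symbol with no matching edit simply gets an empty astAnchors list; whether that is acceptable is a policy decision (see the AST anchor requirements in section 2).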
3.4 Patch manifest (patch_manifest.json)
Binds per-symbol normalized deltas to evidence and policy.
Required fields:
- schemaVersion
- buildId
- normalizationRecipeId
- patches[]:
  - symbol
  - addressRange
  - deltaDigest
  - pre (size, hash)
  - post (size, hash)
- attestation (predicateType, dsseDigest)
4. Evidence and policy integration
EvidenceLocker stores four linked artifacts per release comparison:
- semantic edit script
- symbol maps (before/after)
- symbol patch plan
- normalized patch manifest + delta blobs
Policy hooks:
- Allowlist/denylist by namespace or symbol path.
- Max function-count and max byte budget controls.
- API surface change checks.
- Hot-path and cryptography namespace protection rules.
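A minimal Python sketch of how the namespace denylist and budget hooks might be evaluated against a patch manifest follows. PolicyOptions here is a hypothetical stand-in, not DeltaSigPolicyOptions, and measuring the byte budget as the sum of post.size is an assumption; the real hook may count delta bytes instead.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PolicyOptions:
    """Hypothetical options; None means 'no limit'."""
    denied_namespaces: Tuple[str, ...] = ()
    max_functions: Optional[int] = None
    max_bytes: Optional[int] = None


def in_namespace(symbol: str, namespace: str) -> bool:
    """Match whole dotted segments so 'App.Crypto' does not match 'App.Cryptography'."""
    return symbol == namespace or symbol.startswith(namespace + ".")


def evaluate(manifest, options):
    """Return a list of human-readable violations; an empty list means pass."""
    violations = []
    patches = manifest["patches"]
    for patch in patches:
        for namespace in options.denied_namespaces:
            if in_namespace(patch["symbol"], namespace):
                violations.append(f"{patch['symbol']}: namespace {namespace} is denied")
    if options.max_functions is not None and len(patches) > options.max_functions:
        violations.append(f"{len(patches)} patched functions exceed limit {options.max_functions}")
    total_bytes = sum(patch["post"]["size"] for patch in patches)
    if options.max_bytes is not None and total_bytes > options.max_bytes:
        violations.append(f"{total_bytes} patched bytes exceed budget {options.max_bytes}")
    return violations


# Hypothetical manifest fragment with only the fields this sketch reads.
manifest = {
    "patches": [
        {"symbol": "App.Crypto.Aes.Encrypt", "post": {"size": 300}},
        {"symbol": "App.Web.Handler.Get", "post": {"size": 200}},
    ]
}
options = PolicyOptions(denied_namespaces=("App.Crypto",), max_functions=5, max_bytes=400)
violations = evaluate(manifest, options)
```

With these sample values the result contains two violations: the denied App.Crypto symbol and the 500-byte total against a 400-byte budget. Returning every violation rather than stopping at the first keeps the evaluation useful as audit evidence.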
5. Verifier contract (Attestor/Doctor)
Verifier must prove all of the following before promotion:
- Build-id and subject digest alignment.
- Re-normalization of target binary with matching recipe id.
- Dry-run delta application succeeds within declared symbol boundaries.
- Resulting hashes equal manifest post values.
- AST anchors reconcile to changed symbols in symbol patch plan.
- DSSE signatures and transparency references validate per policy.
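The fail-closed character of these checks can be sketched in Python as below. This covers only the build-id, recipe-id, and post-hash checks; signature validation, AST anchor reconciliation, and delta application are out of scope. The verify function and its parameters are hypothetical and do not mirror DeltaSigService.VerifyAsync.

```python
def verify(manifest, subject_build_id, recipe_id, observed_hashes):
    """Return a list of failures; promotion is allowed only if it is empty.

    observed_hashes maps symbol name to the hash obtained after a dry-run
    delta application (assumed to be computed elsewhere).
    """
    failures = []
    if manifest.get("buildId") != subject_build_id:
        failures.append("build id does not match attestation subject")
    if manifest.get("normalizationRecipeId") != recipe_id:
        failures.append("normalization recipe id mismatch")

    patches = manifest.get("patches")
    if not patches:
        # Fail closed: a manifest without patches proves nothing.
        failures.append("manifest declares no patches")

    for patch in patches or []:
        symbol = patch["symbol"]
        observed = observed_hashes.get(symbol)
        if observed is None:
            # Fail closed: missing evidence is a failure, not a skip.
            failures.append(f"{symbol}: no dry-run result")
        elif observed != patch["post"]["hash"]:
            failures.append(f"{symbol}: post hash mismatch")
    return failures


# Hypothetical manifest fragment and dry-run results.
manifest = {
    "buildId": "build-abc",
    "normalizationRecipeId": "norm-v1",
    "patches": [
        {"symbol": "Ns.A.Init", "post": {"hash": "aa"}},
        {"symbol": "Ns.B.Run", "post": {"hash": "bb"}},
    ],
}

complete = verify(manifest, "build-abc", "norm-v1", {"Ns.A.Init": "aa", "Ns.B.Run": "bb"})
partial = verify(manifest, "build-abc", "norm-v1", {"Ns.A.Init": "aa"})
```

Here complete is empty, while partial reports Ns.B.Run as having no dry-run result. Treating absent results as failures is what makes the check fail-closed: a verifier that silently skipped unknown symbols would let an unverified patch through.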
6. Integration boundaries
Builder step (CI): emit symbol map and normalized segments.
ReleaseOrchestrator step: combine source edits, symbol maps, and normalized bytes into patch plan and manifest.
BinaryIndex/DeltaSig: own normalization and per-symbol diff generation.
Attestor/Doctor: own verification and attestation checks.
EvidenceLocker: own storage schema and query surfaces.
Policy: consume summarized patch-plan metrics and rule evaluations.
7. Implementation tracker
Execution is tracked in:
docs/implplan/SPRINT_20260216_001_BinaryIndex_hybrid_diff_patch_pipeline.md
8. Related documents
- docs/hybrid-diff-patching.md
- docs/modules/binary-index/semantic-diffing.md
- docs/modules/binary-index/deltasig-v2-schema.md
- docs/modules/scanner/binary-diff-attestation.md
- docs/modules/evidence-locker/guides/evidence-pack-schema.md