# Golden Pairs Corpus

Golden pairs are curated binary pairs (original vs patched) used to validate binary-diff logic.
Binaries are stored outside git; this folder tracks metadata, hashes, and reports only.

## Layout

```
datasets/golden-pairs/
  index.json
  CVE-2022-0847/
    metadata.json
    original/
      vmlinux
      vmlinux.sha256
      vmlinux.sections.json
    patched/
      vmlinux
      vmlinux.sha256
      vmlinux.sections.json
    diff-report.json
    advisories/
      USN-5317-1.txt
```

## File Conventions

- `metadata.json` follows `docs/schemas/golden-pair-v1.schema.json`.
- `index.json` follows `docs/schemas/golden-pairs-index.schema.json`.
- `*.sha256` contains a single lowercase hex digest, no prefix.
- `*.sections.json` contains section hash output from the ELF hash extractor.
- `diff-report.json` is produced by `golden-pairs diff`.

## Adding a Pair

1. Create a `CVE-YYYY-NNNN/metadata.json` with required fields.
2. Fetch binaries via `golden-pairs mirror CVE-...`.
3. Generate section hashes for each binary.
4. Run `golden-pairs diff CVE-...` and review `diff-report.json`.
5. Update `index.json` with status and summary counts.

## Offline Notes

- Use cached package mirrors or `file://` sources for air-gapped runs.
- Keep hashes and timestamps deterministic; always use UTC ISO-8601 timestamps.
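
The `*.sha256` convention and the UTC timestamp rule can be illustrated with a minimal Python sketch. It is not part of the `golden-pairs` tooling: the helper names (`sha256_hex`, `write_sidecar`, `verify_sidecar`, `utc_timestamp`) are hypothetical, and the trailing newline in the sidecar file and the second-precision `Z` suffix in the timestamp are assumptions that the conventions above do not specify.

```python
import hashlib
import tempfile
from datetime import datetime, timedelta, timezone
from pathlib import Path


def sha256_hex(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the lowercase hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()  # hexdigest() always returns lowercase hex


def write_sidecar(binary: Path) -> Path:
    """Write `<name>.sha256` next to the binary: bare digest, no prefix.

    Assumption: a single trailing newline is acceptable in the sidecar.
    """
    sidecar = binary.with_name(binary.name + ".sha256")
    sidecar.write_text(sha256_hex(binary) + "\n", encoding="ascii")
    return sidecar


def verify_sidecar(binary: Path) -> bool:
    """Check that the binary still matches its recorded digest."""
    sidecar = binary.with_name(binary.name + ".sha256")
    expected = sidecar.read_text(encoding="ascii").strip()
    return expected == sha256_hex(binary)


def utc_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as UTC ISO-8601 (assumed: seconds, 'Z')."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    # Demo on a throwaway file; a real run would point at original/vmlinux etc.
    with tempfile.TemporaryDirectory() as tmp:
        binary = Path(tmp) / "vmlinux"
        binary.write_bytes(b"abc")
        sidecar = write_sidecar(binary)
        print(sidecar.name, sidecar.read_text().strip())
        print("verified:", verify_sidecar(binary))
    # A +02:00 local time is normalized to UTC before formatting.
    local = datetime(2022, 3, 7, 12, 0, 0, tzinfo=timezone(timedelta(hours=2)))
    print(utc_timestamp(local))
```

Reading the file in chunks keeps memory use flat for large images such as `vmlinux`. Normalizing every timestamp to UTC before formatting means the same instant always serializes to the same string, whatever the local timezone of the machine producing the report.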