2026-01-13 18:53:39 +02:00

Golden Pairs Corpus

Golden pairs are curated pairs of binaries (original vs. patched) used to validate binary-diff logic. The binaries themselves are stored outside git; this folder tracks only metadata, hashes, and reports.

Layout

datasets/golden-pairs/
  index.json
  CVE-2022-0847/
    metadata.json
    original/
      vmlinux
      vmlinux.sha256
      vmlinux.sections.json
    patched/
      vmlinux
      vmlinux.sha256
      vmlinux.sections.json
    diff-report.json
    advisories/
      USN-5317-1.txt
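
The layout above can be checked mechanically before a pair is indexed. The following is a minimal sketch, not part of the golden-pairs tooling: the helper names `expected_files` and `missing_files` are hypothetical, and the binary name is a parameter because `vmlinux` is only the name used by the CVE-2022-0847 example. The contents of advisories/ vary per pair and are not checked.

```python
from pathlib import Path


def expected_files(binary_name: str = "vmlinux") -> list[str]:
    """List the per-pair files shown in the layout, relative to the pair directory."""
    files = ["metadata.json", "diff-report.json"]
    for side in ("original", "patched"):
        files += [
            f"{side}/{binary_name}",
            f"{side}/{binary_name}.sha256",
            f"{side}/{binary_name}.sections.json",
        ]
    return files


def missing_files(pair_dir: Path, binary_name: str = "vmlinux") -> list[str]:
    """Return the expected relative paths that do not exist under pair_dir."""
    return [
        rel
        for rel in expected_files(binary_name)
        if not (pair_dir / rel).is_file()
    ]
```

Because binaries live outside git, a fresh checkout would be expected to report `original/vmlinux` and `patched/vmlinux` as missing until they have been fetched.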

File Conventions

  • metadata.json follows docs/schemas/golden-pair-v1.schema.json.
  • index.json follows docs/schemas/golden-pairs-index.schema.json.
  • *.sha256 contains a single lowercase hex SHA-256 digest and nothing else: no algorithm prefix and no filename.
  • *.sections.json contains section hash output from the ELF hash extractor.
  • diff-report.json is produced by golden-pairs diff.
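
The *.sha256 convention is simple enough to verify with the standard library. The sketch below assumes the digest is SHA-256 (64 hex characters) and tolerates a trailing newline in the file; both are assumptions, and the function names `read_digest`, `file_sha256`, and `verify` are illustrative rather than part of the golden-pairs tooling.

```python
import hashlib
import re
from pathlib import Path

# A single lowercase hex digest with no prefix: exactly 64 hex characters
# for SHA-256 (assumed algorithm).
DIGEST_RE = re.compile(r"[0-9a-f]{64}")


def read_digest(sha_file: Path) -> str:
    """Read a *.sha256 file and reject anything but one lowercase hex digest."""
    text = sha_file.read_text(encoding="ascii").strip()
    if not DIGEST_RE.fullmatch(text):
        raise ValueError(f"{sha_file}: expected one lowercase hex SHA-256 digest")
    return text


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large binaries are not loaded into memory at once."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(binary: Path) -> bool:
    """Compare a binary against the digest stored next to it as <name>.sha256."""
    sha_file = binary.with_name(binary.name + ".sha256")
    return read_digest(sha_file) == file_sha256(binary)
```

Under these assumptions, `verify(Path("CVE-2022-0847/original/vmlinux"))` returns True only when the stored digest matches the file on disk. Output in the `sha256sum` style (digest followed by a filename) is rejected by `read_digest` because it is not a bare digest.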

Adding a Pair

  1. Create CVE-YYYY-NNNN/metadata.json with the fields required by docs/schemas/golden-pair-v1.schema.json.
  2. Fetch binaries via golden-pairs mirror CVE-....
  3. Generate section hashes for each binary.
  4. Run golden-pairs diff CVE-... and review diff-report.json.
  5. Update index.json with status and summary counts.

Offline Notes

  • Use cached package mirrors or file:// sources for air-gapped runs.
  • Keep hashes and timestamps deterministic; always use UTC ISO-8601 timestamps.
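
One way to keep timestamps and JSON output deterministic is sketched below. It is an illustration under stated assumptions, not the project's implementation: second precision with a `Z` suffix is an assumed timestamp format, sorted keys with two-space indentation is an assumed JSON style, and the helper names are hypothetical.

```python
import json
from datetime import datetime, timezone


def utc_iso8601(dt: datetime) -> str:
    """Format an aware datetime as UTC ISO-8601 with second precision and a Z suffix."""
    if dt.tzinfo is None:
        # Naive datetimes are ambiguous, so refuse them rather than guess a zone.
        raise ValueError("naive datetime; attach a timezone first")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def dump_deterministic(obj) -> str:
    """Serialize with sorted keys and fixed indentation so reruns produce identical bytes."""
    return json.dumps(obj, sort_keys=True, indent=2, ensure_ascii=True) + "\n"
```

For deterministic reruns, the timestamp would typically come from a recorded input (for example, a value already stored with the pair) rather than from the wall clock at generation time.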