# Golden Pairs Corpus

Golden pairs are curated binary pairs (original vs. patched) used to validate binary-diff logic.
Binaries are stored outside git; this folder tracks metadata, hashes, and reports only.
## Layout

```
datasets/golden-pairs/
  index.json
  CVE-2022-0847/
    metadata.json
    original/
      vmlinux
      vmlinux.sha256
      vmlinux.sections.json
    patched/
      vmlinux
      vmlinux.sha256
      vmlinux.sections.json
    diff-report.json
    advisories/
      USN-5317-1.txt
```
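
A quick way to catch an incomplete pair directory is to check for the tracked files shown in the layout. The sketch below is illustrative only: `missing_pair_files` is a hypothetical helper, the default binary name `vmlinux` is taken from the example pair and may differ for other pairs, and the binary itself is not checked because binaries live outside git.

```python
from pathlib import Path


def missing_pair_files(pair_dir: Path, binary_name: str = "vmlinux") -> list[str]:
    """Return tracked files (as relative paths) that are absent from a pair directory.

    Assumption: every pair tracks metadata.json, diff-report.json, and a
    .sha256 plus .sections.json file for each of original/ and patched/.
    """
    expected = ["metadata.json", "diff-report.json"]
    for side in ("original", "patched"):
        expected.append(f"{side}/{binary_name}.sha256")
        expected.append(f"{side}/{binary_name}.sections.json")
    # Keep only the entries that do not exist as regular files.
    return [rel for rel in expected if not (pair_dir / rel).is_file()]
```

Calling `missing_pair_files(Path("datasets/golden-pairs/CVE-2022-0847"))` returns an empty list when all tracked files are present.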
## File Conventions

- `metadata.json` follows `docs/schemas/golden-pair-v1.schema.json`.
- `index.json` follows `docs/schemas/golden-pairs-index.schema.json`.
- `*.sha256` contains a single lowercase hex digest, no prefix.
- `*.sections.json` contains section hash output from the ELF hash extractor.
- `diff-report.json` is produced by `golden-pairs diff`.
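
The `*.sha256` convention can be produced and checked with the standard library alone. This is a minimal sketch, not the project's tooling: `sha256_hex`, `write_sidecar`, and `verify_sidecar` are hypothetical names, and writing a trailing newline after the digest is an assumption, since the convention above only says "single lowercase hex digest, no prefix".

```python
import hashlib
import re
from pathlib import Path

# A SHA-256 digest is exactly 64 lowercase hex characters.
_DIGEST_RE = re.compile(r"[0-9a-f]{64}")


def sha256_hex(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large binaries are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()  # hexdigest() is already lowercase


def write_sidecar(binary: Path) -> Path:
    """Write <name>.sha256 next to the binary (trailing newline is an assumption)."""
    sidecar = binary.with_name(binary.name + ".sha256")
    sidecar.write_text(sha256_hex(binary) + "\n", encoding="ascii")
    return sidecar


def verify_sidecar(binary: Path) -> bool:
    """Check that the sidecar holds a well-formed digest matching the binary."""
    sidecar = binary.with_name(binary.name + ".sha256")
    recorded = sidecar.read_text(encoding="ascii").strip()
    if not _DIGEST_RE.fullmatch(recorded):
        return False  # uppercase, prefixed, or truncated digests are rejected
    return recorded == sha256_hex(binary)
```

Rejecting anything that is not exactly 64 lowercase hex characters keeps prefixed forms such as `sha256:...` or `sha256sum`-style lines with a filename from passing silently.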
## Adding a Pair

1. Create `CVE-YYYY-NNNN/metadata.json` with the required fields.
2. Fetch binaries via `golden-pairs mirror CVE-...`.
3. Generate section hashes for each binary.
4. Run `golden-pairs diff CVE-...` and review `diff-report.json`.
5. Update `index.json` with status and summary counts.
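
Steps 2 to 4 can be scripted so they always run in the same order. The sketch below is an assumption-heavy illustration: it presumes `golden-pairs` is on `PATH` and uses only the `mirror` and `diff` subcommands named above; `pair_commands` and `add_pair` are hypothetical helpers, and the section-hash step is passed in as a callable because this README does not name the command for it.

```python
import re
import subprocess
from typing import Callable

# CVE identifiers are "CVE-", a four-digit year, then four or more digits.
CVE_ID_RE = re.compile(r"CVE-\d{4}-\d{4,}")


def pair_commands(cve_id: str) -> list[list[str]]:
    """Build the argv lists for step 2 (mirror) and step 4 (diff)."""
    if not CVE_ID_RE.fullmatch(cve_id):
        raise ValueError(f"not a CVE identifier: {cve_id!r}")
    return [
        ["golden-pairs", "mirror", cve_id],
        ["golden-pairs", "diff", cve_id],
    ]


def add_pair(
    cve_id: str,
    generate_section_hashes: Callable[[str], None],
    run: Callable[..., object] = subprocess.run,
) -> None:
    """Run mirror, then the caller-supplied section-hash step, then diff."""
    mirror, diff = pair_commands(cve_id)
    run(mirror, check=True)           # step 2: fetch binaries
    generate_section_hashes(cve_id)   # step 3: hypothetical hook, tool not named here
    run(diff, check=True)             # step 4: writes diff-report.json
```

Steps 1 and 5 stay manual in this sketch, since they involve writing `metadata.json` and reviewing the report before touching `index.json`.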
## Offline Notes

- Use cached package mirrors or `file://` sources for air-gapped runs.
- Keep hashes and timestamps deterministic; always use UTC ISO-8601 timestamps.
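
One way to keep tracked JSON byte-for-byte stable is to normalize timestamps to UTC and serialize with sorted keys. This is a sketch under assumptions: `utc_iso8601` and `canonical_json` are hypothetical helpers, and the second-level precision, `Z` suffix, and two-space indent are choices made here, not requirements stated by the schemas.

```python
import json
from datetime import datetime, timezone


def utc_iso8601(moment: datetime) -> str:
    """Format a timezone-aware datetime as UTC ISO-8601, e.g. 2022-03-07T10:00:00Z."""
    if moment.tzinfo is None:
        # Naive datetimes are ambiguous; refuse rather than guess the zone.
        raise ValueError("naive datetime; attach a timezone first")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def canonical_json(obj) -> str:
    """Serialize with sorted keys and fixed indentation so output is stable."""
    return json.dumps(obj, sort_keys=True, indent=2, ensure_ascii=True) + "\n"
```

For reproducible reports, pass in a fixed or recorded timestamp rather than reading the clock at write time, so rerunning the same inputs produces identical files.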