StellaOps Bot f1a39c4ce3
2025-12-13 18:08:55 +02:00

# Patch-Oracles QA Pattern
Patch oracles define expected functions and edges that must be present (or absent) in generated reachability graphs. The CI pipeline uses these oracles to ensure that:
1. Critical vulnerability paths are correctly identified as reachable
2. Mitigated paths are correctly identified as unreachable
3. Graph generation remains deterministic and complete
This document covers both the **JSON-based harness** (for reachbench integration) and the **YAML-based format** (for binary patch testing).
---
## Part A: JSON Patch-Oracle Harness (v1)
The JSON-based patch-oracle harness integrates with the reachbench fixture system for CI graph validation.
### A.1 Schema Overview
Patch-oracle fixtures follow the `patch-oracle/v1` schema:
```json
{
  "schema_version": "patch-oracle/v1",
  "id": "curl-CVE-2023-38545-socks5-heap-reachable",
  "case_ref": "curl-CVE-2023-38545-socks5-heap",
  "variant": "reachable",
  "description": "Validates SOCKS5 heap overflow path is reachable",
  "expected_functions": [...],
  "expected_edges": [...],
  "expected_roots": [...],
  "forbidden_functions": [...],
  "forbidden_edges": [...],
  "min_confidence": 0.5,
  "strict_mode": false
}
```
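As a rough illustration of how a fixture could be sanity-checked before use, the sketch below validates the top-level fields shown above. It is not the project's validator: the set of required keys and the assumption that `min_confidence` lies in the range 0 to 1 are guesses based on the example, and `check_oracle_header` is a hypothetical name.

```python
import json

# Assumed minimum set of top-level keys; the real JSON Schema may differ.
REQUIRED_KEYS = {"schema_version", "id", "case_ref", "variant"}


def check_oracle_header(doc: dict) -> list[str]:
    """Return a list of problems found in the top-level oracle fields."""
    problems = []
    missing = sorted(REQUIRED_KEYS - set(doc))
    if missing:
        problems.append(f"missing keys: {', '.join(missing)}")
    if doc.get("schema_version") != "patch-oracle/v1":
        problems.append(f"unsupported schema_version: {doc.get('schema_version')!r}")
    # Assumption: confidence values are probabilities in [0, 1].
    confidence = doc.get("min_confidence", 0.0)
    if not 0.0 <= confidence <= 1.0:
        problems.append(f"min_confidence out of range: {confidence}")
    return problems


example = json.loads("""
{
  "schema_version": "patch-oracle/v1",
  "id": "curl-CVE-2023-38545-socks5-heap-reachable",
  "case_ref": "curl-CVE-2023-38545-socks5-heap",
  "variant": "reachable",
  "min_confidence": 0.5,
  "strict_mode": false
}
""")
print(check_oracle_header(example))
```

An empty list means the header passed these checks; it says nothing about the nested `expected_*` and `forbidden_*` arrays.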
### A.2 Expected Functions
Define functions that MUST be present in the graph:
```json
{
  "symbol_id": "sym://curl:curl.c#sink",
  "lang": "c",
  "kind": "function",
  "purl_pattern": "pkg:github/curl/*",
  "required": true,
  "reason": "Vulnerable buffer handling function"
}
```
### A.3 Expected Edges
Define edges that MUST be present in the graph:
```json
{
  "from": "sym://net:handler#read",
  "to": "sym://curl:curl.c#entry",
  "kind": "call",
  "min_confidence": 0.8,
  "required": true,
  "reason": "Data flows from network to SOCKS5 handler"
}
```
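To make the edge semantics concrete, here is a minimal sketch of how an expected edge might be checked against a graph. Several details are assumptions rather than documented behavior: graph edges are modeled as plain dicts with `from`, `to`, `kind`, and `confidence` keys, a graph edge without a confidence is treated as fully confident, entries with `"required": false` are skipped, and matching is exact (no wildcards). `missing_edges` is a hypothetical helper name.

```python
def missing_edges(expected_edges, graph_edges, default_min_confidence=0.0):
    """Return ("MissingEdge", from, to) tuples for required edges not found."""
    violations = []
    for exp in expected_edges:
        if not exp.get("required", True):
            continue  # assumption: optional edges never produce violations
        threshold = exp.get("min_confidence", default_min_confidence)
        found = any(
            e["from"] == exp["from"]
            and e["to"] == exp["to"]
            # Only compare kinds when the oracle entry specifies one.
            and e.get("kind") == exp.get("kind", e.get("kind"))
            # Assumption: a graph edge with no confidence counts as 1.0.
            and e.get("confidence", 1.0) >= threshold
            for e in graph_edges
        )
        if not found:
            violations.append(("MissingEdge", exp["from"], exp["to"]))
    return violations


expected = [{
    "from": "sym://net:handler#read",
    "to": "sym://curl:curl.c#entry",
    "kind": "call",
    "min_confidence": 0.8,
    "required": True,
}]
graph = [{
    "from": "sym://net:handler#read",
    "to": "sym://curl:curl.c#entry",
    "kind": "call",
    "confidence": 0.9,
}]
print(missing_edges(expected, graph))
```

Under these assumptions, an edge that exists in the graph but falls below `min_confidence` is reported the same way as an edge that is absent.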
### A.4 Forbidden Elements (for unreachable variants)
Define functions and edges that MUST NOT be present in the graph:
```json
{
  "forbidden_functions": [
    {
      "symbol_id": "sym://dangerous#sink",
      "reason": "Should not be reachable when feature disabled"
    }
  ],
  "forbidden_edges": [
    {
      "from": "sym://entry",
      "to": "sym://sink",
      "reason": "Path should be blocked by feature flag"
    }
  ]
}
```
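A forbidden-element check is the mirror image of the expected-element check. The sketch below assumes the graph is available as a set of symbol IDs plus a list of edge dicts, and uses exact matching only; `forbidden_violations` is a hypothetical name, not part of the harness.

```python
def forbidden_violations(oracle, graph_functions, graph_edges):
    """Report forbidden functions and edges that appear in the graph."""
    violations = []
    for func in oracle.get("forbidden_functions", []):
        if func["symbol_id"] in graph_functions:
            violations.append(
                ("ForbiddenFunctionPresent", func["symbol_id"], func.get("reason", ""))
            )
    present = {(e["from"], e["to"]) for e in graph_edges}
    for edge in oracle.get("forbidden_edges", []):
        if (edge["from"], edge["to"]) in present:
            violations.append(
                ("ForbiddenEdgePresent", f"{edge['from']} -> {edge['to']}", edge.get("reason", ""))
            )
    return violations


oracle = {
    "forbidden_functions": [
        {"symbol_id": "sym://dangerous#sink", "reason": "Should not be reachable when feature disabled"}
    ],
    "forbidden_edges": [
        {"from": "sym://entry", "to": "sym://sink", "reason": "Path should be blocked by feature flag"}
    ],
}
# A graph for the unreachable variant that (wrongly) still contains the blocked edge.
functions = {"sym://entry", "sym://sink"}
edges = [{"from": "sym://entry", "to": "sym://sink"}]
for violation in forbidden_violations(oracle, functions, edges):
    print(violation)
```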
### A.5 Wildcard Patterns
Symbol IDs support `*` wildcards:
- `sym://test#func1` - exact match
- `sym://test#*` - matches any symbol starting with `sym://test#`
- `*` - matches anything
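The three cases above can be reproduced with a small matcher. This is a sketch of one possible implementation, not the harness's own: it treats `*` as "any run of characters" and every other character literally, which is an assumption about how the real comparer behaves for patterns with a `*` in the middle.

```python
import re


def symbol_matches(pattern: str, symbol_id: str) -> bool:
    """Match a symbol ID against a pattern where '*' matches any characters."""
    # Escape each literal chunk so characters such as '#', '.', or '?' in
    # symbol IDs are never interpreted as regex syntax.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, symbol_id, flags=re.DOTALL) is not None


print(symbol_matches("sym://test#func1", "sym://test#func1"))
print(symbol_matches("sym://test#*", "sym://test#func2"))
print(symbol_matches("sym://test#*", "sym://other#func2"))
```

Building the regex from escaped chunks avoids the surprises of `fnmatch`, which also gives special meaning to `?` and `[...]`.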
### A.6 Directory Structure
```
tests/reachability/fixtures/patch-oracles/
├── INDEX.json                      # Oracle index
├── schema/
│   └── patch-oracle-v1.json        # JSON Schema
└── cases/
    ├── curl-CVE-2023-38545-socks5-heap/
    │   ├── reachable.oracle.json
    │   └── unreachable.oracle.json
    └── java-log4j-CVE-2021-44228-log4shell/
        └── reachable.oracle.json
```
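For tooling that needs to enumerate fixtures without going through `INDEX.json`, the layout above can be walked directly. The sketch below is illustrative only: `discover_oracles` is a hypothetical helper, and the `<case>-<variant>` ID convention is inferred from the example in A.1 rather than stated by the schema.

```python
from pathlib import Path

SUFFIX = ".oracle.json"


def discover_oracles(fixture_root: Path) -> dict[str, Path]:
    """Map inferred oracle IDs to the oracle files under cases/."""
    found = {}
    for path in sorted((fixture_root / "cases").glob(f"*/*{SUFFIX}")):
        case = path.parent.name
        variant = path.name[: -len(SUFFIX)]
        # Assumption: the oracle "id" is "<case directory>-<variant>".
        found[f"{case}-{variant}"] = path
    return found
```

Calling `discover_oracles(Path("tests/reachability/fixtures/patch-oracles"))` on the tree above would yield three entries, one per `*.oracle.json` file.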
### A.7 Usage in Tests
```csharp
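// Load the oracle identified by its id from the fixture root.
var loader = new PatchOracleLoader(fixtureRoot);
var oracle = loader.LoadOracle("curl-CVE-2023-38545-socks5-heap-reachable");

// Compare the generated graph against the oracle's expectations.
var comparer = new PatchOracleComparer(oracle);
var result = comparer.Compare(richGraph);

if (!result.Success)
{
    // Report each violation; see A.8 for the possible types.
    foreach (var violation in result.Violations)
    {
        Console.WriteLine($"[{violation.Type}] {violation.From} -> {violation.To}");
    }
}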
```
### A.8 Violation Types
| Type | Description |
|------|-------------|
| `MissingFunction` | Required function not found |
| `MissingEdge` | Required edge not found |
| `MissingRoot` | Required root not found |
| `ForbiddenFunctionPresent` | Forbidden function found |
| `ForbiddenEdgePresent` | Forbidden edge found |
| `UnexpectedFunction` | Unexpected function in strict mode |
| `UnexpectedEdge` | Unexpected edge in strict mode |
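The two strict-mode rows in the table can be read as "anything the oracle did not list is a violation". The sketch below encodes that reading; it is an assumption about strict-mode semantics, it uses exact matching only, and `strict_mode_violations` is a hypothetical name.

```python
def strict_mode_violations(oracle, graph_functions, graph_edges):
    """Report graph elements not listed in the oracle when strict_mode is on.

    graph_functions: set of symbol IDs.
    graph_edges: set of (from, to) tuples.
    """
    if not oracle.get("strict_mode", False):
        return []  # outside strict mode, extra elements are tolerated
    expected_funcs = {f["symbol_id"] for f in oracle.get("expected_functions", [])}
    expected_edges = {(e["from"], e["to"]) for e in oracle.get("expected_edges", [])}
    violations = [
        ("UnexpectedFunction", symbol)
        for symbol in sorted(graph_functions)
        if symbol not in expected_funcs
    ]
    violations += [
        ("UnexpectedEdge", f"{src} -> {dst}")
        for src, dst in sorted(graph_edges)
        if (src, dst) not in expected_edges
    ]
    return violations


oracle = {
    "strict_mode": True,
    "expected_functions": [{"symbol_id": "sym://a"}, {"symbol_id": "sym://b"}],
    "expected_edges": [{"from": "sym://a", "to": "sym://b"}],
}
functions = {"sym://a", "sym://b", "sym://c"}
edges = {("sym://a", "sym://b"), ("sym://b", "sym://c")}
for violation in strict_mode_violations(oracle, functions, edges):
    print(violation)
```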
---
## Part B: YAML Binary Patch-Oracles
The YAML-based format is used for paired vulnerable/fixed binary testing.
### B.1 Workflow (per CVE)
1) Pick a CVE with a small, clean fix (e.g., OpenSSL, zlib, BusyBox). Identify vulnerable commit `A` and fixed commit `B`.
2) Build two stripped binaries (`vuln`, `fixed`) with identical toolchains/flags; keep a tiny harness that exercises the affected path.
3) Run Scanner binary analyzers to emit `richgraph-v1` for each binary.
4) Diff graphs: expect new/removed functions and edges to match the patch (e.g., `foo_parse -> validate_len` added; `foo_parse -> memcpy` removed).
5) Fail the test if expected functions/edges are absent or unchanged.
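Step 4 amounts to a set difference over functions and edges. The following sketch shows the idea on simplified graphs, where each graph is a dict with a `functions` list and an `edges` list of `(caller, callee)` pairs. That shape is an assumption made for illustration and is much simpler than `richgraph-v1`; `diff_graphs` is a hypothetical name.

```python
def diff_graphs(vuln: dict, fixed: dict) -> dict:
    """Compute functions/edges added and removed going from vuln to fixed."""
    vuln_funcs, fixed_funcs = set(vuln["functions"]), set(fixed["functions"])
    vuln_edges = {tuple(edge) for edge in vuln["edges"]}
    fixed_edges = {tuple(edge) for edge in fixed["edges"]}
    return {
        "functions_added": sorted(fixed_funcs - vuln_funcs),
        "functions_removed": sorted(vuln_funcs - fixed_funcs),
        "edges_added": sorted(fixed_edges - vuln_edges),
        "edges_removed": sorted(vuln_edges - fixed_edges),
    }


# The patch replaces a direct memcpy call with a length check.
vuln = {
    "functions": ["foo_parse", "memcpy"],
    "edges": [("foo_parse", "memcpy")],
}
fixed = {
    "functions": ["foo_parse", "validate_len"],
    "edges": [("foo_parse", "validate_len")],
}
print(diff_graphs(vuln, fixed))
```

Sorting the results keeps the diff stable across runs, which matters for the determinism requirement.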
### B.2 Oracle manifest (YAML)
```yaml
cve: CVE-YYYY-XXXX
target: libfoo 1.2.3
build:
  cc: clang
  cflags: [-O2, -fno-omit-frame-pointer]
  ldflags: []
  strip: true
expect:
  functions_added: [validate_len]
  functions_removed: [unsafe_copy]
  edges_added:
    - { caller: foo_parse, callee: validate_len }
  edges_removed:
    - { caller: foo_parse, callee: memcpy }
tolerances:
  allow_unresolved_symbols: 0
  allow_extra_funcs: 2
```
Place manifests under `tests/reachability/patch-oracles/<cve>/oracle.yml` next to the sources/build scripts.
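To show how the `expect` and `tolerances` sections might be applied, the sketch below checks a computed graph diff against them. The manifest is written as Python dicts to stay within the standard library (parsing the YAML itself would need a third-party parser). The tolerance semantics are assumptions: `allow_extra_funcs` is read as the maximum number of added or removed functions not named in `expect`, and `allow_unresolved_symbols` as a cap on a count supplied by the caller. `check_expectations` is a hypothetical name.

```python
def check_expectations(expect, tolerances, diff, unresolved_symbols=0):
    """Return human-readable failures; an empty list means the oracle passed."""
    failures = []
    for key in ("functions_added", "functions_removed"):
        for name in expect.get(key, []):
            if name not in diff.get(key, []):
                failures.append(f"expected function {name} missing from {key}")
    for key in ("edges_added", "edges_removed"):
        seen = {tuple(edge) for edge in diff.get(key, [])}
        for edge in expect.get(key, []):
            if (edge["caller"], edge["callee"]) not in seen:
                failures.append(
                    f"expected edge {edge['caller']}->{edge['callee']} missing from {key}"
                )
    # Assumed meaning of allow_extra_funcs: changed functions not listed in expect.
    listed = set(expect.get("functions_added", [])) | set(expect.get("functions_removed", []))
    changed = set(diff.get("functions_added", [])) | set(diff.get("functions_removed", []))
    extra = sorted(changed - listed)
    limit = tolerances.get("allow_extra_funcs", 0)
    if len(extra) > limit:
        failures.append(f"{len(extra)} extra function changes exceed allow_extra_funcs={limit}: {extra}")
    limit = tolerances.get("allow_unresolved_symbols", 0)
    if unresolved_symbols > limit:
        failures.append(f"{unresolved_symbols} unresolved symbols exceed allow_unresolved_symbols={limit}")
    return failures


expect = {
    "functions_added": ["validate_len"],
    "functions_removed": ["unsafe_copy"],
    "edges_added": [{"caller": "foo_parse", "callee": "validate_len"}],
    "edges_removed": [{"caller": "foo_parse", "callee": "memcpy"}],
}
tolerances = {"allow_unresolved_symbols": 0, "allow_extra_funcs": 2}
diff = {
    "functions_added": ["validate_len"],
    "functions_removed": ["unsafe_copy"],
    "edges_added": [("foo_parse", "validate_len")],
    "edges_removed": [("foo_parse", "memcpy")],
}
print(check_expectations(expect, tolerances, diff))
```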
### B.3 Repository layout
```
tests/reachability/patch-oracles/
  CVE-YYYY-XXXX-foo/
    src/           # vuln + fixed sources + harness
    build.sh       # produces ./out/vuln ./out/fixed
    oracle.yml
```
### B.4 Harness rules
- Output binaries to `out/vuln` and `out/fixed` with deterministic flags and stripped symbols.
- Record toolchain version in a sidecar `build-meta.json` so Replay captures provenance.
- Never download from the internet during CI; vendor tiny sources into the fixture folder.
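The sidecar file could be produced by a small helper in the build script. The sketch below is only a suggestion: the field names in `build-meta.json` are hypothetical (this document does not define the file's schema), and the compiler version is passed in as a string rather than probed, so the example has no dependency on an installed toolchain.

```python
import json
from pathlib import Path


def write_build_meta(out_dir: Path, cc: str, cc_version: str,
                     cflags: list[str], ldflags: list[str], strip: bool) -> Path:
    """Write a deterministic build-meta.json next to the built binaries."""
    meta = {
        # Hypothetical field names; align them with whatever Replay expects.
        "cc": cc,
        "cc_version": cc_version,
        "cflags": cflags,
        "ldflags": ldflags,
        "strip": strip,
    }
    path = out_dir / "build-meta.json"
    # Sorted keys and a fixed layout keep the file byte-identical across runs.
    path.write_text(json.dumps(meta, indent=2, sort_keys=True) + "\n", encoding="utf-8")
    return path
```

No timestamp is recorded, since one would make the file differ between otherwise identical builds.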
### B.5 Test runner expectations
- Runs Scanner binary analyzers on both binaries; emits `richgraph-v1` CAS entries.
- Compares graphs against `oracle.yml` expectations (functions/edges added/removed, tolerances).
- Fails when any expected delta is missing; succeeds when all expected guards/edges are present.
### B.6 Integration points
- **Scanner**: add fixture runner under `tests/reachability/StellaOps.Scanner.Binary.PatchOracleTests`.
- **CI**: wire into reachbench/patch-oracles job; ensure artifacts are small and deterministic.
- **Docs**: link this file from reachability delivery guide once tests are live.
### B.7 Acceptance criteria
- At least three seed oracles (e.g., zlib overflow, OpenSSL length guard, BusyBox ash fix) committed with passing expectations.
- CI job proves deterministic hashes across reruns.
- Failures emit clear diffs (`expected edge foo->validate_len missing`).
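One way to demonstrate deterministic hashes is to digest a canonical rendering of each run's graph and require a single distinct value. The sketch below uses SHA-256 over JSON with sorted keys. That canonical form is an assumption, since the actual `richgraph-v1` hashing rules are defined elsewhere, and both function names are hypothetical.

```python
import hashlib
import json


def graph_digest(graph: dict) -> str:
    """SHA-256 over a canonical JSON rendering (sorted keys, compact separators)."""
    canonical = json.dumps(graph, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def assert_deterministic(runs: list[dict]) -> str:
    """Raise if graphs from repeated runs do not all hash to the same digest."""
    digests = {graph_digest(graph) for graph in runs}
    if len(digests) != 1:
        raise AssertionError(f"non-deterministic graph output: {sorted(digests)}")
    return digests.pop()


run_a = {"functions": ["foo_parse", "validate_len"], "edges": [["foo_parse", "validate_len"]]}
run_b = {"edges": [["foo_parse", "validate_len"]], "functions": ["foo_parse", "validate_len"]}
print(assert_deterministic([run_a, run_b]))
```

Key order does not affect the digest, but list order does, so analyzers would need to emit functions and edges in a stable order for this check to pass.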
---
## Related Documentation
- [Reachability Evidence Chain](./function-level-evidence.md)
- [RichGraph Schema](../contracts/richgraph-v1.md)
- [Ground Truth Schema](./ground-truth-schema.md)
- [Lattice States](./lattice.md)
- [Reachability Delivery Guide](./DELIVERY_GUIDE.md)