tests fixes and sprints work
This commit is contained in:
351
docs/modules/cli/guides/ground-truth-cli.md
Normal file
351
docs/modules/cli/guides/ground-truth-cli.md
Normal file
@@ -0,0 +1,351 @@
|
||||
# Ground-Truth Corpus CLI Guide
|
||||
|
||||
**Sprint:** SPRINT_20260121_035_BinaryIndex_golden_corpus_connectors_cli
|
||||
|
||||
## Overview
|
||||
|
||||
The `stella groundtruth` command group provides CLI access to the ground-truth corpus for patch-paired binary verification. This corpus enables precision validation of security advisories by maintaining symbol and binary pairs from upstream distribution sources.
|
||||
|
||||
## Use Cases
|
||||
|
||||
- **Security teams**: Validate patch presence in production binaries
|
||||
- **Compliance auditors**: Generate evidence bundles for air-gapped verification
|
||||
- **DevSecOps**: Integrate corpus validation into CI/CD pipelines
|
||||
- **Researchers**: Query symbol databases for vulnerability analysis
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Stella CLI installed and configured
|
||||
- Backend connectivity to Platform service (or offline bundle)
|
||||
- For sync operations: network access to upstream sources
|
||||
|
||||
## Command Structure
|
||||
|
||||
```
|
||||
stella groundtruth
|
||||
├── sources # Manage symbol source connectors
|
||||
│ ├── list # List available connectors
|
||||
│ ├── enable # Enable a connector
|
||||
│ ├── disable # Disable a connector
|
||||
│ └── sync # Sync from upstream
|
||||
├── symbols # Query symbols in corpus
|
||||
│ ├── lookup # Lookup by debug ID
|
||||
│ └── search # Search by package/distro
|
||||
├── pairs # Manage security pairs
|
||||
│ ├── create # Create vuln/patch pair
|
||||
│ ├── list # List existing pairs
|
||||
│ └── delete # Remove a pair
|
||||
└── validate # Run validation harness
|
||||
├── run # Execute validation
|
||||
├── metrics # View run metrics
|
||||
└── export # Export report
|
||||
```
|
||||
|
||||
## Source Connectors
|
||||
|
||||
The ground-truth corpus ingests data from multiple upstream sources:
|
||||
|
||||
| Connector ID | Distribution | Data Type | Description |
|
||||
|--------------|--------------|-----------|-------------|
|
||||
| `debuginfod-fedora` | Fedora | Debug symbols | ELF debuginfo via debuginfod protocol |
|
||||
| `debuginfod-ubuntu` | Ubuntu | Debug symbols | ELF debuginfo via debuginfod protocol |
|
||||
| `ddeb-ubuntu` | Ubuntu | Debug packages | `.ddeb` debug symbol packages |
|
||||
| `buildinfo-debian` | Debian | Build metadata | `.buildinfo` reproducibility records |
|
||||
| `secdb-alpine` | Alpine | Security DB | `secfixes` YAML from APKBUILD |
|
||||
|
||||
### List Sources
|
||||
|
||||
```bash
|
||||
stella groundtruth sources list
|
||||
|
||||
# Output:
|
||||
ID Display Name Status Last Sync
|
||||
------------------------------------------------------------------------------------------
|
||||
debuginfod-fedora Fedora Debuginfod Enabled 2026-01-22T10:00:00Z
|
||||
debuginfod-ubuntu Ubuntu Debuginfod Enabled 2026-01-22T10:00:00Z
|
||||
ddeb-ubuntu Ubuntu ddebs Enabled 2026-01-22T09:30:00Z
|
||||
buildinfo-debian Debian Buildinfo Enabled 2026-01-22T08:00:00Z
|
||||
secdb-alpine Alpine SecDB Enabled 2026-01-22T06:00:00Z
|
||||
```
|
||||
|
||||
### Enable/Disable Sources
|
||||
|
||||
```bash
|
||||
# Enable a source connector
|
||||
stella groundtruth sources enable debuginfod-fedora
|
||||
|
||||
# Disable a source connector (stops future syncs)
|
||||
stella groundtruth sources disable debuginfod-fedora
|
||||
```
|
||||
|
||||
### Sync Sources
|
||||
|
||||
```bash
|
||||
# Incremental sync of all enabled sources
|
||||
stella groundtruth sources sync
|
||||
|
||||
# Full sync of a specific source
|
||||
stella groundtruth sources sync --source buildinfo-debian --full
|
||||
|
||||
# Sync with verbose output
|
||||
stella groundtruth sources sync --source ddeb-ubuntu -v
|
||||
```
|
||||
|
||||
## Symbol Operations
|
||||
|
||||
### Lookup by Debug ID
|
||||
|
||||
Query symbols using the ELF GNU Build-ID or equivalent identifier:
|
||||
|
||||
```bash
|
||||
# Lookup by build-id
|
||||
stella groundtruth symbols lookup --debug-id 7f8a9b2c4d5e6f1a
|
||||
|
||||
# JSON output
|
||||
stella groundtruth symbols lookup --debug-id 7f8a9b2c4d5e6f1a --output-format json
|
||||
```
|
||||
|
||||
**Example output:**
|
||||
```
|
||||
Binary: libcrypto.so.3
|
||||
Architecture: x86_64
|
||||
Distribution: debian-bookworm
|
||||
Package: openssl@3.0.11-1
|
||||
Symbol Count: 4523
|
||||
Sources: debuginfod-fedora, buildinfo-debian
|
||||
```
|
||||
|
||||
### Search Symbols
|
||||
|
||||
Search across the corpus by package name or distribution:
|
||||
|
||||
```bash
|
||||
# Search by package
|
||||
stella groundtruth symbols search --package openssl
|
||||
|
||||
# Filter by distribution
|
||||
stella groundtruth symbols search --package openssl --distro debian
|
||||
|
||||
# Limit results
|
||||
stella groundtruth symbols search --package curl --limit 100
|
||||
```
|
||||
|
||||
## Security Pairs
|
||||
|
||||
Security pairs link vulnerable and patched binary versions for a specific CVE.
|
||||
|
||||
### Create a Pair
|
||||
|
||||
```bash
|
||||
stella groundtruth pairs create \
|
||||
--cve CVE-2024-1234 \
|
||||
--vuln-pkg openssl=3.0.10-1 \
|
||||
--patch-pkg openssl=3.0.11-1 \
|
||||
--distro debian-bookworm
|
||||
```
|
||||
|
||||
### List Pairs
|
||||
|
||||
```bash
|
||||
# List all pairs
|
||||
stella groundtruth pairs list
|
||||
|
||||
# Filter by CVE pattern
|
||||
stella groundtruth pairs list --cve "CVE-2024-*"
|
||||
|
||||
# Filter by package
|
||||
stella groundtruth pairs list --package openssl --limit 50
|
||||
|
||||
# JSON output
|
||||
stella groundtruth pairs list --output-format json
|
||||
```
|
||||
|
||||
**Example output:**
|
||||
```
|
||||
Pair ID CVE Package Vuln Version Patch Version
|
||||
-------------------------------------------------------------------------------
|
||||
pair-001 CVE-2024-1234 openssl 3.0.10-1 3.0.11-1
|
||||
pair-002 CVE-2024-5678 curl 8.4.0-1 8.5.0-1
|
||||
```
|
||||
|
||||
### Delete a Pair
|
||||
|
||||
```bash
|
||||
# Delete with confirmation prompt
|
||||
stella groundtruth pairs delete pair-001
|
||||
|
||||
# Skip confirmation
|
||||
stella groundtruth pairs delete pair-001 --force
|
||||
```
|
||||
|
||||
## Validation Harness
|
||||
|
||||
The validation harness runs end-to-end verification against security pairs.
|
||||
|
||||
### Run Validation
|
||||
|
||||
```bash
|
||||
# Validate all pairs
|
||||
stella groundtruth validate run
|
||||
|
||||
# Validate specific pairs (pattern match)
|
||||
stella groundtruth validate run --pairs "openssl:CVE-2024-*"
|
||||
|
||||
# Use specific matcher
|
||||
stella groundtruth validate run --matcher semantic-diffing
|
||||
|
||||
# Parallel validation with report output
|
||||
stella groundtruth validate run \
|
||||
--pairs "curl:*" \
|
||||
--parallel 8 \
|
||||
--output validation-report.md
|
||||
```
|
||||
|
||||
**Matcher types:**
|
||||
| Matcher | Description |
|
||||
|---------|-------------|
|
||||
| `semantic-diffing` | IR-level semantic comparison (default) |
|
||||
| `hash-based` | Function hash matching |
|
||||
| `hybrid` | Combined semantic + hash approach |
|
||||
|
||||
### View Metrics
|
||||
|
||||
```bash
|
||||
stella groundtruth validate metrics --run-id vr-20260122100532
|
||||
|
||||
# JSON output
|
||||
stella groundtruth validate metrics --run-id vr-20260122100532 --output-format json
|
||||
```
|
||||
|
||||
**Example output:**
|
||||
```
|
||||
Run ID: vr-20260122100532
|
||||
Duration: 2026-01-22T10:00:00Z - 2026-01-22T10:15:32Z
|
||||
Pairs: 48/50 successful
|
||||
Function Match Rate: 94.2%
|
||||
False-Negative Rate: 2.1%
|
||||
SBOM Hash Stability: 3/3
|
||||
Verify Time (p50/p95): 423ms / 1.2s
|
||||
```
|
||||
|
||||
### Export Reports
|
||||
|
||||
```bash
|
||||
# Export as Markdown
|
||||
stella groundtruth validate export \
|
||||
--run-id vr-20260122100532 \
|
||||
--format markdown \
|
||||
--output report.md
|
||||
|
||||
# Export as HTML
|
||||
stella groundtruth validate export \
|
||||
--run-id vr-20260122100532 \
|
||||
--format html \
|
||||
--output report.html
|
||||
|
||||
# Export as JSON (machine-readable)
|
||||
stella groundtruth validate export \
|
||||
--run-id vr-20260122100532 \
|
||||
--format json \
|
||||
--output report.json
|
||||
```
|
||||
|
||||
## CI/CD Integration
|
||||
|
||||
### GitHub Actions Example
|
||||
|
||||
```yaml
|
||||
name: Corpus Validation
|
||||
on:
|
||||
schedule:
|
||||
- cron: '0 6 * * 1' # Weekly on Monday
|
||||
|
||||
jobs:
|
||||
validate:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: Sync corpus sources
|
||||
run: stella groundtruth sources sync
|
||||
|
||||
- name: Run validation
|
||||
run: |
|
||||
stella groundtruth validate run \
|
||||
--matcher semantic-diffing \
|
||||
--parallel 4 \
|
||||
--output validation-${{ github.run_id }}.md
|
||||
|
||||
- name: Check metrics
|
||||
run: |
|
||||
MATCH_RATE=$(stella groundtruth validate metrics --run-id $(cat run-id.txt) --output-format json | jq '.functionMatchRate')
|
||||
if (( $(echo "$MATCH_RATE < 90" | bc -l) )); then
|
||||
echo "Match rate below threshold: $MATCH_RATE%"
|
||||
exit 1
|
||||
fi
|
||||
```
|
||||
|
||||
### GitLab CI Example
|
||||
|
||||
```yaml
|
||||
corpus-validation:
|
||||
stage: verify
|
||||
script:
|
||||
- stella groundtruth sources sync --source buildinfo-debian
|
||||
- stella groundtruth validate run --pairs "openssl:*" --output report.md
|
||||
artifacts:
|
||||
paths:
|
||||
- report.md
|
||||
expire_in: 1 week
|
||||
rules:
|
||||
- if: $CI_PIPELINE_SOURCE == "schedule"
|
||||
```
|
||||
|
||||
## Offline Usage
|
||||
|
||||
For air-gapped environments, use offline bundles:
|
||||
|
||||
```bash
|
||||
# Export corpus for offline use
|
||||
stella bundle export \
|
||||
--include-corpus \
|
||||
--output corpus-bundle-$(date +%F).tar.gz
|
||||
|
||||
# Import on air-gapped system
|
||||
stella bundle import --package corpus-bundle-2026-01-22.tar.gz
|
||||
|
||||
# Run validation offline
|
||||
stella groundtruth validate run --offline
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Common Issues
|
||||
|
||||
**Sync fails with network error:**
|
||||
```bash
|
||||
# Check source status
|
||||
stella groundtruth sources list
|
||||
|
||||
# Retry with verbose output
|
||||
stella groundtruth sources sync --source debuginfod-ubuntu -v
|
||||
```
|
||||
|
||||
**Symbol lookup returns no results:**
|
||||
```bash
|
||||
# Verify debug-id format (hex string)
|
||||
stella groundtruth symbols lookup --debug-id abc123 -v
|
||||
|
||||
# Try searching by package instead
|
||||
stella groundtruth symbols search --package libcrypto
|
||||
```
|
||||
|
||||
**Validation metrics show low match rate:**
|
||||
- Check that both vuln and patch binaries are present in corpus
|
||||
- Verify symbol sources are synced and enabled
|
||||
- Consider using `hybrid` matcher for complex cases
|
||||
|
||||
## See Also
|
||||
|
||||
- [CLI Command Reference](commands/reference.md#ground-truth-corpus-commands)
|
||||
- [BinaryIndex Architecture](../../binary-index/architecture.md)
|
||||
- [Golden Corpus KPIs](../../benchmarks/golden-corpus-kpis.md)
|
||||
- [Air-Gap Bundle Guide](../../modules/airgap/README.md)
|
||||
Reference in New Issue
Block a user