tests fixes and sprints work
docs/runbooks/golden-corpus-operations.md
# Golden Corpus Operations Runbook

Sprint: SPRINT_20260121_036_BinaryIndex_golden_corpus_bundle_verification
Task: GCB-006 - Document corpus folder layout and maintenance procedures

## Overview

This runbook provides operational procedures for the golden corpus infrastructure, including troubleshooting, incident response, and common maintenance tasks.

## Quick Reference

| Task | Command |
|------|---------|
| Check corpus health | `stella doctor --check "check.binaryanalysis.corpus.*"` |
| Run validation | `stella groundtruth validate run --output results.json` |
| Check regression | `stella groundtruth validate check --results results.json --baseline current.json` |
| Update baseline | `stella groundtruth baseline update --from-results results.json --output current.json` |
| Export bundle | `stella groundtruth bundle export --packages openssl --distros debian --output bundle.tar.gz` |
| Verify bundle | `stella groundtruth bundle import --input bundle.tar.gz --verify` |

## Troubleshooting

### Mirror Sync Failures

#### Symptoms
- Doctor check `check.binaryanalysis.corpus.mirror.freshness` fails
- Validation runs fail with "source not found" errors
- Alerts for stale mirrors

#### Diagnosis

```bash
# Check mirror last sync times
ls -la /data/golden-corpus/mirrors/*/.last-sync

# Check sync logs
tail -100 /var/log/corpus/debian-sync.log
tail -100 /var/log/corpus/ubuntu-sync.log
tail -100 /var/log/corpus/osv-sync.log

# Test connectivity
curl -I https://snapshot.debian.org/
curl -I https://buildinfos.debian.net/
curl -I https://ubuntu.com/security/notices.json
```
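
Reading the `ls -la` timestamps by eye gets tedious with several mirrors. The following is a minimal sketch that lists only the stale markers. It assumes the modification time of each `.last-sync` file reflects the last successful sync, and the function name and the 24-hour threshold in the usage note are illustrative, not part of the corpus tooling.

```bash
# Sketch: list mirrors whose .last-sync marker is older than a threshold.
# Assumption: the marker's mtime is updated on every successful sync.

# stale_mirrors ROOT MAX_AGE_HOURS
# Prints every ROOT/<mirror>/.last-sync older than MAX_AGE_HOURS.
stale_mirrors() {
    local root="$1" max_age_minutes
    max_age_minutes=$(( $2 * 60 ))
    # -mmin +N matches files last modified more than N minutes ago.
    find "$root" -mindepth 2 -maxdepth 2 -name .last-sync \
        -mmin "+$max_age_minutes" | sort
}
```

For example, `stale_mirrors /data/golden-corpus/mirrors 24` prints nothing when every mirror has synced within the last day.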

#### Resolution

1. **Network connectivity issues**
   ```bash
   # Check firewall rules
   iptables -L -n | grep -E "80|443"

   # Check DNS resolution
   nslookup snapshot.debian.org

   # Test with proxy if applicable
   export https_proxy=http://proxy:3128
   curl -I https://snapshot.debian.org/
   ```

2. **Upstream service unavailable**
   - Check upstream service status
   - Wait and retry (services may be temporarily unavailable)
   - Switch to a backup mirror if one is available

3. **Disk space issues**
   ```bash
   # Check disk usage
   df -h /data/golden-corpus

   # Clean up old archives
   /opt/golden-corpus/scripts/archive-old-results.sh
   ```

4. **Permission issues**
   ```bash
   # Check file ownership
   ls -la /data/golden-corpus/mirrors/

   # Fix permissions (capital X adds execute only to directories and
   # files that are already executable, unlike a blanket 755)
   chown -R corpus:corpus /data/golden-corpus/mirrors/
   chmod -R u=rwX,go=rX /data/golden-corpus/mirrors/
   ```
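
The "wait and retry" advice for an unavailable upstream can be automated with exponential backoff. This is a minimal sketch under stated assumptions: `retry_with_backoff` is a hypothetical helper, not part of the corpus scripts, and it assumes the wrapped command exits non-zero on failure.

```bash
# Sketch: retry a command with exponential backoff.
# retry_with_backoff ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
retry_with_backoff() {
    local attempts="$1" delay="$2" try=1
    shift 2
    while true; do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        if [ "$try" -ge "$attempts" ]; then
            echo "giving up after $try attempts: $*" >&2
            return 1
        fi
        echo "attempt $try failed; retrying in ${delay}s" >&2
        sleep "$delay"
        delay=$(( delay * 2 ))   # double the wait before the next attempt
        try=$(( try + 1 ))
    done
}
```

For example, `retry_with_backoff 5 30 /opt/golden-corpus/scripts/sync-debian-mirrors.sh` would try the sync up to five times, waiting 30, 60, 120, and 240 seconds between attempts.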

### Validation Failures

#### Symptoms
- CI pipeline fails on regression check
- Validation run exits with a non-zero code
- KPI metrics are lower than expected

#### Diagnosis

```bash
# Check latest validation results
stella groundtruth validate metrics --run-id latest --detailed

# Compare with baseline
stella groundtruth validate check \
  --results bench/results/latest.json \
  --baseline bench/baselines/current.json \
  --verbose

# Review specific failures
jq '.failedPairs[]' bench/results/latest.json
```

#### Resolution

1. **True regression (algorithm degradation)**
   - Review recent code changes
   - Identify the commit that introduced the regression
   - Fix the regression, or update the baseline if the change is intentional

2. **False positive (ground truth incorrect)**
   ```bash
   # Review ground truth for specific pair
   cat corpus/debian/openssl/DSA-5678-1/metadata/ground-truth.json

   # Update ground truth if incorrect
   # (Requires manual review by security team)
   ```

3. **Infrastructure issues**
   - Check that the build environment is consistent
   - Verify debug symbols are available
   - Check Ghidra/BSim connectivity

4. **Baseline drift**
   - If the corpus was significantly updated, the baseline may need a refresh
   - Run a full validation and update the baseline following the documented procedures
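
For a true regression, `git bisect run` can find the offending commit automatically. The sketch below is one possible per-commit test, under several assumptions: `bisect_step` is a hypothetical helper, `stella groundtruth validate check` is assumed to exit non-zero when it detects a regression, and the `stella` binary on `PATH` is assumed to be rebuilt from the checked-out commit (the build step is omitted).

```bash
# Sketch: per-commit test for `git bisect run`.
# bisect_step BASELINE RESULTS
# Exit status follows the git bisect run convention:
#   0 = good commit, 1 = bad commit, 125 = cannot test, skip this commit.
bisect_step() {
    local baseline="$1" results="$2"
    # Assumption: a failed validation run means this commit could not be
    # evaluated at all, so it is skipped rather than marked bad.
    stella groundtruth validate run --output "$results" || return 125
    # Assumption: `validate check` exits non-zero on a KPI regression.
    if stella groundtruth validate check \
        --results "$results" --baseline "$baseline"; then
        return 0
    fi
    return 1
}
```

Copy the baseline outside the working tree first (for example to `/tmp/baseline.json`), since bisecting checks out older revisions. Then start with `git bisect start <bad> <good>` and pass a small wrapper script that sources this function and calls `bisect_step /tmp/baseline.json /tmp/results.json` to `git bisect run`.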

### Bundle Verification Failures

#### Symptoms
- `stella groundtruth bundle import --verify` fails
- Signature verification errors
- Timestamp validation errors

#### Diagnosis

```bash
# Verbose verification
stella groundtruth bundle import \
  --input bundle.tar.gz \
  --verify \
  --verbose \
  --output report.json

# Check specific failures
jq '.signatureResult, .timestampResult, .digestResult' report.json
```

#### Resolution

1. **Signature verification failure**
   ```bash
   # Check trusted keys
   cat /etc/stellaops/trusted-keys.pub

   # Verify the key hasn't expired (works only if the file holds a PEM X.509
   # certificate; a bare public key carries no validity dates)
   openssl x509 -in /etc/stellaops/trusted-keys.pub -noout -dates

   # Check whether the bundle was signed with a different key;
   # the signing key may need to be added to the trusted keys
   ```

2. **Timestamp verification failure**
   - Check TSA certificate validity
   - Verify the system clock is accurate
   - Check that the timestamp is within the validity window

3. **Digest mismatch**
   - The bundle may have been corrupted during transfer
   - Re-download or re-generate the bundle
   - Check for partial transfers
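
Before re-downloading, a quick checksum comparison can confirm whether the transfer corrupted the bundle. This is a minimal sketch, assuming the bundle's publisher supplies the expected SHA-256 digest through a separate channel. `verify_sha256` is a hypothetical helper, not a `stella` subcommand.

```bash
# Sketch: compare a file's SHA-256 digest with an expected value.
# verify_sha256 FILE EXPECTED_HEX_DIGEST
# Returns 0 on match, 1 on mismatch, 2 if the file cannot be read.
verify_sha256() {
    local file="$1" expected="$2" actual
    [ -r "$file" ] || { echo "cannot read: $file" >&2; return 2; }
    # sha256sum prints "<digest>  <name>"; keep only the digest.
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (expected $expected, got $actual)" >&2
        return 1
    fi
}
```

A truncated transfer can also be detected without a reference digest: `gzip -t bundle.tar.gz` fails when the compressed stream ends early.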

### Baseline Not Found

#### Symptoms
- Doctor check `check.binaryanalysis.corpus.kpi.baseline` fails
- Regression check errors with "baseline not found"

#### Resolution

```bash
# Check baseline path
ls -la bench/baselines/current.json

# If missing, create from latest results
stella groundtruth baseline update \
  --from-results bench/results/latest.json \
  --output bench/baselines/current.json \
  --description "Initial baseline"

# Or restore from archive
ls bench/baselines/archive/
cp bench/baselines/archive/baseline-20260115.json \
   bench/baselines/current.json
```

### Debuginfod Connectivity Issues

#### Symptoms
- Doctor check `check.binaryanalysis.debuginfod.availability` fails
- Missing debug symbols during validation

#### Diagnosis

```bash
# Check DEBUGINFOD_URLS environment
echo "$DEBUGINFOD_URLS"

# Test debuginfod connectivity (replace xyz with a real build ID)
curl -I "https://debuginfod.fedoraproject.org/buildid/xyz/debuginfo"
curl -I "https://debuginfod.ubuntu.com/buildid/xyz/debuginfo"
```

#### Resolution

1. **Configure DEBUGINFOD_URLS**
   ```bash
   export DEBUGINFOD_URLS="https://debuginfod.fedoraproject.org/ https://debuginfod.ubuntu.com/"
   ```

2. **Use local fallback**
   - Enable the local debug symbol cache
   - Sync ddeb packages for Ubuntu
   - Download debug packages from archives
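
The connectivity test is only meaningful with a build ID that a server actually knows. The sketch below builds one probe URL per server configured in `DEBUGINFOD_URLS`. The function name is hypothetical; the `/buildid/<id>/debuginfo` path is the same one used in the diagnosis commands.

```bash
# Sketch: build a debuginfo probe URL for every configured server.
# debuginfod_probe_urls BUILD_ID
debuginfod_probe_urls() {
    local build_id="$1" server
    # DEBUGINFOD_URLS is a space-separated list, so word splitting of the
    # unquoted variable is intentional here.
    for server in $DEBUGINFOD_URLS; do
        # Strip one trailing slash so the joined URL has no "//".
        printf '%s/buildid/%s/debuginfo\n' "${server%/}" "$build_id"
    done
}
```

A real build ID can be read from any ELF binary, for example with `readelf -n /usr/bin/openssl | awk '/Build ID/ {print $3}'` (the binary path is illustrative). Each printed URL can then be passed to `curl -I`.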

## Incident Response

### KPI Regression Detected in Production

**Severity:** High
**Response Time:** 4 hours

1. **Acknowledge and assess**
   ```bash
   # Get current status
   stella groundtruth validate check \
     --results bench/results/latest.json \
     --baseline bench/baselines/current.json
   ```

2. **Identify root cause**
   - Check recent code changes
   - Review validation logs
   - Compare with previous runs

3. **Mitigate**
   - If code regression: revert the change
   - If ground truth issue: fix the ground truth
   - If infrastructure issue: fix and re-run

4. **Verify fix**
   ```bash
   # Re-run validation
   stella groundtruth validate run --output results-fix.json

   # Verify regression is fixed
   stella groundtruth validate check \
     --results results-fix.json \
     --baseline bench/baselines/current.json
   ```

5. **Post-incident**
   - Document the incident in the incident log
   - Update this runbook if the issue type is new
   - Consider adding monitoring/alerting

### Mirror Corruption Detected

**Severity:** Medium
**Response Time:** 24 hours

1. **Identify corrupted files**
   ```bash
   # Check file integrity
   find /data/golden-corpus/mirrors -name "*.deb" -exec dpkg-deb --info {} \; 2>&1 | grep -i error
   ```

2. **Quarantine corrupted files**
   ```bash
   # Move corrupted files to quarantine
   mkdir -p /data/golden-corpus/quarantine
   mv /data/golden-corpus/mirrors/debian/path/to/corrupted.deb \
      /data/golden-corpus/quarantine/
   ```

3. **Re-sync affected mirror**
   ```bash
   /opt/golden-corpus/scripts/sync-debian-mirrors.sh
   ```

4. **Verify fix**
   ```bash
   stella doctor --check check.binaryanalysis.corpus.mirror.freshness
   ```
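
When many packages are affected, steps 1 and 2 can be combined into a single pass. This is a minimal sketch, assuming a package counts as corrupted whenever `dpkg-deb --info` fails on it. Both function names are hypothetical, and file names containing newlines are not handled.

```bash
# Sketch: move every unreadable .deb under a mirror root into quarantine,
# preserving its relative path so files with the same name cannot collide.

# deb_is_valid FILE: succeeds when dpkg-deb can read the package metadata.
deb_is_valid() {
    dpkg-deb --info "$1" >/dev/null 2>&1
}

# quarantine_corrupt_debs MIRROR_ROOT QUARANTINE_DIR
quarantine_corrupt_debs() {
    local root="${1%/}" quarantine="$2" file rel
    find "$root" -type f -name '*.deb' | while IFS= read -r file; do
        deb_is_valid "$file" && continue
        rel=${file#"$root"/}                     # path relative to the root
        mkdir -p "$quarantine/$(dirname "$rel")"
        mv "$file" "$quarantine/$rel"
        echo "quarantined: $rel"
    done
}
```

For example, `quarantine_corrupt_debs /data/golden-corpus/mirrors /data/golden-corpus/quarantine` prints one line per moved package. The re-sync in step 3 then restores the missing files.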

### Disk Space Critical

**Severity:** High
**Response Time:** 1 hour

1. **Check usage**
   ```bash
   df -h /data/golden-corpus
   du -sh /data/golden-corpus/*
   ```

2. **Quick cleanup**
   ```bash
   # Archive old results
   /opt/golden-corpus/scripts/archive-old-results.sh

   # Prune old baselines
   /opt/golden-corpus/scripts/prune-baselines.sh

   # Remove old evidence bundles
   find /data/golden-corpus/evidence -name "*.tar.gz" -mtime +90 -delete
   ```

3. **Expand storage if needed**
   - Request additional storage
   - Mount new volume
   - Migrate data if necessary
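
To catch this condition before it becomes an incident, the usage check can be scripted against a threshold. The sketch below is illustrative only: the function names are hypothetical and this runbook does not define an alerting threshold.

```bash
# Sketch: threshold check on filesystem usage.

# disk_usage_percent PATH: prints the used percentage of PATH's filesystem.
disk_usage_percent() {
    # -P forces the portable one-line-per-filesystem output; column 5 is
    # the capacity, e.g. "42%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# disk_is_critical PATH THRESHOLD_PERCENT
# Succeeds when usage is at or above the threshold.
disk_is_critical() {
    local used
    used=$(disk_usage_percent "$1")
    [ -n "$used" ] || return 2
    [ "$used" -ge "$2" ]
}
```

For example, `disk_is_critical /data/golden-corpus 90 && echo "disk space critical"` could run from a cron job, with 90 standing in for whatever limit the team chooses.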

## Scheduled Maintenance

### Weekly Tasks

- [ ] Review Doctor health checks
- [ ] Check mirror freshness alerts
- [ ] Review validation results trends
- [ ] Archive old results

### Monthly Tasks

- [ ] Generate compliance evidence bundles
- [ ] Review and update ground truth annotations
- [ ] Prune old baselines (keep last 10)
- [ ] Review storage usage trends
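
The "keep last 10" rule can be previewed before anything is deleted. This is a minimal sketch, not the contents of `prune-baselines.sh`. It assumes archived baselines follow the `baseline-YYYYMMDD.json` naming seen in `bench/baselines/archive/`, and the function name is hypothetical.

```bash
# Sketch: list archived baselines that fall outside the newest KEEP.
# baselines_to_prune ARCHIVE_DIR KEEP
baselines_to_prune() {
    local dir="$1" keep="$2"
    # Date-stamped names sort chronologically, so a reverse lexical sort
    # puts the newest first; tail then skips the first KEEP entries.
    find "$dir" -maxdepth 1 -type f -name 'baseline-*.json' \
        | sort -r \
        | tail -n "+$(( keep + 1 ))"
}
```

Running `baselines_to_prune bench/baselines/archive 10` only prints the candidates. After reviewing the list, it can be piped to `xargs -r rm --` to delete them.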

### Quarterly Tasks

- [ ] Full corpus validation (not just seed)
- [ ] Review and update documentation
- [ ] Test disaster recovery procedures
- [ ] Review access permissions

## Contact Information

| Role | Contact | Escalation |
|------|---------|------------|
| Corpus Owner | corpus-team@stella-ops.org | 1st |
| BinaryIndex Guild | binaryindex@stella-ops.org | 2nd |
| Platform On-Call | oncall@stella-ops.org | 3rd |

## Related Documentation

- [Golden Corpus Folder Layout](../modules/binary-index/golden-corpus-layout.md)
- [Golden Corpus Maintenance](../modules/binary-index/golden-corpus-maintenance.md)
- [Ground Truth Corpus Overview](../modules/binary-index/ground-truth-corpus.md)