Score Proofs & Reachability Troubleshooting Guide

Sprint: SPRINT_3500_0004_0004
Audience: Operations, Support, Security Engineers


Quick Diagnostic Commands

# Check system health
stella status

# Verify scan completed successfully
stella scan status --scan-id $SCAN_ID

# Check reachability computation status
stella reachability job-status --job-id $JOB_ID

# Verify proof integrity
stella proof verify --scan-id $SCAN_ID --verbose
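
When triaging, it can help to run all four checks in one pass and see which ones fail. The function below is a minimal sketch, not part of the product: the names run_quick_diagnostics and _check are hypothetical, and it assumes each stella subcommand exits non-zero on failure, which this guide does not confirm.

```shell
# Hypothetical helper: run one check quietly and record the outcome.
_check() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "ok:   $label"
  else
    echo "FAIL: $label"
    failures=$((failures + 1))
  fi
}

# Runs the four quick diagnostic commands from this guide in order.
# Assumption: a non-zero exit status from stella means the check failed.
run_quick_diagnostics() {
  scan_id=$1
  job_id=$2
  failures=0

  _check "system health" stella status
  _check "scan status" stella scan status --scan-id "$scan_id"
  _check "reachability job" stella reachability job-status --job-id "$job_id"
  _check "proof integrity" stella proof verify --scan-id "$scan_id" --verbose

  echo "$failures check(s) failed"
  [ "$failures" -eq 0 ]
}
```

Usage: run_quick_diagnostics "$SCAN_ID" "$JOB_ID", then rerun the failing command by hand to see its full output.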

Score Proofs Issues

1. Replay Produces Different Results

Symptoms:

  • stella score replay output differs from original
  • Verification fails with "hash mismatch"

Possible Causes:

| Cause | Diagnosis | Solution |
|-------|-----------|----------|
| Missing inputs | stella proof inspect --check-inputs shows gaps | Export with --include-inputs |
| Algorithm version mismatch | Check environment.scannerVersion in manifest | Use matching scanner version |
| Non-deterministic config | Review configuration section | Enable --deterministic mode |
| Feed drift | Compare advisoryFeeds.asOf timestamps | Use frozen feeds |

Resolution Steps:

# Step 1: Inspect the proof
stella proof inspect --scan-id $SCAN_ID

# Step 2: Check for missing inputs
stella proof inspect --scan-id $SCAN_ID --check-inputs

# Step 3: If inputs missing, re-export with data
stella proof export --scan-id $SCAN_ID --include-inputs --output proof-full.zip

# Step 4: Retry replay
stella score replay --scan-id $SCAN_ID --bundle proof-full.zip
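
Steps 2 to 4 above can be chained so that the re-export only happens when it is needed. This is a sketch under one unconfirmed assumption: that stella proof inspect --check-inputs exits non-zero when inputs are missing. The function name replay_with_inputs is hypothetical.

```shell
# Re-export the proof with inputs if they are missing (or if no bundle
# exists yet), then retry the replay against that bundle.
# Assumption: `--check-inputs` signals gaps via a non-zero exit status.
replay_with_inputs() {
  scan_id=$1
  bundle=${2:-proof-full.zip}

  if ! stella proof inspect --scan-id "$scan_id" --check-inputs || [ ! -f "$bundle" ]; then
    echo "Exporting proof bundle with inputs to $bundle"
    stella proof export --scan-id "$scan_id" --include-inputs --output "$bundle" || return 1
  fi

  stella score replay --scan-id "$scan_id" --bundle "$bundle"
}
```

If the replay still differs after this, the remaining causes in the table (scanner version, deterministic mode, feed drift) are more likely than missing inputs.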

2. Signature Verification Failed

Symptoms:

  • "Invalid signature" or "Signature verification failed"
  • stella proof verify returns error

Possible Causes:

| Cause | Diagnosis | Solution |
|-------|-----------|----------|
| Key rotation | Check stella trust list for key dates | Import new trust anchor |
| Corrupted bundle | Verify file integrity | Re-download bundle |
| Wrong trust root | Check issuer in attestation | Configure correct trust |
| Tampered content | Hash mismatch in bundle | Investigate tampering |

Resolution Steps:

# Step 1: Verbose verification
stella proof verify --scan-id $SCAN_ID --verbose

# Step 2: Check trust anchors
stella trust list

# Step 3: If key rotated, import new anchor
stella trust import --file new-public-key.pem

# Step 4: Retry verification
stella proof verify --scan-id $SCAN_ID

3. Proof Chain Broken

Symptoms:

  • "Chain integrity violation"
  • "prev_hash mismatch"

Possible Causes:

| Cause | Diagnosis | Solution |
|-------|-----------|----------|
| Database corruption | Check Postgres logs | Restore from backup |
| Manual modification | Audit access logs | Investigate, restore |
| Storage failure | Check disk health | Repair/restore |

Resolution Steps:

# Step 1: Check chain status
stella proof status --scan-id $SCAN_ID

# Step 2: Find break point
stella proof list --since "30 days" --verify-chain

# Step 3: If database issue
# Check Postgres logs
# Restore from backup if needed
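
For context, a "prev_hash mismatch" means an entry's recorded predecessor hash does not equal the actual hash of the entry before it. The sketch below illustrates that check on a hypothetical two-column text export (hash, then prev_hash, one entry per line, with - as the first entry's prev_hash). The real storage format is not described in this guide, so treat this as an illustration of the idea only.

```shell
# Reports the first entry whose prev_hash does not match the hash of the
# preceding entry. Input format (hypothetical): "<hash> <prev_hash>" per line.
# Exits non-zero if a break is found.
find_chain_break() {
  awk '
    NR > 1 && $2 != prev { print "break at entry " NR; bad = 1; exit }
    { prev = $1 }
    END { exit bad + 0 }
  ' "$1"
}
```

The first reported break is the point to correlate with Postgres logs and access logs, since every later entry depends on it.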

4. Proof Export Fails

Symptoms:

  • "Failed to export proof bundle"
  • Timeout during export

Possible Causes:

| Cause | Diagnosis | Solution |
|-------|-----------|----------|
| Large inputs | Check SBOM/graph size | Use --exclude-inputs |
| Storage full | Check disk space | Clear space or use different path |
| Network timeout | Check network connectivity | Increase timeout |

Resolution Steps:

# Step 1: Export without inputs (smaller bundle)
stella proof export --scan-id $SCAN_ID --exclude-inputs --output proof.zip

# Step 2: If still fails, check disk
# Windows: Get-Volume | Format-Table
# Linux: df -h

# Step 3: Try alternative location
stella proof export --scan-id $SCAN_ID --output /tmp/proof.zip
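
On Linux, the disk check in step 2 can be turned into a guard that runs before the export. A minimal sketch using POSIX df; the function name has_free_kib and the 1 GiB threshold in the usage note are illustrative assumptions, not documented requirements.

```shell
# Succeeds if the filesystem holding path $1 has at least $2 KiB available.
# Uses `df -Pk` so the output is one line per filesystem, in 1024-byte units.
has_free_kib() {
  avail=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
  [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}
```

Usage: has_free_kib /tmp 1048576 && stella proof export --scan-id "$SCAN_ID" --output /tmp/proof.zip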

Reachability Issues

1. Too Many UNKNOWN Findings

Symptoms:

  • Most vulnerabilities show UNKNOWN reachability status
  • Coverage percentage is low

Possible Causes:

| Cause | Diagnosis | Solution |
|-------|-----------|----------|
| No call graph | stella scan graph summary returns empty | Upload call graph |
| Incomplete graph | Low node count | Regenerate with more options |
| Symbol mismatch | Symbols not resolved | Check symbol resolution |

Resolution Steps:

# Step 1: Check if call graph exists
stella scan graph summary --scan-id $SCAN_ID

# Step 2: If missing, generate and upload
# .NET example:
dotnet build --generate-call-graph
stella scan graph upload --scan-id $SCAN_ID --file callgraph.json

# Step 3: Verify entrypoints detected
stella scan graph entrypoints --scan-id $SCAN_ID

# Step 4: Recompute reachability
stella reachability compute --scan-id $SCAN_ID --force

2. False UNREACHABLE Findings

Symptoms:

  • Known-reachable code marked UNREACHABLE
  • Security team reports false negatives

Possible Causes:

| Cause | Diagnosis | Solution |
|-------|-----------|----------|
| Missing edges | Graph incomplete | Add missing calls |
| Reflection not detected | Edge type missing | Add reflection hints |
| Entrypoint not detected | Check entrypoints list | Add manual entrypoint |

Resolution Steps:

# Step 1: Explain the specific finding
stella reachability explain --scan-id $SCAN_ID \
  --cve CVE-2024-XXXX \
  --purl "pkg:type/name@version" \
  --verbose

# Step 2: Check if entrypoint is known
stella scan graph entrypoints --scan-id $SCAN_ID | grep -i "suspected-entry"

# Step 3: Add missing entrypoint if needed
stella scan graph upload --scan-id $SCAN_ID \
  --file additional-entrypoints.json \
  --merge

# Step 4: Recompute
stella reachability compute --scan-id $SCAN_ID --force

3. Computation Timeout

Symptoms:

  • "Computation exceeded timeout"
  • Job progress stuck at the same percentage

Possible Causes:

| Cause | Diagnosis | Solution |
|-------|-----------|----------|
| Large graph | Check node/edge count | Increase timeout |
| Deep paths | Max depth too high | Reduce max depth |
| Cycles | Graph has loops | Enable cycle detection |

Resolution Steps:

# Step 1: Check graph size
stella scan graph summary --scan-id $SCAN_ID

# Step 2: Increase timeout
stella reachability compute --scan-id $SCAN_ID --timeout 900s

# Step 3: Or reduce depth
stella reachability compute --scan-id $SCAN_ID --max-depth 10

# Step 4: Or partition analysis
stella reachability compute --scan-id $SCAN_ID --partition-by artifact
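
Rather than guessing one large timeout, step 2 can be retried with increasing values. The sketch below assumes stella reachability compute exits non-zero on timeout (not confirmed here); the function name and the specific timeout values in the usage note are illustrative.

```shell
# Tries each timeout in turn and stops at the first successful computation.
# Arguments: scan id, then one or more timeout values (e.g. 300s 600s 900s).
compute_with_escalating_timeout() {
  scan_id=$1
  shift
  for t in "$@"; do
    echo "Trying --timeout $t"
    if stella reachability compute --scan-id "$scan_id" --timeout "$t"; then
      return 0
    fi
  done
  echo "All timeouts exhausted; try --max-depth or --partition-by instead" >&2
  return 1
}
```

Usage: compute_with_escalating_timeout "$SCAN_ID" 300s 600s 900s. If every attempt fails, reducing depth or partitioning (steps 3 and 4) is likely a better fix than a longer timeout.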

4. Inconsistent Results Between Runs

Symptoms:

  • Same scan produces different reachability results
  • Status changes between POSSIBLY_REACHABLE and UNKNOWN

Possible Causes:

| Cause | Diagnosis | Solution |
|-------|-----------|----------|
| Non-deterministic mode | Check config | Enable deterministic mode |
| Concurrent modifications | Check job logs | Serialize jobs |
| Caching issues | Clear cache | Disable or clear cache |

Resolution Steps:

# Step 1: Enable deterministic mode
stella reachability compute --scan-id $SCAN_ID --deterministic --seed "fixed-seed"

# Step 2: Clear cache if needed
stella cache clear --scope reachability

# Step 3: Re-run computation
stella reachability compute --scan-id $SCAN_ID --force
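
To confirm that the inconsistency is gone, one option is to run the same result-producing command twice and compare digests of the output. The helper below is a generic sketch: outputs_match is a hypothetical name, and which stella command prints results in a stable, comparable form is an assumption you need to verify for your version.

```shell
# Runs the given command twice and succeeds only if both runs produce
# byte-identical standard output (compared by SHA-256 digest).
outputs_match() {
  first=$("$@" | sha256sum | cut -d' ' -f1)
  second=$("$@" | sha256sum | cut -d' ' -f1)
  [ "$first" = "$second" ]
}
```

Usage: outputs_match stella reachability explain --scan-id "$SCAN_ID" --cve CVE-2024-XXXX --purl "pkg:type/name@version". Output that embeds timestamps or run IDs will differ even when the results agree, so filter those out first.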

Unknowns Issues

1. Unknowns Not Appearing

Symptoms:

  • Expected unknowns not in registry
  • Count seems too low

Possible Causes:

| Cause | Diagnosis | Solution |
|-------|-----------|----------|
| Auto-suppress enabled | Check workspace settings | Disable auto-suppress |
| Filter active | Check list filters | Clear filters |
| Different workspace | Verify workspace ID | Use correct workspace |

Resolution Steps:

# Step 1: List without filters
stella unknowns list --workspace-id $WS_ID --status all

# Step 2: Check workspace settings
stella config get unknowns.auto-suppress

# Step 3: Disable auto-suppress if needed
stella config set unknowns.auto-suppress false

2. Resolution Not Persisting

Symptoms:

  • Resolved unknowns reappear
  • Status resets to pending

Possible Causes:

| Cause | Diagnosis | Solution |
|-------|-----------|----------|
| Scope too narrow | Check resolution scope | Use broader scope |
| New occurrence | Different scan/artifact | Resolve at workspace level |
| Database issue | Check error logs | Contact support |

Resolution Steps:

# Step 1: Check current scope
stella unknowns show --id $UNKNOWN_ID

# Step 2: Re-resolve with broader scope
stella unknowns resolve --id $UNKNOWN_ID \
  --resolution mapped \
  --scope workspace \
  --comment "Resolving at workspace level"

3. Priority Score Incorrect

Symptoms:

  • Low priority for critical component
  • Scoring doesn't reflect risk

Possible Causes:

| Cause | Diagnosis | Solution |
|-------|-----------|----------|
| Missing context | Automatic scoring limited | Manually escalate |
| Outdated metadata | Component info stale | Refresh metadata |

Resolution Steps:

# Step 1: Escalate with correct severity
stella unknowns escalate --id $UNKNOWN_ID \
  --reason "Handles authentication - critical despite low auto-score" \
  --severity critical

# Step 2: Request scoring review
# Add comment explaining the discrepancy

Air-Gap / Offline Issues

1. Offline Kit Import Fails

Symptoms:

  • "Invalid offline kit"
  • "Trust anchor missing"

Possible Causes:

| Cause | Diagnosis | Solution |
|-------|-----------|----------|
| Corrupted transfer | Verify checksums | Re-transfer |
| Missing components | Check kit contents | Re-generate kit |
| Version mismatch | Check scanner version | Use matching versions |

Resolution Steps:

# Step 1: Verify kit integrity
sha256sum offline-kit.tar.gz
# Compare with manifest.sha256

# Step 2: Check kit contents
tar -tzf offline-kit.tar.gz | head -20

# Step 3: If incomplete, regenerate on connected system
stella airgap prepare --feeds nvd,ghsa --output offline-kit/
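
The manual comparison in step 1 can be scripted. This sketch assumes manifest.sha256 holds the expected hex digest as the first whitespace-separated field of its first line (the layout sha256sum itself writes); the actual manifest layout is not specified in this guide, and the function name kit_checksum_ok is hypothetical.

```shell
# Compares the SHA-256 of the kit archive with the digest in the manifest.
# Assumption: the digest is the first field on the manifest's first line.
kit_checksum_ok() {
  kit=$1
  manifest=$2
  expected=$(awk 'NR == 1 { print $1 }' "$manifest")
  actual=$(sha256sum "$kit" | awk '{ print $1 }')
  if [ "$expected" = "$actual" ]; then
    echo "OK: $kit"
  else
    echo "MISMATCH: expected $expected, got $actual"
    return 1
  fi
}
```

Usage: kit_checksum_ok offline-kit.tar.gz manifest.sha256. A mismatch points to a corrupted transfer, so re-transfer before regenerating the kit.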

2. Time Anchor Issues

Symptoms:

  • "Time anchor expired"
  • "Cannot verify timestamp"

Possible Causes:

| Cause | Diagnosis | Solution |
|-------|-----------|----------|
| Old kit | Check time anchor date | Refresh kit |
| Clock drift | Check system clock | Sync system time |
| Expired anchor | Anchor has TTL | Generate new anchor |

Resolution Steps:

# Step 1: Check time anchor
cat offline-kit/time-anchor/timestamp.json

# Step 2: If expired, generate new (on connected system)
stella airgap prepare-time-anchor --output offline-kit/time-anchor/

# Step 3: Transfer and use new anchor
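
Whether an anchor has expired comes down to simple arithmetic on its issue time and TTL. The schema of timestamp.json is not documented here, so the sketch below takes the values as plain arguments (epoch seconds) that you read from the file by hand. The function name anchor_expired is hypothetical.

```shell
# Succeeds (exit 0) if an anchor issued at $1 with a TTL of $2 seconds has
# expired at time $3. All times are Unix epoch seconds; $3 defaults to now.
anchor_expired() {
  issued=$1
  ttl=$2
  now=${3:-$(date +%s)}
  [ "$now" -ge $((issued + ttl)) ]
}
```

Because the default comparison uses the local clock, a drifting system clock can make a valid anchor look expired; check the clock before regenerating the anchor.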

Error Code Reference

| Error Code | Category | Meaning | Typical Resolution |
|------------|----------|---------|--------------------|
| E1001 | Proof | Manifest hash mismatch | Re-export with inputs |
| E1002 | Proof | Signature invalid | Check trust anchors |
| E1003 | Proof | Chain broken | Restore from backup |
| E2001 | Reach | No call graph | Upload call graph |
| E2002 | Reach | Computation timeout | Increase timeout |
| E2003 | Reach | Symbol not resolved | Check symbol DB |
| E3001 | Unknown | Resolution conflict | Use broader scope |
| E3002 | Unknown | Invalid category | Check category value |
| E4001 | Airgap | Invalid kit | Re-generate kit |
| E4002 | Airgap | Time anchor expired | Refresh anchor |
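
For scripts and runbooks, the table above can be encoded as a small lookup. This is a convenience sketch that mirrors the table entries exactly; the function name explain_error is hypothetical and is not a stella command.

```shell
# Maps an error code from the reference table to its meaning and
# typical resolution. Returns non-zero for codes not in the table.
explain_error() {
  case $1 in
    E1001) echo "Proof: manifest hash mismatch -> re-export with inputs" ;;
    E1002) echo "Proof: signature invalid -> check trust anchors" ;;
    E1003) echo "Proof: chain broken -> restore from backup" ;;
    E2001) echo "Reach: no call graph -> upload call graph" ;;
    E2002) echo "Reach: computation timeout -> increase timeout" ;;
    E2003) echo "Reach: symbol not resolved -> check symbol DB" ;;
    E3001) echo "Unknown: resolution conflict -> use broader scope" ;;
    E3002) echo "Unknown: invalid category -> check category value" ;;
    E4001) echo "Airgap: invalid kit -> re-generate kit" ;;
    E4002) echo "Airgap: time anchor expired -> refresh anchor" ;;
    *) echo "Unrecognized error code: $1" >&2; return 1 ;;
  esac
}
```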

Getting Help

Collecting Diagnostics

# Generate diagnostic bundle
stella diagnostic collect --output diagnostics.zip

# Include specific scan
stella diagnostic collect --scan-id $SCAN_ID --output diagnostics.zip

Log Locations

| Component | Log Path |
|-----------|----------|
| Scanner | /var/log/stella/scanner.log |
| Reachability | /var/log/stella/reachability.log |
| Proofs | /var/log/stella/proofs.log |
| CLI | ~/.stella/logs/cli.log |
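
A quick way to see which failures dominate is to count error codes across the logs. This sketch assumes the codes from the Error Code Reference (E1001 to E4002) appear verbatim in log lines, which this guide does not confirm; count_error_codes is a hypothetical helper name.

```shell
# Counts occurrences of E1xxx-E4xxx codes in the given log files,
# most frequent first. Unreadable or missing files are skipped silently.
count_error_codes() {
  grep -owhE 'E[1-4][0-9]{3}' "$@" 2>/dev/null | sort | uniq -c | sort -rn
}
```

Usage: count_error_codes /var/log/stella/*.log ~/.stella/logs/cli.log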

Support Channels

  • Documentation: docs/ directory
  • Issues: Internal issue tracker
  • Emergency: On-call security team


Last Updated: 2025-12-20
Version: 1.0.0
Sprint: 3500.0004.0004