Trust Lattice Troubleshooting Guide

Version: 1.0.0
Last Updated: 2025-12-22
Audience: Support and Development teams


Quick Reference

| Symptom | Likely Cause | Section |
|---------|--------------|---------|
| Low confidence scores | Stale VEX data or missing sources | 2.1 |
| Gate failures blocking builds | Threshold too high or source issues | 2.2 |
| Verdict replay mismatches | Non-deterministic inputs | 2.3 |
| Unexpected trust changes | Calibration drift | 2.4 |
| Conflicting verdicts | Multi-source disagreement | 2.5 |

1. Diagnostic Commands

1.1 Check System Health

# Excititor health
curl https://api.example.com/excititor/health

# Policy Engine health
curl https://api.example.com/policy/health

# Authority health
curl https://api.example.com/authority/health
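
When all three services need checking at once, the individual calls can be wrapped in a loop. The following is a minimal sketch in POSIX shell: the `check_health` function name is hypothetical, and it assumes each endpoint returns an HTTP error status when the service is unhealthy, which this guide does not confirm.

```shell
# Sketch: probe each service's /health endpoint and report failures.
# check_health is a hypothetical helper, not part of the stella CLI.
check_health() {
  base_url="$1"; shift
  failed=0
  for svc in "$@"; do
    # -f: treat HTTP errors as failures; -sS: quiet, but still print errors;
    # --max-time: give up after 5 seconds instead of hanging.
    if curl -fsS --max-time 5 -o /dev/null "${base_url}/${svc}/health"; then
      echo "${svc}: ok"
    else
      echo "${svc}: FAILED"
      failed=1
    fi
  done
  return "$failed"
}

# Usage:
#   check_health https://api.example.com excititor policy authority
```

The function returns non-zero if any service fails, so it can gate a larger diagnostic script.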

1.2 Trace a Verdict

# Get detailed verdict explanation
stella verdict explain <manifestId>

# Output includes:
# - All claims considered
# - Trust vector scores
# - Strength/freshness multipliers
# - Gate evaluation results
# - Conflict detection

1.3 Check VEX Source Status

# List all sources with status
stella vex source list

# Check specific source
stella vex source status vendor:redhat

# Sample output:
# Source: vendor:redhat
# Status: healthy
# Last fetch: 2025-12-22T10:00:00Z
# Documents: 15234
# Freshness: 2.3 hours
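
For monitoring, the freshness value can be extracted from the status output and compared against a limit. This sketch assumes the real output contains a line of the form `Freshness: <n> hours`, as in the sample above but without the leading `#`. The helper names and the 24-hour limit in the usage line are hypothetical.

```shell
# Sketch: flag a VEX source whose freshness exceeds a limit (in hours).
# Both helpers read "stella vex source status" output on stdin.
freshness_hours() {
  # Print the number from a line like "Freshness: 2.3 hours";
  # exit non-zero if no such line exists.
  awk '$1 == "Freshness:" { print $2; found = 1 } END { exit found ? 0 : 1 }'
}

is_stale() {
  # $1 = maximum acceptable age in hours
  max_hours="$1"
  hours="$(freshness_hours)" || return 2   # field missing: cannot decide
  awk -v h="$hours" -v max="$max_hours" 'BEGIN { exit (h > max) ? 0 : 1 }'
}

# Usage:
#   stella vex source status vendor:redhat | is_stale 24 && echo "source is stale"
```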

2. Common Issues

2.1 Low Confidence Scores

Symptoms:

  • Verdicts have confidence < 0.5
  • Many "under_investigation" statuses

Diagnosis:

  1. Check claim freshness:

    stella claim analyze --cve CVE-2025-12345 --asset sha256:abc123
    
    # Look for:
    # - Freshness multiplier < 0.5 (claim older than 180 days)
    # - No high-trust sources
    
  2. Check trust vector values:

    stella trustvector show vendor:redhat
    
    # Low scores indicate:
    # - Signature verification issues (P)
    # - Poor scope matching (C)
    # - Non-deterministic outputs (R)
    
  3. Check for missing VEX coverage:

    stella vex coverage --purl pkg:npm/lodash@4.17.21
    
    # No claims? Source may not cover this package
    

Resolution:

  • If freshness is low: Check if source is publishing updates
  • If trust vector is low: Review source verification settings
  • If coverage is missing: Add additional VEX sources

2.2 Gate Failures

Symptoms:

  • Builds failing with "Gate: MinimumConfidenceGate FAILED"
  • Policy violations despite VEX claims

Diagnosis:

  1. Check gate thresholds:

    stella gates show minimumConfidence
    
    # Thresholds:
    #   production: 0.75
    #   staging: 0.60
    #   development: 0.40
    
  2. Compare with verdict confidence:

    stella verdict show <manifestId> | grep confidence
    
    # confidence: 0.68  <- Below 0.75 production threshold
    
  3. Check which gate failed:

    stella verdict gates <manifestId>
    
    # Gates:
    #   MinimumConfidenceGate: FAILED (0.68 < 0.75)
    #   SourceQuotaGate: PASSED
    #   UnknownsBudgetGate: PASSED
    

Resolution:

  • Temporary: Lower threshold (with approval)
  • Long-term: Add corroborating VEX sources
  • If the verdict rests on a single source: Check whether SourceQuotaGate's corroboration requirement is met
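
The comparison in steps 1 and 2 can be scripted for pre-flight checks. This sketch hard-codes the thresholds from the sample output above, whereas a real deployment should read them from the gate configuration. It also assumes a verdict passes when its confidence is greater than or equal to the threshold, which matches the `0.68 < 0.75` failure message but is not stated explicitly. Both function names are hypothetical.

```shell
# Sketch: would a verdict confidence pass MinimumConfidenceGate?
# Thresholds copied from the sample "stella gates show" output.
min_confidence_for() {
  case "$1" in
    production)  echo 0.75 ;;
    staging)     echo 0.60 ;;
    development) echo 0.40 ;;
    *) echo "unknown environment: $1" >&2; return 2 ;;
  esac
}

passes_min_confidence() {
  # $1 = environment, $2 = verdict confidence
  threshold="$(min_confidence_for "$1")" || return 2
  # awk handles the floating-point comparison that plain sh cannot.
  awk -v c="$2" -v t="$threshold" 'BEGIN { exit (c >= t) ? 0 : 1 }'
}

# Usage:
#   passes_min_confidence production 0.68 || echo "would fail in production"
```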

2.3 Verdict Replay Failures

Symptoms:

  • Replay verification returns success: false
  • Audit failures due to non-determinism

Diagnosis:

  1. Get detailed diff:

    stella verdict replay --diff <manifestId>
    
    # Differences:
    #   result.confidence: 0.82 -> 0.79
    #   inputs.vexDocumentDigests[2]: sha256:abc... (missing)
    
  2. Common causes:

    | Difference | Likely Cause |
    |------------|--------------|
    | VEX digest mismatch | Document was modified after verdict |
    | Confidence delta | Clock cutoff drift (freshness calc) |
    | Missing claims | Source was unavailable during replay |
    | Different status | Policy version changed |

  3. Check input availability:

    # Verify all pinned inputs exist
    stella cas verify --digest sha256:abc123
    

Resolution:

  • Clock drift: Ensure NTP synchronization across nodes
  • Missing inputs: Restore from backup or acknowledge drift
  • Policy change: Compare policy hashes between original and replay
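
When a VEX digest mismatch is suspected and a local copy of the document is available, its digest can be recomputed independently of the CLI. This sketch assumes pinned digests use the `sha256:<hex>` form shown above and that `sha256sum` from GNU coreutils is installed (on macOS, `shasum -a 256` produces the same format). The function name is hypothetical.

```shell
# Sketch: check a local file against a pinned "sha256:<hex>" digest.
matches_digest() {
  # $1 = file path, $2 = pinned digest
  expected="${2#sha256:}"                       # strip the algorithm prefix
  actual="$(sha256sum "$1" | awk '{ print $1 }')"
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage:
#   matches_digest ./vex-doc.json "sha256:<pinned digest>" || echo "document changed"
```

A mismatch here points to the "document was modified after verdict" row of the table rather than to clock or policy drift.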

2.4 Calibration Issues

Symptoms:

  • Trust vectors changed unexpectedly
  • Accuracy metrics declining

Diagnosis:

  1. Review recent calibrations:

    stella calibration history vendor:redhat --epochs 5
    
    # Epoch 42: accuracy=0.92, delta=(-0.02, +0.02, 0)
    # Epoch 41: accuracy=0.94, delta=(-0.01, +0.01, 0)
    
  2. Check comparison results:

    stella calibration epoch 42 --details
    
    # Total claims: 1500
    # Correct: 1380
    # False positives: 45
    # False negatives: 75
    # Detected bias: OptimisticBias
    
  3. Check for data quality issues:

    # Look for corrupted truth data
    stella calibration validate-truth --epoch 42
    

Resolution:

  • High false-positive count: Reduce the source's provenance score
  • High false-negative count: Review coverage matching
  • Data quality issue: Re-run with corrected truth set
  • Emergency: Rollback to previous epoch
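
The counts in the epoch details can be turned into rates to compare epochs on equal footing. The sketch below computes each count as a share of total claims. Applied to the sample figures (1500 total, 1380 correct), it reproduces the `accuracy=0.92` reported for epoch 42. The function name is hypothetical.

```shell
# Sketch: derive accuracy and error shares from calibration epoch counts.
# $1 = total claims, $2 = correct, $3 = false positives, $4 = false negatives
calibration_rates() {
  awk -v total="$1" -v ok="$2" -v fpos="$3" -v fneg="$4" 'BEGIN {
    if (total <= 0) exit 2                     # avoid division by zero
    printf "accuracy=%.2f fp_share=%.3f fn_share=%.3f\n",
           ok / total, fpos / total, fneg / total
  }'
}

# Usage, with the sample epoch 42 figures:
#   calibration_rates 1500 1380 45 75
```

In the sample data, false negatives (75) outnumber false positives (45), so the coverage-matching review would be the first thing to try.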

2.5 Claim Conflicts

Symptoms:

  • Verdicts show hasConflicts: true
  • Confidence reduced due to conflict penalty

Diagnosis:

  1. View conflict details:

    stella verdict conflicts <manifestId>
    
    # Conflicts:
    #   vendor:redhat claims: not_affected
    #   hub:osv claims: affected
    #   Conflict penalty applied: 0.25
    
  2. Investigate source disagreement:

    # Get raw claims from each source
    stella vex claim --source vendor:redhat --cve CVE-2025-12345
    stella vex claim --source hub:osv --cve CVE-2025-12345
    
  3. Check claim timestamps:

    # Older claim may be outdated
    stella claim compare vendor:redhat hub:osv --cve CVE-2025-12345
    

Resolution:

  • If one source is stale: Flag for review
  • If genuine disagreement: Higher-trust source wins (by design)
  • If persistent: Consider source override in policy
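
The "higher-trust source wins" rule can be illustrated with a small selection helper. This sketch assumes each source's trust can be summarized as a single score, whereas the real engine works with a multi-component trust vector and also applies a conflict penalty that is not modeled here. The input format, function name, and scores in the usage line are all hypothetical.

```shell
# Sketch: pick the claim from the highest-trust source.
# stdin: one claim per line as "<source> <status> <trust score>".
winning_claim() {
  awk '
    NR == 1 || $3 + 0 > best { best = $3 + 0; winner = $1 " " $2 }
    END { if (NR > 0) print winner }
  '
}

# Usage (scores are made up for illustration):
#   printf 'vendor:redhat not_affected 0.85\nhub:osv affected 0.70\n' | winning_claim
```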

3. Performance Issues

3.1 Slow Claim Scoring

Symptoms:

  • Scoring latency > 100ms
  • Timeouts during high load

Diagnosis:

# Check scoring performance
stella perf scoring --samples 100

# Look for:
# - Cache miss rate
# - Trust vector lookups
# - Freshness calculation overhead

Resolution:

  • Enable trust vector caching
  • Pre-compute freshness for common cutoffs
  • Scale Excititor horizontally

3.2 Slow Verdict Replay

Symptoms:

  • Replay verification > 5 seconds
  • Timeout during audit

Diagnosis:

# Check input retrieval time
stella verdict replay --timing <manifestId>

# Timing:
#   Input fetch: 3.2s
#   Score compute: 0.1s
#   Merge: 0.05s
#   Total: 3.35s

Resolution:

  • Ensure CAS storage is local or cached
  • Pre-warm verdict cache for critical assets
  • Increase timeout for large manifests
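
To decide which of these resolutions applies, identify the phase that dominates the replay time. This sketch parses output shaped like the timing sample above (`<phase>: <seconds>s`) and ignores the `Total` line. The real output format may differ, and the function name is hypothetical.

```shell
# Sketch: report the slowest phase from "stella verdict replay --timing" output.
slowest_phase() {
  awk -F': *' '
    NF == 2 && $1 !~ /Total/ {
      name = $1; sub(/^[ \t#]+/, "", name)   # drop indentation or a comment marker
      secs = $2; sub(/s[ \t]*$/, "", secs)   # drop the trailing unit
      if (secs + 0 > max) { max = secs + 0; slow = name }
    }
    END { if (slow != "") printf "%s (%.2fs)\n", slow, max }
  '
}

# Usage:
#   stella verdict replay --timing <manifestId> | slowest_phase
```

If input fetch dominates, as in the sample, CAS locality and caching are the likely fixes. If score compute dominates, see section 3.1 instead.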

4. Integration Issues

4.1 VEX Source Not Recognized

Symptoms:

  • Claims from source not included in verdicts
  • Source shows as "unknown" class

Resolution:

  1. Register source in configuration:

    # etc/trust-lattice.yaml
    sources:
      - id: vendor:newvendor
        class: vendor
        trustVector:
          provenance: 0.85
          coverage: 0.70
          replayability: 0.60
    
  2. Reload configuration:

    stella config reload --service excititor
    

4.2 Gate Not Evaluating

Symptoms:

  • Expected gate not appearing in results
  • Gate shows as "disabled"

Resolution:

  1. Check gate configuration:

    stella gates list --show-disabled
    
  2. Enable gate:

    # etc/policy-gates.yaml
    gates:
      minimumConfidence:
        enabled: true  # Ensure this is true
    

5. Support Information

5.1 Collecting Diagnostic Bundle

stella support bundle --include trust-lattice \
  --since 1h --output /tmp/diag.zip

Bundle includes:

  • Trust vector snapshots
  • Recent verdicts
  • Gate evaluations
  • Calibration history
  • System metrics

5.2 Log Locations

| Service | Log Path |
|---------|----------|
| Excititor | /var/log/stellaops/excititor.log |
| Policy | /var/log/stellaops/policy.log |
| Authority | /var/log/stellaops/authority.log |

5.3 Contact


Document Version: 1.0.0 Sprint: 7100.0003.0002