# Trust Lattice Operations Runbook

> **Version**: 1.0.0
> **Last Updated**: 2025-12-22
> **Audience**: Operations and Support teams

---

## 1. Overview

The Trust Lattice is a VEX claim scoring framework that produces explainable, deterministic verdicts. This runbook covers operational procedures for monitoring, troubleshooting, and maintaining the system.

---

## 2. System Components

| Component | Service | Purpose |
|-----------|---------|---------|
| TrustVector | Excititor | 3-component trust scoring (P/C/R) |
| ClaimScoreMerger | Policy | Merge scored claims into verdicts |
| PolicyGates | Policy | Enforce trust thresholds |
| VerdictManifest | Authority | Store signed verdicts |
| Calibration | Excititor | Adjust trust vectors over time |

---

## 3. Monitoring

### 3.1 Key Metrics

| Metric | Alert Threshold | Description |
|--------|-----------------|-------------|
| `trustlattice_score_latency_p95` | > 100ms | Claim scoring latency |
| `trustlattice_merge_conflicts_total` | Rate increase | Claims with status conflicts |
| `policy_gate_failures_total` | Rate increase | Gate rejections |
| `verdict_manifest_replay_failures` | > 0 | Non-deterministic verdicts |
| `calibration_drift_percent` | > 10% | Trust vector drift from baseline |

### 3.2 Dashboards

Access dashboards at:

- Grafana: `https:///d/trustlattice`
- Prometheus queries:

```promql
# Average claim score by source class
avg(trustlattice_claim_score) by (source_class)

# Gate failure rate
rate(policy_gate_failures_total[5m])

# Confidence distribution
histogram_quantile(0.5, trustlattice_verdict_confidence_bucket)
```

### 3.3 Log Queries

Key log entries (Loki/ELK):

```
# Claim scoring
{app="excititor"} |= "ClaimScore computed"

# Gate failures
{app="policy"} |= "Gate failed" | json | gate_name != ""

# Verdict replay failures
{app="authority"} |= "Replay mismatch"
```

---

## 4. Common Operations

### 4.1 Viewing Current Trust Vectors

```bash
# Via CLI
stella trustvector list --source-class vendor

# Via API
curl -H "Authorization: Bearer $TOKEN" \
  https://api.example.com/api/v1/trustlattice/vectors
```

### 4.2 Inspecting a Verdict

```bash
# Get verdict details
stella verdict show verd:acme:abc123:CVE-2025-12345:1734873600

# Verify verdict replay
stella verdict replay verd:acme:abc123:CVE-2025-12345:1734873600
```

### 4.3 Viewing Gate Configuration

```bash
# List enabled gates
stella gates list --environment production

# Show gate thresholds
stella gates show minimumConfidence --environment production
```

### 4.4 Triggering Manual Calibration

```bash
# Trigger calibration epoch for a source
stella calibration run --source vendor:redhat \
  --start 2025-11-01 --end 2025-12-01

# View calibration history
stella calibration history vendor:redhat
```

---

## 5. Emergency Procedures

### 5.1 High Gate Failure Rate

**Symptoms:**

- Spike in `policy_gate_failures_total`
- Many builds failing due to low confidence

**Steps:**

1. Check if VEX source is unavailable:

   ```bash
   stella vex source status vendor:redhat
   ```

2. If source is stale, consider temporary threshold reduction:

   ```yaml
   # Edit etc/policy-gates.yaml
   gates:
     minimumConfidence:
       thresholds:
         production: 0.60  # Reduced from 0.75
   ```

3. Restart Policy Engine to apply changes
4. Monitor and restore threshold once source recovers

### 5.2 Verdict Replay Failures

**Symptoms:**

- `verdict_manifest_replay_failures` > 0
- Audit compliance check failures

**Steps:**

1. Identify failing verdict:

   ```bash
   stella verdict list --replay-status failed --limit 10
   ```

2. Compare original and replayed inputs:

   ```bash
   stella verdict diff
   ```

3. Common causes:
   - VEX document modified after verdict
   - Clock drift during evaluation
   - Policy configuration changed
4. For clock drift, verify NTP synchronization:

   ```bash
   timedatectl status
   ```

### 5.3 Trust Vector Drift Emergency

**Symptoms:**

- `calibration_drift_percent` > 20%
- Sudden confidence changes across many assets

**Steps:**

1. Freeze calibration:

   ```bash
   stella calibration freeze vendor:redhat
   ```

2. Investigate recent calibration epochs:

   ```bash
   stella calibration history vendor:redhat --epochs 5
   ```

3. If false positive rate increased, rollback:

   ```bash
   stella calibration rollback vendor:redhat --to-epoch 41
   ```

4. Unfreeze after investigation:

   ```bash
   stella calibration unfreeze vendor:redhat
   ```

---

## 6. Configuration

### 6.1 Configuration Files

| File | Purpose |
|------|---------|
| `etc/trust-lattice.yaml` | Trust vector weights and defaults |
| `etc/policy-gates.yaml` | Gate thresholds and rules |
| `etc/excititor-calibration.yaml` | Calibration parameters |

### 6.2 Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `TRUSTLATTICE_WEIGHTS_PROVENANCE` | 0.45 | Provenance weight |
| `TRUSTLATTICE_WEIGHTS_COVERAGE` | 0.35 | Coverage weight |
| `TRUSTLATTICE_FRESHNESS_HALFLIFE` | 90 | Freshness half-life (days) |
| `GATES_MINIMUM_CONFIDENCE_PROD` | 0.75 | Production confidence threshold |
| `CALIBRATION_LEARNING_RATE` | 0.02 | Calibration learning rate |

---

## 7. Maintenance Tasks

### 7.1 Daily

- [ ] Review gate failure alerts
- [ ] Check verdict replay success rate
- [ ] Monitor trust vector stability

### 7.2 Weekly

- [ ] Review calibration epoch results
- [ ] Analyze conflict rate trends
- [ ] Update trust vectors for new sources

### 7.3 Monthly

- [ ] Audit high-drift sources
- [ ] Review and tune gate thresholds
- [ ] Clean up expired verdict manifests

---

## 8. Contact

- **On-call**: #trustlattice-oncall (Slack)
- **Escalation**: VEX Guild Lead
- **Documentation**: `docs/modules/excititor/trust-lattice.md`

---

*Document Version: 1.0.0*
*Sprint: 7100.0003.0002*
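
The clock-drift check in section 5.2 relies on reading `timedatectl status` output by eye. As a minimal sketch, assuming the hosts run a systemd version that provides `timedatectl show` (239 or later), the same check could be scripted so it returns a non-zero exit code when the clock is not synchronized. The function names `sync_state` and `check_clock_sync` are hypothetical helpers for illustration and are not part of the `stella` CLI.

```bash
#!/usr/bin/env bash
# Sketch only: sync_state and check_clock_sync are hypothetical helper names.

# Map the raw NTPSynchronized property value to a short state string.
sync_state() {
  case "$1" in
    yes) echo "synchronized" ;;
    no)  echo "not-synchronized" ;;
    *)   echo "unknown" ;;   # empty output, or timedatectl unavailable
  esac
}

# Query systemd for the NTP sync flag, print the state, and
# return non-zero unless the clock is reported as synchronized.
check_clock_sync() {
  local raw state
  raw="$(timedatectl show -p NTPSynchronized --value 2>/dev/null)"
  state="$(sync_state "$raw")"
  echo "clock: ${state}"
  [ "$state" = "synchronized" ]
}
```

Sourcing the script and running `check_clock_sync` on a host that evaluates verdicts prints the state and sets the exit code, which may make the check easier to wire into an existing health probe than parsing the human-readable status output.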