# VexLens Operations Runbook

> VexLens provides VEX consensus computation across multiple issuer sources. This runbook covers deployment, configuration, operations, and troubleshooting.

## 1. Service scope

VexLens computes deterministic consensus over VEX (Vulnerability Exploitability eXchange) statements from multiple issuers. Operations owns:

- Consensus engine scaling, projection storage, and event bus connectivity.
- Monitoring and alerts for consensus latency, conflict rates, and trust weight anomalies.
- Runbook execution for recovery, offline bundle import, and issuer trust management.
- Coordination with Policy Engine and Vuln Explorer consumers.

Related documentation:

- `docs/modules/vex-lens/README.md`
- `docs/modules/vex-lens/architecture.md`
- `docs/modules/vex-lens/implementation_plan.md`
- `docs/modules/vex-lens/runbooks/observability.md`

## 2. Contacts & tooling

| Area | Owner(s) | Escalation |
|------|----------|------------|
| VexLens service | VEX Lens Guild | `#vex-lens-ops`, on-call rotation |
| Issuer Directory | Issuer Directory Guild | `#issuer-directory` |
| Policy Engine integration | Policy Guild | `#policy-engine` |
| Offline Kit | Offline Kit Guild | `#offline-kit` |

Primary tooling:

- `stella vex consensus` CLI (query, export, verify).
- VexLens API (`/api/v1/vex/consensus/*`) for automation.
- Grafana dashboards (`VEX Lens / Consensus Health`, `VEX Lens / Conflicts`).
- Alertmanager routes (`VexLens.ConsensusLatency`, `VexLens.Conflicts`).

## 3. Configuration

### 3.1 Options reference

Configure via `vexlens.yaml` or environment variables with the `VEXLENS_` prefix:

```yaml
VexLens:
  Storage:
    Driver: mongo # "memory" for testing, "mongo" for production
    ConnectionString: "mongodb://..."
    Database: stellaops
    ProjectionsCollection: vex_consensus
    HistoryCollection: vex_consensus_history
    MaxHistoryEntries: 100
    CommandTimeoutSeconds: 30
  Trust:
    AuthoritativeWeight: 1.0
    TrustedWeight: 0.8
    KnownWeight: 0.5
    UnknownWeight: 0.3
    UntrustedWeight: 0.1
    SignedMultiplier: 1.2
    FreshnessDecayDays: 30
    MinFreshnessFactor: 0.5
    JustifiedNotAffectedBoost: 1.1
    FixedStatusBoost: 1.05
  Consensus:
    DefaultMode: WeightedVote # HighestWeight, WeightedVote, Lattice, AuthoritativeFirst
    MinimumWeightThreshold: 0.1
    ConflictThreshold: 0.3
    RequireJustificationForNotAffected: false
    MaxStatementsPerComputation: 100
    EnableConflictDetection: true
    EmitEvents: true
  Normalization:
    EnabledFormats:
      - OpenVEX
      - CSAF
      - CycloneDX
    StrictMode: false
    MaxDocumentSizeBytes: 10485760 # 10 MB
    MaxStatementsPerDocument: 10000
  AirGap:
    SealedMode: false
    BundlePath: /var/lib/stellaops/vex-bundles
    VerifyBundleSignatures: true
    AllowedBundleSources: []
    ExportFormat: jsonl
  Telemetry:
    MetricsEnabled: true
    TracingEnabled: true
    MeterName: StellaOps.VexLens
    ActivitySourceName: StellaOps.VexLens
```

### 3.2 Environment variable overrides

```bash
VEXLENS_STORAGE__DRIVER=mongo
VEXLENS_STORAGE__CONNECTIONSTRING=mongodb://localhost:27017
VEXLENS_CONSENSUS__DEFAULTMODE=WeightedVote
VEXLENS_AIRGAP__SEALEDMODE=true
```

### 3.3 Consensus mode selection

| Mode | Use case |
|------|----------|
| HighestWeight | Single authoritative source preferred |
| WeightedVote | Democratic consensus from multiple sources |
| Lattice | Formal lattice join (most conservative) |
| AuthoritativeFirst | Short-circuit on authoritative issuer |

## 4. Monitoring & SLOs

Key metrics (exposed by VexLensMetrics):

| Metric | SLO / Alert | Notes |
|--------|-------------|-------|
| `vexlens.consensus.duration_seconds` | p95 < 500ms | Per-computation latency |
| `vexlens.consensus.conflicts_total` | Monitor trend | Conflicts by reason |
| `vexlens.consensus.confidence` | avg > 0.7 | Low confidence indicates issuer gaps |
| `vexlens.normalization.duration_seconds` | p95 < 200ms | Per-document normalization |
| `vexlens.normalization.errors_total` | Alert on spike | By format |
| `vexlens.trust.weight_value` | Distribution | Trust weight distribution |
| `vexlens.projection.query_duration_seconds` | p95 < 100ms | Projection lookups |

Dashboards must include:

- Consensus computation rate by mode and outcome.
- Conflict breakdown (status disagreement, weight tie, insufficient data).
- Trust weight distribution by issuer category.
- Normalization success/failure by VEX format.
- Projection query latency and throughput.

Alerts (Alertmanager):

- `VexLensConsensusLatencyHigh` - consensus duration p95 > 500ms for 5 minutes.
- `VexLensConflictSpike` - conflict rate increase > 50% in 10 minutes.
- `VexLensNormalizationFailures` - normalization error rate > 5% for 5 minutes.
- `VexLensLowConfidence` - average confidence < 0.5 for 10 minutes.

## 5. Routine operations

### 5.1 Daily checklist

- Review the dashboard for consensus latency and conflict rates.
- Check normalization error logs for malformed VEX documents.
- Verify projection storage growth is within capacity thresholds.
- Review trust weight distribution for anomalies.
- Scan logs for `issuer_not_found` or `signature_verification_failed`.

### 5.2 Weekly tasks

- Review the issuer directory for new registrations or revocations.
- Audit the conflict queue for persistent disagreements.
- Test consensus determinism with sample documents.
- Verify Policy Engine and Vuln Explorer integrations are functional.

### 5.3 Monthly tasks

- Review and tune trust weights based on issuer performance.
- Archive old projection history beyond the retention period.
- Update issuer trust tiers based on incident history.
- Test the offline bundle import/export workflow.

## 6. Offline operations

### 6.1 Bundle export

```bash
# Export consensus projections to offline bundle
stella vex consensus export \
  --format jsonl \
  --output /var/lib/stellaops/vex-bundles/consensus-2025-01.jsonl \
  --manifest /var/lib/stellaops/vex-bundles/manifest.json \
  --sign

# Verify bundle integrity
stella vex consensus verify \
  --bundle /var/lib/stellaops/vex-bundles/consensus-2025-01.jsonl \
  --manifest /var/lib/stellaops/vex-bundles/manifest.json
```

### 6.2 Bundle import (air-gapped)

```bash
# Enable sealed mode
export VEXLENS_AIRGAP__SEALEDMODE=true
export VEXLENS_AIRGAP__BUNDLEPATH=/var/lib/stellaops/vex-bundles

# Import bundle
stella vex consensus import \
  --bundle /var/lib/stellaops/vex-bundles/consensus-2025-01.jsonl \
  --verify-signatures

# Verify import
stella vex consensus status
```

### 6.3 Air-gap verification

1. Confirm `VEXLENS_AIRGAP__SEALEDMODE=true` is set in the environment.
2. Verify that service logs show no external network calls.
3. Check that bundle manifest hashes match the imported data.
4. Run a determinism check on the imported projections.

## 7. Troubleshooting

### 7.1 High conflict rates

**Symptoms:** `vexlens.consensus.conflicts_total` spiking.

**Investigation:**

1. Check the conflict breakdown by reason in the dashboard.
2. Identify issuers with conflicting statements.
3. Review issuer trust tiers and weights.

**Resolution:**

- Adjust `ConflictThreshold` if the disagreements are legitimate.
- Update issuer trust tiers based on authority.
- Contact issuer owners to resolve source conflicts.

### 7.2 Normalization failures

**Symptoms:** `vexlens.normalization.errors_total` increasing.

**Investigation:**

1. Check error logs for specific format failures.
2. Identify malformed documents in the input stream.
3. Validate each suspect document against its format schema.

**Resolution:**

- Set `StrictMode: false` for lenient parsing.
- Report malformed documents to source issuers.
- Update normalizers if the format specification changed.

### 7.3 Low consensus confidence

**Symptoms:** Average confidence below 0.5.

**Investigation:**

1. Check issuer coverage for affected vulnerabilities.
2. Review trust weight distribution.
3. Identify missing or untrusted issuers.

**Resolution:**

- Register additional trusted issuers.
- Adjust trust tier assignments.
- Import offline bundles from authoritative sources.

### 7.4 Projection storage growth

**Symptoms:** Storage usage increasing beyond capacity.

**Investigation:**

1. Check the `MaxHistoryEntries` setting.
2. Review projection count and history depth.
3. Identify high-churn vulnerability/product pairs.

**Resolution:**

- Reduce `MaxHistoryEntries`.
- Implement a history pruning job.
- Archive old projections to cold storage.

## 8. Recovery procedures

### 8.1 Storage failover

1. Stop VexLens service instances.
2. Switch the storage connection to the replica.
3. Verify connectivity with a health check.
4. Restart service instances.
5. Monitor for consensus recomputation.

### 8.2 Issuer directory sync

1. Export a backup of the current issuer registry.
2. Resync from the authoritative issuer directory source.
3. Verify issuer fingerprints and trust tiers.
4. Restart VexLens to reload the issuer cache.

### 8.3 Consensus recomputation

1. Trigger recomputation for affected vulnerability/product pairs.
2. Monitor recomputation progress in logs.
3. Verify consensus outcomes match the expected state.
4. Emit status change events if outcomes differ.

## 9. Evidence locations

- Sprint tracker: `docs/implplan/SPRINT_0129_0001_0001_policy_reasoning.md`
- Module docs: `docs/modules/vex-lens/`
- Source code: `src/VexLens/StellaOps.VexLens/`
- Dashboard stub: `docs/modules/vex-lens/runbooks/dashboards/vex-lens-observability.json`
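
As a supplementary note on section 3.2: the override names there appear to follow a pattern in which the `VEXLENS_` prefix is stripped, a double underscore separates nesting levels, and key names match the YAML options regardless of case. The Python sketch below illustrates that mapping under those assumptions only. It is not VexLens code; `apply_env_overrides` and `_match_key` are hypothetical helpers, and the real service may resolve overrides differently.

```python
import copy

PREFIX = "VEXLENS_"  # prefix named in section 3.1


def _match_key(node, segment):
    """Return the existing key that equals `segment` ignoring case, else `segment`."""
    for key in node:
        if key.lower() == segment.lower():
            return key
    return segment


def apply_env_overrides(options, environ, prefix=PREFIX):
    """Return a copy of `options` with prefixed environment overrides applied.

    Assumption: "__" separates nesting levels and key matching is
    case-insensitive. Values are kept as strings; no type coercion is done.
    """
    merged = copy.deepcopy(options)
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # unrelated variable such as PATH
        segments = name[len(prefix):].split("__")
        node = merged
        for segment in segments[:-1]:
            # Descend into the matching section, creating it if absent.
            node = node.setdefault(_match_key(node, segment), {})
        node[_match_key(node, segments[-1])] = value
    return merged


# Options as they would sit under the top-level `VexLens:` key.
options = {"Storage": {"Driver": "memory"}, "AirGap": {"SealedMode": False}}
environ = {
    "VEXLENS_STORAGE__DRIVER": "mongo",
    "VEXLENS_AIRGAP__SEALEDMODE": "true",
    "PATH": "/usr/bin",
}
merged = apply_env_overrides(options, environ)
print(merged)
```

When reviewing a deployment, a mapping like this can help confirm which YAML option a given variable is expected to override, for example that `VEXLENS_AIRGAP__SEALEDMODE` targets `AirGap.SealedMode`.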