Implement VEX document verification system with issuer management and signature verification
- Added IIssuerDirectory interface for managing VEX document issuers, including methods for registration, revocation, and trust validation.
- Created InMemoryIssuerDirectory class as an in-memory implementation of IIssuerDirectory for testing and single-instance deployments.
- Introduced ISignatureVerifier interface for verifying signatures on VEX documents, with support for multiple signature formats.
- Developed SignatureVerifier class as the default implementation of ISignatureVerifier, allowing extensibility for different signature formats.
- Implemented handlers for DSSE and JWS signature formats, including methods for verification and signature extraction.
- Defined various records and enums for issuer and signature metadata, enhancing the structure and clarity of the verification process.
297
docs/modules/vex-lens/runbooks/operations.md
Normal file
@@ -0,0 +1,297 @@
# VexLens Operations Runbook

> VexLens provides VEX consensus computation across multiple issuer sources. This runbook covers deployment, configuration, operations, and troubleshooting.

## 1. Service scope

VexLens computes deterministic consensus over VEX (Vulnerability Exploitability eXchange) statements from multiple issuers. Operations owns:

- Consensus engine scaling, projection storage, and event bus connectivity.
- Monitoring and alerts for consensus latency, conflict rates, and trust weight anomalies.
- Runbook execution for recovery, offline bundle import, and issuer trust management.
- Coordination with Policy Engine and Vuln Explorer consumers.

Related documentation:

- `docs/modules/vex-lens/README.md`
- `docs/modules/vex-lens/architecture.md`
- `docs/modules/vex-lens/implementation_plan.md`
- `docs/modules/vex-lens/runbooks/observability.md`

## 2. Contacts & tooling

| Area | Owner(s) | Escalation |
|------|----------|------------|
| VexLens service | VEX Lens Guild | `#vex-lens-ops`, on-call rotation |
| Issuer Directory | Issuer Directory Guild | `#issuer-directory` |
| Policy Engine integration | Policy Guild | `#policy-engine` |
| Offline Kit | Offline Kit Guild | `#offline-kit` |

Primary tooling:

- `stella vex consensus` CLI (query, export, verify).
- VexLens API (`/api/v1/vex/consensus/*`) for automation.
- Grafana dashboards (`VEX Lens / Consensus Health`, `VEX Lens / Conflicts`).
- Alertmanager routes (`VexLens.ConsensusLatency`, `VexLens.Conflicts`).

## 3. Configuration

### 3.1 Options reference

Configure via `vexlens.yaml` or environment variables with the `VEXLENS_` prefix:

```yaml
VexLens:
  Storage:
    Driver: mongo  # "memory" for testing, "mongo" for production
    ConnectionString: "mongodb://..."
    Database: stellaops
    ProjectionsCollection: vex_consensus
    HistoryCollection: vex_consensus_history
    MaxHistoryEntries: 100
    CommandTimeoutSeconds: 30

  Trust:
    AuthoritativeWeight: 1.0
    TrustedWeight: 0.8
    KnownWeight: 0.5
    UnknownWeight: 0.3
    UntrustedWeight: 0.1
    SignedMultiplier: 1.2
    FreshnessDecayDays: 30
    MinFreshnessFactor: 0.5
    JustifiedNotAffectedBoost: 1.1
    FixedStatusBoost: 1.05

  Consensus:
    DefaultMode: WeightedVote  # HighestWeight, WeightedVote, Lattice, AuthoritativeFirst
    MinimumWeightThreshold: 0.1
    ConflictThreshold: 0.3
    RequireJustificationForNotAffected: false
    MaxStatementsPerComputation: 100
    EnableConflictDetection: true
    EmitEvents: true

  Normalization:
    EnabledFormats:
      - OpenVEX
      - CSAF
      - CycloneDX
    StrictMode: false
    MaxDocumentSizeBytes: 10485760  # 10 MB
    MaxStatementsPerDocument: 10000

  AirGap:
    SealedMode: false
    BundlePath: /var/lib/stellaops/vex-bundles
    VerifyBundleSignatures: true
    AllowedBundleSources: []
    ExportFormat: jsonl

  Telemetry:
    MetricsEnabled: true
    TracingEnabled: true
    MeterName: StellaOps.VexLens
    ActivitySourceName: StellaOps.VexLens
```

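To make the `Trust` options easier to reason about when tuning them, here is a minimal Python sketch of how a per-statement weight might be derived. It is an illustration, not the VexLens implementation: the linear decay shape, the multiplication order, and the names `TrustOptions`, `freshness_factor`, and `effective_weight` are all assumptions, and the `JustifiedNotAffectedBoost` and `FixedStatusBoost` options are left out.

```python
from dataclasses import dataclass

# Tier weights copied from the Trust section of the sample configuration.
TIER_WEIGHTS = {
    "authoritative": 1.0,
    "trusted": 0.8,
    "known": 0.5,
    "unknown": 0.3,
    "untrusted": 0.1,
}


@dataclass(frozen=True)
class TrustOptions:
    signed_multiplier: float = 1.2
    freshness_decay_days: int = 30
    min_freshness_factor: float = 0.5


def freshness_factor(age_days: float, opts: TrustOptions) -> float:
    """Assumed linear decay from 1.0 down to min_freshness_factor."""
    if age_days <= 0:
        return 1.0
    progress = min(age_days / opts.freshness_decay_days, 1.0)
    return 1.0 - progress * (1.0 - opts.min_freshness_factor)


def effective_weight(tier: str, signed: bool, age_days: float,
                     opts: TrustOptions = TrustOptions()) -> float:
    """Tier weight, scaled by the signature multiplier and the freshness factor."""
    weight = TIER_WEIGHTS[tier]
    if signed:
        weight *= opts.signed_multiplier  # assumption: applied before freshness
    return weight * freshness_factor(age_days, opts)
```

Under these assumptions, a signed statement from a trusted issuer published today would weigh 0.8 × 1.2, while an unsigned statement from an unknown issuer older than `FreshnessDecayDays` would bottom out at 0.3 × 0.5.
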
### 3.2 Environment variable overrides

```bash
VEXLENS_STORAGE__DRIVER=mongo
VEXLENS_STORAGE__CONNECTIONSTRING=mongodb://localhost:27017
VEXLENS_CONSENSUS__DEFAULTMODE=WeightedVote
VEXLENS_AIRGAP__SEALEDMODE=true
```

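The four overrides above appear to follow one naming pattern: the `VEXLENS_` prefix, then the section and option names in upper case joined by a double underscore (the same separator the .NET environment variable configuration provider uses for nesting). The helper below is a small sketch of that inferred pattern; `env_var_name` is a hypothetical name, and the rule is deduced only from the examples shown here.

```python
def env_var_name(*path: str, prefix: str = "VEXLENS_") -> str:
    """Build an override name from an option path such as ("Storage", "Driver")."""
    if not path:
        raise ValueError("at least one path segment is required")
    # Upper-case each segment and join nested levels with a double underscore.
    return prefix + "__".join(segment.upper() for segment in path)


if __name__ == "__main__":
    print(env_var_name("Storage", "Driver"))
    print(env_var_name("AirGap", "SealedMode"))
```
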
### 3.3 Consensus mode selection

| Mode | Use case |
|------|----------|
| HighestWeight | Single authoritative source preferred |
| WeightedVote | Democratic consensus from multiple sources |
| Lattice | Formal lattice join (most conservative) |
| AuthoritativeFirst | Short-circuit on authoritative issuer |

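The difference between `HighestWeight` and `WeightedVote` is easiest to see on a small input. The sketch below is an assumption-laden illustration, not the engine's code: statements are modeled as `(status, weight)` pairs, weights below `MinimumWeightThreshold` are dropped, and ties are broken alphabetically so the result stays deterministic. `Lattice` and `AuthoritativeFirst` are omitted because this runbook does not define their ordering rules.

```python
from collections import defaultdict


def highest_weight(statements, min_weight=0.1):
    """Pick the status of the single heaviest eligible statement."""
    eligible = [(status, weight) for status, weight in statements if weight >= min_weight]
    if not eligible:
        return None
    # Sort key (-weight, status) makes ties resolve deterministically.
    return min(eligible, key=lambda sw: (-sw[1], sw[0]))[0]


def weighted_vote(statements, min_weight=0.1):
    """Sum weights per status and pick the status with the largest total."""
    totals = defaultdict(float)
    for status, weight in statements:
        if weight >= min_weight:
            totals[status] += weight
    if not totals:
        return None
    return min(totals.items(), key=lambda kv: (-kv[1], kv[0]))[0]


if __name__ == "__main__":
    sample = [("not_affected", 1.0), ("affected", 0.6), ("affected", 0.5)]
    print(highest_weight(sample), weighted_vote(sample))
```

With this sample, one heavy `not_affected` statement wins under the highest-weight rule, while two lighter `affected` statements together outweigh it under the vote.
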
## 4. Monitoring & SLOs

Key metrics (exposed by `VexLensMetrics`):

| Metric | SLO / Alert | Notes |
|--------|-------------|-------|
| `vexlens.consensus.duration_seconds` | p95 < 500ms | Per-computation latency |
| `vexlens.consensus.conflicts_total` | Monitor trend | Conflicts by reason |
| `vexlens.consensus.confidence` | avg > 0.7 | Low confidence indicates issuer gaps |
| `vexlens.normalization.duration_seconds` | p95 < 200ms | Per-document normalization |
| `vexlens.normalization.errors_total` | Alert on spike | By format |
| `vexlens.trust.weight_value` | Distribution | Trust weight distribution |
| `vexlens.projection.query_duration_seconds` | p95 < 100ms | Projection lookups |

Dashboards must include:

- Consensus computation rate by mode and outcome.
- Conflict breakdown (status disagreement, weight tie, insufficient data).
- Trust weight distribution by issuer category.
- Normalization success/failure by VEX format.
- Projection query latency and throughput.

Alerts (Alertmanager):

- `VexLensConsensusLatencyHigh` - consensus duration p95 > 500ms for 5 minutes.
- `VexLensConflictSpike` - conflict rate increase > 50% in 10 minutes.
- `VexLensNormalizationFailures` - normalization error rate > 5% for 5 minutes.
- `VexLensLowConfidence` - average confidence < 0.5 for 10 minutes.

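For spot checks outside the alerting stack, the latency condition can be reproduced on a handful of raw samples. The following sketch uses the nearest-rank definition of a percentile, which is an assumption (the monitoring backend may interpolate from histogram buckets instead), and it does not model the "for 5 minutes" hold period. The function names are hypothetical.

```python
def percentile(samples, pct: int):
    """Nearest-rank percentile for an integer pct in 1..100; samples must be non-empty."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def latency_breaches_slo(durations_seconds, threshold_seconds=0.5):
    """True when the p95 of the given samples exceeds the 500 ms threshold."""
    return percentile(durations_seconds, 95) > threshold_seconds
```
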
## 5. Routine operations

### 5.1 Daily checklist

- Review dashboard for consensus latency and conflict rates.
- Check normalization error logs for malformed VEX documents.
- Verify projection storage growth is within capacity thresholds.
- Review trust weight distribution for anomalies.
- Scan logs for `issuer_not_found` or `signature_verification_failed`.

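The log scan in the last checklist item can be scripted. This is a minimal sketch that counts lines containing either marker; it assumes plain-text log lines in which the marker strings appear verbatim, and `count_markers` is a hypothetical helper rather than part of the `stella` CLI.

```python
MARKERS = ("issuer_not_found", "signature_verification_failed")


def count_markers(lines, markers=MARKERS):
    """Count how many log lines mention each marker string."""
    counts = {marker: 0 for marker in markers}
    for line in lines:
        for marker in markers:
            if marker in line:
                counts[marker] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: pipe a log file into the script on standard input.
    for marker, total in count_markers(sys.stdin).items():
        print(f"{marker}: {total}")
```
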
### 5.2 Weekly tasks

- Review issuer directory for new registrations or revocations.
- Audit conflict queue for persistent disagreements.
- Test consensus determinism with sample documents.
- Verify Policy Engine and Vuln Explorer integrations are functional.

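One way to approach the determinism test is to feed the same statements in different orders and confirm that the serialized result never changes. The harness below is a sketch under stated assumptions: `compute` is a placeholder for whatever call returns a JSON-serializable consensus result in your environment, and the canonical form (sorted keys, compact separators, SHA-256) is chosen for illustration only.

```python
import hashlib
import json
import random


def result_digests(compute, statements, trials=5, seed=0):
    """Run compute on shuffled copies of statements and collect result digests."""
    rng = random.Random(seed)  # fixed seed keeps the check itself repeatable
    digests = set()
    for _ in range(trials):
        shuffled = list(statements)
        rng.shuffle(shuffled)
        canonical = json.dumps(compute(shuffled), sort_keys=True, separators=(",", ":"))
        digests.add(hashlib.sha256(canonical.encode("utf-8")).hexdigest())
    return digests


def is_deterministic(compute, statements, trials=5, seed=0):
    """True when every input ordering produced the same digest."""
    return len(result_digests(compute, statements, trials, seed)) == 1
```

A result that depends on input order shows up as more than one digest, which is usually the first sign of a missing tie-break rule.
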
### 5.3 Monthly tasks

- Review and tune trust weights based on issuer performance.
- Archive old projection history beyond retention period.
- Update issuer trust tiers based on incident history.
- Test offline bundle import/export workflow.

## 6. Offline operations

### 6.1 Bundle export

```bash
# Export consensus projections to offline bundle
stella vex consensus export \
  --format jsonl \
  --output /var/lib/stellaops/vex-bundles/consensus-2025-01.jsonl \
  --manifest /var/lib/stellaops/vex-bundles/manifest.json \
  --sign

# Verify bundle integrity
stella vex consensus verify \
  --bundle /var/lib/stellaops/vex-bundles/consensus-2025-01.jsonl \
  --manifest /var/lib/stellaops/vex-bundles/manifest.json
```

### 6.2 Bundle import (air-gapped)

```bash
# Enable sealed mode
export VEXLENS_AIRGAP__SEALEDMODE=true
export VEXLENS_AIRGAP__BUNDLEPATH=/var/lib/stellaops/vex-bundles

# Import bundle
stella vex consensus import \
  --bundle /var/lib/stellaops/vex-bundles/consensus-2025-01.jsonl \
  --verify-signatures

# Verify import
stella vex consensus status
```

### 6.3 Air-gap verification

1. Confirm `VEXLENS_AIRGAP__SEALEDMODE=true` in environment.
2. Verify no external network calls in service logs.
3. Check bundle manifest hashes match imported data.
4. Run determinism check on imported projections.

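Step 3 can be cross-checked independently of the CLI by recomputing file digests. The sketch below assumes a manifest shaped like `{"files": {"<relative name>": "<sha256 hex>"}}`; that layout is hypothetical, since this runbook does not document the manifest schema, so adapt the key names to the real file before relying on it.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path, chunk_size=65536):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path):
    """Return the names of files whose digest differs from the manifest entry."""
    manifest_path = Path(manifest_path)
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    mismatches = []
    # Assumed layout: file names are relative to the manifest's directory.
    for name, expected in manifest["files"].items():
        if sha256_file(manifest_path.parent / name) != expected.lower():
            mismatches.append(name)
    return mismatches
```

An empty return value means every listed file matched; any name in the list should block the import until the bundle is re-exported or re-transferred.
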
## 7. Troubleshooting

### 7.1 High conflict rates

**Symptoms:** `vexlens.consensus.conflicts_total` spiking.

**Investigation:**
1. Check conflict breakdown by reason in dashboard.
2. Identify issuers with conflicting statements.
3. Review issuer trust tiers and weights.

**Resolution:**
- Adjust `ConflictThreshold` if the disagreements are legitimate.
- Update issuer trust tiers based on authority.
- Contact issuer owners to resolve source conflicts.

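Before changing `ConflictThreshold`, it helps to see how a threshold of this kind behaves. The sketch below assumes one possible meaning: a conflict is flagged when the runner-up status holds at least that share of the total weight. The runbook does not define the actual semantics, so treat `detect_conflict` as a hypothetical model for reasoning, not as the engine's rule.

```python
def detect_conflict(weights_by_status, conflict_threshold=0.3):
    """Flag a conflict when the second-heaviest status reaches the threshold share."""
    total = sum(weights_by_status.values())
    if total <= 0 or len(weights_by_status) < 2:
        return False  # nothing to disagree with
    ranked = sorted(weights_by_status.values(), reverse=True)
    runner_up_share = ranked[1] / total
    return runner_up_share >= conflict_threshold


if __name__ == "__main__":
    totals = {"not_affected": 1.0, "affected": 0.6}
    print(detect_conflict(totals, 0.3), detect_conflict(totals, 0.4))
```

In this model, raising the threshold from 0.3 to 0.4 stops the sample from being reported, which mirrors the trade-off of tuning: fewer alerts, but weaker disagreements go unnoticed.
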
### 7.2 Normalization failures

**Symptoms:** `vexlens.normalization.errors_total` increasing.

**Investigation:**
1. Check error logs for specific format failures.
2. Identify malformed documents in input stream.
3. Validate document against format schema.

**Resolution:**
- Set `StrictMode: false` (the value shown in section 3.1) for lenient parsing.
- Report malformed documents to source issuers.
- Update normalizers if format specification changed.

### 7.3 Low consensus confidence

**Symptoms:** Average confidence below 0.5.

**Investigation:**
1. Check issuer coverage for affected vulnerabilities.
2. Review trust weight distribution.
3. Identify missing or untrusted issuers.

**Resolution:**
- Register additional trusted issuers.
- Adjust trust tier assignments.
- Import offline bundles from authoritative sources.

### 7.4 Projection storage growth

**Symptoms:** Storage usage increasing beyond capacity.

**Investigation:**
1. Check `MaxHistoryEntries` setting.
2. Review projection count and history depth.
3. Identify high-churn vulnerability/product pairs.

**Resolution:**
- Reduce `MaxHistoryEntries`.
- Implement history pruning job.
- Archive old projections to cold storage.

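The core of a history pruning job is a split between entries to keep and entries to archive. The sketch below shows that split for one vulnerability/product pair, assuming each history entry is a dictionary with a `computed_at` ISO-8601 UTC timestamp; the field name and the in-memory approach are hypothetical, and a production job would run against the history collection instead.

```python
def prune_history(entries, max_entries=100):
    """Split entries into (kept, archived), keeping the newest max_entries."""
    if max_entries < 0:
        raise ValueError("max_entries must not be negative")
    # ISO-8601 UTC timestamps in a uniform format sort correctly as strings.
    ordered = sorted(entries, key=lambda entry: entry["computed_at"], reverse=True)
    return ordered[:max_entries], ordered[max_entries:]


if __name__ == "__main__":
    history = [{"computed_at": f"2025-01-{day:02d}T00:00:00Z"} for day in range(1, 6)]
    kept, archived = prune_history(history, max_entries=2)
    print(len(kept), len(archived))
```
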
## 8. Recovery procedures

### 8.1 Storage failover

1. Stop VexLens service instances.
2. Switch storage connection to replica.
3. Verify connectivity with health check.
4. Restart service instances.
5. Monitor for consensus recomputation.

### 8.2 Issuer directory sync

1. Export current issuer registry backup.
2. Resync from authoritative issuer directory source.
3. Verify issuer fingerprints and trust tiers.
4. Restart VexLens to reload issuer cache.

### 8.3 Consensus recomputation

1. Trigger recomputation for affected vulnerability/product pairs.
2. Monitor recomputation progress in logs.
3. Verify consensus outcomes match expected state.
4. Emit status change events if outcomes differ.

## 9. Evidence locations

- Sprint tracker: `docs/implplan/SPRINT_0129_0001_0001_policy_reasoning.md`
- Module docs: `docs/modules/vex-lens/`
- Source code: `src/VexLens/StellaOps.VexLens/`
- Dashboard stub: `docs/modules/vex-lens/runbooks/dashboards/vex-lens-observability.json`