save progress
This commit is contained in:
223
docs/guides/vex-trust-gate-rollout.md
Normal file
@@ -0,0 +1,223 @@
# VexTrustGate Rollout Guide

This guide describes the phased rollout procedure for the VexTrustGate policy feature, which enforces VEX signature verification trust thresholds.

## Overview

VexTrustGate adds a new policy gate that:

1. Validates VEX signature verification trust scores
2. Enforces per-environment thresholds (production stricter than staging/dev)
3. Blocks or warns on status transitions when trust is insufficient
4. Contributes to confidence scoring via VexTrustConfidenceFactorProvider
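
The decision logic described above can be sketched roughly as follows. This is an illustrative Python model, not the actual `VexTrustGate.cs` implementation: the `Threshold` and `TrustStatus` classes, the `evaluate` function, and the order in which the checks run are all assumptions. Only the setting names (`MinCompositeScore`, `RequireIssuerVerified`, `FailureAction`, `MissingTrustBehavior`) and the reason codes are taken from this guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Threshold:
    """Per-environment settings, named after the YAML keys in this guide."""
    min_composite_score: float
    require_issuer_verified: bool
    failure_action: str  # "Warn" or "Block"


@dataclass(frozen=True)
class TrustStatus:
    """Hypothetical stand-in for the VexTrustStatus data supplied by Excititor."""
    composite_score: float
    issuer_verified: bool


def evaluate(trust: Optional[TrustStatus], threshold: Threshold,
             missing_trust_behavior: str = "Warn") -> tuple[str, Optional[str]]:
    """Return (decision, reason); decision is "pass", "warn", or "block"."""
    if trust is None:
        # No trust data at all: defer to MissingTrustBehavior.
        return missing_trust_behavior.lower(), "missing_vex_trust_data"
    if threshold.require_issuer_verified and not trust.issuer_verified:
        return threshold.failure_action.lower(), "issuer_verified"
    if trust.composite_score < threshold.min_composite_score:
        return threshold.failure_action.lower(), "composite_score"
    return "pass", None


production = Threshold(0.80, True, "Block")
print(evaluate(TrustStatus(0.72, True), production))  # ('block', 'composite_score')
```

With `failure_action` set to `"Warn"`, the same inputs yield a `warn` decision instead, which is the behaviour the shadow-mode phase relies on.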

## Gate Order

VexTrustGate is positioned in the policy gate chain at **order 250**:

- **100**: EvidenceCompleteness
- **200**: LatticeState
- **250**: VexTrust ← NEW
- **300**: UncertaintyTier
- **400**: Confidence
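
To make the placement concrete, here is a small sketch that sorts the gates by their order value. It assumes that lower order values are evaluated first; the `gate_chain` helper is hypothetical and only the names and numbers come from the list above.

```python
# Order values come from the list above; the helper itself is a hypothetical sketch.
GATE_ORDER = {
    "EvidenceCompleteness": 100,
    "LatticeState": 200,
    "VexTrust": 250,
    "UncertaintyTier": 300,
    "Confidence": 400,
}


def gate_chain(orders: dict[str, int]) -> list[str]:
    """Return gate names sorted by ascending order value."""
    return [name for name, _ in sorted(orders.items(), key=lambda item: item[1])]


chain = gate_chain(GATE_ORDER)
print(" -> ".join(chain))
# EvidenceCompleteness -> LatticeState -> VexTrust -> UncertaintyTier -> Confidence
```

Under that assumption, VexTrust runs after LatticeState and before UncertaintyTier.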

## Prerequisites

1. VEX signature verification pipeline active (SPRINT_1227_0004_0001)
2. IssuerDirectory populated with trusted VEX sources
3. Excititor properly populating VexTrustStatus in API responses

## Rollout Phases

### Phase 1: Feature Flag Deployment

Deploy with gate disabled to establish baseline:

```yaml
PolicyGates:
  VexTrust:
    Enabled: false # Gate off initially
```

**Duration**: 1-2 days
**Monitoring**: Verify deployment health, no regression in existing gates.

### Phase 2: Shadow Mode (Warn Everywhere)

Enable gate in warn-only mode across all environments:

```yaml
PolicyGates:
  VexTrust:
    Enabled: true
    Thresholds:
      production:
        MinCompositeScore: 0.80
        RequireIssuerVerified: true
        FailureAction: Warn # Changed from Block
      staging:
        MinCompositeScore: 0.60
        RequireIssuerVerified: true
        FailureAction: Warn
      development:
        MinCompositeScore: 0.40
        RequireIssuerVerified: false
        FailureAction: Warn
    MissingTrustBehavior: Warn
```

**Duration**: 1-2 weeks
**Monitoring**:
- Review `stellaops.policy.vex_trust_gate.decisions.total` metrics
- Analyze warn events to understand threshold impact
- Collect feedback from operators on false positives
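
One way to analyze the warn events is to tally them by reason code per environment. The sketch below assumes the shadow-mode decisions have been exported as a list of dictionaries; that export format, the `SHADOW_EVENTS` sample data, and the `warn_breakdown` helper are hypothetical, while the reason codes are the ones used in this guide.

```python
from collections import Counter

# Hypothetical export of shadow-mode decisions (sample data, not real results).
SHADOW_EVENTS = [
    {"environment": "production", "decision": "warn", "reason": "composite_score"},
    {"environment": "production", "decision": "warn", "reason": "issuer_verified"},
    {"environment": "production", "decision": "warn", "reason": "composite_score"},
    {"environment": "production", "decision": "pass", "reason": None},
    {"environment": "staging", "decision": "warn", "reason": "freshness"},
]


def warn_breakdown(events, environment):
    """Count warn decisions per reason code for one environment."""
    return Counter(
        event["reason"]
        for event in events
        if event["environment"] == environment and event["decision"] == "warn"
    )


for reason, count in warn_breakdown(SHADOW_EVENTS, "production").most_common():
    print(f"{reason}: {count}")
# composite_score: 2
# issuer_verified: 1
```

A breakdown dominated by a single reason code suggests which threshold or data source to look at first.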

### Phase 3: Threshold Tuning

Based on Phase 2 data, adjust thresholds:

1. **Review decision breakdown by reason**:
   - `composite_score`: May need to lower threshold
   - `issuer_verified`: Check IssuerDirectory completeness
   - `freshness`: Consider expanding acceptable states

2. **Tenant-specific adjustments** (if needed):
   ```yaml
   PolicyGates:
     VexTrust:
       TenantOverrides:
         tenant-with-internal-vex:
           production:
             MinCompositeScore: 0.70 # Lower for self-signed internal VEX
         high-security-tenant:
           production:
             MinCompositeScore: 0.90 # Higher for regulated workloads
   ```

**Duration**: 1 week
**Outcome**: Validated threshold configuration
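
The tenant override idea can be illustrated with a short sketch of how an effective threshold might be resolved. It assumes a key-by-key merge, where an override replaces only the settings it names and inherits the rest from the environment default. Whether the real gate merges this way or replaces the whole block is not stated in this guide, and `effective_threshold` is a hypothetical helper.

```python
# Configuration mirrors the YAML in this guide, expressed as a plain dict.
CONFIG = {
    "Thresholds": {
        "production": {
            "MinCompositeScore": 0.80,
            "RequireIssuerVerified": True,
            "FailureAction": "Warn",
        },
    },
    "TenantOverrides": {
        "tenant-with-internal-vex": {"production": {"MinCompositeScore": 0.70}},
        "high-security-tenant": {"production": {"MinCompositeScore": 0.90}},
    },
}


def effective_threshold(config, environment, tenant=None):
    """Merge a tenant override (if any) over the environment default."""
    resolved = dict(config.get("Thresholds", {}).get(environment, {}))
    override = config.get("TenantOverrides", {}).get(tenant, {}).get(environment, {})
    resolved.update(override)  # assumed semantics: override wins key by key
    return resolved


print(effective_threshold(CONFIG, "production", "high-security-tenant"))
# {'MinCompositeScore': 0.9, 'RequireIssuerVerified': True, 'FailureAction': 'Warn'}
```

Tenants without an entry fall back to the environment default unchanged.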

### Phase 4: Production Enforcement

Enable blocking in production only:

```yaml
PolicyGates:
  VexTrust:
    Enabled: true
    Thresholds:
      production:
        MinCompositeScore: 0.80
        RequireIssuerVerified: true
        MinAccuracyRate: 0.85
        AcceptableFreshness:
          - fresh
        FailureAction: Block # Now enforcing
      staging:
        FailureAction: Warn # Still warn only
      development:
        FailureAction: Warn
```

**Duration**: Ongoing with monitoring
**Rollback**: Set `FailureAction: Warn` or `Enabled: false` if issues arise.
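
Before switching `FailureAction` to `Block`, it may help to replay composite scores collected during shadow mode against candidate thresholds. The sketch below is only a rough estimate under stated assumptions: the scores are made-up sample data, `projected_block_rate` is a hypothetical helper, and it considers `MinCompositeScore` alone, ignoring issuer verification, accuracy rate, and freshness.

```python
def projected_block_rate(scores, min_composite_score):
    """Fraction of observed scores that fall below the candidate threshold."""
    if not scores:
        return 0.0
    below = sum(1 for score in scores if score < min_composite_score)
    return below / len(scores)


# Made-up composite scores standing in for shadow-mode observations.
shadow_scores = [0.95, 0.88, 0.82, 0.79, 0.74, 0.91, 0.66, 0.85, 0.97, 0.81]

for candidate in (0.70, 0.80, 0.90):
    rate = projected_block_rate(shadow_scores, candidate)
    print(f"MinCompositeScore {candidate:.2f}: {rate:.0%} would be blocked")
# MinCompositeScore 0.70: 10% would be blocked
# MinCompositeScore 0.80: 30% would be blocked
# MinCompositeScore 0.90: 70% would be blocked
```

If the projected rate at the chosen threshold is higher than operators can absorb, that is a signal to stay in warn mode and revisit Phase 3.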

### Phase 5: Full Rollout

After production stabilization, optionally enable blocking in staging:

```yaml
PolicyGates:
  VexTrust:
    Thresholds:
      staging:
        MinCompositeScore: 0.60
        RequireIssuerVerified: true
        FailureAction: Block # Optional stricter staging
```

## Monitoring

### Key Metrics

| Metric | Description | Alert Threshold |
|--------|-------------|-----------------|
| `stellaops.policy.vex_trust_gate.evaluations.total` | Total evaluations | Baseline variance |
| `stellaops.policy.vex_trust_gate.decisions.total{decision="block"}` | Block decisions | Sudden spike |
| `stellaops.policy.vex_trust_gate.trust_score` | Score distribution | Mean < 0.50 |
| `stellaops.policy.vex_trust_gate.evaluation_duration_ms` | Latency | p99 > 100ms |
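
The two numeric alert thresholds in the table can be checked offline against sampled values, for example when validating alert rules before wiring them into a monitoring system. This is a sketch under assumptions: the sample lists are supplied by the caller, `p99` uses the nearest-rank method, and `check_alerts` is a hypothetical helper rather than part of StellaOps.

```python
from statistics import mean


def p99(samples):
    """Nearest-rank 99th percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("p99 requires at least one sample")
    ordered = sorted(samples)
    rank = (99 * len(ordered) + 99) // 100  # integer ceil(0.99 * n)
    return ordered[rank - 1]


def check_alerts(trust_scores, durations_ms):
    """Return the alert conditions from the table that the samples violate."""
    alerts = []
    if trust_scores and mean(trust_scores) < 0.50:
        alerts.append("trust_score mean < 0.50")
    if durations_ms and p99(durations_ms) > 100:
        alerts.append("evaluation_duration_ms p99 > 100ms")
    return alerts


# 100 latency samples: 98 fast, two slow outliers.
latencies = [10] * 98 + [150, 250]
print(check_alerts([0.40, 0.50], latencies))
# ['trust_score mean < 0.50', 'evaluation_duration_ms p99 > 100ms']
```

The "baseline variance" and "sudden spike" thresholds are relative to historical data and are not modelled here.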

### Trace Spans

- `VexTrustGate.EvaluateAsync`
  - Attributes: `environment`, `trust_score`, `decision`, `issuer_id`

### Audit Trail

PolicyAuditEntity now includes VEX trust fields:
- `VexTrustScore`: Composite score at decision time
- `VexTrustTier`: Tier classification
- `VexSignatureVerified`: Whether signature was verified
- `VexIssuerId`/`VexIssuerName`: Issuer info
- `VexTrustGateResult`: Gate decision
- `VexTrustGateReason`: Reason code
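
As an example of how these audit fields might be used, the sketch below mirrors them in a Python dataclass and counts block decisions per issuer. The field types, the lowercase `"block"` result value (borrowed from the `decision="block"` metric label), and the `blocked_by_issuer` helper are assumptions; the actual `PolicyAuditEntity` schema may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VexTrustAudit:
    """Illustrative mirror of the VEX trust audit fields (types are assumed)."""
    vex_trust_score: Optional[float]         # VexTrustScore
    vex_trust_tier: Optional[str]            # VexTrustTier
    vex_signature_verified: Optional[bool]   # VexSignatureVerified
    vex_issuer_id: Optional[str]             # VexIssuerId
    vex_issuer_name: Optional[str]           # VexIssuerName
    vex_trust_gate_result: Optional[str]     # VexTrustGateResult
    vex_trust_gate_reason: Optional[str]     # VexTrustGateReason


def blocked_by_issuer(entries: list[VexTrustAudit]) -> dict[str, int]:
    """Count block decisions per issuer ID, skipping entries without an issuer."""
    counts: dict[str, int] = {}
    for entry in entries:
        if entry.vex_trust_gate_result == "block" and entry.vex_issuer_id:
            counts[entry.vex_issuer_id] = counts.get(entry.vex_issuer_id, 0) + 1
    return counts
```

A summary like this can point to issuers that are missing from IssuerDirectory or whose statements consistently score below the threshold.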

## Rollback Procedure

### Immediate Disable

```yaml
PolicyGates:
  VexTrust:
    Enabled: false
```

### Switch to Warn Mode

```yaml
PolicyGates:
  VexTrust:
    Thresholds:
      production:
        FailureAction: Warn
      staging:
        FailureAction: Warn
      development:
        FailureAction: Warn
```

### Per-Tenant Disable

```yaml
PolicyGates:
  VexTrust:
    TenantOverrides:
      affected-tenant:
        production:
          MinCompositeScore: 0.01 # Effectively bypass
          RequireIssuerVerified: false
```

## Troubleshooting

### Common Issues

| Symptom | Likely Cause | Resolution |
|---------|--------------|------------|
| All VEX blocked | Missing IssuerDirectory entries | Populate directory with trusted issuers |
| High false positive rate | Threshold too strict | Lower `MinCompositeScore` |
| "missing_vex_trust_data" warnings | Verification pipeline not running | Check Excititor logs |
| Inconsistent decisions | Stale trust cache | Verify cache TTL settings |

### Debug Logging

Enable debug logging for gate:

```yaml
Logging:
  LogLevel:
    StellaOps.Policy.Engine.Gates.VexTrustGate: Debug
```

## Support

- Sprint: `SPRINT_1227_0004_0003`
- Component: `StellaOps.Policy.Engine.Gates`
- Files:
  - `src/Policy/StellaOps.Policy.Engine/Gates/VexTrustGate.cs`
  - `src/Policy/StellaOps.Policy.Engine/Gates/VexTrustGateOptions.cs`
  - `etc/policy-gates.yaml.sample`