# Policy Determinization Architecture

## Overview

The **Determinization** subsystem handles CVEs that arrive without complete evidence (EPSS, VEX, reachability). Rather than blocking pipelines or silently ignoring unknowns, it treats them as **probabilistic observations** that can mature as evidence arrives.

**Design Principles:**

1. **Uncertainty is first-class** - Missing signals contribute to entropy, not guesswork
2. **Graceful degradation** - Pipelines continue with guardrails, not hard blocks
3. **Automatic hardening** - Policies tighten as evidence accumulates
4. **Full auditability** - Every decision traces back to evidence state

## Problem Statement

When a CVE is discovered against a component, several scenarios create uncertainty:

| Scenario | Current Behavior | Desired Behavior |
|----------|------------------|------------------|
| EPSS not yet published | Treat as unknown severity | Explicit `SignalQueryStatus.NotQueried` with default prior |
| VEX statement missing | Assume affected | Explicit uncertainty with configurable policy |
| Reachability indeterminate | Conservative block | Allow with guardrails in non-prod |
| Conflicting VEX sources | K4 Conflict state | Entropy penalty + human review trigger |
| Stale evidence (>14 days) | No special handling | Decay-adjusted confidence + auto-review |

## Architecture

### Component Diagram

```
                                               +------------------------+
                                               |    Policy Engine       |
                                               |  (Verdict Evaluation)  |
                                               +------------------------+
                                                           |
                                                           v
+----------------+    +-------------------+    +------------------------+
|    Feedser     |--->| Signal Aggregator |--->| Determinization Gate   |
| (EPSS/VEX/KEV) |    | (Null-aware)      |    | (Entropy Thresholds)   |
+----------------+    +-------------------+    +------------------------+
                                |                          |
                                v                          v
                      +-------------------+      +-------------------+
                      | Uncertainty Score |      | GuardRails Policy |
                      | Calculator        |      | (Allow/Quarantine)|
                      +-------------------+      +-------------------+
                                |                          |
                                v                          v
                      +-------------------+      +-------------------+
                      | Decay Calculator  |      | Observation State |
                      | (Half-life)       |      | (pending_determ)  |
                      +-------------------+      +-------------------+
```

### Library Structure

```
src/Policy/__Libraries/StellaOps.Policy.Determinization/
├── Models/
│   ├── ObservationState.cs  # CVE observation lifecycle states
│   ├── SignalState.cs  # Null-aware signal wrapper
│   ├── SignalSnapshot.cs  # Point-in-time signal collection
│   ├── UncertaintyScore.cs  # Knowledge completeness entropy
│   ├── ObservationDecay.cs  # Per-CVE decay configuration
│   ├── GuardRails.cs  # Guardrail policy outcomes
│   └── DeterminizationContext.cs  # Evaluation context container
├── Scoring/
│   ├── IUncertaintyScoreCalculator.cs
│   ├── UncertaintyScoreCalculator.cs  # entropy = 1 - evidence_sum
│   ├── IDecayedConfidenceCalculator.cs
│   ├── DecayedConfidenceCalculator.cs  # Half-life decay application
│   ├── SignalWeights.cs  # Configurable signal weights
│   ├── PriorDistribution.cs  # Default priors for missing signals
│   ├── EvidenceWeightedScoring/  # 6-dimension EWS model
│   │   ├── EwsDimension.cs  # RCH/RTS/BKP/XPL/SRC/MIT enum
│   │   ├── IEwsDimensionNormalizer.cs  # Pluggable normalizer interface
│   │   ├── EwsSignalInput.cs  # Raw signal inputs
│   │   ├── EwsModels.cs  # Scores, weights, guardrails
│   │   ├── IGuardrailsEngine.cs  # Guardrails enforcement interface
│   │   ├── GuardrailsEngine.cs  # Caps/floors (KEV, backport, etc.)
│   │   ├── IEwsCalculator.cs  # Unified calculator interface
│   │   ├── EwsCalculator.cs  # Orchestrates normalizers + guardrails
│   │   └── Normalizers/
│   │       ├── ReachabilityNormalizer.cs
│   │       ├── RuntimeSignalsNormalizer.cs
│   │       ├── BackportEvidenceNormalizer.cs
│   │       ├── ExploitabilityNormalizer.cs
│   │       ├── SourceConfidenceNormalizer.cs
│   │       └── MitigationStatusNormalizer.cs
│   ├── Triage/  # Decay-based triage queue (Sprint 050)
│   │   ├── TriageModels.cs  # TriagePriority, TriageItem, TriageQueueSnapshot, TriageQueueOptions
│   │   ├── ITriageQueueEvaluator.cs  # Batch + single evaluation interface
│   │   ├── TriageQueueEvaluator.cs  # Priority classification, days-until-stale, OTel metrics
│   │   ├── ITriageObservationSource.cs  # Source for observation candidates
│   │   ├── ITriageReanalysisSink.cs  # Sink interface for re-analysis queue
│   │   ├── InMemoryTriageReanalysisSink.cs  # ConcurrentQueue-based default sink
│   │   └── UnknownTriageQueueService.cs  # Fetch→evaluate→enqueue cycle orchestrator
│   └── WeightManifest/  # Versioned weight manifests (Sprint 051)
│       ├── WeightManifestModels.cs  # WeightManifestDocument, weights, guardrails, buckets, diff models
│       ├── WeightManifestHashComputer.cs  # Deterministic SHA-256 with canonical JSON (excludes contentHash)
│       ├── IWeightManifestLoader.cs  # Interface: list, load, select, validate, diff
│       ├── WeightManifestLoader.cs  # File-based discovery, effectiveFrom selection, OTel metrics
│       └── WeightManifestCommands.cs  # CLI backing: list, validate, diff, activate, hash
├── Policies/
│   ├── IDeterminizationPolicy.cs
│   ├── DeterminizationPolicy.cs  # Allow/quarantine/escalate rules
│   ├── GuardRailsPolicy.cs  # Guardrails configuration
│   ├── DeterminizationRuleSet.cs  # Rule definitions
│   └── EnvironmentThresholds.cs  # Per-environment thresholds
├── Gates/
│   ├── IDeterminizationGate.cs
│   ├── DeterminizationGate.cs  # Policy engine gate
│   └── DeterminizationGateOptions.cs
├── Subscriptions/
│   ├── ISignalUpdateSubscription.cs
│   ├── SignalUpdateHandler.cs  # Re-evaluation on new signals
│   └── DeterminizationEventTypes.cs
├── DeterminizationOptions.cs  # Global options
└── ServiceCollectionExtensions.cs  # DI registration
```

## Data Models

### ObservationState

Represents the lifecycle state of a CVE observation, orthogonal to VEX status:

```csharp
/// <summary>
/// Observation state for CVE tracking, independent of VEX status.
/// Allows a CVE to be "Affected" (VEX) but "PendingDeterminization" (observation).
/// </summary>
public enum ObservationState
{
    /// <summary>
    /// Initial state: CVE discovered but evidence incomplete.
    /// Triggers guardrail-based policy evaluation.
    /// </summary>
    PendingDeterminization = 0,

    /// <summary>
    /// Evidence sufficient for confident determination.
    /// Normal policy evaluation applies.
    /// </summary>
    Determined = 1,

    /// <summary>
    /// Multiple signals conflict (K4 Conflict state).
    /// Requires human review regardless of confidence.
    /// </summary>
    Disputed = 2,

    /// <summary>
    /// Evidence decayed below threshold; needs refresh.
    /// Auto-triggered when the decayed confidence multiplier drops to or below the staleness threshold.
    /// </summary>
    StaleRequiresRefresh = 3,

    /// <summary>
    /// Manually flagged for review.
    /// Bypasses automatic determinization.
    /// </summary>
    ManualReviewRequired = 4,

    /// <summary>
    /// CVE suppressed/ignored by policy exception.
    /// Evidence tracking continues, but policy decisions are skipped.
    /// </summary>
    Suppressed = 5
}
```

### SignalState<T>

Null-aware wrapper distinguishing "not queried" from "queried, value null":

```csharp
/// <summary>
/// Wraps a signal value with query status metadata.
/// Distinguishes between: not queried, queried with value, queried but absent, query failed.
/// </summary>
public sealed record SignalState<T>
{
    /// <summary>Status of the signal query.</summary>
    public required SignalQueryStatus Status { get; init; }

    /// <summary>Signal value if Status is Queried and value exists.</summary>
    public T? Value { get; init; }

    /// <summary>When the signal was last queried (UTC).</summary>
    public DateTimeOffset? QueriedAt { get; init; }

    /// <summary>Reason for failure if Status is Failed.</summary>
    public string? FailureReason { get; init; }

    /// <summary>Source that provided the value (feed ID, issuer, etc.).</summary>
    public string? Source { get; init; }

    /// <summary>Whether this signal contributes to uncertainty (true if not queried or failed).</summary>
    public bool ContributesToUncertainty =>
        Status is SignalQueryStatus.NotQueried or SignalQueryStatus.Failed;

    /// <summary>Whether this signal has a usable value.</summary>
    public bool HasValue => Status == SignalQueryStatus.Queried && Value is not null;
}

public enum SignalQueryStatus
{
    /// <summary>Signal source not yet queried.</summary>
    NotQueried = 0,

    /// <summary>Signal source queried; value may be present or absent.</summary>
    Queried = 1,

    /// <summary>Signal query failed (timeout, network, parse error).</summary>
    Failed = 2
}
```
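
The four situations the wrapper distinguishes can be made explicit with small factory helpers. The following is a minimal sketch, not library code: `SignalStates` and its method names are hypothetical, and the record is re-declared in trimmed form only so the snippet compiles on its own.

```csharp
#nullable enable
using System;

// Trimmed re-declaration of the types above so the sketch is self-contained.
public enum SignalQueryStatus { NotQueried = 0, Queried = 1, Failed = 2 }

public sealed record SignalState<T>
{
    public required SignalQueryStatus Status { get; init; }
    public T? Value { get; init; }
    public DateTimeOffset? QueriedAt { get; init; }
    public string? FailureReason { get; init; }
    public string? Source { get; init; }

    public bool ContributesToUncertainty =>
        Status is SignalQueryStatus.NotQueried or SignalQueryStatus.Failed;

    public bool HasValue => Status == SignalQueryStatus.Queried && Value is not null;
}

// Hypothetical helpers (an assumption for illustration, not part of the library).
public static class SignalStates
{
    // Source has not been asked yet.
    public static SignalState<T> NotQueried<T>() =>
        new() { Status = SignalQueryStatus.NotQueried };

    // Source answered and supplied a value.
    public static SignalState<T> Present<T>(T value, string source, DateTimeOffset at) =>
        new() { Status = SignalQueryStatus.Queried, Value = value, Source = source, QueriedAt = at };

    // Source answered but had nothing for this CVE.
    public static SignalState<T> Absent<T>(string source, DateTimeOffset at) =>
        new() { Status = SignalQueryStatus.Queried, Source = source, QueriedAt = at };

    // Query did not complete (timeout, network, parse error).
    public static SignalState<T> Failed<T>(string reason, DateTimeOffset at) =>
        new() { Status = SignalQueryStatus.Failed, FailureReason = reason, QueriedAt = at };
}
```

Two properties of the record show up when these helpers are used. An absent signal has `HasValue == false` and `ContributesToUncertainty == false` at the same time, so the two flags are not complements. And because `T?` on an unconstrained type parameter is not `Nullable<T>` for value types, `SignalStates.Absent<bool>(...)` reports `HasValue == true`, which is worth keeping in mind for `SignalState<bool>` signals such as `Kev`.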

### SignalSnapshot

Point-in-time collection of all signals for a CVE observation:

```csharp
/// <summary>
/// Immutable snapshot of all signals for a CVE observation at a point in time.
/// </summary>
public sealed record SignalSnapshot
{
    /// <summary>CVE identifier (e.g., CVE-2026-12345).</summary>
    public required string CveId { get; init; }

    /// <summary>Subject component (PURL).</summary>
    public required string SubjectPurl { get; init; }

    /// <summary>Snapshot capture time (UTC).</summary>
    public required DateTimeOffset CapturedAt { get; init; }

    /// <summary>EPSS score signal.</summary>
    public required SignalState<EpssEvidence> Epss { get; init; }

    /// <summary>VEX claim signal.</summary>
    public required SignalState<VexClaimSummary> Vex { get; init; }

    /// <summary>Reachability determination signal.</summary>
    public required SignalState<ReachabilityEvidence> Reachability { get; init; }

    /// <summary>Runtime observation signal (eBPF, dyld, ETW).</summary>
    public required SignalState<RuntimeEvidence> Runtime { get; init; }

    /// <summary>Fix backport detection signal.</summary>
    public required SignalState<BackportEvidence> Backport { get; init; }

    /// <summary>SBOM lineage signal.</summary>
    public required SignalState<SbomLineageEvidence> SbomLineage { get; init; }

    /// <summary>Known Exploited Vulnerability flag.</summary>
    public required SignalState<bool> Kev { get; init; }

    /// <summary>CVSS score signal.</summary>
    public required SignalState<CvssEvidence> Cvss { get; init; }
}
```

### UncertaintyScore

Knowledge completeness measurement (not code entropy):

```csharp
/// <summary>
/// Measures knowledge completeness for a CVE observation.
/// High entropy (close to 1.0) means many signals are missing.
/// Low entropy (close to 0.0) means comprehensive evidence.
/// </summary>
public sealed record UncertaintyScore
{
    /// <summary>Entropy value [0.0-1.0]. Higher = more uncertain.</summary>
    public required double Entropy { get; init; }

    /// <summary>Completeness value [0.0-1.0]. Higher = more complete. (1 - Entropy)</summary>
    public double Completeness => 1.0 - Entropy;

    /// <summary>Signals that are missing or failed.</summary>
    public required ImmutableArray<SignalGap> MissingSignals { get; init; }

    /// <summary>Weighted sum of present signals.</summary>
    public required double WeightedEvidenceSum { get; init; }

    /// <summary>Maximum possible weighted sum (all signals present).</summary>
    public required double MaxPossibleWeight { get; init; }

    /// <summary>Tier classification based on entropy.</summary>
    public UncertaintyTier Tier => Entropy switch
    {
        <= 0.2 => UncertaintyTier.VeryLow,   // Comprehensive evidence
        <= 0.4 => UncertaintyTier.Low,       // Good evidence coverage
        <= 0.6 => UncertaintyTier.Medium,    // Moderate gaps
        <= 0.8 => UncertaintyTier.High,      // Significant gaps
        _ => UncertaintyTier.VeryHigh        // Minimal evidence
    };
}

public sealed record SignalGap(
    string SignalName,
    double Weight,
    SignalQueryStatus Status,
    string? Reason);

public enum UncertaintyTier
{
    VeryLow = 0,   // Entropy <= 0.2
    Low = 1,       // Entropy <= 0.4
    Medium = 2,    // Entropy <= 0.6
    High = 3,      // Entropy <= 0.8
    VeryHigh = 4   // Entropy > 0.8
}
```

### ObservationDecay

Time-based confidence decay configuration:

```csharp
/// <summary>
/// Tracks evidence freshness decay for a CVE observation.
/// </summary>
public sealed record ObservationDecay
{
    /// <summary>Half-life for confidence decay. Default: 14 days per advisory.</summary>
    public required TimeSpan HalfLife { get; init; }

    /// <summary>Minimum confidence floor (never decays below). Default: 0.35.</summary>
    public required double Floor { get; init; }

    /// <summary>Last time any signal was updated (UTC).</summary>
    public required DateTimeOffset LastSignalUpdate { get; init; }

    /// <summary>Current decayed confidence multiplier [Floor-1.0].</summary>
    public required double DecayedMultiplier { get; init; }

    /// <summary>When next auto-review is scheduled (UTC).</summary>
    public DateTimeOffset? NextReviewAt { get; init; }

    /// <summary>Whether decay has triggered stale state.</summary>
    public bool IsStale { get; init; }
}
```

### GuardRails

Policy outcome with monitoring requirements:

```csharp
/// <summary>
/// Guardrails applied when allowing uncertain observations.
/// </summary>
public sealed record GuardRails
{
    /// <summary>Enable runtime monitoring for this observation.</summary>
    public required bool EnableRuntimeMonitoring { get; init; }

    /// <summary>Interval for automatic re-review.</summary>
    public required TimeSpan ReviewInterval { get; init; }

    /// <summary>EPSS threshold that triggers automatic escalation.</summary>
    public required double EpssEscalationThreshold { get; init; }

    /// <summary>Reachability states that trigger escalation.</summary>
    public required ImmutableArray<string> EscalatingReachabilityStates { get; init; }

    /// <summary>Maximum time in guarded state before forced review.</summary>
    public required TimeSpan MaxGuardedDuration { get; init; }

    /// <summary>Alert channels for this observation.</summary>
    public ImmutableArray<string> AlertChannels { get; init; } = ImmutableArray<string>.Empty;

    /// <summary>Additional context for audit trail.</summary>
    public string? PolicyRationale { get; init; }
}
```

## Scoring Algorithms

### Uncertainty Score Calculation

```csharp
/// <summary>
/// Calculates knowledge completeness entropy from signal snapshot.
/// Formula: entropy = 1 - (sum of weighted present signals / max possible weight)
/// </summary>
public sealed class UncertaintyScoreCalculator : IUncertaintyScoreCalculator
{
    private readonly SignalWeights _weights;

    public UncertaintyScoreCalculator(SignalWeights weights) =>
        _weights = weights ?? throw new ArgumentNullException(nameof(weights));

    public UncertaintyScore Calculate(SignalSnapshot snapshot)
    {
        var gaps = new List<SignalGap>();
        var weightedSum = 0.0;
        var maxWeight = _weights.TotalWeight;

        // EPSS signal
        if (snapshot.Epss.HasValue)
            weightedSum += _weights.Epss;
        else
            gaps.Add(new SignalGap("EPSS", _weights.Epss, snapshot.Epss.Status, snapshot.Epss.FailureReason));

        // VEX signal
        if (snapshot.Vex.HasValue)
            weightedSum += _weights.Vex;
        else
            gaps.Add(new SignalGap("VEX", _weights.Vex, snapshot.Vex.Status, snapshot.Vex.FailureReason));

        // Reachability signal
        if (snapshot.Reachability.HasValue)
            weightedSum += _weights.Reachability;
        else
            gaps.Add(new SignalGap("Reachability", _weights.Reachability, snapshot.Reachability.Status, snapshot.Reachability.FailureReason));

        // Runtime signal
        if (snapshot.Runtime.HasValue)
            weightedSum += _weights.Runtime;
        else
            gaps.Add(new SignalGap("Runtime", _weights.Runtime, snapshot.Runtime.Status, snapshot.Runtime.FailureReason));

        // Backport signal
        if (snapshot.Backport.HasValue)
            weightedSum += _weights.Backport;
        else
            gaps.Add(new SignalGap("Backport", _weights.Backport, snapshot.Backport.Status, snapshot.Backport.FailureReason));

        // SBOM Lineage signal
        if (snapshot.SbomLineage.HasValue)
            weightedSum += _weights.SbomLineage;
        else
            gaps.Add(new SignalGap("SBOMLineage", _weights.SbomLineage, snapshot.SbomLineage.Status, snapshot.SbomLineage.FailureReason));

        // KEV and CVSS carry no weight in SignalWeights, so they do not affect entropy.

        // A non-positive total weight means no evidence can count: report maximal entropy
        // instead of dividing by zero.
        var entropy = maxWeight > 0 ? 1.0 - (weightedSum / maxWeight) : 1.0;

        return new UncertaintyScore
        {
            Entropy = Math.Clamp(entropy, 0.0, 1.0),
            MissingSignals = gaps.ToImmutableArray(),
            WeightedEvidenceSum = weightedSum,
            MaxPossibleWeight = maxWeight
        };
    }
}
```

### Signal Weights (Configurable)

```csharp
/// <summary>
/// Configurable weights for signal contribution to completeness.
/// Weights should sum to 1.0 for normalized entropy.
/// </summary>
public sealed record SignalWeights
{
    public double Vex { get; init; } = 0.25;
    public double Epss { get; init; } = 0.15;
    public double Reachability { get; init; } = 0.25;
    public double Runtime { get; init; } = 0.15;
    public double Backport { get; init; } = 0.10;
    public double SbomLineage { get; init; } = 0.10;

    public double TotalWeight =>
        Vex + Epss + Reachability + Runtime + Backport + SbomLineage;

    /// <summary>Returns a copy rescaled so that the weights sum to 1.0.</summary>
    public SignalWeights Normalize()
    {
        var total = TotalWeight;
        if (total <= 0)
            throw new InvalidOperationException("Signal weights must sum to a positive value.");

        return new SignalWeights
        {
            Vex = Vex / total,
            Epss = Epss / total,
            Reachability = Reachability / total,
            Runtime = Runtime / total,
            Backport = Backport / total,
            SbomLineage = SbomLineage / total
        };
    }
}
```
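
To make the formula concrete, the sketch below recomputes entropy from a set of present signal names using the default weights listed above. It is an illustration under stated assumptions: `EntropyExample`, its dictionary-based signature, and the string tier names are hypothetical simplifications, not the library's API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for UncertaintyScoreCalculator (illustration only).
public static class EntropyExample
{
    // Default weights copied from SignalWeights above.
    public static readonly IReadOnlyDictionary<string, double> DefaultWeights =
        new Dictionary<string, double>
        {
            ["VEX"] = 0.25,
            ["EPSS"] = 0.15,
            ["Reachability"] = 0.25,
            ["Runtime"] = 0.15,
            ["Backport"] = 0.10,
            ["SBOMLineage"] = 0.10,
        };

    // entropy = 1 - (sum of weights of present signals / total weight), clamped to [0, 1].
    public static double Entropy(IReadOnlyDictionary<string, double> weights, ISet<string> present)
    {
        var total = weights.Values.Sum();
        if (total <= 0) return 1.0; // no usable weights: treat as fully uncertain

        var evidence = weights.Where(kv => present.Contains(kv.Key)).Sum(kv => kv.Value);
        return Math.Clamp(1.0 - evidence / total, 0.0, 1.0);
    }

    // Same cut-offs as UncertaintyScore.Tier, returned as text for brevity.
    public static string Tier(double entropy) => entropy switch
    {
        <= 0.2 => "VeryLow",
        <= 0.4 => "Low",
        <= 0.6 => "Medium",
        <= 0.8 => "High",
        _ => "VeryHigh"
    };
}
```

With VEX, Reachability, and EPSS present, the evidence sum is 0.25 + 0.25 + 0.15 = 0.65, so `EntropyExample.Entropy(EntropyExample.DefaultWeights, new HashSet<string> { "VEX", "Reachability", "EPSS" })` comes out at roughly 0.35, which falls in the Low tier. With no signals present the result is 1.0 (VeryHigh).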

### Decay Calculation

```csharp
/// <summary>
/// Applies exponential decay to confidence based on evidence staleness.
/// Formula: decayed = max(floor, exp(-ln(2) * age_days / half_life_days))
/// </summary>
public sealed class DecayedConfidenceCalculator : IDecayedConfidenceCalculator
{
    private readonly TimeProvider _timeProvider;

    public DecayedConfidenceCalculator(TimeProvider timeProvider) =>
        _timeProvider = timeProvider ?? throw new ArgumentNullException(nameof(timeProvider));

    public ObservationDecay Calculate(
        DateTimeOffset lastSignalUpdate,
        TimeSpan halfLife,
        double floor = 0.35)
    {
        if (halfLife <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(halfLife), "Half-life must be positive.");

        var now = _timeProvider.GetUtcNow();
        var ageDays = (now - lastSignalUpdate).TotalDays;

        double decayedMultiplier;
        if (ageDays <= 0)
        {
            // Evidence is current (or timestamped in the future): no decay.
            decayedMultiplier = 1.0;
        }
        else
        {
            var rawDecay = Math.Exp(-Math.Log(2) * ageDays / halfLife.TotalDays);
            decayedMultiplier = Math.Max(rawDecay, floor);
        }

        // Next review is due when the raw decay reaches 50%, i.e. one half-life after the last update.
        var daysTo50Percent = halfLife.TotalDays;
        var nextReviewAt = lastSignalUpdate.AddDays(daysTo50Percent);

        return new ObservationDecay
        {
            HalfLife = halfLife,
            Floor = floor,
            LastSignalUpdate = lastSignalUpdate,
            DecayedMultiplier = decayedMultiplier,
            NextReviewAt = nextReviewAt,
            // Stale once the multiplier has decayed to 50% or below.
            IsStale = decayedMultiplier <= 0.5
        };
    }
}
```
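
A few sample points help when choosing a half-life and floor. The sketch below restates the decay formula as a pure function of age so it can be evaluated without a clock; `DecayExample` and both method names are hypothetical and exist only for this illustration.

```csharp
using System;

// Hypothetical helper that mirrors the decay formula above (illustration only).
public static class DecayExample
{
    // decayed = max(floor, exp(-ln(2) * ageDays / halfLifeDays)); 1.0 for non-positive age.
    public static double Multiplier(double ageDays, double halfLifeDays, double floor = 0.35)
    {
        if (halfLifeDays <= 0)
            throw new ArgumentOutOfRangeException(nameof(halfLifeDays), "Half-life must be positive.");
        if (ageDays <= 0)
            return 1.0;

        var raw = Math.Exp(-Math.Log(2) * ageDays / halfLifeDays);
        return Math.Max(raw, floor);
    }

    // Age at which the raw curve meets the floor: halfLifeDays * log2(1 / floor).
    public static double DaysUntilFloor(double halfLifeDays, double floor = 0.35) =>
        halfLifeDays * Math.Log2(1.0 / floor);
}
```

With the default 14-day half-life and 0.35 floor, the multiplier is about 0.71 after 7 days, about 0.50 after 14 days, and pinned at the 0.35 floor after 28 days (the raw value there would be 0.25). `DecayExample.DaysUntilFloor(14)` is roughly 21.2 days. Since `IsStale` is defined as `DecayedMultiplier <= 0.5`, a floor configured above 0.5 would keep the multiplier from ever reaching the stale condition.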

## Policy Rules

### Determinization Policy

```csharp
/// <summary>
/// Implements allow/quarantine/escalate logic per advisory specification.
/// Rules are evaluated in order; the first match determines the result.
/// </summary>
public sealed class DeterminizationPolicy : IDeterminizationPolicy
{
    private readonly DeterminizationOptions _options;
    private readonly ILogger<DeterminizationPolicy> _logger;

    public DeterminizationResult Evaluate(DeterminizationContext ctx)
    {
        var snapshot = ctx.SignalSnapshot;
        var uncertainty = ctx.UncertaintyScore;
        var decay = ctx.Decay;
        var env = ctx.Environment;

        // Rule 1: Escalate if runtime evidence shows loaded
        if (snapshot.Runtime.HasValue &&
            snapshot.Runtime.Value!.ObservedLoaded)
        {
            return DeterminizationResult.Escalated(
                "Runtime evidence shows vulnerable code loaded",
                PolicyVerdictStatus.Escalated);
        }

        // Rule 2: Quarantine if EPSS >= threshold or proven reachable
        if (snapshot.Epss.HasValue &&
            snapshot.Epss.Value!.Score >= _options.EpssQuarantineThreshold)
        {
            return DeterminizationResult.Quarantined(
                $"EPSS score {snapshot.Epss.Value.Score:P1} meets or exceeds threshold {_options.EpssQuarantineThreshold:P1}",
                PolicyVerdictStatus.Blocked);
        }

        // Rule 2 (continued): proven reachable
        if (snapshot.Reachability.HasValue &&
            snapshot.Reachability.Value!.Status == ReachabilityStatus.Reachable)
        {
            return DeterminizationResult.Quarantined(
                "Vulnerable code is reachable via call graph",
                PolicyVerdictStatus.Blocked);
        }

        // Rule 3: Allow with guardrails if score < threshold AND entropy > threshold AND non-prod
        var trustScore = ctx.TrustScore;
        if (trustScore < _options.GuardedAllowScoreThreshold &&
            uncertainty.Entropy > _options.GuardedAllowEntropyThreshold &&
            env != DeploymentEnvironment.Production)
        {
            var guardrails = BuildGuardrails(ctx);
            return DeterminizationResult.GuardedAllow(
                $"Uncertain observation (entropy={uncertainty.Entropy:F2}) allowed with guardrails in {env}",
                PolicyVerdictStatus.GuardedPass,
                guardrails);
        }

        // Rule 4: Block in production with high entropy
        if (env == DeploymentEnvironment.Production &&
            uncertainty.Entropy > _options.ProductionBlockEntropyThreshold)
        {
            return DeterminizationResult.Quarantined(
                $"High uncertainty (entropy={uncertainty.Entropy:F2}) not allowed in production",
                PolicyVerdictStatus.Blocked);
        }

        // Rule 5: Defer if evidence is stale
        if (decay.IsStale)
        {
            return DeterminizationResult.Deferred(
                $"Evidence stale (last update: {decay.LastSignalUpdate:u}), requires refresh",
                PolicyVerdictStatus.Deferred);
        }

        // Default: Allow (sufficient evidence or acceptable risk)
        return DeterminizationResult.Allowed(
            "Evidence sufficient for determination",
            PolicyVerdictStatus.Pass);
    }

    private GuardRails BuildGuardrails(DeterminizationContext ctx) =>
        new GuardRails
        {
            EnableRuntimeMonitoring = true,
            ReviewInterval = TimeSpan.FromDays(_options.GuardedReviewIntervalDays),
            EpssEscalationThreshold = _options.EpssQuarantineThreshold,
            EscalatingReachabilityStates = ImmutableArray.Create("Reachable", "ObservedReachable"),
            MaxGuardedDuration = TimeSpan.FromDays(_options.MaxGuardedDurationDays),
            PolicyRationale = $"Auto-allowed with entropy={ctx.UncertaintyScore.Entropy:F2}, trust={ctx.TrustScore:F2}"
        };
}
```
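
Because the rules are checked top to bottom and the first match wins, their order matters as much as their thresholds. The sketch below replays that order over plain values so individual cases can be traced by hand. `Env`, `Inputs`, and `RuleOrderExample` are hypothetical stand-ins, the verdicts are returned as strings, and the constants are the defaults from `DeterminizationOptions`.

```csharp
using System;

// Hypothetical, flattened inputs for tracing rule order (not the library's types).
public enum Env { Development, Staging, Production }

public sealed record Inputs(
    bool RuntimeLoaded,
    double? Epss,
    bool Reachable,
    double TrustScore,
    double Entropy,
    bool IsStale,
    Env Environment);

public static class RuleOrderExample
{
    // Defaults taken from DeterminizationOptions.
    private const double EpssQuarantineThreshold = 0.4;
    private const double GuardedAllowScoreThreshold = 0.5;
    private const double GuardedAllowEntropyThreshold = 0.4;
    private const double ProductionBlockEntropyThreshold = 0.3;

    public static string Evaluate(Inputs i)
    {
        // Rule 1: runtime evidence shows the vulnerable code loaded.
        if (i.RuntimeLoaded) return "Escalated";

        // Rule 2: EPSS at or above threshold, or proven reachable.
        if (i.Epss is double epss && epss >= EpssQuarantineThreshold) return "Blocked";
        if (i.Reachable) return "Blocked";

        // Rule 3: low trust and high entropy outside production.
        if (i.TrustScore < GuardedAllowScoreThreshold &&
            i.Entropy > GuardedAllowEntropyThreshold &&
            i.Environment != Env.Production)
            return "GuardedPass";

        // Rule 4: high entropy in production.
        if (i.Environment == Env.Production && i.Entropy > ProductionBlockEntropyThreshold)
            return "Blocked";

        // Rule 5: stale evidence.
        if (i.IsStale) return "Deferred";

        return "Pass";
    }
}
```

Tracing `new Inputs(false, 0.1, false, 0.3, 0.6, true, Env.Staging)` gives `GuardedPass`: Rule 3 matches before the staleness check in Rule 5 is reached. The same inputs with `Env.Production` give `Blocked` through Rule 4. A staging observation with the same entropy but a trust score of 0.7 and fresh evidence matches none of the rules and falls through to `Pass`.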

### Environment Thresholds

```csharp
/// <summary>
/// Per-environment threshold configuration.
/// </summary>
public sealed record EnvironmentThresholds
{
    public DeploymentEnvironment Environment { get; init; }
    public double MinConfidenceForNotAffected { get; init; }
    public double MaxEntropyForAllow { get; init; }
    public double EpssBlockThreshold { get; init; }
    public bool RequireReachabilityForAllow { get; init; }
}

public static class DefaultEnvironmentThresholds
{
    public static EnvironmentThresholds Production => new()
    {
        Environment = DeploymentEnvironment.Production,
        MinConfidenceForNotAffected = 0.75,
        MaxEntropyForAllow = 0.3,
        EpssBlockThreshold = 0.3,
        RequireReachabilityForAllow = true
    };

    public static EnvironmentThresholds Staging => new()
    {
        Environment = DeploymentEnvironment.Staging,
        MinConfidenceForNotAffected = 0.60,
        MaxEntropyForAllow = 0.5,
        EpssBlockThreshold = 0.4,
        RequireReachabilityForAllow = true
    };

    public static EnvironmentThresholds Development => new()
    {
        Environment = DeploymentEnvironment.Development,
        MinConfidenceForNotAffected = 0.40,
        MaxEntropyForAllow = 0.7,
        EpssBlockThreshold = 0.6,
        RequireReachabilityForAllow = false
    };
}
```

## Integration Points

### Feedser Integration

Feedser attaches `SignalState<T>` to CVE observations:

```csharp
// In Feedser: EpssSignalAttacher
public async Task<SignalState<EpssEvidence>> AttachEpssAsync(string cveId, CancellationToken ct)
{
    try
    {
        var evidence = await _epssClient.GetScoreAsync(cveId, ct);
        return new SignalState<EpssEvidence>
        {
            Status = SignalQueryStatus.Queried,
            Value = evidence,
            QueriedAt = _timeProvider.GetUtcNow(),
            Source = "first.org"
        };
    }
    catch (EpssNotFoundException)
    {
        // Queried successfully, but no score is published for this CVE.
        return new SignalState<EpssEvidence>
        {
            Status = SignalQueryStatus.Queried,
            Value = null,
            QueriedAt = _timeProvider.GetUtcNow(),
            Source = "first.org"
        };
    }
    catch (Exception ex) when (ex is not OperationCanceledException)
    {
        // Let cancellation propagate; record every other error as a failed query.
        return new SignalState<EpssEvidence>
        {
            Status = SignalQueryStatus.Failed,
            Value = null,
            QueriedAt = _timeProvider.GetUtcNow(),
            FailureReason = ex.Message
        };
    }
}
```

### Policy Engine Gate

```csharp
// In Policy.Engine: DeterminizationGate
public sealed class DeterminizationGate : IPolicyGate
{
    private readonly IDeterminizationPolicy _policy;
    private readonly IUncertaintyScoreCalculator _uncertaintyCalculator;
    private readonly IDecayedConfidenceCalculator _decayCalculator;

    public async Task<GateResult> EvaluateAsync(PolicyEvaluationContext ctx, CancellationToken ct)
    {
        var snapshot = await BuildSignalSnapshotAsync(ctx, ct);
        var uncertainty = _uncertaintyCalculator.Calculate(snapshot);

        // NOTE: CapturedAt is the snapshot time, so age is measured from capture here.
        // To decay by evidence age, pass the time of the most recent signal update instead.
        var decay = _decayCalculator.Calculate(snapshot.CapturedAt, ctx.Options.DecayHalfLife);

        var determCtx = new DeterminizationContext
        {
            SignalSnapshot = snapshot,
            UncertaintyScore = uncertainty,
            Decay = decay,
            TrustScore = ctx.TrustScore,
            Environment = ctx.Environment
        };

        var result = _policy.Evaluate(determCtx);

        return new GateResult
        {
            Passed = result.Status is PolicyVerdictStatus.Pass or PolicyVerdictStatus.GuardedPass,
            Status = result.Status,
            Reason = result.Reason,
            GuardRails = result.GuardRails,
            Metadata = new Dictionary<string, object>
            {
                ["uncertainty_entropy"] = uncertainty.Entropy,
                ["uncertainty_tier"] = uncertainty.Tier.ToString(),
                ["decay_multiplier"] = decay.DecayedMultiplier,
                ["missing_signals"] = uncertainty.MissingSignals.Select(g => g.SignalName).ToArray()
            }
        };
    }
}
```
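
The gate hands `snapshot.CapturedAt` to the decay calculator, while `ObservationDecay.LastSignalUpdate` is documented as the last time any signal was updated. One possible way to derive that timestamp is to take the most recent `QueriedAt` across the snapshot's signals. The sketch below assumes `QueriedAt` is an acceptable proxy for "signal updated"; `SignalFreshness` and `LatestQueriedAt` are hypothetical names, not library API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (an assumption for illustration, not part of the library).
public static class SignalFreshness
{
    // Returns the most recent non-null timestamp, or the fallback when none is set.
    public static DateTimeOffset LatestQueriedAt(
        IEnumerable<DateTimeOffset?> queriedAt,
        DateTimeOffset fallback) =>
        queriedAt
            .OfType<DateTimeOffset>()   // drops null entries
            .DefaultIfEmpty(fallback)   // nothing queried yet: use the fallback
            .Max();
}
```

A call site might look like `SignalFreshness.LatestQueriedAt(new[] { snapshot.Epss.QueriedAt, snapshot.Vex.QueriedAt, snapshot.Reachability.QueriedAt }, snapshot.CapturedAt)`, with the result passed as `lastSignalUpdate`. Whether a re-query that returns unchanged data should reset the decay clock is a policy decision this sketch does not settle.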

### Graph Integration

CVE nodes in the Graph module carry `ObservationState` and `UncertaintyScore`:

```csharp
// Extended CVE node for Graph module
public sealed record CveObservationNode
{
    public required string CveId { get; init; }
    public required string SubjectPurl { get; init; }

    // VEX status (orthogonal to observation state)
    public required VexClaimStatus? VexStatus { get; init; }

    // Observation lifecycle state
    public required ObservationState ObservationState { get; init; }

    // Knowledge completeness
    public required UncertaintyScore Uncertainty { get; init; }

    // Evidence freshness
    public required ObservationDecay Decay { get; init; }

    // Trust score (from confidence aggregation)
    public required double TrustScore { get; init; }

    // Policy outcome
    public required PolicyVerdictStatus PolicyHint { get; init; }

    // Guardrails if GuardedPass
    public GuardRails? GuardRails { get; init; }
}
```

## Event-Driven Re-evaluation

When new signals arrive, the system re-evaluates affected observations:

```csharp
public sealed class SignalUpdateHandler : ISignalUpdateSubscription
{
    private readonly IObservationRepository _observations;
    private readonly IUncertaintyScoreCalculator _uncertaintyCalculator;
    private readonly IDeterminizationPolicy _policy;
    private readonly IEventPublisher _events;

    public async Task HandleAsync(SignalUpdatedEvent evt, CancellationToken ct)
    {
        // Find observations affected by this signal
        var affected = await _observations.FindByCveAndPurlAsync(evt.CveId, evt.Purl, ct);

        foreach (var obs in affected)
        {
            // Rebuild signal snapshot
            var snapshot = await BuildCurrentSnapshotAsync(obs, ct);

            // Recalculate uncertainty
            var uncertainty = _uncertaintyCalculator.Calculate(snapshot);

            // Re-evaluate policy
            var result = _policy.Evaluate(new DeterminizationContext
            {
                SignalSnapshot = snapshot,
                UncertaintyScore = uncertainty,
                // ... other context
            });

            // Transition state if needed
            var newState = DetermineNewState(obs.ObservationState, result, uncertainty);
            if (newState != obs.ObservationState)
            {
                await _observations.UpdateStateAsync(obs.Id, newState, ct);
                await _events.PublishAsync(new ObservationStateChangedEvent(
                    obs.Id, obs.ObservationState, newState, result.Reason), ct);
            }
        }
    }

    private ObservationState DetermineNewState(
        ObservationState current,
        DeterminizationResult result,
        UncertaintyScore uncertainty)
    {
        // Escalated verdicts always require a human.
        if (result.Status == PolicyVerdictStatus.Escalated)
            return ObservationState.ManualReviewRequired;

        // Comprehensive evidence settles the observation from any state.
        if (uncertainty.Tier == UncertaintyTier.VeryLow)
            return ObservationState.Determined;

        // Pending observations also settle at Low uncertainty.
        if (current == ObservationState.PendingDeterminization &&
            uncertainty.Tier <= UncertaintyTier.Low)
            return ObservationState.Determined;

        return current;
    }
}
```

## Configuration

```csharp
public sealed class DeterminizationOptions
{
    /// <summary>EPSS score that triggers quarantine (block). Default: 0.4</summary>
    public double EpssQuarantineThreshold { get; set; } = 0.4;

    /// <summary>Trust score threshold for guarded allow. Default: 0.5</summary>
    public double GuardedAllowScoreThreshold { get; set; } = 0.5;

    /// <summary>Entropy threshold for guarded allow. Default: 0.4</summary>
    public double GuardedAllowEntropyThreshold { get; set; } = 0.4;

    /// <summary>Entropy threshold for production block. Default: 0.3</summary>
    public double ProductionBlockEntropyThreshold { get; set; } = 0.3;

    /// <summary>Half-life for evidence decay in days. Default: 14</summary>
    public int DecayHalfLifeDays { get; set; } = 14;

    /// <summary>Minimum confidence floor after decay. Default: 0.35</summary>
    public double DecayFloor { get; set; } = 0.35;

    /// <summary>Review interval for guarded observations in days. Default: 7</summary>
    public int GuardedReviewIntervalDays { get; set; } = 7;

    /// <summary>Maximum time in guarded state in days. Default: 30</summary>
    public int MaxGuardedDurationDays { get; set; } = 30;

    /// <summary>Signal weights for uncertainty calculation.</summary>
    public SignalWeights SignalWeights { get; set; } = new();

    /// <summary>Per-environment threshold overrides.</summary>
    public Dictionary<string, EnvironmentThresholds> EnvironmentThresholds { get; set; } = new();
}
```
|
||
|
||
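
To make the interaction between `DecayHalfLifeDays` and `DecayFloor` concrete, here is a minimal sketch. It assumes a standard exponential half-life curve and assumes the floor clamps the multiplier itself; neither detail is confirmed by the options class. `DecaySketch` and `Multiplier` are hypothetical names.

```csharp
using System;

public static class DecaySketch
{
    // Assumed formula: multiplier = max(floor, 2^(-ageDays / halfLifeDays)).
    public static double Multiplier(double ageDays, int halfLifeDays = 14, double floor = 0.35)
    {
        if (halfLifeDays <= 0)
            throw new ArgumentOutOfRangeException(nameof(halfLifeDays));

        // Negative ages (clock skew) are treated as fresh evidence.
        double raw = Math.Pow(2.0, -Math.Max(0.0, ageDays) / halfLifeDays);
        return Math.Max(floor, raw);
    }
}
```

Under these assumptions, evidence that is exactly one half-life old (14 days by default) gets a multiplier of 0.5, and very old evidence stays at the configured floor instead of decaying to zero.
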
## Verdict Status Extension

Extended `PolicyVerdictStatus` enum:

```csharp
public enum PolicyVerdictStatus
{
    Pass = 0,           // Finding meets policy requirements
    GuardedPass = 1,    // NEW: Allow with runtime monitoring enabled
    Blocked = 2,        // Finding fails policy checks; must be remediated
    Ignored = 3,        // Finding deliberately ignored via exception
    Warned = 4,         // Finding passes but with warnings
    Deferred = 5,       // Decision deferred; needs additional evidence
    Escalated = 6,      // Decision escalated for human review
    RequiresVex = 7     // VEX statement required to make decision
}
```

## Metrics & Observability

```csharp
using System.Diagnostics.Metrics;

public static class DeterminizationMetrics
{
    // Meter name is illustrative; use the name the host registers with OpenTelemetry.
    private static readonly Meter Meter = new("StellaOps.Policy.Determinization");

    // Counters
    public static readonly Counter<int> ObservationsCreated =
        Meter.CreateCounter<int>("stellaops_determinization_observations_created_total");

    public static readonly Counter<int> StateTransitions =
        Meter.CreateCounter<int>("stellaops_determinization_state_transitions_total");

    public static readonly Counter<int> PolicyEvaluations =
        Meter.CreateCounter<int>("stellaops_determinization_policy_evaluations_total");

    // Histograms
    public static readonly Histogram<double> UncertaintyEntropy =
        Meter.CreateHistogram<double>("stellaops_determinization_uncertainty_entropy");

    public static readonly Histogram<double> DecayMultiplier =
        Meter.CreateHistogram<double>("stellaops_determinization_decay_multiplier");

    // Gauges (the callbacks are placeholders; replace them with the real count queries)
    public static readonly ObservableGauge<int> PendingObservations =
        Meter.CreateObservableGauge<int>("stellaops_determinization_pending_observations",
            () => 0 /* query count */);

    public static readonly ObservableGauge<int> StaleObservations =
        Meter.CreateObservableGauge<int>("stellaops_determinization_stale_observations",
            () => 0 /* query count */);
}
```

## Evidence-Weighted Score (EWS) Model

The EWS model extends the Determinization subsystem with a **6-dimension scoring
pipeline** that replaces ad-hoc signal weighting with a unified, pluggable, and
guardrail-enforced composite score.

### Dimensions

Each dimension maps a family of raw signals to a **normalised risk score 0–100**
(higher = riskier) and a **confidence 0.0–1.0**:

| Code | Dimension | Key signals | Score semantics |
|------|-----------|-------------|-----------------|
| RCH | Reachability | Call-graph tier R0–R4, runtime trace | Higher = more reachable |
| RTS | RuntimeSignals | Instrumentation coverage, invocation count, APM | Higher = more actively exercised |
| BKP | BackportEvidence | Vendor confirmation, binary-analysis confidence | Higher = no backport / low confidence |
| XPL | Exploitability | EPSS, KEV, exploit-kit availability, PoC age, CVSS | Higher = more exploitable |
| SRC | SourceConfidence | SBOM completeness, signatures, attestation count | **Inverted**: high confidence = low risk |
| MIT | MitigationStatus | VEX status, workarounds, network controls | Higher = less mitigated |

### Default Weights

```
RCH  0.25    XPL  0.20    RTS  0.15
BKP  0.15    SRC  0.15    MIT  0.10
─── Total: 1.00 ───
```

A **Legacy** preset preserves backward-compatible weights aligned with the
original `SignalWeights` record.

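
As a rough illustration of how the default weights could combine six dimension scores, the sketch below assumes the composite is a plain weighted sum of the 0–100 dimension scores, rounded to an integer. The combination rule, `EwsWeightedSumSketch`, and `Combine` are assumptions for illustration; the real calculator may weight by confidence or round differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EwsWeightedSumSketch
{
    // Default weights from the table above; they sum to 1.00.
    public static readonly IReadOnlyDictionary<string, double> DefaultWeights =
        new Dictionary<string, double>
        {
            ["RCH"] = 0.25, ["XPL"] = 0.20, ["RTS"] = 0.15,
            ["BKP"] = 0.15, ["SRC"] = 0.15, ["MIT"] = 0.10,
        };

    // Assumed rule: weighted sum of 0-100 dimension scores, clamped and rounded.
    // Throws KeyNotFoundException if a dimension score is missing.
    public static int Combine(IReadOnlyDictionary<string, int> scores)
    {
        double total = DefaultWeights.Sum(w => w.Value * scores[w.Key]);
        return (int)Math.Round(Math.Clamp(total, 0.0, 100.0), MidpointRounding.AwayFromZero);
    }
}
```

For example, scores of RCH 80, XPL 60, RTS 40, BKP 60, SRC 20, and MIT 70 give 20 + 12 + 6 + 9 + 3 + 7 = 57 under this rule.
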
### Guardrails

After weighted scoring, a `GuardrailsEngine` enforces hard caps and floors:

| Guardrail | Default | Trigger condition |
|-----------|---------|-------------------|
| `kev_floor` | 70 | `IsInKev == true`: floor the score |
| `backported_cap` | 20 | `BackportDetected && Confidence ≥ 0.8`: cap the score |
| `not_affected_cap` | 25 | `VexStatus == not_affected`: cap the score |
| `runtime_floor` | 30 | `RuntimeTraceConfirmed == true`: floor the score |
| `speculative_cap` | 60 | Overall confidence < `MinConfidenceThreshold` (0.3): cap the score |

Guardrails are applied in priority order (KEV first). The resulting
`EwsCompositeScore` records which guardrails fired and whether the score was
adjusted up or down.

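
The guardrail table can be sketched as a sequence of floors and caps. This is an illustration under assumptions, not the `GuardrailsEngine` itself: it applies the rules in table order and lets a later rule override an earlier one, whereas the text above only states that KEV comes first. `GuardrailInput` and `GuardrailSketch` are hypothetical types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical input carrying only the fields the guardrail table mentions.
public sealed record GuardrailInput(
    bool IsInKev, bool BackportDetected, double BackportConfidence,
    string VexStatus, bool RuntimeTraceConfirmed, double OverallConfidence);

public static class GuardrailSketch
{
    public static (int Score, List<string> Fired) Apply(int score, GuardrailInput s)
    {
        var fired = new List<string>();

        // A floor raises the score to at least `min` when its condition holds.
        void Floor(string name, int min, bool when)
        {
            if (when && score < min) { score = min; fired.Add(name); }
        }

        // A cap lowers the score to at most `max` when its condition holds.
        void Cap(string name, int max, bool when)
        {
            if (when && score > max) { score = max; fired.Add(name); }
        }

        // Assumed order: table order, with KEV first as stated in the text.
        Floor("kev_floor", 70, s.IsInKev);
        Cap("backported_cap", 20, s.BackportDetected && s.BackportConfidence >= 0.8);
        Cap("not_affected_cap", 25, s.VexStatus == "not_affected");
        Floor("runtime_floor", 30, s.RuntimeTraceConfirmed);
        Cap("speculative_cap", 60, s.OverallConfidence < 0.3);

        return (score, fired);
    }
}
```

A guardrail is recorded only when it actually changes the score, which mirrors the idea that the composite result lists the guardrails that fired. How conflicting guardrails are resolved (for example, a KEV-listed CVE with a `not_affected` VEX statement) is not specified here, so the override behaviour in this sketch should not be read as authoritative.
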
### Calculator API

```csharp
// Via DI
IEwsCalculator calculator = serviceProvider.GetRequiredService<IEwsCalculator>();

// Or standalone (use this line instead of the DI lookup above)
// IEwsCalculator calculator = EwsCalculator.CreateDefault();

var signal = new EwsSignalInput
{
    CveId = "CVE-2025-1234",
    ReachabilityTier = 3,          // R3
    EpssProbability = 0.42,
    IsInKev = false,
    VexStatus = "under_investigation",
    SbomCompleteness = 0.85,
};

EwsCompositeScore result = calculator.Calculate(signal);
// result.Score             → 0-100 composite
// result.BasisPoints       → 0-10000 (fine-grained)
// result.Confidence        → weighted confidence
// result.RiskTier          → Critical/High/Medium/Low/Negligible
// result.AppliedGuardrails → list of guardrail names that fired
// result.NeedsReview       → true when confidence < threshold
```

### Normalizer Interface

Each dimension is implemented as an `IEwsDimensionNormalizer`:

```csharp
public interface IEwsDimensionNormalizer
{
    EwsDimension Dimension { get; }
    int Normalize(EwsSignalInput signal);           // 0-100
    double GetConfidence(EwsSignalInput signal);    // 0.0-1.0
    string GetExplanation(EwsSignalInput signal, int score);
}
```

Normalizers are registered via DI as `IEnumerable<IEwsDimensionNormalizer>`.
Custom normalizers can be added by registering additional implementations.

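
A custom normalizer might look like the sketch below. The stand-in `EwsDimension`, `EwsSignalInput`, and interface declarations exist only so the example compiles on its own; the real types live in the EWS library and carry more members. `SimpleExploitabilityNormalizer`, its scoring rule, and its confidence values are hypothetical.

```csharp
using System;
using System.Globalization;

// Stand-in types (assumptions) so the sketch is self-contained.
public enum EwsDimension { Exploitability }

public sealed record EwsSignalInput
{
    public double? EpssProbability { get; init; }
    public bool IsInKev { get; init; }
}

public interface IEwsDimensionNormalizer
{
    EwsDimension Dimension { get; }
    int Normalize(EwsSignalInput signal);           // 0-100
    double GetConfidence(EwsSignalInput signal);    // 0.0-1.0
    string GetExplanation(EwsSignalInput signal, int score);
}

// Hypothetical rule: KEV membership is maximum risk, otherwise EPSS scaled to 0-100.
public sealed class SimpleExploitabilityNormalizer : IEwsDimensionNormalizer
{
    public EwsDimension Dimension => EwsDimension.Exploitability;

    public int Normalize(EwsSignalInput signal)
    {
        if (signal.IsInKev) return 100;
        // Missing EPSS falls back to an explicit prior of 0.0 (an assumption).
        double epss = Math.Clamp(signal.EpssProbability ?? 0.0, 0.0, 1.0);
        return (int)Math.Round(epss * 100.0);
    }

    // A missing signal lowers confidence instead of being guessed.
    public double GetConfidence(EwsSignalInput signal) =>
        signal.IsInKev || signal.EpssProbability.HasValue ? 0.9 : 0.2;

    public string GetExplanation(EwsSignalInput signal, int score)
    {
        if (signal.IsInKev) return $"Listed in KEV; score {score}";
        string epss = signal.EpssProbability?.ToString("F2", CultureInfo.InvariantCulture) ?? "missing";
        return $"EPSS {epss}; score {score}";
    }
}
```

With the standard Microsoft DI container, such a class would typically be registered next to the built-in normalizers with something like `services.AddSingleton<IEwsDimensionNormalizer, SimpleExploitabilityNormalizer>()`; the lifetime the project actually uses is not stated here.
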
### Observability

The calculator emits two OTel metrics:

- **`stellaops_ews_score`** (Histogram): score distribution 0–100
- **`stellaops_ews_guardrails_applied`** (Counter): number of guardrail applications

## Unknowns Decay Triage Queue

> **Sprint:** SPRINT_20260208_050_Policy_unknowns_decay_and_triage_queue

The triage queue automatically identifies unknowns whose evidence has decayed past staleness thresholds and queues them for re-analysis, closing the gap between passive `ObservationDecay.CheckIsStale()` tracking and active re-analysis triggering.

### Triage Priority Classification

| Priority | Decay Multiplier Range | Action |
|----------|----------------------|--------|
| **None** | > 0.70 | No action (fresh) |
| **Low** | 0.50 – 0.70 | Monitor (approaching staleness) |
| **Medium** | 0.30 – 0.50 | Schedule re-analysis (stale) |
| **High** | 0.10 – 0.30 | Re-analyse soon (heavily decayed) |
| **Critical** | ≤ 0.10 | Urgent (evidence at floor) |

Thresholds are configurable via `TriageQueueOptions` (section: `Determinization:TriageQueue`).

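
The classification table maps directly to a threshold ladder. The sketch below is illustrative only: `TriageClassifierSketch` and the `stale` parameter name are hypothetical, and it assumes that a multiplier exactly on a boundary falls into the more urgent tier, which the table does not spell out for 0.50 and 0.30.

```csharp
public enum TriagePriority { None, Low, Medium, High, Critical }

public static class TriageClassifierSketch
{
    // Defaults mirror the table above; boundary values go to the more urgent tier (assumption).
    public static TriagePriority Classify(
        double multiplier,
        double approaching = 0.70,
        double stale = 0.50,
        double high = 0.30,
        double critical = 0.10)
    {
        if (multiplier <= critical) return TriagePriority.Critical;
        if (multiplier <= high) return TriagePriority.High;
        if (multiplier <= stale) return TriagePriority.Medium;
        if (multiplier <= approaching) return TriagePriority.Low;
        return TriagePriority.None;
    }
}
```

Checking the most urgent tier first keeps the ladder deterministic: the same multiplier and thresholds always yield the same priority.
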
### Architecture

```
UnknownTriageQueueService (orchestrator)
├── ITriageObservationSource   → fetch candidates
├── ITriageQueueEvaluator      → classify priority, compute days-until-stale
└── ITriageReanalysisSink      → enqueue Medium+ items for re-analysis
```

- **`TriageQueueEvaluator`**: Deterministic evaluator. Given the same observations and reference time, it produces identical output. It calculates days-until-stale using the formula `d = -halfLife × ln(threshold) / ln(2) - currentAgeDays`.
- **`UnknownTriageQueueService`**: Orchestrates fetch→evaluate→enqueue cycles. Designed for periodic invocation by a background host, timer, or scheduler. Also supports on-demand evaluation (CLI/API) without auto-enqueue.
- **`InMemoryTriageReanalysisSink`**: Default `ConcurrentQueue<TriageItem>` implementation for single-node and offline scenarios. Hosts can replace it with a message-bus or database-backed sink.

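
The days-until-stale formula follows from solving `2^(-t / halfLife) = threshold` for the age `t` at which the multiplier reaches the threshold, then subtracting the current age. A minimal sketch of that arithmetic, with hypothetical names `StalenessSketch` and `DaysUntilStale`:

```csharp
using System;

public static class StalenessSketch
{
    // d = -halfLife × ln(threshold) / ln(2) - currentAgeDays
    // A negative result means the observation is already past the threshold.
    public static double DaysUntilStale(double halfLifeDays, double threshold, double currentAgeDays)
    {
        if (threshold <= 0.0 || threshold > 1.0)
            throw new ArgumentOutOfRangeException(nameof(threshold));

        return -halfLifeDays * Math.Log(threshold) / Math.Log(2.0) - currentAgeDays;
    }
}
```

With a 14-day half-life and a threshold of 0.5, the multiplier reaches the threshold at age 14, so a 10-day-old observation has 4 days left and a 20-day-old observation is 6 days overdue.
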
### OTel Metrics

- **`stellaops_triage_items_evaluated_total`** (Counter): observations evaluated per cycle
- **`stellaops_triage_items_queued_total`** (Counter): items added to triage queue
- **`stellaops_triage_decay_multiplier`** (Histogram): decay multiplier distribution
- **`stellaops_triage_cycles_total`** (Counter): evaluation cycles executed
- **`stellaops_triage_reanalysis_enqueued_total`** (Counter): items sent to re-analysis sink
- **`stellaops_triage_cycle_duration_seconds`** (Histogram): cycle duration

### Configuration

```yaml
Determinization:
  TriageQueue:
    ApproachingThreshold: 0.70       # Multiplier below which Low priority starts
    HighPriorityThreshold: 0.30      # Below this → High
    CriticalPriorityThreshold: 0.10  # Below this → Critical
    MaxSnapshotItems: 500            # Max items per snapshot
    IncludeApproaching: true         # Include Low priority in snapshots
    MinEvaluationIntervalMinutes: 60
```

## Testing Strategy

| Test Category | Focus Area | Example |
|---------------|------------|---------|
| Unit | Uncertainty calculation | Missing 2 signals = correct entropy |
| Unit | Decay calculation | 14 days = 50% multiplier |
| Unit | Policy rules | EPSS 0.5 + dev = guarded allow |
| Integration | Signal attachment | Feedser EPSS query → SignalState |
| Integration | State transitions | New VEX → PendingDeterminization → Determined |
| Determinism | Same input → same output | Canonical snapshot → reproducible entropy |
| Property | Entropy bounds | Always [0.0, 1.0] |
| Property | Decay monotonicity | Older → lower multiplier |

## Security Considerations

1. **No Guessing:** Missing signals use explicit priors, never random values
2. **Audit Trail:** Every state transition logged with evidence snapshot
3. **Conservative Defaults:** Production blocks high entropy; only non-prod allows guardrails
4. **Escalation Path:** Runtime evidence always escalates regardless of other signals
5. **Tamper Detection:** Signal snapshots hashed for integrity verification

## Versioned Weight Manifests

Weight manifests (Sprint 051) provide versioned, content-addressed configuration for
all scoring weights, guardrails, buckets, and determinization thresholds. Manifests
live in `etc/weights/` as JSON files with a `*.weights.json` extension.

### Manifest Schema (v1.0.0)

| Field | Type | Description |
| --- | --- | --- |
| `schemaVersion` | string | Must be `"1.0.0"` |
| `version` | string | Manifest version identifier (e.g. `"v2026-01-22"`) |
| `effectiveFrom` | ISO-8601 | UTC date from which this manifest is active |
| `profile` | string | Environment profile (`production`, `staging`, etc.) |
| `contentHash` | string | `sha256:<hex>` content hash or `sha256:auto` placeholder |
| `weights.legacy` | dict | 6-dimension EWS weights (must sum to 1.0) |
| `weights.advisory` | dict | Advisory-profile weights |
| `guardrails` | object | Guardrail rules (notAffectedCap, runtimeFloor, speculativeCap) |
| `buckets` | object | Action tier boundaries (actNowMin, scheduleNextMin, investigateMin) |
| `determinizationThresholds` | object | Entropy thresholds for triage |
| `signalWeightsForEntropy` | dict | Signal weights for uncertainty calculation (sum to 1.0) |
| `metadata` | object | Provenance: createdBy, createdAt, changelog, notes |

### Content Hash Computation

The `WeightManifestHashComputer` computes a deterministic SHA-256 hash over
canonical JSON (alphabetically sorted properties, `contentHash` field excluded):

```
Input JSON → parse → remove contentHash → sort keys recursively → UTF-8 → SHA-256 → "sha256:<hex>"
```

This enables tamper detection and content-addressed references. The `sha256:auto`
placeholder is replaced by `stella weights hash --write-back` or at build time.

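
The hashing pipeline can be sketched with `System.Text.Json.Nodes`. This is an approximation under assumptions, not the `WeightManifestHashComputer` itself: it removes only the top-level `contentHash`, sorts keys with ordinal comparison, and serialises with the default `ToJsonString()` escaping, so its digests will not necessarily match those of the real tool. `ManifestHashSketch` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json.Nodes;

public static class ManifestHashSketch
{
    public static string Compute(string json)
    {
        JsonNode root = JsonNode.Parse(json)
            ?? throw new ArgumentException("JSON document is null", nameof(json));

        // Assumption: only the top-level contentHash field is excluded.
        if (root is JsonObject top) top.Remove("contentHash");

        string canonical = Canonicalize(root)!.ToJsonString();
        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return "sha256:" + Convert.ToHexString(digest).ToLowerInvariant();
    }

    // Rebuilds the tree with object keys in ordinal order; array order is preserved.
    private static JsonNode? Canonicalize(JsonNode? node)
    {
        switch (node)
        {
            case null:
                return null;
            case JsonObject obj:
                return new JsonObject(obj
                    .OrderBy(p => p.Key, StringComparer.Ordinal)
                    .Select(p => new KeyValuePair<string, JsonNode?>(p.Key, Canonicalize(p.Value))));
            case JsonArray arr:
                return new JsonArray(arr.Select(n => Canonicalize(n)).ToArray());
            default:
                // Leaf values are re-parsed so the copy has no parent node.
                return JsonNode.Parse(node.ToJsonString());
        }
    }
}
```

Because keys are sorted and `contentHash` is dropped before hashing, two manifests that differ only in property order or in their stored hash produce the same digest, which is what makes the `sha256:auto` placeholder workable.
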
### CLI Commands (backing services)

| Command | Service Method | Description |
| --- | --- | --- |
| `stella weights list` | `WeightManifestCommands.ListAsync()` | List all manifests with version, profile, hash status |
| `stella weights validate` | `WeightManifestCommands.ValidateAsync()` | Validate schema, weight normalization, hash integrity |
| `stella weights diff` | `WeightManifestCommands.DiffAsync()` | Compare two manifests field-by-field |
| `stella weights activate` | `WeightManifestCommands.ActivateAsync()` | Select effective manifest for a reference date |
| `stella weights hash` | `WeightManifestCommands.HashAsync()` | Compute/verify content hash, optionally write back |

### EffectiveFrom Selection

`WeightManifestLoader.SelectEffectiveAsync(referenceDate)` picks the most recent
manifest where `effectiveFrom ≤ referenceDate`, enabling time-travel replay:

```
Manifests:  v2026-01-01  v2026-01-22  v2026-03-01
Reference:  2026-02-15
Selected:   v2026-01-22  (most recent ≤ reference date)
```

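
The selection rule itself is small enough to sketch with LINQ. `ManifestStub` and `EffectiveSelectionSketch` are hypothetical stand-ins; the real loader is asynchronous and works on full manifest documents.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in carrying only the fields the rule needs.
public sealed record ManifestStub(string Version, DateTimeOffset EffectiveFrom);

public static class EffectiveSelectionSketch
{
    // Returns the manifest with the latest effectiveFrom that is not after the
    // reference date, or null when no manifest is effective yet.
    public static ManifestStub? SelectEffective(
        IEnumerable<ManifestStub> manifests, DateTimeOffset referenceDate) =>
        manifests
            .Where(m => m.EffectiveFrom <= referenceDate)
            .OrderByDescending(m => m.EffectiveFrom)
            .FirstOrDefault();
}
```

Assuming each manifest in the example above becomes effective on the date in its version label, a reference date of 2026-02-15 selects `v2026-01-22`, and any reference date before 2026-01-01 selects nothing.
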
### OTel Metrics

| Metric | Type | Description |
| --- | --- | --- |
| `stellaops.weight_manifest.loaded_total` | Counter | Manifests loaded from disk |
| `stellaops.weight_manifest.validated_total` | Counter | Manifests validated |
| `stellaops.weight_manifest.hash_mismatch_total` | Counter | Content hash mismatches |
| `stellaops.weight_manifest.validation_error_total` | Counter | Validation errors |

### DI Registration

```csharp
services.AddDeterminization(); // Registers WeightManifestLoaderOptions,
                               // IWeightManifestLoader → WeightManifestLoader,
                               // WeightManifestCommands
```

### YAML Configuration

```yaml
Determinization:
  WeightManifest:
    ManifestDirectory: "etc/weights"
    FilePattern: "*.weights.json"
    RequireComputedHash: true       # Reject sha256:auto in production
    StrictHashVerification: true    # Fail on hash mismatch
```

## References

- Product Advisory: "Unknown CVEs: graceful placeholders, not blockers"
- Existing: `src/Policy/__Libraries/StellaOps.Policy.Unknowns/`
- Existing: `src/Policy/__Libraries/StellaOps.Policy/Confidence/`
- Existing: `src/Excititor/__Libraries/StellaOps.Excititor.Core/TrustVector/`
- OpenVEX Specification: https://openvex.dev/
- EPSS Model: https://www.first.org/epss/