save progress
469
docs/flows/18-risk-score-dashboard-flow.md
Normal file
@@ -0,0 +1,469 @@
# Risk Score Dashboard Flow

## Overview

The Risk Score Dashboard Flow describes how StellaOps computes, aggregates, and displays risk scores across multiple dimensions including images, teams, applications, and the entire organization. This flow enables data-driven security prioritization and trend analysis.

**Business Value**: Quantified risk visibility enables resource prioritization, executive reporting, and measurable security improvement over time.

## Actors

| Actor | Type | Role |
|-------|------|------|
| Security Leader | Human | Reviews risk posture |
| RiskEngine | Service | Computes risk scores |
| Platform | Service | Aggregates across tenants |
| Scanner | Service | Provides vulnerability data |
| Policy | Service | Provides compliance data |
| Console | System | Displays dashboards |

## Prerequisites

- Images scanned with findings
- Risk scoring model configured
- Historical data available for trends
- Aggregation permissions configured

## Risk Dimensions

| Dimension | Scope | Aggregation |
|-----------|-------|-------------|
| Image | Single container | Direct score |
| Application | Group of images | Weighted average |
| Team | All team assets | Sum/average |
| Environment | prod/staging/dev | Environment-weighted |
| Organization | All tenants | Executive rollup |

## Risk Factors

| Factor | Weight | Source |
|--------|--------|--------|
| CVSS Score | 25% | Advisory data |
| Exploitability | 20% | KEV, EPSS |
| Reachability | 20% | K4 lattice state |
| Exposure | 15% | Network exposure |
| Asset Criticality | 10% | Business metadata |
| Time to Remediate | 10% | Age of finding |

## Flow Diagram

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                          Risk Score Dashboard Flow                          │
└─────────────────────────────────────────────────────────────────────────────┘

┌──────────┐  ┌─────────┐  ┌────────────┐  ┌─────────┐  ┌────────┐  ┌─────────┐
│ Security │  │ Console │  │  Platform  │  │  Risk   │  │ Scanner│  │ Policy  │
│ Leader   │  │         │  │  Service   │  │ Engine  │  │        │  │         │
└────┬─────┘  └────┬────┘  └─────┬──────┘  └────┬────┘  └───┬────┘  └────┬────┘
     │             │             │              │           │            │
     │ View risk   │             │              │           │            │
     │ dashboard   │             │              │           │            │
     │────────────>│             │              │           │            │
     │             │             │              │           │            │
     │             │ GET /risk/  │              │           │            │
     │             │ summary     │              │           │            │
     │             │────────────>│              │           │            │
     │             │             │              │           │            │
     │             │             │ Get findings │           │            │
     │             │             │──────────────────────────>            │
     │             │             │              │           │            │
     │             │             │ Vuln data    │           │            │
     │             │             │<──────────────────────────            │
     │             │             │              │           │            │
     │             │             │ Get verdicts │           │            │
     │             │             │───────────────────────────────────────>
     │             │             │              │           │            │
     │             │             │ Policy data  │           │            │
     │             │             │<───────────────────────────────────────
     │             │             │              │           │            │
     │             │             │ Compute      │           │            │
     │             │             │ scores       │           │            │
     │             │             │─────────────>│           │            │
     │             │             │              │           │            │
     │             │             │              │ Calculate │            │
     │             │             │              │ per-image │            │
     │             │             │              │───┐       │            │
     │             │             │              │   │       │            │
     │             │             │              │<──┘       │            │
     │             │             │              │           │            │
     │             │             │              │ Aggregate │            │
     │             │             │              │ by team   │            │
     │             │             │              │───┐       │            │
     │             │             │              │   │       │            │
     │             │             │              │<──┘       │            │
     │             │             │              │           │            │
     │             │             │ Risk scores  │           │            │
     │             │             │<─────────────│           │            │
     │             │             │              │           │            │
     │             │ Dashboard   │              │           │            │
     │             │ data        │              │           │            │
     │             │<────────────│              │           │            │
     │             │             │              │           │            │
     │ Render      │             │              │           │            │
     │ dashboard   │             │              │           │            │
     │<────────────│             │              │           │            │
     │             │             │              │           │            │
```

## Step-by-Step

### 1. Dashboard Request

The Security Leader opens the risk dashboard, and the Console requests the summary with the scope, period, and breakdown passed as query parameters:

```http
GET /api/v1/risk/summary?scope=organization&period=30d&breakdown=team,severity,trend HTTP/1.1
Authorization: Bearer {jwt}
X-Tenant-Id: acme-corp
```
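
The request can be assembled from typed parameters rather than by string concatenation at each call site. The following is a minimal sketch, not the Console's actual client code: `SummaryQuery`, `BreakdownKind`, and `buildSummaryPath` are hypothetical names, and joining the breakdown values with literal commas is an assumption taken from the example request.

```typescript
// Hypothetical helper; names are illustrative and not part of the StellaOps API.
type BreakdownKind = "team" | "severity" | "environment" | "trend";

interface SummaryQuery {
  scope: "image" | "application" | "team" | "environment" | "organization";
  period?: string;
  breakdown?: BreakdownKind[];
}

function buildSummaryPath(query: SummaryQuery): string {
  const pairs: string[] = ["scope=" + encodeURIComponent(query.scope)];
  if (query.period !== undefined) {
    pairs.push("period=" + encodeURIComponent(query.period));
  }
  if (query.breakdown !== undefined && query.breakdown.length > 0) {
    // Assumption: values are joined with literal commas, as in the example request.
    const encoded = query.breakdown.map((kind) => encodeURIComponent(kind));
    pairs.push("breakdown=" + encoded.join(","));
  }
  return "/api/v1/risk/summary?" + pairs.join("&");
}

const summaryPath = buildSummaryPath({
  scope: "organization",
  period: "30d",
  breakdown: ["team", "severity", "trend"],
});
```

The `Authorization` and `X-Tenant-Id` headers are still supplied by whatever HTTP client sends the request.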

### 2. Data Collection

Platform Service collects data from multiple sources:

#### Vulnerability Data (Scanner)
```json
{
  "vulnerability_summary": {
    "total_findings": 1847,
    "by_severity": {
      "critical": 23,
      "high": 189,
      "medium": 567,
      "low": 1068
    },
    "by_status": {
      "new": 145,
      "existing": 1502,
      "fixed": 200
    },
    "unique_cves": 423,
    "affected_images": 89,
    "affected_packages": 234
  }
}
```

#### Policy Data (Policy Engine)
```json
{
  "policy_summary": {
    "total_evaluations": 892,
    "by_verdict": {
      "pass": 743,
      "warn": 98,
      "fail": 51
    },
    "compliance_rate": 0.83,
    "by_policy_set": {
      "production": {"pass": 234, "fail": 12},
      "pci-dss": {"pass": 198, "fail": 8},
      "default": {"pass": 311, "fail": 31}
    }
  }
}
```
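
The `compliance_rate` in this payload matches the share of `pass` verdicts among all evaluations (743 / 892 ≈ 0.83), so `warn` verdicts do not appear to count as compliant. A small sketch of that arithmetic follows; `VerdictCounts` and `complianceRate` are hypothetical names, and returning 0 for an empty set is an assumption.

```typescript
// Hypothetical helper reproducing the compliance_rate shown in the example payload.
interface VerdictCounts {
  pass: number;
  warn: number;
  fail: number;
}

function complianceRate(verdicts: VerdictCounts): number {
  const total = verdicts.pass + verdicts.warn + verdicts.fail;
  if (total === 0) {
    return 0; // Assumption: no evaluations is reported as 0 rather than as an error.
  }
  return verdicts.pass / total;
}

const exampleRate = complianceRate({ pass: 743, warn: 98, fail: 51 });
// Rounded to two decimals for display.
const roundedRate = Math.round(exampleRate * 100) / 100;
```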

#### Reachability Data (ReachGraph)
```json
{
  "reachability_summary": {
    "total_analyzed": 1847,
    "by_state": {
      "ConfirmedReachable": 45,
      "StaticallyReachable": 234,
      "Unknown": 890,
      "StaticallyUnreachable": 456,
      "ConfirmedUnreachable": 222
    }
  }
}
```

### 3. Risk Calculation

RiskEngine computes risk scores using the model:

#### Per-Finding Risk Score
```
Finding Risk = Σ(factor_weight × factor_score)

For CVE-2024-1234:
- CVSS Score:        0.25 × (9.8/10)   = 0.245
- Exploitability:    0.20 × 0.95 (KEV) = 0.190
- Reachability:      0.20 × 0.80 (SR)  = 0.160
- Exposure:          0.15 × 0.70 (ext) = 0.105
- Criticality:       0.10 × 0.90       = 0.090
- Time to Remediate: 0.10 × 0.50 (30d) = 0.050

Finding Risk Score: 0.84 (High)
```
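
The weighted sum above can be sketched in a few lines. This is an illustration of the formula only, not RiskEngine's implementation; `FactorName`, `FACTOR_WEIGHTS`, and `findingRisk` are hypothetical names, and the weights are copied from the Risk Factors table.

```typescript
// Hypothetical names; the weights mirror the Risk Factors table.
type FactorName =
  | "cvss"
  | "exploitability"
  | "reachability"
  | "exposure"
  | "criticality"
  | "remediationAge";

type FactorValues = Record<FactorName, number>;

const FACTOR_WEIGHTS: FactorValues = {
  cvss: 0.25,
  exploitability: 0.2,
  reachability: 0.2,
  exposure: 0.15,
  criticality: 0.1,
  remediationAge: 0.1,
};

// Finding Risk = Σ(factor_weight × factor_score), with every factor score in [0, 1].
function findingRisk(scores: FactorValues, weights: FactorValues = FACTOR_WEIGHTS): number {
  const factorNames = Object.keys(weights) as FactorName[];
  return factorNames.reduce((total, factor) => total + weights[factor] * scores[factor], 0);
}

// Factor scores from the CVE-2024-1234 example (CVSS 9.8 normalized to 0.98).
const exampleFindingRisk = findingRisk({
  cvss: 0.98,
  exploitability: 0.95,
  reachability: 0.8,
  exposure: 0.7,
  criticality: 0.9,
  remediationAge: 0.5,
});
```

Because the weights sum to 1.0, the result stays in the 0 to 1 range as long as each factor score does.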

#### Per-Image Risk Score
```
Image Risk = max(finding_risks) + 0.1 × avg(other_findings)

Image: docker.io/myorg/api:v1.2.3
- Highest finding risk: 0.84
- Average other risks: 0.45
- Image Risk Score: 0.84 + 0.1 × 0.45 = 0.885
```
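
A minimal sketch of this max-plus-average rule, assuming "other findings" means every finding except the single highest one. `imageRisk` is a hypothetical name, and returning 0 for an image without findings is an assumption. The formula as written does not clamp the result, so the sketch does not either; a very risky image can score above 1.0.

```typescript
// Hypothetical helper: Image Risk = max(finding_risks) + 0.1 × avg(other_findings).
function imageRisk(findingRisks: number[]): number {
  if (findingRisks.length === 0) {
    return 0; // Assumption: an image with no findings carries no risk.
  }
  const highest = findingRisks.reduce((max, risk) => Math.max(max, risk), -Infinity);
  const sum = findingRisks.reduce((total, risk) => total + risk, 0);
  const otherCount = findingRisks.length - 1;
  // Average of all findings except one occurrence of the highest.
  const averageOthers = otherCount === 0 ? 0 : (sum - highest) / otherCount;
  return highest + 0.1 * averageOthers;
}

// Matches the worked example: highest 0.84, other findings averaging 0.45.
const exampleImageRisk = imageRisk([0.84, 0.45, 0.45]);
```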

#### Aggregated Risk Score
```
Team Risk = Σ(image_risk × image_weight) / Σ(image_weight)

Team: Platform Engineering
- 12 images, weighted by deployment frequency
- Team Risk Score: 0.67
```
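
The team rollup is a standard weighted average. The sketch below assumes each image carries a non-negative weight (for example, a deployment-frequency count); `WeightedImage` and `teamRisk` are hypothetical names, and the input values are invented purely to show the arithmetic.

```typescript
// Hypothetical helper: Team Risk = Σ(image_risk × image_weight) / Σ(image_weight).
interface WeightedImage {
  risk: number;
  weight: number;
}

function teamRisk(images: WeightedImage[]): number {
  const weightTotal = images.reduce((total, image) => total + image.weight, 0);
  if (weightTotal === 0) {
    return 0; // Assumption: a team with no weighted images reports 0.
  }
  const weightedSum = images.reduce((total, image) => total + image.risk * image.weight, 0);
  return weightedSum / weightTotal;
}

// Illustrative values only: a frequently deployed risky image dominates the result.
const exampleTeamRisk = teamRisk([
  { risk: 0.9, weight: 3 },
  { risk: 0.5, weight: 1 },
]);
```

With equal weights the function reduces to a plain mean, which is a quick sanity check for any weighting scheme.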

### 4. Dashboard Response

Platform returns aggregated risk data:

```json
{
  "risk_summary": {
    "organization": {
      "score": 0.58,
      "grade": "C",
      "trend": -0.05,
      "trend_direction": "improving"
    },
    "breakdown": {
      "by_team": [
        {
          "team": "Platform Engineering",
          "score": 0.45,
          "grade": "C",
          "image_count": 12,
          "critical_findings": 2
        },
        {
          "team": "Product Development",
          "score": 0.72,
          "grade": "D",
          "image_count": 34,
          "critical_findings": 18
        }
      ],
      "by_severity": {
        "critical": {"count": 23, "risk_contribution": 0.35},
        "high": {"count": 189, "risk_contribution": 0.40},
        "medium": {"count": 567, "risk_contribution": 0.20},
        "low": {"count": 1068, "risk_contribution": 0.05}
      },
      "by_environment": {
        "production": {"score": 0.62, "image_count": 45},
        "staging": {"score": 0.55, "image_count": 23},
        "development": {"score": 0.48, "image_count": 67}
      }
    },
    "trends": {
      "period": "30d",
      "scores": [
        {"date": "2024-11-29", "score": 0.63},
        {"date": "2024-12-06", "score": 0.61},
        {"date": "2024-12-13", "score": 0.59},
        {"date": "2024-12-20", "score": 0.57},
        {"date": "2024-12-29", "score": 0.58}
      ],
      "change_30d": -0.05,
      "change_7d": 0.01
    },
    "top_risks": [
      {
        "cve": "CVE-2024-1234",
        "risk_score": 0.92,
        "affected_images": 8,
        "teams": ["Product Development"],
        "remediation": "Upgrade lodash to 4.17.21"
      }
    ],
    "recommendations": [
      {
        "priority": 1,
        "action": "Remediate CVE-2024-1234 in Product Development",
        "impact": "Reduces org risk by 0.08 points"
      }
    ]
  }
}
```
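
The `change_30d` value and the `trend_direction` label in this response can be derived from the `scores` series. The sketch below is one plausible derivation, not the Platform's actual logic: `TrendPoint`, `periodChange`, and `trendDirection` are hypothetical names, and the 0.01 tolerance that separates `stable` from a real change is an assumption. A falling score means less risk, so a negative change maps to `improving`.

```typescript
// Hypothetical helpers for deriving trend fields from a score series.
interface TrendPoint {
  date: string; // ISO date (YYYY-MM-DD), so string comparison orders chronologically
  score: number;
}

type TrendDirection = "improving" | "stable" | "degrading";

// Change between the earliest and latest sample, rounded to two decimals.
function periodChange(points: TrendPoint[]): number {
  if (points.length < 2) {
    return 0;
  }
  const earliest = points.reduce((a, b) => (b.date < a.date ? b : a));
  const latest = points.reduce((a, b) => (b.date > a.date ? b : a));
  return Math.round((latest.score - earliest.score) * 100) / 100;
}

// Assumption: changes within ±tolerance are reported as "stable".
function trendDirection(change: number, tolerance: number = 0.01): TrendDirection {
  if (change < -tolerance) {
    return "improving";
  }
  if (change > tolerance) {
    return "degrading";
  }
  return "stable";
}

const exampleChange = periodChange([
  { date: "2024-11-29", score: 0.63 },
  { date: "2024-12-06", score: 0.61 },
  { date: "2024-12-13", score: 0.59 },
  { date: "2024-12-20", score: 0.57 },
  { date: "2024-12-29", score: 0.58 },
]);
const exampleDirection = trendDirection(exampleChange);
```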

### 5. Dashboard Rendering

Console displays risk visualization:

```
┌─────────────────────────────────────────────────────────────────┐
│                   Organization Risk Dashboard                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│ Overall Risk Score: 58/100 (Grade: C) ↓ 5% from last month      │
│  ███████████████████████████████████░░░░░░░░░░░░░░░░░░░░░░░░░   │
│                                                                 │
│ ┌─ Risk by Team ───────────────────────────────────────────────┐│
│ │ Platform Engineering  █████████░░░░░░░░░░░  45 (C) ↓3%       ││
│ │ Product Development   ██████████████░░░░░░  72 (D) ↑2%       ││
│ │ Security              ██████░░░░░░░░░░░░░░  28 (B) ↓1%       ││
│ └──────────────────────────────────────────────────────────────┘│
│                                                                 │
│ ┌─ 30-Day Trend ───────────────────────────────────────────────┐│
│ │     ╭────────────────────────────────────────────╮           ││
│ │ 63  │ ╲                                          │           ││
│ │ 61  │        ╲                                   │           ││
│ │ 59  │               ╲                            │           ││
│ │ 57  │                      ╲___________/         │           ││
│ │     ╰────────────────────────────────────────────╯           ││
│ │      Nov 29        Dec 13        Dec 20        Dec 29        ││
│ └──────────────────────────────────────────────────────────────┘│
│                                                                 │
│ ┌─ Top Actions ────────────────────────────────────────────────┐│
│ │ 1. Remediate CVE-2024-1234 (-0.08 risk)                      ││
│ │ 2. Update base images in Product team (-0.05 risk)           ││
│ │ 3. Enable runtime monitoring for api-service (-0.03 risk)    ││
│ └──────────────────────────────────────────────────────────────┘│
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

## Risk Grading Scale

| Score Range | Grade | Description |
|-------------|-------|-------------|
| 0-20 | A | Excellent - minimal risk |
| 21-40 | B | Good - well managed |
| 41-60 | C | Fair - needs attention |
| 61-80 | D | Poor - significant risk |
| 81-100 | F | Critical - immediate action |

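
The grading scale maps directly onto a threshold lookup. The sketch below assumes scores are on the 0-100 scale used in the table and that each band's upper bound is inclusive, so a fractional score such as 20.5 falls into the next band; `RiskGrade` and `gradeFor` are hypothetical names.

```typescript
// Hypothetical helper mapping a 0-100 risk score to the letter grades in the table.
type RiskGrade = "A" | "B" | "C" | "D" | "F";

function gradeFor(score: number): RiskGrade {
  if (isNaN(score) || score < 0 || score > 100) {
    throw new RangeError("risk score must be between 0 and 100");
  }
  // Assumption: upper bounds are inclusive (20 is an A, anything above 20 is at least a B).
  if (score <= 20) return "A";
  if (score <= 40) return "B";
  if (score <= 60) return "C";
  if (score <= 80) return "D";
  return "F";
}

const organizationGrade = gradeFor(58);
```

Scores reported on a 0-1 scale, as in the JSON examples, would need to be multiplied by 100 before the lookup.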

## Data Contracts

### Risk Summary Request Schema

```typescript
interface RiskSummaryRequest {
  scope: 'image' | 'application' | 'team' | 'environment' | 'organization';
  scope_id?: string;      // Required for image/application/team
  period?: string;        // Duration such as "30d" (the default); not ISO-8601, which would be "P30D"
  breakdown?: Array<'team' | 'severity' | 'environment' | 'trend'>;
  compare_to?: string;    // Previous period for comparison
}
```
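
The `scope_id` rule in the request schema cannot be expressed by the optional field alone, so it needs a runtime check. A minimal sketch follows; `RiskScope`, `ScopedRequest`, and `validateScope` are hypothetical names, and returning a list of error strings is just one possible convention.

```typescript
// Hypothetical validator for the "scope_id is required for image/application/team" rule.
type RiskScope = "image" | "application" | "team" | "environment" | "organization";

interface ScopedRequest {
  scope: RiskScope;
  scope_id?: string;
}

function validateScope(request: ScopedRequest): string[] {
  const errors: string[] = [];
  const needsId =
    request.scope === "image" || request.scope === "application" || request.scope === "team";
  const hasId = request.scope_id !== undefined && request.scope_id.trim() !== "";
  if (needsId && !hasId) {
    errors.push('scope_id is required when scope is "' + request.scope + '"');
  }
  return errors;
}

const teamWithoutId = validateScope({ scope: "team" });
const organizationWide = validateScope({ scope: "organization" });
```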

### Risk Summary Response Schema

```typescript
interface RiskSummaryResponse {
  risk_summary: {
    scope: string;
    score: number;        // 0-100
    grade: 'A' | 'B' | 'C' | 'D' | 'F';
    trend: number;
    trend_direction: 'improving' | 'stable' | 'degrading';
    breakdown?: {
      by_team?: TeamRisk[];
      by_severity?: SeverityBreakdown;
      by_environment?: EnvironmentRisk[];
    };
    trends?: {
      period: string;
      scores: Array<{date: string; score: number}>;
      change_30d: number;
      change_7d: number;
    };
    top_risks?: TopRisk[];
    recommendations?: Recommendation[];
  };
  metadata: {
    calculated_at: string;
    data_freshness: string;
    model_version: string;
  };
}
```

## Risk Model Configuration

```yaml
risk_model:
  version: "1.0.0"
  factors:
    cvss_score:
      weight: 0.25
      normalization: linear  # score/10
    exploitability:
      weight: 0.20
      components:
        kev: 0.50   # In KEV = 1.0
        epss: 0.30  # EPSS percentile
        poc: 0.20   # Public PoC exists
    reachability:
      weight: 0.20
      mapping:
        ConfirmedReachable: 1.0
        RuntimeObserved: 0.9
        StaticallyReachable: 0.7
        Unknown: 0.5
        StaticallyUnreachable: 0.2
        ConfirmedUnreachable: 0.1
    exposure:
      weight: 0.15
      mapping:
        internet_facing: 1.0
        internal_network: 0.6
        isolated: 0.2
    asset_criticality:
      weight: 0.10
      source: business_metadata
    remediation_age:
      weight: 0.10
      decay_days: 90  # Linear decay over 90 days

  aggregation:
    image: max_plus_average
    team: weighted_average
    organization: weighted_sum
```
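
The configuration describes how raw signals become factor scores before the weighted sum is applied. The sketch below shows two of those steps under stated assumptions: that the exploitability components combine as a weighted sum of KEV membership, EPSS percentile, and PoC availability, and that reachability is a direct lookup in the mapping. `ExploitSignals`, `exploitabilityScore`, and `REACHABILITY_SCORE` are hypothetical names; only the numeric values come from the YAML.

```typescript
// Hypothetical names; numeric values are copied from the risk_model configuration.
type ReachabilityState =
  | "ConfirmedReachable"
  | "RuntimeObserved"
  | "StaticallyReachable"
  | "Unknown"
  | "StaticallyUnreachable"
  | "ConfirmedUnreachable";

const REACHABILITY_SCORE: Record<ReachabilityState, number> = {
  ConfirmedReachable: 1.0,
  RuntimeObserved: 0.9,
  StaticallyReachable: 0.7,
  Unknown: 0.5,
  StaticallyUnreachable: 0.2,
  ConfirmedUnreachable: 0.1,
};

interface ExploitSignals {
  inKev: boolean;         // listed in the KEV catalog
  epssPercentile: number; // 0 to 1
  publicPoc: boolean;     // a public proof of concept exists
}

// Assumption: components combine as a weighted sum (kev 0.50, epss 0.30, poc 0.20).
function exploitabilityScore(signals: ExploitSignals): number {
  const kevPart = 0.5 * (signals.inKev ? 1 : 0);
  const epssPart = 0.3 * signals.epssPercentile;
  const pocPart = 0.2 * (signals.publicPoc ? 1 : 0);
  return kevPart + epssPart + pocPart;
}

const exampleExploitability = exploitabilityScore({
  inKev: true,
  epssPercentile: 0.9,
  publicPoc: true,
});
const unknownReachability = REACHABILITY_SCORE["Unknown"];
```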

## Error Handling

| Error | Recovery |
|-------|----------|
| Data stale | Show warning, use cached data |
| Partial data | Calculate with available, note gaps |
| Model error | Fall back to simplified model |
| Aggregation timeout | Return partial results |

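
The "Partial data" recovery can be illustrated with a score that uses only the factors that are present and reports the ones that are missing. This is a sketch of one possible approach, not the documented behavior: renormalizing the remaining weights so they sum to 1 is an assumption, and `riskWithGaps` and `PartialRiskResult` are hypothetical names.

```typescript
// Hypothetical sketch: score with available factors and report the gaps.
interface PartialRiskResult {
  score: number;
  missingFactors: string[];
}

function riskWithGaps(
  weights: Record<string, number>,
  scores: Record<string, number | undefined>
): PartialRiskResult {
  let weightedSum = 0;
  let availableWeight = 0;
  const missingFactors: string[] = [];
  for (const factor of Object.keys(weights)) {
    const value = scores[factor];
    const weight = weights[factor] ?? 0;
    if (value === undefined) {
      missingFactors.push(factor); // noted as a gap for the caller to surface
      continue;
    }
    weightedSum += weight * value;
    availableWeight += weight;
  }
  // Assumption: remaining weights are renormalized so the score stays comparable.
  const score = availableWeight === 0 ? 0 : weightedSum / availableWeight;
  return { score, missingFactors };
}

const partialResult = riskWithGaps({ cvss: 0.5, reachability: 0.5 }, { cvss: 0.8 });
```

Without renormalization a missing factor would silently lower the score, which could make an unanalyzed image look safer than an analyzed one.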

## Observability

### Metrics

| Metric | Type | Labels |
|--------|------|--------|
| `risk_score_current` | Gauge | `scope`, `scope_id` |
| `risk_calculation_duration_ms` | Histogram | `scope` |
| `risk_grade_distribution` | Gauge | `grade` |
| `risk_trend_change_30d` | Gauge | `scope`, `scope_id` |

### Key Log Events

| Event | Level | Fields |
|-------|-------|--------|
| `risk.calculated` | INFO | `scope`, `score`, `grade` |
| `risk.trend_alert` | WARN | `scope`, `change`, `direction` |
| `risk.threshold_exceeded` | WARN | `scope`, `threshold`, `score` |

## Related Flows

- [Dashboard Data Flow](01-dashboard-data-flow.md) - Dashboard patterns
- [Policy Evaluation Flow](04-policy-evaluation-flow.md) - Compliance data
- [Advisory Drift Re-scan Flow](11-advisory-drift-rescan-flow.md) - Risk updates