save progress

2026-01-03 00:47:24 +02:00
parent 3f197814c5
commit ca578801fd
319 changed files with 32478 additions and 2202 deletions
--- a/docs/flows/18-risk-score-dashboard-flow.md
+++ b/docs/flows/18-risk-score-dashboard-flow.md
@@ -0,0 +1,469 @@
+# Risk Score Dashboard Flow
+
+## Overview
+
+The Risk Score Dashboard Flow describes how StellaOps computes, aggregates, and displays risk scores across five dimensions: images, applications, teams, environments, and the entire organization. This flow enables data-driven security prioritization and trend analysis.
+
+**Business Value**: Quantified risk visibility enables resource prioritization, executive reporting, and measurable security improvement over time.
+
+## Actors
+
+| Actor | Type | Role |
+|-------|------|------|
+| Security Leader | Human | Reviews risk posture |
+| RiskEngine | Service | Computes risk scores |
+| Platform | Service | Aggregates across tenants |
+| Scanner | Service | Provides vulnerability data |
+| Policy | Service | Provides compliance data |
+| Console | System | Displays dashboards |
+
+## Prerequisites
+
+- Images scanned with findings
+- Risk scoring model configured
+- Historical data available for trends
+- Aggregation permissions configured
+
+## Risk Dimensions
+
+| Dimension | Scope | Aggregation |
+|-----------|-------|-------------|
+| Image | Single container | Direct score |
+| Application | Group of images | Weighted average |
+| Team | All team assets | Weighted average |
+| Environment | prod/staging/dev | Environment-weighted |
+| Organization | All tenants | Executive rollup |
+
+## Risk Factors
+
+| Factor | Weight | Source |
+|--------|--------|--------|
+| CVSS Score | 25% | Advisory data |
+| Exploitability | 20% | KEV, EPSS |
+| Reachability | 20% | K4 lattice state |
+| Exposure | 15% | Network exposure |
+| Asset Criticality | 10% | Business metadata |
+| Time to Remediate | 10% | Age of finding |
+
+## Flow Diagram
+
+```
+┌─────────────────────────────────────────────────────────────────────────────────┐
+│                        Risk Score Dashboard Flow                                 │
+└─────────────────────────────────────────────────────────────────────────────────┘
+
+┌──────────┐  ┌─────────┐  ┌────────────┐  ┌─────────┐  ┌────────┐  ┌─────────┐
+│ Security │  │ Console │  │  Platform  │  │  Risk   │  │ Scanner│  │ Policy  │
+│  Leader  │  │         │  │  Service   │  │ Engine  │  │        │  │         │
+└────┬─────┘  └────┬────┘  └─────┬──────┘  └────┬────┘  └───┬────┘  └────┬────┘
+     │             │             │              │           │            │
+     │ View risk   │             │              │           │            │
+     │ dashboard   │             │              │           │            │
+     │────────────>│             │              │           │            │
+     │             │             │              │           │            │
+     │             │ GET /risk/  │              │           │            │
+     │             │ summary     │              │           │            │
+     │             │────────────>│              │           │            │
+     │             │             │              │           │            │
+     │             │             │ Get findings │           │            │
+     │             │             │──────────────────────────>            │
+     │             │             │              │           │            │
+     │             │             │ Vuln data    │           │            │
+     │             │             │<──────────────────────────            │
+     │             │             │              │           │            │
+     │             │             │ Get verdicts │           │            │
+     │             │             │───────────────────────────────────────>
+     │             │             │              │           │            │
+     │             │             │ Policy data  │           │            │
+     │             │             │<───────────────────────────────────────
+     │             │             │              │           │            │
+     │             │             │ Compute      │           │            │
+     │             │             │ scores       │           │            │
+     │             │             │─────────────>│           │            │
+     │             │             │              │           │            │
+     │             │             │              │ Calculate │            │
+     │             │             │              │ per-image │            │
+     │             │             │              │───┐       │            │
+     │             │             │              │   │       │            │
+     │             │             │              │<──┘       │            │
+     │             │             │              │           │            │
+     │             │             │              │ Aggregate │            │
+     │             │             │              │ by team   │            │
+     │             │             │              │───┐       │            │
+     │             │             │              │   │       │            │
+     │             │             │              │<──┘       │            │
+     │             │             │              │           │            │
+     │             │             │ Risk scores  │           │            │
+     │             │             │<─────────────│           │            │
+     │             │             │              │           │            │
+     │             │ Dashboard   │              │           │            │
+     │             │ data        │              │           │            │
+     │             │<────────────│              │           │            │
+     │             │             │              │           │            │
+     │ Render      │             │              │           │            │
+     │ dashboard   │             │              │           │            │
+     │<────────────│             │              │           │            │
+     │             │             │              │           │            │
+```
+
+## Step-by-Step
+
+### 1. Dashboard Request
+
+The Security Leader opens the risk dashboard, and Console requests the summary from the Platform Service:
+
+```http
+GET /api/v1/risk/summary?scope=organization&period=30d&breakdown=team,severity,trend HTTP/1.1
+Authorization: Bearer {jwt}
+X-Tenant-Id: acme-corp
+```
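+
+A client could assemble this request path as sketched below. This is an illustration only: `RiskSummaryQuery` and `buildRiskSummaryPath` are hypothetical names, and the comma-joined `breakdown` value simply mirrors the request shown above.
+
+```typescript
+// Hypothetical query shape; field names follow the request parameters above.
+interface RiskSummaryQuery {
+  scope: string;
+  period?: string;
+  breakdown?: string[];
+}
+
+// Builds the path and query string for GET /api/v1/risk/summary.
+function buildRiskSummaryPath(query: RiskSummaryQuery): string {
+  const parts: string[] = ['scope=' + encodeURIComponent(query.scope)];
+  if (query.period !== undefined) {
+    parts.push('period=' + encodeURIComponent(query.period));
+  }
+  if (query.breakdown !== undefined && query.breakdown.length > 0) {
+    // Encode each item separately so the separating commas stay literal.
+    const items = query.breakdown.map((item) => encodeURIComponent(item));
+    parts.push('breakdown=' + items.join(','));
+  }
+  return '/api/v1/risk/summary?' + parts.join('&');
+}
+
+const summaryPath = buildRiskSummaryPath({
+  scope: 'organization',
+  period: '30d',
+  breakdown: ['team', 'severity', 'trend'],
+});
+console.log(summaryPath);
+```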
+
+### 2. Data Collection
+
+Platform Service collects data from multiple sources:
+
+#### Vulnerability Data (Scanner)
+```json
+{
+  "vulnerability_summary": {
+    "total_findings": 1847,
+    "by_severity": {
+      "critical": 23,
+      "high": 189,
+      "medium": 567,
+      "low": 1068
+    },
+    "by_status": {
+      "new": 145,
+      "existing": 1502,
+      "fixed": 200
+    },
+    "unique_cves": 423,
+    "affected_images": 89,
+    "affected_packages": 234
+  }
+}
+```
+
+#### Policy Data (Policy Engine)
+```json
+{
+  "policy_summary": {
+    "total_evaluations": 892,
+    "by_verdict": {
+      "pass": 743,
+      "warn": 98,
+      "fail": 51
+    },
+    "compliance_rate": 0.83,
+    "by_policy_set": {
+      "production": {"pass": 234, "fail": 12},
+      "pci-dss": {"pass": 198, "fail": 8},
+      "default": {"pass": 311, "fail": 31}
+    }
+  }
+}
+```
+
+#### Reachability Data (ReachGraph)
+```json
+{
+  "reachability_summary": {
+    "total_analyzed": 1847,
+    "by_state": {
+      "ConfirmedReachable": 45,
+      "StaticallyReachable": 234,
+      "Unknown": 890,
+      "StaticallyUnreachable": 456,
+      "ConfirmedUnreachable": 222
+    }
+  }
+}
+```
+
+### 3. Risk Calculation
+
+RiskEngine computes risk scores using the model:
+
+#### Per-Finding Risk Score
+```
+Finding Risk = Σ(factor_weight × factor_score)
+
+For CVE-2024-1234:
+  - CVSS Score:        0.25 × (9.8/10)    = 0.245
+  - Exploitability:    0.20 × 0.95 (KEV)  = 0.190
+  - Reachability:      0.20 × 0.80 (SR)   = 0.160
+  - Exposure:          0.15 × 0.70 (ext)  = 0.105
+  - Criticality:       0.10 × 0.90        = 0.090
+  - Time to Remediate: 0.10 × 0.50 (30d)  = 0.050
+
+Finding Risk Score: 0.84 (High)
+```
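+
+The weighted sum above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the RiskEngine implementation: `Factor`, `FACTOR_WEIGHTS`, and `findingRisk` are hypothetical names, and the weights are copied from the Risk Factors table.
+
+```typescript
+// Weights copied from the Risk Factors table; key names are illustrative only.
+type Factor =
+  | 'cvss'
+  | 'exploitability'
+  | 'reachability'
+  | 'exposure'
+  | 'criticality'
+  | 'remediationAge';
+
+const FACTOR_WEIGHTS: Record<Factor, number> = {
+  cvss: 0.25,
+  exploitability: 0.2,
+  reachability: 0.2,
+  exposure: 0.15,
+  criticality: 0.1,
+  remediationAge: 0.1,
+};
+
+// Every factor score is expected to be normalized to the 0..1 range already.
+function findingRisk(scores: Record<Factor, number>): number {
+  let total = 0;
+  for (const factor of Object.keys(FACTOR_WEIGHTS) as Factor[]) {
+    total += FACTOR_WEIGHTS[factor] * scores[factor];
+  }
+  return total;
+}
+
+// Factor scores from the CVE-2024-1234 example above.
+const exampleFindingRisk = findingRisk({
+  cvss: 9.8 / 10,
+  exploitability: 0.95,
+  reachability: 0.8,
+  exposure: 0.7,
+  criticality: 0.9,
+  remediationAge: 0.5,
+});
+console.log(exampleFindingRisk.toFixed(2)); // prints "0.84"
+```
+
+Because the six weights sum to 1.0, the result stays within 0..1 whenever every factor score does.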
+
+#### Per-Image Risk Score
+```
+Image Risk = max(finding_risks) + 0.1 × avg(other_findings)
+
+Image: docker.io/myorg/api:v1.2.3
+  - Highest finding risk: 0.84
+  - Average other risks: 0.45
+  - Image Risk Score: 0.84 + 0.1 × 0.45 = 0.885
+```
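+
+The image formula can be expressed as the sketch below. `imageRisk` is a hypothetical helper, and returning 0 for an image without findings is an assumption, since the flow does not cover that case.
+
+```typescript
+// Image risk = highest finding risk + 0.1 × mean of the remaining findings.
+function imageRisk(findingRisks: number[]): number {
+  if (findingRisks.length === 0) {
+    return 0; // assumption: an image with no findings carries no risk
+  }
+  const highest = findingRisks.reduce((max, r) => Math.max(max, r), -Infinity);
+  // Drop a single occurrence of the maximum; duplicates stay among the others.
+  const highestIndex = findingRisks.indexOf(highest);
+  const others = findingRisks.filter((_, i) => i !== highestIndex);
+  if (others.length === 0) {
+    return highest;
+  }
+  const avgOthers = others.reduce((sum, r) => sum + r, 0) / others.length;
+  return highest + 0.1 * avgOthers;
+}
+
+// One finding at 0.84 plus two others averaging 0.45, as in the example above.
+console.log(imageRisk([0.45, 0.84, 0.45]).toFixed(3)); // prints "0.885"
+```
+
+As written, the formula is not capped: an image whose highest finding is close to 1.0 can score above 1.0, so a consumer may need to clamp the value before grading it.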
+
+#### Aggregated Risk Score
+```
+Team Risk = Σ(image_risk × image_weight) / Σ(image_weight)
+
+Team: Platform Engineering
+  - 12 images, weighted by deployment frequency
+  - Team Risk Score: 0.45
+```
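+
+A weighted average of this form might look like the following sketch. `WeightedImage` and `teamRisk` are hypothetical names, the sample weights are made up for illustration, and returning 0 when all weights are zero is an assumption.
+
+```typescript
+// One image's risk score together with its aggregation weight
+// (for example, a deployment-frequency weight).
+interface WeightedImage {
+  risk: number;
+  weight: number;
+}
+
+// Team Risk = Σ(image_risk × image_weight) / Σ(image_weight)
+function teamRisk(images: WeightedImage[]): number {
+  const totalWeight = images.reduce((sum, img) => sum + img.weight, 0);
+  if (totalWeight === 0) {
+    return 0; // assumption: no weighted images means no measurable risk
+  }
+  const weightedSum = images.reduce((sum, img) => sum + img.risk * img.weight, 0);
+  return weightedSum / totalWeight;
+}
+
+// A frequently deployed risky image dominates a rarely deployed safer one.
+const sampleTeamRisk = teamRisk([
+  { risk: 0.8, weight: 3 },
+  { risk: 0.4, weight: 1 },
+]);
+console.log(sampleTeamRisk.toFixed(2)); // prints "0.70"
+```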
+
+### 4. Dashboard Response
+
+Platform returns aggregated risk data:
+
+```json
+{
+  "risk_summary": {
+    "organization": {
+      "score": 0.58,
+      "grade": "C",
+      "trend": -0.05,
+      "trend_direction": "improving"
+    },
+    "breakdown": {
+      "by_team": [
+        {
+          "team": "Platform Engineering",
+          "score": 0.45,
+          "grade": "C",
+          "image_count": 12,
+          "critical_findings": 2
+        },
+        {
+          "team": "Product Development",
+          "score": 0.72,
+          "grade": "D",
+          "image_count": 34,
+          "critical_findings": 18
+        }
+      ],
+      "by_severity": {
+        "critical": {"count": 23, "risk_contribution": 0.35},
+        "high": {"count": 189, "risk_contribution": 0.40},
+        "medium": {"count": 567, "risk_contribution": 0.20},
+        "low": {"count": 1068, "risk_contribution": 0.05}
+      },
+      "by_environment": {
+        "production": {"score": 0.62, "image_count": 45},
+        "staging": {"score": 0.55, "image_count": 23},
+        "development": {"score": 0.48, "image_count": 67}
+      }
+    },
+    "trends": {
+      "period": "30d",
+      "scores": [
+        {"date": "2024-11-29", "score": 0.63},
+        {"date": "2024-12-06", "score": 0.61},
+        {"date": "2024-12-13", "score": 0.59},
+        {"date": "2024-12-20", "score": 0.57},
+        {"date": "2024-12-29", "score": 0.58}
+      ],
+      "change_30d": -0.05,
+      "change_7d": 0.01
+    },
+    "top_risks": [
+      {
+        "cve": "CVE-2024-1234",
+        "risk_score": 0.92,
+        "affected_images": 8,
+        "teams": ["Product Development"],
+        "remediation": "Upgrade lodash to 4.17.21"
+      }
+    ],
+    "recommendations": [
+      {
+        "priority": 1,
+        "action": "Remediate CVE-2024-1234 in Product Development",
+        "impact": "Reduces org risk by 0.08 points"
+      }
+    ]
+  }
+}
+```
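+
+The trend fields in this response could be derived from the score series roughly as sketched below. The names `ScorePoint`, `summarizeTrend`, and `STABLE_BAND` are hypothetical, and both the first-to-last difference and the width of the "stable" band are assumptions; the flow only states that a falling score counts as improving.
+
+```typescript
+interface ScorePoint {
+  date: string;
+  score: number;
+}
+
+type TrendDirection = 'improving' | 'stable' | 'degrading';
+
+// Hypothetical threshold: smaller changes are reported as "stable".
+const STABLE_BAND = 0.005;
+
+function summarizeTrend(points: ScorePoint[]): { change: number; direction: TrendDirection } {
+  const first = points[0];
+  const last = points[points.length - 1];
+  // Fewer than two points: there is nothing to compare.
+  if (first === undefined || last === undefined || first === last) {
+    return { change: 0, direction: 'stable' };
+  }
+  // Round to two decimals to avoid floating-point noise in the payload.
+  const change = Math.round((last.score - first.score) * 100) / 100;
+  if (change < -STABLE_BAND) {
+    return { change, direction: 'improving' }; // lower score means lower risk
+  }
+  if (change > STABLE_BAND) {
+    return { change, direction: 'degrading' };
+  }
+  return { change, direction: 'stable' };
+}
+
+// Score series from the response above.
+const trendSummary = summarizeTrend([
+  { date: '2024-11-29', score: 0.63 },
+  { date: '2024-12-06', score: 0.61 },
+  { date: '2024-12-13', score: 0.59 },
+  { date: '2024-12-20', score: 0.57 },
+  { date: '2024-12-29', score: 0.58 },
+]);
+console.log(trendSummary.change, trendSummary.direction); // prints: -0.05 improving
+```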
+
+### 5. Dashboard Rendering
+
+Console displays risk visualization:
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│ Organization Risk Dashboard                                      │
+├─────────────────────────────────────────────────────────────────┤
+│                                                                  │
+│  Overall Risk Score: 58/100 (Grade: C)  ↓ 5 pts from last month │
+│  ████████████████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░│
+│                                                                  │
+│  ┌─ Risk by Team ──────────────────────────────────────────────┐│
+│  │ Platform Engineering  ████████░░░░░░░ 45 (C) ↓3%           ││
+│  │ Product Development   ████████████████░░ 72 (D) ↑2%        ││
+│  │ Security              ████░░░░░░░░░░░░ 28 (B) ↓1%          ││
+│  └─────────────────────────────────────────────────────────────┘│
+│                                                                  │
+│  ┌─ 30-Day Trend ──────────────────────────────────────────────┐│
+│  │       ╭───────────────────────────────────────────╮         ││
+│  │   63  │    ╲                                       │         ││
+│  │   61  │      ╲                                     │         ││
+│  │   59  │        ╲                                   │         ││
+│  │   57  │          ╲___________/                     │         ││
+│  │       ╰─────────────────────────────────────────────╯       ││
+│  │       Nov 29     Dec 13      Dec 20      Dec 29              ││
+│  └─────────────────────────────────────────────────────────────┘│
+│                                                                  │
+│  ┌─ Top Actions ───────────────────────────────────────────────┐│
+│  │ 1. Remediate CVE-2024-1234 (-0.08 risk)                    ││
+│  │ 2. Update base images in Product team (-0.05 risk)         ││
+│  │ 3. Enable runtime monitoring for api-service (-0.03 risk)  ││
+│  └─────────────────────────────────────────────────────────────┘│
+│                                                                  │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+## Risk Grading Scale
+
+| Score Range | Grade | Description |
+|-------------|-------|-------------|
+| 0-20 | A | Excellent - minimal risk |
+| 21-40 | B | Good - well managed |
+| 41-60 | C | Fair - needs attention |
+| 61-80 | D | Poor - significant risk |
+| 81-100 | F | Critical - immediate action |
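+
+The scale maps naturally onto a threshold lookup. The sketch below assumes scores on the 0-100 scale and treats each upper bound as inclusive, so a fractional score such as 20.5 falls into grade B; the table itself does not say how such values are handled. `toGrade` is a hypothetical helper name.
+
+```typescript
+type Grade = 'A' | 'B' | 'C' | 'D' | 'F';
+
+// Maps a 0-100 risk score to a letter grade using the table above.
+function toGrade(score: number): Grade {
+  if (score <= 20) return 'A';
+  if (score <= 40) return 'B';
+  if (score <= 60) return 'C';
+  if (score <= 80) return 'D';
+  return 'F';
+}
+
+// API payloads in this flow carry 0-1 scores, so scale before grading.
+const orgScore = Math.round(0.58 * 100);
+console.log(orgScore, toGrade(orgScore)); // prints: 58 C
+```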
+
+## Data Contracts
+
+### Risk Summary Request Schema
+
+```typescript
+interface RiskSummaryRequest {
+  scope: 'image' | 'application' | 'team' | 'environment' | 'organization';
+  scope_id?: string;  // Required for image/application/team
+  period?: string;    // Look-back window such as "30d" (default); the ISO-8601 equivalent is "P30D"
+  breakdown?: Array<'team' | 'severity' | 'environment' | 'trend'>;
+  compare_to?: string;  // Previous period for comparison
+}
+```
+
+### Risk Summary Response Schema
+
+```typescript
+interface RiskSummaryResponse {
+  risk_summary: {
+    scope: string;
+    score: number;  // 0-1 in API payloads; Console renders it as 0-100
+    grade: 'A' | 'B' | 'C' | 'D' | 'F';
+    trend: number;
+    trend_direction: 'improving' | 'stable' | 'degrading';
+    breakdown?: {
+      by_team?: TeamRisk[];
+      by_severity?: SeverityBreakdown;
+      by_environment?: EnvironmentRisk[];
+    };
+    trends?: {
+      period: string;
+      scores: Array<{date: string; score: number}>;
+      change_30d: number;
+      change_7d: number;
+    };
+    top_risks?: TopRisk[];
+    recommendations?: Recommendation[];
+  };
+  metadata: {
+    calculated_at: string;
+    data_freshness: string;
+    model_version: string;
+  };
+}
+```
+
+## Risk Model Configuration
+
+```yaml
+risk_model:
+  version: "1.0.0"
+  factors:
+    cvss_score:
+      weight: 0.25
+      normalization: linear  # score/10
+    exploitability:
+      weight: 0.20
+      components:
+        kev: 0.50      # In KEV = 1.0
+        epss: 0.30     # EPSS percentile
+        poc: 0.20      # Public PoC exists
+    reachability:
+      weight: 0.20
+      mapping:
+        ConfirmedReachable: 1.0
+        RuntimeObserved: 0.9
+        StaticallyReachable: 0.7
+        Unknown: 0.5
+        StaticallyUnreachable: 0.2
+        ConfirmedUnreachable: 0.1
+    exposure:
+      weight: 0.15
+      mapping:
+        internet_facing: 1.0
+        internal_network: 0.6
+        isolated: 0.2
+    asset_criticality:
+      weight: 0.10
+      source: business_metadata
+    remediation_age:
+      weight: 0.10
+      decay_days: 90  # Linear decay over 90 days
+
+  aggregation:
+    image: max_plus_average
+    team: weighted_average
+    organization: weighted_sum
+```
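+
+One way to read the `exploitability.components` block is as a weighted sum of three signals, sketched below. This interpretation is an assumption, as are the names `ExploitSignals` and `exploitabilityScore`, the boolean treatment of KEV and PoC, and the EPSS percentile being expressed in the 0..1 range.
+
+```typescript
+// Hypothetical input signals for a single CVE.
+interface ExploitSignals {
+  inKev: boolean; // listed in the KEV catalog
+  epssPercentile: number; // assumed to be in the 0..1 range
+  publicPoc: boolean; // a public proof of concept exists
+}
+
+// Component weights copied from the configuration above.
+const EXPLOIT_COMPONENTS = { kev: 0.5, epss: 0.3, poc: 0.2 };
+
+// Assumed combination rule: weighted sum of the three signals.
+function exploitabilityScore(signals: ExploitSignals): number {
+  return (
+    EXPLOIT_COMPONENTS.kev * (signals.inKev ? 1 : 0) +
+    EXPLOIT_COMPONENTS.epss * signals.epssPercentile +
+    EXPLOIT_COMPONENTS.poc * (signals.publicPoc ? 1 : 0)
+  );
+}
+
+const sampleExploitability = exploitabilityScore({
+  inKev: true,
+  epssPercentile: 0.9,
+  publicPoc: true,
+});
+console.log(sampleExploitability.toFixed(2)); // prints "0.97"
+```
+
+The result is the factor score that the model then multiplies by the 0.20 exploitability weight.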
+
+## Error Handling
+
+| Error | Recovery |
+|-------|----------|
+| Data stale | Show warning, use cached data |
+| Partial data | Calculate with available, note gaps |
+| Model error | Fall back to simplified model |
+| Aggregation timeout | Return partial results |
+
+## Observability
+
+### Metrics
+
+| Metric | Type | Labels |
+|--------|------|--------|
+| `risk_score_current` | Gauge | `scope`, `scope_id` |
+| `risk_calculation_duration_ms` | Histogram | `scope` |
+| `risk_grade_distribution` | Gauge | `grade` |
+| `risk_trend_change_30d` | Gauge | `scope`, `scope_id` |
+
+### Key Log Events
+
+| Event | Level | Fields |
+|-------|-------|--------|
+| `risk.calculated` | INFO | `scope`, `score`, `grade` |
+| `risk.trend_alert` | WARN | `scope`, `change`, `direction` |
+| `risk.threshold_exceeded` | WARN | `scope`, `threshold`, `score` |
+
+## Related Flows
+
+- [Dashboard Data Flow](01-dashboard-data-flow.md) - Dashboard patterns
+- [Policy Evaluation Flow](04-policy-evaluation-flow.md) - Compliance data
+- [Advisory Drift Re-scan Flow](11-advisory-drift-rescan-flow.md) - Risk updates