save progress

2026-01-09 18:27:36 +02:00
parent e608752924
commit a21d3dbc1f
361 changed files with 63068 additions and 1192 deletions
--- a/docs/modules/opsmemory/README.md
+++ b/docs/modules/opsmemory/README.md
@@ -0,0 +1,327 @@
+# OpsMemory Module
+
+> **Decision Ledger for Playbook Learning**
+
+OpsMemory is a structured ledger of prior security decisions and their outcomes. It enables playbook learning: understanding which decisions led to good outcomes and surfacing that institutional knowledge in similar situations.
+
+## What OpsMemory Is
+
+- ✅ **Decision + Outcome pairs**: Every security decision is recorded with its eventual outcome
+- ✅ **Success/failure classification**: Learn what worked and what didn't
+- ✅ **Similar situation matching**: Find past decisions in comparable scenarios
+- ✅ **Playbook suggestions**: Surface recommendations based on historical success
+
+## What OpsMemory Is NOT
+
+- ❌ Chat history (that's conversation storage)
+- ❌ Audit logs (that's the Timeline)
+- ❌ VEX statements (that's Excititor)
+
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│                    OpsMemory Service                        │
+├─────────────────────────────────────────────────────────────┤
+│  ┌─────────────┐  ┌──────────────────┐  ┌───────────────┐  │
+│  │  Decision   │  │    Playbook      │  │   Outcome     │  │
+│  │  Recording  │  │   Suggestion     │  │   Tracking    │  │
+│  └──────┬──────┘  └────────┬─────────┘  └───────┬───────┘  │
+│         │                  │                    │          │
+│         ▼                  ▼                    ▼          │
+│  ┌─────────────────────────────────────────────────────┐   │
+│  │              IOpsMemoryStore                        │   │
+│  │    (PostgreSQL with similarity vectors)             │   │
+│  └─────────────────────────────────────────────────────┘   │
+└─────────────────────────────────────────────────────────────┘
+```
+
+## Core Components
+
+### OpsMemoryRecord
+
+The core data structure capturing a decision and its context:
+
+```json
+{
+  "memoryId": "mem-abc123",
+  "tenantId": "tenant-xyz",
+  "recordedAt": "2026-01-07T12:00:00Z",
+  
+  "situation": {
+    "cveId": "CVE-2023-44487",
+    "component": "pkg:npm/http2@1.0.0",
+    "severity": "high",
+    "reachability": "reachable",
+    "epssScore": 0.97,
+    "isKev": true,
+    "contextTags": ["production", "external-facing", "payment-service"]
+  },
+  
+  "decision": {
+    "action": "Remediate",
+    "rationale": "KEV + reachable + payment service = immediate remediation",
+    "decidedBy": "security-team",
+    "decidedAt": "2026-01-07T12:00:00Z",
+    "policyReference": "policy/critical-kev.rego"
+  },
+  
+  "outcome": {
+    "status": "Success",
+    "resolutionTime": "04:30:00",
+    "lessonsLearned": "Upgrade was smooth, no breaking changes",
+    "recordedAt": "2026-01-07T16:30:00Z"
+  }
+}
+```
+
+### Decision Actions
+
+| Action | Description |
+|--------|-------------|
+| `Accept` | Accept the risk (no action) |
+| `Remediate` | Upgrade/patch the component |
+| `Quarantine` | Isolate the component |
+| `Mitigate` | Apply compensating controls (WAF, config) |
+| `Defer` | Defer for later review |
+| `Escalate` | Escalate to security team |
+| `FalsePositive` | Mark as not applicable |
+
+### Outcome Status
+
+| Status | Description |
+|--------|-------------|
+| `Success` | Decision led to successful resolution |
+| `PartialSuccess` | Decision led to partial resolution |
+| `Ineffective` | Decision was ineffective |
+| `NegativeOutcome` | Decision led to negative consequences |
+| `Pending` | Outcome still pending |
+
+## API Reference
+
+### Record a Decision
+
+```http
+POST /api/v1/opsmemory/decisions
+Content-Type: application/json
+
+{
+  "tenantId": "tenant-xyz",
+  "cveId": "CVE-2023-44487",
+  "componentPurl": "pkg:npm/http2@1.0.0",
+  "severity": "high",
+  "reachability": "reachable",
+  "epssScore": 0.97,
+  "action": "Remediate",
+  "rationale": "KEV + reachable + payment service",
+  "decidedBy": "alice@example.com",
+  "contextTags": ["production", "payment-service"]
+}
+```
+
+**Response:**
+```json
+{
+  "memoryId": "abc123def456",
+  "recordedAt": "2026-01-07T12:00:00Z"
+}
+```
+
+### Record an Outcome
+
+```http
+POST /api/v1/opsmemory/decisions/{memoryId}/outcome?tenantId=tenant-xyz
+Content-Type: application/json
+
+{
+  "status": "Success",
+  "resolutionTimeMinutes": 270,
+  "lessonsLearned": "Upgrade was smooth, no breaking changes",
+  "recordedBy": "alice@example.com"
+}
+```
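+
+A client-side call to this endpoint might look like the sketch below. It assumes an `HttpClient` whose `BaseAddress` already points at the OpsMemory service; `OutcomeClientSketch` and `RecordOutcomeAsync` are illustrative names rather than part of the module, and authentication is left out.
+
+```csharp
+using System;
+using System.Net.Http;
+using System.Net.Http.Json;
+using System.Threading.Tasks;
+
+// Hypothetical helper, not part of OpsMemory itself.
+public static class OutcomeClientSketch
+{
+    public static async Task RecordOutcomeAsync(HttpClient http, string tenantId, string memoryId)
+    {
+        // Path and query parameter are taken from the request shown above.
+        var url = $"/api/v1/opsmemory/decisions/{Uri.EscapeDataString(memoryId)}/outcome" +
+                  $"?tenantId={Uri.EscapeDataString(tenantId)}";
+
+        // Anonymous object mirrors the documented request body.
+        var response = await http.PostAsJsonAsync(url, new
+        {
+            status = "Success",
+            resolutionTimeMinutes = 270,
+            lessonsLearned = "Upgrade was smooth, no breaking changes",
+            recordedBy = "alice@example.com"
+        });
+
+        // Throws HttpRequestException on a non-2xx status code.
+        response.EnsureSuccessStatusCode();
+    }
+}
+```
+
+Escaping `memoryId` and `tenantId` keeps unusual identifiers from altering the path or query string.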
+
+### Get Playbook Suggestions
+
+```http
+GET /api/v1/opsmemory/suggestions?tenantId=tenant-xyz&cveId=CVE-2024-1234&severity=high&reachability=reachable
+```
+
+**Response:**
+```json
+{
+  "suggestions": [
+    {
+      "suggestedAction": "Remediate",
+      "confidence": 0.87,
+      "rationale": "87% confidence based on 15 similar past decisions. Remediation succeeded in 93% of high-severity reachable vulnerabilities.",
+      "successRate": 0.93,
+      "similarDecisionCount": 15,
+      "averageResolutionTimeMinutes": 180,
+      "evidence": [
+        {
+          "memoryId": "abc123",
+          "similarity": 0.92,
+          "action": "Remediate",
+          "outcome": "Success",
+          "cveId": "CVE-2023-44487"
+        }
+      ],
+      "matchingFactors": [
+        "Same severity: high",
+        "Same reachability: Reachable",
+        "Both are KEV",
+        "Shared context: production"
+      ]
+    }
+  ],
+  "analyzedRecords": 15,
+  "topSimilarity": 0.92
+}
+```
+
+### Query Past Decisions
+
+```http
+GET /api/v1/opsmemory/decisions?tenantId=tenant-xyz&action=Remediate&pageSize=20
+```
+
+### Get Statistics
+
+```http
+GET /api/v1/opsmemory/stats?tenantId=tenant-xyz
+```
+
+**Response:**
+```json
+{
+  "tenantId": "tenant-xyz",
+  "totalDecisions": 1250,
+  "decisionsWithOutcomes": 980,
+  "successRate": 0.87
+}
+```
+
+## Similarity Algorithm
+
+OpsMemory uses a 50-dimensional vector to represent each security situation:
+
+| Dimensions | Feature |
+|------------|---------|
+| 0-9 | CVE category (memory, injection, auth, crypto, dos, etc.) |
+| 10-14 | Severity (none, low, medium, high, critical) |
+| 15-18 | Reachability (unknown, reachable, not-reachable, potential) |
+| 19-23 | EPSS band (0-0.2, 0.2-0.4, 0.4-0.6, 0.6-0.8, 0.8-1.0) |
+| 24-28 | CVSS band (0-2, 2-4, 4-6, 6-8, 8-10) |
+| 29 | KEV flag |
+| 30-39 | Component type (npm, maven, pypi, nuget, go, cargo, etc.) |
+| 40-49 | Context tags (production, external-facing, payment, etc.) |
+
+Similarity is computed using **cosine similarity** between normalized vectors.
+
+## Integration Points
+
+### Decision Recording Hook
+
+OpsMemory integrates with the Findings Ledger to automatically capture decisions:
+
+```csharp
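+public class OpsMemoryHook : IDecisionHook
+{
+    private readonly IOpsMemoryStore _store;
+
+    public OpsMemoryHook(IOpsMemoryStore store) => _store = store;
+
+    // Not marked async: nothing is awaited, so a completed task is returned.
+    public Task OnDecisionRecordedAsync(FindingDecision decision)
+    {
+        var record = new OpsMemoryRecord
+        {
+            // MemoryId and RecordedAt are also required members; their assignment is omitted here.
+            TenantId = decision.TenantId,
+            Situation = ExtractSituation(decision),
+            Decision = ExtractDecision(decision)
+        };
+
+        // Fire-and-forget so the decision flow is not blocked
+        _ = _store.RecordDecisionAsync(record);
+        return Task.CompletedTask;
+    }
+}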
+```
+
+### Outcome Tracking
+
+The `OutcomeTrackingService` watches for resolution events and prompts users to classify the outcome:
+
+1. **Auto-detect resolution**: When a finding is marked resolved
+2. **Calculate resolution time**: Time from decision to resolution
+3. **Prompt for classification**: Ask user about outcome quality
+4. **Link to original decision**: Update the OpsMemory record
+
+## Configuration
+
+```yaml
+opsmemory:
+  connectionString: "Host=localhost;Database=stellaops"
+  
+  similarity:
+    minThreshold: 0.6      # Minimum similarity for suggestions
+    maxResults: 10         # Maximum similar records to analyze
+    
+  suggestions:
+    maxSuggestions: 3      # Maximum suggestions to return
+    minConfidence: 0.5     # Minimum confidence threshold
+    
+  outcomeTracking:
+    autoPromptDelay: 24h   # Delay before prompting for outcome
+    reminderInterval: 7d   # Reminder interval for pending outcomes
+```
+
+## Database Schema
+
+```sql
+CREATE SCHEMA IF NOT EXISTS opsmemory;
+
+CREATE TABLE opsmemory.decisions (
+    memory_id TEXT PRIMARY KEY,
+    tenant_id TEXT NOT NULL,
+    recorded_at TIMESTAMPTZ NOT NULL,
+    
+    -- Situation (JSONB for flexibility)
+    situation JSONB NOT NULL,
+    
+    -- Decision (JSONB)
+    decision JSONB NOT NULL,
+    
+    -- Outcome (nullable, updated later)
+    outcome JSONB,
+    
+    -- Similarity vector (array for simple cosine similarity)
+    similarity_vector REAL[] NOT NULL
+);
+
+CREATE INDEX idx_decisions_tenant ON opsmemory.decisions(tenant_id);
+CREATE INDEX idx_decisions_recorded ON opsmemory.decisions(recorded_at DESC);
+CREATE INDEX idx_decisions_cve ON opsmemory.decisions((situation->>'cveId'));
+```
+
+## Best Practices
+
+### Recording Decisions
+
+1. **Include context tags**: The more context you record, the better the similarity matching
+2. **Document rationale**: Future users benefit from understanding why
+3. **Reference policies**: Link to the policy that guided the decision
+
+### Recording Outcomes
+
+1. **Be timely**: Record outcomes as soon as resolution is confirmed
+2. **Be honest**: Failed decisions are valuable learning data
+3. **Add lessons learned**: Help future users avoid pitfalls
+
+### Using Suggestions
+
+1. **Review evidence**: Look at the similar past decisions
+2. **Check matching factors**: Ensure the situations are truly comparable
+3. **Trust but verify**: Suggestions are guidance, not mandates
+
+## Related Modules
+
+- [Findings Ledger](../findings-ledger/README.md) - Source of decision events
+- [Timeline](../timeline-indexer/README.md) - Audit trail
+- [Excititor](../excititor/README.md) - VEX statement management
+- [Risk Engine](../risk-engine/README.md) - Risk scoring
--- a/docs/modules/opsmemory/architecture.md
+++ b/docs/modules/opsmemory/architecture.md
@@ -0,0 +1,393 @@
+# OpsMemory Architecture
+
+> **Technical deep-dive into the Decision Ledger**
+
+## Overview
+
+OpsMemory provides a structured approach to organizational learning from security decisions. It captures the complete lifecycle of a decision, from the situation context through the action taken to the eventual outcome.
+
+## Design Principles
+
+### 1. Determinism First
+
+All operations produce deterministic, reproducible results:
+- Similarity vectors are computed from stable inputs
+- Confidence scores use fixed formulas
+- No randomness in suggestion ranking
+
+### 2. Multi-Tenant Isolation
+
+Every operation is scoped to a tenant:
+- Records cannot be accessed across tenants
+- Similarity search is tenant-isolated
+- Statistics are per-tenant
+
+### 3. Fire-and-Forget Integration
+
+Decision recording is async and non-blocking:
+- UI decisions complete immediately
+- OpsMemory recording happens in background
+- Failures don't affect the primary flow
+
+### 4. Offline Capable
+
+All features work without external network access:
+- Local PostgreSQL storage
+- No external API dependencies
+- Self-contained similarity computation
+
+## Component Architecture
+
+```
+┌────────────────────────────────────────────────────────────────────┐
+│                         WebService Layer                            │
+│  ┌──────────────────────────────────────────────────────────────┐  │
+│  │                   OpsMemoryEndpoints                          │  │
+│  │  POST /decisions  GET /decisions  GET /suggestions  GET /stats│  │
+│  └──────────────────────────────────────────────────────────────┘  │
+└────────────────────────────────┬───────────────────────────────────┘
+                                 │
+┌────────────────────────────────┼───────────────────────────────────┐
+│                         Service Layer                               │
+│  ┌─────────────────┐  ┌─────────────────┐  ┌────────────────────┐  │
+│  │ PlaybookSuggest │  │ OutcomeTracking │  │ SimilarityVector   │  │
+│  │    Service      │  │    Service      │  │    Generator       │  │
+│  └────────┬────────┘  └────────┬────────┘  └─────────┬──────────┘  │
+│           │                    │                     │             │
+│           ▼                    ▼                     ▼             │
+│  ┌──────────────────────────────────────────────────────────────┐  │
+│  │                      IOpsMemoryStore                          │  │
+│  └──────────────────────────────────────────────────────────────┘  │
+└────────────────────────────────┬───────────────────────────────────┘
+                                 │
+┌────────────────────────────────┼───────────────────────────────────┐
+│                       Storage Layer                                 │
+│  ┌──────────────────────────────────────────────────────────────┐  │
+│  │                 PostgresOpsMemoryStore                        │  │
+│  │  - Decision CRUD                                              │  │
+│  │  - Outcome updates                                            │  │
+│  │  - Similarity search (array-based cosine)                     │  │
+│  │  - Query with pagination                                      │  │
+│  │  - Statistics aggregation                                     │  │
+│  └──────────────────────────────────────────────────────────────┘  │
+└────────────────────────────────────────────────────────────────────┘
+```
+
+## Data Model
+
+### OpsMemoryRecord
+
+The core aggregate containing all decision information:
+
+```csharp
+public sealed record OpsMemoryRecord
+{
+    public required string MemoryId { get; init; }
+    public required string TenantId { get; init; }
+    public required DateTimeOffset RecordedAt { get; init; }
+    public required SituationContext Situation { get; init; }
+    public required DecisionRecord Decision { get; init; }
+    public OutcomeRecord? Outcome { get; init; }
+    public ImmutableArray<float> SimilarityVector { get; init; }
+}
+```
+
+### SituationContext
+
+Captures the security context at decision time:
+
+```csharp
+public sealed record SituationContext
+{
+    public string? CveId { get; init; }
+    public string? Component { get; init; }       // PURL
+    public string? Severity { get; init; }        // none/low/medium/high/critical
+    public ReachabilityStatus Reachability { get; init; }
+    public double? EpssScore { get; init; }       // 0-1
+    public double? CvssScore { get; init; }       // 0-10
+    public bool IsKev { get; init; }
+    public ImmutableArray<string> ContextTags { get; init; }
+}
+```
+
+### DecisionRecord
+
+The action taken and why:
+
+```csharp
+public sealed record DecisionRecord
+{
+    public required DecisionAction Action { get; init; }
+    public required string Rationale { get; init; }
+    public required string DecidedBy { get; init; }
+    public required DateTimeOffset DecidedAt { get; init; }
+    public string? PolicyReference { get; init; }
+    public MitigationDetails? Mitigation { get; init; }
+}
+```
+
+### OutcomeRecord
+
+The result of the decision:
+
+```csharp
+public sealed record OutcomeRecord
+{
+    public required OutcomeStatus Status { get; init; }
+    public TimeSpan? ResolutionTime { get; init; }
+    public string? ActualImpact { get; init; }
+    public string? LessonsLearned { get; init; }
+    public required string RecordedBy { get; init; }
+    public required DateTimeOffset RecordedAt { get; init; }
+}
+```
+
+## Similarity Algorithm
+
+### Vector Generation
+
+The `SimilarityVectorGenerator` creates 50-dimensional feature vectors:
+
+```
+Vector Layout:
+[0-9]   : CVE category one-hot (memory, injection, auth, crypto, dos, 
+          info-disclosure, privilege-escalation, xss, path-traversal, other)
+[10-14] : Severity one-hot (none, low, medium, high, critical)
+[15-18] : Reachability one-hot (unknown, reachable, not-reachable, potential)
+[19-23] : EPSS band one-hot (0-0.2, 0.2-0.4, 0.4-0.6, 0.6-0.8, 0.8-1.0)
+[24-28] : CVSS band one-hot (0-2, 2-4, 4-6, 6-8, 8-10)
+[29]    : KEV flag (0 or 1)
+[30-39] : Component type one-hot (npm, maven, pypi, nuget, go, cargo, 
+          deb, rpm, apk, other)
+[40-49] : Context tag presence (production, development, staging, 
+          external-facing, internal, payment, auth, data, api, frontend)
+```
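+
+As a rough illustration of the layout above, the sketch below fills three of the segments (severity, EPSS band, KEV flag) of a 50-element array. `VectorLayoutSketch` is an illustrative name, not the `SimilarityVectorGenerator` API, and the handling of missing values and band boundaries is an assumption.
+
+```csharp
+using System;
+
+// Hypothetical sketch; covers only three of the eight segments.
+public static class VectorLayoutSketch
+{
+    private const int Dimensions = 50;
+    private const int SeverityOffset = 10;   // [10-14]
+    private const int EpssOffset = 19;       // [19-23]
+    private const int KevIndex = 29;         // [29]
+
+    private static readonly string[] Severities = { "none", "low", "medium", "high", "critical" };
+
+    public static float[] Encode(string? severity, double? epssScore, bool isKev)
+    {
+        var vector = new float[Dimensions];
+
+        // Severity one-hot; an unknown or missing value leaves the segment all zero (assumption).
+        var severityIndex = Array.IndexOf(Severities, severity?.ToLowerInvariant());
+        if (severityIndex >= 0) vector[SeverityOffset + severityIndex] = 1f;
+
+        // EPSS band one-hot: five bands of width 0.2, clamped so that 1.0 lands in the top band.
+        if (epssScore is double epss)
+        {
+            var band = Math.Clamp((int)(epss / 0.2), 0, 4);
+            vector[EpssOffset + band] = 1f;
+        }
+
+        vector[KevIndex] = isKev ? 1f : 0f;
+        return vector;
+    }
+}
+```
+
+For example, `VectorLayoutSketch.Encode("high", 0.97, true)` sets indices 13, 23, and 29 and leaves every other element at zero.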
+
+### Cosine Similarity
+
+Similarity between vectors A and B:
+
+```
+similarity = (A · B) / (||A|| × ||B||)
+```
+
+Where `A · B` is the dot product and `||A||` is the L2 norm.
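+
+A minimal in-memory version of this formula could look as follows. `CosineSketch` is an illustrative name, and returning 0 for a zero-length vector is an assumption made to avoid dividing by zero.
+
+```csharp
+using System;
+
+// Hypothetical sketch of the formula above.
+public static class CosineSketch
+{
+    public static double Similarity(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
+    {
+        if (a.Length != b.Length)
+            throw new ArgumentException("Vectors must have the same length.");
+
+        double dot = 0, normA = 0, normB = 0;
+        for (var i = 0; i < a.Length; i++)
+        {
+            dot += a[i] * b[i];      // A · B
+            normA += a[i] * a[i];    // ||A||^2
+            normB += b[i] * b[i];    // ||B||^2
+        }
+
+        // A vector with no features set has no meaningful direction (assumption: treat as 0).
+        if (normA == 0 || normB == 0) return 0;
+        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
+    }
+}
+```
+
+Two vectors with three features set each, two of them shared, score 2 / (√3 × √3) ≈ 0.667.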
+
+### CVE Classification
+
+CVEs are classified by analyzing keywords in the CVE ID and description:
+
+| Category | Keywords |
+|----------|----------|
+| memory | buffer, overflow, heap, stack, use-after-free |
+| injection | sql, command, code injection, ldap |
+| auth | authentication, authorization, bypass |
+| crypto | cryptographic, encryption, key |
+| dos | denial of service, resource exhaustion |
+| info-disclosure | information disclosure, leak |
+| privilege-escalation | privilege escalation, elevation |
+| xss | cross-site scripting, xss |
+| path-traversal | path traversal, directory traversal |
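+
+The keyword table can be read as a first-match lookup, sketched below. `CveCategorySketch` is an illustrative name; the rule order, the case-insensitive substring matching, and the fallback to `other` are assumptions.
+
+```csharp
+using System;
+
+// Hypothetical sketch; keyword lists are copied from the table above.
+public static class CveCategorySketch
+{
+    private static readonly (string Category, string[] Keywords)[] Rules =
+    {
+        ("memory", new[] { "buffer", "overflow", "heap", "stack", "use-after-free" }),
+        ("injection", new[] { "sql", "command", "code injection", "ldap" }),
+        ("auth", new[] { "authentication", "authorization", "bypass" }),
+        ("crypto", new[] { "cryptographic", "encryption", "key" }),
+        ("dos", new[] { "denial of service", "resource exhaustion" }),
+        ("info-disclosure", new[] { "information disclosure", "leak" }),
+        ("privilege-escalation", new[] { "privilege escalation", "elevation" }),
+        ("xss", new[] { "cross-site scripting", "xss" }),
+        ("path-traversal", new[] { "path traversal", "directory traversal" }),
+    };
+
+    public static string Classify(string? text)
+    {
+        if (string.IsNullOrWhiteSpace(text)) return "other";
+
+        // First matching rule wins, so earlier categories take precedence.
+        foreach (var (category, keywords) in Rules)
+            foreach (var keyword in keywords)
+                if (text.Contains(keyword, StringComparison.OrdinalIgnoreCase))
+                    return category;
+
+        return "other";
+    }
+}
+```
+
+Plain substring matching is simple but coarse: a short keyword such as `key` also matches unrelated words like "keyboard", so a real classifier would likely need word-boundary checks.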
+
+## Playbook Suggestion Algorithm
+
+### Confidence Calculation
+
+```
+confidence = baseSimilarity
+           × successRateBonus
+           × recencyBonus
+           × evidenceCountBonus
+```
+
+Where:
+- `baseSimilarity`: Highest similarity score from matching records
+- `successRateBonus`: `1 + (successRate - 0.5) * 0.5` (rewards high success rate)
+- `recencyBonus`: More recent decisions weighted higher
+- `evidenceCountBonus`: More evidence = higher confidence
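+
+The product can be sketched as below. Only `successRateBonus` has a formula in this document, so `recencyBonus` and `evidenceCountBonus` are passed in as precomputed values; `ConfidenceSketch` is an illustrative name and the final clamp to [0, 1] is an assumption.
+
+```csharp
+using System;
+
+// Hypothetical sketch of the confidence product.
+public static class ConfidenceSketch
+{
+    public static double Compute(
+        double baseSimilarity,
+        double successRate,
+        double recencyBonus,
+        double evidenceCountBonus)
+    {
+        // From the definition above: 0.75 at successRate 0, 1.25 at successRate 1.
+        var successRateBonus = 1 + (successRate - 0.5) * 0.5;
+
+        var raw = baseSimilarity * successRateBonus * recencyBonus * evidenceCountBonus;
+
+        // Assumption: clamp, because the raw product can exceed 1 when the bonuses are above 1.
+        return Math.Clamp(raw, 0.0, 1.0);
+    }
+}
+```
+
+With `baseSimilarity = 0.92`, `successRate = 0.93`, and both remaining bonuses at 1.0, `successRateBonus` is 1.215 and the raw product is about 1.118, which the clamp reduces to 1.0.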
+
+### Suggestion Ranking
+
+1. Group past decisions by action taken
+2. For each action, calculate:
+   - Average similarity of records with that action
+   - Success rate for that action
+   - Number of similar decisions
+3. Compute confidence score
+4. Rank by confidence descending
+5. Return top N suggestions
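+
+The five steps map naturally onto a LINQ pipeline, sketched below. `SimilarDecision`, `RankedSuggestion`, and `RankingSketch` are illustrative types, the confidence function is supplied by the caller, and how an outcome status maps to `Succeeded` is an assumption left outside the sketch.
+
+```csharp
+using System;
+using System.Collections.Generic;
+using System.Linq;
+
+// Hypothetical types for illustration only.
+public sealed record SimilarDecision(string Action, double Similarity, bool Succeeded);
+public sealed record RankedSuggestion(string Action, double Confidence, double SuccessRate, int Count);
+
+public static class RankingSketch
+{
+    public static IReadOnlyList<RankedSuggestion> Rank(
+        IEnumerable<SimilarDecision> matches,
+        Func<double, double, int, double> confidence,
+        int maxSuggestions)
+    {
+        return matches
+            .GroupBy(m => m.Action)                                  // 1. group by action
+            .Select(g =>
+            {
+                var count = g.Count();                               // 2. per-action statistics
+                var averageSimilarity = g.Average(m => m.Similarity);
+                var successRate = g.Count(m => m.Succeeded) / (double)count;
+                var score = confidence(averageSimilarity, successRate, count); // 3. confidence
+                return new RankedSuggestion(g.Key, score, successRate, count);
+            })
+            .OrderByDescending(s => s.Confidence)                    // 4. rank by confidence
+            .ThenBy(s => s.Action, StringComparer.Ordinal)           // tie-break (assumption)
+            .Take(maxSuggestions)                                    // 5. top N
+            .ToList();
+    }
+}
+```
+
+The ordinal tie-break on the action name is not described in this document; it is included because an explicit secondary key keeps the ranking reproducible when two actions have equal confidence.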
+
+### Rationale Generation
+
+Rationales are generated programmatically:
+
+```
+"{confidence}% confidence based on {count} similar past decisions. 
+{action} succeeded in {successRate}% of {factors}."
+```
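+
+Filling the template could be as simple as the sketch below. `RationaleSketch` is an illustrative name, and rounding percentages to whole numbers is an assumption based on the example rationale in the README.
+
+```csharp
+using System.Globalization;
+
+// Hypothetical sketch of the template above.
+public static class RationaleSketch
+{
+    public static string Build(
+        double confidence, int count, string action, double successRate, string factors)
+    {
+        // Invariant culture keeps the text identical regardless of the server locale.
+        return string.Format(
+            CultureInfo.InvariantCulture,
+            "{0:F0}% confidence based on {1} similar past decisions. {2} succeeded in {3:F0}% of {4}.",
+            confidence * 100,
+            count,
+            action,
+            successRate * 100,
+            factors);
+    }
+}
+```
+
+Formatting with a fixed culture fits the determinism principle: the same inputs always yield the same string.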
+
+## Storage Design
+
+### PostgreSQL Schema
+
+```sql
+CREATE TABLE opsmemory.decisions (
+    memory_id TEXT PRIMARY KEY,
+    tenant_id TEXT NOT NULL,
+    recorded_at TIMESTAMPTZ NOT NULL,
+    
+    -- Denormalized situation fields for indexing
+    cve_id TEXT,
+    component TEXT,
+    severity TEXT,
+    
+    -- Full data as JSONB
+    situation JSONB NOT NULL,
+    decision JSONB NOT NULL,
+    outcome JSONB,
+    
+    -- Similarity vector as array (not pgvector)
+    similarity_vector REAL[] NOT NULL
+);
+
+-- Indexes
+CREATE INDEX idx_decisions_tenant ON opsmemory.decisions(tenant_id);
+CREATE INDEX idx_decisions_recorded ON opsmemory.decisions(recorded_at DESC);
+CREATE INDEX idx_decisions_cve ON opsmemory.decisions(cve_id) WHERE cve_id IS NOT NULL;
+CREATE INDEX idx_decisions_component ON opsmemory.decisions(component) WHERE component IS NOT NULL;
+```
+
+### Why Not pgvector?
+
+The current implementation uses PostgreSQL arrays instead of pgvector:
+
+1. **Simpler deployment**: No extension installation required
+2. **Smaller dataset**: OpsMemory is per-org, not global
+3. **Adequate performance**: Array operations are fast enough for <100K records
+4. **Future option**: Can migrate to pgvector if needed
+
+### Cosine Similarity in SQL
+
+```sql
+-- Cosine similarity between query vector and stored vectors
+SELECT memory_id,
+       (
+         SELECT SUM(a * b) 
+         FROM UNNEST(similarity_vector, @query_vector) AS t(a, b)
+       ) / (
+         SQRT((SELECT SUM(a * a) FROM UNNEST(similarity_vector) AS t(a))) *
+         SQRT((SELECT SUM(b * b) FROM UNNEST(@query_vector) AS t(b)))
+       ) AS similarity
+FROM opsmemory.decisions
+WHERE tenant_id = @tenant_id
+ORDER BY similarity DESC
+LIMIT @top_k;
+```
+
+## API Design
+
+### Endpoint Overview
+
+| Method | Path | Description |
+|--------|------|-------------|
+| POST | `/api/v1/opsmemory/decisions` | Record a new decision |
+| GET | `/api/v1/opsmemory/decisions/{id}` | Get decision details |
+| POST | `/api/v1/opsmemory/decisions/{id}/outcome` | Record outcome |
+| GET | `/api/v1/opsmemory/suggestions` | Get playbook suggestions |
+| GET | `/api/v1/opsmemory/decisions` | Query past decisions |
+| GET | `/api/v1/opsmemory/stats` | Get statistics |
+
+### Request/Response DTOs
+
+The API uses string-based DTOs that convert to/from internal enums:
+
+```csharp
+// API accepts strings
+public record RecordDecisionRequest
+{
+    public required string Action { get; init; }  // "Remediate", "Accept", etc.
+    public string? Reachability { get; init; }    // "reachable", "not-reachable"
+}
+
+// Internal uses enums
+public enum DecisionAction { Accept, Remediate, Quarantine, ... }
+public enum ReachabilityStatus { Unknown, Reachable, NotReachable, Potential }
+```
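+
+One way to convert the string form to the enum is sketched below. `ReachabilityParsingSketch` is an illustrative name, the enum is repeated so the snippet stands alone, and falling back to `Unknown` for unrecognized input (rather than rejecting the request) is an assumption.
+
+```csharp
+using System;
+
+public enum ReachabilityStatus { Unknown, Reachable, NotReachable, Potential }
+
+// Hypothetical sketch of the string-to-enum conversion.
+public static class ReachabilityParsingSketch
+{
+    public static ReachabilityStatus Parse(string? value)
+    {
+        if (string.IsNullOrWhiteSpace(value)) return ReachabilityStatus.Unknown;
+
+        // "not-reachable" becomes "notreachable", which matches NotReachable case-insensitively.
+        var normalized = value.Trim().Replace("-", string.Empty);
+
+        // Enum.TryParse also accepts numeric strings such as "7", so check that the value is defined.
+        return Enum.TryParse<ReachabilityStatus>(normalized, ignoreCase: true, out var status)
+               && Enum.IsDefined(status)
+            ? status
+            : ReachabilityStatus.Unknown;
+    }
+}
+```
+
+`Parse("not-reachable")` yields `ReachabilityStatus.NotReachable`, and `Parse("reachable")` yields `ReachabilityStatus.Reachable`.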
+
+## Testing Strategy
+
+### Unit Tests (26 tests)
+
+**SimilarityVectorGeneratorTests:**
+- Vector dimension validation
+- Feature encoding (severity, reachability, EPSS, CVSS, KEV)
+- Component type classification
+- Context tag encoding
+- Vector normalization
+- Cosine similarity computation
+- Matching factor detection
+
+**PlaybookSuggestionServiceTests:**
+- Empty history handling
+- Single record suggestions
+- Multiple record ranking
+- Confidence calculation
+- Rationale generation
+- Evidence linking
+
+### Integration Tests (5 tests)
+
+**PostgresOpsMemoryStoreTests:**
+- Decision persistence and retrieval
+- Outcome updates
+- Tenant isolation
+- Query filtering
+- Statistics calculation
+
+## Performance Considerations
+
+### Indexing Strategy
+
+- Primary key on `memory_id` for direct lookups
+- Index on `tenant_id` for isolation
+- Index on `recorded_at` for recent-first queries
+- Partial indexes on `cve_id` and `component` for filtered queries
+
+### Query Optimization
+
+- Limit similarity search to last N days by default
+- Return only top-K similar records
+- Use cursor-based pagination for large result sets
+
+### Caching
+
+Currently no caching (records are infrequently accessed). Future options:
+- Cache similarity vectors in memory
+- Cache recent suggestions per tenant
+- Use read replicas for heavy read loads
+
+## Future Enhancements
+
+### pgvector Migration
+
+If the dataset grows significantly:
+1. Install pgvector extension
+2. Add vector column with IVFFlat index
+3. Replace array-based similarity with vector operations
+4. ~100x speedup for large datasets
+
+### ML-Based Suggestions
+
+Replace the rule-based confidence with an ML model:
+1. Train on historical decision-outcome pairs
+2. Include more features (time of day, team, etc.)
+3. Use gradient boosting or neural network
+4. Continuous learning from new outcomes
+
+### Outcome Prediction
+
+Predict the outcome before a decision is made:
+1. Use past outcomes as training data
+2. Predict success probability per action
+3. Show predicted outcomes in UI
+4. Track prediction accuracy over time