git.stella-ops.org/docs/modules/opsmemory/README.md
2026-01-09 18:27:46 +02:00

OpsMemory Module

Decision Ledger for Playbook Learning

OpsMemory is a structured ledger of prior security decisions and their outcomes. It enables playbook learning: identifying which decisions led to good outcomes and surfacing that institutional knowledge when a similar situation comes up.

What OpsMemory Is

  • Decision + Outcome pairs: Every security decision is recorded with its eventual outcome
  • Success/failure classification: Learn what worked and what didn't
  • Similar situation matching: Find past decisions in comparable scenarios
  • Playbook suggestions: Surface recommendations based on historical success

What OpsMemory Is NOT

  • Chat history (that's conversation storage)
  • Audit logs (that's the Timeline)
  • VEX statements (that's Excititor)

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    OpsMemory Service                        │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌──────────────────┐  ┌───────────────┐  │
│  │  Decision   │  │    Playbook      │  │   Outcome     │  │
│  │  Recording  │  │   Suggestion     │  │   Tracking    │  │
│  └──────┬──────┘  └────────┬─────────┘  └───────┬───────┘  │
│         │                  │                    │          │
│         ▼                  ▼                    ▼          │
│  ┌─────────────────────────────────────────────────────┐   │
│  │              IOpsMemoryStore                        │   │
│  │    (PostgreSQL with similarity vectors)             │   │
│  └─────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘

Core Components

OpsMemoryRecord

The core data structure capturing a decision and its context:

{
  "memoryId": "mem-abc123",
  "tenantId": "tenant-xyz",
  "recordedAt": "2026-01-07T12:00:00Z",
  
  "situation": {
    "cveId": "CVE-2023-44487",
    "component": "pkg:npm/http2@1.0.0",
    "severity": "high",
    "reachability": "reachable",
    "epssScore": 0.97,
    "isKev": true,
    "contextTags": ["production", "external-facing", "payment-service"]
  },
  
  "decision": {
    "action": "Remediate",
    "rationale": "KEV + reachable + payment service = immediate remediation",
    "decidedBy": "security-team",
    "decidedAt": "2026-01-07T12:00:00Z",
    "policyReference": "policy/critical-kev.rego"
  },
  
  "outcome": {
    "status": "Success",
    "resolutionTime": "4:30:00",
    "lessonsLearned": "Upgrade was smooth, no breaking changes",
    "recordedAt": "2026-01-07T16:30:00Z"
  }
}
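
Note that the record stores the resolution time as a duration string ("4:30:00"), while the outcome endpoint in the API Reference takes resolutionTimeMinutes. A minimal Python sketch of the conversion, assuming the string is always in h:mm:ss form; duration_to_minutes is a hypothetical helper, not part of OpsMemory:

```python
def duration_to_minutes(duration: str) -> int:
    """Convert an 'h:mm:ss' duration string to whole minutes (seconds are dropped)."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError(f"not a valid h:mm:ss duration: {duration!r}")
    return hours * 60 + minutes


record_outcome = {"status": "Success", "resolutionTime": "4:30:00"}
print(duration_to_minutes(record_outcome["resolutionTime"]))  # 270
```

The result, 270 minutes, matches the gap between decidedAt (12:00) and the outcome's recordedAt (16:30) in the record above.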

Decision Actions

Action          Description
Accept          Accept the risk (no action)
Remediate       Upgrade/patch the component
Quarantine      Isolate the component
Mitigate        Apply compensating controls (WAF, config)
Defer           Defer for later review
Escalate        Escalate to security team
FalsePositive   Mark as not applicable

Outcome Status

Status            Description
Success           Decision led to successful resolution
PartialSuccess    Decision led to partial resolution
Ineffective       Decision was ineffective
NegativeOutcome   Decision led to negative consequences
Pending           Outcome still pending
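
The API responses report a successRate, but this README does not say how each status counts toward it. The Python sketch below shows one plausible reading, under two assumptions that are not confirmed by the source: only Success counts as a success, and Pending records are left out of the denominator. success_rate is a hypothetical helper.

```python
from collections import Counter
from typing import Optional

# Assumption: these are the statuses with a known outcome; "Pending" is
# excluded because the outcome has not been recorded yet.
RESOLVED_STATUSES = {"Success", "PartialSuccess", "Ineffective", "NegativeOutcome"}


def success_rate(statuses) -> Optional[float]:
    """Share of resolved decisions that ended in Success, or None if none are resolved."""
    counts = Counter(s for s in statuses if s in RESOLVED_STATUSES)
    resolved = sum(counts.values())
    if resolved == 0:
        return None
    return counts["Success"] / resolved


history = ["Success", "Success", "PartialSuccess", "Pending", "Ineffective"]
print(success_rate(history))  # 0.5
```

Whether PartialSuccess should earn partial credit is a policy choice; the sketch treats it as a non-success to stay conservative.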

API Reference

Record a Decision

POST /api/v1/opsmemory/decisions
Content-Type: application/json

{
  "tenantId": "tenant-xyz",
  "cveId": "CVE-2023-44487",
  "componentPurl": "pkg:npm/http2@1.0.0",
  "severity": "high",
  "reachability": "reachable",
  "epssScore": 0.97,
  "action": "Remediate",
  "rationale": "KEV + reachable + payment service",
  "decidedBy": "alice@example.com",
  "contextTags": ["production", "payment-service"]
}

Response:

{
  "memoryId": "abc123def456",
  "recordedAt": "2026-01-07T12:00:00Z"
}
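
As an illustration, the request above could be built from a Python client as follows. This is a sketch using only the standard library; the base URL is a placeholder, authentication is omitted, and build_decision_request is a hypothetical helper.

```python
import json
import urllib.request

# Assumption: placeholder host; replace with the address of your deployment.
BASE_URL = "https://stellaops.example.internal"


def build_decision_request(decision: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that records a decision."""
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/opsmemory/decisions",
        data=json.dumps(decision).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_decision_request({
    "tenantId": "tenant-xyz",
    "cveId": "CVE-2023-44487",
    "componentPurl": "pkg:npm/http2@1.0.0",
    "severity": "high",
    "reachability": "reachable",
    "epssScore": 0.97,
    "action": "Remediate",
    "rationale": "KEV + reachable + payment service",
    "decidedBy": "alice@example.com",
    "contextTags": ["production", "payment-service"],
})

# Sending it would look like this; keep the returned memoryId, since the
# outcome endpoint needs it later:
# with urllib.request.urlopen(request) as response:
#     memory_id = json.load(response)["memoryId"]
```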

Record an Outcome

POST /api/v1/opsmemory/decisions/{memoryId}/outcome?tenantId=tenant-xyz
Content-Type: application/json

{
  "status": "Success",
  "resolutionTimeMinutes": 270,
  "lessonsLearned": "Upgrade was smooth, no breaking changes",
  "recordedBy": "alice@example.com"
}

Get Playbook Suggestions

GET /api/v1/opsmemory/suggestions?tenantId=tenant-xyz&cveId=CVE-2024-1234&severity=high&reachability=reachable

Response:

{
  "suggestions": [
    {
      "suggestedAction": "Remediate",
      "confidence": 0.87,
      "rationale": "87% confidence based on 15 similar past decisions. Remediation succeeded in 93% of high-severity reachable vulnerabilities.",
      "successRate": 0.93,
      "similarDecisionCount": 15,
      "averageResolutionTimeMinutes": 180,
      "evidence": [
        {
          "memoryId": "abc123",
          "similarity": 0.92,
          "action": "Remediate",
          "outcome": "Success",
          "cveId": "CVE-2023-44487"
        }
      ],
      "matchingFactors": [
        "Same severity: high",
        "Same reachability: Reachable",
        "Both are KEV",
        "Shared context: production"
      ]
    }
  ],
  "analyzedRecords": 15,
  "topSimilarity": 0.92
}
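
A client might build the query string and summarize the response along these lines. The sketch assumes the response shape shown above; suggestions_url, best_suggestion and describe are hypothetical helper names.

```python
from urllib.parse import urlencode


def suggestions_url(base_url, tenant_id, cve_id, severity, reachability):
    """Build the GET URL for playbook suggestions with properly encoded parameters."""
    query = urlencode({
        "tenantId": tenant_id,
        "cveId": cve_id,
        "severity": severity,
        "reachability": reachability,
    })
    return f"{base_url}/api/v1/opsmemory/suggestions?{query}"


def best_suggestion(response: dict):
    """Return the suggestion with the highest confidence, or None if there are none."""
    return max(response.get("suggestions", []),
               key=lambda s: s["confidence"], default=None)


def describe(suggestion: dict) -> str:
    """One-line summary of a suggestion for display next to a finding."""
    return (f"{suggestion['suggestedAction']} "
            f"(confidence {suggestion['confidence']:.2f}, "
            f"{suggestion['similarDecisionCount']} similar decisions, "
            f"success rate {suggestion['successRate']:.0%})")


response = {"suggestions": [{
    "suggestedAction": "Remediate",
    "confidence": 0.87,
    "successRate": 0.93,
    "similarDecisionCount": 15,
}]}
print(describe(best_suggestion(response)))
# Remediate (confidence 0.87, 15 similar decisions, success rate 93%)
```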

Query Past Decisions

GET /api/v1/opsmemory/decisions?tenantId=tenant-xyz&action=Remediate&pageSize=20

Get Statistics

GET /api/v1/opsmemory/stats?tenantId=tenant-xyz

Response:

{
  "tenantId": "tenant-xyz",
  "totalDecisions": 1250,
  "decisionsWithOutcomes": 980,
  "successRate": 0.87
}

Similarity Algorithm

OpsMemory uses a 50-dimensional vector to represent each security situation:

Dimensions   Feature
0-9          CVE category (memory, injection, auth, crypto, dos, etc.)
10-14        Severity (none, low, medium, high, critical)
15-18        Reachability (unknown, reachable, not-reachable, potential)
19-23        EPSS band (0-0.2, 0.2-0.4, 0.4-0.6, 0.6-0.8, 0.8-1.0)
24-28        CVSS band (0-2, 2-4, 4-6, 6-8, 8-10)
29           KEV flag
30-39        Component type (npm, maven, pypi, nuget, go, cargo, etc.)
40-49        Context tags (production, external-facing, payment, etc.)

Similarity is computed using cosine similarity between normalized vectors.
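
To make the layout concrete, here is a Python sketch of how a situation could be encoded and compared. It is not the service's implementation. Assumptions: each feature is one-hot (context tags multi-hot), values map to positions in the order the table lists them, a score on a band boundary goes to the higher band, and the CVE category dimensions are left at zero because they need a classifier the README does not describe. cvssScore is a hypothetical field; the sample record carries no CVSS score.

```python
import math
from bisect import bisect_right

SEVERITIES = ["none", "low", "medium", "high", "critical"]             # dims 10-14
REACHABILITY = ["unknown", "reachable", "not-reachable", "potential"]  # dims 15-18
EPSS_EDGES = [0.2, 0.4, 0.6, 0.8]                                      # dims 19-23
CVSS_EDGES = [2.0, 4.0, 6.0, 8.0]                                      # dims 24-28
ECOSYSTEMS = ["npm", "maven", "pypi", "nuget", "go", "cargo"]          # dims 30-39 (listed ones)
CONTEXT_TAGS = ["production", "external-facing", "payment"]            # dims 40-49 (listed ones)


def encode(situation: dict) -> list:
    """Encode a situation as a 50-dimensional vector following the table's offsets."""
    vector = [0.0] * 50
    # Dims 0-9 (CVE category) stay zero in this sketch.
    vector[10 + SEVERITIES.index(situation["severity"])] = 1.0
    vector[15 + REACHABILITY.index(situation["reachability"])] = 1.0
    vector[19 + bisect_right(EPSS_EDGES, situation["epssScore"])] = 1.0
    if situation.get("cvssScore") is not None:
        vector[24 + bisect_right(CVSS_EDGES, situation["cvssScore"])] = 1.0
    if situation.get("isKev"):
        vector[29] = 1.0
    # "pkg:npm/http2@1.0.0" -> "npm"
    ecosystem = situation["component"].split(":", 1)[1].split("/", 1)[0]
    if ecosystem in ECOSYSTEMS:
        vector[30 + ECOSYSTEMS.index(ecosystem)] = 1.0
    for tag in situation.get("contextTags", []):
        if tag in CONTEXT_TAGS:
            vector[40 + CONTEXT_TAGS.index(tag)] = 1.0
    return vector


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


current = {"severity": "high", "reachability": "reachable", "epssScore": 0.97,
           "isKev": True, "component": "pkg:npm/http2@1.0.0",
           "contextTags": ["production", "external-facing"]}
past = {"severity": "high", "reachability": "reachable", "epssScore": 0.5,
        "isKev": False, "component": "pkg:maven/org.example/lib@2.1.0",
        "contextTags": ["production"]}
print(round(cosine_similarity(encode(current), encode(past)), 3))  # 0.507
```

The two situations share three active dimensions (severity, reachability, production) out of seven and five respectively, giving 3 / sqrt(7 * 5), about 0.507. That is below the default minThreshold of 0.6, so the past record would not feed a suggestion.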

Integration Points

Decision Recording Hook

OpsMemory integrates with the Findings Ledger to automatically capture decisions:

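using System.Threading.Tasks;

public class OpsMemoryHook : IDecisionHook
{
    private readonly IOpsMemoryStore _store;

    public OpsMemoryHook(IOpsMemoryStore store) => _store = store;

    // Not marked async: nothing is awaited here, so return a completed task.
    public Task OnDecisionRecordedAsync(FindingDecision decision)
    {
        // ExtractSituation and ExtractDecision map the finding onto the
        // OpsMemory model (implementations omitted).
        var record = new OpsMemoryRecord
        {
            TenantId = decision.TenantId,
            Situation = ExtractSituation(decision),
            Decision = ExtractDecision(decision)
        };

        // Fire-and-forget so the decision flow is not blocked.
        // Exceptions from the discarded task are not observed by this hook.
        _ = _store.RecordDecisionAsync(record);
        return Task.CompletedTask;
    }
}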

Outcome Tracking

The OutcomeTrackingService monitors for resolution events and prompts users:

  1. Auto-detect resolution: When a finding is marked resolved
  2. Calculate resolution time: Time from decision to resolution
  3. Prompt for classification: Ask user about outcome quality
  4. Link to original decision: Update the OpsMemory record
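
Steps 2 to 4 can be sketched in Python as follows, assuming timestamps use the ISO 8601 form shown in the examples and that the outcome is submitted through the outcome endpoint from the API Reference. build_outcome and outcome_path are hypothetical helpers, not the service's code.

```python
from datetime import datetime
from urllib.parse import quote, urlencode


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z', as used in the examples."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def build_outcome(decided_at, resolved_at, status, lessons_learned, recorded_by):
    """Step 2 and 3: compute the resolution time and assemble the outcome payload."""
    elapsed = parse_utc(resolved_at) - parse_utc(decided_at)
    if elapsed.total_seconds() < 0:
        raise ValueError("resolution cannot precede the decision")
    return {
        "status": status,  # classification supplied by the user
        "resolutionTimeMinutes": int(elapsed.total_seconds() // 60),
        "lessonsLearned": lessons_learned,
        "recordedBy": recorded_by,
    }


def outcome_path(memory_id: str, tenant_id: str) -> str:
    """Step 4: path that links the outcome to the original decision."""
    return (f"/api/v1/opsmemory/decisions/{quote(memory_id, safe='')}/outcome"
            f"?{urlencode({'tenantId': tenant_id})}")


payload = build_outcome("2026-01-07T12:00:00Z", "2026-01-07T16:30:00Z",
                        "Success", "Upgrade was smooth, no breaking changes",
                        "alice@example.com")
print(payload["resolutionTimeMinutes"])  # 270
print(outcome_path("abc123def456", "tenant-xyz"))
# /api/v1/opsmemory/decisions/abc123def456/outcome?tenantId=tenant-xyz
```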

Configuration

opsmemory:
  connectionString: "Host=localhost;Database=stellaops"
  
  similarity:
    minThreshold: 0.6      # Minimum similarity for suggestions
    maxResults: 10         # Maximum similar records to analyze
    
  suggestions:
    maxSuggestions: 3      # Maximum suggestions to return
    minConfidence: 0.5     # Minimum confidence threshold
    
  outcomeTracking:
    autoPromptDelay: 24h   # Delay before prompting for outcome
    reminderInterval: 7d   # Reminder interval for pending outcomes
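
The similarity and suggestions settings read as two successive filters. The Python sketch below shows one way they could apply, assuming both thresholds are inclusive and results are ranked highest first; the README does not specify either detail, and select_similar and select_suggestions are hypothetical names.

```python
def select_similar(scored_records, min_threshold=0.6, max_results=10):
    """Keep (memory_id, similarity) pairs at or above the threshold, best first."""
    kept = [r for r in scored_records if r[1] >= min_threshold]
    kept.sort(key=lambda r: r[1], reverse=True)
    return kept[:max_results]


def select_suggestions(suggestions, min_confidence=0.5, max_suggestions=3):
    """Keep suggestions at or above the confidence threshold, best first."""
    kept = [s for s in suggestions if s["confidence"] >= min_confidence]
    kept.sort(key=lambda s: s["confidence"], reverse=True)
    return kept[:max_suggestions]


scored = [("mem-a", 0.92), ("mem-b", 0.58), ("mem-c", 0.75)]
print(select_similar(scored))
# [('mem-a', 0.92), ('mem-c', 0.75)]

candidates = [{"suggestedAction": "Remediate", "confidence": 0.87},
              {"suggestedAction": "Defer", "confidence": 0.31}]
print([s["suggestedAction"] for s in select_suggestions(candidates)])
# ['Remediate']
```

Raising minThreshold trades coverage for relevance: fewer past records qualify, but those that do are closer matches to the current situation.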

Database Schema

CREATE SCHEMA IF NOT EXISTS opsmemory;

CREATE TABLE opsmemory.decisions (
    memory_id TEXT PRIMARY KEY,
    tenant_id TEXT NOT NULL,
    recorded_at TIMESTAMPTZ NOT NULL,
    
    -- Situation (JSONB for flexibility)
    situation JSONB NOT NULL,
    
    -- Decision (JSONB)
    decision JSONB NOT NULL,
    
    -- Outcome (nullable, updated later)
    outcome JSONB,
    
    -- Similarity vector (array for simple cosine similarity)
    similarity_vector REAL[] NOT NULL
);

CREATE INDEX idx_decisions_tenant ON opsmemory.decisions(tenant_id);
CREATE INDEX idx_decisions_recorded ON opsmemory.decisions(recorded_at DESC);
CREATE INDEX idx_decisions_cve ON opsmemory.decisions((situation->>'cveId'));

Best Practices

Recording Decisions

  1. Include context tags: The more context you record, the better the similarity matching
  2. Document rationale: Future users benefit from understanding why
  3. Reference policies: Link to the policy that guided the decision

Recording Outcomes

  1. Be timely: Record outcomes as soon as resolution is confirmed
  2. Be honest: Failed decisions are valuable learning data
  3. Add lessons learned: Help future users avoid pitfalls

Using Suggestions

  1. Review evidence: Look at the similar past decisions
  2. Check matching factors: Ensure the situations are truly comparable
  3. Trust but verify: Suggestions are guidance, not mandates