# OpsMemory Module

> **Decision Ledger for Playbook Learning**

OpsMemory is a structured ledger of prior security decisions and their outcomes. It enables playbook learning - understanding which decisions led to good outcomes and surfacing institutional knowledge for similar situations.

## What OpsMemory Is

- ✅ **Decision + Outcome pairs**: Every security decision is recorded with its eventual outcome
- ✅ **Success/failure classification**: Learn what worked and what didn't
- ✅ **Similar situation matching**: Find past decisions in comparable scenarios
- ✅ **Playbook suggestions**: Surface recommendations based on historical success

## What OpsMemory Is NOT

- ❌ Chat history (that's conversation storage)
- ❌ Audit logs (that's the Timeline)
- ❌ VEX statements (that's Excititor)

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                      OpsMemory Service                      │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌──────────────────┐  ┌───────────────┐   │
│  │  Decision   │  │    Playbook      │  │   Outcome     │   │
│  │  Recording  │  │   Suggestion     │  │   Tracking    │   │
│  └──────┬──────┘  └────────┬─────────┘  └───────┬───────┘   │
│         │                  │                    │           │
│         ▼                  ▼                    ▼           │
│  ┌─────────────────────────────────────────────────────┐    │
│  │                   IOpsMemoryStore                   │    │
│  │        (PostgreSQL with similarity vectors)         │    │
│  └─────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────┘
```

## Core Components

### OpsMemoryRecord

The core data structure capturing a decision and its context:

```json
{
  "memoryId": "mem-abc123",
  "tenantId": "tenant-xyz",
  "recordedAt": "2026-01-07T12:00:00Z",

  "situation": {
    "cveId": "CVE-2023-44487",
    "component": "pkg:npm/http2@1.0.0",
    "severity": "high",
    "reachability": "reachable",
    "epssScore": 0.97,
    "isKev": true,
    "contextTags": ["production", "external-facing", "payment-service"]
  },

  "decision": {
    "action": "Remediate",
    "rationale": "KEV + reachable + payment service = immediate remediation",
    "decidedBy": "security-team",
    "decidedAt": "2026-01-07T12:00:00Z",
    "policyReference": "policy/critical-kev.rego"
  },

  "outcome": {
    "status": "Success",
    "resolutionTime": "4:30:00",
    "lessonsLearned": "Upgrade was smooth, no breaking changes",
    "recordedAt": "2026-01-07T16:30:00Z"
  }
}
```

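As a rough illustration of how a consumer might read such a record, the sketch below parses a trimmed copy of the JSON above with `System.Text.Json` and converts `outcome.resolutionTime` into minutes. It assumes `resolutionTime` is serialized in the .NET `TimeSpan` form (`h:mm:ss`) and that `outcome` is missing or `null` until an outcome is recorded; neither detail is stated explicitly in this document.

```csharp
using System;
using System.Globalization;
using System.Text.Json;

// Trimmed copy of the example record (raw string literal, C# 11+).
const string json = """
{
  "memoryId": "mem-abc123",
  "situation": { "cveId": "CVE-2023-44487", "isKev": true },
  "decision": { "action": "Remediate" },
  "outcome": { "status": "Success", "resolutionTime": "4:30:00" }
}
""";

using JsonDocument doc = JsonDocument.Parse(json);
JsonElement root = doc.RootElement;

string cveId = root.GetProperty("situation").GetProperty("cveId").GetString()!;
string action = root.GetProperty("decision").GetProperty("action").GetString()!;

// Assumption: "outcome" is absent or null until an outcome has been recorded.
double? resolutionMinutes = null;
if (root.TryGetProperty("outcome", out JsonElement outcome)
    && outcome.ValueKind == JsonValueKind.Object)
{
    // Assumption: resolutionTime uses the TimeSpan "h:mm:ss" form.
    TimeSpan elapsed = TimeSpan.Parse(
        outcome.GetProperty("resolutionTime").GetString()!,
        CultureInfo.InvariantCulture);
    resolutionMinutes = elapsed.TotalMinutes;
}

Console.WriteLine($"{cveId}: {action}, resolved in {resolutionMinutes} min");
```

Under that assumption, `4:30:00` corresponds to 270 minutes, which lines up with the `resolutionTimeMinutes` value used in the outcome request later in this document.
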
### Decision Actions

| Action | Description |
|--------|-------------|
| `Accept` | Accept the risk (no action) |
| `Remediate` | Upgrade/patch the component |
| `Quarantine` | Isolate the component |
| `Mitigate` | Apply compensating controls (WAF, config) |
| `Defer` | Defer for later review |
| `Escalate` | Escalate to security team |
| `FalsePositive` | Mark as not applicable |

### Outcome Status

| Status | Description |
|--------|-------------|
| `Success` | Decision led to successful resolution |
| `PartialSuccess` | Decision led to partial resolution |
| `Ineffective` | Decision was ineffective |
| `NegativeOutcome` | Decision led to negative consequences |
| `Pending` | Outcome still pending |

## API Reference

### Record a Decision

```http
POST /api/v1/opsmemory/decisions
Content-Type: application/json

{
  "tenantId": "tenant-xyz",
  "cveId": "CVE-2023-44487",
  "componentPurl": "pkg:npm/http2@1.0.0",
  "severity": "high",
  "reachability": "reachable",
  "epssScore": 0.97,
  "action": "Remediate",
  "rationale": "KEV + reachable + payment service",
  "decidedBy": "alice@example.com",
  "contextTags": ["production", "payment-service"]
}
```

**Response:**
```json
{
  "memoryId": "abc123def456",
  "recordedAt": "2026-01-07T12:00:00Z"
}
```

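For callers working from .NET, the request above could be assembled roughly as follows. This is a sketch only: the base address is a placeholder, no authentication is shown, and the request is built but not sent so the snippet runs without a live service.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Json;

// Hypothetical base address; substitute your deployment's URL.
var baseAddress = new Uri("https://stellaops.example.internal");

// Field names mirror the request body documented above.
var payload = new
{
    tenantId = "tenant-xyz",
    cveId = "CVE-2023-44487",
    componentPurl = "pkg:npm/http2@1.0.0",
    severity = "high",
    reachability = "reachable",
    epssScore = 0.97,
    action = "Remediate",
    rationale = "KEV + reachable + payment service",
    decidedBy = "alice@example.com",
    contextTags = new[] { "production", "payment-service" }
};

using var request = new HttpRequestMessage(
    HttpMethod.Post, new Uri(baseAddress, "/api/v1/opsmemory/decisions"))
{
    // JsonContent serializes the payload and sets a JSON Content-Type header.
    Content = JsonContent.Create(payload)
};

Console.WriteLine($"{request.Method} {request.RequestUri}");
// The default encoder escapes '+' as \u002B; the JSON is still equivalent.
Console.WriteLine(await request.Content!.ReadAsStringAsync());

// To send it (requires a reachable OpsMemory service):
// using var client = new HttpClient();
// using var response = await client.SendAsync(request);
// response.EnsureSuccessStatusCode();
```
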
### Record an Outcome

```http
POST /api/v1/opsmemory/decisions/{memoryId}/outcome?tenantId=tenant-xyz
Content-Type: application/json

{
  "status": "Success",
  "resolutionTimeMinutes": 270,
  "lessonsLearned": "Upgrade was smooth, no breaking changes",
  "recordedBy": "alice@example.com"
}
```

### Get Playbook Suggestions

```http
GET /api/v1/opsmemory/suggestions?tenantId=tenant-xyz&cveId=CVE-2024-1234&severity=high&reachability=reachable
```

**Response:**
```json
{
  "suggestions": [
    {
      "suggestedAction": "Remediate",
      "confidence": 0.87,
      "rationale": "87% confidence based on 15 similar past decisions. Remediation succeeded in 93% of high-severity reachable vulnerabilities.",
      "successRate": 0.93,
      "similarDecisionCount": 15,
      "averageResolutionTimeMinutes": 180,
      "evidence": [
        {
          "memoryId": "abc123",
          "similarity": 0.92,
          "action": "Remediate",
          "outcome": "Success",
          "cveId": "CVE-2023-44487"
        }
      ],
      "matchingFactors": [
        "Same severity: high",
        "Same reachability: Reachable",
        "Both are KEV",
        "Shared context: production"
      ]
    }
  ],
  "analyzedRecords": 15,
  "topSimilarity": 0.92
}
```

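The document does not spell out how `successRate` and `similarDecisionCount` are derived. One plausible reading, sketched below with made-up data, is a per-action aggregation over the similar past decisions. The sample records, the choice to exclude `Pending` outcomes, and the choice to count only `Success` (not `PartialSuccess`) as a success are all assumptions of this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical similar past decisions: (action, outcome status, similarity).
var similar = new List<(string Action, string Outcome, double Similarity)>
{
    ("Remediate", "Success", 0.92),
    ("Remediate", "Success", 0.85),
    ("Remediate", "Ineffective", 0.71),
    ("Mitigate", "PartialSuccess", 0.66),
    ("Remediate", "Pending", 0.64),
};

var perAction = similar
    .Where(r => r.Outcome != "Pending")   // keep only decisions with a known outcome
    .GroupBy(r => r.Action)
    .Select(g => (
        Action: g.Key,
        Count: g.Count(),
        SuccessRate: g.Count(r => r.Outcome == "Success") / (double)g.Count()))
    .OrderByDescending(s => s.SuccessRate)
    .ThenByDescending(s => s.Count)
    .ToList();

foreach (var s in perAction)
    Console.WriteLine($"{s.Action}: {s.SuccessRate:P0} success over {s.Count} decisions");
```

With this sample data, `Remediate` ranks first with two successes out of three resolved decisions. How the service turns such figures into a `confidence` value is not covered here.
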
### Query Past Decisions

```http
GET /api/v1/opsmemory/decisions?tenantId=tenant-xyz&action=Remediate&pageSize=20
```

### Get Statistics

```http
GET /api/v1/opsmemory/stats?tenantId=tenant-xyz
```

**Response:**
```json
{
  "tenantId": "tenant-xyz",
  "totalDecisions": 1250,
  "decisionsWithOutcomes": 980,
  "successRate": 0.87
}
```

## Similarity Algorithm

OpsMemory uses a 50-dimensional vector to represent each security situation:

| Dimensions | Feature |
|------------|---------|
| 0-9 | CVE category (memory, injection, auth, crypto, dos, etc.) |
| 10-14 | Severity (none, low, medium, high, critical) |
| 15-18 | Reachability (unknown, reachable, not-reachable, potential) |
| 19-23 | EPSS band (0-0.2, 0.2-0.4, 0.4-0.6, 0.6-0.8, 0.8-1.0) |
| 24-28 | CVSS band (0-2, 2-4, 4-6, 6-8, 8-10) |
| 29 | KEV flag |
| 30-39 | Component type (npm, maven, pypi, nuget, go, cargo, etc.) |
| 40-49 | Context tags (production, external-facing, payment, etc.) |

Similarity is computed using **cosine similarity** between normalized vectors.

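To make the layout concrete, here is a minimal sketch that one-hot encodes a few of the features from the table and compares two situations. The group offsets come from the table; the order of values within each group, the handling of band boundaries, and the helper names `Slot`, `Encode`, and `Cosine` are assumptions for illustration. CVE category, CVSS band, component type, and context tags are left at zero for brevity.

```csharp
using System;

// Value order inside each group is assumed to match the order listed in the table.
string[] severities = { "none", "low", "medium", "high", "critical" };              // dims 10-14
string[] reachabilities = { "unknown", "reachable", "not-reachable", "potential" }; // dims 15-18

int Slot(string[] values, string value)
{
    int i = Array.IndexOf(values, value);
    if (i < 0) throw new ArgumentException($"Unknown value '{value}'");
    return i;
}

float[] Encode(string severity, string reachability, double epss, bool isKev)
{
    var v = new float[50];
    v[10 + Slot(severities, severity)] = 1f;
    v[15 + Slot(reachabilities, reachability)] = 1f;
    v[19 + Math.Min((int)(epss * 5), 4)] = 1f;   // five EPSS bands of width 0.2
    if (isKev) v[29] = 1f;                       // KEV flag
    return v;
}

double Cosine(float[] a, float[] b)
{
    double dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
    }
    // Dividing by both norms makes pre-normalizing the vectors unnecessary.
    return na == 0 || nb == 0 ? 0 : dot / (Math.Sqrt(na) * Math.Sqrt(nb));
}

var current = Encode("high", "reachable", 0.97, isKev: true);
var past = Encode("high", "reachable", 0.55, isKev: false);

Console.WriteLine(Cosine(current, current)); // 1
Console.WriteLine(Cosine(current, past));    // ≈ 0.577
```

In this toy comparison the two situations share severity and reachability but differ in EPSS band and KEV status, so the score lands just under the default `minThreshold` of 0.6 shown in the configuration section.
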
## Integration Points

### Decision Recording Hook

OpsMemory integrates with the Findings Ledger to automatically capture decisions:

```csharp
public class OpsMemoryHook : IDecisionHook
{
    private readonly IOpsMemoryStore _store;

    // Constructor injection of the store is assumed here.
    public OpsMemoryHook(IOpsMemoryStore store) => _store = store;

    // Not marked async: nothing is awaited, so a completed task is returned.
    public Task OnDecisionRecordedAsync(FindingDecision decision)
    {
        var record = new OpsMemoryRecord
        {
            TenantId = decision.TenantId,
            Situation = ExtractSituation(decision),
            Decision = ExtractDecision(decision)
        };

        // Fire-and-forget so the decision flow is not blocked.
        // Failures of the discarded task are not observed here.
        _ = _store.RecordDecisionAsync(record);
        return Task.CompletedTask;
    }
}
```

### Outcome Tracking

The OutcomeTrackingService monitors for resolution events and prompts users:

1. **Auto-detect resolution**: When a finding is marked resolved
2. **Calculate resolution time**: Time from decision to resolution
3. **Prompt for classification**: Ask user about outcome quality
4. **Link to original decision**: Update the OpsMemory record

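Steps 2 and 4 can be sketched as below, using the timestamps from the example record in this document. The variable names are illustrative, and rounding to whole minutes is an assumption; the service's actual rounding behavior is not documented here.

```csharp
using System;
using System.Globalization;

// Timestamps taken from the example OpsMemoryRecord.
var decidedAt = DateTimeOffset.Parse("2026-01-07T12:00:00Z", CultureInfo.InvariantCulture);
var resolvedAt = DateTimeOffset.Parse("2026-01-07T16:30:00Z", CultureInfo.InvariantCulture);

// Step 2: resolution time is the span from decision to resolution.
TimeSpan resolutionTime = resolvedAt - decidedAt;
int resolutionTimeMinutes = (int)Math.Round(resolutionTime.TotalMinutes);

// Step 4: the outcome is attached to the original decision via its memoryId.
string memoryId = "mem-abc123";
string tenantId = "tenant-xyz";
string outcomePath =
    $"/api/v1/opsmemory/decisions/{Uri.EscapeDataString(memoryId)}/outcome" +
    $"?tenantId={Uri.EscapeDataString(tenantId)}";

Console.WriteLine(resolutionTime.ToString("c")); // 04:30:00
Console.WriteLine(resolutionTimeMinutes);        // 270
Console.WriteLine(outcomePath);
```
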
## Configuration

```yaml
opsmemory:
  connectionString: "Host=localhost;Database=stellaops"

  similarity:
    minThreshold: 0.6       # Minimum similarity for suggestions
    maxResults: 10          # Maximum similar records to analyze

  suggestions:
    maxSuggestions: 3       # Maximum suggestions to return
    minConfidence: 0.5      # Minimum confidence threshold

  outcomeTracking:
    autoPromptDelay: 24h    # Delay before prompting for outcome
    reminderInterval: 7d    # Reminder interval for pending outcomes
```

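As an illustration of what the `similarity` settings mean in practice, the sketch below applies `minThreshold` and `maxResults` to a hypothetical list of scored records. Treating the threshold as inclusive and keeping the highest scores first are assumptions of this sketch, as are the sample memory IDs.

```csharp
using System;
using System.Linq;

// Values mirror the YAML above.
const double minThreshold = 0.6;
const int maxResults = 10;

// Hypothetical (memoryId, similarity) scores from a similarity search.
var scored = new (string MemoryId, double Similarity)[]
{
    ("mem-001", 0.92), ("mem-002", 0.58), ("mem-003", 0.74), ("mem-004", 0.60),
};

var candidates = scored
    .Where(s => s.Similarity >= minThreshold)   // drop weak matches
    .OrderByDescending(s => s.Similarity)       // best matches first
    .Take(maxResults)                           // cap the analysis set
    .ToArray();

foreach (var c in candidates)
    Console.WriteLine($"{c.MemoryId} {c.Similarity.ToString("F2", System.Globalization.CultureInfo.InvariantCulture)}");
```

Here `mem-002` is dropped for scoring below the threshold, and the remaining three records are passed on in descending order of similarity.
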
## Database Schema

```sql
CREATE SCHEMA IF NOT EXISTS opsmemory;

CREATE TABLE opsmemory.decisions (
    memory_id TEXT PRIMARY KEY,
    tenant_id TEXT NOT NULL,
    recorded_at TIMESTAMPTZ NOT NULL,

    -- Situation (JSONB for flexibility)
    situation JSONB NOT NULL,

    -- Decision (JSONB)
    decision JSONB NOT NULL,

    -- Outcome (nullable, updated later)
    outcome JSONB,

    -- Similarity vector (array for simple cosine similarity)
    similarity_vector REAL[] NOT NULL
);

CREATE INDEX idx_decisions_tenant ON opsmemory.decisions(tenant_id);
CREATE INDEX idx_decisions_recorded ON opsmemory.decisions(recorded_at DESC);
CREATE INDEX idx_decisions_cve ON opsmemory.decisions((situation->>'cveId'));
```

## Best Practices

### Recording Decisions

1. **Include context tags**: The more context, the better similarity matching
2. **Document rationale**: Future users benefit from understanding why
3. **Reference policies**: Link to the policy that guided the decision

### Recording Outcomes

1. **Be timely**: Record outcomes as soon as resolution is confirmed
2. **Be honest**: Failed decisions are valuable learning data
3. **Add lessons learned**: Help future users avoid pitfalls

### Using Suggestions

1. **Review evidence**: Look at the similar past decisions
2. **Check matching factors**: Ensure the situations are truly comparable
3. **Trust but verify**: Suggestions are guidance, not mandates

## Related Modules

- [Findings Ledger](../findings-ledger/README.md) - Source of decision events
- [Timeline](../timeline-indexer/README.md) - Audit trail
- [Excititor](../excititor/README.md) - VEX statement management
- [Risk Engine](../risk-engine/README.md) - Risk scoring