
Valkey Advisory Cache

Per SPRINT_8200_0013_0001.

Overview

The Valkey Advisory Cache provides sub-20ms read latency for canonical advisory lookups by caching advisory data and maintaining fast-path indexes. The cache integrates with the Concelier CanonicalAdvisoryService as a read-through cache with automatic population and invalidation.

Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                         Advisory Cache Flow                                 │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│    ┌───────────┐    miss    ┌───────────┐    fetch    ┌───────────────┐    │
│    │  Client   │ ─────────► │   Valkey  │ ──────────► │   PostgreSQL  │    │
│    │  Request  │            │   Cache   │             │   Canonical   │    │
│    └───────────┘            └───────────┘             │   Store       │    │
│         ▲                        │                    └───────────────┘    │
│         │                        │ hit                      │              │
│         │                        ▼                          │              │
│         └──────────────── (< 20ms) ◄────────────────────────┘              │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Configuration

Configure in concelier.yaml:

ConcelierCache:
  # Valkey connection settings
  ConnectionString: "localhost:6379,abortConnect=false,connectTimeout=5000"
  Database: 0
  InstanceName: "concelier:"

  # TTL settings by interest score tier
  TtlPolicy:
    HighInterestTtlHours: 24    # Interest score >= 0.7
    MediumInterestTtlHours: 4   # Interest score >= 0.3 and < 0.7
    LowInterestTtlHours: 1      # Interest score < 0.3
    DefaultTtlHours: 2

  # Index settings
  HotSetMaxSize: 10000          # Max entries in rank:hot
  EnablePurlIndex: true
  EnableCveIndex: true

  # Connection pool settings
  PoolSize: 20
  ConnectRetryCount: 3
  ReconnectRetryDelayMs: 1000

  # Fallback behavior when Valkey unavailable
  FallbackToPostgres: true
  HealthCheckIntervalSeconds: 30

Key Schema

Advisory Entry

Key: advisory:{merge_hash}

Value: JSON-serialized CanonicalAdvisory

TTL: Based on interest score tier

{
  "id": "uuid",
  "cve": "CVE-2024-1234",
  "affects_key": "pkg:npm/express@4.0.0",
  "merge_hash": "sha256:a1b2c3...",
  "severity": "high",
  "interest_score": 0.85,
  "title": "...",
  "updated_at": "2025-01-15T10:30:00Z"
}

Hot Set (Sorted Set)

Key: rank:hot

Score: Interest score (0.0 - 1.0)

Member: merge_hash

Stores the top advisories by interest score for quick access, capped at HotSetMaxSize entries.
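
As a rough illustration of how a capped hot set behaves, the following Python sketch models rank:hot as an in-memory mapping. The `HotSet` class and its methods are hypothetical and not part of Concelier; the Valkey commands named in the comments are only approximate equivalents.

```python
class HotSet:
    """In-memory stand-in for the rank:hot sorted set (member -> score)."""

    def __init__(self, max_size: int = 10000) -> None:
        self.max_size = max_size  # mirrors HotSetMaxSize
        self.scores: dict[str, float] = {}

    def add(self, merge_hash: str, interest_score: float) -> None:
        # Roughly: ZADD rank:hot <interest_score> <merge_hash>
        self.scores[merge_hash] = interest_score
        # Drop the lowest-scored members beyond the cap, roughly:
        # ZREMRANGEBYRANK rank:hot 0 -(max_size + 1)
        overflow = len(self.scores) - self.max_size
        if overflow > 0:
            lowest_first = sorted(self.scores, key=lambda m: (self.scores[m], m))
            for member in lowest_first[:overflow]:
                del self.scores[member]

    def top(self, count: int = 100) -> list[str]:
        # Highest interest score first, roughly: ZRANGE rank:hot 0 <count - 1> REV
        ranked = sorted(self.scores, key=lambda m: (-self.scores[m], m))
        return ranked[:count]
```

Trimming on every insert keeps the set bounded without a separate cleanup job; whether the real service trims eagerly or periodically is not stated here.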

PURL Index (Set)

Key: by:purl:{normalized_purl}

Members: Set of merge_hash values

Maps package URLs to affected advisories.

Example: by:purl:pkg:npm/express@4.0.0 → {sha256:a1b2c3..., sha256:d4e5f6...}

CVE Index (Set)

Key: by:cve:{cve_id}

Members: Set of merge_hash values

Maps CVE IDs to canonical advisories.

Example: by:cve:cve-2024-1234 → {sha256:a1b2c3...}
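
To make the key schema concrete, here is a small Python sketch that builds the three key shapes and resolves a PURL through its index. The helper names and the plain `dict` standing in for Valkey are assumptions for illustration; PURL normalization is left to the caller because its rules are not described here.

```python
def advisory_key(merge_hash: str) -> str:
    return f"advisory:{merge_hash}"


def purl_index_key(normalized_purl: str) -> str:
    # Assumption: the caller has already normalized the PURL.
    return f"by:purl:{normalized_purl}"


def cve_index_key(cve_id: str) -> str:
    # Lower-cased to match the documented example key by:cve:cve-2024-1234.
    return f"by:cve:{cve_id.lower()}"


def lookup_by_purl(store: dict, normalized_purl: str) -> list:
    """Resolve a PURL to advisories via its set index.

    Index members whose advisory entry has expired are skipped.
    """
    merge_hashes = store.get(purl_index_key(normalized_purl), set())
    found = []
    for merge_hash in sorted(merge_hashes):
        advisory = store.get(advisory_key(merge_hash))
        if advisory is not None:
            found.append(advisory)
    return found
```

Because index sets and advisory entries expire independently, a lookup has to tolerate index members that no longer resolve to a cached advisory.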

Operations

Get Advisory

// Service interface
public interface IAdvisoryCacheService
{
    Task<CanonicalAdvisory?> GetAsync(string mergeHash, CancellationToken ct = default);
    Task SetAsync(CanonicalAdvisory advisory, CancellationToken ct = default);
    Task InvalidateAsync(string mergeHash, CancellationToken ct = default);
    Task<IReadOnlyList<CanonicalAdvisory>> GetByPurlAsync(string purl, CancellationToken ct = default);
    Task<IReadOnlyList<CanonicalAdvisory>> GetHotAsync(int count = 100, CancellationToken ct = default);
    Task IndexPurlAsync(string purl, string mergeHash, CancellationToken ct = default);
    Task IndexCveAsync(string cve, string mergeHash, CancellationToken ct = default);
}

Read-Through Cache

1. GetAsync(mergeHash) called
2. Check Valkey: GET advisory:{mergeHash}
   ├─ Hit: deserialize and return
   └─ Miss: fetch from PostgreSQL, cache result, return
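
The steps above can be sketched in a few lines of Python. This is a minimal model, not the service's implementation: `ReadThroughCache` is a hypothetical name, a dict stands in for Valkey, and `load_from_store` stands in for the PostgreSQL lookup.

```python
import json
from typing import Callable, Optional


class ReadThroughCache:
    def __init__(self, load_from_store: Callable[[str], Optional[dict]]) -> None:
        self._entries: dict[str, str] = {}  # stand-in for Valkey string keys
        self._load = load_from_store        # stand-in for the PostgreSQL lookup
        self.hits = 0
        self.misses = 0

    def get(self, merge_hash: str) -> Optional[dict]:
        key = f"advisory:{merge_hash}"
        cached = self._entries.get(key)
        if cached is not None:
            # Hit: deserialize and return
            self.hits += 1
            return json.loads(cached)
        # Miss: fetch from the backing store, cache the result, return it
        self.misses += 1
        advisory = self._load(merge_hash)
        if advisory is not None:
            self._entries[key] = json.dumps(advisory)
        return advisory
```

In this sketch, unknown merge hashes are not cached, so repeated lookups for a missing advisory reach the backing store each time.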

Cache Population

Advisories are cached when:

  • First read (read-through)
  • Ingested from source connectors
  • Imported from federation bundles
  • Updated by merge operations

Cache Invalidation

Invalidation occurs when:

  • Advisory is updated (re-merge with new data)
  • Advisory is withdrawn
  • Manual cache flush requested

// Invalidate single advisory
await cacheService.InvalidateAsync(mergeHash, ct);

// Invalidate multiple (e.g., after bulk import)
await cacheService.InvalidateManyAsync(mergeHashes, ct);
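
One plausible shape for invalidation, sketched in Python with dicts standing in for Valkey. It assumes that invalidating an advisory also removes its hot set member and its index memberships, which is not confirmed here; the function names are hypothetical.

```python
def invalidate(entries: dict, hot_set: dict, indexes: dict, merge_hash: str) -> bool:
    """Remove one advisory and references to it. Returns True if it was cached."""
    removed = entries.pop(f"advisory:{merge_hash}", None) is not None
    hot_set.pop(merge_hash, None)
    for key in list(indexes):
        indexes[key].discard(merge_hash)
        if not indexes[key]:
            # Valkey deletes a set key once its last member is removed.
            del indexes[key]
    return removed


def invalidate_many(entries: dict, hot_set: dict, indexes: dict, merge_hashes) -> int:
    """Invalidate several advisories. Returns how many were actually cached."""
    return sum(invalidate(entries, hot_set, indexes, h) for h in merge_hashes)
```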

TTL Policy

Interest score determines TTL tier:

| Interest Score | TTL | Rationale |
| --- | --- | --- |
| >= 0.7 (High) | 24 hours | Hot advisories: likely to be queried frequently |
| >= 0.3 and < 0.7 (Medium) | 4 hours | Moderate interest: balance between freshness and cache hits |
| < 0.3 (Low) | 1 hour | Low interest: evict quickly to save memory |

The TTL is set when the advisory is cached:

var ttl = ttlPolicy.GetTtl(advisory.InterestScore);
await cache.SetAsync(key, advisory, ttl, ct);
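
The tier selection can be sketched as follows. This Python function is illustrative only: `get_ttl` and its parameters are hypothetical names, the defaults mirror the TtlPolicy values in the configuration, and applying DefaultTtlHours when no score is available is an assumption.

```python
from datetime import timedelta
from typing import Optional


def get_ttl(
    interest_score: Optional[float],
    high_hours: int = 24,
    medium_hours: int = 4,
    low_hours: int = 1,
    default_hours: int = 2,
) -> timedelta:
    if interest_score is None:
        # Assumption: the default TTL covers advisories without a score.
        return timedelta(hours=default_hours)
    if interest_score >= 0.7:
        return timedelta(hours=high_hours)
    if interest_score >= 0.3:
        return timedelta(hours=medium_hours)
    return timedelta(hours=low_hours)
```

With these thresholds, a score of exactly 0.7 lands in the high tier and exactly 0.3 in the medium tier.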

Monitoring

Metrics

| Metric | Type | Description |
| --- | --- | --- |
| concelier_cache_hits_total | Counter | Total cache hits |
| concelier_cache_misses_total | Counter | Total cache misses |
| concelier_cache_hit_rate | Gauge | Hit rate (hits / total) |
| concelier_cache_latency_ms | Histogram | Cache operation latency |
| concelier_cache_size_bytes | Gauge | Estimated cache memory usage |
| concelier_cache_hot_set_size | Gauge | Entries in rank:hot |
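
For reference, the hit rate gauge can be derived from the two counters, reading "total" as hits plus misses. The helper below is a hypothetical Python sketch; returning 0.0 before any lookups have occurred is an assumption.

```python
def hit_rate(hits: int, misses: int) -> float:
    """Fraction of lookups served from the cache."""
    total = hits + misses
    # Avoid dividing by zero before the first lookup.
    return hits / total if total else 0.0
```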

OpenTelemetry Spans

Cache operations emit spans:

concelier.cache.get
  ├── cache.key: "advisory:sha256:..."
  ├── cache.hit: true/false
  └── cache.latency_ms: 2.5

concelier.cache.set
  ├── cache.key: "advisory:sha256:..."
  └── cache.ttl_hours: 24

Health Check

GET /health/cache

Response:

{
  "status": "healthy",
  "valkey_connected": true,
  "latency_ms": 1.2,
  "hot_set_size": 8542,
  "hit_rate_1h": 0.87
}

Performance

Benchmarks

| Operation | p50 | p95 | p99 |
| --- | --- | --- | --- |
| GetAsync (hit) | 1.2ms | 3.5ms | 8.0ms |
| GetAsync (miss + populate) | 12ms | 25ms | 45ms |
| SetAsync | 1.5ms | 4.0ms | 9.0ms |
| GetByPurlAsync | 2.5ms | 6.0ms | 15ms |
| GetHotAsync(100) | 3.0ms | 8.0ms | 18ms |

Optimization Tips

  1. Connection Pooling: Use a shared connection multiplexer with PoolSize: 20

  2. Pipeline Reads: For bulk operations, use pipelining:

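    var batch = cache.CreateBatch();
    // Queue every read before sending so they share one round trip
    var tasks = mergeHashes.Select(hash => batch.GetAsync(hash)).ToList();
    batch.Execute();
    // Wait for all queued reads to complete
    await Task.WhenAll(tasks);
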
  3. Hot Set Preload: Run a warmup job on startup to preload the hot set

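  4. Compression: For large advisories, compress the serialized JSON before writing it to the cache. Valkey's built-in LZF compression (rdbcompression) applies to RDB snapshots rather than to values held in memory.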

Fallback Behavior

When Valkey is unavailable:

  1. FallbackToPostgres: true (default)

    • All reads go directly to PostgreSQL
    • Performance degrades but system remains operational
    • Reconnection attempts continue in background

  2. FallbackToPostgres: false

    • Cache misses return null/empty
    • Only cached data is accessible
    • Use for strict latency requirements
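
The two modes can be modeled roughly as below. This Python sketch reflects one reading of the behavior described above: the flag only matters when the cache itself cannot be reached, and a normal miss still reads through to the store. `CacheUnavailable`, `get_advisory`, and the callables are hypothetical names.

```python
class CacheUnavailable(Exception):
    """Raised by the cache client when Valkey cannot be reached."""


def get_advisory(merge_hash, cache_get, store_get, fallback_to_postgres=True):
    try:
        cached = cache_get(merge_hash)
    except CacheUnavailable:
        if fallback_to_postgres:
            # Degraded mode: slower, but the system stays operational.
            return store_get(merge_hash)
        # Strict mode: never pay the database latency when the cache is down.
        return None
    if cached is not None:
        return cached
    # Normal read-through miss while Valkey is healthy.
    return store_get(merge_hash)
```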

Troubleshooting

Common Issues

| Issue | Cause | Solution |
| --- | --- | --- |
| High miss rate | Cache cold / insufficient TTL | Run warmup job, increase TTLs |
| Latency spikes | Connection exhaustion | Increase PoolSize |
| Memory pressure | Too many cached advisories | Reduce HotSetMaxSize, lower TTLs |
| Index stale | Invalidation not triggered | Check event handlers, verify IndexPurlAsync calls |

Debug Commands

# Check cache stats
stella cache stats

# View hot set
stella cache list-hot --limit 10

# Check specific advisory
stella cache get sha256:mergehash...

# Flush cache
stella cache flush --confirm

# Check PURL index
stella cache lookup-purl pkg:npm/express@4.0.0

Valkey CLI

# Connect to Valkey (valkey-cli takes the same options)
redis-cli -h localhost -p 6379

# Check memory usage
INFO memory

# List the 10 highest-scored hot set entries
ZRANGE rank:hot 0 9 REV WITHSCORES

# Check PURL index
SMEMBERS by:purl:pkg:npm/express@4.0.0

# Get advisory
GET advisory:sha256:a1b2c3...