CVE → Symbol Mapping

Last updated: 2025-12-22. Owner: Scanner Guild + Concelier Guild.

This document describes how Stella Ops maps CVE identifiers to specific binary symbols/functions for precise reachability analysis.


1. Overview

To determine if a vulnerability is reachable, we need to know which specific functions are affected. The CVE→Symbol Mapping service bridges:

  • CVE identifiers (e.g., CVE-2024-1234)
  • Package coordinates (e.g., pkg:npm/lodash@4.17.21)
  • Affected symbols (e.g., lodash.template, openssl:EVP_PKEY_decrypt)

2. Data Sources

2.1 Patch Diff Analysis

The highest-fidelity source: analyze the git commits that fix a vulnerability and treat the functions they touch as the affected symbols.

CVE-2024-1234 fixed in commit abc123
  → Diff shows changes to:
    - src/crypto.c: EVP_PKEY_decrypt() [modified]
    - src/crypto.c: decrypt_internal() [added guard]
  → Affected symbols: EVP_PKEY_decrypt, decrypt_internal

Implementation: StellaOps.Scanner.VulnSurfaces.PatchDiffAnalyzer

2.2 Advisory Metadata

Structured advisories with function-level detail:

  • OSV (affected[].ranges[].events[].introduced/fixed)
  • NVD CPE with CWE → typical affected patterns
  • Vendor advisories (GitHub, npm, PyPI security advisories)

Implementation: StellaOps.Concelier.Connectors.*

2.3 Heuristic Inference

When precise mappings are unavailable, the mapper falls back to:

  1. All public exports of affected package version
  2. CWE-based patterns (e.g., CWE-79 XSS → output functions)
  3. Function name patterns (e.g., *_decrypt*, *_parse*)

Implementation: StellaOps.Scanner.VulnSurfaces.HeuristicMapper
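
One way the three heuristics could combine is sketched below in Python. This is an assumption about ordering, not the HeuristicMapper's documented behavior: name and CWE patterns narrow the export list, and all public exports are returned when nothing matches. The CWE-79 pattern list and the function name `infer_symbols` are invented for illustration.

```python
from fnmatch import fnmatchcase

# Name patterns from the list above.
NAME_PATTERNS = ["*_decrypt*", "*_parse*"]
# Hypothetical CWE-to-pattern table; the real mapping is not given here.
CWE_PATTERNS = {"CWE-79": ["*render*", "*write*", "*html*"]}


def infer_symbols(public_exports, cwe_id=None):
    """Narrow exports by pattern; fall back to every export if none match."""
    patterns = NAME_PATTERNS + CWE_PATTERNS.get(cwe_id, [])
    matched = [
        symbol for symbol in public_exports
        if any(fnmatchcase(symbol, pattern) for pattern in patterns)
    ]
    return matched or list(public_exports)
```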


3. Mapping Confidence Tiers

| Tier | Source | Confidence | Example |
|------|--------|------------|---------|
| Confirmed | Patch diff analysis | 0.95–1.0 | Exact function from git diff |
| Likely | Advisory with function names | 0.7–0.9 | OSV with affected.functions[] |
| Inferred | CWE/pattern heuristics | 0.4–0.6 | All exports of vulnerable version |
| Unknown | No data available | 0.0–0.3 | Package-level only |
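
A consumer that needs a tier label from a raw confidence value could use a lookup like the Python sketch below. The tier ranges leave gaps (for example, between 0.9 and 0.95); the sketch assumes, without confirmation from this document, that a value in a gap belongs to the tier below it. `tier_for` and `TIER_FLOORS` are hypothetical names.

```python
# Lower bound of each tier, highest first.
TIER_FLOORS = [
    (0.95, "Confirmed"),
    (0.7, "Likely"),
    (0.4, "Inferred"),
    (0.0, "Unknown"),
]


def tier_for(confidence):
    """Map a confidence in [0.0, 1.0] to the first tier whose floor it meets."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    for floor, tier in TIER_FLOORS:
        if confidence >= floor:
            return tier
```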

4. Query Interface

4.1 Service Contract

public interface IVulnSurfaceService
{
    /// <summary>
    /// Get symbols affected by a CVE for a specific package.
    /// </summary>
    Task<VulnSurfaceResult> GetAffectedSymbolsAsync(
        string cveId,
        string purl,
        VulnSurfaceOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Batch query for multiple CVE+PURL pairs.
    /// </summary>
    Task<IReadOnlyList<VulnSurfaceResult>> GetAffectedSymbolsBatchAsync(
        IEnumerable<(string CveId, string Purl)> queries,
        CancellationToken ct = default);
}

4.2 Result Model

public sealed record VulnSurfaceResult
{
    public required string CveId { get; init; }
    public required string Purl { get; init; }
    public required ImmutableArray<AffectedSymbol> Symbols { get; init; }
    public required VulnSurfaceSource Source { get; init; }
    public required double Confidence { get; init; }
    public DateTimeOffset? CachedAt { get; init; }
}

public sealed record AffectedSymbol
{
    public required string Name { get; init; }
    public required string SymbolId { get; init; }
    public string? File { get; init; }
    public int? Line { get; init; }
    public string? Signature { get; init; }
    public SymbolChangeType ChangeType { get; init; }
}

public enum VulnSurfaceSource
{
    PatchDiff,
    Advisory,
    Heuristic,
    Unknown
}

public enum SymbolChangeType
{
    Modified,   // Function code changed
    Added,      // New guard/check added
    Removed,    // Vulnerable code removed
    Renamed     // Function renamed
}

5. Integration with Concelier

The CVE→Symbol mapping service integrates with Concelier's advisory feed:

┌─────────────────┐     ┌──────────────────┐     ┌───────────────────┐
│  Scanner        │────►│  VulnSurface     │────►│  Concelier        │
│  (Query)        │     │  Service         │     │  Advisory API     │
└─────────────────┘     └──────────────────┘     └───────────────────┘
                               │
                               ▼
                        ┌──────────────────┐
                        │  Patch Diff      │
                        │  Analyzer        │
                        └──────────────────┘
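
The diagram does not state the order in which sources are consulted. A minimal Python sketch, assuming the service tries them from highest to lowest confidence tier and returns the first non-empty answer, might look like this. `resolve` and the lookup callables are hypothetical, and only the `patch_diff` source label appears in this document; the others are assumed by analogy.

```python
def resolve(cve_id, purl, sources):
    """Return the first non-empty symbol list from an ordered source list.

    `sources` is a list of (source_name, lookup) pairs, where each lookup
    takes (cve_id, purl) and returns a list of symbol names.
    """
    for source_name, lookup in sources:
        symbols = lookup(cve_id, purl)
        if symbols:
            return {"cveId": cve_id, "purl": purl,
                    "symbols": symbols, "source": source_name}
    # No source produced symbols: package-level result only.
    return {"cveId": cve_id, "purl": purl, "symbols": [], "source": "unknown"}
```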

5.1 Advisory Client

public interface IAdvisoryClient
{
    Task<Advisory?> GetAdvisoryAsync(string cveId, CancellationToken ct);
    Task<IReadOnlyList<AffectedPackage>> GetAffectedPackagesAsync(
        string cveId,
        CancellationToken ct);
}

5.2 Caching Strategy

| Data | TTL | Invalidation |
|------|-----|--------------|
| Advisory metadata | 1 hour | On feed update |
| Patch diff results | 24 hours | On new CVE revision |
| Heuristic mappings | 15 minutes | On query |
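
The per-data-type TTLs can be pictured with a small in-memory cache. The Python sketch below is illustrative only; the actual cache backend is not described here, and `TtlCache`, the kind labels, and the injectable clock are assumptions made so the expiry logic is easy to follow.

```python
import time

# TTLs from the caching table, in seconds.
TTL_SECONDS = {
    "advisory": 60 * 60,
    "patch_diff": 24 * 60 * 60,
    "heuristic": 15 * 60,
}


class TtlCache:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # (kind, key) -> (expires_at, value)

    def put(self, kind, key, value):
        self._entries[(kind, key)] = (self._clock() + TTL_SECONDS[kind], value)

    def get(self, kind, key):
        entry = self._entries.get((kind, key))
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: evict and report a miss.
            del self._entries[(kind, key)]
            return None
        return value

    def invalidate(self, kind):
        """Drop every entry of one kind, e.g. on a feed update."""
        for entry_key in [k for k in self._entries if k[0] == kind]:
            del self._entries[entry_key]
```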

6. Offline Support

For air-gapped environments:

6.1 Pre-computed Bundles

offline-bundles/
  vuln-surfaces/
    cve-2024-*.json       # Pre-computed mappings
    ecosystem-npm.json    # NPM ecosystem mappings
    ecosystem-pypi.json   # PyPI ecosystem mappings

6.2 Bundle Format

{
  "version": "1.0.0",
  "generatedAt": "2025-12-22T00:00:00Z",
  "mappings": {
    "CVE-2024-1234": {
      "pkg:npm/lodash@4.17.21": {
        "symbols": ["template", "templateSettings"],
        "source": "patch_diff",
        "confidence": 0.95
      }
    }
  }
}
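
Reading a mapping out of a bundle with this shape is a two-level dictionary lookup, CVE first and PURL second. The Python sketch below uses only the fields shown above; `lookup_in_bundle` is a hypothetical helper, not part of any Stella Ops tooling.

```python
import json

BUNDLE_JSON = """
{
  "version": "1.0.0",
  "generatedAt": "2025-12-22T00:00:00Z",
  "mappings": {
    "CVE-2024-1234": {
      "pkg:npm/lodash@4.17.21": {
        "symbols": ["template", "templateSettings"],
        "source": "patch_diff",
        "confidence": 0.95
      }
    }
  }
}
"""


def lookup_in_bundle(bundle, cve_id, purl):
    """Return the mapping entry for (cve_id, purl), or None if absent."""
    return bundle.get("mappings", {}).get(cve_id, {}).get(purl)


bundle = json.loads(BUNDLE_JSON)
entry = lookup_in_bundle(bundle, "CVE-2024-1234", "pkg:npm/lodash@4.17.21")
```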

7. Fallback Behavior

When no mapping is available:

  1. Ecosystem-specific defaults:

    • npm: All exports from package.json
    • PyPI: All public functions (__all__)
    • Native: All exported symbols (.dynsym)
  2. Conservative approach:

    • Mark all public APIs as potentially affected
    • Set confidence = 0.3 (the top of the Unknown tier range in section 3)
    • Include explanation in verdict reasons
  3. Manual override:

    • Allow user-provided symbol lists via policy
    • Support suppression rules for known false positives
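
The conservative path combined with suppression rules could be sketched as follows in Python. The result keys mirror the JSON examples in this document, but `fallback_surface`, the `reasons` field name, and the wording of the reason string are assumptions.

```python
FALLBACK_CONFIDENCE = 0.3


def fallback_surface(public_symbols, suppressed=()):
    """Mark every public symbol as potentially affected, minus suppressions."""
    blocked = set(suppressed)
    symbols = [s for s in public_symbols if s not in blocked]
    return {
        "symbols": symbols,
        "confidence": FALLBACK_CONFIDENCE,
        # Explanation surfaced in the verdict reasons.
        "reasons": ["no CVE-to-symbol mapping available; "
                    "all public APIs treated as potentially affected"],
    }
```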

8. Performance Considerations

| Metric | Target | Notes |
|--------|--------|-------|
| Cache hit rate | >90% | Most queries hit cache |
| Cold query latency | <500ms | Concelier API call |
| Batch throughput | >100 queries/sec | Parallel execution |
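
Parallel execution of a batch can be sketched with bounded concurrency. The Python example below is not the service's implementation: `run_batch`, `query_one`, and the `max_parallel` default are hypothetical, and the cap on in-flight queries is an added assumption intended to avoid flooding the advisory API.

```python
import asyncio


async def run_batch(queries, query_one, max_parallel=16):
    """Run query_one(cve_id, purl) for each pair, at most max_parallel at once.

    Results come back in the same order as `queries`.
    """
    semaphore = asyncio.Semaphore(max_parallel)

    async def guarded(cve_id, purl):
        async with semaphore:
            return await query_one(cve_id, purl)

    return await asyncio.gather(*(guarded(c, p) for c, p in queries))
```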

9. Example Queries

Simple Query

POST /api/vuln-surfaces/query
Content-Type: application/json

{
  "cveId": "CVE-2024-1234",
  "purl": "pkg:npm/lodash@4.17.21"
}

Response:

{
  "cveId": "CVE-2024-1234",
  "purl": "pkg:npm/lodash@4.17.21",
  "symbols": [
    {
      "name": "template",
      "symbolId": "js:lodash/template",
      "file": "lodash.js",
      "line": 14850,
      "changeType": "modified"
    }
  ],
  "source": "patch_diff",
  "confidence": 0.95
}
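
A client might decode this response into typed values before using it. The Python sketch below is a hedged illustration: the field names follow the response above, while `AffectedSymbol` (a simplified stand-in for the C# record in section 4.2) and `parse_response` are hypothetical client-side names.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AffectedSymbol:
    name: str
    symbol_id: str
    file: Optional[str] = None
    line: Optional[int] = None
    change_type: Optional[str] = None


def parse_response(body):
    """Decode a query response into (cve_id, purl, symbols, source, confidence)."""
    doc = json.loads(body)
    symbols = [
        AffectedSymbol(
            name=item["name"],
            symbol_id=item["symbolId"],
            file=item.get("file"),        # optional in the result model
            line=item.get("line"),
            change_type=item.get("changeType"),
        )
        for item in doc["symbols"]
    ]
    return doc["cveId"], doc["purl"], symbols, doc["source"], doc["confidence"]
```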

Batch Query

POST /api/vuln-surfaces/batch
Content-Type: application/json

{
  "queries": [
    {"cveId": "CVE-2024-1234", "purl": "pkg:npm/lodash@4.17.21"},
    {"cveId": "CVE-2024-5678", "purl": "pkg:pypi/requests@2.28.0"}
  ]
}


Created: 2025-12-22. See Sprint 3810 for implementation details.