# CVE → Symbol Mapping
_Last updated: 2025-12-22. Owner: Scanner Guild + Concelier Guild._

This document describes how Stella Ops maps CVE identifiers to specific binary symbols/functions for precise reachability analysis.

---
## 1. Overview
To determine if a vulnerability is reachable, we need to know which specific functions are affected. The **CVE→Symbol Mapping** service bridges:
- **CVE identifiers** (e.g., `CVE-2024-1234`)
- **Package coordinates** (e.g., `pkg:npm/lodash@4.17.21`)
- **Affected symbols** (e.g., `lodash.template`, `openssl:EVP_PKEY_decrypt`)
---
## 2. Data Sources
### 2.1 Patch Diff Analysis
The highest-fidelity source: analyze git commits that fix vulnerabilities.
```
CVE-2024-1234 fixed in commit abc123
  → Diff shows changes to:
      - src/crypto.c: EVP_PKEY_decrypt() [modified]
      - src/crypto.c: decrypt_internal() [added guard]
  → Affected symbols: EVP_PKEY_decrypt, decrypt_internal
```
**Implementation**: `StellaOps.Scanner.VulnSurfaces.PatchDiffAnalyzer`
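
One way to approximate the diff-to-symbol step is to read the function context that `git diff` prints after the second `@@` of each hunk header. The sketch below is a hypothetical helper (`HunkSymbolSketch` is not part of `PatchDiffAnalyzer`) and assumes C-style signatures. The hunk context names the function enclosing the start of the hunk, so this is a heuristic rather than a precise analysis.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not the real PatchDiffAnalyzer.
public static class HunkSymbolSketch
{
    // Matches e.g. "@@ -10,7 +10,9 @@ int EVP_PKEY_decrypt(EVP_PKEY_CTX *ctx)"
    private static readonly Regex HunkHeader =
        new(@"^@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@\s*(?<context>.+)$");

    // Picks the first identifier that is directly followed by "(" (C-style only).
    private static readonly Regex FunctionName =
        new(@"(?<name>[A-Za-z_]\w*)\s*\(");

    // Returns distinct function names in the order their hunks appear.
    public static IReadOnlyList<string> ExtractSymbols(string unifiedDiff)
    {
        var seen = new HashSet<string>(StringComparer.Ordinal);
        var ordered = new List<string>();
        foreach (var rawLine in unifiedDiff.Split('\n'))
        {
            var header = HunkHeader.Match(rawLine.TrimEnd('\r'));
            if (!header.Success) continue; // not a hunk header, or no context text

            var fn = FunctionName.Match(header.Groups["context"].Value);
            if (fn.Success && seen.Add(fn.Groups["name"].Value))
                ordered.Add(fn.Groups["name"].Value);
        }
        return ordered;
    }
}
```

Calling `HunkSymbolSketch.ExtractSymbols(diffText)` on a patch whose hunks sit inside `EVP_PKEY_decrypt` and `decrypt_internal` would yield those two names.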
### 2.2 Advisory Metadata
Structured advisories with function-level detail:
- **OSV** (`affected[].ranges[].events[].introduced/fixed`)
- **NVD** CPE data with CWE classification, mapped to typical affected-function patterns
- **Vendor advisories** (GitHub, npm, PyPI security advisories)

**Implementation**: `StellaOps.Concelier.Connectors.*`
### 2.3 Heuristic Inference
When precise mappings are unavailable, fall back to:
1. **All public exports** of affected package version
2. **CWE-based patterns** (e.g., CWE-79 XSS → output functions)
3. **Function name patterns** (e.g., `*_decrypt*`, `*_parse*`)

**Implementation**: `StellaOps.Scanner.VulnSurfaces.HeuristicMapper`
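
The function-name patterns above can be read as simple globs. A minimal sketch, assuming `*` means "any run of characters" and that matching is case-sensitive; `NamePatternSketch` is a hypothetical name and does not describe how `HeuristicMapper` actually works:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not the real HeuristicMapper.
public static class NamePatternSketch
{
    // Translates a glob such as "*_decrypt*" into an anchored regex (".*_decrypt.*").
    public static Regex GlobToRegex(string glob) =>
        new("^" + Regex.Escape(glob).Replace(@"\*", ".*") + "$",
            RegexOptions.CultureInvariant);

    // Returns the exported symbols that match at least one glob, in input order.
    public static IReadOnlyList<string> Match(
        IEnumerable<string> exportedSymbols,
        IEnumerable<string> globs)
    {
        var regexes = globs.Select(GlobToRegex).ToList();
        return exportedSymbols.Where(s => regexes.Any(r => r.IsMatch(s))).ToList();
    }
}
```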

---
## 3. Mapping Confidence Tiers
| Tier | Source | Confidence | Example |
|------|--------|------------|---------|
| **Confirmed** | Patch diff analysis | 0.95–1.0 | Exact function from git diff |
| **Likely** | Advisory with function names | 0.7–0.9 | OSV with `affected.functions[]` |
| **Inferred** | CWE/pattern heuristics | 0.4–0.6 | All exports of vulnerable version |
| **Unknown** | No data available | 0.0–0.3 | Package-level only |
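
A score can be mapped back to its tier with a threshold check. The tiers in the table read as 0.95 to 1.0, 0.7 to 0.9, 0.4 to 0.6, and 0.0 to 0.3, which leaves gaps (for example 0.3 to 0.4). The sketch below assumes each tier's lower bound acts as the threshold, so a score in a gap falls into the tier below it. That rule, and the `ConfidenceTier` and `ConfidenceTierSketch` names, are assumptions rather than documented behavior.

```csharp
using System;

// Hypothetical enum mirroring the tier names in the table.
public enum ConfidenceTier { Unknown, Inferred, Likely, Confirmed }

public static class ConfidenceTierSketch
{
    // Assumption: lower bounds are thresholds; gap values round down a tier.
    public static ConfidenceTier FromScore(double score)
    {
        if (double.IsNaN(score) || score < 0.0 || score > 1.0)
            throw new ArgumentOutOfRangeException(nameof(score), "Confidence must be within [0, 1].");

        return score switch
        {
            >= 0.95 => ConfidenceTier.Confirmed,
            >= 0.7 => ConfidenceTier.Likely,
            >= 0.4 => ConfidenceTier.Inferred,
            _ => ConfidenceTier.Unknown,
        };
    }
}
```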
---
## 4. Query Interface
### 4.1 Service Contract
```csharp
public interface IVulnSurfaceService
{
    /// <summary>
    /// Get symbols affected by a CVE for a specific package.
    /// </summary>
    Task<VulnSurfaceResult> GetAffectedSymbolsAsync(
        string cveId,
        string purl,
        VulnSurfaceOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Batch query for multiple CVE+PURL pairs.
    /// </summary>
    Task<IReadOnlyList<VulnSurfaceResult>> GetAffectedSymbolsBatchAsync(
        IEnumerable<(string CveId, string Purl)> queries,
        CancellationToken ct = default);
}
```
### 4.2 Result Model
```csharp
public sealed record VulnSurfaceResult
{
    public required string CveId { get; init; }
    public required string Purl { get; init; }
    public required ImmutableArray<AffectedSymbol> Symbols { get; init; }
    public required VulnSurfaceSource Source { get; init; }
    public required double Confidence { get; init; }
    public DateTimeOffset? CachedAt { get; init; }
}

public sealed record AffectedSymbol
{
    public required string Name { get; init; }
    public required string SymbolId { get; init; }
    public string? File { get; init; }
    public int? Line { get; init; }
    public string? Signature { get; init; }
    public SymbolChangeType ChangeType { get; init; }
}

public enum VulnSurfaceSource
{
    PatchDiff,
    Advisory,
    Heuristic,
    Unknown
}

public enum SymbolChangeType
{
    Modified, // Function code changed
    Added,    // New guard/check added
    Removed,  // Vulnerable code removed
    Renamed   // Function renamed
}
```
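
To illustrate how a caller might consume this result for reachability, the sketch below intersects the affected symbols with a set of symbol IDs that a call-graph walk found reachable. The record types are simplified stand-ins so the example compiles on its own, and the reachable set is assumed to come from a separate analysis that this document does not describe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for AffectedSymbol / VulnSurfaceResult (illustration only).
public sealed record SymbolSketch(string Name, string SymbolId);
public sealed record SurfaceSketch(
    string CveId,
    string Purl,
    IReadOnlyList<SymbolSketch> Symbols,
    double Confidence);

public static class ReachabilitySketch
{
    // Returns the affected symbols whose SymbolId appears in the reachable set.
    // An empty result means no affected symbol was found on any known call path.
    public static IReadOnlyList<SymbolSketch> ReachableAffected(
        SurfaceSketch surface,
        ISet<string> reachableSymbolIds) =>
        surface.Symbols
            .Where(s => reachableSymbolIds.Contains(s.SymbolId))
            .ToList();
}
```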
---
## 5. Integration with Concelier
The CVE→Symbol mapping service integrates with Concelier's advisory feed:
```
┌─────────────────┐     ┌──────────────────┐     ┌───────────────────┐
│ Scanner         │────►│ VulnSurface      │────►│ Concelier         │
│ (Query)         │     │ Service          │     │ Advisory API      │
└─────────────────┘     └──────────────────┘     └───────────────────┘
                                 │
                                 ▼
                        ┌──────────────────┐
                        │ Patch Diff       │
                        │ Analyzer         │
                        └──────────────────┘
```
### 5.1 Advisory Client
```csharp
public interface IAdvisoryClient
{
    Task<Advisory?> GetAdvisoryAsync(string cveId, CancellationToken ct);

    Task<IReadOnlyList<AffectedPackage>> GetAffectedPackagesAsync(
        string cveId,
        CancellationToken ct);
}
```
### 5.2 Caching Strategy
| Data | TTL | Invalidation |
|------|-----|--------------|
| Advisory metadata | 1 hour | On feed update |
| Patch diff results | 24 hours | On new CVE revision |
| Heuristic mappings | 15 minutes | On query |
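
The TTLs in the table could be applied by a small expiring cache. This is a minimal, single-threaded sketch: `CachedKind` and `TtlCacheSketch` are hypothetical names, the clock is injectable only to make expiry testable, and event-driven invalidation (feed update, new CVE revision) is reduced to an explicit `Invalidate` call.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical categories matching the rows of the caching table.
public enum CachedKind { AdvisoryMetadata, PatchDiffResult, HeuristicMapping }

// Not thread-safe; a real service would likely need a concurrent cache.
public sealed class TtlCacheSketch<TValue>
{
    private static readonly IReadOnlyDictionary<CachedKind, TimeSpan> Ttls =
        new Dictionary<CachedKind, TimeSpan>
        {
            [CachedKind.AdvisoryMetadata] = TimeSpan.FromHours(1),
            [CachedKind.PatchDiffResult] = TimeSpan.FromHours(24),
            [CachedKind.HeuristicMapping] = TimeSpan.FromMinutes(15),
        };

    private readonly Dictionary<string, (TValue Value, DateTimeOffset ExpiresAt)> _entries = new();
    private readonly Func<DateTimeOffset> _now;

    public TtlCacheSketch(Func<DateTimeOffset>? now = null) =>
        _now = now ?? (() => DateTimeOffset.UtcNow);

    // Stores the value with an expiry derived from its kind.
    public void Set(string key, CachedKind kind, TValue value) =>
        _entries[key] = (value, _now() + Ttls[kind]);

    // Returns true only for entries that exist and have not expired.
    public bool TryGet(string key, out TValue? value)
    {
        if (_entries.TryGetValue(key, out var entry) && _now() < entry.ExpiresAt)
        {
            value = entry.Value;
            return true;
        }

        _entries.Remove(key); // no-op when the key is missing
        value = default;
        return false;
    }

    // Explicit invalidation, e.g. when the advisory feed reports an update.
    public bool Invalidate(string key) => _entries.Remove(key);
}
```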
---
## 6. Offline Support
For air-gapped environments:
### 6.1 Pre-computed Bundles
```
offline-bundles/
  vuln-surfaces/
    cve-2024-*.json       # Pre-computed mappings
    ecosystem-npm.json    # NPM ecosystem mappings
    ecosystem-pypi.json   # PyPI ecosystem mappings
```
### 6.2 Bundle Format
```json
{
  "version": "1.0.0",
  "generatedAt": "2025-12-22T00:00:00Z",
  "mappings": {
    "CVE-2024-1234": {
      "pkg:npm/lodash@4.17.21": {
        "symbols": ["template", "templateSettings"],
        "source": "patch_diff",
        "confidence": 0.95
      }
    }
  }
}
```
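
A bundle in this shape can be queried with `System.Text.Json` without defining a full object model. The sketch below assumes the nesting shown above (`mappings`, then CVE ID, then PURL). `OfflineBundleSketch` and `BundleMappingSketch` are hypothetical names, and the `version` field is not checked.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical result type for illustration.
public sealed record BundleMappingSketch(
    IReadOnlyList<string> Symbols,
    string Source,
    double Confidence);

public static class OfflineBundleSketch
{
    // Returns null when the bundle has no entry for the CVE + PURL pair.
    public static BundleMappingSketch? Lookup(string bundleJson, string cveId, string purl)
    {
        using var doc = JsonDocument.Parse(bundleJson);

        if (!doc.RootElement.TryGetProperty("mappings", out var mappings) ||
            !mappings.TryGetProperty(cveId, out var byPurl) ||
            !byPurl.TryGetProperty(purl, out var entry))
        {
            return null;
        }

        // Copy values out before the JsonDocument is disposed.
        var symbols = entry.GetProperty("symbols")
            .EnumerateArray()
            .Select(e => e.GetString() ?? string.Empty)
            .ToList();

        return new BundleMappingSketch(
            symbols,
            entry.GetProperty("source").GetString() ?? "unknown",
            entry.GetProperty("confidence").GetDouble());
    }
}
```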
---
## 7. Fallback Behavior
When no mapping is available:
1. **Ecosystem-specific defaults**:
   - npm: All `exports` from package.json
   - PyPI: All public functions (`__all__`)
   - Native: All exported symbols (`.dynsym`)
2. **Conservative approach**:
   - Mark all public APIs as potentially affected
   - Set confidence = 0.3 (the upper bound of the Unknown tier in section 3, just below Inferred)
   - Include explanation in verdict reasons
3. **Manual override**:
   - Allow user-provided symbol lists via policy
   - Support suppression rules for known false positives
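
The conservative fallback can be sketched as a pure function: take the package's public exports, drop anything covered by a suppression rule, and attach the fixed 0.3 confidence plus a reason string. The names below are hypothetical, suppression is modeled as an exact-name list for simplicity, and user-provided symbol lists are left out because their precedence is not specified here.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical result type for illustration.
public sealed record FallbackSurfaceSketch(
    IReadOnlyList<string> Symbols,
    double Confidence,
    string Reason);

public static class FallbackSketch
{
    // Confidence value stated in the fallback rules.
    public const double FallbackConfidence = 0.3;

    public static FallbackSurfaceSketch Resolve(
        IReadOnlyCollection<string> publicExports,
        IReadOnlyCollection<string>? suppressed = null)
    {
        // Assumption: suppression rules are exact symbol names.
        var blocked = new HashSet<string>(
            suppressed ?? Array.Empty<string>(),
            StringComparer.Ordinal);

        var symbols = publicExports
            .Where(s => !blocked.Contains(s))
            .Distinct()
            .ToList();

        return new FallbackSurfaceSketch(
            symbols,
            FallbackConfidence,
            "No CVE-to-symbol mapping; all public exports treated as potentially affected.");
    }
}
```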
---
## 8. Performance Considerations
| Metric | Target | Notes |
|--------|--------|-------|
| Cache hit rate | >90% | Most queries hit cache |
| Cold query latency | <500ms | Concelier API call |
| Batch throughput | >100 queries/sec | Parallel execution |
---
## 9. Example Queries
### Simple Query
```http
POST /api/vuln-surfaces/query
Content-Type: application/json

{
  "cveId": "CVE-2024-1234",
  "purl": "pkg:npm/lodash@4.17.21"
}
```
Response:
```json
{
  "cveId": "CVE-2024-1234",
  "purl": "pkg:npm/lodash@4.17.21",
  "symbols": [
    {
      "name": "template",
      "symbolId": "js:lodash/template",
      "file": "lodash.js",
      "line": 14850,
      "changeType": "modified"
    }
  ],
  "source": "patch_diff",
  "confidence": 0.95
}
```
### Batch Query
```http
POST /api/vuln-surfaces/batch
Content-Type: application/json

{
  "queries": [
    {"cveId": "CVE-2024-1234", "purl": "pkg:npm/lodash@4.17.21"},
    {"cveId": "CVE-2024-5678", "purl": "pkg:pypi/requests@2.28.0"}
  ]
}
```
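
A client might build the batch request body from the same `(CveId, Purl)` tuples that `GetAffectedSymbolsBatchAsync` accepts. The sketch below only produces the JSON payload and leaves the HTTP call out. The record names are hypothetical, and the camelCase naming policy is an assumption based on the field names in the examples above.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical request DTOs mirroring the batch request body.
public sealed record SurfaceQuerySketch(string CveId, string Purl);
public sealed record BatchRequestSketch(IReadOnlyList<SurfaceQuerySketch> Queries);

public static class BatchPayloadSketch
{
    // camelCase so CveId/Purl/Queries serialize as cveId/purl/queries.
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
    };

    // Serializes the pairs into the body expected by POST /api/vuln-surfaces/batch.
    public static string Build(IEnumerable<(string CveId, string Purl)> pairs)
    {
        var queries = new List<SurfaceQuerySketch>();
        foreach (var (cveId, purl) in pairs)
            queries.Add(new SurfaceQuerySketch(cveId, purl));

        return JsonSerializer.Serialize(new BatchRequestSketch(queries), Options);
    }
}
```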
---
## 10. Related Documentation
- [Slice Schema](./slice-schema.md)
- [Patch Oracles](./patch-oracles.md)
- [Concelier Architecture](../modules/concelier/architecture.md)
- [Vulnerability Surfaces](../modules/scanner/vuln-surfaces.md)
---
_Created: 2025-12-22. See Sprint 3810 for implementation details._