up
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
AOC Guard CI / aoc-guard (push) Has been cancelled
AOC Guard CI / aoc-verify (push) Has been cancelled
api-governance / spectral-lint (push) Has been cancelled
oas-ci / oas-validate (push) Has been cancelled
Policy Lint & Smoke / policy-lint (push) Has been cancelled
Policy Simulation / policy-simulate (push) Has been cancelled
SDK Publish & Sign / sdk-publish (push) Has been cancelled
This commit is contained in:
@@ -0,0 +1,799 @@
Here’s a quick win for making your vuln paths auditor‑friendly without retraining any models: **add a plain‑language `reason` to every graph edge** (why this edge exists). Think “introduced via dynamic import” or “symbol relocation via `ld`”, not jargon soup.

# Why this helps

* **Explains reachability** at a glance (auditors & devs can follow the story).
* **Reduces false‑positive fights** (every hop justifies itself).
* **Stable across languages** (no model changes, just metadata).

# Minimal schema change

Add a `via` object with three fields (`reason`, `evidence`, `provenance`) to every edge in your call/dep graph (SBOM→Reachability→Fix plan):

```json
{
  "from": "pkg:pypi/requests@2.32.3#requests.sessions.Session.request",
  "to": "pkg:pypi/urllib3@2.2.3#urllib3.connectionpool.HTTPConnectionPool.urlopen",
  "via": {
    "reason": "declared_dependency",
    "evidence": [
      "import urllib3 in requests/adapters.py:12",
      "pip freeze: urllib3==2.2.3"
    ],
    "provenance": {
      "detector": "StellaOps.Scanner.WebService@1.4.2",
      "rule_id": "PY-IMPORT-001",
      "confidence": "high"
    }
  }
}
```

### Standard reason glossary (use as enum)

* `declared_dependency` (manifest lock/SBOM edge)
* `static_call` (direct call site with symbol ref)
* `dynamic_import` (e.g., `__import__`, `importlib`, `require(...)`)
* `reflection_call` (C# `MethodInfo.Invoke`, Java reflection)
* `plugin_discovery` (entry points, ServiceLoader, MEF)
* `symbol_relocation` (ELF/PE/Mach‑O relocation binds)
* `plt_got_resolution` (ELF PLT/GOT jump to symbol)
* `ld_preload_injection` (runtime injected .so/.dll)
* `env_config_path` (path read from env/config enables load)
* `taint_propagation` (user input reaches sink)
* `vendor_patch_alias` (function moved/aliased across versions)

# Emission rules (keep it deterministic)

* **One reason per edge**, short, lowercase snake_case from glossary.
* **Up to 3 evidence strings** (file:line or binary section + symbol).
* **Confidence**: `high|medium|low` with a single, stable rubric:

  * high = exact symbol/call site or relocation
  * medium = heuristic import/loader path
  * low = inferred from naming or optional plugin

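The rubric above can be pinned down in code so that every producer assigns confidence the same way. This is only a minimal sketch: the `EvidenceKind` enum and the `ConfidenceRubric` helper are hypothetical names introduced for illustration and are not part of the original design.

```csharp
using System;

// Hypothetical classification of how an edge was observed (illustrative only).
public enum EvidenceKind
{
    ExactSymbolOrCallSite, // resolved call site or symbol reference
    Relocation,            // relocation entry found in a binary
    HeuristicLoaderPath,   // import/loader path found heuristically
    NamingInference,       // guessed from naming conventions
    OptionalPlugin         // plugin that may or may not be loaded
}

public static class ConfidenceRubric
{
    // Maps the kind of evidence to the three-level rubric: high, medium, low.
    public static string For(EvidenceKind kind) => kind switch
    {
        EvidenceKind.ExactSymbolOrCallSite => "high",
        EvidenceKind.Relocation => "high",
        EvidenceKind.HeuristicLoaderPath => "medium",
        EvidenceKind.NamingInference => "low",
        EvidenceKind.OptionalPlugin => "low",
        _ => throw new ArgumentOutOfRangeException(nameof(kind))
    };
}
```

Keeping the mapping in one place means a rubric change is a single edit rather than a hunt through every analyzer.
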
# UI/Report snippet

Render paths like:

```
app → requests → urllib3 → OpenSSL EVP_PKEY_new_raw_private_key
  • declared_dependency (poetry.lock)
  • static_call (requests.adapters:345)
  • symbol_relocation (ELF .rela.plt: EVP_PKEY_new_raw_private_key)
```

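A renderer for this layout can be very small. The sketch below assumes a hypothetical flattened `PathHop` record (one hop with its reason and first evidence string); neither `PathHop` nor `PathRenderer` comes from the original contracts.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical flattened view of one hop; field names are illustrative.
public sealed record PathHop(string From, string To, string Reason, string PrimaryEvidence);

public static class PathRenderer
{
    public static string Render(IReadOnlyList<PathHop> hops)
    {
        if (hops.Count == 0) return string.Empty;

        var sb = new StringBuilder();

        // Header line: the first node followed by each hop's target.
        sb.Append(hops[0].From);
        foreach (var hop in hops)
            sb.Append(" → ").Append(hop.To);
        sb.Append('\n');

        // One bullet per hop: reason plus its primary evidence.
        foreach (var hop in hops)
            sb.Append("  • ").Append(hop.Reason)
              .Append(" (").Append(hop.PrimaryEvidence).Append(")\n");

        return sb.ToString();
    }
}
```

The output uses a fixed `\n` line ending so that rendered reports stay byte-identical across platforms.
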
# C# drop‑in (for your .NET 10 code)

Edge builder with reason/evidence:

```csharp
public sealed record EdgeId(string From, string To);

public sealed record EdgeEvidence(
    string Reason,                  // enum string from glossary
    IReadOnlyList<string> Evidence, // file:line, symbol, section
    string Confidence,              // high|medium|low
    string Detector,                // component@version
    string RuleId                   // stable rule key
);

public sealed record GraphEdge(EdgeId Id, EdgeEvidence Via);

public static class EdgeFactory
{
    public static GraphEdge DeclaredDependency(string from, string to, string manifestPath)
        => new(new EdgeId(from, to),
            new EdgeEvidence(
                Reason: "declared_dependency",
                Evidence: new[] { $"manifest:{manifestPath}" },
                Confidence: "high",
                Detector: "StellaOps.Scanner.WebService@1.0.0",
                RuleId: "DEP-LOCK-001"));

    public static GraphEdge SymbolRelocation(string from, string to, string objPath, string section, string symbol)
        => new(new EdgeId(from, to),
            new EdgeEvidence(
                Reason: "symbol_relocation",
                Evidence: new[] { $"{objPath}::{section}:{symbol}" },
                Confidence: "high",
                Detector: "StellaOps.Scanner.WebService@1.0.0",
                RuleId: "BIN-RELOC-101"));
}
```

# Integration checklist (fast path)

* Emit `via.reason/evidence/provenance` for **all** edges (SBOM, source, binary).
* Validate `reason` against glossary; reject free‑text.
* Add a “**Why this edge exists**” column in your path tables.
* In JSON/CSV exports, keep columns: `from,to,reason,confidence,evidence0..2,rule_id`.
* In the console, collapse evidence by default; expand on click.

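For the CSV export in the checklist above, the fixed column layout means each edge needs exactly three evidence cells, padded with empty strings. A minimal sketch follows; `EdgeCsvExporter` is a hypothetical helper, and the quoting rule (wrap in double quotes, double any embedded quotes) is the common RFC 4180 convention rather than something the original text specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EdgeCsvExporter
{
    public const string Header =
        "from,to,reason,confidence,evidence0,evidence1,evidence2,rule_id";

    // Quote a field only when it contains a delimiter, quote, or line break.
    private static string Escape(string value)
    {
        if (value.IndexOfAny(new[] { ',', '"', '\n', '\r' }) < 0) return value;
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    public static string ToRow(
        string from,
        string to,
        string reason,
        string confidence,
        IReadOnlyList<string> evidence,
        string ruleId)
    {
        // Pad or truncate to exactly three evidence columns so all rows have the same width.
        var ev = evidence
            .Take(3)
            .Concat(Enumerable.Repeat(string.Empty, 3))
            .Take(3);

        var fields = new[] { from, to, reason, confidence }
            .Concat(ev)
            .Append(ruleId);

        return string.Join(",", fields.Select(Escape));
    }
}
```

A row with a single evidence string therefore still has eight columns, which keeps spreadsheet imports and diff-based audits stable.
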
If you want, I’ll plug this into your Stella Ops graph contracts (Concelier/Cartographer) and produce the enum + validators and a tiny renderer for your docs.

Cool, let’s turn this into a concrete, dev‑friendly implementation plan you can actually hand to teams.

I’ll structure it by phases and by component (schema, producers, APIs, UI, testing, rollout) so you can slice into tickets easily.

---

## 0. Recap of what we’re building

**Goal:**
Every edge in your vuln path graph (SBOM → Reachability → Fix plan) carries **machine‑readable, auditor‑friendly metadata**:

```jsonc
{
  "from": "pkg:pypi/requests@2.32.3#requests.sessions.Session.request",
  "to": "pkg:pypi/urllib3@2.2.3#urllib3.connectionpool.HTTPConnectionPool.urlopen",
  "via": {
    "reason": "declared_dependency", // from a controlled enum
    "evidence": [
      "manifest:requirements.txt:3", // up to 3 short evidence strings
      "pip freeze: urllib3==2.2.3"
    ],
    "provenance": {
      "detector": "StellaOps.Scanner.WebService@1.4.2",
      "rule_id": "PY-IMPORT-001",
      "confidence": "high"
    }
  }
}
```

Standard **reason glossary** (enum):

* `declared_dependency`
* `static_call`
* `dynamic_import`
* `reflection_call`
* `plugin_discovery`
* `symbol_relocation`
* `plt_got_resolution`
* `ld_preload_injection`
* `env_config_path`
* `taint_propagation`
* `vendor_patch_alias`
* `unknown` (fallback only when you truly can’t do better)

---

## 1. Design & contracts (shared work for backend & frontend)

### 1.1 Define the canonical edge metadata types

**Owner:** Platform / shared lib team

**Tasks:**

1. In your shared C# library (used by scanners + API), define:

```csharp
public enum EdgeReason
{
    Unknown = 0,
    DeclaredDependency,
    StaticCall,
    DynamicImport,
    ReflectionCall,
    PluginDiscovery,
    SymbolRelocation,
    PltGotResolution,
    LdPreloadInjection,
    EnvConfigPath,
    TaintPropagation,
    VendorPatchAlias
}

public enum EdgeConfidence
{
    Low = 0,
    Medium,
    High
}

public sealed record EdgeProvenance(
    string Detector, // e.g., "StellaOps.Scanner.WebService@1.4.2"
    string RuleId,   // e.g., "PY-IMPORT-001"
    EdgeConfidence Confidence
);

public sealed record EdgeVia(
    EdgeReason Reason,
    IReadOnlyList<string> Evidence,
    EdgeProvenance Provenance
);

public sealed record EdgeId(string From, string To);

public sealed record GraphEdge(
    EdgeId Id,
    EdgeVia Via
);
```

2. Enforce **max 3 evidence strings** via a small helper to avoid accidental spam:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EdgeViaFactory
{
    private const int MaxEvidence = 3;

    public static EdgeVia Create(
        EdgeReason reason,
        IEnumerable<string> evidence,
        string detector,
        string ruleId,
        EdgeConfidence confidence
    )
    {
        // Treat a null sequence as empty so Evidence is never null on the resulting edge.
        var ev = (evidence ?? Enumerable.Empty<string>())
            .Where(s => !string.IsNullOrWhiteSpace(s))
            .Take(MaxEvidence)
            .ToArray();

        return new EdgeVia(
            Reason: reason,
            Evidence: ev,
            Provenance: new EdgeProvenance(detector, ruleId, confidence)
        );
    }
}
```

**Acceptance criteria:**

* [ ] EdgeReason enum defined and shared in a reusable package.
* [ ] EdgeVia and EdgeProvenance types exist and are serializable to JSON.
* [ ] Evidence is capped to 3 entries and cannot be null (empty list allowed).

---

### 1.2 API / JSON contract

**Owner:** API team

**Tasks:**

1. Extend your existing graph edge DTO to include `via`:

```csharp
using System.Text.Json.Serialization; // assumes System.Text.Json is the serializer

public sealed record GraphEdgeDto
{
    public string From { get; init; } = default!;
    public string To { get; init; } = default!;
    public EdgeViaDto Via { get; init; } = default!;
}

public sealed record EdgeViaDto
{
    public string Reason { get; init; } = default!; // enum as string
    public string[] Evidence { get; init; } = Array.Empty<string>();
    public EdgeProvenanceDto Provenance { get; init; } = default!;
}

public sealed record EdgeProvenanceDto
{
    public string Detector { get; init; } = default!;

    // The wire name is `rule_id`; a camelCase naming policy alone would emit `ruleId`.
    [JsonPropertyName("rule_id")]
    public string RuleId { get; init; } = default!;

    public string Confidence { get; init; } = default!; // "high|medium|low"
}
```

2. Keep the JSON change **additive** (backward compatible):

   * `via` is **non‑nullable** in responses from the new API version.
   * If existing clients cannot tolerate the extra field, keep the legacy endpoint unchanged and add **v2** endpoints that guarantee `via`.

3. Update OpenAPI spec:

   * Document `via.reason` as enum string, including allowed values.
   * Document `via.provenance.detector`, `rule_id`, `confidence`.

**Acceptance criteria:**

* [ ] OpenAPI / Swagger shows `via.reason` as a string enum + description.
* [ ] New clients can deserialize edges with `via` without custom hacks.
* [ ] Old clients remain unaffected (either keep old endpoint or allow them to ignore `via`).

---

## 2. Producers: add reasons & evidence where edges are created

You likely have 3 main edge producers:

* SBOM / manifest / lockfile analyzers
* Source analyzers (call graph, taint analysis)
* Binary analyzers (ELF/PE/Mach‑O, containers)

Treat each as a mini‑project with identical patterns.

---

### 2.1 SBOM / manifest edges

**Owner:** SBOM / dep graph team

**Tasks:**

1. Identify all code paths that create “declared dependency” edges:

   * Manifest → Package
   * Root module → Imported package (if you store these explicitly)

2. Replace plain edge construction with factory calls:

```csharp
public static class EdgeFactory
{
    private const string DetectorName = "StellaOps.Scanner.Sbom@1.0.0";

    public static GraphEdge DeclaredDependency(
        string from,
        string to,
        string manifestPath,
        string? dependencySpecLine
    )
    {
        var evidence = new List<string>
        {
            $"manifest:{manifestPath}"
        };

        if (!string.IsNullOrWhiteSpace(dependencySpecLine))
            evidence.Add($"spec:{dependencySpecLine}");

        var via = EdgeViaFactory.Create(
            EdgeReason.DeclaredDependency,
            evidence,
            DetectorName,
            "DEP-LOCK-001",
            EdgeConfidence.High
        );

        return new GraphEdge(new EdgeId(from, to), via);
    }
}
```

3. Make sure each SBOM/manifest edge sets:

   * `reason = declared_dependency`
   * `confidence = high`
   * Evidence includes at least `manifest:<path>` and, if possible, line or spec snippet.

**Acceptance criteria:**

* [ ] Any SBOM‑generated edge returns with `via.reason == declared_dependency`.
* [ ] Evidence contains manifest path for ≥ 99% of SBOM edges.
* [ ] Unit tests cover at least: normal manifest, multiple manifests, malformed manifest.

---

### 2.2 Source code call graph edges

**Owner:** Static analysis / call graph team

**Tasks:**

1. Map current edge types → reasons:

   * Direct function/method calls → `static_call`
   * Reflection (Java/C#) → `reflection_call`
   * Dynamic imports (`__import__`, `importlib`, `require(...)`) → `dynamic_import`
   * Plugin systems (entry points, ServiceLoader, MEF) → `plugin_discovery`
   * Taint / dataflow edges (user input → sink) → `taint_propagation`

2. Implement helper factories:

```csharp
public static class SourceEdgeFactory
{
    private const string DetectorName = "StellaOps.Scanner.Source@1.0.0";

    public static GraphEdge StaticCall(
        string fromSymbol,
        string toSymbol,
        string filePath,
        int lineNumber
    )
    {
        var evidence = new[]
        {
            $"callsite:{filePath}:{lineNumber}"
        };

        var via = EdgeViaFactory.Create(
            EdgeReason.StaticCall,
            evidence,
            DetectorName,
            "SRC-CALL-001",
            EdgeConfidence.High
        );

        return new GraphEdge(new EdgeId(fromSymbol, toSymbol), via);
    }

    public static GraphEdge DynamicImport(
        string fromSymbol,
        string toSymbol,
        string filePath,
        int lineNumber
    )
    {
        var via = EdgeViaFactory.Create(
            EdgeReason.DynamicImport,
            new[] { $"importsite:{filePath}:{lineNumber}" },
            DetectorName,
            "SRC-DYNIMPORT-001",
            EdgeConfidence.Medium
        );

        return new GraphEdge(new EdgeId(fromSymbol, toSymbol), via);
    }

    // Similar for ReflectionCall, PluginDiscovery, TaintPropagation...
}
```

3. Replace all direct `new GraphEdge(...)` calls in source analyzers with these factories.

**Acceptance criteria:**

* [ ] Direct call edges produce `reason = static_call` with file:line evidence.
* [ ] Reflection/dynamic import edges use correct reasons and mark `confidence = medium` (or high where you’re certain).
* [ ] Unit tests check that for a known source file, the resulting edges contain expected `reason`, `evidence`, and `rule_id`.

---

### 2.3 Binary / container analyzers

**Owner:** Binary analysis / SCA team

**Tasks:**

1. Map binary features to reasons:

   * Symbol relocations + PLT/GOT edges → `symbol_relocation` or `plt_got_resolution`
   * LD_PRELOAD or injection edges → `ld_preload_injection`

2. Implement factory:

```csharp
public static class BinaryEdgeFactory
{
    private const string DetectorName = "StellaOps.Scanner.Binary@1.0.0";

    public static GraphEdge SymbolRelocation(
        string fromSymbol,
        string toSymbol,
        string binaryPath,
        string section,
        string relocationName
    )
    {
        var evidence = new[]
        {
            $"{binaryPath}::{section}:{relocationName}"
        };

        var via = EdgeViaFactory.Create(
            EdgeReason.SymbolRelocation,
            evidence,
            DetectorName,
            "BIN-RELOC-101",
            EdgeConfidence.High
        );

        return new GraphEdge(new EdgeId(fromSymbol, toSymbol), via);
    }
}
```

3. Wire up all binary edge creation to use this.

**Acceptance criteria:**

* [ ] For a test binary with a known relocation, edges include `reason = symbol_relocation` and section/symbol in evidence.
* [ ] No binary edge is created without `via`.

---

## 3. Storage & migrations

This depends on your backing store, but the pattern is similar.

### 3.1 Relational (SQL) example

**Owner:** Data / infra team

**Tasks:**

1. Add columns:

```sql
ALTER TABLE graph_edges
    ADD COLUMN via_reason     VARCHAR(64)  NOT NULL DEFAULT 'unknown',
    ADD COLUMN via_evidence   JSONB        NOT NULL DEFAULT '[]'::jsonb,
    ADD COLUMN via_detector   VARCHAR(255) NOT NULL DEFAULT 'unknown',
    ADD COLUMN via_rule_id    VARCHAR(128) NOT NULL DEFAULT 'unknown',
    ADD COLUMN via_confidence VARCHAR(16)  NOT NULL DEFAULT 'low';
```

2. Update ORM model:

```csharp
public class EdgeEntity
{
    public string From { get; set; } = default!;
    public string To { get; set; } = default!;

    public string ViaReason { get; set; } = "unknown";

    // Backed by the JSONB column; configure the column type or a value converter in your ORM accordingly.
    public string[] ViaEvidence { get; set; } = Array.Empty<string>();

    public string ViaDetector { get; set; } = "unknown";
    public string ViaRuleId { get; set; } = "unknown";
    public string ViaConfidence { get; set; } = "low";
}
```

3. Add mapping to domain `GraphEdge`:

```csharp
// Hypothetical container class; extension methods must be declared in a static class.
public static class EdgeEntityMapping
{
    public static GraphEdge ToDomain(this EdgeEntity e)
    {
        // Stored reasons are snake_case ("static_call"); strip underscores so they
        // match the enum member names ("StaticCall") under a case-insensitive parse.
        var reasonName = e.ViaReason.Replace("_", string.Empty);

        var via = new EdgeVia(
            Reason: Enum.TryParse<EdgeReason>(reasonName, true, out var r) ? r : EdgeReason.Unknown,
            Evidence: e.ViaEvidence,
            Provenance: new EdgeProvenance(
                Detector: e.ViaDetector,
                RuleId: e.ViaRuleId,
                Confidence: Enum.TryParse<EdgeConfidence>(e.ViaConfidence, true, out var c) ? c : EdgeConfidence.Low
            )
        );

        return new GraphEdge(new EdgeId(e.From, e.To), via);
    }
}
```

4. **Backfill existing data** (optional but recommended):

   * For edges with a known “type” column, map to best‑fit `reason`.
   * If you can’t infer: set `reason = unknown`, `confidence = low`, `detector = "backfill@<version>"`.

**Acceptance criteria:**

* [ ] DB migration runs cleanly in staging and prod.
* [ ] No existing reader breaks: default values keep queries functioning.
* [ ] Edge round‑trip (domain → DB → API JSON) retains `via` fields correctly.

---

## 4. API & service layer

**Owner:** API / service team

**Tasks:**

1. Wire domain model → DTOs:

```csharp
// Hypothetical container class; extension methods must be declared in a static class.
public static class GraphEdgeMapping
{
    public static GraphEdgeDto ToDto(this GraphEdge edge)
    {
        return new GraphEdgeDto
        {
            From = edge.Id.From,
            To = edge.Id.To,
            Via = new EdgeViaDto
            {
                // ToSnakeCaseLower is not a BCL method; it is assumed to be a project-local
                // string extension that turns "StaticCall" into "static_call".
                Reason = edge.Via.Reason.ToString().ToSnakeCaseLower(),
                Evidence = edge.Via.Evidence.ToArray(),
                Provenance = new EdgeProvenanceDto
                {
                    Detector = edge.Via.Provenance.Detector,
                    RuleId = edge.Via.Provenance.RuleId,
                    Confidence = edge.Via.Provenance.Confidence.ToString().ToLowerInvariant()
                }
            }
        };
    }
}
```

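The DTO mapping relies on converting enum member names to the snake_case glossary values and back. One possible shape for that helper is sketched here; `EdgeReasonNaming` and `TryParseSnakeCase` are hypothetical names, and the `EdgeReason` enum is repeated only so the example is self-contained.

```csharp
using System;
using System.Linq;
using System.Text;

public enum EdgeReason
{
    Unknown = 0, DeclaredDependency, StaticCall, DynamicImport, ReflectionCall,
    PluginDiscovery, SymbolRelocation, PltGotResolution, LdPreloadInjection,
    EnvConfigPath, TaintPropagation, VendorPatchAlias
}

public static class EdgeReasonNaming
{
    // PascalCase -> snake_case, e.g. "PltGotResolution" -> "plt_got_resolution".
    public static string ToSnakeCaseLower(this string pascal)
    {
        var sb = new StringBuilder(pascal.Length + 4);
        for (var i = 0; i < pascal.Length; i++)
        {
            var c = pascal[i];
            if (char.IsUpper(c) && i > 0) sb.Append('_');
            sb.Append(char.ToLowerInvariant(c));
        }
        return sb.ToString();
    }

    // snake_case -> enum. Only defined member names are accepted.
    public static bool TryParseSnakeCase<TEnum>(string value, out TEnum result)
        where TEnum : struct, Enum
    {
        result = default;
        if (string.IsNullOrWhiteSpace(value)) return false;

        var compact = value.Replace("_", string.Empty);

        // Enum.TryParse also accepts numeric strings and comma-separated lists;
        // requiring letters only rejects both, along with free text containing spaces.
        if (!compact.All(char.IsLetter)) return false;

        return Enum.TryParse(compact, ignoreCase: true, out result)
            && Enum.IsDefined(typeof(TEnum), result);
    }
}
```

Note that a plain case-insensitive `Enum.TryParse` on `"static_call"` fails because of the underscore, which is why the parse direction strips underscores first.
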
2. If you accept edges via API (internal services), validate:

   * `reason` must be one of the known values; otherwise reject or coerce to `unknown`.
   * `evidence` length ≤ 3.
   * Trim whitespace and limit each evidence string length (e.g. 256 chars).

3. Versioning:

   * Introduce `/v2/graph/paths` (or similar) that guarantees `via`.
   * Keep `/v1/...` unchanged or mark deprecated.

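The validation rules in task 2 can be sketched as a small helper. This is an illustration under stated assumptions: `EdgeViaValidator` is a hypothetical class, it coerces unknown reasons to `unknown` (rejecting is the other option the text allows), and it throws when more than three evidence strings are supplied.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EdgeViaValidator
{
    public const int MaxEvidence = 3;
    public const int MaxEvidenceLength = 256;

    // Glossary values exactly as they appear on the wire.
    private static readonly HashSet<string> KnownReasons = new(StringComparer.Ordinal)
    {
        "declared_dependency", "static_call", "dynamic_import", "reflection_call",
        "plugin_discovery", "symbol_relocation", "plt_got_resolution",
        "ld_preload_injection", "env_config_path", "taint_propagation",
        "vendor_patch_alias", "unknown"
    };

    // Returns the reason to store; `coerced` is true when the input was not in the glossary.
    public static string NormalizeReason(string reason, out bool coerced)
    {
        var trimmed = (reason ?? string.Empty).Trim();
        coerced = !KnownReasons.Contains(trimmed);
        return coerced ? "unknown" : trimmed;
    }

    // Drops blank entries, trims, truncates to MaxEvidenceLength, and enforces the count limit.
    public static string[] NormalizeEvidence(IEnumerable<string> evidence)
    {
        var cleaned = (evidence ?? Enumerable.Empty<string>())
            .Where(s => !string.IsNullOrWhiteSpace(s))
            .Select(s => s.Trim())
            .Select(s => s.Length <= MaxEvidenceLength ? s : s.Substring(0, MaxEvidenceLength))
            .ToArray();

        if (cleaned.Length > MaxEvidence)
            throw new ArgumentException(
                $"At most {MaxEvidence} evidence strings are allowed.", nameof(evidence));

        return cleaned;
    }
}
```

The `coerced` flag lets the caller log the original value, which matches the acceptance criterion that invalid reasons are either rejected or converted with a log entry.
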
**Acceptance criteria:**

* [ ] Path API returns `via.reason` and `via.evidence` for all edges in new endpoints.
* [ ] Invalid reason strings are rejected or converted to `unknown` with a log.
* [ ] Integration tests cover full flow: repo → scanner → DB → API → JSON.

---

## 5. UI: make paths auditor‑friendly

**Owner:** Frontend team

**Tasks:**

1. **Path details UI**:

   For each edge in the vulnerability path table:

   * Show a **“Reason” column** with a small pill:

     * `static_call` → “Static call”
     * `declared_dependency` → “Declared dependency”
     * etc.
   * Below or on hover, show **primary evidence** (first evidence string).

2. **Edge details panel** (drawer/modal):

   When user clicks an edge:

   * Show:

     * From → To (symbols/packages)
     * Reason (with friendly description per enum)
     * Evidence list (each on its own line)
     * Detector, rule id, confidence

3. **Filtering & sorting (optional but powerful)**:

   * Filter edges by `reason` (multi‑select).
   * Filter by `confidence` (e.g. show only high/medium).
   * This helps auditors quickly isolate more speculative edges.

4. **UX text / glossary**:

   * Add a small “?” tooltip that links to a glossary explaining each reason type in human language.

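The pill labels from task 1 can be derived from the enum string, with hand-written overrides where naive capitalization reads badly. The sketch below is server-side C# for consistency with the rest of this plan; `ReasonLabels` and the override wording are illustrative assumptions, and a frontend would likely carry an equivalent table.

```csharp
using System;
using System.Collections.Generic;

public static class ReasonLabels
{
    // Illustrative overrides for reasons containing acronyms.
    private static readonly Dictionary<string, string> Overrides = new(StringComparer.Ordinal)
    {
        ["plt_got_resolution"] = "PLT/GOT resolution",
        ["ld_preload_injection"] = "LD_PRELOAD injection",
    };

    // "static_call" -> "Static call"; blank input falls back to "Unknown".
    public static string ToLabel(string reason)
    {
        if (string.IsNullOrWhiteSpace(reason)) return "Unknown";
        if (Overrides.TryGetValue(reason, out var label)) return label;

        var words = reason.Replace('_', ' ');
        return char.ToUpperInvariant(words[0]) + words.Substring(1);
    }
}
```
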
**Acceptance criteria:**

* [ ] For a given vulnerability, the path view shows a “Reason” column per edge.
* [ ] Clicking an edge reveals all evidence and provenance information.
* [ ] UX has a glossary/tooltip explaining what each reason means in plain English.

---

## 6. Testing strategy

**Owner:** QA + each feature team

### 6.1 Unit tests

* **Factories**: verify correct mapping from input to `EdgeVia`:

  * Reason set correctly.
  * Evidence trimmed, max 3.
  * Confidence matches rubric (high for relocations, medium for heuristic imports, etc.).
* **Serialization**: `EdgeVia` → JSON and back.

### 6.2 Integration tests

Set up **small fixtures**:

1. **Simple dependency project**:

   * Example: Python project with `requirements.txt` → `requests` → `urllib3`.
   * Expected edges:

     * App → requests: `declared_dependency`, evidence includes `requirements.txt`.
     * requests → urllib3: `declared_dependency`, plus static call edges.

2. **Dynamic import case**:

   * A module using `importlib.import_module("mod")`.
   * Ensure edge is `dynamic_import` with `confidence = medium`.

3. **Binary edge case**:

   * Test ELF with known symbol relocation.
   * Ensure an edge with `reason = symbol_relocation` exists.

### 6.3 End‑to‑end tests

* Run full scan on a sample repo and:

  * Hit path API.
  * Assert every edge has non‑null `via` fields.
  * Spot check a few known edges for exact `reason` and evidence.

**Acceptance criteria:**

* [ ] Automated tests fail if any edge is emitted without `via`.
* [ ] Coverage includes at least one example for each `EdgeReason` you support.

---

## 7. Observability, guardrails & rollout

### 7.1 Metrics & logging

**Owner:** Observability / platform

**Tasks:**

* Emit metrics:

  * `% edges with reason != unknown`
  * Count by `reason` and `confidence`
* Log warnings when:

  * Edge is emitted with `reason = unknown`.
  * Evidence is empty for a non‑unknown reason.

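The first two metrics can be computed from the stream of emitted reason strings. A minimal sketch, assuming hypothetical `ReasonStats` and `EdgeReasonMetrics` types; how the numbers are published (counters, gauges, log lines) depends on your metrics stack and is left out, as is the breakdown by `confidence`, which would follow the same grouping pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record ReasonStats(
    int Total,
    int Unknown,
    IReadOnlyDictionary<string, int> CountByReason)
{
    // Percentage of edges whose reason is not "unknown"; 100 when there are no edges.
    public double KnownPercent =>
        Total == 0 ? 100.0 : 100.0 * (Total - Unknown) / Total;
}

public static class EdgeReasonMetrics
{
    // Groups reason strings and counts how many fell back to "unknown".
    public static ReasonStats Summarize(IEnumerable<string> reasons)
    {
        var counts = reasons
            .GroupBy(r => r, StringComparer.Ordinal)
            .ToDictionary(g => g.Key, g => g.Count(), StringComparer.Ordinal);

        var total = counts.Values.Sum();
        counts.TryGetValue("unknown", out var unknown); // stays 0 when absent

        return new ReasonStats(total, unknown, counts);
    }

    // True when the share of unknown edges is above the threshold (e.g. 5.0 for 5%).
    public static bool ExceedsUnknownThreshold(ReasonStats stats, double thresholdPercent)
        => 100.0 - stats.KnownPercent > thresholdPercent;
}
```
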
**Acceptance criteria:**

* [ ] Dashboards showing distribution of edge reasons over time.
* [ ] Alerts if `unknown` reason edges exceed a threshold (e.g. >5%).

---

### 7.2 Rollout plan

**Owner:** PM + tech leads

**Steps:**

1. **Phase 1 – Dark‑launch metadata:**

   * Start generating & storing `via` for new scans.
   * Keep UI unchanged.
   * Monitor metrics, unknown ratio, and storage overhead.

2. **Phase 2 – Enable for internal users:**

   * Toggle UI on (feature flag for internal / beta users).
   * Collect feedback from security engineers and auditors.

3. **Phase 3 – General availability:**

   * Enable UI for all.
   * Update customer‑facing documentation & audit guides.

---

### 7.3 Documentation

**Owner:** Docs / PM

* Short **“Why this edge exists”** section in:

  * Product docs (for customers).
  * Internal runbooks (for support & SEs).
* Include:

  * Table of reasons → human descriptions.
  * Examples of path explanations (e.g., “This edge exists because `app` declares `urllib3` in `requirements.txt` and calls it in `client.py:42`”).

---

## 8. Ready‑to‑use ticket breakdown

You can almost copy‑paste these into your tracker:

1. **Shared**: Define EdgeReason, EdgeVia & EdgeProvenance in shared library, plus EdgeViaFactory.
2. **SBOM**: Use EdgeFactory.DeclaredDependency for all manifest‑generated edges.
3. **Source**: Wire all callgraph edges to SourceEdgeFactory (static_call, dynamic_import, reflection_call, plugin_discovery, taint_propagation).
4. **Binary**: Wire relocations/PLT/GOT edges to BinaryEdgeFactory (symbol_relocation, plt_got_resolution, ld_preload_injection).
5. **Data**: Add via_* columns/properties to graph_edges storage and map to/from domain.
6. **API**: Extend graph path DTOs to include `via`, update OpenAPI, and implement /v2 endpoints if needed.
7. **UI**: Show edge reason, evidence, and provenance in vulnerability path screens and add filters.
8. **Testing**: Add unit, integration, and end‑to‑end tests ensuring every edge has non‑null `via`.
9. **Observability**: Add metrics and logs for edge reasons and unknown rates.
10. **Docs & rollout**: Write glossary + auditor docs and plan staged rollout.

---

If you tell me a bit about your current storage (e.g., Neo4j vs SQL) and the services’ names, I can tailor this into an even more literal set of code snippets and migrations to match your stack exactly.