feat: add stella-callgraph-node for JavaScript/TypeScript call graph extraction

- Implemented a new tool `stella-callgraph-node` that extracts call graphs from JavaScript/TypeScript projects using Babel AST.
- Added a command-line interface with options for JSON output and help.
- Included functionality to analyze project structure, detect functions, and build call graphs.
- Created a package.json file for dependency management.

feat: introduce stella-callgraph-python for Python call graph extraction

- Developed `stella-callgraph-python` to extract call graphs from Python projects using AST analysis.
- Implemented a command-line interface with options for JSON output and verbose logging.
- Added framework detection to identify popular web frameworks and their entry points.
- Created an AST analyzer to traverse Python code and extract function definitions and calls.
- Included requirements.txt for project dependencies.

chore: add framework detection for Python projects

- Implemented framework detection logic to identify frameworks like Flask, FastAPI, Django, and others based on project files and import patterns.
- Enhanced the AST analyzer to recognize entry points based on decorators and function definitions.
Below is a **feature → moat strength** map for Stella Ops, explicitly benchmarked against the tools we’ve been discussing (Trivy/Aqua, Grype/Syft, Anchore Enterprise, Snyk, Prisma Cloud). I’m using **“moat”** in the strict sense: *how hard is it for an incumbent to replicate the capability to parity, and how strong are the switching costs once deployed.*
### Moat scale

* **5 = Structural moat** (new primitives, strong defensibility, durable switching cost)
* **4 = Strong moat** (difficult multi-domain engineering; incumbents have only partial analogs)
* **3 = Moderate moat** (others can build; differentiation is execution + packaging)
* **2 = Weak moat** (table-stakes soon; limited defensibility)
* **1 = Commodity** (widely available in OSS / easy to replicate)

---
## 1) Stella Ops candidate features mapped to moat strength
| Stella Ops feature (precisely defined) | Closest competitor analogs (evidence) | Competitive parity today | Moat strength | Why this is (or isn’t) defensible | How to harden the moat |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------: | ------------: | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Signed, replayable risk verdicts**: “this artifact is acceptable” decisions produced deterministically, with an evidence bundle + policy snapshot, signed as an attestation | Ecosystem can sign SBOM attestations (e.g., Syft + Sigstore; DSSE/in-toto via cosign), but not “risk verdict” decisions end-to-end ([Anchore][1]) | Low | **5** | This requires a **deterministic evaluation model**, a **proof/evidence schema**, and “knowledge snapshotting” so results are replayable months later. Incumbents mostly stop at exporting scan results or SBOMs, not signing a decision in a reproducible way. | Make the verdict format a **first-class artifact** (OCI-attached attestation), with strict replay semantics (“same inputs → same verdict”), plus auditor-friendly evidence extraction. |
| **VEX decisioning engine (not just ingestion)**: ingest OpenVEX/CycloneDX/CSAF, resolve conflicts with a trust/policy lattice, and produce explainable outcomes | Trivy supports multiple VEX formats (CycloneDX/OpenVEX/CSAF) but notes it’s “experimental/minimal functionality” ([Trivy][2]). Grype supports OpenVEX ingestion ([Chainguard][3]). Anchore can generate VEX docs from annotations (OpenVEX + CycloneDX) ([Anchore Docs][4]). Aqua runs VEX Hub for distributing VEX statements to Trivy ([Aqua][5]) | Medium (ingestion exists; decision logic is thin) | **4** | Ingestion alone is easy; the moat comes from **formal conflict resolution**, provenance-aware trust weighting, and deterministic outcomes. Most tools treat VEX as suppression/annotation, not a reasoning substrate. | Ship a **policy-controlled merge semantics** (“vendor > distro > internal” is too naive) + required evidence hooks (e.g., “not affected because feature flag off”). |
| **Reachability with proof**, tied to deployable artifacts: produce a defensible chain “entrypoint → call path → vulnerable symbol,” plus configuration gates | Snyk has reachability analysis in GA for certain languages/integrations and uses call-graph style reasoning to determine whether vulnerable code is called ([Snyk User Docs][6]). Some commercial vendors also market reachability (e.g., Endor Labs is listed in CycloneDX Tool Center as analyzing reachability) ([CycloneDX][7]) | Medium (reachability exists, but proof portability varies) | **4** | “Reachability” as a label is no longer unique. The moat is **portable proofs** (usable in audits and in air-gapped environments) + artifact-level mapping (not just source repo analysis) + deterministic replay. | Focus on **proof-carrying reachability**: store the reachability subgraph as evidence; make it reproducible and attestable; support both source and post-build artifacts. |
| **Smart-Diff (semantic risk delta)**: between releases, explain “what materially changed in exploitable surface,” not just “CVE count changed” | Anchore provides SBOM management and policy evaluation (good foundation), but “semantic risk diff” is not a prominent, standardized feature in typical scanners ([Anchore Docs][8]) | Low–Medium | **4** | Most incumbents can diff findings lists. Few can diff **reachability graphs, policy outcomes, and VEX state** to produce stable “delta narratives.” Hard to replicate without the underlying evidence model. | Treat diff as first-class: version SBOM graphs + reachability graphs + VEX claims; compute deltas over those graphs and emit a signed “delta verdict.” |
| **Unknowns as first-class state**: represent “unknown-reachable/unknown-unreachable” and force policies to account for uncertainty | Not a standard capability in common scanners/platforms; most systems output findings and (optionally) suppressions | Low | **4** | This is conceptually simple but operationally rare; it requires rethinking UX, scoring, and policy evaluation. It becomes sticky once orgs base governance on uncertainty budgets. | Bake unknowns into policies (“fail if unknowns > N in prod”), reporting, and attestations. Make it the default rather than optional. |
| **Air-gapped epistemic mode**: offline operation where the tool can prove what knowledge it used (feed snapshot + timestamps + trust anchors) | Prisma Cloud Compute Edition supports air-gapped environments and has an offline Intel Stream update mechanism ([Prisma Cloud Docs][9]). (But “prove exact knowledge state used for decisions” is typically not the emphasis.) | Medium | **4** | Air-gapped “runtime” is common; air-gapped **reproducibility** is not. The moat is packaging offline feeds + policies + deterministic scoring into a replayable bundle tied to attestations. | Deliver a “sealed knowledge snapshot” workflow (export/import), and make audits a one-command replay. |
| **SBOM ledger + lineage**: BYOS ingestion plus versioned SBOM storage, grouping, and historical tracking | Anchore explicitly positions centralized SBOM management and “Bring Your Own SBOM” ([Anchore Docs][8]). Snyk can generate SBOMs and expose SBOM via API in CycloneDX/SPDX formats ([Snyk User Docs][10]). Prisma can export CycloneDX SBOMs for scans ([Prisma Cloud Docs][11]) | High | **3** | SBOM generation/storage is quickly becoming table stakes. You can still differentiate on **graph fidelity + lineage semantics**, but “having SBOMs” alone won’t be a moat. | Make the ledger valuable via **semantic diff, evidence joins (reachability/VEX), and provenance** rather than storage. |
| **Policy engine with proofs**: policy-as-code that produces a signed explanation (“why pass/fail”) and links to evidence nodes | Anchore has a mature policy model (policy JSON, gates, allowlists, mappings) ([Anchore Docs][12]). Prisma/Aqua have rich policy + runtime guardrails (platform-driven) ([Aqua][13]) | High | **3** | Policy engines are common. The moat is the **proof output** + deterministic replay + integration with attestations. | Keep policy language small but rigorous; always emit evidence pointers; support “policy compilation” to deterministic decision artifacts. |
| **VEX distribution network**: ecosystem layer that aggregates, validates, and serves VEX at scale | Aqua’s VEX Hub is explicitly a centralized repository designed for discover/fetch/consume flows with Trivy ([Aqua][5]) | Medium | **3–4** | A network layer can become a moat if it achieves broad adoption. But incumbents can also launch hubs. This becomes defensible only with **network effects + trust frameworks**. | Differentiate with **verification + trust scoring** of VEX sources, plus tight coupling to deterministic decisioning and attestations. |
| **“Integrations everywhere”** (CI/CD, registry, Kubernetes, IDE) | Everyone in this space integrates broadly; reachability and scoring features often ride those integrations (e.g., Snyk reachability depends on repo/integration access) ([Snyk User Docs][6]) | High | **1–2** | Integrations are necessary, but not defensible—mostly engineering throughput. | Use integrations to *distribute attestations and proofs*, not as the headline differentiator. |
---
## 2) Where competitors already have strong moats (avoid head‑on fights early)
These are areas where incumbents are structurally advantaged, so Stella Ops should either (a) integrate rather than replace, or (b) compete only if you have a much sharper wedge.
### Snyk’s moat: developer adoption + reachability-informed prioritization

* Snyk publicly documents **reachability analysis** (GA for certain integrations/languages) ([Snyk User Docs][6])
* Snyk prioritization incorporates reachability and other signals into **Priority Score** ([Snyk User Docs][14])

**Implication:** pure “reachability” claims won’t beat Snyk; **proof-carrying, artifact-tied, replayable reachability** can.
### Prisma Cloud’s moat: CNAPP breadth + graph-based risk prioritization + air-gapped CWPP

* Prisma invests in graph-driven investigation/tracing of vulnerabilities ([Prisma Cloud Docs][15])
* Risk prioritization and risk-score ranked vulnerability views are core platform capabilities ([Prisma Cloud Docs][16])
* Compute Edition supports **air-gapped environments** and has offline update workflows ([Prisma Cloud Docs][9])

**Implication:** competing on “platform breadth” is a losing battle early; compete on **decision integrity** (deterministic, attestable, replayable) and integrate where needed.
### Anchore’s moat: SBOM operations + policy-as-code maturity

* Anchore is explicitly SBOM-management centric and supports policy gating constructs ([Anchore Docs][8])

**Implication:** Anchore is strong at “SBOM at scale.” Stella Ops should outperform on **semantic diff, VEX reasoning, and proof outputs**, not just SBOM storage.
### Aqua’s moat: code-to-runtime enforcement plus emerging VEX distribution

* Aqua provides CWPP-style runtime policy enforcement/guardrails ([Aqua][13])
* Aqua backs VEX Hub for VEX distribution and Trivy consumption ([Aqua][5])

**Implication:** if Stella Ops is not a runtime protection platform, don’t chase CWPP breadth—use Aqua/Prisma integrations and focus on upstream decision quality.

---

## 3) Practical positioning: which features produce the most durable wedge
If you want the shortest path to a *defensible* position:

1. **Moat anchor (5): Signed, replayable risk verdicts**
   * Everything else (VEX, reachability, diff) becomes evidence feeding that verdict.
2. **Moat amplifier (4): VEX decisioning + proof-carrying reachability**
   * In 2025, VEX ingestion exists in Trivy/Grype/Anchore ([Trivy][2]), and reachability exists in Snyk ([Snyk User Docs][6]).
   * Your differentiation must be: **determinism + portability + auditability**.
3. **Moat compounding (4): Smart-Diff over risk meaning**
   * Turns “scan results” into an operational change-control primitive.

---
## 4) A concise “moat thesis” per feature (one-liners you can use internally)

* **Deterministic signed verdicts:** “We don’t output findings; we output an attestable decision that can be replayed.”
* **VEX decisioning:** “We treat VEX as a logical claim system, not a suppression file.”
* **Reachability proofs:** “We provide proof of exploitability in *this* artifact, not just a badge.”
* **Smart-Diff:** “We explain what changed in exploitable surface area, not what changed in CVE count.”
* **Unknowns modeling:** “We quantify uncertainty and gate on it.”

---
If you want, I can convert the table into a **2×2 moat map** (Customer Value vs Defensibility) and a **build-order roadmap** that maximizes durable advantage while minimizing overlap with entrenched competitor moats.
[1]: https://anchore.com/sbom/creating-sbom-attestations-using-syft-and-sigstore/?utm_source=chatgpt.com "Creating SBOM Attestations Using Syft and Sigstore"
[2]: https://trivy.dev/docs/v0.50/supply-chain/vex/?utm_source=chatgpt.com "VEX"
[3]: https://www.chainguard.dev/unchained/vexed-then-grype-about-it-chainguard-and-anchore-announce-grype-supports-openvex?utm_source=chatgpt.com "VEXed? Then Grype about it"
[4]: https://docs.anchore.com/current/docs/vulnerability_management/vuln_annotations/?utm_source=chatgpt.com "Vulnerability Annotations and VEX"
[5]: https://www.aquasec.com/blog/introducing-vex-hub-unified-repository-for-vex-statements/?utm_source=chatgpt.com "Trivy VEX Hub: The Solution to Vulnerability Fatigue"
[6]: https://docs.snyk.io/manage-risk/prioritize-issues-for-fixing/reachability-analysis?utm_source=chatgpt.com "Reachability analysis"
[7]: https://cyclonedx.org/tool-center/?utm_source=chatgpt.com "CycloneDX Tool Center"
[8]: https://docs.anchore.com/current/docs/sbom_management/?utm_source=chatgpt.com "SBOM Management"
[9]: https://docs.prismacloud.io/en/compute-edition?utm_source=chatgpt.com "Prisma Cloud Compute Edition"
[10]: https://docs.snyk.io/developer-tools/snyk-cli/commands/sbom?utm_source=chatgpt.com "SBOM | Snyk User Docs"
[11]: https://docs.prismacloud.io/en/compute-edition/32/admin-guide/vulnerability-management/exporting-sboms?utm_source=chatgpt.com "Exporting Software Bill of Materials on CycloneDX"
[12]: https://docs.anchore.com/current/docs/overview/concepts/policy/policies/?utm_source=chatgpt.com "Policies and Evaluation"
[13]: https://www.aquasec.com/products/cwpp-cloud-workload-protection/?utm_source=chatgpt.com "Cloud workload protection in Runtime - Aqua Security"
[14]: https://docs.snyk.io/manage-risk/prioritize-issues-for-fixing?utm_source=chatgpt.com "Prioritize issues for fixing"
[15]: https://docs.prismacloud.io/en/enterprise-edition/content-collections/search-and-investigate/c2c-tracing-vulnerabilities/investigate-vulnerabilities-tracing?utm_source=chatgpt.com "Use Vulnerabilities Tracing on Investigate"
[16]: https://docs.prismacloud.io/en/enterprise-edition/use-cases/secure-the-infrastructure/risk-prioritization?utm_source=chatgpt.com "Risk Prioritization - Prisma Cloud Documentation"

## Trust Algebra and Lattice Engine Specification

This spec defines a deterministic “Trust Algebra / Lattice Engine” that ingests heterogeneous security assertions (SBOM, VEX, reachability, provenance attestations), normalizes them into a canonical claim model, merges them using lattice operations that preserve **unknowns and contradictions**, and produces a **signed, replayable verdict** with an auditable proof trail.

The design deliberately separates:

1. **Knowledge aggregation** (monotone, conflict-preserving, order-independent), from
2. **Decision selection** (policy-driven, trust-aware, environment-aware).

This prevents “heuristics creep” and makes the system explainable and reproducible.

---

# 1) Scope and objectives

### 1.1 What the engine must do

* Accept VEX from multiple standards (OpenVEX, CSAF VEX, CycloneDX/ECMA-424 VEX).
* Accept internally generated evidence (SBOM, reachability proofs, mitigations, patch/pedigree evidence).
* Merge claims while representing:
  * **Unknown** (no evidence)
  * **Conflict** (credible evidence for both sides)
* Compute an output disposition aligned to common VEX output states:
  * CycloneDX impact-analysis states include: `resolved`, `resolved_with_pedigree`, `exploitable`, `in_triage`, `false_positive`, `not_affected`. ([Ecma International][1])
* Provide deterministic, signed, replayable results:
  * Same inputs + same policy bundle ⇒ same outputs.
  * Produce a proof object that can be independently verified offline.

### 1.2 Non-goals

* “One score to rule them all” without proofs.
* Probabilistic scoring as the primary decision mechanism.
* Trust by vendor branding instead of cryptographic/verifiable identity.

---

# 2) Standards surface (external inputs) and canonicalization targets

The engine should support at minimum these external statement types:

### 2.1 CycloneDX / ECMA-424 VEX (embedded)

CycloneDX’s vulnerability “impact analysis” model defines:

* `analysis.state` values: `resolved`, `resolved_with_pedigree`, `exploitable`, `in_triage`, `false_positive`, `not_affected`. ([Ecma International][1])
* `analysis.justification` values: `code_not_present`, `code_not_reachable`, `requires_configuration`, `requires_dependency`, `requires_environment`, `protected_by_compiler`, `protected_at_runtime`, `protected_at_perimeter`, `protected_by_mitigating_control`. ([Ecma International][1])

This is the richest mainstream state model; we will treat it as the “maximal” target semantics.

### 2.2 OpenVEX

OpenVEX defines status labels:

* `not_affected`, `affected`, `fixed`, `under_investigation`. ([Docker Documentation][2])

For `not_affected`, OpenVEX requires supplying either a status justification or an `impact_statement`. ([GitHub][3])

### 2.3 CSAF VEX

CSAF VEX requires `product_status` containing at least one of:

* `fixed`, `known_affected`, `known_not_affected`, `under_investigation`. ([OASIS Documents][4])

### 2.4 Provenance / attestations

The engine should ingest signed attestations, particularly DSSE-wrapped in-toto statements (common in Sigstore/Cosign flows). Sigstore documentation states payloads are signed using the DSSE signing spec. ([Sigstore][5])

DSSE’s design highlights include binding the payload **and its type** to prevent confusion attacks, and avoiding canonicalization to reduce attack surface. ([GitHub][6])

---

# 3) Canonical internal model

## 3.1 Core identifiers

### Subject identity

A **Subject** is what we are making a security determination about.

Minimum viable Subject key:

* `artifact.digest` (e.g., OCI image digest, binary hash)
* `component.id` (prefer `purl`, else `cpe`, else `bom-ref`)
* `vuln.id` (CVE/OSV/etc.)
* `context.id` (optional but recommended; see below)

```
Subject := (ArtifactRef, ComponentRef, VulnerabilityRef, ContextRef?)
```

### Context identity (optional but recommended)

ContextRef allows environment-sensitive statements to remain valid and deterministic:

* build flags
* runtime config profile (e.g., feature gates)
* deployment mode (cluster policy)
* OS / libc family
* FIPS mode, SELinux/AppArmor posture, etc.

ContextRef must be hashable (canonical JSON → digest).
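As a sketch of that requirement (the context fields and helper name are illustrative, not part of the spec), canonicalizing to JSON with sorted keys and compact separators yields a digest that is independent of field order:

```python
import hashlib
import json

def context_digest(context: dict) -> str:
    """Digest a context profile: canonical JSON (sorted keys,
    no insignificant whitespace) hashed with SHA-256."""
    canonical = json.dumps(context, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Key order must not affect the resulting ContextRef digest.
a = context_digest({"fips_mode": True, "os_family": "linux/glibc"})
b = context_digest({"os_family": "linux/glibc", "fips_mode": True})
assert a == b
```

A production implementation would need to pin down canonicalization of floats and non-string keys as well (e.g., by adopting RFC 8785 JCS).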

---

## 3.2 Claims, evidence, attestations

### Claim

A **Claim** is a signed or unsigned assertion about a Subject.

Required fields:

* `claim.id`: content-addressable digest of canonical claim JSON
* `claim.subject`
* `claim.issuer`: principal identity
* `claim.time`: `issued_at`, `valid_from`, `valid_until` (optional)
* `claim.assertions[]`: list of atomic assertions (see §4)
* `claim.evidence_refs[]`: pointers to evidence objects
* `claim.signature`: optional DSSE / signature wrapper reference

### Evidence

Evidence is a typed object that supports replay and audit:

* `evidence.type`: e.g., `sbom_node`, `callgraph_path`, `loader_resolution`, `config_snapshot`, `patch_diff`, `pedigree_commit_chain`
* `evidence.digest`: hash of canonical bytes
* `evidence.producer`: tool identity and version
* `evidence.time`
* `evidence.payload_ref`: CAS pointer
* `evidence.signature_ref`: optional (attested evidence)

### Attestation wrapper

For signed payloads (claims or evidence bundles):

* Prefer DSSE envelopes for transport/type binding. ([GitHub][6])
* Prefer the in-toto statement structure (subject + predicate + type).

---

# 4) The fact lattice: representing truth, unknowns, and conflicts

## 4.1 Why a lattice, not booleans

For vulnerability disposition you will routinely see:

* **no evidence** (unknown)
* **incomplete evidence** (triage)
* **contradictory evidence** (vendor says not affected; scanner says exploitable)

A boolean cannot represent these safely.

## 4.2 Four-valued fact lattice (Belnap-style)

For each atomic proposition `p`, the engine stores a value in:

```
K4 := { ⊥, T, F, ⊤ }

⊥ = unknown (no support)
T = supported true
F = supported false
⊤ = conflict (support for both true and false)
```

### Knowledge ordering (≤k)

* ⊥ ≤k T ≤k ⊤
* ⊥ ≤k F ≤k ⊤
* T and F are incomparable

### Join operator (⊔k)

Join is “union of support” and is monotone:

* ⊥ ⊔k x = x
* T ⊔k F = ⊤
* ⊤ ⊔k x = ⊤
* T ⊔k T = T, F ⊔k F = F

This operator is order-independent; it provides deterministic aggregation even under parallel ingestion.
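A minimal sketch of K4 and ⊔k in Python (the enum and function names are ours, not mandated by this spec):

```python
from enum import Enum
from functools import reduce

class K4(Enum):
    BOTTOM = "unknown"   # ⊥: no support
    T = "true"           # supported true
    F = "false"          # supported false
    TOP = "conflict"     # ⊤: support for both sides

def join(x: K4, y: K4) -> K4:
    """Knowledge join ⊔k: union of support (commutative, associative, idempotent)."""
    if x == y:
        return x
    if x == K4.BOTTOM:
        return y
    if y == K4.BOTTOM:
        return x
    # Any remaining combination carries opposing or already-conflicting support.
    return K4.TOP

# Order independence: folding the same claims in any order yields the same value.
claims = [K4.BOTTOM, K4.T, K4.F]
assert reduce(join, claims) == reduce(join, list(reversed(claims))) == K4.TOP
```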

---

# 5) Atomic propositions (canonical “security atoms”)

For each Subject `S`, the engine maintains K4 truth values for these propositions:

1. **PRESENT**: the component instance is present in the artifact/context.
2. **APPLIES**: the vulnerability applies to that component (version/range/cpe match).
3. **REACHABLE**: vulnerable code is reachable in the given context.
4. **MITIGATED**: controls prevent exploitation (compiler/runtime/perimeter/controls).
5. **FIXED**: remediation has been applied to the artifact.
6. **MISATTRIBUTED**: the finding is a false association (false positive).

These atoms are intentionally orthogonal; external formats are normalized into them.

---

# 6) Trust algebra: principals, assurance, and authority

Trust is not a single number; it must represent:

* cryptographic verification
* identity assurance
* authority scope
* freshness/revocation
* evidence strength

We model trust as a label computed deterministically from policy + verification.

## 6.1 Principal

A principal is an issuer identity with verifiable keys:

* `principal.id` (URI-like)
* `principal.key_ids[]`
* `principal.identity_claims` (e.g., cert SANs, OIDC subject, org, repo)
* `principal.roles[]` (vendor, distro, internal-sec, build-system, scanner, auditor)

## 6.2 Trust label

A trust label is a tuple:

```
TrustLabel := (
  assurance_level,   // cryptographic + identity verification strength
  authority_scope,   // what subjects this principal is authoritative for
  freshness_class,   // time validity
  evidence_class     // strength/type of evidence attached
)
```

### Assurance levels (example)

Deterministic levels, increasing:

* A0: unsigned / unverifiable
* A1: signed, key known but weak identity binding
* A2: signed, verified identity (e.g., cert chain / keyless identity)
* A3: signed + provenance binding to artifact digest
* A4: signed + provenance + transparency log inclusion (if available)

Sigstore cosign’s attestation verification references DSSE signing for payloads. ([Sigstore][5])
DSSE’s design includes payload-type binding and avoids canonicalization. ([GitHub][6])

### Authority scope

Authority is not purely cryptographic. It is a policy-defined mapping between:

* principal identity and
* subject namespaces (vendors, products, package namespaces, internal artifacts)

Examples:

* A vendor principal is authoritative for `product.vendor == VendorX`.
* A distro principal is authoritative for packages under its repos.
* An internal security principal is authoritative for internal runtime reachability proofs.

### Evidence class

Evidence class is derived from evidence types:

* E0: statement-only (no supporting evidence refs)
* E1: SBOM linkage evidence (component present + version)
* E2: reachability/mitigation evidence (call paths, config snapshots)
* E3: remediation evidence (patch diffs, pedigree/commit chain)

CycloneDX/ECMA-424 explicitly distinguishes `resolved_with_pedigree` as remediation with verifiable commit history/diffs in pedigree. ([Ecma International][1])

## 6.3 Trust ordering and operators

Trust labels define a partial order ≤t (policy-defined). A simple implementation is component-wise ordering, but authority scope is set-based.

Core operators:

* **join (⊔t)**: combine independent supporting trust (often max-by-order)
* **meet (⊓t)**: compose along a dependency chain (often min-by-order)
* **compose (⊗)**: trust of a derived claim = min(trust of prerequisites), adjusted by method assurance

**Important:** Trust affects **decision selection**, not raw knowledge aggregation. Aggregation retains conflicts even if one side is low-trust.
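As an illustration only, a component-wise label over numeric A/E levels might look like this (authority scope and freshness are omitted for brevity; the full model treats authority as set-based, so this is a simplification, not the specified algebra):

```python
from typing import NamedTuple

class TrustLabel(NamedTuple):
    assurance: int  # A0..A4
    evidence: int   # E0..E3

def t_join(a: TrustLabel, b: TrustLabel) -> TrustLabel:
    """⊔t: independent corroboration; keep the stronger component."""
    return TrustLabel(max(a.assurance, b.assurance), max(a.evidence, b.evidence))

def t_meet(a: TrustLabel, b: TrustLabel) -> TrustLabel:
    """⊓t: composition along a dependency chain; the weakest link wins."""
    return TrustLabel(min(a.assurance, b.assurance), min(a.evidence, b.evidence))

vendor = TrustLabel(assurance=3, evidence=0)   # signed vendor VEX, statement-only
scanner = TrustLabel(assurance=2, evidence=2)  # scanner claim with reachability evidence
assert t_join(vendor, scanner) == TrustLabel(3, 2)
assert t_meet(vendor, scanner) == TrustLabel(2, 0)
```

Note that component-wise max can combine components from different labels; a policy may instead require selecting one maximal label whole, which is a different (also lawful) choice.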

---

# 7) Normalization: external VEX → canonical atoms

## 7.1 CycloneDX / ECMA-424 normalization

From `analysis.state` ([Ecma International][1]):

* `resolved` → FIXED := T
* `resolved_with_pedigree` → FIXED := T, and require pedigree/diff evidence (E3)
* `exploitable` → APPLIES := T, REACHABLE := T, MITIGATED := F (unless explicit mitigation evidence exists)
* `in_triage` → mark the triage flag; leave atoms mostly ⊥ unless other fields are present
* `false_positive` → MISATTRIBUTED := T
* `not_affected` → requires justification mapping (below)

From `analysis.justification` ([Ecma International][1]), map into atoms as conditional facts (context-sensitive):

* `code_not_present` → PRESENT := F
* `code_not_reachable` → REACHABLE := F
* `requires_configuration` → REACHABLE := F *under current config snapshot*
* `requires_dependency` → REACHABLE := F *unless dependency present*
* `requires_environment` → REACHABLE := F *under current environment constraints*
* `protected_by_compiler` / `protected_at_runtime` / `protected_at_perimeter` / `protected_by_mitigating_control` → MITIGATED := T (with evidence refs expected)

## 7.2 OpenVEX normalization

OpenVEX statuses: `not_affected`, `affected`, `fixed`, `under_investigation`. ([Docker Documentation][2])
For `not_affected`, OpenVEX requires justification or an impact statement. ([GitHub][3])

Mapping:

* `fixed` → FIXED := T
* `affected` → APPLIES := T (conservative; leave REACHABLE := ⊥ unless evidence is present)
* `under_investigation` → triage flag
* `not_affected` → choose the mapping based on the provided justification / impact statement:
  * component not present → PRESENT := F
  * vulnerable code not reachable → REACHABLE := F
  * mitigations already exist → MITIGATED := T
  * otherwise → APPLIES := F only if explicitly asserted

## 7.3 CSAF VEX normalization

CSAF `product_status` includes `fixed`, `known_affected`, `known_not_affected`, `under_investigation`. ([OASIS Documents][4])

Mapping:

* `fixed` → FIXED := T
* `known_affected` → APPLIES := T
* `known_not_affected` → APPLIES := F, unless a stronger justification indicates PRESENT := F / REACHABLE := F / MITIGATED := T
* `under_investigation` → triage flag

---

# 8) Lattice engine: aggregation algorithm

Aggregation is pure, monotone, and order-independent.

## 8.1 Support sets

For each Subject `S` and atom `p`, maintain:

* `SupportTrue[S,p]` = set of claim IDs supporting p = true
* `SupportFalse[S,p]` = set of claim IDs supporting p = false

Optionally store per-support:

* trust label
* evidence digests
* timestamps

## 8.2 Compute K4 value

For each `(S,p)`:

* if both support sets are empty → ⊥
* if only the true side is non-empty → T
* if only the false side is non-empty → F
* if both are non-empty → ⊤
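Since the K4 value is a pure function of the two support sets, the rule above is only a few lines (names are ours):

```python
def k4_value(support_true: set, support_false: set) -> str:
    """Derive the K4 value for one (Subject, atom) pair from its support sets."""
    if not support_true and not support_false:
        return "⊥"  # unknown: no support on either side
    if support_true and not support_false:
        return "T"
    if support_false and not support_true:
        return "F"
    return "⊤"      # conflict: support on both sides

assert k4_value(set(), set()) == "⊥"
assert k4_value({"claim-a"}, set()) == "T"
assert k4_value({"claim-a"}, {"claim-b"}) == "⊤"
```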

## 8.3 Track trust on each side

Maintain:

* `TrustTrue[S,p]` = max trust label among SupportTrue
* `TrustFalse[S,p]` = max trust label among SupportFalse

This enables policy selection without losing conflict information.

---

# 9) Decision selection: from atoms → disposition

Decision selection is where “trust algebra” actually participates. It is **policy-driven** and can differ by environment (prod vs dev, regulated vs non-regulated).

## 9.1 Output disposition space

The engine should be able to emit a CycloneDX-compatible disposition (ECMA-424): ([Ecma International][1])

* `resolved_with_pedigree`
* `resolved`
* `false_positive`
* `not_affected`
* `exploitable`
* `in_triage`
## 9.2 Deterministic selection rules (baseline)

Define `D(S)`:

1. If `FIXED == T` and pedigree evidence meets the threshold → `resolved_with_pedigree`
2. Else if `FIXED == T` → `resolved`
3. Else if `MISATTRIBUTED == T` and trust ≥ threshold → `false_positive`
4. Else if `APPLIES == F` or `PRESENT == F` → `not_affected`
5. Else if `REACHABLE == F` or `MITIGATED == T` → `not_affected` (with justification)
6. Else if `REACHABLE == T` and `MITIGATED != T` → `exploitable`
7. Else → `in_triage`
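The cascade above translates directly into code. In this sketch the pedigree and trust threshold checks are stubbed as booleans (the real checks depend on the policy bundle of §11), and atoms carry the K4 symbols from §8:

```python
def decide(atoms, pedigree_ok=False, misattribution_trust_ok=False):
    """Baseline D(S): atoms maps atom name -> one of '⊥', 'T', 'F', '⊤'."""
    a = lambda p: atoms.get(p, "⊥")          # absent atom = no information
    if a("FIXED") == "T":                     # rules 1-2
        return "resolved_with_pedigree" if pedigree_ok else "resolved"
    if a("MISATTRIBUTED") == "T" and misattribution_trust_ok:
        return "false_positive"               # rule 3
    if a("APPLIES") == "F" or a("PRESENT") == "F":
        return "not_affected"                 # rule 4
    if a("REACHABLE") == "F" or a("MITIGATED") == "T":
        return "not_affected"                 # rule 5, justification in proof
    if a("REACHABLE") == "T" and a("MITIGATED") != "T":
        return "exploitable"                  # rule 6
    return "in_triage"                        # rule 7
```

Note how a conflicted atom (⊤) matches neither the `== "T"` nor the `== "F"` branches, so conflicts fall through to `in_triage` by default, which is exactly the skeptical stance described next.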
## 9.3 Conflict-handling modes (policy selectable)

When any required atom is ⊤ (conflict) or ⊥ (unknown), the policy chooses a stance:

* **Skeptical (default for production gating):**

  * conflict/unknown biases toward `in_triage` or `exploitable`, depending on risk tolerance

* **Authority-weighted:**

  * if a high-authority vendor statement conflicts with low-trust scanner output, accept the vendor but record the conflict in the proof

* **Quorum-based:**

  * accept `not_affected` only if:

    * vendor trust ≥ A3, OR
    * internal reachability proof trust ≥ A3, OR
    * two independent principals ≥ A2 agree

  Otherwise remain `in_triage`.
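The quorum rule can be sketched as a predicate over the supporting claims. The claim dict shape (`principal`, `role`, `assurance`, `evidence_type`) is an illustrative assumption; assurance levels A1 < A2 < A3 compare by rank:

```python
RANK = {"A1": 1, "A2": 2, "A3": 3}   # assurance ordering (assumed)

def quorum_accepts(claims):
    """Accept `not_affected` only under one of the quorum conditions above."""
    for c in claims:
        # vendor statement at A3 or above
        if c["role"] == "vendor" and RANK[c["assurance"]] >= RANK["A3"]:
            return True
        # internal reachability proof at A3 or above
        if c.get("evidence_type") == "callgraph_path" and RANK[c["assurance"]] >= RANK["A3"]:
            return True
    # two independent principals, each at A2 or above
    principals = {c["principal"] for c in claims if RANK[c["assurance"]] >= RANK["A2"]}
    return len(principals) >= 2
```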
This is where “trust algebra” expresses **institutional policy** without destroying the underlying knowledge.

---
# 10) Proof object: verifiable explainability

Every verdict emits a **Proof Bundle** that can be verified offline.

## 10.1 Proof bundle contents

* `subject` (canonical form)
* `inputs`:

  * list of claim IDs + digests
  * list of evidence digests
  * policy bundle digest
  * vulnerability feed snapshot digest (if applicable)

* `normalization`:

  * mappings applied (e.g., OpenVEX status → atoms)

* `atom_table`:

  * each atom p: K4 value, support sets, trust per side

* `decision_trace`:

  * rule IDs fired
  * thresholds used

* `output`:

  * disposition + justification + confidence metadata
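A minimal sketch of the bundle shape, stamped with a content digest so it can be verified offline. The sorted-key JSON here is a stand-in for the real canonical encoder required by §12; field names follow the list above:

```python
import hashlib
import json

def make_proof_bundle(subject, inputs, normalization, atom_table,
                      decision_trace, output):
    bundle = {
        "subject": subject,               # canonical subject form
        "inputs": inputs,                 # claim/evidence/policy/feed digests
        "normalization": normalization,   # mappings applied
        "atom_table": atom_table,         # per-atom K4 value, supports, trust
        "decision_trace": decision_trace, # rule IDs fired, thresholds used
        "output": output,                 # disposition + justification
    }
    canonical = json.dumps(bundle, sort_keys=True, separators=(",", ":"))
    bundle["digest"] = "sha256:" + hashlib.sha256(canonical.encode()).hexdigest()
    return bundle
```

Because the digest is computed over a canonical serialization, two evaluations with byte-identical inputs produce byte-identical proof digests, which is the property the determinism requirements in §12 exist to guarantee.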
## 10.2 Signing

The proof bundle is itself a payload suitable for DSSE signing, enabling attested verdicts. DSSE’s payload-type binding matters here: it ensures a proof bundle cannot be reinterpreted as a different payload class. ([GitHub][6])
---

# 11) Policy bundle specification (Trust + Decision DSL)

A policy bundle is a hashable document. Example structure (YAML-like; illustrative):
```yaml
policy_id: "org.prod.default.v1"

trust_roots:
  - principal: "did:web:vendor.example"
    min_assurance: A2
    authority:
      products: ["vendor.example/*"]

  - principal: "did:web:sec.internal"
    min_assurance: A2
    authority:
      artifacts: ["sha256:*"]   # internal is authoritative for internal artifacts

acceptance_thresholds:
  resolved_with_pedigree:
    min_evidence_class: E3
    min_assurance: A3

  not_affected:
    mode: quorum
    quorum:
      - any:
          - { principal_role: vendor, min_assurance: A3 }
          - { evidence_type: callgraph_path, min_assurance: A3 }
          - all:
              - { distinct_principals: 2 }
              - { min_assurance_each: A2 }

conflict_mode:
  production: skeptical
  development: authority_weighted
```
The engine must treat the policy bundle as an **input artifact** (hashed, stored, and referenced in proofs).

---
# 12) Determinism requirements

To guarantee deterministic replay:

1. **Canonical JSON** for all stored objects (claims, evidence, policy bundles).
2. **Content-addressing**:

   * `id = sha256(canonical_bytes)`

3. **Stable sorting**:

   * when iterating claims/evidence, sort by `(type, id)` to prevent nondeterministic traversal

4. **Time handling**:

   * evaluation time is an explicit input (e.g., an `as_of` timestamp)
   * expired claims are excluded deterministically

5. **Version pinning**:

   * tool identity + version recorded in evidence
   * vuln feed snapshot digests recorded
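Requirements 1–3 can be sketched in a few lines. `json.dumps` with sorted keys and fixed separators is a minimal stand-in for a full canonical-JSON encoder (e.g., JCS / RFC 8785); the claim dict shape is illustrative:

```python
import hashlib
import json

def canonical_bytes(obj):
    # Requirement 1: one canonical serialization per object.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

def content_id(obj):
    # Requirement 2: id = sha256(canonical_bytes).
    return "sha256:" + hashlib.sha256(canonical_bytes(obj)).hexdigest()

def stable_order(claims):
    # Requirement 3: iterate claims sorted by (type, id).
    return sorted(claims, key=lambda c: (c["type"], c["id"]))
```

Usage: two logically identical objects, regardless of key insertion order, always hash to the same content ID, so the same inputs replay to the same proof digest.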
---
# 13) Worked examples

## Example A: Vendor says not affected; scanner says exploitable

Inputs:

* OpenVEX: `not_affected` with a justification (required by the spec) ([GitHub][3])
* Internal scanner: flags the vulnerability as exploitable

Aggregation:

* REACHABLE: ⊤ (conflict)

Selection (production, skeptical):

* verdict: `in_triage`

Selection (authority-weighted, vendor authoritative):

* verdict: `not_affected`

The proof bundle records the conflict and why policy accepted the vendor statement.
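Example A in miniature, under the same assumptions as the sketches in §8 and §9.3 (K4 symbols as strings, illustrative claim IDs): both sides support REACHABLE, the atom collapses to ⊤, and the skeptical mode sends the conflict to triage.

```python
# One atom, supported on both sides by conflicting claims.
support = {"REACHABLE": {"true": {"scanner-claim-1"}, "false": {"vendor-vex-1"}}}

def k4(atom):
    s = support[atom]
    if s["true"] and s["false"]:
        return "⊤"                       # conflict
    return "T" if s["true"] else ("F" if s["false"] else "⊥")

def skeptical(atom_value):
    # Production gating: conflict/unknown on a required atom -> triage.
    return "in_triage" if atom_value in ("⊤", "⊥") else None

value = k4("REACHABLE")      # conflict on reachability
verdict = skeptical(value)   # skeptical mode sends it to triage
```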
## Example B: Fixed with pedigree

Inputs:

* CycloneDX `analysis.state` = `resolved_with_pedigree` ([Ecma International][1])
* Evidence includes commit history/diff in the pedigree

Selection:

* `resolved_with_pedigree`
## Example C: Not affected due to mitigations

Inputs:

* CycloneDX `analysis.state` = `not_affected`, justification = `protected_at_runtime` ([Ecma International][1])
* Evidence: runtime mitigation proof (E2/E3)

Selection:

* `not_affected`, with the justification preserved
---
# 14) Implementation checklist

Minimum viable implementation modules:

1. **Format adapters**

   * CycloneDX/ECMA-424 VEX parser
   * OpenVEX parser
   * CSAF VEX parser

2. **Canonicalizer**

   * canonical JSON encoder
   * digest computation

3. **Verifier**

   * signature/DSSE verifier (pluggable)
   * principal resolver (keys/certs/OIDC claims)

4. **Normalizer**

   * maps external statements → atoms
   * maps justifications → conditions/evidence expectations

5. **Lattice store**

   * support sets per `(S,p)`
   * efficient indexing by artifact/component/vuln

6. **Policy evaluator**

   * thresholds, authority scopes, conflict modes

7. **Proof generator**

   * deterministic trace emission
   * optional DSSE signing step
---
## What you should build next (practical sequencing)

1. Implement the **canonical atom model** + K4 aggregation (fast, foundational).
2. Add CycloneDX/ECMA-424 ingestion first (richest semantics). ([Ecma International][1])
3. Add OpenVEX + CSAF ingestion with mapping to atoms. ([Docker Documentation][2])
4. Add trust label computation + policy bundle hashing.
5. Add proof bundles + DSSE signing support. ([GitHub][6])

If you want, I can follow this with:

* a concrete JSON schema for `Claim`, `Evidence`, `PolicyBundle`, and `ProofBundle`;
* an explicit mapping table from OpenVEX/CSAF justifications to ECMA-424 justifications/atoms;
* a reference evaluation pseudocode implementation (deterministic, testable).
[1]: https://ecma-international.org/wp-content/uploads/ECMA-424_1st_edition_june_2024.pdf "ECMA-424, 1st edition, June 2024"
[2]: https://docs.docker.com/scout/how-tos/create-exceptions-vex/?utm_source=chatgpt.com "Create an exception using the VEX"
[3]: https://github.com/openvex/spec/blob/main/OPENVEX-SPEC.md?utm_source=chatgpt.com "spec/OPENVEX-SPEC.md at main"
[4]: https://docs.oasis-open.org/csaf/csaf/v2.0/os/csaf-v2.0-os.html?utm_source=chatgpt.com "Common Security Advisory Framework Version 2.0 - Index of /"
[5]: https://docs.sigstore.dev/cosign/verifying/attestation/?utm_source=chatgpt.com "In-Toto Attestations"
[6]: https://github.com/secure-systems-lab/dsse?utm_source=chatgpt.com "DSSE: Dead Simple Signing Envelope"