feat: add stella-callgraph-node for JavaScript/TypeScript call graph extraction

- Implemented a new tool `stella-callgraph-node` that extracts call graphs from JavaScript/TypeScript projects using Babel AST.
- Added command-line interface with options for JSON output and help.
- Included functionality to analyze project structure, detect functions, and build call graphs.
- Created a package.json file for dependency management.

feat: introduce stella-callgraph-python for Python call graph extraction

- Developed `stella-callgraph-python` to extract call graphs from Python projects using AST analysis.
- Implemented command-line interface with options for JSON output and verbose logging.
- Added framework detection to identify popular web frameworks and their entry points.
- Created an AST analyzer to traverse Python code and extract function definitions and calls.
- Included requirements.txt for project dependencies.

chore: add framework detection for Python projects

- Implemented framework detection logic to identify frameworks like Flask, FastAPI, Django, and others based on project files and import patterns.
- Enhanced the AST analyzer to recognize entry points based on decorators and function definitions.
This commit is contained in:
master
2025-12-19 18:11:59 +02:00
parent 951a38d561
commit 8779e9226f
130 changed files with 19011 additions and 422 deletions

Below is a **feature → moat strength** map for Stella Ops, explicitly benchmarked against the tools we've been discussing (Trivy/Aqua, Grype/Syft, Anchore Enterprise, Snyk, Prisma Cloud). I'm using **“moat”** in the strict sense: *how hard is it for an incumbent to replicate the capability to parity, and how strong are the switching costs once deployed.*
### Moat scale
* **5 = Structural moat** (new primitives, strong defensibility, durable switching cost)
* **4 = Strong moat** (difficult multi-domain engineering; incumbents have only partial analogs)
* **3 = Moderate moat** (others can build; differentiation is execution + packaging)
* **2 = Weak moat** (table-stakes soon; limited defensibility)
* **1 = Commodity** (widely available in OSS / easy to replicate)
---
## 1) Stella Ops candidate features mapped to moat strength
| Stella Ops feature (precisely defined) | Closest competitor analogs (evidence) | Competitive parity today | Moat strength | Why this is (or isn't) defensible | How to harden the moat |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------: | ------------: | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Signed, replayable risk verdicts**: “this artifact is acceptable” decisions produced deterministically, with an evidence bundle + policy snapshot, signed as an attestation | Ecosystem can sign SBOM attestations (e.g., Syft + Sigstore; DSSE/in-toto via cosign), but not “risk verdict” decisions end-to-end ([Anchore][1]) | Low | **5** | This requires a **deterministic evaluation model**, a **proof/evidence schema**, and “knowledge snapshotting” so results are replayable months later. Incumbents mostly stop at exporting scan results or SBOMs, not signing a decision in a reproducible way. | Make the verdict format a **first-class artifact** (OCI-attached attestation), with strict replay semantics (“same inputs → same verdict”), plus auditor-friendly evidence extraction. |
| **VEX decisioning engine (not just ingestion)**: ingest OpenVEX/CycloneDX/CSAF, resolve conflicts with a trust/policy lattice, and produce explainable outcomes | Trivy supports multiple VEX formats (CycloneDX/OpenVEX/CSAF) but notes its “experimental/minimal functionality” ([Trivy][2]). Grype supports OpenVEX ingestion ([Chainguard][3]). Anchore can generate VEX docs from annotations (OpenVEX + CycloneDX) ([Anchore Docs][4]). Aqua runs VEX Hub for distributing VEX statements to Trivy ([Aqua][5]) | Medium (ingestion exists; decision logic is thin) | **4** | Ingestion alone is easy; the moat comes from **formal conflict resolution**, provenance-aware trust weighting, and deterministic outcomes. Most tools treat VEX as suppression/annotation, not a reasoning substrate. | Ship a **policy-controlled merge semantics** (“vendor > distro > internal” is too naive) + required evidence hooks (e.g., “not affected because feature flag off”). |
| **Reachability with proof**, tied to deployable artifacts: produce a defensible chain “entrypoint → call path → vulnerable symbol,” plus configuration gates | Snyk has reachability analysis in GA for certain languages/integrations and uses call-graph style reasoning to determine whether vulnerable code is called ([Snyk User Docs][6]). Some commercial vendors also market reachability (e.g., Endor Labs is listed in CycloneDX Tool Center as analyzing reachability) ([CycloneDX][7]) | Medium (reachability exists, but proof portability varies) | **4** | “Reachability” as a label is no longer unique. The moat is **portable proofs** (usable in audits and in air-gapped environments) + artifact-level mapping (not just source repo analysis) + deterministic replay. | Focus on **proof-carrying reachability**: store the reachability subgraph as evidence; make it reproducible and attestable; support both source and post-build artifacts. |
| **Smart-Diff (semantic risk delta)**: between releases, explain “what materially changed in exploitable surface,” not just “CVE count changed” | Anchore provides SBOM management and policy evaluation (good foundation), but “semantic risk diff” is not a prominent, standardized feature in typical scanners ([Anchore Docs][8]) | Low-Medium | **4** | Most incumbents can diff findings lists. Few can diff **reachability graphs, policy outcomes, and VEX state** to produce stable “delta narratives.” Hard to replicate without the underlying evidence model. | Treat diff as first-class: version SBOM graphs + reachability graphs + VEX claims; compute deltas over those graphs and emit a signed “delta verdict.” |
| **Unknowns as first-class state**: represent “unknown-reachable/unknown-unreachable” and force policies to account for uncertainty | Not a standard capability in common scanners/platforms; most systems output findings and (optionally) suppressions | Low | **4** | This is conceptually simple but operationally rare; it requires rethinking UX, scoring, and policy evaluation. It becomes sticky once orgs base governance on uncertainty budgets. | Bake unknowns into policies (“fail if unknowns > N in prod”), reporting, and attestations. Make it the default rather than optional. |
| **Air-gapped epistemic mode**: offline operation where the tool can prove what knowledge it used (feed snapshot + timestamps + trust anchors) | Prisma Cloud Compute Edition supports air-gapped environments and has an offline Intel Stream update mechanism ([Prisma Cloud Docs][9]). (But “prove exact knowledge state used for decisions” is typically not the emphasis.) | Medium | **4** | Air-gapped “runtime” is common; air-gapped **reproducibility** is not. The moat is packaging offline feeds + policies + deterministic scoring into a replayable bundle tied to attestations. | Deliver a “sealed knowledge snapshot” workflow (export/import), and make audits a one-command replay. |
| **SBOM ledger + lineage**: BYOS ingestion plus versioned SBOM storage, grouping, and historical tracking | Anchore explicitly positions centralized SBOM management and “Bring Your Own SBOM” ([Anchore Docs][8]). Snyk can generate SBOMs and expose SBOM via API in CycloneDX/SPDX formats ([Snyk User Docs][10]). Prisma can export CycloneDX SBOMs for scans ([Prisma Cloud Docs][11]) | High | **3** | SBOM generation/storage is quickly becoming table stakes. You can still differentiate on **graph fidelity + lineage semantics**, but “having SBOMs” alone won't be a moat. | Make the ledger valuable via **semantic diff, evidence joins (reachability/VEX), and provenance** rather than storage. |
| **Policy engine with proofs**: policy-as-code that produces a signed explanation (“why pass/fail”) and links to evidence nodes | Anchore has a mature policy model (policy JSON, gates, allowlists, mappings) ([Anchore Docs][12]). Prisma/Aqua have rich policy + runtime guardrails (platform-driven) ([Aqua][13]) | High | **3** | Policy engines are common. The moat is the **proof output** + deterministic replay + integration with attestations. | Keep policy language small but rigorous; always emit evidence pointers; support “policy compilation” to deterministic decision artifacts. |
| **VEX distribution network**: ecosystem layer that aggregates, validates, and serves VEX at scale | Aqua's VEX Hub is explicitly a centralized repository designed for discover/fetch/consume flows with Trivy ([Aqua][5]) | Medium | **3-4** | A network layer can become a moat if it achieves broad adoption. But incumbents can also launch hubs. This becomes defensible only with **network effects + trust frameworks**. | Differentiate with **verification + trust scoring** of VEX sources, plus tight coupling to deterministic decisioning and attestations. |
| **“Integrations everywhere”** (CI/CD, registry, Kubernetes, IDE) | Everyone in this space integrates broadly; reachability and scoring features often ride those integrations (e.g., Snyk reachability depends on repo/integration access) ([Snyk User Docs][6]) | High | **1-2** | Integrations are necessary, but not defensible—mostly engineering throughput. | Use integrations to *distribute attestations and proofs*, not as the headline differentiator. |
---
## 2) Where competitors already have strong moats (avoid head-on fights early)
These are areas where incumbents are structurally advantaged, so Stella Ops should either (a) integrate rather than replace, or (b) compete only if you have a much sharper wedge.
### Snyk's moat: developer adoption + reachability-informed prioritization
* Snyk publicly documents **reachability analysis** (GA for certain integrations/languages) ([Snyk User Docs][6])
* Snyk prioritization incorporates reachability and other signals into **Priority Score** ([Snyk User Docs][14])
**Implication:** pure “reachability” claims won't beat Snyk; **proof-carrying, artifact-tied, replayable reachability** can.
### Prisma Cloud's moat: CNAPP breadth + graph-based risk prioritization + air-gapped CWPP
* Prisma invests in graph-driven investigation/tracing of vulnerabilities ([Prisma Cloud Docs][15])
* Risk prioritization and risk-score ranked vulnerability views are core platform capabilities ([Prisma Cloud Docs][16])
* Compute Edition supports **air-gapped environments** and has offline update workflows ([Prisma Cloud Docs][9])
**Implication:** competing on “platform breadth” is a losing battle early; compete on **decision integrity** (deterministic, attestable, replayable) and integrate where needed.
### Anchore's moat: SBOM operations + policy-as-code maturity
* Anchore is explicitly SBOM-management centric and supports policy gating constructs ([Anchore Docs][8])
**Implication:** Anchore is strong at “SBOM at scale.” Stella Ops should outperform on **semantic diff, VEX reasoning, and proof outputs**, not just SBOM storage.
### Aqua's moat: code-to-runtime enforcement plus emerging VEX distribution
* Aqua provides CWPP-style runtime policy enforcement/guardrails ([Aqua][13])
* Aqua backs VEX Hub for VEX distribution and Trivy consumption ([Aqua][5])
**Implication:** if Stella Ops is not a runtime protection platform, don't chase CWPP breadth—use Aqua/Prisma integrations and focus on upstream decision quality.
---
## 3) Practical positioning: which features produce the most durable wedge
If you want the shortest path to a *defensible* position:
1. **Moat anchor (5): Signed, replayable risk verdicts**
* Everything else (VEX, reachability, diff) becomes evidence feeding that verdict.
2. **Moat amplifier (4): VEX decisioning + proof-carrying reachability**
* In 2025, VEX ingestion exists in Trivy/Grype/Anchore ([Trivy][2]), and reachability exists in Snyk ([Snyk User Docs][6]).
* Your differentiation must be: **determinism + portability + auditability**.
3. **Moat compounding (4): Smart-Diff over risk meaning**
* Turns “scan results” into an operational change-control primitive.
---
## 4) A concise “moat thesis” per feature (one-liners you can use internally)
* **Deterministic signed verdicts:** “We don't output findings; we output an attestable decision that can be replayed.”
* **VEX decisioning:** “We treat VEX as a logical claim system, not a suppression file.”
* **Reachability proofs:** “We provide proof of exploitability in *this* artifact, not just a badge.”
* **Smart-Diff:** “We explain what changed in exploitable surface area, not what changed in CVE count.”
* **Unknowns modeling:** “We quantify uncertainty and gate on it.”
---
If you want, I can convert the table into a **2×2 moat map** (Customer Value vs Defensibility) and a **build-order roadmap** that maximizes durable advantage while minimizing overlap with entrenched competitor moats.
[1]: https://anchore.com/sbom/creating-sbom-attestations-using-syft-and-sigstore/?utm_source=chatgpt.com "Creating SBOM Attestations Using Syft and Sigstore"
[2]: https://trivy.dev/docs/v0.50/supply-chain/vex/?utm_source=chatgpt.com "VEX"
[3]: https://www.chainguard.dev/unchained/vexed-then-grype-about-it-chainguard-and-anchore-announce-grype-supports-openvex?utm_source=chatgpt.com "VEXed? Then Grype about it"
[4]: https://docs.anchore.com/current/docs/vulnerability_management/vuln_annotations/?utm_source=chatgpt.com "Vulnerability Annotations and VEX"
[5]: https://www.aquasec.com/blog/introducing-vex-hub-unified-repository-for-vex-statements/?utm_source=chatgpt.com "Trivy VEX Hub: The Solution to Vulnerability Fatigue"
[6]: https://docs.snyk.io/manage-risk/prioritize-issues-for-fixing/reachability-analysis?utm_source=chatgpt.com "Reachability analysis"
[7]: https://cyclonedx.org/tool-center/?utm_source=chatgpt.com "CycloneDX Tool Center"
[8]: https://docs.anchore.com/current/docs/sbom_management/?utm_source=chatgpt.com "SBOM Management"
[9]: https://docs.prismacloud.io/en/compute-edition?utm_source=chatgpt.com "Prisma Cloud Compute Edition"
[10]: https://docs.snyk.io/developer-tools/snyk-cli/commands/sbom?utm_source=chatgpt.com "SBOM | Snyk User Docs"
[11]: https://docs.prismacloud.io/en/compute-edition/32/admin-guide/vulnerability-management/exporting-sboms?utm_source=chatgpt.com "Exporting Software Bill of Materials on CycloneDX"
[12]: https://docs.anchore.com/current/docs/overview/concepts/policy/policies/?utm_source=chatgpt.com "Policies and Evaluation"
[13]: https://www.aquasec.com/products/cwpp-cloud-workload-protection/?utm_source=chatgpt.com "Cloud workload protection in Runtime - Aqua Security"
[14]: https://docs.snyk.io/manage-risk/prioritize-issues-for-fixing?utm_source=chatgpt.com "Prioritize issues for fixing"
[15]: https://docs.prismacloud.io/en/enterprise-edition/content-collections/search-and-investigate/c2c-tracing-vulnerabilities/investigate-vulnerabilities-tracing?utm_source=chatgpt.com "Use Vulnerabilities Tracing on Investigate"
[16]: https://docs.prismacloud.io/en/enterprise-edition/use-cases/secure-the-infrastructure/risk-prioritization?utm_source=chatgpt.com "Risk Prioritization - Prisma Cloud Documentation"

## Trust Algebra and Lattice Engine Specification
This spec defines a deterministic “Trust Algebra / Lattice Engine” that ingests heterogeneous security assertions (SBOM, VEX, reachability, provenance attestations), normalizes them into a canonical claim model, merges them using lattice operations that preserve **unknowns and contradictions**, and produces a **signed, replayable verdict** with an auditable proof trail.
The design deliberately separates:
1. **Knowledge aggregation** (monotone, conflict-preserving, order-independent), from
2. **Decision selection** (policy-driven, trust-aware, environment-aware).
This prevents “heuristics creep” and makes the system explainable and reproducible.
---
# 1) Scope and objectives
### 1.1 What the engine must do
* Accept VEX from multiple standards (OpenVEX, CSAF VEX, CycloneDX/ECMA-424 VEX).
* Accept internally generated evidence (SBOM, reachability proofs, mitigations, patch/pedigree evidence).
* Merge claims while representing:
* **Unknown** (no evidence)
* **Conflict** (credible evidence for both sides)
* Compute an output disposition aligned to common VEX output states:
* CycloneDX impact-analysis states include: `resolved`, `resolved_with_pedigree`, `exploitable`, `in_triage`, `false_positive`, `not_affected`. ([Ecma International][1])
* Provide deterministic, signed, replayable results:
* Same inputs + same policy bundle ⇒ same outputs.
* Produce a proof object that can be independently verified offline.
### 1.2 Non-goals
* “One score to rule them all” without proofs.
* Probabilistic scoring as the primary decision mechanism.
* Trust by vendor branding instead of cryptographic/verifiable identity.
---
# 2) Standards surface (external inputs) and canonicalization targets
The engine should support at minimum these external statement types:
### 2.1 CycloneDX / ECMA-424 VEX (embedded)
CycloneDX's vulnerability “impact analysis” model defines:
* `analysis.state` values: `resolved`, `resolved_with_pedigree`, `exploitable`, `in_triage`, `false_positive`, `not_affected`. ([Ecma International][1])
* `analysis.justification` values: `code_not_present`, `code_not_reachable`, `requires_configuration`, `requires_dependency`, `requires_environment`, `protected_by_compiler`, `protected_at_runtime`, `protected_at_perimeter`, `protected_by_mitigating_control`. ([Ecma International][1])
This is the richest mainstream state model; we will treat it as the “maximal” target semantics.
### 2.2 OpenVEX
OpenVEX defines status labels:
* `not_affected`, `affected`, `fixed`, `under_investigation`. ([Docker Documentation][2])
For `not_affected`, OpenVEX requires supplying either a status justification or an `impact_statement`. ([GitHub][3])
### 2.3 CSAF VEX
CSAF VEX requires `product_status` containing at least one of:
* `fixed`, `known_affected`, `known_not_affected`, `under_investigation`. ([OASIS Documents][4])
### 2.4 Provenance / attestations
The engine should ingest signed attestations, particularly DSSE-wrapped in-toto statements (common in Sigstore/Cosign flows). Sigstore documentation states payloads are signed using the DSSE signing spec. ([Sigstore][5])
DSSE's design highlights include binding the payload **and its type** to prevent confusion attacks and avoiding canonicalization to reduce attack surface. ([GitHub][6])
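The type binding works through DSSE's pre-authentication encoding (PAE), which length-prefixes both the payload type and the payload before signing. A minimal sketch of PAE as defined in the DSSE v1 spec:

```python
def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE v1 pre-authentication encoding:
    "DSSEv1" SP LEN(type) SP type SP LEN(payload) SP payload,
    with lengths as ASCII decimal integers. Signing the PAE (rather than
    the raw payload) is what binds a payload to its declared type."""
    type_bytes = payload_type.encode("utf-8")
    return b" ".join([
        b"DSSEv1",
        str(len(type_bytes)).encode("ascii"),
        type_bytes,
        str(len(payload)).encode("ascii"),
        payload,
    ])
```

Because the lengths are part of the signed bytes, an attacker cannot shift bytes between the type and the body without invalidating the signature.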
---
# 3) Canonical internal model
## 3.1 Core identifiers
### Subject identity
A **Subject** is what we are making a security determination about.
Minimum viable Subject key:
* `artifact.digest` (e.g., OCI image digest, binary hash)
* `component.id` (prefer `purl`, else `cpe`, else `bom-ref`)
* `vuln.id` (CVE/OSV/etc.)
* `context.id` (optional but recommended; see below)
```
Subject := (ArtifactRef, ComponentRef, VulnerabilityRef, ContextRef?)
```
### Context identity (optional but recommended)
ContextRef allows environment-sensitive statements to remain valid and deterministic:
* build flags
* runtime config profile (e.g., feature gates)
* deployment mode (cluster policy)
* OS / libc family
* FIPS mode, SELinux/AppArmor posture, etc.
ContextRef must be hashable (canonical JSON → digest).
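A ContextRef digest can be computed by serializing the context to canonical JSON and hashing it. A sketch (field names are illustrative):

```python
import hashlib
import json

def context_digest(context: dict) -> str:
    """Digest a ContextRef deterministically: canonical JSON (sorted keys,
    minimal separators, UTF-8), then SHA-256 over the encoded bytes."""
    canonical = json.dumps(context, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because keys are sorted before encoding, two contexts that differ only in field order produce the same digest.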
---
## 3.2 Claims, evidence, attestations
### Claim
A **Claim** is a signed or unsigned assertion about a Subject.
Required fields:
* `claim.id`: content-addressable digest of canonical claim JSON
* `claim.subject`
* `claim.issuer`: principal identity
* `claim.time`: `issued_at`, `valid_from`, `valid_until` (optional)
* `claim.assertions[]`: list of atomic assertions (see §4)
* `claim.evidence_refs[]`: pointers to evidence objects
* `claim.signature`: optional DSSE / signature wrapper reference
### Evidence
Evidence is a typed object that supports replay and audit:
* `evidence.type`: e.g., `sbom_node`, `callgraph_path`, `loader_resolution`, `config_snapshot`, `patch_diff`, `pedigree_commit_chain`
* `evidence.digest`: hash of canonical bytes
* `evidence.producer`: tool identity and version
* `evidence.time`
* `evidence.payload_ref`: CAS pointer
* `evidence.signature_ref`: optional (attested evidence)
### Attestation wrapper
For signed payloads (claims or evidence bundles):
* Prefer DSSE envelopes for transport/type binding. ([GitHub][6])
* Prefer in-toto statement structure (subject + predicate + type).
---
# 4) The fact lattice: representing truth, unknowns, and conflicts
## 4.1 Why a lattice, not booleans
For vulnerability disposition you will routinely see:
* **no evidence** (unknown)
* **incomplete evidence** (triage)
* **contradictory evidence** (vendor says not affected; scanner says exploitable)
A boolean cannot represent these safely.
## 4.2 Four-valued fact lattice (Belnap-style)
For each atomic proposition `p`, the engine stores a value in:
```
K4 := { ⊥, T, F, ⊤ }
⊥ = unknown (no support)
T = supported true
F = supported false
⊤ = conflict (support for both true and false)
```
### Knowledge ordering (≤k)
* ⊥ ≤k T ≤k ⊤
* ⊥ ≤k F ≤k ⊤
* T and F incomparable
### Join operator (⊔k)
Join is “union of support” and is monotone:
* ⊥ ⊔k x = x
* T ⊔k F = ⊤
* ⊤ ⊔k x = ⊤
* T ⊔k T = T, F ⊔k F = F
This operator is order-independent; it provides deterministic aggregation even under parallel ingestion.
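The four values and the join can be written down directly. A minimal sketch (names are illustrative):

```python
from enum import Enum

class K4(Enum):
    BOT = "unknown"    # ⊥  no support
    T = "true"         # supported true
    F = "false"        # supported false
    TOP = "conflict"   # ⊤  support for both sides

def join(a: K4, b: K4) -> K4:
    """Knowledge join ⊔k: union of support. Commutative, associative, and
    idempotent, so the aggregate is independent of the order in which
    claims arrive."""
    if a == b:
        return a
    if a == K4.BOT:
        return b
    if b == K4.BOT:
        return a
    return K4.TOP      # T ⊔k F, or anything joined with ⊤
```

For example, `join(K4.T, K4.F)` yields `K4.TOP`: the conflict is preserved rather than resolved at aggregation time.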
---
# 5) Atomic propositions (canonical “security atoms”)
For each Subject `S`, the engine maintains K4 truth values for these propositions:
1. **PRESENT**: the component instance is present in the artifact/context.
2. **APPLIES**: vulnerability applies to that component (version/range/cpe match).
3. **REACHABLE**: vulnerable code is reachable in the given context.
4. **MITIGATED**: controls prevent exploitation (compiler/runtime/perimeter/controls).
5. **FIXED**: remediation has been applied to the artifact.
6. **MISATTRIBUTED**: the finding is a false association (false positive).
These atoms are intentionally orthogonal; external formats are normalized into them.
---
# 6) Trust algebra: principals, assurance, and authority
Trust is not a single number; it must represent:
* cryptographic verification
* identity assurance
* authority scope
* freshness/revocation
* evidence strength
We model trust as a label computed deterministically from policy + verification.
## 6.1 Principal
A principal is an issuer identity with verifiable keys:
* `principal.id` (URI-like)
* `principal.key_ids[]`
* `principal.identity_claims` (e.g., cert SANs, OIDC subject, org, repo)
* `principal.roles[]` (vendor, distro, internal-sec, build-system, scanner, auditor)
## 6.2 Trust label
A trust label is a tuple:
```
TrustLabel := (
assurance_level, // cryptographic + identity verification strength
authority_scope, // what subjects this principal is authoritative for
freshness_class, // time validity
evidence_class // strength/type of evidence attached
)
```
### Assurance levels (example)
Deterministic levels, increasing:
* A0: unsigned / unverifiable
* A1: signed, key known but weak identity binding
* A2: signed, verified identity (e.g., cert chain / keyless identity)
* A3: signed + provenance binding to artifact digest
* A4: signed + provenance + transparency log inclusion (if available)
Sigstore cosigns attestation verification references DSSE signing for payloads. ([Sigstore][5])
DSSE design includes payload-type binding and avoids canonicalization. ([GitHub][6])
### Authority scope
Authority is not purely cryptographic. It is policy-defined mapping between:
* principal identity and
* subject namespaces (vendors, products, package namespaces, internal artifacts)
Examples:
* Vendor principal is authoritative for `product.vendor == VendorX`.
* Distro principal authoritative for packages under their repos.
* Internal security principal authoritative for internal runtime reachability proofs.
### Evidence class
Evidence class is derived from evidence types:
* E0: statement-only (no supporting evidence refs)
* E1: SBOM linkage evidence (component present + version)
* E2: reachability/mitigation evidence (call paths, config snapshots)
* E3: remediation evidence (patch diffs, pedigree/commit chain)
CycloneDX/ECMA-424 explicitly distinguishes `resolved_with_pedigree` as remediation with verifiable commit history/diffs in pedigree. ([Ecma International][1])
## 6.3 Trust ordering and operators
Trust labels define a partial order ≤t (policy-defined). A simple implementation is component-wise ordering, but authority scope is set-based.
Core operators:
* **join (⊔t)**: combine independent supporting trust (often max-by-order)
* **meet (⊓t)**: compose along dependency chain (often min-by-order)
* **compose (⊗)**: trust of derived claim = min(trust of prerequisites) adjusted by method assurance
**Important:** Trust affects **decision selection**, not raw knowledge aggregation. Aggregation retains conflicts even if one side is low-trust.
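Under the component-wise interpretation above, the trust operators reduce to max/min on the ordered components plus set operations on authority scope. A sketch with assurance and evidence levels encoded as integers (freshness is omitted for brevity; field names are illustrative):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrustLabel:
    assurance: int        # A0..A4 encoded as 0..4
    evidence_class: int   # E0..E3 encoded as 0..3
    scope: frozenset      # authority scope: set of subject namespaces

def t_join(a: TrustLabel, b: TrustLabel) -> TrustLabel:
    """⊔t: combine independent supporting trust
    (component-wise max, scope union)."""
    return TrustLabel(max(a.assurance, b.assurance),
                      max(a.evidence_class, b.evidence_class),
                      a.scope | b.scope)

def t_meet(a: TrustLabel, b: TrustLabel) -> TrustLabel:
    """⊓t: compose along a dependency chain (component-wise min, scope
    intersection); a derived claim is only as trusted as its weakest
    prerequisite."""
    return TrustLabel(min(a.assurance, b.assurance),
                      min(a.evidence_class, b.evidence_class),
                      a.scope & b.scope)
```

So composing an A3 vendor statement with an A1 scanner result via ⊓t yields assurance A1, while two independent supports combine via ⊔t without lowering either side.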
---
# 7) Normalization: external VEX → canonical atoms
## 7.1 CycloneDX / ECMA-424 normalization
From `analysis.state` ([Ecma International][1])
* `resolved`
→ FIXED := T
* `resolved_with_pedigree`
→ FIXED := T and require pedigree/diff evidence (E3)
* `exploitable`
→ APPLIES := T, REACHABLE := T, MITIGATED := F (unless explicit mitigation evidence exists)
* `in_triage`
→ mark triage flag; leave atoms mostly ⊥ unless other fields present
* `false_positive`
→ MISATTRIBUTED := T
* `not_affected`
→ requires justification mapping (below)
From `analysis.justification` ([Ecma International][1])
Map into atoms as conditional facts (context-sensitive):
* `code_not_present` → PRESENT := F
* `code_not_reachable` → REACHABLE := F
* `requires_configuration` → REACHABLE := F *under current config snapshot*
* `requires_dependency` → REACHABLE := F *unless dependency present*
* `requires_environment` → REACHABLE := F *under current environment constraints*
* `protected_by_compiler` / `protected_at_runtime` / `protected_at_perimeter` / `protected_by_mitigating_control`
→ MITIGATED := T (with evidence refs expected)
## 7.2 OpenVEX normalization
OpenVEX statuses: `not_affected`, `affected`, `fixed`, `under_investigation`. ([Docker Documentation][2])
For `not_affected`, OpenVEX requires justification or an impact statement. ([GitHub][3])
Mapping:
* `fixed` → FIXED := T
* `affected` → APPLIES := T (conservative; leave REACHABLE := ⊥ unless present)
* `under_investigation` → triage flag
* `not_affected` → choose mapping based on provided justification / impact statement:
* component not present → PRESENT := F
* vulnerable code not reachable → REACHABLE := F
* mitigations already exist → MITIGATED := T
* otherwise → APPLIES := F only if explicitly asserted
## 7.3 CSAF VEX normalization
CSAF product_status includes `fixed`, `known_affected`, `known_not_affected`, `under_investigation`. ([OASIS Documents][4])
Mapping:
* `fixed` → FIXED := T
* `known_affected` → APPLIES := T
* `known_not_affected` → APPLIES := F unless stronger justification indicates PRESENT := F / REACHABLE := F / MITIGATED := T
* `under_investigation` → triage flag
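These mappings are table-driven. A sketch of the OpenVEX branch (justification labels follow the OpenVEX spec; the atom encoding is illustrative and omits the context conditions and evidence expectations the real normalizer would carry):

```python
def normalize_openvex(status: str, justification: str = "") -> dict:
    """Map an OpenVEX status (plus a not_affected justification) to atom
    assertions, e.g. {"FIXED": "T"}. Atoms absent from the result stay ⊥."""
    if status == "fixed":
        return {"FIXED": "T"}
    if status == "affected":
        return {"APPLIES": "T"}   # conservative: REACHABLE stays ⊥
    if status == "under_investigation":
        return {}                 # triage flag only; no atom support
    if status == "not_affected":
        return {
            "component_not_present": {"PRESENT": "F"},
            "vulnerable_code_not_present": {"PRESENT": "F"},
            "vulnerable_code_not_in_execute_path": {"REACHABLE": "F"},
            "inline_mitigations_already_exist": {"MITIGATED": "T"},
        }.get(justification, {"APPLIES": "F"})  # explicit assertion fallback
    raise ValueError(f"unknown OpenVEX status: {status}")
```

The CycloneDX and CSAF adapters follow the same shape, each with its own status table.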
---
# 8) Lattice engine: aggregation algorithm
Aggregation is pure, monotone, and order-independent.
## 8.1 Support sets
For each Subject `S` and atom `p`, maintain:
* `SupportTrue[S,p]` = set of claim IDs supporting p=true
* `SupportFalse[S,p]` = set of claim IDs supporting p=false
Optionally store per-support:
* trust label
* evidence digests
* timestamps
## 8.2 Compute K4 value
For each `(S,p)`:
* if both support sets empty → ⊥
* if only true non-empty → T
* if only false non-empty → F
* if both non-empty → ⊤
## 8.3 Track trust on each side
Maintain:
* `TrustTrue[S,p]` = max trust label among SupportTrue
* `TrustFalse[S,p]` = max trust label among SupportFalse
This enables policy selection without losing conflict information.
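A minimal sketch of the support-set store with the K4 value derived on read (§8.1-8.2; identifiers are illustrative):

```python
from collections import defaultdict

class LatticeStore:
    """Support sets per (Subject, atom); the K4 value is derived, never
    stored, so aggregation stays monotone and order-independent."""

    def __init__(self):
        self.support_true = defaultdict(set)    # (subject, atom) -> claim IDs
        self.support_false = defaultdict(set)

    def add(self, subject, atom, value: bool, claim_id: str):
        # Aggregation only records support; it never discards either side.
        side = self.support_true if value else self.support_false
        side[(subject, atom)].add(claim_id)

    def k4(self, subject, atom) -> str:
        t = bool(self.support_true.get((subject, atom)))
        f = bool(self.support_false.get((subject, atom)))
        return {(False, False): "⊥", (True, False): "T",
                (False, True): "F", (True, True): "⊤"}[(t, f)]
```

Adding a vendor claim for `REACHABLE = false` and a scanner claim for `REACHABLE = true` leaves both claim IDs in the store and yields ⊤ on read.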
---
# 9) Decision selection: from atoms → disposition
Decision selection is where “trust algebra” actually participates. It is **policy-driven** and can differ by environment (prod vs dev, regulated vs non-regulated).
## 9.1 Output disposition space
The engine should be able to emit a CycloneDX-compatible disposition (ECMA-424): ([Ecma International][1])
* `resolved_with_pedigree`
* `resolved`
* `false_positive`
* `not_affected`
* `exploitable`
* `in_triage`
## 9.2 Deterministic selection rules (baseline)
Define `D(S)`:
1. If `FIXED == T` and pedigree evidence meets threshold → `resolved_with_pedigree`
2. Else if `FIXED == T` → `resolved`
3. Else if `MISATTRIBUTED == T` and trust≥threshold → `false_positive`
4. Else if `APPLIES == F` or `PRESENT == F` → `not_affected`
5. Else if `REACHABLE == F` or `MITIGATED == T` → `not_affected` (with justification)
6. Else if `REACHABLE == T` and `MITIGATED != T` → `exploitable`
7. Else → `in_triage`
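The baseline rules translate directly into an ordered guard chain. A sketch, with the policy thresholds (pedigree sufficiency, misattribution trust) assumed precomputed as booleans by the policy layer (names are illustrative):

```python
def decide(atoms: dict, pedigree_ok: bool = False,
           misattr_trusted: bool = False) -> str:
    """Baseline deterministic disposition selection (§9.2).
    `atoms` maps atom name -> "⊥" | "T" | "F" | "⊤"; missing atoms are ⊥."""
    g = lambda p: atoms.get(p, "⊥")
    if g("FIXED") == "T" and pedigree_ok:
        return "resolved_with_pedigree"
    if g("FIXED") == "T":
        return "resolved"
    if g("MISATTRIBUTED") == "T" and misattr_trusted:
        return "false_positive"
    if g("APPLIES") == "F" or g("PRESENT") == "F":
        return "not_affected"
    if g("REACHABLE") == "F" or g("MITIGATED") == "T":
        return "not_affected"
    if g("REACHABLE") == "T" and g("MITIGATED") != "T":
        return "exploitable"
    return "in_triage"  # unknowns and conflicts fall through (skeptical stance)
```

Note that a conflicted atom (⊤) matches neither the `== "T"` nor the `== "F"` guards, so under this baseline conflicts fall through to `in_triage` unless an earlier rule fires.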
## 9.3 Conflict-handling modes (policy selectable)
When any required atom is ⊤ (conflict) or ⊥ (unknown), policy chooses a stance:
* **Skeptical (default for production gating):**
* conflict/unknown biases toward `in_triage` or `exploitable` depending on risk tolerance
* **Authority-weighted:**
* if high-authority vendor statement conflicts with low-trust scanner output, accept vendor but record conflict in proof
* **Quorum-based:**
* accept `not_affected` only if:
* (vendor trust≥A3) OR
* (internal reachability proof trust≥A3) OR
* (two independent principals ≥A2 agree)
Otherwise remain `in_triage`.
This is where “trust algebra” expresses **institutional policy** without destroying underlying knowledge.
---
# 10) Proof object: verifiable explainability
Every verdict emits a **Proof Bundle** that can be verified offline.
## 10.1 Proof bundle contents
* `subject` (canonical form)
* `inputs`:
* list of claim IDs + digests
* list of evidence digests
* policy bundle digest
* vulnerability feed snapshot digest (if applicable)
* `normalization`:
* mappings applied (e.g., OpenVEX status→atoms)
* `atom_table`:
* each atom p: K4 value, support sets, trust per side
* `decision_trace`:
* rule IDs fired
* thresholds used
* `output`:
* disposition + justification + confidence metadata
## 10.2 Signing
The proof bundle is itself a payload suitable for signing in DSSE, enabling attested verdicts. DSSEs type binding is important so a proof bundle cannot be reinterpreted as a different payload class. ([GitHub][6])
---
# 11) Policy bundle specification (Trust + Decision DSL)
A policy bundle is a hashable document. Example structure (YAML-like; illustrative):
```yaml
policy_id: "org.prod.default.v1"
trust_roots:
  - principal: "did:web:vendor.example"
    min_assurance: A2
    authority:
      products: ["vendor.example/*"]
  - principal: "did:web:sec.internal"
    min_assurance: A2
    authority:
      artifacts: ["sha256:*"]  # internal is authoritative for internal artifacts
acceptance_thresholds:
  resolved_with_pedigree:
    min_evidence_class: E3
    min_assurance: A3
  not_affected:
    mode: quorum
    quorum:
      - any:
          - { principal_role: vendor, min_assurance: A3 }
          - { evidence_type: callgraph_path, min_assurance: A3 }
          - all:
              - { distinct_principals: 2 }
              - { min_assurance_each: A2 }
conflict_mode:
  production: skeptical
  development: authority_weighted
```
The engine must treat the policy bundle as an **input artifact** (hashed, stored, referenced in proofs).
---
# 12) Determinism requirements
To guarantee deterministic replay:
1. **Canonical JSON** for all stored objects (claims, evidence, policy bundles).
2. **Content-addressing**:
* `id = sha256(canonical_bytes)`
3. **Stable sorting**:
* when iterating claims/evidence, sort by `(type, id)` to prevent nondeterministic traversal
4. **Time handling**:
* evaluation time is explicit input (e.g., `as_of` timestamp)
* expired claims are excluded deterministically
5. **Version pinning**:
* tool identity + version recorded in evidence
* vuln feed snapshot digests recorded
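Points 3 and 4 can be sketched together: claims are filtered against an explicit `as_of` evaluation time, then sorted by `(type, id)` before traversal (claim fields are illustrative):

```python
def active_claims(claims: list, as_of) -> list:
    """Deterministically select claims for evaluation at an explicit time,
    then sort by (type, id) so traversal order is stable across runs.
    Claims without validity bounds are treated as always valid."""
    live = [
        c for c in claims
        if c.get("valid_from", as_of) <= as_of <= c.get("valid_until", as_of)
    ]
    return sorted(live, key=lambda c: (c["type"], c["id"]))
```

Because both the filter (explicit `as_of`, no wall-clock reads) and the sort key are pure functions of the inputs, a replay with the same claim set and timestamp traverses identical claims in identical order.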
---
# 13) Worked examples
## Example A: Vendor says not affected; scanner says exploitable
Inputs:
* OpenVEX: `not_affected` with justification (required by spec) ([GitHub][3])
* Internal scanner: flags exploitable
Aggregation:
* REACHABLE: ⊤ (conflict)
Selection (production skeptical):
* verdict: `in_triage`
Selection (authority-weighted, vendor authoritative):
* verdict: `not_affected`
Proof bundle records conflict and why policy accepted vendor.
## Example B: Fixed with pedigree
Inputs:
* CycloneDX analysis.state = `resolved_with_pedigree` ([Ecma International][1])
* Evidence includes commit history/diff in pedigree
Selection:
* `resolved_with_pedigree`
## Example C: Not affected due to mitigations
Inputs:
* CycloneDX analysis.state=`not_affected`, justification=`protected_at_runtime` ([Ecma International][1])
* Evidence: runtime mitigation proof (E2/E3)
Selection:
* `not_affected` with justification preserved
---
# 14) Implementation checklist
Minimum viable implementation modules:
1. **Format adapters**
* CycloneDX/ECMA-424 VEX parser
* OpenVEX parser
* CSAF VEX parser
2. **Canonicalizer**
* canonical JSON encoder
* digest computation
3. **Verifier**
* signature/DSSE verifier (pluggable)
* principal resolver (keys/certs/OIDC claims)
4. **Normalizer**
* maps external statements → atoms
* maps justifications → conditions/evidence expectations
5. **Lattice store**
* support sets per (S,p)
* efficient indexing by artifact/component/vuln
6. **Policy evaluator**
* thresholds, authority scopes, conflict modes
7. **Proof generator**
* deterministic trace emission
* optional DSSE signing step
---
## What you should build next (practical sequencing)
1. Implement the **canonical atom model** + K4 aggregation (fast, foundational).
2. Add CycloneDX/ECMA-424 ingestion first (richest semantics). ([Ecma International][1])
3. Add OpenVEX + CSAF ingestion with mapping to atoms. ([Docker Documentation][2])
4. Add trust label computation + policy bundle hashing.
5. Add proof bundles + DSSE signing support. ([GitHub][6])
If you want, I can follow this with:
* a concrete JSON schema for `Claim`, `Evidence`, `PolicyBundle`, and `ProofBundle`;
* an explicit mapping table from OpenVEX/CSAF justifications to ECMA-424 justifications/atoms;
* a reference evaluation pseudocode implementation (deterministic, testable).
[1]: https://ecma-international.org/wp-content/uploads/ECMA-424_1st_edition_june_2024.pdf "ECMA-424, 1st edition, June 2024"
[2]: https://docs.docker.com/scout/how-tos/create-exceptions-vex/?utm_source=chatgpt.com "Create an exception using the VEX"
[3]: https://github.com/openvex/spec/blob/main/OPENVEX-SPEC.md?utm_source=chatgpt.com "spec/OPENVEX-SPEC.md at main"
[4]: https://docs.oasis-open.org/csaf/csaf/v2.0/os/csaf-v2.0-os.html?utm_source=chatgpt.com "Common Security Advisory Framework Version 2.0 - Index of /"
[5]: https://docs.sigstore.dev/cosign/verifying/attestation/?utm_source=chatgpt.com "In-Toto Attestations"
[6]: https://github.com/secure-systems-lab/dsse?utm_source=chatgpt.com "DSSE: Dead Simple Signing Envelope"