2025-12-19 18:11:59 +02:00

Trust Algebra and Lattice Engine Specification

This spec defines a deterministic “Trust Algebra / Lattice Engine” that ingests heterogeneous security assertions (SBOM, VEX, reachability, provenance attestations), normalizes them into a canonical claim model, merges them using lattice operations that preserve unknowns and contradictions, and produces a signed, replayable verdict with an auditable proof trail.

The design deliberately separates:

Knowledge aggregation (monotone, conflict-preserving, order-independent), from
Decision selection (policy-driven, trust-aware, environment-aware).

This prevents “heuristics creep” and makes the system explainable and reproducible.

1) Scope and objectives

1.1 What the engine must do

Accept VEX from multiple standards (OpenVEX, CSAF VEX, CycloneDX/ECMA-424 VEX).
Accept internally generated evidence (SBOM, reachability proofs, mitigations, patch/pedigree evidence).
Merge claims while representing:
- Unknown (no evidence)
- Conflict (credible evidence for both sides)
Compute an output disposition aligned to common VEX output states:
- CycloneDX impact-analysis states include: resolved, resolved_with_pedigree, exploitable, in_triage, false_positive, not_affected. (Ecma International)
Provide deterministic, signed, replayable results:
- Same inputs + same policy bundle ⇒ same outputs.
Produce a proof object that can be independently verified offline.

1.2 Non-goals

“One score to rule them all” without proofs.
Probabilistic scoring as the primary decision mechanism.
Trust by vendor branding instead of cryptographic/verifiable identity.

2) Standards surface (external inputs) and canonicalization targets

The engine should support at minimum these external statement types:

2.1 CycloneDX / ECMA-424 VEX (embedded)

CycloneDX’s vulnerability “impact analysis” model defines:

analysis.state values: resolved, resolved_with_pedigree, exploitable, in_triage, false_positive, not_affected. (Ecma International)
analysis.justification values: code_not_present, code_not_reachable, requires_configuration, requires_dependency, requires_environment, protected_by_compiler, protected_at_runtime, protected_at_perimeter, protected_by_mitigating_control. (Ecma International)

This is the richest mainstream state model; we will treat it as the “maximal” target semantics.

2.2 OpenVEX

OpenVEX defines status labels:

not_affected, affected, fixed, under_investigation. (Docker Documentation)

For not_affected, OpenVEX requires supplying either a status justification or an impact_statement. (GitHub)

2.3 CSAF VEX

CSAF VEX requires product_status containing at least one of:

fixed, known_affected, known_not_affected, under_investigation. (OASIS Documents)

2.4 Provenance / attestations

The engine should ingest signed attestations, particularly DSSE-wrapped in-toto statements (common in Sigstore/Cosign flows). Sigstore documentation states payloads are signed using the DSSE signing spec. (Sigstore) DSSE’s design highlights include binding the payload and its type to prevent confusion attacks and avoiding canonicalization to reduce attack surface. (GitHub)

3) Canonical internal model

3.1 Core identifiers

Subject identity

A Subject is what we are making a security determination about.

Minimum viable Subject key:

artifact.digest (e.g., OCI image digest, binary hash)
component.id (prefer purl, else cpe, else bom-ref)
vuln.id (CVE/OSV/etc.)
context.id (optional but recommended; see below)

Subject := (ArtifactRef, ComponentRef, VulnerabilityRef, ContextRef?)

Context identity (optional but recommended)

ContextRef allows environment-sensitive statements to remain valid and deterministic:

build flags
runtime config profile (e.g., feature gates)
deployment mode (cluster policy)
OS / libc family
FIPS mode, SELinux/AppArmor posture, etc.

ContextRef must be hashable (canonical JSON → digest).
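
A minimal sketch of the "canonical JSON → digest" step, assuming canonical form means sorted keys, compact separators, and UTF-8 output. The spec does not fix a canonicalization scheme (RFC 8785 JCS would be a stricter choice), and the context field names below are hypothetical.

```python
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    # Assumed canonical form: sorted keys, no insignificant whitespace, UTF-8.
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def context_id(context: dict) -> str:
    # Content-addressed identifier for a ContextRef.
    return "sha256:" + hashlib.sha256(canonical_bytes(context)).hexdigest()


# Two spellings of the same context (hypothetical fields) in different key order.
a = {"os": "linux", "libc": "glibc", "fips": True, "build_flags": ["-O2"]}
b = {"build_flags": ["-O2"], "fips": True, "libc": "glibc", "os": "linux"}
assert context_id(a) == context_id(b)
```

Key order does not affect the digest, but list order does, so list-valued fields such as build flags would need a defined ordering before hashing.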

3.2 Claims, evidence, attestations

Claim

A Claim is a signed or unsigned assertion about a Subject.

Required fields:

claim.id: content-addressable digest of canonical claim JSON
claim.subject
claim.issuer: principal identity
claim.time: issued_at, valid_from, valid_until (optional)
claim.assertions[]: list of atomic assertions (see §4)
claim.evidence_refs[]: pointers to evidence objects
claim.signature: optional DSSE / signature wrapper reference

Evidence

Evidence is a typed object that supports replay and audit:

evidence.type: e.g., sbom_node, callgraph_path, loader_resolution, config_snapshot, patch_diff, pedigree_commit_chain
evidence.digest: hash of canonical bytes
evidence.producer: tool identity and version
evidence.time
evidence.payload_ref: CAS pointer
evidence.signature_ref: optional (attested evidence)

Attestation wrapper

For signed payloads (claims or evidence bundles):

Prefer DSSE envelopes for transport/type binding. (GitHub)
Prefer in-toto statement structure (subject + predicate + type).

4) The fact lattice: representing truth, unknowns, and conflicts

4.1 Why a lattice, not booleans

For vulnerability disposition you will routinely see:

no evidence (unknown)
incomplete evidence (triage)
contradictory evidence (vendor says not affected; scanner says exploitable)

A boolean cannot represent these safely.

4.2 Four-valued fact lattice (Belnap-style)

For each atomic proposition p, the engine stores a value in:

K4 := { ⊥, T, F, ⊤ }

⊥  = unknown (no support)
T  = supported true
F  = supported false
⊤  = conflict (support for both true and false)

Knowledge ordering (≤k)

⊥ ≤k T ≤k ⊤
⊥ ≤k F ≤k ⊤
T and F incomparable

Join operator (⊔k)

Join is “union of support” and is monotone:

⊥ ⊔k x = x
T ⊔k F = ⊤
⊤ ⊔k x = ⊤
T ⊔k T = T, F ⊔k F = F

This operator is commutative, associative, and idempotent, so the aggregate does not depend on ingestion order or on duplicate deliveries; aggregation stays deterministic even under parallel ingestion.
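
The join table can be sketched by encoding each K4 value as a pair of support flags, so that join becomes a component-wise OR. The encoding and the names `K4`, `join`, `leq`, and `aggregate` are illustrative choices, not part of the spec.

```python
from enum import Enum
from functools import reduce
from itertools import permutations


class K4(Enum):
    # Each value is encoded as (has_true_support, has_false_support).
    UNKNOWN = (False, False)   # ⊥
    TRUE = (True, False)       # T
    FALSE = (False, True)      # F
    CONFLICT = (True, True)    # ⊤

    def join(self, other: "K4") -> "K4":
        # Union of support: OR the flags component-wise.
        return K4((self.value[0] or other.value[0],
                   self.value[1] or other.value[1]))

    def leq(self, other: "K4") -> bool:
        # Knowledge ordering: x ≤k y iff joining x into y adds nothing.
        return self.join(other) is other


def aggregate(values) -> K4:
    # Fold any number of values, starting from ⊥ (the identity of join).
    return reduce(K4.join, values, K4.UNKNOWN)


# Every ingestion order of the same three values yields the same result.
results = {aggregate(p) for p in permutations([K4.TRUE, K4.UNKNOWN, K4.FALSE])}
assert results == {K4.CONFLICT}
```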

5) Atomic propositions (canonical “security atoms”)

For each Subject S, the engine maintains K4 truth values for these propositions:

PRESENT: the component instance is present in the artifact/context.
APPLIES: vulnerability applies to that component (version/range/cpe match).
REACHABLE: vulnerable code is reachable in the given context.
MITIGATED: controls prevent exploitation (compiler/runtime/perimeter/controls).
FIXED: remediation has been applied to the artifact.
MISATTRIBUTED: the finding is a false association (false positive).

These atoms are intentionally orthogonal; external formats are normalized into them.

6) Trust algebra: principals, assurance, and authority

Trust is not a single number; it must represent:

cryptographic verification
identity assurance
authority scope
freshness/revocation
evidence strength

We model trust as a label computed deterministically from policy + verification.

6.1 Principal

A principal is an issuer identity with verifiable keys:

principal.id (URI-like)
principal.key_ids[]
principal.identity_claims (e.g., cert SANs, OIDC subject, org, repo)
principal.roles[] (vendor, distro, internal-sec, build-system, scanner, auditor)

6.2 Trust label

A trust label is a tuple:

TrustLabel := (
  assurance_level,    // cryptographic + identity verification strength
  authority_scope,    // what subjects this principal is authoritative for
  freshness_class,    // time validity
  evidence_class      // strength/type of evidence attached
)

Assurance levels (example)

Deterministic levels, increasing:

A0: unsigned / unverifiable
A1: signed, key known but weak identity binding
A2: signed, verified identity (e.g., cert chain / keyless identity)
A3: signed + provenance binding to artifact digest
A4: signed + provenance + transparency log inclusion (if available)

Sigstore cosign’s attestation verification references DSSE signing for payloads. (Sigstore) DSSE design includes payload-type binding and avoids canonicalization. (GitHub)

Authority scope

Authority is not purely cryptographic. It is policy-defined mapping between:

principal identity and
subject namespaces (vendors, products, package namespaces, internal artifacts)

Examples:

Vendor principal is authoritative for product.vendor == VendorX.
Distro principal authoritative for packages under their repos.
Internal security principal authoritative for internal runtime reachability proofs.

Evidence class

Evidence class is derived from evidence types:

E0: statement-only (no supporting evidence refs)
E1: SBOM linkage evidence (component present + version)
E2: reachability/mitigation evidence (call paths, config snapshots)
E3: remediation evidence (patch diffs, pedigree/commit chain)

CycloneDX/ECMA-424 explicitly distinguishes resolved_with_pedigree as remediation with verifiable commit history/diffs in pedigree. (Ecma International)

6.3 Trust ordering and operators

Trust labels define a partial order ≤t (policy-defined). A simple implementation orders labels component-wise, comparing the ranked components (assurance, freshness, evidence) by level and authority scope by set inclusion.

Core operators:

join (⊔t): combine independent supporting trust (often max-by-order)
meet (⊓t): compose along dependency chain (often min-by-order)
compose (⊗): trust of derived claim = min(trust of prerequisites) adjusted by method assurance

Important: Trust affects decision selection, not raw knowledge aggregation. Aggregation retains conflicts even if one side is low-trust.
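
One way to realize a component-wise ≤t is sketched below. It assumes assurance and evidence classes are totally ordered and that authority scope is a set of namespace strings compared by inclusion. Freshness is left out because the spec does not enumerate its classes, and the compose operator (⊗) is not modelled.

```python
from dataclasses import dataclass
from enum import IntEnum

Assurance = IntEnum("Assurance", "A0 A1 A2 A3 A4", start=0)
Evidence = IntEnum("Evidence", "E0 E1 E2 E3", start=0)


@dataclass(frozen=True)
class TrustLabel:
    assurance: Assurance
    scope: frozenset      # subject namespaces the principal is authoritative for
    evidence: Evidence

    def leq(self, other: "TrustLabel") -> bool:
        # Component-wise order; scope is compared by set inclusion.
        return (self.assurance <= other.assurance
                and self.scope <= other.scope
                and self.evidence <= other.evidence)

    def join(self, other: "TrustLabel") -> "TrustLabel":
        # Combine independent support: max per component, union of scopes.
        return TrustLabel(max(self.assurance, other.assurance),
                          self.scope | other.scope,
                          max(self.evidence, other.evidence))

    def meet(self, other: "TrustLabel") -> "TrustLabel":
        # Compose along a dependency chain: min per component, shared scope.
        return TrustLabel(min(self.assurance, other.assurance),
                          self.scope & other.scope,
                          min(self.evidence, other.evidence))


# Hypothetical labels: a well-identified vendor statement vs. a scanner proof.
vendor = TrustLabel(Assurance.A3, frozenset({"vendor.example/*"}), Evidence.E0)
scanner = TrustLabel(Assurance.A1, frozenset({"internal"}), Evidence.E2)
assert not vendor.leq(scanner) and not scanner.leq(vendor)  # incomparable
```

The two sample labels are incomparable, which is why the order is partial and why a policy still has to say how such cases are resolved.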

7) Normalization: external VEX → canonical atoms

7.1 CycloneDX / ECMA-424 normalization

From analysis.state (Ecma International)

resolved → FIXED := T
resolved_with_pedigree → FIXED := T and require pedigree/diff evidence (E3)
exploitable → APPLIES := T, REACHABLE := T, MITIGATED := F (unless explicit mitigation evidence exists)
in_triage → mark triage flag; leave atoms at ⊥ unless other fields of the statement assert something
false_positive → MISATTRIBUTED := T
not_affected → requires justification mapping (below)

From analysis.justification (Ecma International)

Map into atoms as conditional facts (context-sensitive):

code_not_present → PRESENT := F
code_not_reachable → REACHABLE := F
requires_configuration → REACHABLE := F under current config snapshot
requires_dependency → REACHABLE := F unless dependency present
requires_environment → REACHABLE := F under current environment constraints
protected_by_compiler / protected_at_runtime / protected_at_perimeter / protected_by_mitigating_control → MITIGATED := T (with evidence refs expected)
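
The two mapping tables above can be sketched as plain dictionaries. This is a simplified illustration: the context conditions ("under current config snapshot", "unless dependency present") are not modelled, and the function name and return shape are hypothetical. A not_affected statement without a justification is assumed to leave every atom unknown.

```python
STATE_ATOMS = {
    "resolved": {"FIXED": True},
    "resolved_with_pedigree": {"FIXED": True},
    "exploitable": {"APPLIES": True, "REACHABLE": True, "MITIGATED": False},
    "in_triage": {},
    "false_positive": {"MISATTRIBUTED": True},
    "not_affected": {},  # atoms come from the justification
}

JUSTIFICATION_ATOMS = {
    "code_not_present": {"PRESENT": False},
    "code_not_reachable": {"REACHABLE": False},
    "requires_configuration": {"REACHABLE": False},
    "requires_dependency": {"REACHABLE": False},
    "requires_environment": {"REACHABLE": False},
    "protected_by_compiler": {"MITIGATED": True},
    "protected_at_runtime": {"MITIGATED": True},
    "protected_at_perimeter": {"MITIGATED": True},
    "protected_by_mitigating_control": {"MITIGATED": True},
}


def normalize_cyclonedx(state, justification=None):
    """Map analysis.state / analysis.justification to atom assertions."""
    if state not in STATE_ATOMS:
        raise ValueError(f"unknown analysis.state: {state!r}")
    atoms = dict(STATE_ATOMS[state])
    if state == "not_affected" and justification is not None:
        if justification not in JUSTIFICATION_ATOMS:
            raise ValueError(f"unknown analysis.justification: {justification!r}")
        atoms.update(JUSTIFICATION_ATOMS[justification])
    return {
        "atoms": atoms,                      # atom name -> asserted polarity
        "triage": state == "in_triage",      # triage flag, no atoms asserted
        "min_evidence_class": "E3" if state == "resolved_with_pedigree" else None,
    }
```

Each `True`/`False` entry becomes one addition to the corresponding support set; nothing is written for atoms the statement does not mention.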

7.2 OpenVEX normalization

OpenVEX statuses: not_affected, affected, fixed, under_investigation. (Docker Documentation) For not_affected, OpenVEX requires justification or an impact statement. (GitHub)

Mapping:

fixed → FIXED := T
affected → APPLIES := T (conservative; leave REACHABLE := ⊥ unless the statement carries reachability evidence)
under_investigation → triage flag
not_affected → choose mapping based on provided justification / impact statement:
- component not present → PRESENT := F
- vulnerable code not reachable → REACHABLE := F
- mitigations already exist → MITIGATED := T
- otherwise → APPLIES := F only if explicitly asserted
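
A sketch of the OpenVEX mapping, using three of the OpenVEX status-justification labels as keys. Pairing those labels with the three cases listed above is this sketch's reading of the list. Other justifications and free-text impact statements are assumed to assert no atoms, and the "APPLIES := F only if explicitly asserted" rule is not modelled. The function name and return shape are hypothetical.

```python
OPENVEX_JUSTIFICATION_ATOMS = {
    "component_not_present": {"PRESENT": False},
    "vulnerable_code_not_in_execute_path": {"REACHABLE": False},
    "inline_mitigations_already_exist": {"MITIGATED": True},
}


def normalize_openvex(status, justification=None, impact_statement=None):
    """Return (atoms, triage_flag) for one OpenVEX statement."""
    if status == "fixed":
        return {"FIXED": True}, False
    if status == "affected":
        # Conservative: says nothing about REACHABLE.
        return {"APPLIES": True}, False
    if status == "under_investigation":
        return {}, True
    if status == "not_affected":
        if not justification and not impact_statement:
            # OpenVEX requires one of the two for not_affected.
            raise ValueError("not_affected needs a justification or impact_statement")
        return dict(OPENVEX_JUSTIFICATION_ATOMS.get(justification, {})), False
    raise ValueError(f"unknown OpenVEX status: {status!r}")
```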

7.3 CSAF VEX normalization

CSAF product_status includes fixed, known_affected, known_not_affected, under_investigation. (OASIS Documents)

Mapping:

fixed → FIXED := T
known_affected → APPLIES := T
known_not_affected → APPLIES := F unless stronger justification indicates PRESENT := F / REACHABLE := F / MITIGATED := T
under_investigation → triage flag

8) Lattice engine: aggregation algorithm

Aggregation is pure, monotone, and order-independent.

8.1 Support sets

For each Subject S and atom p, maintain:

SupportTrue[S,p] = set of claim IDs supporting p=true
SupportFalse[S,p] = set of claim IDs supporting p=false

Optionally store per-support:

trust label
evidence digests
timestamps

8.2 Compute K4 value

For each (S,p):

if both support sets empty → ⊥
if only true non-empty → T
if only false non-empty → F
if both non-empty → ⊤

8.3 Track trust on each side

Maintain:

TrustTrue[S,p] = join (⊔t) of the trust labels in SupportTrue (the maximum, when the labels are totally ordered)
TrustFalse[S,p] = join (⊔t) of the trust labels in SupportFalse

This enables policy selection without losing conflict information.
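
Sections 8.1 to 8.3 can be sketched together as a small in-memory store. Trust is reduced here to an integer assurance level (0 to 4) purely for illustration; with full trust labels the `max` would be replaced by the policy's join. The class and method names are hypothetical.

```python
from collections import defaultdict


class LatticeStore:
    def __init__(self):
        # (subject, atom) -> {True: {claim_id: trust}, False: {claim_id: trust}}
        self._support = defaultdict(lambda: {True: {}, False: {}})

    def add(self, subject, atom, polarity, claim_id, trust):
        # Adding the same claim twice is a no-op, so ingestion is idempotent.
        self._support[(subject, atom)][polarity][claim_id] = trust

    def _sides(self, subject, atom):
        return self._support.get((subject, atom), {True: {}, False: {}})

    def value(self, subject, atom) -> str:
        s = self._sides(subject, atom)
        table = {(False, False): "⊥", (True, False): "T",
                 (False, True): "F", (True, True): "⊤"}
        return table[(bool(s[True]), bool(s[False]))]

    def best_trust(self, subject, atom, polarity):
        # Highest trust on one side; None when that side has no support.
        return max(self._sides(subject, atom)[polarity].values(), default=None)


store = LatticeStore()
subject = ("sha256:abc", "pkg:npm/example@1.0.0", "CVE-0000-0000")  # placeholder ids
store.add(subject, "REACHABLE", False, "claim-vendor", 3)
store.add(subject, "REACHABLE", True, "claim-scanner", 1)
assert store.value(subject, "REACHABLE") == "⊤"
```

The conflict is kept even though the two sides differ in trust; the per-side trust is only consulted later, during decision selection.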

9) Decision selection: from atoms → disposition

Decision selection is where “trust algebra” actually participates. It is policy-driven and can differ by environment (prod vs dev, regulated vs non-regulated).

9.1 Output disposition space

The engine should be able to emit a CycloneDX-compatible disposition (ECMA-424): (Ecma International)

resolved_with_pedigree
resolved
false_positive
not_affected
exploitable
in_triage

9.2 Deterministic selection rules (baseline)

Define D(S):

If FIXED == T and pedigree evidence meets threshold → resolved_with_pedigree
Else if FIXED == T → resolved
Else if MISATTRIBUTED == T and trust≥threshold → false_positive
Else if APPLIES == F or PRESENT == F → not_affected
Else if REACHABLE == F or MITIGATED == T → not_affected (with justification)
Else if REACHABLE == T and MITIGATED != T → exploitable
Else → in_triage
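
The baseline rules translate directly into a first-match function. In this sketch the atom table is a plain dict of K4 symbols, and the two threshold checks are passed in as booleans (`pedigree_ok`, `misattribution_trusted`), since how they are computed is a policy matter; those parameter names are hypothetical.

```python
T, F, UNKNOWN, CONFLICT = "T", "F", "⊥", "⊤"


def select_disposition(atoms, pedigree_ok=False, misattribution_trusted=False):
    """Apply the baseline rules in order; the first matching rule wins."""
    def v(name):
        # Atoms with no recorded support are unknown.
        return atoms.get(name, UNKNOWN)

    if v("FIXED") == T and pedigree_ok:
        return "resolved_with_pedigree"
    if v("FIXED") == T:
        return "resolved"
    if v("MISATTRIBUTED") == T and misattribution_trusted:
        return "false_positive"
    if v("APPLIES") == F or v("PRESENT") == F:
        return "not_affected"
    if v("REACHABLE") == F or v("MITIGATED") == T:
        return "not_affected"
    if v("REACHABLE") == T and v("MITIGATED") != T:
        return "exploitable"
    return "in_triage"


assert select_disposition({"APPLIES": T, "REACHABLE": CONFLICT}) == "in_triage"
```

Because every rule tests for exactly T or F, an atom in conflict (⊤) or unknown (⊥) never satisfies a rule on its own and falls through toward in_triage, which is where the conflict-handling modes take over.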

9.3 Conflict-handling modes (policy selectable)

When any required atom is ⊤ (conflict) or ⊥ (unknown), policy chooses a stance:

Skeptical (default for production gating):
- conflict/unknown biases toward in_triage or exploitable depending on risk tolerance
Authority-weighted:
- if high-authority vendor statement conflicts with low-trust scanner output, accept vendor but record conflict in proof
Quorum-based:
- accept not_affected only if:
  - (vendor trust≥A3) OR
  - (internal reachability proof trust≥A3) OR
  - (two independent principals ≥A2 agree)
- otherwise remain in_triage.
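
The quorum condition can be sketched as a predicate over the supports for a not_affected conclusion. Assurance is an integer 0 to 4 for A0 to A4, "internal reachability proof" is approximated by the `callgraph_path` evidence type, and "independent" is approximated by distinct principal IDs; all three are simplifying assumptions, and the `Support` type is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Support:
    principal: str           # issuer identity
    role: str                # e.g. "vendor", "internal-sec", "scanner"
    assurance: int           # 0..4, standing in for A0..A4
    evidence_type: str = ""  # e.g. "callgraph_path"


def quorum_accepts_not_affected(supports) -> bool:
    # Branch 1: a vendor statement at A3 or above.
    vendor_ok = any(s.role == "vendor" and s.assurance >= 3 for s in supports)
    # Branch 2: a reachability proof at A3 or above.
    proof_ok = any(s.evidence_type == "callgraph_path" and s.assurance >= 3
                   for s in supports)
    # Branch 3: at least two distinct principals, each at A2 or above.
    agreeing = {s.principal for s in supports if s.assurance >= 2}
    return vendor_ok or proof_ok or len(agreeing) >= 2


weak = [Support("did:web:scanner.example", "scanner", 1)]
assert not quorum_accepts_not_affected(weak)  # stays in_triage
```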

This is where “trust algebra” expresses institutional policy without destroying underlying knowledge.

10) Proof object: verifiable explainability

Every verdict emits a Proof Bundle that can be verified offline.

10.1 Proof bundle contents

subject (canonical form)
inputs:
- list of claim IDs + digests
- list of evidence digests
- policy bundle digest
- vulnerability feed snapshot digest (if applicable)
normalization:
- mappings applied (e.g., OpenVEX status→atoms)
atom_table:
- each atom p: K4 value, support sets, trust per side
decision_trace:
- rule IDs fired
- thresholds used
output:
- disposition + justification + confidence metadata
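
A sketch of assembling a proof bundle and checking it offline. The top-level keys follow the list above; the nested key names, the `build_proof_bundle` signature, and the canonical-JSON convention (sorted keys, compact separators) are assumptions made for illustration.

```python
import hashlib
import json


def digest(obj) -> str:
    # Assumed canonical form: sorted keys, compact separators, UTF-8.
    data = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(data.encode("utf-8")).hexdigest()


def build_proof_bundle(subject, claim_ids, evidence_digests, policy_digest,
                       atom_table, rules_fired, disposition):
    return {
        "subject": subject,
        "inputs": {
            # Sorted so that ingestion order never leaks into the bundle.
            "claims": sorted(claim_ids),
            "evidence": sorted(evidence_digests),
            "policy_bundle": policy_digest,
        },
        "atom_table": atom_table,
        "decision_trace": {"rules_fired": list(rules_fired)},
        "output": {"disposition": disposition},
    }


def verify_offline(bundle, expected_digest) -> bool:
    # A verifier recomputes the digest from the bundle bytes alone.
    return digest(bundle) == expected_digest


atoms = {"REACHABLE": {"value": "⊤", "true": ["c2"], "false": ["c1"]}}
p1 = build_proof_bundle("subj-1", ["c1", "c2"], ["e1"], "sha256:policy", atoms, ["R7"], "in_triage")
p2 = build_proof_bundle("subj-1", ["c2", "c1"], ["e1"], "sha256:policy", atoms, ["R7"], "in_triage")
assert digest(p1) == digest(p2)
```

A full verifier would also replay normalization and selection from the referenced inputs; the digest check only establishes that the bundle is the one that was signed.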

10.2 Signing

The proof bundle is itself a payload suitable for signing in DSSE, enabling attested verdicts. DSSE’s type binding is important so a proof bundle cannot be reinterpreted as a different payload class. (GitHub)
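
The type binding comes from DSSE's pre-authentication encoding (PAE), which is what actually gets signed. The sketch below implements PAE as defined in the DSSE spec; the HMAC is only a standard-library stand-in for a real signature scheme, and the payload type strings are hypothetical.

```python
import hashlib
import hmac


def pae(payload_type: str, payload: bytes) -> bytes:
    # DSSE PAE: "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body
    t = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(t), t, len(payload), payload)


def sign(key: bytes, payload_type: str, payload: bytes) -> bytes:
    # Stand-in for a real signer: MAC over the PAE, never over the raw payload.
    return hmac.new(key, pae(payload_type, payload), hashlib.sha256).digest()


def verify(key: bytes, payload_type: str, payload: bytes, sig: bytes) -> bool:
    return hmac.compare_digest(sign(key, payload_type, payload), sig)


key = b"demo-key"
proof = b'{"output":{"disposition":"in_triage"}}'
PROOF_TYPE = "application/vnd.example.proof-bundle+json"   # hypothetical type
sig = sign(key, PROOF_TYPE, proof)

assert verify(key, PROOF_TYPE, proof, sig)
# The same bytes presented under another payload type fail verification.
assert not verify(key, "application/vnd.example.claim+json", proof, sig)
```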

11) Policy bundle specification (Trust + Decision DSL)

A policy bundle is a hashable document. Example structure (YAML-like; illustrative):

policy_id: "org.prod.default.v1"
trust_roots:
  - principal: "did:web:vendor.example"
    min_assurance: A2
    authority:
      products: ["vendor.example/*"]

  - principal: "did:web:sec.internal"
    min_assurance: A2
    authority:
      artifacts: ["sha256:*"]   # internal is authoritative for internal artifacts

acceptance_thresholds:
  resolved_with_pedigree:
    min_evidence_class: E3
    min_assurance: A3

  not_affected:
    mode: quorum
    quorum:
      - any:
          - { principal_role: vendor, min_assurance: A3 }
          - { evidence_type: callgraph_path, min_assurance: A3 }
          - all:
              - { distinct_principals: 2 }
              - { min_assurance_each: A2 }

conflict_mode:
  production: skeptical
  development: authority_weighted

The engine must treat the policy bundle as an input artifact (hashed, stored, referenced in proofs).

12) Determinism requirements

To guarantee deterministic replay:

Canonical JSON for all stored objects (claims, evidence, policy bundles).
Content-addressing:
- id = sha256(canonical_bytes)
Stable sorting:
- when iterating claims/evidence, sort by (type, id) to prevent nondeterministic traversal
Time handling:
- evaluation time is explicit input (e.g., as_of timestamp)
- expired claims are excluded deterministically
Version pinning:
- tool identity + version recorded in evidence
- vuln feed snapshot digests recorded
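
The time-handling and stable-sorting rules can be sketched as one selection step that runs before aggregation. The claim dict shape and the boundary convention (valid_from inclusive, valid_until exclusive) are assumptions; the spec only requires that the rule be explicit and deterministic.

```python
from datetime import datetime, timezone


def active_claims(claims, as_of: datetime):
    """Drop claims not valid at as_of, then order the rest by (type, id)."""
    live = [
        c for c in claims
        if c["valid_from"] <= as_of
        and (c.get("valid_until") is None or as_of < c["valid_until"])
    ]
    # Sorting by (type, id) makes traversal independent of arrival order.
    return sorted(live, key=lambda c: (c["type"], c["id"]))


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


claims = [
    {"id": "sha256:bb", "type": "openvex", "valid_from": utc(2025, 1, 1), "valid_until": None},
    {"id": "sha256:aa", "type": "openvex", "valid_from": utc(2025, 1, 1), "valid_until": utc(2025, 6, 1)},
    {"id": "sha256:cc", "type": "csaf", "valid_from": utc(2025, 3, 1)},
]

# The evaluation time is an explicit input, never the wall clock.
ids = [c["id"] for c in active_claims(claims, as_of=utc(2025, 7, 1))]
assert ids == ["sha256:cc", "sha256:bb"]
```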

13) Worked examples

Example A: Vendor says not affected; scanner says exploitable

Inputs:

OpenVEX: not_affected with justification (required by spec) (GitHub)
Internal scanner: flags exploitable

Aggregation:

REACHABLE: ⊤ (conflict)

Selection (production skeptical):

verdict: in_triage

Selection (authority-weighted, vendor authoritative):

verdict: not_affected

Proof bundle records conflict and why policy accepted vendor.

Example B: Fixed with pedigree

Inputs:

CycloneDX analysis.state = resolved_with_pedigree (Ecma International)
Evidence includes commit history/diff in pedigree

Selection:

resolved_with_pedigree

Example C: Not affected due to mitigations

Inputs:

CycloneDX analysis.state=not_affected, justification=protected_at_runtime (Ecma International)
Evidence: runtime mitigation proof (E2/E3)

Selection:

not_affected with justification preserved

14) Implementation checklist

Minimum viable implementation modules:

Format adapters
- CycloneDX/ECMA-424 VEX parser
- OpenVEX parser
- CSAF VEX parser
Canonicalizer
- canonical JSON encoder
- digest computation
Verifier
- signature/DSSE verifier (pluggable)
- principal resolver (keys/certs/OIDC claims)
Normalizer
- maps external statements → atoms
- maps justifications → conditions/evidence expectations
Lattice store
- support sets per (S,p)
- efficient indexing by artifact/component/vuln
Policy evaluator
- thresholds, authority scopes, conflict modes
Proof generator
- deterministic trace emission
- optional DSSE signing step

What you should build next (practical sequencing)

Implement the canonical atom model + K4 aggregation (fast, foundational).
Add CycloneDX/ECMA-424 ingestion first (richest semantics). (Ecma International)
Add OpenVEX + CSAF ingestion with mapping to atoms. (Docker Documentation)
Add trust label computation + policy bundle hashing.
Add proof bundles + DSSE signing support. (GitHub)

Possible follow-ups to this spec:

a concrete JSON schema for Claim, Evidence, PolicyBundle, and ProofBundle;
an explicit mapping table from OpenVEX/CSAF justifications to ECMA-424 justifications/atoms;
a reference evaluation pseudocode implementation (deterministic, testable).
