git.stella-ops.org/docs/product-advisories/19-Dec-2025 - Benchmarking Container Scanners Against Stella Ops.md
I'm sharing a competitive security-tool matrix that you can plug directly into Stella Ops strategy discussions — it maps real, comparable evidence from public sources onto categories where most current tools fall short. Below the CSV is a short Markdown commentary that highlights the gaps and opportunities Stella Ops can exploit.


🧠 Competitive Security Tool Matrix (CSV)


```csv
Tool,SBOM Fidelity,VEX Handling,Explainability,Smart-Diff,Call-Stack Reachability,Deterministic Scoring,Unknowns State,Ecosystem Integrations,Policy Engine,Offline/Air-Gapped,Provenance/Attestations,Public Evidence
Trivy (open),CycloneDX/SPDX support (basic),Partial (SBOM external refs),Low,No,No,Moderate,No,Strong CI/CD/K8s,Minimal,Unknown,SBOM-only evidence; VEX support request exists but unmerged,see notes below
Grype/Syft,Strong CycloneDX/SPDX (generator + scanner),None documented,Low,No,No,Moderate,No,Strong CI/CD/K8s,Policy minimal,Unknown,Syft can create signed SBOMs but not full attestations,see notes below
Snyk,SBOM export likely (platform),Unknown/limited,Vulnerability-context explainability (reports),No,No,Proprietary risk scoring,No,Partial integrations,Strong block/allow-list policies in UI,Unknown,Unknown (not focused on attestations),see notes below
Prisma Cloud,Enterprise SBOM + vuln scanning,Runtime exploitability contexts?,Enterprise dashboards,No formal smart-diff,No,Risk prioritization,No,Supports multi-cloud integrations,Rich policy engines (CNAPP),Supports offline deployment?,Unknown attestation capabilities,see notes below
Aqua (enterprise),SBOM via Trivy,Unknown commercial VEX support,Some explainability in reports,No documented smart-diff,No,Risk prioritization,No,Comprehensive integrations (cloud/CI/CD/SIEM),Enterprise policy supports compliance,Air-gapped options in enterprise,Focus on compliance attestations?,see notes below
Anchore Enterprise,Strong SBOM mgmt + format support,Policy engine can ingest SBOM + vulnerability sources,Moderate (reports & SBOM insights),Potential policy diff,No explicit reachability analysis,Moderate policy scoring,Partial,Rich integrations (CI/CD/registry),Policy-as-code,Air-gapped deploy supported,SBOM provenance & signing via Syft/in-toto,see notes below
Stella Ops,High-fidelity SBOM (CycloneDX/SPDX) planned,Native VEX ingestion + decisioning,Explainability + proof extracts,Smart-diff tech planned,Call-stack reachability analysis,Deterministic scoring with proofs,Explicit unknowns state,Integrations with CI/CD/Sigstore,Declarative multi-modal policy engine,Full offline/air-gapped support,Provenance/attestations via DSSE/in-toto,Stella Ops internal vision
```

📌 Key Notes, Gaps & Opportunities (Markdown)

SBOM Fidelity

  • Open tools (Trivy, Syft) already support CycloneDX/SPDX output, but mostly as flat SBOM artifacts without long-term repositories or versioned diffing. (Ox Security)
  • Opportunity: Provide repository + lineage + merge semantics with proofs — not just generation.

VEX Handling

  • Trivy has an open feature request for dynamic VEX ingestion. (GitHub)
  • Most competitors either lack VEX support or have no decisioning logic based on exploitability.
  • Opportunity: First-class VEX ingestion with evaluation rules + automated scoring.

Explainability

  • Commercial tools (Prisma/Snyk) offer UI report context and dev-oriented remediation guidance. (Snyk)
  • OSS tools provide flat scan outputs with minimal causal trace.
  • Opportunity: Link vulnerability flags back to proven code paths, enriched with SBOM + call reachability.

SmartDiff & Unknowns State

  • No major tool advertises smart diffing between SBOMs for incremental risk deltas across releases.
  • Opportunity: Automate risk deltas between SBOMs with uncertainty margins.

CallStack Reachability

  • None of these tools publicly documents call-stack-based exploit-reachability analysis out of the box.
  • Opportunity: Integrate dynamic/static reachability evidence that elevates scanning from surface report → impact map.

Deterministic Scoring

  • Snyk & Prisma offer proprietary scoring that blends severity + context. (TrustRadius)
  • But these aren't reproducible with signed verdicts.
  • Opportunity: Provide deterministic, attestable scoring proofs.

Ecosystem Integrations

  • Trivy/Grype excel at lightweight CI/CD and Kubernetes. (Echo)
  • Enterprise products integrate deeply into cloud/registry. (Palo Alto Networks)
  • Opportunity: Expand Sigstore/Notation-based pipelines and automated attestation flows.

Policy Engine

  • Prisma & Aqua have mature enterprise policies. (Aqua)
  • OSS tools offer only simple allow/deny rules.
  • Opportunity: Provide lattice/constraint policies with proof outputs.

Offline/AirGapped

  • Anchore supports airgapped deployment in enterprise contexts. (Anchore)
  • Support across all open tools is ad hoc at best.
  • Opportunity: Built-in deterministic offline modes with offline SBOM stores and VEX ingestion.

Provenance/Attestations

  • Syft supports SBOM output in various formats, plus in-toto for attestations. (Ox Security)
  • Most competitors don't prominently advertise attestation pipelines.
  • Opportunity: End-to-end DSSE/in-toto provenance with immutable proofs.

Evidence sources

  • Trivy / Syft / Grype SBOM support & formats: CycloneDX/SPDX; Syft as generator + Grype as scanner. (Ox Security)
  • Trivy VEX feature request: exists but not mainstream yet. (GitHub)
  • Snyk platform capability: scans containers and IaC with dev-first prioritization. (Snyk)
  • Prisma Cloud container security: lifecycle scanning + policy. (Palo Alto Networks)
  • Anchore Enterprise SBOM mgmt & policy: central imports + CI/CD ecosystem. (Anchore Docs)

If you'd like this in Excel, JSON, or Mermaid graph form next, I can generate it — just tell me the format you prefer.

Below is a deep, implementation-grade elaboration that turns the comparison matrix into actionable architecture and product differentiation for Stella Ops. I will deliberately avoid marketing language and focus on mechanics, evidence, and execution gaps.


1. Why existing tools plateau (root cause analysis)

Across Trivy, Syft/Grype, Snyk, Prisma, Aqua, and Anchore, there is a structural ceiling they all hit — regardless of OSS vs. enterprise.

Shared structural limitations

  1. SBOM is treated as a static artifact

    • Generated → stored → scanned.
    • No concept of evolving truth, lineage, or replayability.
  2. Vulnerability scoring is probabilistic, not provable

    • CVSS + vendor heuristics.
    • Cannot answer: “Show me why this CVE is exploitable here.”
  3. Exploitability ≠ reachability

    • “Runtime context” ≠ call-path proof.
  4. Diffing is file-level, not semantic

    • Image hash change ≠ security delta understanding.
  5. Offline support is operational, not epistemic

    • You can run it offline, but you cannot prove what knowledge state was used.

These are not accidental omissions. They arise from tooling lineage:

  • Trivy/Syft grew from package scanners
  • Snyk grew from developer remediation UX
  • Prisma/Aqua grew from policy & compliance platforms

None were designed around forensic reproducibility or trust algebra.


2. SBOM fidelity: what “high fidelity” actually means

Most tools claim CycloneDX/SPDX support. That is necessary but insufficient.

Current reality

| Dimension | Industry tools |
| --- | --- |
| Component identity | Package name + version |
| Binary provenance | Weak or absent |
| Build determinism | None |
| Dependency graph | Flat or shallow |
| Layer attribution | Partial |
| Rebuild reproducibility | Not supported |

What Stella Ops must do differently

SBOM must become a stateful ledger, not a document.

Concrete requirements:

  • Component identity = (source + digest + build recipe hash)

  • Binary → source mapping

    • ELF Build-ID / Mach-O UUID / PE timestamp+hash
  • Layer-aware dependency graphs

    • Not “package depends on X”
    • But “binary symbol A resolves to shared object B via loader rule C”
  • Replay manifest

    • Exact feeds
    • Exact policies
    • Exact scoring rules
    • Exact timestamps
    • Hash of everything

This is the foundation for deterministic replayable scans — something none of the competitors even attempt.
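The identity and replay-manifest requirements above can be sketched in a few lines. This is a minimal illustration, not Stella Ops' actual schema: all names are hypothetical, and canonical JSON plus SHA-256 are chosen only to show how "hash of everything" yields a stable, replayable identity.

```python
import hashlib
import json


def component_id(source_uri: str, artifact_digest: str, build_recipe: str) -> str:
    """Derive a stable component identity from (source + digest + build recipe hash)."""
    recipe_hash = hashlib.sha256(build_recipe.encode()).hexdigest()
    canonical = json.dumps(
        {"source": source_uri, "digest": artifact_digest, "recipe": recipe_hash},
        sort_keys=True, separators=(",", ":"),
    )
    return "sha256:" + hashlib.sha256(canonical.encode()).hexdigest()


def replay_manifest(feeds, policies, scoring_rules, timestamp: str) -> dict:
    """Bind the exact knowledge state of a scan into one hashable record.

    Inputs are digests of the advisory feeds, policy bundles, and scoring
    rules that were in effect; sorting makes the digest order-independent.
    """
    body = {
        "feeds": sorted(feeds),
        "policies": sorted(policies),
        "scoring": sorted(scoring_rules),
        "timestamp": timestamp,
    }
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    # Digest is computed over the body, then attached alongside it.
    body["manifest_digest"] = "sha256:" + hashlib.sha256(canonical.encode()).hexdigest()
    return body
```

Because both functions are pure, re-running them on the same inputs reproduces the same digests — the property a replayable scan depends on.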


3. VEX handling: ingestion vs decisioning

Most vendors misunderstand VEX.

What competitors do

  • Accept VEX as:

    • Metadata
    • Annotation
    • Suppression rule
  • No formal reasoning over VEX statements.

What Stella Ops must do

VEX is not a comment — it is a logical claim.

Each VEX statement:

```
IF
  product == X
  AND component == Y
  AND version in range Z
THEN
  status ∈ {not_affected, affected, fixed, under_investigation}
BECAUSE
  justification J
WITH
  evidence E
```

Stella Ops advantage:

  • VEX statements become inputs to a lattice merge

  • Conflicting VEX from:

    • Vendor
    • Distro
    • Internal analysis
    • Runtime evidence

    is resolved deterministically via policy, not precedence hacks.

This unlocks:

  • Vendor-supplied proofs
  • Customer-supplied overrides
  • Jurisdiction-specific trust rules
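A toy illustration of such a deterministic merge: trust weights and the severity ordering below are hypothetical policy inputs, not Stella Ops' actual rules. The point is that the outcome depends only on the claims and the policy, never on arrival order.

```python
from enum import Enum


class VexStatus(Enum):
    NOT_AFFECTED = "not_affected"
    FIXED = "fixed"
    UNDER_INVESTIGATION = "under_investigation"
    AFFECTED = "affected"


# Hypothetical policy: higher-trust sources win; ties resolve to the
# more conservative (more "affected") status.
SOURCE_TRUST = {"runtime_evidence": 4, "internal_analysis": 3, "vendor": 2, "distro": 1}
SEVERITY_ORDER = [VexStatus.NOT_AFFECTED, VexStatus.FIXED,
                  VexStatus.UNDER_INVESTIGATION, VexStatus.AFFECTED]


def merge_vex(claims: list) -> VexStatus:
    """Deterministically merge conflicting (source, status) VEX claims."""
    top_trust = max(SOURCE_TRUST[src] for src, _ in claims)
    candidates = [st for src, st in claims if SOURCE_TRUST[src] == top_trust]
    return max(candidates, key=SEVERITY_ORDER.index)
```

A vendor "not_affected" claim survives a distro "affected" claim here, but is overridden by internal analysis — because the policy table says so, not because of hard-coded precedence.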

4. Explainability: reports vs proofs

Industry “explainability”

  • “This vulnerability is high because…”
  • Screenshots, UI hints, remediation text.

Required explainability

Security explainability must answer four non-negotiable questions:

  1. What exact evidence triggered this finding?
  2. What code or binary path makes it reachable?
  3. What assumptions are being made?
  4. What would falsify this conclusion?

No existing scanner answers #4.

Stella Ops model

Each finding emits:

  • Evidence bundle:

    • SBOM nodes
    • Call-graph edges
    • Loader resolution
    • Runtime symbol presence
  • Assumption set:

    • Compiler flags
    • Runtime configuration
    • Feature gates
  • Confidence score derived from evidence density, not CVSS
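One way to make these bundles concrete, and to answer question 4, is to carry falsifiers alongside evidence in each finding record. A sketch, with all field names hypothetical and the confidence formula purely illustrative:

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    """Evidence-first finding record: evidence, assumptions, and falsifiers travel together."""
    cve_id: str
    evidence: list = field(default_factory=list)     # SBOM nodes, call-graph edges, loader rules
    assumptions: list = field(default_factory=list)  # compiler flags, runtime config, feature gates
    falsifiers: list = field(default_factory=list)   # observations that would refute the finding

    def confidence(self) -> float:
        """Confidence from evidence density: each unproven assumption dilutes it."""
        total = len(self.evidence) + len(self.assumptions)
        return len(self.evidence) / total if total else 0.0
```

A finding backed by three pieces of evidence and resting on one assumption scores lower than one with no assumptions, and the falsifier list tells a reviewer exactly what observation would overturn it.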

This is explainability suitable for:

  • Auditors
  • Regulators
  • Courts
  • Defense procurement

5. Smart-Diff: the missing primitive

All tools compare:

  • Image A vs Image B
  • Result: “+3 CVEs, −1 CVE”

This is noise-centric diffing.

What Smart-Diff must mean

Diff not artifacts, but security meaning.

Examples:

  • Same CVE remains, but:

    • Call path removed → risk collapses
  • New binary added, but:

    • Dead code → no reachable risk
  • Dependency upgraded, but:

    • ABI unchanged → no exposure delta

Implementation direction:

  • Diff reachability graphs
  • Diff policy outcomes
  • Diff trust weights
  • Diff unknowns

Output:

“This release reduces exploitability surface by 41%, despite +2 CVEs.”

No competitor does this.
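Sketched over reachability outcomes rather than CVE counts, the diff above looks like this (a toy model; real inputs would be full reachability graphs, not per-CVE booleans):

```python
def smart_diff(old: dict, new: dict) -> dict:
    """Diff security meaning, not artifacts.

    Each map is CVE id -> reachable?. A CVE that persists but loses its
    call path is a real risk reduction, even though a plain CVE-count
    diff would report no change for it.
    """
    all_cves = set(old) | set(new)
    return {
        "risk_collapsed": sorted(c for c in all_cves
                                 if old.get(c, False) and not new.get(c, False)),
        "risk_introduced": sorted(c for c in all_cves
                                  if not old.get(c, False) and new.get(c, False)),
        "cve_count_delta": len(new) - len(old),
    }
```

Running it on a release that adds two unreachable CVEs while losing the call path to an old one reports a positive CVE-count delta alongside a collapsed risk — exactly the "+2 CVEs, yet less exploitable" verdict described above.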


6. Call-stack reachability: why runtime context isn't enough

Current vendor claim

“Runtime exploitability analysis.”

Reality:

  • Usually:

    • Process exists
    • Library loaded
    • Port open

This is coarse correlation, not proof.

Stella Ops reachability model

Reachability requires three layers:

  1. Static call graph

    • From entrypoints to vulnerable symbols
  2. Binary resolution

    • Dynamic loader rules
    • Symbol versioning
  3. Runtime gating

    • Feature flags
    • Configuration
    • Environment

Only when all three align does exploitability exist.

This makes false positives structurally impossible, not heuristically reduced.
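The three-layer conjunction can be stated in a few lines (illustrative only; the booleans stand in for the static, binary-resolution, and runtime analyses):

```python
def reachability_verdict(static_path: bool, resolves: bool, gate_open: bool) -> str:
    """Exploitability requires all three layers; any single 'no' rules it out,
    and the verdict names which layer blocked it."""
    layers = [("static call graph", static_path),
              ("binary resolution", resolves),
              ("runtime gating", gate_open)]
    blocked = [name for name, ok in layers if not ok]
    return "exploitable" if not blocked else "blocked at: " + ", ".join(blocked)
```

Naming the blocking layer matters: "blocked at: runtime gating" is an auditable claim that can be re-checked when configuration changes, unlike a silently suppressed finding.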


7. Deterministic scoring: replacing trust with math

Every competitor uses:

  • CVSS
  • EPSS
  • Proprietary weighting

Problem:

  • Scores are non-reproducible
  • Cannot be attested
  • Cannot be audited

Stella Ops scoring

Score = deterministic function of:

  • Evidence count
  • Evidence strength
  • Assumption penalties
  • Trust source weights
  • Policy constraints

Same inputs → same outputs → forever.

This enables:

  • Signed risk decisions
  • Cross-org verification
  • Legal defensibility
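As a sketch, a deterministic score is just a pure function of those inputs; the weights and formula below are illustrative, not Stella Ops' actual rules:

```python
def deterministic_score(evidence_weights: list,
                        assumption_penalty: float,
                        trust_weight: float) -> float:
    """Pure function of its inputs: same inputs -> same score, forever.

    Because the mapping is deterministic, the (inputs, score) pair can be
    hashed and signed, making the risk decision attestable and auditable.
    """
    raw = sum(evidence_weights) * trust_weight - assumption_penalty
    # Clamp to a fixed scale and precision so replays are bit-identical.
    return round(max(0.0, min(10.0, raw)), 4)
```

Contrast with CVSS/EPSS-style scoring, where vendor heuristics or shifting probability feeds mean the same artifact can score differently on different days.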

8. Unknowns as a first-class state

Industry tools suppress uncertainty.

Stella Ops must surface it.

States:

  • Known-safe
  • Known-vulnerable
  • Unknown-reachable
  • Unknown-unreachable

Unknowns are risk, but different from vulnerabilities.

This is critical for:

  • Air-gapped environments
  • Novel exploits
  • Zero-day windows

No competitor models this explicitly.
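The four states above can be modeled directly. A sketch, with deliberately simplified classification inputs ("verdict known?", "vulnerable?", "on a reachable path?"):

```python
from enum import Enum


class KnowledgeState(Enum):
    KNOWN_SAFE = "known-safe"
    KNOWN_VULNERABLE = "known-vulnerable"
    UNKNOWN_REACHABLE = "unknown-reachable"
    UNKNOWN_UNREACHABLE = "unknown-unreachable"


def classify(verdict_known: bool, vulnerable: bool, reachable: bool) -> KnowledgeState:
    """Keep 'we could not determine vulnerability' distinct from 'safe'.

    An unknown sitting on a reachable path demands attention first; an
    unreachable unknown is still risk, but of a different kind.
    """
    if verdict_known:
        return KnowledgeState.KNOWN_VULNERABLE if vulnerable else KnowledgeState.KNOWN_SAFE
    return KnowledgeState.UNKNOWN_REACHABLE if reachable else KnowledgeState.UNKNOWN_UNREACHABLE
```

The key design choice is that "unknown" is a returned state, never silently collapsed into "safe" — which is precisely what suppression-style tooling does.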


9. Offline / air-gapped: epistemic completeness

Competitors:

  • “Yes, we can run offline.”

But cannot answer:

“Exactly what knowledge did you have when you made this decision?”

Stella Ops must:

  • Snapshot feeds
  • Snapshot policies
  • Snapshot scoring rules
  • Snapshot trust anchors

Then bind them cryptographically to scan results.

This creates forensic replayability, not just offline execution.


10. Strategic conclusion (no hype)

Where Stella Ops can objectively dominate

  1. Deterministic replayable scans
  2. Formal VEX reasoning
  3. Reachability-backed exploitability
  4. Semantic smart-diff
  5. Evidence-first explainability
  6. Unknowns modeling
  7. Jurisdiction-ready offline trust

Where competitors remain stronger (for now)

  • Mass-market UX polish
  • SaaS onboarding friction
  • Marketplace integrations

These are solvable later. The hard parts above are not.


If you want, next I can:

  • Convert this into developer implementation milestones
  • Map features → moat strength vs competitors
  • Design the Trust Algebra / Lattice Engine spec
  • Produce a regulator-grade threat & evidence model

Tell me which vector you want next.