Add reference architecture and testing strategy documentation

- Created a new document for the Stella Ops Reference Architecture outlining the system's topology, trust boundaries, artifact association, and interfaces.
- Developed a comprehensive Testing Strategy document detailing the importance of offline readiness, interoperability, determinism, and operational guardrails.
- Introduced a README for the Testing Strategy, summarizing processing details and key concepts implemented.
- Added guidance for AI agents and developers in the tests directory, including directory structure, test categories, key patterns, and rules for test development.
2025-12-22 07:59:15 +02:00
parent 5d398ec442
commit 53503cb407
96 changed files with 37565 additions and 71 deletions
--- a/docs/product-advisories/19-Dec-2025
+++ b/docs/product-advisories/19-Dec-2025
@@ -0,0 +1,104 @@
+Below is a **feature → moat strength** map for Stella Ops, explicitly benchmarked against the tools we’ve been discussing (Trivy/Aqua, Grype/Syft, Anchore Enterprise, Snyk, Prisma Cloud). I’m using **“moat”** in the strict sense: *how hard is it for an incumbent to replicate the capability to parity, and how strong are the switching costs once deployed.*
+
+### Moat scale
+
+* **5 = Structural moat** (new primitives, strong defensibility, durable switching cost)
+* **4 = Strong moat** (difficult multi-domain engineering; incumbents have only partial analogs)
+* **3 = Moderate moat** (others can build; differentiation is execution + packaging)
+* **2 = Weak moat** (table-stakes soon; limited defensibility)
+* **1 = Commodity** (widely available in OSS / easy to replicate)
+
+---
+
+## 1) Stella Ops candidate features mapped to moat strength
+
+| Stella Ops feature (precisely defined)                                                                                                                                        | Closest competitor analogs (evidence)                                                                                                                                                                                                                                                                                                               |                                   Competitive parity today | Moat strength | Why this is (or isn’t) defensible                                                                                                                                                                                                                              | How to harden the moat                                                                                                                                                                 |
+| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------: | ------------: | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| **Signed, replayable risk verdicts**: “this artifact is acceptable” decisions produced deterministically, with an evidence bundle + policy snapshot, signed as an attestation | Ecosystem can sign SBOM attestations (e.g., Syft + Sigstore; DSSE/in-toto via cosign), but not “risk verdict” decisions end-to-end ([Anchore][1])                                                                                                                                                                                                   |                                                        Low |         **5** | This requires a **deterministic evaluation model**, a **proof/evidence schema**, and “knowledge snapshotting” so results are replayable months later. Incumbents mostly stop at exporting scan results or SBOMs, not signing a decision in a reproducible way. | Make the verdict format a **first-class artifact** (OCI-attached attestation), with strict replay semantics (“same inputs → same verdict”), plus auditor-friendly evidence extraction. |
+| **VEX decisioning engine (not just ingestion)**: ingest OpenVEX/CycloneDX/CSAF, resolve conflicts with a trust/policy lattice, and produce explainable outcomes               | Trivy supports multiple VEX formats (CycloneDX/OpenVEX/CSAF) but notes it’s “experimental/minimal functionality” ([Trivy][2]). Grype supports OpenVEX ingestion ([Chainguard][3]). Anchore can generate VEX docs from annotations (OpenVEX + CycloneDX) ([Anchore Docs][4]). Aqua runs VEX Hub for distributing VEX statements to Trivy ([Aqua][5]) |          Medium (ingestion exists; decision logic is thin) |         **4** | Ingestion alone is easy; the moat comes from **formal conflict resolution**, provenance-aware trust weighting, and deterministic outcomes. Most tools treat VEX as suppression/annotation, not a reasoning substrate.                                          | Ship a **policy-controlled merge semantics** (“vendor > distro > internal” is too naive) + required evidence hooks (e.g., “not affected because feature flag off”).                    |
+| **Reachability with proof**, tied to deployable artifacts: produce a defensible chain “entrypoint → call path → vulnerable symbol,” plus configuration gates                  | Snyk has reachability analysis in GA for certain languages/integrations and uses call-graph style reasoning to determine whether vulnerable code is called ([Snyk User Docs][6]). Some commercial vendors also market reachability (e.g., Endor Labs is listed in CycloneDX Tool Center as analyzing reachability) ([CycloneDX][7])                 | Medium (reachability exists, but proof portability varies) |         **4** | “Reachability” as a label is no longer unique. The moat is **portable proofs** (usable in audits and in air-gapped environments) + artifact-level mapping (not just source repo analysis) + deterministic replay.                                              | Focus on **proof-carrying reachability**: store the reachability subgraph as evidence; make it reproducible and attestable; support both source and post-build artifacts.              |
+| **Smart-Diff (semantic risk delta)**: between releases, explain “what materially changed in exploitable surface,” not just “CVE count changed”                                | Anchore provides SBOM management and policy evaluation (good foundation), but “semantic risk diff” is not a prominent, standardized feature in typical scanners ([Anchore Docs][8])                                                                                                                                                                 |                                                 Low–Medium |         **4** | Most incumbents can diff findings lists. Few can diff **reachability graphs, policy outcomes, and VEX state** to produce stable “delta narratives.” Hard to replicate without the underlying evidence model.                                                   | Treat diff as first-class: version SBOM graphs + reachability graphs + VEX claims; compute deltas over those graphs and emit a signed “delta verdict.”                                 |
+| **Unknowns as first-class state**: represent “unknown-reachable/unknown-unreachable” and force policies to account for uncertainty                                            | Not a standard capability in common scanners/platforms; most systems output findings and (optionally) suppressions                                                                                                                                                                                                                                  |                                                        Low |         **4** | This is conceptually simple but operationally rare; it requires rethinking UX, scoring, and policy evaluation. It becomes sticky once orgs base governance on uncertainty budgets.                                                                             | Bake unknowns into policies (“fail if unknowns > N in prod”), reporting, and attestations. Make it the default rather than optional.                                                   |
+| **Air-gapped epistemic mode**: offline operation where the tool can prove what knowledge it used (feed snapshot + timestamps + trust anchors)                                 | Prisma Cloud Compute Edition supports air-gapped environments and has an offline Intel Stream update mechanism ([Prisma Cloud Docs][9]). (But “prove exact knowledge state used for decisions” is typically not the emphasis.)                                                                                                                      |                                                     Medium |         **4** | Air-gapped “runtime” is common; air-gapped **reproducibility** is not. The moat is packaging offline feeds + policies + deterministic scoring into a replayable bundle tied to attestations.                                                                   | Deliver a “sealed knowledge snapshot” workflow (export/import), and make audits a one-command replay.                                                                                  |
+| **SBOM ledger + lineage**: BYOS ingestion plus versioned SBOM storage, grouping, and historical tracking                                                                      | Anchore explicitly positions centralized SBOM management and “Bring Your Own SBOM” ([Anchore Docs][8]). Snyk can generate SBOMs and expose SBOM via API in CycloneDX/SPDX formats ([Snyk User Docs][10]). Prisma can export CycloneDX SBOMs for scans ([Prisma Cloud Docs][11])                                                                     |                                                       High |         **3** | SBOM generation/storage is quickly becoming table stakes. You can still differentiate on **graph fidelity + lineage semantics**, but “having SBOMs” alone won’t be a moat.                                                                                     | Make the ledger valuable via **semantic diff, evidence joins (reachability/VEX), and provenance** rather than storage.                                                                 |
+| **Policy engine with proofs**: policy-as-code that produces a signed explanation (“why pass/fail”) and links to evidence nodes                                                | Anchore has a mature policy model (policy JSON, gates, allowlists, mappings) ([Anchore Docs][12]). Prisma/Aqua have rich policy + runtime guardrails (platform-driven) ([Aqua][13])                                                                                                                                                                 |                                                       High |         **3** | Policy engines are common. The moat is the **proof output** + deterministic replay + integration with attestations.                                                                                                                                            | Keep policy language small but rigorous; always emit evidence pointers; support “policy compilation” to deterministic decision artifacts.                                              |
+| **VEX distribution network**: ecosystem layer that aggregates, validates, and serves VEX at scale                                                                             | Aqua’s VEX Hub is explicitly a centralized repository designed for discover/fetch/consume flows with Trivy ([Aqua][5])                                                                                                                                                                                                                              |                                                     Medium |       **3–4** | A network layer can become a moat if it achieves broad adoption. But incumbents can also launch hubs. This becomes defensible only with **network effects + trust frameworks**.                                                                                | Differentiate with **verification + trust scoring** of VEX sources, plus tight coupling to deterministic decisioning and attestations.                                                 |
+| **“Integrations everywhere”** (CI/CD, registry, Kubernetes, IDE)                                                                                                              | Everyone in this space integrates broadly; reachability and scoring features often ride those integrations (e.g., Snyk reachability depends on repo/integration access) ([Snyk User Docs][6])                                                                                                                                                       |                                                       High |       **1–2** | Integrations are necessary, but not defensible—mostly engineering throughput.                                                                                                                                                                                  | Use integrations to *distribute attestations and proofs*, not as the headline differentiator.                                                                                          |
+
+---
+
+## 2) Where competitors already have strong moats (avoid head‑on fights early)
+
+These are areas where incumbents are structurally advantaged, so Stella Ops should either (a) integrate rather than replace, or (b) compete only with a much sharper wedge.
+
+### Snyk’s moat: developer adoption + reachability-informed prioritization
+
+* Snyk publicly documents **reachability analysis** (GA for certain integrations/languages) ([Snyk User Docs][6])
+* Snyk prioritization incorporates reachability and other signals into **Priority Score** ([Snyk User Docs][14])
+
+**Implication:** pure “reachability” claims won’t beat Snyk; **proof-carrying, artifact-tied, replayable reachability** can.
+
+### Prisma Cloud’s moat: CNAPP breadth + graph-based risk prioritization + air-gapped CWPP
+
+* Prisma invests in graph-driven investigation/tracing of vulnerabilities ([Prisma Cloud Docs][15])
+* Risk prioritization and risk-score ranked vulnerability views are core platform capabilities ([Prisma Cloud Docs][16])
+* Compute Edition supports **air-gapped environments** and has offline update workflows ([Prisma Cloud Docs][9])
+
+**Implication:** competing on “platform breadth” is a losing battle early; compete on **decision integrity** (deterministic, attestable, replayable) and integrate where needed.
+
+### Anchore’s moat: SBOM operations + policy-as-code maturity
+
+* Anchore is explicitly SBOM-management centric and supports policy gating constructs ([Anchore Docs][8])
+
+**Implication:** Anchore is strong at “SBOM at scale.” Stella Ops should outperform on **semantic diff, VEX reasoning, and proof outputs**, not just SBOM storage.
+
+### Aqua’s moat: code-to-runtime enforcement plus emerging VEX distribution
+
+* Aqua provides CWPP-style runtime policy enforcement/guardrails ([Aqua][13])
+* Aqua backs VEX Hub for VEX distribution and Trivy consumption ([Aqua][5])
+
+**Implication:** if Stella Ops is not a runtime protection platform, don’t chase CWPP breadth; use Aqua/Prisma integrations and focus on upstream decision quality.
+
+---
+
+## 3) Practical positioning: which features produce the most durable wedge
+
+If you want the shortest path to a *defensible* position:
+
+1. **Moat anchor (5): Signed, replayable risk verdicts**
+
+   * Everything else (VEX, reachability, diff) becomes evidence feeding that verdict.
+2. **Moat amplifier (4): VEX decisioning + proof-carrying reachability**
+
+   * In 2025, VEX support exists in Trivy/Grype/Anchore ([Trivy][2], [Chainguard][3], [Anchore Docs][4]), and reachability exists in Snyk ([Snyk User Docs][6]).
+   * Your differentiation must be: **determinism + portability + auditability**.
+3. **Moat compounding (4): Smart-Diff over risk meaning**
+
+   * Turns “scan results” into an operational change-control primitive.
+
+---
+
+## 4) A concise “moat thesis” per feature (one-liners you can use internally)
+
+* **Deterministic signed verdicts:** “We don’t output findings; we output an attestable decision that can be replayed.”
+* **VEX decisioning:** “We treat VEX as a logical claim system, not a suppression file.”
+* **Reachability proofs:** “We provide proof of exploitability in *this* artifact, not just a badge.”
+* **Smart-Diff:** “We explain what changed in exploitable surface area, not what changed in CVE count.”
+* **Unknowns modeling:** “We quantify uncertainty and gate on it.”
+
+---
+
+[1]: https://anchore.com/sbom/creating-sbom-attestations-using-syft-and-sigstore/ "Creating SBOM Attestations Using Syft and Sigstore"
+[2]: https://trivy.dev/docs/v0.50/supply-chain/vex/ "VEX"
+[3]: https://www.chainguard.dev/unchained/vexed-then-grype-about-it-chainguard-and-anchore-announce-grype-supports-openvex "VEXed? Then Grype about it"
+[4]: https://docs.anchore.com/current/docs/vulnerability_management/vuln_annotations/ "Vulnerability Annotations and VEX"
+[5]: https://www.aquasec.com/blog/introducing-vex-hub-unified-repository-for-vex-statements/ "Trivy VEX Hub: The Solution to Vulnerability Fatigue"
+[6]: https://docs.snyk.io/manage-risk/prioritize-issues-for-fixing/reachability-analysis "Reachability analysis"
+[7]: https://cyclonedx.org/tool-center/ "CycloneDX Tool Center"
+[8]: https://docs.anchore.com/current/docs/sbom_management/ "SBOM Management"
+[9]: https://docs.prismacloud.io/en/compute-edition "Prisma Cloud Compute Edition"
+[10]: https://docs.snyk.io/developer-tools/snyk-cli/commands/sbom "SBOM | Snyk User Docs"
+[11]: https://docs.prismacloud.io/en/compute-edition/32/admin-guide/vulnerability-management/exporting-sboms "Exporting Software Bill of Materials on CycloneDX"
+[12]: https://docs.anchore.com/current/docs/overview/concepts/policy/policies/ "Policies and Evaluation"
+[13]: https://www.aquasec.com/products/cwpp-cloud-workload-protection/ "Cloud workload protection in Runtime - Aqua Security"
+[14]: https://docs.snyk.io/manage-risk/prioritize-issues-for-fixing "Prioritize issues for fixing"
+[15]: https://docs.prismacloud.io/en/enterprise-edition/content-collections/search-and-investigate/c2c-tracing-vulnerabilities/investigate-vulnerabilities-tracing "Use Vulnerabilities Tracing on Investigate"
+[16]: https://docs.prismacloud.io/en/enterprise-edition/use-cases/secure-the-infrastructure/risk-prioritization "Risk Prioritization - Prisma Cloud Documentation"
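
The advisory above leans on two ideas that are easy to state and harder to pin down: strict replay semantics ("same inputs → same verdict") and gating on unknowns ("fail if unknowns > N"). The Python sketch below shows one way these could fit together. It is not the Stella Ops implementation. The names `canonical_digest`, `Verdict`, and `evaluate`, the finding fields (`id`, `reachability`), and the policy keys (`max_reachable`, `max_unknowns`) are all hypothetical. Signing is left out; a real system would presumably sign the final digest as an attestation.

```python
import hashlib
import json
from dataclasses import dataclass


def canonical_digest(obj) -> str:
    """Return the SHA-256 hex digest of a canonical JSON encoding of obj."""
    # Sorted keys and fixed separators make the byte encoding independent
    # of dict insertion order and whitespace.
    encoded = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


@dataclass(frozen=True)
class Verdict:
    decision: str        # "pass" or "fail"
    reasons: tuple       # human-readable reasons, in a fixed order
    inputs_digest: str   # digest of everything the decision depended on
    verdict_digest: str  # digest of the decision body; a real system would sign this


def evaluate(artifact_digest: str, findings: list, policy: dict, feed_snapshot_id: str) -> Verdict:
    """Evaluate a hypothetical policy over findings, deterministically."""
    # Sort findings by their canonical encoding so that caller ordering
    # cannot influence the result.
    ordered = sorted(findings, key=lambda f: json.dumps(f, sort_keys=True))
    inputs = {
        "artifact": artifact_digest,
        "feed_snapshot": feed_snapshot_id,
        "findings": ordered,
        "policy": policy,
    }

    # Unknown reachability is a first-class state, counted separately.
    reachable = [f["id"] for f in ordered if f["reachability"] == "reachable"]
    unknown = [f["id"] for f in ordered if f["reachability"] == "unknown"]

    reasons = []
    if len(reachable) > policy["max_reachable"]:
        reasons.append(f"reachable findings {len(reachable)} > {policy['max_reachable']}")
    if len(unknown) > policy["max_unknowns"]:
        reasons.append(f"unknown findings {len(unknown)} > {policy['max_unknowns']}")

    decision = "fail" if reasons else "pass"
    inputs_digest = canonical_digest(inputs)
    body = {"decision": decision, "inputs": inputs_digest, "reasons": reasons}
    return Verdict(decision, tuple(reasons), inputs_digest, canonical_digest(body))
```

Calling `evaluate` twice with the same artifact digest, findings (in any order), policy, and feed snapshot identifier yields an identical `Verdict`. Changing the feed snapshot changes `inputs_digest`, so a later replay can be checked against the recorded knowledge state.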