I thought you might appreciate a quick reality-check of public evidence backing up five concrete test cases you could drop into the StellaOps acceptance suite **today** — each rooted in a real issue or behavior in the ecosystem.

---

## 🔎 Recent public incidents & tests matching your list

### • Credential-leak via Grype JSON output

* CVE-2025-65965 / GHSA-6gxw-85q2-q646 allows registry credentials to be written unsanitized into `--output json=…`. ([GitHub][1])
* Affects Grype 0.68.0–0.104.0; patched in 0.104.1. ([GitHub][2])
* Workaround: avoid JSON file output or upgrade. ([GitHub][1])

**Implication for StellaOps**: run Grype with credentials + JSON output + scan the JSON for secrets; the test catches the leak.

---

### • Air-gap / old DB schema issues with Trivy

* `--skip-db-update` supports offline mode. ([Trivy][3])
* Using an old/offline DB with a mismatched schema errors out instead of falling back. ([GitHub][4])

**Implication**: capture that deterministic failure (old DB + skip update) or document the exit code as part of offline gating.

---

### • SBOM mismatch between native binary vs container builds (Syft + friends)

* Mixing SBOM sources (Syft vs Trivy) yields wildly different vulnerability counts when fed into Grype. ([GitHub][5])

**Implication**: compare SBOMs from native builds and containers for the same artifact (digests, component counts) to detect provenance divergence.

---

### • Inconsistent vulnerability detection across Grype versions

* The same SBOM under Grype v0.87.0 reported multiple critical+high findings; newer versions reported none. ([GitHub][6])

**Implication**: ingest a stable SBOM/VEX set under multiple DB/scanner versions to detect regressions or nondeterministic outputs.
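The first implication above — running Grype with credentials configured and then scanning its JSON output for secrets — reduces to a small check like the following sketch. The helper name and marker strings are illustrative, not StellaOps or Grype APIs, and the embedded report is a fake stand-in for real scanner output.

```python
import json

def find_leaked_markers(report_text: str, markers: list) -> list:
    # Return every planted secret marker that appears anywhere in the raw report.
    return [m for m in markers if m in report_text]

# Fake Grype-style output with a planted credential, standing in for a real scan report.
report = json.dumps({
    "matches": [],
    "config": {"auths": {"registry.example.com": {"password": "stella_PASS_456"}}},
})

leaked = find_leaked_markers(report, ["stella_USER_123", "stella_PASS_456"])
assert leaked == ["stella_PASS_456"]  # a real acceptance test fails on any non-empty result
```

In CI, the same grep-style check would run over the actual file produced by `--output json=…` after a scan against a registry with planted fake credentials.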
---

## ✅ Publicly verified vs speculative

| ✅ Publicly verified | ⚠️ Needs controlled testing / assumption-driven |
| --- | --- |
| Credential leak in Grype JSON output (CVE-2025-65965) ([GitHub][2]) | Exact SBOM-digest parity divergence between native & container Syft runs (no formal bug yet) |
| Trivy offline DB schema error ([Trivy][3]) | Custom CVSS/VEX sorting in patched grype-dbs or Snyk workarounds (not publicly reproducible) |
| Grype version divergence on the same SBOM ([GitHub][6]) | VEX evidence mappings from tools like Snyk (plausible but not documented reproducibly) |

---

## 🎯 Why this matters for StellaOps

Your goals are deterministic, audit-ready SBOM + VEX pipelines. These public incidents show how fragile the tooling can be. Embedding the above tests ensures reliability and reproducibility even under credential leaks, offline scans, or DB/schema churn.

Want a bash/pseudo spec for all five cases (commands + assertions)? I can drop it into your repo as a starting point.

[1]: https://github.com/anchore/grype/security/advisories/GHSA-6gxw-85q2-q646?utm_source=chatgpt.com "Credential disclosure vulnerability in Grype JSON output"
[2]: https://github.com/advisories/GHSA-6gxw-85q2-q646?utm_source=chatgpt.com "Grype has a credential disclosure vulnerability in its JSON ..."
[3]: https://trivy.dev/docs/v0.55/guide/advanced/air-gap/?utm_source=chatgpt.com "Advanced Network Scenarios"
[4]: https://github.com/aquasecurity/trivy/discussions/4838?utm_source=chatgpt.com "trivy offline-db · aquasecurity trivy · Discussion #4838"
[5]: https://github.com/aquasecurity/trivy/discussions/6325?utm_source=chatgpt.com "and of those detected Trivy sometimes does not report a ..."
[6]: https://github.com/anchore/grype/issues/2628?utm_source=chatgpt.com "Grype false negatives in versions v0.88.0 and later leading ..."

Here's a concrete package of **guidelines + five acceptance test specs** for StellaOps implementors, based on the incidents and behaviors we walked through (Grype leak, Trivy offline DB, Syft parity, grype-db / VEX, Snyk exploit maturity). I'll keep it tool/framework-agnostic so you can drop this into whatever test runner you're using.

---

## 1. Implementation guidelines for StellaOps integrators

These are the "rules of the road" that every new scanner / SBOM / VEX adapter in StellaOps should follow.

### 1.1. Test layering

For each external tool (Grype, Trivy, Syft, Snyk, …) you should have:

1. **Unit tests**
   * Validate argument construction, environment variables, and parsing of raw tool output.
2. **Adapter integration tests**
   * Run the real tool (or a pinned container image) on small deterministic fixtures.
3. **Acceptance tests (this doc)**
   * Cross-tool workflows and critical invariants: secrets, determinism, offline behavior, VEX mapping.
4. **Golden outputs**
   * For key paths, store canonical JSON/CSAF/CycloneDX outputs as fixtures and diff against them.

Each new adapter should add at least one test in each layer.

---

### 1.2. Determinism & reproducibility

**Rule:** *Same inputs (artifact, SBOM, VEX, tool versions) → bit-for-bit identical StellaOps results.*

Guidelines:

* Pin external tool versions in tests (`GRYPE_VERSION`, `TRIVY_VERSION`, `SYFT_VERSION`, `SNYK_CLI_VERSION`).
* Pin vulnerability DB snapshots (e.g. `GRYPE_DB_SNAPSHOT_ID`, `TRIVY_DB_DIR`).
* Persist the "scan context" alongside results: scanner version, DB snapshot id, SBOM hash, VEX hash.
* For sorted outputs, define and test a **stable global sort order**, e.g.:
  * `effectiveSeverity DESC, exploitEvidence DESC, vulnerabilityId ASC`.

---

### 1.3. Secrets and logs

Recent issues show scanners can leak sensitive data into JSON reports or logs:

* **Grype**: versions v0.68.0–v0.104.0 could embed registry credentials in JSON output files when invoked with `--file` or `--output json=` while credentials are configured. ([GitLab Advisory Database][1])
* **Trivy / Snyk**: both have had edge cases where DB/CLI errors or debug logs printed sensitive info. ([Trivy][2])

Guidelines:

* **Never** enable DEBUG/TRACE logging for third-party tools in production mode.
* **Never** use scanner options that write JSON reports directly to disk unless you control the path and sanitize the output.
* Treat logs and raw reports as **untrusted**: run them through secret-scanning / redaction before persistence.
* Build tests that use **fake-but-realistic secrets** (e.g. `stella_USER_123`, `stella_PASS_456`) and assert they never appear in:
  * DB rows
  * API responses
  * the UI
  * stored raw reports

---

### 1.4. Offline / air-gapped behavior

Trivy's offline mode is a good example of fragile behavior:

* Using an old offline DB with `--skip-db-update` can produce fatal "old schema" errors like `The local DB has an old schema version… --skip-update cannot be specified with the old DB schema`. ([GitHub][3])
* Air-gap workflows require explicit flags like `--skip-db-update` and `--skip-java-db-update`. ([aquasecurity.github.io][4])

Guidelines:

* All scanner adapters must have an **explicit offline mode** flag / config.
* In offline mode:
  * Do **not** attempt any network DB updates.
  * Treat a **DB schema mismatch / "old DB schema"** as a *hard* scanner infrastructure error, *not* as "zero vulns".
* Surface offline issues as **typed StellaOps errors**, e.g. `ScannerDatabaseOutOfDate`, `ScannerFirstRunNoDB`.

---

### 1.5. External metadata → internal model

Different tools carry different metadata:

* CVSS v2/v3/v3.1/v4.0 (scores & vectors).
* GHSA vs CVE vs SNYK IDs.
* Snyk's `exploitMaturity` field (e.g. `no-known-exploit`, `proof-of-concept`, `mature`, `no-data`).
([Postman][5])

Guidelines:

* Normalize all of these into **one internal vulnerability model**:
  * `primaryId` (CVE, GHSA, Snyk ID, etc.)
  * `aliases[]`
  * `cvss[]` (list of metrics with `version`, `baseScore`, `baseSeverity`)
  * `exploitEvidence` (internal enum derived from `exploitMaturity`, EPSS, etc.)
* Define **deterministic merge rules** for when multiple sources describe the same vuln (CVE + GHSA + SNYK).

---

### 1.6. Contract & fixture hygiene

* Prefer **static fixtures** over dynamic network calls in tests.
* Version every fixture with a semantic name and version, e.g. `fixtures/grype/2025-credential-leak-v1.json`.
* When you change parsers or normalizers, add a new version of the fixture, but **keep the old one** to guard against regressions.

---

## 2. Acceptance test specs (for implementors)

Below are five concrete test cases you can add to a `tests/acceptance/` suite. I'll name them `STELLA-ACC-00X` so you can turn them into tickets if you want.

---

### STELLA-ACC-001 — Grype JSON credential leak guard

**Goal**

Guarantee that StellaOps never stores or exposes registry credentials even if:

* Grype itself is (or becomes) vulnerable, or
* a user uploads a raw Grype JSON report that contains creds.

This is directly motivated by CVE-2025-65965 / GHSA-6gxw-85q2-q646. ([GitLab Advisory Database][1])

---

#### 001A – Adapter CLI usage (no unsafe flags)

**Invariant**

> The Grype adapter must *never* use `--file` or `--output json=` — only `--output json` to stdout.

**Setup**

* Replace `grype` in PATH with a small wrapper script that:
  * records argv and env to a temp file, and
  * exits 0 after printing a minimal fake JSON vulnerability report to stdout.

**Steps**

1. Run a standard StellaOps Grype-based scan through the adapter on a dummy image/SBOM.
2. After completion, read the temp "spy" file written by the wrapper.

**Assertions**

* The recorded arguments **do not** contain:
  * `--file`
  * `--output json=`
* There is exactly one `--output` argument and its value is `json`.
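Against the spy file written by the wrapper, the argv check itself can be sketched as follows. The helper is hypothetical; the flag spellings it rejects come straight from the invariant above, not from an exhaustive survey of Grype's CLI.

```python
from typing import List

def violates_grype_output_policy(argv: List[str]) -> bool:
    """Return True if the recorded argv breaks the 001A invariant.

    Flags the file-writing spellings (--file, --output json=...) and
    requires exactly one plain `--output json`.
    """
    for i, arg in enumerate(argv):
        if arg == "--file" or arg.startswith("--file="):
            return True
        if arg.startswith("--output=json="):
            return True
        if arg == "--output" and i + 1 < len(argv) and argv[i + 1].startswith("json="):
            return True
    # Require exactly one --output whose value is plain "json".
    outputs = [argv[i + 1] for i, a in enumerate(argv[:-1]) if a == "--output"]
    return outputs != ["json"]

assert not violates_grype_output_policy(["grype", "alpine:3.19", "--output", "json"])
assert violates_grype_output_policy(["grype", "alpine:3.19", "--output", "json=/tmp/r.json"])
assert violates_grype_output_policy(["grype", "alpine:3.19", "--file", "/tmp/r.json"])
```

In the acceptance test you would parse the wrapper's recorded argv and assert `not violates_grype_output_policy(argv)`.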
> This test is purely about **command construction**; it never calls the real Grype binary.

---

#### 001B – Ingestion of JSON with embedded creds

**Invariant**

> If a Grype JSON report contains credentials, StellaOps must either reject it or scrub the credentials before storage/exposure.

**Fixture**

* `fixtures/grype/credential-leak.json`, modeled roughly on the CVE report:
  * Include a fake Docker config snippet Grype might accidentally embed, e.g.:

    ```json
    {
      "config": {
        "auths": {
          "registry.example.com": {
            "username": "stella_USER_123",
            "password": "stella_PASS_456"
          }
        }
      }
    }
    ```

  * Plus one small vulnerability record so the ingestion pipeline runs normally.

**Steps**

1. Use a test helper to ingest the fixture as if it were Grype output:
   * e.g. `stellaIngestGrypeReport("credential-leak.json")`.
2. Fetch the stored scan result via:
   * direct DB access, or
   * the StellaOps internal API.

**Assertions**

* Nowhere in:
  * stored raw report blobs,
  * normalized vulnerability records,
  * metadata tables, or
  * logs captured by the test runner
  should the substrings `stella_USER_123` or `stella_PASS_456` appear.
* The ingestion should:
  * either succeed and **omit/sanitize** the auth section,
  * or fail with a **clear, typed error** like `ReportContainsSensitiveSecrets`.

**Implementation guidelines**

* Implement a **secret scrubber** that runs on:
  * any external JSON you store, and
  * any logs emitted when a scanner fails verbosely.
* Add a generic helper assertion to your test framework:
  * `assertNoSecrets(ScanResult, ["stella_USER_123", "stella_PASS_456"])`.

---

### STELLA-ACC-002 — Syft SBOM manifest digest parity (native vs container)

**Goal**

Ensure StellaOps produces the same **artifact identity** (image digest / manifest digest / tags) whether SBOMs are generated:

* by Syft installed natively on the host, or
* by Syft run as a container image.

This is about keeping VEX, rescan, and historical analysis aligned even if environments differ.

---

#### Setup

1. Build a small deterministic test image:

   ```bash
   docker build -t stella/parity-image:1.0 tests/fixtures/images/parity-image
   ```

2. Make both Syft variants available:
   * Native binary: `syft` on PATH.
   * Container image: `anchore/syft:` already pulled into your local registry/cache.
3. Decide where you store canonical SBOMs in the pipeline, e.g. under `Scan.artifactIdentity`.

---

#### Steps

1. **Generate an SBOM with native Syft**

   ```bash
   syft packages --scope squashed stella/parity-image:1.0 -o json > sbom-native.json
   ```

2. **Generate an SBOM with containerized Syft**

   ```bash
   docker run --rm \
     -v /var/run/docker.sock:/var/run/docker.sock \
     anchore/syft: \
     packages --scope squashed stella/parity-image:1.0 -o json \
     > sbom-container.json
   ```

3. Import each SBOM into StellaOps as if they were separate scans:
   * `scanNative = stellaIngestSBOM("sbom-native.json")`
   * `scanContainer = stellaIngestSBOM("sbom-container.json")`
4. Extract the normalized artifact identity:
   * `scanNative.artifactId`
   * `scanNative.imageDigest`
   * `scanNative.manifestDigest`
   * the same fields on `scanContainer`.

---

#### Assertions

* `scanNative.artifactId == scanContainer.artifactId`
* `scanNative.imageDigest == scanContainer.imageDigest`
* If you track the manifest digest separately: `scanNative.manifestDigest == scanContainer.manifestDigest`
* Optional: for extra paranoia, assert that the **set of package coordinates** is identical (ignoring ordering).

**Implementation guidelines**

* Don't trust SBOM metadata blindly; if Syft metadata differs, compute a **canonical image ID** using:
  * container runtime inspect, or
  * an independent digest calculator.
* Store artifact IDs in a **single, normalized format** across all scanners and SBOM sources.

---

### STELLA-ACC-003 — Trivy air-gapped DB schema / skip behavior

**Goal**

In offline mode, Trivy DB schema mismatches must produce a clear, typed failure in StellaOps — never a silent "0 vulns".
Trivy shows specific messages for an old DB schema when combined with `--skip-update`/`--skip-db-update`. ([GitHub][3])

---

#### Fixtures

1. `tests/fixtures/trivy/db-old/`
   * A Trivy DB with `metadata.json` `Version: 1` where your pinned Trivy version expects `Version: 2` or greater.
   * You can:
     * download an old offline DB archive, or
     * copy a modern DB and manually set `"Version": 1` in `metadata.json` for test purposes.
2. `tests/fixtures/trivy/db-new/`
   * A DB snapshot matching the current Trivy version (pass case).

---

#### 003A – Old schema + `--skip-db-update` gives a typed error

**Steps**

1. Configure the Trivy adapter for an offline scan:
   * `TRIVY_CACHE_DIR = tests/fixtures/trivy/db-old`
   * Add `--skip-db-update` (or `--skip-update`, depending on your adapter) to the CLI args.
2. Run a StellaOps scan using Trivy on a small test image.
3. Capture:
   * the Trivy exit code,
   * stdout/stderr, and
   * the StellaOps internal `ScanRun` / job status.

**Assertions**

* Trivy exits non-zero.
* Stderr contains both:
  * `"The local DB has an old schema version"` and
  * `"--skip-update cannot be specified with the old DB schema"` (or a localized equivalent). ([GitHub][3])
* StellaOps marks the scan as **failed with error type** `ScannerDatabaseOutOfDate` (or an equivalent internal enum).
* **No** vulnerability records are saved for this run.

---

#### 003B – New schema + offline flags succeeds

**Steps**

1. Same as above, but:
   * `TRIVY_CACHE_DIR = tests/fixtures/trivy/db-new`
2. Run the scan.

**Assertions**

* The scan succeeds.
* A non-empty set of vulnerabilities is stored (assuming the fixture image is intentionally vulnerable).
* Scan metadata records:
  * `offlineMode = true`
  * `dbSnapshotId` or a hash of `metadata.json`.

**Implementation guidelines**

* Parse known Trivy error strings into **structured error types** instead of treating all non-zero exit codes alike.
* Add a small helper in the adapter like:

  ```go
  func classifyTrivyError(stderr string, exitCode int) ScannerErrorType
  ```

  and unit test it with copies of real Trivy messages from docs/issues.

---

### STELLA-ACC-004 — grype-db / CVSS & VEX sorting determinism

**Goal**

Guarantee that StellaOps:

1. merges and prioritizes CVSS metrics (v2/v3/v3.1/v4.0) deterministically, and
2. produces a stable, reproducible vulnerability ordering after applying VEX.

This protects you from "randomly reshuffled" critical lists when DBs update or scanners add new metrics.

---

#### Fixture

`fixtures/grype/cvss-vex-sorting.json`:

* A single artifact with three vulnerabilities on the same package:
  1. `CVE-2020-AAAA`
     * CVSS v2: 7.5 (High)
     * CVSS v3.1: 5.0 (Medium)
  2. `CVE-2021-BBBB`
     * CVSS v3.1: 8.0 (High)
  3. `GHSA-xxxx-yyyy-zzzz`
     * Alias of `CVE-2021-BBBB` but with only a v2 score of 5.0 (Medium)
* A companion VEX document (CycloneDX VEX or CSAF) that:
  * marks `CVE-2020-AAAA` as `not_affected` for the specific version, and
  * leaves `CVE-2021-BBBB` as `affected`.

You don't need real IDs; you just need consistent internal expectations.

---

#### Steps

1. Ingest the Grype report and the VEX document together via StellaOps, producing `scanId`.
2. Fetch the normalized vulnerability list for that artifact:
   * e.g. `GET /artifacts/{id}/vulnerabilities?scan={scanId}` or a similar internal call.
3. Map it to a simplified view inside the test:

   ```json
   [
     { "id": "CVE-2020-AAAA", "effectiveSeverity": "...", "status": "..." },
     { "id": "CVE-2021-BBBB", "effectiveSeverity": "...", "status": "..." },
     ...
   ]
   ```

---

#### Assertions

* **Deduplication**:
  * `GHSA-xxxx-yyyy-zzzz` is **merged** into the same logical vuln as `CVE-2021-BBBB` (i.e. it appears once, with aliases including both IDs).
* **CVSS selection**:
  * For `CVE-2020-AAAA`, `effectiveSeverity` is based on v3.1 over v2 if present, else the highest base score.
  * For `CVE-2021-BBBB`, `effectiveSeverity` is "High" (from v3.1 8.0).
* **VEX impact**:
  * `CVE-2020-AAAA` is marked as `NOT_AFFECTED` (or your internal equivalent) and should:
    * either be excluded from the default list, or
    * appear but be clearly flagged as `not_affected`.
* **Sorting**:
  * The vulnerability list is ordered stably, e.g.:
    1. `CVE-2021-BBBB` (High, affected)
    2. `CVE-2020-AAAA` (Medium, not_affected) — if you show not_affected entries.
* **Reproducibility**:
  * Running the same test multiple times yields identical JSON for:
    * the vulnerability list (after sorting), and
    * the computed `effectiveSeverity` values.

**Implementation guidelines**

* Implement explicit merge logic for per-vuln metrics:

  ```text
  1. Prefer CVSS v3.1 > v3.0 > v2.0 when computing effectiveSeverity.
  2. If multiple metrics of the same version exist, pick the highest baseScore.
  3. When multiple sources (NVD, GHSA, vendor) disagree on baseSeverity, define your precedence and test it.
  ```

* Keep VEX application deterministic:
  * apply the VEX status before sorting, and
  * optionally remove `not_affected` entries from the "default" list, but still store them.

---

### STELLA-ACC-005 — Snyk exploit-maturity → VEX evidence mapping

**Goal**

Map Snyk's `exploitMaturity` metadata onto a standardized StellaOps "exploit evidence" / VEX semantic, and make this mapping deterministic and testable.

Snyk's APIs and webhooks expose an `exploitMaturity` field (e.g. `no-known-exploit`, `proof-of-concept`, etc.).
([Postman][5])

---

#### Fixture

`fixtures/snyk/exploit-maturity.json`:

* Mimic a Snyk test/report response with at least four issues, one per value:

  ```json
  [
    { "id": "SNYK-JS-AAA-1", "issueType": "vuln", "severity": "high",
      "issueData": { "exploitMaturity": "no-known-exploit", "cvssScore": 7.5 } },
    { "id": "SNYK-JS-BBB-2", "issueData": { "exploitMaturity": "proof-of-concept", "cvssScore": 7.5 } },
    { "id": "SNYK-JS-CCC-3", "issueData": { "exploitMaturity": "mature", "cvssScore": 5.0 } },
    { "id": "SNYK-JS-DDD-4", "issueData": { "exploitMaturity": "no-data", "cvssScore": 9.0 } }
  ]
  ```

  (Names/IDs don't matter; the internal mapping does.)

---

#### Internal mapping (proposed)

Define a StellaOps enum, e.g. `ExploitEvidence`:

* `EXPLOIT_NONE` ← `no-known-exploit`
* `EXPLOIT_POC` ← `proof-of-concept`
* `EXPLOIT_WIDESPREAD` ← `mature`
* `EXPLOIT_UNKNOWN` ← `no-data` or missing

And define how this influences risk (e.g. via factors or direct overrides).

---

#### Steps

1. Ingest the Snyk fixture as a Snyk scan for a dummy project/image.
2. Fetch the normalized vulnerabilities from StellaOps for that scan. In the test, map each vuln to:

   ```json
   { "id": "SNYK-JS-AAA-1", "exploitEvidence": "...", "effectiveSeverity": "..." }
   ```

---

#### Assertions

* Mapping:
  * `SNYK-JS-AAA-1` → `exploitEvidence == EXPLOIT_NONE`
  * `SNYK-JS-BBB-2` → `exploitEvidence == EXPLOIT_POC`
  * `SNYK-JS-CCC-3` → `exploitEvidence == EXPLOIT_WIDESPREAD`
  * `SNYK-JS-DDD-4` → `exploitEvidence == EXPLOIT_UNKNOWN`
* Risk impact (an example; adjust to your policy):
  * For equal CVSS, rows with `EXPLOIT_WIDESPREAD` or `EXPLOIT_POC` must rank **above** `EXPLOIT_NONE` in any prioritized listing (e.g. the patch queue).
  * A high CVSS (9.0) with `EXPLOIT_UNKNOWN` must not be treated as lower risk than a lower CVSS with `EXPLOIT_WIDESPREAD` unless your policy explicitly says so.
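A sketch of the mapping plus a rank-based ordering follows. The enum names and rank values restate the proposal above; ranking evidence ahead of CVSS is one explicit policy choice, which is exactly the kind of rule the test should pin down. Nothing here is an existing StellaOps or Snyk API.

```python
from typing import Optional

# Hypothetical evidence ranks: higher means stronger exploit evidence.
EXPLOIT_RANK = {"EXPLOIT_WIDESPREAD": 3, "EXPLOIT_POC": 2, "EXPLOIT_UNKNOWN": 1, "EXPLOIT_NONE": 0}

SNYK_TO_EVIDENCE = {
    "no-known-exploit": "EXPLOIT_NONE",
    "proof-of-concept": "EXPLOIT_POC",
    "mature": "EXPLOIT_WIDESPREAD",
    "no-data": "EXPLOIT_UNKNOWN",
}

def map_snyk_exploit_maturity(value: Optional[str]) -> str:
    # Missing or unrecognized values fall back to UNKNOWN, never to NONE.
    return SNYK_TO_EVIDENCE.get(value or "", "EXPLOIT_UNKNOWN")

def sort_key(vuln: dict):
    # Evidence rank DESC, CVSS base score DESC, vulnerability id ASC.
    return (-EXPLOIT_RANK[vuln["exploitEvidence"]], -vuln["cvssScore"], vuln["id"])

vulns = [
    {"id": "SNYK-JS-AAA-1", "exploitEvidence": map_snyk_exploit_maturity("no-known-exploit"), "cvssScore": 7.5},
    {"id": "SNYK-JS-CCC-3", "exploitEvidence": map_snyk_exploit_maturity("mature"), "cvssScore": 5.0},
    {"id": "SNYK-JS-DDD-4", "exploitEvidence": map_snyk_exploit_maturity(None), "cvssScore": 9.0},
]
ordered = [v["id"] for v in sorted(vulns, key=sort_key)]
assert ordered == ["SNYK-JS-CCC-3", "SNYK-JS-DDD-4", "SNYK-JS-AAA-1"]
```

Note that under this evidence-first policy the `mature` 5.0 finding outranks the 9.0 finding with unknown evidence; if that is not what you want, the sort key is the single place to change.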
Your sorting rule might look like:

```text
ORDER BY
  exploitEvidenceRank DESC,  -- WIDESPREAD > POC > UNKNOWN > NONE
  cvssBaseScore DESC,
  vulnerabilityId ASC
```

and the test should assert on the resulting order.

* Any Snyk issue missing `exploitMaturity` is explicitly treated as `EXPLOIT_UNKNOWN`, not silently defaulted to `EXPLOIT_NONE`.

**Implementation guidelines**

* Centralize the mapping in a single function and unit test it:

  ```ts
  function mapSnykExploitMaturity(value: string | null): ExploitEvidence
  ```

* If you emit VEX (CycloneDX/CSAF) from StellaOps, propagate the internal `ExploitEvidence` into the appropriate field (e.g. as part of a justification or additional evidence object) and add a small test that round-trips this.

---

## 3. How to use this as an implementor checklist

When you add or modify StellaOps integrations, treat this as a living checklist:

1. **Touching Grype?**
   * Ensure `STELLA-ACC-001` still passes.
2. **Touching SBOM ingestion / artifact identity?**
   * Run `STELLA-ACC-002`.
3. **Touching the Trivy adapter or offline mode?**
   * Run `STELLA-ACC-003`.
4. **Changing vulnerability normalization / severity logic?**
   * Run `STELLA-ACC-004` and `STELLA-ACC-005`.
5. **Adding a new scanner?**
   * Clone these patterns:
     * one **secret-leak** test,
     * one **offline / DB drift** test (if relevant),
     * one **identity parity** or **determinism** test, and
     * one **metadata-mapping** test (like exploit maturity).

As a next step, I can help you translate these into a concrete test skeleton (e.g. Gherkin scenarios or Jest/Go test functions) for the language you're using in StellaOps.

[1]: https://advisories.gitlab.com/pkg/golang/github.com/anchore/grype/CVE-2025-65965/?utm_source=chatgpt.com "Grype has a credential disclosure vulnerability in its JSON ..."
[2]: https://trivy.dev/docs/latest/references/troubleshooting/?utm_source=chatgpt.com "Troubleshooting"
[3]: https://github.com/aquasecurity/trivy-db/issues/186?utm_source=chatgpt.com "trivy-db latest Version has an old schema · Issue #186"
[4]: https://aquasecurity.github.io/trivy/v0.37/docs/advanced/air-gap/?utm_source=chatgpt.com "Air-Gapped Environment"
[5]: https://www.postman.com/api-evangelist/snyk/collection/mppgu5u/snyk-api?utm_source=chatgpt.com "Snyk API | Get Started"