Add inline DSSE provenance documentation and Mongo schema

- Introduced a new document outlining the inline DSSE provenance for SBOM, VEX, scan, and derived events. - Defined the Mongo schema for event patches, including key fields for provenance and trust verification. - Documented the write path for ingesting provenance metadata and backfilling historical events. - Created CI/CD snippets for uploading DSSE attestations and generating provenance metadata. - Established Mongo indexes for efficient provenance queries and provided query recipes for various use cases. - Outlined policy gates for managing VEX decisions based on provenance verification. - Included UI nudges for displaying provenance information and implementation tasks for future enhancements. --- Implement reachability lattice and scoring model - Developed a comprehensive document detailing the reachability lattice and scoring model. - Defined core types for reachability states, evidence, and mitigations with corresponding C# models. - Established a scoring policy with base score contributions from various evidence classes. - Mapped reachability states to VEX gates and provided a clear overview of evidence sources. - Documented the event graph schema for persisting reachability data in MongoDB. - Outlined the integration of runtime probes for evidence collection and defined a roadmap for future tasks. --- Introduce uncertainty states and entropy scoring - Created a draft document for tracking uncertainty states and their impact on risk scoring. - Defined core uncertainty states with associated entropy values and evidence requirements. - Established a schema for storing uncertainty states alongside findings. - Documented the risk score calculation incorporating uncertainty and its effect on final risk assessments. - Provided policy guidelines for handling uncertainty in decision-making processes. - Outlined UI guidelines for displaying uncertainty information and suggested remediation actions. --- Add Ruby package inventory management - Implemented Ruby package inventory management with corresponding data models and storage mechanisms. - Created C# records for Ruby package inventory, artifacts, provenance, and runtime details. - Developed a repository for managing Ruby package inventory documents in MongoDB. - Implemented a service for storing and retrieving Ruby package inventories. - Added unit tests for the Ruby package inventory store to ensure functionality and data integrity.
2025-11-13 00:20:33 +02:00
parent 86be324fc0
commit 7040984215
41 changed files with 1955 additions and 76 deletions
--- a/docs/09_API_CLI_REFERENCE.md
+++ b/docs/09_API_CLI_REFERENCE.md
@@ -673,7 +673,7 @@ See `docs/dev/32_AUTH_CLIENT_GUIDE.md` for recommended profiles (online vs. air-
 | `stellaops-cli scan run` | Execute scanner container against a directory (auto-upload) | `--target <directory>` (required)<br>`--runner <docker\|dotnet\|self>` (default from config)<br>`--entry <image-or-entrypoint>`<br>`[scanner-args...]` | Runs the scanner, writes results into `ResultsDirectory`, emits a structured `scan-run-*.json` metadata file, and automatically uploads the artefact when the exit code is `0`. |
 | `stellaops-cli scan upload` | Re-upload existing scan artefact | `--file <path>` | Useful for retries when automatic upload fails or when operating offline. |
 | `stellaops-cli ruby inspect` | Offline Ruby workspace inspection (Gemfile / lock + runtime signals) | `--root <directory>` (default current directory)<br>`--format <table\|json>` (default `table`) | Runs the bundled `RubyLanguageAnalyzer`, renders Observation summary (bundler/runtime/capabilities) plus Package/Version/Group/Source/Lockfile/Runtime columns, or emits JSON `{ packages: [...], observation: {...} }`. Exit codes: `0` success, `64` invalid format, `70` unexpected failure, `71` missing directory. |
-| `stellaops-cli ruby resolve` | Fetch Ruby package inventory for a completed scan | `--image <registry-ref>` *or* `--scan-id <id>` (one required)<br>`--format <table\|json>` (default `table`) | Calls `GetRubyPackagesAsync` to download `ruby_packages.json`, groups entries by bundle/platform, and shows runtime entrypoints/usage. Table output mirrors `inspect`; JSON returns `{ scanId, groups: [...] }`. Exit codes: `0` success, `64` invalid args, `70` backend failure. |
+| `stellaops-cli ruby resolve` | Fetch Ruby package inventory for a completed scan | `--image <registry-ref>` *or* `--scan-id <id>` (one required)<br>`--format <table\|json>` (default `table`) | Calls `GetRubyPackagesAsync` (`GET /api/scans/{scanId}/ruby-packages`) to download the canonical `RubyPackageInventory`. Table output mirrors `inspect` with groups/platform/runtime usage; JSON now returns `{ scanId, imageDigest, generatedAt, groups: [...] }`. Exit codes: `0` success, `64` invalid args, `70` backend failure, `0` with warning when inventory hasn’t been persisted yet. |
 | `stellaops-cli db fetch` | Trigger connector jobs | `--source <id>` (e.g. `redhat`, `osv`)<br>`--stage <fetch\|parse\|map>` (default `fetch`)<br>`--mode <resume|init|cursor>` | Translates to `POST /jobs/source:{source}:{stage}` with `trigger=cli` |
 | `stellaops-cli db merge` | Run canonical merge reconcile | — | Calls `POST /jobs/merge:reconcile`; exit code `0` on acceptance, `1` on failures/conflicts |
 | `stellaops-cli db export` | Kick JSON / Trivy exports | `--format <json\|trivy-db>` (default `json`)<br>`--delta`<br>`--publish-full/--publish-delta`<br>`--bundle-full/--bundle-delta` | Sets `{ delta = true }` parameter when requested and can override ORAS/bundle toggles per run |
@@ -684,7 +684,7 @@ See `docs/dev/32_AUTH_CLIENT_GUIDE.md` for recommended profiles (online vs. air-

 ### Ruby dependency verbs (`stellaops-cli ruby …`)

-`ruby inspect` runs the same deterministic `RubyLanguageAnalyzer` bundled with Scanner.Worker against the local working tree—no backend calls—so operators can sanity-check Gemfile / Gemfile.lock pairs before shipping. The command now renders an observation banner (bundler version, package/runtime counts, capability flags, scheduler names) before the package table so air-gapped users can prove what evidence was collected. `ruby resolve` downloads the `ruby_packages.json` artifact that Scanner creates for each scan (via `GetRubyPackagesAsync`) and reshapes it for operators who need to reason about groups/platforms/runtime usage after the fact.
+`ruby inspect` runs the same deterministic `RubyLanguageAnalyzer` bundled with Scanner.Worker against the local working tree—no backend calls—so operators can sanity-check Gemfile / Gemfile.lock pairs before shipping. The command now renders an observation banner (bundler version, package/runtime counts, capability flags, scheduler names) before the package table so air-gapped users can prove what evidence was collected. `ruby resolve` reuses the persisted `RubyPackageInventory` (stored under Mongo `ruby.packages` and exposed via `GET /api/scans/{scanId}/ruby-packages`) so operators can reason about groups/platforms/runtime usage after Scanner or Offline Kits finish processing; the CLI surfaces `scanId`, `imageDigest`, and `generatedAt` metadata in JSON mode for downstream scripting.

 **`ruby inspect` flags**

@@ -731,7 +731,7 @@ Successful runs exit `0`; invalid formats raise **64**, unexpected failures retu
 | ---- | ------- | ----------- |
 | `--image <registry/ref>` | — | Scanner artifact identifier (image digest/tag). Mutually exclusive with `--scan-id`; one is required. |
 | `--scan-id <id>` | — | Explicit scan identifier returned by `scan run`. |
-| `--format <table\|json>` | `table` | `json` writes `{ "scanId": "…", "groups": [{ "group": "default", "platform": "-", "packages": [...] }] }`. |
+| `--format <table\|json>` | `table` | `json` writes the full `{ "scanId": "…", "imageDigest": "…", "generatedAt": "…", "groups": [{ "group": "default", "platform": "-", "packages": [...] }] }` payload for downstream automation. |
 | `--verbose` / `-v` | `false` | Enables HTTP + resolver logging. |

 Errors caused by missing identifiers return **64**; transient backend errors surface as **70** (with full context in logs). Table output groups packages by Gem/Bundle group + platform and shows runtime entrypoints or `[grey]-[/]` when unused. JSON payloads stay stable for downstream automation:
@@ -739,6 +739,8 @@ Errors caused by missing identifiers return **64**; transient backend errors sur
 ```json
 {
  "scanId": "scan-ruby",
+  "imageDigest": "sha256:4f7d...",
+  "generatedAt": "2025-11-12T16:05:00Z",
  "groups": [
    {
      "group": "default",