Add inline DSSE provenance documentation and Mongo schema

- Introduced a new document outlining the inline DSSE provenance for SBOM, VEX, scan, and derived events.
- Defined the Mongo schema for event patches, including key fields for provenance and trust verification.
- Documented the write path for ingesting provenance metadata and backfilling historical events.
- Created CI/CD snippets for uploading DSSE attestations and generating provenance metadata.
- Established Mongo indexes for efficient provenance queries and provided query recipes for various use cases.
- Outlined policy gates for managing VEX decisions based on provenance verification.
- Included UI nudges for displaying provenance information and implementation tasks for future enhancements.

---

Implement reachability lattice and scoring model

- Developed a comprehensive document detailing the reachability lattice and scoring model.
- Defined core types for reachability states, evidence, and mitigations with corresponding C# models.
- Established a scoring policy with base score contributions from various evidence classes.
- Mapped reachability states to VEX gates and provided a clear overview of evidence sources.
- Documented the event graph schema for persisting reachability data in MongoDB.
- Outlined the integration of runtime probes for evidence collection and defined a roadmap for future tasks.

---

Introduce uncertainty states and entropy scoring

- Created a draft document for tracking uncertainty states and their impact on risk scoring.
- Defined core uncertainty states with associated entropy values and evidence requirements.
- Established a schema for storing uncertainty states alongside findings.
- Documented the risk score calculation incorporating uncertainty and its effect on final risk assessments.
- Provided policy guidelines for handling uncertainty in decision-making processes.
- Outlined UI guidelines for displaying uncertainty information and suggested remediation actions.

---

Add Ruby package inventory management

- Implemented Ruby package inventory management with corresponding data models and storage mechanisms.
- Created C# records for Ruby package inventory, artifacts, provenance, and runtime details.
- Developed a repository for managing Ruby package inventory documents in MongoDB.
- Implemented a service for storing and retrieving Ruby package inventories.
- Added unit tests for the Ruby package inventory store to ensure functionality and data integrity.
This commit is contained in:
master
2025-11-13 00:20:33 +02:00
parent 86be324fc0
commit 7040984215
41 changed files with 1955 additions and 76 deletions

View File

@@ -673,7 +673,7 @@ See `docs/dev/32_AUTH_CLIENT_GUIDE.md` for recommended profiles (online vs. air-
| `stellaops-cli scan run` | Execute scanner container against a directory (auto-upload) | `--target <directory>` (required)<br>`--runner <docker\|dotnet\|self>` (default from config)<br>`--entry <image-or-entrypoint>`<br>`[scanner-args...]` | Runs the scanner, writes results into `ResultsDirectory`, emits a structured `scan-run-*.json` metadata file, and automatically uploads the artefact when the exit code is `0`. |
| `stellaops-cli scan upload` | Re-upload existing scan artefact | `--file <path>` | Useful for retries when automatic upload fails or when operating offline. |
| `stellaops-cli ruby inspect` | Offline Ruby workspace inspection (Gemfile / lock + runtime signals) | `--root <directory>` (default current directory)<br>`--format <table\|json>` (default `table`) | Runs the bundled `RubyLanguageAnalyzer`, renders Observation summary (bundler/runtime/capabilities) plus Package/Version/Group/Source/Lockfile/Runtime columns, or emits JSON `{ packages: [...], observation: {...} }`. Exit codes: `0` success, `64` invalid format, `70` unexpected failure, `71` missing directory. |
| `stellaops-cli ruby resolve` | Fetch Ruby package inventory for a completed scan | `--image <registry-ref>` *or* `--scan-id <id>` (one required)<br>`--format <table\|json>` (default `table`) | Calls `GetRubyPackagesAsync` to download `ruby_packages.json`, groups entries by bundle/platform, and shows runtime entrypoints/usage. Table output mirrors `inspect`; JSON returns `{ scanId, groups: [...] }`. Exit codes: `0` success, `64` invalid args, `70` backend failure. |
| `stellaops-cli ruby resolve` | Fetch Ruby package inventory for a completed scan | `--image <registry-ref>` *or* `--scan-id <id>` (one required)<br>`--format <table\|json>` (default `table`) | Calls `GetRubyPackagesAsync` (`GET /api/scans/{scanId}/ruby-packages`) to download the canonical `RubyPackageInventory`. Table output mirrors `inspect` with groups/platform/runtime usage; JSON now returns `{ scanId, imageDigest, generatedAt, groups: [...] }`. Exit codes: `0` success, `64` invalid args, `70` backend failure, `0` with warning when inventory hasnt been persisted yet. |
| `stellaops-cli db fetch` | Trigger connector jobs | `--source <id>` (e.g. `redhat`, `osv`)<br>`--stage <fetch\|parse\|map>` (default `fetch`)<br>`--mode <resume|init|cursor>` | Translates to `POST /jobs/source:{source}:{stage}` with `trigger=cli` |
| `stellaops-cli db merge` | Run canonical merge reconcile | — | Calls `POST /jobs/merge:reconcile`; exit code `0` on acceptance, `1` on failures/conflicts |
| `stellaops-cli db export` | Kick JSON / Trivy exports | `--format <json\|trivy-db>` (default `json`)<br>`--delta`<br>`--publish-full/--publish-delta`<br>`--bundle-full/--bundle-delta` | Sets `{ delta = true }` parameter when requested and can override ORAS/bundle toggles per run |
@@ -684,7 +684,7 @@ See `docs/dev/32_AUTH_CLIENT_GUIDE.md` for recommended profiles (online vs. air-
### Ruby dependency verbs (`stellaops-cli ruby …`)
`ruby inspect` runs the same deterministic `RubyLanguageAnalyzer` bundled with Scanner.Worker against the local working tree—no backend calls—so operators can sanity-check Gemfile / Gemfile.lock pairs before shipping. The command now renders an observation banner (bundler version, package/runtime counts, capability flags, scheduler names) before the package table so air-gapped users can prove what evidence was collected. `ruby resolve` downloads the `ruby_packages.json` artifact that Scanner creates for each scan (via `GetRubyPackagesAsync`) and reshapes it for operators who need to reason about groups/platforms/runtime usage after the fact.
`ruby inspect` runs the same deterministic `RubyLanguageAnalyzer` bundled with Scanner.Worker against the local working tree—no backend calls—so operators can sanity-check Gemfile / Gemfile.lock pairs before shipping. The command now renders an observation banner (bundler version, package/runtime counts, capability flags, scheduler names) before the package table so air-gapped users can prove what evidence was collected. `ruby resolve` reuses the persisted `RubyPackageInventory` (stored under Mongo `ruby.packages` and exposed via `GET /api/scans/{scanId}/ruby-packages`) so operators can reason about groups/platforms/runtime usage after Scanner or Offline Kits finish processing; the CLI surfaces `scanId`, `imageDigest`, and `generatedAt` metadata in JSON mode for downstream scripting.
**`ruby inspect` flags**
@@ -731,7 +731,7 @@ Successful runs exit `0`; invalid formats raise **64**, unexpected failures retu
| ---- | ------- | ----------- |
| `--image <registry/ref>` | — | Scanner artifact identifier (image digest/tag). Mutually exclusive with `--scan-id`; one is required. |
| `--scan-id <id>` | — | Explicit scan identifier returned by `scan run`. |
| `--format <table\|json>` | `table` | `json` writes `{ "scanId": "…", "groups": [{ "group": "default", "platform": "-", "packages": [...] }] }`. |
| `--format <table\|json>` | `table` | `json` writes the full `{ "scanId": "…", "imageDigest": "…", "generatedAt": "…", "groups": [{ "group": "default", "platform": "-", "packages": [...] }] }` payload for downstream automation. |
| `--verbose` / `-v` | `false` | Enables HTTP + resolver logging. |
Errors caused by missing identifiers return **64**; transient backend errors surface as **70** (with full context in logs). Table output groups packages by Gem/Bundle group + platform and shows runtime entrypoints or `[grey]-[/]` when unused. JSON payloads stay stable for downstream automation:
@@ -739,6 +739,8 @@ Errors caused by missing identifiers return **64**; transient backend errors sur
```json
{
"scanId": "scan-ruby",
"imageDigest": "sha256:4f7d...",
"generatedAt": "2025-11-12T16:05:00Z",
"groups": [
{
"group": "default",