devops folders consolidate

This commit is contained in:
master
2026-01-25 23:27:41 +02:00
parent 6e687b523a
commit a743bb9a1d
613 changed files with 8611 additions and 41846 deletions

@@ -2,8 +2,8 @@
> Aligned with Epic 6 – Vulnerability Explorer and Epic 10 – Export Center.
> **Scope.** Implementation-ready architecture for the **Scanner** subsystem: WebService, Workers, analyzers, SBOM assembly (inventory & usage), per-layer caching, three-way diffs, artifact catalog (RustFS default + PostgreSQL, S3-compatible fallback), attestation hand-off, and scale/security posture. This document is the contract between the scanning plane and everything else (Policy, Excititor, Concelier, UI, CLI).
> **Related:** `docs/modules/scanner/operations/ai-code-guard.md`
---
@@ -14,14 +14,14 @@
**Boundaries.**
* Scanner **does not** produce PASS/FAIL. The backend (Policy + Excititor + Concelier) decides presentation and verdicts.
* Scanner **does not** keep third-party SBOM warehouses. It may **bind** to existing attestations for exact hashes.
* Core analyzers are **deterministic** (no fuzzy identity). Optional heuristic plug-ins (e.g., patch-presence) run under explicit flags and never contaminate the core SBOM.
SBOM dependency reachability inference uses dependency graphs to reduce false positives and
apply reachability-aware severity adjustments. See `src/Scanner/docs/sbom-reachability-filtering.md`
for policy configuration and reporting expectations.
---
## 1) Solution & project layout
@@ -98,34 +98,27 @@ CLI usage: `stella scan --semantic <image>` enables semantic analysis in output.
- **Hybrid attestation**: emit **graph-level DSSE** for every `richgraph-v1` (mandatory) and optional **edge-bundle DSSE** (≤512 edges) for runtime/init-root/contested edges or third-party provenance. Publish graph DSSE digests to Rekor by default; edge-bundle Rekor publish is policy-driven. CAS layout: `cas://reachability/graphs/{blake3}` for graph body, `.../{blake3}.dsse` for envelope, and `cas://reachability/edges/{graph_hash}/{bundle_id}[.dsse]` for bundles. Deterministic ordering before hashing/signing is required.
- **Deterministic call-graph manifest**: capture analyzer versions, feed hashes, toolchain digests, and flags in a manifest stored alongside `richgraph-v1`; replaying with the same manifest MUST yield identical node/edge sets and hashes (see `docs/modules/reach-graph/guides/lead.md`).
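The requirement to order deterministically before hashing, and the CAS layout quoted above, can be illustrated with a minimal Python sketch. It is not the Scanner implementation: the `richgraph-v1` schema is not shown in this document, so the graph is reduced to hypothetical node and edge lists, and BLAKE2b from the standard library stands in for BLAKE3 purely to keep the example dependency-free.

```python
import hashlib
import json


def canonical_graph_bytes(nodes, edges):
    """Serialize a graph so that equal node/edge sets always give equal bytes."""
    body = {
        # Sorting (and de-duplicating) removes any dependence on discovery order.
        "nodes": sorted(set(nodes)),
        "edges": sorted({(src, dst) for src, dst in edges}),
    }
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def graph_digest(nodes, edges):
    # Assumption: BLAKE2b is only a stand-in here; the layout above names BLAKE3.
    return hashlib.blake2b(canonical_graph_bytes(nodes, edges), digest_size=32).hexdigest()


def cas_paths(digest, bundle_id=None):
    """Build the CAS locations described above for a graph and an optional edge bundle."""
    paths = {
        "graph": f"cas://reachability/graphs/{digest}",
        "graph_dsse": f"cas://reachability/graphs/{digest}.dsse",
    }
    if bundle_id is not None:
        bundle = f"cas://reachability/edges/{digest}/{bundle_id}"
        paths["edge_bundle"] = bundle
        paths["edge_bundle_dsse"] = bundle + ".dsse"
    return paths


if __name__ == "__main__":
    digest = graph_digest(["b", "a"], [("b", "c"), ("a", "b")])
    for name, path in cas_paths(digest, bundle_id="bundle-0").items():
        print(name, path)
```

Because both inputs are sorted before serialization, two analyzers that discover the same edges in a different order produce the same digest, which is the property the replay guarantee depends on.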
### 1.1 Queue backbone (Valkey / NATS)
### 1.1 Queue backbone (Valkey Streams)
`StellaOps.Scanner.Queue` exposes a transport-agnostic contract (`IScanQueue`/`IScanQueueLease`) used by the WebService producer and Worker consumers. Sprint 9 introduces two first-party transports:
`StellaOps.Scanner.Queue` exposes a transport-agnostic contract (`IScanQueue`/`IScanQueueLease`) used by the WebService producer and Worker consumers.
- **Valkey Streams** (default). Uses consumer groups, deterministic idempotency keys (`scanner:jobs:idemp:*`), and supports lease claim (`XCLAIM`), renewal, exponential-backoff retries, and a `scanner:jobs:dead` stream for exhausted attempts.
- **NATS JetStream**. Provisions the `SCANNER_JOBS` work-queue stream + durable consumer `scanner-workers`, publishes with `MsgId` for dedupe, applies backoff via `NAK` delays, and routes dead-lettered jobs to `SCANNER_JOBS_DEAD`.
**Valkey Streams** is the standard transport. Uses consumer groups, deterministic idempotency keys (`scanner:jobs:idemp:*`), and supports lease claim (`XCLAIM`), renewal, exponential-backoff retries, and a `scanner:jobs:dead` stream for exhausted attempts.
Metrics are emitted via `Meter` counters (`scanner_queue_enqueued_total`, `scanner_queue_retry_total`, `scanner_queue_deadletter_total`), and `ScannerQueueHealthCheck` pings the active backend (Valkey `PING`, NATS `PING`). Configuration is bound from `scanner.queue`:
Metrics are emitted via `Meter` counters (`scanner_queue_enqueued_total`, `scanner_queue_retry_total`, `scanner_queue_deadletter_total`), and `ScannerQueueHealthCheck` pings the Valkey backend. Configuration is bound from `scanner.queue`:
```yaml
scanner:
  queue:
    kind: valkey # or nats (valkey uses redis:// protocol)
    kind: valkey
    valkey:
      connectionString: "redis://queue:6379/0"
      connectionString: "valkey://valkey:6379/0"
      streamName: "scanner:jobs"
    nats:
      url: "nats://queue:4222"
      stream: "SCANNER_JOBS"
      subject: "scanner.jobs"
      durableConsumer: "scanner-workers"
      deadLetterSubject: "scanner.jobs.dead"
    maxDeliveryAttempts: 5
    retryInitialBackoff: 00:00:05
    retryMaxBackoff: 00:02:00
```
The DI extension (`AddScannerQueue`) wires the selected transport, so future additions (e.g., RabbitMQ) only implement the same contract and register.
The DI extension (`AddScannerQueue`) wires the transport.
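To make the retry settings concrete, here is a small Python sketch of how `maxDeliveryAttempts`, `retryInitialBackoff`, and `retryMaxBackoff` could combine into a delay schedule. The doubling factor and the function names are assumptions for illustration; the document only states that retries use exponential backoff and does not give the exact formula.

```python
from datetime import timedelta


def parse_hms(value):
    """Parse an 'HH:MM:SS' string such as '00:00:05' into a timedelta."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def retry_delays(initial, maximum, max_delivery_attempts):
    """Return the wait before each redelivery.

    Assumption: the first delivery counts as attempt 1, each retry doubles
    the previous delay, and the delay is capped at `maximum`.
    """
    delays = []
    for retry in range(max_delivery_attempts - 1):
        delays.append(min(initial * (2 ** retry), maximum))
    return delays


if __name__ == "__main__":
    schedule = retry_delays(parse_hms("00:00:05"), parse_hms("00:02:00"), 5)
    print([int(delay.total_seconds()) for delay in schedule])
```

Under these assumptions the values from the configuration above give waits of 5, 10, 20, and 40 seconds, after which the job would move to the `scanner:jobs:dead` stream.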
**Runtime form-factor:** two deployables
@@ -137,7 +130,7 @@ The DI extension (`AddScannerQueue`) wires the selected transport, so future add
## 2) External dependencies
* **OCI registry** with **Referrers API** (discover attached SBOMs/signatures).
* **RustFS** (default, offline-first) for SBOM artifacts; optional S3/MinIO compatibility retained for migration; **Object Lock** semantics emulated via retention headers; **ILM** for TTL.
* **RustFS** (default, offline-first) for SBOM artifacts; S3-compatible interface with **Object Lock** semantics emulated via retention headers; **ILM** for TTL.
* **PostgreSQL** for catalog, job state, diffs, ILM rules.
* **Queue** (Valkey Streams/NATS/RabbitMQ).
* **Authority** (on-prem OIDC) for **OpToks** (DPoP/mTLS).
@@ -206,9 +199,7 @@ attest/<artifactSha256>.dsse.json # DSSE bundle (cert chain + Rekor
RustFS exposes a deterministic HTTP API (`PUT|GET|DELETE /api/v1/buckets/{bucket}/objects/{key}`).
Scanner clients tag immutable uploads with `X-RustFS-Immutable: true` and, when retention applies,
`X-RustFS-Retain-Seconds: <ttlSeconds>`. Additional headers can be injected via
`scanner.artifactStore.headers` to support custom auth or proxy requirements. Legacy MinIO/S3
deployments remain supported by setting `scanner.artifactStore.driver = "s3"` during phased
migrations.
`scanner.artifactStore.headers` to support custom auth or proxy requirements. RustFS provides the standard S3-compatible interface for all artifact storage.
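As an illustration of the upload contract described above, the following Python sketch assembles the method, URL, and headers for an immutable upload without sending anything. The base URL, bucket name, object key, and the choice to leave `/` unescaped in the key are assumptions; only the route shape and the two `X-RustFS-*` headers come from this document.

```python
from urllib.parse import quote


def build_put_request(base_url, bucket, key, immutable=True,
                      retain_seconds=None, extra_headers=None):
    """Return (method, url, headers) for a RustFS object upload."""
    # Assumption: slashes in the key are kept as path separators.
    url = (
        f"{base_url.rstrip('/')}/api/v1/buckets/"
        f"{quote(bucket, safe='')}/objects/{quote(key)}"
    )
    # Headers from scanner.artifactStore.headers (custom auth, proxies) go first.
    headers = dict(extra_headers or {})
    if immutable:
        headers["X-RustFS-Immutable"] = "true"
    if retain_seconds is not None:
        headers["X-RustFS-Retain-Seconds"] = str(retain_seconds)
    return "PUT", url, headers


if __name__ == "__main__":
    method, url, headers = build_put_request(
        "http://rustfs.example.internal:8080",  # hypothetical endpoint
        "scanner-artifacts",                    # hypothetical bucket
        "attest/0123abcd.dsse.json",            # hypothetical key
        retain_seconds=86400,
        extra_headers={"Authorization": "Bearer <token>"},
    )
    print(method, url)
    print(headers)
```

The returned tuple can be handed to any HTTP client; keeping request construction separate from transport makes the header logic easy to unit-test.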
---
@@ -378,40 +369,40 @@ public sealed record BinaryFindingEvidence
The emitted `buildId` metadata is preserved in component hashes, diff payloads, and `/policy/runtime` responses so operators can pivot from SBOM entries → runtime events → `debug/.build-id/<aa>/<rest>.debug` within the Offline Kit or release bundle.
### 5.5.1 Service security analysis (Sprint 20260119_016)
When an SBOM path is provided, the worker runs the `service-security` stage to parse CycloneDX services and emit a deterministic report covering:
- Endpoint scheme hygiene (HTTP/WS/plaintext protocol detection).
- Authentication and trust-boundary enforcement.
- Sensitive data flow exposure and unencrypted transfers.
- Deprecated service versions and rate-limiting metadata gaps.
Inputs are passed via scan metadata (`sbom.path` or `sbomPath`, plus `sbom.format`). The report is attached as a surface observation payload (`service-security.report`) and keyed in the analysis store for downstream policy and report assembly. See `src/Scanner/docs/service-security.md` for the policy schema and output formats.
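The endpoint scheme hygiene check can be pictured with a short Python sketch over the CycloneDX `services` array. It is a simplified illustration, not the `service-security` stage itself: the set of plaintext schemes and the finding shape are assumptions, and the real policy schema lives in the referenced document.

```python
from urllib.parse import urlsplit

# Assumption: an illustrative list of schemes treated as plaintext.
PLAINTEXT_SCHEMES = {"http", "ws", "ftp", "telnet"}


def plaintext_endpoints(bom):
    """Return sorted (service name, endpoint) pairs that use a plaintext scheme."""
    findings = []
    for service in bom.get("services", []):
        for endpoint in service.get("endpoints", []):
            scheme = urlsplit(endpoint).scheme.lower()
            if scheme in PLAINTEXT_SCHEMES:
                findings.append((service.get("name", ""), endpoint))
    # Sorting keeps the report deterministic regardless of SBOM ordering.
    return sorted(findings)


def unauthenticated_services(bom):
    """Return sorted names of services that do not declare authenticated=true."""
    return sorted(
        service.get("name", "")
        for service in bom.get("services", [])
        if not service.get("authenticated", False)
    )


if __name__ == "__main__":
    bom = {
        "services": [
            {"name": "billing", "endpoints": ["https://billing.example/api"], "authenticated": True},
            {"name": "legacy", "endpoints": ["http://legacy.example/api", "ws://legacy.example/feed"]},
        ]
    }
    print(plaintext_endpoints(bom))
    print(unauthenticated_services(bom))
```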
### 5.5.2 CBOM crypto analysis (Sprint 20260119_017)
When an SBOM includes CycloneDX `cryptoProperties`, the worker runs the `crypto-analysis` stage to produce a crypto inventory and compliance findings for weak algorithms, short keys, deprecated protocol versions, certificate hygiene, and post-quantum readiness. The report is attached as a surface observation payload (`crypto-analysis.report`) and keyed in the analysis store for downstream evidence workflows. See `src/Scanner/docs/crypto-analysis.md` for the policy schema and inventory export formats.
### 5.5.3 AI/ML supply chain security (Sprint 20260119_018)
When an SBOM includes CycloneDX `modelCard` or SPDX AI profile data, the worker runs the `ai-ml-security` stage to evaluate model governance readiness. The report covers model card completeness, training data provenance, bias/fairness checks, safety risk assessment coverage, and provenance verification. The report is attached as a surface observation payload (`ai-ml-security.report`) and keyed in the analysis store for policy evaluation and audit trails. See `src/Scanner/docs/ai-ml-security.md` for policy schema, CLI toggles, and binary analysis conventions.
### 5.5.4 Build provenance verification (Sprint 20260119_019)
When an SBOM includes CycloneDX formulation or SPDX build profile data, the worker runs the `build-provenance` stage to verify provenance completeness, builder trust, source integrity, hermetic build requirements, and optional reproducibility checks. The report is attached as a surface observation payload (`build-provenance.report`) and keyed in the analysis store for policy enforcement and audit evidence. See `src/Scanner/docs/build-provenance.md` for policy schema, CLI toggles, and report formats.
### 5.5.5 SBOM dependency reachability (Sprint 20260119_022)
When configured, the worker runs the `reachability-analysis` stage to infer dependency reachability from SBOM graphs and optionally refine it with a `richgraph-v1` call graph. Advisory matches are filtered or severity-adjusted using `VulnerabilityReachabilityFilter`, with false-positive reduction metrics recorded for auditability. The stage attaches:
- `reachability.report` (JSON) for component and vulnerability reachability.
- `reachability.report.sarif` (SARIF 2.1.0) for toolchain export.
- `reachability.graph.dot` (GraphViz) for dependency visualization.
Configuration lives in `src/Scanner/docs/sbom-reachability-filtering.md`, including policy schema, metadata keys, and report outputs.
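The idea of inferring reachability from an SBOM dependency graph can be sketched in Python as a breadth-first walk over the CycloneDX `dependencies` array, starting at `metadata.component`. This is a minimal illustration under those assumptions, not the `VulnerabilityReachabilityFilter` logic, and it ignores call-graph refinement and severity adjustment entirely.

```python
from collections import deque


def reachable_components(bom):
    """Return the set of bom-refs reachable from the root component."""
    root = bom["metadata"]["component"]["bom-ref"]
    graph = {
        entry["ref"]: entry.get("dependsOn", [])
        for entry in bom.get("dependencies", [])
    }
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in graph.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def unreachable_components(bom):
    """Return sorted bom-refs listed as components but never reached from the root."""
    listed = {component["bom-ref"] for component in bom.get("components", [])}
    return sorted(listed - reachable_components(bom))


if __name__ == "__main__":
    bom = {
        "metadata": {"component": {"bom-ref": "app"}},
        "components": [{"bom-ref": "lib-a"}, {"bom-ref": "lib-b"}, {"bom-ref": "lib-orphan"}],
        "dependencies": [
            {"ref": "app", "dependsOn": ["lib-a"]},
            {"ref": "lib-a", "dependsOn": ["lib-b"]},
            {"ref": "lib-orphan", "dependsOn": []},
        ],
    }
    print(unreachable_components(bom))
```

An advisory that matches only an unreachable component is the kind of finding such a stage could filter or downgrade, subject to the policy described in the referenced configuration document.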
### 5.6 DSSE attestation (via Signer/Attestor)
* WebService constructs **predicate** with `image_digest`, `stellaops_version`, `license_id`, `policy_digest?` (when emitting **final reports**), timestamps.
* Calls **Signer** (requires **OpTok + PoE**); Signer verifies **entitlement + scanner image integrity** and returns **DSSE bundle**.