Refactor and enhance scanner worker functionality

- Cleaned up code formatting and organization across multiple files for improved readability.
- Introduced `OsScanAnalyzerDispatcher` to handle OS analyzer execution and plugin loading.
- Updated `ScanJobContext` to include an `Analysis` property for storing scan results.
- Enhanced `ScanJobProcessor` to utilize the new `OsScanAnalyzerDispatcher`.
- Improved logging and error handling in `ScanProgressReporter` for better traceability.
- Updated project dependencies and added references to new analyzer plugins.
- Revised task documentation to reflect current status and dependencies.
2025-10-19 18:34:15 +03:00
parent daa6a4ae8c
commit 7e2fa0a42a
59 changed files with 5563 additions and 2288 deletions


@@ -162,6 +162,10 @@ GET /catalog/artifacts/{id} → { meta }
GET /healthz | /readyz | /metrics
```
### Report events
When `scanner.events.enabled = true`, the WebService serialises the signed report (canonical JSON + DSSE envelope) with `NotifyCanonicalJsonSerializer` and publishes two Redis Stream entries (`scanner.report.ready`, `scanner.scan.completed`) to the configured stream (default `stella.events`). The stream fields carry the whole envelope plus lightweight headers (`kind`, `tenant`, `ts`) so Notify and UI timelines can consume the event bus without recomputing signatures. Publish timeouts and bounded stream length are controlled via `scanner:events:publishTimeoutSeconds` and `scanner:events:maxStreamLength`. If the queue driver is already Redis and no explicit events DSN is provided, the host reuses the queue connection and auto-enables event emission so deployments get live envelopes without extra wiring.
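As a rough illustration of the field layout described above, the sketch below builds one flat field map per event kind. The `payload` field name, the helper name `build_stream_entries`, and the sorted-key JSON encoding are assumptions made for the example; they are not taken from `NotifyCanonicalJsonSerializer` or the WebService code.

```python
import json
from datetime import datetime, timezone

# Event kinds named in the paragraph above.
REPORT_EVENT_KINDS = ("scanner.report.ready", "scanner.scan.completed")


def build_stream_entries(envelope: dict, tenant: str, now: datetime) -> list:
    """Return one flat field map per event kind (hypothetical layout)."""
    # Assumption: sorted keys and compact separators stand in for canonical JSON.
    payload = json.dumps(envelope, sort_keys=True, separators=(",", ":"))
    ts = now.astimezone(timezone.utc).isoformat()
    return [
        {"kind": kind, "tenant": tenant, "ts": ts, "payload": payload}
        for kind in REPORT_EVENT_KINDS
    ]


entries = build_stream_entries(
    {"payloadType": "application/vnd.example+json", "payload": "e30=", "signatures": []},
    tenant="tenant-a",
    now=datetime(2025, 10, 19, 15, 34, 15, tzinfo=timezone.utc),
)
for fields in entries:
    print(fields["kind"], fields["ts"])
```

With a client such as redis-py, each map could then be passed to `xadd` on the configured stream together with a `maxlen` bound, which is the role `scanner:events:maxStreamLength` plays in the text.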
---
## 5) Execution flow (Worker)
@@ -433,6 +437,26 @@ ResolveEntrypoint(ImageConfig cfg, RootFs fs):
return Unknown(reason)
```
### Appendix A.1 — EntryTrace Explainability
EntryTrace emits structured diagnostics and metrics so operators can quickly understand why resolution succeeded or degraded:
| Reason | Description | Typical Mitigation |
|--------|-------------|--------------------|
| `CommandNotFound` | A command referenced in the script cannot be located in the layered root filesystem or `PATH`. | Ensure binaries exist in the image or extend `PATH` hints. |
| `MissingFile` | `source`/`.`/`run-parts` targets are missing. | Bundle the script or guard the include. |
| `DynamicEnvironmentReference` | Path depends on `$VARS` that are unknown at scan time. | Provide defaults via scan metadata or accept partial usage. |
| `RecursionLimitReached` | Nested includes exceeded the analyzer depth limit (default 64). | Flatten indirection or increase the limit in options. |
| `RunPartsEmpty` | `run-parts` directory contained no executable entries. | Remove empty directories or ignore if intentional. |
| `JarNotFound` / `ModuleNotFound` | Java/Python targets missing, preventing interpreter tracing. | Ship the jar/module with the image or adjust the launcher. |
Diagnostics drive two metrics published by `EntryTraceMetrics`:
- `entrytrace_resolutions_total{outcome}` — resolution attempts segmented by outcome (`resolved`, `partiallyresolved`, `unresolved`).
- `entrytrace_unresolved_total{reason}` — diagnostic counts keyed by reason.
Structured logs include `entrytrace.path`, `entrytrace.command`, `entrytrace.reason`, and `entrytrace.depth`, all correlated with scan/job IDs. Timestamps are normalized to UTC (microsecond precision) to keep DSSE attestations and UI traces explainable.
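The relationship between diagnostics and the two counters can be sketched as follows. This is a minimal model, assuming every diagnostic reason increments the per-reason counter once; `record_resolution` is a hypothetical helper and not part of `EntryTraceMetrics`.

```python
from collections import Counter

# Stand-ins for the two metrics named above.
resolutions_total = Counter()  # entrytrace_resolutions_total{outcome}
unresolved_total = Counter()   # entrytrace_unresolved_total{reason}


def record_resolution(outcome: str, reasons: list) -> None:
    """Count one resolution attempt and each diagnostic reason attached to it."""
    resolutions_total[outcome] += 1
    for reason in reasons:
        unresolved_total[reason] += 1


record_resolution("resolved", [])
record_resolution("partiallyresolved", ["DynamicEnvironmentReference"])
record_resolution("unresolved", ["CommandNotFound", "MissingFile"])
record_resolution("unresolved", ["CommandNotFound"])

print(dict(resolutions_total))
print(dict(unresolved_total))
```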
### Appendix B — BOMIndex sidecar
```


@@ -1,116 +1,116 @@
# BuildX Generator Quickstart
This quickstart explains how to run the StellaOps **BuildX SBOM generator** offline, verify the CAS handshake, and emit OCI descriptors that downstream services can attest.
## 1. Prerequisites
- Docker 25+ with BuildKit enabled (`docker buildx` available).
- .NET 10 (preview) SDK matching the repository `global.json`.
- Optional: network access to a StellaOps Attestor endpoint (the quickstart uses a mock service).
## 2. Publish the plug-in binaries
The BuildX generator publishes as a .NET self-contained executable with its manifest under `plugins/scanner/buildx/`.
```bash
# From the repository root
DOTNET_CLI_HOME="${PWD}/.dotnet" \
dotnet publish src/StellaOps.Scanner.Sbomer.BuildXPlugin/StellaOps.Scanner.Sbomer.BuildXPlugin.csproj \
-c Release \
-o out/buildx
```
- `out/buildx/` now contains `StellaOps.Scanner.Sbomer.BuildXPlugin.dll` and the manifest `stellaops.sbom-indexer.manifest.json`.
- `plugins/scanner/buildx/StellaOps.Scanner.Sbomer.BuildXPlugin/` receives the same artefacts for release packaging.
## 3. Verify the CAS handshake
```bash
dotnet out/buildx/StellaOps.Scanner.Sbomer.BuildXPlugin.dll handshake \
--manifest out/buildx \
--cas out/cas
```
The command performs a deterministic probe write (`sha256`) into the provided CAS directory and prints the resolved path.
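To make the idea of a deterministic probe write concrete, here is a small sketch of a content-addressed write. The directory layout (`sha256/<first two hex chars>/<digest>`), the probe payload, and the function name `probe_write` are assumptions for illustration; the plug-in's actual CAS layout may differ.

```python
import hashlib
import tempfile
from pathlib import Path


def probe_write(cas_root: Path, payload: bytes = b"stellaops-cas-probe") -> Path:
    """Write payload under a path derived from its SHA-256 digest and return that path."""
    digest = hashlib.sha256(payload).hexdigest()
    # Hypothetical layout: fan out by the first two hex characters of the digest.
    target = cas_root / "sha256" / digest[:2] / digest
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    return target


with tempfile.TemporaryDirectory() as tmp:
    first = probe_write(Path(tmp))
    second = probe_write(Path(tmp))
    same_path = first == second  # identical content resolves to the identical path
    content_ok = first.read_bytes() == b"stellaops-cas-probe"
    digest_name = first.name
    print(first.relative_to(tmp))
```

Because the path depends only on the content, repeating the probe is idempotent, which is what makes it usable as a handshake check.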
## 4. Emit a descriptor + provenance placeholder
1. Build or identify the image you want to describe and capture its digest:
```bash
docker buildx build --load -t stellaops/buildx-demo:ci samples/ci/buildx-demo
DIGEST=$(docker image inspect stellaops/buildx-demo:ci --format '{{index .RepoDigests 0}}')
```
2. Generate a CycloneDX SBOM for the built image (any tool works; here we use `docker sbom`):
```bash
docker sbom stellaops/buildx-demo:ci --format cyclonedx-json > out/buildx-sbom.cdx.json
```
3. Invoke the `descriptor` command, pointing at the SBOM file and optional metadata:
```bash
dotnet out/buildx/StellaOps.Scanner.Sbomer.BuildXPlugin.dll descriptor \
--manifest out/buildx \
--image "$DIGEST" \
--sbom out/buildx-sbom.cdx.json \
--sbom-name buildx-sbom.cdx.json \
--artifact-type application/vnd.stellaops.sbom.layer+json \
--sbom-format cyclonedx-json \
--sbom-kind inventory \
--repository git.stella-ops.org/stellaops/buildx-demo \
--build-ref "$(git rev-parse HEAD)" \
> out/buildx-descriptor.json
```
The output JSON captures:
- OCI artifact descriptor including size, digest, and annotations (`org.stellaops.*`).
- Provenance placeholder (`expectedDsseSha256`, `nonce`, `attestorUri` when provided). `nonce` is derived deterministically from the image + SBOM metadata so repeated runs produce identical placeholders for identical inputs.
- Generator metadata and deterministic timestamps.
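One way a nonce can be derived deterministically from image and SBOM metadata is to hash a canonical encoding of those inputs. The sketch below assumes the inputs are the image digest, SBOM digest, SBOM name, and build ref, and truncates the hash to 32 hex characters; the input set, the encoding, and the name `derive_nonce` are all illustrative assumptions, not the plug-in's algorithm.

```python
import hashlib
import json


def derive_nonce(image_digest: str, sbom_digest: str, sbom_name: str, build_ref: str) -> str:
    """Hash a canonical encoding of the inputs so equal inputs give equal nonces."""
    material = json.dumps(
        {
            "image": image_digest,
            "sbomDigest": sbom_digest,
            "sbomName": sbom_name,
            "buildRef": build_ref,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    # Truncation length is arbitrary for this sketch.
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:32]


nonce_a = derive_nonce("sha256:" + "a" * 64, "sha256:" + "b" * 64, "buildx-sbom.cdx.json", "abc123")
nonce_b = derive_nonce("sha256:" + "a" * 64, "sha256:" + "b" * 64, "buildx-sbom.cdx.json", "abc123")
nonce_c = derive_nonce("sha256:" + "a" * 64, "sha256:" + "c" * 64, "buildx-sbom.cdx.json", "abc123")
print(nonce_a == nonce_b, nonce_a == nonce_c)
```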
## 5. (Optional) Send the placeholder to an Attestor
The plug-in can POST the descriptor metadata to an Attestor endpoint, returning once it receives an HTTP 202.
```bash
python3 - <<'PY' &
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Drain the request body, then acknowledge with 202 Accepted.
        _ = self.rfile.read(int(self.headers.get('Content-Length', 0)))
        self.send_response(202); self.end_headers(); self.wfile.write(b'accepted')
    def log_message(self, fmt, *args):
        # Silence per-request logging.
        return

server = HTTPServer(('127.0.0.1', 8085), Handler)
try:
    server.serve_forever()
except KeyboardInterrupt:
    pass
finally:
    server.server_close()
PY
MOCK_PID=$!
dotnet out/buildx/StellaOps.Scanner.Sbomer.BuildXPlugin.dll descriptor \
--manifest out/buildx \
--image "$DIGEST" \
--sbom out/buildx-sbom.cdx.json \
--attestor http://127.0.0.1:8085/provenance \
--attestor-token "$STELLAOPS_ATTESTOR_TOKEN" \
> out/buildx-descriptor.json
kill "$MOCK_PID"
```
Set `STELLAOPS_ATTESTOR_TOKEN` (or pass `--attestor-token`) when the Attestor requires bearer authentication. Use `--attestor-insecure` for lab environments with self-signed certificates.
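For readers who want to see what a bearer-authenticated POST of this kind looks like on the wire, the sketch below builds (but does not send) such a request with the Python standard library. The JSON body shape and the helper name `build_attestor_request` are assumptions; the plug-in's actual request format is not documented here.

```python
import json
import urllib.request


def build_attestor_request(url: str, descriptor: dict, token: str = None) -> urllib.request.Request:
    """Build a JSON POST, adding a bearer Authorization header only when a token is given."""
    body = json.dumps(descriptor, sort_keys=True).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


req = build_attestor_request(
    "http://127.0.0.1:8085/provenance",
    {"expectedDsseSha256": "0" * 64},  # hypothetical payload field
    token="example-token",
)
print(req.get_method(), req.full_url)
# Sending it would be: urllib.request.urlopen(req), after which the caller
# could check that the response status is 202.
```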
## 6. CI workflow example
A reusable GitHub Actions workflow is provided under `samples/ci/buildx-demo/github-actions-buildx-demo.yml`. It publishes the plug-in, runs the handshake, builds the demo image, emits a descriptor, and uploads both the descriptor and the mock-Attestor request as artefacts.
Add the workflow to your repository (or call it via `workflow_call`) and adjust the SBOM path + Attestor URL as needed. The workflow also re-runs the `descriptor` command and diffs the results (ignoring the `generatedAt` timestamp) so you catch regressions that would break deterministic CI use.
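A determinism check of this kind can be approximated locally by comparing two descriptor files after dropping the volatile field. The sketch below is an assumption about how such a comparison might look, not the workflow's actual script; it removes `generatedAt` at any nesting depth and compares the remaining structures.

```python
import json


def strip_volatile(node, volatile=("generatedAt",)):
    """Return a copy of the parsed JSON with volatile keys removed at every level."""
    if isinstance(node, dict):
        return {k: strip_volatile(v, volatile) for k, v in node.items() if k not in volatile}
    if isinstance(node, list):
        return [strip_volatile(v, volatile) for v in node]
    return node


def descriptors_match(first_json: str, second_json: str) -> bool:
    """True when two descriptor documents differ only in volatile fields."""
    return strip_volatile(json.loads(first_json)) == strip_volatile(json.loads(second_json))


# Hypothetical descriptor snippets for illustration only.
run_1 = '{"descriptor": {"digest": "sha256:abc", "size": 10}, "generatedAt": "2025-10-19T15:00:00Z"}'
run_2 = '{"descriptor": {"digest": "sha256:abc", "size": 10}, "generatedAt": "2025-10-19T15:05:00Z"}'
run_3 = '{"descriptor": {"digest": "sha256:def", "size": 10}, "generatedAt": "2025-10-19T15:05:00Z"}'

print(descriptors_match(run_1, run_2))
print(descriptors_match(run_1, run_3))
```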
---