# StellaOps Scanner

Scanner analyses container images layer-by-layer, producing deterministic SBOM fragments, diffs, and signed reports.

## Latest updates (2025-12-12)

- Deterministic SBOM composition fixture published at `docs/modules/scanner/fixtures/deterministic-compose/` with DSSE, `_composition.json`, BOM, and hashes; doc `deterministic-sbom-compose.md` promoted to Ready v1.0 with offline verification steps.
- Node analyzer now ingests npm/yarn/pnpm lockfiles, emitting `DeclaredOnly` components with lock provenance. The CLI companion command `stella node lock-validate` runs the collector offline, surfaces declared-only or missing-lock packages, and emits telemetry via `stellaops.cli.node.lock_validate.count`. See `docs/modules/scanner/analyzers-node.md` and bench scenario `node_detection_gaps_fixture`.
- Python analyzer picks up `requirements*.txt`, `Pipfile.lock`, and `poetry.lock`, tagging installed distributions with lock provenance and generating declared-only components for policy. Use `stella python lock-validate` to run the same checks locally before images are built.
- Java analyzer now parses `gradle.lockfile`, `gradle/dependency-locks/**/*.lockfile`, and `pom.xml` dependencies via the new `JavaLockFileCollector`, merging lock metadata onto jar evidence and emitting declared-only components when jars are absent. The new CLI verb `stella java lock-validate` reuses that collector offline (table/JSON output) and records `stellaops.cli.java.lock_validate.count{outcome}` for observability.
- Worker/WebService now resolve cache roots and feature flags via `StellaOps.Scanner.Surface.Env`; misconfiguration warnings are documented in `docs/modules/scanner/design/surface-env.md` and surfaced through startup validation.
- Platform events rollout (2025-10-19) continues to publish `scanner.report.ready@1` and `scanner.scan.completed@1` envelopes with embedded DSSE payloads (see `docs/updates/2025-10-19-scanner-policy.md` and `docs/updates/2025-10-19-platform-events.md`). Service and consumer tests should round-trip the canonical samples under `docs/events/samples/`.
- OS/non-language analyzers: evidence is rootfs-relative, warnings are structured/capped, hashing is bounded, and Linux OS analyzers support surface-cache reuse. See `os-analyzers-evidence.md`.

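The lock-file behaviour described for the Node, Python, and Java analyzers can be illustrated with a small sketch. It is not the Scanner implementation (which lives in the `StellaOps.Scanner.Analyzers.*` libraries); the `Component` type, its field names, and `merge_lock_evidence` are hypothetical and only show the idea: installed packages gain lock provenance when a lockfile mentions them, and lockfile entries with no installed counterpart become declared-only components.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str]  # (package name, version)


@dataclass(frozen=True)
class Component:
    """Hypothetical component record; field names are illustrative only."""
    name: str
    version: str
    declared_only: bool          # True when only a lockfile mentions the package
    lock_source: Optional[str]   # lockfile that declared it, or None (missing lock)


def merge_lock_evidence(installed: List[Key], locked: Dict[Key, str]) -> List[Component]:
    """Attach lock provenance to installed packages and emit declared-only entries."""
    installed_set = set(installed)
    merged = [
        # Installed packages keep their evidence; provenance is None if no lock entry exists.
        Component(name, version, False, locked.get((name, version)))
        for name, version in installed_set
    ]
    merged += [
        # Lock entries without installed evidence become declared-only components.
        Component(name, version, True, source)
        for (name, version), source in locked.items()
        if (name, version) not in installed_set
    ]
    # Sort so the result does not depend on input order.
    return sorted(merged, key=lambda c: (c.name, c.version))
```

In this sketch, an installed package whose `lock_source` is `None` corresponds to the "missing-lock" case that the `lock-validate` commands surface, while `declared_only=True` corresponds to the declared-only case.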
## Responsibilities

- Expose APIs (WebService) for scan orchestration, diffing, and artifact retrieval.
- Run Worker analyzers for OS, language, and native ecosystems with restart-only plug-ins.
- Store SBOM fragments and artifacts in RustFS/object storage.
- Publish DSSE-ready metadata for Signer/Attestor and downstream policy evaluation.

## Key components

- `StellaOps.Scanner.WebService`: minimal API host.
- `StellaOps.Scanner.Worker`: analyzer executor.
- Analyzer libraries under `StellaOps.Scanner.Analyzers.*`.

## Integrations & dependencies

- Scheduler for job intake and retries.
- Policy Engine for evidence handoff.
- Export Center / Offline Kit for artifact packaging.

## Operational notes

- CAS caches, bounded retries, DSSE integration.
- Monitoring dashboards (see `./operations/analyzers-grafana-dashboard.json`).
- RustFS migration playbook (see `./operations/rustfs-migration.md`).

## Related resources

- ./operations/analyzers.md
- ./operations/analyzers-grafana-dashboard.json
- ./operations/rustfs-migration.md
- ./operations/entrypoint.md
- ./analyzers-node.md
- ./analyzers-go.md
- ./operations/secret-leak-detection.md
- ./operations/dsse-rekor-operator-guide.md
- ./os-analyzers-evidence.md
- ./design/macos-analyzer.md
- ./design/windows-analyzer.md
- ../benchmarks/scanner/deep-dives/macos.md
- ../benchmarks/scanner/deep-dives/windows.md
- ../benchmarks/scanner/windows-macos-demand.md
- ../benchmarks/scanner/windows-macos-interview-template.md
- ./operations/field-engagement.md
- ./design/README.md

## Backlog references

- DOCS-SCANNER updates tracked in `../../TASKS.md`.
- Analyzer parity work in `src/Scanner/**/TASKS.md`.

## Implementation Status

### Phase 1 – Control plane & job queue (Complete)

- Scanner WebService with queue abstraction (Valkey/NATS)
- Job leasing with retries and dead-letter handling
- CAS layer cache and artifact catalog
- REST API endpoints for scan management

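Job leasing with retries and dead-letter handling can be sketched in memory as follows. This is a minimal illustration under stated assumptions, not the Scanner queue abstraction: `LeaseQueue`, its method names, and the `max_attempts` and `lease_seconds` parameters are hypothetical, and a real deployment would back this with Valkey or NATS rather than Python containers.

```python
import time
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Optional, Tuple


@dataclass
class Job:
    job_id: str
    attempts: int = 0  # number of times the job has been leased


class LeaseQueue:
    """Hypothetical in-memory queue: lease, ack/nack, retry, dead-letter."""

    def __init__(self, max_attempts: int = 3, lease_seconds: float = 30.0):
        self.max_attempts = max_attempts
        self.lease_seconds = lease_seconds
        self.pending: Deque[Job] = deque()
        self.leased: Dict[str, Tuple[Job, float]] = {}  # job_id -> (job, lease expiry)
        self.dead_letter: List[Job] = []

    def enqueue(self, job_id: str) -> None:
        self.pending.append(Job(job_id))

    def lease(self, now: Optional[float] = None) -> Optional[Job]:
        """Hand the next job to a worker; expired leases are reclaimed first."""
        now = time.monotonic() if now is None else now
        self._reclaim_expired(now)
        if not self.pending:
            return None
        job = self.pending.popleft()
        job.attempts += 1
        self.leased[job.job_id] = (job, now + self.lease_seconds)
        return job

    def ack(self, job_id: str) -> None:
        """Worker finished successfully; drop the lease."""
        self.leased.pop(job_id, None)

    def nack(self, job_id: str) -> None:
        """Worker failed; retry or move to the dead-letter list."""
        entry = self.leased.pop(job_id, None)
        if entry is not None:
            self._retry_or_bury(entry[0])

    def _reclaim_expired(self, now: float) -> None:
        expired = [jid for jid, (_, expiry) in self.leased.items() if expiry <= now]
        for jid in expired:
            job, _ = self.leased.pop(jid)
            self._retry_or_bury(job)

    def _retry_or_bury(self, job: Job) -> None:
        if job.attempts >= self.max_attempts:
            self.dead_letter.append(job)  # retries exhausted
        else:
            self.pending.append(job)      # back of the queue for another attempt
```

Passing `now` explicitly keeps the sketch testable without sleeping; a worker that crashes simply lets its lease expire, and the next `lease` call reclaims the job.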
### Phase 2 – Analyzer parity & SBOM assembly (In Progress)

- OS analyzers: apk/dpkg/rpm with deterministic metadata
- Language analyzers: Java, Node, Python, Go, .NET, Rust with lock file support
- Native analyzers: ELF/PE/MachO for binary analysis
- SBOM views: inventory/usage with CycloneDX/SPDX emitters
- Entry trace resolution and dependency analysis

### Phase 3 – Diff & attestations (In Progress)

- Three-way diff engine (base, target, runtime)
- DSSE SBOM/report signing pipeline
- Attestation hand-off to Signer/Attestor
- Metadata for Export Center integration

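For readers unfamiliar with DSSE, the sketch below shows the pre-authentication encoding (PAE) defined by the DSSE specification and the general shape of an envelope. It is not the Scanner signing pipeline: signing is handed off to Signer/Attestor, and the HMAC-SHA256 used here is a stand-in chosen so the example runs with the standard library alone (DSSE deployments typically use asymmetric signatures). The function names, key, and key ID are hypothetical.

```python
import base64
import hashlib
import hmac
from typing import Dict


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the exact bytes that get signed."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %s %d %s" % (len(type_bytes), type_bytes, len(payload), payload)


def sign_envelope(payload_type: str, payload: bytes, key: bytes, keyid: str) -> Dict:
    """Build a DSSE-shaped envelope. HMAC is a stand-in for a real signer."""
    sig = hmac.new(key, pae(payload_type, payload), hashlib.sha256).digest()
    return {
        "payloadType": payload_type,
        "payload": base64.b64encode(payload).decode("ascii"),
        "signatures": [{"keyid": keyid, "sig": base64.b64encode(sig).decode("ascii")}],
    }


def verify_envelope(envelope: Dict, key: bytes) -> bytes:
    """Return the decoded payload if any signature verifies, else raise ValueError."""
    payload = base64.b64decode(envelope["payload"])
    expected = hmac.new(key, pae(envelope["payloadType"], payload), hashlib.sha256).digest()
    for signature in envelope["signatures"]:
        if hmac.compare_digest(base64.b64decode(signature["sig"]), expected):
            return payload
    raise ValueError("no valid signature in envelope")
```

Because the signature covers the PAE of the payload type and the raw payload bytes, a consumer can verify the envelope and use the payload without re-serialising it.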
### Phase 4 – Integrations & exports (Planned)

- Policy Engine integration for evaluation
- Vuln Explorer metadata delivery
- Export Center artifact packaging
- CLI/Console workflows and buildx plugin

### Phase 5 – Observability & resilience (Planned)

- Metrics: queue depth, scan latency, cache hit/miss, analyzer timing
- Queue backpressure handling and cache eviction
- SLO dashboards and alerting
- Smoke tests and runbooks

### Key Acceptance Criteria

- Scans produce deterministic SBOM inventory/usage with stable component identity
- Queue/worker pipeline handles retries, backpressure, and offline kits
- DSSE attestations exported for Signer/Attestor without transformation
- CLI/Console parity for scan submission, diffing, exports, and verification
- Offline scanning supported with local caches and manifest verification

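The determinism criterion can be made concrete with a short sketch: if components are sorted by a stable identity and serialised with fixed key order and separators, the same inventory always hashes to the same digest regardless of discovery order. The `purl` and `version` keys and both function names are assumptions for illustration, and the serialisation shown is plain `json.dumps` with sorted keys, not a formal canonicalisation scheme or the Scanner's own emitter.

```python
import hashlib
import json
from typing import Dict, List


def canonical_bytes(components: List[Dict]) -> bytes:
    """Serialise components in a stable order with stable formatting."""
    ordered = sorted(components, key=lambda c: (c["purl"], c.get("version", "")))
    return json.dumps(
        ordered,
        sort_keys=True,           # stable key order inside each object
        separators=(",", ":"),    # no whitespace differences
        ensure_ascii=False,
    ).encode("utf-8")


def fragment_digest(components: List[Dict]) -> str:
    """Digest that only changes when the component content changes."""
    return "sha256:" + hashlib.sha256(canonical_bytes(components)).hexdigest()
```

A golden-fixture regression test in this style would store the expected digest next to the fixture and fail whenever an analyzer change alters the serialised output.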
### Technical Decisions & Risks

- Analyzer drift prevented via golden fixtures, hash-based regression tests, deterministic sorting
- Queue overload mitigated with adaptive backpressure, worker scaling, priority lanes
- Storage growth managed via CAS dedupe, ILM policies, offline bundle pruning
- Lock file integration (npm/yarn/pnpm, pip/poetry, gradle) with declared-only components
- Surface cache reuse for Linux OS analyzers with rootfs-relative evidence

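CAS dedupe, mentioned above as the storage-growth mitigation, works because a blob's address is derived from its content. The sketch below assumes a local directory layout and a hypothetical `ContentStore` class; the actual Scanner stores artifacts in RustFS/object storage, so treat this only as an illustration of the principle.

```python
import hashlib
import tempfile
from pathlib import Path


class ContentStore:
    """Hypothetical content-addressed store on a local directory."""

    def __init__(self, root: Path):
        self.root = root

    def _path_for(self, digest: str) -> Path:
        # Fan out by the first two hex characters to keep directories small.
        return self.root / digest[:2] / digest

    def put(self, data: bytes) -> str:
        """Store data under its SHA-256 digest; identical content is written once."""
        digest = hashlib.sha256(data).hexdigest()
        path = self._path_for(digest)
        if not path.exists():
            path.parent.mkdir(parents=True, exist_ok=True)
            tmp = path.with_suffix(".tmp")
            tmp.write_bytes(data)
            tmp.replace(path)  # rename into place so readers never see partial files
        return digest

    def get(self, digest: str) -> bytes:
        """Read a blob back and verify it still matches its address."""
        data = self._path_for(digest).read_bytes()
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored content does not match its digest")
        return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        store = ContentStore(Path(tmp_dir))
        first = store.put(b"layer contents")
        second = store.put(b"layer contents")  # deduplicated: no second file
        print(first == second)
```

Two images that share a layer therefore share one stored blob, and the digest check in `get` doubles as the integrity verification needed for offline use.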
### Recent Enhancements (2025-12-12)

- Deterministic SBOM composition with DSSE fixtures and offline verification
- Node/Python/Java lock file collectors with CLI validation commands
- Platform events rollout with `scanner.report.ready@1` and `scanner.scan.completed@1`
- Surface-cache environment resolution with startup validation

## Epic alignment

- **Epic 6 – Vulnerability Explorer:** provide policy-aware scan outputs, explain traces, and findings ledger hooks for triage workflows.
- **Epic 10 – Export Center:** generate export-ready artefacts, manifests, and DSSE metadata for bundles.