feat(docs): Add comprehensive documentation for Vexer, Vulnerability Explorer, and Zastava modules
- Introduced AGENTS.md, README.md, TASKS.md, and implementation_plan.md for Vexer, detailing mission, responsibilities, key components, and operational notes.
- Established similar documentation structure for Vulnerability Explorer and Zastava modules, including their respective workflows, integrations, and observability notes.
- Created risk scoring profiles documentation outlining the core workflow, factor model, governance, and deliverables.
- Ensured all modules adhere to the Aggregation-Only Contract and maintain determinism and provenance in outputs.
docs/modules/scanner/architecture.md
							| @@ -0,0 +1,489 @@ | ||||
| # component_architecture_scanner.md — **Stella Ops Scanner** (2025Q4) | ||||
|  | ||||
| > Aligned with Epic 6 – Vulnerability Explorer and Epic 10 – Export Center. | ||||
|  | ||||
| > **Scope.** Implementation‑ready architecture for the **Scanner** subsystem: WebService, Workers, analyzers, SBOM assembly (inventory & usage), per‑layer caching, three‑way diffs, artifact catalog (RustFS default + Mongo, S3-compatible fallback), attestation hand‑off, and scale/security posture. This document is the contract between the scanning plane and everything else (Policy, Excititor, Concelier, UI, CLI). | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 0) Mission & boundaries | ||||
|  | ||||
| **Mission.** Produce **deterministic**, **explainable** SBOMs and diffs for container images and filesystems, quickly and repeatedly, without guessing. Emit two views: **Inventory** (everything present) and **Usage** (entrypoint closure + actually linked libs). Attach attestations through **Signer→Attestor→Rekor v2**. | ||||
|  | ||||
| **Boundaries.** | ||||
|  | ||||
| * Scanner **does not** produce PASS/FAIL. The backend (Policy + Excititor + Concelier) decides presentation and verdicts. | ||||
| * Scanner **does not** keep third‑party SBOM warehouses. It may **bind** to existing attestations for exact hashes. | ||||
| * Core analyzers are **deterministic** (no fuzzy identity). Optional heuristic plug‑ins (e.g., patch‑presence) run under explicit flags and never contaminate the core SBOM. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 1) Solution & project layout | ||||
|  | ||||
| ``` | ||||
| src/ | ||||
|  ├─ StellaOps.Scanner.WebService/            # REST control plane, catalog, diff, exports | ||||
|  ├─ StellaOps.Scanner.Worker/                # queue consumer; executes analyzers | ||||
|  ├─ StellaOps.Scanner.Models/                # DTOs, evidence, graph nodes, CDX/SPDX adapters | ||||
|  ├─ StellaOps.Scanner.Storage/               # Mongo repositories; RustFS object client (default) + S3 fallback; ILM/GC | ||||
|  ├─ StellaOps.Scanner.Queue/                 # queue abstraction (Redis Streams, NATS JetStream; RabbitMQ possible later) | ||||
|  ├─ StellaOps.Scanner.Cache/                 # layer cache; file CAS; bloom/bitmap indexes | ||||
|  ├─ StellaOps.Scanner.EntryTrace/            # ENTRYPOINT/CMD → terminal program resolver (shell AST) | ||||
|  ├─ StellaOps.Scanner.Analyzers.OS.[Apk|Dpkg|Rpm]/ | ||||
|  ├─ StellaOps.Scanner.Analyzers.Lang.[Java|Node|Python|Go|DotNet|Rust]/ | ||||
|  ├─ StellaOps.Scanner.Analyzers.Native.[ELF|PE|MachO]/   # PE/Mach-O planned (M2) | ||||
|  ├─ StellaOps.Scanner.Emit.CDX/              # CycloneDX (JSON + Protobuf) | ||||
|  ├─ StellaOps.Scanner.Emit.SPDX/             # SPDX 3.0.1 JSON | ||||
|  ├─ StellaOps.Scanner.Diff/                  # image→layer→component three‑way diff | ||||
|  ├─ StellaOps.Scanner.Index/                 # BOM‑Index sidecar (purls + roaring bitmaps) | ||||
|  ├─ StellaOps.Scanner.Tests.*                # unit/integration/e2e fixtures | ||||
|  └─ Tools/ | ||||
|      ├─ StellaOps.Scanner.Sbomer.BuildXPlugin/   # BuildKit generator (image referrer SBOMs) | ||||
|      └─ StellaOps.Scanner.Sbomer.DockerImage/    # CLI‑driven scanner container | ||||
| ``` | ||||
|  | ||||
| Analyzer assemblies and buildx generators are packaged as **restart-time plug-ins** under `plugins/scanner/**` with manifests; services must restart to activate new plug-ins. | ||||
|  | ||||
| ### 1.1 Queue backbone (Redis / NATS) | ||||
|  | ||||
| `StellaOps.Scanner.Queue` exposes a transport-agnostic contract (`IScanQueue`/`IScanQueueLease`) used by the WebService producer and Worker consumers. Sprint 9 introduces two first-party transports: | ||||
|  | ||||
| - **Redis Streams** (default). Uses consumer groups, deterministic idempotency keys (`scanner:jobs:idemp:*`), and supports lease claim (`XCLAIM`), renewal, exponential-backoff retries, and a `scanner:jobs:dead` stream for exhausted attempts. | ||||
| - **NATS JetStream**. Provisions the `SCANNER_JOBS` work-queue stream + durable consumer `scanner-workers`, publishes with `MsgId` for dedupe, applies backoff via `NAK` delays, and routes dead-lettered jobs to `SCANNER_JOBS_DEAD`. | ||||
|  | ||||
| Metrics are emitted via `Meter` counters (`scanner_queue_enqueued_total`, `scanner_queue_retry_total`, `scanner_queue_deadletter_total`), and `ScannerQueueHealthCheck` pings the active backend (Redis `PING`, NATS `PING`). Configuration is bound from `scanner.queue`: | ||||
|  | ||||
| ```yaml | ||||
| scanner: | ||||
|   queue: | ||||
|     kind: redis # or nats | ||||
|     redis: | ||||
|       connectionString: "redis://queue:6379/0" | ||||
|       streamName: "scanner:jobs" | ||||
|     nats: | ||||
|       url: "nats://queue:4222" | ||||
|       stream: "SCANNER_JOBS" | ||||
|       subject: "scanner.jobs" | ||||
|       durableConsumer: "scanner-workers" | ||||
|       deadLetterSubject: "scanner.jobs.dead" | ||||
|     maxDeliveryAttempts: 5 | ||||
|     retryInitialBackoff: 00:00:05 | ||||
|     retryMaxBackoff: 00:02:00 | ||||
| ``` | ||||
|  | ||||
| The DI extension (`AddScannerQueue`) wires the selected transport, so future additions (e.g., RabbitMQ) only implement the same contract and register. | ||||
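
The retry settings above can be read as a capped exponential backoff. The following is a minimal sketch of that reading, not the transport's actual code: the doubling factor and the rule that a job is dead-lettered once `maxDeliveryAttempts` deliveries have failed are assumptions, and `retry_delay` is a hypothetical helper name.

```python
from datetime import timedelta


def retry_delay(attempt, initial=timedelta(seconds=5),
                maximum=timedelta(minutes=2), max_attempts=5):
    """Return the delay before the next delivery, or None to dead-letter.

    `attempt` is the 1-based count of deliveries that have already failed.
    Defaults mirror retryInitialBackoff, retryMaxBackoff, maxDeliveryAttempts.
    """
    if attempt >= max_attempts:
        return None                      # exhausted: route to the dead-letter stream
    delay = initial * (2 ** (attempt - 1))   # assumed doubling per failed attempt
    return min(delay, maximum)           # never exceed the configured ceiling
```

With the defaults, failed attempts 1 through 4 wait 5 s, 10 s, 20 s, and 40 s, and the fifth failure dead-letters the job. The 2 minute ceiling only takes effect when `maxDeliveryAttempts` is raised.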
|  | ||||
| **Runtime form‑factor:** two deployables | ||||
|  | ||||
| * **Scanner.WebService** (stateless REST) | ||||
| * **Scanner.Worker** (N replicas; queue‑driven) | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 2) External dependencies | ||||
|  | ||||
| * **OCI registry** with **Referrers API** (discover attached SBOMs/signatures). | ||||
| * **RustFS** (default, offline-first) for SBOM artifacts; optional S3/MinIO compatibility retained for migration; **Object Lock** semantics emulated via retention headers; **ILM** for TTL. | ||||
| * **MongoDB** for catalog, job state, diffs, ILM rules. | ||||
| * **Queue** (Redis Streams or NATS JetStream; RabbitMQ is a possible future transport). | ||||
| * **Authority** (on‑prem OIDC) for **OpToks** (DPoP/mTLS). | ||||
| * **Signer** + **Attestor** (+ **Fulcio/KMS** + **Rekor v2**) for DSSE + transparency. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 3) Contracts & data model | ||||
|  | ||||
| ### 3.1 Evidence‑first component model | ||||
|  | ||||
| **Nodes** | ||||
|  | ||||
| * `Image`, `Layer`, `File` | ||||
| * `Component` (`purl?`, `name`, `version?`, `type`, `id` — may be `bin:{sha256}`) | ||||
| * `Executable` (ELF/PE/Mach‑O), `Library` (native or managed), `EntryScript` (shell/launcher) | ||||
|  | ||||
| **Edges** (all carry **Evidence**) | ||||
|  | ||||
| * `contains(Image|Layer → File)` | ||||
| * `installs(PackageDB → Component)` (OS database row) | ||||
| * `declares(InstalledMetadata → Component)` (dist‑info, pom.properties, deps.json…) | ||||
| * `links_to(Executable → Library)` (ELF `DT_NEEDED`, PE imports) | ||||
| * `calls(EntryScript → Program)` (file:line from shell AST) | ||||
| * `attests(Rekor → Component|Image)` (SBOM/predicate binding) | ||||
| * `bound_from_attestation(Component_attested → Component_observed)` (hash equality proof) | ||||
|  | ||||
| **Evidence** | ||||
|  | ||||
| ``` | ||||
| { source: enum, locator: (path|offset|line), sha256?, method: enum, timestamp } | ||||
| ``` | ||||
|  | ||||
| No confidences. Either a fact is proven with listed mechanisms, or it is not claimed. | ||||
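
One way to picture the evidence-first rule is a pair of immutable records where an edge cannot be constructed without its evidence. This is an illustrative sketch only. The class layout and the example enum values (`elf-dynamic-section`, `DT_NEEDED` as a method name) are assumptions, since the document does not list the `source` and `method` enums.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Evidence:
    source: str                    # which analyzer or database produced the fact
    locator: str                   # path, offset, or file:line
    method: str                    # how the fact was established
    timestamp: datetime
    sha256: Optional[str] = None   # optional content hash of the located item


@dataclass(frozen=True)
class Edge:
    kind: str                      # contains, installs, declares, links_to, ...
    src: str
    dst: str
    evidence: Evidence             # required field: no evidence, no edge


edge = Edge(
    kind="links_to",
    src="/usr/bin/curl",
    dst="/usr/lib/libcurl.so.4",
    evidence=Evidence(
        source="elf-dynamic-section",   # hypothetical enum value
        locator="/usr/bin/curl",
        method="DT_NEEDED",             # hypothetical enum value
        timestamp=datetime(2025, 1, 1, tzinfo=timezone.utc),
    ),
)
```

There is deliberately no confidence or score field: a relationship is either backed by a concrete locator and method, or it is not recorded.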
|  | ||||
| ### 3.2 Catalog schema (Mongo) | ||||
|  | ||||
| * `artifacts` | ||||
|  | ||||
|   ``` | ||||
|   { _id, type: layer-bom|image-bom|diff|index, | ||||
|     format: cdx-json|cdx-pb|spdx-json, | ||||
|     bytesSha256, size, rekor: { uuid,index,url }?, | ||||
|     ttlClass, immutable, refCount, createdAt } | ||||
|   ``` | ||||
| * `images { imageDigest, repo, tag?, arch, createdAt, lastSeen }` | ||||
| * `layers { layerDigest, mediaType, size, createdAt, lastSeen }` | ||||
| * `links  { fromType, fromDigest, artifactId }`               // image/layer -> artifact | ||||
| * `jobs   { _id, kind, args, state, startedAt, heartbeatAt, endedAt, error }` | ||||
| * `lifecycleRules { ruleId, scope, ttlDays, retainIfReferenced, immutable }` | ||||
|  | ||||
| ### 3.3 Object store layout (RustFS) | ||||
|  | ||||
| ``` | ||||
| layers/<sha256>/sbom.cdx.json.zst | ||||
| layers/<sha256>/sbom.spdx.json.zst | ||||
| images/<imgDigest>/inventory.cdx.pb            # CycloneDX Protobuf | ||||
| images/<imgDigest>/usage.cdx.pb | ||||
| indexes/<imgDigest>/bom-index.bin              # purls + roaring bitmaps | ||||
| diffs/<old>_<new>/diff.json.zst | ||||
| attest/<artifactSha256>.dsse.json              # DSSE bundle (cert chain + Rekor proof) | ||||
| ``` | ||||
|  | ||||
| RustFS exposes a deterministic HTTP API (`PUT|GET|DELETE /api/v1/buckets/{bucket}/objects/{key}`). | ||||
| Scanner clients tag immutable uploads with `X-RustFS-Immutable: true` and, when retention applies, | ||||
| `X-RustFS-Retain-Seconds: <ttlSeconds>`. Additional headers can be injected via | ||||
| `scanner.artifactStore.headers` to support custom auth or proxy requirements. Legacy MinIO/S3 | ||||
| deployments remain supported by setting `scanner.artifactStore.driver = "s3"` during phased | ||||
| migrations. | ||||
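
As a rough illustration of the upload contract described above, the sketch below only builds the method, URL, and headers for a `PUT`; it does not send anything. The host name, bucket name, and the choice to leave `/` unescaped inside the object key are assumptions, and `rustfs_put_request` is a hypothetical helper.

```python
from urllib.parse import quote


def rustfs_put_request(base_url, bucket, key, immutable=False,
                       retain_seconds=None, extra_headers=None):
    """Build (method, url, headers) for an artifact upload; nothing is sent."""
    url = "{}/api/v1/buckets/{}/objects/{}".format(
        base_url.rstrip("/"),
        quote(bucket, safe=""),
        quote(key, safe="/"),            # assumption: keys keep their slashes
    )
    headers = dict(extra_headers or {})  # e.g. scanner.artifactStore.headers
    if immutable:
        headers["X-RustFS-Immutable"] = "true"
    if retain_seconds is not None:
        headers["X-RustFS-Retain-Seconds"] = str(int(retain_seconds))
    return "PUT", url, headers


method, url, headers = rustfs_put_request(
    "http://rustfs:8080",                # hypothetical endpoint
    "scanner-artifacts",                 # hypothetical bucket
    "layers/abc123/sbom.cdx.json.zst",
    immutable=True,
    retain_seconds=86400,
)
```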
|  | ||||
| --- | ||||
|  | ||||
| ## 4) REST API (Scanner.WebService) | ||||
|  | ||||
| All under `/api/v1/scanner`. Auth: **OpTok** (DPoP/mTLS); RBAC scopes. | ||||
|  | ||||
| ``` | ||||
| POST /scans                        { imageRef|digest, force?:bool } → { scanId } | ||||
| GET  /scans/{id}                   → { status, imageDigest, artifacts[], rekor? } | ||||
| GET  /sboms/{imageDigest}          ?format=cdx-json|cdx-pb|spdx-json&view=inventory|usage → bytes | ||||
| GET  /diff?old=<digest>&new=<digest>&view=inventory|usage → diff.json | ||||
| POST /exports                      { imageDigest, format, view, attest?:bool } → { artifactId, rekor? } | ||||
| POST /reports                      { imageDigest, policyRevision? } → { reportId, rekor? }   # delegates to backend policy+vex | ||||
| GET  /catalog/artifacts/{id}       → { meta } | ||||
| GET  /healthz | /readyz | /metrics | ||||
| ``` | ||||
|  | ||||
| ### Report events | ||||
|  | ||||
| When `scanner.events.enabled = true`, the WebService serialises the signed report (canonical JSON + DSSE envelope) with `NotifyCanonicalJsonSerializer` and publishes two Redis Stream entries (`scanner.report.ready`, `scanner.scan.completed`) to the configured stream (default `stella.events`). The stream fields carry the whole envelope plus lightweight headers (`kind`, `tenant`, `ts`) so Notify and UI timelines can consume the event bus without recomputing signatures. Publish timeouts and bounded stream length are controlled via `scanner.events.publishTimeoutSeconds` and `scanner.events.maxStreamLength`. If the queue driver is already Redis and no explicit events DSN is provided, the host reuses the queue connection and auto-enables event emission so deployments get live envelopes without extra wiring. Compose/Helm bundles expose the same knobs via the `SCANNER__EVENTS__*` environment variables for quick tuning. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 5) Execution flow (Worker) | ||||
|  | ||||
| ### 5.1 Acquire & verify | ||||
|  | ||||
| 1. **Resolve image** (prefer `repo@sha256:…`). | ||||
| 2. **(Optional) verify image signature** per policy (cosign). | ||||
| 3. **Pull blobs**, compute layer digests; record metadata. | ||||
|  | ||||
| ### 5.2 Layer union FS | ||||
|  | ||||
| * Apply whiteouts; materialize final filesystem; map **file → first introducing layer**. | ||||
| * Windows layers (MSI/SxS/GAC) planned in **M2**. | ||||
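
The union step can be sketched over path lists alone, using the OCI whiteout conventions (`.wh.<name>` hides one entry from lower layers, `.wh..wh..opq` hides everything below a directory). This is a simplified model under stated assumptions: it tracks paths rather than file contents, so a path keeps the index of the first layer that introduced it since its last removal. `union_layers` is a hypothetical name.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."
OPAQUE_MARKER = ".wh..wh..opq"


def union_layers(layers):
    """layers: list of path lists, base layer first. Returns {path: layer index}."""
    introduced = {}
    for index, entries in enumerate(layers):
        # Whiteouts only affect lower layers, so apply them before this layer's files.
        for entry in entries:
            directory, name = posixpath.split(entry)
            if name == OPAQUE_MARKER:
                prefix = directory.rstrip("/") + "/"
                doomed = [p for p in introduced if p.startswith(prefix)]
            elif name.startswith(WHITEOUT_PREFIX):
                target = posixpath.join(directory, name[len(WHITEOUT_PREFIX):])
                doomed = [p for p in introduced
                          if p == target or p.startswith(target + "/")]
            else:
                continue
            for path in doomed:
                del introduced[path]
        # Then record the files this layer adds; marker files are never materialized.
        for entry in entries:
            if not posixpath.basename(entry).startswith(WHITEOUT_PREFIX):
                introduced.setdefault(entry, index)
    return introduced


final_fs = union_layers([
    ["/etc/a", "/etc/b", "/bin/sh"],     # layer 0
    ["/etc/.wh.a", "/etc/c"],            # layer 1 deletes /etc/a, adds /etc/c
    ["/etc/a"],                          # layer 2 re-adds /etc/a
])
```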
|  | ||||
| ### 5.3 Evidence harvest (parallel analyzers; deterministic only) | ||||
|  | ||||
| **A) OS packages** | ||||
|  | ||||
| * **apk**: `/lib/apk/db/installed` | ||||
| * **dpkg**: `/var/lib/dpkg/status`, `/var/lib/dpkg/info/*.list` | ||||
| * **rpm**: `/var/lib/rpm/Packages` (via librpm or parser) | ||||
| * Record `name`, `version` (epoch/revision), `arch`, source package where present, and **declared file lists**. | ||||
|  | ||||
| > **Data flow note:** Each OS analyzer now writes its canonical output into the shared `ScanAnalysisStore` under | ||||
| > `analysis.os.packages` (raw results), `analysis.os.fragments` (per-analyzer layer fragments), and contributes to | ||||
| > `analysis.layers.fragments` (the aggregated view consumed by emit/diff pipelines). Helpers in | ||||
| > `ScanAnalysisCompositionBuilder` convert these fragments into SBOM composition requests and component graphs so the | ||||
| > diff/emit stages no longer reach back into individual analyzer implementations. | ||||
|  | ||||
| **B) Language ecosystems (installed state only)** | ||||
|  | ||||
| * **Java**: `META-INF/maven/*/pom.properties`, MANIFEST → `pkg:maven/...` | ||||
| * **Node**: `node_modules/**/package.json` → `pkg:npm/...` | ||||
| * **Python**: `*.dist-info/{METADATA,RECORD}` → `pkg:pypi/...` | ||||
| * **Go**: Go **buildinfo** in binaries → `pkg:golang/...` | ||||
| * **.NET**: `*.deps.json` + assembly metadata → `pkg:nuget/...` | ||||
| * **Rust**: crates only when **explicitly present** (embedded metadata or cargo/registry traces); otherwise binaries reported as `bin:{sha256}`. | ||||
|  | ||||
| > **Rule:** We only report components proven **on disk** with authoritative metadata. Lockfiles are evidence only. | ||||
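
For the Python case, the rule can be illustrated by deriving a purl only from an installed `*.dist-info/METADATA` file and refusing to emit anything when the authoritative fields are missing. This is a sketch, not the analyzer's code: `pypi_purl_from_metadata` is a hypothetical helper, the name normalization follows the purl `pypi` type definition (lowercase, `_` replaced by `-`), and percent-encoding of unusual version strings is left out.

```python
from email.parser import Parser


def pypi_purl_from_metadata(metadata_text):
    """Return a pkg:pypi purl from METADATA text, or None if it is not provable."""
    message = Parser().parsestr(metadata_text, headersonly=True)
    name = message.get("Name")
    version = message.get("Version")
    if not name or not version:
        return None                      # unreadable metadata: no component is created
    normalized = name.strip().lower().replace("_", "-")
    return "pkg:pypi/{}@{}".format(normalized, version.strip())


good = pypi_purl_from_metadata(
    "Metadata-Version: 2.1\nName: Typing_Extensions\nVersion: 4.12.2\n\nDescription body\n"
)
broken = pypi_purl_from_metadata("Metadata-Version: 2.1\nName: broken\n")
```

A lockfile entry never reaches this function; under the rule above it may be attached as evidence, but it does not create a component by itself.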
|  | ||||
| **C) Native link graph** | ||||
|  | ||||
| * **ELF**: parse `PT_INTERP`, `DT_NEEDED`, RPATH/RUNPATH, **GNU symbol versions**; map **SONAMEs** to file paths; link executables → libs. | ||||
| * **PE/Mach‑O** (planned M2): import table, delay‑imports; version resources; code signatures. | ||||
| * Map libs back to **OS packages** if possible (via file lists); else emit `bin:{sha256}` components. | ||||
| * The exported metadata (`stellaops.os.*` properties, license list, source package) feeds policy scoring and export pipelines | ||||
|   directly – Policy evaluates quiet rules against package provenance while Exporters forward the enriched fields into | ||||
|   downstream JSON/Trivy payloads. | ||||
|  | ||||
| **D) EntryTrace (ENTRYPOINT/CMD → terminal program)** | ||||
|  | ||||
| * Read image config; parse shell (POSIX/Bash subset) with AST: `source`/`.` includes; `case/if`; `exec`/`command`; `run‑parts`. | ||||
| * Resolve commands via **PATH** within the **built rootfs**; follow language launchers (Java/Node/Python) to identify the terminal program (ELF/JAR/venv script). | ||||
| * Record **file:line** and choices for each hop; output chain graph. | ||||
| * Unresolvable dynamic constructs are recorded as **unknown** edges with reasons (e.g., `$FOO` unresolved). | ||||
|  | ||||
| **E) Attestation & SBOM bind (optional)** | ||||
|  | ||||
| * For each **file hash** or **binary hash**, query local cache of **Rekor v2** indices; if an SBOM attestation is found for **exact hash**, bind it to the component (origin=`attested`). | ||||
| * For the **image** digest, likewise bind SBOM attestations (build‑time referrers). | ||||
|  | ||||
| ### 5.4 Component normalization (exact only) | ||||
|  | ||||
| * Create `Component` nodes only with deterministic identities: purl, or **`bin:{sha256}`** for unlabeled binaries. | ||||
| * Record **origin** (OS DB, installed metadata, linker, attestation). | ||||
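
The identity rule is small enough to show directly. In this sketch `component_id` is a hypothetical helper, and rendering the digest as lowercase hex is an assumption about the `bin:{sha256}` form.

```python
import hashlib


def component_id(purl, file_bytes):
    """Prefer a proven purl; otherwise fall back to a content-addressed identity."""
    if purl:
        return purl
    return "bin:" + hashlib.sha256(file_bytes).hexdigest()


labeled = component_id("pkg:npm/left-pad@1.3.0", b"")
unlabeled = component_id(None, b"")
```

Because the fallback depends only on file content, two scans of the same binary always produce the same key, which is what keeps diffs stable for unlabeled files.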
|  | ||||
| ### 5.5 SBOM assembly & emit | ||||
|  | ||||
| * **Per-layer SBOM fragments**: components introduced by the layer (+ relationships). | ||||
| * **Image SBOMs**: merge fragments; refer back to them via **CycloneDX BOM‑Link** (or SPDX ExternalRef). | ||||
| * Emit both **Inventory** & **Usage** views. | ||||
| * When the native analyzer reports an ELF `buildId`, attach it to component metadata and surface it as `stellaops:buildId` in CycloneDX properties (and diff metadata). This keeps SBOM/diff output in lockstep with runtime events and the debug-store manifest. | ||||
| * Serialize **CycloneDX JSON** and **CycloneDX Protobuf**; optionally **SPDX 3.0.1 JSON**. | ||||
| * Build **BOM‑Index** sidecar: purl table + roaring bitmap; flag `usedByEntrypoint` components for fast backend joins. | ||||
|  | ||||
| The emitted `buildId` metadata is preserved in component hashes, diff payloads, and `/policy/runtime` responses so operators can pivot from SBOM entries → runtime events → `debug/.build-id/<aa>/<rest>.debug` within the Offline Kit or release bundle. | ||||
|  | ||||
| ### 5.6 DSSE attestation (via Signer/Attestor) | ||||
|  | ||||
| * WebService constructs **predicate** with `image_digest`, `stellaops_version`, `license_id`, `policy_digest?` (when emitting **final reports**), timestamps. | ||||
| * Calls **Signer** (requires **OpTok + PoE**); Signer verifies **entitlement + scanner image integrity** and returns **DSSE bundle**. | ||||
| * **Attestor** logs to **Rekor v2**; returns `{uuid,index,proof}` → stored in `artifacts.rekor`. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 6) Three‑way diff (image → layer → component) | ||||
|  | ||||
| ### 6.1 Keys & classification | ||||
|  | ||||
| * Component key: **purl** when present; else `bin:{sha256}`. | ||||
| * Diff classes: `added`, `removed`, `version_changed` (`upgraded|downgraded`), `metadata_changed` (e.g., origin from attestation vs observed). | ||||
| * Layer attribution: for each change, resolve the **introducing/removing layer**. | ||||
|  | ||||
| ### 6.2 Algorithm (outline) | ||||
|  | ||||
| ``` | ||||
| A = components(imageOld, key) | ||||
| B = components(imageNew, key) | ||||
|  | ||||
| added   = B \ A | ||||
| removed = A \ B | ||||
| changed = { k in A∩B : version(A[k]) != version(B[k]) || origin changed } | ||||
|  | ||||
| for each item in added/removed/changed: | ||||
|    layer = attribute_to_layer(item, imageOld|imageNew) | ||||
|    usageFlag = usedByEntrypoint(item, imageNew) | ||||
| emit diff.json (grouped by layer with badges) | ||||
| ``` | ||||
|  | ||||
| Diffs are stored as artifacts and feed **UI** and **CLI**. | ||||
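
The outline above translates fairly directly into set operations. The sketch below makes several assumptions that the outline leaves open: component keys are purls without the version part (so the same package at two versions shares a key), each component record carries `version`, `origin`, and `layer`, and added or changed items are attributed to the new image's layer while removed items are attributed to the old one. It does not split `version_changed` into upgraded or downgraded, because version ordering is ecosystem-specific.

```python
def diff_components(old, new, used_by_entrypoint=frozenset()):
    """Classify changes between two images and group them by layer digest."""
    changes = []
    for key in sorted(new.keys() - old.keys()):
        changes.append({"key": key, "class": "added", "layer": new[key]["layer"]})
    for key in sorted(old.keys() - new.keys()):
        changes.append({"key": key, "class": "removed", "layer": old[key]["layer"]})
    for key in sorted(old.keys() & new.keys()):
        before, after = old[key], new[key]
        if before["version"] != after["version"]:
            change_class = "version_changed"
        elif before["origin"] != after["origin"]:
            change_class = "metadata_changed"
        else:
            continue                     # identical on both sides: not part of the diff
        changes.append({"key": key, "class": change_class, "layer": after["layer"]})

    grouped = {}
    for change in changes:
        change["usedByEntrypoint"] = change["key"] in used_by_entrypoint
        grouped.setdefault(change["layer"], []).append(change)
    return grouped


old_image = {
    "pkg:deb/debian/openssl": {"version": "3.0.11", "origin": "os-db", "layer": "sha256:aaa"},
    "pkg:npm/left-pad": {"version": "1.3.0", "origin": "installed-metadata", "layer": "sha256:bbb"},
}
new_image = {
    "pkg:deb/debian/openssl": {"version": "3.0.13", "origin": "os-db", "layer": "sha256:ccc"},
    "pkg:pypi/requests": {"version": "2.32.3", "origin": "installed-metadata", "layer": "sha256:ccc"},
}
result = diff_components(old_image, new_image, {"pkg:deb/debian/openssl"})
```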
|  | ||||
| --- | ||||
|  | ||||
| ## 7) Build‑time SBOMs (fast CI path) | ||||
|  | ||||
| **Scanner.Sbomer.BuildXPlugin** can act as a BuildKit **generator**: | ||||
|  | ||||
| * During `docker buildx build --attest=type=sbom,generator=stellaops/sbom-indexer`, run analyzers on the build context/output; attach SBOMs as OCI **referrers** to the built image. | ||||
| * Optionally request **Signer/Attestor** to produce **Stella Ops‑verified** attestation immediately; else, Scanner.WebService can verify and re‑attest post‑push. | ||||
| * Scanner.WebService trusts build‑time SBOMs per policy, enabling **no‑rescan** for unchanged bases. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 8) Configuration (YAML) | ||||
|  | ||||
| ```yaml | ||||
| scanner: | ||||
|   queue: | ||||
|     kind: redis | ||||
|     redis: | ||||
|       connectionString: "redis://queue:6379/0" | ||||
|   mongo: | ||||
|     uri: "mongodb://mongo/scanner" | ||||
|   s3: | ||||
|     endpoint: "http://minio:9000" | ||||
|     bucket: "stellaops" | ||||
|     objectLock: "governance"   # or 'compliance' | ||||
|   analyzers: | ||||
|     os: { apk: true, dpkg: true, rpm: true } | ||||
|     lang: { java: true, node: true, python: true, go: true, dotnet: true, rust: true } | ||||
|     native: { elf: true, pe: false, macho: false }    # PE/Mach-O in M2 | ||||
|     entryTrace: { enabled: true, shellMaxDepth: 64, followRunParts: true } | ||||
|   emit: | ||||
|     cdx: { json: true, protobuf: true } | ||||
|     spdx: { json: true } | ||||
|     compress: "zstd" | ||||
|   rekor: | ||||
|     url: "https://rekor-v2.internal" | ||||
|   signer: | ||||
|     url: "https://signer.internal" | ||||
|   limits: | ||||
|     maxParallel: 8 | ||||
|     perRegistryConcurrency: 2 | ||||
|   policyHints: | ||||
|     verifyImageSignature: false | ||||
|     trustBuildTimeSboms: true | ||||
| ``` | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 9) Scale & performance | ||||
|  | ||||
| * **Parallelism**: per‑analyzer concurrency; bounded directory walkers; file CAS dedupe by sha256. | ||||
| * **Distributed locks** per **layer digest** to prevent duplicate work across Workers. | ||||
| * **Registry throttles**: per‑host concurrency budgets; exponential backoff on 429/5xx. | ||||
| * **Targets**: | ||||
|  | ||||
|   * **Build‑time**: P95 ≤ 3–5 s on warmed bases (CI generator). | ||||
|   * **Post‑build delta**: P95 ≤ 10 s for 200 MB images with cache hit. | ||||
|   * **Emit**: CycloneDX Protobuf ≤ 150 ms for 5k components; JSON ≤ 500 ms. | ||||
|   * **Diff**: ≤ 200 ms for 5k vs 5k components. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 10) Security posture | ||||
|  | ||||
| * **AuthN**: Authority‑issued short OpToks (DPoP/mTLS). | ||||
| * **AuthZ**: scopes (`scanner.scan`, `scanner.export`, `scanner.catalog.read`). | ||||
| * **mTLS** to **Signer**/**Attestor**; only **Signer** can sign. | ||||
| * **No network fetches** during analysis (except registry pulls and optional Rekor index reads). | ||||
| * **Sandboxing**: non‑root containers; read‑only FS; seccomp profiles; disable execution of scanned content. | ||||
| * **Release integrity**: all first‑party images are **cosign‑signed**; Workers/WebService self‑verify at startup. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 11) Observability & audit | ||||
|  | ||||
| * **Metrics**: | ||||
|  | ||||
|   * `scanner.jobs_inflight`, `scanner.scan_latency_seconds` | ||||
|   * `scanner.layer_cache_hits_total`, `scanner.file_cas_hits_total` | ||||
|   * `scanner.artifact_bytes_total{format}` | ||||
|   * `scanner.attestation_latency_seconds`, `scanner.rekor_failures_total` | ||||
|   * `scanner_analyzer_golang_heuristic_total{indicator,version_hint}` — increments whenever the Go analyzer falls back to heuristics (build-id or runtime markers). Grafana panel: `sum by (indicator) (rate(scanner_analyzer_golang_heuristic_total[5m]))`; alert when the rate is ≥ 1 for 15 minutes to highlight unexpected stripped binaries. | ||||
| * **Tracing**: spans for acquire→union→analyzers→compose→emit→sign→log. | ||||
| * **Audit logs**: DSSE requests log `license_id`, `image_digest`, `artifactSha256`, `policy_digest?`, Rekor UUID on success. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 12) Testing matrix | ||||
|  | ||||
| * **Determinism:** given same image + analyzers → byte‑identical **CDX Protobuf**; JSON normalized. | ||||
| * **OS packages:** ground‑truth images per distro; compare to package DB. | ||||
| * **Lang ecosystems:** sample images per ecosystem (Java/Node/Python/Go/.NET/Rust) with installed metadata; negative tests with lockfile‑only images. | ||||
| * **Native & EntryTrace:** ELF graph correctness; shell AST cases (includes, run‑parts, exec, case/if). | ||||
| * **Diff:** layer attribution against synthetic two‑image sequences. | ||||
| * **Performance:** cold vs warm cache; large `node_modules` and `site‑packages`. | ||||
| * **Security:** ensure no code execution from image; fuzz parser inputs; path traversal resistance on layer extract. | ||||
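
A determinism check for the JSON outputs might compare digests of a normalized serialization. The sketch below assumes that "normalized" means sorted object keys and compact separators; the actual normalization rules are not spelled out in this document, and `canonical_digest` is a hypothetical helper.

```python
import hashlib
import json


def canonical_digest(document):
    """Hash a JSON document after key-sorted, whitespace-free serialization."""
    payload = json.dumps(
        document, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


run_a = {"bomFormat": "CycloneDX", "components": [{"name": "openssl", "version": "3.0.13"}]}
run_b = {"components": [{"version": "3.0.13", "name": "openssl"}], "bomFormat": "CycloneDX"}
same = canonical_digest(run_a) == canonical_digest(run_b)
```

Key order is neutralized here, but array order is not, so an emitter would still need to write components in a stable order for two runs to hash identically.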
|  | ||||
| --- | ||||
|  | ||||
| ## 13) Failure modes & degradations | ||||
|  | ||||
| * **Missing OS DB** (files exist, DB removed): record **files**; do **not** fabricate package components; emit `bin:{sha256}` where unavoidable; flag in evidence. | ||||
| * **Unreadable metadata** (corrupt dist‑info): record file evidence; skip component creation; annotate. | ||||
| * **Dynamic shell constructs**: mark unresolved edges with reasons (env var unknown) and continue; **Usage** view may be partial. | ||||
| * **Registry rate limits**: honor backoff; queue job retries with jitter. | ||||
| * **Signer refusal** (license/plan/version): scan completes; artifact produced; **no attestation**; WebService marks result as **unverified**. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 14) Optional plug‑ins (off by default) | ||||
|  | ||||
| * **Patch‑presence detector** (signature‑based backport checks). Reads curated function‑level signatures from advisories; inspects binaries for patched code snippets to lower false‑positives for backported fixes. Runs as a sidecar analyzer that **annotates** components; never overrides core identities. | ||||
| * **Runtime probes** (with Zastava): when allowed, compare **/proc/<pid>/maps** (DSOs actually loaded) with static **Usage** view for precision. | ||||
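
The runtime comparison can be sketched as set arithmetic over the file-backed mappings listed in `/proc/<pid>/maps` (fields: address, perms, offset, dev, inode, pathname). The function names and the sample paths are illustrative assumptions, and the sketch ignores details such as the ` (deleted)` suffix on unlinked files.

```python
def loaded_objects(maps_text):
    """Return the set of file-backed mapping paths found in maps text."""
    paths = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)     # pathname, when present, is the sixth field
        if len(fields) == 6 and fields[5].startswith("/"):
            paths.add(fields[5].strip())
    return paths


def compare_usage(maps_text, static_usage):
    """Split paths into confirmed, seen only at runtime, and predicted only statically."""
    loaded = loaded_objects(maps_text)
    return {
        "confirmed": sorted(loaded & static_usage),
        "runtime_only": sorted(loaded - static_usage),
        "static_only": sorted(static_usage - loaded),
    }


SAMPLE_MAPS = """\
55d0c0a00000-55d0c0a21000 r-xp 00000000 08:01 393449 /usr/bin/server
7f2b1c000000-7f2b1c1c0000 r-xp 00000000 08:01 131090 /usr/lib/libc.so.6
7f2b1d000000-7f2b1d021000 rw-p 00000000 00:00 0
7ffc3a5e0000-7ffc3a601000 rw-p 00000000 00:00 0 [stack]
"""
report = compare_usage(
    SAMPLE_MAPS,
    {"/usr/bin/server", "/usr/lib/libc.so.6", "/usr/lib/libssl.so.3"},
)
```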
|  | ||||
| --- | ||||
|  | ||||
| ## 15) DevOps & operations | ||||
|  | ||||
| * **HA**: WebService horizontal scale; Workers autoscale by queue depth & CPU; distributed locks on layers. | ||||
| * **Retention**: ILM rules per artifact class (`short`, `default`, `compliance`); **Object Lock** for compliance artifacts (reports, signed SBOMs). | ||||
| * **Upgrades**: bump **cache schema** when analyzer outputs change; WebService triggers refresh of dependent artifacts. | ||||
| * **Backups**: Mongo (daily dumps); RustFS snapshots (filesystem-level rsync/ZFS) or S3 versioning when legacy driver enabled; Rekor v2 DB snapshots. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 16) CLI & UI touch points | ||||
|  | ||||
| * **CLI**: `stellaops scan <ref>`, `stellaops diff --old --new`, `stellaops export`, `stellaops verify attestation <bundle|url>`. | ||||
| * **UI**: Scan detail shows **Inventory/Usage** toggles, **Diff by Layer**, **Attestation badge** (verified/unverified), Rekor link, and **EntryTrace** chain with file:line breadcrumbs. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 17) Roadmap (Scanner) | ||||
|  | ||||
| * **M2**: Windows containers (MSI/SxS/GAC analyzers), PE/Mach‑O native analyzer, deeper Rust metadata. | ||||
| * **M2**: Buildx generator GA (certified external registries), cross‑registry trust policies. | ||||
| * **M3**: Patch‑presence plug‑in GA (opt‑in), cross‑image corpus clustering (evidence‑only; not identity). | ||||
| * **M3**: Advanced EntryTrace (POSIX shell features breadth, busybox detection). | ||||
|  | ||||
| --- | ||||
|  | ||||
| ### Appendix A — EntryTrace resolution (pseudo) | ||||
|  | ||||
| ```csharp | ||||
| ResolveEntrypoint(ImageConfig cfg, RootFs fs): | ||||
|   cmd = Normalize(cfg.ENTRYPOINT, cfg.CMD) | ||||
|   stack = [ Script(cmd, path=FindOnPath(cmd[0], fs), depth=0) ] | ||||
|   visited = set() | ||||
|  | ||||
|   while stack not empty: | ||||
|     cur = stack.pop() | ||||
|     if cur in visited: continue | ||||
|     if cur.depth >= MAX: return Unknown(RecursionLimitReached) | ||||
|     visited.add(cur) | ||||
|  | ||||
|     if IsShellScript(cur.path): | ||||
|        ast = ParseShell(cur.path) | ||||
|        foreach directive in ast: | ||||
|          if directive is Source include: | ||||
|             p = ResolveInclude(include.path, cur.env, fs) | ||||
|             stack.push(Script(p, depth=cur.depth + 1)) | ||||
|          if directive is Exec call: | ||||
|             p = ResolveExec(call.argv[0], cur.env, fs) | ||||
|             stack.push(Program(p, argv=call.argv, depth=cur.depth + 1)) | ||||
|          if directive is Interpreter call (python -m / node / java -jar): | ||||
|             term = ResolveInterpreterTarget(call, fs) | ||||
|             stack.push(Program(term, depth=cur.depth + 1)) | ||||
|     else: | ||||
|        return Terminal(cur.path) | ||||
|  | ||||
|   // stack exhausted: reason is the diagnostic recorded during resolution (see Appendix A.1) | ||||
|   return Unknown(reason) | ||||
| ``` | ||||
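
For readers who want something executable, here is a toy Python rendering of the loop above. It is a sketch under heavy assumptions, not the EntryTrace implementation: the root filesystem is a plain dict of path to text, scripts are recognized by a shell shebang, and directives are found by splitting lines on whitespace instead of building a shell AST. Only `exec`, `.`, and `source` are handled, and all function names are hypothetical.

```python
SHELL_SHEBANGS = ("#!/bin/sh", "#!/bin/bash")


def find_on_path(command, fs, path_dirs):
    """Resolve a command against the toy rootfs; returns a path or None."""
    if "/" in command:
        return command if command in fs else None
    for directory in path_dirs:
        candidate = "{}/{}".format(directory, command)
        if candidate in fs:
            return candidate
    return None


def resolve_entrypoint(argv, fs,
                       path_dirs=("/usr/local/bin", "/usr/bin", "/bin"),
                       max_depth=64):
    hops = []                                # (script path, line number, target)
    pending = [(argv[0], 0, False)]          # (command, depth, is_include)
    visited = set()
    reason = "CommandNotFound"
    while pending:
        command, depth, is_include = pending.pop()
        if depth > max_depth:
            reason = "RecursionLimitReached"
            break
        path = find_on_path(command, fs, path_dirs)
        if path is None:
            if "$" in command:
                reason = "DynamicEnvironmentReference"
            else:
                reason = "MissingFile" if is_include else "CommandNotFound"
            continue
        if path in visited:
            continue
        visited.add(path)
        text = fs[path]
        # Sourced files are always parsed as scripts; exec targets only if they look like one.
        if not (is_include or text.startswith(SHELL_SHEBANGS)):
            return {"outcome": "resolved", "terminal": path, "hops": hops}
        for line_no, line in enumerate(text.splitlines(), start=1):
            words = line.split()
            if len(words) >= 2 and words[0] in ("exec", ".", "source"):
                hops.append((path, line_no, words[1]))
                pending.append((words[1], depth + 1, words[0] != "exec"))
    return {"outcome": "unresolved", "reason": reason, "hops": hops}


rootfs = {
    "/docker-entrypoint.sh": "#!/bin/sh\n. /etc/env.sh\nexec server --port 80\n",
    "/etc/env.sh": "export MODE=prod\n",
    "/usr/bin/server": "\x7fELF",
}
trace = resolve_entrypoint(["/docker-entrypoint.sh"], rootfs)
dynamic = resolve_entrypoint(["/run.sh"], {"/run.sh": "#!/bin/sh\nexec $APP\n"})
```

The `hops` list plays the role of the file:line breadcrumbs, and the `reason` strings reuse the diagnostic names from Appendix A.1.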
|  | ||||
| ### Appendix A.1 — EntryTrace Explainability | ||||
|  | ||||
| EntryTrace emits structured diagnostics and metrics so operators can quickly understand why resolution succeeded or degraded: | ||||
|  | ||||
| | Reason | Description | Typical Mitigation | | ||||
| |--------|-------------|--------------------| | ||||
| | `CommandNotFound` | A command referenced in the script cannot be located in the layered root filesystem or `PATH`. | Ensure binaries exist in the image or extend `PATH` hints. | | ||||
| | `MissingFile` | `source`/`.`/`run-parts` targets are missing. | Bundle the script or guard the include. | | ||||
| | `DynamicEnvironmentReference` | Path depends on `$VARS` that are unknown at scan time. | Provide defaults via scan metadata or accept partial usage. | | ||||
| | `RecursionLimitReached` | Nested includes exceeded the analyzer depth limit (default 64). | Flatten indirection or increase the limit in options. | | ||||
| | `RunPartsEmpty` | `run-parts` directory contained no executable entries. | Remove empty directories or ignore if intentional. | | ||||
| | `JarNotFound` / `ModuleNotFound` | Java/Python targets missing, preventing interpreter tracing. | Ship the jar/module with the image or adjust the launcher. | | ||||
|  | ||||
| Diagnostics drive two metrics published by `EntryTraceMetrics`: | ||||
|  | ||||
| - `entrytrace_resolutions_total{outcome}` — resolution attempts segmented by outcome (`resolved`, `partiallyresolved`, `unresolved`). | ||||
| - `entrytrace_unresolved_total{reason}` — diagnostic counts keyed by reason. | ||||
|  | ||||
| Structured logs include `entrytrace.path`, `entrytrace.command`, `entrytrace.reason`, and `entrytrace.depth`, all correlated with scan/job IDs. Timestamps are normalized to UTC (microsecond precision) to keep DSSE attestations and UI traces explainable. | ||||
|  | ||||
| ### Appendix B — BOM‑Index sidecar | ||||
|  | ||||
| ``` | ||||
| struct Header { magic, version, imageDigest, createdAt } | ||||
| vector<string> purls | ||||
| map<purlIndex, roaring_bitmap> components | ||||
| optional map<purlIndex, roaring_bitmap> usedByEntrypoint | ||||
| ``` | ||||