feat(docs): Add comprehensive documentation for Vexer, Vulnerability Explorer, and Zastava modules

- Introduced AGENTS.md, README.md, TASKS.md, and implementation_plan.md for Vexer, detailing mission, responsibilities, key components, and operational notes.
- Established similar documentation structure for Vulnerability Explorer and Zastava modules, including their respective workflows, integrations, and observability notes.
- Created risk scoring profiles documentation outlining the core workflow, factor model, governance, and deliverables.
- Ensured all modules adhere to the Aggregation-Only Contract and maintain determinism and provenance in outputs.
2025-10-30 00:09:39 +02:00
parent 86f606a115
commit e8537460a3
503 changed files with 16136 additions and 54638 deletions


@@ -0,0 +1,22 @@
# Scanner agent guide
## Mission
Scanner analyses container images layer-by-layer, producing deterministic SBOM fragments, diffs, and signed reports.
## Key docs
- [Module README](./README.md)
- [Architecture](./architecture.md)
- [Implementation plan](./implementation_plan.md)
- [Task board](./TASKS.md)
## How to get started
1. Open ../../implplan/SPRINTS.md and locate the stories referencing this module.
2. Review ./TASKS.md for local follow-ups and confirm status transitions (TODO → DOING → DONE/BLOCKED).
3. Read the architecture and README for domain context before editing code or docs.
4. Coordinate cross-module changes in the main /AGENTS.md description and through the sprint plan.
## Guardrails
- Honour the Aggregation-Only Contract where applicable (see ../../ingestion/aggregation-only-contract.md).
- Preserve determinism: sort outputs, normalise timestamps (UTC ISO-8601), and avoid machine-specific artefacts.
- Keep Offline Kit parity in mind—document air-gapped workflows for any new feature.
- Update runbooks/observability assets when operational characteristics change.


@@ -0,0 +1,38 @@
# StellaOps Scanner
Scanner analyses container images layer-by-layer, producing deterministic SBOM fragments, diffs, and signed reports.
## Responsibilities
- Expose APIs (WebService) for scan orchestration, diffing, and artifact retrieval.
- Run Worker analyzers for OS, language, and native ecosystems with restart-only plug-ins.
- Store SBOM fragments and artifacts in RustFS/object storage.
- Publish DSSE-ready metadata for Signer/Attestor and downstream policy evaluation.
## Key components
- `StellaOps.Scanner.WebService` – minimal API host.
- `StellaOps.Scanner.Worker` – analyzer executor.
- Analyzer libraries under `StellaOps.Scanner.Analyzers.*`.
## Integrations & dependencies
- Scheduler for job intake and retries.
- Policy Engine for evidence handoff.
- Export Center / Offline Kit for artifact packaging.
## Operational notes
- CAS caches, bounded retries, DSSE integration.
- Monitoring dashboards (see ./operations/analyzers-grafana-dashboard.json).
- RustFS migration playbook.
## Related resources
- ./operations/analyzers.md
- ./operations/analyzers-grafana-dashboard.json
- ./operations/rustfs-migration.md
- ./operations/entrypoint.md
## Backlog references
- DOCS-SCANNER updates tracked in ../../TASKS.md.
- Analyzer parity work in src/Scanner/**/TASKS.md.
## Epic alignment
- **Epic 6 Vulnerability Explorer:** provide policy-aware scan outputs, explain traces, and findings ledger hooks for triage workflows.
- **Epic 10 Export Center:** generate export-ready artefacts, manifests, and DSSE metadata for bundles.


@@ -0,0 +1,9 @@
# Task board — Scanner
> Local tasks should link back to ./AGENTS.md and mirror status updates into ../../TASKS.md when applicable.
| ID | Status | Owner(s) | Description | Notes |
|----|--------|----------|-------------|-------|
| SCANNER-DOCS-0001 | DOING (2025-10-29) | Docs Guild | Validate that ./README.md aligns with the latest release notes. | See ./AGENTS.md |
| SCANNER-OPS-0001 | TODO | Ops Guild | Review runbooks/observability assets after next sprint demo. | Sync outcomes back to ../../TASKS.md |
| SCANNER-ENG-0001 | TODO | Module Team | Cross-check implementation plan milestones against ../../implplan/SPRINTS.md. | Update status via ./AGENTS.md workflow |


@@ -0,0 +1,489 @@
# component_architecture_scanner.md — **StellaOps Scanner** (2025Q4)
> Aligned with Epic 6 Vulnerability Explorer and Epic 10 Export Center.
> **Scope.** Implementation-ready architecture for the **Scanner** subsystem: WebService, Workers, analyzers, SBOM assembly (inventory & usage), per-layer caching, three-way diffs, artifact catalog (RustFS default + Mongo, S3-compatible fallback), attestation handoff, and scale/security posture. This document is the contract between the scanning plane and everything else (Policy, Excititor, Concelier, UI, CLI).
---
## 0) Mission & boundaries
**Mission.** Produce **deterministic**, **explainable** SBOMs and diffs for container images and filesystems, quickly and repeatedly, without guessing. Emit two views: **Inventory** (everything present) and **Usage** (entrypoint closure + actually linked libs). Attach attestations through **Signer→Attestor→Rekor v2**.
**Boundaries.**
* Scanner **does not** produce PASS/FAIL. The backend (Policy + Excititor + Concelier) decides presentation and verdicts.
* Scanner **does not** keep third-party SBOM warehouses. It may **bind** to existing attestations for exact hashes.
* Core analyzers are **deterministic** (no fuzzy identity). Optional heuristic plugins (e.g., patch-presence) run under explicit flags and never contaminate the core SBOM.
---
## 1) Solution & project layout
```
src/
 ├─ StellaOps.Scanner.WebService/      # REST control plane, catalog, diff, exports
 ├─ StellaOps.Scanner.Worker/          # queue consumer; executes analyzers
 ├─ StellaOps.Scanner.Models/          # DTOs, evidence, graph nodes, CDX/SPDX adapters
 ├─ StellaOps.Scanner.Storage/         # Mongo repositories; RustFS object client (default) + S3 fallback; ILM/GC
 ├─ StellaOps.Scanner.Queue/           # queue abstraction (Redis/NATS/RabbitMQ)
 ├─ StellaOps.Scanner.Cache/           # layer cache; file CAS; bloom/bitmap indexes
 ├─ StellaOps.Scanner.EntryTrace/      # ENTRYPOINT/CMD → terminal program resolver (shell AST)
 ├─ StellaOps.Scanner.Analyzers.OS.[Apk|Dpkg|Rpm]/
 ├─ StellaOps.Scanner.Analyzers.Lang.[Java|Node|Python|Go|DotNet|Rust]/
 ├─ StellaOps.Scanner.Analyzers.Native.[ELF|PE|MachO]/   # PE/Mach-O planned (M2)
 ├─ StellaOps.Scanner.Emit.CDX/        # CycloneDX (JSON + Protobuf)
 ├─ StellaOps.Scanner.Emit.SPDX/       # SPDX 3.0.1 JSON
 ├─ StellaOps.Scanner.Diff/            # image→layer→component three-way diff
 ├─ StellaOps.Scanner.Index/           # BOMIndex sidecar (purls + roaring bitmaps)
 ├─ StellaOps.Scanner.Tests.*          # unit/integration/e2e fixtures
 └─ Tools/
    ├─ StellaOps.Scanner.Sbomer.BuildXPlugin/   # BuildKit generator (image referrer SBOMs)
    └─ StellaOps.Scanner.Sbomer.DockerImage/    # CLI-driven scanner container
```
Analyzer assemblies and buildx generators are packaged as **restart-time plug-ins** under `plugins/scanner/**` with manifests; services must restart to activate new plug-ins.
### 1.1 Queue backbone (Redis / NATS)
`StellaOps.Scanner.Queue` exposes a transport-agnostic contract (`IScanQueue`/`IScanQueueLease`) used by the WebService producer and Worker consumers. Sprint 9 introduces two first-party transports:
- **Redis Streams** (default). Uses consumer groups, deterministic idempotency keys (`scanner:jobs:idemp:*`), and supports lease claim (`XCLAIM`), renewal, exponential-backoff retries, and a `scanner:jobs:dead` stream for exhausted attempts.
- **NATS JetStream**. Provisions the `SCANNER_JOBS` work-queue stream + durable consumer `scanner-workers`, publishes with `MsgId` for dedupe, applies backoff via `NAK` delays, and routes dead-lettered jobs to `SCANNER_JOBS_DEAD`.
Metrics are emitted via `Meter` counters (`scanner_queue_enqueued_total`, `scanner_queue_retry_total`, `scanner_queue_deadletter_total`), and `ScannerQueueHealthCheck` pings the active backend (Redis `PING`, NATS `PING`). Configuration is bound from `scanner.queue`:
```yaml
scanner:
  queue:
    kind: redis            # or nats
    redis:
      connectionString: "redis://queue:6379/0"
      streamName: "scanner:jobs"
    nats:
      url: "nats://queue:4222"
      stream: "SCANNER_JOBS"
      subject: "scanner.jobs"
      durableConsumer: "scanner-workers"
      deadLetterSubject: "scanner.jobs.dead"
    maxDeliveryAttempts: 5
    retryInitialBackoff: 00:00:05
    retryMaxBackoff: 00:02:00
```
The DI extension (`AddScannerQueue`) wires the selected transport, so future additions (e.g., RabbitMQ) only implement the same contract and register.
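For orientation, a minimal C# sketch of what that transport-agnostic contract could look like. Only the `IScanQueue`/`IScanQueueLease` names come from this document; the member signatures and the `ScanJob` shape are illustrative assumptions:
```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical member shapes for the documented IScanQueue/IScanQueueLease contract.
public sealed record ScanJob(string ScanId, string ImageDigest, bool Force);

public interface IScanQueue
{
    // Enqueue with a deterministic idempotency key so duplicate submissions
    // collapse onto one job (Redis: scanner:jobs:idemp:*; NATS: MsgId).
    ValueTask EnqueueAsync(ScanJob job, string idempotencyKey, CancellationToken ct = default);

    // Claim up to batchSize jobs for leaseDuration; unacknowledged jobs are
    // redelivered once the lease expires.
    ValueTask<IReadOnlyList<IScanQueueLease>> LeaseAsync(
        int batchSize, TimeSpan leaseDuration, CancellationToken ct = default);
}

public interface IScanQueueLease
{
    ScanJob Job { get; }
    int Attempt { get; }

    ValueTask RenewAsync(TimeSpan extension, CancellationToken ct = default);
    ValueTask AckAsync(CancellationToken ct = default);                       // success
    ValueTask RetryAsync(TimeSpan backoff, CancellationToken ct = default);   // exponential-backoff retry
    ValueTask DeadLetterAsync(string reason, CancellationToken ct = default); // route to the dead-letter stream
}
```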
**Runtime form-factor:** two deployables
* **Scanner.WebService** (stateless REST)
* **Scanner.Worker** (N replicas; queuedriven)
---
## 2) External dependencies
* **OCI registry** with **Referrers API** (discover attached SBOMs/signatures).
* **RustFS** (default, offline-first) for SBOM artifacts; optional S3/MinIO compatibility retained for migration; **Object Lock** semantics emulated via retention headers; **ILM** for TTL.
* **MongoDB** for catalog, job state, diffs, ILM rules.
* **Queue** (Redis Streams/NATS/RabbitMQ).
* **Authority** (on-prem OIDC) for **OpToks** (DPoP/mTLS).
* **Signer** + **Attestor** (+ **Fulcio/KMS** + **Rekor v2**) for DSSE + transparency.
---
## 3) Contracts & data model
### 3.1 Evidence-first component model
**Nodes**
* `Image`, `Layer`, `File`
* `Component` (`purl?`, `name`, `version?`, `type`, `id` — may be `bin:{sha256}`)
* `Executable` (ELF/PE/Mach-O), `Library` (native or managed), `EntryScript` (shell/launcher)
**Edges** (all carry **Evidence**)
* `contains(Image|Layer → File)`
* `installs(PackageDB → Component)` (OS database row)
* `declares(InstalledMetadata → Component)` (dist-info, pom.properties, deps.json…)
* `links_to(Executable → Library)` (ELF `DT_NEEDED`, PE imports)
* `calls(EntryScript → Program)` (file:line from shell AST)
* `attests(Rekor → Component|Image)` (SBOM/predicate binding)
* `bound_from_attestation(Component_attested → Component_observed)` (hash equality proof)
**Evidence**
```
{ source: enum, locator: (path|offset|line), sha256?, method: enum, timestamp }
```
No confidences. Either a fact is proven with listed mechanisms, or it is not claimed.
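Rendered as C# for concreteness (a sketch; the enum members shown are examples, not the exhaustive set):
```csharp
using System;

// Illustrative C# shape of the evidence tuple above; enum members are examples.
public enum EvidenceSource { OsPackageDb, InstalledMetadata, Linker, ShellAst, Attestation }
public enum EvidenceMethod { ExactParse, HashMatch, LinkResolution, AstTrace }

public sealed record Evidence(
    EvidenceSource Source,
    string Locator,           // path, byte offset, or file:line
    string? Sha256,           // content hash when the fact is hash-backed
    EvidenceMethod Method,
    DateTimeOffset Timestamp  // normalised to UTC on emit
);
```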
### 3.2 Catalog schema (Mongo)
* `artifacts`
```
{ _id, type: layer-bom|image-bom|diff|index,
  format: cdx-json|cdx-pb|spdx-json,
  bytesSha256, size, rekor: { uuid, index, url }?,
  ttlClass, immutable, refCount, createdAt }
```
* `images { imageDigest, repo, tag?, arch, createdAt, lastSeen }`
* `layers { layerDigest, mediaType, size, createdAt, lastSeen }`
* `links { fromType, fromDigest, artifactId }` // image/layer -> artifact
* `jobs { _id, kind, args, state, startedAt, heartbeatAt, endedAt, error }`
* `lifecycleRules { ruleId, scope, ttlDays, retainIfReferenced, immutable }`
### 3.3 Object store layout (RustFS)
```
layers/<sha256>/sbom.cdx.json.zst
layers/<sha256>/sbom.spdx.json.zst
images/<imgDigest>/inventory.cdx.pb # CycloneDX Protobuf
images/<imgDigest>/usage.cdx.pb
indexes/<imgDigest>/bom-index.bin # purls + roaring bitmaps
diffs/<old>_<new>/diff.json.zst
attest/<artifactSha256>.dsse.json # DSSE bundle (cert chain + Rekor proof)
```
RustFS exposes a deterministic HTTP API (`PUT|GET|DELETE /api/v1/buckets/{bucket}/objects/{key}`).
Scanner clients tag immutable uploads with `X-RustFS-Immutable: true` and, when retention applies,
`X-RustFS-Retain-Seconds: <ttlSeconds>`. Additional headers can be injected via
`scanner.artifactStore.headers` to support custom auth or proxy requirements. Legacy MinIO/S3
deployments remain supported by setting `scanner.artifactStore.driver = "s3"` during phased
migrations.
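A minimal upload sketch against the documented endpoint and headers (bucket/key values and error handling are illustrative):
```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Sketch: immutable RustFS upload via the documented HTTP API and headers.
public static class RustFsClientSketch
{
    public static async Task PutImmutableAsync(
        HttpClient http, string bucket, string key, byte[] payload, TimeSpan? retain = null)
    {
        // Keys may contain path separators (e.g. layers/<sha256>/sbom.cdx.json.zst).
        using var request = new HttpRequestMessage(
            HttpMethod.Put, $"/api/v1/buckets/{bucket}/objects/{key}")
        {
            Content = new ByteArrayContent(payload)
        };
        request.Headers.Add("X-RustFS-Immutable", "true");
        if (retain is { } ttl)
            request.Headers.Add("X-RustFS-Retain-Seconds", ((long)ttl.TotalSeconds).ToString());

        using var response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode();
    }
}
```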
---
## 4) REST API (Scanner.WebService)
All under `/api/v1/scanner`. Auth: **OpTok** (DPoP/mTLS); RBAC scopes.
```
POST /scans { imageRef|digest, force?:bool } → { scanId }
GET /scans/{id} → { status, imageDigest, artifacts[], rekor? }
GET /sboms/{imageDigest} ?format=cdx-json|cdx-pb|spdx-json&view=inventory|usage → bytes
GET /diff?old=<digest>&new=<digest>&view=inventory|usage → diff.json
POST /exports { imageDigest, format, view, attest?:bool } → { artifactId, rekor? }
POST /reports { imageDigest, policyRevision? } → { reportId, rekor? } # delegates to backend policy+vex
GET /catalog/artifacts/{id} → { meta }
GET /healthz | /readyz | /metrics
```
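For illustration, a hypothetical client flow against these endpoints: submit a scan by digest, then poll until it settles. The DTO shapes and status vocabulary are assumptions, not the published contract:
```csharp
using System;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

// Assumed DTOs mirroring the endpoint summaries above.
public sealed record ScanRequest(string Digest, bool Force = false);
public sealed record ScanSubmitted(string ScanId);
public sealed record ScanStatus(string Status, string ImageDigest, string[] Artifacts);

public static class ScannerApiSketch
{
    public static async Task<ScanStatus> ScanAsync(HttpClient http, string digest)
    {
        var submit = await http.PostAsJsonAsync("/api/v1/scanner/scans", new ScanRequest(digest));
        submit.EnsureSuccessStatusCode();
        var scan = await submit.Content.ReadFromJsonAsync<ScanSubmitted>();

        ScanStatus? status;
        do
        {
            await Task.Delay(TimeSpan.FromSeconds(2));
            status = await http.GetFromJsonAsync<ScanStatus>($"/api/v1/scanner/scans/{scan!.ScanId}");
        } while (status!.Status is "pending" or "running"); // status values assumed

        return status;
    }
}
```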
### Report events
When `scanner.events.enabled = true`, the WebService serialises the signed report (canonical JSON + DSSE envelope) with `NotifyCanonicalJsonSerializer` and publishes two Redis Stream entries (`scanner.report.ready`, `scanner.scan.completed`) to the configured stream (default `stella.events`). The stream fields carry the whole envelope plus lightweight headers (`kind`, `tenant`, `ts`) so Notify and UI timelines can consume the event bus without recomputing signatures. Publish timeouts and bounded stream length are controlled via `scanner:events:publishTimeoutSeconds` and `scanner:events:maxStreamLength`. If the queue driver is already Redis and no explicit events DSN is provided, the host reuses the queue connection and auto-enables event emission so deployments get live envelopes without extra wiring. Compose/Helm bundles expose the same knobs via the `SCANNER__EVENTS__*` environment variables for quick tuning.
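A consumer tailing those stream entries might look like the following sketch (StackExchange.Redis; the envelope field name is not specified here, so only the documented `kind`/`tenant`/`ts` headers are read):
```csharp
using System;
using System.Threading.Tasks;
using StackExchange.Redis;

// Sketch of a consumer tailing the scanner event stream at its default name.
public static class ReportEventTailSketch
{
    public static async Task TailAsync(IConnectionMultiplexer redis)
    {
        var db = redis.GetDatabase();
        RedisValue lastId = "0-0"; // replay from the beginning, then tail
        while (true)
        {
            var entries = await db.StreamReadAsync("stella.events", lastId, count: 32);
            foreach (var entry in entries)
            {
                lastId = entry.Id;
                // Lightweight headers let consumers filter without re-verifying DSSE.
                var kind = entry["kind"];     // scanner.report.ready | scanner.scan.completed
                var tenant = entry["tenant"];
                var ts = entry["ts"];
                Console.WriteLine($"{kind} tenant={tenant} ts={ts}");
            }
            if (entries.Length == 0)
                await Task.Delay(TimeSpan.FromMilliseconds(500)); // XREAD here is non-blocking
        }
    }
}
```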
---
## 5) Execution flow (Worker)
### 5.1 Acquire & verify
1. **Resolve image** (prefer `repo@sha256:`).
2. **(Optional) verify image signature** per policy (cosign).
3. **Pull blobs**, compute layer digests; record metadata.
### 5.2 Layer union FS
* Apply whiteouts; materialize final filesystem; map **file → first introducing layer**.
* Windows layers (MSI/SxS/GAC) planned in **M2**.
### 5.3 Evidence harvest (parallel analyzers; deterministic only)
**A) OS packages**
* **apk**: `/lib/apk/db/installed`
* **dpkg**: `/var/lib/dpkg/status`, `/var/lib/dpkg/info/*.list`
* **rpm**: `/var/lib/rpm/Packages` (via librpm or parser)
* Record `name`, `version` (epoch/revision), `arch`, source package where present, and **declared file lists**.
> **Data flow note:** Each OS analyzer now writes its canonical output into the shared `ScanAnalysisStore` under
> `analysis.os.packages` (raw results), `analysis.os.fragments` (per-analyzer layer fragments), and contributes to
> `analysis.layers.fragments` (the aggregated view consumed by emit/diff pipelines). Helpers in
> `ScanAnalysisCompositionBuilder` convert these fragments into SBOM composition requests and component graphs so the
> diff/emit stages no longer reach back into individual analyzer implementations.
**B) Language ecosystems (installed state only)**
* **Java**: `META-INF/maven/*/pom.properties`, MANIFEST → `pkg:maven/...`
* **Node**: `node_modules/**/package.json` → `pkg:npm/...`
* **Python**: `*.dist-info/{METADATA,RECORD}` → `pkg:pypi/...`
* **Go**: Go **buildinfo** in binaries → `pkg:golang/...`
* **.NET**: `*.deps.json` + assembly metadata → `pkg:nuget/...`
* **Rust**: crates only when **explicitly present** (embedded metadata or cargo/registry traces); otherwise binaries reported as `bin:{sha256}`.
> **Rule:** We only report components proven **on disk** with authoritative metadata. Lockfiles are evidence only.
**C) Native link graph**
* **ELF**: parse `PT_INTERP`, `DT_NEEDED`, RPATH/RUNPATH, **GNU symbol versions**; map **SONAMEs** to file paths; link executables → libs.
* **PE/Mach-O** (planned M2): import table, delay-load imports; version resources; code signatures.
* Map libs back to **OS packages** if possible (via file lists); else emit `bin:{sha256}` components.
* The exported metadata (`stellaops.os.*` properties, license list, source package) feeds policy scoring and export pipelines directly: Policy evaluates quiet rules against package provenance, while Exporters forward the enriched fields into downstream JSON/Trivy payloads.
**D) EntryTrace (ENTRYPOINT/CMD → terminal program)**
* Read image config; parse shell (POSIX/Bash subset) with AST: `source`/`.` includes; `case/if`; `exec`/`command`; `run-parts`.
* Resolve commands via **PATH** within the **built rootfs**; follow language launchers (Java/Node/Python) to identify the terminal program (ELF/JAR/venv script).
* Record **file:line** and choices for each hop; output chain graph.
* Unresolvable dynamic constructs are recorded as **unknown** edges with reasons (e.g., `$FOO` unresolved).
**E) Attestation & SBOM bind (optional)**
* For each **file hash** or **binary hash**, query local cache of **Rekor v2** indices; if an SBOM attestation is found for **exact hash**, bind it to the component (origin=`attested`).
* For the **image** digest, likewise bind SBOM attestations (build-time referrers).
### 5.4 Component normalization (exact only)
* Create `Component` nodes only with deterministic identities: purl, or **`bin:{sha256}`** for unlabeled binaries.
* Record **origin** (OS DB, installed metadata, linker, attestation).
### 5.5 SBOM assembly & emit
* **Per-layer SBOM fragments**: components introduced by the layer (+ relationships).
* **Image SBOMs**: merge fragments; refer back to them via **CycloneDX BOMLink** (or SPDX ExternalRef).
* Emit both **Inventory** & **Usage** views.
* When the native analyzer reports an ELF `buildId`, attach it to component metadata and surface it as `stellaops:buildId` in CycloneDX properties (and diff metadata). This keeps SBOM/diff output in lockstep with runtime events and the debug-store manifest.
* Serialize **CycloneDX JSON** and **CycloneDX Protobuf**; optionally **SPDX 3.0.1 JSON**.
* Build **BOMIndex** sidecar: purl table + roaring bitmap; flag `usedByEntrypoint` components for fast backend joins.
The emitted `buildId` metadata is preserved in component hashes, diff payloads, and `/policy/runtime` responses so operators can pivot from SBOM entries → runtime events → `debug/.build-id/<aa>/<rest>.debug` within the Offline Kit or release bundle.
### 5.6 DSSE attestation (via Signer/Attestor)
* WebService constructs **predicate** with `image_digest`, `stellaops_version`, `license_id`, `policy_digest?` (when emitting **final reports**), timestamps.
* Calls **Signer** (requires **OpTok + PoE**); Signer verifies **entitlement + scanner image integrity** and returns **DSSE bundle**.
* **Attestor** logs to **Rekor v2**; returns `{uuid,index,proof}` → stored in `artifacts.rekor`.
---
## 6) Threeway diff (image → layer → component)
### 6.1 Keys & classification
* Component key: **purl** when present; else `bin:{sha256}`.
* Diff classes: `added`, `removed`, `version_changed` (`upgraded|downgraded`), `metadata_changed` (e.g., origin from attestation vs observed).
* Layer attribution: for each change, resolve the **introducing/removing layer**.
### 6.2 Algorithm (outline)
```
A = components(imageOld, key)
B = components(imageNew, key)
added    = B \ A
removed  = A \ B
changed  = { k in A∩B : version(A[k]) != version(B[k]) || origin changed }
for each item in added/removed/changed:
    layer     = attribute_to_layer(item, imageOld|imageNew)
    usageFlag = usedByEntrypoint(item, imageNew)
emit diff.json (grouped by layer with badges)
```
Diffs are stored as artifacts and feed **UI** and **CLI**.
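A direct C# rendering of the outline, with placeholder model types; the deterministic ordering at the end mirrors the scanner's byte-stable artifact requirement:
```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// ComponentRecord is a placeholder for the real component model.
public sealed record ComponentRecord(string Key, string? Version, string Origin);
public sealed record DiffEntry(string Key, string Class, string? OldVersion, string? NewVersion);

public static class ComponentDiffSketch
{
    public static IReadOnlyList<DiffEntry> Diff(
        IReadOnlyDictionary<string, ComponentRecord> a,   // old image
        IReadOnlyDictionary<string, ComponentRecord> b)   // new image
    {
        var entries = new List<DiffEntry>();

        foreach (var key in b.Keys.Except(a.Keys))
            entries.Add(new DiffEntry(key, "added", null, b[key].Version));

        foreach (var key in a.Keys.Except(b.Keys))
            entries.Add(new DiffEntry(key, "removed", a[key].Version, null));

        foreach (var key in a.Keys.Intersect(b.Keys))
        {
            var (oldC, newC) = (a[key], b[key]);
            if (oldC.Version != newC.Version)
                entries.Add(new DiffEntry(key, "version_changed", oldC.Version, newC.Version));
            else if (oldC.Origin != newC.Origin)
                entries.Add(new DiffEntry(key, "metadata_changed", oldC.Version, newC.Version));
        }

        // Deterministic ordering keeps diff artifacts byte-stable across runs.
        return entries.OrderBy(e => e.Key, StringComparer.Ordinal).ToList();
    }
}
```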
---
## 7) Build-time SBOMs (fast CI path)
**Scanner.Sbomer.BuildXPlugin** can act as a BuildKit **generator**:
* During `docker buildx build --attest=type=sbom,generator=stellaops/sbom-indexer`, run analyzers on the build context/output; attach SBOMs as OCI **referrers** to the built image.
* Optionally request **Signer/Attestor** to produce a **StellaOps-verified** attestation immediately; else, Scanner.WebService can verify and re-attest post-push.
* Scanner.WebService trusts build-time SBOMs per policy, enabling **no-rescan** for unchanged bases.
---
## 8) Configuration (YAML)
```yaml
scanner:
  queue:
    kind: redis
    url: "redis://queue:6379/0"
  mongo:
    uri: "mongodb://mongo/scanner"
  s3:
    endpoint: "http://minio:9000"
    bucket: "stellaops"
    objectLock: "governance"   # or 'compliance'
  analyzers:
    os:   { apk: true, dpkg: true, rpm: true }
    lang: { java: true, node: true, python: true, go: true, dotnet: true, rust: true }
    native: { elf: true, pe: false, macho: false }   # PE/Mach-O in M2
    entryTrace: { enabled: true, shellMaxDepth: 64, followRunParts: true }
  emit:
    cdx:  { json: true, protobuf: true }
    spdx: { json: true }
    compress: "zstd"
  rekor:
    url: "https://rekor-v2.internal"
  signer:
    url: "https://signer.internal"
  limits:
    maxParallel: 8
    perRegistryConcurrency: 2
  policyHints:
    verifyImageSignature: false
    trustBuildTimeSboms: true
```
---
## 9) Scale & performance
* **Parallelism**: per-analyzer concurrency; bounded directory walkers; file CAS dedupe by sha256.
* **Distributed locks** per **layer digest** to prevent duplicate work across Workers.
* **Registry throttles**: per-host concurrency budgets; exponential backoff on 429/5xx.
* **Targets**:
* **Build-time**: P95 ≤3–5s on warmed bases (CI generator).
* **Post-build delta**: P95 ≤10s for 200MB images with cache hit.
* **Emit**: CycloneDX Protobuf ≤150ms for 5k components; JSON ≤500ms.
* **Diff**: ≤200ms for 5k vs 5k components.
---
## 10) Security posture
* **AuthN**: Authority-issued short-lived OpToks (DPoP/mTLS).
* **AuthZ**: scopes (`scanner.scan`, `scanner.export`, `scanner.catalog.read`).
* **mTLS** to **Signer**/**Attestor**; only **Signer** can sign.
* **No network fetches** during analysis (except registry pulls and optional Rekor index reads).
* **Sandboxing**: non-root containers; read-only FS; seccomp profiles; disable execution of scanned content.
* **Release integrity**: all first-party images are **cosign-signed**; Workers/WebService self-verify at startup.
---
## 11) Observability & audit
* **Metrics**:
* `scanner.jobs_inflight`, `scanner.scan_latency_seconds`
* `scanner.layer_cache_hits_total`, `scanner.file_cas_hits_total`
* `scanner.artifact_bytes_total{format}`
* `scanner.attestation_latency_seconds`, `scanner.rekor_failures_total`
* `scanner_analyzer_golang_heuristic_total{indicator,version_hint}` — increments whenever the Go analyzer falls back to heuristics (build-id or runtime markers). Grafana panel: `sum by (indicator) (rate(scanner_analyzer_golang_heuristic_total[5m]))`; alert when the rate is ≥1 for 15 minutes to highlight unexpected stripped binaries.
* **Tracing**: spans for acquire→union→analyzers→compose→emit→sign→log.
* **Audit logs**: DSSE requests log `license_id`, `image_digest`, `artifactSha256`, `policy_digest?`, Rekor UUID on success.
---
## 12) Testing matrix
* **Determinism:** given same image + analyzers → byte-identical **CDX Protobuf**; JSON normalized.
* **OS packages:** ground-truth images per distro; compare to package DB.
* **Lang ecosystems:** sample images per ecosystem (Java/Node/Python/Go/.NET/Rust) with installed metadata; negative tests w/ lockfile-only.
* **Native & EntryTrace:** ELF graph correctness; shell AST cases (includes, run-parts, exec, case/if).
* **Diff:** layer attribution against synthetic two-image sequences.
* **Performance:** cold vs warm cache; large `node_modules` and `site-packages`.
* **Security:** ensure no code execution from image; fuzz parser inputs; path traversal resistance on layer extract.
---
## 13) Failure modes & degradations
* **Missing OS DB** (files exist, DB removed): record **files**; do **not** fabricate package components; emit `bin:{sha256}` where unavoidable; flag in evidence.
* **Unreadable metadata** (corrupt dist-info): record file evidence; skip component creation; annotate.
* **Dynamic shell constructs**: mark unresolved edges with reasons (env var unknown) and continue; **Usage** view may be partial.
* **Registry rate limits**: honor backoff; queue job retries with jitter.
* **Signer refusal** (license/plan/version): scan completes; artifact produced; **no attestation**; WebService marks result as **unverified**.
---
## 14) Optional plugins (off by default)
* **Patch-presence detector** (signature-based backport checks). Reads curated function-level signatures from advisories; inspects binaries for patched code snippets to lower false positives for backported fixes. Runs as a sidecar analyzer that **annotates** components; never overrides core identities.
* **Runtime probes** (with Zastava): when allowed, compare **/proc/<pid>/maps** (DSOs actually loaded) with static **Usage** view for precision.
---
## 15) DevOps & operations
* **HA**: WebService horizontal scale; Workers autoscale by queue depth & CPU; distributed locks on layers.
* **Retention**: ILM rules per artifact class (`short`, `default`, `compliance`); **Object Lock** for compliance artifacts (reports, signed SBOMs).
* **Upgrades**: bump **cache schema** when analyzer outputs change; WebService triggers refresh of dependent artifacts.
* **Backups**: Mongo (daily dumps); RustFS snapshots (filesystem-level rsync/ZFS) or S3 versioning when legacy driver enabled; Rekor v2 DB snapshots.
---
## 16) CLI & UI touch points
* **CLI**: `stellaops scan <ref>`, `stellaops diff --old --new`, `stellaops export`, `stellaops verify attestation <bundle|url>`.
* **UI**: Scan detail shows **Inventory/Usage** toggles, **Diff by Layer**, **Attestation badge** (verified/unverified), Rekor link, and **EntryTrace** chain with file:line breadcrumbs.
---
## 17) Roadmap (Scanner)
* **M2**: Windows containers (MSI/SxS/GAC analyzers), PE/MachO native analyzer, deeper Rust metadata.
* **M2**: Buildx generator GA (certified external registries), cross-registry trust policies.
* **M3**: Patch-presence plugin GA (opt-in), cross-image corpus clustering (evidence-only; not identity).
* **M3**: Advanced EntryTrace (POSIX shell features breadth, busybox detection).
---
### Appendix A — EntryTrace resolution (pseudo)
```
ResolveEntrypoint(ImageConfig cfg, RootFs fs):
    cmd = Normalize(cfg.ENTRYPOINT, cfg.CMD)
    stack = [ Script(cmd, path=FindOnPath(cmd[0], fs)) ]
    visited = set()
    while stack not empty and depth < MAX:
        cur = stack.pop()
        if cur in visited: continue
        visited.add(cur)
        if IsShellScript(cur.path):
            ast = ParseShell(cur.path)
            foreach directive in ast:
                if directive is Source include:
                    p = ResolveInclude(include.path, cur.env, fs)
                    stack.push(Script(p))
                if directive is Exec call:
                    p = ResolveExec(call.argv[0], cur.env, fs)
                    stack.push(Program(p, argv=call.argv))
                if directive is Interpreter (python -m / node / java -jar):
                    term = ResolveInterpreterTarget(call, fs)
                    stack.push(Program(term))
        else:
            return Terminal(cur.path)
    return Unknown(reason)
```
### Appendix A.1 — EntryTrace Explainability
EntryTrace emits structured diagnostics and metrics so operators can quickly understand why resolution succeeded or degraded:
| Reason | Description | Typical Mitigation |
|--------|-------------|--------------------|
| `CommandNotFound` | A command referenced in the script cannot be located in the layered root filesystem or `PATH`. | Ensure binaries exist in the image or extend `PATH` hints. |
| `MissingFile` | `source`/`.`/`run-parts` targets are missing. | Bundle the script or guard the include. |
| `DynamicEnvironmentReference` | Path depends on `$VARS` that are unknown at scan time. | Provide defaults via scan metadata or accept partial usage. |
| `RecursionLimitReached` | Nested includes exceeded the analyzer depth limit (default 64). | Flatten indirection or increase the limit in options. |
| `RunPartsEmpty` | `run-parts` directory contained no executable entries. | Remove empty directories or ignore if intentional. |
| `JarNotFound` / `ModuleNotFound` | Java/Python targets missing, preventing interpreter tracing. | Ship the jar/module with the image or adjust the launcher. |
Diagnostics drive two metrics published by `EntryTraceMetrics`:
- `entrytrace_resolutions_total{outcome}` — resolution attempts segmented by outcome (`resolved`, `partially-resolved`, `unresolved`).
- `entrytrace_unresolved_total{reason}` — diagnostic counts keyed by reason.
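A sketch of how those counters could be emitted with `System.Diagnostics.Metrics`; the meter name and method shapes are assumptions:
```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Illustrative shape of the EntryTraceMetrics counters named above.
public static class EntryTraceMetricsSketch
{
    private static readonly Meter Meter = new("StellaOps.Scanner.EntryTrace");

    private static readonly Counter<long> Resolutions =
        Meter.CreateCounter<long>("entrytrace_resolutions_total");
    private static readonly Counter<long> Unresolved =
        Meter.CreateCounter<long>("entrytrace_unresolved_total");

    public static void RecordResolution(string outcome) =>
        Resolutions.Add(1, new KeyValuePair<string, object?>("outcome", outcome));

    public static void RecordUnresolved(string reason) =>
        Unresolved.Add(1, new KeyValuePair<string, object?>("reason", reason));
}
```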
Structured logs include `entrytrace.path`, `entrytrace.command`, `entrytrace.reason`, and `entrytrace.depth`, all correlated with scan/job IDs. Timestamps are normalized to UTC (microsecond precision) to keep DSSE attestations and UI traces explainable.
### Appendix B — BOMIndex sidecar
```
struct Header { magic, version, imageDigest, createdAt }
vector<string> purls
map<purlIndex, roaring_bitmap> components
optional map<purlIndex, roaring_bitmap> usedByEntrypoint
```
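Conceptually, a lookup resolves the purl to its table index and then probes the corresponding bitmap (e.g., to answer whether any matching component is in the entrypoint closure). A sketch, with the roaring bitmap type abstracted since the concrete implementation is not specified here:
```csharp
using System.Collections.Generic;

// IBitmap stands in for the roaring bitmap implementation.
public interface IBitmap { bool IsEmpty { get; } }

public sealed class BomIndexSketch
{
    public required IReadOnlyList<string> Purls { get; init; }
    public required IReadOnlyDictionary<int, IBitmap> Components { get; init; }
    public IReadOnlyDictionary<int, IBitmap>? UsedByEntrypoint { get; init; }

    public bool IsUsedByEntrypoint(string purl)
    {
        for (var i = 0; i < Purls.Count; i++)
        {
            if (Purls[i] != purl) continue;
            return UsedByEntrypoint is not null
                && UsedByEntrypoint.TryGetValue(i, out var bits)
                && !bits.IsEmpty;
        }
        return false;
    }
}
```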


@@ -0,0 +1,64 @@
# Implementation plan — Scanner
## Delivery phases
- **Phase 1 – Control plane & job queue**
  Finalise Scanner WebService, queue abstraction (Redis/NATS), job leasing, CAS layer cache, artifact catalog, and API endpoints.
- **Phase 2 – Analyzer parity & SBOM assembly**
  Implement OS/Lang/Native analyzers, inventory/usage SBOM views, entry trace resolution, deterministic component identity.
- **Phase 3 – Diff & attestations**
  Deliver three-way diff engine, DSSE SBOM/report signing pipeline, attestation hand-off (Signer→Attestor), metadata for Export Center.
- **Phase 4 – Integrations & exports**
  Integrate with Policy Engine, Vuln Explorer, Export Center, CLI/Console; provide buildx plugin, CLI commands, and offline scanning support.
- **Phase 5 – Observability & resilience**
  Metrics/logs/traces, queue backpressure handling, cache eviction, runbooks, smoke tests, SLO dashboards.
## Work breakdown
- **Control plane**
- REST API for scan requests, diff, catalog listing, artifact retrieval.
- Queue service with idempotency, retries, dead-letter handling; worker scaling.
- CAS storage (RustFS + S3 fallback), GC, ILM policies, offline mode.
- **Analyzers**
- OS (apk/dpkg/rpm), language (Java/Node/Python/Go/DotNet/Rust), native (ELF/PE/MachO).
- Deterministic metadata (purl, version, source location), heuristics optional under flags.
- Entry trace/usage analysis, dependency resolution, license detection.
- **SBOM & diff**
- Inventory/usage SBOM assembly, CycloneDX/SPDX emitters, schema validation.
- Three-way diff (base, target, runtime), evidence linking, JSON export.
- **Attestation & export**
- DSSE bundle signing, attestation metadata for Signer/Attestor, provenance summary.
- Export Center integration (SBOM/diff artifacts, manifests), CLI builder plugin (buildx).
- **CLI/Console**
- CLI commands `stella scan`, `stella sbom diff`, `stella sbom export`, offline caching.
- Console flows for scan requests, diff viewer, SBOM downloads, attestation status.
- **Observability & ops**
- Metrics (queue depth, scan latency, cache hit/miss, analyzer timing), logs/traces with job IDs.
- Alerts for backlog, failed scans, attestation issues, storage pressure.
- Runbooks for stuck jobs, cache corruption, analyzer regressions, offline mode.
## Acceptance criteria
- Scans produce deterministic SBOM inventory/usage views with component identity stability and reproducible diffs.
- Queue/worker pipeline handles retries, backpressure, offline kits, and exports DSSE attestations for Signer/Attestor.
- Export Center consumes SBOM/diff artifacts; Vuln Explorer receives metadata and explain traces.
- CLI/Console parity for scan submission, diffing, exports, attestation verification.
- Observability dashboards cover queue health, analyzer success rates, performance; alerts fire on SLO breaches.
- Offline scanning (air-gapped) supported with local caches and manifest verification.
## Risks & mitigations
- **Analyzer drift/determinism:** golden fixtures, hash-based regression tests, deterministic sorting, strict identity rules.
- **Queue overload:** adaptive backpressure, scaling workers, dead-letter review, priority lanes.
- **Storage growth:** CAS dedupe, ILM policies, offline bundle pruning.
- **Attestation failures:** retry with backoff, attestation health checks, Notify integration.
- **Offline divergence:** packaging of analyzers/configs, manifest signatures, parity tests.
## Test strategy
- **Unit:** analyzer parsers, component identity, diff calculations, API validation.
- **Integration:** end-to-end scan/diff/attestation flows, Export Center integration, CLI automation.
- **Performance:** large images, concurrent scans, cache stress, queue throughput.
- **Determinism:** repeated scans/diffs across systems, hash comparisons, property tests.
- **Security:** RBAC, tenant isolation, attestation key handling, path sanitisation.
- **Offline:** air-gap scanning, manifest verification, CLI offline mode.
## Definition of done
- Scanner services, analyzers, diffing, attestation pipeline, exports, and observability delivered with runbooks and Offline Kit parity.
- Documentation (architecture, analyzer guides, CLI, offline mode, operations) updated with imposed rule statements.
- ./TASKS.md and ../../TASKS.md updated with progress; regression fixtures maintained in repo.


@@ -0,0 +1,155 @@
{
  "title": "StellaOps Scanner Analyzer Benchmarks",
  "uid": "scanner-analyzer-bench",
  "schemaVersion": 38,
  "version": 1,
  "editable": true,
  "timezone": "",
  "graphTooltip": 0,
  "time": { "from": "now-24h", "to": "now" },
  "templating": {
    "list": [
      {
        "name": "datasource",
        "type": "datasource",
        "query": "prometheus",
        "refresh": 1,
        "hide": 0,
        "current": {}
      }
    ]
  },
  "annotations": { "list": [] },
  "panels": [
    {
      "id": 1,
      "title": "Max Duration (ms)",
      "type": "timeseries",
      "datasource": { "type": "prometheus", "uid": "${datasource}" },
      "fieldConfig": {
        "defaults": { "unit": "ms", "displayName": "{{scenario}}" },
        "overrides": []
      },
      "options": {
        "legend": { "displayMode": "table", "placement": "bottom" },
        "tooltip": { "mode": "single", "sort": "none" }
      },
      "targets": [
        { "expr": "scanner_analyzer_bench_max_ms", "legendFormat": "{{scenario}}", "refId": "A" },
        { "expr": "scanner_analyzer_bench_baseline_max_ms", "legendFormat": "{{scenario}} baseline", "refId": "B" }
      ]
    },
    {
      "id": 2,
      "title": "Regression Ratio vs Limit",
      "type": "timeseries",
      "datasource": { "type": "prometheus", "uid": "${datasource}" },
      "fieldConfig": {
        "defaults": {
          "unit": "percent",
          "displayName": "{{scenario}}",
          "min": 0,
          "thresholds": {
            "mode": "absolute",
            "steps": [
              { "color": "green", "value": null },
              { "color": "red", "value": 20 }
            ]
          }
        },
        "overrides": []
      },
      "options": {
        "legend": { "displayMode": "table", "placement": "bottom" },
        "tooltip": { "mode": "multi", "sort": "none" }
      },
      "targets": [
        { "expr": "(scanner_analyzer_bench_regression_ratio - 1) * 100", "legendFormat": "{{scenario}} regression %", "refId": "A" },
        { "expr": "(scanner_analyzer_bench_regression_limit - 1) * 100", "legendFormat": "{{scenario}} limit %", "refId": "B" }
      ]
    },
    {
      "id": 3,
      "title": "Breached Scenarios",
      "type": "stat",
      "datasource": { "type": "prometheus", "uid": "${datasource}" },
      "fieldConfig": {
        "defaults": { "displayName": "{{scenario}}", "unit": "short" },
        "overrides": []
      },
      "options": {
        "colorMode": "value",
        "graphMode": "area",
        "justifyMode": "center",
        "reduceOptions": { "calcs": ["last"], "fields": "", "values": false }
      },
      "targets": [
        { "expr": "scanner_analyzer_bench_regression_breached", "legendFormat": "{{scenario}}", "refId": "A" }
      ]
    }
  ]
}


@@ -0,0 +1,48 @@
# Scanner Analyzer Benchmarks Operations Guide
## Purpose
Keep the language analyzer microbench under the <5s SBOM pledge. CI emits Prometheus metrics and JSON fixtures so trend dashboards and alerts stay in lockstep with the repository baseline.
> **Grafana note:** Import `docs/modules/scanner/operations/analyzers-grafana-dashboard.json` into your Prometheus-backed Grafana stack to monitor `scanner_analyzer_bench_*` metrics and alert on regressions.
## Publishing workflow
1. CI (or engineers running locally) execute:
```bash
dotnet run \
  --project src/Bench/StellaOps.Bench/Scanner.Analyzers/StellaOps.Bench.ScannerAnalyzers/StellaOps.Bench.ScannerAnalyzers.csproj \
  -- \
  --repo-root . \
  --out src/Bench/StellaOps.Bench/Scanner.Analyzers/baseline.csv \
  --json out/bench/scanner-analyzers/latest.json \
  --prom out/bench/scanner-analyzers/latest.prom \
  --commit "$(git rev-parse HEAD)" \
  --environment "${CI_ENVIRONMENT_NAME:-local}"
```
2. Publish the artefacts (`baseline.csv`, `latest.json`, `latest.prom`) to `bench-artifacts/<date>/`.
3. Promtail (or the CI job) pushes `latest.prom` into Prometheus; JSON lands in long-term storage for workbook snapshots.
4. The harness exits non-zero if:
- `max_ms` for any scenario breaches its configured threshold; or
- `max_ms` regresses ≥20% versus `baseline.csv`.
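The gate reduces to a simple predicate per scenario; a minimal C# sketch under the thresholds described above (types and names are illustrative):
```csharp
// Fail when a scenario exceeds its configured threshold or regresses >= 20%
// against the checked-in baseline.
public sealed record BenchResult(string Scenario, double MaxMs, double? ThresholdMs);

public static class RegressionGateSketch
{
    public static bool ShouldFail(BenchResult result, double baselineMaxMs, double limit = 1.20)
    {
        var breachesThreshold = result.ThresholdMs is { } threshold && result.MaxMs > threshold;
        var ratio = result.MaxMs / baselineMaxMs; // scanner_analyzer_bench_regression_ratio
        return breachesThreshold || ratio >= limit;
    }
}
```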
## Grafana dashboard
- Import `docs/modules/scanner/operations/analyzers-grafana-dashboard.json`.
- Point the template variable `datasource` to the Prometheus instance ingesting `scanner_analyzer_bench_*` metrics.
- Panels:
- **Max Duration (ms)** compares live runs vs baseline.
- **Regression Ratio vs Limit** plots `(max / baseline_max - 1) * 100`.
- **Breached Scenarios** stat panel sourced from `scanner_analyzer_bench_regression_breached`.
## Alerting & on-call response
- **Primary alert**: fire when `scanner_analyzer_bench_regression_ratio{scenario=~".+"} >= 1.20` for 2 consecutive samples (10 min default). Suggested PromQL:
```
max_over_time(scanner_analyzer_bench_regression_ratio[10m]) >= 1.20
```
- Suppress duplicates using the `scenario` label.
- Pager payload should include `scenario`, `max_ms`, `baseline_max_ms`, and `commit`.
- Immediate triage steps:
1. Check the `latest.json` artefact for the failing scenario; confirm commit and environment.
2. Re-run the harness with `--captured-at` and `--baseline` pointing at the last known good CSV to verify determinism.
3. If regression persists, open an incident ticket tagged `scanner-analyzer-perf` and page the owning language guild.
4. Roll back the offending change or update the baseline after sign-off from the guild lead and Perf captain.
Document the outcome in `docs/12_PERFORMANCE_WORKBOOK.md` (section 8) so trendlines reflect any accepted regressions.


@@ -0,0 +1,72 @@
# Entry-Point Dynamic Analysis
When we have access to a running container (e.g., during runtime posture checks), StellaOps augments the static inference with live signals. This document describes the Observational Exec Graph (OEG) that powers the dynamic mode.
## 1) Goals
- Capture the *actual* process tree and exec lineage after the container starts.
- Identify steady-state processes (long-lived, listening, non-wrapper) even when supervision stacks are present.
- Feed the same reduction and runtime-classification pipeline as the static analyser.
## 2) Observational Exec Graph (OEG)
### 2.1 Data sources
- **Tracepoints / eBPF**: `sched_process_exec`, `sched_process_fork/clone`, and corresponding exit events give us pid, ppid, namespace, binary path, and argv snapshots with minimal overhead.
- **/proc sampling**: for each tracked PID, capture `/proc/<pid>/{exe,cmdline,cwd}` and file descriptors (especially listening sockets).
- **Namespace mapping**: normalise host PIDs to container PIDs (`NStgid`) so the graph is stable across runtimes.
### 2.2 Graph model
```csharp
public sealed record ExecNode(int HostPid, int NsPid, int Ppid, string Exe, string[] Argv, long StartTicks);
public sealed record ExecEdge(int ParentHostPid, int ChildHostPid, string Kind); // "clone" | "exec"
```
- Nodes represent `exec()` events (post-exec image) and contain the final argv.
- Edges labelled `clone` capture forks; `exec` edges show program replacements.
### 2.3 Steady-state candidate selection
For each node compute features:
| Feature | Rationale |
| --- | --- |
| Lifetime (until sampling end) | Long-lived processes are more likely to be the real workload. |
| Additional execs downstream | Zero execs after start implies terminal. |
| Listening sockets | Owning `LISTEN` sockets strongly suggests a server. |
| Wrapper catalogue hit | Mark nodes that match known shims (`tini`, `gosu`, `supervisord`, etc.). |
| Children fan-out | Supervisors spawn multiple children and remain parents. |
Feed these into a scoring function; retain Top-K candidates (usually 1–3) along with evidence.
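One possible scoring function over those features (the weights are illustrative, not the shipped values; `ExecNode` is the record defined in §2.2):
```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Feature vector per exec node; field names mirror the table above.
public sealed record NodeFeatures(
    TimeSpan Lifetime, int DownstreamExecs, int ListeningSockets,
    bool KnownWrapper, int ChildFanOut);

public static class SteadyStateSketch
{
    public static double Score(NodeFeatures f) =>
        Math.Min(f.Lifetime.TotalSeconds, 10) * 0.5   // long-lived processes
        + (f.DownstreamExecs == 0 ? 3.0 : 0.0)        // terminal: no further execs
        + f.ListeningSockets * 4.0                    // owns LISTEN sockets
        + (f.KnownWrapper ? -5.0 : 0.0)               // shims are collapsed, not terminals
        + (f.ChildFanOut > 1 ? -2.0 : 0.0);           // supervisors stay parents

    public static IEnumerable<(ExecNode Node, double Score)> TopCandidates(
        IEnumerable<(ExecNode Node, NodeFeatures Features)> nodes, int k = 3) =>
        nodes.Select(n => (n.Node, Score: Score(n.Features)))
             .OrderByDescending(n => n.Score)
             .Take(k);
}
```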
## 3) Integration with static pipeline
1. For each steady-state candidate, snapshot the command/argv and normalise via `ResolvedCommand` (as in static mode).
2. Run wrapper reduction and ShellFlow analysis if the candidate is a script.
3. Invoke runtime detectors to classify the binary.
4. Merge dynamic evidence with static evidence. Conflicts drop confidence or trigger the “supervisor” classification.
## 4) Supervisors & multi-service containers
Some images (e.g., `supervisord`, `s6`, `runit`) intentionally start multiple long-lived processes. Handle them as follows:
- Detect supervisor binaries from the wrapper catalogue.
- Analyse their configuration (`/etc/supervisord.conf`, `/etc/services.d/*`, etc.) to enumerate child services statically.
- Emit multiple `TerminalProcess` entries with individual confidence scores but mark the parent as `type = supervisor`.
## 5) Operational hints
- Sampling window: 1–3 seconds after start is usually sufficient; extend in debug mode.
- Overhead: prefer eBPF/tracepoints; fall back to periodic `/proc` walks when instrumentation isn't available.
- Security: honour namespace boundaries; never inspect processes outside the target container's cgroup/namespace.
- Failure mode: if dynamic capture fails, fall back to static mode and flag evidence accordingly (`"Dynamic capture unavailable"`).
## 6) Deliverables
The dynamic reducer returns an `EntryTraceResult` populated with:
- `ExecGraph` containing nodes and edges for audit/debug.
- `Terminals` listing steady-state processes (possibly multiple).
- `Evidence` strings referencing dynamic signals (`"pid 47 listening on 0.0.0.0:8080"`, `"wrapper tini collapsed into /usr/local/bin/python"`).
Downstream modules (Policy, Vuln Explorer, Export Center) treat the result identically to static scans, enabling easy comparison between build-time and runtime observations.


@@ -0,0 +1,24 @@
# Entry-Point Runtime — C / C++
## Signals to gather
- Dynamically linked ELF (`.dynamic`) with GLIBC references (`GLIBC`, `GLIBCXX`, `libstdc++`).
- Presence of `/lib64/ld-linux-*.so.*` loaders.
- Absence of Go/Rust-specific markers.
- Native supervisor binaries (`nginx`, `envoy`, custom C services).
- Config files adjacent to the binary (`/etc/app.conf`, YAML/INI).
## Implementation notes
- Treat this detector as the "native fallback": confirm no higher-priority language matched.
- Collect shared library list to attach as evidence; highlight unusual dependencies.
- Inspect `EXPOSE` ports and config directories to aid classification.
- Normalise busybox-style symlinks (actual binary often `/bin/busybox` with applet name).
## Evidence & scoring
- Boost for ELF dynamic dependencies and loader presence.
- Add evidence for config files, service managers, or env variables.
- Penalise extremely small binaries without metadata (may be wrappers).
## Edge cases
- Static C binaries may look like Go; rely on build ID absence and library fingerprints.
- When binary is part of a supervisor stack (e.g., `s6-svscan`), delegate classification to `Supervisor`.
- Windows native services should be handled by PE analysis (`entrypoint-runtime-overview.md`).


@@ -0,0 +1,22 @@
# Entry-Point Runtime — Deno
## Signals to gather
- `argv0` equals `deno` or path ends with `/bin/deno`.
- Arguments include `run`, `task`, `serve`, or `compile` outputs.
- Presence of `deno.json` / `deno.jsonc`, `import_map.json`, or cached modules (`/deno-dir`).
- Environment (`DENO_DIR`, `DENO_AUTH_TOKENS`).
## Implementation notes
- Resolve script URLs or local files; for remote sources record the URL as evidence.
- Distinguish between `deno compile` executables and the Deno runtime invoking a script.
- Recognise `deno task <name>` by reading tasks from `deno.json`.
- ShellFlow should already collapse Docker official entrypoint (`/usr/bin/env deno task start`).
## Evidence & scoring
- Boost for confirmed script/URL and config file presence.
- Add evidence for permissions flags (`--allow-net`, `--allow-env`) to aid policy decisions.
- Penalise when only the binary is present without scripts.
## Edge cases
- Deno deploy shims or adapters may further wrap the runtime; rely on wrapper catalogue.
- When `deno compile` emits a standalone binary, treat it as C/C++ unless metadata persists.


@@ -0,0 +1,25 @@
# Entry-Point Runtime — .NET / C#
## Signals to gather
- Framework-dependent: `dotnet <app.dll>` invocation.
- Adjacent `*.runtimeconfig.json` (parse `tfm`, framework references, roll-forward).
- Self-contained or single-file apps: ELF/PE with `DOTNET_BUNDLE`, `System.Private.CoreLib`, or `coreclr` markers.
- ASP.NET hints: `ASPNETCORE_URLS`, `appsettings.json`, presence of `wwwroot`.
- Windows builds: PE with CLI header (managed assembly) or native host embedding a bundle.
## Implementation notes
- Resolve DLL paths relative to the working directory after env expansion.
- When `dotnet` is invoked without a DLL, treat as low-confidence and record evidence.
- For single-file executables, read the first few MB looking for bundle markers rather than full PE/ELF parsing.
- Capture runtimeconfig metadata when available; store TFM in `LanguageHit.MainModule`.
- Treat `dotnet exec` wrappers the same as `dotnet <dll>`.
## Evidence & scoring
- Large confidence boost when both host (`dotnet`) and DLL artefact are present.
- Add evidence for runtimeconfig parsing (`"runtimeconfig TFM=net8.0"`), bundle markers, or ASP.NET env vars.
- Penalise detections lacking artefact confirmation.
## Edge cases
- Native AOT (`dotnet publish -p:PublishAot=true`) emits native binaries without managed markers—should fall back to C/C++ detector.
- PowerShell-launched apps: ShellFlow should rewrite before the detector runs.
- Side-by-side deployment where multiple DLLs exist—prefer the one passed to `dotnet` or specified via `DOTNET_STARTUP_HOOKS`.


@@ -0,0 +1,22 @@
# Entry-Point Runtime — Elixir / Erlang (BEAM)
## Signals to gather
- `argv0` equals `elixir`, `iex`, `mix`, `erl`, `beam.smp`, or release scripts (`bin/app start`).
- Release layouts: `_build/prod/rel/<app>/bin/<app>`, `releases/<version>/vm.args`, `sys.config`.
- Environment variables (`MIX_ENV`, `RELEASE_COOKIE`, `RELEASE_NODE`).
- Config files (`config/config.exs`, `config/prod.exs`).
## Implementation notes
- Recognise Distillery / mix release scripts that `exec` the real BEAM VM.
- When release script is invoked with `eval`, treat the wrapper as part of the chain but classify runtime as `Elixir`.
- Inspect `vm.args` for node name, cookie, and distributed settings.
- For pure Erlang services (no Elixir), the same detector should fire using `erl` hints.
## Evidence & scoring
- Boost for release directories and BEAM VM binaries (`beam.smp`).
- Add evidence for config files and env vars.
- Penalise minimal images lacking release artefacts (could be generic shell wrappers).
## Edge cases
- Phoenix apps often rely on `bin/server` wrapper—ShellFlow must collapse to release script.
- Multi-node clusters may start multiple BEAM instances; treat as `Supervisor` if several nodes stay active.


@@ -0,0 +1,24 @@
# Entry-Point Runtime — Go
## Signals to gather
- Statically linked ELF with `.note.go.buildid`.
- `.gopclntab` section (function name table) or `Go build ID` strings.
- Minimal dynamic dependencies (often none) and musl/glibc loader differences.
- `GODEBUG`, `GOMAXPROCS`, `GOENV` environment variables.
- Go module artefacts: `go.mod`, `go.sum`.
## Implementation notes
- Use ELF parsing to locate `.note.go.buildid`; fall back to scanning the first few MB for `Go build ID` (see the sketch after this list).
- Distinguish from Rust/C by checking `.dynsym` count, presence of Go-specific section names, and the absence of `GLIBCXX`.
- For distroless images, rely solely on ELF traits since no package metadata is present.
- Record binary path and module files as evidence.
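A sketch of that string-scan fallback (window size and error handling are illustrative):
```csharp
using System;
using System.IO;
using System.Text;

// Scan the first few MB of a binary for the ASCII marker "Go build ID",
// used when ELF section parsing is unavailable (e.g., stripped binaries).
public static class GoBuildIdSketch
{
    public static bool LooksLikeGo(string path, int windowBytes = 4 * 1024 * 1024)
    {
        var marker = Encoding.ASCII.GetBytes("Go build ID");
        using var stream = File.OpenRead(path);
        var length = (int)Math.Min(stream.Length, windowBytes);
        var buffer = new byte[length];
        var read = stream.Read(buffer, 0, buffer.Length);
        return buffer.AsSpan(0, read).IndexOf(marker) >= 0;
    }
}
```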
## Evidence & scoring
- Strong boost for `.note.go.buildid` or `.gopclntab`.
- Add evidence for module files or env variables.
- Penalise binaries with high numbers of shared libraries (likely C/C++).
## Edge cases
- TinyGo or stripped binaries may lack build IDs—fall back to heuristics (symbol patterns, text section).
- CGO-enabled binaries include glibc dependencies; still treat as Go but mention CGO in evidence if detected.
- Supervisors wrapping Go services (e.g., `envoy`) should be handled upstream by wrapper detection.


@@ -0,0 +1,29 @@
# Entry-Point Runtime — Java
## Signals to gather
- `argv0` equals `java` / `javaw` or resides under `*/bin/java`.
- `-jar <app.jar>` argument with the jar present in the VFS.
- Manifest metadata (`META-INF/MANIFEST.MF`) containing `Main-Class` or `Start-Class`.
- Spring Boot layout (`BOOT-INF/**`).
- Classpath form (`-cp/-classpath`) followed by a main class token.
- Presence of an embedded JRE (`lib/modules`, `jre/bin/java`).
- `JAVA_OPTS`, `JAVA_TOOL_OPTIONS`, or `JAVA_HOME` environment hints.
- `EXPOSE` ports often associated with Java servers (`8080`, `8443`).
## Implementation notes
- Expand env variables before resolving jar/class paths (supports `${VAR}`, `${VAR:-default}`; see the sketch after this list).
- For classpath mode, open a subset of jars to corroborate `Main-Class`.
- Track when the app is started through shell wrappers (`exec java -jar "$APP_JAR"`); ShellFlow should already collapse these.
- Distinguish between installers (e.g., `java -version`) and actual app launches by checking for jar/class arguments.
- When multiple jars/classes are possible, prefer manifest-backed artefacts but record alternates in evidence.
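A sketch of the `${VAR}` / `${VAR:-default}` expansion referenced above (the real resolver also has to handle nesting and quoting):
```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Expand ${VAR} and ${VAR:-default} against a captured environment map.
public static class EnvExpandSketch
{
    private static readonly Regex Pattern =
        new(@"\$\{(?<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?<default>[^}]*))?\}");

    public static string Expand(string input, IReadOnlyDictionary<string, string> env) =>
        Pattern.Replace(input, m =>
            env.TryGetValue(m.Groups["name"].Value, out var value)
                ? value
                : m.Groups["default"].Value); // empty string when no default is given
}
```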
## Evidence & scoring
- Reward concrete artefacts (jar exists, manifest resolved).
- Add evidence entries for each heuristic (`"MANIFEST Main-Class=com.example.Main"`, `"Spring Boot BOOT-INF detected"`).
- Penalise missing artefacts or ambiguous classpaths.
- Surface runtime-specific env/ports as supplementary clues, but keep their weight low to avoid false positives.
## Edge cases
- Launcher scripts that eventually run `java` — ensure ShellFlow surfaces the final command.
- Multi-module fat jars: only expose the main entry jar in evidence; keep supporting jars as context.
- Native image (`native-image` / GraalVM) should fall through to Go/Rust/C++ detectors when `java` binary is absent.


@@ -0,0 +1,24 @@
# Entry-Point Runtime — Nginx
## Signals to gather
- `argv0` equals `nginx`.
- Config files: `/etc/nginx/nginx.conf`, `conf.d/*.conf`, `/usr/share/nginx/html`.
- Environment (`NGINX_ENTRYPOINT_QUIET_LOGS`, `NGINX_PORT`, `NGINX_ENVSUBST_TEMPLATE`).
- Listening sockets on 80/443 (dynamic mode) or `EXPOSE 80` (static).
- Modules or scripts shipped with the official Docker entrypoint (`docker-entrypoint.sh` collapsing to `nginx -g "daemon off;"`).
## Implementation notes
- Parse `nginx.conf` (basic directive traversal) to extract worker processes, include chains, upstream definitions.
- Handle official entrypoint idioms (`envsubst` templating) via ShellFlow.
- Distinguish pure reverse proxies from PHP-FPM combos; when both `nginx` and `php-fpm` run, classify container as `Supervisor`.
- Record static web content presence (`/usr/share/nginx/html/index.html`).
## Evidence & scoring
- Boost for confirmed config and workers.
- Add evidence for templating features, env substitution, or modules.
- Penalise if binary exists without config (likely not the entry point).
## Edge cases
- Alpine images may place configs under `/etc/nginx/conf.d`; include both.
- Custom builds might rename binary (`openresty`, `tengine`); consider aliases if common.
- Windows Nginx not supported; fall back to `Other`.


@@ -0,0 +1,24 @@
# Entry-Point Runtime — Node.js
## Signals to gather
- `argv0` equals `node`, `nodejs`, or path ends with `/bin/node`.
- Scripts launched via package runners (`npm`, `yarn`, `pnpm node …`, `npx`).
- Presence of `package.json` with `"main"` or `"scripts":{"start":…}` entries.
- `NODE_ENV`, `NODE_OPTIONS`, or `NPM_PACKAGE_NAME` environment hints.
- Bundler/PM2 scenarios: `pm2-runtime`, `pm2-docker`, `forever`, `nodemon`.
## Implementation notes
- Resolve script arguments (e.g., `node server.js`) relative to the working dir.
- If invoked through `npm start`/`yarn run`, parse `package.json` to expand the actual script.
- Support TypeScript loaders (`ts-node`, `node --loader`, `.mjs`) by inspecting extensions and flags.
- Normalise shebang-based Node scripts (ShellFlow ensures `#!/usr/bin/env node` collapses to Node).
## Evidence & scoring
- Boost confidence when a concrete JS/TS entry file exists.
- Add evidence for `package.json` metadata, PM2 ecosystem files, or `NODE_ENV` values.
- Penalise when the entry file is missing or only package runners are present without scripts.
## Edge cases
- Multi-service supervisors (e.g., `pm2` managing multiple apps): treat as `Supervisor` and list programmes as children.
- Serverless shims (e.g., Google Functions) wrap Node; prefer the user-provided handler script if detectable.
- Distroless snapshots may omit package managers; rely on Node binary + script presence.


@@ -0,0 +1,24 @@
# Entry-Point Runtime — PHP-FPM
## Signals to gather
- `argv0` equals `php-fpm` or `php-fpm8*` variants; master process often invoked with `-F` or `--nodaemonize`.
- Configuration files: `/usr/local/etc/php-fpm.conf`, `www.conf`, pool definitions under `php-fpm.d`.
- PHP runtime artefacts: `composer.json`, `public/index.php`, `artisan`, `wp-config.php`.
- Environment variables such as `PHP_FPM_CONFIG`, `PHP_INI_DIR`, `APP_ENV`.
- Socket or port exposure (`listen = 9000`, `/run/php-fpm.sock`).
## Implementation notes
- Verify master process vs worker processes (master stays PID 1, workers forked).
- Inspect pool configuration to extract listening endpoint and process manager mode.
- If `docker-php-entrypoint` is involved, ShellFlow must expand to `php-fpm`.
- Distinguish FPM from CLI invocations (`php script.php`) to avoid misclassification.
## Evidence & scoring
- Reward confirmed config files and listening sockets.
- Add evidence for application artefacts (Composer lockfile, framework directories).
- Penalise when only the binary is present without config (could be CLI usage).
## Edge cases
- Images bundling Apache/Nginx front-ends should end up as `Supervisor` with PHP-FPM as a child service.
- Some Alpine packages install `php-fpm7` naming—include aliases in detector.
- When `php-fpm` is launched via `s6` or supervisor, rely on child detection to avoid double counting.


@@ -0,0 +1,25 @@
# Entry-Point Runtime — Python
## Signals to gather
- `argv0` equals `python`, `python3`, `pypy`, or an interpreter symlink.
- WSGI/ASGI servers: `gunicorn`, `uvicorn`, `hypercorn`, `daphne`.
- Task runners: `celery -A app worker`, `rq worker`, `pytest`.
- Presence of `requirements.txt`, `pyproject.toml`, `setup.cfg`, or `Pipfile`.
- `PYTHONPATH`, `PYTHONUNBUFFERED`, `DJANGO_SETTINGS_MODULE`, `FLASK_APP`, or application-specific env vars.
- Virtualenv detection (`/venv/bin/python`, `pyvenv.cfg`).
## Implementation notes
- When invoked as `python -m module`, resolve the module to a path if possible.
- For WSGI/ASGI servers, inspect command arguments (`app:app`, `module:create_app`) and config files.
- Recognise wrapper scripts such as `docker-entrypoint.py` that eventually `exec "$@"`.
- Support zipped apps or single-file bundles by checking `zipapp` signatures.
## Evidence & scoring
- Increase confidence when module or script exists and dependencies are present.
- Capture evidence for env variables, config files, or known server arguments.
- Penalise ambiguous invocations (e.g., `python -c "..."` without persistent service).
## Edge cases
- Supervisors launching multiple Python workers fall back to `Supervisor` classification with Python listed as child.
- Conda environments use different directory structures; look for `conda-meta` directories.
- Alpine distroless images may ship `python` symlinks without standard libs—ensure script presence before final classification.


@@ -0,0 +1,24 @@
# Entry-Point Runtime — Ruby
## Signals to gather
- `argv0` equals `ruby`, `bundle`, `bundler`, `rackup`, `puma`, `unicorn`, `sidekiq`, or `resque`.
- Bundler scripts: `bundle exec <cmd>`; Gemfile and `Gemfile.lock`.
- Rails and Rack hints: `config.ru`, `bin/rails`, `bin/rake`.
- Background jobs: `sidekiq`, `delayed_job`, `resque`.
- Environment variables (`RAILS_ENV`, `RACK_ENV`, `BUNDLE_GEMFILE`).
## Implementation notes
- Normalise `bundle exec` by skipping the bundler wrapper and targeting the actual command.
- Resolve script paths relative to the working directory.
- For `puma`/`unicorn`, parse config files (`config/puma.rb`, `config/unicorn.rb`) to gather ports/workers.
- Recognise `foreman start` or `overmind` launching Procfile processes—may devolve to `Supervisor` classification.
## Evidence & scoring
- Boost confidence when `Gemfile.lock` exists and the requested server script is found.
- Add evidence for env variables and config files.
- Penalise ambiguous CLI invocations or missing artefacts.
## Edge cases
- Alpine distroless images may rely on `ruby` symlinks; confirm binary presence.
- JRuby (running on Java) may trigger both Ruby and Java signals—prefer Ruby if `ruby`/`jruby` interpreter is explicit.
- Supervisors launching multiple Ruby workers should produce a single `Supervisor` entry with Ruby children.

View File

@@ -0,0 +1,24 @@
# Entry-Point Runtime — Rust
## Signals to gather
- ELF binaries with DWARF producer strings containing `rustc`.
- Symbols prefixed with `_ZN` (mangled Rust) or section `.rustc`.
- Presence of `panic=abort` strings, `Rust` metadata, or Cargo artefacts (`Cargo.toml`, `Cargo.lock`).
- Often statically linked (no `.dynamic` entries) or built against the musl loader (`/lib/ld-musl-x86_64.so.1`).
- Environment such as `RUST_LOG`, `RUST_BACKTRACE`.
## Implementation notes
- Parse DWARF `.debug_info` when available; short-circuit by scanning `.comment` sections for `rustc`.
- Distinguish from Go by the absence of `.note.go.buildid`.
- When Cargo artefacts exist, include target name and profile in evidence.
- For binaries built with `--target x86_64-pc-windows-gnu`, treat them under the same detector (PE + Rust markers).
## Evidence & scoring
- Reward DWARF producer strings, Cargo files, and Rust-specific env vars.
- Penalise when only generic static binary traits are present (may defer to C/C++).
- Mention musl vs glibc loader differences for observability.
## Edge cases
- Rust compiled to WebAssembly or run inside Wasmtime falls outside this detector; leave as `Other`.
- Stripped binaries without DWARF or comments may be indistinguishable from C—fall back to C/C++ and add note.
- Supervisors launching multiple Rust binaries handled upstream.
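A sketch of the `.comment` short-circuit described above; `ElfFile` stands in for the binary probes from `entrypoint-static-analysis.md` and is an assumption, not a fixed API:
```csharp
// Sketch only: ElfFile is a hypothetical wrapper over the ELF probes.
static bool LooksLikeRust(ElfFile elf, List<string> evidence) {
    if (elf.HasSection(".note.go.buildid")) return false;      // Go, not Rust

    var comment = elf.ReadSectionText(".comment");
    if (comment is not null && comment.Contains("rustc", StringComparison.OrdinalIgnoreCase)) {
        evidence.Add($".comment producer: {comment.Trim()}");
        return true;
    }
    // Legacy mangled symbols (_ZN…17h<hash>E) are a weaker secondary signal.
    return elf.SymbolNames().Any(s => s.StartsWith("_ZN") && s.EndsWith("E"));
}
```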

View File

@@ -0,0 +1,25 @@
# Entry-Point Runtime — Supervisors
Some containers intentionally launch multiple long-lived services (sidecars, appliance images, `supervisord`, `s6`, `runit`, `pm2`). Instead of forcing a single runtime classification, the detector can emit a `Supervisor` entry with child services enumerated separately.
## Signals to gather
- Known supervisor binaries: `supervisord`, `s6-svscan`, `s6-supervise`, `runsvdir`, `pm2-runtime`, `forego`, `foreman`, `overmind`.
- Configuration files: `/etc/supervisord.conf`, `/etc/s6/*.conf`, `Procfile`, `ecosystem.config.js`.
- Multiple child processes that remain active after startup.
- Environment variables controlling supervisor behaviour (`SUPERVISOR_*`, `PM2_HOME`, `S6_CMD_WAIT_FOR_SERVICES`).
## Implementation notes
- Keep the supervisor as the primary terminal but query configuration to list child commands.
- For each child, run the usual reduction + runtime detection and attach results as derived evidence.
- When configuration is templated (e.g., `envsubst`), evaluate ShellFlow output to resolve final commands.
- Record scheduling details (autorestart, process limits) relevant for incident response.
## Evidence & scoring
- Supervisor detection sets `LanguageType.Supervisor` with mid-level confidence (0.6–0.7).
- Confidence increases when configuration explicitly lists services and child processes are observed (dynamic mode).
- Provide evidence for each child service (`"manages: php-fpm on /run/php-fpm.sock"`, `"manages: nginx listening on 0.0.0.0:80"`).
## Edge cases
- Docker Compose-style images using `bash` to run multiple services should also map here if ShellFlow detects multiple `&` background jobs.
- Ensure we do not classify minimal init shims (`tini`, `dumb-init`) as supervisors—they should be collapsed.
- When supervisor manages only one child, collapse to the child runtime and drop the supervisor evidence to avoid noise.
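A minimal sketch of child enumeration for `supervisord`, assuming the overlay VFS contract; the INI parsing is deliberately simplified:
```csharp
// Sketch only: enumerates [program:*] sections from supervisord.conf so each
// child command can be re-run through reduction + runtime detection.
static IEnumerable<string> SupervisordChildren(OverlayVfs vfs, string confPath = "/etc/supervisord.conf") {
    if (!vfs.Exists(confPath)) yield break;

    using var reader = new StreamReader(vfs.OpenRead(confPath));
    string? line, section = null;
    while ((line = reader.ReadLine()) is not null) {
        line = line.Trim();
        if (line.StartsWith('[') && line.EndsWith(']')) { section = line; continue; }
        // Only [program:x] sections define managed child services.
        if (section?.StartsWith("[program:") == true && line.StartsWith("command=")) {
            yield return line["command=".Length..];
        }
    }
}
```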

View File

@@ -0,0 +1,94 @@
# Entry-Point Detection — Problem & Architecture
## 1) Why this exists
Container images rarely expose their *real* workload directly. Shell wrappers, init shims, supervisors, or language launchers often sit between the Dockerfile `ENTRYPOINT`/`CMD` values and the program you actually care about. StellaOps needs a deterministic, explainable way to map any container image (or running container) to a single logical entry point that downstream systems can reason about.
We define the target artefact as the tuple below:
```jsonc
{
"type": "java|dotnet|go|python|node|ruby|php-fpm|c/c++|rust|nginx|supervisor|other",
"resolvedBinary": "/app/app.jar | /app/app.dll | /app/server | /usr/local/bin/node",
"args": ["..."],
"confidence": 0.00..1.00,
"evidence": [
"why we believe this"
],
"chain": [
{"from": "/bin/sh -c", "to": "/entrypoint.sh", "why": "ENTRYPOINT shell-form"},
{"from": "/entrypoint.sh", "to": "java -jar orders.jar", "why": "exec \"$@\" with java default"}
]
}
```
Constraints:
- Static first: no `/proc`, no `ptrace`, no customer code execution when scanning images.
- Honour Docker/OCI precedence (`ENTRYPOINT` vs `CMD`, shell- vs exec-form, Windows `Shell` overrides).
- Work on distroless and multi-arch images as well as traditional distro bases.
- Emit auditable evidence and reduction chains so policy decisions are explainable.
## 2) Dual-mode architecture
The scanner exposes a single façade but routes to two reducers:
```
Scanner.EntryTrace/
Common/
OciImageReader.cs
OverlayVfs.cs
Heuristics/
Models/
Dynamic/ProcReducer.cs // running container
Static/ImageReducer.cs // static image inference
```
Selection logic:
```csharp
IEntryReducer reducer = container.IsRunning
? new ProcReducer()
: new ImageReducer();
var result = reducer.TraceAndReduce(ct);
```
Both reducers publish a harmonised `EntryTraceResult`, allowing downstream modules (Policy Engine, Vuln Explorer, Export Center) to consume the same shape regardless of data source.
## 3) Pipeline overview
### 3.1 Static images
1. Pull or load OCI image.
2. Compose final argv (`ENTRYPOINT ++ CMD`), respecting shell overrides.
3. Overlay layers with whiteout support via a lazy virtual filesystem.
4. Resolve paths, shebangs, wrappers, and scripts until a terminal candidate emerges.
5. Classify runtime family, identify application artefact, score confidence, and emit evidence.
### 3.2 Running containers
1. Capture real exec / fork events and build an exec graph.
2. Locate steady-state processes (long-lived, owns listeners, not a shim).
3. Collapse wrappers using the same catalogue as static mode.
4. Cross-check with static heuristics to tighten confidence.
### 3.3 Shared components
- **ShellFlow static analyser** handles script idioms (`set --`, `exec "$@"`, branch rewrites).
- **Wrapper catalogue** recognises shells, init shims, supervisors, and package runners.
- **Runtime detectors** plug in per language/framework (Java, .NET, Node, Python, PHP-FPM, Ruby, Go, Rust, Nginx, C/C++).
- **Score calibrator** turns detector raw scores into a unified 0..1 confidence.
## 4) Document map
The entry-point playbook is now split into focused guides:
| Document | Purpose |
| --- | --- |
| `entrypoint-static-analysis.md` | Overlay VFS, argv composition, wrapper reduction, scoring. |
| `entrypoint-dynamic-analysis.md` | Observational Exec Graph for running containers. |
| `entrypoint-shell-analysis.md` | ShellFlow static analyser and script idioms. |
| `entrypoint-runtime-overview.md` | Detector contracts, helper utilities, calibration, integrations. |
| `entrypoint-lang-*.md` | Runtime-specific heuristics (Java, .NET, Node, Python, PHP-FPM, Ruby, Go, Rust, C/C++, Nginx, Deno, Elixir/BEAM, Supervisor). |
Use this file as the landing page; each guide can be read independently when implementing or updating a specific component.

View File

@@ -0,0 +1,152 @@
# Runtime Detector Overview
Runtime classification converts a reduced command into a concrete language or framework identity with supporting evidence. This document describes the shared contracts, helper utilities, calibration strategy, and integration points; language-specific heuristics live in the `entrypoint-lang-*.md` files.
## 1) Contracts
```csharp
public enum LanguageType {
Java, DotNet, Node, Python, PhpFpm, Ruby, Go, Rust, CCpp,
Nginx, Deno, Elixir, Supervisor, Other
}
public sealed record ResolvedCommand(
string[] Argv,
string Argv0,
string? AbsolutePath,
bool IsElf,
bool IsPe,
bool IsScript,
string? Shebang,
string WorkingDir
);
public sealed record LanguageHit(
LanguageType Type,
double RawScore,
string ResolvedBinary,
string[] Args,
List<string> Evidence,
string? AppArtifactPath = null,
string? MainModule = null,
Dictionary<string,string>? Extra = null
);
```
### Interface
```csharp
public interface ILanguageSubDetector {
LanguageHit? TryDetect(
ResolvedCommand cmd, OverlayVfs vfs, EnvBag env, ImageContext img, CancellationToken ct = default);
}
public sealed class LanguageDetector {
private readonly ILanguageSubDetector[] _detectors = {
new JavaDetector(),
new DotNetDetector(),
new NodeDetector(),
new PythonDetector(),
new PhpFpmDetector(),
new RubyDetector(),
new NginxDetector(),
new GoDetector(),
new RustDetector(),
new DenoDetector(),
new ElixirDetector(),
new CCppDetector(),
new SupervisorDetector()
};
private readonly ScoreCalibrator _cal = ScoreCalibrator.Default;
public LanguageHit Detect(ResolvedCommand cmd, OverlayVfs vfs, EnvBag env, ImageContext img, out double confidence) {
    var hits = _detectors.Select(d => d.TryDetect(cmd, vfs, env, img)).OfType<LanguageHit>().ToList();
LanguageHit best = hits.Count == 0
? new LanguageHit(LanguageType.Other, 0.10, cmd.AbsolutePath ?? cmd.Argv0, cmd.Argv.Skip(1).ToArray(),
new() { "No strong runtime family signals detected." })
: hits.OrderByDescending(_cal.Calibrate).First();
confidence = _cal.Calibrate(best);
    foreach (var alt in hits.Where(h => h != best).OrderByDescending(_cal.Calibrate))
      best.Evidence.Add($"Alternative: {alt.Type} (score={_cal.Calibrate(alt):0.00}) — {string.Join("; ", alt.Evidence.Take(2))}…");
return best;
}
}
```
## 2) Helpers
```csharp
static class VfsHelpers {
public static bool FileExists(OverlayVfs vfs, string path) => vfs.Exists(path);
public static bool TryOpen(OverlayVfs vfs, string path, out Stream? stream) {
if (!vfs.Exists(path)) { stream = null; return false; }
stream = vfs.OpenRead(path);
return true;
}
public static string Join(string cwd, string maybeRel) =>
Path.IsPathRooted(maybeRel) ? maybeRel : Path.GetFullPath(Path.Combine(cwd, maybeRel));
}
static class ArgvHelpers {
public static int IndexOf(this string[] argv, string flag) =>
Array.FindIndex(argv, a => a == flag);
public static string? Next(this string[] argv, int idx) =>
(idx >= 0 && idx + 1 < argv.Length) ? argv[idx + 1] : null;
public static bool AnyEndsWith(this IEnumerable<string> args, string suffix, bool ignoreCase = true) =>
args.Any(a => a.EndsWith(suffix, ignoreCase ? StringComparison.OrdinalIgnoreCase : StringComparison.Ordinal));
public static bool Is(this string? candidate, params string[] names) =>
candidate is not null && names.Any(n => string.Equals(Path.GetFileName(candidate), n, StringComparison.OrdinalIgnoreCase));
}
```
## 3) Scoring & calibration
- Each sub-detector returns a `RawScore` (0..1) based on family-specific heuristics.
- Feed raw scores into a calibrator (Platt scaling or isotonic regression) trained on labelled corpora to get calibrated probabilities.
- Persist calibration metadata per detector to avoid drift.
- When no detector fires, return `LanguageType.Other` with low confidence and an evidence note.
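A minimal Platt-scaling sketch; the coefficients shown are placeholders for the persisted per-detector calibration metadata, not trained values:
```csharp
// Sketch only: logistic (Platt) calibration of detector raw scores.
// Real coefficients come from training on the labelled corpus and are
// persisted per detector to avoid drift.
public sealed class ScoreCalibrator {
    private readonly IReadOnlyDictionary<LanguageType, (double A, double B)> _coeffs;

    public ScoreCalibrator(IReadOnlyDictionary<LanguageType, (double A, double B)> coeffs)
        => _coeffs = coeffs;

    public static ScoreCalibrator Default { get; } = new(
        new Dictionary<LanguageType, (double, double)> { [LanguageType.Java] = (4.0, -2.0) });

    public double Calibrate(LanguageHit hit) {
        var (a, b) = _coeffs.TryGetValue(hit.Type, out var c) ? c : (4.0, -2.0);
        return 1.0 / (1.0 + Math.Exp(-(a * hit.RawScore + b)));   // sigmoid(A·raw + B)
    }
}
```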
## 4) Cross-checks
Enhance precision by combining detector results with filesystem and configuration signals:
- Compare declared `EXPOSE` ports with runtime defaults (e.g., `80/443` for Nginx, `8080` for Java app servers).
- Inspect service-specific configuration (`nginx.conf`, `php-fpm.conf`, `web.config`, `Gemfile`, `package.json`, `pyproject.toml`).
- For Java and .NET, verify artefact presence and manifest metadata; for Go/Rust check static binary traits.
- Re-run detectors after ShellFlow rewrites to ensure post-`exec` commands are analysed.
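As a sketch, port agreement can be expressed as a small confidence bonus mirroring feature `f4` from the static scoring table; the default-port map is illustrative:
```csharp
// Sketch only: nudges confidence when EXPOSE'd ports match a runtime's
// conventional defaults; the port table is illustrative, not exhaustive.
static double PortAgreementBonus(LanguageType type, string[] exposedPorts) {
    var defaults = type switch {
        LanguageType.Nginx  => new[] { "80/tcp", "443/tcp" },
        LanguageType.Java   => new[] { "8080/tcp" },
        LanguageType.PhpFpm => new[] { "9000/tcp" },
        _ => Array.Empty<string>()
    };
    return exposedPorts.Intersect(defaults).Any() ? 0.06 : 0.0;   // mirrors feature f4
}
```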
## 5) Windows nuances
- Use `config.Shell` to detect PowerShell vs CMD; adjust interpreter lookup accordingly.
- PE probing is mandatory—PowerShell scripts often front .NET or native binaries.
- Consider case-insensitive paths and `\` separators.
## 6) Integration points
- Static reducer passes `ResolvedCommand` → runtime detector.
- Dynamic reducer pipes steady-state commands through the same interface.
- Output `LanguageHit` populates the `TerminalProcess` along with `confidence`.
- Downstream consumers (Policy Engine, Vuln Explorer) merge runtime type into their evidence trail.
## 7) Next steps
Language-specific heuristics live in:
| Runtime | Document |
| --- | --- |
| Java | `entrypoint-lang-java.md` |
| .NET / C# | `entrypoint-lang-dotnet.md` |
| Node.js | `entrypoint-lang-node.md` |
| Python | `entrypoint-lang-python.md` |
| PHP-FPM | `entrypoint-lang-phpfpm.md` |
| Ruby | `entrypoint-lang-ruby.md` |
| Go | `entrypoint-lang-go.md` |
| Rust | `entrypoint-lang-rust.md` |
| C/C++ | `entrypoint-lang-ccpp.md` |
| Nginx | `entrypoint-lang-nginx.md` |
| Deno | `entrypoint-lang-deno.md` |
| Elixir/Erlang (BEAM) | `entrypoint-lang-elixir.md` |
| Supervisors | `entrypoint-lang-supervisor.md` |
Each runtime file documents the heuristics, artefacts, and edge cases specific to that family.

View File

@@ -0,0 +1,83 @@
# ShellFlow — Script Reduction Playbook
Most container entry points eventually execute a shell script. The ShellFlow analyser resolves these scripts without executing user code, providing deterministic, explainable reductions.
## 1) Scope
- POSIX `sh` subset with common Bash extensions (control flow, functions, parameter expansion).
- Handles idioms from official Docker images (`if [ "$1" = "server" ]; then …`, `exec gosu "$USER" "$@"`, `set -- java -jar …`).
- Tracks positional parameters (`$@`, `$1..$9`), environment variables, and `set --` mutations.
- Produces one or more candidate commands with supporting evidence.
## 2) Architecture
```
ShellFlow/
Parser/ // POSIX sh lexer + recursive descent parser
Ast/ // nodes for lists, pipelines, conditionals, functions
Evaluator/ // partial evaluation & taint tracking
Idioms/ // pattern library for common Docker entrypoints
Planner/ // emits CommandPlan[]
```
### 2.1 CommandPlan
```csharp
public sealed record CommandPlan(
string[] Argv,
double HeuristicScore,
IReadOnlyList<string> Evidence,
IReadOnlyList<ReductionEdge> Chain,
bool IsFallback = false
);
```
Plans feed directly into the static reducer, which selects the highest-confidence plan but keeps alternates as evidence.
## 3) Parsing & AST
- Tokenise words, assignments, pipelines (`|`), lists (`;`, `&&`, `||`), conditionals (`if`, `case`), loops (`for`, `while`, `until`), functions, and redirections.
- Preserve heredocs and subshells as opaque nodes (evaluated conservatively).
- Record source spans to surface meaningful evidence (`"line 12: exec java -jar $APP_JAR"`).
## 4) Partial evaluation
- Initialise symbol table from image environment plus caller-supplied args.
- Treat `$@`, `$*`, `$1..$9` as tainted; propagate taint through assignments.
- Resolve `${VAR:-default}` and `${VAR:+alt}` when `VAR` known; otherwise branch.
- Support `set -- …` (resets positional parameters) and `shift`.
- `source`/`.` commands are parsed recursively when files are available; otherwise fall back to a low-confidence branch.
## 5) Exec sink detection
- `exec <cmd>` dominates the remainder of the script.
- Chains such as `exec gosu "$USER" "$@"` feed into wrapper collapse.
- When no `exec` is present, pick the last reachable simple command in the main path.
- Multi-branch scripts yield multiple plans with adjusted scores; unresolved branches are marked `IsFallback`.
## 6) Idiom library
| Pattern | Action |
| --- | --- |
| `if [ "${1:0:1}" = '-' ]; then set -- server "$@"; fi` | Rewrite argv to prepend default command. |
| `if [ "$1" = "bash" ]; then exec "$@"; fi` | Pass-through for manual shells. |
| `exec "$@"` + non-empty CMD | Substitute CMD vector into plan. |
| `exec java -jar "$APP_JAR" "$@"` | Resolve JAR via env or filesystem. |
| `set -- gosu "$APP_USER" "$@"` | Collapse into wrapper plan. |
Idioms are implemented as AST visitors; each adds evidence strings when triggered.
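For example, a sketch of the `exec "$@"` pass-through idiom as a visitor; the visitor and context types are assumptions about the ShellFlow layout, not a fixed API:
```csharp
// Sketch only: IIdiomVisitor, ExecNode, and ShellFlowContext are assumed
// shapes; CommandPlan matches the record defined above.
public sealed class ExecPassthroughIdiom : IIdiomVisitor {
    public void Visit(ExecNode node, ShellFlowContext ctx) {
        // `exec "$@"` with a known CMD vector: substitute CMD into the plan.
        if (node.IsBareArgsExpansion && ctx.CallerArgs.Length > 0) {
            ctx.EmitPlan(new CommandPlan(
                Argv: ctx.CallerArgs,
                HeuristicScore: 0.85,
                Evidence: new[] { $"line {node.Line}: exec \"$@\" substituted with CMD" },
                Chain: ctx.ChainSoFar(node)));
        }
    }
}
```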
## 7) Confidence scoring
- Base score from plan heuristics (`HeuristicScore`).
- Penalties for unresolved taint (`$@` unknown), missing files, nested subshells, or fallbacks.
- Bonus when idioms match, artefacts exist, or env values resolve cleanly.
- Final confidence is combined with the outer static scoring model.
## 8) Failure modes
- Missing script (`ENTRYPOINT` points to deleted file): emit fallback plan with low confidence.
- Self-modifying scripts or heavy dynamic features (`eval`, backticks): mark plan as low-confidence and surface warning evidence.
- Commands that spawn supervisors without `exec`: return both the supervisor and inferred children (if configuration files are present).
ShellFlow keeps the static reducer explainable: every inferred command is accompanied by the script span and reasoning used to reach it.

View File

@@ -0,0 +1,122 @@
# Entry-Point Static Analysis
This guide captures the static half of the StellaOps entry-point detection pipeline: how we turn image metadata and filesystem contents into a resolved binary, an execution chain, and a confidence score.
## 1) Loading OCI images
### 1.1 Supported inputs
- Registry references (`repo:tag@sha256:digest`) using the existing content store.
- Local OCI/Docker v2 archives (`docker save` tarball, OCI layout directory with `index.json` + `blobs/sha256/*`).
### 1.2 Normalised model
```csharp
sealed class OciImage {
public required string Os;
public required string Arch;
public required string[] Entrypoint;
public required string[] Cmd;
public required string[] Shell; // Windows / powershell overrides
public required string WorkingDir;
public required string[] Env;
public required string[] ExposedPorts;
public required LabelMap Labels;
public required LayerRef[] Layers; // ordered, compressed blobs
}
```
Compose the runtime argv as `Entrypoint ++ Cmd`, honouring shell-form vs exec-form (see §2.3).
## 2) Overlay virtual filesystem
### 2.1 Whiteouts
- Regular whiteout: `path/.wh.<name>` removes `<name>` from lower layers.
- Opaque directory: `path/.wh..wh..opq` hides the directory entirely.
### 2.2 Lazy extraction
- First pass: build a tar index `(path → layer, offset, size, mode, isWhiteout, isDir)`.
- Decompress only when reading a file; optionally support eStargz TOC to accelerate random access.
### 2.3 Shell-form composition
- Dockerfile shell form is serialised as `["/bin/sh","-c","…"]` (or `Shell[]` override on Windows).
- Always trust `config.json`; no need to inspect the Dockerfile.
- Working directory defaults to `/` if unspecified.
## 3) Low-level primitives
### 3.1 PATH resolution
- Extract `PATH` from environment (fallback `/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin`).
- If `argv[0]` contains no `/`, walk the PATH to resolve an absolute file; resolve other relative paths against the working directory.
- Verify execute bit (or Windows ACL) before accepting.
### 3.2 Shebang handling
- For non-ELF/PE files: read first line; interpret `#!interpreter args`.
- Replace `argv[0]` with the interpreter, prepend shebang args, append script path per kernel semantics.
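A sketch of the shebang rewrite under Linux kernel semantics (everything after the interpreter is passed as a single optional argument):
```csharp
// Sketch only: rewrites argv for a script per kernel shebang handling.
static string[] ResolveShebang(OverlayVfs vfs, string scriptPath, string[] argv) {
    using var reader = new StreamReader(vfs.OpenRead(scriptPath));
    var first = reader.ReadLine();
    if (first is null || !first.StartsWith("#!")) return argv;

    // "#!/usr/bin/env python3" → [env, python3, scriptPath, rest…];
    // the kernel passes at most one optional argument after the interpreter.
    var parts = first[2..].Trim().Split(' ', 2, StringSplitOptions.RemoveEmptyEntries);
    var rewritten = new List<string> { parts[0] };
    if (parts.Length > 1) rewritten.Add(parts[1]);
    rewritten.Add(scriptPath);
    rewritten.AddRange(argv.Skip(1));
    return rewritten.ToArray();
}
```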
### 3.3 Binary probes
- Identify ELF via magic `\x7FELF`, parse `.interp`, `.dynamic`, linked libs, `.note.go.buildid`, DWARF producer.
- Identify PE (Windows) and detect .NET single-file bundles via CLI header.
- Record features for runtime scoring (Go vs Rust vs glibc vs musl).
## 4) Wrapper catalogue
Collapse known wrappers before analysing the target command:
- Init shims: `tini`, `dumb-init`, `s6-svscan`, `runit`, `supervisord`.
- Privilege droppers: `gosu`, `su-exec`, `chpst`.
- Shells: `sh`, `bash`, `dash`, BusyBox variants.
- Package runners: `npm`, `yarn`, `pnpm`, `pip`, `pipenv`, `poetry`, `bundle`, `rake`.
Rules:
- If the wrapper takes a `--` sentinel (`tini -- app …`), drop the wrapper and record a reduction edge.
- `gosu user cmd …` → collapse to `cmd …`.
- For shell wrappers, delegate to the ShellFlow analyser (see separate guide).
## 5) ShellFlow integration
When the resolved command is a shell script, invoke the ShellFlow analyser to locate the eventual `exec` target. Key capabilities:
- Parses POSIX sh (and common Bash extensions).
- Tracks environment mutations (`set`, `export`, `set --`).
- Resolves `$@`, `$1..9`, `${VAR:-default}`.
- Recognises idioms from official Docker images (`if [ "$1" = "server" ]; then …`).
- Emits multiple branches when predicates depend on unknown data, but tags them with lower confidence.
The analyser returns one or more candidate commands along with reasons, which feed into the reduction engine.
## 6) Reduction algorithm
1. Compose argv `ENTRYPOINT ++ CMD`.
2. Collapse wrappers; append `ReductionEdge` entries documenting each step.
3. Resolve argv0 to an absolute file and classify (ELF/PE/script).
4. If script → run ShellFlow; replace current command with highest-confidence `exec` target while preserving alternates as evidence.
5. Attempt to resolve application artefacts for VM hosts (JARs, DLLs, JS entry, Python module, etc.).
6. Emit `EntryTraceResult` with candidate terminals ranked by confidence.
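The loop can be sketched as follows; `CollapseWrapper`, `ResolveCommand`, `ResolveArtifacts`, and `BuildResult` are stand-ins for the components described in this guide:
```csharp
// Sketch only: the reduction loop from the numbered steps above.
EntryTraceResult Reduce(OciImage img, OverlayVfs vfs) {
    var argv = img.Entrypoint.Concat(img.Cmd).ToArray();        // step 1
    var chain = new List<ReductionEdge>();

    while (CollapseWrapper(ref argv, chain)) { }                // step 2: tini/gosu/…

    var cmd = ResolveCommand(vfs, img, argv);                   // step 3: argv0 → file
    if (cmd.IsScript) {                                         // step 4: ShellFlow
        var plan = ShellFlow.Analyse(vfs, cmd).OrderByDescending(p => p.HeuristicScore).First();
        argv = plan.Argv;
        chain.AddRange(plan.Chain);
        cmd = ResolveCommand(vfs, img, argv);
    }
    return BuildResult(cmd, chain, ResolveArtifacts(vfs, cmd)); // steps 5–6
}
```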
## 7) Confidence scoring
Use a simple logistic model with feature contributions captured for the evidence trail. Example features:
| Id | Signal | Weight |
| --- | --- | --- |
| `f1` | Entrypoint already an executable (ELF/PE) | +0.18 |
| `f2` | Observed chain ends in non-wrapper binary | +0.22 |
| `f3` | VM host + resolvable artefact | +0.20 |
| `f4` | Exposed ports align with runtime | +0.06 |
| `f5` | Shebang interpreter matches runtime family | +0.05 |
| `f6` | Language artefact validation succeeded | +0.15 |
| `f8` | Multi-branch script unresolved (`$@` taint) | -0.20 |
| `f9` | Target missing execute bit | -0.25 |
| `f10` | Shell with no `exec` | -0.18 |
Persist per-feature evidence strings so UI/CLI users can see **why** the scanner picked a given entry point.
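A sketch of the logistic combination using the weights above; the bias term and the feature-extraction side are placeholders:
```csharp
// Sketch only: sums fired feature weights through a sigmoid and keeps
// per-feature evidence strings; the bias is a placeholder.
static (double Confidence, List<string> Evidence) ScoreTerminal(IReadOnlyDictionary<string, bool> features) {
    var weights = new Dictionary<string, double> {
        ["f1"] = 0.18, ["f2"] = 0.22, ["f3"] = 0.20, ["f4"] = 0.06,
        ["f5"] = 0.05, ["f6"] = 0.15, ["f8"] = -0.20, ["f9"] = -0.25, ["f10"] = -0.18
    };
    double z = -0.5;                                   // bias (placeholder)
    var evidence = new List<string>();
    foreach (var (id, fired) in features) {
        if (!fired || !weights.TryGetValue(id, out var w)) continue;
        z += w;
        evidence.Add($"{id}: {(w >= 0 ? "+" : "")}{w:0.00}");
    }
    return (1.0 / (1.0 + Math.Exp(-z)), evidence);
}
```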
## 8) Outputs
Return a populated `EntryTraceResult`:
- `Terminals` contains the best candidate(s) and their runtime classification.
- `Evidence` aggregates feature messages, ShellFlow reasoning, wrapper reductions, and runtime detector hints.
- `Chain` shows the explainable path from initial Docker argv to the final binary.
Static and dynamic reducers share this shape, enabling downstream modules to remain agnostic of the detection mode.

View File

@@ -0,0 +1,26 @@
# Entry-Point Documentation Index
The entry-point detection system is now split into focused guides. Use this index to navigate the individual topics.
| Topic | Document |
| --- | --- |
| Problem statement & architecture overview | `entrypoint-problem.md` |
| Static resolver (OCI layers, wrappers, scoring) | `entrypoint-static-analysis.md` |
| Dynamic resolver / Observational Exec Graph | `entrypoint-dynamic-analysis.md` |
| ShellFlow script analysis | `entrypoint-shell-analysis.md` |
| Runtime detector contracts & calibration | `entrypoint-runtime-overview.md` |
| Java heuristics | `entrypoint-lang-java.md` |
| .NET heuristics | `entrypoint-lang-dotnet.md` |
| Node.js heuristics | `entrypoint-lang-node.md` |
| Python heuristics | `entrypoint-lang-python.md` |
| PHP-FPM heuristics | `entrypoint-lang-phpfpm.md` |
| Ruby heuristics | `entrypoint-lang-ruby.md` |
| Go heuristics | `entrypoint-lang-go.md` |
| Rust heuristics | `entrypoint-lang-rust.md` |
| C/C++ heuristics | `entrypoint-lang-ccpp.md` |
| Nginx heuristics | `entrypoint-lang-nginx.md` |
| Deno heuristics | `entrypoint-lang-deno.md` |
| Elixir / Erlang (BEAM) heuristics | `entrypoint-lang-elixir.md` |
| Supervisor classification | `entrypoint-lang-supervisor.md` |
> Looking for historical context? The unified write-up previously in `entrypoint2.md` and `entrypoint-lang-detection.md` has been decomposed into the files above for easier maintenance.

View File

@@ -0,0 +1,88 @@
# Scanner Artifact Store Migration (MinIO → RustFS)
## Overview
Sprint 11 introduces **RustFS** as the default artifact store for the Scanner plane. Existing
deployments running MinIO (or any S3-compatible backend) must migrate stored SBOM artefacts to RustFS
before switching the Scanner hosts to `scanner.artifactStore.driver = "rustfs"`.
This runbook covers the recommended migration workflow and validation steps.
## Prerequisites
- RustFS service deployed and reachable from the Scanner control plane (`http(s)://rustfs:8080`).
- Existing MinIO/S3 credentials with read access to the current bucket.
- CLI environment with the StellaOps source tree (for the migration tool) and the .NET 10 SDK.
- Maintenance window sized to copy all artefacts (migration is read-only on the source bucket).
## 1. Snapshot source bucket (optional but recommended)
If the MinIO deployment offers versioning or snapshots, take one before migrating. For non-versioned
deployments, capture an external backup (e.g., `mc mirror` to offline storage).
## 2. Dry-run the migrator
```
dotnet run --project src/Tools/RustFsMigrator -- \
--s3-bucket scanner-artifacts \
--s3-endpoint http://stellaops-minio:9000 \
--s3-access-key stellaops \
--s3-secret-key dev-minio-secret \
--rustfs-endpoint http://stellaops-rustfs:8080 \
--rustfs-bucket scanner-artifacts \
--prefix scanner/ \
--dry-run
```
The dry-run enumerates keys and reports the object count without writing to RustFS. Use this to
estimate migration time.
## 3. Execute migration
Remove the `--dry-run` flag to copy data. Optional flags:
- `--immutable`: mark all migrated objects as immutable (`X-RustFS-Immutable`).
- `--retain-days 365`: request retention (in days) via `X-RustFS-Retain-Seconds`.
- `--rustfs-api-key-header` / `--rustfs-api-key`: provide auth headers when RustFS is protected.
The tool streams each object from S3 and performs an idempotent `PUT` to RustFS, preserving the key
structure (e.g., `scanner/layers/<sha256>/sbom.cdx.json.zst`).
## 4. Verify sample objects
Pick a handful of SBOM digests and confirm:
1. `GET /api/v1/buckets/<bucket>/objects/<key>` returns the expected payload (size + SHA-256).
2. A Scanner WebService configured with `scanner.artifactStore.driver = "rustfs"` can fetch the same
 artefacts (smoke test: `GET /api/v1/scanner/sboms/<digest>?format=cdx-json`).
## 5. Switch Scanner hosts
Update configuration (Helm/Compose/environment) to set:
```
scanner:
artifactStore:
driver: rustfs
endpoint: http://stellaops-rustfs:8080
bucket: scanner-artifacts
timeoutSeconds: 30
```
Redeploy Scanner WebService and Worker. Monitor logs for `RustFS` upload/download messages and
Prometheus scrape (`rustfs_requests_total`).
## 6. Cleanup legacy MinIO (optional)
After a complete migration and validation period, decommission the MinIO bucket or repurpose it for
other components (Concelier still supports S3). Ensure backups reference RustFS snapshots going
forward.
## Troubleshooting
- **Uploads fail (HTTP 4xx/5xx):** Check RustFS logs and confirm API key headers. Re-run the migrator
for the affected keys.
- **Missing objects post-cutover:** Re-run the migrator with the specific `--prefix`. The tool is
idempotent and safely overwrites existing objects.
- **Performance tuning:** Run multiple instances of the migrator with disjoint prefixes if needed; the
RustFS API is stateless and supports parallel PUTs.