git.stella-ops.org/docs/modules/scanner/operations/entrypoint-problem.md
2025-10-30 00:09:39 +02:00

Entry-Point Detection — Problem & Architecture

1) Why this exists

Container images rarely expose their real workload directly. Shell wrappers, init shims, supervisors, or language launchers often sit between the Dockerfile ENTRYPOINT/CMD values and the program you actually care about. StellaOps needs a deterministic, explainable way to map any container image (or running container) to a single logical entry point that downstream systems can reason about.

We define the target artefact as the tuple below:

{
  "type": "java|dotnet|go|python|node|ruby|php-fpm|c/c++|rust|nginx|supervisor|other",
  "resolvedBinary": "/app/app.jar | /app/app.dll | /app/server | /usr/local/bin/node",
  "args": ["..."],
  "confidence": 0.00..1.00,
  "evidence": [
    "why we believe this"
  ],
  "chain": [
    {"from": "/bin/sh -c", "to": "/entrypoint.sh", "why": "ENTRYPOINT shell-form"},
    {"from": "/entrypoint.sh", "to": "java -jar orders.jar", "why": "exec \"$@\" with java default"}
  ]
}
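
The tuple above can be modelled as a small typed record. The following is a minimal Python sketch for illustration only (the project itself is C#); the class names `EntryTraceResult` and `ChainStep`, the snake_case field names, and the sample values are assumptions, not the scanner's actual types. Keys are sorted on serialisation so that the same input always yields byte-identical output.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ChainStep:
    source: str  # serialised as "from" (a reserved word in Python)
    target: str  # serialised as "to"
    why: str


@dataclass
class EntryTraceResult:
    type: str
    resolved_binary: str
    args: list[str]
    confidence: float
    evidence: list[str] = field(default_factory=list)
    chain: list[ChainStep] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The document defines confidence as a value in 0.00..1.00.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within 0.00..1.00")

    def to_json(self) -> str:
        doc = {
            "type": self.type,
            "resolvedBinary": self.resolved_binary,
            "args": self.args,
            "confidence": self.confidence,
            "evidence": self.evidence,
            "chain": [
                {"from": s.source, "to": s.target, "why": s.why}
                for s in self.chain
            ],
        }
        # sort_keys keeps the output stable across runs.
        return json.dumps(doc, sort_keys=True)


# Illustrative values only.
example = EntryTraceResult(
    type="java",
    resolved_binary="/app/app.jar",
    args=[],
    confidence=0.9,
    evidence=["exec \"$@\" with java default"],
    chain=[ChainStep("/bin/sh -c", "/entrypoint.sh", "ENTRYPOINT shell-form")],
)
```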

Constraints:

  • Static first: no /proc, no ptrace, no customer code execution when scanning images.
  • Honour Docker/OCI precedence (ENTRYPOINT vs CMD, shell- vs exec-form, Windows Shell overrides).
  • Work on distroless and multi-arch images as well as traditional distro bases.
  • Emit auditable evidence and reduction chains so policy decisions are explainable.
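
The ENTRYPOINT/CMD precedence constraint can be sketched as follows. This is an illustrative Python sketch (the project itself is C#) that follows the interaction rules in the Dockerfile reference; the function name `compose_argv` and the convention of passing shell-form instructions as a string and exec-form instructions as a list are assumptions made for the example. It works at the Dockerfile level and does not parse a real image config.

```python
from __future__ import annotations

from typing import Optional, Sequence, Union

# str = shell form, sequence = exec form, None = instruction absent.
Instruction = Optional[Union[str, Sequence[str]]]

# Default shell on Linux images. Windows images default to
# ("cmd", "/S", "/C"), and a SHELL instruction can override either.
DEFAULT_SHELL = ("/bin/sh", "-c")


def _expand(value: Instruction, shell: Sequence[str]) -> list[str]:
    """Turn one instruction into argv tokens."""
    if value is None:
        return []
    if isinstance(value, str):
        return [*shell, value]  # shell form is wrapped by the shell
    return list(value)          # exec form is used verbatim


def compose_argv(
    entrypoint: Instruction,
    cmd: Instruction,
    shell: Sequence[str] = DEFAULT_SHELL,
) -> list[str]:
    if entrypoint is None:
        return _expand(cmd, shell)
    if isinstance(entrypoint, str):
        # Shell-form ENTRYPOINT: CMD is not used.
        return [*shell, entrypoint]
    # Exec-form ENTRYPOINT: CMD is appended as arguments.
    return list(entrypoint) + _expand(cmd, shell)
```

For example, `compose_argv(["/entrypoint.sh"], ["java", "-jar", "orders.jar"])` returns `["/entrypoint.sh", "java", "-jar", "orders.jar"]`, while a shell-form entry point discards the CMD entirely.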

2) Dual-mode architecture

The scanner exposes a single façade but routes to two reducers:

Scanner.EntryTrace/
  Common/
    OciImageReader.cs
    OverlayVfs.cs
    Heuristics/
    Models/
  Dynamic/ProcReducer.cs   // running container
  Static/ImageReducer.cs   // static image inference

Selection logic:

IEntryReducer reducer = container.IsRunning
  ? new ProcReducer()
  : new ImageReducer();
var result = reducer.TraceAndReduce(ct);

Both reducers publish a harmonised EntryTraceResult, allowing downstream modules (Policy Engine, Vuln Explorer, Export Center) to consume the same shape regardless of data source.

3) Pipeline overview

3.1 Static images

  1. Pull or load OCI image.
  2. Compose final argv (ENTRYPOINT ++ CMD), respecting shell overrides.
  3. Overlay layers with whiteout support via a lazy virtual filesystem.
  4. Resolve paths, shebangs, wrappers, and scripts until a terminal candidate emerges.
  5. Classify runtime family, identify application artefact, score confidence, and emit evidence.
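
Step 3 depends on the OCI layer whiteout rules: a file named `.wh.<name>` deletes `<name>` from lower layers, and `.wh..wh..opq` hides everything lower layers placed in that directory. The sketch below is an illustrative Python version (the project itself is C#) and simplifies heavily: layers are assumed to be plain `path -> content` dictionaries ordered base first, only files are modelled, and merging is eager rather than lazy. `flatten_layers` is a hypothetical name, not the `OverlayVfs` API.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."
OPAQUE_MARKER = ".wh..wh..opq"


def _remove_tree(files, root, keep_root=False):
    """Delete root (unless keep_root) and every path beneath it."""
    prefix = root.rstrip("/") + "/"
    for path in list(files):
        if path.startswith(prefix) or (path == root and not keep_root):
            del files[path]


def flatten_layers(layers):
    """Merge layers (base first) into one path -> content mapping."""
    merged = {}
    for layer in layers:
        # Whiteouts only affect content contributed by lower layers,
        # so apply them before adding this layer's own files.
        for path in layer:
            parent, name = posixpath.split(path)
            if name == OPAQUE_MARKER:
                _remove_tree(merged, parent, keep_root=True)
            elif name.startswith(WHITEOUT_PREFIX):
                hidden = posixpath.join(parent, name[len(WHITEOUT_PREFIX):])
                _remove_tree(merged, hidden)
        for path, content in layer.items():
            if not posixpath.basename(path).startswith(WHITEOUT_PREFIX):
                merged[path] = content
    return merged


layers = [
    {"/app/run.sh": "v1", "/app/old.sh": "x", "/etc/conf/a": "1"},
    {"/app/.wh.old.sh": "", "/app/run.sh": "v2"},
    {"/etc/conf/.wh..wh..opq": "", "/etc/conf/b": "2"},
]
view = flatten_layers(layers)
```

In this example `view` contains only `/app/run.sh` (the second layer's version) and `/etc/conf/b`.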

3.2 Running containers

  1. Capture real exec / fork events and build an exec graph.
  2. Locate steady-state processes: long-lived, owning listening sockets, and not a shim.
  3. Collapse wrappers using the same catalogue as static mode.
  4. Cross-check with static heuristics to tighten confidence.
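
Step 2 of the list above can be sketched as a filter plus a deterministic ranking. This Python sketch is illustrative only (the project itself is C#). The `ObservedProcess` record, the 30-second lifetime threshold, the shim name set, and the choice to rank listeners first rather than require them are all assumptions, not the behaviour of `ProcReducer`.

```python
import posixpath
from dataclasses import dataclass

# Illustrative subset of shells and init shims.
SHIM_NAMES = {"sh", "bash", "dash", "tini", "dumb-init", "gosu", "su-exec"}


@dataclass(frozen=True)
class ObservedProcess:
    pid: int
    argv: tuple
    lifetime_seconds: float
    listening_ports: tuple = ()


def is_shim(proc):
    return posixpath.basename(proc.argv[0]) in SHIM_NAMES


def steady_state(processes, min_lifetime=30.0):
    """Return non-shim, long-lived processes, best candidate first."""
    candidates = [
        p for p in processes
        if p.lifetime_seconds >= min_lifetime and not is_shim(p)
    ]
    # Listeners first, then longest-lived, then lowest pid so that
    # ties are broken the same way on every run.
    return sorted(
        candidates,
        key=lambda p: (not p.listening_ports, -p.lifetime_seconds, p.pid),
    )


observed = [
    ObservedProcess(1, ("/bin/sh", "/entrypoint.sh"), 120.0),
    ObservedProcess(7, ("/usr/bin/java", "-jar", "orders.jar"), 118.0, (8080,)),
    ObservedProcess(8, ("/usr/bin/tail", "-f", "/dev/null"), 119.0),
    ObservedProcess(9, ("/usr/bin/curl", "http://localhost"), 0.4),
]
ranked = steady_state(observed)
```

Here the shell is excluded as a shim, `curl` is too short-lived, and the `java` process ranks ahead of `tail` because it owns a listener.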

3.3 Shared components

  • ShellFlow static analyser handles script idioms (set --, exec "$@", branch rewrites).
  • Wrapper catalogue recognises shells, init shims, supervisors, and package runners.
  • Runtime detectors plug in per language/framework (Java, .NET, Node, Python, PHP-FPM, Ruby, Go, Rust, Nginx, C/C++).
  • Score calibrator maps each detector's raw score onto a unified 0..1 confidence scale.
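
A wrapper catalogue can be pictured as a table of reduction rules applied until no rule matches. The Python sketch below is illustrative only (the project itself is C#). The catalogue contents, rule functions, and `collapse` name are hypothetical. Option handling is deliberately naive, and `sh -c` strings containing shell metacharacters are left untouched, since those are the cases the document assigns to the ShellFlow analyser.

```python
import posixpath
import shlex

SHELL_META = set(";|&<>$`(){}\n")


def _skip_options(args):
    """tini / dumb-init style: drop leading flags and an optional '--'."""
    i = 0
    while i < len(args) and args[i].startswith("-"):
        i += 1
        if args[i - 1] == "--":
            break
    return args[i:]


def _drop_user(args):
    """gosu / su-exec style: the first argument is the user spec."""
    return args[1:]


def _shell_dash_c(args):
    """Reduce `sh -c "exec prog args"` only for simple commands."""
    if len(args) < 2 or args[0] != "-c" or SHELL_META & set(args[1]):
        return None
    try:
        tokens = shlex.split(args[1])
    except ValueError:  # unbalanced quotes
        return None
    return tokens[1:] if tokens[:1] == ["exec"] else tokens


CATALOGUE = {
    "tini": _skip_options, "dumb-init": _skip_options,
    "gosu": _drop_user, "su-exec": _drop_user,
    "sh": _shell_dash_c, "bash": _shell_dash_c,
}


def collapse(argv, max_depth=16):
    """Strip known wrappers; return (terminal argv, reduction chain)."""
    current, chain = list(argv), []
    for _ in range(max_depth):
        if not current:
            break
        name = posixpath.basename(current[0])
        rule = CATALOGUE.get(name)
        reduced = rule(current[1:]) if rule else None
        if not reduced:
            break
        chain.append({"from": name, "to": reduced[0],
                      "why": f"{name} is a known wrapper"})
        current = reduced
    return current, chain


terminal, chain = collapse(
    ["/sbin/tini", "--", "gosu", "app", "/bin/sh", "-c",
     "exec java -jar orders.jar"]
)
```

Each applied rule appends one entry to the chain, which mirrors the `chain` field of the result tuple and keeps every reduction auditable.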

4) Document map

The entry-point playbook is now split into focused guides:

| Document | Purpose |
| --- | --- |
| entrypoint-static-analysis.md | Overlay VFS, argv composition, wrapper reduction, scoring. |
| entrypoint-dynamic-analysis.md | Observational Exec Graph for running containers. |
| entrypoint-shell-analysis.md | ShellFlow static analyser and script idioms. |
| entrypoint-runtime-overview.md | Detector contracts, helper utilities, calibration, integrations. |
| entrypoint-lang-*.md | Runtime-specific heuristics (Java, .NET, Node, Python, PHP-FPM, Ruby, Go, Rust, C/C++, Nginx, Deno, Elixir/BEAM, Supervisor). |

Use this file as the landing page; each guide can be read independently when implementing or updating a specific component.