Entry-Point Detection — Problem & Architecture

1) Why this exists

Container images rarely expose their real workload directly. Shell wrappers, init shims, supervisors, or language launchers often sit between the Dockerfile ENTRYPOINT/CMD values and the program you actually care about. Stella Ops needs a deterministic, explainable way to map any container image (or running container) to a single logical entry point that downstream systems can reason about.

We define the target artefact as the tuple below:

{
  "type": "java|dotnet|go|python|node|ruby|php-fpm|c/c++|rust|nginx|supervisor|other",
  "resolvedBinary": "/app/app.jar | /app/app.dll | /app/server | /usr/local/bin/node",
  "args": ["..."],
  "confidence": 0.00..1.00,
  "evidence": [
    "why we believe this"
  ],
  "chain": [
    {"from": "/bin/sh -c", "to": "/entrypoint.sh", "why": "ENTRYPOINT shell-form"},
    {"from": "/entrypoint.sh", "to": "java -jar orders.jar", "why": "exec \"$@\" with java default"}
  ]
}
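As a rough illustration, the shape above could be modelled in C# along the following lines. `EntryTraceResult` is the name this document uses for the harmonised result; the `ChainLink` type, the property names, and the sample values are assumptions made for this sketch, not the shipped contract.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Illustrative values only; they mirror the placeholder shape above.
var result = new EntryTraceResult(
    Type: "java",
    ResolvedBinary: "/app/app.jar",
    Args: Array.Empty<string>(),
    Confidence: 0.85,
    Evidence: new[] { "ENTRYPOINT shell-form wraps /entrypoint.sh" },
    Chain: new[]
    {
        new ChainLink("/bin/sh -c", "/entrypoint.sh", "ENTRYPOINT shell-form"),
    });

// Serialise with camelCase names so the keys match the JSON sketch.
var options = new JsonSerializerOptions
{
    PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
    WriteIndented = true,
};
Console.WriteLine(JsonSerializer.Serialize(result, options));

// Hypothetical model types; one positional property per JSON key.
public sealed record ChainLink(string From, string To, string Why);

public sealed record EntryTraceResult(
    string Type,
    string ResolvedBinary,
    IReadOnlyList<string> Args,
    double Confidence,
    IReadOnlyList<string> Evidence,
    IReadOnlyList<ChainLink> Chain);
```

Keeping the chain as an ordered list of from/to/why links means each reduction step can be audited on its own.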

Constraints:

Static first: no /proc, no ptrace, no customer code execution when scanning images.
Honour Docker/OCI precedence (ENTRYPOINT vs CMD, shell- vs exec-form, Windows Shell overrides).
Work on distroless and multi-arch images as well as traditional distro bases.
Emit auditable evidence and reduction chains so policy decisions are explainable.
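The precedence constraint can be sketched as a small composition function. This is a minimal sketch at Dockerfile-instruction level, assuming Docker's documented behaviour: shell-form instructions are wrapped in the active shell (`/bin/sh -c` by default on Linux, `cmd /S /C` on Windows, or whatever a `SHELL` instruction set), and a shell-form ENTRYPOINT causes CMD to be ignored. `Instruction` and `ArgvComposer` are hypothetical names, not types from the scanner.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

var argv = ArgvComposer.Compose(
    entrypoint: Instruction.Exec("/docker-entrypoint.sh"),
    cmd: Instruction.Exec("nginx", "-g", "daemon off;"));
Console.WriteLine(string.Join(" | ", argv));
// prints: /docker-entrypoint.sh | nginx | -g | daemon off;

// Hypothetical model of one ENTRYPOINT or CMD instruction.
public sealed record Instruction(bool ShellForm, IReadOnlyList<string> Tokens)
{
    public static Instruction Exec(params string[] args) => new(false, args);
    public static Instruction Shell(string commandLine) => new(true, new[] { commandLine });
}

public static class ArgvComposer
{
    public static readonly IReadOnlyList<string> LinuxShell = new[] { "/bin/sh", "-c" };
    public static readonly IReadOnlyList<string> WindowsShell = new[] { "cmd", "/S", "/C" };

    public static IReadOnlyList<string> Compose(
        Instruction? entrypoint, Instruction? cmd, IReadOnlyList<string>? shell = null)
    {
        shell ??= LinuxShell; // a SHELL instruction would replace this default
        var argv = new List<string>();
        if (entrypoint is not null)
        {
            argv.AddRange(Expand(entrypoint, shell));
            // Shell-form ENTRYPOINT: CMD arguments are not used.
            if (entrypoint.ShellForm) return argv;
        }
        if (cmd is not null) argv.AddRange(Expand(cmd, shell));
        return argv;
    }

    // Shell form becomes "<shell...> <command line>"; exec form is used verbatim.
    private static IEnumerable<string> Expand(Instruction ins, IReadOnlyList<string> shell) =>
        ins.ShellForm ? shell.Concat(ins.Tokens) : ins.Tokens;
}
```

Passing `ArgvComposer.WindowsShell` as the third argument models the Windows default. An image config read from a registry usually has the shell prefix already baked into its `Entrypoint` and `Cmd` arrays, so a real reducer may only need the concatenation step.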

2) Dual-mode architecture

The scanner exposes a single façade but routes to two reducers:

Scanner.EntryTrace/
  Common/
    OciImageReader.cs
    OverlayVfs.cs
    Heuristics/
    Models/
  Dynamic/ProcReducer.cs   // running container
  Static/ImageReducer.cs   // static image inference

Selection logic:

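// Pick the reducer from the container state: live process data if running, image metadata otherwise.
IEntryReducer reducer = container.IsRunning
  ? new ProcReducer()
  : new ImageReducer();
// ct: presumably the caller's CancellationToken.
var result = reducer.TraceAndReduce(ct);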

Both reducers publish a harmonised EntryTraceResult, allowing downstream modules (Policy Engine, Vuln Explorer, Export Center) to consume the same shape regardless of data source.

3) Pipeline overview

3.1 Static images

Pull or load OCI image.
Compose final argv (ENTRYPOINT ++ CMD), respecting shell overrides.
Overlay layers with whiteout support via a lazy virtual filesystem.
Resolve paths, shebangs, wrappers, and scripts until a terminal candidate emerges.
Classify runtime family, identify application artefact, score confidence, and emit evidence.
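The overlay step can be illustrated with an eager flattening of layer entry lists. This sketch assumes the OCI image-spec whiteout conventions (a `.wh.<name>` entry deletes `<name>` from lower layers; `.wh..wh..opq` hides all lower-layer content of its directory) and absolute, normalised paths. `LayerOverlay` is a hypothetical helper; the real `OverlayVfs` is described as lazy, which this version is not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var layers = new[]
{
    new[] { "/app/start.sh", "/app/old.jar", "/etc/motd" }, // base layer
    new[] { "/app/.wh.old.jar", "/app/app.jar" },           // upper layer deletes old.jar
};
Console.WriteLine(string.Join(", ", LayerOverlay.Flatten(layers)));
// prints: /app/app.jar, /app/start.sh, /etc/motd

public static class LayerOverlay
{
    private const string WhiteoutPrefix = ".wh.";
    private const string OpaqueMarker = ".wh..wh..opq";

    // layers: lowest first; each inner sequence lists the absolute paths in one layer.
    public static SortedSet<string> Flatten(IEnumerable<IEnumerable<string>> layers)
    {
        var visible = new SortedSet<string>(StringComparer.Ordinal);
        foreach (var layer in layers)
        {
            var entries = layer.ToList();
            // Apply deletions first so whiteouts only affect lower layers.
            foreach (var path in entries)
            {
                var (dir, name) = Split(path);
                if (name == OpaqueMarker)
                {
                    visible.RemoveWhere(p => p.StartsWith(dir + "/", StringComparison.Ordinal));
                }
                else if (name.StartsWith(WhiteoutPrefix, StringComparison.Ordinal))
                {
                    var target = dir + "/" + name.Substring(WhiteoutPrefix.Length);
                    visible.RemoveWhere(p =>
                        p == target || p.StartsWith(target + "/", StringComparison.Ordinal));
                }
            }
            // Then add this layer's real entries, skipping the marker files themselves.
            foreach (var path in entries)
            {
                if (!Split(path).Name.StartsWith(WhiteoutPrefix, StringComparison.Ordinal))
                    visible.Add(path);
            }
        }
        return visible;
    }

    // Assumes an absolute path, so a '/' is always present.
    private static (string Dir, string Name) Split(string path)
    {
        var i = path.LastIndexOf('/');
        return (path.Substring(0, i), path.Substring(i + 1));
    }
}
```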

3.2 Running containers

Capture real exec / fork events and build an exec graph.
Locate steady-state processes (long-lived, owning listeners, not shims).
Collapse wrappers using the same catalogue as static mode.
Cross-check with static heuristics to tighten confidence.
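The steady-state selection step might look like the sketch below. It covers only the ranking (alive, not a shim, listeners first, then longest uptime) and leaves out exec-graph construction. `ProcessNode`, `SteadyState`, and the shim list are assumptions for illustration; the real dynamic reducer may weigh these signals differently.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

var picked = SteadyState.Pick(new[]
{
    new ProcessNode(1, "/sbin/tini", Alive: true, OwnsListener: false, Uptime: TimeSpan.FromMinutes(10)),
    new ProcessNode(7, "/bin/sh", Alive: false, OwnsListener: false, Uptime: TimeSpan.FromSeconds(1)),
    new ProcessNode(8, "/usr/bin/java", Alive: true, OwnsListener: true, Uptime: TimeSpan.FromMinutes(10)),
});
Console.WriteLine(picked?.Exe); // prints: /usr/bin/java

// Hypothetical snapshot of one observed process.
public sealed record ProcessNode(int Pid, string Exe, bool Alive, bool OwnsListener, TimeSpan Uptime);

public static class SteadyState
{
    // Illustrative shim names; a real catalogue would be shared with static mode.
    private static readonly HashSet<string> Shims =
        new(StringComparer.Ordinal) { "sh", "bash", "tini", "dumb-init" };

    // Returns the most plausible workload process, or null if none qualifies.
    public static ProcessNode? Pick(IEnumerable<ProcessNode> nodes) =>
        nodes.Where(n => n.Alive && !Shims.Contains(Path.GetFileName(n.Exe)))
             .OrderByDescending(n => n.OwnsListener) // listeners first (true sorts above false)
             .ThenByDescending(n => n.Uptime)        // then the longest-lived
             .FirstOrDefault();
}
```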

3.3 Shared components

ShellFlow static analyser handles script idioms (set --, exec "$@", branch rewrites).
Wrapper catalogue recognises shells, init shims, supervisors, and package runners.
Runtime detectors plug in per language/framework (Java, .NET, Node, Python, PHP-FPM, Ruby, Go, Rust, Nginx, C/C++).
Score calibrator turns raw detector scores into a unified 0..1 confidence.
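To make the wrapper catalogue concrete, here is a minimal sketch that strips known wrappers from the front of an argv and records each hop. The catalogue entries and the rule "skip flags, an optional `--`, then N operands" are simplifying assumptions (for example, wrapper flags that take a value are not handled), and `WrapperCatalogue` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

var chain = new List<string>();
var argv = new[] { "/sbin/tini", "--", "gosu", "app", "java", "-jar", "/app/app.jar" };
var reduced = WrapperCatalogue.Collapse(argv, chain);
Console.WriteLine(string.Join(" ", reduced)); // prints: java -jar /app/app.jar
Console.WriteLine(string.Join("; ", chain));  // prints: /sbin/tini -> gosu; gosu -> java

public static class WrapperCatalogue
{
    // Wrapper basename -> number of its own positional operands before the wrapped command.
    private static readonly Dictionary<string, int> Operands = new(StringComparer.Ordinal)
    {
        ["tini"] = 0,
        ["dumb-init"] = 0,
        ["gosu"] = 1,    // gosu <user> <command...>
        ["su-exec"] = 1, // su-exec <user> <command...>
    };

    // Repeatedly drops leading wrappers, appending "from -> to" entries to chain.
    public static IReadOnlyList<string> Collapse(IReadOnlyList<string> argv, IList<string> chain)
    {
        var current = argv.ToList();
        while (current.Count > 0 &&
               Operands.TryGetValue(Path.GetFileName(current[0]), out var operands))
        {
            var i = 1;
            while (i < current.Count && current[i].StartsWith('-') && current[i] != "--") i++; // flags
            if (i < current.Count && current[i] == "--") i++;                                   // separator
            i += operands;                                                                      // e.g. user
            if (i >= current.Count) break; // nothing left to unwrap; keep the wrapper as terminal
            chain.Add($"{current[0]} -> {current[i]}");
            current = current.GetRange(i, current.Count - i);
        }
        return current;
    }
}
```

Because the function is pure over an argv, it can in principle serve both modes: static mode feeds it the composed ENTRYPOINT/CMD vector, dynamic mode feeds it an observed command line.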

4) Document map

The entry-point playbook is now split into focused guides:

| Document | Purpose |
|----------|---------|
| `entrypoint-static-analysis.md` | Overlay VFS, argv composition, wrapper reduction, scoring. |
| `entrypoint-dynamic-analysis.md` | Observational Exec Graph for running containers. |
| `entrypoint-shell-analysis.md` | ShellFlow static analyser and script idioms. |
| `entrypoint-runtime-overview.md` | Detector contracts, helper utilities, calibration, integrations. |
| `entrypoint-lang-*.md` | Runtime-specific heuristics (Java, .NET, Node, Python, PHP-FPM, Ruby, Go, Rust, C/C++, Nginx, Deno, Elixir/BEAM, Supervisor). |

Use this file as the landing page; each guide can be read independently when implementing or updating a specific component.
