git.stella-ops.org/docs/modules/scanner/operations/entrypoint-problem.md
2025-10-31 18:50:15 +02:00

# Entry-Point Detection — Problem & Architecture
## 1) Why this exists
Container images rarely expose their *real* workload directly. Shell wrappers, init shims, supervisors, or language launchers often sit between the Dockerfile `ENTRYPOINT`/`CMD` values and the program you actually care about. StellaOps needs a deterministic, explainable way to map any container image (or running container) to a single logical entry point that downstream systems can reason about.
We define the target artefact as the tuple below:
```jsonc
{
  "type": "java|dotnet|go|python|node|ruby|php-fpm|c/c++|rust|nginx|supervisor|other",
  "resolvedBinary": "/app/app.jar | /app/app.dll | /app/server | /usr/local/bin/node",
  "args": ["..."],
  "confidence": 0.00..1.00,
  "evidence": [
    "why we believe this"
  ],
  "chain": [
    {"from": "/bin/sh -c", "to": "/entrypoint.sh", "why": "ENTRYPOINT shell-form"},
    {"from": "/entrypoint.sh", "to": "java -jar orders.jar", "why": "exec \"$@\" with java default"}
  ]
}
```
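As an illustration only, the tuple could be carried in C# as a pair of records. This is a minimal sketch: the source names `EntryTraceResult` but does not show its definition, so `ChainStep`, the property names, the range check, and the sample values (path, confidence, evidence text) are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Illustrative values only: they mirror the Java chain in the tuple above.
var result = new EntryTraceResult(
    Type: "java",
    ResolvedBinary: "/app/orders.jar",            // hypothetical path
    Args: new[] { "-jar", "orders.jar" },
    Confidence: 0.9,                               // made-up score for the demo
    Evidence: new[] { "exec \"$@\" in /entrypoint.sh with java default" },
    Chain: new[]
    {
        new ChainStep("/bin/sh -c", "/entrypoint.sh", "ENTRYPOINT shell-form"),
        new ChainStep("/entrypoint.sh", "java -jar orders.jar", "exec \"$@\" with java default"),
    });

// camelCase naming makes the serialised keys match the JSON shape above.
var options = new JsonSerializerOptions
{
    PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
    WriteIndented = true,
};
Console.WriteLine(JsonSerializer.Serialize(result, options));

// One hop in the reduction chain: which wrapper handed off to what, and why.
public sealed record ChainStep(string From, string To, string Why);

// Assumed shape of the result; the real type may differ.
public sealed record EntryTraceResult(
    string Type,
    string ResolvedBinary,
    IReadOnlyList<string> Args,
    double Confidence,
    IReadOnlyList<string> Evidence,
    IReadOnlyList<ChainStep> Chain)
{
    // Confidence is documented as 0.00..1.00, so reject anything outside it.
    public double Confidence { get; init; } =
        Confidence is >= 0.0 and <= 1.0
            ? Confidence
            : throw new ArgumentOutOfRangeException(nameof(Confidence));
}
```

Keeping `Chain` and `Evidence` as ordered lists preserves the order in which reductions were applied, which is what makes the result auditable.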
Constraints:
- Static first: no `/proc`, no `ptrace`, no customer code execution when scanning images.
- Honour Docker/OCI precedence (`ENTRYPOINT` vs `CMD`, shell- vs exec-form, Windows `Shell` overrides).
- Work on distroless and multi-arch images as well as traditional distro bases.
- Emit auditable evidence and reduction chains so policy decisions are explainable.
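The precedence constraint can be sketched as follows. This assumes Dockerfile-level semantics: a shell-form instruction is wrapped in the active shell (`/bin/sh -c` by default on Linux, or whatever a `SHELL` instruction set), a shell-form `ENTRYPOINT` discards `CMD`, and an exec-form `ENTRYPOINT` takes `CMD` as default arguments. `Directive` and `ArgvComposer` are hypothetical names, not the scanner's API.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Shell-form ENTRYPOINT: the shell wrapper wins and CMD is dropped.
Console.WriteLine(string.Join(" | ", ArgvComposer.Compose(
    Directive.Shell("/entrypoint.sh"), Directive.Exec("--port", "8080"))));
// Exec-form ENTRYPOINT: CMD is appended as default arguments.
Console.WriteLine(string.Join(" | ", ArgvComposer.Compose(
    Directive.Exec("java", "-jar"), Directive.Exec("orders.jar"))));

// Either a raw shell line (shell form) or an argv array (exec form).
public sealed record Directive(string? ShellLine, IReadOnlyList<string> ExecArgv)
{
    public static Directive Shell(string line) => new(line, Array.Empty<string>());
    public static Directive Exec(params string[] argv) => new(null, argv);
    public bool IsShellForm => ShellLine is not null;
}

public static class ArgvComposer
{
    public static readonly IReadOnlyList<string> DefaultLinuxShell = new[] { "/bin/sh", "-c" };

    // 'shell' stands in for a SHELL override; it defaults to the Linux shell.
    public static IReadOnlyList<string> Compose(
        Directive? entrypoint, Directive? cmd, IReadOnlyList<string>? shell = null)
    {
        shell ??= DefaultLinuxShell;
        if (entrypoint is null) return Expand(cmd, shell);                 // CMD alone
        if (entrypoint.IsShellForm) return Expand(entrypoint, shell);      // CMD ignored
        return entrypoint.ExecArgv.Concat(Expand(cmd, shell)).ToList();    // ENTRYPOINT ++ CMD
    }

    // Shell form becomes shell ++ [line]; exec form is used verbatim.
    private static IReadOnlyList<string> Expand(Directive? d, IReadOnlyList<string> shell) =>
        d switch
        {
            null => Array.Empty<string>(),
            { ShellLine: string line } => shell.Append(line).ToList(),
            _ => d.ExecArgv,
        };
}
```

For a Windows image, the same function could be called with `shell` set to `cmd /S /C`, the Docker default there.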
## 2) Dual-mode architecture
The scanner exposes a single façade but routes to two reducers:
```
Scanner.EntryTrace/
  Common/
    OciImageReader.cs
    OverlayVfs.cs
    Heuristics/
    Models/
  Dynamic/ProcReducer.cs     // running container
  Static/ImageReducer.cs     // static image inference
```
Selection logic:
```csharp
// Running containers are observed dynamically; everything else is analysed statically.
IEntryReducer reducer = container.IsRunning
    ? new ProcReducer()
    : new ImageReducer();
var result = reducer.TraceAndReduce(ct);
```
Both reducers publish a harmonised `EntryTraceResult`, allowing downstream modules (Policy Engine, Vuln Explorer, Export Center) to consume the same shape regardless of data source.
## 3) Pipeline overview
### 3.1 Static images
1. Pull or load OCI image.
2. Compose final argv (`ENTRYPOINT ++ CMD`), respecting shell overrides.
3. Overlay layers with whiteout support via a lazy virtual filesystem.
4. Resolve paths, shebangs, wrappers, and scripts until a terminal candidate emerges.
5. Classify runtime family, identify application artefact, score confidence, and emit evidence.
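Step 3 can be illustrated with a small sketch of OCI whiteout handling. It is eager rather than lazy and tracks only path names, so it shows the merge rule and not the design of `OverlayVfs.cs`. It assumes absolute paths and the OCI conventions that a `.wh.<name>` entry deletes `<name>` from lower layers and that `.wh..wh..opq` hides all lower-layer children of its directory. `OverlaySketch` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Layers are listed bottom-up; paths are illustrative.
var layers = new[]
{
    new[] { "/app/old.jar", "/app/conf/a.yml", "/entrypoint.sh" },
    new[] { "/app/.wh.old.jar", "/app/orders.jar" },          // deletes /app/old.jar
    new[] { "/app/conf/.wh..wh..opq", "/app/conf/b.yml" },    // hides lower /app/conf/*
};
foreach (var visiblePath in OverlaySketch.Flatten(layers)) Console.WriteLine(visiblePath);

public static class OverlaySketch
{
    private const string WhiteoutPrefix = ".wh.";
    private const string OpaqueMarker = ".wh..wh..opq";

    // Returns the paths visible after applying every layer in order.
    public static IReadOnlyCollection<string> Flatten(IEnumerable<IEnumerable<string>> layers)
    {
        var visible = new SortedSet<string>(StringComparer.Ordinal);
        foreach (var layer in layers)
        {
            var entries = layer.ToList();

            // Pass 1: whiteouts remove content contributed by lower layers only.
            foreach (var entry in entries)
            {
                int slash = entry.LastIndexOf('/');
                string dir = entry.Substring(0, slash);
                string name = entry.Substring(slash + 1);
                if (name == OpaqueMarker)
                {
                    visible.RemoveWhere(p => p.StartsWith(dir + "/", StringComparison.Ordinal));
                }
                else if (name.StartsWith(WhiteoutPrefix, StringComparison.Ordinal))
                {
                    string target = dir + "/" + name.Substring(WhiteoutPrefix.Length);
                    visible.RemoveWhere(p =>
                        p == target || p.StartsWith(target + "/", StringComparison.Ordinal));
                }
            }

            // Pass 2: add this layer's real entries; whiteout markers never surface.
            foreach (var entry in entries)
            {
                string name = entry.Substring(entry.LastIndexOf('/') + 1);
                if (!name.StartsWith(WhiteoutPrefix, StringComparison.Ordinal))
                    visible.Add(entry);
            }
        }
        return visible;
    }
}
```

Applying deletions before additions within a layer keeps a file that the same layer both opaques and re-creates, such as `/app/conf/b.yml` in the sample.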
### 3.2 Running containers
1. Capture real exec / fork events and build an exec graph.
2. Locate steady-state processes (long-lived, owning listeners, not shims).
3. Collapse wrappers using the same catalogue as static mode.
4. Cross-check with static heuristics to tighten confidence.
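Steps 2 and 3 might look roughly like the sketch below, which picks a steady-state process from already-captured exec events and reconstructs the chain back to PID 1. Everything here is an assumption made for illustration: the `ProcessNode` fields, the five-second lifetime threshold, the ranking (listener owners first, then longest-lived), and the short wrapper list standing in for the real catalogue.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical observations from a running container.
var observed = new[]
{
    new ProcessNode(1, 0, "/sbin/tini", 600, false),
    new ProcessNode(7, 1, "/bin/sh", 600, false),
    new ProcessNode(12, 7, "/usr/bin/java", 598, true),
    new ProcessNode(30, 7, "/bin/date", 0.1, false),
};
var steady = ExecGraphSketch.PickSteadyState(observed);
Console.WriteLine(steady?.Exe);
Console.WriteLine(string.Join(" -> ", ExecGraphSketch.ChainTo(steady!, observed)));

public sealed record ProcessNode(
    int Pid, int ParentPid, string Exe, double LifetimeSeconds, bool OwnsListener);

public static class ExecGraphSketch
{
    // Illustrative subset; the real wrapper catalogue is far richer.
    private static readonly HashSet<string> Wrappers = new(StringComparer.Ordinal)
        { "sh", "bash", "dash", "tini", "dumb-init", "gosu", "su-exec" };

    public static bool IsWrapper(ProcessNode p) => Wrappers.Contains(Path.GetFileName(p.Exe));

    // Long-lived, not a shim; listener owners rank first, then longest-lived.
    public static ProcessNode? PickSteadyState(
        IEnumerable<ProcessNode> nodes, double minLifetimeSeconds = 5) =>
        nodes.Where(p => !IsWrapper(p) && p.LifetimeSeconds >= minLifetimeSeconds)
             .OrderByDescending(p => p.OwnsListener)
             .ThenByDescending(p => p.LifetimeSeconds)
             .FirstOrDefault();

    // Walks parent links from the target up to the root; assumes no PID cycles.
    public static IReadOnlyList<string> ChainTo(ProcessNode target, IEnumerable<ProcessNode> nodes)
    {
        var byPid = nodes.ToDictionary(p => p.Pid);
        var chain = new List<string>();
        for (ProcessNode? cur = target; cur is not null; cur = byPid.GetValueOrDefault(cur.ParentPid))
            chain.Add(cur.Exe);
        chain.Reverse();
        return chain;
    }
}
```

With the sample data the short-lived `/bin/date` and the two shims are filtered out, leaving `/usr/bin/java` with the chain `/sbin/tini -> /bin/sh -> /usr/bin/java`.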
### 3.3 Shared components
- **ShellFlow static analyser** handles script idioms (`set --`, `exec "$@"`, branch rewrites).
- **Wrapper catalogue** recognises shells, init shims, supervisors, and package runners.
- **Runtime detectors** plug in per language/framework (Java, .NET, Node, Python, PHP-FPM, Ruby, Go, Rust, Nginx, C/C++).
- **Score calibrator** turns raw detector scores into a unified 0..1 confidence.
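To make the calibrator's role concrete, here is one possible shape for it. The source does not say how calibration is done; a per-detector logistic curve is used here purely as a stand-in technique. `ScoreCalibratorSketch`, the `Midpoint`/`Steepness` parameters, and their values are hypothetical.

```csharp
using System;
using System.Collections.Generic;

var calibrator = new ScoreCalibratorSketch(
    new Dictionary<string, (double Midpoint, double Steepness)>
    {
        ["java"] = (Midpoint: 5.0, Steepness: 1.0),   // made-up curve parameters
    });
Console.WriteLine(calibrator.Calibrate("java", 5.0));  // a raw score at the midpoint maps to one half
Console.WriteLine(calibrator.Calibrate("java", 9.0));  // higher raw scores approach 1

public sealed class ScoreCalibratorSketch
{
    private readonly IReadOnlyDictionary<string, (double Midpoint, double Steepness)> _curves;

    public ScoreCalibratorSketch(
        IReadOnlyDictionary<string, (double Midpoint, double Steepness)> curves) => _curves = curves;

    // Squashes a detector's raw score into [0, 1] using that detector's curve.
    public double Calibrate(string detector, double rawScore)
    {
        // Unknown detectors fall back to a neutral curve centred on zero.
        var (midpoint, steepness) = _curves.TryGetValue(detector, out var curve)
            ? curve
            : (Midpoint: 0.0, Steepness: 1.0);
        double value = 1.0 / (1.0 + Math.Exp(-steepness * (rawScore - midpoint)));
        return Math.Clamp(value, 0.0, 1.0);
    }
}
```

Giving each detector its own curve lets detectors with different raw-score scales land on a comparable confidence, which is the property the shared calibrator is meant to provide.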
## 4) Document map
The entry-point playbook is now split into focused guides:
| Document | Purpose |
| --- | --- |
| `entrypoint-static-analysis.md` | Overlay VFS, argv composition, wrapper reduction, scoring. |
| `entrypoint-dynamic-analysis.md` | Observational Exec Graph for running containers. |
| `entrypoint-shell-analysis.md` | ShellFlow static analyser and script idioms. |
| `entrypoint-runtime-overview.md` | Detector contracts, helper utilities, calibration, integrations. |
| `entrypoint-lang-*.md` | Runtime-specific heuristics (Java, .NET, Node, Python, PHP-FPM, Ruby, Go, Rust, C/C++, Nginx, Deno, Elixir/BEAM, Supervisor). |
Use this file as the landing page; each guide can be read independently when implementing or updating a specific component.