Entry-Point Static Analysis
This guide captures the static half of Stella Ops’ entry-point detection pipeline: how we turn image metadata and filesystem contents into a resolved binary, an execution chain, and a confidence score.
0) Implementation snapshot — Sprint 130.A (2025-11-02)
The StellaOps.Scanner.EntryTrace stack (analyzer + worker + surfaces) currently provides:
- OCI config + layered FS context: `EntryTraceImageContextFactory` normalises environment (`PATH` fallback), user, and working directory while `LayeredRootFileSystem` handles whiteouts, symlinks, and bounded byte reads (`TryReadBytes`) so ELF/PE probing stays offline friendly.
- Wrapper-aware exec expansion: the analyzer unwraps init/user-switch/environment/supervisor wrappers (`tini`, `dumb-init`, `gosu`, `su-exec`, `chpst`, `env`, `supervisord`, `s6-supervise`, `runsv*`) and records guard metadata plus environment/user deltas on nodes and edges.
- Script + interpreter resolution: POSIX shell parsing (AST-driven) covers `source`, `run-parts`, `exec`, and supervisor service directories, with Windows `cmd /c` support. Python `-m`, Node script, and Java `-jar` lookups add evidence when targets are located.
- Terminal classification & scoring: `ClassifyTerminal` fingerprints ELF (PT_INTERP, Go build ID, Rust notes), PE/CLR, and JAR manifests, pairs them with shebang/runtime heuristics (`python`, `node`, `java`, `.NET`, `php-fpm`, `nginx`, `ruby`), and emits `EntryTracePlan`/`EntryTraceTerminal` records capped at 95-point confidence.
- NDJSON + capability stream: `EntryTraceNdjsonWriter` produces deterministic `entrytrace.entry/node/edge/target/warning/capability` lines consumed by AOC, CLI, and policy surfaces.
- Runtime reconciliation: `ProcFileSystemSnapshot` + `ProcGraphBuilder` replay `/proc`, `EntryTraceRuntimeReconciler` merges runtime terminals with static predictions, and diagnostics note matches/mismatches.
- Surface integration: Scanner Worker caches graphs (`SurfaceCache`), persists `EntryTraceResult` via the shared store, exposes NDJSON + graph through `ScanAnalysisKeys`, and the WebService/CLI (`scan entrytrace`) return the stored result.
Open follow-ups tracked for this wave:
- SCANNER-ENTRYTRACE-18-507 – fallback candidate discovery (Docker history, `/etc/services/**`, `/usr/local/bin/*-entrypoint`) when ENTRYPOINT/CMD are empty.
- SCANNER-ENTRYTRACE-18-508 – broaden wrapper catalogue (package/tool runners such as `bundle exec`, `npm`, `yarn node`, `docker-php-entrypoint`, `pipenv`, `poetry run`).
- ENTRYTRACE-SURFACE-01 (DOING) / ENTRYTRACE-SURFACE-02 (TODO) – finish wiring Surface.Validation/FS/Secrets to gate prerequisites and remove direct env/secret reads.
Sections §4–§7 below capture the long-term reduction design; features not yet implemented are explicitly noted in the task board.
Probing the analyzer today
- Load the image config:
  ```csharp
  using var stream = File.OpenRead("config.json");
  var config = OciImageConfigLoader.Load(stream);
  ```
- Create a layered filesystem from extracted layer directories or tar archives:
  ```csharp
  var fs = LayeredRootFileSystem.FromArchives(layers);
  ```
- Build the image context (normalises env, PATH, user, working dir):
  ```csharp
  var imageCtx = EntryTraceImageContextFactory.Create(
      config, fs, new EntryTraceAnalyzerOptions(), imageDigest, scanId);
  ```
- Resolve the entry trace:
  ```csharp
  var analyzer = serviceProvider.GetRequiredService<IEntryTraceAnalyzer>();
  var graph = await analyzer.ResolveAsync(imageCtx.Entrypoint, imageCtx.Context, cancellationToken);
  ```
- Inspect results – `graph.Terminals` lists classified candidates (path, runtime, confidence, evidence), `graph.Nodes`/`Edges` capture the explainable chain, and `graph.Diagnostics` highlight unresolved steps. Emit metrics/telemetry via `EntryTraceMetrics`.
- Serialize if needed – pass the graph through `EntryTraceNdjsonWriter.Serialize` to obtain deterministic NDJSON lines; the helper already computes capability summaries.
For ad-hoc investigation, snapshotting EntryTraceResult keeps graph and NDJSON aligned. Avoid ad-hoc JSON writers to maintain ordering guarantees.
Probing through Scanner.Worker
EntryTrace runs automatically inside the worker when these metadata keys exist on the lease:
| Key | Purpose |
|---|---|
| `ScanMetadataKeys.ImageConfigPath` (default `scanner.analyzers.entrytrace.configMetadataKey`) | Absolute path to the OCI `config.json`. |
| `ScanMetadataKeys.LayerDirectories` or `ScanMetadataKeys.LayerArchives` | Semicolon-delimited list of extracted layer folders or tar archives. |
| `ScanMetadataKeys.RuntimeProcRoot` (optional) | Path to a captured `/proc` tree for runtime reconciliation (air-gapped runs can mount a snapshot). |
Worker output lands in context.Analysis (EntryTraceGraph, EntryTraceNdjson) and is persisted via IEntryTraceResultStore. Ensure Surface Validation prerequisites pass before dispatching the analyzer.
Probing via WebService & CLI
- REST: `GET /api/scans/{scanId}/entrytrace` returns `EntryTraceResponse` (graph + ndjson + metadata). Requires scan ownership/authz.
- CLI: `stella scan entrytrace <scan-id> [--ndjson] [--verbose]` renders a confidence-sorted terminal table, diagnostics, and optionally the NDJSON payload.
Both surfaces consume the persisted result; rerunning the worker updates the stored document atomically.
NDJSON reference
EntryTraceNdjsonWriter.Serialize emits newline-delimited JSON in the following order so AOC consumers can stream without buffering:
- `entrytrace.entry`: scan metadata (scan id, image digest, outcome, counts).
- `entrytrace.node`: every node in the graph with arguments, interpreter, evidence, and metadata.
- `entrytrace.edge`: directed relationships between nodes with optional wrapper metadata.
- `entrytrace.target`: resolved terminal programmes (`EntryTracePlan`), including runtime, confidence, arguments, environment, and evidence.
- `entrytrace.warning`: diagnostics (severity, reason, span, related path).
- `entrytrace.capability`: aggregated wrapper capabilities discovered during traversal.
Every line ends with a newline and is emitted in deterministic order (IDs ascending, keys lexicographically sorted) so downstream tooling can hash or diff outputs reproducibly.
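The ordering guarantee above can be illustrated with a minimal sketch. It is not the `EntryTraceNdjsonWriter` implementation: the `DeterministicNdjson` helper and the `type`/`id` field names are hypothetical, chosen only to show how sorting keys ordinally makes a line independent of insertion order.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

static class DeterministicNdjson
{
    // Serialises one record as a single JSON line with keys in ordinal order.
    // The field names passed in are illustrative, not the real EntryTrace schema.
    public static string ToLine(IDictionary<string, object> fields)
    {
        var sorted = new SortedDictionary<string, object>(fields, StringComparer.Ordinal);
        return JsonSerializer.Serialize(sorted) + "\n";
    }
}
```

Two dictionaries holding the same fields in a different insertion order yield byte-identical lines, which is the property that lets downstream tooling hash or diff the stream.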
1) Loading OCI images
1.1 Supported inputs
- Registry references (`repo:tag@sha256:digest`) using the existing content store.
- Local OCI/Docker v2 archives (`docker save` tarball, OCI layout directory with `index.json` + `blobs/sha256/*`).
1.2 Normalised model
```csharp
sealed class OciImage {
    public required string Os;
    public required string Arch;
    public required string[] Entrypoint;
    public required string[] Cmd;
    public required string[] Shell; // Windows / powershell overrides
    public required string WorkingDir;
    public required string[] Env;
    public required string[] ExposedPorts;
    public required LabelMap Labels;
    public required LayerRef[] Layers; // ordered, compressed blobs
}
```
Compose the runtime argv as Entrypoint ++ Cmd, honouring shell-form vs exec-form (see §2.3).
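As a minimal sketch of that composition, the hypothetical `ArgvComposer` helper below concatenates the two arrays. It assumes shell-form instructions have already been serialised into `config.json` as `/bin/sh -c` arrays, so no extra handling is needed at this step.

```csharp
using System;
using System.Linq;

static class ArgvComposer
{
    // Hypothetical helper: Entrypoint followed by Cmd; either may be missing.
    public static string[] Compose(string[] entrypoint, string[] cmd) =>
        (entrypoint ?? Array.Empty<string>())
            .Concat(cmd ?? Array.Empty<string>())
            .ToArray();
}
```

For example, an entrypoint of `["/docker-entrypoint.sh"]` with a command of `["nginx", "-g", "daemon off;"]` composes to a four-element argv whose first element is the entrypoint script.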
2) Overlay virtual filesystem
2.1 Whiteouts
- Regular whiteout: `path/.wh.<name>` removes `<name>` from lower layers.
- Opaque directory: `path/.wh..wh..opq` hides everything that lower layers placed inside that directory.
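The two whiteout rules can be sketched over plain path lists. This is an illustration, not the `LayeredRootFileSystem` code: `WhiteoutOverlay` is a hypothetical name, and layers are modelled as lists of tar entry paths ordered bottom to top.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class WhiteoutOverlay
{
    const string Prefix = ".wh.";
    const string Opaque = ".wh..wh..opq";

    // Layers are ordered bottom to top; each layer is a list of tar entry paths.
    public static SortedSet<string> Merge(IEnumerable<IEnumerable<string>> layers)
    {
        var visible = new SortedSet<string>(StringComparer.Ordinal);
        foreach (var layer in layers)
        {
            var entries = layer.ToList();
            // Apply whiteouts first so they only hide content from lower layers.
            foreach (var entry in entries)
            {
                var slash = entry.LastIndexOf('/');
                var dir = slash < 0 ? "" : entry.Substring(0, slash + 1);
                var name = entry.Substring(slash + 1);
                if (name == Opaque)
                {
                    // Opaque marker: drop everything lower layers put under this directory.
                    visible.RemoveWhere(p => p.StartsWith(dir, StringComparison.Ordinal));
                }
                else if (name.StartsWith(Prefix, StringComparison.Ordinal))
                {
                    // Regular whiteout: drop the named entry and anything beneath it.
                    var target = dir + name.Substring(Prefix.Length);
                    visible.RemoveWhere(p => p == target ||
                        p.StartsWith(target + "/", StringComparison.Ordinal));
                }
            }
            // Then add this layer's real entries; marker files never become visible.
            foreach (var entry in entries)
            {
                var name = entry.Substring(entry.LastIndexOf('/') + 1);
                if (!name.StartsWith(Prefix, StringComparison.Ordinal))
                    visible.Add(entry);
            }
        }
        return visible;
    }
}
```

Processing the markers before the layer's own files matters: a layer may make a directory opaque and repopulate it in the same tarball.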
2.2 Lazy extraction
- First pass: build a tar index `(path → layer, offset, size, mode, isWhiteout, isDir)`.
- Decompress only when reading a file; optionally support eStargz TOC to accelerate random access.
2.3 Shell-form composition
- Dockerfile shell form is serialised as `["/bin/sh","-c","…"]` (or `Shell[]` override on Windows).
- Always trust `config.json`; no need to inspect the Dockerfile.
- Working directory defaults to `/` if unspecified.
3) Low-level primitives
3.1 PATH resolution
- Extract `PATH` from environment (fallback `/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin`).
- If `argv[0]` contains no `/`, walk the PATH to resolve an absolute file; if it is a relative path containing `/`, resolve it against the working directory.
- Verify execute bit (or Windows ACL) before accepting.
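A minimal sketch of this lookup follows. It mirrors `execvp`-style behaviour, where only names without a `/` are searched on the PATH and names containing one are resolved against the working directory. The `PathResolver` class and the `isExecutable` callback are hypothetical; the callback stands in for the layered filesystem's existence and execute-bit check.

```csharp
using System;

static class PathResolver
{
    public const string DefaultPath =
        "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin";

    // Returns an absolute path, or null when nothing executable is found.
    // isExecutable is a hypothetical stand-in for the layered filesystem check.
    public static string Resolve(string argv0, string pathEnv, string workingDir,
                                 Func<string, bool> isExecutable)
    {
        if (argv0.Contains('/'))
        {
            // Names containing '/' bypass PATH; relative ones use the working directory.
            // No "." or ".." normalisation is attempted in this sketch.
            var direct = argv0[0] == '/' ? argv0 : workingDir.TrimEnd('/') + "/" + argv0;
            return isExecutable(direct) ? direct : null;
        }

        var path = string.IsNullOrEmpty(pathEnv) ? DefaultPath : pathEnv;
        foreach (var dir in path.Split(':', StringSplitOptions.RemoveEmptyEntries))
        {
            var candidate = dir.TrimEnd('/') + "/" + argv0;
            if (isExecutable(candidate)) return candidate;
        }
        return null;
    }
}
```

Passing the filesystem check as a delegate keeps the search order testable without touching a real image.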
3.2 Shebang handling
- For non-ELF/PE files: read the first line and interpret `#!interpreter args`.
- Rewrite argv per kernel semantics: the interpreter becomes `argv[0]`, followed by the shebang argument (if any), the script path, and then the original arguments.
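The rewrite can be sketched as below, assuming Linux behaviour where everything after the interpreter on the shebang line is passed as a single argument. `Shebang.Rewrite` is a hypothetical helper, not part of the analyzer's API.

```csharp
using System;
using System.Collections.Generic;

static class Shebang
{
    // Returns the rewritten argv, or null when the first line is not a shebang.
    // originalArgs are the arguments that followed the script (the old argv[1..]).
    public static string[] Rewrite(string firstLine, string scriptPath, string[] originalArgs)
    {
        if (!firstLine.StartsWith("#!", StringComparison.Ordinal)) return null;
        var rest = firstLine.Substring(2).Trim();
        if (rest.Length == 0) return null;

        var result = new List<string>();
        var split = rest.IndexOfAny(new[] { ' ', '\t' });
        if (split < 0)
        {
            result.Add(rest); // interpreter only
        }
        else
        {
            result.Add(rest.Substring(0, split));
            // Linux hands the remainder to the interpreter as one argument.
            result.Add(rest.Substring(split + 1).Trim());
        }
        result.Add(scriptPath);
        result.AddRange(originalArgs);
        return result.ToArray();
    }
}
```

With a first line of `#!/usr/bin/env python3`, a script at `/app/main.py`, and original arguments `--port 80`, the rewritten argv is `/usr/bin/env python3 /app/main.py --port 80`.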
3.3 Binary probes
- Identify ELF via magic `\x7FELF`, parse `.interp`, `.dynamic`, linked libs, `.note.go.buildid`, DWARF producer, `.rustc` notes, and musl/glibc fingerprints.
- Identify PE (Windows) and detect .NET single-file bundles via CLI header / metadata tables; capture ready-to-run vs IL-only markers.
- Inspect archives (JAR/WAR/EAR) for `META-INF/MANIFEST.MF` `Main-Class`/`Main-Module` and signed entries.
- Detect PHP-FPM / nginx launchers (`php-fpm`, `apache2-foreground`, `nginx -g 'daemon off;'`) via binary names + nearby config (php.ini, nginx.conf).
- Record evidence tuples for runtime scoring (interpreter, build ID, runtime note) so downstream components can explain the classification.
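The first step of these probes, sniffing the leading bytes, might look like the sketch below. `MagicProbe` and its string labels are hypothetical. An `MZ` prefix only marks a PE candidate (the PE header still has to be validated), and a ZIP signature only marks a possible JAR/WAR/EAR.

```csharp
using System;

static class MagicProbe
{
    // Classifies a file from its first bytes; deeper parsing happens afterwards.
    public static string Classify(ReadOnlySpan<byte> head)
    {
        if (head.Length >= 4 && head[0] == 0x7F && head[1] == (byte)'E' &&
            head[2] == (byte)'L' && head[3] == (byte)'F')
            return "elf";
        if (head.Length >= 2 && head[0] == (byte)'M' && head[1] == (byte)'Z')
            return "pe-candidate";
        if (head.Length >= 4 && head[0] == (byte)'P' && head[1] == (byte)'K' &&
            head[2] == 3 && head[3] == 4)
            return "zip-archive";
        if (head.Length >= 2 && head[0] == (byte)'#' && head[1] == (byte)'!')
            return "script";
        return "unknown";
    }
}
```

Only a handful of bytes is needed, which fits the bounded reads the layered filesystem exposes.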
4) Wrapper catalogue
Roadmap note: extended package/tool runners land with SCANNER-ENTRYTRACE-18-508; today the catalogue covers init/user-switch/environment/supervisor wrappers listed above.
Collapse known wrappers before analysing the target command:
- Init shims: `tini`, `dumb-init`, `s6-svscan`, `runit`, `supervisord`.
- Privilege droppers: `gosu`, `su-exec`, `chpst`.
- Shells: `sh`, `bash`, `dash`, BusyBox variants.
- Package runners: `npm`, `yarn`, `pnpm`, `pip`, `pipenv`, `poetry`, `bundle`, `rake`.
Rules:
- If wrapper contains a `--` sentinel (`tini -- app …`) drop the wrapper and record a reduction edge.
- `gosu user cmd …` → collapse to `cmd …`.
- For shell wrappers, delegate to the ShellFlow analyser (see separate guide).
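The first two rules can be sketched as a small loop. `WrapperCollapse`, its two wrapper sets, and the string form of the recorded edges are hypothetical simplifications. In particular, a sentinel wrapper without `--` is assumed to take no flags, which the real catalogue would need to handle.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class WrapperCollapse
{
    static readonly HashSet<string> SentinelWrappers =
        new(StringComparer.Ordinal) { "tini", "dumb-init" };
    static readonly HashSet<string> UserSwitchers =
        new(StringComparer.Ordinal) { "gosu", "su-exec" };

    // Removes one wrapper; returns null when argv[0] is not a known wrapper.
    public static string[] CollapseOnce(string[] argv)
    {
        if (argv.Length == 0) return null;
        var name = argv[0].Substring(argv[0].LastIndexOf('/') + 1);

        if (SentinelWrappers.Contains(name))
        {
            // Prefer the "--" sentinel; otherwise assume the command follows directly.
            var sentinel = Array.IndexOf(argv, "--");
            return argv.Skip(sentinel >= 0 ? sentinel + 1 : 1).ToArray();
        }
        if (UserSwitchers.Contains(name) && argv.Length > 2)
            return argv.Skip(2).ToArray(); // gosu user cmd … → cmd …
        return null;
    }

    // Collapses repeatedly, recording one edge description per removed wrapper.
    public static (string[] Argv, List<string> Edges) Collapse(string[] argv)
    {
        var edges = new List<string>();
        while (CollapseOnce(argv) is { Length: > 0 } next)
        {
            edges.Add(argv[0] + " -> " + next[0]);
            argv = next;
        }
        return (argv, edges);
    }
}
```

Collapsing `/sbin/tini -- gosu app /usr/bin/node server.js` leaves `/usr/bin/node server.js` and records two edges, one per wrapper removed.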
5) ShellFlow integration
When the resolved command is a shell script, invoke the ShellFlow analyser to locate the eventual exec target. Key capabilities:
- Parses POSIX sh (and common Bash extensions).
- Tracks environment mutations (`set`, `export`, `set --`).
- Resolves `$@`, `$1..9`, `${VAR:-default}`.
- Recognises idioms from official Docker images (`if [ "$1" = "server" ]; then …`).
- Emits multiple branches when predicates depend on unknown data, but tags them with lower confidence.
The analyser returns one or more candidate commands along with reasons, which feed into the reduction engine.
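One of the capabilities listed above, resolving `${VAR:-default}`, can be illustrated in isolation. This regex-based sketch is not how ShellFlow works (it is described as parser-driven). `ParamExpansion` is a hypothetical helper that handles only the `${VAR}` and `${VAR:-default}` forms, without nesting or quoting.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ParamExpansion
{
    static readonly Regex Pattern =
        new(@"\$\{(?<name>[A-Za-z_][A-Za-z0-9_]*)(:-(?<def>[^}]*))?\}");

    // Substitutes the variable's value, or the default when it is unset or empty.
    public static string Expand(string text, IReadOnlyDictionary<string, string> env) =>
        Pattern.Replace(text, m =>
        {
            var hasValue = env.TryGetValue(m.Groups["name"].Value, out var value)
                           && value.Length > 0;
            // An unmatched "def" group yields an empty string, as the shell would.
            return hasValue ? value : m.Groups["def"].Value;
        });
}
```

With only `PORT=8080` in the environment, `exec ${APP_BIN:-/usr/bin/app} --port ${PORT}` expands to `exec /usr/bin/app --port 8080`.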
6) Reduction algorithm
- Compose argv `ENTRYPOINT ++ CMD`.
- Collapse wrappers; append `ReductionEdge` entries documenting each step.
- Resolve argv0 to an absolute file and classify (ELF/PE/script).
- If script → run ShellFlow; replace current command with highest-confidence `exec` target while preserving alternates as evidence.
- Attempt to resolve application artefacts for VM hosts (JARs, DLLs, JS entry, Python module, etc.).
- Emit `EntryTraceResult` with candidate terminals ranked by confidence.
7) Confidence scoring
Use a simple logistic model with feature contributions captured for the evidence trail. Example features:
| Id | Signal | Weight |
|---|---|---|
| `f1` | Entrypoint already an executable (ELF/PE) | +0.18 |
| `f2` | Observed chain ends in non-wrapper binary | +0.22 |
| `f3` | VM host + resolvable artefact | +0.20 |
| `f4` | Exposed ports align with runtime | +0.06 |
| `f5` | Shebang interpreter matches runtime family | +0.05 |
| `f6` | Language artefact validation succeeded | +0.15 |
| `f8` | Multi-branch script unresolved (`$@` taint) | −0.20 |
| `f9` | Target missing execute bit | −0.25 |
| `f10` | Shell with no `exec` | −0.18 |
Persist per-feature evidence strings so UI/CLI users can see why the scanner picked a given entry point.
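A minimal sketch of such a model follows. The weights are copied from the table above; everything else is an assumption. The text does not specify a bias term or any scaling, so the sketch uses a bias of zero and a plain sigmoid, and `ConfidenceModel` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class ConfidenceModel
{
    // Weights from the feature table; the bias and sigmoid form are assumptions.
    static readonly Dictionary<string, double> Weights = new()
    {
        ["f1"] = 0.18, ["f2"] = 0.22, ["f3"] = 0.20, ["f4"] = 0.06, ["f5"] = 0.05,
        ["f6"] = 0.15, ["f8"] = -0.20, ["f9"] = -0.25, ["f10"] = -0.18,
    };

    // Sums the weights of the fired features and squashes the total with a sigmoid,
    // keeping one evidence string per contributing feature.
    public static (double Score, List<string> Evidence) Evaluate(
        IEnumerable<string> firedFeatures, double bias = 0.0)
    {
        var z = bias;
        var evidence = new List<string>();
        foreach (var id in firedFeatures)
        {
            if (!Weights.TryGetValue(id, out var weight)) continue; // unknown ids are ignored
            z += weight;
            evidence.Add(id + ": " + weight.ToString("+0.00;-0.00", CultureInfo.InvariantCulture));
        }
        return (1.0 / (1.0 + Math.Exp(-z)), evidence);
    }
}
```

With no features fired and a zero bias the score is exactly 0.5; positive signals push it up and negative ones pull it down, and the evidence list carries the per-feature contributions for the UI/CLI.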
8) Outputs
Return a populated EntryTraceResult:
- `Terminals` contains the best candidate(s) and their runtime classification.
- `Evidence` aggregates feature messages, ShellFlow reasoning, wrapper reductions, and runtime detector hints.
- `Chain` shows the explainable path from initial Docker argv to the final binary.
Static and dynamic reducers share this shape, enabling downstream modules to remain agnostic of the detection mode.
9) ProcGraph replay (runtime parity)
Static resolution must be reconciled with live observations when a workload is running under the Stella Ops runtime agent:
- Read `/proc/1/{cmdline,exe}` and walk descendants via `/proc/*/stat` to construct the initial exec chain (ascending PID order).
- Collapse known wrappers (`tini`, `dumb-init`, `gosu`, `su-exec`, `s6-supervise`, `runsv`, `supervisord`) and privilege switches, mirroring the static wrapper catalogue.
- Materialise a `ProcGraph` object that records each transition and the resolved executable path (via `/proc/<pid>/exe` symlinks).
- Compare `ProcGraph.Terminal` with `EntryTraceResult.Terminals[0]`, emitting `confidence=high` when they match and downgrading the confidence when they diverge.
- Persist the merged view so the CLI/UI can highlight static vs runtime discrepancies and feed drift detection in Zastava.
This replay is optional for offline scans but required whenever runtime evidence is available, so policy decisions can lean on high-confidence matches.
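Two pieces of the replay can be sketched without a live `/proc`: decoding a captured `cmdline` file, whose arguments are NUL-separated, and comparing the runtime terminal with the static prediction. `ProcReplay` and the `"high"`/`"downgraded"` labels are hypothetical; the real reconciler's types and outcome values may differ.

```csharp
using System;
using System.Text;

static class ProcReplay
{
    // /proc/<pid>/cmdline stores argv entries separated (and terminated) by NUL bytes.
    public static string[] ParseCmdline(byte[] raw)
    {
        if (raw.Length == 0) return Array.Empty<string>(); // e.g. kernel threads
        return Encoding.UTF8.GetString(raw).TrimEnd('\0').Split('\0');
    }

    // Exact path match keeps high confidence; anything else is flagged as a downgrade.
    public static string Compare(string staticTerminal, string runtimeExe) =>
        string.Equals(staticTerminal, runtimeExe, StringComparison.Ordinal)
            ? "high"
            : "downgraded";
}
```

A snapshot mounted for an air-gapped run can be fed through the same two functions, since neither touches the live filesystem.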
10) Service & CLI surfaces
- Scanner.WebService must expose `/scans/{scanId}/entrytrace` returning chain, terminal classification, evidence, and runtime agreement markers.
- CLI gains `stella scan entrypoint <scanId>` (and JSON streaming) for air-gapped review.
- Policy / Export payloads include `entrytrace_terminal`, `entrytrace_confidence`, and evidence arrays so downstream consumers retain provenance.
- All outputs reuse the same `EntryTraceResult` schema (§8) and the NDJSON stream described in the NDJSON reference (§0), keeping the Offline Kit and DSSE attestations deterministic.