
Entry-Point Static Analysis

This guide captures the static half of the StellaOps entry-point detection pipeline: how we turn image metadata and filesystem contents into a resolved binary, an execution chain, and a confidence score.

0) Implementation snapshot — Sprint130.A (2025-11-02)

The StellaOps.Scanner.EntryTrace stack (analyzer + worker + surfaces) currently provides:

  • OCI config + layered FS context: EntryTraceImageContextFactory normalises environment (PATH fallback), user, and working directory while LayeredRootFileSystem handles whiteouts, symlinks, and bounded byte reads (TryReadBytes) so ELF/PE probing stays offline friendly.
  • Wrapper-aware exec expansion: the analyzer unwraps init/user-switch/environment/supervisor wrappers (tini, dumb-init, gosu, su-exec, chpst, env, supervisord, s6-supervise, runsv*) and records guard metadata plus environment/user deltas on nodes and edges.
  • Script + interpreter resolution: POSIX shell parsing (AST-driven) covers source, run-parts, exec, and supervisor service directories, with Windows cmd /c support. Python -m, Node script, and Java -jar lookups add evidence when targets are located.
  • Terminal classification & scoring: ClassifyTerminal fingerprints ELF (PT_INTERP, Go build ID, Rust notes), PE/CLR, and JAR manifests, pairs them with shebang/runtime heuristics (python, node, java, .NET, php-fpm, nginx, ruby), and emits EntryTracePlan/EntryTraceTerminal records capped at 95-point confidence.
  • NDJSON + capability stream: EntryTraceNdjsonWriter produces deterministic entrytrace.entry/node/edge/target/warning/capability lines consumed by AOC, CLI, and policy surfaces.
  • Runtime reconciliation: ProcFileSystemSnapshot + ProcGraphBuilder replay /proc, EntryTraceRuntimeReconciler merges runtime terminals with static predictions, and diagnostics note matches/mismatches.
  • Surface integration: Scanner Worker caches graphs (SurfaceCache), persists EntryTraceResult via the shared store, exposes NDJSON + graph through ScanAnalysisKeys, and the WebService/CLI (scan entrytrace) return the stored result.

Open follow-ups tracked for this wave:

  • SCANNER-ENTRYTRACE-18-507 fallback candidate discovery (Docker history, /etc/services/**, /usr/local/bin/*-entrypoint) when ENTRYPOINT/CMD are empty.
  • SCANNER-ENTRYTRACE-18-508 broaden wrapper catalogue (package/tool runners such as bundle exec, npm, yarn node, docker-php-entrypoint, pipenv, poetry run).
  • ENTRYTRACE-SURFACE-01 (DOING) / ENTRYTRACE-SURFACE-02 (TODO) finish wiring Surface.Validation/FS/Secrets to gate prerequisites and remove direct env/secret reads.

Sections §4 to §7 below capture the long-term reduction design; features not yet implemented are explicitly noted in the task board.

Probing the analyzer today

  1. Load the image config:
    using var stream = File.OpenRead("config.json");
    var config = OciImageConfigLoader.Load(stream);
    
  2. Create a layered filesystem from extracted layer directories or tar archives:
    var fs = LayeredRootFileSystem.FromArchives(layers);
    
  3. Build the image context (normalises env, PATH, user, working dir):
    var imageCtx = EntryTraceImageContextFactory.Create(
        config, fs, new EntryTraceAnalyzerOptions(), imageDigest, scanId);
    
  4. Resolve the entry trace:
    var analyzer = serviceProvider.GetRequiredService<IEntryTraceAnalyzer>();
    var graph = await analyzer.ResolveAsync(imageCtx.Entrypoint, imageCtx.Context, cancellationToken);
    
  5. Inspect results: graph.Terminals lists classified candidates (path, runtime, confidence, evidence), graph.Nodes/Edges capture the explainable chain, and graph.Diagnostics highlights unresolved steps. Emit metrics/telemetry via EntryTraceMetrics.
  6. Serialize if needed: pass the graph through EntryTraceNdjsonWriter.Serialize to obtain deterministic NDJSON lines; the helper already computes capability summaries.

For ad-hoc investigation, snapshotting EntryTraceResult keeps graph and NDJSON aligned. Avoid ad-hoc JSON writers to maintain ordering guarantees.

Probing through Scanner.Worker

EntryTrace runs automatically inside the worker when these metadata keys exist on the lease:

| Key | Purpose |
| --- | --- |
| ScanMetadataKeys.ImageConfigPath (default scanner.analyzers.entrytrace.configMetadataKey) | Absolute path to the OCI config.json. |
| ScanMetadataKeys.LayerDirectories or ScanMetadataKeys.LayerArchives | Semicolon-delimited list of extracted layer folders or tar archives. |
| ScanMetadataKeys.RuntimeProcRoot (optional) | Path to a captured /proc tree for runtime reconciliation (air-gapped runs can mount a snapshot). |

Worker output lands in context.Analysis (EntryTraceGraph, EntryTraceNdjson) and is persisted via IEntryTraceResultStore. Ensure Surface Validation prerequisites pass before dispatching the analyzer.

Probing via WebService & CLI

  • REST: GET /api/scans/{scanId}/entrytrace returns EntryTraceResponse (graph + ndjson + metadata). Requires scan ownership/authz.
  • CLI: stella scan entrytrace <scan-id> [--ndjson] [--verbose] renders a confidence-sorted terminal table, diagnostics, and optionally the NDJSON payload.

Both surfaces consume the persisted result; rerunning the worker updates the stored document atomically.

NDJSON reference

EntryTraceNdjsonWriter.Serialize emits newline-delimited JSON in the following order so AOC consumers can stream without buffering:

  • entrytrace.entry — scan metadata (scan id, image digest, outcome, counts).
  • entrytrace.node — every node in the graph with arguments, interpreter, evidence, and metadata.
  • entrytrace.edge — directed relationships between nodes with optional wrapper metadata.
  • entrytrace.target — resolved terminal programmes (EntryTracePlan), including runtime, confidence, arguments, environment, and evidence.
  • entrytrace.warning — diagnostics (severity, reason, span, related path).
  • entrytrace.capability — aggregated wrapper capabilities discovered during traversal.

Every line ends with a newline and is emitted in deterministic order (IDs ascending, keys lexicographically sorted) so downstream tooling can hash or diff outputs reproducibly.
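The ordering guarantees above can be illustrated with a minimal sketch. It is not the real EntryTraceNdjsonWriter (whose internals are not shown in this guide): the NdjsonSketch class, the tuple shape, and the "type"/"id" field names are assumptions made for illustration. It only demonstrates the idea of emitting record kinds in a fixed order, IDs ascending, with ordinal-sorted keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.Json;

// Hypothetical helper for illustration only; not the StellaOps writer.
public static class NdjsonSketch
{
    // Record kinds in the emission order listed in this section.
    private static readonly string[] KindOrder =
    {
        "entrytrace.entry", "entrytrace.node", "entrytrace.edge",
        "entrytrace.target", "entrytrace.warning", "entrytrace.capability",
    };

    public static string Serialize(
        IEnumerable<(string Kind, int Id, IDictionary<string, object> Fields)> records)
    {
        var sb = new StringBuilder();
        // Kinds not in KindOrder get index -1 and therefore sort first.
        var ordered = records
            .OrderBy(r => Array.IndexOf(KindOrder, r.Kind))
            .ThenBy(r => r.Id);

        foreach (var r in ordered)
        {
            // Ordinal-sorted keys keep the byte layout stable across runs.
            var line = new SortedDictionary<string, object>(r.Fields, StringComparer.Ordinal)
            {
                ["type"] = r.Kind, // assumed field name
                ["id"] = r.Id,     // assumed field name
            };
            sb.Append(JsonSerializer.Serialize(line)).Append('\n');
        }
        return sb.ToString();
    }
}
```

Because the output depends only on the record contents, hashing the returned string gives the same digest regardless of the order in which records were collected.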

1) Loading OCI images

1.1 Supported inputs

  • Registry references (repo:tag@sha256:digest) using the existing content store.
  • Local OCI/Docker v2 archives (docker save tarball, OCI layout directory with index.json + blobs/sha256/*).

1.2 Normalised model

sealed class OciImage {
  public required string Os;
  public required string Arch;
  public required string[] Entrypoint;
  public required string[] Cmd;
  public required string[] Shell;      // Windows / powershell overrides
  public required string WorkingDir;
  public required string[] Env;
  public required string[] ExposedPorts;
  public required LabelMap Labels;
  public required LayerRef[] Layers;   // ordered, compressed blobs
}

Compose the runtime argv as Entrypoint ++ Cmd, honouring shell-form vs exec-form (see §2.3).

2) Overlay virtual filesystem

2.1 Whiteouts

  • Regular whiteout: path/.wh.<name> removes <name> from lower layers.
  • Opaque directory: path/.wh..wh..opq hides everything under path that comes from lower layers; entries added by the same or higher layers remain visible.
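A minimal sketch of how these two whiteout rules could be applied when merging layer listings is shown below. It is not the LayeredRootFileSystem implementation; OverlaySketch and its method names are hypothetical, paths are assumed to be relative with no leading slash, and each layer is reduced to a plain list of entry paths.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration of OCI whiteout handling over path listings.
public static class OverlaySketch
{
    private const string WhiteoutPrefix = ".wh.";
    private const string OpaqueMarker = ".wh..wh..opq";

    // layers: ordered bottom to top; each inner sequence lists that layer's entries.
    public static SortedSet<string> Merge(IEnumerable<IEnumerable<string>> layers)
    {
        var visible = new SortedSet<string>(StringComparer.Ordinal);
        foreach (var layer in layers)
        {
            var entries = layer.ToList();

            // Pass 1: whiteouts only affect what lower layers contributed.
            foreach (var entry in entries)
            {
                var slash = entry.LastIndexOf('/');
                var dir = slash < 0 ? "" : entry.Substring(0, slash);
                var name = entry.Substring(slash + 1);
                if (name == OpaqueMarker)
                    RemoveTree(visible, dir, keepRoot: true);
                else if (name.StartsWith(WhiteoutPrefix, StringComparison.Ordinal)
                         && name.Length > WhiteoutPrefix.Length)
                    RemoveTree(visible, Join(dir, name.Substring(WhiteoutPrefix.Length)), keepRoot: false);
            }

            // Pass 2: add this layer's real entries (marker files are never visible).
            foreach (var entry in entries)
            {
                var name = entry.Substring(entry.LastIndexOf('/') + 1);
                if (!name.StartsWith(WhiteoutPrefix, StringComparison.Ordinal))
                    visible.Add(entry);
            }
        }
        return visible;
    }

    private static string Join(string dir, string name) =>
        dir.Length == 0 ? name : dir + "/" + name;

    private static void RemoveTree(SortedSet<string> set, string root, bool keepRoot)
    {
        var prefix = root.Length == 0 ? "" : root + "/";
        set.RemoveWhere(p => (!keepRoot && p == root)
                             || p.StartsWith(prefix, StringComparison.Ordinal));
    }
}
```

Processing whiteouts before additions within a layer is what lets an opaque directory keep the files that the same layer places inside it.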

2.2 Lazy extraction

  • First pass: build a tar index (path → layer, offset, size, mode, isWhiteout, isDir).
  • Decompress only when reading a file; optionally support eStargz TOC to accelerate random access.

2.3 Shell-form composition

  • Dockerfile shell form is serialised as ["/bin/sh","-c","…"] (or Shell[] override on Windows).
  • Always trust config.json; no need to inspect the Dockerfile.
  • Working directory defaults to / if unspecified.

3) Low-level primitives

3.1 PATH resolution

  • Extract PATH from environment (fallback /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin).
  • If argv[0] contains no /, walk the PATH to resolve an absolute file; if it is a relative path that contains /, resolve it against the working directory instead.
  • Verify execute bit (or Windows ACL) before accepting.
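The lookup can be sketched as follows. PathResolverSketch is a hypothetical name, and the isExecutable callback stands in for whatever execute-bit probe the layered filesystem offers; neither is taken from the StellaOps code base. The sketch does not normalise "." or ".." segments.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of PATH-based resolution of argv[0].
public static class PathResolverSketch
{
    public const string DefaultPath =
        "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin";

    public static bool TryResolve(
        string argv0,
        IReadOnlyDictionary<string, string> env,
        string workingDir,
        Func<string, bool> isExecutable, // assumed probe: true when the file exists and is executable
        out string resolved)
    {
        resolved = string.Empty;
        var candidates = new List<string>();

        if (argv0.StartsWith("/", StringComparison.Ordinal))
        {
            candidates.Add(argv0); // already absolute
        }
        else if (argv0.Contains('/'))
        {
            candidates.Add(Combine(workingDir, argv0)); // relative path: no PATH walk
        }
        else
        {
            var path = env.TryGetValue("PATH", out var value) && value.Length > 0
                ? value
                : DefaultPath;
            foreach (var dir in path.Split(':'))
            {
                // An empty PATH element conventionally means the working directory.
                candidates.Add(Combine(dir.Length == 0 ? workingDir : dir, argv0));
            }
        }

        foreach (var candidate in candidates)
        {
            if (isExecutable(candidate))
            {
                resolved = candidate;
                return true;
            }
        }
        return false;
    }

    private static string Combine(string dir, string name) =>
        dir.TrimEnd('/') + "/" + name;
}
```

The first executable match wins, which mirrors how a shell walks PATH from left to right.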

3.2 Shebang handling

  • For non-ELF/PE files: read first line; interpret #!interpreter args.
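  • Rewrite argv per kernel semantics: the interpreter, then the shebang argument (Linux passes everything after the interpreter as a single argument), then the script path, then the original arguments after argv[0].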
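As a rough illustration of that rewrite, the sketch below follows Linux behaviour, where the text after the interpreter is passed as one argument rather than being split on whitespace. ShebangSketch is a hypothetical helper, not part of the analyzer API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration of shebang-driven argv rewriting (Linux semantics).
public static class ShebangSketch
{
    // Returns the rewritten argv, or the original argv when firstLine has no shebang.
    public static string[] Apply(string scriptPath, string firstLine, string[] argv)
    {
        if (!firstLine.StartsWith("#!", StringComparison.Ordinal))
            return argv;

        var rest = firstLine.Substring(2).Trim();
        if (rest.Length == 0)
            return argv;

        var result = new List<string>();
        var split = rest.IndexOfAny(new[] { ' ', '\t' });
        if (split < 0)
        {
            result.Add(rest); // interpreter only
        }
        else
        {
            result.Add(rest.Substring(0, split));         // interpreter
            result.Add(rest.Substring(split + 1).Trim()); // single optional argument
        }

        result.Add(scriptPath);        // the script replaces the original argv[0]
        result.AddRange(argv.Skip(1)); // original arguments follow
        return result.ToArray();
    }
}
```

For a script at /app/run.py starting with `#!/usr/bin/env python3` and invoked as `run.py --port 80`, the sketch yields `/usr/bin/env python3 /app/run.py --port 80`, after which `/usr/bin/env` itself would be treated as a wrapper.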

3.3 Binary probes

  • Identify ELF via magic \x7FELF, parse .interp, .dynamic, linked libs, .note.go.buildid, DWARF producer, .rustc notes, and musl/glibc fingerprints.
  • Identify PE (Windows) and detect .NET single-file bundles via CLI header / metadata tables; capture ready-to-run vs IL-only markers.
  • Inspect archives (JAR/WAR/EAR) for META-INF/MANIFEST.MF Main-Class/Main-Module and signed entries.
  • Detect PHP-FPM / nginx launchers (php-fpm, apache2-foreground, nginx -g 'daemon off;') via binary names + nearby config (php.ini, nginx.conf).
  • Record evidence tuples for runtime scoring (interpreter, build ID, runtime note) so downstream components can explain the classification.
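Before any of the deeper parsing listed above, a probe typically starts with a cheap magic-number check on the first bytes of the file. The sketch below shows only that first step; BinaryKind and MagicSketch are hypothetical names, and the real ClassifyTerminal logic is considerably richer.

```csharp
using System;
using System.Buffers.Binary;

public enum BinaryKind { Unknown, Elf, Pe, Zip, Script }

// Hypothetical illustration of header-based format detection.
public static class MagicSketch
{
    public static BinaryKind Classify(ReadOnlySpan<byte> head)
    {
        // ELF: 0x7F 'E' 'L' 'F'
        if (StartsWith(head, 0x7F, (byte)'E', (byte)'L', (byte)'F'))
            return BinaryKind.Elf;

        // ZIP local file header, shared by JAR/WAR/EAR archives: 'P' 'K' 0x03 0x04
        if (StartsWith(head, (byte)'P', (byte)'K', 0x03, 0x04))
            return BinaryKind.Zip;

        // Script with a shebang line
        if (StartsWith(head, (byte)'#', (byte)'!'))
            return BinaryKind.Script;

        // PE: DOS header "MZ", then e_lfanew at 0x3C points at the "PE\0\0" signature.
        if (head.Length >= 0x40 && StartsWith(head, (byte)'M', (byte)'Z'))
        {
            var offset = BinaryPrimitives.ReadInt32LittleEndian(head.Slice(0x3C, 4));
            if (offset >= 0 && offset <= head.Length - 4
                && StartsWith(head.Slice(offset), (byte)'P', (byte)'E', 0, 0))
                return BinaryKind.Pe;
        }

        return BinaryKind.Unknown;
    }

    private static bool StartsWith(ReadOnlySpan<byte> head, params byte[] magic) =>
        head.StartsWith(magic);
}
```

A bounded read of the first few hundred bytes is usually enough for this step, which fits the bounded byte reads mentioned in the implementation snapshot.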

4) Wrapper catalogue

Roadmap note: extended package/tool runners land with SCANNER-ENTRYTRACE-18-508; today the catalogue covers init/user-switch/environment/supervisor wrappers listed above.

Collapse known wrappers before analysing the target command:

  • Init shims: tini, dumb-init, s6-svscan, runit, supervisord.
  • Privilege droppers: gosu, su-exec, chpst.
  • Shells: sh, bash, dash, BusyBox variants.
  • Package runners: npm, yarn, pnpm, pip, pipenv, poetry, bundle, rake.

Rules:

  • If wrapper contains a -- sentinel (tini -- app …) drop the wrapper and record a reduction edge.
  • gosu user cmd … → collapse to cmd ….
  • For shell wrappers, delegate to the ShellFlow analyser (see separate guide).
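The first two rules can be sketched as a small loop that peels wrappers off the front of argv and records one edge per step. WrapperSketch, its two wrapper sets, and the edge string format are assumptions for illustration; the real catalogue also handles wrappers with options (chpst, env, supervisors), which this sketch ignores.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of wrapper collapsing with recorded reduction edges.
public static class WrapperSketch
{
    private static readonly HashSet<string> SentinelWrappers =
        new(StringComparer.Ordinal) { "tini", "dumb-init" };

    private static readonly HashSet<string> UserSwitchers =
        new(StringComparer.Ordinal) { "gosu", "su-exec" };

    // Returns the reduced argv; appends "wrapper -> next" to edges for each removal.
    public static string[] Collapse(string[] argv, List<string> edges)
    {
        var current = argv;
        while (current.Length > 0)
        {
            var name = current[0].Substring(current[0].LastIndexOf('/') + 1);
            string[] next;

            if (SentinelWrappers.Contains(name))
            {
                // tini -- app ...  =>  app ...
                var sentinel = Array.IndexOf(current, "--");
                if (sentinel < 0) break;
                next = current[(sentinel + 1)..];
            }
            else if (UserSwitchers.Contains(name) && current.Length >= 3)
            {
                // gosu user cmd ...  =>  cmd ...
                next = current[2..];
            }
            else
            {
                break; // not a known wrapper: this is the target command
            }

            if (next.Length == 0) break; // nothing left to run; keep the wrapper as-is
            edges.Add($"{name} -> {next[0]}");
            current = next;
        }
        return current;
    }
}
```

Looping until no rule applies handles stacked wrappers such as `tini -- gosu app node server.js`, while the recorded edges preserve the explainable chain.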

5) ShellFlow integration

When the resolved command is a shell script, invoke the ShellFlow analyser to locate the eventual exec target. Key capabilities:

  • Parses POSIX sh (and common Bash extensions).
  • Tracks environment mutations (set, export, set --).
  • Resolves $@, $1..9, ${VAR:-default}.
  • Recognises idioms from official Docker images (if [ "$1" = "server" ]; then …).
  • Emits multiple branches when predicates depend on unknown data, but tags them with lower confidence.

The analyser returns one or more candidate commands along with reasons, which feed into the reduction engine.

6) Reduction algorithm

  1. Compose argv as ENTRYPOINT ++ CMD.
  2. Collapse wrappers; append ReductionEdge entries documenting each step.
  3. Resolve argv0 to an absolute file and classify (ELF/PE/script).
  4. If script → run ShellFlow; replace current command with highest-confidence exec target while preserving alternates as evidence.
  5. Attempt to resolve application artefacts for VM hosts (JARs, DLLs, JS entry, Python module, etc.).
  6. Emit EntryTraceResult with candidate terminals ranked by confidence.

7) Confidence scoring

Use a simple logistic model with feature contributions captured for the evidence trail. Example features:

| Id | Signal | Weight |
| --- | --- | --- |
| f1 | Entrypoint already an executable (ELF/PE) | +0.18 |
| f2 | Observed chain ends in non-wrapper binary | +0.22 |
| f3 | VM host + resolvable artefact | +0.20 |
| f4 | Exposed ports align with runtime | +0.06 |
| f5 | Shebang interpreter matches runtime family | +0.05 |
| f6 | Language artefact validation succeeded | +0.15 |
| f8 | Multi-branch script unresolved ($@ taint) | -0.20 |
| f9 | Target missing execute bit | -0.25 |
| f10 | Shell with no exec | -0.18 |

Persist per-feature evidence strings so UI/CLI users can see why the scanner picked a given entry point.
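One way to read the logistic model is as a sigmoid over the sum of the observed feature weights, with each contribution kept as an evidence string. The sketch below assumes exactly that; the zero bias, the absence of any scaling, the evidence format, and the treatment of f8 to f10 as penalties are all assumptions, so the absolute scores it produces are illustrative rather than the scanner's actual output.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical illustration of logistic scoring with a per-feature evidence trail.
public static class ConfidenceSketch
{
    // Weights from the example table; f8 to f10 are treated as penalties (assumption).
    private static readonly Dictionary<string, double> Weights = new()
    {
        ["f1"] = 0.18, ["f2"] = 0.22, ["f3"] = 0.20,
        ["f4"] = 0.06, ["f5"] = 0.05, ["f6"] = 0.15,
        ["f8"] = -0.20, ["f9"] = -0.25, ["f10"] = -0.18,
    };

    public static (double Score, List<string> Evidence) Score(
        IEnumerable<string> observedFeatures, double bias = 0.0) // bias value is an assumption
    {
        var z = bias;
        var evidence = new List<string>();
        foreach (var id in observedFeatures)
        {
            if (!Weights.TryGetValue(id, out var weight))
                continue; // unknown features contribute nothing

            z += weight;
            evidence.Add(id + ":" + weight.ToString("+0.00;-0.00", CultureInfo.InvariantCulture));
        }

        // Standard logistic function maps the summed contributions into (0, 1).
        var score = 1.0 / (1.0 + Math.Exp(-z));
        return (score, evidence);
    }
}
```

Calling `ConfidenceSketch.Score(new[] { "f1", "f2", "f9" })` returns both the score and the list of contributions, which is the shape a UI or CLI would need to explain why a candidate ranked where it did.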

8) Outputs

Return a populated EntryTraceResult:

  • Terminals contains the best candidate(s) and their runtime classification.
  • Evidence aggregates feature messages, ShellFlow reasoning, wrapper reductions, and runtime detector hints.
  • Chain shows the explainable path from initial Docker argv to the final binary.

Static and dynamic reducers share this shape, enabling downstream modules to remain agnostic of the detection mode.

9) ProcGraph replay (runtime parity)

Static resolution must be reconciled with live observations when a workload is running under the StellaOps runtime agent:

  1. Read /proc/1/{cmdline,exe} and walk descendants via /proc/*/stat to construct the initial exec chain (ascending PID order).
  2. Collapse known wrappers (tini, dumb-init, gosu, su-exec, s6-supervise, runsv, supervisord) and privilege switches, mirroring the static wrapper catalogue.
  3. Materialise a ProcGraph object that records each transition and the resolved executable path (via /proc/<pid>/exe symlinks).
  4. Compare ProcGraph.Terminal with EntryTraceResult.Terminals[0], emitting confidence=high when they match and downgrading confidence when they diverge.
  5. Persist the merged view so the CLI/UI can highlight static vs runtime discrepancies and feed drift detection in Zastava.

This replay is optional for offline scans, but required whenever runtime evidence is available so policy decisions can lean on high-confidence matches.

10) Service & CLI surfaces

  • Scanner.WebService must expose /scans/{scanId}/entrytrace returning chain, terminal classification, evidence, and runtime agreement markers.
  • CLI gains stella scan entrypoint <scanId> (and JSON streaming) for air-gapped review.
  • Policy / Export payloads include entrytrace_terminal, entrytrace_confidence, and evidence arrays so downstream consumers retain provenance.
  • All outputs reuse the same EntryTraceResult schema (§8) and the NDJSON stream described in the NDJSON reference above, keeping the Offline Kit and DSSE attestations deterministic.