git.stella-ops.org/docs/modules/scanner/operations/entrypoint-dynamic-analysis.md
2025-10-30 00:09:39 +02:00

Entry-Point Dynamic Analysis

When we have access to a running container (e.g., during runtime posture checks), StellaOps augments the static inference with live signals. This document describes the Observational Exec Graph (OEG) that powers the dynamic mode.

1) Goals

  • Capture the actual process tree and exec lineage after the container starts.
  • Identify steady-state processes (long-lived, listening, non-wrapper) even when supervision stacks are present.
  • Feed the same reduction and runtime-classification pipeline as the static analyser.

2) Observational Exec Graph (OEG)

2.1 Data sources

  • Tracepoints / eBPF: sched_process_exec, sched_process_fork/clone, and corresponding exit events give us pid, ppid, namespace, binary path, and argv snapshots with minimal overhead.
  • /proc sampling: for each tracked PID, capture /proc/<pid>/{exe,cmdline,cwd} and file descriptors (especially listening sockets).
  • Namespace mapping: normalise host PIDs to container PIDs (NStgid) so the graph is stable across runtimes.
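
The namespace-mapping step can be illustrated with a minimal sketch. It assumes the mapping is read from the `NStgid:` line of `/proc/<pid>/status`, which lists the thread-group ID in each nested PID namespace from outermost to innermost. The class and method names below are hypothetical and not part of the StellaOps code base.

```csharp
using System;

public static class NamespacePid
{
    // Hypothetical helper: extracts the innermost (container-side) PID from
    // the text of /proc/<pid>/status. The NStgid line lists one value per
    // nested PID namespace, outermost first, so the last value is the PID
    // as seen inside the container.
    public static int? InnermostPid(string statusText)
    {
        foreach (var line in statusText.Split('\n'))
        {
            if (!line.StartsWith("NStgid:", StringComparison.Ordinal))
                continue;

            var fields = line.Substring("NStgid:".Length)
                .Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);

            if (fields.Length > 0 && int.TryParse(fields[^1], out var nsPid))
                return nsPid;
        }

        // No NStgid line found (for example on kernels older than 4.1).
        return null;
    }
}
```

A caller would pass the contents of the status file for a tracked host PID and store the result as the node's namespace PID. When the method returns `null`, the caller has to decide whether to keep the host PID or drop the node.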

2.2 Graph model

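// Node: one post-exec process image. NsPid is the PID as seen inside the container's PID namespace.
public sealed record ExecNode(int HostPid, int NsPid, int Ppid, string Exe, string[] Argv, long StartTicks);
// Edge: Kind is "clone" for a fork or "exec" for a program replacement.
public sealed record ExecEdge(int ParentHostPid, int ChildHostPid, string Kind);
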
  • Nodes represent exec() events (post-exec image) and contain the final argv.
  • Edges labelled clone capture forks; exec edges show program replacements.
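
As an illustration of how the graph might be queried, the sketch below walks `clone` edges to list a process's forked children and to recover its lineage back to the root of the observed tree. The two records are repeated so the block is self-contained. `ExecGraphQueries` and its methods are hypothetical names, not the module's API.

```csharp
using System.Collections.Generic;
using System.Linq;

public sealed record ExecNode(int HostPid, int NsPid, int Ppid, string Exe, string[] Argv, long StartTicks);
public sealed record ExecEdge(int ParentHostPid, int ChildHostPid, string Kind);

public static class ExecGraphQueries
{
    // Host PIDs that the given parent forked (clone edges only).
    public static IReadOnlyList<int> ClonedChildren(IEnumerable<ExecEdge> edges, int parentHostPid) =>
        edges.Where(e => e.Kind == "clone" && e.ParentHostPid == parentHostPid)
             .Select(e => e.ChildHostPid)
             .Distinct()
             .ToList();

    // Chain of host PIDs from the given process up to the root of the
    // observed tree, following clone edges backwards. The 'seen' set guards
    // against cycles caused by PID reuse in malformed input.
    public static IReadOnlyList<int> Lineage(IEnumerable<ExecEdge> edges, int hostPid)
    {
        var parentOf = new Dictionary<int, int>();
        foreach (var e in edges)
        {
            if (e.Kind == "clone")
                parentOf[e.ChildHostPid] = e.ParentHostPid;
        }

        var chain = new List<int> { hostPid };
        var seen = new HashSet<int> { hostPid };
        while (parentOf.TryGetValue(chain[^1], out var parent) && seen.Add(parent))
            chain.Add(parent);

        return chain;
    }
}
```

The fan-out of a node is then the number of entries returned by `ClonedChildren`, and the lineage shows which wrappers sit between the container's first process and a candidate.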

2.3 Steady-state candidate selection

For each node compute features:

| Feature | Rationale |
|---|---|
| Lifetime (until sampling end) | Long-lived processes are more likely to be the real workload. |
| Additional execs downstream | Zero execs after start implies terminal. |
| Listening sockets | Owning LISTEN sockets strongly suggests a server. |
| Wrapper catalogue hit | Mark nodes that match known shims (tini, gosu, supervisord, etc.). |
| Children fan-out | Supervisors spawn multiple children and remain parents. |

Feed these features into a scoring function and retain the TopK candidates (usually 1-3) together with the evidence behind each score.
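
A scoring function over the features above might look like the following sketch. The weights, the `CandidateFeatures` record, and the `SteadyStateScorer` class are all assumptions chosen for illustration. The document does not specify the actual formula.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical feature vector for one ExecNode.
public sealed record CandidateFeatures(
    int HostPid,
    double LifetimeSeconds,
    int DownstreamExecs,
    int ListeningSockets,
    bool WrapperHit,
    int ChildCount);

public static class SteadyStateScorer
{
    // Illustrative weights only. windowSeconds must be greater than zero.
    public static double Score(CandidateFeatures f, double windowSeconds)
    {
        double score = 0.0;
        score += 0.4 * Math.Clamp(f.LifetimeSeconds / windowSeconds, 0.0, 1.0); // long-lived
        if (f.DownstreamExecs == 0) score += 0.2;  // terminal: nothing exec'd afterwards
        if (f.ListeningSockets > 0) score += 0.4;  // owns a LISTEN socket
        if (f.WrapperHit) score -= 0.5;            // known shim such as tini or gosu
        if (f.ChildCount > 1) score -= 0.2;        // supervisor-like fan-out
        return score;
    }

    // Highest-scoring k candidates; ties are broken by host PID so the
    // ordering is deterministic.
    public static IReadOnlyList<(CandidateFeatures Features, double Score)> TopK(
        IEnumerable<CandidateFeatures> candidates, double windowSeconds, int k) =>
        candidates.Select(c => (Features: c, Score: Score(c, windowSeconds)))
                  .OrderByDescending(x => x.Score)
                  .ThenBy(x => x.Features.HostPid)
                  .Take(k)
                  .ToList();
}
```

With these weights, a wrapper that lives for the whole window still ranks below a listening process with no further execs, which is the behaviour the feature table describes.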

3) Integration with static pipeline

  1. For each steady-state candidate, snapshot the command/argv and normalise via ResolvedCommand (as in static mode).
  2. Run wrapper reduction and ShellFlow analysis if the candidate is a script.
  3. Invoke runtime detectors to classify the binary.
  4. Merge dynamic evidence with static evidence. Conflicts drop confidence or trigger the “supervisor” classification.
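
The merge in step 4 could be sketched as below. The document only states that conflicts lower confidence or trigger the "supervisor" classification. The specific rule here (keep the observed process, halve its confidence on disagreement) and the `Guess`, `MergedGuess`, and `EvidenceMerger` names are assumptions.

```csharp
using System;

// Hypothetical shapes for one terminal guess and the merged outcome.
public sealed record Guess(string Exe, double Confidence);
public sealed record MergedGuess(string Exe, double Confidence, bool Conflict);

public static class EvidenceMerger
{
    // Assumed policy: when static and dynamic analysis agree, keep the
    // higher confidence; when they disagree, keep the dynamically observed
    // executable but scale its confidence down by conflictPenalty.
    public static MergedGuess Merge(Guess staticGuess, Guess dynamicGuess, double conflictPenalty = 0.5)
    {
        if (string.Equals(staticGuess.Exe, dynamicGuess.Exe, StringComparison.Ordinal))
        {
            var confidence = Math.Max(staticGuess.Confidence, dynamicGuess.Confidence);
            return new MergedGuess(dynamicGuess.Exe, confidence, Conflict: false);
        }

        return new MergedGuess(dynamicGuess.Exe, dynamicGuess.Confidence * conflictPenalty, Conflict: true);
    }
}
```

The `Conflict` flag lets a caller decide separately whether a disagreement should escalate to the supervisor handling described in the next section.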

4) Supervisors & multi-service containers

Some images (e.g., those built around supervisord, s6, or runit) intentionally start multiple long-lived processes. Handle them as follows:

  • Detect supervisor binaries from the wrapper catalogue.
  • Analyse their configuration (/etc/supervisord.conf, /etc/services.d/*, etc.) to enumerate child services statically.
  • Emit multiple TerminalProcess entries with individual confidence scores but mark the parent as type = supervisor.
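
For the supervisord case, enumerating child services statically could start from the `[program:NAME]` sections of the configuration file. The sketch below is a deliberately small INI reader that collects each section's `command` value. It ignores `[include]` files, line continuations, and inline comments, and the `SupervisordConfig` class is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public static class SupervisordConfig
{
    // Returns program name -> command for every [program:NAME] section
    // found in the given supervisord.conf text.
    public static IReadOnlyDictionary<string, string> ProgramCommands(string configText)
    {
        var result = new Dictionary<string, string>(StringComparer.Ordinal);
        string? current = null; // name of the program section being read, if any

        foreach (var raw in configText.Split('\n'))
        {
            var line = raw.Trim();
            if (line.Length == 0 || line[0] == ';' || line[0] == '#')
                continue; // blank line or comment

            if (line[0] == '[' && line[^1] == ']')
            {
                var section = line.Substring(1, line.Length - 2);
                current = section.StartsWith("program:", StringComparison.Ordinal)
                    ? section.Substring("program:".Length)
                    : null;
                continue;
            }

            if (current is null)
                continue; // key belongs to a non-program section

            var eq = line.IndexOf('=');
            if (eq > 0 && line.Substring(0, eq).Trim() == "command")
                result[current] = line.Substring(eq + 1).Trim();
        }

        return result;
    }
}
```

Each returned command can then be normalised and classified like any other candidate, yielding one terminal entry per program.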

5) Operational hints

  • Sampling window: 1-3 seconds after start is usually sufficient; extend in debug mode.
  • Overhead: prefer eBPF/tracepoints; fall back to periodic /proc walks when instrumentation isn't available.
  • Security: honour namespace boundaries; never inspect processes outside the target container's cgroup/namespace.
  • Failure mode: if dynamic capture fails, fall back to static mode and flag evidence accordingly ("Dynamic capture unavailable").
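
The failure-mode rule can be sketched as a small wrapper around the two analysis paths. The delegates stand in for the real dynamic and static analysers, and `CaptureOutcome` and `CaptureFallback` are hypothetical names. Only the evidence string is taken from the text above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical result: which mode produced the evidence, plus the evidence itself.
public sealed record CaptureOutcome(string Mode, IReadOnlyList<string> Evidence);

public static class CaptureFallback
{
    // Tries dynamic capture first. If it throws, runs the static analysis
    // instead and appends a marker so consumers can see that no live
    // signals were available.
    public static CaptureOutcome Run(
        Func<IReadOnlyList<string>> dynamicCapture,
        Func<IReadOnlyList<string>> staticAnalysis)
    {
        try
        {
            return new CaptureOutcome("dynamic", dynamicCapture());
        }
        catch (Exception)
        {
            var evidence = new List<string>(staticAnalysis()) { "Dynamic capture unavailable" };
            return new CaptureOutcome("static", evidence);
        }
    }
}
```

Catching every exception keeps the sketch short. A production version would more likely catch only the failures it expects, such as missing eBPF support or insufficient privileges.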

6) Deliverables

The dynamic reducer returns an EntryTraceResult populated with:

  • ExecGraph containing nodes and edges for audit/debug.
  • Terminals listing steady-state processes (possibly multiple).
  • Evidence strings referencing dynamic signals ("pid 47 listening on 0.0.0.0:8080", "wrapper tini collapsed into /usr/local/bin/python").
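
One possible shape for this result is sketched below. The type names `EntryTraceResult`, `ExecGraph`, and `TerminalProcess` and the members `Terminals` and `Evidence` come from this document. The exact property types, and the `Confidence` and `Type` fields on `TerminalProcess`, are assumptions.

```csharp
using System.Collections.Generic;

public sealed record ExecNode(int HostPid, int NsPid, int Ppid, string Exe, string[] Argv, long StartTicks);
public sealed record ExecEdge(int ParentHostPid, int ChildHostPid, string Kind);

// Assumed container for the audit/debug graph.
public sealed record ExecGraph(IReadOnlyList<ExecNode> Nodes, IReadOnlyList<ExecEdge> Edges);

// Assumed fields: Type could carry values such as "supervisor".
public sealed record TerminalProcess(string Exe, string[] Argv, double Confidence, string Type);

public sealed record EntryTraceResult(
    ExecGraph ExecGraph,
    IReadOnlyList<TerminalProcess> Terminals,
    IReadOnlyList<string> Evidence);
```

Keeping `Terminals` as a list rather than a single value accommodates the multi-service containers covered in section 4.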

Downstream modules (Policy, Vuln Explorer, Export Center) treat the result identically to static scans, enabling easy comparison between build-time and runtime observations.