master 7b5bdcf4d3 feat(docs): Add comprehensive documentation for Vexer, Vulnerability Explorer, and Zastava modules

- Introduced AGENTS.md, README.md, TASKS.md, and implementation_plan.md for Vexer, detailing mission, responsibilities, key components, and operational notes.
- Established similar documentation structure for Vulnerability Explorer and Zastava modules, including their respective workflows, integrations, and observability notes.
- Created risk scoring profiles documentation outlining the core workflow, factor model, governance, and deliverables.
- Ensured all modules adhere to the Aggregation-Only Contract and maintain determinism and provenance in outputs.

2025-10-30 00:09:39 +02:00

ShellFlow — Script Reduction Playbook

Most container entry points eventually execute a shell script. The ShellFlow analyser resolves these scripts without executing user code, providing deterministic, explainable reductions.

1) Scope

POSIX sh subset with common Bash extensions (control flow, functions, parameter expansion).
Handles idioms from official Docker images (if [ "$1" = "server" ]; then …, exec gosu "$USER" "$@", set -- java -jar …).
Tracks positional parameters ($@, $1..$9), environment variables, and set -- mutations.
Produces one or more candidate commands with supporting evidence.

2) Architecture

ShellFlow/
  Parser/           // POSIX sh lexer + recursive descent parser
  Ast/              // nodes for lists, pipelines, conditionals, functions
  Evaluator/        // partial evaluation & taint tracking
  Idioms/           // pattern library for common Docker entrypoints
  Planner/          // emits CommandPlan[]

2.1 CommandPlan

public sealed record CommandPlan(
  string[] Argv,
  double   HeuristicScore,
  IReadOnlyList<string> Evidence,
  IReadOnlyList<ReductionEdge> Chain,
  bool     IsFallback = false
);

Plans feed directly into the static reducer, which selects the highest-confidence plan but keeps alternates as evidence.

3) Parsing & AST

Tokenise words, assignments, pipelines (|), lists (;, &&, ||), conditionals (if, case), loops (for, while, until), functions, and redirections.
Preserve heredocs and subshells as opaque nodes (evaluated conservatively).
Record source spans to surface meaningful evidence ("line 12: exec java -jar $APP_JAR").

4) Partial evaluation

Initialise symbol table from image environment plus caller-supplied args.
Treat $@, $*, $1..$9 as tainted; propagate taint through assignments.
Resolve ${VAR:-default} and ${VAR:+alt} when VAR is known; otherwise branch on both outcomes.
Support set -- … (resets positional parameters) and shift.
source and . commands are parsed recursively when the referenced files are available; otherwise the evaluator falls back to a low-confidence branch.
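
The expansions and mutations listed above have simple runtime behaviour, which is what the evaluator has to reproduce statically. The sketch below is illustrative only and is not ShellFlow code: the function names, `APP_PORT`, `DEBUG`, and `app.jar` are hypothetical. Each pair of calls corresponds to the two outcomes an analyser would need to consider when the variable's value is unknown.

```shell
#!/bin/sh
# Illustrative only: runtime behaviour that a static evaluator has to model.
# All names here (expand_default, APP_PORT, DEBUG, app.jar) are hypothetical.

# ${VAR:-default}: the default is used when VAR is unset or empty.
expand_default() {
  APP_PORT="$1"
  printf '%s\n' "${APP_PORT:-8080}"
}

# ${VAR:+alt}: alt is used only when VAR is set and non-empty.
expand_alt() {
  DEBUG="$1"
  printf '%s\n' "${DEBUG:+--verbose}"
}

# set -- replaces all positional parameters; shift drops the first one.
rewrite_args() {
  set -- java -jar app.jar "$@"
  shift
  printf '%s\n' "$*"
}

expand_default ""       # prints 8080
expand_default 9090     # prints 9090
expand_alt ""           # prints an empty line
expand_alt 1            # prints --verbose
rewrite_args --port 1   # prints -jar app.jar --port 1
```

When `APP_PORT` or `DEBUG` cannot be resolved from the image environment, both results of each pair remain possible, which is the situation the "otherwise branch" rule covers.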

5) Exec sink detection

exec <cmd> dominates the remainder of the script.
Chains such as exec gosu "$USER" "$@" feed into wrapper collapse.
When no exec is present, pick the last reachable simple command in the main path.
Multi-branch scripts yield multiple plans with adjusted scores; unresolved branches are marked IsFallback.
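
The two sink rules mirror how a shell behaves at run time. The following sketch is not part of ShellFlow; it uses `echo` as a stand-in for real commands, and the command strings are hypothetical. It shows that nothing after `exec` runs, and that without `exec` the last command in the main path is the natural candidate.

```shell
#!/bin/sh
# Illustrative only: echo stands in for real commands; the strings are hypothetical.

# exec replaces the shell process, so the trailing echo never runs.
with_exec() {
  sh -c 'exec echo "java -jar app.jar"; echo "unreachable"'
}

# Without exec every command runs; the last one is the candidate entry point.
without_exec() {
  sh -c 'echo "cp defaults.conf app.conf"; echo "app-server --foreground"'
}

with_exec                  # prints java -jar app.jar
without_exec | tail -n 1   # prints app-server --foreground
```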

6) Idiom library

| Pattern | Action |
|---|---|
| `if [ "${1:0:1}" = '-' ]; then set -- server "$@"; fi` | Rewrite argv to prepend default command. |
| `if [ "$1" = "bash" ]; then exec "$@"; fi` | Pass-through for manual shells. |
| `exec "$@"` + non-empty CMD | Substitute CMD vector into plan. |
| `exec java -jar "$APP_JAR" "$@"` | Resolve JAR via env or filesystem. |
| `set -- gosu "$APP_USER" "$@"` | Collapse into wrapper plan. |

Idioms are implemented as AST visitors; each adds evidence strings when triggered.
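
To make the patterns concrete, here is a hypothetical entrypoint that combines three of them. It is wrapped in a function and prints the final argv instead of calling `exec`, so it can be tried safely. The `server` default and the `APP_USER` value are assumptions for illustration, and the leading-dash check uses the POSIX `case` form rather than the Bash substring test so that it runs under plain `sh`.

```shell
#!/bin/sh
# Hypothetical entrypoint for illustration; prints the argv it would exec.
entrypoint() {
  # Idiom: a leading flag means the default command was omitted.
  # POSIX equivalent of the Bash test [ "${1:0:1}" = '-' ].
  case "${1:-}" in
    -*) set -- server "$@" ;;
  esac

  # Idiom: pass-through for manual shells.
  if [ "${1:-}" = "bash" ]; then
    printf 'exec %s\n' "$*"
    return 0
  fi

  # Idiom: wrapper prefix added through set --.
  set -- gosu "$APP_USER" "$@"
  printf 'exec %s\n' "$*"
}

APP_USER=app
entrypoint --port 8080   # prints exec gosu app server --port 8080
entrypoint bash          # prints exec bash
entrypoint worker        # prints exec gosu app worker
```

Each printed line corresponds to one path through the script, which is the kind of result a plan with its evidence strings is meant to explain.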

7) Confidence scoring

Base score from plan heuristics (HeuristicScore).
Penalties for unresolved taint ($@ unknown), missing files, nested subshells, or fallbacks.
Bonuses when idioms match, artefacts exist, or env values resolve cleanly.
Final confidence is combined with the outer static scoring model.

8) Failure modes

Missing script (ENTRYPOINT points to deleted file): emit fallback plan with low confidence.
Self-modifying scripts or heavy dynamic features (eval, backticks): mark plan as low-confidence and surface warning evidence.
Commands that spawn supervisors without exec: return both the supervisor and inferred children (if configuration files are present).
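
A small example shows why dynamic features resist static resolution. In this hypothetical fragment (the variable names and command strings are invented for illustration), `eval` builds a variable name from an argument, so the command that results is only known once the argument is.

```shell
#!/bin/sh
# Hypothetical fragment: the variable holding the command is chosen at run time.
WEB_CMD="java -jar web.jar"
WORKER_CMD="java -jar worker.jar"

select_command() {
  # eval assembles the variable name from $1, so the result depends on runtime input.
  eval "printf '%s\n' \"\$${1}_CMD\""
}

select_command WEB      # prints java -jar web.jar
select_command WORKER   # prints java -jar worker.jar
```

If the argument is tainted, no single command can be named with confidence, which is why such scripts are reported with warning evidence rather than a firm plan.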

ShellFlow keeps the static reducer explainable: every inferred command is accompanied by the script span and reasoning used to reach it.
