ShellFlow — Script Reduction Playbook
Most container entry points eventually execute a shell script. The ShellFlow analyser resolves these scripts without executing user code, providing deterministic, explainable reductions.
1) Scope
- POSIX `sh` subset with common Bash extensions (control flow, functions, parameter expansion).
- Handles idioms from official Docker images (`if [ "$1" = "server" ]; then …`, `exec gosu "$USER" "$@"`, `set -- java -jar …`).
- Tracks positional parameters (`$@`, `$1..$9`), environment variables, and `set --` mutations.
- Produces one or more candidate commands with supporting evidence.
2) Architecture
ShellFlow/
  Parser/           // POSIX sh lexer + recursive descent parser
  Ast/              // nodes for lists, pipelines, conditionals, functions
  Evaluator/        // partial evaluation & taint tracking
  Idioms/           // pattern library for common Docker entrypoints
  Planner/          // emits CommandPlan[]
2.1 CommandPlan
public sealed record CommandPlan(
  string[] Argv,
  double   HeuristicScore,
  IReadOnlyList<string> Evidence,
  IReadOnlyList<ReductionEdge> Chain,
  bool     IsFallback = false
);
Plans feed directly into the static reducer, which selects the highest-confidence plan but keeps alternates as evidence.
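The selection step can be sketched as below. This is a hypothetical Python illustration, not the reducer's actual code: the `select_plan` helper and the Python mirror of `CommandPlan` (with `Chain` omitted) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CommandPlan:
    """Python mirror of the CommandPlan record (Chain omitted for brevity)."""
    argv: Tuple[str, ...]
    heuristic_score: float
    evidence: Tuple[str, ...] = ()
    is_fallback: bool = False


def select_plan(plans: List[CommandPlan]) -> Tuple[CommandPlan, List[CommandPlan]]:
    """Return (winner, alternates), ordered by descending score.

    sorted() is stable, so plans with equal scores keep their input order,
    which keeps the selection deterministic.
    """
    if not plans:
        raise ValueError("at least one plan is required")
    ranked = sorted(plans, key=lambda plan: -plan.heuristic_score)
    return ranked[0], ranked[1:]


plans = [
    CommandPlan(("/bin/sh",), 0.30, ("line 3: unresolved branch",), is_fallback=True),
    CommandPlan(("java", "-jar", "/app/app.jar"), 0.85,
                ("line 12: exec java -jar $APP_JAR",)),
    CommandPlan(("bash",), 0.40, ('line 7: exec "$@"',)),
]
winner, alternates = select_plan(plans)
print(winner.argv)                      # the selected command
print([plan.argv for plan in alternates])  # retained as evidence
```

The alternates are returned rather than discarded so that a caller can attach them to the final result as supporting evidence.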
3) Parsing & AST
- Tokenise words, assignments, pipelines (`|`), lists (`;`, `&&`, `||`), conditionals (`if`, `case`), loops (`for`, `while`, `until`), functions, and redirections.
- Preserve heredocs and subshells as opaque nodes (evaluated conservatively).
- Record source spans to surface meaningful evidence ("line 12: exec java -jar $APP_JAR").
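A much-reduced illustration of tokenising with source spans follows. It is a line-oriented sketch built on Python's standard `shlex` module (Python 3.8 or later), not the recursive descent parser described above; the `Token`, `tokenise`, and `exec_evidence` names are hypothetical, and the sketch assumes no quoted string or heredoc spans several lines.

```python
import shlex
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Token:
    text: str
    line: int  # 1-based source line, used to build evidence strings


def tokenise(script: str) -> List[Token]:
    """Split each line into words and operators, remembering the line number."""
    tokens: List[Token] = []
    for lineno, line in enumerate(script.splitlines(), start=1):
        # punctuation_chars=True makes ;, &&, ||, | and friends separate tokens.
        lexer = shlex.shlex(line, posix=True, punctuation_chars=True)
        lexer.whitespace_split = True  # keep words such as $APP_JAR intact
        tokens.extend(Token(text, lineno) for text in lexer)
    return tokens


def exec_evidence(script: str) -> List[str]:
    """Build 'line N: <source text>' evidence for every exec token."""
    lines = script.splitlines()
    return [
        f"line {token.line}: {lines[token.line - 1].strip()}"
        for token in tokenise(script)
        if token.text == "exec"
    ]


script = 'set -e\nif [ "$1" = "server" ]; then\n  exec java -jar $APP_JAR\nfi\n'
print(exec_evidence(script))
```

Evidence is taken from the raw source line rather than rebuilt from tokens, because `shlex` in POSIX mode strips quotes and the reader should see the script as written.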
4) Partial evaluation
- Initialise symbol table from image environment plus caller-supplied args.
- Treat `$@`, `$*`, `$1..$9` as tainted; propagate taint through assignments.
- Resolve `${VAR:-default}` and `${VAR:+alt}` when `VAR` is known; otherwise branch.
- Support `set -- …` (resets positional parameters) and `shift`.
- `source` / `.` commands are parsed recursively when the files are available; otherwise the evaluator falls back to a low-confidence branch.
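The expansion and positional-parameter rules above can be sketched as follows. This is a minimal Python illustration under stated assumptions: `expand`, `set_positional`, and `shift` are hypothetical helpers, a variable counts as "known" only when it appears in the `known` mapping, and an unresolved value is represented by the placeholder `$VAR`.

```python
import re
from typing import Dict, List

# Matches only the two forms discussed above: ${VAR:-word} and ${VAR:+word}.
_EXPANSION = re.compile(r"^\$\{(\w+):([-+])([^}]*)\}$")


def expand(word: str, known: Dict[str, str]) -> List[str]:
    """Return the possible values of one expansion word.

    One value means the expansion resolved; two values mean the evaluator
    has to branch because the variable is not in the symbol table.
    """
    match = _EXPANSION.match(word)
    if match is None:
        return [word]  # not an expansion this sketch understands
    name, op, operand = match.groups()
    if name not in known:
        unresolved = f"${name}"  # placeholder for the unknown runtime value
        return [operand, unresolved] if op == "-" else [operand, ""]
    value = known[name]
    if op == "-":
        return [value if value else operand]  # default applies when unset or empty
    return [operand if value else ""]         # alternative applies when non-empty


def set_positional(args: List[str]) -> List[str]:
    """Model of `set -- args...`: the old positional parameters are discarded."""
    return list(args)


def shift(positional: List[str], count: int = 1) -> List[str]:
    """Model of `shift [n]`: drop the first n positional parameters."""
    return positional[count:]


print(expand("${APP_JAR:-/app/app.jar}", {"APP_JAR": "/opt/x.jar"}))
print(expand("${DEBUG:+--verbose}", {}))
print(shift(set_positional(["java", "-jar", "app.jar"])))
```

Returning a list of candidate values keeps branching explicit: the caller can fork one plan per value and penalise the branches that rest on an unresolved variable.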
5) Exec sink detection
- `exec <cmd>` dominates the remainder of the script.
- Chains such as `exec gosu "$USER" "$@"` feed into wrapper collapse.
- When no `exec` is present, pick the last reachable simple command in the main path.
- Multi-branch scripts yield multiple plans with adjusted scores; unresolved branches are marked IsFallback.
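A simplified view of sink selection on a single, already linearised path is sketched below. The `find_sink` helper and its evidence wording are hypothetical; the real analyser works on the AST and on several branches, which this sketch does not attempt.

```python
from typing import List, Optional, Tuple


def find_sink(commands: List[List[str]]) -> Optional[Tuple[List[str], str]]:
    """Pick the command a script ends up running on one linear path.

    `commands` holds the argv of each reachable simple command, in order.
    Returns (argv, evidence), or None when the path is empty.
    """
    reachable = [argv for argv in commands if argv]
    for index, argv in enumerate(reachable, start=1):
        # `exec` with no command only applies redirections to the current
        # shell, so it is a sink only when a command word follows it.
        if argv[0] == "exec" and len(argv) > 1:
            return argv[1:], f"command {index}: exec dominates the rest of the script"
    if reachable:
        return reachable[-1], f"command {len(reachable)}: last reachable simple command"
    return None


with_exec = [["set", "-e"], ["exec", "gosu", "app", "java"], ["echo", "unreachable"]]
without_exec = [["set", "-e"], ["nginx", "-g", "daemon off;"]]
print(find_sink(with_exec))
print(find_sink(without_exec))
```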
6) Idiom library
| Pattern | Action | 
|---|---|
| `if [ "${1:0:1}" = '-' ]; then set -- server "$@"; fi` | Rewrite argv to prepend default command. |
| `if [ "$1" = "bash" ]; then exec "$@"; fi` | Pass-through for manual shells. |
| `exec "$@"` + non-empty `CMD` | Substitute `CMD` vector into plan. |
| `exec java -jar "$APP_JAR" "$@"` | Resolve JAR via env or filesystem. |
| `set -- gosu "$APP_USER" "$@"` | Collapse into wrapper plan. |
Idioms are implemented as AST visitors; each adds evidence strings when triggered.
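Two of the idioms in the table can be modelled at the argv level as shown below. This is an assumption-laden simplification: the real idioms are AST visitors, whereas `prepend_default` and `expand_exec` are hypothetical functions that operate on the positional parameters once the pattern has already been recognised.

```python
from typing import List, Tuple


def prepend_default(positional: List[str], default_cmd: str) -> Tuple[List[str], List[str]]:
    """Model of: if [ "${1:0:1}" = '-' ]; then set -- <default_cmd> "$@"; fi"""
    if positional and positional[0].startswith("-"):
        evidence = [f"idiom: leading-dash arguments, prepended '{default_cmd}'"]
        return [default_cmd, *positional], evidence
    return list(positional), []  # idiom did not trigger, no evidence added


def expand_exec(exec_words: List[str], positional: List[str]) -> Tuple[List[str], List[str]]:
    """Model of `exec ... "$@"`: splice the current positional parameters
    (initially the image CMD) in place of the "$@" word."""
    argv: List[str] = []
    for word in exec_words:
        if word == "$@":
            argv.extend(positional)
        else:
            argv.append(word)
    evidence = ['idiom: exec "$@" resolved from positional parameters'] if "$@" in exec_words else []
    return argv, evidence


image_cmd = ["--port", "8080"]  # hypothetical CMD from the image config
positional, evidence = prepend_default(image_cmd, "server")
argv, more_evidence = expand_exec(["gosu", "app", "$@"], positional)
print(argv)
print(evidence + more_evidence)
```

Each function returns its evidence alongside the rewritten argv, mirroring the rule that an idiom adds evidence strings only when it triggers.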
7) Confidence scoring
- Base score from plan heuristics (HeuristicScore).
- Penalties for unresolved taint (`$@` unknown), missing files, nested subshells, or fallbacks.
- Bonus when idioms match, artefacts exist, or env values resolve cleanly.
- Final confidence is combined with the outer static scoring model.
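The adjustment step might look like the sketch below. Every weight is an invented placeholder, and the `Signals` record and `confidence` function are hypothetical; the document does not specify the actual values or how the result is combined with the outer model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    """Observations gathered while reducing one plan (hypothetical shape)."""
    unresolved_taint: bool = False
    missing_files: int = 0
    nested_subshells: int = 0
    is_fallback: bool = False
    idiom_matches: int = 0
    artefacts_exist: bool = False
    env_resolved: bool = False


# Placeholder weights, chosen only to make the example concrete.
PENALTY_TAINT = 0.20
PENALTY_MISSING_FILE = 0.15
PENALTY_SUBSHELL = 0.05
PENALTY_FALLBACK = 0.30
BONUS_IDIOM = 0.10
BONUS_ARTEFACT = 0.10
BONUS_ENV = 0.05


def confidence(base: float, signals: Signals) -> float:
    """Apply penalties and bonuses to the base score, clamped to [0, 1]."""
    penalties = (
        PENALTY_TAINT * signals.unresolved_taint
        + PENALTY_MISSING_FILE * signals.missing_files
        + PENALTY_SUBSHELL * signals.nested_subshells
        + PENALTY_FALLBACK * signals.is_fallback
    )
    bonuses = (
        BONUS_IDIOM * signals.idiom_matches
        + BONUS_ARTEFACT * signals.artefacts_exist
        + BONUS_ENV * signals.env_resolved
    )
    return max(0.0, min(1.0, base - penalties + bonuses))


clean = confidence(0.60, Signals(idiom_matches=1, artefacts_exist=True, env_resolved=True))
shaky = confidence(0.10, Signals(unresolved_taint=True, is_fallback=True))
print(round(clean, 2), round(shaky, 2))
```

Clamping keeps the result in a fixed range, so a pile-up of penalties cannot produce a negative score for the outer model to interpret.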
8) Failure modes
- Missing script (`ENTRYPOINT` points to a deleted file): emit fallback plan with low confidence.
- Self-modifying scripts or heavy dynamic features (eval, backticks): mark plan as low-confidence and surface warning evidence.
- Commands that spawn supervisors without exec: return both the supervisor and inferred children (if configuration files are present).
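The first two failure modes can be illustrated with a deliberately crude text scan. This is only a sketch: `warnings_for` is a hypothetical helper, and matching raw text will also flag `eval` or backticks that appear inside quotes or comments, which an AST-based check would avoid.

```python
import re
from typing import List, Optional

# Dynamic features named above; regexes are a rough stand-in for AST checks.
_DYNAMIC = {
    "eval": re.compile(r"\beval\b"),
    "backticks": re.compile(r"`"),
}


def warnings_for(script: Optional[str]) -> List[str]:
    """Return warning evidence for one entrypoint script.

    `script` is None when ENTRYPOINT points to a file that no longer exists.
    """
    if script is None:
        return ["entrypoint script missing: emitting fallback plan"]
    found: List[str] = []
    for lineno, line in enumerate(script.splitlines(), start=1):
        for name, pattern in _DYNAMIC.items():
            if pattern.search(line):
                found.append(f"line {lineno}: dynamic feature ({name}) lowers confidence")
    return found


print(warnings_for(None))
print(warnings_for('set -e\neval "$CMD"\nexec `which java`\n'))
```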
ShellFlow keeps the static reducer explainable: every inferred command is accompanied by the script span and reasoning used to reach it.