feat(docs): Add comprehensive documentation for Vexer, Vulnerability Explorer, and Zastava modules
- Introduced AGENTS.md, README.md, TASKS.md, and implementation_plan.md for Vexer, detailing mission, responsibilities, key components, and operational notes.
- Established similar documentation structure for Vulnerability Explorer and Zastava modules, including their respective workflows, integrations, and observability notes.
- Created risk scoring profiles documentation outlining the core workflow, factor model, governance, and deliverables.
- Ensured all modules adhere to the Aggregation-Only Contract and maintain determinism and provenance in outputs.
docs/modules/scanner/operations/entrypoint-shell-analysis.md
@@ -0,0 +1,83 @@
# ShellFlow: Script Reduction Playbook

Most container entry points eventually execute a shell script. The ShellFlow analyser resolves these scripts without executing user code, providing deterministic, explainable reductions.

## 1) Scope

- POSIX `sh` subset with common Bash extensions (control flow, functions, parameter expansion).
- Handles idioms from official Docker images (`if [ "$1" = "server" ]; then …`, `exec gosu "$USER" "$@"`, `set -- java -jar …`).
- Tracks positional parameters (`$@`, `$1..$9`), environment variables, and `set --` mutations.
- Produces one or more candidate commands with supporting evidence.
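As a rough illustration of the positional-parameter tracking listed above, here is a minimal Python sketch (Python is used only for brevity; it is not the module's implementation language). The names `Word` and `Positional` are hypothetical, and the model assumes that a `"$@"` token in a `set --` template splices in the current parameters while every other token is a literal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    tainted: bool  # True when the value came from caller-supplied arguments


class Positional:
    """Hypothetical model of $1..$n with per-word taint."""

    def __init__(self, caller_args):
        # Everything supplied by the caller starts out tainted.
        self.words = [Word(arg, True) for arg in caller_args]

    def set(self, template):
        """Model `set -- <template>`; the token '"$@"' splices in current parameters."""
        new_words = []
        for token in template:
            if token == '"$@"':
                new_words.extend(self.words)  # keeps the original taint flags
            else:
                new_words.append(Word(token, False))  # script literal, untainted
        self.words = new_words

    def shift(self, n=1):
        """Model `shift n`; shifting past the end is treated as an error."""
        if n > len(self.words):
            raise ValueError("shift count exceeds number of positional parameters")
        self.words = self.words[n:]

    def argv(self):
        return [word.text for word in self.words]
```

For example, starting from caller arguments `--port 8080`, modelling `set -- server "$@"` yields `server --port 8080`, with only the last two words marked as tainted.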

## 2) Architecture

```
ShellFlow/
  Parser/           // POSIX sh lexer + recursive descent parser
  Ast/              // nodes for lists, pipelines, conditionals, functions
  Evaluator/        // partial evaluation & taint tracking
  Idioms/           // pattern library for common Docker entrypoints
  Planner/          // emits CommandPlan[]
```

### 2.1 CommandPlan

```csharp
public sealed record CommandPlan(
  string[] Argv,
  double   HeuristicScore,
  IReadOnlyList<string> Evidence,
  IReadOnlyList<ReductionEdge> Chain,
  bool     IsFallback = false
);
```

Plans feed directly into the static reducer, which selects the highest-confidence plan but keeps alternates as evidence.
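To make the selection step concrete, the following Python sketch mirrors the record above in simplified form and picks the highest-scoring plan while retaining the rest. It is an illustration under assumptions: the `Chain` field is omitted because `ReductionEdge` is not defined here, `select_plan` is a hypothetical name, and ties are broken by input order so the result stays deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandPlan:
    argv: tuple
    heuristic_score: float
    evidence: tuple = ()
    is_fallback: bool = False


def select_plan(plans):
    """Return (best, alternates), ordered by descending heuristic score.

    sorted() is stable, so plans with equal scores keep their input order.
    """
    if not plans:
        raise ValueError("at least one plan is required")
    ranked = sorted(plans, key=lambda plan: -plan.heuristic_score)
    return ranked[0], ranked[1:]
```

A caller would use `best` as the reduced command and attach `alternates` to the evidence trail rather than discarding them.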

## 3) Parsing & AST

- Tokenise words, assignments, pipelines (`|`), lists (`;`, `&&`, `||`), conditionals (`if`, `case`), loops (`for`, `while`, `until`), functions, and redirections.
- Preserve heredocs and subshells as opaque nodes (evaluated conservatively).
- Record source spans to surface meaningful evidence (`"line 12: exec java -jar $APP_JAR"`).
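The tokenising and span-recording steps can be sketched with a deliberately small Python scanner. This is not the ShellFlow lexer: it assumes quotes never span lines, ignores comments, heredocs, and redirection operators, and the names `TOKEN` and `tokenise` are hypothetical. It only shows how each token can carry the line number later used in evidence strings.

```python
import re

# Operators are tried before words so that "&&" and "||" are not split.
TOKEN = re.compile(
    r"""
    (?P<op>\|\||&&|;|\||&)                               # list, pipeline, background operators
  | (?P<word>(?:'[^']*'|"(?:\\.|[^"\\])*"|[^\s;|&])+)    # words, including quoted parts
    """,
    re.VERBOSE,
)


def tokenise(script):
    """Return (line_number, kind, text) triples; kind is 'op' or 'word'."""
    tokens = []
    for line_no, line in enumerate(script.splitlines(), start=1):
        for match in TOKEN.finditer(line):
            tokens.append((line_no, match.lastgroup, match.group()))
    return tokens
```

A real parser would build AST nodes from these tokens; the point here is only that the line number travels with every token.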

## 4) Partial evaluation

- Initialise symbol table from image environment plus caller-supplied args.
- Treat `$@`, `$*`, `$1..$9` as tainted; propagate taint through assignments.
- Resolve `${VAR:-default}` and `${VAR:+alt}` when `VAR` is known; otherwise branch.
- Support `set -- …` (resets positional parameters) and `shift`.
- `source`/`.` commands are parsed recursively when files are available; otherwise fall back to a low-confidence branch.
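The resolve-or-branch rule for `${VAR:-default}` and `${VAR:+alt}` might look like the Python sketch below. The representation is an assumption made for illustration: a key missing from `env` means the variable is unknown to the analyser, a value of `None` means it is known to be unset, and a `None` entry in the result stands for an unresolved runtime value. The name `expand` is hypothetical.

```python
import re

PARAM = re.compile(
    r"^\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*):(?P<op>[-+])(?P<word>[^}]*)\}$"
)


def expand(expr, env):
    """Return the list of candidate values for a ${VAR:-word} or ${VAR:+word} expression."""
    match = PARAM.match(expr)
    if match is None:
        raise ValueError(f"unsupported expansion: {expr!r}")
    name, op, word = match["name"], match["op"], match["word"]

    if name not in env:
        # Unknown variable: emit both outcomes so the planner can branch.
        return [None, word] if op == "-" else [word, ""]

    value = env[name]
    set_and_non_empty = bool(value)  # the ':' forms treat empty like unset
    if op == "-":
        return [value] if set_and_non_empty else [word]
    return [word] if set_and_non_empty else [""]
```

With `APP_JAR` known, `${APP_JAR:-/app/app.jar}` resolves to a single candidate; with it unknown, two candidates come back and each would seed its own plan.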

## 5) Exec sink detection

- `exec <cmd>` dominates the remainder of the script.
- Chains such as `exec gosu "$USER" "$@"` feed into wrapper collapse.
- When no `exec` is present, pick the last reachable simple command in the main path.
- Multi-branch scripts yield multiple plans with adjusted scores; unresolved branches are marked `IsFallback`.
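For a single straight-line path, the first and third rules above reduce to a short scan. The Python sketch below assumes the parser has already produced the reachable simple commands in order as `(line_number, argv)` pairs with redirections stripped out; `find_sink` is a hypothetical name, and wrapper collapse and branching are out of scope.

```python
def find_sink(commands):
    """Return (argv, evidence) for the command that ends the main path, or None.

    commands: ordered list of (line_number, argv) for reachable simple commands.
    """
    for line_no, argv in commands:
        # A bare `exec` (used only for redirections) does not replace the process.
        if len(argv) > 1 and argv[0] == "exec":
            return argv[1:], f"line {line_no}: exec dominates the remainder of the script"
    if not commands:
        return None
    line_no, argv = commands[-1]
    return argv, f"line {line_no}: last reachable simple command (no exec found)"
```

Anything listed after the `exec` line is ignored, which matches the idea that the process image has already been replaced at that point.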

## 6) Idiom library

| Pattern | Action |
| --- | --- |
| `if [ "${1:0:1}" = '-' ]; then set -- server "$@"; fi` | Rewrite argv to prepend default command. |
| `if [ "$1" = "bash" ]; then exec "$@"; fi` | Pass-through for manual shells. |
| `exec "$@"` + non-empty CMD | Substitute CMD vector into plan. |
| `exec java -jar "$APP_JAR" "$@"` | Resolve JAR via env or filesystem. |
| `set -- gosu "$APP_USER" "$@"` | Collapse into wrapper plan. |

Idioms are implemented as AST visitors; each adds evidence strings when triggered.
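Two of the idioms in the table can be illustrated on plain argument lists. The Python sketch below works on lists rather than an AST, so it is not the visitor implementation; `prepend_default` and `reduce_entrypoint` are hypothetical names, and the evidence strings are invented for the example.

```python
def prepend_default(positional, default_command):
    """Model: if [ "${1:0:1}" = '-' ]; then set -- <default> "$@"; fi"""
    argv = list(positional)
    evidence = []
    # With no arguments "${1:0:1}" is empty, so the rewrite does not fire.
    if argv and argv[0].startswith("-"):
        argv = [default_command, *argv]
        evidence.append(f"first argument starts with '-': prepended {default_command!r}")
    return argv, evidence


def reduce_entrypoint(image_cmd, default_command):
    """Apply the dash idiom to the image CMD, then model a final `exec "$@"`."""
    argv, evidence = prepend_default(image_cmd, default_command)
    evidence.append('exec "$@": substituted positional parameters into plan')
    return argv, evidence
```

For an image whose CMD is `--port 8080` and whose script defaults to `server`, this yields `server --port 8080` plus two evidence strings; a CMD of `bash` passes through unchanged.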

## 7) Confidence scoring

- Base score from plan heuristics (`HeuristicScore`).
- Penalties for unresolved taint (`$@` unknown), missing files, nested subshells, or fallbacks.
- Bonus when idioms match, artefacts exist, or env values resolve cleanly.
- Final confidence is combined with the outer static scoring model.
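One way to picture the adjustment is as additive penalties and bonuses applied to the base score. In the Python sketch below the weights, the signal names, and the clamping to the range 0 to 1 are all assumptions chosen for illustration; they are not taken from the actual scoring model, and the combination with the outer static model is not shown.

```python
# Hypothetical weights; the real model may use different signals and values.
PENALTIES = {
    "unresolved_taint": 0.25,
    "missing_file": 0.20,
    "nested_subshell": 0.10,
    "fallback": 0.30,
}
BONUSES = {
    "idiom_match": 0.10,
    "artefact_exists": 0.10,
    "env_resolved": 0.05,
}


def confidence(base_score, signals):
    """Adjust the base heuristic score by the observed signals, clamped to [0, 1]."""
    score = base_score
    for signal in signals:
        score -= PENALTIES.get(signal, 0.0)
        score += BONUSES.get(signal, 0.0)
    return max(0.0, min(1.0, score))
```

Unknown signal names are ignored rather than rejected, which keeps the sketch short; a stricter version might raise on them instead.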

## 8) Failure modes

- Missing script (`ENTRYPOINT` points to a deleted file): emit fallback plan with low confidence.
- Self-modifying scripts or heavy dynamic features (`eval`, backticks): mark plan as low-confidence and surface warning evidence.
- Commands that spawn supervisors without `exec`: return both the supervisor and inferred children (if configuration files are present).

ShellFlow keeps the static reducer explainable: every inferred command is accompanied by the script span and reasoning used to reach it.