# StellaOps.Scanner.EntryTrace — Agent Charter

## Mission
Resolve container `ENTRYPOINT`/`CMD` chains into deterministic call graphs that fuel usage-aware SBOMs, policy explainability, and runtime drift detection. Implement the EntryTrace analyzers and expose them as restart-time plug-ins for the Scanner Worker.

## Scope
- Parse POSIX/Bourne shell constructs (exec, command, case, if, source/run-parts) with deterministic AST output.
- Walk layered root filesystems to resolve PATH lookups, interpreter hand-offs (Python/Node/Java), and record evidence.
- Surface explainable diagnostics for unresolved branches (env indirection, missing files, unsupported syntax) and emit metrics.
- Package analyzers as signed plug-ins under `plugins/scanner/entrytrace/`, guarded by restart-only policy.
- **Semantic analysis**: Classify entrypoints by application intent (ApiEndpoint, Worker, CronJob, etc.), capability class (NetworkListener, FileSystemAccess, etc.), and threat vectors.
- **Temporal tracking**: Track entrypoint evolution across image versions, detecting drift categories (intent changes, capability expansion, attack surface growth).
- **Mesh analysis**: Parse multi-container orchestration manifests (K8s, Docker Compose) to build cross-container reachability graphs and identify vulnerable paths.
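
The PATH-lookup bullet above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the module's C# API; `merge_layers`, `resolve_command`, and the evidence strings are hypothetical names. It assumes layers are given as ordered path-to-content maps (base first), honours OCI-style `.wh.` whiteout files, and ignores executable bits and symlinks.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."  # OCI whiteout marker for files deleted by an upper layer


def merge_layers(layers):
    """Flatten ordered layers (base first) into one path -> content view."""
    merged = {}
    for layer in layers:
        for path in sorted(layer):  # sorted for deterministic processing
            name = posixpath.basename(path)
            if name.startswith(WHITEOUT_PREFIX):
                # A whiteout hides the same-named file from lower layers.
                hidden = posixpath.join(
                    posixpath.dirname(path), name[len(WHITEOUT_PREFIX):]
                )
                merged.pop(hidden, None)
            else:
                merged[path] = layer[path]
    return merged


def resolve_command(command, path_var, rootfs):
    """Return (resolved_path, evidence); resolved_path is None when unresolved."""
    if "/" in command:
        # Commands containing a slash bypass PATH lookup.
        candidates = [posixpath.normpath(command)]
    else:
        # Simplification: empty PATH entries are skipped rather than
        # treated as the current directory.
        candidates = [posixpath.join(d, command) for d in path_var.split(":") if d]
    for candidate in candidates:
        if candidate in rootfs:
            return candidate, f"path-lookup:{candidate}"
    return None, "missing-file"
```

Recording the evidence string alongside the resolved path mirrors the charter's requirement that every resolution be explainable; an unresolved lookup returns a reason rather than raising.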

## Out of Scope
- SBOM emission/diffing (owned by `Scanner.Emit`/`Scanner.Diff`).
- Runtime enforcement or live drift reconciliation (owned by Zastava).
- Registry/network fetchers beyond file lookups inside extracted layers.

## Interfaces & Contracts

### Core EntryTrace
- Primary entry point: `IEntryTraceAnalyzer.ResolveAsync` returning a deterministic `EntryTraceGraph`.
- Graph nodes must include file path, line span, interpreter classification, and evidence source; use the `Scanner.Core` timestamp/ID helpers when emitting events.
- Diagnostics must draw unknown reasons from a fixed enum; metrics are tagged `entrytrace.*`.
- Plug-ins register via `IEntryTraceAnalyzerFactory` and must validate against `IPluginCatalogGuard`.

### Semantic Entrypoints (Sprint 0411)
Located in `Semantic/`:
- `SemanticEntrypoint`: Classifies entrypoints with intent, capabilities, threat vectors, and confidence scores.
- `ApplicationIntent`: Enum for high-level purpose (ApiEndpoint, Worker, CronJob, CliTool, etc.).
- `CapabilityClass`: Enum for functional capabilities (NetworkListener, FileSystemAccess, ProcessSpawner, etc.).
- `ThreatVector`: Enum for security-relevant classifications (NetworkExposure, FilePathTraversal, CommandInjection, etc.).
- `DataFlowBoundary`: Record for trust boundaries in data flow.
- `SemanticConfidence`: Confidence scores for classification results.

### Temporal Entrypoints (Sprint 0412)
Located in `Temporal/`:
- `TemporalEntrypointGraph`: Tracks entrypoints across image versions with snapshots and deltas.
- `EntrypointSnapshot`: Point-in-time entrypoint state with content hash for comparison.
- `EntrypointDelta`: Version-to-version changes (added/removed/modified entrypoints).
- `EntrypointDrift`: Flags enum for drift categories (IntentChanged, CapabilitiesExpanded, AttackSurfaceGrew, PrivilegeEscalation, PortsAdded, etc.).
- `ITemporalEntrypointStore`: Interface for storing and querying temporal graphs.
- `InMemoryTemporalEntrypointStore`: Reference implementation with delta computation.
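
As a rough illustration of delta computation and flag-style drift categories, the following Python sketch compares two snapshots. It is not the `Temporal/` implementation: `Entrypoint`, `Drift`, and `compute_delta` are hypothetical stand-ins, only three drift flags are modelled, and drift is derived from modified entrypoints only (an assumption).

```python
from dataclasses import dataclass
from enum import Flag, auto


class Drift(Flag):
    NONE = 0
    INTENT_CHANGED = auto()
    CAPABILITIES_EXPANDED = auto()
    PORTS_ADDED = auto()


@dataclass(frozen=True)
class Entrypoint:
    intent: str
    capabilities: frozenset
    ports: frozenset


def compute_delta(old, new):
    """Compare two snapshots (dicts of id -> Entrypoint)."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])

    drift = Drift.NONE
    for key in modified:
        before, after = old[key], new[key]
        if before.intent != after.intent:
            drift |= Drift.INTENT_CHANGED
        if after.capabilities - before.capabilities:
            drift |= Drift.CAPABILITIES_EXPANDED
        if after.ports - before.ports:
            drift |= Drift.PORTS_ADDED

    delta = {"added": added, "removed": removed, "modified": modified}
    return delta, drift
```

Sorting the added, removed, and modified lists keeps the delta stable across runs, in line with the determinism expectations elsewhere in this charter.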

### Mesh Entrypoints (Sprint 0412)
Located in `Mesh/`:
- `MeshEntrypointGraph`: Multi-container service mesh with services, edges, and ingress paths.
- `ServiceNode`: Container in the mesh with entrypoints, exposed ports, and labels.
- `CrossContainerEdge`: Inter-service communication link.
- `CrossContainerPath`: Reachability path across services with vulnerability tracking.
- `IngressPath`: External exposure via ingress/load balancer.
- `IManifestParser`: Interface for parsing orchestration manifests.
- `KubernetesManifestParser`: Parser for K8s Deployment, Service, Ingress, StatefulSet, DaemonSet, Pod.
- `DockerComposeParser`: Parser for Docker Compose v2/v3 files.
- `MeshEntrypointAnalyzer`: Orchestrator for mesh analysis with security metrics and blast radius analysis.
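
Cross-container reachability can be pictured as a graph search from ingress-exposed services to services flagged as vulnerable. The Python sketch below is illustrative only: `paths_to_vulnerable` is a hypothetical helper, services are plain strings, and breadth-first search is an assumed strategy, not necessarily what `MeshEntrypointAnalyzer` does.

```python
from collections import deque


def paths_to_vulnerable(edges, ingress, vulnerable):
    """Return one shortest path from each ingress service to each
    reachable vulnerable service, in deterministic order."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)

    results = []
    for start in sorted(ingress):
        parent = {start: None}  # also serves as the visited set
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in sorted(graph.get(node, ())):
                if nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)

        for target in sorted(v for v in vulnerable if v in parent):
            # Walk parent links back to the ingress service.
            path = []
            current = target
            while current is not None:
                path.append(current)
                current = parent[current]
            results.append(path[::-1])
    return results
```

A vulnerable service that no ingress can reach produces no path, which is the distinction the reachability graph is meant to surface.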

### Speculative Execution (Sprint 0413)
Located in `Speculative/`:
- `SymbolicValue`: Algebraic type for symbolic values (Concrete, Symbolic, Unknown, Composite).
- `SymbolicState`: Execution state with variable bindings, path constraints, and terminal commands.
- `PathConstraint`: Branch predicate constraint with kind classification and env dependency tracking.
- `ExecutionPath`: Complete execution path with constraints, commands, and reachability confidence.
- `ExecutionTree`: All paths from symbolic execution with branch coverage metrics.
- `BranchPoint`: Decision point in the script with coverage statistics.
- `BranchCoverage`: Coverage metrics (total, covered, infeasible, env-dependent branches).
- `ISymbolicExecutor`: Interface for symbolic execution of shell scripts.
- `ShellSymbolicExecutor`: Implementation that explores all if/elif/else and case branches.
- `IConstraintEvaluator`: Interface for path feasibility evaluation.
- `PatternConstraintEvaluator`: Pattern-based evaluator for common shell conditionals.
- `PathEnumerator`: Systematic path exploration with grouping by terminal command.
- `PathConfidenceScorer`: Confidence scoring with multi-factor analysis.
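
To make the idea of exploring every branch concrete, here is a small Python sketch that enumerates paths through a toy script representation. The tuple-based AST, the `enumerate_paths` name, and the string form of negated constraints are all assumptions for illustration; the real `ShellSymbolicExecutor` operates on the parser's AST and tracks symbolic state.

```python
def enumerate_paths(block, constraints=()):
    """Yield (constraints, terminal_command) for every path through a block.

    A block is a list of statements:
      ("cmd", text)                          ordinary command, not terminal
      ("exec", text)                         terminal command, ends the path
      ("if", cond, then_block, else_block)   two-way branch
    """
    for i, stmt in enumerate(block):
        kind = stmt[0]
        if kind == "exec":
            yield constraints, stmt[1]
            return
        if kind == "if":
            _, cond, then_block, else_block = stmt
            rest = block[i + 1:]  # statements following the branch
            yield from enumerate_paths(then_block + rest, constraints + (cond,))
            yield from enumerate_paths(else_block + rest, constraints + (f"!({cond})",))
            return
    # Fell off the end of the script without reaching an exec.
    yield constraints, None


script = [
    ("if", '"$1" = "worker"', [("exec", "celery worker")], []),
    ("cmd", "export APP_ENV=prod"),
    ("if", '-n "$DEBUG"',
        [("exec", "python -m debugpy app.py")],
        [("exec", "gunicorn app:app")]),
]
paths = list(enumerate_paths(script))
```

Each resulting path pairs its accumulated branch predicates with the terminal command it reaches, so paths can then be grouped by terminal command or scored for feasibility.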

### Binary Intelligence (Sprint 0414)
Located in `Binary/`:
- `CodeFingerprint`: Record for binary function fingerprinting with algorithm, hash, and metrics.
- `FingerprintAlgorithm`: Enum for fingerprint types (BasicBlockHash, ControlFlowGraph, StringReferences, ImportReferences, Combined).
- `FunctionSignature`: Record for extracted binary function metadata (name, offset, size, calling convention, basic blocks, references).
- `BasicBlock`: Record for control flow basic block with offset, size, and instruction count.
- `SymbolInfo`: Record for recovered symbol information with confidence and match method.
- `SymbolMatchMethod`: Enum for how symbols were recovered (DebugInfo, ExactFingerprint, FuzzyFingerprint, PatternMatch, etc.).
- `AlternativeMatch`: Record for secondary symbol match candidates.
- `SourceCorrelation`: Record for mapping binary code to source packages/files.
- `CorrelationEvidence`: Flags enum for evidence types (FingerprintMatch, SymbolName, StringPattern, ImportReference, SourcePath, ExactMatch).
- `BinaryAnalysisResult`: Aggregate result with functions, recovered symbols, source correlations, and vulnerable matches.
- `BinaryArchitecture`: Enum for CPU architectures (X86, X64, ARM, ARM64, RISCV32, RISCV64, WASM, Unknown).
- `BinaryFormat`: Enum for binary formats (ELF, PE, MachO, WASM, Raw, Unknown).
- `BinaryAnalysisMetrics`: Metrics for analysis coverage and timing.
- `VulnerableFunctionMatch`: Match of a binary function to a known-vulnerable OSS function.
- `VulnerabilitySeverity`: Enum for vulnerability severity levels.
- `IFingerprintGenerator`: Interface for generating fingerprints from function signatures.
- `BasicBlockFingerprintGenerator`, `ControlFlowFingerprintGenerator`, `CombinedFingerprintGenerator`: Implementations.
- `FingerprintGeneratorFactory`: Factory for creating fingerprint generators.
- `IFingerprintIndex`: Interface for fingerprint lookup with exact and similarity matching.
- `InMemoryFingerprintIndex`: O(1) exact match, O(n) similarity search implementation.
- `VulnerableFingerprintIndex`: Extends index with vulnerability tracking.
- `FingerprintMatch`: Result record with source package, version, vulnerability associations, and similarity score.
- `FingerprintIndexStatistics`: Statistics about the fingerprint index.
- `ISymbolRecovery`: Interface for recovering symbol names from stripped binaries.
- `PatternBasedSymbolRecovery`: Heuristic-based recovery using known patterns.
- `FunctionPattern`: Record for function signature patterns (malloc, strlen, OpenSSL, zlib, etc.).
- `BinaryIntelligenceAnalyzer`: Orchestrator coordinating fingerprinting, symbol recovery, source correlation, and vulnerability matching.
- `BinaryIntelligenceOptions`: Configuration for analysis (algorithm, thresholds, parallelism).
- `VulnerableFunctionMatcher`: Matches binary functions against known-vulnerable function corpus.
- `VulnerableFunctionMatcherOptions`: Configuration for matching thresholds.
- `FingerprintCorpusBuilder`: Builds fingerprint corpus from known OSS packages for later matching.
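
The exact-versus-similarity split described for `InMemoryFingerprintIndex` can be sketched as a dictionary lookup plus a linear scan. This Python version is a simplified illustration, not the actual index: the class shape, the feature-set representation, the Jaccard similarity measure, and the default threshold are assumptions.

```python
class FingerprintIndex:
    """Toy index: O(1) exact lookup by hash, O(n) similarity scan."""

    def __init__(self):
        self._exact = {}      # fingerprint hash -> package label
        self._entries = []    # (feature set, package label)

    def add(self, fp_hash, features, package):
        self._exact[fp_hash] = package
        self._entries.append((frozenset(features), package))

    def lookup_exact(self, fp_hash):
        return self._exact.get(fp_hash)

    def lookup_similar(self, features, threshold=0.5):
        """Return (package, score) pairs with Jaccard similarity >= threshold,
        best match first, ties broken by package label for stable output."""
        query = frozenset(features)
        matches = []
        for known, package in self._entries:
            union = query | known
            score = len(query & known) / len(union) if union else 0.0
            if score >= threshold:
                matches.append((package, score))
        return sorted(matches, key=lambda m: (-m[1], m[0]))
```

Exact hashes only match unmodified functions; the similarity scan is what lets a lightly patched or recompiled function still correlate with its source package, at the cost of a full pass over the index.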

### Predictive Risk Scoring (Sprint 0415)
Located in `Risk/`:
- `RiskScore`: Record with OverallScore, Category, Confidence, Level, Factors, and ComputedAt.
- `RiskCategory`: Enum for risk dimensions (Exploitability, Exposure, Privilege, DataSensitivity, BlastRadius, DriftVelocity, SupplyChain, Unknown).
- `RiskLevel`: Enum for severity classification (Negligible, Low, Medium, High, Critical).
- `RiskFactor`: Record for individual contributing factors with name, category, score, weight, evidence, and source ID.
- `BusinessContext`: Record with environment, IsInternetFacing, DataClassification, CriticalityTier, ComplianceRegimes, and RiskMultiplier.
- `DataClassification`: Enum for data sensitivity (Public, Internal, Confidential, Restricted, Unknown).
- `SubjectType`: Enum for risk subject types (Image, Container, Service, Fleet).
- `RiskAssessment`: Aggregate record with subject, scores, factors, context, recommendations, and timestamps.
- `RiskTrend`: Record for tracking risk over time with snapshots and trend direction.
- `RiskSnapshot`: Point-in-time risk score for trend analysis.
- `TrendDirection`: Enum (Improving, Stable, Worsening, Volatile, Insufficient).
- `IRiskScorer`: Interface for computing risk scores from entrypoint intelligence.
- `IRiskContributor`: Interface for individual risk contributors (semantic, temporal, mesh, binary, vulnerability).
- `RiskContext`: Record aggregating all signal sources for risk computation.
- `VulnerabilityReference`: Record for known vulnerabilities with severity, CVSS, exploit status.
- `SemanticRiskContributor`: Risk from capabilities and threat vectors.
- `TemporalRiskContributor`: Risk from drift patterns and rapid changes.
- `MeshRiskContributor`: Risk from exposure, blast radius, and vulnerable paths.
- `BinaryRiskContributor`: Risk from vulnerable function usage in binaries.
- `VulnerabilityRiskContributor`: Risk from known CVEs and exploitability.
- `CompositeRiskScorer`: Combines all contributors with weighted scoring and business context adjustment.
- `CompositeRiskScorerOptions`: Configuration for weights and thresholds.
- `RiskExplainer`: Generates human-readable risk explanations with recommendations.
- `RiskReport`: Record with assessment, explanation, and recommendations.
- `RiskAggregator`: Fleet-level risk aggregation and trending.
- `FleetRiskSummary`: Summary statistics across fleet (count by level, top risks, trend).
- `RiskSummaryItem`: Individual subject summary for fleet views.
- `EntrypointRiskReport`: Complete report combining entrypoint graph with risk assessment.
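
Weighted scoring with a business-context adjustment could look roughly like the Python sketch below. The function name, the 0..1 score range, the clamping, and the level thresholds are all assumptions chosen for illustration; the actual weights and thresholds belong to `CompositeRiskScorerOptions`.

```python
# Assumed thresholds, highest first; anything below the last one is Negligible.
LEVEL_THRESHOLDS = [
    (0.9, "Critical"),
    (0.7, "High"),
    (0.4, "Medium"),
    (0.1, "Low"),
]


def composite_score(factors, risk_multiplier=1.0):
    """Combine (score, weight) factors into (overall_score, level).

    Scores are expected in 0..1; the weighted mean is scaled by the
    business-context multiplier and clamped back into 0..1.
    """
    total_weight = sum(weight for _, weight in factors)
    if total_weight:
        base = sum(score * weight for score, weight in factors) / total_weight
    else:
        base = 0.0
    overall = min(1.0, max(0.0, base * risk_multiplier))

    level = "Negligible"
    for threshold, name in LEVEL_THRESHOLDS:
        if overall >= threshold:
            level = name
            break
    return overall, level
```

Applying the multiplier after the weighted mean keeps contributor weights independent of business context, so the same technical signals can rank differently for an internet-facing production service than for an internal test environment.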

## Observability & Security
- No dynamic assembly loading beyond restart-time plug-in catalog.
- Structured logs include `scanId`, `imageDigest`, `layerDigest`, `command`, `reason`.
- Metrics counters: `entrytrace_resolutions_total{result}`, `entrytrace_unresolved_total{reason}`.
- Deny `source` directives outside image root; sandbox file IO via provided `IRootFileSystem`.
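
The rule about denying `source` directives outside the image root can be sketched as a path check performed before any file IO. The Python below is a hypothetical illustration, not the `IRootFileSystem` contract: `to_host_path` and the extraction-root parameter are assumed names, and symlink resolution is deliberately left out.

```python
import posixpath


def to_host_path(extract_root, script_dir, target):
    """Map a `source` target to a path under extract_root.

    Returns None when the target would climb above the image root.
    Symlinks are not resolved here (simplification).
    """
    # An absolute target replaces script_dir; a relative one is joined to it.
    joined = posixpath.join(script_dir, target)

    parts = []
    for part in joined.split("/"):
        if part in ("", "."):
            continue
        if part == "..":
            if not parts:
                return None  # would escape the image root: deny
            parts.pop()
        else:
            parts.append(part)
    return posixpath.join(extract_root, *parts)
```

Walking the components by hand, rather than normalizing and then joining, makes the escape attempt itself observable so it can be reported as a diagnostic instead of being silently clamped to the root.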

## Testing
- Unit tests live in `../StellaOps.Scanner.EntryTrace.Tests` with golden fixtures under `Fixtures/`.
- Determinism harness: same inputs produce byte-identical serialized graphs.
- Parser fuzz seeds captured for regression; interpreter tracers validated with sample scripts for Python, Node, Java launchers.
- **Temporal tests**: `Temporal/TemporalEntrypointGraphTests.cs`, `Temporal/InMemoryTemporalEntrypointStoreTests.cs`.
- **Mesh tests**: `Mesh/MeshEntrypointGraphTests.cs`, `Mesh/KubernetesManifestParserTests.cs`, `Mesh/DockerComposeParserTests.cs`, `Mesh/MeshEntrypointAnalyzerTests.cs`.
- **Speculative tests**: `Speculative/SymbolicStateTests.cs`, `Speculative/ShellSymbolicExecutorTests.cs`, `Speculative/PathEnumeratorTests.cs`, `Speculative/PathConfidenceScorerTests.cs`.
- **Binary tests**: `Binary/CodeFingerprintTests.cs`, `Binary/FingerprintIndexTests.cs`, `Binary/SymbolRecoveryTests.cs`, `Binary/BinaryIntelligenceIntegrationTests.cs`.
- **Risk tests** (TODO): `Risk/RiskScoreTests.cs`, `Risk/RiskContributorTests.cs`, `Risk/CompositeRiskScorerTests.cs`.
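
The determinism requirement (same inputs, byte-identical serialized graphs) comes down to canonical ordering before serialization. The Python sketch below shows the idea with hypothetical node and edge shapes; `serialize_graph`, the JSON encoding, and the sort keys are assumptions and do not reflect the actual `EntryTraceGraph` serializer.

```python
import hashlib
import json


def serialize_graph(nodes, edges):
    """Serialize a graph to canonical bytes regardless of input ordering."""
    doc = {
        # Stable node order: by file path, then by line.
        "nodes": sorted(nodes, key=lambda n: (n["path"], n["line"])),
        # Stable edge order: lexicographic on (source, target).
        "edges": sorted([list(edge) for edge in edges]),
    }
    # sort_keys fixes property order; compact separators avoid
    # whitespace differences between runs.
    return json.dumps(doc, sort_keys=True, separators=(",", ":")).encode("utf-8")


def graph_digest(nodes, edges):
    """Content hash over the canonical bytes."""
    return hashlib.sha256(serialize_graph(nodes, edges)).hexdigest()
```

A harness built on this idea would feed the same logical graph in shuffled orders and assert that the bytes, and therefore the digest, never change.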

## Required Reading
- `docs/modules/scanner/architecture.md`
- `docs/modules/platform/architecture-overview.md`
- `docs/modules/scanner/operations/entrypoint-problem.md`
- `docs/reachability/function-level-evidence.md`

## Working Agreement
- 1. Update task status to `DOING`/`DONE` in both the corresponding sprint file `/docs/implplan/SPRINT_*.md` and the local `TASKS.md` when you start or finish work.
- 2. Review this charter and the Required Reading documents before coding; confirm prerequisites are met.
- 3. Keep changes deterministic (stable ordering, timestamps, hashes) and align with offline/air-gap expectations.
- 4. Coordinate doc updates, tests, and cross-guild communication whenever contracts or workflows change.
- 5. Revert to `TODO` if you pause the task without shipping changes; leave notes in commit/PR descriptions for context.