StellaOps.Scanner.EntryTrace — Agent Charter
Mission
Resolve container ENTRYPOINT/CMD chains into deterministic call graphs that fuel usage-aware SBOMs, policy explainability, and runtime drift detection. Implement the EntryTrace analyzers and expose them as restart-time plug-ins for the Scanner Worker.
Scope
- Parse POSIX/Bourne shell constructs (exec, command, case, if, source/run-parts) with deterministic AST output.
- Walk layered root filesystems to resolve PATH lookups, interpreter hand-offs (Python/Node/Java), and record evidence.
- Surface explainable diagnostics for unresolved branches (env indirection, missing files, unsupported syntax) and emit metrics.
- Package analyzers as signed plug-ins under plugins/scanner/entrytrace/, guarded by restart-only policy.
- Semantic analysis: Classify entrypoints by application intent (ApiEndpoint, Worker, CronJob, etc.), capability class (NetworkListener, FileSystemAccess, etc.), and threat vectors.
- Temporal tracking: Track entrypoint evolution across image versions, detecting drift categories (intent changes, capability expansion, attack surface growth).
- Mesh analysis: Parse multi-container orchestration manifests (K8s, Docker Compose) to build cross-container reachability graphs and identify vulnerable paths.
Out of Scope
- SBOM emission/diffing (owned by Scanner.Emit/Scanner.Diff).
- Runtime enforcement or live drift reconciliation (owned by Zastava).
- Registry/network fetchers beyond file lookups inside extracted layers.
Interfaces & Contracts
Core EntryTrace
- Primary entry point: IEntryTraceAnalyzer.ResolveAsync, returning a deterministic EntryTraceGraph.
- Graph nodes must include file path, line span, interpreter classification, and evidence source, and must follow Scanner.Core timestamp/ID helpers when emitting events.
- Diagnostics must enumerate unknown reasons from a fixed enum; metrics are tagged entrytrace.*.
- Plug-ins register via IEntryTraceAnalyzerFactory and must validate against IPluginCatalogGuard.
Semantic Entrypoints (Sprint 0411)
Located in Semantic/:
- SemanticEntrypoint: Classifies entrypoints with intent, capabilities, threat vectors, and confidence scores.
- ApplicationIntent: Enum for high-level purpose (ApiEndpoint, Worker, CronJob, CliTool, etc.).
- CapabilityClass: Enum for functional capabilities (NetworkListener, FileSystemAccess, ProcessSpawner, etc.).
- ThreatVector: Enum for security-relevant classifications (NetworkExposure, FilePathTraversal, CommandInjection, etc.).
- DataFlowBoundary: Record for trust boundaries in data flow.
- SemanticConfidence: Confidence scores for classification results.
Temporal Entrypoints (Sprint 0412)
Located in Temporal/:
- TemporalEntrypointGraph: Tracks entrypoints across image versions with snapshots and deltas.
- EntrypointSnapshot: Point-in-time entrypoint state with content hash for comparison.
- EntrypointDelta: Version-to-version changes (added/removed/modified entrypoints).
- EntrypointDrift: Flags enum for drift categories (IntentChanged, CapabilitiesExpanded, AttackSurfaceGrew, PrivilegeEscalation, PortsAdded, etc.).
- ITemporalEntrypointStore: Interface for storing and querying temporal graphs.
- InMemoryTemporalEntrypointStore: Reference implementation with delta computation.
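To illustrate how a version-to-version comparison could map onto EntrypointDrift flags, here is a minimal Python sketch. It is not the module's C# implementation: the Snapshot fields and the compute_drift helper are hypothetical, and only the flag names are taken from the list above.

```python
from dataclasses import dataclass
from enum import Flag, auto


class EntrypointDrift(Flag):
    """Subset of the drift categories named in the charter."""
    NONE = 0
    IntentChanged = auto()
    CapabilitiesExpanded = auto()
    PortsAdded = auto()


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical point-in-time view of one entrypoint."""
    intent: str
    capabilities: frozenset
    ports: frozenset


def compute_drift(old: Snapshot, new: Snapshot) -> EntrypointDrift:
    """Compare two snapshots and OR together every drift category observed."""
    drift = EntrypointDrift.NONE
    if old.intent != new.intent:
        drift |= EntrypointDrift.IntentChanged
    if new.capabilities - old.capabilities:  # capabilities present only in the new version
        drift |= EntrypointDrift.CapabilitiesExpanded
    if new.ports - old.ports:  # ports exposed only in the new version
        drift |= EntrypointDrift.PortsAdded
    return drift
```

Because the categories are flags, a single delta can report several kinds of drift at once, and an unchanged entrypoint yields the empty value.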
Mesh Entrypoints (Sprint 0412)
Located in Mesh/:
- MeshEntrypointGraph: Multi-container service mesh with services, edges, and ingress paths.
- ServiceNode: Container in the mesh with entrypoints, exposed ports, and labels.
- CrossContainerEdge: Inter-service communication link.
- CrossContainerPath: Reachability path across services with vulnerability tracking.
- IngressPath: External exposure via ingress/load balancer.
- IManifestParser: Interface for parsing orchestration manifests.
- KubernetesManifestParser: Parser for K8s Deployment, Service, Ingress, StatefulSet, DaemonSet, Pod.
- DockerComposeParser: Parser for Docker Compose v2/v3 files.
- MeshEntrypointAnalyzer: Orchestrator for mesh analysis with security metrics and blast radius analysis.
Speculative Execution (Sprint 0413)
Located in Speculative/:
- SymbolicValue: Algebraic type for symbolic values (Concrete, Symbolic, Unknown, Composite).
- SymbolicState: Execution state with variable bindings, path constraints, and terminal commands.
- PathConstraint: Branch predicate constraint with kind classification and env dependency tracking.
- ExecutionPath: Complete execution path with constraints, commands, and reachability confidence.
- ExecutionTree: All paths from symbolic execution with branch coverage metrics.
- BranchPoint: Decision point in the script with coverage statistics.
- BranchCoverage: Coverage metrics (total, covered, infeasible, env-dependent branches).
- ISymbolicExecutor: Interface for symbolic execution of shell scripts.
- ShellSymbolicExecutor: Implementation that explores all if/elif/else and case branches.
- IConstraintEvaluator: Interface for path feasibility evaluation.
- PatternConstraintEvaluator: Pattern-based evaluator for common shell conditionals.
- PathEnumerator: Systematic path exploration with grouping by terminal command.
- PathConfidenceScorer: Confidence scoring with multi-factor analysis.
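The idea of exploring every branch and grouping the resulting paths by terminal command can be sketched as follows. This is an illustrative Python model only: the tuple-based script tree and the enumerate_paths and group_by_command helpers are hypothetical and do not reflect the actual ShellSymbolicExecutor or PathEnumerator APIs.

```python
def enumerate_paths(node, constraints=()):
    """Yield (constraints, command) for every path through a tiny script tree.

    Node shapes (hypothetical):
      ("exec", command)                      terminal command
      ("if", condition, then_node, else_node) two-way branch
    """
    kind = node[0]
    if kind == "exec":
        yield constraints, node[1]
    elif kind == "if":
        _, cond, then_node, else_node = node
        # Take the branch: the condition is added as a path constraint.
        yield from enumerate_paths(then_node, constraints + (cond,))
        # Skip the branch: the negated condition is added instead.
        yield from enumerate_paths(else_node, constraints + ("!(" + cond + ")",))
    else:
        raise ValueError("unsupported node kind: " + str(kind))


def group_by_command(paths):
    """Group path constraints by the terminal command they reach."""
    groups = {}
    for constraints, command in paths:
        groups.setdefault(command, []).append(constraints)
    return groups


# Models: if [ -n "$DEBUG" ]; then exec app --debug
#         elif [ "$MODE" = worker ]; then exec worker
#         else exec app; fi
script = (
    "if", '-n "$DEBUG"',
    ("exec", "app --debug"),
    ("if", '"$MODE" = worker', ("exec", "worker"), ("exec", "app")),
)
paths = list(enumerate_paths(script))
groups = group_by_command(paths)
```

Each path carries the conditions that must hold for it to be taken, which is the information a feasibility evaluator or confidence scorer would need downstream.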
Binary Intelligence (Sprint 0414)
Located in Binary/:
- CodeFingerprint: Record for binary function fingerprinting with algorithm, hash, and metrics.
- FingerprintAlgorithm: Enum for fingerprint types (BasicBlockHash, ControlFlowGraph, StringReferences, ImportReferences, Combined).
- FunctionSignature: Record for extracted binary function metadata (name, offset, size, calling convention, basic blocks, references).
- BasicBlock: Record for control flow basic block with offset, size, and instruction count.
- SymbolInfo: Record for recovered symbol information with confidence and match method.
- SymbolMatchMethod: Enum for how symbols were recovered (DebugInfo, ExactFingerprint, FuzzyFingerprint, PatternMatch, etc.).
- AlternativeMatch: Record for secondary symbol match candidates.
- SourceCorrelation: Record for mapping binary code to source packages/files.
- CorrelationEvidence: Flags enum for evidence types (FingerprintMatch, SymbolName, StringPattern, ImportReference, SourcePath, ExactMatch).
- BinaryAnalysisResult: Aggregate result with functions, recovered symbols, source correlations, and vulnerable matches.
- BinaryArchitecture: Enum for CPU architectures (X86, X64, ARM, ARM64, RISCV32, RISCV64, WASM, Unknown).
- BinaryFormat: Enum for binary formats (ELF, PE, MachO, WASM, Raw, Unknown).
- BinaryAnalysisMetrics: Metrics for analysis coverage and timing.
- VulnerableFunctionMatch: Match of a binary function to a known-vulnerable OSS function.
- VulnerabilitySeverity: Enum for vulnerability severity levels.
- IFingerprintGenerator: Interface for generating fingerprints from function signatures.
- BasicBlockFingerprintGenerator, ControlFlowFingerprintGenerator, CombinedFingerprintGenerator: Implementations.
- FingerprintGeneratorFactory: Factory for creating fingerprint generators.
- IFingerprintIndex: Interface for fingerprint lookup with exact and similarity matching.
- InMemoryFingerprintIndex: O(1) exact match, O(n) similarity search implementation.
- VulnerableFingerprintIndex: Extends index with vulnerability tracking.
- FingerprintMatch: Result record with source package, version, vulnerability associations, and similarity score.
- FingerprintIndexStatistics: Statistics about the fingerprint index.
- ISymbolRecovery: Interface for recovering symbol names from stripped binaries.
- PatternBasedSymbolRecovery: Heuristic-based recovery using known patterns.
- FunctionPattern: Record for function signature patterns (malloc, strlen, OpenSSL, zlib, etc.).
- BinaryIntelligenceAnalyzer: Orchestrator coordinating fingerprinting, symbol recovery, source correlation, and vulnerability matching.
- BinaryIntelligenceOptions: Configuration for analysis (algorithm, thresholds, parallelism).
- VulnerableFunctionMatcher: Matches binary functions against known-vulnerable function corpus.
- VulnerableFunctionMatcherOptions: Configuration for matching thresholds.
- FingerprintCorpusBuilder: Builds fingerprint corpus from known OSS packages for later matching.
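The O(1) exact match and O(n) similarity search noted for InMemoryFingerprintIndex can be pictured with the sketch below. It is a Python illustration under stated assumptions, not the real class: the method names are hypothetical, and Jaccard similarity over feature sets is a stand-in, since the charter does not say which similarity measure the index uses.

```python
class FingerprintIndexSketch:
    """Illustrative index: dict lookup for exact hashes, linear scan for similarity."""

    def __init__(self):
        self._by_hash = {}  # fingerprint hash -> package name
        self._entries = []  # (feature set, package name)

    def add(self, fp_hash, features, package):
        self._by_hash[fp_hash] = package
        self._entries.append((frozenset(features), package))

    def lookup_exact(self, fp_hash):
        """Average O(1): a single dictionary probe."""
        return self._by_hash.get(fp_hash)

    def lookup_similar(self, features, threshold=0.8):
        """O(n): score every stored entry with Jaccard similarity (an assumption)."""
        query = frozenset(features)
        matches = []
        for stored, package in self._entries:
            union = query | stored
            score = len(query & stored) / len(union) if union else 1.0
            if score >= threshold:
                matches.append((package, score))
        # Stable ordering: best score first, then package name.
        matches.sort(key=lambda m: (-m[1], m[0]))
        return matches


index = FingerprintIndexSketch()
index.add("h1", {"a", "b", "c", "d"}, "zlib")
index.add("h2", {"x", "y"}, "openssl")
```

Sorting results by score and then by name keeps the output order independent of insertion order, in line with the charter's determinism requirement.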
Predictive Risk Scoring (Sprint 0415)
Located in Risk/:
- RiskScore: Record with OverallScore, Category, Confidence, Level, Factors, and ComputedAt.
- RiskCategory: Enum for risk dimensions (Exploitability, Exposure, Privilege, DataSensitivity, BlastRadius, DriftVelocity, SupplyChain, Unknown).
- RiskLevel: Enum for severity classification (Negligible, Low, Medium, High, Critical).
- RiskFactor: Record for individual contributing factors with name, category, score, weight, evidence, and source ID.
- BusinessContext: Record with environment, IsInternetFacing, DataClassification, CriticalityTier, ComplianceRegimes, and RiskMultiplier.
- DataClassification: Enum for data sensitivity (Public, Internal, Confidential, Restricted, Unknown).
- SubjectType: Enum for risk subject types (Image, Container, Service, Fleet).
- RiskAssessment: Aggregate record with subject, scores, factors, context, recommendations, and timestamps.
- RiskTrend: Record for tracking risk over time with snapshots and trend direction.
- RiskSnapshot: Point-in-time risk score for trend analysis.
- TrendDirection: Enum (Improving, Stable, Worsening, Volatile, Insufficient).
- IRiskScorer: Interface for computing risk scores from entrypoint intelligence.
- IRiskContributor: Interface for individual risk contributors (semantic, temporal, mesh, binary, vulnerability).
- RiskContext: Record aggregating all signal sources for risk computation.
- VulnerabilityReference: Record for known vulnerabilities with severity, CVSS, exploit status.
- SemanticRiskContributor: Risk from capabilities and threat vectors.
- TemporalRiskContributor: Risk from drift patterns and rapid changes.
- MeshRiskContributor: Risk from exposure, blast radius, and vulnerable paths.
- BinaryRiskContributor: Risk from vulnerable function usage in binaries.
- VulnerabilityRiskContributor: Risk from known CVEs and exploitability.
- CompositeRiskScorer: Combines all contributors with weighted scoring and business context adjustment.
- CompositeRiskScorerOptions: Configuration for weights and thresholds.
- RiskExplainer: Generates human-readable risk explanations with recommendations.
- RiskReport: Record with assessment, explanation, and recommendations.
- RiskAggregator: Fleet-level risk aggregation and trending.
- FleetRiskSummary: Summary statistics across fleet (count by level, top risks, trend).
- RiskSummaryItem: Individual subject summary for fleet views.
- EntrypointRiskReport: Complete report combining entrypoint graph with risk assessment.
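One plausible reading of "weighted scoring and business context adjustment" is a weighted average of factor scores scaled by the context's risk multiplier. The Python sketch below shows that reading only; the 0.0 to 1.0 score range, the clamping, and the level thresholds are all assumptions rather than values from CompositeRiskScorerOptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskFactor:
    """Reduced form of the charter's RiskFactor record (name, score, weight only)."""
    name: str
    score: float  # assumed range 0.0..1.0
    weight: float


def composite_score(factors, risk_multiplier=1.0):
    """Weighted average of factor scores, scaled by a business multiplier and clamped."""
    total_weight = sum(f.weight for f in factors)
    if total_weight == 0:
        return 0.0  # no weighted evidence, so no measurable risk
    weighted = sum(f.score * f.weight for f in factors) / total_weight
    return min(1.0, max(0.0, weighted * risk_multiplier))


def to_level(score):
    """Map a score to a RiskLevel name using hypothetical thresholds."""
    for bound, level in ((0.9, "Critical"), (0.7, "High"), (0.4, "Medium"), (0.1, "Low")):
        if score >= bound:
            return level
    return "Negligible"


factors = [RiskFactor("semantic", 0.5, 1.0), RiskFactor("mesh", 1.0, 1.0)]
baseline = composite_score(factors)  # internal workload
internet_facing = composite_score(factors, risk_multiplier=2.0)  # clamped at 1.0
```

Applying the multiplier after averaging keeps contributor weights independent of deployment context, so the same factors can be re-scored for a different environment without recomputing them.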
Observability & Security
- No dynamic assembly loading beyond restart-time plug-in catalog.
- Structured logs include scanId, imageDigest, layerDigest, command, reason.
- Metrics counters: entrytrace_resolutions_total{result}, entrytrace_unresolved_total{reason}.
- Deny source directives outside image root; sandbox file IO via provided IRootFileSystem.
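The rule that source directives must not escape the image root amounts to a path containment check. The following Python sketch shows the lexical part of such a check; resolve_in_root is a hypothetical helper, and it deliberately ignores symlinks, which a real implementation would have to resolve through the sandboxed IRootFileSystem.

```python
import posixpath


def resolve_in_root(image_root, script_dir, target):
    """Map a sourced path into the extracted image, or return None if it escapes.

    image_root: host directory holding the extracted image filesystem
    script_dir: directory of the sourcing script, as seen inside the image
    target:     the path given to `source` (absolute or relative)
    """
    root = posixpath.normpath(image_root)
    if target.startswith("/"):
        # Absolute paths are relative to the image root, never the host root.
        candidate = posixpath.join(root, target.lstrip("/"))
    else:
        candidate = posixpath.join(root, script_dir.lstrip("/"), target)
    candidate = posixpath.normpath(candidate)  # collapses "." and ".." segments
    prefix = root.rstrip("/") + "/"
    if candidate == root or candidate.startswith(prefix):
        return candidate
    return None  # lexically outside the image root: deny
```

Returning None rather than raising lets the caller record the denial as an unresolved branch with a reason, instead of aborting the whole trace.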
Testing
- Unit tests live in ../StellaOps.Scanner.EntryTrace.Tests with golden fixtures under Fixtures/.
- Determinism harness: same inputs produce byte-identical serialized graphs.
- Parser fuzz seeds captured for regression; interpreter tracers validated with sample scripts for Python, Node, Java launchers.
- Temporal tests: Temporal/TemporalEntrypointGraphTests.cs, Temporal/InMemoryTemporalEntrypointStoreTests.cs.
- Mesh tests: Mesh/MeshEntrypointGraphTests.cs, Mesh/KubernetesManifestParserTests.cs, Mesh/DockerComposeParserTests.cs, Mesh/MeshEntrypointAnalyzerTests.cs.
- Speculative tests: Speculative/SymbolicStateTests.cs, Speculative/ShellSymbolicExecutorTests.cs, Speculative/PathEnumeratorTests.cs, Speculative/PathConfidenceScorerTests.cs.
- Binary tests: Binary/CodeFingerprintTests.cs, Binary/FingerprintIndexTests.cs, Binary/SymbolRecoveryTests.cs, Binary/BinaryIntelligenceIntegrationTests.cs.
- Risk tests (TODO): Risk/RiskScoreTests.cs, Risk/RiskContributorTests.cs, Risk/CompositeRiskScorerTests.cs.
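The determinism requirement (same inputs, byte-identical serialized graphs) is usually met by canonicalizing order before serialization. The Python sketch below shows the idea with a hypothetical graph shape of node dictionaries and edge pairs; it does not describe the actual EntryTraceGraph serializer or the harness in the test project.

```python
import hashlib
import json


def serialize_graph(nodes, edges):
    """Serialize a graph to canonical bytes: sorted nodes, sorted edges, sorted keys."""
    payload = {
        "nodes": sorted(nodes, key=lambda n: n["id"]),
        "edges": sorted(edges),
    }
    # Fixed separators and key order remove formatting as a source of variation.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def digest(data):
    """SHA-256 of the serialized bytes, convenient for golden-file comparison."""
    return hashlib.sha256(data).hexdigest()


nodes_a = [{"id": "n2", "path": "/usr/bin/python3"}, {"id": "n1", "path": "/entrypoint.sh"}]
nodes_b = [{"path": "/entrypoint.sh", "id": "n1"}, {"path": "/usr/bin/python3", "id": "n2"}]
edges = [["n1", "n2"]]

# Same logical graph, different discovery and key order, identical bytes.
first = serialize_graph(nodes_a, edges)
second = serialize_graph(nodes_b, edges)
```

A harness built on this idea would run the analyzer twice on the same fixture and assert that the two byte strings, or their digests, are equal.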
Required Reading
- docs/modules/scanner/architecture.md
- docs/modules/platform/architecture-overview.md
- docs/modules/scanner/operations/entrypoint-problem.md
- docs/reachability/function-level-evidence.md
Working Agreement
- Update task status to DOING/DONE in both the corresponding sprint file /docs/implplan/SPRINT_*.md and the local TASKS.md when you start or finish work.
- Review this charter and the Required Reading documents before coding; confirm prerequisites are met.
- Keep changes deterministic (stable ordering, timestamps, hashes) and align with offline/air-gap expectations.
- Coordinate doc updates, tests, and cross-guild communication whenever contracts or workflows change.
- Revert to TODO if you pause the task without shipping changes; leave notes in commit/PR descriptions for context.