Semantic Entrypoint Analysis
Part of Sprint 0411 - Semantic Entrypoint Engine
Overview
The Semantic Entrypoint Engine provides deep understanding of container entrypoints by inferring:
- Application Intent - What the application is designed to do (web server, CLI tool, worker, etc.)
- Capabilities - What system resources and external services the application uses
- Attack Surface - Potential security vulnerabilities based on detected patterns
- Data Boundaries - I/O edges where data enters or leaves the application
This semantic layer enables more accurate vulnerability prioritization, reachability analysis, and policy decisioning.
Schema Definition
SemanticEntrypoint Record
The core output of semantic analysis:
public sealed record SemanticEntrypoint
{
    public required string Id { get; init; }
    public required EntrypointSpecification Specification { get; init; }
    public required ApplicationIntent Intent { get; init; }
    public required CapabilityClass Capabilities { get; init; }
    public required ImmutableArray<ThreatVector> AttackSurface { get; init; }
    public required ImmutableArray<DataFlowBoundary> DataBoundaries { get; init; }
    public required SemanticConfidence Confidence { get; init; }
    public string? Language { get; init; }
    public string? Framework { get; init; }
    public string? FrameworkVersion { get; init; }
    public string? RuntimeVersion { get; init; }
    public ImmutableDictionary<string, string>? Metadata { get; init; }
}
Application Intent
Enumeration of recognized application types:
| Intent | Description | Example Frameworks |
|---|---|---|
| WebServer | HTTP/HTTPS listener | Django, Express, ASP.NET Core |
| CliTool | Command-line utility | Click, Cobra, System.CommandLine |
| Worker | Background job processor | Celery, Sidekiq, Hangfire |
| BatchJob | One-shot data processing | MapReduce, ETL scripts |
| Serverless | FaaS handler | Lambda, Azure Functions |
| Daemon | Long-running background service | systemd units |
| StreamProcessor | Real-time data pipeline | Kafka Streams, Flink |
| RpcServer | gRPC/Thrift server | grpc-go, grpc-dotnet |
| GraphQlServer | GraphQL API | Apollo, Hot Chocolate |
| DatabaseServer | Database engine | PostgreSQL, Redis |
| MessageBroker | Message queue server | RabbitMQ, NATS |
| CacheServer | Cache/session store | Redis, Memcached |
| ProxyGateway | Reverse proxy, API gateway | Envoy, NGINX |
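As a rough sketch, the intents in the table could be modeled as a plain enum, with a small helper that turns a serialized intent name (such as the "WebServer" value shown in the SBOM output later in this document) back into a typed value. The member names come from the table. The `Unknown = 0` default and the `IntentParser` helper are assumptions made for illustration, not the engine's actual definitions.

```csharp
using System;

// Member names mirror the table; Unknown = 0 is an assumed default.
public enum ApplicationIntent
{
    Unknown = 0,
    WebServer,
    CliTool,
    Worker,
    BatchJob,
    Serverless,
    Daemon,
    StreamProcessor,
    RpcServer,
    GraphQlServer,
    DatabaseServer,
    MessageBroker,
    CacheServer,
    ProxyGateway,
}

// Hypothetical helper, not part of the documented API.
public static class IntentParser
{
    // Maps a serialized intent name back to the enum, falling back to Unknown.
    public static ApplicationIntent Parse(string? value)
    {
        // Enum.TryParse also accepts numeric strings such as "42",
        // so values that do not correspond to a named member are rejected.
        return Enum.TryParse<ApplicationIntent>(value, ignoreCase: true, out var intent)
               && Enum.IsDefined(typeof(ApplicationIntent), intent)
            ? intent
            : ApplicationIntent.Unknown;
    }
}
```

Falling back to `Unknown` rather than throwing keeps a consumer tolerant of intents added by a newer scanner version; whether the real engine behaves this way is not stated in this document.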
Capability Classes
Flags enum representing detected capabilities:
| Capability | Description | Detection Signals |
|---|---|---|
| NetworkListen | Opens listening socket | http.ListenAndServe, app.listen() |
| NetworkConnect | Makes outbound connections | requests, http.Client |
| FileRead | Reads from filesystem | open(), File.ReadAllText() |
| FileWrite | Writes to filesystem | File write operations |
| ProcessSpawn | Spawns child processes | subprocess, exec.Command |
| DatabaseSql | SQL database access | psycopg2, SqlConnection |
| DatabaseNoSql | NoSQL database access | pymongo, redis |
| MessageQueue | Message broker client | pika, kafka-python |
| CacheAccess | Cache client operations | redis, memcached |
| ExternalHttpApi | External HTTP API calls | REST clients |
| Authentication | Auth operations | passport, JWT libraries |
| SecretAccess | Accesses secrets/credentials | Vault clients, env secrets |
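To illustrate what a flags enum means in practice, here is a minimal sketch of how the capabilities above could be declared and combined. Only the member names come from the table. The bit positions, the `long` backing type, and the `CapabilityDemo` helper are assumptions; the real `CapabilityClass` may be laid out differently and may contain more members.

```csharp
using System;

// Bit positions are assumptions; only the member names come from the table.
[Flags]
public enum CapabilityClass : long
{
    None            = 0,
    NetworkListen   = 1L << 0,
    NetworkConnect  = 1L << 1,
    FileRead        = 1L << 2,
    FileWrite       = 1L << 3,
    ProcessSpawn    = 1L << 4,
    DatabaseSql     = 1L << 5,
    DatabaseNoSql   = 1L << 6,
    MessageQueue    = 1L << 7,
    CacheAccess     = 1L << 8,
    ExternalHttpApi = 1L << 9,
    Authentication  = 1L << 10,
    SecretAccess    = 1L << 11,
}

// Hypothetical helper used only for this illustration.
public static class CapabilityDemo
{
    // A combination that a database-backed web application might plausibly receive.
    public static CapabilityClass WebAppExample() =>
        CapabilityClass.NetworkListen | CapabilityClass.DatabaseSql | CapabilityClass.FileRead;

    // True when every flag in 'required' is also set in 'detected'.
    public static bool HasAll(CapabilityClass detected, CapabilityClass required) =>
        (detected & required) == required;
}
```

Because each capability occupies its own bit, a single value can carry any combination, and a subset test is one bitwise AND.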
Threat Vectors
Inferred security threats:
| Threat Type | CWE ID | OWASP Category | Contributing Capabilities |
|---|---|---|---|
| SqlInjection | 89 | A03:2021 | DatabaseSql + UserInput |
| Xss | 79 | A03:2021 | NetworkListen + UserInput |
| Ssrf | 918 | A10:2021 | ExternalHttpApi + UserInput |
| Rce | 94 | A03:2021 | ProcessSpawn + UserInput |
| PathTraversal | 22 | A01:2021 | FileRead + UserInput |
| InsecureDeserialization | 502 | A08:2021 | Deserialization patterns |
| AuthenticationBypass | 287 | A07:2021 | Auth patterns detected |
| CommandInjection | 78 | A03:2021 | ProcessSpawn patterns |
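The "Contributing Capabilities" column can be read as a rule table: a threat is inferred when all of its listed capabilities are present. The sketch below encodes only the five rows that name an explicit capability combination. The `Caps` enum, its bit values, the `ThreatRule` record, and `ThreatInferenceSketch` are all hypothetical stand-ins; this is not the actual `ThreatVectorInferrer`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reduced stand-in for the capability flags; bit values are assumed.
[Flags]
public enum Caps
{
    None            = 0,
    NetworkListen   = 1,
    FileRead        = 2,
    ProcessSpawn    = 4,
    DatabaseSql     = 8,
    ExternalHttpApi = 16,
    UserInput       = 32, // named in the table's last column
}

// One row of the threat table: identifiers plus the capabilities it requires.
public sealed record ThreatRule(string ThreatType, int CweId, string Owasp, Caps Required);

public static class ThreatInferenceSketch
{
    // Rows copied from the table where the capability combination is explicit.
    private static readonly ThreatRule[] Rules =
    {
        new("SqlInjection", 89, "A03:2021", Caps.DatabaseSql | Caps.UserInput),
        new("Xss", 79, "A03:2021", Caps.NetworkListen | Caps.UserInput),
        new("Ssrf", 918, "A10:2021", Caps.ExternalHttpApi | Caps.UserInput),
        new("Rce", 94, "A03:2021", Caps.ProcessSpawn | Caps.UserInput),
        new("PathTraversal", 22, "A01:2021", Caps.FileRead | Caps.UserInput),
    };

    // Returns every rule whose required capabilities are all present.
    public static IReadOnlyList<ThreatRule> Infer(Caps detected) =>
        Rules.Where(r => (detected & r.Required) == r.Required).ToList();
}
```

Under these assumptions, `Infer(Caps.NetworkListen | Caps.DatabaseSql | Caps.UserInput)` yields SqlInjection and Xss, the same pairing that appears in the RichGraph output example in this document.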
Data Flow Boundaries
I/O edges for data flow analysis:
| Boundary Type | Direction | Security Relevance |
|---|---|---|
| HttpRequest | Inbound | User input entry point |
| HttpResponse | Outbound | Data exposure point |
| DatabaseQuery | Outbound | SQL injection surface |
| FileInput | Inbound | Path traversal surface |
| EnvironmentVar | Inbound | Config injection surface |
| MessageReceive | Inbound | Deserialization surface |
| ProcessSpawn | Outbound | Command injection surface |
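A consumer that only cares about where external data enters could filter boundaries by direction. The following sketch hard-codes the directions from the table in a lookup; the `BoundaryDirection` enum and the `BoundaryDirections` class are hypothetical names, and the real `DataFlowBoundary` type most likely carries its direction itself.

```csharp
using System.Collections.Generic;
using System.Linq;

public enum BoundaryDirection { Inbound, Outbound }

// Hypothetical lookup; directions are copied from the table above.
public static class BoundaryDirections
{
    public static readonly IReadOnlyDictionary<string, BoundaryDirection> ByType =
        new Dictionary<string, BoundaryDirection>
        {
            ["HttpRequest"]    = BoundaryDirection.Inbound,
            ["HttpResponse"]   = BoundaryDirection.Outbound,
            ["DatabaseQuery"]  = BoundaryDirection.Outbound,
            ["FileInput"]      = BoundaryDirection.Inbound,
            ["EnvironmentVar"] = BoundaryDirection.Inbound,
            ["MessageReceive"] = BoundaryDirection.Inbound,
            ["ProcessSpawn"]   = BoundaryDirection.Outbound,
        };

    // Keeps the boundary types through which external data enters, preserving input order.
    // Types missing from the lookup are dropped.
    public static IReadOnlyList<string> InboundOnly(IEnumerable<string> boundaryTypes) =>
        boundaryTypes
            .Where(t => ByType.TryGetValue(t, out var direction)
                        && direction == BoundaryDirection.Inbound)
            .ToList();
}
```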
Confidence Scoring
All inferences include confidence scores:
public sealed record SemanticConfidence
{
    public double Score { get; init; }                          // 0.0-1.0
    public ConfidenceTier Tier { get; init; }                   // Unknown, Low, Medium, High, Definitive
    public ImmutableArray<string> ReasoningChain { get; init; }
}
| Tier | Score Range | Description |
|---|---|---|
| Definitive | 0.95-1.0 | Framework explicitly declared |
| High | 0.8-0.95 | Strong pattern match |
| Medium | 0.5-0.8 | Multiple weak signals |
| Low | 0.2-0.5 | Heuristic inference |
| Unknown | 0.0-0.2 | No reliable signals |
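The tier table can be expressed as a small mapping function. The ranges above share their endpoints (0.95 appears in both Definitive and High, for example), so this sketch assumes each tier's lower bound is inclusive. That choice, and the `ConfidenceTiers` helper itself, are assumptions rather than documented behavior.

```csharp
using System;

public enum ConfidenceTier { Unknown, Low, Medium, High, Definitive }

// Hypothetical helper: derives a tier from a score using the table's ranges.
public static class ConfidenceTiers
{
    public static ConfidenceTier FromScore(double score)
    {
        if (double.IsNaN(score) || score < 0.0 || score > 1.0)
        {
            throw new ArgumentOutOfRangeException(nameof(score), "Score must be within 0.0-1.0.");
        }

        // Assumption: a score on a shared boundary belongs to the higher tier.
        return score switch
        {
            >= 0.95 => ConfidenceTier.Definitive,
            >= 0.8  => ConfidenceTier.High,
            >= 0.5  => ConfidenceTier.Medium,
            >= 0.2  => ConfidenceTier.Low,
            _       => ConfidenceTier.Unknown,
        };
    }
}
```

With this mapping a score of 0.85 falls in the High tier, which agrees with the `semantic_confidence` and `semantic_confidence_tier` pair in the RichGraph example in this document.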
Language Adapters
Semantic analysis uses language-specific adapters:
Python Adapter
- Django: Detects manage.py, INSTALLED_APPS, migrations
- Flask/FastAPI: Detects Flask(__name__), FastAPI() patterns
- Celery: Detects Celery() app, @task decorators
- Click/Typer: Detects CLI decorators
- Lambda: Detects lambda_handler pattern
Java Adapter
- Spring Boot: Detects @SpringBootApplication, starter dependencies
- Quarkus: Detects io.quarkus packages
- Kafka Streams: Detects kafka-streams dependency
- Main-Class: Falls back to manifest analysis
Node Adapter
- Express: Detects express() + listen()
- NestJS: Detects @nestjs/core dependency
- Fastify: Detects fastify() patterns
- CLI bin: Detects bin field in package.json
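Two of the Node signals above can be read straight from package.json, which makes them easy to sketch. The example below checks for an `@nestjs/core` entry under `dependencies` and for a `bin` field, using `System.Text.Json`. The `NodePackageSignals` class, its return labels, and the decision to look only at `dependencies` (not `devDependencies`) are assumptions for illustration; the actual adapter may weigh many more signals.

```csharp
using System.Text.Json;

// Hypothetical helper showing manifest-based detection for two Node signals.
public static class NodePackageSignals
{
    // Returns "NestJS", "CLI bin", or null when neither signal is present.
    // Throws JsonException if the input is not valid JSON.
    public static string? Detect(string packageJson)
    {
        using var doc = JsonDocument.Parse(packageJson);
        var root = doc.RootElement;
        if (root.ValueKind != JsonValueKind.Object)
        {
            return null;
        }

        // NestJS: an @nestjs/core entry under "dependencies".
        if (root.TryGetProperty("dependencies", out var deps)
            && deps.ValueKind == JsonValueKind.Object
            && deps.TryGetProperty("@nestjs/core", out _))
        {
            return "NestJS";
        }

        // CLI bin: "bin" may be a single path string or a map of command names.
        if (root.TryGetProperty("bin", out var bin)
            && bin.ValueKind is JsonValueKind.String or JsonValueKind.Object)
        {
            return "CLI bin";
        }

        return null;
    }
}
```

The `ValueKind` checks matter because `JsonElement.TryGetProperty` throws when called on anything other than a JSON object.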
.NET Adapter
- ASP.NET Core: Detects Microsoft.AspNetCore references
- Worker Service: Detects BackgroundService inheritance
- Console: Detects OutputType=Exe without web deps
Go Adapter
- net/http: Detects http.ListenAndServe patterns
- Cobra: Detects github.com/spf13/cobra import
- gRPC: Detects google.golang.org/grpc import
Integration Points
Entry Trace Pipeline
Semantic analysis integrates after entry trace resolution:
Container Image
↓
EntryTraceAnalyzer.ResolveAsync()
↓
EntryTraceGraph (nodes, edges, terminals)
↓
SemanticEntrypointOrchestrator.AnalyzeAsync()
↓
SemanticEntrypoint (intent, capabilities, threats)
SBOM Output
Semantic data appears in CycloneDX properties:
{
  "properties": [
    { "name": "stellaops:semantic.intent", "value": "WebServer" },
    { "name": "stellaops:semantic.capabilities", "value": "NetworkListen,DatabaseSql" },
    { "name": "stellaops:semantic.threats", "value": "[{\"type\":\"SqlInjection\",\"confidence\":0.7}]" },
    { "name": "stellaops:semantic.risk.score", "value": "0.7" },
    { "name": "stellaops:semantic.framework", "value": "django" }
  ]
}
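For consumers of the SBOM, the following sketch shows one way to pull the semantic properties out of a component using `System.Text.Json`. Note that the threats value is a JSON array encoded inside a string, so it needs a second parse. `SemanticPropertyReader` and its method names are hypothetical, and the sketch assumes the property shapes are exactly as in the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical reader for the stellaops:semantic.* properties shown above.
public static class SemanticPropertyReader
{
    private const string Prefix = "stellaops:semantic.";

    // Collects properties under the prefix, keyed by the part after it (e.g. "intent").
    public static Dictionary<string, string> Read(string componentJson)
    {
        var result = new Dictionary<string, string>(StringComparer.Ordinal);
        using var doc = JsonDocument.Parse(componentJson);
        if (doc.RootElement.ValueKind != JsonValueKind.Object
            || !doc.RootElement.TryGetProperty("properties", out var props)
            || props.ValueKind != JsonValueKind.Array)
        {
            return result;
        }

        foreach (var prop in props.EnumerateArray())
        {
            if (prop.ValueKind != JsonValueKind.Object) continue;
            if (!prop.TryGetProperty("name", out var n) || n.ValueKind != JsonValueKind.String) continue;
            if (!prop.TryGetProperty("value", out var v) || v.ValueKind != JsonValueKind.String) continue;

            var name = n.GetString()!;
            if (name.StartsWith(Prefix, StringComparison.Ordinal))
            {
                result[name.Substring(Prefix.Length)] = v.GetString()!;
            }
        }
        return result;
    }

    // Second-stage parse of the string-encoded threats array.
    public static List<(string Type, double Confidence)> ParseThreats(string threatsValue)
    {
        var threats = new List<(string Type, double Confidence)>();
        using var doc = JsonDocument.Parse(threatsValue);
        foreach (var t in doc.RootElement.EnumerateArray())
        {
            threats.Add((t.GetProperty("type").GetString()!, t.GetProperty("confidence").GetDouble()));
        }
        return threats;
    }
}
```

Typical use would be `var props = SemanticPropertyReader.Read(json);` followed by `SemanticPropertyReader.ParseThreats(props["threats"])` when that key is present.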
RichGraph Output
Semantic attributes on entrypoint nodes:
{
  "kind": "entrypoint",
  "attributes": {
    "semantic_intent": "WebServer",
    "semantic_capabilities": "NetworkListen,DatabaseSql,UserInput",
    "semantic_threats": "SqlInjection,Xss",
    "semantic_risk_score": "0.7",
    "semantic_confidence": "0.85",
    "semantic_confidence_tier": "High"
  }
}
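All RichGraph attribute values are strings, so a consumer has to convert them back to typed values. A possible approach, sketched here, relies on the fact that `Enum.TryParse` accepts comma-separated names for a flags enum, and parses numeric attributes with the invariant culture. `GraphCapabilities` is a three-member stand-in with assumed bit values, and `GraphAttributeCodec` is a hypothetical helper.

```csharp
using System;
using System.Globalization;

// Reduced stand-in covering only the capabilities in the example; bit values assumed.
[Flags]
public enum GraphCapabilities
{
    None          = 0,
    NetworkListen = 1,
    DatabaseSql   = 2,
    UserInput     = 4,
}

// Hypothetical helper for converting attribute strings to typed values and back.
public static class GraphAttributeCodec
{
    // Parses "A,B,C" into combined flags; any unrecognized name yields None.
    public static GraphCapabilities ParseCapabilities(string value) =>
        Enum.TryParse<GraphCapabilities>(value, ignoreCase: false, out var caps)
            ? caps
            : GraphCapabilities.None;

    // Enum.ToString() joins flag names with ", "; the space is removed
    // to match the attribute format shown above.
    public static string FormatCapabilities(GraphCapabilities caps) =>
        caps.ToString().Replace(", ", ",");

    // Invariant culture keeps "0.85" from being misread on locales that use a decimal comma.
    public static double ParseScore(string value) =>
        double.Parse(value, NumberStyles.Float, CultureInfo.InvariantCulture);
}
```

One caveat of this approach: a single unknown name makes the whole parse fail, so a consumer that must tolerate newer capability names would need to split the string and parse each name separately.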
Usage Examples
CLI Usage
# Scan with semantic analysis
stella scan myimage:latest --semantic
# Output includes semantic fields
stella scan myimage:latest --format json | jq '.semantic'
Programmatic Usage
// Create orchestrator
var orchestrator = new SemanticEntrypointOrchestrator();

// Create context from entry trace result
var context = orchestrator.CreateContext(entryTraceResult, fileSystem, containerMetadata);

// Run analysis
var result = await orchestrator.AnalyzeAsync(context);
if (result.Success && result.Entrypoint is not null)
{
    Console.WriteLine($"Intent: {result.Entrypoint.Intent}");
    Console.WriteLine($"Capabilities: {result.Entrypoint.Capabilities}");

    // Guard against an empty attack surface, where Max() has nothing to compare.
    var threats = result.Entrypoint.AttackSurface;
    if (!threats.IsDefaultOrEmpty)
    {
        Console.WriteLine($"Risk Score: {threats.Max(t => t.Confidence)}");
    }
}
Extending the Engine
Adding a New Language Adapter
- Implement ISemanticEntrypointAnalyzer:
  public sealed class RubySemanticAdapter : ISemanticEntrypointAnalyzer
  {
      public IReadOnlyList<string> SupportedLanguages => new[] { "ruby" };
      public int Priority => 100;

      public ValueTask<SemanticEntrypoint> AnalyzeAsync(
          SemanticAnalysisContext context,
          CancellationToken cancellationToken)
      {
          // Detect Rails, Sinatra, Sidekiq, etc.
          // Placeholder so the skeleton compiles until detection is implemented.
          throw new NotImplementedException();
      }
  }
- Register in SemanticEntrypointOrchestrator.CreateDefaultAdapters().
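The `SupportedLanguages` and `Priority` members suggest that the orchestrator picks one adapter per detected language. How it actually does so is not described here, so the sketch below is only one plausible reading: filter by language, then prefer the highest `Priority`. `IAdapterInfo`, `AdapterInfo`, and `AdapterSelection` are hypothetical stand-ins, and the "higher priority wins" rule is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reduced stand-in for ISemanticEntrypointAnalyzer: only the members used for selection.
public interface IAdapterInfo
{
    IReadOnlyList<string> SupportedLanguages { get; }
    int Priority { get; }
}

// Simple record used to exercise the selection logic.
public sealed record AdapterInfo(string Name, IReadOnlyList<string> SupportedLanguages, int Priority)
    : IAdapterInfo;

public static class AdapterSelection
{
    // Returns the adapter supporting 'language' with the highest Priority, or null.
    // Assumption: a larger Priority value takes precedence.
    public static T? Select<T>(IEnumerable<T> adapters, string language)
        where T : class, IAdapterInfo =>
        adapters
            .Where(a => a.SupportedLanguages.Contains(language, StringComparer.OrdinalIgnoreCase))
            .OrderByDescending(a => a.Priority)
            .FirstOrDefault();
}
```

If the real orchestrator treats lower numbers as higher precedence, or runs several adapters and merges their results, the ordering step would change accordingly.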
Adding a New Capability
- Add to CapabilityClass flags enum
- Update CapabilityDetector with detection patterns
- Update ThreatVectorInferrer if capability contributes to threats
- Update DataBoundaryMapper if capability implies I/O boundaries