docs/modules/scanner/operations/entrypoint-semantic.md
# Semantic Entrypoint Analysis

> Part of Sprint 0411 - Semantic Entrypoint Engine

## Overview

The Semantic Entrypoint Engine builds a deeper understanding of container entrypoints by inferring:
- **Application Intent** - What the application is designed to do (web server, CLI tool, worker, etc.)
- **Capabilities** - What system resources and external services the application uses
- **Attack Surface** - Potential security vulnerabilities based on detected patterns
- **Data Boundaries** - I/O edges where data enters or leaves the application

This semantic layer enables more accurate vulnerability prioritization, reachability analysis, and policy decisions.

## Schema Definition

### SemanticEntrypoint Record

The core output of semantic analysis:

```csharp
public sealed record SemanticEntrypoint
{
    public required string Id { get; init; }
    public required EntrypointSpecification Specification { get; init; }
    public required ApplicationIntent Intent { get; init; }
    public required CapabilityClass Capabilities { get; init; }
    public required ImmutableArray<ThreatVector> AttackSurface { get; init; }
    public required ImmutableArray<DataFlowBoundary> DataBoundaries { get; init; }
    public required SemanticConfidence Confidence { get; init; }
    public string? Language { get; init; }
    public string? Framework { get; init; }
    public string? FrameworkVersion { get; init; }
    public string? RuntimeVersion { get; init; }
    public ImmutableDictionary<string, string>? Metadata { get; init; }
}
```

### Application Intent

Enumeration of recognized application types:

| Intent | Description | Example Frameworks |
|--------|-------------|-------------------|
| `WebServer` | HTTP/HTTPS listener | Django, Express, ASP.NET Core |
| `CliTool` | Command-line utility | Click, Cobra, System.CommandLine |
| `Worker` | Background job processor | Celery, Sidekiq, Hangfire |
| `BatchJob` | One-shot data processing | MapReduce, ETL scripts |
| `Serverless` | FaaS handler | Lambda, Azure Functions |
| `Daemon` | Long-running background service | systemd units |
| `StreamProcessor` | Real-time data pipeline | Kafka Streams, Flink |
| `RpcServer` | gRPC/Thrift server | grpc-go, grpc-dotnet |
| `GraphQlServer` | GraphQL API | Apollo, Hot Chocolate |
| `DatabaseServer` | Database engine | PostgreSQL, Redis |
| `MessageBroker` | Message queue server | RabbitMQ, NATS |
| `CacheServer` | Cache/session store | Redis, Memcached |
| `ProxyGateway` | Reverse proxy, API gateway | Envoy, NGINX |

### Capability Classes

Flags enum representing detected capabilities:

| Capability | Description | Detection Signals |
|------------|-------------|-------------------|
| `NetworkListen` | Opens listening socket | `http.ListenAndServe`, `app.listen()` |
| `NetworkConnect` | Makes outbound connections | `requests`, `http.Client` |
| `FileRead` | Reads from filesystem | `open()`, `File.ReadAllText()` |
| `FileWrite` | Writes to filesystem | File write operations |
| `ProcessSpawn` | Spawns child processes | `subprocess`, `exec.Command` |
| `DatabaseSql` | SQL database access | `psycopg2`, `SqlConnection` |
| `DatabaseNoSql` | NoSQL database access | `pymongo`, `redis` |
| `MessageQueue` | Message broker client | `pika`, `kafka-python` |
| `CacheAccess` | Cache client operations | `redis`, `memcached` |
| `ExternalHttpApi` | External HTTP API calls | REST clients |
| `Authentication` | Auth operations | `passport`, `JWT` libraries |
| `SecretAccess` | Accesses secrets/credentials | Vault clients, env secrets |

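
Because capabilities are modelled as a flags enum, several of them can be combined into one value and tested with bitwise operations. The following is a minimal sketch, not the engine's actual definition: the enum below is a hypothetical subset, the numeric values are assumptions, and `CapabilityExamples` is an illustrative helper that does not exist in the source.

```csharp
using System;

// Hypothetical subset of the CapabilityClass flags enum. The member values
// are assumptions for illustration and may not match the real definition.
[Flags]
public enum CapabilityClass
{
    None = 0,
    NetworkListen = 1 << 0,
    NetworkConnect = 1 << 1,
    FileRead = 1 << 2,
    ProcessSpawn = 1 << 3,
    DatabaseSql = 1 << 4,
}

// Illustrative helper, not part of the engine.
public static class CapabilityExamples
{
    // True when every flag in 'required' is also set in 'detected'.
    public static bool HasAll(CapabilityClass detected, CapabilityClass required) =>
        (detected & required) == required;

    // Enum.Parse accepts comma-separated member names for flags enums,
    // e.g. "NetworkListen,DatabaseSql".
    public static CapabilityClass Parse(string commaSeparated) =>
        Enum.Parse<CapabilityClass>(commaSeparated);
}
```

For example, `CapabilityExamples.Parse("NetworkListen,DatabaseSql")` yields a value for which `HasAll(value, CapabilityClass.DatabaseSql)` is true and `HasAll(value, CapabilityClass.ProcessSpawn)` is false.
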
### Threat Vectors

Inferred security threats:

| Threat Type | CWE ID | OWASP Category | Contributing Capabilities |
|------------|--------|----------------|--------------------------|
| `SqlInjection` | 89 | A03:2021 | `DatabaseSql` + `UserInput` |
| `Xss` | 79 | A03:2021 | `NetworkListen` + `UserInput` |
| `Ssrf` | 918 | A10:2021 | `ExternalHttpApi` + `UserInput` |
| `Rce` | 94 | A03:2021 | `ProcessSpawn` + `UserInput` |
| `PathTraversal` | 22 | A01:2021 | `FileRead` + `UserInput` |
| `InsecureDeserialization` | 502 | A08:2021 | Deserialization patterns |
| `AuthenticationBypass` | 287 | A07:2021 | Auth patterns detected |
| `CommandInjection` | 78 | A03:2021 | `ProcessSpawn` patterns |

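
The capability combinations in the table can be read as rules: a threat is inferred when all of its contributing capabilities are present. The sketch below illustrates that reading for the five rows that name explicit capability pairs. It is an assumption about how such matching could work, not the `ThreatVectorInferrer` implementation; `Capability`, `ThreatRule`, and `ThreatRules` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for CapabilityClass; values are assumptions.
[Flags]
public enum Capability
{
    None = 0,
    NetworkListen = 1 << 0,
    FileRead = 1 << 1,
    ProcessSpawn = 1 << 2,
    DatabaseSql = 1 << 3,
    ExternalHttpApi = 1 << 4,
    UserInput = 1 << 5,
}

// One row of the threat table: threat name, CWE ID, OWASP category,
// and the capabilities that must all be present.
public sealed record ThreatRule(string ThreatType, int CweId, string Owasp, Capability Required);

public static class ThreatRules
{
    private static readonly ThreatRule[] Rules =
    {
        new("SqlInjection", 89, "A03:2021", Capability.DatabaseSql | Capability.UserInput),
        new("Xss", 79, "A03:2021", Capability.NetworkListen | Capability.UserInput),
        new("Ssrf", 918, "A10:2021", Capability.ExternalHttpApi | Capability.UserInput),
        new("Rce", 94, "A03:2021", Capability.ProcessSpawn | Capability.UserInput),
        new("PathTraversal", 22, "A01:2021", Capability.FileRead | Capability.UserInput),
    };

    // Returns every rule whose required capabilities are all detected.
    public static IReadOnlyList<ThreatRule> Infer(Capability detected)
    {
        var matches = new List<ThreatRule>();
        foreach (var rule in Rules)
        {
            if ((detected & rule.Required) == rule.Required)
            {
                matches.Add(rule);
            }
        }
        return matches;
    }
}
```

With `NetworkListen | DatabaseSql | UserInput` as input, this sketch returns the `SqlInjection` and `Xss` rules, in table order.
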
### Data Flow Boundaries

I/O edges for data flow analysis:

| Boundary Type | Direction | Security Relevance |
|---------------|-----------|-------------------|
| `HttpRequest` | Inbound | User input entry point |
| `HttpResponse` | Outbound | Data exposure point |
| `DatabaseQuery` | Outbound | SQL injection surface |
| `FileInput` | Inbound | Path traversal surface |
| `EnvironmentVar` | Inbound | Config injection surface |
| `MessageReceive` | Inbound | Deserialization surface |
| `ProcessSpawn` | Outbound | Command injection surface |

### Confidence Scoring

All inferences include confidence scores:

```csharp
public sealed record SemanticConfidence
{
    public double Score { get; init; }  // 0.0-1.0
    public ConfidenceTier Tier { get; init; }  // Unknown, Low, Medium, High, Definitive
    public ImmutableArray<string> ReasoningChain { get; init; }
}
```

| Tier | Score Range | Description |
|------|-------------|-------------|
| `Definitive` | 0.95-1.0 | Framework explicitly declared |
| `High` | 0.8-0.95 | Strong pattern match |
| `Medium` | 0.5-0.8 | Multiple weak signals |
| `Low` | 0.2-0.5 | Heuristic inference |
| `Unknown` | 0.0-0.2 | No reliable signals |

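
The tier table can be expressed as a simple score-to-tier mapping. The sketch below is illustrative only: the table does not say which tier owns a boundary value such as 0.8 or 0.95, so this version assumes each lower bound is inclusive. `ConfidenceTiers.FromScore` is a hypothetical helper, and the enum is redeclared here only to keep the example self-contained.

```csharp
using System;

// Redeclared for a self-contained example; member order follows the
// comment on SemanticConfidence.Tier.
public enum ConfidenceTier { Unknown, Low, Medium, High, Definitive }

// Hypothetical helper, not part of the engine.
public static class ConfidenceTiers
{
    // Maps a score in [0.0, 1.0] to a tier.
    // Assumption: the lower bound of each range is inclusive.
    public static ConfidenceTier FromScore(double score)
    {
        if (double.IsNaN(score) || score < 0.0 || score > 1.0)
        {
            throw new ArgumentOutOfRangeException(nameof(score), "Score must be within [0.0, 1.0].");
        }

        return score switch
        {
            >= 0.95 => ConfidenceTier.Definitive,
            >= 0.8 => ConfidenceTier.High,
            >= 0.5 => ConfidenceTier.Medium,
            >= 0.2 => ConfidenceTier.Low,
            _ => ConfidenceTier.Unknown,
        };
    }
}
```

Under that assumption, a score of 0.85 maps to `High` and a score of 0.95 maps to `Definitive`.
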
## Language Adapters

Semantic analysis uses language-specific adapters:

### Python Adapter
- **Django**: Detects `manage.py`, `INSTALLED_APPS`, migrations
- **Flask/FastAPI**: Detects `Flask(__name__)`, `FastAPI()` patterns
- **Celery**: Detects `Celery()` app, `@task` decorators
- **Click/Typer**: Detects CLI decorators
- **Lambda**: Detects `lambda_handler` pattern

### Java Adapter
- **Spring Boot**: Detects `@SpringBootApplication`, starter dependencies
- **Quarkus**: Detects `io.quarkus` packages
- **Kafka Streams**: Detects `kafka-streams` dependency
- **Main-Class**: Falls back to manifest analysis

### Node Adapter
- **Express**: Detects `express()` + `listen()`
- **NestJS**: Detects `@nestjs/core` dependency
- **Fastify**: Detects `fastify()` patterns
- **CLI bin**: Detects `bin` field in package.json

### .NET Adapter
- **ASP.NET Core**: Detects `Microsoft.AspNetCore` references
- **Worker Service**: Detects `BackgroundService` inheritance
- **Console**: Detects `OutputType=Exe` without web deps

### Go Adapter
- **net/http**: Detects `http.ListenAndServe` patterns
- **Cobra**: Detects `github.com/spf13/cobra` import
- **gRPC**: Detects `google.golang.org/grpc` import

## Integration Points

### Entry Trace Pipeline

Semantic analysis integrates after entry trace resolution:

```
Container Image
    ↓
EntryTraceAnalyzer.ResolveAsync()
    ↓
EntryTraceGraph (nodes, edges, terminals)
    ↓
SemanticEntrypointOrchestrator.AnalyzeAsync()
    ↓
SemanticEntrypoint (intent, capabilities, threats)
```

### SBOM Output

Semantic data appears in CycloneDX properties:

```json
{
  "properties": [
    { "name": "stellaops:semantic.intent", "value": "WebServer" },
    { "name": "stellaops:semantic.capabilities", "value": "NetworkListen,DatabaseSql" },
    { "name": "stellaops:semantic.threats", "value": "[{\"type\":\"SqlInjection\",\"confidence\":0.7}]" },
    { "name": "stellaops:semantic.risk.score", "value": "0.7" },
    { "name": "stellaops:semantic.framework", "value": "django" }
  ]
}
```

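
A consumer of the SBOM has to parse twice: once for the property list, and once more for `stellaops:semantic.threats`, whose value is a JSON array encoded as a string. The following sketch uses `System.Text.Json` to do both. `SemanticPropertyReader` and `ThreatSummary` are hypothetical names, and the code assumes the property layout shown in the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical shape for one entry of the threats property.
public sealed record ThreatSummary(string Type, double Confidence);

// Illustrative reader, not part of the engine.
public static class SemanticPropertyReader
{
    private const string Prefix = "stellaops:semantic.";

    // Collects all "stellaops:semantic.*" properties into a name -> value map.
    // Assumes each property has string "name" and "value" members.
    public static Dictionary<string, string> ReadProperties(string componentJson)
    {
        var result = new Dictionary<string, string>(StringComparer.Ordinal);
        using var doc = JsonDocument.Parse(componentJson);
        if (!doc.RootElement.TryGetProperty("properties", out var props)
            || props.ValueKind != JsonValueKind.Array)
        {
            return result;
        }

        foreach (var prop in props.EnumerateArray())
        {
            var name = prop.GetProperty("name").GetString();
            var value = prop.GetProperty("value").GetString();
            if (name is not null && value is not null
                && name.StartsWith(Prefix, StringComparison.Ordinal))
            {
                result[name] = value;
            }
        }
        return result;
    }

    // The threats value is itself JSON, so it needs a second parse.
    public static List<ThreatSummary> ReadThreats(string threatsValue)
    {
        var threats = new List<ThreatSummary>();
        using var doc = JsonDocument.Parse(threatsValue);
        foreach (var item in doc.RootElement.EnumerateArray())
        {
            threats.Add(new ThreatSummary(
                item.GetProperty("type").GetString() ?? string.Empty,
                item.GetProperty("confidence").GetDouble()));
        }
        return threats;
    }
}
```

Calling `ReadProperties` on the component JSON and then passing the `stellaops:semantic.threats` entry to `ReadThreats` yields one `ThreatSummary` per inferred threat.
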
### RichGraph Output

Semantic attributes on entrypoint nodes:

```json
{
  "kind": "entrypoint",
  "attributes": {
    "semantic_intent": "WebServer",
    "semantic_capabilities": "NetworkListen,DatabaseSql,UserInput",
    "semantic_threats": "SqlInjection,Xss",
    "semantic_risk_score": "0.7",
    "semantic_confidence": "0.85",
    "semantic_confidence_tier": "High"
  }
}
```

## Usage Examples

### CLI Usage

```bash
# Scan with semantic analysis
stella scan myimage:latest --semantic

# Output includes semantic fields
stella scan myimage:latest --format json | jq '.semantic'
```

### Programmatic Usage

```csharp
// Requires: using System.Linq; (for Max)

// Create orchestrator
var orchestrator = new SemanticEntrypointOrchestrator();

// Create context from entry trace result
var context = orchestrator.CreateContext(entryTraceResult, fileSystem, containerMetadata);

// Run analysis
var result = await orchestrator.AnalyzeAsync(context);

if (result.Success && result.Entrypoint is not null)
{
    Console.WriteLine($"Intent: {result.Entrypoint.Intent}");
    Console.WriteLine($"Capabilities: {result.Entrypoint.Capabilities}");

    // Max() over an empty sequence of non-nullable values throws,
    // so only report a risk score when threats were inferred.
    var threats = result.Entrypoint.AttackSurface;
    if (!threats.IsDefaultOrEmpty)
    {
        Console.WriteLine($"Risk Score: {threats.Max(t => t.Confidence)}");
    }
}
```

## Extending the Engine

### Adding a New Language Adapter

1. Implement `ISemanticEntrypointAnalyzer`:

```csharp
public sealed class RubySemanticAdapter : ISemanticEntrypointAnalyzer
{
    public IReadOnlyList<string> SupportedLanguages => new[] { "ruby" };
    public int Priority => 100;

    public ValueTask<SemanticEntrypoint> AnalyzeAsync(
        SemanticAnalysisContext context,
        CancellationToken cancellationToken)
    {
        // Detect Rails, Sinatra, Sidekiq, etc.
        // Placeholder so the skeleton compiles; replace with real detection logic.
        throw new NotImplementedException();
    }
}
```

2. Register in `SemanticEntrypointOrchestrator.CreateDefaultAdapters()`.

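
The `SupportedLanguages` and `Priority` members suggest that the orchestrator picks adapters by language and orders them by priority. The source does not state how this selection works, so the sketch below is only one plausible reading: it assumes larger `Priority` values are tried first and that language names match case-insensitively. `IAdapterInfo`, `AdapterInfo`, and `AdapterSelector` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down view of an adapter: only the two members
// relevant to selection are modelled here.
public interface IAdapterInfo
{
    IReadOnlyList<string> SupportedLanguages { get; }
    int Priority { get; }
}

// Simple test double implementing the trimmed-down interface.
public sealed record AdapterInfo(
    string Name,
    IReadOnlyList<string> SupportedLanguages,
    int Priority) : IAdapterInfo;

public static class AdapterSelector
{
    // Returns the adapters that support 'language', highest Priority first.
    // Assumption: larger Priority values win; the real ordering may differ.
    public static IReadOnlyList<T> Select<T>(IEnumerable<T> adapters, string language)
        where T : IAdapterInfo =>
        adapters
            .Where(a => a.SupportedLanguages.Contains(language, StringComparer.OrdinalIgnoreCase))
            .OrderByDescending(a => a.Priority)
            .ToList();
}
```

Given a Ruby adapter with priority 50, another with priority 100, and a Python adapter, `AdapterSelector.Select(adapters, "ruby")` returns the two Ruby adapters with the priority-100 one first.
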
### Adding a New Capability

1. Add to `CapabilityClass` flags enum
2. Update `CapabilityDetector` with detection patterns
3. Update `ThreatVectorInferrer` if capability contributes to threats
4. Update `DataBoundaryMapper` if capability implies I/O boundaries

## Related Documentation

- [Entry Trace Problem Statement](./entrypoint-problem.md)
- [Static Analysis Approach](./entrypoint-static-analysis.md)
- [Language-Specific Guides](./entrypoint-lang-python.md)
- [Reachability Evidence](../../reachability/function-level-evidence.md)