save checkpoint

This commit is contained in:
master
2026-02-12 21:02:43 +02:00
parent 5bca406787
commit 9911b7d73c
593 changed files with 174390 additions and 1376 deletions


@@ -36,6 +36,13 @@ Tier B catalog lookups are performed over three identity dimensions:
- `BinaryKey`
- `FileSha256`
### 1.2.1 Tier C Fingerprint Runtime Contract
Tier C function fingerprint matching is implemented as deterministic offline-safe byte-window analysis:
- `FingerprintExtractor` derives basic-block hashes, CFG hash, string-reference hashes, constants, and call-targets from bounded binary byte windows (not seed-only synthetic placeholders).
- `SignatureMatcher` applies configurable weighted matching across basic-block, CFG, string-reference, and constant signals.
- Golden CVE fixtures include required high-impact package coverage for `openssl`, `glibc`, `zlib`, and `curl`, and verification checks enforce this coverage during feature rechecks.
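
The weighted matching described for `SignatureMatcher` can be illustrated with a minimal sketch. It assumes each signal yields a similarity in [0, 1] and that the final score is a weight-normalised average. The type names, the weights, and the combination rule are all hypothetical, since this document does not state the actual defaults or formula.

```csharp
using System;

// Hypothetical per-signal similarity scores, each expected in [0, 1].
public readonly record struct SignalScores(double BasicBlock, double Cfg, double StringRefs, double Constants);

// Hypothetical weights; the real SignatureMatcher defaults are not given in this document.
public readonly record struct SignalWeights(double BasicBlock, double Cfg, double StringRefs, double Constants);

public static class WeightedMatchSketch
{
    // Returns the weighted average of the four signals, normalised by the total weight.
    public static double Combine(SignalScores s, SignalWeights w)
    {
        double total = w.BasicBlock + w.Cfg + w.StringRefs + w.Constants;
        if (total <= 0)
        {
            throw new ArgumentException("The weights must sum to a positive value.", nameof(w));
        }

        double sum = s.BasicBlock * w.BasicBlock
                   + s.Cfg * w.Cfg
                   + s.StringRefs * w.StringRefs
                   + s.Constants * w.Constants;

        // Clamp to guard against inputs slightly outside [0, 1].
        return Math.Clamp(sum / total, 0.0, 1.0);
    }
}
```

Normalising by the total weight keeps the score in [0, 1] even when the configured weights do not sum to one, which makes thresholds comparable across configurations.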
### 1.3 Module Scope
**In Scope:**


@@ -39,6 +39,67 @@ Each transport connection carries:
---
## Front Door (Configurable Route Table)
The Router Gateway serves as the **single HTTP entry point** for the entire StellaOps platform. In addition to binary transport routing for microservices, it handles:
- **Static file serving** (Angular SPA dist)
- **Reverse proxy** to HTTP-only backend services
- **WebSocket proxy** to upstream WebSocket servers
- **SPA fallback** (extensionless paths serve `index.html`)
- **Custom error pages** (404/500 HTML fallback)
### Route Table Model
Routes are configured in `Gateway:Routes` as a `StellaOpsRoute[]` array, evaluated **first-match-wins**:
```csharp
public sealed class StellaOpsRoute
{
    public StellaOpsRouteType Type { get; set; }
    public string Path { get; set; }
    public bool IsRegex { get; set; }
    public string? TranslatesTo { get; set; }
    public Dictionary<string, string> Headers { get; set; }
}
```
Route types:
| Type | Behavior |
|------|----------|
| `ReverseProxy` | Strip path prefix, forward to `TranslatesTo` HTTP URL |
| `StaticFiles` | Serve files from `TranslatesTo` directory, SPA fallback if `x-spa-fallback: true` header set |
| `StaticFile` | Serve a single file at exact path match |
| `WebSocket` | Bidirectional WebSocket proxy to `TranslatesTo` ws:// URL |
| `Microservice` | Pass through to binary transport pipeline |
| `NotFoundPage` | HTML file served on 404 (after all other middleware) |
| `ServerErrorPage` | HTML file served on 5xx (after all other middleware) |
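
First-match-wins evaluation over the route table might look like the sketch below. `RouteSketch` is a hypothetical, trimmed-down stand-in for `StellaOpsRoute`, and the assumption that non-regex paths match as segment-aligned prefixes is ours; the actual matching rules of `RouteDispatchMiddleware` are not spelled out here.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for StellaOpsRoute, reduced to the fields needed for matching.
public sealed record RouteSketch(string Type, string Path, bool IsRegex = false, string? TranslatesTo = null);

public static class RouteTableSketch
{
    // Walks the routes in configured order and returns the first match, or null.
    public static RouteSketch? Resolve(IReadOnlyList<RouteSketch> routes, string requestPath)
    {
        foreach (var route in routes)
        {
            bool matches = route.IsRegex
                ? Regex.IsMatch(requestPath, route.Path)
                : IsPrefixMatch(requestPath, route.Path);

            if (matches)
            {
                return route; // first match wins; later routes are never consulted
            }
        }

        return null;
    }

    // Assumed rule: a prefix matches only on a path-segment boundary,
    // so "/api" matches "/api" and "/api/users" but not "/apix".
    private static bool IsPrefixMatch(string path, string prefix)
    {
        if (!path.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
        {
            return false;
        }

        return prefix.EndsWith('/') || path.Length == prefix.Length || path[prefix.Length] == '/';
    }
}
```

Because evaluation stops at the first match, a catch-all route such as a `StaticFiles` entry on `/` would need to sit after more specific routes in the configured array.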
### Pipeline Order
System paths (`/health`, `/metrics`, `/openapi.*`) bypass the route table entirely. The dispatch middleware runs before the microservice pipeline:
```
HealthCheckMiddleware        → (system paths: health, metrics)
RouteDispatchMiddleware      → (static files, reverse proxy, websocket)
MapRouterOpenApi             → (OpenAPI endpoints)
UseWhen(non-system)          → (microservice pipeline: auth, routing, transport)
ErrorPageFallbackMiddleware  → (custom 404/500 pages)
```
### Docker Architecture
```
Browser → Router Gateway (port 80) → [microservices via binary transport]
                                   → [HTTP backends via reverse proxy]
                                   → [Angular SPA from /app/wwwroot volume]
```
The Angular SPA dist is provided by a `console-builder` init container that copies the built files to a shared `console-dist` volume mounted at `/app/wwwroot`.
---
## Service Identity
### Instance Identity


@@ -253,10 +253,13 @@ When `scanner.events.enabled = true`, the WebService serialises the signed repor
* Record `name`, `version` (epoch/revision), `arch`, source package where present, and **declared file lists**.
> **Data flow note:** Each OS analyzer now writes its canonical output into the shared `ScanAnalysisStore` under
> `analysis.os.packages` (raw results), `analysis.os.fragments` (per-analyzer layer fragments), and contributes to
> `analysis.layers.fragments` (the aggregated view consumed by emit/diff pipelines). Helpers in
> `ScanAnalysisCompositionBuilder` convert these fragments into SBOM composition requests and component graphs so the
> diff/emit stages no longer reach back into individual analyzer implementations.
> RPM and DPKG changelog evidence now also emits two deterministic vendor metadata keys,
> `vendor.changelogBugRefs` and `vendor.changelogBugToCves`, which record Debian `Closes`, Red Hat `RHBZ#`, and Launchpad `LP`
> bug references and the bug-to-CVE correlation traces used during backport triage.
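
As a rough illustration of the changelog bug-reference extraction mentioned in the note above, the sketch below pulls Debian `Closes`, `RHBZ#`, and Launchpad `LP` references out of free-form changelog text. The class name, the regular expressions, and the `tracker:#id` output format are assumptions made for this example; the format the analyzers actually write to `vendor.changelogBugRefs` is not specified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class ChangelogBugRefSketch
{
    // "Closes: #123, #456" (Debian) and "LP: #123, #456" (Launchpad) carry comma-separated lists.
    private static readonly Regex DebianCloses =
        new(@"\bCloses:\s*((?:#\d+)(?:\s*,\s*#\d+)*)", RegexOptions.IgnoreCase);
    private static readonly Regex Launchpad =
        new(@"\bLP:\s*((?:#\d+)(?:\s*,\s*#\d+)*)", RegexOptions.IgnoreCase);
    // "RHBZ#123" (Red Hat Bugzilla) carries a single id per occurrence.
    private static readonly Regex RedHat =
        new(@"\bRHBZ#(\d+)", RegexOptions.IgnoreCase);
    private static readonly Regex Number = new(@"\d+");

    // Returns de-duplicated references in ordinal order so repeated runs yield identical output.
    public static IReadOnlyList<string> Extract(string changelog)
    {
        var refs = new SortedSet<string>(StringComparer.Ordinal);

        foreach (Match m in DebianCloses.Matches(changelog))
            foreach (Match n in Number.Matches(m.Groups[1].Value))
                refs.Add($"debian:#{n.Value}");

        foreach (Match m in Launchpad.Matches(changelog))
            foreach (Match n in Number.Matches(m.Groups[1].Value))
                refs.Add($"launchpad:#{n.Value}");

        foreach (Match m in RedHat.Matches(changelog))
            refs.Add($"rhbz:#{m.Groups[1].Value}");

        return refs.ToList();
    }
}
```

Collecting into an ordinally sorted set is one simple way to get deterministic output regardless of the order in which references appear in the changelog.
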
**B) Language ecosystems (installed state only)**
@@ -321,8 +324,9 @@ The **BinaryLookupStageExecutor** enriches scan results with binary-level vulner
* **Identity Extraction**: For each ELF/PE/Mach-O binary, extract Build-ID, file SHA256, and architecture. Generate a `binary_key` for catalog lookups.
* **Build-ID Catalog Lookup**: Query the BinaryIndex known-build catalog using Build-ID as primary key. Returns CVE matches with high confidence (>=0.95) when the exact binary version is indexed.
* **Fingerprint Matching**: For binaries not in the catalog, compute position-independent fingerprints (basic-block, CFG, string-refs) and match against the vulnerability corpus. Returns similarity scores and confidence.
* **Fix Status Detection**: For each CVE match, query distro-specific backport information to determine if the vulnerability was fixed via distro patch. Methods: `changelog`, `patch_analysis`, `advisory`.
* **Valkey Cache**: All lookups are cached with configurable TTL (default 1 hour for identities, 30 minutes for fingerprints). Target cache hit rate: >80% for repeat scans.
* **Runtime wiring (2026-02-12)**: `BinaryLookupStageExecutor` now publishes unified mapped findings to `analysis.binary.findings`, Build-ID to PURL lookup results to `analysis.binary.buildid.mappings`, and patch verification output to `analysis.binary.patchverification.result`.
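
The lookup order in the bullets above (Build-ID catalog first, fingerprint matching as the fallback) can be sketched as follows. The delegate-based shape, the `LookupResult` type, and the source labels are hypothetical. The 0.95 value is the lower bound quoted above for exact catalog hits, used here as a placeholder rather than the value the executor actually assigns.

```csharp
#nullable enable
using System;

// Hypothetical result type: where the match came from and how confident it is.
public sealed record LookupResult(string Source, double Confidence);

public static class BinaryLookupSketch
{
    // Tries the Build-ID catalog first; falls back to fingerprint similarity otherwise.
    public static LookupResult? Lookup(
        string? buildId,
        Func<string, bool> catalogContains,
        Func<double?> fingerprintSimilarity)
    {
        if (!string.IsNullOrEmpty(buildId) && catalogContains(buildId))
        {
            // Exact catalog hit: placeholder confidence taken from the ">=0.95" bound.
            return new LookupResult("build-id-catalog", 0.95);
        }

        // No catalog entry (or no Build-ID at all): use the fingerprint score if one exists.
        double? similarity = fingerprintSimilarity();
        return similarity is double s ? new LookupResult("fingerprint", s) : null;
    }
}
```

Passing the fingerprint step as a delegate means the more expensive analysis only runs when the catalog lookup misses.
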
**BinaryFindingMapper** converts matches to standard findings format with `BinaryFindingEvidence`:
```csharp


@@ -3,6 +3,7 @@
## Overview
- Accepts external SBOMs and runs them through validation, normalization, and analysis triggers.
- Stores the SBOM artifact in the scanner object store and records provenance metadata.
- Persists hot-lookup projection rows in `scanner.artifact_boms` (default scanner schema contract).
- Emits a deterministic analysis job id tied to the upload metadata.
## API


@@ -11,6 +11,7 @@ The `StellaOps.Scanner.EntryTrace` stack (analyzer + worker + surfaces) currentl
- **Script + interpreter resolution**: POSIX shell parsing (AST-driven) covers `source`, `run-parts`, `exec`, and supervisor service directories, with Windows `cmd /c` support. Python `-m`, Node script, and Java `-jar` lookups add evidence when targets are located.
- **Terminal classification & scoring**: `ClassifyTerminal` fingerprints ELF (`PT_INTERP`, Go build ID, Rust notes), PE/CLR, and JAR manifests, pairs them with shebang/runtime heuristics (`python`, `node`, `java`, `.NET`, `php-fpm`, `nginx`, `ruby`), and emits `EntryTracePlan/EntryTraceTerminal` records capped at 95-point confidence.
- **NDJSON + capability stream**: `EntryTraceNdjsonWriter` produces deterministic `entrytrace.entry/node/edge/target/warning/capability` lines consumed by AOC, CLI, and policy surfaces.
- **Binary intelligence enrichment**: native terminal targets are analyzed with function-level fingerprinting (`BinaryIntelligenceAnalyzer`) and the resulting source-correlation/vulnerability summary is attached to `EntryTraceGraph.BinaryIntelligence`.
- **Runtime reconciliation**: `ProcFileSystemSnapshot` + `ProcGraphBuilder` replay `/proc`, `EntryTraceRuntimeReconciler` merges runtime terminals with static predictions, and diagnostics note matches/mismatches.
- **Surface integration**: Scanner Worker caches graphs (`SurfaceCache`), persists `EntryTraceResult` via the shared store, exposes NDJSON + graph through `ScanAnalysisKeys`, and the WebService/CLI (`scan entrytrace`) return the stored result.
@@ -62,7 +63,7 @@ Worker output lands in `context.Analysis` (`EntryTraceGraph`, `EntryTraceNdjson`
#### Probing via WebService & CLI
- **REST**: `GET /api/scans/{scanId}/entrytrace` returns `EntryTraceResponse` (`graph + ndjson + metadata`, including optional `graph.binaryIntelligence` for native terminal analyses). Requires scan ownership/authz.
- **CLI**: `stella scan entrytrace <scan-id> [--ndjson] [--verbose]` renders a confidence-sorted terminal table, diagnostics, and optionally the NDJSON payload.
Both surfaces consume the persisted result; rerunning the worker updates the stored document atomically.
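
The 95-point confidence cap and the confidence-sorted CLI table could be modelled roughly as below. `TerminalSketch` is a hypothetical stand-in for `EntryTraceTerminal`, and the tie-break on path is an assumption added to keep the ordering deterministic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for EntryTraceTerminal: a resolved target and its score.
public sealed record TerminalSketch(string Path, double Confidence);

public static class TerminalTableSketch
{
    private const double MaxConfidence = 95.0; // cap stated in the overview above

    // Caps each confidence at 95, then sorts by confidence (descending) and path (ordinal).
    public static IReadOnlyList<TerminalSketch> CapAndSort(IEnumerable<TerminalSketch> terminals) =>
        terminals
            .Select(t => t with { Confidence = Math.Min(t.Confidence, MaxConfidence) })
            .OrderByDescending(t => t.Confidence)
            .ThenBy(t => t.Path, StringComparer.Ordinal)
            .ToList();
}
```

Capping before sorting means several high-scoring terminals can tie at 95, which is why a secondary sort key is useful for stable output.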