Add Authority Advisory AI and API Lifecycle Configuration

- Introduced AuthorityAdvisoryAiOptions and related classes for managing advisory AI configurations, including remote inference options and tenant-specific settings.
- Added AuthorityApiLifecycleOptions to control API lifecycle settings, including legacy OAuth endpoint configurations.
- Implemented validation and normalization methods for both advisory AI and API lifecycle options to ensure proper configuration.
- Created AuthorityNotificationsOptions and its related classes for managing notification settings, including ack tokens, webhooks, and escalation options.
- Developed IssuerDirectoryClient and related models for interacting with the issuer directory service, including caching mechanisms and HTTP client configurations.
- Added support for dependency injection through ServiceCollectionExtensions for the Issuer Directory Client.
- Updated project file to include necessary package references for the new Issuer Directory Client library.
2025-11-02 13:40:38 +02:00
parent 66cb6c4b8a
commit f98cea3bcf
516 changed files with 68157 additions and 24754 deletions


@@ -1,11 +1,89 @@
# Entry-Point Static Analysis
This guide captures the static half of the StellaOps entry-point detection pipeline: how we turn image metadata and filesystem contents into a resolved binary, an execution chain, and a confidence score.
## 0) Implementation snapshot — Sprint130.A (2025-11-02)
The `StellaOps.Scanner.EntryTrace` stack (analyzer + worker + surfaces) currently provides:
- **OCI config + layered FS context**: `EntryTraceImageContextFactory` normalises environment (`PATH` fallback), user, and working directory while `LayeredRootFileSystem` handles whiteouts, symlinks, and bounded byte reads (`TryReadBytes`) so ELF/PE probing stays offline friendly.
- **Wrapper-aware exec expansion**: the analyzer unwraps init/user-switch/environment/supervisor wrappers (`tini`, `dumb-init`, `gosu`, `su-exec`, `chpst`, `env`, `supervisord`, `s6-supervise`, `runsv*`) and records guard metadata plus environment/user deltas on nodes and edges.
- **Script + interpreter resolution**: POSIX shell parsing (AST-driven) covers `source`, `run-parts`, `exec`, and supervisor service directories, with Windows `cmd /c` support. Python `-m`, Node script, and Java `-jar` lookups add evidence when targets are located.
- **Terminal classification & scoring**: `ClassifyTerminal` fingerprints ELF (`PT_INTERP`, Go build ID, Rust notes), PE/CLR, and JAR manifests, pairs them with shebang/runtime heuristics (`python`, `node`, `java`, `.NET`, `php-fpm`, `nginx`, `ruby`), and emits `EntryTracePlan/EntryTraceTerminal` records capped at 95-point confidence.
- **NDJSON + capability stream**: `EntryTraceNdjsonWriter` produces deterministic `entrytrace.entry/node/edge/target/warning/capability` lines consumed by AOC, CLI, and policy surfaces.
- **Runtime reconciliation**: `ProcFileSystemSnapshot` + `ProcGraphBuilder` replay `/proc`, `EntryTraceRuntimeReconciler` merges runtime terminals with static predictions, and diagnostics note matches/mismatches.
- **Surface integration**: Scanner Worker caches graphs (`SurfaceCache`), persists `EntryTraceResult` via the shared store, exposes NDJSON + graph through `ScanAnalysisKeys`, and the WebService/CLI (`scan entrytrace`) return the stored result.
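To make the whiteout handling mentioned for `LayeredRootFileSystem` concrete, here is a minimal sketch of OCI-style layer merging. It is not the actual implementation: `WhiteoutSketch` and its `Merge` method are hypothetical names, and layers are modelled as plain lists of relative paths rather than real directories or archives.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WhiteoutSketch
{
    private const string WhiteoutPrefix = ".wh.";
    private const string OpaqueMarker = ".wh..wh..opq";

    // layers: base layer first; each layer lists its relative, '/'-separated file paths.
    public static SortedSet<string> Merge(IEnumerable<IEnumerable<string>> layers)
    {
        var visible = new SortedSet<string>(StringComparer.Ordinal);
        foreach (var layer in layers)
        {
            var entries = layer.ToList();

            // Pass 1: whiteouts hide content that came from lower layers only.
            foreach (var entry in entries)
            {
                var slash = entry.LastIndexOf('/');
                var dir = entry.Substring(0, slash + 1);   // "" or "path/to/"
                var name = entry.Substring(slash + 1);

                if (name == OpaqueMarker)
                {
                    // Opaque marker: drop everything below this directory.
                    visible.RemoveWhere(p => p.StartsWith(dir, StringComparison.Ordinal));
                }
                else if (name.StartsWith(WhiteoutPrefix, StringComparison.Ordinal))
                {
                    // Plain whiteout: drop the named file or directory subtree.
                    var target = dir + name.Substring(WhiteoutPrefix.Length);
                    visible.RemoveWhere(p => p == target ||
                        p.StartsWith(target + "/", StringComparison.Ordinal));
                }
            }

            // Pass 2: add this layer's real entries (marker files are never visible).
            foreach (var entry in entries)
            {
                var name = entry.Substring(entry.LastIndexOf('/') + 1);
                if (!name.StartsWith(WhiteoutPrefix, StringComparison.Ordinal)) visible.Add(entry);
            }
        }
        return visible;
    }
}
```

Applying whiteouts before adding a layer's own entries keeps files that the same layer re-creates under an opaque directory.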
Open follow-ups tracked for this wave:
- **SCANNER-ENTRYTRACE-18-507**: fallback candidate discovery (Docker history, `/etc/services/**`, `/usr/local/bin/*-entrypoint`) when ENTRYPOINT/CMD are empty.
- **SCANNER-ENTRYTRACE-18-508**: broaden the wrapper catalogue (package/tool runners such as `bundle exec`, `npm`, `yarn node`, `docker-php-entrypoint`, `pipenv`, `poetry run`).
- **ENTRYTRACE-SURFACE-01** (DOING) / **ENTRYTRACE-SURFACE-02** (TODO): finish wiring Surface.Validation/FS/Secrets to gate prerequisites and remove direct env/secret reads.
_Sections §4 to §7 below capture the long-term reduction design; features not yet implemented are explicitly noted in the task board._
### Probing the analyzer today
1. **Load the image config**
```csharp
using var stream = File.OpenRead("config.json");
var config = OciImageConfigLoader.Load(stream);
```
2. **Create a layered filesystem** from extracted layer directories or tar archives:
```csharp
var fs = LayeredRootFileSystem.FromArchives(layers);
```
3. **Build the image context** (normalises env, PATH, user, working dir):
```csharp
var imageCtx = EntryTraceImageContextFactory.Create(
    config, fs, new EntryTraceAnalyzerOptions(), imageDigest, scanId);
```
4. **Resolve the entry trace**:
```csharp
var analyzer = serviceProvider.GetRequiredService<IEntryTraceAnalyzer>();
var graph = await analyzer.ResolveAsync(imageCtx.Entrypoint, imageCtx.Context, cancellationToken);
```
5. **Inspect results**: `graph.Terminals` lists classified candidates (path, runtime, confidence, evidence), `graph.Nodes/Edges` capture the explainable chain, and `graph.Diagnostics` highlights unresolved steps. Emit metrics/telemetry via `EntryTraceMetrics`.
6. **Serialize if needed**: pass the graph through `EntryTraceNdjsonWriter.Serialize` to obtain deterministic NDJSON lines; the helper already computes capability summaries.
For ad-hoc investigation, snapshotting `EntryTraceResult` keeps graph and NDJSON aligned. Avoid ad-hoc JSON writers to maintain ordering guarantees.
#### Probing through Scanner.Worker
EntryTrace runs automatically inside the worker when these metadata keys exist on the lease:
| Key | Purpose |
| --- | --- |
| `ScanMetadataKeys.ImageConfigPath` (default `scanner.analyzers.entrytrace.configMetadataKey`) | Absolute path to the OCI `config.json`. |
| `ScanMetadataKeys.LayerDirectories` or `ScanMetadataKeys.LayerArchives` | Semicolon-delimited list of extracted layer folders or tar archives. |
| `ScanMetadataKeys.RuntimeProcRoot` *(optional)* | Path to a captured `/proc` tree for runtime reconciliation (air-gapped runs can mount a snapshot). |
Worker output lands in `context.Analysis` (`EntryTraceGraph`, `EntryTraceNdjson`) and is persisted via `IEntryTraceResultStore`. Ensure Surface Validation prerequisites pass before dispatching the analyzer.
#### Probing via WebService & CLI
- **REST**: `GET /api/scans/{scanId}/entrytrace` returns `EntryTraceResponse` (`graph + ndjson + metadata`). Requires scan ownership/authz.
- **CLI**: `stella scan entrytrace <scan-id> [--ndjson] [--verbose]` renders a confidence-sorted terminal table, diagnostics, and optionally the NDJSON payload.
Both surfaces consume the persisted result; rerunning the worker updates the stored document atomically.
### NDJSON reference
`EntryTraceNdjsonWriter.Serialize` emits newline-delimited JSON in the following order so AOC consumers can stream without buffering:
- `entrytrace.entry` — scan metadata (scan id, image digest, outcome, counts).
- `entrytrace.node` — every node in the graph with arguments, interpreter, evidence, and metadata.
- `entrytrace.edge` — directed relationships between nodes with optional wrapper metadata.
- `entrytrace.target` — resolved terminal programmes (`EntryTracePlan`), including runtime, confidence, arguments, environment, and evidence.
- `entrytrace.warning` — diagnostics (severity, reason, span, related path).
- `entrytrace.capability` — aggregated wrapper capabilities discovered during traversal.
Every line ends with a newline and is emitted in deterministic order (IDs ascending, keys lexicographically sorted) so downstream tooling can hash or diff outputs reproducibly.
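The ordering rules above can be illustrated with a small sketch. It is not the `EntryTraceNdjsonWriter` implementation: `NdjsonSketch`, the `Line` record, and the `id`/`type` field names are assumptions made for the example. It emits records grouped by type in the listed order, with IDs ascending and keys sorted ordinally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.Json;

public static class NdjsonSketch
{
    // Record types in the emission order described above.
    private static readonly string[] TypeOrder =
    {
        "entrytrace.entry", "entrytrace.node", "entrytrace.edge",
        "entrytrace.target", "entrytrace.warning", "entrytrace.capability",
    };

    // Hypothetical record shape: a type tag, a numeric id, and free-form fields.
    public sealed record Line(string Type, int Id, IDictionary<string, object> Fields);

    public static string Serialize(IEnumerable<Line> lines)
    {
        var sb = new StringBuilder();

        // Group by type order, then ascending id. Unknown types (index -1) sort first.
        var ordered = lines
            .OrderBy(l => Array.IndexOf(TypeOrder, l.Type))
            .ThenBy(l => l.Id);

        foreach (var line in ordered)
        {
            // An ordinal comparer keeps key order independent of the machine's culture.
            var payload = new SortedDictionary<string, object>(StringComparer.Ordinal)
            {
                ["id"] = line.Id,
                ["type"] = line.Type,
            };
            foreach (var (key, value) in line.Fields) payload[key] = value;

            // Every record ends with '\n', including the last one.
            sb.Append(JsonSerializer.Serialize(payload)).Append('\n');
        }
        return sb.ToString();
    }
}
```

Because both the record order and the key order are fixed, two runs over the same graph should produce byte-identical output, which is what makes hashing and diffing meaningful.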
## 1) Loading OCI images
### 1.1 Supported inputs
- Registry references (`repo:tag@sha256:digest`) using the existing content store.
- Local OCI/Docker v2 archives (`docker save` tarball, OCI layout directory with `index.json` + `blobs/sha256/*`).
### 1.2 Normalised model
@@ -53,14 +131,18 @@ Compose the runtime argv as `Entrypoint ++ Cmd`, honouring shell-form vs exec-fo
- For non-ELF/PE files: read first line; interpret `#!interpreter args`.
- Replace `argv[0]` with the interpreter, prepend shebang args, append script path per kernel semantics.
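A minimal sketch of the shebang rewrite follows, assuming Linux semantics, where everything after the interpreter path is passed as a single optional argument. `ShebangSketch.Rewrite` is a hypothetical helper, not part of the analyzer's API.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ShebangSketch
{
    // head: the first bytes of the file; argv: the original argument vector.
    // Returns the rewritten argv, or null when the file does not start with "#!".
    public static IReadOnlyList<string> Rewrite(byte[] head, string scriptPath, IReadOnlyList<string> argv)
    {
        if (head.Length < 2 || head[0] != (byte)'#' || head[1] != (byte)'!') return null;

        // Only the first line is relevant; Trim also drops a trailing '\r'.
        var end = Array.IndexOf(head, (byte)'\n');
        var line = Encoding.UTF8.GetString(head, 2, (end < 0 ? head.Length : end) - 2).Trim();
        if (line.Length == 0) return null;

        // Split once: the interpreter path, then the remainder as ONE optional argument.
        var split = line.IndexOfAny(new[] { ' ', '\t' });
        var interpreter = split < 0 ? line : line.Substring(0, split);
        var optionalArg = split < 0 ? "" : line.Substring(split + 1).Trim();

        var result = new List<string> { interpreter };
        if (optionalArg.Length > 0) result.Add(optionalArg);
        result.Add(scriptPath);                                    // the script replaces argv[0]
        for (var i = 1; i < argv.Count; i++) result.Add(argv[i]);  // keep the original arguments
        return result;
    }
}
```

For a script starting with `#!/usr/bin/env python3`, the interpreter becomes `/usr/bin/env` and `python3` is carried as its single argument, so a resolver would then need to treat `env` as a wrapper.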
### 3.3 Binary probes
- Identify ELF via magic `\x7FELF`, parse `.interp`, `.dynamic`, linked libs, `.note.go.buildid`, DWARF producer.
- Identify PE (Windows) and detect .NET single-file bundles via CLI header.
- Record features for runtime scoring (Go vs Rust vs glibc vs musl).
### 3.3 Binary probes
- Identify ELF via magic `\x7FELF`, parse `.interp`, `.dynamic`, linked libs, `.note.go.buildid`, DWARF producer, `.rustc` notes, and musl/glibc fingerprints.
- Identify PE (Windows) and detect .NET single-file bundles via CLI header / metadata tables; capture ready-to-run vs IL-only markers.
- Inspect archives (JAR/WAR/EAR) for `META-INF/MANIFEST.MF` `Main-Class`/`Main-Module` and signed entries.
- Detect PHP-FPM / nginx launchers (`php-fpm`, `apache2-foreground`, `nginx -g 'daemon off;'`) via binary names + nearby config (php.ini, nginx.conf).
- Record evidence tuples for runtime scoring (interpreter, build ID, runtime note) so downstream components can explain the classification.
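As a rough illustration of the first probing step, the sketch below classifies a file from its leading bytes only. `MagicSketch` and `BinaryKind` are hypothetical names. A real probe would go on to parse ELF sections, follow the PE header, or open the archive manifest, as the bullets above describe.

```csharp
using System;

public enum BinaryKind { Unknown, Elf, Pe, ZipArchive, Script }

public static class MagicSketch
{
    // head: the first few bytes of the candidate file.
    public static BinaryKind Classify(ReadOnlySpan<byte> head)
    {
        // ELF: 0x7F 'E' 'L' 'F'
        if (head.Length >= 4 && head[0] == 0x7F && head[1] == (byte)'E' &&
            head[2] == (byte)'L' && head[3] == (byte)'F')
            return BinaryKind.Elf;

        // ZIP local file header "PK\x03\x04"; JAR/WAR/EAR files are ZIP containers.
        if (head.Length >= 4 && head[0] == (byte)'P' && head[1] == (byte)'K' &&
            head[2] == 0x03 && head[3] == 0x04)
            return BinaryKind.ZipArchive;

        // "MZ" DOS stub; a full PE probe would follow e_lfanew to the "PE\0\0" signature.
        if (head.Length >= 2 && head[0] == (byte)'M' && head[1] == (byte)'Z')
            return BinaryKind.Pe;

        // Shebang script; handled by the interpreter resolution step.
        if (head.Length >= 2 && head[0] == (byte)'#' && head[1] == (byte)'!')
            return BinaryKind.Script;

        return BinaryKind.Unknown;
    }
}
```

Reading only a bounded prefix fits the offline-friendly, bounded-read approach described for `TryReadBytes`.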
## 4) Wrapper catalogue
> _Roadmap note_: extended package/tool runners land with **SCANNER-ENTRYTRACE-18-508**; today the catalogue covers init/user-switch/environment/supervisor wrappers listed above.
Collapse known wrappers before analysing the target command:
- Init shims: `tini`, `dumb-init`, `s6-svscan`, `runit`, `supervisord`.
- Privilege droppers: `gosu`, `su-exec`, `chpst`.
@@ -111,12 +193,31 @@ Use a simple logistic model with feature contributions captured for the evidence
Persist per-feature evidence strings so UI/CLI users can see **why** the scanner picked a given entry point.
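A logistic score with per-feature evidence might be sketched as follows. The feature names, weights, and the `ScoringSketch` type are hypothetical, not the scanner's real model. The 95-point cap mirrors the cap mentioned in the implementation snapshot.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class ScoringSketch
{
    // Hypothetical feature contribution: a name, a signed weight, and a readable reason.
    public sealed record Feature(string Name, double Weight, string Reason);

    public sealed record Score(double Confidence, IReadOnlyList<string> Evidence);

    public static Score Evaluate(IEnumerable<Feature> features, double bias = 0.0, double cap = 95.0)
    {
        var list = features.ToList();

        // Logistic model: sigmoid of the bias plus the summed feature weights.
        var z = bias + list.Sum(f => f.Weight);
        var probability = 1.0 / (1.0 + Math.Exp(-z));
        var confidence = Math.Min(cap, Math.Round(probability * 100.0, 1));

        // Strongest contributions first; the ordinal tie-break keeps output deterministic.
        var evidence = list
            .OrderByDescending(f => Math.Abs(f.Weight))
            .ThenBy(f => f.Name, StringComparer.Ordinal)
            .Select(f => f.Name + " (" +
                f.Weight.ToString("+0.0;-0.0", CultureInfo.InvariantCulture) + "): " + f.Reason)
            .ToList();

        return new Score(confidence, evidence);
    }
}
```

Keeping the evidence strings next to the numeric score is what lets a UI or CLI show why a candidate ranked where it did.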
## 8) Outputs
Return a populated `EntryTraceResult`:
- `Terminals` contains the best candidate(s) and their runtime classification.
- `Evidence` aggregates feature messages, ShellFlow reasoning, wrapper reductions, and runtime detector hints.
- `Chain` shows the explainable path from initial Docker argv to the final binary.
Static and dynamic reducers share this shape, enabling downstream modules to remain agnostic of the detection mode.
## 9) ProcGraph replay (runtime parity)
Static resolution must be reconciled with live observations when a workload is running under the StellaOps runtime agent:
1. Read `/proc/1/{cmdline,exe}` and walk descendants via `/proc/*/stat` to construct the initial exec chain (ascending PID order).
2. Collapse known wrappers (`tini`, `dumb-init`, `gosu`, `su-exec`, `s6-supervise`, `runsv`, `supervisord`) and privilege switches, mirroring the static wrapper catalogue.
3. Materialise a `ProcGraph` object that records each transition and the resolved executable path (via `/proc/<pid>/exe` symlinks).
4. Compare `ProcGraph.Terminal` with `EntryTraceResult.Terminals[0]`, emitting `confidence=high` when they match and downgrading the confidence when they diverge.
5. Persist the merged view so the CLI/UI can highlight static vs runtime discrepancies and feed drift detection in Zastava.
This replay is optional offline, but required when runtime evidence is available so policy decisions can lean on high-confidence matches.
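Steps 2 and 4 can be sketched as below, assuming the exec chain has already been read from `/proc` and is linear, with one executable path per process from PID 1 to the leaf. `ProcReplaySketch` and the `"downgraded"`/`"unknown"` labels are hypothetical. A real reconciler would also have to handle supervisors with several children.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ProcReplaySketch
{
    // Wrapper names from step 2, matched on the executable's file name only.
    private static readonly HashSet<string> Wrappers = new(StringComparer.Ordinal)
    {
        "tini", "dumb-init", "gosu", "su-exec", "s6-supervise", "runsv", "supervisord",
    };

    // execChain: resolved executable paths, ordered from PID 1 down to the leaf process.
    // Returns the last executable that is not a known wrapper, or null if none exists.
    public static string ResolveTerminal(IEnumerable<string> execChain) =>
        execChain.LastOrDefault(path => !Wrappers.Contains(Path.GetFileName(path)));

    // Compares the runtime terminal with the static prediction.
    public static string Reconcile(string staticTerminal, IEnumerable<string> execChain)
    {
        var runtimeTerminal = ResolveTerminal(execChain);
        if (runtimeTerminal == null) return "unknown";
        return string.Equals(runtimeTerminal, staticTerminal, StringComparison.Ordinal)
            ? "high"
            : "downgraded";
    }
}
```

Matching on the full resolved path rather than the process name avoids false agreement when two different binaries share a name.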
## 10) Service & CLI surfaces
- **Scanner.WebService** must expose `/scans/{scanId}/entrytrace` returning chain, terminal classification, evidence, and runtime agreement markers.
- **CLI** gains `stella scan entrypoint <scanId>` (and JSON streaming) for air-gapped review.
- **Policy / Export** payloads include `entrytrace_terminal`, `entrytrace_confidence`, and evidence arrays so downstream consumers retain provenance.
- All outputs reuse the same `EntryTraceResult` schema and NDJSON stream defined in §7, keeping the Offline Kit and DSSE attestations deterministic.