# Scanner Core Contracts
The **Scanner Core** library provides shared contracts, observability helpers, and security utilities consumed by `Scanner.WebService`, `Scanner.Worker`, analyzers, and tooling. These primitives guarantee deterministic identifiers, timestamps, and log context for all scanning flows.
## Canonical DTOs
- `ScanJob` & `ScanJobStatus`: canonical job metadata (image reference/digest, tenant, correlation ID, timestamps, failure details). Constructors normalise timestamps to UTC at microsecond precision and canonicalise image digests. Both types round-trip through `ScannerJsonOptions`, which builds on `JsonSerializerDefaults.Web`.
- `ScanProgressEvent` & `ScanStage`/`ScanProgressEventKind`: stage-level progress surface for queue/stream consumers. Includes deterministic sequence numbers, an optional progress percentage, attributes, and an optional attached `ScannerError`.
- `ScannerError` & `ScannerErrorCode`: shared error taxonomy spanning queue, analyzers, storage, exporters, and signing. Carries severity, retryability, structured details, and microsecond-precision timestamps.
- `ScanJobId`: strongly-typed identifier rendered as a `Guid` (lowercase `N` format) with deterministic parsing.
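To make the `N` format concrete, the following is a minimal sketch using only the BCL `Guid` type. The `Render` and `TryParse` helpers are hypothetical stand-ins; the actual members of `ScanJobId` are not shown in this document.

```csharp
using System;

// Hypothetical stand-ins for ScanJobId rendering/parsing (not the library's API).
// Guid "N" format: 32 lowercase hex digits, no hyphens or braces.
static string Render(Guid value) => value.ToString("N");

// Accepts only the "N" shape; hex digits are parsed case-insensitively.
static bool TryParse(string? text, out Guid value)
    => Guid.TryParseExact(text?.Trim(), "N", out value);

var id = Guid.Parse("8f4cc9c5-8245-4b9d-9b4f-5ae049631b7d");
Console.WriteLine(Render(id)); // 8f4cc9c582454b9d9b4f5ae049631b7d

// Upper-case input parses to the same value and re-renders in lowercase.
var ok = TryParse("8F4CC9C582454B9D9B4F5AE049631B7D", out var parsed);
Console.WriteLine(ok && parsed == id); // True

// The hyphenated "D" shape is rejected by the exact parser.
Console.WriteLine(TryParse("8f4cc9c5-8245-4b9d-9b4f-5ae049631b7d", out _)); // False
```

Parsing with an exact format keeps the accepted input space small, which is one way to get the deterministic parsing described above.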
### Canonical JSON samples
The golden fixtures consumed by `ScannerCoreContractsTests` document the wire shape shared with downstream services. They live under `src/StellaOps.Scanner.Core.Tests/Fixtures/` and a representative extract is shown below.
```json
{
  "id": "8f4cc9c582454b9d9b4f5ae049631b7d",
  "status": "running",
  "imageReference": "registry.example.com/stellaops/scanner:1.2.3",
  "imageDigest": "sha256:abcdef",
  "createdAt": "2025-10-18T14:30:15.123456+00:00",
  "updatedAt": "2025-10-18T14:30:20.123456+00:00",
  "correlationId": "scan-analyzeoperatingsystem-8f4cc9c582454b9d9b4f5ae049631b7d",
  "tenantId": "tenant-a",
  "metadata": {
    "requestId": "req-1234",
    "source": "ci"
  },
  "failure": {
    "code": "analyzerFailure",
    "severity": "error",
    "message": "Analyzer failed to parse layer",
    "timestamp": "2025-10-18T14:30:15.123456+00:00",
    "retryable": false,
    "stage": "AnalyzeOperatingSystem",
    "component": "os-analyzer",
    "details": {
      "layerDigest": "sha256:deadbeef",
      "attempt": "1"
    }
  }
}
```
Progress events follow the same conventions (`jobId`, `stage`, `kind`, `timestamp`, `attributes`, optional embedded `ScannerError`). The fixtures are verified via deterministic JSON comparison in every CI run.
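A fixture comparison of this kind could look roughly like the sketch below. It assumes options equivalent to `ScannerJsonOptions` (web defaults plus camel-case enum strings) and compares parsed JSON trees instead of raw text. `SampleJob`, `SampleStatus`, and the inline fixture are hypothetical; the real DTOs and fixture files are not reproduced here. `JsonNode.DeepEquals` requires .NET 8 or later.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Nodes;
using System.Text.Json.Serialization;

// Assumed equivalent of ScannerJsonOptions: web defaults + camel-case enum strings.
var options = new JsonSerializerOptions(JsonSerializerDefaults.Web);
options.Converters.Add(new JsonStringEnumConverter(JsonNamingPolicy.CamelCase));

var job = new SampleJob(
    "8f4cc9c582454b9d9b4f5ae049631b7d",
    SampleStatus.Running,
    // 1_234_560 ticks = 123,456 microseconds past the whole second.
    new DateTimeOffset(2025, 10, 18, 14, 30, 15, TimeSpan.Zero).AddTicks(1_234_560));

// Hypothetical fixture text standing in for a golden file.
const string fixture = """
    {
      "id": "8f4cc9c582454b9d9b4f5ae049631b7d",
      "status": "running",
      "createdAt": "2025-10-18T14:30:15.123456+00:00"
    }
    """;

// Compare parsed trees so whitespace differences do not matter.
var actual = JsonNode.Parse(JsonSerializer.Serialize(job, options));
var expected = JsonNode.Parse(fixture);
var matches = JsonNode.DeepEquals(actual, expected);
Console.WriteLine(matches);

// Round-trip: the fixture deserialises back to an equal record value.
var roundTripped = JsonSerializer.Deserialize<SampleJob>(fixture, options);
var equal = roundTripped == job;
Console.WriteLine(equal);

// Hypothetical types for illustration only.
enum SampleStatus { Pending, Running, Succeeded, Failed }
record SampleJob(string Id, SampleStatus Status, DateTimeOffset CreatedAt);
```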
## Deterministic helpers
- `ScannerIdentifiers`: derives `ScanJobId`, correlation IDs, and SHA-256 hashes from normalised inputs (image reference/digest, tenant, salt). Ensures case-insensitive stability and reproducible metric keys.
- `ScannerTimestamps`: trims to microsecond precision and provides ISO-8601 (`yyyy-MM-ddTHH:mm:ss.ffffffZ`) rendering and parsing helpers.
- `ScannerJsonOptions`: standard JSON options (web defaults, camel-case enums) shared by services and tests.
- `ScanAnalysisStore` & `ScanAnalysisKeys`: shared in-memory analysis cache flowing through Worker stages. OS analyzers populate `analysis.os.packages` (raw output) and `analysis.os.fragments` (per-analyzer component fragments), then merge into `analysis.layers.fragments` so emit/diff stages can compose SBOMs and diffs without knowledge of individual analyzer implementations.
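The determinism properties of the identifier and timestamp helpers can be illustrated with a small BCL-only sketch. The normalisation rule (trim, lower-case, join with a newline) and the helper names `StableHash`, `TrimToMicroseconds`, and `Render` are assumptions made for illustration; the actual derivation inside `ScannerIdentifiers` and `ScannerTimestamps` may differ.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

// Hypothetical normalisation: trim and lower-case each part, join with '\n',
// then hash. Emits lowercase hex.
static string StableHash(params string[] parts)
{
    var normalised = string.Join('\n', Array.ConvertAll(parts, static p => p.Trim().ToLowerInvariant()));
    return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(normalised))).ToLowerInvariant();
}

// One tick is 100 ns, so dropping ticks modulo 10 leaves whole microseconds.
static DateTimeOffset TrimToMicroseconds(DateTimeOffset value)
{
    var utc = value.ToUniversalTime();
    return utc.AddTicks(-(utc.Ticks % 10));
}

// Renders the pattern quoted above: six fractional digits and a literal Z.
static string Render(DateTimeOffset value)
    => TrimToMicroseconds(value).ToString("yyyy-MM-dd'T'HH:mm:ss.ffffff'Z'", CultureInfo.InvariantCulture);

var a = StableHash("Registry.Example.com/StellaOps/Scanner:1.2.3", "sha256:ABCDEF", "tenant-a");
var b = StableHash("registry.example.com/stellaops/scanner:1.2.3", "sha256:abcdef", "Tenant-A");
Console.WriteLine(a == b); // True: inputs differ only by case

var local = new DateTimeOffset(2025, 10, 18, 16, 30, 15, TimeSpan.FromHours(2)).AddTicks(1_234_567);
Console.WriteLine(Render(local)); // 2025-10-18T14:30:15.123456Z
```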
## Observability primitives
- `ScannerDiagnostics`: global `ActivitySource`/`Meter` for scanner components. `StartActivity` seeds deterministic tags (`job_id`, `stage`, `component`, `correlation_id`).
- `ScannerMetricNames`: centralises metric prefixes (`stellaops.scanner.*`) and deterministic job/event tag builders.
- `ScannerCorrelationContext` & `ScannerCorrelationContextAccessor`: ambient correlation propagation via `AsyncLocal` for log scopes, metrics, and diagnostics.
- `ScannerLogExtensions`: `ILogger` scopes for jobs and progress events with automatic correlation context push, minimal allocations, and consistent structured fields.
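As an illustration of ambient propagation via `AsyncLocal`, here is a minimal push/pop sketch. `CorrelationSketch` is a hypothetical type written for this example; it is not the API of `ScannerCorrelationContextAccessor`, whose members are not shown in this document.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

var before = CorrelationSketch.Current;
string? inside;
using (CorrelationSketch.Push("scan-demo-0001"))
{
    // The ambient value flows into work started inside the scope.
    inside = await Task.Run(() => CorrelationSketch.Current);
}
var after = CorrelationSketch.Current;

Console.WriteLine(before ?? "(none)");
Console.WriteLine(inside);
Console.WriteLine(after ?? "(none)");

// Hypothetical accessor for illustration only.
static class CorrelationSketch
{
    private static readonly AsyncLocal<string?> Slot = new();

    public static string? Current => Slot.Value;

    // Sets the ambient value and returns a handle that restores the previous one.
    public static IDisposable Push(string correlationId)
    {
        var previous = Slot.Value;
        Slot.Value = correlationId;
        return new Restore(previous);
    }

    private sealed class Restore : IDisposable
    {
        private readonly string? _previous;
        public Restore(string? previous) => _previous = previous;
        public void Dispose() => Slot.Value = _previous;
    }
}
```

Restoring the previous value on dispose, instead of clearing it, lets scopes nest without losing the outer correlation ID.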
### Observability overhead validation
A micro-benchmark executed on 2025-10-19 (4 vCPU runner, .NET 10.0.100-rc.1) measured the average scope cost across 1,000,000 iterations:
| Scope | Mean (µs/call) |
|-------|----------------|
| `BeginScanScope` (logger attached) | 0.80 |
| `BeginScanScope` (noop logger) | 0.31 |
| `BeginProgressScope` | 0.57 |
To reproduce, run `dotnet test src/StellaOps.Scanner.Core.Tests -c Release` (see `ScannerLogExtensionsPerformanceTests`) or copy the snippet below into a throwaway `dotnet run` console project and execute it with `dotnet run -c Release`:
```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using Microsoft.Extensions.Logging;
using StellaOps.Scanner.Core.Contracts;
using StellaOps.Scanner.Core.Observability;
using StellaOps.Scanner.Core.Utility;

// No logging providers are registered; the filter accepts every level.
var factory = LoggerFactory.Create(builder => builder.AddFilter(static _ => true));
var logger = factory.CreateLogger("bench");

// Fixed inputs so every run measures the same job and progress event.
var jobId = ScannerIdentifiers.CreateJobId("registry.example.com/stellaops/scanner:1.2.3", "sha256:abcdef", "tenant-a", "benchmark");
var correlationId = ScannerIdentifiers.CreateCorrelationId(jobId, nameof(ScanStage.AnalyzeOperatingSystem));
var now = ScannerTimestamps.Normalize(new DateTimeOffset(2025, 10, 19, 12, 0, 0, TimeSpan.Zero));
var job = new ScanJob(
    jobId, ScanJobStatus.Running, "registry.example.com/stellaops/scanner:1.2.3", "sha256:abcdef",
    now, now, correlationId, "tenant-a",
    new Dictionary<string, string>(StringComparer.Ordinal) { ["requestId"] = "req-bench" });
var progress = new ScanProgressEvent(
    jobId, ScanStage.AnalyzeOperatingSystem, ScanProgressEventKind.Progress, 42, now, 10.5, "benchmark",
    new Dictionary<string, string>(StringComparer.Ordinal) { ["sample"] = "true" });

Console.WriteLine("Scanner Core Observability micro-bench (1,000,000 iterations)");
Report("BeginScanScope (logger)", Measure(
    static ctx => ctx.Logger.BeginScanScope(ctx.Job, ctx.Stage, ctx.Component),
    new ScopeContext(logger, job, nameof(ScanStage.AnalyzeOperatingSystem), "os-analyzer")));
Report("BeginScanScope (no logger)", Measure(
    static ctx => ScannerLogExtensions.BeginScanScope(null, ctx.Job, ctx.Stage, ctx.Component),
    new ScopeContext(logger, job, nameof(ScanStage.AnalyzeOperatingSystem), "os-analyzer")));
Report("BeginProgressScope", Measure(
    static ctx => ctx.Logger.BeginProgressScope(ctx.Progress!, ctx.Component),
    new ScopeContext(logger, job, nameof(ScanStage.AnalyzeOperatingSystem), "os-analyzer", progress)));

// Returns the mean cost of creating and disposing one scope, in microseconds.
static double Measure(Func<ScopeContext, IDisposable> factory, ScopeContext context)
{
    const int iterations = 1_000_000;

    // Warm-up so JIT compilation is excluded from the timed loop.
    for (var i = 0; i < 10_000; i++)
    {
        using var scope = factory(context);
    }

    // Start the timed loop from a settled heap.
    GC.Collect();
    GC.WaitForPendingFinalizers();
    GC.Collect();

    var sw = Stopwatch.StartNew();
    for (var i = 0; i < iterations; i++)
    {
        using var scope = factory(context);
    }
    sw.Stop();

    return sw.Elapsed.TotalSeconds * 1_000_000 / iterations;
}

static void Report(string label, double microseconds)
    => Console.WriteLine($"{label,-28}: {microseconds:F3} µs");

readonly record struct ScopeContext(ILogger Logger, ScanJob Job, string? Stage, string? Component, ScanProgressEvent? Progress = null);
```
Both guardrails enforce the ≤ 5 µs acceptance target for SP9-G1.
## Security utilities
- `AuthorityTokenSource`: caches short-lived OpToks per audience+scope using deterministic keys and a refresh skew (default 30 s). Integrates with `StellaOps.Auth.Client`.
- `DpopProofValidator`: validates DPoP proofs (alg allowlist, `htm`/`htu`, nonce, replay window, signature), backed by a pluggable `IDpopReplayCache`. Ships with `InMemoryDpopReplayCache` for restart-only deployments.
- `RestartOnlyPluginGuard`: enforces restart-time plug-in registration (deterministic path normalisation; throws if new plug-ins are added after sealing).
- `ServiceCollectionExtensions.AddScannerAuthorityCore`: DI helper wiring the Authority client, OpTok source, DPoP validation, replay cache, and plug-in guard.
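The caching behaviour described for `AuthorityTokenSource` (deterministic keys plus a refresh skew) can be sketched as follows. Everything here is hypothetical: the key format, the two-minute token lifetime, the fake `token-N` values, and the `GetToken` and `CacheKey` helpers are assumptions for illustration, not the library's implementation.

```csharp
using System;
using System.Collections.Generic;

var skew = TimeSpan.FromSeconds(30); // matches the documented default
var cache = new Dictionary<string, (string Token, DateTimeOffset ExpiresAt)>(StringComparer.Ordinal);
var fetches = 0;

// Hypothetical key format: audience plus sorted, de-duplicated scopes,
// so scope order does not create separate cache entries.
static string CacheKey(string audience, IEnumerable<string> scopes)
    => audience + "|" + string.Join(" ", new SortedSet<string>(scopes, StringComparer.Ordinal));

// Returns the cached token unless it expires within the skew window.
string GetToken(string audience, string[] scopes, DateTimeOffset now)
{
    var key = CacheKey(audience, scopes);
    if (cache.TryGetValue(key, out var entry) && now < entry.ExpiresAt - skew)
    {
        return entry.Token;
    }

    fetches++; // stands in for a call to the token endpoint
    var fresh = (Token: $"token-{fetches}", ExpiresAt: now.AddMinutes(2));
    cache[key] = fresh;
    return fresh.Token;
}

var t0 = new DateTimeOffset(2025, 10, 19, 12, 0, 0, TimeSpan.Zero);
var first = GetToken("scanner", new[] { "scan:write", "scan:read" }, t0);
var second = GetToken("scanner", new[] { "scan:read", "scan:write" }, t0.AddSeconds(60));
var third = GetToken("scanner", new[] { "scan:read", "scan:write" }, t0.AddSeconds(100));

Console.WriteLine(first);  // token-1
Console.WriteLine(second); // token-1 (60 s is before expiry minus skew at 90 s)
Console.WriteLine(third);  // token-2 (100 s falls inside the skew window)
```

Passing the clock in as a parameter keeps the sketch deterministic, which is also a convenient shape for testing skew and invalidation.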
## Testing guarantees
Unit tests (`StellaOps.Scanner.Core.Tests`) assert:
- DTO JSON round-trips are stable and deterministic (`ScannerCoreContractsTests` + golden fixtures).
- Identifier/hash helpers ignore case and emit lowercase hex.
- Timestamp normalisation retains UTC semantics.
- Log scopes push/pop correlation context predictably while staying under the 5 µs envelope.
- Authority token caching honours refresh skew and invalidation.
- DPoP validator accepts valid proofs, rejects nonce mismatch/replay, and enforces signature validation.
- Restart-only plug-in guard blocks runtime additions post-seal.
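The seal behaviour asserted in the last item can be pictured with a short sketch. This is not the `RestartOnlyPluginGuard` implementation: the `Register` and `Seal` helpers, the case-insensitive comparer, and the use of `Path.GetFullPath` for normalisation are all assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Assumption: paths are compared case-insensitively after full-path normalisation.
var known = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
var isSealed = false;

// Registers a plug-in path; after sealing, only already-known paths are accepted.
void Register(string pluginPath)
{
    var normalised = Path.GetFullPath(pluginPath);
    if (isSealed && !known.Contains(normalised))
    {
        throw new InvalidOperationException($"Plug-in '{normalised}' was added after sealing.");
    }
    known.Add(normalised);
}

void Seal() => isSealed = true;

Register(Path.Combine("plugins", "os-analyzer.dll"));
Seal();

// Same file via a different spelling: normalises to a known path, so no throw.
Register(Path.Combine("plugins", "..", "plugins", "os-analyzer.dll"));

var blocked = false;
try
{
    Register(Path.Combine("plugins", "late-arrival.dll"));
}
catch (InvalidOperationException)
{
    blocked = true; // new plug-in rejected post-seal
}
Console.WriteLine(blocked); // True
```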