Refactor and enhance scanner worker functionality

- Cleaned up code formatting and organization across multiple files for improved readability.
- Introduced `OsScanAnalyzerDispatcher` to handle OS analyzer execution and plugin loading.
- Updated `ScanJobContext` to include an `Analysis` property for storing scan results.
- Enhanced `ScanJobProcessor` to utilize the new `OsScanAnalyzerDispatcher`.
- Improved logging and error handling in `ScanProgressReporter` for better traceability.
- Updated project dependencies and added references to new analyzer plugins.
- Revised task documentation to reflect current status and dependencies.
2025-10-19 18:34:15 +03:00
parent daa6a4ae8c
commit 7e2fa0a42a
59 changed files with 5563 additions and 2288 deletions


@@ -1,26 +1,26 @@
# AGENTS
## Role
Scanner.Worker engineers own the queue-driven execution host that turns scan jobs into SBOM artefacts with deterministic progress reporting.
## Scope
- Host bootstrap: configuration binding, Authority client wiring, graceful shutdown, restart-time plug-in discovery hooks.
- Job acquisition & lease renewal semantics backed by the Scanner queue abstraction.
- Analyzer orchestration skeleton: stage pipeline, cancellation awareness, deterministic progress emissions.
- Telemetry: structured logging, OpenTelemetry metrics/traces, health counters for offline diagnostics.
## Participants
- Consumes jobs from `StellaOps.Scanner.Queue`.
- Persists progress/artifacts via `StellaOps.Scanner.Storage` once those modules land.
- Emits metrics and structured logs consumed by Observability stack & WebService status endpoints.
## Interfaces & contracts
- Queue lease abstraction (`IScanJobLease`, `IScanJobSource`) with deterministic identifiers and attempt counters.
- Analyzer dispatcher contracts for OS/lang/native analyzers and emitters.
- Telemetry resource attributes shared with Scanner.WebService and Scheduler.
## In/Out of scope
In scope: worker host, concurrency orchestration, lease renewal, cancellation wiring, deterministic logging/metrics.
Out of scope: queue provider implementations, analyzer business logic, Mongo/object-store repositories.
## Observability expectations
- Meter `StellaOps.Scanner.Worker` with queue latency, stage duration, failure counters.
- Activity source `StellaOps.Scanner.Worker.Job` for per-job tracing.
- Log correlation IDs (`jobId`, `leaseId`, `scanId`) with structured payloads; avoid dumping secrets or full manifests.
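A minimal sketch of how the meter and activity source named above could wrap a single stage, assuming only the standard `System.Diagnostics` APIs. The class `ObservabilitySketch` and its `RunStage` method are hypothetical and exist only for illustration; the worker's real instrumentation lives in its own diagnostics types.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;

// Hypothetical helper: not part of the worker, shown only to illustrate the expectations above.
public static class ObservabilitySketch
{
    private static readonly ActivitySource JobActivitySource = new("StellaOps.Scanner.Worker.Job");
    private static readonly Meter WorkerMeter = new("StellaOps.Scanner.Worker");

    private static readonly Histogram<double> StageDurationMs =
        WorkerMeter.CreateHistogram<double>("scanner_worker_stage_duration_ms", unit: "ms");

    public static void RunStage(string jobId, string scanId, string stage, Action body)
    {
        // StartActivity returns null when no listener is attached, hence the null-conditional calls.
        using var activity = JobActivitySource.StartActivity(stage);
        activity?.SetTag("job.id", jobId);
        activity?.SetTag("scan.id", scanId);

        var stopwatch = Stopwatch.StartNew();
        try
        {
            body();
        }
        finally
        {
            // Record the stage duration even when the stage throws.
            StageDurationMs.Record(
                stopwatch.Elapsed.TotalMilliseconds,
                new KeyValuePair<string, object?>("job.id", jobId),
                new KeyValuePair<string, object?>("stage", stage));
        }
    }
}
```

Only correlation identifiers are attached as tags here, in line with the guidance to keep secrets and full manifests out of telemetry.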
## Tests
- Integration fixture `WorkerBasicScanScenario` verifying acquisition → heartbeat → analyzer stages → completion.
- Unit tests around retry/jitter calculators as they are introduced.
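
The retry/jitter calculators mentioned above are not shown in this file. As a rough sketch of what such a calculator might look like, assuming exponential backoff between an initial and a maximum delay with a symmetric jitter ratio: the type `PollDelaySketch` and its members are illustrative names, not the worker's actual implementation.

```csharp
using System;

// Hypothetical sketch of a poll-delay calculator: exponential backoff with bounded jitter.
public sealed class PollDelaySketch
{
    private readonly TimeSpan _initialDelay;
    private readonly TimeSpan _maxDelay;
    private readonly double _jitterRatio;
    private readonly Random _random;
    private TimeSpan _current;

    public PollDelaySketch(TimeSpan initialDelay, TimeSpan maxDelay, double jitterRatio, Random? random = null)
    {
        if (initialDelay <= TimeSpan.Zero)
        {
            throw new ArgumentOutOfRangeException(nameof(initialDelay));
        }

        if (maxDelay < initialDelay)
        {
            throw new ArgumentOutOfRangeException(nameof(maxDelay));
        }

        _initialDelay = initialDelay;
        _maxDelay = maxDelay;
        _jitterRatio = Math.Clamp(jitterRatio, 0.0, 1.0);
        _random = random ?? new Random();
        _current = initialDelay;
    }

    public TimeSpan NextDelay()
    {
        var baseDelay = _current;

        // Double the base delay for the next call, capped at the maximum.
        _current = TimeSpan.FromTicks(Math.Min(_current.Ticks * 2, _maxDelay.Ticks));

        // Apply a random offset in [-ratio, +ratio] around the base delay.
        var offset = ((_random.NextDouble() * 2.0) - 1.0) * _jitterRatio;
        var jitteredTicks = baseDelay.Ticks * (1.0 + offset);
        return TimeSpan.FromTicks((long)Math.Max(jitteredTicks, 1.0));
    }

    // Called after a successful acquisition so the next idle period starts from the initial delay.
    public void Reset() => _current = _initialDelay;
}
```

Passing a seeded `Random` (or a jitter ratio of `0`) makes the sequence deterministic, which is what lets unit tests assert exact delays.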


@@ -1,15 +1,15 @@
using System.Diagnostics;
using System.Diagnostics.Metrics;

namespace StellaOps.Scanner.Worker.Diagnostics;

public static class ScannerWorkerInstrumentation
{
    public const string ActivitySourceName = "StellaOps.Scanner.Worker.Job";

    public const string MeterName = "StellaOps.Scanner.Worker";

    public static ActivitySource ActivitySource { get; } = new(ActivitySourceName);

    public static Meter Meter { get; } = new(MeterName, version: "1.0.0");
}


@@ -1,109 +1,109 @@
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;
using StellaOps.Scanner.Worker.Processing;

namespace StellaOps.Scanner.Worker.Diagnostics;

public sealed class ScannerWorkerMetrics
{
    private readonly Histogram<double> _queueLatencyMs;
    private readonly Histogram<double> _jobDurationMs;
    private readonly Histogram<double> _stageDurationMs;
    private readonly Counter<long> _jobsCompleted;
    private readonly Counter<long> _jobsFailed;

    public ScannerWorkerMetrics()
    {
        _queueLatencyMs = ScannerWorkerInstrumentation.Meter.CreateHistogram<double>(
            "scanner_worker_queue_latency_ms",
            unit: "ms",
            description: "Time from job enqueue to lease acquisition.");
        _jobDurationMs = ScannerWorkerInstrumentation.Meter.CreateHistogram<double>(
            "scanner_worker_job_duration_ms",
            unit: "ms",
            description: "Total processing duration per job.");
        _stageDurationMs = ScannerWorkerInstrumentation.Meter.CreateHistogram<double>(
            "scanner_worker_stage_duration_ms",
            unit: "ms",
            description: "Stage execution duration per job.");
        _jobsCompleted = ScannerWorkerInstrumentation.Meter.CreateCounter<long>(
            "scanner_worker_jobs_completed_total",
            description: "Number of successfully completed scan jobs.");
        _jobsFailed = ScannerWorkerInstrumentation.Meter.CreateCounter<long>(
            "scanner_worker_jobs_failed_total",
            description: "Number of scan jobs that failed permanently.");
    }

    public void RecordQueueLatency(ScanJobContext context, TimeSpan latency)
    {
        if (latency <= TimeSpan.Zero)
        {
            return;
        }

        _queueLatencyMs.Record(latency.TotalMilliseconds, CreateTags(context));
    }

    public void RecordJobDuration(ScanJobContext context, TimeSpan duration)
    {
        if (duration <= TimeSpan.Zero)
        {
            return;
        }

        _jobDurationMs.Record(duration.TotalMilliseconds, CreateTags(context));
    }

    public void RecordStageDuration(ScanJobContext context, string stage, TimeSpan duration)
    {
        if (duration <= TimeSpan.Zero)
        {
            return;
        }

        _stageDurationMs.Record(duration.TotalMilliseconds, CreateTags(context, stage: stage));
    }

    public void IncrementJobCompleted(ScanJobContext context)
    {
        _jobsCompleted.Add(1, CreateTags(context));
    }

    public void IncrementJobFailed(ScanJobContext context, string failureReason)
    {
        _jobsFailed.Add(1, CreateTags(context, failureReason: failureReason));
    }

    private static KeyValuePair<string, object?>[] CreateTags(ScanJobContext context, string? stage = null, string? failureReason = null)
    {
        var tags = new List<KeyValuePair<string, object?>>(stage is null ? 5 : 6)
        {
            new("job.id", context.JobId),
            new("scan.id", context.ScanId),
            new("attempt", context.Lease.Attempt),
        };

        if (context.Lease.Metadata.TryGetValue("queue", out var queueName) && !string.IsNullOrWhiteSpace(queueName))
        {
            tags.Add(new KeyValuePair<string, object?>("queue", queueName));
        }

        if (context.Lease.Metadata.TryGetValue("job.kind", out var jobKind) && !string.IsNullOrWhiteSpace(jobKind))
        {
            tags.Add(new KeyValuePair<string, object?>("job.kind", jobKind));
        }

        if (!string.IsNullOrWhiteSpace(stage))
        {
            tags.Add(new KeyValuePair<string, object?>("stage", stage));
        }

        if (!string.IsNullOrWhiteSpace(failureReason))
        {
            tags.Add(new KeyValuePair<string, object?>("reason", failureReason));
        }

        return tags.ToArray();
    }
}


@@ -1,102 +1,102 @@
using System;
using System.Collections.Generic;
using System.Reflection;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;
using StellaOps.Scanner.Worker.Options;

namespace StellaOps.Scanner.Worker.Diagnostics;

public static class TelemetryExtensions
{
    public static void ConfigureScannerWorkerTelemetry(this IHostApplicationBuilder builder, ScannerWorkerOptions options)
    {
        ArgumentNullException.ThrowIfNull(builder);
        ArgumentNullException.ThrowIfNull(options);

        var telemetry = options.Telemetry;
        if (!telemetry.EnableTelemetry)
        {
            return;
        }

        var openTelemetry = builder.Services.AddOpenTelemetry();

        openTelemetry.ConfigureResource(resource =>
        {
            var version = Assembly.GetExecutingAssembly().GetName().Version?.ToString() ?? "unknown";
            resource.AddService(telemetry.ServiceName, serviceVersion: version, serviceInstanceId: Environment.MachineName);
            resource.AddAttributes(new[]
            {
                new KeyValuePair<string, object>("deployment.environment", builder.Environment.EnvironmentName),
            });

            foreach (var kvp in telemetry.ResourceAttributes)
            {
                if (string.IsNullOrWhiteSpace(kvp.Key) || kvp.Value is null)
                {
                    continue;
                }

                resource.AddAttributes(new[] { new KeyValuePair<string, object>(kvp.Key, kvp.Value) });
            }
        });

        if (telemetry.EnableTracing)
        {
            openTelemetry.WithTracing(tracing =>
            {
                tracing.AddSource(ScannerWorkerInstrumentation.ActivitySourceName);
                ConfigureExporter(tracing, telemetry);
            });
        }

        if (telemetry.EnableMetrics)
        {
            openTelemetry.WithMetrics(metrics =>
            {
                metrics
                    .AddMeter(ScannerWorkerInstrumentation.MeterName)
                    .AddRuntimeInstrumentation()
                    .AddProcessInstrumentation();
                ConfigureExporter(metrics, telemetry);
            });
        }
    }

    private static void ConfigureExporter(TracerProviderBuilder tracing, ScannerWorkerOptions.TelemetryOptions telemetry)
    {
        if (!string.IsNullOrWhiteSpace(telemetry.OtlpEndpoint))
        {
            tracing.AddOtlpExporter(options =>
            {
                options.Endpoint = new Uri(telemetry.OtlpEndpoint);
            });
        }

        if (telemetry.ExportConsole || string.IsNullOrWhiteSpace(telemetry.OtlpEndpoint))
        {
            tracing.AddConsoleExporter();
        }
    }

    private static void ConfigureExporter(MeterProviderBuilder metrics, ScannerWorkerOptions.TelemetryOptions telemetry)
    {
        if (!string.IsNullOrWhiteSpace(telemetry.OtlpEndpoint))
        {
            metrics.AddOtlpExporter(options =>
            {
                options.Endpoint = new Uri(telemetry.OtlpEndpoint);
            });
        }

        if (telemetry.ExportConsole || string.IsNullOrWhiteSpace(telemetry.OtlpEndpoint))
        {
            metrics.AddConsoleExporter();
        }
    }
}


@@ -1,201 +1,202 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using StellaOps.Scanner.Worker.Diagnostics;
using StellaOps.Scanner.Worker.Options;
using StellaOps.Scanner.Worker.Processing;

namespace StellaOps.Scanner.Worker.Hosting;

public sealed partial class ScannerWorkerHostedService : BackgroundService
{
    private readonly IScanJobSource _jobSource;
    private readonly ScanJobProcessor _processor;
    private readonly LeaseHeartbeatService _heartbeatService;
    private readonly ScannerWorkerMetrics _metrics;
    private readonly TimeProvider _timeProvider;
    private readonly IOptionsMonitor<ScannerWorkerOptions> _options;
    private readonly ILogger<ScannerWorkerHostedService> _logger;
    private readonly IDelayScheduler _delayScheduler;

    public ScannerWorkerHostedService(
        IScanJobSource jobSource,
        ScanJobProcessor processor,
        LeaseHeartbeatService heartbeatService,
        ScannerWorkerMetrics metrics,
        TimeProvider timeProvider,
        IDelayScheduler delayScheduler,
        IOptionsMonitor<ScannerWorkerOptions> options,
        ILogger<ScannerWorkerHostedService> logger)
    {
        _jobSource = jobSource ?? throw new ArgumentNullException(nameof(jobSource));
        _processor = processor ?? throw new ArgumentNullException(nameof(processor));
        _heartbeatService = heartbeatService ?? throw new ArgumentNullException(nameof(heartbeatService));
        _metrics = metrics ?? throw new ArgumentNullException(nameof(metrics));
        _timeProvider = timeProvider ?? throw new ArgumentNullException(nameof(timeProvider));
        _delayScheduler = delayScheduler ?? throw new ArgumentNullException(nameof(delayScheduler));
        _options = options ?? throw new ArgumentNullException(nameof(options));
        _logger = logger ?? throw new ArgumentNullException(nameof(logger));
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        var runningJobs = new HashSet<Task>();
        var delayStrategy = new PollDelayStrategy(_options.CurrentValue.Polling);

        WorkerStarted(_logger);

        while (!stoppingToken.IsCancellationRequested)
        {
            runningJobs.RemoveWhere(static task => task.IsCompleted);

            var options = _options.CurrentValue;
            if (runningJobs.Count >= options.MaxConcurrentJobs)
            {
                var completed = await Task.WhenAny(runningJobs).ConfigureAwait(false);
                runningJobs.Remove(completed);
                continue;
            }

            IScanJobLease? lease = null;
            try
            {
                lease = await _jobSource.TryAcquireAsync(stoppingToken).ConfigureAwait(false);
            }
            catch (OperationCanceledException) when (stoppingToken.IsCancellationRequested)
            {
                break;
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Scanner worker failed to acquire job lease; backing off.");
            }

            if (lease is null)
            {
                var delay = delayStrategy.NextDelay();
                await _delayScheduler.DelayAsync(delay, stoppingToken).ConfigureAwait(false);
                continue;
            }

            delayStrategy.Reset();
            runningJobs.Add(RunJobAsync(lease, stoppingToken));
        }

        if (runningJobs.Count > 0)
        {
            await Task.WhenAll(runningJobs).ConfigureAwait(false);
        }

        WorkerStopping(_logger);
    }

    private async Task RunJobAsync(IScanJobLease lease, CancellationToken stoppingToken)
    {
        var options = _options.CurrentValue;
        var jobStart = _timeProvider.GetUtcNow();
        var queueLatency = jobStart - lease.EnqueuedAtUtc;
        var jobCts = CancellationTokenSource.CreateLinkedTokenSource(stoppingToken);
        var jobToken = jobCts.Token;
        var context = new ScanJobContext(lease, _timeProvider, jobStart, jobToken);

        _metrics.RecordQueueLatency(context, queueLatency);
        JobAcquired(_logger, lease.JobId, lease.ScanId, lease.Attempt, queueLatency.TotalMilliseconds);

        // Start processing and the lease heartbeat concurrently; the processor runs exactly once.
        var processingTask = _processor.ExecuteAsync(context, jobToken).AsTask();
        var heartbeatTask = _heartbeatService.RunAsync(lease, jobToken);
        Exception? processingException = null;

        try
        {
            await processingTask.ConfigureAwait(false);
            jobCts.Cancel();
            await heartbeatTask.ConfigureAwait(false);
            await lease.CompleteAsync(stoppingToken).ConfigureAwait(false);

            var duration = _timeProvider.GetUtcNow() - jobStart;
            _metrics.RecordJobDuration(context, duration);
            _metrics.IncrementJobCompleted(context);
            JobCompleted(_logger, lease.JobId, lease.ScanId, duration.TotalMilliseconds);
        }
        catch (OperationCanceledException) when (stoppingToken.IsCancellationRequested)
        {
            processingException = null;
            await lease.AbandonAsync("host-stopping", CancellationToken.None).ConfigureAwait(false);
            JobAbandoned(_logger, lease.JobId, lease.ScanId);
        }
        catch (Exception ex)
        {
            processingException = ex;
            var duration = _timeProvider.GetUtcNow() - jobStart;
            _metrics.RecordJobDuration(context, duration);

            var reason = ex.GetType().Name;
            var maxAttempts = options.Queue.MaxAttempts;
            if (lease.Attempt >= maxAttempts)
            {
                await lease.PoisonAsync(reason, CancellationToken.None).ConfigureAwait(false);
                _metrics.IncrementJobFailed(context, reason);
                JobPoisoned(_logger, lease.JobId, lease.ScanId, lease.Attempt, maxAttempts, ex);
            }
            else
            {
                await lease.AbandonAsync(reason, CancellationToken.None).ConfigureAwait(false);
                JobAbandonedWithError(_logger, lease.JobId, lease.ScanId, lease.Attempt, maxAttempts, ex);
            }
        }
        finally
        {
            jobCts.Cancel();
            try
            {
                await heartbeatTask.ConfigureAwait(false);
            }
            catch (Exception ex) when (processingException is null && ex is not OperationCanceledException)
            {
                _logger.LogWarning(ex, "Heartbeat loop ended with an exception for job {JobId}.", lease.JobId);
            }

            await lease.DisposeAsync().ConfigureAwait(false);
            jobCts.Dispose();
        }
    }

    [LoggerMessage(EventId = 2000, Level = LogLevel.Information, Message = "Scanner worker host started.")]
    private static partial void WorkerStarted(ILogger logger);

    [LoggerMessage(EventId = 2001, Level = LogLevel.Information, Message = "Scanner worker host stopping.")]
    private static partial void WorkerStopping(ILogger logger);

    [LoggerMessage(
        EventId = 2002,
        Level = LogLevel.Information,
        Message = "Leased job {JobId} (scan {ScanId}) attempt {Attempt}; queue latency {LatencyMs:F0} ms.")]
    private static partial void JobAcquired(ILogger logger, string jobId, string scanId, int attempt, double latencyMs);

    [LoggerMessage(
        EventId = 2003,
        Level = LogLevel.Information,
        Message = "Job {JobId} (scan {ScanId}) completed in {DurationMs:F0} ms.")]
    private static partial void JobCompleted(ILogger logger, string jobId, string scanId, double durationMs);

    [LoggerMessage(
        EventId = 2004,
        Level = LogLevel.Warning,
        Message = "Job {JobId} (scan {ScanId}) abandoned due to host shutdown.")]
    private static partial void JobAbandoned(ILogger logger, string jobId, string scanId);

    [LoggerMessage(
        EventId = 2005,
        Level = LogLevel.Warning,
        Message = "Job {JobId} (scan {ScanId}) attempt {Attempt}/{MaxAttempts} abandoned after failure; job will be retried.")]
    private static partial void JobAbandonedWithError(ILogger logger, string jobId, string scanId, int attempt, int maxAttempts, Exception exception);

    [LoggerMessage(
        EventId = 2006,
        Level = LogLevel.Error,
        Message = "Job {JobId} (scan {ScanId}) attempt {Attempt}/{MaxAttempts} exceeded retry budget; quarantining job.")]
    private static partial void JobPoisoned(ILogger logger, string jobId, string scanId, int attempt, int maxAttempts, Exception exception);
}


@@ -2,141 +2,162 @@ using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.IO;
using StellaOps.Scanner.Core.Contracts;
namespace StellaOps.Scanner.Worker.Options;
public sealed class ScannerWorkerOptions
{
public const string SectionName = "Scanner:Worker";
public int MaxConcurrentJobs { get; set; } = 2;
public QueueOptions Queue { get; } = new();
public PollingOptions Polling { get; } = new();
public AuthorityOptions Authority { get; } = new();
public TelemetryOptions Telemetry { get; } = new();
public ShutdownOptions Shutdown { get; } = new();
public AnalyzerOptions Analyzers { get; } = new();
public sealed class QueueOptions
{
public int MaxAttempts { get; set; } = 5;
public double HeartbeatSafetyFactor { get; set; } = 3.0;
public int MaxHeartbeatJitterMilliseconds { get; set; } = 750;
public IReadOnlyList<TimeSpan> HeartbeatRetryDelays => _heartbeatRetryDelays;
public TimeSpan MinHeartbeatInterval { get; set; } = TimeSpan.FromSeconds(10);
public TimeSpan MaxHeartbeatInterval { get; set; } = TimeSpan.FromSeconds(30);
public void SetHeartbeatRetryDelays(IEnumerable<TimeSpan> delays)
{
_heartbeatRetryDelays = NormalizeDelays(delays);
}
internal IReadOnlyList<TimeSpan> NormalizedHeartbeatRetryDelays => _heartbeatRetryDelays;
private static IReadOnlyList<TimeSpan> NormalizeDelays(IEnumerable<TimeSpan> delays)
{
var buffer = new List<TimeSpan>();
foreach (var delay in delays)
{
if (delay <= TimeSpan.Zero)
{
continue;
}
buffer.Add(delay);
}
buffer.Sort();
return new ReadOnlyCollection<TimeSpan>(buffer);
}
private IReadOnlyList<TimeSpan> _heartbeatRetryDelays = new ReadOnlyCollection<TimeSpan>(new TimeSpan[]
{
TimeSpan.FromSeconds(2),
TimeSpan.FromSeconds(5),
TimeSpan.FromSeconds(10),
});
}
public sealed class PollingOptions
{
public TimeSpan InitialDelay { get; set; } = TimeSpan.FromMilliseconds(200);
public TimeSpan MaxDelay { get; set; } = TimeSpan.FromSeconds(5);
public double JitterRatio { get; set; } = 0.2;
}
public sealed class AuthorityOptions
{
public bool Enabled { get; set; }
public string? Issuer { get; set; }
public string? ClientId { get; set; }
public string? ClientSecret { get; set; }
public bool RequireHttpsMetadata { get; set; } = true;
public string? MetadataAddress { get; set; }
public int BackchannelTimeoutSeconds { get; set; } = 20;
public int TokenClockSkewSeconds { get; set; } = 30;
public IList<string> Scopes { get; } = new List<string> { "scanner.scan" };
public ResilienceOptions Resilience { get; } = new();
}
public sealed class ResilienceOptions
{
public bool? EnableRetries { get; set; }
public IList<TimeSpan> RetryDelays { get; } = new List<TimeSpan>
{
TimeSpan.FromMilliseconds(250),
TimeSpan.FromMilliseconds(500),
TimeSpan.FromSeconds(1),
TimeSpan.FromSeconds(5),
};
public bool? AllowOfflineCacheFallback { get; set; }
public TimeSpan? OfflineCacheTolerance { get; set; }
}
public sealed class TelemetryOptions
{
public bool EnableLogging { get; set; } = true;
public bool EnableTelemetry { get; set; } = true;
public bool EnableTracing { get; set; }
public bool EnableMetrics { get; set; } = true;
public string ServiceName { get; set; } = "stellaops-scanner-worker";
public string? OtlpEndpoint { get; set; }
public bool ExportConsole { get; set; }
public IDictionary<string, string?> ResourceAttributes { get; } = new ConcurrentDictionary<string, string?>(StringComparer.OrdinalIgnoreCase);
}
public sealed class ShutdownOptions
{
public TimeSpan Timeout { get; set; } = TimeSpan.FromSeconds(30);
}
public sealed class AnalyzerOptions
{
public AnalyzerOptions()
{
PluginDirectories = new List<string>
{
Path.Combine("plugins", "scanner", "analyzers", "os"),
};
}
public IList<string> PluginDirectories { get; }
public string RootFilesystemMetadataKey { get; set; } = ScanMetadataKeys.RootFilesystemPath;
public string WorkspaceMetadataKey { get; set; } = ScanMetadataKeys.WorkspacePath;
}
}
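
The retry-delay normalization in `QueueOptions` can be illustrated in isolation. What follows is a minimal sketch, assuming a standalone helper named `DelayNormalizer` (a hypothetical name, not part of this commit) that mirrors `NormalizeDelays`: non-positive entries are dropped and the remainder is sorted ascending.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical standalone helper mirroring QueueOptions.NormalizeDelays; illustration only.
public static class DelayNormalizer
{
    public static IReadOnlyList<TimeSpan> Normalize(IEnumerable<TimeSpan> delays)
    {
        var buffer = new List<TimeSpan>();
        foreach (var delay in delays)
        {
            if (delay > TimeSpan.Zero)
            {
                buffer.Add(delay); // keep only strictly positive durations
            }
        }

        buffer.Sort(); // ascending order, so later retries never wait less than earlier ones
        return new ReadOnlyCollection<TimeSpan>(buffer);
    }
}
```

Under these assumptions, passing 10 s, 0 s, -1 s and 2 s would leave two entries, 2 s followed by 10 s. An empty result is possible when every configured delay is non-positive, in which case the heartbeat service would make only the initial renewal attempt.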

@@ -1,91 +1,91 @@
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.Extensions.Options;
namespace StellaOps.Scanner.Worker.Options;
public sealed class ScannerWorkerOptionsValidator : IValidateOptions<ScannerWorkerOptions>
{
public ValidateOptionsResult Validate(string? name, ScannerWorkerOptions options)
{
ArgumentNullException.ThrowIfNull(options);
var failures = new List<string>();
if (options.MaxConcurrentJobs <= 0)
{
failures.Add("Scanner.Worker:MaxConcurrentJobs must be greater than zero.");
}
if (options.Queue.HeartbeatSafetyFactor < 3.0)
{
failures.Add("Scanner.Worker:Queue:HeartbeatSafetyFactor must be at least 3.");
}
if (options.Queue.MaxAttempts <= 0)
{
failures.Add("Scanner.Worker:Queue:MaxAttempts must be greater than zero.");
}
if (options.Queue.MinHeartbeatInterval <= TimeSpan.Zero)
{
failures.Add("Scanner.Worker:Queue:MinHeartbeatInterval must be greater than zero.");
}
if (options.Queue.MaxHeartbeatInterval <= options.Queue.MinHeartbeatInterval)
{
failures.Add("Scanner.Worker:Queue:MaxHeartbeatInterval must be greater than MinHeartbeatInterval.");
}
if (options.Polling.InitialDelay <= TimeSpan.Zero)
{
failures.Add("Scanner.Worker:Polling:InitialDelay must be greater than zero.");
}
if (options.Polling.MaxDelay < options.Polling.InitialDelay)
{
failures.Add("Scanner.Worker:Polling:MaxDelay must be greater than or equal to InitialDelay.");
}
if (options.Polling.JitterRatio is < 0 or > 1)
{
failures.Add("Scanner.Worker:Polling:JitterRatio must be between 0 and 1.");
}
if (options.Authority.Enabled)
{
if (string.IsNullOrWhiteSpace(options.Authority.Issuer))
{
failures.Add("Scanner.Worker:Authority requires Issuer when Enabled is true.");
}
if (string.IsNullOrWhiteSpace(options.Authority.ClientId))
{
failures.Add("Scanner.Worker:Authority requires ClientId when Enabled is true.");
}
if (options.Authority.BackchannelTimeoutSeconds <= 0)
{
failures.Add("Scanner.Worker:Authority:BackchannelTimeoutSeconds must be greater than zero.");
}
if (options.Authority.TokenClockSkewSeconds < 0)
{
failures.Add("Scanner.Worker:Authority:TokenClockSkewSeconds cannot be negative.");
}
if (options.Authority.Resilience.RetryDelays.Any(delay => delay <= TimeSpan.Zero))
{
failures.Add("Scanner.Worker:Authority:Resilience:RetryDelays must be positive durations.");
}
}
if (options.Shutdown.Timeout < TimeSpan.FromSeconds(5))
{
failures.Add("Scanner.Worker:Shutdown:Timeout must be at least 5 seconds to allow lease completion.");
}
if (options.Telemetry.EnableTelemetry)
{
if (!options.Telemetry.EnableMetrics && !options.Telemetry.EnableTracing)
@@ -94,6 +94,11 @@ public sealed class ScannerWorkerOptionsValidator : IValidateOptions<ScannerWork
}
}
if (string.IsNullOrWhiteSpace(options.Analyzers.RootFilesystemMetadataKey))
{
failures.Add("Scanner.Worker:Analyzers:RootFilesystemMetadataKey must be provided.");
}
return failures.Count == 0 ? ValidateOptionsResult.Success : ValidateOptionsResult.Fail(failures);
}
}

@@ -1,20 +1,20 @@
using System;
using System.Threading;
using System.Threading.Tasks;
namespace StellaOps.Scanner.Worker.Processing;
public sealed class AnalyzerStageExecutor : IScanStageExecutor
{
private readonly IScanAnalyzerDispatcher _dispatcher;
public AnalyzerStageExecutor(IScanAnalyzerDispatcher dispatcher)
{
_dispatcher = dispatcher ?? throw new ArgumentNullException(nameof(dispatcher));
}
public string StageName => ScanStageNames.ExecuteAnalyzers;
public ValueTask ExecuteAsync(ScanJobContext context, CancellationToken cancellationToken)
=> _dispatcher.ExecuteAsync(context, cancellationToken);
}

@@ -1,10 +1,10 @@
using System;
using System.Threading;
using System.Threading.Tasks;
namespace StellaOps.Scanner.Worker.Processing;
public interface IDelayScheduler
{
Task DelayAsync(TimeSpan delay, CancellationToken cancellationToken);
}

@@ -1,15 +1,15 @@
using System.Threading;
using System.Threading.Tasks;
namespace StellaOps.Scanner.Worker.Processing;
public interface IScanAnalyzerDispatcher
{
ValueTask ExecuteAsync(ScanJobContext context, CancellationToken cancellationToken);
}
public sealed class NullScanAnalyzerDispatcher : IScanAnalyzerDispatcher
{
public ValueTask ExecuteAsync(ScanJobContext context, CancellationToken cancellationToken)
=> ValueTask.CompletedTask;
}

@@ -1,31 +1,31 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
namespace StellaOps.Scanner.Worker.Processing;
public interface IScanJobLease : IAsyncDisposable
{
string JobId { get; }
string ScanId { get; }
int Attempt { get; }
DateTimeOffset EnqueuedAtUtc { get; }
DateTimeOffset LeasedAtUtc { get; }
TimeSpan LeaseDuration { get; }
IReadOnlyDictionary<string, string> Metadata { get; }
ValueTask RenewAsync(CancellationToken cancellationToken);
ValueTask CompleteAsync(CancellationToken cancellationToken);
ValueTask AbandonAsync(string reason, CancellationToken cancellationToken);
ValueTask PoisonAsync(string reason, CancellationToken cancellationToken);
}

@@ -1,9 +1,9 @@
using System.Threading;
using System.Threading.Tasks;
namespace StellaOps.Scanner.Worker.Processing;
public interface IScanJobSource
{
Task<IScanJobLease?> TryAcquireAsync(CancellationToken cancellationToken);
}

@@ -1,11 +1,11 @@
using System.Threading;
using System.Threading.Tasks;
namespace StellaOps.Scanner.Worker.Processing;
public interface IScanStageExecutor
{
string StageName { get; }
ValueTask ExecuteAsync(ScanJobContext context, CancellationToken cancellationToken);
}

@@ -1,148 +1,155 @@
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using StellaOps.Scanner.Worker.Options;
namespace StellaOps.Scanner.Worker.Processing;
public sealed class LeaseHeartbeatService
{
private readonly TimeProvider _timeProvider;
private readonly IOptionsMonitor<ScannerWorkerOptions> _options;
private readonly IDelayScheduler _delayScheduler;
private readonly ILogger<LeaseHeartbeatService> _logger;
public LeaseHeartbeatService(TimeProvider timeProvider, IDelayScheduler delayScheduler, IOptionsMonitor<ScannerWorkerOptions> options, ILogger<LeaseHeartbeatService> logger)
{
_timeProvider = timeProvider ?? throw new ArgumentNullException(nameof(timeProvider));
_delayScheduler = delayScheduler ?? throw new ArgumentNullException(nameof(delayScheduler));
_options = options ?? throw new ArgumentNullException(nameof(options));
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
}
public async Task RunAsync(IScanJobLease lease, CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(lease);
await Task.Yield();
while (!cancellationToken.IsCancellationRequested)
{
var options = _options.CurrentValue;
var interval = ComputeInterval(options, lease);
var delay = ApplyJitter(interval, options.Queue);
try
{
await _delayScheduler.DelayAsync(delay, cancellationToken).ConfigureAwait(false);
}
catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
{
break;
}
if (cancellationToken.IsCancellationRequested)
{
break;
}
if (await TryRenewAsync(options, lease, cancellationToken).ConfigureAwait(false))
{
continue;
}
_logger.LogError(
"Job {JobId} (scan {ScanId}) lease renewal exhausted retries; cancelling processing.",
lease.JobId,
lease.ScanId);
throw new InvalidOperationException("Lease renewal retries exhausted.");
}
}
private static TimeSpan ComputeInterval(ScannerWorkerOptions options, IScanJobLease lease)
{
var divisor = options.Queue.HeartbeatSafetyFactor <= 0 ? 3.0 : options.Queue.HeartbeatSafetyFactor;
var safetyFactor = Math.Max(3.0, divisor);
var recommended = TimeSpan.FromTicks((long)(lease.LeaseDuration.Ticks / safetyFactor));
if (recommended < options.Queue.MinHeartbeatInterval)
{
recommended = options.Queue.MinHeartbeatInterval;
}
else if (recommended > options.Queue.MaxHeartbeatInterval)
{
recommended = options.Queue.MaxHeartbeatInterval;
}
return recommended;
}
private static TimeSpan ApplyJitter(TimeSpan duration, ScannerWorkerOptions.QueueOptions queueOptions)
{
if (queueOptions.MaxHeartbeatJitterMilliseconds <= 0)
{
return duration;
}
var offsetMs = Random.Shared.NextDouble() * queueOptions.MaxHeartbeatJitterMilliseconds;
var adjusted = duration - TimeSpan.FromMilliseconds(offsetMs);
if (adjusted < queueOptions.MinHeartbeatInterval)
{
return queueOptions.MinHeartbeatInterval;
}
return adjusted > TimeSpan.Zero ? adjusted : queueOptions.MinHeartbeatInterval;
}
private async Task<bool> TryRenewAsync(ScannerWorkerOptions options, IScanJobLease lease, CancellationToken cancellationToken)
{
try
{
await lease.RenewAsync(cancellationToken).ConfigureAwait(false);
return true;
}
catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
{
return false;
}
catch (Exception ex)
{
_logger.LogWarning(
ex,
"Job {JobId} (scan {ScanId}) heartbeat failed; retrying.",
lease.JobId,
lease.ScanId);
}
foreach (var delay in options.Queue.NormalizedHeartbeatRetryDelays)
{
if (cancellationToken.IsCancellationRequested)
{
return false;
}
try
{
await _delayScheduler.DelayAsync(delay, cancellationToken).ConfigureAwait(false);
}
catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
{
return false;
}
try
{
await lease.RenewAsync(cancellationToken).ConfigureAwait(false);
return true;
}
catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
{
return false;
}
catch (Exception ex)
{
_logger.LogWarning(
ex,
"Job {JobId} (scan {ScanId}) heartbeat retry failed; will retry after {Delay}.",
lease.JobId,
lease.ScanId,
delay);
}
}
return false;
}
}
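
As a worked illustration of the interval rule in `ComputeInterval`, the sketch below reimplements the same arithmetic as a standalone function. `HeartbeatMath` and its parameter list are hypothetical names introduced only for this example; the values used in the usage note (10 s minimum, 30 s maximum, factor 3) are the `QueueOptions` defaults.

```csharp
using System;

// Hypothetical standalone helper; mirrors the divide-then-clamp logic of ComputeInterval.
public static class HeartbeatMath
{
    public static TimeSpan ComputeInterval(
        TimeSpan leaseDuration,
        double safetyFactor,
        TimeSpan minInterval,
        TimeSpan maxInterval)
    {
        // Non-positive factors fall back to 3, and anything below 3 is raised to 3.
        var divisor = safetyFactor <= 0 ? 3.0 : safetyFactor;
        var effective = Math.Max(3.0, divisor);

        var recommended = TimeSpan.FromTicks((long)(leaseDuration.Ticks / effective));

        // Clamp into the configured [min, max] window.
        if (recommended < minInterval)
        {
            return minInterval;
        }

        return recommended > maxInterval ? maxInterval : recommended;
    }
}
```

With those defaults, a 60-second lease yields a 20-second interval, a 15-second lease is raised to the 10-second minimum, and a 5-minute lease is capped at 30 seconds. One consequence worth keeping in mind is that the minimum clamp can leave fewer than three heartbeats per lease when leases are short.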

@@ -1,18 +1,18 @@
using System;
using System.Threading;
using System.Threading.Tasks;
namespace StellaOps.Scanner.Worker.Processing;
public sealed class NoOpStageExecutor : IScanStageExecutor
{
public NoOpStageExecutor(string stageName)
{
StageName = stageName ?? throw new ArgumentNullException(nameof(stageName));
}
public string StageName { get; }
public ValueTask ExecuteAsync(ScanJobContext context, CancellationToken cancellationToken)
=> ValueTask.CompletedTask;
}

@@ -1,26 +1,26 @@
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;
namespace StellaOps.Scanner.Worker.Processing;
public sealed class NullScanJobSource : IScanJobSource
{
private readonly ILogger<NullScanJobSource> _logger;
private int _logged;
public NullScanJobSource(ILogger<NullScanJobSource> logger)
{
_logger = logger;
}
public Task<IScanJobLease?> TryAcquireAsync(CancellationToken cancellationToken)
{
if (Interlocked.Exchange(ref _logged, 1) == 0)
{
_logger.LogWarning("No queue provider registered. Scanner worker will idle until a queue adapter is configured.");
}
return Task.FromResult<IScanJobLease?>(null);
}
}

@@ -0,0 +1,153 @@
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using StellaOps.Scanner.Analyzers.OS;
using StellaOps.Scanner.Analyzers.OS.Abstractions;
using StellaOps.Scanner.Analyzers.OS.Plugin;
using StellaOps.Scanner.Core.Contracts;
using StellaOps.Scanner.Worker.Options;
namespace StellaOps.Scanner.Worker.Processing;
internal sealed class OsScanAnalyzerDispatcher : IScanAnalyzerDispatcher
{
private readonly IServiceScopeFactory _scopeFactory;
private readonly OsAnalyzerPluginCatalog _catalog;
private readonly ScannerWorkerOptions _options;
private readonly ILogger<OsScanAnalyzerDispatcher> _logger;
private IReadOnlyList<string> _pluginDirectories = Array.Empty<string>();
public OsScanAnalyzerDispatcher(
IServiceScopeFactory scopeFactory,
OsAnalyzerPluginCatalog catalog,
IOptions<ScannerWorkerOptions> options,
ILogger<OsScanAnalyzerDispatcher> logger)
{
_scopeFactory = scopeFactory ?? throw new ArgumentNullException(nameof(scopeFactory));
_catalog = catalog ?? throw new ArgumentNullException(nameof(catalog));
_options = options?.Value ?? throw new ArgumentNullException(nameof(options));
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
LoadPlugins();
}
public async ValueTask ExecuteAsync(ScanJobContext context, CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(context);
using var scope = _scopeFactory.CreateScope();
var services = scope.ServiceProvider;
var analyzers = _catalog.CreateAnalyzers(services);
if (analyzers.Count == 0)
{
_logger.LogWarning("No OS analyzers available; skipping analyzer stage for job {JobId}.", context.JobId);
return;
}
var metadata = new Dictionary<string, string>(context.Lease.Metadata, StringComparer.Ordinal);
var rootfsPath = ResolvePath(metadata, _options.Analyzers.RootFilesystemMetadataKey);
if (rootfsPath is null)
{
_logger.LogWarning(
"Metadata key '{MetadataKey}' missing for job {JobId}; unable to locate root filesystem. OS analyzers skipped.",
_options.Analyzers.RootFilesystemMetadataKey,
context.JobId);
return;
}
var workspacePath = ResolvePath(metadata, _options.Analyzers.WorkspaceMetadataKey);
var loggerFactory = services.GetRequiredService<ILoggerFactory>();
var results = new List<OSPackageAnalyzerResult>(analyzers.Count);
foreach (var analyzer in analyzers)
{
cancellationToken.ThrowIfCancellationRequested();
var analyzerLogger = loggerFactory.CreateLogger(analyzer.GetType());
var analyzerContext = new OSPackageAnalyzerContext(rootfsPath, workspacePath, context.TimeProvider, analyzerLogger, metadata);
try
{
var result = await analyzer.AnalyzeAsync(analyzerContext, cancellationToken).ConfigureAwait(false);
results.Add(result);
}
catch (Exception ex)
{
_logger.LogError(ex, "Analyzer {AnalyzerId} failed for job {JobId}.", analyzer.AnalyzerId, context.JobId);
}
}
if (results.Count > 0)
{
var dictionary = results.ToDictionary(result => result.AnalyzerId, StringComparer.OrdinalIgnoreCase);
context.Analysis.Set(ScanAnalysisKeys.OsPackageAnalyzers, dictionary);
}
}
private void LoadPlugins()
{
var directories = new List<string>();
foreach (var configured in _options.Analyzers.PluginDirectories)
{
if (string.IsNullOrWhiteSpace(configured))
{
continue;
}
var path = configured;
if (!Path.IsPathRooted(path))
{
path = Path.GetFullPath(Path.Combine(AppContext.BaseDirectory, path));
}
directories.Add(path);
}
if (directories.Count == 0)
{
directories.Add(Path.Combine(AppContext.BaseDirectory, "plugins", "scanner", "analyzers", "os"));
}
_pluginDirectories = new ReadOnlyCollection<string>(directories);
for (var i = 0; i < _pluginDirectories.Count; i++)
{
var directory = _pluginDirectories[i];
var seal = i == _pluginDirectories.Count - 1;
try
{
_catalog.LoadFromDirectory(directory, seal);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to load analyzer plug-ins from {Directory}.", directory);
}
}
}
private static string? ResolvePath(IReadOnlyDictionary<string, string> metadata, string key)
{
if (string.IsNullOrWhiteSpace(key))
{
return null;
}
if (!metadata.TryGetValue(key, out var value) || string.IsNullOrWhiteSpace(value))
{
return null;
}
var trimmed = value.Trim();
return Path.IsPathRooted(trimmed)
? trimmed
: Path.GetFullPath(trimmed);
}
}
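
The dispatcher resolves relative plug-in directories against `AppContext.BaseDirectory`, whereas `ResolvePath` hands relative metadata values to `Path.GetFullPath`, which resolves against the current working directory. The sketch below isolates the first rule; `PluginPathPreview` is a hypothetical helper written for this illustration and is not part of the commit.

```csharp
using System;
using System.IO;

// Hypothetical helper; reproduces how LoadPlugins turns a configured entry into an absolute path.
public static class PluginPathPreview
{
    public static string Resolve(string baseDirectory, string configured)
    {
        // Rooted paths are used as-is; relative ones are anchored at the supplied base directory.
        return Path.IsPathRooted(configured)
            ? configured
            : Path.GetFullPath(Path.Combine(baseDirectory, configured));
    }
}
```

Calling `PluginPathPreview.Resolve(AppContext.BaseDirectory, Path.Combine("plugins", "scanner", "analyzers", "os"))` would return an absolute path beneath the application directory, which matches the fallback the dispatcher adds when no directories are configured.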

@@ -1,49 +1,49 @@

@@ -1,27 +1,31 @@
using System;
using System.Threading;
using StellaOps.Scanner.Core.Contracts;
namespace StellaOps.Scanner.Worker.Processing;
public sealed class ScanJobContext
{
public ScanJobContext(IScanJobLease lease, TimeProvider timeProvider, DateTimeOffset startUtc, CancellationToken cancellationToken)
{
Lease = lease ?? throw new ArgumentNullException(nameof(lease));
TimeProvider = timeProvider ?? throw new ArgumentNullException(nameof(timeProvider));
StartUtc = startUtc;
CancellationToken = cancellationToken;
Analysis = new ScanAnalysisStore();
}
public IScanJobLease Lease { get; }
public TimeProvider TimeProvider { get; }
public DateTimeOffset StartUtc { get; }
public CancellationToken CancellationToken { get; }
public string JobId => Lease.JobId;
public string ScanId => Lease.ScanId;
public ScanAnalysisStore Analysis { get; }
}

@@ -1,65 +1,65 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;
namespace StellaOps.Scanner.Worker.Processing;
public sealed class ScanJobProcessor
{
private readonly IReadOnlyDictionary<string, IScanStageExecutor> _executors;
private readonly ScanProgressReporter _progressReporter;
private readonly ILogger<ScanJobProcessor> _logger;
public ScanJobProcessor(IEnumerable<IScanStageExecutor> executors, ScanProgressReporter progressReporter, ILogger<ScanJobProcessor> logger)
{
_progressReporter = progressReporter ?? throw new ArgumentNullException(nameof(progressReporter));
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
var map = new Dictionary<string, IScanStageExecutor>(StringComparer.OrdinalIgnoreCase);
foreach (var executor in executors ?? Array.Empty<IScanStageExecutor>())
{
if (executor is null || string.IsNullOrWhiteSpace(executor.StageName))
{
continue;
}
map[executor.StageName] = executor;
}
foreach (var stage in ScanStageNames.Ordered)
{
if (map.ContainsKey(stage))
{
continue;
}
map[stage] = new NoOpStageExecutor(stage);
_logger.LogDebug("No executor registered for stage {Stage}; using no-op placeholder.", stage);
}
_executors = map;
}
public async ValueTask ExecuteAsync(ScanJobContext context, CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(context);
foreach (var stage in ScanStageNames.Ordered)
{
cancellationToken.ThrowIfCancellationRequested();
if (!_executors.TryGetValue(stage, out var executor))
{
continue;
}
await _progressReporter.ExecuteStageAsync(
context,
stage,
executor.ExecuteAsync,
cancellationToken).ConfigureAwait(false);
}
}
}

@@ -1,86 +1,86 @@
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;
using StellaOps.Scanner.Worker.Diagnostics;
namespace StellaOps.Scanner.Worker.Processing;
public sealed partial class ScanProgressReporter
{
private readonly ScannerWorkerMetrics _metrics;
private readonly ILogger<ScanProgressReporter> _logger;
public ScanProgressReporter(ScannerWorkerMetrics metrics, ILogger<ScanProgressReporter> logger)
{
_metrics = metrics ?? throw new ArgumentNullException(nameof(metrics));
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
}
public async ValueTask ExecuteStageAsync(
ScanJobContext context,
string stageName,
Func<ScanJobContext, CancellationToken, ValueTask> stageWork,
CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(context);
ArgumentException.ThrowIfNullOrWhiteSpace(stageName);
ArgumentNullException.ThrowIfNull(stageWork);
StageStarting(_logger, context.JobId, context.ScanId, stageName, context.Lease.Attempt);
var start = context.TimeProvider.GetUtcNow();
using var activity = ScannerWorkerInstrumentation.ActivitySource.StartActivity(
$"scanner.worker.{stageName}",
ActivityKind.Internal);
activity?.SetTag("scanner.worker.job_id", context.JobId);
activity?.SetTag("scanner.worker.scan_id", context.ScanId);
activity?.SetTag("scanner.worker.stage", stageName);
try
{
await stageWork(context, cancellationToken).ConfigureAwait(false);
var duration = context.TimeProvider.GetUtcNow() - start;
_metrics.RecordStageDuration(context, stageName, duration);
StageCompleted(_logger, context.JobId, context.ScanId, stageName, duration.TotalMilliseconds);
}
catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
{
StageCancelled(_logger, context.JobId, context.ScanId, stageName);
throw;
}
catch (Exception ex)
{
var duration = context.TimeProvider.GetUtcNow() - start;
_metrics.RecordStageDuration(context, stageName, duration);
StageFailed(_logger, context.JobId, context.ScanId, stageName, ex);
throw;
}
}
[LoggerMessage(
EventId = 1000,
Level = LogLevel.Information,
Message = "Job {JobId} (scan {ScanId}) entering stage {Stage} (attempt {Attempt}).")]
private static partial void StageStarting(ILogger logger, string jobId, string scanId, string stage, int attempt);
[LoggerMessage(
EventId = 1001,
Level = LogLevel.Information,
Message = "Job {JobId} (scan {ScanId}) finished stage {Stage} in {ElapsedMs:F0} ms.")]
private static partial void StageCompleted(ILogger logger, string jobId, string scanId, string stage, double elapsedMs);
[LoggerMessage(
EventId = 1002,
Level = LogLevel.Warning,
Message = "Job {JobId} (scan {ScanId}) stage {Stage} cancelled by request.")]
private static partial void StageCancelled(ILogger logger, string jobId, string scanId, string stage);
[LoggerMessage(
EventId = 1003,
Level = LogLevel.Error,
Message = "Job {JobId} (scan {ScanId}) stage {Stage} failed.")]
private static partial void StageFailed(ILogger logger, string jobId, string scanId, string stage, Exception exception);
}

@@ -1,23 +1,23 @@
using System.Collections.Generic;
namespace StellaOps.Scanner.Worker.Processing;
public static class ScanStageNames
{
public const string ResolveImage = "resolve-image";
public const string PullLayers = "pull-layers";
public const string BuildFilesystem = "build-filesystem";
public const string ExecuteAnalyzers = "execute-analyzers";
public const string ComposeArtifacts = "compose-artifacts";
public const string EmitReports = "emit-reports";
public static readonly IReadOnlyList<string> Ordered = new[]
{
ResolveImage,
PullLayers,
BuildFilesystem,
ExecuteAnalyzers,
ComposeArtifacts,
EmitReports,
};
}

@@ -1,18 +1,18 @@
using System;
using System.Threading;
using System.Threading.Tasks;
namespace StellaOps.Scanner.Worker.Processing;
public sealed class SystemDelayScheduler : IDelayScheduler
{
public Task DelayAsync(TimeSpan delay, CancellationToken cancellationToken)
{
if (delay <= TimeSpan.Zero)
{
return Task.CompletedTask;
}
return Task.Delay(delay, cancellationToken);
}
}

@@ -1,98 +1,103 @@
using System.Diagnostics;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using Microsoft.Extensions.DependencyInjection.Extensions;
using StellaOps.Auth.Client;
using StellaOps.Scanner.Analyzers.OS.Plugin;
using StellaOps.Scanner.EntryTrace;
using StellaOps.Scanner.Worker.Diagnostics;
using StellaOps.Scanner.Worker.Hosting;
using StellaOps.Scanner.Worker.Options;
using StellaOps.Scanner.Worker.Processing;
var builder = Host.CreateApplicationBuilder(args);
builder.Services.AddOptions<ScannerWorkerOptions>()
.BindConfiguration(ScannerWorkerOptions.SectionName)
.ValidateOnStart();
builder.Services.AddSingleton<IValidateOptions<ScannerWorkerOptions>, ScannerWorkerOptionsValidator>();
builder.Services.AddSingleton(TimeProvider.System);
builder.Services.AddSingleton<ScannerWorkerMetrics>();
builder.Services.AddSingleton<ScanProgressReporter>();
builder.Services.AddSingleton<ScanJobProcessor>();
builder.Services.AddSingleton<LeaseHeartbeatService>();
builder.Services.AddSingleton<IDelayScheduler, SystemDelayScheduler>();
builder.Services.AddEntryTraceAnalyzer();
builder.Services.TryAddSingleton<IScanJobSource, NullScanJobSource>();
builder.Services.AddSingleton<OsAnalyzerPluginCatalog>();
builder.Services.AddSingleton<IScanAnalyzerDispatcher, OsScanAnalyzerDispatcher>();
builder.Services.AddSingleton<IScanStageExecutor, AnalyzerStageExecutor>();
builder.Services.AddSingleton<ScannerWorkerHostedService>();
builder.Services.AddHostedService(sp => sp.GetRequiredService<ScannerWorkerHostedService>());
var workerOptions = builder.Configuration.GetSection(ScannerWorkerOptions.SectionName).Get<ScannerWorkerOptions>() ?? new ScannerWorkerOptions();
builder.Services.Configure<HostOptions>(options =>
{
options.ShutdownTimeout = workerOptions.Shutdown.Timeout;
});
builder.ConfigureScannerWorkerTelemetry(workerOptions);
if (workerOptions.Authority.Enabled)
{
builder.Services.AddStellaOpsAuthClient(clientOptions =>
{
clientOptions.Authority = workerOptions.Authority.Issuer?.Trim() ?? string.Empty;
clientOptions.ClientId = workerOptions.Authority.ClientId?.Trim() ?? string.Empty;
clientOptions.ClientSecret = workerOptions.Authority.ClientSecret;
clientOptions.EnableRetries = workerOptions.Authority.Resilience.EnableRetries ?? true;
clientOptions.HttpTimeout = TimeSpan.FromSeconds(workerOptions.Authority.BackchannelTimeoutSeconds);
clientOptions.DefaultScopes.Clear();
foreach (var scope in workerOptions.Authority.Scopes)
{
if (string.IsNullOrWhiteSpace(scope))
{
continue;
}
clientOptions.DefaultScopes.Add(scope);
}
clientOptions.RetryDelays.Clear();
foreach (var delay in workerOptions.Authority.Resilience.RetryDelays)
{
if (delay <= TimeSpan.Zero)
{
continue;
}
clientOptions.RetryDelays.Add(delay);
}
if (workerOptions.Authority.Resilience.AllowOfflineCacheFallback is bool allowOffline)
{
clientOptions.AllowOfflineCacheFallback = allowOffline;
}
if (workerOptions.Authority.Resilience.OfflineCacheTolerance is { } tolerance && tolerance > TimeSpan.Zero)
{
clientOptions.OfflineCacheTolerance = tolerance;
}
});
}
builder.Logging.Configure(options =>
{
options.ActivityTrackingOptions = ActivityTrackingOptions.SpanId
| ActivityTrackingOptions.TraceId
| ActivityTrackingOptions.ParentId;
});
var host = builder.Build();
await host.RunAsync();
public partial class Program;

@@ -1,20 +1,22 @@
<Project Sdk="Microsoft.NET.Sdk.Worker">
<PropertyGroup>
<TargetFramework>net10.0</TargetFramework>
<LangVersion>preview</LangVersion>
<Nullable>enable</Nullable>
<ImplicitUsings>enable</ImplicitUsings>
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="OpenTelemetry.Extensions.Hosting" Version="1.12.0" />
<PackageReference Include="OpenTelemetry.Exporter.Console" Version="1.12.0" />
<PackageReference Include="OpenTelemetry.Exporter.OpenTelemetryProtocol" Version="1.12.0" />
<PackageReference Include="OpenTelemetry.Instrumentation.Runtime" Version="1.12.0" />
<PackageReference Include="OpenTelemetry.Instrumentation.Process" Version="1.12.0-beta.1" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\StellaOps.Plugin\StellaOps.Plugin.csproj" />
<ProjectReference Include="..\StellaOps.Authority\StellaOps.Auth.Client\StellaOps.Auth.Client.csproj" />
<ProjectReference Include="..\StellaOps.Scanner.Analyzers.OS\StellaOps.Scanner.Analyzers.OS.csproj" />
<ProjectReference Include="..\StellaOps.Scanner.EntryTrace\StellaOps.Scanner.EntryTrace.csproj" />
</ItemGroup>
</Project>

@@ -1,8 +1,9 @@
# Scanner Worker Task Board
| ID | Status | Owner(s) | Depends on | Description | Exit Criteria |
|----|--------|----------|------------|-------------|---------------|
| SCANNER-WORKER-09-201 | DONE (2025-10-19) | Scanner Worker Guild | SCANNER-CORE-09-501 | Worker host bootstrap with Authority auth, hosted services, and graceful shutdown semantics. | `Program.cs` binds `Scanner:Worker` options, registers delay scheduler, configures telemetry + Authority client, and enforces shutdown timeout. |
| SCANNER-WORKER-09-202 | DONE (2025-10-19) | Scanner Worker Guild | SCANNER-WORKER-09-201, SCANNER-QUEUE-09-401 | Lease/heartbeat loop with retry+jitter, poison-job quarantine, structured logging. | `ScannerWorkerHostedService` + `LeaseHeartbeatService` manage concurrency, renewal margins, poison handling, and structured logs exercised by integration fixture. |
| SCANNER-WORKER-09-203 | DONE (2025-10-19) | Scanner Worker Guild | SCANNER-WORKER-09-202, SCANNER-STORAGE-09-301 | Analyzer dispatch skeleton emitting deterministic stage progress and honoring cancellation tokens. | Deterministic stage list + `ScanProgressReporter`; `WorkerBasicScanScenario` validates ordering and cancellation propagation. |
| SCANNER-WORKER-09-204 | DONE (2025-10-19) | Scanner Worker Guild | SCANNER-WORKER-09-203 | Worker metrics (queue latency, stage duration, failure counts) with OpenTelemetry resource wiring. | `ScannerWorkerMetrics` records queue/job/stage metrics; integration test asserts analyzer stage histogram entries. |
| SCANNER-WORKER-09-205 | DONE (2025-10-19) | Scanner Worker Guild | SCANNER-WORKER-09-202 | Harden heartbeat jitter so lease safety margin stays ≥3× and cover with regression tests. | `LeaseHeartbeatService` clamps jitter to safety window, validator enforces ≥3 safety factor, regression tests cover heartbeat scheduling and metrics. |
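
The exit criteria for SCANNER-WORKER-09-205 describe clamping heartbeat jitter so that the renewal interval stays within a safety window of at least 3×. `LeaseHeartbeatService` itself is not shown in this change, so the following is only a minimal sketch of one way such a clamp could work. The type `HeartbeatIntervalSketch`, the method `NextInterval`, and all of its parameters are hypothetical names chosen for illustration and are not taken from the repository.

```csharp
using System;

// Hypothetical illustration only; not the actual LeaseHeartbeatService logic.
public static class HeartbeatIntervalSketch
{
    // Assumed minimum ratio between lease duration and renewal interval.
    public const double MinSafetyFactor = 3.0;

    // Returns a renewal delay that never exceeds leaseDuration / safetyFactor.
    // 'sample' is a uniform random value in [0, 1), e.g. Random.Shared.NextDouble().
    public static TimeSpan NextInterval(
        TimeSpan leaseDuration,
        double safetyFactor,
        double jitterRatio,
        double sample)
    {
        if (leaseDuration <= TimeSpan.Zero)
        {
            throw new ArgumentOutOfRangeException(nameof(leaseDuration));
        }

        if (safetyFactor < MinSafetyFactor)
        {
            throw new ArgumentOutOfRangeException(nameof(safetyFactor), "Safety factor must be at least 3.");
        }

        if (sample < 0.0 || sample >= 1.0)
        {
            throw new ArgumentOutOfRangeException(nameof(sample));
        }

        // Upper bound for the interval: the lease divided by the safety factor.
        var ceilingMs = leaseDuration.TotalMilliseconds / safetyFactor;

        // Keep the jitter ratio within [0, 1] so the result stays non-negative.
        var ratio = Math.Clamp(jitterRatio, 0.0, 1.0);

        // Jitter only shortens the interval, so the ceiling is never exceeded.
        var jitteredMs = ceilingMs * (1.0 - (ratio * sample));
        return TimeSpan.FromMilliseconds(jitteredMs);
    }
}
```

A caller might compute `HeartbeatIntervalSketch.NextInterval(TimeSpan.FromSeconds(30), 3.0, 0.2, Random.Shared.NextDouble())` before each renewal. Applying jitter in one direction only is a deliberate simplification in this sketch: it keeps the safety margin intact without a separate clamping step, at the cost of slightly more frequent renewals.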