feat: Add initial implementation of Vulnerability Resolver Jobs

- Created project for StellaOps.Scanner.Analyzers.Native.Tests with necessary dependencies.
- Documented roles and guidelines in AGENTS.md for Scheduler module.
- Implemented IResolverJobService interface and InMemoryResolverJobService for handling resolver jobs.
- Added ResolverBacklogNotifier and ResolverBacklogService for monitoring job metrics.
- Developed API endpoints for managing resolver jobs and retrieving metrics.
- Defined models for resolver job requests and responses.
- Integrated dependency injection for resolver job services.
- Implemented ImpactIndexSnapshot for persisting impact index data.
- Introduced SignalsScoringOptions for configurable scoring weights in reachability scoring.
- Added unit tests for ReachabilityScoringService and RuntimeFactsIngestionService.
- Created dotnet-filter.sh script to handle command-line arguments for dotnet.
- Established nuget-prime project for managing package downloads.
This commit is contained in:
master
2025-11-18 07:52:15 +02:00
parent e69b57d467
commit 8355e2ff75
299 changed files with 13293 additions and 2444 deletions

src/Scheduler/AGENTS.md Normal file

@@ -0,0 +1,41 @@
# AGENTS · Scheduler Working Directory
## Roles
- **Scheduler Worker/WebService Engineer**: .NET 10 (preview) across workers, web service, and shared libraries; keep jobs/metrics deterministic and tenant-safe.
- **QA / Reliability**: Adds/maintains unit + integration tests in `__Tests`, covers determinism, job orchestration, and metrics; validates Mongo/Redis/NATS contracts without live cloud deps.
- **Docs/Runbook Touches**: Update `docs/modules/scheduler/**` and `operations/` assets when contracts or operational characteristics change.
## Required Reading
- `docs/modules/scheduler/README.md`
- `docs/modules/scheduler/architecture.md`
- `docs/modules/scheduler/implementation_plan.md`
- `docs/modules/platform/architecture-overview.md`
- Current sprint file(s) for this module (e.g., `docs/implplan/SPRINT_0155_0001_0001_scheduler_i.md`, `SPRINT_0156_0001_0002_scheduler_ii.md`).
## Working Directory & Boundaries
- Primary scope: `src/Scheduler/**` including WebService, Worker.Host, `__Libraries`, `__Tests`, plugins, and solution files.
- Cross-module edits require an explicit note in sprint **Delivery Tracker** and **Decisions & Risks**.
- Fixtures belong under `src/Scheduler/__Tests/Fixtures` and must be deterministic.
## Engineering Rules
- Target `net10.0`; prefer latest C# preview permitted in repo.
- Offline-first: no new external calls; use cached feeds (`/local-nugets`) and configurable endpoints.
- Determinism: stable ordering, UTC ISO-8601 timestamps, seeded randomness; avoid host-specific paths in outputs/events.
- Observability: use structured logging; keep metric/label names consistent with published dashboards (`policy_simulation_*`, `graph_*`, `overlay_*`).
- Security: tenant isolation on all queues/stores; avoid leaking PII/secrets in logs or metrics.
## Testing & Verification
- Default: `dotnet test src/Scheduler/StellaOps.Scheduler.sln` (note: GraphJobs `IGraphJobStore.UpdateAsync` accessibility issue is a known blocker; document if encountered).
- Add/extend tests in `src/Scheduler/__Tests/**`; prefer minimal deterministic fixtures and stable sort order.
- When adding metrics, include unit tests validating label sets and defaults; update `operations/worker-prometheus-rules.yaml` if alert semantics change.
## Workflow Expectations
- Mirror task state changes in sprint files and, where applicable, module TASKS boards.
- If blocked by contracts or upstream issues, set task to `BLOCKED` in sprint tracker and note the required decision/fix.
- Document runbook/operational changes alongside code changes.
## Allowed Shared Libraries
- May reference shared helpers under `src/Scheduler/__Libraries/**` and existing plugins; new shared libs require sprint note.
## Air-gap & Offline
- Support air-gapped operation: no hardcoded internet endpoints; provide config flags and mirrored feeds when needed.
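
The determinism rule above (stable ordering, UTC ISO-8601 timestamps, seeded randomness) could look roughly like this in a test fixture helper. This is a minimal sketch: `DeterministicFixture` and its members are hypothetical names chosen for illustration and are not part of the Scheduler codebase.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper; it only illustrates the determinism rules.
public static class DeterministicFixture
{
    // Normalise to UTC and format as ISO-8601 with a fixed, culture-invariant pattern.
    public static string FormatTimestamp(DateTimeOffset value) =>
        value.ToUniversalTime().ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);

    // Ordinal comparison does not depend on the host culture, so ordering is stable across machines.
    public static string[] StableOrder(IEnumerable<string> ids) =>
        ids.OrderBy(id => id, StringComparer.Ordinal).ToArray();

    // A fixed seed makes the sequence repeatable between runs on the same runtime.
    public static int[] SeededSample(int seed, int count)
    {
        var random = new Random(seed);
        var values = new int[count];
        for (var i = 0; i < count; i++)
        {
            values[i] = random.Next(0, 1000);
        }
        return values;
    }
}
```

Passing the timestamp and the seed in explicitly keeps fixtures free of wall-clock and host-specific state.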


@@ -25,6 +25,7 @@ internal static class PolicySimulationEndpointExtensions
group.MapGet("/{simulationId}/stream", StreamSimulationAsync);
group.MapGet("/metrics", GetMetricsAsync);
group.MapPost("/", CreateSimulationAsync);
group.MapPost("/preview", PreviewSimulationAsync);
group.MapPost("/{simulationId}/cancel", CancelSimulationAsync);
group.MapPost("/{simulationId}/retry", RetrySimulationAsync);
}
@@ -198,6 +199,75 @@ internal static class PolicySimulationEndpointExtensions
}
}
private static async Task<IResult> PreviewSimulationAsync(
HttpContext httpContext,
PolicySimulationCreateRequest request,
[FromServices] ITenantContextAccessor tenantAccessor,
[FromServices] IScopeAuthorizer scopeAuthorizer,
[FromServices] IPolicyRunService policyRunService,
CancellationToken cancellationToken)
{
try
{
scopeAuthorizer.EnsureScope(httpContext, Scope);
var tenant = tenantAccessor.GetTenant(httpContext);
var actor = SchedulerEndpointHelpers.ResolveActorId(httpContext);
if (string.IsNullOrWhiteSpace(request.PolicyId))
{
throw new ValidationException("policyId must be provided.");
}
if (request.PolicyVersion is null || request.PolicyVersion <= 0)
{
throw new ValidationException("policyVersion must be provided and greater than zero.");
}
var normalizedMetadata = NormalizeMetadata(request.Metadata);
var inputs = request.Inputs ?? PolicyRunInputs.Empty;
var policyRequest = new PolicyRunRequest(
tenant.TenantId,
request.PolicyId,
PolicyRunMode.Simulate,
inputs,
request.Priority,
runId: null,
policyVersion: request.PolicyVersion,
requestedBy: actor,
queuedAt: null,
correlationId: request.CorrelationId,
metadata: normalizedMetadata);
var status = await policyRunService
.EnqueueAsync(tenant.TenantId, policyRequest, cancellationToken)
.ConfigureAwait(false);
var preview = new
{
candidates = inputs.Targets?.Count ?? 0,
estimatedRuns = inputs.Targets?.Count ?? 0,
message = "preview pending execution; actual diff will be available once job starts"
};
return Results.Created(
$"/api/v1/scheduler/policies/simulations/{status.RunId}",
new { simulation = new PolicySimulationResponse(status), preview });
}
catch (UnauthorizedAccessException ex)
{
return Results.Json(new { error = ex.Message }, statusCode: StatusCodes.Status401Unauthorized);
}
catch (InvalidOperationException ex)
{
return Results.Json(new { error = ex.Message }, statusCode: StatusCodes.Status403Forbidden);
}
catch (ValidationException ex)
{
return Results.BadRequest(new { error = ex.Message });
}
}
private static async Task<IResult> CancelSimulationAsync(
HttpContext httpContext,
string simulationId,


@@ -20,6 +20,7 @@ using StellaOps.Scheduler.WebService.Schedules;
using StellaOps.Scheduler.WebService.Options;
using StellaOps.Scheduler.WebService.PolicyRuns;
using StellaOps.Scheduler.WebService.PolicySimulations;
using StellaOps.Scheduler.WebService.VulnerabilityResolverJobs;
using StellaOps.Scheduler.WebService.Runs;
var builder = WebApplication.CreateBuilder(args);
@@ -98,6 +99,7 @@ else
builder.Services.AddSingleton<IPolicyRunService, InMemoryPolicyRunService>();
}
builder.Services.AddSingleton<IGraphJobCompletionPublisher, GraphJobEventPublisher>();
builder.Services.AddSingleton<IResolverJobService, InMemoryResolverJobService>();
if (cartographerOptions.Webhook.Enabled)
{
builder.Services.AddHttpClient<ICartographerWebhookClient, CartographerWebhookClient>((serviceProvider, client) =>
@@ -112,6 +114,7 @@ else
}
builder.Services.AddScoped<IGraphJobService, GraphJobService>();
builder.Services.AddImpactIndexStub();
builder.Services.AddResolverJobServices();
var schedulerOptions = builder.Configuration.GetSection("Scheduler").Get<SchedulerOptions>() ?? new SchedulerOptions();
schedulerOptions.Validate();
@@ -202,6 +205,7 @@ app.MapGet("/healthz", () => Results.Json(new { status = "ok" }));
app.MapGet("/readyz", () => Results.Json(new { status = "ready" }));
app.MapGraphJobEndpoints();
ResolverJobEndpointExtensions.MapResolverJobEndpoints(app);
app.MapScheduleEndpoints();
app.MapRunEndpoints();
app.MapPolicyRunEndpoints();


@@ -0,0 +1,10 @@
using System.ComponentModel.DataAnnotations;
namespace StellaOps.Scheduler.WebService.VulnerabilityResolverJobs;
public interface IResolverJobService
{
Task<ResolverJobResponse> CreateAsync(string tenantId, ResolverJobRequest request, CancellationToken cancellationToken);
Task<ResolverJobResponse?> GetAsync(string tenantId, string jobId, CancellationToken cancellationToken);
ResolverBacklogMetricsResponse ComputeMetrics(string tenantId);
}


@@ -0,0 +1,141 @@
using System.Collections.Concurrent;
using System.ComponentModel.DataAnnotations;
namespace StellaOps.Scheduler.WebService.VulnerabilityResolverJobs;
/// <summary>
/// Lightweight in-memory resolver job service to satisfy API contract and rate-limit callers.
/// Suitable for stub/air-gap scenarios; replace with Mongo-backed implementation when ready.
/// </summary>
public sealed class InMemoryResolverJobService : IResolverJobService
{
private readonly ConcurrentDictionary<string, ResolverJobResponse> _store = new(StringComparer.OrdinalIgnoreCase);
private readonly ConcurrentDictionary<string, List<DateTimeOffset>> _tenantCreates = new(StringComparer.OrdinalIgnoreCase);
private readonly TimeProvider _timeProvider;
private const int MaxJobsPerMinute = 60;
public InMemoryResolverJobService(TimeProvider? timeProvider = null)
{
_timeProvider = timeProvider ?? TimeProvider.System;
}
public Task<ResolverJobResponse> CreateAsync(string tenantId, ResolverJobRequest request, CancellationToken cancellationToken)
{
ArgumentException.ThrowIfNullOrWhiteSpace(tenantId);
ArgumentNullException.ThrowIfNull(request);
ValidateRequest(request);
EnforceRateLimit(tenantId);
var id = GenerateId(tenantId, request.ArtifactId, request.PolicyId);
var created = _timeProvider.GetUtcNow();
var response = new ResolverJobResponse(
id,
request.ArtifactId.Trim(),
request.PolicyId.Trim(),
"queued",
created,
CompletedAt: null,
request.CorrelationId,
request.Metadata ?? new Dictionary<string, string>());
_store[id] = response;
TrackCreate(tenantId, created);
return Task.FromResult(response);
}
public Task<ResolverJobResponse?> GetAsync(string tenantId, string jobId, CancellationToken cancellationToken)
{
ArgumentException.ThrowIfNullOrWhiteSpace(tenantId);
ArgumentException.ThrowIfNullOrWhiteSpace(jobId);
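// NOTE: tenantId is validated above but not used to scope the lookup; a job ID created by another tenant would be returned.
_store.TryGetValue(jobId, out var response);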
return Task.FromResult(response);
}
public ResolverBacklogMetricsResponse ComputeMetrics(string tenantId)
{
// NOTE: jobs are stored without a tenant key, so the counts below span every tenant held by this in-memory stub.
var pending = new List<ResolverJobResponse>();
var completed = new List<ResolverJobResponse>();
foreach (var job in _store.Values)
{
if (string.Equals(job.Status, "completed", StringComparison.OrdinalIgnoreCase))
{
completed.Add(job);
}
else
{
pending.Add(job);
}
}
var lagEntries = completed
.Where(j => j.CompletedAt is not null)
.Select(j => new ResolverLagEntry(
j.Id,
j.CompletedAt!.Value,
Math.Max((j.CompletedAt!.Value - j.CreatedAt).TotalSeconds, 0d),
j.CorrelationId,
j.ArtifactId,
j.PolicyId))
.OrderByDescending(e => e.CompletedAt)
.ToList();
return new ResolverBacklogMetricsResponse(
tenantId,
Pending: pending.Count,
Running: 0,
Completed: completed.Count,
Failed: 0,
MinLagSeconds: lagEntries.Count == 0 ? null : lagEntries.Min(e => (double?)e.LagSeconds),
MaxLagSeconds: lagEntries.Count == 0 ? null : lagEntries.Max(e => (double?)e.LagSeconds),
AverageLagSeconds: lagEntries.Count == 0 ? null : lagEntries.Average(e => e.LagSeconds),
RecentCompleted: lagEntries.Take(5).ToList());
}
private static void ValidateRequest(ResolverJobRequest request)
{
if (string.IsNullOrWhiteSpace(request.ArtifactId))
{
throw new ValidationException("artifactId is required.");
}
if (string.IsNullOrWhiteSpace(request.PolicyId))
{
throw new ValidationException("policyId is required.");
}
}
private static string GenerateId(string tenantId, string artifactId, string policyId)
{
var raw = $"{tenantId}:{artifactId}:{policyId}:{Guid.NewGuid():N}";
return "resolver-" + Convert.ToHexString(System.Security.Cryptography.SHA256.HashData(System.Text.Encoding.UTF8.GetBytes(raw))).ToLowerInvariant();
}
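private void EnforceRateLimit(string tenantId)
{
// NOTE: this check and the later TrackCreate call take the lock separately, so concurrent callers can briefly exceed MaxJobsPerMinute.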
var now = _timeProvider.GetUtcNow();
var cutoff = now.AddMinutes(-1);
var list = _tenantCreates.GetOrAdd(tenantId, static _ => new List<DateTimeOffset>());
lock (list)
{
list.RemoveAll(ts => ts < cutoff);
if (list.Count >= MaxJobsPerMinute)
{
throw new InvalidOperationException("resolver job rate limit exceeded");
}
}
}
private void TrackCreate(string tenantId, DateTimeOffset timestamp)
{
var list = _tenantCreates.GetOrAdd(tenantId, static _ => new List<DateTimeOffset>());
lock (list)
{
list.Add(timestamp);
}
}
}
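
The per-tenant limit above is a sliding one-minute window: stale timestamps are pruned, the remaining count is compared with the limit, and the new creation is recorded. A minimal sketch of the same idea follows, with the check and the record done under a single lock so they cannot interleave. `SlidingWindowLimiter` and `TryAcquire` are hypothetical names, and the current time is passed in explicitly here instead of coming from a `TimeProvider`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a per-tenant sliding-window limiter; not the committed implementation.
public sealed class SlidingWindowLimiter
{
    private readonly Dictionary<string, Queue<DateTimeOffset>> _hits = new(StringComparer.OrdinalIgnoreCase);
    private readonly object _gate = new();
    private readonly int _limit;
    private readonly TimeSpan _window;

    public SlidingWindowLimiter(int limit, TimeSpan window)
    {
        _limit = limit;
        _window = window;
    }

    // Returns true and records the hit when the tenant is under its limit; otherwise returns false.
    public bool TryAcquire(string tenantId, DateTimeOffset now)
    {
        lock (_gate)
        {
            if (!_hits.TryGetValue(tenantId, out var queue))
            {
                queue = new Queue<DateTimeOffset>();
                _hits[tenantId] = queue;
            }

            // Drop timestamps strictly older than the window start.
            var cutoff = now - _window;
            while (queue.Count > 0 && queue.Peek() < cutoff)
            {
                queue.Dequeue();
            }

            if (queue.Count >= _limit)
            {
                return false;
            }

            queue.Enqueue(now);
            return true;
        }
    }
}
```

Returning a boolean leaves the choice of exception or HTTP status to the caller.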


@@ -0,0 +1,28 @@
using Microsoft.Extensions.Logging;
namespace StellaOps.Scheduler.WebService.VulnerabilityResolverJobs;
public interface IResolverBacklogNotifier
{
void NotifyIfBreached(ResolverBacklogMetricsResponse metrics);
}
internal sealed class LoggingResolverBacklogNotifier : IResolverBacklogNotifier
{
private readonly ILogger<LoggingResolverBacklogNotifier> _logger;
private readonly int _threshold;
public LoggingResolverBacklogNotifier(ILogger<LoggingResolverBacklogNotifier> logger, int threshold = 100)
{
_logger = logger;
_threshold = threshold;
}
public void NotifyIfBreached(ResolverBacklogMetricsResponse metrics)
{
if (metrics.Pending > _threshold)
{
_logger.LogWarning("resolver backlog threshold exceeded: {Pending} pending (threshold {Threshold})", metrics.Pending, _threshold);
}
}
}


@@ -0,0 +1,51 @@
using System.Collections.Immutable;
using StellaOps.Scheduler.Queue;
namespace StellaOps.Scheduler.WebService.VulnerabilityResolverJobs;
internal interface IResolverBacklogService
{
ResolverBacklogSummary GetSummary();
}
internal sealed class ResolverBacklogService : IResolverBacklogService
{
private readonly TimeProvider _timeProvider;
public ResolverBacklogService(TimeProvider? timeProvider = null)
{
_timeProvider = timeProvider ?? TimeProvider.System;
}
public ResolverBacklogSummary GetSummary()
{
var samples = SchedulerQueueMetrics.CaptureDepthSamples();
if (samples.Count == 0)
{
return new ResolverBacklogSummary(_timeProvider.GetUtcNow(), 0, 0, ImmutableArray<ResolverBacklogEntry>.Empty);
}
long total = 0;
long max = 0;
var builder = ImmutableArray.CreateBuilder<ResolverBacklogEntry>(samples.Count);
foreach (var sample in samples)
{
total += sample.Depth;
if (sample.Depth > max)
{
max = sample.Depth;
}
builder.Add(new ResolverBacklogEntry(sample.Transport, sample.Queue, sample.Depth));
}
return new ResolverBacklogSummary(_timeProvider.GetUtcNow(), total, max, builder.ToImmutable());
}
}
public sealed record ResolverBacklogSummary(
DateTimeOffset ObservedAt,
long TotalDepth,
long MaxDepth,
IReadOnlyList<ResolverBacklogEntry> Queues);
public sealed record ResolverBacklogEntry(string Transport, string Queue, long Depth);


@@ -0,0 +1,101 @@
using System.ComponentModel.DataAnnotations;
using Microsoft.AspNetCore.Mvc;
using StellaOps.Auth.Abstractions;
using StellaOps.Scheduler.WebService.Auth;
namespace StellaOps.Scheduler.WebService.VulnerabilityResolverJobs;
public static class ResolverJobEndpointExtensions
{
private const string ScopeWrite = StellaOpsScopes.EffectiveWrite;
private const string ScopeRead = StellaOpsScopes.FindingsRead;
public static void MapResolverJobEndpoints(this IEndpointRouteBuilder builder)
{
var group = builder.MapGroup("/api/v1/scheduler/vuln/resolver");
group.MapPost("/jobs", CreateJobAsync);
group.MapGet("/jobs/{jobId}", GetJobAsync);
group.MapGet("/metrics", GetLagMetricsAsync);
}
internal static async Task<IResult> CreateJobAsync(
[FromBody] ResolverJobRequest request,
HttpContext httpContext,
[FromServices] ITenantContextAccessor tenantAccessor,
[FromServices] IScopeAuthorizer authorizer,
[FromServices] IResolverJobService jobService,
CancellationToken cancellationToken)
{
try
{
authorizer.EnsureScope(httpContext, ScopeWrite);
var tenant = tenantAccessor.GetTenant(httpContext);
var job = await jobService.CreateAsync(tenant.TenantId, request, cancellationToken).ConfigureAwait(false);
return Results.Created($"/api/v1/scheduler/vuln/resolver/jobs/{job.Id}", job);
}
catch (UnauthorizedAccessException ex)
{
return Results.Json(new { error = ex.Message }, statusCode: StatusCodes.Status401Unauthorized);
}
catch (InvalidOperationException ex)
{
return Results.Json(new { error = ex.Message }, statusCode: StatusCodes.Status403Forbidden);
}
catch (ValidationException ex)
{
return Results.Json(new { error = ex.Message }, statusCode: StatusCodes.Status400BadRequest);
}
}
internal static async Task<IResult> GetJobAsync(
string jobId,
HttpContext httpContext,
[FromServices] ITenantContextAccessor tenantAccessor,
[FromServices] IScopeAuthorizer authorizer,
[FromServices] IResolverJobService jobService,
CancellationToken cancellationToken)
{
try
{
authorizer.EnsureScope(httpContext, ScopeRead);
var tenant = tenantAccessor.GetTenant(httpContext);
var job = await jobService.GetAsync(tenant.TenantId, jobId, cancellationToken).ConfigureAwait(false);
return job is null ? Results.NotFound() : Results.Ok(job);
}
catch (UnauthorizedAccessException ex)
{
return Results.Json(new { error = ex.Message }, statusCode: StatusCodes.Status401Unauthorized);
}
catch (InvalidOperationException ex)
{
return Results.Json(new { error = ex.Message }, statusCode: StatusCodes.Status403Forbidden);
}
}
internal static IResult GetLagMetricsAsync(
HttpContext httpContext,
[FromServices] ITenantContextAccessor tenantAccessor,
[FromServices] IScopeAuthorizer authorizer,
[FromServices] IResolverJobService jobService,
[FromServices] IResolverBacklogService backlogService,
[FromServices] IResolverBacklogNotifier backlogNotifier)
{
try
{
authorizer.EnsureScope(httpContext, ScopeRead);
var tenant = tenantAccessor.GetTenant(httpContext);
var metrics = jobService.ComputeMetrics(tenant.TenantId);
var backlog = backlogService.GetSummary();
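// Clamp before narrowing so a queue depth above int.MaxValue cannot wrap to a negative Pending value.
backlogNotifier.NotifyIfBreached(metrics with { Pending = (int)Math.Min(backlog.TotalDepth, int.MaxValue) });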
return Results.Ok(new { jobs = metrics, backlog });
}
catch (UnauthorizedAccessException ex)
{
return Results.Json(new { error = ex.Message }, statusCode: StatusCodes.Status401Unauthorized);
}
catch (InvalidOperationException ex)
{
return Results.Json(new { error = ex.Message }, statusCode: StatusCodes.Status403Forbidden);
}
}
}
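
All three handlers above repeat the same exception-to-status mapping. One way to make that mapping explicit and unit-testable is sketched below. `EndpointErrorMapper` is a hypothetical name, and the `500` fallback is an assumption: the handlers in this commit let other exception types propagate.

```csharp
using System;
using System.ComponentModel.DataAnnotations;

// Hypothetical helper that mirrors the catch blocks in the resolver job endpoints.
public static class EndpointErrorMapper
{
    public static int ToStatusCode(Exception exception) => exception switch
    {
        UnauthorizedAccessException => 401, // missing or invalid credentials or scope
        ValidationException => 400,         // malformed request body
        InvalidOperationException => 403,   // also covers the in-memory rate-limit failure
        _ => 500,                           // assumption: not handled in the original code
    };
}
```

Because the in-memory service signals a rate-limit breach with `InvalidOperationException`, that case surfaces as 403 under this mapping; whether a dedicated exception type and a 429 response would suit callers better is a design choice this commit leaves open.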


@@ -0,0 +1,44 @@
using System.ComponentModel.DataAnnotations;
using System.Text.Json.Serialization;
namespace StellaOps.Scheduler.WebService.VulnerabilityResolverJobs;
public sealed record ResolverJobRequest(
[property: Required]
string ArtifactId,
[property: Required]
string PolicyId,
string? CorrelationId = null,
IReadOnlyDictionary<string, string>? Metadata = null)
{
public ResolverJobRequest() : this(string.Empty, string.Empty, null, null) { }
}
public sealed record ResolverJobResponse(
string Id,
string ArtifactId,
string PolicyId,
string Status,
DateTimeOffset CreatedAt,
DateTimeOffset? CompletedAt,
string? CorrelationId,
IReadOnlyDictionary<string, string>? Metadata);
public sealed record ResolverBacklogMetricsResponse(
string TenantId,
int Pending,
int Running,
int Completed,
int Failed,
double? MinLagSeconds,
double? MaxLagSeconds,
double? AverageLagSeconds,
IReadOnlyList<ResolverLagEntry> RecentCompleted);
public sealed record ResolverLagEntry(
string JobId,
DateTimeOffset CompletedAt,
double LagSeconds,
string? CorrelationId,
string? ArtifactId,
string? PolicyId);
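
The lag fields on `ResolverBacklogMetricsResponse` are derived from the gap between `CreatedAt` and `CompletedAt` for completed jobs, clamped at zero, and are null when nothing has completed. A minimal sketch of that calculation, using hypothetical `LagSummary` and `LagCalculator` types in place of the response records:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Min/Max/Average lag fields of the metrics response.
public sealed record LagSummary(double? Min, double? Max, double? Average);

public static class LagCalculator
{
    public static LagSummary Summarize(IEnumerable<(DateTimeOffset CreatedAt, DateTimeOffset? CompletedAt)> jobs)
    {
        // Only completed jobs contribute; negative gaps (clock skew) are clamped to zero.
        var lags = jobs
            .Where(job => job.CompletedAt.HasValue)
            .Select(job => Math.Max((job.CompletedAt.Value - job.CreatedAt).TotalSeconds, 0d))
            .ToList();

        // With no completed jobs there is no meaningful lag, so every field stays null.
        return lags.Count == 0
            ? new LagSummary(null, null, null)
            : new LagSummary(lags.Min(), lags.Max(), lags.Average());
    }
}
```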


@@ -0,0 +1,16 @@
using Microsoft.AspNetCore.Routing;
using Microsoft.Extensions.DependencyInjection;
namespace StellaOps.Scheduler.WebService.VulnerabilityResolverJobs;
public static class ResolverJobServiceCollectionExtensions
{
public static IServiceCollection AddResolverJobServices(this IServiceCollection services)
{
services.AddSingleton<IResolverJobService, InMemoryResolverJobService>();
services.AddSingleton<IResolverBacklogService, ResolverBacklogService>();
services.AddSingleton<IResolverBacklogNotifier, LoggingResolverBacklogNotifier>();
return services;
}
}


@@ -107,21 +107,78 @@ public sealed class FixtureImpactIndex : IImpactIndex
return CreateImpactSet(state, selector, Enumerable.Empty<FixtureMatch>(), usageOnly);
}
public async ValueTask<ImpactSet> ResolveAllAsync(
Selector selector,
bool usageOnly,
CancellationToken cancellationToken = default)
{
public async ValueTask<ImpactSet> ResolveAllAsync(
Selector selector,
bool usageOnly,
CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(selector);
var state = await EnsureInitializedAsync(cancellationToken).ConfigureAwait(false);
var matches = state.ImagesByDigest.Values
.Select(image => new FixtureMatch(image, image.UsedByEntrypoint))
.Where(match => !usageOnly || match.UsedByEntrypoint);
return CreateImpactSet(state, selector, matches, usageOnly);
}
var matches = state.ImagesByDigest.Values
.Select(image => new FixtureMatch(image, image.UsedByEntrypoint))
.Where(match => !usageOnly || match.UsedByEntrypoint);
return CreateImpactSet(state, selector, matches, usageOnly);
}
public ValueTask RemoveAsync(string imageDigest, CancellationToken cancellationToken = default)
{
// Fixture-backed index is immutable; removals are ignored.
return ValueTask.CompletedTask;
}
public async ValueTask<ImpactIndexSnapshot> CreateSnapshotAsync(CancellationToken cancellationToken = default)
{
var state = await EnsureInitializedAsync(cancellationToken).ConfigureAwait(false);
var images = state.ImagesByDigest.Values
.OrderBy(image => image.Digest, StringComparer.OrdinalIgnoreCase)
.Select((image, index) => new ImpactImageRecord(
index,
"fixture",
image.Digest,
image.Registry,
image.Repository,
image.Namespaces,
image.Tags,
image.Labels,
image.GeneratedAt,
image.Components.Select(c => c.Purl).ToImmutableArray(),
image.Components.Where(c => c.UsedByEntrypoint).Select(c => c.Purl).ToImmutableArray()))
.ToImmutableArray();
var contains = images
.SelectMany(img => img.Components.Select(purl => (purl, img.ImageId)))
.GroupBy(pair => pair.purl, StringComparer.OrdinalIgnoreCase)
.ToImmutableDictionary(
g => g.Key,
g => g.Select(p => p.ImageId).Distinct().OrderBy(id => id).ToImmutableArray(),
StringComparer.OrdinalIgnoreCase);
var usedBy = images
.SelectMany(img => img.EntrypointComponents.Select(purl => (purl, img.ImageId)))
.GroupBy(pair => pair.purl, StringComparer.OrdinalIgnoreCase)
.ToImmutableDictionary(
g => g.Key,
g => g.Select(p => p.ImageId).Distinct().OrderBy(id => id).ToImmutableArray(),
StringComparer.OrdinalIgnoreCase);
return new ImpactIndexSnapshot(
state.GeneratedAt,
state.SnapshotId,
images,
contains,
usedBy);
}
public ValueTask RestoreSnapshotAsync(ImpactIndexSnapshot snapshot, CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(snapshot);
// Fixture index remains immutable; restoration is a no-op.
return ValueTask.CompletedTask;
}
private async Task<FixtureIndexState> EnsureInitializedAsync(CancellationToken cancellationToken)
{
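
`CreateSnapshotAsync` in the fixture index builds two inverted maps, from purl to the sorted list of image IDs that contain it. The grouping step can be sketched in isolation as follows. `InvertedIndexSketch` is a hypothetical name, and a plain tuple stands in for `ImpactImageRecord`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of the purl -> image IDs grouping used when building a snapshot.
public static class InvertedIndexSketch
{
    public static SortedDictionary<string, int[]> Build(IEnumerable<(int ImageId, string[] Purls)> images)
    {
        var index = new SortedDictionary<string, int[]>(StringComparer.OrdinalIgnoreCase);

        // Flatten to (purl, imageId) pairs, then group case-insensitively by purl.
        var groups = images
            .SelectMany(image => image.Purls.Select(purl => (Purl: purl, image.ImageId)))
            .GroupBy(pair => pair.Purl, StringComparer.OrdinalIgnoreCase);

        foreach (var group in groups)
        {
            // Distinct + ascending order keeps the output deterministic.
            index[group.Key] = group.Select(pair => pair.ImageId).Distinct().OrderBy(id => id).ToArray();
        }

        return index;
    }
}
```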


@@ -39,8 +39,29 @@ public interface IImpactIndex
/// <param name="selector">Selector scoping the query.</param>
/// <param name="usageOnly">When true, restricts results to images with entrypoint usage.</param>
/// <param name="cancellationToken">Cancellation token.</param>
ValueTask<ImpactSet> ResolveAllAsync(
Selector selector,
bool usageOnly,
CancellationToken cancellationToken = default);
}
ValueTask<ImpactSet> ResolveAllAsync(
Selector selector,
bool usageOnly,
CancellationToken cancellationToken = default);
/// <summary>
/// Removes an image digest and its component mappings from the index.
/// Used when an image is deleted or aged out.
/// </summary>
ValueTask RemoveAsync(
string imageDigest,
CancellationToken cancellationToken = default);
/// <summary>
/// Creates a compacted snapshot of the index for persistence (e.g., RocksDB/Redis).
/// </summary>
ValueTask<ImpactIndexSnapshot> CreateSnapshotAsync(
CancellationToken cancellationToken = default);
/// <summary>
/// Restores index state from a previously persisted snapshot.
/// </summary>
ValueTask RestoreSnapshotAsync(
ImpactIndexSnapshot snapshot,
CancellationToken cancellationToken = default);
}


@@ -3,7 +3,7 @@ using System.Collections.Immutable;
namespace StellaOps.Scheduler.ImpactIndex;
internal sealed record ImpactImageRecord(
public sealed record ImpactImageRecord(
int ImageId,
string TenantId,
string Digest,


@@ -0,0 +1,37 @@
using System.Collections.Immutable;
using System.Text.Json;
using System.Text.Json.Serialization;
namespace StellaOps.Scheduler.ImpactIndex;
/// <summary>
/// Serializable snapshot for persisting the ImpactIndex (e.g., RocksDB/Redis).
/// Contains compacted image IDs and per-purl bitmap membership.
/// </summary>
public sealed record ImpactIndexSnapshot(
DateTimeOffset GeneratedAt,
string SnapshotId,
ImmutableArray<ImpactImageRecord> Images,
ImmutableDictionary<string, ImmutableArray<int>> ContainsByPurl,
ImmutableDictionary<string, ImmutableArray<int>> UsedByEntrypointByPurl)
{
public static byte[] ToBytes(ImpactIndexSnapshot snapshot)
{
var options = SerializerOptions;
return JsonSerializer.SerializeToUtf8Bytes(snapshot, options);
}
public static ImpactIndexSnapshot FromBytes(ReadOnlySpan<byte> payload)
{
var options = SerializerOptions;
var snapshot = JsonSerializer.Deserialize<ImpactIndexSnapshot>(payload, options);
return snapshot ?? throw new InvalidOperationException("ImpactIndexSnapshot payload could not be deserialized.");
}
private static readonly JsonSerializerOptions SerializerOptions = new()
{
PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
WriteIndented = false,
DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull
};
}
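
`ToBytes` and `FromBytes` above form a JSON round trip with camelCase property names. A reduced sketch of the same pattern follows. `MiniSnapshot` and `MiniSnapshotCodec` are hypothetical, and plain arrays and a `Dictionary` stand in for the immutable collections of the real record.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical, reduced stand-in for ImpactIndexSnapshot.
public sealed record MiniSnapshot(string SnapshotId, int[] ImageIds, Dictionary<string, int[]> ContainsByPurl);

public static class MiniSnapshotCodec
{
    // Shared options so that writer and reader agree on naming.
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
        WriteIndented = false,
    };

    public static byte[] ToBytes(MiniSnapshot snapshot) =>
        JsonSerializer.SerializeToUtf8Bytes(snapshot, Options);

    public static MiniSnapshot FromBytes(ReadOnlySpan<byte> payload) =>
        JsonSerializer.Deserialize<MiniSnapshot>(payload, Options)
        ?? throw new InvalidOperationException("payload could not be deserialized");
}
```

The naming policy applies to property names only; dictionary keys such as purls are written unchanged unless a `DictionaryKeyPolicy` is also set.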


@@ -23,16 +23,41 @@ public sealed class RoaringImpactIndex : IImpactIndex
private readonly Dictionary<string, int> _imageIds = new(StringComparer.OrdinalIgnoreCase);
private readonly Dictionary<int, ImpactImageRecord> _images = new();
private readonly Dictionary<string, RoaringBitmap> _containsByPurl = new(StringComparer.OrdinalIgnoreCase);
private readonly Dictionary<string, RoaringBitmap> _usedByEntrypointByPurl = new(StringComparer.OrdinalIgnoreCase);
private readonly Dictionary<string, RoaringBitmap> _usedByEntrypointByPurl = new(StringComparer.OrdinalIgnoreCase);
private readonly ILogger<RoaringImpactIndex> _logger;
private readonly TimeProvider _timeProvider;
private string? _snapshotId;
private readonly ILogger<RoaringImpactIndex> _logger;
private readonly TimeProvider _timeProvider;
public RoaringImpactIndex(ILogger<RoaringImpactIndex> logger, TimeProvider? timeProvider = null)
{
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
_timeProvider = timeProvider ?? TimeProvider.System;
}
public RoaringImpactIndex(ILogger<RoaringImpactIndex> logger, TimeProvider? timeProvider = null)
{
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
_timeProvider = timeProvider ?? TimeProvider.System;
}
public ValueTask RemoveAsync(string imageDigest, CancellationToken cancellationToken = default)
{
ArgumentException.ThrowIfNullOrWhiteSpace(imageDigest);
lock (_gate)
{
if (!_imageIds.TryGetValue(imageDigest, out var imageId))
{
return ValueTask.CompletedTask;
}
if (_images.TryGetValue(imageId, out var record))
{
RemoveImageComponents(record);
_images.Remove(imageId);
}
_imageIds.Remove(imageDigest);
_snapshotId = null;
}
return ValueTask.CompletedTask;
}
public async Task IngestAsync(ImpactIndexIngestionRequest request, CancellationToken cancellationToken = default)
{
@@ -130,11 +155,108 @@ public sealed class RoaringImpactIndex : IImpactIndex
CancellationToken cancellationToken = default)
=> ValueTask.FromResult(CreateEmptyImpactSet(selector, usageOnly));
public ValueTask<ImpactSet> ResolveAllAsync(
Selector selector,
bool usageOnly,
CancellationToken cancellationToken = default)
=> ValueTask.FromResult(ResolveAllCore(selector, usageOnly));
public ValueTask<ImpactSet> ResolveAllAsync(
Selector selector,
bool usageOnly,
CancellationToken cancellationToken = default)
=> ValueTask.FromResult(ResolveAllCore(selector, usageOnly));
public ValueTask<ImpactIndexSnapshot> CreateSnapshotAsync(CancellationToken cancellationToken = default)
{
cancellationToken.ThrowIfCancellationRequested();
lock (_gate)
{
var orderedImages = _images
.Values
.OrderBy(img => img.Digest, StringComparer.OrdinalIgnoreCase)
.ThenBy(img => img.Repository, StringComparer.OrdinalIgnoreCase)
.ToArray();
var idMap = orderedImages
.Select((image, index) => (image.ImageId, NewId: index))
.ToDictionary(tuple => tuple.ImageId, tuple => tuple.NewId);
var compactedImages = orderedImages
.Select(image => image with { ImageId = idMap[image.ImageId] })
.ToImmutableArray();
ImmutableDictionary<string, ImmutableArray<int>> CompactBitmaps(Dictionary<string, RoaringBitmap> source)
{
var builder = ImmutableDictionary.CreateBuilder<string, ImmutableArray<int>>(StringComparer.OrdinalIgnoreCase);
foreach (var (key, bitmap) in source)
{
var remapped = bitmap
.Select(id => idMap.TryGetValue(id, out var newId) ? newId : (int?)null)
.Where(id => id.HasValue)
.Select(id => id!.Value)
.Distinct()
.OrderBy(id => id)
.ToImmutableArray();
if (remapped.Length > 0)
{
builder[key] = remapped;
}
}
return builder.ToImmutable();
}
var contains = CompactBitmaps(_containsByPurl);
var usedBy = CompactBitmaps(_usedByEntrypointByPurl);
var generatedAt = orderedImages.Length == 0
? _timeProvider.GetUtcNow()
: orderedImages.Max(img => img.GeneratedAt);
var snapshotId = ComputeSnapshotId(compactedImages, contains, usedBy);
_snapshotId = snapshotId;
var snapshot = new ImpactIndexSnapshot(
generatedAt,
snapshotId,
compactedImages,
contains,
usedBy);
return ValueTask.FromResult(snapshot);
}
}
public ValueTask RestoreSnapshotAsync(ImpactIndexSnapshot snapshot, CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(snapshot);
cancellationToken.ThrowIfCancellationRequested();
lock (_gate)
{
_images.Clear();
_imageIds.Clear();
_containsByPurl.Clear();
_usedByEntrypointByPurl.Clear();
foreach (var image in snapshot.Images)
{
_images[image.ImageId] = image;
_imageIds[image.Digest] = image.ImageId;
}
foreach (var kvp in snapshot.ContainsByPurl)
{
_containsByPurl[kvp.Key] = RoaringBitmap.Create(kvp.Value.ToArray());
}
foreach (var kvp in snapshot.UsedByEntrypointByPurl)
{
_usedByEntrypointByPurl[kvp.Key] = RoaringBitmap.Create(kvp.Value.ToArray());
}
_snapshotId = snapshot.SnapshotId;
}
return ValueTask.CompletedTask;
}
private ImpactSet ResolveByPurlsCore(IEnumerable<string> purls, bool usageOnly, Selector selector)
{
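
The snapshot path in `RoaringImpactIndex` does two things that are easier to see in isolation: it renumbers sparse image IDs into a dense range ordered by digest, and it derives the snapshot ID from a canonical rendering of the content. The sketch below illustrates both under simplifying assumptions. `SnapshotSketch` and its members are hypothetical, plain collections replace `RoaringBitmap`, and the hashed string omits the `GeneratedAt` component that the real `ComputeSnapshotId` includes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical sketch of ID compaction and content-derived snapshot IDs.
public static class SnapshotSketch
{
    // Map sparse image IDs to dense 0..n-1 IDs, ordered by digest so the mapping is stable.
    public static Dictionary<int, int> BuildIdMap(IReadOnlyDictionary<int, string> digestById) =>
        digestById
            .OrderBy(pair => pair.Value, StringComparer.OrdinalIgnoreCase)
            .Select((pair, index) => (OldId: pair.Key, NewId: index))
            .ToDictionary(t => t.OldId, t => t.NewId);

    // Rewrite a membership list with the new IDs, dropping IDs that no longer exist.
    public static int[] Remap(IEnumerable<int> members, IReadOnlyDictionary<int, int> idMap) =>
        members
            .Where(id => idMap.ContainsKey(id))
            .Select(id => idMap[id])
            .Distinct()
            .OrderBy(id => id)
            .ToArray();

    // Hash a canonical, sorted rendering so equal content always yields the same ID.
    public static string ComputeId(IEnumerable<string> digests, IReadOnlyDictionary<string, int[]> membership)
    {
        var builder = new StringBuilder();
        foreach (var digest in digests.OrderBy(d => d, StringComparer.OrdinalIgnoreCase))
        {
            builder.Append(digest).Append(';');
        }
        foreach (var pair in membership.OrderBy(p => p.Key, StringComparer.OrdinalIgnoreCase))
        {
            builder.Append(pair.Key).Append('=').Append(string.Join(",", pair.Value)).Append('|');
        }
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(builder.ToString()));
        return "snap-" + Convert.ToHexString(hash).ToLowerInvariant();
    }
}
```

Because the input is sorted before hashing, two indexes with the same content produce the same ID regardless of insertion order.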
@@ -231,27 +353,27 @@ public sealed class RoaringImpactIndex : IImpactIndex
var generatedAt = latestGeneratedAt == DateTimeOffset.MinValue ? _timeProvider.GetUtcNow() : latestGeneratedAt;
return new ImpactSet(
selector,
images.ToImmutableArray(),
usageOnly,
generatedAt,
images.Count,
snapshotId: null,
schemaVersion: SchedulerSchemaVersions.ImpactSet);
}
return new ImpactSet(
selector,
images.ToImmutableArray(),
usageOnly,
generatedAt,
images.Count,
snapshotId: _snapshotId,
schemaVersion: SchedulerSchemaVersions.ImpactSet);
}
private ImpactSet CreateEmptyImpactSet(Selector selector, bool usageOnly)
{
return new ImpactSet(
selector,
ImmutableArray<ImpactImage>.Empty,
usageOnly,
_timeProvider.GetUtcNow(),
0,
snapshotId: null,
schemaVersion: SchedulerSchemaVersions.ImpactSet);
}
return new ImpactSet(
selector,
ImmutableArray<ImpactImage>.Empty,
usageOnly,
_timeProvider.GetUtcNow(),
0,
snapshotId: _snapshotId,
schemaVersion: SchedulerSchemaVersions.ImpactSet);
}
private static bool ImageMatchesSelector(ImpactImageRecord image, Selector selector)
{
@@ -403,22 +525,54 @@ public sealed class RoaringImpactIndex : IImpactIndex
return RoaringBitmap.Create(remaining);
}
private static bool MatchesScope(ImpactImageRecord image, Selector selector)
{
return selector.Scope switch
{
SelectorScope.AllImages => true,
SelectorScope.ByDigest => selector.Digests.Contains(image.Digest, StringComparer.OrdinalIgnoreCase),
SelectorScope.ByRepository => selector.Repositories.Any(repo =>
string.Equals(repo, image.Repository, StringComparison.OrdinalIgnoreCase) ||
string.Equals(repo, $"{image.Registry}/{image.Repository}", StringComparison.OrdinalIgnoreCase)),
SelectorScope.ByNamespace => !image.Namespaces.IsDefaultOrEmpty && selector.Namespaces.Any(ns => image.Namespaces.Contains(ns, StringComparer.OrdinalIgnoreCase)),
SelectorScope.ByLabels => selector.Labels.All(label =>
image.Labels.TryGetValue(label.Key, out var value) &&
(label.Values.Length == 0 || label.Values.Contains(value, StringComparer.OrdinalIgnoreCase))),
_ => true,
};
}
private static string ComputeSnapshotId(
ImmutableArray<ImpactImageRecord> images,
ImmutableDictionary<string, ImmutableArray<int>> contains,
ImmutableDictionary<string, ImmutableArray<int>> usedBy)
{
var builder = new StringBuilder();
foreach (var image in images.OrderBy(img => img.Digest, StringComparer.OrdinalIgnoreCase))
{
builder.Append(image.Digest).Append('|').Append(image.GeneratedAt.ToUnixTimeSeconds()).Append(';');
}
void AppendMap(ImmutableDictionary<string, ImmutableArray<int>> map)
{
foreach (var kvp in map.OrderBy(pair => pair.Key, StringComparer.OrdinalIgnoreCase))
{
builder.Append(kvp.Key).Append('=');
foreach (var id in kvp.Value)
{
builder.Append(id).Append(',');
}
builder.Append('|');
}
}
AppendMap(contains);
AppendMap(usedBy);
var hash = SHA256.HashData(Encoding.UTF8.GetBytes(builder.ToString()));
return "snap-" + Convert.ToHexString(hash).ToLowerInvariant();
}
private static bool MatchesTagPattern(string tag, string pattern)
{
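Review note on `ComputeSnapshotId`: the identifier is a SHA-256 digest over a canonical string, and sorting the keys before hashing is what makes it independent of dictionary enumeration order. The sketch below isolates that property. `ComputeId` and the sample purls are hypothetical stand-ins, not code from this commit, and the sketch hashes only one map, whereas the real method also folds in image digests and `GeneratedAt` values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper mirroring the AppendMap + hash steps of ComputeSnapshotId.
static string ComputeId(IEnumerable<KeyValuePair<string, int[]>> map)
{
    var builder = new StringBuilder();

    // Sort keys so the canonical string does not depend on enumeration order.
    foreach (var kvp in map.OrderBy(pair => pair.Key, StringComparer.OrdinalIgnoreCase))
    {
        builder.Append(kvp.Key).Append('=');
        foreach (var id in kvp.Value)
        {
            builder.Append(id).Append(',');
        }

        builder.Append('|');
    }

    var hash = SHA256.HashData(Encoding.UTF8.GetBytes(builder.ToString()));
    return "snap-" + Convert.ToHexString(hash).ToLowerInvariant();
}

// Same content, inserted in a different order.
var first = new Dictionary<string, int[]>
{
    ["pkg:npm/a@1.0.0"] = new[] { 0, 1 },
    ["pkg:npm/b@2.0.0"] = new[] { 1 },
};
var second = new Dictionary<string, int[]>
{
    ["pkg:npm/b@2.0.0"] = new[] { 1 },
    ["pkg:npm/a@1.0.0"] = new[] { 0, 1 },
};

Console.WriteLine(ComputeId(first) == ComputeId(second)); // True
```

One thing that may be worth checking: `OrderBy` is a stable sort, so two keys that differ only by case compare as equal under `OrdinalIgnoreCase` and keep their input order. If such keys can occur, the resulting identifier could still vary with enumeration order.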


@@ -71,11 +71,13 @@ internal sealed class GraphBuildExecutionService
return GraphBuildExecutionResult.Skipped(job, "transition_invalid");
}
if (!await _repository.TryReplaceAsync(running, job.Status, cancellationToken).ConfigureAwait(false))
{
_metrics.RecordGraphJobResult("build", "skipped");
return GraphBuildExecutionResult.Skipped(job, "concurrency_conflict");
}
_metrics.RecordGraphJobStart("build", running.TenantId, running.GraphSnapshotId ?? running.SbomId);
var attempt = 0;
CartographerBuildResult? lastResult = null;
@@ -114,9 +116,11 @@ internal sealed class GraphBuildExecutionService
{
var completionTime = _timeProvider.GetUtcNow();
await NotifyCompletionAsync(running, GraphJobStatus.Completed, completionTime, response.GraphSnapshotId, response.ResultUri, response.Error, cancellationToken).ConfigureAwait(false);
_metrics.RecordGraphJobResult("build", "completed", completionTime - running.CreatedAt);
return GraphBuildExecutionResult.Completed(running, response.ResultUri);
}
var duration = completionTime - running.CreatedAt;
_metrics.RecordGraphJobResult("build", "completed", duration);
_metrics.RecordGraphJobCompletion("build", running.TenantId, running.GraphSnapshotId ?? running.SbomId, "completed", duration);
return GraphBuildExecutionResult.Completed(running, response.ResultUri);
}
if (response.Status == GraphJobStatus.Failed)
{
@@ -124,9 +128,11 @@ internal sealed class GraphBuildExecutionService
{
var completionTime = _timeProvider.GetUtcNow();
await NotifyCompletionAsync(running, GraphJobStatus.Failed, completionTime, response.GraphSnapshotId, response.ResultUri, response.Error, cancellationToken).ConfigureAwait(false);
_metrics.RecordGraphJobResult("build", "failed", completionTime - running.CreatedAt);
return GraphBuildExecutionResult.Failed(running, response.Error);
}
var duration = completionTime - running.CreatedAt;
_metrics.RecordGraphJobResult("build", "failed", duration);
_metrics.RecordGraphJobCompletion("build", running.TenantId, running.GraphSnapshotId ?? running.SbomId, "failed", duration);
return GraphBuildExecutionResult.Failed(running, response.Error);
}
_logger.LogWarning(
"Cartographer build attempt {Attempt} failed for job {JobId}; retrying in {Delay} (reason: {Reason}).",
@@ -144,9 +150,11 @@ internal sealed class GraphBuildExecutionService
{
var completionTime = _timeProvider.GetUtcNow();
await NotifyCompletionAsync(running, GraphJobStatus.Failed, completionTime, response.GraphSnapshotId, response.ResultUri, response.Error ?? "Cartographer did not complete the build.", cancellationToken).ConfigureAwait(false);
_metrics.RecordGraphJobResult("build", "failed", completionTime - running.CreatedAt);
return GraphBuildExecutionResult.Failed(running, response.Error);
}
var duration = completionTime - running.CreatedAt;
_metrics.RecordGraphJobResult("build", "failed", duration);
_metrics.RecordGraphJobCompletion("build", running.TenantId, running.GraphSnapshotId ?? running.SbomId, "failed", duration);
return GraphBuildExecutionResult.Failed(running, response.Error);
}
await Task.Delay(backoff, cancellationToken).ConfigureAwait(false);
}
@@ -170,9 +178,11 @@ internal sealed class GraphBuildExecutionService
var error = lastResult?.Error ?? lastException?.Message ?? "Cartographer build failed";
var finalTime = _timeProvider.GetUtcNow();
await NotifyCompletionAsync(running, GraphJobStatus.Failed, finalTime, lastResult?.GraphSnapshotId ?? running.GraphSnapshotId, lastResult?.ResultUri, error, cancellationToken).ConfigureAwait(false);
_metrics.RecordGraphJobResult("build", "failed", finalTime - running.CreatedAt);
return GraphBuildExecutionResult.Failed(running, error);
}
var finalDuration = finalTime - running.CreatedAt;
_metrics.RecordGraphJobResult("build", "failed", finalDuration);
_metrics.RecordGraphJobCompletion("build", running.TenantId, running.GraphSnapshotId ?? running.SbomId, "failed", finalDuration);
return GraphBuildExecutionResult.Failed(running, error);
}
private async Task NotifyCompletionAsync(
GraphBuildJob job,


@@ -71,11 +71,13 @@ internal sealed class GraphOverlayExecutionService
return GraphOverlayExecutionResult.Skipped(job, "transition_invalid");
}
if (!await _repository.TryReplaceOverlayAsync(running, job.Status, cancellationToken).ConfigureAwait(false))
{
_metrics.RecordGraphJobResult("overlay", "skipped");
return GraphOverlayExecutionResult.Skipped(job, "concurrency_conflict");
}
_metrics.RecordGraphJobStart("overlay", running.TenantId, running.GraphSnapshotId);
var attempt = 0;
CartographerOverlayResult? lastResult = null;
@@ -96,9 +98,11 @@ internal sealed class GraphOverlayExecutionService
{
var completionTime = _timeProvider.GetUtcNow();
await NotifyCompletionAsync(running, GraphJobStatus.Completed, completionTime, response.GraphSnapshotId ?? running.GraphSnapshotId, response.ResultUri, response.Error, cancellationToken).ConfigureAwait(false);
_metrics.RecordGraphJobResult("overlay", "completed", completionTime - running.CreatedAt);
return GraphOverlayExecutionResult.Completed(running, response.ResultUri);
}
var duration = completionTime - running.CreatedAt;
_metrics.RecordGraphJobResult("overlay", "completed", duration);
_metrics.RecordGraphJobCompletion("overlay", running.TenantId, running.GraphSnapshotId, "completed", duration);
return GraphOverlayExecutionResult.Completed(running, response.ResultUri);
}
if (response.Status == GraphJobStatus.Failed)
{
@@ -106,9 +110,11 @@ internal sealed class GraphOverlayExecutionService
{
var completionTime = _timeProvider.GetUtcNow();
await NotifyCompletionAsync(running, GraphJobStatus.Failed, completionTime, response.GraphSnapshotId ?? running.GraphSnapshotId, response.ResultUri, response.Error, cancellationToken).ConfigureAwait(false);
_metrics.RecordGraphJobResult("overlay", "failed", completionTime - running.CreatedAt);
return GraphOverlayExecutionResult.Failed(running, response.Error);
}
var duration = completionTime - running.CreatedAt;
_metrics.RecordGraphJobResult("overlay", "failed", duration);
_metrics.RecordGraphJobCompletion("overlay", running.TenantId, running.GraphSnapshotId, "failed", duration);
return GraphOverlayExecutionResult.Failed(running, response.Error);
}
_logger.LogWarning(
"Cartographer overlay attempt {Attempt} failed for job {JobId}; retrying in {Delay} (reason: {Reason}).",
@@ -125,9 +131,11 @@ internal sealed class GraphOverlayExecutionService
{
var completionTime = _timeProvider.GetUtcNow();
await NotifyCompletionAsync(running, GraphJobStatus.Failed, completionTime, response.GraphSnapshotId ?? running.GraphSnapshotId, response.ResultUri, response.Error ?? "Cartographer did not complete the overlay.", cancellationToken).ConfigureAwait(false);
_metrics.RecordGraphJobResult("overlay", "failed", completionTime - running.CreatedAt);
return GraphOverlayExecutionResult.Failed(running, response.Error);
}
var duration = completionTime - running.CreatedAt;
_metrics.RecordGraphJobResult("overlay", "failed", duration);
_metrics.RecordGraphJobCompletion("overlay", running.TenantId, running.GraphSnapshotId, "failed", duration);
return GraphOverlayExecutionResult.Failed(running, response.Error);
}
await Task.Delay(backoff, cancellationToken).ConfigureAwait(false);
}
@@ -151,9 +159,11 @@ internal sealed class GraphOverlayExecutionService
var error = lastResult?.Error ?? lastException?.Message ?? "Cartographer overlay failed";
var finalTime = _timeProvider.GetUtcNow();
await NotifyCompletionAsync(running, GraphJobStatus.Failed, finalTime, lastResult?.GraphSnapshotId ?? running.GraphSnapshotId, lastResult?.ResultUri, error, cancellationToken).ConfigureAwait(false);
_metrics.RecordGraphJobResult("overlay", "failed", finalTime - running.CreatedAt);
return GraphOverlayExecutionResult.Failed(running, error);
}
var finalDuration = finalTime - running.CreatedAt;
_metrics.RecordGraphJobResult("overlay", "failed", finalDuration);
_metrics.RecordGraphJobCompletion("overlay", running.TenantId, running.GraphSnapshotId, "failed", finalDuration);
return GraphOverlayExecutionResult.Failed(running, error);
}
private async Task NotifyCompletionAsync(
GraphOverlayJob job,


@@ -23,6 +23,9 @@ public sealed class SchedulerWorkerMetrics : IDisposable
private readonly UpDownCounter<long> _runsActive;
private readonly Counter<long> _graphJobsTotal;
private readonly Histogram<double> _graphJobDurationSeconds;
private readonly UpDownCounter<long> _graphJobsInflight;
private readonly Histogram<double> _graphBuildSeconds;
private readonly Histogram<double> _overlayLagSeconds;
private readonly ConcurrentDictionary<string, long> _backlog = new(StringComparer.Ordinal);
private readonly ObservableGauge<long> _backlogGauge;
private bool _disposed;
@@ -78,6 +81,18 @@ public sealed class SchedulerWorkerMetrics : IDisposable
"scheduler_graph_job_duration_seconds",
unit: "s",
description: "Graph job durations grouped by type and result.");
_graphJobsInflight = _meter.CreateUpDownCounter<long>(
"graph_jobs_inflight",
unit: "count",
description: "Number of in-flight graph jobs grouped by type, tenant, and graph identifier.");
_graphBuildSeconds = _meter.CreateHistogram<double>(
"graph_build_seconds",
unit: "s",
description: "Wall-clock duration of Cartographer graph build jobs grouped by tenant and graph identifier.");
_overlayLagSeconds = _meter.CreateHistogram<double>(
"overlay_lag_seconds",
unit: "s",
description: "Latency between overlay job creation and completion grouped by tenant and graph identifier.");
_backlogGauge = _meter.CreateObservableGauge<long>(
"scheduler_runner_backlog",
ObserveBacklog,
@@ -85,6 +100,28 @@ public sealed class SchedulerWorkerMetrics : IDisposable
description: "Remaining images queued for runner processing grouped by mode and schedule.");
}
public void RecordGraphJobStart(string type, string tenantId, string graphId)
{
_graphJobsInflight.Add(1, GraphTags(type, tenantId, graphId));
}
public void RecordGraphJobCompletion(string type, string tenantId, string graphId, string result, TimeSpan? duration)
{
_graphJobsInflight.Add(-1, GraphTags(type, tenantId, graphId));
if (string.Equals(type, "build", StringComparison.OrdinalIgnoreCase) && duration is { } buildDuration)
{
_graphBuildSeconds.Record(Math.Max(buildDuration.TotalSeconds, 0d), GraphResultTags(type, tenantId, graphId, result));
}
if (string.Equals(type, "overlay", StringComparison.OrdinalIgnoreCase) && duration is { } lag)
{
_overlayLagSeconds.Record(Math.Max(lag.TotalSeconds, 0d), GraphResultTags(type, tenantId, graphId, result));
}
_graphJobDurationSeconds.Record(Math.Max(duration?.TotalSeconds ?? 0d, 0d), GraphResultTags(type, tenantId, graphId, result));
}
public void RecordGraphJobResult(string type, string result, TimeSpan? duration = null)
{
var tags = new[]
@@ -221,6 +258,23 @@ public sealed class SchedulerWorkerMetrics : IDisposable
}
}
private static KeyValuePair<string, object?>[] GraphTags(string type, string tenantId, string graphId)
=> new[]
{
new KeyValuePair<string, object?>("type", type),
new KeyValuePair<string, object?>("tenant", tenantId),
new KeyValuePair<string, object?>("graph_id", graphId)
};
private static KeyValuePair<string, object?>[] GraphResultTags(string type, string tenantId, string graphId, string result)
=> new[]
{
new KeyValuePair<string, object?>("type", type),
new KeyValuePair<string, object?>("tenant", tenantId),
new KeyValuePair<string, object?>("graph_id", graphId),
new KeyValuePair<string, object?>("result", result)
};
private static string BuildBacklogKey(string mode, string? scheduleId)
=> $"{mode}|{scheduleId ?? string.Empty}";
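Review note on `graph_jobs_inflight`: the counter is incremented in `RecordGraphJobStart` and decremented only in `RecordGraphJobCompletion`. The hunks shown do not reveal whether paths that exit by exception (for example a cancelled `Task.Delay` or a throwing `NotifyCompletionAsync`) also reach a decrement. The sketch below shows one way to keep the two calls paired with `try`/`finally`. The meter name, `RunTracked`, and the tag values are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical meter name; the instrument name matches the one added in this diff.
using var meter = new Meter("Example.Scheduler.Worker");
var inflight = meter.CreateUpDownCounter<long>("graph_jobs_inflight");

// Sum every measurement so the net in-flight value can be inspected.
long net = 0;
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (ReferenceEquals(instrument.Meter, meter))
    {
        l.EnableMeasurementEvents(instrument);
    }
};
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) => net += value);
listener.Start();

var jobTags = new[]
{
    new KeyValuePair<string, object?>("type", "build"),
    new KeyValuePair<string, object?>("tenant", "tenant-alpha"),
    new KeyValuePair<string, object?>("graph_id", "graph-1"),
};

// Hypothetical wrapper: the decrement runs even when the work throws.
void RunTracked(Action work)
{
    inflight.Add(1, jobTags);
    try
    {
        work();
    }
    finally
    {
        inflight.Add(-1, jobTags);
    }
}

try
{
    RunTracked(() => throw new InvalidOperationException("simulated Cartographer failure"));
}
catch (InvalidOperationException)
{
    // Swallowed for the demo; the counter has already been decremented.
}

Console.WriteLine(net); // 0
```

The same pairing only balances if both calls use identical tag values, which the diff appears to respect by deriving `graph_id` from `running` in both places.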


@@ -130,11 +130,11 @@ public sealed class RoaringImpactIndexTests
}
[Fact]
public async Task ResolveAllAsync_UsageOnlyFiltersEntrypointImages()
{
var component = ComponentIdentity.Create("pkg:npm/a@1.0.0", "a", "1.0.0", "pkg:npm/a@1.0.0");
var (entryStream, entryDigest) = CreateBomIndex(component, ComponentUsage.Create(true, new[] { "/start.sh" }));
var nonEntryDigestValue = "sha256:" + new string('1', 64);
var (nonEntryStream, nonEntryDigest) = CreateBomIndex(component, ComponentUsage.Create(false), nonEntryDigestValue);
var index = new RoaringImpactIndex(NullLogger<RoaringImpactIndex>.Instance);
@@ -159,12 +159,88 @@ public sealed class RoaringImpactIndexTests
var selector = new Selector(SelectorScope.AllImages, tenantId: "tenant-alpha");
var usageOnly = await index.ResolveAllAsync(selector, usageOnly: true);
usageOnly.Images.Should().ContainSingle(image => image.ImageDigest == entryDigest);
var allImages = await index.ResolveAllAsync(selector, usageOnly: false);
allImages.Images.Should().HaveCount(2);
}
[Fact]
public async Task RemoveAsync_RemovesImageAndComponents()
{
var component = ComponentIdentity.Create("pkg:npm/a@1.0.0", "a", "1.0.0", "pkg:npm/a@1.0.0");
var (stream1, digest1) = CreateBomIndex(component, ComponentUsage.Create(true, new[] { "/start.sh" }));
var (stream2, digest2) = CreateBomIndex(component, ComponentUsage.Create(true, new[] { "/start.sh" }));
var index = new RoaringImpactIndex(NullLogger<RoaringImpactIndex>.Instance);
await index.IngestAsync(new ImpactIndexIngestionRequest
{
TenantId = "tenant-alpha",
ImageDigest = digest1,
Registry = "docker.io",
Repository = "library/service",
BomIndexStream = stream1,
});
await index.IngestAsync(new ImpactIndexIngestionRequest
{
TenantId = "tenant-alpha",
ImageDigest = digest2,
Registry = "docker.io",
Repository = "library/service",
BomIndexStream = stream2,
});
await index.RemoveAsync(digest1);
var selector = new Selector(SelectorScope.AllImages, tenantId: "tenant-alpha");
var impact = await index.ResolveByPurlsAsync(new[] { "pkg:npm/a@1.0.0" }, usageOnly: false, selector);
impact.Images.Should().ContainSingle(img => img.ImageDigest == digest2);
}
[Fact]
public async Task CreateSnapshotAsync_CompactsIdsAndRestores()
{
var component = ComponentIdentity.Create("pkg:npm/a@1.0.0", "a", "1.0.0", "pkg:npm/a@1.0.0");
var (stream1, digest1) = CreateBomIndex(component, ComponentUsage.Create(true, new[] { "/start.sh" }));
var (stream2, digest2) = CreateBomIndex(component, ComponentUsage.Create(false));
var index = new RoaringImpactIndex(NullLogger<RoaringImpactIndex>.Instance);
await index.IngestAsync(new ImpactIndexIngestionRequest
{
TenantId = "tenant-alpha",
ImageDigest = digest1,
Registry = "docker.io",
Repository = "library/service",
BomIndexStream = stream1,
});
await index.IngestAsync(new ImpactIndexIngestionRequest
{
TenantId = "tenant-alpha",
ImageDigest = digest2,
Registry = "docker.io",
Repository = "library/service",
BomIndexStream = stream2,
});
await index.RemoveAsync(digest1);
var snapshot = await index.CreateSnapshotAsync();
var restored = new RoaringImpactIndex(NullLogger<RoaringImpactIndex>.Instance);
await restored.RestoreSnapshotAsync(snapshot);
var selector = new Selector(SelectorScope.AllImages, tenantId: "tenant-alpha");
var resolved = await restored.ResolveAllAsync(selector, usageOnly: false);
resolved.Images.Should().ContainSingle(img => img.ImageDigest == digest2);
resolved.SnapshotId.Should().Be(snapshot.SnapshotId);
}
private static (Stream Stream, string Digest) CreateBomIndex(ComponentIdentity identity, ComponentUsage usage, string? digest = null)
{


@@ -213,5 +213,19 @@ public sealed class ImpactTargetingServiceTests
public ValueTask<ImpactSet> ResolveAllAsync(Selector selector, bool usageOnly, CancellationToken cancellationToken = default)
=> OnResolveAll?.Invoke(selector, usageOnly, cancellationToken)
?? ValueTask.FromResult(CreateEmptyImpactSet(selector, usageOnly));
public ValueTask RemoveAsync(string imageDigest, CancellationToken cancellationToken = default)
=> ValueTask.CompletedTask;
public ValueTask<ImpactIndexSnapshot> CreateSnapshotAsync(CancellationToken cancellationToken = default)
=> ValueTask.FromResult(new ImpactIndexSnapshot(
DateTimeOffset.UtcNow,
"stub",
ImmutableArray<ImpactImageRecord>.Empty,
ImmutableDictionary<string, ImmutableArray<int>>.Empty,
ImmutableDictionary<string, ImmutableArray<int>>.Empty));
public ValueTask RestoreSnapshotAsync(ImpactIndexSnapshot snapshot, CancellationToken cancellationToken = default)
=> ValueTask.CompletedTask;
}
}


@@ -2,12 +2,14 @@ using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using MongoDB.Driver;
using Microsoft.Extensions.Logging.Abstractions;
using Microsoft.Extensions.Options;
using StellaOps.Scheduler.Models;
using StellaOps.Scheduler.Storage.Mongo.Repositories;
using StellaOps.Scheduler.Worker.Options;
using StellaOps.Scheduler.Worker.Policy;
using StellaOps.Scheduler.Worker.Observability;
namespace StellaOps.Scheduler.Worker.Tests;
@@ -35,7 +37,7 @@ public sealed class PolicyRunDispatchBackgroundServiceTests
var executionService = new PolicyRunExecutionService(
repository,
new StubPolicyRunClient(),
Options.Create(options),
Microsoft.Extensions.Options.Options.Create(options),
timeProvider: null,
new SchedulerWorkerMetrics(),
new StubPolicyRunTargetingService(),
@@ -45,7 +47,7 @@ public sealed class PolicyRunDispatchBackgroundServiceTests
return new PolicyRunDispatchBackgroundService(
repository,
executionService,
Options.Create(options),
Microsoft.Extensions.Options.Options.Create(options),
timeProvider: null,
NullLogger<PolicyRunDispatchBackgroundService>.Instance);
}
@@ -61,7 +63,9 @@ public sealed class PolicyRunDispatchBackgroundServiceTests
private sealed class RecordingPolicyRunJobRepository : IPolicyRunJobRepository
{
public int LeaseAttempts { get; private set; }
private int _leaseAttempts;
public int LeaseAttempts => Volatile.Read(ref _leaseAttempts);
public Task InsertAsync(PolicyRunJob job, IClientSessionHandle? session = null, CancellationToken cancellationToken = default)
=> Task.CompletedTask;
@@ -80,7 +84,7 @@ public sealed class PolicyRunDispatchBackgroundServiceTests
IClientSessionHandle? session = null,
CancellationToken cancellationToken = default)
{
Interlocked.Increment(ref LeaseAttempts);
Interlocked.Increment(ref _leaseAttempts);
return Task.FromResult<PolicyRunJob?>(null);
}
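Review note on the `LeaseAttempts` change: `Interlocked.Increment` takes its argument by `ref`, and an auto-property cannot be passed by reference (compiler error CS0206), which is presumably why the test stub now uses a backing field read through `Volatile.Read`. The sketch below reproduces the pattern with a captured local in place of the field. `leaseAttempts` and `ReadLeaseAttempts` are illustrative names, not code from this commit.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for the private _leaseAttempts field in the test repository.
int leaseAttempts = 0;

// Stand-in for the LeaseAttempts property getter.
int ReadLeaseAttempts() => Volatile.Read(ref leaseAttempts);

// Many concurrent "lease" calls; each increment is atomic.
Parallel.For(0, 1000, _ => Interlocked.Increment(ref leaseAttempts));

Console.WriteLine(ReadLeaseAttempts()); // 1000
```

A plain `leaseAttempts++` in the same loop could lose updates under contention, so the atomic increment matters if the background service can call the lease method from more than one thread.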