Add PHP Analyzer Plugin and Composer Lock Data Handling

- Implemented the PhpAnalyzerPlugin to analyze PHP projects.
- Created ComposerLockData class to represent data from composer.lock files.
- Developed ComposerLockReader to load and parse composer.lock files asynchronously.
- Introduced ComposerPackage class to encapsulate package details.
- Added PhpPackage class to represent PHP packages with metadata and evidence.
- Implemented PhpPackageCollector to gather packages from ComposerLockData.
- Created PhpLanguageAnalyzer to perform analysis and emit results.
- Added capability signals for known PHP frameworks and CMS.
- Developed unit tests for the PHP language analyzer and its components.
- Included sample composer.lock and expected output for testing.
- Updated project files for the new PHP analyzer library and tests.
StellaOps Bot
2025-11-22 14:02:49 +02:00
parent a7f3c7869a
commit b6b9ffc050
158 changed files with 16272 additions and 809 deletions

63
src/Graph/AGENTS.md Normal file

@@ -0,0 +1,63 @@
# AGENTS · Graph Module
## Purpose & Scope
- Working directories: `src/Graph/StellaOps.Graph.Api`, `src/Graph/StellaOps.Graph.Indexer`, and `src/Graph/__Tests`.
- Modules covered: Graph API (query/search/paths/diff/overlay/export) and Graph Indexer (ingest, snapshot, overlays).
- Applicable sprints: `docs/implplan/SPRINT_0207_0001_0001_graph.md`, `docs/implplan/SPRINT_0141_0001_0001_graph_indexer.md`, and any follow-on graph docs sprints (`docs/implplan/SPRINT_0321_0001_0001_docs_modules_graph.md`).
## Roles
- Backend engineer (.NET 10) — API, planners, overlays, exports.
- Data/ETL engineer — Indexer ingest, snapshots, overlays.
- QA/Perf engineer — deterministic tests, load/fuzz, offline parity.
- Docs maintainer — graph API/ops runbooks, Offline Kit notes.
## Required Reading (treat as read before DOING)
- `docs/README.md`
- `docs/07_HIGH_LEVEL_ARCHITECTURE.md`
- `docs/modules/platform/architecture-overview.md`
- `docs/modules/graph/architecture.md`
- `docs/modules/graph/implementation_plan.md`
- Sprint doc for current work (e.g., `docs/implplan/SPRINT_0207_0001_0001_graph.md`).
- Policy overlay contract refs when touching overlays: `POLICY-ENGINE-30-001..003` (see policy module docs).
## Determinism & Offline
- Default to deterministic ordering for streams/exports; manifest checksums required for `graphml/csv/ndjson` exports.
- Timestamps: UTC ISO-8601; avoid wall-clock in tests.
- Snapshot/export roots configurable via `STELLAOPS_GRAPH_SNAPSHOT_DIR` or `SbomIngestOptions.SnapshotRootDirectory`.
- Offline posture: no external calls beyond allowlisted feeds; prefer cached schemas and local nugets in `local-nugets/`.
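
The determinism rules above can be sketched roughly as follows. This is an illustration only, not the module's exporter: the `ExportManifestSketch` class and its method names are hypothetical, and SHA-256 is an assumed checksum algorithm, since this file only requires "manifest checksums".

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not part of StellaOps.Graph.
public static class ExportManifestSketch
{
    // Sort lines ordinally so the same set of records always yields the same byte stream.
    public static string BuildNdjson(IEnumerable<string> jsonLines) =>
        string.Join("\n", jsonLines.OrderBy(line => line, StringComparer.Ordinal)) + "\n";

    // Assumed algorithm: SHA-256 over the UTF-8 payload, rendered as lowercase hex.
    public static string Checksum(string payload) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(payload))).ToLowerInvariant();

    // UTC ISO-8601 timestamp from a caller-supplied value, so tests never read the wall clock.
    public static string FormatTimestamp(DateTimeOffset value) =>
        value.ToUniversalTime().ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);
}
```

Because both the ordering and the timestamp source are inputs rather than ambient state, two runs over the same records should produce the same checksum regardless of enumeration order.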
## Data & Environment
- Canonical store: MongoDB (>=3.0 driver). Tests use `STELLAOPS_TEST_MONGO_URI`; fallback `mongodb://127.0.0.1:27017`, then Mongo2Go.
- Collections: `graph_nodes`, `graph_edges`, `graph_overlays_cache`, `graph_snapshots`, `graph_saved_queries`.
- Tenant isolation mandatory on every query and export.
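
One way to read the test connection fallback above, as a sketch: the `TestMongoUriSketch` class is hypothetical, and only the environment variable name and the default URI come from this file. The Mongo2Go step is left to the caller because it depends on the test fixture in use.

```csharp
#nullable enable
using System;

// Hypothetical test helper, not part of the repository.
public static class TestMongoUriSketch
{
    public const string EnvironmentVariable = "STELLAOPS_TEST_MONGO_URI";
    public const string LocalDefault = "mongodb://127.0.0.1:27017";

    // Returns candidate URIs in the order a test fixture would try them.
    // A caller would start Mongo2Go only if none of these candidates is reachable.
    public static string[] Candidates(Func<string, string?> readEnvironment)
    {
        var configured = readEnvironment(EnvironmentVariable);
        return string.IsNullOrWhiteSpace(configured)
            ? new[] { LocalDefault }
            : new[] { configured.Trim(), LocalDefault };
    }
}
```

In a fixture this would typically be called as `TestMongoUriSketch.Candidates(Environment.GetEnvironmentVariable)`; taking the reader as a delegate keeps the helper testable without mutating process state.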
## Testing Expectations
- Unit: node/edge builders, identifier stability, overlay calculators, planners, diff engine.
- Integration: ingest → snapshot → query/paths/diff/export end-to-end; RBAC + tenant guards.
- Performance: synthetic datasets (~500k nodes / 2M edges) with enforced budgets; capture latency metrics.
- Security: RBAC scopes (`graph:read/query/export`), audit logging, rate limiting.
- Offline: export/import parity for Offline Kit bundles; deterministic manifests verified in tests.
## Observability
- Metrics to emit: `graph_ingest_lag_seconds`, `graph_tile_latency_seconds`, `graph_query_budget_denied_total`, `graph_overlay_cache_hit_ratio`, clustering counters from architecture doc.
- Structured logs with trace IDs; traces for ingest stages and query planner/executor.
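
A minimal sketch of emitting two of the metrics listed above with `System.Diagnostics.Metrics`. The `GraphApiMetricsSketch` class, the meter name, and the `tenant` tag key are assumptions for illustration; only the instrument names come from this file.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical wrapper, not part of StellaOps.Graph.Api.
public sealed class GraphApiMetricsSketch : IDisposable
{
    // Assumed meter name, chosen only for this example.
    private readonly Meter _meter = new("StellaOps.Graph.Api.Sketch");
    private readonly Counter<long> _budgetDenied;
    private readonly Histogram<double> _tileLatency;

    public GraphApiMetricsSketch()
    {
        _budgetDenied = _meter.CreateCounter<long>("graph_query_budget_denied_total");
        _tileLatency = _meter.CreateHistogram<double>("graph_tile_latency_seconds", unit: "s");
    }

    // Increments the denial counter once, tagged with the tenant.
    public void RecordBudgetDenied(string tenant) =>
        _budgetDenied.Add(1, new KeyValuePair<string, object?>("tenant", tenant));

    // Records one tile latency observation in seconds.
    public void RecordTileLatency(string tenant, TimeSpan elapsed) =>
        _tileLatency.Record(elapsed.TotalSeconds, new KeyValuePair<string, object?>("tenant", tenant));

    public void Dispose() => _meter.Dispose();
}
```

Registering the wrapper through dependency injection rather than a static instance would keep it in line with the coding standards in this file.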
## Coding Standards
- Target framework: net10.0 with latest C# preview features.
- Use dependency injection; avoid static singletons.
- Respect module boundaries; shared libs only if declared in sprint or architecture docs.
- Naming: projects `StellaOps.Graph.Api`, `StellaOps.Graph.Indexer`; prefer `Graph*` prefixes for internal components.
## Coordination & Status
- Update sprint Delivery Tracker statuses (TODO → DOING → DONE/BLOCKED) in relevant sprint file.
- If a required contract/doc is missing or stale, mark the affected task BLOCKED in the sprint and log under Decisions & Risks; do not pause work waiting for live answers.
## Run/Test Commands (examples)
- Restore: `dotnet restore src/Graph/StellaOps.Graph.Api/StellaOps.Graph.Api.csproj --source ../local-nugets`
- Build: `dotnet build src/Graph/StellaOps.Graph.Api/StellaOps.Graph.Api.csproj -c Release`
- Tests: `dotnet test src/Graph/__Tests/StellaOps.Graph.Indexer.Tests/StellaOps.Graph.Indexer.Tests.csproj`
- Lint/style: follow repo-wide analyzers in `Directory.Build.props` / `.editorconfig`.
## Evidence
- Keep artefacts deterministic; attach manifest hashes in PR/sprint notes when delivering exports or snapshots.
- Document new metrics/routes/schemas under `docs/modules/graph` and link from sprint Decisions & Risks.


@@ -0,0 +1,471 @@
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;
using System.Text.Json.Nodes;
using StellaOps.Graph.Indexer.Documents;
namespace StellaOps.Graph.Indexer.Analytics;
public sealed class GraphAnalyticsEngine
{
private readonly GraphAnalyticsOptions _options;
public GraphAnalyticsEngine(GraphAnalyticsOptions options)
{
_options = options ?? throw new ArgumentNullException(nameof(options));
if (_options.MaxPropagationIterations <= 0)
{
throw new ArgumentOutOfRangeException(nameof(options.MaxPropagationIterations), "must be positive");
}
}
public GraphAnalyticsResult Compute(GraphAnalyticsSnapshot snapshot)
{
ArgumentNullException.ThrowIfNull(snapshot);
var topology = BuildTopology(snapshot);
var clusters = ComputeClusters(topology);
var centrality = ComputeCentrality(topology);
return new GraphAnalyticsResult(clusters, centrality);
}
private GraphTopology BuildTopology(GraphAnalyticsSnapshot snapshot)
{
var nodes = snapshot.Nodes
.Select(node => new GraphNode(node["id"]!.GetValue<string>(), node["kind"]!.GetValue<string>()))
.ToImmutableArray();
var nodeLookup = nodes.ToImmutableDictionary(n => n.NodeId, n => n.Kind, StringComparer.Ordinal);
var adjacency = new Dictionary<string, HashSet<string>>(StringComparer.Ordinal);
foreach (var node in nodes)
{
adjacency[node.NodeId] = new HashSet<string>(StringComparer.Ordinal);
}
var resolver = new EdgeEndpointResolver(snapshot.Nodes);
foreach (var edge in snapshot.Edges)
{
if (!resolver.TryResolve(edge, out var source, out var target))
{
continue;
}
if (!adjacency.ContainsKey(source) || !adjacency.ContainsKey(target))
{
continue;
}
// Treat the graph as undirected for clustering / centrality to stabilise communities.
adjacency[source].Add(target);
adjacency[target].Add(source);
}
return new GraphTopology(nodes, adjacency, nodeLookup);
}
private ImmutableArray<ClusterAssignment> ComputeClusters(GraphTopology topology)
{
var labels = topology.Nodes
.OrderBy(n => n.NodeId, StringComparer.Ordinal)
.Select(n => (NodeId: n.NodeId, Label: n.NodeId, Kind: n.Kind))
.ToArray();
for (var iteration = 0; iteration < _options.MaxPropagationIterations; iteration++)
{
var updated = false;
foreach (ref var entry in labels.AsSpan())
{
if (!topology.Adjacency.TryGetValue(entry.NodeId, out var neighbors) || neighbors.Count == 0)
{
continue;
}
var best = SelectDominantLabel(neighbors, labels);
if (!string.Equals(best, entry.Label, StringComparison.Ordinal))
{
entry.Label = best;
updated = true;
}
}
if (!updated)
{
break;
}
}
return labels
.OrderBy(t => t.NodeId, StringComparer.Ordinal)
.Select(t => new ClusterAssignment(t.NodeId, t.Label, t.Kind))
.ToImmutableArray();
}
private static string SelectDominantLabel(IEnumerable<string> neighbors, (string NodeId, string Label, string Kind)[] labels)
{
var labelCounts = new Dictionary<string, int>(StringComparer.Ordinal);
foreach (var neighbor in neighbors)
{
var neighborLabel = labels.First(t => t.NodeId == neighbor).Label;
labelCounts.TryGetValue(neighborLabel, out var count);
labelCounts[neighborLabel] = count + 1;
}
var max = labelCounts.Max(kvp => kvp.Value);
return labelCounts
.Where(kvp => kvp.Value == max)
.Select(kvp => kvp.Key)
.OrderBy(label => label, StringComparer.Ordinal)
.First();
}
private ImmutableArray<CentralityScore> ComputeCentrality(GraphTopology topology)
{
var degreeScores = new Dictionary<string, double>(StringComparer.Ordinal);
foreach (var (nodeId, neighbors) in topology.Adjacency)
{
degreeScores[nodeId] = neighbors.Count;
}
var betweenness = CalculateBetweenness(topology);
return topology.Nodes
.OrderBy(n => n.NodeId, StringComparer.Ordinal)
.Select(n => new CentralityScore(
n.NodeId,
degreeScores.TryGetValue(n.NodeId, out var degree) ? degree : 0d,
betweenness.TryGetValue(n.NodeId, out var between) ? between : 0d,
n.Kind))
.ToImmutableArray();
}
private Dictionary<string, double> CalculateBetweenness(GraphTopology topology)
{
var scores = topology.Nodes.ToDictionary(n => n.NodeId, _ => 0d, StringComparer.Ordinal);
if (scores.Count == 0)
{
return scores;
}
var sampled = topology.Nodes
.OrderBy(n => n.NodeId, StringComparer.Ordinal)
.Take(Math.Min(_options.BetweennessSampleSize, topology.Nodes.Length))
.ToArray();
foreach (var source in sampled)
{
var stack = new Stack<string>();
var predecessors = new Dictionary<string, List<string>>(StringComparer.Ordinal);
var sigma = new Dictionary<string, double>(StringComparer.Ordinal);
var distance = new Dictionary<string, int>(StringComparer.Ordinal);
var queue = new Queue<string>();
foreach (var node in topology.Nodes)
{
predecessors[node.NodeId] = new List<string>();
sigma[node.NodeId] = 0;
distance[node.NodeId] = -1;
}
sigma[source.NodeId] = 1;
distance[source.NodeId] = 0;
queue.Enqueue(source.NodeId);
while (queue.Count > 0)
{
var v = queue.Dequeue();
stack.Push(v);
foreach (var neighbor in topology.GetNeighbors(v))
{
if (distance[neighbor] < 0)
{
distance[neighbor] = distance[v] + 1;
queue.Enqueue(neighbor);
}
if (distance[neighbor] == distance[v] + 1)
{
sigma[neighbor] += sigma[v];
predecessors[neighbor].Add(v);
}
}
}
var delta = topology.Nodes.ToDictionary(n => n.NodeId, _ => 0d, StringComparer.Ordinal);
while (stack.Count > 0)
{
var w = stack.Pop();
foreach (var v in predecessors[w])
{
delta[v] += (sigma[v] / sigma[w]) * (1 + delta[w]);
}
if (!string.Equals(w, source.NodeId, StringComparison.Ordinal))
{
scores[w] += delta[w];
}
}
}
return scores;
}
private sealed record GraphNode(string NodeId, string Kind);
private sealed class GraphTopology
{
public GraphTopology(ImmutableArray<GraphNode> nodes, Dictionary<string, HashSet<string>> adjacency, IReadOnlyDictionary<string, string> kinds)
{
Nodes = nodes;
Adjacency = adjacency;
Kinds = kinds;
}
public ImmutableArray<GraphNode> Nodes { get; }
public Dictionary<string, HashSet<string>> Adjacency { get; }
public IReadOnlyDictionary<string, string> Kinds { get; }
public IEnumerable<string> GetNeighbors(string nodeId)
{
if (Adjacency.TryGetValue(nodeId, out var neighbors))
{
return neighbors;
}
return Array.Empty<string>();
}
}
private sealed class EdgeEndpointResolver
{
private readonly IReadOnlyDictionary<string, JsonObject> _nodesById;
private readonly IReadOnlyDictionary<string, string> _componentNodeByPurl;
private readonly IReadOnlyDictionary<string, string> _artifactNodeByDigest;
public EdgeEndpointResolver(ImmutableArray<JsonObject> nodes)
{
_nodesById = nodes.ToImmutableDictionary(
node => node["id"]!.GetValue<string>(),
node => node,
StringComparer.Ordinal);
_componentNodeByPurl = BuildComponentIndex(nodes);
_artifactNodeByDigest = BuildArtifactIndex(nodes);
}
public bool TryResolve(JsonObject edge, out string source, out string target)
{
var kind = edge["kind"]!.GetValue<string>();
var canonicalKey = edge["canonical_key"]!.AsObject();
string? s = null;
string? t = null;
switch (kind)
{
case "CONTAINS":
s = canonicalKey.TryGetPropertyValue("artifact_node_id", out var containsSource)
? containsSource?.GetValue<string>()
: null;
t = canonicalKey.TryGetPropertyValue("component_node_id", out var containsTarget)
? containsTarget?.GetValue<string>()
: null;
break;
case "DECLARED_IN":
s = canonicalKey.TryGetPropertyValue("component_node_id", out var declaredSource)
? declaredSource?.GetValue<string>()
: null;
t = canonicalKey.TryGetPropertyValue("file_node_id", out var declaredTarget)
? declaredTarget?.GetValue<string>()
: null;
break;
case "AFFECTED_BY":
s = canonicalKey.TryGetPropertyValue("component_node_id", out var affectedSource)
? affectedSource?.GetValue<string>()
: null;
t = canonicalKey.TryGetPropertyValue("advisory_node_id", out var affectedTarget)
? affectedTarget?.GetValue<string>()
: null;
break;
case "VEX_EXEMPTS":
s = canonicalKey.TryGetPropertyValue("component_node_id", out var vexSource)
? vexSource?.GetValue<string>()
: null;
t = canonicalKey.TryGetPropertyValue("vex_node_id", out var vexTarget)
? vexTarget?.GetValue<string>()
: null;
break;
case "GOVERNS_WITH":
s = canonicalKey.TryGetPropertyValue("policy_node_id", out var policySource)
? policySource?.GetValue<string>()
: null;
t = canonicalKey.TryGetPropertyValue("component_node_id", out var policyTarget)
? policyTarget?.GetValue<string>()
: null;
break;
case "OBSERVED_RUNTIME":
s = canonicalKey.TryGetPropertyValue("runtime_node_id", out var runtimeSource)
? runtimeSource?.GetValue<string>()
: null;
t = canonicalKey.TryGetPropertyValue("component_node_id", out var runtimeTarget)
? runtimeTarget?.GetValue<string>()
: null;
break;
case "BUILT_FROM":
s = canonicalKey.TryGetPropertyValue("parent_artifact_node_id", out var builtSource)
? builtSource?.GetValue<string>()
: null;
if (canonicalKey.TryGetPropertyValue("child_artifact_node_id", out var builtTargetNode) && builtTargetNode is not null)
{
t = builtTargetNode.GetValue<string>();
}
else if (canonicalKey.TryGetPropertyValue("child_artifact_digest", out var builtTargetDigest) && builtTargetDigest is not null)
{
_artifactNodeByDigest.TryGetValue(builtTargetDigest.GetValue<string>(), out t);
}
break;
case "DEPENDS_ON":
s = canonicalKey.TryGetPropertyValue("component_node_id", out var dependsSource)
? dependsSource?.GetValue<string>()
: null;
if (canonicalKey.TryGetPropertyValue("dependency_node_id", out var dependsTargetNode) && dependsTargetNode is not null)
{
t = dependsTargetNode.GetValue<string>();
}
else if (canonicalKey.TryGetPropertyValue("dependency_purl", out var dependencyPurl) && dependencyPurl is not null)
{
_componentNodeByPurl.TryGetValue(dependencyPurl.GetValue<string>(), out t);
}
break;
default:
s = ExtractFirstNodeId(canonicalKey);
t = ExtractSecondNodeId(canonicalKey);
break;
}
if (s is null || t is null)
{
source = string.Empty;
target = string.Empty;
return false;
}
if (!_nodesById.ContainsKey(s) || !_nodesById.ContainsKey(t))
{
source = string.Empty;
target = string.Empty;
return false;
}
source = s;
target = t;
return true;
}
private static Dictionary<string, string> BuildComponentIndex(ImmutableArray<JsonObject> nodes)
{
var components = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
foreach (var node in nodes)
{
if (!string.Equals(node["kind"]!.GetValue<string>(), "component", StringComparison.Ordinal))
{
continue;
}
if (!node.TryGetPropertyValue("attributes", out var attributesNode) || attributesNode is not JsonObject attributes)
{
continue;
}
if (!attributes.TryGetPropertyValue("purl", out var purlNode) || purlNode is null)
{
continue;
}
var purl = purlNode.GetValue<string>();
if (!string.IsNullOrWhiteSpace(purl))
{
components.TryAdd(purl.Trim(), node["id"]!.GetValue<string>());
}
}
return components;
}
private static Dictionary<string, string> BuildArtifactIndex(ImmutableArray<JsonObject> nodes)
{
var artifacts = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
foreach (var node in nodes)
{
if (!string.Equals(node["kind"]!.GetValue<string>(), "artifact", StringComparison.Ordinal))
{
continue;
}
if (!node.TryGetPropertyValue("attributes", out var attributesNode) || attributesNode is not JsonObject attributes)
{
continue;
}
if (!attributes.TryGetPropertyValue("artifact_digest", out var digestNode) || digestNode is null)
{
continue;
}
var digest = digestNode.GetValue<string>();
if (!string.IsNullOrWhiteSpace(digest))
{
artifacts.TryAdd(digest.Trim(), node["id"]!.GetValue<string>());
}
}
return artifacts;
}
private static string? ExtractFirstNodeId(JsonObject canonicalKey)
{
foreach (var property in canonicalKey)
{
if (property.Value is JsonValue value
&& value.TryGetValue(out string? candidate)
&& candidate is not null
&& candidate.StartsWith("gn:", StringComparison.Ordinal))
{
return candidate;
}
}
return null;
}
private static string? ExtractSecondNodeId(JsonObject canonicalKey)
{
var encountered = false;
foreach (var property in canonicalKey)
{
if (property.Value is JsonValue value
&& value.TryGetValue(out string? candidate)
&& candidate is not null
&& candidate.StartsWith("gn:", StringComparison.Ordinal))
{
if (!encountered)
{
encountered = true;
continue;
}
return candidate;
}
}
return null;
}
}
}


@@ -0,0 +1,53 @@
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
namespace StellaOps.Graph.Indexer.Analytics;
public sealed class GraphAnalyticsHostedService : BackgroundService
{
private readonly IGraphAnalyticsPipeline _pipeline;
private readonly GraphAnalyticsOptions _options;
private readonly ILogger<GraphAnalyticsHostedService> _logger;
public GraphAnalyticsHostedService(
IGraphAnalyticsPipeline pipeline,
IOptions<GraphAnalyticsOptions> options,
ILogger<GraphAnalyticsHostedService> logger)
{
_pipeline = pipeline ?? throw new ArgumentNullException(nameof(pipeline));
_options = options?.Value ?? throw new ArgumentNullException(nameof(options));
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
}
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
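using var clusteringTimer = new PeriodicTimer(_options.ClusterInterval);
using var centralityTimer = new PeriodicTimer(_options.CentralityInterval);
// PeriodicTimer allows only one in-flight WaitForNextTickAsync per timer, so each pending
// wait is kept across iterations and only the timer(s) that actually ticked are re-armed.
Task<bool>? clusteringTask = null;
Task<bool>? centralityTask = null;
while (!stoppingToken.IsCancellationRequested)
{
clusteringTask ??= clusteringTimer.WaitForNextTickAsync(stoppingToken).AsTask();
centralityTask ??= centralityTimer.WaitForNextTickAsync(stoppingToken).AsTask();
var completed = await Task.WhenAny(clusteringTask, centralityTask).ConfigureAwait(false);
if (completed.IsCanceled || stoppingToken.IsCancellationRequested)
{
break;
}
// Clear finished waits so the next iteration requests a fresh tick for those timers only.
clusteringTask = clusteringTask.IsCompleted ? null : clusteringTask;
centralityTask = centralityTask.IsCompleted ? null : centralityTask;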
try
{
await _pipeline.RunAsync(new GraphAnalyticsRunContext(ForceBackfill: false), stoppingToken).ConfigureAwait(false);
}
catch (OperationCanceledException)
{
// graceful shutdown
}
catch (Exception ex)
{
_logger.LogError(ex, "graph-indexer: analytics pipeline failed during scheduled run");
}
}
}
}


@@ -0,0 +1,88 @@
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;
namespace StellaOps.Graph.Indexer.Analytics;
public sealed class GraphAnalyticsMetrics : IDisposable
{
public const string MeterName = "StellaOps.Graph.Indexer";
public const string MeterVersion = "1.0.0";
private const string RunsTotalName = "graph_analytics_runs_total";
private const string FailuresTotalName = "graph_analytics_failures_total";
private const string DurationSecondsName = "graph_analytics_duration_seconds";
private const string ClustersTotalName = "graph_analytics_clusters_total";
private const string CentralityTotalName = "graph_analytics_centrality_total";
private readonly Meter _meter;
private readonly bool _ownsMeter;
private readonly Counter<long> _runsTotal;
private readonly Counter<long> _failuresTotal;
private readonly Histogram<double> _durationSeconds;
private readonly Counter<long> _clustersTotal;
private readonly Counter<long> _centralityTotal;
private bool _disposed;
public GraphAnalyticsMetrics()
: this(null)
{
}
public GraphAnalyticsMetrics(Meter? meter)
{
_meter = meter ?? new Meter(MeterName, MeterVersion);
_ownsMeter = meter is null;
_runsTotal = _meter.CreateCounter<long>(RunsTotalName, unit: "count", description: "Total analytics runs executed.");
_failuresTotal = _meter.CreateCounter<long>(FailuresTotalName, unit: "count", description: "Total analytics runs that failed.");
_durationSeconds = _meter.CreateHistogram<double>(DurationSecondsName, unit: "s", description: "Duration of analytics runs.");
_clustersTotal = _meter.CreateCounter<long>(ClustersTotalName, unit: "count", description: "Cluster assignments written.");
_centralityTotal = _meter.CreateCounter<long>(CentralityTotalName, unit: "count", description: "Centrality scores written.");
}
public void RecordRun(string tenant, bool success, TimeSpan duration, int clusterCount, int centralityCount)
{
ThrowIfDisposed();
var tags = new KeyValuePair<string, object?>[]
{
new("tenant", tenant),
new("success", success)
};
var tagSpan = tags.AsSpan();
_runsTotal.Add(1, tagSpan);
if (!success)
{
_failuresTotal.Add(1, tagSpan);
}
_durationSeconds.Record(duration.TotalSeconds, tagSpan);
_clustersTotal.Add(clusterCount, tagSpan);
_centralityTotal.Add(centralityCount, tagSpan);
}
private void ThrowIfDisposed()
{
if (_disposed)
{
throw new ObjectDisposedException(nameof(GraphAnalyticsMetrics));
}
}
public void Dispose()
{
if (_disposed)
{
return;
}
if (_ownsMeter)
{
_meter.Dispose();
}
_disposed = true;
}
}


@@ -0,0 +1,31 @@
using System;
namespace StellaOps.Graph.Indexer.Analytics;
public sealed class GraphAnalyticsOptions
{
/// <summary>
/// Interval for running clustering (deterministic label propagation).
/// </summary>
public TimeSpan ClusterInterval { get; set; } = TimeSpan.FromMinutes(5);
/// <summary>
/// Interval for recomputing centrality metrics (degree + betweenness approximation).
/// </summary>
public TimeSpan CentralityInterval { get; set; } = TimeSpan.FromMinutes(5);
/// <summary>
/// Maximum number of iterations for label propagation.
/// </summary>
public int MaxPropagationIterations { get; set; } = 6;
/// <summary>
/// Number of seed nodes to sample (deterministically) for betweenness approximation.
/// </summary>
public int BetweennessSampleSize { get; set; } = 12;
/// <summary>
/// Whether to also write cluster ids onto graph node documents (alongside overlays).
/// </summary>
public bool WriteClusterAssignmentsToNodes { get; set; } = true;
}


@@ -0,0 +1,72 @@
using System.Diagnostics;
using Microsoft.Extensions.Logging;
namespace StellaOps.Graph.Indexer.Analytics;
public sealed class GraphAnalyticsPipeline : IGraphAnalyticsPipeline
{
private readonly GraphAnalyticsEngine _engine;
private readonly IGraphSnapshotProvider _snapshotProvider;
private readonly IGraphAnalyticsWriter _writer;
private readonly GraphAnalyticsMetrics _metrics;
private readonly ILogger<GraphAnalyticsPipeline> _logger;
public GraphAnalyticsPipeline(
GraphAnalyticsEngine engine,
IGraphSnapshotProvider snapshotProvider,
IGraphAnalyticsWriter writer,
GraphAnalyticsMetrics metrics,
ILogger<GraphAnalyticsPipeline> logger)
{
_engine = engine ?? throw new ArgumentNullException(nameof(engine));
_snapshotProvider = snapshotProvider ?? throw new ArgumentNullException(nameof(snapshotProvider));
_writer = writer ?? throw new ArgumentNullException(nameof(writer));
_metrics = metrics ?? throw new ArgumentNullException(nameof(metrics));
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
}
public async Task RunAsync(GraphAnalyticsRunContext context, CancellationToken cancellationToken)
{
cancellationToken.ThrowIfCancellationRequested();
var snapshots = await _snapshotProvider.GetPendingSnapshotsAsync(cancellationToken).ConfigureAwait(false);
foreach (var snapshot in snapshots)
{
var stopwatch = Stopwatch.StartNew();
try
{
cancellationToken.ThrowIfCancellationRequested();
var result = _engine.Compute(snapshot);
await _writer.PersistClusterAssignmentsAsync(snapshot, result.Clusters, cancellationToken).ConfigureAwait(false);
await _writer.PersistCentralityAsync(snapshot, result.CentralityScores, cancellationToken).ConfigureAwait(false);
await _snapshotProvider.MarkProcessedAsync(snapshot.Tenant, snapshot.SnapshotId, cancellationToken).ConfigureAwait(false);
stopwatch.Stop();
_metrics.RecordRun(snapshot.Tenant, success: true, stopwatch.Elapsed, result.Clusters.Length, result.CentralityScores.Length);
_logger.LogInformation(
"graph-indexer: analytics computed for snapshot {SnapshotId} tenant {Tenant} with {ClusterCount} clusters and {CentralityCount} centrality scores in {DurationMs:F2} ms",
snapshot.SnapshotId,
snapshot.Tenant,
result.Clusters.Length,
result.CentralityScores.Length,
stopwatch.Elapsed.TotalMilliseconds);
}
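// Cancellation is a normal shutdown path; do not count or log it as an analytics failure.
catch (Exception ex) when (ex is not OperationCanceledException)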
{
stopwatch.Stop();
_metrics.RecordRun(snapshot.Tenant, success: false, stopwatch.Elapsed, 0, 0);
_logger.LogError(
ex,
"graph-indexer: analytics failed for snapshot {SnapshotId} tenant {Tenant}",
snapshot.SnapshotId,
snapshot.Tenant);
throw;
}
}
}
}


@@ -0,0 +1,36 @@
using System;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;
namespace StellaOps.Graph.Indexer.Analytics;
public static class GraphAnalyticsServiceCollectionExtensions
{
public static IServiceCollection AddGraphAnalyticsPipeline(
this IServiceCollection services,
Action<GraphAnalyticsOptions>? configureOptions = null)
{
ArgumentNullException.ThrowIfNull(services);
if (configureOptions is not null)
{
services.Configure(configureOptions);
}
else
{
services.Configure<GraphAnalyticsOptions>(_ => { });
}
services.AddSingleton<GraphAnalyticsEngine>(provider =>
{
var options = provider.GetRequiredService<IOptions<GraphAnalyticsOptions>>();
return new GraphAnalyticsEngine(options.Value);
});
services.AddSingleton<GraphAnalyticsMetrics>();
services.AddSingleton<IGraphAnalyticsPipeline, GraphAnalyticsPipeline>();
services.AddHostedService<GraphAnalyticsHostedService>();
return services;
}
}


@@ -0,0 +1,40 @@
using System;
using System.Collections.Immutable;
using System.Text.Json.Nodes;
using StellaOps.Graph.Indexer.Documents;
namespace StellaOps.Graph.Indexer.Analytics;
public sealed record GraphAnalyticsSnapshot(
string Tenant,
string SnapshotId,
DateTimeOffset GeneratedAt,
ImmutableArray<JsonObject> Nodes,
ImmutableArray<JsonObject> Edges);
public sealed record GraphAnalyticsRunContext(bool ForceBackfill);
public sealed record ClusterAssignment(string NodeId, string ClusterId, string Kind);
public sealed record CentralityScore(string NodeId, double Degree, double Betweenness, string Kind);
public sealed record GraphAnalyticsResult(
ImmutableArray<ClusterAssignment> Clusters,
ImmutableArray<CentralityScore> CentralityScores);
public interface IGraphSnapshotProvider
{
Task<IReadOnlyList<GraphAnalyticsSnapshot>> GetPendingSnapshotsAsync(CancellationToken cancellationToken);
Task MarkProcessedAsync(string tenant, string snapshotId, CancellationToken cancellationToken);
}
public interface IGraphAnalyticsWriter
{
Task PersistClusterAssignmentsAsync(GraphAnalyticsSnapshot snapshot, ImmutableArray<ClusterAssignment> assignments, CancellationToken cancellationToken);
Task PersistCentralityAsync(GraphAnalyticsSnapshot snapshot, ImmutableArray<CentralityScore> scores, CancellationToken cancellationToken);
}
public interface IGraphAnalyticsPipeline
{
Task RunAsync(GraphAnalyticsRunContext context, CancellationToken cancellationToken);
}


@@ -0,0 +1,9 @@
namespace StellaOps.Graph.Indexer.Analytics;
public sealed class GraphAnalyticsWriterOptions
{
public string ClusterCollectionName { get; set; } = "graph_cluster_overlays";
public string CentralityCollectionName { get; set; } = "graph_centrality_overlays";
public string NodeCollectionName { get; set; } = "graph_nodes";
public bool WriteClusterAssignmentsToNodes { get; set; } = true;
}


@@ -0,0 +1,26 @@
using System.Collections.Immutable;
namespace StellaOps.Graph.Indexer.Analytics;
public sealed class InMemoryGraphAnalyticsWriter : IGraphAnalyticsWriter
{
private readonly List<(GraphAnalyticsSnapshot Snapshot, ImmutableArray<ClusterAssignment> Assignments)> _clusters = new();
private readonly List<(GraphAnalyticsSnapshot Snapshot, ImmutableArray<CentralityScore> Scores)> _centrality = new();
public IReadOnlyList<(GraphAnalyticsSnapshot Snapshot, ImmutableArray<ClusterAssignment> Assignments)> ClusterWrites => _clusters;
public IReadOnlyList<(GraphAnalyticsSnapshot Snapshot, ImmutableArray<CentralityScore> Scores)> CentralityWrites => _centrality;
public Task PersistClusterAssignmentsAsync(GraphAnalyticsSnapshot snapshot, ImmutableArray<ClusterAssignment> assignments, CancellationToken cancellationToken)
{
cancellationToken.ThrowIfCancellationRequested();
_clusters.Add((snapshot, assignments));
return Task.CompletedTask;
}
public Task PersistCentralityAsync(GraphAnalyticsSnapshot snapshot, ImmutableArray<CentralityScore> scores, CancellationToken cancellationToken)
{
cancellationToken.ThrowIfCancellationRequested();
_centrality.Add((snapshot, scores));
return Task.CompletedTask;
}
}


@@ -0,0 +1,35 @@
using System.Collections.Concurrent;
using System.Collections.Immutable;
namespace StellaOps.Graph.Indexer.Analytics;
public sealed class InMemoryGraphSnapshotProvider : IGraphSnapshotProvider
{
private readonly ConcurrentQueue<GraphAnalyticsSnapshot> _queue = new();
public void Enqueue(GraphAnalyticsSnapshot snapshot)
{
ArgumentNullException.ThrowIfNull(snapshot);
_queue.Enqueue(snapshot);
}
public Task<IReadOnlyList<GraphAnalyticsSnapshot>> GetPendingSnapshotsAsync(CancellationToken cancellationToken)
{
cancellationToken.ThrowIfCancellationRequested();
var list = new List<GraphAnalyticsSnapshot>();
while (_queue.TryDequeue(out var snapshot))
{
list.Add(snapshot);
}
return Task.FromResult<IReadOnlyList<GraphAnalyticsSnapshot>>(list.ToImmutableArray());
}
public Task MarkProcessedAsync(string tenant, string snapshotId, CancellationToken cancellationToken)
{
// No-op for in-memory provider; processing removes items eagerly.
cancellationToken.ThrowIfCancellationRequested();
return Task.CompletedTask;
}
}


@@ -0,0 +1,116 @@
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Globalization;
using System.Linq;
using System.Text.Json.Nodes;
using MongoDB.Bson;
using MongoDB.Driver;
namespace StellaOps.Graph.Indexer.Analytics;
public sealed class MongoGraphAnalyticsWriter : IGraphAnalyticsWriter
{
private readonly IMongoCollection<BsonDocument> _clusters;
private readonly IMongoCollection<BsonDocument> _centrality;
private readonly IMongoCollection<BsonDocument> _nodes;
private readonly GraphAnalyticsWriterOptions _options;
public MongoGraphAnalyticsWriter(IMongoDatabase database, GraphAnalyticsWriterOptions? options = null)
{
ArgumentNullException.ThrowIfNull(database);
_options = options ?? new GraphAnalyticsWriterOptions();
_clusters = database.GetCollection<BsonDocument>(_options.ClusterCollectionName);
_centrality = database.GetCollection<BsonDocument>(_options.CentralityCollectionName);
_nodes = database.GetCollection<BsonDocument>(_options.NodeCollectionName);
}
public async Task PersistClusterAssignmentsAsync(GraphAnalyticsSnapshot snapshot, ImmutableArray<ClusterAssignment> assignments, CancellationToken cancellationToken)
{
cancellationToken.ThrowIfCancellationRequested();
if (assignments.Length == 0)
{
return;
}
var models = new List<WriteModel<BsonDocument>>(assignments.Length);
foreach (var assignment in assignments)
{
var filter = Builders<BsonDocument>.Filter.And(
Builders<BsonDocument>.Filter.Eq("tenant", snapshot.Tenant),
Builders<BsonDocument>.Filter.Eq("snapshot_id", snapshot.SnapshotId),
Builders<BsonDocument>.Filter.Eq("node_id", assignment.NodeId));
var document = new BsonDocument
{
{ "tenant", snapshot.Tenant },
{ "snapshot_id", snapshot.SnapshotId },
{ "node_id", assignment.NodeId },
{ "cluster_id", assignment.ClusterId },
{ "kind", assignment.Kind },
{ "generated_at", snapshot.GeneratedAt.UtcDateTime }
};
models.Add(new ReplaceOneModel<BsonDocument>(filter, document) { IsUpsert = true });
}
await _clusters.BulkWriteAsync(models, new BulkWriteOptions { IsOrdered = false }, cancellationToken).ConfigureAwait(false);
if (_options.WriteClusterAssignmentsToNodes)
{
await WriteClustersToNodesAsync(assignments, cancellationToken).ConfigureAwait(false);
}
}
public async Task PersistCentralityAsync(GraphAnalyticsSnapshot snapshot, ImmutableArray<CentralityScore> scores, CancellationToken cancellationToken)
{
cancellationToken.ThrowIfCancellationRequested();
if (scores.Length == 0)
{
return;
}
var models = new List<WriteModel<BsonDocument>>(scores.Length);
foreach (var score in scores)
{
var filter = Builders<BsonDocument>.Filter.And(
Builders<BsonDocument>.Filter.Eq("tenant", snapshot.Tenant),
Builders<BsonDocument>.Filter.Eq("snapshot_id", snapshot.SnapshotId),
Builders<BsonDocument>.Filter.Eq("node_id", score.NodeId));
var document = new BsonDocument
{
{ "tenant", snapshot.Tenant },
{ "snapshot_id", snapshot.SnapshotId },
{ "node_id", score.NodeId },
{ "kind", score.Kind },
{ "degree", score.Degree },
{ "betweenness", score.Betweenness },
{ "generated_at", snapshot.GeneratedAt.UtcDateTime }
};
models.Add(new ReplaceOneModel<BsonDocument>(filter, document) { IsUpsert = true });
}
await _centrality.BulkWriteAsync(models, new BulkWriteOptions { IsOrdered = false }, cancellationToken).ConfigureAwait(false);
}
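// Copies cluster ids onto existing node documents. Updates are not upserts,
// so assignments for nodes missing from the node collection are skipped.
private async Task WriteClustersToNodesAsync(IEnumerable<ClusterAssignment> assignments, CancellationToken cancellationToken)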
{
var models = new List<WriteModel<BsonDocument>>();
foreach (var assignment in assignments)
{
var filter = Builders<BsonDocument>.Filter.Eq("id", assignment.NodeId);
var update = Builders<BsonDocument>.Update.Set("attributes.cluster_id", assignment.ClusterId);
models.Add(new UpdateOneModel<BsonDocument>(filter, update) { IsUpsert = false });
}
if (models.Count == 0)
{
return;
}
await _nodes.BulkWriteAsync(models, new BulkWriteOptions { IsOrdered = false }, cancellationToken).ConfigureAwait(false);
}
}

@@ -0,0 +1,89 @@
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;
namespace StellaOps.Graph.Indexer.Incremental;
public sealed class GraphBackfillMetrics : IDisposable
{
private const string MeterName = "StellaOps.Graph.Indexer";
private const string MeterVersion = "1.0.0";
private const string ChangesTotalName = "graph_changes_total";
private const string BackfillTotalName = "graph_backfill_total";
private const string FailuresTotalName = "graph_change_failures_total";
private const string LagSecondsName = "graph_change_lag_seconds";
private readonly Meter _meter;
private readonly bool _ownsMeter;
private readonly Counter<long> _changesTotal;
private readonly Counter<long> _backfillTotal;
private readonly Counter<long> _failuresTotal;
private readonly Histogram<double> _lagSeconds;
private bool _disposed;
public GraphBackfillMetrics()
: this(null)
{
}
public GraphBackfillMetrics(Meter? meter)
{
_meter = meter ?? new Meter(MeterName, MeterVersion);
_ownsMeter = meter is null;
_changesTotal = _meter.CreateCounter<long>(ChangesTotalName, unit: "count", description: "Total change events applied.");
_backfillTotal = _meter.CreateCounter<long>(BackfillTotalName, unit: "count", description: "Total backfill events applied.");
_failuresTotal = _meter.CreateCounter<long>(FailuresTotalName, unit: "count", description: "Failed change applications.");
_lagSeconds = _meter.CreateHistogram<double>(LagSecondsName, unit: "s", description: "Lag between change emission and application.");
}
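/// <summary>
/// Records one change application attempt. Every call increments the change counter
/// and records a lag sample; the backfill and failure counters are incremented only
/// when <paramref name="backfill"/> is true or <paramref name="success"/> is false.
/// </summary>
public void RecordApplied(string tenant, bool backfill, TimeSpan lag, bool success)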
{
ThrowIfDisposed();
var tags = new KeyValuePair<string, object?>[]
{
new("tenant", tenant),
new("backfill", backfill),
new("success", success)
};
var tagSpan = tags.AsSpan();
_changesTotal.Add(1, tagSpan);
if (backfill)
{
_backfillTotal.Add(1, tagSpan);
}
if (!success)
{
_failuresTotal.Add(1, tagSpan);
}
_lagSeconds.Record(lag.TotalSeconds, tagSpan);
}
private void ThrowIfDisposed()
{
if (_disposed)
{
throw new ObjectDisposedException(nameof(GraphBackfillMetrics));
}
}
public void Dispose()
{
if (_disposed)
{
return;
}
if (_ownsMeter)
{
_meter.Dispose();
}
_disposed = true;
}
}

@@ -0,0 +1,28 @@
using System.Collections.Immutable;
using System.Text.Json.Nodes;
namespace StellaOps.Graph.Indexer.Incremental;
public sealed record GraphChangeEvent(
string Tenant,
string SnapshotId,
string SequenceToken,
ImmutableArray<JsonObject> Nodes,
ImmutableArray<JsonObject> Edges,
bool IsBackfill = false);
public interface IGraphChangeEventSource
{
IAsyncEnumerable<GraphChangeEvent> ReadAsync(CancellationToken cancellationToken);
}
public interface IGraphBackfillSource
{
IAsyncEnumerable<GraphChangeEvent> ReadBackfillAsync(CancellationToken cancellationToken);
}
public interface IIdempotencyStore
{
Task<bool> HasSeenAsync(string sequenceToken, CancellationToken cancellationToken);
Task MarkSeenAsync(string sequenceToken, CancellationToken cancellationToken);
}

@@ -0,0 +1,12 @@
using System;
namespace StellaOps.Graph.Indexer.Incremental;
public sealed class GraphChangeStreamOptions
{
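// Interval between polls of the change event source.
public TimeSpan PollInterval { get; set; } = TimeSpan.FromSeconds(5);
// Interval between reads of the backfill source.
public TimeSpan BackfillInterval { get; set; } = TimeSpan.FromMinutes(15);
// Delay between failed write attempts for the same event.
public TimeSpan RetryBackoff { get; set; } = TimeSpan.FromSeconds(3);
// Total write attempts per event, including the first, before the event is skipped.
public int MaxRetryAttempts { get; set; } = 3;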
public int MaxBatchSize { get; set; } = 256;
}

@@ -0,0 +1,119 @@
using System.Globalization;
using System.Linq;
using System.Text.Json.Nodes;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using StellaOps.Graph.Indexer.Ingestion.Sbom;
namespace StellaOps.Graph.Indexer.Incremental;
public sealed class GraphChangeStreamProcessor : BackgroundService
{
private readonly IGraphChangeEventSource _changeSource;
private readonly IGraphBackfillSource _backfillSource;
private readonly IGraphDocumentWriter _writer;
private readonly IIdempotencyStore _idempotencyStore;
private readonly GraphChangeStreamOptions _options;
private readonly GraphBackfillMetrics _metrics;
private readonly ILogger<GraphChangeStreamProcessor> _logger;
public GraphChangeStreamProcessor(
IGraphChangeEventSource changeSource,
IGraphBackfillSource backfillSource,
IGraphDocumentWriter writer,
IIdempotencyStore idempotencyStore,
IOptions<GraphChangeStreamOptions> options,
GraphBackfillMetrics metrics,
ILogger<GraphChangeStreamProcessor> logger)
{
_changeSource = changeSource ?? throw new ArgumentNullException(nameof(changeSource));
_backfillSource = backfillSource ?? throw new ArgumentNullException(nameof(backfillSource));
_writer = writer ?? throw new ArgumentNullException(nameof(writer));
_idempotencyStore = idempotencyStore ?? throw new ArgumentNullException(nameof(idempotencyStore));
_options = options?.Value ?? throw new ArgumentNullException(nameof(options));
_metrics = metrics ?? throw new ArgumentNullException(nameof(metrics));
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
}
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
using var pollTimer = new PeriodicTimer(_options.PollInterval);
using var backfillTimer = new PeriodicTimer(_options.BackfillInterval);
// PeriodicTimer permits only one outstanding WaitForNextTickAsync call per timer,
// so each wait task is kept across iterations and renewed only after it completes.
var pollTask = pollTimer.WaitForNextTickAsync(stoppingToken).AsTask();
var backfillTask = backfillTimer.WaitForNextTickAsync(stoppingToken).AsTask();
while (!stoppingToken.IsCancellationRequested)
{
var completed = await Task.WhenAny(pollTask, backfillTask).ConfigureAwait(false);
if (completed.IsCanceled || stoppingToken.IsCancellationRequested)
{
break;
}
if (completed == pollTask)
{
await ApplyStreamAsync(isBackfill: false, stoppingToken).ConfigureAwait(false);
pollTask = pollTimer.WaitForNextTickAsync(stoppingToken).AsTask();
}
else
{
await ApplyStreamAsync(isBackfill: true, stoppingToken).ConfigureAwait(false);
backfillTask = backfillTimer.WaitForNextTickAsync(stoppingToken).AsTask();
}
}
}
internal async Task ApplyStreamAsync(bool isBackfill, CancellationToken cancellationToken)
{
var source = isBackfill ? _backfillSource.ReadBackfillAsync(cancellationToken) : _changeSource.ReadAsync(cancellationToken);
await foreach (var change in source.WithCancellation(cancellationToken))
{
if (await _idempotencyStore.HasSeenAsync(change.SequenceToken, cancellationToken).ConfigureAwait(false))
{
continue;
}
var attempts = 0;
while (true)
{
try
{
cancellationToken.ThrowIfCancellationRequested();
var batch = new GraphBuildBatch(change.Nodes, change.Edges);
await _writer.WriteAsync(batch, cancellationToken).ConfigureAwait(false);
await _idempotencyStore.MarkSeenAsync(change.SequenceToken, cancellationToken).ConfigureAwait(false);
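// Lag is measured from the first non-empty provenance.collected_at among the nodes;
// it falls back to zero when the value is absent or cannot be parsed.
var collectedAt = change.Nodes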
.Select(n => n.TryGetPropertyValue("provenance", out var prov) && prov is JsonObject obj && obj.TryGetPropertyValue("collected_at", out var collected) ? collected?.GetValue<string>() : null)
.FirstOrDefault(value => !string.IsNullOrWhiteSpace(value));
var lag = DateTimeOffset.TryParse(collectedAt, CultureInfo.InvariantCulture, DateTimeStyles.AdjustToUniversal, out var parsed)
? DateTimeOffset.UtcNow - parsed
: TimeSpan.Zero;
_metrics.RecordApplied(change.Tenant, isBackfill, lag, success: true);
break;
}
catch (OperationCanceledException)
{
throw;
}
catch (Exception ex)
{
attempts++;
_metrics.RecordApplied(change.Tenant, isBackfill, TimeSpan.Zero, success: false);
_logger.LogError(ex, "graph-indexer: change stream apply failed for snapshot {SnapshotId} attempt {Attempt}", change.SnapshotId, attempts);
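if (attempts >= _options.MaxRetryAttempts)
{
// Attempts exhausted: skip this event and move on. It is not marked as seen,
// so it is applied again if a source delivers it later.
break;
}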
await Task.Delay(_options.RetryBackoff, cancellationToken).ConfigureAwait(false);
}
}
}
}
}

@@ -0,0 +1,28 @@
using System;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;
namespace StellaOps.Graph.Indexer.Incremental;
public static class GraphChangeStreamServiceCollectionExtensions
{
public static IServiceCollection AddGraphChangeStreamProcessor(
this IServiceCollection services,
Action<GraphChangeStreamOptions>? configureOptions = null)
{
ArgumentNullException.ThrowIfNull(services);
if (configureOptions is not null)
{
services.Configure(configureOptions);
}
else
{
services.Configure<GraphChangeStreamOptions>(_ => { });
}
services.AddSingleton<GraphBackfillMetrics>();
services.AddHostedService<GraphChangeStreamProcessor>();
return services;
}
}

@@ -0,0 +1,21 @@
using System.Collections.Concurrent;
namespace StellaOps.Graph.Indexer.Incremental;
public sealed class InMemoryIdempotencyStore : IIdempotencyStore
{
private readonly ConcurrentDictionary<string, byte> _seen = new(StringComparer.Ordinal);
public Task<bool> HasSeenAsync(string sequenceToken, CancellationToken cancellationToken)
{
cancellationToken.ThrowIfCancellationRequested();
return Task.FromResult(_seen.ContainsKey(sequenceToken));
}
public Task MarkSeenAsync(string sequenceToken, CancellationToken cancellationToken)
{
cancellationToken.ThrowIfCancellationRequested();
_seen.TryAdd(sequenceToken, 0);
return Task.CompletedTask;
}
}

@@ -0,0 +1,3 @@
using System.Runtime.CompilerServices;
[assembly: InternalsVisibleTo("StellaOps.Graph.Indexer.Tests")]

@@ -11,6 +11,7 @@
<ItemGroup>
<PackageReference Include="Microsoft.Extensions.Logging.Abstractions" Version="10.0.0-rc.2.25502.107" />
<PackageReference Include="Microsoft.Extensions.DependencyInjection.Abstractions" Version="10.0.0-rc.2.25502.107" />
<PackageReference Include="Microsoft.Extensions.Hosting.Abstractions" Version="10.0.0-rc.2.25502.107" />
<PackageReference Include="Microsoft.Extensions.Options" Version="10.0.0-rc.2.25502.107" />
<PackageReference Include="MongoDB.Driver" Version="3.5.0" />
</ItemGroup>

@@ -0,0 +1,27 @@
using System.Linq;
using StellaOps.Graph.Indexer.Analytics;
namespace StellaOps.Graph.Indexer.Tests;
public sealed class GraphAnalyticsEngineTests
{
[Fact]
public void Compute_IsDeterministic_ForLinearGraph()
{
var snapshot = GraphAnalyticsTestData.CreateLinearSnapshot();
var engine = new GraphAnalyticsEngine(new GraphAnalyticsOptions { MaxPropagationIterations = 5, BetweennessSampleSize = 8 });
var first = engine.Compute(snapshot);
var second = engine.Compute(snapshot);
Assert.Equal(first.Clusters, second.Clusters);
Assert.Equal(first.CentralityScores, second.CentralityScores);
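// The fixture chains A -> B -> C and leaves the last node (D) without edges,
// so every node except the last one is expected to share A's cluster.
var mainCluster = first.Clusters.First(c => c.NodeId == snapshot.Nodes[0]["id"]!.GetValue<string>()).ClusterId;
Assert.All(first.Clusters.Where(c => c.NodeId != snapshot.Nodes[^1]["id"]!.GetValue<string>()), c => Assert.Equal(mainCluster, c.ClusterId));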
var centralNode = first.CentralityScores.OrderByDescending(c => c.Betweenness).First();
Assert.True(centralNode.Betweenness > 0);
Assert.True(centralNode.Degree >= 2);
}
}

@@ -0,0 +1,32 @@
using System.Collections.Immutable;
using Microsoft.Extensions.Logging.Abstractions;
using StellaOps.Graph.Indexer.Analytics;
namespace StellaOps.Graph.Indexer.Tests;
public sealed class GraphAnalyticsPipelineTests
{
[Fact]
public async Task RunAsync_WritesClustersAndCentrality()
{
var snapshot = GraphAnalyticsTestData.CreateLinearSnapshot();
var provider = new InMemoryGraphSnapshotProvider();
provider.Enqueue(snapshot);
using var metrics = new GraphAnalyticsMetrics();
var writer = new InMemoryGraphAnalyticsWriter();
var pipeline = new GraphAnalyticsPipeline(
new GraphAnalyticsEngine(new GraphAnalyticsOptions()),
provider,
writer,
metrics,
NullLogger<GraphAnalyticsPipeline>.Instance);
await pipeline.RunAsync(new GraphAnalyticsRunContext(false), CancellationToken.None);
Assert.Single(writer.ClusterWrites);
Assert.Single(writer.CentralityWrites);
Assert.Equal(snapshot.Nodes.Length, writer.ClusterWrites.Single().Assignments.Length);
}
}

@@ -0,0 +1,78 @@
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Text.Json.Nodes;
using StellaOps.Graph.Indexer.Analytics;
using StellaOps.Graph.Indexer.Schema;
namespace StellaOps.Graph.Indexer.Tests;
internal static class GraphAnalyticsTestData
{
public static GraphAnalyticsSnapshot CreateLinearSnapshot()
{
var tenant = "tenant-a";
var provenance = new GraphProvenanceSpec("test", DateTimeOffset.UtcNow, "sbom-1", 1);
var nodeA = GraphDocumentFactory.CreateNode(new GraphNodeSpec(
tenant,
"component",
new Dictionary<string, string> { { "purl", "pkg:npm/a@1.0.0" } },
new JsonObject { ["purl"] = "pkg:npm/a@1.0.0" },
provenance,
DateTimeOffset.UtcNow,
null));
var nodeB = GraphDocumentFactory.CreateNode(new GraphNodeSpec(
tenant,
"component",
new Dictionary<string, string> { { "purl", "pkg:npm/b@1.0.0" } },
new JsonObject { ["purl"] = "pkg:npm/b@1.0.0" },
provenance,
DateTimeOffset.UtcNow,
null));
var nodeC = GraphDocumentFactory.CreateNode(new GraphNodeSpec(
tenant,
"component",
new Dictionary<string, string> { { "purl", "pkg:npm/c@1.0.0" } },
new JsonObject { ["purl"] = "pkg:npm/c@1.0.0" },
provenance,
DateTimeOffset.UtcNow,
null));
var nodeD = GraphDocumentFactory.CreateNode(new GraphNodeSpec(
tenant,
"component",
new Dictionary<string, string> { { "purl", "pkg:npm/d@1.0.0" } },
new JsonObject { ["purl"] = "pkg:npm/d@1.0.0" },
provenance,
DateTimeOffset.UtcNow,
null));
var edgeAB = CreateDependsOnEdge(tenant, nodeA["id"]!.GetValue<string>(), nodeB["id"]!.GetValue<string>(), provenance);
var edgeBC = CreateDependsOnEdge(tenant, nodeB["id"]!.GetValue<string>(), nodeC["id"]!.GetValue<string>(), provenance);
return new GraphAnalyticsSnapshot(
tenant,
"snapshot-1",
DateTimeOffset.UtcNow,
ImmutableArray.Create(nodeA, nodeB, nodeC, nodeD),
ImmutableArray.Create(edgeAB, edgeBC));
}
private static JsonObject CreateDependsOnEdge(string tenant, string sourceNodeId, string dependencyNodeId, GraphProvenanceSpec provenance)
{
return GraphDocumentFactory.CreateEdge(new GraphEdgeSpec(
tenant,
"DEPENDS_ON",
new Dictionary<string, string>
{
{ "component_node_id", sourceNodeId },
{ "dependency_node_id", dependencyNodeId }
},
new JsonObject(),
provenance,
DateTimeOffset.UtcNow,
null));
}
}

@@ -0,0 +1,110 @@
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Text.Json.Nodes;
using Microsoft.Extensions.Logging.Abstractions;
using Microsoft.Extensions.Options;
using StellaOps.Graph.Indexer.Incremental;
using StellaOps.Graph.Indexer.Ingestion.Sbom;
namespace StellaOps.Graph.Indexer.Tests;
public sealed class GraphChangeStreamProcessorTests
{
[Fact]
public async Task ApplyStreamAsync_SkipsDuplicates_AndRetries()
{
var tenant = "tenant-a";
var nodes = ImmutableArray.Create(new JsonObject { ["id"] = "gn:tenant-a:component:a", ["kind"] = "component" });
var edges = ImmutableArray<JsonObject>.Empty;
var events = new List<GraphChangeEvent>
{
new(tenant, "snap-1", "seq-1", nodes, edges, false),
new(tenant, "snap-1", "seq-1", nodes, edges, false), // duplicate
new(tenant, "snap-1", "seq-2", nodes, edges, false)
};
var changeSource = new FakeChangeSource(events);
var backfillSource = new FakeChangeSource(Array.Empty<GraphChangeEvent>());
var store = new InMemoryIdempotencyStore();
var writer = new FlakyWriter(failFirst: true);
using var metrics = new GraphBackfillMetrics();
var options = Options.Create(new GraphChangeStreamOptions
{
MaxRetryAttempts = 3,
RetryBackoff = TimeSpan.FromMilliseconds(10)
});
var processor = new GraphChangeStreamProcessor(
changeSource,
backfillSource,
writer,
store,
options,
metrics,
NullLogger<GraphChangeStreamProcessor>.Instance);
await processor.ApplyStreamAsync(isBackfill: false, CancellationToken.None);
Assert.Equal(2, writer.BatchCount); // duplicate skipped
Assert.True(writer.SucceededAfterRetry);
}
private sealed class FakeChangeSource : IGraphChangeEventSource, IGraphBackfillSource
{
private readonly IReadOnlyList<GraphChangeEvent> _events;
public FakeChangeSource(IReadOnlyList<GraphChangeEvent> events)
{
_events = events;
}
public IAsyncEnumerable<GraphChangeEvent> ReadAsync(CancellationToken cancellationToken)
{
return EmitAsync(cancellationToken);
}
public IAsyncEnumerable<GraphChangeEvent> ReadBackfillAsync(CancellationToken cancellationToken)
{
return EmitAsync(cancellationToken);
}
private async IAsyncEnumerable<GraphChangeEvent> EmitAsync([System.Runtime.CompilerServices.EnumeratorCancellation] CancellationToken cancellationToken)
{
foreach (var change in _events)
{
cancellationToken.ThrowIfCancellationRequested();
yield return change;
await Task.Yield();
}
}
}
// Test double: throws on the first WriteAsync call when failFirst is set, then succeeds.
private sealed class FlakyWriter : IGraphDocumentWriter
{
private readonly bool _failFirst;
private int _attempts;
public FlakyWriter(bool failFirst)
{
_failFirst = failFirst;
}
public int BatchCount { get; private set; }
public bool SucceededAfterRetry => _attempts > 1 && BatchCount > 0;
public Task WriteAsync(GraphBuildBatch batch, CancellationToken cancellationToken)
{
_attempts++;
if (_failFirst && _attempts == 1)
{
throw new InvalidOperationException("simulated failure");
}
BatchCount++;
return Task.CompletedTask;
}
}
}

@@ -9,5 +9,8 @@
<ItemGroup>
<ProjectReference Include="../../StellaOps.Graph.Indexer/StellaOps.Graph.Indexer.csproj" />
<PackageReference Include="xunit" Version="2.9.2" />
<PackageReference Include="xunit.runner.visualstudio" Version="2.8.2" />
<PackageReference Include="Microsoft.NET.Test.Sdk" Version="17.11.1" />
</ItemGroup>
</Project>