feat: Initialize Zastava Webhook service with TLS and Authority authentication

- Added Program.cs to set up the web application with Serilog for logging, health check endpoints, and a placeholder admission endpoint.
- Configured Kestrel server to use TLS 1.3 and handle client certificates appropriately.
- Created StellaOps.Zastava.Webhook.csproj with necessary dependencies including Serilog and Polly.
- Documented tasks in TASKS.md for the Zastava Webhook project, outlining current work and exit criteria for each task.
Contained in: master
Commit: d099a90f9b (parent 2062da7a8b)
Date: 2025-10-19 18:36:22 +03:00
966 changed files with 91,038 additions and 1,850 deletions

View File

@@ -0,0 +1,33 @@
# StellaOps.Scanner.Analyzers.Lang — Agent Charter
## Role
Deliver deterministic language ecosystem analyzers that run inside Scanner Workers, emit component evidence for SBOM assembly, and package as restart-time plug-ins.
## Scope
- Shared analyzer abstractions for installed application ecosystems (Java, Node.js, Python, Go, .NET, Rust).
- Evidence helpers that map on-disk artefacts to canonical component identities (purl/bin sha) with provenance and usage flags.
- File-system traversal, metadata parsing, and normalization for language-specific package formats.
- Plug-in bootstrap, manifest authoring, and DI registration so Workers load analyzers at start-up.
## Out of Scope
- OS package analyzers, native link graph, or EntryTrace plug-ins (handled by other guilds).
- SBOM composition, diffing, or signing (owned by Emit/Diff/Signer groups).
- Policy adjudication or vulnerability joins.
## Expectations
- Deterministic output: identical inputs → identical component ordering and hashes.
- Memory discipline: streaming walkers, avoid loading entire trees; reuse buffers.
- Cancellation-aware and timeboxed per layer.
- Enrich telemetry (counters + timings) via Scanner.Core primitives.
- Update `TASKS.md` as work progresses (TODO → DOING → DONE/BLOCKED).
## Dependencies
- Scanner.Core contracts + observability helpers.
- Scanner.Worker analyzer dispatcher.
- Upcoming Scanner.Emit models for SBOM assembly.
- Plugin host infrastructure under `StellaOps.Plugin`.
## Testing & Artifacts
- Determinism harness with golden fixtures under `Fixtures/`.
- Microbenchmarks recorded per language where feasible.
- Plugin manifests stored under `plugins/scanner/analyzers/lang/` with cosign workflow documented.

View File

@@ -0,0 +1,24 @@
namespace StellaOps.Scanner.Analyzers.Lang;
/// <summary>
/// Contract implemented by language ecosystem analyzers. Analyzers must be deterministic,
/// cancellation-aware, and refrain from mutating shared state.
/// </summary>
public interface ILanguageAnalyzer
{
/// <summary>
/// Stable identifier (e.g., <c>java</c>, <c>node</c>).
/// </summary>
string Id { get; }
/// <summary>
/// Human-readable display name for diagnostics.
/// </summary>
string DisplayName { get; }
/// <summary>
/// Executes the analyzer against the resolved filesystem.
/// </summary>
ValueTask AnalyzeAsync(LanguageAnalyzerContext context, LanguageComponentWriter writer, CancellationToken cancellationToken);
}
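// Illustrative sketch only (not shipped code): a minimal, hypothetical analyzer showing how
// the contract above is expected to be implemented: deterministic, cancellation-aware, and
// free of shared mutable state. The "example" ecosystem, manifest file name, and purl are
// placeholder assumptions.
internal sealed class ExampleManifestAnalyzer : ILanguageAnalyzer
{
    public string Id => "example";

    public string DisplayName => "Example Manifest Analyzer";

    public ValueTask AnalyzeAsync(LanguageAnalyzerContext context, LanguageComponentWriter writer, CancellationToken cancellationToken)
    {
        cancellationToken.ThrowIfCancellationRequested();

        var manifestPath = context.ResolvePath("example.manifest");
        if (File.Exists(manifestPath))
        {
            writer.AddFromPurl(
                analyzerId: Id,
                purl: "pkg:generic/example@1.0.0",
                name: "example",
                version: "1.0.0",
                type: "generic",
                evidence: new[]
                {
                    new LanguageComponentEvidence(
                        LanguageEvidenceKind.File,
                        Source: "example.manifest",
                        Locator: context.GetRelativePath(manifestPath),
                        Value: null,
                        Sha256: null),
                },
                usedByEntrypoint: context.UsageHints.IsPathUsed(manifestPath));
        }

        return ValueTask.CompletedTask;
    }
}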

View File

@@ -0,0 +1,16 @@
namespace StellaOps.Scanner.Analyzers.Lang.Internal;
internal static class LanguageAnalyzerJson
{
public static JsonSerializerOptions CreateDefault(bool indent = false)
{
var options = new JsonSerializerOptions(JsonSerializerDefaults.Web)
{
WriteIndented = indent,
DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull,
};
options.Converters.Add(new JsonStringEnumConverter(JsonNamingPolicy.CamelCase));
return options;
}
}

View File

@@ -0,0 +1,67 @@
namespace StellaOps.Scanner.Analyzers.Lang;
public sealed class LanguageAnalyzerContext
{
public LanguageAnalyzerContext(string rootPath, TimeProvider timeProvider, LanguageUsageHints? usageHints = null, IServiceProvider? services = null)
{
if (string.IsNullOrWhiteSpace(rootPath))
{
throw new ArgumentException("Root path is required", nameof(rootPath));
}
RootPath = Path.GetFullPath(rootPath);
if (!Directory.Exists(RootPath))
{
throw new DirectoryNotFoundException($"Root path '{RootPath}' does not exist.");
}
TimeProvider = timeProvider ?? throw new ArgumentNullException(nameof(timeProvider));
UsageHints = usageHints ?? LanguageUsageHints.Empty;
Services = services;
}
public string RootPath { get; }
public TimeProvider TimeProvider { get; }
public LanguageUsageHints UsageHints { get; }
public IServiceProvider? Services { get; }
public bool TryGetService<T>([NotNullWhen(true)] out T? service) where T : class
{
if (Services is null)
{
service = null;
return false;
}
service = Services.GetService(typeof(T)) as T;
return service is not null;
}
public string ResolvePath(ReadOnlySpan<char> relative)
{
if (relative.IsEmpty)
{
return RootPath;
}
var relativeString = new string(relative);
var combined = Path.Combine(RootPath, relativeString);
return Path.GetFullPath(combined);
}
public string GetRelativePath(string absolutePath)
{
if (string.IsNullOrWhiteSpace(absolutePath))
{
return string.Empty;
}
var relative = Path.GetRelativePath(RootPath, absolutePath);
return OperatingSystem.IsWindows()
? relative.Replace('\\', '/')
: relative;
}
}
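// Illustrative usage sketch (not shipped code): shows how a Worker-side caller might create
// a context over an unpacked layer root. The relative paths and the single usage hint are
// placeholder assumptions.
internal static class LanguageAnalyzerContextExample
{
    public static LanguageAnalyzerContext Create(string rootPath)
    {
        // rootPath must already exist on disk; the constructor canonicalizes and validates it.
        var hints = new LanguageUsageHints(new[] { Path.Combine(rootPath, "app/server.js") });
        var context = new LanguageAnalyzerContext(rootPath, TimeProvider.System, hints);

        // ResolvePath joins against the canonical root; GetRelativePath always yields '/'-separated output.
        var lockFile = context.ResolvePath("app/package-lock.json");
        _ = context.GetRelativePath(lockFile); // e.g. "app/package-lock.json"

        return context;
    }
}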

View File

@@ -0,0 +1,59 @@
namespace StellaOps.Scanner.Analyzers.Lang;
public sealed class LanguageAnalyzerEngine
{
private readonly IReadOnlyList<ILanguageAnalyzer> _analyzers;
public LanguageAnalyzerEngine(IEnumerable<ILanguageAnalyzer> analyzers)
{
if (analyzers is null)
{
throw new ArgumentNullException(nameof(analyzers));
}
_analyzers = analyzers
.Where(static analyzer => analyzer is not null)
.Distinct(new AnalyzerIdComparer())
.OrderBy(static analyzer => analyzer.Id, StringComparer.Ordinal)
.ToArray();
}
public IReadOnlyList<ILanguageAnalyzer> Analyzers => _analyzers;
public async ValueTask<LanguageAnalyzerResult> AnalyzeAsync(LanguageAnalyzerContext context, CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(context);
var builder = new LanguageAnalyzerResultBuilder();
var writer = new LanguageComponentWriter(builder);
foreach (var analyzer in _analyzers)
{
cancellationToken.ThrowIfCancellationRequested();
await analyzer.AnalyzeAsync(context, writer, cancellationToken).ConfigureAwait(false);
}
return builder.Build();
}
private sealed class AnalyzerIdComparer : IEqualityComparer<ILanguageAnalyzer>
{
public bool Equals(ILanguageAnalyzer? x, ILanguageAnalyzer? y)
{
if (ReferenceEquals(x, y))
{
return true;
}
if (x is null || y is null)
{
return false;
}
return string.Equals(x.Id, y.Id, StringComparison.Ordinal);
}
public int GetHashCode(ILanguageAnalyzer obj)
=> obj?.Id is null ? 0 : StringComparer.Ordinal.GetHashCode(obj.Id);
}
}
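// Illustrative sketch (not shipped code): composing the engine from whatever analyzers the
// Worker's plug-in host resolved and serializing the merged result. The method name is a
// placeholder assumption.
internal static class LanguageAnalyzerEngineExample
{
    public static async Task<string> RunAsync(
        IEnumerable<ILanguageAnalyzer> analyzers,
        LanguageAnalyzerContext context,
        CancellationToken cancellationToken)
    {
        // Analyzers run sequentially in ordinal Id order; duplicate Ids are de-duplicated by the engine.
        var engine = new LanguageAnalyzerEngine(analyzers);
        var result = await engine.AnalyzeAsync(context, cancellationToken).ConfigureAwait(false);

        // Deterministic JSON snapshot, ordered by component key then analyzer id.
        return result.ToJson(indent: true);
    }
}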

View File

@@ -0,0 +1,111 @@
using StellaOps.Scanner.Core.Contracts;
namespace StellaOps.Scanner.Analyzers.Lang;
public sealed class LanguageAnalyzerResult
{
private readonly ImmutableArray<LanguageComponentRecord> _components;
internal LanguageAnalyzerResult(IEnumerable<LanguageComponentRecord> components)
{
_components = components
.OrderBy(static record => record.ComponentKey, StringComparer.Ordinal)
.ThenBy(static record => record.AnalyzerId, StringComparer.Ordinal)
.ToImmutableArray();
}
public IReadOnlyList<LanguageComponentRecord> Components => _components;
public ImmutableArray<ComponentRecord> ToComponentRecords(string analyzerId, string? layerDigest = null)
=> LanguageComponentMapper.ToComponentRecords(analyzerId, _components, layerDigest);
public LayerComponentFragment ToLayerFragment(string analyzerId, string? layerDigest = null)
=> LanguageComponentMapper.ToLayerFragment(analyzerId, _components, layerDigest);
public IReadOnlyList<LanguageComponentSnapshot> ToSnapshots()
=> _components.Select(static component => component.ToSnapshot()).ToImmutableArray();
public string ToJson(bool indent = true)
{
var snapshots = ToSnapshots();
var options = Internal.LanguageAnalyzerJson.CreateDefault(indent);
return JsonSerializer.Serialize(snapshots, options);
}
}
internal sealed class LanguageAnalyzerResultBuilder
{
private readonly Dictionary<string, LanguageComponentRecord> _records = new(StringComparer.Ordinal);
private readonly object _sync = new();
public void Add(LanguageComponentRecord record)
{
ArgumentNullException.ThrowIfNull(record);
lock (_sync)
{
if (_records.TryGetValue(record.ComponentKey, out var existing))
{
existing.Merge(record);
return;
}
_records[record.ComponentKey] = record;
}
}
public void AddRange(IEnumerable<LanguageComponentRecord> records)
{
foreach (var record in records ?? Array.Empty<LanguageComponentRecord>())
{
Add(record);
}
}
public LanguageAnalyzerResult Build()
{
lock (_sync)
{
return new LanguageAnalyzerResult(_records.Values.ToArray());
}
}
}
public sealed class LanguageComponentWriter
{
private readonly LanguageAnalyzerResultBuilder _builder;
internal LanguageComponentWriter(LanguageAnalyzerResultBuilder builder)
{
_builder = builder ?? throw new ArgumentNullException(nameof(builder));
}
public void Add(LanguageComponentRecord record)
=> _builder.Add(record);
public void AddRange(IEnumerable<LanguageComponentRecord> records)
=> _builder.AddRange(records);
public void AddFromPurl(
string analyzerId,
string purl,
string name,
string? version,
string type,
IEnumerable<KeyValuePair<string, string?>>? metadata = null,
IEnumerable<LanguageComponentEvidence>? evidence = null,
bool usedByEntrypoint = false)
=> Add(LanguageComponentRecord.FromPurl(analyzerId, purl, name, version, type, metadata, evidence, usedByEntrypoint));
public void AddFromExplicitKey(
string analyzerId,
string componentKey,
string? purl,
string name,
string? version,
string type,
IEnumerable<KeyValuePair<string, string?>>? metadata = null,
IEnumerable<LanguageComponentEvidence>? evidence = null,
bool usedByEntrypoint = false)
=> Add(LanguageComponentRecord.FromExplicitKey(analyzerId, componentKey, purl, name, version, type, metadata, evidence, usedByEntrypoint));
}
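// Illustrative sketch (not shipped code): the two writer entry points, purl-keyed components
// and explicit-key fallbacks such as hash-keyed binaries. The concrete purl, key, and
// metadata values are placeholder assumptions.
internal static class LanguageComponentWriterExample
{
    public static void Emit(LanguageComponentWriter writer)
    {
        writer.AddFromPurl(
            analyzerId: "node",
            purl: "pkg:npm/left-pad@1.3.0",
            name: "left-pad",
            version: "1.3.0",
            type: "npm",
            metadata: new[] { new KeyValuePair<string, string?>("license", "MIT") });

        writer.AddFromExplicitKey(
            analyzerId: "golang",
            componentKey: "bin:0123abcd", // placeholder digest
            purl: null,
            name: "unknown-binary",
            version: null,
            type: "bin");
    }
}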

View File

@@ -0,0 +1,18 @@
namespace StellaOps.Scanner.Analyzers.Lang;
public enum LanguageEvidenceKind
{
File,
Metadata,
Derived,
}
public sealed record LanguageComponentEvidence(
LanguageEvidenceKind Kind,
string Source,
string Locator,
string? Value,
string? Sha256)
{
public string ComparisonKey => string.Join('|', Kind, Source, Locator, Value, Sha256);
}

View File

@@ -0,0 +1,223 @@
using System.Collections.Immutable;
using System.Security.Cryptography;
using System.Text;
using StellaOps.Scanner.Core.Contracts;
namespace StellaOps.Scanner.Analyzers.Lang;
/// <summary>
/// Helpers converting language analyzer component records into canonical scanner component models.
/// </summary>
public static class LanguageComponentMapper
{
private const string LayerHashPrefix = "stellaops:lang:";
private const string MetadataPrefix = "stellaops.lang";
/// <summary>
/// Computes a deterministic synthetic layer digest for the supplied analyzer identifier.
/// </summary>
public static string ComputeLayerDigest(string analyzerId)
{
ArgumentException.ThrowIfNullOrWhiteSpace(analyzerId);
var payload = $"{LayerHashPrefix}{analyzerId.Trim().ToLowerInvariant()}";
var bytes = Encoding.UTF8.GetBytes(payload);
var hash = SHA256.HashData(bytes);
return $"sha256:{Convert.ToHexString(hash).ToLowerInvariant()}";
}
/// <summary>
/// Projects language component records into a deterministic set of component records.
/// </summary>
public static ImmutableArray<ComponentRecord> ToComponentRecords(
string analyzerId,
IEnumerable<LanguageComponentRecord> components,
string? layerDigest = null)
{
ArgumentException.ThrowIfNullOrWhiteSpace(analyzerId);
ArgumentNullException.ThrowIfNull(components);
var effectiveLayer = string.IsNullOrWhiteSpace(layerDigest)
? ComputeLayerDigest(analyzerId)
: layerDigest!;
var builder = ImmutableArray.CreateBuilder<ComponentRecord>();
foreach (var record in components.OrderBy(static component => component.ComponentKey, StringComparer.Ordinal))
{
builder.Add(CreateComponentRecord(analyzerId, effectiveLayer, record));
}
return builder.ToImmutable();
}
/// <summary>
/// Creates a layer component fragment using the supplied component records.
/// </summary>
public static LayerComponentFragment ToLayerFragment(
string analyzerId,
IEnumerable<LanguageComponentRecord> components,
string? layerDigest = null)
{
var componentRecords = ToComponentRecords(analyzerId, components, layerDigest);
if (componentRecords.IsEmpty)
{
return LayerComponentFragment.Create(ComputeLayerDigest(analyzerId), componentRecords);
}
return LayerComponentFragment.Create(componentRecords[0].LayerDigest, componentRecords);
}
private static ComponentRecord CreateComponentRecord(
string analyzerId,
string layerDigest,
LanguageComponentRecord record)
{
ArgumentNullException.ThrowIfNull(record);
var identity = ComponentIdentity.Create(
key: ResolveIdentityKey(record),
name: record.Name,
version: record.Version,
purl: record.Purl,
componentType: record.Type);
var evidence = MapEvidence(record);
var metadata = BuildMetadata(analyzerId, record);
var usage = record.UsedByEntrypoint
? ComponentUsage.Create(usedByEntrypoint: true)
: ComponentUsage.Unused;
return new ComponentRecord
{
Identity = identity,
LayerDigest = layerDigest,
Evidence = evidence,
Dependencies = ImmutableArray<string>.Empty,
Metadata = metadata,
Usage = usage,
};
}
private static ImmutableArray<ComponentEvidence> MapEvidence(LanguageComponentRecord record)
{
var builder = ImmutableArray.CreateBuilder<ComponentEvidence>();
foreach (var item in record.Evidence)
{
if (item is null)
{
continue;
}
var kind = item.Kind switch
{
LanguageEvidenceKind.File => "file",
LanguageEvidenceKind.Metadata => "metadata",
LanguageEvidenceKind.Derived => "derived",
_ => "unknown",
};
var value = string.IsNullOrWhiteSpace(item.Locator) ? item.Source : item.Locator;
if (string.IsNullOrWhiteSpace(value))
{
value = kind;
}
builder.Add(new ComponentEvidence
{
Kind = kind,
Value = value,
Source = string.IsNullOrWhiteSpace(item.Source) ? null : item.Source,
});
}
return builder.Count == 0
? ImmutableArray<ComponentEvidence>.Empty
: builder.ToImmutable();
}
private static ComponentMetadata? BuildMetadata(string analyzerId, LanguageComponentRecord record)
{
var properties = new SortedDictionary<string, string>(StringComparer.Ordinal)
{
[$"{MetadataPrefix}.analyzerId"] = analyzerId
};
var licenseList = new List<string>();
foreach (var pair in record.Metadata)
{
if (string.IsNullOrWhiteSpace(pair.Key))
{
continue;
}
if (!string.IsNullOrWhiteSpace(pair.Value))
{
var value = pair.Value.Trim();
properties[$"{MetadataPrefix}.meta.{pair.Key}"] = value;
if (IsLicenseKey(pair.Key) && value.Length > 0)
{
foreach (var candidate in value.Split(new[] { ',', ';' }, StringSplitOptions.TrimEntries | StringSplitOptions.RemoveEmptyEntries))
{
if (candidate.Length > 0)
{
licenseList.Add(candidate);
}
}
}
}
}
var evidenceIndex = 0;
foreach (var evidence in record.Evidence)
{
if (evidence is null)
{
continue;
}
var prefix = $"{MetadataPrefix}.evidence.{evidenceIndex}";
if (!string.IsNullOrWhiteSpace(evidence.Value))
{
properties[$"{prefix}.value"] = evidence.Value.Trim();
}
if (!string.IsNullOrWhiteSpace(evidence.Sha256))
{
properties[$"{prefix}.sha256"] = evidence.Sha256.Trim();
}
evidenceIndex++;
}
IReadOnlyList<string>? licenses = null;
if (licenseList.Count > 0)
{
licenses = licenseList
.Distinct(StringComparer.OrdinalIgnoreCase)
.OrderBy(static license => license, StringComparer.Ordinal)
.ToArray();
}
return new ComponentMetadata
{
Licenses = licenses,
Properties = properties.Count == 0 ? null : properties,
};
}
private static string ResolveIdentityKey(LanguageComponentRecord record)
{
var key = record.ComponentKey;
if (key.StartsWith("purl::", StringComparison.Ordinal))
{
return key[6..];
}
return key;
}
private static bool IsLicenseKey(string key)
=> key.Contains("license", StringComparison.OrdinalIgnoreCase);
}
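// Illustrative sketch (not shipped code): projecting an analyzer result into a layer
// fragment. The synthetic layer digest is deterministic per analyzer id, so identical inputs
// always yield identical fragments. The "node" id is a placeholder assumption.
internal static class LanguageComponentMapperExample
{
    public static LayerComponentFragment BuildFragment(LanguageAnalyzerResult result)
    {
        var syntheticDigest = LanguageComponentMapper.ComputeLayerDigest("node"); // "sha256:..."
        return result.ToLayerFragment("node", syntheticDigest);
    }
}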

View File

@@ -0,0 +1,219 @@
namespace StellaOps.Scanner.Analyzers.Lang;
public sealed class LanguageComponentRecord
{
private readonly SortedDictionary<string, string?> _metadata;
private readonly SortedDictionary<string, LanguageComponentEvidence> _evidence;
private LanguageComponentRecord(
string analyzerId,
string componentKey,
string? purl,
string name,
string? version,
string type,
IEnumerable<KeyValuePair<string, string?>> metadata,
IEnumerable<LanguageComponentEvidence> evidence,
bool usedByEntrypoint)
{
AnalyzerId = analyzerId ?? throw new ArgumentNullException(nameof(analyzerId));
ComponentKey = componentKey ?? throw new ArgumentNullException(nameof(componentKey));
Purl = string.IsNullOrWhiteSpace(purl) ? null : purl.Trim();
Name = name ?? throw new ArgumentNullException(nameof(name));
Version = string.IsNullOrWhiteSpace(version) ? null : version.Trim();
Type = string.IsNullOrWhiteSpace(type) ? throw new ArgumentException("Type is required", nameof(type)) : type.Trim();
UsedByEntrypoint = usedByEntrypoint;
_metadata = new SortedDictionary<string, string?>(StringComparer.Ordinal);
foreach (var entry in metadata ?? Array.Empty<KeyValuePair<string, string?>>())
{
if (string.IsNullOrWhiteSpace(entry.Key))
{
continue;
}
_metadata[entry.Key.Trim()] = entry.Value;
}
_evidence = new SortedDictionary<string, LanguageComponentEvidence>(StringComparer.Ordinal);
foreach (var evidenceItem in evidence ?? Array.Empty<LanguageComponentEvidence>())
{
if (evidenceItem is null)
{
continue;
}
_evidence[evidenceItem.ComparisonKey] = evidenceItem;
}
}
public string AnalyzerId { get; }
public string ComponentKey { get; }
public string? Purl { get; }
public string Name { get; }
public string? Version { get; }
public string Type { get; }
public bool UsedByEntrypoint { get; private set; }
public IReadOnlyDictionary<string, string?> Metadata => _metadata;
public IReadOnlyCollection<LanguageComponentEvidence> Evidence => _evidence.Values;
public static LanguageComponentRecord FromPurl(
string analyzerId,
string purl,
string name,
string? version,
string type,
IEnumerable<KeyValuePair<string, string?>>? metadata = null,
IEnumerable<LanguageComponentEvidence>? evidence = null,
bool usedByEntrypoint = false)
{
if (string.IsNullOrWhiteSpace(purl))
{
throw new ArgumentException("purl is required", nameof(purl));
}
var key = $"purl::{purl.Trim()}";
return new LanguageComponentRecord(
analyzerId,
key,
purl,
name,
version,
type,
metadata ?? Array.Empty<KeyValuePair<string, string?>>(),
evidence ?? Array.Empty<LanguageComponentEvidence>(),
usedByEntrypoint);
}
public static LanguageComponentRecord FromExplicitKey(
string analyzerId,
string componentKey,
string? purl,
string name,
string? version,
string type,
IEnumerable<KeyValuePair<string, string?>>? metadata = null,
IEnumerable<LanguageComponentEvidence>? evidence = null,
bool usedByEntrypoint = false)
{
if (string.IsNullOrWhiteSpace(componentKey))
{
throw new ArgumentException("Component key is required", nameof(componentKey));
}
return new LanguageComponentRecord(
analyzerId,
componentKey.Trim(),
purl,
name,
version,
type,
metadata ?? Array.Empty<KeyValuePair<string, string?>>(),
evidence ?? Array.Empty<LanguageComponentEvidence>(),
usedByEntrypoint);
}
internal void Merge(LanguageComponentRecord other)
{
ArgumentNullException.ThrowIfNull(other);
if (!ComponentKey.Equals(other.ComponentKey, StringComparison.Ordinal))
{
throw new InvalidOperationException($"Cannot merge component '{ComponentKey}' with '{other.ComponentKey}'.");
}
UsedByEntrypoint |= other.UsedByEntrypoint;
foreach (var entry in other._metadata)
{
if (!_metadata.TryGetValue(entry.Key, out var existing) || string.IsNullOrEmpty(existing))
{
_metadata[entry.Key] = entry.Value;
}
}
foreach (var evidenceItem in other._evidence)
{
_evidence[evidenceItem.Key] = evidenceItem.Value;
}
}
public LanguageComponentSnapshot ToSnapshot()
{
return new LanguageComponentSnapshot
{
AnalyzerId = AnalyzerId,
ComponentKey = ComponentKey,
Purl = Purl,
Name = Name,
Version = Version,
Type = Type,
UsedByEntrypoint = UsedByEntrypoint,
Metadata = _metadata.ToDictionary(static pair => pair.Key, static pair => pair.Value, StringComparer.Ordinal),
Evidence = _evidence.Values.Select(static item => new LanguageComponentEvidenceSnapshot
{
Kind = item.Kind,
Source = item.Source,
Locator = item.Locator,
Value = item.Value,
Sha256 = item.Sha256,
}).ToArray(),
};
}
}
public sealed class LanguageComponentSnapshot
{
[JsonPropertyName("analyzerId")]
public string AnalyzerId { get; set; } = string.Empty;
[JsonPropertyName("componentKey")]
public string ComponentKey { get; set; } = string.Empty;
[JsonPropertyName("purl")]
public string? Purl { get; set; }
[JsonPropertyName("name")]
public string Name { get; set; } = string.Empty;
[JsonPropertyName("version")]
public string? Version { get; set; }
[JsonPropertyName("type")]
public string Type { get; set; } = string.Empty;
[JsonPropertyName("usedByEntrypoint")]
public bool UsedByEntrypoint { get; set; }
[JsonPropertyName("metadata")]
public IDictionary<string, string?> Metadata { get; set; } = new Dictionary<string, string?>(StringComparer.Ordinal);
[JsonPropertyName("evidence")]
public IReadOnlyList<LanguageComponentEvidenceSnapshot> Evidence { get; set; } = Array.Empty<LanguageComponentEvidenceSnapshot>();
}
public sealed class LanguageComponentEvidenceSnapshot
{
[JsonPropertyName("kind")]
public LanguageEvidenceKind Kind { get; set; }
[JsonPropertyName("source")]
public string Source { get; set; } = string.Empty;
[JsonPropertyName("locator")]
public string Locator { get; set; } = string.Empty;
[JsonPropertyName("value")]
public string? Value { get; set; }
[JsonPropertyName("sha256")]
public string? Sha256 { get; set; }
}
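// Illustrative sketch (not shipped code): records sharing a component key merge instead of
// duplicating; usage flags OR together, metadata fills gaps, and evidence unions by
// comparison key. The purl and metadata values are placeholder assumptions.
internal static class LanguageComponentRecordExample
{
    public static LanguageComponentSnapshot MergeExample()
    {
        var fromMetadata = LanguageComponentRecord.FromPurl(
            analyzerId: "python",
            purl: "pkg:pypi/requests@2.31.0",
            name: "requests",
            version: "2.31.0",
            type: "pypi",
            metadata: new[] { new KeyValuePair<string, string?>("license", "Apache-2.0") });

        var fromEntrypoint = LanguageComponentRecord.FromPurl(
            analyzerId: "python",
            purl: "pkg:pypi/requests@2.31.0",
            name: "requests",
            version: "2.31.0",
            type: "pypi",
            usedByEntrypoint: true);

        // Same component key, so UsedByEntrypoint becomes true and the license metadata survives.
        fromMetadata.Merge(fromEntrypoint);
        return fromMetadata.ToSnapshot();
    }
}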

View File

@@ -0,0 +1,49 @@
namespace StellaOps.Scanner.Analyzers.Lang;
public sealed class LanguageUsageHints
{
private static readonly StringComparer Comparer = OperatingSystem.IsWindows()
? StringComparer.OrdinalIgnoreCase
: StringComparer.Ordinal;
private readonly ImmutableHashSet<string> _usedPaths;
public static LanguageUsageHints Empty { get; } = new(Array.Empty<string>());
public LanguageUsageHints(IEnumerable<string> usedPaths)
{
if (usedPaths is null)
{
throw new ArgumentNullException(nameof(usedPaths));
}
_usedPaths = usedPaths
.Select(Normalize)
.Where(static path => path.Length > 0)
.ToImmutableHashSet(Comparer);
}
public bool IsPathUsed(string path)
{
if (string.IsNullOrWhiteSpace(path))
{
return false;
}
var normalized = Normalize(path);
return _usedPaths.Contains(normalized);
}
private static string Normalize(string path)
{
if (string.IsNullOrWhiteSpace(path))
{
return string.Empty;
}
var full = Path.GetFullPath(path);
return OperatingSystem.IsWindows()
? full.Replace('\\', '/').TrimEnd('/')
: full;
}
}
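// Illustrative sketch (not shipped code): hint paths are canonicalized at construction, so
// lookups tolerate redundant path segments (and case differences on Windows). The paths are
// placeholder assumptions.
internal static class LanguageUsageHintsExample
{
    public static bool Check()
    {
        var hints = new LanguageUsageHints(new[] { "/opt/app/node_modules/express/index.js" });
        return hints.IsPathUsed("/opt/app/./node_modules/express/index.js"); // true after normalization
    }
}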

View File

@@ -0,0 +1,11 @@
global using System;
global using System.Collections.Concurrent;
global using System.Collections.Generic;
global using System.Collections.Immutable;
global using System.Diagnostics.CodeAnalysis;
global using System.IO;
global using System.Linq;
global using System.Text.Json;
global using System.Text.Json.Serialization;
global using System.Threading;
global using System.Threading.Tasks;

View File

@@ -0,0 +1,114 @@
# StellaOps Scanner — Language Analyzer Implementation Plan (2025Q4)
> **Goal.** Deliver best-in-class language analyzers that outperform competitors on fidelity, determinism, and offline readiness while integrating tightly with Scanner Worker orchestration and SBOM composition.
All sprints below assume prerequisites from SP10-G2 (core scaffolding + Java analyzer) are complete. Each sprint is sized for a focused guild (≈1–1.5 weeks) and produces definitive gates for downstream teams (Emit, Policy, Scheduler).
---
## Sprint LA1 — Node Analyzer & Workspace Intelligence (Tasks 10-302, 10-307, 10-308, 10-309 subset) *(DOING — 2025-10-19)*
- **Scope:** Resolve hoisted `node_modules`, PNPM structures, Yarn Berry Plug'n'Play, symlinked workspaces, and detect security-sensitive scripts.
- **Deliverables:**
- `StellaOps.Scanner.Analyzers.Lang.Node` plug-in with manifest + DI registration.
- Deterministic walker supporting >100k modules with streaming JSON parsing (a lock-file parsing sketch follows this sprint's progress notes).
- Workspace graph persisted as analyzer metadata (`package.json` provenance + symlink target proofs).
- **Acceptance Metrics:**
- 10k-module fixture scans <1.8 s on 4 vCPU (p95).
- Memory ceiling <220 MB (tracked via deterministic benchmark harness).
- All symlink targets canonicalized; path traversal guarded.
- **Gate Artifacts:**
- `Fixtures/lang/node/**` golden outputs.
- Analyzer benchmark CSV + flamegraph (commit under `bench/Scanner.Analyzers`).
- Worker integration sample enabling Node analyzer via manifest.
- **Progress (2025-10-19):** Module walker with package-lock/yarn/pnpm resolution, workspace attribution, integrity metadata, and deterministic fixture harness committed; Node tasks 10-302A/B marked DONE. Shared component mapper + canonical result harness landed, closing tasks 10-307/308. Script metadata & telemetry (10-302C) emit policy hints, hashed evidence, and feed `scanner_analyzer_node_scripts_total` into Worker OpenTelemetry pipeline.
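A minimal sketch of the package-lock v2/v3 mapping the LA1 walker performs. It is illustrative only: helper and type names are placeholder assumptions, the shipped walker streams rather than buffering whole documents, and it also consumes yarn/pnpm inputs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

internal static class PackageLockSketch
{
    // Yields (name, version, integrity) for each installed module in a package-lock v2/v3 file.
    public static IEnumerable<(string Name, string Version, string? Integrity)> Read(Stream lockStream)
    {
        using var document = JsonDocument.Parse(lockStream);
        if (!document.RootElement.TryGetProperty("packages", out var packages))
        {
            yield break; // lockfileVersion 1 layout handled elsewhere
        }

        foreach (var entry in packages.EnumerateObject())
        {
            // Keys look like "node_modules/left-pad" or "node_modules/a/node_modules/@scope/b";
            // the package name is everything after the last "node_modules/" segment.
            const string marker = "node_modules/";
            var index = entry.Name.LastIndexOf(marker, StringComparison.Ordinal);
            if (index < 0)
            {
                continue; // the "" entry describes the root project itself
            }

            var name = entry.Name[(index + marker.Length)..];
            var version = entry.Value.TryGetProperty("version", out var v) ? v.GetString() ?? string.Empty : string.Empty;
            var integrity = entry.Value.TryGetProperty("integrity", out var i) ? i.GetString() : null;
            yield return (name, version, integrity);
        }
    }
}
```

From each tuple the analyzer can emit `pkg:npm/{name}@{version}` (percent-encoding the `@` of scoped names) plus the integrity string as metadata evidence.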
## Sprint LA2 — Python Analyzer & Entry Point Attribution (Tasks 10-303, 10-307, 10-308, 10-309 subset)
- **Scope:** Parse `*.dist-info`, `RECORD` hashes, entry points, and pip-installed editable packages; integrate usage hints from EntryTrace.
- **Deliverables:**
- `StellaOps.Scanner.Analyzers.Lang.Python` plug-in.
- RECORD hash validation with optional Zip64 support for `.whl` caches (a verification sketch follows this sprint's gate artifacts).
- Entry-point mapping into `UsageFlags` for Emit stage.
- **Acceptance Metrics:**
- Hash verification throughput ≥75 MB/s sustained with streaming reader.
- False-positive rate for editable installs <1% on curated fixtures.
- Determinism check across metadata generated by CPython 3.8–3.12.
- **Gate Artifacts:**
- Golden fixtures for `site-packages`, virtualenv, and layered pip caches.
- Usage hint propagation tests (EntryTrace → analyzer → SBOM).
- Metrics counters (`scanner_analyzer_python_components_total`) documented.
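A hedged sketch of the RECORD verification called out above: each RECORD line ends with a urlsafe-base64 SHA-256 (no padding) that the analyzer recomputes against the installed file. The helper name and the simple comma split (real RECORD files may quote paths containing commas) are assumptions, not the shipped implementation.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

internal static class RecordVerificationSketch
{
    // RECORD lines look like: requests/__init__.py,sha256=<urlsafe-b64-no-padding>,5178
    public static bool Verify(string sitePackagesRoot, string recordLine)
    {
        var parts = recordLine.Split(',');
        if (parts.Length < 2 || !parts[1].StartsWith("sha256=", StringComparison.Ordinal))
        {
            return true; // no recorded hash (e.g. the RECORD file itself), nothing to verify
        }

        var expected = parts[1]["sha256=".Length..];
        using var stream = File.OpenRead(Path.Combine(sitePackagesRoot, parts[0]));
        var actual = Convert.ToBase64String(SHA256.HashData(stream))
            .TrimEnd('=')
            .Replace('+', '-')
            .Replace('/', '_');

        return string.Equals(actual, expected, StringComparison.Ordinal);
    }
}
```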
## Sprint LA3 — Go Analyzer & Build Info Synthesis (Tasks 10-304, 10-307, 10-308, 10-309 subset)
- **Scope:** Extract Go build metadata from `.note.go.buildid` and embedded module info, falling back to `bin:{sha256}` when neither is present (fallback sketched after this sprint's gate artifacts); surface VCS provenance.
- **Deliverables:**
- `StellaOps.Scanner.Analyzers.Lang.Go` plug-in.
- DWARF-lite parser to enrich component origin (commit hash + dirty flag) when available.
- Shared hash cache to dedupe repeated binaries across layers.
- **Acceptance Metrics:**
- Analyzer latency ≤400 µs per binary (hot cache) / ≤2 ms (cold).
- Provenance coverage ≥95% on a representative Go fixture suite.
- Zero allocations in happy path beyond pooled buffers (validated via BenchmarkDotNet).
- **Gate Artifacts:**
- Benchmarks vs competitor open-source tool (Trivy or Syft) demonstrating faster metadata extraction.
- Documentation snippet explaining VCS metadata fields for Policy team.
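A sketch of the `bin:{sha256}` fallback referenced in the scope: when no buildinfo is embedded, the analyzer can still emit a hash-keyed component through the shared writer. The analyzer id, evidence `Source` label, and type string below are placeholder assumptions.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using StellaOps.Scanner.Analyzers.Lang;

internal static class GoBinaryFallbackSketch
{
    public static void Emit(LanguageComponentWriter writer, string binaryPath, string relativePath)
    {
        using var stream = File.OpenRead(binaryPath);
        var sha256 = Convert.ToHexString(SHA256.HashData(stream)).ToLowerInvariant();

        writer.AddFromExplicitKey(
            analyzerId: "golang",
            componentKey: $"bin:{sha256}",
            purl: null,
            name: Path.GetFileName(binaryPath),
            version: null,
            type: "bin",
            evidence: new[]
            {
                new LanguageComponentEvidence(
                    LanguageEvidenceKind.File,
                    Source: "go-binary",
                    Locator: relativePath,
                    Value: null,
                    Sha256: sha256),
            });
    }
}
```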
## Sprint LA4 — .NET Analyzer & RID Variants (Tasks 10-305, 10-307, 10-308, 10-309 subset)
- **Scope:** Parse `*.deps.json`, `runtimeconfig.json`, assembly metadata, and RID-specific assets; correlate with native dependencies (a deps.json enumeration sketch follows this sprint's gate artifacts).
- **Deliverables:**
- `StellaOps.Scanner.Analyzers.Lang.DotNet` plug-in.
- Optional strong-name + Authenticode verification when an offline cert bundle is provided.
- RID-aware component grouping with fallback to `bin:{sha256}` for self-contained apps.
- **Acceptance Metrics:**
- Multi-target app fixture processed <1.2 s; memory <250 MB.
- RID variant collapse reduces component explosion by ≥40% vs naive listing.
- All security metadata (signing Publisher, timestamp) surfaced deterministically.
- **Gate Artifacts:**
- Signed .NET sample apps (framework-dependent & self-contained) under `samples/scanner/lang/dotnet/`.
- Tests verifying dual runtimeconfig merge logic.
- Guidance for Policy on license propagation from NuGet metadata.
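A simplified sketch of the `*.deps.json` enumeration mentioned in the scope. The shipped analyzer additionally merges `runtimeconfig.json`, RID-specific assets, and assembly metadata; the type and helper names here are placeholder assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

internal static class DepsJsonSketch
{
    // Yields (name, version) for every "package" library entry; keys are "PackageName/Version".
    public static IEnumerable<(string Name, string Version)> ReadPackages(Stream depsJson)
    {
        using var document = JsonDocument.Parse(depsJson);
        if (!document.RootElement.TryGetProperty("libraries", out var libraries))
        {
            yield break;
        }

        foreach (var library in libraries.EnumerateObject())
        {
            if (!library.Value.TryGetProperty("type", out var type) ||
                !string.Equals(type.GetString(), "package", StringComparison.Ordinal))
            {
                continue; // "project" entries are the app itself, not NuGet components
            }

            var separator = library.Name.LastIndexOf('/');
            if (separator > 0)
            {
                yield return (library.Name[..separator], library.Name[(separator + 1)..]);
            }
        }
    }
}
```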
## Sprint LA5 — Rust Analyzer & Binary Fingerprinting (Tasks 10-306, 10-307, 10-308, 10-309 subset)
- **Scope:** Detect crates via metadata in `.fingerprint`, Cargo.lock fragments, or embedded `rustc` markers; robust fallback to binary hash classification (a Cargo.lock parsing sketch follows this sprint's gate artifacts).
- **Deliverables:**
- `StellaOps.Scanner.Analyzers.Lang.Rust` plug-in.
- Symbol table heuristics capable of attributing stripped binaries by leveraging `.comment` and section names without violating determinism.
- Quiet-provenance flags to differentiate heuristics from hard evidence.
- **Acceptance Metrics:**
- Crate attribution accuracy ≥85% on curated Cargo workspace fixtures.
- Heuristic fallback clearly labeled; no unwarranted certainty claims.
- Analyzer completes in <1 s on a 500-binary corpus.
- **Gate Artifacts:**
- Fixtures covering cargo workspaces, binaries with embedded metadata stripped.
- ADR documenting heuristic boundaries + risk mitigations.
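A naive line-based sketch of the Cargo.lock fragment parsing mentioned in the scope (the shipped analyzer should prefer a proper TOML reader); it only illustrates the `pkg:cargo` mapping that precedes the `bin:{sha256}` fallback. The class name is a placeholder assumption.

```csharp
using System;
using System.Collections.Generic;

internal static class CargoLockSketch
{
    // Yields (name, version) for each [[package]] block; Cargo writes name before version.
    public static IEnumerable<(string Name, string Version)> ReadCrates(IEnumerable<string> lines)
    {
        string? name = null;
        foreach (var raw in lines)
        {
            var line = raw.Trim();
            if (line == "[[package]]")
            {
                name = null;
            }
            else if (line.StartsWith("name = \"", StringComparison.Ordinal))
            {
                name = line["name = \"".Length..].TrimEnd('"');
            }
            else if (name is not null && line.StartsWith("version = \"", StringComparison.Ordinal))
            {
                yield return (name, line["version = \"".Length..].TrimEnd('"'));
                name = null;
            }
        }
    }
}
```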
## Sprint LA6 — Shared Evidence Enhancements & Worker Integration (Tasks 10-307, 10-308, 10-309 finalization)
- **Scope:** Finalize shared helpers, deterministic harness expansion, Worker/Emit wiring, and macro benchmarks.
- **Deliverables:**
- Consolidated `LanguageComponentWriter` extensions for license, vulnerability hints, and usage propagation.
- Worker dispatcher loading plug-ins via manifest registry + health checks.
- Combined analyzer benchmark suite executed in CI with regression thresholds.
- **Acceptance Metrics:**
- Worker executes mixed analyzer suite (Java+Node+Python+Go+.NET+Rust) within SLA: warm scan <6 s, cold <25 s.
- CI determinism guard catches output drift (zero-diff tolerance) across all fixtures.
- Telemetry coverage: each analyzer emits timing + component counters.
- **Gate Artifacts:**
- `SPRINTS_LANG_IMPLEMENTATION_PLAN.md` progress log updated (this file).
- `bench/Scanner.Analyzers/lang-matrix.csv` recorded + referenced in docs.
- Ops notes for packaging plug-ins into Offline Kit.
---
## Cross-Sprint Considerations
- **Security:** All analyzers must enforce path canonicalization, guard against zip-slip (see the guard sketch after this list), and expose provenance classifications (`observed`, `heuristic`, `attested`).
- **Offline-first:** No network calls; rely on cached metadata and optional offline bundles (license texts, signature roots).
- **Determinism:** Normalise timestamps to `0001-01-01T00:00:00Z` when persisting synthetic data; sort collections by stable keys.
- **Benchmarking:** Extend `bench/Scanner.Analyzers` to compare against open-source scanners (Syft/Trivy) and document performance wins.
- **Hand-offs:** Emit guild requires consistent component schemas; Policy needs license + provenance metadata; Scheduler depends on usage flags for ImpactIndex.
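A minimal sketch of the path canonicalization / zip-slip guard every analyzer is expected to apply before touching an entry-supplied path (archive member names, symlink targets, lockfile references). The helper name is an assumption.

```csharp
using System;
using System.IO;

internal static class PathGuardSketch
{
    public static bool IsWithinRoot(string rootPath, string candidateRelativePath)
    {
        var root = Path.GetFullPath(rootPath);
        var resolved = Path.GetFullPath(Path.Combine(root, candidateRelativePath));

        // Require a trailing separator so "/root-evil" is not accepted as a child of "/root".
        var rootWithSeparator = root.EndsWith(Path.DirectorySeparatorChar)
            ? root
            : root + Path.DirectorySeparatorChar;

        return resolved.Equals(root, StringComparison.Ordinal)
            || resolved.StartsWith(rootWithSeparator, StringComparison.Ordinal);
    }
}
```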
## Tracking & Reporting
- Update `TASKS.md` per sprint (TODO → DOING → DONE) with date stamps.
- Log sprint summaries in `docs/updates/` once each sprint lands.
- Use module-specific CI pipeline to run analyzer suites nightly (determinism + perf).
---
**Next Action:** Start Sprint LA1 (Node Analyzer) — move tasks 10-302, 10-307, 10-308, 10-309 → DOING and spin up fixtures + benchmarks.

View File

@@ -0,0 +1,21 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net10.0</TargetFramework>
<LangVersion>preview</LangVersion>
<Nullable>enable</Nullable>
<ImplicitUsings>enable</ImplicitUsings>
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
<EnableDefaultItems>false</EnableDefaultItems>
</PropertyGroup>
<ItemGroup>
<Compile Include="**\*.cs" Exclude="obj\**;bin\**" />
<EmbeddedResource Include="**\*.json" Exclude="obj\**;bin\**" />
<None Include="**\*" Exclude="**\*.cs;**\*.json;bin\**;obj\**" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\StellaOps.Scanner.Core\StellaOps.Scanner.Core.csproj" />
</ItemGroup>
</Project>

View File

@@ -0,0 +1,13 @@
# Language Analyzer Task Board
| ID | Status | Owner(s) | Depends on | Description | Exit Criteria |
|----|--------|----------|------------|-------------|---------------|
| SCANNER-ANALYZERS-LANG-10-301 | DONE (2025-10-19) | Language Analyzer Guild | SCANNER-CORE-09-501, SCANNER-WORKER-09-203 | Java analyzer emitting deterministic `pkg:maven` components using pom.properties / MANIFEST evidence. | Java analyzer extracts coordinates+version+licenses with provenance; golden fixtures deterministic; microbenchmark meets target. |
| SCANNER-ANALYZERS-LANG-10-302 | DOING (2025-10-19) | Language Analyzer Guild | SCANNER-ANALYZERS-LANG-10-307 | Node analyzer resolving workspaces/symlinks into `pkg:npm` identities. | Node analyzer handles symlinks/workspaces; outputs sorted components; determinism harness covers hoisted deps. |
| SCANNER-ANALYZERS-LANG-10-303 | TODO | Language Analyzer Guild | SCANNER-ANALYZERS-LANG-10-307 | Python analyzer consuming `*.dist-info` metadata and RECORD hashes. | Analyzer binds METADATA + RECORD evidence, includes entry points, determinism fixtures stable. |
| SCANNER-ANALYZERS-LANG-10-304 | TODO | Language Analyzer Guild | SCANNER-ANALYZERS-LANG-10-307 | Go analyzer leveraging buildinfo for `pkg:golang` components. | Buildinfo parser emits module path/version + vcs metadata; binaries without buildinfo downgraded gracefully. |
| SCANNER-ANALYZERS-LANG-10-305 | TODO | Language Analyzer Guild | SCANNER-ANALYZERS-LANG-10-307 | .NET analyzer parsing `*.deps.json`, assembly metadata, and RID variants. | Analyzer merges deps.json + assembly info; dedupes per RID; determinism verified. |
| SCANNER-ANALYZERS-LANG-10-306 | TODO | Language Analyzer Guild | SCANNER-ANALYZERS-LANG-10-307 | Rust analyzer detecting crate provenance or falling back to `bin:{sha256}`. | Analyzer emits `pkg:cargo` when metadata present; falls back to binary hash; fixtures cover both paths. |
| SCANNER-ANALYZERS-LANG-10-307 | DONE (2025-10-19) | Language Analyzer Guild | SCANNER-CORE-09-501 | Shared language evidence helpers + usage flag propagation. | Shared abstractions implemented; analyzers reuse helpers; evidence includes usage hints; unit tests cover canonical ordering. |
| SCANNER-ANALYZERS-LANG-10-308 | DONE (2025-10-19) | Language Analyzer Guild | SCANNER-ANALYZERS-LANG-10-307 | Determinism + fixture harness for language analyzers. | Harness executes analyzers against fixtures; golden JSON stored; CI helper ensures stable hashes. |
| SCANNER-ANALYZERS-LANG-10-309 | DOING (2025-10-19) | Language Analyzer Guild | SCANNER-ANALYZERS-LANG-10-301..308 | Package language analyzers as restart-time plug-ins (manifest + host registration). | Plugin manifests authored under `plugins/scanner/analyzers/lang`; Worker loads via DI; restart required flag enforced; tests confirm manifest integrity. |