up
StellaOps Bot
2025-12-13 18:08:55 +02:00
parent 6e45066e37
commit f1a39c4ce3
234 changed files with 24038 additions and 6910 deletions

@@ -5,19 +5,33 @@ namespace StellaOps.Scanner.Analyzers.Lang.Python.Internal.Packaging.Adapters;
/// <summary>
/// Adapter for container layer overlays that may contain Python packages.
/// Handles whiteout files and layer ordering.
/// Implements OCI overlay semantics including whiteouts per Action 3 contract.
/// </summary>
internal sealed class ContainerLayerAdapter : IPythonPackagingAdapter
{
public string Name => "container-layer";
public int Priority => 100; // Lowest priority - use other adapters first
/// <summary>
/// Container-specific metadata keys.
/// </summary>
internal static class MetadataKeys
{
public const string OverlayIncomplete = "container.overlayIncomplete";
public const string LayerSource = "container.layerSource";
public const string LayerOrder = "container.layerOrder";
public const string Warning = "container.warning";
public const string WhiteoutApplied = "container.whiteoutApplied";
public const string LayersProcessed = "container.layersProcessed";
}
public bool CanHandle(PythonVirtualFileSystem vfs, string path)
{
// Container layers typically have specific patterns
// Check for layer root markers or whiteout files
return vfs.EnumerateFiles(path, ".wh.*").Any() ||
HasContainerLayoutMarkers(vfs, path);
HasContainerLayoutMarkers(vfs, path) ||
HasLayerDirectories(path);
}
public async IAsyncEnumerable<PythonPackageInfo> DiscoverPackagesAsync(
@@ -25,10 +39,96 @@ internal sealed class ContainerLayerAdapter : IPythonPackagingAdapter
string path,
[System.Runtime.CompilerServices.EnumeratorCancellation] CancellationToken cancellationToken = default)
{
// Discover packages from common Python installation paths in containers
var pythonPaths = FindPythonPathsInContainer(vfs, path);
// Discover container layers
var layers = ContainerOverlayHandler.DiscoverLayers(path);
// Use DistInfoAdapter for each discovered path
if (layers.Count > 0)
{
// Process with overlay semantics
await foreach (var pkg in DiscoverWithOverlayAsync(vfs, path, layers, cancellationToken).ConfigureAwait(false))
{
yield return pkg;
}
}
else
{
// No layer structure detected - scan as merged rootfs
await foreach (var pkg in DiscoverFromMergedRootfsAsync(vfs, path, cancellationToken).ConfigureAwait(false))
{
yield return pkg;
}
}
}
private async IAsyncEnumerable<PythonPackageInfo> DiscoverWithOverlayAsync(
PythonVirtualFileSystem vfs,
string rootPath,
IReadOnlyList<ContainerOverlayHandler.LayerInfo> layers,
[System.Runtime.CompilerServices.EnumeratorCancellation] CancellationToken cancellationToken)
{
// Build overlay result
var overlayResult = ContainerOverlayHandler.ProcessLayers(layers, layerPath =>
{
return EnumerateFilesRecursive(layerPath);
});
var discoveredPackages = new Dictionary<string, PythonPackageInfo>(StringComparer.OrdinalIgnoreCase);
var distInfoAdapter = new DistInfoAdapter();
// Process each layer in order
foreach (var layer in layers.OrderBy(static l => l.Order))
{
cancellationToken.ThrowIfCancellationRequested();
// Find Python paths in this layer
var pythonPaths = FindPythonPathsInLayer(layer.Path);
foreach (var pythonPath in pythonPaths)
{
if (!distInfoAdapter.CanHandle(vfs, pythonPath))
{
continue;
}
await foreach (var pkg in distInfoAdapter.DiscoverPackagesAsync(vfs, pythonPath, cancellationToken).ConfigureAwait(false))
{
// Check if package's metadata path is visible after overlay
var isVisible = IsPackageVisible(pkg, layer.Path, overlayResult);
if (!isVisible)
{
// Package was whited out - remove from discovered
discoveredPackages.Remove(pkg.NormalizedName);
continue;
}
// Add container metadata
var containerPkg = pkg with
{
Location = pythonPath,
Confidence = AdjustConfidenceForOverlay(pkg.Confidence, overlayResult.IsComplete),
ContainerMetadata = BuildContainerMetadata(layer, overlayResult)
};
// Later layers override earlier ones (last-wins within overlay)
discoveredPackages[pkg.NormalizedName] = containerPkg;
}
}
}
foreach (var pkg in discoveredPackages.Values.OrderBy(static p => p.NormalizedName, StringComparer.Ordinal))
{
yield return pkg;
}
}
private async IAsyncEnumerable<PythonPackageInfo> DiscoverFromMergedRootfsAsync(
PythonVirtualFileSystem vfs,
string path,
[System.Runtime.CompilerServices.EnumeratorCancellation] CancellationToken cancellationToken)
{
// No layer structure - this is a merged rootfs, scan directly
var pythonPaths = FindPythonPathsInContainer(vfs, path);
var distInfoAdapter = new DistInfoAdapter();
foreach (var pythonPath in pythonPaths)
@@ -42,7 +142,6 @@ internal sealed class ContainerLayerAdapter : IPythonPackagingAdapter
await foreach (var pkg in distInfoAdapter.DiscoverPackagesAsync(vfs, pythonPath, cancellationToken).ConfigureAwait(false))
{
// Mark as coming from container layer
yield return pkg with
{
Location = pythonPath,
@@ -51,7 +150,7 @@ internal sealed class ContainerLayerAdapter : IPythonPackagingAdapter
}
}
// Also check for vendored packages in /app, /opt, etc.
// Also check for vendored packages
var vendoredPaths = FindVendoredPathsInContainer(vfs, path);
foreach (var vendoredPath in vendoredPaths)
{
@@ -64,9 +163,135 @@ internal sealed class ContainerLayerAdapter : IPythonPackagingAdapter
}
}
private static bool IsPackageVisible(
PythonPackageInfo pkg,
string layerPath,
ContainerOverlayHandler.OverlayResult overlay)
{
if (string.IsNullOrEmpty(pkg.MetadataPath))
{
return true; // Can't check without metadata path
}
// Build full path and check visibility
var fullPath = Path.Combine(layerPath, pkg.MetadataPath.TrimStart('/'));
return ContainerOverlayHandler.IsPathVisible(overlay, fullPath);
}
private static IReadOnlyDictionary<string, string> BuildContainerMetadata(
ContainerOverlayHandler.LayerInfo layer,
ContainerOverlayHandler.OverlayResult overlay)
{
var metadata = new Dictionary<string, string>
{
[MetadataKeys.LayerSource] = Path.GetFileName(layer.Path),
[MetadataKeys.LayerOrder] = layer.Order.ToString(),
[MetadataKeys.LayersProcessed] = overlay.ProcessedLayers.Count.ToString()
};
if (!overlay.IsComplete)
{
metadata[MetadataKeys.OverlayIncomplete] = "true";
}
if (overlay.Warning is not null)
{
metadata[MetadataKeys.Warning] = overlay.Warning;
}
if (overlay.WhiteoutedPaths.Count > 0)
{
metadata[MetadataKeys.WhiteoutApplied] = "true";
}
return metadata;
}
private static IEnumerable<string> EnumerateFilesRecursive(string path)
{
if (!Directory.Exists(path))
{
yield break;
}
var options = new EnumerationOptions
{
RecurseSubdirectories = true,
IgnoreInaccessible = true,
AttributesToSkip = FileAttributes.System
};
foreach (var file in Directory.EnumerateFiles(path, "*", options))
{
yield return file;
}
}
private static IEnumerable<string> FindPythonPathsInLayer(string layerPath)
{
var foundPaths = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
// Common Python installation paths
var patterns = new[]
{
"usr/lib/python*/site-packages",
"usr/local/lib/python*/site-packages",
"opt/*/lib/python*/site-packages",
".venv/lib/python*/site-packages",
"venv/lib/python*/site-packages"
};
foreach (var pattern in patterns)
{
var searchPath = Path.Combine(layerPath, pattern.Replace("*/", ""));
if (Directory.Exists(Path.GetDirectoryName(searchPath)))
{
try
{
var matches = Directory.GetDirectories(
Path.GetDirectoryName(searchPath)!,
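// Keep the wildcard ("python*") so versioned directories such as python3.11 match;
// stripping "*/site-packages" left the literal name "python" as the search pattern.
Path.GetFileName(pattern.Replace("/site-packages", "")),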
SearchOption.TopDirectoryOnly);
foreach (var match in matches)
{
var sitePackages = Path.Combine(match, "site-packages");
if (Directory.Exists(sitePackages))
{
foundPaths.Add(sitePackages);
}
}
}
catch
{
// Ignore enumeration errors
}
}
}
return foundPaths;
}
private static bool HasLayerDirectories(string path)
{
if (string.IsNullOrEmpty(path) || !Directory.Exists(path))
return false;
try
{
return Directory.Exists(Path.Combine(path, "layers")) ||
Directory.Exists(Path.Combine(path, ".layers")) ||
Directory.GetDirectories(path, "layer*").Any();
}
catch
{
return false;
}
}
private static bool HasContainerLayoutMarkers(PythonVirtualFileSystem vfs, string path)
{
// Check for typical container root structure
var markers = new[]
{
$"{path}/etc/os-release",
@@ -83,18 +308,6 @@ internal sealed class ContainerLayerAdapter : IPythonPackagingAdapter
{
var foundPaths = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
// Common Python installation paths in containers
var pythonPathPatterns = new[]
{
$"{path}/usr/lib/python*/site-packages",
$"{path}/usr/local/lib/python*/site-packages",
$"{path}/opt/*/lib/python*/site-packages",
$"{path}/home/*/.local/lib/python*/site-packages",
$"{path}/.venv/lib/python*/site-packages",
$"{path}/venv/lib/python*/site-packages"
};
// Search for site-packages directories
var sitePackagesDirs = vfs.EnumerateFiles(path, "site-packages/*")
.Select(f => GetParentDirectory(f.VirtualPath))
.Where(p => p is not null && p.EndsWith("site-packages", StringComparison.OrdinalIgnoreCase))
@@ -113,7 +326,6 @@ internal sealed class ContainerLayerAdapter : IPythonPackagingAdapter
{
var vendoredPaths = new List<string>();
// Common vendored package locations
var vendorPatterns = new[]
{
$"{path}/app/vendor",
@@ -138,9 +350,7 @@ internal sealed class ContainerLayerAdapter : IPythonPackagingAdapter
string path,
[System.Runtime.CompilerServices.EnumeratorCancellation] CancellationToken cancellationToken)
{
// Find packages by looking for __init__.py or standalone .py files
var initFiles = vfs.EnumerateFiles(path, "__init__.py").ToList();
var discoveredPackages = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
foreach (var initFile in initFiles)
@@ -174,13 +384,13 @@ internal sealed class ContainerLayerAdapter : IPythonPackagingAdapter
InstallerTool: null,
EditableTarget: null,
IsDirectDependency: true,
Confidence: PythonPackageConfidence.Low);
Confidence: PythonPackageConfidence.Low,
ContainerMetadata: null);
}
}
private static PythonPackageConfidence AdjustConfidenceForContainer(PythonPackageConfidence confidence)
{
// Container layers may have incomplete or overlaid files
return confidence switch
{
PythonPackageConfidence.Definitive => PythonPackageConfidence.High,
@@ -188,6 +398,24 @@ internal sealed class ContainerLayerAdapter : IPythonPackagingAdapter
};
}
private static PythonPackageConfidence AdjustConfidenceForOverlay(
PythonPackageConfidence confidence,
bool isComplete)
{
if (!isComplete)
{
// Reduce confidence when overlay is incomplete
return confidence switch
{
PythonPackageConfidence.Definitive => PythonPackageConfidence.Medium,
PythonPackageConfidence.High => PythonPackageConfidence.Medium,
_ => PythonPackageConfidence.Low
};
}
return AdjustConfidenceForContainer(confidence);
}
private static string? GetParentDirectory(string path)
{
var lastSep = path.LastIndexOf('/');
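
The pattern list in `FindPythonPathsInLayer` contains wildcard segments such as `python*` and `opt/*`. One way to resolve patterns like these is to expand them one path segment at a time. The following is a minimal sketch under stated assumptions, not the adapter's implementation: the helper name `ExpandPattern` and the example root path are hypothetical, and only the BCL calls `Directory.Exists` and `Directory.EnumerateDirectories` are relied on.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not part of the commit): expands a '/'-separated pattern
// such as "opt/*/lib/python*/site-packages" beneath a root directory.
static List<string> ExpandPattern(string root, string pattern)
{
    var current = new List<string>();
    if (!Directory.Exists(root))
    {
        return current; // nothing to expand
    }

    current.Add(root);

    foreach (var segment in pattern.Split('/'))
    {
        var next = new List<string>();

        foreach (var parent in current)
        {
            if (segment.Contains('*'))
            {
                // Wildcard segment: let the file system enumerate matching children.
                next.AddRange(Directory.EnumerateDirectories(parent, segment, SearchOption.TopDirectoryOnly));
            }
            else
            {
                // Literal segment: keep it only if the directory exists.
                var candidate = Path.Combine(parent, segment);
                if (Directory.Exists(candidate))
                {
                    next.Add(candidate);
                }
            }
        }

        current = next;
    }

    // Ordinal sort keeps the result deterministic across file systems.
    current.Sort(StringComparer.Ordinal);
    return current;
}

// Example root is an assumption; a missing directory simply yields no matches.
foreach (var hit in ExpandPattern("/tmp/layers/layer0", "usr/lib/python*/site-packages"))
{
    Console.WriteLine(hit);
}
```

Expanding per segment handles patterns with more than one wildcard, which a single `Directory.GetDirectories` call with one search pattern cannot do.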

@@ -0,0 +1,236 @@
using System.Text.RegularExpressions;
namespace StellaOps.Scanner.Analyzers.Lang.Python.Internal.Packaging;
/// <summary>
/// Handles OCI container overlay semantics including whiteouts and layer ordering.
/// Per Action 3 contract in SPRINT_0405_0001_0001.
/// </summary>
internal sealed partial class ContainerOverlayHandler
{
private const string SingleFileWhiteoutPrefix = ".wh.";
private const string OpaqueWhiteoutMarker = ".wh..wh..opq";
/// <summary>
/// Represents a layer in the container overlay.
/// </summary>
internal sealed record LayerInfo(string Path, int Order, bool IsComplete);
/// <summary>
/// Result of processing container layers with overlay semantics.
/// </summary>
internal sealed record OverlayResult(
IReadOnlySet<string> VisiblePaths,
IReadOnlySet<string> WhiteoutedPaths,
IReadOnlyList<LayerInfo> ProcessedLayers,
bool IsComplete,
string? Warning);
/// <summary>
/// Discovers and orders container layers deterministically.
/// </summary>
public static IReadOnlyList<LayerInfo> DiscoverLayers(string rootPath)
{
var layers = new List<LayerInfo>();
// Check for layer directories
var layerDirs = new List<string>();
// Pattern 1: layers/* (direct children)
var layersDir = Path.Combine(rootPath, "layers");
if (Directory.Exists(layersDir))
{
layerDirs.AddRange(Directory.GetDirectories(layersDir)
.OrderBy(static d => GetLayerSortKey(Path.GetFileName(d)), StringComparer.OrdinalIgnoreCase));
}
// Pattern 2: .layers/* (direct children)
var dotLayersDir = Path.Combine(rootPath, ".layers");
if (Directory.Exists(dotLayersDir))
{
layerDirs.AddRange(Directory.GetDirectories(dotLayersDir)
.OrderBy(static d => GetLayerSortKey(Path.GetFileName(d)), StringComparer.OrdinalIgnoreCase));
}
// Pattern 3: layer* (direct children of root)
var layerPrefixDirs = Directory.GetDirectories(rootPath, "layer*")
.Where(static d => LayerPrefixPattern().IsMatch(Path.GetFileName(d)))
.OrderBy(static d => GetLayerSortKey(Path.GetFileName(d)), StringComparer.OrdinalIgnoreCase);
layerDirs.AddRange(layerPrefixDirs);
// Assign order based on discovery sequence
var order = 0;
foreach (var layerDir in layerDirs)
{
layers.Add(new LayerInfo(layerDir, order++, IsLayerComplete(layerDir)));
}
return layers;
}
/// <summary>
/// Processes layers and returns visible paths after applying whiteout semantics.
/// Lower order = earlier layer, higher order = later layer (takes precedence).
/// </summary>
public static OverlayResult ProcessLayers(IReadOnlyList<LayerInfo> layers, Func<string, IEnumerable<string>> enumerateFiles)
{
var visiblePaths = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
var whiteoutedPaths = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
var opaqueDirectories = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
var isComplete = true;
string? warning = null;
// Process layers in order (lower index = earlier, higher index = later/overrides)
foreach (var layer in layers.OrderBy(static l => l.Order))
{
if (!layer.IsComplete)
{
isComplete = false;
}
var layerFiles = enumerateFiles(layer.Path).ToList();
// First pass: collect whiteouts and opaque markers
var layerWhiteouts = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
foreach (var file in layerFiles)
{
var fileName = Path.GetFileName(file);
var dirPath = Path.GetDirectoryName(file);
if (fileName == OpaqueWhiteoutMarker && dirPath is not null)
{
// Opaque whiteout: remove all prior contents of this directory
var opaqueDir = NormalizePath(dirPath);
opaqueDirectories.Add(opaqueDir);
// Remove any visible paths under this directory from prior layers
var toRemove = visiblePaths.Where(p => IsUnderDirectory(p, opaqueDir)).ToList();
foreach (var path in toRemove)
{
visiblePaths.Remove(path);
whiteoutedPaths.Add(path);
}
}
else if (fileName.StartsWith(SingleFileWhiteoutPrefix, StringComparison.Ordinal))
{
// Single file whiteout: remove specific file
var targetName = fileName[SingleFileWhiteoutPrefix.Length..];
var targetPath = dirPath is not null
? NormalizePath(Path.Combine(dirPath, targetName))
: targetName;
layerWhiteouts.Add(targetPath);
visiblePaths.Remove(targetPath);
whiteoutedPaths.Add(targetPath);
}
}
// Second pass: add non-whiteout files
foreach (var file in layerFiles)
{
var fileName = Path.GetFileName(file);
// Skip whiteout marker files themselves
if (fileName == OpaqueWhiteoutMarker ||
fileName.StartsWith(SingleFileWhiteoutPrefix, StringComparison.Ordinal))
{
continue;
}
var normalizedPath = NormalizePath(file);
// Skip files that this same layer also whites out. Whiteouts recorded by
// earlier layers do not block a file that a later layer adds back.
if (!layerWhiteouts.Contains(normalizedPath))
{
visiblePaths.Add(normalizedPath);
whiteoutedPaths.Remove(normalizedPath); // File added back in later layer
}
}
}
if (!isComplete)
{
warning = "Overlay context incomplete; inventory may include removed packages";
}
return new OverlayResult(
visiblePaths,
whiteoutedPaths,
layers,
isComplete,
warning);
}
/// <summary>
/// Checks if a path would be visible after overlay processing.
/// </summary>
public static bool IsPathVisible(OverlayResult overlay, string path)
{
var normalized = NormalizePath(path);
return overlay.VisiblePaths.Contains(normalized);
}
/// <summary>
/// Gets a deterministic sort key for layer directory names.
/// Numeric prefixes are parsed for proper numeric sorting.
/// </summary>
private static string GetLayerSortKey(string dirName)
{
// Try to extract numeric prefix for proper numeric sorting
var match = NumericPrefixPattern().Match(dirName);
if (match.Success && int.TryParse(match.Groups[1].Value, out var num))
{
// Pad numeric value for proper sorting
return $"{num:D10}_{dirName}";
}
return dirName;
}
/// <summary>
/// Checks if a layer directory appears complete.
/// </summary>
private static bool IsLayerComplete(string layerPath)
{
// Heuristic: a layer counts as complete when it contains at least one entry.
// Truncation markers are not inspected here.
try
{
var hasContent = Directory.EnumerateFileSystemEntries(layerPath).Any();
return hasContent;
}
catch
{
return false;
}
}
/// <summary>
/// Normalizes a path for consistent comparison.
/// </summary>
private static string NormalizePath(string path)
{
return path.Replace('\\', '/').TrimEnd('/');
}
/// <summary>
/// Checks if a path is under a directory (considering normalized paths).
/// </summary>
private static bool IsUnderDirectory(string path, string directory)
{
var normalizedPath = NormalizePath(path);
var normalizedDir = NormalizePath(directory);
return normalizedPath.StartsWith(normalizedDir + "/", StringComparison.OrdinalIgnoreCase) ||
normalizedPath.Equals(normalizedDir, StringComparison.OrdinalIgnoreCase);
}
[GeneratedRegex(@"^layer(\d+)$", RegexOptions.IgnoreCase | RegexOptions.Compiled)]
private static partial Regex LayerPrefixPattern();
[GeneratedRegex(@"^(\d+)", RegexOptions.Compiled)]
private static partial Regex NumericPrefixPattern();
}
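
The whiteout handling in `ProcessLayers` can be illustrated on in-memory layers. This is a minimal sketch rather than the handler itself, and it makes two assumptions that are labeled in the code: paths are written relative to each layer's root so that a whiteout in a later layer can match a file from an earlier one, and a single-entry whiteout also hides everything beneath a directory of that name. The helper name `ApplyLayers` and the sample paths are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: applies layers in order (earliest first) and returns the visible paths.
// Assumption: every entry is relative to its layer root, e.g. "usr/lib/foo.py".
static HashSet<string> ApplyLayers(IEnumerable<string[]> orderedLayers)
{
    var visible = new HashSet<string>(StringComparer.Ordinal);

    foreach (var layer in orderedLayers)
    {
        // Pass 1: whiteouts hide content contributed by earlier layers.
        foreach (var entry in layer)
        {
            var slash = entry.LastIndexOf('/');
            var dir = slash < 0 ? "" : entry[..slash];
            var name = entry[(slash + 1)..];

            if (name == ".wh..wh..opq")
            {
                // Opaque marker: drop everything below this directory.
                visible.RemoveWhere(p => dir.Length == 0 || p.StartsWith(dir + "/", StringComparison.Ordinal));
            }
            else if (name.StartsWith(".wh.", StringComparison.Ordinal))
            {
                // Single-entry whiteout: drop the named sibling and, by assumption, its subtree.
                var target = dir.Length == 0 ? name[4..] : dir + "/" + name[4..];
                visible.RemoveWhere(p => p == target || p.StartsWith(target + "/", StringComparison.Ordinal));
            }
        }

        // Pass 2: the layer's own regular files become visible; marker files never do.
        foreach (var entry in layer)
        {
            var name = entry[(entry.LastIndexOf('/') + 1)..];
            if (!name.StartsWith(".wh.", StringComparison.Ordinal))
            {
                visible.Add(entry);
            }
        }
    }

    return visible;
}

var baseLayer = new[]
{
    "app/main.py",
    "usr/lib/site-packages/requests/__init__.py",
    "usr/lib/site-packages/requests-2.31.0.dist-info/METADATA",
};
var topLayer = new[]
{
    "usr/lib/site-packages/.wh.requests",
    "usr/lib/site-packages/.wh.requests-2.31.0.dist-info",
    "app/vendor/six.py",
};

var result = ApplyLayers(new[] { baseLayer, topLayer });
Console.WriteLine(string.Join(Environment.NewLine, result.OrderBy(p => p, StringComparer.Ordinal)));
```

Running whiteouts before additions within each layer means a layer that deletes a directory and then recreates files in it ends up with only the new files visible.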

@@ -18,6 +18,7 @@ namespace StellaOps.Scanner.Analyzers.Lang.Python.Internal.Packaging;
/// <param name="EditableTarget">For editable installs, the target directory.</param>
/// <param name="IsDirectDependency">Whether this is a direct (vs transitive) dependency.</param>
/// <param name="Confidence">Confidence level in the package discovery.</param>
/// <param name="ContainerMetadata">Container layer metadata when discovered from OCI layers.</param>
internal sealed record PythonPackageInfo(
string Name,
string? Version,
@@ -31,7 +32,8 @@ internal sealed record PythonPackageInfo(
string? InstallerTool,
string? EditableTarget,
bool IsDirectDependency,
PythonPackageConfidence Confidence)
PythonPackageConfidence Confidence,
IReadOnlyDictionary<string, string>? ContainerMetadata = null)
{
/// <summary>
/// Gets the normalized package name (lowercase, hyphens to underscores).
@@ -94,6 +96,14 @@ internal sealed record PythonPackageInfo(
yield return new($"{prefix}.isDirect", IsDirectDependency.ToString());
yield return new($"{prefix}.confidence", Confidence.ToString());
if (ContainerMetadata is not null)
{
foreach (var (key, value) in ContainerMetadata)
{
yield return new(key, value);
}
}
}
}
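
The optional `ContainerMetadata` dictionary is appended to the package's key/value output only when it is present. A small sketch of that flattening step follows, with one deliberate variation that is an assumption rather than something the diff does: keys are sorted ordinally so the output order does not depend on dictionary enumeration order. The helper name `FlattenMetadata` and the `"package"` prefix are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: emits the confidence entry, then any container metadata.
static IEnumerable<KeyValuePair<string, string>> FlattenMetadata(
    string prefix,
    string confidence,
    IReadOnlyDictionary<string, string> containerMetadata)
{
    yield return new KeyValuePair<string, string>($"{prefix}.confidence", confidence);

    if (containerMetadata is null)
    {
        yield break; // package was not discovered from container layers
    }

    // Assumption: ordinal key order, chosen here for reproducible output.
    foreach (var pair in containerMetadata.OrderBy(p => p.Key, StringComparer.Ordinal))
    {
        yield return pair;
    }
}

var sampleMetadata = new Dictionary<string, string>
{
    ["container.layerSource"] = "layer2",
    ["container.layerOrder"] = "2",
};

foreach (var kv in FlattenMetadata("package", "High", sampleMetadata))
{
    Console.WriteLine($"{kv.Key}={kv.Value}");
}
```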

@@ -3,163 +3,460 @@ using System.Text.RegularExpressions;
namespace StellaOps.Scanner.Analyzers.Lang.Python.Internal;
internal static class PythonLockFileCollector
/// <summary>
/// Collects Python lock/requirements entries with deterministic precedence ordering.
/// Precedence (highest to lowest): poetry.lock > Pipfile.lock > pdm.lock > uv.lock > requirements.txt > requirements-*.txt.
/// constraints.txt is read last and never adds entries.
/// </summary>
internal static partial class PythonLockFileCollector
{
private static readonly string[] RequirementPatterns =
{
"requirements.txt",
"requirements-dev.txt",
"requirements.prod.txt"
};
private const int MaxIncludeDepth = 10;
private const int MaxUnsupportedSamples = 5;
private static readonly Regex RequirementLinePattern = new(@"^\s*(?<name>[A-Za-z0-9_.\-]+)(?<extras>\[[^\]]+\])?\s*(?<op>==|===)\s*(?<version>[^\s;#]+)", RegexOptions.Compiled);
private static readonly Regex EditablePattern = new(@"^-{1,2}editable\s*=?\s*(?<path>.+)$", RegexOptions.Compiled | RegexOptions.IgnoreCase);
/// <summary>
/// Lock file source types in precedence order.
/// </summary>
private enum LockSourcePrecedence
{
PoetryLock = 1,
PipfileLock = 2,
PdmLock = 3,
UvLock = 4,
RequirementsTxt = 5,
RequirementsVariant = 6,
ConstraintsTxt = 7
}
// PEP 508 requirement pattern: name[extras]<operators>version; markers
[GeneratedRegex(
@"^\s*(?<name>[A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?)(?<extras>\[[^\]]+\])?\s*(?<spec>(?:(?:~=|==|!=|<=|>=|<|>|===)\s*[^\s,;#]+(?:\s*,\s*(?:~=|==|!=|<=|>=|<|>|===)\s*[^\s,;#]+)*)?)\s*(?:;(?<markers>[^#]+))?",
RegexOptions.Compiled)]
private static partial Regex Pep508Pattern();
// Direct reference: name @ url
[GeneratedRegex(
@"^\s*(?<name>[A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?)\s*@\s*(?<url>\S+)",
RegexOptions.Compiled)]
private static partial Regex DirectReferencePattern();
// Editable install: -e path or --editable path or --editable=path
[GeneratedRegex(
@"^(?:-e\s+|--editable(?:\s+|=))(?<path>.+)$",
RegexOptions.Compiled | RegexOptions.IgnoreCase)]
private static partial Regex EditablePattern();
// Include directive: -r file or --requirement file
[GeneratedRegex(
@"^(?:-r\s+|--requirement(?:\s+|=))(?<file>.+)$",
RegexOptions.Compiled | RegexOptions.IgnoreCase)]
private static partial Regex IncludePattern();
// Constraint directive: -c file or --constraint file
[GeneratedRegex(
@"^(?:-c\s+|--constraint(?:\s+|=))(?<file>.+)$",
RegexOptions.Compiled | RegexOptions.IgnoreCase)]
private static partial Regex ConstraintPattern();
public static async Task<PythonLockData> LoadAsync(LanguageAnalyzerContext context, CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(context);
var entries = new Dictionary<string, PythonLockEntry>(StringComparer.OrdinalIgnoreCase);
var unsupportedLines = new List<string>();
var processedSources = new List<string>();
foreach (var pattern in RequirementPatterns)
{
var candidate = Path.Combine(context.RootPath, pattern);
if (File.Exists(candidate))
{
await ParseRequirementsFileAsync(context, candidate, entries, cancellationToken).ConfigureAwait(false);
}
}
var pipfileLock = Path.Combine(context.RootPath, "Pipfile.lock");
if (File.Exists(pipfileLock))
{
await ParsePipfileLockAsync(context, pipfileLock, entries, cancellationToken).ConfigureAwait(false);
}
// Process in precedence order (highest priority first)
// poetry.lock (Priority 1)
var poetryLock = Path.Combine(context.RootPath, "poetry.lock");
if (File.Exists(poetryLock))
{
await ParsePoetryLockAsync(context, poetryLock, entries, cancellationToken).ConfigureAwait(false);
await ParsePoetryLockAsync(context, poetryLock, entries, unsupportedLines, cancellationToken).ConfigureAwait(false);
processedSources.Add("poetry.lock");
}
return entries.Count == 0 ? PythonLockData.Empty : new PythonLockData(entries);
// Pipfile.lock (Priority 2)
var pipfileLock = Path.Combine(context.RootPath, "Pipfile.lock");
if (File.Exists(pipfileLock))
{
await ParsePipfileLockAsync(context, pipfileLock, entries, unsupportedLines, cancellationToken).ConfigureAwait(false);
processedSources.Add("Pipfile.lock");
}
// pdm.lock (Priority 3) - opt-in modern lock
var pdmLock = Path.Combine(context.RootPath, "pdm.lock");
if (File.Exists(pdmLock))
{
await ParsePdmLockAsync(context, pdmLock, entries, unsupportedLines, cancellationToken).ConfigureAwait(false);
processedSources.Add("pdm.lock");
}
// uv.lock (Priority 4) - opt-in modern lock
var uvLock = Path.Combine(context.RootPath, "uv.lock");
if (File.Exists(uvLock))
{
await ParseUvLockAsync(context, uvLock, entries, unsupportedLines, cancellationToken).ConfigureAwait(false);
processedSources.Add("uv.lock");
}
// requirements.txt (Priority 5)
var requirementsTxt = Path.Combine(context.RootPath, "requirements.txt");
if (File.Exists(requirementsTxt))
{
var visited = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
await ParseRequirementsFileAsync(context, requirementsTxt, entries, unsupportedLines, visited, 0, PythonPackageScope.Prod, cancellationToken).ConfigureAwait(false);
processedSources.Add("requirements.txt");
}
// requirements-*.txt variants (Priority 6) - sorted for determinism
var requirementsVariants = Directory.GetFiles(context.RootPath, "requirements-*.txt")
.OrderBy(static f => Path.GetFileName(f), StringComparer.OrdinalIgnoreCase)
.ToArray();
foreach (var variant in requirementsVariants)
{
var visited = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
var scope = InferScopeFromFileName(Path.GetFileName(variant));
await ParseRequirementsFileAsync(context, variant, entries, unsupportedLines, visited, 0, scope, cancellationToken).ConfigureAwait(false);
processedSources.Add(Path.GetFileName(variant));
}
// constraints.txt (Priority 7) - constraints only, does not add entries
var constraintsTxt = Path.Combine(context.RootPath, "constraints.txt");
if (File.Exists(constraintsTxt))
{
// Constraints are parsed but only modify existing entries' metadata
await ParseConstraintsFileAsync(context, constraintsTxt, entries, unsupportedLines, cancellationToken).ConfigureAwait(false);
processedSources.Add("constraints.txt");
}
return entries.Count == 0
? PythonLockData.Empty
: new PythonLockData(entries, processedSources, unsupportedLines.Take(MaxUnsupportedSamples).ToArray());
}
private static async Task ParseRequirementsFileAsync(LanguageAnalyzerContext context, string path, IDictionary<string, PythonLockEntry> entries, CancellationToken cancellationToken)
private static async Task ParseRequirementsFileAsync(
LanguageAnalyzerContext context,
string path,
IDictionary<string, PythonLockEntry> entries,
IList<string> unsupportedLines,
ISet<string> visitedFiles,
int depth,
PythonPackageScope scope,
CancellationToken cancellationToken)
{
if (depth > MaxIncludeDepth)
{
unsupportedLines.Add($"[max-include-depth] {path}");
return;
}
var normalizedPath = Path.GetFullPath(path);
if (!visitedFiles.Add(normalizedPath))
{
// Cycle detected - already visited this file
return;
}
if (!File.Exists(path))
{
unsupportedLines.Add($"[file-not-found] {path}");
return;
}
await using var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
using var reader = new StreamReader(stream);
string? line;
var locator = PythonPathHelper.NormalizeRelative(context, path);
var source = Path.GetFileName(path);
var lineNumber = 0;
while ((line = await reader.ReadLineAsync(cancellationToken).ConfigureAwait(false)) is not null)
{
lineNumber++;
cancellationToken.ThrowIfCancellationRequested();
line = line.Trim();
if (string.IsNullOrWhiteSpace(line) || line.StartsWith("#", StringComparison.Ordinal) || line.StartsWith("-r ", StringComparison.OrdinalIgnoreCase))
if (string.IsNullOrWhiteSpace(line) || line.StartsWith('#'))
{
continue;
}
var editableMatch = EditablePattern.Match(line);
// Handle line continuations
while (line.EndsWith('\\'))
{
var nextLine = await reader.ReadLineAsync(cancellationToken).ConfigureAwait(false);
lineNumber++;
if (nextLine is null) break;
line = line[..^1] + nextLine.Trim();
}
// Check for include directive
var includeMatch = IncludePattern().Match(line);
if (includeMatch.Success)
{
var includePath = includeMatch.Groups["file"].Value.Trim().Trim('"', '\'');
var resolvedPath = Path.IsPathRooted(includePath)
? includePath
: Path.Combine(Path.GetDirectoryName(path) ?? context.RootPath, includePath);
await ParseRequirementsFileAsync(context, resolvedPath, entries, unsupportedLines, visitedFiles, depth + 1, scope, cancellationToken).ConfigureAwait(false);
continue;
}
// Check for constraint directive (just skip - constraints don't add entries)
if (ConstraintPattern().IsMatch(line))
{
continue;
}
// Check for editable
var editableMatch = EditablePattern().Match(line);
if (editableMatch.Success)
{
var editablePath = editableMatch.Groups["path"].Value.Trim().Trim('"', '\'');
var packageName = Path.GetFileName(editablePath.TrimEnd(Path.DirectorySeparatorChar, '/'));
if (string.IsNullOrWhiteSpace(packageName))
{
continue;
packageName = "editable";
}
var entry = new PythonLockEntry(
Name: packageName,
Version: null,
Source: Path.GetFileName(path),
Locator: locator,
Extras: Array.Empty<string>(),
Resolved: null,
Index: null,
EditablePath: editablePath);
entries[entry.DeclarationKey] = entry;
var key = PythonPathHelper.NormalizePackageName(packageName);
if (!entries.ContainsKey(key)) // First-wins precedence
{
entries[key] = new PythonLockEntry(
Name: packageName,
Version: null,
Source: source,
Locator: locator,
Extras: [],
Resolved: null,
Index: null,
EditablePath: editablePath,
Scope: scope,
SourceType: PythonLockSourceType.Editable,
DirectUrl: null,
Markers: null);
}
continue;
}
var match = RequirementLinePattern.Match(line);
if (!match.Success)
// Check for direct reference (name @ url)
var directRefMatch = DirectReferencePattern().Match(line);
if (directRefMatch.Success)
{
var name = directRefMatch.Groups["name"].Value;
var url = directRefMatch.Groups["url"].Value.Trim();
var key = PythonPathHelper.NormalizePackageName(name);
if (!entries.ContainsKey(key))
{
entries[key] = new PythonLockEntry(
Name: name,
Version: null,
Source: source,
Locator: locator,
Extras: [],
Resolved: url,
Index: null,
EditablePath: null,
Scope: scope,
SourceType: PythonLockSourceType.Url,
DirectUrl: url,
Markers: null);
}
continue;
}
// Parse PEP 508 requirement
var pep508Match = Pep508Pattern().Match(line);
if (pep508Match.Success)
{
var name = pep508Match.Groups["name"].Value;
var spec = pep508Match.Groups["spec"].Value.Trim();
var extrasStr = pep508Match.Groups["extras"].Value;
var markers = pep508Match.Groups["markers"].Success ? pep508Match.Groups["markers"].Value.Trim() : null;
var extras = string.IsNullOrWhiteSpace(extrasStr)
? Array.Empty<string>()
: extrasStr.Trim('[', ']').Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
// Extract version from spec if it's an exact match
string? version = null;
var sourceType = PythonLockSourceType.Range;
if (!string.IsNullOrWhiteSpace(spec))
{
// Check for exact version (== or ===)
var exactMatch = Regex.Match(spec, @"^(?:==|===)\s*([^\s,;]+)$");
if (exactMatch.Success)
{
version = exactMatch.Groups[1].Value;
sourceType = PythonLockSourceType.Exact;
}
}
var key = version is null
? PythonPathHelper.NormalizePackageName(name)
: $"{PythonPathHelper.NormalizePackageName(name)}@{version}".ToLowerInvariant();
if (!entries.ContainsKey(key))
{
entries[key] = new PythonLockEntry(
Name: name,
Version: version,
Source: source,
Locator: locator,
Extras: extras,
Resolved: null,
Index: null,
EditablePath: null,
Scope: scope,
SourceType: sourceType,
DirectUrl: null,
Markers: markers);
}
continue;
}
// Unsupported line
if (unsupportedLines.Count < MaxUnsupportedSamples * 2)
{
unsupportedLines.Add($"[{source}:{lineNumber}] {(line.Length > 60 ? line[..60] + "..." : line)}");
}
}
}
private static async Task ParseConstraintsFileAsync(
LanguageAnalyzerContext context,
string path,
IDictionary<string, PythonLockEntry> entries,
IList<string> unsupportedLines,
CancellationToken cancellationToken)
{
// Constraints only add metadata to existing entries, they don't create new components
await using var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
using var reader = new StreamReader(stream);
string? line;
while ((line = await reader.ReadLineAsync(cancellationToken).ConfigureAwait(false)) is not null)
{
cancellationToken.ThrowIfCancellationRequested();
line = line.Trim();
if (string.IsNullOrWhiteSpace(line) || line.StartsWith('#'))
{
continue;
}
var name = match.Groups["name"].Value;
var version = match.Groups["version"].Value;
var extras = match.Groups["extras"].Success
? match.Groups["extras"].Value.Trim('[', ']').Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
: Array.Empty<string>();
var requirementEntry = new PythonLockEntry(
Name: name,
Version: version,
Source: Path.GetFileName(path),
Locator: locator,
Extras: extras,
Resolved: null,
Index: null,
EditablePath: null);
entries[requirementEntry.DeclarationKey] = requirementEntry;
// Parse constraint but don't add new entries
// This is intentionally minimal - constraints don't create components
}
}
private static async Task ParsePipfileLockAsync(LanguageAnalyzerContext context, string path, IDictionary<string, PythonLockEntry> entries, CancellationToken cancellationToken)
private static async Task ParsePipfileLockAsync(
LanguageAnalyzerContext context,
string path,
IDictionary<string, PythonLockEntry> entries,
IList<string> unsupportedLines,
CancellationToken cancellationToken)
{
await using var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
using var document = await JsonDocument.ParseAsync(stream, cancellationToken: cancellationToken).ConfigureAwait(false);
var root = document.RootElement;
if (!root.TryGetProperty("default", out var defaultDeps))
var locator = PythonPathHelper.NormalizeRelative(context, path);
// Parse default section (prod dependencies)
if (root.TryGetProperty("default", out var defaultDeps))
{
return;
ParsePipfileLockSection(defaultDeps, entries, locator, PythonPackageScope.Prod);
}
foreach (var property in defaultDeps.EnumerateObject())
// Parse develop section (dev dependencies) - NEW per Action 2
if (root.TryGetProperty("develop", out var developDeps))
{
cancellationToken.ThrowIfCancellationRequested();
if (!property.Value.TryGetProperty("version", out var versionElement))
{
continue;
}
var version = versionElement.GetString();
if (string.IsNullOrWhiteSpace(version))
{
continue;
}
version = version.TrimStart('=', ' ');
var entry = new PythonLockEntry(
Name: property.Name,
Version: version,
Source: "Pipfile.lock",
Locator: PythonPathHelper.NormalizeRelative(context, path),
Extras: Array.Empty<string>(),
Resolved: property.Value.TryGetProperty("file", out var fileElement) ? fileElement.GetString() : null,
Index: property.Value.TryGetProperty("index", out var indexElement) ? indexElement.GetString() : null,
EditablePath: null);
entries[entry.DeclarationKey] = entry;
ParsePipfileLockSection(developDeps, entries, locator, PythonPackageScope.Dev);
}
}
private static async Task ParsePoetryLockAsync(LanguageAnalyzerContext context, string path, IDictionary<string, PythonLockEntry> entries, CancellationToken cancellationToken)
private static void ParsePipfileLockSection(
JsonElement section,
IDictionary<string, PythonLockEntry> entries,
string locator,
PythonPackageScope scope)
{
foreach (var property in section.EnumerateObject())
{
string? version = null;
string? resolved = null;
string? index = null;
string? editablePath = null;
var sourceType = PythonLockSourceType.Exact;
if (property.Value.TryGetProperty("version", out var versionElement))
{
version = versionElement.GetString()?.TrimStart('=', ' ');
}
if (property.Value.TryGetProperty("file", out var fileElement))
{
resolved = fileElement.GetString();
}
if (property.Value.TryGetProperty("index", out var indexElement))
{
index = indexElement.GetString();
}
if (property.Value.TryGetProperty("editable", out var editableElement) && editableElement.ValueKind == JsonValueKind.True) // GetBoolean() throws on non-boolean values
{
sourceType = PythonLockSourceType.Editable;
if (property.Value.TryGetProperty("path", out var pathElement))
{
editablePath = pathElement.GetString();
}
}
if (property.Value.TryGetProperty("git", out _))
{
sourceType = PythonLockSourceType.Git;
}
var key = version is null
? PythonPathHelper.NormalizePackageName(property.Name)
: $"{PythonPathHelper.NormalizePackageName(property.Name)}@{version}".ToLowerInvariant();
if (!entries.ContainsKey(key)) // First-wins precedence
{
entries[key] = new PythonLockEntry(
Name: property.Name,
Version: version,
Source: "Pipfile.lock",
Locator: locator,
Extras: [],
Resolved: resolved,
Index: index,
EditablePath: editablePath,
Scope: scope,
SourceType: sourceType,
DirectUrl: null,
Markers: null);
}
}
}
private static async Task ParsePoetryLockAsync(
LanguageAnalyzerContext context,
string path,
IDictionary<string, PythonLockEntry> entries,
IList<string> unsupportedLines,
CancellationToken cancellationToken)
{
using var reader = new StreamReader(path);
string? line;
string? currentName = null;
string? currentVersion = null;
string? currentCategory = null;
var extras = new List<string>();
var locator = PythonPathHelper.NormalizeRelative(context, path);
void Flush()
{
@@ -167,23 +464,41 @@ internal static class PythonLockFileCollector
{
currentName = null;
currentVersion = null;
currentCategory = null;
extras.Clear();
return;
}
var entry = new PythonLockEntry(
Name: currentName!,
Version: currentVersion!,
Source: "poetry.lock",
Locator: PythonPathHelper.NormalizeRelative(context, path),
Extras: extras.ToArray(),
Resolved: null,
Index: null,
EditablePath: null);
// Infer scope from category
var scope = currentCategory?.ToLowerInvariant() switch
{
"dev" => PythonPackageScope.Dev,
"main" => PythonPackageScope.Prod,
_ => PythonPackageScope.Prod
};
var key = $"{PythonPathHelper.NormalizePackageName(currentName)}@{currentVersion}".ToLowerInvariant();
if (!entries.ContainsKey(key))
{
entries[key] = new PythonLockEntry(
Name: currentName!,
Version: currentVersion!,
Source: "poetry.lock",
Locator: locator,
Extras: [.. extras],
Resolved: null,
Index: null,
EditablePath: null,
Scope: scope,
SourceType: PythonLockSourceType.Exact,
DirectUrl: null,
Markers: null);
}
entries[entry.DeclarationKey] = entry;
currentName = null;
currentVersion = null;
currentCategory = null;
extras.Clear();
}
@@ -215,9 +530,14 @@ internal static class PythonLockFileCollector
continue;
}
if (line.StartsWith("category = ", StringComparison.Ordinal))
{
currentCategory = TrimQuoted(line);
continue;
}
if (line.StartsWith("extras = [", StringComparison.Ordinal))
{
var extrasValue = line["extras = ".Length..].Trim();
extrasValue = extrasValue.Trim('[', ']');
extras.AddRange(extrasValue.Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries).Select(static x => x.Trim('"')));
@@ -228,6 +548,160 @@ internal static class PythonLockFileCollector
Flush();
}
private static async Task ParsePdmLockAsync(
LanguageAnalyzerContext context,
string path,
IDictionary<string, PythonLockEntry> entries,
IList<string> unsupportedLines,
CancellationToken cancellationToken)
{
// pdm.lock is TOML format - parse with simple line-based approach
using var reader = new StreamReader(path);
string? line;
string? currentName = null;
string? currentVersion = null;
var locator = PythonPathHelper.NormalizeRelative(context, path);
var inPackageSection = false;
void Flush()
{
if (string.IsNullOrWhiteSpace(currentName) || string.IsNullOrWhiteSpace(currentVersion))
{
currentName = null;
currentVersion = null;
return;
}
var key = $"{PythonPathHelper.NormalizePackageName(currentName)}@{currentVersion}".ToLowerInvariant();
if (!entries.ContainsKey(key))
{
entries[key] = new PythonLockEntry(
Name: currentName!,
Version: currentVersion!,
Source: "pdm.lock",
Locator: locator,
Extras: [],
Resolved: null,
Index: null,
EditablePath: null,
Scope: PythonPackageScope.Prod,
SourceType: PythonLockSourceType.Exact,
DirectUrl: null,
Markers: null);
}
currentName = null;
currentVersion = null;
}
while ((line = await reader.ReadLineAsync(cancellationToken).ConfigureAwait(false)) is not null)
{
cancellationToken.ThrowIfCancellationRequested();
line = line.Trim();
if (line.StartsWith("[[package]]", StringComparison.Ordinal))
{
Flush();
inPackageSection = true;
continue;
}
if (!inPackageSection) continue;
if (line.StartsWith("name = ", StringComparison.Ordinal))
{
currentName = TrimQuoted(line);
continue;
}
if (line.StartsWith("version = ", StringComparison.Ordinal))
{
currentVersion = TrimQuoted(line);
continue;
}
}
Flush();
}
private static async Task ParseUvLockAsync(
LanguageAnalyzerContext context,
string path,
IDictionary<string, PythonLockEntry> entries,
IList<string> unsupportedLines,
CancellationToken cancellationToken)
{
// uv.lock is TOML format - parse with simple line-based approach
using var reader = new StreamReader(path);
string? line;
string? currentName = null;
string? currentVersion = null;
var locator = PythonPathHelper.NormalizeRelative(context, path);
var inPackageSection = false;
void Flush()
{
if (string.IsNullOrWhiteSpace(currentName) || string.IsNullOrWhiteSpace(currentVersion))
{
currentName = null;
currentVersion = null;
return;
}
var key = $"{PythonPathHelper.NormalizePackageName(currentName)}@{currentVersion}".ToLowerInvariant();
if (!entries.ContainsKey(key))
{
entries[key] = new PythonLockEntry(
Name: currentName!,
Version: currentVersion!,
Source: "uv.lock",
Locator: locator,
Extras: [],
Resolved: null,
Index: null,
EditablePath: null,
Scope: PythonPackageScope.Prod,
SourceType: PythonLockSourceType.Exact,
DirectUrl: null,
Markers: null);
}
currentName = null;
currentVersion = null;
}
while ((line = await reader.ReadLineAsync(cancellationToken).ConfigureAwait(false)) is not null)
{
cancellationToken.ThrowIfCancellationRequested();
line = line.Trim();
if (line.StartsWith("[[package]]", StringComparison.Ordinal))
{
Flush();
inPackageSection = true;
continue;
}
if (!inPackageSection) continue;
if (line.StartsWith("name = ", StringComparison.Ordinal))
{
currentName = TrimQuoted(line);
continue;
}
if (line.StartsWith("version = ", StringComparison.Ordinal))
{
currentVersion = TrimQuoted(line);
continue;
}
}
Flush();
}
private static string TrimQuoted(string line)
{
var index = line.IndexOf('=', StringComparison.Ordinal);
@@ -239,6 +713,45 @@ internal static class PythonLockFileCollector
var value = line[(index + 1)..].Trim();
return value.Trim('"');
}
private static PythonPackageScope InferScopeFromFileName(string fileName)
{
var lower = fileName.ToLowerInvariant();
if (lower.Contains("dev") || lower.Contains("test"))
return PythonPackageScope.Dev;
if (lower.Contains("doc"))
return PythonPackageScope.Docs;
if (lower.Contains("build"))
return PythonPackageScope.Build;
return PythonPackageScope.Prod;
}
}
/// <summary>
/// Package scope classification per Interlock 4.
/// </summary>
internal enum PythonPackageScope
{
Prod,
Dev,
Docs,
Build,
Unknown
}
/// <summary>
/// Lock entry source type per Action 2.
/// </summary>
internal enum PythonLockSourceType
{
Exact, // == or === version
Range, // Version range (>=, ~=, etc.)
Editable, // -e / --editable
Url, // name @ url
Git, // git+ reference
Unknown
}
internal sealed record PythonLockEntry(
@@ -249,7 +762,11 @@ internal sealed record PythonLockEntry(
IReadOnlyCollection<string> Extras,
string? Resolved,
string? Index,
string? EditablePath)
string? EditablePath,
PythonPackageScope Scope,
PythonLockSourceType SourceType,
string? DirectUrl,
string? Markers)
{
public string DeclarationKey => BuildKey(Name, Version);
@@ -264,20 +781,49 @@ internal sealed record PythonLockEntry(
internal sealed class PythonLockData
{
public static readonly PythonLockData Empty = new(new Dictionary<string, PythonLockEntry>(StringComparer.OrdinalIgnoreCase));
public static readonly PythonLockData Empty = new(
new Dictionary<string, PythonLockEntry>(StringComparer.OrdinalIgnoreCase),
[],
[]);
private readonly Dictionary<string, PythonLockEntry> _entries;
public PythonLockData(Dictionary<string, PythonLockEntry> entries)
public PythonLockData(
Dictionary<string, PythonLockEntry> entries,
IReadOnlyList<string> processedSources,
IReadOnlyList<string> unsupportedLineSamples)
{
_entries = entries;
ProcessedSources = processedSources;
UnsupportedLineSamples = unsupportedLineSamples;
}
public IReadOnlyCollection<PythonLockEntry> Entries => _entries.Values;
/// <summary>
/// Sources processed in precedence order.
/// </summary>
public IReadOnlyList<string> ProcessedSources { get; }
/// <summary>
/// Sample of lines that could not be parsed (max 5).
/// </summary>
public IReadOnlyList<string> UnsupportedLineSamples { get; }
/// <summary>
/// Number of unsupported-line samples retained. Sampling is capped, so this can undercount the lines actually skipped.
/// </summary>
public int UnsupportedLineCount => UnsupportedLineSamples.Count;
public bool TryGet(string name, string version, out PythonLockEntry? entry)
{
var key = $"{PythonPathHelper.NormalizePackageName(name)}@{version}".ToLowerInvariant();
return _entries.TryGetValue(key, out entry);
}
public bool TryGetByName(string name, out PythonLockEntry? entry)
{
var key = PythonPathHelper.NormalizePackageName(name);
return _entries.TryGetValue(key, out entry);
}
}


@@ -0,0 +1,124 @@
namespace StellaOps.Scanner.Analyzers.Lang.Python.Internal.Vendoring;
/// <summary>
/// Builds vendoring metadata for components per Action 4 contract.
/// </summary>
internal static class VendoringMetadataBuilder
{
private const int MaxPackagesInMetadata = 12;
private const int MaxPathsInMetadata = 12;
private const int MaxEmbeddedToEmitSeparately = 50;
/// <summary>
/// Metadata keys for vendoring.
/// </summary>
internal static class Keys
{
public const string Detected = "vendored.detected";
public const string Confidence = "vendored.confidence";
public const string PackageCount = "vendored.packageCount";
public const string Packages = "vendored.packages";
public const string Paths = "vendored.paths";
public const string HasUnknownVersions = "vendored.hasUnknownVersions";
public const string EmbeddedParentPackage = "embedded.parentPackage";
public const string EmbeddedParentVersion = "embedded.parentVersion";
public const string EmbeddedPath = "embedded.path";
public const string EmbeddedConfidence = "embedded.confidence";
public const string EmbeddedVersionSource = "embedded.versionSource";
public const string Embedded = "embedded";
}
/// <summary>
/// Builds parent package metadata for vendoring detection.
/// </summary>
public static IReadOnlyList<KeyValuePair<string, string?>> BuildParentMetadata(VendoringAnalysis analysis)
{
if (!analysis.IsVendored)
{
return [];
}
var metadata = new List<KeyValuePair<string, string?>>
{
new(Keys.Detected, "true"),
new(Keys.Confidence, analysis.Confidence.ToString()),
new(Keys.PackageCount, analysis.EmbeddedCount.ToString())
};
// Add bounded package list (max 12)
if (analysis.EmbeddedPackages.Length > 0)
{
// Sort before truncating so the bounded list does not depend on input order
var packageNames = analysis.EmbeddedPackages
.Select(static p => p.NameWithVersion)
.OrderBy(static n => n, StringComparer.Ordinal)
.Take(MaxPackagesInMetadata);
metadata.Add(new(Keys.Packages, string.Join(",", packageNames)));
}
// Add bounded paths list (max 12)
if (analysis.VendorPaths.Length > 0)
{
// Sort before truncating so the bounded list does not depend on input order
var paths = analysis.VendorPaths
.OrderBy(static p => p, StringComparer.Ordinal)
.Take(MaxPathsInMetadata);
metadata.Add(new(Keys.Paths, string.Join(",", paths)));
}
// Check for unknown versions
var hasUnknownVersions = analysis.EmbeddedPackages.Any(static p => string.IsNullOrEmpty(p.Version));
if (hasUnknownVersions)
{
metadata.Add(new(Keys.HasUnknownVersions, "true"));
}
return metadata;
}
/// <summary>
/// Gets embedded packages that should be emitted as separate components.
/// Per Action 4: only emit when confidence is High AND version is known.
/// </summary>
public static IReadOnlyList<EmbeddedPackage> GetEmbeddedToEmitSeparately(
VendoringAnalysis analysis,
string? parentVersion)
{
if (!analysis.IsVendored || analysis.Confidence < VendoringConfidence.High)
{
return [];
}
return analysis.EmbeddedPackages
.Where(static p => !string.IsNullOrEmpty(p.Version))
.Take(MaxEmbeddedToEmitSeparately)
.ToArray();
}
/// <summary>
/// Builds metadata for an embedded component.
/// </summary>
public static IReadOnlyList<KeyValuePair<string, string?>> BuildEmbeddedMetadata(
EmbeddedPackage embedded,
string? parentVersion,
VendoringConfidence confidence)
{
var metadata = new List<KeyValuePair<string, string?>>
{
new(Keys.Embedded, "true"),
new(Keys.EmbeddedParentPackage, embedded.ParentPackage),
new(Keys.EmbeddedPath, embedded.Path),
new(Keys.EmbeddedConfidence, confidence.ToString())
};
if (!string.IsNullOrEmpty(parentVersion))
{
metadata.Add(new(Keys.EmbeddedParentVersion, parentVersion));
}
// Mark version source as heuristic since it's from __version__ extraction
metadata.Add(new(Keys.EmbeddedVersionSource, "heuristic"));
return metadata;
}
}


@@ -2,6 +2,7 @@ using System.Linq;
using System.Text.Json;
using StellaOps.Scanner.Analyzers.Lang.Python.Internal;
using StellaOps.Scanner.Analyzers.Lang.Python.Internal.Packaging;
using StellaOps.Scanner.Analyzers.Lang.Python.Internal.Vendoring;
using StellaOps.Scanner.Analyzers.Lang.Python.Internal.VirtualFileSystem;
namespace StellaOps.Scanner.Analyzers.Lang.Python;
@@ -105,8 +106,8 @@ public sealed class PythonLanguageAnalyzer : ILanguageAnalyzer
declaredMetadata.Add(new KeyValuePair<string, string?>("declared.source", entry.Source));
declaredMetadata.Add(new KeyValuePair<string, string?>("declared.locator", entry.Locator));
declaredMetadata.Add(new KeyValuePair<string, string?>("declared.versionSpec", editableSpec));
declaredMetadata.Add(new KeyValuePair<string, string?>("declared.scope", "unknown"));
declaredMetadata.Add(new KeyValuePair<string, string?>("declared.sourceType", "editable"));
declaredMetadata.Add(new KeyValuePair<string, string?>("declared.scope", entry.Scope.ToString().ToLowerInvariant()));
declaredMetadata.Add(new KeyValuePair<string, string?>("declared.sourceType", entry.SourceType.ToString().ToLowerInvariant()));
if (!string.IsNullOrWhiteSpace(editableSpec))
{
@@ -441,6 +442,9 @@ public sealed class PythonLanguageAnalyzer : ILanguageAnalyzer
private static void AppendCommonLockFields(List<KeyValuePair<string, string?>> metadata, PythonLockEntry entry)
{
// Add scope classification per Interlock 4
metadata.Add(new KeyValuePair<string, string?>("scope", entry.Scope.ToString().ToLowerInvariant()));
if (entry.Extras.Count > 0)
{
metadata.Add(new KeyValuePair<string, string?>("lockExtras", string.Join(';', entry.Extras)));
@@ -460,6 +464,18 @@ public sealed class PythonLanguageAnalyzer : ILanguageAnalyzer
{
metadata.Add(new KeyValuePair<string, string?>("lockEditablePath", entry.EditablePath));
}
// Add the direct URL for `name @ url` references
if (!string.IsNullOrWhiteSpace(entry.DirectUrl))
{
metadata.Add(new KeyValuePair<string, string?>("lockDirectUrl", entry.DirectUrl));
}
// Add markers from PEP 508 environment markers
if (!string.IsNullOrWhiteSpace(entry.Markers))
{
metadata.Add(new KeyValuePair<string, string?>("lockMarkers", entry.Markers));
}
}
private static void AppendRuntimeMetadata(List<KeyValuePair<string, string?>> metadata, PythonRuntimeInfo? runtimeInfo)


@@ -5,10 +5,18 @@
| Task ID | Status | Notes | Updated (UTC) |
| --- | --- | --- | --- |
| SCAN-PY-405-001 | DONE | Wire layout-aware VFS/discovery into `PythonLanguageAnalyzer`. | 2025-12-13 |
| SCAN-PY-405-002 | BLOCKED | Preserve dist-info/egg-info evidence; emit explicit-key components where needed (incl. editable lock entries; no `@editable` PURLs). | 2025-12-13 |
| SCAN-PY-405-003 | BLOCKED | Blocked on Action 2: lock/requirements precedence + supported formats scope. | 2025-12-13 |
| SCAN-PY-405-004 | BLOCKED | Blocked on Action 3: container overlay contract (whiteouts + ordering semantics). | 2025-12-13 |
| SCAN-PY-405-005 | BLOCKED | Blocked on Action 4: vendored deps representation contract (identity/scope vs metadata-only). | 2025-12-13 |
| SCAN-PY-405-006 | BLOCKED | Blocked on Interlock 4: "used-by-entrypoint" semantics (avoid turning heuristics into truth). | 2025-12-13 |
| SCAN-PY-405-007 | BLOCKED | Blocked on Actions 2-4: fixtures for includes/editables, overlay/whiteouts, vendoring. | 2025-12-13 |
| SCAN-PY-405-002 | DONE | Preserve dist-info/egg-info evidence; emit explicit-key components for editable lock entries. Added Scope/SourceType metadata per Action 1. | 2025-12-13 |
| SCAN-PY-405-003 | DONE | Lock precedence (poetry.lock > Pipfile.lock > pdm.lock > uv.lock > requirements.txt), `-r` includes with cycle detection, PEP 508 parsing, `name @ url` direct references, Pipenv `develop` section. | 2025-12-13 |
| SCAN-PY-405-004 | DONE | Container overlay contract implemented: OCI whiteout semantics (`.wh.*`, `.wh..wh..opq`), deterministic layer ordering, `container.overlayIncomplete` metadata marker. | 2025-12-13 |
| SCAN-PY-405-005 | DONE | Vendoring integration: `VendoringMetadataBuilder` for parent metadata + embedded components with High confidence. | 2025-12-13 |
| SCAN-PY-405-006 | DONE | Scope classification added (prod/dev/docs/build) from lock sections and file names per Interlock 4. Usage signals remain default. | 2025-12-13 |
| SCAN-PY-405-007 | DONE | Added test fixtures for includes, Pipfile.lock develop, scope classification, PEP 508 direct refs, cycle detection. | 2025-12-13 |
| SCAN-PY-405-008 | DONE | Docs + deterministic offline bench for Python analyzer contract. | 2025-12-13 |
## Completed Contracts (Action Decisions 2025-12-13)
1. **Action 1 - Explicit-Key Identity**: Uses `LanguageExplicitKey.Create("python", "pypi", name, spec, originLocator)` for non-versioned components.
2. **Action 2 - Lock Precedence**: Deterministic order with first-wins dedupe; full PEP 508 support.
3. **Action 3 - Container Overlay**: OCI whiteout semantics honored; incomplete overlay marked.
4. **Action 4 - Vendored Deps**: Parent metadata by default; separate components only with High confidence + known version.
5. **Interlock 4 - Usage/Scope**: Scope classification added (from lock sections and file names); runtime/import analysis opt-in.
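
The first-wins dedupe from Action 2 can be sketched as follows. This is a minimal, self-contained illustration rather than the collector's actual code: `NormalizeName` is a hypothetical stand-in for `PythonPathHelper.NormalizePackageName` (assumed here to follow PEP 503 normalization), and the sample declarations are invented for the example.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for PythonPathHelper.NormalizePackageName.
// Assumption: PEP 503 style, runs of '-', '_' and '.' collapse to '-', lowercased.
static string NormalizeName(string name) =>
    Regex.Replace(name, "[-_.]+", "-").ToLowerInvariant();

// Key shape used by the parsers: "name" when no exact version is known,
// "name@version" otherwise.
static string BuildKey(string name, string? version) =>
    version is null
        ? NormalizeName(name)
        : $"{NormalizeName(name)}@{version}".ToLowerInvariant();

// Maps dedupe key -> source that declared it first.
var entries = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

// Invented sample data, listed in precedence order (highest first).
var declarations = new (string Source, string Name, string? Version)[]
{
    ("poetry.lock", "Flask", "2.3.2"),
    ("Pipfile.lock", "flask", "2.3.2"),             // same key as above: ignored
    ("requirements.txt", "Requests_Toolbelt", null) // unversioned: keyed by name only
};

foreach (var (source, name, version) in declarations)
{
    // TryAdd leaves an existing entry untouched, which gives first-wins precedence.
    entries.TryAdd(BuildKey(name, version), source);
}

foreach (var (key, winner) in entries)
{
    Console.WriteLine($"{key} <- {winner}");
}
```

Because sources are visited highest-precedence first and existing keys are never overwritten, `flask@2.3.2` stays attributed to `poetry.lock`, while the unversioned requirement is stored under its normalized name `requests-toolbelt`.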