# Scanner Multi-Language License Detection Framework

## Module
Scanner

## Status
VERIFIED

## Description
Comprehensive license detection framework with SPDX expression categorization service, license text extraction from source files, copyright notice extraction, per-language detectors (Python, Java, Go, Rust, JavaScript, .NET), and an aggregation service that merges results across analyzers. No direct match in known features list.

## Implementation Details
- **Core Licensing Framework**:
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/LicenseCategorizationService.cs` - `LicenseCategorizationService` categorizing SPDX license expressions (permissive, copyleft, commercial, etc.)
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/ILicenseCategorizationService.cs` - Interface for license categorization
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/LicenseTextExtractor.cs` - `LicenseTextExtractor` extracting license text from source files (LICENSE, COPYING, etc.)
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/ILicenseTextExtractor.cs` - Interface for text extraction
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/CopyrightExtractor.cs` - `CopyrightExtractor` extracting copyright notices from source files
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/ICopyrightExtractor.cs` - Interface for copyright extraction
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/LicenseDetectionAggregator.cs` - `LicenseDetectionAggregator` merging license detection results across multiple per-language analyzers
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/ILicenseDetectionAggregator.cs` - Interface for aggregation
- **Result Models**:
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/LicenseDetectionResult.cs` - Per-package license detection result
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/LicenseDetectionSummary.cs` - Summary across all packages
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/LicenseTextExtractionResult.cs` - License text extraction result
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/CopyrightNotice.cs` - Copyright notice model
- **Per-Language Detectors**:
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Python/Internal/Licensing/PythonLicenseDetector.cs` - Python license detection (setup.py, pyproject.toml, PKG-INFO)
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Python/Internal/Licensing/SpdxLicenseNormalizer.cs` - SPDX normalization for Python classifiers
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Java/Internal/License/JavaLicenseDetector.cs` - Java license detection (pom.xml, META-INF)
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Go/Internal/GoLicenseDetector.cs` - Go license detection (go.mod, vendor)
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Go/Internal/EnhancedGoLicenseDetector.cs` - Enhanced Go license detection with source analysis
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Rust/Internal/EnhancedRustLicenseDetector.cs` - Rust license detection (Cargo.toml)
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Node/Internal/Licensing/NodeLicenseDetector.cs` - Node.js license detection (package.json)
  - `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.DotNet/Internal/Licensing/DotNetLicenseDetector.cs` - .NET license detection (.csproj, .nuspec)
- **Evidence**:
  - `src/Scanner/__Libraries/StellaOps.Scanner.Emit/Evidence/LicenseEvidenceBuilder.cs` - `LicenseEvidenceBuilder` building license evidence for attestation
- **Tests**:
  - `src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Tests/Licensing/LicenseCategorizationServiceTests.cs` - Categorization tests
  - `src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Tests/Licensing/LicenseTextExtractorTests.cs` - Text extraction tests
  - `src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Tests/Licensing/CopyrightExtractorTests.cs` - Copyright extraction tests
  - `src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Tests/Licensing/LicenseDetectionAggregatorTests.cs` - Aggregation tests
  - `src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Tests/Licensing/LicenseDetectionIntegrationTests.cs` - Integration tests

## E2E Test Plan
- [ ] Scan a multi-language container image (Python + Java + Node.js) and verify license detection aggregates results from all per-language detectors
- [ ] Verify the `LicenseCategorizationService` correctly classifies SPDX expressions (MIT as permissive, GPL-3.0 as copyleft, etc.)
- [ ] Verify `LicenseTextExtractor` extracts full license text from LICENSE/COPYING files and embedded license headers
- [ ] Verify `CopyrightExtractor` captures copyright notices with correct year ranges and holder names
- [ ] Verify the `LicenseDetectionAggregator` merges results from multiple analyzers without duplicates
- [ ] Verify each per-language detector handles its ecosystem-specific license metadata correctly (Python classifiers, Maven POM licenses, package.json license field)

---

## Verification

| Check | Result |
|-------|--------|
| Tier 0 - Source files exist | PASS |
| Tier 1 - Build + code review | PASS |
| Tier 2 - Integration tests | PASS |
| Verified | 2026-02-13T18:10:00Z |