git.stella-ops.org/docs/features/checked/scanner/scanner-multi-language-license-detection-framework.md
2026-02-14 09:11:48 +02:00

Scanner Multi-Language License Detection Framework

Module

Scanner

Status

VERIFIED

Description

Comprehensive license detection framework comprising an SPDX expression categorization service, license text extraction from source files, copyright notice extraction, per-language detectors (Python, Java, Go, Rust, JavaScript/Node.js, .NET), and an aggregation service that merges results across analyzers. No direct match in the known features list.
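To make the categorization step concrete, here is a minimal sketch in Python rather than the project's C#. It is not the LicenseCategorizationService implementation: the lookup table, the function names `license_ids` and `categorize`, and the `unknown` fallback are all assumptions made for illustration. It splits an SPDX expression into license identifiers and looks each one up in a small category table.

```python
import re

# Hypothetical lookup table (assumption); a real service would cover far more
# of the SPDX license list and more categories.
CATEGORY_BY_ID = {
    "MIT": "permissive",
    "Apache-2.0": "permissive",
    "BSD-3-Clause": "permissive",
    "GPL-3.0": "copyleft",
    "GPL-3.0-only": "copyleft",
    "AGPL-3.0-only": "copyleft",
}

OPERATORS = {"AND", "OR", "WITH"}


def license_ids(expression: str) -> list[str]:
    """Return the license identifiers of an SPDX expression, in order."""
    tokens = re.findall(r"[A-Za-z0-9.+-]+", expression)
    ids = []
    skip_next = False
    for token in tokens:
        if skip_next:
            # The token after WITH names an exception, not a license.
            skip_next = False
            continue
        upper = token.upper()
        if upper == "WITH":
            skip_next = True
        elif upper not in OPERATORS:
            ids.append(token)
    return ids


def categorize(expression: str) -> dict[str, str]:
    """Map each license identifier in the expression to a category."""
    return {lid: CATEGORY_BY_ID.get(lid, "unknown") for lid in license_ids(expression)}
```

Returning one category per identifier, instead of a single verdict for the whole expression, leaves the policy question of how `AND` and `OR` combine categories to the caller.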

Implementation Details

  • Core Licensing Framework:
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/LicenseCategorizationService.cs - LicenseCategorizationService categorizing SPDX license expressions (permissive, copyleft, commercial, etc.)
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/ILicenseCategorizationService.cs - Interface for license categorization
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/LicenseTextExtractor.cs - LicenseTextExtractor extracting license text from source files (LICENSE, COPYING, etc.)
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/ILicenseTextExtractor.cs - Interface for text extraction
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/CopyrightExtractor.cs - CopyrightExtractor extracting copyright notices from source files
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/ICopyrightExtractor.cs - Interface for copyright extraction
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/LicenseDetectionAggregator.cs - LicenseDetectionAggregator merging license detection results across multiple per-language analyzers
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/ILicenseDetectionAggregator.cs - Interface for aggregation
  • Result Models:
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/LicenseDetectionResult.cs - Per-package license detection result
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/LicenseDetectionSummary.cs - Summary across all packages
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/LicenseTextExtractionResult.cs - License text extraction result
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang/Core/Licensing/CopyrightNotice.cs - Copyright notice model
  • Per-Language Detectors:
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Python/Internal/Licensing/PythonLicenseDetector.cs - Python license detection (setup.py, pyproject.toml, PKG-INFO)
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Python/Internal/Licensing/SpdxLicenseNormalizer.cs - SPDX normalization for Python classifiers
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Java/Internal/License/JavaLicenseDetector.cs - Java license detection (pom.xml, META-INF)
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Go/Internal/GoLicenseDetector.cs - Go license detection (go.mod, vendor)
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Go/Internal/EnhancedGoLicenseDetector.cs - Enhanced Go license detection with source analysis
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Rust/Internal/EnhancedRustLicenseDetector.cs - Rust license detection (Cargo.toml)
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Node/Internal/Licensing/NodeLicenseDetector.cs - Node.js license detection (package.json)
    • src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.DotNet/Internal/Licensing/DotNetLicenseDetector.cs - .NET license detection (.csproj, .nuspec)
  • Evidence:
    • src/Scanner/__Libraries/StellaOps.Scanner.Emit/Evidence/LicenseEvidenceBuilder.cs - LicenseEvidenceBuilder building license evidence for attestation
  • Tests:
    • src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Tests/Licensing/LicenseCategorizationServiceTests.cs - Categorization tests
    • src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Tests/Licensing/LicenseTextExtractorTests.cs - Text extraction tests
    • src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Tests/Licensing/CopyrightExtractorTests.cs - Copyright extraction tests
    • src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Tests/Licensing/LicenseDetectionAggregatorTests.cs - Aggregation tests
    • src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Tests/Licensing/LicenseDetectionIntegrationTests.cs - Integration tests
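
The aggregation role described for LicenseDetectionAggregator can be sketched as follows. This is a Python illustration under stated assumptions, not the C# code: the `Detection` record, its fields, and the `aggregate` and `summarize` functions are hypothetical names, and deduplicating on the (package, license) pair is one plausible merge rule among several.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical per-package finding reported by one analyzer."""
    package: str     # e.g. a package URL
    license_id: str  # SPDX identifier
    analyzer: str    # which per-language detector reported it


def aggregate(detections):
    """Merge detections, keeping one entry per (package, license) pair.

    The value records which analyzers reported the pair, in first-seen order.
    """
    merged = {}
    for d in detections:
        analyzers = merged.setdefault((d.package, d.license_id), [])
        if d.analyzer not in analyzers:
            analyzers.append(d.analyzer)
    return merged


def summarize(merged):
    """Count distinct packages per license across the merged results."""
    counts = {}
    for _package, license_id in merged:
        counts[license_id] = counts.get(license_id, 0) + 1
    return counts
```

Keeping the list of reporting analyzers alongside each merged entry preserves provenance, which may matter when the result later feeds evidence for attestation.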

E2E Test Plan

  • Scan a multi-language container image (Python + Java + Node.js) and verify license detection aggregates results from all per-language detectors
  • Verify the LicenseCategorizationService correctly classifies SPDX expressions (MIT as permissive, GPL-3.0 as copyleft, etc.)
  • Verify LicenseTextExtractor extracts full license text from LICENSE/COPYING files and embedded license headers
  • Verify CopyrightExtractor captures copyright notices with correct year ranges and holder names
  • Verify the LicenseDetectionAggregator merges results from multiple analyzers without duplicates
  • Verify each per-language detector handles its ecosystem-specific license metadata correctly (Python classifiers, Maven POM licenses, package.json license field)
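
For the copyright check in the plan above, the expected behavior can be pinned down with a small sketch. It is written in Python for illustration and does not reflect the actual CopyrightExtractor: the regular expression, the `extract_notices` function, and the tuple shape `(start_year, end_year, holder)` are assumptions.

```python
import re

# Assumed notice shape: "Copyright", an optional "(c)" or "©", a year or a
# year range, then the holder name running to the end of the line.
COPYRIGHT_RE = re.compile(
    r"Copyright\s+(?:\(c\)|©)?\s*"
    r"(?P<start>\d{4})(?:\s*-\s*(?P<end>\d{4}))?"
    r"\s+(?P<holder>[^\n]+)",
    re.IGNORECASE,
)


def extract_notices(text):
    """Return (start_year, end_year, holder) for each notice found in text."""
    notices = []
    for match in COPYRIGHT_RE.finditer(text):
        start = int(match.group("start"))
        # A single year is treated as a range of length one.
        end = int(match.group("end") or start)
        notices.append((start, end, match.group("holder").strip()))
    return notices
```

A test built on this shape would assert both the year range and the holder string, for example that `Copyright (c) 2019-2024 Example Corp.` yields the years 2019 and 2024 with holder `Example Corp.`.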

Verification

| Check | Result |
|---|---|
| Tier 0 - Source files exist | PASS |
| Tier 1 - Build + code review | PASS |
| Tier 2 - Integration tests | PASS |
Verified 2026-02-13T18:10:00Z