Java Analyzer (Scanner)
What it does
- Inventories Maven coordinates from JVM archives (JAR/WAR/EAR/fat JAR) without executing build tools.
- Prefers installed artifact metadata (`META-INF/maven/**/pom.properties`), with a `pom.xml` fallback when properties are missing.
- Enriches output with bounded embedded-library scan metadata and JNI usage hints.
Inputs and precedence
- Installed archive inventory: parse Maven coordinates from `META-INF/maven/**/pom.properties` in each discovered archive.
- `pom.xml` fallback: when the archive has no `pom.properties`, parse `META-INF/maven/**/pom.xml` and emit a Maven PURL only when `groupId`, `artifactId`, and `version` are concrete (no placeholders like `${...}`).
- Lock augmentation (current): when a lock entry matches an installed artifact, merge lock metadata onto the component; unmatched lock entries still emit declared-only components.
- Multi-module lock precedence (pending): deterministic precedence rules are tracked in `SCAN-JAVA-403-003` (blocked).
- Runtime images (pending): runtime component identity is tracked in `SCAN-JAVA-403-004` (blocked).
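The `pom.properties`-first, `pom.xml`-fallback rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the analyzer's implementation (which lives in C#): the function names are hypothetical, only the first matching entry per archive is read, and `pom.xml` parent inheritance is ignored.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET
from typing import Dict, Optional

COORD_KEYS = ("groupId", "artifactId", "version")
PLACEHOLDER = re.compile(r"\$\{[^}]*\}")


def parse_pom_properties(raw: bytes) -> Dict[str, str]:
    """Read the key=value lines of a pom.properties entry."""
    coords = {}
    for line in raw.decode("utf-8", errors="replace").splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            coords[key.strip()] = value.strip()
    return coords


def parse_pom_xml(raw: bytes) -> Dict[str, str]:
    """Read top-level coordinates from pom.xml (no parent inheritance)."""
    coords = {}
    for child in ET.fromstring(raw):
        tag = child.tag.rsplit("}", 1)[-1]  # strip the XML namespace, if any
        if tag in COORD_KEYS:
            coords[tag] = (child.text or "").strip()
    return coords


def is_concrete(coords: Dict[str, str]) -> bool:
    """All three coordinates are present and free of ${...} placeholders."""
    return all(
        bool(coords.get(k)) and not PLACEHOLDER.search(coords[k])
        for k in COORD_KEYS
    )


def maven_purl(archive_bytes: bytes) -> Optional[str]:
    """Return a Maven PURL for the archive, or None if none can be emitted."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as jar:
        names = sorted(n for n in jar.namelist() if n.startswith("META-INF/maven/"))
        props = [n for n in names if n.endswith("/pom.properties")]
        poms = [n for n in names if n.endswith("/pom.xml")]
        if props:  # installed metadata wins
            coords = parse_pom_properties(jar.read(props[0]))
        elif poms:  # fallback only when pom.properties is absent
            coords = parse_pom_xml(jar.read(poms[0]))
        else:
            return None
    if not is_concrete(coords):
        return None
    return "pkg:maven/{groupId}/{artifactId}@{version}".format(**coords)
```

Sorting the entry names before picking one keeps the result independent of the order in which entries were written to the archive.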
Embedded archives (fat JAR / WAR / EAR layouts)
The analyzer scans embedded library jars without extracting them to disk:
- `BOOT-INF/lib/*.jar`
- `WEB-INF/lib/*.jar`
- `APP-INF/lib/*.jar`
- `lib/*.jar`
Locator format
Evidence locators are nested deterministically using ! separators:
outer.jar!BOOT-INF/lib/inner.jar!META-INF/maven/.../pom.properties
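A small sketch of how such a locator could be assembled. The helper name `nested_locator` and the `org.example/demo` path in the usage line are hypothetical; the only behavior taken from this document is the `!` join and `/` separators.

```python
def nested_locator(*segments: str) -> str:
    """Join an outer path and nested entry paths into one '!'-separated locator."""
    normalized = [s.replace("\\", "/").strip("/") for s in segments]
    if not normalized or not all(normalized):
        raise ValueError("every locator segment must be a non-empty path")
    return "!".join(normalized)


locator = nested_locator(
    "outer.jar",
    "BOOT-INF/lib/inner.jar",
    "META-INF/maven/org.example/demo/pom.properties",
)
```

Because the segments are joined in nesting order and normalized the same way every time, the same archive layout always yields the same locator string.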
Bounds and skip markers
Embedded scanning is bounded and deterministic:
- Max embedded jars per archive: `256`
- Max embedded jar bytes: `25 MiB`
When embedded scanning is skipped or truncated, the outer component metadata includes deterministic markers:
- `embeddedScan.candidateJars`, `embeddedScan.scannedJars`, `embeddedScan.emittedComponents`
- `embeddedScanSkipped=true`, `embeddedScan.skippedJars`, `embeddedScanSkipReasons=<...>` (when applicable)
Embedded components include:
- `embedded=true`
- `embedded.containerJarPath=<outerRelativePath>`
- `embedded.entryPath=<embeddedEntryPath>`
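The bounded scan described in this section can be sketched as below. It is an illustration only and rests on several assumptions: the 25 MiB limit is treated as a per-jar limit, the skip reason codes (`max-embedded-jars`, `max-embedded-jar-bytes`) are invented for the example, the counters are always emitted while the skip markers appear only when something was skipped, and all function names are hypothetical. Inner jars are read into memory, never extracted to disk.

```python
import io
import zipfile

MAX_EMBEDDED_JARS = 256
MAX_EMBEDDED_JAR_BYTES = 25 * 1024 * 1024  # 25 MiB, assumed to apply per jar
EMBEDDED_DIRS = ("BOOT-INF/lib/", "WEB-INF/lib/", "APP-INF/lib/", "lib/")


def is_embedded_candidate(name: str) -> bool:
    """True for *.jar entries directly inside one of the library directories."""
    for prefix in EMBEDDED_DIRS:
        if name.startswith(prefix):
            rest = name[len(prefix):]
            return rest.endswith(".jar") and "/" not in rest
    return False


def has_maven_metadata(jar_bytes: bytes) -> bool:
    """Check an in-memory jar for a META-INF/maven/**/pom.properties entry."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as inner:
        return any(
            n.startswith("META-INF/maven/") and n.endswith("/pom.properties")
            for n in inner.namelist()
        )


def scan_embedded(outer_path, outer_bytes,
                  max_jars=MAX_EMBEDDED_JARS,
                  max_bytes=MAX_EMBEDDED_JAR_BYTES):
    """Return (outer metadata, embedded components) for one outer archive."""
    components, reasons = [], set()
    scanned = skipped = 0
    with zipfile.ZipFile(io.BytesIO(outer_bytes)) as outer:
        # Sorted order makes truncation deterministic across runs.
        candidates = sorted(n for n in outer.namelist() if is_embedded_candidate(n))
        for name in candidates:
            if scanned >= max_jars:
                skipped += 1
                reasons.add("max-embedded-jars")  # hypothetical reason code
                continue
            if outer.getinfo(name).file_size > max_bytes:
                skipped += 1
                reasons.add("max-embedded-jar-bytes")  # hypothetical reason code
                continue
            scanned += 1
            if has_maven_metadata(outer.read(name)):
                components.append({
                    "embedded": "true",
                    "embedded.containerJarPath": outer_path,
                    "embedded.entryPath": name,
                })
    metadata = {
        "embeddedScan.candidateJars": str(len(candidates)),
        "embeddedScan.scannedJars": str(scanned),
        "embeddedScan.emittedComponents": str(len(components)),
    }
    if skipped:
        metadata["embeddedScanSkipped"] = "true"
        metadata["embeddedScan.skippedJars"] = str(skipped)
        metadata["embeddedScanSkipReasons"] = ",".join(sorted(reasons))
    return metadata, components
```

Checking the declared uncompressed size before reading an entry means an oversized jar is skipped without being loaded into memory.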
Evidence and hashing
- Evidence locators are project-relative, use `/` separators, and use `!` for nested artifact paths.
- `sha256` for `pom.properties` and `pom.xml` evidence is computed over the raw entry bytes.
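As a rough illustration, an evidence record for a single archive entry might be built like this. The function name and the returned dictionary shape are hypothetical, and "raw entry bytes" is assumed here to mean the entry content as read from the archive, with no text decoding or line-ending normalization before hashing.

```python
import hashlib
import io
import zipfile


def entry_evidence(archive_path: str, archive_bytes: bytes, entry_name: str) -> dict:
    """Build a locator plus sha256 for one entry of an in-memory archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as jar:
        raw = jar.read(entry_name)  # bytes as stored in the entry, untouched
    locator = archive_path.replace("\\", "/") + "!" + entry_name
    return {"locator": locator, "sha256": hashlib.sha256(raw).hexdigest()}
```

Hashing the bytes directly, rather than a decoded string, means a file with CRLF line endings and one with LF line endings produce different digests, which is what makes the hash usable as evidence of the exact entry.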
pom.xml with incomplete coordinates
When pom.xml is present but coordinates are incomplete (missing values or ${...} placeholders), the analyzer emits an explicit-key component:
- `purl=null`, `version=null`
- `metadata.unresolvedCoordinates=true`
- `componentKey` follows the cross-analyzer explicit-key scheme via `LanguageExplicitKey.Create("java", "maven", ...)`
JNI metadata (bytecode-based)
JNI hints are derived from parsed bytecode (native method flags and load call sites), not raw ASCII scanning.
When bytecode analysis finds JNI edges (jni.edgeCount > 0), components are annotated with bounded, deterministic metadata:
- `jni.edgeCount`, `jni.nativeMethodCount`, `jni.loadCallCount`, optional `jni.warningCount`
- `jni.reasons` (distinct reason codes)
- `jni.targetLibraries` (top-N stable sample; currently 12)
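One way to picture how these fields stay bounded and deterministic is the sketch below. The edge record shape (`kind`, `reason`, `target`), the reason codes used in the usage example, and the choice of lexicographic order for the "stable sample" are all assumptions made for illustration; the document does not specify them.

```python
TARGET_LIBRARY_SAMPLE_SIZE = 12  # "currently 12" per the list above


def summarize_jni(edges, sample_size=TARGET_LIBRARY_SAMPLE_SIZE):
    """Collapse JNI edges into bounded metadata; empty input yields no metadata."""
    if not edges:
        return {}
    # Sets remove duplicates; sorting removes dependence on discovery order.
    reasons = sorted({e["reason"] for e in edges})
    libraries = sorted({e["target"] for e in edges if e.get("target")})
    return {
        "jni.edgeCount": len(edges),
        "jni.nativeMethodCount": sum(1 for e in edges if e["kind"] == "native"),
        "jni.loadCallCount": sum(1 for e in edges if e["kind"] == "load"),
        "jni.reasons": reasons,
        "jni.targetLibraries": libraries[:sample_size],
    }


example = summarize_jni([
    {"kind": "load", "reason": "system-load-library", "target": "crypto"},
    {"kind": "native", "reason": "native-method"},
    {"kind": "load", "reason": "system-load-library", "target": "awt"},
])
```

Truncating only after sorting is what makes the sample stable: two scans that discover the same libraries in a different order still report the same subset.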
Known limitations
- Shaded jars that strip Maven metadata remain best-effort; embedded libs without Maven metadata do not emit components.
- Gradle multi-module lock precedence and runtime image component identity remain blocked until explicit decisions land.
References
- Sprint: `docs/implplan/SPRINT_0403_0001_0001_scanner_java_detection_gaps.md`
- Cross-analyzer contract: `docs/modules/scanner/language-analyzers-contract.md`
- Implementation: `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Java/JavaLanguageAnalyzer.cs`