git.stella-ops.org/docs/modules/scanner/analyzers-node.md
StellaOps Bot 6e45066e37
2025-12-13 09:37:15 +02:00


Node Analyzer (npm/Yarn/pnpm)

This document captures the Node language analyzer's deterministic behavior guarantees and safety constraints: what it emits, what it refuses to emit, and how it stays bounded and offline.

Component identity & precedence

Installed vs declared-only

  • The analyzer always emits on-disk inventory first (workspace member manifests plus installed node_modules, pnpm, and Yarn PnP cache packages).
  • It then emits declared-only components for lockfile / manifest declarations that are not backed by on-disk inventory:
    • If a declared entry has a concrete resolved version from a lockfile, it emits a versioned pkg:npm/...@<version> PURL.
    • If the version is non-concrete (ranges/tags/git/file/workspace/link/path), it emits an explicit-key component (purl=null, version=null).
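
The declared-only decision above can be sketched as follows. This is a minimal illustration, not the analyzer's code: the function name `declared_identity`, the `explicitKey` field, and the exact-version regex are assumptions made for the example. The only behavior taken from this document is that a concrete lockfile version yields a versioned PURL and anything else yields `purl=null, version=null`.

```python
import re
from typing import Optional
from urllib.parse import quote

# Assumption: "concrete" means an exact semver such as 1.2.3 or 1.2.3-beta.1.
# Ranges (^1.0.0), tags (latest) and protocol specs (workspace:*, file:..) do not match.
_EXACT_VERSION = re.compile(
    r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def declared_identity(name: str, resolved_version: Optional[str]) -> dict:
    """Return a PURL identity for concrete versions, an explicit key otherwise."""
    if resolved_version and _EXACT_VERSION.match(resolved_version):
        # The purl spec percent-encodes the "@" of an npm scope: @scope/pkg -> %40scope/pkg.
        encoded = quote(name, safe="/")
        return {
            "purl": f"pkg:npm/{encoded}@{resolved_version}",
            "version": resolved_version,
        }
    # Hypothetical explicit-key shape; the real one is defined by LanguageExplicitKey.
    return {"purl": None, "version": None, "explicitKey": name}


print(declared_identity("lodash", "4.17.21"))
print(declared_identity("@types/node", "^20.0.0"))
```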

Identity safety (PURL vs explicit-key)

  • Concrete PURLs are emitted only when the analyzer can prove a concrete version from local evidence (installed package.json or a lockfile-resolved entry).
  • Declared-only/non-concrete dependencies use LanguageExplicitKey (see docs/modules/scanner/language-analyzers-contract.md).

Lock metadata lookup precedence

When attaching lock metadata to an installed package:

  1. package-lock.json path match (packages["<relativePath>"]),
  2. (name, version) match (Yarn/pnpm multi-version support),
  3. fallback to name-only (last-wins) for legacy locks.
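
The three-step precedence can be pictured as a small index with three maps. The sketch below is illustrative only: the `LockIndex` class and its method names are hypothetical, and lock entries are modeled as plain dictionaries.

```python
from typing import Optional


class LockIndex:
    """Hypothetical index over lock entries, queried in the order listed above."""

    def __init__(self) -> None:
        self.by_path = {}          # package-lock.json packages["<relativePath>"]
        self.by_name_version = {}  # (name, version) for Yarn/pnpm multi-version locks
        self.by_name = {}          # legacy fallback, last entry wins

    def add(self, entry: dict, path: Optional[str] = None) -> None:
        if path is not None:
            self.by_path[path] = entry
        self.by_name_version[(entry["name"], entry["version"])] = entry
        self.by_name[entry["name"]] = entry  # overwrites earlier entries: last-wins

    def lookup(self, relative_path: str, name: str, version: str) -> Optional[dict]:
        # 1. exact path match
        entry = self.by_path.get(relative_path)
        if entry is not None:
            return entry
        # 2. (name, version) match
        entry = self.by_name_version.get((name, version))
        if entry is not None:
            return entry
        # 3. name-only fallback
        return self.by_name.get(name)
```

Checking the path first means two installed copies of the same package at different versions each pick up their own lock entry, while the name-only step is reached only when nothing more specific exists.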

Lockfile parsing guarantees (offline)

package-lock.json (npm)

  • Supports v3+ packages{} layout and legacy dependencies{} traversal.
  • Correctly extracts nested names from node_modules/.../node_modules/... paths (including scoped packages).
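
As an illustration of the nested-name rule, the sketch below derives a package name from a `packages{}` key by taking whatever follows the last `node_modules/` segment and keeping two segments for scoped packages. The function name is hypothetical and the analyzer's own implementation may differ.

```python
from typing import Optional


def package_name_from_lock_path(path: str) -> Optional[str]:
    """Extract the package name from a package-lock.json packages{} key."""
    marker = "node_modules/"
    index = path.rfind(marker)
    if index < 0:
        # The root entry ("") and workspace paths carry no node_modules segment.
        return None
    segments = [s for s in path[index + len(marker):].split("/") if s]
    if not segments:
        return None
    if segments[0].startswith("@"):
        # Scoped packages span two segments: @scope/name.
        if len(segments) < 2:
            return None
        return f"{segments[0]}/{segments[1]}"
    return segments[0]


print(package_name_from_lock_path("node_modules/a/node_modules/@scope/b"))
```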

yarn.lock (Yarn v1 + Berry v2/v3)

  • Supports both Yarn v1 fields (resolved "https://...") and Berry fields (resolution:, checksum:).
  • If integrity is absent but checksum is present, the analyzer records integrity-like evidence as checksum:<value>.
  • Ignores the __metadata section.
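
A minimal sketch of the checksum fallback and the `__metadata` skip, assuming the lockfile has already been parsed into a dictionary keyed by entry descriptor. The function names and the input shape are assumptions for the example; only the `checksum:<value>` form and the ignored `__metadata` key come from this document.

```python
from typing import Optional


def integrity_evidence(entry: dict) -> Optional[str]:
    """Prefer integrity; otherwise record the Berry checksum as checksum:<value>."""
    integrity = entry.get("integrity")
    if integrity:
        return integrity
    checksum = entry.get("checksum")
    if checksum:
        return f"checksum:{checksum}"
    return None


def lock_evidence(parsed_lock: dict) -> dict:
    """Map each lock entry to its integrity-like evidence, skipping __metadata."""
    return {
        key: integrity_evidence(entry)
        for key, entry in sorted(parsed_lock.items())
        if key != "__metadata"
    }
```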

pnpm-lock.yaml (pnpm)

  • Parses modern packages: and snapshots: sections.
  • Does not drop entries that lack integrity (workspace/link/file/git); instead it emits:
    • lockIntegrityMissing=true
    • lockIntegrityMissingReason=<workspace|link|file|git|directory|missing>
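
One way to picture the reason classification is shown below. This document lists the possible reason values but not how each is detected, so the prefix and `resolution` checks here are assumptions, as is the function name.

```python
def integrity_metadata(version_spec: str, resolution: dict) -> dict:
    """Return lock metadata for a pnpm entry; empty when integrity is present."""
    if resolution.get("integrity"):
        return {}
    # Assumed detection rules: protocol prefixes first, then the resolution type.
    if version_spec.startswith("workspace:"):
        reason = "workspace"
    elif version_spec.startswith("link:"):
        reason = "link"
    elif version_spec.startswith("file:"):
        reason = "file"
    elif resolution.get("type") == "git" or version_spec.startswith(("git+", "git:", "github:")):
        reason = "git"
    elif resolution.get("type") == "directory" or "directory" in resolution:
        reason = "directory"
    else:
        reason = "missing"
    return {"lockIntegrityMissing": "true", "lockIntegrityMissingReason": reason}


print(integrity_metadata("link:../shared", {}))
```

Keeping the entry and recording why integrity is absent preserves the dependency in the inventory instead of silently losing it.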

Workspaces

  • Reads workspace members from the root package.json (workspaces array or { packages: [...] } form).
  • Supports glob patterns:
    • * (single segment)
    • ** (multi-segment)
  • Expansion is bounded and deterministic:
    • Skips node_modules
    • Caps traversal depth and total visited directories/members
    • Stable, sorted member output
  • Dependency scopes (production|development|peer|optional) are derived from both the root and workspace manifests, with deterministic precedence.

Import scanning (bounded)

  • Import scanning runs only for the root package and workspace member packages (not node_modules packages).
  • File types: .js/.jsx/.mjs/.cjs/.ts/.tsx/.mts/.cts.
  • Parser behavior:
    • Attempts AST parsing as script/module; falls back to a bounded regex heuristic for TS when parsing fails.
  • Hard caps per package:
    • maxFiles=500, maxBytes=5MiB, maxFileBytes=512KiB, maxDepth=20
    • Skips node_modules and .pnpm directories during traversal
  • If capped, the analyzer marks the package metadata with:
    • importScanSkipped=true
    • importScan.filesScanned=<n>
    • importScan.bytesScanned=<n>
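
The caps above can be sketched as a file-selection pass that stops once a limit is hit and reports what was covered. The cap defaults and metadata keys come from this document; the function name, the treatment of oversized files (skipped and counted as capped), and the string-valued metadata are assumptions for illustration.

```python
import os

SCAN_EXTENSIONS = (".js", ".jsx", ".mjs", ".cjs", ".ts", ".tsx", ".mts", ".cts")
SKIP_DIRS = {"node_modules", ".pnpm"}


def plan_import_scan(package_root, max_files=500, max_bytes=5 * 1024 * 1024,
                     max_file_bytes=512 * 1024, max_depth=20):
    """Select files to scan under the caps; return (files, metadata)."""
    files, total_bytes, capped, stop = [], 0, False, False
    for current, dirs, names in os.walk(package_root):
        rel = os.path.relpath(current, package_root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        # Prune skipped directories in place; sort for a stable visit order.
        dirs[:] = sorted(d for d in dirs if d not in SKIP_DIRS)
        if depth >= max_depth and dirs:
            capped = True
            dirs[:] = []  # do not descend past the depth cap
        for name in sorted(names):
            if not name.endswith(SCAN_EXTENSIONS):
                continue
            path = os.path.join(current, name)
            size = os.path.getsize(path)
            if size > max_file_bytes:
                capped = True  # assumption: oversized files are skipped entirely
                continue
            if len(files) >= max_files or total_bytes + size > max_bytes:
                capped, stop = True, True
                break
            files.append(path)
            total_bytes += size
        if stop:
            break
    metadata = {}
    if capped:
        metadata = {
            "importScanSkipped": "true",
            "importScan.filesScanned": str(len(files)),
            "importScan.bytesScanned": str(total_bytes),
        }
    return files, metadata
```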

Container layer layouts

  • Candidate layer roots under the analysis root:
    • layers/*, .layers/*, layer*
  • Each candidate root is scanned independently.
  • The analyzer also discovers package.json roots nested under layer roots (bounded depth) and includes their nested node_modules roots when present.
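
Candidate discovery for the three patterns can be sketched as below. The function name is hypothetical, and the patterns are applied literally; note that, read literally, `layer*` also matches a directory named `layers` itself, and this document does not say whether the analyzer excludes it.

```python
import fnmatch
import os

LAYER_PATTERNS = ("layers/*", ".layers/*", "layer*")


def candidate_layer_roots(analysis_root: str) -> list:
    """Return sorted, de-duplicated layer root candidates relative to the root."""
    found = set()
    for pattern in LAYER_PATTERNS:
        parent, _, leaf = pattern.rpartition("/")
        base = os.path.join(analysis_root, parent) if parent else analysis_root
        if not os.path.isdir(base):
            continue
        for name in os.listdir(base):
            is_dir = os.path.isdir(os.path.join(base, name))
            if is_dir and fnmatch.fnmatchcase(name, leaf):
                found.add(f"{parent}/{name}" if parent else name)
    return sorted(found)
```

Each returned root would then be scanned independently, as described above.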

Determinism & evidence hashing

  • On-disk package.json manifests are hashed (sha256) when ≤ 1 MiB and attached to the root evidence for deterministic provenance.
  • Output ordering is stable (componentKey ordering, sorted metadata/evidence).
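
A minimal sketch of the size-gated hashing and stable ordering, assuming components are plain dictionaries with a `componentKey` and a `metadata` map. The function names are hypothetical; the 1 MiB threshold and sha256 come from this document.

```python
import hashlib
import os
from typing import Optional

MAX_MANIFEST_BYTES = 1024 * 1024  # 1 MiB threshold stated above


def manifest_sha256(path: str) -> Optional[str]:
    """Hash a package.json only when it is at most 1 MiB; otherwise skip it."""
    if os.path.getsize(path) > MAX_MANIFEST_BYTES:
        return None
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def stable_order(components: list) -> list:
    """Sort components by componentKey and each component's metadata by key."""
    ordered = sorted(components, key=lambda c: c["componentKey"])
    return [
        {**c, "metadata": dict(sorted(c.get("metadata", {}).items()))}
        for c in ordered
    ]
```

Bounding the hashed size keeps the work per manifest predictable, and sorting every collection before output is what lets two runs over the same input produce byte-identical results.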

Benchmark

  • Scenario id: node_detection_gaps_fixture (config: src/Bench/StellaOps.Bench/Scanner.Analyzers/config.json)
  • Fixture root: samples/runtime/node-detection-gaps
  • Run:
    • dotnet run --project src/Bench/StellaOps.Bench/Scanner.Analyzers/StellaOps.Bench.ScannerAnalyzers/StellaOps.Bench.ScannerAnalyzers.csproj -- --repo-root . --config src/Bench/StellaOps.Bench/Scanner.Analyzers/config.json --json out/bench/scanner-analyzers/latest.json --prom out/bench/scanner-analyzers/latest.prom
    • Prometheus output includes additional metrics under scanner_analyzer_bench_metric{scenario="...",name="node.importScan.*"}.