git.stella-ops.org/docs/modules/scanner/analyzers-go.md
2025-12-13 18:08:55 +02:00

Go Analyzer (Scanner)

What it does

  • Inventories Go components from binaries (embedded buildinfo) and source (go.mod/go.sum/go.work/vendor) without invoking the go toolchain.
  • Emits pkg:golang/<module>@<version> when a concrete version is available; otherwise emits deterministic explicit-key components (no "range-as-version" PURLs).
  • Records VCS/build metadata and bounded evidence for audit/replay; remains offline-first.
  • Detects security-relevant capabilities in Go source code (exec, filesystem, network, native code, etc.).

Inputs and precedence

The analyzer processes inputs in the following order, with binary evidence taking precedence:

  1. Binary inventory (Phase 1, authoritative): Extract embedded build info (runtime/debug buildinfo blob) and emit Go modules (main + deps) with concrete versions and build settings evidence. Binary-derived components include provenance=binary metadata.
  2. Source inventory (Phase 2, supplementary): Parse go.mod, go.sum, go.work, and vendor/modules.txt to emit modules not already covered by binary evidence. Source-derived components include provenance=source metadata.
  3. Heuristic fallback (stripped binaries): When buildinfo is missing, emit deterministic bin components keyed by sha256 plus minimal classification evidence.

Precedence rules:

  • Binary evidence is scanned first and takes precedence over source evidence.
  • When both source and binary evidence exist for the same module path@version, only the binary-derived component is emitted.
  • Main modules are matched by module path: if a binary reports the main module at a concrete version, the source-derived entry for the same module at (devel) is suppressed.
  • Together, these rules keep the output deterministic and free of duplicates.
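
The precedence rules above can be sketched as a small merge step. This is an illustrative Python model, not the analyzer's actual code (the implementation is in C#); the `Component` type, its fields, and `merge_inventories` are hypothetical names introduced for the example.

```python
from dataclasses import dataclass

DEVEL = "(devel)"


@dataclass(frozen=True)
class Component:
    module: str
    version: str
    provenance: str  # "binary" or "source"
    is_main: bool = False


def merge_inventories(binary, source):
    """Combine binary- and source-derived components, binary evidence first."""
    emitted = {}
    binary_main_paths = set()

    for comp in binary:
        emitted.setdefault((comp.module, comp.version), comp)
        if comp.is_main:
            binary_main_paths.add(comp.module)

    for comp in source:
        key = (comp.module, comp.version)
        if key in emitted:
            continue  # same module path@version already covered by a binary
        if comp.is_main and comp.version == DEVEL and comp.module in binary_main_paths:
            continue  # a binary already reported this main module with a version
        emitted[key] = comp

    # Sort so the result does not depend on input order.
    return sorted(emitted.values(), key=lambda c: (c.module, c.version, c.provenance))
```

With a binary that reports `example.com/app@v1.2.0` as its main module, a source tree declaring `example.com/app@(devel)` contributes only the dependencies that the binary did not already cover.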

Project discovery (modules + workspaces)

  • Standalone modules are discovered by locating go.mod files (bounded recursion depth 10; vendor directories skipped).
  • Workspaces are discovered via go.work at the analysis root; the directories listed in its use directives become additional module roots.
  • Vendored dependencies are detected via vendor/modules.txt when present.
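
A minimal sketch of the bounded go.mod discovery, in Python for illustration. The function name and the exact depth semantics (the analysis root counts as depth 0) are assumptions; only the depth limit of 10 and the skipping of vendor directories come from the description above.

```python
import os


def discover_module_roots(root, max_depth=10):
    """Return directories under root that contain a go.mod file.

    Directories named "vendor" are never entered, and the walk does not
    descend below max_depth levels under the root (root itself is depth 0).
    """
    root = os.path.abspath(root)
    roots = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1

        # Prune in place so os.walk skips vendor trees; sort for stable order.
        dirnames[:] = sorted(d for d in dirnames if d != "vendor")
        if depth >= max_depth:
            dirnames[:] = []  # depth bound reached: do not descend further

        if "go.mod" in filenames:
            roots.append(dirpath)
    return sorted(roots)
```

Workspace members from go.work would be added to this list separately; that step is not shown here.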

Workspace replace directive propagation

go.work files may contain replace directives that apply to all workspace members:

  • Workspace-level replaces are inherited by all member modules.
  • Module-level replaces take precedence over workspace-level replaces for the same module path.
  • Duplicate replace keys are handled deterministically (last-one-wins within each scope).
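
The propagation rules can be modelled as two ordered passes over a dictionary. This is a sketch under the assumption that each replace directive reduces to an (old, new) pair of strings; `effective_replaces` is a hypothetical helper, not a function from the analyzer.

```python
def effective_replaces(workspace_replaces, module_replaces):
    """Resolve the replaces that apply to one workspace member.

    Both arguments are lists of (old, new) pairs in file order. Within each
    scope a later duplicate overwrites an earlier one (last-one-wins), and
    module-level entries overwrite workspace-level entries for the same key.
    """
    effective = {}
    for old, new in workspace_replaces:  # inherited from go.work
        effective[old] = new
    for old, new in module_replaces:  # the member's own go.mod
        effective[old] = new
    # Return keys in sorted order so downstream output is stable.
    return dict(sorted(effective.items()))
```

For example, a workspace replace of `example.com/b` with `../b` is overridden when the member's go.mod replaces `example.com/b` with a fork.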

Identity rules (PURL vs explicit key)

Concrete versions emit a PURL:

  • purl = pkg:golang/<modulePath>@<version>

Non-concrete identities emit an explicit key:

  • Used for source-only main modules ((devel)) and for any non-versioned module identity.
  • PURL is omitted (purl=null) and the component is keyed deterministically via AddFromExplicitKey.
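
As an illustration of the two identity paths, the Python sketch below emits a PURL only for concrete versions. The regular expression used to decide what counts as concrete (a "v"-prefixed semantic version, which also admits pseudo-versions) and the explicit-key string format are assumptions made for this example; the real key is produced by AddFromExplicitKey and its format is not described here.

```python
import re

# Assumed test for a concrete version: vMAJOR.MINOR.PATCH with optional
# pre-release and build suffixes (e.g. v1.2.3, v2.0.0+incompatible).
_CONCRETE = re.compile(r"^v\d+\.\d+\.\d+(-[0-9A-Za-z.\-]+)?(\+[0-9A-Za-z.\-]+)?$")


def identify(module_path, version):
    """Return a dict with either a PURL or a hypothetical explicit key."""
    if version and _CONCRETE.match(version):
        return {"purl": f"pkg:golang/{module_path}@{version}", "key": None}
    # Non-concrete identity, such as "(devel)" or a missing version:
    # no PURL, and a deterministic key instead (format is illustrative only).
    label = version or "unversioned"
    return {"purl": None, "key": f"golang::{module_path}::{label}"}
```

Calling `identify("example.com/app", "(devel)")` yields `purl=None` and a key that depends only on the module path and the version label, so repeated scans produce the same identity.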

Evidence and metadata

Binary-derived components

Binary components include (when present):

  • provenance=binary
  • go.version
  • modulePath.main and build.* settings
  • VCS fields (build.vcs* from build settings and/or go.dwarf tokens)
  • moduleSum and replacement metadata when available
  • CGO signals (cgo.enabled, flags, compiler hints; plus adjacent native libs when detected)

Source-derived components

Source components include:

  • provenance=source
  • moduleSum from go.sum (when present)
  • vendor signals (vendored=true) and vendor evidence locators
  • replacement/exclude flags with stable metadata keys
  • best-effort license signals for main module and vendored modules
  • capabilities metadata listing detected capability kinds (exec, filesystem, network, etc.)
  • capabilities.maxRisk indicating highest risk level (critical/high/medium/low)

Heuristic fallback components

Fallback components include:

  • type=bin, deterministic sha256 identity, and a classification evidence marker
  • The metric scanner_analyzer_golang_heuristic_total{indicator,version_hint} is incremented once per heuristic emission.
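
A short sketch of how a sha256-keyed fallback component could be built, in Python for illustration. The dictionary shape and the key prefix are hypothetical; only type=bin and the sha256-based identity come from the text above.

```python
import hashlib


def fallback_component(path, chunk_size=1024 * 1024):
    """Hash a file in chunks and return a content-addressed bin component."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large binaries are never fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    sha = digest.hexdigest()
    # Key format is an assumption made for this example.
    return {"type": "bin", "key": f"golang::bin::sha256:{sha}", "sha256": sha}
```

Because the key is derived only from file content, the same stripped binary found at two different paths maps to the same identity.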

Capability scanning

The analyzer detects security-relevant capabilities in Go source code:

| Capability    | Risk        | Examples                                       |
|---------------|-------------|------------------------------------------------|
| Exec          | Critical    | exec.Command, syscall.Exec, os.StartProcess    |
| NativeCode    | Critical    | unsafe.Pointer, //go:linkname, syscall.Syscall |
| PluginLoading | Critical    | plugin.Open                                    |
| Filesystem    | High/Medium | os.Remove, os.Chmod, os.WriteFile              |
| Network       | Medium      | net.Dial, http.Get, http.ListenAndServe        |
| Environment   | High/Medium | os.Setenv, os.Getenv                           |
| Database      | Medium      | sql.Open, db.Query                             |
| DynamicCode   | High        | reflect.Value.Call, template.Execute           |
| Serialization | Medium      | gob.NewDecoder, xml.Unmarshal                  |
| Reflection    | Low/Medium  | reflect.TypeOf, reflect.New                    |
| Crypto        | Low         | Hash functions, cipher operations              |

Capabilities are emitted as:

  • Metadata: capabilities=exec,filesystem,network (comma-separated list of kinds)
  • Metadata: capabilities.maxRisk=critical|high|medium|low
  • Evidence: Top 10 capability locations with pattern and line number
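
The emitted shape can be sketched as a reduction over individual findings. This Python example is illustrative: the finding fields, the `summarize` helper, and the ranking used to pick the top 10 locations (highest risk first, then file, line, and pattern) are assumptions, since the selection order is not specified above.

```python
RISK_ORDER = ["low", "medium", "high", "critical"]


def summarize(findings, max_evidence=10):
    """Reduce findings to capability metadata plus bounded evidence.

    Each finding is a dict with keys: kind, risk, pattern, file, line.
    """
    if not findings:
        return {}
    kinds = sorted({f["kind"] for f in findings})
    max_risk = max((f["risk"] for f in findings), key=RISK_ORDER.index)
    # Assumed ranking: most severe first, then a stable location order.
    ranked = sorted(
        findings,
        key=lambda f: (-RISK_ORDER.index(f["risk"]), f["file"], f["line"], f["pattern"]),
    )
    return {
        "capabilities": ",".join(kinds),
        "capabilities.maxRisk": max_risk,
        "evidence": [
            {"pattern": f["pattern"], "locator": f"{f['file']}:{f['line']}"}
            for f in ranked[:max_evidence]
        ],
    }
```

Sorting the kinds and the evidence keeps the metadata identical across runs regardless of the order in which files were scanned.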

IO/Memory bounds

Binary and DWARF scanning uses bounded windowed reads to limit memory usage:

  • Build info scanning: 16 MB windows with 4 KB overlap; max file size 128 MB.
  • DWARF token scanning: 8 MB windows with 1 KB overlap; max file size 256 MB.
  • Small files (below window size) are read directly for efficiency.
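
The windowed read can be sketched as follows, using the build info limits listed above as defaults. This is a Python illustration rather than the analyzer's code: `find_marker` is a hypothetical name, and returning None for files above the size limit is an assumption about how the bound is enforced. The marker is the magic prefix that the Go toolchain writes at the start of the buildinfo blob.

```python
BUILDINFO_MAGIC = b"\xff Go buildinf:"
WINDOW = 16 * 1024 * 1024
OVERLAP = 4 * 1024
MAX_SIZE = 128 * 1024 * 1024


def find_marker(stream, size, marker=BUILDINFO_MAGIC,
                window=WINDOW, overlap=OVERLAP, max_size=MAX_SIZE):
    """Return the absolute offset of the first marker, or None.

    Consecutive windows share `overlap` bytes, so a marker that straddles a
    window boundary is still seen whole in the next window. This requires
    overlap >= len(marker) - 1 and window > overlap.
    """
    if size > max_size:
        return None  # assumed behaviour: files over the limit are skipped
    stream.seek(0)
    if size <= window:
        idx = stream.read().find(marker)  # small file: one direct read
        return idx if idx >= 0 else None

    offset = 0
    while offset < size:
        stream.seek(offset)
        chunk = stream.read(window)
        idx = chunk.find(marker)
        if idx >= 0:
            return offset + idx
        if offset + len(chunk) >= size:
            break  # last window reached
        offset += window - overlap
    return None
```

Peak memory is bounded by one window, independent of file size. The same loop with an 8 MB window, 1 KB overlap, and 256 MB limit would model the DWARF token scan.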

Retract semantics

Go's retract directive only applies to versions of the declaring module itself, not to dependencies:

  • The RetractedVersions field in inventory results contains only versions of the main module that are retracted.
  • Dependency retraction cannot be determined offline (would require fetching each module's go.mod).
  • No false-positive retraction warnings are emitted for dependencies.
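
To illustrate why only the main module's retractions are available offline, the sketch below collects retract entries from a single go.mod text. It is a simplified Python parser written for this example: it handles single-line and parenthesised block forms and strips `//` comments, but it is not a complete go.mod parser, and `retracted_versions` is a hypothetical name.

```python
import re

_RETRACT = re.compile(r"^retract\b(.*)$")


def retracted_versions(go_mod_text):
    """Return retract entries (single versions or [low, high] ranges)
    declared in the given go.mod text, in file order."""
    entries = []
    in_block = False
    for raw in go_mod_text.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        if in_block:
            if line == ")":
                in_block = False
            else:
                entries.append(line)
            continue
        match = _RETRACT.match(line)
        if not match:
            continue
        rest = match.group(1).strip()
        if rest == "(":
            in_block = True  # start of a retract ( ... ) block
        elif rest:
            entries.append(rest)
    return entries
```

Applying this only to the main module's go.mod matches the rule above: the go.mod files of dependencies are not consulted, so no retraction status is inferred for them.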

Cache key correctness

Binary build info is cached using a composite key:

  • File path (normalized for OS case sensitivity)
  • File length
  • Last modification time
  • 4 KB header hash (FNV-1a)

The header hash ensures correct behavior in containerized/layered filesystem environments where files may have identical metadata but different content.
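
A minimal sketch of such a composite key, in Python for illustration. The 64-bit FNV-1a variant is an assumption (the hash width is not stated above), and `cache_key` with its tuple layout is a hypothetical shape, not the analyzer's actual key type.

```python
import os

FNV64_OFFSET = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3


def fnv1a_64(data):
    """64-bit FNV-1a: XOR each byte into the hash, then multiply by the prime."""
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF
    return h


def cache_key(path):
    """Build (normalized path, length, mtime, header hash) for a file."""
    st = os.stat(path)
    with open(path, "rb") as f:
        header = f.read(4096)  # only the first 4 KB is hashed
    normalized = os.path.normcase(os.path.abspath(path))
    return (normalized, st.st_size, st.st_mtime_ns, fnv1a_64(header))
```

Two files with the same length and modification time but different leading bytes produce different keys, which is the case the header hash is meant to catch.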

References

  • Sprint: docs/implplan/SPRINT_0402_0001_0001_scanner_go_analyzer_gaps.md
  • Cross-analyzer contract: docs/modules/scanner/language-analyzers-contract.md
  • Implementation: src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Go/GoLanguageAnalyzer.cs
  • Capability scanner: src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Go/Internal/GoCapabilityScanner.cs