# Go Analyzer (Scanner)

## What it does
- Inventories Go components from **binaries** (embedded buildinfo) and **source** (go.mod/go.sum/go.work/vendor) without executing `go`.
- Emits `pkg:golang/<module>@<version>` when a concrete version is available; otherwise emits deterministic explicit-key components (no "range-as-version" PURLs).
- Records VCS/build metadata and bounded evidence for audit/replay; remains offline-first.
- Detects security-relevant capabilities in Go source code (exec, filesystem, network, native code, etc.).

## Inputs and precedence
The analyzer processes inputs in the following order, with binary evidence taking precedence:

1. **Binary inventory (Phase 1, authoritative)**: Extract embedded build info (`runtime/debug` buildinfo blob) and emit Go modules (main + deps) with concrete versions and build settings evidence. Binary-derived components include `provenance=binary` metadata.
2. **Source inventory (Phase 2, supplementary)**: Parse `go.mod`, `go.sum`, `go.work`, and `vendor/modules.txt` to emit modules not already covered by binary evidence. Source-derived components include `provenance=source` metadata.
3. **Heuristic fallback (stripped binaries)**: When buildinfo is missing, emit deterministic `bin` components keyed by sha256 plus minimal classification evidence.

**Precedence rules:**
- Binary evidence is scanned first and takes precedence over source evidence.
- When both source and binary evidence exist for the same module path@version, only the binary-derived component is emitted.
- Main modules are tracked separately: if a binary emits `module@version`, source `module@(devel)` is suppressed.
- This ensures deterministic, non-duplicative output.

## Project discovery (modules + workspaces)
- Standalone modules are discovered by locating `go.mod` files (bounded recursion depth 10; vendor directories skipped).
- Workspaces are discovered via `go.work` at the analysis root; `use` members become additional module roots.
- Vendored dependencies are detected via `vendor/modules.txt` when present.

## Workspace replace directive propagation
`go.work` files may contain `replace` directives that apply to all workspace members:
- Workspace-level replaces are inherited by all member modules.
- Module-level replaces take precedence over workspace-level replaces for the same module path.
- Duplicate replace keys are handled deterministically (last-one-wins within each scope).

## Identity rules (PURL vs explicit key)
Concrete versions emit a PURL:
- `purl = pkg:golang/<module>@<version>`

Non-concrete identities emit an explicit key:
- Used for source-only main modules (`(devel)`) and for any non-versioned module identity.
- PURL is omitted (`purl=null`) and the component is keyed deterministically via `AddFromExplicitKey`.

## Evidence and metadata

### Binary-derived components
Binary components include (when present):
- `provenance=binary`
- `go.version`
- `modulePath.main` and `build.*` settings
- VCS fields (`build.vcs*` from build settings and/or `go.dwarf` tokens)
- `moduleSum` and replacement metadata when available
- CGO signals (`cgo.enabled`, flags, compiler hints; plus adjacent native libs when detected)

### Source-derived components
Source components include:
- `provenance=source`
- `moduleSum` from `go.sum` (when present)
- vendor signals (`vendored=true`) and `vendor` evidence locators
- replacement/exclude flags with stable metadata keys
- best-effort license signals for main module and vendored modules
- `capabilities` metadata listing detected capability kinds (exec, filesystem, network, etc.)
- `capabilities.maxRisk` indicating highest risk level (critical/high/medium/low)

### Heuristic fallback components
Fallback components include:
- `type=bin`, deterministic `sha256` identity, and a classification evidence marker
- Metric `scanner_analyzer_golang_heuristic_total{indicator,version_hint}` increments per heuristic emission

## Capability scanning
The analyzer detects security-relevant capabilities in Go source code:

| Capability | Risk | Examples |
|------------|------|----------|
| Exec | Critical | `exec.Command`, `syscall.Exec`, `os.StartProcess` |
| NativeCode | Critical | `unsafe.Pointer`, `//go:linkname`, `syscall.Syscall` |
| PluginLoading | Critical | `plugin.Open` |
| Filesystem | High/Medium | `os.Remove`, `os.Chmod`, `os.WriteFile` |
| Network | Medium | `net.Dial`, `http.Get`, `http.ListenAndServe` |
| Environment | High/Medium | `os.Setenv`, `os.Getenv` |
| Database | Medium | `sql.Open`, `db.Query` |
| DynamicCode | High | `reflect.Value.Call`, `template.Execute` |
| Serialization | Medium | `gob.NewDecoder`, `xml.Unmarshal` |
| Reflection | Low/Medium | `reflect.TypeOf`, `reflect.New` |
| Crypto | Low | Hash functions, cipher operations |

Capabilities are emitted as:
- Metadata: `capabilities=exec,filesystem,network` (comma-separated list of kinds)
- Metadata: `capabilities.maxRisk=critical|high|medium|low`
- Evidence: Top 10 capability locations with pattern and line number

## IO/Memory bounds
Binary and DWARF scanning uses bounded windowed reads to limit memory usage:
- **Build info scanning**: 16 MB windows with 4 KB overlap; max file size 128 MB.
- **DWARF token scanning**: 8 MB windows with 1 KB overlap; max file size 256 MB.
- Small files (below window size) are read directly for efficiency.

## Retract semantics
Go's `retract` directive only applies to versions of the declaring module itself, not to dependencies:
- The `RetractedVersions` field in inventory results contains only versions of the main module that are retracted.
- Dependency retraction cannot be determined offline (would require fetching each module's go.mod).
- No false-positive retraction warnings are emitted for dependencies.

## Cache key correctness
Binary build info is cached using a composite key:
- File path (normalized for OS case sensitivity)
- File length
- Last modification time
- 4 KB header hash (FNV-1a)

The header hash ensures correct behavior in containerized/layered filesystem environments where files may have identical metadata but different content.

## References
- Sprint: `docs/implplan/SPRINT_0402_0001_0001_scanner_go_analyzer_gaps.md`
- Cross-analyzer contract: `docs/modules/scanner/language-analyzers-contract.md`
- Implementation: `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Go/GoLanguageAnalyzer.cs`
- Capability scanner: `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Go/Internal/GoCapabilityScanner.cs`