Refactor code structure for improved readability and maintainability
@@ -1,6 +1,6 @@
|
||||
# component_architecture_scanner.md — **Stella Ops Scanner** (2025Q4)
|
||||
|
||||
> Aligned with Epic 6 – Vulnerability Explorer and Epic 10 – Export Center.
|
||||
|
||||
> **Scope.** Implementation‑ready architecture for the **Scanner** subsystem: WebService, Workers, analyzers, SBOM assembly (inventory & usage), per‑layer caching, three‑way diffs, artifact catalog (RustFS default + Mongo, S3-compatible fallback), attestation hand‑off, and scale/security posture. This document is the contract between the scanning plane and everything else (Policy, Excititor, Concelier, UI, CLI).
|
||||
|
||||
@@ -30,31 +30,31 @@ src/
|
||||
├─ StellaOps.Scanner.Cache/ # layer cache; file CAS; bloom/bitmap indexes
|
||||
├─ StellaOps.Scanner.EntryTrace/ # ENTRYPOINT/CMD → terminal program resolver (shell AST)
|
||||
├─ StellaOps.Scanner.Analyzers.OS.[Apk|Dpkg|Rpm]/
|
||||
├─ StellaOps.Scanner.Analyzers.Lang.[Java|Node|Python|Go|DotNet|Rust]/
|
||||
├─ StellaOps.Scanner.Analyzers.Lang.[Java|Node|Bun|Python|Go|DotNet|Rust|Ruby|Php]/
|
||||
├─ StellaOps.Scanner.Analyzers.Native.[ELF|PE|MachO]/ # PE/Mach-O planned (M2)
|
||||
├─ StellaOps.Scanner.Symbols.Native/ # NEW – native symbol reader/demangler (Sprint 401)
|
||||
├─ StellaOps.Scanner.CallGraph.Native/ # NEW – function/call-edge builder + CAS emitter
|
||||
├─ StellaOps.Scanner.Emit.CDX/ # CycloneDX (JSON + Protobuf)
|
||||
├─ StellaOps.Scanner.Emit.SPDX/ # SPDX 3.0.1 JSON
|
||||
├─ StellaOps.Scanner.Diff/ # image→layer→component three‑way diff
|
||||
├─ StellaOps.Scanner.Index/ # BOM‑Index sidecar (purls + roaring bitmaps)
|
||||
├─ StellaOps.Scanner.Tests.* # unit/integration/e2e fixtures
|
||||
└─ Tools/
|
||||
├─ StellaOps.Scanner.Sbomer.BuildXPlugin/ # BuildKit generator (image referrer SBOMs)
|
||||
└─ StellaOps.Scanner.Sbomer.DockerImage/ # CLI‑driven scanner container
|
||||
```
|
||||
|
||||
Analyzer assemblies and buildx generators are packaged as **restart-time plug-ins** under `plugins/scanner/**` with manifests; services must restart to activate new plug-ins.
|
||||
|
||||
### 1.2 Native reachability upgrades (Nov 2026)
|
||||
|
||||
- **Stripped-binary pipeline**: native analyzers must recover functions even without symbols (prolog patterns, xrefs, PLT/GOT, vtables). Emit a tool-agnostic neutral JSON (NJIF) with functions, CFG/CG, and evidence tags. Keep heuristics deterministic and record toolchain hashes in the scan manifest.
|
||||
- **Synthetic roots**: treat `.preinit_array`, `.init_array`, legacy `.ctors`, and `_init` as graph entrypoints; add roots for constructors in each `DT_NEEDED` dependency. Tag edges from these roots with `phase=load` for explainers.
|
||||
- **Build-id capture**: read `.note.gnu.build-id` for every ELF, store hex build-id alongside soname/path, propagate into `SymbolID`/`code_id`, and expose it to SBOM + runtime joiners. If missing, fall back to file hash and mark source accordingly.
|
||||
- **PURL-resolved edges**: annotate call edges with the callee purl and `symbol_digest` so graphs merge with SBOM components. See `docs/reachability/purl-resolved-edges.md` for schema rules and acceptance tests.
|
||||
- **Unknowns emission**: when symbol → purl mapping or edge targets remain unresolved, emit structured Unknowns to Signals (see `docs/signals/unknowns-registry.md`) instead of dropping evidence.
|
||||
- **Hybrid attestation**: emit **graph-level DSSE** for every `richgraph-v1` (mandatory) and optional **edge-bundle DSSE** (≤512 edges) for runtime/init-root/contested edges or third-party provenance. Publish graph DSSE digests to Rekor by default; edge-bundle Rekor publish is policy-driven. CAS layout: `cas://reachability/graphs/{blake3}` for graph body, `.../{blake3}.dsse` for envelope, and `cas://reachability/edges/{graph_hash}/{bundle_id}[.dsse]` for bundles. Deterministic ordering before hashing/signing is required.
|
||||
- **Deterministic call-graph manifest**: capture analyzer versions, feed hashes, toolchain digests, and flags in a manifest stored alongside `richgraph-v1`; replaying with the same manifest MUST yield identical node/edge sets and hashes (see `docs/reachability/lead.md`).
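
The bundling and ordering rules in the hybrid-attestation bullet can be illustrated with a minimal Python sketch. It is not the Scanner implementation: the edge fields (`from`, `to`, `kind`), the zero-padded `bundle_id` scheme, and the use of SHA-256 in place of BLAKE3 (which is not in the Python standard library) are assumptions made for illustration. Only the 512-edge cap, the sort-before-hash requirement, and the `cas://reachability/edges/{graph_hash}/{bundle_id}` layout come from the text above.

```python
import hashlib
import json

MAX_EDGES_PER_BUNDLE = 512  # cap stated in the hybrid-attestation bullet


def canonical_bytes(obj):
    """Serialize with sorted keys and fixed separators so equal inputs hash equally."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_edge_bundles(graph_hash, edges):
    """Sort edges deterministically, split them into bundles, and hash each bundle.

    `edges` is a list of dicts with hypothetical keys "from", "to", and "kind".
    """
    ordered = sorted(edges, key=lambda e: (e["from"], e["to"], e["kind"]))
    bundles = []
    for start in range(0, len(ordered), MAX_EDGES_PER_BUNDLE):
        chunk = ordered[start:start + MAX_EDGES_PER_BUNDLE]
        bundle_id = f"{start // MAX_EDGES_PER_BUNDLE:04d}"  # assumed naming scheme
        bundles.append({
            "uri": f"cas://reachability/edges/{graph_hash}/{bundle_id}",
            "edge_count": len(chunk),
            # SHA-256 stands in for BLAKE3 in this sketch
            "digest": hashlib.sha256(canonical_bytes(chunk)).hexdigest(),
        })
    return bundles
```

Because the edges are sorted before chunking and serialized canonically, feeding the same edge set in any input order produces the same bundle digests, which is the property the determinism requirement asks for.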
|
||||
|
||||
### 1.1 Queue backbone (Redis / NATS)
|
||||
|
||||
@@ -144,9 +144,10 @@ No confidences. Either a fact is proven with listed mechanisms, or it is not cla
|
||||
* `images { imageDigest, repo, tag?, arch, createdAt, lastSeen }`
|
||||
* `layers { layerDigest, mediaType, size, createdAt, lastSeen }`
|
||||
* `links { fromType, fromDigest, artifactId }` // image/layer -> artifact
|
||||
* `jobs { _id, kind, args, state, startedAt, heartbeatAt, endedAt, error }`
|
||||
* `lifecycleRules { ruleId, scope, ttlDays, retainIfReferenced, immutable }`
|
||||
* `ruby.packages { _id: scanId, imageDigest, generatedAtUtc, packages[] }` // decoded `RubyPackageInventory` documents for CLI/Policy reuse
|
||||
* `bun.packages { _id: scanId, imageDigest, generatedAtUtc, packages[] }` // decoded `BunPackageInventory` documents for CLI/Policy reuse
|
||||
|
||||
### 3.3 Object store layout (RustFS)
|
||||
|
||||
@@ -175,10 +176,11 @@ All under `/api/v1/scanner`. Auth: **OpTok** (DPoP/mTLS); RBAC scopes.
|
||||
|
||||
```
|
||||
POST /scans { imageRef|digest, force?:bool } → { scanId }
|
||||
GET /scans/{id} → { status, imageDigest, artifacts[], rekor? }
|
||||
GET /sboms/{imageDigest} ?format=cdx-json|cdx-pb|spdx-json&view=inventory|usage → bytes
|
||||
GET /scans/{id}/ruby-packages → { scanId, imageDigest, generatedAt, packages[] }
|
||||
GET /scans/{id}/bun-packages → { scanId, imageDigest, generatedAt, packages[] }
|
||||
GET /diff?old=<digest>&new=<digest>&view=inventory|usage → diff.json
|
||||
POST /exports { imageDigest, format, view, attest?:bool } → { artifactId, rekor? }
|
||||
POST /reports { imageDigest, policyRevision? } → { reportId, rekor? } # delegates to backend policy+vex
|
||||
GET /catalog/artifacts/{id} → { meta }
|
||||
@@ -223,6 +225,7 @@ When `scanner.events.enabled = true`, the WebService serialises the signed repor
|
||||
|
||||
* **Java**: `META-INF/maven/*/pom.properties`, MANIFEST → `pkg:maven/...`
|
||||
* **Node**: `node_modules/**/package.json` → `pkg:npm/...`
|
||||
* **Bun**: `bun.lock` (JSONC text) + `node_modules/**/package.json` + `node_modules/.bun/**/package.json` (isolated linker) → `pkg:npm/...`; `bun.lockb` (binary) emits remediation guidance
|
||||
* **Python**: `*.dist-info/{METADATA,RECORD}` → `pkg:pypi/...`
|
||||
* **Go**: Go **buildinfo** in binaries → `pkg:golang/...`
|
||||
* **.NET**: `*.deps.json` + assembly metadata → `pkg:nuget/...`
|
||||
@@ -230,18 +233,18 @@ When `scanner.events.enabled = true`, the WebService serialises the signed repor
|
||||
|
||||
> **Rule:** We only report components proven **on disk** with authoritative metadata. Lockfiles are evidence only.
|
||||
|
||||
**C) Native link graph**
|
||||
|
||||
* **ELF**: parse `PT_INTERP`, `DT_NEEDED`, RPATH/RUNPATH, **GNU symbol versions**; map **SONAMEs** to file paths; link executables → libs.
|
||||
* **PE/Mach‑O** (planned M2): import table, delay‑imports; version resources; code signatures.
|
||||
* Map libs back to **OS packages** if possible (via file lists); else emit `bin:{sha256}` components.
|
||||
* The exported metadata (`stellaops.os.*` properties, license list, source package) feeds policy scoring and export pipelines directly – Policy evaluates quiet rules against package provenance while Exporters forward the enriched fields into downstream JSON/Trivy payloads.
|
||||
* **Reachability lattice**: analyzers + runtime probes emit `Evidence`/`Mitigation` records (see `docs/reachability/lattice.md`). The lattice engine joins static path evidence, runtime hits (EventPipe/JFR), taint flows, environment gates, and mitigations into `ReachDecision` documents that feed VEX gating and event graph storage.
|
||||
* Sprint 401 introduces `StellaOps.Scanner.Symbols.Native` (DWARF/PDB reader + demangler) and `StellaOps.Scanner.CallGraph.Native` (function boundary detector + call-edge builder). These libraries feed `FuncNode`/`CallEdge` CAS bundles and enrich reachability graphs with `{code_id, confidence, evidence}` so Signals/Policy/UI can cite function-level justifications.
|
||||
|
||||
**D) EntryTrace (ENTRYPOINT/CMD → terminal program)**
|
||||
|
||||
@@ -273,10 +276,10 @@ The emitted `buildId` metadata is preserved in component hashes, diff payloads,
|
||||
|
||||
### 5.6 DSSE attestation (via Signer/Attestor)
|
||||
|
||||
* WebService constructs **predicate** with `image_digest`, `stellaops_version`, `license_id`, `policy_digest?` (when emitting **final reports**), timestamps.
|
||||
* Calls **Signer** (requires **OpTok + PoE**); Signer verifies **entitlement + scanner image integrity** and returns **DSSE bundle**.
|
||||
* **Attestor** logs to **Rekor v2**; returns `{uuid,index,proof}` → stored in `artifacts.rekor`.
|
||||
* Operator enablement runbooks (toggles, env-var map, rollout guidance) live in [`operations/dsse-rekor-operator-guide.md`](operations/dsse-rekor-operator-guide.md) per SCANNER-ENG-0015.
|
||||
|
||||
---
|
||||
|
||||
@@ -333,7 +336,7 @@ scanner:
|
||||
objectLock: "governance" # or 'compliance'
|
||||
analyzers:
|
||||
os: { apk: true, dpkg: true, rpm: true }
|
||||
lang: { java: true, node: true, python: true, go: true, dotnet: true, rust: true }
|
||||
lang: { java: true, node: true, bun: true, python: true, go: true, dotnet: true, rust: true, ruby: true, php: true }
|
||||
native: { elf: true, pe: false, macho: false } # PE/Mach-O in M2
|
||||
entryTrace: { enabled: true, shellMaxDepth: 64, followRunParts: true }
|
||||
emit:
|
||||
@@ -478,17 +481,17 @@ ResolveEntrypoint(ImageConfig cfg, RootFs fs):
|
||||
return Unknown(reason)
|
||||
```
|
||||
|
||||
### Appendix A.1 — EntryTrace Explainability
|
||||
|
||||
### Appendix A.0 — Replay / Record mode
|
||||
|
||||
- WebService ships a **RecordModeService** that assembles replay manifests (schema v1) with policy/feed/tool pins and reachability references, then writes deterministic input/output bundles to the configured object store (RustFS default, S3/Minio fallback) under `replay/<head>/<digest>.tar.zst`.
|
||||
- Bundles contain canonical manifest JSON plus inputs (policy/feed/tool/analyzer digests) and outputs (SBOM, findings, optional VEX/logs); CAS URIs follow `cas://replay/...` and are attached to scan snapshots as `ReplayArtifacts`.
|
||||
- Reachability graphs/traces are folded into the manifest via `ReachabilityReplayWriter`; manifests and bundles hash with stable ordering for replay verification (`docs/replay/DETERMINISTIC_REPLAY.md`).
|
||||
- Worker sealed-mode intake reads `replay.bundle.uri` + `replay.bundle.sha256` (plus determinism feed/policy pins) from job metadata, persists bundle refs in analysis and surface manifest, and validates hashes before use.
|
||||
- Deterministic execution switches (`docs/modules/scanner/deterministic-execution.md`) must be enabled when generating replay bundles to keep hashes stable.
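
The sealed-mode hash check described above can be sketched as follows. This is a hedged illustration rather than the Worker's code: it assumes the job metadata is available as a flat dictionary keyed by `replay.bundle.uri` and `replay.bundle.sha256`, and that the bundle has already been fetched to a local path. The function names are hypothetical.

```python
import hashlib
import hmac


def sha256_file(path, chunk_size=1024 * 1024):
    """Stream a file through SHA-256 so large bundles are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def validate_replay_bundle(job_metadata, local_path):
    """Return the bundle digest if it matches the pinned value, else raise.

    `job_metadata` is assumed to be a dict holding the pinned hex digest
    under the "replay.bundle.sha256" key.
    """
    expected = job_metadata["replay.bundle.sha256"].strip().lower()
    actual = sha256_file(local_path)
    if not hmac.compare_digest(actual, expected):
        uri = job_metadata.get("replay.bundle.uri", "<unknown>")
        raise ValueError(f"replay bundle hash mismatch for {uri}")
    return actual
```

Failing closed on a mismatch keeps a tampered or truncated bundle from being used as replay input.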
|
||||
|
||||
EntryTrace emits structured diagnostics and metrics so operators can quickly understand why resolution succeeded or degraded:
|
||||
|
||||
| Reason | Description | Typical Mitigation |
|--------|-------------|--------------------|
|
||||
|
||||
146
docs/modules/scanner/bun-analyzer-gotchas.md
Normal file
@@ -0,0 +1,146 @@
|
||||
# Bun Analyzer Developer Gotchas
|
||||
|
||||
This document covers common pitfalls and considerations when working with the Bun analyzer.
|
||||
|
||||
## 1. Isolated Installs Are Symlink-Heavy
|
||||
|
||||
Bun's isolated linker (`bun install --linker isolated`) creates a flat store under `node_modules/.bun/` with symlinks for package resolution. This differs from the default hoisted layout.
|
||||
|
||||
**Implications:**
|
||||
- The analyzer must traverse `node_modules/.bun/**/package.json` in addition to `node_modules/**/package.json`
|
||||
- Symlink safety guards are critical to prevent infinite loops and out-of-root traversal
|
||||
- Both logical and real paths are recorded in evidence for traceability
|
||||
- Performance guards (`MaxSymlinkDepth=10`, `MaxFilesPerRoot=50000`) are enforced
|
||||
|
||||
**Testing:**
|
||||
- Use the `IsolatedLinkerInstallIsParsedAsync` test fixture to verify `.bun/` traversal
|
||||
- Use the `SymlinkSafetyIsEnforcedAsync` test fixture for symlink corner cases
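
The guards listed under Implications can be sketched in Python as below. This is an illustrative traversal, not the analyzer's implementation: the function names are hypothetical, and the way the limits are applied (counting symlink hops per path, counting every file seen) is an assumption. The two limit values mirror `MaxSymlinkDepth=10` and `MaxFilesPerRoot=50000` from the text.

```python
import os

MAX_SYMLINK_DEPTH = 10       # mirrors MaxSymlinkDepth
MAX_FILES_PER_ROOT = 50_000  # mirrors MaxFilesPerRoot


def _is_within(path, root):
    """True if the already-resolved `path` is `root` or lies below it."""
    return path == root or path.startswith(root + os.sep)


def find_package_manifests(root, max_symlink_depth=MAX_SYMLINK_DEPTH,
                           max_files=MAX_FILES_PER_ROOT):
    """Return sorted (logical_path, real_path) pairs for package.json files."""
    root_real = os.path.realpath(root)
    visited = set()  # real directories already walked; breaks symlink cycles
    found = []
    files_seen = 0
    stack = [(os.path.join(root_real, "node_modules"), 0)]
    while stack:
        logical_dir, depth = stack.pop()
        real_dir = os.path.realpath(logical_dir)
        if not _is_within(real_dir, root_real):
            continue  # symlink escapes the scan root
        if real_dir in visited or not os.path.isdir(real_dir):
            continue
        visited.add(real_dir)
        with os.scandir(logical_dir) as it:
            entries = sorted(it, key=lambda e: e.name)  # deterministic order
        for entry in entries:
            if entry.is_dir(follow_symlinks=True):
                next_depth = depth + 1 if entry.is_symlink() else depth
                if next_depth <= max_symlink_depth:
                    stack.append((entry.path, next_depth))
            elif entry.is_file(follow_symlinks=True):
                files_seen += 1
                if files_seen > max_files:
                    return sorted(found)  # stop once the file budget is spent
                if entry.name == "package.json":
                    real_file = os.path.realpath(entry.path)
                    if _is_within(real_file, root_real):
                        found.append((entry.path, real_file))
    return sorted(found)
```

Tracking visited directories by their resolved path means a package reachable both through a top-level symlink and through `node_modules/.bun/` is walked once, while both its logical and real paths remain available for evidence.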
|
||||
|
||||
## 2. `node_modules/.bun/` Scanning Requirement
|
||||
|
||||
Unlike npm's default layout, Bun may store packages entirely under `node_modules/.bun/`, leaving only symlinks in the top-level `node_modules/`. If your scanner configuration excludes `.bun/` directories, you will miss dependencies.
|
||||
|
||||
**Checklist:**
|
||||
- Ensure glob patterns include `.bun/` subdirectories
|
||||
- Do not filter out hidden directories in container scans
|
||||
- Verify evidence shows packages from both `node_modules/` and `node_modules/.bun/`
|
||||
|
||||
## 3. `bun.lockb` Migration Path
|
||||
|
||||
The binary lockfile (`bun.lockb`) format is undocumented and unstable. The analyzer treats it as **unsupported** and emits a remediation finding.
|
||||
|
||||
**Migration command:**
|
||||
```bash
bun install --save-text-lockfile
```
|
||||
|
||||
This generates `bun.lock` (JSONC text format) which the analyzer can parse.
|
||||
|
||||
**WebService response:** When only `bun.lockb` is present:
|
||||
- The scan completes but reports unsupported status
|
||||
- Remediation guidance is included in findings
|
||||
- No package inventory is generated
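
The lockfile decision described in this section can be summarized with a small Python sketch. The returned field names (`status`, `lockfile`, `remediation`) and the `missing` status are hypothetical and do not reflect the actual WebService response schema. Only the treatment of `bun.lock` versus `bun.lockb` and the remediation command come from the text.

```python
from pathlib import Path

REMEDIATION = "Run `bun install --save-text-lockfile` to generate bun.lock"


def classify_lockfile(root):
    """Report which Bun lockfile is present and whether it can be parsed."""
    root = Path(root)
    if (root / "bun.lock").is_file():
        # The text lockfile wins even if a stale bun.lockb is also present.
        return {"status": "supported", "lockfile": "bun.lock", "remediation": None}
    if (root / "bun.lockb").is_file():
        return {"status": "unsupported", "lockfile": "bun.lockb",
                "remediation": REMEDIATION}
    return {"status": "missing", "lockfile": None, "remediation": None}
```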
|
||||
|
||||
## 4. JSONC Lockfile Format
|
||||
|
||||
`bun.lock` uses JSONC (JSON with Comments) format supporting:
|
||||
- Single-line comments (`// ...`)
|
||||
- Multi-line comments (`/* ... */`)
|
||||
- Trailing commas in arrays and objects
|
||||
|
||||
**Parser considerations:**
|
||||
- The `BunLockParser` tolerates these JSONC features
|
||||
- Strict JSON parsers reject `bun.lock` files that contain comments or trailing commas
|
||||
- Format may evolve with Bun releases; parser is intentionally tolerant
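
To make the three JSONC features concrete, here is a minimal Python sketch that converts JSONC text to plain JSON before parsing. It is not `BunLockParser`; the function names are hypothetical, and the approach (two string-aware passes, then `json.loads`) is one simple option among several.

```python
import json


def _strip_comments(text):
    """Remove // and /* */ comments that appear outside string literals."""
    out, i, n, in_string = [], 0, len(text), False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                out.append(text[i + 1])  # keep the escaped character as-is
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            while i < n and text[i] != "\n":
                i += 1
            continue  # the newline itself is kept on the next iteration
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated block comment")
            out.append(" ")  # keep tokens on either side separated
            i = end + 2
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def _strip_trailing_commas(text):
    """Drop commas that are followed only by whitespace and then } or ]."""
    out, i, n, in_string = [], 0, len(text), False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                out.append(text[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif ch == ",":
            j = i + 1
            while j < n and text[j].isspace():
                j += 1
            if not (j < n and text[j] in "}]"):
                out.append(ch)
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def parse_jsonc(text):
    """Parse JSONC by stripping comments and trailing commas first."""
    return json.loads(_strip_trailing_commas(_strip_comments(text)))
```

Both passes track whether the cursor is inside a string, so sequences such as `//` inside an integrity value or URL are left untouched.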
|
||||
|
||||
## 5. Multi-Stage Build Implications
|
||||
|
||||
In multi-stage Docker builds, the final image may contain only production artifacts without the lockfile or `node_modules/.bun/` directory.
|
||||
|
||||
**Scanning strategies:**
|
||||
1. **Image scanning (recommended for production):** Scans the final image filesystem. Set `include_dev: false` to filter dev dependencies
|
||||
2. **Repository scanning:** Scans `bun.lock` from source. Includes all dependencies by default (`include_dev: true`)
|
||||
|
||||
**Best practice:** Scan both the repository (for complete visibility) and production images (for runtime accuracy).
|
||||
|
||||
## 6. npm Ecosystem Reuse
|
||||
|
||||
Bun packages are npm packages. The analyzer:
|
||||
- Emits `pkg:npm/<name>@<version>` PURLs (same as Node analyzer)
|
||||
- Uses `ecosystem = npm` for vulnerability lookups
|
||||
- Adds `package_manager = bun` metadata for differentiation
|
||||
|
||||
This means:
|
||||
- Vulnerability intelligence is shared with Node analyzer
|
||||
- VEX statements for npm packages apply to Bun
|
||||
- No separate Bun-specific advisory database is needed
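
As an illustration of the PURL and metadata conventions above, the following Python sketch builds a component record for a Bun-installed package. The record shape and function names are assumptions; the `pkg:npm/<name>@<version>` form, `ecosystem = npm`, and `package_manager = bun` come from the text. Percent-encoding the `@` of a scoped package name follows the Package URL convention for the npm type.

```python
from urllib.parse import quote


def npm_purl(name, version):
    """Build a pkg:npm PURL, encoding the scope of names like @scope/pkg."""
    if name.startswith("@"):
        scope, _, pkg = name.partition("/")
        if not pkg:
            raise ValueError(f"scoped package name is missing a '/': {name!r}")
        path = f"{quote(scope, safe='')}/{quote(pkg, safe='')}"
    else:
        path = quote(name, safe="")
    return f"pkg:npm/{path}@{quote(version, safe='')}"


def bun_component(name, version):
    """Hypothetical component record carrying the metadata described above."""
    return {
        "purl": npm_purl(name, version),
        "ecosystem": "npm",
        "package_manager": "bun",
    }
```

Since the PURL is identical to what a Node scan would emit for the same package, advisories and VEX statements keyed on `pkg:npm/...` match without any Bun-specific mapping.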
|
||||
|
||||
## 7. Source Detection in Lockfile
|
||||
|
||||
`bun.lock` entries include source information that determines package type:
|
||||
|
||||
| Source Pattern | Type | Example |
|---------------|------|---------|
| No source / default registry | `registry` | `lodash@4.17.21` |
| `git+https://...` or `git://...` | `git` | VCS dependency |
| `file:` or `link:` | `tarball` | Local package |
| `workspace:` | `workspace` | Monorepo member |
|
||||
|
||||
The analyzer records source type in evidence for provenance tracking.
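
The table maps directly onto a prefix check. The Python sketch below shows one way to express it; the function name is hypothetical, and treating any unrecognized value (for example an `https://` registry URL) as `registry` is an assumption based on the "default registry" row.

```python
def classify_source(resolved):
    """Map a lockfile source string to the package type from the table above."""
    if not resolved:
        return "registry"  # no source recorded
    if resolved.startswith(("git+https://", "git://")):
        return "git"
    if resolved.startswith(("file:", "link:")):
        return "tarball"
    if resolved.startswith("workspace:"):
        return "workspace"
    return "registry"  # assumed fallback for default-registry URLs
```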
|
||||
|
||||
## 8. Workspace/Monorepo Handling
|
||||
|
||||
Bun workspaces use a single `bun.lock` at the root with multiple `package.json` files in subdirectories.
|
||||
|
||||
**Analyzer behavior:**
|
||||
- Discovers the root by presence of `bun.lock` + `package.json`
|
||||
- Traverses all `node_modules/` directories under the root
|
||||
- Deduplicates packages by `(name, version)` while accumulating occurrence paths
|
||||
- Records workspace member paths in metadata
|
||||
|
||||
**Testing:** Use the `WorkspacesAreParsedAsync` test fixture.
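
The deduplication rule can be sketched as follows, assuming each discovered package is represented as a `(name, version, path)` tuple. The tuple shape, output keys, and function name are illustrative and not taken from the analyzer.

```python
from collections import defaultdict


def deduplicate(occurrences):
    """Merge occurrences by (name, version), keeping every path where each was seen."""
    merged = defaultdict(set)
    for name, version, path in occurrences:
        merged[(name, version)].add(path)
    # Sort keys and paths so the output is stable across runs.
    return [
        {"name": name, "version": version, "paths": sorted(paths)}
        for (name, version), paths in sorted(merged.items())
    ]
```

Two workspace members that install the same `lodash@4.17.21` therefore yield one component with two occurrence paths, while a different version of the same package stays a separate component.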
|
||||
|
||||
## 9. Dev/Prod Dependency Filtering
|
||||
|
||||
The `include_dev` configuration option controls whether dev dependencies are included:
|
||||
|
||||
| Context | Default `include_dev` | Rationale |
|---------|----------------------|-----------|
| Repository scan (lockfile-only) | `true` | Full visibility for developers |
| Image scan (installed packages) | `true` | Packages are present regardless of intent |
|
||||
|
||||
**Override:** Set `include_dev: false` in scan configuration to exclude dev dependencies from results.
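
A minimal Python sketch of the filter, assuming each package record carries a boolean `dev` flag (a hypothetical field name; the analyzer's actual representation may differ):

```python
def filter_packages(packages, include_dev=True):
    """Return all packages, or only non-dev packages when include_dev is False."""
    if include_dev:
        return list(packages)
    # A record without a "dev" flag is treated as a production dependency.
    return [pkg for pkg in packages if not pkg.get("dev", False)]
```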
|
||||
|
||||
## 10. Evidence Model
|
||||
|
||||
Each Bun package includes evidence with:
|
||||
- `source`: Where the package was found (`node_modules`, `bun.lock`, `node_modules/.bun`)
|
||||
- `locator`: File path to the evidence
|
||||
- `resolved`: The resolved URL from lockfile (if available)
|
||||
- `integrity`: SHA hash from lockfile (if available)
|
||||
- `sha256`: File hash for installed packages
|
||||
|
||||
Evidence enables:
|
||||
- Tracing packages to their origin
|
||||
- Validating integrity
|
||||
- Explaining presence in SBOM
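
As one example of integrity validation, the sketch below checks package bytes against an `integrity` value. It assumes the lockfile value uses the Subresource Integrity form common in the npm ecosystem (`<algorithm>-<base64 digest>`, for example `sha512-...`); the function name and the accepted algorithm list are assumptions.

```python
import base64
import hashlib
import hmac

SUPPORTED_ALGORITHMS = ("sha256", "sha384", "sha512")


def matches_integrity(data, integrity):
    """Return True if `data` hashes to the digest encoded in `integrity`."""
    algorithm, _, expected = integrity.partition("-")
    if algorithm not in SUPPORTED_ALGORITHMS or not expected:
        raise ValueError(f"unsupported integrity value: {integrity!r}")
    digest = hashlib.new(algorithm, data).digest()
    actual = base64.b64encode(digest).decode("ascii")
    return hmac.compare_digest(actual, expected)
```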
|
||||
|
||||
## CLI Reference
|
||||
|
||||
### Inspect local workspace
|
||||
```bash
stellaops-cli bun inspect --root /path/to/project
```
|
||||
|
||||
### Resolve packages from scan
|
||||
```bash
stellaops-cli bun resolve --scan-id <id>
stellaops-cli bun resolve --digest sha256:<hash>
stellaops-cli bun resolve --ref myregistry.io/myapp:latest
```
|
||||
|
||||
### Output formats
|
||||
```bash
stellaops-cli bun inspect --format json > packages.json
stellaops-cli bun inspect --format table
```
|
||||