Files
git.stella-ops.org/docs/modules/scanner/os-analyzers-evidence.md
StellaOps Bot 564df71bfb
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
AOC Guard CI / aoc-guard (push) Has been cancelled
AOC Guard CI / aoc-verify (push) Has been cancelled
Concelier Attestation Tests / attestation-tests (push) Has been cancelled
Export Center CI / export-ci (push) Has been cancelled
Notify Smoke Test / Notify Unit Tests (push) Has been cancelled
Notify Smoke Test / Notifier Service Tests (push) Has been cancelled
Notify Smoke Test / Notification Smoke Test (push) Has been cancelled
Policy Lint & Smoke / policy-lint (push) Has been cancelled
Scanner Analyzers / Discover Analyzers (push) Has been cancelled
Scanner Analyzers / Build Analyzers (push) Has been cancelled
Scanner Analyzers / Test Language Analyzers (push) Has been cancelled
Scanner Analyzers / Validate Test Fixtures (push) Has been cancelled
Scanner Analyzers / Verify Deterministic Output (push) Has been cancelled
Signals CI & Image / signals-ci (push) Has been cancelled
Signals Reachability Scoring & Events / reachability-smoke (push) Has been cancelled
Signals Reachability Scoring & Events / sign-and-upload (push) Has been cancelled
up
2025-12-13 00:20:26 +02:00

75 lines
3.0 KiB
Markdown

# OS Analyzer Evidence Semantics (Non-Language Scanners)
This document defines the **evidence contract** produced by OS/non-language analyzers (apk/dpkg/rpm + Windows/macOS OS analyzers) so downstream SBOM/attestation logic can rely on stable, deterministic semantics.
## Evidence Paths
- `OSPackageFileEvidence.Path` is **rootfs-relative** and **normalized**:
- No leading slash (`/`).
- Forward slashes only (`/`), even on Windows inputs.
- Never a host path.
- Any analyzer-specific absolute path must be converted to rootfs-relative before emission.
- Helper: `StellaOps.Scanner.Analyzers.OS.Helpers.OsPath.TryGetRootfsRelative(...)`.
Examples:
- Good: `usr/bin/bash`
- Bad: `/usr/bin/bash`
- Bad: `C:\scans\rootfs\usr\bin\bash`
## Layer Attribution
- `OSPackageFileEvidence.LayerDigest` is **best-effort** attribution derived from scan metadata:
- `ScanMetadataKeys.LayerDirectories` (optional mapping of layer digest → extracted directory)
- `ScanMetadataKeys.CurrentLayerDigest` (fallback/default)
- Helper: `StellaOps.Scanner.Analyzers.OS.Helpers.OsFileEvidenceFactory`.
## Digest & Hashing Strategy
Default posture is **avoid unbounded hashing**:
- Prefer package-manager-provided digests when present (`OSPackageFileEvidence.Digests` / `OSPackageFileEvidence.Sha256`).
- Compute `sha256` only when:
- No digests are present, and
- File exists, and
- File size is ≤ 16 MiB (`OsFileEvidenceFactory` safeguard).
- Primary digest selection for file evidence metadata prefers strongest available:
- `sha512``sha384``sha256``sha1``md5`
## Analyzer Warnings
OS analyzers may emit `AnalyzerWarning` entries (`Code`, `Message`) for partial/edge conditions (missing db, parse errors, unexpected layout).
Normalization rules (in `OsPackageAnalyzerBase`):
- Deduplicate by `(Code, Message)`.
- Stable sort by `Code` then `Message` (ordinal).
- Cap at 50 warnings.
## OS Analyzer Caching (Surface Cache)
Linux OS analyzers (apk/dpkg/rpm) support **safe, deterministic reuse** via `ISurfaceCache`:
- Cache key: `(tenant, analyzerId, rootfsFingerprint)` under namespace `scanner/os/analyzers`.
- Fingerprint inputs are intentionally narrow: a single **analyzer-specific** “DB fingerprint file”:
- `apk`: `lib/apk/db/installed`
- `dpkg`: `var/lib/dpkg/status`
- `rpm`: `var/lib/rpm/rpmdb.sqlite` (preferred) or legacy `Packages` fallback
- Fingerprint payload includes:
- Root path + analyzerId
- Relative fingerprint file path
- File length + `LastWriteTimeUtc` (ms)
- Optional file-content sha256 when the file is ≤ 8 MiB
Worker wiring:
- `StellaOps.Scanner.Worker.Processing.CompositeScanAnalyzerDispatcher` records cache hit/miss counters per analyzer.
## RPM sqlite Reader Notes
When `rpmdb.sqlite` is present, the reader avoids `SELECT *` and column scanning:
- Uses `PRAGMA table_info(Packages)` to select a likely RPM header blob column (prefers `hdr`/`header`, excludes `pkgId` when possible).
- Queries only `pkgKey` + header blob column for parsing.