# OS Analyzer Evidence Semantics (Non-Language Scanners)

This document defines the **evidence contract** produced by OS/non-language analyzers (apk/dpkg/rpm + Windows/macOS OS analyzers) so downstream SBOM/attestation logic can rely on stable, deterministic semantics.

## Evidence Paths

- `OSPackageFileEvidence.Path` is **rootfs-relative** and **normalized**:
  - No leading slash (`/`).
  - Forward slashes only (`/`), even on Windows inputs.
  - Never a host path.
- Any analyzer-specific absolute path must be converted to rootfs-relative before emission.
  - Helper: `StellaOps.Scanner.Analyzers.OS.Helpers.OsPath.TryGetRootfsRelative(...)`.

Examples:

- Good: `usr/bin/bash`
- Bad: `/usr/bin/bash`
- Bad: `C:\scans\rootfs\usr\bin\bash`

## Layer Attribution

- `OSPackageFileEvidence.LayerDigest` is **best-effort** attribution derived from scan metadata:
  - `ScanMetadataKeys.LayerDirectories` (optional mapping of layer digest → extracted directory)
  - `ScanMetadataKeys.CurrentLayerDigest` (fallback/default)
- Helper: `StellaOps.Scanner.Analyzers.OS.Helpers.OsFileEvidenceFactory`.

## Digest & Hashing Strategy

Default posture is **avoid unbounded hashing**:

- Prefer package-manager-provided digests when present (`OSPackageFileEvidence.Digests` / `OSPackageFileEvidence.Sha256`).
- Compute `sha256` only when:
  - No digests are present, and
  - File exists, and
  - File size is ≤ 16 MiB (`OsFileEvidenceFactory` safeguard).
- Primary digest selection for file evidence metadata prefers strongest available:
  - `sha512` → `sha384` → `sha256` → `sha1` → `md5`

## Analyzer Warnings

OS analyzers may emit `AnalyzerWarning` entries (`Code`, `Message`) for partial/edge conditions (missing db, parse errors, unexpected layout).

Normalization rules (in `OsPackageAnalyzerBase`):

- Deduplicate by `(Code, Message)`.
- Stable sort by `Code` then `Message` (ordinal).
- Cap at 50 warnings.

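
The warning normalization rules can be illustrated with a short, language-neutral sketch. This is not the `OsPackageAnalyzerBase` code: the Python `AnalyzerWarning` tuple, `normalize_warnings`, and `MAX_WARNINGS` are hypothetical names, the dedupe → sort → cap order is an assumption, and "ordinal" is assumed to mean comparison by UTF-16 code units.

```python
from typing import Iterable, List, NamedTuple

MAX_WARNINGS = 50  # cap stated in the normalization rules


class AnalyzerWarning(NamedTuple):
    """Hypothetical Python stand-in for the (Code, Message) pair."""
    code: str
    message: str


def _ordinal_key(text: str) -> bytes:
    # Assumption: "ordinal" means comparing UTF-16 code units.
    # Big-endian UTF-16 bytes sort in exactly that order.
    return text.encode("utf-16-be", "surrogatepass")


def normalize_warnings(warnings: Iterable[AnalyzerWarning]) -> List[AnalyzerWarning]:
    # 1. Deduplicate on the full (code, message) pair.
    unique = set(warnings)
    # 2. Sort by code, then message, using the ordinal key.
    ordered = sorted(
        unique,
        key=lambda w: (_ordinal_key(w.code), _ordinal_key(w.message)),
    )
    # 3. Cap the result (assumed to happen last, so the cap is deterministic).
    return ordered[:MAX_WARNINGS]
```

Because duplicates are removed before sorting, no two remaining entries compare equal, so the output order does not depend on the order in which analyzers emitted the warnings.
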
## OS Analyzer Caching (Surface Cache)

Linux OS analyzers (apk/dpkg/rpm) support **safe, deterministic reuse** via `ISurfaceCache`:

- Cache key: `(tenant, analyzerId, rootfsFingerprint)` under namespace `scanner/os/analyzers`.
- Fingerprint inputs are intentionally narrow: a single **analyzer-specific** “DB fingerprint file”:
  - `apk`: `lib/apk/db/installed`
  - `dpkg`: `var/lib/dpkg/status`
  - `rpm`: `var/lib/rpm/rpmdb.sqlite` (preferred) or legacy `Packages` fallback
- Fingerprint payload includes:
  - Root path + analyzerId
  - Relative fingerprint file path
  - File length + `LastWriteTimeUtc` (ms)
  - Optional file-content sha256 when the file is ≤ 8 MiB

Worker wiring:

- `StellaOps.Scanner.Worker.Processing.CompositeScanAnalyzerDispatcher` records cache hit/miss counters per analyzer.

## RPM sqlite Reader Notes

When `rpmdb.sqlite` is present, the reader avoids `SELECT *` and column scanning:

- Uses `PRAGMA table_info(Packages)` to select a likely RPM header blob column (prefers `hdr`/`header`, excludes `pkgId` when possible).
- Queries only `pkgKey` + header blob column for parsing.
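
The column-selection approach described for the RPM sqlite reader might look roughly like the following, using Python's standard `sqlite3` module. It is a sketch, not the actual reader: `pick_header_column` and `read_headers` are hypothetical names, and the fallback rule (first BLOB-typed column other than `pkgKey`/`pkgId`) is an assumption about what "likely header blob column" means.

```python
import sqlite3
from typing import Iterator, Optional, Tuple

PREFERRED = ("hdr", "header")        # preferred names, per the notes above
EXCLUDED = ("pkgkey", "pkgid")       # never treated as the header blob


def pick_header_column(conn: sqlite3.Connection) -> Optional[str]:
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute("PRAGMA table_info(Packages)").fetchall()
    names = [row[1] for row in rows]
    for wanted in PREFERRED:
        for name in names:
            if name.lower() == wanted:
                return name
    # Assumed fallback: first BLOB-typed column that is not a key/id column.
    for row in rows:
        name, declared_type = row[1], row[2] or ""
        if "BLOB" in declared_type.upper() and name.lower() not in EXCLUDED:
            return name
    return None


def read_headers(conn: sqlite3.Connection) -> Iterator[Tuple[int, bytes]]:
    column = pick_header_column(conn)
    if column is None:
        return  # no usable column: yield nothing
    # Quote the identifier, since it comes from the database schema.
    quoted = '"' + column.replace('"', '""') + '"'
    # Select only the key and the header blob, never SELECT *.
    for pkg_key, blob in conn.execute(f"SELECT pkgKey, {quoted} FROM Packages"):
        yield pkg_key, blob
```

Selecting two named columns keeps the query cost independent of any other columns the table carries, which is the point of avoiding `SELECT *`.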