Building Better Binary Mapping and Call-Stack Reachability

Here are two practical ways to make your software supply-chain evidence both useful and verifiable—with enough background to get you shipping.


1) Binary SBOMs that still work when there's no package manager

Why this matters: Container images built FROM scratch or “distroless” often lack package metadata, so typical SBOMs go blank. A binary SBOM extracts facts directly from executables—so you still know “what's inside,” even in bare images.

Core idea (plain English):

  • Parse binaries (ELF on Linux, PE on Windows, Mach-O on macOS).
  • Record file paths, cryptographic hashes, import tables, compiler/linker hints, and for ELF also the .note.gnu.build-id (a unique ID most linkers embed).
  • Map these fingerprints to known packages/versions (vendor fingerprints, distro databases, your own allowlists).
  • Sign the result as an attestation so others can trust it without rerunning your scanner.

Minimal pipeline sketch:

  • Extract: readelf -n (ELF notes), objdump/otool for imports; compute SHA256 for every binary (see the sketch below).
  • Normalize: Emit CycloneDX or SPDX components for binaries, not just packages.
  • Map: Use BuildID → package hints (e.g., glibc, OpenSSL), symbol/version patterns, and path heuristics.
  • Attest: Wrap the SBOM in DSSE + in-toto and push to your registry alongside the image digest.
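
To make the extract step concrete, here is a minimal sketch (C#, matching the later examples) that pulls the GNU build-id note out of an ELF binary and hashes the file. It assumes a well-formed ELF64 little-endian input that fits in memory; a production extractor would also handle ELF32, big-endian targets, PT_NOTE program headers, and hostile inputs.

using System;
using System.IO;
using System.Security.Cryptography;

static class ElfIdentity
{
    // Returns the lowercase hex GNU build-id, or null if absent/unsupported.
    public static string? ReadBuildId(string path)
    {
        byte[] b = File.ReadAllBytes(path);
        if (b.Length < 0x40 || b[0] != 0x7F || b[1] != 'E' || b[2] != 'L' || b[3] != 'F')
            return null;                         // not an ELF file
        if (b[4] != 2 || b[5] != 1)
            return null;                         // only ELF64 little-endian in this sketch

        long shoff = BitConverter.ToInt64(b, 0x28);   // section header table offset
        int shentsize = BitConverter.ToUInt16(b, 0x3A);
        int shnum = BitConverter.ToUInt16(b, 0x3C);

        for (int i = 0; i < shnum; i++)
        {
            int sh = (int)(shoff + (long)i * shentsize);
            if (BitConverter.ToUInt32(b, sh + 4) != 7) continue;  // SHT_NOTE only
            int off = (int)BitConverter.ToInt64(b, sh + 24);      // sh_offset
            int end = off + (int)BitConverter.ToInt64(b, sh + 32); // + sh_size

            // Note entries: namesz, descsz, type, then name and desc, 4-byte aligned.
            for (int p = off; p + 12 <= end; )
            {
                int namesz = (int)BitConverter.ToUInt32(b, p);
                int descsz = (int)BitConverter.ToUInt32(b, p + 4);
                uint type = BitConverter.ToUInt32(b, p + 8);
                int name = p + 12;
                int desc = name + ((namesz + 3) & ~3);
                if (type == 3 && namesz == 4 &&              // NT_GNU_BUILD_ID, owner "GNU\0"
                    b[name] == 'G' && b[name + 1] == 'N' && b[name + 2] == 'U')
                    return Convert.ToHexString(b, desc, descsz).ToLowerInvariant();
                p = desc + ((descsz + 3) & ~3);
            }
        }
        return null;
    }

    public static string Sha256Hex(string path) =>
        Convert.ToHexString(SHA256.HashData(File.ReadAllBytes(path))).ToLowerInvariant();
}

Pairing ReadBuildId with Sha256Hex yields the two identity fields everything else in this note leans on.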

Pragmatic spec for developers:

  • Inputs: OCI image digest.

  • Outputs:

    • binary-sbom.cdx.json (CycloneDX) or binary-sbom.spdx.json.
    • attestation.intoto.jsonl (DSSE envelope referencing the SBOM's SHA256 and the image digest).
  • Data fields to capture per artifact:

    • algorithm: sha256, digest: <hex>, type: elf|pe|macho, path, size,
    • elf.build_id (if present), imports[], compiler[], arch, endian.
  • Verification:

    • cosign verify-attestation --type sbom --digest <image-digest> ...

Why the ELF BuildID is gold: it's a stable, linker-emitted identifier that helps correlate stripped binaries to upstream packages—critical when filenames and symbols lie.


2) Reachability analysis so you only page people for real risk

Why this matters: Not every CVE in your deps can actually be hit by your app. If you can show “no call path reaches the vulnerable sink,” you can denoise alerts and ship faster.

Core idea (plain English):

  • Build an interprocedural call graph of your app (across modules/packages).
  • Mark known “sinks” from vulnerability advisories (e.g., dangerous API + version range).
  • Compute graph reachability from your entrypoints (HTTP handlers, CLI main, background jobs).
  • The intersection of {reachable nodes} × {vulnerable sinks} = “actionable” findings.
  • Emit a signed witness (attestation) that states which sinks are reachable/unreachable and why.
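
The reachability computation in the third and fourth bullets is an ordinary breadth-first search; a minimal sketch, assuming functions are keyed by stable string IDs:

using System;
using System.Collections.Generic;

static class Reachability
{
    // callGraph: caller -> callees. Returns the vulnerable sinks reachable
    // from any entrypoint, i.e. the "actionable" findings.
    public static HashSet<string> ReachableSinks(
        IReadOnlyDictionary<string, string[]> callGraph,
        IEnumerable<string> entrypoints,
        IReadOnlySet<string> sinks)
    {
        var visited = new HashSet<string>(entrypoints);
        var queue = new Queue<string>(visited);
        while (queue.Count > 0)
        {
            var fn = queue.Dequeue();
            if (!callGraph.TryGetValue(fn, out var callees)) continue;
            foreach (var callee in callees)
                if (visited.Add(callee))       // enqueue each function once
                    queue.Enqueue(callee);
        }
        visited.IntersectWith(sinks);          // {reachable nodes} × {vulnerable sinks}
        return visited;
    }
}

All the hard work lives in building the graph and the sink set; the decision itself stays this simple and deterministic.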

Minimal pipeline sketch:

  • Ingest code/bytecode: language-specific frontends (e.g., .NET IL, JVM bytecode, Python AST, Go SSA).

  • Build graph: nodes = functions/methods; edges = call sites (include dynamic edges conservatively).

  • Mark entrypoints: web routes, message handlers, cron jobs, exported CLIs.

  • Mark sinks: from your vuln DB (API signature + version).

  • Decide: run graph search from entrypoints → is any sink reachable?

  • Attest: DSSE witness with:

    • artifact digest (commit SHA / image digest),
    • tool version + rule set hash,
    • list of reachable sinks with at least one example call path,
    • list of proven unreachable sinks (under stated assumptions).
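
An illustrative witness payload before DSSE wrapping; every field name here is hypothetical, not a fixed schema:

{
  "subject": { "image_digest": "sha256:...", "commit": "..." },
  "tool": { "name": "reachability-analyzer", "version": "0.1.0", "ruleset_sha256": "..." },
  "reachable": [
    {
      "sink": "Example.Crypto.LegacyDecrypt(byte[])",
      "advisory": "CVE-XXXX-YYYY",
      "example_path": ["Program.Main", "Api.HandleUpload", "Example.Crypto.LegacyDecrypt"]
    }
  ],
  "unreachable": [
    {
      "sink": "Example.Xml.UnsafeLoad(string)",
      "advisory": "CVE-XXXX-ZZZZ",
      "assumptions": ["entrypoint manifest is complete", "no runtime code loading"]
    }
  ]
}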

Developer contract (portable across languages):

  • Inputs: source/bytecode zip + manifest of entrypoints.

  • Outputs:

    • reachability.witness.json (DSSE envelope),
    • optional paths/ folder with top-N call paths as compact JSON (for UX rendering).
  • Verification:

    • Recompute call graph deterministically given the same inputs + tool version,
    • cosign verify-attestation --type reachability ...

How these two pieces fit together

  • Binary SBOM = “What exactly is in the artifact?” (even in bare images)

  • Reachability witness = “Which vulns actually matter to this app build?”

  • Sign both as DSSE/in-toto attestations and attach to the image/release. Your CI can enforce:

    • “Block if high-severity + reachable,”
    • “Warn (don't block) if high-severity but unreachable with a fresh witness.”

Quick starter checklist (copy/paste to a task board)

  • Binary extractors: ELF/PE/Mach-O parsers; hash & BuildID capture.
  • Mapping rules: BuildID → known package DB; symbol/version heuristics.
  • Emit CycloneDX/SPDX; add file-level components for binaries.
  • DSSE signing and cosign/rekor publish for SBOM attestation.
  • Language frontends for reachability (pick your top 1-2 first).
  • Call-graph builder + entrypoint detector.
  • Sink catalog normalizer (map CVE → API signature).
  • Reachability engine + example path extractor.
  • DSSE witness for reachability; attach to build.
  • CI policy: block on “reachable high/critical”; surface paths in UI.

If you want, I can turn this into concrete .NET-first tasks with sample code scaffolds and a tiny demo repo that builds an image, extracts a binary SBOM, runs reachability on a toy service, and emits both attestations.

Below is a concrete, “do-this-then-this” implementation plan for a layered binary→PURL mapping system that fits StellaOps' constraints: offline, deterministic, SBOM-first, and with unknowns recorded instead of guessing.

I'm going to assume your target is the common pain case StellaOps itself calls out: when package metadata is missing, Scanner falls back to binary identity (bin:{sha256}), and you want to deterministically “lift” those binaries into stable package identities (PURLs) without turning the core SBOM into fuzzy guesswork. StellaOps' own Scanner docs emphasize deterministic analyzers, no fuzzy identity in core, and keeping heuristics as opt-in add-ons. (Stella Ops)


0) What “binary mapping” means in StellaOps terms

In Scanner's architecture, the component key is:

  • PURL when present
  • otherwise bin:{sha256} (Stella Ops)

So “better binary mapping” = systematically converting more of those bin:* components into PURLs (or at least producing actionable mapping evidence + Unknowns) while preserving:

  • deterministic replay (same inputs ⇒ same output)
  • offline operation (airgapped kits)
  • policy safety (don't hide false negatives behind fuzzy IDs)

Also, StellaOps already has the concept of “gaps” being first-class via the Unknowns Registry (identity gaps, missing build-id, version conflicts, missing edges, etc.). (Gitea: Git with a cup of tea) Your binary mapping work should feed this system.


1) Design constraints you must keep (or you'll fight the platform)

1.1 Determinism rules

StellaOps Scanner architecture is explicit: core analyzers are deterministic; heuristic plugins must not contaminate the core SBOM unless explicitly enabled. (Stella Ops)

That implies:

  • No probabilistic “best guess” PURL in the default mapping path.

  • If you do fuzzy inference, it must be emitted as:

    • “hints” attached to Unknowns, or
    • a separate heuristic artifact gated by flags.

1.2 Offline kit + debug store is already a hook you can exploit

Offline kits already bundle:

  • scanner plugins (OS + language analyzers packaged under plugins/scanner/analyzers/**)
  • a debug store layout: debug/.build-id/<aa>/<rest>.debug
  • a debug-manifest.json that maps build-ids → originating images (for symbol retrieval) (Stella Ops)

This is perfect for building a BuildID→PURL index that remains offline and signed.

1.3 Scanner Worker already loads analyzers via directory catalogs

The Worker loads OS and language analyzer plugins from default directories (unless overridden), using deterministic directory normalization and a “seal” concept on the last directory. (Gitea: Git with a cup of tea)

So you can add a third catalog for native/binary mapping that behaves the same way.


2) Layering strategy: what to implement (and in what order)

You want a resolver pipeline with strict ordering from “hard evidence” → “soft evidence”.

Layer 0 — In-image authoritative mapping (highest confidence)

These sources are authoritative because they come from within the artifact:

  1. OS package DB present (dpkg/rpm/apk):
  • Map path → package using file ownership lists.
  • If you can also compute file hashes/build-ids, store them as evidence.
  2. Language ecosystem metadata present (already handled by language analyzers):
  • For example, a Python wheel RECORD or a Go buildinfo section can directly imply module versions.
Decision rule: If a binary file is owned by an OS package, prefer that over any external mapping index.

Layer 1 — “Build provenance” mapping via build IDs / UUIDs (strong, portable)

When package DB is missing (distroless/scratch), use compiler/linker stable IDs:

  • ELF: .note.gnu.build-id
  • MachO: LC_UUID
  • PE: CodeView (PDB GUID+Age) / build signature

This should be your primary fallback because it survives stripping and renaming.

Layer 2 — Hash mapping for curated or vendor-pinned binaries (strong but brittle across rebuilds)

Use SHA256 → PURL mapping when:

  • binaries are redistributed unchanged (busybox, chromium, embedded runtimes)
  • you maintain a curated “known binaries” manifest

StellaOps already has “curated binary manifest generation” mentioned in its repo history, and a vendor/manifest.json concept exists (for pinned artifacts / binaries in the system). (Gitea: Git with a cup of tea) For your ops environment you'll create a similar manifest for your fleet.

Layer 3 — Dependency closure constraints (helpful as a disambiguator, not a primary mapper)

If the binary's DT_NEEDED / imports point to libs you can identify, you can use that to disambiguate multiple possible candidates (“this openssl build-id matches, but only one candidate has the required glibc baseline”).

This must remain deterministic and rules-based.

Layer 4 — Heuristic hints (never change the core SBOM by default)

Examples:

  • symbol version patterns (GLIBC_2.28, etc.)
  • embedded version strings
  • import tables
  • compiler metadata

These produce Unknown evidence/hints, not a resolved identity, unless a special “heuristics allowed” flag is turned on.

Layer 5 — Unknowns Registry output (mandatory when you can't decide)

If a mapping can't be made decisively:

  • emit Unknowns (identity_gap, missing_build_id, version_conflict, etc.) (Gitea: Git with a cup of tea) This is not optional; it's how you prevent silent false negatives.

3) Concrete data model you should implement

3.1 Binary identity record

Create a single canonical identity structure that every layer uses:

public enum BinaryFormat { Elf, Pe, MachO, Unknown }

public sealed record BinaryIdentity(
    BinaryFormat Format,
    string Path,              // normalized (posix style), rooted at image root
    string Sha256,            // always present
    string? BuildId,          // ELF
    string? MachOUuid,        // Mach-O
    string? PeCodeViewGuid,   // PE/PDB
    string? Arch,             // amd64/arm64/...
    long SizeBytes
);

Determinism tip: normalize Path to a single separator and collapse //, ./, etc.
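
A minimal normalization sketch for that tip (posix separators, collapsed duplicate and no-op segments, always rooted at the image root; ".." clamps at the root):

static string NormalizePath(string path)
{
    var parts = path.Replace('\\', '/').Split('/', StringSplitOptions.RemoveEmptyEntries);
    var stack = new List<string>();
    foreach (var part in parts)
    {
        if (part == ".") continue;                                  // drop no-op segments
        if (part == "..")
        {
            if (stack.Count > 0) stack.RemoveAt(stack.Count - 1);   // never climb above root
            continue;
        }
        stack.Add(part);
    }
    return "/" + string.Join('/', stack);
}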

3.2 Mapping candidate

Each resolver layer returns candidates like:

public enum MappingVerdict { Resolved, Unresolved, Ambiguous }

public sealed record BinaryMappingCandidate(
    string Purl,
    double Confidence,          // 0..1 but deterministic
    string ResolverId,          // e.g. "os.fileowner", "buildid.index.v1"
    IReadOnlyList<string> Evidence, // stable ordering
    IReadOnlyDictionary<string,string> Properties // stable ordering
);

3.3 Final mapping result

public sealed record BinaryMappingResult(
    MappingVerdict Verdict,
    BinaryIdentity Subject,
    BinaryMappingCandidate? Winner,
    IReadOnlyList<BinaryMappingCandidate> Alternatives,
    string MappingIndexDigest // sha256 of index snapshot used (or "none")
);

4) Build the “Binary Map Index” that makes Layers 1 and 2 work offline

4.1 Where it lives in StellaOps

Put it in the Offline Kit as a signed artifact, next to other feeds and plug-ins. Offline kit packaging already includes plug-ins and a debug store with a deterministic layout. (Stella Ops)

Recommended layout:

offline-kit/
  feeds/
    binary-map/
      v1/
        buildid.map.zst
        sha256.map.zst
        index.manifest.json
        index.manifest.json.sig   (DSSE or JWS, consistent with your kit)

4.2 Index record schema (v1)

Make each record explicit and replayable:

{
  "schema": "stellaops.binary-map.v1",
  "records": [
    {
      "key": { "kind": "elf.build_id", "value": "2f3a..."},
      "purl": "pkg:deb/debian/openssl@3.0.11-1~deb12u2?arch=amd64",
      "evidence": {
        "source": "os.dpkg.fileowner",
        "source_image": "sha256:....",
        "path": "/usr/lib/x86_64-linux-gnu/libssl.so.3",
        "package": "openssl",
        "package_version": "3.0.11-1~deb12u2"
      }
    }
  ]
}

Key points:

  • key.kind is one of elf.build_id, macho.uuid, pe.codeview, file.sha256
  • include evidence with enough detail to justify mapping

4.3 How to generate the index (deterministically)

You need an offline index builder pipeline. In StellaOps terms, this is best treated like a feed exporter step (build-time), then shipped in the Offline Kit.

Input set options (choose one or mix):

  1. “Golden base images” list (your fleet's base images)
  2. Distro repositories mirrored into the airgap (Deb/RPM/APK archives)
  3. Previously scanned images that are allowed into the kit

Generation steps:

  1. For each input image:

    • Extract rootfs in a deterministic path order.
    • Run OS analyzers (dpkg/rpm/apk) + native identity collection (ELF/PE/Mach-O).
  2. Produce raw tuples:

    • (build_id | uuid | codeview | sha256) → (purl, evidence)
  3. Deduplicate:

    • Canonicalize PURLs (normalize qualifiers order, lowercasing rules).
    • If the same key maps to multiple distinct PURLs, keep them all and mark as conflict (do not pick one).
  4. Sort:

    • Sort by (key.kind, key.value, purl) lexicographically.
  5. Serialize:

    • Emit line-delimited JSON or a simple binary format.
    • Compress (zstd).
  6. Compute digests:

    • sha256 of each artifact.
    • sha256 of concatenated (artifact name + sha) for a manifest hash.
  7. Sign:

    • include in kit manifest and sign with the same process you use for other offline kit elements. Offline kit import in StellaOps validates digests and signatures. (Stella Ops)
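
Steps 4-6 reduce to a few lines once the artifacts exist; a sketch, assuming the built artifacts sit in one directory (ordinal sort matches the lexicographic requirement in step 4):

using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

static class IndexManifest
{
    // Manifest hash = sha256 over sorted (artifact name + sha256) pairs.
    public static string ManifestHash(string dir)
    {
        var sb = new StringBuilder();
        foreach (var file in Directory.GetFiles(dir).OrderBy(f => f, StringComparer.Ordinal))
        {
            var sha = Convert.ToHexString(SHA256.HashData(File.ReadAllBytes(file))).ToLowerInvariant();
            sb.Append(Path.GetFileName(file)).Append(':').Append(sha).Append('\n');
        }
        return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(sb.ToString()))).ToLowerInvariant();
    }
}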

5) Runtime side: implement the layered resolver in Scanner Worker

5.1 Where to hook in

You want this to run after OS + language analyzers have produced fragments, and after native identity collection has produced binary identities.

Scanner Worker already executes analyzers and appends fragments to context.Analysis. (Gitea: Git with a cup of tea)

Scanner module responsibilities explicitly include OS, language, and native ecosystems as restart-only plug-ins. (Gitea: Git with a cup of tea) So implement binary mapping as either:

  • part of the native ecosystem analyzer output stage, or
  • a post-analyzer enrichment stage that runs before SBOM composition.

I recommend: post-analyzer enrichment stage, because it can consult OS+lang analyzer results and unify decisions.

5.2 Add new ScanAnalysis keys

Store collected binary identities in analysis:

  • ScanAnalysisKeys.NativeBinaryIdentities → ImmutableArray<BinaryIdentity>

And store mapping results:

  • ScanAnalysisKeys.NativeBinaryMappings → ImmutableArray<BinaryMappingResult>

5.3 Implement the resolver pipeline (deterministic ordering)

public interface IBinaryMappingResolver
{
    string Id { get; }      // stable ID
    int Order { get; }      // deterministic
    BinaryMappingCandidate? TryResolve(BinaryIdentity identity, MappingContext ctx);
}

Pipeline:

  1. Sort resolvers by (Order, Id) (Ordinal comparison).

  2. For each resolver:

    • if it returns a candidate, add it to candidates list.
    • if the resolver is “authoritative” (Layer 0), you can short-circuit on first hit.
  3. Decide:

    • If 0 candidates ⇒ Unresolved

    • If 1 candidate ⇒ Resolved

    • If >1:

      • If candidates have different PURLs ⇒ Ambiguous unless a deterministic “dominates” rule exists
      • If candidates have same PURL (from multiple sources) ⇒ merge evidence
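
A sketch of the decide step using the records from section 3; it omits the Layer 0 short-circuit and any “dominates” rules:

using System;
using System.Collections.Generic;
using System.Linq;

static class MappingDecision
{
    // Candidates that agree on a PURL merge; disagreement yields Ambiguous,
    // never a guess.
    public static BinaryMappingResult Decide(
        BinaryIdentity subject,
        IReadOnlyList<BinaryMappingCandidate> candidates,
        string indexDigest)
    {
        if (candidates.Count == 0)
            return new(MappingVerdict.Unresolved, subject, null, candidates, indexDigest);

        var distinctPurls = candidates.Select(c => c.Purl).Distinct(StringComparer.Ordinal).ToList();
        if (distinctPurls.Count > 1)
            return new(MappingVerdict.Ambiguous, subject, null, candidates, indexDigest);

        // Same PURL from one or more resolvers: pick the first by deterministic
        // (ResolverId) ordering as the winner, keep the rest as corroboration.
        var ordered = candidates.OrderBy(c => c.ResolverId, StringComparer.Ordinal).ToList();
        return new(MappingVerdict.Resolved, subject, ordered[0], ordered.Skip(1).ToList(), indexDigest);
    }
}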

5.4 Implement each layer as a resolver

Resolver A: OS file owner (Layer 0)

Inputs:

  • OS analyzer results in context.Analysis (they're already stored in ScanAnalysisKeys.OsPackageAnalyzers). (Gitea: Git with a cup of tea)
  • You need OS analyzers to expose file ownership mapping.

Implementation options:

  • Extend OS analyzers to produce path → packageId maps.
  • Or load that from dpkg/rpm DB at mapping time (fast enough if you only query per binary path).

Candidate:

  • Purl = pkg:<ecosystem>/<name>@<version>?arch=...

  • Confidence = 1.0

  • Evidence includes:

    • analyzer id
    • package name/version
    • file path

Resolver B: BuildID index (Layer 1)

Inputs:

  • identity.BuildId (or uuid/codeview)
  • BinaryMapIndex loaded from Offline Kit feeds/binary-map/v1/buildid.map.zst

Implementation:

  • On worker startup: load and parse index into an immutable structure:

    • FrozenDictionary<string, BuildIdEntry[]> (or sorted arrays + binary search)
  • If key maps to multiple PURLs:

    • return multiple candidates (same resolver id), forcing Ambiguous verdict upstream

Candidate:

  • Confidence = 0.95 (still deterministic)
  • Evidence includes index manifest digest + record evidence
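
A loading sketch for the worker-startup step, assuming the zstd layer has already been decompressed to line-delimited JSON; BuildIdLine is a simplification of the section 4.2 record (evidence elided):

using System.Collections.Frozen;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.Json;

public sealed record BuildIdLine(string Key, string Purl);

public static class BuildIdIndex
{
    private static readonly JsonSerializerOptions Opts = new() { PropertyNameCaseInsensitive = true };

    // A key mapping to several PURLs keeps them all, so the pipeline can
    // report Ambiguous instead of silently picking one.
    public static FrozenDictionary<string, string[]> Load(string path) =>
        File.ReadLines(path)
            .Select(line => JsonSerializer.Deserialize<BuildIdLine>(line, Opts)!)
            .GroupBy(r => r.Key, StringComparer.Ordinal)
            .ToFrozenDictionary(
                g => g.Key,
                g => g.Select(r => r.Purl)
                      .Distinct(StringComparer.Ordinal)
                      .OrderBy(p => p, StringComparer.Ordinal)   // stable candidate order
                      .ToArray(),
                StringComparer.Ordinal);
}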

Resolver C: SHA256 index (Layer 2)

Inputs:

  • identity.Sha256
  • feeds/binary-map/v1/sha256.map.zst OR your ops “curated binaries” manifest

Candidate:

  • Confidence:

    • 0.9 if from signed curated manifest
    • 0.7 if from “observed in previous scan cache” (I'd avoid this unless you version and sign the cache)

Resolver D: Dependency closure constraints (Layer 3)

Only run if you have native dependency parsing output (DT_NEEDED / imports). The resolver does not return a mapping on its own; instead, it can:

  • bump confidence for existing candidates
  • or rule out candidates deterministically (e.g., glibc baseline mismatch)

Make this a “candidate rewriter” stage:

public interface ICandidateRefiner
{
    string Id { get; }
    int Order { get; }
    IReadOnlyList<BinaryMappingCandidate> Refine(BinaryIdentity id, IReadOnlyList<BinaryMappingCandidate> cands, MappingContext ctx);
}
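
For illustration, one deterministic refiner under this interface; the glibc.baseline property key and the MappingContext.RequiredGlibc helper are hypothetical names, not existing StellaOps APIs:

using System.Collections.Generic;
using System.Linq;

public sealed class GlibcBaselineRefiner : ICandidateRefiner
{
    public string Id => "refiner.glibc.baseline";
    public int Order => 100;

    // Keep candidates whose recorded glibc baseline matches what the binary's
    // GLIBC_* symbol versions require (or that record no baseline at all).
    public IReadOnlyList<BinaryMappingCandidate> Refine(
        BinaryIdentity id,
        IReadOnlyList<BinaryMappingCandidate> cands,
        MappingContext ctx)
    {
        var required = ctx.RequiredGlibc(id);  // e.g. "2.28", or null if no evidence
        if (required is null) return cands;    // no evidence: refine nothing

        return cands.Where(c =>
                !c.Properties.TryGetValue("glibc.baseline", out var baseline) ||
                baseline == required)
            .ToList();
    }
}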

Resolver E: Heuristic hints (Layer 4)

Never resolves to a PURL by default. It just produces an Unknown evidence payload:

  • extracted strings (“OpenSSL 3.0.11”)
  • imported symbol names
  • SONAME
  • symbol version requirements

6) SBOM composition behavior: how to “lift” bin components safely

6.1 Don't break the component key rules

Scanner uses:

  • key = PURL when present, else bin:{sha256} (Stella Ops)

When you resolve a binary identity to a PURL, you have two clean options:

Option 1 (recommended): replace the component key with the PURL

  • This makes downstream policy/advisory matching work naturally.
  • It's deterministic as long as the mapping index is versioned and shipped with the kit.

Option 2: keep bin:{sha256} as the component key and attach resolved_purl

  • Lower disruption to diffing, but policy now has to understand the “resolved_purl” field.
  • If StellaOps policy assumes component.purl is the canonical key, this will cause pain.

Given StellaOps emphasizes PURLs as the canonical key for identity, I'd implement Option 1, but record robust evidence + index digest.

6.2 Preserve file-level evidence

Even after lifting to PURL, keep evidence that ties the package identity back to file bytes:

  • file path(s)
  • sha256
  • build-id/uuid
  • mapping resolver id + index digest

This is what makes attestations verifiable and helps operators debug.
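
Concretely, a lifted component might carry its file-level evidence as properties (CycloneDX-style; the property names are illustrative, not a fixed StellaOps schema):

{
  "type": "library",
  "purl": "pkg:deb/debian/openssl@3.0.11-1~deb12u2?arch=amd64",
  "properties": [
    { "name": "stellaops:binary:path", "value": "/usr/lib/x86_64-linux-gnu/libssl.so.3" },
    { "name": "stellaops:binary:sha256", "value": "..." },
    { "name": "stellaops:binary:build_id", "value": "2f3a..." },
    { "name": "stellaops:mapping:resolver", "value": "buildid.index.v1" },
    { "name": "stellaops:mapping:index_digest", "value": "sha256:..." }
  ]
}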


7) Unknowns integration: emit Unknowns whenever mapping isn't decisive

The Unknowns Registry exists precisely for “unresolved symbol → package mapping”, “missing build-id”, “ambiguous purl”, etc. (Gitea: Git with a cup of tea)

7.1 When to emit Unknowns

Emit Unknowns for:

  1. identity.BuildId == null for ELF

    • unknown_type = missing_build_id
    • evidence: “ELF missing .note.gnu.build-id; using sha256 only”
  2. Multiple candidates with different PURLs

    • unknown_type = version_conflict (or identity_gap)
    • evidence: list candidates + their evidence
  3. Heuristic hints found but no authoritative mapping

    • unknown_type = identity_gap
    • evidence: imported symbols, strings, SONAME

7.2 How to compute unknown_id deterministically

The Unknowns schema suggests deriving unknown_id from the record's own content. Do:

  • stable JSON canonicalization of scope + unknown_type + primary evidence fields
  • sha256
  • prefix with unk:sha256:<...>

This guarantees idempotent ingestion behavior (POST /unknowns/ingest upsert). (Gitea: Git with a cup of tea)
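
A minimal sketch of that derivation; the exact canonicalization scheme just has to be fixed and documented:

using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

static class UnknownIds
{
    public static string UnknownId(string scope, string unknownType,
        IReadOnlyDictionary<string, string> evidence)
    {
        // Sorted keys + default serializer options => stable bytes for equal input.
        var canonical = new SortedDictionary<string, string>(StringComparer.Ordinal)
        {
            ["scope"] = scope,
            ["unknown_type"] = unknownType,
        };
        foreach (var (key, value) in evidence)
            canonical["evidence." + key] = value;

        var json = JsonSerializer.Serialize(canonical);
        var hash = Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(json))).ToLowerInvariant();
        return $"unk:sha256:{hash}";
    }
}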


8) Packaging as a StellaOps plug-in (so ops can upgrade it offline)

8.1 Plug-in manifest

Scanner plug-ins use a manifest.json with schemaVersion, id, entryPoint (dotnet assembly + typeName), etc. (Gitea: Git with a cup of tea)

Create something like:

{
  "schemaVersion": "1.0",
  "id": "stellaops.analyzer.native.binarymap",
  "displayName": "StellaOps Native Binary Mapper",
  "version": "0.1.0",
  "requiresRestart": true,
  "entryPoint": {
    "type": "dotnet",
    "assembly": "StellaOps.Scanner.Analyzers.Native.BinaryMap.dll",
    "typeName": "StellaOps.Scanner.Analyzers.Native.BinaryMap.BinaryMapPlugin"
  },
  "capabilities": [
    "native-analyzer",
    "binary-mapper",
    "elf",
    "pe",
    "macho"
  ],
  "metadata": {
    "org.stellaops.analyzer.kind": "native",
    "org.stellaops.restart.required": "true"
  }
}

8.2 Worker loading

Mirror the pattern in CompositeScanAnalyzerDispatcher:

  • add a catalog INativeAnalyzerPluginCatalog
  • default directory: plugins/scanner/analyzers/native
  • load directories with the same “seal last directory” behavior (Gitea: Git with a cup of tea)

9) Tests and performance gates (what “done” looks like)

StellaOps has determinism tests and golden fixtures for analyzers; follow that style. (Gitea: Git with a cup of tea)

9.1 Determinism tests

Create fixtures with:

  • same binaries in different file order
  • same binaries hardlinked/symlinked
  • stripped ELF missing build-id
  • multi-arch variants

Assert:

  • mapping output JSON byte-for-byte stable
  • unknown ids stable
  • candidate ordering stable
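
A determinism test sketch in that style (xUnit); Fixtures, BinaryMapper, and CanonicalJson are hypothetical stand-ins for your fixture loader, resolver pipeline, and canonical serializer:

using System.Linq;
using Xunit;

public sealed class BinaryMappingDeterminismTests
{
    [Fact]
    public void Output_IsByteStable_AcrossInputOrder()
    {
        var identities = Fixtures.LoadBinaryIdentities("golden/busybox");

        // Same identities, opposite order: serialized output must not change.
        var first = CanonicalJson.Serialize(BinaryMapper.MapAll(identities));
        var second = CanonicalJson.Serialize(BinaryMapper.MapAll(identities.Reverse().ToList()));

        Assert.Equal(first, second);
    }
}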

9.2 “No fuzzy identity” guardrail tests

Add tests that:

  • heuristic resolver never emits a Resolved verdict unless a feature flag is enabled
  • ambiguous candidates never auto-select a winner

9.3 Performance budgets

For ops, you care about scan wall time. Adopt budgets like:

  • identity extraction < 25ms / binary (native parsing)
  • mapping lookup O(1) / binary (frozen dict) or O(log n) with sorted arrays
  • index load time bounded (lazy load per worker start)

Track metrics:

  • count resolved per layer
  • count ambiguous/unresolved
  • unknown density (ties into Unknowns Registry scoring later) (Gitea: Git with a cup of tea)

10) Practical “ops” workflow: how to keep improving mapping safely

10.1 Add a feedback loop from Unknowns → index builder

Unknowns are your backlog:

  • “missing build-id”
  • “ambiguous mapping”
  • “hash seen but not in index”

For each Unknown:

  1. decide if it should be mapped in core (needs authoritative source)
  2. if yes: add reference artifact to your index builder input set
  3. rebuild the BinaryMap index
  4. ship via Offline Kit update (signed)

10.2 Don't let your index silently drift

Because determinism matters, treat the BinaryMap index like a feed:

  • version it (v1, v2)
  • sign it
  • store index digest in scan evidence

That way you can explain: “This binary was mapped using binary-map/v1 digest XYZ”.


11) Minimal implementation checklist (if you want the shortest path to value)

If you only do 3 things, do these:

  1. BuildID extraction everywhere (ELF/Mach-O/PE) and always store it in evidence (also emit an Unknown when missing, as StellaOps expects) (Gitea: Git with a cup of tea)

  2. Offline BuildID → PURL index shipped in Offline Kit (fits perfectly with the existing debug-store + kit pattern) (Stella Ops)

  3. Deterministic resolver pipeline + Unknowns emission (so you improve mapping without introducing silent risk) (Gitea: Git with a cup of tea)


If you tell me whether your main pain is distroless, FROM scratch, or vendor-bundled runtimes (chromium/node/openssl/etc.), I can give you the best “Layer 1 index builder” recipe for that category (what to use as authoritative sources and how to avoid collisions) — but the plan above is already safe and implementable without further assumptions.