Here’s a compact pattern you can drop into Stella Ops to make reachability checks fast, reproducible, and audit‑friendly.

---

# Lazy, single‑use reachability cache + signed “reach‑map” artifacts

**Why:** reachability queries explode combinatorially; precomputing everything wastes RAM and goes stale. Cache results only when first asked, make them deterministic, and emit a signed artifact so the same evidence can be replayed in VEX proofs.

**Core ideas (plain English):**

* **Lazy on first call:** compute only the exact path/query requested; cache that result.
* **Deterministic key:** cache key = `algo_signature + inputs_hash + call_path_hash` so the same inputs always hit the same entry.
* **Single‑use / bounded TTL:** entries survive just long enough to serve concurrent deduped calls, then get evicted (or on TTL/size). This keeps memory tight and avoids stale proofs.
* **Reach‑map artifact:** every cache fill writes a compact, deterministic JSON “reach‑map” (edges, justifications, versions, timestamps) and signs it (DSSE). The artifact is what VEX cites, not volatile memory.
* **Replayable proofs:** later runs can skip recomputation by verifying + loading the reach‑map, yielding byte‑for‑byte identical evidence.

**Minimal shape (C#/.NET 10):**

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public readonly record struct ReachKey(
    string AlgoSig,       // e.g., "RTA@sha256:…"
    string InputsHash,    // SBOM slice + policy + versions
    string CallPathHash); // normalized query graph (src->sink, opts)

public sealed class ReachCache
{
    // Lazy<Task<T>> coalesces concurrent callers onto a single computation.
    private readonly ConcurrentDictionary<ReachKey, Lazy<Task<ReachResult>>> _memo = new();

    public async Task<ReachResult> GetOrComputeAsync(
        ReachKey key,
        Func<Task<ReachResult>> compute,
        CancellationToken ct)
    {
        var lazy = _memo.GetOrAdd(key, _ => new Lazy<Task<ReachResult>>(
            compute, LazyThreadSafetyMode.ExecutionAndPublication));
        try
        {
            return await lazy.Value.WaitAsync(ct);
        }
        catch
        {
            _memo.TryRemove(key, out _); // don’t retain failures
            throw;
        }
    }

    public void Evict(ReachKey key) => _memo.TryRemove(key, out _);
}
```

**Compute path → emit DSSE reach‑map (pseudocode):**

```csharp
var result = await cache.GetOrComputeAsync(key, async () =>
{
    var graph = BuildSlice(inputs);              // deterministic ordering!
    var paths = FindReachable(graph, query);     // your chosen algo
    var reachMap = Canonicalize(new
    {
        algo = key.AlgoSig,
        inputs_hash = key.InputsHash,
        call_path = key.CallPathHash,
        edges = paths.Edges,
        witnesses = paths.Witnesses,             // file:line, symbol ids, versions
        created = NowUtcIso8601()
    });
    var dsse = Dsse.Sign(reachMap, signingKey);  // e.g., in‑toto/DSSE
    await ArtifactStore.PutAsync(KeyToPath(key), dsse.Bytes);
    return new ReachResult(paths, dsse.Digest);
}, ct);
```

**Operational rules:**

* **Canonical everything:** sort nodes/edges, normalize file paths, strip nondeterministic fields.
* **Cache scope:** per‑scan, per‑workspace, or per‑feed version. Evict on feed/policy changes.
* **TTL:** e.g., 15–60 minutes, or evict after the pipeline completes. Guard with a max‑entries cap.
* **Concurrency:** use `Lazy<Task<T>>` (above) to coalesce duplicate in‑flight calls.
* **Validation path:** before computing, look for `reach-map.dsse` by `ReachKey`; if the signature verifies and the schema version matches, load and return (no compute).

**How this helps VEX in Stella Ops:**

* **Consistency:** the DSSE reach‑map is the evidence blob your VEX record links to.
* **Speed:** repeat scans and parallel microservices reuse cached or pre‑signed artifacts.
* **Memory safety:** no unbounded precompute; everything is small and query‑driven.

**Drop‑in tasks for your agents:**

1. **Define ReachKey builders** in `Scanner.WebService` (inputs hash = SBOM slice + policy + resolver versions).
2. **Add ReachCache** as a scoped service with size/TTL config (appsettings → `Scanner.Reach.Cache`).
3. **Implement Canonicalize + Dsse.Sign** in `StellaOps.Crypto` (support FIPS/eIDAS/GOST modes).
4. **ArtifactStore**: write/read `reach-map.dsse.json` under a deterministic path: `artifacts/reach///.dsse.json`.
5. **Wire VEXer** to reference the artifact digest and include a verification note.
6. **Tests:** golden fixtures asserting stable bytes for the same inputs; mutation tests to ensure any input change invalidates the cache key.

If you want, I can turn this into a ready‑to‑commit `StellaOps.Scanner.Reach` module (interfaces, options, tests, and a stub DSSE signer).

I will split this in two parts:

1. What are Stella Ops’ concrete advantages (the “moats”).
2. How developers must build to actually realize them (guidelines and checklists).

---

## 1. Stella Ops Advantages – What We Are Optimizing For

### 1.1 Deterministic, Replayable Security Evidence

**Idea:** Any scan or VEX decision run today must be replayable bit-for-bit in 3–5 years for audits, disputes, and compliance.

**What this means:**

* Every scan has an explicit **input manifest** (feeds, rules, policies, versions, timestamps).
* Outputs (findings, reachability, VEX, attestations) are **pure functions** of that manifest.
* Evidence is stored as **immutable artifacts** (DSSE, SBOMs, reach-maps, policy snapshots), not just rows in a DB.

---

### 1.2 Reachability-First, Quiet-By-Design Triage

**Idea:** The main value is not “finding more CVEs” but **proving which ones matter** in your actual runtime and call graph – and keeping noise down.

**What this means:**

* Scoring/prioritization is dominated by **reachability + runtime context**, not just CVSS.
* Unknowns and partial evidence are surfaced **explicitly**, not hidden.
* UX is intentionally quiet: “Can I ship?” → “Yes / No, because of these N concrete, reachable issues.”

---

### 1.3 Crypto-Sovereign, Air-Gap-Ready Trust

**Idea:** The platform must run offline, support local CAs/HSMs, and switch between cryptographic regimes (FIPS, eIDAS, GOST, SM, PQC) by configuration, not by code changes.
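A minimal sketch of that regime switch, assuming a hypothetical `ICryptoProfile` interface and profile names (the real `StellaOps.Crypto` surface may differ): feature code resolves a profile by configured name and never touches a concrete algorithm.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

// Hypothetical abstraction: business code signs/verifies only through a profile.
public interface ICryptoProfile
{
    string Name { get; }
    byte[] Sign(byte[] payload);
    bool Verify(byte[] payload, byte[] signature);
}

// Example profile backed by ECDSA P-256, as a stand-in for a FIPS-approved suite.
public sealed class EcdsaP256Profile : ICryptoProfile
{
    private readonly ECDsa _key = ECDsa.Create(ECCurve.NamedCurves.nistP256);
    public string Name => "fips-ecdsa-p256";
    public byte[] Sign(byte[] payload) => _key.SignData(payload, HashAlgorithmName.SHA256);
    public bool Verify(byte[] payload, byte[] sig) => _key.VerifyData(payload, sig, HashAlgorithmName.SHA256);
}

// Registry resolves a profile from deployment/tenant configuration; a disabled
// profile fails loudly rather than silently falling back.
public sealed class CryptoProfileRegistry
{
    private readonly Dictionary<string, ICryptoProfile> _profiles = new();
    public void Register(ICryptoProfile profile) => _profiles[profile.Name] = profile;
    public ICryptoProfile Resolve(string name) =>
        _profiles.TryGetValue(name, out var p)
            ? p
            : throw new InvalidOperationException($"crypto profile '{name}' is not enabled");
}
```

Switching regimes is then a configuration change: register a GOST or PQC implementation under another name and point the tenant at it, with no edits to feature code.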
**What this means:**

* No hard dependency on any public CA, cloud KMS, or single trust provider.
* All attestations are **locally verifiable** with bundled roots and policies.
* Crypto suites are **pluggable profiles** selected per deployment / tenant.

---

### 1.4 Policy / Lattice Engine (“Trust Algebra Studio”)

**Idea:** Vendors, customers, and regulators speak different languages. Stella Ops provides a **formal lattice** to merge and reason over:

* VEX statements
* Runtime observations
* Code provenance
* Organizational policies

…without losing provenance (“who said what”).

**What this means:**

* Clear separation between **facts** (observations) and **policies** (how we rank/merge them).
* Lattice merge operations are **explicit, testable functions**, not hidden heuristics.
* The same artifact can be interpreted differently by different tenants via different lattice policies.

---

### 1.5 Proof-Linked SBOM→VEX Chain

**Idea:** Every VEX claim must point to concrete, verifiable evidence:

* Which SBOM / version?
* Which reachability analysis?
* Which runtime signals?
* Which signer/policy?

**What this means:**

* VEX is not just a JSON document – it is a **graph of links** to attestations and analysis artifacts.
* You can click from a VEX statement to the exact DSSE reach-map / scan run that justified it.

---

### 1.6 Proof-of-Integrity Graph (Build → Image → Runtime)

**Idea:** Connect:

* Source → Build → Image → SBOM → Scan → VEX → Runtime

…into a single **cryptographically verifiable graph**.

**What this means:**

* Every step has a **signed attestation** (in-toto/DSSE style).
* Graph queries like “Show me all running pods that descend from this compromised builder” or “Show me all VEX statements that rely on this revoked key” are first-class.

---

### 1.7 AI Codex & Zastava Companion (Explainable by Construction)

**Idea:** AI is used only as a **narrator and planner** on top of hard evidence, not as an oracle.
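One way to sketch that boundary as types (names here are illustrative, not the real Zastava API): the platform assembles an evidence bundle, the companion receives it together with the question, and every claim in the answer must point back into the bundle.

```csharp
using System.Collections.Generic;

// Hypothetical contract: the platform does retrieval/correlation; Zastava only narrates.
public sealed record EvidenceBundle(
    string ScanId,
    IReadOnlyList<string> AttestationDigests,  // DSSE envelopes backing the answer
    IReadOnlyList<string> ReachMapDigests,
    string LatticePolicyId);

public sealed record ExplainRequest(EvidenceBundle Evidence, string Question);

public sealed record ExplainResponse(
    string ShortAnswer,
    IReadOnlyList<string> EvidencePointers,    // must be a subset of the bundle's digests
    string LatticePolicyId);                   // which policy the narration assumed
```

Because the response carries only IDs already present in the request bundle, a verifier can reject any answer that cites evidence it was never given, and re-running with the same bundle reproduces the same structured answer.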
**What this means:**

* Zastava never invents facts; it explains **what is already in the evidence graph**.
* Remediation plans cite **concrete artifacts** (scan IDs, attestations, policies) and affected assets.
* All AI outputs include links back to raw structured data and can be re-generated in future with the same evidence set.

---

### 1.8 Proof-Market Ledger & Adaptive Trust Economics

**Idea:** Over time, vendors publishing good SBOM/VEX evidence should **gain trust-credit**; sloppy or contradictory publishers lose it.

**What this means:**

* A ledger of **published proofs**, signatures, and revocations.
* A **trust score** per artifact / signer / vendor, derived from consistency, coverage, and historical correctness.
* This feeds into procurement and risk dashboards, not just security triage.

---

## 2. Developer Guidelines – How to Build for These Advantages

I will phrase this as rules and checklists you can directly apply in Stella Ops repos (.NET 10, C#, Postgres, MongoDB, etc.).

---

### 2.1 Determinism & Replayability

**Rules:**

1. **Pure functions, explicit manifests**

   * Any long-running or non-trivial computation (scan, reachability, lattice merge, trust score) must accept a **single, structured input manifest**, e.g.:

     ```jsonc
     {
       "scannerVersion": "1.3.0",
       "rulesetId": "stella-default-2025.11",
       "feeds": {
         "nvdDigest": "sha256:...",
         "osvDigest": "sha256:..."
       },
       "sbomDigest": "sha256:...",
       "policyDigest": "sha256:..."
     }
     ```

   * No hidden configuration from environment variables, machine-local files, or the system clock inside the core algorithm.

2. **Canonicalization everywhere**

   * Before hashing or signing:
     * Sort arrays by stable keys.
     * Normalize paths (POSIX style), line endings (LF), and encodings (UTF-8).
   * Provide a shared `StellaOps.Core.Canonicalization` library used by all services.

3. **Stable IDs**

   * Every scan, reachability call, lattice evaluation, and VEX bundle gets an opaque but **stable** ID based on the input manifest hash.
   * Do not use incremental integer IDs for evidence; use digests (hashes) or ULIDs/GUIDs derived from content.

4. **Golden fixtures**

   * For each non-trivial algorithm, ship at least one **golden fixture**:
     * Input manifest JSON
     * Expected output JSON
   * CI must assert byte-for-byte equality for these fixtures (after canonicalization).

**Developer checklist (per feature):**

* [ ] Input manifest type defined and versioned.
* [ ] Canonicalization applied before hashing/signing.
* [ ] Output stored with `inputsDigest` and `algoDigest`.
* [ ] At least one golden fixture proves determinism.

---

### 2.2 Reachability-First Analysis & Quiet UX

**Rules:**

1. **Reachability lives in Scanner.WebService**

   * All lattice/graph heavy lifting for reachability must run in `Scanner.WebService` (standing architectural rule).
   * Other services (Concelier, Excitors, Feedser) only **consume** reachability artifacts and must “preserve prune source” (never rewrite paths/proofs, only annotate or filter).

2. **Lazy, query-driven computation**

   * Do not precompute reachability for entire SBOMs.
   * Compute per **exact query** (image + vulnerability or source→sink path).
   * Use an in-memory or short-lived cache keyed by:
     * Algorithm signature
     * Input manifest hash
     * Query description (call-path hash)

3. **Evidence-first, severity-second**

   * Internal ranking objects should look like:

     ```csharp
     public sealed record FindingRank(
         string FindingId,
         EvidencePointer Evidence,
         ReachabilityScore Reach,
         ExploitStatus Exploit,
         RuntimePresence Runtime,
         double FinalScore);
     ```

   * The UI always has a “Show evidence” or “Explain” action that can be serialized as JSON and re-used by Zastava.

4. **Quiet-by-design UX**

   * For any list view, the default sort is:
     1. Reachable, exploitable, runtime-present
     2. Reachable, exploitable
     3. Reachable, unknown exploit
     4. Unreachable
   * Show **counts by bucket**, not only the total CVE count.

**Developer checklist:**

* [ ] Reachability algorithms only in Scanner.WebService.
* [ ] Cache is lazy and keyed by deterministic inputs.
* [ ] Output includes explicit evidence pointers.
* [ ] UI endpoints expose reachability state in structured form.

---

### 2.3 Crypto-Sovereign & Air-Gap Mode

**Rules:**

1. **Cryptography via “profiles”**

   * Implement a `CryptoProfile` abstraction (e.g. `FipsProfile`, `GostProfile`, `EidasProfile`, `SmProfile`, `PqcProfile`).
   * All signing/verifying APIs take a `CryptoProfile` or resolve one from tenant config; no direct calls to `RSA.Create()` etc. in business code.

2. **No hard dependency on public PKI**

   * All verification logic must accept:
     * A provided root cert bundle
     * A local CRL or OCSP-equivalent
   * Never assume internet OCSP/CRL.

3. **Offline bundles**

   * Any operation required for air-gapped mode must be satisfiable with:
     * SBOM + feeds + policy bundle + key material
   * Define explicit **“offline bundle” formats** (zip/tar + manifest) with hashes of all contents.

4. **Key rotation and algorithm agility**

   * Metadata for every signature must record:
     * Algorithm
     * Key ID
     * Profile
   * Verification code must fail safely when a profile is disabled, and error messages must be precise.

**Developer checklist:**

* [ ] No direct crypto calls in feature code; only via the profile layer.
* [ ] All attestations carry algorithm + key id + profile.
* [ ] Offline bundle type exists for this workflow.
* [ ] Tests for at least 2 different crypto profiles.

---

### 2.4 Policy / Lattice Engine

**Rules:**

1. **Facts vs. policies separation**

   * Facts:
     * SBOM components, CVEs, reachability edges, runtime signals.
   * Policies:
     * “If vendor says ‘not affected’ and reachability says unreachable, treat as Informational.”
   * Serialize facts and policies separately, with their own digests.

2. **Lattice implementation location**

   * Lattice evaluation (trust algebra) for VEX decisions happens in:
     * `Scanner.WebService` for scan-time interpretation
     * `Vexer/Excitor` for publishing and transformation into VEX documents
   * Concelier/Feedser must not recompute lattice results, only read them.

3. **Formal merge operations**

   * Each lattice merge function must be:
     * Explicitly named (e.g. `MaxSeverity`, `VendorOverridesIfStrongerEvidence`, `ConservativeIntersection`).
     * Versioned and referenced by ID in artifacts (e.g. `latticeAlgo: "trust-algebra/v1/max-severity"`).

4. **Studio-ready representation**

   * Internal data structures must align with a future “Trust Algebra Studio” UI:
     * Nodes = statements (VEX, runtime observation, reachability result)
     * Edges = “derived_from” / “overrides” / “constraints”
     * Policies = transformations over these graphs.

**Developer checklist:**

* [ ] Facts and policies are serialized separately.
* [ ] Lattice code is in allowed services only.
* [ ] Merge strategies are named and versioned.
* [ ] Artifacts record which lattice algorithm was used.

---

### 2.5 Proof-Linked SBOM→VEX Chain

**Rules:**

1. **Link, don’t merge**

   * SBOM, scan result, reachability artifact, and VEX should keep their own schemas.
   * Use **linking IDs** instead of denormalizing everything into one mega-document.

2. **Evidence pointers in VEX**

   * Every VEX statement (per vuln/component) includes:
     * `sbomDigest`
     * `scanId`
     * `reachMapDigest`
     * `policyDigest`
     * `signerKeyId`

3. **DSSE everywhere**

   * All analysis artifacts are wrapped in DSSE:
     * Payload = canonical JSON
     * Envelope = signature + key metadata + profile
   * Do not invent yet another custom envelope format.

**Developer checklist:**

* [ ] VEX schema includes pointers back to all upstream artifacts.
* [ ] No duplication of SBOM or scan content inside VEX.
* [ ] DSSE used as the standard envelope type.

---

### 2.6 Proof-of-Integrity Graph

**Rules:**

1. **Graph-first storage model**

   * Model the lifecycle as a graph:
     * Nodes: source commit, build, image, SBOM, scan, VEX, runtime instance.
     * Edges: “built_from”, “scanned_as”, “deployed_as”, “derived_from”.
   * Use stable IDs and store in a graph-friendly form (e.g. adjacency collections in Postgres or a document graph in Mongo).

2. **Attestations as edges**

   * Attestations represent edges, not just metadata blobs.
   * Example: a build attestation is an edge `commit -> image`, signed by the CI builder.

3. **Queryable from APIs**

   * Expose API endpoints like:
     * `GET /graph/runtime/{podId}/lineage`
     * `GET /graph/image/{digest}/vex`
   * Zastava and the UI must use the same APIs, not private shortcuts.

**Developer checklist:**

* [ ] Graph nodes and edges modelled explicitly.
* [ ] Each edge type has an attestation schema.
* [ ] At least two graph traversal APIs implemented.

---

### 2.7 AI Codex & Zastava Companion

**Rules:**

1. **Evidence in, explanation out**

   * Zastava must receive:
     * An explicit evidence bundle (JSON) for a question.
     * The user’s question.
   * It must not be responsible for data retrieval or correlation itself – that is the platform’s job.

2. **Stable explanation contracts**

   * Define a structured response format, for example:

     ```json
     {
       "shortAnswer": "You can ship, with 1 reachable critical.",
       "findingsSummary": [...],
       "remediationPlan": [...],
       "evidencePointers": [...]
     }
     ```

   * This allows regeneration and multi-language rendering later.

3. **No silent decisions**

   * Every recommendation must include:
     * Which lattice policy was assumed.
     * Which artifacts were used (by ID).

**Developer checklist:**

* [ ] Zastava APIs accept evidence bundles, not query strings against the DB.
* [ ] Responses are structured and deterministic given the evidence.
* [ ] Explanations include policy and artifact references.

---

### 2.8 Proof-Market Ledger & Adaptive Trust

**Rules:**

1. **Ledger as append-only**

   * Treat the proof-market ledger as an **append-only log**:
     * New proofs (SBOM/VEX/attestations)
     * Revocations
     * Corrections / contradictions
   * Do not delete; instead emit revocation events.

2. **Trust-score derivation**

   * Trust is not a free-form label; it is a numeric or lattice value computed from:
     * The number of consistent proofs over time.
     * Speed of publishing after a CVE.
     * Rate of contradictions or revocations.

3. **Separation from security decisions**

   * Trust scores feed into:
     * Sorting and highlighting.
     * Procurement / vendor dashboards.
   * Do not hard-gate security decisions solely on trust scores.

**Developer checklist:**

* [ ] Ledger is append-only with explicit revocations.
* [ ] Trust scoring algorithm documented and versioned.
* [ ] UI uses trust scores only as a dimension, not a gate.

---

### 2.9 Quantum-Resilient Mode

**Rules:**

1. **Optional PQC**

   * PQC algorithms (e.g. Dilithium, Falcon) are an **opt-in crypto profile**.
   * Artifacts can carry multiple signatures (classical + PQC) to ease migration.

2. **No PQC assumption in core logic**

   * Core logic must treat the algorithm as opaque; only the crypto layer understands whether it is PQ or classical.

**Developer checklist:**

* [ ] PQC profile implemented as a first-class profile.
* [ ] Artifacts support multi-signature envelopes.

---

## 3. Definition of Done Templates

You can use this as a per-feature DoD in Stella Ops:

**For any new feature that touches scans, VEX, or evidence:**

* [ ] Deterministic: input manifest defined, canonicalization applied, golden fixture(s) added.
* [ ] Evidence: outputs are DSSE-wrapped and linked (not merged) into existing artifacts.
* [ ] Reachability / Lattice: if applicable, runs only in allowed services and records algorithm IDs.
* [ ] Crypto: crypto calls go through the profile abstraction; tests for at least 2 profiles if security-sensitive.
* [ ] Graph: lineage edges added where appropriate; node/edge IDs stable and queryable.
* [ ] UX/API: at least one API to retrieve structured evidence for Zastava and the UI.
* [ ] Tests: unit + golden + at least one integration test with a full SBOM → scan → VEX chain.

If you want, the next step can be to pick one module (e.g. Scanner.WebService or Vexer) and turn these high-level rules into a concrete CONTRIBUTING.md / ARCHITECTURE.md for that service.
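As a concrete instance of the golden-fixture rule from section 2.1, a determinism test can be as small as hashing a canonicalized output and pinning the digest. The `Canonicalize` helper below is a stand-in for the shared `StellaOps.Core.Canonicalization` library, not its real API:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class GoldenFixture
{
    // Stand-in canonicalizer: ordinal-sorted keys, LF separators, UTF-8.
    // The real shared library would also normalize paths and strip
    // nondeterministic fields before hashing.
    public static string Canonicalize(IDictionary<string, string> fields) =>
        string.Join("\n", fields
            .OrderBy(kv => kv.Key, StringComparer.Ordinal)
            .Select(kv => $"{kv.Key}={kv.Value}"));

    // Digest of the canonical form; this hex string is what the golden
    // fixture pins and what CI compares byte-for-byte.
    public static string Digest(string canonical) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(canonical)));
}
```

A golden test then asserts that `Digest(Canonicalize(fixtureInput))` equals a stored hex string; any drift in inputs or algorithm flips the digest and fails CI, and insertion order of the input fields cannot affect the result.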