Here’s a crisp, plug‑in set of reproducible benchmarks you can bake into Stella Ops so buyers, auditors, and your own team can see measurable wins—without hand‑wavy heuristics.
Benchmarks Stella Ops should standardize
1) Time‑to‑Evidence (TTE)
How fast Stella Ops turns a “suspicion” into a signed, auditor‑usable proof (e.g., VEX + attestations).
- Definition: TTE = t(proof_ready) – t(artifact_ingested)
- Scope: scanning, reachability, policy evaluation, proof generation, notarization, and publication to your proof ledger.
- Targets:
  - P50 < 2 min for typical container images (≤ 500 MB, known ecosystems).
  - P95 < 5 min including cold‑start/offline‑bundle mode.
- Report: median/P95 by artifact size bucket; break down stages (fetch → analyze → reachability → VEX → sign → publish).
- Auditable logs: DSSE/DSD signatures, policy hash, feed set IDs, scanner build hash.
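As a sketch of the stage roll‑up (the stage names mirror the breakdown above; the nearest‑rank percentile helper is illustrative, not a library API):

```python
# Stage timings (ms) per artifact: fetch -> analyze -> reachability -> VEX -> sign -> publish.
def tte_ms(stages: dict) -> int:
    """Time-to-Evidence for one artifact is the sum of its stage durations."""
    return sum(stages.values())

def percentile(values, p):
    """Simple nearest-rank percentile (hypothetical helper for the dashboard roll-up)."""
    ordered = sorted(values)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

runs = [
    {"fetch": 900,   "analyze": 30_000, "reachability": 12_000, "vex": 4_000, "sign": 800, "publish": 1_200},
    {"fetch": 1_500, "analyze": 55_000, "reachability": 20_000, "vex": 6_000, "sign": 900, "publish": 1_600},
    {"fetch": 700,   "analyze": 25_000, "reachability": 9_000,  "vex": 3_500, "sign": 700, "publish": 1_000},
]
ttes = [tte_ms(r) for r in runs]
p50 = percentile(ttes, 50)  # 48_900 ms, under the 2-minute P50 target
p95 = percentile(ttes, 95)  # 85_000 ms, under the 5-minute P95 target
```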
2) False‑Negative Drift Rate (FN‑Drift)
Catches when a previously “clean” artifact later becomes “affected” because the world changed (new CVE, rule, or feed).
- Definition (rolling 30‑day window): FN‑Drift = (# artifacts re‑classified from {unaffected/unknown} → affected) / (total artifacts re‑evaluated)
- Stratify by cause: feed delta, rule delta, lattice/policy delta, reachability delta.
- Goal: keep feed‑caused FN‑Drift low through faster feed deltas (good), while keeping engine‑caused FN‑Drift near zero (stability).
- Guardrails: require an explanation for every re‑classification, including a diff of the feeds, rule versions, and lattice policy commit.
- Badge: “No engine‑caused FN drift in 90d” (hash‑linked evidence bundle).
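The drift computation, sketched under the status and cause taxonomy above (the event shapes are illustrative):

```python
from collections import Counter

# Re-classification events observed in a 30-day window (hypothetical sample data).
reclassifications = [
    {"from": "unaffected", "to": "affected",   "cause": "feed_delta"},
    {"from": "unknown",    "to": "affected",   "cause": "feed_delta"},
    {"from": "unknown",    "to": "unaffected", "cause": "ruleset_delta"},
    {"from": "unaffected", "to": "affected",   "cause": "engine_delta"},
]
total_reevaluated = 1000  # all artifacts re-evaluated in the window

# FN-Drift counts only {unaffected, unknown} -> affected transitions.
fn_events = [r for r in reclassifications
             if r["from"] in ("unaffected", "unknown") and r["to"] == "affected"]
fn_drift = len(fn_events) / total_reevaluated  # 0.3% overall
by_cause = Counter(r["cause"] for r in fn_events)  # stratification for the guardrails
```

The `engine_delta` bucket is the one the 90‑day badge watches: feed‑caused drift is expected, engine‑caused drift is not.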
3) Deterministic Re‑scan Reproducibility (Hash‑Stable Proofs)
Same inputs → same outputs, byte‑for‑byte, including proofs. Crucial for audits and regulated procurement.
- Definition: given a scan manifest (artifact digest, feed snapshots, engine build hash, lattice/policy hash), a re‑scan must produce an identical findings set, VEX decisions, proofs, and top‑level bundle hash.
- Metric: Repro rate = identical_outputs / total_replays (target 100%).
- Proof object: { artifact_digest, scan_manifest_hash, feeds_merkle_root, engine_build_hash, policy_lattice_hash, findings_sha256, vex_bundle_sha256, proof_bundle_sha256 }
- CI check: nightly replay of a fixed corpus; fail the pipeline on any non‑determinism (with a diff).
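One way to derive a top‑level hash over the proof‑object fields (a sketch; the sorted‑keys JSON canonicalization here is illustrative, not Stella Ops’s actual scheme):

```python
import hashlib
import json

# Hypothetical proof object with the fields listed above (digest values are placeholders).
proof = {
    "artifact_digest":     "sha256:aaaa",
    "scan_manifest_hash":  "sha256:bbbb",
    "feeds_merkle_root":   "sha256:cccc",
    "engine_build_hash":   "sha256:dddd",
    "policy_lattice_hash": "sha256:eeee",
    "findings_sha256":     "sha256:ffff",
    "vex_bundle_sha256":   "sha256:1111",
}

def bundle_hash(fields: dict) -> str:
    # Canonical form: sorted keys, compact separators, so the same inputs
    # always yield the same digest regardless of dict construction order.
    canonical = json.dumps(fields, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(canonical).hexdigest()

h1 = bundle_hash(proof)
h2 = bundle_hash(dict(reversed(list(proof.items()))))  # key order must not matter
```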
Minimal implementation plan (developer‑ready)
- Canonical Scan Manifest (CSM): immutable JSON (canonicalized), covering: artifact digests; feed URIs + content hashes; engine build + ruleset hashes; lattice/policy hash; config flags; environment fingerprint (CPU features, locale). Store CSM + DSSE envelope.
- Stage timers: emit monotonic timestamps for each stage; roll up to TTE. Persist per‑artifact in Postgres (time‑series table by artifact_digest).
- Delta re‑eval daemon: on any feed/rule/policy change, re‑score the corpus referenced by that feed snapshot; log re‑classifications with cause; compute FN‑Drift daily.
- Replay harness: given a CSM, re‑run pipeline in sealed mode (no network, feeds from snapshot); recompute bundle hashes; assert equality.
- Proof bundle: tar/zip with canonical ordering; include SBOM slice, reachability graph, VEX, signatures, and an index.json (canonical). The bundle’s SHA256 is your public “proof hash.”
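The feeds Merkle root referenced by the CSM could be derived along these lines (a minimal sketch; sorted leaves and odd‑leaf promotion are assumptions, not the shipped scheme):

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaf_hashes: list) -> str:
    """Binary Merkle root over feed content hashes; odd leaves are promoted unchanged."""
    level = sorted(leaf_hashes)  # sort so the root is independent of feed enumeration order
    if not level:
        return _h(b"").hex()
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(_h(level[i] + level[i + 1]))
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # promote the unpaired leaf to the next level
        level = nxt
    return level[0].hex()

# Leaves are the content hashes of pinned feed snapshots (names are illustrative).
feeds = [_h(b"nvd-2025-01-01"), _h(b"osv-2025-01-01"), _h(b"ghsa-2025-01-01")]
root = merkle_root(feeds)
```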
What to put on dashboards & in SLAs
- TTE panel: P50/P95 by image size; stacked bars by stage; alerts when P95 breaches SLO.
- FN‑Drift panel: overall and by cause; red flag if engine‑caused drift > 0.1% in 30d.
- Repro panel: last 24h/7d replay pass rate (goal 100%); list any non‑deterministic modules.
Why this wins sales & audits
- Auditors: can pick any proof hash → replay from CSM → get the exact same signed outcome.
- Buyers: TTE proves speed; FN‑Drift proves stability and feed hygiene; Repro proves you’re not heuristic‑wobbly.
- Competitors: many can’t show deterministic replay or attribute drift causes—your “hash‑stable proofs” make that gap obvious.
Below is the implementation‑ready package for these three metrics: the exact PostgreSQL schema, .NET 10 types, and a nightly replay GitLab job that enforces Time‑to‑Evidence (TTE), False‑Negative Drift (FN‑Drift), and Deterministic Replayability out of the box.
It is written so mid‑level developers can drop it directly into Stella Ops without re‑architecting anything.
1. PostgreSQL Schema (Canonical, Deterministic, Normalized)
1.1 Table: scan_manifest
Immutable record describing exactly what was used for a scan.
CREATE TABLE scan_manifest (
manifest_id UUID PRIMARY KEY,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
artifact_digest TEXT NOT NULL,
feeds_merkle_root TEXT NOT NULL,
engine_build_hash TEXT NOT NULL,
policy_lattice_hash TEXT NOT NULL,
ruleset_hash TEXT NOT NULL,
config_flags JSONB NOT NULL,
environment_fingerprint JSONB NOT NULL,
raw_manifest JSONB NOT NULL,
raw_manifest_sha256 TEXT NOT NULL
);
Notes:
raw_manifest is the canonical JSON used for deterministic replay. raw_manifest_sha256 is the hash of the canonicalized JSON, not of the unformatted body.
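That canonicalize‑then‑hash rule can be mirrored in a few lines (a sketch assuming RFC 8785‑style key sorting; whitespace and key order must not affect the digest):

```python
import hashlib
import json

def canonical_sha256(manifest: dict) -> str:
    """Hash of the canonical JSON form: sorted keys, no insignificant whitespace, UTF-8."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Same content, different key order and formatting: must produce the same raw_manifest_sha256.
a = {"artifact_digest": "sha256:abc", "engine_build_hash": "sha256:def"}
b = {"engine_build_hash": "sha256:def", "artifact_digest": "sha256:abc"}
```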
1.2 Table: scan_execution
One execution corresponds to one run of the scanner with one manifest.
CREATE TABLE scan_execution (
execution_id UUID PRIMARY KEY,
manifest_id UUID NOT NULL REFERENCES scan_manifest(manifest_id) ON DELETE CASCADE,
started_at TIMESTAMPTZ NOT NULL,
finished_at TIMESTAMPTZ NOT NULL,
t_ingest_ms INT NOT NULL,
t_analyze_ms INT NOT NULL,
t_reachability_ms INT NOT NULL,
t_vex_ms INT NOT NULL,
t_sign_ms INT NOT NULL,
t_publish_ms INT NOT NULL,
proof_bundle_sha256 TEXT NOT NULL,
findings_sha256 TEXT NOT NULL,
vex_bundle_sha256 TEXT NOT NULL,
replay_mode BOOLEAN NOT NULL DEFAULT FALSE
);
Derived view for Time-to-Evidence:
CREATE VIEW scan_tte AS
SELECT
execution_id,
manifest_id,
(finished_at - started_at) AS tte_interval
FROM scan_execution;
1.3 Table: classification_history
Used for FN-Drift tracking.
CREATE TABLE classification_history (
id BIGSERIAL PRIMARY KEY,
artifact_digest TEXT NOT NULL,
manifest_id UUID NOT NULL REFERENCES scan_manifest(manifest_id) ON DELETE CASCADE,
execution_id UUID NOT NULL REFERENCES scan_execution(execution_id) ON DELETE CASCADE,
previous_status TEXT NOT NULL, -- unaffected | unknown | affected
new_status TEXT NOT NULL,
cause TEXT NOT NULL, -- engine_delta | feed_delta | ruleset_delta | policy_delta
changed_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
Materialized view for drift statistics:
CREATE MATERIALIZED VIEW fn_drift_stats AS
SELECT
date_trunc('day', changed_at) AS day_bucket,
COUNT(*) FILTER (WHERE new_status = 'affected') AS affected_count,
COUNT(*) AS total_reclassified,
ROUND(
(COUNT(*) FILTER (WHERE new_status = 'affected')::numeric /
NULLIF(COUNT(*), 0)) * 100, 4
) AS drift_percent
FROM classification_history
GROUP BY 1;
2. .NET 10 / C# Types (Deterministic, Hash-Stable)
The following record types map 1:1 to the DB entities and enforce the canonicalization rules.
2.1 CSM Structure
public sealed record CanonicalScanManifest
{
public required string ArtifactDigest { get; init; }
public required string FeedsMerkleRoot { get; init; }
public required string EngineBuildHash { get; init; }
public required string PolicyLatticeHash { get; init; }
public required string RulesetHash { get; init; }
public required IReadOnlyDictionary<string, string> ConfigFlags { get; init; }
public required EnvironmentFingerprint Environment { get; init; }
}
public sealed record EnvironmentFingerprint
{
public required string CpuModel { get; init; }
public required string RuntimeVersion { get; init; }
public required string Os { get; init; }
public required IReadOnlyDictionary<string, string> Extra { get; init; }
}
Deterministic canonical-JSON serializer
Your developers must generate stable, canonical JSON:
using System.IO;
using System.Text.Json;

internal static class CanonicalJson
{
private static readonly JsonSerializerOptions Options = new()
{
WriteIndented = false,
PropertyNamingPolicy = JsonNamingPolicy.CamelCase
};
public static string Serialize(object obj)
{
using var stream = new MemoryStream();
using (var writer = new Utf8JsonWriter(stream, new JsonWriterOptions
{
Indented = false,
SkipValidation = false
}))
{
JsonSerializer.Serialize(writer, obj, obj.GetType(), Options);
}
var bytes = stream.ToArray();
// Sort object keys alphabetically and array items in stable order.
// This step is mandatory to guarantee canonical form:
var canonical = JsonCanonicalizer.Canonicalize(bytes);
return canonical;
}
}
JsonCanonicalizer is your deterministic canonicalization engine (already referenced in other Stella Ops modules).
2.2 Execution record
public sealed record ScanExecutionMetrics
{
public required int IngestMs { get; init; }
public required int AnalyzeMs { get; init; }
public required int ReachabilityMs { get; init; }
public required int VexMs { get; init; }
public required int SignMs { get; init; }
public required int PublishMs { get; init; }
}
2.3 Replay harness entrypoint
public static class ReplayRunner
{
public static ReplayResult Replay(Guid manifestId, IScannerEngine engine)
{
var manifest = ManifestRepository.Load(manifestId);
var canonical = CanonicalJson.Serialize(manifest.RawObject);
var canonicalHash = Sha256(canonical);
if (canonicalHash != manifest.RawManifestSHA256)
throw new InvalidOperationException("Manifest integrity violation.");
using var feeds = FeedSnapshotResolver.Open(manifest.FeedsMerkleRoot);
var exec = engine.Scan(new ScanRequest
{
ArtifactDigest = manifest.ArtifactDigest,
Feeds = feeds,
LatticeHash = manifest.PolicyLatticeHash,
EngineBuildHash = manifest.EngineBuildHash,
CanonicalManifest = canonical
});
return new ReplayResult(
exec.FindingsHash == manifest.FindingsSHA256,
exec.VexBundleHash == manifest.VexBundleSHA256,
exec.ProofBundleHash == manifest.ProofBundleSHA256,
exec
);
}
}
Replay must run with:
- no network
- feeds resolved strictly from snapshots
- deterministic clock (monotonic timers only)
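The equality check the harness performs can be mirrored in a few lines (hypothetical field names, following ReplayRunner above):

```python
def replay_is_deterministic(recorded: dict, rerun: dict):
    """Compare the three recorded hashes against a sealed-mode re-run; return per-hash verdicts."""
    checks = {
        "findings": recorded["findings_sha256"] == rerun["findings_sha256"],
        "vex":      recorded["vex_bundle_sha256"] == rerun["vex_bundle_sha256"],
        "proof":    recorded["proof_bundle_sha256"] == rerun["proof_bundle_sha256"],
    }
    return all(checks.values()), checks

# Placeholder hash values; in practice these come from scan_execution rows.
recorded = {"findings_sha256": "f1", "vex_bundle_sha256": "v1", "proof_bundle_sha256": "p1"}
rerun_ok = dict(recorded)
rerun_bad = {**recorded, "proof_bundle_sha256": "p2"}  # a single drifted hash fails the replay

ok, _ = replay_is_deterministic(recorded, rerun_ok)
bad, detail = replay_is_deterministic(recorded, rerun_bad)
```

The per‑hash verdicts are what the nightly job should log, so a failure points at the non‑deterministic stage rather than just the bundle.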
3. GitLab CI Job for Nightly Deterministic Replay
replay-test:
  stage: test
  image: mcr.microsoft.com/dotnet/sdk:10.0
  script:
    - echo "Starting nightly deterministic replay"
    - mkdir -p replay-logs
    # 1. Export 200 random manifests from Postgres
    - |
      psql "$PG_CONN" -Atc "
        SELECT manifest_id
        FROM scan_manifest
        ORDER BY random()
        LIMIT 200
      " > manifests.txt
    # 2. Replay each manifest, capturing per-manifest output in replay-logs/
    - |
      while read -r mid; do
        echo "Replaying $mid"
        dotnet run --project src/StellaOps.Scanner.Replay \
          --manifest "$mid" > "replay-logs/$mid.log" 2>&1 || exit 1
      done < manifests.txt
    # 3. Fail the job if any replay reported non-determinism
    - |
      if grep -Rq "NON-DETERMINISTIC" replay-logs; then
        echo "Replay failures detected"
        exit 1
      else
        echo "All replays deterministic"
      fi
  artifacts:
    paths:
      - replay-logs/
    expire_in: 7 days
  only:
    - schedules
Replay job failure criteria:
- Any mismatch in findings/VEX/proof bundle hash
- Any non-canonical input or manifest discrepancy
- Any accidental feed/network access
4. Developer Rules (to be added to docs/stellaops-developer-rules.md)
- A scan is not valid unless the Canonical Scan Manifest (CSM) hash is stored.
- Every stage must emit monotonic timestamps for TTE. Do not mix monotonic and wall clock.
- Classification changes must always include a cause: no silent reclassification.
- Replay mode must never reach network, dynamic rules, cloud feeds, or external clocks.
- Proof bundles must be TAR with deterministic ordering: alphabetical filenames, fixed uid/gid=0, fixed mtime=0.
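A deterministic TAR writer following those rules, sketched in Python (the C# version would apply the same normalization; file names and contents here are placeholders):

```python
import hashlib
import io
import tarfile

def write_proof_bundle(files: dict) -> bytes:
    """Pack files into a TAR with alphabetical names, uid/gid=0, mtime=0:
    the same input set always produces byte-identical output."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):  # rule: alphabetical ordering
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.uid = info.gid = 0      # rule: fixed uid/gid=0
            info.uname = info.gname = ""
            info.mtime = 0               # rule: fixed mtime=0
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

bundle = {"index.json": b"{}", "vex.json": b"[]", "sbom.json": b"{}"}
h1 = hashlib.sha256(write_proof_bundle(bundle)).hexdigest()
# Insertion order must not matter; only the rules above do.
h2 = hashlib.sha256(write_proof_bundle(dict(reversed(list(bundle.items()))))).hexdigest()
```

The SHA‑256 of the resulting archive is then usable as the public “proof hash” described earlier.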
5. Ready for integration
If you want, I can produce:
- the full EF Core 9 mapping classes
- a migration file consistent with your existing Stella Ops module naming
- the Angular UI block that displays TTE, FN-Drift, and Replay statistics
- a deterministic TAR writer (C#) for proof bundles
Tell me which part you want next.