add advisories

This commit is contained in:
master
2025-12-09 18:45:57 +02:00
committed by StellaOps Bot
parent 199aaf74d8
commit 96e5646977
23 changed files with 9284 additions and 762 deletions

Here's a crisp, plug-in set of **reproducible benchmarks** you can bake into StellaOps so buyers, auditors, and your own team can see measurable wins, without hand-wavy heuristics.
# Benchmarks StellaOps should standardize
**1) Time-to-Evidence (TTE)**
How fast StellaOps turns a “suspicion” into a signed, auditor-usable proof (e.g., VEX + attestations).
* **Definition:** `TTE = t(proof_ready) - t(artifact_ingested)`
* **Scope:** scanning, reachability, policy evaluation, proof generation, notarization, and publication to your proof ledger.
* **Targets:**
  * *P50* < 2 min for typical container images (≤ 500 MB, known ecosystems).
  * *P95* < 5 min including cold-start/offline-bundle mode.
* **Report:** Median/P95 by artifact size bucket; break down stages (fetch → analyze → reachability → VEX → sign → publish).
* **Auditable logs:** DSSE/DSD signatures, policy hash, feed set IDs, scanner build hash.
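As an illustrative sketch (Python for brevity; the stage names mirror the pipeline above, and all records here are hypothetical), TTE per artifact is the sum of its stage durations, bucketed by image size for P50/P95 reporting:

```python
# Hypothetical per-artifact records: image size and per-stage durations (seconds).
scans = [
    {"size_mb": 120, "stages": {"fetch": 8,  "analyze": 40,  "reachability": 15,
                                "vex": 5,  "sign": 2, "publish": 3}},
    {"size_mb": 480, "stages": {"fetch": 30, "analyze": 110, "reachability": 45,
                                "vex": 9,  "sign": 2, "publish": 4}},
    {"size_mb": 900, "stages": {"fetch": 70, "analyze": 260, "reachability": 90,
                                "vex": 12, "sign": 3, "publish": 5}},
]

def tte_seconds(scan):
    # TTE = t(proof_ready) - t(artifact_ingested): here, the sum of all stages.
    return sum(scan["stages"].values())

def bucket(size_mb):
    return "<=500MB" if size_mb <= 500 else ">500MB"

def percentile(sorted_vals, p):
    # Nearest-rank percentile; good enough for dashboard reporting.
    idx = min(len(sorted_vals) - 1, int(p / 100 * len(sorted_vals)))
    return sorted_vals[idx]

by_bucket = {}
for s in scans:
    by_bucket.setdefault(bucket(s["size_mb"]), []).append(tte_seconds(s))

for name, ttes in sorted(by_bucket.items()):
    ttes.sort()
    print(name, "P50:", percentile(ttes, 50), "P95:", percentile(ttes, 95))
```

In production the durations would come from the stage timers described later, not from literals.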
**2) False-Negative Drift Rate (FN-Drift)**
Catches when a previously clean artifact later becomes affected because the world changed (new CVE, rule, or feed).
* **Definition (rolling 30-day window):**
`FN-Drift = (# artifacts reclassified from {unaffected/unknown} → affected) / (total artifacts re-evaluated)`
* **Stratify by cause:** feed delta, rule delta, lattice/policy delta, reachability delta.
* **Goal:** treat *feed-caused* FN-Drift as expected and surface it quickly via faster feed deltas (good), while keeping *engine-caused* FN-Drift near zero (stability).
* **Guardrails:** require **explanations** on reclassification: include a diff of feeds, rule versions, and the lattice policy commit.
* **Badge:** “No engine-caused FN drift in 90 days” (hash-linked evidence bundle).
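The rolling-window computation can be sketched as follows (Python for brevity; the record shape and sample events are hypothetical, but the cause labels mirror the stratification above):

```python
from collections import Counter

# Hypothetical reclassification events from one 30-day window.
reclassifications = [
    {"prev": "unaffected", "new": "affected",   "cause": "feed_delta"},
    {"prev": "unknown",    "new": "affected",   "cause": "feed_delta"},
    {"prev": "unaffected", "new": "affected",   "cause": "engine_delta"},
    {"prev": "affected",   "new": "unaffected", "cause": "policy_delta"},  # not FN drift
]

def fn_drift(records, total_reevaluated):
    """FN-Drift = {unaffected, unknown} -> affected transitions / total re-evals,
    plus a per-cause breakdown for the stratified view."""
    fn = [r for r in records
          if r["prev"] in ("unaffected", "unknown") and r["new"] == "affected"]
    by_cause = Counter(r["cause"] for r in fn)
    return len(fn) / total_reevaluated, dict(by_cause)

rate, causes = fn_drift(reclassifications, total_reevaluated=1000)
print(f"FN-Drift: {rate:.2%}", causes)
```

Note the denominator: all artifacts re-evaluated in the window, not just the reclassified ones.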
**3) Deterministic Rescan Reproducibility (Hash-Stable Proofs)**
Same inputs → same outputs, byte for byte, including proofs. Crucial for audits and regulated buyers.
* **Definition:**
Given a **scan manifest** (artifact digest, feed snapshots, engine build hash, lattice/policy hash), a rescan must produce **identical**: findings set, VEX decisions, proofs, and top-level bundle hash.
* **Metric:** `Repro rate = identical_outputs / total_replays` (target 100%).
* **Proof object:**
```
{
    artifact_digest,
    scan_manifest_hash,
    feeds_merkle_root,
    engine_build_hash,
    policy_lattice_hash,
    findings_sha256,
    vex_bundle_sha256,
    proof_bundle_sha256
}
```
* **CI check:** nightly replay of a fixed corpus; fail the pipeline on any non-determinism (with a diff).
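A minimal comparison sketch for the replay check (Python for brevity; the field names follow the proof object above, and the sample hashes are placeholders):

```python
def replay_diff(original, replayed):
    """Fields whose hashes differ between the original run and its replay."""
    keys = ("findings_sha256", "vex_bundle_sha256", "proof_bundle_sha256")
    return [k for k in keys if original.get(k) != replayed.get(k)]

def repro_rate(replays):
    """replays: iterable of (original, replayed) proof-object pairs."""
    replays = list(replays)
    identical = sum(1 for o, r in replays if not replay_diff(o, r))
    return identical / len(replays)

orig  = {"findings_sha256": "f1", "vex_bundle_sha256": "v1", "proof_bundle_sha256": "p1"}
clean = dict(orig)
drift = dict(orig, proof_bundle_sha256="p2")  # a non-deterministic module changed output

print(replay_diff(orig, drift))   # names the offending bundle for the CI diff
print(repro_rate([(orig, clean), (orig, drift)]))
```

Emitting the per-field diff, not just a pass/fail bit, is what lets the nightly job point at the non-deterministic module.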
# Minimal implementation plan (developer-ready)
* **Canonical Scan Manifest (CSM):** immutable JSON (canonicalized), covering: artifact digests; feed URIs + content hashes; engine build + ruleset hashes; lattice/policy hash; config flags; environment fingerprint (CPU features, locale). Store the CSM + DSSE envelope.
* **Stage timers:** emit monotonic timestamps for each stage; roll up to TTE. Persist per artifact in Postgres (time-series table keyed by artifact_digest).
* **Delta re-eval daemon:** on any feed/rule/policy change, re-score the corpus referenced by that feed snapshot; log reclassifications with cause; compute FN-Drift daily.
* **Replay harness:** given a CSM, re-run the pipeline in sealed mode (no network, feeds from snapshot); recompute bundle hashes; assert equality.
* **Proof bundle:** tar/zip with canonical ordering; include the SBOM slice, reachability graph, VEX, signatures, and an index.json (canonical). The bundle's SHA-256 is your public “proof hash.”
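The stage-timer bullet above can be sketched like this (Python for brevity; the stage names are placeholders): durations come from the monotonic clock, so wall-clock adjustments can never skew TTE:

```python
import time
from contextlib import contextmanager

class StageTimer:
    """Collects per-stage durations; TTE is their roll-up."""
    def __init__(self):
        self.durations_ms = {}

    @contextmanager
    def stage(self, name):
        start = time.monotonic_ns()  # monotonic: immune to NTP/wall-clock jumps
        try:
            yield
        finally:
            self.durations_ms[name] = (time.monotonic_ns() - start) // 1_000_000

    def tte_ms(self):
        return sum(self.durations_ms.values())

timer = StageTimer()
with timer.stage("fetch"):
    pass  # fetch/ingest work would run here
with timer.stage("analyze"):
    pass  # analysis work would run here
print(timer.durations_ms, timer.tte_ms())
```

Wall-clock timestamps are still recorded separately for audit logs, but never mixed into the duration math.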
# What to put on dashboards & in SLAs
* **TTE panel:** P50/P95 by image size; stacked bars by stage; alerts when P95 breaches SLO.
* **FN-Drift panel:** overall and by cause; red-flag if engine-caused drift > 0.1% in 30 days.
* **Repro panel:** last 24 h/7 d replay pass rate (goal 100%); list any non-deterministic modules.
# Why this wins sales & audits
* **Auditors:** can pick any proof hash → replay from the CSM → get the exact same signed outcome.
* **Buyers:** TTE proves speed; FN-Drift proves stability and feed hygiene; Repro proves you're not heuristic-wobbly.
* **Competitors:** many can't show deterministic replay or attribute drift causes; your “hash-stable proofs” make that gap obvious.
If you want, I can generate the exact **PostgreSQL schema**, **.NET 10 structs**, and a **nightly replay GitLab job** that enforce these three metrics out of the box.
Below is the complete, implementation-ready package you asked for: PostgreSQL schema, .NET 10 types, and a CI replay job for the three Stella Ops benchmarks: Time-to-Evidence (TTE), False-Negative Drift (FN-Drift), and Deterministic Replayability.
This is written so your mid-level developers can drop it directly into Stella Ops without re-architecting anything.
---
# 1. PostgreSQL Schema (Canonical, Deterministic, Normalized)
## 1.1 Table: scan_manifest
Immutable record describing exactly what was used for a scan.
```sql
CREATE TABLE scan_manifest (
    manifest_id UUID PRIMARY KEY,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    artifact_digest TEXT NOT NULL,
    feeds_merkle_root TEXT NOT NULL,
    engine_build_hash TEXT NOT NULL,
    policy_lattice_hash TEXT NOT NULL,
    ruleset_hash TEXT NOT NULL,
    config_flags JSONB NOT NULL,
    environment_fingerprint JSONB NOT NULL,
    raw_manifest JSONB NOT NULL,
    raw_manifest_sha256 TEXT NOT NULL
);
```
Notes:
* `raw_manifest` is the canonical JSON used for deterministic replay.
* `raw_manifest_sha256` is the canonicalized-JSON hash, not a hash of the unformatted body.
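To illustrate why canonicalization must precede hashing (Python sketch; `json.dumps` with sorted keys stands in for the project's own canonicalizer):

```python
import hashlib
import json

def canonical_sha256(manifest: dict) -> str:
    # Sorted keys + compact separators: semantically equal manifests hash equally.
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

a = {"artifact_digest": "sha256:ab12", "engine_build_hash": "e1"}
b = {"engine_build_hash": "e1", "artifact_digest": "sha256:ab12"}  # same content, reordered

assert canonical_sha256(a) == canonical_sha256(b)
# Hashing the unformatted bodies instead would yield two different digests:
assert hashlib.sha256(str(a).encode()).hexdigest() != hashlib.sha256(str(b).encode()).hexdigest()
```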
---
## 1.2 Table: scan_execution
One execution corresponds to one run of the scanner with one manifest.
```sql
CREATE TABLE scan_execution (
    execution_id UUID PRIMARY KEY,
    manifest_id UUID NOT NULL REFERENCES scan_manifest(manifest_id) ON DELETE CASCADE,
    started_at TIMESTAMPTZ NOT NULL,
    finished_at TIMESTAMPTZ NOT NULL,
    t_ingest_ms INT NOT NULL,
    t_analyze_ms INT NOT NULL,
    t_reachability_ms INT NOT NULL,
    t_vex_ms INT NOT NULL,
    t_sign_ms INT NOT NULL,
    t_publish_ms INT NOT NULL,
    proof_bundle_sha256 TEXT NOT NULL,
    findings_sha256 TEXT NOT NULL,
    vex_bundle_sha256 TEXT NOT NULL,
    replay_mode BOOLEAN NOT NULL DEFAULT FALSE
);
```
Derived view for Time-to-Evidence:
```sql
CREATE VIEW scan_tte AS
SELECT
    execution_id,
    manifest_id,
    (finished_at - started_at) AS tte_interval
FROM scan_execution;
```
---
## 1.3 Table: classification_history
Used for FN-Drift tracking.
```sql
CREATE TABLE classification_history (
    id BIGSERIAL PRIMARY KEY,
    artifact_digest TEXT NOT NULL,
    manifest_id UUID NOT NULL REFERENCES scan_manifest(manifest_id) ON DELETE CASCADE,
    execution_id UUID NOT NULL REFERENCES scan_execution(execution_id) ON DELETE CASCADE,
    previous_status TEXT NOT NULL, -- unaffected | unknown | affected
    new_status TEXT NOT NULL,
    cause TEXT NOT NULL,           -- engine_delta | feed_delta | ruleset_delta | policy_delta
    changed_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
```
Materialized view for drift statistics (note: the denominator below counts reclassifications only; to match the FN-Drift definition, divide instead by all artifacts re-evaluated in the window, e.g. via a join on scan_execution):
```sql
CREATE MATERIALIZED VIEW fn_drift_stats AS
SELECT
    date_trunc('day', changed_at) AS day_bucket,
    COUNT(*) FILTER (WHERE new_status = 'affected') AS affected_count,
    COUNT(*) AS total_reclassified,
    ROUND(
        (COUNT(*) FILTER (WHERE new_status = 'affected')::numeric /
         NULLIF(COUNT(*), 0)) * 100, 4
    ) AS drift_percent
FROM classification_history
GROUP BY 1;
```
---
# 2. .NET 10 / C# Types (Deterministic, Hash-Stable)
The following structs map 1:1 to the DB entities and enforce canonicalization rules.
## 2.1 CSM Structure
```csharp
public sealed record CanonicalScanManifest
{
    public required string ArtifactDigest { get; init; }
    public required string FeedsMerkleRoot { get; init; }
    public required string EngineBuildHash { get; init; }
    public required string PolicyLatticeHash { get; init; }
    public required string RulesetHash { get; init; }
    public required IReadOnlyDictionary<string, string> ConfigFlags { get; init; }
    public required EnvironmentFingerprint Environment { get; init; }
}

public sealed record EnvironmentFingerprint
{
    public required string CpuModel { get; init; }
    public required string RuntimeVersion { get; init; }
    public required string Os { get; init; }
    public required IReadOnlyDictionary<string, string> Extra { get; init; }
}
```
### Deterministic canonical-JSON serializer
Your developers must generate stable, canonical JSON:
```csharp
internal static class CanonicalJson
{
    private static readonly JsonSerializerOptions Options = new()
    {
        WriteIndented = false,
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase
    };

    public static string Serialize(object obj)
    {
        using var stream = new MemoryStream();
        using (var writer = new Utf8JsonWriter(stream, new JsonWriterOptions
        {
            Indented = false,
            SkipValidation = false
        }))
        {
            JsonSerializer.Serialize(writer, obj, obj.GetType(), Options);
        }
        var bytes = stream.ToArray();

        // Sort object keys alphabetically; array element order is preserved,
        // since order inside arrays is semantically significant.
        // This step is mandatory to guarantee canonical form:
        var canonical = JsonCanonicalizer.Canonicalize(bytes);
        return canonical;
    }
}
```
`JsonCanonicalizer` is your deterministic canonicalization engine (already referenced in other Stella Ops modules).
---
## 2.2 Execution record
```csharp
public sealed record ScanExecutionMetrics
{
    public required int IngestMs { get; init; }
    public required int AnalyzeMs { get; init; }
    public required int ReachabilityMs { get; init; }
    public required int VexMs { get; init; }
    public required int SignMs { get; init; }
    public required int PublishMs { get; init; }
}
```
---
## 2.3 Replay harness entrypoint
```csharp
public static class ReplayRunner
{
    public static ReplayResult Replay(Guid manifestId, IScannerEngine engine)
    {
        var manifest = ManifestRepository.Load(manifestId);

        var canonical = CanonicalJson.Serialize(manifest.RawObject);
        var canonicalHash = Sha256(canonical);

        if (canonicalHash != manifest.RawManifestSHA256)
            throw new InvalidOperationException("Manifest integrity violation.");

        using var feeds = FeedSnapshotResolver.Open(manifest.FeedsMerkleRoot);

        var exec = engine.Scan(new ScanRequest
        {
            ArtifactDigest = manifest.ArtifactDigest,
            Feeds = feeds,
            LatticeHash = manifest.PolicyLatticeHash,
            EngineBuildHash = manifest.EngineBuildHash,
            CanonicalManifest = canonical
        });

        return new ReplayResult(
            exec.FindingsHash == manifest.FindingsSHA256,
            exec.VexBundleHash == manifest.VexBundleSHA256,
            exec.ProofBundleHash == manifest.ProofBundleSHA256,
            exec
        );
    }
}
```
Replay must run with:
* no network
* feeds resolved strictly from snapshots
* deterministic clock (monotonic timers only)
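One way to enforce the no-network constraint (Python sketch; a hypothetical guard for illustration, not the Stella Ops implementation) is to fail fast on any socket creation during replay:

```python
import socket

class SealedMode:
    """Replay guard: any attempt to open a socket raises immediately."""
    def __enter__(self):
        self._original_init = socket.socket.__init__
        def deny(*args, **kwargs):
            raise RuntimeError("network access attempted in sealed replay mode")
        socket.socket.__init__ = deny
        return self

    def __exit__(self, *exc):
        socket.socket.__init__ = self._original_init
        return False

with SealedMode():
    try:
        socket.socket()  # any feed fetch or DNS lookup would trip this
    except RuntimeError as err:
        print(err)
```

A process-level seal (e.g. a network namespace or firewall rule) is stronger; an in-process guard like this mainly catches accidental feed/network calls early in CI.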
---
# 3. GitLab CI Job for Nightly Deterministic Replay
```yaml
replay-test:
  stage: test
  image: mcr.microsoft.com/dotnet/sdk:10.0
  script:
    - echo "Starting nightly deterministic replay"
    # 1. Export 200 random manifests from Postgres
    - |
      psql "$PG_CONN" -Atc "
        SELECT manifest_id
        FROM scan_manifest
        ORDER BY random()
        LIMIT 200
      " > manifests.txt
    # 2. Replay each manifest
    - |
      while read -r mid; do
        echo "Replaying $mid"
        dotnet run --project src/StellaOps.Scanner.Replay \
          --manifest "$mid" || exit 1
      done < manifests.txt
    # 3. Aggregate results
    - |
      if grep -R "NON-DETERMINISTIC" replay-logs; then
        echo "Replay failures detected"
        exit 1
      else
        echo "All replays deterministic"
      fi
  artifacts:
    paths:
      - replay-logs/
    expire_in: 7 days
  only:
    - schedules
```
Replay job failure criteria:
* Any mismatch in findings/VEX/proof bundle hash
* Any non-canonical input or manifest discrepancy
* Any accidental feed/network access
---
# 4. Developer Rules (Should be added to docs/stellaops-developer-rules.md)
1. A scan is not valid unless the Canonical Scan Manifest (CSM) hash is stored.
2. Every stage must emit monotonic timestamps for TTE. Do not mix monotonic and wall clock.
3. Classification changes must always include a cause: no silent reclassification.
4. Replay mode must never reach network, dynamic rules, cloud feeds, or external clocks.
5. Proof bundles must be TAR with deterministic ordering: alphabetical filenames, fixed uid/gid=0, fixed mtime=0.
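Rule 5 can be sketched in a few lines (Python for brevity; the layout choices match the rule: alphabetical names, uid/gid 0, mtime 0):

```python
import hashlib
import io
import tarfile

def deterministic_tar(files: dict) -> bytes:
    """Pack {name: bytes} so identical inputs always yield identical tar bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):           # alphabetical ordering
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.uid = info.gid = 0          # fixed ownership
            info.uname = info.gname = ""
            info.mtime = 0                   # fixed timestamp
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

a = deterministic_tar({"b.txt": b"2", "a.txt": b"1"})
b = deterministic_tar({"a.txt": b"1", "b.txt": b"2"})
assert a == b  # insertion order does not leak into the proof hash
print(hashlib.sha256(a).hexdigest()[:16])
```

Avoid gzip-wrapping unless the gzip header timestamp is also pinned, since it would otherwise reintroduce non-determinism.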
---
# 5. Ready for integration
If you want, I can produce:
* the full EF Core 9 mapping classes
* a migration file consistent with your existing Stella Ops module naming
* the Angular UI block that displays TTE, FN-Drift, and Replay statistics
* a deterministic TAR writer (C#) for proof bundles
Tell me which part you want next.