Here's a practical, first-timer-friendly guide to using VEX in StellaOps, plus a concrete .NET pattern you can drop in today.
---
# VEX in a nutshell
* **VEX (Vulnerability Exploitability eXchange)**: a small JSON document that says whether specific CVEs *actually* affect a product/version.
* **OpenVEX**: SBOM-agnostic; references products/components directly (URIs, PURLs, hashes). Great for canonical internal models.
* **CycloneDX VEX / SPDX VEX**: tie VEX statements closely to a specific SBOM instance (component BOM ref IDs). Great when the BOM is your source of truth.
**Our strategy:**
* **Store VEX separately** from SBOMs (deterministic, easier air-gap bundling).
* **Link by strong references** (PURLs + content hashes + optional SBOM component IDs).
* **Translate on ingest** between OpenVEX ↔ CycloneDX VEX as needed so downstream tools stay happy.
---
# Translation model (OpenVEX ↔ CycloneDX VEX)
1. **Identity mapping**
   * Prefer **PURL** for packages; fall back to **SHA-256 (or SHA-512)** of the artifact; optionally include the **SBOM `bom-ref`** if known.
2. **Product scope**
   * OpenVEX “product” → CycloneDX `affects` with `bom-ref` (if available) or a synthetic ref derived from PURL/hash.
3. **Status mapping**
   * `affected | not_affected | under_investigation | fixed` map 1:1 (see the mapping sketch below).
   * Keep **timestamps**, **justification**, **impact statement**, and **origin**.
4. **Evidence**
   * Preserve links to advisories, commits, tests; attach as CycloneDX `analysis/evidence` notes (or OpenVEX `metadata/notes`).
**Collision rules (deterministic):**
* A new statement wins if it has:
  * a newer `timestamp`, **and**
  * higher **provenance trust** (signed by the vendor/Authority), or equal trust with a lexicographic tie-break on the issuer key ID.
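To make the status mapping in step 3 concrete, here is a minimal C# sketch. The CycloneDX analysis-state names are an assumption based on the usual impact-analysis vocabulary; verify them against the schema version you target.
```csharp
using System;

// Sketch only: maps internal VEX statuses to CycloneDX analysis states and back.
// The CycloneDX state strings below are assumptions; check the target schema version.
public enum VexStatus { Affected, NotAffected, UnderInvestigation, Fixed }

public static class VexStatusMap
{
    public static string ToCycloneDx(VexStatus status) => status switch
    {
        VexStatus.Affected           => "exploitable",
        VexStatus.NotAffected        => "not_affected",
        VexStatus.UnderInvestigation => "in_triage",
        VexStatus.Fixed              => "resolved",
        _ => throw new ArgumentOutOfRangeException(nameof(status))
    };

    public static VexStatus FromCycloneDx(string state) => state switch
    {
        "exploitable"  => VexStatus.Affected,
        "not_affected" => VexStatus.NotAffected,
        "in_triage"    => VexStatus.UnderInvestigation,
        "resolved"     => VexStatus.Fixed,
        _ => throw new ArgumentException($"Unknown analysis state: {state}", nameof(state))
    };
}
```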
---
# Storage model (MongoDBfriendly)
* **Collections**
  * `vex.documents`: one document per VEX file (OpenVEX or CycloneDX VEX).
  * `vex.statements`: *flattened*, one per (product/component, vuln) pair.
  * `artifacts`: canonical component index (PURL, hashes, optional SBOM refs).
* **Reference keys** (see the helper sketch after this list)
  * `artifactKey = purl || sha256 || (groupId:name:version for .NET/NuGet)`
  * `vulnKey = cveId || ghsaId || internalId`
* **Deterministic IDs**
  * `_id = sha256(canonicalize(statement-json-without-signature))`
* **Signatures**
  * Keep DSSE/Sigstore envelopes in `vex.documents.signatures[]` for audit & replay.
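A tiny illustration of the reference-key fallback chains above (hypothetical helpers, not an existing API):
```csharp
using System;

// Hypothetical helpers showing the reference-key fallback order described above.
static string ArtifactKey(string? purl, string? sha256, string? coordinate) =>
    purl
    ?? (sha256 is not null ? $"sha256:{sha256}" : null)
    ?? coordinate   // e.g. a groupId:name:version-style string
    ?? throw new ArgumentException("An artifact needs at least one identifier.");

static string VulnKey(string? cveId, string? ghsaId, string? internalId) =>
    cveId ?? ghsaId ?? internalId
    ?? throw new ArgumentException("A vulnerability needs at least one identifier.");
```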
---
# Air-gap bundling
Package **SBOMs + VEX + artifacts index + trust roots** as a single tarball:
```
/bundle/
  sboms/*.json
  vex/*.json              # OpenVEX & CycloneDX VEX allowed
  index/artifacts.jsonl   # purl, hashes, bom-ref map
  trust/rekor.merkle.roots
  trust/fulcio.certs.pem
  trust/keys/*.pub
  manifest.json           # content list + sha256 + issuedAt
```
* **Deterministic replay:** re-ingest is a pure function of the bundle bytes → identical DB state.
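A hypothetical `manifest.json` shape to make the bundle layout concrete (the file names and field names are illustrative, not a fixed schema):
```json
{
  "issuedAt": "2025-01-01T00:00:00Z",
  "contents": [
    { "path": "sboms/app.cdx.json",       "sha256": "<hex digest>" },
    { "path": "vex/openvex-2025-01.json", "sha256": "<hex digest>" },
    { "path": "index/artifacts.jsonl",    "sha256": "<hex digest>" }
  ],
  "trustRoots": [
    "trust/rekor.merkle.roots",
    "trust/fulcio.certs.pem",
    "trust/keys/release.pub"
  ]
}
```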
---
# .NET 10 implementation (C#): deterministic ingestion
### Core models
```csharp
public record ArtifactRef(
    string? Purl,
    string? Sha256,
    string? BomRef);

public enum VexStatus { Affected, NotAffected, UnderInvestigation, Fixed }

public record VexStatement(
    string StatementId,          // sha256 of canonical payload
    ArtifactRef Artifact,
    string VulnId,               // e.g., "CVE-2024-1234"
    VexStatus Status,
    string? Justification,
    string? ImpactStatement,
    DateTimeOffset Timestamp,
    string IssuerKeyId,          // from DSSE/signing
    int ProvenanceScore);        // Authority policy
```
### Canonicalizer (stable order, no env fields)
```csharp
static string Canonicalize(VexStatement s)
{
    var payload = new {
        artifact = new { s.Artifact.Purl, s.Artifact.Sha256, s.Artifact.BomRef },
        vulnId = s.VulnId,
        status = s.Status.ToString(),
        justification = s.Justification,
        impact = s.ImpactStatement,
        timestamp = s.Timestamp.UtcDateTime
    };
    // System.Text.Json serializes anonymous-type members in declaration order,
    // so the property order above is stable across runs.
    var opts = new System.Text.Json.JsonSerializerOptions {
        WriteIndented = false
    };
    string json = System.Text.Json.JsonSerializer.Serialize(payload, opts);
    // Normalize Unicode (NFKC) and line endings so hashes stay byte-stable
    json = json.Normalize(System.Text.NormalizationForm.FormKC).Replace("\r\n", "\n");
    return json;
}

static string Sha256(string s)
{
    using var sha = System.Security.Cryptography.SHA256.Create();
    var bytes = sha.ComputeHash(System.Text.Encoding.UTF8.GetBytes(s));
    return Convert.ToHexString(bytes).ToLowerInvariant();
}
```
### Ingest pipeline
```csharp
public sealed class VexIngestor
{
    readonly IVexParser _parser;        // OpenVEX & CycloneDX adapters
    readonly IArtifactIndex _artifacts;
    readonly IVexRepo _repo;            // Mongo-backed
    readonly IPolicy _policy;           // tie-break rules

    public async Task IngestAsync(Stream vexJson, SignatureEnvelope? sig)
    {
        var doc = await _parser.ParseAsync(vexJson); // yields normalized statements
        var issuer = sig?.KeyId ?? "unknown";
        foreach (var st in doc.Statements)
        {
            var canon = Canonicalize(st);
            var id = Sha256(canon);
            var withMeta = st with {
                StatementId = id,
                IssuerKeyId = issuer,
                ProvenanceScore = _policy.Score(sig, st)
            };
            // Upsert artifact (purl/hash/bomRef)
            await _artifacts.UpsertAsync(withMeta.Artifact);
            // Deterministic merge
            var existing = await _repo.GetAsync(id)
                ?? await _repo.FindByKeysAsync(withMeta.Artifact, st.VulnId);
            if (existing is null || _policy.IsNewerAndStronger(existing, withMeta))
                await _repo.UpsertAsync(withMeta);
        }
        if (sig is not null) await _repo.AttachSignatureAsync(doc.DocumentId, sig);
    }
}
```
### Parsers (adapters)
* `OpenVexParser`: reads OpenVEX and emits `VexStatement` with `ArtifactRef` (PURL/hash).
* `CycloneDxVexParser`: resolves `bom-ref` → PURL/hash via `IArtifactIndex` (if the SBOM is present); otherwise stores the `bom-ref` and marks the artifact unresolved for later backfill.
---
# Why this works for StellaOps
* **SBOM-agnostic core** (OpenVEX-first) maps cleanly to your MongoDB canonical stores and `.NET 10` services.
* **SBOM-aware edges** (CycloneDX VEX) are still supported via adapters and `bom-ref` backfill.
* **Deterministic everything**: canonical JSON → SHA-256 IDs → reproducible merges → perfect for audits and offline environments.
* **Air-gap ready**: single bundle with trust roots, replayable on any node.
---
# Next steps (plug-and-play)
1. Implement the two parsers (`OpenVexParser`, `CycloneDxVexParser`).
2. Add the repo/index interfaces to your `StellaOps.Vexer` service:
* `IVexRepo` (Mongo collections `vex.documents`, `vex.statements`)
* `IArtifactIndex` (your canonical PURL/hash map)
3. Wire `Policy` to Authority to score signatures and apply tiebreaks.
4. Add a `bundle ingest` CLI: `vexer ingest /bundle/manifest.json`.
5. Expose GraphQL (HotChocolate) queries:
* `vexStatements(artifactKey, vulnId)`, `vexStatus(artifactKey)`, `evidence(...)`.
If you want, I can generate the exact Mongo schemas, HotChocolate types, and a minimal test bundle to validate the ingest end-to-end.
Below is a complete, developer-ready implementation plan for the **VEX ingestion, translation, canonicalization, storage, and merge-policy pipeline** inside **StellaOps.Vexer**, aligned with your architecture, deterministic requirements, MongoDB model, DSSE/Authority workflow, and `.NET 10` standards.
This is structured so an average developer can follow it step-by-step without ambiguity.
It is broken into phases, each with clear tasks, acceptance criteria, failure modes, interfaces, and code pointers.
---
# StellaOps.Vexer
## Full Implementation Plan (Developer-Executable)
---
# 1. Core Objectives
Develop a deterministic, replayable, SBOM-agnostic but SBOM-compatible VEX subsystem supporting:
* OpenVEX and CycloneDX VEX ingestion.
* Canonicalization → SHA-256 identity.
* Cross-linking to artifacts (purl, hash, bom-ref).
* Merge policies driven by Authority trust/lattice rules.
* Complete offline reproducibility.
* MongoDB canonical storage.
* Exposed through gRPC/REST/GraphQL.
---
# 2. Module Structure (to be implemented)
```
src/StellaOps.Vexer/
  Application/
    Commands/
    Queries/
    Ingest/
    Translation/
    Merge/
    Policies/
  Domain/
    Entities/
    ValueObjects/
    Services/
  Infrastructure/
    Mongo/
    AuthorityClient/
    Hashing/
    Signature/
    BlobStore/
  Presentation/
    GraphQL/
    REST/
    gRPC/
```
Every subfolder must compile in strict mode (treat warnings as errors).
---
# 3. Data Model (MongoDB)
## 3.1 `vex.statements` collection
Document schema:
```json
{
  "_id": "sha256(canonical-json)",
  "artifact": {
    "purl": "pkg:nuget/... or null",
    "sha256": "hex or null",
    "bomRef": "optional ref",
    "resolved": true | false
  },
  "vulnId": "CVE-XXXX-YYYY",
  "status": "affected | not_affected | under_investigation | fixed",
  "justification": "...",
  "impact": "...",
  "timestamp": "2024-01-01T12:34:56Z",
  "issuerKeyId": "FULCIO-KEY-ID",
  "provenanceScore": 0-100,
  "documentId": "UUID of vex.documents entry",
  "sourceFormat": "openvex|cyclonedx",
  "createdAt": "...",
  "updatedAt": "..."
}
```
## 3.2 `vex.documents` collection
```
{
  "_id": "<uuid>",
  "format": "openvex|cyclonedx",
  "rawBlobId": "<blob-id in blobstore>",
  "signatures": [
    {
      "type": "dsse",
      "verified": true,
      "issuerKeyId": "F-123...",
      "timestamp": "...",
      "bundleEvidence": {...}
    }
  ],
  "ingestedAt": "...",
  "statementIds": ["sha256-1", "sha256-2", ...]
}
```
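Lookups by artifact and vulnerability should be index-backed. A sketch using the official MongoDB .NET driver (field paths follow the schemas above; treat the exact index set as a starting point, not a prescription):
```csharp
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

// Sketch: secondary indexes for vex.statements. The _id (canonical SHA-256) is the primary key.
public static class VexIndexes
{
    public static Task EnsureAsync(IMongoDatabase database)
    {
        var statements = database.GetCollection<BsonDocument>("vex.statements");
        return statements.Indexes.CreateManyAsync(new[]
        {
            // Find all statements for an artifact (by PURL or by content hash) and vulnerability.
            new CreateIndexModel<BsonDocument>(
                Builders<BsonDocument>.IndexKeys.Ascending("artifact.purl").Ascending("vulnId")),
            new CreateIndexModel<BsonDocument>(
                Builders<BsonDocument>.IndexKeys.Ascending("artifact.sha256").Ascending("vulnId")),
            // Find statements still waiting for bom-ref backfill.
            new CreateIndexModel<BsonDocument>(
                Builders<BsonDocument>.IndexKeys.Ascending("artifact.resolved"))
        });
    }
}
```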
---
# 4. Components to Implement
## 4.1 Parsing Layer
### Interfaces
```csharp
public interface IVexParser
{
    ValueTask<ParsedVexDocument> ParseAsync(Stream jsonStream);
}

public sealed record ParsedVexDocument(
    string DocumentId,
    string Format,
    IReadOnlyList<ParsedVexStatement> Statements);
```
### Tasks
1. Implement `OpenVexParser` (see the sketch after this list).
   * Use System.Text.Json source generators.
   * Validate the OpenVEX schema version.
   * Extract the product → component mapping.
   * Map to the internal `ArtifactRef`.
2. Implement `CycloneDxVexParser`.
   * Support the 1.5+ “vex” extension.
   * Resolve `bom-ref` through `IArtifactIndex`.
   * Mark unresolved `bom-ref`s but store them.
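A sketch of the extraction logic for `OpenVexParser`, reusing the `VexStatement`/`ArtifactRef` records from the earlier section and returning statements directly rather than the full `ParsedVexDocument` wrapper. The field names (`statements`, `vulnerability`, `products`, `status`, `justification`, `timestamp`, `@id`) follow the common OpenVEX layout; validate them against the spec version you actually ingest.
```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;
using System.Threading.Tasks;

// Sketch: pulls statements out of an OpenVEX document. Schema validation,
// source-generated models, and ParsedVexDocument wrapping are omitted.
public sealed class OpenVexParser
{
    public async ValueTask<IReadOnlyList<VexStatement>> ParseAsync(Stream jsonStream)
    {
        using var doc = await JsonDocument.ParseAsync(jsonStream);
        var results = new List<VexStatement>();

        foreach (var st in doc.RootElement.GetProperty("statements").EnumerateArray())
        {
            // "vulnerability" may be an object with a "name" or a plain string, depending on spec version.
            var vulnEl = st.GetProperty("vulnerability");
            string vulnId = vulnEl.ValueKind == JsonValueKind.Object
                ? vulnEl.GetProperty("name").GetString()!
                : vulnEl.GetString()!;

            string status = st.GetProperty("status").GetString()!;
            string? justification = st.TryGetProperty("justification", out var j) ? j.GetString() : null;
            DateTimeOffset ts = st.TryGetProperty("timestamp", out var t)
                ? t.GetDateTimeOffset()
                : DateTimeOffset.MinValue;

            // One internal statement per product reference ("@id" is typically a PURL or IRI).
            foreach (var product in st.GetProperty("products").EnumerateArray())
            {
                string? productId = product.ValueKind == JsonValueKind.Object
                    ? product.GetProperty("@id").GetString()
                    : product.GetString();

                results.Add(new VexStatement(
                    StatementId: "",                        // filled in after canonicalization
                    Artifact: new ArtifactRef(productId, Sha256: null, BomRef: null),
                    VulnId: vulnId,
                    Status: ParseStatus(status),
                    Justification: justification,
                    ImpactStatement: null,
                    Timestamp: ts,
                    IssuerKeyId: "",                        // filled in from the DSSE envelope
                    ProvenanceScore: 0));
            }
        }
        return results;
    }

    static VexStatus ParseStatus(string s) => s switch
    {
        "affected"            => VexStatus.Affected,
        "not_affected"        => VexStatus.NotAffected,
        "under_investigation" => VexStatus.UnderInvestigation,
        "fixed"               => VexStatus.Fixed,
        _ => throw new ArgumentException($"Unknown VEX status: {s}")
    };
}
```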
### Acceptance Criteria
* Both parsers produce identical internal representation of statements.
* Unknown fields must not corrupt canonicalization.
* 100% deterministic mapping for same input.
---
## 4.2 Canonicalizer
Implement deterministic ordering, UTF-8 normalization, stable JSON.
### Tasks
1. Create `Canonicalizer` class.
2. Apply:
* Property order: artifact, vulnId, status, justification, impact, timestamp.
* Remove optional metadata (issuerKeyId, provenance).
* Normalize Unicode → NFKC.
* Replace CRLF → LF.
3. Generate SHA-256.
### Interface
```csharp
public interface IVexCanonicalizer
{
    string Canonicalize(VexStatement s);
    string ComputeId(string canonicalJson);
}
```
### Acceptance Criteria
* Hashes are identical regardless of OS, time, locale, or machine.
* Replaying the same bundle yields same `_id`.
---
## 4.3 Authority / Signature Verification
### Tasks
1. Implement a DSSE envelope reader (see the sketch below the interface).
2. Integrate the Authority client:
   * Verify the certificate chain (Fulcio/GOST/eIDAS, etc.).
   * Obtain the trust lattice score.
   * Produce `ProvenanceScore`: int.
### Interface
```csharp
public interface ISignatureVerifier
{
    ValueTask<SignatureVerificationResult> VerifyAsync(Stream payload, Stream envelope);
}
```
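A minimal envelope-reading sketch for task 1. It assumes the standard DSSE JSON shape (base64 `payload`, `payloadType`, and a `signatures` array with `keyid`/`sig`); the cryptographic verification itself is delegated to the Authority client and is not shown.
```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;
using System.Text.Json.Serialization;
using System.Threading.Tasks;

// Sketch: DSSE envelope DTOs and reader. PAE encoding, certificate-chain checks,
// and trust-lattice scoring are left to the Authority client.
public sealed record DsseSignature(
    [property: JsonPropertyName("keyid")] string KeyId,
    [property: JsonPropertyName("sig")] string Sig);

public sealed record DsseEnvelope(
    [property: JsonPropertyName("payloadType")] string PayloadType,
    [property: JsonPropertyName("payload")] string Payload,            // base64
    [property: JsonPropertyName("signatures")] IReadOnlyList<DsseSignature> Signatures)
{
    public byte[] DecodedPayload() => Convert.FromBase64String(Payload);
}

public static class DsseEnvelopeReader
{
    public static async ValueTask<DsseEnvelope> ReadAsync(Stream envelopeJson)
    {
        var env = await JsonSerializer.DeserializeAsync<DsseEnvelope>(envelopeJson)
                  ?? throw new InvalidDataException("Empty DSSE envelope.");
        if (env.Signatures is not { Count: > 0 })
            throw new InvalidDataException("DSSE envelope carries no signatures.");
        return env;
    }
}
```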
### Acceptance Criteria
* If verification fails → Vexer stores document but flags signature invalid.
* Scores map to priority in merge policy.
---
## 4.4 Merge Policies
### Implement Default Policy
1. Newer timestamp wins.
2. If timestamps are equal:
   * Higher provenance score wins.
   * If both are equal, the lexicographically smaller issuerKeyId wins (see the sketch below the interface).
### Interface
```csharp
public interface IVexMergePolicy
{
    bool ShouldReplace(VexStatement existing, VexStatement incoming);
}
```
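A minimal default policy encoding these rules (a sketch; `VexStatement` is the record from the earlier section):
```csharp
// Sketch: deterministic merge. Newer timestamp wins; ties fall back to the provenance
// score, then to the lexicographically smaller issuer key ID (ordinal comparison).
public sealed class DefaultVexMergePolicy : IVexMergePolicy
{
    public bool ShouldReplace(VexStatement existing, VexStatement incoming)
    {
        if (incoming.Timestamp != existing.Timestamp)
            return incoming.Timestamp > existing.Timestamp;

        if (incoming.ProvenanceScore != existing.ProvenanceScore)
            return incoming.ProvenanceScore > existing.ProvenanceScore;

        // Full tie: ordinal comparison keeps the decision culture-invariant.
        return string.CompareOrdinal(incoming.IssuerKeyId, existing.IssuerKeyId) < 0;
    }
}
```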
### Acceptance Criteria
* Merge decisions reproducible.
* Deterministic ordering even when values equal.
---
## 4.5 Ingestion Pipeline
### Steps
1. Accept `multipart/form-data` or referenced blob ID.
2. Parse via correct parser.
3. Verify signature (optional).
4. For each statement:
* Canonicalize.
* Compute `_id`.
* Upsert artifact into `artifacts` (via `IArtifactIndex`).
* Resolve bom-ref (if CycloneDX).
* Existing statement? Apply merge policy.
* Insert or update.
5. Create `vex.documents` entry.
### Class
`VexIngestService`
### Required Methods
```csharp
public Task<IngestResult> IngestAsync(VexIngestRequest request);
```
### Acceptance Tests
* Idempotent: ingesting the same VEX document repeatedly → DB unchanged.
* Deterministic under concurrency.
* Air-gap replay produces identical DB state.
---
## 4.6 Translation Layer
### Implement two converters:
* `OpenVexToCycloneDxTranslator`
* `CycloneDxToOpenVexTranslator`
### Rules
* Prefer PURL → hash → synthetic bom-ref.
* Single VEX statement → one CycloneDX “analysis” entry.
* Preserve justification, impact, notes.
### Acceptance Criteria
* Round-trip OpenVEX → CycloneDX → OpenVEX produces equal canonical hashes (except format markers).
---
## 4.7 Artifact Index Backfill
### Reason
CycloneDX VEX may refer to bom-refs not yet known at ingestion.
### Tasks
1. Store unresolved artifacts.
2. Create a background `BackfillWorker` (sketched after this list):
   * Watches `sboms.documents` ingestion events.
   * Matches bom-refs.
   * Updates statements with resolved PURLs/hashes.
   * Recomputes canonical JSON + SHA-256 (the new version is stored as a new ID).
3. Mark the old unresolved statement as superseded.
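A possible shape for the worker in task 2 (a sketch; `ISbomIngestEvents` and `IUnresolvedStatementStore` are hypothetical names introduced only for this illustration):
```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;

// Hypothetical event stream and store, named only for this sketch.
public sealed record SbomIngestedEvent(string SbomDocumentId);

public interface ISbomIngestEvents
{
    IAsyncEnumerable<SbomIngestedEvent> SubscribeAsync(CancellationToken ct);
}

public interface IUnresolvedStatementStore
{
    // Resolve pending bom-refs against the new SBOM, re-canonicalize (new _id),
    // and mark the unresolved originals as superseded.
    Task ResolveAgainstSbomAsync(string sbomDocumentId, CancellationToken ct);
}

// Sketch of the backfill worker: reacts to SBOM ingestion and triggers resolution.
public sealed class BackfillWorker : BackgroundService
{
    readonly ISbomIngestEvents _events;
    readonly IUnresolvedStatementStore _store;

    public BackfillWorker(ISbomIngestEvents events, IUnresolvedStatementStore store)
        => (_events, _store) = (events, store);

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        await foreach (var evt in _events.SubscribeAsync(stoppingToken))
            await _store.ResolveAgainstSbomAsync(evt.SbomDocumentId, stoppingToken);
    }
}
```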
### Acceptance Criteria
* Backfilling is monotonic: no overwriting original.
* Deterministic after backfill: same SBOM yields same final ID.
---
## 4.8 Bundle Ingestion (Air-Gap Mode)
### Structure
```
bundle/
  sboms/*.json
  vex/*.json
  index/artifacts.jsonl
  trust/*
  manifest.json
```
### Tasks
1. Implement `BundleIngestService`.
2. Stages:
   * Validate the manifest + hashes (see the sketch after this list).
   * Import trust roots (local only).
   * Ingest SBOMs first.
   * Ingest VEX documents.
3. Reproduce the same IDs on all nodes.
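A sketch of the manifest and hash validation in stage 1 (the `contents` array of `path`/`sha256` entries mirrors the hypothetical manifest example from the air-gap bundling section earlier):
```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text.Json;
using System.Threading.Tasks;

// Sketch: verify that every file listed in the bundle manifest matches its recorded SHA-256.
// The "contents"/"path"/"sha256" field names are an assumption carried over from the
// hypothetical manifest example shown earlier.
public static class BundleManifestVerifier
{
    public static async Task VerifyAsync(string bundleRoot)
    {
        await using var manifestStream = File.OpenRead(Path.Combine(bundleRoot, "manifest.json"));
        using var manifest = await JsonDocument.ParseAsync(manifestStream);

        foreach (var entry in manifest.RootElement.GetProperty("contents").EnumerateArray())
        {
            string relativePath = entry.GetProperty("path").GetString()!;
            string expected = entry.GetProperty("sha256").GetString()!;

            await using var file = File.OpenRead(Path.Combine(bundleRoot, relativePath));
            using var sha = SHA256.Create();
            string actual = Convert.ToHexString(await sha.ComputeHashAsync(file)).ToLowerInvariant();

            if (!string.Equals(actual, expected, StringComparison.OrdinalIgnoreCase))
                throw new InvalidDataException($"Hash mismatch for {relativePath}");
        }
    }
}
```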
### Acceptance Criteria
* Byte-identical bundle → byte-identical DB.
* Works offline completely.
---
# 5. Interfaces for GraphQL/REST/gRPC
Expose:
## Queries
* `vexStatement(id)`
* `vexStatementsByArtifact(purl/hash)`
* `vexStatus(purl)` → latest merged status
* `vexDocument(id)`
* `affectedComponents(vulnId)`
## Mutations
* `ingestVexDocument`
* `translateVex(format)`
* `exportVexDocument(id, targetFormat)`
* `replayBundle(bundleId)`
All responses must include deterministic IDs.
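A minimal HotChocolate sketch for the read side (repository method names such as `FindByArtifactAsync` are assumptions for this example; adjust to the actual `IVexRepo` surface):
```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using HotChocolate;

// Sketch: GraphQL read endpoints backed by the statement repository.
// FindByArtifactAsync is hypothetical; GetAsync matches the earlier ingestor sketch.
public sealed class VexQueries
{
    // Exposed as vexStatement(id: ...) once registered as the query type.
    public Task<VexStatement?> GetVexStatement(string id, [Service] IVexRepo repo)
        => repo.GetAsync(id);

    // Exposed as vexStatementsByArtifact(artifactKey: ...).
    public Task<IReadOnlyList<VexStatement>> GetVexStatementsByArtifact(
        string artifactKey, [Service] IVexRepo repo)
        => repo.FindByArtifactAsync(artifactKey);
}

// Registration sketch (Program.cs):
// builder.Services.AddGraphQLServer().AddQueryType<VexQueries>();
// app.MapGraphQL();
```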
---
# 6. Detailed Developer Tasks by Sprint
## Sprint 1: Foundation
1. Create solution structure.
2. Add Mongo DB contexts.
3. Implement data entities.
4. Implement hashing + canonicalizer.
5. Implement IVexParser interface.
## Sprint 2: Parsers
1. Implement OpenVexParser.
2. Implement CycloneDxParser.
3. Develop strong unit tests for JSON normalization.
## Sprint 3: Signature & Authority
1. DSSE envelope reader.
2. Call Authority to verify signatures.
3. Produce provenance scores.
## Sprint 4: Merge Policy Engine
1. Implement deterministic lattice merge.
2. Unit tests: 20+ collision scenarios.
## Sprint 5: Ingestion Pipeline
1. Implement ingest service end-to-end.
2. Insert/update logic.
3. Add GraphQL endpoints.
## Sprint 6: Translation Layer
1. OpenVEX↔CycloneDX converter.
2. Tests for round-trip.
## Sprint 7: Backfill System
1. Bom-ref resolver worker.
2. Rehashing logic for updated artifacts.
3. Events linking SBOM ingestion to backfill.
## Sprint 8: Air-Gap Bundle
1. BundleIngestService.
2. Manifest verification.
3. Trust root local loading.
## Sprint 9: Hardening
1. Fuzz parsers.
2. Deterministic stress tests.
3. Concurrency validation.
4. Storage compaction.
---
# 7. Failure Handling Matrix
| Failure | Action | Logged? | Retries |
| ------------------- | -------------------------------------- | ------- | ------- |
| Invalid JSON | Reject document | Yes | 0 |
| Invalid schema | Reject | Yes | 0 |
| Signature invalid | Store document, mark signature invalid | Yes | 0 |
| Artifact unresolved | Store unresolved, enqueue backfill | Yes | 3 |
| Merge conflict | Apply policy | Yes | 0 |
| Canonical mismatch | Hard fail | Yes | 0 |
---
# 8. Developer Unit Test Checklist
### Must-have tests for:
* Canonicalization stability (100 samples).
* Identical input twice → identical `_id`.
* Parsing OpenVEX with multi-product definitions.
* Parsing CycloneDX with missing bom-refs.
* Merge policy tie-breakers.
* Air-gap replay reproducibility.
* Translation equivalence.
---
# 9. Deliverables for Developers
They must produce:
1. Interfaces + DTOs + document schemas.
2. Canonicalizer with 100% deterministic output.
3. Two production-grade parsers.
4. Signature verification pipeline.
5. Merge policies aligned with Authority trust model.
6. End-to-end ingestion service.
7. Translation layer.
8. Backfill worker.
9. Air-gap bundle script + service.
10. GraphQL APIs.
---
If you want, I can next produce:
* A full **developer handbook** (60–90 pages).
* Full **technical architecture ADRs**.
* A concrete **scaffold** with a clean-compiling `.NET 10` project.
* Complete **test suite specification**.
* A **README.md** for new joiners.