Add tests for SBOM generation determinism across multiple formats

- Created `StellaOps.TestKit.Tests` project for unit tests related to determinism.
- Implemented `DeterminismManifestTests` to validate deterministic output for canonical bytes and strings, file read/write operations, and error handling for invalid schema versions.
- Added `SbomDeterminismTests` to ensure identical inputs produce consistent SBOMs across SPDX 3.0.1 and CycloneDX 1.6/1.7 formats, including parallel execution tests.
- Updated project references in `StellaOps.Integration.Determinism` to include the new determinism testing library.
This commit is contained in: master
2025-12-23 18:56:12 +02:00
committed by StellaOps Bot
parent 7ac70ece71
commit 491e883653
409 changed files with 23797 additions and 17779 deletions


@@ -1,229 +1,6 @@
# VEX Observations & Linksets
# Archived: VEX Observations & Linksets
> Imposed rule: Work of this type or tasks of this type on this component must
> also be applied everywhere else it should be applied.
This document was consolidated during docs cleanup.
Link-Not-Merge brings the same immutable observation model to Excititor that
Concelier now uses for advisories. VEX statements are stored as append-only
observations; linksets correlate them, capture conflicts, and keep provenance so
Policy Engine and UI surfaces can explain decisions without collapsing sources.
---
## 1. Model overview
### 1.1 Observation lifecycle
1. **Ingest**: Connectors fetch OpenVEX, CSAF VEX, CycloneDX VEX, or VEX
   attestations, validate signatures, and strip any derived consensus data
   forbidden by the Aggregation-Only Contract (AOC).
2. **Persist**: Excititor writes immutable `vex_observations` keyed by tenant,
   provider, upstream identifier, and `contentHash`. Supersedes chains record
   revisions; the original payload is never mutated.
3. **Expose**: WebService will surface paginated observation APIs, and Offline
   Kit snapshots mirror the same data for air-gapped sites.
Observation schema sketch (final shape lands with `EXCITITOR-LNM-21-001`):
```text
observationId = {tenant}:{providerId}:{upstreamId}:{revision}
tenant, providerId, streamId
upstream{ upstreamId, documentVersion, fetchedAt, receivedAt,
          contentHash, signature{present, format?, keyId?, signature?} }
content{ format, specVersion, raw }
statements[
  { vulnerabilityId, productKey, status, justification?,
    introducedVersion?, fixedVersion?, locator }
]
linkset{ purls[], cpes[], aliases[], references[],
         reconciledFrom[], conflicts[]? }
attributes{ batchId?, replayCursor? }
createdAt
```
- **Raw payload** (`content.raw`) remains lossless (Relaxed Extended JSON).
- **Statements** provide normalized tuples for each claim contained in the
document, including justification and version hints.
- **Linkset** mirrors identifiers extracted during ingestion, retaining JSON
pointer metadata so audits can trace back to the source fragment.
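A minimal sketch of how the identifier and hash fields in the schema above could be derived. The `observationId` layout follows the schema sketch; the `sha256:` prefix on `contentHash` and the function names are assumptions for illustration, not the Excititor implementation.

```python
import hashlib


def content_hash(raw: bytes) -> str:
    """Hash the untouched upstream payload (prefix format is an assumption)."""
    return "sha256:" + hashlib.sha256(raw).hexdigest()


def observation_id(tenant: str, provider_id: str, upstream_id: str, revision: int) -> str:
    """Compose {tenant}:{providerId}:{upstreamId}:{revision} as in the schema sketch."""
    return f"{tenant}:{provider_id}:{upstream_id}:{revision}"


if __name__ == "__main__":
    print(observation_id("tenant", "redhat", "openvex", 3))
    print(content_hash(b'{"example": "payload"}'))
```

Hashing the raw bytes rather than a re-serialized form keeps the hash stable regardless of how the payload is later parsed.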
### 1.2 Linkset lifecycle
Linksets correlate claims referring to the same `(vulnerabilityId, productKey)`
pair across providers.
1. **Seed**: Observations push normalized identifiers (CVE, GHSA, vendor IDs)
   plus canonical product keys (purl preferred, cpe fallback). Platform-scoped
   statements remain marked `non_joinable`.
2. **Correlate**: The linkset builder groups statements by tenant and identity,
   combines alias graphs from Concelier, and uses justification/product overlap
   to assign correlation confidence.
3. **Annotate**: Conflicts (status disagreement, justification mismatch, range
   inconsistencies) are recorded as structured entries.
4. **Persist**: Results land in `vex_linksets` with deterministic IDs (hash of
   sorted `(vulnerabilityId, productKey, observationIds)`) and append-only
   history for replay/debugging.
Linksets never override statements or invent consensus; they simply align
evidence for Policy Engine and consumers.
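The deterministic ID in the Persist step can be sketched as follows. The text only states that the ID is a hash of the sorted `(vulnerabilityId, productKey, observationIds)` tuple; the choice of SHA-256, the compact JSON serialization, and the function name `linkset_id` are assumptions made for this example.

```python
import hashlib
import json
from typing import Iterable


def linkset_id(vulnerability_id: str, product_key: str, observation_ids: Iterable[str]) -> str:
    """Hash a canonical serialization so input order never changes the ID."""
    canonical = json.dumps(
        {
            "vulnerabilityId": vulnerability_id,
            "productKey": product_key,
            # Sort and de-duplicate so replayed or reordered inputs agree.
            "observationIds": sorted(set(observation_ids)),
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    ids = ["tenant:ubuntu:cyclonedx:12", "tenant:redhat:openvex:3"]
    print(linkset_id("CVE-2025-2222", "pkg:rpm/redhat/openssl@1.1.1w-12", ids))
```

Because the observation IDs are sorted before hashing, two correlation runs that see the same observations in a different order produce the same linkset ID.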
---
## 2. Observation vs. linkset
- **Purpose**
  - Observation: Immutable record of a single upstream VEX document.
  - Linkset: Correlated evidence spanning observations that describe the same
    product-vulnerability pair.
- **Mutation**
  - Observation: Append-only via supersedes.
  - Linkset: Regenerated deterministically by correlation jobs.
- **Allowed fields**
  - Observation: Raw payload, provenance, normalized statement tuples, join
    hints.
  - Linkset: Observation references, statement IDs, confidence metrics, conflict
    annotations.
- **Forbidden fields**
  - Observation: Derived consensus, suppression flags, risk scores.
  - Linkset: Derived severity or policy decisions (only evidence + conflicts).
- **Consumers**
  - Observation: Evidence exports, Offline Kit mirrors, CLI raw dumps.
  - Linkset: Policy Engine VEX overlay, Console evidence panes, Vuln Explorer.
### 2.1 Example sequence
1. Canonical vendor issues an attested OpenVEX declaring `CVE-2025-2222` as
`not_affected` for `pkg:rpm/redhat/openssl@1.1.1w-12`. Excititor inserts a
new observation referencing that statement.
2. Upstream CycloneDX VEX from a distro reports the same product as `affected`
with `under_investigation` justification.
3. Linkset builder groups both statements by alias overlap and product key,
setting confidence `high` because CVE and purl match.
4. Conflict annotation records `status-mismatch` and retains both justifications;
Policy Engine uses this to explain why suppression cannot proceed without
policy override.
---
## 3. Conflict handling
Structured conflicts capture disagreements without mutating source statements.
```json
{
  "type": "status-mismatch",
  "vulnerabilityId": "CVE-2025-2222",
  "productKey": "pkg:rpm/redhat/openssl@1.1.1w-12",
  "statements": [
    {
      "observationId": "tenant:redhat:openvex:3",
      "providerId": "redhat",
      "status": "not_affected",
      "justification": "component_not_present"
    },
    {
      "observationId": "tenant:ubuntu:cyclonedx:12",
      "providerId": "ubuntu",
      "status": "affected",
      "justification": "under_investigation"
    }
  ],
  "confidence": "medium",
  "detectedAt": "2025-10-27T14:30:00Z"
}
```
Conflict classes (tracked via `EXCITITOR-LNM-21-003`):
- `status-mismatch`: Different statuses for the same pair (affected vs
  not_affected vs fixed vs under_investigation).
- `justification-divergence`: Same status but incompatible justifications or
  missing justification where policy requires it.
- `version-range-clash`: Introduced/fixed ranges contradict each other.
- `non-joinable-overlap`: Platform-scoped statements collide with package
  statements; flagged as warning but retained.
- `metadata-gap`: Missing provenance/signature field on specific statements.
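
As a rough illustration of the first two conflict classes, the sketch below classifies a group of statements for one `(vulnerabilityId, productKey)` pair. The `Statement` type and `classify` function are hypothetical, and treating any difference in justification as a divergence is a simplification; the real rules are tracked in `EXCITITOR-LNM-21-003`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Statement:
    observation_id: str
    provider_id: str
    status: str
    justification: Optional[str] = None


def classify(statements: List[Statement]) -> Optional[str]:
    """Return a conflict class for the group, or None when statements agree."""
    statuses = {s.status for s in statements}
    if len(statuses) > 1:
        return "status-mismatch"
    # Same status everywhere: compare justifications (None counts as missing).
    justifications = {s.justification for s in statements}
    if len(justifications) > 1:
        return "justification-divergence"
    return None


if __name__ == "__main__":
    group = [
        Statement("tenant:redhat:openvex:3", "redhat", "not_affected", "component_not_present"),
        Statement("tenant:ubuntu:cyclonedx:12", "ubuntu", "affected", "under_investigation"),
    ]
    print(classify(group))
```

The function only reads the statements and returns a label, which matches the rule that conflicts are recorded without mutating source statements.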
Conflicts surface through:
- `/vex/linksets/{id}` APIs (`conflicts[]` payload).
- Console evidence panels (badges + drawer detail).
- CLI exports (`stella vex linkset …` planned in `CLI-LNM-22-002`).
- Metrics dashboards (`vex_linkset_conflicts_total{type}`).
---
## 4. AOC alignment
- **Raw-first**: `content.raw` and `statements[]` mirror upstream input; no
  derived consensus or suppression values are written by ingestion.
- **No merges**: Each upstream statement persists independently; linksets refer
  back via `observationId`.
- **Provenance mandatory**: Missing signature or source metadata yields
  `ERR_AOC_004`; ingestion blocks until connectors fix the feed.
- **Idempotent writes**: Duplicate `(providerId, upstreamId, contentHash)`
  results in a no-op; revisions append with a `supersedes` pointer.
- **Deterministic output**: Correlator sorts identifiers, normalizes timestamps
  (UTC ISO-8601), and hashes canonical JSON to generate stable linkset IDs.
- **Scope-aware**: Tenant claims enforced on write/read; Authority scopes
  `vex:ingest` / `vex:read` are required (see `AUTH-AOC-22-001`).
Violations raise `ERR_AOC_00x`, emit `aoc_violation_total`, and prevent the data
from landing downstream.
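
The idempotent-write rule can be sketched with an in-memory store. This is an illustrative model only: the `ObservationStore` class, its field names, and the inclusion of the tenant in the duplicate key are assumptions, and the real storage layer is not shown in this document.

```python
from typing import Dict, List, Tuple


class ObservationStore:
    """In-memory stand-in for the append-only observation collection."""

    def __init__(self) -> None:
        self.observations: List[dict] = []
        self._by_key: Dict[Tuple[str, str, str, str], dict] = {}
        self._latest: Dict[Tuple[str, str, str], dict] = {}

    def write(self, tenant: str, provider_id: str, upstream_id: str, content_hash: str) -> dict:
        key = (tenant, provider_id, upstream_id, content_hash)
        existing = self._by_key.get(key)
        if existing is not None:
            return existing  # duplicate payload: no-op

        stream = (tenant, provider_id, upstream_id)
        previous = self._latest.get(stream)
        revision = previous["revision"] + 1 if previous else 1
        observation = {
            "observationId": f"{tenant}:{provider_id}:{upstream_id}:{revision}",
            "contentHash": content_hash,
            "revision": revision,
            # New revisions point at the one they replace; nothing is mutated.
            "supersedes": previous["observationId"] if previous else None,
        }
        self.observations.append(observation)
        self._by_key[key] = observation
        self._latest[stream] = observation
        return observation


if __name__ == "__main__":
    store = ObservationStore()
    store.write("tenant", "redhat", "openvex", "sha256:aaa")
    store.write("tenant", "redhat", "openvex", "sha256:aaa")  # ignored
    print(store.write("tenant", "redhat", "openvex", "sha256:bbb"))
```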
---
## 5. Downstream consumption
- **Policy Engine**: Evaluates VEX evidence alongside advisory linksets to gate
  suppression, severity downgrades, or explainability.
- **Console UI**: Evidence panel renders VEX statements grouped by provider and
  highlights conflicts or missing signatures.
- **CLI**: Planned commands export observations/linksets for offline analysis
  (`CLI-LNM-22-002`).
- **Offline Kit**: Bundled snapshots keep VEX data aligned with advisory
  observations for air-gapped parity.
- **Observability**: Dashboards track ingestion latency, conflict counts, and
  supersedes depth per provider.
New consumers must treat both collections as read-only and preserve deterministic
ordering when caching.
---
## 6. Validation & testing
- **Unit tests** (`StellaOps.Excititor.Core.Tests`) to cover schema guards,
deterministic linkset hashing, conflict classification, and supersedes
behaviour.
- **Mongo integration tests** (`StellaOps.Excititor.Storage.Mongo.Tests`) to
verify indexes, shard keys, and idempotent writes across tenants.
- **CLI smoke suites** (`stella vex observations`, `stella vex linksets`) for
JSON determinism and exit code coverage.
- **Replay determinism**: Feed identical upstream payloads twice and ensure
  observation/linkset hashes match across runs.
- **Offline kit verification**: Validate VEX exports packaged in Offline Kit
  snapshots against live service outputs.
- **Fixture refresh**: Samples (`SAMPLES-LNM-22-002`) must include multi-source
  conflicts and justification variants used by docs and UI tests.
---
## 7. Reviewer checklist
- Observation schema aligns with `EXCITITOR-LNM-21-001` once the schema lands;
update references as soon as the final contract is published.
- Linkset lifecycle covers correlation signals (alias graphs, product keys,
justification rules) and deterministic ID strategy.
- Conflict classes include status, justification, version range, platform overlap
scenarios.
- AOC guardrails called out with relevant error codes and Authority scopes.
- Downstream consumer list matches active APIs/CLI features (update when
`CLI-LNM-22-002` and WebService endpoints ship).
- Validation section references Core, Storage, CLI, and Offline test suites plus
fixture requirements.
- Imposed rule reminder retained at top.
Dependencies outstanding (2025-10-27): `EXCITITOR-LNM-21-001..005` and
`EXCITITOR-LNM-21-101..102` are still TODO; revisit this document once schemas,
APIs, and fixtures are implemented.
- Canonical guide: `docs/16_VEX_CONSENSUS_GUIDE.md`
- Module dossiers: `docs/modules/excititor/architecture.md`, `docs/modules/vex-lens/architecture.md`


@@ -1,15 +1,6 @@
# VEX Consensus Algorithm — Draft Skeleton (2025-12-05 UTC)
# Archived: VEX Consensus Algorithm
Status: draft placeholder. Depends on consensus overview and PLVL0102.
This document was consolidated during docs cleanup.
## Normalization
- Input normalization steps (pending schema).
## Weighting & Thresholds
- How weights are assigned; threshold examples (to fill).
## Examples
- Sample merge scenarios (placeholder).
## Open TODOs
- Populate equations and concrete scenarios when data is available.
- Canonical guide: `docs/16_VEX_CONSENSUS_GUIDE.md`
- Module dossier: `docs/modules/vex-lens/architecture.md`


@@ -1,15 +1,6 @@
# VEX Consensus API — Draft Skeleton (2025-12-05 UTC)
# Archived: VEX Consensus API
Status: draft placeholder. Inputs pending: PLVL0102 policy join notes.
This document was consolidated during docs cleanup.
## Endpoints
- List and describe endpoints (to fill).
## Query Parameters
- Filters, pagination, projections (pending contract).
## Rate Limits
- TBD; add concrete values once agreed.
## Open TODOs
- Add request/response examples when schemas are delivered.
- Canonical guide: `docs/16_VEX_CONSENSUS_GUIDE.md`
- Module dossier: `docs/modules/vex-lens/architecture.md`


@@ -1,12 +1,6 @@
# VEX Consensus Console — Draft Skeleton (2025-12-05 UTC)
# Archived: VEX Consensus Console Integration
Status: draft placeholder. Inputs pending: console overlay assets.
This document was consolidated during docs cleanup.
## Workflows
- Browse/filters; conflict resolution; accessibility notes.
## Notifications
- How conflicts/exceptions surface in UI.
## Open TODOs
- Add screenshots/flows when assets arrive.
- Canonical guide: `docs/16_VEX_CONSENSUS_GUIDE.md`
- Console guide: `docs/15_UI_GUIDE.md`


@@ -1,52 +1,6 @@
# Excititor consensus JSON sample (beta)
# Archived: VEX Consensus JSON
```jsonc
{
  "vulnId": "CVE-2025-12345",
  "productKey": "pkg:maven/org.apache.commons/commons-text@1.11.0",
  "rollupStatus": "NOT_AFFECTED",
  "sources": [
    {
      "providerId": "redhat",
      "status": "NOT_AFFECTED",
      "justification": "component_not_present",
      "weight": 0.62,
      "trust": {
        "tier": "distro",
        "note": "tier=distro;weight=0.62",
        "weight": 0.62,
        "cosign": {
          "issuer": "https://issuer.redhat.com",
          "identityPattern": "spiffe://redhat/vex/*"
        },
        "pgpFingerprints": [
          "04F2C0A87B1D9E90B1D8A35DCEB5ABCD12345678"
        ]
      },
      "lastObserved": "2025-11-04T18:22:31Z",
      "accepted": true,
      "reason": "trust-tier vendor, signed OpenVEX"
    },
    {
      "providerId": "github",
      "status": "AFFECTED",
      "justification": null,
      "weight": 0.27,
      "trust": {
        "tier": "community",
        "note": "tier=community;weight=0.27",
        "weight": 0.27
      },
      "lastObserved": "2025-11-05T01:12:03Z",
      "accepted": false,
      "reason": "lower trust tier and stale statement"
    }
  ],
  "policyRevisionId": "vex-consensus-policy@2025-11-05",
  "evaluatedAt": "2025-11-05T02:05:14Z",
  "consensusDigest": "sha256:41f2d96728b24f7a8b7f1251983b8edccd1e0f5781d4a51e51c8e6b20c1fa31a"
}
```
This document was consolidated during docs cleanup.
> **Note:** This payload is generated from the beta consensus endpoint and is subject to change prior to GA. Keys and semantics are documented alongside API previews in `docs/modules/excitor/README.md`.
> **New:** `sources[].trust` mirrors the `vex.provenance.*` envelope emitted by Excititor connectors (provider weight/tier, cosign hints, PGP fingerprints). VEX Lens copies the raw metadata so Policy Engine, Console, and Advisory AI can explain consensus decisions without replaying ingestion.
- Canonical guide: `docs/16_VEX_CONSENSUS_GUIDE.md`
- Module dossier: `docs/modules/vex-lens/architecture.md`


@@ -1,203 +1,6 @@
# VEX Consensus Overview — Evidence-Linked Decisions
# Archived: VEX Consensus Overview
> Status: Updated 2025-12-11 · Owners: Policy Guild, Scanner Guild, Signals Guild
This document was consolidated during docs cleanup.
> Stella Ops isn't just another scanner—it's a different product category: **deterministic, evidence-linked vulnerability decisions** that survive auditors, regulators, and supply-chain propagation.
<!-- TODO: Review for separate approval - completed VEX consensus overview -->
## Context: Four Capabilities
The VEX Consensus Engine supports **Explainable Policy (Lattice VEX)**—one of four capabilities no competitor offers together:
1. **Signed Reachability**: Every reachability graph is sealed with DSSE.
2. **Deterministic Replay**: Scans run bit-for-bit identical from frozen feeds.
3. **Explainable Policy (Lattice VEX)**: Evidence-linked VEX decisions with explicit "Unknown" state handling.
4. **Sovereign + Offline Operation**: FIPS/eIDAS/GOST/SM/PQC profiles and offline mirrors.
All decisions are sealed in **Decision Capsules** for audit-grade reproducibility.
---
## Purpose
The VEX Consensus Engine merges multiple evidence sources into a single, reproducible vulnerability status for each component-vulnerability pair. Unlike simple VEX aggregation that picks the "most authoritative" statement, Stella Ops applies **lattice logic** to combine all inputs deterministically.
**Key differentiators:**
- **Evidence-linked decisions**: Every VEX assertion includes pointers to the underlying proof
- **Explicit "Unknown" state**: Incomplete data is surfaced as `under_investigation`, never as false safety
- **Deterministic consensus**: Given the same inputs, the engine produces identical outputs
- **Human-readable justifications**: Every decision comes with an explainable trace
---
## Inputs
The consensus engine ingests evidence from multiple sources:
| Source Type | Description | Evidence Link |
|-------------|-------------|---------------|
| **SBOM data** | Component identities (PURLs), dependency relationships, layer provenance | `sbom_hash`, `layer_digest` |
| **Advisory feeds** | OSV, GHSA, NVD, CNVD, CNNVD, ENISA, JVN, BDU, vendor feeds | `advisory_snapshot_id`, `feed_hash` |
| **Reachability evidence** | Static call-graph analysis, runtime traces, entry-point proximity | `reach_decision_id`, `graph_hash` |
| **VEX statements** | Vendor VEX, internal VEX, third-party VEX | `vex_statement_id`, `issuer_id` |
| **Waivers/Mitigations** | Temporary exceptions, compensating controls | `waiver_id`, `mitigation_id` |
| **Policy rules** | Lattice configuration, threshold settings | `policy_version`, `policy_hash` |
Each input is content-addressed and timestamped, enabling full traceability.
---
## Lattice Logic
The consensus engine applies a **partial order** over vulnerability states:
```
UNKNOWN (under_investigation)
< NOT_AFFECTED
< AFFECTED
< FIXED
```
Cross-product with confidence levels:
- **High confidence**: Strong evidence from multiple sources
- **Medium confidence**: Partial evidence or single authoritative source
- **Low confidence**: Weak evidence, pending investigation
**Merge semantics:**
- Monotonic joins: states can only progress "up" the lattice
- Conflict resolution: prioritized by source trust level and evidence strength
- "Unknown" preserved: if any critical input is missing, the decision stays `under_investigation`
See `docs/reachability/lattice.md` for the full scoring model.
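
A minimal sketch of the monotonic join described above, assuming the four states form the simple chain shown in the ordering block. It deliberately ignores source trust levels, evidence strength, and confidence; the function names and the `critical_input_missing` flag are hypothetical.

```python
from functools import reduce
from typing import Iterable

# Ordering taken from the lattice listing above, lowest state first.
ORDER = ["under_investigation", "not_affected", "affected", "fixed"]
RANK = {state: index for index, state in enumerate(ORDER)}


def join(a: str, b: str) -> str:
    """Least upper bound of two states: the one higher in the ordering."""
    return a if RANK[a] >= RANK[b] else b


def merge(statuses: Iterable[str], critical_input_missing: bool = False) -> str:
    """Fold all inputs with join; stay 'unknown' when evidence is incomplete."""
    statuses = list(statuses)
    if critical_input_missing or not statuses:
        return "under_investigation"
    return reduce(join, statuses)


if __name__ == "__main__":
    print(merge(["not_affected", "affected"]))
    print(merge(["fixed"], critical_input_missing=True))
```

Because `join` is commutative and associative, the merged result does not depend on the order in which sources arrive, which is what makes the consensus reproducible.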
---
## Outputs
### Decision Artifact
Each consensus decision produces:
```json
{
  "vulnerability": "CVE-2025-1234",
  "component": "pkg:nuget/Example@1.2.3",
  "status": "not_affected|under_investigation|affected|fixed",
  "confidence": "high|medium|low",
  "justification": "component_not_present|vulnerable_code_not_present|inline_mitigations_already_exist|...",
  "evidence_refs": {
    "sbom": "sha256:...",
    "advisory_snapshot": "nvd-2025-12-01",
    "reachability": "reach:abc123",
    "vex_statements": ["vex:vendor-redhat-001", "vex:internal-002"],
    "mitigations": ["mit:waf-rule-xyz"]
  },
  "policy_version": "corp-policy@2025-12-01",
  "policy_hash": "sha256:...",
  "timestamp": "2025-12-11T00:00:00Z",
  "status_notes": "Reachability score 22 (Possible) with WAF rule mitigation.",
  "action_statement": "Monitor config ABC",
  "impact_statement": "Runtime probes observed 0 hits; static call graph absent."
}
```
### Evidence Graph
Every decision artifact links to an **evidence graph** containing:
- SBOM component hash / PURL match
- Vulnerability record snapshot ID
- Reachability proof artifact (if applicable)
- Runtime observation proof (if available)
- Mitigation evidence
This enables **proof-linked VEX**—auditors can trace any decision back to its inputs.
---
## Decision Capsules Integration
Consensus decisions are sealed into **Decision Capsules** along with:
- Exact SBOM used
- Exact vuln feed snapshots
- Reachability evidence (static + runtime)
- Policy version + lattice rules
- Derived VEX statements
- DSSE signatures over all of the above
Capsules enable:
- Bit-for-bit replay: `stella replay capsule.yaml`
- Offline verification: No network required
- Audit-grade evidence: Every decision is provable
---
## Threshold and Confidence Handling
| Confidence Level | Criteria | Default Action |
|------------------|----------|----------------|
| High | Multiple corroborating sources, strong reachability evidence | Auto-apply decision |
| Medium | Single authoritative source or partial reachability evidence | Apply with advisory flag |
| Low | Weak evidence, conflicting sources | Mark `under_investigation` |
Policy rules can override these defaults per environment.
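
The default actions in the table, together with per-environment overrides, could be modelled as a simple lookup. The action identifiers and the `default_action` helper are invented for this sketch; the document does not define how policy rules express overrides.

```python
from typing import Dict, Optional

# Defaults mirror the table above; the string identifiers are assumptions.
DEFAULT_ACTIONS: Dict[str, str] = {
    "high": "auto_apply",
    "medium": "apply_with_advisory_flag",
    "low": "mark_under_investigation",
}


def default_action(confidence: str, overrides: Optional[Dict[str, str]] = None) -> str:
    """Resolve the action for a confidence level, letting a policy override win."""
    table = {**DEFAULT_ACTIONS, **(overrides or {})}
    try:
        return table[confidence.lower()]
    except KeyError:
        raise ValueError(f"unknown confidence level: {confidence!r}") from None


if __name__ == "__main__":
    print(default_action("High"))
    # A stricter environment might refuse to apply medium-confidence decisions.
    print(default_action("medium", {"medium": "mark_under_investigation"}))
```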
---
## VEX Propagation
Once consensus is reached, Stella Ops can generate **downstream VEX statements** for consumers:
- **OpenVEX format**: Standard VEX for interoperability
- **CSAF VEX**: For CSAF-compliant ecosystems
- **Custom formats**: Via export templates
Downstream consumers can automatically trust and ingest these VEX statements because they include:
- Proof pointers to the evidence graph
- Signatures from trusted issuers
- Replay bundle references
**Key differentiator**: Competitors export VEX formats; Stella provides a unified proof model that can be verified independently.
---
## API Integration
```text
# Evaluate consensus for a component-vuln pair
POST /v1/vex/consensus/evaluate
{
  "component": "pkg:nuget/Example@1.2.3",
  "vulnerabilities": ["CVE-2025-1234"],
  "policy": "corp-policy@2025-12-01"
}

# Get consensus decision with evidence
GET /v1/vex/consensus/{decision_id}?include_evidence=true

# Export VEX for downstream propagation
POST /v1/vex/export
{
  "format": "openvex|csaf",
  "decisions": ["decision:abc123"]
}
```
---
## Open TODOs
- [ ] PLVL0102 schema integration (pending schema finalization)
- [ ] Issuer directory details for third-party VEX sources
- [ ] CSAF VEX export template
- [ ] CLI commands for consensus querying
---
## Related Documentation
- `docs/reachability/lattice.md` — Reachability scoring model
- `docs/vex/consensus-algorithm.md` — Algorithm details
- `docs/vex/consensus-api.md` — API reference
- `docs/vex/aggregation.md` — VEX aggregation rules
- `docs/vex/issuer-directory.md` — Trusted VEX issuers
- Canonical guide: `docs/16_VEX_CONSENSUS_GUIDE.md`
- Module dossiers: `docs/modules/excititor/architecture.md`, `docs/modules/vex-lens/architecture.md`


@@ -1,28 +1,6 @@
# VEX Explorer Integration (Md.XI draft)
# Archived: Vulnerability Explorer Integration
> Status: DRAFT — pending GRAP0101 alignment, CSAF mapping specifics, and CLI examples. Do not publish until hashes recorded.
This document was consolidated during docs cleanup.
## Scope
- Map Explorer VEX handling: CSAF ingestion, suppression precedence, status semantics, and integration points with findings.
- Provide deterministic examples; hash payloads/screens in `docs/assets/vuln-explorer/SHA256SUMS`.
## Dependencies
- GRAP0101 contract (field names, identifiers).
- CLI/console assets (due 2025-12-09).
- Policy/VEX mapping rules from Excititor Guild.
## Topics (outline)
- CSAF → internal VEX decision mapping; precedence vs policy overrides.
- Status semantics: NOT_AFFECTED / AFFECTED_* / FIXED; validity windows; VEX-first triage per Vuln Explorer architecture.
- Suppression precedence: VEX decisions take priority over reachability/policy unless explicit override (confirm post-GRAP0101).
- Export/propagation to advisories/CLI/console.
## Determinism
- Use fixed CSAF samples; hash examples.
### Hash Capture Checklist (when assets land)
- `assets/vuln-explorer/vex-csaf-sample.json` (input)
- `assets/vuln-explorer/vex-mapping-output.json` (normalized decisions)
- `assets/vuln-explorer/vex-precedence-table.md` (suppression/precedence matrix)
_Last updated: 2025-12-05 (UTC)_
- Canonical guide: `docs/20_VULNERABILITY_EXPLORER_GUIDE.md`
- VEX guide: `docs/16_VEX_CONSENSUS_GUIDE.md`


@@ -1,15 +1,6 @@
# VEX Issuer Directory — Draft Skeleton (2025-12-05 UTC)
# Archived: VEX Issuer Directory
Status: draft placeholder. Inputs pending: issuer directory keys/overrides, audit model.
This document was consolidated during docs cleanup.
## Management
- Add/update issuers; key material handling (to be filled).
## Trust Overrides
- Local overrides, expiry/rotation rules.
## Audit
- Recording changes; export/logging expectations.
## Open TODOs
- Insert concrete commands/APIs once available.
- Canonical guide: `docs/16_VEX_CONSENSUS_GUIDE.md`
- Related: `docs/modules/excititor/architecture.md`, `docs/modules/vex-lens/architecture.md`