Extend Vexer attestation/export stack and Concelier OSV fixes

This commit is contained in:
2025-10-16 19:44:10 +03:00
parent 46f7c807d3
commit cb3acb8c4a
103 changed files with 6852 additions and 1840 deletions

View File

@@ -26,6 +26,7 @@ MongoDB acts as the canonical store; collections (with logical responsibilities)
- `vex.consensus` consensus projections per `(vulnId, productKey)` capturing rollup status, source weights, conflicts, and policy revision.
- `vex.exports` export manifests containing artifact digests, cache metadata, and attestation pointers.
- `vex.cache` index from `querySignature`/`format` to export digest for fast reuse.
- `vex.migrations` tracks applied storage migrations (index bootstrap, future schema updates).
GridFS is used for large raw payloads when necessary, and artifact stores (S3/MinIO/file) hold serialized exports referenced by `vex.exports`.
@@ -54,6 +55,7 @@ Policy snapshots are immutable and versioned so consensus records capture the po
- JSON serialization uses `VexCanonicalJsonSerializer`, enforcing property ordering and camelCase naming for reproducible snapshots and test fixtures.
- `VexQuerySignature` produces canonical filter/order strings and SHA-256 digests, enabling cache keys shared across services.
- Export manifests reuse cached artifacts when the same signature/format is requested unless `ForceRefresh` is explicitly set.
- For scorring multiple sources on same VEX topic use - `VEXER_SCORRING.md`
## 6. Observability & offline posture
@@ -68,5 +70,16 @@ Policy snapshots are immutable and versioned so consensus records capture the po
- Build WebService endpoints (`/vexer/status`, `/vexer/claims`, `/vexer/exports`) plus CLI verbs mirroring Feedser patterns.
- Provide CSAF, CycloneDX VEX, and OpenVEX normalizers along with vendor-specific connectors (Red Hat, Cisco, SUSE, MSRC, Oracle, Ubuntu, OCI attestation).
- Extend policy diagnostics with schema validation, change tracking, and operator-facing diff reports.
- Mongo bootstrapper runs ordered migrations (`vex.migrations`) to ensure indexes for raw documents, providers, consensus snapshots, exports, and cache entries.
## Appendix A Policy diagnostics workflow
- `StellaOps.Vexer.Policy` now exposes `IVexPolicyDiagnostics`, producing deterministic diagnostics reports with timestamp, severity counts, active provider overrides, and the full issue list surfaced by `IVexPolicyProvider`.
- CLI/WebService layers should call `IVexPolicyDiagnostics.GetDiagnostics()` to display operator-friendly summaries (`vexer policy diagnostics` and `/vexer/policy/diagnostics` are the planned entry points).
- Recommendations in the report guide operators to resolve blocking errors, review warnings, and audit override usage before consensus runs—embed them directly in UX copy instead of re-deriving logic.
- Export/consensus telemetry should log the diagnostic `Version` alongside `policyRevisionId` so dashboards can correlate policy changes with consensus decisions.
- Offline installations can persist the diagnostics report (JSON) in the Offline Kit to document policy headroom during audits; the output is deterministic and diff-friendly.
- Use `VexPolicyBinder` when ingesting operator-supplied YAML/JSON bundles; it normalizes weight/override values, reports deterministic issues, and returns the consensus-ready `VexConsensusPolicyOptions` used by `VexPolicyProvider`.
- Reload telemetry emits `vex.policy.reloads` (tags: `revision`, `version`, `issues`) whenever a new digest is observed—feed this into dashboards to correlate policy changes with consensus outcomes.
This architecture keeps Vexer aligned with StellaOps' deterministic, offline-operable design while layering VEX-specific consensus and attestation capabilities on top of the Feedser foundations.

83
docs/VEXER_SCORRING.md Normal file
View File

@@ -0,0 +1,83 @@
## Status
This document tracks the future-looking risk scoring model for Vexer. The calculation below is not active yet; Sprint 7 work will add the required schema fields, policy controls, and services. Until that ships, Vexer emits consensus statuses without numeric scores.
## Scoring model (target state)
**S = Gate(VEX_status) × W_trust(source) × [Severity_base × (1 + α·KEV + β·EPSS)]**
* **Gate(VEX_status)**: `affected`/`under_investigation` → 1, `not_affected`/`fixed` → 0. A trusted “not affected” or “fixed” still zeroes the score.
* **W_trust(source)**: normalized policy weight (baseline 01). Policies may opt into >1 boosts for signed vendor feeds once Phase 1 closes.
* **Severity_base**: canonical numeric severity from Feedser (CVSS or org-defined scale).
* **KEV flag**: 0/1 boost when CISA Known Exploited Vulnerabilities applies.
* **EPSS**: probability [0,1]; bounded multiplier.
* **α, β**: configurable coefficients (default α=0.25, β=0.5) stored in policy.
Safeguards: freeze boosts when product identity is unknown, clamp outputs ≥0, and log every factor in the audit trail.
## Implementation roadmap
| Phase | Scope | Artifacts |
| --- | --- | --- |
| **Phase 1 Schema foundations** | Extend Vexer consensus/claims and Feedser canonical advisories with severity, KEV, EPSS, and expose α/β + weight ceilings in policy. | Sprint 7 tasks `VEXER-CORE-02-001`, `VEXER-POLICY-02-001`, `VEXER-STORAGE-02-001`, `FEEDCORE-ENGINE-07-001`. |
| **Phase 2 Deterministic score engine** | Implement a scoring component that executes alongside consensus and persists score envelopes with hashes. | Planned task `VEXER-CORE-02-002` (backlog). |
| **Phase 3 Surfacing & enforcement** | Expose scores via WebService/CLI, integrate with Feedser noise priors, and enforce policy-based suppressions. | To be scheduled after Phase 2. |
## Data model (after Phase 1)
```json
{
"vulnerabilityId": "CVE-2025-12345",
"product": "pkg:name@version",
"consensus": {
"status": "affected",
"policyRevisionId": "rev-12",
"policyDigest": "0D9AEC…"
},
"signals": {
"severity": {"scheme": "CVSS:3.1", "score": 7.5},
"kev": true,
"epss": 0.40
},
"policy": {
"weight": 1.15,
"alpha": 0.25,
"beta": 0.5
},
"score": {
"value": 10.8,
"generatedAt": "2025-11-05T14:12:30Z",
"audit": [
"gate:affected",
"weight:1.15",
"severity:7.5",
"kev:1",
"epss:0.40"
]
}
}
```
## Operational guidance
* **Inputs**: Feedser delivers severity/KEV/EPSS via the advisory event log; Vexer connectors load VEX statements. Policy owns trust tiers and coefficients.
* **Processing**: the scoring engine (Phase 2) runs next to consensus, storing results with deterministic hashes so exports and attestations can reference them.
* **Consumption**: WebService/CLI will return consensus plus score; scanners may suppress findings only when policy-authorized VEX gating and signed score envelopes agree.
## Pseudocode (Phase 2 preview)
```python
def risk_score(gate, weight, severity, kev, epss, alpha, beta, freeze_boosts=False):
if gate == 0:
return 0
if freeze_boosts:
kev, epss = 0, 0
boost = 1 + alpha * kev + beta * epss
return max(0, weight * severity * boost)
```
## FAQ
* **Can operators opt out?** Set α=β=0 or keep weights ≤1.0 via policy.
* **What about missing signals?** Treat them as zero and log the omission.
* **When will this ship?** Phase 1 is planned for Sprint 7; later phases depend on connector coverage and attestation delivery.

View File

@@ -1,6 +1,6 @@
# Feedser GHSA Connector Operations Runbook
_Last updated: 2025-10-12_
_Last updated: 2025-10-16_
## 1. Overview
The GitHub Security Advisories (GHSA) connector pulls advisory metadata from the GitHub REST API `/security/advisories` endpoint. GitHub enforces both primary and secondary rate limits, so operators must monitor usage and configure retries to avoid throttling incidents.
@@ -114,3 +114,10 @@ When enabling GHSA the first time, run a staged backfill:
- Prometheus: `ghsa_ratelimit_remaining_bucket` (from histogram) use `histogram_quantile(0.99, ...)` to trend capacity.
- VictoriaMetrics: `LAST_over_time(ghsa_ratelimit_remaining_sum[5m])` for simple last-value graphs.
- Grafana: stack remaining + used to visualise total limit per resource.
## 8. Canonical metric fallback analytics
When GitHub omits CVSS vectors/scores, the connector now assigns a deterministic canonical metric id in the form `ghsa:severity/<level>` and publishes it to Merge so severity precedence still resolves against GHSA even without CVSS data.
- Metric: `ghsa.map.canonical_metric_fallbacks` (counter) with tags `severity`, `canonical_metric_id`, `reason=no_cvss`.
- Monitor the counter alongside Merge parity checks; a sudden spike suggests GitHub is shipping advisories without vectors and warrants cross-checking downstream exporters.
- Because the canonical id feeds Merge, parity dashboards should overlay this metric to confirm fallback advisories continue to merge ahead of downstream sources when GHSA supplies more recent data.

View File

@@ -0,0 +1,24 @@
# Feedser OSV Connector Operations Notes
_Last updated: 2025-10-16_
The OSV connector ingests advisories from OSV.dev across OSS ecosystems. This note highlights the additional merge/export expectations introduced with the canonical metric fallback work in Sprint 4.
## 1. Canonical metric fallbacks
- When OSV omits CVSS vectors (common for CVSS v4-only payloads) the mapper now emits a deterministic canonical metric id in the form `osv:severity/<level>` and normalises the advisory severity to the same `<level>`.
- Metric: `osv.map.canonical_metric_fallbacks` (counter) with tags `severity`, `canonical_metric_id`, `ecosystem`, `reason=no_cvss`. Watch this alongside merge parity dashboards to catch spikes where OSV publishes severity-only advisories.
- Merge precedence still prefers GHSA over OSV; the shared severity-based canonical id keeps Merge/export parity deterministic even when only OSV supplies severity data.
## 2. CWE provenance
- `database_specific.cwe_ids` now populates provenance decision reasons for every mapped weakness. Expect `decisionReason="database_specific.cwe_ids"` on OSV weakness provenance and confirm exporters preserve the value.
- If OSV ever attaches `database_specific.cwe_notes`, the connector will surface the joined note string in `decisionReason` instead of the default marker.
## 3. Dashboards & alerts
- Extend existing merge dashboards with the new counter:
- Overlay `sum(osv.map.canonical_metric_fallbacks{ecosystem=~".+"})` with Merge severity overrides to confirm fallback advisories are reconciling cleanly.
- Alert when the 1-hour sum exceeds 50 for any ecosystem; baseline volume is currently <5 per day (mostly GHSA mirrors emitting CVSS v4 only).
- Exporters already surface `canonicalMetricId`; no schema change is required, but ORAS/Trivy bundles should be spot-checked after deploying the connector update.
## 4. Runbook updates
- Fixture parity suites (`osv-ghsa.*`) now assert the fallback id and provenance notes. Regenerate via `dotnet test src/StellaOps.Feedser.Source.Osv.Tests/StellaOps.Feedser.Source.Osv.Tests.csproj`.
- When investigating merge severity conflicts, include the fallback counter and confirm OSV advisories carry the expected `osv:severity/<level>` id before raising connector bugs.