Add unit and integration tests for VexCandidateEmitter and SmartDiff repositories

- Implemented comprehensive unit tests for VexCandidateEmitter to validate candidate emission logic based on various scenarios including absent and present APIs, confidence thresholds, and rate limiting.
- Added integration tests for SmartDiff PostgreSQL repositories, covering snapshot storage and retrieval, candidate storage, and material risk change handling.
- Ensured tests validate correct behavior for storing, retrieving, and querying snapshots and candidates, including edge cases and expected outcomes.
This commit is contained in:
master
2025-12-16 18:44:25 +02:00
parent 2170a58734
commit 3a2100aa78
126 changed files with 15776 additions and 542 deletions

View File

@@ -20,14 +20,14 @@
| # | Invariant | What it forbids or requires | Enforcement surfaces |
|---|-----------|-----------------------------|----------------------|
| 1 | No derived severity at ingest | Reject top-level keys such as `severity`, `cvss`, `effective_status`, `consensus_provider`, `risk_score`. Raw upstream CVSS remains inside `content.raw`. | Mongo schema validator, `AOCWriteGuard`, Roslyn analyzer, `stella aoc verify`. |
| 1 | No derived severity at ingest | Reject top-level keys such as `severity`, `cvss`, `effective_status`, `consensus_provider`, `risk_score`. Raw upstream CVSS remains inside `content.raw`. | PostgreSQL schema validator, `AOCWriteGuard`, Roslyn analyzer, `stella aoc verify`. |
| 2 | No merges or opinionated dedupe | Each upstream document persists on its own; ingestion never collapses multiple vendors into one document. | Repository interceptors, unit/fixture suites. |
| 3 | Provenance is mandatory | `source.*`, `upstream.*`, and `signature` metadata must be present; missing provenance triggers `ERR_AOC_004`. | Schema validator, guard, CLI verifier. |
| 4 | Idempotent upserts | Writes keyed by `(vendor, upstream_id, content_hash)` either no-op or insert a new revision with `supersedes`. Duplicate hashes map to the same document. | Repository guard, storage unique index, CI smoke tests. |
| 5 | Append-only revisions | Updates create a new document with `supersedes` pointer; no in-place mutation of content. | Mongo schema (`supersedes` format), guard, data migration scripts. |
| 5 | Append-only revisions | Updates create a new document with `supersedes` pointer; no in-place mutation of content. | PostgreSQL schema (`supersedes` format), guard, data migration scripts. |
| 6 | Linkset only | Ingestion may compute link hints (`purls`, `cpes`, IDs) to accelerate joins, but must not transform or infer severity or policy. Observations now persist both canonical linksets (for indexed queries) and raw linksets (preserving upstream order/duplicates) so downstream policy can decide how to normalise. When `concelier:features:noMergeEnabled=true`, all merge-derived canonicalisation paths must be disabled. | Linkset builders reviewed via fixtures/analyzers; raw-vs-canonical parity covered by observation fixtures; analyzer `CONCELIER0002` blocks merge API usage. |
| 7 | Policy-only effective findings | Only Policy Engine identities can write `effective_finding_*`; ingestion callers receive `ERR_AOC_006` if they attempt it. | Authority scopes, Policy Engine guard. |
| 8 | Schema safety | Unknown top-level keys reject with `ERR_AOC_007`; timestamps use ISO 8601 UTC strings; tenant is required. | Mongo validator, JSON schema tests. |
| 8 | Schema safety | Unknown top-level keys reject with `ERR_AOC_007`; timestamps use ISO 8601 UTC strings; tenant is required. | PostgreSQL validator, JSON schema tests. |
| 9 | Clock discipline | Collectors stamp `fetched_at` and `received_at` monotonically per batch to support reproducibility windows. | Collector contracts, QA fixtures. |
## 4. Raw Schemas
@@ -113,11 +113,11 @@ Canonicalisation rules:
|------|-------------|-------------|----------|
| `ERR_AOC_001` | Forbidden field detected (severity, cvss, effective data). | 400 | Ingestion APIs, CLI verifier, CI guard. |
| `ERR_AOC_002` | Merge attempt detected (multiple upstream sources fused into one document). | 400 | Ingestion APIs, CLI verifier. |
| `ERR_AOC_003` | Idempotency violation (duplicate without supersedes pointer). | 409 | Repository guard, Mongo unique index, CLI verifier. |
| `ERR_AOC_003` | Idempotency violation (duplicate without supersedes pointer). | 409 | Repository guard, PostgreSQL unique index, CLI verifier. |
| `ERR_AOC_004` | Missing provenance metadata (`source`, `upstream`, `signature`). | 422 | Schema validator, ingestion endpoints. |
| `ERR_AOC_005` | Signature or checksum mismatch. | 422 | Collector validation, CLI verifier. |
| `ERR_AOC_006` | Attempt to persist derived findings from ingestion context. | 403 | Policy engine guard, Authority scopes. |
| `ERR_AOC_007` | Unknown top-level fields (schema violation). | 400 | Mongo validator, CLI verifier. |
| `ERR_AOC_007` | Unknown top-level fields (schema violation). | 400 | PostgreSQL validator, CLI verifier. |
Consumers should map these codes to CLI exit codes and structured log events so automation can fail fast and produce actionable guidance. The shared guard library (`StellaOps.Aoc.AocError`) emits consistent payloads (`code`, `message`, `violations[]`) for HTTP APIs, CLI tooling, and verifiers.
@@ -144,7 +144,7 @@ Consumers should map these codes to CLI exit codes and structured log events so
1. Freeze ingestion writes except for raw pass-through paths while deploying schema validators.
2. Snapshot existing collections to `_backup_*` for rollback safety.
3. Strip forbidden fields from historical documents into a temporary `advisory_view_legacy` used only during transition.
4. Enable Mongo JSON schema validators for `advisory_raw` and `vex_raw`.
4. Enable PostgreSQL JSON schema validators for `advisory_raw` and `vex_raw`.
5. Run collectors in `--dry-run` to confirm only allowed keys appear; fix violations before lifting the freeze.
6. Point Policy Engine to consume exclusively from raw collections and compute derived outputs downstream.
7. Delete legacy normalisation paths from ingestion code and enable runtime guards plus CI linting.
@@ -169,7 +169,7 @@ Consumers should map these codes to CLI exit codes and structured log events so
## 11. Compliance Checklist
- [ ] Deterministic guard enabled in Concelier and Excititor repositories.
- [ ] Mongo validators deployed for `advisory_raw` and `vex_raw`.
- [ ] PostgreSQL validators deployed for `advisory_raw` and `vex_raw`.
- [ ] Authority scopes and tenant enforcement verified via integration tests.
- [ ] CLI and CI pipelines run `stella aoc verify` against seeded snapshots.
- [ ] Observability feeds (metrics, logs, traces) wired into dashboards with alerts.