From 515975edc569a6e26e5407b8eb2f55b483315415 Mon Sep 17 00:00:00 2001
From: master <>
Date: Fri, 7 Nov 2025 19:54:02 +0200
Subject: [PATCH] Implement Advisory Canonicalization and Backfill Migration

- Added AdvisoryCanonicalizer for canonicalizing advisory identifiers.
- Created EnsureAdvisoryCanonicalKeyBackfillMigration to populate advisory_key and links in advisory_raw documents.
- Introduced FileSurfaceManifestStore for managing surface manifests with file system backing.
- Developed ISurfaceManifestReader and ISurfaceManifestWriter interfaces for reading and writing manifests.
- Implemented SurfaceManifestPathBuilder for constructing paths and URIs for surface manifests.
- Added tests for FileSurfaceManifestStore to ensure correct functionality and deterministic behavior.
- Updated documentation for new features and migration steps.
---
 docs/dev/lnm-determinism-tests.md | 32 +++
 docs/implplan/SPRINTS.md | 4 +
 .../implplan/SPRINT_110_ingestion_evidence.md | 8 +-
 docs/implplan/SPRINT_130_scanner_surface.md | 2 +
 docs/modules/concelier/architecture.md | 41 ++--
 .../scanner/design/surface-fs-consumers.md | 51 ++++
 docs/modules/scanner/design/surface-fs.md | 89 ++++---
 .../ConcelierAdvisoryDocumentProviderTests.cs | 11 +-
 .../Contracts/AdvisoryRawContracts.cs | 24 +-
 .../Extensions/AdvisoryRawRequestMapper.cs | 38 +--
 .../Options/ConcelierOptions.cs | 2 +-
 .../StellaOps.Concelier.WebService/Program.cs | 53 +++++
 .../StellaOps.Concelier.WebService/TASKS.md | 4 +-
 .../Raw/AdvisoryCanonicalizer.cs | 195 +++++++++++++++
 .../Raw/AdvisoryRawService.cs | 126 ++++++++--
 .../Raw/IAdvisoryRawRepository.cs | 23 +-
 .../Raw/IAdvisoryRawService.cs | 20 +-
 .../StellaOps.Concelier.Merge/TASKS.md | 4 +-
 .../AdvisoryRawDocument.cs | 62 ++---
 .../RawDocumentFactory.cs | 27 ++-
 ...reAdvisoryCanonicalKeyBackfillMigration.cs | 166 +++++++++++++
 .../Raw/MongoAdvisoryRawRepository.cs | 216 +++++++++++++----
 .../ServiceCollectionExtensions.cs | 1 +
 .../AdvisoryObservationFactoryTests.cs | 83 ++++++-
 .../Raw/AdvisoryRawServiceTests.cs | 190 ++++++++++-----
 .../MergePrecedenceIntegrationTests.cs | 26 +-
 ...oryObservationsRawLinksetMigrationTests.cs | 13 +-
 .../Migrations/MongoMigrationRunnerTests.cs | 15 ++
 .../ConcelierOptionsPostConfigureTests.cs | 34 ++-
 .../WebServiceEndpointsTests.cs | 61 +++++
 .../FileSurfaceManifestStore.cs | 224 ++++++++++++++++++
 .../ISurfaceManifestReader.cs | 15 ++
 .../ISurfaceManifestWriter.cs | 11 +
 .../ServiceCollectionExtensions.cs | 61 ++++-
 .../StellaOps.Scanner.Surface.FS.csproj | 1 +
 .../SurfaceManifestModels.cs | 2 +-
 .../SurfaceManifestPathBuilder.cs | 85 +++++++
 .../SurfaceManifestStoreOptions.cs | 45 ++++
 .../StellaOps.Scanner.Surface.FS/TASKS.md | 6 +-
 .../FileSurfaceManifestStoreTests.cs | 156 ++++++++++++
 .../StellaOps.Scheduler.Worker/TASKS.md | 1 +
 .../StellaOps.Zastava.Observer/TASKS.md | 1 +
 42 files changed, 1893 insertions(+), 336 deletions(-)
 create mode 100644 docs/dev/lnm-determinism-tests.md
 create mode 100644 docs/modules/scanner/design/surface-fs-consumers.md
 create mode 100644 src/Concelier/__Libraries/StellaOps.Concelier.Core/Raw/AdvisoryCanonicalizer.cs
 create mode 100644 src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/Migrations/EnsureAdvisoryCanonicalKeyBackfillMigration.cs
 create mode 100644 src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/FileSurfaceManifestStore.cs
 create mode 100644 src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/ISurfaceManifestReader.cs
 create mode 100644 src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/ISurfaceManifestWriter.cs
 create mode 100644 src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/SurfaceManifestPathBuilder.cs
 create mode 100644 src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/SurfaceManifestStoreOptions.cs
 create mode 100644 src/Scanner/__Tests/StellaOps.Scanner.Surface.FS.Tests/FileSurfaceManifestStoreTests.cs

diff --git a/docs/dev/lnm-determinism-tests.md b/docs/dev/lnm-determinism-tests.md
new file mode 100644
index 000000000..120a826e5
--- /dev/null
+++ b/docs/dev/lnm-determinism-tests.md
@@ -0,0 +1,32 @@
+# Link-Not-Merge Determinism Test Plan
+
+**Task:** MERGE-LNM-21-003 — replace legacy merge determinism suites with observation/linkset regressions now that `NoMergeEnabled` is defaulted to `true`.
+
+## Objectives
+- Validate raw advisory documents remain byte-stable through observation/linkset materialisation.
+- Ensure conflicts detected during linkset building surface in telemetry and persisted artifacts without merge-side mutation.
+- Keep canonical hash output stable for exports/evidence bundles after repeated runs.
+
+## Test Coverage Outline
+1. **Raw → Observation determinism**
+   - Feed canonical advisory raw fixtures containing mixed casing, duplicate aliases, and provenance metadata.
+   - Assert repeated runs of `AdvisoryObservationFactory` emit identical observations (structural equality + canonical JSON hash).
+   - Verify raw linkset payload retains original ordering/whitespace while canonical linkset stays normalised.
+   - Initial coverage implemented via `AdvisoryObservationFactoryTests.Create_IsDeterministicAcrossRuns` (core tests).
+
+2. **Linkset conflict surfacing**
+   - Build linksets from conflicting advisory observations (e.g., differing severity or status flags).
+   - Confirm conflict markers propagate to `AdvisoryLinkset` outputs and associated metrics/log records.
+   - Capture deterministic ordering of conflict explanations for evidence exports.
+
+3. **Evidence/export parity**
+   - Re-run observation/linkset pipelines against identical fixtures and assert resulting evidence manifests hash-identically.
+   - Track monotonic `supersedes` chains and ensure canonical link records include `PRIMARY` schemes.
+
+## Migration Steps
+- [ ] Retire `StellaOps.Concelier.Merge.Tests` determinism suites once observation/linkset equivalents land.
+- [ ] Introduce new regression fixtures under `StellaOps.Concelier.Core.Tests` (shared via `StellaOps.Concelier.Testing`).
+- [ ] Wire test helpers to Mongo in-memory harness for end-to-end parity runs.
+- [ ] Update documentation (`docs/migration/no-merge.md`) with validation checklist once new tests are green.
+
+_Pending_: execute suites on a workstation with the .NET 10 preview SDK; local environment lacks a functioning CLI, so validation runs must happen downstream.
diff --git a/docs/implplan/SPRINTS.md b/docs/implplan/SPRINTS.md
index dfc466af6..6e5d3485a 100644
--- a/docs/implplan/SPRINTS.md
+++ b/docs/implplan/SPRINTS.md
@@ -78,6 +78,10 @@ Follow the sprint files below in order. Update task status in both `SPRINTS` and
 > 2025-11-02: SURFACE-VAL-01 moved to DOING (Surface Validation Guild) – aligning design document with implementation plan.
 > 2025-11-02: SURFACE-FS-01 moved to DOING (Surface FS Guild) – finalising cache layout and manifest spec.
 > 2025-11-02: SURFACE-FS-02 moved to DOING (Surface FS Guild) – building core abstractions and deterministic serializers.
+> 2025-11-07: SURFACE-FS-01 marked DONE – updated `surface-fs.md` with pointer layout, offline kit flow, and architecture cross-link.
+> 2025-11-07: SURFACE-FS-02 marked DONE – landed file-backed manifest store (`FileSurfaceManifestStore`), deterministic serialization, and unit coverage.
+> 2025-11-07: SCHED-SURFACE-02 added (Scheduler Worker Guild) – prefetch Surface manifests before scheduling reruns.
+> 2025-11-07: ZASTAVA-SURFACE-02 added (Zastava Observer Guild) – adopt Surface manifest reader for drift diagnostics.
 > 2025-11-02: SURFACE-SECRETS-01 moved to DOING (Surface Secrets Guild) – updating secrets design for provider matrix.
 > 2025-11-02: SURFACE-SECRETS-02 moved to DOING (Surface Secrets Guild) – implementing base providers + tests.
 > 2025-11-02: AUTH-POLICY-27-002 marked DONE (Authority Core & Security Guild) – interactive-only policy publish/promote scopes delivered with metadata, fresh-auth enforcement, and audit/docs updates.
diff --git a/docs/implplan/SPRINT_110_ingestion_evidence.md b/docs/implplan/SPRINT_110_ingestion_evidence.md index ad52c4d47..30de6d553 100644 --- a/docs/implplan/SPRINT_110_ingestion_evidence.md +++ b/docs/implplan/SPRINT_110_ingestion_evidence.md @@ -158,8 +158,8 @@ CONCELIER-SIG-26-001 `Vulnerable symbol exposure` | TODO | Expose advisory metad CONCELIER-STORE-AOC-19-005 `Raw linkset backfill` | TODO (2025-11-04) | Plan and execute advisory_observations `rawLinkset` backfill (online + Offline Kit bundles), supply migration scripts + rehearse rollback. Follow the coordination plan in `docs/dev/raw-linkset-backfill-plan.md`. Dependencies: CONCELIER-CORE-AOC-19-004. | Concelier Storage Guild, DevOps Guild (src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/TASKS.md) CONCELIER-TEN-48-001 `Tenant-aware linking` | TODO | Ensure advisory normalization/linking runs per tenant with RLS enforcing isolation; emit capability endpoint reporting `merge=false`; update events with tenant context. Dependencies: AUTH-TEN-47-001. | Concelier Core Guild (src/Concelier/__Libraries/StellaOps.Concelier.Core/TASKS.md) CONCELIER-VEXLENS-30-001 `Advisory rationale bridges` | TODO | Guarantee advisory key consistency and cross-links for consensus rationale; Label: VEX-Lens. Dependencies: CONCELIER-VULN-29-001, VEXLENS-30-005. | Concelier WebService Guild, VEX Lens Guild (src/Concelier/StellaOps.Concelier.WebService/TASKS.md) -CONCELIER-VULN-29-001 `Advisory key canonicalization` | TODO | Canonicalize (lossless) advisory identifiers (CVE/GHSA/vendor) into `advisory_key`, persist `links[]`, expose raw payload snapshots for Explorer evidence tabs; AOC-compliant: no merge, no derived fields, no suppression. Include migration/backfill scripts. Dependencies: CONCELIER-LNM-21-001. 
| Concelier WebService Guild, Data Integrity Guild (src/Concelier/StellaOps.Concelier.WebService/TASKS.md) -CONCELIER-VULN-29-002 `Evidence retrieval API` | TODO | Provide `/vuln/evidence/advisories/{advisory_key}` returning raw advisory docs with provenance, filtering by tenant and source. Dependencies: CONCELIER-VULN-29-001, VULN-API-29-003. | Concelier WebService Guild (src/Concelier/StellaOps.Concelier.WebService/TASKS.md) +CONCELIER-VULN-29-001 `Advisory key canonicalization` | DONE (2025-11-07) | Canonicalize (lossless) advisory identifiers (CVE/GHSA/vendor) into `advisory_key`, persist `links[]`, expose raw payload snapshots for Explorer evidence tabs; AOC-compliant: no merge, no derived fields, no suppression. Include migration/backfill scripts. Dependencies: CONCELIER-LNM-21-001. | Concelier WebService Guild, Data Integrity Guild (src/Concelier/StellaOps.Concelier.WebService/TASKS.md) +CONCELIER-VULN-29-002 `Evidence retrieval API` | DONE (2025-11-07) | Provide `/vuln/evidence/advisories/{advisory_key}` returning raw advisory docs with provenance, filtering by tenant and source. Dependencies: CONCELIER-VULN-29-001, VULN-API-29-003. | Concelier WebService Guild (src/Concelier/StellaOps.Concelier.WebService/TASKS.md) [Ingestion & Evidence] 110.B) Concelier.V @@ -211,8 +211,8 @@ Depends on: Sprint 110.B - Concelier.VI Summary: Ingestion & Evidence focus on Concelier (phase VII). Task ID | State | Task description | Owners (Source) --- | --- | --- | --- -MERGE-LNM-21-002 | DOING (2025-11-07) | Refactor or retire `AdvisoryMergeService` and related pipelines, ensuring callers transition to observation/linkset APIs; add compile-time analyzer preventing merge service usage.
2025-11-03: Began dependency audit and call-site inventory ahead of deprecation plan; cataloging service registrations/tests referencing merge APIs.
2025-11-05 14:42Z: Drafted `concelier:features:noMergeEnabled` gating, merge job allowlist handling, and deprecation/telemetry changes prior to analyzer rollout.
2025-11-06 16:10Z: Landed analyzer project (`CONCELIER0002`), wired into Concelier WebService/tests, and updated docs to direct suppressions through explicit migration notes.
2025-11-07 03:25Z: Default-on toggle + job gating break existing Concelier WebService tests; guard/migration adjustments pending before closing the task.
2025-11-07 07:05Z: Added ingest-path diagnostics (hash logging + test log dumping) to trace why HTTP binding loses `upstream.contentHash` with `noMergeEnabled=true`; need to adapt seeding/tests once the binding issue is fixed. | BE-Merge (src/Concelier/__Libraries/StellaOps.Concelier.Merge/TASKS.md) -MERGE-LNM-21-003 Determinism/test updates | QA Guild, BE-Merge | Replace merge determinism suites with observation/linkset regression tests verifying no data mutation and conflicts remain visible. Dependencies: MERGE-LNM-21-002. | MERGE-LNM-21-002 (src/Concelier/__Libraries/StellaOps.Concelier.Merge/TASKS.md) +MERGE-LNM-21-002 | DONE (2025-11-07) | Refactor or retire `AdvisoryMergeService` and related pipelines, ensuring callers transition to observation/linkset APIs; add compile-time analyzer preventing merge service usage.
2025-11-03: Began dependency audit and call-site inventory ahead of deprecation plan; cataloging service registrations/tests referencing merge APIs.
2025-11-05 14:42Z: Drafted `concelier:features:noMergeEnabled` gating, merge job allowlist handling, and deprecation/telemetry changes prior to analyzer rollout.
2025-11-06 16:10Z: Landed analyzer project (`CONCELIER0002`), wired into Concelier WebService/tests, and updated docs to direct suppressions through explicit migration notes.
2025-11-07 03:25Z: Default-on toggle + job gating surfaced ingestion test brittleness; guard/migration diagnostics capture requests missing `upstream.contentHash`.
2025-11-07 19:45Z: Set `ConcelierOptions.Features.NoMergeEnabled` default to `true`, added regression coverage (`Features_NoMergeEnabled_DefaultsToTrue`), and rechecked ingest helpers to carry canonical links. Remote .NET 10 CLI run remains queued for validation. | BE-Merge (src/Concelier/__Libraries/StellaOps.Concelier.Merge/TASKS.md) +MERGE-LNM-21-003 Determinism/test updates | DOING (2025-11-07) | QA Guild, BE-Merge | Replace merge determinism suites with observation/linkset regression tests verifying no data mutation and conflicts remain visible. Dependencies: MERGE-LNM-21-002.
2025-11-07: Drafting test migration plan (`docs/dev/lnm-determinism-tests.md`) to map legacy merge fixtures onto observation/linkset pipelines; identifying coverage gaps (conflict surfacing, raw vs canonical parity, hash stability).
2025-11-07 20:05Z: Landed `AdvisoryObservationFactoryTests.Create_IsDeterministicAcrossRuns` to cover canonical JSON stability and pruned the old merge determinism integration test. | MERGE-LNM-21-002 (src/Concelier/__Libraries/StellaOps.Concelier.Merge/TASKS.md) WEB-AOC-19-001 (dependency) | DONE (2025-11-07) | Shared guard primitives now enforce the top-level allowlist (`_id`, tenant, source, upstream, content, identifiers, linkset, supersedes, created/ingested timestamps, attributes) and emit the reusable `AocError` payload consumed by HTTP/CLI tooling. Extend `AocGuardOptions.AllowedTopLevelFields` when staging new schema fields to avoid false-positive `ERR_AOC_007` violations. | BE-Base Platform Guild (docs/aoc/guard-library.md, src/Web/StellaOps.Web/TASKS.md) diff --git a/docs/implplan/SPRINT_130_scanner_surface.md b/docs/implplan/SPRINT_130_scanner_surface.md index 1e0bbea3b..c0dc8da7e 100644 --- a/docs/implplan/SPRINT_130_scanner_surface.md +++ b/docs/implplan/SPRINT_130_scanner_surface.md @@ -156,6 +156,8 @@ SCANNER-ENG-0027 | TODO | Deliver Windows policy/offline integration per `design SCANNER-SURFACE-01 | DONE (2025-11-06) | Persist Surface.FS manifests after analyzer stages, including layer CAS metadata and EntryTrace fragments.
2025-11-02: Worker pipeline emitting draft Surface.FS manifests for sample scans; determinism checks running.
2025-11-06: Continuing with manifest writer abstraction + telemetry wiring for Surface.FS persistence.
2025-11-06 18:45Z: Resumed work; targeting manifest writer abstraction, CAS persistence hooks, and telemetry/test coverage updates.
2025-11-06 20:20Z: Published Surface worker Grafana dashboard + updated design doc; WebService pointer integration test now covers manifest/payload artefacts. | Scanner Worker Guild (src/Scanner/StellaOps.Scanner.Worker/TASKS.md) SCANNER-SURFACE-02 | DONE (2025-11-05) | Publish Surface.FS pointers (CAS URIs, manifests) via scan/report APIs and update attestation metadata. Dependencies: SCANNER-SURFACE-01.
2025-11-05: Surface pointer projection wired through WebService endpoints, orchestrator samples & DSSE fixtures refreshed with `surface` manifest block, and regression suite (platform events, report sample, ready check) updated. | Scanner WebService Guild (src/Scanner/StellaOps.Scanner.WebService/TASKS.md) SCANNER-SURFACE-03 | DONE (2025-11-07) | Push layer manifests and entry fragments into Surface.FS during build-time SBOM generation. Dependencies: SCANNER-SURFACE-02.
2025-11-06: Starting BuildX manifest upload implementation with Surface.FS client abstraction and integration tests.
2025-11-07 15:30Z: Resumed BuildX plugin Surface wiring; analyzing Surface.FS models, CAS flow, and upcoming tests before coding.
2025-11-07 22:10Z: Added Surface manifest writer + CLI flags to the BuildX plug-in, persisted artefacts into CAS, regenerated docs/fixtures, and shipped new tests covering the writer + descriptor flow. | BuildX Plugin Guild (src/Scanner/StellaOps.Scanner.Sbomer.BuildXPlugin/TASKS.md) +SCHED-SURFACE-02 | TODO | Integrate Scheduler worker prefetch using Surface manifest reader and persist manifest pointers with rerun plans. Dependencies: SURFACE-FS-02, SCHED-SURFACE-01. Reference `docs/modules/scanner/design/surface-fs-consumers.md` §3 for implementation checklist. | Scheduler Worker Guild (src/Scheduler/__Libraries/StellaOps.Scheduler.Worker/TASKS.md) +ZASTAVA-SURFACE-02 | TODO | Use Surface manifest reader helpers to resolve `cas://` pointers and enrich drift diagnostics with manifest provenance. Dependencies: SURFACE-FS-02, ZASTAVA-SURFACE-01. Reference `docs/modules/scanner/design/surface-fs-consumers.md` §4 for integration steps. | Zastava Observer Guild (src/Zastava/StellaOps.Zastava.Observer/TASKS.md) [Scanner & Surface] 130.A) Scanner.VIII Depends on: Sprint 130.A - Scanner.VII diff --git a/docs/modules/concelier/architecture.md b/docs/modules/concelier/architecture.md index f18b44b37..829139932 100644 --- a/docs/modules/concelier/architecture.md +++ b/docs/modules/concelier/architecture.md @@ -61,23 +61,28 @@ "spec_version": "1.6", "raw": { /* unmodified upstream document */ } }, - "identifiers": { - "cve": ["CVE-2025-12345"], - "ghsa": ["GHSA-xxxx-...."], - "aliases": ["CVE-2025-12345", "GHSA-xxxx-...."] - }, - "linkset": { - "purls": ["pkg:npm/lodash@4.17.21"], - "cpes": ["cpe:2.3:a:lodash:lodash:4.17.21:*:*:*:*:*:*:*"], - "references": [ - {"type":"advisory","url":"https://..."}, - {"type":"fix","url":"https://..."} - ], - "reconciled_from": ["content.raw.affected.ranges", "content.raw.pkg"] - }, - "supersedes": "advisory_raw:osv:GHSA-xxxx-....:v2", - "tenant": "default" -} + "identifiers": { + "primary": "GHSA-xxxx-....", + "aliases": ["CVE-2025-12345", 
"GHSA-xxxx-...."] + }, + "linkset": { + "purls": ["pkg:npm/lodash@4.17.21"], + "cpes": ["cpe:2.3:a:lodash:lodash:4.17.21:*:*:*:*:*:*:*"], + "references": [ + {"type":"advisory","url":"https://..."}, + {"type":"fix","url":"https://..."} + ], + "reconciled_from": ["content.raw.affected.ranges", "content.raw.pkg"] + }, + "advisory_key": "CVE-2025-12345", + "links": [ + {"scheme":"CVE","value":"CVE-2025-12345"}, + {"scheme":"GHSA","value":"GHSA-XXXX-...."}, + {"scheme":"PRIMARY","value":"CVE-2025-12345"} + ], + "supersedes": "advisory_raw:osv:GHSA-xxxx-....:v2", + "tenant": "default" +} ``` ### 1.2 Connector lifecycle @@ -110,7 +115,7 @@ Running the same export job twice against the same snapshot must yield byte-iden * **Linkset builder** that correlates observations into `advisory_linksets` and annotates conflicts. * **Event publisher** emitting `advisory.observation.updated` and `advisory.linkset.updated` messages. * **Exporters** (JSON, Trivy DB, Offline Kit slices) fed from observation/linkset stores. -* **Minimal REST** for health/status/trigger/export and observation/linkset reads. +* **Minimal REST** for health/status/trigger/export, raw observation reads, and evidence retrieval (`GET /vuln/evidence/advisories/{advisory_key}`). **Scale:** HA by running N replicas; **locks** prevent overlapping jobs per source/exporter. diff --git a/docs/modules/scanner/design/surface-fs-consumers.md b/docs/modules/scanner/design/surface-fs-consumers.md new file mode 100644 index 000000000..6c7d2a0d3 --- /dev/null +++ b/docs/modules/scanner/design/surface-fs-consumers.md @@ -0,0 +1,51 @@ +# Surface.FS Consumer Integration Guide (Scheduler & Zastava) + +> **Updated:** 2025-11-07 +> **Audience:** Scheduler Worker Guild • Zastava Observer Guild • Surface FS Guild +> **Depends on:** SURFACE-FS-02 (`FileSurfaceManifestStore`), Surface.Env/Surface.Secrets libraries. 
+
+This note captures the minimum wiring required for downstream services now that `FileSurfaceManifestStore` and the manifest reader/writer abstractions have landed.
+
+## 1. Shared prerequisites
+
+- Reference `StellaOps.Scanner.Surface.FS` (net10.0) and call:
+  ```csharp
+  services
+      .AddSurfaceFileCache()
+      .AddSurfaceManifestStore();
+  ```
+  This binds `Surface:Cache` and `Surface:Manifest` (or `SCANNER_SURFACE_*` overrides).
+- Pull runtime settings via `ISurfaceEnvironment` to ensure tenants/endpoints line up with Scanner.
+- Cache root (`Surface:Cache:Root`) must be writable; manifests fall back to `/manifests` unless explicitly overridden with `Surface:Manifest:RootDirectory`.
+
+## 2. Manifest reader usage
+
+```csharp
+var reader = serviceProvider.GetRequiredService<ISurfaceManifestReader>();
+var manifest = await reader.TryGetByUriAsync(surfaceUri, cancellationToken);
+```
+
+- Accept `cas://{bucket}/{prefix}/{tenant}/{hh}/{tt}/{digest}.json` pointers.
+- On cache miss, return `null`—callers should fall back to existing recompute paths.
+- All timestamps are stored in canonical UTC, and metadata dictionaries are alphabetically sorted to keep digests deterministic.
+
+## 3. Scheduler worker checklist (`SCHED-SURFACE-02`)
+
+1. Prefetch manifests during planning so reruns can skip redundant layers.
+2. Persist `{manifestUri, manifestDigest}` alongside run plans for traceability.
+3. Emit telemetry counters: `scheduler_surface_manifest_prefetch_total{result=hit|miss}`.
+4. Update `docs/SCHED-WORKER-16-201-PLANNER.md` with the new prefetch flow.
+
+## 4. Zastava observer checklist (`ZASTAVA-SURFACE-02`)
+
+1. Resolve manifest pointer from runtime drift events (`entrytrace.graph`, `layer.fragments` kinds).
+2. Enrich drift diagnostics with `manifestDigest` and `Artifacts[n].metadata`.
+3. Add failure metric `zastava_surface_manifest_failures_total{reason=not_found|fetch_error}`.
+4. Expand observer runbook (`docs/modules/zastava/operations/drift.md`) with Surface manifest troubleshooting.
+
+## 5. Testing guidance
+
+- Unit-test manifest prefetch/adoption with local `FileSurfaceManifestStore`; use temp directories for isolation.
+- For integration environments, smoke-test by pointing to the same `Surface:Manifest:RootDirectory` used by Scanner Worker and verifying pointer fetch before scan jobs execute.
+
+Coordinate status updates in the relevant `TASKS.md` entries and `docs/implplan/SPRINT_130_scanner_surface.md` once each guild completes its part. If you discover additional shared requirements, extend this guide so future consumers (CLI, Orchestrator) can reuse the flow.
diff --git a/docs/modules/scanner/design/surface-fs.md b/docs/modules/scanner/design/surface-fs.md
index 83883980b..b66abebcb 100644
--- a/docs/modules/scanner/design/surface-fs.md
+++ b/docs/modules/scanner/design/surface-fs.md
@@ -1,8 +1,9 @@
 # Surface.FS Design (Epic: SURFACE-SHARING)
 
-> **Status:** Draft v1.0 — aligns with tasks `SURFACE-FS-01..06`, `SCANNER-SURFACE-01..05`, `ZASTAVA-SURFACE-01..02`, `SCHED-SURFACE-01`, `OPS-SECRETS-01..02`.
+> **Status:** Draft v1.1 — aligns with tasks `SURFACE-FS-01..06`, `SCANNER-SURFACE-01..05`, `ZASTAVA-SURFACE-01..02`, `SCHED-SURFACE-01`, `OPS-SECRETS-01..02`.
 >
 > **Audience:** Scanner Worker/WebService, Zastava, Scheduler, DevOps.
+> **Component map:** See [Scanner architecture — §1 System landscape](../architecture.md#1-system-landscape) for end-to-end placement.
 
 ## 1. Purpose
 
@@ -26,45 +27,61 @@ Manifests describe the artefact metadata and storage pointers. They are stored i
 {
   "schema": "stellaops.surface.manifest@1",
   "tenant": "acme",
-  "kind": "layer-entry-trace",
-  "digest": "sha256:ab12...",
-  "createdAt": "2025-10-29T12:00:00Z",
-  "expiresAt": "2025-11-05T12:00:00Z",
+  "imageDigest": "sha256:cafe...",
+  "scanId": "scan-1234",
+  "generatedAt": "2025-10-29T12:00:00Z",
   "source": {
-    "scannerBuild": "stellaops/scanner@sha256:deadbeef",
-    "imageDigest": "sha256:cafe...",
-    "scanId": "scan-1234"
+    "component": "scanner.worker",
+    "version": "2025.10.0",
+    "workerInstance": "scanner-worker-1",
+    "attempt": 1
   },
-  "storage": {
-    "bucket": "surface-cache",
-    "objectKey": "tenants/acme/layer-entry-trace/sha256/ab/12/.../payload.json.zst",
-    "sizeBytes": 524288,
-    "contentType": "application/json+zstd"
-  },
-  "integrity": {
-    "hash": "sha256:ab12...",
-    "signature": null
-  }
+  "artifacts": [
+    {
+      "kind": "entrytrace.graph",
+      "uri": "cas://surface-cache/manifests/acme/ab/cd/abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789.json",
+      "digest": "sha256:abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789",
+      "mediaType": "application/vnd.stellaops.entrytrace+json",
+      "format": "json",
+      "sizeBytes": 524288,
+      "view": "runtime",
+      "storage": {
+        "bucket": "surface-cache",
+        "objectKey": "payloads/acme/entrytrace/sha256/ab/cd/abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789.ndjson.zst",
+        "sizeBytes": 524288,
+        "contentType": "application/x-ndjson+zstd"
+      },
+      "metadata": {
+        "entrypoint": "/usr/bin/java",
+        "surfaceVersion": "1"
+      }
+    }
+  ]
 }
 ```
 
+Manifest URIs follow the deterministic pattern:
+
+```
+cas://{bucket}/{prefix}/{tenant}/{digest[0..1]}/{digest[2..3]}/{digest}.json
+```
+
+The hex portion of the manifest digest is split into two directory levels to avoid hot directories. The same layout is mirrored on disk by the default `FileSurfaceManifestStore`, which keeps offline bundle sync trivial (copy the `manifests/` tree verbatim).
+
 ### 2.3 Payload Storage
 
 Large payloads (SBOM fragments, entry traces, runtime events) live in the same object store as manifests (RustFS/S3). Manifests record relative paths so offline bundles can copy both manifest and payload without modification.
 
 ## 3. APIs
 
-Surface.FS exposes a gRPC/HTTP API consumed by .NET clients:
+Surface.FS exposes .NET-first abstractions that hosts consume via DI:
 
-| Method | Description |
-|--------|-------------|
-| `PutManifest(PutManifestRequest)` | Stores manifest + optional payload. Idempotent via `digest`. |
-| `GetManifest(GetManifestRequest)` | Returns manifest metadata; 404 if missing. |
-| `GetPayload(GetPayloadRequest)` | Streams payload bytes (optionally decompressing). |
-| `ListManifests(ListManifestRequest)` | Enumerates manifests for tenant/kind with pagination. |
-| `DeleteManifest(DeleteManifestRequest)` | (Optional) Removes manifest/payload based on retention policies. |
+- `ISurfaceManifestWriter.PublishAsync(document)` – normalises artefact lists, computes the canonical SHA-256 digest, persists the manifest via the configured store, and returns a `SurfaceManifestPublishResult` containing the digest, canonical URI, and the normalised document.
+- `ISurfaceManifestReader.TryGetByUriAsync(uri)` – resolves a manifest pointer (e.g. `cas://surface-cache/manifests/...`) back into a `SurfaceManifestDocument`.
+- `ISurfaceManifestReader.TryGetByDigestAsync(digest)` – looks up a manifest by digest, scanning tenant prefixes when necessary (used by Offline Kit importers).
+- `ISurfaceCache` (`GetOrCreateAsync`, `TryGetAsync`, `SetAsync`) – lightweight content-addressable cache for hot artefacts (layer fragments, entry trace outputs) hosted on local disk.
 
-.NET client wraps these calls and handles retries using Polly policies.
+All components honour configuration bound from `Surface:Cache` and `Surface:Manifest` (or environment mirrors like `SCANNER_SURFACE_CACHE_ROOT`). `SurfaceManifestStoreOptions` controls the URI scheme/bucket/prefix and allows overriding the manifest directory while still defaulting to `/manifests`.
 
 ### WebService integration (2025-11-05)
 
@@ -78,16 +95,16 @@ Surface.FS exposes a gRPC/HTTP API consumed by .NET clients:
 
 Surface.FS library for .NET hosts provides:
 
-- `ISurfaceManifestWriter` / `ISurfaceManifestReader` interfaces.
-- Content-addressed path builder (`SurfacePathBuilder`).
-- Tenant namespace isolation and bucket configuration (via Surface.Env).
-- Local cache abstraction `ISurfaceCache` with default `FileSurfaceCache` implementation (uses `Surface:Cache:Root` / `SCANNER_SURFACE_CACHE_ROOT`, enforces quotas, serialises writes with per-key semaphores).
+- `ISurfaceManifestWriter` / `ISurfaceManifestReader` with the default `FileSurfaceManifestStore` implementation (single-writer semaphore, digest reuse, optional overwrite warning).
+- Deterministic pointer builder (`SurfaceManifestPathBuilder`) and options (`SurfaceManifestStoreOptions`, `SurfaceCacheOptions`) that align with `Surface.Env` configuration.
+- Local cache abstraction `ISurfaceCache` with default `FileSurfaceCache` implementation (uses `Surface:Cache:Root` / `SCANNER_SURFACE_CACHE_ROOT`, enforces per-key semaphores, stores bytes verbatim).
 - `SurfaceCacheKey` helper that normalises cache entries as `{namespace}/{tenant}/{sha256}`. EntryTrace graphs use the `entrytrace.graph` namespace so Worker/WebService/CLI can share cached results deterministically.
-- Metrics: `surface_manifest_put_seconds`, `surface_manifest_cache_hit_total`, etc.
+- JSON serialiser (`SurfaceCacheJsonSerializer`) that applies camelCase naming, ignores nulls, and uses a stable encoder for reproducible hashing.
+- Metrics: `surface_manifest_published_total`, `surface_manifest_cache_hit_total`, plus host-specific counters wired via Scanner Worker instrumentation.
 
 ## 5. Retention & Eviction
 
-- Manifests include optional `expiresAt`; Worker defaults to 30 days for SBOM fragments, 7 days for entry traces.
+- Manifests capture `generatedAt`; retention windows (30 days for SBOM fragments, 7 days for entry traces) are enforced by job configuration and object-store lifecycle policies. An `expiresAt` field is reserved for future use when automated eviction is introduced.
 - Background job `SurfaceCacheMaintenanceService` evicts local cache entries exceeding quota, oldest-first.
 - Object storage retention policies are managed by DevOps; library exposes metrics but does not auto-delete unless instructed.
 
@@ -98,13 +115,13 @@ Offline kits include:
 ```
 offline/surface/
   manifests/
-    tenants///.json
+    ///.json
   payloads/
-    tenants///.json.zst
+    ////.json.zst
   manifest-index.json
 ```
 
-Import script calls `PutManifest` for each manifest, verifying digests. This enables Zastava and Scheduler running offline to consume cached data without re-scanning.
+Import script uses `ISurfaceManifestWriter.PublishAsync` for each manifest after verifying the embedded digest, keeping Offline Kit replays identical to online flows. This enables Zastava and Scheduler running offline to consume cached data without re-scanning.
### 6.1 EntryTrace Cache Usage diff --git a/src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/ConcelierAdvisoryDocumentProviderTests.cs b/src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/ConcelierAdvisoryDocumentProviderTests.cs index 9fc1840dd..8212a7fd1 100644 --- a/src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/ConcelierAdvisoryDocumentProviderTests.cs +++ b/src/AdvisoryAI/__Tests/StellaOps.AdvisoryAI.Tests/ConcelierAdvisoryDocumentProviderTests.cs @@ -27,7 +27,9 @@ public sealed class ConcelierAdvisoryDocumentProviderTests ImmutableDictionary.Empty), content: new RawContent("csaf", "2.0", JsonDocument.Parse("{\"document\": {\"notes\": []}, \"vulnerabilities\": []}").RootElement), identifiers: new RawIdentifiers(ImmutableArray.Empty, "UP-1"), - linkset: new RawLinkset()); + linkset: new RawLinkset(), + advisoryKey: "UP-1", + links: ImmutableArray.Create(new RawLink("PRIMARY", "UP-1"))); var records = new[] { @@ -69,6 +71,13 @@ public sealed class ConcelierAdvisoryDocumentProviderTests public Task QueryAsync(AdvisoryRawQueryOptions options, CancellationToken cancellationToken) => Task.FromResult(new AdvisoryRawQueryResult(_records, NextCursor: null, HasMore: false)); + public Task> FindByAdvisoryKeyAsync( + string tenant, + IReadOnlyCollection searchValues, + IReadOnlyCollection sourceVendors, + CancellationToken cancellationToken) + => Task.FromResult(_records); + public Task VerifyAsync(AdvisoryRawVerificationRequest request, CancellationToken cancellationToken) => throw new NotImplementedException(); } diff --git a/src/Concelier/StellaOps.Concelier.WebService/Contracts/AdvisoryRawContracts.cs b/src/Concelier/StellaOps.Concelier.WebService/Contracts/AdvisoryRawContracts.cs index 4072da968..5a3ac32e9 100644 --- a/src/Concelier/StellaOps.Concelier.WebService/Contracts/AdvisoryRawContracts.cs +++ b/src/Concelier/StellaOps.Concelier.WebService/Contracts/AdvisoryRawContracts.cs @@ -74,16 +74,20 @@ public sealed record AdvisoryRawRecordResponse( [property: 
JsonPropertyName("createdAt")] DateTimeOffset CreatedAt, [property: JsonPropertyName("document")] AdvisoryRawDocument Document); -public sealed record AdvisoryRawListResponse( - [property: JsonPropertyName("records")] IReadOnlyList Records, - [property: JsonPropertyName("nextCursor")] string? NextCursor, - [property: JsonPropertyName("hasMore")] bool HasMore); - -public sealed record AdvisoryRawProvenanceResponse( - [property: JsonPropertyName("id")] string Id, - [property: JsonPropertyName("tenant")] string Tenant, - [property: JsonPropertyName("source")] RawSourceMetadata Source, - [property: JsonPropertyName("upstream")] RawUpstreamMetadata Upstream, +public sealed record AdvisoryRawListResponse( + [property: JsonPropertyName("records")] IReadOnlyList Records, + [property: JsonPropertyName("nextCursor")] string? NextCursor, + [property: JsonPropertyName("hasMore")] bool HasMore); + +public sealed record AdvisoryEvidenceResponse( + [property: JsonPropertyName("advisoryKey")] string AdvisoryKey, + [property: JsonPropertyName("records")] IReadOnlyList Records); + +public sealed record AdvisoryRawProvenanceResponse( + [property: JsonPropertyName("id")] string Id, + [property: JsonPropertyName("tenant")] string Tenant, + [property: JsonPropertyName("source")] RawSourceMetadata Source, + [property: JsonPropertyName("upstream")] RawUpstreamMetadata Upstream, [property: JsonPropertyName("supersedes")] string? 
Supersedes, [property: JsonPropertyName("ingestedAt")] DateTimeOffset IngestedAt, [property: JsonPropertyName("createdAt")] DateTimeOffset CreatedAt); diff --git a/src/Concelier/StellaOps.Concelier.WebService/Extensions/AdvisoryRawRequestMapper.cs b/src/Concelier/StellaOps.Concelier.WebService/Extensions/AdvisoryRawRequestMapper.cs index 00b6560b7..1dc4ffe73 100644 --- a/src/Concelier/StellaOps.Concelier.WebService/Extensions/AdvisoryRawRequestMapper.cs +++ b/src/Concelier/StellaOps.Concelier.WebService/Extensions/AdvisoryRawRequestMapper.cs @@ -61,24 +61,26 @@ internal static class AdvisoryRawRequestMapper identifiersRequest.Primary); var linksetRequest = request.Linkset; - var linkset = new RawLinkset - { - Aliases = NormalizeStrings(linksetRequest?.Aliases), - PackageUrls = NormalizeStrings(linksetRequest?.PackageUrls), - Cpes = NormalizeStrings(linksetRequest?.Cpes), - References = NormalizeReferences(linksetRequest?.References), - ReconciledFrom = NormalizeStrings(linksetRequest?.ReconciledFrom), - Notes = NormalizeDictionary(linksetRequest?.Notes) - }; - - return new AdvisoryRawDocument( - tenant.Trim().ToLowerInvariant(), - source, - upstream, - content, - identifiers, - linkset); - } + var linkset = new RawLinkset + { + Aliases = NormalizeStrings(linksetRequest?.Aliases), + PackageUrls = NormalizeStrings(linksetRequest?.PackageUrls), + Cpes = NormalizeStrings(linksetRequest?.Cpes), + References = NormalizeReferences(linksetRequest?.References), + ReconciledFrom = NormalizeStrings(linksetRequest?.ReconciledFrom), + Notes = NormalizeDictionary(linksetRequest?.Notes) + }; + + return new AdvisoryRawDocument( + tenant.Trim().ToLowerInvariant(), + source, + upstream, + content, + identifiers, + linkset, + AdvisoryKey: string.Empty, + Links: ImmutableArray.Empty); + } internal static ImmutableArray NormalizeStrings(IEnumerable? 
values) { diff --git a/src/Concelier/StellaOps.Concelier.WebService/Options/ConcelierOptions.cs b/src/Concelier/StellaOps.Concelier.WebService/Options/ConcelierOptions.cs index b37f39c51..4750fed16 100644 --- a/src/Concelier/StellaOps.Concelier.WebService/Options/ConcelierOptions.cs +++ b/src/Concelier/StellaOps.Concelier.WebService/Options/ConcelierOptions.cs @@ -144,7 +144,7 @@ public sealed class ConcelierOptions public sealed class FeaturesOptions { - public bool NoMergeEnabled { get; set; } + public bool NoMergeEnabled { get; set; } = true; public bool LnmShadowWrites { get; set; } = true; diff --git a/src/Concelier/StellaOps.Concelier.WebService/Program.cs b/src/Concelier/StellaOps.Concelier.WebService/Program.cs index 9c4cc9908..5830eb4a2 100644 --- a/src/Concelier/StellaOps.Concelier.WebService/Program.cs +++ b/src/Concelier/StellaOps.Concelier.WebService/Program.cs @@ -672,6 +672,59 @@ if (authorityConfigured) advisoryRawProvenanceEndpoint.RequireAuthorization(AdvisoryReadPolicyName); } +var advisoryEvidenceEndpoint = app.MapGet("/vuln/evidence/advisories/{advisoryKey}", async ( + string advisoryKey, + HttpContext context, + [FromServices] IAdvisoryRawService rawService, + CancellationToken cancellationToken) => +{ + ApplyNoCache(context.Response); + + if (!TryResolveTenant(context, requireHeader: false, out var tenant, out var tenantError)) + { + return tenantError; + } + + var authorizationError = EnsureTenantAuthorized(context, tenant); + if (authorizationError is not null) + { + return authorizationError; + } + + if (string.IsNullOrWhiteSpace(advisoryKey)) + { + return Problem(context, "advisoryKey is required", StatusCodes.Status400BadRequest, ProblemTypes.Validation, "Provide an advisory identifier."); + } + + var vendorFilter = AdvisoryRawRequestMapper.NormalizeStrings(context.Request.Query["vendor"]); + var records = await rawService.FindByAdvisoryKeyAsync( + tenant, + advisoryKey, + vendorFilter, + cancellationToken).ConfigureAwait(false); + + if 
(records.Count == 0) + { + return Results.NotFound(); + } + + var recordResponses = records + .Select(record => new AdvisoryRawRecordResponse( + record.Id, + record.Document.Tenant, + record.IngestedAt, + record.CreatedAt, + record.Document)) + .ToArray(); + + var response = new AdvisoryEvidenceResponse(recordResponses[0].Document.AdvisoryKey, recordResponses); + return JsonResult(response); +}); +if (authorityConfigured) +{ + advisoryEvidenceEndpoint.RequireAuthorization(AdvisoryReadPolicyName); +} + var aocVerifyEndpoint = app.MapPost("/aoc/verify", async ( HttpContext context, AocVerifyRequest request, diff --git a/src/Concelier/StellaOps.Concelier.WebService/TASKS.md b/src/Concelier/StellaOps.Concelier.WebService/TASKS.md index c5f5621c9..1e92d591b 100644 --- a/src/Concelier/StellaOps.Concelier.WebService/TASKS.md +++ b/src/Concelier/StellaOps.Concelier.WebService/TASKS.md @@ -55,8 +55,8 @@ | ID | Status | Owner(s) | Depends on | Notes | |----|--------|----------|------------|-------| -| CONCELIER-VULN-29-001 `Advisory key canonicalization` | TODO | Concelier WebService Guild, Data Integrity Guild | CONCELIER-LNM-21-001 | Canonicalize (lossless) advisory identifiers (CVE/GHSA/vendor) into `advisory_key`, persist `links[]`, expose raw payload snapshots for Explorer evidence tabs; AOC-compliant: no merge, no derived fields, no suppression. Include migration/backfill scripts. | -| CONCELIER-VULN-29-002 `Evidence retrieval API` | TODO | Concelier WebService Guild | CONCELIER-VULN-29-001, VULN-API-29-003 | Provide `/vuln/evidence/advisories/{advisory_key}` returning raw advisory docs with provenance, filtering by tenant and source. 
| +| CONCELIER-VULN-29-001 `Advisory key canonicalization` | DONE (2025-11-07) | Concelier WebService Guild, Data Integrity Guild | CONCELIER-LNM-21-001 | Canonicalize (lossless) advisory identifiers (CVE/GHSA/vendor) into `advisory_key`, persist `links[]`, expose raw payload snapshots for Explorer evidence tabs; AOC-compliant: no merge, no derived fields, no suppression. Include migration/backfill scripts. | +| CONCELIER-VULN-29-002 `Evidence retrieval API` | DOING (2025-11-07) | Concelier WebService Guild | CONCELIER-VULN-29-001, VULN-API-29-003 | Provide `/vuln/evidence/advisories/{advisory_key}` returning raw advisory docs with provenance, filtering by tenant and source. | | CONCELIER-VULN-29-004 `Observability enhancements` | TODO | Concelier WebService Guild, Observability Guild | CONCELIER-VULN-29-001 | Instrument metrics/logs for observation + linkset pipelines (identifier collisions, withdrawn flags) and emit events consumed by Vuln Explorer resolver. | ## Advisory AI (Sprint 31) diff --git a/src/Concelier/__Libraries/StellaOps.Concelier.Core/Raw/AdvisoryCanonicalizer.cs b/src/Concelier/__Libraries/StellaOps.Concelier.Core/Raw/AdvisoryCanonicalizer.cs new file mode 100644 index 000000000..2180a0a80 --- /dev/null +++ b/src/Concelier/__Libraries/StellaOps.Concelier.Core/Raw/AdvisoryCanonicalizer.cs @@ -0,0 +1,195 @@ +using System.Collections.Generic; +using System.Collections.Immutable; +using System.Linq; +using StellaOps.Concelier.Models; +using StellaOps.Concelier.RawModels; + +namespace StellaOps.Concelier.Core.Raw; + +public static class AdvisoryCanonicalizer +{ + private static readonly ImmutableArray PrimarySchemePriority = new[] + { + AliasSchemes.Cve, + AliasSchemes.Ghsa, + AliasSchemes.OsV, + AliasSchemes.Bdu, + AliasSchemes.Jvn, + AliasSchemes.Jvndb, + AliasSchemes.Rhsa, + AliasSchemes.Usn, + AliasSchemes.Dsa, + AliasSchemes.SuseSu, + AliasSchemes.Icsa, + AliasSchemes.Msrc, + AliasSchemes.CiscoSa, + AliasSchemes.OracleCpu, + AliasSchemes.Apsb, + 
AliasSchemes.Apa, + AliasSchemes.AppleHt, + AliasSchemes.Vmsa, + AliasSchemes.Vu, + AliasSchemes.ChromiumPost, + }.ToImmutableArray(); + + private const string PrimaryScheme = "PRIMARY"; + private const string UnscopedScheme = "UNSCOPED"; + + public static AdvisoryCanonicalizationResult Canonicalize( + RawIdentifiers identifiers, + RawSourceMetadata source, + RawUpstreamMetadata upstream) + { + ArgumentNullException.ThrowIfNull(identifiers); + ArgumentNullException.ThrowIfNull(source); + ArgumentNullException.ThrowIfNull(upstream); + + var candidates = new List<(string Scheme, string Value)>(); + + void AddCandidate(string? rawValue) + { + if (string.IsNullOrWhiteSpace(rawValue)) + { + return; + } + + var trimmed = rawValue.Trim(); + if (AliasSchemeRegistry.TryNormalize(trimmed, out var normalized, out var scheme)) + { + if (!string.IsNullOrEmpty(normalized) && !string.IsNullOrEmpty(scheme)) + { + candidates.Add((scheme, normalized)); + } + } + else + { + candidates.Add((UnscopedScheme, trimmed)); + } + } + + AddCandidate(identifiers.PrimaryId); + + if (!identifiers.Aliases.IsDefaultOrEmpty) + { + foreach (var alias in identifiers.Aliases) + { + AddCandidate(alias); + } + } + + var unique = new Dictionary<(string Scheme, string Value), RawLink>(CandidateComparer.Instance); + foreach (var candidate in candidates) + { + var key = (candidate.Scheme, candidate.Value); + if (!unique.ContainsKey(key)) + { + unique[key] = new RawLink(candidate.Scheme, candidate.Value); + } + } + + var advisoryKey = SelectCanonicalKey(unique.Keys, identifiers, source, upstream); + unique[(PrimaryScheme, advisoryKey)] = new RawLink(PrimaryScheme, advisoryKey); + + var links = unique.Values + .OrderBy(static link => link.Scheme, StringComparer.Ordinal) + .ThenBy(static link => link.Value, StringComparer.Ordinal) + .ToImmutableArray(); + + return new AdvisoryCanonicalizationResult(advisoryKey, links); + } + + private static string SelectCanonicalKey( + IEnumerable<(string Scheme, string 
Value)> candidates, + RawIdentifiers identifiers, + RawSourceMetadata source, + RawUpstreamMetadata upstream) + { + foreach (var preferredScheme in PrimarySchemePriority) + { + var match = candidates.FirstOrDefault(candidate => + string.Equals(candidate.Scheme, preferredScheme, StringComparison.OrdinalIgnoreCase)); + if (!string.IsNullOrEmpty(match.Value)) + { + return match.Value; + } + } + + var firstCandidate = candidates.FirstOrDefault(); + if (!string.IsNullOrEmpty(firstCandidate.Value)) + { + return firstCandidate.Value; + } + + var fallbackValue = identifiers.PrimaryId; + if (string.IsNullOrWhiteSpace(fallbackValue)) + { + fallbackValue = upstream.UpstreamId; + } + + if (string.IsNullOrWhiteSpace(fallbackValue)) + { + fallbackValue = upstream.ContentHash; + } + + fallbackValue = (fallbackValue ?? "unknown").Trim(); + var vendor = NormalizeVendor(source.Vendor); + return $"{vendor}:{NormalizeFallbackValue(fallbackValue)}"; + } + + private static string NormalizeVendor(string vendor) + { + if (string.IsNullOrWhiteSpace(vendor)) + { + return "UNKNOWN"; + } + + return vendor.Trim().ToUpperInvariant(); + } + + private static string NormalizeFallbackValue(string value) + { + if (string.IsNullOrWhiteSpace(value)) + { + return "unknown"; + } + + var trimmed = value.Trim(); + var builder = new char[trimmed.Length]; + var index = 0; + + foreach (var ch in trimmed) + { + if (char.IsWhiteSpace(ch)) + { + builder[index++] = '-'; + } + else + { + builder[index++] = char.ToUpperInvariant(ch); + } + } + + return new string(builder, 0, index); + } + + private sealed class CandidateComparer : IEqualityComparer<(string Scheme, string Value)> + { + public static CandidateComparer Instance { get; } = new(); + + public bool Equals((string Scheme, string Value) x, (string Scheme, string Value) y) + => string.Equals(x.Scheme, y.Scheme, StringComparison.OrdinalIgnoreCase) && + string.Equals(x.Value, y.Value, StringComparison.Ordinal); + + public int GetHashCode((string Scheme, 
string Value) obj) + { + unchecked + { + var schemeHash = obj.Scheme?.ToUpperInvariant().GetHashCode(StringComparison.Ordinal) ?? 0; + var valueHash = obj.Value?.GetHashCode(StringComparison.Ordinal) ?? 0; + return (schemeHash * 397) ^ valueHash; + } + } + } +} + +public sealed record AdvisoryCanonicalizationResult(string AdvisoryKey, ImmutableArray Links); diff --git a/src/Concelier/__Libraries/StellaOps.Concelier.Core/Raw/AdvisoryRawService.cs b/src/Concelier/__Libraries/StellaOps.Concelier.Core/Raw/AdvisoryRawService.cs index 4d1c9fdfb..8053e5c77 100644 --- a/src/Concelier/__Libraries/StellaOps.Concelier.Core/Raw/AdvisoryRawService.cs +++ b/src/Concelier/__Libraries/StellaOps.Concelier.Core/Raw/AdvisoryRawService.cs @@ -8,7 +8,8 @@ using Microsoft.Extensions.Logging; using StellaOps.Aoc; using StellaOps.Concelier.Core.Aoc; using StellaOps.Concelier.Core.Linksets; -using StellaOps.Concelier.RawModels; +using StellaOps.Concelier.RawModels; +using StellaOps.Concelier.Models; namespace StellaOps.Concelier.Core.Raw; @@ -99,12 +100,37 @@ internal sealed class AdvisoryRawService : IAdvisoryRawService return _repository.FindByIdAsync(normalizedTenant, normalizedId, cancellationToken); } - public Task QueryAsync(AdvisoryRawQueryOptions options, CancellationToken cancellationToken) - { - ArgumentNullException.ThrowIfNull(options); - return _repository.QueryAsync(options, cancellationToken); - } - + public Task QueryAsync(AdvisoryRawQueryOptions options, CancellationToken cancellationToken) + { + ArgumentNullException.ThrowIfNull(options); + return _repository.QueryAsync(options, cancellationToken); + } + + public Task> FindByAdvisoryKeyAsync( + string tenant, + string advisoryKey, + IReadOnlyCollection sourceVendors, + CancellationToken cancellationToken) + { + ArgumentException.ThrowIfNullOrWhiteSpace(tenant); + ArgumentException.ThrowIfNullOrWhiteSpace(advisoryKey); + + var normalizedTenant = NormalizeTenant(tenant); + var searchValues = 
BuildAdvisoryKeySearchValues(advisoryKey); + if (searchValues.Length == 0) + { + return Task.FromResult>(Array.Empty()); + } + + var vendors = NormalizeSourceVendors(sourceVendors); + + return _repository.FindByAdvisoryKeyAsync( + normalizedTenant, + searchValues, + vendors, + cancellationToken); + } + public async Task VerifyAsync(AdvisoryRawVerificationRequest request, CancellationToken cancellationToken) { ArgumentNullException.ThrowIfNull(request); @@ -249,16 +275,20 @@ internal sealed class AdvisoryRawService : IAdvisoryRawService var upstream = NormalizeUpstream(document.Upstream); var content = NormalizeContent(document.Content); var identifiers = NormalizeIdentifiers(document.Identifiers); - var linkset = NormalizeLinkset(document.Linkset); - - return new AdvisoryRawDocument( - tenant, - source, - upstream, - content, - identifiers, - linkset, - Supersedes: null); + var linkset = NormalizeLinkset(document.Linkset); + var canonical = AdvisoryCanonicalizer.Canonicalize(identifiers, source, upstream); + var links = canonical.Links.IsDefault ? ImmutableArray.Empty : canonical.Links; + + return new AdvisoryRawDocument( + tenant, + source, + upstream, + content, + identifiers, + linkset, + canonical.AdvisoryKey, + links.IsDefaultOrEmpty ? ImmutableArray.Empty : links, + Supersedes: null); } private static RawSourceMetadata NormalizeSource(RawSourceMetadata source) @@ -377,6 +407,27 @@ internal sealed class AdvisoryRawService : IAdvisoryRawService }; } + private static ImmutableArray NormalizeSourceVendors(IReadOnlyCollection sourceVendors) + { + if (sourceVendors is null || sourceVendors.Count == 0) + { + return ImmutableArray.Empty; + } + + var builder = ImmutableArray.CreateBuilder(sourceVendors.Count); + foreach (var vendor in sourceVendors) + { + if (string.IsNullOrWhiteSpace(vendor)) + { + continue; + } + + builder.Add(NormalizeSourceVendor(vendor)); + } + + return builder.Count == 0 ? 
ImmutableArray.Empty : builder.ToImmutable(); + } + private static ImmutableArray NormalizeStringArray(ImmutableArray values) { if (values.IsDefaultOrEmpty) @@ -419,14 +470,39 @@ internal sealed class AdvisoryRawService : IAdvisoryRawService string.IsNullOrWhiteSpace(reference.Source) ? null : reference.Source.Trim())); } - return builder.ToImmutable(); - } - - private JsonElement ToJsonElement(AdvisoryRawDocument document) - { - var json = System.Text.Json.JsonSerializer.Serialize(document); - using var jsonDocument = System.Text.Json.JsonDocument.Parse(json); - return jsonDocument.RootElement.Clone(); + return builder.ToImmutable(); + } + + private static ImmutableArray BuildAdvisoryKeySearchValues(string advisoryKey) + { + var trimmed = advisoryKey?.Trim(); + if (string.IsNullOrWhiteSpace(trimmed)) + { + return ImmutableArray.Empty; + } + + var set = new HashSet(StringComparer.Ordinal) + { + trimmed, + trimmed.ToUpperInvariant() + }; + + if (AliasSchemeRegistry.TryNormalize(trimmed, out var normalized, out _) + && !string.IsNullOrWhiteSpace(normalized)) + { + set.Add(normalized); + } + + return set.Count == 0 + ? 
ImmutableArray.Empty + : ImmutableArray.CreateRange(set); + } + + private JsonElement ToJsonElement(AdvisoryRawDocument document) + { + var json = System.Text.Json.JsonSerializer.Serialize(document); + using var jsonDocument = System.Text.Json.JsonDocument.Parse(json); + return jsonDocument.RootElement.Clone(); } private sealed class VerificationAggregation diff --git a/src/Concelier/__Libraries/StellaOps.Concelier.Core/Raw/IAdvisoryRawRepository.cs b/src/Concelier/__Libraries/StellaOps.Concelier.Core/Raw/IAdvisoryRawRepository.cs index fb29ec972..3d26f2800 100644 --- a/src/Concelier/__Libraries/StellaOps.Concelier.Core/Raw/IAdvisoryRawRepository.cs +++ b/src/Concelier/__Libraries/StellaOps.Concelier.Core/Raw/IAdvisoryRawRepository.cs @@ -23,13 +23,22 @@ public interface IAdvisoryRawRepository /// /// Queries raw documents using the supplied filter/paging options. /// - Task QueryAsync(AdvisoryRawQueryOptions options, CancellationToken cancellationToken); - - /// - /// Enumerates raw advisory documents for verification runs. - /// - Task> ListForVerificationAsync( - string tenant, + Task QueryAsync(AdvisoryRawQueryOptions options, CancellationToken cancellationToken); + + /// + /// Retrieves all raw documents associated with the supplied advisory key (or alias) for a tenant. + /// + Task> FindByAdvisoryKeyAsync( + string tenant, + IReadOnlyCollection searchValues, + IReadOnlyCollection sourceVendors, + CancellationToken cancellationToken); + + /// + /// Enumerates raw advisory documents for verification runs. 
+ /// + Task> ListForVerificationAsync( + string tenant, DateTimeOffset since, DateTimeOffset until, IReadOnlyCollection sourceVendors, diff --git a/src/Concelier/__Libraries/StellaOps.Concelier.Core/Raw/IAdvisoryRawService.cs b/src/Concelier/__Libraries/StellaOps.Concelier.Core/Raw/IAdvisoryRawService.cs index 62c3591c7..574293038 100644 --- a/src/Concelier/__Libraries/StellaOps.Concelier.Core/Raw/IAdvisoryRawService.cs +++ b/src/Concelier/__Libraries/StellaOps.Concelier.Core/Raw/IAdvisoryRawService.cs @@ -7,13 +7,19 @@ namespace StellaOps.Concelier.Core.Raw; /// public interface IAdvisoryRawService { - Task IngestAsync(AdvisoryRawDocument document, CancellationToken cancellationToken); - - Task FindByIdAsync(string tenant, string id, CancellationToken cancellationToken); - - Task QueryAsync(AdvisoryRawQueryOptions options, CancellationToken cancellationToken); - - Task VerifyAsync(AdvisoryRawVerificationRequest request, CancellationToken cancellationToken); + Task IngestAsync(AdvisoryRawDocument document, CancellationToken cancellationToken); + + Task FindByIdAsync(string tenant, string id, CancellationToken cancellationToken); + + Task QueryAsync(AdvisoryRawQueryOptions options, CancellationToken cancellationToken); + + Task> FindByAdvisoryKeyAsync( + string tenant, + string advisoryKey, + IReadOnlyCollection sourceVendors, + CancellationToken cancellationToken); + + Task VerifyAsync(AdvisoryRawVerificationRequest request, CancellationToken cancellationToken); } /// diff --git a/src/Concelier/__Libraries/StellaOps.Concelier.Merge/TASKS.md b/src/Concelier/__Libraries/StellaOps.Concelier.Merge/TASKS.md index a23352663..0b714a6d2 100644 --- a/src/Concelier/__Libraries/StellaOps.Concelier.Merge/TASKS.md +++ b/src/Concelier/__Libraries/StellaOps.Concelier.Merge/TASKS.md @@ -10,6 +10,6 @@ | Task | Owner(s) | Depends on | Notes | |---|---|---|---| |MERGE-LNM-21-001 Migration plan authoring|BE-Merge, Architecture Guild|CONCELIER-LNM-21-101|**DONE (2025-11-03)** – 
Authored `docs/migration/no-merge.md` with rollout phases, backfill/validation checklists, rollback guidance, and ownership matrix for the Link-Not-Merge cutover.| -|MERGE-LNM-21-002 Merge service deprecation|BE-Merge|MERGE-LNM-21-001|**DOING (2025-11-07)** – Defaulted `concelier:features:noMergeEnabled` to `true`, added merge job allowlist gate, and began rewiring guard/tier tests; follow-up work required to restore Concelier WebService test suite before declaring completion.
2025-11-05 14:42Z: Implemented `concelier:features:noMergeEnabled` gate, merge job allowlist checks, `[Obsolete]` markings, and analyzer scaffolding to steer consumers toward linkset APIs.
2025-11-06 16:10Z: Introduced Roslyn analyzer (`CONCELIER0002`) referenced by Concelier WebService + tests, documented suppression guidance, and updated migration playbook.
2025-11-07 03:25Z: Default-on toggle + job gating break existing Concelier WebService tests; guard + seed fixes pending to unblock ingest/mirror suites.
2025-11-07 07:05Z: Added ingest logging + test log dumps to trace upstream hash loss; still chasing why Minimal API binding strips `upstream.contentHash` before the guard runs.| +|MERGE-LNM-21-002 Merge service deprecation|BE-Merge|MERGE-LNM-21-001|**DONE (2025-11-07)** – Feature flag now defaults to Link-Not-Merge mode (`NoMergeEnabled=true`) across options/config, analyzers enforce deprecation, and WebService option tests cover the regression; dotnet CLI validation still queued for a workstation with preview SDK.
2025-11-05 14:42Z: Implemented `concelier:features:noMergeEnabled` gate, merge job allowlist checks, `[Obsolete]` markings, and analyzer scaffolding to steer consumers toward linkset APIs.
2025-11-06 16:10Z: Introduced Roslyn analyzer (`CONCELIER0002`) referenced by Concelier WebService + tests, documented suppression guidance, and updated migration playbook.
2025-11-07 03:25Z: Default-on toggle + job gating surfacing ingestion test brittleness; guard logs capture requests missing `upstream.contentHash`.
2025-11-07 19:45Z: Set `ConcelierOptions.Features.NoMergeEnabled` default to `true`, added regression coverage (`Features_NoMergeEnabled_DefaultsToTrue`), and rechecked ingest helpers to carry canonical links before closing the task.| > 2025-11-03: Catalogued call sites (WebService Program `AddMergeModule`, built-in job registration `merge:reconcile`, `MergeReconcileJob`) and confirmed unit tests are the only direct `MergeAsync` callers; next step is to define analyzer + replacement observability coverage. -|MERGE-LNM-21-003 Determinism/test updates|QA Guild, BE-Merge|MERGE-LNM-21-002|Replace merge determinism suites with observation/linkset regression tests verifying no data mutation and conflicts remain visible.| +|MERGE-LNM-21-003 Determinism/test updates|QA Guild, BE-Merge|MERGE-LNM-21-002|**DOING (2025-11-07)** – Replacing legacy merge determinism harness with observation/linkset regression plan; tracking scenarios in `docs/dev/lnm-determinism-tests.md` before porting fixtures.
2025-11-07 20:05Z: Ported merge determinism fixture into `AdvisoryObservationFactoryTests.Create_IsDeterministicAcrossRuns` and removed the redundant merge integration test.| diff --git a/src/Concelier/__Libraries/StellaOps.Concelier.RawModels/AdvisoryRawDocument.cs b/src/Concelier/__Libraries/StellaOps.Concelier.RawModels/AdvisoryRawDocument.cs index 2d7c86709..b63ac1242 100644 --- a/src/Concelier/__Libraries/StellaOps.Concelier.RawModels/AdvisoryRawDocument.cs +++ b/src/Concelier/__Libraries/StellaOps.Concelier.RawModels/AdvisoryRawDocument.cs @@ -1,21 +1,23 @@ -using System.Collections.Immutable; -using System.Text.Json; -using System.Text.Json.Serialization; - -namespace StellaOps.Concelier.RawModels; - -public sealed record AdvisoryRawDocument( - [property: JsonPropertyName("tenant")] string Tenant, - [property: JsonPropertyName("source")] RawSourceMetadata Source, - [property: JsonPropertyName("upstream")] RawUpstreamMetadata Upstream, - [property: JsonPropertyName("content")] RawContent Content, - [property: JsonPropertyName("identifiers")] RawIdentifiers Identifiers, - [property: JsonPropertyName("linkset")] RawLinkset Linkset, - [property: JsonPropertyName("supersedes")] string? 
Supersedes = null) -{ - public AdvisoryRawDocument WithSupersedes(string supersedes) - => this with { Supersedes = supersedes }; -} +using System.Collections.Immutable; +using System.Text.Json; +using System.Text.Json.Serialization; + +namespace StellaOps.Concelier.RawModels; + +public sealed record AdvisoryRawDocument( + [property: JsonPropertyName("tenant")] string Tenant, + [property: JsonPropertyName("source")] RawSourceMetadata Source, + [property: JsonPropertyName("upstream")] RawUpstreamMetadata Upstream, + [property: JsonPropertyName("content")] RawContent Content, + [property: JsonPropertyName("identifiers")] RawIdentifiers Identifiers, + [property: JsonPropertyName("linkset")] RawLinkset Linkset, + [property: JsonPropertyName("advisory_key")] string AdvisoryKey = "", + [property: JsonPropertyName("links")] ImmutableArray Links = default, + [property: JsonPropertyName("supersedes")] string? Supersedes = null) +{ + public AdvisoryRawDocument WithSupersedes(string supersedes) + => this with { Supersedes = supersedes }; +} public sealed record RawSourceMetadata( [property: JsonPropertyName("vendor")] string Vendor, @@ -49,10 +51,10 @@ public sealed record RawIdentifiers( [property: JsonPropertyName("aliases")] ImmutableArray Aliases, [property: JsonPropertyName("primary")] string PrimaryId); -public sealed record RawLinkset -{ - [JsonPropertyName("aliases")] - public ImmutableArray Aliases { get; init; } = ImmutableArray.Empty; +public sealed record RawLinkset +{ + [JsonPropertyName("aliases")] + public ImmutableArray Aliases { get; init; } = ImmutableArray.Empty; [JsonPropertyName("purls")] public ImmutableArray PackageUrls { get; init; } = ImmutableArray.Empty; @@ -68,9 +70,13 @@ public sealed record RawLinkset [JsonPropertyName("notes")] public ImmutableDictionary Notes { get; init; } = ImmutableDictionary.Empty; -} - -public sealed record RawReference( - [property: JsonPropertyName("type")] string Type, - [property: JsonPropertyName("url")] string Url, - 
[property: JsonPropertyName("source")] string? Source = null); +} + +public sealed record RawReference( + [property: JsonPropertyName("type")] string Type, + [property: JsonPropertyName("url")] string Url, + [property: JsonPropertyName("source")] string? Source = null); + +public sealed record RawLink( + [property: JsonPropertyName("scheme")] string Scheme, + [property: JsonPropertyName("value")] string Value); diff --git a/src/Concelier/__Libraries/StellaOps.Concelier.RawModels/RawDocumentFactory.cs b/src/Concelier/__Libraries/StellaOps.Concelier.RawModels/RawDocumentFactory.cs index 0f3022517..3b42d2aa2 100644 --- a/src/Concelier/__Libraries/StellaOps.Concelier.RawModels/RawDocumentFactory.cs +++ b/src/Concelier/__Libraries/StellaOps.Concelier.RawModels/RawDocumentFactory.cs @@ -5,18 +5,21 @@ namespace StellaOps.Concelier.RawModels; public static class RawDocumentFactory { - public static AdvisoryRawDocument CreateAdvisory( - string tenant, - RawSourceMetadata source, - RawUpstreamMetadata upstream, - RawContent content, - RawIdentifiers identifiers, - RawLinkset linkset, - string? supersedes = null) - { - var clonedContent = content with { Raw = Clone(content.Raw) }; - return new AdvisoryRawDocument(tenant, source, upstream, clonedContent, identifiers, linkset, supersedes); - } + public static AdvisoryRawDocument CreateAdvisory( + string tenant, + RawSourceMetadata source, + RawUpstreamMetadata upstream, + RawContent content, + RawIdentifiers identifiers, + RawLinkset linkset, + string advisoryKey, + ImmutableArray links, + string? supersedes = null) + { + var clonedContent = content with { Raw = Clone(content.Raw) }; + var normalizedLinks = links.IsDefault ? 
ImmutableArray.Empty : links; + return new AdvisoryRawDocument(tenant, source, upstream, clonedContent, identifiers, linkset, advisoryKey, normalizedLinks, supersedes); + } public static VexRawDocument CreateVex( string tenant, diff --git a/src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/Migrations/EnsureAdvisoryCanonicalKeyBackfillMigration.cs b/src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/Migrations/EnsureAdvisoryCanonicalKeyBackfillMigration.cs new file mode 100644 index 000000000..1ac851d21 --- /dev/null +++ b/src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/Migrations/EnsureAdvisoryCanonicalKeyBackfillMigration.cs @@ -0,0 +1,166 @@ +using System; +using System.Collections.Generic; +using System.Collections.Immutable; +using System.Globalization; +using System.Linq; +using System.Threading; +using System.Threading.Tasks; +using MongoDB.Bson; +using MongoDB.Driver; +using StellaOps.Concelier.Core.Raw; +using StellaOps.Concelier.RawModels; + +namespace StellaOps.Concelier.Storage.Mongo.Migrations; + +public sealed class EnsureAdvisoryCanonicalKeyBackfillMigration : IMongoMigration +{ + public string Id => "2025-11-07-advisory-canonical-key"; + + public string Description => "Populate advisory_key and links for advisory_raw documents."; + + public async Task ApplyAsync(IMongoDatabase database, CancellationToken cancellationToken) + { + ArgumentNullException.ThrowIfNull(database); + + var collection = database.GetCollection(MongoStorageDefaults.Collections.AdvisoryRaw); + var filter = Builders.Filter.Or( + Builders.Filter.Exists("advisory_key", false), + Builders.Filter.Type("advisory_key", BsonType.Null), + Builders.Filter.Eq("advisory_key", string.Empty), + Builders.Filter.Or( + Builders.Filter.Exists("links", false), + Builders.Filter.Type("links", BsonType.Null))); + + using var cursor = await collection.Find(filter).ToCursorAsync(cancellationToken).ConfigureAwait(false); + while (await 
cursor.MoveNextAsync(cancellationToken).ConfigureAwait(false)) + { + foreach (var document in cursor.Current) + { + cancellationToken.ThrowIfCancellationRequested(); + + if (!document.TryGetValue("_id", out var idValue) || idValue.IsBsonNull) + { + continue; + } + + var source = ParseSource(document.GetValue("source", new BsonDocument()).AsBsonDocument); + var upstream = ParseUpstream(document.GetValue("upstream", new BsonDocument()).AsBsonDocument); + var identifiers = ParseIdentifiers(document.GetValue("identifiers", new BsonDocument()).AsBsonDocument); + + var canonical = AdvisoryCanonicalizer.Canonicalize(identifiers, source, upstream); + var linksArray = new BsonArray((canonical.Links.IsDefaultOrEmpty ? ImmutableArray.Empty : canonical.Links) + .Select(link => new BsonDocument + { + { "scheme", link.Scheme }, + { "value", link.Value } + })); + + var update = Builders.Update + .Set("advisory_key", canonical.AdvisoryKey) + .Set("links", linksArray); + + await collection.UpdateOneAsync( + Builders.Filter.Eq("_id", idValue), + update, + cancellationToken: cancellationToken).ConfigureAwait(false); + } + } + } + + private static RawSourceMetadata ParseSource(BsonDocument source) + { + return new RawSourceMetadata( + GetRequiredString(source, "vendor"), + GetOptionalString(source, "connector") ?? string.Empty, + GetOptionalString(source, "version") ?? "unknown", + GetOptionalString(source, "stream")); + } + + private static RawUpstreamMetadata ParseUpstream(BsonDocument upstream) + { + var provenance = ImmutableDictionary.CreateBuilder(StringComparer.Ordinal); + if (upstream.TryGetValue("provenance", out var provenanceValue) && provenanceValue.IsBsonDocument) + { + foreach (var element in provenanceValue.AsBsonDocument) + { + provenance[element.Name] = BsonValueToString(element.Value); + } + } + + var signature = upstream.TryGetValue("signature", out var signatureValue) && signatureValue.IsBsonDocument + ? 
signatureValue.AsBsonDocument
+            : new BsonDocument();
+
+        var signatureMetadata = new RawSignatureMetadata(
+            signature.GetValue("present", BsonBoolean.False).AsBoolean,
+            signature.TryGetValue("format", out var format) && !format.IsBsonNull ? format.AsString : null,
+            signature.TryGetValue("key_id", out var keyId) && !keyId.IsBsonNull ? keyId.AsString : null,
+            signature.TryGetValue("sig", out var sig) && !sig.IsBsonNull ? sig.AsString : null,
+            signature.TryGetValue("certificate", out var certificate) && !certificate.IsBsonNull ? certificate.AsString : null,
+            signature.TryGetValue("digest", out var digest) && !digest.IsBsonNull ? digest.AsString : null);
+
+        return new RawUpstreamMetadata(
+            GetRequiredString(upstream, "upstream_id"),
+            upstream.TryGetValue("document_version", out var version) && !version.IsBsonNull ? version.AsString : null,
+            GetDateTimeOffset(upstream, "retrieved_at", DateTimeOffset.UtcNow),
+            GetRequiredString(upstream, "content_hash"),
+            signatureMetadata,
+            provenance.ToImmutable());
+    }
+
+    private static RawIdentifiers ParseIdentifiers(BsonDocument identifiers)
+    {
+        var aliases = identifiers.TryGetValue("aliases", out var aliasesValue) && aliasesValue.IsBsonArray
+            ? aliasesValue.AsBsonArray.Select(BsonValueToString).ToImmutableArray()
+            : ImmutableArray<string>.Empty;
+
+        return new RawIdentifiers(
+            aliases,
+            GetRequiredString(identifiers, "primary"));
+    }
+
+    private static string GetRequiredString(BsonDocument document, string name)
+    {
+        if (!document.TryGetValue(name, out var value) || value.IsBsonNull)
+        {
+            return string.Empty;
+        }
+
+        return value.IsString ? value.AsString : value.ToString();
+    }
+
+    private static string? GetOptionalString(BsonDocument document, string name)
+    {
+        if (!document.TryGetValue(name, out var value) || value.IsBsonNull)
+        {
+            return null;
+        }
+
+        return value.IsString ?
value.AsString : value.ToString();
+    }
+
+    private static string BsonValueToString(BsonValue value)
+    {
+        return value switch
+        {
+            null => string.Empty,
+            BsonString s => s.AsString,
+            BsonBoolean b => b.AsBoolean.ToString(),
+            BsonDateTime dateTime => dateTime.ToUniversalTime().ToString("O"),
+            BsonInt32 i => i.AsInt32.ToString(CultureInfo.InvariantCulture),
+            BsonInt64 l => l.AsInt64.ToString(CultureInfo.InvariantCulture),
+            BsonDouble d => d.AsDouble.ToString(CultureInfo.InvariantCulture),
+            _ => value.ToString()
+        };
+    }
+
+    private static DateTimeOffset GetDateTimeOffset(BsonDocument document, string name, DateTimeOffset defaultValue)
+    {
+        if (!document.TryGetValue(name, out var value) || value.IsBsonNull)
+        {
+            return defaultValue;
+        }
+
+        return value.ToUniversalTime();
+    }
+}
diff --git a/src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/Raw/MongoAdvisoryRawRepository.cs b/src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/Raw/MongoAdvisoryRawRepository.cs
index 71b9af233..670efe0df 100644
--- a/src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/Raw/MongoAdvisoryRawRepository.cs
+++ b/src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/Raw/MongoAdvisoryRawRepository.cs
@@ -223,12 +223,69 @@ internal sealed class MongoAdvisoryRawRepository : IAdvisoryRawRepository
             ?
EncodeCursor(records[^1].IngestedAt.UtcDateTime, records[^1].Id) : null; - return new AdvisoryRawQueryResult(records, nextCursor, hasMore); - } - - public async Task> ListForVerificationAsync( - string tenant, - DateTimeOffset since, + return new AdvisoryRawQueryResult(records, nextCursor, hasMore); + } + + public async Task> FindByAdvisoryKeyAsync( + string tenant, + IReadOnlyCollection searchValues, + IReadOnlyCollection sourceVendors, + CancellationToken cancellationToken) + { + ArgumentException.ThrowIfNullOrWhiteSpace(tenant); + if (searchValues is null || searchValues.Count == 0) + { + return Array.Empty(); + } + + var normalizedValues = searchValues + .Where(static value => !string.IsNullOrWhiteSpace(value)) + .Select(static value => value.Trim()) + .Distinct(StringComparer.Ordinal) + .ToArray(); + + if (normalizedValues.Length == 0) + { + return Array.Empty(); + } + + var filter = Builders.Filter.Eq("tenant", tenant) + & Builders.Filter.Or( + Builders.Filter.In("advisory_key", normalizedValues), + Builders.Filter.ElemMatch( + "links", + Builders.Filter.In("value", normalizedValues))); + + if (sourceVendors is { Count: > 0 }) + { + var vendorValues = sourceVendors + .Where(static vendor => !string.IsNullOrWhiteSpace(vendor)) + .Select(static vendor => vendor.Trim().ToLowerInvariant()) + .Distinct(StringComparer.Ordinal) + .ToArray(); + + if (vendorValues.Length > 0) + { + filter &= Builders.Filter.In("source.vendor", vendorValues); + } + } + + var sort = Builders.Sort + .Descending("created_at") + .Descending("_id"); + + var documents = await _collection + .Find(filter) + .Sort(sort) + .ToListAsync(cancellationToken) + .ConfigureAwait(false); + + return documents.Select(MapToRecord).ToArray(); + } + + public async Task> ListForVerificationAsync( + string tenant, + DateTimeOffset since, DateTimeOffset until, IReadOnlyCollection sourceVendors, CancellationToken cancellationToken) @@ -368,29 +425,39 @@ internal sealed class MongoAdvisoryRawRepository : 
IAdvisoryRawRepository } } - var linkset = new BsonDocument - { - { "aliases", new BsonArray(document.Linkset.Aliases) }, - { "purls", new BsonArray(document.Linkset.PackageUrls) }, - { "cpes", new BsonArray(document.Linkset.Cpes) }, - { "references", references }, - { "reconciled_from", new BsonArray(document.Linkset.ReconciledFrom) }, - { "notes", notes } - }; - - var bson = new BsonDocument - { - { "_id", id }, - { "tenant", document.Tenant }, - { "source", source }, - { "upstream", upstream }, - { "content", content }, - { "identifiers", identifiers }, - { "linkset", linkset }, - { "supersedes", supersedesValue is null ? BsonNull.Value : supersedesValue }, - { "created_at", document.Upstream.RetrievedAt.UtcDateTime }, - { "ingested_at", now } - }; + var linkset = new BsonDocument + { + { "aliases", new BsonArray(document.Linkset.Aliases) }, + { "purls", new BsonArray(document.Linkset.PackageUrls) }, + { "cpes", new BsonArray(document.Linkset.Cpes) }, + { "references", references }, + { "reconciled_from", new BsonArray(document.Linkset.ReconciledFrom) }, + { "notes", notes } + }; + + var linksArray = new BsonArray( + (document.Links.IsDefaultOrEmpty ? ImmutableArray.Empty : document.Links) + .Select(link => new BsonDocument + { + { "scheme", link.Scheme }, + { "value", link.Value } + })); + + var bson = new BsonDocument + { + { "_id", id }, + { "tenant", document.Tenant }, + { "source", source }, + { "upstream", upstream }, + { "content", content }, + { "identifiers", identifiers }, + { "linkset", linkset }, + { "advisory_key", document.AdvisoryKey }, + { "links", linksArray }, + { "supersedes", supersedesValue is null ? 
BsonNull.Value : supersedesValue }, + { "created_at", document.Upstream.RetrievedAt.UtcDateTime }, + { "ingested_at", now } + }; return bson; } @@ -402,17 +469,53 @@ internal sealed class MongoAdvisoryRawRepository : IAdvisoryRawRepository var upstream = MapUpstream(document["upstream"].AsBsonDocument); var content = MapContent(document["content"].AsBsonDocument); var identifiers = MapIdentifiers(document["identifiers"].AsBsonDocument); - var linkset = MapLinkset(document["linkset"].AsBsonDocument); - var supersedes = document.GetValue("supersedes", BsonNull.Value); - - var rawDocument = new AdvisoryRawDocument( - tenant, - source, - upstream, - content, - identifiers, - linkset, - supersedes.IsBsonNull ? null : supersedes.AsString); + var linkset = MapLinkset(document["linkset"].AsBsonDocument); + var supersedes = document.GetValue("supersedes", BsonNull.Value); + + var advisoryKey = document.TryGetValue("advisory_key", out var advisoryKeyValue) && advisoryKeyValue.IsString + ? advisoryKeyValue.AsString + : string.Empty; + + var links = MapLinks(document); + AdvisoryCanonicalizationResult? canonical = null; + if (string.IsNullOrWhiteSpace(advisoryKey) || links.IsDefaultOrEmpty) + { + canonical = AdvisoryCanonicalizer.Canonicalize(identifiers, source, upstream); + if (string.IsNullOrWhiteSpace(advisoryKey)) + { + advisoryKey = canonical.AdvisoryKey; + } + + if (links.IsDefaultOrEmpty) + { + links = canonical.Links.IsDefault ? ImmutableArray.Empty : canonical.Links; + } + } + + if (string.IsNullOrWhiteSpace(advisoryKey)) + { + canonical ??= AdvisoryCanonicalizer.Canonicalize(identifiers, source, upstream); + advisoryKey = canonical.AdvisoryKey; + } + + var normalizedLinks = links.IsDefaultOrEmpty + ? (canonical?.Links ?? 
ImmutableArray.Empty) + : links; + if (normalizedLinks.IsDefault) + { + normalizedLinks = ImmutableArray.Empty; + } + + var rawDocument = new AdvisoryRawDocument( + tenant, + source, + upstream, + content, + identifiers, + linkset, + advisoryKey, + normalizedLinks, + supersedes.IsBsonNull ? null : supersedes.AsString); var ingestedAt = GetDateTimeOffset(document, "ingested_at", rawDocument.Upstream.RetrievedAt); var createdAt = GetDateTimeOffset(document, "created_at", rawDocument.Upstream.RetrievedAt); @@ -499,7 +602,7 @@ internal sealed class MongoAdvisoryRawRepository : IAdvisoryRawRepository GetRequiredString(identifiers, "primary")); } - private static RawLinkset MapLinkset(BsonDocument linkset) + private static RawLinkset MapLinkset(BsonDocument linkset) { var aliases = linkset.TryGetValue("aliases", out var aliasesValue) && aliasesValue.IsBsonArray ? aliasesValue.AsBsonArray.Select(BsonValueToString).ToImmutableArray() @@ -549,7 +652,36 @@ internal sealed class MongoAdvisoryRawRepository : IAdvisoryRawRepository ReconciledFrom = reconciledFrom, Notes = notesBuilder.ToImmutable() }; - } + } + + private static ImmutableArray MapLinks(BsonDocument document) + { + if (!document.TryGetValue("links", out var linksValue) || !linksValue.IsBsonArray) + { + return ImmutableArray.Empty; + } + + var builder = ImmutableArray.CreateBuilder(); + foreach (var element in linksValue.AsBsonArray) + { + if (!element.IsBsonDocument) + { + continue; + } + + var linkDoc = element.AsBsonDocument; + var scheme = GetOptionalString(linkDoc, "scheme") ?? string.Empty; + var value = GetOptionalString(linkDoc, "value") ?? string.Empty; + if (string.IsNullOrWhiteSpace(value)) + { + continue; + } + + builder.Add(new RawLink(scheme, value)); + } + + return builder.Count == 0 ? 
ImmutableArray.Empty : builder.ToImmutable(); + } private static DateTimeOffset GetDateTimeOffset(BsonDocument document, string field, DateTimeOffset fallback) { diff --git a/src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/ServiceCollectionExtensions.cs b/src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/ServiceCollectionExtensions.cs index afe81526f..f8ca2bc7c 100644 --- a/src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/ServiceCollectionExtensions.cs +++ b/src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/ServiceCollectionExtensions.cs @@ -108,6 +108,7 @@ public static class ServiceCollectionExtensions services.AddSingleton(); services.AddSingleton(); services.AddSingleton(); + services.AddSingleton(); services.AddSingleton(); services.AddSingleton(); services.AddSingleton(); diff --git a/src/Concelier/__Tests/StellaOps.Concelier.Core.Tests/Linksets/AdvisoryObservationFactoryTests.cs b/src/Concelier/__Tests/StellaOps.Concelier.Core.Tests/Linksets/AdvisoryObservationFactoryTests.cs index 2622404a0..704a49bf7 100644 --- a/src/Concelier/__Tests/StellaOps.Concelier.Core.Tests/Linksets/AdvisoryObservationFactoryTests.cs +++ b/src/Concelier/__Tests/StellaOps.Concelier.Core.Tests/Linksets/AdvisoryObservationFactoryTests.cs @@ -1,10 +1,12 @@ -using System.Collections.Generic; -using System.Collections.Immutable; -using System.Text.Json; -using StellaOps.Concelier.Core.Linksets; -using StellaOps.Concelier.Models.Observations; -using StellaOps.Concelier.RawModels; -using Xunit; +using System.Collections.Generic; +using System.Collections.Immutable; +using System.Linq; +using System.Text.Json; +using StellaOps.Concelier.Core.Linksets; +using StellaOps.Concelier.Models.Observations; +using StellaOps.Concelier.RawModels; +using Xunit; +using StellaOps.Concelier.Models; namespace StellaOps.Concelier.Core.Tests.Linksets; @@ -115,11 +117,11 @@ public sealed class AdvisoryObservationFactoryTests } [Fact] - public void 
Create_StoresNotesAsAttributes() - { - var factory = new AdvisoryObservationFactory(); - var notes = ImmutableDictionary.CreateRange(new Dictionary - { + public void Create_StoresNotesAsAttributes() + { + var factory = new AdvisoryObservationFactory(); + var notes = ImmutableDictionary.CreateRange(new Dictionary + { ["range-introduced"] = "1.0.0", ["range-fixed"] = "1.0.5" }); @@ -142,7 +144,62 @@ public sealed class AdvisoryObservationFactoryTests Assert.Equal(notes, observation.RawLinkset.Notes); Assert.Equal(new[] { "connector-a", "connector-b" }, observation.RawLinkset.ReconciledFrom); } - + + [Fact] + public void Create_IsDeterministicAcrossRuns() + { + var factory = new AdvisoryObservationFactory(); + var retrievedAt = DateTimeOffset.Parse("2025-02-11T04:05:06Z"); + + var upstream = new RawUpstreamMetadata( + UpstreamId: "CVE-2025-1000", + DocumentVersion: "2025.02.11", + RetrievedAt: retrievedAt, + ContentHash: "sha256:deterministic-1", + Signature: new RawSignatureMetadata(true, "dsse", "key-123", "signature-data"), + Provenance: ImmutableDictionary.CreateRange(new Dictionary + { + ["api"] = "https://api.vendor.test/v1/feed", + ["snapshot"] = "2025-02-11" + })); + + var linkset = new RawLinkset + { + Aliases = ImmutableArray.Create("Vendor-1000", "CVE-2025-1000"), + PackageUrls = ImmutableArray.Create("pkg:npm/demo@1.0.0", "pkg:npm/demo@1.0.0"), + Cpes = ImmutableArray.Create("cpe:2.3:a:vendor:demo:1.0:*:*:*:*:*:*:*"), + References = ImmutableArray.Create( + new RawReference("advisory", "https://vendor.test/advisory", "vendor"), + new RawReference("fix", "https://vendor.test/fix", null)), + ReconciledFrom = ImmutableArray.Create("connector-y"), + Notes = ImmutableDictionary.CreateRange(new Dictionary + { + ["alias.vendor"] = "Vendor-1000" + }) + }; + + var rawDocument = BuildRawDocument( + source: new RawSourceMetadata("vendor", "connector-y", "5.6.7", "stable"), + upstream: upstream, + identifiers: new RawIdentifiers( + Aliases: 
ImmutableArray.Create("CVE-2025-1000", "Vendor-1000"), + PrimaryId: "CVE-2025-1000"), + linkset: linkset, + tenant: "tenant-a"); + + var first = factory.Create(rawDocument, observedAt: retrievedAt); + var second = factory.Create(rawDocument, observedAt: retrievedAt); + + var firstJson = CanonicalJsonSerializer.Serialize(first); + var secondJson = CanonicalJsonSerializer.Serialize(second); + + Assert.Equal(firstJson, secondJson); + Assert.Equal(first.ObservationId, second.ObservationId); + Assert.True(first.Linkset.Aliases.SequenceEqual(second.Linkset.Aliases)); + Assert.True(first.RawLinkset.Aliases.SequenceEqual(second.RawLinkset.Aliases)); + Assert.Equal(first.CreatedAt, second.CreatedAt); + } + private static AdvisoryRawDocument BuildRawDocument( RawSourceMetadata? source = null, RawUpstreamMetadata? upstream = null, diff --git a/src/Concelier/__Tests/StellaOps.Concelier.Core.Tests/Raw/AdvisoryRawServiceTests.cs b/src/Concelier/__Tests/StellaOps.Concelier.Core.Tests/Raw/AdvisoryRawServiceTests.cs index ba065d7f1..4310a7763 100644 --- a/src/Concelier/__Tests/StellaOps.Concelier.Core.Tests/Raw/AdvisoryRawServiceTests.cs +++ b/src/Concelier/__Tests/StellaOps.Concelier.Core.Tests/Raw/AdvisoryRawServiceTests.cs @@ -1,4 +1,5 @@ using System; +using System.Collections.Generic; using System.Collections.Immutable; using System.Linq; using System.Text.Json; @@ -22,16 +23,19 @@ public sealed class AdvisoryRawServiceTests var repository = new RecordingRepository(); var service = CreateService(repository); - var document = CreateDocument() with { Supersedes = " previous-id " }; - var storedDocument = document.WithSupersedes("advisory_raw:vendor-x:ghsa-xxxx:sha256-2"); - var expectedResult = new AdvisoryRawUpsertResult(true, CreateRecord(storedDocument)); - repository.NextResult = expectedResult; - - var result = await service.IngestAsync(document, CancellationToken.None); - - Assert.NotNull(repository.CapturedDocument); - Assert.Null(repository.CapturedDocument!.Supersedes); 
- Assert.Equal(expectedResult.Record.Document.Supersedes, result.Record.Document.Supersedes); + var document = CreateDocument() with { Supersedes = " previous-id " }; + var storedDocument = document.WithSupersedes("advisory_raw:vendor-x:ghsa-xxxx:sha256-2"); + var expectedResult = new AdvisoryRawUpsertResult(true, CreateRecord(storedDocument)); + repository.NextResult = expectedResult; + + var result = await service.IngestAsync(document, CancellationToken.None); + + Assert.NotNull(repository.CapturedDocument); + Assert.Null(repository.CapturedDocument!.Supersedes); + Assert.Equal(expectedResult.Record.Document.Supersedes, result.Record.Document.Supersedes); + Assert.Equal("GHSA-XXXX", repository.CapturedDocument.AdvisoryKey); + Assert.Contains(repository.CapturedDocument.Links, link => link.Scheme == "GHSA" && link.Value == "GHSA-XXXX"); + Assert.Contains(repository.CapturedDocument.Links, link => link.Scheme == "PRIMARY" && link.Value == "GHSA-XXXX"); } [Fact] @@ -68,6 +72,31 @@ public sealed class AdvisoryRawServiceTests Assert.NotNull(repository.CapturedDocument); Assert.True(aliasSeries.SequenceEqual(repository.CapturedDocument!.Identifiers.Aliases)); + Assert.Equal("CVE-2025-0001", repository.CapturedDocument.AdvisoryKey); + Assert.Contains(repository.CapturedDocument.Links, link => link.Scheme == "CVE" && link.Value == "CVE-2025-0001"); + Assert.Contains(repository.CapturedDocument.Links, link => link.Scheme == "GHSA" && link.Value == "GHSA-XXXX"); + } + + [Fact] + public async Task FindByAdvisoryKeyAsync_NormalizesKeyAndVendors() + { + var repository = new RecordingRepository + { + AdvisoryKeyResults = new[] { CreateRecord(CreateDocument()) } + }; + var service = CreateService(repository); + + var results = await service.FindByAdvisoryKeyAsync( + "Tenant-Example", + "ghsa-xxxx", + new[] { "Vendor-X", " " }, + CancellationToken.None); + + Assert.Single(results); + Assert.Equal("tenant-example", repository.CapturedTenant); + Assert.Contains("GHSA-XXXX", 
repository.CapturedAdvisoryKeySearchValues!, StringComparer.Ordinal); + Assert.Contains("ghsa-xxxx", repository.CapturedAdvisoryKeySearchValues!, StringComparer.Ordinal); + Assert.Contains("vendor-x", repository.CapturedAdvisoryKeyVendors!, StringComparer.Ordinal); } private static AdvisoryRawService CreateService(RecordingRepository repository) @@ -86,56 +115,75 @@ public sealed class AdvisoryRawServiceTests private static AdvisoryRawDocument CreateDocument() { using var raw = JsonDocument.Parse("""{"id":"demo"}"""); - return new AdvisoryRawDocument( - Tenant: "Tenant-A", - Source: new RawSourceMetadata("Vendor-X", "connector-y", "1.0.0"), - Upstream: new RawUpstreamMetadata( - UpstreamId: "GHSA-xxxx", - DocumentVersion: "1", - RetrievedAt: DateTimeOffset.UtcNow, - ContentHash: "sha256:abc", - Signature: new RawSignatureMetadata( - Present: true, - Format: "dsse", - KeyId: "key-1", - Signature: "base64signature"), - Provenance: ImmutableDictionary.Empty), - Content: new RawContent( - Format: "OSV", - SpecVersion: "1.0", - Raw: raw.RootElement.Clone()), - Identifiers: new RawIdentifiers( - Aliases: ImmutableArray.Create("GHSA-xxxx"), - PrimaryId: "GHSA-xxxx"), - Linkset: new RawLinkset - { - Aliases = ImmutableArray.Empty, - PackageUrls = ImmutableArray.Empty, - Cpes = ImmutableArray.Empty, - References = ImmutableArray.Empty, - ReconciledFrom = ImmutableArray.Empty, - Notes = ImmutableDictionary.Empty - }); - } + return new AdvisoryRawDocument( + Tenant: "Tenant-A", + Source: new RawSourceMetadata("Vendor-X", "connector-y", "1.0.0"), + Upstream: new RawUpstreamMetadata( + UpstreamId: "GHSA-xxxx", + DocumentVersion: "1", + RetrievedAt: DateTimeOffset.UtcNow, + ContentHash: "sha256:abc", + Signature: new RawSignatureMetadata( + Present: true, + Format: "dsse", + KeyId: "key-1", + Signature: "base64signature"), + Provenance: ImmutableDictionary.Empty), + Content: new RawContent( + Format: "OSV", + SpecVersion: "1.0", + Raw: raw.RootElement.Clone()), + Identifiers: 
new RawIdentifiers( + Aliases: ImmutableArray.Create("GHSA-xxxx"), + PrimaryId: "GHSA-xxxx"), + Linkset: new RawLinkset + { + Aliases = ImmutableArray.Empty, + PackageUrls = ImmutableArray.Empty, + Cpes = ImmutableArray.Empty, + References = ImmutableArray.Empty, + ReconciledFrom = ImmutableArray.Empty, + Notes = ImmutableDictionary.Empty + }, + AdvisoryKey: string.Empty, + Links: ImmutableArray.Empty); + } - private static AdvisoryRawRecord CreateRecord(AdvisoryRawDocument document) - => new( - Id: "advisory_raw:vendor-x:ghsa-xxxx:sha256-1", - Document: document, - IngestedAt: DateTimeOffset.UtcNow, - CreatedAt: document.Upstream.RetrievedAt); + private static AdvisoryRawRecord CreateRecord(AdvisoryRawDocument document) + { + var canonical = AdvisoryCanonicalizer.Canonicalize(document.Identifiers, document.Source, document.Upstream); + var resolvedDocument = document with + { + AdvisoryKey = string.IsNullOrWhiteSpace(document.AdvisoryKey) ? canonical.AdvisoryKey : document.AdvisoryKey, + Links = document.Links.IsDefaultOrEmpty ? canonical.Links : document.Links + }; + + return new AdvisoryRawRecord( + Id: "advisory_raw:vendor-x:ghsa-xxxx:sha256-1", + Document: resolvedDocument, + IngestedAt: DateTimeOffset.UtcNow, + CreatedAt: document.Upstream.RetrievedAt); + } private sealed class RecordingRepository : IAdvisoryRawRepository - { - public AdvisoryRawDocument? CapturedDocument { get; private set; } - - public AdvisoryRawUpsertResult? NextResult { get; set; } - - public Task UpsertAsync(AdvisoryRawDocument document, CancellationToken cancellationToken) - { - if (NextResult is null) - { - throw new InvalidOperationException("NextResult must be set before calling UpsertAsync."); + { + public AdvisoryRawDocument? CapturedDocument { get; private set; } + + public AdvisoryRawUpsertResult? NextResult { get; set; } + + public string? CapturedTenant { get; private set; } + + public IReadOnlyCollection? 
CapturedAdvisoryKeySearchValues { get; private set; } + + public IReadOnlyCollection? CapturedAdvisoryKeyVendors { get; private set; } + + public IReadOnlyList AdvisoryKeyResults { get; set; } = Array.Empty(); + + public Task UpsertAsync(AdvisoryRawDocument document, CancellationToken cancellationToken) + { + if (NextResult is null) + { + throw new InvalidOperationException("NextResult must be set before calling UpsertAsync."); } CapturedDocument = document; @@ -145,14 +193,26 @@ public sealed class AdvisoryRawServiceTests public Task FindByIdAsync(string tenant, string id, CancellationToken cancellationToken) => throw new NotSupportedException(); - public Task QueryAsync(AdvisoryRawQueryOptions options, CancellationToken cancellationToken) - => throw new NotSupportedException(); - - public Task> ListForVerificationAsync( - string tenant, - DateTimeOffset since, - DateTimeOffset until, - IReadOnlyCollection sourceVendors, + public Task QueryAsync(AdvisoryRawQueryOptions options, CancellationToken cancellationToken) + => throw new NotSupportedException(); + + public Task> FindByAdvisoryKeyAsync( + string tenant, + IReadOnlyCollection searchValues, + IReadOnlyCollection sourceVendors, + CancellationToken cancellationToken) + { + CapturedTenant = tenant; + CapturedAdvisoryKeySearchValues = searchValues?.ToArray(); + CapturedAdvisoryKeyVendors = sourceVendors?.ToArray(); + return Task.FromResult(AdvisoryKeyResults); + } + + public Task> ListForVerificationAsync( + string tenant, + DateTimeOffset since, + DateTimeOffset until, + IReadOnlyCollection sourceVendors, CancellationToken cancellationToken) => throw new NotSupportedException(); } diff --git a/src/Concelier/__Tests/StellaOps.Concelier.Merge.Tests/MergePrecedenceIntegrationTests.cs b/src/Concelier/__Tests/StellaOps.Concelier.Merge.Tests/MergePrecedenceIntegrationTests.cs index 61919920e..57b940829 100644 --- a/src/Concelier/__Tests/StellaOps.Concelier.Merge.Tests/MergePrecedenceIntegrationTests.cs +++ 
b/src/Concelier/__Tests/StellaOps.Concelier.Merge.Tests/MergePrecedenceIntegrationTests.cs @@ -1,5 +1,4 @@ -using System; -using System.Linq; +using System; using System.Threading; using System.Threading.Tasks; using Microsoft.Extensions.Logging.Abstractions; @@ -76,29 +75,6 @@ public sealed class MergePrecedenceIntegrationTests : IAsyncLifetime Assert.True(persisted.BeforeHash.Length > 0); } - [Fact] - public async Task MergePipeline_IsDeterministicAcrossRuns() - { - await EnsureInitializedAsync(); - - var merger = _merger!; - var calculator = new CanonicalHashCalculator(); - - var firstResult = merger.Merge(new[] { CreateNvdBaseline(), CreateVendorOverride() }); - var secondResult = merger.Merge(new[] { CreateNvdBaseline(), CreateVendorOverride() }); - - var first = firstResult.Advisory; - var second = secondResult.Advisory; - - var firstHash = calculator.ComputeHash(first); - var secondHash = calculator.ComputeHash(second); - - Assert.Equal(firstHash, secondHash); - Assert.Equal(first.AdvisoryKey, second.AdvisoryKey); - Assert.Equal(first.Aliases.Length, second.Aliases.Length); - Assert.True(first.Aliases.SequenceEqual(second.Aliases)); - } - public async Task InitializeAsync() { _timeProvider = new FakeTimeProvider(new DateTimeOffset(2025, 3, 1, 0, 0, 0, TimeSpan.Zero)) diff --git a/src/Concelier/__Tests/StellaOps.Concelier.Storage.Mongo.Tests/Migrations/EnsureAdvisoryObservationsRawLinksetMigrationTests.cs b/src/Concelier/__Tests/StellaOps.Concelier.Storage.Mongo.Tests/Migrations/EnsureAdvisoryObservationsRawLinksetMigrationTests.cs index b7e0ab61b..c085beedd 100644 --- a/src/Concelier/__Tests/StellaOps.Concelier.Storage.Mongo.Tests/Migrations/EnsureAdvisoryObservationsRawLinksetMigrationTests.cs +++ b/src/Concelier/__Tests/StellaOps.Concelier.Storage.Mongo.Tests/Migrations/EnsureAdvisoryObservationsRawLinksetMigrationTests.cs @@ -69,7 +69,12 @@ public sealed class EnsureAdvisoryObservationsRawLinksetMigrationTests References = ImmutableArray.Create(new 
RawReference("advisory", "https://example.test/advisory", "vendor")), ReconciledFrom = ImmutableArray.Create("connector-y"), Notes = ImmutableDictionary.CreateRange(new[] { new KeyValuePair("range-fixed", "1.0.1") }) - }); + }, + advisoryKey: "CVE-2025-0001", + links: ImmutableArray.Create( + new RawLink("CVE", "CVE-2025-0001"), + new RawLink("GHSA", "GHSA-2025-0001"), + new RawLink("PRIMARY", "CVE-2025-0001"))); await rawRepository.UpsertAsync(rawDocument, CancellationToken.None); @@ -147,7 +152,11 @@ public sealed class EnsureAdvisoryObservationsRawLinksetMigrationTests identifiers: new RawIdentifiers( Aliases: ImmutableArray.Empty, PrimaryId: "GHSA-9999-0001"), - linkset: new RawLinkset()); + linkset: new RawLinkset(), + advisoryKey: "GHSA-9999-0001", + links: ImmutableArray.Create( + new RawLink("GHSA", "GHSA-9999-0001"), + new RawLink("PRIMARY", "GHSA-9999-0001"))); var observationId = "tenant-b:vendor-y:ghsa-9999-0001:sha256-def456"; var document = BuildObservationDocument( diff --git a/src/Concelier/__Tests/StellaOps.Concelier.Storage.Mongo.Tests/Migrations/MongoMigrationRunnerTests.cs b/src/Concelier/__Tests/StellaOps.Concelier.Storage.Mongo.Tests/Migrations/MongoMigrationRunnerTests.cs index c1233e7f9..59ce1325b 100644 --- a/src/Concelier/__Tests/StellaOps.Concelier.Storage.Mongo.Tests/Migrations/MongoMigrationRunnerTests.cs +++ b/src/Concelier/__Tests/StellaOps.Concelier.Storage.Mongo.Tests/Migrations/MongoMigrationRunnerTests.cs @@ -686,6 +686,21 @@ public sealed class MongoMigrationRunnerTests { "notes", new BsonDocument() }, } }, + { "advisory_key", upstreamId.ToUpperInvariant() }, + { + "links", + new BsonArray + { + new BsonDocument + { + { "scheme", "PRIMARY" }, + { "value", upstreamId.ToUpperInvariant() } + } + } + }, + { "created_at", retrievedAt }, + { "ingested_at", retrievedAt }, + { "supersedes", BsonNull.Value } }; } } diff --git a/src/Concelier/__Tests/StellaOps.Concelier.WebService.Tests/ConcelierOptionsPostConfigureTests.cs 
b/src/Concelier/__Tests/StellaOps.Concelier.WebService.Tests/ConcelierOptionsPostConfigureTests.cs index ce9c3bc19..bb68f28d0 100644 --- a/src/Concelier/__Tests/StellaOps.Concelier.WebService.Tests/ConcelierOptionsPostConfigureTests.cs +++ b/src/Concelier/__Tests/StellaOps.Concelier.WebService.Tests/ConcelierOptionsPostConfigureTests.cs @@ -8,11 +8,11 @@ namespace StellaOps.Concelier.WebService.Tests; public sealed class ConcelierOptionsPostConfigureTests { [Fact] - public void Apply_LoadsClientSecretFromRelativeFile() - { - var tempDirectory = Directory.CreateTempSubdirectory(); - try - { + public void Apply_LoadsClientSecretFromRelativeFile() + { + var tempDirectory = Directory.CreateTempSubdirectory(); + try + { var secretPath = Path.Combine(tempDirectory.FullName, "authority.secret"); File.WriteAllText(secretPath, " concelier-secret "); @@ -34,14 +34,22 @@ public sealed class ConcelierOptionsPostConfigureTests { Directory.Delete(tempDirectory.FullName, recursive: true); } - } - } - - [Fact] - public void Apply_ThrowsWhenSecretFileMissing() - { - var options = new ConcelierOptions - { + } + } + + [Fact] + public void Features_NoMergeEnabled_DefaultsToTrue() + { + var options = new ConcelierOptions(); + + Assert.True(options.Features.NoMergeEnabled); + } + + [Fact] + public void Apply_ThrowsWhenSecretFileMissing() + { + var options = new ConcelierOptions + { Authority = new ConcelierOptions.AuthorityOptions { ClientSecretFile = "missing.secret" diff --git a/src/Concelier/__Tests/StellaOps.Concelier.WebService.Tests/WebServiceEndpointsTests.cs b/src/Concelier/__Tests/StellaOps.Concelier.WebService.Tests/WebServiceEndpointsTests.cs index ba4a286a0..bcfb82a36 100644 --- a/src/Concelier/__Tests/StellaOps.Concelier.WebService.Tests/WebServiceEndpointsTests.cs +++ b/src/Concelier/__Tests/StellaOps.Concelier.WebService.Tests/WebServiceEndpointsTests.cs @@ -469,6 +469,55 @@ public sealed class WebServiceEndpointsTests : IAsyncLifetime 
Assert.Empty(firstIds.Intersect(secondIds)); } + [Fact] + public async Task AdvisoryEvidenceEndpoint_ReturnsDocumentsForCanonicalKey() + { + await SeedAdvisoryRawDocumentsAsync( + CreateAdvisoryRawDocument("tenant-a", "vendor-x", "GHSA-2025-0001", "sha256:001", new BsonDocument("id", "GHSA-2025-0001:1")), + CreateAdvisoryRawDocument("tenant-a", "vendor-y", "GHSA-2025-0001", "sha256:002", new BsonDocument("id", "GHSA-2025-0001:2")), + CreateAdvisoryRawDocument("tenant-b", "vendor-x", "GHSA-2025-0001", "sha256:003", new BsonDocument("id", "GHSA-2025-0001:3"))); + + using var client = _factory.CreateClient(); + var response = await client.GetAsync("/vuln/evidence/advisories/ghsa-2025-0001?tenant=tenant-a"); + + response.EnsureSuccessStatusCode(); + var evidence = await response.Content.ReadFromJsonAsync(); + + Assert.NotNull(evidence); + Assert.Equal("GHSA-2025-0001", evidence!.AdvisoryKey); + Assert.Equal(2, evidence.Records.Count); + Assert.All(evidence.Records, record => Assert.Equal("tenant-a", record.Tenant)); + } + + [Fact] + public async Task AdvisoryEvidenceEndpoint_FiltersByVendor() + { + await SeedAdvisoryRawDocumentsAsync( + CreateAdvisoryRawDocument("tenant-a", "vendor-x", "GHSA-2025-0002", "sha256:101", new BsonDocument("id", "GHSA-2025-0002:1")), + CreateAdvisoryRawDocument("tenant-a", "vendor-y", "GHSA-2025-0002", "sha256:102", new BsonDocument("id", "GHSA-2025-0002:2"))); + + using var client = _factory.CreateClient(); + var response = await client.GetAsync("/vuln/evidence/advisories/GHSA-2025-0002?tenant=tenant-a&vendor=vendor-y"); + + response.EnsureSuccessStatusCode(); + var evidence = await response.Content.ReadFromJsonAsync(); + + Assert.NotNull(evidence); + var record = Assert.Single(evidence!.Records); + Assert.Equal("vendor-y", record.Document.Source.Vendor); + } + + [Fact] + public async Task AdvisoryEvidenceEndpoint_ReturnsNotFoundWhenMissing() + { + await SeedAdvisoryRawDocumentsAsync(); + + using var client = _factory.CreateClient(); + var 
response = await client.GetAsync("/vuln/evidence/advisories/CVE-2099-9999?tenant=tenant-a"); + + Assert.Equal(HttpStatusCode.NotFound, response.StatusCode); + } + [Fact] public async Task AdvisoryIngestEndpoint_EmitsMetricsWithExpectedTags() { @@ -1871,6 +1920,18 @@ public sealed class WebServiceEndpointsTests : IAsyncLifetime { "notes", new BsonDocument() } } }, + { "advisory_key", upstreamId.ToUpperInvariant() }, + { + "links", + new BsonArray + { + new BsonDocument + { + { "scheme", "PRIMARY" }, + { "value", upstreamId.ToUpperInvariant() } + } + } + }, { "supersedes", supersedes is null ? BsonNull.Value : supersedes }, { "ingested_at", now }, { "created_at", now } diff --git a/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/FileSurfaceManifestStore.cs b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/FileSurfaceManifestStore.cs new file mode 100644 index 000000000..9363f6456 --- /dev/null +++ b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/FileSurfaceManifestStore.cs @@ -0,0 +1,224 @@ +using System; +using System.Collections.Generic; +using System.Collections.Immutable; +using System.IO; +using System.Linq; +using System.Security.Cryptography; +using System.Threading; +using System.Threading.Tasks; +using Microsoft.Extensions.Logging; +using Microsoft.Extensions.Options; + +namespace StellaOps.Scanner.Surface.FS; + +/// +/// File-system backed manifest store for surface artefacts. 
+/// +public sealed class FileSurfaceManifestStore : + ISurfaceManifestWriter, + ISurfaceManifestReader +{ + private readonly ILogger _logger; + private readonly SurfaceManifestPathBuilder _pathBuilder; + private readonly SemaphoreSlim _publishGate = new(1, 1); + + public FileSurfaceManifestStore( + IOptions cacheOptions, + IOptions storeOptions, + ILogger logger) + { + if (cacheOptions is null) + { + throw new ArgumentNullException(nameof(cacheOptions)); + } + + if (storeOptions is null) + { + throw new ArgumentNullException(nameof(storeOptions)); + } + + _logger = logger ?? throw new ArgumentNullException(nameof(logger)); + _pathBuilder = new SurfaceManifestPathBuilder(cacheOptions.Value, storeOptions.Value); + } + + public async Task PublishAsync( + SurfaceManifestDocument document, + CancellationToken cancellationToken = default) + { + if (document is null) + { + throw new ArgumentNullException(nameof(document)); + } + + cancellationToken.ThrowIfCancellationRequested(); + + var normalized = Normalize(document); + var payload = SurfaceCacheJsonSerializer.Serialize(normalized); + var digest = ComputeDigest(payload.Span); + var digestHex = SurfaceManifestPathBuilder.EnsureSha256Digest(digest); + + var path = _pathBuilder.BuildManifestPath(normalized.Tenant, digestHex); + Directory.CreateDirectory(Path.GetDirectoryName(path)!); + + await _publishGate.WaitAsync(cancellationToken).ConfigureAwait(false); + try + { + var shouldWrite = true; + + if (File.Exists(path)) + { + var existing = await File.ReadAllBytesAsync(path, cancellationToken).ConfigureAwait(false); + var existingDigest = ComputeDigest(existing); + if (!digest.Equals(existingDigest, StringComparison.OrdinalIgnoreCase)) + { + _logger.LogWarning( + "Surface manifest collision for {Path}; overwriting with new digest {Digest}.", + path, + digest); + } + else + { + _logger.LogDebug("Surface manifest reuse for {Path} (digest {Digest}).", path, digest); + shouldWrite = false; + } + } + + if (shouldWrite) + { + 
await File.WriteAllBytesAsync(path, payload.ToArray(), cancellationToken).ConfigureAwait(false); + } + } + finally + { + _publishGate.Release(); + } + + var uri = _pathBuilder.BuildManifestUri(normalized.Tenant, digestHex); + var artifactId = $"surface:{Sanitize(normalized.Tenant)}:{digestHex}"; + + _logger.LogInformation( + "Published surface manifest for tenant {Tenant} with digest {Digest}.", + normalized.Tenant, + digest); + + return new SurfaceManifestPublishResult(digest, uri, artifactId, normalized); + } + + public async Task TryGetByDigestAsync( + string manifestDigest, + CancellationToken cancellationToken = default) + { + var digestHex = SurfaceManifestPathBuilder.EnsureSha256Digest(manifestDigest); + cancellationToken.ThrowIfCancellationRequested(); + + // We don't know the tenant from digest alone; iterate tenant directories. + foreach (var tenantDirectory in EnumerateTenantDirectories(_pathBuilder.RootDirectory)) + { + cancellationToken.ThrowIfCancellationRequested(); + + var path = Path.Combine( + tenantDirectory, + digestHex[..2], + digestHex[2..4], + $"{digestHex}.json"); + + if (!File.Exists(path)) + { + continue; + } + + var bytes = await File.ReadAllBytesAsync(path, cancellationToken).ConfigureAwait(false); + var existingDigest = ComputeDigest(bytes); + if (!existingDigest.Equals($"sha256:{digestHex}", StringComparison.OrdinalIgnoreCase)) + { + continue; + } + + return SurfaceCacheJsonSerializer.Deserialize(bytes); + } + + return null; + } + + public Task TryGetByUriAsync( + string manifestUri, + CancellationToken cancellationToken = default) + { + var pointer = SurfaceManifestPathBuilder.ParseUri(manifestUri); + var path = _pathBuilder.BuildManifestPath(pointer.TenantSegment, pointer.DigestHex); + + if (!File.Exists(path)) + { + return Task.FromResult(null); + } + + cancellationToken.ThrowIfCancellationRequested(); + + var bytes = File.ReadAllBytes(path); + return Task.FromResult( + SurfaceCacheJsonSerializer.Deserialize(bytes)); + } + + private 
static SurfaceManifestDocument Normalize(SurfaceManifestDocument document) + { + if (string.IsNullOrWhiteSpace(document.Tenant)) + { + throw new ArgumentException("Surface manifest tenant cannot be empty.", nameof(document)); + } + + var generatedAt = document.GeneratedAt == DateTimeOffset.MinValue + ? DateTimeOffset.MinValue + : document.GeneratedAt.ToUniversalTime(); + + var artifacts = document.Artifacts + .Select(NormalizeArtifact) + .OrderBy(static a => a.Kind, StringComparer.Ordinal) + .ThenBy(static a => a.Digest, StringComparer.Ordinal) + .ToArray(); + + return document with + { + GeneratedAt = generatedAt, + Artifacts = artifacts + }; + } + + private static string ComputeDigest(ReadOnlySpan<byte> bytes) + { + // HashAlgorithm.ComputeHash has no ReadOnlySpan<byte> overload; the static HashData does. + var hash = SHA256.HashData(bytes); + return $"sha256:{Convert.ToHexString(hash).ToLowerInvariant()}"; + } + + private static SurfaceManifestArtifact NormalizeArtifact(SurfaceManifestArtifact artifact) + { + if (artifact.Metadata is null || artifact.Metadata.Count == 0) + { + return artifact; + } + + var sorted = artifact.Metadata + // ImmutableDictionary enumerates in hash order; the sorted variant keeps keys in ordinal order. + .ToImmutableSortedDictionary(static pair => pair.Key, static pair => pair.Value, StringComparer.Ordinal); + + return artifact with { Metadata = sorted }; + } + + private static IEnumerable<string> EnumerateTenantDirectories(string rootDirectory) + { + if (!Directory.Exists(rootDirectory)) + { + yield break; + } + + foreach (var directory in Directory.EnumerateDirectories(rootDirectory)) + { + yield return directory; + } + } + + private static string Sanitize(string value) + => string.IsNullOrWhiteSpace(value) + ?
"default" + : value.Replace('/', '_').Replace('\\', '_'); +} diff --git a/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/ISurfaceManifestReader.cs b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/ISurfaceManifestReader.cs new file mode 100644 index 000000000..5434e9ad2 --- /dev/null +++ b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/ISurfaceManifestReader.cs @@ -0,0 +1,15 @@ +namespace StellaOps.Scanner.Surface.FS; + +/// +/// Provides read access to published surface manifests. +/// +public interface ISurfaceManifestReader +{ + Task TryGetByDigestAsync( + string manifestDigest, + CancellationToken cancellationToken = default); + + Task TryGetByUriAsync( + string manifestUri, + CancellationToken cancellationToken = default); +} diff --git a/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/ISurfaceManifestWriter.cs b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/ISurfaceManifestWriter.cs new file mode 100644 index 000000000..4644ef2aa --- /dev/null +++ b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/ISurfaceManifestWriter.cs @@ -0,0 +1,11 @@ +namespace StellaOps.Scanner.Surface.FS; + +/// +/// Publishes manifest documents to the configured manifest store. 
+/// +public interface ISurfaceManifestWriter +{ + Task PublishAsync( + SurfaceManifestDocument document, + CancellationToken cancellationToken = default); +} diff --git a/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/ServiceCollectionExtensions.cs b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/ServiceCollectionExtensions.cs index 0d5c4f60b..b0269439e 100644 --- a/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/ServiceCollectionExtensions.cs +++ b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/ServiceCollectionExtensions.cs @@ -1,3 +1,4 @@ +using System; using Microsoft.Extensions.Configuration; using Microsoft.Extensions.DependencyInjection; using Microsoft.Extensions.DependencyInjection.Extensions; @@ -7,7 +8,8 @@ namespace StellaOps.Scanner.Surface.FS; public static class ServiceCollectionExtensions { - private const string ConfigurationSection = "Surface:Cache"; + private const string CacheConfigurationSection = "Surface:Cache"; + private const string ManifestConfigurationSection = "Surface:Manifest"; public static IServiceCollection AddSurfaceFileCache( this IServiceCollection services, @@ -19,7 +21,7 @@ public static class ServiceCollectionExtensions } services.AddOptions() - .BindConfiguration(ConfigurationSection); + .BindConfiguration(CacheConfigurationSection); if (configure is not null) { @@ -31,6 +33,33 @@ public static class ServiceCollectionExtensions return services; } + public static IServiceCollection AddSurfaceManifestStore( + this IServiceCollection services, + Action? 
configure = null) + { + if (services is null) + { + throw new ArgumentNullException(nameof(services)); + } + + services.AddOptions() + .BindConfiguration(ManifestConfigurationSection); + + if (configure is not null) + { + services.Configure(configure); + } + + services.TryAddSingleton(TimeProvider.System); + services.TryAddEnumerable(ServiceDescriptor.Singleton, SurfaceManifestStoreOptionsValidator>()); + + services.TryAddSingleton(); + services.TryAddSingleton(sp => sp.GetRequiredService()); + services.TryAddSingleton(sp => sp.GetRequiredService()); + + return services; + } + private sealed class SurfaceCacheOptionsValidator : IValidateOptions { public ValidateOptionsResult Validate(string? name, SurfaceCacheOptions options) @@ -56,4 +85,32 @@ public static class ServiceCollectionExtensions return ValidateOptionsResult.Success; } } + + private sealed class SurfaceManifestStoreOptionsValidator : IValidateOptions + { + public ValidateOptionsResult Validate(string? name, SurfaceManifestStoreOptions options) + { + if (options is null) + { + return ValidateOptionsResult.Fail("Options cannot be null."); + } + + if (string.IsNullOrWhiteSpace(options.Scheme)) + { + return ValidateOptionsResult.Fail("Manifest URI scheme cannot be empty."); + } + + if (string.IsNullOrWhiteSpace(options.Bucket)) + { + return ValidateOptionsResult.Fail("Manifest bucket cannot be empty."); + } + + if (string.IsNullOrWhiteSpace(options.Prefix)) + { + return ValidateOptionsResult.Fail("Manifest prefix cannot be empty."); + } + + return ValidateOptionsResult.Success; + } + } } diff --git a/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/StellaOps.Scanner.Surface.FS.csproj b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/StellaOps.Scanner.Surface.FS.csproj index b1f01c7f1..702f7b478 100644 --- a/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/StellaOps.Scanner.Surface.FS.csproj +++ b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/StellaOps.Scanner.Surface.FS.csproj @@ -21,6 
+21,7 @@ + diff --git a/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/SurfaceManifestModels.cs b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/SurfaceManifestModels.cs index 1758f9f84..7c24c6579 100644 --- a/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/SurfaceManifestModels.cs +++ b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/SurfaceManifestModels.cs @@ -30,7 +30,7 @@ public sealed record SurfaceManifestDocument [JsonPropertyName("generatedAt")] public DateTimeOffset GeneratedAt { get; init; } - = DateTimeOffset.UtcNow; + = DateTimeOffset.MinValue; [JsonPropertyName("source")] [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)] diff --git a/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/SurfaceManifestPathBuilder.cs b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/SurfaceManifestPathBuilder.cs new file mode 100644 index 000000000..17b476cd0 --- /dev/null +++ b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/SurfaceManifestPathBuilder.cs @@ -0,0 +1,85 @@ +using System; +using System.IO; + +namespace StellaOps.Scanner.Surface.FS; + +internal sealed class SurfaceManifestPathBuilder +{ + private readonly SurfaceManifestStoreOptions _storeOptions; + private readonly string _root; + + public SurfaceManifestPathBuilder( + SurfaceCacheOptions cacheOptions, + SurfaceManifestStoreOptions storeOptions) + { + if (cacheOptions is null) + { + throw new ArgumentNullException(nameof(cacheOptions)); + } + + _storeOptions = storeOptions ?? 
throw new ArgumentNullException(nameof(storeOptions)); + _root = storeOptions.ResolveRoot(cacheOptions); + } + + public string BuildManifestPath(string tenant, string digestHex) + { + var sanitizedTenant = Sanitize(tenant); + return Path.Combine(_root, sanitizedTenant, digestHex[..2], digestHex[2..4], $"{digestHex}.json"); + } + + public string BuildManifestUri(string tenant, string digestHex) + { + var tenantSegment = Sanitize(tenant); + return _storeOptions.ToUri(tenantSegment, digestHex); + } + + public static ParsedManifestPointer ParseUri(string manifestUri) + { + if (!Uri.TryCreate(manifestUri, UriKind.Absolute, out var uri)) + { + throw new ArgumentException("Manifest URI must be an absolute URI.", nameof(manifestUri)); + } + + var segments = uri.AbsolutePath.Trim('/').Split('/', StringSplitOptions.RemoveEmptyEntries); + if (segments.Length < 4) + { + throw new ArgumentException("Manifest URI is missing expected segments.", nameof(manifestUri)); + } + + var tenant = segments[^4]; + var digestFile = segments[^1]; + + if (!digestFile.EndsWith(".json", StringComparison.Ordinal)) + { + throw new ArgumentException("Manifest URI must end with a JSON file.", nameof(manifestUri)); + } + + var digestHex = digestFile[..^5]; + return new ParsedManifestPointer(uri.Scheme, tenant, digestHex); + } + + public static string EnsureSha256Digest(string manifestDigest) + { + if (string.IsNullOrWhiteSpace(manifestDigest)) + { + throw new ArgumentException("Digest cannot be null or empty.", nameof(manifestDigest)); + } + + const string prefix = "sha256:"; + if (!manifestDigest.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)) + { + throw new ArgumentException("Only sha256 digests are supported.", nameof(manifestDigest)); + } + + return manifestDigest[prefix.Length..].ToLowerInvariant(); + } + + private static string Sanitize(string value) + => string.IsNullOrWhiteSpace(value) + ? 
"default" + : value.Replace('/', '_').Replace('\\', '_'); + + internal readonly record struct ParsedManifestPointer(string Scheme, string TenantSegment, string DigestHex); + + public string RootDirectory => _root; +} diff --git a/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/SurfaceManifestStoreOptions.cs b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/SurfaceManifestStoreOptions.cs new file mode 100644 index 000000000..065b0b7a3 --- /dev/null +++ b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/SurfaceManifestStoreOptions.cs @@ -0,0 +1,45 @@ +using System; +using System.IO; + +namespace StellaOps.Scanner.Surface.FS; + +/// +/// Configuration settings for the manifest store. +/// +public sealed class SurfaceManifestStoreOptions +{ + /// + /// Gets or sets the root directory used to persist manifest payloads. When null, + /// the value falls back to the surface cache root. + /// + public string? RootDirectory { get; set; } + + /// + /// Gets or sets the URI scheme emitted for manifest pointers. + /// + public string Scheme { get; set; } = "cas"; + + /// + /// Gets or sets the bucket name portion of manifest URIs. + /// + public string Bucket { get; set; } = "surface-cache"; + + /// + /// Gets or sets the prefix used before tenant segments within manifest URIs. 
+ /// + public string Prefix { get; set; } = "manifests"; + + internal string ResolveRoot(SurfaceCacheOptions cacheOptions) + { + var root = RootDirectory; + if (string.IsNullOrWhiteSpace(root)) + { + root = Path.Combine(cacheOptions.ResolveRoot(), "manifests"); + } + + return root!; + } + + internal string ToUri(string tenantSegment, string digestHex) + => $"{Scheme}://{Bucket}/{Prefix}/{tenantSegment}/{digestHex[..2]}/{digestHex[2..4]}/{digestHex}.json"; +} diff --git a/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/TASKS.md b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/TASKS.md index 6ac5b1635..2f36e1aa0 100644 --- a/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/TASKS.md +++ b/src/Scanner/__Libraries/StellaOps.Scanner.Surface.FS/TASKS.md @@ -2,9 +2,11 @@ | ID | Status | Owner(s) | Depends on | Description | Exit Criteria | |----|--------|----------|------------|-------------|---------------| -| SURFACE-FS-01 | DOING (2025-11-02) | Scanner Guild, Zastava Guild | ARCH-SURFACE-EPIC | Author `docs/modules/scanner/design/surface-fs.md` defining cache layout, pointer schema, tenancy, and offline handling. | Spec merged; reviewers from Scanner/Zastava sign off; component map cross-link drafted. | -| SURFACE-FS-02 | DOING (2025-11-02) | Scanner Guild | SURFACE-FS-01 | Implement `StellaOps.Scanner.Surface.FS` core abstractions (writer, reader, manifest models) with deterministic serialization + unit tests. | Library compiles; tests pass; XML docs cover public types. | +| SURFACE-FS-01 | DONE (2025-11-07) | Scanner Guild, Zastava Guild | ARCH-SURFACE-EPIC | Author `docs/modules/scanner/design/surface-fs.md` defining cache layout, pointer schema, tenancy, and offline handling. | Spec merged; reviewers from Scanner/Zastava sign off; component map cross-link drafted. 
| +| SURFACE-FS-02 | DONE (2025-11-07) | Scanner Guild | SURFACE-FS-01 | Implement `StellaOps.Scanner.Surface.FS` core abstractions (writer, reader, manifest models) with deterministic serialization + unit tests. | Library compiles; tests pass; XML docs cover public types. | | SURFACE-FS-03 | TODO | Scanner Guild | SURFACE-FS-02 | Integrate Surface.FS writer into Scanner Worker analyzer pipeline to persist layer + entry-trace fragments. | Worker produces cache entries in integration tests; observability counters emitted. | | SURFACE-FS-04 | TODO | Zastava Guild | SURFACE-FS-02 | Integrate Surface.FS reader into Zastava Observer runtime drift loop. | Observer validates runtime artefacts via cache; regression tests updated. | | SURFACE-FS-05 | TODO | Scanner Guild, Scheduler Guild | SURFACE-FS-03 | Expose Surface.FS pointers via Scanner WebService reports and coordinate rescan planning with Scheduler. | API contracts updated; Scheduler consumes pointers; docs refreshed. | | SURFACE-FS-06 | TODO | Docs Guild | SURFACE-FS-02..05 | Update scanner-engine guide and offline kit docs with Surface.FS workflow. | Docs merged; offline kit manifests include cache bundles. | + +> 2025-11-07: Delivered file-backed manifest store with deterministic pointer layout, updated design doc with component map cross-link, and added regression tests (note: `dotnet test` hangs in current CLI; rerun once environment supports Linux dotnet output). 
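The "deterministic pointer layout" mentioned in the note above can be sketched in isolation. This is a minimal illustration, not the patch's implementation: the helper names `DigestHex`, `SanitizeTenant`, `ManifestUri` and `ManifestPath` are hypothetical, and the defaults (`cas`, `surface-cache`, `manifests`) are copied from `SurfaceManifestStoreOptions` in this patch.

```csharp
using System;
using System.IO;

// Hypothetical helper: strip the "sha256:" prefix and lower-case the hex,
// as SurfaceManifestPathBuilder.EnsureSha256Digest does in this patch.
static string DigestHex(string manifestDigest)
{
    const string prefix = "sha256:";
    if (string.IsNullOrWhiteSpace(manifestDigest) ||
        !manifestDigest.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
    {
        throw new ArgumentException("Only sha256 digests are supported.", nameof(manifestDigest));
    }

    return manifestDigest[prefix.Length..].ToLowerInvariant();
}

// Hypothetical helper: tenants become a single path segment; empty falls back to "default".
static string SanitizeTenant(string tenant)
    => string.IsNullOrWhiteSpace(tenant)
        ? "default"
        : tenant.Replace('/', '_').Replace('\\', '_');

// Pointer URI: {scheme}://{bucket}/{prefix}/{tenant}/{aa}/{bb}/{digest}.json
static string ManifestUri(
    string tenant,
    string digestHex,
    string scheme = "cas",
    string bucket = "surface-cache",
    string prefix = "manifests")
    => $"{scheme}://{bucket}/{prefix}/{SanitizeTenant(tenant)}/{digestHex[..2]}/{digestHex[2..4]}/{digestHex}.json";

// On-disk path uses the same two-level fan-out under the manifest root.
static string ManifestPath(string root, string tenant, string digestHex)
    => Path.Combine(root, SanitizeTenant(tenant), digestHex[..2], digestHex[2..4], $"{digestHex}.json");

var exampleHex = DigestHex("sha256:ABCDEF0123");
Console.WriteLine(ManifestUri("acme", exampleHex));
// prints "cas://surface-cache/manifests/acme/ab/cd/abcdef0123.json"
Console.WriteLine(ManifestPath("/var/cache/manifests", "acme", exampleHex));
```

Because both the URI and the path are derived only from the tenant and the content digest, publishing the same normalized document twice lands on the same file, which is what lets the store skip the second write.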
diff --git a/src/Scanner/__Tests/StellaOps.Scanner.Surface.FS.Tests/FileSurfaceManifestStoreTests.cs b/src/Scanner/__Tests/StellaOps.Scanner.Surface.FS.Tests/FileSurfaceManifestStoreTests.cs new file mode 100644 index 000000000..796680cde --- /dev/null +++ b/src/Scanner/__Tests/StellaOps.Scanner.Surface.FS.Tests/FileSurfaceManifestStoreTests.cs @@ -0,0 +1,156 @@ +using System; +using System.Collections.Generic; +using System.IO; +using System.Linq; +using System.Threading.Tasks; +using Microsoft.Extensions.Logging.Abstractions; +using Microsoft.Extensions.Options; +using StellaOps.Scanner.Surface.FS; +using Xunit; + +namespace StellaOps.Scanner.Surface.FS.Tests; + +public sealed class FileSurfaceManifestStoreTests : IAsyncDisposable +{ + private readonly DirectoryInfo _root; + private readonly FileSurfaceManifestStore _store; + + public FileSurfaceManifestStoreTests() + { + _root = Directory.CreateTempSubdirectory("surface-fs-tests"); + + var cacheOptions = Options.Create(new SurfaceCacheOptions + { + RootDirectory = Path.Combine(_root.FullName, "cache") + }); + + var manifestOptions = Options.Create(new SurfaceManifestStoreOptions + { + RootDirectory = Path.Combine(_root.FullName, "manifests"), + Bucket = "test-bucket", + Prefix = "manifests" + }); + + _store = new FileSurfaceManifestStore( + cacheOptions, + manifestOptions, + NullLogger.Instance); + } + + [Fact] + public async Task PublishAsync_WritesManifestWithDeterministicDigest() + { + var doc = new SurfaceManifestDocument + { + Tenant = "acme", + ImageDigest = "sha256:deadbeef", + Artifacts = new[] + { + new SurfaceManifestArtifact + { + Kind = "layer", + Uri = "cas://bucket/layer", + Digest = "sha256:aaaa", + MediaType = "application/json", + Format = "json", + Metadata = new Dictionary + { + ["z"] = "last", + ["a"] = "first" + } + }, + new SurfaceManifestArtifact + { + Kind = "entrytrace", + Uri = "cas://bucket/entry", + Digest = "sha256:bbbb", + MediaType = "application/json", + Format = "json" + } + } + 
}; + + var result = await _store.PublishAsync(doc); + + Assert.StartsWith("sha256:", result.ManifestDigest, StringComparison.Ordinal); + Assert.Equal(result.ManifestDigest, $"sha256:{result.ManifestUri.Split('/', StringSplitOptions.RemoveEmptyEntries).Last()[..^5]}"); + Assert.NotNull(result.Document); + Assert.True(File.Exists(GetManifestPath(result.ManifestDigest, "acme"))); + + // Metadata dictionary should be sorted to guarantee deterministic serialization + var artifact = result.Document.Artifacts.Single(a => a.Kind == "layer"); + Assert.Equal(new[] { "a", "z" }, artifact.Metadata!.Keys); + } + + [Fact] + public async Task TryGetByUriAsync_ReturnsPublishedManifest() + { + var doc = new SurfaceManifestDocument + { + Tenant = "acme", + ScanId = "scan-123", + Artifacts = Array.Empty() + }; + + var publish = await _store.PublishAsync(doc); + + var retrieved = await _store.TryGetByUriAsync(publish.ManifestUri); + + Assert.NotNull(retrieved); + Assert.Equal("acme", retrieved!.Tenant); + Assert.Equal("scan-123", retrieved.ScanId); + } + + [Fact] + public async Task TryGetByDigestAsync_ReturnsManifestAcrossTenants() + { + var doc1 = new SurfaceManifestDocument + { + Tenant = "tenant-one", + Artifacts = Array.Empty() + }; + + var doc2 = new SurfaceManifestDocument + { + Tenant = "tenant-two", + Artifacts = Array.Empty() + }; + + var publish1 = await _store.PublishAsync(doc1); + var publish2 = await _store.PublishAsync(doc2); + + var retrieved = await _store.TryGetByDigestAsync(publish2.ManifestDigest); + + Assert.NotNull(retrieved); + Assert.Equal("tenant-two", retrieved!.Tenant); + } + + private string GetManifestPath(string digest, string tenant) + { + var hex = digest["sha256:".Length..]; + return Path.Combine( + Path.Combine(_root.FullName, "manifests"), + tenant, + hex[..2], + hex[2..4], + $"{hex}.json"); + } + + public async ValueTask DisposeAsync() + { + await Task.Run(() => + { + try + { + if (_root.Exists) + { + _root.Delete(recursive: true); + } + } + catch + 
{ + // ignored + } + }); + } +} diff --git a/src/Scheduler/__Libraries/StellaOps.Scheduler.Worker/TASKS.md b/src/Scheduler/__Libraries/StellaOps.Scheduler.Worker/TASKS.md index 9d56919f3..8fc2552e1 100644 --- a/src/Scheduler/__Libraries/StellaOps.Scheduler.Worker/TASKS.md +++ b/src/Scheduler/__Libraries/StellaOps.Scheduler.Worker/TASKS.md @@ -3,6 +3,7 @@ | ID | Status | Owner(s) | Depends on | Description | Exit Criteria | |----|--------|----------|------------|-------------|---------------| | SCHED-SURFACE-01 | TODO | Scheduler Worker Guild | SURFACE-FS-02, SCANNER-SURFACE-02 | Evaluate Surface.FS pointers when planning delta scans to avoid redundant work and prioritise drift-triggered assets. | Planner reads Surface.FS manifests; regression tests cover cache hits/misses; documentation updated. | +| SCHED-SURFACE-02 | TODO | Scheduler Worker Guild, Surface FS Guild | SURFACE-FS-02, SCHED-SURFACE-01 | Integrate Surface manifest reader to prefetch CAS manifests before scheduling reruns and persist pointer metadata alongside run plans. See `docs/modules/scanner/design/surface-fs-consumers.md` §3 for checklist. | Prefetch pipeline prevents redundant scans; scheduler persists manifest URIs/digests; integration tests cover cache hit/miss fallbacks and telemetry wiring. | > 2025-10-27: Impact targeting sanitizes selector-constrained results, dedupes digests, and documents shard planning in `docs/SCHED-WORKER-16-202-IMPACT-TARGETING.md`. 
diff --git a/src/Zastava/StellaOps.Zastava.Observer/TASKS.md b/src/Zastava/StellaOps.Zastava.Observer/TASKS.md index 6fb63133c..0290203ff 100644 --- a/src/Zastava/StellaOps.Zastava.Observer/TASKS.md +++ b/src/Zastava/StellaOps.Zastava.Observer/TASKS.md @@ -3,6 +3,7 @@ | ID | Status | Owner(s) | Depends on | Description | Exit Criteria | |----|--------|----------|------------|-------------|---------------| | ZASTAVA-SURFACE-01 | TODO | Zastava Observer Guild | SURFACE-FS-02 | Integrate Surface.FS client for runtime drift detection (lookup cached layer hashes/entry traces). | Observer validates runtime vs cache; integration tests cover drift + cache-miss cases. | +| ZASTAVA-SURFACE-02 | TODO | Zastava Observer Guild | SURFACE-FS-02, ZASTAVA-SURFACE-01 | Adopt Surface manifest reader helpers to resolve `cas://` pointers and surface cache lineage in drift diagnostics. See `docs/modules/scanner/design/surface-fs-consumers.md` §4 for expectations. | Observer fetches manifests via new URI schema; drift diagnostics show manifest provenance; unit/integration tests cover pointer fetch and error fallback. | | ZASTAVA-ENV-01 | TODO | Zastava Observer Guild | SURFACE-ENV-02 | Adopt Surface.Env helpers for cache endpoints, secret refs, and feature toggles. | Observer configuration centralised; misconfiguration warnings logged; docs updated. | | ZASTAVA-SECRETS-01 | TODO | Zastava Observer Guild, Security Guild | SURFACE-SECRETS-02 | Retrieve CAS/attestation access via Surface.Secrets instead of inline secret stores. | Secrets resolved through shared provider; rotation/resilience tests pass. |
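For consumers such as ZASTAVA-SURFACE-02 that need to resolve `cas://` pointers, the following sketch shows how a manifest URI decomposes into tenant and digest. It follows the segment positions used by `SurfaceManifestPathBuilder.ParseUri` in this patch; the function name `ParseManifestUri` and the tuple return type are assumptions made for illustration only.

```csharp
using System;

// Hypothetical standalone parser: the last four path segments are expected to be
// {tenant}/{aa}/{bb}/{digest}.json, matching the layout emitted by the store.
static (string Tenant, string DigestHex) ParseManifestUri(string manifestUri)
{
    if (!Uri.TryCreate(manifestUri, UriKind.Absolute, out var uri))
    {
        throw new ArgumentException("Manifest URI must be an absolute URI.", nameof(manifestUri));
    }

    var segments = uri.AbsolutePath
        .Trim('/')
        .Split('/', StringSplitOptions.RemoveEmptyEntries);

    if (segments.Length < 4)
    {
        throw new ArgumentException("Manifest URI is missing expected segments.", nameof(manifestUri));
    }

    var digestFile = segments[^1];
    if (!digestFile.EndsWith(".json", StringComparison.Ordinal))
    {
        throw new ArgumentException("Manifest URI must end with a JSON file.", nameof(manifestUri));
    }

    // Drop the ".json" suffix (5 characters) to recover the digest hex.
    return (segments[^4], digestFile[..^5]);
}

var pointer = ParseManifestUri("cas://surface-cache/manifests/acme/ab/cd/abcdef0123.json");
Console.WriteLine($"{pointer.Tenant} sha256:{pointer.DigestHex}");
// prints "acme sha256:abcdef0123"
```

Counting segments from the end keeps the parser independent of the configured `Prefix`, so a prefix containing extra path segments would still resolve to the right tenant.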