Implement Advisory Canonicalization and Backfill Migration

- Added AdvisoryCanonicalizer for canonicalizing advisory identifiers.
- Created EnsureAdvisoryCanonicalKeyBackfillMigration to populate advisory_key and links in advisory_raw documents.
- Introduced FileSurfaceManifestStore for managing surface manifests with file system backing.
- Developed ISurfaceManifestReader and ISurfaceManifestWriter interfaces for reading and writing manifests.
- Implemented SurfaceManifestPathBuilder for constructing paths and URIs for surface manifests.
- Added tests for FileSurfaceManifestStore to ensure correct functionality and deterministic behavior.
- Updated documentation for new features and migration steps.
This commit is contained in: master
2025-11-07 19:54:02 +02:00
parent a1ce3f74fa
commit 515975edc5
42 changed files with 1893 additions and 336 deletions

@@ -0,0 +1,32 @@
# Link-Not-Merge Determinism Test Plan
**Task:** MERGE-LNM-21-003 — replace legacy merge determinism suites with observation/linkset regressions now that `NoMergeEnabled` is defaulted to `true`.
## Objectives
- Validate that raw advisory documents remain byte-stable through observation/linkset materialisation.
- Ensure that conflicts detected during linkset building surface in telemetry and persisted artifacts without merge-side mutation.
- Keep canonical hash output stable for exports/evidence bundles after repeated runs.
## Test Coverage Outline
1. **Raw → Observation determinism**
   - Feed canonical advisory raw fixtures containing mixed casing, duplicate aliases, and provenance metadata.
   - Assert repeated runs of `AdvisoryObservationFactory` emit identical observations (structural equality + canonical JSON hash).
   - Verify raw linkset payload retains original ordering/whitespace while canonical linkset stays normalised.
   - Initial coverage implemented via `AdvisoryObservationFactoryTests.Create_IsDeterministicAcrossRuns` (core tests).
2. **Linkset conflict surfacing**
   - Build linksets from conflicting advisory observations (e.g., differing severity or status flags).
   - Confirm conflict markers propagate to `AdvisoryLinkset` outputs and associated metrics/log records.
   - Capture deterministic ordering of conflict explanations for evidence exports.
3. **Evidence/export parity**
   - Re-run observation/linkset pipelines against identical fixtures and assert resulting evidence manifests hash-identically.
   - Track monotonic `supersedes` chains and ensure canonical link records include `PRIMARY` schemes.
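As a rough illustration of the first outline item, the sketch below hashes a canonical projection of the same aliases supplied in two different orders and checks that the digest is stable while the raw input stays untouched. `CanonicalJson` and `Sha256Hex` are hypothetical helpers written for this note, and the upper-casing and ordinal-sort rules are assumptions; none of this is the `AdvisoryObservationFactory` implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// Hypothetical canonical projection: trim, upper-case, de-duplicate, ordinal sort.
// These rules are assumptions for illustration only.
string CanonicalJson(IEnumerable<string> rawAliases)
{
    var canonical = rawAliases
        .Select(alias => alias.Trim().ToUpperInvariant())
        .Distinct(StringComparer.Ordinal)
        .OrderBy(alias => alias, StringComparer.Ordinal)
        .ToArray();
    return JsonSerializer.Serialize(canonical);
}

// Lower-case hex SHA-256 over the UTF-8 bytes of the canonical JSON.
string Sha256Hex(string text) =>
    Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(text))).ToLowerInvariant();

// Two "runs" that see the same aliases in a different order and casing.
var runA = new[] { "ghsa-aaaa-bbbb-cccc", "CVE-2025-12345", "cve-2025-12345" };
var runB = new[] { "cve-2025-12345", "CVE-2025-12345", "ghsa-aaaa-bbbb-cccc" };
var rawSnapshot = string.Join("|", runA);

var hashA = Sha256Hex(CanonicalJson(runA));
var hashB = Sha256Hex(CanonicalJson(runB));

// The canonical hash should match across runs, and the raw input should be unchanged.
Console.WriteLine(hashA == hashB);
Console.WriteLine(rawSnapshot == string.Join("|", runA));
```

A real regression test would compare the factory's actual output rather than this stand-in projection; the point here is only the shape of the assertion (hash equality across runs plus an untouched raw payload).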
## Migration Steps
- [ ] Retire `StellaOps.Concelier.Merge.Tests` determinism suites once observation/linkset equivalents land.
- [ ] Introduce new regression fixtures under `StellaOps.Concelier.Core.Tests` (shared via `StellaOps.Concelier.Testing`).
- [ ] Wire test helpers to Mongo in-memory harness for end-to-end parity runs.
- [ ] Update documentation (`docs/migration/no-merge.md`) with validation checklist once new tests are green.
_Pending_: execute suites on a workstation with the .NET 10 preview SDK; local environment lacks a functioning CLI, so validation runs must happen downstream.

@@ -78,6 +78,10 @@ Follow the sprint files below in order. Update task status in both `SPRINTS` and
> 2025-11-02: SURFACE-VAL-01 moved to DOING (Surface Validation Guild) aligning design document with implementation plan.
> 2025-11-02: SURFACE-FS-01 moved to DOING (Surface FS Guild) finalising cache layout and manifest spec.
> 2025-11-02: SURFACE-FS-02 moved to DOING (Surface FS Guild) building core abstractions and deterministic serializers.
> 2025-11-07: SURFACE-FS-01 marked DONE: updated `surface-fs.md` with pointer layout, offline kit flow, and architecture cross-link.
> 2025-11-07: SURFACE-FS-02 marked DONE: landed file-backed manifest store (`FileSurfaceManifestStore`), deterministic serialization, and unit coverage.
> 2025-11-07: SCHED-SURFACE-02 added (Scheduler Worker Guild) prefetch Surface manifests before scheduling reruns.
> 2025-11-07: ZASTAVA-SURFACE-02 added (Zastava Observer Guild) adopt Surface manifest reader for drift diagnostics.
> 2025-11-02: SURFACE-SECRETS-01 moved to DOING (Surface Secrets Guild) updating secrets design for provider matrix.
> 2025-11-02: SURFACE-SECRETS-02 moved to DOING (Surface Secrets Guild) implementing base providers + tests.
> 2025-11-02: AUTH-POLICY-27-002 marked DONE (Authority Core & Security Guild) interactive-only policy publish/promote scopes delivered with metadata, fresh-auth enforcement, and audit/docs updates.

@@ -158,8 +158,8 @@ CONCELIER-SIG-26-001 `Vulnerable symbol exposure` | TODO | Expose advisory metad
CONCELIER-STORE-AOC-19-005 `Raw linkset backfill` | TODO (2025-11-04) | Plan and execute advisory_observations `rawLinkset` backfill (online + Offline Kit bundles), supply migration scripts + rehearse rollback. Follow the coordination plan in `docs/dev/raw-linkset-backfill-plan.md`. Dependencies: CONCELIER-CORE-AOC-19-004. | Concelier Storage Guild, DevOps Guild (src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/TASKS.md)
CONCELIER-TEN-48-001 `Tenant-aware linking` | TODO | Ensure advisory normalization/linking runs per tenant with RLS enforcing isolation; emit capability endpoint reporting `merge=false`; update events with tenant context. Dependencies: AUTH-TEN-47-001. | Concelier Core Guild (src/Concelier/__Libraries/StellaOps.Concelier.Core/TASKS.md)
CONCELIER-VEXLENS-30-001 `Advisory rationale bridges` | TODO | Guarantee advisory key consistency and cross-links for consensus rationale; Label: VEX-Lens. Dependencies: CONCELIER-VULN-29-001, VEXLENS-30-005. | Concelier WebService Guild, VEX Lens Guild (src/Concelier/StellaOps.Concelier.WebService/TASKS.md)
CONCELIER-VULN-29-001 `Advisory key canonicalization` | TODO | Canonicalize (lossless) advisory identifiers (CVE/GHSA/vendor) into `advisory_key`, persist `links[]`, expose raw payload snapshots for Explorer evidence tabs; AOC-compliant: no merge, no derived fields, no suppression. Include migration/backfill scripts. Dependencies: CONCELIER-LNM-21-001. | Concelier WebService Guild, Data Integrity Guild (src/Concelier/StellaOps.Concelier.WebService/TASKS.md)
CONCELIER-VULN-29-002 `Evidence retrieval API` | TODO | Provide `/vuln/evidence/advisories/{advisory_key}` returning raw advisory docs with provenance, filtering by tenant and source. Dependencies: CONCELIER-VULN-29-001, VULN-API-29-003. | Concelier WebService Guild (src/Concelier/StellaOps.Concelier.WebService/TASKS.md)
CONCELIER-VULN-29-001 `Advisory key canonicalization` | DONE (2025-11-07) | Canonicalize (lossless) advisory identifiers (CVE/GHSA/vendor) into `advisory_key`, persist `links[]`, expose raw payload snapshots for Explorer evidence tabs; AOC-compliant: no merge, no derived fields, no suppression. Include migration/backfill scripts. Dependencies: CONCELIER-LNM-21-001. | Concelier WebService Guild, Data Integrity Guild (src/Concelier/StellaOps.Concelier.WebService/TASKS.md)
CONCELIER-VULN-29-002 `Evidence retrieval API` | DONE (2025-11-07) | Provide `/vuln/evidence/advisories/{advisory_key}` returning raw advisory docs with provenance, filtering by tenant and source. Dependencies: CONCELIER-VULN-29-001, VULN-API-29-003. | Concelier WebService Guild (src/Concelier/StellaOps.Concelier.WebService/TASKS.md)
[Ingestion & Evidence] 110.B) Concelier.V
@@ -211,8 +211,8 @@ Depends on: Sprint 110.B - Concelier.VI
Summary: Ingestion & Evidence focus on Concelier (phase VII).
Task ID | State | Task description | Owners (Source)
--- | --- | --- | ---
MERGE-LNM-21-002 | DOING (2025-11-07) | Refactor or retire `AdvisoryMergeService` and related pipelines, ensuring callers transition to observation/linkset APIs; add compile-time analyzer preventing merge service usage.<br>2025-11-03: Began dependency audit and call-site inventory ahead of deprecation plan; cataloging service registrations/tests referencing merge APIs.<br>2025-11-05 14:42Z: Drafted `concelier:features:noMergeEnabled` gating, merge job allowlist handling, and deprecation/telemetry changes prior to analyzer rollout.<br>2025-11-06 16:10Z: Landed analyzer project (`CONCELIER0002`), wired into Concelier WebService/tests, and updated docs to direct suppressions through explicit migration notes.<br>2025-11-07 03:25Z: Default-on toggle + job gating break existing Concelier WebService tests; guard/migration adjustments pending before closing the task.<br>2025-11-07 07:05Z: Added ingest-path diagnostics (hash logging + test log dumping) to trace why HTTP binding loses `upstream.contentHash` with `noMergeEnabled=true`; need to adapt seeding/tests once the binding issue is fixed. | BE-Merge (src/Concelier/__Libraries/StellaOps.Concelier.Merge/TASKS.md)
MERGE-LNM-21-003 Determinism/test updates | QA Guild, BE-Merge | Replace merge determinism suites with observation/linkset regression tests verifying no data mutation and conflicts remain visible. Dependencies: MERGE-LNM-21-002. | MERGE-LNM-21-002 (src/Concelier/__Libraries/StellaOps.Concelier.Merge/TASKS.md)
MERGE-LNM-21-002 | DONE (2025-11-07) | Refactor or retire `AdvisoryMergeService` and related pipelines, ensuring callers transition to observation/linkset APIs; add compile-time analyzer preventing merge service usage.<br>2025-11-03: Began dependency audit and call-site inventory ahead of deprecation plan; cataloging service registrations/tests referencing merge APIs.<br>2025-11-05 14:42Z: Drafted `concelier:features:noMergeEnabled` gating, merge job allowlist handling, and deprecation/telemetry changes prior to analyzer rollout.<br>2025-11-06 16:10Z: Landed analyzer project (`CONCELIER0002`), wired into Concelier WebService/tests, and updated docs to direct suppressions through explicit migration notes.<br>2025-11-07 03:25Z: Default-on toggle + job gating surfaced ingestion test brittleness; guard/migration diagnostics capture requests missing `upstream.contentHash`.<br>2025-11-07 19:45Z: Set `ConcelierOptions.Features.NoMergeEnabled` default to `true`, added regression coverage (`Features_NoMergeEnabled_DefaultsToTrue`), and rechecked ingest helpers to carry canonical links. Remote .NET 10 CLI run remains queued for validation. | BE-Merge (src/Concelier/__Libraries/StellaOps.Concelier.Merge/TASKS.md)
MERGE-LNM-21-003 Determinism/test updates | DOING (2025-11-07) | QA Guild, BE-Merge | Replace merge determinism suites with observation/linkset regression tests verifying no data mutation and conflicts remain visible. Dependencies: MERGE-LNM-21-002.<br>2025-11-07: Drafting test migration plan (`docs/dev/lnm-determinism-tests.md`) to map legacy merge fixtures onto observation/linkset pipelines; identifying coverage gaps (conflict surfacing, raw vs canonical parity, hash stability).<br>2025-11-07 20:05Z: Landed `AdvisoryObservationFactoryTests.Create_IsDeterministicAcrossRuns` to cover canonical JSON stability and pruned the old merge determinism integration test. | MERGE-LNM-21-002 (src/Concelier/__Libraries/StellaOps.Concelier.Merge/TASKS.md)
WEB-AOC-19-001 (dependency) | DONE (2025-11-07) | Shared guard primitives now enforce the top-level allowlist (`_id`, tenant, source, upstream, content, identifiers, linkset, supersedes, created/ingested timestamps, attributes) and emit the reusable `AocError` payload consumed by HTTP/CLI tooling. Extend `AocGuardOptions.AllowedTopLevelFields` when staging new schema fields to avoid false-positive `ERR_AOC_007` violations. | BE-Base Platform Guild (docs/aoc/guard-library.md, src/Web/StellaOps.Web/TASKS.md)

@@ -156,6 +156,8 @@ SCANNER-ENG-0027 | TODO | Deliver Windows policy/offline integration per `design
SCANNER-SURFACE-01 | DONE (2025-11-06) | Persist Surface.FS manifests after analyzer stages, including layer CAS metadata and EntryTrace fragments.<br>2025-11-02: Worker pipeline emitting draft Surface.FS manifests for sample scans; determinism checks running.<br>2025-11-06: Continuing with manifest writer abstraction + telemetry wiring for Surface.FS persistence.<br>2025-11-06 18:45Z: Resumed work; targeting manifest writer abstraction, CAS persistence hooks, and telemetry/test coverage updates.<br>2025-11-06 20:20Z: Published Surface worker Grafana dashboard + updated design doc; WebService pointer integration test now covers manifest/payload artefacts. | Scanner Worker Guild (src/Scanner/StellaOps.Scanner.Worker/TASKS.md)
SCANNER-SURFACE-02 | DONE (2025-11-05) | Publish Surface.FS pointers (CAS URIs, manifests) via scan/report APIs and update attestation metadata. Dependencies: SCANNER-SURFACE-01.<br>2025-11-05: Surface pointer projection wired through WebService endpoints, orchestrator samples & DSSE fixtures refreshed with `surface` manifest block, and regression suite (platform events, report sample, ready check) updated. | Scanner WebService Guild (src/Scanner/StellaOps.Scanner.WebService/TASKS.md)
SCANNER-SURFACE-03 | DONE (2025-11-07) | Push layer manifests and entry fragments into Surface.FS during build-time SBOM generation. Dependencies: SCANNER-SURFACE-02.<br>2025-11-06: Starting BuildX manifest upload implementation with Surface.FS client abstraction and integration tests.<br>2025-11-07 15:30Z: Resumed BuildX plugin Surface wiring; analyzing Surface.FS models, CAS flow, and upcoming tests before coding.<br>2025-11-07 22:10Z: Added Surface manifest writer + CLI flags to the BuildX plug-in, persisted artefacts into CAS, regenerated docs/fixtures, and shipped new tests covering the writer + descriptor flow. | BuildX Plugin Guild (src/Scanner/StellaOps.Scanner.Sbomer.BuildXPlugin/TASKS.md)
SCHED-SURFACE-02 | TODO | Integrate Scheduler worker prefetch using Surface manifest reader and persist manifest pointers with rerun plans. Dependencies: SURFACE-FS-02, SCHED-SURFACE-01. Reference `docs/modules/scanner/design/surface-fs-consumers.md` §3 for implementation checklist. | Scheduler Worker Guild (src/Scheduler/__Libraries/StellaOps.Scheduler.Worker/TASKS.md)
ZASTAVA-SURFACE-02 | TODO | Use Surface manifest reader helpers to resolve `cas://` pointers and enrich drift diagnostics with manifest provenance. Dependencies: SURFACE-FS-02, ZASTAVA-SURFACE-01. Reference `docs/modules/scanner/design/surface-fs-consumers.md` §4 for integration steps. | Zastava Observer Guild (src/Zastava/StellaOps.Zastava.Observer/TASKS.md)
[Scanner & Surface] 130.A) Scanner.VIII
Depends on: Sprint 130.A - Scanner.VII

@@ -61,23 +61,28 @@
"spec_version": "1.6",
"raw": { /* unmodified upstream document */ }
},
"identifiers": {
"cve": ["CVE-2025-12345"],
"ghsa": ["GHSA-xxxx-...."],
"aliases": ["CVE-2025-12345", "GHSA-xxxx-...."]
},
"linkset": {
"purls": ["pkg:npm/lodash@4.17.21"],
"cpes": ["cpe:2.3:a:lodash:lodash:4.17.21:*:*:*:*:*:*:*"],
"references": [
{"type":"advisory","url":"https://..."},
{"type":"fix","url":"https://..."}
],
"reconciled_from": ["content.raw.affected.ranges", "content.raw.pkg"]
},
"supersedes": "advisory_raw:osv:GHSA-xxxx-....:v2",
"tenant": "default"
}
"identifiers": {
"primary": "GHSA-xxxx-....",
"aliases": ["CVE-2025-12345", "GHSA-xxxx-...."]
},
"linkset": {
"purls": ["pkg:npm/lodash@4.17.21"],
"cpes": ["cpe:2.3:a:lodash:lodash:4.17.21:*:*:*:*:*:*:*"],
"references": [
{"type":"advisory","url":"https://..."},
{"type":"fix","url":"https://..."}
],
"reconciled_from": ["content.raw.affected.ranges", "content.raw.pkg"]
},
"advisory_key": "CVE-2025-12345",
"links": [
{"scheme":"CVE","value":"CVE-2025-12345"},
{"scheme":"GHSA","value":"GHSA-XXXX-...."},
{"scheme":"PRIMARY","value":"CVE-2025-12345"}
],
"supersedes": "advisory_raw:osv:GHSA-xxxx-....:v2",
"tenant": "default"
}
```
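The `advisory_key` and `links` fields in the sample document above could be derived along the following lines. This is a minimal sketch under stated assumptions, not the `AdvisoryCanonicalizer` implementation: the helper names, the `OTHER` scheme, the upper-casing rule, and the "prefer a CVE as the key" rule are all inferred from the example or invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical scheme detection based on the identifier prefix.
string DetectScheme(string value) =>
    value.StartsWith("CVE-", StringComparison.Ordinal) ? "CVE"
    : value.StartsWith("GHSA-", StringComparison.Ordinal) ? "GHSA"
    : "OTHER"; // placeholder scheme name, not taken from the source

// Assumed normalisation: trim, upper-case, de-duplicate, ordinal sort.
List<string> Normalize(IEnumerable<string> aliases) =>
    aliases
        .Select(alias => alias.Trim().ToUpperInvariant())
        .Where(alias => alias.Length > 0)
        .Distinct(StringComparer.Ordinal)
        .OrderBy(alias => alias, StringComparer.Ordinal)
        .ToList();

// Assumed key selection: the first CVE if present, otherwise the first identifier.
string SelectAdvisoryKey(IReadOnlyList<string> values)
{
    if (values.Count == 0)
    {
        throw new ArgumentException("At least one identifier is required.", nameof(values));
    }

    return values.FirstOrDefault(v => DetectScheme(v) == "CVE") ?? values[0];
}

// One link per identifier, plus a PRIMARY link that mirrors the advisory key.
List<(string Scheme, string Value)> BuildLinks(IReadOnlyList<string> values, string advisoryKey)
{
    var result = values.Select(v => (Scheme: DetectScheme(v), Value: v)).ToList();
    result.Add((Scheme: "PRIMARY", Value: advisoryKey));
    return result;
}

var identifiers = Normalize(new[] { "GHSA-aaaa-bbbb-cccc", "CVE-2025-12345", "cve-2025-12345" });
var advisoryKey = SelectAdvisoryKey(identifiers);
var advisoryLinks = BuildLinks(identifiers, advisoryKey);

Console.WriteLine(advisoryKey);
foreach (var link in advisoryLinks)
{
    Console.WriteLine($"{link.Scheme}:{link.Value}");
}
```

Keeping the raw aliases untouched in `identifiers.aliases` while deriving `links` separately is what lets the canonical view change without mutating the upstream document.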
### 1.2 Connector lifecycle
@@ -110,7 +115,7 @@ Running the same export job twice against the same snapshot must yield byte-iden
* **Linkset builder** that correlates observations into `advisory_linksets` and annotates conflicts.
* **Event publisher** emitting `advisory.observation.updated` and `advisory.linkset.updated` messages.
* **Exporters** (JSON, Trivy DB, Offline Kit slices) fed from observation/linkset stores.
* **Minimal REST** for health/status/trigger/export and observation/linkset reads.
* **Minimal REST** for health/status/trigger/export, raw observation reads, and evidence retrieval (`GET /vuln/evidence/advisories/{advisory_key}`).
**Scale:** HA by running N replicas; **locks** prevent overlapping jobs per source/exporter.

@@ -0,0 +1,51 @@
# Surface.FS Consumer Integration Guide (Scheduler & Zastava)
> **Updated:** 2025-11-07
> **Audience:** Scheduler Worker Guild • Zastava Observer Guild • Surface FS Guild
> **Depends on:** SURFACE-FS-02 (`FileSurfaceManifestStore`), Surface.Env/Surface.Secrets libraries.
This note captures the minimum wiring required for downstream services now that `FileSurfaceManifestStore` and the manifest reader/writer abstractions have landed.
## 1. Shared prerequisites
- Reference `StellaOps.Scanner.Surface.FS` (net10.0) and call:
```csharp
services
.AddSurfaceFileCache()
.AddSurfaceManifestStore();
```
This binds `Surface:Cache` and `Surface:Manifest` (or `SCANNER_SURFACE_*` overrides).
- Pull runtime settings via `ISurfaceEnvironment` to ensure tenants/endpoints line up with Scanner.
- Cache root (`Surface:Cache:Root`) must be writable; manifests fall back to `<Root>/manifests` unless explicitly overridden with `Surface:Manifest:RootDirectory`.
## 2. Manifest reader usage
```csharp
var reader = serviceProvider.GetRequiredService<ISurfaceManifestReader>();
var manifest = await reader.TryGetByUriAsync(surfaceUri, cancellationToken);
```
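- Accept `cas://{bucket}/{prefix}/{tenant}/{hh}/{tt}/{digest}.json` pointers, where `{hh}` and `{tt}` are the first and second pairs of hex characters of the manifest digest.
- On a cache miss the reader returns `null`; callers should fall back to existing recompute paths.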
- All timestamps are stored in canonical UTC, and metadata dictionaries are alphabetically sorted to keep digests deterministic.
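To make the determinism note concrete, the sketch below shows why canonical UTC timestamps and sorted metadata keys yield the same digest regardless of the caller's time zone or dictionary insertion order. `CanonicalTimestamp` and `ComputeDigest` are hypothetical helpers, and the two-field document is an assumption chosen for brevity; the real serializer and field set may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// Render a timestamp as second-precision UTC, independent of the input offset.
string CanonicalTimestamp(DateTimeOffset value) =>
    value.ToUniversalTime().ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);

// Hypothetical digest over a tiny document; keys are ordered with an ordinal comparer.
string ComputeDigest(DateTimeOffset generatedAt, IDictionary<string, string> metadata)
{
    var document = new SortedDictionary<string, object>(StringComparer.Ordinal)
    {
        ["generatedAt"] = CanonicalTimestamp(generatedAt),
        ["metadata"] = new SortedDictionary<string, string>(metadata, StringComparer.Ordinal),
    };

    var json = JsonSerializer.Serialize(document);
    var hash = SHA256.HashData(Encoding.UTF8.GetBytes(json));
    return "sha256:" + Convert.ToHexString(hash).ToLowerInvariant();
}

// Same instant expressed with different offsets; same metadata inserted in a different order.
var utcStamp = new DateTimeOffset(2025, 10, 29, 12, 0, 0, TimeSpan.Zero);
var offsetStamp = new DateTimeOffset(2025, 10, 29, 14, 0, 0, TimeSpan.FromHours(2));
var metadataA = new Dictionary<string, string> { ["surfaceVersion"] = "1", ["entrypoint"] = "/usr/bin/java" };
var metadataB = new Dictionary<string, string> { ["entrypoint"] = "/usr/bin/java", ["surfaceVersion"] = "1" };

Console.WriteLine(ComputeDigest(utcStamp, metadataA) == ComputeDigest(offsetStamp, metadataB));
```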
## 3. Scheduler worker checklist (`SCHED-SURFACE-02`)
1. Prefetch manifests during planning so reruns can skip redundant layers.
2. Persist `{manifestUri, manifestDigest}` alongside run plans for traceability.
3. Emit telemetry counters: `scheduler_surface_manifest_prefetch_total{result=hit|miss}`.
4. Update `docs/SCHED-WORKER-16-201-PLANNER.md` with the new prefetch flow.
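The first three checklist items might be wired together roughly as follows. This is a sketch, not the Scheduler implementation: the `lookup` delegate stands in for `ISurfaceManifestReader.TryGetByUriAsync` (whose return type is simplified here to a digest string), and the meter name, `PrefetchAsync`, and the sample URIs are hypothetical. Only the counter name and the `result=hit|miss` tag come from the checklist.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;
using System.Threading;
using System.Threading.Tasks;

// Meter name is illustrative; the counter name follows the checklist.
var meter = new Meter("StellaOps.Scheduler.Worker.Sketch");
var prefetchTotal = meter.CreateCounter<long>("scheduler_surface_manifest_prefetch_total");

// Resolves each pointer, tags the counter with hit or miss, and keeps
// {manifestUri, manifestDigest} pairs for the hits so they can be stored with the run plan.
async Task<List<(string ManifestUri, string ManifestDigest)>> PrefetchAsync(
    IEnumerable<string> manifestUris,
    Func<string, CancellationToken, Task<string?>> lookup,
    CancellationToken cancellationToken)
{
    var pointers = new List<(string ManifestUri, string ManifestDigest)>();
    foreach (var manifestUri in manifestUris)
    {
        var digest = await lookup(manifestUri, cancellationToken);
        var result = digest is null ? "miss" : "hit";
        prefetchTotal.Add(1, new KeyValuePair<string, object?>("result", result));

        if (digest is not null)
        {
            pointers.Add((manifestUri, digest));
        }
    }

    return pointers;
}

// In-memory stand-in for the manifest store, for demonstration only.
var known = new Dictionary<string, string>
{
    ["cas://surface-cache/manifests/acme/ab/cd/example.json"] = "sha256:abcd",
};

var plan = await PrefetchAsync(
    new[]
    {
        "cas://surface-cache/manifests/acme/ab/cd/example.json",
        "cas://surface-cache/manifests/acme/ff/ff/missing.json",
    },
    (uri, _) => Task.FromResult<string?>(known.TryGetValue(uri, out var found) ? found : null),
    CancellationToken.None);

Console.WriteLine(plan.Count);
```

A miss is not an error in this flow: the pointer is simply left out of the plan and the rerun falls back to the existing recompute path.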
## 4. Zastava observer checklist (`ZASTAVA-SURFACE-02`)
1. Resolve manifest pointer from runtime drift events (`entrytrace.graph`, `layer.fragments` kinds).
2. Enrich drift diagnostics with `manifestDigest` and `Artifacts[n].metadata`.
3. Add failure metric `zastava_surface_manifest_failures_total{reason=not_found|fetch_error}`.
4. Expand observer runbook (`docs/modules/zastava/operations/drift.md`) with Surface manifest troubleshooting.
## 5. Testing guidance
- Unit-test manifest prefetch/adoption with a local `FileSurfaceManifestStore`; use temp directories for isolation.
- For integration environments, smoke-test by pointing to the same `Surface:Manifest:RootDirectory` used by Scanner Worker and verifying pointer fetch before scan jobs execute.
Coordinate status updates in the relevant `TASKS.md` entries and `docs/implplan/SPRINT_130_scanner_surface.md` once each guild completes its part. If you discover additional shared requirements, extend this guide so future consumers (CLI, Orchestrator) can reuse the flow.

@@ -1,8 +1,9 @@
# Surface.FS Design (Epic: SURFACE-SHARING)
> **Status:** Draft v1.0 — aligns with tasks `SURFACE-FS-01..06`, `SCANNER-SURFACE-01..05`, `ZASTAVA-SURFACE-01..02`, `SCHED-SURFACE-01`, `OPS-SECRETS-01..02`.
> **Status:** Draft v1.1 — aligns with tasks `SURFACE-FS-01..06`, `SCANNER-SURFACE-01..05`, `ZASTAVA-SURFACE-01..02`, `SCHED-SURFACE-01`, `OPS-SECRETS-01..02`.
>
> **Audience:** Scanner Worker/WebService, Zastava, Scheduler, DevOps.
> **Component map:** See [Scanner architecture — §1 System landscape](../architecture.md#1-system-landscape) for end-to-end placement.
## 1. Purpose
@@ -26,45 +27,61 @@ Manifests describe the artefact metadata and storage pointers. They are stored i
{
"schema": "stellaops.surface.manifest@1",
"tenant": "acme",
"kind": "layer-entry-trace",
"digest": "sha256:ab12...",
"createdAt": "2025-10-29T12:00:00Z",
"expiresAt": "2025-11-05T12:00:00Z",
"imageDigest": "sha256:cafe...",
"scanId": "scan-1234",
"generatedAt": "2025-10-29T12:00:00Z",
"source": {
"scannerBuild": "stellaops/scanner@sha256:deadbeef",
"imageDigest": "sha256:cafe...",
"scanId": "scan-1234"
"component": "scanner.worker",
"version": "2025.10.0",
"workerInstance": "scanner-worker-1",
"attempt": 1
},
"storage": {
"bucket": "surface-cache",
"objectKey": "tenants/acme/layer-entry-trace/sha256/ab/12/.../payload.json.zst",
"sizeBytes": 524288,
"contentType": "application/json+zstd"
},
"integrity": {
"hash": "sha256:ab12...",
"signature": null
}
"artifacts": [
{
"kind": "entrytrace.graph",
"uri": "cas://surface-cache/manifests/acme/ab/cd/abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789.json",
"digest": "sha256:abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789",
"mediaType": "application/vnd.stellaops.entrytrace+json",
"format": "json",
"sizeBytes": 524288,
"view": "runtime",
"storage": {
"bucket": "surface-cache",
"objectKey": "payloads/acme/entrytrace/sha256/ab/cd/abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789.ndjson.zst",
"sizeBytes": 524288,
"contentType": "application/x-ndjson+zstd"
},
"metadata": {
"entrypoint": "/usr/bin/java",
"surfaceVersion": "1"
}
}
]
}
```
Manifest URIs follow the deterministic pattern:
```
cas://{bucket}/{prefix}/{tenant}/{digest[0..1]}/{digest[2..3]}/{digest}.json
```
The hex portion of the manifest digest is split into two directory levels to avoid hot directories. The same layout is mirrored on disk by the default `FileSurfaceManifestStore`, which keeps offline bundle sync trivial (copy the `manifests/` tree verbatim).
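The sharding rule can be sketched as below. `BuildManifestUri` is a hypothetical helper written for this note; the actual logic lives in `SurfaceManifestPathBuilder`, whose signature is not shown here. The bucket, prefix, tenant, and digest values are the ones from the manifest example above.

```csharp
using System;

// Hypothetical helper: strips the algorithm prefix and splits the first four
// hex characters into two directory levels.
string BuildManifestUri(string bucket, string prefix, string tenant, string digest)
{
    const string algorithmPrefix = "sha256:";
    var hex = digest.StartsWith(algorithmPrefix, StringComparison.OrdinalIgnoreCase)
        ? digest[algorithmPrefix.Length..]
        : digest;
    hex = hex.ToLowerInvariant();

    if (hex.Length < 4)
    {
        throw new ArgumentException("Digest is too short to shard.", nameof(digest));
    }

    return $"cas://{bucket}/{prefix}/{tenant}/{hex[..2]}/{hex[2..4]}/{hex}.json";
}

Console.WriteLine(BuildManifestUri(
    "surface-cache",
    "manifests",
    "acme",
    "sha256:abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789"));
```

Because the path is a pure function of tenant and digest, the same computation works for the on-disk layout and for Offline Kit bundles.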
### 2.3 Payload Storage
Large payloads (SBOM fragments, entry traces, runtime events) live in the same object store as manifests (RustFS/S3). Manifests record relative paths so offline bundles can copy both manifest and payload without modification.
## 3. APIs
Surface.FS exposes a gRPC/HTTP API consumed by .NET clients:
Surface.FS exposes .NET-first abstractions that hosts consume via DI:
| Method | Description |
|--------|-------------|
| `PutManifest(PutManifestRequest)` | Stores manifest + optional payload. Idempotent via `digest`. |
| `GetManifest(GetManifestRequest)` | Returns manifest metadata; 404 if missing. |
| `GetPayload(GetPayloadRequest)` | Streams payload bytes (optionally decompressing). |
| `ListManifests(ListManifestRequest)` | Enumerates manifests for tenant/kind with pagination. |
| `DeleteManifest(DeleteManifestRequest)` | (Optional) Removes manifest/payload based on retention policies. |
- `ISurfaceManifestWriter.PublishAsync(document)` normalises artefact lists, computes the canonical SHA-256 digest, persists the manifest via the configured store, and returns a `SurfaceManifestPublishResult` containing the digest, canonical URI, and the normalised document.
- `ISurfaceManifestReader.TryGetByUriAsync(uri)` resolves a manifest pointer (e.g. `cas://surface-cache/manifests/...`) back into a `SurfaceManifestDocument`.
- `ISurfaceManifestReader.TryGetByDigestAsync(digest)` looks up a manifest by digest, scanning tenant prefixes when necessary (used by Offline Kit importers).
- `ISurfaceCache` (`GetOrCreateAsync`, `TryGetAsync`, `SetAsync`) is a lightweight content-addressable cache for hot artefacts (layer fragments, entry trace outputs) hosted on local disk.
.NET client wraps these calls and handles retries using Polly policies.
All components honour configuration bound from `Surface:Cache` and `Surface:Manifest` (or environment mirrors like `SCANNER_SURFACE_CACHE_ROOT`). `SurfaceManifestStoreOptions` controls the URI scheme/bucket/prefix and allows overriding the manifest directory while still defaulting to `<cacheRoot>/manifests`.
### WebService integration (2025-11-05)
@@ -78,16 +95,16 @@ Surface.FS exposes a gRPC/HTTP API consumed by .NET clients:
Surface.FS library for .NET hosts provides:
- `ISurfaceManifestWriter` / `ISurfaceManifestReader` interfaces.
- Content-addressed path builder (`SurfacePathBuilder`).
- Tenant namespace isolation and bucket configuration (via Surface.Env).
- Local cache abstraction `ISurfaceCache` with default `FileSurfaceCache` implementation (uses `Surface:Cache:Root` / `SCANNER_SURFACE_CACHE_ROOT`, enforces quotas, serialises writes with per-key semaphores).
- `ISurfaceManifestWriter` / `ISurfaceManifestReader` with the default `FileSurfaceManifestStore` implementation (single-writer semaphore, digest reuse, optional overwrite warning).
- Deterministic pointer builder (`SurfaceManifestPathBuilder`) and options (`SurfaceManifestStoreOptions`, `SurfaceCacheOptions`) that align with `Surface.Env` configuration.
- Local cache abstraction `ISurfaceCache` with default `FileSurfaceCache` implementation (uses `Surface:Cache:Root` / `SCANNER_SURFACE_CACHE_ROOT`, enforces per-key semaphores, stores bytes verbatim).
- `SurfaceCacheKey` helper that normalises cache entries as `{namespace}/{tenant}/{sha256}`. EntryTrace graphs use the `entrytrace.graph` namespace so Worker/WebService/CLI can share cached results deterministically.
- Metrics: `surface_manifest_put_seconds`, `surface_manifest_cache_hit_total`, etc.
- JSON serialiser (`SurfaceCacheJsonSerializer`) that applies camelCase naming, ignores nulls, and uses a stable encoder for reproducible hashing.
- Metrics: `surface_manifest_published_total`, `surface_manifest_cache_hit_total`, plus host-specific counters wired via Scanner Worker instrumentation.
## 5. Retention & Eviction
- Manifests include optional `expiresAt`; Worker defaults to 30 days for SBOM fragments, 7 days for entry traces.
- Manifests capture `generatedAt`; retention windows (30 days for SBOM fragments, 7 days for entry traces) are enforced by job configuration and object-store lifecycle policies. An `expiresAt` field is reserved for future use when automated eviction is introduced.
- Background job `SurfaceCacheMaintenanceService` evicts local cache entries exceeding quota, oldest-first.
- Object storage retention policies are managed by DevOps; library exposes metrics but does not auto-delete unless instructed.
@@ -98,13 +115,13 @@ Offline kits include:
```
offline/surface/
manifests/
tenants/<tenant>/<kind>/<digest>.json
<tenant>/<digest[0..1]>/<digest[2..3]>/<digest>.json
payloads/
tenants/<tenant>/<kind>/<digest>.json.zst
<tenant>/<kind>/<digest[0..1]>/<digest[2..3]>/<digest>.json.zst
manifest-index.json
```
Import script calls `PutManifest` for each manifest, verifying digests. This enables Zastava and Scheduler running offline to consume cached data without re-scanning.
Import script uses `ISurfaceManifestWriter.PublishAsync` for each manifest after verifying the embedded digest, keeping Offline Kit replays identical to online flows. This enables Zastava and Scheduler running offline to consume cached data without re-scanning.
### 6.1 EntryTrace Cache Usage