Implement Advisory Canonicalization and Backfill Migration
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
- Added AdvisoryCanonicalizer for canonicalizing advisory identifiers.
- Created EnsureAdvisoryCanonicalKeyBackfillMigration to populate advisory_key and links in advisory_raw documents.
- Introduced FileSurfaceManifestStore for managing surface manifests with file system backing.
- Developed ISurfaceManifestReader and ISurfaceManifestWriter interfaces for reading and writing manifests.
- Implemented SurfaceManifestPathBuilder for constructing paths and URIs for surface manifests.
- Added tests for FileSurfaceManifestStore to ensure correct functionality and deterministic behavior.
- Updated documentation for new features and migration steps.
32
docs/dev/lnm-determinism-tests.md
Normal file
@@ -0,0 +1,32 @@
|
||||
# Link-Not-Merge Determinism Test Plan
|
||||
|
||||
**Task:** MERGE-LNM-21-003 — replace legacy merge determinism suites with observation/linkset regressions now that `NoMergeEnabled` is defaulted to `true`.
|
||||
|
||||
## Objectives
|
||||
- Validate raw advisory documents remain byte-stable through observation/linkset materialisation.
|
||||
- Ensure conflicts detected during linkset building surface in telemetry and persisted artifacts without merge-side mutation.
|
||||
- Keep canonical hash output stable for exports/evidence bundles after repeated runs.
|
||||
|
||||
## Test Coverage Outline
|
||||
1. **Raw → Observation determinism**
|
||||
- Feed canonical advisory raw fixtures containing mixed casing, duplicate aliases, and provenance metadata.
|
||||
- Assert repeated runs of `AdvisoryObservationFactory` emit identical observations (structural equality + canonical JSON hash).
|
||||
- Verify raw linkset payload retains original ordering/whitespace while canonical linkset stays normalised.
|
||||
- Initial coverage implemented via `AdvisoryObservationFactoryTests.Create_IsDeterministicAcrossRuns` (core tests).
|
||||
|
||||
2. **Linkset conflict surfacing**
|
||||
- Build linksets from conflicting advisory observations (e.g., differing severity or status flags).
|
||||
- Confirm conflict markers propagate to `AdvisoryLinkset` outputs and associated metrics/log records.
|
||||
- Capture deterministic ordering of conflict explanations for evidence exports.
|
||||
|
||||
3. **Evidence/export parity**
|
||||
- Re-run the observation/linkset pipelines against identical fixtures and assert that the resulting evidence manifests hash identically.
|
||||
- Track monotonic `supersedes` chains and ensure canonical link records include `PRIMARY` schemes.
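The determinism checks outlined above can be sketched as follows. This is a minimal illustration, not the project's test code: `ObservationSketch`, `DeterminismSketch`, and the upper-casing/sorting rules are assumptions made for the example, and the real `AdvisoryObservationFactory` may normalise differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// Hypothetical stand-in for an observation; the real AdvisoryObservation type is not shown in this document.
public sealed record ObservationSketch(string AdvisoryId, IReadOnlyList<string> Aliases);

public static class DeterminismSketch
{
    // Assumed normalisation: trim, upper-case, de-duplicate, then sort ordinally
    // so that input order and casing cannot influence the output.
    public static ObservationSketch Create(string advisoryId, IEnumerable<string> rawAliases)
    {
        var aliases = rawAliases
            .Select(a => a.Trim().ToUpperInvariant())
            .Distinct(StringComparer.Ordinal)
            .OrderBy(a => a, StringComparer.Ordinal)
            .ToArray();
        return new ObservationSketch(advisoryId.Trim().ToUpperInvariant(), aliases);
    }

    // Serialises the observation and returns the lower-case hex SHA-256 of the UTF-8 JSON.
    public static string CanonicalHash(ObservationSketch observation)
    {
        var json = JsonSerializer.Serialize(observation);
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(json));
        return Convert.ToHexString(hash).ToLowerInvariant();
    }
}
```

A regression test would feed the same fixture twice (or the same aliases in a different order and casing) and assert that both the alias sequence and `CanonicalHash` match across runs.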
|
||||
|
||||
## Migration Steps
|
||||
- [ ] Retire `StellaOps.Concelier.Merge.Tests` determinism suites once observation/linkset equivalents land.
|
||||
- [ ] Introduce new regression fixtures under `StellaOps.Concelier.Core.Tests` (shared via `StellaOps.Concelier.Testing`).
|
||||
- [ ] Wire test helpers to Mongo in-memory harness for end-to-end parity runs.
|
||||
- [ ] Update documentation (`docs/migration/no-merge.md`) with validation checklist once new tests are green.
|
||||
|
||||
_Pending_: execute suites on a workstation with the .NET 10 preview SDK; local environment lacks a functioning CLI, so validation runs must happen downstream.
|
||||
@@ -78,6 +78,10 @@ Follow the sprint files below in order. Update task status in both `SPRINTS` and
|
||||
> 2025-11-02: SURFACE-VAL-01 moved to DOING (Surface Validation Guild) – aligning design document with implementation plan.
|
||||
> 2025-11-02: SURFACE-FS-01 moved to DOING (Surface FS Guild) – finalising cache layout and manifest spec.
|
||||
> 2025-11-02: SURFACE-FS-02 moved to DOING (Surface FS Guild) – building core abstractions and deterministic serializers.
|
||||
> 2025-11-07: SURFACE-FS-01 marked DONE – updated `surface-fs.md` with pointer layout, offline kit flow, and architecture cross-link.
|
||||
> 2025-11-07: SURFACE-FS-02 marked DONE – landed file-backed manifest store (`FileSurfaceManifestStore`), deterministic serialization, and unit coverage.
|
||||
> 2025-11-07: SCHED-SURFACE-02 added (Scheduler Worker Guild) – prefetch Surface manifests before scheduling reruns.
|
||||
> 2025-11-07: ZASTAVA-SURFACE-02 added (Zastava Observer Guild) – adopt Surface manifest reader for drift diagnostics.
|
||||
> 2025-11-02: SURFACE-SECRETS-01 moved to DOING (Surface Secrets Guild) – updating secrets design for provider matrix.
|
||||
> 2025-11-02: SURFACE-SECRETS-02 moved to DOING (Surface Secrets Guild) – implementing base providers + tests.
|
||||
> 2025-11-02: AUTH-POLICY-27-002 marked DONE (Authority Core & Security Guild) – interactive-only policy publish/promote scopes delivered with metadata, fresh-auth enforcement, and audit/docs updates.
|
||||
|
||||
@@ -158,8 +158,8 @@ CONCELIER-SIG-26-001 `Vulnerable symbol exposure` | TODO | Expose advisory metad
|
||||
CONCELIER-STORE-AOC-19-005 `Raw linkset backfill` | TODO (2025-11-04) | Plan and execute advisory_observations `rawLinkset` backfill (online + Offline Kit bundles), supply migration scripts + rehearse rollback. Follow the coordination plan in `docs/dev/raw-linkset-backfill-plan.md`. Dependencies: CONCELIER-CORE-AOC-19-004. | Concelier Storage Guild, DevOps Guild (src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/TASKS.md)
|
||||
CONCELIER-TEN-48-001 `Tenant-aware linking` | TODO | Ensure advisory normalization/linking runs per tenant with RLS enforcing isolation; emit capability endpoint reporting `merge=false`; update events with tenant context. Dependencies: AUTH-TEN-47-001. | Concelier Core Guild (src/Concelier/__Libraries/StellaOps.Concelier.Core/TASKS.md)
|
||||
CONCELIER-VEXLENS-30-001 `Advisory rationale bridges` | TODO | Guarantee advisory key consistency and cross-links for consensus rationale; Label: VEX-Lens. Dependencies: CONCELIER-VULN-29-001, VEXLENS-30-005. | Concelier WebService Guild, VEX Lens Guild (src/Concelier/StellaOps.Concelier.WebService/TASKS.md)
|
||||
CONCELIER-VULN-29-001 `Advisory key canonicalization` | TODO | Canonicalize (lossless) advisory identifiers (CVE/GHSA/vendor) into `advisory_key`, persist `links[]`, expose raw payload snapshots for Explorer evidence tabs; AOC-compliant: no merge, no derived fields, no suppression. Include migration/backfill scripts. Dependencies: CONCELIER-LNM-21-001. | Concelier WebService Guild, Data Integrity Guild (src/Concelier/StellaOps.Concelier.WebService/TASKS.md)
|
||||
CONCELIER-VULN-29-002 `Evidence retrieval API` | TODO | Provide `/vuln/evidence/advisories/{advisory_key}` returning raw advisory docs with provenance, filtering by tenant and source. Dependencies: CONCELIER-VULN-29-001, VULN-API-29-003. | Concelier WebService Guild (src/Concelier/StellaOps.Concelier.WebService/TASKS.md)
|
||||
CONCELIER-VULN-29-001 `Advisory key canonicalization` | DONE (2025-11-07) | Canonicalize (lossless) advisory identifiers (CVE/GHSA/vendor) into `advisory_key`, persist `links[]`, expose raw payload snapshots for Explorer evidence tabs; AOC-compliant: no merge, no derived fields, no suppression. Include migration/backfill scripts. Dependencies: CONCELIER-LNM-21-001. | Concelier WebService Guild, Data Integrity Guild (src/Concelier/StellaOps.Concelier.WebService/TASKS.md)
|
||||
CONCELIER-VULN-29-002 `Evidence retrieval API` | DONE (2025-11-07) | Provide `/vuln/evidence/advisories/{advisory_key}` returning raw advisory docs with provenance, filtering by tenant and source. Dependencies: CONCELIER-VULN-29-001, VULN-API-29-003. | Concelier WebService Guild (src/Concelier/StellaOps.Concelier.WebService/TASKS.md)
|
||||
|
||||
|
||||
[Ingestion & Evidence] 110.B) Concelier.V
|
||||
@@ -211,8 +211,8 @@ Depends on: Sprint 110.B - Concelier.VI
|
||||
Summary: Ingestion & Evidence focus on Concelier (phase VII).
|
||||
Task ID | State | Task description | Owners (Source)
|
||||
--- | --- | --- | ---
|
||||
MERGE-LNM-21-002 | DOING (2025-11-07) | Refactor or retire `AdvisoryMergeService` and related pipelines, ensuring callers transition to observation/linkset APIs; add compile-time analyzer preventing merge service usage.<br>2025-11-03: Began dependency audit and call-site inventory ahead of deprecation plan; cataloging service registrations/tests referencing merge APIs.<br>2025-11-05 14:42Z: Drafted `concelier:features:noMergeEnabled` gating, merge job allowlist handling, and deprecation/telemetry changes prior to analyzer rollout.<br>2025-11-06 16:10Z: Landed analyzer project (`CONCELIER0002`), wired into Concelier WebService/tests, and updated docs to direct suppressions through explicit migration notes.<br>2025-11-07 03:25Z: Default-on toggle + job gating break existing Concelier WebService tests; guard/migration adjustments pending before closing the task.<br>2025-11-07 07:05Z: Added ingest-path diagnostics (hash logging + test log dumping) to trace why HTTP binding loses `upstream.contentHash` with `noMergeEnabled=true`; need to adapt seeding/tests once the binding issue is fixed. | BE-Merge (src/Concelier/__Libraries/StellaOps.Concelier.Merge/TASKS.md)
|
||||
MERGE-LNM-21-003 Determinism/test updates | QA Guild, BE-Merge | Replace merge determinism suites with observation/linkset regression tests verifying no data mutation and conflicts remain visible. Dependencies: MERGE-LNM-21-002. | MERGE-LNM-21-002 (src/Concelier/__Libraries/StellaOps.Concelier.Merge/TASKS.md)
|
||||
MERGE-LNM-21-002 | DONE (2025-11-07) | Refactor or retire `AdvisoryMergeService` and related pipelines, ensuring callers transition to observation/linkset APIs; add compile-time analyzer preventing merge service usage.<br>2025-11-03: Began dependency audit and call-site inventory ahead of deprecation plan; cataloging service registrations/tests referencing merge APIs.<br>2025-11-05 14:42Z: Drafted `concelier:features:noMergeEnabled` gating, merge job allowlist handling, and deprecation/telemetry changes prior to analyzer rollout.<br>2025-11-06 16:10Z: Landed analyzer project (`CONCELIER0002`), wired into Concelier WebService/tests, and updated docs to direct suppressions through explicit migration notes.<br>2025-11-07 03:25Z: Default-on toggle + job gating surfaced ingestion test brittleness; guard/migration diagnostics capture requests missing `upstream.contentHash`.<br>2025-11-07 19:45Z: Set `ConcelierOptions.Features.NoMergeEnabled` default to `true`, added regression coverage (`Features_NoMergeEnabled_DefaultsToTrue`), and rechecked ingest helpers to carry canonical links. Remote .NET 10 CLI run remains queued for validation. | BE-Merge (src/Concelier/__Libraries/StellaOps.Concelier.Merge/TASKS.md)
|
||||
MERGE-LNM-21-003 Determinism/test updates | DOING (2025-11-07) | QA Guild, BE-Merge | Replace merge determinism suites with observation/linkset regression tests verifying no data mutation and conflicts remain visible. Dependencies: MERGE-LNM-21-002.<br>2025-11-07: Drafting test migration plan (`docs/dev/lnm-determinism-tests.md`) to map legacy merge fixtures onto observation/linkset pipelines; identifying coverage gaps (conflict surfacing, raw vs canonical parity, hash stability).<br>2025-11-07 20:05Z: Landed `AdvisoryObservationFactoryTests.Create_IsDeterministicAcrossRuns` to cover canonical JSON stability and pruned the old merge determinism integration test. | MERGE-LNM-21-002 (src/Concelier/__Libraries/StellaOps.Concelier.Merge/TASKS.md)
|
||||
WEB-AOC-19-001 (dependency) | DONE (2025-11-07) | Shared guard primitives now enforce the top-level allowlist (`_id`, tenant, source, upstream, content, identifiers, linkset, supersedes, created/ingested timestamps, attributes) and emit the reusable `AocError` payload consumed by HTTP/CLI tooling. Extend `AocGuardOptions.AllowedTopLevelFields` when staging new schema fields to avoid false-positive `ERR_AOC_007` violations. | BE-Base Platform Guild (docs/aoc/guard-library.md, src/Web/StellaOps.Web/TASKS.md)
|
||||
|
||||
|
||||
|
||||
@@ -156,6 +156,8 @@ SCANNER-ENG-0027 | TODO | Deliver Windows policy/offline integration per `design
|
||||
SCANNER-SURFACE-01 | DONE (2025-11-06) | Persist Surface.FS manifests after analyzer stages, including layer CAS metadata and EntryTrace fragments.<br>2025-11-02: Worker pipeline emitting draft Surface.FS manifests for sample scans; determinism checks running.<br>2025-11-06: Continuing with manifest writer abstraction + telemetry wiring for Surface.FS persistence.<br>2025-11-06 18:45Z: Resumed work; targeting manifest writer abstraction, CAS persistence hooks, and telemetry/test coverage updates.<br>2025-11-06 20:20Z: Published Surface worker Grafana dashboard + updated design doc; WebService pointer integration test now covers manifest/payload artefacts. | Scanner Worker Guild (src/Scanner/StellaOps.Scanner.Worker/TASKS.md)
|
||||
SCANNER-SURFACE-02 | DONE (2025-11-05) | Publish Surface.FS pointers (CAS URIs, manifests) via scan/report APIs and update attestation metadata. Dependencies: SCANNER-SURFACE-01.<br>2025-11-05: Surface pointer projection wired through WebService endpoints, orchestrator samples & DSSE fixtures refreshed with `surface` manifest block, and regression suite (platform events, report sample, ready check) updated. | Scanner WebService Guild (src/Scanner/StellaOps.Scanner.WebService/TASKS.md)
|
||||
SCANNER-SURFACE-03 | DONE (2025-11-07) | Push layer manifests and entry fragments into Surface.FS during build-time SBOM generation. Dependencies: SCANNER-SURFACE-02.<br>2025-11-06: Starting BuildX manifest upload implementation with Surface.FS client abstraction and integration tests.<br>2025-11-07 15:30Z: Resumed BuildX plugin Surface wiring; analyzing Surface.FS models, CAS flow, and upcoming tests before coding.<br>2025-11-07 22:10Z: Added Surface manifest writer + CLI flags to the BuildX plug-in, persisted artefacts into CAS, regenerated docs/fixtures, and shipped new tests covering the writer + descriptor flow. | BuildX Plugin Guild (src/Scanner/StellaOps.Scanner.Sbomer.BuildXPlugin/TASKS.md)
|
||||
SCHED-SURFACE-02 | TODO | Integrate Scheduler worker prefetch using Surface manifest reader and persist manifest pointers with rerun plans. Dependencies: SURFACE-FS-02, SCHED-SURFACE-01. Reference `docs/modules/scanner/design/surface-fs-consumers.md` §3 for implementation checklist. | Scheduler Worker Guild (src/Scheduler/__Libraries/StellaOps.Scheduler.Worker/TASKS.md)
|
||||
ZASTAVA-SURFACE-02 | TODO | Use Surface manifest reader helpers to resolve `cas://` pointers and enrich drift diagnostics with manifest provenance. Dependencies: SURFACE-FS-02, ZASTAVA-SURFACE-01. Reference `docs/modules/scanner/design/surface-fs-consumers.md` §4 for integration steps. | Zastava Observer Guild (src/Zastava/StellaOps.Zastava.Observer/TASKS.md)
|
||||
|
||||
[Scanner & Surface] 130.A) Scanner.VIII
|
||||
Depends on: Sprint 130.A - Scanner.VII
|
||||
|
||||
@@ -61,23 +61,28 @@
|
||||
"spec_version": "1.6",
|
||||
"raw": { /* unmodified upstream document */ }
|
||||
},
|
||||
"identifiers": {
|
||||
"cve": ["CVE-2025-12345"],
|
||||
"ghsa": ["GHSA-xxxx-...."],
|
||||
"aliases": ["CVE-2025-12345", "GHSA-xxxx-...."]
|
||||
},
|
||||
"linkset": {
|
||||
"purls": ["pkg:npm/lodash@4.17.21"],
|
||||
"cpes": ["cpe:2.3:a:lodash:lodash:4.17.21:*:*:*:*:*:*:*"],
|
||||
"references": [
|
||||
{"type":"advisory","url":"https://..."},
|
||||
{"type":"fix","url":"https://..."}
|
||||
],
|
||||
"reconciled_from": ["content.raw.affected.ranges", "content.raw.pkg"]
|
||||
},
|
||||
"supersedes": "advisory_raw:osv:GHSA-xxxx-....:v2",
|
||||
"tenant": "default"
|
||||
}
|
||||
"identifiers": {
|
||||
"primary": "GHSA-xxxx-....",
|
||||
"aliases": ["CVE-2025-12345", "GHSA-xxxx-...."]
|
||||
},
|
||||
"linkset": {
|
||||
"purls": ["pkg:npm/lodash@4.17.21"],
|
||||
"cpes": ["cpe:2.3:a:lodash:lodash:4.17.21:*:*:*:*:*:*:*"],
|
||||
"references": [
|
||||
{"type":"advisory","url":"https://..."},
|
||||
{"type":"fix","url":"https://..."}
|
||||
],
|
||||
"reconciled_from": ["content.raw.affected.ranges", "content.raw.pkg"]
|
||||
},
|
||||
"advisory_key": "CVE-2025-12345",
|
||||
"links": [
|
||||
{"scheme":"CVE","value":"CVE-2025-12345"},
|
||||
{"scheme":"GHSA","value":"GHSA-XXXX-...."},
|
||||
{"scheme":"PRIMARY","value":"CVE-2025-12345"}
|
||||
],
|
||||
"supersedes": "advisory_raw:osv:GHSA-xxxx-....:v2",
|
||||
"tenant": "default"
|
||||
}
|
||||
```
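To illustrate how `advisory_key` and `links[]` in the example above might be derived from the identifiers, here is a hedged sketch. The rules it uses (upper-case every value, prefer a CVE as the key, repeat the key under the `PRIMARY` scheme) are inferred from the sample document only; `AdvisoryKeySketch` and `AdvisoryLinkSketch` are hypothetical names and do not describe the actual `AdvisoryCanonicalizer`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical link record mirroring the {"scheme", "value"} shape in the sample document.
public sealed record AdvisoryLinkSketch(string Scheme, string Value);

public static class AdvisoryKeySketch
{
    // Assumed scheme detection by prefix; anything unrecognised is labelled OTHER.
    private static string SchemeOf(string value) =>
        value.StartsWith("CVE-", StringComparison.Ordinal) ? "CVE"
        : value.StartsWith("GHSA-", StringComparison.Ordinal) ? "GHSA"
        : "OTHER";

    public static (string AdvisoryKey, IReadOnlyList<AdvisoryLinkSketch> Links) Canonicalize(
        string primary, IEnumerable<string> aliases)
    {
        // Upper-case, de-duplicate, and sort so the output does not depend on input order.
        var values = aliases.Append(primary)
            .Select(v => v.Trim().ToUpperInvariant())
            .Where(v => v.Length > 0)
            .Distinct(StringComparer.Ordinal)
            .OrderBy(v => v, StringComparer.Ordinal)
            .ToList();

        // Assumption: a CVE wins as the key when present; otherwise fall back to the primary identifier.
        var key = values.FirstOrDefault(v => SchemeOf(v) == "CVE")
                  ?? primary.Trim().ToUpperInvariant();

        var links = values
            .Select(v => new AdvisoryLinkSketch(SchemeOf(v), v))
            .Append(new AdvisoryLinkSketch("PRIMARY", key))
            .ToList();

        return (key, links);
    }
}
```

With a GHSA primary and a CVE alias, the sketch yields a CVE key followed by `CVE`, `GHSA`, and `PRIMARY` links, matching the ordering shown in the sample.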
|
||||
|
||||
### 1.2 Connector lifecycle
|
||||
@@ -110,7 +115,7 @@ Running the same export job twice against the same snapshot must yield byte-iden
|
||||
* **Linkset builder** that correlates observations into `advisory_linksets` and annotates conflicts.
|
||||
* **Event publisher** emitting `advisory.observation.updated` and `advisory.linkset.updated` messages.
|
||||
* **Exporters** (JSON, Trivy DB, Offline Kit slices) fed from observation/linkset stores.
|
||||
* **Minimal REST** for health/status/trigger/export and observation/linkset reads.
|
||||
* **Minimal REST** for health/status/trigger/export, raw observation reads, and evidence retrieval (`GET /vuln/evidence/advisories/{advisory_key}`).
|
||||
|
||||
**Scale:** HA by running N replicas; **locks** prevent overlapping jobs per source/exporter.
|
||||
|
||||
|
||||
51
docs/modules/scanner/design/surface-fs-consumers.md
Normal file
@@ -0,0 +1,51 @@
|
||||
# Surface.FS Consumer Integration Guide (Scheduler & Zastava)
|
||||
|
||||
> **Updated:** 2025-11-07
|
||||
> **Audience:** Scheduler Worker Guild • Zastava Observer Guild • Surface FS Guild
|
||||
> **Depends on:** SURFACE-FS-02 (`FileSurfaceManifestStore`), Surface.Env/Surface.Secrets libraries.
|
||||
|
||||
This note captures the minimum wiring required for downstream services now that `FileSurfaceManifestStore` and the manifest reader/writer abstractions have landed.
|
||||
|
||||
## 1. Shared prerequisites
|
||||
|
||||
- Reference `StellaOps.Scanner.Surface.FS` (net10.0) and call:
|
||||
```csharp
|
||||
services
|
||||
.AddSurfaceFileCache()
|
||||
.AddSurfaceManifestStore();
|
||||
```
|
||||
This binds `Surface:Cache` and `Surface:Manifest` (or `SCANNER_SURFACE_*` overrides).
|
||||
- Pull runtime settings via `ISurfaceEnvironment` to ensure tenants/endpoints line up with Scanner.
|
||||
- Cache root (`Surface:Cache:Root`) must be writable; manifests fall back to `<Root>/manifests` unless explicitly overridden with `Surface:Manifest:RootDirectory`.
|
||||
|
||||
## 2. Manifest reader usage
|
||||
|
||||
```csharp
|
||||
var reader = serviceProvider.GetRequiredService<ISurfaceManifestReader>();
|
||||
var manifest = await reader.TryGetByUriAsync(surfaceUri, cancellationToken);
|
||||
```
|
||||
|
||||
- Accept `cas://{bucket}/{prefix}/{tenant}/{hh}/{tt}/{digest}.json` pointers.
|
||||
- On a cache miss the reader returns `null`; callers should fall back to existing recompute paths.
|
||||
- All timestamps are stored in canonical UTC, and metadata dictionaries are alphabetically sorted to keep digests deterministic.
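The null-on-miss contract suggests a small fallback wrapper. The sketch below is an assumption-laden illustration: `ManifestFallbackSketch` is a hypothetical helper, and the `lookup` delegate merely stands in for a call such as `reader.TryGetByUriAsync(surfaceUri, cancellationToken)` so the example does not depend on the real interface.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ManifestFallbackSketch
{
    // Returns the looked-up value when present; otherwise runs the caller's
    // existing recompute path. The flag tells the caller which branch was taken.
    public static async Task<(T Value, bool FromManifest)> GetOrRecomputeAsync<T>(
        Func<CancellationToken, Task<T?>> lookup,
        Func<CancellationToken, Task<T>> recompute,
        CancellationToken cancellationToken = default) where T : class
    {
        var cached = await lookup(cancellationToken).ConfigureAwait(false);
        if (cached is not null)
        {
            return (cached, true);
        }

        var fresh = await recompute(cancellationToken).ConfigureAwait(false);
        return (fresh, false);
    }
}
```

Returning the `FromManifest` flag keeps hit/miss accounting at the call site, where the relevant telemetry counters usually live.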
|
||||
|
||||
## 3. Scheduler worker checklist (`SCHED-SURFACE-02`)
|
||||
|
||||
1. Prefetch manifests during planning so reruns can skip redundant layers.
|
||||
2. Persist `{manifestUri, manifestDigest}` alongside run plans for traceability.
|
||||
3. Emit telemetry counters: `scheduler_surface_manifest_prefetch_total{result=hit|miss}`.
|
||||
4. Update `docs/SCHED-WORKER-16-201-PLANNER.md` with the new prefetch flow.
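The telemetry counter from step 3 could be emitted with `System.Diagnostics.Metrics`, as in this minimal sketch. `PrefetchMetricsSketch` is a hypothetical wrapper, and how the Scheduler worker actually registers its `Meter` is not specified here.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public sealed class PrefetchMetricsSketch
{
    private readonly Counter<long> _prefetchTotal;

    public PrefetchMetricsSketch(Meter meter)
    {
        // Counter name taken from the checklist; the meter itself is supplied by the host.
        _prefetchTotal = meter.CreateCounter<long>("scheduler_surface_manifest_prefetch_total");
    }

    // Records one prefetch attempt, tagged result=hit or result=miss.
    public void Record(bool hit) =>
        _prefetchTotal.Add(1, new KeyValuePair<string, object?>("result", hit ? "hit" : "miss"));
}
```

Keeping the tag values to the two literals `hit` and `miss` avoids unbounded label cardinality in downstream dashboards.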
|
||||
|
||||
## 4. Zastava observer checklist (`ZASTAVA-SURFACE-02`)
|
||||
|
||||
1. Resolve manifest pointer from runtime drift events (`entrytrace.graph`, `layer.fragments` kinds).
|
||||
2. Enrich drift diagnostics with `manifestDigest` and `Artifacts[n].metadata`.
|
||||
3. Add failure metric `zastava_surface_manifest_failures_total{reason=not_found|fetch_error}`.
|
||||
4. Expand observer runbook (`docs/modules/zastava/operations/drift.md`) with Surface manifest troubleshooting.
|
||||
|
||||
## 5. Testing guidance
|
||||
|
||||
- Unit-test manifest prefetch/adoption with local `FileSurfaceManifestStore`; use temp directories for isolation.
|
||||
- For integration environments, smoke-test by pointing to the same `Surface:Manifest:RootDirectory` used by Scanner Worker and verifying pointer fetch before scan jobs execute.
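For the temp-directory isolation mentioned above, a disposable fixture along these lines is one option. `TempDirectorySketch` is a hypothetical helper, not part of `StellaOps.Scanner.Surface.FS`; a test would point the manifest root directory at `Root` and dispose the fixture afterwards.

```csharp
using System;
using System.IO;

public sealed class TempDirectorySketch : IDisposable
{
    // Unique per-test directory under the OS temp path.
    public string Root { get; }

    public TempDirectorySketch()
    {
        Root = Path.Combine(Path.GetTempPath(), "surface-tests-" + Guid.NewGuid().ToString("N"));
        Directory.CreateDirectory(Root);
    }

    // Removes the directory and everything written beneath it.
    public void Dispose()
    {
        if (Directory.Exists(Root))
        {
            Directory.Delete(Root, recursive: true);
        }
    }
}
```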
|
||||
|
||||
Coordinate status updates in the relevant `TASKS.md` entries and `docs/implplan/SPRINT_130_scanner_surface.md` once each guild completes its part. If you discover additional shared requirements, extend this guide so future consumers (CLI, Orchestrator) can reuse the flow.
|
||||
@@ -1,8 +1,9 @@
|
||||
# Surface.FS Design (Epic: SURFACE-SHARING)
|
||||
|
||||
> **Status:** Draft v1.0 — aligns with tasks `SURFACE-FS-01..06`, `SCANNER-SURFACE-01..05`, `ZASTAVA-SURFACE-01..02`, `SCHED-SURFACE-01`, `OPS-SECRETS-01..02`.
|
||||
> **Status:** Draft v1.1 — aligns with tasks `SURFACE-FS-01..06`, `SCANNER-SURFACE-01..05`, `ZASTAVA-SURFACE-01..02`, `SCHED-SURFACE-01`, `OPS-SECRETS-01..02`.
|
||||
>
|
||||
> **Audience:** Scanner Worker/WebService, Zastava, Scheduler, DevOps.
|
||||
> **Component map:** See [Scanner architecture — §1 System landscape](../architecture.md#1-system-landscape) for end-to-end placement.
|
||||
|
||||
## 1. Purpose
|
||||
|
||||
@@ -26,45 +27,61 @@ Manifests describe the artefact metadata and storage pointers. They are stored i
|
||||
{
|
||||
"schema": "stellaops.surface.manifest@1",
|
||||
"tenant": "acme",
|
||||
"kind": "layer-entry-trace",
|
||||
"digest": "sha256:ab12...",
|
||||
"createdAt": "2025-10-29T12:00:00Z",
|
||||
"expiresAt": "2025-11-05T12:00:00Z",
|
||||
"imageDigest": "sha256:cafe...",
|
||||
"scanId": "scan-1234",
|
||||
"generatedAt": "2025-10-29T12:00:00Z",
|
||||
"source": {
|
||||
"scannerBuild": "stellaops/scanner@sha256:deadbeef",
|
||||
"imageDigest": "sha256:cafe...",
|
||||
"scanId": "scan-1234"
|
||||
"component": "scanner.worker",
|
||||
"version": "2025.10.0",
|
||||
"workerInstance": "scanner-worker-1",
|
||||
"attempt": 1
|
||||
},
|
||||
"storage": {
|
||||
"bucket": "surface-cache",
|
||||
"objectKey": "tenants/acme/layer-entry-trace/sha256/ab/12/.../payload.json.zst",
|
||||
"sizeBytes": 524288,
|
||||
"contentType": "application/json+zstd"
|
||||
},
|
||||
"integrity": {
|
||||
"hash": "sha256:ab12...",
|
||||
"signature": null
|
||||
}
|
||||
"artifacts": [
|
||||
{
|
||||
"kind": "entrytrace.graph",
|
||||
"uri": "cas://surface-cache/manifests/acme/ab/cd/abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789.json",
|
||||
"digest": "sha256:abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789",
|
||||
"mediaType": "application/vnd.stellaops.entrytrace+json",
|
||||
"format": "json",
|
||||
"sizeBytes": 524288,
|
||||
"view": "runtime",
|
||||
"storage": {
|
||||
"bucket": "surface-cache",
|
||||
"objectKey": "payloads/acme/entrytrace/sha256/ab/cd/abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789.ndjson.zst",
|
||||
"sizeBytes": 524288,
|
||||
"contentType": "application/x-ndjson+zstd"
|
||||
},
|
||||
"metadata": {
|
||||
"entrypoint": "/usr/bin/java",
|
||||
"surfaceVersion": "1"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
Manifest URIs follow the deterministic pattern:
|
||||
|
||||
```
|
||||
cas://{bucket}/{prefix}/{tenant}/{digest[0..1]}/{digest[2..3]}/{digest}.json
|
||||
```
|
||||
|
||||
The hex portion of the manifest digest is split into two directory levels to avoid hot directories. The same layout is mirrored on disk by the default `FileSurfaceManifestStore`, which keeps offline bundle sync trivial (copy the `manifests/` tree verbatim).
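As a rough sketch of the pattern above, the following helper derives a manifest URI from a digest. `ManifestUriSketch` is a hypothetical name and the validation rules are assumptions; the shipped `SurfaceManifestPathBuilder` may differ in details such as casing or error handling.

```csharp
using System;

public static class ManifestUriSketch
{
    // Builds cas://{bucket}/{prefix}/{tenant}/{hex[0..1]}/{hex[2..3]}/{hex}.json
    // from a digest given either as "sha256:<hex>" or as bare hex.
    public static string Build(string bucket, string prefix, string tenant, string digest)
    {
        const string algorithmPrefix = "sha256:";
        var hex = digest.StartsWith(algorithmPrefix, StringComparison.OrdinalIgnoreCase)
            ? digest[algorithmPrefix.Length..]
            : digest;
        hex = hex.ToLowerInvariant();

        if (hex.Length != 64)
        {
            throw new ArgumentException("Expected a 64-character SHA-256 hex digest.", nameof(digest));
        }

        foreach (var c in hex)
        {
            if (!Uri.IsHexDigit(c))
            {
                throw new ArgumentException("Digest contains non-hex characters.", nameof(digest));
            }
        }

        // Two directory levels from the first four hex characters spread files across directories.
        return $"cas://{bucket}/{prefix}/{tenant}/{hex[..2]}/{hex[2..4]}/{hex}.json";
    }
}
```

Called with the bucket `surface-cache`, prefix `manifests`, tenant `acme`, and the digest from the sample manifest, this reproduces the `uri` value shown in the JSON example.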
|
||||
|
||||
### 2.3 Payload Storage
|
||||
|
||||
Large payloads (SBOM fragments, entry traces, runtime events) live in the same object store as manifests (RustFS/S3). Manifests record relative paths so offline bundles can copy both manifest and payload without modification.
|
||||
|
||||
## 3. APIs
|
||||
|
||||
Surface.FS exposes a gRPC/HTTP API consumed by .NET clients:
|
||||
Surface.FS exposes .NET-first abstractions that hosts consume via DI:
|
||||
|
||||
| Method | Description |
|
||||
|--------|-------------|
|
||||
| `PutManifest(PutManifestRequest)` | Stores manifest + optional payload. Idempotent via `digest`. |
|
||||
| `GetManifest(GetManifestRequest)` | Returns manifest metadata; 404 if missing. |
|
||||
| `GetPayload(GetPayloadRequest)` | Streams payload bytes (optionally decompressing). |
|
||||
| `ListManifests(ListManifestRequest)` | Enumerates manifests for tenant/kind with pagination. |
|
||||
| `DeleteManifest(DeleteManifestRequest)` | (Optional) Removes manifest/payload based on retention policies. |
|
||||
- `ISurfaceManifestWriter.PublishAsync(document)` – normalises artefact lists, computes the canonical SHA-256 digest, persists the manifest via the configured store, and returns a `SurfaceManifestPublishResult` containing the digest, canonical URI, and the normalised document.
|
||||
- `ISurfaceManifestReader.TryGetByUriAsync(uri)` – resolves a manifest pointer (e.g. `cas://surface-cache/manifests/...`) back into a `SurfaceManifestDocument`.
|
||||
- `ISurfaceManifestReader.TryGetByDigestAsync(digest)` – looks up a manifest by digest, scanning tenant prefixes when necessary (used by Offline Kit importers).
|
||||
- `ISurfaceCache` (`GetOrCreateAsync`, `TryGetAsync`, `SetAsync`) – lightweight content-addressable cache for hot artefacts (layer fragments, entry trace outputs) hosted on local disk.
|
||||
|
||||
.NET client wraps these calls and handles retries using Polly policies.
|
||||
All components honour configuration bound from `Surface:Cache` and `Surface:Manifest` (or environment mirrors like `SCANNER_SURFACE_CACHE_ROOT`). `SurfaceManifestStoreOptions` controls the URI scheme/bucket/prefix and allows overriding the manifest directory while still defaulting to `<cacheRoot>/manifests`.
|
||||
|
||||
### WebService integration (2025-11-05)
|
||||
|
||||
@@ -78,16 +95,16 @@ Surface.FS exposes a gRPC/HTTP API consumed by .NET clients:
|
||||
|
||||
Surface.FS library for .NET hosts provides:
|
||||
|
||||
- `ISurfaceManifestWriter` / `ISurfaceManifestReader` interfaces.
|
||||
- Content-addressed path builder (`SurfacePathBuilder`).
|
||||
- Tenant namespace isolation and bucket configuration (via Surface.Env).
|
||||
- Local cache abstraction `ISurfaceCache` with default `FileSurfaceCache` implementation (uses `Surface:Cache:Root` / `SCANNER_SURFACE_CACHE_ROOT`, enforces quotas, serialises writes with per-key semaphores).
|
||||
- `ISurfaceManifestWriter` / `ISurfaceManifestReader` with the default `FileSurfaceManifestStore` implementation (single-writer semaphore, digest reuse, optional overwrite warning).
|
||||
- Deterministic pointer builder (`SurfaceManifestPathBuilder`) and options (`SurfaceManifestStoreOptions`, `SurfaceCacheOptions`) that align with `Surface.Env` configuration.
|
||||
- Local cache abstraction `ISurfaceCache` with default `FileSurfaceCache` implementation (uses `Surface:Cache:Root` / `SCANNER_SURFACE_CACHE_ROOT`, enforces per-key semaphores, stores bytes verbatim).
|
||||
- `SurfaceCacheKey` helper that normalises cache entries as `{namespace}/{tenant}/{sha256}`. EntryTrace graphs use the `entrytrace.graph` namespace so Worker/WebService/CLI can share cached results deterministically.
|
||||
- Metrics: `surface_manifest_put_seconds`, `surface_manifest_cache_hit_total`, etc.
|
||||
- JSON serialiser (`SurfaceCacheJsonSerializer`) that applies camelCase naming, ignores nulls, and uses a stable encoder for reproducible hashing.
|
||||
- Metrics: `surface_manifest_published_total`, `surface_manifest_cache_hit_total`, plus host-specific counters wired via Scanner Worker instrumentation.
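The `{namespace}/{tenant}/{sha256}` key shape can be illustrated with a short sketch. It assumes the hash is a SHA-256 over the content bytes and that namespace and tenant are trimmed and lower-cased; both are guesses for the sake of the example, and `CacheKeySketch` is not the real `SurfaceCacheKey`.

```csharp
using System;
using System.Security.Cryptography;

public static class CacheKeySketch
{
    // Produces "{namespace}/{tenant}/{sha256}" with a lower-case hex digest of the content.
    public static string Create(string ns, string tenant, ReadOnlySpan<byte> content)
    {
        var hash = Convert.ToHexString(SHA256.HashData(content)).ToLowerInvariant();
        return $"{ns.Trim().ToLowerInvariant()}/{tenant.Trim().ToLowerInvariant()}/{hash}";
    }
}
```

Because the key depends only on its inputs, Worker, WebService, and CLI processes that hash the same bytes under the same namespace and tenant arrive at the same cache entry.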
|
||||
|
||||
## 5. Retention & Eviction
|
||||
|
||||
- Manifests include optional `expiresAt`; Worker defaults to 30 days for SBOM fragments, 7 days for entry traces.
|
||||
- Manifests capture `generatedAt`; retention windows (30 days for SBOM fragments, 7 days for entry traces) are enforced by job configuration and object-store lifecycle policies. An `expiresAt` field is reserved for future use when automated eviction is introduced.
|
||||
- Background job `SurfaceCacheMaintenanceService` evicts local cache entries exceeding quota, oldest-first.
|
||||
- Object storage retention policies are managed by DevOps; library exposes metrics but does not auto-delete unless instructed.
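The oldest-first quota eviction described for the local cache can be sketched as a pure selection function. `CacheEntrySketch` and `EvictionSketch` are hypothetical, and using a last-access timestamp as the age criterion is an assumption; `SurfaceCacheMaintenanceService` may order entries by a different timestamp.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical description of a cached file; not a type from Surface.FS.
public sealed record CacheEntrySketch(string Key, long SizeBytes, DateTimeOffset LastAccessUtc);

public static class EvictionSketch
{
    // Returns the entries to delete, oldest first, until the remaining total fits the quota.
    public static IReadOnlyList<CacheEntrySketch> SelectForEviction(
        IEnumerable<CacheEntrySketch> entries, long quotaBytes)
    {
        var ordered = entries
            .OrderBy(e => e.LastAccessUtc)
            .ThenBy(e => e.Key, StringComparer.Ordinal) // tie-break keeps the result deterministic
            .ToList();

        long total = ordered.Sum(e => e.SizeBytes);
        var evict = new List<CacheEntrySketch>();

        foreach (var entry in ordered)
        {
            if (total <= quotaBytes)
            {
                break;
            }

            evict.Add(entry);
            total -= entry.SizeBytes;
        }

        return evict;
    }
}
```

Separating selection from deletion keeps the policy unit-testable without touching the file system.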
|
||||
|
||||
@@ -98,13 +115,13 @@ Offline kits include:
|
||||
```
|
||||
offline/surface/
|
||||
manifests/
|
||||
tenants/<tenant>/<kind>/<digest>.json
|
||||
<tenant>/<digest[0..1]>/<digest[2..3]>/<digest>.json
|
||||
payloads/
|
||||
tenants/<tenant>/<kind>/<digest>.json.zst
|
||||
<tenant>/<kind>/<digest[0..1]>/<digest[2..3]>/<digest>.json.zst
|
||||
manifest-index.json
|
||||
```
|
||||
|
||||
Import script calls `PutManifest` for each manifest, verifying digests. This enables Zastava and Scheduler running offline to consume cached data without re-scanning.
|
||||
The import script calls `ISurfaceManifestWriter.PublishAsync` for each manifest after verifying the embedded digest, which keeps Offline Kit replays identical to online flows. Zastava and Scheduler can then consume cached data offline without re-scanning.
|
||||
|
||||
### 6.1 EntryTrace Cache Usage
|
||||
|
||||
|
||||