docs consolidation
@@ -1,22 +0,0 @@
# Concelier AirGap Prep — PREP-CONCELIER-AIRGAP-56-001-58-001

Status: **Ready for implementation** (2025-11-20)
Owners: Concelier Core · AirGap Guilds
Scope: Chain mirror thin-bundle milestone with EvidenceLocker bundle references and console consumption to unblock air-gapped Concelier workflows (56-001..58-001).

## Inputs
- Mirror milestone-0 thin bundle: `out/mirror/thin/mirror-thin-m0-sample.tar.gz` (hash documented in PREP-ART-56-001).
- Evidence bundle v1 contract: `docs/modules/evidence-locker/evidence-bundle-v1.md`.
- Console fixtures (29-001, 30-001) and LNM schema freeze.

## Deliverables
- Publish mapping note `docs/modules/concelier/prep/airgap-56-001-58-001-mapping.md` covering:
  - Bundle locations/hashes (thin + evidence).
  - Import commands for Concelier offline controller.
  - Deterministic ordering and retention expectations.
- Provide SHA256 for any new composed bundles and place them under `out/concelier/airgap/`.

## Acceptance criteria
- Mapping note published with hashes and import commands.
- No unresolved schema decisions remain for air-gap import chain.
@@ -1,17 +0,0 @@
# Concelier Attestation Prep — PREP-CONCELIER-ATTEST-73-001-002

Status: **Ready for implementation** (2025-11-20)
Owners: Concelier Core · Evidence Locker Guild
Scope: Evidence Locker attestation scope integration for Concelier attest tasks 73-001/002.

## Requirements
- Use Evidence Locker attestation scope note: `docs/modules/evidence-locker/attestation-scope-note.md`.
- Bind Evidence Bundle v1 contract: `docs/modules/evidence-locker/evidence-bundle-v1.md`.

## Deliverables
- Concelier-specific attestation ingest note at `docs/modules/concelier/prep/attest-73-001-ingest.md` describing required claims, DSSE expectations, and lookup flow.
- Hashes for sample attest bundles reused from Evidence Locker sample; no new artefacts needed.

## Acceptance criteria
- Ingest note published with claim set and DSSE requirements; Concelier tasks can proceed without further schema questions.
@@ -1,17 +0,0 @@
# Concelier Console Prep — PREP-CONCELIER-CONSOLE-23-001-003

Status: **Ready for implementation** (2025-11-20)
Owners: Concelier Console Guild
Scope: Console schema samples and evidence bundle references for console consumption of linkset/VEX data (23-001..003).

## Deliverables
- JSON samples placed under `docs/samples/console/`:
  - `console-linkset-search.json` (frozen LNM schema, includes pagination + filters).
  - `console-vex-search.json` (VEX linkset search with exploitability flags).
- `.sha256` hash files for each sample.
- README snippet added to `docs/samples/console/README.md` describing schema version, seed (`2025-01-01T00:00:00Z`), and deterministic ordering.

## Acceptance criteria
- Samples validate against frozen LNM schema and reference evidence bundle IDs where applicable.
- Hashes recorded; no external dependencies.
@@ -1,20 +0,0 @@
# Concelier Feed Prep — PREP-FEEDCONN-ICSCISA-02-012-KISA-02-008-FEED

Status: **Ready for implementation** (2025-11-20)
Owners: Concelier Feed Owners
Scope: Remediation plan and schema notes for ICSCISA/KISA feeds to unblock connector work.

## Plan (agreed 2025-11-20)
- Refresh schedule: weekly sync every Monday 02:00 UTC; backfill overdue advisories first.
- Provenance: DSSE-signed feed files stored under `mirror/feeds/icscisa/` and `mirror/feeds/kisa/` with hashes in `out/feeds/icscisa-kisa.sha256`.
- Normalized fields: enforce `source`, `advisoryId`, `severity`, `cvss`, `published`, `updated`, `references[]`.
- Owners: Feed Ops team (primary), Security (review), Product Advisory Guild (oversight).
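
The normalized-field rule in the plan can be checked mechanically. The following is a minimal Python sketch, not the connector's actual validation: the function name `missing_fields` and the sample advisory values are hypothetical, and only the field names come from the plan above.

```python
# Field names are taken from the plan; everything else is illustrative.
REQUIRED_FIELDS = ("source", "advisoryId", "severity", "cvss",
                   "published", "updated", "references")


def missing_fields(advisory: dict) -> list:
    """Return the required normalized fields that are absent or malformed."""
    missing = [name for name in REQUIRED_FIELDS if name not in advisory]
    # `references[]` is documented as an array, so a non-list value is rejected.
    if "references" in advisory and not isinstance(advisory["references"], list):
        missing.append("references")
    return missing


# Hypothetical advisory used only to exercise the check.
sample = {
    "source": "kisa",
    "advisoryId": "EXAMPLE-0001",
    "severity": "high",
    "cvss": 7.5,
    "published": "2025-11-17T00:00:00Z",
    "updated": "2025-11-18T00:00:00Z",
    "references": [],
}
```

A connector could run such a check before writing normalized output, so that a schema gap fails the sync instead of surfacing downstream.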

## Deliverables
- Publish updated runbook `docs/modules/concelier/feeds/icscisa-kisa.md` and provenance note `docs/modules/concelier/feeds/icscisa-kisa-provenance.md` (already exist; confirm hashes and schedule lines).
- Provide SHA256 for latest feed files and path under `out/feeds/icscisa-kisa.sha256`.

## Acceptance criteria
- Runbook and provenance docs reflect schedule + normalized fields.
- Hash file published for latest feed drop; connector work unblocked.
@@ -1,72 +0,0 @@
# Concelier · Orchestrator Registry & Control Prep

- **Date:** 2025-11-20
- **Scope:** PREP-CONCELIER-ORCH-32-001, PREP-CONCELIER-ORCH-32-002, PREP-CONCELIER-ORCH-33-001, PREP-CONCELIER-ORCH-34-001
- **Working directory:** `src/Concelier/**` (WebService, Core, Storage.Mongo, worker SDK touch points)

## Goals
- Publish a deterministic registry/SDK contract so connectors can be scheduled by Orchestrator without bespoke control planes.
- Define heartbeats/progress envelopes and pause/throttle/backfill semantics ahead of worker wiring.
- Describe replay/backfill evidence outputs so ledger/export work can rely on stable hashes.

## Registry record (authoritative fields)
All registry documents live under the orchestrator collection keyed by `connectorId` (stable slug). Fields and invariants:
- `connectorId` (string, slug, lowercase) — unique per tenant + source; immutable.
- `tenant` (string) — required; enforced by WebService tenant guard.
- `source` (enum) — advisory provider (`nvd`, `ghsa`, `osv`, `icscisa`, `kisa`, `vendor:<slug>`).
- `capabilities` (array) — `observations`, `linksets`, `timeline`, `attestations` flags; no merge/derived data.
- `authRef` (string) — reference to secrets store key; never inlined.
- `schedule` (object) — `cron`, `timeZone`, `maxParallelRuns`, `maxLagMinutes`.
- `ratePolicy` (object) — `rpm`, `burst`, `cooldownSeconds`; default deny if absent.
- `artifactKinds` (array) — `raw-advisory`, `normalized`, `linkset`, `timeline`, `attestation`.
- `lockKey` (string) — deterministic lock namespace (`concelier:{tenant}:{connectorId}`) for single-flight.
- `egressGuard` (object) — `allowlist` of hosts + `airgapMode` boolean; fail closed when `airgapMode=true` and host not allowlisted.
- `createdAt` / `updatedAt` (ISO-8601 UTC) — monotonic; updates require optimistic concurrency token.

### Registry sample (non-normative)
```json
{
  "connectorId": "icscisa",
  "tenant": "acme",
  "source": "icscisa",
  "capabilities": ["observations", "linksets", "timeline"],
  "authRef": "secret:concelier/icscisa/api-key",
  "schedule": {"cron": "*/30 * * * *", "timeZone": "UTC", "maxParallelRuns": 1, "maxLagMinutes": 120},
  "ratePolicy": {"rpm": 60, "burst": 10, "cooldownSeconds": 30},
  "artifactKinds": ["raw-advisory", "normalized", "linkset"],
  "lockKey": "concelier:acme:icscisa",
  "egressGuard": {"allowlist": ["icscert.kisa.or.kr"], "airgapMode": true},
  "createdAt": "2025-11-20T00:00:00Z",
  "updatedAt": "2025-11-20T00:00:00Z"
}
```
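
Several of the registry invariants (slug-shaped `connectorId`, derived `lockKey`, fail-closed egress) lend themselves to a quick check. The Python sketch below is an illustration under assumptions, not the WebService implementation: the slug regex, the helper names, and the choice to treat a missing `egressGuard` as air-gapped are all hypothetical.

```python
import re

# Assumed slug shape: lowercase alphanumerics separated by single hyphens.
SLUG = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def expected_lock_key(tenant: str, connector_id: str) -> str:
    """Derive the lock namespace documented as concelier:{tenant}:{connectorId}."""
    return f"concelier:{tenant}:{connector_id}"


def validate_record(record: dict) -> list:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    if not SLUG.fullmatch(record.get("connectorId", "")):
        problems.append("connectorId must be a lowercase slug")
    if not record.get("tenant"):
        problems.append("tenant is required")
    if record.get("lockKey") != expected_lock_key(
        record.get("tenant", ""), record.get("connectorId", "")
    ):
        problems.append("lockKey must equal concelier:{tenant}:{connectorId}")
    return problems


def egress_allowed(record: dict, host: str) -> bool:
    """Allow allowlisted hosts; fail closed for other hosts in air-gap mode."""
    guard = record.get("egressGuard") or {}
    if host in guard.get("allowlist", []):
        return True
    # Assumption: a missing flag is treated as air-gapped, so the default is deny.
    return not guard.get("airgapMode", True)


record = {
    "connectorId": "icscisa",
    "tenant": "acme",
    "lockKey": "concelier:acme:icscisa",
    "egressGuard": {"allowlist": ["icscert.kisa.or.kr"], "airgapMode": True},
}
```

With the sample values, `validate_record(record)` reports no problems and `egress_allowed(record, "example.org")` is denied. How a non-allowlisted host behaves when `airgapMode` is false is not stated in the contract; the sketch permits it, which is a choice the implementation task would need to confirm.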

## Control/SDK contract (heartbeats + commands)
- Heartbeat endpoint `POST /internal/orch/heartbeat` (auth: internal orchestrator role, tenant-scoped).
  - Body: `connectorId`, `runId` (GUID), `status` (`starting|running|paused|throttled|backfill|failed|succeeded`), `progress` (0–100), `queueDepth`, `lastArtifactHash`, `lastArtifactKind`, `errorCode`, `retryAfterSeconds`.
  - Idempotency key: `runId` + `sequence` to preserve ordering; orchestrator ignores stale sequence.
- Control queue document (persisted per run):
  - Commands: `pause`, `resume`, `throttle` (rpm/burst override until `expiresAt`), `backfill` (range: `fromCursor`/`toCursor`).
  - Workers poll `/internal/orch/commands?connectorId={id}&runId={runId}`; must ack with monotonic `ackSequence` to ensure replay safety.
- Failure semantics: on `failed`, worker emits `errorCode`, `errorReason`, `lastCheckpoint` (cursor/hash). Orchestrator may re-enqueue with backoff.
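
The heartbeat idempotency rule (`runId` + `sequence`, stale sequences ignored) can be sketched as a small in-memory tracker. This is a Python illustration only: the class name `HeartbeatTracker` is hypothetical, and a real orchestrator would persist the last accepted sequence instead of holding it in a dictionary.

```python
class HeartbeatTracker:
    """Accept a heartbeat only if its sequence advances for the given run."""

    def __init__(self):
        self._last_sequence = {}  # runId -> highest sequence accepted so far

    def accept(self, run_id: str, sequence: int) -> bool:
        last = self._last_sequence.get(run_id)
        if last is not None and sequence <= last:
            return False  # stale or duplicate heartbeat: ignore it
        self._last_sequence[run_id] = sequence
        return True


tracker = HeartbeatTracker()
results = [
    tracker.accept("run-a", 1),  # first heartbeat for the run
    tracker.accept("run-a", 2),  # advances
    tracker.accept("run-a", 2),  # duplicate delivery
    tracker.accept("run-a", 1),  # late, out-of-order delivery
    tracker.accept("run-b", 1),  # sequences are tracked per run
]
```

The same comparison would apply to the worker-side `ackSequence`, since both counters are described as monotonic.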

## Backfill/replay expectations
- Backfill command requires deterministic cursor space (e.g., advisory sequence number or RFC3339 timestamp truncated to minutes).
- Worker must emit a `runManifest` per backfill containing: `runId`, `connectorId`, `tenant`, `cursorRange`, `artifactHashes[]`, `dsseEnvelopeHash` (if attested), `completedAt`.
- Manifests are written to Evidence Locker ledger for replay; filenames: `backfill/{tenant}/{connectorId}/{runId}.ndjson` with stable ordering.
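
One way to get byte-stable manifests is to canonicalize the JSON before writing it. The Python sketch below assumes "stable ordering" means sorted object keys and sorted `artifactHashes`; that interpretation, the helper names, and the shape of `cursorRange` are assumptions, while the path template comes from the text above.

```python
import json


def manifest_path(tenant: str, connector_id: str, run_id: str) -> str:
    """Build the ledger filename documented for backfill manifests."""
    return f"backfill/{tenant}/{connector_id}/{run_id}.ndjson"


def manifest_line(manifest: dict) -> str:
    """Serialize one manifest as a single NDJSON line with a stable byte layout."""
    stable = dict(manifest, artifactHashes=sorted(manifest["artifactHashes"]))
    # Sorted keys and fixed separators keep the output independent of dict order.
    return json.dumps(stable, sort_keys=True, separators=(",", ":"))


first = {
    "runId": "r1",
    "connectorId": "icscisa",
    "tenant": "acme",
    "cursorRange": {"from": "1", "to": "9"},  # hypothetical shape
    "artifactHashes": ["sha256:bbb", "sha256:aaa"],
    "completedAt": "2025-11-20T00:00:00Z",
}
# Same content, different key and hash order.
second = {
    "completedAt": "2025-11-20T00:00:00Z",
    "artifactHashes": ["sha256:aaa", "sha256:bbb"],
    "cursorRange": {"to": "9", "from": "1"},
    "tenant": "acme",
    "connectorId": "icscisa",
    "runId": "r1",
}
```

Because both inputs serialize to the same line, a hash of the manifest file stays stable across replays that produce the same artifacts.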

## Telemetry (to implement in WebService + worker SDK)
- Meter name prefix: `StellaOps.Concelier.Orch`.
- Counters:
  - `concelier.orch.heartbeat` tags: `tenant`, `connectorId`, `status`.
  - `concelier.orch.command.applied` tags: `tenant`, `connectorId`, `command`.
- Histograms:
  - `concelier.orch.lag.minutes` (now - cursor upper bound) tags: `tenant`, `connectorId`.
- Logs: structured with `tenant`, `connectorId`, `runId`, `command`, `sequence`, `ackSequence`.

## Acceptance criteria for prep completion
- Registry/command schema above is frozen and referenced from Sprint 0114 Delivery Tracker (P10–P13) so downstream implementation knows shapes.
- Sample manifest path + naming are defined for ledger/replay flows.
- Meter names/tags enumerated for observability wiring.
@@ -1,42 +0,0 @@
# Concelier PREP Notes — 2025-11-20

Owner: Concelier Core Guild · Scheduler Guild · Data Science Guild
Scope: Provide traceable prep outputs for PREP-CONCELIER-GRAPH-21-002-PLATFORM-EVENTS-S and PREP-CONCELIER-LNM-21-002-WAITING-ON-FINALIZE so downstream tasks can proceed without blocking on missing contracts.

## 1) `sbom.observation.updated` platform event (Graph-21-002)
- Goal: publish deterministic, facts-only observation updates for graph overlays; no derived judgments.
- Proposed envelope (draft for Scheduler/Platform Events review):
  - `event_type`: `sbom.observation.updated`
  - `tenant_id` (string, required)
  - `advisory_ids` (array of strings; upstream IDs as-ingested)
  - `observation_ids` (array of stable per-observation IDs emitted by LNM storage)
  - `source` (string; advisory source slug)
  - `version_range` (string; original upstream semantics)
  - `occurred_at` (ISO-8601 UTC, produced by Concelier at write time; deterministic)
  - `trace` (object; optional provenance pointers, DSSE envelope digest with alg/id fields)
- Delivery and wiring expectations:
  - Publisher lives in `StellaOps.Concelier.Core` after linkset/observation persistence.
  - Scheduler binding: NATS/Redis topic `concelier.sbom.observation.updated`; ack-based and idempotent so replays are safe; deliveries are deduplicated to at most one per message ID = `<tenant>:<observation_id>::<digest>`.
  - Telemetry: counter `concelier_events_observation_updated_total{tenant,source,result}`; log template includes `tenant`, `advisory_id`, `observation_id`, `event_id`.
  - Offline posture: allow emitting into local bus, enqueue to file-backed spool when offline; retry with deterministic ordering by `(tenant, observation_id)`.
- Open questions to resolve in impl task:
  - Final topic naming and DSSE requirement (optional vs required per deployment).
  - Whether to include component alias list in the event payload or expect consumers to join via API.
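
The message-ID format and the spool ordering above can be illustrated together. This Python sketch is a reading of the draft, not the publisher: the function and class names are hypothetical, the digest is passed in as an opaque string because the draft does not say what it covers, and the spool entries are assumed to carry one `observation_id` each.

```python
def message_id(tenant_id: str, observation_id: str, digest: str) -> str:
    """Format the documented ID: <tenant>:<observation_id>::<digest>."""
    return f"{tenant_id}:{observation_id}::{digest}"


def drain_order(spooled: list) -> list:
    """Order spooled entries by (tenant, observation_id) for deterministic retry."""
    return sorted(spooled, key=lambda e: (e["tenant_id"], e["observation_id"]))


class Deduplicator:
    """Remember delivered message IDs so a replayed event is published once."""

    def __init__(self):
        self._seen = set()

    def first_delivery(self, msg_id: str) -> bool:
        if msg_id in self._seen:
            return False
        self._seen.add(msg_id)
        return True


# Hypothetical spool contents written while the bus was unreachable.
spool = [
    {"tenant_id": "beta", "observation_id": "obs-1", "digest": "sha256:03"},
    {"tenant_id": "acme", "observation_id": "obs-9", "digest": "sha256:02"},
    {"tenant_id": "acme", "observation_id": "obs-2", "digest": "sha256:01"},
]
ordered_ids = [
    message_id(e["tenant_id"], e["observation_id"], e["digest"])
    for e in drain_order(spool)
]
```

Sorting by the tuple gives the same drain order regardless of the order in which events were spooled, which is what makes replays comparable across runs.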

## 2) LNM fixtures + precedence markers (LNM-21-002)
- Goal: unblock correlation pipelines and downstream linkset tasks by defining required fixture shape and precedence rules.
- Fixture requirements (additive to frozen LNM v1 schema):
  - Provide at least three sources with conflicting severity/CVSS to exercise conflict markers.
  - Include overlapping version ranges to validate precedence tie-breakers.
  - Each fixture must include `provenance` (source, fetch_time, collector) and `confidence` hints.
- Precedence rule proposal for review:
  1. Prefer explicit source ranking table (to be agreed) over recency.
  2. If ranking ties, prefer narrower version ranges, then higher confidence, then stable lexical order of `(source, advisory_id)`.
  3. Never collapse conflicting fields; emit `conflicts[]` entries with reason codes `severity-disagree`, `cvss-disagree`, `reference-disagree`.
- Delivery path for fixtures once agreed: `src/Concelier/seed-data/lnm/v1/fixtures/*.json` with deterministic ordering; wire into `StellaOps.Concelier.Core.Tests` harness.
- Next actions captured for implementation task:
  - Confirm ranking table and conflict reason code list with Cartographer/Data Science.
  - Drop initial fixtures into the above path and reference them from the implementation tasks’ tests.
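
The proposed precedence rules translate naturally into a composite sort key. The Python sketch below is one possible encoding for review purposes only: the ranking table is still to be agreed, so `source_rank` is a placeholder, and `range_width` is a hypothetical numeric stand-in for "narrower version range".

```python
def precedence_key(candidate: dict, source_rank: dict) -> tuple:
    """Smaller tuples win: rank, then narrower range, then higher confidence."""
    return (
        source_rank[candidate["source"]],   # rule 1: explicit ranking table
        candidate["range_width"],           # rule 2a: narrower range first
        -candidate["confidence"],           # rule 2b: higher confidence first
        candidate["source"],                # rule 2c: stable lexical order
        candidate["advisory_id"],
    )


def detect_conflicts(candidates: list) -> list:
    """Rule 3: report disagreements without collapsing them into one value."""
    conflicts = []
    for field, reason in (("severity", "severity-disagree"),
                          ("cvss", "cvss-disagree")):
        values = sorted({str(c[field]) for c in candidates})
        if len(values) > 1:
            conflicts.append({"reason": reason, "values": values})
    return conflicts


# Placeholder ranking and fixtures; real values are pending agreement.
rank = {"nvd": 1, "ghsa": 1}
candidates = [
    {"source": "nvd", "advisory_id": "CVE-EXAMPLE", "range_width": 10,
     "confidence": 0.9, "severity": "high", "cvss": 7.5},
    {"source": "ghsa", "advisory_id": "GHSA-EXAMPLE", "range_width": 5,
     "confidence": 0.8, "severity": "critical", "cvss": 7.5},
]
preferred = min(candidates, key=lambda c: precedence_key(c, rank))
conflicts = detect_conflicts(candidates)
```

Here the two sources tie on rank, so the narrower range decides, and the severity disagreement is still reported in full. Selecting a preferred entry only orders the output; it does not merge or drop the losing values.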

## Handoff
- This document is the published prep artefact requested by PREP-CONCELIER-GRAPH-21-002-PLATFORM-EVENTS-S and PREP-CONCELIER-LNM-21-002-WAITING-ON-FINALIZE. Downstream tasks should cite this file until the final schemas/fixtures are merged.
@@ -1,37 +0,0 @@
# Concelier · Policy Engine Linkset API Prep

- **Date:** 2025-11-20
- **Scope:** PREP-CONCELIER-POLICY-20-001 (LNM APIs not exposed via OpenAPI)
- **Working directory:** `src/Concelier/StellaOps.Concelier.WebService`

## Goal
Freeze the contract Policy Engine will consume for advisory lookups without inference/merges, and locate where the OpenAPI surface must be updated so downstream Policy tasks can begin.

## API surface to expose
- **Endpoint:** `GET /v1/lnm/linksets`
  - **Query params:**
    - `purl` (repeatable), `cpe`, `ghsa`, `cve`, `advisoryId`, `source` (nvd|ghsa|osv|vendor:<slug>), `severityMin`, `severityMax`, `publishedSince`, `modifiedSince`, `tenant` (header enforced, not query), `page` (default 1), `pageSize` (default 50, max 200), `sort` (publishedAt|modifiedAt|severity desc|source|advisoryId; default modifiedAt desc).
  - **Response:** deterministic ordering; body fields = `advisoryId`, `source`, `purl[]`, `cpe[]`, `summary`, `publishedAt`, `modifiedAt`, `severity` (source-native), `status` (facts only), `provenance` (`ingestedAt`, `connectorId`, `evidenceHash`, `dsseEnvelopeHash?`), `conflicts[]` (raw disagreements, no merged verdicts), `timeline[]` (raw timestamps + hashes), `remarks[]` (human notes, optional).
- **Endpoint:** `GET /v1/lnm/linksets/{advisoryId}`
  - Mirrors above fields; adds `normalized` block for any canonicalized IDs; `cached` flag already added in Sprint 110.B endpoint work.
- **Endpoint:** `POST /v1/lnm/linksets/search`
  - Accepts body with same filters as query params plus boolean `includeTimeline`, `includeObservations` (default false). Must respect tenant guard and AOC (no inferred verdicts or merges).

## OpenAPI tasks
- Source file location: `src/Concelier/StellaOps.Concelier.WebService/openapi/concelier-lnm.yaml` (to be created / updated alongside code) and published copy under `docs/api/concelier/`.
- Add components:
  - `LinksetProvenance` object (ingestedAt, connectorId, evidenceHash, dsseEnvelopeHash?).
  - `LinksetConflict` object (source, field, observedValue, observedAt, evidenceHash).
  - `LinksetTimeline` object (event, at, evidenceHash, dsseEnvelopeHash?).
- Pagination envelope: `{ "items": [...], "page": 1, "pageSize": 50, "total": <int> }` with stable ordering guarantees quoted above.
- Security: `Tenant` header required; bearer/mtls unchanged from existing WebService.

## Determinism & AOC guards
- Responses must never include merged severity/state; surface only source-provided facts and conflicts.
- Sorting: primary `modifiedAt desc`, tie-breaker `advisoryId asc`, then `source asc` for deterministic pagination.
- Cache: the `/linksets/{advisoryId}` endpoint may serve cached entries but must include `cached: true|false` and `provenance.evidenceHash` so Policy Engine can verify integrity.
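
The sort order and pagination envelope described above can be sketched in a few lines. This Python example is illustrative, not the WebService code: the function names are hypothetical, and it assumes `modifiedAt` values are uniformly formatted ISO-8601 UTC strings so that string comparison matches chronological order.

```python
def sort_linksets(items: list) -> list:
    """Order by modifiedAt desc, then advisoryId asc, then source asc."""
    # Python's sort is stable, so the tie-breakers are applied first and the
    # primary key last; reverse=True keeps equal elements in their prior order.
    by_tiebreakers = sorted(items, key=lambda i: (i["advisoryId"], i["source"]))
    return sorted(by_tiebreakers, key=lambda i: i["modifiedAt"], reverse=True)


def paginate(items: list, page: int = 1, page_size: int = 50) -> dict:
    """Build the documented envelope, clamping pageSize to the 200 maximum."""
    page_size = max(1, min(page_size, 200))
    ordered = sort_linksets(items)
    start = (page - 1) * page_size
    return {
        "items": ordered[start:start + page_size],
        "page": page,
        "pageSize": page_size,
        "total": len(ordered),
    }


# Placeholder rows; only the three sort fields matter here.
rows = [
    {"advisoryId": "B", "source": "nvd", "modifiedAt": "2025-11-20T00:00:00Z"},
    {"advisoryId": "A", "source": "osv", "modifiedAt": "2025-11-20T00:00:00Z"},
    {"advisoryId": "C", "source": "ghsa", "modifiedAt": "2025-11-21T00:00:00Z"},
]
first_page = paginate(rows, page=1, page_size=2)
```

Without the tie-breakers, two linksets sharing a `modifiedAt` value could swap places between requests and appear on two pages or on none.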

## Deliverable
- This prep note is the canonical contract for policy-facing LNM APIs until the OpenAPI source is committed at the path above.
- Downstream tasks (POLICY-ENGINE-20-001 and linked Policy Engine sprints) should bind to these fields; any deviations must update this prep note and the sprint’s Decisions & Risks.
@@ -1,44 +0,0 @@
# Concelier Web AirGap Prep — PREP-CONCELIER-WEB-AIRGAP-57-001

Status: Draft (2025-11-20)
Owners: Concelier WebService Guild · AirGap Policy Guild
Scope: Define remediation payloads and staleness plumbing for sealed-mode violations, dependent on WEB-AIRGAP-56-002.

## Dependencies
- WEB-AIRGAP-56-001: mirror bundle registration + sealed-mode enforcement.
- WEB-AIRGAP-56-002: staleness + bundle provenance metadata surfaces.
- AirGap controller scopes (seal/unseal) and time anchor semantics from AirGap Controller/Time guilds.

## Proposed payload mapping (EGRESS blocked)
- Error code: `AIRGAP_EGRESS_BLOCKED`.
- Shape:
  ```json
  {
    "error": "AIRGAP_EGRESS_BLOCKED",
    "message": "Direct internet fetches disabled in sealed mode; use mirror bundle sources only.",
    "bundle_required": true,
    "staleness_seconds": 0,
    "remediation": [
      "Import mirror bundle via /airgap/import or offline kit",
      "Ensure sealed mode is set with valid time anchor",
      "Retry with cached/mirrored sources enabled"
    ]
  }
  ```
- Determinism: fixed ordering of fields, remediation list sorted.

## Staleness surfacing
- Staleness derived from bundle metadata supplied by 56-002 (`bundle_id`, `provenance`, `staleness_budget_seconds`).
- Responses include `staleness_seconds_remaining` and `bundle_id` when available.
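
The relationship between `staleness_budget_seconds` and `staleness_seconds_remaining` is not spelled out above, so the following Python sketch states one plausible reading as an assumption: remaining time is the budget minus the bundle's age, floored at zero. The `bundle_created_at` input and the function name are hypothetical, and the real time anchor semantics belong to the AirGap Controller/Time guilds.

```python
from datetime import datetime, timedelta, timezone


def staleness_seconds_remaining(bundle_created_at: datetime,
                                budget_seconds: int,
                                now: datetime) -> int:
    """Return seconds left before the bundle exceeds its staleness budget."""
    age_seconds = int((now - bundle_created_at).total_seconds())
    # Floor at zero so an expired bundle never reports a negative remainder.
    return max(0, budget_seconds - age_seconds)


created = datetime(2025, 11, 20, 0, 0, 0, tzinfo=timezone.utc)
fresh = staleness_seconds_remaining(created, 3600, created + timedelta(minutes=10))
expired = staleness_seconds_remaining(created, 3600, created + timedelta(hours=2))
```

Passing `now` in explicitly keeps the calculation deterministic and lets a sealed deployment substitute its trusted time anchor for the wall clock.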

## Observability
- Emit timeline event `concelier.airgap.egress_blocked` with `{tenant_id, bundle_id?, endpoint, request_id}`.
- Metric: `concelier_airgap_egress_blocked_total` (counter) tagged by endpoint.

## Open decisions
- Final error envelope format (depends on WEB-OAS-61-002 standard envelope).
- Exact header name for staleness metadata (suggest `x-concelier-bundle-staleness`).
- Whether to include advisory key/linkset ids in the blocked response.

## Handoff
Use this as the PREP artefact for WEB-AIRGAP-57-001. Update once 56-002 and error envelope standard are finalized.
@@ -1,29 +0,0 @@
# Concelier OAS & Observability Prep (61-001..63-001, 51-001..55-001)

Status: **Ready for implementation** (2025-11-22)
Owners: Concelier Core Guild · API Contracts Guild · DevOps/Observability Guilds
Scope: Freeze the API/SDK contracts and observability envelopes for LNM search/timeline APIs so downstream SDK, governance, and incident flows can proceed without schema churn.

## Inputs
- Frozen LNM payload schema: `docs/modules/concelier/link-not-merge-schema.md` (2025-11-17).
- Event contract: `docs/modules/concelier/events/advisory.observation.updated@1.md`.
- Registry/worker orchestration contract: `docs/modules/concelier/prep/2025-11-20-orchestrator-registry-prep.md`.

## Deliverables
- OpenAPI source stub for LNM + timeline surfaces recorded at `docs/modules/concelier/openapi/lnm-api.yaml` (paths enumerated; examples outlined below).
- SDK example library checklist covering `searchAdvisories`, `searchLinksets`, `getTimeline`, `getObservationById`; response bodies aligned to frozen schema; no consensus/merge fields.
- Observability contract (metrics/logs/traces):
  - Metrics: `concelier_ingest_latency_seconds`, `concelier_linkset_conflicts_total`, `concelier_timeline_emit_lag_seconds`, `concelier_api_requests_total{route,tenant,status}` with burn-rate alert examples.
  - Logs: structured fields `tenantId`, `advisoryKey`, `linksetId`, `timelineCursor`, `egressPolicy`.
  - Traces: span names for `lnm.search`, `lnm.timeline`, `lnm.linkset-resolve` with baggage keys `tenant-id`, `request-id`.
- Incident/observability hooks: timeline/attestation enrichment notes for OBS-54/55 including DSSE envelope hash field and sealed-mode redaction rules.

## Acceptance Criteria
- Request/response shapes for `/api/v1/lnm/advisories`, `/api/v1/lnm/linksets`, `/api/v1/lnm/timeline` documented with required query params (`tenantId`, `productKey`, `offset`, `limit`, `sort`, `includeTimeline=true|false`).
- All responses MUST include `provenance` block (source, fetchedAt, digest, evidenceBundleId) and forbid consensus/merge fields.
- Metrics/logs names and labels are deterministic and lowercase; alert examples reference burn-rate SLOs.
- File path above is referenced from sprint trackers; any future schema edits require bumping version/comment in this prep doc.

## Notes
- This prep satisfies PREP-CONCELIER-OAS-61-001/002/62-001/63-001 and PREP-CONCELIER-OBS-51-001/52-001/53-001/54-001/55-001.
- No external dependencies remaining; downstream tasks may proceed using the stubbed OpenAPI and observability contracts here.
@@ -1,82 +0,0 @@
# Concelier Backfill & Rollback Plan (STORE-AOC-19-005-DEV, Postgres)

## Objective
Prepare and rehearse the raw Link-Not-Merge backfill/rollback so Concelier Postgres reflects the dataset deterministically across dev/stage. This replaces the prior Mongo workflow.

## Inputs
- Dataset tarball: `out/linksets/linksets-stage-backfill.tar.zst`
  - Files expected inside: `linksets.ndjson`, `advisory_chunks.ndjson`, `manifest.json`
- Record SHA-256 of the tarball here when staged:
  ```
  $ sha256sum out/linksets/linksets-stage-backfill.tar.zst
  2b43ef9b5694f59be8c1d513893c506b8d1b8de152d820937178070bfc00d0c0 out/linksets/linksets-stage-backfill.tar.zst
  ```
- To regenerate the tarball deterministically from repo seeds: `./scripts/concelier/build-store-aoc-19-005-dataset.sh`
- To validate a tarball locally (counts + hashes): `./scripts/concelier/test-store-aoc-19-005-dataset.sh out/linksets/linksets-stage-backfill.tar.zst`
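
Where `sha256sum` is unavailable, the recorded digest can be checked with a few lines of Python. This is a minimal sketch and not part of the repository's scripts: the function name `sha256_of` is hypothetical, and the file is streamed in chunks so a large tarball is not loaded into memory at once.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha256_of("out/linksets/linksets-stage-backfill.tar.zst")` and comparing the result with the value recorded above gives the same check as the `sha256sum` command.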

## Preflight
- Env:
  - `PGURI` (or `CONCELIER_PG_URI`) pointing to the target Postgres instance.
  - `PGSCHEMA` (default `lnm_raw`) for staging tables.
- Ensure maintenance window for bulk import; no concurrent writers to staging tables.

## Backfill steps (CI-ready)

### Preferred: CI/manual script
- `scripts/concelier/backfill-store-aoc-19-005.sh /path/to/linksets-stage-backfill.tar.zst`
  - Env: `PGURI` (or `CONCELIER_PG_URI`), optional `PGSCHEMA` (default `lnm_raw`), optional `DRY_RUN=1` for extraction-only.
  - The script:
    - Extracts and validates required files.
    - Creates/clears staging tables (`<schema>.linksets_raw`, `<schema>.advisory_chunks_raw`).
    - Imports via `\copy` from TSV derived with `jq -rc '[._id, .] | @tsv'`.
    - Prints counts and echoes the manifest.

### Manual steps (fallback)
1) Extract dataset:
```
mkdir -p out/linksets/extracted
tar -xf out/linksets/linksets-stage-backfill.tar.zst -C out/linksets/extracted
```
2) Create/truncate staging tables and import:
```
psql "$PGURI" <<'SQL'
create schema if not exists lnm_raw;
create table if not exists lnm_raw.linksets_raw (id text primary key, raw jsonb not null);
create table if not exists lnm_raw.advisory_chunks_raw (id text primary key, raw jsonb not null);
truncate table lnm_raw.linksets_raw;
truncate table lnm_raw.advisory_chunks_raw;
-- tojson turns each document into one JSON string (@tsv only accepts scalars);
-- @tsv escaping matches COPY's default text format, so no CSV quoting options are needed.
\copy lnm_raw.linksets_raw (id, raw) from program 'jq -rc ''[._id, tojson] | @tsv'' out/linksets/extracted/linksets.ndjson'
\copy lnm_raw.advisory_chunks_raw (id, raw) from program 'jq -rc ''[._id, tojson] | @tsv'' out/linksets/extracted/advisory_chunks.ndjson'
SQL
```
3) Verify counts vs manifest:
```
jq '.' out/linksets/extracted/manifest.json
psql -tA "$PGURI" -c "select 'linksets_raw='||count(*) from lnm_raw.linksets_raw;"
psql -tA "$PGURI" -c "select 'advisory_chunks_raw='||count(*) from lnm_raw.advisory_chunks_raw;"
```

## Rollback procedure
- If validation fails: `truncate table lnm_raw.linksets_raw; truncate table lnm_raw.advisory_chunks_raw;` then rerun import.
- Promotion to production tables should be gated by a separate migration/ETL step; keep staging isolated.

## Validation checklist
- Tarball SHA-256 recorded above.
- Counts align with `manifest.json`.
- API smoke test (Postgres-backed): `dotnet test src/Concelier/StellaOps.Concelier.WebService.Tests --filter LinksetsEndpoint_SupportsCursorPagination` (against Postgres config).
- Optional: compare sample rows between staging and expected downstream tables.

## Artefacts to record
- Tarball SHA-256 and size.
- `manifest.json` copy alongside tarball.
- Import log (capture script output) and validation results.
- Decision: maintenance window and rollback outcome.

## How to produce the tarball (export from Postgres)
- Use `scripts/concelier/export-linksets-tarball.sh out/linksets/linksets-stage-backfill.tar.zst`.
  - Env: `PGURI` (or `CONCELIER_PG_URI`), optional `PGSCHEMA`, `LINKSETS_TABLE`, `CHUNKS_TABLE`.
  - The script exports `linksets` and `advisory_chunks` tables to NDJSON, generates `manifest.json`, builds the tarball, and prints the SHA-256.

## Owners
- Concelier Storage Guild (Postgres)
- AirGap/Backfill reviewers for sign-off