feat: Implement BerkeleyDB reader for RPM databases
Some checks failed
AOC Guard CI / aoc-guard (push) Has been cancelled
AOC Guard CI / aoc-verify (push) Has been cancelled
Concelier Attestation Tests / attestation-tests (push) Has been cancelled
Docs CI / lint-and-preview (push) Has been cancelled
Policy Lint & Smoke / policy-lint (push) Has been cancelled
Scanner Analyzers / Discover Analyzers (push) Has been cancelled
Scanner Analyzers / Build Analyzers (push) Has been cancelled
Scanner Analyzers / Test Language Analyzers (push) Has been cancelled
Scanner Analyzers / Validate Test Fixtures (push) Has been cancelled
Scanner Analyzers / Verify Deterministic Output (push) Has been cancelled
console-runner-image / build-runner-image (push) Has been cancelled
wine-csp-build / Build Wine CSP Image (push) Has been cancelled
wine-csp-build / Integration Tests (push) Has been cancelled
wine-csp-build / Security Scan (push) Has been cancelled
wine-csp-build / Generate SBOM (push) Has been cancelled
wine-csp-build / Publish Image (push) Has been cancelled
wine-csp-build / Air-Gap Bundle (push) Has been cancelled
wine-csp-build / Test Summary (push) Has been cancelled
- Added BerkeleyDbReader class to read and extract RPM header blobs from BerkeleyDB hash databases.
- Implemented methods to detect BerkeleyDB format and extract values, including handling of page sizes and magic numbers.
- Added tests for BerkeleyDbReader to ensure correct functionality and header extraction.

feat: Add Yarn PnP data tests

- Created YarnPnpDataTests to validate package resolution and data loading from Yarn PnP cache.
- Implemented tests for resolved keys, package presence, and loading from cache structure.

test: Add egg-info package fixtures for Python tests

- Created egg-info package fixtures for testing Python analyzers.
- Included PKG-INFO, entry_points.txt, and installed-files.txt for comprehensive coverage.

test: Enhance RPM database reader tests

- Added tests for RpmDatabaseReader to validate fallback to legacy packages when SQLite is missing.
- Implemented helper methods to create legacy package files and RPM headers for testing.

test: Implement dual signing tests

- Added DualSignTests to validate secondary signature addition when configured.
- Created stub implementations for crypto providers and key resolvers to facilitate testing.

chore: Update CI script for Playwright Chromium installation

- Modified ci-console-exports.sh to ensure deterministic Chromium binary installation for console exports tests.
- Added checks for Windows compatibility and environment variable setups for Playwright browsers.
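The BerkeleyDB format detection mentioned in the commit message above can be illustrated in shell terms. The magic-number constants are standard Berkeley DB values (a 4-byte magic at offset 12 of the first metadata page, `0x00061561` for hash databases, stored in either byte order); the helper name is illustrative and not the committed C# API.

```shell
#!/bin/sh
# Illustrative sketch of the detection logic: Berkeley DB stores a 4-byte
# magic number at byte offset 12 of page 0 (0x00061561 for hash databases),
# in either byte order depending on the writing host.
is_berkeleydb_hash() {
  magic=$(od -A n -t x1 -j 12 -N 4 "$1" 2>/dev/null | tr -d ' \n')
  case "$magic" in
    61150600|00061561) return 0 ;;  # little-endian / big-endian hash magic
    *) return 1 ;;
  esac
}
```

A reader would then parse page size and hash-bucket layout from the rest of the metadata page before walking pages for header blobs.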
# Concelier Backfill & Rollback Plan (STORE-AOC-19-005-DEV, Postgres)

## Objective

Prepare and rehearse the raw Link-Not-Merge backfill/rollback so Concelier Postgres reflects the dataset deterministically across dev/stage. This replaces the prior Mongo workflow.

## Inputs

- Dataset tarball: `out/linksets/linksets-stage-backfill.tar.zst`
- Files expected inside: `linksets.ndjson`, `advisory_chunks.ndjson`, `manifest.json`
- Record SHA-256 of the tarball here when staged:

```
$ sha256sum out/linksets/linksets-stage-backfill.tar.zst
2b43ef9b5694f59be8c1d513893c506b8d1b8de152d820937178070bfc00d0c0  out/linksets/linksets-stage-backfill.tar.zst
```

- To regenerate the tarball deterministically from repo seeds: `./scripts/concelier/build-store-aoc-19-005-dataset.sh`
- To validate a tarball locally (counts + hashes): `./scripts/concelier/test-store-aoc-19-005-dataset.sh out/linksets/linksets-stage-backfill.tar.zst`

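The "record SHA-256" input above can be enforced mechanically before any import; a minimal sketch, with an illustrative helper name:

```shell
#!/bin/sh
# Sketch: refuse to proceed unless the staged tarball matches the SHA-256
# recorded in this runbook. Helper name is illustrative.
check_tarball_hash() {
  actual=$(sha256sum "$1" | awk '{print $1}')
  if [ "$actual" = "$2" ]; then
    echo "hash ok: $1"
  else
    echo "hash mismatch for $1: got $actual, want $2" >&2
    return 1
  fi
}
# Usage:
# check_tarball_hash out/linksets/linksets-stage-backfill.tar.zst \
#   2b43ef9b5694f59be8c1d513893c506b8d1b8de152d820937178070bfc00d0c0
```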
## Preflight

- Env:
  - `PGURI` (or `CONCELIER_PG_URI`) pointing to the target Postgres instance.
  - `PGSCHEMA` (default `lnm_raw`) for staging tables.
- Ensure a maintenance window for the bulk import; no concurrent writers to the staging tables.

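The preflight environment rules can be resolved defensively at the top of any ad-hoc script. A sketch; the function name is illustrative:

```shell
#!/bin/sh
# Sketch: resolve connection settings per the preflight rules --
# PGURI falls back to CONCELIER_PG_URI, PGSCHEMA defaults to lnm_raw.
resolve_pg_env() {
  uri="${PGURI:-${CONCELIER_PG_URI:-}}"
  if [ -z "$uri" ]; then
    echo "set PGURI or CONCELIER_PG_URI before running the backfill" >&2
    return 1
  fi
  printf '%s %s\n' "$uri" "${PGSCHEMA:-lnm_raw}"
}
```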
## Backfill steps (CI-ready)

### Preferred: CI/manual script

- `scripts/concelier/backfill-store-aoc-19-005.sh /path/to/linksets-stage-backfill.tar.zst`
- Env: `PGURI` (or `CONCELIER_PG_URI`), optional `PGSCHEMA` (default `lnm_raw`), optional `DRY_RUN=1` for extraction-only.
- The script:
  - Extracts and validates required files.
  - Creates/clears staging tables (`<schema>.linksets_raw`, `<schema>.advisory_chunks_raw`).
  - Imports via `\copy` from TSV derived with `jq -rc '[._id, .] | @tsv'`.
  - Prints counts and echoes the manifest.

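To see the TSV shape the import expects, here is the derivation on a single NDJSON record. Note that `@tsv` only accepts scalar fields, so this sketch serializes the document with `tojson`; the actual script may handle this differently:

```shell
#!/bin/sh
# One-record illustration of the TSV derivation that feeds \copy: column 1
# is the document id, column 2 the whole JSON document. @tsv accepts only
# scalar fields, so this sketch serializes the document with tojson.
to_tsv_row() {
  printf '%s\n' "$1" | jq -rc '[._id, tojson] | @tsv'
}
# to_tsv_row '{"_id":"ls-1","source":"nvd"}'
# -> ls-1<TAB>{"_id":"ls-1","source":"nvd"}
```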
### Manual steps (fallback)

1) Extract dataset:

```
mkdir -p out/linksets/extracted
tar -xf out/linksets/linksets-stage-backfill.tar.zst -C out/linksets/extracted
```

2) Create/truncate staging tables and import:

```
psql "$PGURI" <<SQL
create schema if not exists lnm_raw;
create table if not exists lnm_raw.linksets_raw (id text primary key, raw jsonb not null);
create table if not exists lnm_raw.advisory_chunks_raw (id text primary key, raw jsonb not null);
truncate table lnm_raw.linksets_raw;
truncate table lnm_raw.advisory_chunks_raw;
\copy lnm_raw.linksets_raw (id, raw) from program 'jq -rc ''[._id, .] | @tsv'' out/linksets/extracted/linksets.ndjson' with (format csv, delimiter E'\t', quote '"', escape '"');
\copy lnm_raw.advisory_chunks_raw (id, raw) from program 'jq -rc ''[._id, .] | @tsv'' out/linksets/extracted/advisory_chunks.ndjson' with (format csv, delimiter E'\t', quote '"', escape '"');
SQL
```

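Between steps 1 and 2, a cheap guard can confirm the extraction produced everything the import needs. A sketch with an illustrative helper name:

```shell
#!/bin/sh
# Sketch: verify extraction produced the three expected, non-empty files
# before any staging tables are touched. Helper name is illustrative.
check_extracted() {
  for f in linksets.ndjson advisory_chunks.ndjson manifest.json; do
    if [ ! -s "$1/$f" ]; then
      echo "missing or empty: $1/$f" >&2
      return 1
    fi
  done
}
# Usage: check_extracted out/linksets/extracted
```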
3) Verify counts vs manifest:

```
jq '.' out/linksets/extracted/manifest.json
psql -tA "$PGURI" -c "select 'linksets_raw='||count(*) from lnm_raw.linksets_raw;"
psql -tA "$PGURI" -c "select 'advisory_chunks_raw='||count(*) from lnm_raw.advisory_chunks_raw;"
```

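The count verification in step 3 can be made pass/fail rather than eyeballed. The comparison helper and the manifest field name shown in the usage comment are assumptions, not part of the committed tooling:

```shell
#!/bin/sh
# Sketch: turn the count verification into a pass/fail check.
# check_count is illustrative, not part of the committed tooling.
check_count() {
  label=$1 want=$2 have=$3
  if [ "$want" = "$have" ]; then
    echo "$label: ok ($have rows)"
  else
    echo "$label: mismatch (manifest=$want, db=$have)" >&2
    return 1
  fi
}
# Usage (the field name .counts.linksets is an assumption about manifest.json):
# check_count linksets \
#   "$(jq -r '.counts.linksets' out/linksets/extracted/manifest.json)" \
#   "$(psql -tA "$PGURI" -c 'select count(*) from lnm_raw.linksets_raw;')"
```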
## Rollback procedure

- If validation fails: `truncate table lnm_raw.linksets_raw; truncate table lnm_raw.advisory_chunks_raw;` then rerun the import.
- Promotion to production tables should be gated by a separate migration/ETL step; keep staging isolated.

## Validation checklist

- Tarball SHA-256 recorded above.
- Counts align with `manifest.json`.
- API smoke test (Postgres-backed): `dotnet test src/Concelier/StellaOps.Concelier.WebService.Tests --filter LinksetsEndpoint_SupportsCursorPagination` (against Postgres config).
- Optional: compare sample rows between staging and expected downstream tables.

## Artefacts to record

- Tarball SHA-256 and size.
- `manifest.json` copy alongside tarball.
- Import log (capture script output) and validation results.
- Decision: maintenance window and rollback outcome.

## How to produce the tarball (export from Postgres)

- Use `scripts/concelier/export-linksets-tarball.sh out/linksets/linksets-stage-backfill.tar.zst`.
- Env: `PGURI` (or `CONCELIER_PG_URI`), optional `PGSCHEMA`, `LINKSETS_TABLE`, `CHUNKS_TABLE`.
- The script exports the `linksets` and `advisory_chunks` tables to NDJSON, generates `manifest.json`, builds the tarball, and prints its SHA-256.

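The deterministic tarball build that the export script performs presumably leans on the standard GNU tar reproducibility knobs. A minimal sketch, assuming GNU tar and zstd; the exact flags and the function name are illustrative, not necessarily what the script uses:

```shell
#!/bin/sh
# Sketch of deterministic tar.zst creation: fixed mtime, sorted member
# order, numeric zero owners, and zstd yield byte-identical archives for
# identical inputs. The real export script may differ; illustrative only.
build_tarball() {
  out=$1; src=$2; shift 2
  tar --sort=name --mtime='UTC 2020-01-01' --owner=0 --group=0 \
      --numeric-owner -C "$src" -I 'zstd -19' -cf "$out" "$@"
}
# Usage:
# build_tarball out/linksets/linksets-stage-backfill.tar.zst out/linksets/export \
#   linksets.ndjson advisory_chunks.ndjson manifest.json
```

Building twice from the same inputs should produce identical SHA-256 digests, which is what lets the recorded hash in this runbook stay stable.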
## Owners

- Concelier Storage Guild (Postgres)
- AirGap/Backfill reviewers for sign-off