save checkpoint

This commit is contained in:
master
2026-02-11 01:32:14 +02:00
parent 5593212b41
commit cf5b72974f
2316 changed files with 68799 additions and 3808 deletions

@@ -0,0 +1,114 @@
# Scanner SBOM Hot Lookup Operations
Status: Active
Last Updated: 2026-02-10
Sprint: `SPRINT_20260210_001_DOCS_sbom_attestation_hot_lookup_contract` (`HOT-005`)
## Purpose
Operate the `scanner.artifact_boms` monthly partition set used by Scanner SBOM hot lookups:
- pre-create upcoming partitions to avoid month-boundary ingest failures
- enforce retention windows by dropping old partitions
- keep maintenance scoped to partition units (not whole-table rewrites)
## Required Inputs
- PostgreSQL DSN in `PG_DSN`
- migration `025_artifact_boms_hot_lookup.sql` applied
- permissions to execute:
  - `scanner.ensure_artifact_boms_future_partitions(int)`
  - `scanner.drop_artifact_boms_partitions_older_than(int, bool)`
## Manual Operations
Pre-create the partitions for the current month and the next month:
```bash
PG_DSN="Host=...;Database=...;Username=...;Password=..." \
./devops/scripts/scanner-artifact-boms-ensure-partitions.sh 1
```
Retention dry-run (default keep 12 months):
```bash
PG_DSN="Host=...;Database=...;Username=...;Password=..." \
./devops/scripts/scanner-artifact-boms-retention.sh 12 true
```
Retention execution (dry-run disabled, so matching partitions are dropped):
```bash
PG_DSN="Host=...;Database=...;Username=...;Password=..." \
./devops/scripts/scanner-artifact-boms-retention.sh 12 false
```
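Because the last invocation is destructive, it can help to validate the arguments before calling the retention script. The following is a minimal sketch under stated assumptions: `check_retention_args` is a hypothetical helper that is not part of the repository, and it assumes the script expects a positive month count and a literal `true`/`false` flag, as in the examples above. It only validates input and never touches the database.

```bash
# Hypothetical pre-flight check; not part of the StellaOps scripts.
# Usage: check_retention_args KEEP_MONTHS DRY_RUN
check_retention_args() {
  local keep dry_run
  keep=${1:-}
  dry_run=${2:-}

  # PG_DSN must be present in the environment (see Required Inputs).
  if [ -z "${PG_DSN:-}" ]; then
    echo "PG_DSN is not set" >&2
    return 1
  fi

  # Keep-months must consist of digits only...
  case $keep in
    '' | *[!0-9]*)
      echo "keep-months must be a positive integer, got '$keep'" >&2
      return 1
      ;;
  esac
  # ...and be at least 1.
  if [ "$keep" -lt 1 ]; then
    echo "keep-months must be at least 1" >&2
    return 1
  fi

  # Assumption: the script takes a literal true/false dry-run flag.
  case $dry_run in
    true | false) ;;
    *)
      echo "dry-run flag must be 'true' or 'false', got '$dry_run'" >&2
      return 1
      ;;
  esac
}
```

A wrapper could then run `check_retention_args 12 false && ./devops/scripts/scanner-artifact-boms-retention.sh 12 false`, ideally after reviewing the output of a dry run with the same month count.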
## Scheduled Jobs
### Cron example
```cron
# first day of each month at 00:10: ensure the next month's partition exists
10 0 1 * * PG_DSN="..." /opt/stellaops/devops/scripts/scanner-artifact-boms-ensure-partitions.sh 1
# daily at 00:15: apply retention (keep 12 months, dry-run disabled)
15 0 * * * PG_DSN="..." /opt/stellaops/devops/scripts/scanner-artifact-boms-retention.sh 12 false
```
### Systemd units
Install:
```bash
sudo cp devops/scripts/systemd/scanner-artifact-boms-*.service /etc/systemd/system/
sudo cp devops/scripts/systemd/scanner-artifact-boms-*.timer /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now scanner-artifact-boms-ensure.timer
sudo systemctl enable --now scanner-artifact-boms-retention.timer
```
`/etc/stellaops/scanner-hotlookup.env` must define `PG_DSN`.
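Before enabling the timers, the environment file can be checked for a usable `PG_DSN` entry. This is a minimal sketch, assuming the file uses plain `KEY=value` lines; `env_file_defines_pg_dsn` is a hypothetical helper name, not something shipped with the units.

```bash
# Hypothetical helper: succeeds only when FILE is readable and contains a
# line that starts with PG_DSN= followed by at least one character.
env_file_defines_pg_dsn() {
  [ -r "$1" ] && grep -Eq '^PG_DSN=.' "$1"
}
```

For example, `env_file_defines_pg_dsn /etc/stellaops/scanner-hotlookup.env || echo "PG_DSN missing" >&2` reports a missing or empty entry without printing the DSN itself.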
## Failure Modes and Rollback
### Missing upcoming partition
Symptom:
- ingest errors near a month boundary caused by partition routing failure (PostgreSQL reports `no partition of relation ... found for row`).
Mitigation:
1. Run `scanner-artifact-boms-ensure-partitions.sh 2`.
2. Re-run failed ingest operations.
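To confirm which partitions exist before and after the mitigation, the expected names can be generated and checked against the catalog. The sketch below assumes the `scanner.artifact_boms_YYYY_MM` naming used elsewhere in this document; `upcoming_partitions` and `missing_partition_checks` are hypothetical helpers that only print text. `to_regclass` is a standard PostgreSQL function that returns NULL when the relation does not exist.

```bash
# Hypothetical helper: prints the partition name for YEAR MONTH and for each
# of the AHEAD months that follow, one name per line.
upcoming_partitions() {
  local year month ahead i
  year=$1
  month=${2#0}   # drop one leading zero so "08" is not parsed as octal
  ahead=$3
  i=0
  while [ "$i" -le "$ahead" ]; do
    printf 'scanner.artifact_boms_%04d_%02d\n' "$year" "$month"
    month=$((month + 1))
    if [ "$month" -gt 12 ]; then
      month=1
      year=$((year + 1))
    fi
    i=$((i + 1))
  done
}

# Hypothetical helper: prints one existence query per expected partition.
missing_partition_checks() {
  upcoming_partitions "$@" | while IFS= read -r name; do
    printf "SELECT '%s' AS partition_name, to_regclass('%s') IS NOT NULL AS present;\n" \
      "$name" "$name"
  done
}
```

Running `missing_partition_checks "$(date +%Y)" "$(date +%m)" 2` prints queries for the current month and the two that follow; feed them to whichever SQL client is used for this database.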
### Retention job dropped incorrect partition
Symptom:
- historical hot-lookup rows unexpectedly missing.
Rollback:
1. Restore dropped partition table from latest PostgreSQL backup.
2. Attach restored table back to parent:
```sql
ALTER TABLE scanner.artifact_boms
ATTACH PARTITION scanner.artifact_boms_YYYY_MM
FOR VALUES FROM ('YYYY-MM-01') TO ('YYYY-MM-01'::date + INTERVAL '1 month');
```
3. Rebuild per-partition indexes if restore omitted them.
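The placeholders in the statement above have to be filled in by hand, and the December rollover is easy to get wrong. As a minimal sketch, the hypothetical helper `attach_partition_sql` below prints the statement with both bounds as plain date literals. It assumes monthly bounds that run from the first day of the month to the first day of the following month, as in step 2. Confirm that they match the bounds the partition had before it was dropped.

```bash
# Hypothetical helper: prints an ATTACH PARTITION statement for YEAR MONTH.
# The upper bound is the exclusive first day of the following month.
attach_partition_sql() {
  local year month next_year next_month
  year=$1
  month=${2#0}   # drop one leading zero so "08" is not parsed as octal
  next_year=$year
  next_month=$((month + 1))
  if [ "$next_month" -gt 12 ]; then
    next_month=1
    next_year=$((year + 1))
  fi
  printf 'ALTER TABLE scanner.artifact_boms\n'
  printf '  ATTACH PARTITION scanner.artifact_boms_%04d_%02d\n' "$year" "$month"
  printf "  FOR VALUES FROM ('%04d-%02d-01') TO ('%04d-%02d-01');\n" \
    "$year" "$month" "$next_year" "$next_month"
}
```

For instance, `attach_partition_sql 2025 12` produces bounds from `'2025-12-01'` to `'2026-01-01'`.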
### Hot partition bloat
Symptom:
- query latency regression on current month.
Mitigation:
1. Run `VACUUM (ANALYZE) scanner.artifact_boms_YYYY_MM;`
2. If needed, run `REINDEX TABLE scanner.artifact_boms_YYYY_MM;` (on PostgreSQL 12+, `REINDEX TABLE CONCURRENTLY` avoids blocking writes).
3. To reclaim space online, run `pg_repack` one partition at a time.
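Before vacuuming, it may be worth checking whether dead tuples are a plausible cause of the regression. The sketch below prints a query against the standard `pg_stat_user_tables` view for one partition; `partition_bloat_sql` is a hypothetical helper, and dead-tuple counts are only a rough indicator of bloat rather than a measurement of it.

```bash
# Hypothetical helper: prints a statistics query for the YEAR MONTH partition.
partition_bloat_sql() {
  local year month
  year=$1
  month=${2#0}   # drop one leading zero so "08" is not parsed as octal
  printf 'SELECT relname, n_live_tup, n_dead_tup, last_autovacuum\n'
  printf 'FROM pg_stat_user_tables\n'
  printf "WHERE schemaname = 'scanner' AND relname = 'artifact_boms_%04d_%02d';\n" \
    "$year" "$month"
}
```

A high `n_dead_tup` relative to `n_live_tup`, together with an old or empty `last_autovacuum`, suggests the `VACUUM (ANALYZE)` step is a reasonable first move.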
## References
- Schema + functions: `src/Scanner/__Libraries/StellaOps.Scanner.Storage/Postgres/Migrations/025_artifact_boms_hot_lookup.sql`
- SQL job snippets: `devops/database/postgres-partitioning/003_scanner_artifact_boms_hot_lookup_jobs.sql`
- Shell jobs:
  - `devops/scripts/scanner-artifact-boms-ensure-partitions.sh`
  - `devops/scripts/scanner-artifact-boms-retention.sh`