Artifact Store Migration Runbook

Sprint: SPRINT_20260118_017_Evidence_artifact_store_unification (AS-006)

Overview

This runbook covers the migration of existing evidence from legacy artifact stores to the unified ArtifactStore.

Migration Sources

| Source | Legacy Path | Description |
|--------|-------------|-------------|
| EvidenceLocker | tenants/{tenantId}/bundles/{bundleId}/{sha256}-{name} | Evidence bundles |
| Attestor | attest/dsse/{bundleSha256}.json | DSSE envelopes |
| Vex | {prefix}/{format}/{digest}.{ext} | VEX documents |
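
The three legacy layouts listed above differ enough in shape to be told apart from the object key alone, which is handy when auditing a bucket listing before migration. A minimal sketch, assuming keys carry no leading slash; classify_legacy_key is a hypothetical helper and not part of the stella CLI. Because the VEX prefix is configurable, the third match is only a guess and is labeled as such.

```shell
# Hypothetical helper: guess which legacy store an object key belongs to,
# using only the path layouts listed above. Order matters: the VEX pattern
# is the loosest, so it is tried last.
classify_legacy_key() {
  case "$1" in
    tenants/*/bundles/*/*-*) echo "EvidenceLocker" ;;
    attest/dsse/*.json)      echo "Attestor" ;;
    */*/*.*)                 echo "Vex (unverified)" ;;
    *)                       echo "unknown" ;;
  esac
}

# Usage:
#   classify_legacy_key "attest/dsse/abc123.json"
```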

Target Path Convention

All artifacts are migrated to: /artifacts/{bom-ref-encoded}/{serialNumber}/{artifactId}.json
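
To make the convention concrete, the sketch below assembles a target path from its three parts. The runbook does not say how {bom-ref-encoded} is produced, so the example assumes RFC 3986 percent-encoding; the real encoder may differ. Both urlencode and artifact_target_path are hypothetical helpers, and the serial number and artifact ID in the usage line are placeholders.

```shell
# ASSUMPTION: {bom-ref-encoded} is plain percent-encoding. Verify against the
# ArtifactStore implementation before relying on paths built this way.

# Percent-encode every byte of $1 except unreserved ASCII (A-Z a-z 0-9 - . _ ~).
# Handles single-line input only.
urlencode() {
  printf '%s' "$1" | LC_ALL=C awk '
    BEGIN { for (i = 1; i < 256; i++) ord[sprintf("%c", i)] = i }
    {
      out = ""
      for (i = 1; i <= length($0); i++) {
        c = substr($0, i, 1)
        if (c ~ /^[A-Za-z0-9._~-]$/) out = out c
        else out = out sprintf("%%%02X", ord[c])
      }
      printf "%s", out
    }'
}

# Build /artifacts/{bom-ref-encoded}/{serialNumber}/{artifactId}.json
artifact_target_path() {
  printf '/artifacts/%s/%s/%s.json\n' "$(urlencode "$1")" "$2" "$3"
}

# Usage:
#   artifact_target_path "pkg:docker/acme/api@sha256:abc123" 1 art-1
```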

Pre-Migration Checklist

  • Backup existing S3 buckets
  • Verify PostgreSQL backup is current
  • Ensure sufficient storage for duplicated data
  • Review migration in dry-run mode first
  • Notify stakeholders of potential service impact
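
For the storage item in the checklist, one rough way to size the duplicated data is to add up the current size of the legacy buckets. The sketch below sums the "Total Size" lines that aws s3 ls prints with --recursive --summarize (bytes when --human-readable is not given). s3_total_bytes is a hypothetical helper and the bucket names in the usage comment are placeholders.

```shell
# Hypothetical helper: read one or more `aws s3 ls --recursive --summarize`
# outputs on stdin and print the combined size in bytes.
s3_total_bytes() {
  awk '/^ *Total Size:/ { sum += $3 } END { printf "%.0f\n", sum }'
}

# Usage (bucket names are placeholders):
#   for b in legacy-evidence legacy-attestor legacy-vex; do
#     aws s3 ls "s3://$b" --recursive --summarize
#   done | s3_total_bytes
```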

Running the Migration

Dry Run

# Preview what would be migrated and save the result as a report
stella artifacts migrate --source all --dry-run --output migration-preview.json

Full Migration

# Migrate all sources with default settings
stella artifacts migrate --source all

# Migrate with increased parallelism
stella artifacts migrate --source all --parallelism 8 --batch-size 200

# Migrate specific source
stella artifacts migrate --source evidence --output migration-report.json

# Migrate specific tenant
stella artifacts migrate --source all --tenant <tenant-uuid>

Resuming Failed Migration

# Use checkpoint ID from previous run
stella artifacts migrate --source all --resume-from <checkpoint-id>

Progress Monitoring

The CLI displays real-time progress:

  Progress: 1500/10000 (15.0%) - Success: 1495, Failed: 3, Skipped: 2
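
If the CLI output is captured to a log, the progress line can be turned into a failure rate for alerting. A minimal sketch, assuming each progress update lands on its own line in exactly the format shown above; failure_rate is a hypothetical helper, and the rate is computed against the number processed so far rather than the total.

```shell
# Hypothetical helper: read progress lines on stdin and print the failure
# rate (failed / processed, in percent) of the most recent one.
# Prints nothing if no progress line is found.
failure_rate() {
  sed -n 's/.*Progress: \([0-9][0-9]*\)\/[0-9][0-9]*.*Failed: \([0-9][0-9]*\).*/\1 \2/p' |
    awk '{ n = $1; failed = $2 } END { if (n > 0) printf "%.2f\n", 100 * failed / n }'
}

# Usage (migration.log is a placeholder for wherever the output was captured):
#   failure_rate < migration.log
```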

Rollback Procedure

When to Rollback

  • Migration corrupted data
  • Performance degradation after migration
  • Business-critical bug discovered

Rollback Steps

1. Stop New Writes to Unified Store

# Disable unified store in configuration
kubectl set env deployment/evidence-locker ARTIFACT_STORE_UNIFIED_ENABLED=false
kubectl set env deployment/attestor ARTIFACT_STORE_UNIFIED_ENABLED=false

2. Revert Application Configuration

# etc/appsettings.yaml
artifactStore:
  useUnifiedStore: false
  legacyMode: true

3. Clear Unified Store Index

-- Clear PostgreSQL index (preserves S3 data)
TRUNCATE TABLE artifact_store.artifacts;

4. (Optional) Remove Migrated S3 Objects

# Only if disk space is critical and you're certain about rollback
# WARNING: This is destructive!
aws s3 rm s3://artifacts-bucket/artifacts/ --recursive

5. Restart Services

kubectl rollout restart deployment/evidence-locker
kubectl rollout restart deployment/attestor

6. Verify Legacy Stores Work

# Test evidence retrieval
stella evidence get --bundle-id <test-bundle>

# Test attestation retrieval
stella attestor get --digest <test-digest>
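
Steps 1, 3 and 5 above are plain commands and can be rehearsed as a script before they are needed. The sketch below prints each command by default and only executes it when DRY_RUN=0 is set. The run and rollback_unified_store helpers are hypothetical, psql is assumed to pick up its connection settings from the usual PG* environment variables, and steps 2 and 4 are deliberately left manual; apply step 2 before running with DRY_RUN=0.

```shell
# Print the command unless DRY_RUN=0. The preview does not preserve quoting.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around rollback steps 1, 3 and 5.
rollback_unified_store() {
  for d in evidence-locker attestor; do
    run kubectl set env "deployment/$d" ARTIFACT_STORE_UNIFIED_ENABLED=false || return 1
  done
  run psql -v ON_ERROR_STOP=1 -c 'TRUNCATE TABLE artifact_store.artifacts;' || return 1
  for d in evidence-locker attestor; do
    run kubectl rollout restart "deployment/$d" || return 1
  done
}

# Usage:
#   rollback_unified_store              # preview only
#   DRY_RUN=0 rollback_unified_store    # execute
```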

Post-Migration Validation

Verify Artifact Counts

-- Count migrated artifacts by source
SELECT 
  CASE 
    WHEN storage_key LIKE '%evidence%' THEN 'evidence'
    WHEN storage_key LIKE '%dsse%' THEN 'attestor'
    WHEN storage_key LIKE '%vex%' THEN 'vex'
    ELSE 'unknown'
  END as source,
  COUNT(*) as count
FROM artifact_store.artifacts
GROUP BY 1;

Verify bom-ref Extraction

-- Check for artifacts with synthetic bom-refs (extraction failed)
SELECT COUNT(*) as synthetic_count
FROM artifact_store.artifacts
WHERE bom_ref LIKE 'sha256:%';
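
The runbook does not state how many synthetic bom-refs are acceptable, so any threshold is a local decision. The sketch below shows one way to turn the count from the query above into a pass/fail gate, assuming the total row count is fetched separately; synthetic_share_ok and the percentage limit are hypothetical.

```shell
# Hypothetical gate: succeed if synthetic / total is at most max_percent.
# Returns 2 when total is zero, so an empty table is not mistaken for a pass.
synthetic_share_ok() {
  synthetic=$1
  total=$2
  max_percent=$3
  [ "$total" -gt 0 ] || return 2
  # Integer arithmetic: synthetic/total <= max_percent/100
  [ $((synthetic * 100)) -le $((total * max_percent)) ]
}

# Usage (the 1 percent limit is a placeholder):
#   synthetic_share_ok "$synthetic_count" "$total_count" 1 || echo "too many synthetic bom-refs"
```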

Test Retrieval

# Query by bom-ref (URL-encode the value, which contains ':', '/' and '@')
curl -G "https://api.example.com/api/v1/artifacts" \
  --data-urlencode "bom_ref=pkg:docker/acme/api@sha256:abc123"

# Verify content matches original
stella artifacts compare \
  --original tenants/xxx/bundles/yyy/sha256-sbom.json \
  --migrated /artifacts/encoded-ref/serial/artifact.json

Troubleshooting

Migration Stuck

# Check for stuck workers (the bracket keeps grep from matching itself)
ps aux | grep '[m]igrate'

# Check migration checkpoints
cat /var/lib/stella/migration-checkpoint.json
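
A checkpoint file that has not changed for a while can hint at a stalled run. The sketch below checks the file's modification time, assuming the migration rewrites the checkpoint as it progresses (this runbook does not confirm that). checkpoint_is_stale is a hypothetical helper and relies on find -mmin, which GNU, BSD and BusyBox find provide.

```shell
# Hypothetical helper: succeed if the file was last modified more than
# $2 minutes ago. Returns 2 if the file does not exist.
checkpoint_is_stale() {
  file=$1
  minutes=$2
  [ -f "$file" ] || return 2
  [ -n "$(find "$file" -mmin "+$minutes")" ]
}

# Usage (the 10 minute limit is a placeholder):
#   checkpoint_is_stale /var/lib/stella/migration-checkpoint.json 10 && echo "no checkpoint update in 10 minutes"
```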

High Failure Rate

  1. Check migration report for common errors
  2. Verify source store connectivity
  3. Check for corrupted source artifacts
  4. Reduce the batch size if workers run out of memory

Slow Migration

  1. Increase parallelism (up to CPU count)
  2. Run during off-peak hours
  3. Consider migrating by tenant in parallel
  4. Verify network bandwidth to S3
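
Migrating by tenant in parallel can be sketched with xargs, using only the --source, --tenant and --output flags shown earlier. migrate_tenants is a hypothetical wrapper, the tenant list file and the per-tenant report names are assumptions, and xargs -P is a GNU/BSD extension. Setting STELLA_BIN=echo turns the run into a rehearsal that prints the commands instead of executing them.

```shell
# Hypothetical wrapper: run one migration per tenant ID listed in a file
# (one ID per line, blank lines ignored), at most $2 at a time.
migrate_tenants() {
  tenants_file=$1
  jobs=${2:-2}
  grep -v '^[[:space:]]*$' "$tenants_file" |
    xargs -P "$jobs" -I '{}' "${STELLA_BIN:-stella}" artifacts migrate \
      --source all --tenant '{}' --output 'migration-{}.json'
}

# Usage (tenants.txt is a placeholder):
#   STELLA_BIN=echo migrate_tenants tenants.txt 4   # rehearsal
#   migrate_tenants tenants.txt 4
```

Each job may also run with its own --parallelism, so the total load is roughly the product of the two settings; keep the job count modest when both are raised.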

Representative Dataset Testing

Before the production migration, test with a representative dataset:

# Export sample from each source
stella evidence list --limit 100 --output sample-evidence.json
stella attestor list --limit 100 --output sample-attestor.json

# Create test environment with samples
stella artifacts migrate --source all --tenant test-tenant --output test-report.json

# Verify that the migrated count matches the sample size
diff <(jq '.total' sample-evidence.json) <(jq '.succeeded' test-report.json)