I'm sharing this because it highlights important recent developments with Rekor, and how the v2 rollout and its DSSE behavior change what you need to watch out for when building attestations (for example in your StellaOps architecture).

### 🚨 What changed with Rekor v2

* Rekor v2 is now GA: it moves to a tile-backed transparency log backend (via the rekor-tiles module), which simplifies maintenance and lowers infrastructure cost. ([blog.sigstore.dev][1])
* The public instance now supports only two entry types: `hashedrekord` (for artifacts) and `dsse` (for attestations). Many previously supported entry types (e.g. `intoto`, `rekord`, `helm`, `rfc3161`) have been removed. ([blog.sigstore.dev][1])
* The log is now sharded: instead of a single growing Merkle tree, multiple "shards" (trees) are used. This supports better scaling, simpler rotation/maintenance, and easier querying by tree shard + identifier. ([Sigstore][2])

### ⚠️ Why this matters for attestations, and common pitfalls

* Historically, when submitting DSSE or in-toto style attestations to Rekor (or via Cosign), the **entire attestation payload** had to be uploaded. That becomes problematic when payloads are large; in one reported case, a 130 MB attestation was rejected due to size. ([GitHub][3])
* The public Rekor instance has historically enforced a relatively small attestation size limit (on the order of 100 KB) for uploads. ([GitHub][4])
* Because Rekor v2 drops the older entry types, you no longer have a fallback for attestation or storage formats that don't fit the `dsse`/`hashedrekord` constraints.
([blog.sigstore.dev][1])

### ✅ What you must design for, and pragmatic workarounds

Given your StellaOps architecture goals (deterministic builds, reproducible scans, large SBOMs/metadata, private/air-gapped compliance), here's what you should consider:

* **Plan for payload-size constraints**: don't assume arbitrarily large attestations will be accepted. Keep attestation payloads small; put large blobs (e.g. full SBOMs, large metadata) **outside** the DSSE envelope and store them elsewhere (artifact storage, internal logs, blob store), with the attestation embedding only a hash or reference.
* **Use "private logs" / self-hosted Rekor** if you anticipate large payloads; the public instance's limits make heavy payload uploads impractical. Running your own instance gives you control over size limits and resource allocation. ([GitHub][4])
* **Chunking / sharding**: for large metadata blobs, consider splitting ("sharding") or chunking the data into smaller pieces, each with its own `dsse`/`hashedrekord` entry, then reference or reassemble them externally. This avoids hitting size limits while maintaining inclusion proofs.
* **Build idempotent re-submit logic**: because `dsse`/`hashedrekord` entries are the only supported types, and large payloads may fail, your pipelines (e.g. StellaOps) should handle retries and partial submits, and ensure idempotence, so re-submits don't create inconsistent or duplicate entries.
* **Persist full attestations outside Rekor**: since Rekor v2 dropped many types and does not necessarily store full arbitrary blobs, ensure that the "source of truth" for large metadata remains under your control (e.g. in your internal storage), with Rekor storing only minimal hashed attestations.

### 🎯 What this means for StellaOps

For your "Proof-of-Integrity Graph" + "offline bundle + replayable scans" vision, treat Rekor (especially the public instance) as a *lightweight ledger for proofs*, not as a full-blown metadata store.
In practice:

* Use Rekor for auditing, signature/inclusion proofs, and "commit-to-hash" attestations.
* Store full SBOMs, VEX data, scan metadata, and large build/context blobs externally.
* Build your log-sharding, replay, and artifact-reference logic to survive Rekor's constraints.
* For air-gapped workflows, consider a private Rekor or an alternative internal log that mirrors the public transparency-log guarantees but stays under your control.

If you like, I can pull up **real-world examples** of organizations that already adapted to Rekor v2's DSSE constraints (with chunking, private logs, hybrid storage); that could help shape the StellaOps resilience strategy.

[1]: https://blog.sigstore.dev/rekor-v2-ga/?utm_source=chatgpt.com "Rekor v2 GA - Cheaper to run, simpler to maintain"
[2]: https://docs.sigstore.dev/logging/sharding/?utm_source=chatgpt.com "Sharding"
[3]: https://github.com/sigstore/cosign/issues/3599?utm_source=chatgpt.com "Attestations require uploading entire payload to rekor #3599"
[4]: https://github.com/sigstore/rekor?utm_source=chatgpt.com "sigstore/rekor: Software Supply Chain Transparency Log"

Here's a concrete, developer-friendly implementation plan you can hand to the team. I'll assume the context is "StellaOps + Sigstore/Rekor v2 + DSSE + air-gapped support".

---

## 0. Shared context & constraints (what devs should keep in mind)

**Key facts (summarized):**

* Rekor v2 keeps only **two** entry types: `hashedrekord` (artifact signatures) and `dsse` (attestations). Older types (`intoto`, `rekord`, etc.) are gone. ([Sigstore Blog][1])
* The **public** Rekor instance enforces a ~**100 KB attestation size limit** per upload; bigger payloads need your **own Rekor instance** instead. ([GitHub][2])
* For DSSE entries, Rekor **does not store the full payload**; it stores hashes and verification material. Users are expected to persist the attestations alongside artifacts in their own storage.
([Go Packages][3])
* People have already hit problems where ~130 MB attestations were rejected by Rekor, showing that "just upload the whole SBOM/provenance" is not sustainable. ([GitHub][4])
* Sigstore's **bundle** format is the canonical way to ship DSSE + tlog metadata around as a single JSON object (very useful for offline/air-gapped replay). ([Sigstore][5])

**Guiding principles for the implementation:**

1. **Rekor is a ledger, not a blob store.** We log *proofs* (hashes, inclusion proofs), not big documents.
2. **Attestation payloads live in our storage** (object store / DB).
3. **All Rekor interaction goes through one abstraction** so we can easily switch public/private/none.
4. **Everything is idempotent and replayable** (important for retries and air-gapped exports).

---

## 1. High-level architecture

### 1.1 Components

1. **Attestation Builder library (in CI/build tools)**
   * Used by build pipelines / scanners / SBOM generators.
   * Responsibilities:
     * Collect artifact metadata (digest, build info, SBOM, scan results).
     * Call the Attestation API (below) with **semantic info** and raw payload(s).
2. **Attestation Service (core backend microservice)**
   * Single entry point for creating and managing attestations.
   * Responsibilities:
     * Normalize incoming metadata.
     * Store large payload(s) in the object store.
     * Construct a **small DSSE envelope** (payload = manifest / summary, not a giant blob).
     * Persist attestation records & payload manifests in the DB.
     * Enqueue log-submission jobs for:
       * Public Rekor v2
       * Private Rekor v2 (optional)
       * Internal event log (DB/Kafka)
     * Produce **Sigstore bundles** for offline use.
3. **Log Writer / Rekor Client Worker(s)**
   * Background workers consuming submission jobs.
   * Responsibilities:
     * Submit `dsse` (and optionally `hashedrekord`) entries to configured Rekor instances.
     * Handle retries with backoff.
     * Guarantee idempotency (no duplicate entries, no inconsistent state).
     * Update the DB with Rekor log index/uuid and status.
4.
   **Offline Bundle Exporter (CLI or API)**
   * Runs in the air-gapped cluster.
   * Responsibilities:
     * Periodically export "new" attestations + bundles since the last export.
     * Materialize the data as a tar/zip with:
       * Sigstore bundles (JSON)
       * Chunk manifests
       * Large payload chunks (optional, depending on policy).
5. **Offline Replay Service (connected environment)**
   * Runs where internet access and public Rekor are available.
   * Responsibilities:
     * Read offline bundles from the incoming location.
     * Replay to:
       * Public Rekor
       * Cloud storage
       * Internal observability
     * Write updated status back (e.g., via a status file or callback).
6. **Config & Policy Layer**
   * Central (e.g. YAML, env, config DB).
   * Controls:
     * Which logs to use: `public_rekor`, `private_rekor`, `internal_only`.
     * Size thresholds (DSSE payload limit, chunk size).
     * Retry/backoff policy.
     * Air-gapped mode toggles.

---

## 2. Data model (DB + storage)

Use whatever DB you have (Postgres is fine). Here's a suggested schema; adapt as needed.

### 2.1 Core tables

**`attestations`**

| Column | Type | Description |
| --- | --- | --- |
| `id` | UUID (PK) | Internal identifier |
| `subject_digest` | text | e.g., `sha256:` of the build artifact |
| `subject_uri` | text | Optional URI (image ref, file path, etc.) |
| `predicate_type` | text | e.g. `https://slsa.dev/provenance/v1` |
| `payload_schema_version` | text | Version of our manifest schema |
| `dsse_envelope_digest` | text | `sha256` of the DSSE envelope |
| `bundle_location` | text | URL/path to the Sigstore bundle (if cached) |
| `created_at` | timestamptz | Creation time |
| `created_by` | text | Origin (pipeline id, service name) |
| `metadata` | jsonb | Extra labels / tags |

**`payload_manifests`**

| Column | Type | Description |
| --- | --- | --- |
| `attestation_id` (FK) | UUID | Link to `attestations.id` |
| `total_size_bytes` | bigint | Size of the *full* logical payload |
| `chunk_count` | int | Number of chunks |
| `root_digest` | text | Digest of the full payload or Merkle root over chunks |
| `manifest_json` | jsonb | The JSON we sign in the DSSE payload |
| `created_at` | timestamptz | |

**`payload_chunks`**

| Column | Type | Description |
| --- | --- | --- |
| `attestation_id` (FK) | UUID | |
| `chunk_index` | int | 0-based index |
| `chunk_digest` | text | sha256 of this chunk |
| `size_bytes` | bigint | Size of the chunk |
| `storage_uri` | text | `s3://…` or equivalent |
| PRIMARY KEY | (attestation_id, chunk_index) | Ensures uniqueness |

**`log_submissions`**

| Column | Type | Description |
| --- | --- | --- |
| `id` | UUID (PK) | |
| `attestation_id` (FK) | UUID | |
| `target` | text | `public_rekor`, `private_rekor`, `internal` |
| `submission_key` | text | Idempotency key (see below) |
| `state` | text | `pending`, `in_progress`, `succeeded`, `failed_permanent` |
| `attempt_count` | int | For retries |
| `last_error` | text | Last error message |
| `rekor_log_index` | bigint | If applicable |
| `rekor_log_id` | text | Log ID (tree ID / key ID) |
| `created_at` | timestamptz | |
| `updated_at` | timestamptz | |

Add a **unique index**
on `(target, submission_key)` to guarantee idempotency.

---

## 3. DSSE payload design (how to avoid size limits)

### 3.1 Manifest-based DSSE instead of giant payloads

Instead of DSSE-signing the **entire SBOM/provenance blob** (which hits Rekor's 100 KB limit), we sign a **manifest** describing where the payload lives and how to verify it.

**Example manifest JSON** (the DSSE payload; small):

```json
{
  "version": "stellaops.manifest.v1",
  "subject": {
    "uri": "registry.example.com/app@sha256:abcd...",
    "digest": "sha256:abcd..."
  },
  "payload": {
    "type": "sbom.spdx+json",
    "rootDigest": "sha256:deadbeef...",
    "totalSize": 73400320,
    "chunkCount": 12
  },
  "chunks": [
    { "index": 0, "digest": "sha256:1111...", "size": 6291456 },
    { "index": 1, "digest": "sha256:2222...", "size": 6291456 }
    // ...
  ],
  "storagePolicy": {
    "backend": "s3",
    "bucket": "stellaops-attestations",
    "pathPrefix": "sboms/app/abcd..."
  }
}
```

* This JSON is small enough to **fit under 100 KB** even with lots of chunks, so the DSSE envelope stays small.
* The full SBOM/scan results live in your object store; Rekor logs the DSSE envelope hash.

### 3.2 Chunking logic (Attestation Service)

Config values (can be env vars):

* `CHUNK_SIZE_BYTES` = e.g. 5–10 MiB
* `SMALL_PAYLOAD_THRESHOLD` = e.g. 64 KB (below this, skip chunking)
* `MAX_DSSE_PAYLOAD_BYTES` = e.g. 70 KiB (keeping margin under Rekor's 100 KB limit)
* `MAX_CHUNK_COUNT` = safety guard

Algorithm:

1. Receive the raw payload bytes (SBOM / provenance / scan results).
2. Compute the full `root_digest = sha256(payload_bytes)` (or a Merkle root if you want more advanced verification).
3. If `len(payload_bytes) <= SMALL_PAYLOAD_THRESHOLD` (e.g. 64 KB):
   * Skip chunking.
   * Store the payload as a single object.
   * The manifest can optionally omit `chunks` and just record one object.
4. If larger:
   * Split into fixed-size chunks (except the last).
   * For each chunk:
     * Compute `chunk_digest`.
     * Upload the chunk to an object store path derived from `root_digest` + `chunk_index`.
     * Insert `payload_chunks` rows.
5.
   Build the manifest JSON with:
   * `version`
   * `subject`
   * the `payload` block
   * `chunks[]` (no URIs if you don't want to leak details; the URIs can be derived by clients).
6. Check that the serialized manifest size is ≤ `MAX_DSSE_PAYLOAD_BYTES`. If not:
   * Option A: increase the chunk size so you have fewer chunks.
   * Option B: move the chunk list to a secondary "chunk index" document and sign only its root digest.
7. DSSE-sign the manifest JSON.
8. Persist the DSSE envelope digest + manifest in the DB.

---

## 4. Rekor integration & idempotency

### 4.1 Rekor client abstraction

Implement an interface like:

```ts
interface TransparencyLogClient {
  submitDsseEnvelope(params: {
    dsseEnvelope: Buffer; // JSON bytes
    subjectDigest: string;
    predicateType: string;
  }): Promise<{
    logIndex: number;
    logId: string;
    entryUuid: string;
  }>;
}
```

Provide implementations:

* `PublicRekorClient` (points at `https://rekor.sigstore.dev` or the v2 equivalent).
* `PrivateRekorClient` (your own Rekor v2 cluster).
* `NullClient` (for internal-only mode).

Use official API semantics from the Rekor OpenAPI / SDKs where possible. ([Sigstore][6])

### 4.2 Submission jobs & idempotency

**Submission key design:**

```text
submission_key = sha256(
  "dsse" + "|" + rekor_base_url + "|" + dsse_envelope_digest
)
```

Workflow in the worker:

1. The worker fetches `log_submissions` with `state = 'pending'` or due for retry.
2. Set `state = 'in_progress'` (optimistic update).
3. Call `client.submitDsseEnvelope`.
4. On success:
   * Update `state = 'succeeded'`, set `rekor_log_index`, `rekor_log_id`.
5. If Rekor indicates "already exists" (or returns the same logIndex for the same envelope):
   * Treat as success, update `state = 'succeeded'`.
6. On network/5xx errors:
   * Increment `attempt_count`.
   * If `attempt_count < MAX_RETRIES`: schedule a retry with backoff.
   * Else: `state = 'failed_permanent'`, keep `last_error`.

DB constraint: `UNIQUE(target, submission_key)` ensures we don't create conflicting jobs.

---

## 5. Attestation Service API design

### 5.1 Create attestation (build/scan pipeline → Attestation Service)

**`POST /v1/attestations`**

**Request body (example):**

```json
{
  "subject": {
    "uri": "registry.example.com/app@sha256:abcd...",
    "digest": "sha256:abcd..."
  },
  "payloadType": "sbom.spdx+json",
  "payload": { "encoding": "base64", "data": "" },
  "predicateType": "https://slsa.dev/provenance/v1",
  "logTargets": ["internal", "private_rekor", "public_rekor"],
  "airgappedMode": false,
  "labels": { "team": "payments", "env": "prod" }
}
```

**Server behavior:**

1. Validate the subject & payload.
2. Chunk the payload per the rules in section 3.
3. Store the payload chunks.
4. Build the manifest JSON & DSSE envelope.
5. Insert `attestations`, `payload_manifests`, `payload_chunks`.
6. For each entry in `logTargets`:
   * Insert a `log_submissions` row with `state = 'pending'`.
7. Optionally construct a Sigstore bundle representing:
   * the DSSE envelope
   * the transparency log entry (when available); for async flows, you can fill this in later.
8. Return `202 Accepted` with a resource URL:

```json
{
  "attestationId": "1f4b3d...",
  "status": "pending_logs",
  "subjectDigest": "sha256:abcd...",
  "logTargets": ["internal", "private_rekor", "public_rekor"],
  "links": {
    "self": "/v1/attestations/1f4b3d...",
    "bundle": "/v1/attestations/1f4b3d.../bundle"
  }
}
```

### 5.2 Get attestation status

**`GET /v1/attestations/{id}`**

Returns:

```json
{
  "attestationId": "1f4b3d...",
  "subjectDigest": "sha256:abcd...",
  "predicateType": "https://slsa.dev/provenance/v1",
  "logs": {
    "internal": { "state": "succeeded" },
    "private_rekor": { "state": "succeeded", "logIndex": 1234, "logId": "..." },
    "public_rekor": { "state": "pending", "lastError": null }
  },
  "createdAt": "2025-11-27T12:34:56Z"
}
```

### 5.3 Get bundle

**`GET /v1/attestations/{id}/bundle`**

* Returns a **Sigstore bundle JSON** that:
  * Contains either:
    * Only the DSSE + identity + certificate chain (if logs are not yet written).
    * Or the DSSE + log entries (`hashedrekord` / `dsse` entries) for whichever logs are ready. ([Sigstore][5])
* This is what air-gapped exports and verifiers consume.

---

## 6. Air-gapped workflows

### 6.1 In the air-gapped environment

* The Attestation Service runs in "air-gapped mode":
  * `logTargets` typically = `["internal", "private_rekor"]`.
  * No direct public Rekor access.
* **Offline Exporter CLI**:

```bash
stellaops-offline-export \
  --since-id \
  --output offline-bundle-.tar.gz
```

* Exporter logic:
  1. Query the DB for new `attestations` > `since-id`.
  2. For each attestation:
     * Fetch the DSSE envelope.
     * Fetch the current log statuses (private Rekor, internal).
     * Build or reuse the Sigstore bundle JSON.
     * Optionally include payload chunks and/or the original payload.
  3. Write them into a tarball with a structure like:

```
/attestations//bundle.json
/attestations//chunks/chunk-0000.bin
...
/meta/export-metadata.json
```

### 6.2 In the connected environment

* **Replay Service**:

```bash
stellaops-offline-replay \
  --input offline-bundle-.tar.gz \
  --public-rekor-url https://rekor.sigstore.dev
```

* Replay logic:
  1. Read each `/attestations//bundle.json`.
  2. If the `public_rekor` entry is not present:
     * Extract the DSSE envelope from the bundle.
     * Call the Attestation Service "import & log" endpoint, or call `PublicRekorClient` directly.
     * Build a new updated bundle (with the public tlog entry).
  3. Emit an updated `result.json` for each attestation (so you can sync status back to the original environment if needed).

---

## 7. Observability & ops

### 7.1 Metrics

Have devs expose at least:

* `rekor_submit_requests_total{target, outcome}`
* `rekor_submit_latency_seconds{target}` (histogram)
* `log_submissions_in_queue{target}`
* `attestations_total{predicateType}`
* `attestation_payload_bytes{bucket}` (distribution of payload sizes)

### 7.2 Logging

* Log at **info**:
  * Attestation created (subject digest, predicateType, manifest version).
  * Log submission succeeded (target, logIndex, logId).
* Log at **warn/error**:
  * Any permanent failure.
  * Any time the DSSE payload nearly exceeds the size threshold (to catch misconfiguration).

### 7.3 Feature flags

* `FEATURE_REKOR_PUBLIC_ENABLED`
* `FEATURE_REKOR_PRIVATE_ENABLED`
* `FEATURE_OFFLINE_EXPORT_ENABLED`
* `FEATURE_CHUNKING_ENABLED` (to allow a rolling rollout)

---

## 8. Concrete work breakdown for developers

You can drop this straight in as a backlog outline:

1. **Domain model & storage**
   * [ ] Implement DB migrations for `attestations`, `payload_manifests`, `payload_chunks`, `log_submissions`.
   * [ ] Implement the object storage abstraction and a content-addressable layout for chunks.
2. **Attestation Service skeleton**
   * [ ] Implement `POST /v1/attestations` with basic validation.
   * [ ] Implement manifest building and DSSE envelope creation (no Rekor yet).
   * [ ] Persist records in the DB.
3. **Chunking & manifest logic**
   * [ ] Implement the chunker with thresholds & tests (small vs. large).
   * [ ] Implement the manifest JSON builder.
   * [ ] Ensure the DSSE payload size stays under the configurable limit.
4. **Rekor client & log submissions**
   * [ ] Implement the `TransparencyLogClient` interface + public/private implementations.
   * [ ] Implement the `log_submissions` worker (queue + backoff + idempotency).
   * [ ] Wire the worker into service config and deployment.
5. **Sigstore bundle support**
   * [ ] Implement the bundle builder given a DSSE envelope + log metadata.
   * [ ] Add `GET /v1/attestations/{id}/bundle`.
6. **Offline export & replay**
   * [ ] Implement the Exporter CLI (queries the DB, packages bundles and chunks).
   * [ ] Implement the Replay CLI/service (reads the tarball, logs to public Rekor).
   * [ ] Document the operator workflow for moving tarballs between environments.
7. **Observability & docs**
   * [ ] Add metrics, logs, and dashboards.
   * [ ] Write verification docs: "How to fetch the manifest, verify the DSSE, reconstruct the payload, and check Rekor."

---

If you'd like, as a next step I can turn this into a stricter format your devs may already use (e.g.
Jira epics + stories, or a design doc template with headers like "Motivation, Alternatives, Risks, Rollout Plan").

[1]: https://blog.sigstore.dev/rekor-v2-ga/?utm_source=chatgpt.com "Rekor v2 GA - Cheaper to run, simpler to maintain"
[2]: https://github.com/sigstore/rekor?utm_source=chatgpt.com "sigstore/rekor: Software Supply Chain Transparency Log"
[3]: https://pkg.go.dev/github.com/sigstore/rekor/pkg/types/dsse?utm_source=chatgpt.com "dsse package - github.com/sigstore/rekor/pkg/types/dsse"
[4]: https://github.com/sigstore/cosign/issues/3599?utm_source=chatgpt.com "Attestations require uploading entire payload to rekor #3599"
[5]: https://docs.sigstore.dev/about/bundle/?utm_source=chatgpt.com "Sigstore Bundle Format"
[6]: https://docs.sigstore.dev/logging/overview/?utm_source=chatgpt.com "Rekor"
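Appendix: a compact sketch of the section 3.2 chunking algorithm (TypeScript; the constants and manifest fields mirror the assumed `stellaops.manifest.v1` schema above and are illustrative):

```typescript
import { createHash } from "node:crypto";

// Assumed values, cf. CHUNK_SIZE_BYTES / SMALL_PAYLOAD_THRESHOLD in section 3.2.
const CHUNK_SIZE = 6 * 1024 * 1024;
const SMALL_PAYLOAD_THRESHOLD = 64 * 1024;

interface Manifest {
  version: string;
  subject: { digest: string };
  payload: { rootDigest: string; totalSize: number; chunkCount?: number };
  chunks?: { index: number; digest: string; size: number }[];
}

// Split a large payload into fixed-size chunks and emit the small manifest
// that becomes the DSSE payload; small payloads skip the chunk list.
function buildManifest(payload: Buffer, subjectDigest: string): Manifest {
  const sha256 = (b: Buffer) =>
    "sha256:" + createHash("sha256").update(b).digest("hex");
  const manifest: Manifest = {
    version: "stellaops.manifest.v1",
    subject: { digest: subjectDigest },
    payload: { rootDigest: sha256(payload), totalSize: payload.length },
  };
  if (payload.length <= SMALL_PAYLOAD_THRESHOLD) {
    return manifest; // stored as a single object, no chunk list needed
  }
  const chunks: Buffer[] = [];
  for (let off = 0; off < payload.length; off += CHUNK_SIZE) {
    chunks.push(payload.subarray(off, off + CHUNK_SIZE));
  }
  manifest.payload.chunkCount = chunks.length;
  manifest.chunks = chunks.map((c, i) => ({
    index: i,
    digest: sha256(c),
    size: c.length,
  }));
  return manifest;
}
```

The manifest stays small regardless of payload size, which is the property that keeps the DSSE envelope under `MAX_DSSE_PAYLOAD_BYTES`.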