git.stella-ops.org/docs/product-advisories/26-Nov-2025 - Handling Rekor v2 and DSSE Air‑Gap Limits.md
I'm sharing this because it highlights important recent developments with Rekor — and how its new v2 rollout and behavior with DSSE change what you need to watch out for when building attestations (for example in your StellaOps architecture).


🚨 What changed with Rekor v2

  • Rekor v2 is now GA: it moves to a tile-backed transparency log backend (via the rekor-tiles module), which simplifies maintenance and lowers infrastructure cost. (blog.sigstore.dev)
  • The global, publicly distributed instance now supports only two entry types: hashedrekord (for artifacts) and dsse (for attestations). Many previously supported entry types — e.g. intoto, rekord, helm, rfc3161, etc. — have been removed. (blog.sigstore.dev)
  • The log is now sharded: instead of a single growing Merkle tree, multiple “shards” (trees) are used. This supports better scaling, simpler rotation/maintenance, and easier querying by tree shard + identifier. (Sigstore)

⚠️ Why this matters for attestations, and common pitfalls

  • Historically, when using DSSE or in-toto-style attestations submitted to Rekor (or via Cosign), the entire attestation payload had to be uploaded to Rekor. That becomes problematic when payloads are large. There's a reported case where a ~130 MB attestation was rejected due to size. (GitHub)
  • The public instance of Rekor historically had a relatively small attestation size limit (on the order of 100 KB) for uploads. (GitHub)
  • Because Rekor v2 no longer supports many entry types and simplifies the log types, you no longer have a fallback for some of the older attestation/storage formats if they don't fit the dsse/hashedrekord constraints. (blog.sigstore.dev)

What you must design for — and pragmatic workarounds

Given your StellaOps architecture goals (deterministic builds, reproducible scans, large SBOMs/metadata, private/offline air-gap compliance), here's what you should consider:

  • Plan for payload-size constraints: don't assume arbitrarily large attestations will be accepted. Keep attestation payloads small — ideally put large blobs (e.g. full SBOMs, large metadata) outside DSSE and store them elsewhere (artifact storage, internal logs, blob store), with the attestation only embedding a hash or reference.
  • Use "private logs" / self-hosted Rekor if you anticipate large payloads — public instance limits make heavy payload uploads impractical. Running your own instance gives you control over size limits and resource allocation. (GitHub)
  • Chunking / sharding: For large metadata blobs, consider splitting ("sharding") or chunking the data into smaller pieces, each with its own DSSE/hashedrekord entry, then reference or reassemble externally. This avoids hitting size limits while maintaining inclusion proofs.
  • Build idempotent resubmit logic: Because DSSE/hashedrekord entries are the only supported types, and large payloads may fail, your pipelines (e.g. StellaOps) should handle retries and partial submits, and ensure idempotence — so resubmits don't create inconsistent or duplicate entries.
  • Persist full attestations outside Rekor: Since Rekor v2 dropped many types and doesn't necessarily store full arbitrary blobs, ensure that the "source of truth" for large metadata remains under your control (e.g. in your internal storage), with Rekor only storing minimal hashed attestations.
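As a minimal sketch of the hash-reference pattern described above — embed only a digest of the large blob in the signed payload, never the blob itself. All names here (`buildReferencePredicate`, the predicate fields) are illustrative, not an existing StellaOps or Sigstore API:

```typescript
import { createHash } from "crypto";

// Illustrative predicate: the DSSE payload carries only digests and a
// pointer to where the full SBOM lives, so it stays far under 100 KB.
interface ReferencePredicate {
  artifactDigest: string; // digest of the subject artifact
  sbomDigest: string;     // digest of the externally stored SBOM
  sbomLocation: string;   // where verifiers can fetch the full SBOM
}

function buildReferencePredicate(
  artifactDigest: string,
  sbomBytes: Buffer,
  sbomLocation: string
): ReferencePredicate {
  const sbomDigest =
    "sha256:" + createHash("sha256").update(sbomBytes).digest("hex");
  return { artifactDigest, sbomDigest, sbomLocation };
}
```

A verifier fetches the SBOM from `sbomLocation`, recomputes its sha256, and checks it against the signed `sbomDigest` — the inclusion proof covers the reference, not the blob.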

🎯 What this means for StellaOps

For your "Proof-of-Integrity Graph" + "offline bundle + replayable scans" vision — this means you should treat Rekor (especially the public instance) as a lightweight ledger for proofs, not as a full-blown metadata store. In practice:

  • Use Rekor for auditing, signature/inclusion proofs, and "commit-to-hash" attestations.
  • Store full SBOMs, VEX data, scan metadata, and large build/context blobs externally.
  • Build your log-sharding, replay, and artifact-reference logic to survive Rekor's constraints.
  • For air-gapped workflows, consider a private Rekor or an alternative internal log that mirrors the public transparency-log guarantees but stays under your control.

If you like — I can pull up real-world examples of organizations that have already adapted to Rekor v2's DSSE constraints (with chunking, private logs, hybrid storage) — that could help shape StellaOps' resilience strategy.

Here's a concrete, developer-friendly implementation plan you can hand to the team. I'll assume the context is "StellaOps + Sigstore/Rekor v2 + DSSE + air-gapped support".


0. Shared context & constraints (what devs should keep in mind)

Key facts (summarized):

  • Rekor v2 keeps only two entry types: hashedrekord (artifact signatures) and dsse (attestations). Older types (intoto, rekord, etc.) are gone. (Sigstore Blog)
  • The public Rekor instance enforces a ~100KB attestation size limit per upload; bigger payloads must use your own Rekor instance instead. (GitHub)
  • For DSSE entries, Rekor does not store the full payload; it stores hashes and verification material. Users are expected to persist the attestations alongside artifacts in their own storage. (Go Packages)
  • People have already hit problems where ~130 MB attestations were rejected by Rekor, showing that "just upload the whole SBOM/provenance" is not sustainable. (GitHub)
  • Sigstore's bundle format is the canonical way to ship DSSE + tlog metadata around as a single JSON object (very useful for offline/air-gapped replay). (Sigstore)

Guiding principles for the implementation:

  1. Rekor is a ledger, not a blob store. We log proofs (hashes, inclusion proofs), not big documents.
  2. Attestation payloads live in our storage (object store / DB).
  3. All Rekor interaction goes through one abstraction so we can easily switch public/private/none.
  4. Everything is idempotent and replayable (important for retries and air-gapped exports).

1. High-level architecture

1.1 Components

  1. Attestation Builder library (in CI/build tools)

    • Used by build pipelines / scanners / SBOM generators.

    • Responsibilities:

      • Collect artifact metadata (digest, build info, SBOM, scan results).
      • Call Attestation API (below) with semantic info and raw payload(s).
  2. Attestation Service (core backend microservice)

    • Single entrypoint for creating and managing attestations.

    • Responsibilities:

      • Normalize incoming metadata.

      • Store large payload(s) in object store.

      • Construct small DSSE envelope (payload = manifest / summary, not giant blob).

      • Persist attestation records & payload manifests in DB.

      • Enqueue log-submission jobs for:

        • Public Rekor v2
        • Private Rekor v2 (optional)
        • Internal event log (DB/Kafka)
      • Produce Sigstore bundles for offline use.

  3. Log Writer / Rekor Client Worker(s)

    • Background workers consuming submission jobs.

    • Responsibilities:

      • Submit dsse (and optionally hashedrekord) entries to configured Rekor instances.
      • Handle retries with backoff.
      • Guarantee idempotency (no duplicate entries, no inconsistent state).
      • Update DB with Rekor log index/uuid and status.
  4. Offline Bundle Exporter (CLI or API)

    • Runs in the air-gapped cluster.

    • Responsibilities:

      • Periodically export “new” attestations + bundles since last export.

      • Materialize data as tar/zip with:

        • Sigstore bundles (JSON)
        • Chunk manifests
        • Large payload chunks (optional, depending on policy).
  5. Offline Replay Service (connected environment)

    • Runs where internet access and public Rekor are available.

    • Responsibilities:

      • Read offline bundles from incoming location.

      • Replay to:

        • Public Rekor
        • Cloud storage
        • Internal observability
      • Write updated status back (e.g., via a status file or callback).

  6. Config & Policy Layer

    • Central (e.g. YAML, env, config DB).

    • Controls:

      • Which logs to use: public_rekor, private_rekor, internal_only.
      • Size thresholds (DSSE payload limit, chunk size).
      • Retry/backoff policy.
      • Air-gapped mode toggles.
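The Config & Policy Layer above could be typed roughly as follows — a hypothetical shape, with field names and defaults that are suggestions rather than an existing StellaOps schema:

```typescript
// Hypothetical config shape for the Config & Policy Layer.
type LogTarget = "public_rekor" | "private_rekor" | "internal";

interface AttestationConfig {
  logTargets: LogTarget[];     // which logs submissions go to
  chunkSizeBytes: number;      // e.g. 5–10 MiB per chunk
  maxDssePayloadBytes: number; // keep margin under Rekor's ~100 KB limit
  maxRetries: number;          // before failed_permanent
  backoffBaseMs: number;       // base for exponential backoff
  airgappedMode: boolean;      // when true, public_rekor is excluded
}

// Illustrative defaults for an air-gap-friendly deployment.
const defaults: AttestationConfig = {
  logTargets: ["internal", "private_rekor"],
  chunkSizeBytes: 8 * 1024 * 1024,
  maxDssePayloadBytes: 70 * 1024,
  maxRetries: 5,
  backoffBaseMs: 1000,
  airgappedMode: false,
};
```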

2. Data model (DB + storage)

Use whatever DB you have (Postgres is fine). Here's a suggested schema; adapt as needed.

2.1 Core tables

attestations

| Column | Type | Description |
| --- | --- | --- |
| id | UUID (PK) | Internal identifier |
| subject_digest | text | e.g. sha256:<hex> of build artifact |
| subject_uri | text | Optional URI (image ref, file path, etc.) |
| predicate_type | text | e.g. https://slsa.dev/provenance/v1 |
| payload_schema_version | text | Version of our manifest schema |
| dsse_envelope_digest | text | sha256 of DSSE envelope |
| bundle_location | text | URL/path to Sigstore bundle (if cached) |
| created_at | timestamptz | Creation time |
| created_by | text | Origin (pipeline id, service name) |
| metadata | jsonb | Extra labels / tags |

payload_manifests

| Column | Type | Description |
| --- | --- | --- |
| attestation_id (FK) | UUID | Link to attestations.id |
| total_size_bytes | bigint | Size of the full logical payload |
| chunk_count | int | Number of chunks |
| root_digest | text | Digest of full payload or Merkle root over chunks |
| manifest_json | jsonb | The JSON we sign in the DSSE payload |
| created_at | timestamptz | |

payload_chunks

| Column | Type | Description |
| --- | --- | --- |
| attestation_id (FK) | UUID | |
| chunk_index | int | 0-based index |
| chunk_digest | text | sha256 of this chunk |
| size_bytes | bigint | Size of chunk |
| storage_uri | text | s3://… or equivalent |
| PRIMARY KEY (attestation_id, chunk_index) | | Ensures uniqueness |

log_submissions

| Column | Type | Description |
| --- | --- | --- |
| id | UUID (PK) | |
| attestation_id (FK) | UUID | |
| target | text | public_rekor, private_rekor, internal |
| submission_key | text | Idempotency key (see below) |
| state | text | pending, in_progress, succeeded, failed_permanent |
| attempt_count | int | For retries |
| last_error | text | Last error message |
| rekor_log_index | bigint | If applicable |
| rekor_log_id | text | Log ID (tree ID / key ID) |
| created_at | timestamptz | |
| updated_at | timestamptz | |

Add a unique index on (target, submission_key) to guarantee idempotency.


3. DSSE payload design (how to avoid size limits)

3.1 Manifestbased DSSE instead of giant payloads

Instead of DSSE-signing the entire SBOM/provenance blob (which hits Rekor's ~100 KB limit), we sign a manifest describing where the payload lives and how to verify it.

Example manifest JSON (payload of DSSE, small):

{
  "version": "stellaops.manifest.v1",
  "subject": {
    "uri": "registry.example.com/app@sha256:abcd...",
    "digest": "sha256:abcd..."
  },
  "payload": {
    "type": "sbom.spdx+json",
    "rootDigest": "sha256:deadbeef...",
    "totalSize": 73400320,
    "chunkCount": 12
  },
  "chunks": [
    {
      "index": 0,
      "digest": "sha256:1111...",
      "size": 6291456
    },
    {
      "index": 1,
      "digest": "sha256:2222...",
      "size": 6291456
    }
    // ...
  ],
  "storagePolicy": {
    "backend": "s3",
    "bucket": "stellaops-attestations",
    "pathPrefix": "sboms/app/abcd..."
  }
}
  • This JSON is small enough to fit under 100KB even with lots of chunks, so the DSSE envelope stays small.
  • Full SBOM/scan results live in your object store; Rekor logs the DSSE envelope hash.

3.2 Chunking logic (Attestation Service)

Config values (can be env vars):

  • CHUNK_SIZE_BYTES = e.g. 5–10 MiB
  • MAX_DSSE_PAYLOAD_BYTES = e.g. 70 KiB (keeping margin under Rekor's 100 KB limit)
  • MAX_CHUNK_COUNT = safety guard

Algorithm:

  1. Receive raw payload bytes (SBOM / provenance / scan results).

  2. Compute full root_digest = sha256(payload_bytes) (or Merkle root if you want more advanced verification).

  3. If len(payload_bytes) <= SMALL_PAYLOAD_THRESHOLD (e.g. 64 KB):

    • Skip chunking.
    • Store payload as single object.
    • Manifest can optionally omit chunks and just record one object.
  4. If larger:

    • Split into fixed-size chunks (except the last).

    • For each chunk:

      • Compute chunk_digest.
      • Upload chunk to object store path derived from root_digest + chunk_index.
      • Insert payload_chunks rows.
  5. Build manifest JSON with:

    • version
    • subject
    • payload block
    • chunks[] (no URIs if you don't want to leak details; the URIs can be derived by clients).
  6. Check serialized manifest size ≤ MAX_DSSE_PAYLOAD_BYTES. If not:

    • Option A: increase chunk size so you have fewer chunks.
    • Option B: move chunk list to a secondary “chunk index” document and sign only its root digest.
  7. DSSE-sign the manifest JSON.

  8. Persist DSSE envelope digest + manifest in DB.
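The chunking core of steps 2–4 can be sketched as below — a minimal version under assumed constants (8 MiB chunks, 64 KB small-payload threshold); the upload and DB-insert steps are intentionally omitted:

```typescript
import { createHash } from "crypto";

interface ChunkInfo { index: number; digest: string; size: number; }

// Assumed defaults; in practice these come from config.
const CHUNK_SIZE_BYTES = 8 * 1024 * 1024;
const SMALL_PAYLOAD_THRESHOLD = 64 * 1024;

function sha256Hex(b: Buffer): string {
  return "sha256:" + createHash("sha256").update(b).digest("hex");
}

// Splits a payload into fixed-size chunks (last one may be shorter),
// computing a per-chunk digest plus a root digest over the whole blob.
function chunkPayload(payload: Buffer, chunkSize = CHUNK_SIZE_BYTES) {
  const rootDigest = sha256Hex(payload);
  if (payload.length <= SMALL_PAYLOAD_THRESHOLD) {
    // Small payload: store as a single object, no chunk list needed.
    return { rootDigest, totalSize: payload.length, chunks: [] as ChunkInfo[] };
  }
  const chunks: ChunkInfo[] = [];
  for (let off = 0, i = 0; off < payload.length; off += chunkSize, i++) {
    const slice = payload.subarray(off, off + chunkSize);
    chunks.push({ index: i, digest: sha256Hex(slice), size: slice.length });
  }
  return { rootDigest, totalSize: payload.length, chunks };
}
```

The returned structure maps directly onto the manifest's `payload` and `chunks[]` blocks from section 3.1.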


4. Rekor integration & idempotency

4.1 Rekor client abstraction

Implement an interface like:

interface TransparencyLogClient {
  submitDsseEnvelope(params: {
    dsseEnvelope: Buffer;      // JSON bytes
    subjectDigest: string;
    predicateType: string;
  }): Promise<{
    logIndex: number;
    logId: string;
    entryUuid: string;
  }>;
}

Provide implementations:

  • PublicRekorClient (points at https://rekor.sigstore.dev or v2 equivalent).
  • PrivateRekorClient (your own Rekor v2 cluster).
  • NullClient (for internal-only mode).
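A NullClient for internal-only mode might look like the sketch below: it never contacts a Rekor instance, records the envelope digest locally, and returns sentinel values (the sentinel convention here is an assumption, not a Sigstore API):

```typescript
import { createHash } from "crypto";

// Sketch: satisfies the TransparencyLogClient shape above without
// talking to any log. Callers in internal-only mode must not treat
// the returned values as real transparency-log coordinates.
class NullClient {
  readonly seen: string[] = []; // envelope digests we "accepted"

  async submitDsseEnvelope(params: {
    dsseEnvelope: Buffer;
    subjectDigest: string;
    predicateType: string;
  }): Promise<{ logIndex: number; logId: string; entryUuid: string }> {
    const digest = createHash("sha256")
      .update(params.dsseEnvelope)
      .digest("hex");
    this.seen.push(digest);
    // Sentinel values: logIndex -1 signals "not in any real log".
    return { logIndex: -1, logId: "internal", entryUuid: digest };
  }
}
```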

Use official API semantics from Rekor OpenAPI / SDKs where possible. (Sigstore)

4.2 Submission jobs & idempotency

Submission key design:

submission_key = sha256(
  "dsse" + "|" +
  rekor_base_url + "|" +
  dsse_envelope_digest
)
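A runnable version of that recipe (assuming `dsse_envelope_digest` is already a hex sha256 string):

```typescript
import { createHash } from "crypto";

// Deterministic idempotency key: same envelope + same target log
// always yields the same key, so UNIQUE(target, submission_key)
// collapses duplicate submissions.
function submissionKey(
  rekorBaseUrl: string,
  dsseEnvelopeDigest: string
): string {
  return createHash("sha256")
    .update(["dsse", rekorBaseUrl, dsseEnvelopeDigest].join("|"))
    .digest("hex");
}
```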

Workflow in the worker:

  1. Worker fetches log_submissions with state = 'pending' or due for retry.

  2. Set state = 'in_progress' (optimistic update).

  3. Call client.submitDsseEnvelope.

  4. If success:

    • Update state = 'succeeded', set rekor_log_index, rekor_log_id.
  5. If Rekor indicates “already exists” (or returns same logIndex for same envelope):

    • Treat as success, update state = 'succeeded'.
  6. On network/5xx errors:

    • Increment attempt_count.
    • If attempt_count < MAX_RETRIES: schedule retry with backoff.
    • Else: state = 'failed_permanent', keep last_error.

DB constraint: UNIQUE(target, submission_key) ensures we don't create conflicting jobs.
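The retry schedule in step 6 can be computed as exponential backoff with full jitter — a common pattern, sketched here with illustrative constants:

```typescript
// Assumed constants; in practice these come from the retry policy config.
const BACKOFF_BASE_MS = 1000;          // first retry window
const BACKOFF_CAP_MS = 5 * 60 * 1000;  // never wait more than 5 minutes

// Exponential backoff with full jitter: the window doubles per attempt
// (capped), and the actual delay is uniform in [0, window).
function nextRetryDelayMs(attemptCount: number): number {
  const windowMs = Math.min(BACKOFF_CAP_MS, BACKOFF_BASE_MS * 2 ** attemptCount);
  return Math.floor(Math.random() * windowMs);
}
```

Full jitter spreads retries out so a burst of failed submissions doesn't hammer Rekor in lockstep.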


5. Attestation Service API design

5.1 Create attestation (build/scan pipeline → Attestation Service)

POST /v1/attestations

Request body (example):

{
  "subject": {
    "uri": "registry.example.com/app@sha256:abcd...",
    "digest": "sha256:abcd..."
  },
  "payloadType": "sbom.spdx+json",
  "payload": {
    "encoding": "base64",
    "data": "<base64-encoded-sbom-or-scan>"
  },
  "predicateType": "https://slsa.dev/provenance/v1",
  "logTargets": ["internal", "private_rekor", "public_rekor"],
  "airgappedMode": false,
  "labels": {
    "team": "payments",
    "env": "prod"
  }
}

Server behavior:

  1. Validate subject & payload.

  2. Chunk payload as per rules (section 3).

  3. Store payload chunks.

  4. Build manifest JSON & DSSE envelope.

  5. Insert attestations, payload_manifests, payload_chunks.

  6. For each logTargets:

    • Insert log_submissions row with state = 'pending'.
  7. Optionally construct Sigstore bundle representing:

    • DSSE envelope
    • Transparency log entry (when available) — for async, you can fill this later.
  8. Return 202 Accepted with resource URL:

{
  "attestationId": "1f4b3d...",
  "status": "pending_logs",
  "subjectDigest": "sha256:abcd...",
  "logTargets": ["internal", "private_rekor", "public_rekor"],
  "links": {
    "self": "/v1/attestations/1f4b3d...",
    "bundle": "/v1/attestations/1f4b3d.../bundle"
  }
}

5.2 Get attestation status

GET /v1/attestations/{id}

Returns:

{
  "attestationId": "1f4b3d...",
  "subjectDigest": "sha256:abcd...",
  "predicateType": "https://slsa.dev/provenance/v1",
  "logs": {
    "internal": {
      "state": "succeeded"
    },
    "private_rekor": {
      "state": "succeeded",
      "logIndex": 1234,
      "logId": "..."
    },
    "public_rekor": {
      "state": "pending",
      "lastError": null
    }
  },
  "createdAt": "2025-11-27T12:34:56Z"
}

5.3 Get bundle

GET /v1/attestations/{id}/bundle

  • Returns a Sigstore bundle JSON that:

    • Contains either:

      • Only the DSSE + identity + certificate chain (if logs not yet written).
      • Or DSSE + log entries (hashedrekord / dsse entries) for whichever logs are ready. (Sigstore)
  • This is what air-gapped exports and verifiers consume.


6. Air-gapped workflows

6.1 In the airgapped environment

  • Attestation Service runs in “airgapped mode”:

    • logTargets typically = ["internal", "private_rekor"].
    • No direct public Rekor.
  • Offline Exporter CLI:

    stellaops-offline-export \
      --since-id <last_exported_attestation_id> \
      --output offline-bundle-<timestamp>.tar.gz
    
  • Exporter logic:

    1. Query DB for new attestations > since-id.

    2. For each attestation:

      • Fetch DSSE envelope.
      • Fetch current log statuses (private rekor, internal).
      • Build or reuse Sigstore bundle JSON.
      • Optionally include payload chunks and/or original payload.
    3. Write them into a tarball with structure like:

      /attestations/<id>/bundle.json
      /attestations/<id>/chunks/chunk-0000.bin
      ...
      /meta/export-metadata.json
      

6.2 In the connected environment

  • Replay Service:

    stellaops-offline-replay \
      --input offline-bundle-<timestamp>.tar.gz \
      --public-rekor-url https://rekor.sigstore.dev
    
  • Replay logic:

    1. Read each /attestations/<id>/bundle.json.

    2. If public_rekor entry not present:

      • Extract DSSE envelope from bundle.
      • Call Attestation Service “import & log” endpoint or directly call PublicRekorClient.
      • Build new updated bundle (with public tlog entry).
    3. Emit an updated result.json for each attestation (so you can sync status back to original environment if needed).
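Step 2's "is a public entry present?" check can be sketched as below. The field path (`verificationMaterial.tlogEntries`) follows the Sigstore bundle JSON format, but treat it as an assumption and validate against the protobuf-specs schema you actually ship:

```typescript
// Minimal view of a Sigstore bundle: we only care whether any
// transparency-log entries are recorded in its verification material.
interface BundleLike {
  verificationMaterial?: { tlogEntries?: unknown[] };
}

// A bundle with no tlog entries still needs to be replayed to the
// public Rekor instance; one with entries can be skipped.
function needsPublicRekorReplay(bundle: BundleLike): boolean {
  const entries = bundle.verificationMaterial?.tlogEntries ?? [];
  return entries.length === 0;
}
```

A fuller implementation would also distinguish *which* log the entries belong to (private vs. public) by log ID before deciding to skip.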


7. Observability & ops

7.1 Metrics

Have devs expose at least:

  • rekor_submit_requests_total{target, outcome}
  • rekor_submit_latency_seconds{target} (histogram)
  • log_submissions_in_queue{target}
  • attestations_total{predicateType}
  • attestation_payload_bytes{bucket} (distribution of payload sizes)

7.2 Logging

  • Log at info:

    • Attestation created (subject digest, predicateType, manifest version).
    • Log submission succeeded (target, logIndex, logId).
  • Log at warn/error:

    • Any permanent failure.
    • Any time DSSE payload nearly exceeds size threshold (to catch misconfig).

7.3 Feature flags

  • FEATURE_REKOR_PUBLIC_ENABLED
  • FEATURE_REKOR_PRIVATE_ENABLED
  • FEATURE_OFFLINE_EXPORT_ENABLED
  • FEATURE_CHUNKING_ENABLED (to allow rolling rollout)

8. Concrete work breakdown for developers

You can basically drop this as a backlog outline:

  1. Domain model & storage

    • Implement DB migrations for attestations, payload_manifests, payload_chunks, log_submissions.
    • Implement object storage abstraction and content-addressable layout for chunks.
  2. Attestation Service skeleton

    • Implement POST /v1/attestations with basic validation.
    • Implement manifest building and DSSE envelope creation (no Rekor yet).
    • Persist records in DB.
  3. Chunking & manifest logic

    • Implement chunker with thresholds & tests (small vs large).
    • Implement manifest JSON builder.
    • Ensure DSSE payload size is under configurable limit.
  4. Rekor client & log submissions

    • Implement TransparencyLogClient interface + Public/Private implementations.
    • Implement log_submissions worker (queue + backoff + idempotency).
    • Wire worker into service config and deployment.
  5. Sigstore bundle support

    • Implement bundle builder given DSSE envelope + log metadata.
    • Add GET /v1/attestations/{id}/bundle.
  6. Offline export & replay

    • Implement Exporter CLI (queries DB, packages bundles and chunks).
    • Implement Replay CLI/service (reads tarball, logs to public Rekor).
    • Document operator workflow for moving tarballs between environments.
  7. Observability & docs

    • Add metrics, logs, and dashboards.
    • Write verification docs: “How to fetch manifest, verify DSSE, reconstruct payload, and check Rekor.”

If you'd like, the next step I can take is to turn this into a stricter format your devs might already use (e.g. Jira epics + stories, or a design doc template with headers like "Motivation, Alternatives, Risks, Rollout Plan").