HLC Job Sync Offline Operations

Sprint: SPRINT_20260105_002_003_ROUTER

This document describes the offline job synchronization mechanism using Hybrid Logical Clock (HLC) ordering for air-gap scenarios.

Overview

When nodes operate in disconnected/offline mode, scheduled jobs are enqueued locally with HLC timestamps. Upon reconnection or air-gap transfer, these job logs are merged deterministically to maintain global ordering.

Key features:

  • Deterministic ordering: Jobs merge by HLC total order (T_hlc.PhysicalTime, T_hlc.LogicalCounter, NodeId, JobId)
  • Chain integrity: Each entry links to the previous via link = Hash(prev_link || job_id || t_hlc || payload_hash)
  • Conflict-free: JobIds are computed deterministically, so the same payload always yields the same JobId and duplicates are safely dropped
  • Audit trail: Source node ID and original links preserved for traceability
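The chain-integrity formula above can be sketched in Python. This is a minimal illustration, not the actual implementation: it assumes SHA-256 as the hash, UTF-8 encoding for job_id and t_hlc, plain byte concatenation for ||, and an empty prev_link for the first entry. The function name compute_link is hypothetical.

```python
import hashlib
from typing import Optional


def compute_link(prev_link: Optional[bytes], job_id: str, t_hlc: str,
                 payload_hash: bytes) -> bytes:
    """link = Hash(prev_link || job_id || t_hlc || payload_hash).

    Assumptions (not confirmed by the source): SHA-256, UTF-8 field
    encoding, and an empty byte string standing in for a missing prev_link.
    """
    h = hashlib.sha256()
    h.update(prev_link or b"")          # first entry has no predecessor
    h.update(job_id.encode("utf-8"))
    h.update(t_hlc.encode("utf-8"))
    h.update(payload_hash)
    return h.digest()
```

Because each link folds in the previous one, changing any earlier entry changes every link after it, which is what makes tampering detectable.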

CLI Commands

Export Job Logs

Export offline job logs to a sync bundle for air-gap transfer:

# Export job logs for a tenant
stella airgap jobs export --tenant my-tenant -o job-sync-bundle.json

# Export with verbose output
stella airgap jobs export --tenant my-tenant -o bundle.json --verbose

# Export and print the result as JSON for automation
stella airgap jobs export --tenant my-tenant --json

Options:

  • --tenant, -t - Tenant ID (defaults to "default")
  • --output, -o - Output file path
  • --node - Export specific node only (default: current node)
  • --sign - Sign bundle with DSSE
  • --json - Output result as JSON
  • --verbose - Enable verbose logging

Import Job Logs

Import a job sync bundle from air-gap transfer:

# Verify bundle without importing
stella airgap jobs import bundle.json --verify-only

# Import bundle
stella airgap jobs import bundle.json

# Force import despite validation issues
stella airgap jobs import bundle.json --force

# Import with JSON output for automation
stella airgap jobs import bundle.json --json

Options:

  • bundle - Path to job sync bundle file (required)
  • --verify-only - Only verify the bundle without importing
  • --force - Force import even if validation fails
  • --json - Output result as JSON
  • --verbose - Enable verbose logging
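For automation, a verify-then-import wrapper might look like the following Python sketch. It uses only the flags listed above. It assumes, without confirmation from this guide, that the CLI exits with a non-zero status when verification or import fails. The names run_cli and verify_then_import are hypothetical.

```python
import subprocess
from typing import Callable, List


def run_cli(argv: List[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(argv).returncode


def verify_then_import(bundle_path: str,
                       run: Callable[[List[str]], int] = run_cli) -> bool:
    """Verify the bundle first; import only if verification succeeds.

    Assumption: a non-zero exit code signals failure.
    """
    base = ["stella", "airgap", "jobs", "import", bundle_path]
    if run(base + ["--verify-only"]) != 0:
        return False                    # never falls back to --force
    return run(base) == 0
```

The runner is injectable so the sequence can be exercised in tests without the stella binary installed.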

List Available Bundles

List job sync bundles in a directory:

# List bundles in current directory
stella airgap jobs list

# List bundles in specific directory
stella airgap jobs list --source /path/to/bundles

# Output as JSON
stella airgap jobs list --json

Options:

  • --source, -s - Source directory (default: current directory)
  • --json - Output result as JSON
  • --verbose - Enable verbose logging

Bundle Format

Job sync bundles are JSON files with the following structure:

{
  "bundleId": "guid",
  "tenantId": "string",
  "createdAt": "ISO8601",
  "createdByNodeId": "string",
  "manifestDigest": "sha256:hex",
  "signature": "base64 (optional)",
  "signedBy": "keyId (optional)",
  "jobLogs": [
    {
      "nodeId": "string",
      "lastHlc": "HLC timestamp string",
      "chainHead": "base64",
      "entries": [
        {
          "nodeId": "string",
          "tHlc": "HLC timestamp string",
          "jobId": "guid",
          "partitionKey": "string (optional)",
          "payload": "JSON string",
          "payloadHash": "base64",
          "prevLink": "base64 (null for first)",
          "link": "base64",
          "enqueuedAt": "ISO8601"
        }
      ]
    }
  ]
}
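As a quick way to inspect a bundle before import, the structure above can be read with any JSON parser. The Python sketch below counts entries per node using only the field names shown in the format. The helper name entries_per_node is hypothetical.

```python
import json
from typing import Dict


def entries_per_node(bundle_json: str) -> Dict[str, int]:
    """Return the number of job log entries recorded for each nodeId."""
    bundle = json.loads(bundle_json)
    counts: Dict[str, int] = {}
    for log in bundle["jobLogs"]:
        node_id = log["nodeId"]
        counts[node_id] = counts.get(node_id, 0) + len(log["entries"])
    return counts
```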

Validation

Bundle validation checks:

  1. Manifest digest: Recomputes the digest from the job logs and compares it with manifestDigest
  2. Chain integrity: Verifies that each entry's prev_link equals the link of the preceding entry (null for the first entry)
  3. Link verification: Recomputes each link and compares it with the stored value
  4. Chain head: Verifies that the last entry's link matches the node's chain head
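Checks 2 to 4 can be sketched per node log as follows. This is an illustration under stated assumptions, not the shipped validator: the link hash is assumed to be SHA-256 over UTF-8 encoded fields, and the function names are hypothetical. The manifest digest check is omitted because this guide does not specify how the digest is canonicalized.

```python
import base64
import hashlib
from typing import List, Optional


def compute_link(prev_link: Optional[bytes], job_id: str, t_hlc: str,
                 payload_hash: bytes) -> bytes:
    # Assumed encoding: SHA-256 over prev_link || job_id || t_hlc || payload_hash.
    data = (prev_link or b"") + job_id.encode() + t_hlc.encode() + payload_hash
    return hashlib.sha256(data).digest()


def validate_node_log(log: dict) -> List[str]:
    """Return a list of validation errors for one jobLogs element."""
    errors: List[str] = []
    expected_prev: Optional[bytes] = None
    for i, e in enumerate(log["entries"]):
        prev = base64.b64decode(e["prevLink"]) if e.get("prevLink") else None
        if prev != expected_prev:       # check 2: chain integrity
            errors.append(f"entry {i}: prevLink does not match previous link")
        link = base64.b64decode(e["link"])
        recomputed = compute_link(prev, e["jobId"], e["tHlc"],
                                  base64.b64decode(e["payloadHash"]))
        if recomputed != link:          # check 3: link verification
            errors.append(f"entry {i}: stored link differs from recomputed link")
        expected_prev = link
    head = base64.b64decode(log["chainHead"]) if log.get("chainHead") else None
    if head != expected_prev:           # check 4: chain head
        errors.append("chainHead does not match last entry link")
    return errors
```

Collecting every error instead of stopping at the first makes it easier to identify which entry broke the chain during recovery.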

Merge Algorithm

When importing bundles from multiple nodes:

  1. Collect: Gather all entries from all node logs
  2. Sort: Order by HLC total order (PhysicalTime, LogicalCounter, NodeId, JobId)
  3. Deduplicate: Entries with the same JobId carry the same payload, so keep the earliest and drop later duplicates
  4. Recompute chain: Build unified chain from merged entries

This produces a deterministic ordering regardless of import sequence.
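Steps 1 to 3 can be sketched in Python as shown below. The Entry type and its field names are hypothetical, and the HLC is assumed to be already parsed into integer physical time and logical counter, since this guide does not define the timestamp string format. Chain recomputation (step 4) is left out.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Entry:
    physical_time: int
    logical_counter: int
    node_id: str
    job_id: str


def hlc_order_key(e: Entry) -> Tuple[int, int, str, str]:
    """Total order: (PhysicalTime, LogicalCounter, NodeId, JobId)."""
    return (e.physical_time, e.logical_counter, e.node_id, e.job_id)


def merge(node_logs: Iterable[Iterable[Entry]]) -> List[Entry]:
    # 1. Collect entries from every node log.
    all_entries = [e for log in node_logs for e in log]
    # 2. Sort by the HLC total order.
    all_entries.sort(key=hlc_order_key)
    # 3. Deduplicate: keep the earliest entry for each JobId.
    seen: Set[str] = set()
    merged: List[Entry] = []
    for e in all_entries:
        if e.job_id not in seen:
            seen.add(e.job_id)
            merged.append(e)
    return merged
```

Because the sort key is a total order over all entries, merge([a, b]) and merge([b, a]) return the same list, which is the determinism property stated above.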

Conflict Resolution

| Scenario | Resolution |
|----------|------------|
| Same JobId, same payload, different HLC | Take earliest HLC, drop duplicates |
| Same JobId, different payloads | Error - indicates bug in deterministic ID computation |
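The two scenarios can be expressed as a small resolution function. This Python sketch is illustrative only: the Entry type, the PayloadMismatchError exception, and the use of NodeId as a tie-breaker for equal HLC values are assumptions, not names or behavior taken from the codebase.

```python
from dataclasses import dataclass
from typing import Tuple


class PayloadMismatchError(Exception):
    """Same JobId with different payloads: a bug in ID computation."""


@dataclass(frozen=True)
class Entry:
    hlc: Tuple[int, int]    # (physical time, logical counter), assumed parsed
    node_id: str
    job_id: str
    payload_hash: str


def resolve(existing: Entry, incoming: Entry) -> Entry:
    """Pick the surviving entry for two entries that share a JobId."""
    if existing.job_id != incoming.job_id:
        raise ValueError("resolve() expects entries with the same JobId")
    if existing.payload_hash != incoming.payload_hash:
        raise PayloadMismatchError(existing.job_id)
    # Same payload: keep the entry with the earliest HLC.
    return min(existing, incoming, key=lambda e: (e.hlc, e.node_id))
```

Raising on a payload mismatch, rather than picking a winner, keeps the error visible so the root cause gets fixed instead of silently resolved.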

Metrics

The following metrics are emitted:

| Metric | Type | Description |
|--------|------|-------------|
| airgap_bundles_exported_total | Counter | Total bundles exported |
| airgap_bundles_imported_total | Counter | Total bundles imported |
| airgap_jobs_synced_total | Counter | Total jobs synced |
| airgap_duplicates_dropped_total | Counter | Duplicates dropped during merge |
| airgap_merge_conflicts_total | Counter | Merge conflicts by type |
| airgap_offline_enqueues_total | Counter | Offline enqueue operations |
| airgap_bundle_size_bytes | Histogram | Bundle size distribution |
| airgap_sync_duration_seconds | Histogram | Sync operation duration |
| airgap_merge_entries_count | Histogram | Entries per merge operation |

Service Registration

To use job sync in your application:

// Register core services
services.AddAirGapSyncServices(nodeId: "my-node-id");

// Register file-based transport (for air-gap)
services.AddFileBasedJobSyncTransport();

// Or router-based transport (for connected scenarios)
services.AddRouterJobSyncTransport();

// Register sync service (requires ISyncSchedulerLogRepository)
services.AddAirGapSyncImportService();

Operational Runbook

Pre-Export Checklist

  • Node has offline job logs to export
  • Target path is writable
  • Signing key available (if --sign used)

Pre-Import Checklist

  • Bundle file accessible
  • Bundle signature verified (if signed)
  • Scheduler database accessible
  • Sufficient disk space

Recovery Procedures

Chain validation failure:

  1. Identify which entry breaks the chain
  2. Check for data corruption in bundle
  3. Re-export from source node if possible
  4. Use --force only if data loss is acceptable

Duplicate conflict:

  1. This is expected - duplicates are safely dropped
  2. Check duplicate count in output
  3. Verify merged jobs match expected count

Payload mismatch (same JobId, different payloads):

  1. This indicates a bug - the same idempotency key should always produce the same payload
  2. Review job generation logic
  3. Do not force import - fix root cause

See Also