# HLC Job Sync Offline Operations

Sprint: SPRINT_20260105_002_003_ROUTER

This document describes the offline job synchronization mechanism using Hybrid Logical Clock (HLC) ordering for air-gap scenarios.

## Overview

When nodes operate in disconnected (offline) mode, scheduled jobs are enqueued locally with HLC timestamps. On reconnection or air-gap transfer, the per-node job logs are merged deterministically to maintain a global ordering.

Key features:
- **Deterministic ordering**: Jobs merge by HLC total order `(T_hlc.PhysicalTime, T_hlc.LogicalCounter, NodeId, JobId)`
- **Chain integrity**: Each entry links to the previous one via `link = Hash(prev_link || job_id || t_hlc || payload_hash)`
- **Conflict-free**: The JobId is derived deterministically, so the same payload always yields the same JobId and duplicates can be safely dropped
- **Audit trail**: The source node ID and original links are preserved for traceability

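
The ordering and chain-link features above can be sketched in a few lines of Python. This is a minimal illustration, not the actual implementation: the `Hlc` and `Entry` types are hypothetical, and the choice of SHA-256 and the byte encoding of each field are assumptions, since this document does not specify the hash function or serialization.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Hlc:
    physical_time: int    # assumed unit: wall-clock milliseconds
    logical_counter: int  # breaks ties within the same physical time


@dataclass(frozen=True)
class Entry:
    node_id: str
    job_id: str
    t_hlc: Hlc
    payload_hash: bytes


def order_key(e: Entry) -> tuple:
    """HLC total order: (PhysicalTime, LogicalCounter, NodeId, JobId)."""
    return (e.t_hlc.physical_time, e.t_hlc.logical_counter, e.node_id, e.job_id)


def compute_link(prev_link: Optional[bytes], e: Entry) -> bytes:
    """link = Hash(prev_link || job_id || t_hlc || payload_hash).

    SHA-256 and this field encoding are assumptions for illustration.
    """
    h = hashlib.sha256()
    h.update(prev_link or b"")  # the first entry has no previous link
    h.update(e.job_id.encode("utf-8"))
    h.update(f"{e.t_hlc.physical_time}:{e.t_hlc.logical_counter}".encode("utf-8"))
    h.update(e.payload_hash)
    return h.digest()
```

Sorting any collection of entries with `sorted(entries, key=order_key)` gives the same result on every node, because the node ID and job ID break ties between equal HLC timestamps.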
## CLI Commands

### Export Job Logs

Export offline job logs to a sync bundle for air-gap transfer:

```bash
# Export job logs for a tenant
stella airgap jobs export --tenant my-tenant -o job-sync-bundle.json

# Export with verbose output
stella airgap jobs export --tenant my-tenant -o bundle.json --verbose

# Print the export result as JSON for automation
stella airgap jobs export --tenant my-tenant --json
```

Options:
- `--tenant, -t` - Tenant ID (defaults to "default")
- `--output, -o` - Output file path
- `--node` - Export specific node only (default: current node)
- `--sign` - Sign bundle with DSSE
- `--json` - Output result as JSON
- `--verbose` - Enable verbose logging

### Import Job Logs

Import a job sync bundle from air-gap transfer:

```bash
# Verify bundle without importing
stella airgap jobs import bundle.json --verify-only

# Import bundle
stella airgap jobs import bundle.json

# Force import despite validation issues
stella airgap jobs import bundle.json --force

# Import with JSON output for automation
stella airgap jobs import bundle.json --json
```

Options:
- `bundle` - Path to job sync bundle file (required)
- `--verify-only` - Only verify the bundle without importing
- `--force` - Force import even if validation fails
- `--json` - Output result as JSON
- `--verbose` - Enable verbose logging

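
For automation, a verify-then-import sequence can be wrapped in a small script. The sketch below uses only the `import` subcommand and the `--verify-only` flag listed above. It assumes, without confirmation from this document, that the CLI exits with a non-zero status when verification or import fails; the `run` parameter is a hypothetical hook so the logic can be exercised without the CLI installed.

```python
import subprocess
from typing import Callable, List


def import_commands(bundle_path: str) -> List[List[str]]:
    """Build the verify and import invocations for one bundle."""
    base = ["stella", "airgap", "jobs", "import", bundle_path]
    return [base + ["--verify-only"], base]


def verify_then_import(bundle_path: str, run: Callable = subprocess.run) -> bool:
    """Run verification first and import only if it succeeds.

    Assumption: a non-zero exit status signals failure.
    """
    for cmd in import_commands(bundle_path):
        result = run(cmd)
        if result.returncode != 0:
            return False  # stop at the first failing step; never fall back to --force
    return True
```

The script deliberately never adds `--force`, so a bundle that fails verification is left for an operator to inspect.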
### List Available Bundles

List job sync bundles in a directory:

```bash
# List bundles in current directory
stella airgap jobs list

# List bundles in specific directory
stella airgap jobs list --source /path/to/bundles

# Output as JSON
stella airgap jobs list --json
```

Options:
- `--source, -s` - Source directory (default: current directory)
- `--json` - Output result as JSON
- `--verbose` - Enable verbose logging

## Bundle Format

Job sync bundles are JSON files with the following structure:

```json
{
  "bundleId": "guid",
  "tenantId": "string",
  "createdAt": "ISO8601",
  "createdByNodeId": "string",
  "manifestDigest": "sha256:hex",
  "signature": "base64 (optional)",
  "signedBy": "keyId (optional)",
  "jobLogs": [
    {
      "nodeId": "string",
      "lastHlc": "HLC timestamp string",
      "chainHead": "base64",
      "entries": [
        {
          "nodeId": "string",
          "tHlc": "HLC timestamp string",
          "jobId": "guid",
          "partitionKey": "string (optional)",
          "payload": "JSON string",
          "payloadHash": "base64",
          "prevLink": "base64 (null for first)",
          "link": "base64",
          "enqueuedAt": "ISO8601"
        }
      ]
    }
  ]
}
```

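
As a quick way to inspect a bundle before importing it, the structure above can be read with any JSON library. The following Python sketch is illustrative only: `summarize_bundle` is a hypothetical helper, and it relies solely on the field names shown in the structure above.

```python
import json


def summarize_bundle(bundle_text: str) -> dict:
    """Return the bundle ID, whether it is signed, and entry counts per node."""
    bundle = json.loads(bundle_text)
    return {
        "bundleId": bundle["bundleId"],
        # "signature" is optional, so a missing or empty value means unsigned
        "signed": bool(bundle.get("signature")),
        "entriesPerNode": {
            log["nodeId"]: len(log.get("entries", []))
            for log in bundle.get("jobLogs", [])
        },
    }


if __name__ == "__main__":
    example = json.dumps({
        "bundleId": "example-bundle",
        "tenantId": "my-tenant",
        "jobLogs": [{"nodeId": "node-a", "entries": [{"jobId": "job-1"}]}],
    })
    print(summarize_bundle(example))
```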
## Validation

Bundle validation performs the following checks:
1. **Manifest digest**: Recomputes the digest from the job logs and compares it with `manifestDigest`
2. **Chain integrity**: Verifies that each entry's `prevLink` matches the previous entry's `link` (null for the first entry)
3. **Link verification**: Recomputes each link and verifies it against the stored value
4. **Chain head**: Verifies that the last entry's `link` matches the node's `chainHead`

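
The chain-related checks (2 to 4) can be sketched as follows for a single node log. This is a minimal illustration under stated assumptions: SHA-256 as the hash, links stored as base64, and the field encoding inside `compute_link` are all guesses, since the document does not define them. The manifest digest check is omitted because its computation is not described here.

```python
import base64
import hashlib
from typing import List, Optional


def compute_link(prev_link: Optional[bytes], entry: dict) -> bytes:
    # Assumed encoding: raw previous link, UTF-8 jobId and tHlc, raw payload hash.
    h = hashlib.sha256()
    h.update(prev_link or b"")
    h.update(entry["jobId"].encode("utf-8"))
    h.update(entry["tHlc"].encode("utf-8"))
    h.update(base64.b64decode(entry["payloadHash"]))
    return h.digest()


def validate_node_log(log: dict) -> List[str]:
    """Return a list of problems; an empty list means checks 2-4 passed."""
    problems = []
    prev = None  # link of the previous entry, None before the first one
    for i, entry in enumerate(log["entries"]):
        stored_prev = base64.b64decode(entry["prevLink"]) if entry.get("prevLink") else None
        if stored_prev != prev:
            problems.append(f"entry {i}: prevLink does not match previous link")
        stored_link = base64.b64decode(entry["link"])
        if stored_link != compute_link(stored_prev, entry):
            problems.append(f"entry {i}: link does not match recomputed value")
        prev = stored_link
    head = base64.b64decode(log["chainHead"]) if log.get("chainHead") else None
    if head != prev:
        problems.append("chainHead does not match last entry link")
    return problems
```

Reporting the entry index for each problem makes it easier to locate a chain break when following the recovery procedures.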
## Merge Algorithm

When importing bundles from multiple nodes:

1. **Collect**: Gather all entries from all node logs
2. **Sort**: Order by HLC total order `(PhysicalTime, LogicalCounter, NodeId, JobId)`
3. **Deduplicate**: Entries with the same JobId carry the same payload, so keep the first occurrence and drop later duplicates
4. **Recompute chain**: Build a unified chain from the merged entries

This produces a deterministic ordering regardless of the order in which bundles are imported.

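
The four steps above can be sketched in Python as follows. This is an illustration rather than the actual implementation: the `Entry` type is hypothetical, and the chain recomputation uses SHA-256 over a simple delimited string, which is an assumption about a hash and encoding the document does not specify.

```python
import hashlib
from typing import Iterable, List, NamedTuple, Tuple


class Entry(NamedTuple):
    physical_time: int
    logical_counter: int
    node_id: str
    job_id: str
    payload_hash: str


def order_key(e: Entry) -> tuple:
    """HLC total order: (PhysicalTime, LogicalCounter, NodeId, JobId)."""
    return (e.physical_time, e.logical_counter, e.node_id, e.job_id)


def merge(node_logs: Iterable[Iterable[Entry]]) -> List[Tuple[Entry, str]]:
    # Steps 1 and 2: collect every entry from every node log, then sort.
    entries = sorted((e for log in node_logs for e in log), key=order_key)
    seen = set()
    merged = []
    prev_link = ""
    for e in entries:
        # Step 3: the first occurrence in HLC order wins; later duplicates are dropped.
        if e.job_id in seen:
            continue
        seen.add(e.job_id)
        # Step 4: recompute the link over the unified chain (assumed encoding).
        material = f"{prev_link}|{e.job_id}|{e.physical_time}:{e.logical_counter}|{e.payload_hash}"
        prev_link = hashlib.sha256(material.encode("utf-8")).hexdigest()
        merged.append((e, prev_link))
    return merged
```

Because the sort key is a total order and deduplication runs after sorting, calling `merge` with the node logs in any order yields the same list and the same recomputed links.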
## Conflict Resolution

| Scenario | Resolution |
|----------|------------|
| Same JobId, same payload, different HLC | Keep the entry with the earliest HLC, drop the duplicates |
| Same JobId, different payloads | Error; indicates a bug in deterministic ID computation |

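
The two scenarios in the table can be expressed as a small decision function. The sketch below is hypothetical: `Entry`, `PayloadMismatchError`, and `resolve_duplicate` are illustrative names, and payloads are compared by their hash.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    physical_time: int
    logical_counter: int
    node_id: str
    job_id: str
    payload_hash: str


class PayloadMismatchError(Exception):
    """Same JobId with different payloads: a bug in deterministic ID computation."""


def resolve_duplicate(a: Entry, b: Entry) -> Entry:
    """Pick the surviving entry for two entries that share a JobId."""
    if a.job_id != b.job_id:
        raise ValueError("entries do not share a JobId")
    if a.payload_hash != b.payload_hash:
        # Never resolved automatically; the root cause must be fixed.
        raise PayloadMismatchError(f"payload mismatch for job {a.job_id}")
    # Same payload: keep the entry with the earliest HLC.
    return min(a, b, key=lambda e: (e.physical_time, e.logical_counter, e.node_id))
```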
## Metrics

The following metrics are emitted:

| Metric | Type | Description |
|--------|------|-------------|
| `airgap_bundles_exported_total` | Counter | Total bundles exported |
| `airgap_bundles_imported_total` | Counter | Total bundles imported |
| `airgap_jobs_synced_total` | Counter | Total jobs synced |
| `airgap_duplicates_dropped_total` | Counter | Duplicates dropped during merge |
| `airgap_merge_conflicts_total` | Counter | Merge conflicts by type |
| `airgap_offline_enqueues_total` | Counter | Offline enqueue operations |
| `airgap_bundle_size_bytes` | Histogram | Bundle size distribution |
| `airgap_sync_duration_seconds` | Histogram | Sync operation duration |
| `airgap_merge_entries_count` | Histogram | Entries per merge operation |

## Service Registration

To use job sync in your application:

```csharp
// Register core services
services.AddAirGapSyncServices(nodeId: "my-node-id");

// Register file-based transport (for air-gap)
services.AddFileBasedJobSyncTransport();

// Or router-based transport (for connected scenarios)
services.AddRouterJobSyncTransport();

// Register sync service (requires ISyncSchedulerLogRepository)
services.AddAirGapSyncImportService();
```

## Operational Runbook

### Pre-Export Checklist
- [ ] Node has offline job logs to export
- [ ] Target path is writable
- [ ] Signing key is available (if `--sign` is used)

### Pre-Import Checklist
- [ ] Bundle file is accessible
- [ ] Bundle signature is verified (if signed)
- [ ] Scheduler database is accessible
- [ ] Sufficient disk space is available

### Recovery Procedures

**Chain validation failure:**
1. Identify which entry has the chain break
2. Check for data corruption in the bundle
3. Re-export from the source node if possible
4. Use `--force` only if data loss is acceptable

**Duplicate conflict:**
1. This is expected; duplicates are safely dropped
2. Check the duplicate count in the output
3. Verify that the merged job count matches the expected count

**Payload mismatch (same JobId, different payloads):**
1. This indicates a bug: the same idempotency key should produce the same payload
2. Review the job generation logic
3. Do not force the import; fix the root cause

## See Also

- [Air-Gap Operations](operations.md)
- [Mirror Bundles](mirror-bundles.md)
- [Staleness and Time](staleness-and-time.md)