Orchestrator architecture

Runtime components

  • WebService: REST and WebSocket API for DAG definitions, runs, and admin actions.
  • Scheduler: cron and timer triggers that enqueue run intents.
  • Worker: executes DAG steps, enforces resource limits, and reports telemetry.
  • Plugin host: loads task plugins from signed offline bundles.

Data model

  • DAG: directed acyclic graph with deterministic topological ordering.
  • Run: immutable record with runId, dagVersion, tenant, inputsHash, status, traceId, startedUtc, endedUtc.
  • Step execution: stepId, inputsHash, outputsHash, status, attempt, durationMs, logsRef, metricsRef.
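
As a rough illustration of the data model above, the two records could be sketched as frozen Python dataclasses. The field names mirror the bullets; the Python types, the status strings, and the use of ISO-8601 strings for timestamps are assumptions made for this sketch, not part of the specification.

```python
from dataclasses import dataclass, replace, FrozenInstanceError
from typing import Optional


@dataclass(frozen=True)
class Run:
    """Immutable run record; field types are assumed for illustration."""
    run_id: str
    dag_version: str
    tenant: str
    inputs_hash: str          # lower-case hex
    status: str               # e.g. "running", "succeeded" (hypothetical values)
    trace_id: str
    started_utc: str          # UTC timestamp, assumed ISO-8601 text
    ended_utc: Optional[str] = None


@dataclass(frozen=True)
class StepExecution:
    """One attempt at one step of a run."""
    step_id: str
    inputs_hash: str
    outputs_hash: str
    status: str
    attempt: int
    duration_ms: int
    logs_ref: str
    metrics_ref: str


# Because the record is frozen, a state change yields a new record
# instead of mutating the existing one.
started = Run("r1", "v1", "tenant-a", "ab12", "running", "t1",
              "2025-01-01T00:00:00Z")
finished = replace(started, status="succeeded",
                   ended_utc="2025-01-01T00:01:00Z")
```

Modelling a status change as a new record is one way to read "immutable record"; an append-only table of run events would be another.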

Execution flow

  • Run creation is idempotent on runToken, dagId, and inputsHash.
  • Scheduler enqueues run intent to a tenant queue.
  • Worker reconstructs the DAG order, executes steps, and applies retries with backoff.
  • WebSocket streams run and step status updates.
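
The idempotency rule in the first bullet can be sketched as a lookup keyed on the triple (runToken, dagId, inputsHash). This is a minimal in-memory sketch: the `RunRegistry` class, its method names, and the UUID-based run id are hypothetical, and a real implementation would presumably rely on a unique constraint in PostgreSQL rather than a dictionary.

```python
import uuid
from typing import Dict, Tuple


class RunRegistry:
    """Hypothetical in-memory stand-in for the run table."""

    def __init__(self) -> None:
        self._runs: Dict[Tuple[str, str, str], str] = {}

    def create_run(self, run_token: str, dag_id: str,
                   inputs_hash: str) -> Tuple[str, bool]:
        """Return (run_id, created). Repeating a request returns the same run."""
        key = (run_token, dag_id, inputs_hash)
        existing = self._runs.get(key)
        if existing is not None:
            return existing, False
        run_id = uuid.uuid4().hex  # assumed id scheme, for illustration only
        self._runs[key] = run_id
        return run_id, True


registry = RunRegistry()
first_id, first_created = registry.create_run("tok-1", "nightly-scan", "ab12")
again_id, again_created = registry.create_run("tok-1", "nightly-scan", "ab12")
other_id, other_created = registry.create_run("tok-1", "nightly-scan", "cd34")
```

Here the second call returns the run created by the first, while a different inputsHash produces a separate run.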

Storage and queues

  • PostgreSQL stores DAG specs, versions, and run history.
  • Queues are per-tenant FIFO in PostgreSQL or Valkey-backed lists.
  • Artifacts are content-addressed and stored in object storage or large objects.
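
Content addressing means an artifact's key is derived from its bytes, so identical content is stored once and can be verified on read. The sketch below assumes SHA-256 as the hash function (the document does not name one) and uses a dictionary in place of object storage; `ArtifactStore` is a hypothetical name.

```python
import hashlib
from typing import Dict


class ArtifactStore:
    """Hypothetical in-memory content-addressed store."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # hexdigest() returns lower-case hex.
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)  # identical content is stored once
        return digest

    def get(self, digest: str) -> bytes:
        data = self._blobs[digest]
        # Re-hash on read so corruption is detected rather than returned.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("artifact content does not match its address")
        return data


store = ArtifactStore()
key_a = store.put(b"step output")
key_b = store.put(b"step output")   # same bytes, same key
```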

Security and AOC alignment

  • Tenant header required on every request; cross-tenant DAGs are forbidden.
  • Scopes: orchestrator:read, orchestrator:write, orchestrator:admin.
  • AOC alignment: the orchestrator only schedules and records; it makes no policy decisions.
  • Step sandboxing enforces CPU and memory limits; network egress is denied by default.
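
A request check combining the tenant and scope rules might look like the sketch below. The header name `X-Tenant`, the `AccessDenied` exception, and the `authorize` function are all hypothetical. The sketch also assumes scopes are matched exactly, since the document does not say whether `orchestrator:admin` implies the other two.

```python
from typing import Mapping, Set

TENANT_HEADER = "X-Tenant"  # hypothetical header name
KNOWN_SCOPES = {"orchestrator:read", "orchestrator:write", "orchestrator:admin"}


class AccessDenied(Exception):
    """Raised when a request fails the tenant or scope checks."""


def authorize(headers: Mapping[str, str], granted_scopes: Set[str],
              required_scope: str, dag_tenant: str) -> str:
    """Return the caller's tenant if the request may touch the DAG."""
    if required_scope not in KNOWN_SCOPES:
        raise ValueError(f"unknown scope: {required_scope}")
    tenant = headers.get(TENANT_HEADER, "").strip()
    if not tenant:
        raise AccessDenied("tenant header is required on every request")
    if required_scope not in granted_scopes:
        raise AccessDenied(f"missing scope {required_scope}")
    if dag_tenant != tenant:
        raise AccessDenied("cross-tenant DAG access is forbidden")
    return tenant


tenant = authorize({"X-Tenant": "acme"}, {"orchestrator:read"},
                   "orchestrator:read", dag_tenant="acme")
```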

Determinism

  • Step ordering uses topological order with lexical tie-breaks.
  • Retries preserve traceId and reuse the same runToken.
  • Timestamps are UTC; hashes are lower-case hex.
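
One common way to get a topological order with lexical tie-breaks is Kahn's algorithm with a min-heap as the ready set: whenever several steps are runnable, the lexically smallest step id is taken first. The sketch below assumes a DAG given as a mapping from each step id to the ids it depends on; the function name and input shape are illustrative, not the orchestrator's actual API.

```python
import heapq
from typing import Dict, List, Set


def deterministic_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Topologically sort steps, breaking ties by step id.

    deps maps a step id to the set of step ids it depends on.
    """
    nodes = set(deps)
    for parents in deps.values():
        nodes |= set(parents)

    indegree = {n: 0 for n in nodes}
    dependents: Dict[str, List[str]] = {n: [] for n in nodes}
    for step, parents in deps.items():
        for parent in parents:
            indegree[step] += 1
            dependents[parent].append(step)

    # The heap always yields the lexically smallest ready step.
    ready = [n for n, d in indegree.items() if d == 0]
    heapq.heapify(ready)

    order: List[str] = []
    while ready:
        step = heapq.heappop(ready)
        order.append(step)
        for child in dependents[step]:
            indegree[child] -= 1
            if indegree[child] == 0:
                heapq.heappush(ready, child)

    if len(order) != len(nodes):
        raise ValueError("cycle detected; input is not a DAG")
    return order


steps = {"fetch": set(), "build": {"fetch"}, "lint": {"fetch"}, "test": {"build"}}
print(deterministic_order(steps))  # ['fetch', 'build', 'lint', 'test']
```

After `fetch`, both `build` and `lint` are ready; `build` sorts first, so the order is the same on every worker regardless of dictionary or set iteration order.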

Offline posture

  • DAG specs and plugins are loaded from offline bundles with signatures.
  • Exports of runs, steps, and logs are available as NDJSON.
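
NDJSON is one JSON object per line, so an export can be streamed and processed line by line. The sketch below shows one way to produce it. The record fields are illustrative, and sorting keys with compact separators is an assumption made here so that identical records serialize to identical bytes.

```python
import json
from typing import Any, Dict, Iterable


def to_ndjson(records: Iterable[Dict[str, Any]]) -> str:
    """Serialize records as newline-delimited JSON, one object per line."""
    return "".join(
        json.dumps(record, sort_keys=True, separators=(",", ":")) + "\n"
        for record in records
    )


export = to_ndjson([
    {"runId": "r1", "status": "succeeded"},
    {"status": "failed", "runId": "r2"},
])
print(export, end="")
# {"runId":"r1","status":"succeeded"}
# {"runId":"r2","status":"failed"}
```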

Observability

  • Traces: orchestrator.run and orchestrator.step with tenant, dagId, runId, stepId.
  • Metrics: orchestrator_runs_total, orchestrator_run_duration_seconds, orchestrator_queue_depth.
  • Logs: structured JSON with trace_id, tenant, dagId, runId, stepId.
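
A structured log line carrying the correlation fields listed above could be built as in the following sketch. The `message` and `level` keys and the `log_line` helper are assumptions; only `trace_id`, `tenant`, `dagId`, `runId`, and `stepId` come from the document.

```python
import json
from typing import Optional


def log_line(message: str, *, trace_id: str, tenant: str, dag_id: str,
             run_id: str, step_id: Optional[str] = None,
             level: str = "info") -> str:
    """Render one structured JSON log line with the correlation fields."""
    fields = {
        "level": level,        # assumed field
        "message": message,    # assumed field
        "trace_id": trace_id,
        "tenant": tenant,
        "dagId": dag_id,
        "runId": run_id,
    }
    if step_id is not None:
        # Run-level events have no step, so the key is omitted.
        fields["stepId"] = step_id
    return json.dumps(fields, sort_keys=True)


line = log_line("step finished", trace_id="t1", tenant="acme",
                dag_id="nightly-scan", run_id="r1", step_id="build")
```

Using the same field names in traces and logs lets a single runId or trace_id filter join the two.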