38 lines
3.3 KiB
Markdown
38 lines
3.3 KiB
Markdown
# Job Lifecycle State Machine
|
|
|
|
## Module
|
|
Orchestrator
|
|
|
|
## Status
|
|
IMPLEMENTED
|
|
|
|
## Description
|
|
Job scheduling with Postgres-backed job repository, event envelope domain model, and air-gap compatible scheduling tests.
|
|
|
|
## Implementation Details
|
|
- **Modules**: `src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.Core/Scheduling/`, `src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.Core/Domain/`
|
|
- **Key Classes**:
|
|
- `JobStateMachine` (`src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.Core/Scheduling/JobStateMachine.cs`) - finite state machine governing job lifecycle transitions (Pending -> Scheduled -> Running -> Completed/Failed/Cancelled)
|
|
- `JobScheduler` (`src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.Core/Scheduling/JobScheduler.cs`) - schedules jobs based on state machine rules and DAG dependencies
|
|
- `RetryPolicy` (`src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.Core/Scheduling/RetryPolicy.cs`) - configurable retry policy for failed jobs (max retries, backoff strategy)
|
|
- `Job` (`src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.Core/Domain/Job.cs`) - job entity with current status, attempts, and metadata
|
|
- `JobStatus` (`src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.Core/Domain/JobStatus.cs`) - enum defining all valid job states
|
|
- `JobHistory` (`src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.Core/Domain/JobHistory.cs`) - historical record of all state transitions with timestamps
|
|
- `EventEnvelope` (`src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.Core/Domain/Events/EventEnvelope.cs`) - typed event envelope emitted on state transitions
|
|
- `TimelineEvent` (`src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.Core/Domain/Events/TimelineEvent.cs`) - timeline event for job lifecycle tracking
|
|
- `TimelineEventEmitter` (`src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.Core/Domain/Events/TimelineEventEmitter.cs`) - emits timeline events on state transitions
|
|
- `JobEndpoints` (`src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.WebService/Endpoints/JobEndpoints.cs`) - REST API for job management
|
|
- `JobContracts` (`src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.WebService/Contracts/JobContracts.cs`) - API contracts for job operations
|
|
- **Interfaces**: `IJobRepository` (`src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.Infrastructure/Repositories/IJobRepository.cs`), `IJobHistoryRepository` (`src/JobEngine/StellaOps.JobEngine/StellaOps.JobEngine.Infrastructure/Repositories/IJobHistoryRepository.cs`)
|
|
- **Source**: Feature matrix scan
|
|
|
|
## E2E Test Plan
|
|
- [ ] Create a job via `JobEndpoints` and verify initial state is Pending
|
|
- [ ] Schedule the job via `JobScheduler` and verify state transition: Pending -> Scheduled, with `TimelineEvent` emitted
|
|
- [ ] Start the job and verify `JobStateMachine` transition: Scheduled -> Running
|
|
- [ ] Complete the job and verify transition: Running -> Completed with completion timestamp in `JobHistory`
|
|
- [ ] Fail the job and verify transition: Running -> Failed with retry attempt incremented
|
|
- [ ] Verify `RetryPolicy`: fail a job with max_retries=3 and verify it re-enters Scheduled up to 3 times before terminal failure
|
|
- [ ] Attempt an invalid transition (e.g., Completed -> Running) and verify `JobStateMachine` rejects it
|
|
- [ ] Verify air-gap scheduling: schedule a job in sealed mode and verify it does not attempt network egress
|