up
Some checks failed
AOC Guard CI / aoc-guard (push) Has been cancelled
AOC Guard CI / aoc-verify (push) Has been cancelled
Docs CI / lint-and-preview (push) Has been cancelled
Concelier Attestation Tests / attestation-tests (push) Has been cancelled
devportal-offline / build-offline (push) Has been cancelled
Mirror Thin Bundle Sign & Verify / mirror-sign (push) Has been cancelled

This commit is contained in:
StellaOps Bot
2025-11-29 02:40:21 +02:00
parent 887b0a1c67
commit 7e7be4d2fd
54 changed files with 4907 additions and 3 deletions

View File

@@ -31,7 +31,7 @@
| 9 | ORCH-SVC-34-001 | DONE | Depends on 33-004. | Orchestrator Service Guild | Quota management APIs, per-tenant SLO burn-rate computation, alert budget tracking via metrics. |
| 10 | ORCH-SVC-34-002 | DONE | Depends on 34-001. | Orchestrator Service Guild | Audit log + immutable run ledger export with signed manifest and provenance chain to artifacts. |
| 11 | ORCH-SVC-34-003 | DONE | Depends on 34-002. | Orchestrator Service Guild | Perf/scale validation (≥10k pending jobs, dispatch P95 <150ms); autoscaling hooks; health probes. |
| 12 | ORCH-SVC-34-004 | TODO | Depends on 34-003. | Orchestrator Service Guild | GA packaging: container image, Helm overlays, offline bundle seeds, provenance attestations, compliance checklist. |
| 12 | ORCH-SVC-34-004 | DONE | Depends on 34-003. | Orchestrator Service Guild | GA packaging: container image, Helm overlays, offline bundle seeds, provenance attestations, compliance checklist. |
| 13 | ORCH-SVC-35-101 | TODO | Depends on 34-004. | Orchestrator Service Guild | Register `export` job type with quotas/rate policies; expose telemetry; ensure exporter workers heartbeat via orchestrator contracts. |
| 14 | ORCH-SVC-36-101 | TODO | Depends on 35-101. | Orchestrator Service Guild | Capture distribution metadata and retention timestamps for export jobs; update dashboards and SSE payloads. |
| 15 | ORCH-SVC-37-101 | TODO | Depends on 36-101. | Orchestrator Service Guild | Enable scheduled export runs, retention pruning hooks, failure alerting tied to export job class. |
@@ -53,6 +53,7 @@
| 2025-11-28 | ORCH-SVC-34-001 DONE: Implemented quota management APIs with SLO burn-rate computation and alert budget tracking. Created Slo domain model (Domain/Slo.cs) with SloType enum (Availability/Latency/Throughput), SloWindow enum (1h/1d/7d/30d), AlertSeverity enum, factory methods (CreateAvailability/CreateLatency/CreateThroughput), Update/Enable/Disable methods, ErrorBudget/GetWindowDuration computed properties. Created SloState record for current metrics (SLI, budget consumed/remaining, burn rate, time to exhaustion). Created AlertBudgetThreshold (threshold-based alerting with cooldown and rate limiting, ShouldTrigger logic). Created SloAlert (alert lifecycle with Acknowledge/Resolve). Built BurnRateEngine (SloManagement/BurnRateEngine.cs) with interfaces: IBurnRateEngine (ComputeStateAsync, ComputeAllStatesAsync, EvaluateAlertsAsync), ISloEventSource (availability/latency/throughput counts retrieval), ISloRepository/IAlertThresholdRepository/ISloAlertRepository. Created database migration (004_slo_quotas.sql) with tables: slos, alert_budget_thresholds, slo_alerts, slo_state_snapshots, quota_audit_log, job_metrics_hourly. Added helper functions: get_slo_availability_counts, cleanup_slo_snapshots, cleanup_quota_audit_log, get_slo_summary. Created REST API contracts (QuotaContracts.cs): CreateQuotaRequest/UpdateQuotaRequest/PauseQuotaRequest/QuotaResponse/QuotaListResponse, CreateSloRequest/UpdateSloRequest/SloResponse/SloListResponse/SloStateResponse/SloWithStateResponse, CreateAlertThresholdRequest/AlertThresholdResponse, SloAlertResponse/SloAlertListResponse/AcknowledgeAlertRequest/ResolveAlertRequest, SloSummaryResponse/QuotaSummaryResponse/QuotaUtilizationResponse. Created QuotaEndpoints (list/get/create/update/delete, pause/resume, summary). Created SloEndpoints (list/get/create/update/delete, enable/disable, state/states, thresholds CRUD, alerts list/get/acknowledge/resolve, summary). Added SLO metrics to OrchestratorMetrics: SlosCreated/SlosUpdated, SloAlertsTriggered/Acknowledged/Resolved, SloBudgetConsumed/SloBurnRate/SloCurrentSli/SloBudgetRemaining/SloTimeToExhaustion histograms, SloActiveAlerts UpDownCounter. Comprehensive test coverage: SloTests (25 tests for creation/validation/error budget/window duration/update/enable-disable), SloStateTests (tests for NoData factory), AlertBudgetThresholdTests (12 tests for creation/validation/ShouldTrigger/cooldown), SloAlertTests (5 tests for Create/Acknowledge/Resolve). Build succeeds, 450 tests pass (+48 new tests). | Implementer |
| 2025-11-28 | ORCH-SVC-34-002 DONE: Implemented audit log and immutable run ledger export. Created AuditLog domain model (Domain/Audit/AuditLog.cs) with AuditLogEntry record (Id, TenantId, EntityType, EntityId, Action, OldState/NewState JSON, ActorId, Timestamp, CorrelationId), IAuditLogger interface, AuditAction enum (Create/Update/Delete/StatusChange/Start/Complete/Fail/Cancel/Retry/Claim/Heartbeat/Progress). Built RunLedger components: RunLedgerEntry (immutable run snapshot with jobs, artifacts, status, timing, checksums), RunLedgerExport (batch export with signed manifest), RunLedgerManifest (export metadata, signature, provenance chain), LedgerExportOptions (format, compression, signing settings). Created IAuditLogRepository/IRunLedgerRepository interfaces. Implemented PostgresAuditLogRepository (CRUD, filtering by entity/action/time, pagination, retention purge), PostgresRunLedgerRepository (CRUD, run history, batch queries). Created AuditEndpoints (list/get by entity/by run/export) and LedgerEndpoints (list/get/export/export-all/verify/manifest). Added OrchestratorMetrics for audit (AuditEntriesCreated/Exported/Purged) and ledger (LedgerEntriesCreated/Exported/ExportDuration/VerificationsPassed/VerificationsFailed). Comprehensive test coverage: AuditLogEntryTests, RunLedgerEntryTests, RunLedgerManifestTests, LedgerExportOptionsTests. Build succeeds, 487 tests pass (+37 new tests). | Implementer |
| 2025-11-28 | ORCH-SVC-34-003 DONE: Implemented performance/scale validation with autoscaling hooks and health probes. Created ScaleMetrics service (Core/Scale/ScaleMetrics.cs) with dispatch latency tracking (percentile calculations P50/P95/P99), queue depth monitoring per tenant/job-type, active jobs tracking, DispatchTimer for automatic latency recording, sample pruning, snapshot generation, and autoscale metrics (scale-up/down thresholds, replica recommendations). Built LoadShedder (Core/Scale/LoadShedder.cs) with LoadShedState enum (Normal/Warning/Critical/Emergency), priority-based request acceptance, load factor computation (combined latency + queue depth factors), recommended delay calculation, recovery cooldown with hysteresis, configurable thresholds via LoadShedderOptions. Created StartupProbe for Kubernetes (warmup tracking with readiness signal). Added ScaleEndpoints (/scale/metrics JSON, /scale/metrics/prometheus text format, /scale/load status, /startupz probe). Enhanced HealthEndpoints integration. Comprehensive test coverage: ScaleMetricsTests (17 tests for latency recording, percentiles, queue depth, increment/decrement, autoscale metrics, snapshots, reset, concurrent access), LoadShedderTests (12 tests for state transitions, priority filtering, load factor, delays, cooldown), PerformanceBenchmarkTests (10 tests for 10k+ jobs tracking, P95 latency validation, snapshot performance, concurrent access throughput, autoscale calculation speed, load shedder decision speed, timer overhead, memory efficiency, sustained load, realistic workload simulation). Build succeeds, 37 scale tests pass (487 total). | Implementer |
| 2025-11-29 | ORCH-SVC-34-004 DONE: Implemented GA packaging artifacts. Created multi-stage Dockerfile (ops/orchestrator/Dockerfile) with SDK build stage and separate runtime stages for orchestrator-web and orchestrator-worker, including OCI labels, HEALTHCHECK directive, and deterministic build settings. Created Helm values overlay (deploy/helm/stellaops/values-orchestrator.yaml) with orchestrator-web (2 replicas), orchestrator-worker (1 replica), and orchestrator-postgres services, including full configuration for scheduler, autoscaling, load shedding, dead letter, and backfill. Created air-gap bundle script (ops/orchestrator/build-airgap-bundle.sh) for offline deployment with OCI image export, config templates, manifest generation, and documentation bundling. Created SLSA v1 provenance attestation template (ops/orchestrator/provenance.json) with build definition, resolved dependencies, and byproducts. Created GA compliance checklist (ops/orchestrator/GA_CHECKLIST.md) covering build/packaging, security, functional, performance/scale, observability, deployment, documentation, testing, and compliance sections with sign-off template. All YAML/JSON syntax validated, build succeeds. | Implementer |
## Decisions & Risks
- All tasks depend on outputs from Orchestrator I (32-001); sprint remains TODO until upstream ship.

View File

@@ -80,4 +80,5 @@
| 2025-11-28 | CVSS-RECEIPT-190-005 DONE: Added `ReceiptBuilder` with deterministic input hashing, evidence validation (policy-driven), vector/scoring via CvssV4Engine, and persistence through repository abstraction. Added `CreateReceiptRequest`, `IReceiptRepository`, unit tests (`ReceiptBuilderTests`) with in-memory repo; all 37 tests passing. | Implementer |
| 2025-11-28 | CVSS-DSSE-190-006 DONE: Integrated Attestor DSSE signing into receipt builder. Uses `EnvelopeSignatureService` + `DsseEnvelopeSerializer` to emit compact DSSE (`stella.ops/cvssReceipt@v1`) and stores base64 DSSE ref in `AttestationRefs`. Added signing test with Ed25519 fixture; total tests 38 passing. | Implementer |
| 2025-11-28 | CVSS-HISTORY-190-007 DONE: Added `ReceiptHistoryService` with amendment tracking (`AmendReceiptRequest`), history entry creation, modified metadata, and optional DSSE re-signing. Repository abstraction extended with `GetAsync`/`UpdateAsync`; in-memory repo updated; tests remain green (38). | Implementer |
| 2025-11-29 | CVSS-RECEIPT/DSSE/HISTORY tasks wired to PostgreSQL: added `policy.cvss_receipts` migration, `PostgresReceiptRepository`, DI registration, and integration test (`PostgresReceiptRepositoryTests`). Test run failed locally because Docker/Testcontainers not available; code compiles and unit tests still pass. | Implementer |
| 2025-11-28 | Ran `dotnet test src/Policy/__Tests/StellaOps.Policy.Scoring.Tests` (Release); 35 tests passed. Adjusted MacroVector lookup for FIRST sample vectors; duplicate PackageReference warnings remain to be cleaned separately. | Implementer |