41 Commits

Author SHA1 Message Date
master
70cbfcee72 feat(scheduler): postgres + redis webhook rate limiter runtime
Sprint SPRINT_20260417_019_JobEngine_truthful_webhook_rate_limiter_runtime.

NoOpWebhookRateLimiter + RedisWebhookRateLimiter, service-collection
wiring, WebhookRateLimiterRuntimeTests, SCHED-WEB-16-104-WEBHOOKS doc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 14:41:42 +03:00
master
a15405431b wip(scheduler): compose storage configuration compatibility
Sprint SPRINT_20260417_002_JobEngine_scheduler_storage_compose_compatibility
(SCHEDULER-COMPAT-001 still DOING — sprint remains active).

Adds scheduler storage configuration adapter layer so the web host
accepts the compose-shaped storage configuration without manual remapping,
plus SchedulerStorageConfigurationTests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 14:41:18 +03:00
master
302826aedb feat(scheduler,packsregistry,registry): postgres backend cutover
Sprint SPRINT_20260415_003_DOCS_scheduler_registry_real_backend_cutover.

- Scheduler WebService: Postgres-backed audit service + resolver job service,
  system schedule bootstrap, durable host tests, jwt app factory
- PacksRegistry: persistence extensions + migration 002 runtime pack repo,
  durable runtime + startup contract tests
- Registry.TokenService: Postgres plan rule store + admin endpoints,
  migration 001 initial schema, durable runtime + persistence tests
- Scheduler.Plugin.Doctor: wiring for doctor job plugin
- Sprint _019 (webhook rate limiter) and _002 (compose storage compat)
  land separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 14:36:05 +03:00
master
62d865080d feat(scheduler): wire startup migrations, dedupe 007/008, fix UI trend path
TASK-013: SchedulerPersistenceExtensions now calls AddStartupMigrations so
the embedded SQL files (including 007 job_kind + 008 doctor_trends) run on
every cold start. Deletes duplicate migrations 007_add_job_kind_plugin_config
(kept 007_add_schedule_job_kind.sql with tenant-scoped index) and
008_doctor_trends_table (kept 008_add_doctor_trends.sql with RLS + BRIN
time-series index).

TASK-010: Doctor UI trend service now calls
/api/v1/scheduler/doctor/trends/categories/{category} (was
/api/v1/doctor/scheduler/...) so it routes through the scheduler plugin
endpoints rather than the deprecated standalone doctor-scheduler path.

TASK-009: New DoctorJobPluginTests exercises plugin lifecycle: identity,
config validation for full/quick/categories/plugins modes, plan creation,
JSON schema shape, and PluginConfig round-trip (including alerts). 10 tests
added, all pass (26/26 in Plugin.Tests project).

Archives the sprint — all 13 tasks now DONE — and archives the platform
retest sprint (SPRINT_20260409_002) whose RETEST-008 completed via the
earlier feed-mirror cleanup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 22:14:30 +03:00
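The duplicate 007/008 migrations removed in the commit above are the kind of collision a small pre-flight check can surface. The following is a minimal sketch, assuming migrations are identified by a numeric filename prefix; the function name and the `006_example.sql` entry are hypothetical, and nothing here is taken from the project's actual migration runner.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Assumption: a migration is identified by the digits before the first underscore.
_PREFIX = re.compile(r"^(\d+)_")


def find_duplicate_prefixes(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group migration names by numeric prefix and return only the prefixes used more than once."""
    by_prefix: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        match = _PREFIX.match(name)
        if match:
            by_prefix[match.group(1)].append(name)
    return {prefix: sorted(group) for prefix, group in by_prefix.items() if len(group) > 1}


if __name__ == "__main__":
    # Names as written in the commit message, plus one hypothetical non-colliding entry.
    migrations = [
        "006_example.sql",  # hypothetical
        "007_add_job_kind_plugin_config",
        "007_add_schedule_job_kind.sql",
        "008_doctor_trends_table",
        "008_add_doctor_trends.sql",
    ]
    for prefix, group in sorted(find_duplicate_prefixes(migrations).items()):
        print(prefix, group)
```

A check like this could run in CI before the embedded SQL files are packaged, so that two files claiming the same sequence number are caught before a cold start applies them.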
master
337aa58023 fix(scheduler): bind IDoctorTrendRepository via [FromServices] on trend endpoints
Three Doctor trend endpoints (/trends/checks/{checkId}, /trends/categories/{category},
/trends/degrading) were missing the [FromServices] attribute on the
IDoctorTrendRepository? parameter, causing ASP.NET minimal-APIs to attempt model
binding from route/query instead of resolving from DI. Verified fix with HTTP 200
responses against all four trend endpoints via the gateway.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 19:34:09 +03:00
master
fcdc4e0291 fix(scheduler): add [FromServices] to Doctor trend endpoint parameters
DoctorTrendEndpoints used IDoctorTrendRepository and TimeProvider as
MapGet handler parameters without [FromServices], causing ASP.NET to
infer them as body parameters — crashing the scheduler on startup with
"Body was inferred but the method does not allow inferred body parameters."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 11:32:38 +03:00
master
3a36aefd81 fix: resolve 4 unhealthy services from fresh volume rebuild
- router-gateway: sync 10 missing jobengine routes to local config (prevent array merge bleed-through)
- findings-ledger-web: add VulnExplorer tables to postgres-init bootstrap script
- timeline-web: replace competing migration hosted service with standard AddStartupMigrations
- graph-api: handle null PostgresGraphRepository gracefully, add graph schema to init
- scheduler-web: add failure_signatures table to init bootstrap

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 16:23:52 +03:00
master
537f4f17fc test(audit): comprehensive tests for emission, PII redaction, hash chain, enrichers
- AuditPiiRedactorTests: 10 tests for recursive redaction + edge cases
- AuditActionFilterTests: 14 tests for capture, enrichment, fallback
- AuditModulesAndActionsTests: 3 tests for constant validation
- PostgresUnifiedAuditEventStoreTests: 8 tests for hash chain integrity
- UnifiedAuditAggregationServiceTests: 6 tests for new query filters
- AuditCleanseJobPluginTests: 7 tests for retention logic + validation
- PluginRegistryTests: 9 tests for plugin discovery
- Authority/Policy enricher tests: 8 tests for GUID resolution
- Total: ~65 new tests across 5 test projects
- Added InternalsVisibleTo for Audit.Emission and Timeline.WebService
- Created AuditCleanseJobPlugin implementation for retention-based cleanup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 13:00:18 +03:00
master
5d245f958f refactor(audit): replace magic strings with AuditModules/AuditActions constants
- Replace 349 .Audited("module", "action") calls with typed constants across 91 files
- Add 21 missing action constants to AuditActions.cs (Policy, Attestor, Evidence, Scanner)
- Compile-time safety for module/action naming across all 15 services

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 12:40:18 +03:00
master
54e7f871a3 feat(audit): annotate Platform + Notify + Scheduler + ReleaseOrchestrator (Batch 2b)
Platform (~40 state-changing endpoints annotated):
- EnvironmentSettingsAdmin: update/delete environment settings
- IdentityProvider: create/update/delete/enable/disable/test/apply
- CryptoProviderAdmin: update/delete crypto preferences
- AdministrationTrustSigning: create/rotate/revoke keys, register/block/unblock issuers,
  register/revoke certificates, configure transparency log
- PlatformEndpoints: quota alerts, onboarding complete/skip, preferences update, dashboard profile create
- SetupEndpoints: create session, execute/skip steps, finalize setup
- ScoreEndpoints: evaluate/verify score
- ScriptEndpoints: create/update/delete scripts
- ReleaseOrchestratorEnvironment: CRUD environments/targets/freeze-windows

Notify (~30 state-changing endpoints annotated):
- NotifyApi (v2): rules CRUD, templates CRUD, incident ack/resolve
- RuleEndpoints (v2): create/update/delete rules
- TemplateEndpoints (v2): create/update/delete templates
- EscalationEndpoints: CRUD policies, schedules, overrides; start/escalate/stop
- QuietHoursEndpoints: create/update/delete calendars
- ThrottleEndpoints: update/delete config
- OperatorOverrideEndpoints: create/revoke overrides

Scheduler (~10 state-changing endpoints annotated):
- ScheduleEndpoints: create/update/delete/pause/resume schedules
- RunEndpoints: create/cancel/retry runs
- GraphJobEndpoints: create build/overlay graph jobs
- PolicyRunEndpoints: create policy run
- Added StellaOps.Audit.Emission project reference + AddAuditEmission() registration
- Fixed pre-existing ScanJobPlugin.cs build error (Success -> Valid)

ReleaseOrchestrator (~25 state-changing endpoints annotated):
- ReleaseEndpoints: create/update/delete/ready/promote/deploy/rollback/clone releases,
  add/update/remove components
- ApprovalEndpoints: approve/reject/batch-approve/batch-reject
- DeploymentEndpoints: create/pause/resume/cancel/rollback/retry deployments
- EvidenceEndpoints: verify evidence
- ScriptsEndpoints: create/update/delete scripts
- ReleaseDashboardEndpoints: approve/reject promotions
- ReleaseControlV2Endpoints: approval decision, rollback run

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:40:02 +03:00
master
f3401540d7 refactor(jobengine): delete Core + Infrastructure + Worker + Tests (~65K lines)
- All active services now use their own persistence (release-orchestrator, scheduler, packsregistry)
- Zero remaining references from any active csproj
- Clean solution files (4 projects + 48 build configs removed from StellaOps.sln)
- Update README and AGENTS.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:23:11 +03:00
master
7f65e224ae feat: scheduler web+worker merge + audit Batch 1 (68 endpoints annotated)
Scheduler:
- Merge scheduler-worker into scheduler-web with Worker:Embedded flag
- Default embedded=true (compose), false available for K8s split
- Upgrade to resources-heavy, comment out scheduler-worker container

Audit Batch 1 (first real audit emission):
- Create AuditedRouteGroupExtensions convention helper
- EvidenceLocker: 7 endpoints (store/snapshot/verify/hold/export/verdict)
- Integrations: 6 endpoints (CRUD + test + discover)
- Scanner: 55 endpoints across 25 files
- Sprint 005 FILTER-001/002/003 marked DONE

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:08:40 +03:00
master
5d3e0d46b2 Merge branch 'worktree-agent-a503735a'
# Conflicts:
#	devops/compose/docker-compose.stella-ops.yml
#	devops/docker/services-matrix.env
#	src/JobEngine/StellaOps.Scheduler.WebService/Bootstrap/SystemScheduleBootstrap.cs
#	src/JobEngine/StellaOps.Scheduler.WebService/Program.cs
#	src/JobEngine/StellaOps.Scheduler.WebService/Schedules/ScheduleEndpoints.cs
#	src/JobEngine/StellaOps.Scheduler.WebService/StellaOps.Scheduler.WebService.csproj
#	src/JobEngine/StellaOps.Scheduler.__Libraries/StellaOps.Scheduler.Models/Schedule.cs
#	src/JobEngine/StellaOps.Scheduler.__Libraries/StellaOps.Scheduler.Plugin.Abstractions/IRunProgressReporter.cs
#	src/JobEngine/StellaOps.Scheduler.__Libraries/StellaOps.Scheduler.Plugin.Abstractions/ISchedulerJobPlugin.cs
#	src/JobEngine/StellaOps.Scheduler.__Libraries/StellaOps.Scheduler.Plugin.Abstractions/ISchedulerPluginRegistry.cs
#	src/JobEngine/StellaOps.Scheduler.__Libraries/StellaOps.Scheduler.Plugin.Abstractions/JobConfigValidationResult.cs
#	src/JobEngine/StellaOps.Scheduler.__Libraries/StellaOps.Scheduler.Plugin.Abstractions/JobExecutionContext.cs
#	src/JobEngine/StellaOps.Scheduler.__Libraries/StellaOps.Scheduler.Plugin.Abstractions/JobPlan.cs
#	src/JobEngine/StellaOps.Scheduler.__Libraries/StellaOps.Scheduler.Plugin.Abstractions/JobPlanContext.cs
#	src/JobEngine/StellaOps.Scheduler.__Libraries/StellaOps.Scheduler.Plugin.Abstractions/SchedulerPluginRegistry.cs
#	src/JobEngine/StellaOps.Scheduler.__Libraries/StellaOps.Scheduler.Plugin.Abstractions/StellaOps.Scheduler.Plugin.Abstractions.csproj
2026-04-08 16:27:02 +03:00
master
908619e739 feat(scheduler): plugin architecture + Doctor health check plugin
- Create ISchedulerJobPlugin abstraction with JobKind routing
- Add SchedulerPluginRegistry for plugin discovery and resolution
- Wrap existing scan logic as ScanJobPlugin (zero behavioral change)
- Extend Schedule model with JobKind (default "scan") and PluginConfig (jsonb)
- Add SQL migrations 007 (job_kind/plugin_config) and 008 (doctor_trends table)
- Implement DoctorJobPlugin replacing standalone doctor-scheduler service
- Add PostgresDoctorTrendRepository for persistent trend storage
- Register Doctor trend endpoints at /api/v1/scheduler/doctor/trends/*
- Seed 3 default Doctor schedules (daily full, hourly quick, weekly compliance)
- Comment out doctor-scheduler container in compose and services-matrix
- Update Doctor architecture docs and AGENTS.md with scheduling migration info

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 16:24:46 +03:00
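The JobKind routing introduced by the commit above can be pictured with a short sketch. It is written in Python rather than the project's C#, and every type here is a hypothetical stand-in: the real ISchedulerJobPlugin and SchedulerPluginRegistry contracts are not shown in this log. Only the default job kind "scan" and the idea of a per-schedule plugin configuration come from the commit message; the `mode` configuration key is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol


class JobPlugin(Protocol):
    """Hypothetical stand-in for the plugin abstraction; not the real interface."""

    job_kind: str

    def run(self, plugin_config: dict) -> str: ...


@dataclass
class Schedule:
    name: str
    job_kind: str = "scan"  # default job kind, as stated in the commit message
    plugin_config: dict = field(default_factory=dict)  # stored as jsonb in the real schema


class PluginRegistry:
    """Maps a job kind to the single plugin responsible for it."""

    def __init__(self) -> None:
        self._plugins: Dict[str, JobPlugin] = {}

    def register(self, plugin: JobPlugin) -> None:
        if plugin.job_kind in self._plugins:
            raise ValueError(f"job kind {plugin.job_kind!r} is already registered")
        self._plugins[plugin.job_kind] = plugin

    def resolve(self, job_kind: str) -> JobPlugin:
        try:
            return self._plugins[job_kind]
        except KeyError:
            raise LookupError(f"no plugin registered for job kind {job_kind!r}") from None


class ScanPlugin:
    job_kind = "scan"

    def run(self, plugin_config: dict) -> str:
        return "scan executed"


class DoctorPlugin:
    job_kind = "doctor"

    def run(self, plugin_config: dict) -> str:
        mode = plugin_config.get("mode", "full")  # "mode" is an assumed config key
        return f"doctor run in {mode} mode"


def dispatch(registry: PluginRegistry, schedule: Schedule) -> str:
    """Route a schedule to its plugin by job kind and execute it."""
    return registry.resolve(schedule.job_kind).run(schedule.plugin_config)
```

Because existing schedules fall back to the "scan" kind, wrapping the old scan logic as a plugin leaves their behaviour unchanged, while new kinds such as the Doctor checks only need a registration.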
master
f5a9f874d0 feat(audit): wire AddAuditEmission into 9 services (AUDIT-002)
- Wire StellaOps.Audit.Emission DI in: Authority, Policy, Release-Orchestrator,
  EvidenceLocker, Notify, Scanner, Scheduler, Integrations, Platform
- Add AuditEmission__TimelineBaseUrl to compose defaults
- Endpoint filter annotation deferred to follow-up pass

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 16:20:39 +03:00
master
65106afe4c refactor: DB schema fixes + container renames + compose include + audit sprint
- FindingsLedger: change schema from public to findings (V3-01)
- Add 9 migration module plugins: RiskEngine, Replay, ExportCenter, Integrations, Signer, IssuerDirectory, Workflow, PacksRegistry, OpsMemory (V4-01 to V4-09)
- Remove 16 redundant inline CREATE SCHEMA patterns (V4-10)
- Rename export→export-web, excititor→excititor-web for consistency
- Compose stella-ops.yml: thin wrapper using include: directive
- Fix dead /api/v1/jobengine/* gateway routes → release-orchestrator/packsregistry
- Scheduler plugin architecture: ISchedulerJobPlugin + ScanJobPlugin + DoctorJobPlugin
- Create unified audit sink sprint plan
- VulnExplorer integration tests + gap analysis

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 16:10:36 +03:00
master
13c4811e32 refactor(scripts): move Scripts API from scheduler to release-orchestrator
- Fix dual-schema violation (scheduler was writing to scheduler + scripts)
- Move ScriptsDataSource, PostgresScriptStore, script endpoints
- Update gateway routes and UI references
- Each service now owns exactly one schema

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 15:37:28 +03:00
master
0e25344bd7 refactor(jobengine): delete TaskRunner service
- Remove TaskRunner source, tests, libraries (3 directories)
- Remove from compose, services-matrix, nginx, hosts, smoke tests
- Remove CLI commands, UI references, Authority scopes
- Remove docs, OpenAPI spec, QA state files
- Leave task_runner_id DB columns as nullable legacy
- PacksRegistry preserved (independent service)
- Eliminates 2 containers (taskrunner-web + taskrunner-worker)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 14:11:20 +03:00
master
886ff6f9d2 refactor: JobEngine cleanup + crypto compose refactor + sprint plans + timeline merge prep
- Remove zombie JobEngine WebService (no container runs it)
- Remove dangling STELLAOPS_JOBENGINE_URL, replace with RELEASE_ORCHESTRATOR_URL
- Update Timeline audit paths to release-orchestrator
- Extract smremote to docker-compose.crypto-provider.smremote.yml
- Rename crypto compose files for consistent naming
- Add crypto provider health probe API (CP-001) + tenant preferences (CP-002)
- Create sprint plans: crypto picker, VulnExplorer merge, scheduler plugins
- Timeline merge prep: ingestion worker relocated to infrastructure lib

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 13:45:19 +03:00
master
59e7f25d96 docs: add service README.md files + update AGENTS.md decisions
- Create README.md for 25+ service modules with container info, API surface, storage
- Document attestor-tileproxy separation rationale (air-gap network isolation)
- Document opsmemory-advisoryai separation rationale (resource isolation, blast radius)
- Update Timeline AGENTS.md with merged indexer info

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 13:45:03 +03:00
master
c1ecc75ace refactor(policy): merge policy gateway into policy-engine
- Move 24 gateway source files (endpoints, services, contracts) into engine
  under Endpoints/Gateway/, Services/Gateway/, Contracts/Gateway/ namespaces
- Add gateway DI registrations and endpoint mappings to engine Program.cs
- Add missing project references (StellaOps.Policy.Scoring, DeltaVerdict, Localization)
- Remove HTTP proxy layer (PolicyEngineClient, DPoP, forwarding context not copied)
- Update gateway routes in router appsettings to point to policy-engine
- Comment out policy service in docker-compose, add backwards-compat network alias
- Update services-matrix (gateway build line commented out)
- Update all codebase references: AdvisoryAI, JobEngine, CLI, router tests, helm
- Update docs: OFFLINE_KIT, configuration-migration, gateway guide, port-registry
- Deprecate etc/policy-gateway.yaml.sample with notice
- Eliminates 1 container, 9 HTTP round-trips, DPoP token flow

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 13:19:09 +03:00
master
53f294400f fix(infra): resolve fresh-build DB schema gaps, Kerberos warnings, and Dockerfile syntax
- Workflow: add PostgreSQL auto-migration (8 tables in schema `workflow`)
  with AddStartupMigrations wiring and embedded SQL migration
- Scheduler: add missing `schema_version` and `source` columns to
  `scheduler.schedules` table in both init script and migration
- Platform: delay analytics maintenance 15s to avoid race with migration
  020_AnalyticsRollups creating compute_daily_rollups()
- Docker: install libgssapi-krb5-2 in runtime image to eliminate Npgsql
  Kerberos probe warnings across all 59 services
- Docker: remove `# syntax=docker/dockerfile:1.7` directive from both
  Dockerfiles to avoid BuildKit frontend pull failures on flaky DNS
- Postgres init: add `workflow` schema to 01-create-schemas.sql

Verified: 75 containers, 0 unhealthy, 0 recurring errors after full
wipe-and-rebuild cycle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:40:08 +03:00
master
afbedf1c60 feat(scripts): scheduler scripts endpoint + script-picker component
Add ScriptsEndpoints to the Scheduler WebService for CRUD operations on
automation scripts. Add a reusable script-picker overlay component for
selecting scripts from the UI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 15:34:08 +03:00
master
9d47cabc37 Orchestrator decomposition: replace JobEngine with release-orchestrator + workflow services
- Remove jobengine and jobengine-worker containers from docker-compose
- Create release-orchestrator service (120 endpoints) with full auth, tenant, and infrastructure DI
- Wire workflow engine to PostgreSQL with definition store (wf_definitions table)
- Deploy 4 canonical workflow definitions on startup (release-promotion, scan-execution, advisory-refresh, compliance-sweep)
- Fix workflow definition JSON to match canonical contract schema (set-state, call-transport, decision)
- Add WorkflowClient to release-orchestrator for starting workflow instances on promotion
- Add WorkflowTriggerClient + endpoint to scheduler for triggering workflows from system schedules
- Update gateway routes from jobengine.stella-ops.local to release-orchestrator.stella-ops.local
- Remove Platform.Database dependency on JobEngine.Infrastructure
- Fix workflow csproj duplicate Content items (EmbeddedResource + SDK default)
- System-managed schedules with source column, SystemScheduleBootstrap, inline edit UI

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:57:42 +03:00
master
16c31f3303 Fix test assertion mismatches across Gateway, CLI, JobEngine, AdvisoryAI
- Gateway: align route mapping test with jobengine hostname rename
- CLI: update module registry count (10→28), migration consolidation (36→37),
  fix System.CommandLine option ordering, add SearchUnifiedAsync mock setup,
  strip FluentAssertions license warning from golden output, fix repo root detection
- JobEngine: update service actor subject, tolerate approval expiry in seed data
- AdvisoryAI: update route boost assertions for 0.85 multiplier

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 09:58:25 +03:00
master
59eca36429 Add JobEngine deployment compatibility store and scheduler persistence
Introduce PostgresDeploymentCompatibilityStore with migration 011, in-memory
fallback, deployment endpoints, and Postgres fixture for integration tests.
Update Scheduler repository with connection policy adoption.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 08:52:50 +03:00
master
4d82c346e3 Tag all Valkey/Redis connections with service-specific ClientName
Set ClientName on every Redis/Valkey connection across Scanner, Signals,
Concelier, Notify, Scheduler, Timeline, and Router for easier connection
attribution in monitoring.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 08:51:27 +03:00
master
a4c4690fef Rewrite UI API clients from /api/v2/releases to /api/v1/release-orchestrator
Completes Sprint 323 TASK-001 using Option C (direct URL rewrite):
- release-management.client.ts: readBaseUrl and legacyBaseUrl now use
  /api/v1/release-orchestrator/releases, eliminating the v2 proxy dependency
- All 15+ component files updated: activity, approvals, runs, versions,
  bundle-organizer, sidebar queries, topology pages
- Spec files updated to match new URL patterns
- Added /releases/activity and /releases/versions backend route aliases
  in ReleaseEndpoints.cs with ListActivity and ListVersions handlers
- Fixed orphaned audit-log-dashboard.component import → audit-log-table
- Both Angular build and JobEngine build pass clean

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 00:16:32 +03:00
master
f96c6cb9ed Complete release compatibility and host inventory sprints
Signed-off-by: master <>
2026-03-31 23:53:45 +03:00
master
260fce8ef8 Add dummy LLM provider, update Concelier sources and JobEngine endpoints
- AdvisoryAI: DummyLlmProvider for offline/testing scenarios,
  wire in LlmProviderFactory
- Concelier: source definitions, registry, and management endpoint updates
- JobEngine: approval and release endpoint updates
- etc/llm-providers/dummy.yaml config

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 17:25:48 +03:00
master
dd29786e38 Implement missing backend endpoints for release orchestration
TASK-002: 11 deployment monitoring endpoints in JobEngine
  (list, get, logs, events, metrics, pause/resume/cancel/rollback/retry)
TASK-003: 6 evidence management endpoints in JobEngine
  (list, get, verify, export, raw, timeline)
TASK-005: 3 release dashboard endpoints in JobEngine
  (dashboard summary, approve/reject promotion)
TASK-006: 2 registry image search endpoints in Scanner
  (search with 9 mock images, digests lookup)

All endpoints return seed/mock data for testing. Auth policies
match existing patterns. Dual route registration on both
/api/ and /api/v1/ prefixes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 15:52:20 +02:00
master
4d8a48a05f Sprint 7+8: Journey UX fixes + identity envelope shared middleware
Sprint 7 — Deep journey fixes:
  S7-T01: Trust & Signing empty state with "Go to Signing Keys" CTA
  S7-T02: Notifications 3-step setup guide (channel→rule→test)
  S7-T03: Topology validate step skip — "Skip Validation" when API fails,
    with validateSkipped signal matching agentSkipped pattern
  S7-T04: VEX export note on Risk Report tab linking to VEX Ledger

Sprint 8 — Identity envelope shared middleware (ARCHITECTURE):
  S8-T01: New UseIdentityEnvelopeAuthentication() extension in
    StellaOps.Router.AspNet. Reads X-StellaOps-Identity-Envelope headers,
    verifies HMAC-SHA256 via GatewayIdentityEnvelopeCodec, creates
    ClaimsPrincipal with sub/tenant/scopes/roles. 5min clock skew.
  S8-T02: Concelier refactored — removed 78 lines of inline impl,
    now uses shared one-liner
  S8-T03: Scanner — UseIdentityEnvelopeAuthentication() added
  S8-T04: JobEngine — UseIdentityEnvelopeAuthentication() added
  S8-T05: Timeline — UseIdentityEnvelopeAuthentication() added
  S8-T06: Integrations — UseIdentityEnvelopeAuthentication() added
  S8-T07: docs/modules/router/IDENTITY_ENVELOPE_MIDDLEWARE.md

All services now authenticate ReverseProxy requests via gateway envelope.
Scanner scan submit should now work with authenticated identity.

Angular: 0 errors. .NET (6 services): 0 errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 18:27:46 +02:00
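The envelope check described under Sprint 8 in the commit above (HMAC-SHA256 verification, 5-minute clock skew, claims built from the envelope) can be sketched as follows. This is an illustration under stated assumptions, not the GatewayIdentityEnvelopeCodec wire format: the JSON payload layout, the base64 encoding, and the `issued_at` field are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional, Tuple

MAX_CLOCK_SKEW_SECONDS = 5 * 60  # "5min clock skew" from the commit message


def sign_envelope(payload: dict, key: bytes) -> Tuple[str, str]:
    """Serialize the payload and return (envelope, signature), both base64-encoded."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    signature = hmac.new(key, body, hashlib.sha256).digest()
    return base64.b64encode(body).decode(), base64.b64encode(signature).decode()


def verify_envelope(
    envelope_b64: str, signature_b64: str, key: bytes, now: Optional[float] = None
) -> Optional[dict]:
    """Return the payload if the signature and timestamp are acceptable, else None."""
    try:
        body = base64.b64decode(envelope_b64, validate=True)
        signature = base64.b64decode(signature_b64, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None

    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):  # constant-time comparison
        return None

    payload = json.loads(body)
    current = time.time() if now is None else now
    # Assumption: the envelope carries an issue time in seconds since the epoch.
    if abs(current - payload["issued_at"]) > MAX_CLOCK_SKEW_SECONDS:
        return None
    return payload
```

Verifying the signature before parsing the body keeps unauthenticated input away from the JSON decoder, and the constant-time comparison avoids leaking how many signature bytes matched.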
master
ed6cd76c62 Fix critical journey blockers: audit endpoints, registry mock, topology auth
Fix #20 — Audit log empty:
  Wire app.MapAuditEndpoints() in JobEngine Program.cs. The endpoint file
  existed but was never registered, so /api/v1/jobengine/audit returned 404
  and the Timeline unified aggregation service got 0 events.

Fix #22 — Registry search returns mock data:
  Replace the catchError() synthetic mock fallback in searchImages() with
  an empty array return. The release wizard will now show "no results"
  instead of fabricating fake "payment-service" with "sha256:payment..."
  digests. getImageDigests() returns an empty-tags placeholder on failure.

Fix #13 — Topology wizard 401 (identity envelope passthrough):
  Add TryAuthenticateFromIdentityEnvelope() to Concelier's JwtBearer
  OnMessageReceived handler. When no JWT bearer token is present (stripped
  by gateway's IdentityHeaderPolicyMiddleware on ReverseProxy routes),
  the handler reads X-StellaOps-Identity-Envelope + signature headers,
  verifies the HMAC-SHA256 signature using the shared signing key, and
  populates ClaimsPrincipal with subject/tenant/scopes/roles from the
  envelope. This enables ReverseProxy routes to Concelier topology
  endpoints to authenticate the same way Microservice/Valkey routes do.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 09:24:04 +02:00
master
da76d6e93e Add topology auth policies + journey findings notes
Concelier:
- Register Topology.Read, Topology.Manage, Topology.Admin authorization
  policies mapped to OrchRead/OrchOperate/PlatformContextRead/IntegrationWrite
  scopes. Previously these policies were referenced by endpoints but never
  registered, causing System.InvalidOperationException on every topology
  API call.

Gateway routes:
- Simplified targets/environments routes (removed specific sub-path routes,
  use catch-all patterns instead)
- Changed environments base route to JobEngine (where CRUD lives)
- Changed to ReverseProxy type for all topology routes

KNOWN ISSUE (not yet fixed):
- ReverseProxy routes don't forward the gateway's identity envelope to
  Concelier. The regions/targets/bindings endpoints return 401 because
  hasPrincipal=False — the gateway authenticates the user but doesn't
  pass the identity to the backend via ReverseProxy. Microservice routes
  use Valkey transport which includes envelope headers. Topology endpoints
  need either: (a) Valkey transport registration in Concelier, or
  (b) Concelier configured to accept raw bearer tokens on ReverseProxy paths.
  This is an architecture-level fix.

Journey findings collected so far:
- Integration wizard (Harbor + GitHub App): works end-to-end
- Advisory Check All: fixed (parallel individual checks)
- Mirror domain creation: works, generate-immediately fails silently
- Topology wizard Step 1 (Region): blocked by auth passthrough issue
- Topology wizard Step 2 (Environment): POST to JobEngine needs verify
- User ID resolution: raw hashes shown everywhere

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 08:12:39 +02:00
master
166745f9f9 Reduce idle CPU across 62 containers (phase 1)
- Add resource limits (heavy/medium/light tiers) to all 59 .NET services
- Add .NET GC tuning (server/workstation GC, DATAS, conserve memory)
- Convert FirstSignalSnapshotWriter from 10s polling to Valkey pub/sub
- Convert EnvironmentSettingsRefreshService from 60s polling to Valkey pub/sub
- Consolidate GraphAnalytics dual timers to single timer with idle-skip
- Increase healthcheck interval from 30s to 60s (configurable)
- Reduce debug logging to Information on 4 high-traffic services

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 02:16:19 +02:00
master
d16d7a1692 Repair live JobEngine runtime contracts
2026-03-10 01:38:38 +02:00
master
dfd22281ed Repair live canonical migrations and scanner cache bootstrap
2026-03-09 21:56:41 +02:00
master
1e53976ffb fix(jobengine): make all orchestrator migration SQL idempotent and PostgreSQL-compatible
Fix 6 classes of issues that prevented JobEngine from auto-migrating:
1. Non-idempotent DDL: add IF NOT EXISTS to CREATE TABLE, wrap CREATE
   TYPE in DO blocks with EXCEPTION WHEN duplicate_object, wrap partition
   creation with EXCEPTION WHEN duplicate_object OR SQLSTATE '42P17'
2. Reserved keyword: quote `window` column name in 004_slo_quotas.sql
3. Invalid syntax: replace DELETE...LIMIT with ctid subquery pattern
   in 004_slo_quotas.sql and 005_audit_ledger.sql
4. Partition constraint: add tenant_id to UNIQUE(log_id) constraint
   on pack_run_logs in 006_pack_runs.sql (partitioned tables require
   partition key in all unique constraints)
5. Non-immutable index predicate: remove NOW() from partial index
   predicate in 002_backfill.sql
6. Remove BEGIN/COMMIT wrappers from all migration files (the
   StartupMigrationHost already wraps each migration in a transaction)

All 8 orchestrator migrations (001-008) now apply cleanly on fresh DB.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 08:38:20 +02:00
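Two of the patterns named in the commit above, idempotent DDL and a bounded delete written as a subquery instead of DELETE...LIMIT, can be demonstrated in a runnable form. The sketch below uses SQLite through Python's standard library purely as a stand-in: SQLite's `rowid` plays the role that `ctid` plays in the PostgreSQL migrations, and the `audit_entries` table is hypothetical.

```python
import sqlite3

# Idempotent DDL: running this script twice must not fail.
MIGRATION = """
CREATE TABLE IF NOT EXISTS audit_entries (
    id INTEGER PRIMARY KEY,
    tenant_id TEXT NOT NULL,
    created_at INTEGER NOT NULL
);
CREATE INDEX IF NOT EXISTS ix_audit_entries_created_at
    ON audit_entries (created_at);
"""


def apply_migration(conn: sqlite3.Connection) -> None:
    conn.executescript(MIGRATION)


def purge_oldest(conn: sqlite3.Connection, cutoff: int, batch_size: int) -> int:
    """Delete at most batch_size rows older than cutoff and return how many were removed.

    The PostgreSQL equivalent selects ctid in the subquery; rowid is the SQLite analog.
    """
    cursor = conn.execute(
        """
        DELETE FROM audit_entries
        WHERE rowid IN (
            SELECT rowid FROM audit_entries
            WHERE created_at < ?
            ORDER BY created_at
            LIMIT ?
        )
        """,
        (cutoff, batch_size),
    )
    conn.commit()
    return cursor.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    apply_migration(conn)
    apply_migration(conn)  # second run is a no-op instead of an error
    conn.executemany(
        "INSERT INTO audit_entries (tenant_id, created_at) VALUES (?, ?)",
        [("t1", i) for i in range(10)],
    )
    print(purge_oldest(conn, cutoff=8, batch_size=3))
```

Moving the LIMIT into a subquery keeps the batch bounded while staying within syntax that the target database accepts, which is the point of the rewrite described in item 3.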
master
481a062a1a fix(jobengine): register startup migrations for orchestrator schema
Wire AddStartupMigrations so JobEngine converges the orchestrator schema
on fresh database or wiped volumes without manual bootstrap scripts.
Adds StellaOps.Infrastructure.Postgres.Migrations dependency.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 07:53:24 +02:00
master
a918d39a61 texts fixes, search bar fixes, global menu fixes.
2026-03-05 18:15:30 +02:00
master
8e1cb9448d consolidation of some of the modules, localization fixes, product advisories work, qa work
2026-03-05 03:54:22 +02:00