Commit Graph

67 Commits

Author SHA1 Message Date
master
4d8a48a05f Sprint 7+8: Journey UX fixes + identity envelope shared middleware
Sprint 7 — Deep journey fixes:
  S7-T01: Trust & Signing empty state with "Go to Signing Keys" CTA
  S7-T02: Notifications 3-step setup guide (channel→rule→test)
  S7-T03: Topology validate step skip — "Skip Validation" when API fails,
    with validateSkipped signal matching agentSkipped pattern
  S7-T04: VEX export note on Risk Report tab linking to VEX Ledger

Sprint 8 — Identity envelope shared middleware (ARCHITECTURE):
  S8-T01: New UseIdentityEnvelopeAuthentication() extension in
    StellaOps.Router.AspNet. Reads X-StellaOps-Identity-Envelope headers,
    verifies HMAC-SHA256 via GatewayIdentityEnvelopeCodec, creates
    ClaimsPrincipal with sub/tenant/scopes/roles. 5min clock skew.
  S8-T02: Concelier refactored — removed 78 lines of inline impl,
    now uses shared one-liner
  S8-T03: Scanner — UseIdentityEnvelopeAuthentication() added
  S8-T04: JobEngine — UseIdentityEnvelopeAuthentication() added
  S8-T05: Timeline — UseIdentityEnvelopeAuthentication() added
  S8-T06: Integrations — UseIdentityEnvelopeAuthentication() added
  S8-T07: docs/modules/router/IDENTITY_ENVELOPE_MIDDLEWARE.md

All services now authenticate ReverseProxy requests via gateway envelope.
Scanner scan submit should now work with authenticated identity.

Angular: 0 errors. .NET (6 services): 0 errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 18:27:46 +02:00
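The envelope check described in S8-T01 of 4d8a48a05f can be sketched roughly as follows. This is a minimal illustration, not the GatewayIdentityEnvelopeCodec implementation: the wire format (base64url JSON payload plus hex signature), the issuedAt/expiresAt fields, and all function names are assumptions. Only HMAC-SHA256, the sub/tenant/scopes/roles claims, and the 5-minute clock skew come from the commit message.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical claim set: sub/tenant/scopes/roles are named in the commit,
// issuedAt/expiresAt (epoch seconds) are assumed for this sketch.
interface EnvelopeClaims {
  sub: string;
  tenant: string;
  scopes: string[];
  roles: string[];
  issuedAt: number;
  expiresAt: number;
}

// 5 minutes of tolerated clock skew, as stated in the commit message.
const CLOCK_SKEW_SECONDS = 5 * 60;

function signPayload(payload: string, key: string): string {
  return createHmac("sha256", key).update(payload).digest("hex");
}

// Assumed wire format: base64url(JSON claims) plus a hex HMAC-SHA256 signature.
function encodeEnvelope(claims: EnvelopeClaims, key: string): { payload: string; signature: string } {
  const payload = Buffer.from(JSON.stringify(claims), "utf8").toString("base64url");
  return { payload, signature: signPayload(payload, key) };
}

// Returns the claims when the signature and time window are valid, otherwise null.
function verifyEnvelope(
  payload: string,
  signature: string,
  key: string,
  nowSeconds: number,
): EnvelopeClaims | null {
  const expected = Buffer.from(signPayload(payload, key), "hex");
  const actual = Buffer.from(signature, "hex");
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) {
    return null;
  }
  const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8")) as EnvelopeClaims;
  // Reject envelopes issued too far in the future or expired beyond the skew.
  if (claims.issuedAt - CLOCK_SKEW_SECONDS > nowSeconds) return null;
  if (claims.expiresAt + CLOCK_SKEW_SECONDS < nowSeconds) return null;
  return claims;
}
```

The signature is checked before the payload is parsed, so unauthenticated input never reaches JSON.parse.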
master
efa33efdbc Sprint 2+3+5: Registry search, workflow chain, unified security data
Sprint 2 — Registry image search (S2-T01/T02/T03):
  Harbor plugin: SearchRepositoriesAsync + ListArtifactsAsync calling
    Harbor /api/v2.0/search and /api/v2.0/projects/*/repositories/*/artifacts
  Platform endpoint: GET /api/v1/registries/images/search proxies to
    Harbor fixture, returns aggregated RegistryImage[] response
  Frontend: release-management.client.ts now calls /api/v1/registries/*
    instead of the nonexistent /api/registry/* path
  Gateway route: /api/v1/registries → platform (ReverseProxy)

Sprint 3 — Workflow chain links (S3-T01/T02/T03/T05):
  S3-T01: Integration detail health tab shows "Scan your first image"
    CTA after successful registry connection test
  S3-T02: Scan submit page already had "View findings" link (verified)
  S3-T03: Triage findings detail shows "Check policy gates" banner
    after recording a VEX decision
  S3-T05: Promotions list + detail show "Review blocking finding"
    link when promotion is blocked by gate failure

Sprint 5 — Unified security data (S5-T01):
  Security Posture now queries VULNERABILITY_API for triage stats
  Risk Posture card shows real finding count from triage (was hardcoded 0)
  Risk label computed from triage severity breakdown (GUARDED→HIGH)
  Blocking Items shows critical+high counts from triage
  "View in Vulnerabilities workspace" drilldown link added

Angular build: 0 errors. .NET builds: 0 errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 16:08:22 +02:00
master
b97bffc430 Sprint 1: Scanner entry point + vulnerability navigation (S1-T01 to T07)
S1-T01: Add "Scan Image" to sidebar under Security > Security Posture children
  - New nav item with scanner:read scope, route /security/scan

S1-T02: Create Scan Image page (scan-submit.component.ts)
  - Image reference input, force rescan toggle, metadata fields
  - Submits POST /api/v1/scans/, polls for status every 3s
  - Shows progress badges (queued/scanning/completed/failed)
  - "View findings" link on completion
  - Route registered in security.routes.ts

S1-T04: Rename "Triage" to "Vulnerabilities" in sidebar + breadcrumbs
  - Sidebar label: Triage → Vulnerabilities
  - Route title and breadcrumb data updated
  - Internal route /triage/artifacts unchanged

S1-T05: Add 10 security terms to command palette quick actions
  - Scan image, View vulnerabilities, Search CVE, View findings,
    Create release, View audit log, Run diagnostics, Configure
    advisory sources, View promotions, Check policy gates

S1-T06: Add CTA buttons to Security Posture page
  - "Scan an Image" (primary) → /security/scan
  - "View Active Findings" (secondary) → /triage/artifacts

S1-T07: Gateway routes for scanner endpoints
  - /api/v1/scans → scanner.stella-ops.local (ReverseProxy)
  - /api/v1/scan-policies → scanner.stella-ops.local (ReverseProxy)
  - Added to both compose mount and source appsettings

Angular build: 0 errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 14:27:47 +02:00
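The status polling described in S1-T02 of b97bffc430 might look something like the sketch below. The 3-second interval and the four status values come from the commit message; the function name, the injected getStatus/sleep callbacks, and the maxPolls budget are hypothetical and not taken from scan-submit.component.ts.

```typescript
// Status values listed in the commit (queued/scanning/completed/failed).
type ScanStatus = "queued" | "scanning" | "completed" | "failed";

// "polls for status every 3s" per the commit message.
const POLL_INTERVAL_MS = 3000;

// getStatus and sleep are injected so the loop can be exercised without a
// real scanner or real timers; maxPolls is an assumed safety budget.
async function pollScan(
  getStatus: () => Promise<ScanStatus>,
  sleep: (ms: number) => Promise<void>,
  onStatus: (status: ScanStatus) => void = () => {},
  maxPolls = 200,
): Promise<ScanStatus> {
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const status = await getStatus();
    onStatus(status); // e.g. update the progress badge
    if (status === "completed" || status === "failed") {
      return status; // terminal state: stop polling
    }
    await sleep(POLL_INTERVAL_MS);
  }
  throw new Error("scan did not reach a terminal state within the polling budget");
}
```

Bounding the loop keeps a stuck scan from polling forever; whether the real component does this is not stated in the commit.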
master
a86f0d1361 Add environment/target/agent CRUD endpoints to Concelier topology
The topology wizard creates environments and targets via POST /api/v1/environments
and POST /api/v1/targets. These were routed to JobEngine which doesn't have
the identity envelope middleware, causing 404 on ReverseProxy routes.

Fix: Add environment CRUD, target CRUD, and agent list endpoints directly
to Concelier's TopologySetupEndpointExtensions. These use the same
Topology.Read/Manage authorization policies that work with the identity
envelope middleware.

Routes updated:
- /api/v1/environments → Concelier (was JobEngine)
- /api/v1/agents → Concelier (new)

Topology wizard now completes steps 1-4:
  1. Region: CREATE OK
  2. Environment: CREATE OK
  3. Stage Order: OK (skip)
  4. Target: CREATE OK
  5. Agent: BLOCKED (expected — no agents deployed on fresh install)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 09:49:59 +02:00
master
da76d6e93e Add topology auth policies + journey findings notes
Concelier:
- Register Topology.Read, Topology.Manage, Topology.Admin authorization
  policies mapped to OrchRead/OrchOperate/PlatformContextRead/IntegrationWrite
  scopes. Previously these policies were referenced by endpoints but never
  registered, causing System.InvalidOperationException on every topology
  API call.

Gateway routes:
- Simplified targets/environments routes (removed specific sub-path routes,
  use catch-all patterns instead)
- Changed environments base route to JobEngine (where CRUD lives)
- Changed to ReverseProxy type for all topology routes

KNOWN ISSUE (not yet fixed):
- ReverseProxy routes don't forward the gateway's identity envelope to
  Concelier. The regions/targets/bindings endpoints return 401 because
  hasPrincipal=False — the gateway authenticates the user but doesn't
  pass the identity to the backend via ReverseProxy. Microservice routes
  use Valkey transport which includes envelope headers. Topology endpoints
  need either: (a) Valkey transport registration in Concelier, or
  (b) Concelier configured to accept raw bearer tokens on ReverseProxy paths.
  This is an architecture-level fix.

Journey findings collected so far:
- Integration wizard (Harbor + GitHub App): works end-to-end
- Advisory Check All: fixed (parallel individual checks)
- Mirror domain creation: works, generate-immediately fails silently
- Topology wizard Step 1 (Region): blocked by auth passthrough issue
- Topology wizard Step 2 (Environment): POST to JobEngine needs verify
- User ID resolution: raw hashes shown everywhere

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 08:12:39 +02:00
master
602df77467 Fix topology routes: use ReverseProxy for all topology endpoints
Changed all topology gateway routes from Microservice (Valkey transport)
to ReverseProxy (direct HTTP) because:
- Concelier topology endpoints serve via HTTP, not Valkey message bus
- JobEngine environment CRUD serves via HTTP

Routes:
- /api/v1/regions → Concelier (ReverseProxy)
- /api/v1/infrastructure-bindings → Concelier (ReverseProxy)
- /api/v1/pending-deletions → Concelier (ReverseProxy)
- /api/v1/targets → Concelier (ReverseProxy)
- /api/v1/environments/{id}/readiness → Concelier (ReverseProxy)
- /api/v1/environments → JobEngine (ReverseProxy)

All return expected auth/tenant errors (404/500) instead of 503.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 07:44:58 +02:00
master
701229b3e6 Add gateway routes for topology setup endpoints on Concelier
The topology setup wizard calls /api/v1/regions, /api/v1/infrastructure-bindings,
/api/v1/pending-deletions, and target/environment readiness+validate endpoints
which are registered on the Concelier service. Without explicit gateway routes,
these fall through to the generic Microservice matcher which tries to find a
non-existent "regions" service, returning 503.

Added 6 Microservice routes forwarding topology API paths to
http://concelier.stella-ops.local. Both compose mount config and source
appsettings.json updated.

Verified: /api/v1/regions now returns 401 (auth required) instead of 503.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 02:20:22 +02:00
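The fall-through behaviour that 701229b3e6 fixes can be illustrated with a small routing sketch. It assumes the generic matcher takes the path segment after /api/v1/ as a service name; the registeredServices set, the type names, and resolveRoute are hypothetical. Only three of the six added routes are named in the commit, so only those are listed.

```typescript
interface Route {
  prefix: string;
  target: string;
}

type Resolution =
  | { kind: "forward"; target: string }
  | { kind: "unavailable"; status: 503 };

const CONCELIER = "http://concelier.stella-ops.local";

// Explicit routes named in the commit message.
const explicitRoutes: Route[] = [
  { prefix: "/api/v1/regions", target: CONCELIER },
  { prefix: "/api/v1/infrastructure-bindings", target: CONCELIER },
  { prefix: "/api/v1/pending-deletions", target: CONCELIER },
];

// Hypothetical service registry used by the generic matcher.
const registeredServices = new Set(["scanner", "concelier"]);

// Match on whole path segments so "/api/v1/regions" does not match "/api/v1/regionsfoo".
function matchesPrefix(path: string, prefix: string): boolean {
  return path === prefix || path.startsWith(prefix + "/");
}

function resolveRoute(path: string, routes: Route[]): Resolution {
  const explicit = routes.find((route) => matchesPrefix(path, route.prefix));
  if (explicit) {
    return { kind: "forward", target: explicit.target };
  }
  // Generic matcher (assumed): "/api/v1/<service>/..." maps to a service by name.
  const service = path.split("/")[3] ?? "";
  if (registeredServices.has(service)) {
    return { kind: "forward", target: `http://${service}.stella-ops.local` };
  }
  // No "regions" service exists, so without an explicit route this is a 503.
  return { kind: "unavailable", status: 503 };
}
```

Calling resolveRoute("/api/v1/regions", []) models the state before the commit, and passing explicitRoutes models the state after it.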
master
534aabfa2a First-time user experience fixes and platform contract repairs
FTUX fixes (Sprint 316-001):
- Remove all hardcoded fake data from dashboard — fresh installs show
  honest setup guide instead of fake crisis data (5 fake criticals gone)
- Curate advisory source defaults: 32 sources disabled by default
  (ecosystem, geo-restricted, exploit, hardware, mirror). ~43 core
  sources remain enabled. StellaOps Mirror no longer enabled at priority 1.
- Filter Mirror-category sources from Create Domain wizard to prevent
  circular mirror-from-mirror chains
- Add 404 catch-all route — unknown URLs show "Page Not Found" instead
  of silently rendering the dashboard
- Fix arrow characters in release target path dropdown (? → →)
- Add login credentials to quickstart documentation
- Update Feature Matrix: 14 release orchestration features marked as
  shipped (was marked planned)

Platform contract repairs (from prior session):
- Add /api/v1/jobengine/quotas/summary endpoint on Platform
- Fix gateway route prefix matching for /policy/shadow/* and
  /policy/simulations/* (regex routes instead of exact match)
- Fix VexHub PostgresVexSourceRepository missing interface method
- Fix advisory-vex-sources sweep text expectation
- Fix mirror operator journey auth (session storage token extraction)

Verified: 110/111 canonical routes passing (1 unrelated stale approval ref)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 02:05:38 +02:00
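The route prefix fix for /policy/shadow/* and /policy/simulations/* in 534aabfa2a comes down to the difference between exact and pattern matching. A small sketch of that difference, with hypothetical function names and an assumed pattern (the actual gateway regex is not shown in the commit):

```typescript
// Exact matching only accepts the base path itself.
const exactPolicyRoutes = new Set(["/policy/shadow", "/policy/simulations"]);

function matchesExact(path: string): boolean {
  return exactPolicyRoutes.has(path);
}

// Assumed pattern: the base path, optionally followed by any sub-path.
const policyRoutePattern = /^\/policy\/(shadow|simulations)(\/.*)?$/;

function matchesPattern(path: string): boolean {
  return policyRoutePattern.test(path);
}
```

With exact matching a sub-path under /policy/shadow/ falls through to whatever route comes next, while the pattern keeps it with the policy routes and still rejects lookalikes such as /policy/shadowing.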
master
19b9c90a8d Retry microservice startup and validate async Valkey connects 2026-03-12 13:12:54 +02:00
master
6964a046a5 Close admin trust audit gaps and stabilize live sweeps 2026-03-12 10:14:00 +02:00
master
8a1fb9bd9b OpenAPI query param discovery and header cleanup completion
Backend: ExtractParameters() now discovers query params from [AsParameters]
records and [FromQuery] attributes via handler method reflection. Gateway
OpenApiDocumentGenerator emits parameters arrays in the aggregated spec.
QueryParameterInfo added to EndpointSchemaInfo for HELLO payload transport.

Frontend: Remaining spec files and straggler services updated to canonical
X-Stella-Ops-* header names. Sprint 026 archived (tasks 01-06 DONE,
07-09 TODO for backend service rename pass).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-10 17:13:58 +02:00
master
8578065675 Fix notifications surface ownership and frontdoor contracts 2026-03-10 16:54:25 +02:00
master
fc7aaf4d37 Restore platform ownership for v2 evidence routes 2026-03-10 13:10:06 +02:00
master
d881fff387 Segment-bound doctor and scheduler frontdoor chunks 2026-03-10 12:47:51 +02:00
master
1b6051662f Repair router frontdoor route boundaries and service prefixes 2026-03-10 12:28:48 +02:00
master
7acf0ae8f2 Fix router frontdoor readiness and route contracts 2026-03-10 10:19:49 +02:00
master
ff4cd7e999 Restore policy frontdoor compatibility and live QA 2026-03-10 06:18:30 +02:00
master
6578c82602 Eliminate legacy gateway container (consolidate into router-gateway)
The gateway service was a redundant deployment of the same
StellaOps.Gateway.WebService binary already running as router-gateway.
It served no unique purpose — all traffic is handled by router-gateway
(slot 0). This removes the container, its route table entries, nginx
proxy blocks, health/quota stubs, and redirects STELLAOPS_GATEWAY_URL
to router.stella-ops.local so the Angular frontend resolves API base
URLs through the canonical frontdoor.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 03:50:16 +02:00
master
31cb31d0fb Eliminate Valkey queue polling fallback (phase 2 CPU optimization)
Replace hardcoded 1-5s polling constants with configurable
QueueWaitTimeoutSeconds (default 0 = pure event-driven). Consumers
now only wake on pub/sub notifications, eliminating ~118 idle
XREADGROUP polls per second across 59 services. Override with
VALKEY_QUEUE_WAIT_TIMEOUT env var if a safety-net poll is needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 02:36:01 +02:00
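The figures in 31cb31d0fb work out to about 2 idle polls per second per service (118 / 59). The configuration rule it describes, where 0 means no safety-net poll, could be resolved along these lines. The function name is hypothetical, and treating VALKEY_QUEUE_WAIT_TIMEOUT as a value in seconds is an assumption based on the QueueWaitTimeoutSeconds option name.

```typescript
// Returns the fallback wait in milliseconds, or null for pure event-driven
// operation (the default of 0 described in the commit).
function resolveQueueWaitTimeoutMs(env: Record<string, string | undefined>): number | null {
  const raw = env["VALKEY_QUEUE_WAIT_TIMEOUT"];
  const seconds = raw === undefined || raw.trim() === "" ? 0 : Number(raw);
  if (!Number.isFinite(seconds) || seconds < 0) {
    throw new Error(`invalid VALKEY_QUEUE_WAIT_TIMEOUT: ${raw}`);
  }
  // 0 disables the safety-net poll; consumers then wake only on notifications.
  return seconds === 0 ? null : seconds * 1000;
}
```

Returning null rather than 0 makes "no timeout" hard to confuse with "time out immediately" at the call site.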
master
72084355a6 Align policy simulation auth passthrough at the frontdoor 2026-03-10 01:55:51 +02:00
master
bf937c9395 Repair router frontdoor convergence and live route contracts 2026-03-09 19:09:19 +02:00
master
69923b648c fix(infra): repair gateway route ownership and add JobEngine/pack-registry scopes
- Route /api/v1/jobengine to jobengine service (was orchestrator)
- Route /api/v1/sources and /api/v1/witnesses to scanner service
- Add orch:quota and pack-registry scopes to platform OIDC token
- Align compose-local manifests with gateway appsettings.json

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 07:52:46 +02:00
master
841add4f27 perf(router): replace 100ms Valkey polling with Pub/Sub notification wakeup and increase heartbeat to 45s
The Valkey transport layer used 100ms busy-polling loops (Task.Delay(100))
across ~90 concurrent loops in 45+ services, generating ~900 idle
commands/sec and burning ~58% CPU while the system was completely idle.

Replace polling with Redis Pub/Sub notifications:
- Publishers fire PUBLISH after each XADD (fire-and-forget)
- Consumers SUBSCRIBE and wait on SemaphoreSlim with 30s fallback timeout
- Applies to both ValkeyMessageQueue (INotifiableQueue) and ValkeyEventStream
- Non-Valkey transports fall back to 1s polling via QueueWaitExtensions

Increase heartbeat interval from 10s to 45s across all transport options,
with corresponding health threshold adjustments (stale: 135s, degraded: 90s).

Expected idle CPU reduction: ~58% → ~3-5%.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 07:47:31 +02:00
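The idle load in 841add4f27 follows from the numbers given: a 100ms delay is 10 polls per second per loop, and about 90 loops give roughly 900 commands per second. The replacement pattern, where a consumer sleeps until a publish notification arrives or a fallback timeout expires, can be sketched as below. WakeSignal is a hypothetical stand-in for the SemaphoreSlim wait described in the commit, not the ValkeyMessageQueue code.

```typescript
type WakeReason = "notified" | "timeout";

class WakeSignal {
  private waiters: Array<() => void> = [];
  // Remembers a notification that arrived while nobody was waiting,
  // similar to a semaphore release that precedes the wait.
  private pending = false;

  // Called from the Pub/Sub subscription handler (after a publisher's XADD + PUBLISH).
  notify(): void {
    if (this.waiters.length === 0) {
      this.pending = true;
      return;
    }
    const current = this.waiters;
    this.waiters = [];
    for (const wake of current) wake();
  }

  // Resolves on the next notification, or after timeoutMs as a safety net.
  // Pass null to wait without any fallback poll.
  wait(timeoutMs: number | null): Promise<WakeReason> {
    if (this.pending) {
      this.pending = false;
      return Promise.resolve("notified");
    }
    return new Promise<WakeReason>((resolve) => {
      let timer: ReturnType<typeof setTimeout> | undefined;
      const onNotify = () => {
        if (timer !== undefined) clearTimeout(timer);
        resolve("notified");
      };
      this.waiters.push(onNotify);
      if (timeoutMs !== null) {
        timer = setTimeout(() => {
          this.waiters = this.waiters.filter((w) => w !== onNotify);
          resolve("timeout");
        }, timeoutMs);
      }
    });
  }
}
```

A consumer loop would call wait(30_000) to mirror the 30s fallback in the commit, then read the stream whichever way the wait ended.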
master
4f445ad951 Fix live evidence and registry auth contracts 2026-03-08 22:54:36 +02:00
master
c4b9373bf5 docs(router): archive header binding sprint 2026-03-08 15:34:26 +02:00
master
69813807a9 docs(router): archive messaging reregistration sprint 2026-03-08 15:33:25 +02:00
master
30532800ec fix(router): ship audit bundle frontdoor cutover 2026-03-08 14:30:12 +02:00
master
afa23fc504 Fix router ASP.NET request body binding 2026-03-07 04:26:54 +02:00
master
2ff0e1f86b Fix router messaging re-registration stability 2026-03-07 03:48:46 +02:00
master
973cc8b335 qa iteration 4
Add Valkey messaging transport auto-reconnection:
- MessagingTransportClient: detect persistent Redis failures (5 consecutive)
  and exit processing loops instead of retrying forever with dead connection
- IMicroserviceTransport: add TransportDied event to interface
- RouterConnectionManager: listen for TransportDied, auto-reconnect after 2s
- Fixes services becoming unreachable after Valkey blip during restarts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-06 03:11:28 +02:00
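The reconnection behaviour in 973cc8b335 can be approximated by a consecutive-failure counter that declares the transport dead and schedules a reconnect. The 5-failure threshold and 2s delay come from the commit message; TransportMonitor, its method names, and the injectable scheduler are hypothetical and do not reproduce MessagingTransportClient or RouterConnectionManager.

```typescript
// "5 consecutive" failures and "auto-reconnect after 2s" per the commit message.
const FAILURE_THRESHOLD = 5;
const RECONNECT_DELAY_MS = 2000;

type Scheduler = (callback: () => void, delayMs: number) => void;

class TransportMonitor {
  private consecutiveFailures = 0;
  private dead = false;
  private readonly reconnect: () => void;
  private readonly schedule: Scheduler;

  // The scheduler is injectable so the delay can be observed without real timers.
  constructor(
    reconnect: () => void,
    schedule: Scheduler = (callback, delayMs) => { setTimeout(callback, delayMs); },
  ) {
    this.reconnect = reconnect;
    this.schedule = schedule;
  }

  // Any successful operation resets the streak.
  recordSuccess(): void {
    this.consecutiveFailures = 0;
  }

  // Returns true when the processing loop should exit instead of retrying.
  recordFailure(): boolean {
    if (this.dead) return true;
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= FAILURE_THRESHOLD) {
      this.dead = true; // the analogue of raising TransportDied
      this.schedule(() => {
        this.consecutiveFailures = 0;
        this.dead = false;
        this.reconnect();
      }, RECONNECT_DELAY_MS);
    }
    return this.dead;
  }
}
```

Counting only consecutive failures means a brief blip followed by a success does not trigger a reconnect.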
master
360485f556 qa iteration 1 2026-03-06 00:23:59 +02:00
master
a918d39a61 text fixes, search bar fixes, global menu fixes. 2026-03-05 18:15:30 +02:00
master
8e1cb9448d consolidation of some of the modules, localization fixes, product advisories work, qa work 2026-03-05 03:54:22 +02:00
master
7bafcc3eef fix: filter domain assembly scans to Default ALC to prevent type identity mismatches
Plugin assemblies loaded via PluginHost into isolated AssemblyLoadContexts
produce distinct types even from the same DLL. When AppDomain.GetAssemblies()
returns both Default and plugin-ALC copies, DI registration and IOptions<T>
resolution silently fail (e.g. ValkeyTransportOptions defaulting to localhost).

Applied AssemblyLoadContext.Default filter to all 7 assembly discovery sites:
- MessagingServiceCollectionExtensions (transport plugin scan)
- StellaRouterIntegrationHelper (transport plugin loader)
- Gateway.WebService Program.cs (startup transport scan)
- GeneratedEndpointDiscoveryProvider (endpoint provider scan)
- ReflectionEndpointDiscoveryProvider (endpoint attribute scan)
- ServiceCollectionExtensions (schema provider scan)
- MigrationModulePluginDiscovery (migration plugin scan)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 14:01:12 +02:00
master
b07d27772e search and ai stabilization work, localization stabilized. 2026-02-24 23:29:36 +02:00
master
e746577380 wip: doctor/cli/docs/api to vector db consolidation; api hardening for descriptions, tenant, and scopes; migrations and conversions of all DALs to EF v10 2026-02-23 15:30:50 +02:00
master
bd8fee6ed8 stella ops usage fixes: roles propagation and timeout, one account to support multiple tenants, migrations consolidation, search to support documentation, doctor and open api vector db search 2026-02-22 19:27:54 +02:00
master
49cdebe2f1 compose and authority fixes. finish sprints. 2026-02-18 12:00:10 +02:00
master
fb46a927ad save changes 2026-02-17 00:51:35 +02:00
master
45c0f1bb59 Stabilize modules 2026-02-16 07:32:38 +02:00
master
9911b7d73c save checkpoint 2026-02-12 21:02:43 +02:00
master
5bca406787 save checkpoint: save features 2026-02-12 10:27:23 +02:00
master
cf5b72974f save checkpoint 2026-02-11 01:32:14 +02:00
master
557feefdc3 stabilization work - projects rework for maintainability and ui livening 2026-02-03 23:40:04 +02:00
master
5d5e80b2e4 stabilize tests 2026-02-01 21:37:40 +02:00
master
c70e83719e finish off sprint advisories and sprints 2026-01-24 00:12:43 +02:00
master
726d70dc7f test fixes and sprints work 2026-01-22 19:08:46 +02:00
master
c32fff8f86 license switch agpl -> busl1, sprints work, new product advisories 2026-01-20 15:32:20 +02:00
master
17419ba7c4 doctor enhancements, setup enhancements, ui functionality and design consolidation, test project fixes, product advisory attestation/rekor and delta verification enhancements 2026-01-19 09:02:59 +02:00
master
77ff029205 today's product advisories implemented 2026-01-16 23:30:47 +02:00