Complete first-time user journey notes — full fresh install walkthrough

Documented the complete journey from fresh install through:
- Login, dashboard, integrations (Harbor + GitHub App)
- Advisory sources (42 curated, 54 healthy)
- Mirror domain creation (14 sources, signing)
- Topology wizard (blocked at auth passthrough)
- Release creation (sealed end-to-end with mock component)
- Approvals queue, security posture, policy studio
- Evidence/audit, doctor diagnostics

22 findings total (12 fixed, 10 tracked):
- Critical: ReverseProxy auth passthrough (#13), audit log empty (#20)
- High: Mock registry search in releases (#22)
- Medium: No post-seal guidance (#21), silent failures, user ID hashes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 08:19:10 +02:00
parent da76d6e93e
commit 4e07f7bd72

@@ -0,0 +1,155 @@
# First-Time User Journey Notes — Fresh Install
**Date**: 2026-03-16
**Stack**: Wiped + fresh boot, 63 containers, Harbor + GitHub App fixtures
---
## Journey Completed
### 1. Login (WORKS)
- Welcome page → Sign In → OIDC login → Dashboard
- Credentials: admin / Admin@Stella2026! (now documented in quickstart)
- Session persists across page reloads
### 2. Dashboard (FIXED)
- Shows honest empty state with setup guide when no environments exist
- When environments exist (from seed), shows honest "unknown" status, 0 findings
- No more fake "5 critical, blocked" environment cards
### 3. Integration Setup (WORKS)
- **Harbor Registry**: 6-step wizard (Provider → Connection → Scope → Schedule → Preflight → Review) → Created → Test Connection SUCCESS (38ms)
- **GitHub App SCM**: Same 6-step wizard → Created → Test Connection SUCCESS ("Connected as GitHub App: Stella QA GitHub App", 4ms)
- Both integrations properly show Pending → Active transition
- Integration detail page has Overview, Credentials, Scopes & Rules, Events, Health tabs
### 4. Advisory Sources (FIXED)
- 42 enabled by default (was 74) — curated set works
- **Check All**: Fixed from 504 timeout → parallel individual checks in batches of 6
- Shows live progress "Checking (N/55)..."
- Result: 54 healthy, 20 failed (expected for Docker network)
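The Check All fix described above (individual checks run in parallel, in batches of 6, with a live progress counter) can be sketched roughly as follows. This is a minimal illustration, not the product's implementation: `check_all`, `check_one`, `on_progress`, and `fake_check` are hypothetical names, and only the batch size of 6 comes from the notes.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

BATCH_SIZE = 6  # batch size taken from the notes above


async def check_all(
    source_ids: List[str],
    check_one: Callable[[str], Awaitable[bool]],
    on_progress: Callable[[int, int], None] = lambda done, total: None,
) -> Dict[str, bool]:
    """Check sources batch by batch; one failing check never aborts the run."""
    results: Dict[str, bool] = {}
    total = len(source_ids)
    for start in range(0, total, BATCH_SIZE):
        batch = source_ids[start:start + BATCH_SIZE]
        # return_exceptions=True keeps a timeout in one check from cancelling the rest.
        outcomes = await asyncio.gather(
            *(check_one(sid) for sid in batch), return_exceptions=True
        )
        for sid, outcome in zip(batch, outcomes):
            # Anything other than an explicit True (including exceptions) is unhealthy.
            results[sid] = outcome is True
        on_progress(len(results), total)  # drives a "Checking (N/total)..." label
    return results


async def fake_check(source_id: str) -> bool:
    # Hypothetical stand-in for one per-source health request.
    await asyncio.sleep(0)
    if source_id.endswith("-down"):
        raise ConnectionError(source_id)
    return True


def run_demo() -> Tuple[Dict[str, bool], List[Tuple[int, int]]]:
    progress: List[Tuple[int, int]] = []
    ids = [f"src-{i}" for i in range(7)] + ["src-7-down"]
    results = asyncio.run(
        check_all(ids, fake_check, lambda done, total: progress.append((done, total)))
    )
    return results, progress
```

Because each batch finishes before the next starts, no single request has to wait for every source, which is presumably what pushed the original single call past the gateway timeout.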
### 5. Mirror Domain Creation (WORKS, PARTIAL)
- 3-step wizard: Select Sources → Configure Domain → Review & Create
- Created "mirror-combined-14src" with 14 sources (Primary + Distribution)
- Signing enabled (HMAC-SHA256 with key ID)
- "Generate immediately" checkbox triggers a 503 → **silent failure, no user feedback**
- Mirror domain created but bundle not generated
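For readers unfamiliar with the signing option, HMAC-SHA256 signing with a key ID might look roughly like the sketch below. The manifest shape, the `keyId` field name, and both function names are assumptions made for illustration; the notes only state that signing uses HMAC-SHA256 with a key ID.

```python
import hashlib
import hmac
import json
from typing import Dict


def sign_bundle(manifest: dict, key: bytes, key_id: str) -> dict:
    """Sign a canonical JSON encoding of the manifest (hypothetical record format)."""
    payload = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()
    digest = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"alg": "HMAC-SHA256", "keyId": key_id, "signature": digest}


def verify_bundle(manifest: dict, signature: dict, keys: Dict[str, bytes]) -> bool:
    """Look up the key by ID, recompute the MAC, and compare in constant time."""
    key = keys.get(signature.get("keyId", ""))
    if key is None:
        return False  # unknown key ID: cannot verify
    expected = sign_bundle(manifest, key, signature["keyId"])["signature"]
    return hmac.compare_digest(expected, signature.get("signature", ""))
```

The key ID lets a verifier pick the right secret after key rotation without trying every known key.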
### 6. Topology Wizard (BLOCKED — auth passthrough)
- 8-step wizard loads correctly: Region → Environment → Stage Order → Target → Agent → Infrastructure → Validate → Done
- **Step 1 (Region)**: Form renders, Create Region button works, BUT:
  - POST /api/v1/regions returns 500 → ROOT CAUSE: missing auth policies → FIXED
  - After fix: returns 401 → ROOT CAUSE: ReverseProxy doesn't forward identity envelope
  - Concelier expects gateway-signed identity, not raw bearer tokens
  - This is an **architecture-level issue**: ReverseProxy vs Microservice transport
- **Step 2 (Environment)**: POST /api/v1/environments → routes to JobEngine, also needs auth passthrough fix
---
## Journey Continued
### 7. Create Release (WORKS END-TO-END ON FRESH INSTALL)
- 4-step wizard works (Basic Info → Components → Inputs → Review & Seal)
- Registry search returns mock data when the API fails (see F22)
- Seal produces a real bundle with digest identity
- Bundle detail shows UUID heading instead of release name
- "Created by" shows raw user ID hash
### 8. Security Posture (WORKS — HONEST)
- All cards show real zeros: GUARDED, 0 blockers, 0% VEX, 0/0 SBOM, 0% reachability
- No fake data on fresh install
- "Snapshot: FAIL — 3 source(s) offline/stale" is accurate — sources not yet synced
### 9. Approvals Queue (WORKS)
- "Release Run Approvals Queue" with filtering by status, gate type, environment, hotfix, risk
- Table headers render but no approval items visible (sealed release didn't auto-trigger approval)
- **FINDING**: Sealing a release does NOT auto-create an approval request — user must manually promote
### 10. Policy Studio (WORKS)
- 7 tabs all load: Overview, Packs, Governance, Simulation, VEX & Exceptions, Release Gates, Audit
- "Core Policy Pack latest" shown in topbar
- Simulation Lab accessible under the Policy Studio shell
- **Not yet tested**: actual policy evaluation against the sealed release
### 11. Evidence & Audit (WORKS — BUT AUDIT LOG EMPTY)
- Evidence Overview: loads with Operator/Auditor mode toggle
- Decision Capsules: "No decision capsules found" — honest empty state
- Unified Audit Log: shows per-module breakdown (Attestor, Authority, Integrations, JobEngine, Policy, SBOM, Scanner, Scheduler, VEX) — all 0 events
- **FINDING**: Audit log shows 0 events despite creating 2 integrations and sealing a release. Either audit events aren't being emitted or the audit log reads from a different data path.
### 12. Doctor Diagnostics (WORKS)
- Quick/Normal/Full Check buttons available
- Quick Check: 7 pass, 1 warning, 1 fail, 5 skipped
- Failed check expands with remediation steps and copy buttons — excellent UX
---
## Additional Findings From Full Journey
### F20: Audit Log Shows 0 Events After Real Actions
**Severity**: HIGH
**What happened**: Created 2 integrations (Harbor, GitHub App), sealed a release, ran advisory check — audit log shows 0 events across all modules.
**Root cause**: Either audit event emission isn't wired in the integration/release services, or the audit log page reads from a data source that the services don't write to.
**Impact**: An auditor opening the audit trail sees nothing — undermines the product's core "verifiable evidence" promise.
### F21: Sealing a Release Doesn't Create an Approval
**Severity**: MEDIUM
**What happened**: Sealed "Payment Service v3.2" with bundle status "published". The approvals queue is empty. Expected: at least a policy gate approval request.
**Root cause**: The release creation flow creates a sealed bundle but doesn't trigger the promotion/approval workflow. Promotion is a separate step.
**Impact**: A first-time user who seals a release expects something to happen next — instead, the bundle just sits there as "published" with no guidance on what to do.
**Proposed fix**: After sealing, show a "What's next?" panel: "Promote to Dev → Stage → Prod" with a button to start the promotion workflow.
### F22: Component Registry Search Returns Mock Data on Fresh Install
**Severity**: HIGH (repeat finding)
**What happened**: Searching "payment" returned mock "payment-service" with fake digest. Console error for `/api/registry/images/search?q=payment`. The Harbor integration is connected but the search doesn't use it.
**Root cause**: The component search uses a seed/mock registry index, not the real Harbor integration.
**Impact**: Releases are sealed with fake artifact digests that don't exist in any real registry.
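One possible direction for F22 is to make the search return an explicit error state instead of falling back to seed data. The sketch below is only an illustration of that idea: `SearchResult`, `search_components`, and the `registry_search` callable are hypothetical and do not reflect the actual component search code.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class SearchResult:
    items: List[dict] = field(default_factory=list)
    error: Optional[str] = None  # set when the registry could not be queried


def search_components(
    query: str, registry_search: Callable[[str], List[dict]]
) -> SearchResult:
    """Query the connected registry; on failure report the error, never mock data."""
    try:
        return SearchResult(items=registry_search(query))
    except Exception as exc:  # sketch only: a real client would catch narrower errors
        return SearchResult(error=f"Registry search failed: {exc}")
```

With a result shaped like this, the wizard can block sealing while `error` is set, so a release is never sealed against a digest that exists in no registry.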
---
## Issues Found (All Iterations)
### FIXED (12)
| # | Issue | Fix |
|---|-------|-----|
| 1 | Dashboard 100% hardcoded | Removed all fake data, added setup guide |
| 2 | Mirror source enabled P1 | EnabledByDefault = false on 32 sources |
| 3 | Mirror in domain builder | Filter category !== 'Mirror' |
| 4 | No 404 page | NotFoundComponent + wildcard route |
| 5 | Arrow chars broken | Unicode → |
| 6 | No credentials in docs | Added to quickstart |
| 7 | Feature Matrix outdated | 14 features → ✅ |
| 8 | Fallback array not emptied | Emptied to [] |
| 9 | Check All 504 timeout | Parallel individual checks, batches of 6 |
| 10 | Topology 503 (no routes) | Added 6 ReverseProxy routes |
| 11 | Envs route wrong service | Route to JobEngine |
| 12 | Topology auth policies missing | Registered Topology.Read/Manage/Admin |
### NOT FIXED (7)
| # | Issue | Severity | Root Cause |
|---|-------|----------|-----------|
| 13 | Topology wizard 401 (auth passthrough) | CRITICAL | ReverseProxy doesn't forward identity envelope to Concelier |
| 14 | "Created by" raw user ID hash | MEDIUM | No user ID → display name resolution |
| 15 | Mirror generate-immediately fails silently | MEDIUM | 503 from Concelier exports, no user feedback |
| 16 | v2 context API console errors | LOW | /api/v2/context/regions, /preferences, /approvals return errors |
| 17 | Crypto profile no tooltip | LOW | No explanation of FIPS/eIDAS/GOST/SM |
| 18 | Topology wizard silent failure | MEDIUM | No error toast when region creation fails |
| 19 | Wizard env Create disabled silently | MEDIUM | No explanation when Next/Create buttons are disabled |
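Issues 15 and 18 share a pattern: a failed request produces no visible feedback. A minimal sketch of mapping an HTTP status to a user-facing message is shown below; the function name and the message wording are hypothetical, and the real UI would presumably raise a toast from its own error handling.

```python
from typing import Optional


def user_message(action: str, status: int) -> Optional[str]:
    """Return a message to show the user, or None when the request succeeded."""
    if 200 <= status < 300:
        return None
    if status in (401, 403):
        return f"{action} failed: not authorized ({status}). Try signing in again."
    if status == 503:
        return f"{action} failed: the service is unavailable (503). Retry later."
    return f"{action} failed with HTTP {status}."
```

For example, `user_message("Bundle generation", 503)` yields a retry hint rather than silence, which is the behavior missing from the mirror wizard's "Generate immediately" step.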
---
## Architecture Issue: Gateway Auth for Topology
The core blocker is **issue #13**. The gateway has two transport types:
1. **Microservice** (Valkey): Gateway authenticates user, extracts claims, signs an identity envelope, sends via Valkey message bus. Backend receives pre-authenticated request with `hasPrincipal=True`.
2. **ReverseProxy** (HTTP): Gateway forwards the raw HTTP request with its original headers. Backend must validate the bearer token itself. Concelier's auth middleware (`StellaOps.Auth.ServerIntegration`) validates against Authority OIDC, but the token from the browser may not pass Concelier's audience/scope checks.
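The difference between the two transports can be illustrated with a small sketch of a signed identity envelope. Everything here is assumed for illustration: the header names, the HMAC shared key, and the envelope layout are hypothetical and are not the actual StellaOps envelope format.

```python
import base64
import hashlib
import hmac
import json
from typing import Dict, Optional

ENVELOPE_HEADER = "X-Identity-Envelope"    # hypothetical header name
SIGNATURE_HEADER = "X-Identity-Signature"  # hypothetical header name


def build_envelope_headers(claims: dict, key: bytes, issued_at: int) -> Dict[str, str]:
    """Gateway side: serialize the authenticated claims and sign them."""
    body = dict(claims, iat=issued_at)
    payload = base64.urlsafe_b64encode(
        json.dumps(body, sort_keys=True).encode()
    ).decode()
    signature = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return {ENVELOPE_HEADER: payload, SIGNATURE_HEADER: signature}


def read_envelope(
    headers: Dict[str, str], key: bytes, now: int, max_age_s: int = 60
) -> Optional[dict]:
    """Backend side: return the claims only if the envelope is present, valid, and fresh."""
    payload = headers.get(ENVELOPE_HEADER)
    signature = headers.get(SIGNATURE_HEADER)
    if not payload or not signature:
        return None  # the ReverseProxy case: no envelope forwarded, so no principal
    expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return None  # tampered or signed with a different key
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if now - claims["iat"] > max_age_s:
        return None  # stale envelope
    return claims
```

In this model, a request that arrives without the envelope headers has no principal from the backend's point of view, which would be consistent with the 401 observed in the topology wizard.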
**Options**:
- A) Register Concelier's topology endpoints as Valkey consumers (matches existing auth pattern for advisory sources)
- B) Configure Concelier to accept the gateway's identity envelope on HTTP requests (add bypass network for gateway IP)
- C) Add Concelier's service URL to the gateway's identity envelope signing, so ReverseProxy requests include the signed envelope headers
Option B is likely simplest — add the gateway's Docker network IP to Concelier's bypass networks.
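As a rough illustration of what a bypass-network check does, the sketch below tests whether a caller's address falls inside a configured CIDR range. The function name and the example network are hypothetical; Concelier's actual bypass configuration is not shown in these notes.

```python
import ipaddress
from typing import List


def is_bypass_peer(remote_ip: str, bypass_networks: List[str]) -> bool:
    """Return True when the caller's address is inside any configured bypass network."""
    addr = ipaddress.ip_address(remote_ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa), so mixed lists are safe.
    return any(addr in ipaddress.ip_network(cidr) for cidr in bypass_networks)
```

For example, `is_bypass_peer("172.18.0.5", ["172.18.0.0/16"])` is true under this sketch. The trade-off is that every container on the bypass network is trusted, not just the gateway, so the range should be as narrow as the deployment allows.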