Repair triage artifact scope and evidence contracts

This commit is contained in:
master
2026-03-11 14:25:59 +02:00
parent 4dc5db4efb
commit 9dd8592a2a
27 changed files with 1598 additions and 282 deletions

@@ -0,0 +1,81 @@
# Sprint 20260311_003 - FE Triage Artifacts Vuln Scope Compat
## Topic & Scope
- Restore `/triage/artifacts` on a full scratch-built stack where the live admin token carries modern vulnerability scopes (`vuln:view`, `vuln:investigate`, `vuln:operate`, `vuln:audit`) instead of the obsolete `vuln:read`/`vuln:write`/`vuln:export` names.
- Fix the root cause in shared web auth scope matching so client-side prechecks do not block valid vulnerability pages before any request is sent.
- Separate the web vulnerability read/query contract from the legacy Authority mutation/export base and restore the documented scanner-backed `GET /api/v1/vulnerabilities*` surface that the artifact workspace expects.
- Add focused regression coverage for the scope bridge and reverify the repaired artifact workspace through the real authenticated frontdoor.
- Working directory: `src/Web/StellaOps.Web`.
- Expected evidence: focused Angular auth tests, targeted scanner xUnit runner output, rebuilt web bundle synced into `compose_console-dist`, rebuilt `scanner-web` image deployed into compose, live Playwright verification for `/triage/artifacts`, sprint log updates, and a scoped local commit.
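The read/mutation split described above could be sketched roughly as follows. This is a minimal illustration, not the actual web client: the `VulnOperation` type, the constant names, and `baseUrlFor` are hypothetical, and the legacy base is assumed to be `/vuln` based on the "Authority-era `/vuln`" wording in the execution log.

```typescript
// Hypothetical operation kinds; the real client may slice these differently.
type VulnOperation = 'list' | 'detail' | 'workflow' | 'export';

// Scanner-backed read surface named in the sprint text.
const SCANNER_READ_BASE = '/api/v1/vulnerabilities';
// Assumption: the legacy Authority base kept for workflow/export operations.
const LEGACY_AUTHORITY_BASE = '/vuln';

// Reads and queries go to the scanner route; mutations and exports stay
// on the legacy Authority base.
function baseUrlFor(op: VulnOperation): string {
  switch (op) {
    case 'list':
    case 'detail':
      return SCANNER_READ_BASE;
    case 'workflow':
    case 'export':
      return LEGACY_AUTHORITY_BASE;
  }
}
```

Keeping the decision in one function means a later migration of workflow/export traffic only touches a single mapping instead of every call site.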
## Dependencies & Concurrency
- Depends on the fresh scratch rebuild baseline and the current healthy compose stack on `https://stella-ops.local`.
- Safe parallelism: primary edits stay in `src/Web/StellaOps.Web`; this sprint explicitly permits the minimum cross-module repair in `src/Scanner/StellaOps.Scanner.WebService`, `src/Scanner/__Tests/StellaOps.Scanner.WebService.Tests`, and `src/Scanner/StellaOps.Scanner.WebService/TASKS.md` because the live route depends on the documented scanner read contract.
## Documentation Prerequisites
- `AGENTS.md`
- `docs/qa/feature-checks/FLOW.md`
- `docs/code-of-conduct/TESTING_PRACTICES.md`
- `docs/modules/platform/architecture-overview.md`
## Delivery Tracker
### FE-TRIAGE-SCOPE-001 - Root-cause the live artifact workspace failure
Status: DONE
Dependency: none
Owners: QA, 3rd line support
Task description:
- Reproduce `/triage/artifacts` on the live scratch stack with real Playwright, capture the failing behavior, and identify whether the defect is in frontdoor routing, runtime readiness, or client-side authorization.
Completion criteria:
- [x] Live evidence proves the failure and records the route, banner, and lack of runtime transport errors.
- [x] Root cause is traced to concrete code and contract mismatch, not a generic "service unavailable" guess.
### FE-TRIAGE-SCOPE-002 - Repair shared vulnerability scope compatibility
Status: DONE
Dependency: FE-TRIAGE-SCOPE-001
Owners: Product Manager, Architect, Developer
Task description:
- Update the shared web auth compatibility path so legacy client checks continue to work during the authority migration from `vuln:read`/`vuln:write`/`vuln:export` to the current vulnerability scope set.
- The fix must be narrow enough to preserve the new finer-grained scopes while preventing client-side false denies on read/audit paths.
Completion criteria:
- [x] Shared auth scope matching accepts `vuln:view` for legacy read checks and `vuln:audit` for legacy export checks.
- [x] Compatibility does not incorrectly allow `vuln:investigate` to satisfy `vuln:operate`.
- [x] Focused regression tests cover the alias behavior.
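The alias behavior in these criteria can be illustrated with a minimal sketch. The names `LEGACY_VULN_SCOPE_ALIASES` and `hasScope` are hypothetical and do not come from the repository; only the two aliases stated above are modeled, and `vuln:write` is deliberately left unmapped because the sprint does not say what should satisfy it.

```typescript
// Hypothetical alias table: legacy scope -> modern scopes that may satisfy it.
// Only the aliases named in the completion criteria are listed.
const LEGACY_VULN_SCOPE_ALIASES = new Map<string, readonly string[]>([
  ['vuln:read', ['vuln:view']],
  ['vuln:export', ['vuln:audit']],
]);

// Returns true when the granted scopes satisfy the required scope, either by
// exact match or through a legacy alias. No modern scope implies another
// modern scope, so vuln:investigate never satisfies vuln:operate.
function hasScope(granted: readonly string[], required: string): boolean {
  if (granted.includes(required)) {
    return true;
  }
  const aliases = LEGACY_VULN_SCOPE_ALIASES.get(required) ?? [];
  return aliases.some((alias) => granted.includes(alias));
}

// A token with modern scopes but without vuln:operate.
const modernToken = ['vuln:view', 'vuln:investigate', 'vuln:audit'];

console.log(hasScope(modernToken, 'vuln:read'));    // true (via vuln:view)
console.log(hasScope(modernToken, 'vuln:export'));  // true (via vuln:audit)
console.log(hasScope(modernToken, 'vuln:operate')); // false
```

The table is one-directional on purpose: a legacy check can be satisfied by a modern scope, but holding a legacy scope does not grant any modern one in this sketch.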
### FE-TRIAGE-SCOPE-003 - Rebuild and reverify the live artifact workspace
Status: DONE
Dependency: FE-TRIAGE-SCOPE-002
Owners: QA, Developer
Task description:
- Rebuild the web bundle, sync it into the live compose `console-dist` volume, restore the scanner vulnerabilities read controller expected by the route contract, and rerun authenticated Playwright against `/triage/artifacts` to confirm the banner is gone and artifact data/actions render normally on the repaired stack.
Completion criteria:
- [x] `npm run build` passes.
- [x] Targeted scanner contract tests pass via the test project executable.
- [x] The rebuilt bundle is synced into `compose_console-dist`.
- [x] The rebuilt `scanner-web` image is deployed into compose and answers `GET /api/v1/vulnerabilities`.
- [x] Live Playwright confirms `/triage/artifacts` loads without the vulnerability-service error banner.
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-03-11 | Sprint created after the scratch-stack sidebar-only route probe exposed a real `/triage/artifacts` defect: the page rendered a generic vulnerability-service error even though the browser captured no failing `/api/*` transport. Root-cause work moved into shared web auth scope matching. | QA / 3rd line support |
| 2026-03-11 | Shared web auth compatibility was patched to treat `vuln:view` as legacy read and `vuln:audit` as legacy export so client-side prechecks stop false-denying the route on modern tokens. Focused Angular tests passed. | Developer |
| 2026-03-11 | Live Playwright proved the initial scope fix only exposed a deeper contract defect: the route was still calling stale Authority-era `/vuln` read paths. The web client was split so read/query traffic targets `/api/v1/vulnerabilities`, and a scanner-backed controller/test slice was added to restore the documented route contract. | QA / 3rd line support / Architect |
| 2026-03-11 | Final root-cause closure: the artifact workspace was mixing artifact-scoped UI state with scan-scoped gated-buckets API calls, synthetic `vulnId` rows with evidence endpoints that require canonical `findingId`, and a dead local `/api/v1/telemetry/ttfs` postback instead of the shared telemetry pipeline. Added an artifact-scoped scanner endpoint, deterministic demo triage catalog, canonical `findingId` propagation, and shared `TelemetryClient` emission. | 3rd line support / Product / Architect / Developer |
| 2026-03-11 | Focused verification passed: Angular slice `20/20`, scanner executable slice `5/5`, `scanner-web` rebuilt/redeployed, web bundle rebuilt/synced, and live Playwright `live-triage-artifacts-scope-compat.json` recorded `failedCheckCount=0` and `runtimeIssueCount=0` on `https://stella-ops.local/triage/artifacts`. | QA |
## Decisions & Risks
- Initial decision: fix this in shared web auth scope matching, not as a page-local bypass. The live authority contract already emits modern vulnerability scopes, so client-side compatibility belongs in the shared authorization layer.
- Risk: several web clients still reference obsolete `vuln:*` names. A piecemeal page-only fix would leave other hidden client-side false denies behind.
- Decision: keep legacy Authority endpoints only for workflow/export operations and move all artifact-workspace reads onto the scanner route documented in the web and router dossiers. Fixing the URL string alone would have left the stale service ownership problem in place.
- Risk: the sprint is frontend-owned but required a minimal scanner repair because the documented backend contract had drifted out of implementation. The cross-module exception is recorded above; unrelated scanner behavior remains out of scope.
- Decision: preserve the artifact workspace as artifact-scoped. Instead of forcing the UI to synthesize a scan identity, the scanner now exposes `GET /api/v1/triage/artifacts/{artifactId}/gated-buckets` for the non-blocking bucket summary the page actually needs.
- Decision: vulnerability rows now carry canonical `findingId` alongside display `vulnId`. The UI can keep its current route and selection semantics, but all triage evidence/gating/replay boundaries resolve back to `findingId` before making scanner calls.
- Decision: scratch local setups now use a deterministic demo triage catalog for the artifact workspace surfaces so scanner-backed demo vulnerability rows, unified evidence, and gating explanations stay internally consistent without requiring seeded tenant data.
- Decision: triage TTFS events emit through the shared `TelemetryClient` rather than a dedicated `/api/v1/telemetry/ttfs` endpoint. This preserves central sampling/queueing behavior and degrades cleanly to a no-op when no ingest endpoint is configured.
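The artifact-scoped endpoint and the `findingId` boundary from the decisions above might look like this in the web client. This is a sketch under assumptions: the `VulnerabilityRow` shape and both function names are hypothetical, and only the route template `GET /api/v1/triage/artifacts/{artifactId}/gated-buckets` comes from the sprint text.

```typescript
// Hypothetical row shape: display id plus the canonical id the scanner expects.
interface VulnerabilityRow {
  vulnId: string;      // display identifier shown in the UI
  findingId?: string;  // canonical identifier required by evidence endpoints
}

// Resolve the canonical id before any evidence/gating/replay call.
// Failing loudly here avoids sending a display-only vulnId to the scanner.
function resolveFindingId(row: VulnerabilityRow): string {
  if (!row.findingId) {
    throw new Error(`Row ${row.vulnId} has no canonical findingId`);
  }
  return row.findingId;
}

// Build the artifact-scoped gated-buckets URL; the artifact id is encoded
// because digests such as "sha256:..." contain reserved characters.
function gatedBucketsUrl(artifactId: string): string {
  return `/api/v1/triage/artifacts/${encodeURIComponent(artifactId)}/gated-buckets`;
}

console.log(gatedBucketsUrl('sha256:abc'));
// /api/v1/triage/artifacts/sha256%3Aabc/gated-buckets
```

Routing and selection can keep using `vulnId`; only the calls that cross into the scanner go through `resolveFindingId`.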
## Next Checkpoints
- Local commit for the repaired triage artifact workspace iteration, then continue the next scratch-stack QA sweep against the remaining live routes/actions.

@@ -284,6 +284,12 @@ type FirstSignalLoadState = 'idle' | 'loading' | 'streaming' | 'error' | 'done';
| `--motion-easing-decelerate` | cubic-bezier(0, 0, 0.2, 1) | Entries |
| `--motion-easing-accelerate` | cubic-bezier(0.4, 0, 1, 1) | Exits |
### 8.4 Browser TTFS Emission
- Web TTFS surfaces emit via the shared frontend `TelemetryClient`; they do not post directly to a page-local TTFS endpoint.
- This keeps browser-side TTFS aligned with global sampling, queue persistence, and offline/no-ingest behavior.
- When no telemetry ingest endpoint is configured in a local or scratch setup, TTFS emission must degrade to a silent no-op: it must never block the user flow or surface runtime errors.
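A minimal sketch of the no-ingest behavior described above, assuming a queue-based client. `SketchTelemetryClient`, `TtfsEvent`, and `Transport` are hypothetical names for illustration and are not the shared `TelemetryClient` API.

```typescript
// Hypothetical event and transport shapes.
interface TtfsEvent {
  surface: string;
  ttfsMs: number;
}
type Transport = (events: readonly TtfsEvent[]) => Promise<void>;

class SketchTelemetryClient {
  private readonly queue: TtfsEvent[] = [];
  private readonly ingestUrl: string | null;
  private readonly transport: Transport;

  constructor(ingestUrl: string | null, transport: Transport) {
    this.ingestUrl = ingestUrl;
    this.transport = transport;
  }

  // Number of events waiting to be sent.
  get pending(): number {
    return this.queue.length;
  }

  // Queue an event; with no ingest endpoint configured this is a no-op.
  emit(event: TtfsEvent): void {
    if (!this.ingestUrl) {
      return;
    }
    this.queue.push(event);
  }

  // Send queued events. Transport failures are swallowed so telemetry
  // can never break the user flow; the failed batch is dropped here.
  async flush(): Promise<number> {
    if (this.queue.length === 0) {
      return 0;
    }
    const batch = this.queue.splice(0, this.queue.length);
    try {
      await this.transport(batch);
      return batch.length;
    } catch {
      return 0;
    }
  }
}
```

A production client would more likely retry or persist a failed batch instead of dropping it; the sketch only shows that neither `emit` nor `flush` throws to the caller.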
## 9) Failure Signatures
Failure signatures enable predictive "last known outcome" by pattern-matching historical failures.