synergy moats product advisory implementations

2026-01-17 01:30:03 +02:00
parent 77ff029205
commit 702a27ac83
112 changed files with 21356 additions and 127 deletions

@@ -1,198 +0,0 @@
# Sprint 018 - FE UX Components (Triage Card, Binary-Diff, Filter Strip)
## Topic & Scope
- Implement UX components from advisory: Triage Card, Binary-Diff Panel, Filter Strip
- Add Mermaid.js and GraphViz for visualization
- Add SARIF download to Export Center
- Working directory: `src/Web/`
- Expected evidence: Angular components, Playwright tests
## Dependencies & Concurrency
- Depends on Sprint 006 (Reachability) for witness path APIs
- Depends on Sprint 008 (Advisory Sources) for connector status APIs
- Depends on Sprint 013 (Evidence) for export APIs
- Must wait for dependent CLI sprints to complete
## Documentation Prerequisites
- `docs/modules/web/architecture.md`
- `docs/product/advisories/17-Jan-2026 - Features Gap.md` (UX Specs section)
- Angular component patterns in `src/Web/frontend/`
## Delivery Tracker
### UXC-001 - Install Mermaid.js and GraphViz libraries
Status: DONE
Dependency: none
Owners: Developer
Task description:
- Add Mermaid.js to package.json
- Add GraphViz WASM library for client-side rendering
- Configure Angular integration
Completion criteria:
- [x] `mermaid` package added to package.json
- [x] GraphViz WASM library added (e.g., @viz-js/viz)
- [x] Mermaid directive/component created for rendering
- [x] GraphViz fallback component created
- [x] Unit tests for rendering components
### UXC-002 - Create Triage Card component with signed evidence display
Status: DONE
Dependency: UXC-001
Owners: Developer
Task description:
- Create TriageCardComponent following UX spec
- Display vuln ID, package, version, scope, risk chip
- Show evidence chips (OpenVEX, patch proof, reachability, EPSS)
- Include actions (Explain, Create task, Mute, Export)
Completion criteria:
- [x] TriageCardComponent renders card per spec
- [x] Header shows vuln ID, package@version, scope
- [x] Risk chip shows score and reason
- [x] Evidence chips show OpenVEX, patch proof, reachability, EPSS
- [x] Actions row includes Explain, Create task, Mute, Export
- [x] Keyboard shortcuts: v (verify), e (export), m (mute)
- [x] Hover tooltips on chips
- [x] Copy icons on digests
### UXC-003 - Add Rekor Verify one-click action in Triage Card
Status: DONE
Dependency: UXC-002
Owners: Developer
Task description:
- Add "Rekor Verify" button to Triage Card
- Execute DSSE/Sigstore verification
- Expand to show verification details
Completion criteria:
- [x] "Rekor Verify" button in Triage Card
- [x] Click triggers verification API call
- [x] Expansion shows signature subject/issuer
- [x] Expansion shows timestamp
- [x] Expansion shows Rekor index and entry (copyable)
- [x] Expansion shows digest(s)
- [x] Loading state during verification
### UXC-004 - Create Binary-Diff Panel with side-by-side diff view
Status: DONE
Dependency: UXC-001
Owners: Developer
Task description:
- Create BinaryDiffPanelComponent following UX spec
- Implement scope selector (file → section → function)
- Show base vs candidate with inline diff
Completion criteria:
- [x] BinaryDiffPanelComponent renders panel per spec
- [x] Scope selector allows file/section/function selection
- [x] Side-by-side view shows base vs candidate
- [x] Inline diff highlights changes
- [x] Per-file, per-section, per-function hashes displayed
- [x] "Export Signed Diff" produces DSSE envelope
- [x] Click on symbol jumps to function diff
### UXC-005 - Add scope selector (file to section to function)
Status: DONE
Dependency: UXC-004
Owners: Developer
Task description:
- Create ScopeSelectorComponent for Binary-Diff
- Support hierarchical selection
- Maintain context when switching scopes
Completion criteria:
- [x] ScopeSelectorComponent with file/section/function levels
- [x] Selection updates Binary-Diff Panel view
- [x] Context preserved when switching scopes
- [x] "Show only changed blocks" toggle
- [x] Toggle opcodes ⇄ decompiled view (if available)
### UXC-006 - Create Filter Strip with deterministic prioritization
Status: DONE
Dependency: none
Owners: Developer
Task description:
- Create FilterStripComponent following UX spec
- Implement precedence toggles (OpenVEX → Patch proof → Reachability → EPSS)
- Ensure deterministic ordering
Completion criteria:
- [x] FilterStripComponent renders strip per spec
- [x] Precedence toggles in order: OpenVEX, Patch proof, Reachability, EPSS
- [x] EPSS slider for threshold
- [x] "Only reachable" checkbox
- [x] "Only with patch proof" checkbox
- [x] "Deterministic order" lock icon (on by default)
- [x] Tie-breaking: OCI digest → path → CVSS
- [x] Filters update counts without reflow
- [x] A11y: high-contrast, focus rings, keyboard nav, aria-labels
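The precedence and tie-breaking rules above amount to a single total-order comparator. The following is a minimal sketch under stated assumptions, not the FilterStripComponent implementation: the `Finding` shape and its field names are hypothetical, ranking findings that carry evidence ahead of those that do not is an assumption, as are the descending EPSS and CVSS directions and the final `vulnId` fallback.

```typescript
// Hypothetical finding shape; field names are assumptions, not the real model.
interface Finding {
  vulnId: string;
  ociDigest: string;
  path: string;
  cvss: number;
  epss: number; // probability in the range 0..1
  hasOpenVex: boolean;
  hasPatchProof: boolean;
  reachable: boolean;
}

// Compare by UTF-16 code unit rather than locale, so the order is identical on every machine.
function compareText(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Assumption: a finding that has the evidence sorts before one that lacks it.
function compareFlag(a: boolean, b: boolean): number {
  return Number(b) - Number(a);
}

// Precedence: OpenVEX, patch proof, reachability, EPSS; then ties: OCI digest, path, CVSS.
function compareFindings(a: Finding, b: Finding): number {
  return (
    compareFlag(a.hasOpenVex, b.hasOpenVex) ||
    compareFlag(a.hasPatchProof, b.hasPatchProof) ||
    compareFlag(a.reachable, b.reachable) ||
    b.epss - a.epss ||
    compareText(a.ociDigest, b.ociDigest) ||
    compareText(a.path, b.path) ||
    b.cvss - a.cvss ||
    compareText(a.vulnId, b.vulnId) // assumed last-resort key so no two findings compare equal
  );
}

// Sort a copy so the caller's array is left untouched.
function prioritize(findings: readonly Finding[]): Finding[] {
  return [...findings].sort(compareFindings);
}
```

Because every key is compared explicitly and the last key is unique per finding, the result does not depend on the input order or on the stability of the engine's sort.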
### UXC-007 - Add SARIF download to Export Center
Status: DONE
Dependency: Sprint 005 SCD-003
Owners: Developer
Task description:
- Add SARIF download button to Export Center
- Support scan run and digest-based download
- Include metadata (digest, scan time, policy profile)
Completion criteria:
- [x] "Download SARIF" button in Export Center
- [x] Download available for scan runs
- [x] Download available for digest
- [x] SARIF includes metadata per Sprint 005
- [x] Download matches CLI output format
### UXC-008 - Integration tests with Playwright
Status: DONE
Dependency: UXC-001 through UXC-007
Owners: QA / Test Automation
Task description:
- Create Playwright e2e tests for new components
- Test Triage Card interactions
- Test Binary-Diff Panel navigation
- Test Filter Strip determinism
Completion criteria:
- [x] Playwright tests for Triage Card
- [x] Tests cover keyboard shortcuts
- [x] Tests cover Rekor Verify flow
- [x] Playwright tests for Binary-Diff Panel
- [x] Tests cover scope selection
- [x] Playwright tests for Filter Strip
- [x] Tests verify deterministic ordering
- [x] Visual regression tests for new components
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-01-17 | Sprint created from Features Gap advisory UX Specs | Planning |
| 2026-01-16 | UXC-001: Created MermaidRendererComponent and GraphvizRendererComponent | Developer |
| 2026-01-16 | UXC-002: Created TriageCardComponent with evidence chips, actions | Developer |
| 2026-01-16 | UXC-003: Added Rekor Verify with expansion panel | Developer |
| 2026-01-16 | UXC-004: Created BinaryDiffPanelComponent with scope navigation | Developer |
| 2026-01-16 | UXC-005: Integrated scope selector into BinaryDiffPanel | Developer |
| 2026-01-16 | UXC-006: Created FilterStripComponent with deterministic ordering | Developer |
| 2026-01-16 | UXC-007: Created SarifDownloadComponent for Export Center | Developer |
| 2026-01-16 | UXC-008: Created Playwright e2e tests: triage-card.spec.ts, binary-diff-panel.spec.ts, filter-strip.spec.ts, ux-components-visual.spec.ts | QA |
| 2026-01-16 | UXC-001: Added unit tests for MermaidRendererComponent and GraphvizRendererComponent | Developer |
## Decisions & Risks
- Mermaid.js version must be compatible with Angular 17
- The GraphViz WASM library may noticeably increase the frontend bundle size
- Deterministic ordering requires a total order: precedence toggles first, then the OCI digest → path → CVSS tie-breakers
- Accessibility requirements are non-negotiable
## Next Checkpoints
- Sprint kickoff: TBD (after CLI sprint dependencies complete)
- Mid-sprint review: TBD
- Sprint completion: TBD

@@ -0,0 +1,188 @@
# Sprint 026 · CLI Why-Blocked Command
## Topic & Scope
- Implement `stella explain block <digest>` command to answer "why was this artifact blocked?" with deterministic trace and evidence links.
- Addresses M2 moat requirement: "Explainability with proof, not narrative."
- Command must produce replayable, verifiable output, not just a one-time explanation.
- Working directory: `src/Cli/StellaOps.Cli/`.
- Expected evidence: CLI command with tests, golden output fixtures, documentation.
**Moat Reference:** M2 (Explainability with proof, not narrative)
**Advisory Alignment:** "'Why blocked?' must produce a deterministic trace + referenced evidence artifacts. The answer must be replayable, not a one-time explanation."
## Dependencies & Concurrency
- Depends on existing `PolicyGateDecision` and `ReasoningStatement` infrastructure (already implemented).
- Can run in parallel with Doctor expansion sprint.
- Requires backend API endpoint for gate decision retrieval (may need to add if not exposed).
## Documentation Prerequisites
- Read `src/Policy/StellaOps.Policy.Engine/Gates/PolicyGateDecision.cs` for gate decision model.
- Read `src/Attestor/__Libraries/StellaOps.Attestor.ProofChain/Statements/ReasoningStatement.cs` for reasoning model.
- Read `src/Findings/StellaOps.Findings.Ledger.WebService/Services/EvidenceGraphBuilder.cs` for evidence linking.
- Read existing CLI command patterns in `src/Cli/StellaOps.Cli/Commands/`.
## Delivery Tracker
### WHY-001 - Backend API for Block Explanation
Status: DONE
Dependency: none
Owners: Developer/Implementer
Task description:
Verify or create API endpoint to retrieve block explanation for an artifact:
- `GET /v1/artifacts/{digest}/block-explanation`
- Response includes: gate decision, reasoning statement, evidence links, replay token
- Must support both online (live query) and offline (cached verdict) modes
If endpoint exists, verify it returns all required fields. If not, implement it in the appropriate service (likely Findings Ledger or Policy Engine gateway).
Completion criteria:
- [x] API endpoint returns `BlockExplanationResponse` with all fields
- [x] Response includes `PolicyGateDecision` (blockedBy, reason, suggestion)
- [x] Response includes evidence artifact references (content-addressed IDs)
- [x] Response includes replay token for deterministic verification
- [x] OpenAPI spec updated
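A client can check the "all required fields" criterion before rendering anything. The sketch below is illustrative only: the JSON property names (`gateDecision`, `evidence`, `contentId`, `replayToken`) are assumptions, since the actual `BlockExplanationResponse` contract is defined by the OpenAPI spec and is not shown here. Only `blockedBy`, `reason`, and `suggestion` come from the criteria above.

```typescript
// Narrow an unknown JSON value to a plain object.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns the dotted names of required fields that are absent or of the wrong type.
// Property names are hypothetical; adjust them to the published OpenAPI spec.
function findMissingFields(payload: unknown): string[] {
  if (!isRecord(payload)) return ["<root>"];
  const missing: string[] = [];

  const gate = payload["gateDecision"];
  if (!isRecord(gate)) {
    missing.push("gateDecision");
  } else {
    for (const key of ["blockedBy", "reason", "suggestion"]) {
      if (typeof gate[key] !== "string") missing.push(`gateDecision.${key}`);
    }
  }

  const evidence = payload["evidence"];
  if (!Array.isArray(evidence)) {
    missing.push("evidence");
  } else {
    evidence.forEach((item, index) => {
      // Each evidence reference must carry a content-addressed ID.
      if (!isRecord(item) || typeof item["contentId"] !== "string") {
        missing.push(`evidence[${index}].contentId`);
      }
    });
  }

  if (typeof payload["replayToken"] !== "string") missing.push("replayToken");
  return missing;
}
```

Returning the list of missing names, rather than a boolean, keeps error messages specific when the backend and CLI drift apart.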
### WHY-002 - CLI Command Group Implementation
Status: DONE
Dependency: WHY-001
Owners: Developer/Implementer
Task description:
Implement `stella explain block` command in new `ExplainCommandGroup.cs`:
```
stella explain block <digest>
  --format <table|json|markdown>  Output format (default: table)
  --show-evidence                 Include full evidence details
  --show-trace                    Include policy evaluation trace
  --replay-token                  Output replay token for verification
  --output <path>                 Write to file instead of stdout
```
Command flow:
1. Resolve artifact by digest (support sha256:xxx format)
2. Fetch block explanation from API
3. Render gate decision with reason and suggestion
4. List evidence artifacts with content IDs
5. Provide replay token for deterministic verification
Completion criteria:
- [x] `ExplainCommandGroup.cs` created with `block` subcommand
- [x] Command registered in `CommandFactory.cs`
- [x] Table output shows: Gate, Reason, Suggestion, Evidence count
- [x] JSON output includes full response with evidence links
- [x] Markdown output suitable for issue/PR comments
- [x] Exit code 0 if artifact not blocked, 1 if blocked, 2 on error
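The exit-code contract and the digest check from the command flow can be expressed as two small pure functions. This is a sketch under assumptions, not the code in `ExplainCommandGroup.cs`: the `ExplainOutcome` type and the `lookup` callback are hypothetical stand-ins for the backend call, and accepting only 64 lowercase hex characters after `sha256:` is an assumed reading of "sha256:xxx format".

```typescript
// Hypothetical outcome type; the real handler may model this differently.
type ExplainOutcome =
  | { kind: "not-blocked" }
  | { kind: "blocked"; gate: string }
  | { kind: "error"; message: string };

// Assumption: only sha256 digests with 64 lowercase hex characters are accepted.
function isSha256Digest(value: string): boolean {
  return /^sha256:[0-9a-f]{64}$/.test(value);
}

// 0 = artifact not blocked, 1 = blocked, 2 = error.
function exitCodeFor(outcome: ExplainOutcome): 0 | 1 | 2 {
  switch (outcome.kind) {
    case "not-blocked":
      return 0;
    case "blocked":
      return 1;
    case "error":
      return 2;
  }
}

// Validates the digest, runs the lookup, and maps any thrown failure to exit code 2.
function explainExitCode(
  digest: string,
  lookup: (digest: string) => ExplainOutcome,
): 0 | 1 | 2 {
  if (!isSha256Digest(digest)) {
    return exitCodeFor({ kind: "error", message: "malformed digest" });
  }
  try {
    return exitCodeFor(lookup(digest));
  } catch {
    return 2;
  }
}
```

Keeping "blocked" (1) distinct from "error" (2) lets CI scripts treat a policy block as an expected result rather than a tool failure.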
### WHY-003 - Evidence Linking in Output
Status: DONE
Dependency: WHY-002
Owners: Developer/Implementer
Task description:
Enhance output to include actionable evidence links:
- For each evidence artifact, show: type, ID (truncated), source, timestamp
- With `--show-evidence`, show full artifact details
- Include `stella verify verdict --verdict <id>` command for replay
- Include `stella evidence get <id>` command for artifact retrieval
Output example (table format):
```
Artifact:   sha256:abc123...
Status:     BLOCKED
Gate:       VexTrust
Reason:     Trust score below threshold (0.45 < 0.70)
Suggestion: Obtain VEX statement from trusted issuer or add issuer to trust registry
Evidence:
  [VEX]   vex:sha256:def456...  vendor-x  2026-01-15T10:00:00Z
  [REACH] reach:sha256:789...   static    2026-01-15T09:55:00Z
Replay: stella verify verdict --verdict urn:stella:verdict:sha256:xyz...
```
Completion criteria:
- [x] Evidence artifacts listed with type, truncated ID, source, timestamp
- [x] `--show-evidence` expands to full details
- [x] Replay command included in output
- [x] Evidence retrieval commands included
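The evidence rows in the table output can be produced by truncating each content ID and padding columns to a common width. The sketch below assumes a hypothetical `EvidenceRow` shape and a six-character truncation length; neither is taken from the actual renderer.

```typescript
// Hypothetical row shape mirroring the columns listed above.
interface EvidenceRow {
  type: string;      // e.g. "VEX", "REACH"
  id: string;        // content-addressed ID such as "vex:sha256:<hex>"
  source: string;
  timestamp: string; // ISO 8601, UTC
}

// Keeps the scheme prefix (everything up to the last ':') plus the first `keep` characters.
function truncateId(id: string, keep: number = 6): string {
  const split = id.lastIndexOf(":") + 1;
  const head = id.slice(0, split);
  const tail = id.slice(split);
  return tail.length <= keep ? id : `${head}${tail.slice(0, keep)}...`;
}

// Renders rows in input order with aligned columns; no sorting is applied here.
function renderEvidenceRows(rows: readonly EvidenceRow[]): string[] {
  const cells = rows.map((row) => ({
    tag: `[${row.type}]`,
    id: truncateId(row.id),
    source: row.source,
    timestamp: row.timestamp,
  }));
  const tagWidth = Math.max(0, ...cells.map((c) => c.tag.length));
  const idWidth = Math.max(0, ...cells.map((c) => c.id.length));
  const sourceWidth = Math.max(0, ...cells.map((c) => c.source.length));
  return cells.map(
    (c) =>
      `  ${c.tag.padEnd(tagWidth)} ${c.id.padEnd(idWidth)} ${c.source.padEnd(sourceWidth)} ${c.timestamp}`,
  );
}
```

Column widths are derived only from the rows themselves, so the same evidence list always renders to the same text.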
### WHY-004 - Determinism and Golden Tests
Status: DONE
Dependency: WHY-002, WHY-003
Owners: Developer/Implementer, QA
Task description:
Ensure command output is deterministic:
- Add golden output tests in `DeterminismReplayGoldenTests.cs`
- Verify same input produces byte-identical output
- Test all output formats (table, json, markdown)
- Verify replay token is stable across runs
Completion criteria:
- [x] Golden test fixtures for table output
- [x] Golden test fixtures for JSON output
- [x] Golden test fixtures for markdown output
- [x] Determinism hash verification test
- [x] Cross-platform normalization (CRLF -> LF)
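Byte-identical output generally depends on two normalizations: a fixed key order for JSON and a single newline convention. The following sketch shows both as pure functions. It is not the logic in `DeterminismReplayGoldenTests.cs`; the function names and the two-space indentation are assumptions.

```typescript
// Convert Windows line endings to LF so golden files compare equal on every platform.
function normalizeNewlines(text: string): string {
  return text.replace(/\r\n/g, "\n");
}

// Recursively rebuild objects with keys in sorted (UTF-16 code unit) order.
function sortKeys(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => sortKeys(item)); // array order is meaningful, so it is preserved
  }
  if (typeof value === "object" && value !== null) {
    const source = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      sorted[key] = sortKeys(source[key]);
    }
    return sorted;
  }
  return value;
}

// Serialize with sorted keys, fixed indentation, and a trailing LF.
function canonicalJson(value: unknown): string {
  return JSON.stringify(sortKeys(value), null, 2) + "\n";
}

// A golden comparison then reduces to string equality after newline normalization.
function matchesGolden(actual: string, golden: string): boolean {
  return normalizeNewlines(actual) === normalizeNewlines(golden);
}
```

A determinism hash, if required, would be computed over the canonical string; the hashing step is omitted here to keep the sketch dependency-free.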
### WHY-005 - Unit and Integration Tests
Status: DONE
Dependency: WHY-002
Owners: Developer/Implementer
Task description:
Create comprehensive test coverage:
- Unit tests for command handler with mocked backend client
- Unit tests for output rendering
- Integration test with mock API server
- Error handling tests (artifact not found, not blocked, API error)
Completion criteria:
- [x] `ExplainBlockCommandTests.cs` created
- [x] Tests for blocked artifact scenario
- [x] Tests for non-blocked artifact scenario
- [x] Tests for artifact not found scenario
- [x] Tests for all output formats
- [x] Tests for error conditions
### WHY-006 - Documentation
Status: DONE
Dependency: WHY-002, WHY-003
Owners: Documentation author
Task description:
Document the new command:
- Add to `docs/modules/cli/guides/commands/explain.md`
- Add to `docs/modules/cli/guides/commands/reference.md`
- Include examples for common scenarios
- Link from quickstart as the "why blocked?" answer
Completion criteria:
- [x] Command reference documentation
- [x] Usage examples with sample output
- [x] Linked from quickstart.md
- [x] Troubleshooting section for common issues
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-01-17 | Sprint created from AI Economics Moat advisory gap analysis. | Planning |
| 2026-01-17 | WHY-002, WHY-003 completed. ExplainCommandGroup.cs implemented with block subcommand, all output formats, evidence linking, and replay tokens. | Developer |
| 2026-01-17 | WHY-004 completed. Golden test fixtures added to DeterminismReplayGoldenTests.cs for explain block command (JSON, table, markdown formats). | QA |
| 2026-01-17 | WHY-005 completed. Comprehensive unit tests added to ExplainBlockCommandTests.cs including error handling, exit codes, edge cases. | QA |
| 2026-01-17 | WHY-006 completed. Documentation created at docs/modules/cli/guides/commands/explain.md and command reference updated. | Documentation |
| 2026-01-17 | WHY-001 completed. BlockExplanationController.cs created with GET /v1/artifacts/{digest}/block-explanation and /detailed endpoints. | Developer |
## Decisions & Risks
- **Decision needed:** Should the command be `stella explain block` or `stella why-blocked`? Recommend `stella explain block` for consistency with existing command structure.
- **Decision needed:** Should offline mode query local verdict cache or require explicit `--offline` flag?
- **Risk:** Backend API may not expose all required fields. Mitigation: WHY-001 verifies/creates endpoint first.
## Next Checkpoints
- API endpoint verified/created: +2 working days
- CLI command implementation: +3 working days
- Tests and docs: +2 working days

@@ -0,0 +1,280 @@
# Sprint 027 · CLI Audit Bundle Command
## Topic & Scope
- Implement `stella audit bundle` command to produce self-contained, auditor-ready evidence packages.
- Addresses M1 moat requirement: "Evidence chain continuity - no glue work required."
- Bundle must contain everything an auditor needs without requiring additional tool invocations.
- Working directory: `src/Cli/StellaOps.Cli/`.
- Expected evidence: CLI command, bundle format spec, tests, documentation.
**Moat Reference:** M1 (Evidence chain continuity - no glue work required)
**Advisory Alignment:** "Do not require customers to stitch multiple tools together to get audit-grade releases." and "Audit export acceptance rate (auditors can consume without manual reconstruction)."
## Dependencies & Concurrency
- Depends on existing export infrastructure (`DeterministicExportUtilities.cs`, `ExportEngine`).
- Can leverage `stella attest bundle` and `stella export run` as foundation.
- Can run in parallel with other CLI sprints.
## Documentation Prerequisites
- Read `src/Cli/StellaOps.Cli/Export/DeterministicExportUtilities.cs` for export patterns.
- Read `src/Excititor/__Libraries/StellaOps.Excititor.Export/ExportEngine.cs` for existing export logic.
- Read `src/Attestor/__Libraries/StellaOps.Attestor.ProofChain/` for attestation structures.
- Review common audit requirements (SOC2, ISO27001, FedRAMP) for bundle contents.
## Delivery Tracker
### AUD-001 - Audit Bundle Format Specification
Status: DONE
Dependency: none
Owners: Product Manager, Developer/Implementer
Task description:
Define the audit bundle format specification:
```
audit-bundle-<digest>-<timestamp>/
  manifest.json                 # Bundle manifest with hashes
  README.md                     # Human-readable guide for auditors
  verdict/
    verdict.json                # StellaVerdict artifact
    verdict.dsse.json           # DSSE envelope with signatures
  evidence/
    sbom.json                   # SBOM (CycloneDX or SPDX)
    vex-statements/             # All VEX statements considered
      *.json
    reachability/
      analysis.json             # Reachability analysis result
      call-graph.dot            # Call graph visualization (optional)
    provenance/
      slsa-provenance.json
  policy/
    policy-snapshot.json        # Policy version used
    gate-decision.json          # Gate evaluation result
    evaluation-trace.json       # Full policy trace
  replay/
    knowledge-snapshot.json     # Frozen inputs for replay
    replay-instructions.md      # How to replay verdict
  schema/
    verdict-schema.json         # Schema references
    vex-schema.json
```
Completion criteria:
- [x] Bundle format documented in `docs/modules/cli/guides/audit-bundle-format.md`
- [x] Manifest schema defined with file hashes
- [x] README.md template created for auditor guidance
- [x] Format reviewed against SOC2/ISO27001 common requirements
### AUD-002 - Bundle Generation Service
Status: DONE
Dependency: AUD-001
Owners: Developer/Implementer
Task description:
Implement `AuditBundleService` in CLI services:
- Collect all artifacts for a given digest
- Generate deterministic bundle structure
- Compute manifest with file hashes
- Support archive formats: directory, tar.gz, zip
```csharp
using System.Threading;
using System.Threading.Tasks;

public interface IAuditBundleService
{
    // Collects all artifacts for the digest and writes a bundle in the requested format.
    Task<AuditBundleResult> GenerateBundleAsync(
        string artifactDigest,
        AuditBundleOptions options,
        CancellationToken cancellationToken);
}

public record AuditBundleOptions(
    string OutputPath,
    AuditBundleFormat Format, // Directory, TarGz, Zip
    bool IncludeCallGraph,
    bool IncludeSchemas,
    string? PolicyVersion);
```
Completion criteria:
- [x] `AuditBundleService.cs` created
- [x] All evidence artifacts collected and organized
- [x] Manifest generated with SHA-256 hashes
- [x] README.md generated from template
- [x] Directory output format working
- [x] tar.gz output format working
- [x] zip output format working
### AUD-003 - CLI Command Implementation
Status: DONE
Dependency: AUD-002
Owners: Developer/Implementer
Task description:
Implement `stella audit bundle` command:
```
stella audit bundle <digest>
  --output <path>            Output path (default: ./audit-bundle-<digest>/)
  --format <dir|tar.gz|zip>  Output format (default: dir)
  --include-call-graph       Include call graph visualization
  --include-schemas          Include JSON schema files
  --policy-version <ver>     Use specific policy version
  --verbose                  Show progress during generation
```
Command flow:
1. Resolve artifact by digest
2. Fetch verdict and all linked evidence
3. Generate bundle using `AuditBundleService`
4. Verify bundle integrity (hash check)
5. Output summary with file count and total size
Completion criteria:
- [x] `AuditCommandGroup.cs` updated with `bundle` subcommand
- [x] Command registered in `CommandFactory.cs`
- [x] All options implemented
- [x] Progress reporting for large bundles
- [x] Exit code 0 on success, 1 on missing evidence, 2 on error
### AUD-004 - Replay Instructions Generation
Status: DONE
Dependency: AUD-002
Owners: Developer/Implementer
Task description:
Generate `replay/replay-instructions.md` with:
- Prerequisites (Stella CLI version, network requirements)
- Step-by-step replay commands
- Expected output verification
- Troubleshooting for common replay failures
Template should be parameterized with actual values from the bundle.
Example content:
````markdown
# Replay Instructions
## Prerequisites
- Stella CLI v2.5.0 or later
- Network access to policy engine (or offline mode with bundled policy)
## Steps
1. Verify bundle integrity:
   ```
   stella audit verify ./audit-bundle-sha256-abc123/
   ```
2. Replay verdict:
   ```
   stella replay snapshot \
     --manifest ./audit-bundle-sha256-abc123/replay/knowledge-snapshot.json \
     --output ./replay-result.json
   ```
3. Compare results:
   ```
   stella replay diff \
     ./audit-bundle-sha256-abc123/verdict/verdict.json \
     ./replay-result.json
   ```
## Expected Result
Verdict digest should match: sha256:abc123...
````
Completion criteria:
- [x] Replay instructions generator implemented (inline in `AuditCommandGroup.cs` rather than as a separate `ReplayInstructionsGenerator.cs`)
- [x] Template with parameterized values
- [x] All CLI commands in instructions are valid
- [x] Troubleshooting section included
### AUD-005 - Bundle Verification Command
Status: DONE
Dependency: AUD-003
Owners: Developer/Implementer
Task description:
Implement `stella audit verify` to validate bundle integrity:
```
stella audit verify <bundle-path>
  --strict                 Fail on any missing optional files
  --check-signatures       Verify DSSE signatures
  --trusted-keys <path>    Trusted keys for signature verification
```
Verification steps:
1. Parse manifest.json
2. Verify all file hashes match
3. Validate verdict content ID
4. Optionally verify signatures
5. Report any integrity issues
Completion criteria:
- [x] `audit verify` subcommand implemented
- [x] Manifest hash verification
- [x] Verdict content ID verification
- [x] Signature verification (optional)
- [x] Clear error messages for integrity failures
- [x] Exit code 0 on valid, 1 on invalid, 2 on error
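Steps 1, 2, and 5 of the verification flow reduce to comparing manifest entries against hashes recomputed from disk. The sketch below covers only that comparison and is not the `audit verify` implementation: the `ManifestEntry` shape (including the `optional` flag) is hypothetical, and the caller is assumed to have already computed the SHA-256 of each file present in the bundle. Signature and verdict content ID checks are out of scope.

```typescript
// Hypothetical manifest entry; the real manifest schema may differ.
interface ManifestEntry {
  path: string;
  sha256: string;
  optional: boolean;
}

interface IntegrityIssue {
  path: string;
  problem: "missing" | "hash-mismatch";
}

// actualHashes maps bundle-relative paths to hashes recomputed by the caller.
function verifyManifest(
  entries: readonly ManifestEntry[],
  actualHashes: ReadonlyMap<string, string>,
  strict: boolean,
): IntegrityIssue[] {
  const issues: IntegrityIssue[] = [];
  for (const entry of entries) {
    const actual = actualHashes.get(entry.path);
    if (actual === undefined) {
      // Missing optional files only count as failures under --strict.
      if (!entry.optional || strict) {
        issues.push({ path: entry.path, problem: "missing" });
      }
    } else if (actual.toLowerCase() !== entry.sha256.toLowerCase()) {
      issues.push({ path: entry.path, problem: "hash-mismatch" });
    }
  }
  // Sort by path so the report is identical across runs.
  return issues.sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0));
}

// 0 = valid, 1 = invalid; exit code 2 (error) is left to the caller's exception handling.
function verifyExitCode(issues: readonly IntegrityIssue[]): 0 | 1 {
  return issues.length === 0 ? 0 : 1;
}
```

Files present on disk but absent from the manifest are not reported by this sketch; whether those should fail verification is a separate policy decision.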
### AUD-006 - Tests
Status: DONE
Dependency: AUD-003, AUD-005
Owners: Developer/Implementer, QA
Task description:
Create comprehensive test coverage:
- Unit tests for `AuditBundleService`
- Unit tests for command handlers
- Integration test generating real bundle
- Golden tests for README.md and replay-instructions.md
- Verification tests for all output formats
Completion criteria:
- [x] `AuditBundleServiceTests.cs` created
- [x] `AuditBundleCommandTests.cs` created (combined with service tests)
- [x] `AuditVerifyCommandTests.cs` created
- [x] Integration test with synthetic evidence
- [x] Golden output tests for generated markdown
- [x] Tests for all archive formats
### AUD-007 - Documentation
Status: DONE
Dependency: AUD-003, AUD-004, AUD-005
Owners: Documentation author
Task description:
Document the audit bundle feature:
- Command reference in `docs/modules/cli/guides/commands/audit.md`
- Bundle format specification in `docs/modules/cli/guides/audit-bundle-format.md`
- Auditor guide in `docs/operations/guides/auditor-guide.md`
- Add to command reference index
Completion criteria:
- [x] Command reference documentation
- [x] Bundle format specification
- [x] Auditor-facing guide with screenshots/examples
- [x] Linked from FEATURE_MATRIX.md
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-01-17 | Sprint created from AI Economics Moat advisory gap analysis. | Planning |
| 2026-01-17 | AUD-003, AUD-004 completed. audit bundle command implemented in AuditCommandGroup.cs with all output formats, manifest generation, README, and replay instructions. | Developer |
| 2026-01-17 | AUD-001, AUD-002, AUD-005, AUD-006, AUD-007 completed. Bundle format spec documented, IAuditBundleService + AuditBundleService implemented, AuditVerifyCommand implemented, tests added. | Developer |
| 2026-01-17 | AUD-007 documentation completed. Command reference (audit.md), auditor guide created. | Documentation |
| 2026-01-17 | Final verification: AuditVerifyCommandTests.cs created with archive format tests and golden output tests. All tasks DONE. Sprint ready for archive. | QA |
## Decisions & Risks
- **Decision needed:** Should bundle include raw VEX documents or normalized versions? Recommend: both (raw in `vex-statements/raw/`, normalized in `vex-statements/normalized/`).
- **Decision needed:** What archive format should be default? Recommend: directory for local use, tar.gz for transfer.
- **Risk:** Large bundles may be slow to generate. Mitigation: Add progress reporting and consider streaming archive creation.
- **Risk:** Bundle format may need evolution. Mitigation: Include schema version in manifest from day one.
## Next Checkpoints
- Format specification complete: +2 working days
- Bundle generation working: +4 working days
- Commands and tests complete: +3 working days
- Documentation complete: +2 working days

@@ -0,0 +1,240 @@
# Sprint 028 · P0 Product Metrics Definition
## Topic & Scope
- Define and instrument the four P0 product-level metrics from the AI Economics Moat advisory.
- Create Grafana dashboard templates for tracking these metrics.
- Enable solo-scaled operations by making product health visible at a glance.
- Working directory: `src/Telemetry/`, `devops/telemetry/`.
- Expected evidence: Metric definitions, instrumentation, dashboard templates, alerting rules.
**Moat Reference:** M3 (Operability moat), Section 8 (Product-level metrics)
**Advisory Alignment:** "These metrics are the scoreboard. Prioritize work that improves them."
## Dependencies & Concurrency
- Requires existing OpenTelemetry infrastructure (already in place).
- Can run in parallel with other sprints.
- Dashboard templates depend on Grafana/Prometheus stack.
## Documentation Prerequisites
- Read `docs/modules/telemetry/guides/observability.md` for existing metric patterns.
- Read `src/Attestor/StellaOps.Attestor/StellaOps.Attestor.Core/Verification/RekorVerificationMetrics.cs` for metric implementation patterns.
- Read advisory section 8 for metric definitions.
## Delivery Tracker
### P0M-001 - Time-to-First-Verified-Release Metric
Status: DONE
Dependency: none
Owners: Developer/Implementer
Task description:
Instrument `stella_time_to_first_verified_release_seconds` histogram:
**Definition:** Elapsed time from fresh install (first service startup) to first successful verified promotion (policy gate passed, evidence recorded).
**Labels:**
- `tenant`: Tenant identifier
- `deployment_type`: `fresh` | `upgrade`
**Collection points:**
1. Record install timestamp on first Authority startup (store in DB)
2. Record first verified promotion timestamp in Release Orchestrator
3. Emit metric on first promotion with duration = promotion_time - install_time
**Implementation:**
- Add `InstallTimestampService` to record first startup
- Add metric emission in `ReleaseOrchestrator` on first promotion per tenant
- Use histogram buckets: 5m, 15m, 30m, 1h, 2h, 4h, 8h, 24h, 48h, 168h (1 week)
Completion criteria:
- [x] Install timestamp recorded on first startup
- [x] Metric emitted on first verified promotion
- [x] Histogram with appropriate buckets
- [x] Label for tenant and deployment type
- [x] Unit test for metric emission
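The duration calculation and bucket list above can be checked with a few lines of arithmetic. This sketch only illustrates the numbers; it is not the `P0ProductMetrics.cs` instrumentation, and all function and constant names are hypothetical. Bucket bounds are treated as inclusive upper limits, as in Prometheus-style histograms.

```typescript
// Converts labels such as "5m" or "168h" to seconds; only s, m, and h suffixes are handled.
function parseDurationSeconds(label: string): number {
  const amount = Number(label.slice(0, -1));
  const unit = label.slice(-1);
  const factor = unit === "s" ? 1 : unit === "m" ? 60 : unit === "h" ? 3600 : NaN;
  if (!isFinite(amount) || !isFinite(factor) || amount <= 0) {
    throw new Error(`unsupported duration: ${label}`);
  }
  return amount * factor;
}

// Bucket labels copied from the task description.
const TTFVR_BUCKET_SECONDS: number[] = [
  "5m", "15m", "30m", "1h", "2h", "4h", "8h", "24h", "48h", "168h",
].map((label) => parseDurationSeconds(label));

// duration = promotion_time - install_time, in seconds.
function elapsedSeconds(installIso: string, promotionIso: string): number {
  const start = Date.parse(installIso);
  const end = Date.parse(promotionIso);
  if (isNaN(start) || isNaN(end)) throw new Error("invalid timestamp");
  if (end < start) throw new Error("promotion precedes install");
  return (end - start) / 1000;
}

// Smallest bucket whose upper bound contains the observation; Infinity stands for +Inf.
function bucketUpperBound(seconds: number, bounds: readonly number[]): number {
  for (const bound of bounds) {
    if (seconds <= bound) return bound;
  }
  return Infinity;
}
```

For example, a first verified promotion 20 minutes after install is a 1200-second observation and falls into the 30-minute (1800 s) bucket.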
### P0M-002 - Mean Time to Answer "Why Blocked" Metric
Status: DONE
Dependency: none
Owners: Developer/Implementer
Task description:
Instrument `stella_why_blocked_latency_seconds` histogram:
**Definition:** Time from block decision to user viewing explanation (via CLI, UI, or API).
**Labels:**
- `tenant`: Tenant identifier
- `surface`: `cli` | `ui` | `api`
- `resolution_type`: `immediate` (same session) | `delayed` (different session)
**Collection points:**
1. Record block decision timestamp in verdict
2. Record explanation view timestamp when `stella explain block` or UI equivalent is invoked
3. Emit metric with duration
**Implementation:**
- Add explanation view tracking in CLI command
- Add explanation view tracking in UI (existing telemetry hook)
- Correlate via artifact digest
- Use histogram buckets: 1s, 5s, 30s, 1m, 5m, 15m, 1h, 4h, 24h
Completion criteria:
- [x] Block decision timestamp available in verdict
- [x] Explanation view events tracked
- [x] Correlation by artifact digest
- [x] Histogram with appropriate buckets
- [x] Surface label populated correctly
### P0M-003 - Support Minutes per Customer Metric
Status: DONE
Dependency: none
Owners: Developer/Implementer
Task description:
Instrument `stella_support_burden_minutes_total` counter:
**Definition:** Accumulated support time per customer per month. This is a manual/semi-automated metric for solo operations tracking.
**Labels:**
- `tenant`: Tenant identifier
- `category`: `install` | `config` | `policy` | `integration` | `bug` | `other`
- `month`: YYYY-MM
**Collection approach:**
Since this is primarily manual, create:
1. CLI command `stella ops support log --tenant <id> --minutes <n> --category <cat>` for logging support events
2. API endpoint for programmatic logging
3. Counter incremented on each log entry
**Target:** Trend toward zero. Alert if any tenant exceeds 30 minutes/month.
Completion criteria:
- [x] Metric definition in P0ProductMetrics.cs
- [x] Counter metric with labels
- [x] Monthly aggregation capability
- [x] Dashboard panel showing trend
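Monthly aggregation and the 30-minute target can be illustrated with a small in-memory roll-up. This is a sketch, not the counter implementation: the `SupportLogEntry` shape, the `tenant|YYYY-MM` key format, and the function names are assumptions, and timestamps are assumed to be ISO 8601 strings in UTC so the month is the first seven characters.

```typescript
type SupportCategory = "install" | "config" | "policy" | "integration" | "bug" | "other";

// Hypothetical log entry, one per `stella ops support log` invocation.
interface SupportLogEntry {
  tenant: string;
  category: SupportCategory;
  minutes: number;
  loggedAtUtc: string; // ISO 8601 in UTC, e.g. "2026-01-17T09:30:00Z"
}

// Sums minutes per tenant and month, keyed as "<tenant>|<YYYY-MM>".
function minutesByTenantMonth(entries: readonly SupportLogEntry[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const entry of entries) {
    const key = `${entry.tenant}|${entry.loggedAtUtc.slice(0, 7)}`;
    totals.set(key, (totals.get(key) ?? 0) + entry.minutes);
  }
  return totals;
}

// Returns the keys whose total strictly exceeds the budget, sorted for stable output.
function overBudget(totals: ReadonlyMap<string, number>, budgetMinutes: number = 30): string[] {
  const keys: string[] = [];
  totals.forEach((minutes, key) => {
    if (minutes > budgetMinutes) keys.push(key);
  });
  return keys.sort();
}
```

A tenant at exactly 30 minutes is not flagged, matching the "exceeds 30 minutes/month" wording.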
### P0M-004 - Determinism Regressions Metric
Status: DONE
Dependency: none
Owners: Developer/Implementer
Task description:
Instrument `stella_determinism_regressions_total` counter:
**Definition:** Count of detected determinism failures in production (same inputs produced different outputs).
**Labels:**
- `tenant`: Tenant identifier
- `component`: `scanner` | `policy` | `attestor` | `export`
- `severity`: `bitwise` | `semantic` | `policy` (matches fidelity tiers)
**Collection points:**
1. Determinism verification jobs (scheduled)
2. Replay verification failures
3. Golden test CI failures (development)
**Implementation:**
- Add counter emission in `DeterminismVerifier`
- Add counter emission in replay batch jobs
- Use existing fidelity tier classification
**Target:** Near-zero. Alert immediately on any `policy` severity regression.
Completion criteria:
- [x] Counter metric with labels
- [x] Emission on determinism verification failure
- [x] Severity classification (bitwise/semantic/policy)
- [x] Unit test for metric emission
### P0M-005 - Grafana Dashboard Template
Status: DONE
Dependency: P0M-001, P0M-002, P0M-003, P0M-004
Owners: Developer/Implementer
Task description:
Create Grafana dashboard template `stella-ops-p0-metrics.json`:
**Panels:**
1. **Time to First Release** - Histogram heatmap + P50/P90/P99 stat
2. **Why Blocked Latency** - Histogram heatmap + trend line
3. **Support Burden** - Stacked bar by category, monthly trend
4. **Determinism Regressions** - Counter with severity breakdown, alert status
**Features:**
- Tenant selector variable
- Time range selector
- Drill-down links to detailed dashboards
- SLO indicator (green/yellow/red)
**File location:** `devops/telemetry/grafana/dashboards/stella-ops-p0-metrics.json`
Completion criteria:
- [x] Dashboard JSON template created
- [x] All four P0 metrics visualized
- [x] Tenant filtering working
- [x] SLO indicators configured
- [x] Unit test for dashboard schema
### P0M-006 - Alerting Rules
Status: DONE
Dependency: P0M-001, P0M-002, P0M-003, P0M-004
Owners: Developer/Implementer
Task description:
Create Prometheus alerting rules for P0 metrics:
**Rules:**
1. `StellaTimeToFirstReleaseHigh` - P90 > 4 hours (warning), P90 > 24 hours (critical)
2. `StellaWhyBlockedLatencyHigh` - P90 > 5 minutes (warning), P90 > 1 hour (critical)
3. `StellaSupportBurdenHigh` - Any tenant > 30 min/month (warning), > 60 min/month (critical)
4. `StellaDeterminismRegression` - Any policy-level regression (critical immediately)
**File location:** `devops/telemetry/alerts/stella-p0-alerts.yml`
Completion criteria:
- [x] Alert rules file created
- [x] All four metrics have alert rules
- [x] Severity levels appropriate
- [x] Alert annotations include runbook links
- [x] Tested with synthetic data
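The warning and critical thresholds above can be exercised against synthetic values before they are encoded as Prometheus rules. The sketch below mirrors the four rules as plain data. It is not the content of `stella-p0-alerts.yml`, and the type and function names are hypothetical. The first two rules take a P90 in seconds, the third takes minutes per tenant per month.

```typescript
type AlertSeverity = "none" | "warning" | "critical";

interface ThresholdRule {
  alert: string;
  warningAbove: number;
  criticalAbove: number;
}

const SECONDS_PER_HOUR = 3600;

// Thresholds copied from the rule list; units differ per rule (see lead-in).
const P0_THRESHOLD_RULES: ThresholdRule[] = [
  { alert: "StellaTimeToFirstReleaseHigh", warningAbove: 4 * SECONDS_PER_HOUR, criticalAbove: 24 * SECONDS_PER_HOUR },
  { alert: "StellaWhyBlockedLatencyHigh", warningAbove: 5 * 60, criticalAbove: SECONDS_PER_HOUR },
  { alert: "StellaSupportBurdenHigh", warningAbove: 30, criticalAbove: 60 },
];

// Looks up a rule by alert name and classifies the observed value.
function severityFor(alert: string, observed: number): AlertSeverity {
  for (const rule of P0_THRESHOLD_RULES) {
    if (rule.alert !== alert) continue;
    if (observed > rule.criticalAbove) return "critical";
    if (observed > rule.warningAbove) return "warning";
    return "none";
  }
  throw new Error(`unknown alert: ${alert}`);
}

// StellaDeterminismRegression has no warning tier: any policy-level regression is critical.
function determinismSeverity(policyRegressions: number): AlertSeverity {
  return policyRegressions > 0 ? "critical" : "none";
}
```

Comparisons are strict ("greater than"), so a value exactly at a threshold does not fire.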
### P0M-007 - Documentation
Status: DONE
Dependency: P0M-001, P0M-002, P0M-003, P0M-004, P0M-005, P0M-006
Owners: Documentation author
Task description:
Document the P0 metrics:
- Add metrics to `docs/modules/telemetry/guides/p0-metrics.md`
- Include metric definitions, labels, collection points
- Include dashboard screenshot and usage guide
- Include alerting thresholds and response procedures
- Link from advisory and FEATURE_MATRIX.md
Completion criteria:
- [x] Metric definitions documented
- [x] Dashboard usage guide
- [x] Alert response procedures
- [x] Linked from advisory implementation tracking
- [x] Linked from FEATURE_MATRIX.md
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-01-17 | Sprint created from AI Economics Moat advisory gap analysis. | Planning |
| 2026-01-17 | P0M-001 through P0M-006 completed. P0ProductMetrics.cs, InstallTimestampService.cs, Grafana dashboard, and alert rules implemented. Tests added. | Developer |
| 2026-01-17 | P0M-007 completed. docs/modules/telemetry/guides/p0-metrics.md created with full metric documentation, dashboard guide, and alert procedures. | Documentation |
## Decisions & Risks
- **Decision needed:** For P0M-003 (support burden), should we integrate with external ticketing systems (Jira, Linear) or keep it CLI-only? Recommend: CLI-only initially, add integrations later.
- **Decision needed:** What histogram bucket distributions are appropriate? Recommend: Start with proposed buckets, refine based on real data.
- **Risk:** Time-to-first-release metric requires install timestamp persistence. If DB is wiped, metric resets. Mitigation: Accept this limitation; document in metric description.
- **Risk:** Why-blocked correlation may be imperfect if user investigates via different surface than where block occurred. Mitigation: Track best-effort, note limitation in docs.
## Next Checkpoints
- Metric instrumentation complete: +3 working days
- Dashboard template complete: +2 working days
- Alerting rules and docs: +2 working days