Below is a cohesive set of 7 product advisories that together define an “AI-native” Stella Ops with defensible moats. Each advisory follows the same structure:
- Problem (what hurts today)
- Why (why Stella should solve it)
- What we ship (capabilities, boundaries)
- How we achieve (proposed AdvisoryAI backend modules + key UI components)
- Guardrails (safety / trust / determinism)
- KPIs (how you prove it works)
I’m assuming your canonical object model already includes Runs (incident/escalation/change investigation runs) and a system-of-record in PostgreSQL with Valkey as a non-authoritative accelerator.
ADVISORY-AI-000 — AdvisoryAI Foundation: Chat + Workbench + Runs (the “AI OS surface”)
Problem
Most “AI in ops” fails because it’s only a chat box. Chat is not:
- auditable
- repeatable
- actionable with guardrails
- collaborative (handoffs, approvals, artifacts)
Operators need a place where AI output becomes objects (runs, decisions, patches, evidence packs), not ephemeral text.
Why we do it
This advisory is the substrate for all other moats. Without it, your other features remain demos.
What we ship
- AdvisoryAI Orchestrator that can:
  - read Stella objects (runs, services, policies, evidence)
  - propose plans
  - call tools/actions (within policy)
  - produce structured artifacts (patches, decision records, evidence packs)
- AI Workbench UI:
  - Chat panel for intent
  - Artifact cards (Run, Playbook Patch, Decision, Evidence Pack)
  - Run Timeline view (what happened, tool calls, approvals, outputs)
How we achieve (modules + UI)
Backend modules (suggested)
- `StellaOps.AdvisoryAI.WebService`
  - Conversation/session orchestration
  - Tool routing + action execution requests
  - Artifact creation (Run notes, patches, decisions)
- `StellaOps.AdvisoryAI.Prompting`
  - Prompt templates versioned + hashed
  - Guarded system prompts per "mode"
- `StellaOps.AdvisoryAI.Tools`
  - Tool contracts (read-only queries, action requests)
- `StellaOps.AdvisoryAI.Eval`
  - Regression tests for tool correctness + safety
UI components
- `AiChatPanelComponent`
- `AiArtifactCardComponent` (Run / Decision / Patch / Evidence Pack)
- `RunTimelineComponent` (with "AI steps" and "human steps")
- `ModeSelectorComponent` (Analyst / Operator / Autopilot)
Canonical flow
User intent (chat)
-> AdvisoryAI proposes plan (steps)
-> executes read-only tools
-> generates artifact(s)
-> requests approvals for risky actions
-> records everything on Run timeline
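As a sketch of how this flow could be driven in code: the types below are illustrative assumptions (`PlanStep`, `IRunLedger`, and the rest are not existing StellaOps APIs) and only show the shape of the loop — plan steps, an approval gate for risky steps, and a ledger write per step.

```csharp
// Sketch only: illustrative types for the canonical flow above.
// None of these names are existing StellaOps APIs.
using System;
using System.Collections.Generic;

public enum StepKind { ReadOnlyTool, ArtifactGeneration, RiskyAction }

public record PlanStep(string Description, StepKind Kind);

public record Plan(Guid RunId, IReadOnlyList<PlanStep> Steps);

public interface IRunLedger
{
    void Record(Guid runId, string entry); // every step lands on the Run timeline
}

public sealed class AdvisoryOrchestrator
{
    private readonly IRunLedger _ledger;
    public AdvisoryOrchestrator(IRunLedger ledger) => _ledger = ledger;

    public void Execute(Plan plan, Func<PlanStep, bool> approvalGate)
    {
        foreach (var step in plan.Steps)
        {
            // Risky actions never run without an explicit approval decision.
            if (step.Kind == StepKind.RiskyAction && !approvalGate(step))
            {
                _ledger.Record(plan.RunId, $"BLOCKED (approval denied): {step.Description}");
                continue;
            }
            _ledger.Record(plan.RunId, $"EXECUTED: {step.Description}");
        }
    }
}
```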
Guardrails
- Every AI interaction writes to a Run (or attaches to an existing Run).
- Prompt templates are versioned + hashed.
- Tool calls and outputs are persisted (for audit and replay).
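The "versioned + hashed" guardrail can be met by content-addressing each template. A minimal sketch, assuming a hypothetical `PromptTemplate` record (not an existing StellaOps type):

```csharp
// Sketch: content-addressing a prompt template so the exact text used
// in a session can be pinned and audited. Names are illustrative.
using System;
using System.Security.Cryptography;
using System.Text;

public record PromptTemplate(string Name, int Version, string Body)
{
    // SHA-256 over name, version, and body; stored alongside every AI session.
    public string Hash => Convert.ToHexString(
        SHA256.HashData(Encoding.UTF8.GetBytes($"{Name}\n{Version}\n{Body}")));
}
```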
KPIs
- % AI sessions attached to Runs
- “Time to first useful artifact”
- Operator adoption (weekly active users of Workbench)
ADVISORY-AI-001 — Evidence-First Outputs (trust-by-construction)
Problem
In ops, an answer without evidence is a liability. LLMs are persuasive even when wrong. Operators waste time verifying or, worse, act on incorrect claims.
Why we do it
Evidence-first output is the trust prerequisite for:
- automation
- playbook learning
- org memory
- executive reporting
What we ship
- A Claim → Evidence constraint:
  - Each material claim must be backed by an `EvidenceRef` (query snapshot, ticket, pipeline run, commit, config state).
- An Evidence Pack artifact:
  - A shareable bundle of evidence for an incident/change/review.
How we achieve (modules + UI)
Backend modules
- `StellaOps.AdvisoryAI.Evidence`
  - Claim extraction from model output
  - Evidence retrieval + snapshotting
  - Citation enforcement (or downgrade claim confidence)
- `StellaOps.EvidenceStore`
  - Immutable (or content-addressed) snapshots
  - Hashes, timestamps, query parameters
UI components
- `EvidenceSidePanelComponent` (opens from inline citations)
- `EvidencePackViewerComponent`
- `ConfidenceBadgeComponent` (Verified / Inferred / Unknown)
Implementation pattern
- For each answer:
  1. Draft the response
  2. Extract claims
  3. Attach evidence refs
  4. If evidence is missing: label the claim as uncertain + propose verification steps
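A minimal sketch of the enforcement step, assuming hypothetical `Claim` and `EvidenceRef` shapes; the key property is that a claim without evidence gets labeled rather than asserted:

```csharp
// Sketch: citation enforcement for the pattern above. Illustrative types only.
using System.Collections.Generic;
using System.Linq;

public enum Confidence { Verified, Inferred, Unknown }

public record EvidenceRef(string Kind, string Pointer); // e.g. ("query-snapshot", "sha256:...")

public record Claim(string Text, List<EvidenceRef> Evidence)
{
    public Confidence Confidence =>
        Evidence.Count > 0 ? Confidence.Verified : Confidence.Unknown;
}

public static class CitationEnforcer
{
    // Claims without evidence are kept but downgraded, never silently asserted.
    public static IEnumerable<string> Render(IEnumerable<Claim> claims) =>
        claims.Select(c => c.Confidence == Confidence.Verified
            ? $"{c.Text} [{c.Evidence.Count} ref(s)]"
            : $"{c.Text} [UNVERIFIED — propose verification step]");
}
```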
Guardrails
- If evidence is missing, Stella must not assert certainty.
- Evidence snapshots must capture (sketched below):
  - query inputs
  - time range
  - raw result (or hash + storage pointer)
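A content-addressed snapshot could look like this sketch (field names are assumptions, not the actual `StellaOps.EvidenceStore` schema):

```csharp
// Sketch: what an EvidenceStore row might capture. Field names are assumptions.
using System;
using System.Security.Cryptography;
using System.Text;

public record EvidenceSnapshot(
    string QueryText,          // query inputs, exactly as executed
    DateTimeOffset From,       // time range start
    DateTimeOffset To,         // time range end
    string RawResultJson)      // raw result (for large results: storage pointer + hash)
{
    public DateTimeOffset CapturedAt { get; } = DateTimeOffset.UtcNow;

    // Content hash makes the snapshot tamper-evident and addressable.
    public string ContentHash => Convert.ToHexString(
        SHA256.HashData(Encoding.UTF8.GetBytes($"{QueryText}|{From:O}|{To:O}|{RawResultJson}")));
}
```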
KPIs
- Citation coverage (% of answers with evidence refs)
- Reduced back-and-forth (“how do you know?” rate)
- Adoption of automation after evidence-first rollout
ADVISORY-AI-002 — Policy-Aware Automation (safe actions, not just suggestions)
Problem
The main blocker to “AI that acts” is governance:
- wrong environment
- insufficient permission
- missing approvals
- non-idempotent actions
- unclear accountability
Why we do it
If Stella can’t safely execute actions, it will remain a read-only assistant. Policy-aware automation is a hard moat because it requires real engineering discipline and operational maturity.
What we ship
- A typed Action Registry:
  - schemas, risk levels, idempotency, rollback/compensation
- A Policy decision point (PDP) before any action:
  - allow / allow-with-approvals / deny
- An Approval workflow linked to Runs
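As a sketch, a registry entry and PDP verdict might look like this (all names hypothetical; the rules shown are toy examples of the environment protections described above):

```csharp
// Sketch: typed action definition plus a policy decision point.
// Illustrative only — not the actual StellaOps.ActionRegistry contract.
using System;

public enum RiskLevel { Low, Medium, High }

public enum PolicyVerdict { Allow, AllowWithApprovals, Deny }

public record ActionDefinition(
    string Name,
    RiskLevel Risk,
    bool Idempotent,
    string? CompensatingAction); // rollback/compensation, where one exists

public record ActionRequest(
    ActionDefinition Definition,
    string IdempotencyKey,   // guardrail: required on every action
    Guid RunId,              // audit linkage back to the Run
    string Environment);

public static class PolicyDecisionPoint
{
    // Toy rules: high-risk actions without rollback are denied outright,
    // and production is always approval-gated.
    public static PolicyVerdict Decide(ActionRequest req)
    {
        if (req.Definition.Risk == RiskLevel.High && req.Definition.CompensatingAction is null)
            return PolicyVerdict.Deny;
        if (req.Environment == "production")
            return PolicyVerdict.AllowWithApprovals;
        return PolicyVerdict.Allow;
    }
}
```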
How we achieve (modules + UI)
Backend modules
- `StellaOps.ActionRegistry`
  - Action definitions + schemas + risk metadata
- `StellaOps.PolicyEngine`
  - Rules: environment protections, freeze windows, role constraints
- `StellaOps.AdvisoryAI.Automation`
  - Converts intent → action proposals
  - Submits action requests after approvals
- `StellaOps.RunLedger`
  - Every action request + result is a ledger entry
UI components
- `ActionProposalCardComponent`
- `ApprovalModalComponent` (scoped approval: this action / this run / this window)
- `PolicyExplanationComponent` (human-readable "why allowed/denied")
- `RollbackPanelComponent`
Guardrails
- Default: propose actions; only auto-execute in explicitly configured "Autopilot scopes."
- Every action must support:
  - idempotency key
  - audit fields (why, ticket/run linkage)
  - reversible/compensating action where feasible
KPIs
- % actions proposed vs executed
- “Policy prevented incident” count
- Approval latency and action success rate
ADVISORY-AI-003 — Ops Memory (structured, durable, queryable)
Problem
Teams repeat incidents because knowledge lives in:
- chat logs
- tribal memory
- scattered tickets
- unwritten heuristics
Chat history is not an operational knowledge base: it’s unstructured and hard to reuse safely.
Why we do it
Ops memory reduces repeat work and accelerates diagnosis. It also becomes a defensible dataset because it’s tied to your Runs, artifacts, and outcomes.
What we ship
A set of typed memory objects (not messages):
- `DecisionRecord`
- `KnownIssue`
- `Tactic`
- `Constraint`
- `PostmortemSummary`
Memory is written on:
- Run closure
- approvals (policy events)
- explicit “save as org memory” actions
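A sketch of the typed-objects idea, with hypothetical shapes for two of the memory kinds (the fields mirror the scope/confidence/supersession guardrails described later in this advisory):

```csharp
// Sketch: typed memory objects instead of chat messages. Shapes are illustrative.
using System;

public enum MemoryConfidence { Verified, Anecdotal }

public abstract record MemoryEntry(
    string Scope,                  // service/env/team
    MemoryConfidence Confidence,
    DateTimeOffset? ExpiresAt,     // review/expiry policy hook
    Guid? SupersededBy);           // conflict handling: "superseded by"

public record KnownIssue(
    string Scope, MemoryConfidence Confidence, DateTimeOffset? ExpiresAt, Guid? SupersededBy,
    string Symptom, string RootCause, string Workaround)
    : MemoryEntry(Scope, Confidence, ExpiresAt, SupersededBy);

public record DecisionRecord(
    string Scope, MemoryConfidence Confidence, DateTimeOffset? ExpiresAt, Guid? SupersededBy,
    string Decision, string Rationale, Guid SourceRunId)
    : MemoryEntry(Scope, Confidence, ExpiresAt, SupersededBy);
```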
How we achieve (modules + UI)
Backend modules
- `StellaOps.AdvisoryAI.Memory`
  - Write: extract structured memory from run artifacts
  - Read: retrieve memory relevant to the current context (service/env/symptoms)
  - Conflict handling: "superseded by", timestamps, confidence
- `StellaOps.MemoryStore` (Postgres tables + full-text index as needed)
UI components
- `MemoryPanelComponent` (contextual suggestions during a run)
- `MemoryBrowserComponent` (search + filters)
- `MemoryDiffComponent` (when superseding prior memory)
Guardrails
- Memory entries have:
  - scope (service/env/team)
  - confidence (verified vs anecdotal)
  - review/expiry policies for tactics/constraints
- Never "learn" from unresolved or low-confidence runs by default.
KPIs
- Repeat incident rate reduction
- Time-to-diagnosis delta when memory exists
- Memory reuse rate inside Runs
ADVISORY-AI-004 — Playbook Learning (Run → Patch → Approved Playbook)
Problem
Runbooks/playbooks drift. Operators improvise. The playbook never improves, and the organization pays the same “tuition” repeatedly.
Why we do it
Playbook learning is the compounding loop that turns daily operations into a proprietary advantage. Competitors can generate playbooks; they struggle to continuously improve them from real run traces with review + governance.
What we ship
- Versioned playbooks as structured objects
- Playbook Patch proposals generated from Run traces:
  - coverage patches, repair patches, optimization patches, safety patches, detection patches
- Owner review + approval workflow
How we achieve (modules + UI)
Backend modules
- `StellaOps.Playbooks`
  - Playbook schema + versioning
- `StellaOps.AdvisoryAI.PlaybookLearning`
  - Extract "what we did" from the Run timeline
  - Compare to playbook steps
  - Propose a patch with evidence links (see the sketch below)
- `StellaOps.DiffService`
  - Human-friendly diff output for UI
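The comparison step could start as naively as this sketch (run steps and playbook steps reduced to plain strings; a real implementation would need fuzzy matching):

```csharp
// Sketch: derive coverage-patch candidates by diffing what the run actually did
// against what the playbook says. Deliberately naive; names are illustrative.
using System.Collections.Generic;
using System.Linq;

public record PlaybookPatchProposal(string Kind, string ProposedStep, string EvidenceRunId);

public static class PlaybookLearning
{
    public static IEnumerable<PlaybookPatchProposal> ProposeCoveragePatches(
        string runId,
        IReadOnlyList<string> runSteps,       // extracted from the Run timeline
        IReadOnlyList<string> playbookSteps)  // current playbook version
    {
        // Steps the operators actually performed that the playbook never mentions
        // become candidate coverage patches, each linked back to its source run.
        return runSteps
            .Where(step => !playbookSteps.Contains(step))
            .Select(step => new PlaybookPatchProposal("coverage", step, runId));
    }
}
```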
UI components
- `PlaybookPatchCardComponent`
- `DiffViewerComponent` (Monaco diff or equivalent)
- `PlaybookApprovalFlowComponent`
- `PlaybookCoverageHeatmapComponent` (optional, later)
Guardrails
- Never auto-edit canonical playbooks; only patches + review.
- Require evidence links for each proposed step.
- Prevent one-off contamination by marking patches as:
  - "generalizable" vs "context-specific"
KPIs
- % incidents with a playbook
- Patch acceptance rate
- MTTR improvement for playbook-backed incidents
ADVISORY-AI-005 — Integration Concierge (setup + health + “how-to” that is actually correct)
Problem
Integrations are where tools die:
- users ask “how do I integrate X”
- assistant answers generically
- setup fails because of environment constraints, permissions, webhooks, scopes, retries, or missing prerequisites
- no one can debug it later
Why we do it
Integration handling becomes a moat when it is:
- deterministic (wizard truth)
- auditable (events + actions traced)
- self-healing (retries, backfills, health checks)
- explainable (precise steps, not generic docs)
What we ship
- Integration Setup Wizard per provider (GitLab, Jira, Slack, etc.)
- Integration Health dashboard:
  - last event received
  - last action executed
  - failure reasons + next steps
  - token expiry warnings
- Chat-driven guidance that drives the same wizard backend:
  - when a user asks "how to integrate GitLab," Stella replies with the exact steps for the instance type, auth mode, and required permissions, and can pre-fill a setup plan.
How we achieve (modules + UI)
Backend modules
- `StellaOps.Integrations`
  - Provider contracts: inbound events + outbound actions
  - Normalization into Stella `Signals` and `Actions`
- `StellaOps.Integrations.Reliability`
  - Webhook dedupe, replay, dead-letter, backfill polling (see the sketch below)
- `StellaOps.AdvisoryAI.Integrations`
  - Retrieves provider-specific setup templates
  - Asks only for missing parameters
  - Produces a "setup checklist" artifact attached to a Run or Integration record
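The reliability layer's dedupe can be as simple as this sketch (provider delivery IDs as dedupe keys; all names are assumptions):

```csharp
// Sketch: webhook dedupe + dead-lettering. Illustrative and in-memory only;
// a real implementation would back both structures with Postgres.
using System;
using System.Collections.Concurrent;

public record InboundEvent(string DeliveryId, string Provider, string PayloadJson);

public sealed class WebhookIntake
{
    private readonly ConcurrentDictionary<string, bool> _seen = new();
    private readonly ConcurrentQueue<InboundEvent> _deadLetter = new();

    public void Receive(InboundEvent evt, Action<InboundEvent> process)
    {
        // Providers retry deliveries; the delivery ID makes intake idempotent.
        if (!_seen.TryAdd(evt.DeliveryId, true)) return;

        try { process(evt); }
        catch
        {
            // Failed events go to a dead-letter queue for replay/backfill.
            _deadLetter.Enqueue(evt);
        }
    }
}
```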
UI components
- `IntegrationWizardComponent`
- `IntegrationHealthComponent`
- `IntegrationEventLogComponent` (raw payload headers + body stored securely)
- `SetupChecklistArtifactComponent` (generated by AdvisoryAI)
Guardrails
- Store inbound webhook payloads for replay/debug, with redaction where required.
- Always support reconciliation/backfill (webhooks are never perfectly lossless).
- Use least-privilege token scopes by default, with clear permission error guidance.
KPIs
- Time-to-first-successful-event
- Integration “healthy” uptime
- Setup completion rate without human support
ADVISORY-AI-006 — Outcome Analytics (prove ROI with credible attribution)
Problem
AI features are easy to cut in budgeting because value is vague. “It feels faster” doesn’t survive scrutiny.
Why we do it
Outcome analytics makes Stella defensible to leadership and helps prioritize what to automate next. It also becomes a dataset for continuous improvement.
What we ship
- Baseline metrics (before Stella influence):
  - MTTA, MTTR, escalation count, repeat incidents, deploy failure rate (as relevant)
- Attribution model (only count impact when Stella materially contributed; see the sketch below):
  - playbook patch accepted
  - evidence pack used
  - policy-gated action executed
  - memory entry reused
- Monthly/weekly impact reports
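A sketch of the conservative attribution rule: an outcome only counts toward Stella's impact if its run carries at least one qualifying artifact (types and artifact-kind strings are illustrative):

```csharp
// Sketch: conservative attribution — an incident's MTTR only counts toward
// Stella's impact when its run carries a qualifying artifact. Names illustrative.
using System;
using System.Collections.Generic;
using System.Linq;

public record RunOutcome(Guid RunId, TimeSpan Mttr, IReadOnlyList<string> ArtifactKinds);

public static class Attribution
{
    private static readonly HashSet<string> Qualifying = new()
    {
        "playbook-patch-accepted", "evidence-pack-used",
        "policy-gated-action-executed", "memory-entry-reused",
    };

    // Average MTTR (minutes) for runs with vs without Stella contribution.
    public static (double WithStella, double Without) MttrSplitMinutes(IEnumerable<RunOutcome> runs)
    {
        var groups = runs.ToLookup(r => r.ArtifactKinds.Any(Qualifying.Contains));
        double Avg(IEnumerable<RunOutcome> g) =>
            g.Any() ? g.Average(r => r.Mttr.TotalMinutes) : 0;
        return (Avg(groups[true]), Avg(groups[false]));
    }
}
```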
How we achieve (modules + UI)
Backend modules
- `StellaOps.Analytics`
  - Metric computation + cohorts (by service/team/severity)
- `StellaOps.AdvisoryAI.Attribution`
  - Joins outcomes to AI artifacts and actions in the Run ledger
- `StellaOps.Reporting`
  - Scheduled report generation (exportable)
UI components
- `OutcomeDashboardComponent`
- `AttributionBreakdownComponent`
- `ExecutiveReportExportComponent`
Guardrails
- Avoid vanity metrics (“number of chats”).
- Always show confidence/limitations in attribution (correlation vs causation).
KPIs
- MTTR delta (with Stella artifacts vs without)
- Repeat incident reduction
- Escalation reduction
- Automation coverage growth
One unifying implementation note: “AdvisoryAI” should output objects, not prose
To make all seven advisories work together, standardize on a small set of AI-produced artifacts:
- `Plan` (step list with tools/actions)
- `EvidencePack`
- `DecisionRecord`
- `PlaybookPatch`
- `IntegrationSetupChecklist`
- `RunSummary` (postmortem-ready)
Every artifact is:
- versioned
- evidence-linked
- attached to a Run
- subject to policy gates when it triggers actions
This gives you:
- auditability
- deterministic replay of the inputs and tool outputs
- compounding “Ops memory” and “Playbook learning” data
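A sketch of the common envelope these artifacts could share (illustrative, not a committed schema):

```csharp
// Sketch: the shared envelope for AI-produced artifacts. Every concrete
// artifact (Plan, EvidencePack, PlaybookPatch, ...) would carry these fields.
using System;
using System.Collections.Generic;

public record ArtifactEnvelope(
    Guid ArtifactId,
    string Kind,                          // "Plan", "EvidencePack", "PlaybookPatch", ...
    int Version,                          // versioned
    Guid RunId,                           // attached to a Run
    IReadOnlyList<string> EvidenceRefs,   // evidence-linked
    bool TriggersActions);                // if true, subject to policy gates
```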
Example: how this maps to “AdvisoryAI module + Chat UI component”
A coherent minimum viable architecture:
Backend
- `StellaOps.AdvisoryAI.WebService` (orchestrator)
- `StellaOps.AdvisoryAI.Evidence` (citations)
- `StellaOps.AdvisoryAI.Memory` (structured memory)
- `StellaOps.AdvisoryAI.PlaybookLearning` (patch proposals)
- `StellaOps.AdvisoryAI.Integrations` (setup guidance + checklists)
- `StellaOps.PolicyEngine` + `StellaOps.ActionRegistry` (gated actions)
- `StellaOps.RunLedger` + `StellaOps.EvidenceStore` (audit + snapshots)
UI
- AI chat panel that emits/accepts artifact cards
- Evidence side panel
- Approval modal
- Diff viewer for patches
- Integration wizard + health
If you want, I can translate these advisories into a single internal spec pack:
- database tables (Postgres)
- event contracts (signals/actions)
- JSON schemas for artifacts/actions
- UI navigation and component tree
- the first 10 “golden workflows” you should ship with the Workbench