
Below is a cohesive set of 7 product advisories that together define an “AI-native” Stella Ops with defensible moats. Each advisory follows the same structure:

  • Problem (what hurts today)
  • Why (why Stella should solve it)
  • What we ship (capabilities, boundaries)
  • How we achieve (proposed AdvisoryAI backend modules + key UI components)
  • Guardrails (safety / trust / determinism)
  • KPIs (how you prove it works)

I'm assuming your canonical object model already includes Runs (incident/escalation/change investigation runs) and a system-of-record in PostgreSQL, with Valkey as a non-authoritative accelerator.


ADVISORY-AI-000 — AdvisoryAI Foundation: Chat + Workbench + Runs (the “AI OS surface”)

Problem

Most “AI in ops” fails because it's only a chat box. Chat is not:

  • auditable
  • repeatable
  • actionable with guardrails
  • collaborative (handoffs, approvals, artifacts)

Operators need a place where AI output becomes objects (runs, decisions, patches, evidence packs), not ephemeral text.

Why we do it

This advisory is the substrate for all other moats. Without it, your other features remain demos.

What we ship

  1. AdvisoryAI Orchestrator that can:
  • read Stella objects (runs, services, policies, evidence)
  • propose plans
  • call tools/actions (within policy)
  • produce structured artifacts (patches, decision records, evidence packs)
  2. AI Workbench UI:
  • Chat panel for intent
  • Artifact cards (Run, Playbook Patch, Decision, Evidence Pack)
  • Run Timeline view (what happened, tool calls, approvals, outputs)

How we achieve (modules + UI)

Backend modules (suggested)

  • StellaOps.AdvisoryAI.WebService

    • Conversation/session orchestration
    • Tool routing + action execution requests
    • Artifact creation (Run notes, patches, decisions)
  • StellaOps.AdvisoryAI.Prompting

    • Prompt templates versioned + hashed
    • Guarded system prompts per “mode”
  • StellaOps.AdvisoryAI.Tools

    • Tool contracts (read-only queries, action requests)
  • StellaOps.AdvisoryAI.Eval

    • Regression tests for tool correctness + safety

UI components

  • AiChatPanelComponent
  • AiArtifactCardComponent (Run/Decision/Patch/Evidence Pack)
  • RunTimelineComponent (with “AI steps” and “human steps”)
  • ModeSelectorComponent (Analyst / Operator / Autopilot)

Canonical flow

User intent (chat) 
  -> AdvisoryAI proposes plan (steps)
  -> executes read-only tools
  -> generates artifact(s)
  -> requests approvals for risky actions
  -> records everything on Run timeline
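
A minimal sketch of how this flow could be carried as data rather than prose, assuming illustrative TypeScript shapes (names such as Plan, PlanStep, and RunEvent are hypothetical, not existing Stella Ops types):

```typescript
// Hypothetical shapes for the canonical flow; illustrative only.
type StepKind = "read_tool" | "generate_artifact" | "action_request";

interface PlanStep {
  kind: StepKind;
  description: string;
  tool?: string;              // e.g. a read-only query tool
  requiresApproval?: boolean; // risky actions pause here
}

interface Plan {
  runId: string;              // every session attaches to a Run
  intent: string;             // the user's chat intent
  steps: PlanStep[];
}

interface RunEvent {
  runId: string;
  at: string;                 // ISO timestamp
  actor: "ai" | "human";
  summary: string;            // tool call, approval, or artifact produced
}

// The orchestrator appends a RunEvent for every step it executes,
// so the timeline can be reconstructed without re-reading the chat.
function record(timeline: RunEvent[], event: RunEvent): RunEvent[] {
  return [...timeline, event];
}
```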

Guardrails

  • Every AI interaction writes to a Run (or attaches to an existing Run).
  • Prompt templates are versioned + hashed.
  • Tool calls and outputs are persisted (for audit and replay).

KPIs

  • % AI sessions attached to Runs
  • “Time to first useful artifact”
  • Operator adoption (weekly active users of Workbench)

ADVISORY-AI-001 — Evidence-First Outputs (trust-by-construction)

Problem

In ops, an answer without evidence is a liability. LLMs are persuasive even when wrong. Operators waste time verifying or, worse, act on incorrect claims.

Why we do it

Evidence-first output is the trust prerequisite for:

  • automation
  • playbook learning
  • org memory
  • executive reporting

What we ship

  • A Claim → Evidence constraint:

    • Each material claim must be backed by an EvidenceRef (query snapshot, ticket, pipeline run, commit, config state).
  • An Evidence Pack artifact:

    • A shareable bundle of evidence for an incident/change/review.

How we achieve (modules + UI)

Backend modules

  • StellaOps.AdvisoryAI.Evidence

    • Claim extraction from model output
    • Evidence retrieval + snapshotting
    • Citation enforcement (or downgrade claim confidence)
  • StellaOps.EvidenceStore

    • Immutable (or content-addressed) snapshots
    • Hashes, timestamps, query parameters

UI components

  • EvidenceSidePanelComponent (opens from inline citations)
  • EvidencePackViewerComponent
  • ConfidenceBadgeComponent (Verified / Inferred / Unknown)

Implementation pattern

  • For each answer:

    1. Draft response
    2. Extract claims
    3. Attach evidence refs
    4. If evidence missing: label as uncertain + propose verification steps
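
A minimal sketch of steps 2–4 above, assuming hypothetical Claim and EvidenceRef shapes in TypeScript (the field names are illustrative, not the actual StellaOps.AdvisoryAI.Evidence contract):

```typescript
// Illustrative claim/evidence shapes; field names are assumptions.
interface EvidenceRef {
  snapshotHash: string;   // content hash of the stored snapshot
  queryInputs: string;    // what was asked
  timeRange: string;      // window the evidence covers
  storagePointer: string; // where the raw result lives
}

interface Claim {
  text: string;
  evidence: EvidenceRef[];
  confidence: "verified" | "inferred" | "unknown";
}

// Enforce the Claim -> Evidence constraint: claims without evidence
// are downgraded rather than asserted, and verification steps are queued.
function enforceEvidence(claims: Claim[]): { claims: Claim[]; followUps: string[] } {
  const followUps: string[] = [];
  const checked = claims.map((claim) => {
    if (claim.evidence.length > 0) return claim;
    followUps.push(`Verify: ${claim.text}`);
    return { ...claim, confidence: "unknown" as const };
  });
  return { claims: checked, followUps };
}
```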

Guardrails

  • If evidence is missing, Stella must not assert certainty.

  • Evidence snapshots must capture:

    • query inputs
    • time range
    • raw result (or hash + storage pointer)

KPIs

  • Citation coverage (% of answers with evidence refs)
  • Reduced back-and-forth (“how do you know?” rate)
  • Adoption of automation after evidence-first rollout

ADVISORY-AI-002 — Policy-Aware Automation (safe actions, not just suggestions)

Problem

The main blocker to “AI that acts” is governance:

  • wrong environment
  • insufficient permission
  • missing approvals
  • non-idempotent actions
  • unclear accountability

Why we do it

If Stella can't safely execute actions, it will remain a read-only assistant. Policy-aware automation is a hard moat because it requires real engineering discipline and operational maturity.

What we ship

  • A typed Action Registry:

    • schemas, risk levels, idempotency, rollback/compensation
  • A Policy decision point (PDP) before any action:

    • allow / allow-with-approvals / deny
  • An Approval workflow linked to Runs
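
A minimal sketch of the registry entry and the PDP verdict it feeds, with hypothetical names (ActionDefinition, ActionRequest, PolicyDecision) standing in for whatever StellaOps.ActionRegistry and StellaOps.PolicyEngine actually define:

```typescript
// Hypothetical action metadata; the real registry schema may differ.
interface ActionDefinition {
  name: string;                             // e.g. "restart_service"
  riskLevel: "low" | "medium" | "high";
  idempotent: boolean;
  compensatingAction?: string;              // rollback where feasible
  parameterSchema: Record<string, string>;  // JSON-schema-like shape
}

// The PDP returns one of three verdicts before anything executes.
type PolicyDecision =
  | { verdict: "allow" }
  | { verdict: "allow-with-approvals"; approvers: string[] }
  | { verdict: "deny"; reason: string };

interface ActionRequest {
  action: string;
  environment: string;
  idempotencyKey: string;  // guardrail: every action carries one
  runId: string;           // audit linkage back to the Run
  reason: string;          // the "why" recorded in the ledger entry
}

// Illustrative rules only: production changes require approval,
// and high-risk actions without a rollback path are denied outright.
function evaluate(req: ActionRequest, def: ActionDefinition): PolicyDecision {
  if (def.riskLevel === "high" && !def.compensatingAction) {
    return { verdict: "deny", reason: "high-risk action without rollback" };
  }
  if (req.environment === "production") {
    return { verdict: "allow-with-approvals", approvers: ["service-owner"] };
  }
  return { verdict: "allow" };
}
```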

How we achieve (modules + UI)

Backend modules

  • StellaOps.ActionRegistry

    • Action definitions + schemas + risk metadata
  • StellaOps.PolicyEngine

    • Rules: environment protections, freeze windows, role constraints
  • StellaOps.AdvisoryAI.Automation

    • Converts intent → action proposals
    • Submits action requests after approvals
  • StellaOps.RunLedger

    • Every action request + result is a ledger entry

UI components

  • ActionProposalCardComponent
  • ApprovalModalComponent (scoped approval: this action/this run/this window)
  • PolicyExplanationComponent (human-readable “why allowed/denied”)
  • RollbackPanelComponent

Guardrails

  • Default: propose actions; only auto-execute in explicitly configured “Autopilot scopes.”

  • Every action must support:

    • idempotency key
    • audit fields (why, ticket/run linkage)
    • reversible/compensating action where feasible

KPIs

  • % actions proposed vs executed
  • “Policy prevented incident” count
  • Approval latency and action success rate

ADVISORY-AI-003 — Ops Memory (structured, durable, queryable)

Problem

Teams repeat incidents because knowledge lives in:

  • chat logs
  • tribal memory
  • scattered tickets
  • unwritten heuristics

Chat history is not an operational knowledge base: it's unstructured and hard to reuse safely.

Why we do it

Ops memory reduces repeat work and accelerates diagnosis. It also becomes a defensible dataset because it's tied to your Runs, artifacts, and outcomes.

What we ship

A set of typed memory objects (not messages):

  • DecisionRecord
  • KnownIssue
  • Tactic
  • Constraint
  • PostmortemSummary

Memory is written on:

  • Run closure
  • approvals (policy events)
  • explicit “save as org memory” actions
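
A minimal sketch of what "typed memory objects, not messages" could look like; the kinds mirror the list above, and the remaining field names are assumptions rather than the actual StellaOps.AdvisoryAI.Memory schema:

```typescript
// Illustrative memory entry; kinds mirror the advisory's list.
type MemoryKind =
  | "DecisionRecord"
  | "KnownIssue"
  | "Tactic"
  | "Constraint"
  | "PostmortemSummary";

interface MemoryEntry {
  kind: MemoryKind;
  scope: { service?: string; environment?: string; team?: string };
  confidence: "verified" | "anecdotal";
  body: string;            // the structured summary itself
  sourceRunId: string;     // written on Run closure or explicit save
  supersededBy?: string;   // conflict handling
  reviewBy?: string;       // review/expiry policy for tactics and constraints
  createdAt: string;       // ISO timestamp
}

// Retrieval narrows by scope first, then prefers verified, recent entries.
function relevant(entries: MemoryEntry[], service: string): MemoryEntry[] {
  return entries
    .filter((e) => !e.supersededBy && (e.scope.service ?? service) === service)
    .sort((a, b) =>
      a.confidence === b.confidence
        ? b.createdAt.localeCompare(a.createdAt)
        : a.confidence === "verified" ? -1 : 1
    );
}
```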

How we achieve (modules + UI)

Backend modules

  • StellaOps.AdvisoryAI.Memory

    • Write: extract structured memory from run artifacts
    • Read: retrieve memory relevant to current context (service/env/symptoms)
    • Conflict handling: “superseded by”, timestamps, confidence
  • StellaOps.MemoryStore (Postgres tables + full-text index as needed)

UI components

  • MemoryPanelComponent (contextual suggestions during a run)
  • MemoryBrowserComponent (search + filters)
  • MemoryDiffComponent (when superseding prior memory)

Guardrails

  • Memory entries have:

    • scope (service/env/team)
    • confidence (verified vs anecdotal)
    • review/expiry policies for tactics/constraints
  • Never “learn” from unresolved or low-confidence runs by default.

KPIs

  • Repeat incident rate reduction
  • Time-to-diagnosis delta when memory exists
  • Memory reuse rate inside Runs

ADVISORY-AI-004 — Playbook Learning (Run → Patch → Approved Playbook)

Problem

Runbooks/playbooks drift. Operators improvise. The playbook never improves, and the organization pays the same “tuition” repeatedly.

Why we do it

Playbook learning is the compounding loop that turns daily operations into a proprietary advantage. Competitors can generate playbooks; they struggle to continuously improve them from real run traces with review + governance.

What we ship

  • Versioned playbooks as structured objects

  • Playbook Patch proposals generated from Run traces:

    • coverage patches, repair patches, optimization patches, safety patches, detection patches
  • Owner review + approval workflow
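
A minimal sketch of a Playbook Patch proposal as a reviewable object; the patch kinds follow the list above, and the remaining field names are illustrative assumptions rather than the real StellaOps.Playbooks schema:

```typescript
// Illustrative patch proposal shape.
type PatchKind = "coverage" | "repair" | "optimization" | "safety" | "detection";

interface PlaybookPatch {
  playbookId: string;
  baseVersion: number;          // version the diff was computed against
  kind: PatchKind;
  diff: string;                 // human-friendly diff for the viewer
  evidenceRunIds: string[];     // run traces that motivated the change
  applicability: "generalizable" | "context-specific"; // one-off contamination guard
  status: "proposed" | "approved" | "rejected";
  reviewer?: string;            // owner review; never auto-applied
}

// Guardrail: a patch only becomes a new playbook version after approval,
// and only if the playbook has not moved on since the diff was taken.
function canApply(patch: PlaybookPatch, currentVersion: number): boolean {
  return patch.status === "approved" && patch.baseVersion === currentVersion;
}
```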

How we achieve (modules + UI)

Backend modules

  • StellaOps.Playbooks

    • Playbook schema + versioning
  • StellaOps.AdvisoryAI.PlaybookLearning

    • Extract “what we did” from Run timeline
    • Compare to playbook steps
    • Propose a patch with evidence links
  • StellaOps.DiffService

    • Human-friendly diff output for UI

UI components

  • PlaybookPatchCardComponent
  • DiffViewerComponent (Monaco diff or equivalent)
  • PlaybookApprovalFlowComponent
  • PlaybookCoverageHeatmapComponent (optional, later)

Guardrails

  • Never auto-edit canonical playbooks; only patches + review.

  • Require evidence links for each proposed step.

  • Prevent one-off contamination by marking patches as:

    • “generalizable” vs “context-specific”

KPIs

  • % incidents with a playbook
  • Patch acceptance rate
  • MTTR improvement for playbook-backed incidents

ADVISORY-AI-005 — Integration Concierge (setup + health + “how-to” that is actually correct)

Problem

Integrations are where tools die:

  • users ask “how do I integrate X”
  • assistant answers generically
  • setup fails because of environment constraints, permissions, webhooks, scopes, retries, or missing prerequisites
  • no one can debug it later

Why we do it

Integration handling becomes a moat when it is:

  • deterministic (wizard truth)
  • auditable (events + actions traced)
  • self-healing (retries, backfills, health checks)
  • explainable (precise steps, not generic docs)

What we ship

  1. Integration Setup Wizard per provider (GitLab, Jira, Slack, etc.)
  2. Integration Health dashboard:
  • last event received
  • last action executed
  • failure reasons + next steps
  • token expiry warnings
  3. Chat-driven guidance that drives the same wizard backend:
  • when a user asks “how to integrate GitLab,” Stella replies with the exact steps for the instance type, auth mode, and required permissions, and can pre-fill a setup plan.

How we achieve (modules + UI)

Backend modules

  • StellaOps.Integrations

    • Provider contracts: inbound events + outbound actions
    • Normalization into Stella Signals and Actions
  • StellaOps.Integrations.Reliability

    • Webhook dedupe, replay, dead-letter, backfill polling
  • StellaOps.AdvisoryAI.Integrations

    • Retrieves provider-specific setup templates
    • Asks only for missing parameters
    • Produces a “setup checklist” artifact attached to a Run or Integration record
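
A minimal sketch of "ask only for missing parameters," assuming a hypothetical provider setup template shape rather than the real StellaOps.AdvisoryAI.Integrations contract:

```typescript
// Illustrative setup template; parameter names are examples, not a real contract.
interface SetupTemplate {
  provider: string;              // e.g. "gitlab"
  requiredParameters: string[];  // e.g. instance URL, auth mode, token scopes
  steps: string[];               // ordered, provider-specific instructions
}

interface SetupChecklist {
  provider: string;
  missingParameters: string[];   // only these are asked of the user
  steps: string[];
  attachedToRunId?: string;      // artifact lands on a Run or Integration record
}

// Compare what the template needs with what is already known,
// so chat guidance and the wizard produce the same checklist.
function buildChecklist(
  template: SetupTemplate,
  known: Record<string, string>,
  runId?: string
): SetupChecklist {
  const missing = template.requiredParameters.filter((p) => !(p in known));
  return {
    provider: template.provider,
    missingParameters: missing,
    steps: template.steps,
    attachedToRunId: runId,
  };
}
```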

UI components

  • IntegrationWizardComponent
  • IntegrationHealthComponent
  • IntegrationEventLogComponent (raw payload headers + body stored securely)
  • SetupChecklistArtifactComponent (generated by AdvisoryAI)

Guardrails

  • Store inbound webhook payloads for replay/debug, with redaction where required.
  • Always support reconciliation/backfill (webhook delivery is never lossless).
  • Use least-privilege token scopes by default, with clear permission error guidance.

KPIs

  • Time-to-first-successful-event
  • Integration “healthy” uptime
  • Setup completion rate without human support

ADVISORY-AI-006 — Outcome Analytics (prove ROI with credible attribution)

Problem

AI features are easy to cut in budgeting because value is vague. “It feels faster” doesn't survive scrutiny.

Why we do it

Outcome analytics makes Stella defensible to leadership and helps prioritize what to automate next. It also becomes a dataset for continuous improvement.

What we ship

  • Baseline metrics (before Stella influence):

    • MTTA, MTTR, escalation count, repeat incidents, deploy failure rate (as relevant)
  • Attribution model (only count impact when Stella materially contributed):

    • playbook patch accepted
    • evidence pack used
    • policy-gated action executed
    • memory entry reused
  • Monthly/weekly impact reports
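
A minimal sketch of the attribution rule "only count impact when Stella materially contributed," with hypothetical signal names mirroring the list above:

```typescript
// Illustrative attribution signals; in practice these are joined from the Run ledger.
type AttributionSignal =
  | "playbook_patch_accepted"
  | "evidence_pack_used"
  | "policy_gated_action_executed"
  | "memory_entry_reused";

interface RunOutcome {
  runId: string;
  mttrMinutes: number;
  signals: AttributionSignal[];
}

// A run counts toward Stella-attributed impact only if at least one
// material contribution is present; everything else stays in the baseline cohort.
function splitCohorts(outcomes: RunOutcome[]): { attributed: RunOutcome[]; baseline: RunOutcome[] } {
  const attributed = outcomes.filter((o) => o.signals.length > 0);
  const baseline = outcomes.filter((o) => o.signals.length === 0);
  return { attributed, baseline };
}
```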

How we achieve (modules + UI)

Backend modules

  • StellaOps.Analytics

    • Metric computation + cohorts (by service/team/severity)
  • StellaOps.AdvisoryAI.Attribution

    • Joins outcomes to AI artifacts and actions in the Run ledger
  • StellaOps.Reporting

    • Scheduled report generation (exportable)

UI components

  • OutcomeDashboardComponent
  • AttributionBreakdownComponent
  • ExecutiveReportExportComponent

Guardrails

  • Avoid vanity metrics (“number of chats”).
  • Always show confidence/limitations in attribution (correlation vs causation).

KPIs

  • MTTR delta (with Stella artifacts vs without)
  • Repeat incident reduction
  • Escalation reduction
  • Automation coverage growth

One unifying implementation note: “AdvisoryAI” should output objects, not prose

To make all seven advisories work together, standardize on a small set of AI-produced artifacts:

  • Plan (step list with tools/actions)
  • EvidencePack
  • DecisionRecord
  • PlaybookPatch
  • IntegrationSetupChecklist
  • RunSummary (postmortem-ready)

Every artifact is:

  • versioned
  • evidence-linked
  • attached to a Run
  • subject to policy gates when it triggers actions
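
A minimal sketch of the common envelope these artifacts could share, assuming hypothetical field names; the artifact kinds are exactly the six listed above:

```typescript
// Illustrative common envelope for AI-produced artifacts.
type ArtifactKind =
  | "Plan"
  | "EvidencePack"
  | "DecisionRecord"
  | "PlaybookPatch"
  | "IntegrationSetupChecklist"
  | "RunSummary";

interface Artifact {
  kind: ArtifactKind;
  version: number;            // versioned
  evidenceRefs: string[];     // evidence-linked (snapshot hashes or IDs)
  runId: string;              // attached to a Run
  triggersActions: boolean;   // if true, it must pass the policy gate
  promptTemplateHash: string; // ties the output to the prompt version that produced it
  body: unknown;              // kind-specific payload
}
```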

This gives you:

  • auditability
  • deterministic replay of the inputs and tool outputs
  • compounding “Ops memory” and “Playbook learning” data

Example: how this maps to “AdvisoryAI module + Chat UI component”

A coherent minimum viable architecture:

Backend

  • StellaOps.AdvisoryAI.WebService (orchestrator)
  • StellaOps.AdvisoryAI.Evidence (citations)
  • StellaOps.AdvisoryAI.Memory (structured memory)
  • StellaOps.AdvisoryAI.PlaybookLearning (patch proposals)
  • StellaOps.AdvisoryAI.Integrations (setup guidance + checklists)
  • StellaOps.PolicyEngine + StellaOps.ActionRegistry (gated actions)
  • StellaOps.RunLedger + StellaOps.EvidenceStore (audit + snapshots)

UI

  • AI chat panel that emits/accepts artifact cards
  • Evidence side panel
  • Approval modal
  • Diff viewer for patches
  • Integration wizard + health

If you want, I can translate these advisories into a single internal spec pack:

  • database tables (Postgres)
  • event contracts (signals/actions)
  • JSON schemas for artifacts/actions
  • UI navigation and component tree
  • the first 10 “golden workflows” you should ship with the Workbench