2026-05-31 / 6 min read pattern ai-ops

What's the next decision?

cubby ai team

Field note

Authored from the ops floor. Cite freely; rewrite only with attribution.

The question we ask the agent, and ask of ourselves at the design review, is the same one: what is the next decision? Not what the system is doing, not what changed in the last hour, not what the model's confidence is on this anomaly. The next decision. Stated as a verb. Surfaced before the data, not after.

This is a design pattern, not a slogan. It has a literature now, and the literature is consistent.

The pattern, stated

Three commitments at the surface layer:

First paint is a decision, not a chart. The top of the screen, the first message, and the API response root all carry a verb plus an object: roll back deploy 8af2, email the customer, drop this lead, file the SOC 2 evidence. If the agent cannot propose one, it says so and states what is missing.
Evidence is one layer down. A single piece of evidence (one chart, one log, one diff) that settled the decision. Not every input. Not the dashboard.
Rejected alternatives are visible. Each with the reason it lost. This is the part most products skip, and it is the part responders trust the most.
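
The three commitments above can be sketched as a response shape. This is a minimal illustration, not cubby ai's actual schema: the class names (NextDecision, Rejected), the field names, and the first_paint helper are all hypothetical, chosen only to show the verb-plus-object root, the single piece of evidence, the rejected alternatives, and the "cannot decide yet" case in one structure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Rejected:
    action: str  # the alternative that lost, e.g. "restart the service"
    reason: str  # why it lost


@dataclass
class NextDecision:
    # Hypothetical field names; the source only requires "a verb plus an object".
    verb: Optional[str] = None      # e.g. "roll back"; None means no proposal yet
    obj: Optional[str] = None       # e.g. "deploy 8af2"
    evidence: Optional[str] = None  # the one chart, log, or diff that settled it
    rejected: List[Rejected] = field(default_factory=list)
    missing: List[str] = field(default_factory=list)  # what the agent still needs

    def first_paint(self) -> str:
        """Return the line shown before any data."""
        if self.verb is None or self.obj is None:
            return "Cannot decide yet. Missing: " + ", ".join(self.missing)
        return f"{self.verb} {self.obj}"


decision = NextDecision(
    verb="roll back",
    obj="deploy 8af2",
    evidence="error-rate diff before and after 8af2",
    rejected=[Rejected("restart the service", "errors began at deploy time")],
)
print(decision.first_paint())
print(NextDecision(missing=["deploy history"]).first_paint())
```

The point of the shape is that the undecided case uses the same root as the decided one, so a client renders either without branching on response type.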

Where the literature lands

Two strands converge on this shape.

Strand one: enterprise next-best-action. Genesys's working definition is short and useful: NBA is "an AI-driven approach that identifies the most relevant recommendation, message or offer for a customer in real time" using data, analytics, and ML. The framing is explicit about the move it makes — from broadcast dashboards to per-context recommendations, presented as an action the user can take immediately. The contact-center industry has been running this pattern at scale for a decade. Some of their primitives — confidence scores attached to recommendations, explanations attached to each suggestion — port cleanly to ops.

Strand two: AIOps and autonomous-cloud research. Microsoft Research's AIOpsLab work, with a paper accepted at SoCC '24, frames the long arc as AgentOps: AI agents handling the full incident lifecycle, ending in self-healing systems. The companion benchmark paper (arXiv 2501.06706) evaluates LLM agents on detection, localization, root-cause analysis, and mitigation, and reports both capabilities and limitations in current agents: they need clear observability, and they need a way to take action, including arbitrary shell commands, with structured error feedback.

The 2025 survey A Survey of AIOps in the Era of Large Language Models zooms out across 183 papers from 2020 to 2024 and names a five-level automation ladder for assisted remediation, ending in autonomous execution. The survey identifies four open challenges that map almost one-to-one onto interface design: efficiency and cost, data diversity, generalizability, and integration into existing toolchains. The first three are model problems. The fourth — integration — is the surface layer. It is where the next-decision pattern earns its keep.

Why "next decision first" is load-bearing, not aesthetic

Three reasons it changes outcomes.

It collapses synthesis latency. Splunk's MTTA writeup defines MTTA as the time to acknowledge an alert once it fires. If the responder lands on a chart, they spend their first 90 seconds synthesizing, building the verb in their head. If the responder lands on a verb, they spend their first 90 seconds either acting or rebutting. Acting and rebutting both produce an outcome. Synthesizing is dead weight on a clock that the same writeup prices at roughly $125k per hour.
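
To put a number on that dead weight, here is a back-of-envelope sketch. It assumes a flat hourly rate and that all 90 seconds are spent before any action; the function name synthesis_cost is hypothetical, and the $125k figure is the one cited from the Splunk writeup.

```python
def synthesis_cost(hourly_cost_usd: float, seconds: float) -> float:
    """Cost of time spent before any action, at a flat hourly rate."""
    return hourly_cost_usd * seconds / 3600


# 90 seconds of synthesis on a $125k-per-hour clock
print(synthesis_cost(125_000, 90))  # 3125.0
```

Under those assumptions, each incident that opens on a chart instead of a verb spends about $3,125 before the responder has done anything.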

It moves the agent up the automation ladder safely. The AIOps survey's five-level ladder distinguishes five modes: ask questions, suggest, draft, execute with ack, and execute autonomously. A next-decision-first surface lets the same agent operate at level 2 or 4 with no code change; the only delta is whether the human has to press enter. That is the only delta that should matter.
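
A minimal sketch of that idea: the decision payload stays the same and only a level setting changes what the surface does with it. Numbering the five modes 1 through 5 in the order listed is an assumption, and the names Level and handle are hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    # Assumed numbering: the five modes in the order the text lists them.
    ASK = 1
    SUGGEST = 2
    DRAFT = 3
    EXECUTE_WITH_ACK = 4
    EXECUTE_AUTONOMOUSLY = 5


def handle(decision: str, level: Level, acked: bool = False) -> str:
    """Return what the surface does with the same decision at a given level."""
    if level >= Level.EXECUTE_AUTONOMOUSLY:
        return f"executed: {decision}"
    if level == Level.EXECUTE_WITH_ACK:
        # The human's acknowledgement is the only gate.
        return f"executed: {decision}" if acked else f"awaiting ack: {decision}"
    return f"proposed: {decision}"


verb = "roll back deploy 8af2"
print(handle(verb, Level.SUGGEST))
print(handle(verb, Level.EXECUTE_WITH_ACK))
print(handle(verb, Level.EXECUTE_WITH_ACK, acked=True))
```

Because the level is configuration rather than code, moving an agent up or down the ladder does not touch how the decision is produced or displayed.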

It exposes the rejected alternatives. This is the part Gartner's decision-intelligence framing makes mandatory at the platform layer: a decision intelligence platform has to track prior outcomes and point out flaws and biases in the decision-making process. You can only audit decisions that were made legibly. We picked A; we rejected B because Y; we did not consider C. That sentence is the smallest unit of an auditable decision history.
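
That smallest unit can be written down mechanically. The sketch below is illustrative only: audit_line and its parameters are hypothetical names, and a real decision history would also need timestamps and outcomes, which are left out here.

```python
from typing import Dict, List


def audit_line(picked: str, rejected: Dict[str, str], not_considered: List[str]) -> str:
    """Render one decision as a single auditable sentence."""
    parts = [f"We picked {picked}"]
    for alternative, reason in rejected.items():
        parts.append(f"we rejected {alternative} because {reason}")
    if not_considered:
        parts.append("we did not consider " + ", ".join(not_considered))
    return "; ".join(parts) + "."


print(audit_line("A", {"B": "Y"}, ["C"]))
# We picked A; we rejected B because Y; we did not consider C.
```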

Stance

The next decision is the unit. Everything else is supporting evidence. We design cubby ai this way, and we think the field should design AIOps tools the same way: write the next decision first, and let the data be the appendix the responder opens only when they want to argue with the verb.

If the agent cannot name the next decision, that is the field note: it cannot decide yet, and here is what is missing. That is also a decision, surfaced in the same shape.

Sources

Microsoft Research — AIOpsLab: Building AI agents for autonomous clouds — framing of AgentOps, the SoCC '24 paper, and the AIOpsLab benchmark.
arXiv 2501.06706 — AIOpsLab: A Holistic Framework to Evaluate AI Agents for Enabling Autonomous Clouds — full benchmark paper covering detection, localization, root-cause analysis, and mitigation.
arXiv 2507.12472 — A Survey of AIOps in the Era of Large Language Models — 183-paper survey naming the five-level automation ladder and the four open challenges.
Genesys — What is next-best action — operational definition of NBA from the contact-center industry that has run this pattern longest.
Splunk — Mean Time to Acknowledge (MTTA) — gives the cost basis (~$125k/hour) that makes synthesis latency expensive.
Gartner — Decision Intelligence glossary — establishes that auditable decision history is a platform-layer requirement, not a nicety.