Conversational BI pilots succeed because they operate in constrained environments. A dozen users, curated datasets, predictable query patterns. When AI agents enter the picture, everything breaks.
Gartner forecasts that more than 40% of agentic AI projects will be canceled by the end of 2027. The primary reason isn’t the AI itself, but the underlying infrastructure. Organizations build natural language analytics on data architectures designed for human analysts, rather than for autonomous systems that generate unpredictable queries.
From Scheduled Routes to Autonomous Access
Traditional business intelligence behaves like a rail system. Routes are defined in advance. Capacity is planned. Trains run on schedules. You know where they can go, how much they cost to operate, and what happens when demand increases.
Conversational BI fundamentally changes that model, shifting analytics toward something closer to a fleet of autonomous drones. Every natural language question becomes a new, on-demand journey. The system must interpret intent, reconstruct business definitions, infer joins, apply filters, and then explain the result for every interaction. Analytics shifts from a scheduled, capacity-planned system to a probabilistic, conversational one.
That shift has direct cost implications. FinOps practitioners warn that GenAI spend is driven by hidden architectural costs like context-window creep in multi-turn interactions and output token pricing that can run 3-5× higher than input tokens. Verbose explanations quietly dominate the total cost.
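To make that concrete, here is a back-of-the-envelope cost model for a multi-turn session. The per-token prices, message sizes, and turn count are illustrative assumptions, not any vendor’s actual pricing; the point is the shape of the curve, not the exact numbers.

```python
# Back-of-the-envelope cost model for a multi-turn conversational BI session.
# All prices and token counts are illustrative assumptions, not vendor pricing.

INPUT_PRICE = 3.00 / 1_000_000    # assumed $/input token
OUTPUT_PRICE = 15.00 / 1_000_000  # assumed $/output token, 5x the input rate

def session_cost(turns: int, question_tokens: int, answer_tokens: int) -> float:
    """Cost of one conversation that re-sends the growing context each turn."""
    total = 0.0
    context = 0  # tokens carried forward from earlier turns
    for _ in range(turns):
        prompt = context + question_tokens
        total += prompt * INPUT_PRICE + answer_tokens * OUTPUT_PRICE
        context = prompt + answer_tokens  # context-window creep
    return total

# Short questions, verbose explanations, ten turns:
print(f"${session_cost(turns=10, question_tokens=50, answer_tokens=800):.2f}")
# Verbose outputs dominate early; the re-sent context catches up as turns add up.
```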
The interface looks simple. The architecture underneath is fundamentally different.
When Everyone Can Ask the Data Anything
Cost is only part of the issue. Risk increases just as quickly.
When natural language interfaces are deployed widely, data access ceases to be mediated by analysts, dashboards, or predefined metrics. Every employee effectively gains an always-on data analyst that can query across systems, combine signals, and generate narratives at machine speed.
That changes the risk profile in subtle but important ways.
Business users don’t just ask what happened. They ask why, what if, and what we should do next. Without shared definitions and guardrails, two people can ask the same question and receive different answers, both sounding authoritative, both backed by data, neither obviously wrong.
In regulated or revenue-sensitive environments, that’s dangerous.
When an LLM interprets raw data directly, it makes implicit assumptions about definitions, time windows, exclusions, and edge cases. Those assumptions are rarely visible to the person reading the answer. Multiply that across hundreds or thousands of employees, and the organization loses any single source of truth.
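To see how easily those implicit assumptions diverge, consider two plausible translations of the same question. The table and column names below are hypothetical, but both queries are defensible, and they will usually return different numbers.

```python
# Two plausible SQL translations of "How many active customers do we have?"
# Table and column names are hypothetical; both queries are defensible.

QUERY_A = """
SELECT COUNT(DISTINCT customer_id)
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days'
"""  # "active" = purchased in the last 30 days

QUERY_B = """
SELECT COUNT(DISTINCT customer_id)
FROM logins
WHERE login_date >= CURRENT_DATE - INTERVAL '90 days'
  AND account_status = 'open'
"""  # "active" = logged in within 90 days, excluding closed accounts

# Neither answer is obviously wrong; the divergence lives in the implicit
# definition, the time window, and the exclusions the model chose.
```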
The result is decision sprawl. Conflicting numbers circulate in meetings. Executives debate which answer is “right.” Trust erodes, even as access increases.
That loss of a single source of truth increases organizational risk, and it changes how the system is used. Once analysts, dashboards, and predefined metrics no longer sit between a question and the data, query behavior itself shifts.
Query Patterns That Don’t Scale
Natural language analytics breaks at scale because of the difference between human analysts and AI agents. Human analysts use intuition and pause between queries. AI agents issue hundreds or thousands of queries continuously without any sense of computational cost or data complexity. There are no judgment calls about what’s worth analyzing. Exploratory by design, they drill into edge cases a human might dismiss as irrelevant.
This shift from intentional to exploratory query patterns is the issue. Unrestricted natural language access turns analytical curiosity into an unbounded computational problem. AI agents don’t know which joins are costly, which computations have already been performed, or which metrics should be reused rather than recomputed.
In most conversational BI implementations, a question like “What’s our customer retention rate?” is translated directly into SQL that accesses raw fact tables. The query involves full scans of transaction tables with millions or billions of rows, poorly constrained joins across customer, product, and time dimensions, and non-materialized calculations that get recomputed for every query. There’s no awareness of whether similar questions have been asked recently.
The cost may be manageable when a human asks one question. When an AI agent asks 500 variations in one hour, cloud data warehouse bills spike.
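The arithmetic is blunt. As a sketch, assume an on-demand warehouse that bills by bytes scanned; the table size and per-terabyte price below are illustrative assumptions.

```python
# Rough scan-cost math for agent-driven exploration. The table size and
# per-terabyte price are illustrative assumptions, not real pricing.

PRICE_PER_TB = 5.00    # assumed on-demand scan price, $/TB
FACT_TABLE_TB = 2.0    # assumed size of the raw transaction table

cost_per_query = FACT_TABLE_TB * PRICE_PER_TB  # full scan, no pruning, no cache

human_hour = 1 * cost_per_query     # an analyst asks once and reads the result
agent_hour = 500 * cost_per_query   # an agent asks 500 variations in an hour

print(f"Analyst: ${human_hour:.2f}/hr  Agent: ${agent_hour:,.2f}/hr")
# Analyst: $10.00/hr  Agent: $5,000.00/hr -- same question, 500x the bill
```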
Requirements for Production Decision Automation
For decision automation to work in production, three requirements must be met:
- Consistency. “Customer lifetime value” means the same thing across every tool and use case.
- Predictability. Asking the same question twice yields the same answer and consumes roughly the same computational resources.
- Explainability. When an AI agent makes a decision or runs a query, teams need to trace exactly which data, calculations, and business rules drove that output.
Governance makes outcomes repeatable and explainable. Without that foundation, decision automation is too risky to deploy.
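As a sketch of what explainability can look like in practice, here is a hypothetical audit record a governed system might emit for every agent query; the field names are my own, not any product’s schema.

```python
# A sketch of the audit record an explainable system could emit for every
# agent query. Field names are hypothetical, not any product's schema.

from dataclasses import dataclass, field

@dataclass
class QueryTrace:
    question: str                  # the natural language input
    metric: str                    # the governed definition that was resolved
    metric_version: str            # which version of that definition applied
    generated_sql: str             # the exact SQL that ran
    tables: list[str] = field(default_factory=list)  # data lineage

trace = QueryTrace(
    question="What's our customer retention rate?",
    metric="retention_rate",
    metric_version="v7",
    generated_sql="SELECT ... FROM agg_retention_monthly ...",
    tables=["agg_retention_monthly"],
)
# Same question + same metric version -> same SQL and the same answer,
# plus a record of exactly which data and rules drove the output.
```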
How Semantic Layers Change the Cost Curve
If conversational BI is like deploying thousands of autonomous drones, a semantic layer becomes the air traffic control system. It defines approved flight paths through governed metrics and dimensions, centralizes navigation rules for business logic and definitions, reuses known routes through shared aggregations and cached results, and prevents conflicting definitions and logic drift.
AI agents still ask questions and operate autonomously, but their behavior becomes predictable, repeatable, auditable, and economically viable.
A semantic layer’s deterministic SQL generation eliminates redundant computation and prevents AI agents from recalculating logic on every query. Shared logic prevents waste: if multiple queries require customer segmentation logic, they all reference the same semantic definition rather than implementing variations that compound into different answers. AI agents operate on known measures rather than inferred logic. Instead of asking an LLM to interpret the meaning of “active customer” by examining raw data, agents query a governed definition that captures institutional knowledge about data quality issues, edge cases, and business rules.
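Here is a minimal sketch of that idea. The structure below is hypothetical, not AtScale’s actual modeling language: “active customer” is defined once, edge cases included, and every caller gets the same SQL.

```python
# A sketch of a governed metric definition. The structure is hypothetical and
# not AtScale's actual modeling language; the point is that "active customer"
# is defined once, with its edge cases, and referenced by name.

ACTIVE_CUSTOMER = {
    "name": "active_customers",
    "source": "orders",
    "expression": "COUNT(DISTINCT customer_id)",
    "filters": [
        "order_date >= CURRENT_DATE - INTERVAL '90 days'",
        "order_status != 'cancelled'",      # business rule
        "customer_id IS NOT NULL",          # known data quality issue
    ],
}

def generate_sql(metric: dict) -> str:
    """Deterministic SQL generation: same definition in, same SQL out."""
    where = "\n  AND ".join(metric["filters"])
    return (f"SELECT {metric['expression']}\n"
            f"FROM {metric['source']}\n"
            f"WHERE {where}")

print(generate_sql(ACTIVE_CUSTOMER))  # every caller gets identical SQL
```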
The business outcome is lower warehouse costs, stable performance, and consistent answers at scale.
Governed aggregations function like high-capacity air lanes, letting the system serve repeated questions efficiently. When an AI agent asks “What were sales by region last quarter?” a semantic layer with governed aggregations can:
- Return results from a pre-materialized summary table instead of scanning billions of transaction records
- Bound query complexity so agents can’t accidentally trigger runaway joins
- Cap compute consumption at predictable levels
- Deliver sub-second latency instead of minute-long warehouse scans
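Here is a minimal sketch of that routing decision, with hypothetical table names: if a governed summary covers the requested measure and grain, the query never touches the fact table.

```python
# A sketch of aggregate routing. Table names are hypothetical. The idea: if a
# governed summary table covers the requested grain, answer from it instead
# of scanning the raw fact table.

AGGREGATES = {
    # (measure, frozenset of grouping dimensions) -> materialized table
    ("sales", frozenset({"region", "quarter"})): "agg_sales_region_quarter",
    ("sales", frozenset({"region", "month"})): "agg_sales_region_month",
}

def route(measure: str, dimensions: set[str]) -> str:
    """Return the cheapest table that can answer the request."""
    table = AGGREGATES.get((measure, frozenset(dimensions)))
    if table:
        return table     # pre-materialized: sub-second, bounded cost
    return "fact_sales"  # fallback: governed but unaggregated scan

# "What were sales by region last quarter?"
print(route("sales", {"region", "quarter"}))  # -> agg_sales_region_quarter
```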
Once an agent has unrestricted access to raw data, there’s no reliable way to predict or control what it will query. Cost control has to be built into the semantic layer.
Moving from Pilot to Production
Enterprises fail at AI because their data foundation isn’t designed for autonomous access. Raw tables, ad hoc calculations, and unoptimized query patterns were workable when humans asked the questions. They’re unsustainable when AI agents multiply query volume by 100× or 1,000×.
I’ve seen the same pattern across industries: a successful pilot, a promising business case, a production rollout, unexpected warehouse bills, a reduced scope, and a stalled AI initiative. All from attempting AI adoption on data infrastructure built for a different access pattern.
At AtScale, we’ve worked with enterprises for more than 12 years on production analytics systems that handle billions of queries. A semantic layer provides the constraints that make autonomy safe, predictable, and affordable without sacrificing the benefits that make conversational analytics valuable. It turns exploratory chaos into governed discovery.
If you’re ready to move your conversational AI initiative from pilot to production, request a demo to see how a semantic layer changes the cost curve.