Autonomous AI Without a Semantic Layer Is Just Expensive Guessing

Estimated Reading Time: 1 minute

I’ve been in the BI and data space long enough to recognize when something fundamental shifts. Denny Lee and I first worked together when he was at Microsoft on SQL Server Analysis Services (SSAS), and I was a customer building what was, at the time, an absurdly large cube (24 TB) at Yahoo!. We were solving the same core problem we’re still solving today, just with different tools: how do you get consistent, trustworthy answers from your data at scale?

On a recent webinar, Denny and I walked through how AtScale’s universal semantic layer integrates with Agent Bricks, Databricks’ framework for building production-grade AI agents, using the Model Context Protocol (MCP). Below is a breakdown of what we built, and why the architectural decisions matter.

LLMs Are Probabilistic. Enterprise Metrics Cannot Be.

Large language models are remarkably capable. They can reason, synthesize, and generate analyses from a simple natural-language prompt. But they are fundamentally non-deterministic. Ask the same question twice, and you may get two different answers, especially when those answers depend on navigating complex data schemas.

That’s manageable when a human is in the loop. It’s a serious problem when you’re deploying autonomous agents.

The semantic layer exists precisely to solve this. At AtScale, a semantic layer provides a logical view into your organization’s data using common business terms: gross margin, churn, customer lifetime value, etc. These definitions are pre-built, governed, and consistent. When an LLM queries through a semantic layer via MCP, it doesn’t need to reconstruct joins, infer aggregation rules, or guess at business logic. It asks for a metric by name and gets a deterministic result back, every time.

You want the LLM to be creative in interpreting the results and asking follow-up questions. You don’t want it to be overly creative in its calculations or data navigation.

What Model Context Protocol Actually Enables

MCP is getting a lot of attention right now, and it is warranted. It’s a standardized way for an LLM or agent to discover and invoke tools exposed by an external server.

On its own, MCP is roughly equivalent to a structured REST API call. What makes it powerful is what you expose through it, and what governance you enforce around those exposures.

AtScale’s MCP server exposes three core tools to an agent:

  • List models — surfaces the semantic models available for querying
  • Describe model — returns full metadata: dimensions, metrics, hierarchies, descriptions, synonyms
  • Run query — executes a governed query against the semantic layer and returns deterministic results
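The three tools above can be sketched as a single interface. This is an illustrative mock, not AtScale's actual MCP API: the class, method names, and metadata shapes are assumptions made for the example.

```python
# Minimal sketch of the three-tool interface a semantic-layer MCP server
# exposes to an agent. Names and payload shapes are hypothetical.

class SemanticLayerMCP:
    """Mock MCP server exposing governed semantic-model tools."""

    def __init__(self, models):
        # models: {name: {"metrics": [...], "dimensions": [...]}}
        self._models = models

    def list_models(self):
        # Tool 1: surface the semantic models available for querying.
        return sorted(self._models)

    def describe_model(self, name):
        # Tool 2: return the metadata the agent reasons over.
        return self._models[name]

    def run_query(self, name, metrics, dimensions):
        # Tool 3: a governed query; here we only validate against metadata.
        meta = self._models[name]
        unknown = [m for m in metrics if m not in meta["metrics"]]
        if unknown:
            raise ValueError(f"unknown metrics: {unknown}")
        return {"model": name, "metrics": metrics, "dimensions": dimensions}

server = SemanticLayerMCP({
    "internet_sales": {
        "metrics": ["sales_amount", "order_count"],
        "dimensions": ["date", "region", "product_color"],
    }
})
print(server.list_models())
```

The point of the shape: the agent never constructs SQL, it discovers models, reads metadata, and asks for metrics by name.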

When Databricks’ Agent Bricks connects to the AtScale MCP server, it can reason about your data using that metadata, construct simple logical queries against your semantic models (no joins required), and return results that are consistent regardless of how the question is phrased or which tool asks it.

The underlying SQL query that AtScale generates against Databricks can be complex, including joins, filters, and aggregations across multiple tables. But the agent never sees that complexity. It works with a clean semantic interface.
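To make that split concrete, here is a toy expansion of a logical query into join-heavy SQL. The table and column names are hypothetical; AtScale's actual SQL generation is far more sophisticated.

```python
# Illustrative only: the agent submits a simple logical query, and the
# semantic layer resolves the joins and aggregation rules behind it.

def expand_query(metric, dimension):
    # What the agent sees: a metric by name, grouped by a dimension.
    logical = {"metric": metric, "group_by": dimension}
    # What the data platform executes: joins, aggregation, grouping.
    sql = (
        f"SELECT d.{dimension}, SUM(f.{metric}) AS {metric}\n"
        f"FROM fact_internet_sales f\n"
        f"JOIN dim_{dimension} d ON f.{dimension}_key = d.{dimension}_key\n"
        f"GROUP BY d.{dimension}"
    )
    return logical, sql

logical, sql = expand_query("sales_amount", "region")
print(sql)
```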

From Architecture to Execution: What the Demo Proved

We walked through a live demo of the AdventureWorks dataset in Databricks, using a semantic model built in AtScale Design Center.

A few things worth highlighting from that demo:

AI-generated semantic models. AtScale’s semantic models are defined in SML, a YAML-based language; in other words, they’re plain code. Because LLMs are good at writing code, we were able to generate a new semantic model automatically from an existing Databricks table. This significantly reduces the manual effort traditionally required to build and maintain semantic models.
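As a rough sketch of the idea, the snippet below derives a YAML-style model definition from table column metadata. The field names are illustrative and do not follow the exact SML schema; in the demo this generation was done by an LLM rather than a rule.

```python
# Hypothetical sketch: derive a semantic-model definition from a table's
# column names and types. Numeric columns become candidate metrics, the
# rest become dimensions.

def generate_model(table, columns):
    metrics = [c for c, t in columns.items() if t in ("int", "decimal")]
    dims = [c for c in columns if c not in metrics]
    lines = [f"model: {table}", "metrics:"]
    lines += [f"  - name: {m}\n    aggregation: sum" for m in metrics]
    lines += ["dimensions:"] + [f"  - name: {d}" for d in dims]
    return "\n".join(lines)

print(generate_model("internet_sales",
                     {"order_date": "date",
                      "region": "string",
                      "sales_amount": "decimal"}))
```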

MCP integration through the Databricks Marketplace. AtScale’s MCP server is available directly in the Databricks Marketplace, making it straightforward to connect to Agent Bricks without any custom integration. Within minutes, Agent Bricks had access to the semantic models, could list them, inspect their metadata, and run queries.

Usage-based aggregate optimization. As queries came in through Agent Bricks, AtScale automatically created optimized aggregate tables in Databricks. Subsequent queries hit those aggregates instead of the full fact table, returning results in under a second. This matters at enterprise scale, where query costs and latency are real constraints.
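The routing logic behind that optimization can be sketched in a few lines. This is a toy model of the behavior, with made-up table names, not AtScale's actual aggregate engine.

```python
# Toy sketch of usage-based aggregate routing: once an aggregate exists
# for a set of dimensions, later queries needing only those dimensions
# hit the (much smaller) aggregate instead of the full fact table.

aggregates = {}  # frozenset(dimensions) -> aggregate table name

def route(dimensions):
    key = frozenset(dimensions)
    for agg_dims, table in aggregates.items():
        if key <= agg_dims:  # an existing aggregate covers this query
            return table
    # Miss: serve from the fact table, and register an aggregate so
    # repeat queries along these dimensions are fast.
    aggregates[key] = f"agg_{'_'.join(sorted(key))}"
    return "fact_internet_sales"

first = route(["region", "order_date"])  # cold: full fact table
second = route(["region"])               # covered by the new aggregate
```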

Autonomous analysis, governed results. When prompted with “Tell me something I don’t know about sales,” Agent Bricks ran a series of queries autonomously, analyzed the results, and surfaced non-obvious patterns, including weekend sales spikes, geographic anomalies, and color preferences correlating with higher willingness to pay. The questions the LLM asked varied each time, but the numbers behind every answer were consistent.

Why Governed Semantics Are Non-Negotiable for Agentic AI

Conversational analytics is a useful interface, but it still requires a human to initiate the query, interpret the result, and decide what to do next.

Agentic AI changes the model. Agents can be configured to monitor conditions, trigger actions, and operate headlessly without human intervention. The example Denny and I walked through: an agent configured to identify customers approaching a churn threshold and send them a targeted offer automatically, as the data changes.
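A headless agent of that kind reduces to a simple monitor-and-act loop. The sketch below uses stand-in functions for the governed metric source and the offer action; it is not the Agent Bricks API.

```python
# Hedged sketch of a headless agent pass: poll a governed churn metric
# and trigger an action when a customer crosses a threshold.

CHURN_THRESHOLD = 0.7

def churn_scores():
    # Stand-in for a governed semantic-layer query; in practice this
    # would come through the MCP tools, not hard-coded data.
    return {"cust_1": 0.82, "cust_2": 0.35, "cust_3": 0.71}

def send_offer(customer):
    # Stand-in for the automated action (e.g., a targeted offer).
    return f"offer sent to {customer}"

def run_agent_once():
    actions = []
    for customer, score in churn_scores().items():
        if score >= CHURN_THRESHOLD:
            actions.append(send_offer(customer))
    return actions

print(run_agent_once())
```

Because the churn metric comes from a governed definition, every pass of the loop acts on the same calculation, which is exactly what makes unattended operation tolerable.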

That’s prescriptive analytics operating at machine speed.

For that to work safely, the metrics powering those decisions have to be right. An agent that hallucinates a churn rate not only produces the wrong report, but could also send a discount offer to your highest-value customers who have no intention of churning, or fail to flag ones who do. In an autonomous system, incorrect metrics translate directly into incorrect actions, with the potential to inflict damage at an unprecedented scale.

Governed semantics are the prerequisite for trust in an agentic system.

The Reference Architecture for Governed Agentic AI

What we demonstrated in the webinar comes down to a straightforward integration:

Semantic definition layer — AtScale defines and governs semantic models using SML (YAML-based, version-controlled, portable).

Context exposure layer — The AtScale MCP server exposes governed tools to Agent Bricks.

Reasoning layer — Agent Bricks interprets metadata, constructs queries, and synthesizes insight.

Execution layer — Databricks executes optimized SQL generated by AtScale.

The LLM reasons. The semantic layer governs. The data platform executes.
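The division of labor above can be wired together in a few stand-in functions. Everything here is illustrative: the metric name, the governed view, and the returned rows are assumptions for the sketch.

```python
# Minimal end-to-end sketch of the layered architecture: the reasoning
# layer picks a metric, the semantic layer governs it, the platform
# executes the SQL. All names and values are hypothetical.

def semantic_layer(metric):
    # Definition + governance: only governed metrics resolve to SQL.
    governed = {"gross_margin": "SELECT year, margin FROM governed_view"}
    if metric not in governed:
        raise KeyError(f"ungoverned metric: {metric}")
    return governed[metric]

def data_platform(sql):
    # Execution layer: pretend to run the optimized SQL.
    return {"sql": sql, "rows": [("2024", 0.42)]}

def agent(question):
    # Reasoning layer: interpret the question, then call governed tools.
    metric = "gross_margin" if "margin" in question else None
    sql = semantic_layer(metric)  # context exposure (MCP tools)
    return data_platform(sql)["rows"]

result = agent("What is our gross margin trend?")
```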

The Shift From Demos to Production

In the early days of data engineering, there were no testing frameworks or software development practices. The industry grew into those disciplines over time. Agentic AI is at that same inflection point. The organizations that apply rigor now, building governance into their agent architecture from the start, will be the ones that can actually trust what their agents are doing.

Over time, enterprises will measure AI success by the reliability of automated outcomes. That reliability begins with governed semantics.

The MCP server for AtScale is now available on the Databricks Marketplace. AtScale itself can be downloaded and tested for free. If you’re building toward an agentic future on Databricks, this is a practical place to start.

>> Watch the full webinar on-demand now.
