I just got back from the Databricks Data & AI Summit, and I have a lot to unpack. This was the biggest DAIS yet, with 31,309 attendees, and the energy was unmistakable. More importantly, Databricks came in with a sharpened story about where it’s headed.
The headline strategic shift: Databricks has decided its real competition isn’t Snowflake anymore. It’s the frontier model makers, namely Anthropic, OpenAI, and Google. The thesis is that privileged access to corporate data gives Databricks an edge that the frontier players can’t easily replicate. It’s a bold bet, and whether it pays off is one of the show’s more interesting open questions. Either way, it’s a clear point of view, and it organized everything they announced.
Speaking of organization: this year’s framing centered on four AI pillars, namely Context, Control, Cost, and Choice. That’s a useful spine, so I’ll use it to organize my takeaways.
The Four Pillars at a Glance
Here’s how the announcements map:
- Context: Genie Ontology, Lakewatch, Omnigent
- Control: Unity AI Gateway
- Cost: Unity AI Gateway (smart routing)
- Choice: Lakebase, Any LLM, Iceberg
One honest observation: “Cost” appears to be a recent addition to the framework. It sat a little awkwardly on the keynote slides and didn’t appear in the Partner Summit deck. I read that as a feature, not a bug. It reflects how fast the industry moved this year toward “token maxxing,” meaning squeezing the most value out of every token and getting runaway agent bills under control. Databricks adjusted the story to meet the moment, and the substance behind Cost (more on that below) is genuinely good.
Let’s take the pillars one at a time, starting with the one I found most consequential.
Context: The Biggest Bet
This is the pillar that matters most to those of us who live in the semantic layer world.
Genie Ontology and Ontorank
Databricks has returned to a data-science-flavored approach: indexing data and using something they call Ontorank to surface the right context to Genie. In effect, this leans on Genie to generate SQL across customer schemas with better grounding. I’d call it “back to the future,” since it’s closer to the original pre-Unity Catalog Metric Views idea than to where they were pointing a year ago. Whether that’s the right direction depends a lot on how well Ontorank handles messy, real-world schemas, which is exactly where this approach has historically struggled.
Notably, Ali Ghodsi framed Unity Catalog Semantics (which folds in UC Metrics) as part of Genie Ontology and, in doing so, name-checked AtScale as a third-party semantic layer. I’ll happily take that.
What stood out just as much was what wasn’t said: there was no mention of UC Metric Views in the keynote. Given how central that was to last year’s story, the omission reads like a strategic recalibration in semantic-layer land. My read is that recent industry moves, including Microsoft’s repositioning around UC Metric Views, have Databricks rethinking how hard to lean on that particular bet.
The accuracy benchmark, and the real opportunity
Databricks shared an internal benchmark claiming Genie Ontology hits 84.5% accuracy. That’s a meaningful jump over naive text-to-SQL approaches that tend to land in the 20-25% range. Credit where it’s due.
But it’s also a clear signal about where the category is heading. An 84.5% hit rate means a 15.5% miss rate, so Genie is still wrong roughly one time in six. That leaves a lot of room for improvement, and accuracy is the wide-open problem. Whoever plants a flag on trustworthy, governed, benchmark-backed accuracy is positioned to win this space. (I have opinions about who that should be, and “bring your own semantic layer” is a big part of the answer.)
Genie One
Genie One is Databricks’ entry in the single-assistant race, one chatbot-and-coding-companion to sit alongside Claude, ChatGPT/Codex, Cursor, and Snowflake CoWork. It runs on Genie Ontology, and from my early hands-on time, it’s still maturing. They also introduced OpenSharing, which lets you share Genie agents the way Delta Sharing shares data. It’s a smart idea, and one to revisit as the underlying experience firms up.
Omnigent
Omnigent is Databricks’ agent harness, essentially a developer IDE for orchestrating multiple LLMs and coding agents. This is clearly a Matei Zaharia project, and that’s worth respecting: Matei is a brilliant founder and a big reason Databricks is what it is today. The project is ambitious and, in my view, still searching for the precise problem it’s best suited to solve. The good news is they open-sourced it, which is exactly the right instinct. Putting it in the community’s hands is probably the fastest way for it to find its sharpest use case.
Control: The Most Coherent Thing They Shipped
The Unity Catalog AI Gateway was the cleanest, most complete announcement of the show. It bundles:
- An agent (MCP) registry
- Policies and budget controls
- Agent tracing and observability, with session data written to a UC table
- Smart routing, matching each job to the lowest-cost LLM that can handle it
It’s a sensible, well-integrated control plane for a world where enterprises run many agents and need governance to match.
Cost: Smart Routing Is the Substance
The standout feature inside the Gateway, and the real substance behind the Cost pillar, is smart routing. The Gateway can select or substitute lower-cost (often open-source) models based on task difficulty, automatically using cheaper models for easy work. That directly addresses the token-maxxing problem everyone was talking about, and I think this pattern of controlling token spend by routing intelligently will be one of the defining themes of the next few years.
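To make the routing idea concrete, here is a minimal sketch of how I think about it. To be clear, this is my own illustration, not Databricks’ implementation or the Gateway’s API: the model names, prices, and 1-to-3 difficulty tiers are all made up, and a real router would have to estimate task difficulty rather than be handed it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                  # hypothetical model name
    cost_per_1k_tokens: float  # hypothetical price, in dollars
    max_difficulty: int        # hardest task tier this model is trusted with (1 = easy, 3 = hard)


# Illustrative catalog only; none of these numbers come from Databricks.
MODELS = [
    Model("small-open-model", 0.10, 1),
    Model("mid-tier-model", 0.50, 2),
    Model("frontier-model", 3.00, 3),
]


def route(difficulty: int, models=MODELS) -> Model:
    """Pick the cheapest model whose capability tier covers the task."""
    capable = [m for m in models if m.max_difficulty >= difficulty]
    if not capable:
        raise ValueError(f"no model can handle difficulty {difficulty}")
    return min(capable, key=lambda m: m.cost_per_1k_tokens)


def estimate_cost(tasks, models=MODELS) -> float:
    """Total spend for a list of (difficulty, token_count) tasks after routing."""
    return sum(
        route(difficulty, models).cost_per_1k_tokens * tokens / 1000
        for difficulty, tokens in tasks
    )
```

With these invented prices, one easy 1,000-token task plus one hard 2,000-token task costs about $6.10 when routed, versus $9.00 if everything went to the most expensive model. The savings grow with the share of easy work, which is the whole argument for routing.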
Choice: No Lock-In, By Design
The Choice pillar is the anti-lock-in message: Lakebase (their Postgres offering), Any LLM, and Iceberg, all under the banner of “any data, any model, any cloud, no lock-in.” Combined with OpenSharing, it’s a consistent pitch to customers who want flexibility and portability. It’s also the pillar most directly aimed at the frontier-lab thesis, the argument that Databricks meets you wherever your data and models already live.
Beyond the Pillars
A couple of announcements didn’t map neatly onto the four pillars but are worth noting.
Reyden and Lakehouse RT. Databricks teased Reyden, a new low-latency query engine that’s their answer to ClickHouse and the real-time analytics crowd. The first product riding on it will be Lakehouse RT. It’s in private preview and feels early, but it’s worth watching.
CustomerLake. This one raised the most interesting strategic question for me. Databricks is shipping a first-party Customer Data Platform (CDP), and it gave the product real stage time. The likely logic is capturing app revenue directly amid the broader “SaaS-pocalypse.” It’s a reasonable bet, though it sits in tension with the Apps framework and marketplace they also promoted, which depend on a thriving partner ecosystem. How Databricks balances building first-party apps against enabling partners to build them is a thread worth following.
The Bottom Line
Databricks left this Summit as a company with enormous gravity and a long list of ambitious bets: a sharpened stance against the frontier labs, a recalibrated context layer strategy, a flagship agent product still finding its footing, and a move into first-party apps.
But the most important story was quieter: accuracy and cost are the unsolved problems. An 84.5% benchmark is real progress and a wide-open invitation to go claim the accuracy leadership position. Some of us intend to.
See you next year. Bring your own semantic layer.