The Data Stack Show - Re-Air: Context is King: Building Intelligent AI Analytics Platforms with Paul Blankley of Zenlytic
Episode Date: November 19, 2025

This episode is a re-air of one of our most popular conversations from this year, featuring insights worth revisiting. Thank you for being part of the Data Stack community. Stay up to date with the latest episodes at datastackshow.com.

This week on The Data Stack Show, John chats with Paul Blankley, Founder and CTO of Zenlytic, live from Denver! Paul and John discuss the rapid evolution of AI in business intelligence, highlighting how AI is transforming data analysis and decision-making. Paul also explores the potential of AI as an "employee" that can handle complex analytical tasks, from unstructured data processing to proactive monitoring. Key insights include the increasing capabilities of AI in symbolic tasks like coding, the importance of providing business context to AI models, and the future of BI tools that can flexibly interact with both structured and unstructured data. Paul emphasizes that the next generation of AI tools will move beyond traditional dashboards, offering more intelligent, context-aware insights that can help businesses make more informed decisions. It's an exciting conversation you won't want to miss.

Highlights from this week's conversation include:

Welcoming Paul Back and Industry Changes (1:03)
AI Model Progress and Superhuman Domains (2:01)
AI as an Employee: Context and Capabilities (4:04)
Model Selection and User Experience (7:37)
AI as a McKinsey Consultant: Decision-Making (10:18)
Structured vs. Unstructured Data Platforms (12:55)
MCP Servers and the Future of BI Interfaces (16:00)
Value of UI and Multimodal BI Experiences (18:38)
Pitfalls of DIY Data Pipelines and Governance (22:14)
Text-to-SQL, Semantic Layers, and Trust (28:10)
Democratizing Semantic Models and Personalization (33:22)
Inefficiency in Analytics and Analyst Workflows (35:07)
Reasoning and Intelligence in Monitoring (37:20)
Roadmap: Proactive AI by 2026 (39:53)
Limitations of BI Incumbents, Future Outlooks and Parting Thoughts (41:15)

The Data Stack Show is a weekly podcast powered by RudderStack, customer data infrastructure that enables you to deliver real-time customer event data everywhere it's needed to power smarter decisions and better customer experiences. Each week, we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
Transcript
Hey everyone, before we dive in, we wanted to take a moment to thank you for listening
and being part of our community. Today, we're revisiting one of our most popular episodes in the
archives, a conversation full of insights worth hearing again. We hope you enjoy it and remember
you can stay up to date with the latest content and subscribe to the show at datastackshow.com.
Hi, I'm Eric Dodds. And I'm John Wessel. Welcome to The Data Stack Show.
The Data Stack Show is a podcast where we talk about the technical, business,
and human challenges involved in data work.
Join our casual conversations with innovators and data professionals
to learn about new data technologies
and how data teams are run at top companies.
Before we dig into today's episode,
we want to give a huge thanks to our presenting sponsor, RudderStack.
They give us the equipment and time to do this show week in, week out,
and provide you with valuable content.
RudderStack provides customer data infrastructure
and is used by the world's most innovative companies
to collect, transform, and deliver their event data
wherever it's needed all in real time.
You can learn more at rudderstack.com.
Oh, welcome back to the Data Stack Show.
Yeah, exactly.
We're here live from Denver.
Yeah, from my house, actually.
So I got to catch up with Ben Rogojan
last week in person here in Denver
and now we get to do this.
Yeah, the Denver 18th.
I mean, it's not a good week, you know?
Yeah, exactly.
Awesome.
Yeah, catch us up from when we last talked.
Yeah, there's a lot of exciting things going on on the Zenlytic side.
We've been getting a bunch of great new logos like J.Crew, Stanley Black & Decker,
some of these just fantastic companies to work with.
Awesome, yeah.
And we've just been seeing AI go gangbusters in terms of the overall capabilities of the models.
And it changes a lot of how this stuff generally needs to work in the future.
And it just changes so fast.
It's like nothing I've ever seen in terms of the rate of change of the industry overall.
Yeah.
So I'm always curious about these two questions.
One, is it going faster than you would have thought?
And then, like, the follow-up to that is, what is something — let's pick a six-month or
one-year time frame — that you're like, wow, I did not expect this?
So I think, I think definitely faster than I expected.
And I see a lot of really bad takes from people like, oh, well, the models are only getting sort of incrementally better.
It's like, you just have to use these things often enough to realize, like, the rate they're improving.
So it's like definitely faster than I anticipated. And I think it's also the domains in which
they are getting dramatically better that's maybe most interesting to me. So it's like, the things
where they continue to be sort of approximately human, or, like, sub-human, are a lot of the sort of
softer domains — just sort of understanding, communicating,
these kinds of things that humans do a lot.
And then if you look at things that they are already superhuman at,
and increasingly getting more dramatically superhuman at,
it's, like, coding.
Like, you can throw an AI agent at a programming competition and it will win,
or come in, you know, within the top five,
against the best programmers in the entire world
who have been training on doing these for years.
Same with mathematics.
Same with any symbolic sort of task.
And the reason for that is that you can generate a massive amount of training data
on these symbolic tasks that you can verify or correct.
So it's like you could say, hey, this test case needs to pass
if this code is written correctly,
and then you can just reinforce and learn that process
at a just truly massive scale,
way more so than you can with sort of softer problems.
Right.
So while the models have certainly improved
in how they handle things in sort of softer domains,
how they've improved in terms of code generation
or math or physics or other hard
or symbolic domains —
that has been maybe one of the most interesting things to me.
It's also shaped a lot about how I think about the future
because I think within six to 12 months,
we're going to be superhuman in every respect
in symbolic domains.
So it's like there will be basically no human coder in 12 months
that is a better programmer than a language model.
That doesn't mean that the human won't be writing better code
than the language model —
someone who has more business context but isn't as good at coding
is going to do a way better job, in terms of actual impact on the stuff that matters, than a language model. But a lot of our job in thinking about how we
build an AI employee is: how do we give the AI the right context to work with, to be able to use the
stuff it is superhuman at to really help the human make their decisions? So interesting. So
another question along the lines of AI models: what's your take on the long-running models? Because I've seen
some stuff come out where people cite these stats,
like, oh, these long-running models, they don't
actually perform that well, you know.
And then other people who are like, it's amazing.
Like, you know, the longer it can run, like, you know,
the more it could do. Yeah.
No, so I think this is something
that divides a lot of people, even people who use them pretty often.
So I love them.
I think it's amazing. Like, you
just have to do a good job of sort of
pointing it at the research problem
and the method or approach you want to take,
things like that. So for instance, if I
wanted to go and do an analysis of, like, hey, what are the different data sources that people most often integrate with Salesforce?
It's like, I would have an idea of how I want to approach that problem and, you know,
what other stuff fits with Salesforce in this way. And I can go and tell the agent, hey, this is
generally the approach I want you to take. And it can go and look at a thousand different sites
or a thousand different references and sort of aggregate all that up. And then as I look at the
report, I can click and see the citations and dig deeper on anything I want to take deeper.
So I find it to be incredibly powerful.
I know a lot of people, though, like, just don't engage with that in the same way.
Yeah.
And I don't really use it for, like, quick and dirty things.
Right.
So I think some of it is a difference in usage patterns as well.
Yeah.
Yeah.
I've seen deep research do some really neat things.
Manus — I've seen some really neat things there too.
So, yeah, the thing that I've wanted — and this actually brings us kind of into the
BI territory — the thing that I've really wanted to see, which I'm interested in your
take on, is the long-running analyst task. Where, you know, there's
a spectrum here: there's, like, text-to-SQL essentially, or the equivalent,
versus, like, I have this really deep problem with data from, like, a bunch of different places,
and I want you to, like, solve the problem. I think with your kind of AI employee concept you guys
are really trying to span both. So I'm interested — I want to dig a little bit on the harder
problem. And, like, I assume the long-running task is part of that, but what
are some other components to solving for those, like, deep McKinsey or consultant-style problems?
Yeah. I think a lot of it is, you first have got to give the model an interface in which it can
really cook — like, which it can really work in and work well. That interface still has got to be
governed. Most importantly, it's got to correspond to your actual business context,
because if it doesn't understand the environment it's working in — just like an employee
who's very technically capable but doesn't actually connect to the business — it's not going to be
that valuable at the end of the day, because it's just speaking a different language from what
people are going to need to make actual decisions.
I refer to that as the data science problem.
Yeah.
Yeah, it really is.
Because that's what companies did 10, 15 years ago —
they all hired data scientists.
And then there was, honestly — I think a lot of it was a communication gap.
Like, these remarkably smart PhD people with, like,
maybe a stats background and a computer science background, and a lot of them
struggled with the business value.
Yeah.
Yeah.
And I think with AI, if you don't do a good job of providing the business context,
you get kind of the same thing, where you get these answers that are maybe
generically good, or good on average,
but not actually good for helping you make
decisions about your business specifically.
So that's a lot of the stuff
that I think is really important.
The other thing is
picking the right kind of model for the job.
Like right now we have a model picker,
which is not something I'm very excited about.
I would love to see us have something
that's more of a delineation between
kind of fast and slow.
Okay.
But right now you have a model picker.
So if you pick GPT-4.1,
it's going to be really fast.
It's going to answer the question quickly.
But if you wanted a sort of
comprehensive look at what's going on,
it's not going to be so great at that.
Do you find at this point in time,
are customers, like,
you've got to have probably a subset of customers
that are really into like,
oh, like this is my model.
It's like a branding thing almost.
Like, I like this kind of car, right?
Even though maybe they're like fairly equivalent
for the task.
Do you find that to be the case?
Or do you find those people are like,
just tell me whatever works best,
I don't care?
So one of the things that was interesting:
as soon as we launched the model picker,
people loved to just go in and play around with it.
Yeah.
And it's different for different companies,
because there is this tradeoff, right,
between latency and sort of comprehensiveness.
Right.
So, GPT-4.1 is going to be really fast.
Right.
Claude Sonnet 4 or, like, o3 are going to be incredible
in terms of the depth and stuff that they can do.
But they take forever to run.
So you've kind of got to consider
what experience do I want my users to have by default.
Do I, by default, want it to be more thoughtful and slow?
Or, by default, do I want it to be faster, but maybe less comprehensive?
Right.
So one of the ways I like using Zoe to go and explore some data is that I'll just go
and I'll say, like, hey, make me a comprehensive dashboard of everything that's going on in the business,
and just, like, let it crank.
And it'll just crank and pull all this stuff and put it together.
And I've got an overall view of everything that's going on and can kind of dive into any
areas that I do want to focus on.
And I imagine too as we progress, there's going to be a mix here where if I'm sitting down,
Like, all right, I'm going to spawn five background tasks to go explore these things.
Those run.
And then, like, while those run, I'm going to be running 4.1 or something and just, like,
kind of exploring around.
So you can kind of do both.
Oh, totally.
I mean, not that you couldn't already do that, but I think — I forget, that's the thing.
I forget that, you know, you could actually kind of multi-thread this stuff.
I just forget.
Yeah, no, it's easy to do, because it's as easy as, like, you open a bunch of different tabs
and you just start different threads in the different ones.
And it's crazy, the amount of stuff you can parallelize just like that right now.
Right.
Yeah, I mean, you know, just like with a bunch of tabs, there's the whole focus thing that, you know,
you have to be careful with, because you do still have circumstances
where, like, deep thought is the right answer.
And it can definitely be a distraction.
Yeah, absolutely.
But that also kind of plays into where I think we're going in the long run.
So I think about us in the long run
as an employee, basically, as another person.
So it's like the, you know, quote:
don't buy software, hire talent.
You should be thinking about AI agents
as almost something that you're hiring into your business.
And I think about our place in that as effectively
an AI McKinsey consultant
that's especially good at working with data,
knows your business context,
and can immediately start helping you answer
some of these questions
that actually matter.
So a lot of the problem that I view that solving is —
BI, as it's existed for a long time, is primarily, like, a collection of facts. And that's fine
if you have someone who's extremely analytically minded and wants to, like, hop in and look at all
the facts and kind of build those into a narrative and go with that. But a lot of the problem that
analysts have had to solve — and a lot of this isn't really rocket science; you know, it's often
referred to as, like, data literacy — is, how do you go from this collection of facts and
turn that into a decision about which products you're promoting in campaigns? Or turn that into decisions
about when you're reordering inventory
for these different SKUs.
So that is one of the real fundamental problems
that AI enables us to solve that we're especially excited about.
That's all.
So I've got to ask about the AI employee thing
because this has come up more than once.
Do you think that in the future
companies will pay for training for their AI employees?
I think they have, and they already do.
Okay.
And in what way?
I'm imagining, like, we have an AI employee
and we, like, send them to a conference
like a real person.
I know it's not going to be quite that way, but yeah.
I think it's actually going to be called data engineering.
It's going to be — because a lot of the job that you've got to think about is how you broker
business context to a model.
And a lot of that is, like, doing actual data modeling, data engineering the way we think
about it now.
A lot of it's doing similar work to data engineering, but applied to text.
Because, like, how do you get the right text in the right spot where this model is able to act
on it in a way that matters?
And now we have incredibly capable models that, given the right context, can perform a lot of rote tasks, like, really well and pretty reliably now.
Right.
So the problem, and the thing that we have to be very cognizant of, is that you need basically context engineers.
Sure.
Making sure the right stuff shows up at the right time.
Okay.
So context engineers.
Is that a job posting?
Like, are people hiring?
Not a job posting yet, but in a couple of years?
In a year?
In a year, it could be.
Yeah.
Yeah.
Kind of like prompt engineering.
Yeah.
Man, that's fun.
Okay.
So speaking of context: structured data is kind of where we've been.
We had some runs with some, you know, databases devoted to them back 10 years ago.
Do you think — well, two parts to this question — do you think the existing structured data platforms are going to actually be able to kind of retool to support what everybody needs out of an unstructured data store?
Or is that a whole new, like, greenfield thing, where people are going to have maybe a structured data
platform and an unstructured one and hook them both up to an AI or BI tool?
Yeah, great, great question.
I think it's a hard question to answer, because the water is going to be really bloody.
Like, you have kind of enterprise search, which has previously done the sort of unstructured
things.
You have all the BI players, like legacy BI players, doing everything just on top of SQL.
I think it's inevitable that in the long run these two things come together.
One of the things language models did is they changed what people think about when they think about data.
So data used to mean, like, a table, a CSV, something in a database that you have to ask somebody about.
Data now also corresponds to contracts and, you know, internal knowledge docs and all this other stuff that is immensely valuable, but previously wasn't thought about as data.
And it's time for business intelligence to actually, like, make good on the "business intelligence"
instead of just the SQL work.
Contracts is such a good one,
because I can't tell you the number of companies
I've worked with —
all of the companies I've ever worked for
have all sorts of valuable knowledge
in DocuSign, essentially,
and PDFs, where they, you know,
customize the contract
for that enterprise deal,
and they keep doing it and they keep doing it.
And nobody has any idea what the deals are.
Somebody has to dig up the PDF,
and the person left,
and you have to go get into their email.
You know, it's like a whole thing.
And then there's an initiative like,
hey guys, get all this in Salesforce.
So it would be so interesting once,
like, all of that
is accessible in the same way — probably even a better way — than, like, structured data would be.
Yeah. Well, I think the other thing that you alluded to is that part of the benefit and the problem
with BI, the way it's done now, is that it relies on very structured data that sits inside
of a SQL warehouse, which requires a person to set up a connector, transform a bunch of data,
and clean it to the point that it actually corresponds to what the business is referring to.
And that's great. That covers a lot of very important metrics and
KPIs and how you're actually performing as a business at this high sort of
aggregate level. But it is just so much work that no one would ever bring in all
the PDFs of all their contracts. Right. And, like, you would need a Fivetran connector to
DocuSign — that probably doesn't exist like this. And even if you don't
have to write that yourself, you've still got to, like, somehow manage all these PDFs and that sort
of engineering work. Some companies will go down that road and do it. Other ones need a
more plug-and-play way to get unstructured data out of these systems and into effectively
search indexes, so the agents can start working.
Yeah.
Interesting.
So kind of another topic I think about a lot is MCP servers. Because, so now
as we're talking unstructured data, structured data — and then we've got, like, MCP servers here.
We're like, okay, that's an interesting interface.
Like, how do you think MCP servers play into the BI space?
I think MCP servers are actually really interesting.
And maybe the hot take is that if you are building an MCP server as the BI product,
you are eventually going to get displaced.
And the hot take there is that BI systems aren't actually systems of record.
They don't have the data at the end of the day; they're just processing layers.
True.
And that processing layer can be expressed as YAML code.
That processing layer can be expressed as SQL, depending on which BI product you're using
and how it's structured internally.
But it's like, if you're basically outsourcing your entire UI to Claude or to OpenAI
via an MCP server, eventually you're going to be in trouble.
Yeah.
Because you're just a processing layer.
You don't have the actual data like Salesforce does.
Like that's what you guys.
I think it's only a defensible position if you have proprietary data for some reason.
Like, I know of a couple of companies that have a unique proprietary source of data,
and them putting MCP on top of it —
it's the same model they have now:
they're charging for data via an API.
Like, cool, put an MCP on it.
Same model.
But if it's not, like, proprietary or unique in any way, I totally agree.
Yeah.
And I think it's like, for how people work with enterprise data to succeed,
you do have to own the interface.
And like, if you can't add enough value in your interface and how people understand
what you've done — the interpretability piece, how it integrates with dashboards,
how it integrates with other parts of the system — it's like you'll kind of get beat.
And the thing that people want — and this kind of goes back to your MCP question — is, a lot of people will come and ask us, like, hey, can you be an MCP server? I want to build my own agent. And the question, to go back to your unstructured comment, is, well, why do you want to do that? The reason people almost universally want to do that is because they want one system that can talk to their PDFs over here and can talk to their structured data over here. And BI, like, BI must evolve to do that. Otherwise, it's not going to actually
be business intelligence — it's going to be SQL intelligence.
Right, right.
Yeah.
Well, and the other thing, too, is MCPs are still just a layer over APIs,
and they'll have the same problems that APIs have until a different abstraction is formed.
Because I think people actually have too high of an expectation of MCPs,
when it's like, look, there's a reason we put all this effort into centralizing these things in databases —
it's because it's better for analytical queries.
It's not that we didn't know how to interface with APIs.
We knew how to do that.
We did this on purpose.
Yeah, exactly.
It's like — and that's what I mean about the value of the UI.
A lot of people think you can just kind of magically post something into an MCP server
and everything's, like, great.
Right.
But it's like, there is value in UI.
Yeah.
Like, the final UI is probably not one universal chat system that does literally everything.
It's like, Cursor is very successful.
Cursor is not the same thing as ChatGPT.
Like, it has UI.
Yeah, exactly.
That is really beneficial in coding.
So.
Yeah.
And one of the, you know, things from the Zenlytic product
that's fairly unique, actually,
is the ability to switch back and forth.
I think it's going to be more commonplace,
at least I hope,
where you can start with a GUI element of a chart
and say, hey, I want to essentially talk to this chart or dashboard
and know more about this thing.
Or I can start from chat and then, like, hey, you know,
just say what you're looking to learn more about
and create charts and graphs.
I think the ability to go back and forth
is a big deal for user experience.
Totally.
And I think that's something that Cursor got really right —
something Andrej Karpathy was talking about in his recent demo day talk — where it's like, a lot of
AI applications that succeed are going to find this right balance of giving the model
flexibility to do what it wants, while you can kind of control the autonomy of the agent.
And I think that's a pretty good way to think about it, where it's like, if you're in Zenlytic,
right, if you want to be in full control, the agent is not really involved.
You can be in the UI clicking on things the same way you did before.
You can also ask Zoe to go and pull a piece of data for you.
That's, like, barely agentic at all.
And then you could also ask it to do
deep research on what the tariff impact
on your gross margin is going to be
for all these different SKUs.
It's like, take this CSV of tariffs on raw materials
and flow it through to my final margin impact.
And that's a really big ask
and a big task that it needs to do.
So it's like there's this slider of autonomy
in our own product right now.
And I'd expect that for products that do well,
they master the kind of transitions between those.
So it's really easy and feels really natural to go from I'm kind of running everything to the agents running some stuff to the agents running a lot of stuff.
And you can sort of go back and forth as needed to solve your problem.
Yeah, that's awesome.
So I want to go back a little bit to the unstructured data conversation.
What's hot there right now?
Because originally it was like, okay, everything in vector databases and obviously still some of that.
But say I've got hundreds of thousands of PDFs — like, what are you seeing people do
with it, who want to get that data into a business intelligence, you know, AI business intelligence
layer?
So I think there's a few different components.
There's, like, the deterministic filtering that's going to happen in whatever query language
on whatever database you're using.
You probably do need some component of embeddings or vectors.
There's a lot of complicated stuff that can be there:
chunking the documents; which embeddings do you use, based on the domain where the text is going
to show up most often, to make sure your embeddings are best fitted to the problem you're
trying to solve.
You're also going to need some hybrid component to the search as well.
It's probably not going to work that well if you just use the embeddings
and don't also have a keyword search as well.
So it's going to be a pretty sophisticated system.
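A minimal sketch of the kind of hybrid retrieval Paul is describing — blending vector similarity with keyword matching over chunked documents. Everything here, including the toy embedding, the scoring weights, and the function names, is a hypothetical illustration, not anything from Zenlytic.

```python
# Hypothetical sketch of hybrid retrieval over chunked PDF text.
# embed() is a stand-in for whatever embedding model you pick;
# the 0.7 / 0.3 weights are arbitrary and would need tuning.
from collections import Counter
import math

def keyword_score(query: str, chunk: str) -> float:
    """Crude keyword-overlap score (a real system would use something like BM25)."""
    q_terms = Counter(query.lower().split())
    c_terms = Counter(chunk.lower().split())
    overlap = sum(min(count, c_terms[term]) for term, count in q_terms.items())
    return overlap / max(len(chunk.split()), 1)

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def hybrid_search(query, chunks, embed, top_k=5, w_vec=0.7, w_kw=0.3):
    """Blend vector similarity with keyword overlap and return the best chunks."""
    q_vec = embed(query)
    scored = [
        (w_vec * cosine(q_vec, embed(chunk)) + w_kw * keyword_score(query, chunk), chunk)
        for chunk in chunks
    ]
    return [chunk for _, chunk in sorted(scored, key=lambda s: s[0], reverse=True)[:top_k]]

if __name__ == "__main__":
    # Toy embedding (character frequencies), just so the sketch runs end to end.
    def toy_embed(text: str) -> list[float]:
        return [float(text.lower().count(c)) for c in "abcdefghijklmnopqrstuvwxyz"]

    contract_chunks = [
        "Master services agreement with ACME Corp, renewal date January 2026.",
        "Pricing addendum: enterprise tier discounted 15 percent through Q3.",
    ]
    print(hybrid_search("when does the ACME contract renew", contract_chunks, toy_embed, top_k=1))
```

In practice the deterministic filtering, chunking strategy, and choice of embedding model do most of the heavy lifting — which is the "sophisticated system" point above.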
Yeah, okay.
Yeah, that's what I have sense.
I think a lot of people desire that, and I've seen a lot of people start down that road and kind of go, never mind.
It's just because of the complexity, not because they don't want it.
Yeah, yeah.
It's hard.
It's hard in a different way from structured data.
I think the structured data side is kind of, like, deceivingly easy.
One of our best sort of pipeline motions for customers is customers
that have tried to build it themselves.
Sure.
Because it's like — their own AI BI agent.
Yeah, exactly.
Because it's, like, trivial to start.
You can literally use an MCP server.
Yeah, and it's like, boom, it can go and it can answer any question on any table
in your whole SQL warehouse.
But it will inevitably pick the wrong table a lot.
It will not have the business context on how you define gross margin.
And you won't have the governance for anyone who's not an analyst, beyond, like, the SQL-level permissions that you want to use.
It won't have interoperability with dashboards or other things that are actually useful assets for data.
So it's really easy to start.
But then you kind of go down the road of, not only do I not have this integration, I don't have this governance.
And most importantly, the business people don't actually know what happened.
Because you can't show someone a giant 200-line SQL statement and say, hey, this is what we did.
You know, check our work.
I don't know what's going on in there.
Yeah, it's so funny.
Back to, like — essentially the search problem.
That's been one of the things that I have wondered about for years: why nobody has just
solved the basic search problem for BI.
I can't think of a single BI tool that has great search.
Oh, yeah.
Because, like — the reason it matters is I see two things.
One, you would obviously not be surprised that business users ask for things
that already exist, all the time.
For sure.
But worse is, analysts a lot of the time rebuild things that already exist too.
So it really goes both ways.
No, it totally does.
And I think a lot of the problem is due to hierarchies that get introduced and don't get searched over.
Yes.
Because it just makes the engineering a little bit easier.
So, like, a good example is, like, Looker has explores.
If you just need to see revenue and you see the sales, the marketing, you know, the pipeline, the revenue explore — like, what do you click on?
Yeah.
I don't know.
And it's like, that usually stops a person.
And they're like, okay, well, I don't know.
And it's in the folder called leadership.
Yeah, exactly.
You know, who knows?
So it's like, all these hierarchies that don't really make sense
are what cause a lot of problems with data discoverability.
Yeah, it's super interesting.
We're going to take a quick break from the episode to talk about our sponsor, RudderStack.
Now, I could say a bunch of nice things as if I found a fancy new tool.
But John has been implementing RudderStack for over half a decade.
John, you work with customer event data every day and you know how hard it can be to make sure
that data is clean and then to stream it everywhere it needs to go.
Yeah, Eric, as you know, customer data can get messy.
And if you've ever seen a tag manager, you know how messy it can get.
So RudderStack has really been one of my team's secret weapons.
We can collect and standardize data from anywhere, web, mobile, even server side, and then send it
to our downstream tools.
Now, rumor has it that you have implemented the longest-running production instance of RudderStack,
at six years and going.
Yes, I can confirm that.
And one of the reasons we picked RudderStack was that it does not store the data, and we can
live-stream data to our downstream tools.
One of the things about the implementation that has been so common over all the years
and with so many RudderStack customers is that it wasn't a wholesale replacement of your stack.
It fit right into your existing tool set.
Yeah, and it even fits in with technical tools, Eric,
things like Kafka or Pub/Sub,
but you don't have to have all that complicated
customer data infrastructure.
Well, if you need to stream clean customer data
to your entire stack,
including your data infrastructure tools,
head over to rudderstack.com to learn more.
Let's talk products.
So Zenlytic — we were talking about the product
maybe over six months ago now.
Yeah, maybe give us a little bit of a roundup
of what you guys have been working on,
the things you guys have rolled out.
Yeah.
So a lot of cool stuff on our side.
So workflows are maybe the biggest feature that we've launched.
Workflows are a way to take intelligent analytics —
like, intelligent sort of processes.
So some of our customers will use this for a weekly business review.
They'll go pull a bunch of data
and be able to tie that to impact.
It's like, hey, you know, this area went down the most.
We walked that hierarchy and went into the, you know,
class that went down the most,
the department that went down the most, the individual SKU levels, and you're able to give
this very comprehensive analysis that can go step by step and incorporate some intelligence the same
way a human would.
way, a human would. Other use cases where people are taking this feature and using it for almost
like dispute resolution, where you can take, you know, I've got a screenshot or a CSV from
SAP and I need to take all the items in there and marry that with something in my warehouse and
figure out, you know, based on this, what sort of message should I be sending to each of the
suppliers that come through in that CSV I got from SAP.
So a lot of different, really interesting use cases when you can kind of take some of the processes that you do for analytics and turn that into something repeatable that you can just schedule it to run every week on Monday.
You can just incrementally remove this work from your plate.
That's one of the big ones.
The other stuff is a lot around making our interface more flexible for the language models.
So I think, to give a little history
of how these interfaces work:
there's kind of the text-to-SQL interface.
Like, Snowflake Cortex is maybe the biggest one,
but a lot of YC companies are doing this as well.
And it's really flexible.
It can do anything, right?
It's just whatever you can do in a warehouse it can do.
The problem, though, is the governance level,
where it's like, how do you govern this for someone
who doesn't have a Snowflake account?
And then most importantly, the interpretability.
How do you make clear what you did,
in the same way a talented analyst would,
and in a really high-fidelity way,
to a business user,
so they can trust the result
that they're actually getting?
That trust piece is really important,
and that's what's missing in text-to-SQL.
Then there's text-to-semantic-layer.
We kind of led the charge on this,
and for a long time we were saying,
like, hey, this is how you've got to do data.
And I think we've realized that also doesn't work.
Interesting.
It's too restrictive.
And it ends up being
where you end up just pulling
mostly facts that already exist
on your dashboards,
and instead of finding a dashboard
and clicking on something,
you're just pulling that.
It's not flexible enough to give you
the power you need to actually answer
questions. So the pro, of course, is
trust. Yeah, it's governed, right. But it doesn't
have the... Which, if you had to pick,
then you would pick —
the text-to-semantic-layer. But the unstructured data
part — like, how do you put a semantic layer on
unstructured data, for example? So I think this is actually
the exact distinction that I would want to get into. This is what we're building on the
structured data side. The way I think about the problem is
that you have to interpret what you did
to the user. So to take one step back before we hop into that,
the unstructured question. Why does it work with unstructured data? Why does deep research work?
Because humans just naturally understand the — okay, you said you got this thing from here,
there's a link where I got it from, you can click on the link.
It's literally a basic thing; we've been doing that for a really long time.
So you can go and check the sources, and any human can do that. If you see
something that looks questionable, you can say, I don't know, check the source — maybe that's not
quite a fair summary of that. And you can dig in really easily. That's one of the reasons that
deep research is so successful. If the research just came in and said, hey, I came up with this,
and you had no way to dig in or question any of the things it came up with, it just wouldn't
work as well. Yeah, definitely. And that's exactly how I think about the problem on the data side.
You need to give the model the most flexibility. You need to free it to write SQL — like, let it write
SQL. Language models, like we were just saying, are headed toward becoming perfect at coding.
I think that's inevitable. And you need to let them write SQL,
let them do what they're good at. But then the hard job of the application is: how do you
take that SQL that the model generated and have a truly trustable way for the business person
to be able to look at that and know what it did with absolute certainty, the same way they can
have certainty if they click on a citation link in a text document? So does the semantic layer
become a QA layer? Is that part of quality control? I think you have to think about the semantic
layer almost in this inverse way, where it's like: the whole time up until now,
we've been taking the semantic layer and saying,
hey, we've got all these building blocks that humans understand,
and the semantic layer's job is to compile that down
into a bunch of SQL that we run on the warehouse.
I think the real job of it in this new age is the inverse.
You have a bunch of SQL that the language model wrote,
given the context that's in your semantic layer —
the semantic layer still owns the context,
it's structured kind of similarly in terms of its input.
But the language model has written the SQL,
raw SQL, based on the input here.
The semantic layer's job now
is: how do you take the SQL the language model has written
and effectively invert the problem?
How can you take that and map it back into business concepts
that a normal human could understand
without having to read the SQL?
Yeah, it's kind of a translation problem.
Yeah, exactly.
Which LLMs are good at.
LLMs are good at it.
And there's a lot of code that humans have to write to do this.
Because, again, you can't forget permissions,
you can't forget governance, you can't forget all the complexity
in actually being able to verify that you did stuff
the way you think you did.
So that's a lot of what we're working on and what we're replacing.
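A rough illustration of the "inverted semantic layer" idea Paul is describing: instead of compiling business concepts down to SQL, you take model-written SQL and map it back to governed business concepts, flagging anything that isn't governed. The expressions, names, and naive substring matching below are illustrative assumptions only, not Zenlytic's implementation.

```python
# Hypothetical sketch: map model-generated SQL back to governed business
# concepts from a semantic layer, and flag anything that isn't governed.
SEMANTIC_LAYER = {
    "SUM(order_items.revenue)": "Total Revenue (verified)",
    "SUM(order_items.revenue) - SUM(order_items.cost)": "Gross Margin (verified)",
    "orders.channel": "Sales Channel (verified)",
}

def explain_sql(generated_sql: str) -> list[str]:
    """Describe a model-written query in business terms a non-analyst can check."""
    explanation = [
        f"Uses {business_name}: {expression}"
        for expression, business_name in SEMANTIC_LAYER.items()
        if expression in generated_sql
    ]
    if not explanation:
        explanation.append("No governed concepts matched; treat this result as unverified.")
    return explanation

model_sql = (
    "SELECT orders.channel, SUM(order_items.revenue) AS revenue "
    "FROM orders JOIN order_items ON orders.id = order_items.order_id "
    "GROUP BY orders.channel"
)
for line in explain_sql(model_sql):
    print(line)
```

A real version would parse the SQL rather than match substrings, and would also carry the permissions and governance checks mentioned above.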
I mean, speaking of SQL, then we're going to go down the flexibility route.
I think they're both actually really important.
The reason for SQL is that SQL is the lingua franca of data, because it's what runs
natively in the warehouse.
Yeah, right.
Like, if everyone had the money to just spin up massive Spark clusters to sit on top of the,
you know, Iceberg — as everyone's moving to Iceberg.
Yeah, right.
And great — yeah, Python would be fantastic.
But it's like, as it is now, you've got to bring the compute
to the data.
Yep.
And that's what...
It's just most effective.
It's just most effective.
And then you do the aggregation — you take this truly massive amount of data, aggregate it down
to something reasonable, and then you have Python work on it after that.
And that's exactly how we're architected.
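A small sketch of the "aggregate in the warehouse, then let Python take over" pattern being described. The SQLite connection is a stand-in for a real warehouse, and the table and column names are made up for illustration.

```python
import sqlite3
import pandas as pd

# In-memory stand-in for a real warehouse connection, seeded with toy rows
# so the sketch runs end to end.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_items (order_date TEXT, revenue REAL, cost REAL);
    INSERT INTO order_items VALUES
        ('2025-01-01', 120.0, 70.0),
        ('2025-01-01',  80.0, 30.0),
        ('2025-01-02', 200.0, 90.0);
""")

# Step 1: push the heavy aggregation down to the warehouse via SQL,
# so only a small, already-aggregated result set comes back.
daily_margin = pd.read_sql(
    """
    SELECT order_date,
           SUM(revenue) - SUM(cost) AS gross_margin
    FROM order_items
    GROUP BY order_date
    ORDER BY order_date
    """,
    conn,
)

# Step 2: do the flexible, exploratory work in Python on the small result.
daily_margin["rolling_2d"] = daily_margin["gross_margin"].rolling(2).mean()
print(daily_margin)
```

The point of the design is simply that the warehouse is where the data already lives, so the big scan happens there, and Python only ever sees the reduced result.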
All right.
Well, I've got to ask you about roadmap stuff.
Like, what is something kind of midterm — you don't have to give us details — that you're
excited about,
maybe directionally?
One of the biggest things — I'll give you two.
One's, like, an improvement on our current experience.
And that goes really in line with all the, like, flexibility stuff we've been talking
about.
Part of our problem, and every semantic-layer kind of BI player's problem,
is the setup process.
So, like, if you buy Looker, if you buy us,
if you buy Holistics, if you buy, you know, whoever —
you're going to have to do a lot of work setting it up. This plays again into how we think
about it: the model should be able to generate SQL,
and we should be able to translate that
into intelligible concepts for you.
That also means the model needs to be able
to build your semantic layer for you.
Sure.
Because as it's translating that into intelligible concepts,
if you're an admin, you should be able to say,
hey, that's now governed.
That's verified.
Whenever you show that to the user,
give them the thumbs up.
Like, that is good.
We know that's good.
And so now we can actually help you build
the semantic model as you go,
as opposed to you having to build it all up front
to make it work.
So that's a huge difference in terms of...
That's so good.
Like democratization, you know,
it's overused at this point.
But that's like a reality when you can do stuff like that.
Yeah, because otherwise you...
And this is what we saw —
and I'm sure Looker and everybody else saw this too —
it's like,
otherwise the semantic model becomes something
that the data person is building,
trying to guess at what the user actually wants to do.
And instead, it should be where the user is able to ask the questions they want,
get answers,
and they know which ones are sort of rubber-stamped
by the data team and which ones aren't.
And they can either ask the data team about those or take them with a grain of salt.
I was going to say that.
Do you imagine there's, like, a gold-standard semantic model and maybe some personalization —
personalized semantic versions too?
Yeah.
So I think the way we think about it is actually with a feature we're calling
dynamic fields, where it's like you have the governed, you know, semantic measures.
Those, the model uses all the time whenever you ask about those concepts.
But if you say, hey, what does gross margin look like if we take out discounts and we take
out refunds and we add in this other, like, adjusting factor — it can just do that for you.
Yeah. And it'll say, like, hey, here's your adjusted gross margin that you asked about. John's gross
margin. Yeah, exactly. It's like, hey, I did it for you. It's not the verified one, and you can see the
verified one next to it. It's not the same thing as that, right? So it's like you can't really confuse
the two. But it's like you do need that flexibility, and that's one thing that I think Looker and similar
sort of BI products are going to miss with this approach — is that people don't want to come in
and just ask about the same things on their dashboards. They want to
ask about new things. They want to be able to do stuff they were not able to do before.
Previously, they could only do that via a conversation with the analyst. Right. Well, and I think that
in, at least, every BI analyst role that I've been in in the past, the, like, not-well-kept
secret is that most of what you do goes to waste. Yeah, totally. It's like, people either don't
look at it, they look at it once, or they forget that they asked, you know?
And you're doing it a week later, and they're like, oh, I forgot about that.
So, like, just the inefficiency there is mind-boggling.
And I think anybody that's been an analyst for longer than a couple of weeks knows this.
Oh, yeah.
And then to paint that picture again, a little bit, on, like, where we're going —
and also on that piece, too, with the inefficiencies.
It's like, I view us as, again, like, a McKinsey consultant and coworker.
And it's, like, the way that works now is that we have the business context
via a semantic model, this analyst is able to answer questions for you, interpret what those
things mean, be really flexible, you know, help you build assets that matter, like these
intelligent workflows, these dashboards. And then as we get more advanced, and as the AI systems
get more advanced, that increasingly becomes something where you're able to give it
the deeper questions in your business, the harder and harder things. Like Sam Altman said
in his keynote, he expects that next year people won't just be getting
their sort of, like, mundane questions answered — they'll give it the harder questions,
give it the really difficult stuff. Right, let me see that. And then the other thing is,
you want it to be proactive. You don't want to have to go and ask it everything. You want it
to be able to be monitoring your inventory, and it's like, let me know when I should be buying
stuff. I don't want to ask you when I should be buying stuff. I want you to tell me when I
should be buying stuff. So yeah, that's a lot of where we're going.
I'm curious on the proactive stuff, because I've had that come up a number of times. How far out do you think
we are from really meaningful, useful — like, hey — because, I mean, that's a whole other domain, too,
that we haven't even touched on: monitoring and alerting.
Like, you know, I've got a little bit of a DevOps background, and monitoring is, like, a huge component of that.
And there are some interesting, like, solves around that, like companies like Datadog.
How do you think that comes to the BI space?
Because the BI space has been bad — honestly, not very good — at monitoring and, you know, anomaly detection and stuff.
I think it's also just that the anomaly they are explicitly trying to detect
is not the anomaly people actually care about.
So it's like, you don't actually care
that X, Y, Z metric
is, like, two standard deviations off.
No, yeah.
Very rarely do you actually care about that.
Like, what you care about is some process
that's affected by it.
And the ideal interface there is that you can say,
like, hey, I'm worried, like,
if any of our product pages go down,
it would result in a drop in, you know,
conversion rate for that process.
Like, apply general sort of human heuristics.
And it's like, if it drops to zero
because there are no visits,
then it's obviously no problem.
And it's like, there's a lot of different heuristics
that humans have, to be able to say,
well, is this really a problem or not?
And if you just alert the person every time
the thing drops,
or deviates from its last value,
you're going to lose all of that.
Yeah.
And there's so many false positives that people ignore it,
and then it's useless.
Totally.
And a lot of how I think about it,
as the agents become more ambient,
is that the next step in that direction
is that you say,
make sure nothing bad is going on with conversion rate.
And it can actually
check: hey, well, we've got all these conversion rate drops.
These ones — the, you know, number of sessions also dropped to zero.
So for those, that's not really a problem.
And it's able to just alert you on things that are actually a problem.
Right.
As opposed to every time some stat changes.
Right.
Yeah.
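A crude sketch of the kind of "little bit of intelligence" Paul is describing for monitoring. In practice that judgment would come from an agent with business context; even a deterministic stand-in like the one below shows the idea — suppress drops that are just low-traffic noise and only alert on ones a human would care about. All thresholds and field names here are invented.

```python
# Hypothetical sketch of reasoning-flavored alerting: suppress conversion-rate
# "anomalies" that are just low-traffic noise and only surface real drops.
from dataclasses import dataclass

@dataclass
class PageStats:
    page: str
    sessions: int
    conversion_rate: float   # current value
    baseline_rate: float     # typical value for this page

def needs_alert(stats: PageStats, min_sessions: int = 50) -> bool:
    """Return True only for drops that look like real problems."""
    # Too few sessions: the rate is noise, not a conversion problem.
    if stats.sessions < min_sessions:
        return False
    # Otherwise alert on a large relative drop from the baseline.
    return stats.conversion_rate < 0.5 * stats.baseline_rate

pages = [
    PageStats("/pricing", sessions=3, conversion_rate=0.0, baseline_rate=0.04),
    PageStats("/checkout", sessions=900, conversion_rate=0.01, baseline_rate=0.05),
]
for p in pages:
    if needs_alert(p):
        print(f"Alert: {p.page} conversion rate dropped to {p.conversion_rate:.1%}")
```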
And you're starting from the end and saying — like, talking about conversion rate or revenue
or whatever — well, if it impacts that, that's what I care about.
You're not several layers down of, like, well, monitor each page and check the traffic and so on.
You know, you're going to, like you said, miss something, or something
is going to get alerted on where you're like, well, no, I didn't mean that.
Because you'd have to be so precise if you're going to monitor it, like, kind of the old way.
Totally.
And that's why a lot of the monitoring code is just kind of brittle — like SQL or add-on code
that gets written that doesn't catch some edge case.
And it's like, really what you want is reasoning.
You want intelligence applied to the problem.
It's not even a lot of intelligence.
It's just, like, a little bit of intelligence.
Previously, that was impossible.
Right.
So it's like, I think that's sort of
the next step in terms of how things get kind of more ambient and more just
right for you. And then at some point you onboard the system, and the system is just
crawling around in the background, and you decide how much compute you want to allocate.
Yeah, it's just sort of figuring it out for me.
So yeah. And you can just slide it: hey, I want you to
just go and find all the stuff — here's what I'm willing to spend on you. Or you're like,
just check a little bit — like, yeah, right — just check the big stuff.
Yeah. I mean, how far away do you think we are from something like that? I mean, that's like — where...
That's on our roadmap for the end of 2026.
Okay.
So I think it's closer than you think.
I think it's because the more flexibility you give the models
and the better job you do of brokering the business context,
they can just do this increasingly magical stuff.
Yeah.
And, like, it's not like we or anyone else or Cursor builds, like, the —
you know, you don't have to build the model.
Right.
It's like your job is to orchestrate all the stuff around it.
And the assumption that, you know, us and Cursor and a lot of people
who are taking these bigger bets around AI are making is that the models are just going to keep getting better.
Right.
Which means like as long as you give them the right flexibility and the right guardrails,
then you're set up for them to just get dramatically better
and be able to take on these bigger tasks just naturally,
as long as you give them the right tools.
And the right way to think about the tools is what tools do the humans have that are really good.
Sure.
Yeah.
So it's like, yeah, the humans do have a BI system, which is — people have a semantic layer that you can click around in and pick things from the semantic layer.
The humans also have a SQL editor.
Right.
Yeah, yeah.
And both of those, due to the human's understanding and communication skills, can become interpretable to the receiver.
Yeah.
And so it's like, okay, we're going to take those things and give them to the AI as well and not limit it to, like, one or the other.
Yeah, exactly.
And I think the thing that you're going to see missed from all the kind of BI
incumbents is that
they're too fixated on
BI. They don't think about themselves
as a human, as an employee
that you buy, and then think about what tools
we need to give the employee. Like, well, we should
let the AI agent move things around in the BI tool.
That's great. That can answer a subset
of problems, but that's not actually going to replace
the kind of work. It's marginally better.
It's marginally better. Yeah.
Interesting.
Well, I think we're at time, but this has been a blast.
John, thanks for coming out there.
In person, round two. I think
with our last one in Denver here, but yeah, thanks for doing the show.
All right. Thanks for having me.
The Data Stack Show is brought to you by RudderStack. Learn more at rudderstack.com.
