The Data Stack Show - 165: SQL Queries, Data Modeling, and Data Visualization with Colin Zima of Omni
Episode Date: November 22, 2023

Highlights from this week’s conversation include:

Colin's Background and Starting Omni (1:48)
Defining “good” at Google search early in his career (4:42)
Looker's Unique Approach to Analytics (9:48)
The paradigm shift in analytics (10:52)
The architecture of Looker and its influence (12:04)
Combatting the challenge of unbundling in the data stack (14:26)
The evolution of analytics engineering (21:50)
Enhancing user flexibility in Omni (23:44)
The evolution of BI tools (32:53)
What does the future look like for BI tools? (35:14)
The role of Python and notebooks in BI (39:48)
The product experience of Omni and its vision (45:27)
Expectations for the future of Omni (47:52)
The relationship between algorithms and business logic (50:51)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Transcript
Discussion (0)
Welcome to the Data Stack Show.
Each week we explore the world of data by talking to the people shaping its future.
You'll learn about new data technology and trends and how data teams and processes are run at top companies.
The Data Stack Show is brought to you by RudderStack, the CDP for developers.
You can learn more at rudderstack.com.
Welcome back to the Data Stack Show.
Costas, I'm so excited because we get to talk with Colin. He's at Omni, but he, I mean, literally
helped build Looker over almost a decade, seven or eight years. And Looker has had a huge impact in multiple ways
on the data industry, from analytics to architecture to modeling. It has spawned
entirely new categories of companies. And I am so interested to hear about what Colin learned at Looker that he is building his
new company on, right?
Because, I mean, Looker is still a great tool, right?
So the people who built Looker, like, what are they trying to build?
I think that's what I want to figure out.
How about you?
Okay, I'm very interested to see how someone who is involved in such a
successful product and company starts another company with another product in the same industry.
So I want to learn about that. Like what's why and how. So I don't know. I think it's going to
be super, super interesting to have this conversation with him today. Yeah, I agree. Well, let's dig in.
Let's do it.
Colin, welcome to the Data Stack Show.
Thanks for having me.
Okay, you have a fascinating background.
So give us the story and, you know, especially how you ended up starting Omni.
Yeah, sure.
So right out of school, I created synthetic CDOs. So think of them as credit instruments that no one needs anymore. Then I went to Google as a statistician, actually doing ranking for Google.
And we help evaluate search ranking results. So we were sort of like the judge team for
how search was working. Started a company actually with one of my co-founders at Omni.
Ended up selling that to a company called Hotel Tonight. Hotel Tonight was actually Looker's
fourth customer. So I led the data team at Hotel Tonight
following that acquisition,
got very close with the Looker team,
eventually said, hey, I love the product.
I want to come work on it.
And I joined as around the 40th employee, originally leading customer success and support alongside analytics,
eventually took over the product team,
kind of moved in and out
of those roles, was there for eight years through the Google acquisition. Frankly, like got a little
bit tired, just as we scaled up, sort of the culture was changing and wanted to fire it up
again. So that's how we started up Omni. Very cool. So many questions about the background, but really quickly, can you just give us a quick explanation of what Omni is? Yeah. So it's going to be very familiar to people that
are familiar with Looker, but the core of Omni is that we balance the analytical process. So
we give you all of the power of a data model to write queries and self-serve for end users.
And then we also give you all of the freedom and openness of something like writing SQL
or extract-based analytics.
And the idea is that users can actually mix and match between those two versions of the
world.
They can move very quickly in sort of SQL and freeform land.
And over time, we help them build a data model so that other users can self-serve.
Very cool. Okay. Well, I have to ask, of course, as a marketer, I have to ask a little bit about
being a statistician at Google. Sure. Because I think we were chatting before the show that was,
you know, sort of, you know, what time period was that when you were doing that?
2007 to 2011. Okay, wow. So yeah,
so like, man, that's like, when search advertising was, you know, going through a crazy hockey stick.
What types of projects did you work on? Like, what types of things were the engineers building? Because it sounds like you supported the engineers building the algorithm.
Like, what types of problems were you trying to solve? Or what types of things are you trying to understand? I mean, it's going to sound kind
of funny, but we were just trying to define mostly what good is for search. And I know
that sounds sort of obvious, but kind of similar to like a lot of analytics actually in sort of
businesses, Google is using a mix between live ranking signals. So think things that people click on. And then they're
also using objective or subjective, I guess, evaluation of ranking results. So it's not just
a black box that looks at clicks and promotes things up to the top of the result set. And
similarly, it's not just survey that says kind of here's one algorithm, here's another algorithm,
which one's better. It's a mix of those
things. And the whole job was sort of creating the process and the framework for doing that sort of
evaluation. So an example that I love to give, because there's almost a mix of like philosophy
and statistics here, is one of these queries that would always come up as Harry Porter. So Harry and then P-O-R-T-E-R. And it actually
gets into the philosophy of what people are searching for when they look for ranking results.
Because Google evolved this idea of spell correcting aggressively over time. And I think
thinking through how frequently does a user need to be looking for Harry Potter to intersperse or return Harry Potter results versus exclusively providing Harry Porter results? And how do you create
frameworks for doing those sorts of things across sort of the whole surface area of search?
Yeah.
So it's this process of trying to use statistics to create frameworks,
but also sort of tying that to the logic of what people are trying to do when they're searching. Yeah, super interesting. And what was the output
of your work, you know, sort of like the specific work product? Was that an input to the algorithm?
Like, what did that exchange look like when you shipped a project or a product?
Yeah, so I mean, the simplest way to explain it is that engineers are constantly coming up with refinements to the algorithm. So they're saying,
I have a new sort of layer in the way that we return search results, and it changes search
results in a certain way. So it might affect 1% of all queries sampled in some sort of weighted
basis or something like that. And these are the results that my algorithm would give. And these
are the current results. Like, how do we create a framework for deciding whether that
change is actually positive for users? That was our team's job: create that framework,
try to explain it to leadership, and then leadership essentially made decisions.
And sometimes it's cut and dry. It's just like, we're finding the exact thing that the person
is searching for 10 times more. Very frequently, it's a lot more subtle than that. Certain results get better,
certain ones get worse. They change in ways that are not obvious. So we're also creating
a framework for how to evaluate those things. That was the whole process for what we did.
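As a purely hypothetical sketch of what such a framework can boil down to (this is not Google's actual system; the tables and columns here are invented for illustration), you can picture a weighted sample of affected queries, human judgments of each variant's results, and a traffic-weighted comparison:

```sql
-- Invented tables: sampled_queries (query_id, traffic_weight) and
-- rater_judgments (query_id, variant, quality_score).
SELECT
  j.variant,                                    -- 'control' or 'experiment'
  SUM(q.traffic_weight * j.quality_score)
    / SUM(q.traffic_weight) AS weighted_quality -- traffic-weighted average judgment
FROM sampled_queries AS q
JOIN rater_judgments AS j
  ON j.query_id = q.query_id
GROUP BY j.variant;
```

The real decision, as Colin describes, was rarely this cut and dry; the point is only that weighting and aggregation sit underneath a fundamentally subjective judgment.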
Yeah. Super interesting. Okay. One more question. How did you integrate newer,
but sort of like peak trending search topics and then optimize around those? And like,
how did you weight the work against like, this is like very timely and very important,
like world of finance or whatever? Yeah. Yeah. So I mean, there are, the simplest way to explain
is there's lots of different modules in the search algorithm, and freshness is sort of a whole area of search and one of those modules.
We had to create guidelines for sort of how timely results get expired over time and how much boost they get when they are occurring.
So it's sort of all part of the framework.
It's part of what is so subjective about a good search result.
If you search for a politician today, how different should the
results be from yesterday based on news? How newsworthy an event is is very subjective, but it was sort of trying to come up with how you describe these things so they could be evaluated, and then obviously using a mixture of click signals. So sometimes clicks can give you answers to these sorts of things in terms of what people are searching for.
There are also problems with clicks, like clickbait is a thing. So you have to sort of
adjust for things like that as well in terms of how ranking works. So that's why it couldn't
just be click signals. Yeah. Well, as a marketer, I can say that you did quite a good job over time
of really limiting the ability
to game the SEO type of things.
Yeah.
I mean, it was an impossible
ongoing battle
and I was a micro piece of it.
Yeah.
Okay, well, let's jump.
So many questions about Omni,
but I want to jump from there
to Looker
because it seems like there's a really clear connection between, you know,
freshness was a word you mentioned, tried to define what good is you mentioned
analytics and I mean, to me it's like, well, a lot of those things got baked
into Looker, you know, because that's, I mean, as a Looker user, I've experienced a lot of those things.
Is that true? I think certainly Looker had a unique take on the analytics world. I thought
it was really interesting when I started using Looker and eventually joined. I remember the
founding team, so Lloyd and Ben, took a lot of pride in not looking at the analytics landscape
very much. That was good in some ways and bad in others. But it meant that Looker did have a fairly unique perspective on how to approach
BI in a way that was honestly scary for people. We got a lot of criticism from folks like Gartner
for the whole life of the company that operating exclusively in database was crazy. I remember
we would get back a Gartner survey and there'd be a hundred questions on your in-memory engine. And we'd
just have to write NA for a sixth of the survey. Do those analysts still work at Gartner?
They do. I mean, they're still slowly coming around to the concept of in-database analytics
as like an exclusive way of doing things. And it's kind of funny because now Omni is building in-memory layers above the database.
So it's like what's old is new constantly.
Totally.
I love that.
But I mean, we were trying to do things a very different way.
And that it was this really strict compilation of SQL down into the database, heavily governed.
And like it was a backlash to things like Tableau where extract
was the focus. So I mean, in many ways, we had to sort of teach people the way that is normal
to think about analytics today, which is like centralized data in a data warehouse,
put something that looks like a data model on top of it and let people query freely in the database. Those were scary concepts
at the time. One of the biggest reasons that we lost deals early was because people couldn't get
their data in a database. And I think now the idea of buying Fivetran or Stitch and getting that done
might even happen before you buy a BI tool in many contexts.
So there was this sort of paradigm shift that was happening. And it was really Redshift and
then later Snowflake and BigQuery that actually opened that up. But the idea of an analytical
database that you could just really exercise heavily from the BI layer is sort of what
unlocked the whole world. Yeah, super interesting.
And in terms of the architecture, where did that come from, right?
Because Looker in many ways sort of introduced this entire new architecture.
I mean, most companies today, I would say, that are setting up a new data stack or trying
to modernize their data stack are sort of, in a way, influenced by the architecture that Looker championed. Where did that come from at Looker,
right? Because like you said, it's a scary concept. But it really changed, I think, a lot
of the ways that people think. And I think, I mean, literally launched a lot of new companies.
Yeah. The core pieces of the concepts came from software engineering, which was just this
idea of layers that sit on top of each other, where an API from below is sort of what the
layer above can work with.
This whole microservices approach to doing analytics, connecting to Git, building a code
based model.
So a lot of people don't even know that the first version of the LookML model could only be interfaced with through the command line.
So it was truly a SQL compiler to start. It was not a BI tool. I mean, it was a BI tool,
but it was a SQL compiler first. And then sort of the BI layers flowed out from there.
But the real core was just this modeling layer that could describe essentially queries in a more abstracted way. So rather than writing SQL with characters,
now we can write it with fields, filters, pivots, or structural concepts that users think about.
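To make that compilation idea concrete, here is a rough sketch, not actual LookML or Looker output, of what a model-driven request might compile down to: a user picks a couple of fields and a filter, and the layer generates the SQL. Table and column names are invented for illustration.

```sql
-- Hypothetical example: the user asks for "created month" and "total revenue",
-- filtered to completed orders. The modeling layer compiles that request into
-- SQL roughly like this (names are illustrative, not real output).
SELECT
  DATE_TRUNC('month', orders.created_at) AS orders_created_month,
  SUM(orders.amount)                     AS orders_total_revenue
FROM orders
WHERE orders.status = 'complete'
GROUP BY 1
ORDER BY 1;
```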
Yeah, that makes total sense. Okay. So Looker sort of reinvented the way that a lot of people approach architectures. What I see now, and feel free to disagree, is that Looker introduced a couple of layers, but they were integrated, right? And so like the wonderful thing about Looker is you sort of like put it on top of raw data and then you can model it and then you visualize it or whatever. And so now that's been unbundled on a number of levels, right? And so companies took
Looker. I mean, DBT is obviously the elephant in the room for LookML. So let's abstract that out.
Yep. Which is interesting. Thoughts on like, is it good that the unbundling is happening?
Yeah, I mean, I think in some ways it's natural, but I think that there's also equilibrium that
you have to deal with in terms of like how many of these tools that you want to manage over time.
So the challenge with Looker is always, or with sort of any BI tool that's bundled with a modeling
layer, so Omni included, is that you want to
create data model to build analytics. And then inevitably you want to use those things in other
places. So you push things further and further down the stack. The challenge is that the further
down the stack that you push things, so into DBT or into an ETL process or something like that,
the more rigid that transformation becomes. So the more difficult
it is to adjust and adapt to what users are doing. And I think in many ways, LookML's superpower
was this concept of development mode, where I can go into a branch, I can edit a thing,
I can immediately see which reports are impacted, and I can go push that thing out into production.
And moving it further down into the
stack, so into the ETL pipeline or into DBT or something like that, creates this discontinuity,
where now the API layer below is producing a data set, and the thing above needs to consume that
data. It can't really as easily interact with that thing. And so the trend that I've seen is more and more people doing things like producing
reporting tables, almost cubing a la sort of 2005. And the advantages of things like that are you do
get standardization. So you get tables that your BI stack can consume, your data science stack can
consume, your now reverse ETL stack can consume. And so you do get that standardization. The challenge is that rigidity is a business problem also. The number of people that can
touch that layer naturally drops over time. And I think very early, that was an advantage for dbt.
Like almost you have now a modeling layer that fewer people can touch.
So people can't screw things up.
Exactly. We can maintain it more tightly and its inaccessibility is an advantage.
But if you play that forward to a whole organization dependent on materialized modeling,
the challenge then becomes very few people can touch it. So now we're waiting on the data team
for the next column or something like that. And I think these concepts play
really heavily into the way that we're thinking about building product, which is,
obviously, we do have a modeling layer kind of a la LookML that is doing just-in-time
transformation and pushing SQL down. I think the mistake that we made with Looker was hoping that
our modeling layer could be everything for everyone so that we could do
all transformation for the company. And I do think the things that DBT has shown people and just sort
of the evolution of the data stack has shown people is that there are concepts that may start
in your BI layer that need to get pushed down and standardized for everyone. And it's even obvious
to say, but like, I remember we made a customer health score,
our business depended on it. We actually picked it up out of LookML and we put it into an ETL
process in Airflow. And the reason, because we didn't want people to touch it. So there are good
reasons for those things. I think the challenge is that you need now these divergent layers to be
able to speak with each other. So I need to be able to start and
produce a report. And I don't want to start by doing ETL and do that. I want to be able to iterate
on it quickly, publish it, validate it, and then decide whether things need to get standardized.
So our point of view is that you do need those modeling layer pieces, but we need to be more pragmatic about the things that we truly need to own and orchestrate and the things that we should push out. So like the example in
Omni is I want you to start by writing a piece of SQL, and then I want to take the pieces of
that SQL, so maybe the joins, and model them for you or fields. And then if that sort of
virtualized view becomes important, I actually want you to pull it out of Omni and I want to publish it into DBT.
And I want all of our reporting in Omni to continue to function silently.
So I don't want to care where that lives, but I want a user to be able to make that quickly and then harden it as needed.
And some things should go down into those lower layers
and some things actually should not. And I think that is sort of what the ecosystem is missing here
is there's almost this view that everything should be in the centralized metrics layer
when the reality is like that requires time and investment, and some things should have that level of care from the data team, with SLAs on data sets and sort of alerting and things like that. And some things should not; it's not worth
that effort. And so we're trying to sort of inject some pragmatism into the modeling experience for
the user. Yeah. Can you talk about that in terms of learnings that you had at Looker around, you know, like a KPI,
like you said, like a customer health score. It's core to the business and it shouldn't be touched,
but so much of good analytics is actually exploratory, right?
And so when you, I think it was like when you were talking about going deeper and deeper into
the stack, like one way to look at that is that you decrease people's ability to explore because you're really constricting parameters.
What did you see at Looker and even with Omni?
Exploration actually is the way to figure out maybe what you need to harden.
But if you just start with the modeling layer, you make so many assumptions that you end
up having to go back and change it, and it's very slow.
Yeah, I think that's exactly right.
I mean, the way I would sort of summarize it is I think there's a lot of things that
are bottoms up and a lot of things that are tops down.
And what that means is that there are certain data sets that you can publish out where making
them easy to work with is the superpower of them.
So maybe it's like a revenue time series or something like that,
where you're not going deep and accessibility is the most important thing. And then there are other
data sets where there's no amount of sort of preemptive manicuring that you can do to make
it effective for people. And I think event analytics is a great example of this. Like
we're building new features. Maybe we fought a little bit about tracking. Maybe we haven't.
We've got nested JSON blobs all over the place.
I'm not going to be able to, as the data analyst, predict what my product manager needs to do.
They're going to have a question and I need to give them as much as they possibly can
to go answer that question.
And then we can think about reshaping that data set.
So like, again, I think it's about data teams focusing on where the cleanup that they do
has the most leverage. So maybe it's Salesforce does need a lot of manicuring and they do need
to build published data sets there. But these long tail sets just require getting data into
people's hands and enabling them and then reacting to what they're doing with it.
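As a rough illustration of that long-tail case (invented table and field names, Postgres-style JSON operators), the ad hoc product question usually looks less like a curated revenue table and more like this:

```sql
-- Ad hoc product question against raw event data with nested JSON properties.
SELECT
  properties ->> 'feature_flag'  AS feature_flag,
  COUNT(DISTINCT user_id)        AS users
FROM raw_events
WHERE event_name = 'report_exported'
  AND occurred_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY 1
ORDER BY users DESC;
```

No amount of preemptive manicuring would have anticipated exactly this slice, which is the point being made about where data-team cleanup has leverage.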
It's actually very similar to sort of like the
MVP process of building a company, which is like, we can overbuild our product before people are
using it and we're not going to learn from it. And if we put out younger things that are less
complete, we can see how they're getting consumed and then react to it. But if you put up walls in
front of people, you're going to disengage them. So I think trying to sort of dumb it down for the user can be advantageous, but not
universally.
Yep.
Super interesting.
How do you think about your user at Omni based on what you learned at Looker?
Because one term that's cropped up in the last several years is sort of analytics engineer.
And in many ways,
Looker basically created that because it's like, you know, you sort of have someone who does a bunch of data modeling, but they're not an actual analyst, and they actually become an
analyst and then like, yep. And so Looker enabled this like really interesting environment where
it sort of gave superpowers on both ends. So how do you think about that?
No, it's true. It's like giving SQL people superpowers.
I think that we talked about this a little bit in the pre-show, but I think in a lot
of ways at Looker, we needed to really simplify our message because we were teaching people
a new way of doing things.
So this idea of a centralized modeling layer that's highly governed and highly controlled
was very appealing. And so templating SQL was a piece
of that, but the core message of Looker was governing data. And I think the flip side of
that was that it's very hard to compromise your most core message. And the core message was like,
everything is governed and it's really tight. And what that meant was that when people needed
to do pragmatic things, so they needed to
transpose before we had transpose, or they needed to write a piece of SQL that they didn't
want to model, they were picking it up and injecting it into the data model.
And it wasn't governed.
It was just a raw piece of SQL that was getting dumped in a practical way.
And so I think one of my big takeaways was we, in some ways, weren't allowed to be pragmatic
about our user.
We weren't allowed to give them more SQL things because we had to simplify.
Yeah.
And that's a lot of sort of the opportunity of doing this again is now rather than teaching
people the looker way, we can build on people that understand that and DBT and Fivetran.
And now we can say like, great,
you want modeled things. I can give you modeled things. But if you want to poke through the model
and write some SQL, I want to let you do that too. And then you can decide to model it later.
And so it's sort of like we can give people nicer things because we don't need to protect
them from themselves. And that was a lot of the balances that we felt like we had to
be very unopinionated about the product. And if the developer of a model was not good, that was
not our responsibility. And I think now we're taking a more opinionated point of view and being
a little bit more aggressive. So a simple example is we don't operate exclusively in database.
If you write a query to a table and you write select star to that table,
we'll actually pull the whole thing back, put it in the browser,
and let you re-query that data set.
Because it's sort of more pragmatic and faster and better.
And so we sort of get to take some of these foundational concepts
and go two steps
further in terms of what we're allowed to do with them. That's what's been so fun about doing this
again is we know how to build the core foundational pieces. And now we get to build those things just
outside the customer promise that are sort of so exciting. I always used to joke that I wrote more SQL,
like raw SQL than any other Looker user ever.
And now I just get to write raw SQL
alongside a data model.
And it's like, it's what I've always wanted.
So it's, I get to build for me a little bit.
Very cool.
Okay, one more question for me
before I hand it over to Costas.
Can we talk about, so, you know, we talked a little bit about unbundling. Can we talk about the relationship between the visualization layer
and the modeling layer? As a Looker user, I think that was one thing that was really nice was that,
you know, you sort of have the ability to like drill and then it's like okay well if you want to like look under the hood you can
look under the hood which is really nice and so you know but with the unbundled model you don't
really get to do that right like yep that becomes a ticket for someone who's you know doing the dbt
model or whatever and so yep the user like i love that about looker is that something you're trying
to retain in omni or like how do. Is that something you're trying to retain
in Omni? Yeah. And actually, we're trying to even sort of push it a step further. So we sort of
talked about this in terms of there's a level of prep on a data set that you can make for a user,
and the more prepared it is, the less flexibility the user gets. And so the simplest version is you
make a reporting table, and that's all the user can touch. And the Looker version was we give you sort of this modeled schema and you can touch anything
inside the model.
And we're almost going a step further, which is you can touch the model and do anything.
And if you want to even poke through the model and write SQL, we'll let you go that step
further.
But the key here is always sort of this interplay between trying to structure things more over
time.
So if you do let someone write SQL, what we're trying to do is sort of pull out those sort
of granular concepts that can make the next question simpler.
So a really obvious example is that if you join two tables together, we know that we
can make a virtualized view over that table.
And so I don't need to write that join the next time. And kind of the more of those pieces that we can help you build fluidly.
So I don't need to drop into a model and publish the model out right now. It's just, I can write
a join and now it looks like it's modeled. And then we can kind of structure that model more
and more over time. That fluidity, I think is really the superpower of what sort of the modeling layer
helps with the compilation. But it's not just having a model there that does it.
It's the model and the ability to adjust the model based on what the user needs.
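A minimal sketch of that join-capture idea, with invented table names (and leaving aside whether the resulting view lives in the tool or in the warehouse): the user writes the join once, and the tool can then treat it as a reusable view.

```sql
-- The one-off question a user writes by hand the first time.
SELECT u.id AS user_id, u.plan, e.event_name, e.occurred_at
FROM users AS u
JOIN events AS e ON e.user_id = u.id;

-- The reusable piece the tool could pull out of that query, so the next
-- question starts from user_events instead of rewriting the join.
CREATE VIEW user_events AS
SELECT u.id AS user_id, u.plan, e.event_name, e.occurred_at
FROM users AS u
JOIN events AS e ON e.user_id = u.id;
```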
So a great example that's sort of constant is that you have some sort of internal product and
you want to filter out your internal users. The version of it where you're doing this in the ETL cycle is like, go back in the ETL cycle,
rewrite the queries, filter the rows. Looker's true innovation was that on a query result, you can drop into the model, put a where clause on everything that's hitting a table, and then boom, it's gone. And that coupling is the real power of the model: it takes that events view
and it really makes it events view where user is not internal. And it makes it super accessible.
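A bare-bones sketch of that pattern (plain SQL rather than any particular modeling syntax, with invented names): the filter lives in one governed definition that every query against events is compiled through.

```sql
-- One governed definition: "events, excluding internal users". Every query
-- that hits events goes through this instead of the raw table.
CREATE VIEW events_external AS
SELECT e.*
FROM events AS e
JOIN users AS u ON u.id = e.user_id
WHERE u.is_internal = FALSE;
```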
And I think then what we're trying to sort of build on is sort of how do we then refine that
work that user did and make it as fast as possible?
And then if that where needs to push all the way down into dbt, can we make that really simple as
well? So it's make it really fast for the user to answer the question and then make it really
robust or as robust as the company wants for controlling the logic later. Yeah, that's super
interesting. It's, it sounds as if like, you know, in many ways,
and I know this may be an oversimplification,
but it's almost like reclaiming the value
of the thousands of ad hoc, you know,
activities that people are performing
on a weekly or daily basis
because there's so much value
in what they're trying to do,
but largely it goes wasted.
You know, that's exactly it. No. And I'd say like, even it's trying to also knock down that
sort of decision node of like, do I make this scalable now? Or do I answer the question?
Like, I want you to answer the question and I want you to make it scalable later.
And sort of like the original
version of the world is like, do I go pull up mode and write it in SQL? Or do I go take the time to
like think about a data model and model it? And I want you to just answer the question. And then I
want to pull out like, hey, we found three things that are modelable. Do you want these things?
Like, boom, let's model them. It still can stay orphaned. Like it's okay to have one-off SQL
and it's pragmatic and users are doing it and we can't
argue with them.
So it's like, that is what I, that is the subtlety here is let's let users put themselves
in trouble a little bit, but let's try to help them like make it scalable and make it
better.
Yep.
Love it.
So fascinating.
Costas, please.
Thank you, Eric. So Colin, I have, okay, let's start with like a question about
the past and how it relates to today, right? Like, yeah, Looker was, I think the company was
founded in 2012 or something like that. That sounds about right.
Yeah, like 10 years, right? And 10 years after, we have Omni.
So what has remained the same and what has changed between 2012 and 2022 when it comes to the problems that Omni today is addressing?
I mean, I think a lot of it.
So at first, I think there's a really big difference between 2012 and 2015.
So I think in some ways, Looker got a little lucky.
Great example was our first Hotel Tonight instance was actually on top of our production
MySQL.
I took down the app a couple times querying in Looker.
Not recommended.
Set up your production replicas.
But after Redshift and Snowflake, columnar databases on the web and essentially this sort of mixing
between the data lake and the data warehouse and just the ability to query lots of data
has become just obvious and normal to people.
You don't need to walk in and say, do you know what Snowflake is?
Are you ready to start doing that?
That thing has just become completely normal. I think the idea of compiling SQL and sort of SQL familiarity and that being a core component
of your data stack has just become normal.
And similarly, like this idea of just all of your data from 10 different sources showing
up in your data warehouse or your data lake or whatever it is has become normal.
So I'd say early in Looker's life, we were teaching people these concepts and sort of learning about them.
Early in Omni's life, we can assume that you have Fivetran set up, you started your Snowflake,
you've got DBT implemented, and you have two people that know how to write SQL, great.
And you're ready to start going. And so like we get to start with a lot of those concepts
existent in the user base.
I think the demands of end users have not changed at all.
Like people, the dashboard is not dead.
Like most people are looking for dashboards.
They're looking for interactive analytics.
They're looking for some version of self-service
so that a marketer can go look at their channels
over the last three
weeks and see the evolution of lead generation across them. I think all those things have become
just normal and standard for people. I think one big obvious thing that's sitting out there is
there hasn't been a generational BI company since Looker. So Tableau and Qlik and to some extent Power BI were the wave before Looker. And Looker, like, I mean, I obviously have a very Looker-centric point of view, but I think Looker grew up in isolation, like it was the isolated winner of its generation, and there's a couple other tools out there that are sort of similar generations. I think the current generation has not quite been figured out yet. And so there's some white space. But at the same time, Tableau and Looker and Power
BI have become extremely commonplace in people's data stacks. So I do think people are somewhat
comfortable with the stack that they have now, which is a little bit different because we got
to be very different
when we were Looker.
We were pitching something very new.
Yeah, yeah, 100%.
Okay, you mentioned something very interesting.
I would like to spend like a few minutes on it
and hear your thoughts on that.
So going back to the Looker cohort of BI tools, there were a couple of them.
We had Chart.io, we had Periscope Data, Mode, which is still around.
It's still out there.
What's the other one that got merged with Periscope Data?
Sisense.
Sisense, yeah.
So we have all these companies growing,
and at some point we get the acquisition from Google to acquire Looker.
And it almost felt like a cycle was closed in the market.
Companies got merged, IPOs were canceled.
Sisense was talking about IPOing for a while.
It hadn't happened yet.
And then we also had events like Tableau, for example, from being public, getting acquired.
And we still have, by the way, as you said, Microsoft with Power BI, which we don't chat that much about, but it's huge.
Enormous.
Biggest by far.
Yeah, exactly.
So can you give us a little bit of like what happens with these cohorts?
And after you do that, also like how do you see the future, like the next iteration?
Yeah.
I mean, I think the tools sort of divide into a few different buckets.
So I think the thing that Looker did really well was it was very opinionated about what it did in terms of
this modeling layer. But I think we also understood to be a big successful company,
we needed to serve enterprises effectively. And so while we started really focused on the
hotel tonights of the world, the venture-backed tech companies that were young and sort of first
early adopters of product, we built a lot of the things that gigantic
companies needed to be successful with enterprise analytics for 100,000 people.
And I think that was one of the reasons that we were able to be so successful as a company is
we thought a lot about the business as we built out the product. That wasn't always best for the
product, to be clear. And there's always some tension between the business and the product that you've got to deal with. But I think that was a lot of
the reason that we were allowed to be successful was because we thought about the trajectory of
what could support the business. I remember having conversations about how to get to a
billion dollars in revenue. And when you're having those types of conversations, it makes it much
easier to think about sort of what the business
looks like five, 10 years from now and what it needs to be successful. And I think some of those
companies that you're listing were more focused, a little bit more down market, and maybe a little
less focused on sort of the sustainable economics of the business. Though again, like some are
surviving and like sort of continue to grow. And the SaaS model is great for things like that.
It's really hard for me to figure out what the next generation is actually. And obviously I'm
trying to build one of them, but I think one of the things is that when we were starting Looker,
I looked back for a lot of inspiration at MicroStrategy, Cognos, Business Objects,
like the first generation of BI. And I think in some ways I was literally looking through
MicroStrategy's docs and they've got like the little folder menus, and it looks like sort of Windows 2001 as you're using the product. And I think using some of these tools, like even Tableau to some extent, feels a little bit dated in terms of sort of the web interactions and sort of the user interaction models. And I think a lot of the opportunity is to sort of update really at the margin some of
these concepts in terms of like how we interact with the database. So a great example here is
as Looker was growing up, we had the columnar database growing with us. So Snowflake,
BigQuery, Redshift. And just the idea of using them was sort of the new concept.
We're not going to extract everything. We're going to operate in database. Not only is it going to be okay, it's going to be faster than if you were
working on an extract basis and it's real time. I think now we've reached the sort of pain point
in that sort of trajectory, which is like, my Snowflake bill is a million dollars.
Is my BI tool too recklessly consuming that layer? And I think DBT is probably the contributor to
this as well. And so this is where now we can take some of those core concepts and say, okay,
what is the borrowed concept from historical BI that we can actually layer in here? So
the example here is that we're silently putting in-memory layers into our product.
And again, no new concepts here. BI tools have done this for 30 years. But I think the concept
of operating entirely in database when
you need to be, so that you're real time and you're working with Fivetran well, but I'm able to
build a dashboard where I can download the whole data set in memory and cross filter it instantaneously.
I think that's actually what users want. So it's sort of like, how do we look at the things that
are great about columnar in database and then build on them?
Or Fivetran having all your data there, or DBT, the familiarity of SQL, those sorts of pieces.
Yeah, that makes a lot of sense.
And we've been talking a lot about SQL, but there's also like a lot of discussion about Python, right?
Yep. And the reason I'm asking that is because,
okay, I get it for dbt, for example,
to having a lot of requests around Python
because they are working a lot with ETL.
ETL traditionally was always preferred to have Python.
Yeah, but there are also a couple of,
let's say, BI products out there,
but they try to merge the two paradigms together.
And it usually happens through the notebook.
The notebook, yep.
Yeah.
So what's your take on that?
Do you think that this can be the new iteration of BI,
or at the end, notebooks is something different
and it's not BI?
I think it certainly can be BI. Like, I think all data consumption is sort of overlapping
Venn diagram circles where like similar users are doing similar things. I think that I have
found it in the past that it's a very different type of user and that data science activities tend to be done less
scalably and less to be shared and sort of more force-directed diagram style analytics than
creating a self-service environment for end users and you're making different trade-offs.
So like we are very focused on the SQL side of things for now and sort of the query consumption side. I would say that like,
as databases make querying in different ways more available to end users, I think that people will
want to use them. It's just the frequency with which I need to use looping or sort of like
higher order construction, I think tends to be more important in data engineering type activities and data science versus consumption and reporting and things like that.
So for me, I would say we're actually going in the other direction, which is like,
I want to build a functional library that looks a little bit more like Excel on top of our data.
So a Google Sheets style interface on top of queries rather than thinking
about Python, because I think that's what makes it more accessible to more users.
But sort of to your point, I think you need all of these things. And that is why I don't want to
lock the business logic into our layer. If we help you build business logic, I want you to push it into the
right place so that a Python user can go pick it up. And maybe we'll do that in sort of the infinity
of time. Like I would like to do everything eventually, but our focus is very much on like
the SQL compilation, the consumption, the reporting, the functional sort of consume layer
more than the sort of deeper data science pieces.
Yeah, that makes total sense.
Okay, I want to share with you what I considered always like the genius of Looker.
And then based on that, I want to ask you about Omni.
So what I found like amazing about Looker was how, I mean, it was like, you have a product that
in order to deliver value, you had to engage two completely different personas.
One was the business user who is consuming the data and does the reports.
And then you have the analyst or the data engineer who has to prepare this data and
all that stuff. And
Looker for me created this very distinct experience between LookML, that was for, let's say, the data engineer, and then for the business user, where you pretty much could only do what you know really well to do, which is like pivots, right?
Yep.
Like for me, that was like,
just like switching from like the developer mode,
like to the real world, let's say.
It was like amazing.
Like it was like amazingly smart what happened.
So I always considered like a very successful
and unique example of a product
that can serve like two personas
like almost equally well,
right? Which is hard. It's super hard to do. And based on that, let's talk about Omni.
Who is the user of Omni? Do you have this duality again?
Absolutely. No, I think you perfectly actually... It's amazing because we really tried to
profess that point of view strongly internally.
We actually had personas for each of those.
It was called the Fox and the Hound.
Like they were top level user types in the company and we care deeply about them.
I actually think the funny thing is, I think we stopped just short of really diverging
the product enough to serve both of those well.
So almost what we're trying to do is take one more step in both directions.
And that is, I want to give those technical users more superpowers in SQL and more fluidity
to model and do things and fork away.
And I want to make the end user experience more Excel-y and more sort of interactive,
still based on the pivot table. But like a simple example is use point and click interactions to
make functions instead of typing functions in a modal. Like how can we elevate it so that
any user can consume things? But I think to your point, even going back to sort of the success of
those previous businesses, I think that was the most important things to Looker's success was
understanding that we're selling to a data person, but we're selling to a data person whose job and
what makes them successful is making other people successful with data products. And that is exactly our focus is
we're not building a product for data people. We're building a data product for data people
for business users. And when you sort of shift the thinking a little bit, it still needs to
be outstanding for the data person. Like that is what got Looker bought. And we got plenty
of criticism from our non-technical users, but they're doing that work not to do research on an island.
They're doing that work to build a self-service environment for people.
And so that self-service environment needs to be truly great.
Ours is still getting better,
but that's exactly what we are trying to do
is build a great environment for end-user self-service
that lets data people go a little bit further too.
That's great. All right. One last question from me, because we're getting close to the end here,
and I want to give some time to Eric to ask any follow-up questions that he has.
Can you give us, tell us like a little bit about what Omni is today? Like what's the product experience?
And also share with us like the vision that you have, like what we should expect like
in a year from now, because Omni is a new company.
You've been around for a little.
So please do that.
Yeah.
So what we are doing is really balancing these two worlds between the directness of
writing SQL and the sort of governance and the richness of accessing a data model.
So what you see when you're using Omni today is all the analytics is built through a centralized
data model. But on top of that data model, users can essentially embellish in SQL and go beyond and sort of ask open-ended
questions and do open things. And then they can take the components of those analyses
and push them down and centralize them. So the idea is that I can start in a workbook,
I can do analysis with a mix of SQL and pivot table and front-end fields and UI,
and I can do that in isolation. And I can pick that isolated thing up, push it
down into a centralized data model and share it with everyone. Or I can leave it in isolation.
And so the idea is that I can really straddle these worlds between doing things in a free and
open way and sharing them directly with my neighbor. Or I can build a data environment
that everyone can self-serve from. And I can sort of
evolve that over time. So I could start in a very open, sort of sloppier analytical pattern,
and I can slowly have it look a little bit more like a mature looker instance over time.
And what's happening behind the scenes, what's powering that is sort of our model management
piece that is picking out the sort of components of your SQL,
turning them into a data model, and then we're able to pull them sort of out of SQL queries
and push them into our data model and push them out of our data model and into the database.
So you can almost think of it as just Looker meets a SQL runner and lets you sort of move
back and forth between them completely fluidly.
And what should we expect?
Like, give us something to, you know, like... Yeah, I mean, certainly, like, more and more maturity around these sorts of experiences.
Like, the magical super motion that I want to see in the future is that
an analyst starts a one-off analysis,
and they start writing SQL, and they share it with their team, and their team wants to do interactive analysis with that thing. We're able to hit a button and quote-unquote model that SQL
down, put it in a centralized modeling layer, and give people self-service with it. Now a data
science team decides that they want to work with that same data set. Again, we can pick up that business logic, go persist it in DBT through some sort of cron schedule. And all of the
metrics work silently through the Omni layer. So it's a self-service environment for your end
users. And it's sort of a technical iterative environment for your technical users. And we do
all the orchestration of the business logic between those layers. So visualization reporting obviously comes along with those things.
And then you're going to see more and more in terms of end user experience. So things like
spreadsheet style analysis, CSV upload, acceleration, a lot around those pieces,
so that you're on a dashboard and you hit a button and now your dashboard
filtering is instantaneous and it's all just sort of happening magically behind the scenes.
Okay. That's great. Eric, all yours from here.
All right. And so this is a question that kind of combines multiple
parts of your past experience.
So one really interesting thing when it comes to data,
the data space in general,
but in analytics as well,
is that you have machine learning and AI
getting a lot of...
There are a lot of headlines out there about,
you know, ML and AI and, you know, automated insights and, you know, anyone who's actually,
you know, tried to use Google Analytics and use their, you know, automated, like AI-based insights, like they know how real they are. If you're like a real business trying to scale,
that stuff is like very difficult
to do. But you also have a lot of experience, you know, and sort of, you know, from your Google experience, like feeding algorithms that are making decisions, you know, building products that create a lot of data that go into algorithms. And then in some ways, like, it sounds like Omni is trying to make intelligent decisions or at least make decisions around what options to give you based on what you're trying to do, which is really interesting.
Yep.
Not that that's AI or ML necessarily in a formal sense.
But what's the relationship?
Because in some ways, it's dangerous to introduce like an algorithm into business logic
Like, I actually think you sort of nailed it right at the end there, which is like the option. Like, I think the most underrated concept around all these things is like light human-in-the-loop on these sorts of concepts. So like, we've even actually noticed this in our product:
there's a really big difference between writing automated joins on your behalf and telling you that we think
that we found very good joins. And the difference is needing to be right a hundred percent of the time.
I think these are the sorts of concepts that tie to like self-driving and things like that,
which is the bar to being correct in a lot of contexts is
honestly, it can be very close to 100%. And so I think that you can use these tools in ways that
are extremely powerful, but that you just then want to present them to users in a way that's
more interactive. So an example that we're thinking of in the future is, I don't really
want to have people writing joins in our product.
Like you can, if you show up and you have a list of joins that you want to punch in,
I would much rather see two tables that you want to join together and say, it looks like
these two keys join and these are the three most likely other couplets of fields.
And this is why we think that like hit yes or no.
Yeah.
To me, that is like super magical. And it's like a half step back
from just doing it for you. But I think those are the types of pieces that we're trying to layer in.
And it's actually the same with our SQL parsing. If you write SQL, we parse it out and try to
write fields. We don't just immediately stick those fields in the model because we're wrong. We're wrong,
but we're comfortable being wrong. And we can accelerate your ability to make those fields.
And we can do that in a way that's very expressive. So I think we're trying to layer those
pieces in. We're just trying to do them in a way that puts the user in control as much as possible.
So very little black boxing, but black boxes
that point you at decisions to go make. Yeah. Fascinating. I love it. Yeah. I mean,
you can almost like, as those patterns become established, then those become even more
powerful, right? It's almost linting for queries or something. That's exactly it. And like they
can become automated, but like, let's make the bar really high. Yeah. Yep. Love it. Okay. We're at the buzzer here,
but before we jump off, where can people learn about, try use Omni?
Yep. Exploreomni.com. Just fill out the form there. Like we're about to probably put something
real public out there, but we're still young. So
we like to sort of handhold through the early process for now, or just shoot me an email,
Colin at exploreomni.com and I'll take you on the tour myself.
Awesome. Sounds great. Well, Colin, this has been an amazing conversation. Thank you so much
for giving us the time. Of course. Thanks. It's been fun. We hope you enjoyed this episode of the Data Stack Show. Be sure to subscribe on your favorite podcast app to get notified about
new episodes every week. We'd also love your feedback. You can email me, Eric Dodds, at eric at datastackshow.com. That's E-R-I-C at datastackshow.com. The show is brought to you
by RudderStack, the CDP for developers.
Learn how to build a CDP on your data warehouse
at rudderstack.com.