Drill to Detail - Drill to Detail Ep.98 'Why the Future of Notebooks Isn't Just About Notebooks' with Special Guest Izzy Miller
Episode Date: May 24, 2022
Mark Rittman is joined by Izzy Miller to talk about Hex, a collaborative data platform that brings everyone together to explore, analyze, and share … and why the future of notebooks isn’t just about notebooks.
Hex website
Public app gallery of example Hex projects
Dataframe SQL announcement
Hex for Analytics Engineers
Tweet thread about why the python / sql debate is boring
"The Sharing Gap"
Transcript
Hello and welcome to another episode of Drill to Detail and I'm your host Mark Rittman.
So I'm joined on the podcast today by Izzy Miller from Hex.
So welcome to the show, Izzy, and great to have you join us.
Hey, it's great to be here. Thanks, Mark.
So Izzy, maybe just start by telling the listeners who you are
and, I suppose, the role you do at Hex,
but then the route you took to get into this industry.
So how did you end up working at Hex?
Sure. So I'm Izzy. I'm the
community advocate at Hex, which is really a portmanteau, I guess, of community manager and
developer advocate, which are both roles I've done in the past elsewhere. And so it really is
kind of a split responsibility of that developer advocacy, producing technical content and doing a little bit more of the traditional marketing while we scale our marketing org.
And then also that community management, engaging with the data community, talking to our users and growing spaces for them to talk to one another.
I've been here for nine or 10 months now, I think.
It's coming up on a year,
which in Hex years is a pretty good amount of time.
I think we've quadrupled in that time
in terms of headcount.
And before that, I was at Looker,
where I was for four years,
or a little over four years.
I took a little bit of a winding route
into the data world.
I thought my whole life I was going to be a scientist or a naturalist.
I always loved plants and animals and went through phases of snakes and spiders,
but ultimately in college studied environmental studies and biology with an emphasis in botany.
And I really loved and studied and spent pretty much all day long thinking about plants. And along the way, like in a biology degree, you pick up little bits and bobs of data curiosity, I suppose, and little specks of data skills. I learned how to use RStudio and what that meant. I learned how to write a SQL query in an evolutionary
biology class where I had to get some data set down from a database. And so there were these
little places when in hindsight, now that I work in data, I think like, oh, maybe that's what the
catalyst was, or maybe that's what the seed was. But at the time, I had no idea, right? And so I graduated and I took what turned out to be a
pretty crappy job, but one I was excited about at the time, in Santa Cruz, where I still live, which
entailed wandering around in the forest, looking at trees all day long and doing tree surveys.
And if you like trees, which I did and still do, this on paper sounds like the best job in the world, right?
But it turns out that wandering around alone all day in the woods gets a little bit lonely.
And honestly, I might still be doing that job if all of my co-workers had liked the job too.
But all my co-workers hated it, which just is such a bummer.
And so I was looking for work and just exploring what's in Santa Cruz,
which isn't too much. And I stumbled upon a job posting at Looker for the Department of Customer
Love. And that was the name of the posting. And something about it just caught my eye. I had none
of the requisite skills that they wanted. I mean, the posting talked about SQL and Python and data
analytics. And I was like, Oh,
that's not me,
but something about it caught my eye.
And I just had this weird feeling that I would be good at it.
And so I wrote a cover letter and applied and shockingly they interviewed me
and it became immediately apparent that I didn't know anything that they
needed me to know.
I actually sort of hilariously flunked the technical interview with Lloyd, the founder of Looker. Lloyd is just the nicest guy in the world.
And I backed myself into a corner of getting grilled by him.
And they sent me away and said, you don't know what we need you to know.
But if you want to learn and take a couple of months and come back, we'd love to have you.
And at that point, it was the first time I'd ever been in a tech company office.
First time I'd ever been in an office, really.
I worked in the field all day long.
And being in that Looker office and seeing the kitchen table and the people doing cool work there and just that energy, it was the best motivation to spend a couple months learning SQL and Python that you could have hoped for.
And so I did that and
ended up coming back and passing the interview and worked on the Department of Customer Love for
a little under a year, I think, before the opportunity presented itself to build out
Looker's community programs. We had many users by that point, right? So there was a community. It's not really
something that you create with a magic spell. Like the people were there, but it had never
been managed intentionally. And so I spun up that community function and spent again about a year,
maybe a year and a half, maybe even two running Looker's community team. And then towards the end of my time there,
I moved towards developer advocacy
and actually moved to the embedded analytics side of things.
So integrating with the engineering team
and supporting Looker's API and SDKs
and the embedding and kind of like,
I think the tagline they're going with these days
is beyond BI.
So all of the things that let people go beyond BI with Looker. And at some point during that time,
Hex appeared on my radar. And I think at the time it was still in stealth and was just kind of a
cool looking website that had some hip little data tagline and something about it, again,
kind of like Looker, just some weird feeling caught my eye. And I was like, I think this would be really cool.
And I sent an email to Barry, the CEO saying, Hey, I don't think it's the right time right now. I
just got a new job at Google that I'm really excited about. But something about this seems
really interesting. Do you want to be friends, basically? And to my great surprise, he replied, and we kept in touch for about a year, actually, back and forth. And then one day, over some delicious focaccia at Liguria in San Francisco, which if you've never been, you should go, he won me over. The FOMO became too great, and I decided to join the team.
Wow, interesting. So when I was reading your bio, I thought it said naturist at first, rather than naturalist. So I was kind of glad that we weren't doing a video interview, because that could be embarrassing. But yes, that sounds like a really interesting background. And I agree, Lloyd is a very interesting and kind of enriching person. I never went to the Looker office in Santa Cruz. I went to the one in San Francisco once, and even that was kind of nice. So Hex is interesting, isn't it? First of all, maybe just as a taster, outline what Hex is and the mission, and then we'll get into the detail of that in the rest of the conversation.
Sure. So Hex is a platform for doing collaborative analytics and data science.
And the goal is to make it really easy for everyone on the team to go from an idea to analysis and working on that idea to actually sharing the
outputs and the outcomes of that analysis and exploration as a really beautiful artifact.
And I think you said mission. I think if I had to really simplify things down into
like a one sentence mission, I'd say the goal is to make data work more impactful.
And we come at this from a couple of angles. I think it's
all in service of making things more impactful. Hex tries to improve the actual experience of
doing the raw data work, making it more efficient and fun and more collaborative and more streamlined.
But then just as importantly, making sure that all of that hard work actually translates into something that can be shared in a way that really retains value and winds up having impact and creating knowledge for the broader organization.
I think that Hex is born from this idea of you can build something really powerful, but if no one can use it and if nobody reads it, then it's not useful.
So let's do a little bit of kind of context setting, a little bit of kind of history or background to this really.
So Hex is a notebook that supports Python and SQL
as those languages you can use to do analytics work.
So first of all, why is SQL important in this area?
And why is also Python important to data analysts?
Why are these two languages that are very important?
Actually, maybe before I answer that,
I might set some context of how the Hex product is
actually structured.
So I really think of Hex as having three pieces, which are kind of the three steps of the workflow.
And there's that actual analysis and science process, like the core data work that takes
place in SQL and Python, usually, as you mentioned, or even R.
And in Hex, this is done in this notebook-like
workspace. And if you're familiar with a traditional code notebook like Jupyter, then
you'll immediately understand the tool. But we've layered on a huge amount of these superpowers for
both really technical and really non-technical folks. So I can write a SQL query and then
jump into a Python cell and manipulate
the results of that query, and then build a chart without writing any code. And then I can
tag in a colleague with a comment all from the same page without context switching. So
there's that core data workspace, which I think of as being the first third of Hex.
And then the second part is this app building and curation stage, which lets you
take that complex work that you've done in SQL and in Python and perhaps in no code or low code as
well. And then you can turn it into something curated and beautiful that you're proud to share.
So you can add interactivity and perhaps you consolidate down a hundred cells of SQL and
Python into like half a page of rich text and charts that are
interactive and really concise and neat.
And then the last mile or the last third is delivery, which is where that like impact
piece that I mentioned earlier really lies.
People work so hard on these reports and analyses and all kinds of tools, and then they really
struggle to deliver them in a way that
actually drives an outcome or drives knowledge. And so Hex lets you take that app that you've
constructed and then throw it up on a URL that anyone can access. And there's a knowledge library
of trusted insights and everything is commentable and versioned and collaborative and alive in the
sense that a Google Doc is alive. And so I'm excited to talk about SQL and Python
because they are the bread and butter.
But I think it's important to emphasize
that making working in SQL and Python easier
and more efficient and more fun
is only a third of the Hex product.
It's really the building blocks
that then turn into this beautiful artifact
that can be shared and can drive outcomes on the team.
Maybe looking at existing BI tools
and SQL runners and so on that just do SQL.
Why is having Python, R, and so on
in your toolbox important as well?
Yeah, I wrote a sort of snarky, cynical tweet thread
about this a while ago,
where people love to argue about Python and SQL and
try and debate about what it's good for and what it's not good for and which one I should use. And
I was saying that I find this kind of surprisingly boring, and the tweets are out there, and
I could read them all out right now. But I think that one of the more interesting things there,
or let me be a little more clear. SQL is just
really good at some stuff. Python is also just really good at some stuff. It's also generally
a fact that most data-related projects have to start in SQL because that's where the data lives.
90%, maybe more, especially with data lakehouses now, of your data is going to be stored in a
warehouse or a lakehouse
or some place where you get at it by writing SQL.
And so SQL is really just table stakes for accessing data.
In almost every analysis, SQL is where you start off.
Where you might want to switch to Python
is where things start getting much more complex,
or perhaps you want to leverage the world of package libraries
that exist out there.
I love doing GIS and mapping stuff, and it's really cool to have BigQuery GIS functions
and PostGIS and whatnot, but 90% of the time, it's more ergonomic to do that sort of thing
in Python using GeoPandas or Folium.
And so it really becomes about this split, and it often happens maybe halfway or at some point in your data
workflow. It's like, oh, I'm doing things now that are starting to feel a little less ergonomic in
SQL, or I'm doing things now that I know how to do better or faster in Python. And that is where
in traditional tools or traditional workflows, you'd switch tools, right? You'd be using your
SQL runner or your BI tool, SQL IDE.
You would get the data sort of looking more or less how you want it.
And then you'd download it and you'd wait for that export to finish.
And then perhaps you would upload it to your Jupyter notebook
and start working on it there.
And that context switch and that transfer of data
has a huge effect on both flow and creativity
and just being in the moment of the analysis,
but also on reproducibility. All of a sudden, you can't really just rerun the entire thing with one
click. You have to actually go back and switch tools and run the query and download the thing
and upload the thing and then run it there. You can't email it to somebody and have them try to
run it because there's an 800 megabyte CSV file that it depends on.
And so you really lose, I think,
more than you realize
when you've switched languages
a lot of the time with existing workflows
because you're switching tools as well.
And so Hex supports both languages
seamlessly in the same workspace,
where you can really move between them
as many times as you would like.
You can go from SQL into Python and back into SQL
and even write SQL on the results of a previous SQL query.
It's not just more efficient because, like, hey, I have a preference.
It actually has really huge ripple effects for reproducibility
and for deliverability and for maintaining the fidelity of a data project.
I think a lot of the time it is just preference. Hex steps back and lets you choose the right tool for the task, and the right tool for you to do that task. I personally prefer to do linear regression in Python. I know how to do it, and it's one line of code. But literally earlier today, I got an Instagram ad that said how to do linear regression in SQL. And sure, it's possible. If you love SQL that much, have at it. The really cool thing about Hex for me is that it just lets you decide. And it also gives you a really frictionless way to upskill. We think about Hex as having this really low floor and a really high ceiling. Say you only know SQL. Hex is still an approachable and productive environment for you to do work in, even if you don't know Python. But then it's really easy to start bleeding into Python and start learning it. Say you're in a BI tool's SQL runner, and you start going, oh, crap, here I go again, doing a slope calculation in CTEs to try and build a linear regression. And you just kind of power through it, because you're in a SQL runner tool and you've got to do it. But if you're in Hex and you start doing the same thing, you might notice that, like, three pixels away, there's a button that says add a Python cell. And you think, oh, hey, I don't know Python, but I do remember someone saying that it's better for data modeling and stuff. And you Google it, and it's one line of code. And you don't have to screw around with an environment or installing Homebrew to upgrade your Python version. You just start writing it. And I think that's kind of magical, both for learning and just for the general efficiency of workflows.
Okay. So one of the key innovations, I suppose, looking purely at the traditional notebook features of Hex, is the DataFrame SQL: this ability you've got to query data in SQL and then start working on it in Python. Maybe just talk about what that is, why it is more than what's been done before by other tools, and what the key innovation was there, really.
Sure.
I think that DataFrame SQL, to be clear, is an ability that Hex has to write SQL queries
in a pretty standard PostgreSQL-like syntax on either the outputs of previous SQL
queries or the output of any data frame producing operation in Python or R throughout
the entire Hex project.
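To make the idea of writing SQL against earlier results concrete, here is a minimal sketch. It is an assumption-laden stand-in, not how Hex implements DataFrame SQL: the helper `sql_on_rows` is hypothetical, plain lists of dicts play the role of data frames, and SQLite's dialect is used rather than the PostgreSQL-like syntax mentioned above.

```python
import sqlite3

def sql_on_rows(query, **tables):
    """Run SQL over named in-memory row sets (non-empty lists of dicts).

    Hypothetical helper: each keyword argument becomes a temporary table
    whose columns are taken from the keys of its first row.
    """
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    for name, rows in tables.items():
        columns = list(rows[0].keys())
        column_list = ", ".join(f'"{c}"' for c in columns)
        placeholders = ", ".join("?" for _ in columns)
        conn.execute(f'CREATE TABLE "{name}" ({column_list})')
        conn.executemany(
            f'INSERT INTO "{name}" VALUES ({placeholders})',
            [tuple(row[c] for c in columns) for row in rows],
        )
    result = [dict(row) for row in conn.execute(query)]
    conn.close()
    return result

# Rows produced by some earlier step (a query or Python code).
orders = [
    {"customer": "acme", "amount": 120.0},
    {"customer": "acme", "amount": 80.0},
    {"customer": "globex", "amount": 50.0},
]

# First query runs on the earlier result...
totals = sql_on_rows(
    "SELECT customer, SUM(amount) AS total FROM orders "
    "GROUP BY customer ORDER BY customer",
    orders=orders,
)

# ...and a second query runs on the output of the first.
big_customers = sql_on_rows(
    "SELECT customer FROM totals WHERE total > 100", totals=totals
)
print(big_customers)
```

Each query stays small and inspectable on its own, which is the same property that makes chained queries easier to debug than one long statement.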
And I think it's this concept that is the innovation there, or that is maybe the thing
that I think will be picked up and run with by other tools, hopefully:
the data frame as the universal data storage construct.
And because Hex stores pretty much every piece of data as a data frame,
we know what it looks like.
We know how to access it.
We have our internal APIs for messing around with data frames.
And so it became almost this duh thing where it's like, okay, we have SQL cells and they
return data frames and then people are using those data frames in Python cells.
Wait, why don't we just let them keep using those data frames in SQL cells?
And we started playing around with it and it immediately became this
incredibly ergonomic thing. And it has some pretty significant implications. One of which,
like I mentioned, is the fact that Hex can be a productive tool for somebody who only knows SQL
because you can just write an endless string of SQL queries, and maybe the first one pulls in a big
wide table from your database, and then all of the other ones are just operating on the output of
that query. And it's all very fast. It's in memory, so it's nearly instantaneous. And pretty much
anything that you can imagine doing in pandas, you can figure out a way to do in SQL. And so SQL-first folks can use Hex to its full
potential. And I've found it to change my SQL psychology a little bit too, in that it lets
you break up CTEs into separate queries that are much more debuggable. They're much easier to
explain and to understand, and they're much better to organize in the project. And so it's not just
this convenience thing. Sure, it lets you keep using SQL where you might have had to switch
over to Python first, but it also, I think, changes the way that people write SQL for the better. And
it makes, right in the spirit of notebooks being this literate programming, self-documenting
artifact, I think that DataFrame
SQL and the ability to chunk things down into more manageable sizes has pretty awesome implications
there.
One thing that I think might be useful to people as well is to maybe walk through what the user experience is like and how a notebook differs from a traditional BI tool, because there'll be people listening here who probably aren't that familiar with notebooks and the concepts you're talking about. In the abstract, they sound interesting, but what's the user experience and developer experience like compared to, say, a BI tool?
Yeah. So in a traditional BI tool,
I think you can almost think of it as like tabs versus cells.
When I use a BI tool or when I use the SQL runner of a BI tool,
I end up opening a whole bunch of tabs as I mess around with the query and I
iterate on it and I want to save a previous version of it.
So I leave that in a tab and I'll create a new tab.
And it's this sort of disjointed,
I guess you could say, exploration process. And notebooks, I think, aside from supporting Python
and other languages like Scala and R and Julia, et cetera, so you can just inherently do more
complex things in them, have this really powerful linear cell-based structure. It might just be
more productive to go Google image search what a Jupyter notebook looks like, but you can imagine
it as this document that's composed of individual cells that flow linearly down. You create a
variable or you run a query in a cell at the top, and then you sort of work your way down, adding new cells
and chunking up your logic into really whatever size you prefer
and iterating on the outputs of previous cells.
And really the coolest, I think, or the most important feature of these
is that you can intersperse your code with rich text or Markdown
explanations of what's going on. And so they become this self-documenting artifact that
has both reproducibility, since people can run the notebook, and explanation in situ.
And what this sort of triggered for us when we were thinking about Hex is that this led to notebooks becoming the final artifact. I did a really cool analysis, and I kind of self-documented it as I went, and I wrote notes to myself about it, and I made some charts and tables, and that's awesome. I'm going to send it to my boss. The problem is that notebooks aren't really great as a final product. You wouldn't turn in your math notebook as your final project. You would turn in some beautifully built trifold poster. And notebooks, for better or for worse, ended up creating this environment where oftentimes people are trying to turn in an .ipynb file as a finished product or, even worse, screenshotting things out of it. Hex keeps the notebook ideas that work should be self-documenting and that you should be able to break your logic up into whatever size chunks you prefer, and takes it one step further into this concept that notebooks are probably the best place to do exploration and to do data work, but they're probably not the best way to actually convey the results and the outputs of that data work. But like I said, tool switching sucks. So why not build into a notebook tool the ability to take that notebook, use its outputs as building blocks, and turn those into something curated and beautiful that you're proud to share?
Okay. So I'd like to get on to the data apps that Hex supports as well in a second. But one thing: as you were describing the workflow with a notebook, it sounds quite similar to the workflow you'd have with, say, dbt, where in the SQL world there'll be a transformation, the results of which feed another transformation, whose results feed another transformation, and then at the end of it there's a result. How much of a parallel is there between the dbt developer world and Hex, and how comfortable would a dbt developer feel working in Hex, really, as a maybe more visual way of doing what they do currently?
Yeah, that's a great question. Our analytics engineer, Erica, who works in dbt primarily, recently wrote a pretty great blog post about how she uses Hex to prototype her data transformations that ultimately wind up in dbt. And it really is along those lines, right, of having a more visual, kind of instant feedback loop on the data work that she's doing, so she can rapidly prototype these transformations and then productionize them into dbt. It's also funny that you mention this, because one of the great innovations of Hex solves what people think of as the state problem with notebooks. If you're familiar with notebooks, you're nodding your head. If you're not familiar with notebooks, the problem with that beautiful cell-based linear layout I was talking about is that you can run cells out of order.
Jupyter notebooks and almost every notebook lets you run cells individually. And what this means
is that say I'm assigning x a value of 1 in one cell, and then in a cell below it I'm reassigning
x to 2. If you're looking at this and sort of reading through and scrolling through the
notebook, you'd see those two and you'd be like, okay, x equals 2 because that came after.
But if the person at the time they published or downloaded the notebook had run those cells out
of order, x could be 1. And so the state of a notebook is not consistent. And it's very difficult
to introspect it and to tell at a glance what is actually what.
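The x = 1, x = 2 scenario can be reproduced in a few lines. This is only a toy model of a notebook, assuming each cell is a string of code executed in one shared namespace; the cell names and the `run_cells` helper are invented for the illustration.

```python
# Two "cells" as they appear on the page, top to bottom.
cells = {
    "first": "x = 1",
    "second": "x = 2",
}

def run_cells(order):
    """Execute cells in the given order in one shared namespace; return x."""
    namespace = {}
    for name in order:
        exec(cells[name], namespace)
    return namespace["x"]

top_to_bottom = run_cells(["first", "second"])  # what a reader assumes happened
out_of_order = run_cells(["second", "first"])   # what may actually have happened

print(top_to_bottom, out_of_order)  # prints: 2 1
```

The page looks identical in both cases, so nothing on screen tells you which value of x the later cells actually saw.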
And this can be really confusing when you're reading the outputs of a notebook, but it can also be really confusing when you're doing work in real time in the notebook.
And a bad solution to this that people wind up doing is clicking the button that says restart and run all, which means that every time you change anything in a notebook or add a new cell, you rerun the entire logic of the notebook. Hex instead builds a graph of the dependencies between cells, a DAG, which is how dbt lays out their projects and transformations as well. And so you can open up the graph view in Hex and visualize your project and all of the dependencies and the parent-child relationships of cells. And this is really neat to look at and to troubleshoot your project with, and it especially just maps to the mind of an analytics engineer. But it also means that the state problem is fixed, because as part of that graph building, we're dynamically recomputing all of the dependencies of every single cell. And when you run any cell, we're going back, we're finding all of the dependencies of it, rerunning those, not rerunning all of the cells that are unrelated, and your state is always consistent in the notebook. So that was, I guess, a sideways way to plug a cool Hex feature. But the short version of answering your question is that I think Hex is really powerful for analytics engineers prototyping work because it's so fast: you get that fast feedback loop. And then having this reactive, DAG-based model makes it feel a little bit familiar.
Fantastic. So the title of this podcast is Hex and why the future of notebooks isn't just about notebooks. What I meant by that was that Hex seems to be a lot more than just another notebook, and the thing you've been talking about, the data apps, is the thing that stood out for me. One of my colleagues, Olivier, has a data app on your website looking at, I think, a Canadian fuel protest or something. But the point is, as you say, it presents things in a very visual and interactive sort of way. So maybe just recap on what data apps are and how they fit into the workflow. And I suppose, again, what innovations have you got there? Why is Hex's take on that particularly good, do you think?
Sure. I think that data apps
is maybe a word that people toss around and it hasn't been perfectly defined in one place. I
think it's a little less exciting than it sounds like, or maybe a little bit less specific. I think that data apps is a more umbrella term for analytical products that
are simply more flexible than dashboards. I think that they're an umbrella term, more or less.
And so you can think of a data app as being looking like a traditional dashboard, right?
A couple of tiles with a sort of classic newspaper type layout.
But you can also think of them as being much more flexible, maybe like just one table that
allows for user input and then a big run button at the bottom that actually takes an action.
And that table at the top is being dynamically updated in real time.
And so for me, the concept of data apps, and this, I think revolution is probably a pretty strong
word, but there are a couple of tools out there that are really going all in on data apps. It's
about flexibility. It's about not wanting to be bound into that, like drag and drop tiles into a grid on a dashboard mode that BI tools present as the
only way to communicate insights. Because A, it's just not the best way to communicate insights.
A lot of Hex data apps actually look like documents, like rich text stories or data
articles that are interactive. But it's also a pretty limiting model in terms of interactivity, right? You have
your grid-based tiles and you have your filters up at the top and that's it. And data apps kind
of give you the flexibility to build your data products like you would a modern web app where
maybe you're halfway down and all of a sudden there's a section that requires you to take an
action or to write back some data to a database or to upload a file
and take action on it.
And it's really just about giving users all of that flexibility, I would say.
So who do you see as the audience for this, really? Who is the data apps feature geared towards in terms of who is going to consume it? Is it meant to be for internal consumption within businesses? Is it something where maybe you see Hex as being a platform for public data apps? Where do you see this going, really?
Yeah, that's a fascinating question. I'm cognizant of the fact that it's hard to say. I remember with Looker, and I could be wrong, I don't work there, I think they make a pretty good amount, if not most, of their money off of this embedded, external-facing applications and embedded dashboards use case.
And that wasn't something that they planned on.
It was something that users started doing.
They realized, hey, we've already modeled our data.
We have these beautiful dashboards.
Why not show them to our end users and embed them into our portal? And Looker was like, oh, shoot, we better catch up to this and
we better put a SKU on it and monetize it in some way and control it. And it's done very well.
So I would be hesitant to say, no, Hex doesn't do that because our users really guide the use
of the product at the end of the day. Today, Hex is very much more geared
towards that internal use case. It's a way for data teams to do better work together and then
to communicate the results of that work out to the entire company, to the entire organization
in a way that is both technically satisfactory and reproducible for other technical folks and
easily understandable and approachable for non-technical people.
That flexibility that I was talking about, though,
does lend itself well to external use cases
because you can imagine it's a little bit more powerful
to be able to embed an interactive application to an end user
than it is to embed a three-tile dashboard
that perhaps you'd rather
just recreate yourself in JavaScript or something like that. So I would expect to see more of these
cases emerging. I think we have one or two people that are playing around with it right now. But
right now, Hex is primarily focused on internal use cases. But I mean, I finished up my tenure at Google supporting the embedded analytics SDKs and capabilities, so I would of course be excited to see Hex go down that road as well.
This may be a digression, but why would a vendor make most of its money from external use of a product like that? Is it because that's where you get the kind of high user base? I'm just curious, not particularly for Looker, why would that be a good way to sell your product or have it used to make money?
Sure. And I think that I may have just been wrong, for one.
Funny if you were, yeah.
The obvious answer is, like you said, that the user base is just so tremendously larger, right? Most Looker deployments have like a couple hundred people, or a thousand max, using it. But you can think of an unnamed Looker customer, we have a customer, or had a customer, who had like millions and tens of millions of people that were using the embedded dashboards. And so you start to really increase the scale there.
So the knowledge library: what is the knowledge library, how does it work, and where does it fit in with your product strategy?
Yeah, so the knowledge library is one way that we're answering that question or that thing that I mentioned earlier in our mission about making data work impactful, right?
Doing something different than just letting people send over spreadsheets or screenshots of stuff.
And there are many ways to make that better. One of which is like just the core Hex value prop of
being able to push up a data app to a URL that anyone can access. But I think that a lot of the
organizational tools or the organizational structures of data tools are kind of broken
in the sense that they just fall into this endless nesting of subdirectories and people have their
own little personal seven circles of hell data subdirectory that's filled with millions of
copies and duplicates. And it's very difficult to find things. And what's more difficult is to find
things that are trusted.
And the knowledge library is an answer to that, right? There's a difference between information and knowledge. And the knowledge library of Hex is this view that catalogs
trusted Hex applications that have been tagged with a status that an administrator has chosen to include in the
library. And so it's like, not all Hex apps wind up in the library. A lot of things stay in an
exploratory phase, or perhaps they're just not the kind of thing that needs to be codified in
the company knowledge base. But if they do, then you add a tag and Hex adds it to this sort of
trusted, I hate to say the words single source of truth
because it's the worst phrase ever, but it at least is a source of truth for trusted
knowledge in the company in a way that anyone can access it. And this is the standard interface for
non-technical users into Hex. If you invite somebody to Hex who's not an author or an editor, they'll wind up
in the library. And so this is part of that low floor, right? Where Hex, if you use it, you'll
realize immediately it has these crazy superpowers for technical folks, but we want it to be the kind
of place where anyone can open it up and not feel lost. So the library should feel like a library,
right? You walk in and you know what
you're looking for and you can find it, and then you can easily access that work.
Okay. So it's interesting to go into what's probably the area of analytics at the moment that's even hotter than
notebooks, the one it's obligatory to discuss in every podcast, which is the
metrics layer, and I suppose the thing that's part of that, which is the unbundling of parts of
the stack. So first of all, the current trend around
unbundling of layers of the analytics stack and the focus on the
metrics layer: is that something that is relevant to Hex? Does it
inform your strategy? Is it something that you think about? Maybe just talk about what you think about that trend in the industry.
Sure. I admit to not keeping up that
much with the unbundling progress. I think I'm not an MBA and I'm not even an aspirational
thought leader. So much of the unbundling discourse, to be completely honest, goes a little bit above my head. I think that where it's easiest to understand it is that the unbundling is happening between layers, right? Like there's the consumption layer, there's the metrics layer, the transformation layer, the storage layer. And previously, there were some of these sort of monolithic platform tools that encompassed a lot of those things and brought them together.
And I think that those were noble goals, right?
If you think about Hex and the way that we're trying to take things that previously you had to use multiple tools to do and put them in a place where it feels really efficient and ergonomic to do in a single tool. Like that's the same vision that
say Looker had when they built a metrics layer and integrated it with transformation to some extent
and a consumption layer. I think that that is a noble task and the fact that it's being perhaps
better served today by standalone tools across those layers,
I think is almost more a function
of the easy transit of data between those layers
than it is about intrinsic failures within those workflows.
I haven't used Looker in a while,
but I remember it being a pretty delightful workflow
to model something and add some simple transformations
and immediately
begin visualizing it. It's fast and it's quick and it's integrated. I think that where that
unbundling process becomes a little bit more compelling is when it becomes super easy to
push data around between places, like with Fivetran and dbt. And all of a sudden, all of those
little tiny things that maybe weren't
enough reason to ditch a whole platform suddenly become a lot more appealing when it's
super easy to ditch it and orchestrate something that works across the layers.
And Hex exists entirely within the consumption layer. We consume and we visualize
and we display and allow people to take action on data.
I think, as you mentioned, we're closely tied to the metrics layer.
I am really excited about the prospect of the metrics layer and its integrations with
Hex because I think Hex right now presents an unusually flexible tool for doing data analytics. I've alluded several times
to the fact like Hex lets you decide, like Hex steps out of the way and lets you choose the
language that you want. Hex lets you write your own SQL queries. Hex lets you do just about anything
that you can dream of, which is amazing. But as a sort of halfway data scientist analyst,
I have like tiny little bits of skills in all these areas.
I would love to be able to just yank in a metric
without having to open our dbt documentation
and figure out the definition for it.
I think that in service of the low floor approach
that we're taking is really crucial.
And I think we will definitely be
investing in building support for the metrics layer.
So interestingly, just
before I recorded this with you, I was talking to Nick Handel from Transform, who mentioned Hex.
And something I noted before, when I was talking about this
episode, was the integration you announced with Transform.
So maybe is it something you can talk about?
I mean, what's the aim with that and how does that work?
And what is Transform maybe as well,
just for anybody who's not aware of it?
Sure.
Transform is a metrics layer.
It's a place where you can define the core business logic
of your organization or business in metrics, which again create this kind of source of truth. So instead of writing a SQL query to calculate ARR that everybody
calculates differently, you just write some, I think they call it MQL, I think it's metric
query language or something. You write a Transform query that just says, hey, give me the ARR,
and it gives you that codified ARR. And that's, I mean, that's what the Hex integration with
Transform does. It lets you write SQL queries that connect to Transform via its JDBC connector, which then dynamically rewrites the SQL query that you've written that just says ARR. And it goes, okay, I know how to calculate ARR. And it rewrites the query to be all complicated and perfect and fires that off to the warehouse and then pipes the data back. So it lets you,
and this is sort of, I think, what the promise of all metrics layers will hopefully bring, it lets
you write simple SQL that you know is going to give you a result that is accurate to
the business logic.
So I suppose the final thing I want to talk to you
about is around community. Your role is community advocate, and you've got a Community Edition of Hex,
and there are some great examples and tutorials and so on for Hex on your website. So maybe talk to us
about the role the community plays within Hex and how people can get involved with the
product and how they get to use it. And generally, your day job,
how can they interact with you
and find out more about Hex?
Yeah, just a quick plug,
which is that Hex Community Edition
is free for everyone forever.
And it really should feel feature complete
to the individual user.
Most of the little upsell levers that are
baked into the product when your demo ends are around collaboration, around working with a
broader team, around enterprise things. If you are an individual data analyst or scientist or
data curious person, Hex should feel really ergonomic and really lovely for you forever,
for free. So I encourage you to check it
out because it's a lot of fun. We have a public app gallery with community projects, which you
mentioned, and you're welcome to submit projects to it. I think that one area, thinking about my
day job, that I want to invest and just spend a lot more time in is in building these demo and template
projects for and with the community. I think that there's a huge opportunity with the kind of
projects you can build in Hex, right? Talking about this literate programming, talking about
these self-documenting, self-explanatory projects, there's a huge opportunity to build tutorials
and not just, hey, here's how you use Hex, but here's how you do a really complex data analytics
concept. And it just happens to be in this really nice, clonable, forkable, reproducible,
no-state-problems notebook that anyone can fire up. And so I would really encourage everyone
to try out Hex. If you have ideas for amazing tutorials or concepts that you're really
passionate about that you want to start exploring, or maybe even exploring building tutorials for,
yeah, hit us up on Twitter, find us on the dbt Slack in the tools-hex channel, and come hang out, because we would love to have
you.
Fantastic. Well, look, it's been fantastic speaking to you, Izzy, and I really
appreciate you taking the time to come on the show and to explain, I suppose, the thinking behind
the product and how you have taken notebooks beyond people's original idea about
what they could do, really. So it's been great speaking to you, Izzy.
Thank you very much and good luck with the product.
Thank you.