Drill to Detail - Drill to Detail Ep.107 'Cube, Headless BI and the AI Semantic Layer' with Special Guest Artyom Keydunov
Episode Date: July 6, 2023
Mark Rittman is joined in this episode by Artyom Keydunov, Founder at Cube, to talk about embedded analytics and Cube's origin story, headless BI, query acceleration, and how Cube and Delphi enable an AI-powered conversational interface for the semantic layer.
Transcript
maybe set the scene a little bit really about what the product landscape was like when you thought about Cube at the time, and what problem did you initially try and solve within that market?
Yeah, yeah, good question. I think, to be honest, the initial idea was: what if we built a headless LookML, but bring a different BI on top of it.
So hello and welcome to Drill to Detail, and I'm your host, Mark Rittman.
So I'm very pleased to be joined on the show today by Artyom Keydunov, CEO of Cube. So Artyom, great to have you on the show.
Thank you for having me today, Mark. I'm really excited about our conversation. My name is Artyom,
co-founder and CEO at Cube, started Cube in 2019. So really looking forward to today's conversation.
Fantastic. So we've been working together a little bit recently as well. We're a Cube partner, and certainly myself and the team have been very excited to be working with the Cube product and the Cube company.
So it's great to talk to you. It's always great to talk to the original brains behind the product. So it'd be quite good to go through in this episode some of your background, but then the area of semantic models is really hot at the moment, and I'm particularly interested to understand what your thinking is around this area, what the differentiator really is for the Cube product, and where you see this all going. So let's start off with, I suppose, your kind of backstory. Tell us about Statsbot, which was a product that I actually used a little while ago, and I was pleased to see that it was your company, or you, that was involved in that as well. So what was Statsbot, and what was the story behind it?
Yeah, good to hear that, you know, that you enjoyed the product.
So I started StatsBot, I think, in 2016 or so.
And I was running engineering at a company that was building software for schools back then.
And we had a good engineering team that was starting to use Slack a lot.
And I thought, what if we turn Slack into like some sort of a BI tool, right?
Like bring charts or analytics data into Slack, because we already spend a lot of time in Slack, right? So I built that integration with Slack and a few other places, you know, like Google
Analytics, databases, Salesforce. And so Statsbot was able to pull data from different places and just display it in Slack, either in real time or on a scheduled basis. And I started it as just a kind of side hustle project, but it started to grow quickly.
Slack came with this application directory,
and they reached out to me and said,
hey, Statsbot is already getting some traction.
We wanted to feature it in our upcoming application directory launch.
So they did this.
I got a lot of traffic, got a lot of users.
My co-founder at Cube was actually one of the first users of Statsbot. He texted me, and I was like, my Ruby on Rails application on Heroku is not doing well, can you help me? He jumped in and, you know, started to help me.
So, and it was going well. And then at some point, VCs started to reach out, saying, okay, we see some traction in Statsbot. And it was, you know, during the days when a lot of people were talking about conversational interfaces; it was like Magic, the SMS, text-based app, right? There were a bunch of bots, Facebook bots. So there was a little bit of hype in the venture world about that as well. So VCs reached out, and they kind of wanted to fund Cube, or fund Statsbot, sorry. So we decided to do that. We quit our jobs, we raised a small seed round for that.
And I think it was a good run. I liked what we built with Statsbot. I think the problem with Statsbot was that its initial idea was just to build a side project. And it was really good as a side hustle, but it was not a big venture story. And then Slack really kind of stopped growing at some point, and it was all applications built on top of Slack. So at some point we decided that we wanted to focus on a bigger opportunity. And at this point, actually, we started to look more into Cube.
And Cube was an engine we built for Statsbot. Because essentially what we needed at Statsbot was a roll-up engine, a relational roll-up, which every BI needs, right? You generate some sort of SQL query from the multidimensional model, and then you run it in a warehouse. So we thought, what if we just take these internals of Statsbot, what if we take Cube and put it out on GitHub, so people are going to use it for building their own data apps, building their own analytics products? So we did this, and it took off. A lot of people started to use it, and we decided to pivot completely and focus on Cube.
Interesting, interesting.
So I think maybe I used it at the time
for doing GA reporting within Slack.
So yeah, it was good.
And there's also a company called Allcount as well that was involved in your history, and that's where Pavel, your co-founder, came from. So I know Allcount wasn't maybe a direct product of yours, but how does that fit into the story?
Yeah. Allcount: my co-founder Pavel was working on the open source project called Allcount before he joined me at Statsbot. It was a rapid application building platform, based on Node.js.
Okay. And I suppose, stepping forward a little bit to the start of Cube, it was Cube.js, I think, at the time.
And was that kind of where, I suppose,
the focus on Cube being embedded came from,
or was it just coincidence, really?
Yeah, yeah.
I think the main reason is that our data model back then
was JavaScript-based.
So right now we still have JavaScript,
but we also have a YAML based model.
And it seems the YAML-based models are getting more traction; it's just a more natural environment for data engineers, I would say.
But our original framework for the data modeling
was only in JavaScript.
So that's why we had the JS in the name. At some point, we decided to remove that, because we started to see some confusion from people thinking that Cube is some sort of, you know, JavaScript visualization library or something like that. Because many charting libraries have JS in the name, like D3.js or Chart.js, right? So having JS in the name was not good for us. So we removed it, probably almost two years, maybe one year ago.
Okay, okay.
So we're going to talk about Cube really in this episode, and I suppose the wider analytics market, semantic models within there, and the position that Cube hopes to have within that market. Okay, so let's just take a step back. So when I first heard about Cube, there was a lot of talk about headless BI and metrics layers and so on. So I suppose, maybe set the scene a little bit really about what the product landscape was like when you thought about Cube at the time, and what problem did you initially try and solve within that market?
Yeah, yeah. Good question. I think, to be honest, the initial idea was: what if we built a headless LookML? And, uh,
we were like big fans of the Looker product
and we have many Lookers on the team right now.
So I think the Looker model is just a great product
and I have a lot of respect for the team.
So we thought, what if we build that?
And that was the idea.
What if we just build a headless data model?
And we started to think what kind of use cases
are people going to use it for, right?
When we released it in open source,
we started to see most of the people
that were building embedded analytics
or like interactive data apps, right?
Which felt natural, right?
Just you have the data model that you can run
on top of warehouse.
It has all the data modeling capabilities,
but it also has some sort of SQL execution engine with some caching.
And then you have API.
And then you can build your own charts, like a React with Chart.js
or something like that.
So that was our major use case back then.
And then probably around 2020 or 2021, more and more people started to talk about the
metrics layer and headless BI as a term and semantic layer. So we started to see some ideas
of people wanting to take something like a LookML but bring a different BI on top of it, right?
Not only to do embedded analytics, but to bring a different BI, like Tableau or Superset, on top of a LookML model.
And there were like some blog posts talking about this idea,
like a headless BI or something like that.
And every time these blog posts would come out,
people in our community would point to it and say like,
hey, that's exactly what Cube is doing.
We're like, yeah, that sounds about right.
And we started to think more about that use case.
And we started to see, you know, some pull from the community. And speaking of the landscape, there were several companies trying to do that. There was one great project, MetriQL, which was probably one of the first to say, okay, what if we build a data model and then let different BI tools connect to MetriQL? MetriQL was Presto-based, so they were exposing a Presto SQL interface. That was a good idea. And then there were companies like Supergrain and Transform. I think there were a few others.
And we at Cube, we started to think a lot about that use case as well.
So I think at some point, from a naming perspective also,
it was a little bit like a chaos.
Some people were using the term headless BI, and we at Cube were using that as well.
And then some people were using metrics layer,
metrics store, and then semantic layer.
And now I feel like everything is converging
to like semantic layer, which is good.
So we have a one single term right now,
but it was interesting times, like two years ago, you know, with a lot of companies and projects getting into the space.
Okay. So I suppose, without getting into Cube's particular details at the moment,
what are the, I suppose, the generic challenges in trying to build a headless BI, a headless semantic model?
And I suppose, why also would a company want to do that,
want to use that rather than, say,
just using LookML and Looker, for example?
So what are the challenges and why bother, really, I suppose, in some respect?
Yeah, yeah.
I think, on why and what is the value: if I would try to summarize it, it's the overall idea of bringing software engineering practices to data management. And one big best practice is DRY, meaning do not repeat yourself. And what's happening right now is that when we use multiple visualization tools, every visualization tool has some sort of data modeling layer, right? It's very advanced in Looker, but Tableau has it, Power BI has it, to some extent Metabase; every BI has it. And what's happening is that we're repeating ourselves in these places. Every time we do a new visualization, every time we bring another tool, or even embedded analytics, we repeat the data modeling in that layer. So the idea is: do not repeat yourself, and extract the data modeling upstream into some sort of component that provides unified consistency and accuracy. So that's the idea behind the semantic layer and why to use it.
And what are the challenges? Because, I mean, I guess I would imagine having to support lots of different databases, you know, different BI tools. I mean, I imagine it's not a trivial thing to do, really, is it?
Yeah, yeah, exactly.
I think the main challenge would be around the BI tools, because, coming back to the point that every BI has its own semantic layer: usually this semantic layer enables the user interface, it controls the end user experience. We've mentioned Looker a few times already, but let's take Looker as an example, right? We have Explores in Looker. So every time I create an Explore in LookML, it will pop up in my list of Explores, right? And then I will go into that UI. So everything I'm doing in the data model affects my UI, for myself and for end users. So that's the challenge: we still need to have this semantic layer on the BI level, because all the BI controls are connected to it to provide a native experience. So if we try to bypass that, then the experience of the end user would be very, very bad,
Right. It would not be native. So I think the challenge is: how do you implement a semantic layer, but still keep the same native experience in the BI for end users?
Okay, okay. So let's get into the detail of Cube then, really. So you mentioned how, I suppose, Statsbot was an inspiration, and some of the roots of Cube were in that.
But you went down the route, I suppose,
of open sourcing Cube, didn't you?
So maybe talk about the first couple of years, really,
and when it was Cube.js
and how you kind of, I suppose,
leveraged the community a little bit in this really as well.
So maybe the starting sort of origin story there
would be quite interesting.
It was a lot of fun.
So it was mostly me and my co-founder running the open source project.
We were just doing mostly three things: writing code, obviously, talking to users; we put a Slack community out there, so that was good. Every time people would run into some exception or some problem, they would join our Slack and ask a question. And then we'd use that as an opportunity to build a conversation and relationship with those users, and this way we'd be able to learn how they were using the product. So: writing code, talking to customers, and just blogging a little.
You know, just: how do you solve this problem with Cube? How do you solve that problem with Cube? How do you use Cube with that technology? So blogging was really a way for us to attract new users, and by talking to users, we were able to shape the product. So it was pretty much the loop, you know: put a blog post out, get new users, talk to them, write code, and then repeat.
Let's go into, I suppose, some of the detail of how Cube works. Okay, so there are different layers to Cube, aren't there? There's caching, and there are APIs, and pre-aggregations and so on. Maybe just talk us through a little bit about how the product works, and the choices you made over how it's architected and built.
When we talk about Cube, we usually talk about the four layers of the product. And the first layer is the data modeling, and pretty much all other layers are coupled to the data modeling. In the data modeling, you define your data models, right? Hence the name. So in the Cube world, we have two objects for the data modeling. One is called cubes, and the other is called views. The purpose of cubes: a cube is a business entity. So you take a user as a business entity, you take an order, a transaction; you define what measures and what dimensions these business entities have, and then you also define the relationships between them. One to many, many to one, all of that. So you're sort of building a data graph of your business entities. And then views: the job of views is to act as data marts, or slices of data. So you can take some measures and dimensions from specific cubes and then present them as an interface to the end user, some sort of curated data sets. And also on the views level, you can control the join paths, because the cubes constitute the data graph, but it's not directed. On the views level, you give direction to joins, because potentially you may have multiple ways to direct your graph, right? So on the views level, you can control that direction. So that's the fundamental idea of the data model.
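As an editorial illustration of the two objects described here, this is a minimal sketch of a Cube-style YAML data model for a hypothetical `orders` entity; the table, member, and view names are illustrative, not taken from the episode:

```yaml
cubes:
  # A cube is a business entity: its measures, dimensions, and joins.
  - name: orders
    sql_table: public.orders
    joins:
      - name: users
        sql: "{CUBE}.user_id = {users}.id"
        relationship: many_to_one
    measures:
      - name: count
        type: count
      - name: total_amount
        type: sum
        sql: amount
    dimensions:
      - name: status
        sql: status
        type: string
      - name: created_at
        sql: created_at
        type: time

views:
  # A view acts as a data mart: a curated, directed slice of the cube graph.
  - name: orders_view
    cubes:
      - join_path: orders
        includes:
          - status
          - created_at
          - total_amount
```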
Then on top of this, we have access control, which is coupled into the data model. So every time we execute a query in Cube, we execute it in the context of some security context; we call that idea the security context. The security context can affect the data model, meaning that for different users you may have a different version of the data model, which could be row-level security, column-level security, or just removing an entire set of measures or an entire set of dimensions. So your data model can be flexible based on the context of the query.
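The flexible-data-model idea can be sketched conceptually in Python. This is not Cube's actual API, just an illustration of row-level and column-level security driven by a security context; all member names are hypothetical:

```python
def apply_security_context(query: dict, ctx: dict) -> dict:
    """Rewrite a multidimensional query under a security context."""
    rewritten = dict(query)
    # Row-level security: force a tenant filter into every query.
    rewritten["filters"] = list(query.get("filters", [])) + [
        {"member": "orders.tenant_id", "operator": "equals",
         "values": [ctx["tenant_id"]]}
    ]
    # Column-level security: strip members this role may not see.
    if ctx.get("role") != "admin":
        rewritten["dimensions"] = [
            d for d in rewritten.get("dimensions", []) if d != "orders.cost"
        ]
    return rewritten

q = {"measures": ["orders.count"],
     "dimensions": ["orders.status", "orders.cost"]}
print(apply_security_context(q, {"tenant_id": "t1", "role": "analyst"}))
```

The same query object thus yields a different effective data model per user, which is the essence of the security-context idea.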
So that's how we implement access control. And then we have a caching layer; the idea of the caching layer is aggregate awareness.
So Cube can build the aggregates based on measures and dimensions, and Cube has its own storage for the caching. Cube runs the initial aggregation in the data source and then downloads the result into its own cache. And the aggregates, again, are like tables, right? It's a relational cache, meaning that it can potentially serve a lot of different permutations of measures and dimensions in a query.
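The "one roll-up serves many permutations" point can be sketched conceptually like this (an illustration of the matching idea, not Cube's real aggregate-matching logic):

```python
def can_serve(pre_agg: dict, query: dict) -> bool:
    """A pre-aggregation can serve a query if it contains every measure
    and every dimension the query asks for."""
    return (set(query["measures"]) <= set(pre_agg["measures"])
            and set(query["dimensions"]) <= set(pre_agg["dimensions"]))

# A single roll-up of orders by status and country...
rollup = {"measures": ["count", "total_amount"],
          "dimensions": ["status", "country"]}

# ...can answer any subset of those members:
print(can_serve(rollup, {"measures": ["count"], "dimensions": ["status"]}))      # True
print(can_serve(rollup, {"measures": ["count"], "dimensions": ["user_email"]}))  # False
```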
And then Cube can refresh that either with its own job scheduler, or you can use an orchestrator like Airflow.
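Continuing the hypothetical `orders` model from earlier, a pre-aggregation definition in Cube's YAML style might look like this; the member names are illustrative, and the refresh schedule corresponds to the built-in scheduler mentioned here:

```yaml
cubes:
  - name: orders
    sql_table: public.orders
    # ...measures and dimensions as before...
    pre_aggregations:
      # A daily roll-up of order counts by status, built in the data source,
      # downloaded into Cube's own storage, and refreshed every hour.
      - name: orders_by_status_daily
        measures:
          - count
        dimensions:
          - status
        time_dimension: created_at
        granularity: day
        refresh_key:
          every: 1 hour
```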
And then the final layer of the product is the APIs. We have a REST API and a GraphQL API; that's where we started. People usually use those APIs if they want to build embedded analytics or interactive data apps. And then we have a SQL API; our users use the SQL API to connect BI tools.
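The same logical question can be expressed against either style of API. A sketch in Python (the endpoint path and member names are hypothetical; the MEASURE() syntax is the SQL extension explained a little later in the conversation):

```python
import json

# REST API style: the query is a JSON document of measures, dimensions,
# and time dimensions, sent to a load endpoint such as /cubejs-api/v1/load.
rest_query = {
    "measures": ["orders.total_amount"],
    "dimensions": ["orders.status"],
    "timeDimensions": [{
        "dimension": "orders.created_at",
        "granularity": "month",
    }],
}
payload = json.dumps({"query": rest_query})

# SQL API style: the same question over the Postgres wire protocol,
# with the aggregation deferred to the data model via MEASURE().
sql_query = """
    SELECT status, DATE_TRUNC('month', created_at), MEASURE(total_amount)
    FROM orders_view
    GROUP BY 1, 2
"""

print(payload)
print(sql_query.strip())
```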
Okay. So how does Cube translate queries against your data model into SQL that can be sent to BigQuery or Snowflake? How do you do that? How do you handle the different dialects and so on?
Yeah. We first build a multidimensional query, right? Measures, dimensions. And then we translate that into SQL based on the different dialects. Cube has a concept of a driver, so every time you need to support a different data warehouse, you need to build a driver. And a driver needs to implement the connection,
you know, like how connection is done technically, right?
Different protocols, all of that.
And then it needs to implement the SQL dialect as well.
So because sometimes, you know, predicates, filters, all of that, they just have slightly different syntax, right? So that's how it works. Every time we want to introduce support for a new warehouse, a new database, we need to implement the driver. And our open source community has implemented a lot of drivers already, which is really good; we have this luxury. So yeah, that's how Cube works on that side.
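The dialect part of a driver can be sketched conceptually (this is not Cube's real driver interface): the same multidimensional query renders to slightly different SQL per warehouse, for example the argument order of DATE_TRUNC differs between BigQuery and Postgres-style dialects:

```python
def date_trunc(dialect: str, granularity: str, column: str) -> str:
    """Render a date-truncation expression in the target dialect."""
    if dialect == "bigquery":
        # BigQuery: DATE_TRUNC(column, MONTH)
        return f"DATE_TRUNC({column}, {granularity.upper()})"
    # Postgres / Snowflake style: DATE_TRUNC('month', column)
    return f"DATE_TRUNC('{granularity}', {column})"

def to_sql(dialect: str, measure_sql: str, dimension: str,
           time_dimension: str, granularity: str, table: str) -> str:
    """Translate one multidimensional query into dialect-specific SQL."""
    bucket = date_trunc(dialect, granularity, time_dimension)
    return (f"SELECT {dimension}, {bucket} AS period, "
            f"{measure_sql} AS value FROM {table} GROUP BY 1, 2")

print(to_sql("bigquery", "SUM(amount)", "status", "created_at", "month", "orders"))
print(to_sql("postgres", "SUM(amount)", "status", "created_at", "month", "orders"))
```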
On the querying side, it's interesting, because with the SQL API, Cube pretends to be a SQL database too, right? So your BI can connect to Cube: Tableau would connect to Cube as to a Postgres database, or Superset would connect to Cube as to a Postgres database. The missing piece here is the notion of a measure, because Cube has measures inside it, but the SQL spec doesn't have the idea of a measure. So here we're extending the SQL spec, and we're adding the idea of the measure. The measure is just a special type of column, and it has a special function, which we call MEASURE as well. Meaning that when you query Cube and you say, I want to get this measure, you don't need to do any aggregation or calculation with that measure; it will use the calculations that you have in your data model. So we're adding this extension to SQL.
Okay, okay. So I think, probably,
if I was listening to this as the audience, I'd be thinking: well, how does this compare to, say, the dbt metrics layer, or to Looker and LookML? So with the dbt metrics layer, I suppose the classic version, and maybe what's coming with the Transform acquisition: how is Cube differentiated from that product family? And would Cube compete with that, or is it a complement? Generally, how do people understand the difference between Cube and the dbt semantic layer?
I think Cube is definitely complementary to dbt as a transformation tool,
and, you know, in many cases I encourage our customers to use a transformation tool like dbt upstream from Cube. With the dbt semantic layer, it's a little in flux right now, because it's unclear how the end product will look, right? We know the history: dbt announced that they wanted to build a semantic layer, that's how they called it originally. And then at some point, apparently, they decided what they built was not meeting all the requirements, so they decided to buy Transform, right? And Transform is a great team, and they have a great product. I think the question is how these two things are going to converge, and what the end product is going to look like. So it's a little hard for me to tell at this point how it compares to Cube, because I don't know what the end product will be.
But, you know, I think fundamentally there are two questions in a semantic layer where different people have different views: how the metrics should be defined, and how the metrics should be queried. In the Cube world, and I already described Cube's architecture, we are data-set-centric, meaning that we believe the semantic layer, as an interface, as a product, should give you a data set. It can contain multiple measures, it can contain multiple dimensions, but it should be a data set about some specific business entity or some specific business area, right? Like users, sign-ups, transactions. But what I saw from the previous iteration of the dbt semantic layer, they thought in a more metric-centric way, where they would ship a metric and attach multiple dimensions to it, then ship another metric and attach multiple dimensions. So it's less data-set-oriented and more metric-oriented. So that's one difference. Again, it may change with Transform coming in, but that was before.
And the second big area in a semantic layer is how you query it. The Cube way of doing this is to query it through SQL. I think that's the right approach: SQL is the language of data, right? Every tool knows how to speak SQL, so we should support SQL. The Transform team had a little bit of a different approach, and the metrics layer from dbt had it too. The Transform team was using, I think it was called MQL, metrics query language: their own metric query language, which was not SQL. And with dbt, I think it was Jinja templated into SQL. So it's a little different. I think that's an interesting area. I don't know, again, how it's going to look in the end state, but Cube was SQL-first from the beginning, and I think that's the right approach.
Okay, okay.
So I suppose a combination of Cube and Preset is something that we've had a fair bit of success with recently on client projects. Maybe just talk about, I suppose, the particular value in integrating, say, Cube with Preset, and sort of Superset, and I suppose how Cube is investing in that area in the future.
We're having a lot of users and customers with Superset and Preset, so that's why we're excited specifically to build more integration
with that BI tool.
One area where we're working on improvements is integrating with Superset's semantic layer itself. We call this feature Semantic Layer Sync. The idea is to let Cube's data model be pushed downstream to all the different visualization tools, and to synchronize Cube's data model with the BI's data model as well. So in Superset, you have data sets. Cube now can programmatically build and manage data sets in Superset, and define all the metrics and all the dimensions, so users don't have to put them in manually. And every time you make a change in Cube's data model, Cube automatically synchronizes it with the data model in Superset. And that gives a native experience to the end users. I think that's one of the biggest challenges in a semantic layer implementation, how you give this native experience to the end users, and I think Semantic Layer Sync is a way to solve this problem.
Okay, okay. So going back a moment: you talked about the caching layer in Cube, and certainly, again, from our experience, the caching layer, and you mentioned aggregate awareness there, are, I suppose, particularly defining features of Cube. So maybe let's go back to that a little bit. How does that work, and how does it then give you query performance that can be quite fast? I presume you do query rewriting as part of that as well. How does the caching layer work, and what's the end user experience like when that's working properly?
So I'll start with just the high-level architecture, what we have. In Cube, we have our own engine, which is used for orchestration and caching. The first part of the job is to do orchestration, because Cube instances are headless and stateless, meaning that it lets you horizontally scale, but they also need a synchronization point. So our caching engine manages an execution queue; this way it orchestrates all the queries, but it also manages the caching. The caching piece is divided into a router node and worker nodes. So it's a distributed query engine, where the cold storage is Parquet files and the hot storage is the memory of the worker. The way it usually works is that when you define your aggregate in your data model, Cube's caching engine will go to your data source, say Snowflake, will execute a query in Snowflake, and then download the whole result of that query into its own storage. Then it will do some repartitioning, re-indexing, re-sorting, and put it all into the Parquet file format.
And then when a query comes to Cube, say from a BI tool or from embedded analytics, Cube will know that it has an aggregate. That's where aggregate awareness comes in, right? Cube will have the knowledge that the aggregate exists for this specific query, and then it will go and query the aggregate from Cube Store, which is our caching engine, and which is inherently much faster, because we already pre-processed and pre-indexed everything and can load it into the workers' memory for fast querying. So that's why the cache is really fast for the specific queries that are being processed by Cube.
Okay, so this reminds me quite a lot of the days when I used to work with tools like Essbase
and Microsoft OLAP,
when we had things like aggregate storage
and we had things like calculation plans
and aggregation at various levels in the hierarchy.
I mean, do you have a background in OLAP?
And is OLAP something that is on your mind really
about when you think about where the product is going?
Yeah, I mean, my co-founder and I, we spent some time with BI systems like Mondrian and all of that. So, you know, that's where we saw a lot of implementations like this. We try to use ideas from these tools, because it's still multidimensional analysis, right? But I don't think we wanted to rebuild one of the old systems; we try to understand how the same ideas can be applied to the data-warehouse-centric world.
So I suppose we've been talking implicitly
about kind of internal BI use cases for Kube,
but certainly I suppose the origins of Cube were, you know,
maybe access via API and the embedded market.
Tell us a bit about the kind of customers that are using Cube in an embedded context. And how do you actually include Cube in your, say, SaaS application?
How does that kind of work?
There are a lot of tech companies using Cube for that use case, because, as you can imagine, every tech company has some software they sell, and if it's a B2B company, they usually have some sort of insights and monitoring features, like dashboarding features, inside their applications, because customers now demand a lot of analytics features in the product. So we have a big cohort of customers who are using Cube to power customer-facing analytics inside their applications. That's probably been the biggest segment of our customers so far, before we started to see more of the internal BI use case. And in that stack, you would usually run Cube on top of a warehouse or a SQL database and expose the API to the front-end team. And then the front-end team will build some sort of visualization with tools like React or Angular and different charting libraries. We even ship some SDKs and libraries to integrate natively with React and Angular.
Looking to the future
now then, you've got a market
I suppose, something that's been
a validation of your strategy is
the fact that there are other players in the market
now, and in particular, I suppose, you've got Looker with their new universal semantic model and Looker Modeler and so on. So I just wondered: where do you see Cube fitting into the market going forward, and what's the unique space that you think you guys would be in that would differentiate you from, say, the other players, and make you a valid choice to be chosen in preference to, say, something that's maybe more of a safe bet, or certainly more of a known product? So where does Cube fit into the future, do you think?
Yeah. I think we'll see some semantic layers that are going to be not open, but more like closed ecosystems. LookML, or Looker Modeler, that's a good example, right? It's most likely going to exist mostly within the Google ecosystem, with a goal of selling more BigQuery, right?
That I think one of huge difference that we wanted to take with Kube as a product,
we wanted to create not only universal semantic layer, but we wanted to make sure it's open.
So first of all, we open source, which is a big difference, right?
Like in our core offering,
it's an open source.
While we have a cloud product,
but main features, many, many features,
they're still in open source.
So I think being open
and open from any affiliation
from the cloud vendors like GCP,
also open from a code-based perspective,
that's going to be a very big difference for the cube.
And to be honest, I think it's a safe bet for the enterprises and organizations
because, again, the underlying technology is open source, right?
And it's not going to work with big query or you know like
if ibm decides to enter the warehouse market right like you can just build an ibm driver
and use cube with that i think open source is a huge huge difference here do you think there's a
Do you think maybe the way the market might evolve in the future is actually not necessarily that you choose one semantic model and that's it, but that you might, for example, link Cube to Looker? They announced that they were going to open up Looker's semantic model via a SQL interface, so you could imagine Cube running on top of that, maybe to make it easier to embed data from there. I mean, do you think there's a kind of multi-semantic-layer future ahead as well?
That's a good question. I think it may be, and I think it would be a good future if we were able to make it true. The prerequisite for that would be some standardization in the market, and at Cube I would be happy to push for an open semantic layer standard. If some other vendors supported that, it would help bring this standardization. And if we have that standardization, then different semantic layers would be able to integrate, and even merge with each other, which would eventually give you this sort of cross-vendor integration.
Okay. And you've mentioned open source a few times there, and I think, again, a unique differentiator for Cube is that community. How important is the community to Cube going forward? I mean, it was there at the start, but what role will it play in Cube going forward, do you think?
I think it's an essential part of the product, and an essential part of every open source project. You cannot separate the community from the product or from the open source project, because usually they define and influence each other. As much as the product or project influences who your users are, the users influence Cube just as much. Even if we're not talking about direct code commits, just getting feedback, asking questions, and moving the product in a specific direction, the community will influence it. So it's hard for me to imagine the two things as separate entities.
Yeah, okay. And just to round things off: you mentioned Statsbot at the start, and you talked about conversational BI. We actually had David from Delphi Labs on the show a little while ago, and we've been trialling that product since then, and it links in with Cube. It sounds to me that what he's doing with that product, and the way it uses Cube, is kind of going back to some of the things you were trying to do, but doing it with a few more years' worth of technology around, and the whole AI world. Just tell us a bit about that integration you've done with Delphi Labs and how it works.
Yeah, we just announced it a few days ago, and I'm super excited about it. First, obviously, because of Statsbot; I have a soft spot for that. But I think it's a great idea, and large language models right now really enable the use case. That's what we didn't have back then with Statsbot, and that's what David and the team have now. So I'm really bullish on it. There's still a lot to build, of course, and it's still an early use case, but I think we now have a real chance to make it work. So I'm super excited about it.
So maybe, for anybody that doesn't know what you're talking about, just describe what it is and what it adds to what you did originally with Statsbot.
Yeah. So with Delphi, you can actually run a conversational interface. Think about ChatGPT, but it knows about your data, it knows about your data model. You take a Cube data model, you give it to Delphi, and Delphi learns about it. It also knows a lot about analytics in general. Then you can ask questions like, "Hey, how are my sales doing? What is that? What is this?", and it can go and look at your data model, generate a query through your data model, and give the results back. So it's kind of like an analyst that can work with your data model, which is pretty cool. I think that's really what large models and semantic layers enable right now. If we have more adoption of the semantic layer, that will be really good for Delphi and the team, for them to provide a lot of value on top of it. Because semantic layers give meaning, they give business meaning to the data, which is exactly what these systems need.
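As a toy illustration of that idea (a keyword lookup stands in for the large language model, and the metric names are hypothetical, not Delphi's actual implementation): the assistant's job is to map a question onto measures the semantic layer already defines, and emit a Cube query rather than raw SQL.

```typescript
// Toy translation of a natural-language question into a Cube query.
// In a product like Delphi an LLM does this mapping; here a simple
// keyword catalog stands in for it. All metric names are hypothetical.

const metricCatalog: Record<string, string> = {
  sales: 'Orders.totalRevenue',
  orders: 'Orders.count',
  customers: 'Customers.count',
};

function questionToCubeQuery(question: string) {
  const q = question.toLowerCase();
  const measures = Object.keys(metricCatalog)
    .filter((keyword) => q.includes(keyword))
    .map((keyword) => metricCatalog[keyword]);
  return {
    measures,
    timeDimensions: [
      // "this month" is hard-coded for the sketch; a real assistant
      // would parse the date range out of the question too.
      { dimension: 'Orders.createdAt', granularity: 'month', dateRange: 'this month' },
    ],
  };
}

const cubeQuery = questionToCubeQuery('How are my sales doing this month?');
// cubeQuery.measures -> ['Orders.totalRevenue']
```

Because the query goes through the semantic layer, the model never has to guess at table joins or raw SQL; which column means "revenue" is already encoded in the Cube data model.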
I mean, my experience with it, and we've only been using it for a day or so, is that it's like having an analyst working with you on Slack, really. You can just say, "How are sales doing this month?", and, like you say, the large language model in the background is the thing that allows it to have a conversation with you. It will connect to Cube, for example, and then say, "By revenue, do you mean this particular measure here? By region, do you mean this one here?" It has this kind of conversational interaction with you, which I think is the difference between it and what Statsbot maybe was, where there was a certain way in which you had to ask questions, but it couldn't have a conversation with you as such. It then links in with Cube, and you access your data through there. So it's a fantastic way, I suppose, of exposing Cube to a less technical audience, and doing it in the environment they like to use, which is Slack. I was really impressed with it.
Yeah, exactly. I think you're spot on about the fact that it can ask follow-up questions and do a little bit of investigation into what you actually mean. I think that was the missing piece for Statsbot; that's what we realized was not possible to do back then. But now, with large models, it is possible. So this system can ask follow-up questions, build that knowledge, that memory, about what you actually want, and then go and get the data for you. Yes, I think that changes a lot.
Fantastic. So how do people find out more about Cube? And also, maybe just explain what Cube Cloud is as well.
We rely a lot on inbound, as with every open source project, right? It's mostly word of mouth, and maybe a little bit of blogging. People find out about Cube through open source awareness, and then we have a cloud option. The cloud option is not just a hosted version of Cube; it's more like a full-featured product built on top of Cube. That's the way we position it. There are some additional features and additional integrations that we have in the cloud, and in many cases people just opt for the cloud instead of Cube core.
Okay, fantastic.
And Rittman Analytics as well: we're a Cube partner, and we do a quickstart package to get you up and running with Cube. So just a little plug there, really.
But, Artyom, it's been great having you on the show. Thank you very much for coming on and telling us about the origin story for Cube. And, yeah, we'll keep an eye on the product in the future, and best of luck with everything you're doing in taking it forward.
Thank you. Thank you for having me today. It was a really good conversation.