Software Huddle - Architecting Real-time Analytics with Dhruba Borthakur of Rockset
Episode Date: October 10, 2023. In this episode, we spoke with Dhruba Borthakur. Dhruba is the CTO and Co-founder at Rockset. Rockset is a search and analytics database hosted on the cloud. Dhruba was the founding engineer of the RocksDB project at Facebook and the Principal Architect for HDFS for a while. In this episode, we discuss RocksDB and compare it with LevelDB. We also discuss in detail the Aggregator Leaf Tailer architecture, which started at Facebook and is now powering Rockset. Follow Dhruba: https://twitter.com/dhruba_rocks Follow Alex: https://twitter.com/alexbdebrie
Transcript
We don't want to build something if we don't have a customer.
So even new features that we're building,
we're never building a new feature
by deciding between product and engineering and marketing
whether this is the best feature to build.
We always try to build if there are a few customers lining up
saying we will use this feature when needed, when it's available.
So RocksDB is essentially a C++ library
that is used to store data
in a data storage system efficiently,
especially when the data storage system
is on flash or on memory system.
One of the first use cases at Facebook
was replacing like a 500 node HBase cluster
with like a 10 node or something like that,
or 20 node RocksDB cluster. That's the
time when people kind of woke up and thought, oh, this is good technology. Then we tried to
understand what this stuff is. Hey folks, this is Alex DeBrie and I love today's episode. I had
Dhruba Borthakur on here, who is the co-founder and CTO at Rockset, which is just one of my
favorite products. It pairs really nicely with DynamoDB and with a lot of the sort of real-time data problems that people have. So I recommend it to a lot of folks. The one thing I
like about Dhruba is he's super technical and has been, you know, doing a lot of things. He was
principal architect for HDFS for a while. He was the creator of RocksDB at Facebook. You know,
he's been co-founder at Rockset, but also he's just like a really good teacher, right? He has
these videos on Rockset's YouTube channel that explain all these different architectural concepts with really
great diagrams and teaching. So we talk a lot here. We talk about RocksDB and Rockset and also
the aggregator leaf tailer architecture, which is a really interesting one that, you know, sort of
started at Facebook and he brought to Rockset as well. So I hope you like the show. You know,
if you like it, please subscribe to the podcast, to the YouTube channel, and feel free to reach out with questions or future guests you
want to see and be sure to follow both me and my co-host Sean Falconer on Twitter. And with that,
let's get to the show. Dhruba, welcome to the show. Hey, Alex, how are you? I'm doing well. I'm really
excited to have you on the show for a couple different reasons. Number one, you're the CTO at Rockset. And I just love Rockset. I recommend it to so many people because it's just so useful and unique in what it does. So number one, love Rockset and love what you've done there. But number two, you're deeply technical and you're also really good at explaining this stuff. So you've done a bunch of these videos on the Rockset YouTube about different architectural concepts or things about
how Rockset works. And I just love what you've done there. So I'm excited to learn a lot today.
But maybe, Dhruba, if you could give us a little bit about your background and what you've been up to.
Sure. Yeah. Thank you. Thanks for inviting me to your show. I've heard some of your previous
episodes. They were fantastic. Because I really love the technical deep dives that you do on technology.
Good, good. We're going deep today. Good. So, good to hear.
So, yeah, my name is Dhruba. I'm the CTO and co-founder at Rockset
now. Rockset is a
search and analytics database hosted in the cloud. We've been
around for six to seven years now.
And prior to Rockset, I was at Facebook
building a lot of data backends at Facebook,
including RocksDB and some photo storage
and Hadoop and HBase and these kind of backend systems.
And prior to that, I was actually working
on the Andrew File System from CMU.
It was a spin-off and it was kind of the first distributed file system out there.
This is like 20 plus years ago.
So I mostly spent a lot of time building servers and storage and backend systems and things like these.
Yeah, I'm excited to talk to you today about a lot of database technologies.
I think this is what you had in mind, I guess.
Yeah, absolutely. And yeah, I love your background.
You know, a lot of things I've used either directly like HDFS or RocksDB and a lot of, you know, indirect things as well.
So first of all, let's get started.
Just tell me about Rockset, your current company that you've been working at and co-founded.
What is Rockset?
What would I use Rockset for?
Sure, yeah.
So Rockset is a search and analytics database.
This is the first, I think, of its kind, in the sense that you had search technologies earlier,
or database technologies earlier, but Rockset kind of builds the backend
for combining these two types of technologies that have been there for a while.
So search databases essentially are for when you want your queries to be optimized for low latencies:
you have an online application
that is making queries 24/7 on your data sets
and you want the queries to come back in milliseconds,
so you have to use a combination of search
and database technologies.
So this is what Rockset is.
It's hosted for you in the cloud.
It's a managed service.
You can connect your data sources
and immediately start making SQL queries on your data.
The API is very standard SQL,
so kind of everybody knows how to use it from day zero.
And then it's very cloud native.
Like it's built natively for the cloud,
unlike other systems.
So you can get all the cloud-friendliness of serverless,
auto-scale up, scale down, and these kind of things.
It usually powers applications that are running all the time
and kind of making automatic decisions on your behalf.
Recommendation systems, personalization systems,
fraud detection systems.
Those are the kind of applications that can kind of get the maximum value
for the money when they use Rockset.
That's the short spiel.
I hope I was able to explain to you what it is and what it does.
Yep, absolutely.
So I first got started with Rockset because it works so well with DynamoDB, right?
I do a ton with Dynamo, which is very good at sort of like known point queries or range
queries.
But then if you have like long tail complex filtering or aggregations or search, things
like that, not so good for Dynamo.
And Rockset just fills that gap really, really well.
What do you compare Rockset to,
like related systems or things like that?
If people are trying to frame,
like where does Rockset fit in my architecture?
Yeah, that's a good point.
I think in my mind,
databases essentially come in two types, right? One is transactional databases
and one is analytics databases.
And then there is this odd man out from earlier times, which is, oh, I have a search system where I can do
some text search or log search. So the difference between analytical databases and transactional is
that transactional are mostly used for kind of say credit card transactions,
like very easy for me to explain to somebody.
You want to deduct money from one account here
and put it in a different account,
all in one transaction.
So there's less of analytics.
There's less of automatic decision-making.
It's more like a slave to what the application is doing.
And recording, it's like a ledger
that you are recording things, right?
You need consistency, you need atomicity, you need many other things.
But for analytics, it's all about ability to look at large portions of data, right?
And then able to extract insights from it or able to take some decisions on this data.
For example, let's say you have a data set where you are recording, let's say you are like a fleet
management system, right? You have thousands of trucks as part of your business and the trucks
on the road. And you are monitoring where each truck is, what is it picking up, where does it
need to deliver things. And then you need to, let's say, you need to reroute the truck based on
the things
that it has to pick up and then drop off, right?
Dynamically.
So this is kind of a decision-making process.
It's not like taking out money from one account
and putting it in another account, right?
So these kinds of complex transaction decision-making
are best done in an analytics database.
So when you talk about what category
of other data systems
that people might use that compete with Rockset,
these are very much analytical in nature.
Things like, let's say you have maybe a MongoDB application
that you are running to do these kind of fleet management
and then trying to do automatic decision-making.
Again, for analytics purposes, not for transaction.
Or you could have, take for example,
an Elasticsearch system
where you have stored, let's say, your catalog.
You are an online retailer
and you have your catalog stored in Elastic.
And you are trying to find out
which item you should reorder
based on past transaction history on your catalogs.
And what is the lead time you need before you should
place the reordering
transaction?
These kind of decisions
are what Rockset is best
used for. So typically we compete
a lot with Elasticsearch, because
people have kind of
been using Elasticsearch for a long time.
I mean, it's worked well.
It's a good system.
But it's kind of not built for the cloud
as you look at Elastic. So we see a lot
of people migrating from Elastic to Rockset.
And then
when you talk about what other data
systems we compete with,
we are SQL versus many other systems that are not SQL,
like, say, MongoDB or Elasticsearch.
These systems are trying to build some SQL APIs in the recent past.
But when I talk about SQL,
I'm not really talking about select statement.
I'm talking about joins.
This is what in my mind is a SQL API.
Everybody can implement select star, right?
That's kind of easy to do; whatever it is,
you can call it SQL if it does select star.
So our differentiator again is that we have a SQL API to our backend,
which means that you can join tables as part of your queries.
You can do search queries and then you can join them with other queries
and then do another round of search.
So search and aggregations are kind of combined together
in the standard SQL API.
So from that perspective, sometimes we also see people migrating
from Snowflake because you are using Snowflake for your reporting,
but then you have a real-time use case you put on Snowflake,
and then you quickly find out that costs on Snowflake are very high
because it's not built for real-time apps.
So then those guys, those applications also move to Rockset.
But the majority of our use cases are very search
analytics centered, where latency of
queries is key to serving these applications.
I like to tell people Rockset is like Elasticsearch, but without the pain, basically. And the big things there being, like, you know, Rockset is going to automatically ingest from your data source, whether that's Kafka or, like I had it, DynamoDB and DynamoDB Streams, and it's just automatically indexing and I don't have to do anything. You can do the same with Mongo's oplog or, you know, relational database streams, things like that, or Kafka. So that managed ingestion, really good.
The management of the compute and sort of scaling that up and down, and managing that better than
Elasticsearch, which was going to fall over for me. And then, and then like you're saying,
in contrast to something like Snowflake or Redshift or Athena, these like other
OLAP heavy type things.
If you want to do much sort of faster, more interactive queries, whether that's for your internal team or, like, user-facing queries: data you're showing, you know, dashboards and things like that, to users in your application.
Just going to work a lot better that way.
Can you tell me about the converged index and what that is in Rockset?
Yeah, no, absolutely.
So, yeah, we talked about SQL.
I can explain how converged indexing works,
and then I can kind of join the two components together.
But, yeah, just to wrap up on the previous point,
I really like DynamoDB.
I mean, I know you had a lot of experience with DynamoDB.
The simplicity of the system is really great,
and the fact that it scales.
So you can think about Rockset more like DynamoDB with a SQL API where you can do search, aggregation, joins, everything else.
And you don't have to think about managing servers or systems anymore.
But yeah, coming back to the uniqueness of Rockset
is that we have a SQL API.
That's one of our differentiators compared
to say Elasticsearch or other systems and the other big difference is that we have something
called a converged index. So converged index is a piece of technology which lets us build indices
on different fields of your record in multiple ways so that your queries are fast. For example, for some of these
columns, let's say you're making a lot of search queries, right? So you need an inverted index on
those. For other columns, let's say you're doing more aggregations, like average, min,
or max, or some aggregates that you're trying to compute; for them we would create the columnar index.
And then there are some certain times
you just need to look up all the fields in the record.
Then it's more like Postgres or Mongo:
for those fields,
you would build kind of the row index.
We call it the row index.
It's for, like, the projections in your SQL query.
But we also see sometimes that there are numbers
on which you're doing a lot of range queries.
So we would build some range indexes for those fields.
And then Rockset is also kind of a database
where you can put NoSQL data on one side,
but then make SQL queries on the other side.
So we also optionally can index all the types of your data.
Like it's basically a multi-type database,
with multi-type columns in the database.
If you have one column, which is integers and strings,
we can also optionally index those
so that you can say, at query time,
tell me all the records where
zip code is an integer and zip code is greater than 48, let's say. I'm just giving an example.
So you can also essentially build indices on the types of these objects, because it's multi-type.
And again, it's all in one index. So this is what we mean by converged index. By default you can
build all indices, but you also can optionally switch on and switch off indices
on some of these fields.
Because at query time, the query system will automatically leverage
this converged index to figure out what is the best access pattern
or access path.
So we have a cost-based optimizer that will do some of the hard work for you.
To think like, which indices to use?
Shall I use the inverted index or shall I use the column index for the query?
Sometimes, because we also support joins,
there are things like, there are like four or five types of joins that we do,
like hash join, lookup join, whatever, broadcast join.
And so the cost-based optimizer,
based on the converged index we built
and the statistics we maintain,
can figure out which join would be the best thing
for that query.
But again, all these things are built out of the box,
but you can always override it
and give more hints to the system so that,
I mean, the human brain is sometimes
more intelligent than code
and heuristics.
You know this.
So there's always the option of specifying hints, so you can override what the system is doing by default for you.
So, yeah, that's the converged index part.
We have good performance measurements on how the converged index behaves when you compare it to the column scans in other systems. Sometimes people might try benchmarks on Druid or ClickHouse or these kind of systems
where there's only columnar storage.
But converged index storage for us actually gives far better price performance for
most of those queries compared to just the pure column scan that you have in other databases.
So that's the converged index story.
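(To make the converged index concrete, here is a toy sketch of the idea in C++. The key formats are invented for illustration and are not Rockset's actual on-disk layout; it just shows one document fanned out into row, column, and inverted-index entries in a single key-value store.)

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy converged index: every field of every document is stored three
// ways in one key-value store (a stand-in for RocksDB). The "R/", "C/",
// and "I/" prefixes are hypothetical, for illustration only.
struct ConvergedIndex {
  std::map<std::string, std::string> kv;

  void Index(const std::string& doc_id,
             const std::vector<std::pair<std::string, std::string>>& fields) {
    for (const auto& [field, value] : fields) {
      // Row index: fetch the whole document by id (Postgres/Mongo-style).
      kv["R/" + doc_id + "/" + field] = value;
      // Columnar index: scan one column across all documents (aggregations).
      kv["C/" + field + "/" + doc_id] = value;
      // Inverted index: find documents containing a value (search).
      kv["I/" + field + "/" + value + "/" + doc_id] = "";
    }
  }
};
```

A cost-based optimizer can then pick whichever layout is cheapest for a given query: the inverted index for a selective filter, the columnar index for an aggregation, the row index for fetching projections.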
Yep.
And one thing that really drove it home to me is,
again, I'm like a Dynamo guy,
and I'm like known access patterns
is what I think about and stuff.
And so like these long tail queries
on like arbitrary attributes or columns,
things like that are really hard in Dynamo, right?
And I think I was talking to Venkat,
your co-founder,
as he was sort of
explaining that to me. And like, you could have, you know, a JSON column that
has arbitrary user-inputted data, and if you query on some sort of arbitrary column
attribute there, that's actually really efficient for Rockset. Like this long tail query,
probably not that many records have that exact key and value if it's like user input. So it's using an inverted index, which is super specific,
and it's like really narrowing it down.
Whereas Dynamo, you probably have to like look at a bunch of records,
sort of filter them out yourself manually and handle that.
So like those long tail, hyper specific queries are just like actually very efficient in Rockset,
which was counterintuitive to me.
I mean, Dynamo, maybe you can build this global secondary index
on some fields to do this.
But for Rockset, this is
why you get an order of
magnitude better price
performance because we use something called
RocksDB internally that does this for us.
So essentially, you can think
about DynamoDB with global secondary
index versus using Rockset
and letting the Rockset technology do the indexing.
So Rockset, essentially what we have done is that
we have made the cost of indexing low
compared to all other systems that are out there.
That's kind of our biggest differentiator.
So people usually think indexing is costly
and, oh, it's going to cost me a lot of money
to index all my data.
But no, that's not the case.
We have good technology based on RocksDB
and based on some of our
converged indexing processes
that actually let you run this
with good price performance on large datasets,
like tens of terabytes
and hundreds of terabytes of data,
and be competitive.
It's amazing.
Yeah, that segues into
sort of this technical deep dive.
I want to start,
I want to investigate from like the bottom up and sort of start low level and move up, starting with RocksDB, which you
just mentioned. So while you were at Facebook, you created RocksDB. This means billions of people are
using code that you wrote every single day, you know. But RocksDB is like one of those hidden
layers, you know, that we don't even know we're using. So what is RocksDB?
Yeah, so RocksDB is something that we built when I was part of the engineering team at Facebook, right?
Again, I started the project, but like maybe tens of developers have contributed to the success of the project, right?
So RocksDB is essentially a C++ library that is used to store data in a data storage system efficiently,
especially when the data storage system is on flash or on memory system. So it's optimized for
kind of running a database on SSDs or on flash memory or some other kind of random access
storage systems versus spinning disks, right?
So this is optimized for query latencies. At Facebook, we built this. I mean, before that,
actually, at Facebook, we were using a lot of HBase, which was the Hadoop-based
database system. And I remember when we were building RocksDB, one of the first use cases at Facebook was replacing like a 500-node
HBase cluster with like a 10-node or something like that, or 20-node RocksDB cluster. That's
the time when people kind of woke up and said, oh, this is good technology. Let me try to understand
what this stuff is. But yeah, RocksDB is essentially a C++ library that lets you efficiently index your data
and optimize your data
so you can do high write rates
as well as optimize for query latencies.
So these are sometimes conflicting in nature
but when you have fast storage like SSDs, you can get both.
It was built essentially for
managing large datasets on flash drives.
And flash became very popular in 2010, 2012, 2013,
this kind of timeframe.
And that's when we started the RocksDB project.
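(For readers who haven't touched it, a minimal sketch of what using RocksDB looks like: it's an embedded C++ library you link and call, not a server. The path and data here are arbitrary.)

```cpp
#include <cassert>
#include <string>
#include "rocksdb/db.h"

int main() {
  rocksdb::DB* db;
  rocksdb::Options options;
  options.create_if_missing = true;  // create the database files on first open

  // RocksDB is embedded: it opens a directory on local (ideally flash) storage.
  rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/rocksdb_demo", &db);
  assert(s.ok());

  // Writes land in an in-memory memtable plus a write-ahead log, and are
  // flushed and compacted in the background, the LSM part Dhruba describes.
  s = db->Put(rocksdb::WriteOptions(), "user:42", "{\"name\":\"alex\"}");
  assert(s.ok());

  std::string value;
  s = db->Get(rocksdb::ReadOptions(), "user:42", &value);
  assert(s.ok());

  delete db;
  return 0;
}
```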
And so RocksDB is sort of built on top of LevelDB,
which was created at Google by Jeff Dean and Sanjay Ghemawat, like sort of legendary programmers.
I guess, like, how did you improve on their work?
Was it this change in hardware
for stuff moving to Flash and SSD?
Or like, what was sort of the insight
that you all had with RocksDB
that was improving on LevelDB?
Got it, yeah.
No, that's a good point.
So at that time, like prior to RocksDB,
I was mostly a developer with HBase.
So I wrote a lot of the HBase internals.
HBase was also trying to do something like log-structured merge trees.
And I know that implementation wasn't as good as it could be
when you're running on Flash.
So I was trying to look around for other technologies
that can do LSM engines on Flash.
And this is when LevelDB came in the picture.
At that time, LevelDB was mostly built for,
I think the Chromium browser or something like that.
Basically it was built only for kind of
in-memory data stores, but it was an LSM engine.
And the code was very well-written, so I could kind of read all the
LevelDB code in maybe like three or four days.
It was like extremely well-written code.
So I kind of thought, oh, this code is something I can
definitely take up and make it better, or make it useful
for server applications.
The relationship between LevelDB and RocksDB is more like, how should I say,
it's like parent-child relationship. They have their own personalities and characteristics and
whatever the ways you interact with them. But there is one common gene between these two is
that both of them are log-structured merge trees, which is where new writes come
in, they get stored in one place, and then when the overwrites happen, they get stored
in a different place.
And then over time, these things get merged and compacted.
I'm kind of making it sound very simple, but essentially this is what, compared to a B-tree,
an LSM tree kind of does, where there is compaction that's happening in the background. Are there times today when LevelDB might work better for you than RocksDB, or has RocksDB sort of superseded it?
No, so I mean, I can rattle off maybe a huge number of things. Take, for example, compactions, right?
Compactions is critical to a database because only if you're able to compact and reduce your size can you take in more data.
So it's highly dependent on your write rate.
If you cannot compact, then you have an unstable system.
So LevelDB compaction is, again, single-threaded, and you can write in only one thread.
RocksDB compaction is multi-threaded.
We have different compaction strategies.
We have level compaction.
We have universal compaction, which reduces
the write amplification that you do on the storage.
And what else?
Oh, yeah, again, things like, say, the basic table format: LevelDB had kind of a block-based
table format, right?
But then we were using RocksDB for the Facebook news feed. So the Facebook news feed is the feed
that, when you fire up
the Facebook app,
shows you all your
posts and comments
by your friends.
That's powered by
RocksDB.
And one of the changes
we did there was that
we created a plain
table format,
instead of a block-based
format.
Because the news feed
is running on, like,
RAM systems.
So your storage is essentially random access storage,
versus an SSD or disk-based storage, which is kind of very block-based.
What else?
Then also at Facebook, RocksDB is also used for memcache,
storing memcache blobs.
So RocksDB had like a blob interface now
where you can store larger size datasets
or larger size blobs,
whereas in LevelDB,
they're usually much smaller sizes;
small things are ideal for it,
as a caching system.
Oh yeah, memtables.
LevelDB has one kind of memtable,
which is a skip list.
But for RocksDB,
we have like eight different types of memtables,
because sometimes
a skip list memtable
is great,
but sometimes maybe
a vector memtable
is great, because
you're not reading
your recent writes.
Sometimes a different
data structure.
So we have like
all these pluggable
components.
So yeah,
I think,
I mean,
this is like 12
or 10 years back.
Oh, yeah, RocksDB, we open sourced almost 10 years ago to the dot, right?
Yeah, back in like 2013.
And it's come a long way.
Facebook had put in a lot of effort building RocksDB over the last 10 years.
There are like probably 15 people in the team continuously working.
So there's been a lot of change.
So we can't move back to LevelDB.
None of these
workloads would work
if you migrated from
RocksDB to LevelDB.
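(To make those LevelDB-versus-RocksDB differences concrete, a hedged sketch using real RocksDB option names; the specific values are arbitrary and this is not a recommended production configuration.)

```cpp
#include "rocksdb/db.h"
#include "rocksdb/memtablerep.h"
#include "rocksdb/slice_transform.h"
#include "rocksdb/table.h"

rocksdb::Options MakeOptions() {
  rocksdb::Options options;

  // LevelDB compacts in a single thread; RocksDB can run many
  // background flush/compaction threads.
  options.IncreaseParallelism(8);

  // Multiple compaction strategies: level-style (the default) or
  // universal, which trades space for lower write amplification.
  options.compaction_style = rocksdb::kCompactionStyleUniversal;

  // Pluggable memtables: skip list is the default, but a vector
  // memtable can suit write-heavy loads that rarely read recent writes.
  options.memtable_factory.reset(new rocksdb::VectorRepFactory());

  // Plain table format (vs. the block-based default) for low-latency,
  // RAM-like storage; it needs a prefix extractor for its hash index.
  options.prefix_extractor.reset(rocksdb::NewFixedPrefixTransform(8));
  options.table_factory.reset(rocksdb::NewPlainTableFactory());

  return options;
}
```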
Okay.
So in terms of like
where RocksDB is used,
obviously Rockset,
a lot of like,
you know,
very high scale
application workloads
like Facebook might use.
The first time I think
I came into contact
with it was like
Apache Flink,
you know,
if you want to do like
local state storage,
stuff like that. But also just a bunch of databases
have RocksDB-based storage in it.
It's like Cassandra, Mongo, MySQL
have the option to use those.
What, for those databases, you know,
if you're looking at MySQL using MyRocks
or MongoRocks for MongoDB,
is that going to be the better choice for most people?
Or are there like certain cases where that engine might work,
but certain cases where sort of the traditional, you know,
like InnoDB or whatever MySQL engine, would work better?
Yeah, I think that's a good point.
I think in general, LSM engines work well for most use cases now,
right?
But B-trees,
again, the difference essentially is
irrespective of whether
you are using RocksDB
or some other LSM engine.
Take, for example,
MongoDB.
I think they have
a new LSM engine
called WiredTiger.
Not new,
but from the same time,
like 10 years old.
Mostly, I think,
by default,
they use the WiredTiger
LSM engine.
So, yeah,
I think LSM engines
are definitely
something that has become
very popular over the last few years, especially
because you're running on Flash.
And with flash, you can avoid
the wear and tear when you
don't do kind of in-place
writes into the same
flash page over and over again.
So that was one
thing where the hardware
kind of decided how the software would migrate to.
But yeah, RocksDB is also used for things like you mentioned about Flink, you mentioned Kafka.
It uses RocksDB as a state store.
Databricks, I think the streaming SQL, they use RocksDB again as a state store. Because at some point, RocksDB is very much like a,
what should I say?
This is the Swiss army knife, right?
You have, it's very sharp.
As long as you know how to use it,
you could get the best deal out of it.
If you don't know how to use it,
you could be in a lot of trouble
because it's a complex piece of software that's out there.
It's high-performance,
but it's quite complex to tune and manage.
All right, cool.
Let's move a layer up the stack now.
And I want to talk about the ALT architecture: aggregator, leaf, tailer.
This is something you've done a video on.
I'll link that in the show notes.
But just for the listeners, what is the ALT architecture?
Okay, yeah.
So ALT architecture, the full form is aggregator, leaf, tailer architecture.
This is an architecture that we use at Rockset for building our analytics database.
But it's not something that we invented ourselves.
We did this also when we were at Facebook, building, say, the Facebook News Feed app.
Again, this architecture is important so that we can scale a real-time analytical system,
where a real-time analytical system, in our parlance,
is one that is doing a lot of writes
and a lot of queries at the same time.
That's the real-time part, right?
You can't do the writes in one place
and then upload all your data to be queried after half an hour.
So the real-time needs both of these two working in tandem.
And the ALT architecture supports this well,
where you need to do
real-time analytics.
So the three components:
one of the basic ways
to explain this
is that it's completely disaggregated.
And there's a three-way
disaggregation between
the storage needed
to store your data,
the compute needed
to index new data
that is coming into your system,
and the compute needed to query
all the data that you have stored.
So this is a three-way disaggregation.
It's not just a disaggregation between compute and storage,
but it's also a disaggregation between storage,
the compute needed for queries,
and the compute needed for indexing.
So the leaf part is the storage nodes.
In the ALT, the L part is the leaf.
Leaf is where your data is stored.
So it has its own tier of machines, or conceptually its own set of servers.
If you have more data and your volume of data grows,
you grow more leaf nodes and then you scale up.
But then if your amount of new data coming into the system grows,
let's say today you are sending 10 megabytes a second,
but tomorrow you want to send new data at 10 gigabytes a second,
you need more compute to index and ingest the data.
So that's the tailer part of things in the
ALT architecture. And the aggregators
are a set of nodes which are
used for queries. So when a query
comes in, let's say a SQL query or
whatever query it is,
it needs to be compiled and made
into a query plan and executed.
That's done in aggregators.
So if there are more queries,
today you have one query a second,
tomorrow you have a thousand queries a second,
you grow your aggregators.
But you don't have to grow your leaf nodes and you don't have to grow your tailer nodes,
which is why you get the best price performance
for these kind of systems
because you can spin up and down
each of these layers by itself.
And that's the key part of the ALT architecture.
So we use it at Rockset extensively,
again, to power our backend database. You can have your own set of nodes for ingest and your
own set of nodes for queries. And it gives you the best price performance. Okay. Okay. So just
to make sure I'm understanding ALT: the tailer is reading from a source, maybe that's a DynamoDB
stream, maybe a Kafka stream, some sort of source. It's reading, it's doing the indexing itself.
The data then lands on a leaf node,
which is holding it for storage.
And the aggregator, that's going to take a query
and sort of maybe fan out, do scatter gather
to multiple different leaf nodes,
which will sort of read, handle their part of the query,
send it back to the aggregator for sort of final assembly
and sending that query back.
Is that right?
Yes, yes.
Okay.
Absolutely.
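(A minimal scatter-gather sketch of that aggregator/leaf interaction, with hypothetical interfaces; real Rockset aggregators speak a message-passing protocol and handle far more than a count.)

```cpp
#include <future>
#include <vector>

// Hypothetical leaf: owns one shard of the data and answers a query
// fragment over just that shard (here, a partial count of matches).
struct Leaf {
  std::vector<long> shard;
  long PartialCount(long threshold) const {
    long n = 0;
    for (long v : shard)
      if (v > threshold) ++n;
    return n;
  }
};

// Aggregator: fans the query fragment out to every leaf in parallel,
// then assembles the partial results into the final answer.
long AggregateCount(const std::vector<Leaf>& leaves, long threshold) {
  std::vector<std::future<long>> parts;
  for (const auto& leaf : leaves)
    parts.push_back(std::async(std::launch::async,
        [&leaf, threshold] { return leaf.PartialCount(threshold); }));
  long total = 0;
  for (auto& p : parts) total += p.get();
  return total;
}
```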
So in terms of,
I imagine those are like bound by different things.
Like maybe the leaf node,
is that going to be more IO bound?
And then is sort of the indexer and aggregator
going to be more CPU
and maybe a little bit memory bound on some of those?
Yeah, that's a good point.
So the leaf nodes typically are bound by
storage capacity.
But yes,
you're right,
depending on your workload,
it could also be bound
by just the IOPS
that is out there.
But then all the
aggregators,
they're essentially bound
by the compute
that you have,
right,
like the queries.
And also the aggregators
typically have some on-demand cache
from the storage tier.
So usually CPU and cache would be put together
so that you get the best price performance.
So sometimes you're also bound by the amount of cache that is there.
Let's say your working size is very big,
then you might need more aggregators.
Or if you need more compute,
then also you might need more aggregators.
Same thing with ingest.
Ingest is typically CPU bound.
It's not memory bound or anything because it's basically ingesting and indexing data.
So yeah, the good part is that each of these tiers you can scale based on only one thing.
And it's easy to scale up when that thing is under high resource contention.
Gotcha. How big are those leaf nodes? And are those multi-tenant? Are they single-tenant
leaf nodes?
So I think, for ALT, it's a general architecture, right? We used it at Facebook, we use it at
Rockset. I know LinkedIn also uses it for some of their feed systems. So now coming
back to Rockset, when you ask about how big are the leaf nodes, typically these leaf nodes are what we call the hot storage in Rockset.
These are nodes which have locally attached SSD devices.
Those nodes could have maybe 2 terabytes to 60 terabytes of storage for some of these storage nodes.
And most of these nodes, also what has happened is that the networking speeds have also improved a lot in the last 10 years.
So you can get 10 gigabit, 40 gigabit network speeds.
So there is sometimes a fine balance between your IOPS
and your network speed so that you can kind of get the best performance again for the workload.
But yeah, these are storage heavy nodes.
And are those nodes multi-tenant or are they going to be dedicated to just my, what do those leaf nodes look like there?
Yeah, so for Rockset, again, there are two different modes.
It depends on the deployment option, right? One deployment option is the multi-tenant
option, where some of your data would be on the same cluster nodes, maybe, but not on the same device or on the same node. But again, for Rockset, what happens is that, for
Rockset, we separate the durability from the performance. So the durability we get by putting
all the data in S3 or AWS backends, right? So that you never lose the data. And so the
leaf nodes are mostly kind of, you can think about it more like
an accelerator for all the data that is in S3. You see what I'm saying? Because accesses from
S3 take like 400 milliseconds,
whereas accesses from SSD will take, whatever, five microseconds or something like that, right?
So yeah, so these leaf nodes are essentially big machines and we have some consistent hashing
between them so that if a leaf node dies, we can refill its data from S3 in parallel onto many other leaf nodes,
because these have, let's say,
this leaf node has 20 terabytes of disk.
If it's a single machine
that will refill from S3,
it could take a long time.
We have some good partitioning scheme
and hashing scheme
so that the load is balanced on failures.
But yeah, every system is very unique.
I mean, every system has a lot of challenges to fix,
so it's fun to talk about each of these systems.
We also write openly about all of these,
all of the backends that we have.
Typically, I mean, we try to write blogs
and some more detailed analysis
about how we are implementing the backend.
Yeah, it's true.
I love your blog and your videos,
walking through this stuff. I've learned so much from these.
So keep doing the great work on those. I want to talk a
little bit about the aggregator now. Is the query, like, parser
and planner happening at that aggregator node?
Got it. Yeah. So I think your query, your question is about
like how does Rockset
maybe execute the SQL query, right?
So Rockset is all SQL.
I mean, the standard API
from a customer
is a SQL over REST, right?
So you can make a SQL over REST.
So it goes to an aggregator node
that will do some query parsing
and then compilation.
And then it will look at some statistics
to do a query plan.
So these statistics, again,
you don't look at the statistics of the entire collection,
but based on some samplings of data.
And then figure out saying that,
okay, I should use the index filter
for this query versus column scan
because this is going to be a highly selective query.
And then we have an execution engine that
will be given
the plan to execute.
The execution engine is essentially
again a C++ written
backend where
the goal for the execution engine is
not to like spin up
JVMs or spin up threads or
there's none of those.
It's, again, optimized for low latency.
So it's very much like a data flow
or the Volcano-style model,
where there's data flow,
there is message passing
versus more RPC-style,
less of RPC-style,
more message passing kind of style.
So the query flows to all the leaf nodes.
The leaf nodes might do some work,
send them back to the aggregators
and the aggregators are going to
do more processing. Sometimes
it needs to go back to the storage
to do other kinds of work,
again, because SQL
is a very complex language,
so depending on how complex the query is,
it could need multiple round trips
sometimes to get stuff.
But yeah, so the aggregators
essentially do this SQL parsing planning and then submitting the query for execution by the
execution engine. Gotcha. And if I can recall, I remember doing like some analyzed query stuff in
the Rockset dashboard before. And I feel like a lot of queries ended up being
sort of like an initial step, you know, finding the right records
based on some filter conditions,
you know, hitting either the inverted index
or the columnar index to find those.
And then once I found those,
once I found those sort of target records
doing like the hydrate step
where it's hitting the row index
to get any other attributes I want to have with it.
Is that kind of like a common pattern?
That's true.
I think so.
I think SQL,
so SQL will have select some columns.
Those are the projections, right?
And then there's a where,
there's some filter conditions.
So typically we run the filter conditions to reduce the size of data
that this query needs to test.
And then we do fetch all the projected fields
and then do the aggregation
because aggregations are usually on the projected fields.
And then some of those aggregations...
So actually, good point that you mentioned this.
So we have SQL, but you can run the SQL not just at query time,
but you can also run some of the SQL at ingest time.
So for example, you are aggregating.
So let's say you want to find the number of unique users
visiting your website every hour, right?
So we know what the query would look like.
So we can kind of transfer some of the compute needed for queries
that we actually run at ingest time, and kind of
keep semi-rollups. We call them rollups
or ingest transformations, where
it will do some partial counts and aggregations
when new data is coming in. Again,
you don't have to write that
code, but it happens automatically
for you. And then when you run the SQL
query, let's say you want to find the number of
unique users every hour,
you might not need to look at everything.
You can just look at, say, pre-aggregated data that has happened every minute,
and then just look at 60 of those to give you the per-hour aggregate.
So yeah, we have SQL on both sides:
when data is coming in and when you're making the queries.
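(A toy version of the rollup idea just described: keep per-minute partial state at ingest time so the hourly query touches 60 small buckets instead of every raw event. The data structures here are illustrative; a real system would use sketches like HyperLogLog rather than exact sets.)

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>

struct UniqueUserRollup {
  // Ingest-time rollup state: distinct user ids bucketed by minute.
  std::unordered_map<int64_t, std::unordered_set<std::string>> per_minute;

  void Ingest(int64_t event_ts_sec, const std::string& user_id) {
    per_minute[event_ts_sec / 60].insert(user_id);
  }

  // Query time: unique users in an hour is the union of 60 minute
  // buckets, with no rescan of the raw events.
  size_t UniquesInHour(int64_t hour_start_sec) const {
    std::unordered_set<std::string> seen;
    const int64_t first_minute = hour_start_sec / 60;
    for (int64_t m = first_minute; m < first_minute + 60; ++m) {
      auto it = per_minute.find(m);
      if (it != per_minute.end())
        seen.insert(it->second.begin(), it->second.end());
    }
    return seen.size();
  }
};
```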
Okay.
When you talk about when it's making that plan, it's going to look at
statistics, table statistics and things like
that. What is the process for
getting those statistics? You mentioned some
sampling, but is it, you know, the
tailer node is doing some indexing and
generating statistics, and is it
sending it to the aggregator somehow?
Does each aggregator have, like, the
statistics locally? Or, like,
how do those stats get communicated?
Great part.
So the tailers actually don't communicate with aggregators at all.
Okay.
This is the good part.
This is why we call it the ALT architecture, completely disaggregated.
So that if a tailer is getting stuck for some reason,
it cannot cause the aggregators to get stuck because of some RPC deadlocks
or whatever else.
So what happens is
that the tailers read all the new data coming in, and then they will automatically generate some of the
statistics that we need for some rows. Now this could be like some kind of very elementary
histogram of, say, the ranges of a field. Let's say you have an integer field:
what ranges of integers are out there, via some kind of sampling mechanism, is what I'm giving you.
If there are strings, we have some idea about the lengths of strings and other things, as a sample.
These things happen when data is coming in, so it's part of the indexing process. This is what
basically also part of the converged index
that we have,
which not only builds all the indices,
but also builds higher level
summaries,
you could say,
for some of these
so that the query engine knows.
So now when the aggregator
needs to find this,
it just looks at the leaf
and it knows which fields,
which are kind of the hidden,
I shouldn't say hidden,
but more like the attributes of the data.
It knows how to fetch the attributes
of the data from the leaves,
and it uses those to plan the query.
So there's no communication
between the tailers and the aggregators.
And the only thing is the shared storage,
which is what we call as the leaf nodes.
This gives us something good:
isolation, compute-compute separation, and other things.
We can talk about it again later if you're
interested, but that...
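(A sketch of the kind of elementary per-field statistics a tailer could fold into the index as data arrives; the names and shapes here are illustrative, not Rockset's.)

```cpp
#include <algorithm>
#include <cstdint>
#include <random>
#include <vector>

// Elementary per-field stats, updated as each new value is ingested.
// A cost-based optimizer can later read these to estimate selectivity.
struct IntFieldStats {
  int64_t min = INT64_MAX;
  int64_t max = INT64_MIN;
  uint64_t count = 0;
  std::vector<int64_t> sample;  // small reservoir sample, not all rows
  static constexpr size_t kSampleCap = 1024;
  std::mt19937_64 rng{42};

  void Observe(int64_t v) {
    min = std::min(min, v);
    max = std::max(max, v);
    ++count;
    if (sample.size() < kSampleCap) {
      sample.push_back(v);
    } else {
      // Reservoir sampling: keep the new value with probability
      // kSampleCap / count (modulo bias ignored for this sketch).
      uint64_t j = rng() % count;
      if (j < kSampleCap) sample[j] = v;
    }
  }
};
```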
Absolutely. A couple more things just on this ALT.
For queries, do you have rough target
response times? I know it's going to vary
based on whether you have two terabytes
of data, but if I'm doing
a more search-type query where I'm
filtering on some specific conditions and I'm trying to just return some records, what sort of target response time do you aim for there?
So, great question.
So, most of our users or most of our applications, when they start off using Rockset, they're using a system where, let's say, their data latencies are 15 minutes or 10 minutes or 5 minutes.
And they come to us saying that, can I get sub-second query latencies?
They don't actually give us a number saying that I need like 800 milliseconds or something like this.
But they also tell us that as part of their application query, it might make, let's say, one application query
from a user's perspective
might result in eight database queries.
You know what I'm saying?
So if they need sub-second
response times for their users
and you need to do
eight database queries as part of that application,
then your database query cannot take more than
say 50 to 100 milliseconds
on the average, right? So this is kind of, how should I say it?
Giving you the typical example, not everybody falls in this range.
But the lowest latency queries could be less than 10 milliseconds,
and the highest ones could be many minutes for us.
Again, Rockset can also do large, long, big queries if you'd like.
The largest queries probably take
30 minutes or so, but those
are mostly reporting queries that people do on the
site, not really our sweet spot and
not our differentiator.
Is there a data size
where you tell people, hey, you'd probably be better off
using something else for
that? Yes.
What sort of threshold point is that
if I'm doing huge large-scale aggregations?
When do you say maybe something,
a warehouse or something?
Yeah, no, absolutely.
I think there is a...
So we don't replace a warehouse.
We definitely coexist with all warehouses
that's out there.
Typically, if it is 500 terabytes,
petabyte-size data,
then definitely this doesn't make sense
for you to make it real time.
Usually people don't want to make their warehouses
respond to like 50 millisecond queries,
you know what I'm saying?
So it's a different set of applications.
So when you're talking about hundreds,
high hundreds of terabytes of data,
let's say 500 terabytes, 600 terabytes, petabytes, we definitely don't.
People actually don't come to us
with those. When we explain it to them,
they very clearly understand that
yeah, I have gigantic amounts of data,
but I want to operationalize my last
one month of data, last six months
of data, and then things automatically fall
into place.
Yeah, absolutely.
So you talk a lot about
real-time data, and sometimes
that means very quick response
times on my queries, but it also means
high-velocity updates and not
sort of out-of-date data.
First of all, what's your
target latency in
terms of freshness of data from an
upstream system? How far behind
is Rockset usually
going to be?
Yeah, yeah, that's a good question.
So real-time means different things to different people, right?
If you ask what is real-time, some people will say one thing, some people will say another
thing.
But at the end of the day, it feels like the common theme among all the answers is that
you want to get something done very quickly as soon as some
event happens in your system, right? So we measure two things. We measure query latencies and we also
measure something called data latencies. So these are the two things. The data latency is a measure
of your freshness and query latency is a measure of your response times, right? So for data latency,
let's say you're collecting megabytes per second
of data in Kafka,
right, a very popular
system to transfer
data.
So our guarantees
are that if you
put data in Kafka,
it'll show up in
your Rockset queries
within, say,
50 milliseconds,
100 milliseconds,
this kind of range,
because we have
ways where we can
continuously
poll Kafka.
So we have managed connectors for Kafka, very nicely integrated.
They are going to keep reading and scale up when there's more data in Kafka and index
it.
And those things become available in your queries in like 100 milliseconds or so.
But that's the data latency.
This is important when you're doing fraud detection,
for example, right?
Like if you can shorten the time it takes to detect fraud,
you can save millions of dollars per second
or something like that.
So there, and then the query latencies,
I think people are mostly familiar with, right?
Because databases have been around for so long.
So we want query latencies to be fast.
Again, because fraud detection use cases, whatever these ones are, need to respond quickly in real time.
So it's a mix of both of these two.
Yeah. And also in terms of that data freshness, I know a lot of
data systems, ClickHouse, Druid, whatever, a lot of them
have trouble
with updates.
It's like,
what's the sort
of core
architectural
difference that
allows you to
do updates
in real time
where some
of these other
ones struggle
with that?
Yeah,
that's a good
point.
So,
Rockset's
underlying
storage engine
is RocksDB.
So,
RocksDB is
this key
value store.
And the benefit of using a system like RocksDB
is that every key and value is mutable.
That is, by definition, a key value store.
You haven't seen a read-only key value store,
I think, ever in your life, right?
Nobody uses a read-only key value store.
So it's optimized for updates, key value stores,
and we leverage that.
So it's very easy for us to be
able to efficiently update one field in an existing document or add a new field to an existing document
or delete also, for example, if you want to delete and change some fields. Whereas for other systems,
let's say ClickHouse or Druid, what happens is that
the technology is built to
compress data as much as possible
because they're, if you have
gigantic amounts of data, they're optimized to
reduce your storage cost.
So you compress the data so much
that it's packed tightly into
a small part of
your storage system. Now when you
want to update,
it's a very difficult update, because you can't really update a gzip block, for example.
Just, for example, you have to extract everything out,
make some changes and recompress it and recompact it.
So that's the trade-off that we have
where because we use RocksDB,
we do use compression, like zstd compression,
gzip compression.
But because we can do it at block level,
we can do it at key level,
we can actually get much better
price performance when you are doing updates.
Let's say you have a DynamoDB table, right?
And you connect it,
and you create a Rockset collection
with it. You update DynamoDB,
we look at the CDC stream and update the
Rockset collection in real time
within less than 200 milliseconds.
It really is a superpower compared to some of the other ones.
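(As a concrete illustration: in RocksDB, compression is applied per SST block rather than over huge immutable segments, which is part of why updates stay cheap. These are real RocksDB option names; the values are arbitrary.)

```cpp
#include "rocksdb/options.h"
#include "rocksdb/table.h"

rocksdb::Options CompressionConfig() {
  rocksdb::Options options;

  // Updates are written as new key versions (LSM-style) and reconciled
  // during background compaction, not rewritten in place.
  options.compression = rocksdb::kZSTD;

  // Because compression happens per block (a few KB), that background
  // work re-compresses small units, not a whole columnar segment.
  rocksdb::BlockBasedTableOptions table_options;
  table_options.block_size = 16 * 1024;  // compress in 16 KB units
  options.table_factory.reset(
      rocksdb::NewBlockBasedTableFactory(table_options));
  return options;
}
```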
Okay, let's talk about separation.
We hear a lot about compute storage separation.
You also have compute-compute separation, which is really interesting.
So compute storage separation, I feel like that sort of broke into the mainstream with, like, Snowflake; that's when people started to get fired up about it.
But like what is compute storage separation?
What are the benefits there?
Yeah, compute storage separation has been now prevalent for the last probably 10 years, right?
When the cloud became very popular.
Again, the benefit there is that you can have your storage system on one set of machines and your computes on another set of machines.
Because then you can, if you have more storage,
like you said, spin up more of your storage nodes.
And if you have more compute needed,
you spin up more of your compute nodes.
That works well for Snowflake or other warehouses,
because most of those things are not real time.
Like in Snowflake, when you are depositing data,
they will have a staging area
where all the new data is being written to.
And then every 10 or 15 minutes, you can say that, okay, I will load this data
into a table for query collections.
And that's the time when you compact it, make it columnar and compressed,
pack it tightly, and then make it available for queries.
So that's just compute-storage separation. So in Snowflake, it's all linked together:
if you want to do real time using Snowpipe and things like those,
If you want to do real time using SnowBipe and things like those,
your costs are like shooting through the roof
because you'd have to scale up both sides of your thing.
Whereas in a real-time system,
it's not enough to just separate the compute and the storage,
like we talked about with the ALT architecture.
It's also needed to separate the compute needed for ingest
and the compute needed for queries and the storage. So in Rockset, we have something called compute-compute
separation, which is not just compute-storage. Compute-compute essentially means that
you can separate out the ingest compute from the query compute. And you can also have additional
compute for other workloads. Take, for example, you are serving a fraud detection
use case, right? So you have a set of compute running on that. But then you also need to
run some reports every night on the same data to tell you some metrics or insights into
the data system. So you can spin up another set of compute nodes and run those queries.
And when you make the update to the data, all these compute instances see the update immediately.
So that, this is what I mean by real time,
and you can't get this in Snowflake or other warehouses
because their latencies are typically like many minutes
before you can actually make it visible.
Yep.
That compute-compute, man, I'm just thinking back to
a prior place I worked where we had Elasticsearch, right? We're indexing, and we're showing it to end users,
but then someone had also built, like, an internal dashboard on top of Elasticsearch. And seriously,
every time like, our executive would go to that internal dashboard, like all these alarms
would go off, because now indexing is getting backed up. And like the user queries are getting
way slower, because it's like churning through all this data.
And now just being able to segregate that to where you have like a different
set of compute nodes crunching those OLAP queries for that and not affecting
indexing or sort of real-time queries.
Yeah, that would have been very beneficial.
Yeah.
I mean, this is a common problem.
Yeah, a common problem for most of our customers.
Like they use Elasticsearch and one user comes in
and affects everybody else.
For Rockset,
this is one of the beauty
of building something
cloud-native
where from day one,
we know that we can
spin up compute nodes
as an API
versus buying machines
and installing
on your data center.
So it's easy for us
to do these kind of tricks.
And so this is why
the price performance
is so much different
when you use Rockset
versus Elastic.
Yep, yep, absolutely.
That compute-compute
separation video
of yours is really good.
I'll link that
in the show notes,
but check that one out.
I want to move on
to something.
Let's go even higher level now.
Let's talk about a new feature.
So you now have
vector search,
vector indexing.
You know,
generative AI has been huge this year, seen a lot around this. Maybe just tell me, like,
what's hard about vector search, especially compared to like other indexing patterns,
like the sort of inverted index or columnar index? What's hard about vector search compared to those?
Or different? Yeah, I mean, search typically means like there's some kind of
indexing system technology needed to serve the search query, right?
So when it comes to vector search, I think as a database person,
the question that comes to my mind is what kind of index can we build to serve that
search query? So as far as the indexing is concerned,
Facebook actually has done a good amount of work
here by open-sourcing the Faiss, F-A-I-S-S, library,
right? And one of the modes of operation
that's supported by Faiss is this
inverted file index, called IVF.
So the challenge for most vector databases out there
that you might have heard about a lot of new vector databases
in the last year,
the challenge is that they do mostly vector operations,
so you can do vector search.
But if you look at a real-life application
that we see from our current customers,
like let's say insurance companies, financial companies, they all want to do
vector search. And they don't want to
do vector search just by itself.
They want to do vector search with
existing data where they also need
to join with other data sets,
look at access patterns of some of those
records, and filter out some of the
vector search results.
So in my mind,
vectors, or indexing vectors, is just like indexing geo locations, or just like indexing, what is it, IP addresses, you know what I'm saying? So
people have built different kinds of indices on some of these well-known data types over time. We also have a geo index, and a vector is just a floating,
like a floating point, array of numbers.
In a data system, I call it an array of numbers.
It's nothing new,
but the challenge is that
sometimes the data sizes could be big, right?
Because every vector has a thousand elements,
each one of them is a floating point number.
So the challenge for most of these real-life applications is how can we scale this vector search to large sizes?
How can we update these vectors? How can we manage these vectors in the sense, how can we get
quality search results from these vectors? So there are two types of challenges, again,
coming back to your question. One is related to the algorithm that you are using
to do the recall.
Now let me backtrack. So some of them are exact match vector search algorithms,
and some of them are approximate vector search. For exact match,
it's a question of implementation and
optimizations that you do to figure out how you can find all the vectors you are wanting to search.
But for approximate search, there are probably four or five widely used algorithms that almost every vector search database is using.
So there is less of differences in that.
And that affects the recall that you have from your vector search.
But I think, based on all the current technologies out there, I feel like the differences there are
not so much about which algorithm you pick. For example, let's say you pick the IVF
algorithm, or you could pick the hierarchical HNSW algorithm. There are variations of each one
of these, but at the end of the day, the recalls are only slightly different between some of these things.
But from what I'm seeing, not a lot of people are picking vector databases based on
how much recall that algorithm supports; they're mostly picking vector databases based on: can I use it now?
Can I update these databases or data fields
when I need to?
Can I maintain them?
Can I associate roles or security policies
with these things?
This is what is happening in actually production systems.
But as far as the challenges are concerned, it's a challenge of indexing a large array of numbers.
I have a question on that. Okay, so if you go to OpenAI and use their embeddings model, they're going to give you, you know, a vector with like 1500 dimensions, right? And some other models might return, you know, 300 dimensions or much fewer. Is the number of dimensions going to vastly affect my accuracy?
I imagine that depends on the models themselves. Or is there also, like, a way to reduce those? If OpenAI gives me 1500 back, is there a way I can squeeze that into a smaller number of dimensions so it's easier to index, maybe faster to query? What does that look like?
You can do some types of quantization, right? This is widely researched now. Take, for example, a 1500-element vector. If you reduce it to, say, 700 elements, it will halve the size. But the impact on the recall or the accuracy of your queries might only be 5% worse. Again, I'm hand-waving a little, but it's not 50% worse. So depending on how we compress those, it's super useful for the app, especially if you are storage bound.
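As one illustration of the kind of compression being described, here is a sketch that reduces embedding dimensionality with plain PCA in NumPy. This is a generic technique, not Rockset's implementation; the sizes are made up, and 1536 is just the dimensionality OpenAI's common embedding models return.

```python
# A sketch of reducing embedding dimensionality with PCA, using only NumPy.
# Generic technique, not Rockset's implementation; sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 1536)).astype(np.float32)  # stand-in embeddings

# Center the data and find the top principal directions via SVD.
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)

k = 512                       # target dimensionality (roughly a 3x size cut)
P = Vt[:k].T                  # (1536, 512) projection matrix
X_small = (X - mean) @ P      # compressed vectors to index and store

# Queries must be projected the same way before nearest-neighbor search.
q = rng.normal(size=(1536,)).astype(np.float32)
q_small = (q - mean) @ P
```

On real embeddings (rather than random noise), most of the variance concentrates in the top components, which is why a cut like this tends to cost only a few points of recall rather than half.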
Oh, and there's another point: with a lot of these open-source vector databases, people are mostly running them in memory, which is why they're very much size bound. But when you run on SSDs, like how we do on Rockset, we're not really storage bound, because we have the ability to store all the vectors on SSDs. It just opens up a huge dimension; I mean, you could afford to store 1500-element vectors because we store everything on SSDs.
But yeah, quantization is definitely a hot research area right now. Some of the databases do this automatically under the covers. Some of them don't, and they leave it to the application to do it. So the application, let's say, gets the vectors, then uses another library to compress them into 300-element vectors, and then stores them in the database. So I don't know where the final verdict will be, but right now I don't see all the databases doing this automatically on the fly.
Gotcha. And what do response times look like for vector search as compared to, like, normal inverted index or columnar search? You know, if I have millions and millions of records or terabytes of data, is it going to be significantly slower to do vector search as compared to just a normal WHERE filter on columns?
I think it's actually faster, based on what I'm seeing from
our customers. Like for example, we have some customers who are looking at recommendations.
So they have a homepage where they're, like, selling auctions, online auctions.
And when a person logs into the website,
they need to show the auctions
that are most relevant to that person.
So they do that.
They used to do it based on some of the keyword searches, matching patterns, and machine-learned models. But as far as the vector database is concerned, they use it to reduce the amount of data they need to look at: let's say, reduce the candidates from 15 billion records to 500,000 records, and then do more personalization among those 500,000 records to decide which recommendations to show. So the total overall time is actually much less compared to before, because these models are becoming very powerful. I mean, that's the whole point of this. It's actually reducing the latency so you can do more stuff during those 500 milliseconds, or whatever you have to respond to a user-facing query.
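Here is a hedged sketch of that two-stage candidate-generation pattern, again using FAISS with the IVF index mentioned earlier. The catalog is a tiny stand-in for the billions of records described, the second-stage ranking is just a placeholder, and nothing here is Rockset-specific.

```python
# A sketch of the two-stage pattern described above: an IVF vector index
# shrinks a large catalog to a small candidate set, and a heavier
# personalization stage runs only on the candidates.
import faiss
import numpy as np

d, n = 128, 100_000
rng = np.random.default_rng(0)
catalog = rng.normal(size=(n, d)).astype(np.float32)   # item embeddings

# IVF: cluster vectors into nlist buckets; each query probes only a few.
nlist = 256
quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFFlat(quantizer, d, nlist)
index.train(catalog)
index.add(catalog)
index.nprobe = 8                                       # buckets scanned per query

user = rng.normal(size=(1, d)).astype(np.float32)      # user embedding
_, candidate_ids = index.search(user, 1000)            # 100k items -> 1k candidates

# Stage two (placeholder): re-rank only the candidates with a heavier model.
candidates = catalog[candidate_ids[0]]
scores = candidates @ user[0]                          # e.g., dot-product scoring
top = candidate_ids[0][np.argsort(-scores)[:20]]
print("top recommendations:", top)
```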
Yep. Wow, that's amazing.
Okay.
I want to shift from technical stuff and just talk about company stuff. You're one of the co-founders of Rockset, and it was founded in 2016, is that right? Seven years now. Okay. What did that look like in terms of, you know, building a database? You and your co-founders have a very strong track record of doing great things at Facebook and elsewhere, but in terms of convincing people to trust you with their data in a new database, what was that process like? How long did it take to get your initial build to where you felt comfortable sharing it with users?
Yeah, absolutely.
I think building a database is quite, how should I say,
it's a longer journey compared to many other startups that are out there.
Because it's kind of like building, say, a skyscraper. You have to build some of the foundations before you can actually start building the floors of your building.
Building the foundation is definitely something that needs time and effort.
What we have seen, as far as Rockset is concerned, is that we hardly see anybody leave Rockset once they start using it, because they really feel that this is a great piece of technology. They just stick with it.
But that doesn't mean that our work is done, because a lot of the challenges we currently see day to day are: oh, I have data in SQS or something, can I integrate it into Rockset? Can I bring it into Rockset quickly? So we still have to build a lot of integration pipelines so that it's easy to get data into the system. And we still have to build a lot of things to interact with other tools out there in the ecosystem, so that people can use it quickly and efficiently. There are so many data tools out there. We have, again, standard SQL over REST, we have JDBC, whatever other APIs. But still, there are a lot of other tools that people use that might not fit into some of the APIs we have. So there's still a lot more work to do. And the whole data ecosystem is changing rapidly, because people are accumulating lots more data this year versus last year. So the rate of innovation has increased, and we are trying to make sure that we can innovate fast enough on all fronts, not just on the database, so we can keep ahead.
So our focus is always speed, scale, and simplicity. These are the three S's we keep talking about. How can we innovate on these three dimensions so that we can build a great, valuable company for our users?
Yep, yep.
I remember talking with Venkat at reInvent last year
and just some of the super low-level stuff that you guys do.
I don't even know enough about it,
but the new chipsets and SIMD and different things like that,
the work you're doing there.
How much of the work you're doing is just staying on top of new hardware updates and optimizing for that? Does that change very often? Or is it like, hey, Flash came around, you know, SSDs and things like that. How often does the physical infrastructure change to where you need to make big changes to Rockset as well?
Actually, the physical infrastructure is also changing much more rapidly now, right?
Like, so we shipped
an entirely integrated version
of Rockset on Intel hardware.
I think that's the reInvent thing that you were talking about.
Yep, yep.
So when new hardware comes along,
we have to make changes
to our code so that we can leverage
the new features in the hardware.
It's not just porting,
but it is actually rewriting some basic primitives.
So what we have done is that we have tried to localize those
in certain areas so that we know
that when we need to port to a different platform,
we can port them quickly and efficiently.
But the beauty of the solution
is that irrespective of the hardware, the Rockset service remains the same for all our customers. So basically, they can leverage the changes in hardware that are coming in and get more work done.
Do you remember, so you started Rockset in 2016. Do you remember how long it was before you had your first customer, like, hey, this is good enough, reliable enough to use? How long did that process take?
Yeah, yeah, no, absolutely. 2016, and I think December, actually September 2016, is when it started. So, yeah, kind of seven years now.
So the first three or four months was all about writing some
code to kind of show a demo.
And then in 2017,
I think we had our first customer, who was, I'd call them more like a believer rather than a customer, right?
It's like a new religion.
They'll say, oh, I believe in you.
Let me try to see what you guys are doing.
So we used to camp in their office, probably in 2017, for like three days, or a month, or a few months, to show what our software would look like, right? And in the very early days, we actually had a very custom API that was exposing the raw database internals, and it wasn't very SQL-focused. That was when we realized, from our customer in the first nine months, that without SQL this would be a longer journey.
So the first customer came before we were, like, nine months old or something like that. But we launched Rockset in 2019, which is when we openly wrote about it, the website came alive, and things like that. So we have been public now for the last four years or so.
And every year,
I think I see a lot of changes in the volume of our customers
and the size and shape of our customers.
And it's becoming
more exciting and interesting.
Yeah, very cool.
Okay, I was going to ask you if you ever considered an API other than SQL, or if SQL was always the one. So it sounded like you started with a custom API and realized early on. I think that just makes so much sense.
You see a lot of new databases
and sometimes they want to write
their own query language
and it just, like, increases that barrier so much more, because you're learning all the mechanics of this database, and then having to also learn the basic interactions with it slows the adoption.
Yeah, one of the philosophies
we have is that we don't want to build
something if we don't have a customer.
So even new features that we're building,
we're never building a new feature
by deciding between
product and engineering and marketing
whether this is the best feature to build.
We always try to build if there are
a few customers lining up saying,
we will use this feature when needed,
when it's available.
It's very much like the iterative development model
is how I see it.
Like we got this when we were at Facebook,
building Facebook infrastructure,
where you never build something
that is just somebody's intuition or something.
Theoretical or something like that.
Make it actually practical.
Yeah, absolutely.
That's great.
Well, Drew, I love this conversation.
I loved all the videos you've done and the blog posts your team does on this stuff.
Congrats, y'all raised a round recently, so congrats on that, and just the great work you've been doing. If people want to find out more about you, about Rockset, where can they find you?
Yeah, I think the best place would be to go to rockset.com. That's where we have a good description of our product. And also, there's a free trial. If you have some data, or one of your listeners has any data and they want to try it, the best would be to just create a free account, try it, see how it works, and give us feedback.
Yep, absolutely. I highly recommend that. I highly recommend the white papers and videos and all that stuff. So, Dhruba, thanks for coming on the show, and best of luck to you and the team at Rockset going forward.
Hey, thank you. Thanks a lot. Great talking to you, Alex. And I hope to hear a lot more of these videos from you in the near future.
Cool. Thank you. I appreciate it, Dhruba.
Thank you. Bye.