Screaming in the Cloud - Keeping the Chaos Searchable with Thomas Hazel
Episode Date: November 30, 2021

About Thomas: Thomas Hazel is Founder, CTO, and Chief Scientist of ChaosSearch. He is a serial entrepreneur at the forefront of communication, virtualization, and database technology, and the inventor of ChaosSearch's patented IP. Thomas has also patented several other technologies in the areas of distributed algorithms, virtualization, and database science. He holds a Bachelor of Science in Computer Science from the University of New Hampshire (Hall of Fame Alumni Inductee), and founded both student and professional chapters of the Association for Computing Machinery (ACM).

Links: ChaosSearch: https://www.chaossearch.io
Transcript
Hello, and welcome to Screaming in the Cloud, with your host, Chief Cloud Economist at the
Duckbill Group, Corey Quinn.
This weekly show features conversations with people doing interesting work in the world
of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles
for which Corey refuses to apologize.
This is Screaming in the Cloud.
This episode is sponsored in part by my friends at Thinkst Canary.
Most companies find out way too late that they've been breached,
and Thinkst Canary changes this, and I love how they do it.
Deploy Canaries and Canary tokens in minutes, and then forget about them. What's great is that then attackers tip their hand by touching them, giving you one alert when it matters. I use it myself, and I only remember this when I get the weekly update with a "we're still here, so you're aware" from them. It's glorious. There's zero admin overhead to this. There are effectively no false positives unless I do something foolish.
Canaries are deployed and loved on all seven continents.
You can check out what people are saying at canary.love.
And their kubeconfig canary token is new and completely free as well.
You can do an awful lot without paying them a dime, which is one of the things I love about them.
It's useful stuff and not a, oh, I wish I had money. No, it is spectacular. Take a look. That's canary.love
because it's genuinely rare to find a security product that people talk about in terms of
love. It's really just a neat thing to see. Canary.love, thank you to Thinkst Canary for their support of my ridiculous, ridiculous nonsense.
This episode is sponsored in part by our friends at Vultr, spelled V-U-L-T-R,
because they're all about helping save money, including on things like, you know, vowels.
So what they do is they are a cloud provider that provides surprisingly high performance cloud compute at a price that, well, sure, they claim it is better than AWS's pricing.
And when they say that, they mean that it's less money.
Sure, I don't dispute that.
But what I find interesting is that it's predictable.
They tell you in advance on a monthly basis what it's going to cost.
They have a bunch of advanced networking features.
They have 19 global locations and scale things elastically, not to be confused with openly, which is apparently
elastic and open. They can mean the same thing sometimes. They have had over a million users.
Deployments take less than 60 seconds across 12 pre-selected operating systems. Or if you're one
of those nutters like me,
you can bring your own ISO
and install basically any operating system you want.
Starting with pricing as low as $2.50 a month
for Vultr Cloud Compute,
they have plans for developers and businesses of all sizes,
except maybe Amazon,
who stubbornly insists on having something of the scale
all on their own.
Try Vultr today for free by visiting vultr.com slash screaming,
and you'll receive $100 in credit.
That's v-u-l-t-r dot com slash screaming.
Welcome to Screaming in the Cloud.
I'm Corey Quinn.
This promoted episode is brought to us by our friends at ChaosSearch.
We've been working with them for a long time.
They've sponsored a bunch of our nonsense,
and it turns out that we've been talking about them to our clients
since long before they were a sponsor,
because it actually does what it says on the tin.
Here to talk to us about that in a few minutes is Thomas Hazel, ChaosSearch's CTO and founder.
First, Thomas, nice to talk to you again. And as always, thanks for humoring me. Corey, always great to talk to you. And I enjoy
these conversations that sometimes go up and down, left and right, but I look forward to all the fun
we're going to have. So my understanding of ChaosSearch is probably a few years old because it
turns out I don't spend a whole lot of time meticulously studying your company's roadmap in the same way that
you presumably do.
When last we checked in with what the service did slash does, you were effectively solving
the problem of data movement and querying that data.
The idea behind data warehouses is generally something that's shoved onto us by
cloud providers, where, hey, this data is going to be valuable to you someday. Data science teams
are big proponents of this, because when you're storing that much data, their salaries look
relatively reasonable by comparison. And the ChaosSearch vision was: instead of copying all this data out of an object store and storing it on expensive disks and replicating it, et cetera, what if we queried it in place
in a somewhat intelligent manner? So you take the data and you store it, in this case, in S3
or equivalent, and then just query it there rather than having to move it around all over the place,
which of course then incurs data transfer fees. You're storing it multiple times,
and it's never in quite the format that you wanted. That was the breakthrough revelation.
You were Elasticsearch, now OpenSearch, API compatible, which was great.
And that was sort of the state of the art a year or two ago.
Is that generally correct?
No, you nailed our mission statement.
No, you're exactly right.
The value of cloud object stores, S3: the elasticity, the durability, all these wonderful things.
The problem was you couldn't get any value out of it.
And you had to move it out to these siloed solutions, as you indicated.
So our mission was exactly that, transform customers' cloud storage into an analytical database, a multi-model analytical database, where our first use case was search and log analytics, replacing the ELK stack and
also replacing the data pipeline, the schema management, etc. We automate the entire step,
raw data to insights. It's funny we're having this conversation today. Earlier today, I was
trying to get rid of a relatively paltry 200 gigs or so of small files on an EFS volume,
you know, Amazon's version of NFS. It's like an
NFS volume, except you're paying Amazon for the privilege. Great. And it turns out that it's a
whole bunch of operations across a network on a whole bunch of tiny files. So I had to spin up other instances that were not getting whacked by spot terminations, and just firing up a whole bunch of threads. So now the load average on that box is approaching 300, but it's plowing through, getting rid of that data finally. And I'm looking at this saying,
this is a quarter of a terabyte. Data warehouses are in the petabyte range. Oh, I begin to see
aspects of the problem. Even searching that kind of data using traditional tooling starts to break
down, which is sort of the revelation that Google had 20-some-odd years ago and other
folks have since solved for. But this is the first time I've had significant data that wasn't just
easily searched with a grep. For those of you in the Unix world who understand what that means,
condolences. We're having a support group meeting at the bar.
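(For the curious, the kind of parallel cleanup described above can be sketched in a few lines; this assumes Python with concurrent.futures, since the episode doesn't specify the actual tooling used.)

```python
# Hypothetical sketch: deleting many small files on an NFS-style mount in parallel.
# Each unlink is a round trip over the network, so one-at-a-time deletion is the
# bottleneck; a thread pool keeps many requests in flight at once.
import os
from concurrent.futures import ThreadPoolExecutor

MOUNT_POINT = "/mnt/efs/old-data"  # assumption: this path is purely illustrative

def delete_file(path: str) -> None:
    try:
        os.unlink(path)
    except OSError as err:
        print(f"failed to delete {path}: {err}")

with ThreadPoolExecutor(max_workers=64) as pool:
    # topdown=False walks deepest directories first
    for dirpath, _dirnames, filenames in os.walk(MOUNT_POINT, topdown=False):
        for name in filenames:
            pool.submit(delete_file, os.path.join(dirpath, name))
```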
Yeah. And I always thought, what if you could make cloud object storage like S3 high performance and really transform it into a database?
And so that warehouse capability, that's great.
We like that.
However, to manage it, to scale it, to configure it, to get the data into that was the problem.
That was the promise of a data lake, right?
This simple in, and then this arbitrary schema-on-read, generic out. The problem next came: it became swampy.
It was really hard,
and that promise was not delivered. And so what we're trying to do is get all the benefits of
the data lake, simple in: so many services naturally stream to cloud storage. Shoot, I would say every one of our customers is putting their data in cloud storage because their data pipeline to
their warehousing solution or Elasticsearch may go down, and they're worried they'll lose the data.
So what we say is, what if you just said, activate that data lake and get that ELK use case,
get that BI use case without that data movement, as you indicated, without that ETLing,
without that data pipeline that you're worried is going to fall over.
So that vision has been cast.
Now, we haven't talked in a few years, but this idea that we're growing
beyond what we were just going after logs, we're going into new use cases, new opportunities,
and I'm looking forward to discussing with you. It's a great answer, though I have to call out
that I am right there with you as far as inappropriately using things as databases.
I know that someone is going to come back and say, oh, S3 is a database, you're dancing around it. Isn't that what Athena is? Which is named,
of course, after the Greek goddess of spending money on AWS. And that is a fair question. But
to my understanding, there's a schema story behind that that does not apply to what you're doing.
Yeah, and what is so crucial is that we like the relational access, but the time, cost, complexity to get it into that.
As you mentioned, scaled access, I mean, it could take weeks, months to test it, to configure
it, to provision it.
And imagine if you got it wrong, you got to redo it again.
And so our unique service removes all that data pipeline schema management. And because of our innovation, because of our service, you do all schema definition on the fly, virtually. We call them views on your indexed data, which you can publish as an Elastic index pattern for that consumption, or a relational table for that consumption. And that's kind of leading the witness into things
that we're coming out with in this quarter into 2022. I have to deal with a little bit of, I guess,
shame here,
because, yeah, I'm doing exactly what you just described.
I'm using Athena to wind up querying our customers' cost and usage reports,
and we spend a couple hundred bucks a month on AWS Glue
to wind up massaging those into the way that they expect it to be.
And it's great-ish.
We hook it up to Tableau and can make those queries from it,
and all right, it's great.
It just, brr, goes the money printer,
and we somehow get access and insight
to a lot of valuable data.
But even that is knowing exactly
what the format is going to look like-ish.
I mean, cost and usage reports from Amazon
are sort of aspirational when it comes to schema sometimes,
but here we are.
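(For flavor, a query against a cost and usage report through Athena looks roughly like the sketch below; it assumes boto3, a hypothetical cur_database and cur_table registered in Glue, and the standard line_item_* columns of the Athena-compatible CUR schema.)

```python
# Hedged sketch: querying a cost and usage report through Athena with boto3.
# "cur_database", "cur_table", and the results bucket are hypothetical names.
import boto3

athena = boto3.client("athena")

query = """
SELECT line_item_product_code,
       SUM(line_item_unblended_cost) AS cost
FROM cur_table
WHERE line_item_usage_start_date >= date '2021-11-01'
GROUP BY line_item_product_code
ORDER BY cost DESC
LIMIT 20
"""

response = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "cur_database"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},  # hypothetical bucket
)
print(response["QueryExecutionId"])  # poll this ID for results
```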
And that's been all well and good.
But now the idea of log files,
even looking at the base case
of sending logs from an application, great.
Nginx or Apache or Lighty or any of the various web servers out there all tend to use different logging formats just to describe the same exact things.
Start spreading that across custom in-house applications and getting signal from that is almost impossible.
Oh, people say, so we'll use a structured data format,
and now you're putting logging and structuring requirements
on application developers who don't care in the first place,
and now you have a mess on your hands.
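(To make that format mess concrete: here are made-up examples of the same request as an Nginx combined-format line versus a structured JSON line, and the per-format parsing it forces. The log lines and regex are illustrative, not from the episode.)

```python
# Illustrative only: the same HTTP request in two different log shapes.
import json
import re

nginx_combined = '203.0.113.5 - - [30/Nov/2021:12:00:00 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "curl/7.79"'
app_json = '{"ts": "2021-11-30T12:00:00Z", "method": "GET", "path": "/index.html", "status": 200}'

# One regex for the combined format; every other format needs its own parser.
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d+)'
)

match = COMBINED.match(nginx_combined)
print(match.groupdict())       # painstakingly parsed out of the combined format
print(json.loads(app_json))    # structured logs parse trivially, but someone had to emit them
```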
And it really is a mess, and that challenge is so problematic, and schemas are changing.
We have customers, and one of the reasons why they go with us
is their log data is changing.
They didn't expect it.
Well, in your data pipeline, in your Athena database, that breaks.
That brings the system down.
And so our system uniquely detects that and manages that for you.
Then you can pick and choose how you want to export it in these views dynamically.
So it's really not rocket science, right?
But the problem is a lot of the technology that we're using is designed for static fixed thinking. And then to scale it is problematic and time consuming. So, you know, glue is a great
idea, but it has a lot of sharp elbows. Athena is a great idea, but also has a lot of problems. And
so that data pipeline, you know, it's not for digitally native, active, new use cases, new
workloads coming up hourly, daily. You think about
this long term. So a lot of that data prep pipeline is something we address so uniquely.
But really where the customer cares is the value of that data, right? And so if you're spending
toils trying to get the data into a database, you're not answering the questions, whether it's
for security, for performance, for your business needs. That's the problem. And that agility, that time to value, is where we very uniquely come in, because we start where your data is raw and we automate the process all the way through.
So when I look at the things that I have stuffed into S3, they generally fall into a couple of
categories. There are a bunch of logs for things I never asked for nor particularly wanted,
but AWS is aggressive about that.
First, routing through CloudTrail so you can get charged 50 cents per gigabyte ingested.
Awesome.
And of course, large static assets.
Images I have done something to and are colloquially now known as shitposts, which is great.
Other than logs, what could you possibly be storing in S3
that lends itself to
effectively the type of analysis that you've built around this?
Well, our first use case was the classic log use cases, app logs, web service logs. I mean,
CloudTrail, it's famous. We had customers that gave up on Elastic and definitely gave up on
Relational, where you can do a couple of changes, and your permutation of attributes for CloudTrail is going to bring you to your knees.
And people just say, I give up, right?
Same thing with Kubernetes logs.
And so it's the classic, whether it's CSV, whether it's JSON, whether it's log types, we auto-discover all that.
We also allow you, if you want, to override that and change the parsing capabilities through a UI wizard. We do discover what's in your buckets. That term data swamp, and not knowing what's in your bucket: we do have a facility that will index that data and actually create a report for you of what's in there.
Now, if you have text data, if you have log data, if you have BI data,
we can bring it all together.
But the real pain is at the scale.
So classically, app logs, system logs, many devices sending IoT-type streams is where we really come in, Kubernetes, where they're dealing with terabytes and up of data per day.
And managing an ELK cluster at that scale, particularly on a Black Friday. Shoot, some of our customers, Klarna is one of them, credit card payments, they're ramping up for Black
Friday. And one of the reasons why they chose us is our ability to scale when maybe you're doing
a terabyte or two a day and then it goes up to 20, 25. How do you test that scale? How do you
manage that scale? And so for us, the data streams are traditionally with our customers, the well-known
log types, at least in the log use cases.
And the challenge is scaling it, is getting access to it. And that's where we come in.
I will say the last time you were on this show, a couple of years ago, you were talking about the initial logging use case and you were speaking, in many cases, aspirationally about where things
were going. What a difference a couple of years has made. Instead of talking about what hypothetical
customers might want or what might be able to do,
you're just able to name drop them off the top of your head.
You have scaled to approximately 10 times
the number of employees you had back then.
You've raised, I think, a total of what,
50 million since then?
60 now.
60 now, fantastic.
Congrats.
And of course, how'd you do it?
By sponsoring Last Week in AWS, as everyone should.
I'm taking clear credit for that. Every time someone announces a round, that's the game. But no, there is validity to it, because telling fun stories and sponsoring exciting things like this only carry you so far. At some point, customers have to say, yeah, this is solving a pain that I have. I'm willing to pay you money to solve it. And you've clearly gotten to a point where you are addressing the needs of those customers at a pretty fascinating clip. It's bittersweet from
my perspective, because it seems like the majority of your customers have not come from my nonsense
anymore. They're finding you through word of mouth. They're finding you through more traditional
read as boring ad campaigns, et cetera, et cetera. But you've built a brand that extends beyond just me.
I'm no longer viewed as the de facto ombudsperson for any issue someone might have with ChaosSearch on Twitter. It's kind of, oh, the company grew up. What happened there?
No, listen, you were great. We reached out to you to tell our story. And I gotta be honest,
a lot of people came by and said, I heard something on Corey Quinn's podcast, et cetera, and it came a long way. Now we have companies like Equifax, multi-cloud, Amazon, and Google. They love the data lake philosophy, the centralization, where use cases are now available within days, not weeks and months, whether it's logs and BI, correlating across all those data streams. It's huge. We mentioned Klarna, APM performance, and we have Armor for SIEM and Blackboard for observability.
So it's funny.
Yeah, it's funny.
When I first was talking to you, I was like, what if?
What if we had this customer or that customer?
And we were building the capabilities, but now that we have it, now that we have customers,
yeah, I guess maybe we've grown up a little bit.
But hey, listen, you're always near to our heart
because we remember when you stopped by our booth
at re:Invent several times. And we're coming to re:Invent this year.
And I believe you are as well.
Oh, yeah.
But people are listening to this.
If they're listening the day it's released,
this will be during re:Invent. So by all means, come by the ChaosSearch booth and see what they have to say. For once, they have people who aren't me who are going to be telling stories about these things. And it's fun. Like I joke, it's nothing but positive here. It's interesting from where I sit, seeing the parallels here.
For example, we have both had, how do we say, adult supervision come in. You have a CEO,
Ed, who came over from IBM Storage. I have Mike Julian, whose first love language is, of course, spreadsheets. And it's great on some level realizing that, wow, this company has eclipsed
my ability to manage these things myself and put my hands on everything. And eventually,
you have to start letting go. It's a weird growth stage, and it's a heck of a transition.
No, I love it. I mean, I think when we were talking, we were maybe 15 employees. Now
we're pushing 100. We brought in Ed Walsh, who's an amazing CEO. It's funny. I told him about this
idea. I invented this technology roughly eight years ago. And he's like, I love it. Let's do it.
I wasn't ready to do it. So five, six years ago, I started the company, always knowing that I'd give him a call once we got the plane up in the air.
And it's been great to have him here because the next level up, right, of execution and growth and business development and sales and marketing.
So you're exactly right.
I mean, we were a young pup several years ago when we were talking to you.
And now we're a little bit older, a little bit wiser.
But no, it's great to have Ed here.
And just the leadership in general, we've grown immensely. Now, we are recording this in advance of re:Invent,
so there's always the question of, wow, are we going to look really silly based upon what is
being announced when this airs? Because it's very hard to predict some things that AWS does. And
let's be clear, I always stay away from predictions just because first,
I have a bit of a knack for being right. But also, when I'm right, people will think, oh,
Corey must have known about that and is leaking. Whereas if I get it wrong, I just look like a
fool. There's no win for me if I start doing the predictive dance on stuff like that. But I have
to level with you. I have been somewhat surprised that,
at least as of this recording, AWS has not moved more in your direction because storing data in
S3 is kind of their whole thing. And querying that data through something that isn't Athena
has been a bit of a reach for them. They're slowly starting to wrap their heads around it with their UltraWarm nonsense, which is just, okay, great naming there. What is the point of continually having a model where, oh yeah, we're going to just age out the stuff that isn't
actively being used into S3 rather than coming up with a way to query it there? Because you've done
exactly that. And please don't take this as anything other than a statement of fact.
They have better access to what S3 is doing than you do. You're forced to deal with this thing
entirely from a public API standpoint, which is fine. They can theoretically change the behavior
of aspects of S3 to unlock these use cases if they chose to do so, and they haven't. Why is it that
you're the only folks that are doing this? No, it's a great question, and I'll give them props
for continuing to push the data lake philosophy, the cloud providers, S3, because it's really
where I saw the world. Lakes, I believe in them. I love them. They love them. However, they promote moving the data out to get access. And it seems so counterintuitive: why won't you just leave it in and make these services more intelligent? So it's funny. I've trademarked smart object storage.
I actually trademarked, I think you were part of this, ultra hot, right? Because why would you want
ultra warm when you can have ultra hot? And the reason I feel is that if you're using Parquet for Athena,
ColumnStore, or Lucene for Elasticsearch,
these two index technologies were not designed for cloud storage,
for real-time streaming off of cloud storage.
So the trick is you have to build UltraWarm, get it off of what they consider cold S3 into warmer memory or SSD-type access.
What we did, what the invention I created was, that first read is hot.
That first read is fast.
Snowflake is a good example.
They give you a 10 terabyte demo example.
And if you have a big instance and you do that first query, maybe several order-bys or group-bys, it could take an hour to warm up.
The second query is fast. Well, what if the first query is in seconds as well? And that's where we
really spend the last five, six years building out the tech and the vision behind this. Because
I like to say, you go to a doctor and you say, hey, doc, every single time I move my arm,
it hurts. And the doctor says, well, don't move your arm. It's things like that. To your point, it's like, why wouldn't they? I would argue, one, you have to believe it's possible.
We're proving that it is. And two, you have to have the technology to do it, not just the index,
but the architecture. So I believe they will go this direction. Little birdies always say that
all these companies understand this need. Shoot, Snowflake's trying to be lakey. Databricks is
trying to really bring
this warehouse lake concept.
But you still have to do all the pipelining.
You still have to do all the data management
the way that you don't want to do.
It's not a lake.
And so my argument is that it's innovation on why.
Now, they have money.
They have time.
But we have a big head start.
I remember last year at re:Invent, they released a, shall we say, significant change to S3, in that it enabled strong read-after-write consistency, which is awesome for, again, those of us in
the business of misusing things as databases. But for some folks, the majority of folks,
I would say, it was a, I don't know what that means, and therefore I don't care. And that's
fine. I have no issue with that. There are other folks, some of my customers, for example, who are suddenly, wait a minute,
this means I can sunset this entire janky sidecar metadata system that is designed to make sure that
we are consistent in our use of S3 because it now does it automatically under the hood,
and that's awesome. Does that change mean anything
for ChaosSearch? It doesn't, because of our architecture. We're an append-only, write-once scenario, so not a lot of update-in-place viewpoints. My viewpoint is that if you're seeing S3 as the database and you need that type of consistency, it makes sense why you'd want it. But because of our distributed fabric, our stateless architecture, our append-only nature, it really doesn't affect us. Now,
I talked to the S3 team. I said, please, if you're coming up with this feature,
it better not be slower. I want S3 to be fast, right? And they said, no, no, it won't affect
performance. I was like, okay, let's keep that up. And so to us, any type of S3 capability,
we'll take advantage of it. It benefits us, whether it's consistency, as you indicated, performance, functionality.
But we really keep the constructs of S3 access to really limited features: list, put, get. Role-based policies to give us read-only access to your data, and a location to write our indices into your account.
And then our distributed fabric, our service, accesses those indices and queries them
or searches them
to resolve whatever analytics you need.
So we made it pretty simple
and that has allowed us
to make it high performance.
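(In boto3 terms, that limited surface area amounts to just the basic object calls; the sketch below is a hedged illustration with hypothetical bucket and key names, not ChaosSearch's actual code.)

```python
# Sketch of the list/get/put surface area described above; names are hypothetical.
import boto3

s3 = boto3.client("s3")

# List: enumerate the customer's objects (a read-only role grants this).
page = s3.list_objects_v2(Bucket="customer-logs", Prefix="2021/11/")
keys = [obj["Key"] for obj in page.get("Contents", [])]

# Get: stream raw data out for indexing.
body = s3.get_object(Bucket="customer-logs", Key=keys[0])["Body"].read()

# Put: write the resulting index back into a location in the customer's account.
s3.put_object(Bucket="customer-indices", Key="indices/2021/11/segment-0001", Body=b"...")
```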
I'll take it a step further, because we want to talk about changes since the last time we spoke. It used to be that this was on top of S3: you could store your data anywhere you want, as long as it's S3, in the customer's account.
Now you're also supporting one-click integration with Google Cloud's object storage, which,
great. That does mean, though, that you're not dependent upon provider-specific implementations
of things like a consistency model for how you've built things. It really does use the
lowest common denominator, to my understanding, of object stores. Is that something that you're seeing broad adoption of? Or is this one of those areas where, well, you have
one customer on a different provider, but almost everything lives on the primary? I'm curious what
you're seeing for adoption models across multiple providers. It's a great question. We build our
architecture purposely to be cloud-agnostic. I mean, we use compute in a
containerized way. We use object storage in a very simple construct: put, get, list. And we went over
to Google because that made sense, right? We have customers on both sides. I would say Amazon is the
gorilla, but Google's trying to get there and growing. We had a big customer, Equifax, that's
on both Amazon and Google, but we offer
the same service. To be frank, it looks like the exact same product, and it should, right? Whether it's Amazon Cloud or Google Cloud, multi-select, and I want to choose either one and get the other one. I would say that different business types are using each one, but the bulk of our business is on Amazon. But we just this summer released our SaaS offering, so it's growing.
And, you know, it's funny.
You never know where it comes from.
So we have one customer, actually Digital River, as one of our customers on Amazon for logs.
But we're going and working together to do BI on GCP or on Google.
And so it's kind of funny.
They have two departments on two different clouds with two different use cases.
And so do they want unification?
I'm not sure, but they definitely have their BI on Google
and their operations in Amazon.
It's interesting.
You know, it's important to me
that people learn how to use the cloud effectively.
That's why I'm so glad that Cloud Academy
is sponsoring my ridiculous nonsense.
They're a great way to build in-demand tech skills
the way that, well, personally, I learn best, which is learn by doing, not by reading. They have live cloud labs
that you can run in real environments that aren't going to blow up your own bill. I can't stress
how important that is. Visit cloudacademy.com slash Corey. That's C-O-R-E-Y. Don't drop the E.
And use Corey as a promo code as well.
You're going to get a bunch of discounts on it with a lifetime deal.
The price will not go up.
It is limited time.
They assured me this is not one of those things
that is going to wind up being a rug pull scenario.
Oh, no, no.
Talk to them.
Tell me what you think.
Visit cloudacademy.com slash Corey, C-O-R-E-Y, and tell them that I sent you.
I know that I'm going to get letters for this, so let me just call it out right now,
because I've been a big advocate of pick a provider, I care not which one, and go all in on
it. And I'm sitting here congratulating you on extending to another provider, and people are
going to say, ah, you're being inconsistent. No, I'm suggesting that you as a
provider have to meet your customers where they are. Because if someone is sitting in GCP and
your entire approach is step one, migrate those four petabytes of data right on over here to AWS,
they're going to call you the jackhole that you would be by making that suggestion and go
immediately for option B, which is literally
anything that is not chaos search, just based upon that core misunderstanding of their business
constraints. That is the way to think about these things. For a vendor position that you are in,
as an ISV, independent software vendor, for those not up on the lingo of this ridiculous industry,
you have to meet customers where they are, and it's the right move.
Well, you just said it. Imagine moving
terabytes and petabytes of data. It sounds terrific if I'm a salesperson for one of these
companies working on commission, but for the rest of us, it sounds awful. We really are a data fabric
across clouds within clouds. We're going to go where the data is, and we're going to provide
access to where that data lives. Our whole philosophy is the no movement movement, right?
Don't move your data, leave it where it is
and provide access at scale.
And so you may have services in Google
that naturally stream to GCS.
Let's do it there.
Imagine moving that amount of data over to Amazon
to analyze and vice versa.
In 2022, we're going to be in Azure. There are totally different types of business users and personas,
but you're getting asked,
can you support Azure? And the answer is yes, and we will in 2022. So to us, if you have cloud storage, if you have compute, and it's a big enough business and opportunity in the market,
we're there. We're going there. When we first started, we were talking to MinIO. Remember that open source object storage platform? We've run on our laptops. We run, it's this Dr. Seuss thing: we run over here, we run over there, we run everywhere.
But the honest truth is, we're going to go with the big cloud providers where the business
opportunity is and offer the same solution.
Because the same solution is valued everywhere.
Simple in, value out, cost effective, long retention, flexibility.
That sounds so basic, but you mention this all the time
with our Rube Goldberg Amazon diagrams we see time and time again. It's like, if you looked at that
and you were from an alien planet, you'd be like, these people don't know what they're doing.
Why is it so complicated? And the simple answer is, I don't know why people think it's complicated.
To your point about Amazon, why won't they do it? I don't know.
But if they did, things would be different.
And to be honest, I think people are catching on.
We do talk to Amazon and others.
They see the need, but they also have to build it.
They have to invent technology to address it.
And using Parquet and Lucene are not the answer.
Yeah, it's too much of a demand on the producers of that data rather than the consumer.
And yeah, I would love to be able to go upstream to application developers and demand they do things in certain ways.
It turns out, as a consultant, you have zero authority to do that.
As a DevOps team member, you have limited ability to influence it.
But it turns out that being the department of no quickly turns into being the department of unemployment insurance because no
one wants to work with you and collaboration, contrary to what people wish to believe,
is a key part of working in a modern workplace. Absolutely. And it's funny, the demands of IT
are getting harder. The actual getting the employees to build out the solutions are getting
harder. And so a lot of that time is in the pipeline, is the prep,
is the schema and sharding and et cetera, et cetera, et cetera. My viewpoint is that should
be automated away. More and more databases are being auto-tuned, right? This whole knobs and
this and that, to me, Glue is a means to an end. I mean, let's get rid of it. Why can't Athena
know what to do? Why can't object storage be Athena and
vice versa? I mean, to me, it seems like all this moving through all these services is a classic
Amazon viewpoint. Even their diagrams of having this centralized repository of S3, move it all
out to your services, get results, put it back in, then take it back out again, move it around.
It just doesn't make much sense. And so to us, I love S3, love the service.
I think it's brilliant.
Amazon's first service, right?
But from there, get a little smarter.
And that's where Chaos Search comes in.
I would argue that S3 is in fact a modern miracle.
And you have those companies saying,
oh, we have an object store, it's S3 compatible.
It's like, yeah, we have S3 at home.
Look at S3 at home.
And it's just basically a series of failing Raspberry Pis. But you have this whole ecosystem of things that have
built up and sprung up around S3. It is wildly understated just how scalable and massive it is.
There was an academic paper recently that won an award on how they use automated reasoning to validate what is going on
in the S3 environment. And they talked about hundreds of petabytes in some cases. And folks
are saying, ah, S3 is hundreds of petabytes. Yeah, I have clients storing hundreds of petabytes.
There are larger companies out there. Steve Schmidt, Amazon CISO, was recently at a Splunk keynote where he
mentioned that in security info alone, AWS itself generates 500 petabytes a day that then gets
reduced down to a bunch of stuff, and some of it gets loaded into Splunk, I think. I couldn't really
hear the second half of that sentence because of the sound of all of the Splunk salespeople in that
room becoming excited so quickly you could hear it. I love it. If I can be so bold, that S3 team, they're gods. They are amazing. They created such an amazing service. And when I started playing with S3, now I guess 2006 or 2007, I mean, we were using it for a repository, URL access to get images.
I was doing a virtualization company at the time.
Oh, first time I played with it.
This seems ridiculous and kind of dumb.
Why would anyone use this?
Yeah, yeah.
Yeah, it turns out I'm really bad at predicting the future.
Another reason I don't do the prediction thing.
Yeah.
And when I started this company officially
five, six years ago,
I was thinking about S3
and I was thinking about HDFS not being a good answer.
And I said, I think S3 will actually achieve the goals and performance we need. It's a distributed
file system. You can run parallel puts and parallel gets. And the performance that I was
seeing when the data was a certain way, certain size, wait, you can get high performance. And
when I first turned on the engine now four or five years
ago, I was like, wow, this is going to work. We're off to the races. And now, obviously,
we're more than just an idea. When we first talked to you, we're a service. We deliver
benefits to our customers, both in logs and, shoot, this quarter alone, we're coming out with
new features, not just in the logs, which I'll talk about in a second, but in direct SQL access.
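(The parallel gets he's describing are, in spirit, many ranged reads of one object in flight at once. A hedged sketch, assuming boto3 and hypothetical names, and emphatically not ChaosSearch's actual engine:)

```python
# Sketch: parallel ranged GETs against a single S3 object, the access pattern
# that makes object storage behave like a high-throughput file system.
import boto3
from concurrent.futures import ThreadPoolExecutor

s3 = boto3.client("s3")
BUCKET, KEY = "example-bucket", "big/index/segment"  # hypothetical names
CHUNK = 8 * 1024 * 1024  # 8 MiB per ranged read

size = s3.head_object(Bucket=BUCKET, Key=KEY)["ContentLength"]

def fetch(offset: int) -> bytes:
    end = min(offset + CHUNK, size) - 1
    resp = s3.get_object(Bucket=BUCKET, Key=KEY, Range=f"bytes={offset}-{end}")
    return resp["Body"].read()

with ThreadPoolExecutor(max_workers=16) as pool:
    chunks = list(pool.map(fetch, range(0, size, CHUNK)))

data = b"".join(chunks)  # pool.map preserves order, so the ranges reassemble cleanly
```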
But one thing that you hear time and time again, we talked about it, JSON, CloudTrail, and
Kubernetes. This is a real nightmare. And so one thing that we've come out with this quarter
is the ability to virtually flatten. Now, you've heard time and time again where,
okay, I'm going to pick and choose my data because of what my database can handle, whether it's Elastic or, say, relational. And all of a sudden, shoot, I don't have that.
I've got to re-index that.
And so what we've done is we've created an index technology that we're always planning to come out with that indexes the JSON raw blob.
But in the data refiner we have, post-index, you can select how to flatten it.
Why is that important?
Because all that tooling, whether it's Elastic or SQL, is now available.
You don't have to change anything.
Why do Snowflake and BigQuery have these proprietary JSON APIs that none of these toolings know how to use to get access to the data?
Or you pick and choose.
And so when you have a cloud trail and you need to know what's going on, if you picked wrong, you're in trouble. So this new feature we're calling virtual flattening, I know we're
working with the marketing team on it. And we're also bringing, this is where I get kind of excited
where the Elastic world, the Elk world, we're bringing correlations into Elasticsearch. How
do you do that? They don't have the APIs. Well, our data refiner, again, has the ability to
correlate index patterns into one view.
A view is an index pattern.
So all those same constructs that you had in Kibana, Grafana, or Elastic API still work.
And so no more denormalizing, no more trying to hodgepodge query over here, query there.
You're actually going to have correlations in Elastic natively, and we're excited about that. And one more push on the future, Q4 into 2022: we have been giving early access to S3 SQL access. And as I mentioned,
correlations in Elastic, but we're going full in on publishing our TPC-H report. We're excited
about publishing those numbers, as well as not just giving early access, but going GA in the
first of the year next year.
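(For a sense of what flattening means here: a nested CloudTrail-style record becomes dotted column names. The sketch below is a generic illustration; ChaosSearch's virtual flattening happens post-index inside their data refiner, not in client code like this.)

```python
# Generic illustration of flattening nested JSON into dotted column names.
def flatten(record: dict, prefix: str = "") -> dict:
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))  # recurse into nested objects
        else:
            flat[name] = value
    return flat

# A made-up CloudTrail-style event, nested the way they arrive.
event = {
    "eventName": "PutObject",
    "userIdentity": {"type": "IAMUser", "userName": "corey"},
    "requestParameters": {"bucketName": "example-bucket"},
}
print(flatten(event))
# {'eventName': 'PutObject', 'userIdentity.type': 'IAMUser',
#  'userIdentity.userName': 'corey', 'requestParameters.bucketName': 'example-bucket'}
```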
I look forward to it.
This is also, I guess it's impossible
to have a conversation with you even now
where you're not still forward looking
about what comes next, which is natural.
That is how we get excited
about the things that we're building.
But so much less of what you're doing now
and our conversations have focused around what's coming
as opposed to the neat stuff you're already doing.
I had to double check when we were talking just now
about, oh yeah, is that Google Cloud object store support
still something that is roadmap
or is that out in the real world?
No, it's very much here in the real world available today.
You can use it, go click the button, have fun.
It's neat to see at least some evidence
that not all roadmaps are wishes and pixie dust.
The things that you were talking to me about years ago are established parts of ChaosSearch now.
It hasn't been just sort of frozen in amber for years or months or these giant periods of time.
Because again, there's, yeah, don't sell me vaporware.
I know how this works.
The things you have promised have come to fruition.
It's nice to see that. No, I appreciate it. We talked a little while ago, now a few years ago,
and it was a bit aspirational. We had a lot to do. We had more to do. But now when we have big
customers using our product, solving their problems, whether it's security, performance,
operation, again, at scale, right? The real pain is, sure, you have a small ELK cluster or small Athena use case, but when you're dealing with terabytes to petabytes, trillions of rows, right? Billions, when you're dealing with trillions, billions are now small. Millions
don't even exist, right? And you're graduating from computer science in college and you say the
word trillion, they're like, nah, no one does that. And like you were saying, people do petabytes and exabytes.
That's the world we're living in.
And that's something that we really went hard at because these are challenging data problems.
And this is where we feel we uniquely sit.
And again, we don't have to break the bank while doing it.
Oh, yeah.
Or at least as of this recording, there's a meme going around again from an old internal
Google video of, I just want to serve five terabytes of traffic. And it's an internal
Google discussion of, I don't know how to count that low. And yeah. But there's also value in
being able to address things at much larger volume. I would love to see better responsiveness
options around things like Deep Archive, because the idea of being able to query that, even if you
can wait a day or two,
becomes really interesting just from the perspective of, at that point, current cost for one petabyte of data in Glacier Deep Archive is a thousand bucks a month. That is "why would I ever delete data again" pricing. Yeah, you said it. And what's interesting about our technology
is, unlike, let's say, Lucene, when you index it, it could be 3, 4, or 5x the raw size. Our representation is smaller than GZIP. So it is a full representation. So why
don't you store it efficiently long-term in S3? Oh, by the way, move to Glacier. We support Glacier
too. And so, I mean, it's amazing. The cost of data with cloud storage is dramatic. And if you can make it hot and activated,
that's the real promise of a data lake. And it's funny, we use our own service to run our SaaS,
right? We log our own data, we monitor, we alert, have dashboards. And I can't tell you how cheap
our service is to ourselves, right? Because it's so cost-effective for long tail, not just, oh, a few
weeks. We store a whole year's worth of our operational data, so we can go back in time
to debug something or figure something out. And a lot of that savings, actually huge savings,
is cloud storage with a distributed elastic compute fabric that is serverless. These are
things that seem so obvious now, but if you have SSDs and
you're moving things around, a team of IT professionals trying to manage it, it's not cheap.
Oh yeah, that's the story. It's like, step one, start paying for using things in cloud. Okay,
great. When do I stop paying? That's the neat part. You don't. And it continues to grow and
build. And again, this is the thing I learned running a business that focuses on this.
The people working on this
in almost every case
are more expensive
than the infrastructure
they're working on.
And that's fine.
I'd rather pay people
than technologies.
And it does help reaffirm
on some level
that people don't like this reminder,
but you have to generate
more value than you cost.
So when you're sitting there
spending all your time trying to save money with, oh, I've listened to ChaosSearch talk about what they do a few times; I can probably build my own and roll it at home. I've seen the kind of work that you folks have put into this. Again, you have something like 100 employees now. It is not just you building this. My belief has always been that if you can buy something that gets you 90 to 95%
of where you are, great, buy it and then yell at whoever's selling it to you for the rest of it.
And that'll get you a lot further than we're going to do this ourselves from first principles,
which is great for a weekend project for just something that you have a passion for.
But in production, mistakes show. I've always been a big proponent of buying wherever you can.
It's cheaper, which sounds weird, but it's true. I mean, we do the same thing. We have single sign-on support. We didn't build that ourselves. We use a service. Auth0 is one of our providers now. Oh, you didn't roll your own authentication layer. Why ever not? Next, you're going to tell me that you didn't roll
your own payment gateway when you wound up charging people on your website to sign up.
You got it. And so, I mean, do what you do well. Focus on what you do well. If you're repeating what everyone seems to do
over and over again, time, cost, complexity, and service, it makes sense. I'm not trying to build
storage. I'm using storage. I'm using a great, wonderful service, cloud object storage. Use
what works, what works well, and do what you do well. And what we do well is make cloud storage analytical and fast.
So call us up, and we'll take away that 2 a.m. call you have when your cluster falls down, or you have a new workload and you were going to go to, I don't know, the beach house, and now the weekend's shot, right?
Spin it up, stream it in, we'll take over.
Yeah, so if you're listening to this
and you happen to be at re:Invent, which is sort of an open question: why would you be at re:Invent while listening to a podcast?
And then I remember how long the shuttle lines
are likely to be, and yeah.
So if you're at re:Invent, make it on down to the show floor, visit the ChaosSearch booth, tell them I sent you,
watch for the wince, that's always worth doing.
Thomas, if people have better decision-making capability
than the two of us do,
where can they find you
if they're not in Las Vegas this week?
So you find us online, chaossearch.io.
We have so much material, videos, use cases, testimonials.
You can reach out to us, get a free trial.
We have a self-service experience where you connect to your S3 bucket, and you're up and running within five minutes. So definitely chaossearch.io. Reach out if you want a
handheld, white-glove experience POV. If you have those types of needs, we can do that with you as well. But we're both at re:Invent, and I don't know the booth number, but I'm sure either we've assigned it or we'll find out. Don't worry. This year, it is a low enough attendance rate that I'm projecting that you will not be as hard to find as in recent years.
For example, there's only one expo hall this year.
What a concept.
If only it hadn't taken a deadly pandemic to get us here.
Yeah.
But, you know, we'll have the ability to demonstrate ChaosSearch at the booth. And really, within a few minutes, you'll say, wow, how come I've never heard of doing it this way? Because it just makes so much sense why you do it this way, versus the merry-go-round of data movement and transformation and schema management, let alone all the sharding that I know is a nightmare more often than not.
And we'll, of course, put links to that in the show notes. Thomas, thank you so much for taking
the time to speak with me today. As always, it's appreciated.
Corey, thank you.
Let's do this again.
We absolutely will.
Thomas Hazel, CTO and founder of ChaosSearch.
I'm cloud economist Corey Quinn, and this is Screaming in the Cloud.
If you've enjoyed this podcast episode, please leave a five-star review on your podcast platform of choice. Whereas if you've hated this episode, please leave a five-star review on your podcast platform of choice,
along with an angry comment, because I have dared to besmirch the honor of your home-brewed
object store running on top of some trusty and reliable Raspberry Pis.
If your AWS bill keeps rising and your blood pressure is doing the same,
then you need the Duckbill Group.
We help companies fix their AWS bill by making it smaller and less horrifying.
The Duckbill Group works for you, not AWS.
We tailor recommendations to your business, and we get to the point.
Visit duckbillgroup.com to get started.
This has been a HumblePod production.
Stay humble.