Screaming in the Cloud - Keeping the Chaos Searchable with Thomas Hazel
Episode Date: November 30, 2021

About Thomas: Thomas Hazel is Founder, CTO, and Chief Scientist of ChaosSearch. He is a serial entrepreneur at the forefront of communication, virtualization, and database technology, and the inventor of ChaosSearch's patented IP. Thomas has also patented several other technologies in the areas of distributed algorithms, virtualization, and database science. He holds a Bachelor of Science in Computer Science from the University of New Hampshire (Hall of Fame Alumni Inductee), and founded both student and professional chapters of the Association for Computing Machinery (ACM).

Links: ChaosSearch: https://www.chaossearch.io
Transcript
Hello, and welcome to Screaming in the Cloud, with your host, Chief Cloud Economist at the
Duckbill Group, Corey Quinn.
This weekly show features conversations with people doing interesting work in the world
of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles
for which Corey refuses to apologize.
This is Screaming in the Cloud.
This episode is sponsored in part by my friends at Thinkst Canary.
Most companies find out way too late that they've been breached,
and Thinkst Canary changes this, and I love how they do it.
Deploy Canaries and Canary tokens in minutes, and then forget about them. What's great is that then attackers tip their hand by touching them, giving you one alert when it matters. I use it myself, and I only remember this when I get the weekly update with a "we're still here, so you're aware" from them. It's glorious. There's zero admin overhead to this. There are effectively no false positives unless I do something foolish.
Canaries are deployed and loved on all seven continents.
You can check out what people are saying at canary.love.
And their kubeconfig canary token is new and completely free as well.
You can do an awful lot without paying them a dime, which is one of the things I love about them.
It's useful stuff and not a, oh, I wish I had money. No, it is spectacular. Take a look. That's canary.love
because it's genuinely rare to find a security product that people talk about in terms of
love. It's really just a neat thing to see. Canary.love, thank you to Thinkst Canary for their support of my ridiculous, ridiculous nonsense.
This episode is sponsored in part by our friends at Vultr, spelled V-U-L-T-R,
because they're all about helping save money, including on things like, you know, vowels.
So what they do is they are a cloud provider that provides surprisingly high performance cloud compute at a price that, well, sure, they claim it is better than AWS's pricing.
And when they say that, they mean that it's less money.
Sure, I don't dispute that.
But what I find interesting is that it's predictable.
They tell you in advance on a monthly basis what it's going to cost.
They have a bunch of advanced networking features.
They have 19 global locations and scale things elastically, not to be confused with openly, which is apparently
elastic and open. They can mean the same thing sometimes. They have had over a million users.
Deployments take less than 60 seconds across 12 pre-selected operating systems. Or if you're one
of those nutters like me,
you can bring your own ISO
and install basically any operating system you want.
Starting with pricing as low as $2.50 a month
for Vultr Cloud Compute,
they have plans for developers and businesses of all sizes,
except maybe Amazon,
who stubbornly insists on having something of the scale
all on their own.
Try Vultr today for free by visiting vultr.com slash screaming,
and you'll receive $100 in credit.
That's v-u-l-t-r dot com slash screaming.
Welcome to Screaming in the Cloud.
I'm Corey Quinn.
This promoted episode is brought to us by our friends at ChaosSearch.
We've been working with them for a long time.
They've sponsored a bunch of our nonsense,
and it turns out that we've been talking about them to our clients
since long before they were a sponsor,
because it actually does what it says on the tin.
Here to talk to us about that in a few minutes is Thomas Hazel, ChaosSearch's CTO and founder.
First, Thomas, nice to talk to you again. And as always, thanks for humoring me. Corey, always great to talk to you. And I enjoy
these conversations that sometimes go up and down, left and right, but I look forward to all the fun
we're going to have. So my understanding of ChaosSearch is probably a few years old because it
turns out I don't spend a whole lot of time meticulously studying your company's roadmap in the same way that
you presumably do.
When last we checked in with what the service did slash does, you were effectively solving
the problem of data movement and querying that data.
The idea behind data warehouses is generally something that's shoved onto us by
cloud providers, where, hey, this data is going to be valuable to you someday. Data science teams
are big proponents of this, because when you're storing that much data, their salaries look
relatively reasonable by comparison. And the ChaosSearch vision was: instead of copying all this data out of an object store and storing it on expensive disks and replicating it, et cetera, what if we queried it in place
in a somewhat intelligent manner? So you take the data and you store it, in this case, in S3
or equivalent, and then just query it there rather than having to move it around all over the place,
which of course then incurs data transfer fees. You're storing it multiple times,
and it's never in quite the format that you wanted. That was the breakthrough revelation.
You were Elasticsearch, now OpenSearch, API compatible, which was great.
And that was sort of the state of the art a year or two ago.
Is that generally correct?
No, you nailed our mission statement.
No, you're exactly right.
The value of cloud object stores, S3: the elasticity, the durability, all these wonderful things.
The problem was you couldn't get any value out of it.
And you had to move it out to these siloed solutions, as you indicated.
So our mission was exactly that, transform customers' cloud storage into an analytical database, a multi-model analytical database, where our first use case was search and log analytics, replacing the ELK stack and
also replacing the data pipeline, the schema management, etc. We automate the entire step,
raw data to insights. It's funny we're having this conversation today. Earlier today, I was
trying to get rid of a relatively paltry 200 gigs or so of small files on an EFS volume,
you know, Amazon's version of NFS. It's like an
NFS volume, except you're paying Amazon for the privilege. Great. And it turns out that it's a
whole bunch of operations across a network on a whole bunch of tiny files. So I had to spin up other instances that were not getting whacked by spot terminations, and just firing up a whole bunch of threads. So now the load average on that box is approaching 300, but it's plowing through, getting rid of that data finally. And I'm looking at this saying,
this is a quarter of a terabyte. Data warehouses are in the petabyte range. Oh, I begin to see
aspects of the problem. Even searching that kind of data using traditional tooling starts to break
down, which is sort of the revelation that Google had 20-some-odd years ago and other
folks have since solved for. But this is the first time I've had significant data that wasn't just
easily searched with a grep. For those of you in the Unix world who understand what that means,
condolences. We're having a support group meeting at the bar.
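(For the curious, the kind of parallel cleanup described above can be sketched in a few lines; this assumes Python with concurrent.futures, since the episode doesn't specify the actual tooling used.)

```python
# Hypothetical sketch: deleting many small files on an NFS-style mount in parallel.
# Each unlink is a round trip over the network, so one-at-a-time deletion is the
# bottleneck; a thread pool keeps many requests in flight at once.
import os
from concurrent.futures import ThreadPoolExecutor

MOUNT_POINT = "/mnt/efs/old-data"  # assumption: this path is purely illustrative

def delete_file(path: str) -> None:
    try:
        os.unlink(path)
    except OSError as err:
        print(f"failed to delete {path}: {err}")

with ThreadPoolExecutor(max_workers=64) as pool:
    # topdown=False walks deepest directories first
    for dirpath, _dirnames, filenames in os.walk(MOUNT_POINT, topdown=False):
        for name in filenames:
            pool.submit(delete_file, os.path.join(dirpath, name))
```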
Yeah. And I always thought, what if you could make cloud object storage like S3 high performance and really transform it into a database?
And so that warehouse capability, that's great.
We like that.
However, to manage it, to scale it, to configure it, to get the data into that was the problem.
That was the promise of a data lake, right?
This simple in, and then this arbitrary schema-on-read, generic out. The problem next came: it became swampy.
It was really hard,
and that promise was not delivered. And so what we're trying to do is get all the benefits of
the data lake, simple in: so many services naturally stream to cloud storage. Shoot, I would say every one of our customers is putting their data in cloud storage because their data pipeline to
their warehousing solution or Elasticsearch may go down, and they're worried they'll lose the data.
So what we say is, what if you just said, activate that data lake and get that ELK use case,
get that BI use case without that data movement, as you indicated, without that ETLing,
without that data pipeline that you're worried is going to fall over.
So that vision has been cast.
Now, we haven't talked in a few years, but this idea that we're growing
beyond what we were just going after logs, we're going into new use cases, new opportunities,
and I'm looking forward to discussing with you. It's a great answer, though I have to call out
that I am right there with you as far as inappropriately using things as databases.
I know that someone is going to come back and say, oh, S3 is a database, you're dancing around it. Isn't that what Athena is? Which is named,
of course, after the Greek goddess of spending money on AWS. And that is a fair question. But
to my understanding, there's a schema story behind that that does not apply to what you're doing.
Yeah, and what is so crucial is that we like the relational access, but the time, cost, complexity to get it into that.
As you mentioned, scaled access, I mean, it could take weeks, months to test it, to configure
it, to provision it.
And imagine if you got it wrong, you got to redo it again.
And so our unique service removes all that data pipeline schema management. And because of our innovation, because of our service, you do all schema definition on the fly, virtually. We call them views on your indexed data, which you can publish as an Elastic index pattern for that consumption, or a relational table for that consumption. And that's kind of leading the witness into things
that we're coming out with in this quarter into 2022. I have to deal with a little bit of, I guess,
shame here,
because, yeah, I'm doing exactly what you just described.
I'm using Athena to wind up querying our customers' cost and usage reports,
and we spend a couple hundred bucks a month on AWS Glue
to wind up massaging those into the way that they expect it to be.
And it's great-ish.
We hook it up to Tableau and can make those queries from it,
and all right, it's great.
It just, brr, goes the money printer,
and we somehow get access and insight
to a lot of valuable data.
But even that is knowing exactly
what the format is going to look like-ish.
I mean, cost and usage reports from Amazon
are sort of aspirational when it comes to schema sometimes,
but here we are.
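(For flavor, a query against a cost and usage report through Athena looks roughly like the sketch below; it assumes boto3, a hypothetical cur_database and cur_table registered in Glue, and the standard line_item_* columns of the Athena-compatible CUR schema.)

```python
# Hedged sketch: querying a cost and usage report through Athena with boto3.
# "cur_database", "cur_table", and the results bucket are hypothetical names.
import boto3

athena = boto3.client("athena")

query = """
SELECT line_item_product_code,
       SUM(line_item_unblended_cost) AS cost
FROM cur_table
WHERE line_item_usage_start_date >= date '2021-11-01'
GROUP BY line_item_product_code
ORDER BY cost DESC
LIMIT 20
"""

response = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "cur_database"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},  # hypothetical bucket
)
print(response["QueryExecutionId"])  # poll this ID for results
```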
And that's been all well and good.
But now the idea of log files,
even looking at the base case
of sending logs from an application, great.
Nginx or Apache or Lighty or any of the various web servers out there all tend to use different logging formats just to describe the same exact things.
Start spreading that across custom in-house applications and getting signal from that is almost impossible.
Oh, people say, so we'll use a structured data format,
and now you're putting logging and structuring requirements
on application developers who don't care in the first place,
and now you have a mess on your hands.
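(To make that format mess concrete: here are made-up examples of the same request as an Nginx combined-format line versus a structured JSON line, and the per-format parsing it forces. The log lines and regex are illustrative, not from the episode.)

```python
# Illustrative only: the same HTTP request in two different log shapes.
import json
import re

nginx_combined = '203.0.113.5 - - [30/Nov/2021:12:00:00 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "curl/7.79"'
app_json = '{"ts": "2021-11-30T12:00:00Z", "method": "GET", "path": "/index.html", "status": 200}'

# One regex for the combined format; every other format needs its own parser.
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d+)'
)

match = COMBINED.match(nginx_combined)
print(match.groupdict())       # painstakingly parsed out of the combined format
print(json.loads(app_json))    # structured logs parse trivially, but someone had to emit them
```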
And it really is a mess, and that challenge is so problematic, and schemas are changing.
We have customers, and one of the reasons why they go with us
is their log data is changing.
They didn't expect it.
Well, in your data pipeline, in your Athena database, that breaks.
That brings the system down.
And so our system uniquely detects that and manages that for you.
Then you can pick and choose how you want to export it in these views dynamically.
So it's really not rocket science, right?
But the problem is a lot of the technology that we're using is designed for static fixed thinking. And then to scale it is problematic and time consuming. So, you know, glue is a great
idea, but it has a lot of sharp elbows. Athena is a great idea, but also has a lot of problems. And
so that data pipeline, you know, it's not for digitally native, active, new use cases, new
workloads coming up hourly, daily. You think about
this long term. So a lot of that data prep pipeline is something we address so uniquely.
But really where the customer cares is the value of that data, right? And so if you're spending
toils trying to get the data into a database, you're not answering the questions, whether it's
for security, for performance, for your business needs. That's the problem. And that agility, that time to value, is where we very uniquely come in, because we start where your data is raw and we automate the process all the way through.
So when I look at the things that I have stuffed into S3, they generally fall into a couple of
categories. There are a bunch of logs for things I never asked for nor particularly wanted,
but AWS is aggressive about that.
First, routing through CloudTrail so you can get charged 50 cents per gigabyte ingested.
Awesome.
And of course, large static assets.
Images I have done something to and are colloquially now known as shitposts, which is great.
Other than logs, what could you possibly be storing in S3
that lends itself to
effectively the type of analysis that you've built around this?
Well, our first use case was the classic log use cases, app logs, web service logs. I mean,
CloudTrail, it's famous. We had customers that gave up on Elastic and definitely gave up on
Relational, where you can do a couple of changes, and your permutation of attributes for CloudTrail is going to bring you to your knees.
And people just say, I give up, right?
Same thing with Kubernetes logs.
And so it's the classic, whether it's CSV, whether it's JSON, whether it's log types, we auto-discover all that.
We also allow you, if you want, to override that and change the parsing capabilities through a UI wizard. We do discover what's in your buckets. That term data swamp, and not knowing what's in your bucket: we do have a facility that will index that data and actually create a report for you of what's in there.
Now, if you have text data, if you have log data, if you have BI data,
we can bring it all together.
But the real pain is at the scale.
So classically, app logs, system logs, many devices sending IoT-type streams is where we really come in, Kubernetes, where they're dealing with terabytes and up of data per day.
And managing an ELK cluster at that scale, particularly on a Black Friday. Shoot, some of our customers, Klarna is one of them, credit card payments, they're ramping up for Black
Friday. And one of the reasons why they chose us is our ability to scale when maybe you're doing
a terabyte or two a day and then it goes up to 20, 25. How do you test that scale? How do you
manage that scale? And so for us, the data streams are traditionally with our customers, the well-known
log types, at least in the log use cases.
And the challenge is scaling it, is getting access to it. And that's where we come in.
I will say the last time you were on this show, a couple of years ago, you were talking about the initial logging use case and you were speaking, in many cases, aspirationally about where things
were going. What a difference a couple of years has made. Instead of talking about what hypothetical
customers might want or what might be able to do,
you're just able to name drop them off the top of your head.
You have scaled to approximately 10 times
the number of employees you had back then.
You've raised, I think, a total of what,
50 million since then?
60 now.
60 now, fantastic.
Congrats.
And of course, how'd you do it?
By sponsoring Last Week in AWS, as everyone should.
I'm taking clear credit for that. Every time someone announces a round, that's the game. But no, there is validity to it, because telling fun stories and sponsoring exciting things like this only carry you so far. At some point, customers have to say, yeah, this is solving a pain that I have. I'm willing to pay you money to solve it. And you've clearly gotten to a point where you are addressing the needs of those customers at a pretty fascinating clip. It's bittersweet from
my perspective, because it seems like the majority of your customers have not come from my nonsense
anymore. They're finding you through word of mouth. They're finding you through more traditional
read as boring ad campaigns, et cetera, et cetera. But you've built a brand that extends beyond just me.
I'm no longer viewed as the de facto ombudsperson for any issue someone might have with ChaosSearch on Twitter. It's kind of, oh, the company grew up. What happened there?
No, listen, you were great. We reached out to you to tell our story. And I gotta be honest,
a lot of people came by and said, I heard something on Corey Quinn's podcast, et cetera, and it came a long way. Now we have companies like Equifax, multi-cloud, Amazon, and Google. They love the data lake philosophy, the centralization, where use cases are now available within days, not weeks and months, whether it's logs and BI, correlating across all those data streams. It's huge. We mentioned Klarna, APM performance, and we have Armor for SIEM and Blackboard for observability.
So it's funny.
Yeah, it's funny.
When I first was talking to you, I was like, what if?
What if we had this customer or that customer?
And we were building the capabilities, but now that we have it, now that we have customers,
yeah, I guess maybe we've grown up a little bit.
But hey, listen, you're always near to our heart
because we remember when you stopped by our booth
at re:Invent several times. And we're coming to re:Invent this year.
And I believe you are as well.
Oh, yeah.
But people are listening to this.
If they're listening the day it's released,
this will be during re:Invent. So by all means, come by the ChaosSearch booth and see what they have to say. For once, they have people who aren't me who are going to be telling stories about these things. And it's fun. Like I joke, it's nothing but positive here. It's interesting from where I sit, seeing the parallels here.
For example, we have both had, how do we say, adult supervision come in. You have a CEO,
Ed, who came over from IBM Storage. I have Mike Julian, whose first love language is, of course, spreadsheets. And it's great on some level realizing that, wow, this company has eclipsed
my ability to manage these things myself and put my hands on everything. And eventually,
you have to start letting go. It's a weird growth stage, and it's a heck of a transition.
No, I love it. I mean, I think when we were talking, we were maybe 15 employees. Now
we're pushing 100. We brought in Ed Walsh, who's an amazing CEO. It's funny. I told him about this
idea. I invented this technology roughly eight years ago. And he's like, I love it. Let's do it.
I wasn't ready to do it. So five, six years ago, I started the company, always knowing that I'd give him a call once we got the plane up in the air.
And it's been great to have him here because the next level up, right, of execution and growth and business development and sales and marketing.
So you're exactly right.
I mean, we were a young pup several years ago when we were talking to you.
And now we're a little bit older, a little bit wiser.
But no, it's great to have Ed here.
And just the leadership in general, we've grown immensely. Now, we are recording this in advance of re:Invent,
so there's always the question of, wow, are we going to look really silly based upon what is
being announced when this airs? Because it's very hard to predict some things that AWS does. And
let's be clear, I always stay away from predictions just because first,
I have a bit of a knack for being right. But also, when I'm right, people will think, oh,
Corey must have known about that and is leaking. Whereas if I get it wrong, I just look like a
fool. There's no win for me if I start doing the predictive dance on stuff like that. But I have
to level with you. I have been somewhat surprised that,
at least as of this recording, AWS has not moved more in your direction because storing data in
S3 is kind of their whole thing. And querying that data through something that isn't Athena
has been a bit of a reach for them. They're slowly starting to wrap their heads around it with their UltraWarm nonsense, which is just, okay, great naming there. What is the point of continually having a model where, oh yeah, we're going to just age out the stuff that isn't
actively being used into S3 rather than coming up with a way to query it there? Because you've done
exactly that. And please don't take this as anything other than a statement of fact.
They have better access to what S3 is doing than you do. You're forced to deal with this thing
entirely from a public API standpoint, which is fine. They can theoretically change the behavior
of aspects of S3 to unlock these use cases if they chose to do so, and they haven't. Why is it that
you're the only folks that are doing this? No, it's a great question, and I'll give them props
for continuing to push the data lake philosophy, the cloud providers, S3, because it's really
where I saw the world. Lakes, I believe in them. I love them. They love them. However, they promote moving the data out to get access. And it seems so counterintuitive: why won't you just leave it in and make these services more intelligent? So it's funny. I've trademarked smart object storage.
I actually trademarked, I think you were part of this, ultra hot, right? Because why would you want
ultra warm when you can have ultra hot? And the reason I feel is that if you're using Parquet for Athena,
ColumnStore, or Lucene for Elasticsearch,
these two index technologies were not designed for cloud storage,
for real-time streaming off of cloud storage.
So the trick is you have to build UltraWarm, get it off of what they consider cold S3 into warmer memory or SSD-type access.
What we did, what the invention I created was, that first read is hot.
That first read is fast.
Snowflake is a good example.
They give you a 10 terabyte demo example.
And if you have a big instance and you do that first query, maybe several order-bys or group-bys, it could take an hour to warm up.
The second query is fast. Well, what if the first query is in seconds as well? And that's where we
really spend the last five, six years building out the tech and the vision behind this. Because
I like to say, you go to a doctor and you say, hey, doc, every single time I move my arm,
it hurts. And the doctor says, well, don't move your arm. It's things like that. To your point, it's like, why wouldn't they? I would argue, one, you have to believe it's possible.
We're proving that it is. And two, you have to have the technology to do it, not just the index,
but the architecture. So I believe they will go this direction. Little birdies always say that
all these companies understand this need. Shoot, Snowflake's trying to be lakey. Databricks is
trying to really bring
this warehouse lake concept.
But you still have to do all the pipelining.
You still have to do all the data management
the way that you don't want to do.
It's not a lake.
And so my argument is that it's innovation on why.
Now, they have money.
They have time.
But we have a big head start.
I remember last year at re:Invent, they released a, shall we say, significant change to S3, in that it enabled strong read-after-write consistency, which is awesome for, again, those of us in
the business of misusing things as databases. But for some folks, the majority of folks,
I would say, it was a, I don't know what that means, and therefore I don't care. And that's
fine. I have no issue with that. There are other folks, some of my customers, for example, who are suddenly, wait a minute,
this means I can sunset this entire janky sidecar metadata system that is designed to make sure that
we are consistent in our use of S3 because it now does it automatically under the hood,
and that's awesome. Does that change mean anything
for ChaosSearch? It doesn't, because of our architecture. We're an append-only, write-once scenario, so not a lot of update-in-place viewpoints. My viewpoint is that if you're seeing S3 as the database and you need that type of consistency, it makes sense why you'd want it. But because of our distributed fabric, our stateless architecture, our append-only nature, it really doesn't affect us. Now,
I talked to the S3 team. I said, please, if you're coming up with this feature,
it better not be slower. I want S3 to be fast, right? And they said, no, no, it won't affect
performance. I was like, okay, let's keep that up. And so to us, any type of S3 capability,
we'll take advantage of it. It benefits us, whether it's consistency, as you indicated, performance, functionality.
But we really keep the constructs of S3 access to really limited features: list, put, get. Role-based policies to give us read-only access to your data, and a location to write our indices into your account.
And then our distributed fabric, our service, accesses those indices and queries them
or searches them
to resolve whatever analytics you need.
So we made it pretty simple
and that has allowed us
to make it high performance.
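(In boto3 terms, that limited surface area amounts to just the basic object calls; the sketch below is a hedged illustration with hypothetical bucket and key names, not ChaosSearch's actual code.)

```python
# Sketch of the list/get/put surface area described above; names are hypothetical.
import boto3

s3 = boto3.client("s3")

# List: enumerate the customer's objects (a read-only role grants this).
page = s3.list_objects_v2(Bucket="customer-logs", Prefix="2021/11/")
keys = [obj["Key"] for obj in page.get("Contents", [])]

# Get: stream raw data out for indexing.
body = s3.get_object(Bucket="customer-logs", Key=keys[0])["Body"].read()

# Put: write the resulting index back into a location in the customer's account.
s3.put_object(Bucket="customer-indices", Key="indices/2021/11/segment-0001", Body=b"...")
```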
I'll take it a step further, because we want to talk about changes since the last time we spoke. It used to be that this was on top of S3: you could store your data anywhere you want, as long as it's S3, in the customer's account.
Now you're also supporting one-click integration with Google Cloud's object storage, which,
great. That does mean, though, that you're not dependent upon provider-specific implementations
of things like a consistency model for how you've built things. It really does use the
lowest common denominator, to my understanding, of object stores. Is that something that you're seeing broad adoption of? Or is this one of those areas where, well, you have
one customer on a different provider, but almost everything lives on the primary? I'm curious what
you're seeing for adoption models across multiple providers. It's a great question. We build our
architecture purposely to be cloud-agnostic. I mean, we use compute in a
containerized way. We use object storage in a very simple construct: put, get, list. And we went over
to Google because that made sense, right? We have customers on both sides. I would say Amazon is the
gorilla, but Google's trying to get there and growing. We had a big customer, Equifax, that's
on both Amazon and Google, but we offer
the same service. To be frank, it looks like the exact same product, and it should, right? Whether it's Amazon Cloud or Google Cloud, multi-select, and I want to choose either one and get the other one. I would say that different business types are using each one, but the bulk of our business is on Amazon. But we just this summer released our SaaS offering, so it's growing.
And, you know, it's funny.
You never know where it comes from.
So we have one customer, actually Digital River, as one of our customers on Amazon for logs.
But we're going and working together to do BI on GCP or on Google.
And so it's kind of funny.
They have two departments on two different clouds with two different use cases.
And so do they want unification?
I'm not sure, but they definitely have their BI on Google
and their operations in Amazon.
It's interesting.
You know, it's important to me
that people learn how to use the cloud effectively.
That's why I'm so glad that Cloud Academy
is sponsoring my ridiculous nonsense.
They're a great way to build in-demand tech skills
the way that, well, personally, I learn best, which is learn by doing, not by reading. They have live cloud labs
that you can run in real environments that aren't going to blow up your own bill. I can't stress
how important that is. Visit cloudacademy.com slash Corey. That's C-O-R-E-Y. Don't drop the E.
And use Corey as a promo code as well.
You're going to get a bunch of discounts on it with a lifetime deal.
The price will not go up.
It is limited time.
They assured me this is not one of those things
that is going to wind up being a rug pull scenario.
Oh, no, no.
Talk to them.
Tell me what you think.
Visit cloudacademy.com slash Corey, C-O-R-E-Y, and tell them that I sent you.
I know that I'm going to get letters for this, so let me just call it out right now,
because I've been a big advocate of pick a provider, I care not which one, and go all in on
it. And I'm sitting here congratulating you on extending to another provider, and people are
going to say, ah, you're being inconsistent. No, I'm suggesting that you as a
provider have to meet your customers where they are. Because if someone is sitting in GCP and
your entire approach is step one, migrate those four petabytes of data right on over here to AWS,
they're going to call you the jackhole that you would be by making that suggestion and go
immediately for option B, which is literally
anything that is not chaos search, just based upon that core misunderstanding of their business
constraints. That is the way to think about these things. For a vendor position that you are in,
as an ISV, independent software vendor, for those not up on the lingo of this ridiculous industry,
you have to meet customers where they are, and it's the right move.
Well, you just said it. Imagine moving
terabytes and petabytes of data. It sounds terrific if I'm a salesperson for one of these
companies working on commission, but for the rest of us, it sounds awful. We really are a data fabric
across clouds within clouds. We're going to go where the data is, and we're going to provide
access to where that data lives. Our whole philosophy is the no movement movement, right?
Don't move your data, leave it where it is
and provide access at scale.
And so you may have services in Google
that naturally stream to GCS.
Let's do it there.
Imagine moving that amount of data over to Amazon
to analyze and vice versa.
In 2022, we're going to be in Azure. There are totally different types of business users and personas,
but you're getting asked,
can you support Azure? And the answer is yes, and we will in 2022. So to us, if you have cloud storage, if you have compute, and it's a big enough business and opportunity in the market,
we're there. We're going there. When we first started, we were talking to MinIO. Remember that open source object storage platform? We've run on our laptops. We run, it's this Dr. Seuss thing: we run over here, we run over there, we run everywhere.
But the honest truth is, we're going to go with the big cloud providers where the business
opportunity is and offer the same solution.
Because the same solution is valued everywhere.
Simple in, value out, cost effective, long retention, flexibility.
That sounds so basic, but you mention this all the time
with our Rube Goldberg Amazon diagrams we see time and time again. It's like, if you looked at that
and you were from an alien planet, you'd be like, these people don't know what they're doing.
Why is it so complicated? And the simple answer is, I don't know why people think it's complicated.
To your point about Amazon, why won't they do it? I don't know.
But if they did, things would be different.
And to be honest, I think people are catching on.
We do talk to Amazon and others.
They see the need, but they also have to build it.
They have to invent technology to address it.
And using Parquet and Lucene are not the answer.
Yeah, it's too much of a demand on the producers of that data rather than the consumer.
And yeah, I would love to be able to go upstream to application developers and demand they do things in certain ways.
It turns out, as a consultant, you have zero authority to do that.
As a DevOps team member, you have limited ability to influence it.
But it turns out that being the department of no quickly turns into being the department of unemployment insurance because no
one wants to work with you and collaboration, contrary to what people wish to believe,
is a key part of working in a modern workplace. Absolutely. And it's funny, the demands of IT
are getting harder. The actual getting the employees to build out the solutions are getting
harder. And so a lot of that time is in the pipeline, is the prep,
is the schema and sharding and et cetera, et cetera, et cetera. My viewpoint is that should
be automated away. More and more databases are being auto-tuned, right? This whole knobs and
this and that, to me, Glue is a means to an end. I mean, let's get rid of it. Why can't Athena
know what to do? Why can't object storage be Athena and
vice versa? I mean, to me, it seems like all this moving through all these services is a classic
Amazon viewpoint. Even their diagrams of having this centralized repository of S3, move it all
out to your services, get results, put it back in, then take it back out again, move it around.
It just doesn't make much sense. And so to us, I love S3, love the service.
I think it's brilliant.
Amazon's first service, right?
But from there, get a little smarter.
And that's where Chaos Search comes in.
I would argue that S3 is in fact a modern miracle.
And you have those companies saying,
oh, we have an object store, it's S3 compatible.
It's like, yeah, we have S3 at home.
Look at S3 at home.
And it's just basically a series of failing Raspberry Pis. But you have this whole ecosystem of things that have
built up and sprung up around S3. It is wildly understated just how scalable and massive it is.
There was an academic paper recently that won an award on how they use automated reasoning to validate what is going on
in the S3 environment. And they talked about hundreds of petabytes in some cases. And folks
are saying, ah, S3 is hundreds of petabytes. Yeah, I have clients storing hundreds of petabytes.
There are larger companies out there. Steve Schmidt, Amazon CISO, was recently at a Splunk keynote where he
mentioned that in security info alone, AWS itself generates 500 petabytes a day that then gets
reduced down to a bunch of stuff, and some of it gets loaded into Splunk, I think. I couldn't really
hear the second half of that sentence because of the sound of all of the Splunk salespeople in that
room becoming excited so quickly you could hear it. I love it. If I can be so bold, that S3 team, they're gods. They are amazing. They created such an amazing service. And when I started playing with S3, now I guess 2006 or 2007, I mean, we were using it for a repository, URL access to get images.
I was doing a virtualization company at the time.
Oh, first time I played with it.
This seems ridiculous and kind of dumb.
Why would anyone use this?
Yeah, yeah.
Yeah, it turns out I'm really bad at predicting the future.
Another reason I don't do the prediction thing.
Yeah.
And when I started this company officially
five, six years ago,
I was thinking about S3
and I was thinking about HDFS not being a good answer.
And I said, I think S3 will actually achieve the goals and performance we need. It's a distributed
file system. You can run parallel puts and parallel gets. And the performance that I was
seeing when the data was a certain way, certain size, wait, you can get high performance. And
when I first turned on the engine now four or five years
ago, I was like, wow, this is going to work. We're off to the races. And now, obviously,
we're more than just an idea. When we first talked to you, we're a service. We deliver
benefits to our customers, both in logs and, shoot, this quarter alone, we're coming out with
new features, not just in the logs, which I'll talk about in a second, but in direct SQL access.
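(The parallel gets he's describing are, in spirit, many ranged reads of one object in flight at once. A hedged sketch, assuming boto3 and hypothetical names, and emphatically not ChaosSearch's actual engine:)

```python
# Sketch: parallel ranged GETs against a single S3 object, the access pattern
# that makes object storage behave like a high-throughput file system.
import boto3
from concurrent.futures import ThreadPoolExecutor

s3 = boto3.client("s3")
BUCKET, KEY = "example-bucket", "big/index/segment"  # hypothetical names
CHUNK = 8 * 1024 * 1024  # 8 MiB per ranged read

size = s3.head_object(Bucket=BUCKET, Key=KEY)["ContentLength"]

def fetch(offset: int) -> bytes:
    end = min(offset + CHUNK, size) - 1
    resp = s3.get_object(Bucket=BUCKET, Key=KEY, Range=f"bytes={offset}-{end}")
    return resp["Body"].read()

with ThreadPoolExecutor(max_workers=16) as pool:
    chunks = list(pool.map(fetch, range(0, size, CHUNK)))

data = b"".join(chunks)  # pool.map preserves order, so the ranges reassemble cleanly
```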
But one thing that you hear time and time again, we talked about it, JSON, CloudTrail, and
Kubernetes. This is a real nightmare. And so one thing that we've come out with this quarter
is the ability to virtually flatten. Now, you've heard time and time again where,
okay, I'm going to pick and choose my data because of what my database can handle, whether it's Elastic or, say, relational. And all of a sudden, shoot, I don't have that.
I've got to re-index that.
And so what we've done is we've created an index technology that we're always planning to come out with that indexes the JSON raw blob.
But in the data refiner we have, post-index, you can select how to flatten it.
Why is that important?
Because all that tooling, whether it's Elastic or SQL, is now available.
You don't have to change anything.
Why do Snowflake and BigQuery have these proprietary JSON APIs that none of these toolings know how to use to get access to the data?
Or you pick and choose.
And so when you have a cloud trail and you need to know what's going on, if you picked wrong, you're in trouble. So this new feature we're calling virtual flattening, I know we're
working with the marketing team on it. And we're also bringing, this is where I get kind of excited
where the Elastic world, the Elk world, we're bringing correlations into Elasticsearch. How
do you do that? They don't have the APIs. Well, our data refiner, again, has the ability to
correlate index patterns into one view.
A view is an index pattern.
So all those same constructs that you had in Kibana, Grafana, or Elastic API still work.
And so no more denormalizing, no more trying to hodgepodge query over here, query there.
You're actually going to have correlations in Elastic natively, and we're excited about that. And one more push on the future, Q4 into 2022: we have been giving early access to S3 SQL access. And as I mentioned,
correlations in Elastic, but we're going full in on publishing our TPC-H report. We're excited
about publishing those numbers, as well as not just giving early access, but going GA in the
first of the year next year.
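(For a sense of what flattening means here: a nested CloudTrail-style record becomes dotted column names. The sketch below is a generic illustration; ChaosSearch's virtual flattening happens post-index inside their data refiner, not in client code like this.)

```python
# Generic illustration of flattening nested JSON into dotted column names.
def flatten(record: dict, prefix: str = "") -> dict:
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))  # recurse into nested objects
        else:
            flat[name] = value
    return flat

# A made-up CloudTrail-style event, nested the way they arrive.
event = {
    "eventName": "PutObject",
    "userIdentity": {"type": "IAMUser", "userName": "corey"},
    "requestParameters": {"bucketName": "example-bucket"},
}
print(flatten(event))
# {'eventName': 'PutObject', 'userIdentity.type': 'IAMUser',
#  'userIdentity.userName': 'corey', 'requestParameters.bucketName': 'example-bucket'}
```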
I look forward to it.
This is also, I guess it's impossible
to have a conversation with you even now
where you're not still forward looking
about what comes next, which is natural.
That is how we get excited
about the things that we're building.
But so much less of what you're doing now
and our conversations have focused around what's coming
as opposed to the neat stuff you're already doing.
I had to double check when we were talking just now
about, oh yeah, is that Google Cloud object store support
still something that is roadmap
or is that out in the real world?
No, it's very much here in the real world available today.
You can use it, go click the button, have fun.
It's neat to see at least some evidence
that not all roadmaps are wishes and pixie dust.
The things that you were talking to me about years ago are established parts of ChaosSearch now.
It hasn't been just sort of frozen in amber for years or months or these giant periods of time.
Because again, there's, yeah, don't sell me vaporware.
I know how this works.
The things you have promised have come to fruition.
It's nice to see that. No, I appreciate it. We talked a little while ago, now a few years ago,
and it was a bit aspirational. We had a lot to do. We had more to do. But now when we have big
customers using our product, solving their problems, whether it's security, performance,
operation, again, at scale, right? The real pain is, sure, you have a small ELK cluster or small Athena use case, but when you're dealing with terabytes to petabytes, trillions of rows, right? Billions, when you're dealing with trillions, billions are now small. Millions
don't even exist, right? And you're graduating from computer science in college and you say the
word trillion, they're like, nah, no one does that. And like you were saying, people do petabytes and exabytes.
That's the world we're living in.
And that's something that we really went hard at because these are challenging data problems.
And this is where we feel we uniquely sit.
And again, we don't have to break the bank while doing it.
Oh, yeah.
Or at least as of this recording, there's a meme going around again from an old internal
Google video of, I just want to serve five terabytes of traffic. And it's an internal
Google discussion of, I don't know how to count that low. And yeah. But there's also value in
being able to address things at much larger volume. I would love to see better responsiveness
options around things like Deep Archive, because the idea of being able to query that, even if you
can wait a day or two,
becomes really interesting just from the perspective of, at that point, current cost for one petabyte of data in Glacier Deep Archive is a thousand bucks a month. That is "why would I ever delete data again" pricing. Yeah, you said it. And what's interesting about our technology
is, unlike, let's say, Lucene, when you index it, it could be 3, 4, or 5x the raw size. Our representation is smaller than GZIP. So it is a full representation. So why
don't you store it efficiently long-term in S3? Oh, by the way, move to Glacier. We support Glacier
too. And so, I mean, it's amazing. The cost of data with cloud storage is dramatic. And if you can make it hot and activated,
that's the real promise of a data lake. And it's funny, we use our own service to run our SaaS,
right? We log our own data, we monitor, we alert, have dashboards. And I can't tell you how cheap
our service is to ourselves, right? Because it's so cost-effective for long tail, not just, oh, a few
weeks. We store a whole year's worth of our operational data, so we can go back in time
to debug something or figure something out. And a lot of that savings, actually huge savings,
is cloud storage with a distributed elastic compute fabric that is serverless. These are
things that seem so obvious now, but if you have SSDs and
you're moving things around, a team of IT professionals trying to manage it, it's not cheap.
Oh yeah, that's the story. It's like, step one, start paying for using things in cloud. Okay,
great. When do I stop paying? That's the neat part. You don't. And it continues to grow and
build. And again, this is the thing I learned running a business that focuses on this.
The people working on this
in almost every case
are more expensive
than the infrastructure
they're working on.
And that's fine.
I'd rather pay people
than technologies.
And it does help reaffirm
on some level
that people don't like this reminder,
but you have to generate
more value than you cost.
So when you're sitting there
spending all your time trying to save money with, oh, I've listened to ChaosSearch talk about what they do a few times; I can probably build my own and roll it at home. I've seen the kind of work that you folks have put into this. Again, you have something like 100 employees now. It is not just you building this. My belief has always been that if you can buy something that gets you 90 to 95%
of where you are, great, buy it and then yell at whoever's selling it to you for the rest of it.
And that'll get you a lot further than we're going to do this ourselves from first principles,
which is great for a weekend project for just something that you have a passion for.
But in production, mistakes show. I've always been a big proponent of buying wherever you can.
It's cheaper, which sounds weird, but it's true. I mean, we do the same thing. We have single sign-on support. We didn't build that ourselves. We use a service. Auth0 is one of our providers now. Oh, you didn't roll your own authentication layer. Why ever not? Next, you're going to tell me that you didn't roll
your own payment gateway when you wound up charging people on your website to sign up.
You got it. And so, I mean, do what you do well. Focus on what you do well. If you're repeating what everyone seems to do
over and over again, time, cost, complexity, and service, it makes sense. I'm not trying to build
storage. I'm using storage. I'm using a great, wonderful service, cloud object storage. Use
what works, what works well, and do what you do well. And what we do well is make cloud storage analytical and fast.
So call us up, and we'll take away that 2 a.m. call you have when your cluster falls down, or you have a new workload and you were going to go to, I don't know, the beach house, and now the weekend's shot, right?
Spin it up, stream it in, we'll take over.
Yeah, so if you're listening to this
and you happen to be at re:Invent, which is sort of an open question: why would you be at re:Invent while listening to a podcast?
And then I remember how long the shuttle lines
are likely to be, and yeah.
So if you're at re:Invent, make it on down to the show floor, visit the ChaosSearch booth, tell them I sent you,
watch for the wince, that's always worth doing.
Thomas, if people have better decision-making capability
than the two of us do,
where can they find you
if they're not in Las Vegas this week?
So you find us online, chaossearch.io.
We have so much material, videos, use cases, testimonials.
You can reach out to us, get a free trial.
We have a self-service experience where you connect to your S3 bucket, and you're up and running within five minutes. So definitely chaossearch.io. Reach out if you want a
handheld, white-glove experience POV. If you have those types of needs, we can do that with you as well. But we're both at re:Invent, and I don't know the booth number, but I'm sure either we've assigned it or we'll find out. Don't worry. This year, it is a low enough attendance rate that I'm projecting that you will not be as hard to find as in recent years.
For example, there's only one expo hall this year.
What a concept.
If only it hadn't taken a deadly pandemic to get us here.
Yeah.
But, you know, we'll have the ability to demonstrate ChaosSearch at the booth. And really, within a few minutes, you'll say, wow, how come I've never heard of doing it this way? Because it just makes so much sense why you do it this way, versus the merry-go-round of data movement and transformation and schema management, let alone all the sharding that I know is a nightmare more often than not.
And we'll, of course, put links to that in the show notes. Thomas, thank you so much for taking
the time to speak with me today. As always, it's appreciated.
Corey, thank you.
Let's do this again.
We absolutely will.
Thomas Hazel, CTO and founder of ChaosSearch.
I'm cloud economist Corey Quinn, and this is Screaming in the Cloud.
If you've enjoyed this podcast episode, please leave a five-star review on your podcast platform of choice. Whereas if you've hated this episode, please leave a five-star review on your podcast platform of choice,
along with an angry comment, because I have dared to besmirch the honor of your home-brewed
object store running on top of some trusty and reliable Raspberry Pis.
If your AWS bill keeps rising and your blood pressure is doing the same,
then you need the Duckbill Group.
We help companies fix their AWS bill by making it smaller and less horrifying.
The Duckbill Group works for you, not AWS.
We tailor recommendations to your business, and we get to the point.
Visit duckbillgroup.com to get started.
This has been a HumblePod production.
Stay humble.