Screaming in the Cloud - Hyperscaler Infrastructure for the Masses with Jessie Frazelle, Steve Tuck, and Bryan Cantrill of Oxide Computing

Episode Date: January 1, 2020

Links Referenced: Oxide Website, On The Metal Podcast ...

Transcript
Starting point is 00:00:00 Hello and welcome to Screaming in the Cloud with your host, cloud economist Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. This episode is sponsored by Influx Data. Influx is most well known for InfluxDB, which is a time series database that you use if you need a time series database.
Starting point is 00:00:38 Think Amazon Timestream, except actually available for sale and has paying customers. To check out what they're doing, both with their SaaS offering as well as their on-premise offerings that you can use yourself because they're open source, visit influxdata.com. My thanks to them for sponsoring this ridiculous podcast. Welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined this week by not one guest, but three, because you go ahead and get Jessie Frazelle, Bryan Cantrill, and Steve Tuck together in a room, and then try telling them no, they can't all speak. Together, they're Oxide Computer. Welcome to the show, folks. Hey, thanks for having us.
Starting point is 00:01:18 I'm glad to be here. Yeah, super exciting. So let's start at the beginning. You're called Oxide Computers, so obviously you focus on Rust. What does the company do? It is a bit of a tip of the hat to Rust, actually. I am falling, I mean, I did fall in love with Rust. But it's not, it's not only about Rust. So Jess, you want to?
Starting point is 00:01:36 No, at some point you've got to stop talking about work and actually do work. That is true. That is true. Yeah, so I guess our tagline is hyperscaler infrastructure for everyone else. So if you think about how Amazon, Google, Microsoft, the hyperscalers built their internal infrastructure, we are trying to do that for everyone else with like a rack-scale server based off Open Compute Project and going all the way up with the software layer to deploying VMs.
Starting point is 00:02:11 So you can basically plug in a rack and then you are able to deploy VMs within a few minutes of it booting is the goal. I'm sure you've been asked this question once or twice, but I say the quiet part out loud. Why in the world would you build a hardware computer startup in the time of cloud? Did no one hug any of you enough as children? Perhaps your parents were cousins.
Starting point is 00:02:34 What inspired you to do this in the year of our Lord 2019? Well, I mean, a lot of people still honestly run on premise. Like if you talk to companies, really like a lot of where the hype is with containers and Kubernetes and a lot of other things like that, that's super, super forward facing. And a lot of people run on premises for like good reasons. So like either strategic or the unit cost economics of them actually running in the cloud, it's too expensive. But if you're, you know, a finance company and you are high frequency trading, you need good latency or you really care about the security of your infrastructure because you saw with like the Capital One breach that, you know, a bank running in the cloud, like it's actually super horrifying if they get popped. So I think that there are really, really good reasons for running in the cloud.
Starting point is 00:03:28 And a lot of that market has been neglected for so long. I spent a lot of my fun employment talking to a lot of these companies. And honestly, they're in a great deal of pain. And we even saw it during the raise that people don't seem to believe that they exist, but they do. And so we're here to kind of disrupt that and build them something that is actually nice. And Corey, this is the life that Steve and I lived. We worked at a cloud computing company, actually. And so we're very pro-cloud, just to be clear.
Starting point is 00:04:00 We love the cloud. But we also know that there are economic reasons in particular that you may not want to actually rent your compute. You actually may want to buy a computer or two, as it turns out,
Starting point is 00:04:13 if your cloud bill is high enough. Well, looking at the cloud bills of large companies, there's a distinct break from what they talk about on stage at reInvent where, oh, we're going to talk about these machine learning algorithms
Starting point is 00:04:24 and all these container things and serverless is the future. And you look at the bill and it's all primarily just a giant pile of EC2 instances. The stuff that they talk about on stage is not the stuff that they're actually spending the money on.
Starting point is 00:04:41 So I'm curious, as much as we hear about cloud, is there still a sizable market running data centers? Yeah, there is. I think your question is one that we've heard quite a bit in the market of, well, isn't everything going to the cloud? And having run lots of infrastructure ourselves on-premises and then talking to a lot of the enterprise market, there is still an enormous amount of infrastructure that is being run and that they are expecting to run for years and years and years. And frustratingly for ourselves, for them in the past,
Starting point is 00:05:21 that market has been neglected. And there has been this false dichotomy that if I'm going to run on-premises, I can't have the kind of utility or elasticity or just ease for developers that I would have in the public cloud. And you're right, 80, 90% of kind of the bulk of usage for folks today is more like EC2. Let me be able to easily spin up and provision a VM via an API of various sizes with various software running on it and store some data from that
Starting point is 00:05:54 and connect to that over the network. Hey, Corey, when was the last time you bought a box, like a 2U box or whatever, and stood it up? Probably been a while. It's my guess. 2012. Yeah, right.
Starting point is 00:06:02 So we're here to tell you that the state of the art hasn't changed that much since 2012 or since 2008. Probably 2012. Yeah, right. So we're here to tell you that the state of the art hasn't changed that much since 2012 or since 2008. Or 2008. Yeah, right. Exactly, 2002. And it's basically still a personal computer. And if you are in the unfortunate position
Starting point is 00:06:15 where you are actually cloud aware, where you know what the cloud is, and then you have to, for one of these reasons that Jess mentions, economic reasons or strategic reasons or so on, actually stand up your own cloud because you actually want to own your own computers. It's very dismaying to discover that the state of the art has advanced so little. And then really heightening the level of pain, Google and Facebook and others make it very clear the infrastructure
Starting point is 00:06:39 that they're on, and it's not the infrastructure that you just bought. In fact, it's much, much, much better. And there's no way you can buy it. I mean, this is the kind of the fundamental issue. If you want to buy the Facebook's open compute server, which is, you could like Tioga Pass or Bryce Canyon. These are, it's a nice box. Or if you want to buy Google's warehouse size, the data center size computer, they're not for sale. And if you're a customer who's buying your own machines, this is really frustrating. Yeah. For six or seven years, there's been innovation going into these hyperscalers infrastructure to deliver out the services we all consume, Amazon, Google, Microsoft, Facebook, et cetera. And none of
Starting point is 00:07:18 that innovation has made its way into the larger enterprise market, which is frustrating. And then you're told they don't exist and they go into a rage. Because it's all going to the public cloud. So what does it matter? Yeah, if I go and buy a whole bunch of servers to put in a cage somewhere at a data center, you're right, my answer hasn't really changed in years. I would call up Dell or HP or Supermicro.
Starting point is 00:07:39 I would be angry and upset by regardless of which direction I went in. And no one's going to be super happy. Nothing's going to show up on time. And what I get is going to be mostly the same, except for those one or two edge case boxes that don't work the same way that slowly caused me to rip the remains of my hair out. And eventually, I presumably have a data center up and running.
Starting point is 00:07:57 Cloud wasn't, for me at least, this thing that was amazing and solved all of the problems about instant on and on-demand compute, but it got me out of the data center where I didn't have to deal with things that did nothing other than piss me off. So I guess my first question for you then becomes, what makes you different than the rest of those, shall we say, legacy vendors? And then I want to talk a little bit about how much of what the hyperscalers have done is applicable if you're not building out rows and rows and rows and rows of racks. I think like honestly what makes us different is the fact that like we have experience with cloud compute and we also have experience in like making tools that developers love. And then we also have the experience with like Brian and Steve of running on-prem and all that pain.
Starting point is 00:08:45 So like what you get, like you were saying, when you buy like a few servers is you get a kit car. And so like what we are giving people is like not a kit car. It's like the Death Star all built together. Is that a market good or true? We update that slide. I don't know if Death Star feels a bit like, what's Alderaan in that? I should have used that in the raise, honestly. Sorry.
Starting point is 00:09:11 That's actually, that was not the point I was trying to make, actually. It should be said, Corey, that I'm not sure that Jess believed how bad the existing vendors were. I think she heard Steve and me complain, but then she acted too surprised when she talked to customers. Oh yeah, no, this is very true because, you know, I mean,
Starting point is 00:09:30 they did complain. And then I was like, okay, I'm going to go like talk to some people. And so I spent a lot of time actually like tracking down a bunch of people and getting on the phone with them. And honestly, like, then I was like, whoa, like Dell is really bad. Like, I didn't realize that like everyone has the same exact problem. And not only that, like you I was like, whoa, like Dell is really bad. Like I didn't realize that like everyone has the same exact problem. And not only that, like you get gaslit by these vendors into thinking like this problem is only yours. You bought the wrong rails. No one has ever done that before. You might be simple.
Starting point is 00:09:55 We have never seen this problem before. You are the only ones to report this problem. I cannot tell you how deeply to the marrow frustrating it is. And I mean, honestly, this is, Corey, you made this point too, about like AWS and the cloud taking this pain away. That is one thing that AWS really does not do. They do not try to tell a customer
Starting point is 00:10:13 that they're the only one seeing a problem. And largely because they can't get away with that, right? I mean, they can't get away with saying, no, Frankfurt is only down for you. Yeah, Twitter would disagree. Twitter would disagree. Exactly, but you don't have a way of going to Twitter and understanding that, hey, is anyone else seeing, you know,
Starting point is 00:10:29 a level of dim failure that seems to be too high, to take an example. Or, you know, we were seeing a problem, God, Steve, when was that? That was in 2011, when we were seeing a pandemic of reliability issues. We were being told we were the only ones that are seeing this problem. And it's a problem that we couldn't possibly create. So it made no sense. And I asked a room full of,
Starting point is 00:10:53 I don't know how many people were in there, 300 people, could you raise your hand if you've ever seen a parity error on your rate controller? And all of a sudden, what, maybe 20 or 30 angry hands shot up. I mean, it wasn't very many people, but it was, we were not the only ones. There's all these people discovering they
Starting point is 00:11:09 weren't the only ones. They were not the only ones. And everyone's kind of looking around and they're seeing like, wait a minute, like my hand isn't the only one that's up. And all of a sudden you have like an instant therapy group. But that shows you the level of frustration. That level of frustration has only grown in the last decade. And a big, a big part of the frustration stems from the disconnected hardware and software. And to your example, Corey, you go buy a bunch of hardware, and if you can get it all uniform and get it stood up and racked and test your drives, and now you have a collection of hardware, but yet now you have to go add software.
Starting point is 00:11:41 But Steve, we could just run Kubernetes on bare metal. Stop it. Don't even. Do not. That was not like third rail that you're supposed to touch, by the way. I love that third rail. That's one of my favorite third rails to touch. But tightly integrating hardware and software together is going to be one key differentiation point for us, Corey, and owning both the hardware and the software side so the customer doesn't have to. So they can effectively roll a rack in, plug it in, and expose the APIs to the developers that they love and operational tools that they love.
Starting point is 00:12:11 And I mean, this is the kind of stuff you would have expected to come out of hardware 20 years ago, 10 years ago. And so we're finally going to do it. I will say that what you're saying resonates as far as people having anger about what's going on in this, I guess, world of data centers. I gave a talk at scale at the beginning of 2019 called The Cloud is a Scam.
Starting point is 00:12:33 And apparently someone ripped it off and put it up on YouTube a few weeks ago. And I just glanced at it. It has 43,000 views and a bunch of terrible comments. And someone's monetizing it who is not me. Great. Awesome. But at least the message is getting out there. But the entire talk is in five minutes
Starting point is 00:12:47 is just a litany of failures I had while setting up a data center series in one year at an old job. Every single one of those stories happened and each one resonates with people. The picture of the four different sizes of rack nuts that look exactly the same that are never actually compatible with one another.
Starting point is 00:13:07 Schrodinger's G-Bick, where you send the wrong one every time and there's no way out of it. And just going down this litany of terrible experiences that everyone can relate to. Was that G-Bick a Sun G-Bick, by the way? No, I think it was a Cisco G-Bick at that point, just because we all have our own personal scars. Serious G-BIC problem at Sun, so I'm glad that wasn't a Sun GBIC. Yeah, and the fact that it resonates as strongly as it does is fascinating.
Starting point is 00:13:31 Yeah, so in terms of what a kit car the Extend data center is, I mean, it's not good. And the problem is that you're left buying from all these different vendors. As Steve says, you've got this hardware-software divide. And what we think people are begging for is an iPhone in their DC. They're begging for that fully integrated, vertically integrated experience where if there is a failure, I know the vendor who is going to actually go proactively deal with it.
Starting point is 00:13:57 Just as you've got that at Google and Facebook and so on. I mean, a group that I think, Jess, I'm not sure if you coined it or not, but I love the term infrastructure privilege, which those hyperscalers definitely have. And we want to give that infrastructure privilege to everybody. Well, that does lead to the next part of the question, which is, okay, if I want to build a data center like Google, for example, well, step one, I'm going to have an interview series that is as condescending and insulting to people as humanly possible. Once I wind up getting the very brightest boys, yes, they're generally boys, who can help build this thing out, super. That's just awesome.
Starting point is 00:14:30 But now most of what it feels like they do requires step one, build or buy an enormous building. Step two, cooling power, et cetera, et cetera. And something like step 35 is, and then worry about getting individual optimizations for racks. Then going down the tier even further, talking about specific individual servers. At what scale do you need to be operating at before some of those benefits become available to, shall we say, other companies who have, you know, old school things like business models? Yeah, I mean, like, honestly, in the people that I was talking to, and I won't
Starting point is 00:15:06 name names like there were numerous who contemplated doing that. But the problem is, of course, like obviously that's not their business, like their business is not being a hyperscaler. So you very rarely see companies doing it unless there is this huge gain for them economically. So a lot of like the value add that we're providing is giving them access to having all the resources almost of a hyperscaler without having to hire this massive team and have like an entire like build out of a data center for it. You can still host in a colo. Yeah. I mean, I think the idea is that you start with a rack, right? That you can do with a single rack. You can, especially if that rack represents both hardware and software integrated together, you can get a lot of those advantages. Yes, there are advantages, absolutely, when you're architecting at an entire DC level or an entire region level. But we think that those advantages actually can start at a much lower level. Now, we don't think it starts much lower than a rack. If you just want a 1U or 2U server,
Starting point is 00:16:07 no, that's going to be at too small a scale. But the smallest scale that we will go is to a rack. It is a rack scale design. And to your question on what is the applicability, how can people use this? Do they have to reimagine their data centers? Most modern data centers today have sufficient power capacity to support these rack scale designs. And especially if you're a company that is using some of these REITs like Equinix or Iron Mountain or others,
Starting point is 00:16:40 those service providers are beginning to, and have really come a long way in modernizing their facilities because they need to reduce, they need to help customers who want to reduce PUE. They need to help customers who are out of space and need to get much better density and utilization out of their space.
Starting point is 00:16:57 So you can take advantage of benefits on one rack and these racks can fit in many modern data centers. Is this the minimum size, or is this one of those things where, oh, cool, I will take one computer, please, and it's going to be just like my old desktop, except now it sounds like a jet engine taking off every time someone annoys me. I think a rack is going to be the minimum buy. Okay. So it needs to be a much larger desk is what I'm hearing. Reinforce the desk. Yeah, exactly.
Starting point is 00:17:24 Reinforce the desk. Yeah, exactly. Reinforce the desk. And it's going to be, you know, it'll fit on a 24-inch floor tile, and it will fit in an extant DC within limits. But a rack is going to be the minimum design point. I frequently said that multi-cloud is a stupid best practice, and I stand by that. However, if your customers are in multiple clouds and you're a platform, you probably want to be where your customers are unless you enjoy turning down money. An example of that is Influxdata. Influxdata are the manufacturers of InfluxDB, a time-series database that you'll use if
Starting point is 00:18:02 you need a time-series database. Check them out at influxdb.com. As you've been talking to, I guess, prospective customers, as you went through the joy of starting up the company, did they all tend to fit into a particular profile? Were they across the board? For example, it feels like a two-person company that is trying to launch the
Starting point is 00:18:26 next coming of Twitter for Pets is probably not going to necessarily be your target market. Who is? So I really tried to get like a diversity of people who I was talking to. So it's a lot of different industries and it's a lot of different sizes. Like there's super large enterprises there with numerous teams internally that will probably only interact with like maybe lot of different sizes. Like there's super large enterprises there with numerous teams internally that will probably only interact with like maybe one of those teams. And then there's also like kind of smaller scale, like not to the scale of like only two people in a company,
Starting point is 00:18:58 but like maybe like a hundred person company that is interested in as well. So it's really a variety of different people. And I did that like almost on purpose because you can easily get trapped into making something very specific for a very specific industry. And in terms of like Twitter for pets, they should start in the cloud. And I would encourage anyone, any two person software startup, go start in the cloud. Do not start by buying your own machines. But when you get to a certain point where VCs are now looking not just for growth, but are looking for margins, say, and you're kind of approaching that S1 point, or you're
Starting point is 00:19:37 approaching the point where you know you've got product market fit, and now you need to optimize for COGS, now is the time to consider owning and operating your own machines. And that's what we want to make easy. Totally. We're not convincing anyone to move away from the cloud for bad reasons, for sure. Well, I think everyone's going to use both. I mean, very few and far between companies that we've spoken to so far are not going to have a large footprint in the public cloud for a lot of the use cases in their business that have less predictability. Our new apps they're deploying have seasonality, have different usage patterns, but for those more persistent, more predictable workloads that they
Starting point is 00:20:17 can see multi-year usage for, a lot of them are contemplating building their own infrastructure on-premises because there's huge economic wins and you get more control over the infrastructure. There's a lot of opportunity to, I guess, build baseline workloads out in data centers. But the whole beautiful part of the cloud is that I can just pay for whatever I use and it can vary all the time. Now, excuse me, I have to spend the next three years planning out my compute usage so I can buy reserved instances or savings plans now that are perfectly aligned so I don't wind up overpaying. There starts to be an idea towards maybe there is an on-demand story someday for stuff like this, but I'm not sure the current configuration of the cloud is really the economical way to get there. Yeah, it is still much more the hotel model. And if you're going to stay in the city for a week, a hotel makes sense. But as you get to a month and then three
Starting point is 00:21:17 months and then a year, it's pretty difficult to get, uh, to the right unit economics that are necessary for some of these persistent workloads. And, um, I think bandwidth is part of it. Uh, I mean, there's just between compute storage and, and bandwidth, these, you know, these when used in a persistent fashion are pretty expensive. Now, if those work with one's business that they should,'s business, I think when there's not an economic challenge or a need for control or being able to modify the underlying infrastructure, the cloud's great. a pretty big spread for that infrastructure that they're running, even with reserved instances, it's tough to put a three to five year value on it. And we've been convinced that this post-cloud SaaS workload is coming, in part because Steve and I lived it in our previous lives. So we've been convinced it's coming economically. I think it's a little bit surprising the degree to which it's arriving. We've increasingly have been surprised by conversations that we've had with folks
Starting point is 00:22:29 who are like, all right, well, this company is going to be, I mean, they're going to die in the cloud. They would never contemplate going to the cloud. And then we learned that like, oh no, actually we've got a high level mandate to get off the cloud by 2021 or what have you. Or move X percent off. Or move X percent off or what have you. And it's like, and that's surprising.
Starting point is 00:22:44 I think it shows that they've got high level mandates to either, or to move to that hybrid model that Steve described. So, especially when you look at things like bandwidth. And bandwidth, by the way, was on my Steve Talk bingo card for this little panel. Yeah, there's no way you're going to let Corey get through an entire episode
Starting point is 00:23:01 without mentioning bandwidth costs. Because bandwidth costs are outrageous on AWS. The problem isn't even that they're expensive because you can at least make an argument in favor of being expensive. The problem I have is that they are inconfreakable as far as being able to understand what it's going to cost in advance. The way you find out is the somewhat titillatingly called suck it and see method. Namely, how do you figure out if a power cable in a data center is live? Well, it's dangerous to touch it. So you hand it to an intern to suck on the end of the cable. And if they don't get blown backwards through the rack,
Starting point is 00:23:34 it's probably fine. That's what you're doing except with the bill. If your bill doesn't explode and take your company out of business, okay, maybe that data transfer pattern is okay this month. Well, and that's it. And I think that data transfer pattern is okay this month. Well, and that's it. And I think that, you know, I've always said that,
Starting point is 00:23:47 you know, who loves utility billing models, utilities, people that love it. It's like, and you know, I, I,
Starting point is 00:23:53 I, I, I, I, I don't know if you've got teenagers, but she's two at the moment. So it's almost a three nature though. So we'll see.
Starting point is 00:23:59 But okay. Even like, you know, four or five. Oh, bandwidth overages do not start at too young an age at this point. Yeah. Five years old,
Starting point is 00:24:07 five years old. And like, you're scared as hell that they're going to be on the cellular network when they're watching YouTube kids. Exactly. And, and being on the cloud is like having thousands of teenagers that are on YouTube videos all the time. And you're praying that they're all on the wifi in, you know, they all claim they are, but you know, of course, half of them are accidentally on the cellular. So as you were talking to a variety of VC types, presumably, as you were getting the funds together to launch and come out of stealth mode, what questions did you wind up seeing coming up again and again? And they can be good questions. They can be bad questions. They can be hilarious questions. I leave it to you. What a spectrum of questions. I mean, I actually think what,
Starting point is 00:24:45 what was most fascinating is that nobody really asked the same questions. A lot of people like kind of had differing opinions, almost so much so that if you were to combine all the VCs we talked to, they would all like not align. Right. You, you get like one kind of like super Voltron VC that understands everything. And then one just total idiot VC that understands everything. And then
Starting point is 00:25:05 one just total idiot VC that understands nothing. And everything in the middle. And everything in the middle. The thing that was also interesting, and I don't know if you two felt the same thing, but I felt it was the thing that people would often have the most angst about or the most questions for us about were the things they understood the least. Yes, yes, yes. And then there were other VCs who they understood those things the most. And then all the other things, they were like, wait, but what about? So if you combine them together,
Starting point is 00:25:34 you would almost get like our ideal scenario where they understand everything or you would get the worst scenario where they understand absolutely nothing. And ask the most questions. Or the best questions that it's not entirely clear which side of that very large spectrum they're on well what about magnets and you're sitting there trying to figure out are they are they actually having a legitimate concern about some weird magnetic
Starting point is 00:25:55 issue that you hadn't considered or do they have no actual idea how computers work or is it something smack dab in the freaking middle or is it a ploy to try and throw you off we've got to talk about some of the dumbest questions we got okay okay okay okay okay can we do can we do can we do yes okay yes so there was one question where um we were trying to raise enough money to hire enough people to go build this basically and uh a lot of like kind of the questions that would come up would be like so like what can you do with like a lot less money, a seed thing, just a seed thing, like, like something like a million dollars, like maybe just one rack. Can you build, get some proof points and one rack. Right. Well, so there's one, one VC in particular, uh, yeah, I had said, you know, let's, can you shrink the scope of the problem a bit so that
Starting point is 00:26:43 you can, uh, take less investment, prove it out and then scale the problem a bit so that you can take less investment, prove it out, and then scale the business? So what if you were to say, shrink this down to like three racks? Three racks. If you just did three racks, would that then prove out things? Instead of going big and building a lot of racks,
Starting point is 00:27:00 what if you just built? And Corey, to clarify, most of what we're building is software. So that first rack is actually the expensive one. The second rack is a lot cheaper. Eventually you cross the chasm and you have a rack and then you can go sell that rack to like everyone. Exactly. And then you go pitch to SoftBank on the other hand, like, okay, we like your prospectus, but what can you do with $4 billion? And the only acceptable answer that gets the money is something monstrous.
Starting point is 00:27:25 You know what was funny is when we were first starting to raise, we were like, what would we do if SoftBank came to us? And then we realized very shortly after starting our raise that like SoftBank is not going to be coming to us or anybody else. SoftBank is going- They're hiding from an investor who has a bone saw.
Starting point is 00:27:41 And actually, oh my God, that got dark. That got really dark. That got really darned. That was really darned. That was. This is why they don't invite me back more than once for events like this. But no, it's easy to look from the outside and make fun of spectacular stumble failures and whatnot. But it's also interesting to see how there are serious concerns like the one you're addressing
Starting point is 00:28:03 that aren't, shall we say, the exciting things that have the potential to revolutionize society. It doesn't necessarily, and correct me if I'm wrong, sound like that's what you're aiming at. It sounds more like you're trying to drag, I need to build some server racks, and I want to be able to do it in a way that doesn't make me actively hate my own life. So maybe we can drag them kicking and screaming at least into the early 2000s, but anything out of the 80s works. Yeah. So actually I think we actually are changing and do view ourselves as, as changing things a bit more broadly in that we are bringing innovations that are really important, clear innovations that the hyperscalers have developed for themselves
Starting point is 00:28:39 and have actually very charitably made available for others. So the Open Compute Project that was originally initiated by Facebook was really an attempt to bring these innovations to a much broader demographic. And it didn't exactly, or hasn't yet, I should say, hit that broader demographic. It's really been confined to the hyperscalers. And these advantages are really important. I mean, it was really interesting. We were talking to Amir Michael, who we had on our podcast and had just a fascinating conversation with him about Amir was the engineer at Facebook who led the Open Compute Project. you get from designing these larger scale machines, rack scale designs and driving the PUE down. And that was a real motivator for him
Starting point is 00:29:28 and a really deep earnest motivator and an earnest motivator for Facebook and the OCP was allowing other people to appreciate that. So especially, I don't think it's too ridiculous to say that as we have a changing planet and we're much more mindful about the way we consume energy. Actually delivering these more efficient designs to people,
Starting point is 00:29:49 it's not just about giving them a better technology, but a more efficient one as well. Yeah, the power savings are actually huge. It's really... Yeah, and Corey, I know you had someone on your podcast talking about kind of comparing cloud providers and who are focusing on sustainability, both in terms of utilization and also offsetting. And, you know, you've got large enterprise data centers that are even further behind some of the hyperscalers that aren't maybe scoring as highly against a Google or another. So this gives them the ability not only to increase density and get
Starting point is 00:30:26 a smaller footprint in either their colo or in their data center, but also get a lot of these power efficiency savings. And we don't want it easier for people to go build racks. We actually want to make it easy for folks to just snap new racks in and have usable capacity that is easy to manage. And then also by crossing that hardware-software divide, give people insight into their utilization and allow people to make the right level of purchase. Even if that means buying less stuff from us next year, because we know that that's going to be a lifelong relationship for us. What is the story, if you can tell me this now, please feel free to tell me you can't, around hardware refreshes? One of the challenges, of course, is not only does technology get faster, better, cheaper, et cetera, but it also gets more energy efficient, which from a climate and
Starting point is 00:31:20 sustainability perspective is incredibly important. What is the narrative here? I mean, frankly, one of the reasons I like even renting things rather than buying them in the forms of a cell phone purchase plan is because I don't have to worry about getting rid of the old explodey thing. What is the story with Oxide around upgrading and sustainability?
Starting point is 00:31:39 Well, first, I think is you gotta kind of take it from a couple angles. From the when should one replace infrastructure? This is something that hardware manufacturers do a very poor job of providing information for one to make that decision. So what is the health of my current infrastructure? What is the capacity of my current infrastructure? Um, what is the capacity of my current infrastructure? Um, you know, basics that should just come out of the box, uh, that help you make those decisions. Should I, um, should I, should I buy new equipment or, uh, three years, four years, five years? Um, what are my warranty schedules look like? I mean, being a former operator of thousands and thousands of machines,
Starting point is 00:32:22 one of the things that was most frustrating was, frustrating was that the vendors I bought from seemed to treat those 8,000 machines as 8,000 separate instances, 8,000 separate pieces of hardware that I was doing something with and no cohesive view into that. So number one, make it easier for customers to look at the data and make those decisions. The other element of it is that your utilization is more environmentally impacting in many cases than the length of time or the efficiency of the box itself. So how can I be smarter about how I am putting that infrastructure to work? If I'm only using 30% of my capacity, but it's all plugged in drawing power all the time, that's extraordinarily wasteful. And so I think there's a whole lot that can be gained in terms of efficiency of infrastructure one has. And then yes, there's also the question of, when is the right time to buy hardware
Starting point is 00:33:28 based on improvements in the underlying components? Yeah, and I think the other thing that's happening, of course, is that Moore's law is slowing down. And that what is the actual lifetime of a CPU, for example? How long can a CPU run? We know this the bad way from having machines that were in production long after they should have been taken out of production for various reasons. And we know that there's no uptick in mortality even after CPUs have been
Starting point is 00:33:56 in production for a decade. So at the end of Moore's law, should we be ripping and replacing a CPU every three years? Probably not. We can expect our DRAM density to level. We can expect our, obviously our CPU clock frequencies have already leveled, but our transistor densities in the CPUs are going to level. And then so how do you then have a surround that's designed to make that thing run longer? And then when it does need to be replaced,
Starting point is 00:34:20 you want to be sure you're replacing the right thing. And you want to make sure you've got a modular architecture and OCP has got a terrific rack design that allows for individual sleds to be replaced. And certainly we're going to optimize that as well. But we're also taking the opportunity to kind of rethink the problem. So when people look at what you're building and come at it from a relatively naive or shall we say cloud first perspective, which let's face it, I'm old enough to admit it those are the same thing what is cloud first like a new nationalist movement though that makes me feel kind of uncomfortable honestly it really does feel somewhat nationalist and then we talk
Starting point is 00:34:54 about cloud native and oh that goes oh god i'm not apologizing for that but you are a cloud nativist is what i understand exactly Exactly. What do, what are people missing when they're coming from that perspective, looking at what you're building? You know, I don't think we're going to try to talk people out of that perspective. That's fine. We're actually going to go to the folks that are already in pain of which Jess knows many. Yes. No, definitely a lot of people in pain. And also like we do understand the cloud and like the usability that comes from kind of the interfaces there.
Starting point is 00:35:29 I also think that we can innovate on them. But yeah, I think that we're not opposed. And just like, you know, someone who's using Lambda doesn't necessarily need to be educated about, you know, well, actually it's not serverless and there's actually, there's something you're running in. Or someone who's running containers doesn't necessarily have to be educated
Starting point is 00:35:46 about actually a hardware virtual machine and what that means. And someone who's running a hardware virtual machine doesn't need to necessarily be educated about what an actual bare metal machine looks like. I mean, we don't feel kind of an obligation to force this upon people for whom this is not a good fit. If you want to learn more about the exciting things that Oxide Computer is up to, now that you're post-stealth, where can they learn more about you? Head on over to our website, oxide.computer.
Starting point is 00:36:14 And also we have our own podcast called On the Metal. It's Tales from the Hardware Software Interface. So you can subscribe to that as well. Can I just say, in Corey, we obviously, we love your podcast. Our podcast is awesome. It is so good. Don't start a competition right now. It's like we can both be aspire to, no, we can be, we're different. We're not, we're not in one is screaming at the cloud. Um, the other is tales from the hardware software interface. These are very different podcasts. And to be fair, we copied Corey on his entire podcast set up.
Starting point is 00:36:48 So it's absolutely fine. It's I've, I've made terrible decisions in the hopes that other people will sabotage themselves by repeating them. Oh, well, one step too late. Um, but are the, the, I actually think, and I think we all think that we, you know, we made the podcast that we all kind of dreamed of listening to, which was, um, folks who've done interesting work at the hardware software interface describing some of their adventures. And it's amazing. And it's a,
Starting point is 00:37:14 it's a good reminder that no matter how cloudy things are, the hardware software interface still very much exists. Yeah. Software still has to run on hardware. Software still has to run on hardware. Computers will always be there. Servers will always exist. And Corey, you surely must have done an emergency episode when HPE announced their cloudless initiative. I assume that you had a spot episode on that. I think, didn't they retract that, the cloudless business?
Starting point is 00:37:39 Oh, they shut that down so quickly that it wasn't even pressed for more than a day, which proves that cyber bullying works. Well done. It's abhorrent when you do it to an individual, but when you do it to a multi-billion dollar company like HP, it's justified. And frankly,
Starting point is 00:37:56 everyone can feel good about the outcome. Hashtag cloudless is now gone. Yeah. And it did not last long. It feels like it feels like a Microsoft. Hey, may have even lasted a little bit longer. It was within a year.
Starting point is 00:38:07 Oh God, it was within, I mean, and because it's stupid. It's stupid to talk about things that are cloudless and things that, and even serverless, like we have to be very careful about what that means because we are still running on physical hardware. And that's what that podcast is all about. Well, even you're defining things
Starting point is 00:38:22 by what they're not. And Oxide is definitionally built on something that is no longer the absence of atmosphere. So you're defining things by what they're not. And oxide is definitionally built on something that is no longer the absence of atmosphere. So you're now about presence rather than absence. Good work. Wow. Okay. That is, that is the first. That was meta. That was, you know, we've got a lot of good reasons for naming the company oxide. Oxides are making very strong bonds. They're very, silicon is normally found in its oxide, but that's very meta. I've not thought of that one. Oh yeah, wait till people start mishearing it as ox hide,
Starting point is 00:38:49 something you skin off a buffalo. We'll have to cross that verge when we come to it. But thanks for planting the seeds. Yeah, why are we selling so poorly in the great plains states? Sustainability. Thank you all so much for taking the time to speak with me today. I appreciate it.
Starting point is 00:39:08 Corey, thanks for having us. Thanks, Corey. This was great. Likewise. The team at Oxide Computer, I'm Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this episode, please leave it a five-star review on iTunes.
Starting point is 00:39:20 If you've hated it, please leave it a five-star review on iTunes. this has been a humble pod production stay humble
