Grey Beards on Systems - 120: GreyBeards talk CEPH storage with Phil Straw, Co-Founder & CEO, SoftIron

Episode Date: June 15, 2021

GreyBeards talk universal Ceph storage solutions with Phil Straw (@SoftIronCEO), CEO of SoftIron. Phil's been around IT and electronics technology for a long time and has gone from scuba diving electronics, to DARPA/DOD researcher, to networking, and is now doing storage. He's also the company's former CTO and a co-founder. SoftIron makes hardware storage appliances built for Ceph.

Transcript
Starting point is 00:00:00 Hey everybody, Ray Lucchesi here with Matt Leib. Welcome to the next episode of the Greybeards on Storage podcast, a show where we get Greybeards storage bloggers to talk with system vendors and other experts to discuss upcoming products, technologies, and trends. With us today is Phil Straw, CEO of SoftIron. So Phil, why don't you tell us a little bit about yourself and what SoftIron is all about? So my name is Phil Straw. I was originally the co-founding CTO of SoftIron and a few years ago moved across to be in the CEO hot seat.
Starting point is 00:00:47 I co-founded the company with a guy called Norman Frazier, who's a very successful entrepreneur based out of the UK, currently our chairman. He has two IPOs of unicorn companies behind him, and he really wants SoftIron to be the third, so I hope he's correct. Myself, I've really had a varied career, basically as a technologist, working in sort of disruptive areas of all sorts, automotive networking, security, various different things, including scuba diving electronics and special forces equipment, which was a very interesting adventure. Then I did research for DARPA and MIT and JPL and others.
Starting point is 00:01:28 I've been in Silicon Valley about 20 years, originally kind of hailed out of the UK out of technology there, and then moved over here and been here ever since. As far as SoftIron is concerned, that chapter started nearly 10 years ago now. So we're coming up to our 10-year birthday party. We founded the company in 2012, and I guess the thing we're best known for is creating a portfolio of Ceph-targeted appliances, so storage boxes that make Ceph shine as best it possibly can. And, you know, Ceph being a Swiss Army knife of storage,
Starting point is 00:02:08 we find ourselves in all sorts of storage adventures with customers doing a whole variety of things. But since we have gotten known for producing Ceph appliances, SoftIron has been growing pretty quickly, even through the chapter of COVID. SoftIron, I'd say, was kind of born for COVID because we're such a distributed company. Spread out, you know, Asia, Europe, US, it was like nothing really happened for us because
Starting point is 00:02:40 that's the way we worked anyhow. So that's kind of an interesting aside to the inside track on SoftIron. I was going to say, Ceph is an interesting product. And my first question would be, are we doing the officially licensed Red Hat version or is this the open source? Yeah, we do the open source version. In fact, we do quite a lot of things to surround Ceph. We are very religious about keeping Ceph open, not forking it, not doing
Starting point is 00:03:14 anything devious like that. Really, we live in the modern world, and I really believe in particular that you have to serve the customer first, and doing anything that changes the open source nature of Ceph is bad. We're also part of the Ceph Foundation and try and contribute where we can, you know, in the best interest to advocate for Ceph and its movement forward. I totally agree with that decision. The only issue my customers have seen is support. And I imagine the support for Ceph actually comes from SoftIron in this case? Oh, yeah, absolutely. It's sort of one of our, you know, I say, and I may get shot in the head here for being a terrible CEO, but I say that we should try and lose money on support,
Starting point is 00:04:00 which may sound very contrarian. What? Yes. You know, I think there's a movement in the industry that, you know, often people look at support as a cost center, and of course it is. But if you're going to meet customers in the true enterprise sense, if you take an open source thing, which, I personally am a complete geek about Ceph, I love it, I love what it stands for, I think it's very misunderstood often, and we could talk about that maybe, but it is an amazing technology. And as an open source thing that's not been around forever, often in open source, when it meets enterprise and you try and turn a corner with open source, you need a bunch of support that looks enterprise class.
Starting point is 00:04:48 And the only way you can really do that is meet customers on their own terms, on their own turf, and only leave when they have a smile on their face. And we've religiously done that at SoftIron to the point that that's becoming our reputation, and that's great. But we've gone to the ends of the earth for certain customers as far as integration and, like, standing by them. In fact, you know,
Starting point is 00:05:11 we've had so much success doing that, we've had requests to create a product just to support Ceph clusters that we didn't necessarily create and ship to customers. Oh, that's important. Impressive. You know, it's kind of a reflection on what's been a limiting factor for a number of open source solutions. I mean, from a support perspective, I think there's, you know, the many-eyes kind of saying, and, you know, lots of players in that space can lend a certain credence to levels of support and functionality and stuff like that. But the usability concern has always, always been an issue for these types of solutions. I'm not sure.
Starting point is 00:06:09 I never use Ceph, so I can't tell you whether usability is a problem, but I'm guessing that it is. Well, it's funny you mention those as a pair, because that's sort of the SoftIron analysis: in the juicy center of Ceph, you have this brilliant thing, which is you have a consistent mechanism of storage, which is constituted by objects, but, you know, you build everything on it, including block, object, and file, and many, many tuning knobs and dials. I mean, it's incredibly flexible.
Starting point is 00:06:39 It's probably, I think, the least advocated and most amazing thing about Ceph: you have this consistent mechanism which allows you to express policy, do flexible things, configure different pools, make them consumable in different ways. And it all comes from this primordial soup, which is the sort of genius, in my opinion, in the middle of Ceph. But, you know, when it comes to support, all of that flexibility comes with, you'd say, complication. And as a technologist, when I first started learning Ceph, I would have to admit that I found it fairly intimidating. It was just a lot of vegetable soup, you know, there's all these letters and names and things. So what we've done, beyond meeting the customer in a very unusual way around support cycles, is surround Ceph with things that we've seen missing in enterprise environments. And by that, I mean, not necessarily someone doing the
Starting point is 00:07:35 easy thing like putting OpenStack next to Ceph or just putting an S3 gateway in the front. That's a fairly simple use case. It's well understood, and it's sort of atomic in its use. But when you get into an enterprise environment, and this is also a very interesting thing happening around Ceph, is because you can do block, object, and file, and because now Ceph has a very stable file system in CephFS, and S3 is very popular, you see people doing virtualization everywhere, so block.
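For readers who want to see what that "consistent mechanism of storage constituted by objects" looks like from code, here is a minimal sketch using the librados Python bindings. It assumes a reachable cluster, a readable /etc/ceph/ceph.conf with credentials, and an existing pool named demo-pool (all placeholders, nothing SoftIron-specific); block (RBD), file (CephFS), and S3 (RGW) are layered over this same object store.

```python
# Minimal librados sketch: in Ceph, everything ultimately reduces to named
# objects stored in pools. Config path and pool name are placeholders.
import rados

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")   # monitors, keys, etc.
cluster.connect()
try:
    ioctx = cluster.open_ioctx("demo-pool")              # pool must already exist
    try:
        ioctx.write_full("hello-object", b"hello, ceph")  # store a whole object
        print(ioctx.read("hello-object").decode())        # read it back
        print(cluster.get_cluster_stats())                # raw usage a dashboard rolls up
    finally:
        ioctx.close()
finally:
    cluster.shutdown()
```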
Starting point is 00:08:10 People are realizing that Ceph is now so credible and so capable. And, you know, particularly on, say, SoftIron products, we focus very heavily on vectors like performance and low power and quality, to the extent that we even do all of our own manufacturing in Silicon Valley to get that quality. It can perform in places that traditionally you wouldn't see it perform or the expectation might not be there. So in workloads that, say, compete with Lustre, we beat it in some circumstances. So what we're starting to see around Ceph is
Starting point is 00:08:45 people may have four or five vendors in the data center, one that's very fast, one that's, you know, for backup, et cetera. And what we're seeing is that because we now have a wide portfolio of products, you see people doing backup on spinning media. They love the low power. They love the performance at that price and density. And then you see them doing out-and-out performance that you'd expect to be running on Lustre or some other performance file system. And it's performing exceptionally well, to the point it's really turning heads. And so that allows an enterprise to support one technology with one team that's deeply invested in that technology with an
Starting point is 00:09:26 enterprise level of support behind them. And then what we're doing on usability is surrounding the product with things that represent legacy attach. And that's a product for us called Storage Router: NFS, SMB, various other protocols, iSCSI, et cetera. And we insulate and productize that in a way that has all the enterprise features you might want, say, for example, in financial services and the like: all of the features in SMB that give you resilience or
Starting point is 00:09:54 load balancing or transparent failover. So that's a product, you know, that we invested in heavily, because we were finding customers that just weren't satisfied with the experience of, say, SMB on pure open source. So we surrounded that with Storage Router, which insulates Ceph and its juicy goodness I just talked about from turning the awkward corner and still having usability. And then another thing we've done is created a management dashboard that allows mere mortals to configure the majority of Ceph, and it's still stateless, so that you can use the command line, be a command line hero, or have a variety of the two things, but do all the
Starting point is 00:10:38 commonsensical things and get a user-level view of what is the state of my cluster or clusters and what am I doing with it, and all of that good life cycle stuff. I find this very interesting. A couple of knocks against Ceph historically have been how complex and how command line driven it has been. So putting your UI on top of that, though I haven't played with it, to be fair, is absolutely mission critical because, let's face it, most storage administrators aren't Linux administrators. So they want to be able to do the kinds of things you're talking about. They want to be able to do them relatively straightforwardly. And having built that into the user interface is key. The other knock, you know, and again, not from what I understand in your case, is that it hasn't had the performance necessary to support high data transaction,
Starting point is 00:11:58 high IOPS kinds of things, those sorts of small file performance, I guess. Particularly in read. Yeah, you're absolutely right, Ray. And that's always been a challenge for a lot of these file systems, the small file performance. I'm not sure Ceph is kind of in that space,
Starting point is 00:12:18 but it's been primarily high bandwidth types of workloads that were very useful for these sorts of file systems. But machine learning and all that sort of stuff have all kind of been focused on smaller files. So I guess the question is, how's the small file performance on your solution? Thanks for narrowing that down for me, Ray. Sorry, Matt. No, no, you absolutely nailed it. So there's a couple of things that have happened in Ceph over,
Starting point is 00:12:45 I'd say, the last three or four years. One is a real focus on CephFS in general, so outside of, at least at that time, the traditional things that Ceph did, and that's gone very, very well. A real forklift in performance of primary storage, moving away from FileStore towards BlueStore, a totally new way of dealing with just raw data on drives. And then, you know, for us, we've just spent nearly a decade getting to a technology platform that allows us to build essentially storage Lego, everything from, you know, dual 100 gig with wire speed performance on NVMe boxes
Starting point is 00:13:28 all the way to a 100 watt, you know, footprint for 120 terabytes at 25 gig wire speed. So, you know. I think I have a Chia thing I need to talk to you about. That's a different story. Go ahead. Yeah. I mean, you know, so we really,
Starting point is 00:13:50 it's not uncommon for us to put a box next to an industry standard box at the very high end of what you can buy from all the usual suspects and double the wireline performance. And it leaves a lot of people, even experienced Ceph users, scratching their heads. Like, how do you do that? And then we list the laundry list of the 99 things that we do to make that happen. And I'll tell you, I'll ground it with an example. So I'll give you an example here of Sandia National Labs.
Starting point is 00:14:13 It's public that we did this. And we took CephFS on relatively low performance boxes, relative to our portfolio, and we got performance that really shocked that lab. I mean, those guys are top shelf, right? They really know Ceph, they know storage. And we were beating Lustre in certain cases. And that was not even with the high performance boxes we just launched.
Starting point is 00:14:39 So Ceph, I think, has got to this turning point, I think, in Ceph's life, where I think enterprises, high performance compute areas, people, say, doing AI with small files, I think they're realizing that this is a technology you can really invest in. And because of that, as people flock to it, we've seen a massive growth in our business, which at some level you could put at SoftIron's feet, but I think it's more a reflection of Ceph's acceptance. And if you really want to get the best of Ceph, then you're probably going to come to SoftIron, right? That's the theory. And as that happens, there's more of a groundswell of people pushing on Ceph and requesting things and developing on that software. And it's just, you know, I think really Ceph has found its stride. And it's very interesting.
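Since the small-file question is easy to probe empirically, here is a rough micro-benchmark sketch, again with the librados Python bindings. The pool name, object count, and 4 KiB payload are arbitrary placeholders, and a single-client synchronous loop like this only hints at per-object latency; it is not a statement about any particular appliance's performance.

```python
# Rough small-object latency probe: write, then read back, N small objects.
import time
import rados

N = 1000
PAYLOAD = b"x" * 4096                      # 4 KiB "small file" stand-in

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("demo-pool")    # placeholder pool name

start = time.perf_counter()
for i in range(N):
    ioctx.write_full(f"small-{i}", PAYLOAD)
write_s = time.perf_counter() - start

start = time.perf_counter()
for i in range(N):
    ioctx.read(f"small-{i}")
read_s = time.perf_counter() - start

print(f"avg write latency: {1000 * write_s / N:.2f} ms")
print(f"avg read latency:  {1000 * read_s / N:.2f} ms")

ioctx.close()
cluster.shutdown()
```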
Starting point is 00:15:15 Yeah, yeah, yeah. So talk to me a little bit about your appliances. You mentioned the Storage Router. And you've got NVMe boxes and you've got, I guess, spinning disk boxes as well.
Starting point is 00:15:45 I mean, is this all, I mean, so the question I have is like, is it high availability types of environments that you're going into with multi-controller solutions, or is it all scale out, or a little bit of both? A little bit of both. I mean, at least in what I've experienced, which is a fair segment of SoftIron's business, high availability is a standing assumption. Load balancing is a standing assumption. The idea that you're not going to lose data, that this is not an experiment. We're buying petabytes of data typically, right? Yeah, and so what we do, we spent a long time developing a technology platform, which includes software, including our own Linux distribution to serve Ceph, our own electronic design done in-house completely, our own mechanical design around tool-less integration and caddy systems to encapsulate and hold drive combinations in different types of technology, a cartridge system essentially, and then a high-speed data distribution backplane,
Starting point is 00:16:52 which is six terabits per second in every new box. And then on top of that, we also do all of our own electronic manufacturing and sub-assembly box build, actually in Newark in Silicon Valley, which is very contrarian and it's not what people do, but we get quality from that and we get security properties, because we designed all of our own electronics and we built all of our stuff. I'm not even sure what you'd call this,
Starting point is 00:17:17 but there's a concept in, like, the DoD where they have a protected supply chain, where you control all the steps in that design, development, manufacturing. Is that what you would call this sort of thing? Yeah, we've been told by that community, the intelligence community, that we have something very unique, which is we're actually the only computer company in the sort of server world that, you know, designs and builds all of their own electronics under one roof, so to speak, one company. And by that I mean we literally don't go to a contract manufacturer, we don't even route PCBs outside the building. It's all
Starting point is 00:17:58 done by engineering staff who are permanent employees inside SoftIron. A lot of them stateside, a lot of them in Europe. We don't really do any design work out in Asia. That's been an active choice for us. And we do all of our own pick-and-place electronics. So, you know, top end of town, motherboard-class products get assembled component-wise and then macro-assembled in Newark in California. And so we've called that provenance, in that we can prove, you know, you could literally come and do the equivalent of a financial audit on all of our design, all of our source code, including even the lower-level parts of the software
Starting point is 00:18:40 that are often forgotten and very important, things like SoC firmware on boot-up, which, if you own that, you own the entire platform. UEFI, the operating system, the Ceph distribution, including even the hardware and software design for the baseboard management controller for out-of-band management, because that's obviously a massive threat.
Starting point is 00:18:58 Sure, right. Yeah. So this is huge. And I can see why you've got like national labs on your customer list. You know, we're facing a world where components come from all over the place. But if you're doing all the manufacturing and not OEMing, I don't know what percentage of your hardware is not OEMed. I imagine you're not building your own NVMe drives. Yeah, disk drives and stuff, compute chips and stuff to some extent, right?
Starting point is 00:19:32 Yeah, we're very sensitive about the source of chips. So, you know, what's in it, what the analysis of its security impact is, what it's connected to, what data paths it lives in. We obviously buy drives, we obviously buy processors, but we're super fussy about the way that happens. And for example, you can do things with drives like just encrypt everything that goes to the drive, and then the last place that you had cleartext data is the processor, right? Yeah, yeah, yeah. Are these Intel-based servers? No, they're all basically AMD-based servers. So we have, you were asking about the portfolio of products and I got distracted, sorry. That's okay, it's my fault. I think we have basically storage Lego in our technology platform. And what we try to be is super agnostic to speeds and feeds on the network side.
Starting point is 00:20:31 And then media storage, be it hard drives, NVMe, or SSDs of another kind. We try to be very flexible in our modularity so that we can plug things together to get to a power, price, performance, et cetera, that is appropriate for whatever part or whole of a cluster. And so we often mix and match all of those different things with different networking speeds, but it's all consistently one design, one platform design with different modules. And then we have a cartridge system and a networking system that insulates us from the facts of life, if you like, because we design everything and it's designed to be manufactured in-house. We have this enormous flexibility in how we can actually build boxes.
Starting point is 00:21:18 So we summarize the boxes we can produce on our website. But the dirty little secret is that that's just a summary of what's possible. We actually have very fine granularity in providing what customers really need. And so all of them are tuned to have wireline performance. So, you know, if we ship a 10 gig box, it should be capable of 10 gig at wire speed, and likewise for 25 gig, 100 gig, et cetera. Sure, absolutely.
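As a back-of-the-envelope aid for the wireline claims above, converting a nominal Ethernet line rate to byte throughput is just a divide-by-eight; real numbers land somewhat lower once framing, TCP, and Ceph messenger overhead are counted.

```python
# Nominal line rate to approximate raw byte throughput (overheads ignored).
for gbit in (10, 25, 100):
    print(f"{gbit:>3} GbE ~= {gbit / 8:.2f} GB/s before protocol overhead")
```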
Starting point is 00:21:46 So, but in addition to the hardware, which is obviously, you know, best in breed, is there a way for an existing environment to layer the SoftIron software management platform, et cetera, on top of an existing environment running Ceph? So there's a couple of ways that we tackle that. One is we now sell a product we've called HyperSafe, which is just Ceph support.
Starting point is 00:22:16 So we'll support you in situ with a Ceph cluster. So that's one way. Two, we're very religious about not doing anything to Ceph, other than surrounding it with supplements, that changes Ceph in a fundamental way. So you could take a SoftIron product and you could plug it right next to something running a different distro, different hardware, and it will interoperate. And we promise that because we know that, because we don't really change it, right, at some basic level. Wait a minute. You mean, like, your management solution, you could plug into a normal, non-SoftIron Ceph configuration and it would manage that Ceph configuration? Well, watch this space. Not currently today, but watch this space.
Starting point is 00:23:01 Fair enough. I declare that as intent, right? So one of the things I think we're trying to do in our dashboard product, the management interface, is solve problems not on an abstraction that's equivalent to the command line. It's easy to make a GUI that reflects the command line, but that's not really what customers want. We've tried to really have this adage that you have two ears and one mouth, and that ratio exists in nature for a reason. And the answer to that is customers, right? So as we've listened to customers, they want to create shares. They want to create different pools based on different sizes or technologies or speeds. They want to use one pool for backup.
Starting point is 00:23:43 They want to use another pool over here. They want to see certain things. And, you know, Ceph is this massive toolbox of things that you could do almost anything with if you're creative enough. But that's not how most users and operators want to consume things on a non-experimental basis. And so one of the things that we're going to do is look at exactly that, taking that interface beyond the SoftIron product. But we interoperate at the Ceph level. So right now, you couldn't use our management interface to manage something that was on a different distro. But we also provide the ability to install SoftIron Linux on existing hardware.
Starting point is 00:24:26 So if you're prepared to do that, then you're in better shape there. So that's, you know, in my mind, that's the definition of software-defined storage. Obviously, you've got some secret sauce mixed into your hardware, and that can't be denied. But the ability to port your management plane, your software level secret sauce into hardware that already exists, rather than having to do a complete lift and shift, I think is a really compelling migration path as well. And once you prove yourselves as that hardware does age out, then by all means, replacing the older stuff with branded soft iron products,
Starting point is 00:25:16 Yeah, we hear from customers a lot that, you know, hey, we really like this. The original genesis, and it's changing day by day, was we would sell equipment to users that really wanted the best experience on Ceph, and we would do very well and they would like it. And then they would say, but we have another five petabytes behind us running on something else. We love dealing with you. Could we please send you some money? And would you do the support? And we resisted this for a long time.
Starting point is 00:25:52 Focus on what you're good at. Make the best product you possibly can. All that mantra. And then we just got to, you know, we've come to a point where we have so much tribal knowledge inside SoftIron around Ceph and how the nuances of performance and all the other things work, that we kind of asked, are we really advocating for the customer on their own terms, on their own turf? And the answer that we got to was, well, probably not. Like, we probably should support them. And what's naturally happening is that, you know, equipment
Starting point is 00:26:22 moves on and there is a migration. So we've seen that effect of people migrating, because there's no denying it. We don't charge a premium on pricing for the product. It just performs better, right? So, you know, at some point, customers just naturally go to, you know, that experience. So we're already seeing that. And the HyperSafe product is not really that old. It's just a few months old. Right.
Starting point is 00:26:44 And that's a support solution that you offer. So getting back to the hardware, you mentioned that there's a special networking solution that your box supports. I think it was six terabits per second, something like that. Yes. So when we set out to do this company, it was obviously something very technically ambitious, which even goes beyond storage eventually. You see a little bit of ankle about that on our website with networking and various other things. And what we tried to do was create a box, you know, a technical platform that could have various different media in it. And as we experimented with that we found out that
Starting point is 00:27:25 cables are a nightmare. Customers hate cables, and when a cable moves you have a bit error rate, because, you know, differential pairs of electronics lose signal, right? And cables are made for cost, typically in massive factories in Asia, and they have variable outcomes, and you have to test for all of that. So this led us to, let's get all of the cables completely out of the box. No power cables, no data cables. Let's do it like you would build an enterprise class networking router, you know, a chassis level router. And let's put high speed backplanes in there that run, you know, an aggregate throughput of six terabits per second. And then let's put a cartridge system of drives into that backplane.
Starting point is 00:28:10 And that turned out to be actually a performance upgrade, because it's very hard on certain drives to see that you're losing data, because it's recovered so low down, that if you have a bad cable, you might lose... I mean, I've seen as high as 17% data loss. And that's, you know, one of the things that happens when you roll things up to software defined storage, and this is true of Ceph, is the cluster dynamics are very important. So a lot of people in storage, they look at boxes like a scale-up solution. You know, what is the width of the pipes? How much processor do you have? How many controllers? What's the nature of the drives?
Starting point is 00:28:51 What's the aggregate throughput? And of course, that mentality is not invalid. But when you do true scale-out software-defined storage, like, proof by absurdum, you make a hundred petabyte cluster, what ends up happening in Ceph often, and this is sort of part of the tribal knowledge of the sorts of things we've learned in our journey, is the dynamics of that cluster being balanced and having peers and no network choke points is more important, actually, than the speeds and feeds of the individual components underneath, even though that does matter a lot. And so when we had uneven data paths, that was a problem, and that led us to a bunch of things, like creating a DMA engine both, you know, at the lower end of the performance range, which is still high, where ARM64-based processors is what we use there.
Starting point is 00:29:46 And then EPYC x86-based processors at the higher end and various different processors along the way, including multiprocessor architectures that are joined together with this networking fabric to make it look like one box. But in every case, every piece of media and every piece of networking has a totally dedicated channel that has no cables and has error recovery and all the fancy things you can possibly put on a backplane. And then they're all connected to a DMA engine. So the amount of processing we do in the middle is just to satisfy the protocol and the data paths.
Starting point is 00:30:23 It's essentially a data pump for connecting networks to storage media. And it's a very different mentality, both in software and in PCB routing, PCB creation, the firmware that even boots and configures the chip: where you run certain processes, on which part of the core, in which part of the die, how you route the PCIe and SATA and various other things. All of that looks distinctly not like a computer. It really does look like it's designed to be a storage thing first. And if you look at how the industry rolls up to make money out of hardware, it's really, people make margin on the basis of volume. And in order to get volume, you have to do standard.
Starting point is 00:31:11 And so you end up with a computer that then you add hard drives to, and then you put the software on top. And I believe that the right mentality, and this is the approach we took at SoftIron for so many years, is start with the software, and then if you care about it being a storage product, you probably care about the drives. And the last thing you really care about is the computer side of life,
Starting point is 00:31:31 which is you don't want it too hot. You don't just want to throw the big iron in there. You don't want it to be under spec. You want it to be just right. It's the Goldilocks zone, right? And that's how we get wireline performance, but don't blow a power budget. So, you know, typically 100 watts is an easy target for us to do for a single node, which is often, you know, one fifth.
Starting point is 00:31:52 And it makes a repair, in the case of the probably rare hardware failure, a far more trivial and far more rapid transaction. Yeah, I mean, drive replacements would be a major driver of these sorts of things, right? Yeah, so there's a story there. So one of the things that we worked out, and I would say we reversed into a position of perspective rather than having any greater genius at the start, this is just the truth, is that when we reduce the heat, and we did that by reducing
Starting point is 00:32:29 power consumption by having this different architecture, what we found is that the reliability went up by what we roughly estimate as double, based on science and how people do MTBF calculations. But we're also seeing it in the field. We really just effectively don't see any failures. And it's just because there's less heat in the box. We're, you know, 25 degrees C cooler, and you double every chemical reaction rate for every 10 degrees C increase in temperature. So it makes complete sense. The boxes run so cool that we run the air straight over the drives to start with, because we don't need the processor cooling. We don't need the power budget to cool the processor. So that's another saving in temperature. And then that leads to reliability.
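The 10-degree doubling Phil mentions is a common rule of thumb (a simplified Arrhenius-style argument, not an exact law). As a quick illustration of the arithmetic, the aging factor for running delta-T degrees cooler is roughly two to the power of delta-T over ten; note that the roughly 2x MTBF figure quoted above is a more conservative field estimate than the rule alone would give.

```python
# Rule of thumb: chemical reaction rates (and hence many aging effects)
# roughly double for every +10 degrees C, i.e. factor = 2 ** (delta_t / 10).
def aging_factor(delta_t_c: float) -> float:
    return 2 ** (delta_t_c / 10)

for dt in (10, 25):
    print(f"running {dt} C cooler -> roughly {aging_factor(dt):.1f}x slower aging")
```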
Starting point is 00:33:11 But heaven forbid you actually do need to change a drive. One of the nightmares is when you take a standard computer and, you know, you may have hundreds of them in racks all over the place. And you have a hard drive that fails. You have to do all sorts of voodoo to flash LEDs on the right node, and then you have to rip it apart, work out which drive is the slash dev slash whatever in Linux, right?
Starting point is 00:33:32 So what we did is on our technology with our backplane, we created these caddies that allow you to hold multiple drives together. And they're all very sophisticated digital interfaces that are aware of power, aware of connectivity, aware of failure, various other things like that. And we have the individual ability to control power up, power down in place, for example. But we can also lift the caddy out. So we can open the box, which is all tool-less.
Starting point is 00:34:02 It's on rails. It's all designed for one person, so a single person can install it and uninstall it. You pull the thing out of the rack, you know, on the rails. We have a tool-less design where, just with fingers, you can lift the lid on the box. And then in there, you can press what we affectionately call the Ceph button. And what this does is it interfaces with Ceph in a way that there are LEDs that tell you, hey, there's a failed drive in this caddy. You press the Ceph button, that tells the node
Starting point is 00:34:27 and the whole Ceph cluster, look, don't freak out here. We've got maybe three copies of the data or erasure coding, so we don't have to be worried about the data going away temporarily. You lift the caddy out, it has a battery in it, you know, a small battery that shows you the LEDs on the front to tell you which drive has failed. You pull out the drive and you can either insert the caddy with a drive missing or you can put a new drive in. And then when you reinsert the caddy, because of the magic of the Ceph button,
Starting point is 00:34:54 it re-communicates with the whole Ceph cluster and the node, and then it just rebalances the data and recovers everything. And if that drive that you're replacing it with is a different size, does it just automatically incorporate the additional storage into the environment? Yeah, Ceph works that way, although what we're going to do is just send you a replacement. For the exact drive, yeah. Yeah, and the reason is it's about this thing I hinted at, which is the balance, the sizing, the IO interfaces, all being exact peers.
Starting point is 00:35:28 That makes a massive difference to Ceph. And so we try to keep that and we try to create pools with that symmetry. Yeah, yeah, yeah, understandable. You mentioned erasure coding and replication, data mirroring, that sort of stuff. You want to talk a little bit about what your – is this a standard Ceph capability or is this something you've added to that? Both. So we support, you know, all the things that Ceph does, and it's a big Swiss Army knife. So you can do various different things, erasure coding as a data protection with different geometries, and you can do triple replication or other kinds of replication as a data protection strategy. And
Starting point is 00:36:09 you can mix them to different pools doing different things, for example. And so often, you might want triple replication for certain things, higher performance things. But if you're doing backup, you may need a really bizarre geometry of the erasure code. So we support all of that. We also do optimizations for data protection performance with various different instruction sets, like optimization for ARM64, for example, is a good one. And that makes a fairly large difference. And we've also created hardware acceleration using FPGAs in the networking path to do erasure coding in hardware, which sometimes is relevant and sometimes isn't. That's probably part of a bigger long-term strategy for us to do hardware acceleration, but we've already announced that. That's well known. And you also mentioned that you're doing, I'll call it server-based encryption, or have that as an option?
Starting point is 00:37:07 Yeah, so you can do various different things for encryption. You can use encrypted drives. You can do encryption in software. And there are things that you can do with encryption so that you only put ciphertext into Ceph to begin with, like encrypting objects through an S3 gateway, for example. So yeah, there are a number of different tricks there for people that care. The DoD, the intelligence community, they care a lot about the FIPS 140-2 topic set and then what you can do beyond that. Yeah, and you and Ceph support all that capability.
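One of the patterns Phil describes, putting only ciphertext into Ceph in the first place, can be done entirely on the client. Here is a hedged sketch using the third-party cryptography package together with the librados bindings; the key handling is deliberately naive, the pool name is a placeholder, and a real deployment would use a proper KMS or the S3 gateway's server-side encryption options instead.

```python
# Client-side encryption sketch: the cluster only ever sees ciphertext,
# so the last place cleartext exists is the client's own processor and memory.
import rados
from cryptography.fernet import Fernet     # pip install cryptography

key = Fernet.generate_key()                # in practice, fetch this from a KMS
fernet = Fernet(key)

ciphertext = fernet.encrypt(b"sensitive payload")

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("demo-pool")    # placeholder pool name
ioctx.write_full("secret-object", ciphertext)

# Read back and decrypt locally; Ceph never held the key or the plaintext.
assert fernet.decrypt(ioctx.read("secret-object")) == b"sensitive payload"

ioctx.close()
cluster.shutdown()
```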
Starting point is 00:37:50 Yep. And you mentioned the storage router has NFS and SMB. I'm not sure. Is it the S3 as well? Yeah, and iSCSI. And iSCSI, okay. Just no InfiniBand.
Starting point is 00:38:03 Yeah, not today, although we're fairly agnostic. You know, if we had a customer that really wanted that, we would probably do it. Yeah, yeah, yeah, yeah. A lot of the labs in existing contexts have relied on InfiniBand for, let's be fair, decades. The world is moving towards faster and faster Ethernet. So I'm not sure how mission critical InfiniBand is. I think there are certain customers that really love InfiniBand and they have, not to be critical, but a religious affinity. You know, it's never let them down.
Starting point is 00:38:45 It's a brilliant technology and they use it and they're happy with it. But I would say, and maybe this is just our journey, we're definitely seeing people who traditionally have been InfiniBand users, if not very much enthusiasts, look at 100 gig going on 400 gig Ethernet, and there's a bunch of things that combine together to change the decision path, which is: it's simple. The switches are simple if you go that direction. I can buy from multiple vendors. Ethernet looks like it's got legs forever, at least as far as we can see. The network attach price per port is a lot less. Everyone understands Ethernet.
Starting point is 00:39:30 It's not a corner case, and it's very, very, very fast now. And it's the sort of standard path in Ceph too, as far as our product. So we've actually seen a bunch of people just say, you know, 100 gig per node or 200 gig per node is probably enough. Right, right, right. So tell us a little bit about the market that you guys are going after. I mean, Ceph seems to be more HPC kind of solution.
Starting point is 00:39:55 I mean, enterprise is definitely a focus for us. We do really well there because of the level of support and the productization. We've taken Ceph into places where traditionally it has never been seen before, or not commonly: things like national labs doing very high performance file stuff, to, you know, other environments doing sort of the extremes of backup, for example, rivaling cold storage. But we see traction all over the place, and I think this comes back to the idea that, you know, if there's a Venn diagram of storage,
Starting point is 00:40:47 you can color most of the space of that Venn diagram with Ceph now. And very credibly so if you have performance and productization, which is what we've built. And so we see hosting companies. We see cloud providers, MSPs. We see national labs. We see enterprise in all shapes and sizes doing various workloads, from high performance stuff on Kubernetes right down to backup. And what we're starting to see more and more this year is combinations of those things on one pool of technology, a data lake, if you like. And, you know, there are other things that are very common in industry that Ceph does incredibly well at, like S3. You know, we can demonstrate terabit per second throughput on S3, with all of the bells and whistles you'd expect for load balancing and for failover and stuff like that. So that's a workload where it just shines. You know, and we should mention one additional feature that is certainly obvious on your website,
Starting point is 00:41:48 and that is the cost per gigabyte, terabyte, petabyte, which, in terms of storage costs alone, is substantially lower than your competitors'. Yeah, we find that. You know, customers love this idea that we make all of our own things, and the quality pours out as a result, and that's feedback from customers, not me saying it because I want to believe it. And when you combine that with the fact that we can go toe-to-toe with China every day and we can win on cost, because we really have that control and we built a mechanism for manufacturing that's meant to even be distributed, it could be anywhere in the world, and we don't have the same volume ambitions and commonality in order to extract margin, it gives us the ability to be
Starting point is 00:42:36 super aggressive for certain customers. I mean, they can't have the highest performance product and the lowest cost, there's common sense in the room, but everyone understands that. And so, you know, we literally have been going into accounts that want, you know, very economical storage at scale for backup. And what we're seeing is that, you know, customers that see this technology, when they learn to operate Ceph, you know, often with a management interface that doesn't require them to be an expert, they see that, you know, the ability to scale with Ceph is off the hook. It's got this juicy common center of one mechanism
Starting point is 00:43:15 you can express all sorts of policy and control around. You can do the block, object, and file. It's got a fair bit of maturity at this point, and its performance, particularly on our hardware, is very, very credible. And then you have these corner cases, you know, the storage manager and the Storage Router, that just give them a way of insulating legacy, or reality, if you like, in different consumption models.
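Phil has mentioned mixing replicated pools for hot workloads with erasure-coded pools for backup a few times; for concreteness, here is a hedged sketch of what that looks like with the stock ceph CLI driven from Python. The pool names, placement-group counts, and the 4+2 geometry are made-up examples, and the usable-capacity comments are the simple ideal-case arithmetic.

```python
# Sketch: a 3x replicated pool for hot data plus a 4+2 erasure-coded pool
# for backup, using stock ceph CLI commands. All names and numbers are examples.
import subprocess

def ceph(*args: str) -> None:
    subprocess.run(["ceph", *args], check=True)

# Replicated pool, size 3: roughly 1/3 of raw capacity is usable.
ceph("osd", "pool", "create", "hot-pool", "128", "128", "replicated")
ceph("osd", "pool", "set", "hot-pool", "size", "3")

# 4+2 erasure-code profile: roughly 4/6 = 2/3 of raw capacity is usable,
# at the cost of more work on writes and recovery.
ceph("osd", "erasure-code-profile", "set", "backup-4-2", "k=4", "m=2")
ceph("osd", "pool", "create", "backup-pool", "128", "128", "erasure", "backup-4-2")
```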
Starting point is 00:43:40 So what would be a minimum, minimum solution that you would go out and sell, Phil? It seems like with Ceph, I think petabytes, I usually see the petabyte word. It's singles or multiples or sometimes very large, but you can run Ceph with three storage nodes, which in our case is 360 terabytes of raw data on one of our products. We can go lower with other products, but that's a common...
Starting point is 00:44:21 We often see people do POCs on that because you can demonstrate everything that you'd feasibly want to see. And then they often grow out from there. Right, right, right, right. Well, this has been great. Matt, any last questions for Phil? No, it's been a wonderful conversation. Phil, anything you'd like to say to our listening audience before we close?
Starting point is 00:44:44 Well, thanks for having me. It's been likewise a pleasure. But, you know, we're a friendly bunch of guys. We're all over the world, Asia, Europe, America. You know, we'd like to help anyone that has storage needs, particularly around Ceph. And we're house trained and, you know, a friendly bunch. So just reach out. You know, we often love to help customers and often, you know, it's not necessarily a transaction for us in terms of business, but we can be very helpful and we learn things, and it's good to touch, you know, as many different use cases as we can see. That way we get more information about the market. So reach out. We're a friendly bunch. Okay. Well, this has been great, Phil.
Starting point is 00:45:27 Thanks for being on our show today. Thanks a lot. That's it for now. Bye, Matt. Bye, Ray. And bye, Phil. Until next time. Next time, we will talk to another system storage technology person.
Starting point is 00:45:43 Any questions you want us to ask, please let us know. And if you enjoy our podcast, tell your friends about it. Please review us on Apple Podcasts, Google Play, and Spotify, as this will help get the word out. Thank you.
