Grey Beards on Systems - 48: Greybeards talk object storage with Enrico Signoretti, Head of Product Strategy, OpenIO

Episode Date: July 25, 2017

In this episode we talk with Enrico Signoretti, Head of Product Strategy for OpenIO, a software-defined, object storage startup out of Europe. Enrico is an old friend, having been a member of many Storage Field Day (SFD) events in the past, which both Howard and I attended, and we wanted to hear what he …

Transcript
Starting point is 00:00:00 Hey everybody, Ray Lucchesi here with Howard Marks here. Welcome to the next episode of the Greybeards on Storage monthly podcast, a show where we get greybeard storage and system bloggers to talk with storage and system vendors to discuss upcoming products, technologies, and trends affecting the data center today. This is our 48th episode of Greybeards on Storage, which was recorded on July 20th, 2017. We have with us here today an old friend, Enrico Signoretti, Head of Product Strategy at OpenIO. So Enrico, why don't you tell us a little bit about yourself and your company? Yeah, sure.
Starting point is 00:00:49 First of all, thank you very much for having me on Greybeards on Storage. This is one of my favorite podcasts. Thank you. We're always shocked when people say that. So what about me? I joined the company, OpenIO, in January this year, after many years doing independent consulting and blogging on Juku.it. And when I joined the company, the idea was that they needed someone to help communicate, a little bit of marketing, helping the product development, not coding but more on the roadmap side. But, you know, jack of all trades was not a good title to put on a business card, so we found head of product strategy, which is probably more, you know, professional. Yeah, every time you
Starting point is 00:01:42 put jack of all trades someone expects there to be an master of none. Yes, of course. We know you better than that. So we are an object storage company. So we define ourselves next generation object storage and with a serverless compute framework. And this is what we do. The idea is that our core product is very very efficient, lightweight compared to the competition and on top of it we can intercept any sort of event that happens on the system so we can run code directly on it.
Starting point is 00:02:27 And there are several use cases that are popping up now. Some of them are really, really fancy, like machine learning. So we've been demonstrating using TensorFlow lately so that you can recognize the content of the object that you store in our system. I'm still waiting for somebody to do a TensorFlow demo that says not hot dog.
Starting point is 00:02:56 Yeah. Yeah, no, we use volcanoes and bananas. So you can, it's a little bit different than a Silicon Valley application, but still very, very standard stuff. So it's an object storage system. Is it a software-only solution or is it hardware, software, appliance, or how does that work? Well, this is a software solution. We are partnering with several hardware vendors to certify their hardware.
Starting point is 00:03:30 Actually, it's a little bit more complicated than for others because we support totally heterogeneous clusters, okay? So our technology is not based on a distributed hash table that usually each node takes a part of the key space and this part is identical. Okay, so you divide by the number of nodes, each one of them takes a part of the key space and the load balancing is static then. In our system, we do it in a totally different way. So we don't have this concept. Usually object stores are represented as rings. We usually talk about a grid of nodes,
Starting point is 00:04:18 and we have a system of directories, three-level directories, so the keys are distributed differently. It's much more efficient because the number of OPs to get the data is always two, and
Starting point is 00:04:38 it doesn't increment with the number of nodes. And the other thing is that to do the load balancing, we have a mechanism that we call conscience. So we collect metrics from all the nodes in the cluster. For each one node, we collect capacity left, throughput, CPU power, and so on.
Starting point is 00:05:03 And we compute the quality score of that node. For each single operation it's a decision that has to be taken. We just have this sort of bulletin that is available to all the nodes and you pick up the best nodes that suit your need at that specific moment. So you are sure that you are writing always on the most available node. It also means that, for example, you don't need to rebalance the cluster. So if you put a new node in the cluster, usually with a traditional system, you have to rebalance the whole cluster.
Starting point is 00:05:41 So you have to recompute all the key space, redistribute it, and also for the data, it's the same. So you have to reshuffle that around. It's a mess, and it impacts performance, or it takes a long time. In our system, you just put the node. In a few seconds, the node is recognized. It becomes part of the cluster, and its score is very high because it's new, a lot of CPU power and capacity and so on.
Starting point is 00:06:12 Then you start hitting the node, and of course, its score goes down. And that's it. It's very, very easy and flexible. So let me try to understand what you just said. So you say it's more of a mirroring of data protection rather than RAID and stuff of that nature, or ratio coding. Is that true? No, we use object level.
Starting point is 00:06:35 You can select several types of data protection. So you can select multiple copies or ratio coding and so on. Actually, you can also set a dynamic policy. For example, when you save the data, the system checks the size. If it's a very small file, then you probably will save it in, I don't know, three copies. If it's larger than a certain size, you choose an erasure code type. So it's very flexible. Yeah, you mentioned that the data was always located at most on two different nodes. No, actually, the data is located depending on the data protection scheme that you use.
Starting point is 00:07:25 So the object can be copied multiple times in the system, or by using erasure code, the chunks can be spreaded. Okay, I got you. I got you. So from this point of view, it's the most traditional part of the system. So, ratio coding or multiple copies is what you can expect from all the object stores. Okay, but you did mention that the load balancing is more of a dynamic. It's not a dynamic, yes. In a more traditional system like Scality's Ring, when you add a node, that node becomes responsible for those objects which hash to some range of hashes.
Starting point is 00:08:11 And data automatically starts moving to get the data that node's responsible for onto that node and free up space on the other nodes. And correct me if I'm wrong, Enrico, but you guys say, well, here's a new node. It's empty. Let's write most of the new data to it. Right. And instead of having to be doctrinaire about this hash space management, we'll just keep track of where things go so we don't have to move old data until it's time to retire those nodes, which is actually one of the things people don't think about, about supporting heterogeneous nodes, is I'm putting an object storage system in. That data is going to sit there a long time. The system is going to live beyond the life of the nodes I'm putting in today.
Starting point is 00:09:04 And four years from now, I don't want to buy machines with four terabyte drives anymore. I want to buy machines with 20 terabyte drives. So that heterogeneity becomes really important. Yeah, exactly. Also, most of our customers, so, you know, the company, our company is pretty young. Okay. So we started two years ago. But actually, the first system deployed was in 2009.
Starting point is 00:09:31 So the team, the founding team, was part of a large system integrator in France and a very large tech commission at the storage system to solve the scalability issues that they had with the traditional system and also the lock-in and etc.
Starting point is 00:09:51 Then, so this guy started developing the system in 2009 deployed the first petabyte. That particular system
Starting point is 00:09:58 in 2014 became 15 petabyte and then the team left and founded OpenIO. They forked the OpenSUSE project because this project was OpenSUSE many years ago.
Starting point is 00:10:11 And they built OpenIO on top of it. So we spent something like a week. They, at the time, now we, spent something like two years to make the project a real product. And we started selling it to customers at the end of last year. Okay. All of this to say that this first customer is still in production.
Starting point is 00:10:37 And it has 650 nodes now. Sorry, 650 nodes? Is that what you said? Yes. Oh, my Lord. Is that what you said? Yes. Oh my lord. Yeah, it's an email system, okay, because for several reasons, our technology is good both for managing
Starting point is 00:10:56 large files as well as more files, okay? We have connectors for Dove code for you know, Serious Map and Zimbra and so on. And the idea is that this customer started
Starting point is 00:11:11 with very, very small disks and very small nodes. And during every year, they added more and more and more. And now they have this massive installation. You know, another important thing is that you don't want to
Starting point is 00:11:28 renew your support contracts after a while because it's really expensive. You can decide to let the nodes die. After five years, if the nodes die, you just decommission it
Starting point is 00:11:44 and you buy a new node, which will be much more powerful with a stronger network connectivity and, of course, more capacity. This is one of the parts of managing cloudy things that corporate America is going to take a long time to realize and to fully accept. Just allowing nodes to die? Yes. Yes.
Starting point is 00:12:09 Yeah. It's, you know, I've been involved in too many acquisitions of systems where somebody goes, okay, so we get 400 servers and we'll get them all with four-hour support. And like, why don't you get 450 servers with next day support it'll be a lot cheaper that way you know we have customers running on system so in the last year we launched a product to showcase our technology okay when i talk about lightweight object store means that we we can run on 500 megabytes of RAM and one CPU core, an R core.
Starting point is 00:12:46 So we built this interposer that we call the NanoNode. And the NanoNode has enough CPU power, two cores, and one gigabyte of RAM and some SSD inside to manage metadata and a couple of Ethernet connections. This is an app-based with an interposer. If you remember FC to SATA interposers, this is the same thing, but it's Ethernet to SAS or Ethernet to SATA. So each single disk in the system is a server. It means that you can have hundreds or thousands of disks,
Starting point is 00:13:24 and each one of them is a failure domain. It's the smallest failure domain that you can have in this industry because you lose one node, which equals one disk. It's very powerful because just think about a range recorder. You can lose 30% of the disk in one month just going the last day of the month in the data center, change all the disks, and send
Starting point is 00:13:50 the disk back to the manufacturer and get new disks. And you're still protected, the performance is still good, and you don't really care. It's just one single person doing the work,
Starting point is 00:14:06 you know, a couple of hours of work per month. And so that nano node has a little ARM processor on it? Yes, yes. You know, I saw something on your website about running Raspberry Pi, running a node on a Raspberry Pi. Yes, you can. I'm building actually a dome. It's a little bit of a challenge because I can't buy too many Raspberry Pi 0 with a single owner.
Starting point is 00:14:33 But I'm building a cluster of Raspberry Pi 0s with a front end that will be made of a Raspberry Pi 3. And my only problem at the moment, because I have all the packages and they are available on the Apple website, is that I can buy one Raspberry Pi 0 at a time. It makes that array of 650 nodes difficult to assemble. Yeah. So your Nano node is kind of a slightly smarter version of the Seagate Kinetic or the Western Digital object drives that looked really interesting a few years ago, but never seemed to have
Starting point is 00:15:16 taken off. Yeah. You know, that product had a problem because it solved just half of the stack. So you have this key value store, but actually you don't have anything for the front end. And I think that we will see more and more of this kind of devices in the future. I can tell you a lot about this, but actually there are several small vendors working on this. One example is Ignis.
Starting point is 00:15:53 So they have the same identical technology. And there are others that are working on similar design, like a small card on top of disk, and maybe with some more CPU power, but you will see more and more. I'm in contact with a few vendors, and they are thinking about it. It's really attractive for composable infrastructure to just be able to say,
Starting point is 00:16:24 yes, every disk drive is an addressable device and we can assemble them and disassemble them on an as-needed basis. Yes. The problem is you need the right technology to take advantage of this drive because the CPU power and the RAM that you have in a single node is small
Starting point is 00:16:45 so most of the object storage vendors today use RAM to cache the metadata so it's really expensive and you can't do that because just thinking about 12 or 14 terabyte drives means that you need a lot of RAM to manage
Starting point is 00:17:02 the metadata we manage metadata on SSDs, so it's much cheaper and it's also still faster and at the end we can do this kind of products. Do you use the SSDs for data too? Well we can. We have a customer in Japan. They are doing they have not a large cluster. It's 10 nodes plus 10 nodes
Starting point is 00:17:31 400 kilometers far from each other because disaster recovery, they have problems with earthquakes and the kind of stuff. Earthquakes, tsunamis, volcanoes. Yes, they have everything. Everything. So this
Starting point is 00:17:49 customer is using the system to store emails and on top of it they are doing full text indexing on each single So they use the best use case for us because they also
Starting point is 00:18:06 use our read for apps which is the framework that allows you to run the code directly on the system so what they do is just indexing all the emails that become searchable and then after I don't know
Starting point is 00:18:22 40 days if I remember well they just delete this. They do this for compliance reasons. And it's very easy to achieve with our system. You don't need external hardware, external servers. It's just a bunch of line codes and Elastic. Okay, so you used one of my least favorite IT terms earlier, serverless. Yeah.
Starting point is 00:18:51 And I'm going to take that to mean event-driven as opposed to, and you don't need any servers because that's just stupid. Well, you know, so think about this. Okay, we can run in one CPU core. Okay, what is the smallest CPU that you can buy today from Intel? Four core? Four core or eight core probably. So you have seven free cores in the same infrastructure.
Starting point is 00:19:19 And I can tell you, okay, you can use them. So you don't need other servers. So it's serverless from this point of view. Okay, that's the best back explanation into serverless. No, it's just because we needed our marketing team. When they came in and told us, well, we don't have buzzwords in the website, so we need some buzzwords. And we all agree that serverless is one of these new buzzwords.
Starting point is 00:19:53 Yeah, I'm not blaming you for the invention of serverless. Yeah, yeah, yeah. Does that mean I could do something like write a rule that says when an object appears in this bucket, it's 4K video and automatically transcode it down and create a thumbnail version? Wow, this is exactly one of the use cases that we see the most. So thank you for telling this. So one of our customers instead of doing this process externally they just import the video
Starting point is 00:20:30 they add in each single video through this process the name of the logo of the company on top and they produce like 20 videos. Each one of them for a single different device and bitrate and whatever.
Starting point is 00:20:47 And in the metadata field, they add copyright information. And then the videos are available because we have an HTTP interface like Amazon S3. They are available as in a web server. So it's very easy to add. Another customer, but this one has a particular case, they add GPUs in the servers, and they do live streaming. So when they, because it doesn't work only on read,
Starting point is 00:21:19 but also on write. So they're experimenting the fact that they store only one video, and when you request the video, you know the kind of devices that request the video, so you transcribe the video in real time. Oh my God. pay less the power than the flow space. So it's much better for them to have two racks of storage instead of 20 and stream
Starting point is 00:21:56 everything in real time. Yeah, which is great for large libraries that aren't accessed all that often. Just a little bit of CPU instead of a lot of storage. But also there are CDNs. So actually you do the transcoding, but actually on the other side,
Starting point is 00:22:13 it's cached on a CDN. So they don't have to do the same transcode over and over. No, yes. But you don't have to store all the libraries, okay? The same video is rehearsed maybe twice a day, I don't know, maybe less, but you don't need to have
Starting point is 00:22:33 20 petabytes, but maybe a couple of petabytes of storage. Enrico, does the system support clusters that span different locations? Yes, so we support a single instance cluster in a single location. You can
Starting point is 00:22:50 stretch the cluster to different data centers, or you can have a synchronous replication and for long distance. So all the configurations are supported. So all the, all the configurations are, are supported.
Starting point is 00:23:07 Okay. So, so that includes, you know, clever, safe style dispersal coatings where. Yes, we can do that.
Starting point is 00:23:15 Yeah. So it's quite easy. It depends on the, on the number of sites that you have, because we do not support two sites with this mechanism, because of course there is a, a problem with a split brain. But if you have at least three sites, we can start doing that.
Starting point is 00:23:32 Yes. It's just a matter of configuration. If you consider that, so this is the sales part, the price of the product is based on usable capacity, only that. So all the other features are included. And so it means that if you want a ratio coding or if you want a JAR distribution or what else, it's just the same price.
Starting point is 00:23:54 And the software, I assume, runs under Linux? Yes, it's a Linux software. We run on Ubuntu and Red Hat and Debian. No souce? We don't have a customer asking for it. Good enough reason. We're still a small company, so many of the features are driven now
Starting point is 00:24:18 and availability of products are driven by customers. And even more than many small companies, in object storage, you're a small company with a large, where individual customers are large, so they start having that influence. It's like, well, we'll buy it if you do these three things. Exactly. We'll buy a couple of telebytes.
Starting point is 00:24:44 Do those three things. No, no turn to developers and go, do those three things. No, no, exactly. It happens all the time because, you know, some of these customers run just one single application. They want, you know, efficiency
Starting point is 00:24:57 and, well, as I said, I want this and it should work this way. You have to do that. What about data security, Enrico? Do you guys support encryption of the data? The objects? Well, we encrypt on the front end. Encryption will be supported on the next release
Starting point is 00:25:16 at rest. It's a request that is coming more of a rather... Consider this, that most of our customer store data already encrypted and sometimes also compressed. So some of the traditional mechanisms are not really important. The data path is really to support both of them. So compression and encryption, okay?
Starting point is 00:25:44 We already tried both of them in our lab, and for example, for compression, especially emails, for example, we can easily get a 30% space saved. So in the next few years, we will add compression, of course. The duplication is much harder for us because just by saving data that is already compressed, the creation
Starting point is 00:26:08 is not really efficient. Yeah, and metadata management becomes a real problem when you start talking about deduplication at multiple petabytes of scale. Exactly. Well, we have a nice mechanism of chunking the files. So actually, they are already chunked anyway. But, you know, it's not global duplication, because especially if you have a distributed cluster,
Starting point is 00:26:36 it becomes really, really complicated. And the benefit is minimal. So you mentioned you're AWS S3 compatible. Is that true? Yes, we are S3 compatible, is that true? Yes, we are S3 compatible as well as Swift compatible. And we started with Swift. We just borrowed all the code Keystone Swift from Tend. And then we added Swift 3 on top of it and we are contributing a lot to the community
Starting point is 00:27:09 now we are just all the patches that we we are added on a Swift 3 which is the you know this compatibility layer for Swift was not very good so we we did a lot of work and now we are ready to contribute back. So our team started talking with the OpenStack guys and we will release a lot of patches very soon to the community. At the end of the day,
Starting point is 00:27:37 we are an open source software, so we have to do this too. So OpenIO is itself open source? Yes. So all the features that I'm talking about are open source. You play support like the Red Hat model.
Starting point is 00:27:54 Yeah, Red Hat. So there's a community edition that people can install on their own Raspberry Pis and play with? Yes, of course. It's not a lie. So you can, there is a huge button on the right, top of the homepage, get started.
Starting point is 00:28:13 You can start a Trinode cluster by a few clicks, or you can just install it on the Raspberry Pi or on any Linux machine that you want. There are some requirements to get the performance, of course, but if you want just to try, you can create 3D machines and start. There is a background box already on the website. So also on Docker.
Starting point is 00:28:42 So if you want to try it on Docker, you just... Oh, so it runs in a container as well. Excellent. Yes. So at the moment not in production, but in October when we will release the new version, Docker will be supported in production.
Starting point is 00:29:00 So it will be a nice addition because we will be able to... So we are changing a little bit the architecture of the backend. So we won't use IPs anymore to discover services in the system, but actually we will have a service ID directory. So it means that even if the node changes the IP or changes the port, it will just update the directory and it will be discovered again.
Starting point is 00:29:34 So it's a nice addition that will help us. Also, this will allow us to extend the cluster to the cloud. So today we have what we call hybrid tiering, opposed to the tiering in the cluster. So we can have multiple tiers on-premises, but actually we can also have a tier S3 or Backblaze B2 as a tier. In October with the new release, we will be able to replicate data to the cluster, not just copy data we will be able to replicate data to the cluster, not just copy data to the cluster, so move data to the cluster.
Starting point is 00:30:09 And the metadata services will be available on EC2 instances. So if you don't have a secondary data center, you could have your primary data center, your on-premises data center, and then a full cluster running on Amazon EC2 plus S3. S3 just because the storage is cheaper than buying EC2 instances with the storage. Yeah, I don't want to store all that data in EBS. that gets expensive. So today you mentioned that you have hybrid tiering, which means that
Starting point is 00:30:46 objects could potentially be moved to other cloud storages. Is that what you're saying? Yeah, we started with Blaze B2 last year, which is a great partner of ours, and
Starting point is 00:31:01 this year we started supporting Amazon S3. So, you know, most of our customers are not using it because they are very large customers and the cost per gigabyte is really important for them. But actually, also for the fact that we are growing in the number of smaller installations, because partners usually have installations in the range between 100 and 1000 terabytes, 1 petabyte. So these customers are more interested in this kind of features because sometimes they don't have a secondary data center or they don't want to buy additional servers just for data that they don't use, they don't access anymore.
Starting point is 00:31:46 Because S3, at the end of the day, S3 is cheap if you don't access the data. Yeah, yeah. Right, and B2 even cheaper. B2 is even better, yes. Yeah, we had Backblaze on a couple of months ago, so yeah, they're great. We like them a lot. So I could today build a cluster that was the local backup target for my data center and build a policy that said objects
Starting point is 00:32:15 that haven't been touched in more than the next days get migrated off to B2? Yes, you can do that. It's a straightforward policy that you can apply outside others. Okay. And in the future, as I said, you will be able to do the multiple copies, one local and one online. Right. Yeah, which makes it more attractive because now you've solved my, and I got the backup off-site problem as well. Yes, of course. If you just have the media but not the metadata, you can't do a lot of things.
Starting point is 00:32:50 But when you have the metadata there, it's actually, you know, it becomes a multi-cloud controller. You can use OpenIO to manage different clouds and different on-premises infrastructure. At the end of the day, it's just theoretically we can manage also caves because the same connector that we use to write data to the Amazon, okay,
Starting point is 00:33:18 could be used to write to, you know, for example, Spectralogic. So you put a bunch of small servers to manage metadata, Spectralogic, okay? So you put a bunch of small servers to manage metadata and Spectralogic and you have a full functioning OpenIO cluster just on tapes. Yeah. So do you have metadata servers throughout the cluster
Starting point is 00:33:37 that have the SSDs? Is that how this works? So we can have all the nodes with all the services okay including metadata yeah yeah or we can have services a specialized node with services doesn't make a lot of sense actually but you could okay and well it might make sense to do metadata on a real server and a bunch of nano nodes behind it yes yes also yes. Also, for example, with a nano node, you can have only the storage services in the nano nodes and all the front-end layers like S3 and Swift
Starting point is 00:34:15 on H66 machines, for example. It's much easier. You need a lot of bandwidth, maybe concentrating on less nodes. Now we need that interposer form factor to get standardized enough that some vendor can make a chassis with an Ethernet switch that we can plug them into. Yes, I would like to see that. But again, I would like to say more about this. Yeah, we know, I would like to say more about this. Yeah, yeah, yeah. Yeah, we know what that means.
Starting point is 00:34:48 So is there a minimum number of metadata servers that are required in the system? Well, our minimum number for everything is three. Okay. But it's pretty common in the object server. So you can have two phases and still have access to the data. But the more the nodes and we spread data on all the nodes and the metadata also
Starting point is 00:35:09 so you have multiple also because we with the system of directories we actually have two levels of metadata meta 0, meta 1 and meta 2 so there are several services running everywhere and consider that
Starting point is 00:35:24 each single hard drive that we see in a system is a different service so that helps us also to understand better the metrics on the system, it's not just a file system but it's a different hard drives in the same system
Starting point is 00:35:40 and for each one of them we collect the statistics so at the end of the day it's very, very granular. In fact, we can add also only a single hard drive in the system. So if you buy half-cubed servers, and then one day you decide to buy two drives per server, you can do that. That's very flexible from this point of view. Yeah, it can start tiny.
Starting point is 00:36:06 Yeah, but consider that the smaller customer we have, they plan it for growth, okay? So they want the vehicle machines in the front end with all the S3 and Swift connectivity, okay? And in the DMZ. And all the storage nodes are decommissioned servers.
Starting point is 00:36:34 Yeah, because they were there, they had these two REC unit servers and they bought some capacity for them, so they started with a very minimal installation. And they started, this is a service provider,
Starting point is 00:36:48 they started selling the service and, you know, minimum license for us is 100 terabytes. They started with this 100 terabytes one year and they will see what happens. Actually, they already took with us for additional 400 terabytes. But, you know, if the entry point is is a very very cheap you can plan big you know very easily because uh you know you're not buying you're not paying at the hardware because we run on everywhere you can you don't pay the hardware
Starting point is 00:37:18 well you already paid the other but you you don't pay it because you are running it on VMware. And then if something happens, you can, you know, money flowing, and then you can do more and more. It's much easier. You mentioned VMware. Can you run your system on a VMware vSphere environment? Yes, of course. It doesn't make any sense from the dollar per gigabyte point of view, but you can.
Starting point is 00:37:43 You know, if you have some underutilized VMware cluster, and you have some spare space, and, you know, development or a small production system. Yeah, but it makes sense for, you know, these four developers are doing something
Starting point is 00:38:01 really weird. Let's fire them up their own object store while they're working on it. Yes, yes. For the developers, containers are better because you can do much more experiments on them. Yeah, more dynamic. But if you need to start
Starting point is 00:38:17 with a very small... Sometimes it's just a small application. You have to have it secured and everything. But it's just 10 ter application. You have to have it secured and everything, but it's just 10 terabytes. The smallest customer that we have in production in the physical environment
Starting point is 00:38:33 has 60 terabytes through servers. So it's very easy because at the end of the day, and they maintain a very low cost per gigabyte because if you do the math on how much does it cost to buy a 12-disk server today, it's not that much, especially because the characteristics of the hardware are not crazy.
Starting point is 00:38:57 We're talking about a 1.8 core CPU and a couple of network interfaces. Curiously, for corporate Americans, it's kind of, well, yes, you're taking the E5, V1, and V2 servers out of commission because you want more memory in each server so you can run more VMs, but those would be the perfect place to run an object store.
Starting point is 00:39:22 Yes, yes, of course. It's fantastic. And if I've got 30 nodes, perfect place to run an object store. Yes. Yes, of course. It's fantastic. And if I've got 30 nodes, then worrying about, but they're five years old, and one of a year of them is going to fail. It's just like, yeah, let's be like Google and let them fail in place.
Starting point is 00:39:37 Yeah. All right. Well, gents, this has been great. Howard, are there any last questions you have for Enrico? No. I think I got it. Well, actually, now I do have one. So, Enrico, I assume at the moment the big customers are mostly solution providers and you're getting the enterprises to start sampling? So, we have some web
Starting point is 00:40:03 scalers. I think the largest customer that I can name, unfortunately, most of our customers are really big, and we sign a lot of NDAs. But I think I can mention now Dailymotion, which is a sort of YouTube, but a European size. And it's a 20 petabyte cluster growing 8 petabytes per year. So we replaced an Iceland cluster there, a huge cluster. Going from Iceland to you must be a huge cost savings.
Starting point is 00:40:46 Yes, I can talk about that, but it was a lot. You know, sometimes when you are used to the price of enterprise storage and you compare it to the object storage, especially the kind of savings that we can bring to the company, it's like buying one year
Starting point is 00:41:04 of support contracts. And you get the full solution for multiple years. So that's the kind of service that you can expect in large-sized deals. Now we just have to convince people that they need to do it.
Starting point is 00:41:20 Yeah. And lately we are seeing some very interesting enterprises, as I said, in the 100, 1000 terabyte, so 100 terabyte, 1 petabyte range enterprises. This is a new thing for us, but it's ramping up very quickly. And a lot of services, because they have sometimes small data sets, but many of them, they are consolidating a lot of things. So multiple applications in the same system. Okay. Well, this has been great. Enrico, thanks very much for being on our show today. It's been a pleasure talking to you again.
Starting point is 00:42:07 Thank you very much again for having me here. All right. Next month, we'll talk to another system storage technology person. Any questions you want us to ask, please let us know. Thanks for now. Bye, Howard. Bye, Ray. And that's it.
