Grey Beards on Systems - 118: GreyBeards talks cloud-native object storage with Greg DiFraia, Scality and Stephen Bacon, HPE
Episode Date: April 29, 2021. Sponsored By: Scality and HPE. Keith and I have talked with Stephen Bacon, Senior Director, Big Data Category, HPE, before (not on our podcast) but not Greg DiFraia, GM Americas, Scality. Both were very knowledgeable about how containerization is changing IT and the role of object storage in this transition. Scality's ARTESCA takes this changed world view …
Transcript
Hey everybody, Ray Lucchesi here with Keith Townsend.
Welcome to another sponsored episode of the Greybeards on Storage podcast,
a show where we get Greybeards bloggers together with storage and system vendors
to discuss upcoming products, technologies, and trends affecting the data center today.
This Greybeards on Storage episode is brought to you today by Scality and HPE and was recorded on April 23rd, 2021.
We have with us here today Stephen Bacon, Senior Director of Big Data at HPE, and Greg DiFraia, GM Americas at Scality.
So Stephen and Greg, why don't you let us know a little bit about yourselves and what's new with Scality and HPE?
Sure thing. So Stephen
Bacon, I've been with HPE for quite some time. Through my career, I've been in the industry just
over 20 years now, focused primarily on IT infrastructure, covering software, servers,
storage, and a little bit of networking at one point in time. So Greg DiFraia, GM of the Americas
for Scality. Approaching three years
with the organization, which there's been a lot of innovation and change and certainly been an
exciting ride this far. And we're going to talk about some new and exciting things here today
with you all as well. But I've been in the infrastructure space for roughly about 20 years
as well and have a long history with Scale-Out File and Object
platforms and large storage platforms as well. And I've been really attuned to a lot of the
next-gen technologies when it comes to containers, cloud-native, etc., which we'll talk about some of
those themes here today on the podcast. So thank you again for having us. Great, great. So what's going on with Scality and HPE? Sure. So I can start. There's quite a bit. I think, you know,
one of the reasons why I joined the organization three years ago was specifically about the things
we're doing around innovation, right? So when people think about storage, there's a lot of
things that have been constant over the past 10, 20 years. But when you listen to the customers,
you listen to the analyst community as well,
there's a lot of these emerging needs.
And a lot of the emerging needs are around
how to leverage technology to drive business outcomes.
And some of the things that we've done since I've been here
is that we've really done a fantastic job
with doing multi-site and multi-cloud topologies,
helping customers get to the cloud and run in a multi-cloud scenario to solve a host of business
and technical issues that they had. But we've also seen some very unique trends that, again,
working tightly with our customers and partners and so forth is really how to fuel some of these changes and innovations
that are happening in the market. So we look at things around edge and edge-to-core and different
deployment methodologies. And Stephen's going to talk a lot about with our great partnership with
HPE and some of the advancements that have happened on the hardware side is how to take
full advantage of those as they come, because those advancements are happening real time. And, you know, there's a
lot of things that we're going to talk about in terms of our launch, but there's things that we've
done over the previous years that, you know, this is really why our customers really thrive with
our partnership in the field. So we're excited to share that with you all. Great, great. So, I mean, Scality has been in the, I don't know if it's the correct word, object storage space for
a long time. I mean, I've heard about them probably 10 years ago, maybe longer.
And they've always been pretty high performance, that Scality ring and that sort of stuff.
And it was, you know, very massive application environments where you
were deployed, as far as I could tell in the old days. So you're looking at more of a smaller
deployment in the new solution or what? So, yeah, if you look at our heritage,
you know, we're a distributed scale-out file and object platform. We've been a market leader with both Gartner and IDC
for the previous five years consecutively.
And we've been solving a lot of these large-scale
unstructured data challenges, right?
And like I just mentioned,
we also pioneered some multi-cloud capabilities as well
that allow us to extend those services
into cloud of choice based on policy.
What we're seeing is that, with the emergence of new use cases and workloads
and really the decentralization of IT, there is a need for edge.
And some of the things that we're seeing in terms of market drivers there is that
more and more new infrastructure is actually being deployed to the edge, right?
By 2023, that's going to reach about 50% or more.
We're seeing a massive increase of the applications at the edge, right?
And we're seeing this in the next couple of years
close to 800% growth of the number of applications at the edge.
And if you look at what we've traditionally deployed in the core data center,
these are massively scalable petabyte, tens of petabytes,
and even larger systems in the core.
But there's this desire and need with the decentralization of these
applications to resolve the storage need at the edge. And with the form factors of hardware with
HPE and our ability to say containerize our service and deploy in maybe smaller increments
as satellites, if you will, tying back to a core or
even to cloud. This is really powering and fueling a lot of these new next-gen data pipelines, right?
And we're talking about things like AI and machine learning and, you know, connected devices, and you
can apply those to different verticals. But the net is that these are very complex challenges. And again, having this
portable form factor to really resolve that edge requirement, or even in some cases smaller sites
where, you know, historically scale-out platforms were maybe a little
too large for what they're trying to do, allows us to get to a new part of the market where historically
scaled systems didn't necessarily provide value.
Yeah.
I've seen, it didn't necessarily make much sense to me early on that the edge would have
a serious storage or data throughput requirement.
But then I started seeing some of these smart cars and almost literally terabytes of data
these things are processing on a daily basis.
And it just amazes me what's going on in these solutions and stuff like that.
Does HPE have edge deployments with serious data requirements like that?
It's certainly something that we're increasingly
running into. There's actually a really key word that I'd probably just like to expand on for a
minute that Greg was also referencing, and that's applications. So if we take a step back and look
at what's happening today versus years gone by, the world really has evolved to being much more app-centric, app-driven
than what it was.
And we've certainly seen that being a catalyst, an inflection point, requiring a fresh data
management approach, a fresh object storage approach versus what's been the case in the
past. So when you combine that app centricity requirement
with a true edge through core to cloud context,
you really do have a number of catalyzing drivers
for a new style of object storage solution.
One with a high degree of adaptability,
of portability, of accessibility and efficiency,
that a more traditional heavier weight object storage solution just wouldn't be as ideal for,
let alone a traditional file system, which would be far less suitable for. So let's talk a little bit about that Lego, to borrow a term from a different industry,
approach to providing storage services at the edge.
One of the problems or debates that we're having, kind of the cloud native and cloud ecosystem,
is around what's the right control plane for the edge.
And as a component, as a subsystem of cloud,
how did the team approach kind of this unknown state?
Kubernetes is kind of what we have today.
But if you look at edge deployments,
either you're all in in Kubernetes and you believe that that's the model or you're looking at it and saying, you know what, that's not the right thing for the edge.
How does that impact Scality and HPE strategy for providing kind of next gen object storage for them?
It's a good question, right, because you're always mindful of, you know, where things are at today and how they will evolve, right? When you look at our engineering team and our product team, we always talk about adaptability and sustainability, right? What we do today may improve and change. And if you look at us historically as an organization, we were founded back in 2009,
and we've been on the same code base throughout, right?
So we've got a really good track record of aligning and adapting
to best of breed based on our customer requirements.
So when we're talking about edge,
we historically have always deployed on, let's say, bare metal
or right on the hardware itself in a scale-out topology.
And that really works in certain cases.
But we really gravitated towards container-based deployment models.
And when we looked at a lot of our customers, a lot of them are going to Kubernetes-based application and orchestration platforms.
So adopting storage into that same framework made it simple to deploy, simple to manage, simple to scale.
And this allowed customers tremendous flexibility.
Today, we're going to continue to innovate and advance there as well. And obviously, as the market improves,
shifts, augments, you know, we've shown a track record of having our ear to the rail
to really grow with the market. And I think that's the key: what it is today
may not be what it is in a year or two, but providing customers the path to leverage those best-of-breed technologies
when available, and not really get siloed or locked in, is really a key design and execution principle.
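To ground the "shrink-wrap it in a container" idea that comes next: a containerized S3-compatible service is typically described to Kubernetes as a Deployment manifest. The sketch below just assembles such a manifest as plain Python data; the image name and port are invented placeholders, not Scality's actual ARTESCA packaging.

```python
import json

def s3_service_manifest(name: str, image: str, replicas: int = 1) -> dict:
    """Build a minimal Kubernetes Deployment manifest for a
    containerized S3-compatible storage service (illustrative only)."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,  # hypothetical container image
                        "ports": [{"containerPort": 8000}],  # assumed S3 port
                    }],
                },
            },
        },
    }

# A small edge site might run a single replica; a core site, several.
manifest = s3_service_manifest("edge-s3", "example.org/s3-store:1.0")
print(json.dumps(manifest, indent=2))
```

The point of the data-structure form is that the same description deploys unchanged on any Kubernetes cluster, edge or core, which is the portability the speakers describe.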
Somewhere in there, Greg, you mentioned containerizing the storage. You want to talk
a little bit more about what you've done there? Sure. So what Steven mentioned is around the launch of the platform, right,
is that we believe, you know, in the edge and form factor and having a very simplistic storage
interface, S3, which we see as a standard for a lot of these
next-generation cloud-native applications to speak to. When you shrink-wrap it in a container,
this allows us to deploy essentially
on any type of form factor and really make it portable
and very elastic, right?
So, and for a lot of our customers,
that's exactly what they're looking for.
And I'll give you an example.
With Steven and his team and our team, we're currently working with a very large global manufacturer that has over 200 remote locations. And they have a requirement to have local storage for 30 days, but also synchronizing that with the core, doing analytic near real time, and then long-term vaulting of that data as well.
So there's a very unique pipeline that happens within that workload.
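The retention-and-sync pipeline Greg describes could be sketched roughly like this. This is a hypothetical illustration, not Scality's actual policy engine: the tier names are invented, the 30-day threshold comes from the example above, and the one-year vaulting threshold is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical tiering rule for the pipeline described above:
# objects stay on edge-local storage for 30 days, are synchronized
# to the core for near-real-time analytics, and are eventually
# vaulted long term.
EDGE_RETENTION = timedelta(days=30)
VAULT_AFTER = timedelta(days=365)  # assumed long-term threshold

def placement(last_modified: datetime, now: datetime) -> list[str]:
    """Return the storage locations an object should occupy."""
    age = now - last_modified
    if age < EDGE_RETENTION:
        # Recent data lives at the edge AND is mirrored to the core
        # so analytics can run against it while it is still local.
        return ["edge-local", "core"]
    if age < VAULT_AFTER:
        return ["core"]          # past edge retention: core only
    return ["vault"]             # long-term archive

now = datetime(2021, 4, 23, tzinfo=timezone.utc)
print(placement(now - timedelta(days=5), now))    # ['edge-local', 'core']
print(placement(now - timedelta(days=90), now))   # ['core']
print(placement(now - timedelta(days=400), now))  # ['vault']
```

The design choice worth noting is that placement is a pure function of object age, so 200 remote sites can each evaluate it locally without coordinating.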
In each one of those deployments, they're having dissimilar hardware configurations based on a number of different factors.
It could be form factor, it could be the amount of data, it could be performance. So we've certainly embraced Flash and roughly about 70% of our
customers today are evaluating Flash because majority of the new applications that we're
deploying have embedded AI, which means this notion of completely cold or frozen data is really
changing, right? It's constantly being gone through to learn and to take variables back to the business to make better business decisions.
So a lot of the stale data that we saw maybe 10 years ago, low-cost archive, frozen, is becoming
alive again. And, you know, for us, with the emergence of, say, RESTful-based APIs for storage, we're seeing RESTful-based storage move more toward tier one or primary storage.
And in order to meet those needs, you need the performance.
And things like Flash technologies allow us to do that. So again, the power of software and containers allow us to use basically any storage medium,
SAS, SATA, NVMe, Flash, and align it to the right performance and capacity need based
on demand.
And now we can do that based on locality, which is even more exciting.
So Stephen, talk to us a little bit about the HPE angle in here, because as an enterprise
architect, and I'm getting more and more abstracted away
from the hardware, why does hardware matter in this solution? You know, why not just go out and
put Scality on Whitebox? Sure. So the hardware aspect is only one aspect of the collaboration
between HP and Scality. Our partnership goes back to, I think it was 2014. It was further extended with
an equity investment that HP made in Scality back in 2016. And it's a partnership that has
grown very substantially since now to hundreds of customers, many thousands of systems with very successful deployments in many parts of the world.
It's represented in terms of both a shared vision, joint collaboration on how we drive
each element within the offering. And thus, we have input and help shape Scality software.
Scality similarly has input too and helps shape us. And it's not just the recipes and software capability as part of this announcement; it's fully wrapped within the likes of HPE GreenLake, which is really a way that we bring the cloud experience to our customers
such that they're not having to go to the cloud themselves, if you will.
And thus, we're able to bring to our customers the ability to enjoy a flexible pay-per-use
consumption-based model, as well as the actual managed infrastructure managed solution for them
such that they can focus on using, not operating, the infrastructure. So it actually goes
pretty deep in that regard. To come back to your specific question around hardware. So as we've approached the market opportunity
and inflection point that we're addressing with Scality ARTESCA, we certainly have sought to be
very expansive in the portfolio that we bring to bear to make it easier for customers to purchase, to deploy, and be successful.
Once again, wrapped with that HPE GreenLake experience.
So we're actually introducing as part of this new offering a full portfolio of six
different platforms. Everything from a very commonplace, general purpose, single node system
to a four-node cluster-in-a-box in just two rack units, a high-performance-and-capacity all-NVMe 1U node, an economic bulk-capacity all-flash node using QLC media
technology, as well as a number of hybrid flash nodes in addition. So regardless of the context
requirement that a customer might have, where hardware is a part of the necessary equation,
we believe we have a full expanse of offerings that would address those requirements.
But there's one, what I believe is a really important additional
element to keep in mind. When we talk edge to core to cloud and the capabilities around HPE solutions for Scality ARTESCA, it's the ability to manage data no matter where it lives, in the cloud, in the data center, at the edge, including within S3-compatible object stores from other vendors. And
thus, whether it be a hundred percent the joint offering or in combination with
other S3-compatible object stores, either at the edge or in the cloud, we're able
to bring together that environment and really drive for our customers an edge-to-cloud solution for their apps, no matter where the data lives.
So the customer can run and manage their own storage that's residing on HPE hardware appliances,
as well as potentially any S3 storage that the customer has, regardless of where it is?
I'm trying to understand.
Correct. Yeah. So think of it this way, is that it's really comprised of two main components.
So one is a very simplistic S3 store, right? And what I mean simplistic,
it's simplistic by design because it's really built and optimized with the app developer
architect in mind and the application in mind, right? So it's very simple to manage, simple to
deploy, and very easy to manage at scale. The other component that Stephen referenced
is our ability to manage workflow. And yes, we interface with any other third-party object
storage. So if you had an ABC vendor of choice or even a public cloud that you wanted to synchronize
back or from us to them. And we also have the ability to
do that with POSIX. So if I had, you know, a situation and we see this a lot, especially in
these, you know, data pipelines for analytic and AI and machine learning situations where,
you know, I think you mentioned the car or the edge, we have this really powerful policy-based IP that allows us to identify a source and move data to a different AI platform for interrogation and
learning. We can provide that interface and basically ingest that namespace, the metadata,
and do the conversion. So it's much more than just storage. There's transformation, there's movement,
there's workflow, and also that does integrate to all the other
major cloud vendors because we do see them in the pipeline as well. So it's very unique in its offer
and allows us to do a lot of things, not just storage. And I think when we talk to the
data scientists, the app architects, the storage is a consideration. It's a challenge, performance, scale, availability,
and those things. But it's also, how do I drive these pipelines end to end? And we're also
fulfilling that requirement as well, which to be honest with you, is a very challenging point
of note. And this very much intersects with a part of both companies' heritage.
So if we look at the heritage of Hewlett Packard Enterprise, HPE, if we look at the heritage of Scality, enterprise is a core part of our heritage.
And thus, with this new offering, while it's lightweight, while it is itself cloud-native object storage, very much designed for the Kubernetes era, and with that built-in federated data management, what we don't trade off, what we do embody, is enterprise capabilities and an enterprise-grade offering, including very much in the completeness of the offering, both in terms
of the feature set, as well as the user experience embodied around it.
That's great.
We're getting close to the end.
I did want to ask, so the pricing, is that something you want to discuss on the podcast?
Or obviously the hardware is, you know, HPE pricing, kinds of public
information, right, I assume. But the software, the Scality solution, is that how this works?
So it's very simple. I mean, the way that we license is just by how much data is
managed. It's very simple. It's not after the overhead and all the other things, right? It's
how much data we manage in the system. So it's very simplistic. It actually is very similar
to how we license with the Scality RING, which has been really our
flagship over the years. But we try to make that component very simplistic and easy. And one thing I would say is that, you know, that partnership with HPE makes it very simple.
It's the hardware form factor, and then it's how much data do you want to put on it.
It's not the raw hardware, which is very typical in software-defined. If HPE said, hey, here's a system with a petabyte of raw disk,
a lot of software-defined vendors are,
you've got to license the whole thing.
We're just saying, well, how much data do you have?
Do you have 100 terabytes?
Do you have 50 terabytes?
And do you want like two copies, three copies?
Whatever you need for your local and remote durability. But really, the
essence of how we've always done our licensing, and this isn't just unique here, is around
data. And that's how we license. And I think many customers really embrace that because,
I mean, I know Keith's an architect and very technical and understands that not all data is created equal.
And, you know, if you've got large files, small files, you have different service levels and things that you're trying to hit.
You know, these are things that could really impact customer cost in the end. So you provide an appropriate level of protection and performance,
but really keep the license component and cost component
really streamlined and simple and in check.
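To make the distinction concrete: licensing by data managed charges on the logical data you store, regardless of protection overhead, whereas raw-capacity licensing charges on total disk. A toy calculation under the assumption of simple whole-copy replication (the numbers and rates are illustrative, not Scality pricing):

```python
def licensed_tb_by_data_managed(logical_tb: float) -> float:
    """License is based only on the logical data the system manages."""
    return logical_tb

def raw_tb_consumed(logical_tb: float, copies: int) -> float:
    """Raw disk consumed when durability comes from whole copies."""
    return logical_tb * copies

# 100 TB of data kept as 3 copies: raw-capacity licensing would bill
# 300 TB of disk, while data-managed licensing bills only the 100 TB
# of actual data, however it is protected.
logical = 100.0
print(raw_tb_consumed(logical, copies=3))    # 300.0
print(licensed_tb_by_data_managed(logical))  # 100.0
```

The practical effect Greg describes is that changing the protection scheme (two copies versus three, local versus remote durability) changes hardware cost but not the software license.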
And if we think about the purchasing model side of that,
so with this new offering,
it is available to customers from a software standpoint
in a subscription model,
as well as then as mentioned before in an
HPE GreenLake consumption based purchasing model as well. One of the elements that I think really
crystallizes the co-designed element of this new offering is that it is exclusively available through HPE for the
first six months. And that's certainly something which isn't exactly commonplace in the industry,
but I think is one of the best testaments to the very close collaboration between HP and Scality to bring this new capability to market
to address a new emerging evolving
set of customer requirements and application requirements.
And this really reflects this is a joint effort
to enable customers to do new things, to address the emerging breed of cloud native, of analytics, of in-memory, and other applications.
And with the full backing of a very close, effective, collaborative relationship between the two companies.
So, Keith, do you have any last questions for Greg or Stephen?
No, I think I really understand that GreenLake model.
I think Stephen wrapped up the question that I would have had,
which is this seems pretty simple to consume with GreenLake
as kind of, not to give HPE a marketing accolade, you guys do a
good job of marketing. Cloud consumption of S3 storage at the edge is a pretty good story from
a cost model. Stephen and Greg, anything you'd like to say to our listening audience before we
close? So just a final wrap for it from my perspective. What we're really doing
here, I think, is seeking to empower our joint customers, and to do so with an application-centric,
developer-friendly, cloud-first object storage platform through which they can address their
emerging requirements, both in terms of the workloads, as well as the topology context
in which they're deployed, and addressing not just the data
persistence layer, which is the focus that many others may too
narrowly be centered on, but also the more complex problems, if you will, at the data
management and workflows layer.
And I think the combination, the duality of those things is what makes this so incredibly
powerful.
So what I would say is, I know you brought up, I couldn't remember if it was yourself,
Ray or Keith, but why not just white box or things like that? And honestly, this is really when we're talking to
our customers week over week and month over month and quarter over quarter and over the years is
that this was an initiative we had been working on for quite some time. You know, I can go back to even when
I started at Scality three years ago; we were working to solve this problem.
A lot of our customers were trying to satisfy this need, and they were doing
things with open source and various different tools and scripting and
various different hardware, and then they've got to make this thing enterprise-grade, always on, hit a service level.
And that to me was the ultimate pain that we're trying to solve for a lot of our customers
is that they were trying to do that.
And obviously, when you're doing things like that, you know, the cost to manage
various different code bases and hardware and other things,
and getting to a service level that meets the business need,
is a heroic task.
In many cases, those customers were challenged with it.
And when we learned about some of these challenges that they had,
I think this is where Steve was talking about
our unique partnership is we've been working on this to solve it. So if we can simplify, we can do this with an always-on
enterprise-grade, many nines of durability, and we have our great support organization that's
been in place for years, both on hardware and software supporting, always on mission-critical
applications.
And obviously, I think you probably know a lot of the installed customers that we have.
They're very high profile. Taking that thought leadership to this new challenge of edge and developer-centric storage and edge-core-cloud and data pipelines, and really driving the
transformation era for their digital initiatives, is exactly what we're doing. So, you know,
we come with the thought and the methodology of open source without any of the risk,
and with the alignment on the hardware for performance and other things.
Okay. All right.
Well, this has been great.
Thank you very much, Stephen and Greg, for being on our show today.
And thanks again to Scality and HPE for sponsoring this podcast.
That's it for now.
Bye, Keith.
Bye, Ray.
Bye, Stephen.
And bye, Greg.
Thank you.
Thank you.
Until next time.
Next time, we will talk to another system storage technology person.
Any questions you want us to ask, please let us know.
And if you enjoy our podcast, tell your friends about it.
Please review us on Apple Podcasts, Google Play, and Spotify, as this will help get the word out. Thank you.