Storage Developer Conference - #49: Time to Say Goodbye to Storage Management with Unified Namespace, Write Once and Reuse Everywhere Paradigm
Episode Date: June 28, 2017...
Transcript
Hello, everybody. Mark Carlson here, SNIA Technical Council Chair. Welcome to the SDC Podcast. Every week, the SDC Podcast presents important technical topics to the developer community. Each episode is hand-selected by the SNIA Technical Council from the presentations at our annual Storage Developer Conference. The link to the slides is available in the show notes at snia.org/podcast.
You are listening to SDC Podcast Episode 49.
Today we hear from Anjaneya Chagam, Principal Engineer, Intel,
as he presents Time to Say Goodbye to Storage Management with Unified
Namespace, Write Once, and Reuse Everywhere Paradigm from the 2016 Storage Developer Conference.
My name is Reddy Chagam. I'm a principal engineer and the chief SDS architect at Intel, working in the Data Center Group.

Most of my session today is about the open source SDS controller effort. There is work going on behind the scenes with a few storage vendors as well as big end customers, and I would like to highlight what exactly we are doing within the context of enabling storage management in the industry. There are specific announcements that are going to come up in the next month or so, and I won't be able to talk about those details, but I will give you a flavor for why we are doing this and what exactly the problem areas are that we are trying to address when it comes to storage management in the industry. Hopefully that gives you a flavor of the need behind the work that is going to happen toward the later part of this year.
So I'm going to cover a bit on software-defined storage and what I mean by it. I'm going to take a couple of storage stacks, one being OpenStack and the other Kubernetes, to give you a flavor of how storage management is being done in the existing stacks: one gives you the virtualization layer, the other one is a container framework. So I wanted to give you a pulse on how storage management is being done in these two different stacks, paint a picture of exactly the problems we are facing in general with both the cloud computing and the virtualization management frameworks, give some context behind what we are going to do with the open source SDS controller proposal, and then talk a little bit about next steps and, specifically, the call to action.

By the way, it's a small audience, so if you have questions, feel free to stop and ask.

So, looking at software-defined storage, there are two broad elements. The one you are probably familiar with quite a bit is at the bottom: anything to do with scale-up, which is the traditional SAN and NAS appliances, and scale-out, which together form one building-block element when it comes to software-defined storage. Scale-out can be anything that you instantiate on standard high-volume servers using an open source flavor like Ceph or Swift, or a proprietary flavor like ScaleIO, vSAN, Nutanix, Storage Spaces Direct, and so on.
An element of scale-out and an element of scale-up are essentially considered the storage systems deployed in a data center. But in order to manage those pieces that are actually deployed in a data center, you need something at the management plane; we call that the software-defined storage controller.

So what is the role of the software-defined storage controller? Fundamentally, it needs to have visibility into all the storage resources deployed in a data center. It needs to provide a mechanism to provision storage resources to meet targeted SLAs. An SLA could be: I want X amount of capacity, I want X amount of latency, I want X number of IOPS, X amount of throughput. So anything and everything related to storage requirements, the controller needs to be able to understand, be able to carve out those storage resources in an optimal fashion, and work with any orchestrator. It doesn't have to be OpenStack only, it doesn't have to be CloudStack, it doesn't have to be Docker or Kubernetes; it needs to be able to work with any orchestration framework out there. So that's kind of the notion of what I mean by software-defined storage.
So, two different elements: the control plane, managed by the software-defined storage controller, managing all the storage resources. But once it carves out a resource, you will use the standard data-plane protocols: NFS, iSCSI, iSER, NVMe over Fabrics, anything and everything related to block and file protocols, or maybe even future key-value protocols as well. So we are not talking about changing anything in the data plane; the data plane will continue to stay as is. It is only the control plane: how do you manage the storage resources, how do you have visibility of the storage resources in a data center, and how do you allocate the storage resources in a much more optimal way. Okay, good.
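To make that SLA idea a bit more concrete, here is a minimal, purely hypothetical sketch in Python of what such a provisioning request might look like; the class and field names are illustrative and are not part of any real controller API.

```python
from dataclasses import dataclass

@dataclass
class StorageSLA:
    """Hypothetical SLA attached to a provisioning request (illustrative only)."""
    capacity_gib: int           # "X amount of capacity"
    max_latency_ms: float       # "X amount of latency"
    min_iops: int               # "X number of IOPS"
    min_throughput_mbps: int    # "X amount of throughput"

# Example request: 500 GiB, under 5 ms latency, 10,000 IOPS, 200 MB/s.
# A controller would match this against whatever backends it has discovered,
# regardless of which orchestrator (OpenStack, CloudStack, Kubernetes, ...) asked.
request = StorageSLA(capacity_gib=500, max_latency_ms=5.0,
                     min_iops=10_000, min_throughput_mbps=200)
```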
Does it also work in a DAS environment?

DAS, direct-attached storage — that's what you mean? Bare storage? Yeah. So it should. There is an element of standardization going on with Swordfish and Redfish, if you know the standards activities under the SNIA plus DMTF umbrella. Today, if you look at most of the storage management, it is all focusing on network-attached storage, scale-up or scale-out. But the goal is to actually rope in anything that happens on a compute node — local storage as well — as well as the fabrics and targets. So we have to include those elements too. Today it is mostly: there is scale-out storage and there is scale-up, I'm going to write a driver, I'm going to consume it, and everything else is done in its own fashion in each and every orchestration storage layer. But the intent is to bring in those elements as well.

We talked about a single way of managing these different kinds of storage.
Yeah.
And we have challenges where you have storage devices that are servers, then you have SAN, NAS, or different network devices.

Exactly, yeah. So, broadly classifying, the way I look at it is: there is storage sitting in a compute node that is purely servicing the compute elements, like virtual machines or containers. Then you have targets — iSCSI targets and NVMe over Fabrics based targets — that are really needed as part of the provisioning flows, as well as for allocating the storage resources. Then you have scale-out: some sort of a service that is deployed on a standard high-volume server pool, like Ceph and Swift. And then you have scale-up. So we should be able to manage all those pieces. They're all considered storage resources, but depending on your orchestration stack, your mileage will vary, right?
Yeah.
Okay. So let's click down into the major functions that you typically would like to see in the storage controller when it comes to control-plane operation.
So the way I look at it is you need to be able to manage the storage resources from
the beginning of procuring your storage systems and deploying in a data center,
all the way to retiring your storage resources
in a data center.
So that's how I look at it,
and there are different functional building blocks
that happen for you to actually start consuming
the storage resources in a data center.
When I say consuming, create volumes, create shares,
attach volumes, attach shares,
so that the compute side of the orchestration can take
advantage of storage resources.
So typically you start off with provisioning. Normally, if you have a pre-configured appliance, provisioning is going to be fairly lightweight, other than maybe configuring it and connecting the network fabric to the compute cluster. If it is scale-out, obviously you have to deploy the operating system, you have to deploy the specific software, you have to have all the pieces before you can start consuming them. So there is a bit of heavy effort involved when it comes to deploying the server-based scale-out storage stacks, all the way from operating system deployment to provisioning on top of it.

Once you provision the storage resources and connect the storage systems in a data center, you need to be able to discover them. There are lots of ways to discover, but most of the discovery happens to be: you log into the storage appliance and figure out what exactly it has. Depending on which vendor you are talking about, they may have a more integrated way of discovering the storage resources, but in general it is not a very well integrated and thought-out phase.

Then you need to be able to take the storage resources and group them into certain logical buckets so that you can manage them together. You may have a performance-oriented pool, you may have a throughput-optimized pool, you may have a capacity-oriented pool. You need to be able to group them, but you can't group them unless you know exactly what the backends are capable of. So there is an element of composing them into certain logical pools.

And then you start consuming them, which is essentially creating blocks and file shares and buckets, creating objects and so on, and which predominantly happens to be most of the focus in the orchestration stacks. If you look at most of the orchestration stacks, they assume that those first three building blocks are already taken care of, so you normally don't see these orchestration stacks paying attention to them. They mostly start consuming with the assumption that the resources are already plugged in, already connected to the compute nodes, and everything is working fine.

The later part is, obviously, once you start creating the volumes and shares, how do you maintain and monitor, which is a critical component as well.
So I'm asking for 10,000 IOPS, am I really getting 10,000 IOPS?
If you look at AWS, for example, with provisioned IOPS, you are essentially asking for X number of provisioned IOPS — am I really getting that? The visibility into what I asked for versus what I am getting is somewhat challenging, depending on what tools you have and what framework you are dealing with.

And then, obviously, retiring. The key part of retiring is that you need to be able to migrate the data out of the system you want to retire, so there needs to be a mechanism to automatically migrate the data. Depending on your storage backend, sometimes you may have that mechanism, sometimes you may not. And it doesn't normally work across vendors: it probably works seamlessly within the same vendor's SKUs, but when you look at cross-vendor technologies, data migration is fairly challenging.
So for the pieces at the bottom, most of the challenge is that scale and HA have been fairly problematic. If you look at OpenStack Cinder and Manila, there is a lot of focus on ensuring that the tools can actually scale and that there is a high-availability configuration baked in, so you can deploy them for a large pool of machines and manage them. Discovery and classification is the piece that is really missing. The third element is policy-based orchestration: if I have a policy on quality of service — things like IOPS, latency, and so on — how do I abstract it and convey it all the way down to the lowest level, so it can be intelligent about how to carve out the resources. Multi-system orchestration: if you have multiple SAN backends, how do you manage them together as one entity? That has been fairly challenging with most of the backends. And then automating the data migration piece: there are lots of tools in this space, mostly vendor specific, but this is largely ignored.

When it comes to the orchestration stacks, there has been a lot of focus in this area and there are lots of tools — Chef, Puppet, Nagios, lots of tools out there you can take advantage of — but the goal is to bring in visibility into how my volume is behaving, how my share is behaving, from a performance and latency perspective. At a data center level, can you give that visibility? So that's kind of the key thing.
Yeah?
You said lots of tools — but what about EOL for tooling? Migration in that area?

Migration — again, there are certain basics. So the question is: are there lots of tools in terms of migration? There are tools. If you look at vendor-specific ones, let's say EMC has its own set of tools to migrate from EMC equipment to EMC equipment. So do NetApp, Huawei, Hitachi and so on, and IBM, of course. I think there are a couple of proprietary vendors who can actually do cross-vendor migration. There are some brute-force implementations from an open source perspective, where they look at a volume A and a volume B and do a plain-vanilla offline export/import kind of scenario — but it's an offline migration, not really a real-time, inline migration. So it depends: if you are looking for a very sophisticated inline migration, you are mostly better off taking the commercial tools. If it is a very simple offline migration, there are lots of tools out there, but you have to do a lot of hand-stitching to get it done. I don't know if I answered your question.

Yeah. It's still an area that's very much a work in progress.
Yep, exactly.
Okay, so I'm going to take a look at a couple of orchestration stacks. Hopefully it will give you a flavor of how storage management is being done today — one from a virtualization management perspective, the other from container management.

So, OpenStack is very popular for deploying and managing clouds; it takes care of compute, network, and storage. You get all those three pieces when you look at OpenStack, and it is somewhat similar to what you look for in an AWS type of implementation, for a private cloud. At the top, Nova is the virtual machine management stack in OpenStack, and Horizon is the dashboard. When it comes to storage, we have four different building blocks. Manila is meant for creating shares. Cinder is for block: creating volumes and managing volumes for multi-backend scale-up and scale-out storage systems. Glance is mainly meant for images: if you have lots of virtual machine templates, it's a repository for managing those templates, creating them, snapshotting existing volumes, converting them into templates, and so on. And then the last one is the object store. Swift is the default object store for OpenStack. It's not really a control plane, right — it's an implementation; Swift is an end-to-end implementation of a given object store. But there is also an API, the Swift API, that other backends actually use as a mechanism to integrate into OpenStack much more seamlessly.
So if you look at the flows, I wanted to show one specific flow to give a flavor of how the storage is being orchestrated. Cinder has support for LVM, which is the local DAS use case, and then scale-up — lots of storage vendors have drivers underneath Cinder. And then we have scale-out driver plugins as well; Ceph kind of comes under the scale-out driver plugins, with the RBD plugin.

The initial step assumes that these things are actually deployed in a data center and configured. When you start configuring Cinder, you essentially provide a mechanism for how to connect to these storage backends,
and then it starts pulling certain statistics.
Things like, are the storage backends up and down?
What is the state of storage backends?
It also pulls in certain stats,
things like the storage capacity and so on.
Cinder uses that information as a mechanism to figure out
what is the best place to, you know,
place volumes when there is a request coming in.
And I will explain the flow a little bit later.
So if you look at the number one,
either you can actually do this from command line
or from UI,
you are essentially creating a volume.
And when you are creating a volume, you are not specifying I want to create a volume on
EMC equipment or scale out Ceph.
You are essentially saying I want to create a volume.
Here is the size and you can have different properties attached to it, including logical
pools.
Things like I want to create a volume
in the performance pool.
Performance pool may have three different backends,
and it will go figure out what's the best way
to pick one of them and create a volume in that backend.
So when you start creating a volume, you're not really specifying any backend, any specific instance. Cinder actually buries that abstraction underneath, and it figures out the best way to pick the right storage system and create the volume. Once the volume creation request goes through the Cinder layer, Cinder essentially figures out the best possible candidate and connects to that storage backend, using a driver, to create the volume on that backend.
And once that is created, the next step is that you want to either use that volume as a way to boot, or attach it to an existing virtual machine instance. So you normally issue boot or attach volume, which is step number three. Nova is the one that starts looking at the volume properties. It connects to Cinder and says, give me the volume properties for this volume ID. That will have information about the storage backend, and maybe the security credentials for how to connect to it, and what type of protocol it is exposing. All the details associated with that volume are provided by Cinder to Nova. Nova can also pull the virtual machine template information from Glance and use that as a way to boot the virtual machine.
So step number four essentially gives you where is the volume located, what's the boot
instance that I need to be using to start up a virtual machine.
And then it uses libvirt, which is essentially a control-plane layer for QEMU/KVM virtual machine instantiation, to really create the right set of virtual machine parameters and start the virtual machine. So that's kind of the flow of what you see in OpenStack: creating the volumes, attaching the volumes, and starting the virtual machines.

The key property is that Cinder is essentially giving you that layer of abstraction, so that you don't know which storage backend it is actually selecting. There is a scheduling component in Cinder; you can write your own scheduler. You can schedule in such a way that you are filtering based on, maybe, a specific region, or filtering based on certain performance properties. The default scheduler is fairly primitive: it looks at all the storage backends, figures out the first storage backend that has the most amount of capacity, and picks that as the candidate for creating the volume. But you can plug in your own scheduler that is a lot more sophisticated and do very intelligent things around it. So that's kind of the flow.
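As a rough illustration of steps one and three above, here is a minimal sketch using the python-cinderclient and python-novaclient libraries. The credentials, endpoint, volume type, and instance UUID are placeholders, and a real deployment would typically authenticate through Keystone sessions or just use the CLI or Horizon.

```python
from cinderclient import client as cinder_client
from novaclient import client as nova_client

# Placeholder credentials/endpoint (a real setup would use Keystone sessions).
AUTH = ('admin', 'secret', 'demo', 'http://controller:5000/v3')

cinder = cinder_client.Client('3', *AUTH)
nova = nova_client.Client('2', *AUTH)

# Step 1: request a volume against a logical pool (volume type), not a
# specific backend; the Cinder scheduler picks the backend.
volume = cinder.volumes.create(size=100, name='db-volume',
                               volume_type='performance')

# Step 3: attach the volume to an existing instance. Nova asks Cinder for the
# connection details and wires the volume into the hypervisor via libvirt.
nova.volumes.create_server_volume('INSTANCE-UUID', volume.id)
```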
There are lots of other projects in OpenStack, things like how you manage your security. There is Keystone — user credentials are Keystone — and then there is Barbican, which is security key management. And then Ceilometer is the one where you normally get telemetry information. And then Neutron is for the networking orchestration. So Neutron, Nova, Cinder, Manila, Glance — those are the five building-block components that really provide the foundation for compute, network, and storage orchestration. Okay, so hopefully you got the gist of how the flow happens.
The key thing is, with OpenStack in general — if you are aware of where OpenStack is going — there are lots of production implementations, so these layers are being used in production. There is a production level of maturity with these stacks, and that's a good thing. Lots of vendors are writing drivers for Cinder and Manila, so you see a big vendor ecosystem out there writing drivers, and you will find lots of driver support.

Again, as I mentioned on the previous slides, it obviously doesn't have storage management completely baked in. It provides constructs for how to group storage resources, but you need to have visibility into what your storage resources are and what their properties are, and you need a mechanism to logically group them yourself — manually, with custom tools, or whatever it is — and then create config files for Cinder to say: here is the performance pool, here is the capacity pool, here is the throughput-optimized pool. So you have to hand-stitch the configuration before Cinder is actually aware of what the storage resources are, what the logical groupings are, and how to consume the resources; that is something you have to do a bit manually. Scheduling and monitoring are still evolving — specifically monitoring. Lots of things are happening, but it's still not at a place where you can actually get end-to-end visibility of what the storage resources are and what their properties are.
Okay, any questions on the OpenStack so far?
Good, okay.
Yeah.
Do we have any particular...?

Yes — the Cinder basic scheduler essentially uses free space as the mechanism to pick the right backend, but there are a few different flavors of filters. There is an affinity filter, so you can say, I want to create this volume for this compute node. LVM needs an affinity filter, because you don't want to create an LVM volume on one node when your compute is on a different node. So there are a few other filters you can potentially use, and there is a mechanism to extend it as well, but the default implementations are, I would say, fairly primitive in my opinion.
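For a flavor of what extending it looks like, here is a rough sketch of a custom filter in the style of Cinder's filter scheduler from that era; treat the base class and attribute names as assumptions to check against the Cinder release you actually run.

```python
# Hypothetical scheduler filter: only keep backends advertising enough free
# capacity, with a safety margin. Names are assumptions against 2016-era Cinder.
from cinder.scheduler import filters


class MinFreeCapacityFilter(filters.BaseHostFilter):
    """Pass only hosts whose reported free capacity covers the request plus 20%."""

    def host_passes(self, host_state, filter_properties):
        spec = filter_properties.get('request_spec', {})
        requested_gb = spec.get('volume_properties', {}).get('size', 0)

        free_gb = host_state.free_capacity_gb
        if not isinstance(free_gb, (int, float)):
            # Backends may report 'infinite' or 'unknown'; be permissive here.
            return True
        return free_gb >= requested_gb * 1.2
```

Something like this would typically be registered via scheduler_default_filters in cinder.conf, next to the built-in capacity and capability filters.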
Any other questions on OpenStack? Okay. So I'm going to cover a little bit on container orchestration. How many of you have heard about Kubernetes? Okay, cool.
Fantastic. Okay, so there are two types of servers in Kubernetes. One, you can probably guess: the node is the place where you are really deploying containers. A node can be a physical machine or a virtual machine, but for you to manage a pool of machines, you need clustering software on top of it. So the master server has a list of services that essentially provide a way to manage a pool of servers.
Things like: how do you deploy, how do you select a specific pool of containers to deploy on a thousand-node cluster. There is a mechanism to schedule — the scheduler is the one that is essentially responsible for figuring out the right place to start your containers.
The replication controller is the one that essentially gives you the property that there is a minimum and a maximum you can set for a pool. It is responsible for making sure you have enough container instances running in the data center to get your job done. Think load balancing: you have a front-end web server and you say, I want 10; it will make sure there are 10 front-end container instances managing the load, and if it goes down by two or so, the replication service is the one responsible for bringing them back up. It may be because a node is down: you are asking for 10 instances, and one node that has two instances running on it goes down. It will make sure those two are started on a separate node. So the whole mechanism of detecting node state and failure, starting those instances, and making sure your criteria for the number of instances are met is the function of the replication controller.
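Just to illustrate that "ask for ten and keep ten running" contract, here is a minimal sketch using the Kubernetes Python client; the names, labels, and image are arbitrary placeholders.

```python
from kubernetes import client, config

config.load_kube_config()  # assumes a kubeconfig is available locally

# Declare the desired state: ten replicas of a front-end web container.
rc = client.V1ReplicationController(
    metadata=client.V1ObjectMeta(name="frontend"),
    spec=client.V1ReplicationControllerSpec(
        replicas=10,
        selector={"app": "frontend"},
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "frontend"}),
            spec=client.V1PodSpec(
                containers=[client.V1Container(name="web", image="nginx")]
            ),
        ),
    ),
)

# The replication controller, not the caller, keeps ten instances alive,
# restarting pods on other nodes when a node fails.
client.CoreV1Api().create_namespaced_replication_controller(
    namespace="default", body=rc
)
```

In current Kubernetes you would normally use a Deployment or ReplicaSet instead, but the desired-count contract described here is the same.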
API server is essentially a place where you use that as a mechanism
to connect to the entire cluster.
How do you bring up the nodes?
How do you offline the nodes?
How do you add additional nodes? How do you deploy the Kubernetes services on nodes? You do all these things using the command-line tool called kubectl, which uses the API server. And you can also use the REST APIs directly, if you really want, and automate it that way too.
So when you look at the node, Kubernetes has a concept called a pod. A pod is your basic abstraction for deploying compute instances, which is essentially a pool of containers. This is the bottom-most abstraction for deployment. A pod can have one container or many containers. The reason they chose the pod is that it makes it easier to manage a pool of containers from an administrative perspective, as well as things like how you manage policies for networking and other pieces. It makes it a lot easier to manage a collection of containers as opposed to managing each one as its own entity. So a pod is a collection of containers,
and there is an agent sitting on each and every node which is called kubelet.
Kubelet is essentially an entry point into a node to get anything and everything done.
So it's an agent to execute the operations on behalf of the master services.
And proxy is essentially a network proxy.
It gives you the network virtualization services that are needed on a node.
Stitching the nodes as well as the master servers together, there is a foundation layer called etcd. This is essentially a distributed key-value storage layer; it keeps all the metadata that is needed beyond node failures, as well as ensuring that your master servers can scale, and so on.
And then there is a concept called persistent volume.
So you can essentially attach persistent volumes.
There are certain policies that you can define.
There are three different policies that Kubernetes provides.
One is the capacity, how much capacity you need for a given volume.
Recycle.
And then the actual access policy. So access policy could be I want only read only volumes,
I want read write volumes, it is a private instance,
it is a shared instance.
And the recycle policy says when you are done,
do you want to actually recycle the data,
do you want to delete the volume and share,
or do you want to just leave it as is?
So, three different policies, and the policies are still evolving, right?
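To make those three policy dimensions concrete, here is a minimal sketch of a persistent volume created through the Kubernetes Python client; the NFS server, export path, and size are placeholders. Notice there is no field for IOPS or latency, which is exactly the kind of gap discussed below.

```python
from kubernetes import client, config

config.load_kube_config()  # assumes a kubeconfig is available locally

# Capacity, access policy, and reclaim (recycle) policy -- the three knobs
# described above. Server, path, and size are hypothetical placeholders.
pv = client.V1PersistentVolume(
    metadata=client.V1ObjectMeta(name="shared-data"),
    spec=client.V1PersistentVolumeSpec(
        capacity={"storage": "100Gi"},                 # how much capacity
        access_modes=["ReadOnlyMany"],                 # read-only, shared
        persistent_volume_reclaim_policy="Recycle",    # scrub and reuse on release
        nfs=client.V1NFSVolumeSource(server="10.0.0.5",
                                     path="/exports/shared"),
    ),
)

client.CoreV1Api().create_persistent_volume(body=pv)
```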
The key thing is that this is a growing community and, specifically, a very good framework for container orchestration, with sophisticated scheduling and lots of backing from a wide variety of storage vendors as well as other big companies.
The big thing is that the storage interfaces are still evolving.
As I said, there is a capacity-based construct, as well as a little bit of access policy and how you want to recycle the data. But things like performance attributes — how do you describe them, how do you provision against them — are still being worked out. And most of the storage management is out of scope: it assumes that everything is provisioned and you're consuming it, so it ignores the fact that there is a provisioning part, a discovery part, a pooling part, and a monitoring and maintenance part — there is a little bit of monitoring and maintenance, but it is fairly basic.

So you kind of get the theme from OpenStack and container orchestration: they're very good at provisioning the volumes and shares. Your scheduling may not be sophisticated — your mileage will vary based on which orchestration stack you're picking. Every other area is completely ignored.

So, there are lots of other open source flavors.
If you look at Apache Mesos, it's essentially an application framework: you can deploy big data, you can deploy infrastructure as a service, and they can coexist on the same data center infrastructure. It has a mechanism to up-level the abstraction to applications, and you can use those APIs as a way to write your own application frameworks and deploy them in a data center. Mesos is very popular. Docker Swarm is mainly meant for native clustering of Docker container instances. CoprHD — I presented this with EMC last year — is another open source flavor of a software-defined storage controller. CloudStack is very similar to OpenStack; it's an equivalent of an OpenStack flavor. Eucalyptus is somewhat similar to OpenStack and CloudStack, but it is tilted more toward AWS-friendly, integration-friendly orchestration. And there are several others, like Giant and OpenNebula — there are a few others out there in the industry. So you get a flavor that there are lots of open source flavors out there, and all of them are trying to figure out the best way to integrate storage resources.

If you look at the first three — Kubernetes, Docker, and Mesos — virtually every one of them has its own mechanism to integrate persistent volumes, which are essentially volumes coming from network-attached storage. Depending on which orchestration you are picking, your integration touch points will vary, and there are drivers in each one of them. And then EMC has something called libStorage and REX-Ray, and there is also the FlexVolume concept from Diamanti.
The concept is,
instead of me writing a driver,
I will give you an abstraction layer.
So you write to my abstraction.
I'll make sure that everything plugs
into container frameworks,
whether it is Kubernetes,
whether it is Mesos,
whether it is Docker — I'll make sure they plug in very seamlessly. So that's kind of the concept of what you see with the EMC and FlexVolume approach: a framework that lets you write drivers so you don't have to worry about which stack you are plugging in underneath. And then the last one is the OpenStorage one. This mostly revolves around having an API abstraction — both for the front, the northbound APIs, as well as the southbound APIs — as a mechanism to write drivers seamlessly. That has been the approach from a specification perspective — not a standard, but rather a specification and reference implementation.
So the last four are mainly revolving around I want to create an abstraction, you can write
a driver based on my abstraction, I will make sure it integrates seamlessly with all the
other orchestration stacks.
Mostly focusing on cloud native computing frameworks which are in the first three buckets.
Okay, so what is the problem with this?
So if you piece all the things together,
so if you look at all the containers as well as the OpenStack frameworks,
this is just a sample of four, but when you add both proprietary
as well as the other open source flavors,
you can actually see the magnitude of the problem.
So what is the problem here?
If you look at it from top to bottom, it is very important for us to look at it from a storage perspective. Can I attach storage that is sitting on a compute node, which is called DAS? Can I do that for the iSCSI targets as well as the NVMe over Fabrics targets? Can I allocate storage resources in a way that is somewhat compatible with network-attached storage as well — scale-up or scale-out? There is no parity between these three different types of instantiation — DAS, iSCSI and NVMe over Fabrics target implementations, and scale-up and scale-out storage solutions — and there's not a whole lot of commonality when it comes to implementing those things. So the goal is to figure out the best way to do that. Rather than having every vendor and every abstraction out there trying to position themselves as, hey, I have my own abstraction, why don't you write a driver underneath it, our goal is: is there a way to create a common abstraction that everyone can participate in and influence, as opposed to every vendor trying to drive this in a different direction?

And the last thing, from a storage vendor's perspective: in an ideal situation, I want to write one driver. Writing a driver and certifying a driver is a nightmare; it takes a significant amount of effort to make that happen. So storage vendors are asking: can I write a driver, test it thoroughly, and from there onwards treat it as an integration problem, as opposed to really writing a driver for every stack and every orchestration? So the storage vendor ecosystem pain point is: write once and reuse everywhere.

So if I were to do this cleanly, what would it look like?
Yeah, the picture is beautiful, but the challenge is going to be how we enable this, right? The way we are looking at it is: we'll have an open SDS orchestration layer — we are calling it the open SDS controller — that will have plugins for all the orchestration stacks. That includes traditional computing platforms like OpenStack and CloudStack, as well as the cloud native computing frameworks. Plus, our hope and prayer is that the proprietary flavors will also come in, as a way to plug in the VMware and Microsoft stacks as well. So that's essentially the top portion.

But for us to do that, we really need to figure out the best way to abstract the storage resources. What is the basic set that we can actually use as a starting point? Is it a performance-oriented construct like IOPS, latency, throughput, space efficiency, plus data services, things like encryption and compression? So we are looking at a few different concepts to say, this is good enough, and that will be the level of abstraction for us to actually provision the storage resources, and we use that as a common integration mechanism with all of these orchestration stacks. That's the goal.
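None of the OpenSDS interfaces had been published at the time of this talk, so the following is purely a hypothetical sketch of the shape being described: a profile carrying performance and data-service attributes, a southbound driver a vendor writes once, and a thin northbound plugin per orchestrator.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class StorageProfile:
    """Hypothetical common abstraction: performance plus data services."""
    capacity_gib: int
    min_iops: int = 0
    max_latency_ms: float = 0.0
    min_throughput_mbps: int = 0
    # e.g. {"encryption": True, "compression": False}
    data_services: dict = field(default_factory=dict)


class StorageDriver(ABC):
    """Southbound interface: a vendor writes and certifies this once."""

    @abstractmethod
    def discover(self) -> list:
        """Report the systems and pools this backend exposes."""

    @abstractmethod
    def create_volume(self, profile: StorageProfile) -> str:
        """Carve out a volume that satisfies the profile; return its ID."""


class OrchestratorPlugin(ABC):
    """Northbound interface: one thin plugin per orchestrator
    (OpenStack, CloudStack, Kubernetes, Mesos, VMware, ...)."""

    @abstractmethod
    def handle_request(self, profile: StorageProfile) -> str:
        """Translate the orchestrator's request into the common profile."""
```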
Then, when it comes to integrating with the existing ones, our goal is not to start everything from the ground up. Can we reuse the existing frameworks, take advantage of their maturity, and address the problems that are not being addressed today — things like automated discovery and pooling, which nobody has addressed? So we are going to look at those pieces, as opposed to, oh, let's do this whole thing from the ground up again. So: reuse the existing open source building blocks that are mature as a starting point. That includes the drivers. Cinder has gotten significant traction; we should be able to use the Cinder framework as well as its drivers as a starting point, as opposed to really writing from scratch. So the goal is to reuse as much as possible, address the critical pain points, and lay the foundation.

And at some point we will look for a common API here. There are a few elements happening in the Swordfish area that we'll be able to take advantage of — enclosure management APIs, as well as the NVMe over Fabrics target side: how do you discover them, how do you pool them, how do you abstract them. So the goal is to integrate with the standards bodies and make sure that is happening very seamlessly.
So that's essentially the focus: how do we simplify the integration of storage management, how do we reuse the existing building blocks and address the real pain points. Okay — I kind of summarized this on the previous slide: focus on the real-world end-customer pain points.
I think the ones I really talked about: scale and HA — "this is very painful for me, we need to make sure it's taken care of" — has been one of the top end-user pain points. A way to discover and pool the storage resources has been the second most requested. Automated data migration has been one of those critical components: I need to be able to retire systems, but I have data I need to migrate — can I do that without going through an expensive storage layer that I need to buy, which has lots of limitations and works only with certain vendors and certain combinations? So: focus on the real pain points, and ensure we are focusing on the ones that are seen as priorities from the customer side. Look at integrating with the broader orchestration ecosystem — the virtualization backends as well as the container frameworks. Reuse the open source building blocks wherever it makes sense.

And then the last goal is very important: we really want to make sure the whole community is coming around this. The storage ecosystem needs to feel the pain, and they need to be part of this, as opposed to not being part of it. So that's an important element — standards bodies; I talked about Swordfish as one of the critical elements here, and CDMI could play a very critical role too. And then, obviously, the big service providers need to be part of it, to influence what is really useful, compared to just developers coming in and doing the work. So we are looking at essentially a vibrant ecosystem that has a mix of storage vendors, standards bodies, as well as end customers, to be able to influence the storage management problem in a broader way.
So there are discussions going on and our goal is to announce this sometime this year.
Because of all the NDA discussions that are going on, I can't really reveal who those companies are, when it is going to be announced, all that stuff — but it's coming up this year pretty soon.
Hopefully next month, but sometime this year.
You will see that there is an open SDS controller effort.
It will start under some sort of foundation — it could be OpenStack, or it could be the CNCF, which is the Cloud Native Computing Foundation, or the Linux Foundation — one of those three.
But in general, it's a broader ecosystem coming together to solve this problem.
And really looking forward to you guys actually looking at this effort
and see what makes sense from a contribution perspective.
We'd love to have you join this effort as well.
Okay, so that's my last slide. We have time for Q&A, so let me know if this makes sense. Are we solving the right problem? Is this the right thing to do for the industry? Give us feedback. Questions?

So for example, we have kind of...

Yeah, so there is a prototyping effort that has been done by Intel and a few other companies.
We were essentially looking at how you discover and how you pool. Obviously, for the discovery component, because there is a lack of a uniform discovery protocol, you will have to have a driver mechanism. So we quickly settled on the need for drivers: if someone wants to discover their storage backends or scale-out systems, they need a driver to do the discovery. So we settled on the need for a driver-based discovery mechanism, as opposed to going off and solving that interface problem.
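A driver-based discovery interface of that sort might look something like this; it is a sketch only, with a made-up backend description, not code from the actual prototype.

```python
from abc import ABC, abstractmethod


class DiscoveryDriver(ABC):
    """Hypothetical per-backend discovery driver (no uniform protocol exists)."""

    @abstractmethod
    def discover(self) -> list:
        """Return descriptions of the storage systems and pools this backend exposes."""


class CephDiscoveryDriver(DiscoveryDriver):
    """Illustrative only: a scale-out backend reporting its pools."""

    def discover(self) -> list:
        # A real driver would query the cluster's management interface;
        # this static answer just shows the shape of the result.
        return [{"system": "ceph-cluster-1",
                 "pools": ["performance", "capacity"],
                 "protocol": "rbd"}]


# The controller aggregates whatever each driver reports into one inventory.
drivers = [CephDiscoveryDriver()]
inventory = [resource for driver in drivers for resource in driver.discover()]
```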
There is also some work going on at NetApp. They have been doing quite a bit of this and they talked about it at SNIA two years ago or so. They are looking at SLO-based constructs: what's the best way to create a normalized storage abstraction from an orchestration-layer perspective, along different dimensions — things like performance and data services. Performance could be, how many IOPS do I need for this volume; data services could be, do I need compression, do I need at-rest encryption, what should my encryption management story be around it. So they have actually created a nice set of abstractions, and we are looking at some of that. There is also a bit of the CDMI spec that has a very good set of abstractions, but it is meant essentially for object-based backends. Still, we could beg and borrow from the existing standards elements as well, to see what is the best way to start.
So there is some prototyping effort going on.
There are a couple of other companies doing some work here. As I showed, EMC has done some of that standardization for the container frameworks Kubernetes and Mesos using libStorage and REX-Ray, so there are working examples there. The goal is: how do we take all these elements that are happening discretely, come together in a common way, and do this in an open source community, to really address storage management in one place — as opposed to addressing storage management separately in OpenStack, CloudStack, Kubernetes, Mesos, Docker Swarm, you name it, plus the proprietary stacks. So that's kind of the theme. So yes, there are a few prototyping efforts actually in flight. We looked at the right design point for doing this effort.
There has been a lot of deliberation behind the scenes to figure out whether we want to do something from scratch. It looks very sexy to say, write this in Go — you can attract a lot of the developer community out there by saying, hey, this is the open source SDS controller, we are going to write it in the Go language, and you will attract a reasonable number of developers to come in and pile on and do this work. But storage management is a very hard and painful problem; there is a reason these companies haven't come together in the past to do this. So we talked quite a bit and said the best starting point is the prior art out there: OpenStack Cinder and Manila have a very nice foundation, and there is a vibrant driver vendor ecosystem out there. Why don't we start from there, add on modules, and then plug in everywhere — that has been one of the options being explored. So I think, in reality, there are lots of moving parts. How do we take advantage of those moving parts to really do this in one place, as opposed to discrete places, and then focus on addressing the gaps?

Any other questions? Good? All right.
You get 15 minutes back.
Is it 15 minutes or six minutes?
One of those two.
All right.
Thank you.
Thanks for listening.
If you have questions about the material presented in this podcast,
be sure and join our developers mailing list
by sending an email to
developers-subscribe@snia.org. Here you can ask questions and discuss this topic further with
your peers in the developer community. For additional information about the Storage Developer
Conference, visit storagedeveloper.org.