Storage Developer Conference - #51: USB Cloud Storage Gateway
Episode Date: July 18, 2017...
Transcript
So I'll start off with a quick run-through of what I'll cover in this presentation.
First a project introduction, looking at the plan for this USB storage gateway.
Then a quick look at Ceph. Next, a look at USB storage, particularly on Linux, then a demonstration of this USB storage gateway,
and then a public cloud gateway implementation using Microsoft
Azure, and finally some future challenges that could, or should, be addressed
in this project.
So this project was conceived during SUSE Hack Week. I work for SUSE Linux, and
normally once a year we get a week of time to work on whatever project we feel like. In this case I had an ARM board
which I'd purchased recently, a Cubietruck.
It was gathering dust in the corner and I thought, okay, I'd like to
do something with it. The goal
of Hack Week is always to learn something new,
and being a storage developer,
I wanted to do something storage related.
That's where I thought, okay,
how about I work on a USB storage gateway for Ceph.
In this case we have our Ceph storage cluster,
which offers us huge amounts of redundant,
reliable, scalable storage at the back end.
If I want to access that storage with a device that doesn't have a Ceph client built in,
then a gateway like this would be useful to translate those Ceph IO requests at the back end
to just basically USB storage on the front end.
Reiterating the goals of the project: I can then access the cloud from anything. My television
and stereo both have USB ports, and they both support mass storage, so with something like
this I can just plug it in and use Ceph for everything, for all storage.
Another option is that I could use it to boot from the Ceph cluster. My laptop's
BIOS doesn't support booting from Ceph itself, but in that case I could
boot from USB, which is backed by the Ceph cluster, and everything works.
The final goal was simple configuration. I just wanted to have something which was as
easy to use as possible: basically you set it up once, then you can forget about it and just plug and unplug it
like you would a normal USB key.
So now, a look at Ceph. Ceph is
a really cool open source free software project
that handles aggregation of storage
across multiple nodes.
So basically you're pooling the storage resources from this cluster
into a logical pool, which you can then divide up
and use as a file system, an object store, or a block device store.
It's highly available, with no single point of failure.
It does scrubbing and replication; everything's managed by Ceph itself.
It's incredibly scalable, so you have organisations like CERN
using it, with something like 30 petabytes of data, I think, that they've
played with.
It's a really cool project.
It has, on the client side, a number of different interfaces.
At the base there we have RADOS, the Reliable Autonomic
Distributed Object Store, which handles storage of the
objects and everything that's needed from this object storage cluster.
On the left we have librados, so we can basically integrate our application as a Ceph client
using librados. There's the object storage gateway, which supports the
S3 and Swift protocols. There's the block device interface, the RADOS Block Device, which
is what I'm focusing on for this talk and this project. Finally, there's also a POSIX
clustered file system with CephFS.
So a few words on the block device interface. This is basically a block device image
which is backed by objects within our Ceph cluster.
It has a number of cool features like thin provisioning,
online resize, and snapshotting.
On the client side for RBD, we have the in-kernel
Linux client, and there are also user space clients using librbd, so there's QEMU support where
you can have your VMs plumbed directly into, or utilising directly, those RADOS block device
images on the Ceph cluster.
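As a rough illustration of that user-space client path, here is a minimal sketch using the rados and rbd Python bindings that ship with Ceph; the pool name, image name, and the 10 GiB size are just assumptions for illustration.

```python
#!/usr/bin/env python3
# Minimal sketch of the librados/librbd user-space path described above.
# Assumes a reachable cluster, /etc/ceph/ceph.conf and a keyring; the pool
# name "rbd", image name "usb" and 10 GiB size are illustrative only.
import rados
import rbd

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
try:
    ioctx = cluster.open_ioctx("rbd")                   # open the pool
    try:
        rbd.RBD().create(ioctx, "usb", 10 * 1024**3)    # thin-provisioned image
        with rbd.Image(ioctx, "usb") as image:
            image.write(b"hello from librbd", 0)        # write at offset 0
            print("image size:", image.size())
    finally:
        ioctx.close()
finally:
    cluster.shutdown()
```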
So the hardware I was using for this project: I mentioned I had this Cubietruck lying around.
It isn't ideal for a storage gateway; it has heaps of interfaces which just aren't needed for this.
So after initially implementing it with the Cubietruck, I then moved on, just
recently, to this NanoPi Neo. This is a lot cheaper, it costs under $10, and it's got everything that's needed for this project, so USB On-The-Go and a 100
megabit network interface. Another board which I've looked at doing the same thing on is
the CHIP from Next Thing Co, who are based in Oakland. Rather than
a wired Ethernet interface,
that one has wireless onboard, which is quite cool.
Also under $10, I should mention.
So the main priority in selecting the hardware was that I wanted support
for the mainline Linux kernel.
So there are thousands
of boards out there, most of them use some sort of outdated kernel which isn't updated
that much. With the Allwinner chips, there's this excellent sunxi community behind them.
They're not working directly with Allwinner, and not really sponsored by Allwinner,
but they just do awesome things with these chips and boards.
There's an openSUSE Tumbleweed port, at least for the Cubietruck board I mentioned earlier.
Hopefully we'll have something for the NanoPi Neo at some stage as well.
For the price, it's quite performant.
As I mentioned earlier, it has a 100 megabit network interface, a quad-core ARMv7 CPU, and 512 megabytes of RAM.
It's sort of everything that's needed.
Sorry?
It's called the NanoPi Neo; it's produced by a company called FriendlyARM.
So now USB storage.
We have basically two mechanisms for transporting SCSI over USB: the old bulk-only transport, and USB Attached SCSI, which has some better performance-oriented features, so command queuing and out-of-order completion, which are quite interesting performance-wise. On the Linux kernel side we then have the mass storage gadget kernel module,
which implements just the bare bones, a very minimal SCSI target, and handles everything
within that module; and there's also the TCM gadget kernel module, which plumbs into the LIO stack on Linux, the generic SCSI target on Linux.
So for this project, everything's basically already covered: the kernel does everything that's
needed and there's really not much that needs to be implemented here. We have the kernel RBD client for Ceph, we have USB gadget
or device mode support, and we also have LUKS or dm-crypt support, which is cool if
we want to do transparent encryption and decryption on the gateway itself. So in the end it's just about handling configuration, making that as easy as
possible for the user. The user or administrator can just copy their
Ceph credentials and configuration file onto the device, plus the decryption key if they want
encryption, and then all we need to do is handle mapping
and exposing that via USB.
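To make that concrete, the steps the gateway automates boil down to something like the following sketch, expressed as the equivalent Linux commands driven from Python; the pool and image names, the key path, and the use of the g_mass_storage module are assumptions rather than the project's actual scripts.

```python
#!/usr/bin/env python3
# Rough sketch of the steps the gateway automates, expressed as the
# equivalent Linux commands.  The pool/image names, key path and the use
# of g_mass_storage are assumptions, not the project's actual scripts.
import subprocess

def run(*cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

POOL, IMAGE = "rbd", "usb"            # assumed pool/image
LUKS_KEY = "/config/dmcrypt/key"      # assumed key location

# 1. Map the RADOS Block Device image with the in-kernel client.
run("rbd", "map", f"{POOL}/{IMAGE}")
dev = f"/dev/rbd/{POOL}/{IMAGE}"

# 2. Optional: transparent encryption on the gateway via LUKS/dm-crypt.
run("cryptsetup", "open", "--key-file", LUKS_KEY, dev, "usb_crypt")
dev = "/dev/mapper/usb_crypt"

# 3. Expose the (decrypted) device as a USB mass storage gadget.
run("modprobe", "g_mass_storage", f"file={dev}", "removable=1")
```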
So here's a look at how that's implemented,
basically as a sequence or a boot sequence
for this little board.
We're basically booting a normal Linux distribution and kernel. What I then do is have it provision a file system, just a FAT32 file system
backed by a RAM disk. This is the configuration file system, so the user sees a regular FAT file system, can copy on
their Ceph credentials, and can specify which RADOS block device image they want
to have exposed via USB. What we then do is handle the eject event: once the initiator or the user ejects the file system or the
device, we intercept that, commit that configuration, and can then go ahead
and map the RADOS block device image and expose it via USB.
So now, a demonstration. What I have here on my laptop is a Ceph cluster,
so to speak: just a vstart Ceph cluster with three OSDs and one MON. Pretty minimal, but it
does what's needed for this demonstration.
So I'll just connect my board then, or the USB gateway, to my Ceph cluster.
And the board is then also powered by this USB cable, so power and data. I haven't really optimized boot time at this stage on this board.
What we can do is,
when configuration has been completed by the user, subsequent plug-in attempts
can use a fast path. So basically, rather than
showing the config file system and waiting for the eject, we just map the RADOS block
device image and expose it immediately. It still takes a while to boot.
I did have the Cubietruck board down to under five seconds for the boot
time just by doing everything from
the initramfs.
Good. Well, the config file system has come up, so I'll just bring that window over.
There we can see it on the left there.
So here we have, this is the configuration file system which is provisioned by the board
and is now exposed by USB.
So a client now just comes along, or sorry, a user comes along.
They have their Ceph configuration and Ceph keyring.
They can just copy that into the Ceph directory.
There's a dmcrypt directory, which takes a LUKS key if the user wishes to have
transparent encryption and decryption on the device. And there's a flag
whose removal triggers the
fast path, so that we don't
show the config file system.
I actually want to remove that for a later test.
And finally just the generic
global config which says basically which image should be exposed.
So in this case, I have a pool on Ceph called rbd,
which is the default RADOS Block Device pool,
and I have an image within that pool called usb.
So the configuration looks good.
I'm happy with that.
So I can go ahead and eject.
And as I mentioned earlier, what happens is we then
intercept that eject event, and then do the mapping, and
then expose it via USB.
So we can see on the left here, I now have this 10
gigabyte device, which
corresponds to my RADOS block device image. Everything has now happened on
the gateway: it's mapped the image and exposed it via USB, so I can go ahead and
utilize that storage. This is where, as mentioned at the start, my initial idea was to have this storage
used by stupid devices rather than something which has a Ceph client built in, like my laptop.
So to actually demonstrate that support I just wanted to connect the gateway to a mobile phone.
So I have here just a regular Sony Xperia Android phone.
This has a USB on the go port which is capable of powering my USB gateway.
So I'll just go ahead and connect that.
I'll just bring up the slides again.
So in this case we've removed that
flag to request the fast path for the next connection, which means
that the phone doesn't see the config file system; it just sees the RADOS block device
image immediately mapped.
So with that, I will just actually make use of the storage first.
So I'll just take a quick photo.
Good.
And I'll just copy that photo to the storage.
Good. So the photo is now copied onto the Ceph cluster via this USB gateway. I can now safely remove it, unmount the file system,
and it's good to go.
So now, just to show that it worked
and actually copied something,
what I'll do is map
the RADOS block device using the in-kernel Ceph client.
You should see... there we go, on the left again, the same 10 gigabyte drive came up.
In this case it's just using the Ceph client on my laptop,
so it's not making use of the gateway at this stage.
And looks good.
There we can see the pretty bad photo I took earlier.
So that's the demonstration out of the way.
Thank you.
So with this same device I also hoped to use it as a gateway for the public cloud.
In this case, instead of having the Ceph cluster backing what's exposed by the gateway, I wanted
to use the public cloud, Microsoft Azure in this case, to back those images.
This is possible using the page blob protocol
on Microsoft Azure, so basically the VM images
on Microsoft Azure are accessible using
this page blob protocol.
It supports 512-byte IOs at arbitrary offsets,
which is obviously perfect for
a regular block device.
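As a rough sketch of what those page blob operations look like on the wire, here is an illustrative Python example that creates a fixed-size page blob and writes a single 512-byte page; the account, container, blob name, and SAS token are placeholders, and Elasto itself implements this in C rather than with Python requests.

```python
#!/usr/bin/env python3
# Illustrative sketch of Azure page blob REST calls backing a block device:
# create a fixed-size page blob, then write one 512-byte "sector".
# Account, container, blob and SAS token are placeholders.
import requests

ACCOUNT = "myaccount"                 # placeholder storage account
CONTAINER, BLOB = "devices", "usb.img"
SAS = "sv=...&sig=..."                # placeholder SAS token with write access
BASE = f"https://{ACCOUNT}.blob.core.windows.net/{CONTAINER}/{BLOB}"
VER = {"x-ms-version": "2015-02-21"}

# Create a 10 GiB page blob; the size must be a multiple of 512 bytes.
requests.put(f"{BASE}?{SAS}",
             headers={**VER,
                      "x-ms-blob-type": "PageBlob",
                      "x-ms-blob-content-length": str(10 * 1024**3)}
             ).raise_for_status()

# Write the sector at byte offset 4096: ranges are inclusive and 512-aligned.
requests.put(f"{BASE}?comp=page&{SAS}",
             headers={**VER,
                      "x-ms-range": "bytes=4096-4607",
                      "x-ms-page-write": "update"},
             data=b"\x00" * 512
             ).raise_for_status()
```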
So in this case, I worked on this using
a user space back end for the Linux
in-kernel SCSI target.
In the kernel we have just the regular
LIO SCSI target,
which has, on the transport side, support
for Fibre Channel, iSCSI,
USB, and a number of other things.
On the back end we have
a number of different storage engines, so the SCSI
logical units can be backed by a file or by a block device. What I then used
is the TCMU back end: basically, the
LIO target forwards the raw SCSI requests up to user space via this TCMU kernel module.
So in this case what I have is the Elasto cloud project, which basically implements the Azure page blob protocol;
that then handles mapping those IO requests into cloud IO.
The Elasto project also supports the Azure File service and the Amazon S3 protocol, but that's not what's in use here; it's just
the page blob API. Then we have a TCMU Elasto handler: as
I mentioned, TCMU forwards the SCSI requests up to user space,
and within user space we have a plug-in for this TCMU component, which then maps those SCSI
requests directly to cloud IO requests.
So there's sort of a diagram of what we have.
We just have on the left the client side, just using a regular block device.
We then have on the gateway just the regular USB gadget stack
with the LIO SCSI target, TCMU, and finally Elasto at the top
to actually do the Azure protocol requests.
And yeah, now on to testing.
So for testing, yeah, the implementation of this,
I didn't really want to deal with, you know,
plugging and unplugging cables all the time.
So there is this cool kernel module on Linux, dummy_hcd, which basically loops
any USB gadget back onto the same system.
So that's been really useful
for testing this implementation quickly.
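A minimal sketch of that loopback setup, assuming a scratch backing file and the g_mass_storage gadget, might look like this:

```python
#!/usr/bin/env python3
# Sketch of the dummy_hcd loopback test setup: the emulated host/device
# controller lets a mass storage gadget backed by a scratch file show up
# as a local USB disk, with no cables involved.  Paths are assumptions.
import subprocess

def run(*cmd):
    subprocess.run(cmd, check=True)

BACKING = "/tmp/gadget-test.img"

run("truncate", "-s", "1G", BACKING)          # sparse 1 GiB backing file
run("modprobe", "dummy_hcd")                  # emulated UDC + host controller
run("modprobe", "g_mass_storage",             # gadget enumerates locally;
    f"file={BACKING}", "removable=1")         # check lsblk for the new disk
```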
Some of the challenges, yeah, with something like this,
so I mean it costs $10 or under $10, so you could obviously just buy
heaps of gateways and expose the same image to multiple systems via this gateway.
With something like the FAT file system, or any non-clustered file system,
that would be problematic,
and it's not currently handled.
One option would be to have the gateway request an exclusive lock on the image
and then snapshot it,
so that any subsequent gateway that connects
is just fed a read-only snapshot. That might make things a little
smoother for users.
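One possible shape of that idea, sketched with the rbd Python bindings; the image name, lock cookie, and snapshot name are made up, and this is not the gateway's actual code.

```python
#!/usr/bin/env python3
# Sketch of the exclusive-lock-plus-snapshot idea using the rbd Python
# bindings.  Image, lock cookie and snapshot names are made up.
import rados
import rbd

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("rbd")

# First (writable) gateway: take an advisory lock and publish a snapshot.
with rbd.Image(ioctx, "usb") as image:
    image.lock_exclusive("gateway-1")     # advisory lock, cookie names us
    image.create_snap("gw-ro")            # point-in-time snapshot
    image.protect_snap("gw-ro")           # keep it from being deleted

# Any subsequent gateway: open the snapshot read-only instead of the head.
with rbd.Image(ioctx, "usb", snapshot="gw-ro", read_only=True) as snap:
    print("read-only snapshot size:", snap.size())

ioctx.close()
cluster.shutdown()
```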
Some other challenges: power. This board doesn't have a battery or anything in there, so it's very dependent on a reasonable source of power from the USB host.
That's not something you can always rely on, so one option would be to add a very small
battery to the board.
I mentioned the CHIP from Next Thing Co before; it has a battery connector where you could do something
like that.
Using the TCM USB gadget back end, I had some problems on the board itself: it seems
to work when I use dummy_hcd,
but it wasn't working on proper hardware. That's no big deal, though, because I can just use the TCM
or LIO loopback, have the image exposed as a local block device, and then
use the mass storage kernel module. Caching: many of these boards have
onboard NAND, which could be utilized for caching on the gateway itself.
So something to also look at.
Performance: I mentioned boot time earlier, and at the moment it's pretty slow to boot, so
getting that down to as little time as possible would be helpful for users. On the IO side, yeah, the
performance isn't great, but I think in the end it's sort of made for these
stupid devices which don't normally require awesome
performance. You could, though, look at using USB 3 or Gigabit Ethernet on the board, but then your price
goes up quite a bit, so it's a trade-off.
A few final words. All I can say is Ceph is amazing; use it. It is a little limited in that you don't have a Ceph client on everything, and this
is where the gateway is helpful. USB storage is everywhere: everything nowadays
has a USB port, which you can for the most part use for mass storage. Encryption on a gateway device like this is, in my opinion, something
I wouldn't do without for the public cloud: I would never trust the public
cloud with my unencrypted data, and I would always want to own the keys. Having that on the gateway itself is, I think, quite a nice solution for that.
It's also incredibly cheap, so you can certainly buy something like that for all of your devices.
They support mainline Linux, so you don't need to worry too much about running, say,
an out-of-date kernel with security vulnerabilities or something like that.
But otherwise, yeah, that's it.
Any questions?
Anything?
No, it doesn't look like it. I'll just finish off by saying thanks to the sunxi
community. I don't know the guys, but they've done really just some awesome work on these
boards. Also the LIO target developers, so the Linux SCSI target developers, and this
board itself I was given for free from FriendlyARM,
so thank you to FriendlyARM for that as well. Yes? Can you try... with the Ceph back end? Yeah. Yeah, so I can use it as a regular file system.
I don't use it in daily life at home yet, but yeah, that was certainly my plan:
just to have my Ceph cluster in the basement and then for any device,
so my stereo or TV, I can just then plug one of these things into and not need to deal
with plugging and unplugging USB sticks everywhere.
I don't know much about the network capacity of these boards.
How does that affect it?
I'll just repeat the question. So the network performance of the board, was that your question?
How does it impact your...
It's really not great. It's a 100 megabit network interface, and you get around 10 megabytes per second of IO to the
RADOS block device via this gateway, so
it's not great. But yeah, I mean, you can use something with a quicker
network interface; it's just going to cost you more. The
Cubietruck, for instance, has a gigabit interface.
The CPU can't quite drive it at that rate, but it's still
better than something like the Neo.
Yes?
You're talking about battery power and plugging it into a TV somewhere.
Yes.
So you don't have wires snaking all over the house on the Ethernet side, right?
Yeah, so this is where the CHIP from Next Thing Co comes in; that has Wi-Fi built in. I haven't worked on getting this gateway up on the CHIP,
and it doesn't have an openSUSE port at the moment, but it certainly should be an option.
It's basically a very similar chip to the other boards.
Is it tricky with the network setup? Is just Linux doing it?
At the moment it just supports DHCP.
Ideally, with the config file system, you would be able to do the network configuration through that as well:
just via a text file where you enter static or DHCP settings and do what you need to do there.
Okay.
Thanks again.
Thanks for listening.
If you have questions about the material presented in this podcast,
be sure to join our developers mailing list by sending an email to developers-subscribe@snia.org. Here you can ask
questions and discuss this topic further with your peers in the developer community.
For additional information about the Storage Developer Conference, visit storagedeveloper.org.