Storage Developer Conference - #51: USB Cloud Storage Gateway

Episode Date: July 18, 2017

...

Transcript
Starting point is 00:00:00 I'll start off with a quick run-through of what I'll cover in this presentation. First a project introduction, looking at the plan for this USB storage gateway. Then a quick look at Ceph, then USB storage, particularly on Linux, then a demonstration of the USB storage gateway, then a public cloud gateway implementation using Microsoft Azure, and finally some future challenges that could or should be addressed in this project. This project was conceived during SUSE Hack Week. I work for SUSE Linux, and normally once a year we get a week to work on whatever project we feel like. In this case I had an ARM board
Starting point is 00:01:08 which I'd purchased recently, a Cubietruck board. It was gathering dust in the corner and I thought, okay, I'd like to do something with it. The goal of Hack Week is always to learn something new, and being a storage developer, I also wanted to do something storage related. That's where I thought, okay, how about I work on a USB storage gateway for Ceph.
Starting point is 00:01:36 In this case we have our Ceph storage cluster, which offers huge amounts of redundant, reliable, scalable storage at the back end. If I want to access that storage from a device that doesn't have a Ceph client built in, a gateway like this is useful: it translates Ceph IO requests at the back end into plain USB mass storage on the front end. Reiterating the goals of the project: I can then access the cloud from anything. My television and stereo both have USB ports, and they both support mass storage, so with something like
Starting point is 00:02:25 this I can just plug it in and use Ceph for everything, for all storage. Another option is to use it to boot from the Ceph cluster: my laptop's BIOS doesn't support booting from Ceph, but I could boot from USB backed by the Ceph cluster, and everything works. The final goal is simple configuration. I wanted something as easy to use as possible: you set it up once, then forget about it and plug and unplug it like you would a normal USB key. So now a look at Ceph. Ceph is
Starting point is 00:03:16 a really cool open source, free software project that handles aggregation of storage across multiple nodes. Basically you're pooling the storage resources of the cluster into a logical pool, which you can then divide up and use as a file system, an object store, or a block device store. It's highly available, with no single point of failure. It does scrubbing and replication; everything is managed by Ceph itself.
Starting point is 00:03:57 It's incredibly scalable; organisations like CERN have played with something like 30 petabytes of data on it, I think. It's a really cool project. On the client side it has a number of different interfaces. At the base there we have RADOS, the Reliable Autonomic Distributed Object Store, which handles storage of the objects and everything else that's needed from this object storage cluster.
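As a rough illustration of how small the client-side surface of that object store is, here is a minimal python-rados sketch that stores and reads back a single object; the ceph.conf path, pool name and object name are placeholders, not anything from the talk.

```python
import rados

# Connect to the cluster using a local ceph.conf and the default keyring
# (paths and the pool name are assumptions for this sketch).
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
try:
    ioctx = cluster.open_ioctx("rbd")   # any existing pool will do
    try:
        # RADOS stores the object redundantly across OSDs; the client
        # just sees a flat object interface.
        ioctx.write_full("hello-object", b"stored by RADOS")
        print(ioctx.read("hello-object"))
    finally:
        ioctx.close()
finally:
    cluster.shutdown()
```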
Starting point is 00:04:34 On the left we have librados, so we can integrate our application directly with a Ceph client using librados. There's an object storage gateway, which supports the S3 and Swift protocols. There's the block device interface, the RADOS block device, which is what I'm focusing on for this talk and this project. Finally, there's also a POSIX clustered file system, CephFS. So a few words on the block device interface. This is basically a block device image which is backed by objects within our Ceph cluster. It has a number of cool features like thin provisioning,
Starting point is 00:05:20 online resize and snapshotting. On the client side for RBD we have the internal Linux kernel client, and there are also user space clients using librbd, so there's QEMU support where your VMs can use those RADOS block device images on the Ceph cluster directly. So, the hardware I was using for this project: I mentioned I had this Cubietruck lying around. It isn't ideal for a storage gateway; it has heaps of interfaces which just aren't needed here. So after initially implementing it with the Cubietruck, I then moved on to, just
Starting point is 00:06:14 recently, this NanoPi NEO. This is a lot cheaper, it costs under $10, and it's got everything that's needed for this project: USB On-The-Go and a 100 megabit network interface. Another board I've looked at doing the same thing on is the CHIP from Next Thing Co., based in Oakland; rather than a wired Ethernet interface, that one has wireless onboard, which is quite cool. Also under $10, I should mention. The main priority in selecting the hardware was that I wanted support for the mainline Linux kernel.
Starting point is 00:07:04 There are thousands of boards out there, and most of them use some sort of outdated kernel which isn't updated much. With the Allwinner chips, there's this excellent linux-sunxi community behind them. They're not working directly with Allwinner and aren't really sponsored by Allwinner, but they do awesome things with these chips and boards. There's an openSUSE Tumbleweed port, at least for the Cubietruck board I mentioned earlier. Hopefully we'll have something for the NanoPi NEO at some stage as well. For the price, the performance is quite good.
Starting point is 00:07:53 So this, as I mentioned earlier, has a 100 megabit network interface. It has a quad-core ARMv7 CPU and 512 megabytes of RAM; it's basically everything that's needed. Sorry? It's called a NanoPi NEO, and it's produced by a company called FriendlyARM. So now, USB storage. We have basically two mechanisms for transporting SCSI over USB: the old bulk-only transport, and USB Attached SCSI, which has some better performance-oriented features such as command queuing and out-of-order completion. On the Linux kernel side we have the mass storage kernel module, which implements just the bare bones, a very minimal SCSI target, and handles everything within that module; and there's also the TCM kernel module, which plumbs into the LIO stack on Linux, the generic SCSI target on Linux.
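To make the gadget side concrete, here is a minimal sketch of exposing an existing block device as a USB mass storage gadget through the kernel's configfs gadget interface. The gadget name, the vendor/product IDs and the backing device path are placeholders, and the project may well drive this differently (for example via the g_mass_storage module parameters instead).

```python
#!/usr/bin/env python3
"""Sketch: expose a block device over USB as a mass storage gadget using
the configfs gadget interface (/sys/kernel/config/usb_gadget). Names and
IDs below are illustrative placeholders."""
import os
from pathlib import Path

GADGET = Path("/sys/kernel/config/usb_gadget/cephusb")  # hypothetical gadget name
BACKING = "/dev/rbd0"                                    # device to expose

def setup_gadget() -> None:
    GADGET.mkdir(parents=True, exist_ok=True)
    (GADGET / "idVendor").write_text("0x1d6b")   # example IDs (Linux Foundation)
    (GADGET / "idProduct").write_text("0x0104")

    strings = GADGET / "strings" / "0x409"
    strings.mkdir(parents=True, exist_ok=True)
    (strings / "manufacturer").write_text("ceph-usb-gateway")
    (strings / "product").write_text("RBD mass storage")

    cfg = GADGET / "configs" / "c.1"
    cfg.mkdir(parents=True, exist_ok=True)

    func = GADGET / "functions" / "mass_storage.0"
    func.mkdir(parents=True, exist_ok=True)
    (func / "lun.0" / "file").write_text(BACKING)  # back LUN 0 with the device

    # Bind the function into the configuration, then attach the gadget to
    # the board's USB device controller to make it visible to the host.
    link = cfg / "mass_storage.0"
    if not link.exists():
        os.symlink(func, link)
    udc = os.listdir("/sys/class/udc")[0]
    (GADGET / "UDC").write_text(udc)

if __name__ == "__main__":
    setup_gadget()
```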
Starting point is 00:09:08 So with this project, the kernel basically does everything that's needed, and there's really not much that needs to be implemented here. We have the kernel RBD client for Ceph, we have USB gadget (device mode) support, and we also have LUKS/dm-crypt support, which is cool if we want to do transparent encryption and decryption on the gateway itself. So in the end it's just about handling configuration and making that as easy as possible for the user. The user or administrator just copies their Ceph credentials and configuration file onto the device, plus the dm-crypt key if they want encryption, and then all we need to do is handle mapping the image and exposing it via USB.
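As a hedged sketch of that configuration handling (the mount point, file names and config values below are assumptions for illustration, not necessarily what the project uses), the gateway-side steps amount to installing the copied credentials, mapping the image with the kernel RBD client, and optionally opening it with dm-crypt:

```python
#!/usr/bin/env python3
"""Sketch of the gateway-side mapping step: install the Ceph credentials
the user copied onto the config filesystem, map the requested RBD image
with the kernel client, and optionally open it with dm-crypt/LUKS."""
import subprocess
from pathlib import Path

CONF_DIR = Path("/mnt/config")   # where the FAT config filesystem is mounted (assumed)
CEPH_DIR = Path("/etc/ceph")

def install_credentials() -> None:
    # Copy ceph.conf and the keyring from the user-visible config filesystem.
    CEPH_DIR.mkdir(parents=True, exist_ok=True)
    for name in ("ceph.conf", "ceph.client.admin.keyring"):
        (CEPH_DIR / name).write_bytes((CONF_DIR / "ceph" / name).read_bytes())

def map_image(pool: str, image: str) -> str:
    # `rbd map` prints the block device node it created, e.g. /dev/rbd0.
    out = subprocess.run(["rbd", "map", f"{pool}/{image}"],
                         check=True, capture_output=True, text=True)
    return out.stdout.strip()

def open_encrypted(dev: str, keyfile: Path, name: str = "cephusb") -> str:
    # Transparent decryption on the gateway itself via dm-crypt/LUKS.
    subprocess.run(["cryptsetup", "open", dev, name, "--key-file", str(keyfile)],
                   check=True)
    return f"/dev/mapper/{name}"

if __name__ == "__main__":
    install_credentials()
    dev = map_image("rbd", "usb")
    key = CONF_DIR / "dmcrypt" / "keyfile"      # assumed key location
    if key.exists():
        dev = open_encrypted(dev, key)
    print("now expose", dev, "via the USB gadget")
```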
Starting point is 00:10:11 So here's a look at how that's implemented, basically as a boot sequence for this little board. We're booting a normal Linux distribution and kernel. What I then do is have it provision a file system, just a FAT32 file system backed by a RAM disk. This is the configuration file system, so the user sees a regular FAT file system, can copy on their Ceph credentials, and can specify which RADOS block device image they want
Starting point is 00:10:53 to have exposed via USB. What we then do is handle the eject event: once the initiator, or the user, ejects the file system or the device, we intercept that, commit the configuration, and can then go ahead and map the RADOS block device image and expose it via USB. So now a demonstration. What I have here on my laptop is a Ceph cluster, so to speak: just a vstart Ceph cluster with three OSDs and one MON. Pretty minimal, but it does what's needed for this demonstration. So I'll just connect my board, the USB gateway, to my Ceph cluster. The board is also powered by this USB cable, so power and data. I haven't really optimized boot time at this stage on this board. What we can do is that
Starting point is 00:12:17 when configuration has been completed by the user, subsequent plug-in attempts can use a fast path. Basically, rather than showing the config file system and waiting for the eject, we just map the RADOS block device image and expose it immediately. It still takes a while to boot. I did have the Cubietruck board down to under five seconds of boot time just by doing everything from the initramfs. Good. Well, the config file system has come up, so I'll just bring that over to
Starting point is 00:12:56 this window. There we can see it on the left. So this is the configuration file system which is provisioned by the board and is now exposed via USB. A client, or sorry, a user, now just comes along. They have their Ceph configuration and Ceph keyring, and they can just copy those into the ceph directory. There's a dm-crypt directory which takes a LUKS key if the user wishes to have transparent encryption and decryption on the device. There's this flag which then
Starting point is 00:13:46 triggers, or rather whose removal triggers, the fast path, so that we don't go through the config file system again. I actually want to remove that for a later test. And finally there's just the generic global config, which basically says which image should be exposed. In this case I have a pool on Ceph called rbd, the default RADOS block device pool.
Starting point is 00:14:17 And I have an image within that pool called usb. So the configuration looks good, I'm happy with that, and I can go ahead and eject. As I mentioned earlier, we then intercept that eject event, do the mapping, and expose it via USB. So we can see on the left here, I now have this 10
Starting point is 00:14:44 gigabyte device. This now corresponds to my RADOS block device image. Everything has happened on the gateway: it's mapped the image and exposed it via USB, so I can go ahead and use that storage. As mentioned at the start, my initial idea was to have this storage used by stupid devices rather than something which has a Ceph client built in, like my laptop. So to actually demonstrate that, I wanted to connect the gateway to a mobile phone. I have here a regular Sony Xperia Android phone. This has a USB On-The-Go port which is capable of powering my USB gateway.
Starting point is 00:15:48 So I'll just go ahead and connect that. In this case, I'll just bring up the slides again, we removed that flag to request the fast path for the next connection, which means the phone doesn't see the config file system; it just sees the RADOS block device image, immediately mapped. So with that, I'll actually make use of the storage first.
Starting point is 00:16:26 So I'll just take a quick photo. Good. And I'll copy that photo to the storage. Good. So the photo is now copied onto the Ceph cluster via this USB gateway. I can now safely remove it, unmount the file system, and it's good to go. Now, just to show that it worked and actually copied something, what I'll do is map
Starting point is 00:17:27 the RADOS block device using the internal Ceph client. You should see, there we go, on the left again, the same 10 gigabyte drive has come up. In this case it's just using the Ceph client on my laptop, so it's not making use of the gateway at this stage. And it looks good: there we can see the pretty bad photo I took earlier. So that's the demonstration out of the way. Thank you.
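For comparison, here is roughly what that direct client path looks like with the user-space python-rados and python-rbd bindings, rather than the kernel client used in the demo; the pool and image names match the demo, everything else is assumed.

```python
import rados
import rbd

# Open the same image the gateway exposes, but directly via librbd.
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
try:
    ioctx = cluster.open_ioctx("rbd")             # pool from the demo
    try:
        with rbd.Image(ioctx, "usb") as image:    # image from the demo
            print("image size:", image.size(), "bytes")
            header = image.read(0, 4096)          # first 4 KiB of the block device
            print("starts with:", header[:16].hex())
    finally:
        ioctx.close()
finally:
    cluster.shutdown()
```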
Starting point is 00:18:22 With this same device I also hoped to use it as a gateway for the public cloud. In this case, instead of having the Ceph cluster backing what's exposed by the gateway, I wanted to use the public cloud, Microsoft Azure in this case, to back those images. This is possible using the page blob protocol on Microsoft Azure; basically, the VM images on Microsoft Azure are accessible using this page blob protocol. It supports
Starting point is 00:19:07 512-byte IOs at arbitrary offsets, which is perfect for backing a regular block device. So in this case I worked on this using a user space backend for the Linux in-kernel SCSI target. We have in-kernel the regular LIO SCSI target within Linux,
Starting point is 00:19:32 which on the transport side supports Fibre Channel, iSCSI, USB, and a number of other things. On the back end we have a number of different storage engines, so the SCSI logical units can be backed by a file or by a block device. What I used is the TCMU back end, which basically has the LIO target forward the raw SCSI requests up to user space via this TCMU kernel module.
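To give a feel for what a user-space handler has to do on the cloud side, here is a hedged sketch of mapping block-style reads and writes onto an Azure page blob using the azure-storage-blob Python SDK. This is not the Elasto handler described next (that is a separate C implementation); the connection string, container and blob names are placeholders.

```python
from azure.storage.blob import BlobClient

SECTOR = 512   # page blobs are addressed in 512-byte pages
CONN = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=..."  # placeholder

blob = BlobClient.from_connection_string(CONN, container_name="disks",
                                         blob_name="usb-gateway.img")

def create_disk(size_bytes: int) -> None:
    # Page blobs are sparse: creating a 10 GiB blob allocates no data pages.
    blob.create_page_blob(size=size_bytes)

def write_blocks(offset: int, data: bytes) -> None:
    # A SCSI WRITE at a 512-aligned offset becomes a page update.
    assert offset % SECTOR == 0 and len(data) % SECTOR == 0
    blob.upload_page(data, offset=offset, length=len(data))

def read_blocks(offset: int, length: int) -> bytes:
    # A SCSI READ becomes a ranged blob download.
    return blob.download_blob(offset=offset, length=length).readall()

if __name__ == "__main__":
    create_disk(10 * 1024**3)
    write_blocks(0, b"\x00" * SECTOR)
    print(len(read_blocks(0, SECTOR)))
```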
Starting point is 00:20:18 So in this case what I have is the Elasto cloud project, which basically implements the Azure page blob protocol and handles mapping those IO requests into cloud IO. The Elasto project also supports the Azure File service and Amazon S3 protocols, but that's not what's in use here; it's just the page blob API. Then we have a TCMU Elasto handler: as I mentioned, TCMU forwards the SCSI requests up to user space, and within user space we have a plug-in for this TCMU component, which then maps those SCSI requests directly to cloud IO requests. So here's a rough diagram of what we have.
Starting point is 00:21:18 On the left we have the client side, just using a regular block device. Then on the gateway we have the regular USB gadget stack with the LIO SCSI target, TCMU, and finally Elasto at the top to actually do the Azure protocol requests. And now on to testing. For testing this implementation, I didn't really want to deal with plugging and unplugging cables all the time.
Starting point is 00:22:03 So there is this cool kernel module on Linux, dummy_hcd, which basically loops any USB gadget back onto the same system. That's been really useful for testing this implementation quickly. Some of the challenges with something like this: it costs under $10, so you could obviously just buy heaps of gateways and expose the same image to multiple systems. With something like the FAT file system, or any non-clustered file system,
Starting point is 00:22:43 that would be problematic, and it's not currently handled. One option would be to have the gateway request an exclusive lock on the image and then snapshot it, so that any subsequent gateway connecting is just fed a read-only snapshot. That might make things a little smoother for users.
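One way that exclusive-lock-plus-snapshot idea could be sketched with the python-rbd bindings (this isn't implemented in the project; the lock cookie, snapshot name and the exact exceptions raised on a contended lock are assumptions here):

```python
import rados
import rbd

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("rbd")

writable = True
with rbd.Image(ioctx, "usb") as image:
    try:
        # First gateway to arrive takes the advisory exclusive lock and
        # keeps read/write access to the live image.
        image.lock_exclusive("usb-gateway-1")
    except (rbd.ImageBusy, rbd.ImageExists):
        # Another gateway already holds the lock: fall back to read-only.
        writable = False

    if writable and "gateway-ro" not in (s["name"] for s in image.list_snaps()):
        # Publish a point-in-time snapshot for any later gateways.
        image.create_snap("gateway-ro")

if not writable:
    # Later gateways expose the snapshot read-only instead of the live image.
    ro = rbd.Image(ioctx, "usb", snapshot="gateway-ro", read_only=True)
    ro.close()

ioctx.close()
cluster.shutdown()
```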
Starting point is 00:23:15 Some other challenges. Power: this board doesn't have a battery or anything in there, so it's very dependent on a reasonable source of power from the USB host. That's not something you can always rely on, so one option would be to add a very small battery to the board. I mentioned the CHIP before; it has a battery connector where you could do something like that. Using the TCM USB back end, I had some problems on the board itself: it seems to work when I use dummy_hcd, but it wasn't working on proper hardware. That's no big deal, because I can just use the TCM
Starting point is 00:24:06 or LIO loopback, have it exposed as a local block device, and then use the mass storage kernel module. Caching: many of these boards have onboard NAND which could be utilised for caching on the gateway itself, so that's something to also look at. Performance: I mentioned boot time earlier; at the moment it's pretty slow to boot, so getting that down to as little time as possible would be helpful for users. On the IO side, the performance isn't great, but in the end this is made for these stupid devices which don't normally require awesome performance.
Starting point is 00:24:58 You could, though, look at using USB 3 or Gigabit Ethernet on the board, but then your price goes up quite a bit, so it's a trade-off. A few final words. All I can say is Ceph is amazing, use it. It is a little limited in that you don't have a Ceph client on everything, and this is where the gateway is helpful. USB storage is everywhere; everything nowadays has a USB port which you can, for the most part, use for mass storage. Encryption on a gateway device like this is, in my opinion, something I wouldn't go without for the public cloud. I would never trust the public cloud with my unencrypted data; I would always want to own the keys, and having that on the gateway itself is, I think, quite a nice solution. It's also incredibly cheap, so you can certainly buy something like this for all of your devices.
Starting point is 00:26:17 They support mainline Linux, so you don't need to worry too much about running, say, an out-of-date kernel with security vulnerabilities. But otherwise, that's it. Any questions? Anything? No, it doesn't look like it. I'll just finish off by saying thanks to the linux-sunxi community; I don't know the guys, but they've done some really awesome work on these boards. Thanks also to the LIO target developers, the Linux SCSI target developers. And also this
Starting point is 00:27:02 board itself was given to me for free by FriendlyARM, so thank you to FriendlyARM for that as well. Yes? Can you try... with the Ceph back end? Yeah, so I can use it as a regular file system. I don't use it in daily life at home yet, but that was certainly my plan: to have my Ceph cluster in the basement and then, for any device... so my stereo or TV, I can just plug one of these things in and not need to deal with plugging and unplugging USB sticks everywhere. I don't know much about the network capacity of these boards. How does that affect it?
Starting point is 00:28:01 I'll just repeat the question. So the network performance of the board, was that the question? How does it impact your... It's really not great. It's a 100 megabit network interface, and it's around 10 megabytes per second of IO to the RADOS block device via this gateway, so it's not great. You can use something with a quicker network interface, it's just going to cost you more. The Cubietruck has a gigabit interface, and the CPU can't quite drive it at that rate, but it's
Starting point is 00:28:44 still better than something like the NEO. Yes? You're talking about battery power and plugging it into a TV somewhere. Yes. So you don't have wires snaking all over the house on the Ethernet side, right? Yeah, so this is where the CHIP from Next Thing Co. comes in, since it has Wi-Fi built in. I haven't worked on getting this gateway up on the CHIP, and it doesn't have an openSUSE port at the moment, but it certainly should be an option. It's basically a very similar chip to these other boards.
Starting point is 00:29:24 Is it tricky with the network setup? Is it just Linux doing it? At the moment it just supports DHCP. Ideally, with the config file system, you would be able to do the network configuration through that as well: via a text file you enter static or DHCP and do what you need to do there. Okay. Thanks again. Thanks for listening.
Starting point is 00:29:55 If you have questions about the material presented in this podcast, be sure and join our developers mailing list by sending an email to developers-subscribe at sneha.org. Here you can ask questions and discuss this topic further with your peers in the developer community. For additional information about the Storage Developer Conference, visit storagedeveloper.org.
