Storage Developer Conference - #145: The Future of Accessing Files Remotely from Linux: SMB3.1.1 Client Status Update

Episode Date: April 30, 2021

...

Transcript
Starting point is 00:00:00 Hello, everybody. Mark Carlson here, SNIA Technical Council Co-Chair. Welcome to the SDC Podcast. Every week, the SDC Podcast presents important technical topics to the storage developer community. Each episode is hand-selected by the SNIA Technical Council from the presentations at our annual Storage Developer Conference. The link to the slides is available in the show notes at snia.org/podcasts. You are listening to SDC Podcast Episode 145.
Starting point is 00:00:41 I'd like to welcome you to the talk on the future of accessing files remotely from Linux. This is a status update on the SMB 3.1.1 client, and I'll also talk a little bit about the SMB 3.1.1 server and Samba as well. My name is Steve French. I'm a principal software engineer with Microsoft. I work on the Linux client full-time, but our team works on Azure Storage, XSMB. Legal statement: these views don't necessarily represent those of Microsoft. Let's see, who am I? I'm the author and maintainer of the Linux CIFS client, which is used to access Samba, Windows, various NAS devices, Macs, as well as the cloud, the largest file server in the world:
Starting point is 00:01:35 Azure. I'm a member of the Samba team. I actually wrote the initial net utility with Jim McDonough many years ago. I'm the co-author of the SNIA CIFS Technical Reference and former Working Group Chair. And as I said, I'm a Principal Software Engineer working on helping maintain the Linux client, but working on the Azure Storage team within Microsoft. I'm going to talk about recent Linux VFS and file system activity to kind of give you a broader view of what's going on in Linux
Starting point is 00:02:09 and then dive into changes on the new kernel server as well as changes on the client. And then I'm going to talk a little bit about what client features you can expect in the near future. Then I'll finish up with some discussion of the user space utilities, cifs-utils, and the improvements in that area, as well as testing, which has had some really remarkable, excellent progress recently. So let's step back.
Starting point is 00:02:36 On Linux, the progress is amazing. A year ago, when I spoke here, the Linux kernel version was called Bobtail Squid, and you can see an interesting picture of what a bobtail squid looks like. Now, fast forward to last week: we have Kleptomaniac Octopus as the current name. Linus enjoys having unusual names for releases. So we've gone from the 5.3 kernel to the 5.9 kernel.
Starting point is 00:03:12 You know, the Linux activity has been amazing, even in this COVID-19-dominated world over the last year. Over the last seven months, activity has kept up strong. So to give you a feel for what's gone on in the last year, the file system activity is actually up from the previous year. The file system changes represent almost 7% of the overall kernel changes, and that's flat.
Starting point is 00:03:40 That's similar to a typical year over the last few. Now, the Linux kernel is amazingly huge: over 20 million lines of source code. I just measured it a couple weeks ago. Now, there are many Linux file systems. There are more than 60, but six of these file systems and the VFS layer itself drive the vast majority of the activity. They include local file systems you're well aware of, like BTRFS and XFS, and the two network file systems, NFS and CIFS.
Starting point is 00:04:11 CIFS has been the most active over the last couple of years. Now, file systems represent almost 5% of the kernel source code, but a higher percentage of the changes. They represent almost a million lines of code, and they're much more carefully watched than many other areas of the kernel; they're among the most carefully watched areas of the kernel. Now, cifs.ko, the module I work on that provides the SMB 3.1.1 support, is the third most active file system, with 336 changes. Its activity has been strong; it went up a lot over the last two and a half to three years. Now, the cifs.ko module is almost
Starting point is 00:04:51 57.2 thousand lines of code, but that doesn't count the user space cifs-utils, which are almost 12,000 lines of code, nor does it count the Samba tools. Samba is over 3 million lines of source code, and we use pieces of Samba for some of the helper functions in user space. And then we rely on other services like winbind in some cases. Now, let's look at the overall change activity
Starting point is 00:05:22 in the Linux file system over the past year, from the 5.3 kernel until now. The largest set of changes overall has been in the VFS layer, the overall file system mapping layer, which includes common functions used by various file systems, as well as the mapping from syscalls into the actual file systems themselves. Over 1,400 change sets. That is enormous, way, way higher than the rest. Some of that's due to work from people like Dave Howells, who is improving the mount API and doing various cleanup, but it's spread all over the place.
Starting point is 00:06:02 Now, traditionally, XFS and BTRFS kind of dueled to see who's the most active file system. And, you know, they're far more active than EXT4. But both of them had significantly increased activity. Now, the CIFS (SMB2/SMB3) client had 336 changes. But as I mentioned earlier, ever since the 4.18 kernel, you'll see the CIFS activity has gone way, way up. The NFS client had a similar amount of changes last year. Interestingly, EXT4, which had relatively much less activity than any of these file systems, did increase its activity.
Starting point is 00:06:41 But if you look at the other file systems, you'll notice that very few come close to the activity of those top five or six that we see there. Now, let's talk about the server a little bit. Many of you are familiar with the NFS kernel server. It had 164 changes. That's a fraction of the changes that went into, you know, the cifs.ko client or the Samba server. The Samba server is as active as all these file systems put together, typically many thousands of change sets a year. And just in the file server component alone, it typically gets over a thousand change sets a year. So, you know, it's interesting to kind of put this in perspective.
Starting point is 00:07:29 Samba's broader in scope by a lot. It's, you know, almost 100 times larger than the NFS server. Over 3 million lines of source code. So this gives you a kind of perspective for what's going on in Linux. It's really interesting to see how much activity has been going on. But what people often forget is that it's not just internal things going on inside the file system. They're also adding new syscalls. In the previous talk last year, I mentioned the 15 file system-related syscalls that were added.
Starting point is 00:08:05 Now, this year, only three were added, but that's still kind of incredible. When you think about it, every year we're adding new interfaces to support more applications. We think of POSIX file APIs as kind of a static thing that hasn't changed, but in Linux we don't depend just on POSIX; we keep adding APIs. Last year, io_uring and Dave Howells' new mount API were the big ones. These were very important changes.
Starting point is 00:08:34 But this year, we have openat2 in the 5.6 kernel, faccessat2 in the 5.8 kernel, and close_range in the 5.9 kernel. These are very important changes, and it's kind of an example of the continued improvements in Linux, some of which are visible to user applications. And as Linux continues to evolve, it makes our job harder and harder in the file system
Starting point is 00:09:05 because we have to be able to support a larger and larger set of API calls. Now, what's driving all this activity? Well, there are many things. Recently, for example, there's been a rewrite of the fscache layer; probably four or five versions of it have been put out on the mailing list. Dave Howells has been driving that. The previous year we had the new mount API. And Dave also has proposed some extended ways of querying additional file system information
Starting point is 00:09:40 and new notification mechanisms. But we also have changes coming from groups trying to improve the support for containers in Linux. And we've had, over the last 10 years or more, many efforts around trying to support faster storage: NVMe, RDMA. These cause interesting problems, right? One of the fun bugs I remember was when we had a response returned before the request send had even completed. Those are the kinds of races that are generated by this incredibly fast infrastructure,
Starting point is 00:10:26 faster networking with RDMA, and faster storage at the server end with NVMe. This creates great opportunities, but it also creates lots of bugs. And, of course, we've also changed the way we think about async I/O with io_uring. And, of course, working in Azure, I see this all the time. We have to deal with the shift to cloud. Sometimes when you're saving a presentation,
Starting point is 00:10:53 you're storing it maybe in Dublin or San Antonio. The latencies can be quite long when you're storing it from many thousands of miles away. And you also have to have objects and files coexisting in the same backend storage. Now, the shift to the cloud also means shorter latencies in some cases, because we're able to spin up a wide variety of VMs, some of them extremely fast, with very fast networking. So you see orders-of-magnitude differences in the cloud, which once again causes bugs, and causes changes needed in the protocol
Starting point is 00:11:35 and in optimizing the client and the server for these unusual workloads that we now see more often. Now, what about the server? We've talked a lot about the Linux client. The Samba server is great: it's huge and full-function. But I want to shift gears a little bit and talk about some exciting development. Namjae got a chance last year to talk about his new kernel server, KSMBD. He's continued to make progress, and since he wasn't able to attend this time, I wanted to take the chance to describe some of this new progress on the kernel server.
Starting point is 00:12:08 So the module, ksmbd.ko, is over 35,000 lines of code, with almost 3,000 change sets, and it comes with user space tools, ksmbd-tools, which are almost 11,000 lines of source code and had almost 1,000 change sets. Now, this module, by the way, they call KSMBD so it doesn't get confused with the name of the Samba process; the Samba server process is called smbd. So the K in ksmbd gives you a chance to, you know, obviously see that it's the kernel server. Now, there's a wiki page on this, but here's an example you can see on my system of starting the ksmbd service and seeing the processes that it launches.
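For reference, here's a minimal sketch of what bringing up the kernel server can look like with ksmbd-tools; the user, share name, and path are placeholders, and the exact option syntax may vary by version:

```bash
sudo ksmbd.adduser -a testuser          # add an SMB user (prompts for a password)
sudo ksmbd.addshare -a testshare \
     -o "path=/srv/share,read only=no"  # define a hypothetical share
sudo modprobe ksmbd                     # load the kernel server module
sudo ksmbd.mountd                       # start the user space helper daemon
```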
Starting point is 00:13:02 Now, this has been really good work by Namjae, Sergey, and others. You can see the GitHub tree here. It is still experimental, but the goal is to send it to linux-next as soon as we can get build verification runs complete. There are also some other changes that have been requested, which we'll talk about on a later slide. Now, it's mirrored onto a tree in GitHub and also on samba.org, and you can see the tree there.
Starting point is 00:13:31 Now, the name of the module is ksmbd.ko, but we've called the source directory cifsd to make it easier to find in the kernel fs directory, because we already have an fs/cifs directory for the client, so fs/cifsd is for the server. That makes it a little easier to find in the source tree, but the module itself and the daemon begin with the letter K to distinguish them from the Samba user space code. Now, they've significantly improved the server. Now look at the number of test cases.
Starting point is 00:14:08 They pass almost 100 of the xfstests now. This is good progress. So these are Linux mounts to KSMBD; it's big progress. Now, who uses it? Well, it's included in firmware for DD-WRT and also OpenWrt. They've fixed issues with smartphone apps, with smbclient, and with various other types of clients. They've fixed kernel oopses, hang issues, and leaks. They found other issues with static checkers.
Starting point is 00:14:54 They've applied almost 500 patches since last year's SDC. This has been very good progress on the kernel server. Now, what's in progress? They added write support for ACLs. The code implementation is complete, storing NT ACLs into xattrs, and they're now fixing various failures they found from running smbtorture tests
Starting point is 00:15:14 against the kernel server. They've also added support for Kerberos, using the existing Kerberos 5 libraries; it requires an auxiliary user space daemon. Now, some to-dos they're working on: being able to handle open with backup intent, which was needed for some robocopy scenarios, and also being able to support multichannel, to take advantage of the improved performance you get from using multiple network connections at one time. It's a wonderful feature of the SMB3 protocol. So, a lot of progress in bug fixes and also in new features. The ACLs and Kerberos support were two particularly important ones, because that was
Starting point is 00:15:58 one area where some people were hesitant to push forward the kernel server, if it had security that was worse than other servers. But adding support for ACLs and Kerberos was a key part of addressing that, so hopefully it will help make it easier to merge into mainline. The upstream version of KSMBD is now in this GitHub tree that we mentioned here. It's removed the SMB1 code for obvious security reasons. It has had much code cleanup; various things were found with checkpatch and sparse, and various build errors with the latest kernel were fixed.
Starting point is 00:16:36 So it'll be the best way to integrate the testing and upstreaming into Linux kernel mainline. Okay, let's switch gears a little bit: what are our goals with Linux and SMB3.1.1? Well, we want to make it the fastest, most secure, general-purpose way to access file data, whether it's virtualized, in the cloud like Azure, or on-premises. We need to implement all the reasonable POSIX semantics and features we can; some of them can be emulated. We don't really want apps to realize they're running on SMB3 mounts.
Starting point is 00:17:12 Applications just run on file systems, right? You don't want them to have to think that they're running on a network mount. I was always kind of amused because you see the man page of certain utilities and certain syscalls that mention, well, it works this way locally, but if I'm on an NFS, it does this. Well, we don't want to have to do that, right? We want to make sure that the semantics
Starting point is 00:17:33 look reasonably close on the SMB mounts as well as local mounts. Unfortunately, Linux keeps evolving, right? So as Linux continues to evolve, we have to keep adding features quickly to the Linux kernel client and to Samba. You know, we've got some catch-up to do still, so we're not at the point where we're,
Starting point is 00:17:53 you know, adding support for everything. Take the example we talked about earlier, openat2. What does that mean? Do we have to change anything in the Linux kernel client for that? Unfortunately, with rename we did have some things we're going to need to change, right, to be able to support RENAME_EXCHANGE, for example. As fallocate continues to be extended, we will
Starting point is 00:18:16 have additional fsctls to add. But we need to be able to not wait eight years, right? The IETF sometimes took many, many years to get changes into other protocols. So we want to make sure that we can evolve quickly as Linux keeps evolving. But we want this to be the fastest commodity, sort of general-purpose, way to access file data. You know, blocks are great for certain things, but applications often just expect file semantics to work. Let's look at the new features, the progress in the past year. We added this modefromsid mount option.
Starting point is 00:18:56 This gives us sort of like the NFS-style security, where in many cases the client's permission evaluation matters the most. There are many cases in NFS where the server doesn't know much about the UIDs, right? You could have clients where the server doesn't really know much about the mode bits and the UIDs at all. This is very different than the kind of Kerberos model that you often see with Active Directory and SMB clients, let's say Windows clients, where, when Windows SMB clients or Linux SMB clients are joined to an Active Directory domain, the server is able to resolve these global users to SIDs,
Starting point is 00:19:40 to globally unique identifiers, and to evaluate these rich ACLs on the server side. With the modefromsid option, we're dealing with the primitive POSIX mode bits, and we're trying to store them in a way that's opaque to the server but allows us to correctly evaluate the mode bits on the client. And here's an example you can see on the right-hand side of the page, where we have files with every mode bit combination stored. This is similar to what was done in the NFS server for Windows. The NFS server for Windows had a special SID that was unenforced,
Starting point is 00:20:23 but it contained the exact mode bits. So creating files with all 4,096 mode bit combinations works.
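For reference, here's a rough sketch of such a mount; the server, share, and user names are placeholders:

```bash
# Store and evaluate POSIX mode bits on the client, via a special SID in the ACL
sudo mount -t cifs //server/share /mnt/smb \
     -o username=testuser,vers=3.1.1,modefromsid
```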
Starting point is 00:20:54 Now, what else has been added recently? Well, multichannel was added in the 5.5 kernel. Thank you to Aurelien Aptel. This is a big performance win: in some of my tests it was five times faster in the 5.8 kernel, because in the 5.8 kernel we were able to dispatch the large reads and writes in parallel too. In the 5.5 kernel, all the other I/O types were dispatched in parallel, but not the large reads and writes, so in the 5.8 kernel you see much better performance for large file I/O. Multichannel allows you to take advantage of multiple network adapters at one time and parallelize better. And in Linux, we do a pretty good job of parallelizing writes going out, but reads coming back in aren't parallelized that well,
Starting point is 00:21:23 because we end up sort of blocking on the cifsd thread that's reading the responses off the socket. And even with offloading to network adapters, we have bottlenecks there, especially with encrypted connections. Now, we do have an esize mount option that can help parallelize a little bit, but multichannel does a much better job. So this has been a great performance win, and we see some really promising data, not just to Windows but also to Azure. It's really kind of fun; I've enjoyed seeing, in many cases, two, three, four times faster I/O. And it's a wonderful feature of the SMB protocol.
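As a sketch of how you might enable it, here's a hypothetical mount line; max_channels (added in the 5.8 kernel) caps how many connections the client opens, and the names are placeholders:

```bash
# Open up to four parallel connections to the server
sudo mount -t cifs //server/share /mnt/smb \
     -o username=testuser,vers=3.1.1,multichannel,max_channels=4
```

So here's a network trace of multichannel.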
Starting point is 00:22:11 You can see the different colors representing the different adapters that it's going over. And this is, you know, a nice way to see how these requests are being parallelized across multiple network adapters, allowing us to achieve better performance. Now, what else has been added? The XFS developers, about two and a half years ago at the File System Summit, talked to us about dynamic tracing. And a year later at the File System Summit, the developers of the BPF subsystem,
Starting point is 00:22:50 the people behind these dynamic tracing infrastructure improvements, gave talks to the file system and memory management developers about how important dynamic tracing is and what can be done. So we've continued to add dynamic trace points, and here's a list of the 82. We continue to add more and more every year, and as we debug customer problems, we add more. Dynamic tracing is so much better than the typical tracing that's been going to dmesg. And I think that it's really helped a lot in helping us as developers debug customer problems better
Starting point is 00:23:25 and also develop code faster. And, you know, dynamic tracing has been fantastic for CIFS. I strongly recommend you take a look at Brendan Gregg's books, and there's also, I think, a presentation at this conference about using BPF. But in CIFS, we have a large number of dynamic trace points, and they've been very helpful.
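For reference, here's one quick way to watch these trace points on a live system, assuming tracefs is mounted at the usual debugfs location:

```bash
# Enable all cifs trace events and stream them as they fire
echo 1 > /sys/kernel/debug/tracing/events/cifs/enable
cat /sys/kernel/debug/tracing/trace_pipe

# Or record and view them with the trace-cmd utility
trace-cmd record -e cifs     # Ctrl-C to stop recording
trace-cmd report
```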
Starting point is 00:23:51 Now, one of the things that was so much fun last year was the ability to add support for GCM encryption. This was added in the 5.3 kernel. It can more than double write performance and improve read performance by 80%. It works very well with Windows, and then there were changes a year or so ago to Samba, so now mounts to Samba can also benefit a lot from this. So GCM is quite fast. And as we go forward, we're also going to see support for GCM signing coming. I don't know exactly the timeframe for that in Windows, but it's going to be a big help, and I'm finding that very exciting. I think everybody benefits from this, if you're encrypting data, that is. For example, when you mount to Azure, you're encrypting data. These secure, encrypted mounts had been an obstacle for Linux, because it could take, you know, as much as eight milliseconds to encrypt a very, very large frame.
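For reference, requesting encryption on a mount is just the seal option; a sketch, with placeholder names:

```bash
# seal asks for SMB3 encryption; on 5.3+ kernels the faster GCM
# cipher is used when the server negotiates it
sudo mount -t cifs //server/share /mnt/secure \
     -o username=testuser,vers=3.1.1,seal
```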
Starting point is 00:24:57 And with GCM, the performance can go up, you know, five times faster for encrypting these frames. So it's kind of a really neat feature that we added last year. Paulo at SUSE added the ability to boot diskless systems via cifs.ko. That was added in the 5.5 kernel. And here's an example; it's kind of interesting, because in some of the earlier examples he booted over SMB1, because of the SMB1 Unix extensions, but he was able to get it working with SMB3 as well. There's no IPv6 support yet,
Starting point is 00:25:43 and it does require ipconfig to set up the network stack prior to mounting. There's a nice write-up on this, and I think there are use cases where diskless systems make a lot of sense with SMB, allowing more control over them. And in some environments, diskless systems are going to make more sense. I think as we improve our support for the POSIX extensions, booting diskless over SMB3.1.1 is going to make even more sense. But this is a really fascinating feature, and the changes went into the 5.5 kernel.
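For reference, a hypothetical kernel command line for an SMB3 root file system (CONFIG_CIFS_ROOT); the address, share, and credentials are placeholders, and ip= is needed because the network must come up before the mount:

```bash
root=/dev/cifs rw ip=dhcp \
    cifsroot=//192.168.1.10/rootshare,username=testuser,password=test,vers=3.1.1
```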
Starting point is 00:26:25 Okay, so let's quickly summarize some of these changes that have gone in over the past year, because there's been a lot of exciting progress. In the 5.3 kernel, we cut the number of round trips from three to two in open. That improved performance about 10%. We added the GCM crypto, which can more than double the write performance. copy_file_range, which allows for faster server-side copy, now supports being able to copy between two shares on the same server, so you can do a cross-share copy offload, not just within the same share.
Starting point is 00:27:00 SMB Direct is no longer experimental. Long Li did some wonderful work there, and I think you're all aware that RDMA is extremely fast, and SMB support over RDMA is a very, very powerful feature. We also added the ability to send the network name context in the negotiate protocol request, which will help load balancers in some cases, and we can now query symlinks that are stored as reparse points. In the 5.4 kernel, we got the boot support, but there were some networking dependencies required that went into 5.5. The modefromsid mount parm was added, which we talked about on an earlier slide. For workloads where you have large encrypted reads, the parm esize allows you to offload decryption of encrypted reads, which helps our performance
Starting point is 00:27:46 of large I/O in some cases by parallelizing decryption. We had some customers requesting the ability to disable leases, so that's possible now with the nolease mount parm. We also added some much-improved user space tools to allow better debugging and certain sysadmin tasks. And as part of that, we added support for a pass-through ioctl to allow user space tools to call various set-info levels directly from user space. Thanks to Ronnie at Red Hat for work on that. We also, from some customer requests, added the ability to force caching: basically cache=ro and cache=singleclient.
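As a sketch of those forced-caching parms (placeholder names; only use them when the workload really is read-only or single-client):

```bash
# cache=ro: data never changes on the server, so cache aggressively
sudo mount -t cifs //server/share /mnt/ro -o username=testuser,cache=ro
# cache=singleclient: no other client writes to this share
sudo mount -t cifs //server/share /mnt/solo -o username=testuser,cache=singleclient
```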
Starting point is 00:28:31 And also, signing using the current signing algorithms is slower than encryption, actually. And there are some cases where by disabling the signing checks on the client, you can get better performance. And in some workloads, that's okay. Obviously, signing has significant value in preventing corruption and also man-in-the-middle attacks. So this is not recommended in all cases.
Starting point is 00:28:57 But if you need to disable signing validation on the client, it can improve performance, and we added a mount parm for that from some customer requests. We also have the ability to display the maximum number of requests in flight. Another feature that's been incredibly helpful is the ability to now dump the decryption keys. So if you're mounted with SMB to a server on an encrypted share, you can now debug that by running smbinfo with the name of a file, and it'll dump the encryption keys for you, which you can then plug into Wireshark.
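For reference, a sketch of that workflow (the path is a placeholder; smbinfo is part of cifs-utils and needs root):

```bash
# Dump the session id and encryption/decryption keys for the mount
sudo smbinfo keys /mnt/secure/somefile
# Paste the values into Wireshark's SMB2 protocol preferences to
# decrypt a captured trace of this session
```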
Starting point is 00:29:38 And thanks to Aurelien Aptel for some of that. In the 5.5 kernel, we added support for whole-file locks, the BSD-style locks, which we emulate, and we added multichannel support. We also added a performance optimization that helps metadata queries. We cache
Starting point is 00:29:59 metadata for one second, but we now query the metadata on close, which allows us to cache it for one more second beyond close, and that's improved performance. And we also did a readdir performance optimization for when you have reparse points. Now, in the 5.6 kernel, we improved the modefromsid mount option. We also added support for fallocate mode 0 for non-sparse files. We had some changes coming from Boris that allow setting the owner info, DOS attributes, and creation time from user space, which helps backup and restore tools.
Starting point is 00:30:41 Ronnie at Red Hat also did a performance optimization that added compounding support for readdir, which cuts the number of round trips from nine to seven for a typical ls command, and that can significantly improve performance when you're doing a number of readdirs. And we also added readdir improvements for modefromsid and cifsacl, so we don't end up messing up the mode bits and having readdir overwrite the mode bits we just got back from stat. We also added a new ioctl for change notify. This is helpful for user space
Starting point is 00:31:18 tools that may want to wait on directory change notifications. Currently in Linux, there are notification APIs like inotify, but they don't work for network file systems, only for local file systems. There's a VFS patch set that's been proposed to fix that, but in the interim we have an ioctl that can be used to call change notify, to allow user space tools to wait on directory changes. In 5.7, we had a big performance improvement for signed connections when multiple requests
Starting point is 00:31:46 are sent at the same time, with much better parallelism. We made a number of RDMA improvements and added support for swap over SMB3. In the 5.7 kernel, we also came close to finishing the Unix extensions by adding support for POSIX readdir using the SMB3.1.1 POSIX extensions. In the 5.8
Starting point is 00:32:06 kernel, Aurelien made a big performance improvement for large I/O with multichannel, allowing multichannel to parallelize the large reads and writes, which can make it four or five times faster in some cases. We also added support for something called idsfromsid, which allows an alternate way of handling chown, mapping POSIX UIDs, GIDs, and owner information with a special SID. We also
Starting point is 00:32:31 added the last key part of the SMB3.1.1 POSIX extensions in the 5.8 kernel: support for POSIX query info. So what improvements should you expect in the near future? One of the things you should expect is much stronger encryption. At the last Storage Developer Conference, they talked about AES-256-GCM, something that helps with the most demanding, most secure workloads. We currently use AES-128-GCM or AES-128-CCM; GCM-128 is a little faster.
Starting point is 00:33:11 So I've begun work on adding support for AES-256-GCM as an option, to allow us to handle those workloads that require even stronger encryption. Currently, I don't see a reason to do CCM-256, because GCM-256 is likely to be faster. So this is going to be helpful for that subset of workloads that demands the best encryption. Now, in addition, one of the problems that we have had is that encryption is faster than signing, because signing relies on a slower algorithm. So there's a new negotiate context that allows other signing
Starting point is 00:33:56 choices, and that work is in progress. What this will allow us to do is choose GCM signing, which would allow signing to be faster than encryption. I've begun work on that, and I can see it negotiating with the server, but it's still not complete yet. That's something I've been working on, and hopefully we'll continue testing it at the co-located SMB test event this week. Now, we also have a problem that was noted recently as we were investigating some performance issues, when a server like Samba, in some cases, doesn't support directory leases. In some of the changes we made to allow better caching of the root directory, we end up now doing an extra set of round trips.
Starting point is 00:34:48 So we've added compounding support, which would normally reduce round trips, but we end up doing an extra set of round trips for the query of the root directory when a server doesn't support directory leases. So that performance fix is being worked on right now. Shyam's been working on a number of cifsacl improvements, for example to handle cases where the mode bits are something like 007 or 077, where the owner has fewer permissions than the group or everyone. There are a number of cases where we have to add deny ACEs to properly match the mode bits, like the 077 example I gave. So he's been working on those improvements to cifsacl, as well as some better Kerberos integration things, so when we mount with Kerberos we can upcall a little bit better, find the default tickets a little bit better, and also deal with generating tickets in cases where we don't have them, making it a little bit easier to do multi-user mounts.
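For context, on a cifsacl mount the remote NT ACL can be inspected and edited with the cifs-utils helpers; a sketch, where the path and the ACE string are illustrative only:

```bash
getcifsacl /mnt/smb/somefile                        # print owner, group, and ACEs
setcifsacl -a "ACL:S-1-1-0:DENIED/0x0/R" /mnt/smb/somefile   # add a deny ACE for Everyone
```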
Starting point is 00:36:07 So another thing that's been in progress: we need to improve caching. NFS often caches file data across close, so there are certain benchmarks that actually don't send any network data, because the file is cached the whole time. They set up the temporary file, close it, it stays in the cache, so there's no network I/O for those. SMB has a similar feature called handle leases, and we can do the same thing. Doing that will improve workloads where you do open, write, close, open, read, close. In addition, we need to extend the use of directory leases,
Starting point is 00:36:43 which are incredibly helpful for improving metadata performance. Today, we only do it for the root directory. We need to be able to extend directory leases to subdirectories as well. We've been worried a little bit about stressing the server, but the ability to cache metadata for subdirectories matters for metadata performance. And we need to continue to optimize network traffic. SMB has a wonderful feature called compounding that lets you reduce the round trips. Ronnie did some wonderful patches roughly a year and a half ago, and we've continued to do them
Starting point is 00:37:27 to improve the performance of readdir and to improve the performance of stat, and there are many, many I/O patterns where we can send multiple requests to the server as one, reducing the round trips and improving performance. The problem is we just have to keep looking, case by case by case, and find these. Another thing we need to work on: although the multichannel performance is getting better,
Starting point is 00:37:54 it's still somewhat experimental, because if a channel disconnects, we need to improve how we handle reconnect in multichannel. So when a session drops in multichannel, we need to do a few more patches to fix that. Now, we have some new features that are very exciting that have gone into recent Windows. I downloaded the recent Windows Insider build, and it was kind of fun, because you can see the GCM encryption here. Here it's trying to negotiate; you can see AES-256-GCM.
Starting point is 00:38:35 And here's an example from the Linux client from yesterday, where we enabled require_gcm_256. You can see the echo of 1 into the cifs module parameter require_gcm_256, which causes us to only offer GCM-256. This allows you to require it when the best encryption is needed. And then we try to do a mount. Now, the mount's going to fail, because the code isn't quite finished yet, but you can see in this Wireshark trace that we've actually successfully negotiated GCM-256.
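For reference, those demo steps look roughly like this (server and share are placeholders; the module parameter is from the in-progress GCM-256 work):

```bash
# Refuse anything weaker than AES-256-GCM, then attempt the mount
echo 1 > /sys/module/cifs/parameters/require_gcm_256
sudo mount -t cifs //server/share /mnt/smb -o username=testuser,vers=3.1.1,seal
```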
Starting point is 00:39:12 And where we fail is the tree connect, where we have a problem in how we're calculating, I think, how we're using the key to encrypt that first frame. So this is something we'll be working on at the test event over the next few weeks, and hopefully we'll have it ready for the 5.10 kernel, to be able to get the best encryption. And what about QUIC? Now, QUIC isn't just about avoiding the port 445 problems; there are lots of other advantages of QUIC as well. But the lack of a kernel network driver for QUIC
Starting point is 00:39:55 has been a key issue. It's been discussed, and there's the MsQuic GitHub project that could be used as a starting point. The discussions are continuing here, and hopefully we'll be able to talk more about that. Wireshark now has support for SMB3.1.1 over QUIC. Here you can see Windows negotiating QUIC.
Starting point is 00:40:14 And so what about other security improvements? We need better SELinux integration. We need support for more security scenarios. We need to improve the support for multi-user Kerberos mounts, which Shyam is working on, the stronger encryption, and the port 445 problem. We also need to be able to support dummy mounts, so we can handle krb5 credentials when they're not available. But one thing that's been interesting is to look at the different security models. We have cases where we're enforcing permissions on the
Starting point is 00:40:43 client, like modefromsid and idsfromsid; cases where we're doing multi-user and enforcing it on the server with cifsacl; and then cases where we're using the default. These three models have to be supported well, and there are some unusual combinations of all of these that we have to continue to extend and document. cifs-utils has improved a lot.
Starting point is 00:41:04 smbinfo has been rewritten in Python. There are some great examples: Kenneth D'Souza did a nice job of using the pass-through ioctl to get quota information. So it's kind of a fun tool, and here's a sample of using it to get quota information. Now, what about typical user scenarios? One of the common questions we get asked is what mount options to use. It's very common to use mfsymlinks, to have client-evaluated symlinks, which is safer than having the symlinks stored on the server.
Starting point is 00:41:33 It's also common to mount with noperm, especially if you have ACLs set up properly on the server. It's also very common to use the default dir_mode, file_mode, uid, and gid; it's faster than mounting with cifsacl, where you're retrieving those out of the ACL on every stat call. But it's sometimes recommended to use cifsacl if you want to have server-enforced permissions,
Starting point is 00:41:58 or idsfromsid and modefromsid if you're enforcing them on the client. It's very common to mount with actimeo set to a larger value than one second, if you can relax metadata consistency requirements. Also, it's very common to mount with sfu if you have the need to store special files. And for servers like Azure that have highly reliable backends,
Starting point is 00:42:22 you may not need to send sync calls to the server, so you can sometimes rely on mounting with nostrictsync. nostrictsync allows a little faster performance when your server backend is reliable enough that you don't have to send the SMB flush command to tell the server to write it to disk.
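Pulling those together, here's one plausible mount line for that scenario (placeholder names; drop actimeo and nostrictsync if your workload needs stricter consistency or durability):

```bash
sudo mount -t cifs //server/share /mnt/smb -o \
     username=testuser,vers=3.1.1,mfsymlinks,noperm,actimeo=30,sfu,nostrictsync
```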
Starting point is 00:42:51 Now, testing is significantly improved. Take a look at our buildbot, and take a look at the xfstests setup page; it's very easy to set up, and it excludes the slow tests and known failing tests. We're now up to 180 groups of tests that run over SMB3. That's more than run over NFS, and we keep adding more every release; we added more than 50 xfstests since last year. This has really helped a lot with reducing regressions, and we're going to continue to add more and more xfstests as we work through them one by one. There are a lot of xfstests, many of them only appropriate for a local file system, but this has been extremely exciting progress.
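As a sketch of running xfstests against a cifs mount (the config variable names follow the xfstests cifs setup but may vary by version; server, share, and credentials are placeholders):

```bash
# In an xfstests checkout, point the test device at an SMB3 share
cat > local.config <<'EOF'
export FSTYP=cifs
export TEST_DEV=//server/testshare
export TEST_DIR=/mnt/test
export CIFS_MOUNT_OPTIONS="-o username=testuser,password=x,vers=3.1.1"
EOF
./check -g quick     # run the quick group of tests
```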
Starting point is 00:43:26 And thanks to the buildbot, we've had the best releases ever for SMB3: fewer regressions, much higher reliability, much higher quality. So the future looks very bright for Linux and SMB3.1.1, and I'm very excited about the progress that we've made. Now, during the co-located test event that's going on this same week, we'll have a chance to make even more progress. But I look forward to hearing more about the kinds of requirements and suggestions you have, and your experiences using the Linux client. The linux-cifs mailing list and samba-technical are excellent resources for getting more information. And here are some additional resources you can use for exploring SMB3.
Starting point is 00:44:08 Anyway, I want to thank you for your time, and I look forward to hearing more feedback and questions. Hopefully we can continue to extend the workloads that the Linux client can work with, and hopefully we can get this Linux kernel server in, so we have lots of choices, both on the client and on the server, for accessing files remotely. Thank you very much for your time. Thanks for listening. If you have questions about the material presented in this podcast, be sure and join our developers mailing list by sending an email to developers-subscribe@snia.org.
Starting point is 00:44:56 Here you can ask questions and discuss this topic further with your peers in the storage developer community. For additional information about the
