Storage Developer Conference - #185: SMB3 Landscape and Directions
Episode Date: March 27, 2023...
Transcript
Hello, everybody. Mark Carlson here, SNIA Technical Council Co-Chair. Welcome to the
SDC Podcast. Every week, the SDC Podcast presents important technical topics to the storage
developer community. Each episode is hand-selected by the SNIA Technical Council from the presentations at our annual Storage
Developer Conference. The link to the slides is available in the show notes at snia.org
slash podcasts. You are listening to SDC Podcast, episode number 186.
Good morning. This is the SMB3 landscape and directions session.
We're members of the Windows SMB client and server team.
This is myself, Genghis. Either pronunciation is fine.
And this is Meng.
There are other members of the team that contributed to this work,
notably Stephen Tran, who couldn't present his work today, so we'll cover for him.
So, again, I'm Genghis. This is Meng. I'll do a portion of the presentation, and then Meng will take over.
Roughly, we'll speak about three things: compression and compression sampling,
and then two new features, the authentication rate limiter, which we added to the server, and SMB notification, which touches both the client and server.
That one was an extension to the protocol.
A quick overview of SMB3.
I recently rewatched our SDC talk from 2017, when SMB3 was five years old.
At this point it's 10 years old, and there have been quite a few improvements in the past half decade:
signing and encryption, transparent compression.
We've added support for new transports: RDMA, or as the Microsoft protocol documents describe it, SMBD or SMB Direct.
And then most recently, we've added the SMB over QUIC transport layer.
And that was covered in our 2021 talk at this conference.
In the past few years, the most notable improvements were compression and compression sampling.
Compression itself involved changes to the protocol.
Compression sampling is a feature of the client and server implementation itself, and we'll speak about that in detail. And then,
most notably in the past two years, we've added authentication rate limiting to the server
and SMB notification, and Meng will speak about those two topics.
Just an overview of where we're headed in the future,
and this is maybe an open-ended discussion,
something we'd like to engage the audience on.
Obviously, with SMB being a mature technology,
with anything mature,
inevitably you enter a phase of optimization, consolidation.
You go as lean as possible.
If you've attended Ned Pyle's session yesterday on the security roadmap, you'll see we have quite a lot in mind in terms of deprecating older parts of SMB.
And it may be that in the future we'll deprecate other parts that are no longer needed or no longer relevant.
And then something very nascent that we're talking and working with partners on is offloading parts of
notably encryption and compression into hardware.
Any questions thus far?
All right.
We've covered compression, I think, two years ago also at SDC.
I'll just quickly go over what SMB compression is
and describe a few interesting issues we've run into as we were implementing compression itself,
which may be interesting to other third-party vendors implementing compression.
And I know there's at least one here.
We've also added support for compression sampling.
As I've said in the overview, this is a client feature
where we try to determine if compression is worth doing.
And I'll describe why we do that,
because there is a component to compression
where you might not want to do it.
That's the cost evaluation and heuristics part, which we'll speak about. And then I'll
just quickly go over the performance counters on the client and server to help anybody working
with compression to diagnose how well it is performing, or whether it is working at all.
All right, just an overview of what compression is.
Pretty trivial.
We offer seamless data compression at our transport layer.
This is something you can enable via policy or on a per file basis.
So from the application layer, other than enabling it, there's nothing else to be done.
We seamlessly handle compression.
If you have encryption on, that will integrate with that as well. It's been part of the protocol since version 3.1.1. Currently,
we do not support compression over the RDMA transport. There's no particular reason we
do not, other than we simply did not get to it yet. There are some technical burdens on
our side that we have to address first.
And compression is an area where we see
a lot of potential for future expansion.
And I have a thought on that as well.
And that part, I think, please engage with us
if you have any ideas.
There's quite a few things I think we can do.
It would help to have input from others
to understand, you
know, where to focus on. We did cover compression in the 2018 presentation.
There is a link below on this slide. Part of that talk described some of the nitty-gritty details of the protocol messaging. But I won't go into those in this talk today.
We offer three compression algorithms.
These are documented by a spec, the MS-XCA spec.
That's the list on the slide:
Xpress, Xpress Huffman, and LZNT1.
From the spec itself, quoting verbatim,
these algorithms are not designed to compress image, audio, or video data, obviously because that data is already compressed. Compared
to other algorithms out there, these algorithms emphasize low CPU cost, obviously at the cost
of lower compression ratio. Another thing that we documented in the protocol spec, which is also a form of compression: we have a run-length-encoding prefix scanner
that we apply before we run Xpress. Simply put, we look for a repeating data pattern
at the front and at the back of the packet that we're transferring.
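As a rough sketch of that idea only, and not the actual MS-XCA pattern format or the real scanner, here is a small Python fragment that looks for a run of a single repeated byte at the front and at the back of a payload; the function name and the return convention are invented for illustration.

```python
def scan_repeating_edges(payload: bytes):
    """Find runs of a single repeated byte at the front and back of a payload.

    Returns (front_len, back_len): lengths that could be sent as a run-length
    "pattern" instead of being fed to the main compressor. Illustration only;
    this is not the wire format.
    """
    if not payload:
        return 0, 0

    # Length of the run of payload[0] at the front.
    front = 1
    while front < len(payload) and payload[front] == payload[0]:
        front += 1

    # Length of the run of payload[-1] at the back, not overlapping the front run.
    back = 1
    while back < len(payload) - front and payload[-1 - back] == payload[-1]:
        back += 1
    if front == len(payload):
        back = 0  # the whole payload is a single run; count it only once

    return front, back


# Example: a payload padded with zeros at both ends.
data = b"\x00" * 4096 + b"real data" + b"\x00" * 8192
print(scan_repeating_edges(data))  # -> (4096, 8192)
```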
So a few things that are interesting from an implementation point of view for somebody who wants to implement compression.
You can, even after you've negotiated compression,
again, if you go back to the SDC 2018 talk, we described that in detail,
even after the client and server have negotiated the session and have agreed to enable compression and do compression,
you are not obligated to compress every message.
Another interesting thing is when the client issues a read to a server,
the client specifies a bit that requests the server attempt compression.
There's no guarantee that the server will actually compress. It's better to think about it this way: if the
client does not specify this flag, the server will never compress.
All right. On the client and server, actually, let me skip over this slide. I'll go into cost evaluation and heuristics first.
So compression by itself can be costly.
It is opportunistic in nature in that you have to apply a certain amount of effort,
but you're not guaranteed to get a reward because you don't know if the input data is compressible or not.
Now, we do on the client and server employ heuristics on both macro level and a micro level.
On a micro level, we do things such as not compress small messages
because even though we may save a byte or two, it's just not worth it.
Not only do we expend cycles on the compressor side to do the work,
to save that potentially just one byte,
we then have to burden the server with decompressing that.
And just to save one byte, that's not worth it.
You're not mandated by the protocol to implement these heuristics.
A very trivial implementation may simply always do compression, and that's fine.
That could even serve as a reference implementation for testing.
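As a sketch of the kind of micro-level check being described, here is a small Python helper; the 512-byte cutoff and the names are invented placeholders, not the actual values or interfaces in the client.

```python
MIN_COMPRESSIBLE_SIZE = 512  # invented placeholder, not the real cutoff

def try_compress(payload: bytes, compress):
    """Return a compressed payload only when it is worth sending.

    'compress' is any callable implementing one of the negotiated algorithms.
    Returns None when the message should go uncompressed: either it is too
    small to bother, or compression did not actually shrink it.
    """
    if len(payload) < MIN_COMPRESSIBLE_SIZE:
        return None
    compressed = compress(payload)
    if len(compressed) >= len(payload):
        return None
    return compressed
```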
And that's on the micro level. On the macro level, we did introduce a feature called compression sampling
that tries to evaluate on a larger volume of data
if something is compressible or not.
So it's a simple state machine.
It is configurable.
There are two knobs.
The first is the compression sampling window size,
and that specifies the number of bytes
that the client will transfer
until it makes a decision
whether the data stream is compressible or not.
The second knob is the goal, the target.
If after transferring that volume of bytes,
we've managed to reach the target,
we switch to one of two states.
If we've reached the target, the data stream is compressible,
and then for the remainder of the transfer, we'll always compress.
If it's not compressible, we do not compress.
Another notable thing here is that the state machine runs on the client.
It is not the server that is making these decisions.
And again, if we go back to this tidbit here
about the read operations with a client controlling
whether the server will attempt compression or not,
we utilize that bit.
If a client determines that the data stream is incompressible,
it will never specify that flag to the server,
so the server will never compress.
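A minimal sketch of that sampling state machine, as just described: evaluate until a configured number of bytes has been transferred, then latch into a terminal compressible or not-compressible state. The class, field, and method names here are invented for illustration, and the comparison against the threshold is simplified; the real client-side logic may differ in detail.

```python
from enum import Enum, auto

class SamplingState(Enum):
    EVALUATING = auto()
    COMPRESSIBLE = auto()      # terminal: keep compressing until the file is closed
    NOT_COMPRESSIBLE = auto()  # terminal: never request compression again

class CompressionSampler:
    """Per-open-file sampling state machine (illustrative names only)."""

    def __init__(self, window_bytes: int, threshold_bytes: int):
        self.window_bytes = window_bytes        # sampling size knob
        self.threshold_bytes = threshold_bytes  # compressibility threshold knob
        self.raw_seen = 0
        self.compressed_seen = 0
        self.state = SamplingState.EVALUATING

    def record_transfer(self, raw_len: int, compressed_len: int) -> None:
        """Feed one read's or write's sizes (after decompression on reads)."""
        if self.state is not SamplingState.EVALUATING:
            return
        self.raw_seen += raw_len
        self.compressed_seen += compressed_len
        if self.raw_seen >= self.window_bytes:
            # Decide once the whole sampling window has been transferred.
            # Sketch: compare the compressed total directly to the threshold.
            if self.compressed_seen <= self.threshold_bytes:
                self.state = SamplingState.COMPRESSIBLE
            else:
                self.state = SamplingState.NOT_COMPRESSIBLE

    def request_compression(self) -> bool:
        """Should the client compress writes and set the read 'please compress' flag?"""
        return self.state is not SamplingState.NOT_COMPRESSIBLE
```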
I have a question.
Yes, please go ahead. ...Will the client then also run the algorithm on the received data, in order to check whether the read requests
should be compressed or not?
So I'll repeat the question
and then just verify that I interpreted it correctly.
So if somebody sets the, for the first time,
the compression sampling window size to 100 megabytes
and the client only completes a portion of that data transfer,
let's say 10 megabytes, does the sampling state machine make a decision or not?
No, no. My question was if this sampling also happens on reads, when the client reads back data from the server,
and the server responds back with obviously a compressed packet. Does the client consider that as part of the decision?
Yes, it does.
Obviously, again, there's a caveat here in that we'll specify the bit as part of that while we're in that evaluation phase.
We'll specify the bit to say, server, please always compress.
The server will compress.
And I believe, actually, so remember earlier I mentioned
that small messages are not compressed.
In that case, I believe, I'll have to double-check the code,
that the server will still compress.
So the server may do useless work here.
But when that packet comes back to the client
and the client decompresses,
it will compare the size of the decompressed buffer and the size of the compressed buffer and then factor that into the decision. Yes. And as I said, you do have to transfer
the full target size, the full volume, before you transition out of the evaluation state
into either the compressible state or the non-compressible state. And those are terminal
states. Once you're in one of those two states, you never come out of them until the file is closed.
Does the server do the evaluation as well?
The server does not. Actually, when I studied the code originally, that was surprising
to me. Perhaps there is a different design here where the server may do the evaluation
and coordinate with the client.
Because then it could be sort of opt-in rather than opt-out.
Rather than starting to compress and then turning it off once it turns out to be worthless,
the server could say, well, you know, I can see it's not compressible, so I'm not going to compress.
All right.
So, yeah, sorry.
Yeah, the question comment was, can we do the sampling on the server?
Can the server cooperate with the client
on doing the sampling?
I believe so.
I believe there is a design that will work just as well
or maybe even slightly better in some regards.
We'll have to do some thinking on that.
But ultimately, even if you do it just on the client,
just on one node,
the client ultimately receives a packet back from the server.
So we'll do the decompression.
Yes, if you did that on the server,
perhaps the server may avoid doing costly compression work
that it would perhaps know ahead of time
that this is not compressible, let's not do this.
Yes, so, one more question. Are the responses sent compressed?
The question was, does the server send the responses compressed?
While we're in that initial evaluation state,
the client will ask the server to compress all the time.
Yes.
Is it expected that the server will compress
while we're in that initial state?
Yes.
Yes.
Is this only for data operations? Or does it apply to metadata operations too?
Sorry, which operations count as data?
Is sampling done only for reads and writes
or for metadata operations also?
Good question.
I believe we only do it, the question was,
is the sampling done only for reads and writes or other messages such as metadata?
I believe it is only reads and writes, but I'll have to clarify and get back to you on that.
These are the knobs here on the slide that control compressibility sampling through our PowerShell interface,
Set-SmbClientConfiguration.
There's an option to enable or disable it,
a Boolean option,
and then the compressibility sampling size
and threshold.
Threshold is that target value.
So let's say you specify compressibility sampling size
to 100 megabytes,
compressible threshold to 50 megabytes.
For the state machine to determine that the data is compressible,
those 100 megabytes have to compress down to 50 megabytes or less.
So 49, 48, or 50, that's fine.
That's considered compressible.
Anything over 50 is not compressible.
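To make that arithmetic concrete, here is a small illustrative check using the values from the example; the variable names exist only for this snippet.

```python
# With a sampling size of 100 MB and a threshold of 50 MB, the first 100 MB
# transferred must compress down to 50 MB or less for the stream to be
# latched as compressible.
sampling_size_mb = 100
threshold_mb = 50

for compressed_mb in (48, 49, 50, 51, 60):
    verdict = "compressible" if compressed_mb <= threshold_mb else "not compressible"
    print(f"{sampling_size_mb} MB compressed to {compressed_mb} MB: {verdict}")
```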
And Ned recently published an article documenting all this.
There's a link here.
Since this is not a protocol feature
and something we've added fairly recently,
obviously there will probably be some defects or shortcomings we find, so this is subject to change in the
future.
I've covered this earlier.
I skipped to the slide again.
So, just to summarize: compression is an opportunistic operation.
It may be costly with no guarantee
of a reward. Therefore, it would make sense to add some smartness around whether we do
compression or not. In terms of diagnostics, we've added several
performance counters on the client, on the server as well, I believe, but the client ones are more detailed. There's the compressed request per second and compressed
response per second that track the number of compressed requests and responses. Again,
I'll double-check if those include just reads and writes or metadata as well. And then for cases,
again, I mentioned that we will not always compress
due to heuristics. For cases where we fail compression or decide not to compress, those
requests will not be counted in the successful compressed request per second. So actual number
of packets compressed is that last counter, successful compressed request per second.
You can, again, monitor these counters
using the built-in Windows Performance Monitor.
A few screenshots here.
So two different workloads.
One is we're just writing data that's completely random,
pseudo-random, and it's not compressing at all.
You can see the successful compressed requests
per second counter is zero.
Writing all zeros continuously.
A surprising thing here is we should be able to compress every packet, but we do not. And I believe that goes back to the gentleman's question. Perhaps we're not
compressing metadata packets. So I'll go back to that and double-check, and we'll
document that.
And on this slide, I'd like to call to the audience and other vendors out there who are
implementing compression.
Again, I know there is at least one.
We've had plenty of questions, found shortcomings in the protocol spec that we've addressed,
and there are a few other paragraphs we'll need to address there as well.
On our roadmap, definitely we need to add compression support for the RDMA, SMB Direct, transport layer.
In the way that compression combines with encryption and signing, specific to our own SMB client and server,
we want to do a bunch of refactoring and address technical debt.
And partly that is the debt that is preventing us from
committing to implementing compression with RDMA and SMB direct fast enough.
Another thing is we'd like to add more sophistication to the compression cost evaluator and heuristics.
Automated performance analysis, for example, which algorithm performs better or not versus
others.
Which ones are more CPU-heavy versus higher compression ratio?
Which ones have a higher memory footprint?
And again, you might have different characteristics
on the compressor and decompressor.
With SMB compression, in the past few years,
we've had a few security cases.
So this is obviously one of the more complex portions of our code.
And again, just generally anything with higher complexity, higher risk of defects, especially
security defects, and our implementation living in kernel mode, those can be especially dangerous.
We're onboarding automated fuzzing for the compression layer.
That's not done yet, but that's in progress.
Interoperability validation.
So if we extend this going into the future, this being a complex part of a protocol,
we'd like to make sure we interoperate not only with older versions of our own client and server implementation,
but also third-party vendors out there.
So I'd love to have a discussion with other vendors out there if we can set something
up where, you know, we run something on a semi-automated basis, I don't know, some cloud
somewhere, and we rent VMs, and we run different clients against our own client.
That'd be just great.
And again, I've mentioned in the overview
slides, something very nascent we're working with hardware partners on is offloading compression
into hardware. That's not something that's standardized at all. I think, you know,
we'll see standardization in a year's time at least.
And other than the three algorithms that we currently support, I'd love to open
this up to other compression algorithms out there. I know there's been quite a few developments
out there in the open source world. Zstandard is a notable one. So if we can support that,
well, you know, while staying compliant, that'd be great. Another thing I'd like to do, we're
thinking of, is extend the protocol to chain different algorithms together. So, as an example,
the RAR compressor, WinRAR, implements filters for better, more efficient
compression of executables.
The way it does that, it runs filters that flatten branches in executable code,
and that increases the opportunity for redundancy.
Something similar is done, actually, in the Microsoft CAB format.
So if we can stick that into SMB,
somebody copying EXE files, you know, they'll benefit.
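As an illustration of the general filtering idea being described, and not anything SMB does today, here is a sketch of the classic x86 E8 call-target transform used by executable-aware compressors: relative call offsets are rewritten to absolute ones so that repeated calls to the same function become identical byte sequences a dictionary compressor can exploit.

```python
import struct

def e8_transform(code: bytes) -> bytes:
    """Rewrite x86 CALL rel32 targets from relative to absolute offsets.

    Calls to the same function then encode to identical byte sequences,
    which gives the compressor more redundancy to find. Purely illustrative;
    real filters bound the offsets and handle more edge cases.
    """
    buf = bytearray(code)
    i = 0
    while i + 5 <= len(buf):
        if buf[i] == 0xE8:  # CALL rel32 opcode
            rel = struct.unpack_from("<i", buf, i + 1)[0]
            absolute = (rel + i) & 0xFFFFFFFF
            struct.pack_into("<I", buf, i + 1, absolute)
            i += 5
        else:
            i += 1
    return bytes(buf)
```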
A few links on this slide. Again, there's that article
that we recently published, and then the spec link for the
Xpress compression algorithms.
Yes.
So, for your various compression algorithms, it would be very handy to have a reference implementation, because each company has to write their own implementation of those compression algorithms,
and they need some way of verifying that their algorithms are correct.
So I think just to summarize the question, it's on the correctness of implementation of these compression algorithms.
Yeah, I'm not an expert. I'll have to look into that and get back to you.
I don't believe we have a reference implementation.
We should.
At this point, I'll hand off to Meng.
It's 8:52.
Yes.
Just a question on compression.
If the client requests compression,
is the server allowed to reply uncompressed?
The question is, if a client requests compression,
is the server allowed to reply uncompressed?
The answer is yes.
Again, there's that bit.
SMB2_READFLAG_REQUEST_COMPRESSED.
If the client requests compressed, can the server reply uncompressed?
Do you mean, if the server sends something back uncompressed?
If the request asks for compression, can the reply come back uncompressed?
Is that it?
So the answer is the server can always reply uncompressed.
It's a hint or a please kind of a bit.
The spec says it controls compression; controls is a little bit strong.
But it requests the server to attempt compression.
The server can say, sorry, it's not working,
and it will reply with a normal, uncompressed response.
And the server selects which compression algorithm to use.
In fact, it always does that.
There's a set of algorithms that are agreed upon by both client and server.
I'm sorry to jump in.
No, thank you, Tom.
Yeah.
I appreciate that.
Yeah. I used to be at Microsoft, and I'm the author of some of these documents.
Basically, they pre-negotiate a list of algorithms, and they can choose whatever algorithm they like per reply.
It's just like how they negotiate the sealing algorithm.
So the client sends a list, and then in the negotiate response the server takes one.
So is this the one where you say, come on, I've got a request, as a second step?
No, the algorithm list never flows over the wire again; it's explicitly accepted in the very beginning,
when they agree which ones.
It's more than one, right? I mean, does the server only pick one?
Each operation uses only one, correct, but you can try multiple algorithms, see which one's best, and choose that one after that.
The server returns one out of those, LZ-whatever.
It doesn't return all of them.
It's kind of... it's a little bit messy, but...
Different streams might be compressed with different algorithms.
So all of this is not bad.
And it's done only at the connection level, and there's nothing per share.
What do you mean by that?
Some of it is negotiated as part of the negotiate request, on a connection basis.
And for the full connection, for all shares accessed over that connection, it sticks for the lifetime of the connection.
The question is at what level
or what grain compression is negotiated.
Yeah, it's at the tree connect
session setup, I believe. So I'll refer to the SDC 2018 talk. Some of these messages
and the session setup negotiation are described in detail there, at about 33 minutes in.
All right. At this point, I'll hand off to Meng.
Thank you.
So let's move on to the SMB authentication rate limiter.
This is a new feature developed to defend against NTLM password brute-force attacks.
We have future plans to address password spray attacks as well.
To recap, in SMB we have the SMB server, or srv2, instances.
Each srv2 instance handles traffic from regular SMB clients accessing file shares,
as well as other internal traffic.
And we also have other srv2 instances for, for example, CSV and SBL.
So, let's move on to the structural design of the authentication rate limiter feature. Each srv2 instance has a timer wheel.
The reason each srv2 instance has only one timer wheel of its own is that we don't want all srv2 instances relying on a single shared wheel.
Each wheel is a structure containing various members.
For example, there is the slot array, which is an array of slots,
where each slot is actually a linked list
containing the delayed authentication work items.
We also have a cursor,
which points to where we are in the slot array.
And we have an expiry counter, which counts how many expiries we have had so far.
For example, if the count is equal to the total number of slots, it means there is nothing left to
process, and we can turn off the timer to save
resources. And we also have a DPC, which we use to process the delayed authentication work items.
This structure is per NUMA node. To recap, NUMA, or non-uniform memory access, is a computer memory design used in multiprocessing.
It means that memory access time depends on the memory location relative to the processor:
a processor can access its local memory faster than non-local memory,
where non-local memory includes memory shared between processors and memory local to another processor.
The reason we make the authentication rate limiter structures per NUMA node is that if we relied on just one slot array for all NUMA nodes,
that would cause performance degradation: it would cause inter-node, or cross-node, memory accesses, which increase the memory
access time.
So, for example, if we have a machine with two NUMA nodes, we have something like the diagram:
each NUMA node has a similar structure,
and each slot array, as I said, is an array of linked lists containing the delayed work items.
In our current implementation,
we allow the authentication delay to be up to 10 seconds.
In order to accommodate that amount of delay, we have to ensure we have enough time slots.
The timer expires every 100 milliseconds,
so 10 seconds divided by 100 milliseconds equals 100. That means we need 100 slots, each holding the delayed work items in a linked list for that index.
So, let's look at an example.
This is a simplified example of how it works.
Assume the cursor is pointing at the start of the timer wheel, which is index zero,
that the time between slots is one second, and that we want to delay a
work item for two seconds. That means that when the work item arrives,
we schedule it two slots ahead of the current cursor.
After one second, the cursor moves to the next index,
but there is nothing to process,
so we keep moving. After the next second, the cursor is pointing at index two, so it grabs all the work items in the linked list at index two, and they are processed by the SMB server.
There are other improvements needed to make this work correctly, because there are some dependencies on top of SMB.
For example, the Multiple UNC Provider and the Distributed File System.
For example, when we do net use \\server\share with a username and an invalid password,
there will be some additional delay caused by those dependencies.
For example, DFS will attempt an SMB tree connect
to the IPC$ share
to check if the path is part of a DFS namespace.
This happens before the SMB client tries to do the regular SMB
tree connect. To remove that additional delay, we have added a new
cache. That gives a 50% improvement for the net use case, so there is only one trip instead of
two, and a 75% improvement for the other cases, with only one trip instead of four.
Any questions so far? Yes?
Is it implemented just in the server, with no protocol extension
in terms of error codes or anything like that?
Sorry, can you say that again?
Is there anything added or extended in the protocol
to make use of this?
Oh, this is enabled by default.
So there's no extra protocol work for this.
Even on the wire, you don't see anything new.
So let's move on to the SMB notification. It's a one-way server-to-client message
that could later be repurposed
to replace the Scale-Out File Server witness mechanism
and the client periodically querying
the available network interfaces on the server.
This was originally designed to solve the
Remote Desktop Services, the Terminal Services, auto-disconnect timeout issue
when SMB encryption is turned on. This happens, for example, when we have
multiple users accessing a share. One of the users has no I/O for more than 15 minutes,
which is the default in our current implementation.
So the server, in order to save resources,
forgets the encryption and decryption keys,
but the client doesn't.
The next time that user issues another I/O request
using the same session, the client still considers the session valid, so it issues the request to the server.
But the server cannot decrypt the message, because it has already forgotten the decryption keys. So
it sends an error back to the client. But the client expects a valid response with a valid message ID.
So it terminates the connection, causing a TCP reset and a session reset. What we have
done is let the SMB server send a session close notification to the client, telling the client to disconnect
the session matching that session ID. Then, whenever the next I/O triggers session
re-establishment, it kicks in the existing reconnect state machine and creates a new session with new encryption and decryption keys.
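A sketch of what that client-side handling might look like; all of the names here (connection, session_table, invalidate) are invented for illustration and are not the real redirector interfaces.

```python
def on_session_close_notification(connection, session_id) -> None:
    """Illustrative client handling of an unsolicited session-close notification.

    The server says it is throwing away the session and its encryption and
    decryption keys, so the client drops its own session state instead of
    continuing to send requests the server can no longer decrypt.
    """
    session = connection.session_table.get(session_id)
    if session is None:
        return  # no matching session on this connection: discard silently

    session.invalidate()  # forget the keys and mark the session as gone
    # The next I/O on an affected handle triggers the existing reconnect path,
    # which performs a fresh session setup and derives new keys.
```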
So here's the overview of the protocol change. We have the SMB2 server-to-client notification
message. It has a structure size of two bytes,
which is the size of the server-to-client notification
structure.
And we have two bytes reserved for future use.
Then the notification type field is four bytes.
Then comes the notification itself; because we only support this one notification, its length here is four bytes.
And the bottom one is the SMB2 notify session closed message, which is those four bytes.
So here's the update to the header. This feature has a new header command,
SMB2 server-to-client notification, with a new command value.
The new client will accept SMB2 oplock break and SMB2 server-to-client notification messages
when the message ID is the invalid message ID, 0xFFFFFFFFFFFFFFFF.
As for the client, the client can receive
a session close notification at any time from when the user has issued a session setup request up to the time the client receives the SMB response
to the client's logoff request,
for that same session.
If the client cannot find an existing session
in the connection's session table,
the client must discard the notification.
We also have some ways to secure the notification, which I will talk about in a later slide.
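For illustration only, here is a sketch of parsing and validating such a notification; the field layout follows the description above, but the constant values are placeholders, and the authoritative numbers and rules live in MS-SMB2.

```python
import struct

NOTIFY_SESSION_CLOSED = 0                 # placeholder type value for illustration
INVALID_MESSAGE_ID = 0xFFFFFFFFFFFFFFFF   # the invalid message ID used for unsolicited messages

def parse_server_to_client_notification(header, payload: bytes):
    """Illustrative parse of an unsolicited server-to-client notification.

    Layout per the description: StructureSize (2 bytes), Reserved (2 bytes),
    NotificationType (4 bytes), then the notification body.
    """
    if header.message_id != INVALID_MESSAGE_ID:
        raise ValueError("unsolicited notifications carry the invalid message ID")

    structure_size, _reserved, notification_type = struct.unpack_from("<HHI", payload, 0)
    body = payload[8:]

    if notification_type == NOTIFY_SESSION_CLOSED:
        # The session being closed is identified by the session ID in the SMB2 header;
        # if no matching session exists, the client simply discards the notification.
        return ("session_closed", header.session_id)
    return ("unknown", notification_type)
```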
So there's no reply to the notification, right?
It just drops the message.
No, the server doesn't expect a reply from the client.
Okay, so it's a one-way notification.
Yes, yeah.
Okay. How is this different from a user session deleted response on the client side? Today, only once the session is closed
and the client sends another request
can the server send the user session deleted error,
which is then handled on the client side.
Sorry.
So the question was,
how different is it from a user session deleted response
from the server?
Yes.
So, Tom, correct me if I'm wrong: user session deleted is a response from the server,
so the client has to send something in order to get that response.
Whereas with this mechanism, the server, out of the blue,
can send something to the client.
So, I'm trying to understand the motivation for this.
Is it to correct the client behavior?
Because you mentioned the signing and encryption keys,
which the client doesn't invalidate
if it keeps retrying with them.
Are you trying to fix that issue with this?
So, just to add to what Anirudh was asking in the back:
what if you just did nothing, and the client sent an SMB2 create
or something with that now-invalid session ID,
and you responded to that with user session deleted?
What would the client do differently in that case
versus when you send this out-of-band notification?
The question was, why do we even have to do this;
because if we did not,
the client would eventually get to a state
along with the server where, yes,
the session is invalid,
but they would renegotiate, reconnect,
and everything would be good.
I believe that's right; this is what would happen.
I don't think we have a slide on the original context of the problem we're trying to solve.
But that's essentially what was happening.
There might have been something more serious to that problem we were solving.
I don't quite recall.
I see.
I have a question.
Is this implemented in a specific version of Windows? Which version is it?
It would be the latest one.
I can't remember.
The question was, what version?
Something fairly recent.
Okay.
And is there no plan to backport it to older versions?
I believe we backported it to some slightly older version.
It would be better to have that published; we'll publish it when we can.
Maybe you're going to say this, but you haven't yet said what the client does if it does find the session.
What was the client action on this notification?
Are you going to talk about that in a minute?
Oh, yeah.
Yes.
Are we going to... Sorry.
The question is, what does the client actually do?
It says here what to...
It discards the notification if there's no session.
What does it do if there is a session?
I'm assuming it's being requested to close the session.
Right, so the client will close the session.
And so the server will give him a little more time,
and if he doesn't, he just slams the door and does what it does today.
Right, so this is a mechanism to more gracefully notify the client.
Okay, so general client information.
If it does find the session, please close it.
Right?
That's the action that's being requested by this callback?
And that gives the client the opportunity
to clean up and so on.
Because the server is essentially saying
the keys you're holding, the state you're holding, is now outdated.
I'm going to discard it on my end.
On the server.
But is the notification saying the session is already closed,
or is it a notification that it will close in half a minute or so?
I mean, I may have over 500 operations on that session, possibly,
with outstanding writes due to delayed write-back.
What do we do with those? Throw them away?
On the earlier slide, it said it was a notification.
So my delayed writes, I think they're gone, right? Is that right?
I don't know.
What do we do with outstanding operations? I have buffers that are dirty because I have an exclusive lease,
and I want to write them back.
And now where's my session?
There it goes.
Right, those are good questions.
Unfortunately, I don't have an answer.
We'll get back to you with the answers. We'll ask Steve.
You had a slide that said it's used when sessions are retired by the server.
I thought that was the whole context, but you're saying it's not?
It could potentially be used beyond that and so on?
So, yeah, the motivation was there was a server knob
to automatically retire sessions.
And when the server did that,
it got the client into an undesirable state.
So that was the motivation for doing that,
is to gracefully tell the client that the server's doing that.
Client, please do be aware of that so we don't go into...
But we would only do that if the session was idle.
If you have outstanding handles, et cetera.
If you have outstanding handles, then...
Correct.
Then there's always a race with the server switching the session to the state where it believes it's idle concurrently with some request coming in that unidles it.
Yeah.
So this is a trigger when you've got an idle session and you're telling the client, hey, you can't do anything.
If I don't see anything from you, I'm going to close this.
Yeah, that's the question. Or is it a take-effect-now kind of thing?
Yeah, that's the real key.
Because we have something similar with our connection timers,
where if we're not seeing the things we want,
we'll just close it.
It would be nice to get this message,
to know that we're not in that case.
We'll have to double check.
I think I understand the concern.
How idle is it?
And if a client is really not that idle,
what happens?
So this is something we've recently added.
We've never, well, we've never before had something where the server
tells the client unprompted. But, you know, I imagine we'll find issues. So this could be one area.
Okay. So in terms of the dialect negotiation, we have a new SMB2 global capability for notifications. It indicates that the
SMB client can receive such notifications from the server. The client sends this new
global capability flag along with its other capabilities, and the server must set
the connection's support-notification flag to true if the capability is agreed between the client and server.
Then, during the initial session setup,
the session's support-notification flag must be set to true
if the connection's support-notification flag is true for that session.
In the case of SMB multi-channel,
for a subsequent binding request,
the server must check the session's support-notification flag
against the connection's support-notification flag. If the values are different, the server must
reject the binding request from the incoming client.
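A sketch of the capability and binding checks just described; the flag value and the object fields are placeholders for illustration, not the actual MS-SMB2 constants or server data structures.

```python
SMB2_GLOBAL_CAP_NOTIFICATIONS = 0x00000400  # placeholder bit; see MS-SMB2 for the real value

def on_negotiate(connection, client_capabilities: int) -> None:
    # Record whether this connection's client advertised the notification capability.
    connection.supports_notifications = bool(
        client_capabilities & SMB2_GLOBAL_CAP_NOTIFICATIONS
    )

def validate_session_binding(connection, session) -> bool:
    """On a multichannel binding request, the capability recorded on the session
    at initial session setup must match the capability of the connection the
    bind arrives on; otherwise the server rejects the request."""
    return session.supports_notifications == connection.supports_notifications
```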
For signing, we have some areas still to work out in order to implement this. One option is to add a new server notification bit, set to one, for AES-GMAC
nonce generation. That would require us to generate a random 64-bit message ID for each
notification instead of the current invalid message ID. The other option is to add a new field to the server-to-client notification structure to
carry a randomly generated nonce from the server. So a client on a dialect newer than 3.x would acknowledge
that it can receive SMB notifications with a random message ID, and a server on a dialect newer than 3.x would
send the signed notifications.
So you're using a random message ID to avoid resending the same payload twice? It's serving a little bit like a nonce, I think. That's odd, but okay.
I don't have any detail on this one
because we haven't implemented this yet.
Oh, okay.
Yeah.
The random mid is what gets me.
I don't quite understand it.
I'd like more detail, but okay.
Okay. So for the protocol change,
when SMB encryption is turned on,
there are two scenarios to handle.
If the notification is not associated with an SMB session,
encryption will not work
unless we broadcast the notification
to all clients, to all sessions, so that each session's encryption key can be used. In the other case,
if the notification is associated with a specific SMB session, the notification will be encrypted and sent to the client. For the multi-channel
scenario, if the notification is not associated with a specific session but is
scoped to a particular client, the server will iterate through the client connection
list, matching the client GUID, and send the notification on the first available connection.
If that fails, it will try the next connection.
If the notification is associated with a specific session,
the server will iterate through the session's channel list
to find the first available connection.
If that fails, it'll try the next connection.
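A sketch of that delivery logic; the object model here (server.connections, channel_list, client_guid, send) is invented purely for illustration.

```python
def send_notification(server, notification) -> bool:
    """Illustrative delivery of a notification, per the description above.

    If the notification is tied to a specific session, walk that session's
    channel list; otherwise walk the connections belonging to the target
    client, matched by client GUID. Either way, try each connection in turn
    and stop at the first one that accepts the send.
    """
    if notification.session is not None:
        candidates = notification.session.channel_list
    else:
        candidates = [c for c in server.connections
                      if c.client_guid == notification.client_guid]

    for connection in candidates:
        try:
            connection.send(notification)
            return True
        except ConnectionError:
            continue  # this channel failed; try the next available connection
    return False
```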
Is this specific to this request?
Why wouldn't it do the same thing that oplock break does?
For the oplock?
Well, it's very much like an oplock break.
Does oplock break iterate through all the multi-channel connections?
Mm-hmm. Yeah.
It doesn't. An oplock break implies there's a response that times out, but this operation could take quite some time to notify all your channels
if you're going to wait for a timeout on every one of them.
Right.
That's odd.
I just wonder why it doesn't follow the behavior of an oplock break.
I thought an oplock break was only sent once.
Okay.
It's a notification.
You get one chance.
If it didn't go through, well, you know. Oh. Okay.
Okay, thank you.
So here's an example of the SMB notification.
We have the SMB2 header with the SMB2 server-to-client notification command,
and then the structure size, notification type, notification, and the SMB2 notify session closed message.
So there are some practical ways to trigger this notification.
One example is to use Close-SmbSession with a session ID.
The other example is to have zero open file handles
on an active session when encryption is turned on for the SMB share.
We can let the session sit idle until it passes the 15-minute timeout, which is the default in our current system.
The SMB server will then clean up its state and send the session close notification to the client.
When we implemented this session close notification, it also gave us an easy way to reproduce a rare issue on network disconnect in SMB Direct, which was also fixed recently.
So in the next version of Windows Server,
there will be a new setting
for Get- and Set-SmbServerConfiguration,
auto disconnect timeout v2.
There's some confusion in our current Windows Server:
we use the auto disconnect
timeout for SMB version 1.
So to fix this in the next version of Windows,
the existing setting effectively becomes auto disconnect timeout version 1,
and the v2 setting distinguishes the new behavior.
Yeah.
Thank you.
Thank you all for attending this session.
That's all.
Thanks for listening.
If you have questions about the material presented in this podcast,
be sure and join our developers mailing list by sending an email to developers-subscribe at snia.org. Here you can ask questions and discuss
this topic further with your peers in the storage developer community. For additional information
about the Storage Developer Conference, visit www.storagedeveloper.org.