Storage Developer Conference - #185: SMB3 Landscape and Directions
Episode Date: March 27, 2023...
Transcript
Hello, everybody. Mark Carlson here, SNIA Technical Council Co-Chair. Welcome to the
SDC Podcast. Every week, the SDC Podcast presents important technical topics to the storage
developer community. Each episode is hand-selected by the SNIA Technical Council from the presentations at our annual Storage
Developer Conference. The link to the slides is available in the show notes at snia.org
slash podcasts. You are listening to SDC Podcast, episode number 186.
Good morning. This is the SMB3 landscape and directions session.
We're members of the Windows SMB client and server team.
This is myself, Genghis. Either pronunciation is fine.
And this is Meng.
There are other members of the team that contributed to this work,
notably Stephen Tran, who couldn't present his work today, so we'll cover for him.
So, again, I'm Genghis. This is Meng. I'll do a portion of the presentation, and then Meng will take over.
Roughly, we'll speak about three things: compression and compression sampling,
and then two new features, the authentication rate limiter, which we added to the server, and SMB notification, which touches both the client and server.
That one was an extension to the protocol.
A quick overview of SMB3.
I recently rewatched our SDC talk from 2017, when SMB3 was five years old.
At this point it's 10 years old, and there have been quite a few improvements in the past half decade:
signing and encryption, transparent compression.
We've added support for new transports: RDMA, or as the Microsoft protocol documents describe it, SMBD or SMB Direct.
And then most recently, we've added the SMB over QUIC transport layer.
And that was covered in our 2021 talk at this conference.
In the past few years, the most notable improvements were compression and compression sampling.
Compression itself involved changes to the protocol.
Compression sampling is a feature of the client and server implementation itself, and we'll speak about that in detail. And then,
most notably in the past two years, we've added authentication rate limiting to the server
and SMB notification, and Meng will speak about those two topics.
Just an overview of where we're headed in the future,
and this is maybe an open-ended discussion,
something we'd like to engage the audience on.
Obviously, with SMB being a mature technology,
with anything mature,
inevitably you enter a phase of optimization, consolidation.
You go as lean as possible.
If you've attended Ned Pyle's session yesterday on the security roadmap, you'll see we have quite a lot in mind in terms of deprecating older parts of SMB.
And it may be that in the future we'll deprecate other parts that are no longer needed or no longer relevant.
And then something very nascent that we're talking and working with partners on is offloading parts of
notably encryption and compression into hardware.
Any questions thus far?
All right.
We've covered compression, I think, two years ago also at SDC.
I'll just quickly go over what SMB compression is
and describe a few interesting issues we've run into as we were implementing compression itself,
which may be interesting to other third-party vendors implementing compression.
And I know there's at least one here.
We've also added support for compression sampling.
As I've said in the overview, this is a client feature
where we try to determine if compression is worth doing.
And I'll describe why we do that,
because there is a component to compression
where you might not want to do it.
That's the cost evaluation and heuristics part, which we'll speak about. And then I'll
just quickly go over the performance counters on the client and server to help anybody working
with compression to diagnose how well it is performing, or whether it is working at all.
All right, just an overview of what compression is.
Pretty trivial.
We offer seamless data compression at our transport layer.
This is something you can enable via policy or on a per file basis.
So from the application layer, other than enabling it, there's nothing else to be done.
We seamlessly handle compression.
If you have encryption on, that will integrate with that as well. It's been part of the protocol since version 3.1.1. Currently,
we do not support compression over the RDMA transport. There's no particular reason we
do not, other than we simply did not get to it yet. There are some technical burdens on
our side that we have to address first.
And compression is an area where we see
a lot of potential for future expansion.
And I have a thought on that as well.
And that part, I think, please engage with us
if you have any ideas.
There's quite a few things I think we can do.
It would help to have input from others
to understand, you
know, where to focus on. We did cover compression in the 2018 presentation.
There is a link below on this slide. Part of that talk described some of the nitty-gritty details of the protocol messaging. But I won't go into those in this talk today.
We offer three compression algorithms.
These are documented by a spec, the MS-XCA spec.
That's the list on the slide:
Xpress, Xpress Huffman, and LZNT1.
From the spec itself, quoting verbatim,
these algorithms are not designed to compress image, audio, or video data, obviously because that data is already compressed. Compared
to other algorithms out there, these algorithms emphasize low CPU cost, obviously at the cost
of lower compression ratio. Another thing that we documented in the protocol spec, which is also a form of compression: we have a run-length-encoding prefix scanner
that we apply before we run Xpress. Simply put, we look for a repeating data pattern
at the front and at the back of the packet that we're transferring.
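As a rough sketch of that idea only, and not the actual MS-XCA pattern format or the real scanner, here is a small Python fragment that looks for a run of a single repeated byte at the front and at the back of a payload; the function name and the return convention are invented for illustration.

```python
def scan_repeating_edges(payload: bytes):
    """Find runs of a single repeated byte at the front and back of a payload.

    Returns (front_len, back_len): lengths that could be sent as a run-length
    "pattern" instead of being fed to the main compressor. Illustration only;
    this is not the wire format.
    """
    if not payload:
        return 0, 0

    # Length of the run of payload[0] at the front.
    front = 1
    while front < len(payload) and payload[front] == payload[0]:
        front += 1

    # Length of the run of payload[-1] at the back, not overlapping the front run.
    back = 1
    while back < len(payload) - front and payload[-1 - back] == payload[-1]:
        back += 1
    if front == len(payload):
        back = 0  # the whole payload is a single run; count it only once

    return front, back


# Example: a payload padded with zeros at both ends.
data = b"\x00" * 4096 + b"real data" + b"\x00" * 8192
print(scan_repeating_edges(data))  # -> (4096, 8192)
```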
So a few things that are interesting from an implementation point of view for somebody who wants to implement compression.
You can, even after you've negotiated compression,
again, if you go back to the SDC 2018 talk, we described that in detail,
even after the client and server have negotiated the session and have agreed to enable compression and do compression,
you are not obligated to compress every message.
Another interesting thing is when the client issues a read to a server,
the client specifies a bit that requests the server attempt compression.
There's no guarantee that the server will actually compress. It's better to think about it this way: if the
client does not specify this flag, the server will never compress.
All right. On the client and server, actually, let me skip over this slide. I'll go into cost evaluation and heuristics first.
So compression by itself can be costly.
It is opportunistic in nature in that you have to apply a certain amount of effort,
but you're not guaranteed to get a reward because you don't know if the input data is compressible or not.
Now, we do on the client and server employ heuristics on both macro level and a micro level.
On a micro level, we do things such as not compress small messages
because even though we may save a byte or two, it's just not worth it.
Not only do we expend cycles on the compressor side to do the work,
to save that potentially just one byte,
we then have to burden the server with decompressing that.
And just to save one byte, that's not worth it.
You're not mandated by the protocol to implement these heuristics.
A very trivial implementation may simply always do compression, and that's fine.
That could even serve as a reference implementation for testing.
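As a sketch of the kind of micro-level check being described, here is a small Python helper; the 512-byte cutoff and the names are invented placeholders, not the actual values or interfaces in the client.

```python
MIN_COMPRESSIBLE_SIZE = 512  # invented placeholder, not the real cutoff

def try_compress(payload: bytes, compress):
    """Return a compressed payload only when it is worth sending.

    'compress' is any callable implementing one of the negotiated algorithms.
    Returns None when the message should go uncompressed: either it is too
    small to bother, or compression did not actually shrink it.
    """
    if len(payload) < MIN_COMPRESSIBLE_SIZE:
        return None
    compressed = compress(payload)
    if len(compressed) >= len(payload):
        return None
    return compressed
```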
And that's on the micro level. On the macro level, we did introduce a feature called compression sampling
that tries to evaluate on a larger volume of data
if something is compressible or not.
So it's a simple state machine.
It is configurable.
There are two knobs.
The first is the compression sampling window size,
and that specifies the number of bytes
that the client will transfer
until it makes a decision
whether the data stream is compressible or not.
The second knob is the goal, the target.
If after transferring that volume of bytes,
we've managed to reach the target,
we switch to one of two states.
If we've reached the target, the data stream is compressible,
and then for the remainder of the transfer, we'll always compress.
If it's not compressible, we do not compress.
Another notable thing here is that the state machine runs on the client.
It is not the server that is making these decisions.
And again, if we go back to this tidbit here
about the read operations with a client controlling
whether the server will attempt compression or not,
we utilize that bit.
If a client determines that the data stream is incompressible,
it will never specify that flag to the server,
so the server will never compress.
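A minimal sketch of that sampling state machine, as just described: evaluate until a configured number of bytes has been transferred, then latch into a terminal compressible or not-compressible state. The class, field, and method names here are invented for illustration, and the comparison against the threshold is simplified; the real client-side logic may differ in detail.

```python
from enum import Enum, auto

class SamplingState(Enum):
    EVALUATING = auto()
    COMPRESSIBLE = auto()      # terminal: keep compressing until the file is closed
    NOT_COMPRESSIBLE = auto()  # terminal: never request compression again

class CompressionSampler:
    """Per-open-file sampling state machine (illustrative names only)."""

    def __init__(self, window_bytes: int, threshold_bytes: int):
        self.window_bytes = window_bytes        # sampling size knob
        self.threshold_bytes = threshold_bytes  # compressibility threshold knob
        self.raw_seen = 0
        self.compressed_seen = 0
        self.state = SamplingState.EVALUATING

    def record_transfer(self, raw_len: int, compressed_len: int) -> None:
        """Feed one read's or write's sizes (after decompression on reads)."""
        if self.state is not SamplingState.EVALUATING:
            return
        self.raw_seen += raw_len
        self.compressed_seen += compressed_len
        if self.raw_seen >= self.window_bytes:
            # Decide once the whole sampling window has been transferred.
            # Sketch: compare the compressed total directly to the threshold.
            if self.compressed_seen <= self.threshold_bytes:
                self.state = SamplingState.COMPRESSIBLE
            else:
                self.state = SamplingState.NOT_COMPRESSIBLE

    def request_compression(self) -> bool:
        """Should the client compress writes and set the read 'please compress' flag?"""
        return self.state is not SamplingState.NOT_COMPRESSIBLE
```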
I have a question.
Yes, please go ahead. ...Will the client then also run the algorithm on the received data, in order to check whether the read requests
should be compressed or not?
So I'll repeat the question
and then just verify that I interpreted it correctly.
So if somebody sets the, for the first time,
the compression sampling window size to 100 megabytes
and the client only completes a portion of that data transfer,
let's say 10 megabytes, does the sampling state machine make a decision or not?
No, no. My question was if this sampling also happens on reads, when the client reads back data from the server,
and the server responds back with obviously a compressed packet. Does the client consider that as part of the decision?
Yes, it does.
Obviously, again, there's a caveat here in that we'll specify the bit as part of that while we're in that evaluation phase.
We'll specify the bit to say, server, please always compress.
The server will compress.
And I believe, actually, so remember earlier I mentioned
that small messages are not compressed.
In that case, I believe, I'll have to double-check the code,
that the server will still compress.
So the server may do useless work here.
But when that packet comes back to the client
and the client decompresses,
it will compare the size of the decompressed buffer and the size of the compressed buffer and then factor that into the decision. Yes. And as I said, you do have to transfer
the full target size, the full volume, before you transition out of the evaluation state
into either the compressible state or the non-compressible state. And those are terminal
states. Once you're in one of those two states, you never come out of them until the file is closed.
Does the server do the evaluation as well?
The server does not. Actually, when I studied the code originally, that was surprising
to me. Perhaps there is a different design here where the server may do the evaluation
and coordinate with the client.
Because then it could be sort of opt-in rather than opt-out.
Rather than starting to compress and then turning it off once it turns out to be worthless,
the server could say, well, you know, I can see it's not compressible, so I'm not going to compress.
All right.
So, yeah, sorry.
Yeah, the question comment was, can we do the sampling on the server?
Can the server cooperate with the client
on doing the sampling?
I believe so.
I believe there is a design that will work just as well
or maybe even slightly better in some regards.
We'll have to do some thinking on that.
But ultimately, even if you do it just on the client,
just on one node,
the client ultimately receives a packet back from the server.
So we'll do the decompression.
Yes, if you did that on the server,
perhaps the server may avoid doing costly compression work
that it would perhaps know ahead of time
that this is not compressible, let's not do this.
Yes, so, one more question. Are the responses sent compressed?
The question was, does the server send the responses compressed?
While we're in that initial evaluation state,
the client will ask the server to compress all the time.
Yes.
Is it expected that the server will compress
while we're in that initial state?
Yes.
Yes.
Is this only for data operations? Or does it apply to metadata operations too?
Sorry, which operations count as data?
Is sampling done only for reads and writes
or for metadata operations also?
Good question.
I believe we only do it, the question was,
is the sampling done only for reads and writes or other messages such as metadata?
I believe it is only reads and writes, but I'll have to clarify and get back to you on that.
These are the knobs here on the slide that control compressibility sampling through our PowerShell interface,
Set-SmbClientConfiguration.
There's an option to enable or disable it,
a Boolean option,
and then the compressibility sampling size
and threshold.
Threshold is that target value.
So let's say you specify compressibility sampling size
to 100 megabytes,
compressible threshold to 50 megabytes.
For the state machine to determine that the data is compressible,
those 100 megabytes have to compress down to 50 megabytes or less.
So 49, 48, or 50, that's fine.
That's considered compressible.
Anything over 50 is not compressible.
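To make that arithmetic concrete, here is a small illustrative check using the values from the example; the variable names exist only for this snippet.

```python
# With a sampling size of 100 MB and a threshold of 50 MB, the first 100 MB
# transferred must compress down to 50 MB or less for the stream to be
# latched as compressible.
sampling_size_mb = 100
threshold_mb = 50

for compressed_mb in (48, 49, 50, 51, 60):
    verdict = "compressible" if compressed_mb <= threshold_mb else "not compressible"
    print(f"{sampling_size_mb} MB compressed to {compressed_mb} MB: {verdict}")
```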
And Ned recently published an article documenting all this.
There's a link here.
Since this is not a protocol feature
and something we've added fairly recently,
obviously there will probably be some defects or shortcomings we find, so this is subject to change in the
future.
I've covered this earlier.
I skipped to the slide again.
So, just to summarize: compression is an opportunistic operation.
It may be costly with no guarantee
of a reward. Therefore, it would make sense to add some smartness around whether we do
compression or not. In terms of diagnostics, we've added several
performance counters on the client, on the server as well, I believe, but the client ones are more detailed. There's the compressed request per second and compressed
response per second that track the number of compressed requests and responses. Again,
I'll double-check if those include just reads and writes or metadata as well. And then for cases,
again, I mentioned that we will not always compress
due to heuristics. For cases where we fail compression or decide not to compress, those
requests will not be counted in the successful compressed request per second. So actual number
of packets compressed is that last counter, successful compressed request per second.
You can, again, monitor these counters
using the built-in Windows Performance Monitor.
A few screenshots here.
So two different workloads.
One is we're just writing data that's completely random,
pseudo-random, and it's not compressing at all.
You can see the successful compressed requests
per second counter is zero.
Writing all zeros continuously.
A surprising thing here is we should be able to compress every packet, but we do not. And I believe that goes back to the gentleman's question. Perhaps we're not
compressing metadata packets. So I'll go back to that and double-check, and we'll
document that.
And on this slide, I'd like to call to the audience and other vendors out there who are
implementing compression.
Again, I know there is at least one.
We've had plenty of questions, found shortcomings in the protocol spec that we've addressed,
and there are a few other paragraphs we'll need to address there as well.
On our roadmap, definitely we need to add compression support for the RDMA, SMB Direct, transport layer.
In the way that compression combines with encryption and signing, specific to our own SMB client and server,
we want to do a bunch of refactoring and address technical debt.
And partly that is the debt that is preventing us from
committing to implementing compression with RDMA and SMB direct fast enough.
Another thing is we'd like to add more sophistication to the compression cost evaluator and heuristics.
Automated performance analysis, for example, which algorithm performs better or not versus
others.
Which ones are more CPU-heavy versus higher compression ratio?
Which ones have a higher memory footprint?
And again, you might have different characteristics
on the compressor and decompressor.
With SMB compression, in the past few years,
we've had a few security cases.
So this is obviously one of the more complex portions of our code.
And again, just generally anything with higher complexity, higher risk of defects, especially
security defects, and our implementation living in kernel mode, those can be especially dangerous.
We're onboarding automated fuzzing for the compression layer.
That's not done yet, but that's in progress.
Interoperability validation.
So if we extend this going into the future, this being a complex part of a protocol,
we'd like to make sure we interoperate not only with older versions of our own client and server implementation,
but also third-party vendors out there.
So I'd love to have a discussion with other vendors out there if we can set something
up where, you know, we run something on a semi-automated basis, I don't know, some cloud
somewhere, and we rent VMs, and we run different clients against our own client.
That'd be just great.
And again, I've mentioned in the overview
slides, something very nascent we're working with hardware partners on is offloading compression
into hardware. That's not something that's standardized at all. I think, you know,
we'll see standardization in a year's time at least.
And other than the three algorithms that we currently support, I'd love to open
this up to other compression algorithms out there. I know there's been quite a few developments
out there in the open source world. Zstandard is a notable one. So if we can support that,
well, you know, while staying compliant, that'd be great. Another thing I'd like to do, we're
thinking of, is extend the protocol to chain different algorithms together. So, as an example,
the RAR compressor, WinRAR, implements filters for better, more efficient
compression of executables.
The way it does that, it runs filters that flatten branches in executable code,
and that increases the opportunity for redundancy.
Something similar is done, actually, in the Microsoft CAB format.
So if we can stick that into SMB,
somebody copying EXE files, you know, they'll benefit.
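As an illustration of the general filtering idea being described, and not anything SMB does today, here is a sketch of the classic x86 E8 call-target transform used by executable-aware compressors: relative call offsets are rewritten to absolute ones so that repeated calls to the same function become identical byte sequences a dictionary compressor can exploit.

```python
import struct

def e8_transform(code: bytes) -> bytes:
    """Rewrite x86 CALL rel32 targets from relative to absolute offsets.

    Calls to the same function then encode to identical byte sequences,
    which gives the compressor more redundancy to find. Purely illustrative;
    real filters bound the offsets and handle more edge cases.
    """
    buf = bytearray(code)
    i = 0
    while i + 5 <= len(buf):
        if buf[i] == 0xE8:  # CALL rel32 opcode
            rel = struct.unpack_from("<i", buf, i + 1)[0]
            absolute = (rel + i) & 0xFFFFFFFF
            struct.pack_into("<I", buf, i + 1, absolute)
            i += 5
        else:
            i += 1
    return bytes(buf)
```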
A few links on this slide. Again, there's that article
that we recently published, and then the spec link for the
Xpress compression algorithms.
Yes.
So, for your various compression algorithms, it would be very handy to have a reference implementation, because each company has to write their own implementation of those compression algorithms,
and they need some way of verifying that their algorithms are correct.
So I think just to summarize the question, it's on the correctness of implementation of these compression algorithms.
Yeah, I'm not an expert. I'll have to look into that and get back to you.
I don't believe we have a reference implementation.
We should.
At this point, I'll hand off to Meng.
It's 8:52.
Yes.
Just a question on compression.
If the client requests compression,
is the server allowed to reply uncompressed?
The question is, if a client requests compression,
is the server allowed to reply uncompressed?
The answer is yes.
Again, there's that bit.
SMB2_READFLAG_REQUEST_COMPRESSED.
If the client requests compressed, can the server reply uncompressed?
Do you mean, if the server sends something back uncompressed?
If the request asks for compression, can the reply come back uncompressed?
Is that it?
So the answer is the server can always reply uncompressed.
It's a hint or a please kind of a bit.
The spec says it controls compression; controls is a little bit strong.
But it requests the server to attempt compression.
The server can say, sorry, it's not working,
and it will reply with a normal, uncompressed response.
And the server selects which compression algorithm to use.
In fact, it always does that.
There's a set of algorithms that are agreed upon by both client and server.
I'm sorry to jump in.
No, thank you, Tom.
Yeah.
I appreciate that.
Yeah. I used to be at Microsoft, and I'm the author of some of these documents.
Basically, they pre-negotiate a list of algorithms, and they can choose whatever algorithm they like per reply.
It's just like how they negotiate the sealing algorithm.
So the client sends a list, and then in the negotiate response the server takes one.
So is this the one where you say, come on, I've got a request, as a second step?
No, the algorithm list never flows over the wire again; it's explicitly accepted in the very beginning,
when they agree which ones.
It's more than one, right? I mean, does the server only pick one?
Each operation uses only one, correct, but you can try multiple algorithms, see which one's best, and choose that one after that.
The server returns one out of those, LZ-whatever.
It doesn't return all of them.
It's kind of... it's a little bit messy, but...
Different streams might be compressed with different algorithms.
So all of this is not bad.
And it's done only at the connection level, and there's nothing per share.
What do you mean by that?
Some of it is negotiated as part of the negotiate request, on a connection basis.
And for the full connection, for all shares accessed over that connection, it sticks for the lifetime of the connection.
The question is at what level
or what grain compression is negotiated.
Yeah, it's at the tree connect
session setup, I believe. So I'll refer to the SDC 2018 talk. Some of these messages
and the session setup negotiation are described in detail there, at about 33 minutes in.
All right. At this point, I'll hand off to Meng.
Thank you.
So let's move on to the SMB authentication rate limiter.
This is a new feature developed to defend against NTLM password brute-force attacks.
We have future plans to address password spray attacks as well.
To recap, in SMB we have the SMB server, or srv2, instances.
Each srv2 instance handles traffic from regular SMB clients accessing file shares,
as well as other internal traffic.
And we also have other srv2 instances for, for example, CSV and SBL.
So, let's move on to the structural design of the authentication rate limiter feature. Each srv2 instance has a timer wheel.
The reason each srv2 instance has only one timer wheel of its own is that we don't want all srv2 instances relying on a single shared wheel.
Each wheel is a structure containing various members.
For example, there is the slot array, which is an array of slots,
where each slot is actually a linked list
containing the delayed authentication work items.
We also have a cursor,
which points to where we are in the slot array.
And we have an expiry counter, which counts how many expiries we have had so far.
For example, if the count is equal to the total number of slots, it means there is nothing left to
process, and we can turn off the timer to save
resources. And we also have a DPC, which we use to process the delayed authentication work items.
This structure is per NUMA node. To recap, NUMA, or non-uniform memory access, is a computer memory design used in multiprocessing.
It means that memory access time depends on the memory location relative to the processor:
a processor can access its local memory faster than non-local memory,
where non-local memory includes memory shared between processors and memory local to another processor.
The reason we make the authentication rate limiter structures per NUMA node is that if we relied on just one slot array for all NUMA nodes,
that would cause performance degradation: it would cause inter-node, or cross-node, memory accesses, which increase the memory
access time.
So, for example, if we have a machine with two NUMA nodes, we have something like the diagram:
each NUMA node has a similar structure,
and each slot array, as I said, is an array of linked lists containing the delayed work items.
In our current implementation,
we allow the authentication delay to be up to 10 seconds.
In order to accommodate that amount of delay, we have to ensure we have enough time slots.
The timer expires every 100 milliseconds,
so 10 seconds divided by 100 milliseconds equals 100. That means we need 100 slots, each holding the delayed work items in a linked list for that index.
So, let's look at an example.
This is a simplified example of how it works.
Assume the cursor is pointing at the start of the timer wheel, which is index zero,
that the time between slots is one second, and that we want to delay a
work item for two seconds. That means that when the work item arrives,
we schedule it two slots ahead of the current cursor.
After one second, the cursor moves to the next index,
but there is nothing to process,
so we keep moving. After the next second, the cursor is pointing at index two, so it grabs all the work items in the linked list at index two, and they are processed by the SMB server.
There are other improvements needed to make this work correctly, because there are some dependencies on top of SMB.
For example, the Multiple UNC Provider and the Distributed File System.
For example, when we do net use \\server\share with a username and an invalid password,
there will be some additional delay caused by those dependencies.
For example, DFS will attempt an SMB tree connect
to the IPC$ share
to check if the path is part of a DFS namespace.
This happens before the SMB client tries to do the regular SMB
tree connect. To remove that additional delay, we have added a new
cache. That gives a 50% improvement for the net use case, so there is only one trip instead of
two, and a 75% improvement for the other cases, with only one trip instead of four.
Any questions so far? Yes?
Is it implemented just in the server, with no protocol extension
in terms of error codes or anything like that?
Sorry, can you say that again?
Is there anything added or extended in the protocol
to make use of this?
Oh, this is enabled by default.
So there's no extra protocol work for this.
Even on the wire, you don't see anything new.
So let's move on to the SMB notification. It's a one-way server-to-client message
that could later be repurposed
to replace the Scale-Out File Server witness mechanism
and the client periodically querying
the available network interfaces on the server.
This was originally designed to solve the
Remote Desktop Services, the Terminal Services, auto-disconnect timeout issue
when SMB encryption is turned on. This happens, for example, when we have
multiple users accessing a share. One of the users has no I/O for more than 15 minutes,
which is the default in our current implementation.
So the server, in order to save resources,
forgets the encryption and decryption keys,
but the client doesn't.
The next time that user issues another I/O request
using the same session, the client still considers the session valid, so it issues the request to the server.
But the server cannot decrypt the message, because it has already forgotten the decryption keys. So
it sends an error back to the client. But the client expects a valid response with a valid message ID.
So it terminates the connection, causing a TCP reset and a session reset. What we have
done is let the SMB server send a session close notification to the client, telling the client to disconnect
the session matching that session ID. Then, whenever the next I/O triggers session
re-establishment, it kicks in the existing reconnect state machine and creates a new session with new encryption and decryption keys.
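A sketch of what that client-side handling might look like; all of the names here (connection, session_table, invalidate) are invented for illustration and are not the real redirector interfaces.

```python
def on_session_close_notification(connection, session_id) -> None:
    """Illustrative client handling of an unsolicited session-close notification.

    The server says it is throwing away the session and its encryption and
    decryption keys, so the client drops its own session state instead of
    continuing to send requests the server can no longer decrypt.
    """
    session = connection.session_table.get(session_id)
    if session is None:
        return  # no matching session on this connection: discard silently

    session.invalidate()  # forget the keys and mark the session as gone
    # The next I/O on an affected handle triggers the existing reconnect path,
    # which performs a fresh session setup and derives new keys.
```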
So here's the overview of the protocol change. We have the SMB2 server-to-client notification
message. It has a structure size of two bytes,
which is the size of the server-to-client notification
structure.
And we have two bytes reserved for future use.
Then the notification type field is four bytes.
Then comes the notification itself; because we only support this one notification, its length here is four bytes.
And the bottom one is the SMB2 notify session closed message, which is those four bytes.
So here's the update to the header. This feature has a new header command,
SMB2 server-to-client notification, with a new command value.
The new client will accept SMB2 oplock break and SMB2 server-to-client notification messages
when the message ID is the invalid message ID, 0xFFFFFFFFFFFFFFFF.
As for the client, the client can receive
a session close notification at any time from when the user has issued a session setup request up to the time the client receives the SMB response
to the client's logoff request,
for that same session.
If the client cannot find an existing session
in the connection's session table,
the client must discard the notification.
We also have some ways to secure the notification, which I will talk about in a later slide.
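For illustration only, here is a sketch of parsing and validating such a notification; the field layout follows the description above, but the constant values are placeholders, and the authoritative numbers and rules live in MS-SMB2.

```python
import struct

NOTIFY_SESSION_CLOSED = 0                 # placeholder type value for illustration
INVALID_MESSAGE_ID = 0xFFFFFFFFFFFFFFFF   # the invalid message ID used for unsolicited messages

def parse_server_to_client_notification(header, payload: bytes):
    """Illustrative parse of an unsolicited server-to-client notification.

    Layout per the description: StructureSize (2 bytes), Reserved (2 bytes),
    NotificationType (4 bytes), then the notification body.
    """
    if header.message_id != INVALID_MESSAGE_ID:
        raise ValueError("unsolicited notifications carry the invalid message ID")

    structure_size, _reserved, notification_type = struct.unpack_from("<HHI", payload, 0)
    body = payload[8:]

    if notification_type == NOTIFY_SESSION_CLOSED:
        # The session being closed is identified by the session ID in the SMB2 header;
        # if no matching session exists, the client simply discards the notification.
        return ("session_closed", header.session_id)
    return ("unknown", notification_type)
```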
So there's no reply to the notification, right?
It just drops the message.
No, the server doesn't expect a reply from the client.
Okay, so it's a one-way notification.
Yes, yeah.
Okay. How is this different from a user session deleted response on the client side? Today, only once the session is closed
and the client sends another request
can the server send the user session deleted error,
which is then handled on the client side.
Sorry.
So the question was,
how different is it from a user session deleted response
from the server?
Yes.
So, Tom, correct me if I'm wrong: user session deleted is a response from the server,
so the client has to send something in order to get that response.
Whereas with this mechanism, the server, out of the blue,
can send something to the client.
So, I'm trying to understand the motivation for this.
Is it to correct the client behavior?
Because you mentioned the signing and encryption keys,
which the client doesn't invalidate
if it keeps retrying with them.
Are you trying to fix that issue with this?
So, just to add to what Anirudh was asking in the back:
what if you just did nothing, and the client sent an SMB2 create
or something with that now-invalid session ID,
and you responded to that with user session deleted?
What would the client do differently in that case
versus when you send this out-of-band notification?
The question was, why do we even have to do this;
because if we did not,
the client would eventually get to a state
along with the server where, yes,
the session is invalid,
but they would renegotiate, reconnect,
and everything would be good.
I believe that's right; this is what would happen.
I don't think we have a slide on the original context of the problem we're trying to solve.
But that's essentially what was happening.
There might have been something more serious to that problem we were solving.
I don't quite recall.
I see.
I have a question.
Is this implemented in a specific version of Windows? Which version is it?
It would be the latest one.
I can't remember.
The question was, what version?
Something fairly recent.
Okay.
And is there no plan to backport it to older versions?
I believe we backported it to some slightly older version.
It would be better to have that published; we'll publish it when we can.
Maybe you're going to say this, but you haven't yet said what the client does if it does find the session.
What was the client action on this notification?
Are you going to talk about that in a minute?
Oh, yeah.
Yes.
Are we going to... Sorry.
The question is, what does the client actually do?
It says here what to...
It discards the notification if there's no session.
What does it do if there is a session?
I'm assuming it's being requested to close the session.
Right, so the client will close the session.
And so the server will give him a little more time,
and if he doesn't, he just slams the door and does what it does today.
Right, so this is a mechanism to more gracefully notify the client.
Okay, so general client information.
If it does find the session, please close it.
Right?
That's the action that's being requested by this callback?
And that gives the client the opportunity
to clean up and so on.
Because the server is essentially saying
the keys you're holding, the state you're holding, is now outdated.
I'm going to discard it on my end.
On the server.
But is the notification saying the session is already closed,
or is it a notification that it will close in half a minute or so?
I mean, I may have over 500 operations on that session, possibly,
with outstanding writes due to delayed write-back.
What do we do with those? Throw them away?
On the earlier slide, it said it was a notification.
So my delayed writes, I think they're gone, right? Is that right?
I don't know.
What do we do with outstanding operations? I have buffers that are dirty because I have an exclusive lease,
and I want to write them back.
And now where's my session?
There it goes.
Right, those are good questions.
Unfortunately, I don't have an answer.
We'll get back to you with the answers. We'll ask Steve.
You had a slide that said it's used when sessions are retired by the server.
I thought that was the whole context, but you're saying it's not?
It could potentially be used beyond that and so on?
So, yeah, the motivation was there was a server knob
to automatically retire sessions.
And when the server did that,
it got the client into an undesirable state.
So that was the motivation for doing that,
is to gracefully tell the client that the server's doing that.
Client, please do be aware of that so we don't go into...
But we would only do that if the session was idle.
If you have outstanding handles, et cetera.
If you have outstanding handles, then...
Correct.
Then there's always a race with the server switching the session to the state where it believes it's idle concurrently with some request coming in that unidles it.
Yeah.
So this is a trigger when you've got an idle session and you're telling the client, hey, you can't do anything.
If I don't see anything from you, I'm going to close this.
Yeah, that's the question. Or is it a take-effect-now kind of thing?
Yeah, that's the real key.
Because we have something similar with our connection timers,
where if we're not seeing the things we want,
we'll just close it.
It would be nice to get this message,
to know that we're not in that case.
We'll have to double check.
I think I understand the concern.
How idle is it?
And if a client is really not that idle,
what happens?
So this is something we've recently added.
We've never, well, we've never before had something where the server
tells the client unprompted. But, you know, I imagine we'll find issues. So this could be one area.
Okay. So in terms of the dialect negotiation, we have a new SMB2 global capability for notifications. It indicates that the
SMB client can receive such notifications from the server. The client sends this new
global capability flag along with its other capabilities, and the server must set
the connection's support-notification flag to true if the capability is agreed between the client and server.
Then, during the initial session setup,
the session's support-notification flag must be set to true
if the connection's support-notification flag is true for that session.
In the case of SMB multi-channel,
for a subsequent binding request,
the server must check the session's support-notification flag
against the connection's support-notification flag. If the values are different, the server must
reject the binding request from the incoming client.
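A sketch of the capability and binding checks just described; the flag value and the object fields are placeholders for illustration, not the actual MS-SMB2 constants or server data structures.

```python
SMB2_GLOBAL_CAP_NOTIFICATIONS = 0x00000400  # placeholder bit; see MS-SMB2 for the real value

def on_negotiate(connection, client_capabilities: int) -> None:
    # Record whether this connection's client advertised the notification capability.
    connection.supports_notifications = bool(
        client_capabilities & SMB2_GLOBAL_CAP_NOTIFICATIONS
    )

def validate_session_binding(connection, session) -> bool:
    """On a multichannel binding request, the capability recorded on the session
    at initial session setup must match the capability of the connection the
    bind arrives on; otherwise the server rejects the request."""
    return session.supports_notifications == connection.supports_notifications
```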
For signing, we have some areas still to work out in order to implement this. One option is to add a new server notification bit, set to one, for AES-GMAC
nonce generation. That would require us to generate a random 64-bit message ID for each
notification instead of the current invalid message ID. The other option is to add a new field to the server-to-client notification structure to
carry a randomly generated nonce from the server. So a client on a dialect newer than 3.x would acknowledge
that it can receive SMB notifications with a random message ID, and a server on a dialect newer than 3.x would
send the signed notifications.
So you're using a random message ID to avoid resending the same payload twice? It's serving a little bit like a nonce, I think. That's odd, but okay.
I don't have any detail on this one
because we haven't implemented this yet.
Oh, okay.
Yeah.
The random mid is what gets me.
I don't quite understand it.
I'd like more detail, but okay.
Okay. So for the protocol change,
when SMB encryption is turned on,
there are two scenarios to handle.
If the notification is not associated with an SMB session,
encryption will not work
unless we broadcast the notification
to all clients, to all sessions, so that each session's encryption key can be used. In the other case,
if the notification is associated with a specific SMB session, the notification will be encrypted and sent to the client. For the multi-channel
scenario, if the notification is not associated with a specific session but is
scoped to a particular client, the server will iterate through the client connection
list, matching the client GUID, and send the notification on the first available connection.
If that fails, it will try the next connection.
If the notification is associated with a specific session,
the server will iterate through the session's channel list
to find the first available connection.
If that fails, it'll try the next connection.
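A sketch of that delivery logic; the object model here (server.connections, channel_list, client_guid, send) is invented purely for illustration.

```python
def send_notification(server, notification) -> bool:
    """Illustrative delivery of a notification, per the description above.

    If the notification is tied to a specific session, walk that session's
    channel list; otherwise walk the connections belonging to the target
    client, matched by client GUID. Either way, try each connection in turn
    and stop at the first one that accepts the send.
    """
    if notification.session is not None:
        candidates = notification.session.channel_list
    else:
        candidates = [c for c in server.connections
                      if c.client_guid == notification.client_guid]

    for connection in candidates:
        try:
            connection.send(notification)
            return True
        except ConnectionError:
            continue  # this channel failed; try the next available connection
    return False
```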
Is this specific to this request?
Why wouldn't it do the same thing that oplock break does?
For the oplock?
Well, it's very much like an oplock break.
Does oplock break iterate through all the multi-channel connections?
Mm-hmm. Yeah.
It doesn't. An oplock break implies there's a response that times out, but this operation could take quite some time to notify all your channels
if you're going to wait for a timeout on every one of them.
Right.
That's odd.
I just wonder why it doesn't follow the behavior of an oplock break.
I thought an oplock break was only sent once.
Okay.
It's a notification.
You get one chance.
If it didn't go through, well, you know. Oh. Okay.
Okay, thank you.
So here's an example of the SMB notification.
We have the SMB2 header with the SMB2 server-to-client notification command,
and then the structure size, notification type, notification, and the SMB2 notify session closed message.
So there are some practical ways to trigger this notification.
One example is to use Close-SmbSession with a session ID.
The other example is to have zero open file handles
on an active session when encryption is turned on for the SMB share.
We can let the session sit idle until it passes the 15-minute timeout, which is the default in our current system.
The SMB server will then clean up its state and send the session close notification to the client.
When we implemented this session close notification, it also gave us an easy way to reproduce a rare issue on network disconnect in SMB Direct, which was also fixed recently.
So in the next version of Windows Server,
there will be a new setting
for Get- and Set-SmbServerConfiguration,
auto disconnect timeout v2.
There's some confusion in our current Windows Server:
we use the auto disconnect
timeout for SMB version 1.
So to fix this in the next version of Windows,
the existing setting effectively becomes auto disconnect timeout version 1,
and the v2 setting distinguishes the new behavior.
Yeah.
Thank you.
Thank you all for attending this session.
That's all.
Thanks for listening.
If you have questions about the material presented in this podcast,
be sure and join our developers mailing list by sending an email to developers-subscribe at snia.org. Here you can ask questions and discuss
this topic further with your peers in the storage developer community. For additional information
about the Storage Developer Conference, visit www.storagedeveloper.org.