Storage Developer Conference - #46: Building on The NVM Programming Model – A Windows Implementation
Episode Date: June 5, 2017...
Transcript
Hello, everybody. Mark Carlson here, SNIA Technical Council Chair. Welcome to the SDC
Podcast. Every week, the SDC Podcast presents important technical topics to the developer
community. Each episode is hand-selected by the SNIA Technical Council from the presentations
at our annual Storage Developer Conference. The link to the slides is available in the show notes at snia.org/podcast.
You are listening to SDC Podcast Episode 46.
Today we hear from Paul Luse, Principal Engineer, Intel,
and Chandra Kumar Konopi, Senior Software Engineer with Microsoft,
as they present Building on the NVM Programming Model, a Windows Implementation, from the 2016 Storage Developer Conference.
All right, so we're going to talk about NVML, which a lot of you folks probably already
know about, but believe it or not, we're moving it to Windows now, so we're going to have
a Windows version of NVML.
My name is Paul Luse.
I'm a software engineer at Intel.
I've been there over 20 years,
most of it doing storage software stuff.
I recognize a few folks here in the room.
I did a lot of the early NVM Express work.
If you're familiar with the Open Fabrics Alliance Windows driver,
I was the chair of that first working group,
and there was a team of four or five of us that wrote that
and put that out initially.
So it's good to see that's still plugging along and cruising.
And I took a little time off and did some OpenStack Swift work.
And now I'm back on this Windows NVML implementation
with my partner in crime over here, Chandra.
Hi, I'm Chandra. I work on NTFS,
and I've been enabling DAX file system mode in NTFS,
with Neil, if you're familiar with Neil.
And I also work with Paul on moving NVML to Windows.
And I think we've got to manually click.
Sure.
All right, so let's talk about what we're going to talk about.
This is more sort of a high-level talk.
We wanted to give you guys an overview on really what we're doing in Windows land.
NVML as a library, as a lot of you probably know,
has been around for a couple of years following on the programming model.
So for an agenda today, we, of course,
have to do some level setting on the programming model.
Hopefully some of you were here earlier for Doug's talk or many other talks or Andy's talk yesterday
or sticking around for Andy's talk afterwards.
And there's lots of deep dive stuff in the programming model.
We're just going to kind of remind you what it is and where it's relevant when we talk about NVML, the library.
And then, of course, we need to talk about NVML, the library itself.
This is not a deep dive talk on the library.
There are other talks that you can hear about the library or you can talk to any of us afterwards. Like I said earlier,
we're really here to focus on the fact that we're doing this for Windows and
give you some idea of what that means, who's involved,
what kind of process we're following,
some of the architectural decisions that we had to make early on,
how are we going to approach this, and of course the status on where we're at,
and when we're going to be done, when you can actually use this library in Windows.
So that's what we hope to get out of today, and if we do our jobs correctly,
you guys will all walk away being experts on everything on this outline.
There will be a quiz.
Okay, so it doesn't say SNIA NVM programming model SW, but it does say open, so that's close.
It doesn't say the Intel programming model.
So the SNIA NVM programming model,
probably no news here to anybody in the room,
but it started back, I don't know, four or five years ago, right? Lots of companies put forth
the specification, links down below, and as Doug talked about in the
last talk, it's not an API, right? It's a programming model specification. So let's keep plugging forward.
And this is a very similar slide to at least 16 other presentations
floating around out there.
Couple of small tweaks that we made for the Windows side
about what we're going to go through.
We'll just sort of highlight the main three paths that show up
in this version of the slide.
Of course, your traditional application
through the file system down through your NVDIMM driver. We've got your second path, where the application goes through a PM-aware file system.
And in Windows, it's also called DAX.
Windows uses the term storage class memory to refer to persistent memory.
So you may see SCM floating around in our slides.
But DAX is the implementation and involves NTFS today and the cache manager
and the memory manager. Chandra did a lot of that work himself personally, so if you
have any really hard questions for him, that'd be awesome.
And then really the focus of this talk is the third path, the application direct path
is what we call it. And that's of course where once you've gone through the control mechanisms
of the file system,
you've done your file mapping, you've got everything set up,
and you've got load store access to memory.
Now you've got direct access to your hardware.
So, cue NVML.
Well, first let's talk about the status
of OS enablement of the programming model itself.
So I'm not sure how many people know,
but Windows has persistent memory support right
now.
It's available in all Windows SKUs, starting with the Anniversary Update and Server 2016,
specifically within NTFS, with DAX mode.
But you can do everything on Windows today that you can do on Linux.
And then on Linux, support, we heard, came in somewhere around kernel 4.4,
but we're Windows guys,
so it could have been earlier, could have been later.
Somebody here probably knows better.
And then both of these refer to the JEDEC-defined
NVDIMM devices.
Okay, also something that maybe a lot of folks don't know,
and I'll talk more about our Windows
community a little bit later, it's not just an Intel Microsoft thing, we've got a couple
other companies involved and we're beating the bushes trying to get more, but HP Enterprise
is doing some stuff above and beyond what Microsoft has done for NVM support, specifically
around Server 2012 R2.
So they've got the ability to provide early access to NVM technologies
in this older OS without any kernel changes from Microsoft.
So these are kernel mode drivers that HP Enterprise has provided
that are able to provide access for the Server 2012 R2 folks.
So one real specific thing, there's no DAX file system support.
Like I said, Microsoft is not behind these changes.
But this does use a subset of NVML based on the work that we're doing on the Windows side.
And we've got a lot more information that you can get out of this link here.
And these slides are posted.
So if you want to learn how to do some early access NVM stuff with OS releases prior to Server 2016,
then you can follow that link and get them.
Okay so now NVML, and I think this is probably maybe the third time this slide has been up
today or at least something very similar.
So hopefully everybody knows the NVML library is something that was created not to implement the persistent memory programming model, the SNIA NVM programming model, but to assist it.
So it's there to help application developers better take advantage of this programming model.
So it really came out of input that we had heard on common pain points out in the industry from various customers that we were talking to.
And we're going to go through some of those details, exactly what it does.
But it involves, of course, hardware abstraction, making things easier.
There's definitely some value-added functionality for application developers that aren't used to dealing with transactions
or required atomicity, things that used to be dealt with by the file system
or somebody beneath them that they didn't care about,
and then some performance improvements as well
or performance enhancements over maybe standard ways of doing this.
Okay, so I'm going to turn it over to Chandra now,
and he's going to go through several of the NVML libraries and show you a couple of examples of what they look like under Windows.
Thank you Paul.
Okay, NVM libraries.
So it is just a set of convenience libraries that work on top of the open NVM programming model
to solve common problems for application developers.
So that is the way to look at it, one way or the other.
So it is in user space, for now at least.
And it is not a single library,
it's a bunch of libraries, as I initially said.
There is one core library which does the basic functionality,
like pmem_persist, to make sure that the data that we store is persisted,
and testing whether a region is
directly backed by persistent memory, functionality like that.
And there are other libraries, as you can see: libpmemobj, libpmemblk, libpmemlog. They depend on
the core library and provide more advanced support.
Each one has a key focus; libpmemlog, for example, provides logging APIs, et cetera. And again, this
adds value on top of the file system, as we can see,
and it's a user mode library.
Okay,
let's now look at each library and what its key focus is.
So libpmem is, again, the core library that is going to provide functionality like flushing, or persisting, the data that is written, and testing
whether a region is DAX mapped or not. It's the core, and every other library depends on
that one. libpmemblk provides block atomicity. If applications are used to, or
rely on, sector atomicity, this is something that might help those applications.
Because it doesn't provide generic transactional support, just atomicity on whether a block
is written or not, it is more efficient than the transactional libraries.
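As a rough sketch of that model (the pool handle would come from pmemblk_create or pmemblk_open, and the block number here is an arbitrary example):

    #include <libpmemblk.h>

    /* Sketch: atomically overwrite one block in a pmemblk pool. */
    void write_block(PMEMblkpool *pbp, const void *buf)
    {
        /* Either the whole block is written or none of it,
           even across a crash. */
        pmemblk_write(pbp, buf, 42 /* example block number */);
    }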
And libpmemlog provides APIs for logging on persistent memory: you create a
log file and keep appending to it on persistent memory.
That's libpmemlog, for applications that need that.
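A minimal sketch of that append-only model (the pool file name, size, and mode are placeholders):

    #include <libpmemlog.h>
    #include <string.h>

    int log_example(void)
    {
        /* Create a log pool backed by a file on the volume. */
        PMEMlogpool *plp = pmemlog_create("mylog.pool", PMEMLOG_MIN_POOL, 0666);
        if (plp == NULL)
            return 1;

        const char *rec = "first record";
        pmemlog_append(plp, rec, strlen(rec));  /* atomic append */

        pmemlog_close(plp);
        return 0;
    }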
And libpmemobj.
This library might be the one most of you need, or most app developers need,
because it provides generic transactional support.
We will see the challenges later, and then see more of what I mean by generic transactional
support.
Maybe you already heard about it in other talks, like Paul said. But this is the one that most of you might need.
And we have libvmem and libvmmalloc. These essentially provide volatile memory from persistent
memory. We think it is a little, I don't know, we see less interest here. So if you believe
that you might be needing this functionality,
please give us feedback.
So these are the group of libraries and their overview.
And you know, some of you experts out there might be saying,
but wait, isn't there more?
I was out on GitHub and I saw more than just this.
And yeah, there is.
So the point is the scope of what we're doing in the Windows side is limited and
we're behind where the Linux side is at because of course we're two years behind
in development.
But I'll talk more about that in a minute.
But if you look out on GitHub and see a remote persistent memory library,
it's not a mistake that it's not here.
It is there for Linux, it's just not there for Windows.
Okay, so with a DAX file system, the efficient path is to memory map a file
and then do loads or stores directly on the PM device.
So now let's look at how the code looks like for this model.
And this is the open programming model, if you've already heard about that.
So we create a file or open a file, we create a file mapping object, map a view of the file, and then
we load or store on the view directly.
Whenever we access the view, we are accessing the persistent device, so that's why it is
efficient.
No OS software is in the way.
And when we want to persist the data that we previously wrote, we use either FlushViewOfFile
or FlushFileBuffers, depending on whether we need the
metadata of the file to be persisted or not.
So FlushFileBuffers will persist all dirty data associated with the file plus the metadata,
so it is less efficient than FlushViewOfFile if all an application needs is to persist
a given region.
So generally applications might prefer to use FlushFileBuffers less often than
FlushViewOfFile.
You can model your application accordingly.
So this is the overview.
And this doesn't need a DAX file system.
This model that we see here, this piece of code, it works on any file system, even non-DAX
file system as well.
It just works.
But this is the efficient path in case of DAX file system.
That is why we're looking at it.
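For reference, that flow looks roughly like the sketch below, with error handling trimmed and the file name and mapping size as placeholder values:

    #include <windows.h>
    #include <string.h>

    int main(void)
    {
        /* Create or open the file (on a DAX volume for the efficient path). */
        HANDLE file = CreateFileA("myfile.dat", GENERIC_READ | GENERIC_WRITE,
                                  0, NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        if (file == INVALID_HANDLE_VALUE)
            return 1;

        /* Create a 4 KiB mapping object and map a view of the file. */
        HANDLE mapping = CreateFileMappingA(file, NULL, PAGE_READWRITE, 0, 4096, NULL);
        char *view = (char *)MapViewOfFile(mapping, FILE_MAP_ALL_ACCESS, 0, 0, 4096);

        /* Loads and stores on the view go straight to the device on DAX. */
        strcpy(view, "hello, persistent memory");

        /* Persist just this region (data only)... */
        FlushViewOfFile(view, strlen(view) + 1);
        /* ...or all dirty data for the file, plus metadata. */
        FlushFileBuffers(file);

        UnmapViewOfFile(view);
        CloseHandle(mapping);
        CloseHandle(file);
        return 0;
    }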
So Chandra, what's the granularity of the flush operations here?
Can you come again, Paul?
Are we flushing on cache line?
Are we flushing on pages?
Okay, so FlushViewOfFile, that's a good question.
So the intention of FlushViewOfFile is to make sure that the data that is there
in virtual memory
goes all the way to the disk, through the I/O path the OS provides.
It works at page granularity.
We have to flush the page, write the page out to the file which is on disk.
That is the intended operation of FlushViewOfFile.
FlushFileBuffers flushes all the pages corresponding to the file.
It also makes sure that the logs that were written,
which are still in volatile memory, get persisted.
So that if you extend a file and then write to it,
when you call FlushFileBuffers,
it makes sure that the file size is persisted as well as the data.
If the file size is not persisted,
even if we persist the data, it is immaterial.
We are not able to read it back.
So does that answer the question, Paul?
I have another question, though.
Sure.
So you're saying we need to flush a page for doing 18 bytes, but you also say it's
not atomic. What does that mean?
Thank you. Yeah. Here, we are writing more than 8 bytes, and it's not really atomic.
By that, what I mean is, if the machine crashes after the
string copy, part of the string could be persisted and part is not, based on what cache lines
are flushed. We are looking at only this thread, maybe there is some other thread operating
on the same file, and it issues a flush after the string copy. Even though our flush hasn't executed, part of the string could make it
to the disk while the rest is still sitting in cache lines that haven't reached the
memory, and thereby the disk.
So that's what I meant by it not being an atomic operation.
Is it...
That's a bummer.
I wonder what we're going to do about that.
Does the problem make sense?
Okay.
So I conveniently assume that the problem makes sense.
And let's move on.
So first let's see how libpmem is going to fit here.
As you can see, most of the code remains the same.
The create file, create file mapping, map view of file, all the same, except for how we persist the data.
Previously we were using FlushViewOfFile; here we are using pmem_persist.
And why pmem_persist?
That's a question.
So in DAX, we know that the region is DAX mapped, that is, we are directly writing to
the persistent memory, and all we need to do is flush the data that is in the processor cache, and that will take care of
persisting it. And pmem_persist does just that. It flushes the
processor cache lines to make sure the data reaches the DIMM, which is enough to make sure it persists.
Unlike FlushViewOfFile, which has to issue an I/O for the page,
which is there in volatile memory,
through the I/O path
to reach the disk. They're two different operations
altogether. It's just that both provide the same
guarantee in the case of DAX, but this is
much more efficient.
So that's why we would suggest, or prefer,
using pmem_persist.
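So the only change from the earlier flow is the persist step. A minimal sketch, assuming view is a DAX-mapped view obtained from MapViewOfFile as before:

    #include <libpmem.h>
    #include <string.h>

    void store_and_persist(char *view)
    {
        strcpy(view, "hello, persistent memory");

        /* Flush the affected cache lines to the DIMM: no kernel
           transition, no page I/O, cache-line granularity. */
        pmem_persist(view, strlen(view) + 1);
    }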
And again, as we can still see,
the string copy is more than 8 bytes, and
even in this case it is not atomic, meaning that if the machine crashed after the copy
but before we persist, part of the string could have been flushed from the cache to the DIMM
and part not. So we still have that open problem. And the picture here... yeah, you have a question?
So in the previous case, there is no, if the file system is not PM aware...
It may or may not be.
Okay. So if the file system is PM aware, will there still be a mapping into memory?
Okay, the question is, in this case, if the file system is PM aware, will there be a mapping into memory? I think the question assumes there is a mapping into volatile memory and copying of data.
No, there is no copy of the data.
The mapping is created such that the virtual region directly points to the physical persistent
media.
So we don't have another copy. The next question is: can't FlushViewOfFile be enlightened to detect that the region
you are flushing is backed by persistent memory, it's a DAX
file system, and then work optimally?
So to do that, we need a check.
So given a region, we need to check whether it is DAX backed, DAX mapped, or regular memory
mapped.
And that is a little costly as well.
We need to make a call into kernel mode to detect whether a region is
DAX mapped or not. It needs a switch from user mode
to kernel mode, and then we get that info back.
Yeah, those are really the two key differences, right?
The context switch, and, if you remember the previous slide,
we talked about being page granular, we're flushing a page. On
the next slide... So besides not having a kernel mode transition here,
this is cache line granularity.
So those are the two big differences between those two cases.
We've actually, Chandra will show you
a little performance demo in a minute
about what the impact is on doing it one of those two ways.
Okay, so thank you, Paul. That's a good point.
Those two things
are the reasons why we prefer pmem_persist.
Primarily efficiency.
And one way to think about it is that they're both for different
purposes altogether.
This one just flushes the processor caches.
And the other thing has to make sure that it issues
an I/O for the page to be written to disk.
It just happens to work for DAX mode as well.
That's it.
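The check Chandra mentioned is exposed by libpmem as pmem_is_pmem, so an application can decide once, at mapping time, which flush path to use. A minimal sketch:

    #include <libpmem.h>

    void persist_region(void *addr, size_t len)
    {
        if (pmem_is_pmem(addr, len))
            pmem_persist(addr, len);  /* DAX mapped: flush CPU caches only */
        else
            pmem_msync(addr, len);    /* regular mapping: page-granular flush */
    }

Since the check itself can be costly, the idea is to do it once per mapping, not on every flush.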
And now this picture is again reminding us that libpmem is a value add on top of the
DAX file system support that is provided with the OS.
Just as a reminder.
But still not atomic?
Yeah, still not atomic.
Yeah.
So again, as we discussed, if the machine crashes in between,
then based on what cache lines were flushed, part of the string can make it to the DIMM and
the other part might not.
That's the state.
Now here we see libpmemobj, where it fits, and a little overview.
This is the library that, again, most of you might need.
That's why we are dedicating a
slide to this library.
And it depends on libpmem and the DAX file system support, that is the layering, and
it provides generic transactional support.
You say begin a transaction, allocate memory, and then end the transaction.
You might ask why we need a transaction for allocating memory.
Or you might already have guessed why we need that.
Just to reiterate,
when we allocate memory from persistent memory,
or a persistent heap,
once the memory is allocated,
the heap loses track of that region we allocated.
And we have to assign the address
to a persistent variable,
or a variable that will be persisted.
And later we can use it in the same instance of the application,
or even after the machine reboots,
we can reach the region from the variable which has the address.
If the machine were to crash
after we allocate and before we assign it to a variable,
we essentially lose the region.
It is neither in the heap, nor do we have a reference to reach it.
So this is the problem.
This is one of the challenges of programming
with persistent memory.
So we need a transaction such that once we allocate
and before we assign it to a variable,
if the machine were to crash, it has to be rolled back
and given back to the heap.
So that's why we need the transaction support.
And let's see more about libpmemobj.
Okay, so here
we are back to our string example,
which Paul was rightly questioning:
why is it still not atomic?
It's a good question, and we have an answer here.
So this is how it looks using libpmemobj.
We use TX_BEGIN and TX_END to indicate the beginning and end of the transaction, and within
that we use TX_MEMCPY to copy a region of memory into the buffer.
And this is atomic, meaning if the machine were to crash, even when you're halfway through
the copy, you won't get part of the string;
it will be either all or nothing.
libpmemobj takes care of it for us: writing a log record and then rolling back in case
of a crash is all taken care of.
So yeah, and we don't need to use TX_MEMCPY; we can use a regular string copy as well,
this is not a requirement, or we can use any load or store instructions. But before modifying a region, say if you
are modifying 10 bytes of a buffer that we allocated, we have to add the region into
the transaction so that libpmemobj can roll it back. We say that this region is going
to be modified, and then we modify the region. So in case of a crash, we can roll back to the original state.
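Roughly what that transaction looks like, as a sketch; pop is an open pool and dst a location inside it, both hypothetical names here:

    #include <libpmemobj.h>
    #include <string.h>

    void atomic_copy(PMEMobjpool *pop, char *dst, const char *src)
    {
        TX_BEGIN(pop) {
            /* Snapshots the destination range, then copies; on a crash
               the transaction rolls back, so it's all or nothing. */
            TX_MEMCPY(dst, src, strlen(src) + 1);
        } TX_END
    }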
The picture below is a peek into how objects are allocated and used with libpmemobj. It
helps with the next slide we are going to see. Whenever we allocate an object, we get an
OID, an object identifier, which is essentially a pair: an identifier for
the pool that we allocated from, and the relative
offset of this object within it.
Just a reminder again: this being persistent memory, a file that is DAX
mapped can be unmapped and the application closed. Later, when it is mapped again, it can be mapped to a different virtual address.
So we can't uniquely identify an object with a pointer.
That wouldn't be sufficient.
So we always need a relative offset from the beginning of something which is unique.
So you can simply think of it as a pair of the pool identifier and a relative offset.
That's the object ID.
And that is how allocations are made using the library.
And that's what this picture is trying to tell us. Okay. Now again,
making the string copy atomic has already solved one problem, and the code doesn't
look much more complex, other than the macros, which may be unfamiliar.
Yeah. In the previous slide, when you were copying the character string with the name,
there were still 18 characters.
Oh, sorry, one moment.
Just click on the little red thing.
Oh, okay.
Sorry.
You're going to pass 18.
Yeah. So whenever you're going to flush,
you're going to flush the size of the page.
That is with FlushViewOfFile.
That is true.
So what about pmem_persist?
So pmem_persist is cache line aligned, 64 bytes.
So, yeah.
It might help just to say a few more words about this
because we don't have any actual real application code in here,
but this is meant to be the application, if that's not obvious. It's just all we're showing
are macros that are implemented by the library, right? So the library gives you this macro
and this macro and this macro and you can put any number of operations inside of those
two macros and have them be treated as a transaction. And in this case, TX_MEMCPY will eventually, internally in the library, resolve to the pmem_persist
that we saw a minute ago.
Sure.
Could you come again? I don't know.
Andy, you want to help us out?
Yeah.
So just by its name,
hardware transactional memory is about visibility, not persistence.
So hardware transactional memory doesn't cover persistence.
So it doesn't solve the problem.
The problem is about making something atomic in the case of power failure. Sure. Well, this is atomicity for both threads and persistence.
So I'm not sure if you still have the slide with the two red arrows pointing at the macro. Is that coming now?
Yes, Andy.
So on that macro, there's two types of atomicity being provided.
The begin macro itself provides power-fail atomicity,
and the locked case provides runtime, you know, multi-thread atomicity,
and that could actually use hardware transactional memory.
Cool. Thanks, Andy.
You know where to find him.
Okay. So a slide up, we discussed that
we need not use the macros; we can use regular string copy or assignments as well.
This slide will shed more light on that one.
So this is just a singly linked list insertion.
The singly linked list is on persistent memory.
That's the case we are going to see. So, okay,
let me first walk through it.
We again have
TX_BEGIN and TX_END, just as before, and here we are allocating a region out of
persistent memory, and we are adding data to the node that we allocated, and we're fixing
the node's next pointer to be what the root's head points to, and finally we are changing the root.
That is when we are touching an existing variable: setting the root's head to the new node that we
just allocated. And these three lines here are all updating the new node.
So if the transaction were to abort, there is
nothing to worry about; the new node goes back to the heap, and we are all good.
So we made an allocation,
we are updating the new node,
and if it goes back to the heap,
we are all good.
Whereas here,
we are updating the root object,
which is an existing object.
So before updating it,
we need to make sure that we say,
hey, this region, the root node,
is being updated,
added to the transaction.
That's what we are doing here.
And then we change the root's head,
so that in case of a crash after that,
before the transaction could commit,
we know how to roll back.
And this TX_ZALLOC
takes care of allocating memory from the heap,
and in case of an abort, it will give it back to the heap.
It will roll back what it did.
So this is safe,
and it doesn't have the leak problem
that we saw earlier.
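Putting that walkthrough together, the insertion looks roughly like the sketch below, with a hypothetical node layout; the macros are the libpmemobj ones discussed next:

    #include <libpmemobj.h>

    struct node;
    struct list_root;
    TOID_DECLARE(struct node, 1);
    TOID_DECLARE_ROOT(struct list_root);

    struct node {
        int data;
        TOID(struct node) next;
    };
    struct list_root {
        TOID(struct node) head;
    };

    void list_push(PMEMobjpool *pop, TOID(struct list_root) root, int value)
    {
        TX_BEGIN(pop) {
            /* Transactional allocation: on abort it returns to the heap. */
            TOID(struct node) n = TX_ZNEW(struct node);

            /* New memory, so no snapshot is needed before writing it. */
            D_RW(n)->data = value;
            D_RW(n)->next = D_RO(root)->head;

            /* Existing memory: add it to the transaction, then modify. */
            TX_ADD_FIELD(root, head);
            D_RW(root)->head = n;
        } TX_END
    }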
You know, one other thing that's probably
worth pointing out here, Chandra, if you weren't going to go through it, is these macros here.
Sure.
Do you want me to cover those, or...?
Yes, Paul. I mean, come on.
So I'm not sure if any of you saw in Doug's
talk or if you attended Andy's thing yesterday, one of the problems
with persistent memory programming is finding your stuff after you've gone through a reset
or a reboot, right? Where do you find things? And the notion of relative pointers versus
absolute pointers and even closing your application and reopening it, you could get a different
base for whatever you've mapped. So what this is showing is how the NVML solves that problem with these object
identifiers. So you as an application developer reference all of your objects through object
identifiers and macros like this one, D_RW, which I think would stand for dereference read-write.
So you basically pass this macro, one of those object identifiers, and it essentially converts
it to a pointer. So then you can use standard notation
to then get to an element within that structure.
So that's what those funny-looking macros are doing in there,
is using object identifiers to convert to more C-style pointers.
Any questions on any of that?
I was expecting that someone will question, why don't you speak about lock?
Do you want to speak about it as well?
Well, actually, Andy sort of covered that in answering the question about hardware
transactions, but maybe some people might not have heard it without the mic, right, that
this TX_BEGIN is actually two things in one: TX_BEGIN_LOCK.
There's a TX_BEGIN macro that is just there for power-fail atomicity, but we've got this macro
that also adds a lock, where you pass in the lock
to handle multi-thread atomicity.
So really that lock is protecting you in two ways,
or that macro is protecting you in two ways.
Yeah, I mean, it's a convenience variant
of TX_BEGIN, basically.
The lock will be acquired at the beginning
and released
when the transaction commits or aborts.
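A minimal sketch of that combined form, assuming a PMEMmutex stored in the pool; the names here are illustrative:

    #include <libpmemobj.h>

    void locked_increment(PMEMobjpool *pop, PMEMmutex *mutexp, int *counterp)
    {
        /* The mutex is taken at begin and dropped at commit or abort, so
           the update is atomic across threads and across power failures. */
        TX_BEGIN_LOCK(pop, TX_LOCK_MUTEX, mutexp, TX_LOCK_NONE) {
            pmemobj_tx_add_range_direct(counterp, sizeof(*counterp));
            (*counterp)++;
        } TX_END
    }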
Okay, so we briefly discussed that we would be seeing
a performance comparison of pmem_persist
and FlushViewOfFile.
Here is the demo that, okay,
Toby shared at IDF.
Toby is our DAX and NVML PM at Microsoft.
And here we are writing 64 bytes at a random offset, and after every 64-byte write
we issue a flush, and we're just measuring the perf of those operations,
once with FlushViewOfFile and the other time with pmem_persist, and the
latency is almost four times better.
The detailed
info on each operation, that is, write versus flush, is in a separate XML, because
this tool lists everything;
this is an internal tool and it lists the details in an XML. But
we still get an overview that just by changing the operations,
your latency improved by like four times.
And you probably can't see these in the back,
but the IOPS as well, right? There's FlushViewOfFile at 1.3 million,
and there's pmem_persist at 2.9 million.
And that's because of the two things that we mentioned earlier:
there's no user-to-kernel transition needed here, and where FlushViewOfFile is page granularity,
pmem_persist is just more granular.
And the other thing which, again,
I would like to mention is that they're both for different purposes.
pmem_persist just flushes the cache lines, because that's all that's needed in the
DAX world, the DAX environment. Whereas FlushViewOfFile has a totally different requirement.
We have to make sure that the page that is there in volatile memory reaches disk,
which is way different from pmem_persist.
And yeah.
And one more thing, though it is not related to this performance comparison:
libpmem has variants of memcpy.
And it uses the optimal instructions based on the size of the region that we are going
to copy to. For example,
it will know whether it should use a regular copy and then flush, or whether it should prefer non-temporal
stores, based on the region size and many other factors.
And even pmem_persist depends on the processor.
It can use the best instruction available, supported by the
processor.
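Those variants look like this, as a sketch; the destination is assumed to be persistent memory:

    #include <libpmem.h>

    /* Copy into persistent memory and leave it persisted on return;
       libpmem picks regular vs. non-temporal stores internally. */
    void copy_to_pmem(void *pmemdest, const void *src, size_t len)
    {
        pmem_memcpy_persist(pmemdest, src, len);
    }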
OK.
Overall, this was a short bunch of slides,
but this is what we wanted to convey.
For DAX, the efficient path is to use memory map and then
load and store instructions.
And NVML makes it very convenient to use that model of programming with DAX.
We already saw one problem with persistent memory,
and when you get into it, you might reach many other problems,
and this will be very handy.
And we also want you to know that pmem.io is the place
where you get more details about NVML.
I think, yeah.
And libpmemobj is the library that general
application developers will need the most.
And I'll hand it over to Paul
to walk through how we have been doing this.
Okay, yeah, this is kind of a fun part, too.
So we're going to step back from the technical stuff a little bit
and talk about sort of how we're organized,
what we're doing, what tools we're using, who we are. It's awesome how far we've come in just
a few months. And a lot of us have worked in open source communities before. And we're
definitely calling this a community. This is not a couple of companies collaborating.
We're still very small. But this is very reminiscent for me of when we did the NVM Express driver
for Windows. It was initially just three companies and I think, I don't know, five or six of us that
had to work under NDA for a few months, but as soon as we got that restriction lifted,
we were wide open to build our community, to pick our tools, our communications methods,
our ground rules.
We were able to just establish everything starting with just that small handful of people.
And it's still going strong today with a gazillion people involved, still using many of the same processes that we started years ago. So
I kind of see us doing that right now. We started from scratch. It was just basically me and Chandra,
and we've got many more people involved I'll mention here in a minute. And we've got some
tools put together, and we've got some things that are growing and some communication stuff
that we're doing.
So I wanted to kind of share with you where we're at,
and hopefully that we'll be as big as the OFA Windows group is in a couple years from now.
So right now the community is HP Enterprise, HP Labs, Intel, and Microsoft.
And we'd love to add more companies to that list. And we've still got a lot of work to do, so there's definitely no shortage.
We are a time zone-friendly group, even though we're maybe 12 of us total right now.
Almost everybody's in a different time zone.
We've got Texas, Washington, Arizona, Poland, Brazil, Taiwan, California.
We're just all over the place.
So a small group but widely dispersed, so we're able to take in new developers from pretty much any geo.
Our goal right now is to be code complete by the end of this year and what that means
really is getting all of the framework ported, all of the testing framework.
So a big part of this project since it's open source is the ability to do automated unit
testing, automated system and functional testing as much as possible
so that every time anybody new comes in
and touches the code, we know that it's fully regressed
before it goes into the database.
So obviously all that has to get done on Windows
before we can actually start making changes to the library
other than the core stuff that makes it functional
that people might be interested Windows specific.
So that's what we meant by code complete.
We want everything to be working.
We won't necessarily have added any new value or anything that Windows people might specifically
want.
Somebody raise their hand back there?
No?
Yeah.
Go back one slide.
Slide 16.
So your first objective is maintaining identical APIs for all OSs.
And this is...
Oh.
Yeah. Well, I'm glad you pointed that out, SW.
I should have worn my glasses.
I totally skipped this entire four bullets.
And they're really important and that's one of the reasons because it was specifically
mentioned, especially in Doug's talk about avoiding APIs. This is not a programming model.
This is a library, right? So we do talk about APIs and we do have to define the APIs. And
one of the first decisions that we made, obviously with significant input from Microsoft because
this is a Windows library, was that
we wanted zero changes in the API. I mean, absolutely identical, down to the function
prototype, down to whether it's camel case or not. I mean, we wanted exactly the same.
So as an application developer using NVML, if you want to write your application for
Windows or Linux, the signatures are spot on. You won't see a single difference. And so far we've been able to maintain that. And I don't see any reason
why we're not going to be able to maintain that moving forward. But that's definitely
a goal of the library because it's an implementation. Which of course means maximizing our shared
code and documentation. And I'll show you our directory structure layout here in a minute
so you can sort of see where things are falling out and know that nothing up our sleeves.
It actually is laid out this way.
We use the same repository.
So if you go to GitHub today, pmem.io, and go poking around the source code, you're looking at the Windows code too.
We're working in real time off the same code.
It's just when the Linux stuff is released, none of the Windows specific things are tagged. We don't have a Windows specific release tagged and we're not building
Windows specific releases right now. But all of the source and the work in progress stuff
on test code is available right now. Yeah, I'm really glad you brought that up. I don't
know how I missed an entire section of that slide.
Okay, so like I said, the fun part, right, starting a new community,
jumping into a working community is really cool too, right?
Cuz if it's a good community, if it sucks, then it's no fun to work there.
But for the most part, all your decisions are made, and
you just gotta learn how to work with the machinery that people have going and
understand what's happening.
But we got to start from scratch, right?
Minus, well, we had to use GitHub,
because that's what the Linux library is in.
But the things that we're using that were already common there
are GitHub.
We use Reviewable for all of our code reviews.
And then we have a homegrown test framework.
I say mostly.
I think it really is mostly, mostly homegrown. And it's a combination of bash scripts and some Perl and some C code.
There's got to be one other language in there somewhere.
It's kind of this mesh of things, and it's really well structured,
but it was not an original design goal to have it be multi-OS.
So one of the big decisions that we had to come up with when we started this effort,
besides the API call, was what are we going to do about the test framework?
It's not going to run into Windows.
So we identified the maximum amount of shared code that we could possibly come up with
and figured out how to wrap it in PowerShell.
So we are converting basically everything that's Bash
in the Linux side to PowerShell on the Windows side.
And really everything else, including the Perl
and of course the C code, is for the most part untouched.
Where we find incompatibilities in the C code,
we will make every effort not to conditionally compile it out
for Windows or conditionally compile in something else. We'll make an effort to refactor it so that it is using something
common between the two OSs. Of course, we can't do that everywhere.
We do have a few conditional #ifdef _WIN32
kind of things floating around in the test code. Almost none of that in the library, though.
That's almost all limited to test code. Then you'll see in a minute when we
show the directory structure, there's some obvious things that are big chunks of code that have
to be Windows, right? All of our threading implementation, memory mapping, that kind
of stuff that is just night and day different, has different implementations behind the scenes.
And then on the Windows side, we're using Visual Studio and AppVeyor, which is our continuous integration
build environment, which actually I didn't
even know this thing existed when we started doing this.
We weren't sure what we were going to do for CI.
But we do have a system now set up where we are integrated with GitHub.
Reviewable was already integrated, and now AppVeyor is as well.
So as a developer, you make your pull request and you instantly get both the Linux library
test being run under Travis CI
and the Windows tests being run under AppVeyor CI, and then you can see your results and the
maintainers can go through and look at all the unit tests and make sure they're passing before
they even bother jumping into Reviewable. So AppVeyor is pretty cool. And then Trello,
I'll talk about that on the next slide. That's really one of our key tools that we can use because we're so small.
As soon as we get to be a bigger community, then we'll probably drop that.
But you'll see in a second why it's extremely helpful for what we're trying to do.
Process-wise, again, because we're really small, we do have a weekly meeting.
And it's very helpful because we're trying to move on a tight timeline.
And we're trying to port over 160 tests, I think, plus a lot of the framework,
the framework that's written in bash. So we've got a lot of well-known tasks that
we have to get done, so it helps to talk face to face. We've also got an IRC
channel that we don't use a whole lot, again, because we're having weekly
meetings, but when we stop having meetings I think we'll find ourselves on IRC quite a bit
more. And then the link below covers Trello, and we'll just go ahead and flip
to the next slide so you'll see what that looks like. So if you haven't used Trello
before, it's a super, super easy, basic way,
for us as developers, but really for anybody, to collaborate where you've got
pretty well-known tasks to deal with.
The way we run our weekly meeting is going through this Trello board.
Each one of these things is called a card.
We start off with our discussions column, so anybody in the community can throw out
a discussion topic, and that just guarantees that we talk about it in the meeting.
It doesn't mean that you can't talk about it before then.
We've got email and IRC and all that stuff.
But this is just a way to say, oh yeah, it's not important,
but I want to make sure we talk about it next time.
We've got sort of our wiki or reference page there.
So we've got all sorts of cards on how
to port certain things a certain way
so that we're all doing things consistently.
We've got a couple of columns for backlog.
So we have generic backlog tasks.
And we have backlog tasks that are focused around the
tests that we're porting so we can get an idea of we have to port the tests first.
We don't want anybody touching code until it's testable.
We've got lots of tests to get through first before we start moving on to non-test related
items.
The next column is just our mirror of GitHub issues.
We don't have any yet because we're still just porting the test code.
And then moving on down,
we've got sort of doing and review and done
and enclosed and all that kind of stuff.
But this is kind of, you can see,
this is the agenda of how our workflow works
in our community.
We just go through one item by one item
and see who's got roadblocks,
who's questioning a certain implementation,
how it should be converted.
Has somebody hit this before?
And so far, it's been working really well. I think this was before we had counters on here, but right now we've got about 150 backlog
items completed and probably about 80 more to go that we know of before we know that
the test code has been completely ported and then we'll deal with whatever issues arise
and whatever else we need to do specific to Windows.
All right, like I said, I was going to dump out the directory structure here
that you'd get if you went off and cloned the repo and looked at it locally.
So these are all the Linux libraries here,
which are now the Linux and Windows libraries.
And then in the common directory, which originally was intended for code that was common between
the libraries, it now has two meanings.
It's code that's common between the libraries and between the OSs that the library supports.
So we've got a few Windows specific things down here.
You can see, like, our pthread implementation for Windows is down here.
And then we've also got, we're going to migrate this over.
We've got this sort of leftover Windows directory up at the root.
We're going to slowly migrate that all over into this common area.
So basically the layout will be all one directory structure regardless of the OS,
and there will be a few Windows specific files floating in the common area, and that's it.
So everything else is untouched.
And the way we laid out our Visual Studio solution, we've got our main solution at the
top which includes individual projects for everything related to the library.
So we've got right now five or six library projects, right, to build the actual libraries.
At least a handful of internal tools that are used by the tests,
or even one or two tools that can be used by developers that are built as separate projects.
And then we have this nice long laundry list of subprojects.
Each one contains a suite of tests specific to an area of functionality. So each one of those can be anywhere from one test to 30 tests, depending on what unit of functionality we're trying to test with it.
But that's how we, that's essentially how we build and debug all of the test framework
in the library and all the libraries as well.
To date, neither Chandra nor anybody else on the team has really had to make hardly any
changes to the library.
It's really been all changes to the test code.
Part of that is because there was some prep work done by the maintainer of the entire
library up front when he knew that Windows was coming.
He changed some of the library to make it a little bit more friendly for Windows, not
Windows specific, but he made those changes in advance.
So really we've been really getting just focused on getting this test code ready so we can start with real implementations.
Okay, status and plan. Like I said, we're really active in development now.
We have full meetings every week. We've got lots of pull requests going in every week.
Lots of reviews happening. I think out of 12 of us working on the code,
there's probably 40 things in flight right now,
either being worked on or in review,
ready for the maintainer to kick back for a revision.
We are nearing the end of implementing the test code,
which we know is probably going to produce
more backlog items for us once we get everything testable.
And then our plans are to continue to maintain this whole thing on pmem.io, and we will start
building the Windows version of NVML and providing that as a tagged release, probably an MSI package
or something, in the same repo.
And then we've just got some of the references here.
So that's kind of what we're doing,
where we're at and where we're going.
And again, if anybody is interested and wants to help out,
we've still got lots of work to do.
All you've got to do is contact me or Chandra.
Both our contact information is on here,
and we will get you going.
Any questions?
All right, one question.
Can you comment on support for other architectures?
Yeah, I think that question was asked once the other day too.
The library is certainly built to handle other architectures, but nobody has brought forth
a patch to support other architectures.
So there's nothing in the architecture of the library that would prevent supporting anything else.
But nobody's brought that forward.
And if they do, it will be welcome with open arms.
Is it packaged up for common distributions?
Yeah, it is.
I'm not sure what the current status is, but Andy does.
Yeah, it's been picked up by Fedora,
and it'll be in RHEL 7.3.
It's in OpenSUSE, and then it'll be in one of the upcoming SLES things.
We haven't submitted it yet to Debian.
We have all the packaging stuff working.
We just haven't finished some of the labor.
So it's getting there on the Linux side.
Yeah, and on the Windows side,
we'll continue to see closer and tighter integration with Microsoft
as we make it through the rest of this process.
Okay, thanks everybody.
Thanks for listening.
If you have questions about the material presented in this podcast,
be sure and join our developers mailing list by sending an email to developers-subscribe at snia.org.
Here you can ask questions and discuss this topic further with your peers in the developer community.
For additional information about the Storage Developer Conference, visit storagedeveloper.org.