CppCast - C++11/14 Library Best Practices
Episode Date: June 10, 2015

Rob and Jason are joined by Niall Douglas to discuss best practices for C++11/14 libraries. Niall Douglas is a consultant for hire, is one of the authors of proposed Boost.AFIO and is currently the primary Google Summer of Code administrator for Boost. He is an Affiliate Researcher with the Waterloo Research Institute for Complexity and Innovation at the University of Waterloo, Canada, and holds postgraduate qualifications in Business Information Systems and Educational and Social Research as well as a second undergraduate degree double majoring in Economics and Management. He has been using Boost since 2002 and was the ISO SC22 (Programming Languages) mirror convenor for the Republic of Ireland 2011-2012. He formerly worked for BlackBerry 2012-2013 in their Platform Development group, and was formerly the Chief Software Architect of the Fuel and Hydraulic Test Benches of the EuroFighter defence aircraft. He is a published author in the field of Economics and Power Relations, is the Social Media Coordinator for the World Economics Association and his particular interest lies in productivity, the causes of productivity and the organisational scaling constraints which inhibit productivity.

News:
- constexpr Complete For VS 2015 RTM: C++11 compiler, C++17 STL
- C++ in the modern world
- Why C++17 is the new programming language for games I want

Niall Douglas:
- @ned14
- Niall Douglas' blog

Links:
- Best Practice For C++ 11/14 Libraries
- Boost.AFIO
- Boost.APIBind
Transcript
This episode of CppCast is sponsored by JetBrains, maker of excellent C++ developer tools.
Listen in for a special discount code later this episode.
And by CppCon, the annual week-long face-to-face gathering for the entire C++ community.
Get your ticket now during early bird registration until July 10th.
Episode 15 of CppCast with guest Niall Douglas recorded June 10th, 2015.
In this episode, we discuss C++ in the modern world.
Then we'll interview Niall Douglas,
author of the proposed Boost AFIO
and API Bind libraries.
Niall then discusses his best practices document
for C++11/14 libraries.
I'm your host, Rob Irving, joined by my co-host,
Jason Turner. Jason, how are you doing today? All right, Rob, how are you doing?
Doing pretty good. No real news, nothing new for me this week, but excited to do this episode.
At the top of every episode, I like to read a piece of feedback. This time, it comes in from Mike. Mike writes
in, great show, glad I found you guys,
and asks, any chance
you could review or suggest
C++ libraries for
building web services
for cloud deployments? Seems
like managed languages have plenty of options.
What about us C++ developers?
Yeah,
I don't know what your thoughts are about this, Jason,
but I would agree.
There's definitely not a lot to do with web development and C++.
I agree.
I have heard some rumblings from people wanting to do this kind of thing,
but nothing definite.
Maybe it's the kind of thing that you could use Boost.Asio for
or something similar.
I'm not sure.
I don't do a lot of web development myself, anyhow. No, I don't do a lot of
web development either. I know
when I was looking at STL's
blog, I think he said that he uses
a C++
program to generate his HTML,
which I thought was interesting. I'm not sure
if that's something he's shared in the
public or if there's other
well-known C++ website
generators. There does seem to be
a little bit of a movement
towards static web page generation in general,
but those tools tend to be Ruby or something.
There's no reason we couldn't do it in C++.
Yeah, the CppCast website
is actually using static HTML
generated with, I think,
a Node.js tool.
Okay.
Okay, but thanks for sending in the feedback, Mike.
We don't have a lot of thoughts on this right now,
but we'll definitely be on the lookout for web development,
C++ technology, if we're able to find something.
I agree, it would make a great show.
And we'd love to hear any of your thoughts about the show,
so you can always email us at feedback at cppcast.com,
contact us on Twitter, twitter.com slash cppcast,
or Facebook, facebook.com slash cppcast.
So our guest for today is Niall Douglas.
Niall is a consultant for hire,
is one of the authors of proposed Boost.AFIO,
and is currently the primary Google Summer of Code administrator for Boost. He's an Affiliate Researcher with the Waterloo Research Institute
for Complexity and Innovation at the University of Waterloo, Canada, and holds postgraduate
qualifications in Business Information Systems and Educational and Social Research, as well as
a second undergraduate degree double majoring in Economics and Management.
He's been using Boost since 2002 and was the ISO SC22 (Programming Languages) mirror convenor for the Republic of Ireland, 2011 to 2012.
He formerly worked for BlackBerry in their platform development group and was formerly
the chief software architect of the fuel and hydraulic test benches of the Eurofighter
Defense Aircraft.
He's a published author in the field of economics and power relations, is a social media coordinator for the World Economics Association, and his particular
interest lies in productivity, the causes of productivity, and the organizational scaling
constraints which inhibit productivity. Niall, welcome to the show.
Hi, guys. I really wish now I'd done a custom bio for you instead of feeding you
the stock one, because I never thought about what it sounds like on the radio, and it's long.
It's a bit long but you obviously have a very impressive career.
Well, I don't know.
I would say I have never committed to any one particular specialization.
Yeah.
Well, before we get into the interview, I wanted to share a few pieces of news.
The first one comes from the Microsoft Visual C++
blog, and they are saying
that in VS 2015
RTM, there will be
full C++ 11
constexpr support.
And, Niall, you told me that you actually
have spoken to STL a bit about
this. Yeah,
Stefan's a great guy.
We sort of dialogue fairly routinely for various reasons.
Most recently, I had found some problems, or rather I had some questions actually,
to do with why Microsoft's compiler behaved a certain way when facing the lightweight monadic
class I've been working on. And it sort of went, well, it's really interesting, because it's almost entirely by design.
So their compiler has to be backwards compatible
with a really ancient way of doing C++
such a long, long time ago.
And because of that,
they have to do a number of things
which most any normal new compiler doesn't have to do.
So when you do a single thing,
you suddenly see 3,000 opcodes
get popped out by the compiler from nothing.
Whereas on, say, GCC, that only
generates 23 opcodes, then it was
something like, well, the reason why is there's this
long list of very good reasons. So that's how it actually got
onto the ConstExpr discussion
because that paper, sorry, that article
he was writing, he was just coming up with
at the time, and he
was saying that it was likely to happen,
and I was like, wow, who thought actual constexpr
in Visual Studio 2015?
I mean, hats off to that compiler team
for really pulling a rabbit out of the hat.
They did a great job in delivering that,
I think, in such a short time period
from where they were only two years ago
or even a year ago.
So, no, amazing.
Right.
And the article does mention that
the extended C++14 constexpr support
will be implemented in the future.
So maybe they'll put that in with a 2015 update,
hopefully, maybe even later this year.
He's not promising that, but he is saying it will be coming.
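For context, the gap between the C++11 constexpr that VS 2015 RTM ships and the "extended" C++14 constexpr still to come looks roughly like this (a minimal illustrative sketch):

```cpp
// C++11 constexpr: the body is essentially a single return statement,
// so iteration has to be expressed through recursion.
constexpr int factorial11(int n) {
  return n <= 1 ? 1 : n * factorial11(n - 1);
}

// C++14 "extended" constexpr: loops, branches, and mutable local
// variables are allowed inside a constexpr function.
constexpr int factorial14(int n) {
  int result = 1;
  for (int i = 2; i <= n; ++i) result *= i;
  return result;
}

// Both are evaluated entirely at compile time:
static_assert(factorial11(5) == 120, "C++11-style constexpr");
static_assert(factorial14(5) == 120, "C++14 extended constexpr");
```

Implementing the second form is what requires the heavier compile-time evaluation machinery discussed below.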
It's a huge job, though.
I mean, they're going to have to build a...
well, they've already got this system, because, you know, Microsoft's compiler is not an AST-based compiler. It literally
parses tokens and spits out code as the tokens come in, which is a very old-fashioned way of
doing things, but makes a lot of sense given Microsoft's internal code base. And what they've
got there, as far as I understand it from the rest, now I'm no compiler expert, but this is what I've
been told, is that it runs along, and if it comes to a point where it says, I need to do an AST here, it will then backtrack out and then go, I'm making an AST just in this partial location, just to evaluate this small particular context.
And then it'll go back to parsing by token stream again.
So there's a whole lot of heuristics in there where it trundles along, and it will turn on and off a modern compiler way of doing things. And this is exactly where extended constexpr comes in, because
they're going to have to effectively
internally replicate most of a whole new
compiler based on an AST model
just to implement Extended Constexpr
if I understand it correctly. And that just
gets turned on and off internally by heuristics.
There's a long list of very good reasons why
they do it that way, of course, and
if you're inside Microsoft,
they all make sense.
So that's why, you know, all the decisions that make no sense to us in the outside world
make total sense if you work inside Microsoft-ish.
So, no, I mean, it's an outstanding technical accomplishment is Microsoft's compiler.
I think anybody who looks at it would agree with that part.
Sure.
So the next article is coming from Medium.com.
And it's just a blog post about C++ in the modern world, which was interesting.
It was kind of just an overview of what's been happening in recent C++ history and where the language is going.
Jason, did you have a chance to read through this one?
I actually didn't, although there is a really great cartoon right in the middle of it if you haven't seen it yet.
Yeah, the cartoon is saying if you want to learn C++, you basically need to solve time travel in order to do it quickly.
How about you, Niall? Have you had a chance to read through this?
I did have a quick flick through it.
I think it's interesting.
People who really like C++ tend to be usually people who are good at C++.
People who don't like C++ tend to be people who are not very good at C++.
It's rare to actually find someone who's really good at C++ who hates it.
Now, there are a few, which is interesting.
But I find it's interesting because people say,
well, Microsoft are always advocating
C-sharp. Well, that's because businesses generally want C-sharp. They don't generally want C++
unless they have a very pressing business case to do so. And C-sharp is a fine language.
There's nothing wrong with writing your code in C-sharp. Or in fact, most of the .NET stuff.
I myself, if I want to do a Windows GUI program,
I'll just grab .NET, because that's the best thing to go and use.
I had a startup back in 2011, and that was written using a mixture of IronPython, C-sharp, and Visual Basic,
believe it or not, because I had Visual Basic for applications running within Microsoft Office because that was the easiest and simplest way to make that happen.
And that then knocked everything out onto a TCP connection
and then bolted itself into Python.
So that was...
I found that a very easy and straightforward solution to do.
So no, this is the thing.
The trouble with C++ is that if you don't understand memory,
and a lot of programmers don't,
you will end up creating memory corruption all over your code base
very, very easily.
It's funny, because if you're really good at C++
and you're skilled in it,
you can write code that never has a race condition
and never has memory corruption.
And I mean, I only state that insofar as
the tooling that's currently available can detect it.
So, I mean, you have this thing being run per commit,
and it never, ever fires, you know,
in years of shipping code.
So this can be done, but to reach that point
where you understand these things is
an enormous amount of effort.
As the cartoon says, I noticed there,
if you divide by 10, he's roughly talking about 10 years
to get to the point where you might have some idea
of doing these things. Now, the thing is,
most programmers and most businesses can't invest in people
to that extent, so what you do is you take away the ability
to corrupt memory on that level, and
you end up with something a bit more like
Java and Python,
which is great.
If you want to get something there
without having to debug memory corruption errors,
which are one of the worst forms of error to debug,
that's the right language and right tooling to use.
Rust, of course, is a big upstart challenger in this,
and my day job, actually, is writing nothing but Rust,
so it's kind of funny.
Yeah, it's really interesting.
The choices they've made there make you really think about
where C++ must go next if it's going to stay relevant.
But that's a separate topic.
But it is an interesting topic.
Is there any one point that you might want to pick out
and discuss from Rust?
I think Rust is still very immature
and has another three to five years
before you'd want to be writing production code.
And the big trouble is the standard libraries
that come with it.
They're amazingly immature,
you know, relative to what we're used to.
I mean, the STL and standard library
has lots of unfortunate design decisions,
but it's really mature.
And you have guarantees on things.
And you know, the warts that it has
are the warts you can work with and you can go from there.
Whereas with Rust, the standard libraries
have warts that constantly are changing and fading in and out
because they only standardized 1.0
very, very recently. And even with 1.1,
they've gone off and broken a number
of items in order to fix things that they made mistakes in
and sort of obsoleted interfaces.
If you think back with Java in the early days,
Java went between version 1 and version about 4.
There was a whole lot of standard library stuff
that got deprecated and removed.
And if you wrote a code for that early stuff,
it carried on working into the future,
but it ran really slowly, it was badly supported,
and a long list of other things.
So the big advantage, I suppose,
of choosing a mature platform is that it's stable
and the code you write is predictable.
Right.
Yeah, I definitely think it'd be worth doing an episode just talking about D and Rust and some of these other languages which are looked at as possible competitors to C++.
I'm sure there's plenty of listeners who'd be interested in that.
So this last article actually touches on those subjects.
Why C++17 is the new programming language for games that I want.
And this was basically in response to Jonathan Blow, who released a series of videos.
He's a game developer, worked on a game called Braid, which you may have heard of.
And he basically put out these videos talking about how he would like
to have a new programming
language, which
was optimized for game development
since lots of game developers are currently
using C++.
And this article was written by
Matt Newport, who wrote
how he thinks C++17
is already going to address the majority
of the things that John Blow
is looking for a new language to do.
So it's a very long article,
and there's actually several videos attached to it.
So if you're interested in game development,
I would encourage you to read the whole thing.
But, yeah, Jason, what did you think about this one?
Did you have a chance to read through it?
I got most of the way through part two of the article.
Okay.
Well, yeah, he has some articles also that he follows up with on this,
and he does a rather extensive look at what Jonathan Blow is suggesting and what the
positives and negatives are of it. It goes in depth, and there's a lot of material there.
And if you have any questions really about how to write safe code
that can be efficient in game development world,
it's probably worth reading.
So, Niall, let's move on to talking about your recent C++ Now talk.
So the goal of your talk was basically
to kind of generate some best practice suggestions
for C++ 11 and 14 libraries.
And before we go into those best practices,
I thought maybe we could touch on
the libraries that you've been working on.
The first one is API Bind.
Is that right?
The first one actually would have been AFIO. Okay, and, sorry, I should
explain that for the listeners: that's asynchronous file operations, which would be deliberately
mirroring the name of ASIO, which is asynchronous socket operations. So that was where it began, and
then the needs that were exposed by that library have caused other libraries to be
generated, mostly for internal use. But one of them was API Bind.
There's another one called SpinLock, which is now used internally,
and I think there may be a third one coming, actually.
So, yeah, as you refactor the internal code,
you kind of go, well, I need this, and I don't have it,
and no such thing currently exists.
So you kind of float that idea out there.
So that's where I came from, and I think the C++
Now talk was...
I mean, I'm very much an advocate
within the community of C++ 11
and especially 14 being a whole new language,
and I try to emphasize the fact
that the code you write,
if you don't have to think about C++03 compatibility,
is just diametrically
different. You might as well have a brand new language. It looks
the same, but it's not.
The stuff you do in it is not like you would do in C++03.
And a lot of people disagree with me on that;
they would say there's more continuity between them.
But one of the things I wanted to emphasize with that
was by reviewing the new libraries coming to Boost
in the pipeline that are either conditionally approved
or shortly to be peer-reviewed.
And I went through four of those,
which were the closest that I felt were probably coming in.
Now, as it happened, during the actual talk,
during 90 minutes, I got through one library,
which was API Bind,
because the questions coming in from the audience,
as is normal in C++ now, were very extensive.
So I think more than 35 minutes went on audience commentary,
and I only got through, at best,
maybe 15 minutes of actual material, so I ran
out, which I expected, by the way. So that's fine. I'll be doing it again, hopefully; I've submitted
the same talk for CppCon, but there it's very much a more tutorial track, not designed to make people
interrupt with their own opinions on things, and it's much more a case of: these are
the libraries that are coming, this is how you use them, a quick sort of, you know, use-case example,
and that's your hour.
So that's where I came from.
And that then led to, I normally try to write some big written document,
like a white paper with every presentation I do,
to try and formalize your thinking into something rigid.
And this year, instead of being an academic paper,
it turned into a best practices handbook.
So what exactly were your goals with this best practices handbook?
Well, I think when I was looking through, I looked through ten libraries that I judged as probably
having some chance of coming up for review by the community sometime soon. Because, I don't know
if listeners are aware, but Boost is a peer-reviewed collection of libraries, so the community reviews
them, and they have to go and pass review, which is often an extensive, multi-year-long process. And during that, sort of trying to figure
things out, I mean, I looked at things like documentation. There's actually a color-coded heat map of various
ways of rating libraries which I did up, and you can sort of see that the more green there is, the
closer the library probably is to entering the review queue. I think that was at the front of the
slides, and that was how I selected the libraries that I looked at. So I
looked at the ten of them, and I looked into their source code and I looked into the documentation,
and I tried to figure out what were the patterns that were joining them all together. And I began
to notice that there were some really obvious things that made them separate from normal Boost
libraries, and there were also some techniques that were very commonly used, and there are other
techniques that some use but others didn't, and so on and so forth.
And I thought, well, this is interesting.
So I think, why don't I write up a document that basically links into these bits of source code all over the place,
and you can kind of go through that.
And obviously it's my opinion of what I think is a really good practice,
but I put a rationale and an argumentation of why I think it is,
and if you agree with that point,
you might look into the source code of someone's library and lift that idea for yourself.
And the idea is to try and hopefully raise the common quality level, I suppose, for everybody.
That's the hope.
I want to interrupt this discussion for just a minute to bring you a word from our sponsor, JetBrains.
C++ is on a comeback lately, isn't it?
This language has been around for so long, yet it's hard to find a good development tool
for it.
Luckily, our good friends at JetBrains have just released a new IDE for cross-platform
development in C++, and we thought you should know about it.
This IDE is called CLion, and it's compatible with Linux, OS X, and Windows.
It relies on the well-known CMake build system and offers lots of goodies and smartness that
can make your life a lot easier. CLion natively supports C and C++, including the C++11 standard, libc++, and Boost. But that's not all.
Editing CMake files gets a lot easier with CLion, thanks to completion for commands and file names
and automatic updating of CMake files. With its in-depth understanding of C and C++ code,
CLion brings you smart, relevant code completion.
It also gives you full coding assistance,
like customizable coding styles, key maps, and project views.
You can instantly navigate to a simple declaration or usages, too.
And whenever you use CLion's code refactorings,
you can be sure your changes are applied safely throughout the whole codebase.
When it comes to version control systems,
the IDE supports Subversion, Git, GitHub,
Mercurial, CVS, Perforce, and TFS, with a unified interface for all of those.
It even has a built-in terminal where you can run any command without leaving the IDE,
locally or remotely, using the SSH protocol.
A powerful language needs powerful developer tools.
With CLion, now you've got those.
Download the trial version at jetbrains.com slash cppcast dash clion. And if you are a student or
developing an open source project, use CLion for free, courtesy of JetBrains.
So do you want to start going over some of these specific points? The first one
you brought up was convenience, where
you're recommending using Git and GitHub.
Yeah, it's strange, isn't it?
There are still quite a lot of people who really
don't like Git.
Now, out of the libraries I looked at, everybody
uses Git, but it's funny that
I think, I try to aim it, obviously, not just at
Boost library developers, but at general C++ people.
There's even, of course, a lot of people who don't like using GitHub.
And I
pointed out that there are a lot of very
good reasons to believe both those things,
why you don't like Git and you don't like GitHub, but
equally, it's just so much more convenient
to do both. The free
tooling that's available for anybody who uses
both is just so much easier. It just automatically
integrates, one-click integration. There's a long list of
tools that just do that. And I did point
out the great thing with Git is you don't have to put
all your code into one location. You can actually
keep all your code in one location and have it mirror
out to GitHub. So then you're not trusting GitHub
with all of your crown jewels. You're not trusting
GitHub not to go delete your stuff or do
like SourceForge has been doing recently of injecting
adware into the binaries that are being downloaded.
Did you not hear about that?
No, I've not heard about that.
Oh wow, yeah, SourceForge decided that Nmap
and, what was it?
They decided they were abandoned, and they
went and modified the installers that people were downloading
to include adware from them,
and then shipped them out to people.
Yeah, the Windows GIMP download also.
That's right, GIMP.
That's right, yeah.
And that's just like
a wow move, you know.
And that's then caused, obviously, a big debate
in Boost, because we use SourceForge
to distribute our files and people are saying, well, we have to
leave now because how do we know in the future
they're not going to start modifying our installers?
I think that's an amazingly bad move for them, but that's
a separate matter.
The point is, anyway,
a lot of people do worry about commercially funded,
profit-driven companies having access to the crown jewels like that.
And of course, the great thing is Git's not subversion.
So you can have multiple copies.
You can even have your master copy in your private machine
served by an ADSL connection out to the world
and then have that mirror onto GitHub if you want.
Git's incredibly flexible.
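The arrangement Niall describes, keeping the master copy on your own machine and treating GitHub as a replica, can be sketched with plain git commands (all paths and repository URLs here are hypothetical):

```shell
# The canonical bare repository lives on your own machine:
# this stays the single source of truth.
git init --bare /tmp/mylib-canonical.git

# Day-to-day work clones from the canonical copy:
git clone /tmp/mylib-canonical.git /tmp/mylib-work
cd /tmp/mylib-work
git -c user.name=dev -c user.email=dev@example.com \
    commit --allow-empty -m "initial commit"
git push origin HEAD

# Periodically mirror everything (all branches and tags) out to GitHub,
# so GitHub holds a disposable replica rather than the crown jewels:
# git remote add github git@github.com:you/mylib.git
# git push --mirror github
```

The commented-out lines are the GitHub half; with `--mirror`, deleting or corrupting the GitHub copy loses nothing you don't already hold locally.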
I think most of the people who don't like Git,
it's because they don't really understand Git
and they lose work with Git.
And one of my
past employers, I won't mention whom,
there was a fellow who liked to do
Git force push a lot.
And the trouble was
that whenever Git gave him an error message, he would
just go Git force push at whatever it was to make the problem go away, right?
And the trouble was he ended up deleting a month's worth of work: six people were working on
an item, and they contributed a large patch set that went in.
That interfered with his thing, so he just Git force pushed it out of existence and a
month's worth of work went and vanished because Git's garbage collector had fired and cleaned
it all up.
And that's six people working
for a month gone.
So that obviously generates a bit of anti-Git
negativity. And I understand where they're coming from, but equally
he shouldn't be doing Git force push all the time.
Yes, definitely.
The next thing here is uncoupling.
And I thought this was really interesting
because I had not read into the use of inline namespacing very much.
And this is something you're using a lot with your API bind library as well.
Yes, this is a way of hiding, or rather not hiding,
actually, if anything, exposing.
So if you have a version of your library
that implements, you ship a DLL
and it has a certain ABI being exported from it,
and then you link in, you know,
you use the wrong headers that don't match up with that header,
what you want, ideally, is for it to refuse to link.
And one of the big troubles we've always had
in many of the Boost libraries is that there
wasn't a very strong error-throwing situation where that happened. So you had this trouble,
I remember, in Boost Python many, many years ago, where someone was loading an extension that used
one version of Boost Python, and then someone was using an extension that loaded a slightly
different version of Boost Python. And when the two Boost Pythons came into the same process space,
they just spazzed all over each other, and
nasty segmentation faults happened all over.
And that was mainly because in C++03
you had to do
an awful lot of macro work if you wanted to hide
that kind of simple stuff, or you had to
use linker tricks, or you had to do
a long list of other not
really very pleasant ways of doing things.
So one of the great advantages of C++11 was they brought in
inline namespaces and
that lets you
embed a tag within
the namespaces that is normally hidden during
normal use to let the linker
refuse to link something.
And that's exactly what's done here, except what we've done is
we've kind of twisted that slightly on its head and we're now
using it to tag
each of the ABI configurations
of something, so version 1, version 2, version
whatever, and you can have them coexist in the same space.
So if library A uses version 1 of some dependency, then library B can use version 2 of that same
dependency, and because they're all ABI'd out with different symbols, those link and
those work without colliding into one another, without symbols being overwritten.
So that then combines with using namespace,
sorry, what was the phrase?
Yeah, namespace aliasing,
where you bring some remote namespace in,
and that's a way of tagging a dependency
between your library's namespace
and some specific hard version.
And all it is is pure convention.
It's simply that if you follow this particular pattern,
you gain these benefits.
You obviously lose a certain cost in that as well,
mainly that it confuses people, I suppose,
because people aren't used to it.
That said, this particular technique
is used in all the STL implementations I'm aware of.
And moreover, they've been using it for a while now.
Something that Beman mentioned in...
Sorry, Beman's one of the founders of Boost
and has also been the convener of the Library Working Group for many years.
Something he mentioned that this is just a completely standard technique
in standard library implementations for C++11
and why, in many respects, we haven't done this yet in other libraries
is kind of a bit of a mystery in some way.
But now, I've simply spelled out how it's done.
I don't think anybody had thought to do it, if that makes sense.
So it's very, very straightforward
and it certainly eases a lot of
manual typing and you don't want
macros appearing everywhere and this makes
that go away.
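The inline-namespace ABI tagging and namespace aliasing described above can be sketched like this (the library name and version tags are hypothetical illustrations):

```cpp
#include <string>

namespace mylib {
  // Each ABI gets its own tagged namespace. Only one version is marked
  // "inline", so unqualified mylib::greet() resolves to it, but the tag
  // is still baked into the mangled symbol names: binaries built against
  // different versions refuse to link rather than silently colliding,
  // and two versions can coexist in the same process.
  namespace v1 {                 // older ABI, still reachable explicitly
    inline std::string greet() { return "hello from v1"; }
  }
  inline namespace v2 {          // current ABI: the default mylib::
    inline std::string greet() { return "hello from v2"; }
  }
}

// A dependent library pins itself to one specific ABI with a namespace
// alias, which is the "tagging a dependency" convention from the talk:
namespace mydep_mylib = mylib::v1;
```

With this in place, `mylib::greet()` picks up the current (inline) version, while anything written against `mydep_mylib` stays on v1 until its author deliberately rebinds the alias.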
So do you make any specific recommendations
for how long you would want to keep around
the older versions of the APIs
before you eventually deprecate them from the library
or anything like that?
I think that entirely depends on your user base.
I would be one of the more aggressive people
when it comes to deprecation.
I always have been for years and years and years.
I don't think that people who use
some ancient version of your library can expect free support for eternity.
Right.
I think that's the difference between volunteer written stuff.
This is stuff that's being written in people's family free time.
You know, instead of spending time with your children, you're writing this stuff.
And I think to have that expectation, I think is unfair, personally.
But that's my opinion.
A lot of people disagree with it.
I think it depends on the library.
I mean, obviously, if there's a very, very good reason why version one has to use,
for example, it has a really important feature that hasn't been implemented yet in version two,
that's a very different circumstance to that version two implements everything that you need,
you just need to refactor your code to use it. So that's the complicated explanation.
Okay. Before we go much further, I just wanted to clarify for my American ears here: how do I pronounce your name?
Oh, it's Niall.
Niall, okay.
Don't worry about it. I'm not bothered.
Okay.
The next thing you bring up is strongly consider testing your library with Visual Studio 2015.
And I got to say, this is like a big, this is a big thing for me also.
I'm a huge fan of supporting the native compiler on every platform, if you will.
And clearly on Windows, Visual Studio
is the native compiler.
So, I mean, go ahead and go into your reasons
for why this is important, if you don't mind.
Well, it's a simple one.
I think everyone has always discounted Visual Studio
as being not really implementing anything of C++11 or even 14.
And the truth is that it's actually about Clang 3.2 levels of conformance already.
And that's with 2015.
And a lot of people are really shocked to think,
well, Clang 3.2 is pretty conformant.
I mean, it even implements some C++11 stuff. In fact, actually, I'm just reading the document myself.
It actually meets or exceeds Clang 3.3. So my apologies, I just made a mistake. It even
goes as far as there's quite a lot of early C++1z. So the C++17 stuff is beginning
to creep into 2015, especially in the standard library.
It's a lot better,
a lot more conforming compiler than a lot of people realize.
Now, just because it conforms
doesn't mean that it isn't going to choke in your code
in a really amazing way. So you'll have
your beautiful C++14 code
and you'll stick it in and it meets all the feature sets
that you think will work with Visual Studio 2015
and 2015 will just go blah all over it.
And a lot of the time when it does do that,
it's because it's found something,
thanks to the completely different way it does C++,
and it's usually not a bug in their compiler.
It's actually usually a bug in your assumptions.
So you've made a certain assumption
that happens to be valid on Clang and on GCC
that Visual Studio exposes
as being not a valid assumption to make.
There is, of course, a certain amount of trouble with the lack of expression SFINAE.
So expression SFINAE is done using magic tricks in Visual Studio 2015.
So if you use the special magic STL primitives,
it will magically turn on expression SFINAE for that particular context
and then turn it back off again. But we are told that they're expecting to do
a service pack for Visual Studio 2015
sometime in its lifetime, which will implement full
expression SFINAE. Again, you need an
AST parser in there
because the trouble is backtracking.
So the current expression SFINAE implementation can't backtrack
and that's what they need to add.
So that's the thing that's...
And the trouble is, at the minute, if it does try backtracking,
it unfortunately segfaults, so you get an internal compiler
error, which is why it's not turned on
for the vast majority of code.
But no, I strongly support it, because I think once people
realize it does support so
much stuff that it does...
I mean, the main big things
I have a list there of the things that are missing,
but there's very, very few.
Obviously, the long-standing problems like the preprocessor, the non-conforming preprocessor are still there.
And I don't think they're going to go away anytime soon, unfortunately.
But if you're really, really dedicated, you can, of course, just use Clang's preprocessor to output code,
which then is fed into Microsoft's compiler, and there's nothing stopping you from doing that.
So even that major showstopper has an option.
Well, since you brought that up, MSVC's preprocessor is non-conforming, and you
also mentioned that their C99 support is not fully complete. I've heard both of those things come
up before, but I've never had either one of them actually bite me. Can you go into either one of those topics? Sure.
There's actually a bit in APIBind where I do some C macro preprocessor metaprogramming
to get it to auto-generate stuff for me
at the preprocessor level.
And that is painful to do on Microsoft's broken preprocessor
because it doesn't do it right,
and it does it in a way that's really unhelpful.
So I think Edward Diener did a
library, Boost.VMD,
and he's implemented
a whole load of abstraction
to stop the Microsoft preprocessor
behaving in such a difficult
fashion. And if you have that problem and you're
facing that kind of difficulty, then you definitely want to reach for
his library, where he's done all those workarounds for you.
So that's my
very minor experience in that area. The other one is the
C99 support.
You're right that I haven't had it
bite me either, except I have
occasionally run C code
that's C99 written for GCC
and it didn't compile with
Microsoft's compiler. Then what you do
is you go off and you write some really nasty
shim code that simply goes, well, I'm going to map this missing C function
to the C++ one, which does exist.
And you just boilerplate all of that,
and lo and behold, it compiles.
And what they've gone and done effectively
for all intents and purposes in Visual Studio 2015
is they've done that boilerplate for you.
So if you look at the internal header files,
you'll see that they've done out all the C99 support,
and it bounces between the two.
So they've all been mirrored across the board, so all the functions are mostly
there. I think the stuff that's still missing is
oh, that
metaprogramming construct, what's the name of it?
It lets you do
metaprogramming in C, not C++.
It's a macro-based, magic
macro thing where you can get it to select types
for you, and someone at C++Now
managed to build a
static if, so
a static constexpr if by
using the C99
metaprogramming construct to select
C++ types and have them
not compile if they were not able
to compile, which is just amazing.
That was Ehow who did that, and we
were all, I mean, there was about five of us in the room
and we just had our jaws on the floor because he'd
actually implemented static if. That just
works now, right now on
Clang, and
wow. But that's
something that's also missing, and that would be really great to
have, because think of the things you could do with it,
the evil you could do with it, in C, not just C++.
The next few points you had are all about quality. This is obviously
something really important when you're evaluating a
library. You want to make sure that the library is
of sufficient quality. It's being well
tested. Do you want to go over some
of your points here?
Sure. It's kind of funny.
Boost has always had this great reputation of quality,
and I think deservedly so.
They went out of their way very early on
to be way ahead of the curve in terms of per-commit testing,
in terms of making sure that the code that they deliver
really does work.
But what's happened is it was all custom infrastructure
that was built using custom scripts and custom backends.
And what's happened is the open source world
has very much caught up with these standardized tools.
So the most famous one, of course, is Travis,
which is, I think, run by a German or some European company.
And they provide for free integration with GitHub
such that every commit you send in,
it'll go off and run your code on Linux and OS X,
which is great because most of us don't have access to an OS X machine for development.
So if you can test your code against Apple Mac OS X, that's huge.
And I certainly find that immensely helpful.
Now, debugging, it's no fun.
You have to put in scripts to run a set of debug commands every time the Travis fails.
And there's a great tool called Expect for that, but I'll skip that.
The other thing for Windows testing is a
free service called AppVeyor, and
that will do very much the same thing as
Travis does, except for Windows. So between
the two of them, you can get, for every single commit,
all your code have a certain set of small
tests run on both
systems, and it'll come back and comment on your
GitHub pull request to say this thing failed,
this thing passed, and whether
it's
wise to merge this code or this
pull request as is. And a lot of the time, because it comes back
to the person who submitted the pull request, they can see
it broke Windows, so then they have to go off and fix Windows
and then it'll show that it's been fixed.
And that's a huge time-saver for anyone who has
to deal with pull requests. Instead of manually testing all this
stuff, you can just have it all done for you.
So that's a big thing.
Obviously, the quality end of things comes into it too,
because, for example, you can hook Travis into another service called Coveralls,
which is free, and that will go off and calculate your code coverage for your unit test suite.
And it will show you, with this beautiful annotated source code view, in green or in red,
whether each line was executed or not by your unit test. It'll give you back
a percentage, and that can be shown then as
93% unit test coverage, and if someone submits
a pull request that doesn't have unit tests,
your score will drop, and you can then come back to them
and go, well, you need to add unit tests for the new functionality
you just added. So again, another huge time saver,
and it keeps the quality high automatically
without you having to do it by hand.
Very good.
What about Valgrind and runtime sanitizers?
You're recommending them as well.
Yes, it's surprising that Valgrind's been around
for such a long time now,
and we're still not automatically running
a unit test pass with it,
which is surprising given...
I know it has a huge performance penalty,
there's no doubt about it, but you can have your unit
tests written to automatically self-adjust.
So in other words, they run, do a runtime check, am I
running under Valgrind, and if so they do things
very differently, like shorten loops a hundredfold
or whatever it was.
And the idea is then you can get that
stuff to happen.
There's also, of course, a huge
amount of improvement recently with the Clang tooling.
So Clang Tidy was just something that Chandler Carruth introduced me to very, very recently indeed,
and I found that it broke horribly when I tried running my code through it.
So I submitted the bug report, and I'm sure they'll get it fixed shortly.
But I'm looking forward to trying that out, because it lets you write custom tests at the Clang AST analysis level
to say, is this thing good or bad about
a certain code base? So for example, naming conventions, you can put a test in there
that checks, are your naming conventions being followed? And if so, to give a warning.
There's lots of flexibility. The tooling for C++ has come on so, so far in the last recent years,
especially mostly because of Clang. It's amazing where we're at now compared to where we were even only five years ago.
So, you do mention the Clang address sanitizer in your document.
I was wondering if you had any opinion on that versus Valgrind.
Yes, I recommend using Valgrind instead of the address sanitizer normally,
but not always.
And the reason for that is because if you're going to have complete
results from the address sanitizer, you need to
recompile your standard C++ library and your
C library also with the address sanitizer.
And if you don't, then you'll get these
weird false positives that pop up from
time to time. As soon as your code goes anywhere near
STL stuff or libc stuff, you'll see
weird things that come in. In other words, it doesn't show you
what was the cause of the problem.
The cause gets lost. If you use Valgrind, on the other hand, you'll spot that stuff.
Now, that said, the address sanitizer does do more testing than Valgrind does.
If you're using a fuzz tester, which I think is my item number seven, where you're
getting a program to automatically permute
inputs, so for example, to check for overflow bugs, to check for unvalidated
data coming in that's trying to crash your program
and exploit it for security, even
just to take parameters and to fuzz them
to test whether your code correctly handles all
outcomes. The address sanitizer for that
is fantastic because it's really, really fast. It lets
them do millions and millions of permutations
per second, and that's the kind of thing that you catch
Heartbleed with. That's the kind of tooling for
that. I think there's actually a very good
test on the LLVM fuzzer page
where they show that you could have caught Heartbleed within
about 15 to 20 seconds
had you been using that tooling for it.
Now, I should say that the guys behind
OpenSSL, they were more than
well aware of what they should have been doing
to do this testing, but they just did not have the resource.
There's basically two guys who were doing that
for all those years, doing it for free in their spare time.
And they asked for money, they asked for resource, and they just never ever could get anyone to fork out for it.
And we got the result that we got, you know.
Now that they have the resourcing in place, they have massively improved that kind of testing for OpenSSL.
I mean, it was a slam dunk that we should have been doing that from the beginning.
But it's tough.
Someone has to sit down and configure this stuff.
And it's boring.
It's always boring configuring more testing, you know.
Nobody enjoys doing it,
but it's important we do it.
Well, since you've brought up fuzz testing
and it's your next item,
I have no experience with fuzz testing.
And I'm not really very clear on how it can work
if like your input,
your program is expecting structured input of some kind.
Does fuzz testing just do random input?
What does it do?
That depends on the tooling you've got.
So the classic tool is AFL,
which is American Fuzzy Lop.
The name, I think,
is due to the fuzz testing itself
producing that as a bug
within the same fuzz testing tool.
Something like that was the reason
why they named it American Fuzzy Lop.
But that's the traditional tool
that's been around for years.
My understanding of how that works is
that it takes
a piece of text and you
tell it what the structure of that text
is roughly. And it will then permute on the basis
of that and try to get your program to crash out.
That's my understanding of it.
I have to say I'm being very vague here
because I've never personally used it.
So I simply point people at it: go read the documentation.
This is the tool that everybody recommends,
historically speaking.
The newer tools, of course,
are the LLVM tooling for LibFuzzer.
And I think Chandler did a talk on that
last ACCU conference that was in Bristol in April.
So he did the keynote there,
which was on the use of libfuzzer.
Libfuzzer is a bit more interesting
because it uses Clang's internal tooling
to optimize the direction of the randomization.
So it follows the code paths as it runs through the code,
and if it's followed a code path before,
it won't go down that line or permute that line anymore
because it's been covered.
What it tries to do is combine unfollowed or uncombined code paths
in new inputs, and it kind of guides the randomization
much more optimally than other tools can do
because it can see which code paths are being followed,
and it tries to make as many combinations as it can
to get the program to crash out as fast as possible.
So it's potentially orders of magnitude faster at finding problems.
That said, again, I haven't personally used it either,
and that's just purely down to time,
because I only learned about this tooling last April.
And we've had a whole lot of conferences between now and then.
So it's on my to-do list, and hopefully this coming year I'll do that myself.
Again, I just linked the documentation and say you really should go look at
this. It's really cool. Okay, yeah,
I just added LibFuzzer to my to-do list after
you mentioned it. I think I heard Chandler
mention it in a side conversation at
C++Now, but
I lost track of that
after I heard about it. Thank you.
No, Chandler
is really great at getting out there and telling us about
new tooling to do with Clang.
He's doing such a great job with it that I think some people are beginning to worry about the Clang monoculture,
taking over C++, that the only toolset we'll ever think about and consider in the future is going to be Clang.
And I can see actually that's coming, because all the exciting stuff's happening around the Clang ecosystem.
Even my own tooling that I've written for APIBind, I wrote that using libclang,
so it parses the Clang AST that's output
and it can't cope with anything that Clang can't do.
Which is a good thing and a bad thing.
It's a worry, but I think we're a long
way away yet from it becoming a problem.
So do you actually think as Clang
CL becomes more feature complete, you'll
find us all swapping out CL.exe
with Clang CL in our Visual
Studio builds?
Well, Clang is already surprisingly good.
I have already compiled a completely Microsoft Visual Studio compliant binary
that works with everything just by typing win clang, off you go,
and it just works. It's great.
They've come so far in the last six months.
Chandler was telling me about how they found so many bugs in Microsoft's compiler
because no one was ever doing
what they were doing before,
and they've been submitting an enormous quantity of bugs in there.
The Microsoft team are going,
oh, wow, who would ever have thought of trying that combination?
You don't hit these things until you go off to re-implement a compiler.
One of the interesting ones is there's a problem
with the mangling scheme used by Microsoft.
They did that mangling scheme before any C++ was standardized.
This is like 95, 94 time, you know?
And they're still using that same scheme now.
Well, the trouble is that if you put a lambda type into a template template declaration
in the right location, you cannot mangle that under the Microsoft scheme.
It just cannot be done.
The mangling cannot represent that.
And when Microsoft's compiler gets to that,
it just goes, I give up.
I cannot compile this code.
And that's an interesting example
where nobody had actually thought
of putting a Lambda type
into a template-template type system before.
It hasn't come up.
But almost certainly,
just because someone hasn't done it yet
doesn't mean that someone isn't going to do it
at some near point in the future.
So that's an interesting problem
that Microsoft have to fix
because for them to fix that, they're going to have
to do something weird with the mangling.
There's already a whole lot of
context-sensitive mangling inflection
points in the parsing of the Microsoft
mangling. So depending on
context, you interpret this byte sequence
and this mangling one way different than another.
And as a result, when you're building
out a parser, it has to be contextually
able to backtrack. So that's
quite complicated. I've tried implementing
one, and I actually failed. And
the demangler that comes with
MinGW
is actually broken.
It doesn't do all forms of mangling that Microsoft
can output, because how do you test
it? And even Microsoft themselves
have produced demanglers internally, which didn't
cover all of the possible cases.
It's tough.
It was a really complicated mangling.
So, I don't know, I would like to hope that they're
going to completely throw away their
existing mangling scheme at some future point and
adopt the Itanium one, and then we'll all be on the same
mangling, and I think that would be huge if that ever
happened, but I think the chances of that are very, very
low.
That would be crazy if there was a standardized mangling scheme.
That would help me with some of my projects.
I think it would help all of us hugely with all of our projects
because the tooling to build out, if you can assume that, is enormous.
But it would cause tremendous troubles for Microsoft's code.
So jumping back to this list,
one of the interesting things I saw was portability.
It looks like some of the libraries you evaluated
were doing their own compiler feature detection,
and you're recommending to use something a little more standard?
Yes, I think you jumped down quite a bit there, did you?
No, it's...
I think one of the interesting trends that's coming in these new
libraries is they're all standalone.
So they're all completely separate
from all other libraries with no dependencies
for the most part. Sorry, no mandatory
dependencies apart from the standard library
that comes with C++11 or 14.
And as a result, when they're doing feature
detection, they're going off and doing it
on their own. So they're doing the old-fashioned way of comparing compiler versions.
Of course, that's much more tractable nowadays
because we only really have three compilers.
10, 15 years ago, we had about 15 that you really had to think about,
and 10 that you definitely had to think about.
So nowadays, if there's only three that you really have to think about
in any major way, that's more tractable than it used to be.
But that said, just because it is now doesn't mean that in the future
there isn't going to be some new compiler that comes along
that will do all sorts of interesting new things.
And going off and doing version
checking I think is a very backward step
relative to doing feature checking. And of course
there's a C++17 study
group dedicated to doing feature checking macros.
So what I was suggesting here
in that particular point was, well instead of
going off and reinventing
the wheel, why don't we just go ahead and use the C++17 feature detection macros right
now.
And I had a header that I linked to which will, on the older compilers, go off and create
emulations of those exact forthcoming macros.
Now the latest compilers already have those feature detection macros going into them,
so if you use the latest Clang 3.7 and you ask it for a dump out of its macros
that it's pre-defining, it'll have a whole load of feature detection stuff coming in
from C++17. But if you're then on Clang 3.2, it doesn't implement, I think it implements
one or two of them, and it has the wrong name, because back in those days they didn't know
what the naming was going to be yet. So that's where this header file that I put together,
it simply goes off and it glues up a set of standardized
feature detection macros. And then I also
implemented a boost config emulation
for those. So if you want
your boost configuration macros that detect
various features about the compiler, there's a
shim that will convert out from the
C++17 feature detection into
the boost equivalent, so you can write your code
and as you always did, it just sort of
works, you know. And that was my
hope, was that was an idea
for people. But I don't know if people are going to do it.
People really hate dependencies
nowadays in C++ and that's very
much come in from the Web 2.0
kind of philosophy of doing code.
You know, you really hate having to check
out a development environment. You just want to drop a
single file that you've got and just drop it and get to work,
you know.
Right.
Well, I think we might be running out of time.
Are there any other points you want to bring up from the rest of your list
that you think are particularly important?
I think the biggest thing for me is
I really hope it's a living document.
So I've already had some people come in with some ideas.
There's a new one that just got added
by Christoph, who did
a proposed Boost
dependency injection library, and
he found a technique for
a one-click launch of your library
in a ready-to-go webpage with
a development environment in there.
So you literally just punch the button and
there comes your library you want to have a go
inside a box and you can get programming right there and then run it,
and it just goes.
And the way that he's done it is that someone,
is it Wandbox or something like that?
They've implemented an API,
so you can come along with a RESTful API query,
and you can poke data into a ready-to-go online C++ compiler,
and as a result, you simply assemble a copy of your library,
you poke it into the development environment along with some test code to show it running,
and then you can get in there
and people can just literally try out your library
inside a web browser.
And that's an amazing idea, and he's shown me how to do it,
and I just need to sit down and write it up.
And it's just, you know, that's the kind of stuff
that you do with JavaScript, you know, with jQuery, you know.
And now we're doing it with C++ in a web browser.
It's just mind-blowing.
The next thing that would make me
even more excited is if you could convert the C++
into JavaScript and have it literally run the
program in the web browser. I know
there was some tooling that people were trying to do that. I think it's
unfortunately
become a bit obsolete.
But it's an amazing time to live
in if you're in C++.
It was interesting because one of the articles
that was reviewed earlier was saying about,
you know, people say that C++ is dying out.
And it is and it isn't, as the article pointed out.
For what it's good at, it's unchallenged at the minute.
There are challengers coming
who are going to challenge it in a big way.
And the fact that Microsoft have released the .NET platform
as a
completely portable solution is enormous.
I think it's going to have tremendous effects
on the wider thing, because why would you ever use Java ever
again if you can now write in any language
of your choice that runs on .NET? You've got
15 languages to choose from, all of which will
automatically interoperate. You could have Linux
stuff running with OS X stuff, running with
Windows stuff, just seamlessly, and on all
your platforms from mobile right through to
web you can deploy to the cloud just instantly
with a button in Visual Studio 2015 and then deploy
to your mobile phone using the same program
with a single button click you know
so that's the future of
where coding is going and C++
seems to be right at the front of that
and I think as long as it is
I think we have
a great future ahead of us.
I think you might want to take a look at the Emscripten project
also along those lines of running in a web browser,
which does seem to still be under active development
for compiling your C++ straight to JavaScript.
Is that the thing I was thinking of that had...
When I said obsolete, I meant that it had just not been attended to,
but if it's been re-picked up, or it was
indeed something I'm not thinking of, that
sounds fantastic. So does it do an
LLVM bytecode assembly into
JavaScript, does it? Yes.
Oh, wow, yeah.
The last release was tagged two days ago.
Fantastic.
That's another option that I really
look forward to seeing. I would love to see if you
could take a big metaprogramming library and get the compiler
to reduce it down to like three
JavaScript instructions.
And it's like, I've just run my
program and it prints, hello!
And you've just compiled
200,000 lines of C++ into that.
I think that would be
incredibly cool and at the same time scary.
Yes.
Okay, well where
can people go online to find
more of your stuff, Niall?
Oh my.
I have the world's
most ancient blog
at
www.nedprod.com
and that's been there since about
1998.
There's a whole lot of very embarrassing stuff if you go into the archives,
which I strongly recommend you leave well alone.
And I suppose then I have been on SourceForge
for donkey's years as well,
and of course there's the GitHub repo,
which is where my main development nowadays goes.
So I suppose you just go on there,
you star me,
and you can see the stuff that's coming through.
Apart from that, my plan for the next stuff that comes out is pretty straightforward.
AFIO is coming up for review by the Boost community at the end of July,
and I have to do a number of things to finish it between now and then,
including a monadic type transport, which I'm going to try and do the definitive monad, which
is not any monad that anybody
has ever proposed before, because everybody tries to make
it flexible, unlike Haskell. I'm
making it like C++ and incredibly
simple. So ridiculously simple
to the point of people are going to go, oh my god, I can do
nothing. And that's exactly
what I'm aiming for, is because I want something that's
so efficient, it turns into
no code most of the time and I
have a series of unit test suite
things which literally spit out the assembler and
they count how many instructions are in there and is it one or
is it zero and that's the unit test suite and
if I ever write any code which causes anything other than
one or zero instructions to pop out it
fails in CI so that's
my next big task and I hopefully will announce
the first part of that next week
to boost-dev for their
feedback and then we'll move on to
the next generation future
promises which are based on that monad
afterwards you'll have a seamless interaction
between monadic programming and a
synchronous monadic programming
and that will then get folded into
AFIO hopefully and
with a bit of luck come July people won't
completely hate it
Well, I gotta say, that's a pretty hardcore continuous integration test there.
Zero or one instruction, that's...
Well, it's, you know, the Java, sorry, the
Python script which counts the instructions
has been remarkably
hard to get right, because the trouble is
that when the code fails, it
goes off and outputs lots and lots of functions, you see, and then the Python code has to stitch
together the functions and then determine if your routine actually is only one or zero.
Just because you get 20,000 lines of assembler put out doesn't actually mean that your code
didn't assemble down to one instruction. So I've had to write a parser that takes a series
of heuristic guesses to generate the code, and I keep making... it keeps having bugs in
it. I only sent a fix there today
because it was getting itself into an infinite loop.
But I really don't want to write a full, fat,
X86 assembler parser in Python
if I can utterly avoid it.
No, I don't think anyone would want to do that.
Okay, well, thank you so much for your time, Niall.
This is a really great document, and we'll put a link in the show notes.
So if you're working on a library, you should definitely read it.
Thank you very much, guys, and thanks for the opportunity to appear on your show.
Thank you.
Thanks so much for listening as we chat about C++.
I'd love to hear what you think of the podcast.
Please let me know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic, I'd love to
hear that also. You can email all your thoughts to feedback at cppcast.com. I'd also appreciate
if you can follow CppCast on Twitter and like CppCast on Facebook. And of course, you can find
all that info and the show notes on the podcast website at cppcast.com.
Theme music for this episode is provided by podcastthemes.com.