CppCast - CppCon Poster Program and Interface Design
Episode Date: September 7, 2018
Rob and Jason are joined by Bob Steagall to discuss his history with C++, the CppCon poster program and his upcoming talks. Bob is a Principal Engineer with GliaCell Technologies. He's been working almost exclusively in C++ since discovering the second edition of The C++ Programming Language in a college bookstore in 1992. The majority of his career was spent in medical imaging, where he led teams building applications for functional MRI and CT-based cardiac visualization. After a brief detour through the worlds of DNS and analytics, he's now working in the area of distributed stream processing. Bob is a relatively new member of the C++ Standardization Committee, and launched a blog earlier this year to write about C++ and topics related to software engineering. He holds BS and MS degrees in Physics, is an avid cyclist, and lives in fear of his wife's cats.
News: Frama-C; Frama-C Tutorial; Frama-Clang plugin; The Errata Evaluation Problem; Use Boost.Hana with MSVC 2017 Update 8; Function poisoning in C++
Bob Steagall: Bob Steagall's GitHub; The State Machine
Links: C++Now 2018: Bob Steagall "If I had My 'Druthers: A Proposal for Improving Containers in C++2x"; Fancy Pointers for Fun and Profit; Fast Conversion From UTF-8 with C++, DFAs, and SSE Intrinsics; Interface Design for Modern C++
Sponsors: Backtrace
Patreon: CppCast Patreon
Hosts: @robwirving @lefticus
Transcript
Episode 166 of CppCast with guest Bob Steagall, recorded September 5th, 2018.
In this episode, we discuss Scott Meyers and function poisoning.
Then we talk to Bob Steagall.
Bob talks to us about his history with C++,
the CppCon poster program,
and his upcoming talks. Welcome to episode 166 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
I'm doing all right. How are you doing?
I'm doing pretty well.
Don't have too much to share, I don't think. You?
Is it all right if I mention that you have a dishwasher?
I don't think we've mentioned it before on the show, but yeah.
We haven't.
No, but for some reason it took me about two months to get a new dishwasher,
but it's finally there, and I'm happy about that.
If you don't mind, I'm going to go ahead and do this.
For like the past eight weeks, whenever we would record, Rob would be like, yeah, I'm at home waiting for the dishwasher guy again.
Yeah, pretty much.
That saga is over now, though.
So, yeah.
Well, at the top of our episode, I'd like to read a piece of feedback. Last week we were talking about formal verification, and we got this comment on
Reddit from fsdmcpp, and he says, "Dear hosts, thanks for another interesting episode. I'm not
involved with formal verification professionally. However, as an amateur, I'd like to point out a
few relevant resources."
I'm not going to read the whole post because it's a bit long.
But he points to a tool to enrich C code with annotations to formally prove its properties, which is Frama-C.
And I'll put a link to this in the show notes.
And he also has a tutorial that shows a number of applications of Frama-C.
And he also pointed to a C++ plugin for Frama-C,
which looks like it's a Clang-based experimental plugin.
And then he also talked about the C++ contracts discussion going along with Ada SPARK,
which I believe is another programming language
I'm not too familiar with.
Right, Jason?
I know Ada. I don't know SPARK.
I mean, I don't know Ada.
I know very few people who have ever programmed in Ada, but yes.
Yeah, one thing he points out is this Frama-C formal verification tool
that he mentioned can work with both C and Ada SPARK,
and you could actually use it to verify both in a mixed code base,
which sounds kind of interesting.
But yeah, a couple links to some interesting-sounding resources
related to formal verification with C and C++,
and I will put those in the show notes.
And yeah, thanks for listening.
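For readers who have not seen the kind of annotation described in that feedback: Frama-C reads function contracts written in the ACSL specification language inside special comments. What follows is only a minimal sketch, written so that it also compiles as ordinary C++. The function `max_of` is a hypothetical example, not something taken from the Reddit post or the tutorial, and the contract has not been run through Frama-C here. Actually proving the postcondition would most likely also require loop annotations, which are left out.

```cpp
#include <cassert>

/*@ requires n > 0;
    requires \valid_read(a + (0 .. n - 1));
    assigns \nothing;
    ensures \forall integer i; 0 <= i < n ==> \result >= a[i];
*/
inline int max_of(const int* a, int n) {
    int best = a[0];
    for (int i = 1; i < n; ++i) {
        if (a[i] > best) best = a[i];
    }
    return best;
}

// Ordinary runtime check of the same property on a single input.
// A verifier aims to establish it for every input that meets the contract.
inline void max_of_demo() {
    const int samples[] = {3, 9, 4};
    assert(max_of(samples, 3) == 9);
}
```

To a normal compiler the contract is just a comment, so annotated code keeps building as before; the annotations only matter to the analysis tool.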
Yeah, and you know, thinking about last week's episode,
C++ is notoriously difficult to parse,
and this is definitely one of these things
where the Clang tooling is really allowing people
to do more analysis of the code
in a much easier way than was possible before.
Yeah, absolutely.
Well, we'd love to hear your thoughts
about the show as well.
You can always reach out to us on Facebook, Twitter,
or email us at feedback at cppcast.com,
and don't forget to leave us a review on iTunes.
So joining us today is Bob Steagall.
Bob is a Principal Engineer with GliaCell Technologies.
He's been working almost exclusively in C++ since discovering the second edition of
The C++ Programming Language in a college bookstore in 1992.
The majority of his career was spent in medical imaging, where he led teams building applications
for functional MRI and CT-based cardiac visualization.
After a brief detour through the worlds of DNS and analytics, he's now working in the area of distributed stream processing.
Bob is a relatively new member of the C++ Standardization Committee and launched a blog earlier this year to write about C++ and topics related to software engineering.
He holds BS and MS degrees in physics, is an avid cyclist,
and lives in fear of his wife's cats. Bob, welcome to the show.
Hi guys, thank you very much for having me. I'm very excited to be here.
It seems like a shame to live in fear of the animals in your house.
Well, I am the only male in a house populated with females. I'm married,
I have two daughters, 21 and 25. We currently have three female cats and a tank full of female fish.
So I've learned to be very careful in what I say and the expressions I put on my face.
That's funny.
Wow. But you get along with your cats, right, Rob?
I have two kittens that are great, although of my previous two cats, who have both passed away, one was very nice and the other was very attached to my wife
and didn't want me to be near her. So I understand what you're talking about, Bob.
Yes, we have a new one who's about a year old who is raising hell and giving problems to the other two,
who are sisters and are about seven.
And only one of them, I think, likes me sometimes.
So it's a razor's edge that I walk.
One of the dogs that my wife had when we met,
if my wife was ever out of town,
the dog would just sit there in the living room and stare at me.
Like, what did you do to her?
Yeah.
Okay.
Well, Bob, we've got a couple of news articles to discuss.
Feel free to comment on any of these, and then we'll start talking more about all your work with C++, okay?
Yes.
Great.
Okay.
So the first one is a post from Scott Meyers, and it's a little bit of a sad post.
But we talked to him, I guess it was about two years ago, when he was basically stepping down from publishing any more books or doing any kind of active community work. And he's now saying that he no longer plans to update his books to fix technical errors.
And it's basically because the language is continuing to change and he hasn't been paying
close attention to it.
And people have been writing him about possible errors with some of his code in the books.
And he's no longer able to say definitively
whether or not he believes it's an error. So he's just going to stop putting out these sorts of updates,
which is unfortunate but understandable.
Yeah, inevitable, I guess. At some point you can't
indefinitely keep maintaining your old books and trying to stay up to date after you've retired.
And what was the most recent book? I mean, how many years ago was the most recent one?
Oh, Effective Modern C++ was...
Well, I just looked.
We interviewed him almost exactly three years ago today.
Oh, it was three years ago. Okay.
So Effective Modern C++ was already out when we interviewed him.
Yeah.
Yeah, I think the first year of publication was 2014 for that,
and there have been a few revised printings since then.
I have a copy of the first printing and then a copy of a later printing.
I can certainly sympathize with Scott's point of view.
It's very difficult to keep up with the changes that are coming in the language. And, you know, as a new committee member, trying to keep up with all of the mailings
on the email reflectors is virtually impossible.
I suppose if that was my full-time job, I could do it.
But unfortunately, I have a job where I can't spend a lot of time at work doing non-work
things.
And it's very difficult to keep up with the evolution of the language.
It's very exciting, especially given, you know, sort of the slow change in the language
for maybe the first half of its life there.
But it's very tough to keep up these days, and I can certainly understand where he's
coming from.
So on that topic, and this is perhaps jumping a little bit ahead to the interview portion, but I'm going to go ahead and go for it. I have a mental filter. I don't really pay attention to anything unless it has been approved by the committee. Now, if you're on the committee, obviously you can't do that. Sometimes I'll have people ask me, oh, like, well, what do you think about X and Y? And I'm like, honestly, I'm not paying attention at all. I'm just trying to stay up to date with the things that have been approved.
Yes.
How do you deal with that? Because like you said, it's got to be just this tidal wave of information.
It is very much like trying to drink from a fire hose. And I can tell you, you know,
going to a committee meeting, and I may be dating myself here, but if you've ever seen the movie Amadeus, when I go to a committee meeting, I feel like Salieri in a room full of Mozarts.
I'm surrounded by brilliant people who are doing their best to try and make the language and the library better in a number of ways. And I find myself basically filtering out things that don't really
interest me. And it's taken me, well, a year now, basically a year of meetings to
construct that filter of the things that interest me. And what I'm really interested in in terms of the evolution of the language is
finding ways, not necessarily by adding new features to the language, but perhaps by suggesting
changes in the library or in the language itself that make the language easier to use and to
remember and to understand. You know, it's a tremendously difficult language to learn for beginners.
It's tremendously difficult for me to keep up with it.
And I think that there are some simplifications that could be applied to the language and the library
without necessarily adding new conceptual features, new semantics, or new capabilities
that could make code simpler and easier to read.
And so those are kind of my interest in participating in the committee.
You know, I don't have the same experience as some of the guys who are doing things like adding support for concepts
or the metaprogramming like Louis does or, you know, contracts or those sorts of things.
I sort of think of myself as being a proxy
or interested in supporting the little guy, so to speak, who's got to do a job every day.
And how can I help that person do their job more effectively?
Okay. I will definitely ask more questions about that after we get through the news.
Okay. Well, yeah, I definitely understand Scott's viewpoint. Again, it'll be sad to see
him no longer making these sorts of updates, but it's completely understandable. And I just
thought I'd maybe put out another plug, in case anyone is interested: at CppCon, I think
it's a pre-conference class that he's going to be running along with Kate Gregory and, oh, who's the other one with him? I'm not sure.
Andrei Alexandrescu.
There we go. It's on giving technical presentations.
Like we've said, these classes are a who's who of guests of CppCast.
Yes. Okay. And the next one we have is a post from the Visual C++ blog.
We talked a few times in the past about Boost.Hana being used from Visual Studio
and how they had a bunch of workarounds in order to compile Boost.Hana,
and they have managed to basically fix all the compiler bugs
that were requiring them to put in all these workarounds,
and they're down to just three.
So with the latest MSVC 2017 Update 8 compiler,
they now only have three workarounds that you would have to use
in order to use Boost.Hana.
And I believe those three workarounds are now in
the official master branch.
Yeah, I think that's right. Like, you can actually use
Boost.Hana with Visual Studio 2017 Update 8, as opposed to using their fork of it.
Right.
Right. I need to try ChaiScript again with the latest update. Last one I had, I was getting incorrect behavior
with class template deduction guides.
But that's been a couple of releases,
and I know they're making huge headway on these things.
But some of these workarounds that are still in place,
I don't even understand the words in it.
Boost.Hana workaround MSVC.
Okay, fine.
Multiplector? What the heck is a
multiplector? Is that a word? Well, it says multiple copy move
constructors. Oh, okay. That would make sense.
Well, I can say that I'm
in awe of Louis's metaprogramming abilities,
and I've not yet had an opportunity to actually use Hana in anything work-related or actually even in any of my personal projects.
But I have some small appreciation of all of the work involved. On the other side of the coin, I'm really excited, and have been very excited on Microsoft's behalf over the last few years, at the progress that they've made in such a short time in coming close to standard conformance.
I remember buying a copy of Microsoft C7 in 1991 or 1992 and then later buying Visual C++ 4.2 and Visual C++ 5 and Visual C++ 6 through the 90s.
And comparing those to compilers like GCC on Solaris at the time or Borland C++,
there was always the issue of things that Microsoft's compilers did not support
because they were busy, I think, trying
to force their will onto the market.
And that sort of continued for a few more years.
But I was at C++ and Beyond in 2012 in Asheville, North Carolina, and someone asked Herb Sutter
about, you know, what is Microsoft
going to be doing in terms of standard conformance at that time? And Herb unequivocally said,
we are dedicated to standard conformance, and you're going to see that happen.
And lo and behold, over the last six years since then, it really has happened, and I'm
very pleased to see it. You know, every time a new version of Microsoft Visual Studio came out,
I would go and buy the Intel compiler because I wanted that standards conformance.
But I haven't actually purchased the Intel compiler now for a couple of years
because Microsoft has improved so much.
So I'm very happy for Microsoft,
and I'm very excited that they've really taken on this challenge and done so well.
I think it says something about the complexity of C++,
but at the same time, any old mature language is complex.
That it took six years of specific effort
with the might of Microsoft for them to get to this point.
Yes.
But to be fair, other popular languages
don't even have multiple implementations.
And those that do are either significantly simpler languages or not nearly as old, many of them.
Right.
I can't imagine.
But does Perl 6 have multiple implementations?
That's a beast of a language.
No idea.
Yeah. Did either one of you follow the development of Perl 6?
No.
No.
It was a thing. It took many years. There's a chart; a long time ago I
downloaded a PDF chart of all the operators that it has, and it's something like
30 different operators. I might be exaggerating, but it's
crazy. Anyhow.
Okay. Last article we have here is a post from Fluent C++, Jonathan Boccara's blog.
Well, I believe it's actually a guest post on the blog. And this is function poisoning in C++.
And this is something I was not aware of, but apparently GCC gives you this poison pragma
if you want to basically prevent the usage of some identifier in your code.
And the author of this post basically goes over some legitimate use cases of that.
Basically, one thing he highlights is if you wanted to put some wrapper around a C allocation function
so that you can instead wrap it with something returning, a smart pointer,
then you can then prevent someone from using the C version allocator to just make your code a little bit safer.
And I thought that was a good idea.
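A minimal sketch of the technique being described, assuming a hypothetical C-style API (`widget_create` and `widget_destroy` are invented for this illustration and are not from the article). The raw functions are wrapped in a function returning a smart pointer, and then poisoned so that later code cannot name them. The pragma is guarded because, as discussed, it is a GCC extension that Clang also accepts.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style API, invented for this sketch.
struct widget { int value; };
inline widget* widget_create(int v) { return new widget{v}; }
inline void widget_destroy(widget* w) { delete w; }

// Safer wrapper: ownership is expressed in the return type.
struct widget_deleter {
    void operator()(widget* w) const noexcept { widget_destroy(w); }
};
using widget_ptr = std::unique_ptr<widget, widget_deleter>;

inline widget_ptr make_widget(int v) { return widget_ptr{widget_create(v)}; }

// From this point on, any mention of the raw functions is a hard error
// on compilers that honor the pragma. Code above this line is unaffected.
#if defined(__GNUC__)
#pragma GCC poison widget_create widget_destroy
#endif

inline void poison_demo() {
    widget_ptr w = make_widget(7);   // fine: goes through the wrapper
    assert(w->value == 7);
    // Writing a direct call to the raw create function here would not compile.
}
```

The pragma works on identifiers at the preprocessor level, so it has to come after every legitimate use of the name, including the wrapper itself.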
It's an interesting concept.
I don't tend to be a fan of these compiler-specific extensions,
but to be fair, almost all the GCC pragmas are supported by Clang, so that's two compilers
anyhow.
Right, and I believe he said that he didn't see any way of doing something similar to this on
MSVC, right? So it's definitely GCC and, I guess, Clang only.
But interesting idea.
Yeah, it seems interesting.
It could be a useful technique if you have people that are working on a project
that may be less experienced
and more likely to fall into one of the traps
that you're attempting to prevent here.
In general, though, I'm a little bit leery
of techniques like this.
In my mind, at least, it's somewhat analogous to having a member function
that's declared public in a base class
and then redeclaring it as protected or private in a derived class
where you're hiding public interface.
And it's not a perfect analogy,
but all of the APIs and stuff from the C library
and the standard C++ library,
they are sort of the public interface of our language.
And I'm a little bit leery of hiding parts of that interface,
unless there's a really compelling reason to do so.
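The analogy can be made concrete with a small sketch. The type names here are hypothetical and chosen only for illustration: a member function that is public in the base class is redeclared as private in the derived class, which hides it from users of the derived type but not from anyone who goes through the base interface.

```cpp
#include <cassert>

struct shape {
    int sides() const { return 4; }   // part of the public interface
};

struct locked_shape : shape {
private:
    using shape::sides;               // same function, now private in the derived class
};

// Code written against the base class still sees the public member.
inline int sides_via_base(const shape& s) { return s.sides(); }

inline void hiding_demo() {
    locked_shape ls;
    // ls.sides();                    // would not compile: private in locked_shape
    assert(sides_via_base(ls) == 4);  // still reachable through the base interface
}
```

The function has not gone away; it has only become harder to reach, which is part of why the technique can surprise people who expect the full base interface.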
That's an interesting viewpoint.
Yeah.
There were a couple of comments in the Reddit discussion about this article, also, that they
might've been able to accomplish something similar by deleting the versions of the
functions they don't want to use in the public interface.
But I don't think you could get the level of implementation that they wanted here.
Or marking them deprecated;
that was one of the comments on Reddit also.
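For comparison, here is a hedged sketch of the two portable alternatives mentioned in those comments, applied to a hypothetical API of our own (`image_alloc`, `make_image`, and `area` are invented names). One likely reason these do not fully replace poisoning is that `= delete` has to appear on a function's first declaration, so it cannot be applied after the fact to a function that a library header has already declared.

```cpp
#include <cassert>
#include <memory>

struct image { int width; int height; };

// Still callable, but compilers are encouraged to warn at every use.
[[deprecated("use make_image(), which returns a std::unique_ptr")]]
inline image* image_alloc(int w, int h) { return new image{w, h}; }

inline std::unique_ptr<image> make_image(int w, int h) {
    return std::unique_ptr<image>(new image{w, h});
}

// A deleted overload turns one specific call shape into a compile error.
inline int area(const image& img) { return img.width * img.height; }
int area(const image*) = delete;   // area(ptr) is rejected; callers must dereference

inline void alternatives_demo() {
    std::unique_ptr<image> img = make_image(3, 4);
    assert(area(*img) == 12);
    // area(img.get());             // would not compile: deleted overload selected
}
```

Deprecation only warns, and deletion only covers functions you declare yourself, so both are softer tools than a poisoned identifier.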
Right. Okay, so Bob, could we start off by you just giving us a little bit more of your
history and background with C++?
Sure. I was working as a research assistant, my first sort of real job after graduate school
in the early 90s at the University Hospitals in Cleveland,
Ohio. And I worked for a physics professor, the guy who was actually my advisor for my master's
degree. And he was associated with Case Western Reserve University, and he did research in MRI.
And I was working on a project where we were doing some very early visualization of cardiac images from MRI,
which was, it's a notoriously difficult problem
because people's hearts don't hold still when you're trying to image them.
And scanners at the time, you know,
were just not quite sophisticated enough to acquire data at high enough speed
so that motion artifacts were not a problem.
But we were doing research,
and I was writing some code in conjunction with Siemens Corporate Research in Princeton,
New Jersey, to do some visualization of the images that we acquired. And I was doing it all in C.
I had taught myself C a few years earlier to do my master's thesis. And of course, you know,
I had the concept of viewports and images and all of these things that I was allocating off the heap,
and it just became more and more complex, more and more difficult for me to keep it in the limited capacity of my brain.
And I was in the university bookstore one day goofing off,
and I saw this book called the C++ Programming Language.
And I looked at it, and I thought, oh, this is interesting.
So I purchased it, and I took it home.
And my background at that time, all of my experience was with C and Fortran and Pascal,
very heavily procedural languages.
And I read it, and I got through the first couple chapters,
and at the time, I just didn't grok this inside-out syntax of object.operation.
And I thought, this is madness.
This is no use to me.
And I put it up on the shelf, my bookshelf, where it sat for a few months. And finally, the pain at work managing all my memory allocations got bad enough
that I pulled the book off the shelf again.
And also, I'd recently heard about this thing called templates.
And it sounded like a solution to a problem I was having. So I got the book and started reading it and suddenly
realized, hey, there's this destructor thing, which would really help me with my memory leaks.
And oh, by the way, here's this template thing that would help me manage different lists of
things. And in my code, I was using lots of doubly linked lists of different objects.
And I just sort of fell in love with the language.
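The two features Bob singles out can be sketched in a few lines of present-day C++. This is an illustration only, not his code from the time: `pixel_buffer` is a hypothetical type, and `std::list` stands in for the hand-written doubly linked lists he mentions. The destructor releases the heap allocation automatically, and the template provides one list implementation for any element type.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

class pixel_buffer {
public:
    explicit pixel_buffer(std::size_t n) : data_(new float[n]()), size_(n) { ++live_count(); }
    ~pixel_buffer() { delete[] data_; --live_count(); }   // cleanup cannot be forgotten

    pixel_buffer(const pixel_buffer&) = delete;            // one owner per allocation
    pixel_buffer& operator=(const pixel_buffer&) = delete;

    std::size_t size() const { return size_; }

    // Number of buffers currently alive, to make the cleanup observable.
    static int& live_count() { static int n = 0; return n; }

private:
    float* data_;
    std::size_t size_;
};

inline void raii_demo() {
    {
        std::list<pixel_buffer> buffers;   // the template supplies the doubly linked list
        buffers.emplace_back(16);
        buffers.emplace_back(32);
        assert(pixel_buffer::live_count() == 2);
    }                                      // list and both buffers are destroyed here
    assert(pixel_buffer::live_count() == 0);
}
```

No explicit free call appears anywhere in the calling code, which is the relief from manual memory management that the passage describes.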
This was just before, really, that there were web browsers, and HTTP was a thing.
And a friend of mine who was really into the nascent Internet at that time turned me on to Usenet,
and I started reading comp.lang.c++ and comp.std.c++.
And a couple years later, I heard about this thing called the STL, which was really kind of exciting.
And everybody was talking about it, and I downloaded it.
I FTPed it off of some FTP server at HP Labs and was able to sort of make it work with the primitive GCC that I was using at the time, which was kind of cool. And I was young and ambitious and not quite wise to the ways of
the world and thought, hey, there could be a market for something like this. And so I left my
job at UH and started my own company.
And I did some consulting on the side.
I did network installations where I crawled around in attics and stuff,
installing RG-58 thin-wire and thick-wire Ethernet to support myself.
But from about 94 through 99, I worked on implementing and I sold my own version of the standard C++ library.
And I actually started selling it a few months before the C++98 standard was announced.
And it was not perfectly conformant to the standard by any means.
There were things at the time I thought were pretty silly that I did not include, and I did things my own way.
And it was a great learning experience.
You know, I wrote several hundred pages of documentation, which I published and sold
to my users on paper, of all the crazy things.
I didn't sell a whole lot of copies.
If I did, I'd be laying on a beach somewhere earning 20%.
But it was a great learning experience.
And from there, my wife finished her postdoc and got a job at the National Institutes of Health here in Maryland.
And we moved to Maryland.
A buddy of mine got me into a company where he was working doing medical imaging, which I had background in.
And before long, I was running a small company, the company that does functional MRI.
I was there for about 11 years.
We had a product that we sold to GE
Medical Systems that was quite successful. And that product was entirely C++ based.
I did a lot of development on that the last eight or nine years while I was there.
I wrote a big metaprogramming library, a linear algebra library, atomic messaging library.
So I was sort of the chief cook and bottle washer at the company there
because there were about 8 to 12 of us.
And one minute I'm designing things and arguing with my team
about things on a whiteboard, and the next minute I'm ordering
soap replacements and paper towels for the bathroom and that kind of stuff.
So it sounds like you actually did make a living for a time from selling your own
implementation of the standard library.
That's true, but I'm not sure I would even call it a
living.
Okay. Well, you at least made it sound like you made a living.
Anyhow, I made a few dollars at
it. Most of the money I was making at that time was actually coming from the network
installations I was doing, because I was willing to crawl around in dusty attics and places most smart people would not go in order to install cables.
Just thinking about how much the world has changed.
I mean, if you said today you want to make and sell your own standard library implementation,
that would sound kind of insane.
That would be crazy talk.
Yes.
So that's pretty cool.
It was a good experience, and I actually developed things which were not part of the standard, but eventually would become part of the standard.
And I don't claim any responsibility for them becoming part of the standard, because people were doing sort of the same thing in Boost at the time.
And I was not involved with the committee in any way at that time. But when I released my product in the spring of 97, I had a complete regular expression
library that integrated with the strings, that supported Perl syntax of regular expressions,
except that it only did greedy matching.
It did not do non-greedy matching.
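The difference between greedy and non-greedy matching can be shown with today's `std::regex`. This is a sketch using the standard library, not Bob's library, and `std::regex` defaults to ECMAScript syntax rather than full Perl syntax; `first_match` is a hypothetical helper written for this example.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns the text of the first match of `pattern` in `text`, or an empty string.
inline std::string first_match(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str(0) : std::string();
}

inline void greedy_demo() {
    // Greedy: ".*" takes as much as it can, so the match runs to the last '>'.
    assert(first_match("<a><b>", "<.*>") == "<a><b>");
    // Non-greedy: ".*?" takes as little as it can, so the match stops at the first '>'.
    assert(first_match("<a><b>", "<.*?>") == "<a>");
}
```

A library with only greedy matching would support the first pattern but not the `?` modifier used in the second.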
I had an extensive library of functions for threads.
I had a class that I called callback, which was analogous to function today.
I had another type that I called initiators, which was really an implementation of the
command pattern, which was a callback coupled with arguments.
And you could use that with the thread class to run a command in a distinct thread.
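In terms of today's standard library, the design described here might look roughly like the sketch below. The names `callback`, `initiator`, `make_initiator`, and `run_in_thread` are assumptions made for illustration, built on `std::function` and `std::thread`; they are not the interfaces of Bob's library.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <utility>

// A callback is a stored callable; here it takes two ints.
using callback = std::function<void(int, int)>;

// An initiator is a callback coupled with its arguments (the command pattern):
// everything needed to run later is captured, so invoking it takes no arguments.
using initiator = std::function<void()>;

inline initiator make_initiator(callback cb, int a, int b) {
    return [cb = std::move(cb), a, b] { cb(a, b); };
}

// Run a command on a distinct thread and wait for it to finish.
inline void run_in_thread(initiator cmd) {
    std::thread worker(std::move(cmd));
    worker.join();
}

inline void initiator_demo() {
    int sum = 0;
    initiator cmd = make_initiator([&sum](int x, int y) { sum = x + y; }, 2, 3);
    run_in_thread(cmd);          // join() makes the write to sum visible here
    assert(sum == 5);
}
```

Because the initiator carries its own arguments, the thread wrapper only needs to know how to call something with no parameters.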
Okay.
And then I had the basic synchronization primitives, atomic integers,
and semaphores, mutexes, condition variables.
So it was 20 years ahead of its time.
I don't know about that, maybe 10 years.
What I really want
to know, though, is what parts did you leave out because you thought they were silly, if you recall
what that was?
Oh, I can tell you exactly what I left out.
Okay.
I left out valarray and everything
associated with valarray because, you know, what's the point?
Okay. I've never personally met or
heard of anybody who uses valarray, although there must be somebody.
Yeah, I've never used it, I don't think. But I think the worst and most painful part of it was implementing the iostreams.
I expected you to say you left off iostreams, honestly, because...
No, I had a full implementation of iostreams, plus all the required facets.
But that was painful, and I didn't like it,
and I have avoided the iostreams ever since at all costs. When I do character-based I/O, you know, I either use the OS functions for
doing I/O, or I use the C functions. I really try to avoid the iostreams. You know, I must say,
it's slightly off topic, but I have recently started using fmtlib, libfmt,
however they want it to be called.
And it is absolutely amazing.
It is what I'm using now.
Yeah, it's a great library.
And, you know, to combine the syntax for specifying output that you get from the old C streams that are got to be 40, almost 40 years or more old now with the type safety of C++ just seems like a big win.
And it avoids all the unnecessary verbosity and complexity of, you know, iostream manipulators to get things to format the way you want.
And it's always a battle
to make things work the way you want to
with iostreams.
And I really just prefer
the simplicity of printf.
I guess I'm old-fashioned.
Well, if you can get the simplicity of it
with the type safety and performance
of modern C++ design...
It's a win.
It's a pretty cool project, yeah.
Yeah.
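To make the verbosity point concrete, here is a small sketch that produces the same text twice, once with a printf-style format string and once with iostream manipulators. It uses only the standard library so that it stands alone; the function names are hypothetical. With the {fmt} library the first version would be written with a similar format string, roughly `"{:04d} {:8.3f}"`, but checked against the argument types, and that call is not shown here because it depends on a third-party library.

```cpp
#include <cassert>
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>

// printf style: the whole layout lives in one format string,
// but nothing checks that the arguments match the specifiers.
inline std::string format_with_printf(int id, double value) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%04d %8.3f", id, value);
    return std::string(buf);
}

// iostream style: type safe, but each formatting choice is a separate manipulator,
// and the fill character has to be switched back by hand.
inline std::string format_with_iostream(int id, double value) {
    std::ostringstream os;
    os << std::setfill('0') << std::setw(4) << id << ' '
       << std::setfill(' ') << std::setw(8) << std::fixed << std::setprecision(3) << value;
    return os.str();
}

inline void formatting_demo() {
    // Same output for this non-negative input; negative ids would be padded differently.
    assert(format_with_printf(42, 3.14159) == format_with_iostream(42, 3.14159));
}
```

The format string states the layout in one place, which is the simplicity being praised, while the stream version spreads the same decisions across six manipulators.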
I wanted to interrupt
this discussion for just a moment to bring you a word
from our sponsors.
Backtrace is a debugging platform that improves software
quality, reliability, and support
by bringing deep introspection and automation
throughout the software error lifecycle.
Spend less time debugging and reduce your
mean time to resolution by using
the first and only platform to combine
symbolic debugging, error aggregation, and state analysis. At the time of error, Backtrace jumps into action,
capturing detailed dumps of application and environmental state. Backtrace then performs
automated analysis on process memory and executable code to classify errors and highlight
important signals such as heap corruption, malware, and much more. This data is aggregated
and archived in a centralized object store, providing your team
a single system to investigate errors across your environments.
Join industry leaders like Fastly, Message Systems, and AppNexus that use Backtrace to
modernize their debugging infrastructure.
It's free to try, minutes to set up, fully featured with no commitment necessary.
Check them out at backtrace.io slash cppcast.
So, Bob, we're now about two, two and a half weeks away from CppCon,
and you're going to be running the poster program, or you have been running the poster program.
Is that right?
Yes, that's right.
This is my second time doing this.
The program was first started in 2016 by Hanya Bastani.
And I think, I've never talked to her about it, but I think what she was doing was trying to, in some sense, because the talks at CppCon are peer-reviewed, she was trying to emulate what scientific conferences do, which is they also have poster sessions.
And typically, researchers or graduate students or students, you know, they create posters to describe their research. And, you know, at a lot of these conferences, the number of slots for speaking is very limited.
And so a lot of work goes on to creating these posters.
And I think what Hanya was trying to do was to emulate that at CppCon,
because there is a lot of really good work that people are doing that should be talked about
and the community should be made aware of.
So 2016 was the first year, and there were four posters.
Unfortunately, she couldn't do it last year,
and Jon Kalb asked me if I was interested in doing it,
and I agreed to do it.
And so last year was my first year doing it.
We had nine posters, so it was a very nice increase in poster count.
It was very successful.
The posters, I don't know if you remember from last year,
but the poster stands were inside the reception hall during the Sunday evening reception.
And I was very pleasantly surprised and very excited to see that around every single poster,
there were crowds that were two and three and four deep
listening to the poster authors talk about their work. So I was excited that people were so
interested in it, but I was also pleasantly surprised and happy on behalf of the authors
that they were getting so much attention for their work. And so this year, 2018, we have 16 posters. So we're not quite doubling again.
But over two years, we've quadrupled. So I figure if I can double it again three more times,
I'll have as many posters as there are talks. And I can give Bryce a run for his money.
It's just that much more work to do to make sure that you're selecting the right content, too.
Yeah, well, you know, to be fair, you know, the requirements for submitting a poster are not nearly as stringent as they are for giving a talk.
Right.
But, you know, the idea with the posters is to give people a voice who may not quite have everything that it takes to give a talk. And really, I'd sort of like to focus over the coming years
on some of the younger or less experienced members of our community
to try and get them involved.
I'd like to hear about their work, students, graduate students, postdocs, interns,
people who are relatively new in industry,
to get them to participate in this as well
and build contacts throughout the rest of the community
and talk about what they're doing.
I think it's a great experience.
So I'm curious, if you can share,
how many submissions you had versus how many were accepted?
We had slightly more than 16.
I think there were 18 or 19 submissions.
And I would have, actually all of the submissions were very good. The problem with the two or three
that did not make it is they just came in too late. I can let a day or two slide, but I can't
let nine days slide.
The people who are doing the evaluations, I'd already formulated a schedule and asked them for their time.
It's unfair to say, hey, here's three more submissions, but you've got half a day to evaluate them.
So they were cut off for time reasons, not for content.
I think I've heard John or Bryce refer to this as the 90-10 rule of 90% of the submissions come in 10 minutes after the deadline.
Yeah, John's told me that as well. Yeah.
But I think it's a great medium. In a way, the posters are a semi-persistent medium. They're
up all week. Whereas the talks, after a talk, when the words are spoken, they're kind of gone until the talks come out on YouTube.
The posters are a good way to foster communication that lasts for an entire week.
And it's just a different way of presenting information that's a contrast to the talks and classes and such.
And I think it's a great idea.
I'm very happy that John asked me to be a part of it.
And, you know, I hope he'll let me continue.
So is the plan the same again this year, then, to have them during the reception available?
Yes. The poster stands will be in the reception hall, I think.
That's the last plan that John and I spoke about a few days ago.
Sort of the same format, except there will be more posters. This year, so the first two years, we actually had a poster judging committee that was a small number of people that went around
and evaluated the posters in order to award a prize. This year, we have too many posters to ask
a committee to do that, because it takes quite a bit of time, at least 20 minutes per person,
and then everybody has to gather and argue to decide on a winner.
So this year, we are going to award prizes based on votes from participants.
And I think probably this year, the top two vote-getters will be the posters that win the prizes.
Okay.
When you say votes from participants, do you mean from the poster creators or from CppCon attendees?
From CppCon attendees.
So for those of you who are maybe new to CppCon, when you pick up your badge and your conference materials,
inside of it you will find two colored slips of paper which represent your votes for posters. And as you view the posters, you'll find that there's a large envelope attached
to the poster stand with the poster's name on it. And if you like that poster, if you think it's
worthy of a prize, you can drop one or both of your vote slips into that envelope and they will be counted and tallied up for the awards Friday afternoon.
So in addition to working on the poster program, I think you're going to be giving two talks at CppCon this year, is that right?
Yes, so I'm fortunate to be able to give two talks this year, which are reprises of the talks that I gave in Aspen at C++Now.
The one on Tuesday morning is called Fancy Pointers for Fun and Profit, and it's a look
at synthetic pointers and things that you can do with them, or some things you can do
with them.
The talk on Wednesday morning has to do with fast conversion of UTF-8 to UTF-32 using C++ and a simple DFA and some SSE intrinsics.
You definitely have to tell us more about these fancy pointers.
What do you mean by fancy pointers?
What can our listeners expect to see in this talk?
Okay, so fancy pointer is the common or the colloquial term for something that
the standard calls pointer-like types.
Okay.
And a pointer-like type is a user-defined type
whose purpose is to emulate the syntax and the semantics of pointers, right? People call these fancy pointers.
I like to call them synthetic pointers because, you know,
fancy is not quite fancy enough of a word for it.
But if you think about what a pointer is,
a pointer is a kind of object that represents an address,
and it has certain primitive operations related to addressing.
It tells you what the location of some thing is in memory.
And as it turns out with C++11, you can write a class which pretty closely mimics the syntax of a natural pointer, you know, T star or void star. And you
can imbue that class with the semantics of a pointer. And this turns out to be useful.
There's a couple of motivating problems.
The first problem is shared memory.
Suppose that you're writing some message managing system or shared memory database,
and you want this thing which exists in shared memory to be accessible by multiple processes.
So in one of the processes, you ask the operating system,
make me some shared memory segments.
And it will do that, and it will give you back some sort of handles to those shared memory segments.
And then you ask the operating system,
okay, with this handle to the shared memory segment, give me an address to that.
And so the operating system will map the virtual address of the shared memory segment
into the private address space of your process.
Okay.
So now if you have more than one process,
there's no guarantee that the mapping is going to be at the same address
in both of those processes.
You can ask the operating system to do that, and sometimes
it will comply, and sometimes it will not. So you cannot be guaranteed that the same memory
segment is going to exist at the same address in both processes. So with synthetic pointers,
it's possible to develop a type that emulates pointers that works in both processes. Imagine that you wanted to construct
a linked list of strings, a std::list of std::string, or a std::map of std::string to
std::list of std::string, right? And you wanted to construct this thing and you wanted it to be in
your shared memory segment. Well, with ordinary pointers, it would work for the process that constructed and placed things there,
but for other processes it might not because it's using natural pointers
and their addressing is relative to the base address of that process.
With synthetic pointers, you can define a pointer type and use it with
std allocator to do the same sort of thing, to instantiate the same class in the shared memory
segment, but have it be readable and writable, if you lock it correctly, from multiple processes.
So it's a location-independent way of doing addressing, in a sense.
So that's one motivating problem.
The other is something that I call, for lack of a better word, self-contained DOMs.
Suppose that you're doing some kind of messaging, and you want to build that same data structure, a map of string to list of string,
and you want to wrap it up in some buffer, and you want to send it out over the wire to somebody else.
Well there's another kind, you know, there is a kind of synthetic pointer that you
can use to build that kind of thing into a buffer of bytes and send it over the
wire and have the receiver get it, do a pointer cast, and be able to use it
without any serialization.
So it's possible to build a valid C++ object, a container, which is both a valid object and a pre-serialized bag of bytes that you can move around with memcpy or send-receive
or things like that.
And I think it has a lot of use in today's world where people are doing messaging and stream processing where a lot of time is being spent serializing to XML or JSON and deserializing on the receiving end.
And this is a technique where if you have a homogenous network where you know all your nodes are the same type and you've got the same compiler building your software, that you can achieve some nice performance benefits.
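To make the self-contained buffer idea concrete, here is a minimal sketch of a self-relative pointer. It is not the code from the talk: the names `rel_ptr`, `node`, `build_list`, and `sum_list` are hypothetical, and the approach relies on copying raw bytes of a non-trivially-copyable type, which works on mainstream compilers but goes beyond what the standard strictly guarantees.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>

// Hypothetical self-relative pointer: it stores the distance from its own
// address to the target, so a whole buffer can be moved with memcpy.
template <class T>
class rel_ptr {
    std::ptrdiff_t off_ = 0;  // 0 means null (a rel_ptr never points at itself here)
public:
    rel_ptr() = default;
    rel_ptr(T* p) { set(p); }
    // Copying recomputes the offset relative to the new object's address.
    rel_ptr(const rel_ptr& o) { set(o.get()); }
    rel_ptr& operator=(const rel_ptr& o) { set(o.get()); return *this; }

    T* get() const {
        if (off_ == 0) return nullptr;
        return reinterpret_cast<T*>(reinterpret_cast<std::intptr_t>(this) + off_);
    }
    T& operator*() const { return *get(); }
    T* operator->() const { return get(); }
    explicit operator bool() const { return off_ != 0; }

private:
    void set(const T* p) {
        off_ = p ? reinterpret_cast<std::intptr_t>(p) -
                   reinterpret_cast<std::intptr_t>(this)
                 : 0;
    }
};

// A singly linked list node that lives entirely inside one buffer.
struct node {
    int value;
    rel_ptr<node> next;
};

// Builds `count` nodes with values 1..count at the start of `buf`.
// Assumes `buf` is suitably aligned and large enough.
node* build_list(void* buf, int count) {
    node* nodes = static_cast<node*>(buf);
    for (int i = 0; i < count; ++i) new (&nodes[i]) node{i + 1, {}};
    for (int i = 0; i + 1 < count; ++i) nodes[i].next = &nodes[i + 1];
    return nodes;
}

// Walks the list through the relative pointers and sums the values.
int sum_list(const node* head) {
    int total = 0;
    for (const node* n = head; n; n = n->next.get()) total += n->value;
    return total;
}
```

After `build_list` fills one buffer, the bytes can be copied to a second buffer and the copy traversed with `sum_list` after a pointer cast, because every link is stored relative to its own position.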
So there's something that you said at the beginning.
I mean, that's all very interesting.
I can see the application. I don't want to ignore that.
But you said, model a pointer like a T star or a void star.
And arguably, void star isn't a model of a pointer.
You can't increment it, and you can't dereference it, right?
Yes, you're absolutely right.
So in my talk, I break the concept down into four different levels, at least with regard to memory allocation.
Addressing model, storage model, pointer interface, and allocation strategy.
These are four different concerns, somewhat orthogonal concerns, that an allocator has.
At the lowest level of that is the addressing model.
You can think of the addressing model as being the analog of void star.
How do I arrange bits inside an object to represent memory? Okay. For example, if you
remember DOS or early Windows segment offset addressing, you had a pointer which contained
a segment and an offset, right? That's actually a similar principle can be used for shared memory
synthetic pointers. And those are different, say, than flat pointers like you have in a 32 or 64-bit address space.
So what I call the addressing model is something that's analogous to a void star, and it's
basically how do you define the bits that represent a pointer?
And how do you use that to compute an address if you know the type of the
thing that you're addressing? And typically that's somewhat associated with the storage model.
If you had a shared memory storage model, you might expect a representation in one form.
A self-contained DOM storage model might use a different form of addressing model.
But layered on top of that is what I call the pointer interface.
And this is the thing that actually, for non-void types,
adds the traditional pointer syntax and semantics on top of it.
Okay.
So it takes sort of the void star characteristics of the addressing model,
and it imbues them with the type characteristics or the type operations
that you would expect
from a regular T star.
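The split between addressing model and pointer interface can be sketched as two small classes. This is an illustration under assumptions, not the design from the talk: `offset_addressing` and `syn_ptr` are hypothetical names, the single static base stands in for a per-process segment mapping, and the code needs C++17 for the inline static member.

```cpp
#include <cstddef>

// Addressing model (the "void star" layer): how the bits are stored and
// how they turn into an address. Here: a byte offset from a segment base.
class offset_addressing {
    static inline char* base_ = nullptr;  // where the segment is mapped in this process
    std::ptrdiff_t off_ = -1;             // -1 means null
public:
    static void set_base(void* b) { base_ = static_cast<char*>(b); }

    void* address() const { return off_ < 0 ? nullptr : base_ + off_; }
    void assign(const void* p) {
        off_ = p ? static_cast<const char*>(p) - base_ : -1;
    }
    void advance(std::ptrdiff_t bytes) { off_ += bytes; }
    bool equals(const offset_addressing& o) const { return off_ == o.off_; }
};

// Pointer interface (the "T star" layer): typed syntax and semantics
// layered on top of whatever addressing model is plugged in.
template <class T, class AM = offset_addressing>
class syn_ptr {
    AM am_;
public:
    syn_ptr() = default;
    syn_ptr(T* p) { am_.assign(p); }

    T& operator*() const { return *static_cast<T*>(am_.address()); }
    T* operator->() const { return static_cast<T*>(am_.address()); }
    T& operator[](std::ptrdiff_t i) const {
        return *(static_cast<T*>(am_.address()) + i);
    }
    syn_ptr& operator++() {
        am_.advance(static_cast<std::ptrdiff_t>(sizeof(T)));
        return *this;
    }
    explicit operator bool() const { return am_.address() != nullptr; }
    friend bool operator==(const syn_ptr& a, const syn_ptr& b) {
        return a.am_.equals(b.am_);
    }
    friend bool operator!=(const syn_ptr& a, const syn_ptr& b) { return !(a == b); }
};
```

Because only the offset is stored, the same `syn_ptr` value resolves correctly after `set_base` is pointed at a different mapping of the same data, which is the location independence described above.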
All right.
Okay.
Do you want to tell us a little bit more about the UTF-8 talk you're going to be giving?
Sure.
And in fact, if I remember correctly, Jason was in the audience in Aspen and saw that.
I did.
Oh, yes.
They had a bright light shining in my face, and Jason was jumping up and down in the back of the room, apparently trying to get my attention, and I couldn't see him.
So I apologize again for that, Jason.
Yeah, yeah.
I always, whenever I'm giving a talk like that, I always say, I'm going to turn down the light, and the camera guy's like, but then you won't look as good on video.
I'm like, you know what? I'd rather be able to interact with my audience.
Right.
So a couple years ago,
I was considering building my own JSON library.
I eventually decided against the wisdom of that,
but as part of that,
I realized that a conforming JSON implementation
has to be able to handle UTF-8 correctly.
And so I sort of barely knew what UTF-8 was and realized that,
hey, there's this thing where you need to take UTF-8 code units,
or sequence of UTF-8 code units, and turn them into UTF-32 code points
if you want to do something useful with it.
And so that took me down the rabbit hole of how can I do this quickly?
And I sort of looked at some of the traditional algorithms for doing it, but I also, based on work that I'd done before, i.e. the regular expressions
many years ago, I realized that this was a lexical analysis problem and that one could write a DFA
to analyze the incoming stream of UTF-8 octets, both for validity but also to turn them into a UTF-32 code point.
And so I started working on this, and after a little bit of alcohol, much profanity, and
many hours of working on pencil and paper, I derived a state machine, a DFA, which described
how to do this and conformed to all the various requirements from the Unicode
Consortium for doing such conversions. And then I found out a few weeks after I did that that
somebody else had already done it. But luckily, my results matched his, which was pleasing.
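As a rough illustration of treating UTF-8 decoding as a state machine, here is a small sketch. It is not the table-driven DFA from the talk: the state names and the `step` and `decode_utf8` functions are made up for this example, and the transitions are written as a switch for readability. The accepted byte ranges follow the Unicode Standard's table of well-formed UTF-8 byte sequences, so overlong forms, surrogates, and values above U+10FFFF are rejected.

```cpp
#include <string>

// States: Start accepts a new sequence; ContN expects N more generic
// continuation bytes; the AfterXX states restrict the second byte.
enum class State { Start, Cont1, Cont2, Cont3, AfterE0, AfterED, AfterF0, AfterF4, Error };

// One transition: consumes byte b, updates the code point accumulator cp,
// and returns the next state.
State step(State s, unsigned char b, char32_t& cp) {
    auto in = [b](unsigned lo, unsigned hi) { return b >= lo && b <= hi; };
    auto cont = [&](unsigned lo, unsigned hi, State next) {
        if (!in(lo, hi)) return State::Error;
        cp = (cp << 6) | (b & 0x3F);  // append six payload bits
        return next;
    };
    switch (s) {
    case State::Start:
        if (b <= 0x7F)      { cp = b;        return State::Start; }
        if (in(0xC2, 0xDF)) { cp = b & 0x1F; return State::Cont1; }
        if (b == 0xE0)      { cp = b & 0x0F; return State::AfterE0; }
        if (b == 0xED)      { cp = b & 0x0F; return State::AfterED; }
        if (in(0xE1, 0xEF)) { cp = b & 0x0F; return State::Cont2; }
        if (b == 0xF0)      { cp = b & 0x07; return State::AfterF0; }
        if (in(0xF1, 0xF3)) { cp = b & 0x07; return State::Cont3; }
        if (b == 0xF4)      { cp = b & 0x07; return State::AfterF4; }
        return State::Error;
    case State::Cont1:   return cont(0x80, 0xBF, State::Start);
    case State::Cont2:   return cont(0x80, 0xBF, State::Cont1);
    case State::Cont3:   return cont(0x80, 0xBF, State::Cont2);
    case State::AfterE0: return cont(0xA0, 0xBF, State::Cont1);  // no overlong 3-byte forms
    case State::AfterED: return cont(0x80, 0x9F, State::Cont1);  // no surrogates
    case State::AfterF0: return cont(0x90, 0xBF, State::Cont2);  // no overlong 4-byte forms
    case State::AfterF4: return cont(0x80, 0x8F, State::Cont2);  // nothing above U+10FFFF
    default:             return State::Error;
    }
}

// Appends decoded code points to out; returns false on malformed or
// truncated input.
bool decode_utf8(const std::string& in, std::u32string& out) {
    State s = State::Start;
    char32_t cp = 0;
    for (unsigned char b : in) {
        s = step(s, b, cp);
        if (s == State::Error) return false;
        if (s == State::Start) out.push_back(cp);  // a full code point was completed
    }
    return s == State::Start;
}
```

A production version would typically replace the switch with lookup tables indexed by byte class and state, which is closer in spirit to the machine described here.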
There's strength in numbers.
And so I had a DFA that could do the conversion, but when I tried to do DFA-only based conversion,
my speed was basically the same as the traditional algorithm, which has a switch statement in it,
or a laddered sequence of if statements.
And I was looking at it, and I looked at some sample webpages, my test cases from Wikipedia,
and I looked at them, and I realized that any time one encounters an ASCII character, which fits in a single UTF-8 octet, it's very likely to be followed by another ASCII character.
And even if you do things like download the Wikipedia page for the Hindi language written in Hindi, you know, for Hindi readers, there's an awful lot of ASCII characters in it.
So it occurred to me that if I had a fast way to zero extend the ASCII characters from 8 bits to
32 bits, I might be able to speed things up. And so I did some experimentation with that and
have sort of a hybrid algorithm which uses the DFA when it encounters a non-ASCII character.
Otherwise, when it finds an ASCII character, it tries to zero extend as many subsequent ASCII characters,
up to 16 as possible, using some SSE primitives.
And they turn out to be quite fast.
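The ASCII fast path can be sketched with SSE2 intrinsics roughly as follows. This is a simplified stand-in, not the code from the talk: `widen_ascii_run` is a hypothetical name, and when a 16-byte chunk contains any non-ASCII byte this version simply drops to a byte-at-a-time loop instead of converting the leading ASCII part of that chunk with vector instructions. On targets without SSE2 only the scalar loop is compiled.

```cpp
#include <cstddef>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define WIDEN_HAVE_SSE2 1
#endif

static_assert(sizeof(char32_t) == 4, "sketch assumes a 32-bit char32_t");

// Zero-extends the leading run of ASCII bytes in src[0..n) into dst and
// returns how many bytes were converted. The caller would hand the first
// non-ASCII byte to the multi-byte decoder and then call this again.
std::size_t widen_ascii_run(const unsigned char* src, std::size_t n, char32_t* dst) {
    std::size_t i = 0;
#ifdef WIDEN_HAVE_SSE2
    const __m128i zero = _mm_setzero_si128();
    while (n - i >= 16) {
        const __m128i chunk =
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i));
        // movemask collects the high bit of each byte; nonzero means the
        // chunk contains at least one non-ASCII byte.
        if (_mm_movemask_epi8(chunk) != 0) break;
        // Interleave with zeros twice: 8-bit -> 16-bit -> 32-bit.
        const __m128i lo16 = _mm_unpacklo_epi8(chunk, zero);
        const __m128i hi16 = _mm_unpackhi_epi8(chunk, zero);
        __m128i* out = reinterpret_cast<__m128i*>(dst + i);
        _mm_storeu_si128(out + 0, _mm_unpacklo_epi16(lo16, zero));
        _mm_storeu_si128(out + 1, _mm_unpackhi_epi16(lo16, zero));
        _mm_storeu_si128(out + 2, _mm_unpacklo_epi16(hi16, zero));
        _mm_storeu_si128(out + 3, _mm_unpackhi_epi16(hi16, zero));
        i += 16;
    }
#endif
    // Scalar tail: finish the run one byte at a time.
    while (i < n && src[i] < 0x80) {
        dst[i] = src[i];
        ++i;
    }
    return i;
}
```

The payoff depends on the observation made above: ASCII bytes tend to arrive in runs, so most 16-byte chunks in typical web text pass the mask test and are widened without touching the state machine.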
For simple cases of mostly ASCII, you can see a 10 times performance speed up over traditional libraries.
For more difficult cases where you have lots of Chinese or Russian or 3-byte characters,
you can do, sprinkled in with some ASCII, you can maybe do twice as fast.
And then I also derived a couple of torture tests in cases where I knew that my algorithm would not perform well
just to see how it compared to the other algorithms.
And most of the time it held up pretty well.
I also did some cross-conversion from UTF-8 to UTF-16.
And I was trying to compete against Microsoft's Win32 API function for doing such conversions,
and I'm proud to say that my conversions compare favorably to theirs.
Yeah, theirs are core to Windows and highly optimized, right?
Yes. So in most of my test cases, I actually beat their conversion by a small amount.
Yeah, that's pretty good for as many years as they've invested in that.
Yes.
So.
I know you're part of the standards committee now.
Hasn't there been some talk about UTF-8 standardization recently?
Well, there's not so much UTF-8, but I know Tom Honermann has made a proposal to add a
new fundamental type called char8_t, which would be the analog of char16_t and char32_t,
specifically to support Unicode literals
and also something like a u8string,
like we have u16string and u32string now,
which are type aliases.
You know, the problem with char is I don't think it's specified whether char is signed or unsigned.
And to do, I think the standard leaves it unspecified.
And char8_t is an unsigned type,
which mirrors exactly what you need to do the unsigned operations on UTF-8 octets.
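A short sketch of why the signedness matters: whether plain char is signed is implementation-defined, so range tests on UTF-8 octets are only portable after a conversion to an unsigned type. The helper names below are hypothetical, and unsigned char stands in for the proposed unsigned code unit type.

```cpp
#include <limits>

// True on platforms where plain char is signed; there, a byte such as
// 0xC3 stored in a char is negative and `c >= 0x80` would never hold.
constexpr bool plain_char_is_signed = std::numeric_limits<char>::is_signed;

// Portable tests: convert to unsigned char before comparing or masking.
inline bool is_ascii_byte(char c) {
    return static_cast<unsigned char>(c) < 0x80;
}

inline bool is_continuation_byte(char c) {
    return (static_cast<unsigned char>(c) & 0xC0) == 0x80;
}
```

With a dedicated unsigned code unit type the casts disappear, since the comparisons are already carried out on unsigned values.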
Okay.
I know that Tom actually has his own implementation of various Unicode utilities.
Zach Laine is also very active.
He's the author of Boost.Text, which has some very extensive research and work that Zach has done
on creating a better string type,
which is compatible with Unicode up and down the line.
And also I should mention Tom runs the SG16 study group,
which is a newly formed study group looking at Unicode support for both language and library.
Oh, I was not aware of that study group.
Yes.
Well, on the topic of your involvement and knowledge
with the Standardization Committee,
I promised I would get back to this.
You said when we first asked you about your involvement
that you focus on the things that would make the language easier
for the little guy, basically.
I'm kind of curious what things so far you have focused on and what you would like to see get moved forward.
Well, so I've been mostly an observer member up to now.
I co-authored a paper with Arthur O'Dwyer.
Actually, Arthur did all the hard work.
I just sort of rode his coattails in on the paper.
It's nice when you can just sign your name to a paper that someone else has written.
Yeah, I sort of feel guilty about that because Arthur really did do quite a lot of work.
He's very smart, and I was very fortunate that he let me be associated with him.
Having to do with fancy pointers, actually.
But that particular topic had to do more with fixing defects or
perceived defects in the existing standard with regard to support for fancy pointers.
But, you know, going forward, to answer your question, Jason, you know, I have various ideas
about how one could make the standard library and the language facilities easier for the little guy.
I gave a talk in Aspen, at C++Now, last spring, called If I Had My Druthers,
and proposals for changing containers in the standard library,
in which I proposed that we could add several new containers,
and we could structure them in such a way that there's an interface for what I call the casual user that allows them to be productive right out of the box. They don't
have to worry about things like allocators. And the subtleties associated with allocators
are completely erased from the picture. There's no allocator as part of the template interface
of those classes. But as part of the layers in getting to that point, there are layers for expert and guru-style
users to get more performance and more customizability out of it. So I think that's one
part. I would like to see, although I don't myself have any good ideas for it, but I would like to
see a better string class someday that's easier to use.
When you talk about the language,
one of my pet peeves has always been the conflation of interface and implementation.
And I think that the specification of implementation can be unnecessarily verbose.
I'm the kind of guy who I have this religious, unshakable,
zealous practice of separating my member function implementations from my class definitions.
When I look at a header, I just want to see the member functions in the class.
I don't want to see the member functions defined in the class because this is not Java.
So I always define my member functions outside the class. And I find, even, I'm sorry, so even for simple things, only the very most trivial things
do I define member functions inside the class definition itself.
I want to know if you also go through what would be necessary to do that for templates as well.
I do.
Okay.
If you read my code, my class definitions only contain declarations.
Doesn't that make your member function definitions for template classes significantly longer, because you have to specify the template type for the class
in that case when you wouldn't have to otherwise?
You do. There's a whole lot of verbiage that has to come along for the ride.
Okay. So for example, one of the ideas I've had in the past is slightly extending the syntax of the language by adding a new keyword or perhaps
reusing an existing keyword so that you could define the member functions for a class or class template
outside of the definition of the class template itself
the same way that you would define them inside the class template.
So I've been toying with the idea of reusing the word namespace.
So away from your class definition,
at some subsequent point in the translation unit,
you could do something like template, angles, class, T, angle, namespace,
open bracket or open brace,
and then write the definitions of your member function inside that
without all of the extra verbiage,
and then close the brace and you're done, just as if you had defined them inside the class
definition itself. That would save an awful lot of electrons from being harvested and would make code
a lot easier to read. You know, it's a time saver and it's a readability thing. It helps
readability in separating interface from implementation.
It helps readability in reading the implementation itself.
I don't know if anybody's going to go for that idea.
When I mention it to people, about 50% of the time they look at me like I'm crazy.
So I don't know if it will go anywhere.
But that's the kind of thing that I'm thinking of.
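For reference, this is roughly what the extra verbiage looks like in today's C++ when a class template's members are defined outside the class. The `stack_of` class is a made-up example. The block syntax from the discussion appears only in a trailing comment, since it is not valid C++ and its exact spelling, including where the class name would go, is a guess.

```cpp
#include <cstddef>
#include <vector>

// Interface: declarations only.
template <class T>
class stack_of {
public:
    using size_type = std::size_t;

    void push(const T& value);
    T pop();  // precondition: the stack is not empty
    size_type size() const noexcept;

private:
    std::vector<T> items_;
};

// Implementation: every definition repeats the template header and the
// fully qualified class name.
template <class T>
void stack_of<T>::push(const T& value) {
    items_.push_back(value);
}

template <class T>
T stack_of<T>::pop() {
    T top = items_.back();
    items_.pop_back();
    return top;
}

// A nested type in the return type needs `typename` and qualification
// (before C++20 at least).
template <class T>
typename stack_of<T>::size_type stack_of<T>::size() const noexcept {
    return items_.size();
}

// Hypothetical shape of the idea described above, NOT valid C++:
//
//   template <class T> namespace stack_of<T> {
//       void push(const T& value) { items_.push_back(value); }
//       size_type size() const noexcept { return items_.size(); }
//   }
```

The repetition grows with each additional template parameter, which is the readability cost being weighed against keeping declarations and definitions apart.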
I'm only looking at you like you're 50% crazy, not.
I see where you're going with it, and I could definitely see
if there was a way to do it in a way that
would make the committee happy, I would
not argue against it, personally.
So, yeah.
So, ideas like that. I mean, it doesn't
add anything new to the language. There's no
semantics. It's just slightly new additional
syntax that doesn't break anything existing
that would make the language easier to use.
Okay.
Maybe that's a good segue since you're talking about interface implementation
to give a plug for your pre-conference class on modern interface design.
Yes, thank you.
So I'm giving a two-day pre-conference class called Interface Design for Modern C++.
And it's a class which I intend to be geared towards C++ programmers who think of themselves as being advanced beginners or maybe intermediate level who are, you know, relatively new to the language perhaps
and are thinking about writing their own API or their own libraries. And, you know, my skill in
doing this, whatever skill I have, has been built up over 26 painful years. And so I'm hoping to
impart some wisdom based on my experience, but also on what I think are commonly established best practices for defining interfaces.
And also trying to have my students, the ones who show up, gain some understanding and appreciation for everything that's involved in creating, you know, say a library, right?
There's
more to it than just writing code. There's a lot of thinking besides thinking about code that has
to go on. And so I hope that my students will come away sort of with a checklist of two checklists,
a checklist of guiding principles for what are the questions I should be asking myself and the
things that I should be doing as I am composing my interface
that I want to give to the world.
And the second checklist is, here's a list of things and actions
that I should be doing in order to actually release my interface to the world.
So, you know, that's the nickel tour of what the class, I hope the class will be.
Sounds good.
Yeah, sounds great.
Thanks.
Well, is there anything else you want to go over, Bob, before we let you go?
No, you guys have, I think we've done a pretty thorough job here.
Again, thank you very much for having me on the show.
This has really been a lot of fun.
Certainly.
Yeah, it's been great having you on today.
Yes.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in,
or if you have a suggestion for a topic, we'd love to hear about that too.
You can email all your thoughts to feedback@cppcast.com.
We'd also appreciate it if you can like CppCast on Facebook and follow CppCast on Twitter. You can
also follow me @robwirving and Jason @lefticus on Twitter. We'd also like to thank all our
patrons who help support the show through Patreon. If you'd like to support us on Patreon, you can do
so at patreon.com/cppcast. And of course, you can find all that info and the show notes on the
podcast website at cppcast.com.
Theme music for this episode was provided by podcastthemes.com.