CppCast - CppCon Poster Program and Interface Design

Starting point is 00:00:00 Episode 166 of CppCast with guest Bob Steagall, recorded September 5th, 2018. dot IO slash CPP cast. In this episode, we discuss Scott Meyers and function poisoning. Then we talk to Bob Steagall. Bob talks to us about his history with C++, the CppCon poster program, and his upcoming talks. Welcome to episode 166 of CppCast, the first podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how are you doing today?

Starting point is 00:01:25 I'm doing all right. How are you doing? I'm doing pretty well. Don't have too much to share, I don't think. You? Is it all right if I mention that you have a dishwasher? I don't think we've mentioned it before on the show, but yeah. We haven't. No, but for some reason it took me about two months to get a new dishwasher, but it's finally there, and I'm happy about that.

Starting point is 00:01:47 If you don't mind, I'm going to go ahead and do this. For like the past eight weeks, whenever we would record, Rob would be like, yeah, I'm at home waiting for the dishwasher guy again. Yeah, pretty much. That saga is over now, though. So, yeah. Well, at the top of our episode, I'd like to read a piece of feedback. Last week we were talking about formal verification, and we got this comment on Reddit from fsdmcpp, and he says: Dear hosts, thanks for another interesting episode. I'm not involved with formal verification professionally. However, as an amateur, I'd like to point out a

Starting point is 00:02:23 few relevant resources. I'm not going to read the whole post because it's a bit long. But he points to a tool to enrich C code with annotations to formally prove its properties, which is Frama-C. And I'll put a link to this in the show notes. And he also has a tutorial that shows a number of applications of Frama-C. And he also pointed to a C++ plugin for Frama-C, which looks like it's a Clang-based experimental plugin. And then he also talked about C++ contracts discussion going along with Ada SPARK,

Starting point is 00:02:59 which I believe is another programming language I'm not too familiar with. Right, Jason? I know Ada. I don't know SPARK. I mean, I don't know Ada. I know very few people who have ever programmed in Ada, but yes. Yeah, one thing he points out is this Frama-C formal verification tool that he mentioned can work with both C and Ada SPARK,

Starting point is 00:03:22 and you could actually use it to verify both in a mixed code base, which sounds kind of interesting. But yeah, a couple links to some interesting-sounding resources related to formal verification with C and C++, and I will put those in the show notes. And yeah, thanks for listening. Yeah, and you know, thinking about last week's episode, C++ is notoriously difficult to parse,

Starting point is 00:03:45 and this is definitely one of these things where the Clang tooling is really allowing people to do more analysis of the code in a much easier way than was possible before. Yeah, absolutely. Well, we'd love to hear your thoughts about the show as well. You can always reach out to us on Facebook, Twitter,

Starting point is 00:04:01 or email us at feedback at cppcast.com, and don't forget to leave us a review on iTunes. So joining us today is Bob Steagall. Bob is a principal engineer with GliaCell Technologies. He's been working almost exclusively in C++ since discovering the second edition of The C++ Programming Language in a college bookstore in 1992. The majority of his career was spent in medical imaging, where he led teams building applications for functional MRI and CT-based cardiac visualization.

Starting point is 00:04:28 After a brief detour through the worlds of DNS and analytics, he's now working in the area of distributed stream processing. Bob is a relatively new member of the C++ Standardization Committee and launched a blog earlier this year to write about C++ and topics related to software engineering. He holds BS and MS degrees in physics, is an avid cyclist, and lives in fear of his wife's cats. Bob, welcome to the show. Hi guys, thank you very much for having me. I'm very excited to be here. It seems like a shame to live in fear of the animals in your house. Well, I am the only male in a house populated with females. I'm married, I have two daughters, 21 and 25. We currently have three female cats and a tank full of female fish.

Starting point is 00:05:15 So I've learned to be very careful in what I say and the expressions I put on my face. That's funny. Wow. But you get along with your cats, right, Rob? I have two kittens that are great, although my previous two cats who have both passed away, one of them was very nice and the other one was very attached to my wife and didn't want me to be near her. So I understand what you're talking about, Bob. Yes, we have a new one who's about a year old who is raising hell and giving problems to the other two who are sisters who are about seven. And only one of them, I think, likes me sometimes.

Starting point is 00:05:52 So it's a razor's edge that I walk. One of the dogs that my wife had when we met, if my wife was ever out of town, the dog would just sit there in the living room and stare at me. Like, what did you do to her? Yeah. Okay. Well, Bob, we've got a couple of news articles to discuss.

Starting point is 00:06:17 Feel free to comment on any of these, and then we'll start talking more about all your work with C++, okay? Yes. Great. Okay. So the first one is a post from Scott Meyers, and it's a little bit of a sad post. But we talked to him, I guess it was about two years ago, when he was basically stepping down from publishing any more books or doing any kind of active community work. And he's now saying that he no longer plans to update his books to fix technical errors. And it's basically because the language is continuing to change and he hasn't been paying close attention to it.

Starting point is 00:06:56 And people have been writing him about possible errors with some of his code in the books. And he's no longer able to say, you know, definitively whether or not he believes it's an error. So he's just going to stop putting out these sorts of updates, which is unfortunate but understandable. Yeah, inevitable, I guess. At some point you can't indefinitely keep maintaining your old books and trying to stay up to date after you've retired. And what was the most recent book? I mean, how many years ago was the most recent one? Oh, Effective Modern C++ was... Well, I just looked.

Starting point is 00:07:30 We interviewed him almost exactly three years ago today. Oh, it was three years ago. Okay. So Effective Modern C++ was already out when we interviewed him. Yeah. Yeah, I think the first year of publication was 2014 for that, and there have been a few revised printings since then. I have a copy of the first printing and then a copy of a later printing. I can certainly sympathize with Scott's point of view.

Starting point is 00:07:58 It's very difficult to keep up with the changes that are coming in the language. And, you know, as a new committee member, trying to keep up with all of the mailings on the email reflectors is virtually impossible. I suppose if that was my full-time job, I could do it. But unfortunately, I have a job where I can't spend a lot of time at work doing non-work things. And it's very difficult to keep up with the evolution of the language. It's very exciting, especially given, you know, sort of the slow change in the language for maybe the first half of its life there.

Starting point is 00:08:35 But it's very tough to keep up these days, and I can certainly understand where he's coming from. So on that topic, and this is perhaps jumping a little bit ahead to the interview portion, but I'm going to go ahead and go for it. I have a mental filter. I don't really pay attention to anything unless it has been approved by the committee. Now, if you're on the committee, obviously you can't do that. Sometimes I'll have people ask me, oh, like, well, what do you think about X and Y? And I'm like, honestly, I'm not paying attention at all. I'm just trying to stay up to date with the things that have been approved. Yes. How do you deal with that? Because like you said, it's got to be just this tidal wave of information. It is very much like trying to drink from a fire hose. And I can tell you, you know, going to a committee meeting, and I may be dating myself here, but if you've ever seen the movie Amadeus, when I go to a committee meeting, I feel like Salieri in a room full of Mozarts. I'm surrounded by brilliant people who are doing their best to try and make the language and the library better in a number of ways. And I find myself basically filtering out things that don't really

Starting point is 00:09:50 interest me. And it's taken me, well, a year now, basically a year of meetings to construct that filter of the things that interest me. And what I'm really interested in in terms of the evolution of the language is finding ways, not necessarily by adding new features to the language, but perhaps by suggesting changes in the library or in the language itself that make the language easier to use and to remember and to understand. You know, it's a tremendously difficult language to learn for beginners. It's tremendously difficult for me to keep up with it. And I think that there are some simplifications that could be applied to the language and the library without necessarily adding new conceptual features, new semantics, or new capabilities

Starting point is 00:10:42 that could make code simpler and easier to read. And so those are kind of my interest in participating in the committee. You know, I don't have the same experience as some of the guys who are doing things like adding support for concepts or the metaprogramming like Louis does or, you know, contracts or those sorts of things. I sort of think of myself as being a proxy or interested in supporting the little guy, so to speak, who's got to do a job every day. And how can I help that person do their job more effectively? Okay. I will definitely ask more questions about that after we get through the news.

Starting point is 00:11:20 Okay. Well, yeah, I definitely understand Scott's viewpoint. Again, it'll be sad to see him no longer making these sorts of updates, but it's completely understandable. And I just thought I'd maybe put out another plug, in case anyone is interested, for the CppCon, I think it's a pre-conference class, that he's going to be running along with Kate Gregory and, oh, who's the other one with him? I'm not sure. Andrei Alexandrescu. Andrei, there we go. It's on giving technical presentations. Like we've said, these classes are a who's who of guests of CppCast. Yes. Okay. And the next one we have is a post from the Visual C++ blog. We talked a few times in the past about Boost.Hana being used from Visual Studio and how they had a bunch of workarounds in order to compile Boost.Hana,

Starting point is 00:12:19 and they have managed to basically fix all the compiler bugs that were requiring them to put in all these workarounds, and they're down to just three. So with the latest MSVC 2017 Update 8 compiler, they now only have three workarounds that you would have to use in order to use Boost.Hana. And I believe those three workarounds are now in the official master branch, I believe. Yeah, I think that's right. Like, you can actually use

Starting point is 00:12:53 Boost.Hana with Visual Studio 2017 update as opposed to using, like, their fork of it. Right, right. I need to try ChaiScript again with the latest update. Last one I had, I was getting incorrect behavior with class template deduction guides. But that's been a couple of releases, and I know they're making huge headway on these things. But some of these workarounds that are still in place, I don't even understand the words in it. Boost Hana workaround MSVC.

Starting point is 00:13:23 Okay, fine. Multiplector? What the heck is a multiplector? Is that a word? Well, it says multiple copy move constructors. Oh, okay. That would make sense. Well, I can say that I'm in awe of Louis's metaprogramming abilities and I've not yet had an opportunity to actually use Hana in anything work-related or actually even in any of my personal projects. But I have some small appreciation of all of the coin, I'm really excited and been very excited on Microsoft's behalf over the last few years at the progress that they've made in such a short time in coming close to standard conformance.

Starting point is 00:14:13 I remember buying a copy of Microsoft C7 in 1991 or 1992 and then later buying Visual C++ 4.2 and Visual C++ 5 and Visual C++ 6 through the 90s. And comparing those to compilers like GCC on Solaris at the time or Borland C++, there was always the issue of things that Microsoft's compilers did not support because they were busy, I think, trying to force their will onto the market. And that sort of continued for a few more years. But I was at C++ and Beyond in 2012 in Asheville, North Carolina, and someone asked Herb Sutter about, you know, what is Microsoft

Starting point is 00:15:05 going to be doing in terms of standard conformance at that time? And Herb unequivocally said, we are dedicated to standard conformance, and you're going to see that happen. And lo and behold, over the last six years since then, it really has happened, and I'm very pleased to see it. You know, every time a new version of Microsoft Visual Studio came out, I would go and buy the Intel compiler because I wanted that standards conformance. But I haven't actually purchased the Intel compiler now for a couple of years because Microsoft has improved so much. So I'm very happy for Microsoft,

Starting point is 00:15:38 and I'm very excited that they've really taken on this challenge and done so well. I think it says something about the complexity of C++, but at the same time, any old mature language is complex. That it took six years of specific effort with the might of Microsoft for them to get to this point. Yes. But to be fair, other popular languages don't even have multiple implementations.

Starting point is 00:16:07 And those that do are either significantly simpler languages or not nearly as old, many of them. Right. I can't imagine. But does Perl 6 have multiple implementations? That's a beast of a language. No idea. Yeah. Did either one of you follow the development of Perl 6? No. No. It was a thing. It took many years. There's a chart, a long time ago I downloaded a PDF chart of, like, all the operators that it has, and it's something like

Starting point is 00:16:41 30 different operators. I might be exaggerating, but it's crazy. Anyhow. Okay. Last article we have here is a post from Fluent C++, Jonathan Boccara's blog. Well, I believe it's actually a guest post on the blog. And this is function poisoning in C++. And this is something I was not aware of, but apparently GCC gives you this poison pragma if you want to basically prevent the usage of some identifier in your code. And the author of this post basically goes over some legitimate use cases of that. Basically, one thing he highlights is if you wanted to put some wrapper around a C allocation function so that you can instead wrap it with something returning a smart pointer,

Starting point is 00:17:34 you can then prevent someone from using the C version allocator to just make your code a little bit safer. And I thought that was a good idea. It's an interesting concept. I don't tend to be a fan of these compiler-specific extensions, but to be fair, almost all the GCC pragmas are supported by Clang, so that's two compilers anyhow. Right, and I believe he said that he didn't see any way of doing something similar to this on, like, MSVC. Right. So it's definitely GCC and I guess Clang only. But interesting idea.
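
As a rough illustration of the poisoning idea discussed above, here is a minimal sketch. The `widget` type and the `widget_create`, `widget_destroy`, and `make_widget` names are hypothetical stand-ins for a C-style API and its wrapper; they are not taken from the blog post. The only real feature shown is the `#pragma GCC poison` directive, which GCC and Clang accept.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style API, standing in for a real C library.
struct widget { int id; };
inline widget* widget_create(int id) { return new widget{id}; }
inline void widget_destroy(widget* w) { delete w; }

// Safe wrapper. It has to be defined BEFORE the pragma below, because it
// still needs to name the raw functions.
struct WidgetDeleter {
    void operator()(widget* w) const noexcept { widget_destroy(w); }
};
using WidgetPtr = std::unique_ptr<widget, WidgetDeleter>;

inline WidgetPtr make_widget(int id) {
    assert(id >= 0 && "this sketch assumes non-negative ids");
    return WidgetPtr(widget_create(id));
}

// From this point on, any use of the poisoned identifiers is a hard error.
#if defined(__GNUC__)  // defined by both GCC and Clang
#pragma GCC poison widget_create widget_destroy
#endif

// widget* w = widget_create(1);  // would no longer compile on GCC or Clang
```

Code that comes after the pragma can only reach the C API through `make_widget`, which is the effect the hosts describe. On other compilers the guard simply skips the pragma and nothing is enforced.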

Starting point is 00:18:08 Yeah, it seems interesting. It could be a useful technique if you have people that are working on a project that may be less experienced and more likely to fall into one of the traps that you're attempting to prevent here. In general, though, I'm a little bit leery of techniques like this. In my mind, at least, it's somewhat analogous to having a member function

Starting point is 00:18:32 that's declared public in a base class and then redeclaring it as protected or private in a derived class where you're hiding public interface. And it's not a perfect analogy, but all of the APIs and stuff from the C library and the standard C++ library, they are sort of the public interface of our language. And I'm a little bit leery of hiding parts of that interface,

Starting point is 00:18:58 unless there's a really compelling reason to do so. That's an interesting viewpoint. Yeah. There were a couple of comments in the Reddit discussion about this article also, that they might've been able to accomplish something similar by deleting the versions of the functions they don't want to use in the public interface. But I don't think you could get the level of implementation that they wanted here. Or marking them deprecated.

Starting point is 00:19:21 That was one of the comments on Reddit also. Right. Okay, so Bob, could we start off by you just giving us a little bit more of your history and background with C++? Sure. I was working as a research assistant, my first sort of real job after graduate school in the early 90s at the University Hospitals in Cleveland, Ohio. And I worked for a physics professor, the guy who was actually my advisor for my master's degree. And he was associated with Case Western Reserve University, and he did research in MRI. And I was working on a project where we were doing some very early visualization of cardiac images from MRI,

Starting point is 00:20:05 which was, it's a notoriously difficult problem because people's hearts don't hold still when you're trying to image them. And scanners at the time, you know, were just not quite sophisticated enough to acquire data at high enough speed so that motion artifacts were not a problem. But we were doing research, and I was writing some code in conjunction with Siemens Corporate Research in Princeton, New Jersey, to do some visualization of the images that we acquired. And I was doing it all in C.

Starting point is 00:20:35 I had taught myself C a few years earlier to do my master's thesis. And of course, you know, I had the concept of viewports and images and all of these things that I was allocating off the heap, and it just became more and more complex, more and more difficult for me to keep it in the limited capacity of my brain. And I was in the university bookstore one day goofing off, and I saw this book called the C++ Programming Language. And I looked at it, and I thought, oh, this is interesting. So I purchased it, and I took it home. And my background at that time, all of my experience was with C and Fortran and Pascal,

Starting point is 00:21:11 very heavily procedural languages. And I read it, and I got through the first couple chapters, and at the time, I just didn't grok this inside-out syntax of object.operation. And I thought, this is madness. This is no use to me. And I put it up on the shelf, my bookshelf, where it sat for a few months. And finally, the pain at work managing all my memory allocations got bad enough that I pulled the book off the shelf again. And also, I'd recently heard about this thing called templates.

Starting point is 00:21:44 And it sounded like a solution to a problem I was having. So I got the book and started reading it and suddenly realized, hey, there's this destructor thing, which would really help me with my memory leaks. And oh, by the way, here's this template thing that would help me manage different lists of things. And in my code, I was using lots of doubly linked lists of different objects. And I just sort of fell in love with the language. This was really just before there were web browsers and HTTP was a thing. And a friend of mine who was really into the nascent Internet at that time turned me on to Usenet, and I started reading comp.lang.c++ and comp.std.c++.
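
The two features Bob credits here, destructors for cleaning up heap allocations and templates for reusing one container across element types, can be sketched in a few lines. The `Buffer` class below is a hypothetical illustration, not code from his project, and it is far simpler than the doubly linked lists he mentions.

```cpp
#include <cassert>
#include <cstddef>

// A heap-allocated array whose destructor releases the memory, so callers
// cannot forget to free it. The template parameter lets one definition
// serve pixels, viewports, or any other default-constructible type.
template <typename T>
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new T[n]()), size_(n) {}
    ~Buffer() { delete[] data_; }

    // Copying is disabled to keep ownership of the allocation unambiguous.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    T& operator[](std::size_t i) {
        assert(i < size_ && "index out of range");
        return data_[i];
    }
    std::size_t size() const { return size_; }

private:
    T* data_;
    std::size_t size_;
};
```

A `Buffer<int> pixels(4);` declared inside a function is released automatically when the function returns, on every exit path, which is the relief from manual bookkeeping that Bob describes.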

Starting point is 00:22:31 And a couple years later, I heard about this thing called the STL, which was really kind of exciting. And everybody was talking about it, and I downloaded it. I FTPed it off of some FTP server at HP Labs and was able to sort of make it work with the primitive GCC that I was using at the time, which was kind of cool. And I was young and ambitious and not quite wise to the ways of the world and thought, hey, there could be a market for something like this. And so I left my job at UH and started my own company. And I did some consulting on the side. I did network installations where I crawled around in attics and stuff, installing RGB, RG58 thin wire and thick wire Ethernets to support myself.

Starting point is 00:23:18 But from about 94 through 99, I worked on implementing and I sold my own version of the standard C++ library. And I actually started selling it a few months before the C++98 standard was announced. And it was not perfectly conformant to the standard by any means. There were things at the time I thought were pretty silly that I did not include, and I did things my own way. And it was a great learning experience. You know, I wrote several hundred pages of documentation, which I published and sold to my users on paper, of all the crazy things. I didn't sell a whole lot of copies.

Starting point is 00:23:57 If I did, I'd be laying on a beach somewhere earning 20%. But it was a great learning experience. And from there, my wife finished her postdoc and got a job at the National Institutes of Health here in Maryland. And we moved to Maryland. A buddy of mine got me into a company where he was working doing medical imaging, which I had background in. And before long, I was running a small company, the company that does functional MRI. I was there for about 11 years. We had a product that we sold to GE

Starting point is 00:24:25 Medical Systems that was quite successful. And that product was entirely C++ based. I did a lot of development on that the last eight or nine years while I was there. I wrote a big metaprogramming library, a linear algebra library, atomic messaging library. So I was sort of the chief cook and bottle washer at the company there because there were about 8 to 12 of us. And one minute I'm designing things and arguing with my team about things on a whiteboard, and the next minute I'm ordering soap replacements and paper towels for the bathroom and that kind of stuff.

Starting point is 00:25:01 So it sounds like you actually did make a living for a time from selling your own implementation of the standard library. That's true, but I'm not sure I would even call it a living. Okay. Well, you at least made it sound like you made a living, anyhow. I made a few dollars at it. Most of the money I was making at that time was actually coming from the network installations I was doing, because I was willing to crawl around in dusty attics and places most smart people would not go in order to install cables. Just thinking about how much the world has changed. I mean, if you said today you want to make and sell your own standard library implementation, that would sound kind of insane.

Starting point is 00:25:42 That would be crazy talk. Yes. So that's pretty cool. It was a good experience, and I actually developed things which were not part of the standard, but eventually would become part of the standard. And I don't claim any responsibility for them becoming part of the standard, because people were doing sort of the same thing in Boost at the time. And I was not involved with the committee in any way at that time. But when I released my product in the spring of 97, I had a complete regular expression library that integrated with the strings, that supported Perl syntax of regular expressions, except that it only did greedy matching.

Starting point is 00:26:23 It did not do non-greedy matching. I had an extensive library of functions for threads. I had a class that I called callback, which was analogous to function today. I had another type that I called initiators, which was really an implementation of the command pattern, which was a callback coupled with arguments. And you could use that with the thread class to run a command in a distinct thread. Okay. And then I had the basic synchronization primitives, atomic integers,

Starting point is 00:26:56 and semaphores, mutexes, condition variables. So it was 20 years ahead of its time. I don't know about that, maybe 10 years. What I really want to know, though, is what parts did you leave out because you thought they were silly, if you recall what that was? Oh, I can tell you exactly what I left out. Okay. I left out valarray and everything associated with valarray because, you know, what's the point? Okay. I've never personally met or heard of anybody who uses valarray, although there must be somebody.
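
The callback and initiator types Bob describes a little earlier map fairly directly onto modern facilities. The sketch below is an assumption about their general shape, not a reconstruction of his library: `Callback`, `Initiator`, `make_initiator`, and `run` are illustrative names, and `std::function` stands in for his callback class.

```cpp
#include <cassert>
#include <functional>

// "Callback": a type-erased callable, the role std::function plays today.
template <typename Signature>
using Callback = std::function<Signature>;

// "Initiator": a callback bundled with its arguments, i.e. a command
// object that can be invoked later with no further input.
using Initiator = Callback<void()>;

template <typename F, typename... Args>
Initiator make_initiator(F f, Args... args) {
    // The callable and its arguments are copied into the closure.
    return [f, args...]() mutable { f(args...); };
}

// Executes a command on the calling thread. A thread class could accept
// the same Initiator and run it on a separate thread instead.
inline void run(const Initiator& cmd) {
    assert(cmd && "initiator must not be empty");
    cmd();
}
```

For example, `make_initiator(add, 2, 3)` packages a two-argument callback `add` with its arguments, and `run` later performs the call. Passing the same object to `std::thread` would give the "run a command in a distinct thread" behavior mentioned in the interview.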

Starting point is 00:27:28 Yeah, I've never used it, I don't think. But I think the worst and most painful part of it was implementing the iostreams. I expected you to say you left off iostreams, honestly, because... No, I had a full implementation of iostreams, plus all the required facets. But that was painful, and I didn't like it, and I have avoided the iostreams ever since at all costs. When I do character-based I/O, you know, I either use the OS functions for doing I/O, or I use the C functions. I really try to avoid the iostreams. You know, I must say, it's slightly off topic, but I have recently started using fmtlib, libfmt, however they want it to be called.

Starting point is 00:28:11 And it is absolutely amazing. It is what I'm using now. Yeah, it's a great library. And, you know, to combine the syntax for specifying output that you get from the old C streams, which have got to be almost 40 years old or more now, with the type safety of C++ just seems like a big win. And it avoids all the unnecessary verbosity and complexity of, you know, iostream manipulators to get things to format the way you want. And it's always a battle to make things work the way you want to with iostreams.

Starting point is 00:28:48 And I really just prefer the simplicity of printf. I guess I'm old-fashioned. Well, if you can get the simplicity of it with the type safety and performance of modern C++ design... It's a win. It's a pretty cool project, yeah.
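
To make the "printf-style format string plus type safety" idea concrete, here is a toy formatter. It is emphatically not the fmt library's API or implementation: `tiny_format` and `format_into` are hypothetical names, only a bare `{}` placeholder is supported, and it leans on `std::ostringstream` purely for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Base case: no arguments left, so copy the rest of the format string.
inline void format_into(std::ostringstream& out, const std::string& fmt,
                        std::size_t pos) {
    assert(fmt.find("{}", pos) == std::string::npos &&
           "more placeholders than arguments");
    out << fmt.substr(pos);
}

// Recursive case: emit text up to the next "{}", then the next argument.
// Each argument keeps its static type, so nothing like a mismatched
// printf conversion specifier can occur.
template <typename T, typename... Rest>
void format_into(std::ostringstream& out, const std::string& fmt,
                 std::size_t pos, const T& value, const Rest&... rest) {
    const std::size_t brace = fmt.find("{}", pos);
    if (brace == std::string::npos) {  // surplus arguments are ignored
        out << fmt.substr(pos);
        return;
    }
    out << fmt.substr(pos, brace - pos) << value;
    format_into(out, fmt, brace + 2, rest...);
}

template <typename... Args>
std::string tiny_format(const std::string& fmt, const Args&... args) {
    std::ostringstream out;
    format_into(out, fmt, 0, args...);
    return out.str();
}
```

A call such as `tiny_format("{} scored {} points", name, 42)` reads like a printf format string, yet the compiler knows the type of every argument, which is the combination Bob and Jason are praising.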

Starting point is 00:29:03 Yeah. I wanted to interrupt this discussion for just a moment to bring you a word from our sponsors. Backtrace is a debugging platform that improves software quality, reliability, and support by bringing deep introspection and automation throughout the software error lifecycle.

Starting point is 00:29:18 Spend less time debugging and reduce your mean time to resolution by using the first and only platform to combine symbolic debugging, error aggregation, and state analysis. At the time of error, Backtrace jumps into action, capturing detailed dumps of application and environmental state. Backtrace then performs automated analysis on process memory and executable code to classify errors and highlight important signals such as heap corruption, malware, and much more. This data is aggregated and archived in a centralized object store, providing your team

Starting point is 00:29:46 a single system to investigate errors across your environments. Join industry leaders like Fastly, Message Systems, and AppNexus that use Backtrace to modernize their debugging infrastructure. It's free to try, minutes to set up, fully featured with no commitment necessary. Check them out at backtrace.io slash cppcast. So, Bob, we're now about two, two and a half weeks away from CppCon, and you're going to be running the poster program, or you have been running the poster program. Is that right?

Starting point is 00:30:16 Yes, that's right. This is my second time doing this. The program was first started in 2016 by Hanya Bastani. And I think, I've never talked to her about it, but I think what she was doing was trying to, in some sense, because the talks at CppCon are peer-reviewed, she was trying to emulate what scientific conferences do, which is they also have poster sessions. And typically, researchers or graduate students or students, you know, they create posters to describe their research. And, you know, at a lot of these conferences, the number of slots for speaking is very limited. And so a lot of work goes on to creating these posters. And I think what Hanya was trying to do was to emulate that at CppCon, because there is a lot of really good work that people are doing that should be talked about

Starting point is 00:31:01 and the community should be made aware of. So 2016 was the first year, and there were four posters. Unfortunately, she couldn't do it last year, and Jon Kalb asked me if I was interested in doing it, and I agreed to do it. And so last year was my first year doing it. We had nine posters, so it was a very nice increase in poster count. It was very successful.

Starting point is 00:31:27 The posters, I don't know if you remember from last year, but the poster stands were inside the reception hall during the Sunday evening reception. And I was very pleasantly surprised and very excited to see that around every single poster, there were crowds that were two and three and four deep listening to the poster authors talk about their work. So I was excited that people were so interested in it, but I was also pleasantly surprised and happy on behalf of the authors that they were getting so much attention for their work. And so this year, 2018, we have 16 posters. So we're not quite doubling again. But over two years, we've quadrupled. So I figure if I can double it again three more times,

Starting point is 00:32:14 I'll have as many posters as there are talks. And I can give Bryce a run for his money. It's just that much more work to do to make sure that you're selecting the right content, too. Yeah, well, you know, to be fair, you know, the requirements for submitting a poster are not nearly as stringent as they are for giving a talk. Right. But, you know, the idea with the posters is to give people a voice who may not quite have everything that it takes to give a talk. And really, I'd sort of like to focus over the coming years on some of the younger or less experienced members of our community to try and get them involved. I'd like to hear about their work, students, graduate students, postdocs, interns,

Starting point is 00:33:01 people who are relatively new in industry, to get them to participate in this as well and build contacts throughout the rest of the community and talk about what they're doing. I think it's a great experience. So I'm curious, if you can share, how many submissions you had versus how many were accepted? We had slightly more than 16.

Starting point is 00:33:23 I think there were 18 or 19 submissions. And I would have, actually all of the submissions were very good. The problem with the two or three that did not make it is they just came in too late. I can let a day or two slide, but I can't let nine days slide. The people who are doing the evaluations, I'd already formulated a schedule and asked them for their time. It's unfair to say, hey, here's three more submissions, but you've got half a day to evaluate them. So they were cut off for time reasons, not for content. I think I've heard John or Bryce refer to this as the 90-10 rule of 90% of the submissions come in 10 minutes after the deadline.

Starting point is 00:34:10 Yeah, John's told me that as well. Yeah. But I think it's a great medium. In a way, the posters are a semi-persistent medium. They're up all week. Whereas the talks, after a talk, when the words are spoken, they're kind of gone until the talks come out on YouTube. The posters are a good way to foster communication that lasts for an entire week. And it's just a different way of presenting information that's a contrast to the talks and classes and such. And I think it's a great idea. I'm very happy that John asked me to be a part of it. And, you know, I hope he'll let me continue.

Starting point is 00:34:51 So is the plan the same again this year, then, to have them during the reception available? Yes. The poster stands will be in the reception hall, I think. That's the last plan that John and I spoke about a few days ago. Sort of the same format, except there will be more posters. This year, so the first two years, we actually had a poster judging committee that was a small number of people that went around and evaluated the posters in order to award a prize. This year, we have too many posters to ask a committee to do that, because it takes quite a bit of time, at least 20 minutes per person, and then everybody has to gather and argue to decide on a winner. So this year, we are going to award prizes based on votes from participants.

Starting point is 00:35:38 And I think probably this year, the top two vote-getters will be the posters that win the prizes. Okay. When you say votes from participants, do you mean from the poster creators or from CppCon attendees? From CppCon attendees. So for those of you who are maybe new to CppCon, when you pick up your badge and your conference materials, inside of it you will find two colored slips of paper which represent your votes for posters. And as you view the posters, you'll find that there's a large envelope attached to the poster stand with the poster's name on it. And if you like that poster, if you think it's

Starting point is 00:36:16 worthy of a prize, you can drop one or both of your vote slips into that envelope and they will be counted and tallied up for the awards Friday afternoon. So in addition to working on the poster program, I think you're going to be giving two talks at CppCon this year, is that right? Yes, so I'm fortunate to be able to give two talks this year, which are reprises of the talks that I gave in Aspen at C++ Now. The one on Tuesday morning is called Fancy Pointers for Fun and Profit, and it's a look at synthetic pointers and things that you can do with them, or some things you can do with them. The talk on Wednesday morning has to do with fast conversion of UTF-8 to UTF-32 using C++ and a simple DFA and some SSE intrinsics.

Starting point is 00:37:13 You definitely have to tell us more about these fancy pointers. What do you mean by fancy pointers? What can our listeners expect to see in this talk? Okay, so fancy pointer is the common or the colloquial term for something that the standard calls pointer-like types. Okay. And a pointer-like type is a user-defined type whose purpose is to emulate the syntax and the semantics of pointers, right? People call these fancy pointers. I like to call them synthetic pointers because, you know,

Starting point is 00:37:48 fancy is not quite fancy enough of a word for it. But if you think about what a pointer is, a pointer is a kind of object that represents an address, and it has certain primitive operations related to addressing. It tells you what the location of some thing is in memory. And as it turns out, with C++11, you can write a class which pretty closely mimics the syntax of a natural pointer, you know, T star or void star. And you can imbue that class with the semantics of a pointer. And this turns out to be useful. There's a couple of motivating problems.

Starting point is 00:38:29 The first problem is shared memory. Suppose that you're writing some message managing system or shared memory database, and you want this thing which exists in shared memory to be accessible by multiple processes. So in one of the processes, you ask the operating system, make me some shared memory segments. And it will do that, and it will give you back some sort of handles to those shared memory segments. And then you ask the operating system, okay, with this handle to the shared memory segment, give me an address to that.

Starting point is 00:39:03 And so the operating system will map the virtual address of the shared memory segment into the private address space of your process. Okay. So now if you have more than one process, there's no guarantee that the mapping is going to be at the same address in both of those processes. You can ask the operating system to do that, and sometimes it will comply, and sometimes it will not. So you cannot be guaranteed that the same memory

Starting point is 00:39:31 segment is going to exist at the same address in both processes. So with synthetic pointers, it's possible to develop a type that emulates pointers that works in both processes. Imagine that you wanted to construct a linked list of strings, a std::list of std::string, or a std::map of std::string to std::list of std::string, right? And you wanted to construct this thing and you wanted it to be in your shared memory segment. Well, with ordinary pointers, it would work for the process that constructed and placed things there, but for other processes it might not because it's using natural pointers and their addressing is relative to the base address of that process. With synthetic pointers, you can define a pointer type and use it with

Starting point is 00:40:26 std allocator to do the same sort of thing, to instantiate the same class in the shared memory segment, but have it be readable and writable, if you lock it correctly, from multiple processes. So it's a location-independent way of doing addressing, in a sense. So that's one motivating problem. The other is something that I call, for lack of a better word, self-contained DOMs. Suppose that you're doing some kind of messaging, and you want to build that same data structure, a map of string to list of string, and you want to wrap it up in some buffer, and you want to send it out over the wire to somebody else. Well there's another kind, you know, there is a kind of synthetic pointer that you

Starting point is 00:41:12 can use to build that kind of thing into a buffer of bytes and send it over the wire and have the receiver get it, do a pointer cast, and be able to use it without any serialization. So it's possible to build a valid C++ object, a container, which is both a valid object and a pre-serialized bag of bytes that you can move around with memcpy or send-receive or things like that. And I think it has a lot of use in today's world where people are doing messaging and stream processing where a lot of time is being spent serializing to XML or JSON and deserializing on the receiving end. And this is a technique where if you have a homogenous network where you know all your nodes are the same type and you've got the same compiler building your software, that you can achieve some nice performance benefits. So there's something that you said at the beginning.
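To make the "bag of bytes" idea concrete, here is a minimal sketch of one kind of synthetic pointer: a self-relative pointer that stores the distance from its own address to its target. The names `offset_ptr`, `node`, and `sum_after_copy` are hypothetical and are not taken from the talk. The sketch assumes a flat address space and, like the technique described, it leans on behavior the standard does not formally guarantee (copying non-trivially-copyable objects with `memcpy` and then doing a pointer cast).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>

// Hypothetical self-relative pointer: it stores the distance from its own
// address to the target instead of an absolute address.
template <class T>
class offset_ptr {
    std::ptrdiff_t off_ = 0;  // 0 is reserved to mean "null"

    void set(T* p) {
        // Integer arithmetic on addresses; assumes a flat address space.
        off_ = p ? static_cast<std::ptrdiff_t>(
                       reinterpret_cast<std::uintptr_t>(p) -
                       reinterpret_cast<std::uintptr_t>(this))
                 : 0;
    }

public:
    offset_ptr() = default;
    offset_ptr(T* p) { set(p); }
    // Copying must recompute the offset, because the copy lives elsewhere.
    offset_ptr(const offset_ptr& other) { set(other.get()); }
    offset_ptr& operator=(const offset_ptr& other) {
        set(other.get());
        return *this;
    }

    T* get() const {
        if (off_ == 0) return nullptr;
        return reinterpret_cast<T*>(reinterpret_cast<std::uintptr_t>(this) +
                                    static_cast<std::uintptr_t>(off_));
    }
    T& operator*() const { return *get(); }
    T* operator->() const { return get(); }
    explicit operator bool() const { return off_ != 0; }
};

struct node {
    int value;
    offset_ptr<node> next;
};

// Builds a two-node list inside one buffer, copies the raw bytes to a second
// buffer, then walks the list in the copy without any fix-up step.
inline int sum_after_copy() {
    alignas(node) unsigned char buf[2 * sizeof(node)];
    alignas(node) unsigned char copy[sizeof buf];

    node* a = new (buf) node{1, {}};
    node* b = new (buf + sizeof(node)) node{2, {}};
    a->next = b;

    // The "send it over the wire" step, simulated with memcpy.
    std::memcpy(copy, buf, sizeof buf);

    // The receiver's "pointer cast"; relies on implementation behavior.
    const node* head = reinterpret_cast<const node*>(copy);
    int sum = 0;
    for (const node* n = head; n != nullptr; n = n->next.get()) {
        sum += n->value;
    }
    return sum;
}
```

Because every link is relative to the link's own location, the copied list is still internally consistent wherever the bytes land. A real container would also need an allocator that hands out such pointers, which this sketch leaves out.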

Starting point is 00:42:08 I mean, that's all very interesting. I can see the application. I don't want to ignore that. But you said, model a pointer like a T star or a void star. And arguably, void star isn't a model of a pointer. You can't increment it, and you can't dereference it, right? Yes, you're absolutely right. So in my talk, I break the concept down into four different levels, at least with regard to memory allocation. Addressing model, storage model, pointer interface, and allocation strategy.

Starting point is 00:42:46 These are four different concerns, somewhat orthogonal concerns, that an allocator has. At the lowest level of that is the addressing model. You can think of the addressing model as being the analog of void star. How do I arrange bits inside an object to represent memory? Okay. For example, if you remember DOS or early Windows segment offset addressing, you had a pointer which contained a segment and an offset, right? That's actually a similar principle can be used for shared memory synthetic pointers. And those are different, say, than flat pointers like you have in a 32 or 64-bit address space. So what I call the addressing model is something that's analogous to a void star, and it's

Starting point is 00:43:37 basically how do you define the bits that represent a pointer? And how do you use that to compute an address if you know the type of the thing that you're addressing? And typically that's somewhat associated with the storage model. If you had a shared memory storage model, you might expect a representation in one form. A self-contained DOM storage model might use a different form of addressing model. But layered on top of that is what I call the pointer interface. And this is the thing that actually, for non-void types, adds the traditional pointer syntax and semantics on top of it.

Starting point is 00:44:15 Okay. So it takes sort of the void star characteristics of the addressing model, and it imbues them with the type characteristics or the type operations that you would expect from a regular T star. All right. Okay. Do you want to tell us a little bit more about the UTF-8 talk you're going to be giving?
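The layering described here can be sketched roughly as follows, assuming a segment-plus-offset addressing model. All names (`segment_base`, `segment_address`, `seg_ptr`, `same_pointer_two_mappings`) are hypothetical illustrations rather than the classes from the talk, and two local arrays stand in for the same shared segment mapped at different addresses in two processes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical per-process table: where each segment id is currently mapped.
inline unsigned char*& segment_base(std::size_t id) {
    static unsigned char* table[4] = {};
    return table[id];
}

// Addressing model: the untyped, void*-like layer. It only knows how the
// bits are arranged and how to turn them into an address.
struct segment_address {
    std::uint32_t segment = 0;
    std::uint32_t offset = 0;  // in bytes from the start of the segment

    void* resolve() const { return segment_base(segment) + offset; }
};

// Pointer interface: adds typed, T*-like syntax on top of the addressing model.
template <class T>
class seg_ptr {
    segment_address addr_;

public:
    explicit seg_ptr(segment_address a) : addr_(a) {}

    T* get() const { return static_cast<T*>(addr_.resolve()); }
    T& operator*() const { return *get(); }
    T* operator->() const { return get(); }
    T& operator[](std::ptrdiff_t i) const { return get()[i]; }

    seg_ptr& operator++() {
        addr_.offset += static_cast<std::uint32_t>(sizeof(T));
        return *this;
    }
};

// The same seg_ptr value resolves correctly under two different mappings.
inline bool same_pointer_two_mappings() {
    int mapping_a[4] = {};  // stands in for process A's view of segment 1
    int mapping_b[4] = {};  // stands in for process B's view of segment 1

    seg_ptr<int> p{segment_address{1, static_cast<std::uint32_t>(sizeof(int))}};

    segment_base(1) = reinterpret_cast<unsigned char*>(mapping_a);
    *p = 42;  // lands in mapping_a[1]

    segment_base(1) = reinterpret_cast<unsigned char*>(mapping_b);
    *p = 7;   // same pointer bits, now lands in mapping_b[1]
    ++p;
    *p = 9;   // mapping_b[2]

    return mapping_a[1] == 42 && mapping_b[1] == 7 && mapping_b[2] == 9;
}
```

Keeping the bit layout in `segment_address` and the typed operations in `seg_ptr` mirrors the split between addressing model and pointer interface. A different storage model could swap in another addressing struct without touching the typed layer.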

Starting point is 00:44:35 Sure. And in fact, if I remember correctly, Jason was in the audience in Aspen and saw that. I did. Oh, yes. They had a bright light shining in my face, and Jason was jumping up and down in the back of the room, apparently trying to get my attention, and I couldn't see him. So I apologize again for that, Jason. Yeah, yeah. I always, whenever I'm giving a talk like that, I always say, I'm going to turn down the light, and the camera guy's like, but then you won't look as good on video.

Starting point is 00:45:03 I'm like, you know what? I'd rather be able to interact with my audience. Right. So a couple years ago, I was considering building my own JSON library. I eventually decided against the wisdom of that, but as part of that, I realized that a conforming JSON implementation has to be able to handle UTF-8 correctly.

Starting point is 00:45:24 And so I sort of barely knew what UTF-8 was and realized that, hey, there's this thing where you need to take UTF-8 code units, or sequence of UTF code units, and turn them into UTF-32 code points if you want to do something useful with it. And so that took me down the rabbit hole of how can I do this quickly? And I sort of looked at some of the traditional algorithms for doing it, but I also, based on work that I'd done before, i.e. the regular expressions many years ago, I realized that this was a lexical analysis problem and that one could write a DFA to analyze the incoming stream of UTF-8 octets, both for validity but also to turn them into a UTF-32 code point.

Starting point is 00:46:06 And so I started working on this, and after a little bit of alcohol, much profanity, and many hours of working on pencil and paper, I derived a state machine, a DFA, which described how to do this and conformed to all the various requirements from the Unicode Consortium for doing such conversions. And then I found out a few weeks after I did that that somebody else had already done it. But luckily, my results matched his, which was pleasing. There's strength in numbers. And so I had a DFA that could do the conversion, but when I tried to do DFA-only based conversion, my speed was basically the same as the traditional algorithm, which has a switch statement in it, or a laddered sequence of if statements.

Starting point is 00:46:57 And I was looking at it, and I looked at some sample webpages, my test cases from Wikipedia, and I looked at them, and I realized that any time one encounters an ASCII character, which fits in a single UTF-8 octet, it's very likely to be followed by another ASCII character. And even if you do things like download the Wikipedia page for the Hindi language written in Hindi, you know, for Hindi readers, there's an awful lot of ASCII characters in it. So it occurred to me that if I had a fast way to zero extend the ASCII characters from 8 bits to 32 bits, I might be able to speed things up. And so I did some experimentation with that and have sort of a hybrid algorithm which uses the DFA when it encounters a non-ASCII character. Otherwise, when it finds an ASCII character, it tries to zero expand as many subsequent ASCII characters, up to 16 as possible, using some SSE primitives.
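The overall shape of such a hybrid converter can be sketched as below. This is a simplified stand-in and not the algorithm from the talk: the ASCII fast path is a plain scalar loop instead of SSE intrinsics, and multi-byte sequences go through a conventional branch-based decoder instead of a table-driven DFA. The function names `decode_multibyte` and `utf8_to_utf32` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decodes one multi-byte sequence starting at s[i] and advances i past it.
// Returns false on malformed input (bad lead byte, bad continuation byte,
// truncation, overlong encoding, surrogate, or value above U+10FFFF).
inline bool decode_multibyte(const std::string& s, std::size_t& i,
                             char32_t& out) {
    const unsigned char b0 = static_cast<unsigned char>(s[i]);
    std::size_t len = 0;
    char32_t cp = 0;
    char32_t min = 0;  // smallest code point allowed for this length
    if ((b0 & 0xE0) == 0xC0)      { len = 2; cp = b0 & 0x1F; min = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }
    else return false;  // stray continuation byte or invalid lead byte

    if (s.size() - i < len) return false;  // truncated sequence

    for (std::size_t k = 1; k < len; ++k) {
        const unsigned char b = static_cast<unsigned char>(s[i + k]);
        if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
        cp = (cp << 6) | (b & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        return false;
    }
    out = cp;
    i += len;
    return true;
}

// Converts UTF-8 to UTF-32. Runs of ASCII are zero-extended in a tight loop;
// anything else falls back to the slower multi-byte decoder.
inline bool utf8_to_utf32(const std::string& in, std::u32string& out) {
    out.clear();
    std::size_t i = 0;
    while (i < in.size()) {
        // Fast path: ASCII tends to be followed by more ASCII.
        while (i < in.size() && static_cast<unsigned char>(in[i]) < 0x80) {
            out.push_back(static_cast<unsigned char>(in[i]));
            ++i;
        }
        if (i == in.size()) break;

        char32_t cp = 0;
        if (!decode_multibyte(in, i, cp)) return false;
        out.push_back(cp);
    }
    return true;
}
```

The inner ASCII loop is the part a vectorized version would replace, widening a block of bytes at once when all of them have the high bit clear.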

Starting point is 00:47:58 And they turn out to be quite fast. For simple cases of mostly ASCII, you can see a 10 times performance speed up over traditional libraries. For more difficult cases where you have lots of Chinese or Russian or 3-byte characters, you can do, sprinkled in with some ASCII, you can maybe do twice as fast. And then I also derived a couple of torture tests in cases where I knew that my algorithm would not perform well just to see how it compared to the other algorithms. And most of the time it held up pretty well. I also did some cross-conversion from UTF-8 to UTF-16.

Starting point is 00:48:39 And I was trying to compete against Microsoft's Win32 API function for doing such conversions, and I'm proud to say that my conversions compare favorably to theirs. Yeah, theirs are core to Windows and highly optimized, right? Yes. So in most of my test cases, I actually beat their conversion by a small amount. Yeah, that's pretty good for as many years as they've invested in that. Yes. So. I know you're part of the standards committee now.

Starting point is 00:49:09 Hasn't there been some talk about UTF-8 standardization recently? Well, there's not so much UTF-8, but I know Tom Honermann has made a proposal to add a new fundamental type called char8_t, which would be the analog of char16_t and char32_t, specifically to support Unicode literals and also something like a u8string, like we have u16string and u32string now, which are type aliases. You know, the problem with char is I don't think it's specified whether char is signed or unsigned.
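A small sketch of why the signedness of plain `char` matters for UTF-8 work: whether `char` is signed is up to the implementation, so portable code usually converts to `unsigned char` before comparing or masking octets. The helper names `is_continuation` and `octet_value` are hypothetical.

```cpp
#include <cassert>

// True if c is a UTF-8 continuation octet (bit pattern 10xxxxxx).
// The cast matters: where plain char is signed, octets of 0x80 and above
// are negative, so a test such as (c >= 0x80) would never be true.
inline bool is_continuation(char c) {
    return (static_cast<unsigned char>(c) & 0xC0) == 0x80;
}

// Numeric value of an octet in the range 0..255, whatever char's signedness.
inline unsigned octet_value(char c) {
    return static_cast<unsigned char>(c);
}
```

An unsigned character type for UTF-8 data would make these casts unnecessary, which is part of the appeal of the proposal mentioned here.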

Starting point is 00:49:52 I think the standard leaves it unspecified. And char8_t is an unsigned type, which mirrors exactly what you need to do the unsigned operations on UTF-8 octets. Okay. I know that Tom actually has his own implementation of various Unicode utilities. Zach Laine is also very active. He's the author of Boost.Text, which has some very extensive research and work that Zach has done on creating a better string type,

Starting point is 00:50:26 which is compatible with Unicode up and down the line. And also I should mention Tom runs the SG16 study group, which is a newly formed study group looking at Unicode support for both language and library. Oh, I was not aware of that study group. Yes. Well, on the topic of your involvement and knowledge with the Standardization Committee, I promised I would get back to this.

Starting point is 00:50:53 You said when we first asked you about your involvement that you focus on the things that would make the language easier for the little guy, basically. I'm kind of curious what things so far you have focused on and what you would like to see get moved forward. Well, so I've been mostly an observer member up to now. I co-authored a paper with Arthur O'Dwyer. Actually, Arthur did all the hard work. I just sort of rode his coattails in on the paper.

Starting point is 00:51:25 It's nice when you can just sign your name to a paper that someone else has written. Yeah, I sort of feel guilty about that because Arthur really did do quite a lot of work. He's very smart, and I was very fortunate that he let me be associated with him. Having to do with fancy pointers, actually. But that particular topic had to do more with fixing defects or perceived defects in the existing standard with regard to support for fancy pointers. But, you know, going forward, to answer your question, Jason, you know, I have various ideas about how one could make the standard library and the language facilities easier for the little guy.

Starting point is 00:52:13 I gave a talk in Aspen, C++ Now, last spring, called If I Had My Druthers, and proposals for changing containers in the standard library, in which I proposed that we could add several new containers, and we could structure them in such a way that there's an interface for what I call the casual user that allows them to be productive right out of the box. They don't have to worry about things like allocators. And the subtleties associated with allocators are completely erased from the picture. There's no allocator as part of the template interface of those classes. But as part of the layers in getting to that point, there are layers for expert and guru-style users to get more performance and more customizability out of it. So I think that's one

Starting point is 00:52:58 part. I would like to see, although I don't myself have any good ideas for it, but I would like to see a better string class someday that's easier to use. When you talk about the language, one of my pet peeves has always been the conflation of interface and implementation. And I think that the specification of implementation can be unnecessarily verbose. I'm the kind of guy who I have this religious, unshakable, zealous practice of separating my member function implementations from my class definitions. When I look at a header, I just want to see the member functions in the class.

Starting point is 00:53:41 I don't want to see the member functions defined in the class because this is not Java. So I always define my member functions outside the class. Even for simple things, only the very most trivial things do I define member functions inside the class definition itself. I want to know if you also go through what would be necessary to do that for templates as well. I do. I do. Okay. If you read my code, my class definitions only contain declarations. Doesn't that make your member function definitions for template classes significantly longer because you have to specify the template type for the class

Starting point is 00:54:25 in that case when you wouldn't have to otherwise? You do. There's a whole lot of verbiage that has to come along for the ride. Okay. So for example, one of the ideas I've had in the past is slightly extending the syntax of the language by adding a new keyword or perhaps reusing an existing keyword so that you could define the member functions for a class or class template outside of the definition of the class template itself the same way that you would define them inside the class template. So I've been toying with the idea of reusing the word namespace. So away from your class definition,

Starting point is 00:55:06 at some subsequent point in the translation unit, you could do something like template, angles, class, T, angle, namespace, open bracket or open brace, and then write the definitions of your member function inside that without all of the extra verbiage, and then close the brace and you're done, just as if you had defined them inside the class definition itself. That would save an awful lot of electrons from being harvested and would make code

Starting point is 00:55:38 a lot easier to read. You know, it's a time saver and it's a readability thing. It helps readability in separating interface from implementation. It helps readability in reading the implementation itself. I don't know if anybody's going to go for that idea. When I mention it to people, about 50% of the time they look at me like I'm crazy. So I don't know if it will go anywhere. But that's the kind of thing that I'm thinking of. I'm only looking at you like you're 50% crazy, not.
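For context, this is roughly what the "extra verbiage" looks like in today's C++ when a class template's members are defined outside the class. The `ring` class is a hypothetical example invented for this sketch. The trailing comment only gestures at the idea floated in the conversation; that syntax is not valid C++, and its exact spelling was not given, so the comment is a guess.

```cpp
#include <cassert>
#include <cstddef>

// Class definition containing declarations only: the interface.
template <class T, std::size_t N>
class ring {
public:
    bool push(const T& v);      // false if the buffer is full
    bool pop(T& out);           // false if the buffer is empty
    std::size_t size() const;

private:
    T data_[N] = {};
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};

// Implementation: every definition repeats the template header and the
// fully qualified class name.
template <class T, std::size_t N>
bool ring<T, N>::push(const T& v) {
    if (count_ == N) return false;
    data_[(head_ + count_) % N] = v;
    ++count_;
    return true;
}

template <class T, std::size_t N>
bool ring<T, N>::pop(T& out) {
    if (count_ == 0) return false;
    out = data_[head_];
    head_ = (head_ + 1) % N;
    --count_;
    return true;
}

template <class T, std::size_t N>
std::size_t ring<T, N>::size() const {
    return count_;
}

// Hypothetical shape of the suggested extension (NOT valid C++; how the
// class would be named in it is an assumption):
//
//   template <class T, std::size_t N>
//   namespace ring {
//       bool push(const T& v) { /* ... */ }
//       bool pop(T& out)      { /* ... */ }
//       std::size_t size() const { return count_; }
//   }
```

Three short member functions already repeat `template <class T, std::size_t N>` and `ring<T, N>::` three times each, which is the repetition the idea is meant to remove.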

Starting point is 00:56:06 I see where you're going with it, and I could definitely see if there was a way to do it in a way that would make the committee happy, I would not argue against it, personally. So, yeah. So, ideas like that. I mean, it doesn't add anything new to the language. There's no

Starting point is 00:56:22 semantics. It's just slightly new additional syntax that doesn't break anything existing that would make the language easier to use. Okay. Maybe that's a good segue since you're talking about interface implementation to give a plug for your pre-conference class on modern interface design. Yes, thank you. So I'm giving a two-day pre-conference class called Interface Design for Modern C++.

Starting point is 00:56:50 And it's a class which I intend to be geared towards C++ programmers who think of themselves as being advanced beginners or maybe intermediate level who are, you know, relatively new to the language perhaps and are thinking about writing their own API or their own libraries. And, you know, my skill in doing this, whatever skill I have, has been built up over 26 painful years. And so I'm hoping to impart some wisdom based on my experience, but also on what I think are commonly established best practices for defining interfaces. And also trying to have my students, the ones who show up, gain some understanding and appreciation for everything that's involved in creating, you know, say a library, right? There's more to it than just writing code. There's a lot of thinking besides thinking about code that has to go on. And so I hope that my students will come away sort of with a checklist of two checklists,

Starting point is 00:57:59 a checklist of guiding principles for what are the questions I should be asking myself and the things that I should be doing as I am composing my interface that I want to give to the world. And the second checklist is, here's a list of things and actions that I should be doing in order to actually release my interface to the world. So, you know, that's the nickel tour of what the class, I hope the class will be. Sounds good. Yeah, sounds great.

Starting point is 00:58:28 Thanks. Well, is there anything else you want to go over, Bob, before we let you go? No, you guys have, I think we've done a pretty thorough job here. Again, thank you very much for having me on the show. This has really been a lot of fun. Certainly. Yeah, it's been great having you on today. Yes.

Starting point is 00:58:45 Thanks so much for listening in as we chat about C++. We'd love to hear what you think of the podcast. Please let us know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic, we'd love to hear about that too. You can email all your thoughts to feedback at cppcast.com. We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter. You can also follow me at robwirving and Jason at lefticus on Twitter. We'd also like to thank all our patrons who help support the show through Patreon. If you'd like to support us on Patreon, you can do

Starting point is 00:59:16 so at patreon.com/cppcast. And of course you can find all that info and the show notes on the podcast website at cppcast.com. Theme music for this episode was provided by podcastthemes.com.
