CppCast - The Art of C++ Libraries

Episode Date: August 9, 2018

Rob and Jason are joined by Colin Hirsch to discuss his work on The Art of C++ collection of libraries including PEGTL, json and more.

Dr. Colin Hirsch studied Computer Science at the University of Technology in Aachen, Germany in 1993 and later got a PhD in Mathematics from the same university. He worked for two years as a consultant for T-Mobile, developing back-end server applications in C++ and Lua. Later Colin moved to Italy, opened his own business and continued working for T-Mobile (now Deutsche Telekom) as well as working for some other interesting projects like Greenpeace and the Austrian ministry of ecology. In his free time he enjoys photography, being in nature, science fiction and spending time with his daughter.

News
Google Open Sources Filament rendering engine
CppCon 2018 Program
C++ Foundation Survey 2018-08
C++ on Sea Early Bird Tickets Available

Colin Hirsch
Colin Hirsch's GitHub

Links
The Art of C++
UmbriaLogic
PEGTL
json
postgres

Sponsors
Backtrace

Patreon
CppCast Patreon

Hosts
@robwirving
@lefticus

Transcript
Starting point is 00:00:00 Episode 162 of CppCast with guest Colin Hirsch recorded August 8th, 2018. This episode of CppCast is sponsored by Backtrace, the turnkey debugging platform that helps you spend less time debugging and more time building. Get to the root cause quickly with detailed information at your fingertips. Start your free trial at backtrace.io slash cppcast. In this episode, we discussed the CppCon 2018 program.
Starting point is 00:00:45 Then we talked to Colin Hirsch. Colin talks to us about the Art of C++ libraries. Welcome to episode 162 of CppCast, the first podcast for C++ developers by C++ developers. I am your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how are you doing today? I'm all right, Rob. How are you doing? I'm doing okay. You know, weather's not too bad out here in North Carolina. How about you? Oh, cool morning for me. I'm actually wearing a sweatshirt at the moment, but... Very nice, very nice. Cooler than the rest of the planet. Any news you wanted to bring up before we get going? No, I don't think so. Not at the moment.
Starting point is 00:01:45 Okay. Well, at the top of our episode, I'd like to read a piece of feedback. This week, we got a tweet from Paul saying, I feel like I'm cheating on CppCast with LambdaCast, but it is so good. I had not heard of LambdaCast, but I looked them up. They're about 20 episodes in. They seem to be a functional programming-focused podcast. So, sounds interesting.
Starting point is 00:02:08 Yeah, that's a completely different beast from what we're doing, I think. Yeah, I wouldn't say you're cheating on us with them. And if you were cheating on us, then you shouldn't let us know about it, maybe. But, yeah, it's always great to have more technical programming podcasts out there, so we're happy to hear about LambdaCast, and you can go check them out if you're interested. So we'd love to hear your thoughts about the show as well. You can always reach out to us on Facebook, Twitter, or email us at feedback at cppcast.com. And don't forget to leave us a review on iTunes. Joining us today is Dr. Colin Hirsch.
Starting point is 00:02:47 Dr. Hirsch studied computer science at the University of Technology in Aachen, Germany in 93 and later got a PhD in mathematics from the same university. He worked for two years as a consultant for T-Mobile, developing backend server applications in C++ and Lua. Later, Colin moved to Italy, opened his own business,
Starting point is 00:03:00 and continued working for T-Mobile, now Deutsche Telekom, as well as working for some other interesting projects like Greenpeace and the Austrian Ministry of Ecology. In his free time, he enjoys photography, being in nature, science fiction, and spending time with his daughter. Colin, welcome to the show. Yeah, thank you. And I'm very glad to be here with you today.
Starting point is 00:03:20 Thanks for joining us. You know, I wondered if T-Mobile was headquartered in Aachen, because I noticed that as an American with T-Mobile service, I had the best service outside of the United States I've ever had when I was last in Aachen. No, their headquarters is actually in Bonn, Germany, but that's just like an hour, an hour and a half from Aachen, so sort of same region, same general area.
Starting point is 00:03:45 Okay. So yeah, 93, that was, you've been programming for a while then, I take it? Yes, I started pretty early, actually. When I was 10 years old, I got my first computer, even though most people looking at it mistook it for a calculator, because that was the size of the thing. But it did actually have a very small alphanumeric display, and you could write programs in BASIC as long as you stayed within the 524 bytes of free memory that it had. What kind of computer was that? What was that? It was a Casio PB110. Okay. You've probably never heard of it before. I once looked up the tech specs.
Starting point is 00:04:29 It was fun to see there was actually a 4-bit microprocessor in there. Are you checking to see if you have one on your desk or something? No, I don't. I have something similar-ish. This is a Casio FX4000P, which was the first programmable calculator that I had. And I think it also has a 4-bit microprocessor in it. Yeah, it might have been the same thing, just sort of packaged into a different form factor. Perhaps.
Starting point is 00:04:54 Interesting. Of course, 1993, we were a bit further along the road, although it was sort of the fun times, you know, compiling your own Linux kernels with experimental compilers and that kind of thing. Right, right. Okay, well, Colin, we have a couple of news articles to discuss. Feel free to comment on any of these, and then we'll start talking to you more about the libraries you work on, okay? Okay, great. That sounds fine. Okay, so this first one is that Google has open sourced Filament, which is a physically based rendering engine for Android, Windows, Linux, and macOS.
Starting point is 00:05:31 I had not heard about this before, but I guess it was an internal Google project before, and they released this out to the wild. Yeah, I mean, the pictures look pretty, but I feel like I'm at a loss of context for this. Like, is it intended for just static ray tracing, or games? With the building modeling energy simulation stuff that I've done in the past, there are these physically based ray tracers used for things like determining how much light you'll actually have in a room on a given day of the year, based on the reflectivity of the wall colors and that kind of thing. Like, is that what this is intended for? Is it intended for, like, actual physics modeling? I don't know. I was also a bit confused about, like, is it for real time, or is it something where each picture takes, you know, like two hours on a high-end computer? Which also shows that I'm not particularly into computer graphics lately. Right.
Starting point is 00:06:33 But given that the first platform they mention is Android, does that kind of imply that it might be for... It does say real-time physically-based rendering. Ah, okay. Real time, okay, thanks. I would think it's real time if you're running it on a phone. Yeah, that's a good point. Yeah, well, it looks like an interesting project. I definitely recommend everyone look at some of these screenshots being produced with the library. They're pretty impressive looking. Yeah. Next, we have the entire CppCon 2018 schedule, which is now available, and that is only seven weeks away now, right, Jason? Yes, right at seven weeks, it seems. And it looks like five, six, seven different tracks, depending on what particular hour of the day you're looking at. Yeah, at least
Starting point is 00:07:25 six tracks. Yeah. Now, one question I had was, I know they've already announced at least some of the keynotes and plenaries, but all of the keynote and plenary time slots say TBA, and I wasn't sure why that might be. They've already announced it. They just need to decide who goes when. We have Kate Gregory doing one. We have that guy, I can't remember his name, but he works on the video game graphics, I believe, or movie graphics. Mark Elendt?
Starting point is 00:07:59 Yeah. Elendt? Yeah, I believe that's it. Chandler. Chandler's doing one. See, I don't know why these aren't just listed on the schedule. Well, that's an interesting point, because that's one, two, three, and then it's kind of implied.
Starting point is 00:08:13 Well, it is even actually said here, Herb and Bjarne will be speakers. That's five. That's basically everyone. Yeah, I mean, maybe they're just shuffling around who's going where. I don't know. They also have all these open content sessions that are listed as TBA. Yes, they haven't even announced the call for open content sessions yet.
Starting point is 00:08:32 How many of those did they do last year? I know we talked about the Jewelbots with Sarah. She gave an open content. Was that the only one last year? No, there tends to be, well, the morning and noon hour of every day is what I would have expected. It looks like this year there's also some open in the evening. I don't know if they did that last year and I just wasn't paying attention. Okay. But pretty much anyone can submit that. So if you are going to be at CppCon
Starting point is 00:09:03 and you're not giving a talk, or even if you are giving a talk and you've got some other thing that you want to propose as a talk that doesn't go through the normal review process, basically you have the ability to just give something during one of the non-mainstream times. Then they'll announce that probably pretty soon for you to submit talks for that. Yeah, and these open content talks, like you said, they're kind of all not going to interfere with session time. So there's one at 8 a.m., one at 12:30, which is kind of the lunch hour, and there's one at night kind of at the same time as possible lightning talks.
Starting point is 00:09:41 Yeah. Colin, are you going to CppCon? No, unfortunately, I will not be making it, but I have to say the program is quite impressive, and I think we could do two of these CppCon sessions just going over it in more detail. And I will definitely try to get to some major C++ conference next year, even though sitting in Europe it might be one of the other ones. So I'll see about that, definitely. Well, Jason, you were at Meeting C++ last year, and that was quite a good conference
Starting point is 00:10:09 for you, right? Yes, although unfortunately I was only able to attend two days of it due to Air Berlin going out of business, and I had to rearrange my flight schedule. It happens. Okay, next thing, the C++ Foundation developer survey. Yes, this is out there. Go ahead, Jason. I was going to say, they did one of these about six months ago. Yes, so this is a follow-up. They just apparently want to keep surveying the community to see how you're using C++. This one seems to be related to the cloud. Okay, I haven't looked through it yet. They started off, the first one, I think I participated, they sort of covered all the basics. And it looks like they're sort of choosing one subject at a time, and will just dig into it. And I think that's a pretty good idea to get some more feedback. I just hope that enough people participate so that it gives a sort of reasonably representative analysis and data on the situation of as many C++ developers as possible.
Starting point is 00:11:14 Yeah, and it looked like the information they got, like you said, last time was pretty valuable. And I could see them even doing the same surveys every 18 or 24 months just to see how the landscape is changing as well. That's true. I would, however, be interested in knowing why cloud was chosen now for the first subject-dedicated survey. So I'll be looking forward to see the results.
Starting point is 00:11:37 And so not just the survey results, but also what they actually make of it and what will happen next in the C++ committee and related areas. Yeah, well, listeners should definitely check out the survey. It should only take a few minutes to get your results in. Right. And then the last thing: we've talked to Phil Nash before about the conference that he's starting, the C++ on Sea conference, and if you're interested in going to that, early bird tickets are now available. Yes, and I think an important note here is that Phil has explicitly said that if you have submitted a talk and you are waiting to hear whether or not it has been accepted, you do not need to rush out and buy the early bird ticket, because you will still be eligible for early bird even if the notifications are late for your talk submission.
Starting point is 00:12:27 That's a good thing for him to do. And the early bird ticket, it says, is until September 9th. Yes. So that's one pretty close to you to consider going to, Colin, if you're looking for a... Yes, it is actually. I definitely will check my calendar for February after we finish here, and that might be a good place for me to go. Yes, and you know, in southern England in February, you'll cool off, even if it's a bit too late compared to now. Yes, definitely. I've
Starting point is 00:12:58 been to southern England so often. I have grandparents there, actually, I used to have grandparents there, so I'd be looking forward to going there again. I only have... Well, I've been to Southern England twice, but it's been... Oh, goodness, make me feel old. Like 20 years since I was last there. 22 years.
Starting point is 00:13:16 Yeah. Okay. Well, Colin, where should we start? You, I believe, are the main contributor to the Art of C++, which is available on github.com/taocpp. Do you want to tell us a bit about it? Yeah, so actually we're mostly two people working on this small collection of C++ libraries, the other being Daniel Frey, with whom I spent some time at the university in Aachen, where we met.
Starting point is 00:13:46 And we've also spent some time working together in our professional jobs. And now we've got this project running. And it's very much a collaboration between the two of us. And the collection is sort of rather mixed. We've got some things that Daniel had on his hard disk for a while and then just wanted to sort of share with the world. And then we've got now two major libraries, I'd say, one of which has been going since 2007. That's when I started the PEGTL. And basically, Daniel wanted sort of an umbrella under which we could sort of gather all of our open source efforts and put them up in one place
Starting point is 00:14:35 and let them live there and be worked on. And the nice thing is we've also gathered a third person now, Julian from Brazil, I believe. And he's helping us with some CMake-related things and also the Conan packaging. So this is, of course, a very nice milestone for an open-source project when you suddenly have other people contributing and chiming in and helping you get on with things. So we're quite happy with the way things are going, even though being a library or a collection of libraries
Starting point is 00:15:13 only developed in our free time, development is sometimes slow, since obviously we can't always dedicate all of our free time to it. But we like the direction it's going in, and we are generally happy with what's happening there. You know, starting talking about the beginning of the project and you said basically Daniel had some work on his hard drive that he wanted to share. And this concept of, you know,
Starting point is 00:15:37 basically letting the internet be your backup. That's, I don't know, it's just kind of funny. I think that was something that Linus basically said when he started Linux. So maybe you're off to a good start here. Yeah, I'm not quite sure whether the backup aspect was the important one, but it's definitely a good side effect of things. Yes, absolutely.
Starting point is 00:15:58 So specifically, you mentioned first the PEGTL, as I think that's how you pronounced it, the P-E-G-T-L library. Do you want to talk about that? And then we'll dig into some of the others. Yes, I think the PEGTL library is sort of very typical for how the things with our open source efforts got started. It was like I had stumbled over a library called Yard by Christopher Diggins, which was a small parser library he had written. And he was sort of trying out the idea, what happens if you write a parser combinator library
Starting point is 00:16:34 and you use C++ templates and template instantiations as your domain-specific languages, or as your domain-specific language to express the grammar in language, in the C++ language. So your combinators are classes, and to combine them, you write down template instantiations in your source code.
Starting point is 00:16:58 So the syntax is obviously very different compared to something like, you know, the normal EBNF or whatever we use for grammars. But the nice thing is that you don't need a preprocessor and you can just put it into your source code, into your C++ source. And the question that I asked then was, okay, so this Yard library is still pre-C++11, so pre-variadic templates. And this means that he had to jump through all the hoops to make the templates such that they could accept a certain number of arguments, but you could also use them with less if you wanted to. But there was always an upper limit, and it always made the code more complicated than necessary.
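To make the arity problem concrete, here is a minimal sketch contrasting the two styles. It is not code from Yard or the PEGTL; the names `seq_fixed`, `seq_variadic`, `unused`, and `is_used` are hypothetical, and the rule types are empty placeholders.

```cpp
#include <cstddef>

namespace sketch {

// Sentinel type marking an unused template slot (hypothetical name).
struct unused {};

template <typename T>
struct is_used { static const std::size_t value = 1; };
template <>
struct is_used<unused> { static const std::size_t value = 0; };

// Pre-C++11 style: a fixed maximum arity emulated with defaulted parameters.
// Supporting a fourth sub-rule means editing this template.
template <typename A = unused, typename B = unused, typename C = unused>
struct seq_fixed {
    static const std::size_t arity =
        is_used<A>::value + is_used<B>::value + is_used<C>::value;
};

// C++11 style: one variadic template accepts any number of sub-rules.
template <typename... Rules>
struct seq_variadic {
    static const std::size_t arity = sizeof...(Rules);
};

// Empty placeholder rule types, just to have something to pass in.
struct sign {};
struct digit {};
struct dot {};

static_assert(seq_fixed<sign, digit>::arity == 2, "two of three slots used");
static_assert(seq_variadic<sign, digit, dot, digit>::arity == 4, "no fixed upper limit");

}  // namespace sketch
```

The fixed-arity version needs the sentinel type and the counting helper only to work around the missing language feature, which is the kind of extra complexity described above.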
Starting point is 00:17:49 And in 2007, we already had GCC 4.3, I believe it was, and that had support for variadic templates. And so I sort of took the idea and just re-implemented it in C++11, or at least in the part of C++11 available at the time with the variadic templates. And it turned out that that made many things very much easier. And from that point onward, the project sort of took on a life of its own. I then made an input layer that was a bit more flexible than the original Yard library. And I wrote documentation and I didn't bake the semantic actions into the library, but rather just added a mechanism by which semantic actions could be added to a grammar by the user.
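One way to keep semantic actions out of the grammar is to pass an action template alongside the rule and let the user specialize it per rule. The following is a small self-contained sketch of that idea, assuming a simplified design; it is not the PEGTL's actual interface, and `run`, `my_action`, `state`, and `collect_numbers` are hypothetical names.

```cpp
#include <string>
#include <vector>

namespace sketch {

using state = std::vector<std::string>;

// Runs Rule; on success hands the matched text to Action<Rule>.
// On failure the input position is restored.
template <typename Rule, template <typename> class Action>
bool run(const char*& p, const char* end, state& st) {
    const char* begin = p;
    if (!Rule::template match<Action>(p, end, st)) {
        p = begin;
        return false;
    }
    Action<Rule>::apply(std::string(begin, p), st);
    return true;
}

// Matches a single decimal digit.
struct digit {
    template <template <typename> class Action>
    static bool match(const char*& p, const char* end, state&) {
        if (p != end && *p >= '0' && *p <= '9') { ++p; return true; }
        return false;
    }
};

// Matches one or more repetitions of Rule.
template <typename Rule>
struct plus {
    template <template <typename> class Action>
    static bool match(const char*& p, const char* end, state& st) {
        if (!run<Rule, Action>(p, end, st)) return false;
        while (run<Rule, Action>(p, end, st)) {}
        return true;
    }
};

// The grammar itself contains no action code.
struct number : plus<digit> {};

// User-side actions: the default does nothing ...
template <typename Rule>
struct my_action {
    static void apply(const std::string&, state&) {}
};

// ... and a specialization attaches behaviour to one rule.
template <>
struct my_action<number> {
    static void apply(const std::string& text, state& st) { st.push_back(text); }
};

// Collects every run of digits in the input, skipping other characters.
inline state collect_numbers(const std::string& in) {
    state st;
    const char* p = in.data();
    const char* end = p + in.size();
    while (p != end) {
        if (!run<number, my_action>(p, end, st)) ++p;
    }
    return st;
}

}  // namespace sketch
```

Calling `sketch::collect_numbers` on a string returns the digit runs it contains, while `number` itself could be reused with a different action template that does something else entirely.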
Starting point is 00:18:43 And yeah, so the rest is history. And one interesting or funny tidbit of history is that the first version that I put up on Google Code Hosting at the time actually carries the version number 0.9. And the interesting thing is I fully expected the next version, like three weeks later, to be 1.0. And that obviously didn't quite work out as intended.
Starting point is 00:19:09 We went through several years until we arrived at like 0.32. And then it took another few years before Daniel and I together took on the job of major refactoring that then culminated in version 1.0. That's funny. I think I love things personally to be lexicographically sortable. Like I use ISO date format anywhere I can. Makes me wonder if maybe we should start our first project as like 0.00.0. So then we can make sure it's sortable from there on out. But I'm looking at the examples on GitHub here, and it is definitely a technique that I have not, I think, seen before. So you basically have struct integer colon. So now you're deriving from something. And you've got a sequence, which is an optional with one plus or minus, and then one or more digits, it looks like.
Starting point is 00:20:10 And then you basically create an integer parser and a single line with a bunch of inheritance, basically. Is that correct? Yes, that's correct. We're sort of constrained by the C++ template syntax here. So in an EBNF, you would have the digit followed by the symbol plus. And here we just have to write plus with digit as a template parameter.
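The integer rule being read out here can be reproduced in a few dozen lines. The sketch below is a simplified, self-contained re-implementation of the combinators involved, written only for illustration; it borrows the rule names `seq`, `opt`, `one`, `plus`, and `digit` from the conversation but is not the PEGTL's implementation, and the `parse` helper is hypothetical.

```cpp
#include <string>
#include <type_traits>

namespace sketch {

// Matches one character out of the given set (needs at least one character).
template <char... Cs>
struct one {
    static bool match(const char*& p, const char* end) {
        if (p == end) return false;
        const char set[] = {Cs...};
        for (char c : set) {
            if (*p == c) { ++p; return true; }
        }
        return false;
    }
};

// Matches a single decimal digit.
struct digit {
    static bool match(const char*& p, const char* end) {
        if (p != end && *p >= '0' && *p <= '9') { ++p; return true; }
        return false;
    }
};

// Optional: tries Rule and succeeds either way.
template <typename Rule>
struct opt {
    static bool match(const char*& p, const char* end) {
        Rule::match(p, end);
        return true;
    }
};

// One or more repetitions: EBNF "digit+" becomes plus<digit>.
template <typename Rule>
struct plus {
    static bool match(const char*& p, const char* end) {
        if (!Rule::match(p, end)) return false;
        while (Rule::match(p, end)) {}
        return true;
    }
};

// Sequence: every sub-rule must match in order.
template <typename... Rules>
struct seq;

template <>
struct seq<> {
    static bool match(const char*&, const char*) { return true; }
};

template <typename Head, typename... Tail>
struct seq<Head, Tail...> {
    static bool match(const char*& p, const char* end) {
        const char* saved = p;
        if (Head::match(p, end) && seq<Tail...>::match(p, end)) return true;
        p = saved;  // backtrack on failure
        return false;
    }
};

// The grammar rule is a named type derived from a template instantiation.
struct integer : seq<opt<one<'+', '-'>>, plus<digit>> {};

// Deriving makes `integer` its own type rather than another name for the seq.
static_assert(!std::is_same<integer, seq<opt<one<'+', '-'>>, plus<digit>>>::value,
              "a derived struct is a distinct type");

// Hypothetical helper: true if Rule matches the whole input.
template <typename Rule>
bool parse(const std::string& in) {
    const char* p = in.data();
    const char* end = p + in.size();
    return Rule::match(p, end) && p == end;
}

}  // namespace sketch
```

With these definitions, `sketch::parse<sketch::integer>("-42")` succeeds and `sketch::parse<sketch::integer>("12a")` fails because input is left over.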
Starting point is 00:20:36 And one of the first questions we then always get asked is, why do we derive? Why do we write struct integer? Why don't we use a typedef or using to give a new name to this particular template instantiation? And the answer is that in the error reporting, it's usually a good idea to use struct because then you get an error parsing integer instead of an error parsing seq, opt, one, et cetera, et cetera. That's interesting, because that could also help potentially with compile
Starting point is 00:21:07 times, because the length of the identifier that the compiler is passing around is smaller, the type, that is. And in some cases, from some of this stuff I've seen, like from Joel Falcou about compile-time optimization... But that's an interesting aspect. We do generally try to do some optimizations for compile time, but the length of identifiers hasn't been an explicit goal of ours so far. The aggregate size of the symbol table seems to have some impact on compile time, and if you can keep it smaller, then... So that might be helping anyhow. I don't know. I'm definitely not an expert in that. But yeah, so I've never seen a technique like this, though. Have you considered going the route that Boost Spirit uses with expression templates to accomplish a similar thing?
Starting point is 00:22:00 After all these years, we've basically gotten used to this approach. And we also like that we give the compiler the burden, basically, of optimizing our grammar. And we also, once you sort of take this step away from the C++ operators, and you have the words, you aren't sort of constrained by anything. We have, if you look at the rules reference, we've got really tons of different combinators that make it easier to write grammars because if you want a comma-separated list of space-separated items, there's already a combinator for that. And you can use that out of the box. Just want to take a step back.
Starting point is 00:22:43 What are some of the use cases of a parsing expression grammar that you might use this library for? Well, we've used it for some smaller DSLs for config parsing. And one of the major use cases is also the taocpp json library that has a PEGTL-based parser. Even though there we went a bit beyond what the PEGTL offers and have some additional optimizations in there, which however also shows that the PEGTL is very open and modular. So it's very easy to say at this point, in this particular point, I want something special, either semantically or from a performance point of view, and you can just add it,
Starting point is 00:23:28 and it will work together with all the rest without any big glue code or whatever. So you just have to adhere to certain API guidelines, and then you can just plug it all together. Well, so now you've mentioned the JSON library, and it sounds like it's a pretty good use case for PEGTL. What kind of compile times are you getting with JSON, since you said that you've tried to optimize compile time somewhat? It's just that we occasionally look at how things are going. And in the PEGTL in particular, we look at reducing template instantiation depth
Starting point is 00:24:11 since you have tons of nested templates. We try to make sure that within the PEGTL we don't have like 10 additional layers for every single user-visible layer in the grammar. But we don't benchmark that. It's not so bad in practice. So that's very hand-waving, but that's the best I've got here. No, I mean, that makes actually a lot of sense. If the compile times are bad, then you would say,
Starting point is 00:24:39 well, we do get a lot of complaints from our users about compile times, and you didn't say that, so it must be okay. Yeah, either users have a thick skin or computers are really fast nowadays. And we also, I think somewhere in the documentation, we recommend sticking PEGTL grammars into a source file rather than in a header file, so that it will be compiled only once in every project. And that sometimes won't be possible, but it's a small hint that can help people keep the compile time a bit lower.
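The header-versus-source-file advice can be sketched as follows. Both hypothetical files are shown in one listing, separated by comments; the file names, the `config` namespace, and `is_valid_key` are made up for illustration, and a hand-written check stands in for a real template grammar.

```cpp
#include <cstddef>
#include <string>

// ---- config_parser.hpp (hypothetical header) ----
// Callers only ever see this declaration, so no grammar templates are
// instantiated in the translation units that include the header.
namespace config {
bool is_valid_key(const std::string& text);
}

// ---- config_parser.cpp (hypothetical source file) ----
// A real project would include the parser library and define its grammar
// here, so the heavy template code is compiled exactly once.
namespace config {
namespace {

bool is_alpha(char c) { return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); }
bool is_digit(char c) { return c >= '0' && c <= '9'; }

}  // namespace

// Stand-in for the grammar: a letter followed by letters, digits, or '_'.
bool is_valid_key(const std::string& text) {
    if (text.empty() || !is_alpha(text[0])) return false;
    for (std::size_t i = 1; i < text.size(); ++i) {
        const char c = text[i];
        if (!is_alpha(c) && !is_digit(c) && c != '_') return false;
    }
    return true;
}

}  // namespace config
```

The trade-off is the usual one: the grammar can no longer be reused or extended from other files without exposing it again, which may be why this is described as not always possible.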
Starting point is 00:25:13 Right. So how do you go about ensuring that your projects stay high quality? Well, that's a very big question, and I could probably talk an hour about it, but I think one of the main parts is just that we take a lot of time for things. So sometimes when there's a change, you see a change on GitHub or you see an additional feature pop up on GitHub,
Starting point is 00:25:38 what you don't see is that we might have discussed that for three weeks playing ping pong with ideas until we are happy with the result. So this is time together with gut feeling and just working together and trying to leverage sort of all of our experiences, of our joint experiences, and not just having one person working on complicated parts. Then, of course, we've got sort of the usual stuff. We've got a large test suite,
Starting point is 00:26:11 in particular for the PEGTL, which is an 11-year-old project, is pretty mature, and we've got full test coverage. And we've spent a lot of time on that too. And then there are also some parts that are interesting because they come sort of from actually putting it up as open source. Like if I imagine I'd have written the PEGTL just for my personal use, I'd have the code, I'd have the tests, but I probably wouldn't have that much documentation for my own library, for my own personal use.
Starting point is 00:26:52 But since we are trying to reach a large audience, we are also investing a lot of time in the documentation. And sometimes you just come to a point where you are writing the documentation and you think, well, this shouldn't be so complicated. Why is this so hard to explain? And even then we go back and look at the API and look at the complexity and whether we can change it in a way, which for us is also a part of quality. So quality is ease of use and understandability, as well as simplicity of implementation and test coverage and, well, of course, correct code as correct as possible. Okay, so you said that you have full test coverage with PEGTL. Yeah, at least we think we have.
Starting point is 00:27:34 It's, of course, not quite so easy to measure with a template library where basically all the code is a template. But we do sometimes just look into the coverage analysis and check that all lines are actually covered, all lines of the core library. Okay. Yeah, I was wondering about that. Since it is a template library, there's virtually an infinite number of ways that people could use your library, right?
Starting point is 00:27:57 Yes. That, of course, makes it hard. And we know full well that 100% test coverage isn't always as great as it sounds. But we do try when writing tests to sort of not just cover obvious cases, but also go a bit left and right and test some combinations, some weird stuff and things.
Starting point is 00:28:20 And we are quite happy with the results because it very rarely happens that some user comes up and says, yeah, we found a real bug here. You need to fix this. So it seems to be an appropriate testing approach for this kind of library. And also one thing that helps us is that it's very small. Like the core of the PEGTL library is like 6,000 lines of code.
Starting point is 00:28:47 Okay. So we try to keep things small or lean and mean, as we say. And that, of course, also makes it much easier to get something like 100% test coverage, at least on the lines of code, compared to some 100,000 lines of code behemoth.
Starting point is 00:29:05 Right. I'm also curious, it sounded a lot like your collaboration with Daniel and the other contributors, that the collaboration is very important to maintaining high code quality for you. And I'm curious if you have any recommendations for how that works out
Starting point is 00:29:23 or tools that you use for how you communicate ideas while you're developing the next new feature. I don't really think that tooling is a particularly important part of the equation, at least not for such a small team with mostly just two people writing the code. So we use everything from Skype to emails and shared documents. Okay. But I have actually been thinking about this on occasion, and I think the main part is that we are very detached from our ideas. So it's like if one of us has an idea and makes a suggestion or shows the code to the other one, and the other one then says, yeah, no, this is not good, we have to change it that way, then it's better. We iterate very fast together, trying not to get hung up on our own ideas just because
Starting point is 00:30:16 they are our own ideas. So we fully embrace the fact that the first iteration or even the second or third iteration of any code, regardless of how brilliant the programmer, might not be the best. And that we have to just go on and look at it again and sleep over it and spend another week perhaps thinking if it's something difficult. And then we'll just talk about it again. And we'll go on with that until we are both happy and the result looks sort of as small and elegant as seems possible at the time. And yes, I sometimes think it's a bit of a shame that this is not really visible on the GitHub repository because we can't just paste all discussions on there.
Starting point is 00:31:08 But it is definitely an important part of the development process for us. All right. So my recommendation is to just collaborate closely and share ideas freely. Just talk about things. That helps a lot. I want to interrupt this discussion for just a moment
Starting point is 00:31:28 to bring you a word from our sponsors. Backtrace is a debugging platform that improves software quality, reliability, and support by bringing deep introspection and automation throughout the software error lifecycle. Spend less time debugging and reduce your mean time to resolution by using the first and only platform to combine symbolic debugging, error aggregation, and state analysis. At the time of error, Backtrace jumps into action, capturing detailed dumps of application and environmental state. Backtrace then performs automated analysis on process memory and executable code to classify errors and highlight important
Starting point is 00:32:00 signals such as heap corruption, malware, and much more. This data is aggregated and archived in a centralized object store, providing your team a single system to investigate errors across your environments. Join industry leaders like Fastly, Message Systems, and AppNexus that use Backtrace to modernize their debugging infrastructure. It's free to try, minutes to set up, fully featured with no commitment necessary. Check them out at backtrace.io slash cppcast. Most of the projects seem to be listed as C++11. Do you have any plans to update them to C++14, 17?
Starting point is 00:32:34 Are you looking at anything in C++20? Yeah, that's a subject that we've been starting to look into recently. Now, there are some sort of differences between the individual projects, like the PEGTL seems to work very well with C++11. There are only a few places where it would really benefit from jumping to 14 or 17. So we'll try to leave it on C++11 as long as possible. One reason for that is also that we know from personal experience and anecdotal evidence from what you read on the web or speaking to other friends in the IT sector that some companies are really very slow in updating compilers. And some of my close friends are still stuck on C++98.
Starting point is 00:33:27 And for them, even C++11 would be a great and important improvement. So in the name of wanting to reach as large audience as possible, we are sort of dragging our feet a little bit on that front. But we have also seen that the JSON library, for example, would probably benefit much more from going to C++14 or C++17. And since we are still, even after three years, at a pre-1.0 point, the possibility is absolutely there that we will launch it as C++14 or 14 or 17 library and we'll do all the cleanups and simplifications that can possibly be done by jumping to these newer
Starting point is 00:34:10 standards. So we are trying to keep an open eye in both directions, future and past, and then gauge where the best sort of cost benefit is, or where the most benefit can be reached without cutting off too many people. Out of curiosity, what standards or compilers are you able to use at work? Well, that changes a lot. One of the compilers I'm using a lot at the moment is actually GCC 4.8. Okay. And there are other things where I can use some newer compilers, but yeah, we simply don't always have the possibility, and I mean the generic we, to just update to the latest compiler and use the latest standard. Right. Oh yeah, I was just wondering if you were using 17 at work, or, for example, if there were particular features that you're like, oh, I would really like to use this feature
Starting point is 00:35:10 in the JSON library, since you said that's the one that you thought could benefit the most. Yes, there are definitely some things. Like in the JSON library we use string views, for example, and those aren't available in C++11 yet. So we have our own fully baked string view implementation that is used as a fallback in the JSON library. And that would, of course, be one great simplification if we could just cut that out and rely on the standard implementation. Right. So I'm curious what compilers you support for these projects. We try to support as many compilers as possible. So if you look into the continuous integration config files of the PEGTL, you'll find many GCC versions and many Clang versions on both Linux and Mac. And we've also had some community support
Starting point is 00:36:06 and contributions to enable the PEGTL to work on Android and on Windows with both GCC and Visual Studio. So for the 2018 landscape, I think we are covering all of the major bases and we're quite happy with that. Even though there is sometimes a bit of pain to support some particularly old or different compiler,
Starting point is 00:36:29 but so far the effort of all involved parties to support the C++ standards better and better is showing, and we're hoping that the additional effort to support Visual Studio will become less over time. Yeah, I just, since you mentioned it, looked at your Travis configuration file, and I don't think I've ever seen one quite so extensive. I believe you support basically every configuration that can be supported on Travis. Yeah, that's entirely possible.
Starting point is 00:37:03 In particular, Daniel has spent a lot of time in supporting as many platforms as possible and also just different compiler versions. And that is also, of course, part of the equation regarding quality. And it does help us sometimes catch things early when some other compiler gives some other warning that we haven't taken into consideration yet. So with this many configurations, and for our listeners, it seems pretty much every version of GCC from 4.8 to 8, every version of Clang from 3.5 to 6, and every version of Xcode that's available on Travis,
Starting point is 00:37:43 plus Android compilers, yes, it's a lot of compilers. Yeah, that's just the Travis. Then we've also got the Doozer and the AppVeyor for some other platforms. And we'll definitely have to talk about the Doozer, because that's one I'm personally not familiar with. And I don't believe we've had any guests. But I've had this problem, and I'm curious if it's ever been a problem for you, where an old compiler is giving a warning that a new compiler is not, and it turns out that the old compiler was not necessarily 100% correct and the new compiler has bug fixes. Do you run into issues like that at all with this many configurations that you're trying to support? It does happen sometimes.
Starting point is 00:38:25 But as I said, the overall pains of supporting that many platforms aren't that big. We usually only have very few places where we need specific workarounds. We also try to run very clean code. We compile with -pedantic and many warnings enabled. And yeah, it's not a big issue, at least not for these libraries. Obviously it might depend a bit on the kind of thing that you're doing, but for us, luckily, it hasn't been too bad so far. Okay, cool. And so we just mentioned AppVeyor and Doozer. So for AppVeyor, it looks like you've got the last couple of versions of Visual Studio plus MinGW compilers on here. And we've talked about AppVeyor on previous shows, but let's talk
Starting point is 00:39:12 about Doozer. What is Doozer? It's basically another continuous integration platform. And I have to admit that I'm personally also not all that familiar with it. I think Daniel stumbled over it and then decided that it was a good idea to support it since it had some more, I believe it was some different Linux distributions available or something like that. So not exactly an area where we expected any compatibility issues, seeing that we aren't particularly tied to an operating system. But just in order to cover all the bases, we tried it and we've got it working now.
Starting point is 00:39:51 So we're happy to have even more platforms tested automatically. Yeah, it does look like it's various distributions of Fedora and Ubuntu. Yes. So, and OS X. Is it also a free service then? I assume not something you're paying for?
Starting point is 00:40:11 No, all of the continuous integration services that we are using are at least free for us as an open source project. Right. Very cool. You also mentioned Conan support, and it looks like you directly work with Conan to expose the PEGTL as a Conan package. Is that right? Yes, we have recently been looking into packaging, and since Conan seems to be one of the up-and-coming package managers for C++ libraries, we decided to look into it, and we also got some support from the community,
Starting point is 00:40:47 and particularly from Julian, our third taocpp member. And yeah, we are happy to show up in as many package managers as possible, even though it is, in a sense, still a bit of a new subject for us. Like if you look into PEGTL packages for the different Linux distributions, you'll see that many of them are a bit out of date at this point. So we are still sort of in the learning phase
Starting point is 00:41:14 to see how to play together with the distributions or the package maintainers and who needs to push whom, when and trigger what so that the, yes, to keep everything a bit more up-to-date. But yeah, we are happy with how things are going. I've been saying this a lot, but yeah.
Starting point is 00:41:34 Well, that's good. Yeah, it is. At least I've been mostly looking at the PEGTL, and it seems like that's what we've been talking about the most. But there's actually a lot of projects on here, and you've mentioned the JSON library already. But do you maintain the same level of continuous integration and quality across all of the libraries that you have up here? We have been ramping up to the same level in the other libraries. They might not all be there yet, and some of the libraries are also so specialized that we don't really
Starting point is 00:42:06 expect very many users of them. The integer sequence and the tuple libraries aren't something that a normal user would use, because they'd use the standard implementation. So these are just some smaller, in a sense more educational, projects that Daniel put up into taocpp. But he is pushing to get the same level of continuous integration coverage on that and on all libraries. It's like once you've got the first library running on all the continuous integration platforms, then you know the quirks, you get an idea of how
Starting point is 00:42:46 to go about things and then it's just a matter of using the same config file and fixing all the bugs and all the warnings and all the compilers, which... Right. Yeah, I think this raises an interesting question, at least for me personally. It seems like it would be interesting to me if there was some top-level way in GitHub to say, well, these are the basic CMake warnings that I want to use and the basic Travis configuration that I want to use. But as far as I know, you're just going to have to copy and paste these things between projects whenever you update them. As far as you know, is there also no way to centralize this across your organization? Yes, that is actually an issue. There are several things that we would like to centralize. In particular, if you look at the JSON library, it's got a full copy of the PEGTL embedded into the source code.
Starting point is 00:43:39 So we copied it into the JSON library. We changed all the namespaces so that there wouldn't be a collision if any application were to use both the JSON and the PEGTL library independently of each other. And we are looking for solutions to this, and we are hoping that package management might help us there. If we have a JSON package
Starting point is 00:44:04 that depends on a PEGTL package, then that could make things easier. So for us, that's sort of the big question. The continuous integration config files aren't that large and don't change that often. So for us, it's more a question of reuse on a higher level with regards to all the shared code between all the libraries. I'm curious if you did consider already and then reject submodules for your Git projects. Yeah, I was wondering the same thing. Yeah, we've looked at it and we've tried it in a private context somewhere else. And we weren't sort of quite happy with it,
Starting point is 00:44:47 even though I can't recall the exact issues now. So there's the Git submodules and then there's the other sub thingy. So there are actually two different ways of how to embed one Git repository into another. And we might try one of them, but it also seemed to be slightly awkward regarding how you then have one project within the other one
Starting point is 00:45:09 and then the subproject still needs all its own configuration files and everything. So what would be ideal would be some kind of truly hierarchical system where you can say, we've got the overarching project and then all the individual parts. And this might of course be something
Starting point is 00:45:24 that we could be able to realize with submodules, but we haven't actually tried it yet. Okay. But this is actually one of the biggest issues that we're having at the moment, something that we've been thinking about a lot but haven't been able to dedicate enough time to actually go ahead and try something out yet. You know, Rob, I hate to admit this, but this all sounds like a strong argument for doing the, you know, single Git repository with all of the files in it, which we've talked about, which Google does. Titus Winters? Yeah. Yeah, so we talked about it with Titus, I think the first time we had him on. It does not make me comfortable, but I can see the issues that it would solve. Yeah. Yeah, the problem is I think it solves some issues
Starting point is 00:46:11 and then it creates some other ones. So yeah, now we may have 50 libraries in one Git repository. Versioning might be a bit of a nightmare when the development cycles aren't synced up. So yeah, everything would have to be versioned at once. I guess that's what, well, I mean, if you do the live at head mentality as well, where head is always stable, then I guess that matters less too. I don't know if that's a real solution
Starting point is 00:46:38 for the average organization or not, honestly. I'll say I think submodules works well as long as the project you're bringing in as a submodule isn't going to be updated that frequently. Like if it's already pretty stable, then I think it makes sense. That might be both a case for and against
Starting point is 00:46:58 submoduling the PEGTL into the JSON library, since it is very stable, but then again, it does have a lot of small changes, updates, and feature additions over time. Right. Going back to package managers, we mentioned Conan. Have you looked into any of the other package managers,
Starting point is 00:47:19 like vcpkg maybe? I believe that vcpkg is something that somebody is already working on. I think we've already got at least a PEGTL package there. It's definitely one of the package managers that we want to be in. And if it doesn't work out, we'll try to work on it ourselves. Of course, the hope is that the general community will somehow standardize on, let's say, not too many different package managers, just not to create too much overhead. Of course, that might or might not work out, as we know from many other subjects and libraries, etc. So we'll see how that plays out, but it's good that there
Starting point is 00:48:06 is definitely a lot of effort going on in the general C++ community at the moment to address this issue, and so I'm confident that something will come out of it that will be reasonably universal and accepted broadly. I find it funny that you mentioned that someone was already working on vcpkg, because with my open source library I got an issue a few months ago, or actually I don't know how long ago it was now, that was basically like, well, the vcpkg is out of date.
Starting point is 00:48:34 I'm like, the what is what, what? I had no idea that anyone had been working on it. I think it was before I was even really aware of vcpkg. They seem to be fairly aggressive in that community about getting as many packages in as they can. Yeah. And I suppose that's a good thing. Yeah, but it also does show up one of the perhaps less satisfactory aspects of open source. Sometimes you don't know who is using your libraries, or whatever it is, your projects, or for what. And you're sitting there wondering,
Starting point is 00:49:06 yeah, so do we have 100 people actively using it or 100,000, and what are they doing with it? And of course, with the PEGTL now, after 10 years of being public, we do have a certain stream of comments and feedback and even sometimes articles on the web that reference it. Like there was one that we liked very much because it concluded with "PEGTL rules".
Starting point is 00:49:29 So that was a nice thing to read. But we are very strict in wanting to keep the MIT license and be as liberal as possible, even though I have on occasion wondered, well, shouldn't we add a fourth clause? Like, yeah, everyone who uses this library has to send us a postcard stating where they are and, even just in very broad terms, what they're using it for. A postcard with a one euro coin. Yeah, that will go a long way, possibly. Yes.
Starting point is 00:50:08 So I did notice clang-format configuration files in some of the repositories. Are you using that consistently across all of the codebases as well? Yes, we are actually using it. Consistent formatting is not a priority, so it's not, how would you say, a first tier quality metric for code bases, but we do think it is nice to have everything formatted consistently. And at least it prevents issues with, you know, third party contributions being formatted too differently, and then having to
Starting point is 00:50:39 discuss where the cutoff is. So, you know, we have the clang-format and we just use it, and that settles that issue. It does seem like a good way to manage user contributions, if you're like, oh, they use tabs everywhere, please apply the clang-format file before submitting your patch or whatever. Yeah, and if somebody then doesn't do it or doesn't have clang-format installed, we just do it and, you know, push another commit straight afterwards. So that works well, no problem. Do you ever find it gets in your way at all? Like you intentionally wanted something to be formatted a particular way that goes outside of what clang-format is capable of doing? We do actually have that in the PEGTL in particular. When you're writing grammars,
Starting point is 00:51:23 you often have these many lines defining all the different grammar rules, one under each other, where you have like struct, your rule name, colon, and then the template expression that you're inheriting from. Right. And it really doesn't make any sense to put the curly braces from this struct definition into the next two lines. Right. So if you look into the PEGTL source code, you'll see that in places where this happens frequently, you'll have a big block of code that is excluded from the clang-format in order to keep the formatting style
Starting point is 00:52:00 that for this particular bit of code is more appropriate. But I can run with that question and say we are also using clang-tidy, and that is actually giving us a few more issues, at least in the JSON library. So in the PEGTL, we've only got, I think, about 10 places where we have a NOLINT comment to exclude a line from the clang-tidy.
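As a rough illustration of the two techniques just described, here is a minimal C++11 sketch. It assumes nothing about the PEGTL's real sources: the `sketch` namespace, the rule templates (`range`, `one`, `seq`, `plus`, `opt`), the grammar itself, and the `digit_value` helper are all hypothetical stand-ins defined only so the example compiles on its own. What it shows is a block of one-line grammar rules fenced off with the `// clang-format off` and `// clang-format on` comments, and a single line excluded from clang-tidy with a trailing `// NOLINT` comment.

```cpp
#include <type_traits>

// Hypothetical stand-ins for PEG-style rule templates, defined here only so
// the sketch compiles on its own. They are NOT the PEGTL's implementation.
namespace sketch {

template <char Lo, char Hi> struct range {};  // one character in [Lo, Hi]
template <char C> struct one {};              // exactly the character C
template <typename... Rules> struct seq {};   // all rules, in order
template <typename Rule> struct plus {};      // one or more repetitions
template <typename Rule> struct opt {};       // zero or one occurrence

// Grammar rules read best as aligned one-liners, so formatting is switched
// off for this block; depending on the configured style, clang-format might
// otherwise reflow the definitions or move the braces.
// clang-format off
struct digit   : range< '0', '9' > {};
struct sign    : one< '-' > {};
struct digits  : plus< digit > {};
struct integer : seq< opt< sign >, digits > {};
struct decimal : seq< integer, opt< seq< one< '.' >, digits > > > {};
// clang-format on

// Each rule is just a struct inheriting from a template expression.
static_assert(std::is_base_of<plus<digit>, digits>::value,
              "digits inherits from plus<digit>");
static_assert(std::is_base_of<seq<opt<sign>, digits>, integer>::value,
              "integer inherits from seq<opt<sign>, digits>");

// A trailing NOLINT comment asks clang-tidy to skip its checks on this line.
// Whether any check would actually fire here depends on the enabled checks,
// so treat this line purely as an illustration of the comment's placement.
constexpr int digit_value(char c) { return c - '0'; }  // NOLINT

}  // namespace sketch
```

With these definitions, `sketch::digit_value('7')` evaluates to 7 at compile time, and the rule block keeps its column alignment no matter which style the project's `.clang-format` file prescribes.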
Starting point is 00:52:24 In the JSON library, which throws a lot more exceptions, which does more low-level fiddling around due to the union, we have like 400 places where we exclude the clang-tidy from giving a comment. And we are sort of at the moment looking into reducing the number of checks that clang-tidy does for us, which is quite a task since it's got a lot. Or we might even ditch it. But at the moment, we're sort of trying to save the situation by cutting it down to a set of checks that works better for this particular project. I recently became aware of a feature of clang-tidy that I don't see used very often that might be helpful: you can specify a .clang-tidy file just like you can specify a .clang-format
Starting point is 00:53:16 file, so you can give, like, top-level project exclusions. Yeah, the issue is that with 400 places there are also many different cases at times, and with the list of different checks that it has, we have to dedicate some time just to understand the whole list and then try to choose wisely
Starting point is 00:53:39 and figure out which ones just aren't appropriate for this particular project because it does a bit more low-level stuff than some other libraries usually do. Well, it sounds like you all have a lot going on with all these different projects. Is there any project that we have not discussed yet that you would like to bring up? I'd say the three major libraries in taocpp are the PEGTL and the JSON library and also the Postgres library, which is a C++ wrapper around libpq. That is also very mature, even though we might still need to finish up on the documentation.
Starting point is 00:54:22 That was one of Daniel's private projects that he decided to share with the community. So for us, these are sort of the three poster child libraries, because as I mentioned, the other ones are very specialized and perhaps not generally useful. Whereas even though parsing and Postgres aren't perhaps something that you use in every project, it's not something strange. You might easily require a little parser, you might easily require a database, and JSON of course is everywhere nowadays,
Starting point is 00:54:55 so that's hopefully of general appeal. Well, it's been great having you on the show today, Colin, and obviously we'll put links to the libraries in the show notes. Is there anything else you want to let us know? Anything like where can people find you online? Yeah, I think you already mentioned the GitHub page, and you'll offer the link to everybody. We don't have any blogs or such for our development. We try to put every free minute into the libraries themselves. And yeah, so I'll just say thank you for having me.
Starting point is 00:55:37 It was great talking to you. And I'll be looking forward to listening to your upcoming CppCast. Okay. Awesome. Great having you on the show. Thank you. Thanks for coming on. Yeah, thank you. Bye-bye.
Starting point is 00:55:49 Bye. Thanks so much for listening in as we chat about C++. I'd love to hear what you think of the podcast. Please let me know if we're discussing the stuff you're interested in. Or if you have a suggestion for a topic, I'd love to hear about that too. You can email all your thoughts to feedback at cppcast.com. Thank you.
