CppCast - The Art of C++ Libraries
Episode Date: August 9, 2018

Rob and Jason are joined by Colin Hirsch to discuss his work on The Art of C++ collection of libraries, including PEGTL, json and more.

Dr. Colin Hirsch studied Computer Science at the University of Technology in Aachen, Germany in 1993 and later got a PhD in Mathematics from the same university. He worked for two years as a consultant for T-Mobile, developing back-end server applications in C++ and Lua. Later Colin moved to Italy, opened his own business and continued working for T-Mobile (now Deutsche Telekom) as well as working for some other interesting projects like Greenpeace and the Austrian ministry of ecology. In his free time he enjoys photography, being in nature, science fiction and spending time with his daughter.

News: Google Open Sources Filament rendering engine; CppCon 2018 Program; C++ Foundation Survey 2018-08; C++ on Sea Early Bird Tickets Available

Colin Hirsch: Colin Hirsch's GitHub

Links: The Art of C++; UmbriaLogic; PEGTL; json; postgres

Sponsors: Backtrace; CppCast Patreon

Hosts: @robwirving @lefticus
Transcript
Episode 162 of CppCast with guest Colin Hirsch
recorded August 8th, 2018.
This episode of CppCast is sponsored by Backtrace,
the turnkey debugging platform that helps you spend less time debugging
and more time building.
Get to the root cause quickly with detailed information at your fingertips.
Start your free trial at backtrace.io slash cppcast.
In this episode, we discussed the CppCon 2018 program.
Then we talked to Colin Hirsch.
Colin talks to us about The Art of C++ libraries.
Welcome to episode 162 of CppCast, the first podcast for C++ developers by C++ developers.
I am your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
I'm alright, Rob. How are you doing?
I'm doing okay. You know, weather's not too bad out here in North Carolina. How about you?
Oh, cool morning for me.
I'm actually wearing a sweatshirt at the moment, but... Very nice, very nice. Cooler than the rest of the planet.
Any news you wanted to bring up before we get going?
No, I don't think so. Not at the moment.
Okay.
Well, at the top of our episode, I'd like to read a piece of feedback.
This week, we got a tweet from Paul saying,
I feel like I'm cheating on CppCast with LambdaCast, but it is so good.
I had not heard of LambdaCast, but I looked them up.
They're about 20 episodes in.
They seem to be a functional programming-focused podcast.
So, sounds interesting.
Yeah, that's a completely different beast from what we're doing, I think.
Yeah, I wouldn't say you're cheating on us with them.
And if you were cheating on us, then you shouldn't let us know about it, maybe.
But, yeah, it's always great to have more technical programming podcasts out there, so we're happy to hear about LambdaCast, and you can go check them out if you're interested.
So we'd love to hear your thoughts about the show as well. You can always reach out to us on Facebook, Twitter, or email us at feedback@cppcast.com. And don't forget to leave us a review on iTunes.
Joining us today is Dr. Colin Hirsch.
Dr. Hirsch studied computer science at the University of Technology in Aachen, Germany in 93
and later got a PhD in mathematics
from the same university.
He worked for two years as a consultant for T-Mobile,
developing backend server applications
in C++ and Lua.
Later, Colin moved to Italy,
opened his own business,
and continued working for T-Mobile,
now Deutsche Telekom,
as well as working for some other interesting projects like Greenpeace and the Austrian Ministry of Ecology.
In his free time, he enjoys photography, being in nature, science fiction, and spending time
with his daughter.
Colin, welcome to the show.
Yeah, thank you.
And I'm very glad to be here with you today.
Thanks for joining us.
You know, I wondered if T-Mobile was headquartered in Aachen,
because I noticed that as an American with T-Mobile service,
I had the best service outside of the United States I've ever had
when I was last in Aachen.
No, their headquarters is actually in Bonn, Germany,
but that's just like an hour, an hour and a half from Aachen,
so sort of same region, same general area.
Okay. So yeah, 93, that was, you've been programming for a while then, I take it?
Yes, I started pretty early, actually. When I was 10 years old, I got my first computer,
even though most people looking at it mistook it for a calculator, because that was the size of
the thing.
But it did actually have a very small alphanumeric display, and you could write programs in BASIC as long as you stayed within the 524 bytes of free memory that it had.
What kind of computer was that? What was that?
It was a Casio PB110.
Okay. You've probably never heard of it before. I once looked up the tech specs.
It was fun to see there was actually a 4-bit microprocessor in there.
Are you checking to see if you have one on your desk or something?
No, I don't.
I have something similar-ish.
This is a Casio FX4000P, which was the first programmable calculator that I had.
And I think it also has a 4-bit microprocessor in it.
Yeah, it might have been the same thing, just sort of packaged into a different form factor.
Perhaps.
Interesting.
Of course, 1993, we were a bit further along the road, although it was sort of the fun times, you know, compiling your own Linux kernels with experimental compilers and that kind of thing.
Right, right.
Okay, well, Colin, we have a couple of news articles to discuss.
Feel free to comment on any of these,
and then we'll start talking to you more about the libraries you work on, okay?
Okay, great. That sounds fine.
Okay, so this first one is that Google has open sourced Filament, which is a physically based rendering engine for Android, Windows, Linux, and macOS.
I had not heard about this before, but I guess it was an internal Google project before, and they released this out to the wild.
Yeah, I mean, the pictures look pretty, but I feel like I'm at a loss of context for this. Like, is it intended for games, or for static ray tracing? With the building modeling energy simulation stuff that I've done in the past, these physically based ray tracers are used for things like determining how much light you'll actually have in a room on a given day of the year, based on the reflectivity of the wall colors and that kind of thing. Like, is that what this is intended for? Is it intended for actual physics modeling?
I don't know. I was also a bit confused about, like, is it for real time, or is it something where each picture takes, you know, like two hours on a high-end computer?
Which also shows that I'm not particularly into computer graphics lately.
Right.
But given that the first platform they mention is Android, does that kind of imply that it might be for...
It does say real-time physically-based rendering.
Ah, okay. Real time. Thanks.
I would think it's real time if you're running it on a phone.
Yeah, that's a good point.
Yeah, well, it looks like an interesting project. I definitely recommend everyone look at some of these screenshots being produced with the library. They're pretty impressive looking.
Yeah. Next, we have the entire CppCon 2018 schedule, which is now available, and that is only seven weeks away now, right, Jason?
Yes, right at seven weeks, it seems. And it looks like five, six, seven different tracks, depending on what particular hour of the day you're looking at.
Yeah, at least six tracks.
Yeah. Now, one question I had was, I know they've already announced at least some of the keynotes and plenaries, but all of the keynote and plenary time slots say TBA, and I wasn't sure why that might be.
They've already announced it.
They just need to decide who goes when.
We have Kate Gregory doing one.
We have that guy, I can't remember his name,
but he works on the video game graphics, I believe, or movie graphics.
Mark Elendt?
Yeah.
Elendt?
Yeah, I believe that's it.
Chandler.
Chandler's doing one.
See, I don't know why these aren't just listed on the schedule.
Well, that's an interesting point, because that's one, two, three,
and then it's kind of implied.
Well, it is even actually said here,
Herb and Bjarne will be speakers.
That's five.
That's basically everyone.
Yeah, I mean, maybe they're just shuffling around who's going where.
I don't know.
They also have all these open content sessions that are listed as TBA.
Yes, they haven't even announced the call for open content sessions yet.
How many of those did they do last year?
I know we talked about the Jewelbots with Sarah.
She gave an open content.
Was that the only one last year?
No, there tends to be, well, the morning and noon hour of every day is what I would have expected. It looks like this year there's also some open content in the evening. I don't know if they did that last year and I just wasn't paying attention.
Okay.
But pretty much anyone can submit that. So if you are going to be at CppCon and you're not giving a talk, or even if you are giving a talk and you've got some other thing that you want to propose as a talk that doesn't go through the normal review process, you have the ability to just give something during one of the non-mainstream times. They'll announce that probably pretty soon for you to submit talks for that.
Yeah, and these open content talks, like you said,
they're kind of all not going to interfere with session time.
So there's one at 8 a.m., one at 12:30, which is kind of the lunch hour, and there's one at night, kind of at the same time as possible lightning talks.
Yeah. Colin, are you going to CppCon?
No, unfortunately, I will not be making it, but
I have to say the program is quite impressive and I think we could do two of these
CppCon sessions just going over it in more detail.
And I will definitely try to get to some major C++ conference
next year, even though sitting in Europe it might be one of the other ones.
So I'll see about that
Definitely. Well, Jason, you were at Meeting C++ last year, and that was quite a good conference for you, right?
Yes, although unfortunately I was only able to attend two days of it due to Air Berlin going out of business, and I had to rearrange my flight schedule.
It happens. Okay, next thing: the C++ Foundation developer survey.
Yes, this is out there. Go ahead, Jason.
I was going to say, they did one of these about six months ago.
Yes, so this is a follow-up. They just apparently want to keep surveying the community to see how you're using C++. This one seems to be related to the cloud.
Okay, I haven't looked through it yet.
The first one, which I think I participated in, sort of covered all the basics. And it looks like they're sort of choosing one subject at a time and will just dig into it.
And I think that's a pretty good idea to get some more feedback. I just hope that enough people participate so that it gives a sort of reasonably representative analysis
and data on the situation of as many C++ developers as possible.
Yeah, and it looked like the information they got,
like you said, last time was pretty valuable.
And I could see them even doing the same surveys
every 18 or 24 months
just to see how the landscape is changing as well.
That's true.
I would, however, be interested in knowing why cloud was chosen now for the first subject-dedicated survey.
So I'll be looking forward to seeing the results. And not just the survey results, but also what they actually make of it and what will happen next in the C++ committee and related areas.
Yeah. Well, listeners should definitely check out the survey. It should only take a few minutes to get your results in.
Right. And then the last thing: we've talked to Phil Nash before about the conference that he's starting, the C++ on Sea conference, and if you're interested in going to that, early bird tickets are now available.
Yes, and I think an important note here, as Phil has explicitly said: if you have submitted a talk and you are waiting to hear whether or not it has been accepted, you do not need to rush out and buy the early bird ticket, because you will still be eligible for early bird even if the notifications are late for your talk submission.
That's a good thing for him to do.
And the early bird ticket, it says, is until September 9th.
Yes.
So that's one pretty close to you to consider going to, Colin, if you're looking for a...
Yes, it is actually.
I definitely will check my calendar for February after we finish here, and that might be a good place for me to go.
Yes, and you know, in Southern England in February, you'll cool off, even if it's a bit too late compared to now.
Yes, definitely. I've been to Southern England so often. I used to have grandparents there, so I'd be looking forward to going there again.
I only have...
Well, I've been to Southern England twice,
but it's been...
Oh, goodness, make me feel old.
Like 20 years since I was last there.
22 years.
Yeah.
Okay.
Well, Colin, where should we start?
You, I believe, are the main contributor
to The Art of C++, which is available on GitHub at github.com/taocpp.
Do you want to tell us a bit about it?
Yeah, so we're actually mostly two people working on this sort of small collection of C++ libraries, the other being Daniel Frey, with whom I spent some time at the university in Aachen, where we met.
And we've also spent some time working together in our professional jobs.
And now we've got this project running.
And it's very much a collaboration between the two of us.
And the collection is sort of rather mixed. We've got some things that Daniel had on his hard disk for a while and then just wanted to share with the world.
And then we've got now two major libraries, I'd say, one of which has been going since 2007; that's when I started the PEGTL.
And basically, Daniel wanted sort of an umbrella under which we could gather all of our open source efforts and put them up in one place and let them live there and be worked on.
And the nice thing is we've also gathered a third person now, Julian, from Brazil, I believe.
And he's helping us with some CMake-related things and also the Conan packaging.
So this is, of course, a very nice milestone for an open-source project
when you suddenly have other people contributing and chiming in
and helping you get on with things.
So we're quite happy with the way things are going,
even though being a library or a collection of libraries
only developed in our free time, development is sometimes slow,
since obviously we can't always dedicate all of our free time to it.
But we like the direction it's going in,
and we are generally happy with what's happening there.
You know, starting talking about the beginning of the project
and you said basically Daniel had some work on his hard drive
that he wanted to share.
And this concept of, you know,
basically letting the internet be your backup.
That's, I don't know, it's just kind of funny.
I think that was something that Linus basically said
when he started Linux.
So maybe you're off to a good start here.
Yeah, I'm not quite sure whether the backup aspect was the important one,
but it's definitely a good side effect of things.
Yes, absolutely.
So specifically, you mentioned first the PEGTL,
as I think that's how you pronounced it, the P-E-G-T-L library.
Do you want to talk about that?
And then we'll dig into some of the others.
Yes, I think the PEGTL library is very typical of how our open source efforts got started.
It was like I had stumbled over a library called Yard by Christopher Diggins, which was a small parser library he had written.
And he was sort of trying out the idea,
what happens if you write a parser combinator library
and you use C++ templates and template instantiations
as your domain-specific languages,
or as your domain-specific language to express the grammar in language,
in the C++ language.
So your combinators are classes,
and to combine them,
you write down template instantiations
in your source code.
So the syntax is obviously very different
compared to something like, you know, the normal EBNF or whatever we use for grammars.
But the nice thing is that you don't need a preprocessor
and you can just put it into your source code, into your C++ source.
And the question that I asked then was, okay, so this Yard library is still pre-C++11, so pre-variadic templates.
And this means that he had to jump through all the hoops to make the templates such that they could accept a certain number of arguments, but you could also use them with fewer if you wanted to.
But there was always an upper limit,
and it always made the code more complicated than necessary.
And in 2007, we already had GCC 4.3, I believe it was,
and that had support for variadic templates.
And so I sort of took the idea and just re-implemented it in C++11, or at least in the part of C++11 available at the
time with the variadic templates. And it turned out that that made many things very much easier.
And from that point onward, the project sort of took on a life on its own. I then
made an input layer that was a bit more flexible than the original Yard library.
And I wrote documentation and I didn't bake the semantic actions into the library,
but rather just added a mechanism by which semantic actions could be added to a grammar by the user.
And yeah, so the rest is history.
And one interesting or funny tidbit of history
is that the first version that I put up
on Google Code Hosting at the time
actually carries the version number 0.9.
And the interesting thing is I fully expected
the next version, like three weeks later, to be 1.0.
And that obviously didn't quite work out as intended.
We went through several years until we arrived at like 0.32.
And then it took another few years before Daniel and I together took on the job of major refactoring that then culminated in version 1.0.
That's funny. I personally love things to be lexicographically sortable. Like, I use ISO date format anywhere I can. Makes me wonder if maybe we should start our first project as like
0.00.0. So then we can make sure it's sortable from there on out.
But I'm looking at the examples on GitHub here, and it is definitely a technique that I have not,
I think, seen before. So you basically have struct integer colon. So now you're deriving
from something. And you've got a sequence, which is an optional with one plus or minus, and then one or more digits, it looks like.
And then you basically create an integer parser in a single line with a bunch of inheritance, basically.
Is that correct?
Yes, that's correct.
We're sort of constrained by the C++ template syntax here.
So in an EBNF, you would have the digit
followed by the symbol plus.
And here we just have to write plus
with digit as a template parameter.
And one of the first questions we then always get asked is,
why do we derive?
Why do we write struct integer?
Why don't we use a typedef or using
to give a new
name to this particular template instantiation? And the answer is that in the error reporting,
it's usually a good idea to use struct because then you get an error parsing integer instead
of an error parsing seq, opt, one, et cetera, et cetera.
That's interesting, because that could also help potentially with compile times, because the length of the identifier that the compiler is passing around is smaller, the type that is, in some cases, from some of this stuff I've seen, like from Joel Falcou, about compile time optimization.
But that's an interesting aspect. We do generally try to do some optimizations for compile time, but the length of identifiers hasn't been an explicit goal of ours so far.
It's the aggregate size of the symbol table that seems to have some impact on compile time, and if you can keep it smaller... So that might be helping anyhow. I don't know. I'm definitely not an expert in that.
But yeah, so I've never seen a technique like this, though.
Have you considered going the route that Boost Spirit uses with expression templates to accomplish a similar thing?
After all these years, we've basically gotten used to this approach. And we also like that we give the compiler the burden, basically, of optimizing our grammar.
And we also, once you sort of take this step away from the C++ operators, and you have the words, you aren't sort of constrained by anything. We have, if you look at the rules reference,
we've got really tons of different combinators
that make it easier to write grammars
because if you want a comma-separated list of space-separated items,
there's already a combinator for that.
And you can use that out of the box.
Just want to take a step back.
What are some of the use cases of a parsing expression grammar that you might use this library for?
Well, we've used it for some smaller DSLs for config parsing.
And one of the major use cases is also the taocpp/json library, which has a PEGTL-based parser.
Even though there we went a bit beyond what the PEGTL offers and have some additional optimizations in there, which, however, also shows that the PEGTL is very open and modular.
So it's very easy to say, at this particular point, I want something special, either semantically or from a performance point of view, and you can just add it, and it will work together with all the rest without any big glue code or whatever.
So you just have to adhere to certain API guidelines, and then you can just plug it all together.
Well, so now you've mentioned the JSON library,
and it sounds like it's a pretty good use case for PEGTL. What kind of compile times are you getting with JSON, since you said that you've tried to optimize compile time somewhat?
It's just that we occasionally look at how things are going.
And in the PEGTL in particular, we look at reducing template instantiation depth, since you have tons of nested templates.
We try that within the PEGTL, so we don't have like 10 additional layers for every single user-visible layer in the grammar.
But we don't benchmark that. It's
not so bad in practice. So that's very hand-waving, but that's the best I've got here.
No, I mean, that makes actually a lot of sense. If the compile times are bad, then you would say,
well, we do get a lot of complaints from our users about compile times,
and you didn't say that, so it must be okay.
Yeah, either users have a thick skin or computers are really fast nowadays.
And we also, I think somewhere in the documentation,
we recommend sticking PEGTL grammars into a source file
rather than in a header file,
so that it will be compiled only once in every project.
That sometimes won't be possible, but it's a small hint that can help people keep the compile time a bit lower.
Right. So how do you go about ensuring that your projects stay high quality?
Well, that's a very big question, and I could probably talk an hour about it, but I think one of the main
parts is just that
we take a lot of time for things
so sometimes
when there's a change
you see a change on GitHub or you see
an additional feature pop up on GitHub
what you don't see is that we might
have discussed that for three weeks
playing ping pong with ideas
until we are happy with the result.
So this is time, together with gut feeling, and just working together and trying to leverage all of our joint experiences, and not just having one person working on complicated parts.
Then, of course, we've got sort of the usual stuff. We've got a large test suite, in particular for the PEGTL, which, being an 11-year-old project, is pretty mature,
and we've got full test coverage. And we've spent a lot of time on that too. And then there are also some parts that are interesting
because they come sort of from actually putting it up as open source.
Like, if I imagine I'd have written the PEGTL just for my personal use,
I'd have the code, I'd have the tests,
but I probably wouldn't have that much documentation
for my own library, for my own personal use.
But since we are trying to reach a large audience, but I probably wouldn't have that much documentation for my own library, for my own personal use.
But since we are trying to reach a large audience, we are also investing a lot of time in the documentation.
And sometimes you just come to a point where you are writing the documentation and you think,
well, this shouldn't be so complicated. Why is this so hard to explain?
And even then we go back and look at the API and look at the complexity and whether we can change it in a way, which for us is also a part of quality. So quality is ease of use and
understandability, as well as simplicity of implementation and test coverage and, well,
of course, correct code as correct as possible.
Okay, so you said that you have full test coverage with the PEGTL.
Yeah, at least we think we have.
It's, of course, not quite so easy to measure with a template library where basically all the code is a template.
But we do sometimes just look into the coverage analysis
and check that all lines are actually covered, all lines
of the core library.
Okay.
Yeah, I was wondering about that.
Since it is a template library, there's virtually an infinite number of ways that people could
use your library, right?
Yes.
That, of course, makes it hard.
And we know full well that 100% test coverage isn't always as great as it sounds.
But we do try when writing tests
to sort of not just cover obvious cases,
but also go a bit left and right
and test some combinations,
some weird stuff and things.
And we are quite happy with the results
because it very rarely happens that some user comes up and says,
yeah, we found a real bug here.
You need to fix this.
So it seems to be an appropriate testing approach
for this kind of library.
And also one thing that helps us is that it's very small.
Like, the core of the PEGTL library is like 6,000 lines of code.
Okay.
So we try to keep things small
or lean and mean, as we say.
And that, of course,
also makes it much easier
to get something like 100% test coverage,
at least on the lines of code,
compared to some 100,000 lines of code.
Right. The behemoth.
I'm also curious,
it sounded a lot like your collaboration
with Daniel and the other contributors
that the collaboration is very important
to maintaining high code quality for you.
And I'm curious if you have any recommendations
for how that works out
or tools that you use
for how you communicate ideas while you're developing the next new feature.
I don't really think that tooling is a particularly important part of the equation, at least not for such a small team with mostly just two people writing the code.
So we use everything from Skype to emails and shared documents.
Okay.
But I have actually been thinking about this on occasion, and I think the main part is that we are very detached from our ideas.
So it's like, if one of us has an idea and makes a suggestion, or shows the code to the other one, and the other one then says, yeah, no, this is not good, we have to change it that way, then it's better. We iterate together in a fast fashion, without getting hung up on our own ideas just because they are our own ideas.
So we fully embrace the fact that the first iteration, or even the second or third iteration, of any code, regardless of how brilliant the programmer, might not be the best.
And that we have to just go on and look at it again and sleep over it and spend another week perhaps thinking if it's something difficult. And then we'll just talk about it again.
And we'll go on with that until we are both happy
and the result looks sort of as small and elegant
as seems possible at the time.
And yes, I sometimes think it's a bit of a shame
that this is not really visible on the GitHub repository
because we can't just paste all discussions on there.
But it is definitely an important part
of the development process for us.
All right.
So my recommendation is to just collaborate closely
and share ideas freely.
Just talk about things.
That helps a lot.
I want to interrupt this discussion for just a moment
to bring you a word from our sponsors.
Backtrace is a debugging platform that improves software quality,
reliability, and support by bringing deep introspection
and automation throughout the software error lifecycle.
Spend less time debugging and reduce your mean time to resolution
by using the first and only platform to combine symbolic debugging, error aggregation, and state analysis. At the time of error, Backtrace jumps
into action, capturing detailed dumps of application and environmental state. Backtrace then performs
automated analysis on process memory and executable code to classify errors and highlight important
signals such as heap corruption, malware, and much more. This data is aggregated and archived in a centralized object store,
providing your team a single system to investigate errors across your environments.
Join industry leaders like Fastly, Message Systems, and AppNexus
that use Backtrace to modernize their debugging infrastructure.
It's free to try, minutes to set up, fully featured with no commitment necessary.
Check them out at backtrace.io slash cppcast.
Most of the projects seem to be listed as C++11.
Do you have any plans to update them to C++14, 17?
Are you looking at anything in C++20?
Yeah, that's a subject that we've been starting to look into recently.
Now, there are some sort of differences between the individual
projects. Like, the PEGTL seems to work very well with C++11. There are only a few places where it
would really benefit from jumping to 14 or 17. So we'll try to leave it on C++11 as long as possible. One reason for that is also that we know from personal experience and anecdotal evidence
from what you read on the web or speaking to other friends in the IT sector that some
companies are really very slow in updating compilers.
And some of my close friends are still stuck on C++98.
And for them, even C++11 would be a great and important improvement.
So in the name of wanting to reach as large an audience as possible,
we are sort of dragging our feet a little bit on that front.
But we have also seen that the JSON library, for example, would
probably benefit much more from going to C++14 or C++17. And since we are still,
even after three years, at a pre-1.0 point, the possibility is absolutely there that we will
launch it as a C++14 or 17 library, and we'll do
all the cleanups and simplifications that can possibly be done by jumping to these newer
standards. So we are trying to keep an open eye in both directions, future and past, and then gauge
where the best sort of cost benefit is or where the most benefit can be reached without cutting
off too many people.
Out of curiosity, what standards or compilers are you able to use at work?
Well, that changes a lot. One of the compilers I'm using a lot at the moment is actually GCC 4.8.
Okay.
And there are other things where I can use some newer compilers, but, yeah, we simply don't always have the possibility. I mean, generically, we aren't able to just update to the latest compiler and use the latest standard.
Right. Oh yeah, I was just wondering if you were using 17 at work, or, for example, if there were particular features that you're like, oh, I would really like to use this feature in the JSON library, since you said that's the one that you thought could benefit the most.
Yes, there are definitely some things. Like, in the JSON library we use string views, for example, and those aren't available in C++11 yet. So we have our own fully baked string view implementation that is used as a fallback in the JSON library.
And that would, of course, be one great simplification if we could just cut that out and rely on the standard implementation.
Right.
So I'm curious what compilers you support for these projects.
We try to support as many compilers as possible, so if you look into the continuous integration config files of the PEGTL, you'll find
many GCC versions and many Clang versions, on both Linux and Mac.
and contributions to enable the PEGTL
to work on Android and on Windows
with both GCC and Visual Studio.
So for the 2018 landscape,
I think we are covering all of the major bases
and we're quite happy with that.
Even though there is sometimes a bit of pain
to support some particularly old or different compiler,
but so far the effort of all involved parties
to support the C++ standards better and better is showing,
and we're hoping that the additional effort to support Visual Studio
will become less over time.
Yeah, I just, since you mentioned it, looked at your Travis configuration file,
and I don't think I've ever seen one quite so extensive.
I believe you support basically every configuration that can be supported on Travis.
Yeah, that's entirely possible.
In particular, Daniel has spent a lot of time
in supporting as many platforms as possible and also just different compiler versions.
And that is also, of course, part of the equation regarding quality. And it does help us sometimes
catch things early when some other compiler gives some other warning that we haven't taken into consideration yet.
So with this many configurations, and for our listeners,
it seems pretty much every version of GCC from 4.8 to 8,
every version of Clang from 3.5 to 6,
and every version of Xcode that's available on Travis,
plus Android compilers. Yes, it's a lot of compilers.
Yeah, that's just the Travis. Then we've also got the Doozer and the AppVeyor for some other
platforms. And we'll definitely have to talk about the Doozer, because that's one I'm
personally not familiar with, and I don't believe we've had any guests on about it. But I've had this
problem, and I'm curious if it's ever been a problem for you, where an old
compiler is giving a warning that a new compiler is not, and it turns out that the old compiler
was not necessarily 100% correct and the new compiler has bug fixes. Do you run into issues like that at all
with this many configurations that you're trying to support? It does happen sometimes.
But as I said, the overall pains of supporting that many platforms aren't that big.
We usually only have very few places where we need specific workarounds.
We also try to run very clean code.
We compile with -pedantic and many warnings enabled.
And yeah, it's not a big issue, at least not for these libraries. Obviously it might depend a bit on the kind of thing that
you're doing, but for us, luckily, it hasn't been too bad so far. Okay, cool. And so we just
mentioned AppVeyor there and Doozer. So AppVeyor, it looks like you've got the last couple of versions of Visual Studio
plus MinGW compilers on here. And we've talked about AppVeyor on previous shows, but let's talk
about Doozer. What is Doozer? It's basically another continuous integration platform. And
I have to admit that I'm personally also not all that familiar with it. I think Daniel stumbled over it and then decided that it was a good idea to support it
since it had some more, I believe it was some different Linux distributions available
or something like that.
So not exactly an area where we expected any compatibility issues,
seeing that we aren't particularly tied to an operating system.
But just in order to cover all the bases,
we tried it and we've got it working now.
So we're happy to have even more platforms
tested automatically.
Yeah, it does look like it's various distributions
of Fedora and Ubuntu.
Yes.
So, and OS X.
Is it also a free service then?
I assume not something you're paying for?
No, all of the continuous integration services that we are using
are at least free for us as an open source project.
Right. Very cool.
You also mentioned Conan support,
and it looks like you directly work with Conan to
expose PEGTL as a Conan package. Is that right? Yes, we have recently been looking into
packaging, and since Conan seems to be one of the up-and-coming package managers for C++ libraries,
we decided to look into it. And we also got some support from the community,
and particularly from Julian, our third taocpp member.
And yeah, we are happy to show up in as many package managers as possible,
even though it is, in a sense, still sort of a bit of a new subject for us.
Like if you look into PEGTL packages
for the different Linux distributions,
you'll see that many of them
are a bit out of date at this point.
So we are still sort of in the learning phase
to see how to play together
with the distributions
or the package maintainers
and who needs to push whom when,
and trigger what,
to keep everything a bit more up-to-date.
But yeah, we are happy with how things are going.
I've been saying this a lot, but yeah.
Well, that's good.
Yeah, it is.
At least I've been mostly looking at PEGTL,
and it seems like that's what we've been talking about the most.
But there's actually a lot of projects on here and you've mentioned the JSON library already but do you
maintain the same level of continuous integration and quality across all of the libraries that you
have up here? We have been ramping up to the same level in the other libraries. They might not all be
there yet, and some of the libraries are also so specialized that we don't really
expect very many users of them, like the integer sequence and the tuple library isn't something
that a normal user would use because they'd use the standard implementation. So these are just
some smaller, in a sense, more educational projects that Daniel put up into taocpp.
But he is pushing to get the same level of continuous integration coverage on that
and on all libraries.
It's like once you've got the first library running on all the platforms,
on all the continuous integration platforms, then you know the quirks,
you get an idea of how
to go about things, and then it's just a matter of using the same config file and fixing all the bugs
and all the warnings on all the compilers. Right. Yeah, I think this raises an interesting
question, at least for me personally. It seems like it would be interesting to me if there was some top-level way in GitHub to say, well, these are the basic CMake warnings that I want to use and the basic Travis configuration that I want to use.
But as far as I know, you're just going to have to copy and paste these things between projects whenever you update them.
As far as you know, is there no way to centralize this across your organization?
Yes, that is actually an issue.
There are several things that we would like to centralize.
In particular, if you look at the JSON library, it's got a full copy of the PEGTL embedded into the source code.
So we copied it into the JSON library. We changed all the namespaces so that there wouldn't be a collision
if any application were to use
both the JSON and the PEGTL library
independent of each other.
And we are looking for solutions to this
and we are hoping that package management
might help us there.
If we have a JSON package
that depends on a PEGTL package,
then that could make things easier. So for us, that's sort of the big question. The
continuous integration config files aren't that large and don't change that often.
So for us, it's more a question of reuse on a higher level with regards to all the shared code between all the libraries.
I'm curious if you did consider already and then reject submodules for your Git projects.
Yeah, I was wondering the same thing.
Yeah, we've looked at it and we've tried it in a private context somewhere else.
And we weren't sort of quite happy with it,
even though I can't recall the exact issues now.
So there's the Git submodules and then there's the other sub thingy.
So there are actually two different ways
of how to embed one Git repository into another.
And we might try one of them,
but it also seemed to be slightly awkward
regarding how you then have one project
within the other one
and then the sub project still needs
all its own configuration files and everything.
So what would be ideal
would be some kind of truly hierarchical system
where you can say,
we've got the overarching project
and then all the individual parts.
And this might of course be something
that we could be able to realize with submodules, but we haven't actually tried it yet. Okay. But this is
actually one of the biggest issues that we're having at the moment, something that we've
been thinking about a lot but haven't been able to dedicate enough time to actually go ahead and
try something out yet.
You know, Rob, I hate to admit this, but
this all sounds like a strong argument for doing the single Git repository with all of
the files in it, which we've talked about with, you know, Google does, Titus Winters. Yeah, so we talked
about that with Titus, I think the first time we had him on, because it does not make me comfortable,
but I can see the issues that it would solve. Yeah, the problem is, I think it solves some issues
and then it creates some other ones. So now we may have 50 libraries in one Git repository;
versioning might be a bit of a nightmare when the development cycles aren't synced up, so
everything would have to be versioned at once.
I guess that's what, well, I mean,
if you do the live-at-head mentality as well,
where head is always stable,
then I guess that matters less too.
I don't know if that's a real solution
for the average organization or not, honestly.
I'll say I think submodules works well
as long as the project you're bringing in
as a submodule
isn't going to be updated that frequently.
Like if it's already pretty stable,
then I think it makes sense.
That might be both a case for and against
submoduling the PEGTL into the JSON library
since it is both very stable,
but then again, it does have
a lot of small changes,
updates, and feature additions over time.
Right.
Going back to package managers, we mentioned Conan.
Have you looked into any of the other package managers,
like vcpkg maybe?
I believe that vcpkg is something that somebody is already working on.
I think we've already got at least a PEGTL package there. It's definitely one of the
package managers that we want to be in. And if it doesn't work out, we'll try to work on it
ourselves. Of course, the hope is that the community, that the general community
will somehow standardize on, let's say, not too many different package managers, just not to create
too much overhead. Of course, that might or might not work out, as we know from many other different
subjects and libraries, etc. So we'll see how that plays out, but it's good that there
is definitely a lot of effort going on in the general C++ community at the moment to
address this issue, and so I'm confident that something will come out of it that will be
reasonably universal and accepted broadly. I find it funny that you mentioned that someone was
already working on vcpkg, because
with my open source library I got an
issue a few months ago, or actually I don't
know how long ago it was now, that was basically like,
well, the vcpkg package is out of date.
I'm like, the what is what?
I had no idea that anyone had been
working on it. I think it was before I was even really
aware of vcpkg.
They seem to be fairly aggressive
in that community of getting as many packages in as they can. Yeah, and I suppose that's a good thing. But
it also does show up one of the perhaps less satisfactory aspects of open source: sometimes you
don't know who is using your libraries, or whatever it is, your projects, for what. And you're sitting there wondering,
yeah, so do we have 100 people actively using it
or 100,000, and what are they doing with it?
And of course, with the PEGTL now,
after 10 years of being public,
we do have a sort of a certain stream of comments
and feedback and even sometimes articles on the web
that reference it.
Like there was one that we liked very much because it concluded with "PEGTL rules."
So that was a nice thing to read.
But we are very strict in wanting to keep the MIT license
and be as liberal as possible,
even though I have on occasion wondered,
well, shouldn't we add a fourth clause?
Like, yeah, everyone who uses this library has to send us a postcard stating where they are
and, even just in very broad terms, what they're using it for. A postcard with a one euro coin.
Yeah, that will go a long way, possibly. Yes.
So I did notice Clang format configuration files in some of the repositories.
Are you using that consistently across all of the codebases as well?
Yes, we are actually using it.
So, consistent formatting is not a priority.
It's not, how would you say,
a first-tier quality metric for code bases. We do think
it is nice to have everything formatted consistently. And at least it prevents issues
with, you know, third party contributions being formatted too differently, and then having to
discuss where the cutoff is. So you know, we have the clang-format config and we just use it, and that settles that issue.
It does seem like a good way to manage user contributions, if you're like, oh, they use tabs
everywhere, please apply the clang-format file before submitting your patch, or whatever. Yeah,
and if somebody then doesn't do it, or doesn't have clang-format installed, we just do it and,
you know, push another commit straight afterwards. So that works
well, no problem. Do you ever find it gets in your way at all? Like you intentionally wanted
something to be formatted a particular way that goes outside of what Clang format is capable of
doing? We do actually have that in the PEGTL in particular. When you're writing grammars,
you often have these many lines
defining all the different grammar rules, one under each other, where you have like struct,
your rule name, colon, and then the template expression that you're inheriting from.
Right. And it really doesn't make any sense to put the curly braces from this struct definition
into the next two lines. Right. So if you look into the PEGTL source code, you'll see that in places
where this happens frequently, you'll have
a big block of code that is excluded from the
Clang format in order to keep the formatting style
that for this particular bit of code is more appropriate.
But I can run with that question and
say we are also using clang-tidy,
and that is actually giving us a few more issues,
at least in the JSON library.
So in the PEGTL, we've only got, I think,
about 10 places where we have a no-lint comment
to exclude a line from the Clang-Tidy.
In the JSON library, which throws a lot more exceptions,
which does more low-level fiddling around due to the union,
we have like 400 places where we exclude the Clang-Tidy from giving a comment.
And we are sort of at the moment looking into reducing the number
of checks that Clang Tidy does for us, which is quite a task since it's got a lot. Or we might
even ditch it. But at the moment, we're sort of trying to save the situation by cutting it down
to a set of checks that works better for this particular project.
I recently became aware of a feature of clang-tidy that I don't see used very often that might be helpful: you can specify a .clang-tidy file, just like you can specify a .clang-format
file, so you can give like top-level project exclusions. Yeah, the issue is that with 400 places
there are
also many different cases at times
and the list
of different checks that it has, we
have to dedicate some time just
to understand the whole list and then
try to choose wisely
and figure out
which ones just
aren't appropriate for this particular project
because it does a bit more low-level stuff than some other libraries usually do.
Well, it sounds like you all have a lot going on with all these different projects.
Is there any project that we have not discussed yet that you would like to bring up?
I'd say the three major libraries in taocpp are the PEGTL and the JSON library, and also the Postgres library, which is a C++ wrapper around libpq.
That is also very mature, even though we might still need to finish up on the documentation.
That was one of Daniel's private projects that he decided to share with the community.
So for us, these are sort of the three poster child libraries, because as I mentioned, the
other ones are very specialized and perhaps not generally useful.
Whereas even though parsing and Postgres aren't perhaps something that you use in any project,
it's not something strange.
You might easily require a little parser,
you might easily require a database,
and JSON, of course, is everywhere nowadays.
So that's hopefully of general appeal.
Well, it's been great having you on the show today, Colin,
and obviously we'll put links
to the libraries in the show notes. Is there anything else you want to let us know,
like where can people find you online? Yeah, I think you already mentioned the GitHub page,
and you'll offer the link to everybody. We don't have any blogs or such for our development.
We try to put every free minute into the libraries themselves.
And yeah, so I'll just say thank you for having me.
It was great talking to you.
And I'll be looking forward to listening to your upcoming CPP cast.
Okay.
Awesome. Great having you on the show.
Thank you.
Thanks for coming on.
Yeah, thank you.
Bye-bye.
Bye.
Thanks so much for listening in as we chat about C++.
I'd love to hear what you think of the podcast.
Please let me know if we're discussing the stuff you're interested in.
Or if you have a suggestion for a topic, I'd love to hear about that too.
You can email all your thoughts to feedback at cppcast.com. Thank you.