CppCast - Compile Time Regular Expressions
Episode Date: October 19, 2018
Rob and Jason are joined by Hana Dusíková to discuss her compile time regular expressions library, the Prague user group and her proposal for implicit constexpr.
Hana is working as a senior researcher in Avast Software. Her responsibility is exploring new ideas and optimizing existing ones. She also propagates modern C++ techniques and libraries in internal techtalks and gives talks at local C++ meetups. She studied computer science at Mendel university and subsequently taught several courses there, including: Data Structures, Computability and Complexity, and Formal Languages and Automata.
News: ACCU 2019 Call For Papers; "auto to stick"; GNU Tools Cauldron 2018 Videos online; Visual Studio 2017 and Visual Studio for Mac Support Updates
Hana Dusíková: @hankadusikova; Hana's GitHub
Links: Compile Time Regular Expression v2; CppCon 2018: Hana Dusíková "Compile Time Regular Expressions"; Compile Time Regular Expressions Presentation Slides; Avast Prague C++ Meetup; P1235R0: Implicit constexpr
Sponsors: Download PVS-Studio; We Checked the Android Source Code by PVS-Studio, or Nothing is Perfect
Hosts: @robwirving; @lefticus
Transcript
Episode 171 of CppCast with guest Hana Dusíková, recorded October 16th, 2018.
Today's sponsor of CppCast is the PVS Studio team.
PVS Studio can be considered both as a tool for finding errors and typos,
and a static application security testing tool.
The tool supports the analysis of C, C++, and C# code.
In this episode, we discuss "auto to stick".
Then we talk to Hana Dusíková. Hana talks to us about her compile time
regular expressions library,
and much more.
Welcome to episode 171 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
I'm doing all right, Rob. How are you doing?
Doing okay. You got a busy few weeks up ahead for you, right?
Yeah, a little bit of overseas training coming up again and some U.S. training as well.
Okay. So yeah, the schedule might be a little erratic. We might record two episodes one week
and not one the following, but we should continue to have episodes out for the next few weeks.
Yeah, I don't think our listeners will notice a difference at the moment.
Right.
Okay.
At the top of every episode, I'll leave a piece of feedback.
This week I got an email from Tony.
He wrote in,
Hi Chaps, thanks for all your continuing work on CppCast.
I enjoy it every week and I think it's an important part of the C++ world.
Please may I cast a vote for trying to get standard library maintainers on as guest or repeat guests.
I feel people like Marshall Clow and STL will always have plenty of valuable material to discuss on repeat visits.
And I'd be interested to hear from Jonathan Wakely, who, yeah, I don't think we've had him on yet.
Right, Jason?
I don't think so.
And Eric Fiselier, I don't think we've had on yet.
Yeah, so definitely a couple of names that we could try to reach out to.
And, yeah, Marshall and STL were both a lot of fun to talk to,
so maybe we could try to get them on again sometime, too.
Mm-hmm.
Yeah.
Well, we'd love to hear your thoughts about the show as well.
You can always reach out to us on Facebook, Twitter, or email us at feedback@cppcast.com. And don't forget to leave us a review on iTunes.
Joining us today is Hana Dusíková. Hana is working as a senior researcher in Avast Software. Her responsibility is exploring new ideas and optimizing existing ones. She also propagates modern C++ techniques and libraries in internal tech talks
and gives talks at local C++ meetups.
She studied computer science at Mendel University
and subsequently taught several courses there,
including data structures, computability and complexity,
and formal languages and automata.
Hana, welcome to the show.
Hello.
Hey, it's great for you to join us.
I'm curious, and this is something you mentioned to us in an email,
so I thought I'd bring it up now, is how did you get started programming?
Yeah, I started programming when I was like eight years old.
My dad has a computer.
It was Z80 machine, ZX Spectrum.
Okay.
And there was a lot of games,
and after I played a lot of them,
I think I left just programming.
You just started coding away?
Dad had a lot of computer newspapers,
and at the end of every one,
there was a program you can rewrite in BASIC.
So I rewrite it, and then I play with it.
What happens if I change this and this?
I start programming.
So on a Z80, you said.
For a lot of our listeners, that was their first CPU,
and it was a favorite for them.
Yeah, it's my favorite too.
So at the time, was it all BASIC,
or did you get into it and learn machine code and stuff too?
Only BASIC.
It was dark magic for me then.
Yeah, I totally agree.
It was 30 years after I got rid of my Commodore 64
before I actually learned any 6502 machine code.
Okay, well, Hana, we've got a couple of news articles to discuss.
Feel free to comment on any of these.
And then I want to talk more about your talk at CppCon, okay?
Mm-hmm.
Okay, so the first one is a call for proposals
from ACCU, the 2019 conference.
And we've talked a few times about ACCU before.
It's not an explicitly C++ conference,
but they tend to have a lot of C++ content.
Right, Jason?
Yes, that is my understanding.
I've never been there myself.
Are you thinking about submitting?
I am not this year.
I think I'm meeting my conference quota with three for the year.
Okay, so you're already planning which three you're going to go to next year?
Well, I mean, next year kind of starts with C++ on Sea.
Right, right.
So, yeah.
Okay.
What about you, Hana?
I'm going to C++ on Sea, so I'm looking forward to it.
Have you ever been to ACCU or thinking of going?
Sorry?
Have you ever been to the ACCU conference or are you thinking of going?
I've never been.
I've only been to two conferences, and both were CppCon.
Okay. Well, this next one is from the Fluent C++ blog. Obviously we've talked about a couple of these articles before and had Jonathan Boccara on before. This one is "auto to stick and changing your style". Jason, do you want to introduce this one? I thought it was a pretty good article, especially with some of the talks we've had recently with people talking about overusing auto.
Yeah, it's based on Herb Sutter's CppCon talk from a fair number of years ago, where he's basically saying auto should always be on the left.
"Almost always auto" is really what the argument is,
and I just have a hard time using it all the time personally. But he makes a pretty
strong argument. He actually had Herb review his article on why this is a good idea for,
you know, basically fixing the type of a thing. I don't know.
Yeah, I mean, he definitely points out a lot of good reasons why you would want to
use auto, and kind of the main one being so you don't leave something uninitialized.
Right. Yeah, that's also why you should always const.
That too, that too.
Do you use this always auto style, Hana?
Not every time.
I'm still used to writing the type and then the variable name,
but especially in my CTRE library,
which I presented at CppCon,
I use auto on the left side because it makes sense for functional programming.
You have input and output.
Right.
Well, and I would imagine with a lot of what you're doing,
you almost don't have a choice
because a lot of times you don't even know the name,
the type of the thing, right?
Yeah.
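As a rough illustration of the "auto to stick" style discussed above, the sketch below puts auto on the left and states the type on the right, so a variable cannot be declared without an initializer. The function name sum_lengths and the sample data are made up for this example and do not come from the article or from CTRE.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// "auto to stick": the type is named on the right-hand side, so the
// variable sticks to that type and can never be left uninitialized.
// "auto to track": the type is deduced from whatever the expression yields.
inline std::size_t sum_lengths() {
    auto names = std::vector<std::string>{"Rob", "Jason", "Hana"};  // stick
    auto total = std::size_t{0};                                    // stick
    for (const auto& name : names) {                                // track
        total += name.size();
    }
    return total;
}
```

With the conventional spelling, `std::size_t total;` would compile and leave the value indeterminate; `auto total;` does not compile at all.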
Okay.
This next one is a collection of videos
from GNU Tools Cauldron, which is a recent conference, I
guess, Jason?
Yeah, I had no idea that this conference existed, but they apparently get together every
year and, you know, talk about GCC kind of things.
Yeah, that's interesting. It
must be a shorter conference. It looks like there's 34 videos. That might just be one or two days worth of content.
Yeah, and several of them are lightning talks.
Oh, okay.
So probably just a one-day thing.
Yeah, I'm kind of surprised by the number of conferences that are out there on what seem to be niche topics to me.
I think right now there's actually a GitHub conference going on.
And obviously, Git's a huge tool, you know,
lots of us use it every day, but I can't imagine going to an entire two-day conference revolving
around Git, unless maybe they're talking about other general programming topics.
Well, it's GitHub specifically.
Well, I mean, one of the conferences that's getting ready
to happen here, since I'm on Patreon, and I don't know, maybe you saw this email as well, Rob, is PatreCon.
Really?
Yes.
So everyone has a conference.
Everyone has a conference.
Okay.
That's interesting.
Did you look at any of these videos, Hana, by any chance?
No, not at this point.
Okay.
I was just wondering if any of them stood out to you. There is one
on here that was a lightning talk that I will draw attention to, if I can find it again.
Okay.
Where did it go? It disappeared.
What was the lightning talk on?
Yeah, it's on automatically tuning your compiler flags for performance.
Oh.
So it's like running automatic tuning of your compiler options using irace.
And basically it's like a tool that just goes through and iterates through all the different possible combinations of compiler flags until you can get the fastest possible binary output.
And I think they said in their experiments,
they're getting about a 40% performance improvement over just using -O3.
Oh, wow.
That's pretty cool.
Yeah.
No idea how long it takes for it to actually run that.
But I know that there are listeners to this podcast who would care, that they would invest
the time of letting this tool run for a few days so they know what flags to use
in the next build of their high-performance thing.
One thing I'm curious about is how do they measure that your code
is running faster exactly?
I think it's raw timing, but I could be wrong.
Okay.
Interesting.
And then the last thing we have
is a post from the Visual Studio blog
about upcoming updates
to both Visual Studio 2017
and Visual Studio for Mac.
Obviously, it looks like they're starting
to talk more about the upcoming
Visual Studio 2019, although I don't think we've gotten a date yet on when we could expect that. But
version 15.9 of Visual Studio 2017 should be coming out pretty soon,
it sounds like.
Yeah.
And I think the main news here is that 15.9,
they said, is the service pack.
Once 15.9 ships, they're no longer supporting the older
versions of Visual Studio 2017.
Right, that makes sense. But I mean, they've put out so many
updates over the past two years. We've gotten a lot of new features
via updates.
Yeah. Well, one of the main comments on here says, "I just want to return to stability,"
because there have been so many changes between each sub-release or whatever.
And I don't know if that's fair or not. I personally have not seen instability, but
I mostly just verify that Visual Studio is doing what I want it to do. I don't do my main development in Visual Studio.
Right, right. I haven't
noticed any instability either,
but I usually update
as soon as they come out. I don't usually
dip into the preview releases, though.
No, I don't do that.
And, yeah, Visual Studio
for Mac, I'm not using that one either. I believe
that's not a C++ IDE.
That's just for C# and
Xamarin and stuff.
I have no idea.
I try to not use my Mac. In fact,
I need to get rid of it if anyone wants it.
And are you a Visual Studio user
or you live on Linux?
I'm working on a Mac
and I'm using TextMate and Terminal.
Nothing else.
Yeah.
So you don't even use...
Why can I never remember the name of Apple's IDE?
Xcode.
Yes, so you don't use Xcode.
No.
Yeah.
I could never bring myself to use it.
But, I mean, I use Vim everywhere,
and that's something I can rely on being everywhere.
So there's no reason to use Xcode, I don't think,
when I only use that once every few months or whatever.
Every time I open Xcode, I'm shocked by complexity
and I want to return to my TextMate.
I understand that.
Okay, well, Hana, so this year at CppCon, just a few weeks ago,
you gave a talk on your compile time regular expressions library.
And I think, Jason, you were in the talk.
I did not make it, but I did get to watch the video afterwards.
But even while I was still at the conference,
I heard a lot of people talking about your talk,
how it was really interesting and impressive what you managed to achieve.
Can you just start off by telling us a little bit about
what you're trying to solve with your regular expressions library?
I want to try to solve the problem of expensive construction of regular expressions.
It's too expensive, I think, and your program shouldn't do it every time it's run.
It should be done as a code during compilation, at least for me, because my regular expressions
are stable, static for the whole program. That's a good point. I don't think I've ever
personally worked on a project that had dynamically created regular expressions.
You either knew what you were parsing or you didn't.
I think 90% of usage of regular expressions are static.
If you are not writing a text editor or a database,
you don't need dynamic regular expressions.
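To make the construction cost concrete, here is a small sketch using std::regex from the standard library (not CTRE). The function names and the hex-color pattern are invented for this example. The first function rebuilds the regex object on every call; the second builds it once at run time, which is the usual workaround when the pattern is static. CTRE's goal, as described above, is to go one step further and do that work during compilation.

```cpp
#include <regex>
#include <string>

// Constructs (parses and compiles) the pattern on every single call.
inline bool is_hex_color_slow(const std::string& text) {
    const std::regex pattern("#[0-9a-fA-F]{6}");
    return std::regex_match(text, pattern);
}

// Constructs the pattern once, on the first call, and reuses it afterwards.
// The construction still happens at run time, only less often.
inline bool is_hex_color(const std::string& text) {
    static const std::regex pattern("#[0-9a-fA-F]{6}");
    return std::regex_match(text, pattern);
}
```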
Right.
Oh yeah, that's a good point.
I have used dynamic regular expressions
in the context of SQL queries before, but I think that's the only time I've ever done that. Maybe. I don't know.
So how exactly does your library parse these regular expression patterns at compile
time in order to avoid doing it at runtime?
I actually had to build an LL(1) parser, or a recursive descent parser.
I parse the pattern into an intermediate data structure
in the form of a type list.
Then I transform the type list into something similar
which can be used in the matcher,
which is like a type list of type lists.
Okay. So can you explain for us, because I know this is going to sound terrible,
I do have a scripting language, but I don't know anything about parsers: what does LL(1)
parser mean?
An LL(1) parser is a parser for a context-free grammar, which needs some form of stack,
and it's deterministic.
So by looking at the next symbol,
in this case one char,
it decides where it should go.
So if you design your grammar correctly,
it can be an O(n) algorithm.
The actual parsing of it is O(n)?
Yeah.
Okay.
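The description above (a stack, one character of lookahead, a deterministic decision at each step) can be sketched for a toy grammar. The grammar below, balanced parentheses, is an assumption chosen for brevity; it is not the regular expression grammar used in the library, and the fixed stack size is arbitrary.

```cpp
#include <cstddef>
#include <string_view>

// Toy LL(1) grammar (hypothetical, for illustration only):
//   S -> '(' S ')' S | epsilon
// The parser keeps a stack of expected symbols and looks at exactly one
// input character to decide which rule to apply.
constexpr bool balanced(std::string_view input) {
    constexpr std::size_t capacity = 256;  // arbitrary limit for the sketch
    char stack[capacity] = {};             // 'S' = nonterminal, ')' = terminal
    std::size_t top = 0;
    stack[top++] = 'S';
    std::size_t pos = 0;

    while (top > 0) {
        const char symbol = stack[--top];
        const char lookahead = pos < input.size() ? input[pos] : '\0';
        if (symbol == 'S') {
            if (lookahead == '(') {
                // Rule S -> '(' S ')' S: consume '(' and push the rest in
                // reverse order so the first symbol ends up on top.
                if (top + 3 > capacity) return false;
                ++pos;
                stack[top++] = 'S';
                stack[top++] = ')';
                stack[top++] = 'S';
            }
            // Any other lookahead: rule S -> epsilon, push nothing.
        } else {
            // Terminal ')' on the stack must match the input exactly.
            if (lookahead != ')') return false;
            ++pos;
        }
    }
    return pos == input.size();  // accept only if all input was consumed
}

static_assert(balanced("(()())"));
static_assert(!balanced("(()"));
```

Every loop iteration either consumes a character or pops a symbol that was pushed while consuming one, so the work is linear in the input length. Because the function is constexpr, the static_assert lines run the parser during compilation.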
Do all regular expressions
fall into this
LL(1) parsability?
Regular expression patterns are in
an LL(1)
grammar, but
regular expressions themselves are
just regular languages.
There are two parsers
in my code. There is one parser which translates the pattern into
some representation,
and then there is the matching of the regular expression.
Regular expressions themselves are just regular languages.
There are classes of languages, and each type is more powerful than the previous one.
The least powerful are regular languages,
then there are context-free languages, which covers most programming languages,
and then there are the most powerful ones, which are programs by themselves.
But the syntax of almost every programming language
is a context-free grammar,
even C++ with some hacks.
Okay, so you could parse the regular expression
at compile time,
but you cannot execute it at compile time.
Is that correct?
No, no, no, it's not correct.
Okay.
I can parse it,
and I can evaluate it and match it against input.
Oh, okay.
Actually, I'm using one parser to create another parser to parse content of subject.
Okay.
So you use the parser of the regular expressions to create a new parser that then can do the matching or whatever.
Yeah.
Okay.
And the parser itself is actually just a type.
It's created during compile time.
So you can work with it as a type.
You can join them together into regular expressions
or do subscription and anything else.
If I remember correctly from your presentation,
it looked like this type that you generate at compile time
could almost be used as a way to visualize the regular expression
because the type had everything right there in it
for what the regular expression was doing.
Yeah, the type is actually the regular expression
in just a different form.
Okay.
I was able to design matcher,
which take this form and do the matching.
Wow.
Very cool.
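The idea that the type is the regular expression, and that a matcher walks that type, can be sketched with a few empty structs and overloaded functions. All names here (character, any_char, sequence, alternative, match, matches) are invented for this sketch and are not CTRE's real types. The alternative handling is deliberately simplified and does not backtrack into what follows it.

```cpp
#include <cstddef>
#include <string_view>

namespace regex_sketch {

// Building blocks: the pattern lives entirely in the type.
template <char C> struct character {};                 // matches exactly C
struct any_char {};                                    // matches any one char
template <typename... Parts> struct sequence {};       // Parts one after another
template <typename A, typename B> struct alternative {};  // A or B

constexpr std::size_t fail = std::string_view::npos;

// One overload per building block; each returns the position after the
// match, or `fail`. Overload resolution picks the right one from the type.
template <char C>
constexpr std::size_t match(std::string_view s, std::size_t pos, character<C>) {
    return (pos < s.size() && s[pos] == C) ? pos + 1 : fail;
}

constexpr std::size_t match(std::string_view s, std::size_t pos, any_char) {
    return pos < s.size() ? pos + 1 : fail;
}

constexpr std::size_t match(std::string_view, std::size_t pos, sequence<>) {
    return pos;  // empty sequence matches without consuming anything
}

template <typename Head, typename... Tail>
constexpr std::size_t match(std::string_view s, std::size_t pos,
                            sequence<Head, Tail...>) {
    const std::size_t next = match(s, pos, Head{});
    return next == fail ? fail : match(s, next, sequence<Tail...>{});
}

template <typename A, typename B>
constexpr std::size_t match(std::string_view s, std::size_t pos,
                            alternative<A, B>) {
    const std::size_t first = match(s, pos, A{});
    return first != fail ? first : match(s, pos, B{});
}

// The whole input must be consumed for a full match.
template <typename Pattern>
constexpr bool matches(std::string_view s) {
    return match(s, 0, Pattern{}) == s.size();
}

// The pattern a.(b|c) written out as a type.
using a_dot_b_or_c =
    sequence<character<'a'>, any_char,
             alternative<character<'b'>, character<'c'>>>;

static_assert(matches<a_dot_b_or_c>("axb"));
static_assert(!matches<a_dot_b_or_c>("axd"));

}  // namespace regex_sketch
```

Reading the alias a_dot_b_or_c gives back the structure of the pattern, which is the sense in which the type can serve as a visualization of the regular expression.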
How does the library compare
with some of the other regex libraries out there,
including the standard regex library?
I'm on par with PCRE library,
which is similar in most cases.
In some cases, I'm better.
In some cases, PCRE is better.
But for other libraries,
like POSIX regular expressions in egrep,
I'm usually maybe even 10 times quicker.
I tried benchmarking RE2, the regular expression library from Google,
and I'm like 10% quicker than them.
I tried the re2c compiler,
which actually generates C code with a state machine.
And in most cases, I am 10 to 20% quicker.
That's pretty impressive.
That's amazing, yeah.
Thank you.
And then the standard library ones?
Yeah, standard library ones is special case.
I don't know why, but they are so slow.
When I was benchmarking them,
I actually killed them a few times
because I was thinking that they stuck.
Something I can match in like three to five seconds,
and I mean on one-gigabyte files,
the regular expression in the standard library took 50 minutes, for example.
Wow.
So you're orders of magnitude faster than them.
Yeah.
In this case, I'm talking about libc++ regular expression.
The libstdc++ regex implementation is much better.
Okay.
But still, it's kind of slow, like a minute versus five seconds.
Now, I know you're not involved
in the standard library implementations,
or at least I don't think you are,
but do you have any idea
why they're not just using PCRE
or one of the already proven ones
in the backend when they can?
I don't know.
I can only imagine maybe it's license problems.
I'm not sure.
Okay.
Okay.
I believe in your talk you talked about a couple C++20 features
that your library is dependent on.
What platforms is it currently supported on?
Currently, you can compile it on Clang, I think, version 4 and newer,
and GCC 7.2 and newer.
I'm using a feature called string template literals, which was designed
as an implementation for paper
N3596, I'm not sure.
It was
designed to
create string literals
which give you a char pack
instead of a const char*
and a size_t,
so you can use the char pack as a string during compile time.
But it wasn't approved by the committee.
Later, based on that design,
Jeff Snyder came up with a much better design,
which allows you to use almost any type as a template parameter.
Ah, okay.
So this is the C++20 feature that is just, yeah, the...
Yeah, it's called class non-type template parameters.
Right. Okay, I did not realize that. Okay.
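A minimal sketch of what class non-type template parameters allow: a small string-holding class whose value can be passed as a template argument. The name fixed_string and the helper pattern_length are assumptions made for this example, not necessarily what the library uses. The C++20 part is guarded by the standard feature-test macro so the block still compiles as C++17.

```cpp
#include <cstddef>

// A literal class that copies a string literal into a member array.
// All members are public, which C++20 requires for a class type to be
// usable as a non-type template parameter.
template <std::size_t N>
struct fixed_string {
    char data[N] = {};

    constexpr fixed_string(const char (&text)[N]) {
        for (std::size_t i = 0; i < N; ++i) data[i] = text[i];
    }

    // Length without the terminating null character.
    constexpr std::size_t size() const { return N - 1; }
};

// Works in C++17 already: the object is usable in constant expressions.
static_assert(fixed_string("ab+c").size() == 4);

#if defined(__cpp_nontype_template_args) && __cpp_nontype_template_args >= 201911L
// C++20 only: the string itself becomes a template argument, so the
// pattern is part of the type and can be processed during compilation.
template <fixed_string Pattern>
constexpr std::size_t pattern_length() {
    return Pattern.size();
}

static_assert(pattern_length<"ab+c">() == 4);
#endif
```

Checking the feature-test macro in this way is also the general mechanism by which a library can switch to the newer syntax automatically when the compiler supports it.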
And in my library, you can use extension,
which is created in GCC and Clang,
which is in C++17 extension,
GNU extension, or
if your compiler is able to use
class non-type template parameters,
then by using a feature test
macro, my library will
support a new
form of syntax automatically.
Okay.
So right now, can it work in Visual
Studio at all?
Not yet.
Okay.
I asked Stephan Lavavej if they will be implementing this feature.
They will be implementing it soon in Q1 next year.
Oh, okay.
And it should work on C++ in Visual Studio.
Maybe without making too big of a guess here,
if he said Q1 of next year,
they might be referring to the first release of Visual Studio 2019.
Hopefully.
It could be, yeah.
Hopefully.
But this feature is depending on operator UFO.
What's that?
UFO operator.
Oh, okay, right.
The three-way comparison.
Yeah.
So are you aware of anyone using your library
yet in production code?
I'm aware only of me,
but there were a lot of people who asked me a lot of questions, so maybe someone
is using it. It's new, it's been public for just two weeks, so maybe. I hope so.
So it is actually being used in your products at work?
Yeah.
Oh, that's cool. So you actually did a lightning
talk about this library
last year at CppCon, and it
sounded like you maybe have gone through a couple different iterations
of this library, is that right?
Yeah, this is like six or seven
iterations, I'm not sure.
I played with the idea for like the last
four or five years,
and I tried a lot of different
approaches.
Some of them work, some of them don't.
Wow.
So when you come up with a new iteration of the library,
kind of like starting from scratch,
is it just the same basic idea of, you know,
I want to make these regular expressions,
you know, parse at compile time,
but each one is a unique library?
That's the original idea,
but implementation was completely different.
This time I used function overloading resolution as a major point, major engine of my
library, and it works, and it's a rather
elegant code I can explain on one slide.
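One way to read "function overload resolution as the engine" is that each grammar rule is an overload taking the symbol on top of the stack and the lookahead character, and the return type says what to do next. The sketch below shows only that table lookup for the toy balanced-parentheses grammar. The names (term, push, pop_input, rule, next_step) are chosen for illustration and should not be taken as the library's actual interface.

```cpp
#include <type_traits>

namespace ll1_sketch {

template <char C> struct term {};   // a terminal: one input character
struct end_of_input {};             // lookahead when the input is exhausted
struct S {};                        // the only nonterminal of the toy grammar

// Possible results of a table lookup.
template <typename... Symbols> struct push {};  // expand into these symbols
struct pop_input {};                            // terminal matched, consume it
struct accept_empty {};                         // apply S -> epsilon
struct reject {};                               // no rule: syntax error

// The parsing table, written as overloads. Only the declarations matter;
// the functions are never called, only inspected with decltype.
struct grammar {
    static auto rule(S, term<'('>) -> push<term<'('>, S, term<')'>, S>;
    static auto rule(S, term<')'>) -> accept_empty;
    static auto rule(S, end_of_input) -> accept_empty;

    template <char C>
    static auto rule(term<C>, term<C>) -> pop_input;

    static auto rule(...) -> reject;  // worst match, chosen last
};

// Overload resolution performs the lookup for (stack top, lookahead).
template <typename Top, typename Lookahead>
using next_step = decltype(grammar::rule(Top{}, Lookahead{}));

inline constexpr bool open_paren_expands =
    std::is_same_v<next_step<S, term<'('>>,
                   push<term<'('>, S, term<')'>, S>>;
inline constexpr bool matching_terminal_pops =
    std::is_same_v<next_step<term<')'>, term<')'>>, pop_input>;
inline constexpr bool mismatch_rejected =
    std::is_same_v<next_step<term<')'>, term<'('>>, reject>;

static_assert(open_paren_expands);
static_assert(matching_terminal_pops);
static_assert(mismatch_rejected);

}  // namespace ll1_sketch
```

A full parser would repeat this lookup while maintaining a type list as the stack; that driver loop is omitted here.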
Yeah, when I saw how you were using function overload resolution,
I thought, I think that that was the piece that I was missing
when I last tried to write a parser that could run at compile time.
But that leads us to a question Odin Holmes wanted us specifically to ask.
How did you come up with your unique style of metaprogramming
using these techniques?
I usually read papers on future C++,
and I play with them if it's possible
and try to use them in different scenarios
and see how they work together.
And then sometimes I find something,
maybe I learn something which is already known,
I don't know.
And then I come up with an idea,
and then I try to implement it.
If it works, it works.
Everyone says that template metaprogramming
is Turing complete.
So by theory, you should be able to do whatever you want.
Right.
So implement parser.
So that sounds like extremely valuable experience. There are not
very many people who actually try the new features as they're being developed to see
how they interact with each other. Have you ever written up your experiences on this
and submitted it? I figure it's not a proposal to the committee, but
just like an info paper to the committee on how useful these features are?
I didn't.
I don't like writing.
Well, on the topic of that, though, I did notice that you do have a paper for San Diego. Is that correct?
Yeah, there is one paper as first author, and I think two others with me as second author.
Maybe only one, I'm not sure.
Can you tell us about the paper?
The paper is about a new attribute for C++20,
which should allow you
to introduce a new,
user-defined diagnostic.
If your function's
SFINAE or requires clause fails,
or just
the arguments do not match,
it should give you
the diagnostic.
For example, in my case, in my library,
I was thinking about whether I should use a SFINAE check
to validate the regular expression,
or a static_assert.
A static_assert can give you a custom diagnostic,
but a SFINAE check is friendlier to other SFINAE programming.
And you cannot have both.
And this attribute should change that.
Okay.
So in case you ever wanted
to have
custom error message
if your overload cannot be used
because of some constraint,
for example, your T is not numeric,
this constraint should help you.
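The trade-off described here can be shown with two versions of the same trivial function, using the "T is not numeric" case from the discussion. The function and trait names are invented for this sketch. It uses only C++17 facilities and does not show the proposed attribute itself.

```cpp
#include <type_traits>
#include <utility>

// Variant 1: SFINAE. For a non-numeric T the overload silently drops out
// of overload resolution, so other code can test for it, but the compiler
// error for a bad call is a generic "no matching function".
template <typename T, std::enable_if_t<std::is_arithmetic_v<T>, int> = 0>
constexpr T twice_sfinae(T value) {
    return value + value;
}

// Variant 2: static_assert. A bad call produces the custom message, but
// the function always takes part in overload resolution, so it cannot be
// probed with SFINAE-style detection.
template <typename T>
constexpr T twice_checked(T value) {
    static_assert(std::is_arithmetic_v<T>, "twice_checked: T must be numeric");
    return value + value;
}

// Detection trait: works because variant 1 is SFINAE-friendly.
template <typename T, typename = void>
struct can_twice_sfinae : std::false_type {};

template <typename T>
struct can_twice_sfinae<
    T, std::void_t<decltype(twice_sfinae(std::declval<T>()))>>
    : std::true_type {};

static_assert(can_twice_sfinae<int>::value);
static_assert(!can_twice_sfinae<const char*>::value);
```

Writing the same trait for twice_checked would report true for const char* as well, and the static_assert would only fire once the call is actually instantiated. That is the "you cannot have both" problem.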
Maybe some will argue that you don't need it
because concepts should solve it.
But it's not actually true.
Concept can give you nicer error messages,
but if you have very complex concept,
it can give you very bad error messaging.
For example, in my case,
I can introduce a concept which checks
if a regular expression is correct
and if it doesn't contain any captures.
Because if there are no captures in a concept,
I can use a different engine to match it,
which is much quicker than backtracking.
Oh, interesting. Okay.
I also noticed that your name is on
a paper with
Bryce Adelstein Lelbach
on implicit constexpr.
Yeah, that's true.
That one, well, that looks interesting
to me.
Our original thinking was that
there is a lot of work going on,
and that everything will be constexpr in a few years.
We can agree on that.
Yes.
So let's just cut it short and make everything constexpr now,
just by saying compiler should be able to evaluate it.
That's everything done.
You don't need any work from my perspective.
I'm not expert.
Right.
Ideally, you should be able to
invoke some code
generated by compiler, even in
some different library, and place it as a parameter
for template, for example.
Right.
It should make language much easier
and simpler
to use.
I guess there's no reason why you would ever
want to explicitly mark something non-constexpr, right?
Maybe some input-output functions, maybe.
Is that part of the paper?
Is there a non-constexpr if it's going to be implicit constexpr?
There is constexpr false.
False, okay.
But it's not nice,
but I think it should be easily changed to something else.
There is also some overlap with the constexpr exclamation mark paper,
currently known as the consteval paper.
So whether these papers will be joined or not, I don't know.
Yeah, I saw that overlap with constexpr!.
Because in this proposal, constexpr true says it must be done in a constexpr context.
Yeah, it's the same.
You know, honestly, it's more wordy to have to do this constexpr open paren true,
but I feel like it's more consistent with the rest of the language personally,
like it matches the noexcept and the conditionally explicit syntax.
I personally think that having true or false after constexpr will raise a lot of questions.
People can ask, should I place an expression here or not?
So maybe it's not the best idea to have true or false.
Maybe a new keyword will be better. I'm not sure.
Maybe.
But I do like this idea of this implicit constexpr.
We are at the point now where we do have a few different tools
that can tell us, oh, by the way, that function could be constexpr.
Like, the compilers already know this,
because they know what operations you're performing inside it.
And as your paper points out, they have to know that,
because lambdas are implicitly constexpr.
Yeah, that's true.
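The point about lambdas can be checked directly in C++17: a lambda's call operator is implicitly constexpr when its body qualifies, while an ordinary function needs the keyword. The names below are made up for the example.

```cpp
// No constexpr on the lambda's call operator: since C++17 it is
// implicitly constexpr because the body satisfies the requirements.
constexpr auto square = [](int x) { return x * x; };
static_assert(square(4) == 16);

// The same body in an ordinary function is not usable at compile time
// unless it is explicitly marked constexpr.
inline int square_fn(int x) { return x * x; }
// static_assert(square_fn(4) == 16);  // would not compile today

// The lambda's result can even be used as a template argument.
template <int N>
struct tag {
    static constexpr int value = N;
};

inline int lambda_result_as_template_argument() {
    return tag<square(5)>::value;
}
```

Implicit constexpr for functions, as proposed in the paper discussed above, would make the commented-out static_assert valid without changing square_fn.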
I'd like to interrupt the discussion for just a moment to bring you a word from our sponsors.
Authors of the PVS Studio Analyzer suggest downloading and trying the demo version of the product.
Link to the distribution package is in the description of this podcast episode.
You might not even be aware of the large number of errors
and potential vulnerabilities that the PVS Studio Analyzer is able to detect.
A good example is the detection of errors that are classed as CWE-14
according to the Common Weakness Enumeration:
Compiler Removal of Code to Clear Buffers.
PVS Studio creators demonstrate the detection of such an error type,
for example, in one of the latest articles,
"We Checked the Android Source Code by PVS-Studio, or Nothing is Perfect".
Link to this article is also in the description of this episode.
PVS Studio works in Windows, Linux, and macOS environments.
The tool supports the analysis of C, C++, and C# code,
and Java support is coming soon.
Try it today.
So since we're talking about the ISO committee right now,
you mentioned in the intro of your talk that
you live in the Czech Republic,
and that the Czech Republic is soon going to be a national body for WG21.
Are you involved in that process?
I started the process.
You did? That's awesome.
Last year, Bryce and others told me the Czech Republic is only an O-member,
an observing member.
Do you want to be in committee?
And I was like,
yeah, but that's a lot of work.
And I was thinking about it
and then I contacted the Czech Bureau,
which is responsible for communication with ISO.
And they were quite happy.
If Avast wants to give a lot of work into it, we are okay with it.
And there will be a Czech national body in the form of our company, Avast.
So how long is this process, do you think?
A lot of work is about our lawyers.
It's a lot of lawyery.
Oh, okay.
Actually, it's just a few signatures, and that's all.
We are already an O-member,
so the hard work is already done.
So when should this be official? When should you be a voting national body?
I hope yesterday. I'm still waiting for the lawyers,
but soon. I will not be in San Diego, but I will be at the next meeting.
Yeah, that's what I was
going to ask, if you're going to San Diego. What is the next one after that?
It's in February and it's in Hawaii.
It's a good one to go to.
You're right.
Not that you'll be able to enjoy it, but...
I will be probably sitting in an air-conditioned hotel room
without windows.
Yes.
That's what we've been told.
At the beginning of your talk,
you also mentioned
you're involved in the C++ Prague meetup.
How is the C++ community in Prague?
It's starting.
It's small,
and it will be bigger and bigger
with each meetup, I think.
There will be a next meetup tomorrow,
which is October 17th.
And there will be Tony Van Eerd.
I was able to convince my company
to invite some nice guests to come here.
Your company recently had
Matt Godbolt out, is that correct?
Yeah, that's also my meetup.
Oh, okay. Very cool.
I didn't realize that was the meetup
also. My meetup
has not flown anyone in from overseas.
Although we have
had people coincidentally visit from
overseas because they were on their way to a
conference here.
From the beginning, I wanted every talk to be streamed.
It helps to spread the word about the meetup and it's a lot about marketing of our company.
Right.
So you can watch the C++ Prague meetups online?
Yeah. Streamed. Live.
Awesome.
So who is your next speaker then?
Tony Van Eerd tomorrow.
Okay.
Wow.
That's a long flight for Tony.
I'm curious.
My experience in Eastern Europe so far is like Wrocław, Poland,
has a gigantic C++ community.
It seems that Budapest does.
Is it similar in Prague?
Is there a lot of C++ development happening?
There is a lot of development.
There is Skype, Microsoft,
Red Hat in Brno,
and a lot of different companies.
So there are probably a lot of C++ developers,
but they are hidden inside, I think.
Right.
Well, you'll also be presenting your talk
at C++ on Sea in early 2019.
Are you planning on making any updates to it?
Yeah, there'll be a few significant ones.
Are you rewriting the library again?
I'm preparing a few new tricks
to shock the audience.
And I hope there will be a new engine
in the library.
Wow.
Yeah.
So it sounds like you enjoy rewriting it.
Nothing from it.
Okay.
No, no. It's not rewriting.
It's additional engine which can be used
if there are no captures in the regular expression,
only matching, which can be done much quicker.
Okay.
I have one more question about
all these different iterations of the library.
If you're stuck on C++11 or 14,
could you use one of the earlier
iterations of the library, since the new one
is kind of dependent on these C++17
and C++20 features?
I think the oldest one, I don't know
if it's public, was on
C++11 with the extension.
Okay.
But
probably not. It would be a lot
of work to make it work.
Yeah.
Okay.
I'm also curious, since you mentioned you had to build an LL(1) parser first, is that parser component reusable?
Is that something that other people could use to make their own parsers?
Yeah, you can just take part of my library.
It's called ctll. Provide a new grammar and you are good to go.
Okay. Oh, I see.
Compile time LL.
Yeah.
You can just design a new grammar
in the form of a table
with function overloads,
and you give me the
grammar as a
type, as a parameter to the parser,
and that's all.
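To make "a grammar as a table of function overloads, passed as a type to the parser" concrete, here is a minimal sketch. It is not ctll's actual API: every name in it (`term`, `push`, `reject`, `machine`, `accepts`, the `brackets` grammar) is made up for illustration, and the real library may be organized quite differently. The grammar is a struct whose `rule(nonterminal, lookahead)` overloads form the LL(1) table, and a small type-level stack machine consults that table.

```cpp
#include <type_traits>

// Building blocks (all names are hypothetical, not ctll's real API).
template <char C> struct term {};              // terminal: one input character
struct end_of_input {};                        // lookahead once input is exhausted
template <typename... Symbols> struct push {}; // action: expand nonterminal to Symbols
struct reject {};                              // action: no table entry, parse fails
template <typename... Ts> struct stack {};     // parser stack as a type list
template <char... Cs> struct input {};         // remaining input as a character pack

// The grammar as a table of overloads: rule(nonterminal, lookahead) -> action.
//   S -> '[' S ']' S | epsilon      (balanced square brackets)
struct brackets {
    struct S {};
    using start = S;
    static auto rule(S, term<'['>)    -> push<term<'['>, S, term<']'>, S>;
    static auto rule(S, term<']'>)    -> push<>;   // epsilon
    static auto rule(S, end_of_input) -> push<>;   // epsilon
    static auto rule(...)             -> reject;   // everything else
};

// Lookahead symbol for the remaining input.
template <char... Cs> struct lookahead { using type = end_of_input; };
template <char C, char... Cs> struct lookahead<C, Cs...> { using type = term<C>; };

template <typename G, typename Stack, typename Input> struct machine;

// Apply a table action: reject fails, push<...> goes onto the stack.
template <typename G, typename Action, typename Rest, typename Input>
struct apply : std::false_type {};
template <typename G, typename... Syms, typename... Rest, typename Input>
struct apply<G, push<Syms...>, stack<Rest...>, Input>
    : machine<G, stack<Syms..., Rest...>, Input> {};

// Empty stack: accept only if the input is exhausted as well.
template <typename G, char... Cs>
struct machine<G, stack<>, input<Cs...>>
    : std::integral_constant<bool, sizeof...(Cs) == 0> {};

// Terminal on top that does not match the next character: fail.
template <typename G, char C, typename... Rest, char... Cs>
struct machine<G, stack<term<C>, Rest...>, input<Cs...>> : std::false_type {};

// Terminal on top that matches the next character: consume both.
template <typename G, char C, typename... Rest, char... Cs>
struct machine<G, stack<term<C>, Rest...>, input<C, Cs...>>
    : machine<G, stack<Rest...>, input<Cs...>> {};

// Nonterminal on top: look up the action in the grammar's overload table.
template <typename G, typename Top, typename... Rest, char... Cs>
struct machine<G, stack<Top, Rest...>, input<Cs...>>
    : apply<G, decltype(G::rule(Top{}, typename lookahead<Cs...>::type{})),
            stack<Rest...>, input<Cs...>> {};

// The grammar is handed to the parser as a type parameter.
template <typename G, char... Cs>
constexpr bool accepts = machine<G, stack<typename G::start>, input<Cs...>>::value;

static_assert(accepts<brackets, '[', '[', ']', ']'>, "balanced input should parse");
static_assert(!accepts<brackets, '[', ']', ']'>, "unbalanced input should be rejected");
```

Swapping in a different grammar struct with its own `rule` overloads changes the accepted language without touching the machine, which is the reuse being described here.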
In my library, there is a branch, brainfuck, which is implementing language brainfuck with the parser.
Which I'd imagine is a very difficult language to parse.
It's actually very easy to parse.
Really? Okay.
Yeah, it's like 100 lines of grammar, which is nothing.
Yeah, my understanding is that language is one of those write-only languages.
Yeah, I use examples from Wikipedia.
I didn't try to write my own.
So one thing we haven't brought up at all yet,
with all this compile time magic you're doing,
is the impact on the compiler.
How much does this affect the compile time of your project?
You mean just using my library or...
Yeah, using your CTRE library, yeah.
Like if I'm going to throw a half dozen regular expressions
with your library into my project
because I want really, really fast run times,
is it going to take me an hour to compile my project?
No, it will take you only a few milliseconds per regex.
On GCC, it's 200 milliseconds per regex.
For how big of a regex?
Sorry.
I use
medium-sized regex, I think
20 chars long.
Okay.
So it looks like most of
ordinary regular expressions.
In Clang, it's
a little quicker, I think
150 milliseconds.
Depends on your computer.
But if you use the same regular expression multiple times,
compiler caches the parsing, so you're not paying anything more.
So you can use your regular expression 100 times, and you will not know it.
So that just, if I understand correctly,
the compiler sees that you're instantiating the same template again
and says, oh, I already know what's going to happen
when I instantiate this template?
Exactly.
Okay.
If they are in the same translation unit.
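A small sketch of why repeated use is cheap, assuming the pattern ends up encoded in template arguments (the `parsed_pattern` type below is a hypothetical stand-in, not a CTRE type): naming a template with the same arguments twice names one and the same specialization, so within a translation unit the compiler only has to instantiate it once.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a pattern that is parsed at compile time.
template <char... Cs>
struct parsed_pattern {
    static constexpr std::size_t length = sizeof...(Cs);
};

// Two uses of the same pattern in one translation unit.
using first_use  = parsed_pattern<'a', '+', 'b'>;
using second_use = parsed_pattern<'a', '+', 'b'>;

// Both names refer to the same specialization, so there is nothing new
// to instantiate for the second use.
static_assert(std::is_same<first_use, second_use>::value,
              "same template arguments, same specialization");

// A different pattern is a different specialization and is parsed separately.
static_assert(!std::is_same<first_use, parsed_pattern<'a', '*', 'b'>>::value,
              "different template arguments, different specialization");
```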
Right.
Okay.
So it sounds like you could,
if for some reason these became a compile time concern to you,
you could use them as extern templates as Michael Caisse recommended when we
had him on the show.
Yeah,
probably.
Or you can just wrap them in a function or in a different translation unit.
Oh yeah.
That would make sense because you just need to pass a string view or whatever to it that you want to match against.
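The wrapping idea can be sketched as follows. This is an illustration only: `std::regex` stands in for the compile-time engine so that the example needs nothing beyond the standard library, and the function name, the file name in the comment, and the pattern are all hypothetical. The point is the shape: callers see only a declaration taking a `std::string_view`, while the pattern lives in a single source file and is processed there once.

```cpp
#include <regex>
#include <string_view>

// --- would live in a header, e.g. a hypothetical "date_matcher.hpp" ---
// Callers only see this declaration; they never see the pattern itself.
bool looks_like_date(std::string_view text);

// --- would live in its own .cpp file, compiled once ---
// Assumption: std::regex is used here purely as a stand-in for the
// compile-time engine; the pattern is processed only in this file.
bool looks_like_date(std::string_view text) {
    static const std::regex pattern{R"(\d{4}-\d{2}-\d{2})"};
    return std::regex_match(text.begin(), text.end(), pattern);
}
```

Every other translation unit then pays only for an ordinary function call, not for parsing the pattern.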
Okay.
That's really cool.
And, you know, Rob started out by saying this, but it is true.
All of the comments on Reddit and Twitter and Slack seem to agree that your talk was the best one of the conference this year.
You got several applause lines.
People were pretty impressed from what I could tell in the video.
Yeah.
I was surprised, too.
Oh, and on that note, the first video that went up had to be pulled
because it had an incorrect title.
So it sounds like, Rob, you did find a video that stayed online.
No, there is no video
currently online. It was
pulled because there was a wrong title.
It was just regular expressions.
It was missing compile time.
Okay, so they haven't reposted
it yet.
Okay, well, hopefully they get it up soon.
Yeah.
One more thing, since we're talking about the
length of compile time for a regex,
I think you did find there was an upper limit to how long a regex could be
before the compiler would be unable to process it.
Because the depth of template instantiation recursion is the same as the length of the string,
you are limited by your compiler's template
instantiation recursion limit, so it's usually 100 chars.
Okay.
100 chars, okay.
I don't think many people are writing
regular expressions that long, though.
Yeah, so it's not very usable for big JSONs,
but for regexes, it's enough.
Yeah.
Yeah, and I think most compilers have compiler flags
where you could extend that limit if you really wanted to.
Yeah.
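The link between pattern length and recursion depth can be illustrated with a toy recursive template. This is a sketch under the assumption that each character is handled by one nested instantiation; the `consume` name is hypothetical and this is not the library's code.

```cpp
#include <cstddef>

// Each character of the pattern costs one more level of nested
// template instantiation, so the depth equals the pattern length.
template <char... Cs> struct consume;

// Base case: nothing left to process.
template <> struct consume<> {
    static constexpr std::size_t depth = 0;
};

// Recursive case: handle one character, then recurse on the rest.
template <char C, char... Rest>
struct consume<C, Rest...> {
    static constexpr std::size_t depth = 1 + consume<Rest...>::depth;
};

static_assert(consume<'a', 'b', 'c'>::depth == 3,
              "three characters need three nested instantiations");
```

A pattern longer than the compiler's instantiation depth limit would fail to compile in a scheme like this. GCC and Clang both accept `-ftemplate-depth=N` to raise that limit.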
I actually was thinking about writing a paper
which
gives you the ability to ask about
the limit of
template instantiation recursion.
So you can use it
in your parser and give
a nice error message:
"Your regular expression is too long."
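Since the language offers no way to query that limit, the idea can only be approximated today. The sketch below hard-codes a guess: `assumed_max_depth` and `checked_pattern` are hypothetical names, and the value 100 is simply the figure mentioned in the conversation, not something read from the compiler.

```cpp
#include <cstddef>

// Assumption: there is no standard way to ask the compiler for its
// instantiation depth limit, so this constant is a hand-picked guess.
constexpr std::size_t assumed_max_depth = 100;

// Hypothetical pattern wrapper that checks the length up front, so the
// user gets a readable message instead of a deep instantiation error.
template <char... Cs>
struct checked_pattern {
    static_assert(sizeof...(Cs) <= assumed_max_depth,
                  "Your regular expression is too long.");
    static constexpr std::size_t length = sizeof...(Cs);
};

// A short pattern passes the check.
static_assert(checked_pattern<'a', 'b', '+'>::length == 3,
              "short patterns are accepted");
```

With a queryable limit, as the proposed paper would provide, the hard-coded constant could be replaced by the compiler's real value.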
That would be interesting.
Are you aware
if the standard currently says
if there's a minimum
recursion limit?
I'm not sure, but there shouldn't be any.
It's just an implementation issue.
Yeah, that's probably true.
I know there's some places where the standard
has minimums, but...
Wow. It's a very cool library.
Thank you.
Was there anything else you wanted to share with us
before we let you go, Hana?
I'm not sure.
I don't know.
Probably not.
Okay.
Okay.
Well, it's been great having you on.
We'll get a new link to the YouTube video as soon as it's back up and put that in the show notes.
And good luck at the Hawaii C++ meeting when you go there.
Thank you.
And I guess I'll see you in England
in a few months.
Yeah, if anyone is at Meeting C++,
I will be there too.
Oh, right. I forgot that you
had submitted to that one as well.
Well, have fun.
Will this be your first year
at Meeting C++ then, you said?
Yeah.
It's a good conference.
I heard everything good about it.
Good.
Okay, thank you, Hana.
Thank you.
Thanks.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in,
or if you have a suggestion for a topic,
we'd love to hear about that too.
You can email all your thoughts to feedback at cppcast.com.
We'd also appreciate if you can like CppCast on Facebook
and follow CppCast on Twitter.
You can also follow me at Rob W. Irving
and Jason at Lefticus on Twitter.
We'd also like to thank all our patrons
who help support the show through Patreon.
If you'd like to support us on Patreon,
you can do so at patreon.com slash cppcast.
And of course, you can find all that info and the show notes on the podcast website at cppcast.com.
Theme music for this episode was provided by podcastthemes.com.