CppCast - Compile Time Regular Expressions

Episode Date: October 19, 2018

Rob and Jason are joined by Hana Dusíková to discuss her compile time regular expressions library, the Prague user group and her proposal for implicit constexpr. Hana is working as a senior researcher in Avast Software. Her responsibility is exploring new ideas and optimizing existing ones. She also propagates modern C++ techniques and libraries in internal techtalks and gives talks at local C++ meetups. She studied computer science at Mendel university and subsequently taught several courses there, including: Data Structures, Computability and Complexity, and Formal Languages and Automata.

News
ACCU 2019 Call For Papers
"auto to stick"
GNU Tools Cauldron 2018 Videos online
Visual Studio 2017 and Visual Studio for Mac Support Updates

Hana Dusíková
@hankadusikova
Hana's GitHub

Links
Compile Time Regular Expression v2
CppCon 2018: Hana Dusíková "Compile Time Regular Expressions"
Compile Time Regular Expressions Presentation Slides
Avast Prague C++ Meetup
P1235R0: Implicit constexpr

Sponsors
Download PVS-Studio
We Checked the Android Source Code by PVS-Studio, or Nothing is Perfect

Hosts
@robwirving
@lefticus

Transcript
Starting point is 00:00:00 Episode 171 of CppCast with guest Hana Dusíková, recorded October 16th, 2018. Today's sponsor of CppCast is the PVS-Studio team. PVS-Studio can be considered both as a tool for finding errors and typos, and a static application security testing tool. The tool supports the analysis of C, C++, and C# code. In this episode, we discuss auto to stick. Then we talk to Hana Dusíková. She talks to us about her compile time regular expressions library,
Starting point is 00:00:49 and much more. Welcome to episode 171 of CppCast, the first podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how are you doing today? I'm doing all right, Rob. How are you doing? Doing okay. You got a busy few weeks up ahead for you, right? Yeah, a little bit of overseas training coming up again and some U.S. training as well. Okay. So yeah, the schedule might be a little erratic. We might record two episodes one week and not one the following, but we should continue to have episodes out for the next few weeks.
Starting point is 00:01:48 Yeah, I don't think our listeners will notice a difference at the moment. Right. Okay. At the top of every episode, I like to read a piece of feedback. This week I got an email from Tony. He wrote in, "Hi Chaps, thanks for all your continuing work on CppCast. I enjoy it every week and I think it's an important part of the C++ world.
Starting point is 00:02:06 Please may I cast a vote for trying to get standard library maintainers on as guests or repeat guests. I feel people like Marshall Clow and STL will always have plenty of valuable material to discuss on repeat visits. And I'd be interested to hear from Jonathan Wakely," who, yeah, I don't think we've had him on yet. Right, Jason? I don't think so. And Eric Fiselier, I don't think we've had on yet. Yeah, so definitely a couple of names that we could try to reach out to. And, yeah, Marshall and STL were both a lot of fun to talk to,
Starting point is 00:02:37 so maybe we could try to get them on again sometime, too. Mm-hmm. Yeah. Well, we'd love to hear your thoughts about the show as well. You can always reach out to us on Facebook, Twitter, or email us at feedback at cppcast.com. And don't forget to leave us a review on iTunes. Joining us today is Hana Dusíková. Hana is working as a senior researcher in Avast Software. Her responsibility is exploring new ideas and optimizing existing ones. She also propagates modern C++ techniques and libraries in internal tech talks and gives talks at local C++ meetups. She studied computer science at Mendel University and subsequently taught several courses there,
Starting point is 00:03:15 including data structures, computability and complexity, and formal languages and automata. Hana, welcome to the show. Hello. Hey, it's great for you to join us. I'm curious, and this is something you mentioned to us in an email, so I thought I'd bring it up now, is how did you get started programming? Yeah, I started programming when I was like eight years old.
Starting point is 00:03:38 My dad has a computer. It was Z80 machine, ZX Spectrum. Okay. And there was a lot of games, and after I played a lot of them, I think I left just programming. You just started coding away? That had a lot of computer newspapers,
Starting point is 00:04:04 and at the end of every one, there was a program you can rewrite in BASIC. So I rewrite it, and then I play with it. What happens if I change this and this? I start programming. So on a Z80, you said, that's a lot of our listeners, that's their first CPU, and it was a favorite for them.
Starting point is 00:04:29 Yeah, it's my favorite too. So at the time, was it all BASIC, or did you get into it and learn machine code and stuff too? Only BASIC. Machine code was dark magic for me then. Yeah, I totally agree. It was 30 years after I got rid of my Commodore 64 before I actually learned any 6502 machine code.
Starting point is 00:04:53 Okay, well, Hana, we got a couple news articles to discuss. Feel free to comment on any of these. And then I want to talk more about your talk at CppCon, okay? Mm-hmm. Okay, so the first one is a call for proposals from ACCU, the 2019 conference. And we've talked a few times about ACCU before. It's not an explicitly C++ conference,
Starting point is 00:05:16 but they tend to have a lot of C++ content. Right, Jason? Yes, that is my understanding. I've never been there myself. Are you thinking about submitting? I am not this year. I think I'm meeting my conference quota with three for the year. Okay, so you're already planning which three you're going to go to next year?
Starting point is 00:05:33 Well, I mean, next year kind of starts with C++ on Sea. Right, right. So, yeah. Okay. What about you, Hana? I'm going to C++ on Sea, so I'm looking forward to it. Have you ever been to ACCU or thinking of going? Sorry?
Starting point is 00:05:55 Have you ever been to the ACCU conference or are you thinking of going? I've never been. I was only at two conferences and both were CppCon. Okay. Well, this next one is from the Fluent C++ blog. Obviously we've talked about a couple of these articles before and had Jonathan Boccara on before. This one is "auto to stick" and changing your style. Jason, you want to introduce this one? I thought it was a pretty good article, especially with some of the talks we've had recently with people talking about overusing auto. Yeah, it's based on Herb Sutter's CppCon talk from a fair number of years ago, where he's basically saying auto should always be on the left.
Starting point is 00:06:41 It truly is the "almost always auto" argument, really, and I just have a hard time using it all the time personally. But he makes a pretty strong argument. He actually had Herb review his article for why this is a good idea, and for, you know, basically fixing the type of a thing. I don't know. Yeah, I mean, he definitely points out a lot of good reasons why you would want to use auto, the main one being that you don't leave something uninitialized. Right. Yeah, that's also why you should always const. That too, that too. Do you use this always-auto style, Hana?
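Editor's sketch, not from the episode: the two uses of auto under discussion can be shown in a few lines of C++17. The helper compute() and the function name are hypothetical and exist only for illustration. The point is that a declaration starting with auto cannot be left uninitialized, and that the type can still be fixed ("stuck") by naming it on the right-hand side.

```cpp
#include <type_traits>

// Hypothetical helper, used only to have something to initialize from.
constexpr int compute() { return 42; }

constexpr bool auto_styles_agree() {
    // "auto to track": the variable's type follows the initializer (int here).
    auto tracked = compute();

    // "auto to stick": the type is named on the right, so it stays long
    // even if compute() later changes its return type.
    auto stuck = long{compute()};

    static_assert(std::is_same_v<decltype(tracked), int>);
    static_assert(std::is_same_v<decltype(stuck), long>);

    // int forgotten;    // legal, but leaves the variable uninitialized
    // auto forgotten;   // does not compile: auto always needs an initializer

    return tracked == 42 && stuck == 42L;
}

static_assert(auto_styles_agree());
```

Brace initialization in the "stick" form also rejects narrowing conversions, which is one reason this style is often paired with braces.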
Starting point is 00:07:30 Not every time. I'm still used to writing the type and then the variable name, but especially in the library I presented at CppCon, I use auto on the left side because it makes sense for functional programming. You have input and output. Right. Well, and I would imagine with a lot of what you're doing,
Starting point is 00:07:53 you almost don't have a choice because a lot of times you don't even know the name, the type of the thing, right? Yeah. Okay. This next one is a collection of videos from GNU Tools Cauldron, which is a recent conference, I guess, Jason? Yeah, I had no idea that this conference existed, but they apparently get together every
Starting point is 00:08:13 year and, you know, talk about GCC kinds of things. Yeah, that's interesting. It must be a shorter conference; it looks like there's 34 videos. That might just be one or two days' worth of content. Yeah, and several of them are lightning talks. Oh, okay. So probably just a one-day thing. Yeah, I'm kind of surprised by the number of conferences that are out there on what seem to be niche topics to me. I think right now there's actually a GitHub conference going on. And obviously, Git's a huge tool, you know,
Starting point is 00:08:45 lots of us use it every day, but I can't imagine going to an entire two-day conference revolving around Git, unless maybe they're talking about other kinds of general programming topics. Well, it's GitHub specifically. Well, I mean, one of the conferences that's getting ready to happen here, since I'm on Patreon, and I don't know, maybe you saw this email as well, Rob, is PatreCon. Really? Yes. So everyone has a conference. Everyone has a conference.
Starting point is 00:09:15 Okay. That's interesting. Did you look at any of these videos, Hana, by any chance? No, not at this point. Okay. I was just wondering if any of the names stood out to you. There is one on here that was a lightning talk that I will draw attention to, if I can find it again. Okay. Where did it go? It disappeared. What was the lightning talk on? Yeah, it's on automatically tuning your compiler flags for performance.
Starting point is 00:09:47 Oh. So it's automatic tuning of compiler options using irace. And basically it's a tool that just goes through and iterates through all the different possible combinations of compiler flags until you can get the fastest possible binary output. And I think they said in their experiments, they're getting about a 40% performance improvement over just using -O3. Oh, wow. That's pretty cool.
Starting point is 00:10:18 Yeah. No idea how long it takes for it to actually run that. But I know that there are listeners to this podcast who would care, that they would invest the time of letting this tool run for a few days so they know what flags to use in the next build of their high-performance thing. One thing I'm curious about is how do they measure that your code is running faster exactly? I think it's raw timing, but I could be wrong.
Starting point is 00:10:46 Okay. Interesting. And then the last thing we have is a post from Visual Studio blog about upcoming updates to both Visual Studio 2017 and Visual Studio for Mac. Obviously, it looks like they're starting
Starting point is 00:11:01 to talk more about the upcoming Visual Studio 2019, although I don't think we've gotten a date yet on when we could expect that. But version 15.9 of Visual Studio 2017 should be coming out pretty soon, it sounds like. Yeah. And I think the main news here is that 15.9, they said, is the service pack.
Starting point is 00:11:23 Once 15.9 ships, they're no longer supporting the old versions of Visual Studio 2017. Right, that makes sense. But, I mean, they've put out so many updates over the past two years. Yeah, we've gotten a lot of new features via updates. Yeah, well, one of the main comments on here says, "I just want to return to stability," because there have been so many changes between each sub-release or whatever. And I don't know if that's fair or not. I personally have not seen instability, but I just verify that Visual Studio is doing what I want it to do. I don't do my main development in Visual Studio. Right, right. I haven't noticed any instability either,
Starting point is 00:12:08 but I update usually as soon as they come out. I don't usually dip into the preview releases, though. No, I don't do that. And, yeah, Visual Studio for Mac, I'm not using that one either. I believe that's not a C++ IDE. That's just for C# and
Starting point is 00:12:25 Xamarin and stuff. I have no idea. I try to not use my Mac. In fact, I need to get rid of it if anyone wants it. And are you a Visual Studio user or you live on Linux? I'm working on a Mac and I'm using TextMate and Terminal. Nothing else.
Starting point is 00:12:44 Yeah. So you don't even use... Why can I never remember the name of Apple's IDE? Xcode. Yes, so you don't use Xcode. No. Yeah. I could never bring myself to use it.
Starting point is 00:12:58 But, I mean, I use Vim everywhere, and that's something I can rely on being everywhere. So there's no reason to use Xcode, I don't think, when I only use that once every few months or whatever. Every time I open Xcode, I'm shocked by the complexity and I want to return to my TextMate. I understand that. Okay, well, Hana, so this year at CppCon, just a few weeks ago,
Starting point is 00:13:27 you gave a talk on your compile time regular expressions library. And I think, Jason, you were in the talk. I did not make it, but I did get to watch the video afterwards. But even while I was still at the conference, I heard a lot of people talking about your talk, how it was really interesting and impressive what you managed to achieve. Can you just start off by telling us a little bit about what you're trying to solve with your regular expressions library?
Starting point is 00:13:52 I want to try to solve the problem of expensive construction of regular expressions. It's too expensive, I think, and your program shouldn't do it every time it's run. It should be done as a code during compilation, at least for me, because my regular expressions are stable, static for the whole program. That's a good point. I don't think I've ever personally worked on a project that had dynamically created regular expressions. You either knew what you were parsing or you didn't. I think 90% of usage of regular expressions are static. If you are not using, doing text editor or database,
Starting point is 00:14:38 you don't need dynamic regular expressions. Right. Oh yeah, that's a good point. I have used dynamic regular expressions in the context of SQL queries before, but I think that's the only time I've ever done that, maybe. I don't know. Yeah. So how exactly does your library parse these regular expression patterns at compile time in order to avoid doing it at runtime? I actually had to build an LL(1) parser, or a recursive descent parser.
Starting point is 00:15:07 And I parse it to an intermediate data structure in the form of a type list. And then I transform the type list into something similar, which can be used in the matcher, which is like a type list of type lists. Okay. So can you explain for us, because I know this is going to sound terrible, I do have a scripting language, but I don't know anything about parsers: what does LL(1) parser mean? An LL(1) parser is a context-free grammar parser which needs some form of stack
Starting point is 00:15:45 and it's deterministic. So by looking at the next symbol, in this case one char, it decides where it should go. So if you correctly design your grammar, it can be an O(n) algorithm. The actual parsing of it is O(n)? Yeah.
Starting point is 00:16:04 Okay. Do all regular expressions fall into this LL(1) parsability? Regular expression patterns are in an LL(1) grammar, but regular expressions themselves are
Starting point is 00:16:20 just regular languages. There are two parsers in my code. There is one parser which translates the pattern into some representation, and then there is the matching of the regular expression. The regular expression itself is just a regular language. There are classes of languages, and each type is more powerful than the previous one.
Starting point is 00:16:45 The least powerful are regular languages, then there are context-free languages, which are mostly programming languages, then there are the more general ones, which are programs by themselves. But the syntax of almost every programming language is a context-free grammar, even C++ with some hacks. Okay, so you could parse the regular expression at compile time,
Starting point is 00:17:15 but you cannot execute it at compile time. Is that correct? No, no, no, it's not correct. Okay. I can parse it, and I can evaluate it and match it against input. Oh, okay. Actually, I'm using one parser to create another parser to parse content of subject.
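Editor's sketch, not the library's code: to make "parsing a pattern at compile time with one symbol of lookahead" concrete, here is a small C++17 constexpr checker for a toy regex grammar (literals, grouping, alternation, and the *, +, ? quantifiers). It is written as recursive descent rather than the table-and-stack LL(1) form described in the conversation, and every name in it (PatternParser, is_valid_pattern) is hypothetical. Each decision looks only at the next character, so the pass is linear in the pattern length.

```cpp
#include <cstddef>
#include <string_view>

// Toy grammar (an assumption for this sketch):
//   alt  := seq ('|' seq)*
//   seq  := rep*
//   rep  := atom ('*' | '+' | '?')?
//   atom := '(' alt ')' | any other character
struct PatternParser {
    std::string_view in;
    std::size_t pos = 0;

    constexpr bool at_end() const { return pos == in.size(); }
    // One character of lookahead; '\0' stands for end of input.
    constexpr char peek() const { return at_end() ? '\0' : in[pos]; }

    constexpr bool alt() {
        if (!seq()) return false;
        while (peek() == '|') {
            ++pos;
            if (!seq()) return false;
        }
        return true;
    }

    constexpr bool seq() {
        // A sequence ends at ')', '|' or end of input.
        while (!at_end() && peek() != ')' && peek() != '|') {
            if (!rep()) return false;
        }
        return true;
    }

    constexpr bool rep() {
        if (!atom()) return false;
        if (peek() == '*' || peek() == '+' || peek() == '?') ++pos;
        return true;
    }

    constexpr bool atom() {
        if (peek() == '(') {
            ++pos;
            if (!alt()) return false;
            if (peek() != ')') return false;  // unbalanced group
            ++pos;
            return true;
        }
        // A quantifier with nothing in front of it is an error.
        if (at_end() || peek() == '*' || peek() == '+' || peek() == '?') return false;
        ++pos;
        return true;
    }
};

constexpr bool is_valid_pattern(std::string_view pattern) {
    PatternParser parser{pattern};
    return parser.alt() && parser.at_end();
}

// The checks run entirely in the compiler.
static_assert(is_valid_pattern("a(b|c)*d"));
static_assert(!is_valid_pattern("a(b"));
static_assert(!is_valid_pattern("*a"));
```

A malformed pattern therefore becomes a compile error at the static_assert instead of a runtime failure.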
Starting point is 00:17:36 Okay. So you use the parser of the regular expressions to create a new parser that then can do the matching or whatever. Yeah. Okay. And the parser itself is actually just a type I created during compile time. And so you can work with it as a type. So you can join them together to regular expressions
Starting point is 00:18:00 or do subscription and anything else. If I remember correctly from your presentation, it looked like this type that you generate at compile time could almost be used as a way to visualize the regular expression because the type had everything right there in it for what the regular expression was doing. Yeah, the type is actually the regular expression in just a different form.
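Editor's sketch, not the library's representation: the idea that "the type is the regular expression in a different form" can be illustrated with a few empty class templates and one overloaded function per node kind. All names here (ch, seq, star, match, full_match) are hypothetical. The matcher is greedy and does not backtrack, so it is weaker than a real regex engine; it only shows how a type can be read back and executed.

```cpp
#include <cstddef>
#include <string_view>

// Building blocks: the regular expression is encoded purely in types.
template <char C> struct ch {};              // a literal character
template <typename... Parts> struct seq {};  // Parts matched one after another
template <typename Inner> struct star {};    // zero or more repetitions of Inner

constexpr std::size_t no_match = static_cast<std::size_t>(-1);

// Each overload returns the position after the match, or no_match.
template <char C>
constexpr std::size_t match(ch<C>, std::string_view s, std::size_t pos) {
    return (pos < s.size() && s[pos] == C) ? pos + 1 : no_match;
}

template <typename Inner>
constexpr std::size_t match(star<Inner>, std::string_view s, std::size_t pos) {
    for (;;) {
        std::size_t next = match(Inner{}, s, pos);
        // Stop on failure, or if Inner matched the empty string.
        if (next == no_match || next == pos) return pos;
        pos = next;
    }
}

template <typename... Parts>
constexpr std::size_t match(seq<Parts...>, std::string_view s, std::size_t pos) {
    // Fold over the parts: each one continues where the previous one stopped.
    ((pos = (pos == no_match) ? no_match : match(Parts{}, s, pos)), ...);
    return pos;
}

template <typename Re>
constexpr bool full_match(std::string_view s) {
    return match(Re{}, s, 0) == s.size();
}

// The type below spells out the pattern "ab*c".
using ab_star_c = seq<ch<'a'>, star<ch<'b'>>, ch<'c'>>;

static_assert(full_match<ab_star_c>("abbbc"));
static_assert(full_match<ab_star_c>("ac"));
static_assert(!full_match<ab_star_c>("abx"));
```

Because the pattern lives in the type, printing the type name in a compiler diagnostic shows the structure of the expression, which matches the "visualization" remark above.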
Starting point is 00:18:28 Okay. I was able to design matcher, which take this form and do the matching. Wow. Very cool. How does the library compare with some of the other regex libraries out there, including the standard regex library?
Starting point is 00:18:46 I'm on par with the PCRE library, which is similar in most cases. In some cases, I'm better. In some cases, PCRE is better. But for other libraries, like POSIX regular expressions in egrep, I'm usually 10 times quicker, maybe even more. I tried to benchmark RE2 regular expressions from Google,
Starting point is 00:19:15 and I'm like 10% quicker than them. I tried the regular-expression-to-C compiler, re2c, which actually generates C code with a state machine. And in most cases, I am 10, 20% quicker. That's pretty impressive. That's amazing, yeah. Thank you. And then the standard library ones?
Starting point is 00:19:40 Yeah, the standard library ones are a special case. I don't know why, but they are so slow. When I was benchmarking them, I actually killed them a few times because I was thinking that they were stuck. Something I can match in like three, five seconds, and I mean one gigabyte files, took 50 minutes with the regular expressions in the standard library, for example.
Starting point is 00:20:07 Wow. So you're orders of magnitude faster than them. Yeah. In this case, I'm talking about the libc++ regular expressions. The libstdc++ implementation is much better. Okay. But still, it's kind of slow, like a minute versus five seconds. Now, I know you're not involved
Starting point is 00:20:31 in the standard library implementations, or at least I don't think you are, but do you have any idea why they're not just using PCRE or one of the already proven ones in the backend when they can? I don't know. I can only imagine maybe it's license problems.
Starting point is 00:20:50 I'm not sure. Okay. Okay. I believe in your talk you talked about a couple C++20 features that your library is dependent on. What platforms is it currently supported on? Currently, you can compile it on Clang, I think, version 4 and newer, and GCC 7.2 and newer.
Starting point is 00:21:17 I'm using a feature called string template literals, which was designed as an implementation of paper N3596, I'm not sure, and it was designed to create string literals
Starting point is 00:21:39 which give you a char pack instead of a const char* and a size_t, so you can use the char pack as a string during compile time. But it wasn't approved by the committee. And based on that design, Jeff Snyder later came with a much better design, which allows you to use almost any type as a template parameter.
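Editor's sketch of the char-pack literal described above, under stated assumptions: it relies on the non-standard string literal operator template that the conversation says GCC and Clang accept as a GNU extension, so it is not portable ISO C++ and may trigger extension warnings. The names char_pack and _pack are hypothetical. The literal's characters arrive as template arguments, so the string becomes part of a type.

```cpp
#include <cstddef>
#include <type_traits>

// Holds the characters of a string literal as template arguments.
template <char... Cs>
struct char_pack {
    static constexpr std::size_t size = sizeof...(Cs);
};

// GNU extension (not ISO C++): the literal's characters are delivered as a
// parameter pack instead of a (const char*, size_t) pair.
template <typename CharT, CharT... Cs>
constexpr char_pack<Cs...> operator""_pack() {
    return {};
}

// The terminating '\0' is not part of the pack.
static_assert(decltype("ab+"_pack)::size == 3);
static_assert(std::is_same_v<decltype("ab"_pack), char_pack<'a', 'b'>>);
```

Once the characters are template arguments, a compile-time parser can consume them one at a time through ordinary template specialization or overload resolution.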
Starting point is 00:22:04 Okay. So this is the C++20 feature that is just, yeah, the... Yeah, it's called class non-type template parameters. Right. Okay, I did not realize that. Okay. And in my library, you can use the extension which is implemented in GCC and Clang, which is a C++17 extension, a GNU extension, or
Starting point is 00:22:28 if your compiler is able to use class non-type template parameters, then by using a feature test macro, my library will support the new form of syntax automatically. Okay. So right now, can it work in Visual
Starting point is 00:22:44 Studio at all? Not yet. Okay. I asked Stephan T. Lavavej if they will be implementing this feature. They will be implementing it soon, in Q1 next year. Oh, okay. And it should work on C++ in Visual Studio. Maybe without making too big of a guess here,
Starting point is 00:23:09 if he said Q1 of next year, they might be referring to the first release of Visual Studio 2019. Hopefully. It could be, yeah. Hopefully. But this feature is depending on operator UFO. What's that? UFO operator.
Starting point is 00:23:28 Oh, okay, right. The three-way comparison. Yeah. So are you aware of anyone using your library yet in production code? I'm aware only of me, but there were a lot of people who asked me a lot of questions, so maybe someone is using it. It's new, it's been public for just two weeks, so maybe. I hope so. So it is actually
Starting point is 00:23:57 being used in your products at work? Yeah. Oh, that's cool. So you actually did a lightning talk about this library last year at CppCon, and it sounded like you maybe have gone through a couple different iterations of this library, is that right? Yeah, this is like the sixth or seventh iteration, I'm not sure. I played with the idea for like the last
Starting point is 00:24:19 four or five years, and I tried a lot of different approaches. Some of them work, some of them don't. Wow. So when you come up with a new iteration library, kind of like starting from scratch, it's just the same basic idea of, you know,
Starting point is 00:24:38 I want to make these regular expressions, you know, parse at compile time, but each one is a unique library? That's the original idea, but the implementation was completely different. This time I used function overload resolution as the major point, the major engine of my library, and it works, and it's rather elegant code I can explain on one slide. Yeah, when I saw how you were using function overload resolution,
Starting point is 00:25:09 I thought, I think that that was the piece that I was missing when I last tried to write a parser that could run at compile time. But that leads us to a question Odin Holmes wanted us specifically to ask. How did you come up with your unique style of metaprogramming using these techniques? I usually read papers in future C++, and I play with them if it's possible and try to use them in different scenarios
Starting point is 00:25:36 and how they work together. And then sometimes I find something, maybe I learn something which is already known, I don't know. And then I come up with an idea, and then I try to implement it. If it works, it works. Everyone says that template metaprogramming
Starting point is 00:25:54 is Turing complete. So in theory, you should be able to do whatever you want. Right. So, implement a parser. So that sounds like extremely valuable experience. There's not very many people who actually try the new features as they're being developed to see how they interact with each other. Have you ever, like, written up your experiences on this and submitted it, just as, like, I figure it's not a proposal to the committee, but
Starting point is 00:26:26 just like an info paper to the committee on how useful these features are? I didn't. I don't like writing. Well, on the topic of that, though, I did notice that you do have a paper for San Diego. Is that correct? Yeah, there is one paper as first author, and I think two others with me as second author. Maybe only one, I'm not sure. Can you tell us about the paper? The paper is a new attribute to C++20, which should allow you
Starting point is 00:27:07 to introduce a new diagnostic, a user-defined diagnostic. If your function's SFINAE or requires or just attributes do not match, it should give you the diagnostic.
Starting point is 00:27:25 For example, in my case, in my library, I was wondering whether I should use a SFINAE check when creating the regular expression, or a static_assert. A static_assert can give you a custom diagnostic, but a SFINAE check is more friendly to others when I am programming. And you cannot have both. And this attribute should change it.
Starting point is 00:27:49 Okay. So in case you ever wanted to have custom error message if your overload cannot be used because of some constraint, for example, your T is not numeric, this constraint should help you.
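Editor's sketch of the trade-off described above, using only standard C++17 and hypothetical function names (twice_sfinae, twice_asserted, can_twice). The proposed attribute itself is not shown, since its syntax is not given in the conversation. A SFINAE constraint lets other code detect that an overload is unusable but cannot carry a custom message; a static_assert carries a message but fires as a hard error, so the overload cannot be probed.

```cpp
#include <type_traits>
#include <utility>

// SFINAE version: for a non-numeric T this overload silently disappears.
// Callers get a generic "no matching function" error, but other templates
// can detect the absence and fall back to something else.
template <typename T, typename = std::enable_if_t<std::is_arithmetic_v<T>>>
constexpr T twice_sfinae(T value) {
    return value + value;
}

// static_assert version: the overload always participates, and a bad T
// produces the custom message below as a hard error.
template <typename T>
constexpr T twice_asserted(T value) {
    static_assert(std::is_arithmetic_v<T>, "twice_asserted: T must be numeric");
    return value + value;
}

// Detection idiom: only works because twice_sfinae is SFINAE-friendly.
template <typename T, typename = void>
struct can_twice : std::false_type {};

template <typename T>
struct can_twice<T, std::void_t<decltype(twice_sfinae(std::declval<T>()))>>
    : std::true_type {};

static_assert(can_twice<int>::value);
static_assert(!can_twice<const char*>::value);
static_assert(twice_asserted(21) == 42);
```

The attribute discussed in the episode aims to give the first form a user-written explanation while keeping it detectable.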
Starting point is 00:28:06 Maybe some will argue that you don't need it because concepts should solve it. But it's not actually true. Concepts can give you nicer error messages, but if you have a very complex concept, it can give you very bad error messages. For example, in my case, I can introduce a concept which checks
Starting point is 00:28:29 if a regular expression is correct and if it doesn't contain any captures. Because if there are no captures in a concept, I can use a different engine to match it, which is much quicker than backtracking. Oh, interesting. Okay. I also noticed that your name is on a paper with
Starting point is 00:28:49 Bryce Adelstein Lelbach on implicit constexpr. Yeah, that's true. That one, well, that looks interesting to me. Our original thinking was that there is a lot of work going on so that everything will be constexpr in a few years.
Starting point is 00:29:07 We can agree on that. Yes. So let's just cut it short and make everything constexpr now, just by saying the compiler should be able to evaluate it. That's everything done. You don't need any work, from my perspective. I'm not an expert. Right.
Starting point is 00:29:26 Ideally, you should be able to invoke some code generated by compiler, even in some different library, and place it as a parameter for template, for example. Right. It should make language much easier and simpler
Starting point is 00:29:41 to use. I guess there's no reason why you would ever want to explicitly mark something non-constexpr, right? Maybe some input-output functions, maybe. Is that part of the paper? Is there a non-constexpr if it's going to be implicit constexpr? There is constexpr false. False, okay.
Starting point is 00:30:06 It's not nice, but I think it could easily be changed to something else. The same topic is also covered by the constexpr exclamation mark paper, currently known as the consteval paper. So whether these papers will be joined or not, I don't know. Yeah, I saw that overlap with constexpr!, because in this proposal, constexpr(true) says it must be done in a constexpr context. Yeah, it's the same.
Starting point is 00:30:41 You know, honestly, it's more wordy to have to do this constexpr open paren true, but I feel like it's more consistent with the rest of the language personally, like it matches the noexcept and the conditionally explicit syntax. I personally think that having true or false after constexpr will raise a lot of questions. People can ask, should I place an expression here or not? So maybe it's not the best idea to have true or false. Maybe a new keyword will be better. I'm not sure. Maybe.
Starting point is 00:31:21 But I do like this idea of this implicit constexpr. We are at the point now where we do have a few different tools that can tell us, oh, by the way, that function could be constexpr. Like, the compilers already know this, because they know what operations you're performing inside it. And as your paper points out, they have to know that, because lambdas are implicitly constexpr. Yeah, that's true.
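Editor's sketch of the point about lambdas, in standard C++17 with hypothetical names: a lambda's call operator is implicitly constexpr when its body qualifies, while an ordinary function still needs the keyword before it can be used in a constant expression. The proposal discussed here would extend the lambda behavior to ordinary functions; that part is not shown because it is not valid C++ today.

```cpp
// The lambda's call operator carries no constexpr specifier, yet it is usable
// in a constant expression: since C++17 it is implicitly constexpr whenever
// its body satisfies the constexpr rules. (The constexpr here applies to the
// variable holding the closure, not to the call operator.)
constexpr auto square = [](int x) { return x * x; };
static_assert(square(7) == 49);

// An ordinary function needs the keyword spelled out to allow the same use.
constexpr int cube(int x) { return x * x * x; }
static_assert(cube(3) == 27);

// Without the keyword the body is just as simple, but the next line would
// be rejected:
inline int plain_cube(int x) { return x * x * x; }
// static_assert(plain_cube(3) == 27);  // error: not a constant expression
```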
Starting point is 00:31:49 I'd like to interrupt the discussion for just a moment to bring you a word from our sponsors. Authors of the PVS-Studio analyzer suggest downloading and trying the demo version of the product. A link to the distribution package is in the description of this podcast episode. You might not even be aware of the large number of errors and potential vulnerabilities that the PVS-Studio analyzer is able to detect. A good example is the detection of errors that are classed as CWE-14 according to the Common Weakness Enumeration: Compiler Removal of Code to Clear Buffers.
Starting point is 00:32:19 PVS-Studio creators demonstrate the detection of such an error type, for example, in one of the latest articles, "We Checked the Android Source Code by PVS-Studio, or Nothing is Perfect." A link to this article is also in the description of this episode. PVS-Studio works in Windows, Linux, and macOS environments. The tool supports the analysis of C, C++, and C# code, and Java support is coming soon. Try it today.
Starting point is 00:32:46 So since we're talking about the ISO committee right now, you mentioned in the intro of your talk that you live in the Czech Republic, and that the Czech Republic is soon going to be a national body for WG21. Are you involved in that process? I started the process. You did? That's awesome.
Starting point is 00:33:08 Last year, Bryce and others told me the Czech Republic is only an O-member, an observing member. Do you want to be in the committee? And I was like, yeah, but that's a lot of work. And I was thinking about it and then I contacted the Czech bureau which is responsible for communication with ISO.
Starting point is 00:33:30 And they were quite happy: if Avast wants to put a lot of work into it, they are okay with it. And there will be a Czech national body in the form of our company, Avast. So how long is this process, do you think? A lot of the work is for our lawyers. It's a lot of lawyering. Oh, okay. Actually, it's just a few signatures, and that's all.
Starting point is 00:33:59 We are already O-members, so the hard work is already done. So when should this be official? When should you be a voting national body? I hope yesterday. I'm still waiting for the lawyers, but soon. I will not be in San Diego, but I will be at the next meeting. Yeah, that's what I was going to ask, if you're going to San Diego. What is the next one after that? It's in February and it's in Hawaii. It's a good one to go to. You're right.
Starting point is 00:34:32 Not that you'll be able to enjoy it, but... I will be probably sitting in an air-conditioned hotel room without windows. Yes. That's what we've been told. At the beginning of your talk, you also mentioned you're involved in the C++ Prague meetup.
Starting point is 00:34:52 How is the C++ community in Prague? It's starting. It's small, and it will be bigger and bigger with each meetup, I think. There will be a next meetup tomorrow, which is October 17th. And there will be Tony Van Eerd.
Starting point is 00:35:12 I was able to convince my company to invite some nice guests to come here. Your company recently had Matt Godbolt out, is that correct? Yeah, that's also my meetup. Oh, okay. Very cool. I didn't realize that was the meetup also. My meetup
Starting point is 00:35:35 has not flown anyone from overseas to come in. Although we have had people coincidentally visit from overseas because they were on their way to a conference here. I wanted from the beginning for every talk to be streamed. It helps to spread the word about the meetup and it's a lot about marketing of our company. Right.
Starting point is 00:35:56 So you can watch the C++ Prague meetups online? Yeah. Streamed. Live. Awesome. So who is your next speaker then? Tony Van Eerd tomorrow. Okay. Wow. That's a long flight for Tony.
Starting point is 00:36:16 I'm curious. My experience in Eastern Europe so far is like Wrocław, Poland, has a gigantic C++ community. It seems that Budapest does. Is it similar in Prague? Is there a lot of C++ development happening? There is a lot of development. There is Skype, Microsoft,
Starting point is 00:36:38 Red Hat in Brno, and a lot of different companies. So there are probably a lot of C++ developers, but they are hidden inside, I think. Right. Well, you'll also be presenting your talk at C++ on Sea in early 2019. Are you planning on making any updates to it?
Starting point is 00:36:59 Yeah, there'll be a few significant ones. Are you rewriting the library again? I'm preparing a few new tricks to shock the audience. And I hope there will be a new engine in the library. Wow. Yeah.
Starting point is 00:37:21 So it sounds like you enjoy rewriting it. Nothing from it. Okay. No, no. It's not rewriting. It's an additional engine which can be used if there are no captures in the regular expression, only matching, which can be done much quicker. Okay.
Starting point is 00:37:35 I have one more question about all these different iterations of the library. If you're stuck on C++11 or 14, could you use one of the earlier iterations of the library, since the new one is kind of dependent on these C++17 and C++20 features? I think the oldest one, I don't know
Starting point is 00:37:53 if it's public, was on C++11 with the extension. Okay. But probably not. It would be a lot of work to make it work. Yeah.
Starting point is 00:38:05 Okay. I'm also curious, since you mentioned you had to build an LL(1) parser first, is that parser component reusable? Is that something that other people could use to make their own parsers? Yeah, you can just take part of my library. It's called ctll. Provide a new grammar and you are good to go. Okay. Oh, I see, compile time LL. Yeah.
Starting point is 00:38:30 You can just design a new grammar in the form of a table with function overloads, and you give me the grammar as a type, as a parameter to the parser, and that's all. In my library, there is a branch, brainfuck, which implements the language brainfuck with the parser.
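The "table with function overloads" idea described here can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual ctll API: every name in it (term, push, epsilon, reject, stack, input, parser, parens, matches) is hypothetical and chosen only for this example. Each cell of the LL(1) table becomes one overload of rule(), so ordinary overload resolution performs the table lookup, and the grammar is handed to the parser as a type.

```cpp
#include <type_traits>

// Building blocks (hypothetical names, not the real ctll interface).
template <char C> struct term {};           // terminal symbol
struct end_of_input {};                      // lookahead once the input is used up
template <typename... Ts> struct push {};    // rule result: push these symbols
struct epsilon {};                           // rule result: push nothing
struct reject {};                            // rule result: no rule applies
template <typename... Ts> struct stack {};   // parser stack, top first
template <char... Cs> struct input {};       // remaining input characters

// Grammar for balanced parentheses: S -> '(' S ')' S | epsilon.
// Each table cell (non-terminal, lookahead) is one overload of rule().
// The overloads are only declared; they are used inside decltype only.
struct parens {
    struct S {};
    using start = S;
    static auto rule(S, term<'('>) -> push<term<'('>, S, term<')'>, S>;
    static auto rule(S, term<')'>) -> epsilon;
    static auto rule(S, end_of_input) -> epsilon;
    static auto rule(...) -> reject;
};

template <typename G> struct parser {
    // Lookahead helpers: first symbol of the input and the rest of it.
    template <char C, char... Cs>
    static constexpr term<C> head(input<C, Cs...>) { return {}; }
    static constexpr end_of_input head(input<>) { return {}; }
    template <char C, char... Cs>
    static constexpr input<Cs...> tail(input<C, Cs...>) { return {}; }

    // Empty stack: accept only if the input is used up as well.
    template <char... In>
    static constexpr bool run(stack<>, input<In...>) { return sizeof...(In) == 0; }

    // Terminal on top of the stack: it must equal the next character.
    template <char C, typename... Rest, char... In>
    static constexpr bool run(stack<term<C>, Rest...>, input<In...> in) {
        if constexpr (std::is_same_v<decltype(head(in)), term<C>>)
            return run(stack<Rest...>{}, tail(in));
        else
            return false;
    }

    // Non-terminal on top: picking the rule() overload is the table lookup.
    template <typename Top, typename... Rest, char... In>
    static constexpr bool run(stack<Top, Rest...>, input<In...> in) {
        return apply(decltype(G::rule(Top{}, head(in))){}, stack<Rest...>{}, in);
    }

    // Apply the outcome of a rule to the stack and continue parsing.
    template <typename... New, typename... Rest, typename In>
    static constexpr bool apply(push<New...>, stack<Rest...>, In in) {
        return run(stack<New..., Rest...>{}, in);
    }
    template <typename... Rest, typename In>
    static constexpr bool apply(epsilon, stack<Rest...>, In in) {
        return run(stack<Rest...>{}, in);
    }
    template <typename... Rest, typename In>
    static constexpr bool apply(reject, stack<Rest...>, In) { return false; }
};

// The grammar is passed as a type; the input as a pack of characters.
template <typename G, char... Cs>
constexpr bool matches = parser<G>::run(stack<typename G::start>{}, input<Cs...>{});

static_assert(matches<parens, '(', '(', ')', ')', '(', ')'>);
static_assert(!matches<parens, '(', ')', ')'>);
```

Swapping in a different grammar type with its own rule() overloads would reuse the same parser unchanged, which is the reusability point made in the conversation. The sketch needs C++17 for `if constexpr` and `std::is_same_v`.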
Starting point is 00:38:59 Which I'd imagine is a very difficult language to parse. It's actually very easy to parse. Really? Okay. Yeah, it's like 100 lines of grammar, which is nothing. Yeah, my understanding is that language is one of those write-only languages. Yeah, I use examples from Wikipedia. I didn't try to write my own. So one thing we haven't brought up at all yet
Starting point is 00:39:27 with all this compile time magic you're doing is the impact on the compiler. How much does this affect the compile time of your project? You mean just using my library or... Yeah, using your CTRE library, yeah. Like if I'm going to throw a half dozen regular expressions with your library in my project because I want really, really fast run times,
Starting point is 00:39:51 is it going to take me an hour to compile my project? No, it will take you only a few milliseconds per regex. On GCC, it's 200 milliseconds per regex. For how big of a regex? Sorry. I use medium-sized regex, I think 20 chars long.
Starting point is 00:40:13 Okay. So it looks like most ordinary regular expressions. In Clang, it's a little quicker, I think 150 milliseconds. Depends on your computer. But if you use the same regular expression multiple times,
Starting point is 00:40:32 the compiler caches the parsing, so you're not paying anything more. So you can use your regular expression 100 times, and you will not notice it. So that just, if I understand correctly, the compiler sees that you're instantiating the same template again and says, oh, I already know what's going to happen when I instantiate this template? Exactly. Okay. If they are in the same translation unit.
Starting point is 00:41:01 Right. Okay. So it sounds like you could, if for some reason these became a compile time concern to you, you could use them as extern templates, as Michael Caisse recommended when we had him on the show. Yeah, probably.
Starting point is 00:41:16 Or you can just wrap them in a function or in a different translation unit. Oh yeah. That would make sense because you just need to pass a string view or whatever to it that you want to match against. Okay. That's really cool. And, you know, Rob started out by saying this, but it is true. All of the comments on Reddit and Twitter and Slack seem to agree that your talk was the best one of the conference this year. You got several applause lines.
Starting point is 00:41:47 People were pretty impressed from what I could tell in the video. Yeah. I was surprised, too. Oh, and on that note, the first video that went up had to be pulled because it had an incorrect title. So it sounds like, Rob, you did find a video that stayed online. No, there is no video currently online. It was
Starting point is 00:42:08 pulled because there was a wrong title. It was just regular expressions. It was missing compile time. Okay, so they haven't reposted it yet. Okay, well, hopefully they get it up soon. Yeah. One more thing, since we're talking about the
Starting point is 00:42:23 length of compile time for a regex, I think you did find there was an upper limit to how long a regex could be before the compiler would be unable to process it. Because the depth of template instantiation recursion is the same as the length of the string, you are limited by your compiler's limit on template instantiation recursion, so it's usually 100 chars. Okay. 100 chars, okay.
Starting point is 00:42:54 I don't think many people are writing regular expressions that long, though. Yeah, so it's not very usable for big JSONs, but for regexes, it's enough. Yeah. Yeah, and I think most compilers have compiler flags where you could extend that limit if you really wanted to. Yeah.
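To make the link between pattern length and recursion depth concrete, here is a minimal sketch. It is not CTRE's code; the names text and depth are hypothetical. Each character of the pack is peeled off by one further template instantiation, so a pattern of N characters needs roughly N nested instantiations. On GCC and Clang the limit on that nesting can be raised with the `-ftemplate-depth=N` option.

```cpp
#include <cstddef>

// Hypothetical stand-in for a pattern held as a pack of characters.
template <char... Cs> struct text {};

// Base case: nothing left to process.
constexpr std::size_t depth(text<>) { return 0; }

// Recursive case: handling one character instantiates depth() again for
// the remaining characters, so the nesting grows with the pattern length.
template <char C, char... Rest>
constexpr std::size_t depth(text<C, Rest...>) {
    return 1 + depth(text<Rest...>{});
}

static_assert(depth(text<'a', 'b', 'c'>{}) == 3);
```

A pattern long enough to exceed the compiler's instantiation limit would fail to compile at this point, which is the situation where a clearer "your regular expression is too long" diagnostic would help.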
Starting point is 00:43:13 I actually was thinking about writing a paper which gives you the ability to ask about the limit of template instantiation recursion. So you can use it in your parser and give it a nice error message.
Starting point is 00:43:30 Your regular expression is too long. That would be interesting. Are you aware if the standard currently says if there's a minimum recursion limit? I'm not sure, but there shouldn't be any. It's just an implementation issue.
Starting point is 00:43:48 Yeah, that's probably true. I know there's some places where the standard has minimums, but... Wow. It's a very cool library. Thank you. Was there anything else you wanted to share with us before we let you go, Hana? I'm not sure.
Starting point is 00:44:05 I don't know. Probably not. Okay. Okay. Well, it's been great having you on. We'll get a new link to the YouTube video as soon as it's back up and put that in the show notes. And good luck at the Hawaii C++ meeting when you go there. Thank you.
Starting point is 00:44:23 And I guess I'll see you in England in a few months. Yeah, if anyone is at Meeting C++, I will be there too. Oh, right. I forgot that you had submitted to that one as well. Well, have fun. Will this be your first year
Starting point is 00:44:40 at Meeting C++ then, you said? Yeah. It's a good conference. I heard everything good about it. Good. Okay, thank you, Hana. Thank you. Thanks.
Starting point is 00:44:54 Thanks so much for listening in as we chat about C++. We'd love to hear what you think of the podcast. Please let us know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic, we'd love to hear about that too. You can email all your thoughts to feedback at cppcast.com. We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter.
Starting point is 00:45:14 You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter. We'd also like to thank all our patrons who help support the show through Patreon. If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast. And of course, you can find all that info and the show notes on the podcast website at cppcast.com. Theme music for this episode was provided by podcastthemes.com.
