CppCast - Vcsn

Episode Date: April 13, 2017

Rob and Jason are joined by Akim Demaille to discuss VCSN, a platform for automata and rational expressions, and some of the interesting problems he faced while working on the library.

Akim has been participating in free software for about 20 years, starting with a2ps, an anything to PostScript tool written in C. In order to ensure its portability, he became a major contributor to GNU Autoconf, GNU Automake and GNU Bison. Akim has been teaching and researching at EPITA, a French CS Graduate School, for eighteen years. He has taught formal languages, logics, OO design, C++ and compiler construction, which includes the Tiger compiler, an educational project where students implement a compiler in C++. This project, whose assignment is regularly updated, keeps track of the C++ evolutions, and this year's version uses C++17 features. Akim's recent research interests are focused on the Vcsn platform, dedicated to automata and rational expressions. He's recently been recruited by former students of his to be part of the Infinit team at Docker.

News: Announcing Meeting C++ 2017; C++Now 2017 Keynote: Ali Çehreli - Competitive Advantage with D; Reduce C++ Build Times by Reducing Header Dependencies; Capturing *this in C++11, 14 and 17
Akim Demaille: Akim Demaille's GitHub
Links: Vcsn home page; Vcsn Online Sandbox; The Tiger Project; Johnny Five; Technical report about runtime instantiation in C++
Sponsors: Incredibuild; JetBrains

Transcript
Starting point is 00:00:00 This episode of CppCast is sponsored by Incredibuild. Accelerate your C++ build by up to 30 times faster directly from within Visual Studio 2017. Just make sure to check the Incredibuild box in the C++ Workload VS 2017 setup. And by JetBrains, maker of intelligent development tools to simplify your challenging tasks and automate the routine ones. JetBrains is offering a 25% discount for an individual license on the [...] Pacific++, the first major C++ conference in the Pacific region, providing great talks and opportunities for networking. Get your ticket now during early bird registration until June 1st. Episode 97 of CppCast with guest Akim Demaille, recorded [...] discuss some big conference announcements. Then we talk to Akim Demaille, professor at EPITA.
Starting point is 00:01:22 Akim talks to us about VCSN, a platform for automata and rational expressions. Welcome to episode 97 of CppCast, the only podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how are you doing today? Good. Sleepy. Sleepy? Yeah, we'll see if I stay awake through the episode. I don't think we've lost you to passing out once yet, so I think you'll be okay. That's true. That hasn't happened yet.
Starting point is 00:02:14 Yeah. So I just got a new laptop. I finally installed Visual Studio 2017, so that's nice. I've been waiting. I was expecting to get the new laptop soon. And 2017 is supposed to install like 10 times faster than 2015, right? Yeah, and I installed both 2013 and 2017, because I still might need 2013 to build some things. And you could notice pretty clearly the difference in install time between the two. It was cool. I will have to do that when I get my new laptop soon also. Very cool. Okay, well, at the top of our episode, I'd like to read a piece of feedback. This week we got an email from Mandar, and he wrote: You guys have been doing really great work, and my
Starting point is 00:02:57 40 minute drive to office and on the way back has been a learning ride ever since. I'm from Pune, India, and I'm a C++ developer like many others. I call myself as born and brought up with CPP because I started my career 10 years ago as a C++ programmer, and I believe my programming sense has grown slowly and steadily ever since then, just like the C++ standards. Also, congratulations for grabbing
Starting point is 00:03:22 the best overall podcast. You guys really deserve it. Thank you. We appreciate it. He has a suggestion about a topic: Could you discuss the pros and cons of static initialization in C++ in one of your episodes? In my previous organization, the product which I worked on suffered from a large startup time, and one of the culprits was huge usage of static initialization. My learning was, although the concept is very powerful, it needs to be judiciously used. But overall I find the concept interesting to explore more and know
Starting point is 00:03:50 about. Jason, this sounds like something you may have already talked about in a C++ Weekly. I do have some, yeah. I did one episode on the cost of using statics, but more just in general, I hate globals, so it would be a highly opinionated episode if we talked about that. Well, that'd be a good one though. We'll have to think about that some more, see who we would get on. Yeah. So we'd love to hear your thoughts about the show as well. You can always reach out to us on Facebook, Twitter, or email us at feedback at cppcast.com. And don't forget to leave us a review on iTunes. Joining us today is Akim Demaille. Akim has been participating in free software for about 20 years, starting with a2ps, an anything to PostScript tool written in C. In order to ensure its portability, he became a major contributor to GNU Autoconf, GNU Automake, and GNU Bison.
Starting point is 00:04:44 Akim has been teaching and researching at EPITA, a French CS graduate school, for 18 years. He's taught formal languages, logics, object-oriented design, C++, and compiler construction, which includes the Tiger compiler, an educational project where students implement a compiler in C++. This project, whose assignment is regularly updated, keeps track of the C++ evolution, and this year's version uses C++17 features. Akim's recent research interests are focused on the VCSN platform, dedicated to automata and rational expressions. And he's recently been recruited by former students of his to be part of the Infinit team at Docker. Akim, welcome to the show.
Starting point is 00:05:22 Thank you for having me. I got to ask about your contributions to Autoconf, Automake, and Bison to ensure portability, because those are kind of notoriously difficult to use tools on Windows, specifically. Yeah, but it was something like 20 years ago, and I was mainly interested in all the different kinds of Unixes that existed then. So it's more Solaris and the young Linux at that time, and many different Unixes. That was my main focus. That would have been a very interesting time to be working on helping make sure the Autotools are working on the brand new Linux.
Starting point is 00:06:05 Yeah, we call it Autotools if you want to name the GNU build system: Libtool, Automake, and Autoconf. Yeah, it was an exciting time. Okay, so we're going to have a couple news articles to discuss, Akim, and then we'll talk more about your most recent project with VCSN, okay? Sure. Okay, so first one, we have a couple conference announcements. Announcing Meeting C++ 2017.
Starting point is 00:06:35 It's going to be three days now, November 9th to 11th in Berlin, just like the previous years of the conference. And one of the exciting things I thought here was they announced the keynoters, at least some of them: Sean Parent, Kate Gregory, and Max Mustermann. I'm not familiar with Max Mustermann, but it's great to see Kate Gregory will be keynoting the conference. I'm actually friends with Kate on Facebook, and I think she made a recent announcement that she actually has been having some health problems over the past year and that's why we
Starting point is 00:07:10 didn't see her at CppCon, but she's feeling better now and she's going to be back on the conference circuit, which is great to see. Yeah, that's... Did you catch this Max Mustermann, do you know who that is? I do not. And what about John Doe? Well, I'm guessing John Doe is a placeholder? Yeah, I suppose. Yes, it's like the German version of a placeholder. Oh, okay, okay. I was not aware of that. Okay, so they're waiting to announce this third keynote then. And it's exciting to see women in keynotes. Oh yeah, absolutely.
Starting point is 00:07:47 And also a friend of the podcast. Yeah. Okay, and then another conference announcement: C++Now is announcing their second keynoter, and this one will be Ali Çehreli. Hopefully I'm pronouncing that right. And that's programming in D. So he is the author of Competitive Advantage with D. Is that the name of the book? No, he's the author of Programming in D. Programming in D, okay. And the title of his talk is Competitive Advantage with D.
Starting point is 00:08:23 And he also sits on the D Foundation. So, Jason, you've attended C++Now a couple times. How do you think the conference is going to respond to these D and Rust developers giving keynotes? I think it's going to be a lot of fun, because the people who tend to attend C++Now are people that are stretching the language to its limits and want to know what's possible in other languages and want to know where the language should be evolving to. So I think it makes a lot of sense for C++Now.
Starting point is 00:08:52 Maybe not for one of the larger, more mainstream conferences. Yeah, maybe this one wouldn't go over quite as well at CppCon. Maybe not, but... Yeah. Yeah, I was listening to CppChat last week, and Bryce Lelbach and Jon Kalb, of course, were on that, and they're both involved with both conferences. And they made a comment.
Starting point is 00:09:14 They didn't announce Ali doing the keynote then, but they kind of hinted at it. And they said that they've looked at feedback from previous conferences and that the more controversial topics seem to be more favored by the audience. So they're specifically trying to go for controversial topics or potentially controversial topics
Starting point is 00:09:36 because they think the audience will like it. So this definitely seems like it should be popular. Yeah. Have you been to either of these conferences, Akim? No, unfortunately not. Okay.
Starting point is 00:09:51 So the next one, this is on Lattix's blog, and it's reduce C++ build times by reducing header dependencies. And this is something that I'm guessing a lot of our listeners have heard of, that if you can reduce the number of headers in your source files, you should be able to reduce build times,
Starting point is 00:10:11 especially if the header files are unnecessary. But it's kind of worth reminding listeners that you should be looking out for this. And when I looked at Reddit, there were some comments pointing out some other tools that you could use to look for unused headers. I think it's clangs, only use what you use. Include what you use.
Starting point is 00:10:33 Include what you use, yeah. Yeah, it's useful. I actually had a problem with this article. What was that? In his final example here, he forward declares share price, okay, so that he doesn't have to include an extra header. Sure. And then in the class stock, share price now becomes a pointer, right? And it's not wrapped in a smart pointer, and he does not add a destructor. So we've taken what was very simple, easy-to-reason-about code that could all work nicely with the stack
Starting point is 00:11:17 and converted it into something with a memory leak, essentially. Yeah, it definitely could do much better with that example. It's kind of a very weak pimpl. Yeah, right. Okay, and then the last article, this is a really short one: capturing the this pointer in C++11, 14, and 17. And this is showing examples on how you can capture this in a lambda in the three different versions of modern C++.
Starting point is 00:11:53 And I want to know what you guys think about this, but I kind of feel like C++14's version looks really nice, and then C++17 looks kind of confusing. Well, actually, the main value of this blog post is the report itself,
Starting point is 00:12:11 where you can read why they decided to add that feature, and mainly it's to avoid forgetting. If you capture with self, as you did, you need to use self everywhere. And maybe you forgot one or two places, in which case you would be using the captured pointer, this. So the main idea is to make sure that you don't forget some occurrences of this. So you say, once for all, I want this to be a copy. Okay. So in the C++17 version, you're specifically prevented from using this. It would be a compile error?
Starting point is 00:12:52 Yes. Okay. Interesting. Now, I like abusing lambdas, but I've yet to come up with an actual use case where capturing this by copy, by value, is useful. I just don't know where this is used by people. Have either of you used this? Never. I don't think so. Okay. Someone must be using it, or else there wouldn't be proposals to keep changing the way it's used, right? Sure. Yeah, I guess if you got some... Go ahead. I was just thinking that it's another sign that this
Starting point is 00:13:29 being a pointer is kind of weird. It's an historical artifact. It should have been a reference, but references were invented after this. But there's kind of an inconsistency in the way you have to treat this because it's a pointer.
Starting point is 00:13:46 That's an interesting point. And yeah, right, because it's not... This can never be null, right, according to the standard. So it would make the most sense for it to be a reference. Yeah, and it probably would have been a reference if references were invented before, but it went the other way around. That's interesting. I hadn't ever thought about that.
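The three capture styles from the article discussion can be sketched as follows. This is a minimal illustration, not code from the blog post: the Counter class and its member names are hypothetical, and the last variant needs a C++17 compiler.

```cpp
#include <cassert>
#include <functional>

struct Counter {
    int value = 0;

    // C++11: [this] captures the pointer. The lambda reads the live object,
    // so it sees later changes (and dangles if the object is destroyed).
    std::function<int()> by_pointer() {
        return [this] { return value; };
    }

    // C++14: an init-capture copies the object under an explicit name.
    // Every access has to be spelled self.value.
    std::function<int()> by_named_copy() const {
        return [self = *this] { return self.value; };
    }

    // C++17: [*this] copies the object, and plain member names inside
    // the lambda refer to that copy.
    std::function<int()> by_copy() const {
        return [*this] { return value; };
    }
};

// Usage: only the pointer capture observes the later assignment.
inline void capture_demo() {
    Counter c;
    c.value = 1;
    auto p = c.by_pointer();
    auto n = c.by_named_copy();
    auto k = c.by_copy();
    c.value = 2;
    assert(p() == 2);
    assert(n() == 1);
    assert(k() == 1);
}
```

The C++14 form becomes error-prone when it is combined with a default capture such as [=, self = *this]: an unqualified member name then silently goes through the captured pointer rather than the copy, which is the kind of forgotten occurrence described in the discussion.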
Starting point is 00:14:08 Okay, well, Akim, maybe you can start us off by telling us a little bit about the VCSN project that was mentioned in your bio. Yeah, okay. So VCSN is a platform for automata and rational expressions. First, I should say that what I call rational expressions is what most people call regular expressions. Regular or rational is about the same. It's a platform in the sense that it's first a C++ library and tools around it, or rather on top of it.
Starting point is 00:14:42 There are bindings to Python, for instance, and bindings to IPython, which is kind of a GUI, a very nice terminal with graphical possibilities, and also tools to use it from the shell, using pipes and everything in the usual Unix way. So it's about automata. Automata are oriented graphs. Well, I guess most computer scientists know what automata are, but what's specific about VCSN is that it targets many
Starting point is 00:15:18 different kinds of automata. So usually automata are labeled with letters, right? So, you follow the transitions, and the transitions are labeled with letters. But actually, you could decide that you want to label the transitions with words instead of letters, and maybe tuples of letters, or tuples mixing some input, which would be a letter, and some output, which could be a word, etc. Besides, automata can be weighted, which means that usually automata just say yes or no, they are just computing some Boolean function. The word is accepted or the word is rejected, like grep does. But you can also decide to compute
Starting point is 00:16:07 some score. So, for instance, it's used by linguists, by computational linguists, to implement translation, automatic translation, and these kinds of things where they look for the closest possible result, the most likely result, I mean. Okay. Okay, so since you could have many different kinds of labels and many different kinds of weights, VCSN as a library is heavily templated. It's templated by the type of labels you can use and the type of weights you can use, which poses quite many problems when you want to bind it to
Starting point is 00:16:52 an environment that would be nice for students to discover what automata are. So actually what I wanted to discuss with you about VCSN is what we do to be able to bridge the gap between a C++ templated library and Python on top of it. Okay, so what tools do you use for that, or how do you accomplish that? There are several tools, and the one we use to go from C++ to Python is Boost.Python. Okay. I know you've been using Swig, but I have been using Swig too, and I remember a very, very bad time, horrible headaches.
Starting point is 00:17:40 I guess most recent versions of Swig are easier to use, but, well, I was burned by it, and I did not want to try again. On a sufficiently large project, Swig can be a beast to use, but if you need to expose to many different languages, then it can be helpful. Yeah, probably the best answer. Yeah, I agree. So you're using Boost Python, but then how do you solve the problem of exposing these templates to Python? I don't.
Starting point is 00:18:13 Okay. What we actually do is that we have another C++ layer on top of the bottom library. The bottom library is templated, and on top of it we have another API we call dyn, for dynamic, which implements type erasure. So we have, for instance, many different types of automata, not only many different instances of the class automaton, but also many different classes implementing automata. And using type erasure, we declare dyn::automaton, which is a regular type. It's not templated.
Starting point is 00:19:02 Okay. Am I clear? I believe so. So maybe to take just a quick step back for our listeners, the problem is a template is not a concrete thing, so it cannot be exposed to the scripting engine. Right. So you have to provide it something concrete
Starting point is 00:19:20 that you can actually expose to Python. Exactly. You have to instantiate it, and you want to put it in a box. You want to wrap it in something that you expose. And so you are basically creating some sort of, I don't know, is it equivalent to C++17's any type or something? Well, you could call it that way, but it's typed any. It's really meant for automata.
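The wrapping described here can be sketched with a classic type-erasure class. This is a simplified illustration, not Vcsn's actual code: static_automaton, name_of, and dyn_automaton are hypothetical names, and the real library stores far more than a state count.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <typeinfo>
#include <utility>

// Tiny hand-written introspection: a printable name for each static type.
template <typename T> struct name_of;
template <> struct name_of<char>        { static std::string get() { return "char"; } };
template <> struct name_of<bool>        { static std::string get() { return "bool"; } };
template <> struct name_of<double>      { static std::string get() { return "double"; } };
template <> struct name_of<std::string> { static std::string get() { return "string"; } };

// Hypothetical static layer: templated on the label and weight types.
template <typename Label, typename Weight>
struct static_automaton {
    int num_states = 0;
};

template <typename L, typename W>
struct name_of<static_automaton<L, W>> {
    static std::string get() {
        return "automaton<" + name_of<L>::get() + ", " + name_of<W>::get() + ">";
    }
};

// Dynamic layer: one regular, non-templated type that can hold any of them.
class dyn_automaton {
    struct base {
        virtual ~base() = default;
        virtual std::string type_name() const = 0;
    };
    template <typename Aut>
    struct model final : base {
        explicit model(Aut a) : aut(std::move(a)) {}
        std::string type_name() const override { return name_of<Aut>::get(); }
        Aut aut;
    };
    std::shared_ptr<const base> self_;

public:
    template <typename Aut>
    explicit dyn_automaton(Aut a)
        : self_(std::make_shared<model<Aut>>(std::move(a))) {}

    // "What is your exact type?"
    std::string type_name() const { return self_->type_name(); }

    // Unwrap back to the static type; throws std::bad_cast on a mismatch.
    template <typename Aut>
    const Aut& as() const {
        return dynamic_cast<const model<Aut>&>(*self_).aut;
    }
};
```

A binding layer such as Boost.Python then only has to expose the single dyn_automaton class, whatever label and weight types sit underneath.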
Starting point is 00:19:50 And yeah, it uses the same kind of techniques, which is just type erasure indeed. Okay. So dyn is giving you a way of basically converting from compile-time type erasure to run-time type erasure. Right. Okay. So is this a library that could be used outside of automata, or does it provide some sort
Starting point is 00:20:14 of generic capability that the rest of us could take advantage of? No, I guess there are some ideas to pick from there, but it was not implemented with the idea of being used in another project than VCSN. Okay. I wanted to interrupt this discussion for just a moment to bring you a word from our sponsors. IncrediBuild dramatically reduces compilation and development times with unique process virtualization technology. The tech transforms your computer network into a virtual supercomputer and lets each workstation use hundreds of idle cores across the network. Use IncrediBuild to accelerate
Starting point is 00:20:50 much more than just C++ compilations, speed up your unit tests, run more development cycles, and scale your development to the cloud to unleash unreal speeds. Join more than 100,000 users to save hundreds of monthly developer hours using existing hardware. IncrediBuild is already integrated into Visual Studio 2017. Just make sure to check the IncrediBuild box in the C++ workload in the Visual Studio 2017 setup. When you first reached out to us, you mentioned IPython and Jupyter. Can you tell us a little bit about those and how they're used with VCSN? Sure. So once you have binding to Python, it's quite easy to use IPython, which is just another interpreter, another top-level loop for Python. And in particular, you can run it from a web browser, in which case you could use anything a web browser can display. So, for instance, when you have an automaton,
Starting point is 00:21:48 it's displayed as an automaton. It's really... you can see the graph and understand how it works. And when you have, for instance, a mathematical formula, it's nicely displayed as it would be with LaTeX, for instance. So it's a much nicer way to interact with Python. It's kind of Mathematica or Maple for Python. And since the project was very successful, they decided to extend it to many other languages. That's why it's been renamed Jupyter, which stands for Julia,
Starting point is 00:22:28 Python, and R. But now there are many more languages supported. I think I saw there was some version of C++ binding for it also via Cling or something. Maybe. Okay.
Starting point is 00:22:45 So once you wrap your automata, the most interesting part is really when you want to call functions. And what's particular about the way we chose to do type erasure is that you can take your dyn::automaton and ask it, what is your exact type? Okay.
Starting point is 00:23:09 Okay. And then you can use, for instance, if you want to apply an operation on two automata, you can query those two automata to know what their types are, and then select the routine that corresponds exactly to their types. So you have a dynamic dispatch hidden behind dyn::automaton that will select a specific templated algorithm in the lowest library based on this type signature, dynamic type signature. Okay, that makes sense. So what are the challenges that you hit working on this approach? Well, it was very incremental,
Starting point is 00:24:03 so there was no single big wall to climb, but mainly to get some kind of introspection, to be able to query an object and get something that describes its type. You cannot use typeid, because the name is not... the conversion into a name is not really standard, and it was not convenient for us to use it as a key. But roughly, we needed to implement some kind of small, just what we need, some small introspection. That's just type traits. And, well, that's about it for that part. Okay.
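The dispatch on the dynamic type signature can be sketched as a registry keyed by a string built from the argument type names. This is a toy illustration under stated assumptions, not Vcsn's implementation: dyn_value, name_of, sum, register_sum, and dyn_sum are hypothetical, and plain numbers stand in for automata.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// A type-erased value that remembers the name of the static type it holds.
struct dyn_value {
    std::string type;
    std::shared_ptr<const void> ptr;
};

// Hand-written type names, used instead of typeid(...).name(),
// whose result is implementation-defined.
template <typename T> std::string name_of();
template <> inline std::string name_of<int>() { return "int"; }
template <> inline std::string name_of<double>() { return "double"; }

template <typename T>
dyn_value wrap(T v) {
    return {name_of<T>(), std::make_shared<T>(std::move(v))};
}

template <typename T>
const T& unwrap(const dyn_value& v) {
    if (v.type != name_of<T>()) throw std::runtime_error("type mismatch");
    return *static_cast<const T*>(v.ptr.get());
}

// The static, templated algorithm.
template <typename A, typename B>
auto sum(const A& a, const B& b) { return a + b; }

using binary_fn = std::function<dyn_value(const dyn_value&, const dyn_value&)>;

inline std::map<std::string, binary_fn>& registry() {
    static std::map<std::string, binary_fn> r;
    return r;
}

// Register one instantiation under its full signature.
template <typename A, typename B>
void register_sum() {
    registry()["sum(" + name_of<A>() + ", " + name_of<B>() + ")"] =
        [](const dyn_value& a, const dyn_value& b) {
            return wrap(sum(unwrap<A>(a), unwrap<B>(b)));
        };
}

// Dynamic entry point: the types of all arguments select the routine.
inline dyn_value dyn_sum(const dyn_value& a, const dyn_value& b) {
    const std::string key = "sum(" + a.type + ", " + b.type + ")";
    auto it = registry().find(key);
    if (it == registry().end())
        throw std::runtime_error("no instantiation for " + key);
    return it->second(a, b);
}
```

Because the key contains the exact type of every argument, a lookup finds either one routine or none, so there is nothing resembling overload ambiguity. In this sketch a missing entry throws; the system described in the episode compiles the missing instantiation instead.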
Starting point is 00:24:55 So... Oh, and I'm sorry, go ahead. Yeah, just to conclude about what you can do when you call function this way is that, in practice, that provides you with multi-methods. So really the dispatch depends on the type of all the arguments to select the exact routine that you will be running. Do you have to deal with the potential for ambiguous overloads or ambiguous calls with
Starting point is 00:25:21 your dynamic dispatch? It cannot happen. It's always, you get the exact type of all your arguments, and either there is a single function that corresponds to it, or there's no implementation, but there cannot be many. Okay. So I'm curious, what do you do if no implementation matches? That's an interesting question. If there's no answer, since we get precise descriptions of the arguments that we were using, we can generate just a bit of C++
Starting point is 00:25:57 that instantiates the routine you were calling with the exact types. So it's just a template instantiation. It's just a short file that instantiates the function you were about to call. Okay. That same file just adds a hook to register that function when it's ready. Then it's compiled into a shared object. Then we dlopen the shared object, and the hook inside the shared object registers the function you just compiled into the system,
Starting point is 00:26:32 and so the call you attempted pauses for the time it takes to compile your routine and proceeds by calling it. Wow. Okay. So you can actually, I'm sorry, if I understood that correctly, you can actually compile the function if it doesn't already exist.
Starting point is 00:26:57 Right. Well, actually, I should say that you can instantiate the function for the arguments you just gave. Okay. All right, so when the system first starts up, do you have... In my head, I'm imagining there's no reason to give it anything. You just start it up with an empty plate and let it fill in what it needs as you use it.
Starting point is 00:27:19 Yes, you could do that, but it's not very nice because it has to compile many things at the beginning, and when you start the system, you want it to be already responsive for the most typical cases. So there are a couple of typical routines and typical arguments that you want to use that are pre-compiled. And then the system will complete itself on the go while you use it. And of course, we're caching everything. So if you quit and restart the system, you won't be recompiling everything.
Starting point is 00:27:51 We just check that the timestamps are up to date. And if nothing changed, we just reload the shared object. Wow. That's pretty impressive. It was really made because VCSN was made for researchers to be able to implement algorithms on automata. It was also made for linguists to use it, so it needs to be ready to work on automata with millions of states, hence C++, but we also wanted it to be usable by students who were just discovering the field. So we really wanted to provide something easy to use in an environment such as IPython or Python is really what we wanted.
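The code generation step described above can be sketched as a function that prints a tiny translation unit: one template instantiation plus a registration hook. Everything named here is an assumption made for illustration (the header paths, register_function, and the layout of the generated text); the episode does not show what Vcsn really emits.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Build the source text that instantiates `algo` for the given argument
// types and registers the result under its signature.
inline std::string make_instantiation_source(
    const std::string& algo, const std::vector<std::string>& arg_types) {
    std::string args;
    for (std::size_t i = 0; i < arg_types.size(); ++i) {
        if (i != 0) args += ", ";
        args += arg_types[i];
    }
    std::ostringstream out;
    out << "#include <hypothetical/algos/" << algo << ".hh>\n"
        << "#include <hypothetical/registry.hh>\n"
        << "\n"
        << "// Initialized when the shared object is loaded.\n"
        << "static const bool registered =\n"
        << "  register_function(\"" << algo << "(" << args << ")\",\n"
        << "                    &" << algo << "<" << args << ">);\n";
    return out.str();
}
```

On a POSIX system with GCC or Clang, the generated file would then typically be compiled with -shared -fPIC and loaded with dlopen, at which point the static initializer runs and the new instantiation appears in the registry. Caching amounts to keeping the shared object on disk and comparing timestamps before recompiling.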
Starting point is 00:28:42 So there was no option. We really wanted to be able to hide the templated library behind something much simpler. But still be able to provide all the speed of the compile time. Exactly. So if you really want to use the library, the bottom library,
Starting point is 00:29:03 everything is just at excellent speed. You don't lose anything. If you're on top of dyn, you don't lose much because usually when you call an algorithm on automata, the algorithm itself is expensive, so the cost of the dispatch is... you forget about it. The dispatch itself was measured, it's about 40 times, 50 times what virtual would cost.
Starting point is 00:29:29 Okay. So it's much more expensive, but since you do it on expensive computations, you forget about it. So could you see this? Okay, so as I'm envisioning this, I'm thinking it seems like it's like the meshing of some of the work I've done with ChaiScript with, like, the
Starting point is 00:29:53 dynamic recompilation stuff that the game developers are talking about, that we've had on the show before. Is that Doug, I think? I think so. So I'm thinking, if you were to hypothetically throw away Python, could you envision putting your own scripting layer on top of this, or Lua, or something else, and make it into the perfect game scripting engine? I guess you could. Actually, really, Python is just one API that we provide on top of it, but all the magic is really done in C++. The generation of code, the loading of code,
Starting point is 00:30:45 type erasure, everything is done in C++. So actually, dyn would be very easy to bind using Swig, and you could easily target many other languages. Swig wouldn't know about the fact that, below it, it's instantiating code and compiling code. It's really
Starting point is 00:31:02 cool. Yeah, it is. And yes, I agree that there are probably many things that are similar to what you've been doing in ChaiScript in there. Well, I mean, when you started talking about dynamic dispatch, I'm like, well, I could ask a thousand questions about how you implemented different things, but we don't necessarily need to go down that road. So what compilers do you
Starting point is 00:31:25 work with, then? Well, Clang and GCC. I never tried to port it to Windows, and I don't know if there are users of VCSN on Windows. So you just haven't tried yet? No, I never tried.
Starting point is 00:31:42 I'm not running Windows. And I am aware that now you can run Visual on Unix, but I never tried it, though. Yeah, you could compile on Unix. But Jupyter itself works on Windows, Linux,
Starting point is 00:32:02 macOS, is that correct? Yeah, certainly. I don't know, I never tried, but I don't see why it wouldn't. Okay. The project is open source, right? So other contributors could try it on other platforms if they wanted to? Sure, and actually since
Starting point is 00:32:18 there's a web backend, since you can use it from a web browser, it's also usable from your home. Just go to vcsn-sandbox.lrde.epita.fr and then you can use it. You can try it without having to install anything. So we can put a link to that in the show notes? Yes, absolutely. So, okay, so maybe if we can go back to, like, the actual tool that you've written. From the
Starting point is 00:32:55 Jupyter perspective, you can just, like, what, type in a regular expression and this will show you a graph of it? Right, you can compile your rational expression into an automaton, and there are many different ways to transform, to compute an automaton from a rational expression. So you can try the different algorithms and see what they produce. And rational expressions, or regular expressions, are more powerful than what you can usually see. So, for instance, it works very well with complement, when you want to say, I don't want that, I don't want to match that, and it supports very well also intersection, so I want this and this.
Starting point is 00:33:40 And there are many operators, you could say, I want this and this, but I don't know in which order. So the usual RegExp is quite poor compared to what you could actually do with Automata. Okay. And go ahead, Rob. I was going to ask, you said that this is already being used by the research community and by some of your students? Yeah, it's used when we teach automata theory. So outside of language and regular expressions, I have to admit my knowledge or at least my recent experience with automata is very low. So how else, can you give us more examples
Starting point is 00:34:26 of what automata can be modeled with your system? Or is regular expressions like the superset of these things? Oh, no, no. Actually, regular expression is just one way to define automata, and it's very compact when you know what you want to express, but automata can be used to represent very, very large languages. So in which case, rational expressions are really not the way to build them. So you could easily build automata using dictionaries,
Starting point is 00:34:56 or you can build automata with machine learning techniques, in which case you get absolutely huge automata that you cannot really browse. And if you were to extract rational expression from it, there's no chance you would understand anything from it. So rational expressions are really nice when you know what you want for requests, for queries. It's really nice. But to represent a large set of knowledge, it's not the right tool.
Starting point is 00:35:29 So going back to what you were talking about at the beginning of the interview, a machine learning system might analyze whatever and generate automata for you that could then be used in translation, you're saying, essentially. Yes, absolutely. That's one of the techniques which is used for automated
Starting point is 00:35:49 translation. That's something I have absolutely no experience with at all. Okay, is there anything else that we're missing that we haven't talked about yet with VCSN or the Python bindings? Well, actually, I don't know, it depends on how deep in
Starting point is 00:36:06 the details you want to go. So for instance we did not discuss how you can traverse, you can go through dynamic API down to the static API. So you need a system to unwrap your boxes to get the object. So really, for instance, if you invoke a dynamic routine, you have to unwrap your dynamic
Starting point is 00:36:37 arguments to get the static arguments, and this needs a cast. Then you invoke the function from the static library, then you get a static result, which you wrap into a dynamic result. And all this needs to be done for every single algorithm you implement. And in the end, supporting everything is kind of boring. So much of it can be automated by just looking at the signature of the dynamic algorithm that you want to expose. And what we use, we parse this kind of simple C++ signature and generate quite a lot of code out of it.
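The unwrap, call, wrap sequence can be generated from a function's signature with variadic templates. The sketch below is an assumption-laden stand-in: it uses std::any (C++17) in place of the library's own dynamic types, and make_bridge, call_unwrapped, and repeat are hypothetical names.

```cpp
#include <any>
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Cast each dynamic argument to the static parameter type, call the
// static function, and box the static result again.
template <typename R, typename... Args, std::size_t... I>
std::any call_unwrapped(R (*f)(Args...), const std::vector<std::any>& args,
                        std::index_sequence<I...>) {
    return std::any(f(std::any_cast<Args>(args[I])...));
}

// Derive the whole bridge from the signature of the static function.
template <typename R, typename... Args>
std::function<std::any(const std::vector<std::any>&)>
make_bridge(R (*f)(Args...)) {
    return [f](const std::vector<std::any>& args) {
        if (args.size() != sizeof...(Args))
            throw std::invalid_argument("wrong number of arguments");
        return call_unwrapped(f, args, std::index_sequence_for<Args...>{});
    };
}

// A stand-in for a statically typed algorithm.
inline std::string repeat(const std::string& s, int n) {
    std::string out;
    for (int i = 0; i < n; ++i) out += s;
    return out;
}

// Usage: the caller only ever handles boxed values.
inline void bridge_demo() {
    auto dyn_repeat = make_bridge(&repeat);
    std::vector<std::any> args{std::any(std::string("ab")), std::any(3)};
    std::any r = dyn_repeat(args);
    assert(std::any_cast<std::string>(r) == "ababab");
}
```

A failed std::any_cast throws std::bad_any_cast, which plays the role of the cast check mentioned in the discussion. The project described here generates such glue by parsing the signatures of its dynamic algorithms; the template version above only shows why variadic templates make this kind of boilerplate mechanical.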
Starting point is 00:37:26 That's just the unpacking, calling, packing, and returning. All with lots of template metaprogramming, I assume? Yes, there is, yes. How long have you been working on this project? Something like four years, four or five years, I guess. Okay, so you were able to start
Starting point is 00:37:47 with having variadic templates available to you. Yes, and that was absolutely critical to have it. Yeah, variadic templates were really a godsend for the implementation. And just to conclude on what you can do when you master the dynamic dispatch. So as I said, sometimes your automata can, for instance, have several tapes, what we call tape. You could have a tape on which the automaton is reading and a tape on which the automaton is writing.
Starting point is 00:38:22 So the usual automaton has a single tape, but you can have many. And the tapes do not need to have the same type. So for instance, my input tape could be just labels or letters and the output tape could have strings.
Starting point is 00:38:40 Okay? And you might want to project or have a view of your automaton on just one of the two tapes. In which case, from the C++ library, it's two different instantiations that depend on an int. The template uses a static integer to select the tape that you want to show. Okay. Okay. But from the Python side, we have to convert a dynamic integer, 0, 1, 2, into a static integer that we give to the template system.
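To make the role of the static integer concrete, here is a small C++17 sketch. The names `two_tape_label` and `project` are hypothetical and model a single two-tape label as a `std::tuple`; this is not Vcsn's API. It shows why the tape number has to be a compile-time value: the return type changes with it.

```cpp
#include <cstddef>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

// A hypothetical two-tape label: a letter on tape 0, a string on tape 1.
using two_tape_label = std::tuple<char, std::string>;

// Project a label onto one tape. Tape is a template parameter because the
// return type depends on it, so it cannot be an ordinary runtime argument.
template <std::size_t Tape, typename... Ts>
auto project(const std::tuple<Ts...>& label)
{
  return std::get<Tape>(label);
}

// Two different instantiations with two different result types.
static_assert(std::is_same_v<
  decltype(project<0>(std::declval<two_tape_label>())), char>);
static_assert(std::is_same_v<
  decltype(project<1>(std::declval<two_tape_label>())), std::string>);
```

With a runtime `int tape`, writing `project<tape>(label)` does not compile, which is the gap the dynamic layer has to bridge.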
Starting point is 00:39:35 Okay, how do you do that? So the way we do that is by using the... So the dynamic dispatch uses signatures. So when you get an object, you ask it, what is your static type? And what we do then is, when we see an integer that goes into a place where a static integer is expected, instead of saying this is an integer, we say this is a std::integral_constant, and we pass the value of the dynamic integer into there.
Starting point is 00:40:20 But don't forget that we generate code, so we just print, actually, we just print the value of that integer into the C++ code at the place where the std::integral_constant is expected. Then you compile, you load, and it works. So a function is requested at runtime,
Starting point is 00:40:42 you don't yet have a version of this function. So you actually generate C++ code, like a .cpp file, and compile that. Again, I'm not exactly generating the function. I'm generating the instantiation of the function. The function exists, but it's the parameters that are computed and printed, and it's the instantiated function that we compile and load. Okay. I think that, yeah, the piece that I was missing,
Starting point is 00:41:11 and maybe if you can correct me if I'm wrong, is the fact that you are actually generating a C++ file that needs to be passed to the compiler that has these instantiations that you're looking for. Exactly. That's what we do. All right. That makes sense. And I would have never thought of going around that route, but I love it. Good.
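The idea of printing the runtime integer into generated C++ can be sketched as follows. This is an assumption-laden illustration, not Vcsn's generator: `integral_constant_name` and `instantiation_source` are made-up helpers, and `project_bridge`, `box`, and `project_bridge.hh` are hypothetical names that appear only inside the generated text. Only `std::integral_constant` is a real standard type. Compiling the resulting file into a shared object and loading it is left out.

```cpp
#include <sstream>
#include <string>

// Spell a runtime integer as the C++ type that carries it at compile time.
std::string integral_constant_name(int value)
{
  return "std::integral_constant<int, " + std::to_string(value) + ">";
}

// Produce the text of a small translation unit whose only job is to
// explicitly instantiate an existing function template for the requested
// parameters. The template itself is not generated, only its instantiation.
std::string instantiation_source(const std::string& automaton_type, int tape)
{
  std::ostringstream out;
  out << "#include <type_traits>\n"
      << "#include \"project_bridge.hh\" // hypothetical header\n"
      << "template box project_bridge<" << automaton_type << ", "
      << integral_constant_name(tape) << ">(const box&);\n";
  return out.str();
}
```

The text returned by `instantiation_source("my_automaton", 1)` is what would be written to a file and handed to the compiler; the runtime value 1 has become part of a type name.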
Starting point is 00:41:35 So maybe I have one more contributor. We'll see. It seems like the kind of thing Klemens Morgenstern might be interested in. Also, since he works on the dynamic loading libraries for Boost, I don't know. Anyhow. So you had a lot of stuff going on in your bio. Maybe we could touch on some of these other topics. Do you want to tell us a little bit about the Tiger compiler that your graduate students work on? So the Tiger compiler is a project at EPITA
Starting point is 00:42:07 and it was really designed for them to have a long-term project which was tricky, not a simple project and something that would run for several months because students are often making projects, what we call Kleenex projects.
Starting point is 00:42:31 You know, you just write the project and you throw it away. And they take bad habits from this. They don't write tests or not enough tests, not enough documentation. The code is not readable. So we really wanted to find something that would last for several months so that the students would understand the pain of not maintaining the test suite, not maintaining the comments, not maintaining the code and everything.
Starting point is 00:43:03 And a compiler is very nice for this because it's just a pipe. It's a sequence of modules. So the way we designed the Tiger compiler is: first they write the parser, and then they write
Starting point is 00:43:19 the binder, which just checks what are the names that are introduced and what are the names that are used, and then the type checker and so forth. So it's a project over several months, and they have to write it in C++. So each module tries to bring something new about C++
Starting point is 00:43:42 to discover in its implementation. Okay. Do you target LLVM, or have the students use LLVM, that is? Yes and no. So there are several backends, and recently some of the students have implemented a backend for LLVM, but the main backend that we propose to them is to implement it by themselves, by hand, not using any library. That would be a very interesting project to work on.
Starting point is 00:44:14 Yeah. And you also mentioned that you recently joined Infinit, which is a part of Docker, and that some of your students apparently work there. Do you want to tell us a little bit about what you're doing with Docker? So Infinit is working on a distributed file system. So Docker is really about containers, and containers roughly are very lightweight virtual machines. It's not virtual machines. It's different, but I'm not sure I have enough time
Starting point is 00:44:51 to describe what exactly containers are. But containers, the idea is really to be able to spawn many containers, many virtual machines, and maybe lose them, and maybe put more... Being able to be elastic. It's really designed for the cloud. So you can spawn new machines,
Starting point is 00:45:14 kill machines, etc. You don't care about these machines. It's really cattle. The problem, of course, is that usually if you have several machines, they are computing something useful, and you want to keep it. So that's where Infinit is used.
Starting point is 00:45:29 Infinit is a distributed file system, and that's where we put the data that you want to last. Okay. Yeah, I'm sure we've talked about it. I'm sorry? Infinit provides the persistent storage. Okay. Yeah, I'm sure we've talked a little bit about Docker in the past on the show.
Starting point is 00:45:53 I'm sure it's something that at least most of our listeners are familiar with. Hmm. Yeah. Yeah. Most probably. Okay. Well, is there anything else you wanted to go over before we let you go? It's been great having you on today, Akim.
Starting point is 00:46:08 Thank you very much. Well, I don't know. If you have questions, just call me back. Okay. Okay. And we'll put links in the show notes, but where can people find VCSN or find you online? Well, the easiest is to go to VCSN's home page, which will be in the show notes. Okay. The easiest way to reach me is to go to cppcast.com. Okay. Do you have any papers or anything that you've written, documentation for how the dyn
Starting point is 00:46:41 system works? Yes, there's a tech report, so I will provide it also for the show notes if you want more gory details. Or maybe someday I could have a guided visit into the code. I don't know how to do a virtual guided visit in code, but that would be
Starting point is 00:47:00 something to experiment. Yeah, it could be interesting to screencast. Okay, well it's been great having you on. Thank you. Thank you very much. Thank you. Thanks so much for listening in as we chat about C++. I'd love to hear what you think of the
Starting point is 00:47:16 podcast. Please let me know if we're discussing the stuff you're interested in. Or, if you have a suggestion for a topic, I'd love to hear about that too. You can email all your thoughts to feedback at cppcast.com. I'd also appreciate if you like CppCast on Facebook and follow CppCast on Twitter. You can also follow me at robwirving and Jason at lefticus on Twitter. And of course, you can find all that info and the show notes on the podcast website
Starting point is 00:47:40 at cppcast.com. Theme music for this episode is provided by podcastthemes.com.