CppCast - Vcsn
Episode Date: April 13, 2017
Rob and Jason are joined by Akim Demaille to discuss VCSN, a platform for automata and rational expressions, and some of the interesting problems he faced while working on the library.
Akim has been participating in free software for about 20 years, starting with a2ps, an anything to PostScript tool written in C. In order to ensure its portability, he became a major contributor to GNU Autoconf, GNU Automake and GNU Bison. Akim has been teaching and researching at EPITA, a French CS Graduate School, for eighteen years. He has taught formal languages, logics, OO design, C++ and compiler construction, which includes the Tiger compiler, an educational project where students implement a compiler in C++. This project, whose assignment is regularly updated, keeps track of the C++ evolutions, and this year's version uses C++17 features. Akim's recent research interests are focused on the Vcsn platform, dedicated to automata and rational expressions. He's recently been recruited by former students of his to be part of the Infinit team at Docker.
News
Announcing Meeting C++ 2017
C++Now 2017 Keynote: Ali Çehreli - Competitive Advantage with D
Reduce C++ Build Times by Reducing Header Dependencies
Capturing *this in C++11, 14 and 17
Akim Demaille
Akim Demaille's GitHub
Links
Vcsn home page
Vcsn Online Sandbox
The Tiger Project
Johnny Five
Technical report about runtime instantiation in C++
Sponsors
Incredibuild
JetBrains
Transcript
This episode of CppCast is sponsored by Incredibuild.
Accelerate your C++ build by up to 30 times faster directly from within Visual Studio 2017.
Just make sure to check the Incredibuild box in the C++ Workload VS 2017 setup.
And by JetBrains, maker of intelligent development tools to simplify your challenging tasks and
automate the routine ones. JetBrains is offering a 25% discount for an individual license.
And by Pacific++, the first major C++ conference in the Pacific region, providing great talks and opportunities for networking.
Get your ticket now during early bird registration until June 1st.
Episode 97 of CppCast with guest Akim Demaille.
In this episode, we discuss some big conference announcements.
Then we talk to Akim Demaille, professor at EPITA.
Akim talks to us about VCSN, a platform for automata and rational expressions.
Welcome to episode 97 of CppCast, the only podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
Good. Sleepy.
Sleepy?
Yeah, we'll see if I stay awake through the episode.
I don't think we've lost you to passing out once yet, so I think you'll be okay.
That's true. That hasn't happened yet.
Yeah.
So I just got a new laptop.
I finally installed Visual Studio 2017, so that's nice.
I've been waiting. I was expecting to get the new laptop soon.
And 2017 is supposed to install like 10 times faster than 2015, right?
Yeah, and I installed both 2013 and 2017, because I still might need 2013 to build some things. And you could notice
pretty clearly the difference in install time between the two. It was cool.
I will have to do that when I get my new laptop soon also.
Very cool. Okay, well, at the top of our episode, I'd like to read a piece of feedback.
This week we got an email from Mandar, and he wrote: You guys have been doing really great work, and my
40 minute drive to office and on the way back has been a learning ride ever since. I'm from Pune, India,
and I'm a C++ developer like many others.
I call myself as born and brought up with CPP because I started my career 10 years ago
as a C++ programmer,
and I believe my programming sense
has grown slowly and steadily ever since then,
just like the C++ standards.
Also, congratulations for grabbing
the best overall podcast.
You guys really deserve it.
Thank you. We appreciate it. He has a suggestion about a topic: Could you discuss static initialization in C++,
its pros and cons, in one of your episodes? In my previous
organization, the product which I worked on suffered from a large startup time, and one of
the culprits was huge usage of static initialization. My learning was, although the concept is very
powerful, it needs to be judiciously used. But overall, I find the concept interesting to explore more and know
about.
Jason, this sounds like something you may have already talked about in a C++ Weekly.
I do have some, yeah. I did one episode on the cost of using statics. But more just in general, I hate globals, so it would be a highly
opinionated episode if we talked about that.
Well, that'd be a good one though. We'll have to think
about that some more, see who we would get on.
Yeah. So we'd love to hear your thoughts about the show
as well. You can always reach out to us on Facebook, Twitter, or email us at feedback@cppcast.com. And don't forget to leave us a review on iTunes.
Joining us today is Akim Demaille.
Akim has been participating in free software for about 20 years, starting with a2ps, an anything to PostScript tool written in C.
In order to ensure its portability, he became a major contributor to GNU Autoconf, GNU Automake, and GNU Bison.
Akim has been teaching and researching
at EPITA, a French CS graduate school, for 18 years. He's taught formal languages, logics, object
oriented design, C++, and compiler construction, which includes the Tiger compiler, an educational
project where students implement a compiler in C++. This project, whose assignment is regularly
updated, keeps track of the C++ evolution, and this year's version uses C++17 features.
Akim's recent research interests are focused on the VCSN platform, dedicated to automata and rational expressions.
And he's recently been recruited by former students of his to be part of the Infinit team at Docker.
Akim, welcome to the show.
Thank you for having me.
I got to ask about your contributions to Autoconf, Automake, and Bison to ensure portability, because those are kind of notoriously difficult tools to use on Windows, specifically.
Yeah, but it was something like 20 years ago, and I was mainly interested in all the different kinds of Unixes that existed then.
So it's more Solaris and the young Linux at that time,
and many different Unixes.
That was my main focus.
That would have been a very interesting time to be working on
helping make sure Autotools are working on the brand new Linux.
Yeah, we call it Autotools if you want to name the GNU build system:
Libtool, Automake, and Autoconf.
Yeah, it was an exciting time.
Okay, so we're going to have a couple news articles to discuss, Akim,
and then we'll talk more about your most recent project with VCSN, okay?
Sure.
Okay, so first one, we have a couple conference announcements.
Announcing Meeting C++ 2017.
It's going to be three days now, November 9th to 11th in Berlin,
just like the previous years of the conference.
And one of the exciting things I
thought here was they announced the keynoters, at least some of them: Sean Parent, Kate Gregory,
and Max Mustermann. I'm not familiar with Max Mustermann, but it's great to see Kate Gregory
will be keynoting the conference. I'm actually friends with Kate on Facebook, and I think she
made a recent announcement
that she actually has been having some health problems over the past year, and that's why we
didn't see her at CppCon. But she's feeling better now, and she's going to be back on the conference
circuit, which is great to see.
Yeah. Did you catch this Max Mustermann? Do you know who that is?
I do not.
And what about John Doe?
Well, I'm guessing John Doe is a placeholder?
Yeah, I suppose.
Yes, it's like the German version of a placeholder.
Oh, okay, okay. I was not aware of that.
Okay, so they're waiting to announce this third keynote then.
And it's great to see women in keynotes.
Oh yeah, absolutely.
And a friend of the podcast.
Yeah. Okay, and then another conference announcement: C++Now
is announcing their second keynoter, and this one will be Ali Çehreli.
Hopefully I'm pronouncing that right.
And that's Programming in D.
So he is the author of Competitive Advantage with D.
Is that the name of the book?
No, he's the author of Programming in D.
Programming in D, okay.
And the title of his talk is Competitive Advantage with D.
And he also sits on the D Foundation.
So, Jason, you've attended C++ Now a couple times.
How do you think the conference is going to respond to these D and Rust developers giving keynotes?
I think it's going to be a lot of fun, because the people who tend to attend C++Now
are people that are stretching the language to its limits
and want to know what's possible in other languages
and want to know where the language should be evolving to.
So I think it makes a lot of sense for C++Now.
Maybe not for one of the larger, more mainstream conferences.
Yeah, maybe this one wouldn't go over quite as well at CppCon.
Maybe not, but...
Yeah.
Yeah, I was listening to CppChat last week,
and Bryce Lelbach and Jon Kalb, of course, were on that,
and they're both involved with both conferences.
And they made a comment.
They didn't announce Ali doing the keynote then,
but they kind of hinted at it.
And they said that they've looked at feedback from previous conferences and that
the more controversial
topics seem to be more favored by
the audience. So they're
specifically trying to go for controversial
topics or potentially controversial topics
because they think the audience will
like it. So this definitely seems
like it should be popular.
Yeah. Have you
been to either of these conferences,
Akim?
No, unfortunately
not.
Okay.
So the next one, this is on
Lattix's blog, and it's
reduce C++ build times
by reducing header dependencies.
And this is something that
I'm guessing a lot of our listeners
have heard of, that if you can reduce the number of headers in your source files,
you should be able to reduce build times,
especially if the header files are unnecessary.
But it's kind of worth reminding listeners
that you should be looking out for this.
And when I looked at Reddit,
there were some comments pointing out some other tools
that you could use to look for unused headers.
I think it's Clang's "only use what you use".
Include what you use.
Include what you use, yeah.
Yeah, it's useful.
I actually had a problem with this article.
What was that?
In his final example here, he forward declares share price.
Okay.
So that he doesn't have to include an extra header.
Sure.
And then in the class
stock, share price now becomes a pointer, right? And it's not wrapped in a smart pointer,
and he does not add a destructor.
So we've taken what was very simple, easy-to-reason-about code that could all work nicely with the stack
and converted it into something with a memory leak, essentially.
Yeah, it definitely could do much better with that example.
It's kind of a very weak pimpl.
Yeah, right.
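For readers following along, here is a minimal sketch of how a forward declaration can be kept without the leak described above. The article's code is not reproduced here; the class names `Stock` and `SharePrice`, the file split shown in comments, and the `price()` accessor are all assumptions made for illustration.

```cpp
#include <cassert>
#include <memory>

// --- stock.h (hypothetical header) ---
class SharePrice;  // forward declaration: the header for SharePrice is not needed here

class Stock {
public:
    explicit Stock(double price);
    ~Stock();                                  // only declared here, see stock.cpp below
    Stock(const Stock&) = delete;              // keep the sketch simple: no copies
    Stock& operator=(const Stock&) = delete;
    double price() const;

private:
    std::unique_ptr<SharePrice> price_;        // owning pointer, released automatically
};

// --- stock.cpp (hypothetical source file) ---
// In a real project, this is where the SharePrice header would be included.
class SharePrice {
public:
    explicit SharePrice(double value) : value_(value) {}
    double value() const { return value_; }

private:
    double value_;
};

Stock::Stock(double price) : price_(std::make_unique<SharePrice>(price)) {}
Stock::~Stock() = default;  // SharePrice is complete here, so unique_ptr can delete it
double Stock::price() const { return price_->value(); }
```

The destructor is declared in the header and defined in the source file because `std::unique_ptr` needs the complete type at the point where it deletes the object.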
Okay, and then the last article, this is a really short one: capturing the this pointer in C++11,
14, and 17.
And this is showing examples on how you can capture this
in a lambda in the three different
versions of modern C++.
And I
want to know what you guys think about this, but I kind of feel
like C++ 14's
version looks really nice,
and then C++ 17 looks kind
of confusing.
Well, actually,
the most valuable part of this blog post is the report itself,
where you can read why they decided to add that feature,
and mainly it's to avoid forgetting.
If you capture with self, as you did, you need to use self everywhere.
And maybe you forget one or two places, in which case you would be using the captured pointer, this.
So the main idea is to make sure that you don't forget some occurrences of this.
So you say, once and for all, I want this to be a copy.
Okay. So in the C++17 version,
you're specifically prevented from using this. It would be a compile error?
Yes. Okay. Interesting.
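The three capture styles discussed here can be sketched as follows. This is an illustration only, not code from the blog post; the `Counter` type and its member function names are made up. One detail worth noting: with a C++17 `[*this]` capture, member names (and `this` itself) inside the lambda refer to the copy, so the original object cannot be reached by accident.

```cpp
#include <cassert>
#include <functional>

struct Counter {
    int value = 0;

    // C++11: captures the pointer `this`; the lambda observes later changes.
    std::function<int()> by_pointer() {
        return [this] { return value; };
    }

    // C++14: init-capture copies the object; every use must go through `self`.
    std::function<int()> by_named_copy() const {
        return [self = *this] { return self.value; };
    }

    // C++17: `*this` captures a copy, and plain member names refer to that copy.
    std::function<int()> by_copy() const {
        return [*this] { return value; };
    }
};
```

After building all three lambdas from the same object and then modifying that object, only the `[this]` version sees the new value.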
Now, I like abusing lambdas, but I've yet to come up with an actual
use case where capturing this by copy,
by value, is useful. I just don't know
where this is used by people. Have either of you used this?
Never.
I don't think so.
Okay.
Someone must be using it, or else there wouldn't be proposals to keep changing the way it's used.
Right.
Sure.
Yeah, I guess if you've got some... Go ahead.
I was just thinking that it's another
sign that this
being a pointer is kind of weird.
It's an historical
artifact. It should have been a reference,
but references were invented after
this.
But there's kind of an
inconsistency in the way you have to treat this
because it's a pointer.
That's an interesting point.
And yeah, right, because it's not...
This can never be null, right, according to the standard.
So it would make the most sense for it to be a reference.
Yeah, and it probably would have been a reference
if references were invented before,
but it went the other way around.
That's interesting. I hadn't ever thought about that.
Okay, well, Akim, maybe you can start us off by telling us a little bit about the VCSN
project that was mentioned in your bio.
Yeah, okay. So VCSN is a platform for automata and rational expressions. First, I should
say that what I call rational expressions
is what most people call regular expressions.
Regular or rational is about the same.
It's a platform in the sense that it's first a C++ library
and tools around it, or rather on top of it.
There are bindings to Python, for instance,
and binding to IPython,
which is kind of a GUI,
a very nice terminal with graphical possibilities,
and also tools to use it from the shell,
using pipes and everything in the usual Unix way.
So it's about automata.
Automata are directed graphs. Well, I guess most computer scientists know what automata are, but what's specific about VCSN is that it targets many
different kinds of automata. So usually automata are labeled with letters, right? So you follow the transitions, and the transitions are labeled with letters.
But actually, you could decide that you want to label the transitions with words instead of letters,
and maybe tuples of letters, or tuples mixing some input, which would be a letter,
and some output, which could be a word, etc.
Besides, automata can be weighted, which means that usually automata just say yes or no,
they are just computing some Boolean function.
The word is accepted or the word is rejected, like grep does.
But you can also decide to compute
some score. So, for instance, it's used by linguists,
in computational linguistics, to
implement translation, automatic translation, and these kinds of
things where they look for the closest possible
result, the most likely result, I mean.
Okay.
So since you could have many different kinds of labels and many different kinds of weights,
VCSN as a library is heavily templated. It's templated by the type of labels you can use and the type of weights you can use,
which poses quite a few problems when you want to bind it
to an environment that would be nice for students to discover what automata are.
So actually what I wanted to discuss with you about
VCSN is what we do to be able to bridge the gap between a C++ templated library and Python on top of it.
Okay, so what tools do you use for that, or how do you accomplish that?
There are several tools, and the one we use to go from C++ to Python is Boost.Python.
Okay.
I know you've been using SWIG, and I have been using SWIG too,
and I remember a very, very bad time, horrible headaches.
I guess the most recent versions of SWIG are easier to use,
but, well, I was burned by it, and I did not want to try again.
On a sufficiently large project, SWIG can be a beast to use,
but if you need to expose to many different languages,
then it can be helpful.
Yeah, probably the best answer. Yeah, I agree.
So you're using Boost.Python, but then how do you solve the problem of exposing these templates to Python?
I don't.
Okay.
What we actually do is that we have another C++ layer on top of the bottom library. The bottom library is templated, and on top of it we have
another API we call dyn, for dynamic, which implements type erasure. So we have, for instance,
many different types of automata, not only many different instances of the class automaton,
but also many different classes implementing automata.
And using type erasure, we declare dyn::automaton,
which is a regular type.
It's not templated.
Okay.
Am I clear?
I believe so.
So maybe to take just a quick step back for our listeners,
the problem is a template is not a concrete thing,
so it cannot be exposed to the scripting engine.
Right.
So you have to provide it something concrete
that you can actually expose to Python.
Exactly.
You have to instantiate it, and you want to put it in a box.
You want to wrap it in something that you expose.
And so you are basically creating some sort of, I don't know,
is it equivalent to C++17's any type or something?
Well, you could call it that, but it's a typed any.
It's really meant for automata.
And yeah, it uses the same kind of techniques,
which is just type erasure indeed.
Okay.
So dyn is giving you a way of basically converting
from compile-time types to run-time type erasure.
Right.
Okay.
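As a rough illustration of the idea just described (a non-templated wrapper around many templated automaton types that can still report its exact type), here is a minimal type-erasure sketch. It is not VCSN's actual code: `static_automaton`, `type_name`, `dyn_automaton`, and the signature format are all hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical static (templated) layer: one class per label/weight combination.
template <typename Label, typename Weight>
struct static_automaton {
    int num_states = 0;
};

// Hypothetical traits giving each template parameter a printable name.
template <typename T> struct type_name;
template <> struct type_name<char>        { static std::string get() { return "char"; } };
template <> struct type_name<std::string> { static std::string get() { return "string"; } };
template <> struct type_name<bool>        { static std::string get() { return "bool"; } };
template <> struct type_name<double>      { static std::string get() { return "double"; } };

// Non-templated wrapper: the kind of regular type a binding layer can expose.
class dyn_automaton {
    struct concept_t {
        virtual ~concept_t() = default;
        virtual std::string signature() const = 0;
        virtual int num_states() const = 0;
    };

    template <typename Label, typename Weight>
    struct model_t final : concept_t {
        explicit model_t(static_automaton<Label, Weight> a) : aut(std::move(a)) {}
        std::string signature() const override {
            return "automaton<" + type_name<Label>::get() + ", "
                   + type_name<Weight>::get() + ">";
        }
        int num_states() const override { return aut.num_states; }
        static_automaton<Label, Weight> aut;
    };

    std::shared_ptr<const concept_t> self_;

public:
    template <typename Label, typename Weight>
    explicit dyn_automaton(static_automaton<Label, Weight> a)
        : self_(std::make_shared<model_t<Label, Weight>>(std::move(a))) {}

    // "What is your exact type?" answered at run time.
    std::string signature() const { return self_->signature(); }
    int num_states() const { return self_->num_states(); }
};
```

Two wrappers built from different instantiations have the same C++ type, `dyn_automaton`, but report different signatures.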
So is this a library that could be used outside of automata, or does it provide some sort
of generic capability that the rest of us could take advantage of?
No, I guess there are some ideas to pick from there, but it was not implemented with the idea
of being used in a project other than VCSN.
Okay.
I wanted to interrupt this discussion for just a moment to bring you a word from our sponsors.
IncrediBuild dramatically reduces compilation and development times with unique process virtualization technology.
The tech transforms your computer network into a virtual supercomputer and lets
each workstation use hundreds of idle cores across the network. Use IncrediBuild to accelerate
much more than just C++ compilations, speed up your unit tests, run more development cycles,
and scale your development to the cloud to unleash unreal speeds. Join more than 100,000 users to
save hundreds of monthly developer hours using existing hardware.
IncrediBuild is already integrated into Visual Studio 2017. Just make sure to check the IncrediBuild
box in the C++ workload in the Visual Studio 2017 setup.
When you first reached out to us,
you mentioned IPython and Jupyter. Can you tell us a little bit about those and how they're used with VCSN?
Sure. So once you have a binding to Python, it's quite easy to use IPython, which is just another
interpreter, another top-level loop for Python. And in particular, you can run it from a web
browser, in which case you could use anything a web browser can display. So, for instance, when you have an automaton,
it's displayed as an automaton. Really, you can see the graph and understand how it
works. And when you have, for instance, a mathematical formula, it's nicely displayed,
as it would be with LaTeX, for instance. So it's a much nicer way to interact with Python.
It's kind of like Mathematica or Maple for Python.
And since the project was very successful,
they decided to extend it to many other languages.
That's why it's been renamed Jupyter, which
stands for Julia,
Python, and R.
But now there are many more
languages supported.
I think I saw there was
some version of a C++
binding for it also, via
Cling or something.
Maybe.
Okay.
So once you wrap your automata,
the most interesting part
is really when you want to call functions.
And what's particular about
the way we chose to do type erasure
is that you can take your dyn::automaton
and ask it, what is your exact type?
Okay.
Okay.
And then you can use, for instance, if you want to apply an operation on two automata,
you can query those two automata to know what their types are, and then select the routine that corresponds exactly to their types.
So you have a dynamic dispatch hidden behind dyn::automaton that will select a specific templated algorithm
in the lower-level library based on this type signature, the dynamic type signature.
Okay, that makes sense.
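The dispatch described above can be sketched as a table keyed by the run-time signatures of all arguments. This is only an illustration under assumptions: `dyn_value`, `registry`, and the key format are hypothetical, and the real library registers instantiated templates rather than hand-written lambdas.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a type-erased automaton: a signature plus a payload.
struct dyn_value {
    std::string signature;
    int payload;
};

using routine = std::function<dyn_value(const dyn_value&, const dyn_value&)>;

class registry {
    std::map<std::string, routine> table_;

    // Key = algorithm name + exact signature of every argument.
    static std::string key(const std::string& algo, const std::vector<std::string>& sigs) {
        std::string k = algo + "(";
        for (std::size_t i = 0; i < sigs.size(); ++i)
            k += (i ? ", " : "") + sigs[i];
        return k + ")";
    }

public:
    void add(const std::string& algo, const std::vector<std::string>& sigs, routine r) {
        table_[key(algo, sigs)] = std::move(r);
    }

    // The lookup depends on the types of all arguments (a multimethod), and an
    // exact key matches one routine or none, so there is no overload ranking.
    dyn_value call(const std::string& algo, const dyn_value& a, const dyn_value& b) const {
        const std::string k = key(algo, {a.signature, b.signature});
        auto it = table_.find(k);
        if (it == table_.end())
            throw std::runtime_error("no instantiation registered for " + k);
        return it->second(a, b);
    }
};
```

In this sketch a missing entry is reported with an exception; in the system described in the interview, that is the point where a new instantiation would be generated and compiled.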
So what are the challenges that you hit working on this approach?
Well, it was very incremental,
so there was no single big wall to climb,
but mainly to get some kind of introspection to be able to query an object
and get something that describes its type. You cannot use typeid, because the name, the
convention for the name, is not really standardized, and it was not convenient for
us to use it as a key. So roughly, we needed to implement some kind of small,
just what we need, some small introspection. That's just type traits.
And, well, that's about it for that part.
Okay.
So...
Oh, and I'm sorry, go ahead.
Yeah, just to conclude about what you can do
when you call functions this way:
in practice, that provides you with multi-methods. So really
the dispatch depends on the type of all the arguments to select the exact routine that
you will be running.
Do you have to deal with the potential for ambiguous overloads or ambiguous calls with
your dynamic dispatch?
It cannot happen. It's always, you get the exact type of all your arguments,
and either there is a single function that corresponds to it,
or there's no implementation, but there cannot be many.
Okay.
So I'm curious, what do you do if no implementation matches?
That's an interesting question. If there's no answer, since we get precise
descriptions of the arguments that we were using, we can generate just a bit of C++
that instantiates the routine you were calling with the exact types. So it's just a template instantiation.
It's just a short file that instantiates the function you were about to call.
Okay.
That same file just adds a hook to register that function when it's ready.
Then it's compiled into a shared object.
Then we dlopen the shared object,
the hook inside the shared object
registers the function you just compiled into the system,
and so
the call you attempted
pauses for the time it took to compile
your routine
and proceeds by calling it.
Wow. Okay.
So you can actually, I'm sorry, if I understood that correctly,
you can actually compile the function if it doesn't already exist.
Right. Well, actually, I should say that you can instantiate the function
for the arguments you just gave.
Okay.
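To give a feel for the "short file that instantiates the function", here is a sketch of a generator for such a file. Everything in the generated text is hypothetical: the header paths, the `register_routine` hook, and the naming scheme are assumptions for illustration and are not taken from VCSN.

```cpp
#include <cassert>
#include <string>

// Build the text of a small translation unit that instantiates `algo` for one
// exact argument type and registers it when the shared object is loaded.
// Header names and the `register_routine` hook are hypothetical.
std::string instantiation_source(const std::string& algo, const std::string& arg_type) {
    std::string src;
    src += "#include \"algos/" + algo + ".hh\"\n";
    src += "#include \"registry.hh\"\n";
    src += "namespace {\n";
    src += "  // Runs during static initialization, i.e. when the object is loaded.\n";
    src += "  const bool registered = register_routine(\n";
    src += "      \"" + algo + "(" + arg_type + ")\",\n";
    src += "      &" + algo + "<" + arg_type + ">);\n";
    src += "}\n";
    return src;
}
```

The remaining steps from the discussion are not shown: writing this text to a file, compiling it into a shared object with the system compiler, and loading it with POSIX `dlopen`, at which point the registration line runs and the pending call can proceed.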
All right, so when the system first starts up,
do you have...
In my head, I'm imagining there's no reason to give it anything.
You just start it up with an empty plate
and let it fill in what it needs as you use it.
Yes, you could do that, but it's not very nice
because it has to compile many things at the beginning,
and when you start the system, you want it to be already responsive for the most typical cases.
So there are a couple of typical routines and typical arguments that you want to use that are pre-compiled.
And then the system will complete itself on the go while you use it.
And of course, we're caching everything.
So if you quit and restart the system,
you won't be recompiling everything.
We just check that the timestamps are up to date.
And if nothing changed, we just reload the shared object.
Wow.
That's pretty impressive.
It was really made because VCSN was made for researchers to be able to implement algorithms on automata.
It was also made for linguists to use it, so it needs to be ready to work on automata with millions of states, hence C++. But we also wanted it to
be usable by students who were just discovering the field.
So we really wanted to provide something easy to use, and an environment such as IPython or
Python is really what we wanted.
So there was no option. We really wanted to be able to hide
the templated library
behind something much simpler.
But still be able to provide
all the speed of the compile time.
Exactly.
So if you really want to use the library,
the bottom library,
everything is just at excellent speed.
You don't lose anything.
If you're on top of dyn, you don't lose much,
because usually when you call an algorithm on automata,
the algorithm itself is expensive,
so the cost of the dispatch is... you forget about it.
The dispatch itself was measured:
it's about 40 times, 50 times what a virtual call would cost.
Okay.
So it's much more expensive,
but since you do it on expensive computations,
you forget about it.
So could you see this...
Okay, so as I'm envisioning this,
I'm thinking it seems like it's
like the meshing of some of the work I've done with ChaiScript with the
dynamic recompilation stuff that the game developers are talking about, that we've had
on the show before. Was that Doug, I think?
I think so.
So I'm thinking, if you were to hypothetically throw away Python,
could you envision putting your own scripting layer on top of this,
or Lua, or something else,
and make it into the perfect game scripting engine?
I guess you could. Actually, really, Python is just one API that we provide on top of it,
but all the magic is really done in C++. The generation of code, the loading of code,
type erasure, everything is done in C++. So actually, dyn would be very easy to bind using SWIG, and you
could easily target many
other languages. SWIG wouldn't
know about the fact that,
below it,
it's instantiating code
and compiling code.
It's really
cool.
Yeah, it is. And
yes, I agree that there are probably many things
that are similar to what you've been doing in ChaiScript in there.
Well, I mean, when you started talking about dynamic dispatch,
I'm like, well, I could ask a thousand questions
about how you implemented different things,
but we don't necessarily need to go down that road.
So what compilers do you
work with, then?
Well, Clang and GCC.
I never tried to port it to
Windows, and
I don't know if there are users
of VCSN on Windows.
So you just haven't tried
yet?
No, I never tried.
I'm not running Windows. And I am aware that now
you can run Visual on
Unix, but I never tried it, though.
Yeah, you
could compile on Unix.
But Jupyter itself
works on Windows, Linux,
macOS, is that
correct?
Yeah, certainly. I don't know, I never tried,
but I don't see why it wouldn't.
Okay.
The project is open source, right?
So other contributors could try it on other platforms
if they wanted to?
Sure, and actually since
there's a
web
backend, since you can use it from
a web browser, it's also usable from
your home. Just go to vcsn-sandbox.lrde.epita.fr
and then you can use it. You can try it without having to install anything.
So we can put a link to that in the show notes?
Yes, absolutely.
So, okay, maybe if we can go back to the actual tool that you've written. From the
Jupyter perspective, you can just, like, what, type in a regular expression and this will show you
a graph of it?
Right, you can compile your rational expression into an automaton, and there are many
different ways to transform, to compute an automaton from a rational expression. So you can
try the different algorithms and see what they produce. And rational expressions,
or regular expressions, are more powerful than what you can usually see. So, for instance, it works very well with
complement, when you want to say, I don't want that,
I don't want to match that, and it
supports very well also intersection, so I want this and this.
And there are many operators, you could say, I want this
and this, but I don't know in which order.
So the usual regexp is quite poor compared to what you could actually do with automata.
Okay.
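Complement and intersection are easy to state on automata, which is part of why they fit naturally here. The sketch below uses the textbook constructions on complete deterministic automata (swap accepting states for complement, product of states for intersection). The `dfa` type and function names are hypothetical, and this is not how VCSN implements these operators.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

// A complete deterministic automaton with states 0..n-1 (hypothetical minimal type).
struct dfa {
    int initial;
    std::set<int> finals;
    std::map<std::pair<int, char>, int> next;  // defined for every (state, letter)

    bool accepts(const std::string& word) const {
        int s = initial;
        for (char c : word) s = next.at({s, c});
        return finals.count(s) > 0;
    }
};

// Complement of a complete DFA: accepting and non-accepting states are swapped.
dfa complement(const dfa& a, int num_states) {
    dfa r{a.initial, {}, a.next};
    for (int s = 0; s < num_states; ++s)
        if (!a.finals.count(s)) r.finals.insert(s);
    return r;
}

// Intersection by product: state (p, q) is numbered p * nb + q.
dfa intersection(const dfa& a, int na, const dfa& b, int nb, const std::string& alphabet) {
    dfa r{a.initial * nb + b.initial, {}, {}};
    for (int p = 0; p < na; ++p)
        for (int q = 0; q < nb; ++q) {
            if (a.finals.count(p) && b.finals.count(q)) r.finals.insert(p * nb + q);
            for (char c : alphabet)
                r.next[{p * nb + q, c}] = a.next.at({p, c}) * nb + b.next.at({q, c});
        }
    return r;
}
```

For example, intersecting "an even number of a's" with "ends in b" over the alphabet {a, b} gives an automaton that accepts only words satisfying both conditions.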
And go ahead, Rob.
I was going to ask, you said that this is already being used by the research community and by some of your students?
Yeah, it's used when we teach automata theory.
So outside of language and regular expressions, I have to admit my knowledge or at least my recent experience with automata is very low. So how else, can you give us more examples
of what automata can be modeled with your system?
Or is regular expressions like the superset of these things?
Oh, no, no.
Actually, regular expression is just one way to define automata,
and it's very compact when you know what you want to express,
but automata can be used to represent very, very large languages.
So in which case, rational expressions are really not the way to build them.
So you could easily build automata using dictionaries,
or you can build automata with machine learning techniques,
in which case you get absolutely huge automata that you cannot really browse.
And if you were to extract rational expression from it,
there's no chance you would understand anything from it.
So rational expressions are really nice when you know what you want for requests, for queries.
It's really nice.
But to represent a large set of knowledge,
it's not the right tool.
So going back to what you were talking about
at the beginning of the interview,
a machine learning system might analyze whatever
and generate automata for you
that could then be used in translation, you're saying, essentially.
Yes, absolutely.
That's one of the techniques which is used for
automated
translation.
That's something I have absolutely no experience with
at all.
Okay, is there anything else
that we're missing that we haven't talked about yet with
VCSN or the
Python bindings?
Well, actually, I don't know. It depends on how deep into
the details you want to go. So, for instance, we did not discuss how you
can traverse, how you can go through the dynamic API down to the static API. So you need a system to unwrap
your boxes
to get the object.
So really, for instance,
if you invoke
a dynamic routine,
you have to unwrap your dynamic
arguments to get the static
arguments, and this needs
a cast.
Then you invoke the function from the static library,
then you get a static result, which you wrap into a dynamic result. And all this needs
to be done for every single algorithm you implement. And in the end, supporting everything is kind of boring. So much of it
can be automated by just looking at the signature of the dynamic algorithm that you want to
expose. So what we do is parse this kind of simple C++ signature and generate quite a lot of code out of it.
That's just the unpacking, calling,
packing, and returning.
All with lots of template metaprogramming, I assume?
Yes, there is, yes.
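The unwrap, call, wrap sequence can be sketched with a small bridge function. This example uses `std::any` as a stand-in box, which is an assumption made for brevity; as noted earlier in the conversation, the library uses its own typed wrapper. `small_automaton`, `count_states`, and `count_states_bridge` are hypothetical names.

```cpp
#include <any>
#include <cassert>

// Hypothetical static algorithm, templated on the exact automaton type.
template <typename Aut>
int count_states(const Aut& a) { return a.num_states; }

struct small_automaton { int num_states; };

// Bridge: unwrap the dynamic argument with a cast, call the static routine,
// then wrap the static result into a dynamic value again.
template <typename Aut>
std::any count_states_bridge(const std::any& boxed) {
    const Aut& a = std::any_cast<const Aut&>(boxed);  // throws std::bad_any_cast on mismatch
    return std::any(count_states(a));
}
```

Writing one such bridge per algorithm by hand is the repetitive part that, per the discussion, is generated from the signature of each dynamic algorithm.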
How long have you been working on this project?
Something like four years,
four or five years, I guess.
Okay, so you were able to start
with having variadic templates available to you.
Yes, and that was absolutely critical to have it.
Yeah, variadic templates were really a godsend
for the implementation.
And just to conclude on what you can do
when you master the dynamic dispatch.
So as I said, sometimes your automata can, for instance, have several tapes, what we call tapes.
You could have a tape on which the automaton is reading and a tape on which the automaton is writing.
So the usual automaton has
a single tape, you can have many.
And the tapes do not
need to have the same type.
So for instance, my input tape could be
just labels or letters
and the output tape
could have strings.
Okay?
And you might want to
project or have a view of your automaton on just one of the two tapes.
In which case, from the C++ library, it's two different instantiations that depend on an int.
The template uses a static integer to select the tape that you want to show.
Okay.
Okay. But from the Python side, you need to convert a dynamic integer, 0, 1, 2,
into a static integer that we give to the template system.
Okay, how do you do that?
So the way we do that is by using the...
So the dynamic dispatch uses signatures.
So when you get an object, you ask it, what is your static type?
And what we do then is when we see an integer that goes into a place where a static integer is expected.
Instead of saying this is an integer,
we say this is a stood interval constant,
and we pass the value of the dynamic integer into there.
But don't forget that we generate code,
so we just print, actually, we just print the value of that integer
into the C++ code
at the place where the std::integral_constant
is expected.
Then you compile, you load,
and it works.
So a function is requested at runtime,
and you don't yet have a version of this function.
So you actually generate C++ code, like a .cpp file, and compile that?
Again, I'm not exactly generating the function.
I'm generating the instantiation of the function.
The function exists, but it's the parameters that are computed and printed,
and it's the instantiated function that we compile and load.
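The generation step can be sketched as printing the runtime value into the text of a source file. Everything named here is an assumption made for illustration: the header `project.hh`, the template `project`, and the helper `instantiation_source` are hypothetical, and the compile and load steps are left out.

```cpp
#include <sstream>
#include <string>

// Returns the text of a C++ source file that explicitly instantiates a
// hypothetical template
//   template <typename Aut, typename Tape> auto project(const Aut&, Tape);
// assumed to be declared in the hypothetical header "project.hh".
inline std::string instantiation_source(const std::string& automaton_type,
                                        unsigned tape)
{
  std::ostringstream out;
  out << "#include <type_traits>\n"
      << "#include \"project.hh\"\n"
      << "using aut_t = " << automaton_type << ";\n"
      // The runtime value is printed as text, so it becomes a
      // compile-time constant in the generated file.
      << "using tape_t = std::integral_constant<unsigned, " << tape << ">;\n"
      << "template auto project<aut_t, tape_t>(const aut_t&, tape_t);\n";
  return out.str();
}
```

The function template itself is never generated. Only the file that names its parameters is, which matches the distinction drawn above between generating a function and generating its instantiation.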
Okay.
I think that, yeah, the piece that I was missing,
and maybe if you can correct me if I'm wrong,
is the fact that you are actually generating a C++ file
that needs to be passed to the compiler
that has these instantiations that you're looking for.
Exactly. That's what we do.
All right. That makes sense.
And I would never have thought of going down that route, but I love it.
Good.
So maybe I have one more contributor.
We'll see.
It seems like the kind of thing Klemens Morgenstern might be interested in, too, since he works on the dynamic loading libraries for Boost. I don't know.
Anyhow.
So you had a lot of stuff going on in your bio.
Maybe we could touch on some of these other topics. Do you want to tell us a little bit about the Tiger compiler that your graduate students work on?
So the Tiger compiler is
a project at Epita
and it was really designed
for them to have a long-term
project which was
tricky, not
a simple project and something that would
run for several
months, because students often make
what we call Kleenex projects.
You know, you just write the project and you throw it away.
And they pick up bad habits from this.
They don't write tests, or not enough tests,
and not enough documentation.
The code is not readable.
So we really wanted to find something that would last for several months
so that the students would understand the pain of not maintaining the test suite,
not maintaining the comments, not maintaining the code and everything.
And a compiler is very nice for this because
it's just a pipe. It's a
sequence of modules.
So
the way we designed the
Tiger compiler is that
first they write
the parser, and then they write
the binder, which
just checks which
names are introduced
and which names are used,
and then the type checker, and so forth.
So it's a project over several months,
and they have to write it in C++.
So each module tries to bring something new about C++
to discover in its implementation.
Okay.
Do you target LLVM, or have the students use LLVM, that is?
Yes and no.
So there are several backends,
and recently some of the students have implemented a backend for LLVM,
but the main backend that we propose to them is one they implement themselves, by hand, without using any library.
That would be a very interesting project to work on.
Yeah.
And you also mentioned that you recently joined Infinit, which is part of Docker, and that some of your students apparently work there.
Do you want to tell us a little bit about what you're doing with Docker?
So Infinit is working on a distributed file system.
So Docker is really about containers,
and containers roughly are very lightweight virtual machines.
It's not virtual machines.
It's different, but I'm not sure I have enough time
to describe what exactly containers are.
But containers, the idea is really to be able to spawn
many containers, many virtual machines,
and maybe lose them,
and maybe put more...
Being able to be elastic.
It's really designed for the cloud.
So you can spawn new machines,
kill machines, etc.
You don't care about these machines.
It's really cattle.
The problem, of course, is that
usually if you have several machines,
they are computing something useful,
and you want to keep it.
So that's where Infinit is used.
Infinit is a distributed file system,
and that's where we put the data that you want to last.
Okay.
Yeah, I'm sure we've talked about it.
I'm sorry?
Infinit provides the persistent storage.
Okay.
Yeah, I'm sure we've talked a little bit about Docker in the past on the show.
I'm sure it's something that at least most of our listeners are familiar with.
Hmm.
Yeah.
Yeah.
Most probably.
Okay.
Well, is there anything else you wanted to go over before we let you go?
It's been great having you on today, Akim.
Thank you very much.
Well, I don't know.
If you have questions, just call me back.
Okay.
Okay.
And we'll put links in the show notes, but where can people find Vcsn or find you online?
Well, the easiest is to go to Vcsn's home page, which will
be in the show notes, so...
Okay.
The easiest way to reach me is to go to cppcast.com.
Okay.
Do you have any papers or anything that you've written, documentation for how the dyn
system works?
Yes, there's a tech report, so I will provide it
also for the show notes
if you want more gory details
or maybe someday
I could have a guided visit
into the code. I don't know
how to do a virtual guided visit
in code, but that would be
something to experiment with.
Yeah, it could be interesting to
screencast.
Okay, well it's been great having you on. Thank you.
Thank you very much.
Thank you.
Thanks so much for listening in as we chat about C++.
I'd love to hear what you think of the
podcast. Please let me know if we're discussing
the stuff you're interested in. Or,
if you have a suggestion for a topic, I'd love to hear
about that too. You can email all your
thoughts to feedback@cppcast.com.
I'd also appreciate it if you like CppCast on Facebook and follow CppCast on Twitter.
You can also follow me @robwirving and Jason @lefticus on Twitter.
And of course, you can find all that info and the show notes on the podcast website
at cppcast.com.
Theme music for this episode is provided by podcastthemes.com.