CppCast - Joedb
Episode Date: September 30, 2021
Rob and Jason are joined by Remi Coulom from Kayufu. They first discuss another blog post about the ongoing ABI problems in C++ and another on common mistakes with comparison functions. Then they talk to Remi about Joedb, the Journal-Only Embedded Database.
News
Discontinue Sourcetrail
Binary Banshees and Digital Demons
Djinni generator release v1.2.0
Opzioni
Everybody Makes Mistakes When Writing Comparison Functions
Links
Joedb - Journal-Only Embedded Database
Kayufu - Artificial Intelligence in Games
Sponsors
PVS-Studio Learns What strlen is All About
PVS-Studio podcast transcripts
Transcript
Episode 319 of CppCast with guest Remi Coulom, recorded September 29th, 2021. In this episode, we discuss ABI and comparison functions.
And we talk to Remi Coulom from Kayufu.
And he talks to us about Joedb, the Journal-Only Embedded Database.
Welcome to CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
I'm all right, Rob. How are you doing?
Doing all right. You got anything going on you want to talk about today?
Well, no.
I mean, still at the moment, I don't know if I'll be allowed to go to Norway for people who are curious about that.
But I am still planning to go to CppCon since it's right there.
In your backyard.
Yeah.
That way, actually.
Yeah.
And that's what, like three weeks away now, right?
Four?
Is it really?
Three and a half? Something like that. I don't know.
How many weeks away? That's a great question. We are like a day away from October, I know that. So it is October 24th, so it's three and a half weeks.
Yeah, or Saturday is the start of my class, the 23rd.
Oh my goodness, I should probably create that class.
No, I'm just kidding.
I've actually been working on it off and on for a couple of months.
But I need to make sure everything is smooth on it now.
Yeah, definitely.
Okay, well, at the top of every episode, I'd like to read a piece of feedback.
We got this email from Frederic Simonis. And he wrote, I'm not sure if this breaking news has reached you,
but SourceTrail was abandoned by its developers.
And obviously we had an episode of SourceTrail a while back,
but he goes on to say, yeah, the devs would be a good candidate for the show,
especially as they seem to have been crushed by the complexity of their task.
GUI design, cross-platform support, weird user setups apparently became a complexity
hell, which quickly became way more difficult to deal with than the core visualization concept.
And he also says, you know, with our reach in the C++ community, maybe we can help find
a volunteer to maintain this crucial piece of C++ tooling.
So yeah, I'll put a link to this blog post
in the show notes where SourceTrail author Eberhard,
who we had on the show before,
talks about their decision
to discontinue working on the product.
Did you read that official announcement, by the way, Rob?
I did read through it.
And yeah, it's just a shame that they had to close it down.
Well, I mean, yeah.
They had their reasons.
Stopped getting momentum and support and everything else.
And, you know, maintaining a large open source project for free is expensive.
Very, very much so.
But yeah, I mean, maybe a larger community could form around it and more easily support the work that's needed to keep it going.
But it would be interesting to talk to Eberhard again and see if he has
any more to talk about or see what he's up to next.
Well, we'd love to hear your thoughts about the show.
You can always reach out to us on Facebook, Twitter, or email us at feedback@cppcast.com.
And don't forget to leave us a review on iTunes or subscribe on YouTube.
Joining us today is Remi Coulom.
Remi was born in 1974 and started to code in 1983. He wrote his first C++ code in '92 with Borland Turbo C++,
and it has been his main programming language since then.
Remi completed his PhD in cognitive science in 2002 with a research topic of reinforcement learning with
artificial neural networks,
invented Monte Carlo tree search, and applied it to the game of Go with his Go program,
winning many international computer Go tournaments.
In 2010, Remi was contacted by Unbalance Corporation, who offered to buy a license of his code.
And in 2014, Remi quit the University of Lille to create his own company, Kayufu, and work
full-time on developing game AI software.
Remi, welcome to the show.
Hi. Thank you for inviting me.
You know, we've had Borland Turbo C++ come up a number of times on the show, right, Rob?
I mean, we even talked to some of the people who work on the modern incarnation of Borland.
What's that called? Embarcadero, yeah.
I'm just curious if you miss those
Borland Turbo C++ days
at all, the nice little integrated
environment and everything
that they had.
Yes, well, I think it
was an amazing piece of software.
Well, I miss...
I could reuse it if I wanted
to, and probably I won't.
But, yeah, it was...
Because before programming in C++,
I was programming in BASIC like everybody else
with the 8-bit machines of the 80s.
Yes, when I discovered C++, I was amazed
because, yeah,
I remember being amazed because
with the 8-bit machines, you have
the choice to program either in assembly
or BASIC.
BASIC was extremely slow
and with C++,
I discovered the enjoyment
of being able to write in a high-level
language and be almost as fast as assembly.
And yes, I remember being enthusiastic when I discovered C++.
Out of curiosity, what 8-bit computer did you get yourself started with in 1983?
So the very first one was called Videopac in France,
and I think in the U.S. it's the Magnavox Odyssey 2.
Okay.
So that was
assembly programming
with 100 bytes
of RAM for code and data.
Oh, wow.
And I enjoyed it very much.
And then a few years later,
I got MO5, a French computer by Thomson,
which has 32 kilobytes of RAM and a 6809 processor.
And when I was a teenager, during all my teenage years,
I programmed on this machine and programming games mainly.
And then I got a PC when I was 18.
We point out for our listeners,
you said 100 bytes, not 100 kilobytes
or megabytes or gigabytes
of RAM.
That might be smaller than the smallest
system I've ever programmed for, which I think
had 768 bytes of RAM.
Microcontroller.
All right. Well, Remi, we've got
a couple news articles to discuss.
Feel free to comment on any of these,
and we'll start talking more about the stuff you're working on,
like Joedb, okay?
Okay.
Okay, so this first one we have is a very long blog post
from JeanHeyd Meneide titled Binary Banshees and Digital Demons.
And this is his post on ABI,
which we've obviously talked a lot about on the show.
He certainly has a lot of thoughts here.
I don't know, Jason, where do you want to start with this one?
Oh my goodness.
I don't even know.
Yes. The ABI has demons, ghosts.
I don't know.
The main thing he's focusing on is it just seems that it's blocking so much potential improvement in the standard.
But it's worse than that.
So one of the examples down at the bottom of this article is about the std::format implementation in Visual Studio, which I think we
discussed briefly on the show, where Visual Studio implemented std::format. And effectively,
there was a bug in the design of std::format because it accidentally relied on locale when
it wasn't supposed to. And Microsoft said, nope, we're not breaking ABI. Too bad, can't fix the bug. And it was a C++20 feature implemented in 2020,
and they're fixing the bug in 2021.
Nope, can't fix it.
So his argument is that it's not the best solution that wins.
It's the first solution that wins.
Another example in here was nested functions in C. If they were to
implement nested functions in C, like allow that feature, then they either have to go with
the non-standard implementation that GCC has had for decades and lots of people use, or break decades' worth of GCC code
by specifying something different.
So there's just no way to do it,
because it would be an ABI-related issue again.
And yeah, yeah, that's what I took away from it.
Yep.
Yeah.
You have any thoughts on this one, Remy?
Any thoughts on ABI break?
Yeah, I know that ABI
is a big mess and
my personal point of view is
that there are some
very clever people like you who think about
these problems and I try to stay
away from it because
I enjoy programming my C++
code and
I'm not involved in
C++ design so
I prefer to
trust people to do it
and I deal with whatever
constraints the
compiler leaves. But yeah, I think
it's so from what I understand
it's a pity because
some companies like Google
have from what I understand
I'm not a specialist,
but they tend to develop their own standard library because it's not possible to fix
the existing standard library.
Right.
Yeah.
Yeah.
My feeling is that
there should be a way to deal with this
cleanly,
like stating an ABI version somehow.
Versioning or just letting
everyone know when a new standard comes out
it's going to be an ABI break.
Yeah, something.
But yeah, I imagine
it can be extremely complicated
but I don't know the details.
I'm pretty sure this was the point of inline
namespaces in C++11:
to allow you to swap at compile time what version of an ABI you used,
and then at runtime you would get a break if things went awry.
But that's a C++ fix,
and JeanHeyd's been really involved in the C standards committee lately also,
pointing out that C has just absolutely no solution to this because their ABI is too simple.
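As a rough illustration of the inline-namespace idea mentioned here, the sketch below keeps two versions of a type side by side. The names `mylib`, `v1`, `v2`, `Options`, and `describe` are made up for the example. With common toolchains the version name ends up in the mangled symbol, so a mismatch would usually show up as an unresolved symbol at link or load time rather than as silent corruption.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace mylib {

// Old layout, kept around so code built against it still links.
namespace v1 {
struct Options {
    int verbosity = 0;
};
inline std::string describe(const Options&) { return "v1"; }
}  // namespace v1

// Current layout. Because v2 is an *inline* namespace, `mylib::Options`
// names `mylib::v2::Options`, and "v2" becomes part of the symbol name.
inline namespace v2 {
struct Options {
    int verbosity = 0;
    bool color = false;  // new field: size and layout may differ from v1
};
inline std::string describe(const Options&) { return "v2"; }
}  // namespace v2

}  // namespace mylib

// Names without an explicit version resolve to the inline namespace.
static_assert(std::is_same<mylib::Options, mylib::v2::Options>::value,
              "mylib::Options should be the v2 type");

// Callers pick up v2 by default; v1 stays reachable by its full name.
inline void inline_namespace_demo() {
    assert(mylib::describe(mylib::Options{}) == "v2");
    assert(mylib::v1::describe(mylib::v1::Options{}) == "v1");
}
```

Switching the default version would mean moving the `inline` keyword from one namespace to the other and recompiling.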
Yeah.
Yeah.
Sometimes lately I've been thinking that my role on social media or whatever is to be
the last person standing who is still optimistic about C++. It does seem like it is
really causing
a lot of depression among some committee members.
These ongoing ABI
issues, I really feel for
the people in the community.
And they make a reasonable
proposal, and JeanHeyd
covers several of these. A reasonable proposal.
Sorry, nope. That would be an ABI
break in the compiler
that you've never heard of before.
Yeah, because they already implemented whatever feature.
Okay, so moving on, the next thing we have is a new release from Djinni. And we talked about this,
I think, earlier this year over the summer with Harald. And yeah, so it was just a new version.
Just wanted to highlight that they are still actively working on this.
And I don't recall how much we went into this with him,
but it looks like they now have both Python and C Sharp support.
I know that wasn't in the original version of Djinni that Dropbox worked on.
That was originally for Java, right?
It was originally just Java and Objective-C,
I believe, but they definitely have Python and C Sharp now. Objective? Objective-C. Apple stuff.
Yeah. iOS, yeah. And then the next thing we have is Opzioni, which is a command line arguments
parser library for C++. You want to tell us about this one, Jason?
This came up because I tweeted, I don't know, a couple weeks ago.
I said something like, group therapy time.
Everyone should share projects that they started but never completed.
And someone shared this project, and it really jumped out at me
because there's a million command line argument parsers.
Yeah, a million command line argument parsers.
And this just came up with Clara again when we were
talking with Phil last time we had him on and talking about the fork of Clara.
Well, okay, so this one jumped out at me because the author makes a point of saying at compile time you know
all of the command line arguments your project can take, right?
So it's actually a constexpr generated command line argument parser.
Because you can say program whatever dot version dot intro whatever, assign that all to a constexpr variable, and it's going to build the command line argument handling stuff at compile time.
I just thought I wanted to call attention to that.
That's a shiny example of constexpr use right there.
Definitely.
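To make the compile-time idea concrete, here is a minimal sketch of an option table that the compiler can check. It is not Opzioni's API: `Option`, `kOptions`, `same`, `find_option`, and `takes_value` are hypothetical names, and the code only assumes C++14 `constexpr`.

```cpp
#include <cassert>
#include <cstddef>

// One option the program accepts, described entirely by literals.
struct Option {
    const char* name;
    const char* help;
    bool takes_value;
};

// The full option set is known when the program is compiled.
constexpr Option kOptions[] = {
    {"--help", "print usage", false},
    {"--version", "print version", false},
    {"--output", "output file", true},
};

// constexpr string comparison (std::strcmp is not constexpr).
constexpr bool same(const char* a, const char* b) {
    while (*a != '\0' && *a == *b) {
        ++a;
        ++b;
    }
    return *a == *b;
}

// Returns the index of `name` in kOptions, or -1 if it is unknown.
constexpr int find_option(const char* name) {
    for (std::size_t i = 0; i < sizeof(kOptions) / sizeof(kOptions[0]); ++i) {
        if (same(kOptions[i].name, name)) return static_cast<int>(i);
    }
    return -1;
}

// These checks are done by the compiler, not at run time.
static_assert(find_option("--output") == 2, "--output should be option 2");
static_assert(find_option("--bogus") == -1, "unknown options are rejected");

// The same lookup also works on real argv strings at run time.
inline bool takes_value(const char* arg) {
    const int i = find_option(arg);
    assert(i >= 0 && "caller should check find_option first");
    return kOptions[i].takes_value;
}
```

A real library would go much further (typed values, generated help text), but the table and its lookups already exist before the program starts.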
Okay.
And then this last one we have is another blog post from PVS Studio,
a very short one.
And this one is about OpenSSL 3.0
and everybody makes mistakes when writing comparison functions.
And yeah, I mean, just it's a very, very short post, just finding that they have this comparison
function, taking in one argument of A, one argument B, and they just compare A to A, basically.
Yeah.
This is why I'm a fan of -Wunused, because it would have also flagged on this.
Yeah, definitely.
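The kind of mistake discussed here can be shown in a few lines. This is an invented example, not the OpenSSL code from the article: `Item`, `less_buggy`, `less_fixed`, and `sorted_by_priority` are hypothetical. In GCC and Clang, an unused-parameter warning (for instance via `-Wextra` or `-Wunused-parameter`) would point at `b` in the buggy version.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Item {
    int priority;
};

// Buggy: the right-hand side should be b.priority. Since `b` is never
// used, an unused-parameter warning points straight at the mistake.
inline bool less_buggy(const Item& a, const Item& b) {
    return a.priority < a.priority;  // compares a with itself: always false
}

// Fixed: compares the two different arguments.
inline bool less_fixed(const Item& a, const Item& b) {
    return a.priority < b.priority;
}

// Sorts a copy of `items` in ascending priority with the fixed comparison.
inline std::vector<Item> sorted_by_priority(std::vector<Item> items) {
    std::sort(items.begin(), items.end(), less_fixed);
    assert(std::is_sorted(items.begin(), items.end(), less_fixed));
    return items;
}
```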
Any thoughts from you, Remi, by the way?
We kind of skipped over you on the last one.
Yeah, well, yeah, no thoughts about it.
But I am happy to learn about this command line parsing library.
Joedb badly needs a command line parsing library.
Unfortunately, C++20 is maybe a bit too cutting edge for me at the moment.
Yeah, that's probably true. That one does look like it requires the
latest there.
So I guess before we get into the interview too much, I think it's worth pointing
out, like in a disclosure kind of perspective, that I actually did code reviews for Remi, what, a couple of months ago on Joedb.
And so I've looked at this project some in the past already.
Well, pretty in-depth in the past, actually.
But just so that that's out there, that we had had that working relationship already.
Right. Well, Remi, why don't you start off by telling us a little bit more about what Joedb is.
Yes. So Joedb stands for the Journal-Only Embedded Database.
And it's a C++ library to store data on disk.
And so my initial motivation is, like every C++ programmer, I need to save structured data on disk.
And I was not very satisfied
by the existing solutions
because I wanted to have crash safety
and transactions.
And pretty much the only solution
for this is database.
But they felt too complicated for me
and using SQL queries inside C++ felt very unpleasant.
So I said to myself,
I'll try to come up with a simple solution.
So in order to have crash safety,
I need to have transactions,
and a journal is necessary.
And then I said to myself,
well, nothing else is necessary,
and I will make a very minimalist C++ library
to store structured data.
And so Joedb stores on the disk only
a log of all operations.
And it keeps tables in memory
in a C++ data structure.
So manipulating data can be done directly
in C++ like a container.
And writes are written
at the end of the log.
This way I can keep the whole history
of the data and
commit to
confirm data.
So it's a very simple
library and I can manipulate
tables. So the
data structure is similar to
SQL database.
The file
is made of tables and each table
has fields,
columns that have a type
and it's possible
to have a pointer from one table
to another one.
It's a very
minimalist library.
It's maybe 10 times smaller than SQLite.
And it gives me the features
I want it to have,
such as crash safety
and also concurrency
in a very small C++ library.
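The journal-only idea described above can be sketched in a few dozen lines. This is a toy, not Joedb's file format or API: the text log, `CityTable`, `Journal`, and `replay` are assumptions made for the example, and a string stream stands in for the file on disk.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// In-memory table: one vector per column.
struct CityTable {
    std::vector<std::string> name;
    std::vector<int> population;
};

// Hypothetical text journal: one operation per line, append only.
//   insert <name> <population>
//   update <row> <population>
class Journal {
public:
    void insert(const std::string& name, int population) {
        log_ << "insert " << name << ' ' << population << '\n';
    }
    void update(std::size_t row, int population) {
        log_ << "update " << row << ' ' << population << '\n';
    }
    std::string text() const { return log_.str(); }

private:
    std::ostringstream log_;
};

// Rebuilds the table by replaying every operation from the start.
inline CityTable replay(const std::string& journal_text) {
    CityTable table;
    std::istringstream in(journal_text);
    std::string op;
    while (in >> op) {
        if (op == "insert") {
            std::string name;
            int population = 0;
            in >> name >> population;
            table.name.push_back(name);
            table.population.push_back(population);
        } else if (op == "update") {
            std::size_t row = 0;
            int population = 0;
            in >> row >> population;
            assert(row < table.population.size());
            table.population[row] = population;
        }
    }
    return table;
}
```

Nothing is ever overwritten in the log: an update is just one more line, so the earlier value stays in the history.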
And when you say
that it's journal-based,
I just want to clarify,
is that the same
as like a journaling file system?
Or do you know?
I don't know enough
about journaling file systems to answer this question.
I was hoping you could explain it to us.
Yes.
The basic idea is,
but I think it's the same principle,
that is to say,
I want to be able to recover from a crash
or a power failure that may occur at any time,
even at the worst time
in the middle of a write.
So it's difficult.
And the way it works
is rather simple.
So in each Joedb file,
I have a checkpoint
that indicates the length of the file
up to which the data
have been correctly written.
And when I write a transaction,
I first write the log entries at the end of the file
and the first copy of the checkpoint position.
I flush this to the disk,
and then I write a second copy of the checkpoint position.
And so the log entries are considered as definitely stored safely on disk when the two copies of the
checkpoint positions are identical.
So that's how it works.
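The commit sequence just described, together with the check done when a file is reopened, might look roughly like the sketch below. It is a simplified model, not Joedb's on-disk layout: `FakeFile`, `CrashAfter`, `commit`, and `recover` are hypothetical, the "file" lives in memory, and the second checkpoint copy is assumed to double as the memory of the previous checkpoint.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for a file on disk: two checkpoint slots plus the log bytes.
struct FakeFile {
    std::size_t checkpoint_a = 0;  // first copy of the committed length
    std::size_t checkpoint_b = 0;  // second copy of the committed length
    std::string log;               // journal bytes
};

// A simulated crash can interrupt a commit after either early step.
enum class CrashAfter { Never, LogAppend, FirstCopy };

inline void commit(FakeFile& f, const std::string& entries, CrashAfter crash) {
    f.log += entries;  // 1. append the log entries
    if (crash == CrashAfter::LogAppend) return;
    f.checkpoint_a = f.log.size();  // 2. write the first checkpoint copy
    if (crash == CrashAfter::FirstCopy) return;
    // (a real implementation would flush to disk at this point)
    f.checkpoint_b = f.log.size();  // 3. write the second checkpoint copy
}

// On reopen: trust the log only up to a position both copies agree on.
// If they differ, the commit was interrupted and the second copy still
// holds the previous checkpoint.
inline std::size_t recover(FakeFile& f) {
    const std::size_t committed =
        (f.checkpoint_a == f.checkpoint_b) ? f.checkpoint_a : f.checkpoint_b;
    assert(committed <= f.log.size());
    f.log.resize(committed);  // drop any half-written tail
    f.checkpoint_a = committed;
    return committed;
}
```

An interrupted commit is simply forgotten: after `recover`, the file looks as if the transaction had never started.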
And when reopening a file, if the two copies don't match, then it's possible to recover
from the previous checkpoint position that is also memorized.
So with this journaling,
it sounds like it should be easy
to kind of go back in time
and look at the history
of what your database used to have.
Yes, so you can replay history
from the beginning
and undo any change.
And that's a really nice feature
because if you have a SQL database,
if you erase some data by mistake, then it's a problem.
So on a system based on SQL database, you really need to set up a system of periodic backups
if you want to be able to recover from a mistake.
With Joedb, you can undo anything.
So does that mean each time you open the database again,
it has to replay the whole history to get up to?
Yes.
Okay.
Oh, interesting.
I was just going to say, for a larger database,
does that become a lengthy operation to rebuild the data
and get up to what the current state of it should be?
Well, for the things I do, it's not a problem.
Okay.
So, of course, the big limitation compared to SQL database
is that data has to fit in RAM.
So I cannot use Joedb for databases that are bigger than RAM
because data is stored in C++ data structures in RAM.
Okay.
Okay.
But, yeah, loading is fast.
Now, the whole history is not stored in RAM, right?
Only the current state, and then you write the history out?
No, the history is only in the file.
Okay.
And the tables are stored in RAM.
I just thought about this for the first time,
because I work, you know, long-time listeners of the show know one of the projects that I work on is a physics simulation system that I'm contracted to work on.
And one of the things that they want to be able to do is, you know, snapshot the current state of the simulation and then fork it, branch it, do whatever, roll back if they need to. And for something like that, if you're not working with very large data sets,
Joedb could actually be a possible solution
because you could run a full simulation all the way up to the end point
and then say, wait a minute, what happens if we go back and change the simulation
20 time steps ago if you saved every time step in the journal?
That might work.
I see, yes.
But maybe it would be profitable only if simulating is more costly
than storing on disk.
Right.
Because you do a flush, you said, with each write.
Well, you can decide when you flush.
Okay.
So if you don't need durability,
you can flush
a huge buffer if you want,
and it will be much faster than flushing every write.
Right.
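The trade-off between flushing every write and flushing a large batch can be sketched with a small wrapper. `BatchedWriter` is a hypothetical class, not part of Joedb. Note also that `std::ostream::flush` only hands data to the operating system; real durability would need something like `fsync`, which this sketch leaves out.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Appends journal entries to `out`, flushing once every `batch` entries.
// batch == 1 flushes on every write (most durable, slowest).
class BatchedWriter {
public:
    BatchedWriter(std::ostream& out, std::size_t batch)
        : out_(out), batch_(batch) {
        assert(batch_ > 0);
    }

    void write(const std::string& entry) {
        out_ << entry << '\n';
        if (++pending_ == batch_) flush();
    }

    // Pushes buffered entries out; does nothing if there are none.
    void flush() {
        if (pending_ == 0) return;
        out_.flush();
        pending_ = 0;
        ++flushes_;
    }

    std::size_t flush_count() const { return flushes_; }

private:
    std::ostream& out_;
    std::size_t batch_;
    std::size_t pending_ = 0;
    std::size_t flushes_ = 0;
};
```

With a batch size of 1000, a crash could lose up to 999 recent entries, which is the price paid for the speed.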
The sponsor of this episode of CppCast
is the PVS-Studio team.
They developed the PVS-Studio static code analyzer.
The tool helps find typos,
mistakes, and potential vulnerabilities in code.
The earlier you find an error, the cheaper it is to fix, and the more reliable your product releases are.
PVS-Studio is always evolving in two directions.
The first is new diagnostics.
The second is basic mechanics, for example, data flow analysis.
The PVS-Studio Learns What strlen Is All About article describes one of these enhancements. You can find the link in the podcast description. Such articles allow you to take a
peek into the world of static code analysis.
So you mentioned that you weren't a big fan of
writing SQL code in C++, which is understandable. What does it look like to write queries and transactions in Joedb?
So it looks like accessing a normal C++ container.
So you can...
So in fact, the database is stored in C++ containers.
Each column is stored in a vector, std::vector.
And you can also...
I did not explain one thing that is very important.
There is a compiler.
So you define the schema of the database
and the compiler will generate C++ code
to open a file and access tables and data.
And you can
instruct the compiler
to build
indexes, for instance.
So if
you have a table of
users of
your website that are
identified by their email
address, you may wish to be able
to search rapidly in
the table based on their email address.
And so the compiler would generate C++
code to
automatically generate and update
a std::map that
will let you search
the table of users based on
their email address very rapidly.
So yes,
it's...
So, that's a negative
point of Joedb, that is to say it works only
with C++, unlike
other databases.
But it's also a good
plus because you don't have to
have this clumsy
query
to convert back and forth
from C++. It's directly
C++ data structure.
Just to be clear, you don't write your own
C++ data structures at all.
You specify the schema and then use
the data structures generated by the
compiler. Yes.
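To give a feel for what schema-generated code with an index could look like, here is a hand-written sketch. It is not actual Joedb compiler output: `UserTable` and its member functions are hypothetical, and only the two ideas from the conversation are kept, namely one `std::vector` per column and a `std::map` index on the email address.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Sketch of what generated code for a `user` table could look like.
class UserTable {
public:
    // Row ids are simply indexes into the column vectors.
    using Id = std::size_t;

    Id insert(const std::string& email, const std::string& name) {
        assert(by_email_.count(email) == 0 && "email must be unique");
        const Id id = email_.size();
        email_.push_back(email);
        name_.push_back(name);
        by_email_[email] = id;  // index kept in sync on every write
        return id;
    }

    // Looks a user up through the index; returns false if absent.
    bool find_by_email(const std::string& email, Id& id) const {
        const auto it = by_email_.find(email);
        if (it == by_email_.end()) return false;
        id = it->second;
        return true;
    }

    const std::string& name(Id id) const { return name_.at(id); }

private:
    std::vector<std::string> email_;      // one vector per column
    std::vector<std::string> name_;
    std::map<std::string, Id> by_email_;  // index on the email column
};
```

The lookup is logarithmic in the number of users, and the caller works with ordinary C++ strings throughout, with no query language in between.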
So what was your primary
use case for writing
Joedb? How do you use it?
I use it in many different situations.
So whenever I have to write anything to disk,
basically I use Joedb.
So my very first use of Joedb was for a website.
So I used Joedb to store the usernames
and all the information.
So my website is not like Facebook,
so I don't have billions of users.
It fits in RAM.
And then I use it for all my game AI.
So I use it, for instance,
so I'm using deep learning
and convolutional neural networks.
So I store my neural networks in Joedb format.
And that's also one big plus
of Joedb, that is to say,
when doing machine learning, we have to manipulate
a lot of vectors, large vectors
of floating-point numbers.
Manipulating large vectors of floating-point
numbers in an SQL database
is not so
convenient.
And with Joedb,
you have your C++ vector
of floating-point numbers.
So yes, that's one.
And yeah, writing is extremely fast.
And so yes,
that's one application,
storing neural networks.
I also store my training set
in Joedb format.
And
if I want to store also the game records to replay a game,
I store game records in Joedb format.
And yeah, whenever I need to write a file,
I write it in Joedb format anyway.
It's so simple to use.
So you said game records.
Does that mean like with your Go engine
that you would store every move made by every player like an event log and then be able to replay that?
Yes.
And also for the end user of our applications, if they wish to save their games they play against AI, they can save it.
And the format for saving the game record is Joedb.
So from the C++ perspective, we talked about opening the file, and it has to kind of replay
everything to get to the last known state. But if I wanted to open the file at the first
known state, and then just increment to the last known state, can I do that?
You can. So at the moment, there is no very convenient API
to manipulate history,
because in practice, for the use I have,
I don't have to do it.
But when you open a file,
it's possible to replay the log step by step.
And if you want to catch events,
so for instance, in the log,
you can write a data operation
such as adding a new row or modifying a value.
You can also insert timestamps or comments in the log.
So you can, if you want to,
when you open a file, you can say,
okay, replay the file until such timestamp.
This is possible.
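Replaying a log up to a given timestamp, as described here, can be sketched as follows. The `Entry` layout and `replay_until` are assumptions for the example and do not reflect Joedb's actual API; the "table" is reduced to a single column of integers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical journal entry: either a timestamp marker or a data change.
struct Entry {
    enum class Kind { Timestamp, Append } kind;
    std::int64_t value;  // seconds for Timestamp, payload for Append
};

// Replays `journal` in order and stops at the first timestamp marker
// later than `until`. Returns the table as it looked at that moment.
inline std::vector<std::int64_t> replay_until(const std::vector<Entry>& journal,
                                              std::int64_t until) {
    std::vector<std::int64_t> table;
    std::int64_t last_timestamp = INT64_MIN;
    for (const Entry& e : journal) {
        if (e.kind == Entry::Kind::Timestamp) {
            assert(e.value >= last_timestamp);  // markers never go backwards
            last_timestamp = e.value;
            if (e.value > until) break;  // everything after is "the future"
        } else {
            table.push_back(e.value);
        }
    }
    return table;
}
```

Stepping through history one operation at a time would be the same loop with a counter instead of a timestamp as the stopping condition.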
It sounds very similar to
how Doom
game events are stored,
which is what I'm thinking of from when I
worked on my Doom C++
port. A similar
kind of idea. It's events and timestamps,
you know, so you can go back and completely
replay a demo of a game, whatever. It sounds like it could be a good use case for game developers
who want to store like a set or series of input events or something.
Yes, maybe. Yeah, I don't
know Doom, but yeah.
Right. Yeah, you learn a lot taking someone else's random project and trying to port it to C++ 20-some-odd years later.
What does the performance of Joedb look like
compared to something like SQL or SQLite?
Yes.
So the performance is the performance of a std::vector.
So if you are reading data,
you are reading a std::vector.
So if your data can be stored efficiently
in a std::vector,
it's pretty much as fast as it can be.
And so in the code of Joedb,
I made some simple benchmark
for bulk insert in SQLite database
and do the same with
Joedb.
So for bulk insert,
I found that Joedb
is about an order of magnitude faster
because what it does
is much simpler.
So I'm a bit cautious about these benchmarks
because I'm not a big SQLite specialist
and maybe the SQLite developer
will be able to do something more efficient.
But in the principle,
SQLite has to store the table on disk
and pretty much all SQL databases
are based on balanced trees,
which means in terms of C++ containers,
it's like a std::map.
And Joedb stores in vectors,
so it's a lot simpler.
If your data is a big vector,
then it's going to be much faster in Joedb necessarily.
You said each column is its own vector, right?
Yes.
So that also has the interesting potential side effect that if you were already wanting to use a data-oriented design, you kind of naturally got that out of it.
If you say, I need to iterate over all of the X coordinates of my data set or whatever, bam, it's already hot in the cache, ready to go.
Yeah.
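The data-oriented point can be shown with a tiny column-per-vector layout. `Points` and `mean_x` are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Column-per-vector layout ("structure of arrays").
struct Points {
    std::vector<double> x;
    std::vector<double> y;
    std::vector<double> z;

    void add(double px, double py, double pz) {
        x.push_back(px);
        y.push_back(py);
        z.push_back(pz);
    }
};

// Reads only the x column: the y and z vectors are never touched,
// so the loop walks one contiguous block of memory.
inline double mean_x(const Points& p) {
    assert(!p.x.empty());
    const double sum = std::accumulate(p.x.begin(), p.x.end(), 0.0);
    return sum / static_cast<double>(p.x.size());
}
```

With an array of `{x, y, z}` structs, the same loop would drag the unused `y` and `z` values through the cache as well.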
Very cool. So we're kind of talking
about just the single computer use case, but Joedb also has
a server kind of mechanism, right?
Yes.
That's also another use of Joedb I make that I did not mention.
For my machine learning experiment, I need to do distributed
calculations. So I have multiple machines playing games against each other.
And so Joedb has a concurrency model and a network protocol
so that it's possible for multiple machines to share a shared database. And so it's, again,
a very minimalist and simple way
to handle concurrency.
The way it works is a bit like Git
because I have a log history
of all the change, like in Git,
and I can push and pull
between two copies of the database.
So in a client and a server, the server has a copy of the log,
and the client has a local copy.
And so unlike Git, I don't deal with merges.
So I need to keep the history linear.
I can do this with a lock.
So basically, the client has four operations with respect
to the server. Can lock, unlock, push,
and pull. And with
these four basic operations, it's
possible to
conveniently synchronize
multiple clients
connected concurrently to the same server.
And basically,
a transaction is you lock,
then pull, and make local modification, then push and unlock.
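The four operations and the transaction pattern described above can be modeled in a single process. This is only a sketch under that assumption: `Server` and `Client` are hypothetical classes, there is no networking, and journal entries are plain strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

constexpr int kNobody = -1;  // lock owner value meaning "unlocked"

// Server side: owns the authoritative, linear journal and one lock.
class Server {
public:
    bool lock(int client) {
        if (locked_by_ != kNobody) return false;
        locked_by_ = client;
        return true;
    }
    void unlock(int client) {
        assert(locked_by_ == client);
        locked_by_ = kNobody;
    }
    // Returns every entry from position `from` to the end of the journal.
    std::vector<std::string> pull(std::size_t from) const {
        assert(from <= journal_.size());
        return std::vector<std::string>(
            journal_.begin() + static_cast<std::ptrdiff_t>(from),
            journal_.end());
    }
    // Appends entries; only the lock holder may push.
    void push(int client, const std::vector<std::string>& entries) {
        assert(locked_by_ == client);
        journal_.insert(journal_.end(), entries.begin(), entries.end());
    }
    std::size_t size() const { return journal_.size(); }

private:
    int locked_by_ = kNobody;
    std::vector<std::string> journal_;
};

// Client side: keeps a local copy of the journal.
class Client {
public:
    Client(Server& server, int id) : server_(server), id_(id) {}

    // One transaction: lock, pull, modify locally, push, unlock.
    bool transaction(const std::string& entry) {
        if (!server_.lock(id_)) return false;  // someone else holds the lock
        const std::vector<std::string> fresh = server_.pull(local_.size());
        local_.insert(local_.end(), fresh.begin(), fresh.end());
        local_.push_back(entry);  // local modification
        server_.push(id_, {entry});
        server_.unlock(id_);
        return true;
    }
    const std::vector<std::string>& local() const { return local_; }

private:
    Server& server_;
    int id_;
    std::vector<std::string> local_;
};
```

Because a client always pulls before it appends, and only while holding the lock, the history stays linear and no merge is ever needed.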
If a client locks and then gets stuck,
it hangs or something,
what's the scenario there?
Do you ever release the lock?
I can set a timeout on the server.
So you can set a timeout on the server, and you'll have to wait until the
timeout expires.
So you said you use this for coordinating your AI game players playing against
each other, right?
Yes.
So what does that actually look like? What kind of information do you
store in the database, and how does each client know
what it's supposed to work on next or whatever?
So
basically the database
is a list of jobs to do
and
clients
connect to the database to pick a job
to do and they record
the fact that they picked this job to do
and then they unlock the database and do that job
and when the job is over
they can
make a transaction to push
the result of the calculation
and this is, well it's a bit more
complicated than this because there is one
client, a supervisor, that
manages
everybody but yeah that's
basic scheme of concurrency.
So from a logical point of view,
it works like SQL Server,
but in practice, it's much more convenient to use in C++
because it's very natural C++ code on C++ data.
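The job-list scheme can be sketched as two small transactions over a shared table. `JobTable`, `claim_job`, and `report_result` are hypothetical names. In the scheme described, each call would run between a lock and an unlock on the shared database; that part is left out here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Shared "database": a job table with one vector per column.
struct JobTable {
    std::vector<std::string> description;
    std::vector<int> worker;     // -1 while nobody has claimed the job
    std::vector<double> result;  // meaningful once done[i] is true
    std::vector<bool> done;

    std::size_t add(const std::string& what) {
        description.push_back(what);
        worker.push_back(-1);
        result.push_back(0.0);
        done.push_back(false);
        return description.size() - 1;
    }
};

// First transaction: record which worker picked which job.
// Returns the job row, or the table size if nothing is left to claim.
inline std::size_t claim_job(JobTable& jobs, int worker_id) {
    for (std::size_t i = 0; i < jobs.worker.size(); ++i) {
        if (jobs.worker[i] == -1) {
            jobs.worker[i] = worker_id;
            return i;
        }
    }
    return jobs.worker.size();
}

// Second transaction, once the work is finished: publish the result.
inline void report_result(JobTable& jobs, std::size_t job, int worker_id,
                          double value) {
    assert(job < jobs.done.size());
    assert(jobs.worker[job] == worker_id);  // only the claimer may report
    jobs.result[job] = value;
    jobs.done[job] = true;
}
```

The long-running computation happens between the two calls, while the database is unlocked, so other workers can claim jobs in the meantime.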
It kind of sounds like you are using it in a scenario.
I think the first thing that comes to mind,
now I'm questioning if I'm going to use it right,
beanstalkd is what I'm thinking of at the moment.
Does that sound familiar to anyone?
No.
beanstalkd is a simple, fast work queue.
Yeah, so you're using it effectively as a work queue,
so it's another potential kind of application here.
It sounds like you're definitely making a lot of use of Joedb in your AI software.
Do you know if there's any other users out there of the Joedb software currently?
I'm not aware of any other active user.
So a few years ago,
one person wrote a few issues on the GitHub
and was obviously actively using it,
but I've had no news for a while.
So yeah, I'm probably the only user at the moment.
So I did not actively promote the library.
I made this library for my personal use.
I'm happy to use it.
But yeah, I decided
to make it open source because
it's...
Yeah, I think...
I love using this library because
I feel it's...
There is no good alternative to it
for the things I do.
So if anybody else wants to take a look,
I am happy if it's useful for anyone else.
You said it's open source.
What license does it use?
MIT.
Okay.
And you also said C++20 might be a little too advanced for you for the command line parser.
What C++ standard are you currently using?
C++ 11.
Okay.
And are you mostly running this on Linux?
Have you tried it on Windows or Mac?
It also works on Windows.
It does work on Windows, okay.
And also on game consoles and everywhere.
So it works on iOS and Android and the PlayStation and the Switch.
Very cool.
And I did not test networking on those machines,
but file reading and writing works.
Can you tell us what games are built with Joedb being used?
So these are strategy games like chess and Othello.
And so some Japanese games that you may not know,
like Hanafuda or Mahjong.
Okay.
These are, yeah.
Yeah, on the topic of the strategy games,
you just kind of, in your bio, glossed over,
you said won many international computer Go tournaments.
Yeah.
How does that work?
Are you pitting computers against computers?
Yes.
So participants connect.
So we have a network protocol for connecting AI to each other.
Okay.
And there are automatic tournament managers.
And so, yeah, we all meet in a big room
and connect the machines to the network and let them play.
That's how we get Skynet.
How quickly do those games go?
And also, I'm curious if there's a visualization for it.
Can you see the game board?
Yes, of course.
In fact, yes.
The Japanese tournaments are really great.
They invite every time professional Go players
that make live comments on TV.
What?
Yeah, yeah, yeah.
The game of Go and Shogi are very big in Japan.
So they have specialized TV channels and professional players.
So they even commentate on the AI games?
Yes, yes, yes.
I had literally no idea that was a thing.
Sorry, go ahead.
Usually, the winner of the AI versus AI
tournament is
also, after the tournament, set to play
a demonstration game against the top
human players, and this
attracts a lot of spectators, usually.
Yeah, and Go is considered
one of the most complex games
ever created, or something like that?
Yes, it's the game that resisted the longest to AI.
So you may have heard about AlphaGo by DeepMind.
So five or six years ago, they made a very big match.
So for a very long time, Go was a game where we could not make an AI stronger than top humans.
And so with neural networks,
since a few years now, we are stronger than humans. But for a long time, it was a big challenge.
How did your AI tournament
winning Go engine fare against humans?
That's what I want to ask.
So, the
version of today is
stronger than humans.
Yours is as well.
Sorry? The one you made is.
Yeah, the one I made is.
It's not so strong. So, it used to be the
strongest in the world long
ago when I was young.
And so, a few years ago,
Google made their paper explaining their new algorithm
based on convolutional neural networks.
And then plenty of very strong open source programs
were based on this paper
and became stronger than my engine.
But I'm fighting back.
I hope I will catch up someday.
So is this, and then if I understood your bio correctly,
basically because you had this really good Go engine,
someone wanted to buy it for their board game
or computer game, I mean.
Yeah.
Does that happen often that a game company will approach someone
who competes in these AI game tournaments?
I don't know.
I think, yeah, it's very rare.
Yeah.
But, yeah, there are a few programmers in the world
who sell their engines.
Okay.
But it's not, yes.
If you want to make more money, I don't recommend this plan.
I'm thinking about the competitive programming episode we did with Connor years ago.
And talking about competitive programming,
I mean, there's potentially a lot of money available in competitive programming.
Are there prizes if you win
the AI tournament or anything?
Yeah, there are prizes, but
they are relatively small.
Well, not...
20 years ago, there was a millionaire from China
who offered $1 million
to the first program
that could beat a professional player.
But that person died before we could do it.
And so the prize...
Oh, that's disappointing.
The prize was canceled.
Well, if any of our listeners...
I'm sorry, go ahead.
Yes.
So, yeah, usually the prizes for winning tournaments
are very modest,
but it's possible to make money by then using the notoriety
to sell the software to people.
Okay.
If any of our users are interested in trying their hand at AI game engine writing,
is there a particular website you might refer them to or something?
Yes, I recommend codingame.com.
Codingame.com.
CodinGame, yeah.
They run plenty of
competitions, and it's very fun. There are
hundreds of participants.
And it's not just Go.
No, no.
In fact, I don't think they run
a Go competition.
Oh.
But they have plenty of games.
And also there is the International Computer Games Association,
which is a kind of official body,
and also an academic association, for research in AI.
And they have organized tournaments for many years.
They organize the World Computer Chess Championship
and that kind of tournament.
And the Computer Olympiad is also
a competition they organize.
Computer Olympiad.
Just imagining the computers
doing pole vaulting.
Yeah, this kind of thing.
That sounds really fun.
Yeah, really fun.
I have a very fun job.
I enjoy it.
It was my hobby when I was a kid,
and now I can make money with it.
It's always the dream.
Right.
One other thing that I think we should talk about
is you described this schema,
and then the compiler runs it,
and then that's the data structure you use.
But if you ever want to modify the schema,
how does Joedb deal with that?
Yes, so there is a convenient feature
to upgrade all databases to the new schema.
So the way it works, it's very simple.
So the schema is written in the log,
like the rest of the data, as a sequence of operations to create a table, add a field, or rename a field or a table.
And so if you open a file in Joedb and its schema does not match the schema that was used by the compiler,
usually it will reject the file and say this file has a wrong
schema, I don't want to open it.
But if the schema of the file that is open
is the beginning of the current schema, and I can
complete it by adding the missing
operations at the tail of the log,
then it will automatically upgrade it.
And so it will automatically add new tables
and fields, and apply renames.
And you can also write custom functions
to convert data if necessary.
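Editor's note: the prefix rule described above can be sketched in a few lines of C++. This is only an illustration under simplifying assumptions, not Joedb's actual API: the `Schema` type, the operation strings, and the `is_prefix` and `upgrade` functions are all hypothetical, and the journal here holds schema operations only, whereas a real journal would interleave them with data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: a schema is the ordered list of schema operations
// found in the journal, e.g. "create_table person".
using Schema = std::vector<std::string>;

// True when the schema stored in the file is the beginning (a prefix)
// of the schema the application was compiled with.
bool is_prefix(const Schema &file_schema, const Schema &compiled_schema)
{
    return file_schema.size() <= compiled_schema.size() &&
           std::equal(file_schema.begin(), file_schema.end(),
                      compiled_schema.begin());
}

// Upgrades the journal by appending the missing operations at its tail.
// Returns false, leaving the journal untouched, when the file's schema
// is not a prefix of the compiled schema (the "wrong schema" case).
bool upgrade(Schema &journal, const Schema &compiled_schema)
{
    if (!is_prefix(journal, compiled_schema))
        return false;

    const auto already_present = static_cast<std::ptrdiff_t>(journal.size());
    journal.insert(journal.end(),
                   compiled_schema.begin() + already_present,
                   compiled_schema.end());

    assert(journal == compiled_schema); // the file now matches the compiler's schema
    return true;
}
```

In this sketch, a file whose journal contains the first two of three compiled operations gets the third appended, while a file that starts with a different operation is rejected unchanged.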
Okay, so anything that I've worked on
in the past that had a database
backing or a file format of some sort,
inevitably you have
the, you know, how do I upgrade
to the new format path.
Yes, that's a very convenient
feature.
And it's also safe in the sense that
if you upgrade by mistake
and want to go back, you can undo the upgrade.
Oh, okay.
Since you have the whole history there anyhow.
Yes, you have the whole history.
So you can undo anything.
So it's not risky to upgrade.
Anything can be undone.
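Editor's note: one way to see why keeping the whole history makes undo cheap is to rebuild the state from only the first part of the journal. The sketch below assumes a toy key-value journal. The `Entry`, `Journal`, `State`, and `replay` names are hypothetical and are not taken from Joedb, whose undo mechanism may work differently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical journal entry: "set key to value".
struct Entry
{
    std::string key;
    std::string value;
};

using Journal = std::vector<Entry>;
using State = std::map<std::string, std::string>;

// Rebuilds the state by replaying the first `count` entries of the journal.
// Replaying fewer entries than the journal holds yields an earlier state,
// which is what "undoing" the later entries amounts to in this model.
State replay(const Journal &journal, std::size_t count)
{
    assert(count <= journal.size());
    State state;
    for (std::size_t i = 0; i < count; ++i)
        state[journal[i].key] = journal[i].value;
    return state;
}
```

Nothing is overwritten in place in this model, so going back to an earlier state only requires choosing a smaller `count`.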
All right.
Awesome.
Very cool.
Well, Remi, we'll put links in the show notes for listeners to go check out Joedb.
Is there anywhere else they should go to find your work online?
Yeah, Joedb is on GitHub, and there you will find a link to the documentation.
So I made a user guide with plenty of examples,
and you can get a feeling of how Joedb works thanks to this.
Okay, thanks so much for coming on the show today.
Yeah, thank you very much.
Thank you.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in,
or if you have a suggestion for a topic, we'd love to hear about that too.
You can email all your thoughts to feedback at cppcast.com.
We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter. You can also follow me at Rob W. Irving and Jason at
Lefticus on Twitter. We'd also like to thank all our patrons who help support the show through
Patreon. If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast.
And of course, you can find all that info and the show notes on the podcast website
at cppcast dot com.
Theme music
for this episode was provided by podcast