CppCast - Pattern Matching
Episode Date: June 5, 2019
Rob and Jason are joined by Michael Park to discuss his Pattern Matching library and standards proposal. Michael Park is a software engineer at Facebook, working on the C++ libraries and standards team. His focus for C++ is to introduce pattern matching to facilitate better code.
News: Initialization in C++ the Matrix; CMake 3.14 and Performance Improvements; Compiler Explorer Execution Support
Michael Park: @mcypark; Michael Park's GitHub; Michael Park's Blog
Links: C++Now Pattern Matching: Match Me If You Can; P1371R0; CppCon 2017: Michael Park "MPark.Patterns: Pattern Matching in C++"; MPark.Patterns
Sponsors: PVS-Studio Facebook; PVS-Studio Telegram; PVS-Studio Twitter; JetBrains
Hosts: @robwirving @lefticus
Transcript
JetBrains, maker of intelligent development tools to simplify your challenging tasks and automate the routine ones, is offering a 25% discount for an individual license on the C++ tool of your choice: CLion, ReSharper C++, or AppCode. Use the coupon code JetBrainsForCppCast during checkout.
In this episode, we discuss initialization syntax in C++.
Then we talk to Michael Park from Facebook.
Michael talks to us about his proposal to add pattern matching to C++.
Welcome to episode 201 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how's it going today?
I'm all right, Rob. Made it to 201.
Made it to 201. It's going to take a little getting used to, to start saying that.
Can you count that high?
Yeah. I wonder if it's going to be another big milestone when we hit a full byte. Is it 256 or 255?
255.
Yeah. Well, we should have started counting from zero.
I really wish that I had started my videos from zero, counting in octal or something.
That's what I wish I had done.
Yeah, that'd be nice.
Yeah.
It's too late now.
Too late.
Well, at the top of the episode, I'd like to read a piece of feedback.
We got a bunch of tweets last week after we released episode 200.
This week, this one is from Luke Trevorrow, saying, "Episode 200 of CppCast with Herb Sutter was a joy to listen to. Congratulations to Rob and Jason for making it to 200 quality podcasts. Don't ever give up."
Which is a reference to us talking about how much longer we'll keep this thing going. But yeah, as long as there's C++ news to talk about, we don't have any plans to end it anytime soon.
Is it fair to say 200 quality episodes? Could we say
it's at least 170 quality episodes or something?
We were definitely still learning early on, but I would hope that the vast majority of them have been quality episodes.
Yeah. Okay.
Well, we'd love to hear your thoughts about the show.
You can always reach out to us on Facebook, Twitter, or email us at feedback@cppcast.com.
And don't forget to leave us a review on iTunes.
Joining us today is Michael Park.
Michael is a software engineer at Facebook working on the C++ libraries and standards team.
His focus for C++ is to introduce pattern matching
to facilitate better code.
Michael, welcome to the show.
Hey, thank you. Thank you for having me.
So how long have you been at Facebook?
I started April last year, so like a year and a bit.
Okay.
I feel like we've had many Facebook developers on here over the years.
Oh, cool.
Victor went on recently, I know.
Yeah, and you always have a pretty good presence at conferences
from what I've seen, too.
Yeah. CppCon especially,
I think we have a lot of people at CppCon.
Right.
Well, Michael, we've got a couple news articles
to discuss. Feel free to comment
on any of these, and we'll talk more about pattern matching,
okay? Great.
So this first one is a blog post from Timur Doumler. I know I've met him, from JetBrains, but I have trouble with that name. And this is a post, Initialization in C++17, which comes from a talk he gave, and he just had a lot of requests for this one slide he made on all the different types of initialization that you can do in C++17. And it's quite a little chart, isn't it?
It is, yes. And you've definitely met him, because he goes to all of the C++ conferences. Without realizing it, I think I've met him in like five different countries now or something.
I know we talked to him along with Anastasia and Phil at CppCon last year.
Yeah, both of whom I've also seen in five other countries.
Yes. But yeah, is there anything you want to call out with this chart, Michael or Jason?
42 entries. I noticed that.
That's interesting.
I did not notice that.
That's funny.
That's funny.
Yeah, I think it's a useful chart for people.
I think that some of the entries
actually kind of are confusing to me.
Like the box for default initialization for aggregates, for example, it says uninitialized. But I mean, if we have a struct X with a string in it, and you default-initialize that, that's not uninitialized. It's going to default-construct the members, which is exactly what it says in the row for the other types with no user-provided default constructor.
Under default init, it says members are default-initialized.
So that seems to be the exact same case for aggregates
for default initialization.
So I'm not exactly sure what's going on there.
So that's, I think the point is
if you had like a struct of three ints
and that first column,
all their values
are undefined. Right, that's true.
But it's very much dependent on
what's inside the aggregates, right?
Yes, I would agree.
So it's kind of interesting. If it said
aggregates with default,
aggregates with built-ins only or something like that...
Or trivial aggregates or something like that, maybe?
Yeah, something like that.
But it just says aggregates, so it's kind of like,
I don't know what the...
I think I understand what it's trying to say,
but maybe it's just slightly inaccurate or something.
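The distinction being discussed can be sketched in a few lines. This is only an illustration; the type and function names (WithString, ThreeInts, and the two helpers) are made up for the example and do not come from the chart or the blog post.

```cpp
#include <cstddef>
#include <string>

struct WithString { std::string name; };  // aggregate with a class-type member
struct ThreeInts  { int a, b, c; };       // aggregate of built-in types only

std::size_t default_init_name_size() {
    WithString w;          // default-initialization: name is default-constructed (empty)
    return w.name.size();
}

int empty_braces_sum() {
    ThreeInts t{};         // empty braces: a, b and c are all zero
    return t.a + t.b + t.c;
}

// ThreeInts u;  // default-initialization: a, b, c would hold indeterminate
//               // values, and reading them is undefined behavior.
```

So whether a default-initialized aggregate is "uninitialized" depends on its members, which is the point made in the conversation.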
Yeah, and it also says copy initialization for aggregates doesn't compile, but if I have a struct X with an int in it, I can copy-initialize one just fine.
I'm not sure what's going on there either.
I have a few
questions, but in general
I think it's a useful table for people.
I feel like it must just be what could fit in the square, because if you had a struct of an int and you do equals five, because he has equals value there, that wouldn't compile, right? Because it doesn't have a constructor or a conversion operator or whatever that it could call.
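A small sketch of the two readings of "copy initialization for aggregates", assuming a hypothetical struct X with a single int member as in the conversation:

```cpp
struct X { int i; };   // aggregate with one int member

int copy_list_init() {
    X x = {5};         // OK: copy-list-initialization performs aggregate initialization
    return x.i;
}

int copy_from_same_type() {
    X a{7};
    X b = a;           // OK: copy-initialization from another X
    return b.i;
}

// X y = 5;  // does not compile: X has no constructor or conversion
//           // that could turn an int into an X.
```

The commented-out line is the case the chart cell appears to describe.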
I guess that's what he means.
Yeah, I think that must be the point.
Yeah. But initialization is a crazy area that I actually kind of stayed away from for a while, just because I don't want to see all the stuff on the bed. But I see common mistakes, like people initializing an array with open curly brace, zero, close curly brace, thinking that that zero-initializes the array, like it's array or something. And it happens to work, right? Because it'll take the zero value and then zero-initialize the rest. But if people put open curly, one, close curly, thinking that it initializes all of the values to one, that is not what's going to happen.
Right.
Initialize the first value to one and the rest of them to zero,
right?
Zero.
Exactly.
Yeah.
And so the curly zero,
uh,
close curly is like very misleading as to what's actually happening there.
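The array pitfall described here can be shown directly. The helper names below are invented for the example; the array length of 4 is arbitrary.

```cpp
#include <array>

// {0} sets the first element to 0; the remaining elements are zero-initialized.
std::array<int, 4> braces_zero() {
    int a[4] = {0};
    return {a[0], a[1], a[2], a[3]};
}

// {1} sets only the first element to 1; the rest are still zero.
std::array<int, 4> braces_one() {
    int a[4] = {1};
    return {a[0], a[1], a[2], a[3]};
}

// One way to actually get all ones.
std::array<int, 4> all_ones() {
    std::array<int, 4> a{};
    a.fill(1);
    return a;
}
```

The zero case only looks like a "fill" because the explicit value and the implicit value happen to be the same.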
Yeah.
That's a,
that's a great point.
I see that all the time.
I think I will use this chart in my classes. I mean, just tell my students that this chart exists. But when it comes up sometimes in classes and teaching, I'm like, yeah, there's like 17 types of initialization in C++, but we don't really care. These are the three rules that I want you to pay attention to.
Yeah. I mean, we try to shove it under some sense of intuition, right?
Right.
Hopefully it comes out okay, but all the nitty-gritty is pretty ugly.
Yeah, that's true. Okay, the next thing we have here is another update from Visual Studio 2019, and this is CMake 3.14 and Performance Improvements. So we've talked about this a fair bit, how they keep improving their CMake experience with Visual Studio.
And with this one, it seems like the main improvement is that they upgraded to CMake
3.14. And with that, they can now use the file-based API. Jason, are you familiar with what that gives you?
I am not. I read this and at first I was like, wait, where are these performance improvements coming from? Oh, that's because of a change in CMake. But then I noticed that they said CMake had deprecated the CMake server, which I thought was interesting, because there was a very brief period where all of the build tools and compilers were like, we're going to stand up our own servers, and everyone can talk to the server and get all the information that they want. And then they're like, never mind. So apparently the CMake server is no longer being used, and now there's this file-based API that I'm not familiar with.
Neither am I. Okay. Well, it seems like it's a good move, and if you're getting performance improvements out of it, it's definitely a win.
Well, yeah, we didn't actually...
I don't think you've said out loud.
It's two times performance improvement
for CMake-based configuration.
Yeah.
It looks like they tested it on the LLVM code base.
Which counts.
I mean, that's a large CMake project,
is what I'm trying to say.
Yeah, yeah, yeah.
It's not like testing it on my toy project or something.
No, no, absolutely not. Yeah, I'm sure it's a good test case.
Yeah, I think it would actually be even more impressive if it was on a small project, because I think oftentimes it's harder to get big wins on small projects.
All right. I mean, on the upside, a small project doesn't have a measurable time when loading CMake, hopefully.
yeah
but it seems pretty impressive
yeah okay and then the last
thing we have is Compiler
Explorer released
a beta version of code execution
support so you can now actually
see the code you put into
Compiler Explorer and see what the output would be
from that code when you run it, in addition to
the assembly and all the other good stuff that Compiler
Explorer already did.
Yes.
I was going to say, for background, for people who
watch C++ Weekly, I've been
showing this support for years.
Because if you run it locally,
you've been able to do execution
with no problem. But Matt just had the time and opportunity to be able to, with some level of assurance that it's not going to take down the whole server, put this up on the official website now.
Gotcha. Let's see. Okay. Well, Michael, can you start off by maybe giving us an intro to what exactly pattern matching is? I'm sure we've referenced it a couple of times when talking about other languages on the show, but I don't think we've ever gone in depth with it.
Cool. Yeah. So pattern matching, basically, is a declarative approach to inspecting a value or values, rather than using something like an if-else chain where you would poke into a given value to check its state.
So for example, if you have a point class, and you were to check whether this point is at the origin, you would often end up writing something like: if point.x equals zero and point.y equals zero. And so we have some value, and we're reaching into that value to check some conditions, right? And oftentimes, these conditions are capturable in what we refer to as a pattern.
And so a pattern generally is a description of a value.
And patterns, they either match or they don't match.
And so in this case, we would introduce a pattern like open square bracket, zero comma zero, close square bracket, which would capture the idea of first element zero, second element also zero.
And this also has to be a two-tuple kind of thing.
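To make the contrast concrete, here is a sketch in today's C++. The Point struct and both function names are assumptions for the example. The comment at the end shows roughly how the spoken pattern would be written in the proposal's syntax; that syntax is not valid C++ today and is included only as an approximation of what is described.

```cpp
#include <tuple>

struct Point { int x; int y; };  // hypothetical point type

// The "poke into the value" style: reach in and test each field.
bool is_origin_imperative(const Point& p) {
    return p.x == 0 && p.y == 0;
}

// Closest declarative form available today: compare against a
// description of the value we are looking for.
bool is_origin_declarative(const Point& p) {
    return std::tie(p.x, p.y) == std::make_tuple(0, 0);
}

// Proposal-style sketch (does not compile today):
//   inspect (p) {
//     [0, 0]: /* handle the origin */;
//     [0, y]: /* on the y-axis; y is bound to the second element */;
//   }
```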
And so similar terminology comes from regex, right?
Where you have the regex pattern,
which is a description of the string that you're looking for.
And then we just say, given some arbitrary string,
see if this pattern matches that string. And then if it matches, you can actually specify
which components of that string you want, which will then give you your regex matches, which
are your capture groups, essentially. And so we can do something similar, generalized beyond strings, where we can say open square bracket, 0 comma x, close square bracket, where the x in the second position captures the value of the thing that we matched.
Okay.
And so you end up writing a switch kind of language construct. And rather than having cases that are constants or integral constants,
you end up being able to do a lot more elaborate things in there.
And then it'll dispatch to the statement or expression for the first pattern that matches.
So you said, you know, a lot more things.
Like, what exactly can I put in there?
Like, I don't know, can I say if x is greater than 5 and y is less than 4,
and that's one of my cases?
So that would be provided through what we call a pattern guard.
Okay.
So there essentially would be a small pattern language, right?
So the square brackets are kind of inherited from the structured bindings notation.
Let's go back to the point example, right?
If we say, I want to match a point of x comma y, and then I want x to be greater than y, then you would say open square bracket, x comma y, close square bracket, if x greater than y.
Okay.
And so you can put an if, essentially, after a pattern, inside which you can put arbitrary boolean expressions.
Okay. Now, this syntax you're talking about... you have a pattern matching library, and we didn't, I don't think, mention yet that you have a pattern matching proposal.
This syntax you're talking about is in reference to the proposal?
This syntax is in reference to the proposal, yes.
Okay.
Yeah.
So maybe we should dig into that a little bit.
I mean, you also do have the pattern matching library.
Is the library
based on your proposal? Is it like a sample implementation?
No, the library actually came before the proposal. It was an experiment to see how far we could take a library. Well, it was twofold. The first was, could a library solution be enough? And the second was, if it's useful but it sucks enough, then similar to Boost.Lambda, there would be some desire to actually make it a language feature, because the library is not sufficient.
Right. I'm sorry, but you just mentioned Boost.Lambda. For our listeners who are not familiar with it, it is completely unusable.
But it was useful enough that people wanted it, and it sucked enough that we wanted a language thing.
Right.
And that's the balance we want to strike there.
But yeah, it was,
I gave a talk on it in CppCon 2017 describing the challenges that I hit with it,
namely around identifiers.
Basically, it was pretty difficult
trying to introduce identifiers inside a pattern
because there are only so many contexts in which you can introduce an identifier in C++.
And so inside of the pattern, which is essentially just a value with placeholders, which I called
arg, because the placeholders inside the pattern in that library
would get dispatched to the lambda,
which is on the right-hand side.
Okay.
And so, you know, again, going back to the point example,
if I were to say something like, you know,
product type, like ds,
which stood for destructuring,
if I were to say ds, open paren, arg, arg, close paren
equals a lambda that
takes two arguments, when that
pattern matches, the two elements
of the pair would get dispatched
to the lambda on the right-hand side.
And so it of course hit a readability issue, because when you have complex patterns with arg, arg, arg, arg, you have no idea what those are, right? We really want identifiers in place to help denote what those fields are. And so what we ended up doing then is, okay, auto name equals arg, auto address equals arg, and then put those names inside the pattern.
But that has to be declared beforehand, before you actually use it.
Right.
Which is like, it defeats the purpose of the whole thing.
So, yeah.
So how is, I'm sorry, no, go ahead.
Yeah, so I think as an experiment, it was a pretty successful one.
It definitely helped to guide the language design quite a bit.
But I think we need something more than library solution.
Is the library used actively by anyone right now?
I hope not.
That might be the first time we've ever had a library author on our show say, I hope no one's using my library. I like it.
I mean, because I put zero effort into optimizing it, the actual runtime performance of it. It was really an experiment to explore the API space.
Right.
And I think there...
I mean, okay, so to be honest,
there are some people who are using it
in their school projects or something like that
because they're writing a compiler in C++ or something
and they want to have pattern matching because it's pretty useful in that context. But performance doesn't
quite matter for them. And so that's the context in which I was saying I hope no one's using it.
That's awesome. Yeah.
I want to interrupt the discussion for just a moment to bring you a word from our sponsors.
PVS Studio is a tool for detecting bugs and security weaknesses in the source code of programs written in C, C++, C Sharp, and Java.
It works under 64-bit systems in Windows, Linux, and macOS environments, and can analyze source code intended for 32-bit,
64-bit, and embedded ARM platforms. The PVS Studio team writes a large number of articles on the analysis of well-known open-source projects. These articles can be a great source
of inspiration to improve your programming practices. By studying the error patterns,
you can improve your company's coding standards, as well as adopt good programming practices which
protect from typical errors.
For example, you can stop being greedy on parentheses when writing complex expressions involving ternary operators. Subscribe to Facebook, Telegram, or Twitter to be informed
about all publications by the PVS Studio team. Links are given in the podcast show notes for
this episode.
Well, I guess let's talk more about the pattern matching proposal.
So you said it is kind of informed by the work you did on your library.
Who else is authoring the proposal?
Like, what's the current status of it?
Right.
So there are three co-authors on the proposal.
David Sankel, Sergei Murzin, and Dan Sarginson.
And those guys are at Bloomberg.
So the latest paper, actually,
the preceding one was from David Sankel from 2015, I believe.
He proposed language variant as well as pattern matching
in a single paper back then.
And the feedback from the committee was to separate the proposals into two different things.
And after that, I went and implemented this library to get some experience.
And then he and I, well, all four of us.
So the other two co-authors joined through David because they work at Bloomberg.
And then we had competing proposals initially.
And then we got together and figured out how to combine them.
How to combine them.
Yeah.
So that's how it kind of came to be.
The latest status of the paper itself is that we presented it to the Evolution Group
in a Saturday session in Kona. That was back in February. And we're going to be scrambling
for the next two weeks, because that's when the Cologne deadline is.
So the Cologne meeting will be held in July, but the pre-meeting mailing is due two weeks from now.
Oh, well, okay, two weeks minus two days. So we have like 11 days or so.
Yeah. For our listeners, when this airs, that means they'll have basically two weeks to get in any papers they're hoping to get to Cologne.
Yeah.
Yeah.
Something like that.
So you're not targeting C++ 20 though, or are you trying to get this into C++ 23?
Yeah.
Yeah.
So it's targeted for 23.
20 is feature complete.
It's just issues coming back and stuff like that.
It's very early in its stage still.
There's not a full implementation of it. There is the implementation that was done in my library, as well as a library like Mach7, which was a library done by Yuriy Solodkyy, I believe is how you pronounce his last name,
but I'm not 100% sure about that.
So Yuriy, as his PhD work, I believe, did Mach7,
which was a pattern matching library that used macros and such.
But he did a lot of work in actually implementing optimizations for
getting performance
that matched virtual dispatch,
basically.
And so there was a lot of performance
work that was done there that I think we'll be able
to draw from as well.
But yeah, this is definitely
headed for 23.
I hope it'll actually be one of C++23's
marquee features. We'll see
how that goes.
It sounds potentially like
it would change a lot of code, but I think we should
kind of, I guess, dig into it some more.
Yeah.
You mentioned that you can match,
you can basically do destructuring
and then have a guard.
And I believe, looking at the proposal,
you can also match on types?
Yes.
Okay.
What else can you do?
How exactly would the type matching work?
Yeah.
Sorry, I didn't hear what Rob said.
How exactly would the type matching work?
Yeah, so the type matching happens through angle brackets.
The primary use case is to match std::variant.
So there have been several complaints about the API of std::visit as well as the performance of std::visit.
I may have heard more of that than others because I have an implementation of variant.
So people come to me with visit issues.
But basically the API for visit is kind of backwards, right? We have to first list the cases that we want to handle in a function object or something like that.
And then we have to say, std::visit of that function object, comma, the sequence of variants that you want to match.
And oftentimes we end up doing a single visitation.
So it ends up being like visit, function object, comma, variant.
As opposed to what people are more used to, which is the switch kind of syntax, where we should be able to just say, switch on a variant, and then list the cases that you want to handle.
If those happen to be types, that's fine.
And so what we want to allow is for people to say something like inspect v, where v is a variant, and then angle bracket int, angle bracket string, etc.
So those are the cases in your inspect statement. And so the way that would work
is
you end up visiting
the variant and then
checking that the
resulting value actually matches
that type. And if it does
then it would dispatch that handler.
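For reference, this is what the "backwards" shape looks like with std::visit today: the handlers are listed first and the variant comes last. The overloaded helper is the common C++17 idiom rather than a standard library facility, and describe is a name made up for this sketch.

```cpp
#include <string>
#include <variant>

// Common C++17 idiom for building one function object out of several lambdas.
template <class... Fs> struct overloaded : Fs... { using Fs::operator()...; };
template <class... Fs> overloaded(Fs...) -> overloaded<Fs...>;

std::string describe(const std::variant<int, std::string>& v) {
    // Cases first ...
    auto handlers = overloaded{
        [](int i) { return "int: " + std::to_string(i); },
        [](const std::string& s) { return "string: " + s; },
    };
    // ... and the value being inspected last.
    return std::visit(handlers, v);
}
```

The proposal's inspect form would put the variant first and the per-type cases after it, the way a switch does.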
So is it kind of like
structured bindings where there's hooks
so that you can implement your own thing to decide what the type of this object is or whatever?
So the customization points for variant-like would be variant_size, variant_alternative, and get.
Okay. And it relies on those already existing customization points?
Yes.
Okay.
Well, those are not customization points today, per se.
Oh, okay.
But tuple_size, tuple_element, and get are for structured bindings.
And so it's kind of a similar mapping to variant.
So what would actually end up happening in the pseudo-translated C++ code would be, you would do a std::visit with a generic lambda.
And then inside the generic lambda, imagine you have a sequence of constexpr ifs, where each check is whether the type of the thing that came in is the type that we're trying to match.
Right.
And so it doesn't quite have the semantics of std::visit with an overloaded set with int and string, right? Because if you were to match a double, then with the pattern matching proposal, that wouldn't match.
Whereas with the overloaded function object, it would convert and match. And so it has slightly different semantics.
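The semantic difference can be reproduced with today's tools. The sketch below assumes a variant of int, double and string, with handlers written only for int and string; the names Value, via_overloaded and via_exact_type are invented for the example, and the if constexpr chain is only an approximation of the translation described, not the proposal's specified behavior.

```cpp
#include <string>
#include <type_traits>
#include <variant>

template <class... Fs> struct overloaded : Fs... { using Fs::operator()...; };
template <class... Fs> overloaded(Fs...) -> overloaded<Fs...>;

using Value = std::variant<int, double, std::string>;

// Overload set: a double alternative converts to int and takes the int case.
std::string via_overloaded(const Value& v) {
    return std::visit(overloaded{
        [](int) { return std::string("int"); },
        [](const std::string&) { return std::string("string"); },
    }, v);
}

// Generic lambda with a chain of constexpr ifs: only exact types match.
std::string via_exact_type(const Value& v) {
    return std::visit([](const auto& x) -> std::string {
        using T = std::decay_t<decltype(x)>;
        if constexpr (std::is_same_v<T, int>) return "int";
        else if constexpr (std::is_same_v<T, std::string>) return "string";
        else return "no match";
    }, v);
}
```

With a Value holding 3.5, the first function reports "int" and the second reports "no match".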
Okay.
But in terms of the mechanisms, that's how it works.
Okay.
So it would be extensible for other user-defined types, not just variant.
Yes.
And does it work with RTTI for dynamic types?
Right.
So for polymorphic types, yeah, the idea is that it would perform sequential dynamic casts, semantically.
Okay.
Not implementation-wise.
Implementation-wise, we can do tricks that were explored in Mach7, which actually caches the result of dynamic_cast and stuff like that, which comes out to pretty impressive numbers.
Really?
Yeah, yeah.
The paper is actually super interesting.
It stashes a vtable pointer and uses that for something.
I don't remember the details exactly, but yeah, there's some interesting stuff there.
There's like a pseudo Duff's device thing in there that caches stuff.
Yeah.
So it wouldn't actually be a sequence of dynamic casts, but semantically, that's what would happen.
And so there are some nasty cases, because dynamic_cast allows weird things like cross-casts.
And so this is one of the things that's difficult: people intuitively look at examples, especially the variant examples, and think, oh, why don't we just do best match here, when we're just matching types?
But the issue is that even if we're matching types, for polymorphic types, for example, we can't really do best match.
That's not really an option.
And so we have to do linear search.
And I think even though we could do best match for variant, it would be very confusing if these two very similar things have different semantics.
I think it would be much simpler for people to look at it and say, okay, we just do first match.
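First-match semantics over polymorphic types can be sketched as a plain sequence of dynamic_cast checks. The class hierarchy and the function name are hypothetical; the point is only that the order of the cases decides the result.

```cpp
#include <string>

struct Shape { virtual ~Shape() = default; };
struct Circle : Shape {};
struct Rectangle : Shape {};
struct Square : Rectangle {};

// Cases are tried top to bottom and the first successful cast wins.
std::string first_match(const Shape& s) {
    if (dynamic_cast<const Circle*>(&s)) return "circle";
    if (dynamic_cast<const Rectangle*>(&s)) return "rectangle";  // also matches a Square
    if (dynamic_cast<const Square*>(&s)) return "square";        // never reached
    return "shape";
}
```

A Square is reported as "rectangle" here because that case comes first, which is the behavior a reader has to keep in mind when ordering cases.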
That's a lot to know.
And so those are the customization points.
It's the tuple-like protocol, the variant-like protocol for variant-like stuff, and the third one will be an any-like.
And so dynamic_cast is the any-like thing for polymorphic types, but if I had a std::any, for example, and I wanted to do a sequence of type-based matches, then you could call any_cast for each of them.
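Written out by hand, a sequence of type-based checks on a std::any looks like the sketch below. It uses the pointer form of std::any_cast, which returns a null pointer when the held type does not match; the function name describe is made up for the example.

```cpp
#include <any>
#include <string>

std::string describe(const std::any& a) {
    // Each check is one "case"; the first one that succeeds wins.
    if (const int* i = std::any_cast<int>(&a)) {
        return "int: " + std::to_string(*i);
    }
    if (const std::string* s = std::any_cast<std::string>(&a)) {
        return "string: " + *s;
    }
    return "something else";
}
```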
That sounds potentially very handy,
because otherwise any is kind of a bit wordy to work with.
Right, exactly.
And so the customization point for that currently is any_cast,
but maybe it should be did cast or something neutral.
Yeah.
So those are kind of the user-defined type customization points.
But it gets more interesting because we can actually support user-defined patterns as well. And so this was something that I covered in my talk at C++ Now,
from which I actually got an audible wow
from one of the members in the crowd.
That's always fun, yeah.
Yeah.
So CTRE, which is Compile Time Regular Expressions.
Right, Hana's library, yeah.
Hana's library, yeah.
It's gotten pretty popular.
One of the things that it can do
is it can tell you the size of the matching,
the resulting matches at compile time.
So with std::regex, if I were to do a match, the thing that I get back is something like a vector of string views, essentially.
Right.
Whereas with Hana's library, I get an array of string views of some compile-time size.
Okay.
So what you can do with that is you can say CTRE of some regex pattern, dot match on some string, and then the result comes out to be some fixed-size array.
I can immediately decompose that with structured bindings.
For example, I can have a date regex, and I can say that date regex dot match of some string, and then on the left-hand side I can say auto, open square bracket, the entire match, comma, year, comma, month, comma, day, close square bracket, equals the regex match thing.
Okay.
And you can use the year, month, day immediately.
Right. That's pretty cool.
Yeah. So that's really nice.
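CTRE is a third-party library, so the sketch below does not use it. Instead it imitates the key property with the standard library: a hypothetical helper, match_date, copies the std::regex groups into a std::array of known size, which is what makes the structured binding possible. The date format and all names are assumptions for the example.

```cpp
#include <array>
#include <optional>
#include <regex>
#include <string>

// Hypothetical stand-in for a compile-time-sized match result:
// element 0 is the whole match, 1..3 are year, month, day.
std::optional<std::array<std::string, 4>> match_date(const std::string& s) {
    static const std::regex date_re(R"((\d{4})-(\d{2})-(\d{2}))");
    std::smatch m;
    if (!std::regex_match(s, m, date_re)) return std::nullopt;
    return std::array<std::string, 4>{m[0].str(), m[1].str(), m[2].str(), m[3].str()};
}

int year_of(const std::string& s) {
    auto result = match_date(s);
    if (!result) return -1;
    // Because the size is fixed, the result decomposes directly.
    auto [whole, year, month, day] = *result;
    return std::stoi(year);
}
```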
And so if you were to put this into the context of pattern matching,
then if we were to want to test a string to see whether it's a date,
test whether it's a phone number,
and then do different things based on it,
then now we need to, you know,
so we say, like, inspect S, and then what do you do?
Like, we have a date regex, but we can't, like, call,
we can't match the date regex with the S
and then match on it immediately or anything,
given what we've just talked about so far.
Okay.
And so what we introduce is a notion of an extractor.
So an extractor takes a value, calls extract
on that extractor with the value, and then we pass on the result of that into a subsequent
match, a subsequent pattern match. So we say something like extractor question mark pattern.
Okay. Okay. Trying to visualize it all.
Yeah.
Yeah.
So let's say the extractor is like a date regex.
So we say date regex, question mark, open square bracket, year, month, day, close square bracket.
Right.
Okay.
So what that would do is, it would call date regex dot match on s, or try-extract on s. That would return some array.
Right.
And then that array would be matched against the square bracket year, month, day, which is on the right-hand side, right?
And then if that matches, then on the right-hand side you'd have access to year, month, day to use immediately.
Okay, and so you could do a sequence of these, basically.
Right.
And you can check if the string is a date,
and do this, and if it's a phone number,
then do this with the area code or whatever.
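The extractor idea can be approximated by hand today as "try to extract, and if that succeeds, destructure the result". In the sketch below, try_extract and classify are hypothetical names, std::regex stands in for CTRE, and the date and phone formats are assumptions chosen for the example.

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <regex>
#include <string>

// Hypothetical extractor: returns the N groups (whole match first) or nothing.
template <std::size_t N>
std::optional<std::array<std::string, N>> try_extract(const std::regex& re,
                                                      const std::string& s) {
    std::smatch m;
    if (!std::regex_match(s, m, re) || m.size() != N) return std::nullopt;
    std::array<std::string, N> out;
    for (std::size_t i = 0; i < N; ++i) out[i] = m[i].str();
    return out;
}

std::string classify(const std::string& s) {
    static const std::regex date_re(R"((\d{4})-(\d{2})-(\d{2}))");
    static const std::regex phone_re(R"(\((\d{3})\) (\d{3})-(\d{4}))");
    // Each "case" is an extractor followed by a destructuring pattern.
    if (auto r = try_extract<4>(date_re, s)) {
        auto& [whole, year, month, day] = *r;
        return "date in " + year;
    }
    if (auto r = try_extract<4>(phone_re, s)) {
        auto& [whole, area, prefix, line] = *r;
        return "phone with area code " + area;
    }
    return "unknown";
}
```

The proposal's extractor pattern would fold each if block into a single case line.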
So instead of a bunch of nasty if statements together, combining CTRE and your pattern matching, you'd be able to do a nice clean table of all the different options.
That's exactly right.
Yeah, okay. Sounds very nice.
Yeah. So that was pretty cool.
I'm kind of curious about two things, if you don't mind. In the cases where you do type matching, and you said that it's as if you had done an if constexpr or a dynamic cast or something in there, that does kind of imply, I don't know, I feel like there has to be some kind of case where it's handling the fact that the thing you're passing in might not necessarily be valid for all types that are matched
or all context or something.
I don't know.
It does feel like it does have to effectively do an if-constexpr
in any of the things that do type matching.
But that can't be right for the types that are runtime known.
Never mind.
I might just be overthinking this.
So we can move on past that.
Okay.
Okay.
To be clear, patterns do have requirements that are placed on the subject, which is the thing that's being matched.
So, for example, similar to structured bindings, if you were to do square bracket x comma y equals some expression, the expression is expected to be tuple-like of size 2, with get 0 and get 1 valid.
Valid get 0 and get 1. Right.
And so those requirements are placed on the value that's being matched by any given pattern.
Okay.
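The tuple-like requirement mentioned here is the same one structured bindings already use. The sketch below opts a hypothetical Person class into that protocol by providing tuple_size, tuple_element and a get member template; the class and the summary function are invented for the example.

```cpp
#include <cstddef>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

class Person {
public:
    Person(std::string name, int age) : name_(std::move(name)), age_(age) {}

    // get<0>() and get<1>() must be valid for a size-2 tuple-like type.
    template <std::size_t I>
    auto get() const {
        if constexpr (I == 0) return name_;
        else return age_;
    }

private:
    std::string name_;
    int age_;
};

namespace std {
template <> struct tuple_size<Person> : integral_constant<size_t, 2> {};
template <> struct tuple_element<0, Person> { using type = string; };
template <> struct tuple_element<1, Person> { using type = int; };
}  // namespace std

std::string summary(const Person& p) {
    auto [name, age] = p;  // works because Person satisfies the protocol
    return name + " (" + std::to_string(age) + ")";
}
```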
And so...
So, I guess what I'm trying to ask
is the thing on the right-hand side,
unlike a switch statement
where there is no new scope,
the thing on the right-hand side
is a new scope, it has its own types,
it has its own context,
whatever, like, it's its own thing.
It's not just...
The right-hand side has its own... you mean the right-hand side of a match case?
Yeah, like the actual operation you're going to perform, I guess.
Right, so that doesn't have a type.
Okay.
So there are two forms of inspect that we're proposing. One is an inspect statement and the other is an inspect expression.
Okay.
So the first form is an inspect statement, which is very much like a switch,
where the right-hand side is a statement.
And so there's no type.
Okay.
And there's no new scope or anything right there.
Well, there is its own scope.
Okay.
Because we need it for the names
that are being introduced in the pattern.
Right.
All right.
Okay.
Yeah.
And so it does have its own scope.
In the expression case, it also has
its own... it also has
a type. Okay. And so
the
inspect expression
uses an
equals greater-than, which is
a fat arrow, essentially, in the
middle, instead of a colon. Oh. Okay.
But an
inspect expression is very
useful because otherwise
you end up doing
every case having a return
statement and wrapping that
whole thing in a lambda
and specifying return
types and then immediately
calling it.
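The workaround being described, an immediately invoked lambda around a switch, looks roughly like the sketch below. The `Op` enum and `describe` function are hypothetical names chosen for illustration; the point is only the boilerplate that an expression form of inspect is meant to remove.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum used only for illustration.
enum class Op { Add, Sub, Mul };

std::string describe(Op op) {
    // Today's workaround: wrap the switch in a lambda, spell out the return
    // type, return from every case, then call the lambda immediately.
    const std::string name = [&]() -> std::string {
        switch (op) {
            case Op::Add: return "add";
            case Op::Sub: return "sub";
            case Op::Mul: return "mul";
        }
        return "unknown";  // reached only for out-of-range enum values
    }();
    return "op:" + name;
}

// Small usage example.
inline void describe_example() {
    assert(describe(Op::Sub) == "op:sub");
}
```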
And thinking about it kind of in the
context of a switch statement, is there a default case? Yeah, so the default case essentially would
be... So, okay, this is an interesting part, actually, that we have to discuss during the
meeting, which is the underscore. Okay. Which is a pattern that matches anything. Okay. But if you look at it, any identifier matches anything.
It just binds whatever is being matched.
And so if you were to just put foo as the pattern,
that would match anything and just give it an alias foo
for the thing that's being matched.
Right?
Okay.
And so you could just use underscore in that way
where underscore just introduces a new variable,
a variable name, because it's a valid identifier today, right?
But what that means is you couldn't do something like
open square bracket, underscore, comma, underscore, close square bracket.
Right.
Because you'd have a redeclaration issue.
Yeah, and people have been wanting to skip
structured binding elements forever doing something like that.
Exactly.
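The redeclaration issue can be seen in a small sketch, assuming the C++17 rules in force at the time of this conversation. The function names are hypothetical. A single underscore works as a binding name, but using it twice in one structured binding declares the same name twice; `std::tie` with `std::ignore` is shown as one existing library-level way to skip elements.

```cpp
#include <cassert>
#include <tuple>
#include <utility>

int second_of(const std::pair<int, int>& p) {
    // A single underscore is an ordinary identifier here, so this compiles
    // and simply names the first element `_`.
    auto [_, second] = p;
    (void)_;  // silence unused-variable warnings
    return second;
}

int third_of(const std::tuple<int, int, int>& t) {
    // auto [_, _, third] = t;  // rejected under C++17 rules:
    //                          // `_` would be declared twice in one scope
    int third = 0;
    std::tie(std::ignore, std::ignore, third) = t;  // skip the first two elements
    return third;
}

// Small usage example.
inline void underscore_example() {
    assert(second_of(std::make_pair(1, 2)) == 2);
    assert(third_of(std::make_tuple(1, 2, 3)) == 3);
}
```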
And so we brought a proposal to Kona.
I think the number was P1469, which was to propose disallowing use of underscore in the context of structured bindings so that we could use it
to ignore stuff
in structured bindings but also in pattern matching.
And that
did not fly.
Because it's an already existing variable name,
people didn't want to
introduce context sensitivity
into it.
Although technically a variable that's just underscore
is reserved by the standard anyhow, right?
Because anything that starts with an underscore...
No, that is not true.
Underscore at the global namespace is reserved,
but not inside namespaces.
It is underscore followed by a capital letter
is reserved everywhere.
That's true.
Underscore, underscore,
anywhere in your identifier is reserved.
But underscore followed by a small case
is not.
Lowercase is not.
In a local scope.
Yeah.
Yeah.
I still don't do it, though.
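The reservation rules just listed can be summarized in a short sketch. The namespace and names below are hypothetical; the reserved forms are left commented out, since using them is not allowed in user code.

```cpp
#include <cassert>

namespace demo {
// Not reserved: a lone underscore, or an underscore followed by a lowercase
// letter, inside a user namespace.
inline int _ = 1;
inline int _count = 2;

inline int sum_in_local_scope() {
    int _tmp = _ + _count;  // also fine in a local scope
    return _tmp;
}
}  // namespace demo

// Reserved, so left commented out:
// int _global;      // begins with an underscore in the global namespace
// int _Upper;       // underscore followed by an uppercase letter: reserved everywhere
// int has__double;  // contains a double underscore: reserved everywhere

// Small usage example.
inline void reserved_names_example() {
    assert(demo::sum_in_local_scope() == 3);
}
```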
Yeah, me neither. But the issue basically is popular libraries like Google Mock. It has a google
colon colon testing colon colon underscore, which is passed to their framework to indicate
I don't care what gets matched here. And so in Google Mock, you can say stuff like, expect this function to be called with these arguments.
And so you might say,
expect call foo comma one comma two.
And then you call foo with zero zero, and then
the framework will tell you, oh, I expected a call to foo with one two,
but you passed it zero zero.
And if you didn't care what the value was for one of those things, you could pass underscore, which is their placeholder, which is an actual value there.
But their library knows how to interpret that to ignore those values.
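To illustrate what "an actual value that the library knows how to interpret" means, here is a minimal sketch. It is not Google Mock's API; the `sketch` namespace, `Anything`, `arg_matches`, and `call_matches` are all hypothetical names. It only shows how an object named `_` can serve as a "don't care" placeholder, which is the kind of existing use that conflicts with giving `_` a special meaning in the language.

```cpp
#include <cassert>
#include <variant>

namespace sketch {
// Hypothetical stand-in for a "don't care" placeholder. It is an ordinary
// object named `_`, which is legal inside a namespace.
struct Anything {};
inline constexpr Anything _{};

// An expected argument is either a concrete int or the placeholder.
using Expected = std::variant<int, Anything>;

inline bool arg_matches(const Expected& expected, int actual) {
    if (std::holds_alternative<Anything>(expected)) return true;  // ignore the value
    return std::get<int>(expected) == actual;
}

inline bool call_matches(const Expected& e1, const Expected& e2, int a1, int a2) {
    return arg_matches(e1, a1) && arg_matches(e2, a2);
}
}  // namespace sketch

// Small usage example: expecting foo(1, 2) versus foo(1, _).
inline void placeholder_example() {
    assert(!sketch::call_matches(1, 2, 0, 0));         // expected (1, 2), got (0, 0)
    assert(sketch::call_matches(1, sketch::_, 1, 0));  // second argument ignored
}
```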
Right. And so because of these existing uses of underscore, we couldn't quite make that the otherwise case, which would have served as the default case, as you were asking about.
Okay.
So we need some different way to spell it.
There has been some discussion of double underscore, which I, because it's already reserved, right?
Which is definitely reserved, right.
But, yeah, I hate it.
I think it's a terrible way to go, but I don't know.
We'll... Yeah. A question mark
has been mentioned.
There's always triple underscore.
Shh.
I did have a silly idea
of allowing
two or more underscores
at which point you could actually
if you have a table of square-bracketed,
structured-bindings-looking
matches, you could actually
use seven underscores if you need
to align your variables properly
right
yeah
but yeah so I don't know we'll see
what happens there but you know question mark
has been brought up as an option
but I think that would be challenging
I think there are various extensions
for like question
mark colon I think there's a GCC
extension that does question mark colon
and so like if you were to use that
as a default case you would end up
with question mark colon
stuff, right? Okay. And that might be an issue. Yeah, C++'s grammar is crazy. So, yeah. Oh, yeah. Yes.
I mean, yeah, I really feel like... Sorry, go ahead. I had, I think, students at one point recently ask
why lambdas have the syntax they
do, and I was basically like, because it was the only option left. I mean, that's not 100% accurate,
but I think it's pretty close to the truth, actually. Yeah. Working on this proposal,
I really feel like I got myself into a 35-year-old Jenga game.
It's pretty challenging. Where do I poke?
Is this thing going to fall? I don't know.
Are there any particular
use cases for pattern
matching that we haven't already discussed
that you think C++ developers
are going to find really useful? Specific use cases for pattern matching? Yeah, I mean, I guess we haven't quite
discussed trees all that much. Okay. But pattern matching comes up in tree traversals, but
also tree manipulations, quite a bit. And so one of the benefits of pattern matching
is that it allows you to nest patterns,
which works exactly like how you nest values, right?
Like when we build values, we build by composition.
You might have some int that you put inside of a pair,
which you put as an alternative of a variant,
which may go into a struct or whatever. So these things compose. And so in the exact same way, you can
build patterns via composition. And so you can think back to how you built
that value and then build a pattern to see, like, do I have that thing? And so with a tree, it works in a similar way as well, where I might say, like, oh, check if the thing on the
left is... So, okay, in an expression tree, for example, like an arithmetic
expression tree, if you were to perform some simplifying operation, you
actually need to look deeper than one level down to be able to figure out that you actually can do simplifications.
Okay.
I wish I had a concrete example to share with you guys, but I think it's concrete enough.
So if you say on the left, I might have an int, but I need that thing to be zero.
And if I have a plus, then just return the right-hand side.
Sure. And I need to be
able to do multiple levels of inspection before I can actually make this decision, right?
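The "zero plus something" simplification described here can be sketched with today's tools. This is a minimal illustration, assuming a hypothetical expression type built from `std::variant` and `std::unique_ptr`; the names `Expr`, `Add`, `add`, and `simplify` are invented for the example. Note how the check has to look at the node, then its left child, then that child's value, each with its own explicit test.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <variant>

struct Add;  // forward declaration so Expr can refer to it

// An expression is either an int literal or an addition node.
using Expr = std::variant<int, std::unique_ptr<Add>>;

struct Add {
    Expr lhs;
    Expr rhs;
};

Expr add(Expr l, Expr r) {
    return std::make_unique<Add>(Add{std::move(l), std::move(r)});
}

// Rewrites `0 + e` to `e`, looking more than one level down: first "is this
// an Add?", then "is its left child an int?", then "is that int zero?".
Expr simplify(Expr e) {
    if (auto* node = std::get_if<std::unique_ptr<Add>>(&e)) {
        Add& a = **node;
        a.lhs = simplify(std::move(a.lhs));
        a.rhs = simplify(std::move(a.rhs));
        if (const int* lit = std::get_if<int>(&a.lhs); lit && *lit == 0) {
            return std::move(a.rhs);
        }
    }
    return e;
}

// Small usage example: 0 + (0 + 7) simplifies to 7.
inline void simplify_example() {
    Expr r = simplify(add(0, add(0, 7)));
    assert(std::get<int>(r) == 7);
}
```

With nested patterns, the intent is that the three checks collapse into one pattern that mirrors how the value was built.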
And so this kind of stuff comes up a lot in tree manipulation specifically. One of the
poster children for it is red-black tree rebalancing.
You can...
So with something like a std::map,
which keeps track of...
which has a red-black tree underneath,
it actually consistently rebalances this tree
to keep the height at a reasonable height.
And the process for that,
the process for which that happens is you inspect the structure of your children.
And so it would be like,
if you have left, left, like red, red,
and then on the red,
you have a black and a red
or something like that in some orientation,
then you can reorient that
such that we restore its invariant.
And the code for it actually turns out that there's actually like four cases,
four cases of the structure that you're looking for of the children.
And then all of them actually merge to the same structure.
And so the code for it ends up being like 10 lines of code.
As opposed to oftentimes when you look at C++ or Java
examples of red-black trees, there's
a left-rotate function,
rotate-right function, left-left-rotate
function, and all this kind of stuff, which
gets very big and very
complex as to how you reason about this
code. And so it allows you to
think structurally
when you actually modify these
trees and things like that.
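The "four cases that all merge to the same structure" idea can be sketched without pattern matching, using hand-written checks. This is an illustration in the style of the well-known functional red-black insertion rebalance, not the code from any particular library; `Node`, `Tree`, `make`, `rebuilt`, and `balance` are hypothetical names. Each of the four shapes (a black node with a red child that has a red child) is rebuilt into the same red root with two black children.

```cpp
#include <cassert>
#include <memory>
#include <utility>

enum class Color { Red, Black };

struct Node;
using Tree = std::shared_ptr<const Node>;  // immutable, shared subtrees

struct Node {
    Color color;
    Tree left;
    int value;
    Tree right;
};

Tree make(Color c, Tree l, int v, Tree r) {
    return std::make_shared<Node>(Node{c, std::move(l), v, std::move(r)});
}

bool is_red(const Tree& t) { return t && t->color == Color::Red; }

// The single target shape: red y with black children x and z.
Tree rebuilt(Tree a, int x, Tree b, int y, Tree c, int z, Tree d) {
    return make(Color::Red, make(Color::Black, a, x, b), y, make(Color::Black, c, z, d));
}

Tree balance(Color color, Tree l, int v, Tree r) {
    if (color == Color::Black) {
        if (is_red(l) && is_red(l->left))   // red left child, red left grandchild
            return rebuilt(l->left->left, l->left->value, l->left->right,
                           l->value, l->right, v, r);
        if (is_red(l) && is_red(l->right))  // red left child, red right grandchild
            return rebuilt(l->left, l->value, l->right->left,
                           l->right->value, l->right->right, v, r);
        if (is_red(r) && is_red(r->left))   // red right child, red left grandchild
            return rebuilt(l, v, r->left->left,
                           r->left->value, r->left->right, r->value, r->right);
        if (is_red(r) && is_red(r->right))  // red right child, red right grandchild
            return rebuilt(l, v, r->left,
                           r->value, r->right->left, r->right->value, r->right->right);
    }
    return make(color, std::move(l), v, std::move(r));  // nothing to fix
}

// Small usage example: black 3 with red 1 on the left and red 2 below it.
inline void balance_example() {
    Tree t = balance(Color::Black,
                     make(Color::Red, nullptr, 1, make(Color::Red, nullptr, 2, nullptr)),
                     3, nullptr);
    assert(t->color == Color::Red && t->value == 2);
    assert(t->left->color == Color::Black && t->left->value == 1);
    assert(t->right->color == Color::Black && t->right->value == 3);
}
```

The chains of `l->left->left` accessors are exactly the structural inspection that nested patterns are meant to express directly.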
So that's a use case that I think comes up quite a bit. Okay. I was wondering if the type
matching feature can match multiple types
and if so, does it then work for multi-method dispatch?
Yes. Yeah.
It's actually not included in the paper today, but the intent
is that we allow matching multiple values. Okay. And so you want to be able to say inspect
a comma b comma c, and then that would give you an implicit tuple-like thing,
and so you would put square brackets around your pattern to match the multiple values.
But if those happen to be variants, for example, you could do double dispatch.
Right.
Okay, sorry. Yes, go ahead.
Yeah, so it would basically do tuple matching of those things.
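For comparison, double dispatch over two variants is already possible today with `std::visit`, which accepts several variants at once. The sketch below assumes hypothetical `Circle` and `Square` types and a `collide` function invented for illustration; the `overloaded` helper is the common idiom for combining lambdas, not a standard library facility.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical shape types used only for illustration.
struct Circle {};
struct Square {};
using Shape = std::variant<Circle, Square>;

// Common helper idiom: inherit the call operators of several lambdas.
template <class... Fs>
struct overloaded : Fs... { using Fs::operator()...; };
template <class... Fs>
overloaded(Fs...) -> overloaded<Fs...>;

// Dispatches on the alternatives held by both arguments at once.
std::string collide(const Shape& a, const Shape& b) {
    return std::visit(overloaded{
        [](Circle, Circle) { return std::string("circle/circle"); },
        [](Circle, Square) { return std::string("circle/square"); },
        [](Square, Circle) { return std::string("square/circle"); },
        [](Square, Square) { return std::string("square/square"); },
    }, a, b);
}

// Small usage example.
inline void collide_example() {
    assert(collide(Circle{}, Square{}) == "circle/square");
}
```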
But remember, each pattern can still answer the question of,
does this thing match?
And how that gets optimized underneath, we can do whatever.
Compilers will have full power to do whatever they need to do to make that fast
But I think, in terms of performance, we have strictly more information
than we did before, right? Which I think is a good thing. Yeah, I see that it gives
the compiler more semantic, I guess, context for what you're trying to do. It doesn't just see a block of if statements.
Exactly.
Yep.
Yeah.
It's very difficult for compilers to sift through arbitrary Boolean expressions.
Right.
And to deduce meaning from it.
So we had Herb on last week, and he talked about his reflection and metaclasses proposal.
You said you have a library implementation of this.
Do you think that library implementation would be any different if you had reflection and
metaclasses, or would the proposal change at all when those are available? I
think my answer would be no, probably. Reflection I'm somewhat familiar with. Metaclasses, I'm not that familiar with.
But one of the issues with the library solution really is the lack of notation available.
And so if you look at a pattern inside the library, they're all just function calls, right? And so it's very hard to look
at a pattern and deduce what the structure of it is. Whereas with the added notation of square
brackets, and angle brackets for types, and things like that, when you see something like open angle
bracket, point, close angle bracket, open square, x, y, close square,
you can kind of visually see, okay, I'm matching a point,
and then I'm going to destructure that into a tuple-like thing.
When you just see alternative, open paren, destructure, open paren, x, y,
it just looks like a sequence of nested function calls, and it's hard
to deduce visually what's happening
there. And I don't
think metaclasses give you
the ability to add new syntax.
Yeah, I don't think so either.
Yeah, and so part of it
is visual inspection,
and so I don't quite think
that it would help much
in this case.
That makes sense.
Well, it's been great having you on the show today, Michael.
We'll definitely keep track of the progress with the proposal going towards C++23.
Sounds great.
Yeah, thank you.
I guess we haven't mentioned the number for the proposal.
Oh, yeah, go ahead.
It's P1371.
1371. Yeah.
Okay, well good luck at the
Cologne meeting. Yeah, thank you very much.
Thank you for having me on the show. Thanks.
Thanks so much for listening in as we chat about
C++. We'd love to hear what you think
of the podcast. Please let us know if we're
discussing the stuff you're interested in,
or if you have a suggestion for a topic, we'd love
to hear about that too.
You can email all your thoughts to feedback at cppcast.com.
We'd also appreciate if you can like CppCast on Facebook
and follow CppCast on Twitter.
You can also follow me at Rob W. Irving
and Jason at Lefticus on Twitter.
We'd also like to thank all our patrons
who help support the show through Patreon.
If you'd like to support us on Patreon,
you can do so at patreon.com cppcast and of course you can find all that info and the