CppCast - How CLion works under the hood
Episode Date: August 4, 2023. Dmitry Kozhevnikov joins Timur and guest co-host Matt Godbolt. Dmitry talks to us about how the CLion IDE works under the hood. News: mold 2.0.0 released; CLion 2023.2 released; Should we stop writing functions? Links: C++ Lambda Idioms: technique to separate user-specified and deduced template parameters; The Kotlin Foundation; JetBrains Runtime; CLion data flow analysis engine; PVS-Studio: 60 terrible tips for a C++ developer
Transcript
Episode 366 of CppCast with guest Dmitry Kozhevnikov, recorded 1st of August 2023.
This episode is sponsored by the PVS-Studio team.
The team promotes regular usage of static code analysis and the PVS-Studio static analysis tool. In this episode, we talk about new releases in the world of C++ tooling.
And about why we should use lambdas instead of regular functions.
Then, we are joined by Dmitry Kozhevnikov.
Dmitry talks to us about how the CLion IDE works under the hood. Welcome to episode 366 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Timur Doumler, joined by my co-host for today, Matt Godbolt.
Matt, how are you doing today?
I'm doing great. Thanks, Timur. How are you doing?
I'm not too bad. Thanks. Yeah, so Phil is on vacation.
So I'm doing a couple episodes with guest co-hosts.
And Matt, thank you so much for coming on the show and being my co-host today.
I'm very excited about having you here.
Likewise.
Thank you for asking me.
I'm privileged to be here again.
Yeah.
So how's it going on your end?
What are you up to?
It's going very well, thank you.
So recovering from a couple of conferences back to back where, in fact, I met you, which
has been the first time in a little while that you and I have seen each other in the flesh.
We had CppNorth and C++ on Sea,
and it was so, so lovely to see C++ people again
after so many years without having seen anyone
or seeing them over video conference.
So yeah, it's going great.
I'm just sort of recovering now and a bit of a comedown after that.
Yeah, yeah. So I've also been to both of those conferences, and it was good fun. Now I'm taking a bit of a break from conference travel; it's been quite a lot the last few months for
me. All right, so at the top of every episode I'd like to read a piece of feedback, and
this time I got some feedback from Richard Powell, whom I actually met at CppNorth, the conference that we just mentioned. Richard said: thank you guys so much
for starting up CppCast again. I don't always get the chance to listen to it, but when I get
the chance to listen to it, I really love it. Thanks. Well, thank you, Richard, for the feedback.
Richard actually went on and gave me another piece of feedback. He said that because we're
an audio-only show,
sometimes it's a bit difficult to follow the conversation
if there's no visual material.
For example, if you're talking about syntax or something like this.
And he suggested actually that we could use a podcast feature
where you can have different cover art for every section of the episode,
so you can essentially use them as slides
to visualize what you're talking about. I'm not sure this is something we want to do. I think I
need to look into this; I've never heard about this before. No, me neither. Yeah, that sounds
like a really intriguing idea. And I can certainly see it: I listen to some other
podcasts, and certainly in some of them, where they start to talk about code, it would be useful to
actually see the code on the screen. But then the other thing is, if you start relying on that too much, then you lose your
listenership who are walking their dog, for example, like myself, or listening to it at 1.9x and can't
keep up with the slides changing every two seconds, like some people we know. So I could see that
being useful, but it also adds an awful lot more burden on the podcast producers to actually generate visual content as well. Yeah, so that's why I'm a bit skeptical: A, it's more work, and nobody likes
more work; and B, it's kind of a trade-off, right? Because it might be easier to follow the topic
for the people who do watch this slide sort of content. But then,
yeah, a lot of people, like you said, told me that they listen to CppCast while they're on a run or something like that,
and we really don't want to lose that. It might make you lazy, because if you could rely on
the visual aid the whole time, then you're not forced to explain it better for those folks who
aren't able to see the picture. So yeah, an interesting idea nonetheless. So thank
you, Richard. All right, so we'd like to hear
your thoughts about the show, and you can always reach out to us on Mastodon, or on Twitter, or
whatever it is called these days, or email us at feedback@cppcast.com. Joining us today is
Dmitry Kozhevnikov. Dmitry has over 15 years of experience in C++ development. He began his career
with a company specializing in GIS,
simulators, and 3D entertainment systems, serving as a C++ software developer for seven years.
Later, he joined JetBrains and contributed to the C++ language support in CLion.
There, he worked on the in-house parser, semantic analysis, and refactoring features for C++ code.
Currently, Dmitry is doing engineering management tasks on the CLion team.
He has a passion for developer tooling and dreams of making C++ development a universally
pleasant and accessible experience.
Dmitry, welcome to the show.
Yeah, hi.
Nice to meet you.
Nice to see you all.
I'm super excited to hear that you used to do stuff with, what was this, 3D entertainment
systems.
What on earth was that?
Yeah, well, basically the company had an in-house 3D,
like sort of gaming engine, game engine,
or something like that.
Right.
But yeah, and there were some simulators
for ships and stuff done on top of that.
And as a sidekick, it was like some entertainment,
like, you know, 5D things that are installed in malls. Oh, I see. So people come and play a game. Oh, the ones
where your chair rocks back and forth, and, like, soap bubbles are flying around. Yeah, and some
air breathes in your face and stuff. Right, right. There are, like, rats that run along, and then
you can feel something at the back of your heels
and all that kind of stuff.
Right, okay.
That's cool.
That's cool.
That was C++ as well, presumably.
Yeah, that was all C++.
That was very C++ heavy shop.
I suppose, yeah.
Cool, very cool.
All right.
Well, Dmitry, we'll get more into your work
in just a few minutes.
But before we do that,
we have a couple of news articles to talk about.
So feel free to comment on any of these, okay?
First one is a release in the C++ tooling world:
mold 2.0 got released.
Mold is a faster drop-in replacement
for existing Unix linkers,
such as LLVM's lld and GNU gold.
And what the new major version 2.0 does
is it switches the license from AGPL to MIT.
There's a comment by the team saying they've been attempting to monetize Mold through a dual AGPL commercial license scheme.
And that didn't work out as well as they hoped.
And so they switched the whole thing to MIT now.
And in addition to that license change,
there are also some technical updates in this version,
but no super major features, I think.
So I think it's mostly about the license change.
Yeah, mostly the license, which is huge for commercial users.
It was a bit vague before as to whether or not,
if you used a linker that was AGPL,
the binary that was linked by it was also
bound by some of the clauses of it. So making it MIT makes it very, very clear
what the legal situation is of the linked binary. And mold is
so fast, it's almost as fast as copy. So if you just copied all your .o files into a single
directory, mold is only slightly
slower than that. And so we use it for some of our debug builds, so you get a really, really fast
edit, debug, link, run, test cycle in my day job, and it's fabulous. So this is
great, great news. It means that I feel a lot more confident about using mold myself. And
I don't know how he's made it so quick. Yeah, that's interesting. I have
to admit I had not heard about mold before, but it seems like quite a lot of people are
relying on it because it is much faster. So does it basically speed up your link
times, if that's your bottleneck? Absolutely, yeah. There are also some other features
that it has. It has some sort of layout randomization options that you can use to make sure that your code isn't performance-bound by linker layout. So,
you know, you can link multiple times, and then run it and see if it gets
faster or slower, and you're like, well, there's some sensitivity to exactly how my code is laid out.
But the main reason you use it is because it's just so blisteringly fast that you don't even notice
it's running anymore. Sometimes you get to the end of compiling your program and there's a two-minute pause
while the link happens, and you're like, oh, that's a shame, I only made a one-line change. Right. Okay.
We have another major release in the tooling world: CLion 2023.2
actually got released last week. It has quite a few really spectacular new features.
There is an AI assistant.
I mentioned this on the show, I think, a couple of episodes ago
when there was an EAP, like an early access preview,
that had the AI assistant.
We talked about that one already on the show, I believe.
This release also gives you registers in the debugger,
which you can now inspect.
There's integration with PlatformIO,
which is, I think, great for embedded people.
Is that right, Mitya?
Yeah, that's correct.
Well, we had it before, but now it's redone,
and now it's way more straightforward.
It's especially great for hobbyists and newcomers.
Well, it's also probably used by professional developers,
but professional developers have other options.
Newcomers and hobbyists,
that's who are using PlatformIO.
So I think it would work great for them in CLion.
All right.
And you also added vcpkg integration,
which I think is also quite exciting news. Yeah, the AI assistant is particularly interesting to
me. I looked through the terms and conditions to see whether or not I could easily enable it. And
so far, because of the amount of back and forth, understandably, and the fact that I share
my account between my personal and my work use, I couldn't clearly look my legal
team in the eye and say: don't worry, none of my work code is going to go to this. But I'll be
excited to turn it on and see what it comes up with. You know, I've only really dipped my toe in
all of this AI nonsense, but if it's in my IDE, it'd be rude for me not to try it out. Yeah,
that's actually a valid point. We're working on making it more straightforward
and more clear, from a legal point of view, to understand how you disable the thing
completely and how you enable it when you need it. Oh, fantastic. Obviously, your code
is going to be sent somewhere for it to work. So you're probably very aware of how sometimes code
has to be sent to remote servers for some reasons; I won't just mention
that. Well, it's not included by default in the release; you need to install it separately.
I know that it's a concern for a lot of people, so you're not going to get it automatically
when you install CLion. But generally, I'm kind of excited and terrified at the same time about our future, because, well, it will change things.
Yeah, I think we should also probably have an episode about this AI revolution that's happening right now, also in a C++ tooling world, along with the rest of
everything. Probably we should talk about this at some point on the show.
Today, we're going to talk more about CLion. But before we get to that, I have one more news item.
There was a blog post. I mean, there were a bunch of blog posts released in the last couple of
weeks, but there was one that I found particularly interesting. It was written by Jonathan Müller,
who I think has been around writing and talking
about C++ since at least 2015. But he's now with think-cell, a German company in Berlin,
and he's now writing blog posts for them, among other things. And this one is called
should we stop writing functions and use lambdas instead?
So basically, you know,
half ironically, what he's saying
is you shouldn't write functions anymore;
we should write everything with lambdas.
Instead of writing, I don't know,
a function that takes two integers
and returns an integer,
you just, you know, write constexpr auto,
blah, blah, blah, equals,
and then write a lambda expression instead,
and you call that.
And basically, you should just do that everywhere
and just not write freestanding functions anymore.
Because, you know, functions are kind of really broken, right?
Like, you have ADL, which is really complicated.
You can overload them.
And the way they do
template argument deduction
is kind of weird.
And lambdas are so much better.
There's no ADL,
there's no overloading unless you want to,
so you can construct these lambda overload
objects, but by default, there isn't
any overloading. Lambdas
do this really cool thing, which I think I mentioned
also in my C++ Lambda Idioms talk.
You can separate the template arguments that are deduced from the ones that you explicitly have to
specify. So, like, you can make the lambda itself a template, and then you
have explicit template arguments there, and then you can give it an auto parameter (you call this a
generic lambda), and those
are the ones that are going to be deduced. But they're in two different places, right? So
you have full control. I'd never thought of that. Yeah, that's interesting.
That's a really cool trick. And then they're implicitly constexpr as well, which is something
that I think we all want, right? Sprinkling constexpr like magic
sauce throughout our code bases is something we're all kind of a bit used to doing nowadays, but, you
know, it's nice for it to be the default. So it's an interesting idea. It also
means that your code looks more and more like TypeScript or JavaScript, which maybe
is not something you want to aspire to, but it is an interesting idea. I don't know that I could
go with it; I'm too old and fuddy-duddy, I think. I still like functions. But I
certainly appreciate the idea of having something which is trying to show us that the defaults are broken
yet again. Yeah, I don't think this is going to make it into any company's actual coding
guidelines, but I think it's a really interesting blog post summarizing the problems with functions
that we have, and it kind of helps us be more aware of them, and, like, choose the other thing if you want to. Yeah. I wonder if there's any
other mainstream language where the community kind of agrees that functions are broken.
I mean, we already agreed that initializing a variable is broken, right? And apparently
functions are also broken. I wonder what else is broken. I mean, with C++, it's been around so long
now, it's hard to find things that aren't broken in any meaningful way. I mean, that's unfortunately
part of the legacy that we bring with us as C++ folks. And the desire to be backwards compatible
all the way back to pretty much C means that we have to put up with an awful lot of wrong defaults and
stuff. So yeah, I don't know how many other languages have
so much baggage to deal with. Okay, so that concludes the news items for today. So we can
transition to our main topic. And I want to say a little bit about like how this episode came about,
because I wanted to do this episode for a very, very long time. What happened recently, a piece
of personal news, is that I actually quit my job. So I actually no longer work for JetBrains as of
yesterday.
So nobody can accuse me of a conflict of interest here anymore.
I can finally do this episode.
And so the idea was kind of, you know, when I was talking to people about CLion, or working
at the JetBrains booth at conferences, a lot of people came to me and were asking questions
about how CLion actually works under the hood.
It's not a compiler, right? It's an IDE. So it's kind of different, but people are
very curious. And so, yeah, I decided to invite Dmitry on the show to talk about this, because
I think it's a really cool topic. So yeah, Dmitry, welcome again to the show.
Yeah. Hi again.
So probably the most frequent question I actually got asked about how CLion works is: which programming language is CLion actually written in?
Is it really all written in Java or is there anything else going on?
Are there any native components written in C++?
Yeah, so the very short answer is yes, but it actually depends.
So indeed, like most of the code is written in JVM languages.
So that's Java, a bit of Groovy, and now Kotlin.
And CLion shares the platform, the IntelliJ platform,
with a bunch of other JetBrains IDEs.
And what I actually wanted to talk about here,
while we're on the JVM side,
I wanted to talk about Kotlin a bit, and about how we made the transition to Kotlin.
And actually, now in JetBrains, in IntelliJ, and in CLion as well, Kotlin is the go-to way to write new code.
And it's actually explicitly preferable to write in Kotlin when you're working on CLion and the other IDEs.
And it was all done in a 20-year-old code base in a single monorepo where thousands of people have contributed there.
And Kotlin was integrated there over time.
And what made it happen is that it has absolutely top-notch interoperability
with Java. So you basically can take any class, convert it to Kotlin, or rewrite it in Kotlin,
and start using it. And it's not just possible; the APIs are actually somewhat compatible. So you
can use the collection APIs from Kotlin and from Java, and on both sides, it would be idiomatic and natural.
And I wanted to talk about that because that is actually a direction
I sometimes see C++ going recently.
So there are a lot of activities, a lot of work on introducing
some kind of successor to C++, whether it's Google's Carbon or Herb
Sutter's cppfront, or Swift trying to do C++ interoperability, and stuff like
that.
So I think all these efforts have a lot to learn from how it's possible to gradually
migrate to Kotlin over time.
And it wouldn't be possible if we just had to rewrite bigger components in Kotlin
and then just have some kind of compatibility layer
or stuff like that.
My understanding is that Kotlin,
and you mentioned Groovy as well,
and I know that there are other JVM targeting languages,
but the JVM itself enforces essentially the ABI
between all of these components.
And so in order to operate, they are forced through the ABI of the JVM. And that means that
the way that you instantiate classes, the way that objects are created, the way that references
are passed between them, has all already been agreed upon. And so being interoperable
with the JVM makes you instantly interoperable with pretty much everything else that is also JVM-based.
Whereas C++ doesn't have the luxury of doing that, because the ABI is much lower-level and doesn't talk about a lot of these sort of nitty-gritties.
And in order to use something like C++, you actually need to look at the header files and understand them.
Whereas in a JVM language, you already have like this blob of bytecode.
So, you know, obviously from the point of view of Kotlin being a natural successor
to Java, I understand that. But I think it had fewer problems. Or am I misunderstanding?
Yes, indeed, it had far fewer problems, and it would be a way more challenging task for C++. But
there are also some more subtle and ergonomic things that have
to be done right.
For example, you work with collections,
and basically with collections and iterators,
somewhat differently in Kotlin and in Java.
But it's actually interoperable in both directions.
So you can have a method in Kotlin that takes a collection
and it has slightly different interfaces in Kotlin and in Java,
and they are both idiomatic,
and you can call it transparently from one side to the other.
Okay, so the JVM isn't a panacea.
You still have to do some very clever tricks in Kotlin
to make it interoperate really, really seamlessly with Java stuff.
Cool.
Yeah, you still have to design it properly.
You still have to think about ergonomics
of actually interoperating between languages.
So how much of the code base is Kotlin now compared to, say, Java, roughly?
Yeah, I don't remember, actually.
But a decent amount of it.
Yes, a decent amount of it.
Also, all new code for a few years now has been written in Kotlin.
And Kotlin, forgive me if I'm wrong, is actually JetBrains' own language, right?
This is something JetBrains came up with.
Yeah, initially JetBrains came up with it.
Then it's, well, kind of governed by Kotlin Foundation now.
So it's open source.
And like there are some other major players
in Kotlin Foundation right now,
but it's still like mostly being developed by JetBrains.
That's a pretty impressive achievement
for a tool company to say,
we've had so much experience.
You know, the main product line, I think, IntelliJ, is the Java IDE that you sort of started with, as I understand it.
And so you're like, you're so sick of trying to deal with Java that you've come up with your own alternative spelling of all of the way Java works in order to write your own program.
That's cool. So now, obviously, the next obvious thing is that you should turn your attention to C++ and come up with an alternative syntax for C++, right? Probably, yes,
but, yeah, someone needs to do it; it's probably not us. Because, well, we have some in-house
experience of C++. For example, the JDK we're working on top of: we actually
work on top of a fork of OpenJDK. It's
called JetBrains Runtime, and, well, it's maintained and developed also at JetBrains. It's also open
source, and it has a bunch of fixes and additions compared to OpenJDK, mostly
related to desktop applications: the font handling, the focus handling, and stuff like that.
So there are a lot of people working there on C++ because OpenJDK has a lot of C++ code there.
It's actually a very marvelous engineering thing.
It's impressive.
It works surprisingly well.
Yeah.
And it's very complicated, but it works very well in some cases.
So you have a JDK written in C++,
and then you use that to run Java and Kotlin,
and then you use that to implement a front-end for C++?
Yes, something like that, yeah.
That's fun.
It's turtles all the way down.
But are there any components in CLion itself
that are actually written in C++?
Yeah, sure.
The bigger one is that, for some IDE features,
we're relying on ClangD.
We have a fork of it.
Well, ClangD is
a Clang-based daemon
for IDEs.
Well, it has
an upstream implementation,
and we are maintaining
something specific
for CLion
in our fork.
So, yeah.
It's, well, LLVM, and LLVM is purely a C++ project.
So we're also working on C++ there.
And does the ClangD, is that sort of integrated directly
into your codebase?
Are you doing like JNI calls to it
or is it something that runs sort of independently?
Is there sort of a separation between the Java process
and the C++ side?
No, it actually runs as a separate process and, well, for a good reason.
I love C++, but still C++ applications tend to crash.
And it happens rarely, but it happens way, way, way more often than the managed application like Java.
And actually, I don't think it would be possible
to write an IDE platform like IntelliJ in C++,
because it requires the code of thousands of people,
including third-party developers,
running in the same process
and interoperating with each other.
So the fact that it doesn't crash the IDE,
and that everything can be recovered,
is really huge.
But yeah, the ClangD fork we kind of control ourselves, and we can work on it, but we're still running it as a separate
process. Got it, got it. So it sounds like you're
managing folks these days more than writing code. But when you are writing code,
obviously there's a lot of Kotlin going on. Is there anything you miss about C++?
Using C++ as opposed to staring at the parse tree of C++ all day?
Yeah.
What I actually miss is, well, it's control over memory and control over lifetimes.
So these are two main things that I'm actually missing.
So it's actually really sad when
we can't do some things that I consider trivial in C++.
Like, you can't have objects with efficient memory layout; you need to have allocations. They're
not real allocations, they're managed by the JVM, and they're pretty fast, but they still
happen. And yeah, it became complicated sometimes, and that's the main
thing I miss. And also the lack of destructors, and the lack of more controlled lifetimes
of things. It's funny: when I see folks trying to write Java for the first time, who
have been in the C++ world, they'll immediately latch onto finalizers in Java as, like, a destructor.
And it's like the first thing you learn is: never, ever do anything in a finalizer.
There are just so many odd caveats and weird things.
It's unfortunate, because as a C++ engineer, the RAII sort of semantics are so baked into your soul that you want to have them in other languages,
and you just can't get them without, you know...
Yeah, and I think most large JVM applications
have some things implemented inside them
that are similar to RAII,
just from the user side, from the library side.
Yeah, sorry.
Yeah, I remember this.
Like, when I first joined JetBrains
as a developer, I think that was back in 2017, you know, I was hired and then I was told: oh, you're
going to be writing a lot of Java. Do you know any Java? And it's like: no, I don't.
And I kind of learned Java on the job. But it felt so weird in the beginning, where you just
sprinkle naked new expressions all over the place, like new this, new that. And I'm like: oh my god, this is an allocation, this is going to be slow. And
then it took me a while to get used to the fact that it's fine; this is how it works; it's not going
to be slow; it's okay, you know. And yeah, it's just a very different way of coding. I think
what I really appreciated in Java is that things were actually a lot easier to write.
I could just write code, and, you know, the computer is actually going to do what I wrote, which almost never happens to me when I write
C++. But yeah, on the other hand, as Dmitry said, you don't have this
kind of control over what's actually happening, and that felt really weird for quite a while.
But are there things that you prefer about working in a managed language? You know,
Timur has just explained some of the things that were good for him.
But what about you, Dmitry?
Are there things that, when you go back to C++, if you do indeed write C++, you're like: oh, gosh, I have to do this thing?
Yeah, well, the first thing, as I kind of mentioned, is all sorts of safety. So, like, not being able to actually crash the application
and recover reasonably from any kind of issues
is really powerful.
But the main thing I'm missing when doing C++
is actually all the tooling and control
over whatever is happening in the program.
So it starts with source-based tooling, like the
IDE's inspections and stuff; they're way more powerful in both Java and Kotlin. Well, you
basically can write half of your code by just pressing shortcuts, and it does the thing for you.
That's the running joke, though, isn't it? Like, you know, we're talking with people, and it's
like: you know, just hit Control-Space or Control-Enter until your entire application is written.
I suppose that the ChatGPT style,
or the sort of OpenAI stuff,
the AI assistant we were talking about,
is the logical conclusion
of auto-completing your entire code base.
Yeah, something like that.
But it actually makes sense
when we're doing it in Kotlin and Java. So yeah.
But it's not even that. The more powerful thing is the control over the runtime and stuff
via tooling. So the debuggers work way better, and you can have reasonable memory snapshots, where you
can just get a snapshot of your application, inspect all the objects with all the fields, see how they are related, and look at their values. And,
well, it's not always possible in C++. And then you can have some instrumentation, have
some code rewritten on the fly, have compiler plugins if you need them. It's all sort of possible in C++, but
it's not accessible to
people. You need
to be a really sophisticated
engineer and really understand
what's going on to access these things,
and in the JVM
it sort of happens naturally.
And I guess the other thing that I think of when you
mentioned about not crashing is that
there's no undefined behavior that I'm aware of in Java.
So if you reference a null pointer rather than the compiler going, I can do anything I darn well like now, it rather pedestrianly throws a null pointer exception and you carry on with your life and someone else can catch it somewhere else and go, oh, yeah, that's fine.
Never mind, there's a bug in that bit of the code, just don't run it again, you know? Which is a very different mindset, I think,
from C++.
Right.
So speaking about C++,
I'd like to mention my favorite topic, parsing.
Yeah, so parsing C++ is hard.
I believe, Dmitry,
you remember that you and I
actually did a CppCon talk
about parsing C++ and how hard it is.
Yeah, I do.
A few years ago at CppCon.
But in an IDE, it's even harder, because
you're working with half-written code, right? And you still kind of have to make sense of
what the user is writing. So how do you do that? How do you parse code in CLion, and what parsers do you
use? And can you talk a little bit about that? Yeah, sure. I'll probably start with the easier
part: how we're using ClangD. So nowadays, we're using
ClangD basically for things that can be done on a single file. So when we show errors,
when we show highlightings, when we do code completion (the basic kind,
which just completes the things that are imported, included into your file),
that's where we're using ClangD, and it uses the Clang parser under the hood. And it
works well, but that's actually not enough for what we are doing and what we want
to achieve. So at the same time, we maintain our in-house parser, which is currently
used for refactorings and project-wide things. And it's actually pretty powerful, and
pretty fascinating what it can do. Basically, it has to do everything that the compiler's
parser has to do, and then more on top of it. For example, it has features like: it
can be incremental, in a way that it can reparse specific functions when you type inside a function.
So that's something that current compilers can't really do.
Then, it's lazy. As you know, as you parse C++ code, you need to do semantic analysis.
And our in-house parser can
do only the necessary semantic analysis on the fly, and not all of it.
So if your code is structured in a way where it's obvious what's written,
and it doesn't have any ambiguities,
it can avoid doing any semantic analysis until it's actually needed.
Is it that kind of thing where you have, I don't know,
an identifier somewhere and you don't know if it's a type or a variable
or whatever, and you kind of have to execute arbitrary constexpr code
somewhere else to figure that out?
Yeah, that's something like that.
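A minimal sketch of the kind of ambiguity being hinted at here (illustrative code only, nothing from CLion itself): the same token sequence means different things depending on what a name turns out to be, which is why C++ parsing can't be fully separated from semantic analysis.

```cpp
// The statement "a * b;" is either a pointer declaration or a
// multiplication, depending entirely on what 'a' names. A parser has
// to resolve that name before it can even decide what the statement is.

struct A {};              // here 'A' names a type...

void declares_a_pointer() {
    A * b;                // ...so this declares 'b' as a pointer to A
    (void)b;
}

int a = 6;                // here 'a' names a variable...

int multiply(int b) {
    return a * b;         // ...so this is a multiplication
}
```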
And also, the common thing inside the IDE when you work with
project-wide things: you have a name, and then you look it up, and you need to
understand whether it's actually a usage. So you need to do type inference, type checks, on
that. And what our in-house thing can do is do it also on demand, lazily,
and only for a single thing.
It doesn't have to reparse all the files for that.
And yeah, the throughput of this,
the raw performance, is obviously way lower
than just the compiler,
in terms of megabytes per second
and stuff like that.
But doing tricks like that,
we can very often do things way faster
than a compiler would do them for us.
Because you're able to do them incrementally.
I had no idea that you were reparsing
on a function-by-function basis.
That explains an awful lot.
That's something I've often wondered.
I'm tapping away quickly,
and I expect when I type this completely new variable
that I've just
finished typing, that I can immediately do dot and hit control space. And it's going to tell me
all of the members that might be in there. And I'm very intolerant if it delays in any way,
but you have to essentially pause the world in order to answer the question, what functions
might be able to go here? And, you know, especially with the ADL stuff that
you do nowadays, I think you can control-space on pretty much anything, and it goes, oh, there's a
free operator that takes this as a first parameter, I'll give you that as an option as well.
Yeah, that's one of my favorite things to have here. But to be honest, just the
normal completion is done by ClangD, and it's also pretty fast. Because ClangD,
unlike Clang itself,
also has a bunch of optimizations that work on a single file.
So it doesn't... if you have just a set of normal includes
on top of your file,
just top-level includes one by one,
without any declarations,
it doesn't reparse them.
It keeps them
in memory, so it also works reasonably well.
But yeah, with our in-house
parser, we can do it even more incrementally, even more granularly.
And then, is there a separate thing
that does, like, the syntax highlighting? Is that different again, or is that built into one of those parsers? Because I can only imagine, literally, that's character by
character: you have to work out what color to print the next character as.
Yeah, mostly
that's ClangD, like the highlighting of what's a variable and what's not. And when you
see errors or warnings and stuff like that, the little squigglies and things underneath stuff, yeah, it
comes from both.
It depends on where it's implemented.
So, I use CLion in my day job; it's my daily driver. But there are a bunch of other C++ tooling things that JetBrains provides: ReSharper, and then there's Rider.
Is that the other one that's, like, the Unreal Engine thing?
Yeah, that's right.
How much of the code, you mentioned you're in a monorepo,
so I'm imagining you can share quite a lot of the code
between these things, but they look on the face of it
quite different from each other.
Is there a lot of code reuse between them,
or is there a, yeah, how does that look?
Well, actually not much, not much yet,
because, well, it's a story.
It originates from very old times,
when there were two completely separate departments
in JetBrains.
One is doing IDEs, and another is doing
the Visual Studio plugin, the ReSharper.
And the Visual Studio plugin is actually running on C# and .NET infrastructure. So it's a
different code base. And back then, it was decided to just re-implement the same thing another time for ReSharper C++.
And yeah, the only thing we are sharing right now is some test data for our parsers and actions.
And yeah, that's obviously not ideal,
and that's something we're working to do something about.
I probably can't go into more details right now,
but yeah, that's something we're very well aware of, and hopefully one day it won't be like that.
But for now, they kind of share the same ideas, because, well, the ReSharper parser,
and Rider, which also uses the ReSharper backend under the hood,
it's kind of the second attempt at the same idea
that CLion in particular has.
All right, we're going to take a little break
because now we would like to mention a few words
from our sponsor.
This episode is supported by PVS Studio.
PVS Studio is a static code analyzer
created to detect errors and potential vulnerabilities
in C, C++, C Sharp, and Java code.
Podcast listeners can get a one-month trial of the analyzer
with the CppCast23 promo code.
Besides, the PVS Studio team regularly writes articles
that are entertaining and educational at the same time on their website.
For example, they've recently published a mini-book
called 60 Terrible Tips for a C++ Developer.
You can find links to the book
and to the trial license in the description.
So one of the things that I'm more impressed with
with all of the IntelliJ family of products
is how wantonly I can completely change the code
from my shell while the IDE is running in the background.
You know, like I'll check out a different branch.
I'll revert some files.
I'll do some changes.
I'll copy things around.
And then I'll alt-tab back to the application.
And magically, it's kind of kept up with me and is completely fine.
Whereas I remember, like in the old days of Visual Studio, this is like 15, 20 years ago, there was always this kind of button I had to press to synchronize state with the UI,
all that kind of stuff.
And I would always wonder,
should I be shutting my IDE down before I edit stuff or whatever?
But I just assume that CLion is going to keep up with me.
How on earth does that stuff work?
Yeah, well, some people would argue it might work better.
But yeah, we're actually working hard on
making it work smoothly and painlessly. There are a bunch of tricks we're doing.
On the one hand, we have a separate process that watches file system changes and then just syncs them back to the IDE state.
So the IDE might look at a slightly
out-of-date state, but it's always consistent.
So the files won't change in the middle of some code analysis
or it would be just canceled and restarted.
But also specific to C and C++,
there is also a bunch of things
we're doing in our
indexing and optimizations.
We're sort of doing the same thing
that modules are doing, but
in a somewhat different way.
So we're trying to keep
a parsed representation of
header files and
reuse them when possible.
And then, when we reindex a changed file
and we encounter an include directive,
we need to understand what the preprocessor state is,
and what the preprocessor state was
when we did it last time,
and whether it is significantly different,
or whether the difference should affect the parse tree
of this header file.
And if it doesn't, we don't go into this file,
and we just reuse the parsed representation
of the header file we're storing.
So tricks like that, they're hard to do correctly.
They're complicated.
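A toy illustration of why the preprocessor state matters here (the "header" is inlined into one file for the sketch, and the macro name is made up; this is not CLion's actual mechanism):

```cpp
// A cached parse of a header can't be reused blindly: the preprocessor
// state at the point of inclusion changes what the header's text means.

#define SMALL_BUFFERS   // pretend this was defined before the include

// --- imagine the lines below live in "buffers.h" ---------------------
#ifdef SMALL_BUFFERS
constexpr int kBufferSize = 16;
#else
constexpr int kBufferSize = 4096;
#endif
// ---------------------------------------------------------------------

// Two translation units that include "buffers.h" with different macros
// defined beforehand get genuinely different parse trees, so an indexer
// has to compare the relevant preprocessor state before reusing a
// stored representation of the header.
int buffer_size() { return kBufferSize; }
```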
That sounds like the kind of trick that you could imagine
a compiler vendor implementing with the same thing.
You know, if you encounter this particular file with the same state ahead of time, then
load some incremental slab that you've pre-done instead of reparsing everything again.
But yeah, maybe it's because it's only a subset of the information that you're interested
in, the types and the inferences
and things like that.
But yeah, it would be lovely if we could just get all the benefits
of modules without actually having to write modules
or understand how they work.
That's just me being lazy.
Yeah, well, it's actually one of the things we're doing
under the hood.
It's a sort of experimental project
where, for our ClangD and Clang backend,
we're trying to do automatic modularization on the fly.
So we have this thing now,
it's called the ClangD indexer.
Upstream ClangD has some index infrastructure.
You can just run it and have a list of references
and symbol uses, stuff like that.
And yeah, it's pretty powerful,
except it works at the speed of the Clang parser,
which is pretty fast, but we are not always satisfied with that for
IDE tasks. So what we are trying to do, and it's an ongoing experimental project, is we're trying to
automatically convert headers into modules if we think that they're suitable for that.
Wow.
So, then we keep some set of hot
and often-used headers
as modules somewhere on disk or in memory,
and then reuse them as we go.
For now, it's in our fork.
It's not open-sourced.
It's not suitable to be used in an actual compiler,
because, well, in an actual compiler,
you actually care about correctness
much more than in an IDE.
Because in an IDE,
you can get an incorrect red squiggle,
and in an actual compiler,
you can just get the wrong behavior
of the program.
But eventually someday,
I dream of a world
where we can do it automatically
like even in an actual compiler.
That sounds very promising to me, for sure.
So another thing that occurred to me is that I assume
I can throw pretty much any C++-ish code into CLion.
So I use GCC at work,
and we'll use every extension we can possibly find.
We'll use GNU annotations, and we'll use inline assembly blocks.
We'll use anything that's available to us we'll start using
because we know which compiler we're using.
But obviously that's not supported by every compiler.
Sometimes the things that I'm doing won't even work in Clang,
which makes me wonder how on earth you deal with the fact
that you're using ClangD, which doesn't necessarily support everything that GCC does. And also there may be
other vendor extensions out there that other people might be using in their compiler and they
can't use CLion if it can't understand their code. So how do you decide what goes into your
sort of superset of all C++ features? Or is there a process for this?
Or how does that work?
Yeah, well, nowadays we're kind of using whatever Clang has.
And with a bit of fiddling with compiler flags
and stuff like that, and with some of Clang's compatibility modes,
you can get a lot of different behavior,
and various simulations of other compilers.
We obviously follow our users here and we see what they need, what they request and what they say that doesn't work.
We have some minor tweaks for Clang frontend to match some commonly seen extensions.
Right. Generally, I believe that,
in recent years,
Clang has caught up with most of the major compilers.
And we don't see compatibility
with existing non-standard extensions,
or older non-standard extensions,
as a major pain point for users.
Maybe those users just don't use CLion,
but generally, we don't see it as a huge problem.
Right. But I mean, you target embedded, and they tend to have a few magical keywords. But I suppose most of those you can work around, or they're already
supported by Clang, because Clang supports nearly everything. So that's interesting. But obviously
you've got multiple parsers to keep up to date too, so I could imagine it being painful.
And is that the same, for example, for new language
features? I had a few times the situation where I was, you know, using something that wasn't
even in the latest standard yet, like, you know, deducing this or some new syntax, and then CLion was giving me red squiggles, obviously. Is that basically because Clang doesn't support it yet?
Yeah, well, for new features, it's basically the same thing.
So we're working on a pretty fresh Clang.
We're kind of merging every two weeks,
and usually we're releasing a version of ClangD,
which is ahead of the latest official Clang release.
But lately it's actually become some sort of a concerning issue,
because what I feel is that Clang development has stalled a bit.
Well, it's not completely dead, obviously,
and everyone's relying on it,
but the speed at which it gets new features,
it's not like it was before, when all the new standard features were implemented there first
and then came to other compilers. And I believe this is not always the case recently.
Yeah, I noticed that too. Nowadays, often the Microsoft compiler is actually the first one to implement
something, like deducing this, for example. Whereas a few years ago, they would be the last compiler to get the latest syntax or whatever. And now they tend to be first. So that's quite interesting.
Well, it's easier for us in our in-house parsers, because, well, we have way more expertise there
compared to the Clang one.
And also it's kind of complicated
to contribute new features to Clang
when we have absolutely no expertise
and no need to implement backend support for that.
You probably can't just contribute a frontend thing
that doesn't fully work to the Clang frontend upstream.
So there was
this thing once, what we
did is we
merged the concepts branch
in our ClangD support before
it was merged to upstream ClangD.
So we tried that once,
but yeah, it was kind of painful.
We probably won't do it again.
And yeah, all of that has kind of reinforced us
in thinking that we shouldn't drop our in-house parsers for now,
and should just work on them in parallel and see how it turns out.
Because, well, especially now,
we don't want to go all in on Clang,
and we want to provide some competition for it.
So my day job relies on me having to access a remote computer
to do most of the compilation work.
I kind of work with a laptop and then everything else is remote.
And I know that CLion supports some remote execution stuff,
but that's a bit fussy for me.
One thing I've noticed is that there's a new product line coming, Fleet, which seems to be like a completely redesigned experience where it's very much a client-server thing.
And sort of first running the main part of the IDE remotely but showing the UI locally is an option.
So that's very appealing to me,
but it doesn't have C++ support yet.
Is that coming soon?
Or where are we on that?
Well, actually it does have C++ support.
Oh, then I missed it.
Darn.
I need to go and check that out.
It's kind of bare bones.
Yeah, so it's based exclusively on ClangD. We are still
using our ClangD fork, and we're still using the
indexer optimization there, but it's not
using our CLion in-house stuff. So, well, it's a bit bare bones, a bit basic at the moment.
But is this the sort of direction that JetBrains is going with their IDEs?
Is this it?
Or, you know, I mean,
I can't imagine CLion is going to go away anytime soon,
but like in the five, 10 years frame time,
is this the sort of the way forward?
Well, I don't think that CLion will go away
even in five or 10 years time.
And, well, Fleet has a different set of priorities
and a different set of trade-offs.
So it has this multi-process, remote-by-default architecture,
where multiple workspaces and multiple
code analysis engines can connect to it, with all the remote handling and stuff like that.
And, well, it kind of offers this lightweight feel
that CLion sometimes lacks. But yeah, we'll see how it goes. So for now, CLion is alive and well, and we
are not planning to switch to Fleet completely. What we're actually planning to do is to have
great remote development capabilities in CLion as well. So the team is also working on that.
And Fleet is just probably targeted
at a different user base and different use cases.
Got it.
Right, so we talked a lot about parsing and other things.
Is there any other really cool tech under the hood
in CLion that you want to talk about?
Yeah, well, there is this thing I want to mention,
which is that CLion actually has a pretty powerful
and kind of state-of-the-art data flow analysis engine.
So you see where the variables are defined,
and then what values they can get,
and how they pass between functions, and stuff like that.
So there are a bunch of powerful analyses
built on top of that in CLion,
starting from just offering to simplify logical expressions
if CLion knows that they're true or false.
But it also goes beyond that and checks some nullability stuff,
whether a local reference is leaked outside of a function,
or maybe it can even check some lifetime issues
beyond references. But yeah, it's actually a work in progress, and it requires
probably better support from libraries and the standard library. Because for the standard library,
we can hard-code all the lifetime interactions between types and objects,
but for third-party libraries, it's not possible.
So we could really use a standardized and agreed-upon set of attributes,
or stuff like that, about what's an owner, what's a pointer,
and so on.
There are some examples of that in, what's it called, the GSL library,
the support library for the Core Guidelines,
but I'm not sure it's widely used yet.
But yeah, it's a really cool piece of computer science inside CLion,
and it has a solver based on a structure
called a binary decision diagram.
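A sketch of the kind of finding such an engine can report (illustrative only; the function and the specific diagnostics are made up, not CLion's output):

```cpp
// A data flow analysis engine tracks the values a variable can hold on
// each path, which lets it prove facts about conditions.

int classify(int x) {
    if (x > 10) {
        if (x > 5) {   // an engine can prove this is always true here:
                       // on this path, x > 10 already implies x > 5,
                       // so it can offer to simplify the expression...
            return 2;
        }
        return 1;      // ...and flag this return as unreachable code
    }
    return 0;
}
```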
That's super cool.
I didn't realize that that wasn't something like Clang Tidy running.
I've seen those wiggly things in my IDE where it says,
you know, this is always true or, you know, this can't be reached.
This is unreachable code.
And I didn't realize that was something that was unique to the IDE itself.
That's impressive.
But that does raise the question: this is great in my IDE,
but like if it's just a set of annotations or wigglies in my IDE, it's no use to my other developers
who sadly, and despite my many controlling ways,
have never picked up CLion.
So we have a team that's mixed.
So is there ever an idea that we might be able to run this in CI somehow
and get the analysis report back out in an automated way?
If it just lives as a squiggly in my IDE,
it's not as useful to me as it would be as a standalone tool.
Yeah, there is this somewhat new project at JetBrains.
It's called Qodana.
I think it's now
going out of beta. And, well, it's basically a tool to run various analyses,
including JetBrains' in-house semantic analysis, on CI.
And it has a bunch of features around reports and inspection results,
and showing them nicely, like having a baseline and just showing you newly failed inspections and stuff like that.
Sadly, there is no
C++ there yet, like,
properly supported, but we're working on that.
I hope it will
appear in the
foreseeable future.
So I hope that would be the way
to use these
CLion
built-in checks. Some of them
are pretty cool, pretty powerful, on the CI.
Like, for example,
CLion also implements
a bunch of checks for automotive,
like a subset of the MISRA checks.
So it's also something
that obviously everyone can benefit
from running in CI.
So please stay tuned.
It will appear at some point.
All right.
Well, that's really cool.
Yeah, I used data flow analysis quite a few times
when I was, for example,
recently I did this C++ and safety talk
at C++ on Sea and CppNorth.
And I was like playing around with like examples
that would be UB.
And there were quite a lot of cases
where CLion said, hey, you know, this iterator might be invalidated,
or, you know, this is an out-of-bounds access
and, you know, things like that.
Yeah, I found that was very, very impressive.
But I was also really curious, like, okay, is that Clang now?
Is that like your own thing?
Yeah, maybe we are not doing a really great job of promoting that.
Like showing it's actually our shiny thing.
So it kind of goes along with all the inspections we have from Clang and ClangD and other stuff.
And Clang-Tidy, you have the Clang-Tidy stuff as well, right? So, yeah. All right. So, actually,
beyond CLion or the world of IDEs, is there anything else kind of going on in the world
of C++ now that you think is really exciting or interesting
and you're keeping an eye on?
Yeah, well, I'm kind of looking closely at the modules progress,
because, well, it's somewhat important for the IDE.
I won't say it will solve all of our problems
because, well, if half of the people are using modules,
it would be good for these half of the people,
but we still need to support another half
which doesn't use modules.
And this actually goes with all the C++ features
and things from new standards and improvements.
So if someone is using a new improvement,
we need to support both this improvement
and what was before that.
But yeah, I'm looking closely at modules, and I'm
kind of disappointed at how progress is going in the actual compilers. Because I think
the only compiler that has production-ready support for C++20 modules is Visual Studio, and the
rest of the compilers are lagging behind. So there is no cross-platform, universal support.
But yeah, actually,
when everything is implemented,
well, it's obviously very complicated,
as anything else in C++,
but the compilation pipeline
and how things are built
would make, I believe, way more sense
than it does now,
when you have to have a lot of code in your header files.
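A minimal sketch of the module model being discussed, assuming a C++20 compiler with modules support (the module name and file names here are made up for illustration, and the build-system wiring is omitted):

```cpp
// --- math.cppm: the module interface. It is compiled once into a
// binary representation, instead of its text being re-lexed and
// re-parsed by every translation unit that uses it.
export module math;

export int square(int x) { return x * x; }

// --- main.cpp: an importer loads the compiled representation, rather
// than textually including a header.
import math;

int main() { return square(4) == 16 ? 0 : 1; }
```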
Yeah, so we're kind of nearing the end of our episode here.
But before we wrap up,
is there anything else that you want to tell us?
Anything else you want to mention?
Or maybe you can tell people where they can reach you
if they want to get in touch.
If you want to reach me about
things about CLion,
probably the best thing to do
is you can either
write to
CLion support,
if it's some specific question,
or we have the CLion
IDE account on Twitter. You can write to it,
and that's probably the best way to communicate.
If you want to reach me personally,
I think there are some contact details
on the CppCast page,
and feel free to reach me as well.
All right.
Well, thank you so much for coming on the show
and talking to us about CLion
and how it works under the hood.
It was a really fascinating discussion, I think.
Yeah, thank you very much, Dmitry.
Thank you for inviting me.
Thank you.
All right.
So thank you again, Dmitry, for coming on the show.
And thank you, Matt, for being my co-host for today.
So thank you so much, both of you.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in.
Or if you have a suggestion for a guest or topic, we'd love to hear about that too. You can email all your thoughts to
feedback at cppcast.com. We'd also appreciate it if you can follow CppCast on Twitter or Mastodon.
You can also follow me and Phil individually on Twitter or Mastodon. All those links,
as well as the show notes, can be found on the podcast website at cppcast.com.
The theme music for this episode was provided by podcastthemes.com.