CppCast - Insight Toolkit
Episode Date: August 26, 2021

Rob and Jason are joined by Matt McCormick from Kitware. They first discuss a blog post on using C++20 modules with GCC 11 and Qt Multimedia support in WebAssembly. Then they talk to Matt about the history of the Insight Toolkit, some of its applications, and its role in the origin of CMake.

News: C++20 modules with GCC11; JSON for Modern C++ 3.10.0 released; Qt Multimedia has a new friend

Links: Insight Toolkit; Insight Toolkit on GitHub; 3D Slicer; CMake; itk.js; vtk.js; dockcross

Sponsors: C++ Builder
Transcript
Episode 314 of CppCast with guest Matt McCormick recorded August 20th, 2021.
This episode is sponsored by C++ Builder, a full-featured C++ IDE for building Windows apps five times faster than with other IDEs.
That's because of the rich visual frameworks and expansive libraries.
Prototyping, developing, and shipping are easy with C++ Builder.
Start for free at Embarcadero.com.
In this episode, we discuss another blog post on modules.
Then we talk to Matt McCormick from Kitware.
Matt talks to us about the Insight Toolkit Library for imaging analysis. Welcome to episode 314 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
I'm all right, Rob. How are you doing?
Doing just fine.
I don't think I have anything particular to share. How about you?
Well, I got something that's just completely random
and only kind of tangentially related to the podcast.
Sure, go for it.
I got an email from someone asking me
if I wanted to be an influencer for
their products on my YouTube channel. Can you, can you say what the product is? Is it relevant
to your YouTube channel in any way? Interestingly, and I thought about this because of our guest,
it is a standing desk company. And I would say there is a chance that this will actually go
through and it'll get featured in an episode.
I asked my patrons for their opinion first, and they're basically like, go for it.
That's, you know, you know, people, programmers need decent desks.
So we'll see what happens.
Interesting.
Which also means that, for the sake of recording these episodes, you might see me occasionally standing as well.
I do like standing desks. I need to get in the habit of using it more often. I don't have one at my home office, but I do have one at my work office.
Yeah. Well, I think one of the things that kind of excites me about this
desk that I'm looking at is it will even go four inches lower than my current desk. So depending
on my mood, I could be, like, totally slouched back, like, destroying my spine, or standing up. I need to try to not destroy my spine. That's why I need to stand up more.
All right. Well, at the top of every episode, I'd like to read a piece of feedback. Jason, this week we got this hot take from Chris on #include <C++>, saying: Hot take regarding Justin Minor's episode. I think the plethora of compilers for C++ is a weakness, not a strength. Until recently, I maintained a project that targeted Windows, Linux, and Mac. We had hundreds of lines of CMake, which was then duplicated across teams, to hide the slightly different compiler flags we had to use for GCC, Clang, and Apple Clang. MSVC was the biggest problem child, and the day I can stop supporting their front end will be a choice occasion.
I consider it a significant selling point of Rust and D that I can use the same compiler across platforms.
Fascinating.
I will just go ahead and because I have the power to do this, address this comment.
Sure.
I have had to support at least three platforms in every project that I've worked on since 2003.
Yeah.
So, like, 18, 19 years now. And Apple has consistently been the problem child for me. But the fact that we have multiple compilers has almost never really been the problem; it's been the ecosystem on the operating system that's been the problem.
Sure. Yeah. And then certainly, you know, what he's talking about, having to configure all of this, is a legitimate thing that you need to spend time doing. And I guess you wouldn't, if we only had one compiler. But as we've talked about before, you know, you get better code from running it on multiple compilers, code that you wouldn't get if you had only one compiler to work with.
Well, and I don't get the argument
of code duplicated across the teams
because that's like the whole point of CMake
is I've got a couple little modules
that set the flags appropriately for that platform.
And then everything else is identical.
Yeah.
I'm not quite sure what his point is there.
I'm not sure.
Well, I guess Chris will probably hear this episode.
We'll have a very long protracted argument
of once a week updates back and forth.
Or we can talk to him on the Discord.
This was on the Include C++ Discord.
Okay.
Okay.
We'd love to hear your thoughts about the show.
You can always reach out to us on Facebook, Twitter, or email us at feedback at cppcast.com.
And don't forget to leave us a review on iTunes or subscribe on YouTube.
Joining us today is Matt McCormick.
Matt is a principal engineer on Kitware's medical computing team in Carrboro, North Carolina.
His experience spans multiple medical, biological, material science, and geospatial imaging applications.
As a subject matter expert,
he makes and stewards open-source technical contributions
to scientific image analysis communities.
He's been coding in C++ since 2008,
when his graduate studies got him involved
in the Insight Toolkit and CMake communities.
Matt, welcome to the show.
Thank you for having me.
You're one of the lucky few who is paid professionally to
work on open source projects. I'm very honored. Yeah, I count myself lucky every day to do that.
And on the topic of having to support multiple platforms,
assuming you were paying attention during the intro there. Just out of curiosity, I know we'll
get into this later. But can you just list the platforms
that you actually do deal with on a regular basis? So on a regular basis on CI, we support
Mac, Linux, and Windows and different compilers across them, the Intel compiler too, a little bit.
And more recently, we've been supporting ARM. And also a big target too is WebAssembly. So, yeah.
So on ARM, do you support, what, Mac ARM? Do you do Windows ARM? Do you do Linux ARM? Do you do all three?
We do Python packages, so bindings around the C++, not for Windows, but for Linux and for the Mac M1 processor, too.
Windows, we haven't tried.
I haven't tried that recently.
I'm not sure if that works, but it might.
Well, if it supports all the other platforms, I'd say there's a very good chance it wouldn't be very painful.
Yeah, so I now work at Kitware, which is the company that helps steward CMake.
And you can explain this relationship a little bit more, but I do a lot of work on CMake, of course.
And that's related to the project. And so I write my share of CMake in addition to C++ code too, as I'm sure many of
you do and many people listening do too. So I feel like any question we ask you about the utility of
CMake actually helping this process might be a little, what's the word, like it's almost like nepotism, right?
Yeah, that's a little bit. I can complain about CMake like anybody else can too, but I'm a little biased.
Okay. All right. Well, Matt, we'll start talking to you more about, you know, this medical imaging stuff that you work on in a bit, but first we have a couple news articles to discuss, so feel free to comment on these, okay?
Cool.
So this first one is a
blog post, C++
20 Modules with GCC
11. Jason, you want to tell
us a little bit about this one? It seems like
it's, you know, we talked about Microsoft's
they had their own blog post about
modules, was that last week or
last episode? Here's another one
using GCC.
It's very in-depth. And this one even comments that
most of the articles that you see about modules seem to come from the Microsoft team. Which he
points out, they do seem to have the most fully featured modules implementation so far, but he
targeted GCC and got pretty far with it. My favorite part of this article, just for the record,
is it kind of teaches in the same way
that I like to teach, at least in like my C++ Weekly episodes, where the author makes a bunch
of mistakes along the way. Like, well, if you try to do this, it's not going to compile. And sometimes
I get comments from people on my YouTube episodes that are like, it seems like you had no idea what
you're doing. And I'm like, no, I was trying to make the same mistakes you would make so that you
would see what happens when you make those mistakes. And I really appreciate this
author did this because I feel like this made modules make more sense to me than any of the
other more technical articles that we've seen in the past on the show. But it's just a step-by-step
walkthrough of how you can build with modules in GCC. Nice. Is it time to make a C++ Weekly
episode on that yet, by the way?
I feel like I almost could based on this article now.
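For reference, here is a minimal sketch of the kind of named-module hello world the article builds up to, assuming GCC 11 and its -fmodules-ts switch; the file names and the add function are just for illustration, not taken from the article.

```cpp
// math.cc - a named module interface unit (file and module names are illustrative)
export module math;

export int add(int a, int b)
{
    return a + b;
}

// main.cc - an ordinary translation unit that imports the module
import math;

int main()
{
    return add(2, 3) == 5 ? 0 : 1; // exit code 0 if the import worked
}

// Build order matters: the interface unit is compiled first so its compiled
// module interface exists (GCC 11 drops it into a gcm.cache/ directory):
//   g++ -std=c++20 -fmodules-ts -c math.cc
//   g++ -std=c++20 -fmodules-ts main.cc math.o -o demo
```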
What do you think, Matt?
Have you done anything with modules yet?
I haven't.
I haven't.
I read the article.
I agree that it was a very good article.
I think it's pretty exciting.
You know, I've done multiple experiments in the past with the library that we work with.
It's highly templated.
So I've done a lot of experiments with pre-compiled headers.
And those were painful experiments
and mostly failed experiments.
So it takes a lot of work,
especially making it across different compilers,
and the benefits just weren't there.
Looking for the improved compile time,
and I think it's really exciting, the module work.
And I know a lot of folks to their credit at Kitware have been working hard on the build system support there.
And all the standards people in the community have done a lot of work to make that happen.
It's difficult, as it explains in the article, with all the header history we have in the language. But it's really exciting working in other languages
and seeing how modules and build systems and package ecosystems,
kind of that opportunity coming officially to C++ is quite exciting, I think.
So, do you have any idea what the actual status is there?
When can we
expect CMake to actually say we officially support modules or something like that? I mean,
ballpark. Yeah, I'm not sure. There is, you know, some support. I'm not sure the status of the
support in the community. Maybe you guys have a better idea too.
I've heard that, even though technically it's in 20, C++23 is when you should really start trying to use it.
I guess this article pointed out that it's not completely available
with C++20 and GCC.
Right.
Yeah, I think the attitude of C++23 is when we'll actually be able to use it
is just because people expect that's when our compilers and tools
will actually support it.
All right.
Okay, this next one is an update to JSON for Modern C++, version 3.10.0.
I think we certainly commented on this library a couple times before. This is actually the first release they've put out in a year.
And the big feature here is that in their diagnostics, if you, like, throw an exception from the library, you'll now be able to see, like, the JSON blob of data to help you debug, which sounds pretty handy.
That does sound handy. And also a GDB pretty printer is available.
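For anyone curious what that looks like in practice, here is a small sketch based on the JSON_DIAGNOSTICS switch described in the 3.10.0 release notes; the sample document is made up.

```cpp
#define JSON_DIAGNOSTICS 1      // opt-in: exception messages now carry the JSON path
#include <nlohmann/json.hpp>
#include <iostream>

int main()
{
    nlohmann::json j = {{"address", {{"street", "Fake Street"}, {"housenumber", "12"}}}};
    try
    {
        // housenumber is a string, so asking for an int throws a type_error
        int n = j["address"]["housenumber"].get<int>();
        (void)n;
    }
    catch (const nlohmann::json::exception& e)
    {
        // Without diagnostics: "type must be number, but is string".
        // With diagnostics enabled, the offending path, /address/housenumber,
        // is included in the message as well.
        std::cout << e.what() << '\n';
    }
}
```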
Very nice, very nice. Anything else you wanted to call out?
To me, one of the more interesting things here is that they've
updated their CI toolchain,
which is relevant to a conversation that
Rob and I were having off the air a moment ago,
so that it'll help them actually release,
make releases faster, if I understand
that correctly.
Yeah, fully reworked, overworked CI,
which performs a lot of checks for every commit, which should
allow for more frequent releases in the future.
Future.
Goodness gracious.
More frequent feature releases in the future.
Very cool.
Very cool.
Okay.
And then the last thing we have is this post.
Qt Multimedia has a new friend.
Qt Multimedia is now available to run on Qt for WebAssembly. And I don't think we knew
that Qt for WebAssembly was a thing. Did we, Jason? I don't recall talking about it before.
I think it might have come up briefly before. I'm not sure.
Well, you mentioned that you target WebAssembly, right, Matt?
Yes, yes. Building to it and then also interfacing with it. And it's mostly
numerical computing, but
I know a lot of projects too who've had
success with related projects
that have WebGL support
and I wasn't aware that they had... I know the Emscripten project was made for games, and they have some audio support, but I'm really impressed by the fact that that works.
Yeah. But I know you all use Qt, at least in some of your projects. Do your projects use Qt with ITK and WebAssembly, or is that, like, two different things going on here?
Right, yes. There are some projects that use Qt and ITK together, but for the web work, yeah, we usually do native web interfaces,
so HTML, JavaScript, and all the nice frameworks that are out there.
They're quite amazing in the web world.
I know Qt is very strong and very good for our desktop applications,
but the web world does have a lot to offer in that place, and it's more native.
Okay.
Well, Matt, we've kind of hinted at it a little bit,
but do you want to start off by telling us what exactly the Insight Toolkit is?
Sure.
The Insight Toolkit, it's a library for image analysis, so scientific image analysis.
It was built for medical images specifically, which are images like ultrasound images and MRI images, CT images.
And it's used for medical imaging and also things like microscopy and material science, remote sensing images.
I don't know if those are some of the projects you've been involved with, Jason, with Kitware,
but the difference is really the difference with your camera,
where you have a 2D image that might be unsigned char pixels,
and they're laid out, they have uniform sizes of the pixels.
This type of data, it's larger, it can be 3D,
and it can be oriented in different, have different sampling rates in different directions.
And so the library supports processing these images and kind of doing some traditional image processing with them.
Okay. So, no, I don't, I've never used ITK, I don't think.
So the projects that I work on tend to have relatively simple 2D
visualizations, maybe like a heat map kind of thing, but I don't think anything as complex
as what you're talking about with ITK. Yeah, so it's made for the processing, but also
the few tasks that you have, especially with these scientific images that are a little more complex,
then it does things like reducing noise.
But in many cases, you'll have to do image registration.
And that's where you're finding out the alignment between multiple images.
So you have a tumor evolving over time: what is the change of volume of that tumor? Or you have different imaging modalities and you want to compare them. And so it does that type of operation. And also segmentation, it helps with segmentation, so identifying structures in these three-dimensional images. So that's kind of the role it plays.
It's more the traditional image processing.
It's used with a lot of the machine learning AI libraries
that you have today in conjunction,
although it doesn't explicitly do that itself.
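To make that concrete, here is a minimal sketch of what a simple ITK pipeline looks like from C++, assuming ITK 5: read a 3D volume, reduce noise with a Gaussian smoothing filter, and write the result. The file names are placeholders.

```cpp
#include "itkImage.h"
#include "itkImageFileReader.h"
#include "itkImageFileWriter.h"
#include "itkSmoothingRecursiveGaussianImageFilter.h"

int main(int argc, char* argv[])
{
    if (argc < 3)
    {
        return 1; // usage: program input.nii.gz output.nii.gz
    }

    // A 3D volume with float pixels; dimension and pixel type are template parameters.
    using ImageType = itk::Image<float, 3>;

    auto reader = itk::ImageFileReader<ImageType>::New();
    reader->SetFileName(argv[1]);

    // Noise reduction: Gaussian smoothing with a sigma in physical units (e.g. millimeters).
    auto smoother = itk::SmoothingRecursiveGaussianImageFilter<ImageType, ImageType>::New();
    smoother->SetInput(reader->GetOutput());
    smoother->SetSigma(2.0);

    auto writer = itk::ImageFileWriter<ImageType>::New();
    writer->SetInput(smoother->GetOutput());
    writer->SetFileName(argv[2]);
    writer->Update(); // Update() at the end of the pipeline pulls the whole thing
}
```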
Before we dig more into the capabilities of the library,
could you just tell us a little bit about the history?
It seems like ITK has been around for quite a while.
It has been around for quite a while. So, back in the GCC 2, GCC 3 days or so in time. I've been with the project for over 10 years, but it's over 20 years old. It started in 1999. And it has a really unique and interesting history. It was started when they had the Visible Human Project at NIH. So that was kind
of like the Human Genome Project where people sequenced the entire human genome. In this case,
they imaged an entire human being from toe to head, or two actual human beings.
These were people who were on death row, and they contributed their body to science,
and they imaged them with all the different imaging modalities they have,
and did high-resolution imaging, which was an incredible data set.
But of course, just having the data doesn't tell you as much
as you'd like, so they created the Insight Toolkit, so you could get insights from those pixels, those bytes.
So when you say, like, all the different modalities, can you give us slightly more specifics? Like, we're talking, like, MRI data, CAT scan data, like actual visual image photographs, or, like, what? Like, how does that...?
Yes. So the Visible Human Project, that was MRI, which is magnetic resonance, and that's where you're looking at the response of the magnetic environment of tissues, and the water content and how it moves there.
And then there's also the CT images where you're irradiating and seeing the attenuation of the x-rays.
And then they also did slicing.
For this data set, they sliced them and then took an RGB image
of the slices at a high resolution, which is quite unique.
So we're talking, not to get too gruesome,
but actual slices of the body have been imaged and then stacked together to reconstruct the 3D visualization.
Exactly, exactly.
So those are the types of modalities, and now it's used with microscopy, which is, again, kind of a lot of 3D modalities. Or, I'm sorry, microscopy, there's all different types of microscopy methods, but in many of these cases they're using light, or different ways of using light, or higher-energy types of radiation, and looking at how the tissue or the medium interacts with that light.
So those are the different types of modalities we deal with. Also PET, that's another medical imaging modality
where radiation is coming from inside your body.
It gets injected. You might have got one of these: they're looking for cancer in your body, they'll inject you with something that's radioactive, and see where that goes and see if it concentrates, and look for signals in that data.
Okay.
Yeah.
So then, with ITK, all of this can be merged into one data set in some way?
Right, right.
And you can process it in 3D.
So back when it was started, you needed to have C++, because 3D means times-N larger data sets.
And so the system was designed to do things like stream processing
of the data sets, not loading the entire data set into memory at one time.
And multithreading and these type of operations
are important too for working with that type of data.
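A rough sketch of what that streaming looks like in code, assuming ITK 5 and a filter that supports streaming: the writer is asked to pull the pipeline in pieces, so the whole volume never has to be in memory at once. The median filter here is just an illustrative choice.

```cpp
#include "itkImage.h"
#include "itkImageFileReader.h"
#include "itkImageFileWriter.h"
#include "itkMedianImageFilter.h"

int main(int argc, char* argv[])
{
    if (argc < 3)
    {
        return 1; // usage: program input.nrrd output.nrrd
    }

    using ImageType = itk::Image<short, 3>;

    auto reader = itk::ImageFileReader<ImageType>::New();
    reader->SetFileName(argv[1]);

    auto median = itk::MedianImageFilter<ImageType, ImageType>::New();
    median->SetInput(reader->GetOutput());

    auto writer = itk::ImageFileWriter<ImageType>::New();
    writer->SetInput(median->GetOutput());
    writer->SetFileName(argv[2]);

    // Process the volume in 16 pieces: each piece is read, filtered, and
    // written before the next one, so only a fraction of the data set is
    // ever resident in memory.
    writer->SetNumberOfStreamDivisions(16);
    writer->Update();
}
```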
I'm just trying to wrap my mind around this.
So if I wanted to say I have all these data sets
that are all merged together in this Visible Human Project,
and now as the programmer or the user of the visualization in some way,
what do I just say?
I'm curious about this 3D volume right here,
and then you give me back the things that exist there or what?
Yes.
So you have a giant chunk of data bytes,
and that doesn't have much meaning.
That's just a lot of raw data.
So segmentation is the process of identifying and labeling
what I'm interested in, like this is the liver
or this is a tumor and isolating that.
And then that allows you to either visualize it
in a meaningful way or quantify, get quantifications,
which is ideal so you can really put numbers
onto what you're seeing in the data.
So then does it do like,
once you mentioned like 3D data,
like does it do something kind of like photogrammetry?
Like it has all of these things
and now it can give you a three-dimensional reconstruction
and tell you like a three-dimensional structure
or point cloud or anything like that that exists?
So the derived data structures and types you get to are important too. So it does support meshes and points, point sets, and, you know, identifying tubes. So there's a lot of great work that's done there, because vessels are important in our body. So working with those other data structures is important too.
So, yeah, I'm sorry, I just, for some reason in my head, I thought of ITK as a visualization thing, not as a data processing thing, because I've seen those initials around before, but I guess I missed the point.
Well, yeah, visualization is important. So it's the analysis side,
and there's another toolkit that came up, has the same pedigree,
the same kind of people worked on it in the beginning.
It's called the Visualization Toolkit, VTK.
So you often see together ITK, VTK.
And you'll have an ITK, VTK application.
So the most common thing would be an ITK, VTK, Qt application.
And they build off each other, right?
So the analysis, of just trying to figure out and isolate what is important to visualize, is coupled with the visualization too. So if you're trying to do visualization, you typically want to use these things together.
So is this, like, high-level enough that I can, like, throw all the data at it? You just said identifying tubular structures. Throw all the data at it and be like, show me the blood vessels in here.
Unfortunately, no.
Unfortunately, no.
You do have to have some domain knowledge and some algorithmic knowledge.
Well, there goes my plans for the weekend.
Yeah, that's okay.
That's okay.
We have a book.
We have a book that's 1,000 pages.
It's only 1,000 pages. But if you read that book, you'll be able to figure it out.
But yeah.
Interesting.
So aside from like having, you know, lots of domain knowledge, what, you know, what
does it look like using these tools to create an application?
Are lots of applications being built all the time using this or is it just kind of a handful of tools built around them?
Yeah, so most people who use the tool, the software, are using it from an end-user application.
So there's applications out there like 3D Slicer is a popular application.
It's used in research, in many research contexts, to help people who are doing research to analyze their data, quantify their data. It's also used in commercial projects, so a lot of the commercial imaging systems might use it underneath the hood for the software that they use.
But most users are using these end-user tools that are Qt-based or maybe a web-based tool,
and going down a layer from that, there's Python bindings too.
So if you want to program yourself,
people that aren't as technical are using the Python bindings.
And then there are the wizards, C++ wizards.
Of course, it's a C++ library at its core, and you can use it from the C++ context too.
It sounds like a computationally intensive system. Are you taking advantage of any parallel, cluster, GPU kinds of things?
We do. We use multi-threading; that is kind of ubiquitous throughout the toolkit, just natively. We've moved from hand-spun, platform-specific threading to the C++11 thread pool, which is kind of the standard, what we use as the default.
And then we've been looking to add more and more support for GPUs.
That's been difficult. We've tried many things in the past. Of course, you know, GPU programming is difficult to do in general.
We also have a big
focus on cross-platform support
and making things work
cross-platform. So there's some
OpenCL and
CUDA support
for GPU, but
more is needed in the future.
I'm really excited.
I haven't done any work with the C++ executor support
that's supposed to be coming,
but I've seen that and I think it might finally allow us
to do that and program it in a reasonable way
and hopefully get cross-platform support in a reasonable way.
Someday in the future, C++
and the community will help us make that happen.
I'm hoping.
I want to interrupt the discussion
for just a moment to bring you a word from our sponsor,
C++ Builder, the IDE
of choice to build Windows applications
five times faster while writing less code.
It supports you through the full development
lifecycle to deliver a single-source codebase
that you simply recompile and redeploy.
Featuring an enhanced Clang-based compiler, Dinkumware STL, and packages like Boost and SDL2 in C++ Builder's Package Manager, and many more.
Integrate with continuous build configurations quickly with MSBuild, CMake, and Ninja Support, either as a lone developer or as part of a team. Connect natively to almost 20 databases like MariaDB, Oracle, SQL Server, Postgres, and more with FireDAC's high-speed direct access.
The key value is C++ Builder's frameworks, powerful libraries that do more than other
C++ tools. This includes the award-winning VCL framework for high-performance native Windows apps
and the powerful FireMonkey framework for cross-platform UIs. Smart developers and
agile software teams write better code faster
using modern OOP practices and C++ Builder's robust frameworks and feature-rich IDE.
Test drive the latest version at Embarcadero.com.
So are you actively, you know, keeping track of the latest C++ versions
based on what you were just saying, I'm guessing?
So, you know, are you using C++ 14, 17, 20 in ITK?
Right now we require C++11, and C++14 we're going to have as a requirement soon.
But we have a large community
and many different users and many different platforms,
so we have to support the...
What's the oldest compiler that people are using out there?
GCC 2.95.
The greatest.
Oh my goodness.
It sounds like from what you said about your experience and when you got
started,
I'm guessing when you got started,
people were still using 2.95,
even though it was already out of date.
Yes.
Yes.
And then, yeah, 3 came, and it's really amazing to see. I know you've been doing C++ for even longer, Jason, but I think it's really amazing to see how much better it is now, and how much faster things have evolved, especially in the last five years or so. So, yeah, there used to be non-standards compliance across Visual Studio to a much greater degree than there is today.
And big kudos to folks like the compiler developers
who do things in a standard way and are advancing the standards now.
It's really exciting to be working with C++ and working with the community now.
Yeah, it does seem like there's much faster adoption of
newer compilers today. I know my first job, GCC 3.2, was already pretty well established as a good
solid version of GCC. But across the board, you still saw people refusing to move past 2.95,
specifically 2.95. I don't know what made that particular version so special.
I should go back and look that up at some point, but it was crazy.
So with all the platform support you just mentioned, I'm curious if people are doing
things like this kind of advanced image processing on like Android or iOS.
Yes. So that's one part of the motivations for the ARM support, too,
in terms of the interesting systems that people are targeting,
Android and related to the image processing.
One of the topics I work on is ultrasound,
and ultrasound has gotten smaller and more portable recently,
and that's really exciting.
You can go out there and buy handheld portable ultrasound systems
for less than $10,000, less than $5,000,
and power it and hopefully program it with these ARM tools.
So that's one of the potential areas for really exciting development and growth.
I saw something recently about how super portable ultrasounds could like literally
change people's lives in developing nations. Absolutely. Absolutely. You know,
it's affordable and it can be used for so many things. So, um, and it's a great opportunity for
people who want to try things and program against it.
You don't have to buy a multimillion-dollar MRI system.
You can get access to these systems, which is really exciting.
So, medical has come up a bunch. Are there other kinds of image datasets that people process with ITK?
Another big one that we're working on recently is microscopy. There's new
types of microscopy images that will generate data that is terabyte size for a single data set.
Yeah, so that there's a lot of new challenges there in terms of computing. And you know,
how do you handle that, that type of data? How do you work with that type of data? And so that's
where a lot of the new challenges, the new developments are: being able to take that data and not just collect it, like we were doing in the past with the Visible Human data, but being able to learn quantitatively from those data sets
and process them. You got to forgive my ignorance on this again. So if we can just come back to microscopy, we're talking about imaging of very small things, right? Yes, yes. But there are ambitions to
take that in aggregate and get the entire body. So there's some groups that would like to
characterize, at a microscopic level, the entire body. It's a very ambitious goal, and it's very difficult to do, but, you know, there are new frontiers that are being taken on in that sense. Yeah. So it's cells and subcellular structures is what these systems are...
Oh, wow.
Yeah, for the entire body.
Next step, transporters.
You bet.
So we're talking, you said, you know, like visible light, so microscopes or related, or electron microscopes, those kinds of things, for imaging at the very small level?
Correct. Right.
Okay. I just feel like there are other imaging technologies that we just haven't mentioned that I don't know anything about. But I don't know what questions to ask about how this could
be used in other ways. I think, I mean, there are a lot of interesting imaging modalities.
There's more and more open datasets out there, which is exciting. You know, that's a big focus
of the, not just the software we work with with open source software, but open source data sets is something that we try to advocate for
and make available in the community.
So there are more open data sets that you can get access to
to process and explore that are coming online.
For medical imaging data sets, the NIH has a great resource,
the TCIA, the Cancer Imaging Archive.
So there's a lot of datasets out there that you can actually get access to.
So are you implying that there's new things to discover in these publicly available datasets and people are still analyzing those things?
I think it's exciting. It's exciting and interesting. And, you know, medical,
the medical field, our bodies and how we work, has been fascinating in the past few years. I think a lot of students have kind of been enamored with AI. Everything is AI, and everybody in the industry is looking at AI. But maybe the enamorment there might wear off a little bit
and people should come back.
And there are a lot of interesting things in the field
and welcome collaborators and contributors to the project.
Let's talk more about that.
I mean, how large is the community working on ITK
and are you always looking for more developers to come join?
We're definitely open and welcoming new developers to join the project. We have maybe 50 contributors
every release. We do roughly biannual releases, feature releases, and patch releases after that.
But it's a broad spectrum of people, working in universities or companies, or people in the open source community, contributing.
And we welcome contributions.
We're on GitHub. And the Insight Software Consortium is kind of the group, a nonprofit consortium group, that supports the development of the toolkit.
What is your specific role on the
team? I don't think we've actually discussed that yet. So I do a lot of the release management
and supporting and stewarding development in the community. So code reviews and providing
introductions and guidance for people in the community. Is it? I'm sorry, go ahead.
That's what I spend a lot of my time doing, but it's also, you know, applying it for different applications, projects. So we do commercial projects, and we do government-funded projects from the national labs, and research grants, at Kitware.
So it's probably, is it fair to assume
you do have a fairly extensive
automated build environment of some sort?
Yeah, we moved to GitHub CI a few years ago.
It was a new challenge with C++
and a very large templated library
to make things run under the time constraints
and the resource constraints that they have.
But, yeah, the GitHub CI keeps us
on the straight and narrow
and developing confidently,
relatively confidently.
And we also use, I don't know if,
I'm sure everyone's familiar with CMake,
but there's also a dashboarding system,
CDash and CTest.
So we use all those tools.
They're part of the CMake suite of tools for our testing
and monitoring, parsing all those different warnings
and errors that we get as part of our builds.
So I'm not sure if we've talked about those before.
C-Dash?
No, I believe we've never mentioned it.
I have actually used it, but it's been a long time.
Do you guys use CTest?
I use CTest, definitely, yes.
So CTest can be used with GTest and other instrumentation,
but it outputs a specific XML format
that is uploaded to this web-based dashboard
that will parse and tell you what is the platform,
what are the different warnings,
and what are the different test failures, the errors,
what is the output, and provides a nice visualization.
You can see the open source projects that are using it.
Our hosted instance is open.cdash.org.
It's hosted by Kitware.
And you can use that.
Recently, I think CMake also added support
for JUnit output too.
So you can use that in addition,
but CDash is one option.
I think for the sake of our listeners here,
you mentioned GTest, and I only just realized that GTest support, parsing of, what am I trying to say, extracting your GTest tests, is built into CMake. And there's also a module that ships with Catch2 to do the same behavior, where you just point it at your C++ file and it'll extract the list of tests, so that you can get those straight from CTest.
Cool, cool. I'm not familiar with Catch2. What is Catch2?
Another unit testing framework.
Another unit testing framework, yeah. It's very similar to gtest.
Well, in nature, in principle, it's quite a bit different.
You mentioned that, you know, it puts out, I think, multiple releases a year. What are some of the new features being added to the toolkit after it's been around for, like, 20 years?
Right, right. Yeah, good question. So more recently, bridging with the machine learning, AI libraries is a lot of our focus. And mostly that's done on the Python level. So that's kind of how we talk to a lot of other things, is through our Python bindings.
But bridging with PyTorch, and there's actually a very nice higher-level interface for machine learning called MONAI, especially for medical applications, that makes it easier to do reproducible and effective machine learning. So we've been bridging with the MONAI community.
And then also for these very large datasets, we're talking about these terabyte-sized
microscopy images.
We do some bridging with Dask, which is a way to do distributed computing effectively
in Python too.
So those are some of the developments there.
In terms of C++, we've been, right, we're at C++14. We have some people in the community that are amazing, who do some amazing C++ work. They've taken advantage of the nice syntax that you can create with C++, modern C++. So we have these n-dimensional images, right? So we have a contributor who created a way to do a range over an n-dimensional set of pixels. So some just nice modern C++ syntax.
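Roughly what that looks like, assuming a recent ITK 5 with itk::ImageBufferRange: an ordinary range-based for loop over every pixel of an N-dimensional image.

```cpp
#include "itkImage.h"
#include "itkImageBufferRange.h"
#include <algorithm>

int main()
{
    using ImageType = itk::Image<float, 3>;

    auto image = ImageType::New();
    ImageType::SizeType size = {{64, 64, 64}};
    image->SetRegions(size);
    image->Allocate(true); // allocate and zero-initialize the pixel buffer

    // A modern-C++ range over all pixels, whatever the image dimension is.
    itk::ImageBufferRange<ImageType> range(*image);
    for (float& pixel : range)
    {
        pixel = std::max(pixel, 0.0f); // e.g. clamp negative values to zero
    }
}
```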
Yeah, so how many dimensions can there be to an image?
N, but in practice, you know, three or four.
Three or four is usually what works well without running out of RAM.
Okay, four I can get. Four gives me, like, okay. I mean, three, excuse me, three I can get, that's like a voxel or something, right? Like that's what I'm imagining when I think three, or a point cloud or something like that. What is the fourth dimension of an image?
Yes. So the library, you know, has good support there too. That's kind of unique versus a lot of the other image processing libraries. For the pixels, they can be floats. It's more common to have a float pixel when you're working with ITK, or a long 64-bit int. And so you can have that type of pixel type, but you can also have these multi-component pixel types. So that can be the different ways of looking at a piece of tissue from different imaging modalities. And what is very common now with machine learning, right, is you have many different channels. So it used to be that the multiple components of a voxel or pixel were called components, but now it's usually called channels, and those are the different ways of regressing or modeling a certain location in space.
So that's the fourth dimension.
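As a sketch of what those multi-component, multi-channel pixels look like in code, assuming ITK's itk::VectorImage, where the number of channels per voxel is chosen at run time:

```cpp
#include "itkVectorImage.h"

int main()
{
    // A 3D image whose pixels each hold several float channels,
    // e.g. the same location seen through different imaging modalities.
    using MultiChannelImageType = itk::VectorImage<float, 3>;

    auto image = MultiChannelImageType::New();
    MultiChannelImageType::SizeType size = {{128, 128, 64}};
    image->SetRegions(size);
    image->SetNumberOfComponentsPerPixel(4); // four channels per voxel
    image->Allocate(true);

    // Each pixel is a variable-length vector of four floats.
    MultiChannelImageType::IndexType index = {{10, 20, 30}};
    auto pixel = image->GetPixel(index); // itk::VariableLengthVector<float>
    pixel[0] = 1.0f;                     // first channel at that voxel
    image->SetPixel(index, pixel);
}
```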
Okay. I was offered a job, what feels like a lifetime ago now, where I would have been working for Medtronic, I know, on their toolkit. This was, it was a Qt-based job, doing visualization of the placement. And so the idea is someone's having brain surgery done. They've got special tools with 3D tracking on them, and they can overlay where in the human's body that tool is, based on a CT scan that was taken before the surgery. And I'm thinking that if I had gone down that route, in a different lifetime, I might have ended up spending the last 10 years or whatever
working with ITK and VTK and those things.
Yes, that's exactly the type of application where it gets used.
So we use a lot of care because of how we handle things like metadata.
It turns out it's really important because you don't want someone
sticking a needle in the wrong part of your brain in that type of context. So there's some sort of
responsibility you feel working on libraries like this. Yes, surgeon, could you please not put the
needle in the wrong part of my brain? I would pay extra money to have that happen, I think. That's why tests are important.
How do you actually test a system like this?
So CTest is just looking at the output.
So a lot of our tests are regression tests of a baseline image.
So we have an expected image and do some processing,
and we compare the output, and then we look at how that compares to the input. And of course you find all these wonderful things that you experience with C++ and different platforms and different processors, and especially floating point numbers, and things where, you know, you have this experience where you learn that
computers are not as deterministic as you expect them to be, which is very beautiful.
And multi-threading, how that impacts things.
That's interesting because I sometimes get an attitude from people that's basically like,
-ffast-math is good enough. You should be allowed to use that whenever you
want to. And, and then I hear things from people like, um, you know, compiler developers that say
only use fast math if you don't actually care about the results. And I'm curious now, like
in your quest for both performance and accuracy being extremely important here,
are there like optimization flags that you're like, no, you can't use that.
It breaks the system.
There are, yeah, I mean, that's a great
question. Yeah, fast math, we don't
use that, but
there are a lot of issues with floating
point numbers, especially
floats that we
encounter and have
to deal with.
For the different processors,
you get the x86 extensions.
And then also when you're working with a lot of pixels,
adding up floating point numbers,
it just becomes unstable.
That's kind of another problem that we see arise in many cases. So I ended up putting a lot of work into making sure that we can add... There's an algorithm called Kahan summation, if you've heard of this before.
I think I have, actually, yeah. But no, please explain it.
I mean, it's something that, you know, maybe folks may encounter, where, you know, a floating point number only has so much precision. And if you have a number and you're adding a small number to a sum that you're summing up, of course, that sum, its precision, it doesn't bring in the precision of the smaller number in a very effective way, essentially. And there are techniques that you can use to avoid that causing issues in your results, losing precision there. That's something that we encounter in practice. Another issue that we encounter a lot in practice is when you're doing things like summing
or doing that computation with multiple threads. Depending on how those operations occur on your system, you might have many different threads, and then you get a different result just depending on how many threads you're using, even if it's the same input data, the same operation. So those are the kinds of issues that we encounter.
Do you, well, I know you said you have regression tests, but I just am imagining this world now: you're working on it, someone adds a new bit of math, and you have to call into question, like, every summation or multiplication in floating point that they do. You're like, no, no, no, no, no. Like, is it that bad?
It's not like a lot of these nuclear simulation codes where you cannot lose any energy in the system.
The answer is really, it depends.
It depends on the task that you're trying to do, how critical it is.
Sometimes a little wiggle room is okay.
Okay.
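For reference, here is a minimal sketch of the Kahan, or compensated, summation technique Matt mentioned. This is a generic illustration of the idea, not ITK's actual implementation: the small error term that would otherwise be rounded away on each addition is carried forward into the next one.

```cpp
#include <vector>

// Compensated (Kahan) summation: keep track of the low-order bits that are
// lost when a small value is added to a large running sum, and feed that
// error back into the next addition.
double kahan_sum(const std::vector<double>& values)
{
    double sum = 0.0;
    double compensation = 0.0; // running estimate of the accumulated rounding error
    for (double value : values)
    {
        const double y = value - compensation; // correct the incoming term
        const double t = sum + y;              // big + small: low bits of y get lost here...
        compensation = (t - sum) - y;          // ...recover exactly what was lost
        sum = t;
    }
    return sum;
}
```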
I feel like we should ask you a little bit about CMake since we have you here and you work on that too.
What does the CMake build of ITK look like?
Well, of course, we use CMake.
We advise it strongly.
It looks pretty good.
You can blame me for CMake to some degree, or thank me, I think, depending on who you're talking to. But something that we are pretty proud of is that CMake actually came out of ITK many years ago.
Yes, yes.
So the project was started with this visible human goal,
and CMake was just the itch scratching for allowing different users
to build the code across platforms.
And it's just, the open source project started from there, and then it finally made its way to its use today.
So if I go and look at the way CMake is used in ITK,
is this like the premier example that I should duplicate in all of my projects?
No, don't.
Yeah, we need to do some modernization, too, of our CMake as it evolves, too.
But do look at the CMake docs from Kitware as an example of what to do.
I did have one more question.
I think we're getting about close to wrap up here,
but you mentioned that your Python libraries are a large part of this.
How do you create Python bindings for C++?
Do you use any tools or is it all like hand spun or what?
Yeah, that's another great point and something that folks might be interested in.
Brad King, who leads a lot of development of CMake now, he worked on ITK for a while.
He helped develop some of the tooling for our Python bindings. That's very useful. It used to
be a project called GCC XML
because it was using GCC.
Now it's using Clang, and it's called CastXML.
And so that's something that parses C++
and generates an abstract syntax tree in XML,
and it can be useful for many different things.
So we use that for generating our Python bindings
as a very large toolkit.
They're more or less auto-generated
in that way with CMake code.
A lot of CMake macros
in this tool and some
Python. And then we generate SWIG bindings and use that there.
CastXML is a great tool
folks should check out
and use.
I have never heard of that.
Yeah.
That's interesting.
Okay.
Well, it was great having you on the show today, Matt.
Where should people go to go and learn and find out more about ITK,
and where can they find you online?
Great.
Yeah.
Thank you so much for having me.
Yeah, if you want to learn more about ITK and use it or contribute to it,
very welcome.
Go to itk.org, and that's where you can find the project.
You can find me at Twitter or GitHub at T-H-E-W-T-E-X is my handle there.
And I look forward to talking to people.
Okay.
Thanks so much.
Thanks a lot.
Thank you.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in, or if you
have a suggestion for a topic, we'd love to hear about
that too. You can email all your
thoughts to feedback at cppcast.com.
We'd also appreciate if you
can like CppCast on Facebook
and follow CppCast on Twitter.
You can also follow me at @robwirving and Jason at @lefticus on Twitter. We'd also like to thank all our patrons who help
support the show through Patreon. If you'd like to support us on Patreon, you can do so at
patreon.com slash cppcast. And of course, you can find all that info and the show notes on
the podcast website at cppcast.com. Theme music for this episode was provided by podcastthemes.com.