CppCast - Think-Cell Ranges
Episode Date: January 25, 2018

Rob and Jason are joined by Arno Schödl to talk about the work he does at think-cell with C++ and their custom range library.

Arno Schödl, Ph.D. is the Co-Founder and Technical Director of think-cell Software GmbH, Berlin. think-cell is the de facto standard when it comes to professional presentations in Microsoft PowerPoint. Arno is responsible for the design, architecture and development of all our software products. He oversees think-cell's R&D team, Quality Assurance and Customer Care. Before founding think-cell, Arno worked at Microsoft Research and McKinsey & Company. Arno studied computer science and management and holds a Ph.D. from the Georgia Institute of Technology with a specialization in Computer Graphics.

News:
- Pacific++ 2017: Sarah Smith "Postcards from the Cross-platform Frontier"
- Outcome v2 Boost peer review begins
- CppCMS C++ Web Framework version 1.2.0 released under MIT license
- Spectre mitigations in MSVC

Links:
- think-cell range library
- think-cell Talks and Publications
- think-cell Funds the Working Group for Programming Languages of the German Institute for Standardization (DIN)
- think-cell Sponsors the Standard C++ Foundation
- think-cell C++ Jobs
- Work Life at think-cell

Hosts: @robwirving @lefticus
Transcript
Discussion (0)
Episode 135 of CppCast with guest Arno Schödl, recorded January 24th, 2018.
In this episode, we talk about the Outcome V2 Boost peer review.
Then we talk to Arno Schödl, Technical Director at think-cell.
Arno tells us about their custom ranges library and more. Welcome to episode 135 of CppCast, the only podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
I'm doing pretty good, Rob.
Did you survive the snow last week?
I did survive the snow.
My whole family was trapped for three days with five inches of snow outside.
Not completely trapped.
I was able to go out, but the schools were all closed and everything.
Yeah, five inches is a pretty big deal down here.
Yeah.
Although, I mean, are you in a part of the country where there's lots of four-wheel drive pickups and stuff around and Jeeps?
There's definitely lots of pickups and lots of people driving SUVs.
Yeah, I got a Subaru, so I was able to go out.
But, you know, black ice is still pretty scary even with four-wheel drive.
Yes, I was thinking that, actually, when I was out in the snowstorm that we got. You see owners of four-wheel drive vehicles, I won't mention any particular brand names or anything, that drive as though somehow four-wheel drive is magic and it will prevent you from sliding and whatever. It's not really how it works.
No, not at all. Anyway, at the top of our episode,
let's read a piece of feedback.
This week we got an email from Sebastian in Germany.
He writes,
Dear Rob and Jason,
First of all, thanks for the great podcast.
I'm following it quite often.
I'm looking for a link in the news section of an episode,
but I can't remember which one it was.
Maybe you can remember and help me.
You told about a speaker at a conference
who built a Qt app with Qt Creator in one hour
and deploying it on various platforms,
Android, Mac, Windows, Linux.
I hope you know which one I mean.
Thanks a lot in advance.
Jason, this sounds like something from Pacific++.
Yes, that was one of the talks from Pacific++.
It's up on YouTube,
and it should be pretty easy to find, since there were only 10 talks at that conference or whatever.
Yeah, I think we talked about this in the episode when you were still down in New Zealand.
Yes, I believe I was sitting on the floor of the kitchen of an Airbnb that I happened to be staying in at the time.
Right. Well, we will find the link to that and put that in the show notes for you, Sebastian.
But yeah, I need to watch that talk too, because it sounded really interesting.
She deployed it to multiple mobile platforms, right?
I don't remember how many actual deployments she got working live, but definitely demonstrated
that it was Android and iOS, yeah, and then was running it locally also.
So it was, you know, there's a lot of live coding.
It might be one of those videos that you watch at 2X or something.
But, yeah.
Okay, well, we'd love to hear your thoughts about the show.
You can always reach out to us on Facebook, Twitter, or email us at feedback at cppcast.com.
And don't forget to leave us a review on iTunes.
Joining us today is Arno Schödl.
Arno is a co-founder and Technical Director of think-cell Software.
think-cell is the de facto standard when it comes to professional presentations in Microsoft PowerPoint.
Arno is responsible for the design, architecture, and development of all our software products.
He oversees think-cell's R&D team, quality assurance, and customer care. Before founding think-cell, Arno worked at Microsoft Research and McKinsey
and Company. Arno studied computer science and management and holds a PhD from the Georgia
Institute of Technology with a specialization in computer graphics. Arno, welcome to the show.
Hello. Wow, that's a lot of hats. Thank you very much for having me. Yeah,
certainly. Yeah, that's a lot of hats that you wear at your company today. Well, I mean, basically, we're
just two founders, right? So one guy is more, we have two computer scientists, in fact, so it's all
run by computer scientists. But there's just one guy who's doing the technology side, which is me.
And then there is Marcus, who's doing more of the managerial side. Everything else, basically, but development.
Wow.
I'm kind of curious what you did at Microsoft Research.
That's, from what I understand, an interesting world to be in.
Yes.
I did my PhD at Georgia Tech, and I did computer vision at the time.
So it was really related to my PhD. I actually had, you know, a PhD is really relevant for the real world, so I had a fish swimming in front of a green screen for a while, recorded that, and then you were able to steer the fish with a mouse through your fish tank. You know, these are the kinds of relevant things you do for a PhD. It wasn't quite all like that, but I mean, part of it was.
And then after the PhD, or actually during my PhD, I thought, okay, before I stick to computer science for the rest of my life, I would like to see the real world. And then I went for an internship to McKinsey,
and that's where the idea of think-cell came about,
because there were many people, well-paid people,
they were all hunched over laptops,
and they were pushing around PowerPoint shapes.
And I was kind of dumbfounded.
We were, in my PhD, using really smart algorithms to solve problems like the fish tank.
And here there were people who were having a tough time pushing around PowerPoint shapes.
And that seemed odd.
So we started working on that.
That created the idea of making a tool for PowerPoint that does it a bit smarter than PowerPoint alone.
Okay, so we'll definitely talk more about think-cell in a little bit,
but first we have a couple of news articles to go through.
Arno, feel free to comment on any of these, okay?
All right.
So this first one, the Outcome v2 Boost peer review has begun.
So obviously we've had Niall Douglas on the show twice before,
and last time we had him on we were talking
about the Outcome Boost review, which obviously did not get accepted, because we're now talking about Outcome v2. So hopefully he'll have some better luck this time. It seems like he really kind of trimmed down the library. It says he got rid of anything that even smelled like monads, right?
Yeah. Is there anything else you wanted to highlight with this?
I do think it's funny that at the top he says,
I'm sure r/cpp is sick of hearing about Outcome
and Expected by now.
So the last time we had him on,
shortly after that we had Charlie on
and we talked about the review process.
And so this is Niall, and as far as I can tell,
again, Charlie is doing the review for V2 again,
which is just, it seems like an astounding amount of work to be involved in this stuff,
and the amount of comments you have to sift through.
Well, yeah, I mean, Charlie, I think, said he went through like 300-some comments, if not more.
At least, yeah.
And that's over just like a 10-day period, I believe.
It's pretty crazy.
Have you done any work with the
Boost review or
Boost libraries, Arno?
Well, we've done actually
well, we contributed a bunch
of things to Boost, but nothing
in the scale
of an actual review.
So I was always a bit scared about
actually doing a review. In particular,
if we have our own range library or something, contributing that to Boost would probably involve a lot of work.
Yeah, it definitely seems like it's a lot of work
to do big contributions like that.
There is, in this Reddit discussion,
someone talking about a proposed feature to C++,
and now I'm scrolling through and I can't recall it.
Oh, here it was.
It's something about return types from functions
and I don't know, whatever.
But Niall's comment is that,
yes, there's currently a proposal to do this,
but it'll be at least C++23
before we could expect to see any movement on that.
Yeah, and I think the committee is probably even,
that's even more work than doing a Boost contribution.
Oh, right.
Yeah.
Okay, this next thing we want to talk about
is CppCMS,
Content Management System, which I believe is what CMS stands for.
And we've kind of talked about this type of thing before
and whether CppCast could move to a C++-based web management system, because right now we do everything with some JavaScript project. I don't really know too much about it, but it works.
But yeah,
did you look at this in
any detail, Jason? Do you think it's something we
could possibly use? I looked
at the history and
such, but I didn't look at the actual feature set,
honestly. Yeah, I'll have to look into this a little bit more to see if it's something that
maybe we could switch to, because I'm sure building the CppCast website with C++ would be nice.
Yeah, I did notice that they just updated their automated build systems, and I had to do a double take because I wasn't sure when that post was written, because it says they just upgraded from Windows XP to Windows 7 on their automated build system, and I'm like, wait. Yes, that was written in December of 2017. But at least they're getting their build system updated, and updating all their compiler toolsets and stuff. And they just changed the license to MIT.
Yeah, I was going to say that's the big change
that they're announcing here,
that they're moving from LGPL to MIT license
so commercial projects can use it,
which is great.
Well, there's also Boost.Beast that came out pretty recently, right, in Boost 1.66, which is also basically an HTTP server tool, this time in Boost. I think it's Boost.Asio based.
Yeah.
We talked about that on the show a while ago.
But this CMS
is more like a CGI gateway
kind of thing, not an HTTP
server.
Mm-hmm.
Okay, and then
the next thing we have is
obviously we're talking about Meltdown and Spectre
a lot two weeks ago.
There's still more
content coming out with people
trying to address these vulnerabilities.
The Visual C++ blog
wrote this post on
Spectre mitigations in MSVC
and apparently they
are putting out an update
that'll have a new flag, /Qspectre,
which will help with the Spectre variant one exploit.
I don't recall that there were two different types
of vulnerabilities within Spectre,
but it looks like this is fixing the bounds check bypass.
So if you are working on sensitive code,
then you might want to update your Visual Studio compiler if that's what you're using,
and take a look at the /Qspectre flag
and see if you can use it
and see if it affects your performance in any way.
I think they're saying in most cases
it's not affecting performance, though.
Yeah, go ahead. I'm sorry.
Yeah, just, I mean, actually,
today we actually flipped the switch and
tried it, and I don't know. I couldn't see
any difference. But, I mean, what's kind of
special about this security flaw is that any
mitigation you do is going to be
imperfect by nature, right? Because
the flaw is so deeply seated in the processor
that you can kind of fix known exploits of that particular flaw, but you're not going to fix the underlying flaw itself, really.
Yeah, it's ridiculous. Although I noticed that in the Microsoft-specific version of this, they're adding one single instruction called LFENCE, and I didn't even know that this CPU instruction existed on the x86 architecture. I never heard of it before.
Yeah, I'm not familiar with it either. But that's the only difference in their output assembly example between using /Qspectre and not using it: the addition of the LFENCE. Yeah.
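For readers who have not seen the variant 1 pattern, here is a minimal, illustrative sketch of the kind of code the /Qspectre flag targets. The function and data names are invented for illustration, and the comment marks where the compiler would place the LFENCE speculation barrier; this is a sketch of the attack shape, not Microsoft's actual example.

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative Spectre variant 1 (bounds check bypass) gadget.
// The branch predictor can speculatively run the body even when
// idx >= len; the out-of-bounds load then leaves a secret-dependent
// footprint in the cache. A speculation barrier after the check
// (an LFENCE on x86, which /Qspectre asks MSVC to insert) stops
// execution from racing ahead of the comparison.
uint8_t read_checked(const uint8_t* table, size_t len,
                     const uint8_t* probe, size_t idx) {
    if (idx < len) {
        // Mitigation point: _mm_lfence() from <emmintrin.h> would go here.
        uint8_t secret = table[idx];
        return probe[secret * 64];  // secret-dependent memory access
    }
    return 0;
}
```

The architectural result is always correct (the out-of-bounds branch returns 0); the problem is purely the microarchitectural side effect of the speculative load.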
Okay. Well, Arno, can you tell us a little bit about what you do at think-cell?
We talked about it a little bit in your bio and PowerPoint and everything, right?
Sure.
So basically the idea was to automate this shape pushing, right, on PowerPoint.
And it's kind of funny that being in 2017,
of course, with word processing,
we kind of take it for granted
that whenever we type text,
that the words don't stick to the page
wherever we type them,
but they get automatically word-wrapped.
Now, this doesn't really happen in PowerPoint, right?
So if you put a shape,
or in any other drawing program for that matter,
so if you take a shape and you put it on the slide,
it just stays where it is.
And whenever you want to change something,
you have to rearrange all the rest that's also on the slide.
So the basic idea that we started think-cell with,
and that's been a while ago,
is that you basically want to be able
to change objects on the slide
or add objects, remove objects,
and your general layout stays stays stays good stays
automatically adjusts automatically um to to all the changes you do um and we actually started off
uh with that idea but then uh we decided well okay what we're going to need first is some some of
these these these elements that we're going to automatically layout.
So these could be charts.
And then pretty much we did charts for 15 years, or maybe 10.
And only now, really, we are getting into having the first release
with some of that automated layout technology
being actually effective for the user.
So we have the first elements
that are actually kind of floating on the slide
and are adjusting their positions automatically,
and you can specify how you want them
to relate to other objects.
Of course, that's tricky to do.
I mean, it's mathematically challenging.
It's challenging from a user interface perspective
because the user has to tell the system what he actually wants. So this turned out not to be an
easy problem, but I think we're getting there. And it's been interesting all the way.
So the idea is just to automate people's workflow when using PowerPoint so they don't have to do
as much by hand manipulation of a slide, that type of thing?
Right. So you would basically, it's a bit like Lego, right? You put your first element on the
slide, and you snap your next element next to it, and you decide whether you want a gap or not,
and something below and something above, and then you fill in your content. And the layout
should stay in such a way that you could actually show
this to your audience at any time without actually doing manual adjustments. That's the idea.
So it's a plug-in?
It's a plug-in, right.
Okay, just making sure I was getting that. And all of this work for this plug-in is written in C++?
Yes. So, I mean, basically everything we do is in C++, because we built a large library over time, and the best way to leverage that is just to build everything in C++. We have a bunch of build scripts in Python; that's about it, I think, that's not C++, really.
That seems like an interesting choice. I would expect most things that are Office plug-ins are written in C# or something in the .NET family.
Right, and most people actually do write these things in C#, and it turns out to be a terrible choice. Well, no, it's a disaster, really. I mean, I'll tell you the story.
It's terrible.
So the problem is, or fact of the matter is,
the Office APIs are all COM.
Office is written in C++, right?
So first of all, you're not gaining any functionality
if you're going to .NET,
because all you are talking to,
basically, on the other side, is always COM,
so you're going to do all this interop, and you're just losing performance.
Okay, so far so good.
I mean, you can still like C# and write your things in C#.
But the problem is that COM is based on reference counting.
And when the reference count falls to zero, your object gets destroyed, right?
So you get deterministic destruction
as soon as you release the last pointer to your object.
Now, C# is built on .NET
and is garbage collected.
And thus, all that destruction is only happening,
at least by default,
if you don't jump through many hoops,
when the garbage collector is running.
Now, it would be nice
if Office would be written in a way
to be able to deal with this
non-deterministic destruction. It is not.
So, you have a
C# add-in, and you are getting
a PowerPoint shape from PowerPoint,
and do something with it, and you release it.
Or you let
the pointer go out of scope.
Well, in C++ parlance.
Now, the garbage collector, of course,
hasn't run right away.
So no Release happens in PowerPoint,
and that pointer is still being held.
And that makes the shape
that the pointer was pointing to
kind of live on as a zombie.
And when you're doing calls to that shape,
you get funny behavior.
You get strange errors.
You can't do much with this thing anymore.
And it's not like this only
applies to the shape that
has gone out of scope. I mean, you don't have any access to it
anyway. But if you then go,
basically you have one C-sharp add-in
which does that, and then you have
our add-in, which accesses the
regular PowerPoint shape collection. It just gets a fresh pointer. Even that fresh and then you have our add-in, which accesses the regular PowerPoint shape collection.
It just gets a fresh pointer.
Even that fresh pointer that you're getting
is going to point to that zombie.
So what we have to do, in fact,
because there's so much C# code out there,
that when we are detecting this kind of problem
and there are certain telltale bugs or telltale errors
that are thrown from the COM API,
then we actually enumerate all the .NET app domains
and trigger the garbage collector just to kill the zombies.
And then we release all our pointers again and restart.
And you really need to do that.
So, I mean, don't get me wrong.
There is a way to write correct C# code
that interoperates with Office.
But all the convenience of garbage collection
is pretty much gone.
You have to do explicit releases.
So you have to be careful that there are no temporaries created anywhere, because as soon as you have a temporary, this thing will go to the garbage collector and you have a problem.
So you have to explicitly name everything,
and then you have to explicitly release everything. And that's a disaster to program in. It's terrible. And, of
course, there are tons of code out there where they don't get it right. And it's very hard to
audit this kind of code. So, I mean, if you want to take my advice: if you want to do any serious Office development, by no means use C#.
Never, ever.
It's terrible.
It's really that bad.
So this is, if I were to try to bring this back into, I don't know, I guess the C++ kind of world,
this would be like having to manually call the destructor on every single object.
Does that sound right?
That sounds about right.
Yeah, that doesn't sound like fun.
No, it doesn't.
Not at all.
And you get it wrong.
Wow.
I imagine you get it wrong regularly.
That's the point of destructors and deterministic lifetime.
Exactly.
So all these benefits of RAII are just out the door at that point.
Wow.
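To make the contrast concrete, here is a small, self-contained sketch of COM-style reference counting with RAII. `RefCounted` and `ComPtr` are hypothetical stand-ins, not the actual Windows or think-cell types; the point is only that Release() runs at a deterministic moment, exactly when the last smart pointer leaves scope, which is what a garbage collector does not guarantee.

```cpp
// Minimal sketch of COM-style reference counting with deterministic
// release via RAII. All names here are illustrative.
struct RefCounted {
    int refs = 1;
    bool* destroyed;                 // lets callers observe destruction
    explicit RefCounted(bool* d) : destroyed(d) {}
    void AddRef()  { ++refs; }
    void Release() { if (--refs == 0) { *destroyed = true; delete this; } }
};

// RAII wrapper: Release() runs exactly when the pointer goes out of
// scope -- the deterministic destruction C++ gives and a GC does not.
class ComPtr {
    RefCounted* p_;
public:
    explicit ComPtr(RefCounted* p) : p_(p) {}
    ComPtr(const ComPtr& o) : p_(o.p_) { p_->AddRef(); }
    ComPtr& operator=(const ComPtr&) = delete;
    ~ComPtr() { p_->Release(); }
    RefCounted* get() const { return p_; }
};

bool demo_deterministic_release() {
    bool destroyed = false;
    {
        ComPtr p(new RefCounted(&destroyed));
        ComPtr q(p);            // refcount is now 2
    }                           // both released here, count hits 0
    return destroyed;           // true: the object died at scope exit
}
```

In the C# situation described above, the two Release calls would instead happen at some unknown later time, when the garbage collector runs.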
One thing we noticed on your website is that it says even your customer portal is written in C++.
Is that true?
That is true.
We started off a while back with, or when we started the company, we started off with ATL Server.
That moved on from there.
But yes, I mean, we built a big library, and a lot of that stuff that we built is not really specific to anything.
I mean, you can find the public part of the library on GitHub.
But it just made sense for us to leverage that also for the web server.
Because really, I mean, when you're writing C++,
there is this certain strictness about writing your code,
and that carries over well to writing your web server.
I mean, you have a very stable web server.
It works great.
And yeah.
So it also is written in C++.
So I am
kind of curious. It seems
like the Windows
APIs, and I don't know
if it's true with the Office
APIs too, are not exactly known
for modern C++
techniques. Do you have any difficulty integrating your modern C++ techniques with the Office APIs?
Well, I mean, yes, there is a fair share of glue.
So when you have a collection,
I mean, you want to express that collection
as a range in C++.
So of course we have glue for that.
But really, at the end, you just have objects and calls to these objects' methods,
and there's not too much more to that.
It works okay.
It's not really a problem.
And you just have your glue code
and then you write modern C++ on top of it.
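As a rough illustration of such glue, the sketch below wraps a COM-style 1-based `Count()`/`Item(i)` collection in a minimal iterator pair so it can be consumed as a C++ range. `FakeShapes` and `ShapeRange` are invented names standing in for the real PowerPoint interfaces and for think-cell's actual glue code, which is not public here.

```cpp
#include <string>
#include <vector>

// Stand-in for a COM collection such as PowerPoint's Shapes:
// a Count() method plus 1-based Item(i) access.
struct FakeShapes {
    std::vector<std::string> names;
    long Count() const { return static_cast<long>(names.size()); }
    std::string Item(long i) const { return names[i - 1]; }  // COM is 1-based
};

// Glue: expose the Count()/Item() interface as an iterable range.
class ShapeRange {
    const FakeShapes* c_;
public:
    explicit ShapeRange(const FakeShapes& c) : c_(&c) {}
    struct iterator {
        const FakeShapes* c; long i;
        std::string operator*() const { return c->Item(i); }
        iterator& operator++() { ++i; return *this; }
        bool operator!=(const iterator& o) const { return i != o.i; }
    };
    iterator begin() const { return {c_, 1}; }
    iterator end()   const { return {c_, c_->Count() + 1}; }
};

// Modern C++ on top of the glue: iterate the COM-style collection.
std::vector<std::string> collect(const FakeShapes& shapes) {
    std::vector<std::string> out;
    for (const auto& name : ShapeRange(shapes)) out.push_back(name);
    return out;
}
```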
Okay. So you mentioned ranges right
there. Do you want to tell us a little bit
about the ranges library that
think-cell made?
Sure. So
the ranges library, part of that is actually
public on GitHub.
And there
are, I mean,
the main, I think, driver for ranges is currently Eric Niebler.
And he has, of course, his range-v3 library.
And so we grew out of Boost.Range.
We kind of grew our own library and use it throughout our code.
So it's been well used and it gets adjusted all the time.
We have the freedom to change the interfaces still
because it's kind of a monolithic thing.
We can have the library
and we have the code base
and we can change the library
and then change the code base along with it.
We actually have one guy who does nothing else but code refactoring.
So that's pretty comfortable.
Or it's a comfortable way to build a library,
really. It's a luxury
that not many people have.
So we don't have a stable interface.
It won't change dramatically,
but it's okay.
Now, it's
a bit different from...
While building that library, we made some choices that are a bit different from the choices that range-v3 made, most of them growing out of pragmatic solutions to problems we had.
One more ideological difference
is we don't have a pipe operator,
which I think is the telltale sign for Eric's library,
just because we think that it's a special case solution
for a general problem of how to call functions.
I do like the uniform function call proposal from Herb Sutter,
but I don't think it's going to...
There's no sign of it being adopted, if I'm informed correctly.
Now, there are a few other more pragmatic differences.
The way that the range library deals with R-value containers.
So you say you're generating a container
and want to stick that right into an adapter,
in a filter adapter.
You want to filter a container that you just created.
Now, in Eric's library, that wouldn't really work, because when you're referencing this container that you just created, as soon as you end the statement, you would get a dangling reference to that container, because the container is a temporary, right?
Now, the solution that we have is that we actually aggregate the container
into the adapter. So, that way, kind of the transform of the filter becomes a container
rather than a view on a container. And I think Eric has ideological or has qualms with this solution because suddenly what you may think is a view on some other
or a reference to another container
suddenly kind of becomes a container itself,
which may be theoretically not as sound as you may want it to be.
In our experience, it's not really a problem in practice.
So that's the solution we
adopted. And
I think it avoids
the programmer having
to resort to more
roundabout ways of writing that same
piece of code. So, for example,
if you want to return something from a function,
which is this kind of filtered
container, I mean,
where would you put the container if you filtered it?
You would have to kind of return two things.
One is the actual filtered range, and the other one is the container,
because they don't aggregate with each other.
So that's one difference.
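A minimal sketch of the aggregation idea might look like this. `OwningFilter` is a hypothetical type, not think-cell's actual adapter, but it shows why returning a filter over a temporary container from a function is safe once the adapter owns the container.

```cpp
#include <utility>
#include <vector>

// Sketch of the aggregation idea: when the adapter is handed an rvalue
// container, it moves the container inside itself, so the filtered
// range cannot dangle. OwningFilter is an illustrative name.
template <class Container, class Pred>
class OwningFilter {
    Container c_;   // aggregated by value: the adapter IS the owner
    Pred pred_;
public:
    OwningFilter(Container&& c, Pred pred)
        : c_(std::move(c)), pred_(std::move(pred)) {}

    template <class F>
    void for_each(F f) const {
        for (const auto& x : c_)
            if (pred_(x)) f(x);
    }
};

// Safe: the temporary vector is moved into the adapter, so the caller
// can iterate the result long after this function returns. There is no
// second object to carry around alongside the filtered range.
OwningFilter<std::vector<int>, bool(*)(int)> evens_up_to(int n) {
    std::vector<int> v;
    for (int i = 1; i <= n; ++i) v.push_back(i);
    return OwningFilter<std::vector<int>, bool(*)(int)>(
        std::move(v), +[](int x) { return x % 2 == 0; });
}
```

With a pure view, `evens_up_to` would have to return both the vector and the view over it; here the move of the vector is cheap (a few pointers), however large its contents.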
There are a bunch of subtleties with transforms
that may be a bit too detailed to go into right now.
There is a completely different type of range that we support.
We call them generators.
They're basically the visitor pattern in generator disguise,
or in range disguise.
So if you have a function that just enumerates the elements,
that just sticks these elements into a functor,
then on that kind of function you can implement
many of the adapters,
in particular filter and transform,
or things like concatenation.
So you suddenly have ranges
that don't have iterators anymore.
They just enumerate their elements,
and you can still use the same kind of algorithms
that you would use on regular ranges.
That's kind of nice because, you know,
you have sometimes just generating or creating a container
that passes out iterators can be clumsy.
It requires a lot of work,
but writing something that just enumerates its values
is often much quicker to write.
And you can use it with some of the same kinds of algorithms
and some of the same kinds of adapters
that you would use on regular ranges.
So that, I think, is very useful in the code.
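A tiny sketch of the generator idea, with invented names: the "range" is just a function that pushes its elements into a sink functor, there are no iterators anywhere, and a filter adapter composes simply by wrapping the sink.

```cpp
#include <vector>

// A "generator" range: instead of handing out iterators, it pushes
// each element into a functor (the visitor pattern in range disguise).
template <class F>
void generate_squares(int n, F sink) {
    for (int i = 1; i <= n; ++i) sink(i * i);
}

// A filter "adapter" for generators: wrap the sink with the predicate,
// then run the generator. No iterator machinery is needed.
template <class Gen, class Pred, class F>
void filtered(Gen gen, Pred pred, F sink) {
    gen([&](auto x) { if (pred(x)) sink(x); });
}

// Compose: squares 1..n^2, filtered down to the odd ones.
std::vector<int> odd_squares(int n) {
    std::vector<int> out;
    filtered([n](auto s) { generate_squares(n, s); },
             [](int x) { return x % 2 == 1; },
             [&](int x) { out.push_back(x); });
    return out;
}
```

Writing `generate_squares` took three lines; an equivalent iterator-based range would need begin/end, increment, dereference, and comparison boilerplate.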
So in your version, you said that the range is an owning thing. It is the container, right?
Not always. I mean, it can be, if you pass it an rvalue.
Okay, okay. I was just curious, then, whether it was always making copies when you were doing the filters. But it sounds like no.
No, that would be terrible.
But it gives you at least the option.
See, with range-v3,
you kind of have the option of having a lazy view on a range,
or if you want to actually work with an R value,
of course you can execute whatever filter or something you have eagerly.
So these are kind of the choices you have.
I think he calls it views and actions.
Now, the thing is,
I think these two concepts are really orthogonal.
So you can have
laziness on containers,
and of course you can also
eagerly sort a range.
So you have two concepts that don't have much to do with each
other, and I think they should be really treated separately.
It's certainly when you're creating a vector
and if you only need the first element of that vector
and you have a filter on that vector,
it's certainly not efficient to filter the whole vector.
It's much more efficient to aggregate the whole vector
as large as it may be.
It doesn't matter.
It's a pointer at the end.
Just into your actual filter, and then return that,
and see whoever consumes this thing can then decide what to do with it.
And if you only need the first element,
you just filter until you get the first element,
and then you throw away the whole vector.
So that is certainly more, yeah, I think it's a good way to do it,
and I think you need this kind of way.
If you need it in the syntax, we do it.
For us, it makes no difference.
You write tc::filter, no matter if you have an rvalue or an lvalue.
Maybe people feel uneasy about that and they want something more explicit.
I think that's fine, but I think you should have the option of aggregating
your container into the range.
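The point that laziness and ownership are orthogonal can be sketched like this: a hypothetical helper (not think-cell's API) that owns the vector it was given, yet evaluates its predicate lazily and stops at the first match, so the rest of the container is never inspected.

```cpp
#include <optional>
#include <vector>

// Owns the (possibly huge) vector, but is lazy about the predicate:
// it stops at the first match instead of filtering everything first.
// Ownership and laziness are independent choices.
template <class Pred>
std::optional<int> first_match(std::vector<int> v, Pred pred, int* calls) {
    for (int x : v) {
        ++*calls;               // count how often the predicate runs
        if (pred(x)) return x;  // stop immediately; the tail is never looked at
    }
    return std::nullopt;
}
```

An eager "action" would have applied the predicate to all five elements below before taking the front; the lazy version applies it twice.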
Sounds interesting. How much are you using
your ranges library throughout your codebase?
Throughout. Everywhere.
I mean, it's...
And there are
very subtle things
that you discover
when you actually use
that
library so much.
I mean, std::min is broken, for example.
The standard std::min is broken.
Because rvalues bind to const ref, right?
And then std::min returns its return value as a const ref.
So it silently turns rvalues into lvalue references,
and you lose the ability to see whether things are an rvalue.
For these range libraries, you need to know whether to aggregate or not.
And in the case of the transform adapter, there are other decisions that you have to base on whether something is an rvalue or not. And when suddenly there is a function that silently turns rvalues into lvalues, that's just not a good idea. That causes problems.
Have you made a proposal or anything to fix this, or to make an rvalue-clean version of std::min and some of these other algorithms?
There is a tc::min. I mean, in the public library, there is a version that basically just says, okay, if I'm getting an rvalue, I have to leave it an rvalue. And there are interesting things, like when you are getting, say, a const ref and a ref ref, an rvalue ref, stuck into a tc::min: what's the return value? And I believe the correct return value is actually a const rvalue reference, which combines the fact that you cannot modify it, because it may be the const ref, with the fact that it may be going out of scope, because you've got an rvalue.
So you've got the worst of both worlds of a const ref and a ref ref,
which is then a const ref ref.
So in the beginning, you would ask,
what do you need const ref refs for?
Well, I think this is very much what you need them for.
And the type system does the right thing.
The only thing where I think C++ is wrong
is to bind rvalues to const ref,
which of course has historic reasons,
but it's one of the many things which have historic reasons,
but are fundamentally, I think, not correct.
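A sketch of the problem and one possible fix, using `std::string` and an invented `my_min` (this is not think-cell's tc::min): the classic const-ref signature loses the value category, which is how `std::min` over a temporary can hand back a reference into a dead object, while an extra rvalue overload can return by value so nothing dangles.

```cpp
#include <string>
#include <utility>

// Same shape as std::min: an rvalue argument binds to const&, and the
// result comes back as const& -- possibly a reference into a temporary
// that dies at the end of the full expression if the caller keeps it.
const std::string& my_min(const std::string& a, const std::string& b) {
    return b < a ? b : a;
}

// Overload for an rvalue first argument: return by value, so the
// caller can never be left holding a dangling reference.
std::string my_min(std::string&& a, const std::string& b) {
    return b < a ? b : std::move(a);
}
```

A complete solution would cover all lvalue/rvalue combinations of both arguments (this is where the const rvalue reference return mentioned above comes in); the two overloads here only illustrate the principle.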
So you gave a talk on your ranges library at Meeting C++, is that right?
Right, yeah.
Okay.
I think if I caught this right at the end of it,
you say that you hate the ranged for loop that we got in C++11.
Oh, yeah, absolutely.
Absolutely.
So with a passion.
No, so the problem with it is that we are now trying to train developers
to think in algorithms, to use all these tools that we have on ranges,
say filter and transform and all these things.
Now, but at the same time,
or maybe a short time before that,
we now gave our developers a new way to write a loop,
an explicit loop,
and that makes people use it.
And that's really the problem.
So people see a range, and they start off with a range-based for loop.
And because of it, because they have that
range-based for loop, they are
never jumping
to an algorithmic
solution that would take a lambda.
So for us at think-cell, we actually don't use the range-based for loop, but instead force people to write a tc::for_each, basically a std::for_each for ranges, using a lambda. Just because, when you already have a lambda, when you already wrote a lambda, making the jump to say, hey, maybe instead of for_each an all_of or an accumulate or something like that would actually be the better choice, the better tool to use, that jump is that much easier. And getting people to go from the range-based for loop to actually wrapping the same thing into a lambda, frequently, people just didn't do. And then they are basically stuck in that old for loop world that we actually didn't want anymore.
Makes sense.
Your range
library started as a fork of
boost range. Have you forked any other
boost libraries?
Not to the same extent.
I mean, we have contributed a bit to Spirit X3.
Made it a bit faster.
They kind of aborted parsing using exceptions
and that was slow, so we replaced that.
But that has all been merged.
And we use Spirit actually for parsing throughout.
We used to use regexes, but they're just terrible to maintain.
And it's one of the other things
that I would not
like to have, that I would not like to
be in the standard, because it's kind of a case of
when all you have is a regex, then
well, you're going to use the regex and
stop looking for alternatives.
And so
Boost Spirit is one. We have our own Boost.Operators, which is basically injecting operators: you have the modifying operators, and you want all the dual versions, like you generate the plus from the plus-equals, things like that. So there we have our own version.
We've done a bit of work on CLP, which is a linear solver. That's the math we need for the automatic layout. So this kind of stuff, but nothing to the extent of what we did with the ranges, I think. The range library does contain a bit of utilities, type traits and other useful things, like tc::min and tc::max, for example.
So, since you mentioned think-cell products again, if we can go back to that: when I was looking at the product, I thought, you know, this kind of speaks to me, because I don't use PowerPoint anymore. Trying to manage all of my slides for all my presentations and teaching material just becomes too much work. And you were talking about watching a bunch of people hunched over their computers, pushing slide elements around. So I use other tools that help me build presentations, tools that can do automatic code formatting for me, and in particular automatic syntax highlighting. Is that anything that your tool can do? Can it help the programmer making presentations also?
Not really.
I mean, I myself don't use,
I don't use PowerPoint for my presentations.
Just because, I mean, yes, it's a pain in the butt.
If you just have to write code, it's terrible.
And the thing is, the market is just not there.
We thought about doing something like a markup to PowerPoint converter,
where you can really write text.
Because nothing is as quick to write as text,
especially, I think, for programmers.
So it would be kind of natural to do this kind of converter.
This has always kind of been an idea,
but we never really followed up on it.
So, yeah, I don't think I can help you much there.
Okay.
Well, and also thinking about how you implement it,
you say you use C++,
that it's so much better an idea than C#,
but in your job posting,
you also say that you've used some assembler glue,
and I'm curious what that is for.
Well, I mean, we are interoperating with Office a lot,
and the public APIs just don't always do
what you would like to do.
Things like synchronizing with the undo queue, right?
So when the user is working with our software,
she is expecting that when you press undo,
that you get the proper undo behavior.
But that's kind of not exposed.
You don't really know when to do the undo steps.
That's one thing.
The other thing is that you may want to have your API or your UI, your visuals, tightly integrated.
So with Office, for example, if you scroll the slide or if you zoom a slide in and out,
you would expect that the elements you attached to the slide, to the charts on the slide, say, also zoom in and out.
And that actually we solve
by disassembling PowerPoint and Excel
with the IDA disassembler.
We go and find the hooks that we can actually patch. We built a little patch-searching engine: whenever PowerPoint or Excel starts, we actually go through the code and search for these patch sites. And it must be somewhat robust, because Microsoft puts out hotfixes all the time, so we can't break every time. We need to be robust against small changes in the code. So we have the scanner, we scan the code, and we set the hook. And obviously the low-level work, the function-hooking work, is done in assembly.
So you have to discover the name of the function that you want to hook.
Not the name. I mean, you have to find it in binary.
Okay, so you're not looking for symbols
in here.
There are no symbols.
You have to just find it in binary.
It's in IDA, right? You've got a disassembler.
Okay, so I'm curious
what exactly are you looking
for then? Certain byte patterns
once you've identified the code or what?
Yes, I mean, once you've identified what you actually want to hook,
we usually take a small patch around that place
and then look for this byte pattern.
There are certain things you have to be robust against. For example, the compiler sometimes changes jumps, say from jump-if-zero to jump-if-non-zero, so it basically turns around the parity of the jump, and that's something we have to be robust against. Some others, like long jump versus short jump, these kinds of simple changes in the assembly code, you have to be robust against when you actually do that scanning, otherwise it will just break too often. But if you do that, then it's actually manageable.
And since you're running as a plug-in to PowerPoint, you're already operating in its address space and permissions and whatever, so you're not doing anything that's going to alert Windows that you're doing something wrong?
We are not the only ones doing this.
There are DRM tools
that try to prevent PowerPoint
from saving files.
There are, of course,
all kinds of malware scanners
that try to detect behavior.
So we're not the only ones
patching around in process address spaces.
But maybe we're the only charting add-in that does that.
That's very interesting.
And I wonder if this is something we could have dug into last week when we were talking about the overlays with GOG and Steam and whatever,
with games, how they patch executables too.
That's certainly a realm I've never worked in personally.
It's interesting stuff.
Bringing it back to the range library,
ranges I believe we're hoping is going to get into C++20.
Have you thought about recommending any of the changes you've made
in ThinkTel's range library to Eric Niebler and the standards committee?
Or do you think C++ developers are still going to be pretty happy just getting any version of ranges?
So far, what I've seen in the ranges TS, we looked at that stuff.
And there was one crucial thing that was pretty early on that we had changed, and they actually did change it.
Okay.
Is that when you get an iterator from a range, that you must make sure that the range stays alive.
And that's something they actually added to the standard, that basically the iterator validity is tied to the range validity.
If you don't do that, there's basically no way that I know of
to implement efficiently filter adapters.
Because when you are stacking filter, the point is this,
when you have an iterator of a
filter adapter, the iterator
just by itself,
if it doesn't reference any
range,
if it cannot store
extra data anywhere,
then it has to store its
end iterator, because otherwise, while
filtering, it will basically fall off the edge.
So we are moving your iterator forward until you hit your filter criterion, but you may hit end.
And so
you need to know where end is. So you need to
store the end iterator.
And when
you now start stacking these
filter iterators, then
every one of these, so you have a filter of a filter of a filter, then every one of these filter iterators would store two iterators of its underlying iterator type.
So you get a combinatorial explosion in the size of, or not, is it, no, it's not combinatorial, it's exponential. So you get an exponential explosion in the size of your iterator. At some point, when you stack 10 of these, you have a 1,024-word-large iterator that you're moving through your program. That can't be efficient, right? So we made this one crucial change that allows us to efficiently implement these iterators.
There is one thing that we did in our library
which goes beyond that.
When you're sticking to what Eric did,
you get still linear growth in the size of the iterators.
That's much better than, certainly, the exponential growth,
but it's still kind of bad.
And there is a way to do without,
which is not precluded by the standard.
So you could implement it with the standard,
standard conformant,
and still get small iterators.
But Eric hasn't done it,
and you need extra concepts.
You need like an index concept.
If you're interested,
there's the talk on the website,
on our website,
talking about this index concept. And then you can actually implement filter or transform or things like that efficiently, with small iterators.
Okay. Jason, do you have anything else you want to ask?
I don't think so.
Okay. Well, Arno, it's been great having you on the show today. Is there anything
else you want to tell us about think cell before we let you go? Well, one thing is, of course, we are certainly looking for C++ developers.
So we try to hire as many as we can.
We certainly, I think we have a very high quality bar.
So it's quite difficult to get in.
But we are certainly, we are really looking for developers.
And we would love to hear from anyone looking for a job.
And you are in Berlin?
We are in Berlin, and currently everyone is on-site.
And so far, it's basically relocation to Berlin.
Okay.
Well, it seems like Berlin has a great C++ community.
I mean, with Jens out there running Meeting C++,
I'm sure it's a really strong community.
Yeah.
Okay.
Well, it's been great having you on the show today, Arno.
Thank you very much.
Thanks for joining us.
Thanks so much for listening in as we chat about C++.
I'd love to hear what you think of the podcast.
Please let me know if we're discussing the stuff you're interested in.
Or if you have a suggestion for a topic, I'd love to hear about that too. You can email all your
thoughts to feedback at cppcast.com. I'd also appreciate if you like CppCast on Facebook and
follow CppCast on Twitter. You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter.
And of course, you can find all that info and the show notes on the podcast website at cppcast.com.
Theme music for this episode is provided by podcastthemes.com.