CppCast - Think-Cell Ranges
Episode Date: January 25, 2018

Rob and Jason are joined by Arno Schödl to talk about the work he does at think-cell with C++ and their custom range library.

Arno Schödl, Ph.D. is the Co-Founder and Technical Director of think-cell Software GmbH, Berlin. think-cell is the de facto standard when it comes to professional presentations in Microsoft PowerPoint. Arno is responsible for the design, architecture and development of all our software products. He oversees think-cell's R&D team, Quality Assurance and Customer Care. Before founding think-cell, Arno worked at Microsoft Research and McKinsey & Company. Arno studied computer science and management and holds a Ph.D. from the Georgia Institute of Technology with a specialization in Computer Graphics.

News:
- Pacific++ 2017: Sarah Smith "Postcards from the Cross-platform Frontier"
- Outcome v2 Boost peer review begins
- CppCMS C++ Web Framework version 1.2.0 released under MIT license
- Spectre mitigations in MSVC

Links:
- think-cell range library
- think-cell Talks and Publications
- think-cell Funds the Working Group for Programming Languages of the German Institute for Standardization (DIN)
- think-cell Sponsors the Standard C++ Foundation
- think-cell C++ Jobs
- Work Life at think-cell

Hosts: @robwirving @lefticus
Transcript
Discussion (0)
Episode 135 of CppCast with guest Arno Schödl, recorded January 24th, 2018.
In this episode, we talk about the Outcome V2 Boost peer review.
Then we talk to Arno Schödl, Technical Director at think-cell.
Arno tells us about their custom ranges library and more. Welcome to episode 135 of CppCast, the only podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
I'm doing pretty good, Rob.
Did you survive the snow last week?
I did survive the snow.
My whole family was trapped for three days with five inches of snow outside.
Not completely trapped.
I was able to go out, but the schools were all closed and everything.
Yeah, five inches is a pretty big deal down here.
Yeah.
Although, I mean, are you in a part of the country where there's lots of four-wheel drive pickups and stuff around and Jeeps?
There's definitely lots of pickups and lots of people driving SUVs.
Yeah, I got a Subaru, so I was able to go out.
But, you know, black ice is still pretty scary even with four-wheel drive.
Yes, I was thinking that, actually, when I was out in the snowstorm that we got. You see owners of four-wheel drive vehicles, I won't mention any particular brand names or anything, that drive as though somehow four-wheel drive is magic and it will prevent you from sliding and whatever. It's not really how it works.
No, not at all. Anyway, at the top of our episode,
let's read a piece of feedback.
This week we got an email from Sebastian in Germany.
He writes,
Dear Rob and Jason,
First of all, thanks for the great podcast.
I'm following it quite often.
I'm looking for a link in the news section of an episode,
but I can't remember which one it was.
Maybe you can remember and help me.
You told about a speaker at a conference
who built a Qt app with Qt Creator in one hour
and deploying it on various platforms,
Android, Mac, Windows, Linux.
I hope you know which one I mean.
Thanks a lot in advance.
Jason, this sounds like something from Pacific++.
Yes, that was one of the talks from Pacific++.
It's up on YouTube,
and it should be pretty easy to find, since there were only 10 talks at that conference or whatever.
Yeah, I think we talked about this in the episode when you were still down in New Zealand.
Yes, I believe I was sitting on the floor of the kitchen of an Airbnb that I happened to be staying in at the time.
Right. Well, we will find the link to that and put that in the show notes for you, Sebastian.
But yeah, I need to watch that talk too, because it sounded really interesting.
She deployed it to multiple mobile platforms, right?
I don't remember how many actual deployments she got working live, but definitely demonstrated
that it was Android and iOS, yeah, and then was running it locally also.
So it was, you know, there's a lot of live coding.
It might be one of those videos that you watch at 2X or something.
But, yeah.
Okay, well, we'd love to hear your thoughts about the show.
You can always reach out to us on Facebook, Twitter, or email us at feedback at cppcast.com.
And don't forget to leave us a review on iTunes.
Joining us today is Arno Schödl.
Arno is a co-founder and Technical Director of think-cell Software.
think-cell is the de facto standard when it comes to professional presentations in Microsoft PowerPoint.
Arno is responsible for the design, architecture, and development of all our software products.
He oversees think-cell's R&D team, quality assurance, and customer care. Before founding think-cell, Arno worked at Microsoft Research and McKinsey
and Company. Arno studied computer science and management and holds a PhD from the Georgia
Institute of Technology with a specialization in computer graphics. Arno, welcome to the show.
Hello. Wow, that's a lot of hats. Thank you very much for having me. Yeah,
certainly. Yeah, that's a lot of hats that you wear at your company today. Well, I mean, basically, we're
just two founders, right? So one guy is more, we have two computer scientists, in fact, so it's all
run by computer scientists. But there's just one guy who's doing the technology side, which is me.
And then there is Marcus, who's doing more of the managerial side. Everything else, basically, but development.
Wow.
I'm kind of curious what you did at Microsoft Research.
That's, from what I understand, an interesting world to be in.
Yes.
I did my PhD at Georgia Tech, and I did computer vision at the time.
So it was really related to my PhD. I actually had, you know, a PhD is really relevant for the real world, so I had a fish swimming in front of a green screen for a while, recorded that, and then you were able to steer the fish with a mouse through your fish tank. You know, these are the kinds of relevant things you do for a PhD. It wasn't quite all like that, but I mean, part of it was.
And then after the PhD, or actually during my PhD, I thought, okay, before I stick to computer science for the rest of my life, I would like to see the real world. And then I went for an internship to McKinsey,
and that's where the idea of think-cell came about,
because there were many people, well-paid people,
they were all hunched over laptops,
and they were pushing around PowerPoint shapes.
And I was kind of dumbfounded.
We were, in my PhD, using really smart algorithms to solve problems like the fish tank.
And here there were people who were having a tough time pushing around PowerPoint shapes.
And that seemed odd.
So we started working on that.
That created the idea of making a tool for PowerPoint that does it a bit smarter than PowerPoint alone.
Okay, so we'll definitely talk more about think-cell in a little bit,
but first we have a couple of news articles to go through.
Arno, feel free to comment on any of these, okay?
All right.
So this first one, the Outcome v2 Boost peer review has begun.
So obviously we've had Niall Douglas on the show twice before,
and last time we had him on we were talking
about the Outcome Boost review, which obviously did not get accepted, because we're now talking about Outcome v2. So hopefully he'll have some better luck this time. It seems like he really kind of trimmed down the library. It says he got rid of anything that even smelled like monads, right?
Yeah. Is there anything else you wanted to highlight with this?
I do think it's funny that at the top he says,
I'm sure r/cpp is sick of hearing about Outcome
and Expected by now.
So the last time we had him on,
shortly after that we had Charlie on
and we talked about the review process.
And so this is Niall, and as far as I can tell,
again, Charlie is doing the review for V2 again,
which is just, it seems like an astounding amount of work to be involved in this stuff,
and the amount of comments you have to sift through.
Well, yeah, I mean, Charlie, I think, said he went through like 300-some comments, if not more.
At least, yeah.
And that's over just like a 10-day period, I believe.
It's pretty crazy.
Have you done any work with the
Boost review or
Boost libraries, Arno?
Well, we've done actually
well, we contributed a bunch
of things to Boost, but nothing
in the scale
of an actual review.
So I was always a bit scared about
actually doing a review. In particular,
if we have our own range library or something, contributing that to Boost would probably involve a lot of work.
Yeah, it definitely seems like it's a lot of work
to do big contributions like that.
There is, in this Reddit discussion,
someone talking about a proposed feature to C++,
and now I'm scrolling through and I can't recall it.
Oh, here it was.
It's something about return types from functions
and I don't know, whatever.
But Niall's comment is that,
yes, there's currently a proposal to do this,
but it'll be at least C++23
before we could expect to see any movement on that.
Yeah, and I think the committee is probably even,
that's even more work than doing a Boost contribution.
Oh, right.
Yeah.
Okay, this next thing we want to talk about
is CppCMS,
Content Management System, which I believe is what CMS stands for.
And we've kind of talked about this type of thing before
and whether CppCast could move to a C++-based web management system, because right now we do everything with some JavaScript project. I don't really know too much about it, but it works.
But yeah,
did you look at this in
any detail, Jason? Do you think it's something we
could possibly use? I looked
at the history and
such, but I didn't look at the actual feature set,
honestly. Yeah, I'll have to look into this a little bit more to see if it's something that
maybe we could switch to, because I'm sure building the CppCast website with C++ would be nice.
Yeah, I did notice that they just updated their automated build systems, and I had to do a double take because I wasn't sure when that post was written, because it says they just upgraded from Windows XP to Windows 7 on their automated build system, and I'm like, wait. Yes, that was written in December of 2017. But at least they're getting their build system updated, and updating all their compiler toolsets and stuff. And they just changed the license to MIT.
Yeah, I was going to say that's the big change
that they're announcing here,
that they're moving from LGPL to MIT license
so commercial projects can use it,
which is great.
Well, there's also Boost.Beast that came out pretty recently, right, in Boost 1.66, which is also basically an HTTP server tool, this time in Boost. I think it's Boost.Asio based.
Yeah.
We talked about that on the show a while ago.
But this CMS
is more like a CGI gateway
kind of thing, not an HTTP
server.
Mm-hmm.
Okay, and then
the next thing we have is
obviously we're talking about Meltdown and Spectre
a lot two weeks ago.
There's still more
content coming out with people
trying to address these vulnerabilities.
The Visual C++ blog
wrote this post on
Spectre mitigations in MSVC
and apparently they
are putting out an update
that'll have a new flag, /Qspectre,
which will help with the Spectre variant one exploit.
I don't recall that there were two different types
of vulnerabilities within Spectre,
but it looks like this is fixing the bounds check bypass.
So if you are working on sensitive code,
then you might want to update your Visual Studio compiler if that's what you're using,
and take a look at the /Qspectre flag
and see if you can use it
and see if it affects your performance in any way.
I think they're saying in most cases
it's not affecting performance, though.
Yeah, go ahead. I'm sorry.
Yeah, just, I mean, actually,
today we actually flipped the switch and
tried it, and I don't know. I couldn't see
any difference. But, I mean, what's kind of
special about this security flaw is that any
mitigation you do is going to be
imperfect by nature, right? Because
the flaw is so deeply seated in the processor
that you can kind of fix known exploits of that particular flaw, but you're not going to fix the underlying flaw itself, really.
Yeah, it's ridiculous. Although I noticed that in the Microsoft-specific version of this, they're adding one single instruction called LFENCE, and I didn't even know that this CPU instruction existed on the x86 architecture. I never heard of it before.
Yeah, I'm not familiar with it either. But that's the only difference in their output assembly example between using /Qspectre and not using it: the addition of the LFENCE. Yeah.
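For readers who have not seen the variant 1 pattern, here is a minimal, illustrative sketch of the kind of code the /Qspectre flag targets. The function and data names are invented for illustration, and the comment marks where the compiler would place the LFENCE speculation barrier; this is a sketch of the attack shape, not Microsoft's actual example.

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative Spectre variant 1 (bounds check bypass) gadget.
// The branch predictor can speculatively run the body even when
// idx >= len; the out-of-bounds load then leaves a secret-dependent
// footprint in the cache. A speculation barrier after the check
// (an LFENCE on x86, which /Qspectre asks MSVC to insert) stops
// execution from racing ahead of the comparison.
uint8_t read_checked(const uint8_t* table, size_t len,
                     const uint8_t* probe, size_t idx) {
    if (idx < len) {
        // Mitigation point: _mm_lfence() from <emmintrin.h> would go here.
        uint8_t secret = table[idx];
        return probe[secret * 64];  // secret-dependent memory access
    }
    return 0;
}
```

The architectural result is always correct (the out-of-bounds branch returns 0); the problem is purely the microarchitectural side effect of the speculative load.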
Okay. Well, Arno, can you tell us a little bit about what you do at think-cell?
We talked about it a little bit in your bio and PowerPoint and everything, right?
Sure.
So basically the idea was to automate this shape pushing, right, on PowerPoint.
And it's kind of funny that being in 2017,
of course, with word processing,
we kind of take it for granted
that whenever we type text,
that the words don't stick to the page
wherever we type them,
but they get automatically word-wrapped.
Now, this doesn't really happen in PowerPoint, right?
So if you put a shape,
or in any other drawing program for that matter,
so if you take a shape and you put it on the slide,
it just stays where it is.
And whenever you want to change something,
you have to rearrange all the rest that's also on the slide.
So the basic idea that we started think-cell with,
and that's been a while ago,
is that you basically want to be able
to change objects on the slide
or add objects, remove objects,
and your general layout stays stays stays good stays
automatically adjusts automatically um to to all the changes you do um and we actually started off
uh with that idea but then uh we decided well okay what we're going to need first is some some of
these these these elements that we're going to automatically layout.
So these could be charts.
And then pretty much we did charts for 15 years, or maybe 10.
And only now, really, we are getting into having the first release
with some of that automated layout technology
being actually effective for the user.
So we have the first elements
that are actually kind of floating on the slide
and are adjusting their positions automatically,
and you can specify how you want them
to relate to other objects.
Of course, that's tricky to do.
I mean, it's mathematically challenging.
It's challenging from a user interface perspective
because the user has to tell the system what he actually wants. So this turned out not to be an
easy problem, but I think we're getting there. And it's been interesting all the way.
So the idea is just to automate people's workflow when using PowerPoint so they don't have to do
as much by hand manipulation of a slide, that type of thing?
Right. So you would basically, it's a bit like Lego, right? You put your first element on the
slide, and you snap your next element next to it, and you decide whether you want a gap or not,
and something below and something above, and then you fill in your content. And the layout
should stay in such a way that you could actually show
this to your audience at any time without actually doing manual adjustments. That's the idea.
So it's a plug-in?
It's a plug-in, right.
Okay, just making sure I was getting that. And all of this work for this plug-in is written in C++?
Yes. So, I mean, basically everything we do is in C++, because we built a large library over time, and the best way to leverage that is just to build everything in C++. We have a bunch of build scripts in Python; that's about it, I think, that's not C++, really.
That seems like an interesting choice. I would expect most things that are Office plug-ins are written in C# or something in the .NET family.
Right, and most people actually do write these things in C#, and it turns out to be a terrible choice. Well, no, it's a disaster, really. I mean, I'll tell you the story.
It's terrible.
So the problem is, or fact of the matter is,
the Office APIs are all COM.
Office is written in C++, right?
So first of all, you're not gaining any functionality
if you're going to .NET,
because all you are talking to,
basically, on the other side, is always COM,
so you're going to do all this interop, and you're just losing performance.
Okay, so far so good.
I mean, you can still like C# and write your things in C#.
But the problem is that COM is based on reference counting.
And when the reference count falls to zero, your object gets destroyed, right?
So you get deterministic destruction
as soon as you release the last pointer to your object.
Now, C# is built on .NET
and is garbage collected.
And thus, all that destruction is only happening,
at least by default,
if you don't jump through many hoops,
when the garbage collector is running.
Now, it would be nice
if Office would be written in a way
to be able to deal with this
non-deterministic destruction. It is not.
So, you have a
C# add-in, and you are getting
a PowerPoint shape from PowerPoint,
and do something with it, and you release it.
Or you let
the pointer go out of scope.
Well, in C++ parlance.
Now, the garbage collector, of course,
hasn't run right away.
So no Release happens in PowerPoint,
and that pointer is still being held.
And that makes the shape
that the pointer was pointing to
kind of live on as a zombie.
And when you're doing calls to that shape,
you get funny behavior.
You get strange errors.
You can't do much with this thing anymore.
And it's not like this only
applies to the shape that
has gone out of scope. I mean, you don't have any access to it
anyway. But if you then go,
basically you have one C-sharp add-in
which does that, and then you have
our add-in, which accesses the
regular PowerPoint shape collection. It just gets a fresh pointer. Even that fresh and then you have our add-in, which accesses the regular PowerPoint shape collection.
It just gets a fresh pointer.
Even that fresh pointer that you're getting
is going to point to that zombie.
So what we have to do, in fact,
because there's so much C# code out there,
that when we are detecting this kind of problem
and there are certain telltale bugs or telltale errors
that are thrown from the COM API,
then we actually enumerate all the .NET app domains
and trigger the garbage collector just to kill the zombies.
And then we release all our pointers again and restart.
And you really need to do that.
So, I mean, don't get me wrong.
There is a way to write correct C# code
that interoperates with Office.
But all the convenience of garbage collection
is pretty much gone.
You have to do explicit releases.
So you have to be careful that there are no temporaries created anywhere, because as soon as you have a temporary, this thing will go to the garbage collector and you have a problem.
So you have to explicitly name everything,
and then you have to explicitly release everything. And that's a disaster to program in. It's terrible. And, of
course, there are tons of code out there where they don't get it right. And it's very hard to
audit this kind of code. So, I mean, if you want to take my advice: if you want to do any serious Office development, by no means use C#.
Never, ever.
It's terrible.
It's really that bad.
So this is, if I were to try to bring this back into, I don't know, I guess the C++ kind of world,
this would be like having to manually call the destructor on every single object.
Does that sound right?
That sounds about right.
Yeah, that doesn't sound like fun.
No, it doesn't.
Not at all.
And you get it wrong.
Wow.
I imagine you get it wrong regularly.
That's the point of destructors and deterministic lifetime.
Exactly.
So all these benefits of RAII are just out the door at that point.
Wow.
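To make the contrast concrete, here is a small, self-contained sketch of COM-style reference counting with RAII. `RefCounted` and `ComPtr` are hypothetical stand-ins, not the actual Windows or think-cell types; the point is only that Release() runs at a deterministic moment, exactly when the last smart pointer leaves scope, which is what a garbage collector does not guarantee.

```cpp
// Minimal sketch of COM-style reference counting with deterministic
// release via RAII. All names here are illustrative.
struct RefCounted {
    int refs = 1;
    bool* destroyed;                 // lets callers observe destruction
    explicit RefCounted(bool* d) : destroyed(d) {}
    void AddRef()  { ++refs; }
    void Release() { if (--refs == 0) { *destroyed = true; delete this; } }
};

// RAII wrapper: Release() runs exactly when the pointer goes out of
// scope -- the deterministic destruction C++ gives and a GC does not.
class ComPtr {
    RefCounted* p_;
public:
    explicit ComPtr(RefCounted* p) : p_(p) {}
    ComPtr(const ComPtr& o) : p_(o.p_) { p_->AddRef(); }
    ComPtr& operator=(const ComPtr&) = delete;
    ~ComPtr() { p_->Release(); }
    RefCounted* get() const { return p_; }
};

bool demo_deterministic_release() {
    bool destroyed = false;
    {
        ComPtr p(new RefCounted(&destroyed));
        ComPtr q(p);            // refcount is now 2
    }                           // both released here, count hits 0
    return destroyed;           // true: the object died at scope exit
}
```

In the C# situation described above, the two Release calls would instead happen at some unknown later time, when the garbage collector runs.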
One thing we noticed on your website is that it says even your customer portal is written in C++.
Is that true?
That is true.
We started off a while back with, or when we started the company, we started off with ATL Server.
That moved on from there.
But yes, I mean, we built a big library, and a lot of that stuff that we built is not really specific to anything.
I mean, you can find the public part of the library on GitHub.
But it just made sense for us to leverage that also for the web server.
Because really, I mean, when you're writing C++,
there is this certain strictness about writing your code,
and that carries over well to writing your web server.
I mean, you have a very stable web server.
It works great.
And yeah.
So it also is written in C++.
So I am
kind of curious. It seems
like the Windows
APIs, and I don't know
if it's true with the Office
APIs too, are not exactly known
for modern C++
techniques. Do you have any difficulty integrating your modern C++ techniques with the Office APIs?
Well, I mean, yes, there is a fair share of glue.
So when you have a collection,
I mean, you want to express that collection
as a range in C++.
So of course we have glue for that.
But really, at the end, you just have objects and calls to these objects' methods,
and there's not too much more to that.
It works okay.
It's not really a problem.
And you just have your glue code
and then you write modern C++ on top of it.
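As a rough illustration of such glue, the sketch below wraps a COM-style 1-based `Count()`/`Item(i)` collection in a minimal iterator pair so it can be consumed as a C++ range. `FakeShapes` and `ShapeRange` are invented names standing in for the real PowerPoint interfaces and for think-cell's actual glue code, which is not public here.

```cpp
#include <string>
#include <vector>

// Stand-in for a COM collection such as PowerPoint's Shapes:
// a Count() method plus 1-based Item(i) access.
struct FakeShapes {
    std::vector<std::string> names;
    long Count() const { return static_cast<long>(names.size()); }
    std::string Item(long i) const { return names[i - 1]; }  // COM is 1-based
};

// Glue: expose the Count()/Item() interface as an iterable range.
class ShapeRange {
    const FakeShapes* c_;
public:
    explicit ShapeRange(const FakeShapes& c) : c_(&c) {}
    struct iterator {
        const FakeShapes* c; long i;
        std::string operator*() const { return c->Item(i); }
        iterator& operator++() { ++i; return *this; }
        bool operator!=(const iterator& o) const { return i != o.i; }
    };
    iterator begin() const { return {c_, 1}; }
    iterator end()   const { return {c_, c_->Count() + 1}; }
};

// Modern C++ on top of the glue: iterate the COM-style collection.
std::vector<std::string> collect(const FakeShapes& shapes) {
    std::vector<std::string> out;
    for (const auto& name : ShapeRange(shapes)) out.push_back(name);
    return out;
}
```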
Okay. So you mentioned ranges right
there. Do you want to tell us a little bit
about the ranges library that
think-cell made?
Sure. So
the ranges library, part of that is actually
public on GitHub.
And there
are, I mean,
the main, I think, driver for ranges is currently Eric Niebler.
And he has, of course, his range-v3 library.
And so we grew out of Boost.Range.
We kind of grew our own library and use it throughout our code.
So it's been well used and it gets adjusted all the time.
We have the freedom to change the interfaces still
because it's kind of a monolithic thing.
We can have the library
and we have the code base
and we can change the library
and then change the code base along with it.
We actually have one guy who does nothing else but code refactoring.
So that's pretty comfortable.
Or it's a comfortable way to build a library,
really. It's a luxury
that not many people have.
So we don't have a stable interface.
It won't change dramatically,
but it's okay.
Now, it's
a bit different from...
While building that library, we made some choices that are a bit different from the choices that range-v3 made, most of them growing out of pragmatic solutions to problems we had.
One more ideological difference
is we don't have a pipe operator,
which I think is the telltale sign for Eric's library,
just because we think that it's a special case solution
for a general problem of how to call functions.
I do like the uniform function call proposal from Herb Sutter,
but I don't think it's going to...
There's no sign of it being adopted, if I'm informed correctly.
Now, there are a few other more pragmatic differences.
The way that the range library deals with R-value containers.
So you say you're generating a container
and want to stick that right into an adapter,
in a filter adapter.
You want to filter a container that you just created.
Now, in Eric's library, that wouldn't really work, because when you're referencing this container that you just created, as soon as you end the statement, you would get a dangling reference to that container, because the container is a temporary, right?
Now, the solution that we have is that we actually aggregate the container
into the adapter. So, that way, kind of the transform of the filter becomes a container
rather than a view on a container. And I think Eric has ideological or has qualms with this solution because suddenly what you may think is a view on some other
or a reference to another container
suddenly kind of becomes a container itself,
which may be theoretically not as sound as you may want it to be.
In our experience, it's not really a problem in practice.
So that's the solution we
adopted. And
I think it avoids
the programmer having
to resort to more
roundabout ways of writing that same
piece of code. So, for example,
if you want to return something from a function,
which is this kind of filtered
container, I mean,
where would you put the container if you filtered it?
You would have to kind of return two things.
One is the actual filtered range, and the other one is the container,
because they don't aggregate with each other.
So that's one difference.
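A minimal sketch of the aggregation idea might look like this. `OwningFilter` is a hypothetical type, not think-cell's actual adapter, but it shows why returning a filter over a temporary container from a function is safe once the adapter owns the container.

```cpp
#include <utility>
#include <vector>

// Sketch of the aggregation idea: when the adapter is handed an rvalue
// container, it moves the container inside itself, so the filtered
// range cannot dangle. OwningFilter is an illustrative name.
template <class Container, class Pred>
class OwningFilter {
    Container c_;   // aggregated by value: the adapter IS the owner
    Pred pred_;
public:
    OwningFilter(Container&& c, Pred pred)
        : c_(std::move(c)), pred_(std::move(pred)) {}

    template <class F>
    void for_each(F f) const {
        for (const auto& x : c_)
            if (pred_(x)) f(x);
    }
};

// Safe: the temporary vector is moved into the adapter, so the caller
// can iterate the result long after this function returns. There is no
// second object to carry around alongside the filtered range.
OwningFilter<std::vector<int>, bool(*)(int)> evens_up_to(int n) {
    std::vector<int> v;
    for (int i = 1; i <= n; ++i) v.push_back(i);
    return OwningFilter<std::vector<int>, bool(*)(int)>(
        std::move(v), +[](int x) { return x % 2 == 0; });
}
```

With a pure view, `evens_up_to` would have to return both the vector and the view over it; here the move of the vector is cheap (a few pointers), however large its contents.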
There are a bunch of subtleties with transforms
that may be a bit too detailed to go into right now.
There is a completely different type of range that we support.
We call them generators.
They're basically the visitor pattern in generator disguise,
or in range disguise.
So if you have a function that just enumerates the elements,
that just sticks these elements into a functor,
then on that kind of function you can implement
many of the adapters,
in particular filter and transform,
or things like concatenation.
So you suddenly have ranges
that don't have iterators anymore.
They just enumerate their elements,
and you can still use the same kind of algorithms
that you would use on regular ranges.
That's kind of nice because, you know,
you have sometimes just generating or creating a container
that passes out iterators can be clumsy.
It requires a lot of work,
but writing something that just enumerates its values
is often much quicker to write.
And you can use it with some of the same kinds of algorithms
and some of the same kinds of adapters
that you would use on regular ranges.
So that, I think, is very useful in the code.
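A tiny sketch of the generator idea, with invented names: the "range" is just a function that pushes its elements into a sink functor, there are no iterators anywhere, and a filter adapter composes simply by wrapping the sink.

```cpp
#include <vector>

// A "generator" range: instead of handing out iterators, it pushes
// each element into a functor (the visitor pattern in range disguise).
template <class F>
void generate_squares(int n, F sink) {
    for (int i = 1; i <= n; ++i) sink(i * i);
}

// A filter "adapter" for generators: wrap the sink with the predicate,
// then run the generator. No iterator machinery is needed.
template <class Gen, class Pred, class F>
void filtered(Gen gen, Pred pred, F sink) {
    gen([&](auto x) { if (pred(x)) sink(x); });
}

// Compose: squares 1..n^2, filtered down to the odd ones.
std::vector<int> odd_squares(int n) {
    std::vector<int> out;
    filtered([n](auto s) { generate_squares(n, s); },
             [](int x) { return x % 2 == 1; },
             [&](int x) { out.push_back(x); });
    return out;
}
```

Writing `generate_squares` took three lines; an equivalent iterator-based range would need begin/end, increment, dereference, and comparison boilerplate.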
So in your version, you said that the range is an owning thing. It is the container, right?
Not always. I mean, it can be, if you pass it an rvalue.
Okay, okay. I was just curious, then, whether it was always making copies when you were doing the filters. But it sounds like no.
No, that would be terrible.
But it gives you at least the option.
See, with range-v3,
you kind of have the option of having a lazy view on a range,
or if you want to actually work with an R value,
of course you can execute whatever filter or something you have eagerly.
So these are kind of the choices you have.
I think he calls it views and actions.
Now, the thing is,
I think these two concepts are really orthogonal.
So you can have
laziness on containers,
and of course you can also
eagerly sort a range.
So you have two concepts that don't have much to do with each
other, and I think they should be really treated separately.
It's certainly when you're creating a vector
and if you only need the first element of that vector
and you have a filter on that vector,
it's certainly not efficient to filter the whole vector.
It's much more efficient to aggregate the whole vector
as large as it may be.
It doesn't matter.
It's a pointer at the end.
Just into your actual filter, and then return that,
and see whoever consumes this thing can then decide what to do with it.
And if you only need the first element,
you just filter until you get the first element,
and then you throw away the whole vector.
So that is certainly more, yeah, I think it's a good way to do it,
and I think you need this kind of way.
If you need it in the syntax, we do it.
For us, it makes no difference.
You write tc::filter, no matter if you have an rvalue or an lvalue.
Maybe people feel uneasy about that and they want something more explicit.
I think that's fine, but I think you should have the option of aggregating
your container into the range.
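The point that laziness and ownership are orthogonal can be sketched like this: a hypothetical helper (not think-cell's API) that owns the vector it was given, yet evaluates its predicate lazily and stops at the first match, so the rest of the container is never inspected.

```cpp
#include <optional>
#include <vector>

// Owns the (possibly huge) vector, but is lazy about the predicate:
// it stops at the first match instead of filtering everything first.
// Ownership and laziness are independent choices.
template <class Pred>
std::optional<int> first_match(std::vector<int> v, Pred pred, int* calls) {
    for (int x : v) {
        ++*calls;               // count how often the predicate runs
        if (pred(x)) return x;  // stop immediately; the tail is never looked at
    }
    return std::nullopt;
}
```

An eager "action" would have applied the predicate to all five elements below before taking the front; the lazy version applies it twice.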
Sounds interesting. How much are you using
your ranges library throughout your codebase?
Throughout. Everywhere.
I mean, it's...
And there are
very subtle things
that you discover
when you actually use
that
library so much.
I mean, std::min is broken, for example.
The standard std::min is broken.
Because rvalues bind to const ref, right?
And then std::min returns its return value as a const ref.
So it silently turns rvalues into lvalue references,
and you lose the ability to see whether things are an rvalue.
For these range libraries, you need to know whether to aggregate or not.
And in the case of the transform adapter, there are other decisions that you have to base on whether something is an rvalue or not. And when suddenly there is a function that silently turns rvalues into lvalues, that's just not a good idea. That causes problems.
Have you made a proposal or anything to fix this, or to make an rvalue-clean version of std::min and some of these other algorithms?
There is a tc::min. I mean, in the public library, there is a version that basically just says, okay, if I'm getting an rvalue, I have to leave it an rvalue. And there are interesting things, like when you are getting, say, a const ref and a ref ref, an rvalue ref, stuck into a tc::min: what's the return value? And I believe the correct return value is actually a const rvalue reference, which combines the fact that you cannot modify it, because it may be the const ref, with the fact that it may be going out of scope, because you've got an rvalue.
So you've got the worst of both worlds of a const ref and a ref ref,
which is then a const ref ref.
So in the beginning, you would ask,
what do you need const ref refs for?
Well, I think this is very much what you need them for.
And the type system does the right thing.
The only thing where I think C++ is wrong
is to bind rvalues to const ref,
which of course has historic reasons,
but it's one of the many things which have historic reasons,
but are fundamentally, I think, not correct.
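A sketch of the problem and one possible fix, using `std::string` and an invented `my_min` (this is not think-cell's tc::min): the classic const-ref signature loses the value category, which is how `std::min` over a temporary can hand back a reference into a dead object, while an extra rvalue overload can return by value so nothing dangles.

```cpp
#include <string>
#include <utility>

// Same shape as std::min: an rvalue argument binds to const&, and the
// result comes back as const& -- possibly a reference into a temporary
// that dies at the end of the full expression if the caller keeps it.
const std::string& my_min(const std::string& a, const std::string& b) {
    return b < a ? b : a;
}

// Overload for an rvalue first argument: return by value, so the
// caller can never be left holding a dangling reference.
std::string my_min(std::string&& a, const std::string& b) {
    return b < a ? b : std::move(a);
}
```

A complete solution would cover all lvalue/rvalue combinations of both arguments (this is where the const rvalue reference return mentioned above comes in); the two overloads here only illustrate the principle.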
So you gave a talk on your ranges library at Meeting C++, is that right?
Right, yeah.
Okay.
I think if I caught this right at the end of it,
you say that you hate the ranged for loop that we got in C++11.
Oh, yeah, absolutely.
Absolutely.
So with a passion.
No, so the problem with it is that we are now trying to train developers
to think in algorithms, to use all these tools that we have on ranges,
say filter and transform and all these things.
Now, but at the same time,
or maybe a short time before that,
we now gave our developers a new way to write a loop,
an explicit loop,
and that makes people use it.
And that's really the problem.
So people see a range, and they start off with a range-based for loop.
And because of it, because they have that
range-based for loop, they are
never jumping
to an algorithmic
solution that would take a lambda.
So for us at think-cell, we actually don't use the range-based for loop, but instead force people to write a tc::for_each, basically a std::for_each for ranges, using a lambda. Just because, when you already have a lambda, when you already wrote a lambda, making the jump to say, hey, maybe instead of for_each an all_of or an accumulate or something like that would actually be the better choice, the better tool to use, that jump is that much easier. And getting people to go from the range-based for loop to actually wrapping the same thing into a lambda, frequently, people just didn't do. And then they are basically stuck in that old for loop world that we actually didn't want anymore.
Makes sense.
Your range
library started as a fork of
boost range. Have you forked any other
boost libraries?
Not to the same extent.
I mean, we have contributed a bit to Spirit X3.
Made it a bit faster.
They kind of aborted parsing using exceptions
and that was slow, so we replaced that.
But that has all been merged.
And we use Spirit actually for parsing throughout.
We used to use regexes, but they're just terrible to maintain.
And it's one of the other things
that I would not
like to have, that I would not like to
be in the standard, because it's kind of a case of
when all you have is a regex, then
well, you're going to use the regex and
stop looking for alternatives.
And so
Boost Spirit is one. We have our own Boost.Operators, which is basically injecting operators: you have the modifying operators, and you want all the dual versions, like you generate the plus from the plus-equals, things like that. So there we have our own version.
We've done a bit of work on CLP, which is a linear solver. That's the math we need for the automatic layout. So this kind of stuff, but nothing to the extent of what we did with the ranges, I think. The range library does contain a bit of utilities, type traits and other useful things, like tc::min and tc::max, for example.
So, since you mentioned think-cell products again, if we can go back to that: when I was looking at the product, I thought, you know, this kind of speaks to me, because I don't use PowerPoint anymore. Trying to manage all of my slides for all my presentations and teaching material just becomes too much work. And you were talking about watching a bunch of people hunched over their computers, pushing slide elements around. So I use other tools that help me build presentations, tools that can do automatic code formatting for me, and in particular automatic syntax highlighting. Is that anything that your tool can do? Can it help the programmer making presentations also?
Not really.
I mean, I myself don't use,
I don't use PowerPoint for my presentations.
Just because, I mean, yes, it's a pain in the butt.
If you just have to write code, it's terrible.
And the thing is, the market is just not there.
We thought about doing something like a markup to PowerPoint converter,
where you can really write text.
Because nothing is as quick to write as text,
especially, I think, for programmers.
So it would be kind of natural to do this kind of converter.
This has always kind of been an idea,
but we never really followed up on it.
So, yeah, I don't think I can help you much there.
Okay.
Well, and also thinking about how you implement it,
you say you use C++,
that it's so much better an idea than C#,
but in your job posting,
you also say that you've used some assembler glue,
and I'm curious what that is for.
Well, I mean, we are interoperating with Office a lot,
and the public APIs just don't always do
what you would like to do.
Things like synchronizing with the undo queue, right?
So when the user is working with our software,
she is expecting that when you press undo,
that you get the proper undo behavior.
But that's kind of not exposed.
You don't really know when to do the undo steps.
That's one thing.
The other thing is that you may want to have your API or your UI, your visuals, tightly integrated.
So with Office, for example, if you scroll the slide or if you zoom a slide in and out,
you would expect that the elements you attached to the slide, to the charts on the slide, say, also zoom in and out.
And that actually we solve
by disassembling PowerPoint and Excel
with the IDA disassembler.
We go and find the hooks that we can actually patch. We built a little patch-searching engine: whenever PowerPoint or Excel starts, we actually go through the code and search for these patch sites. And it must be somewhat robust, because Microsoft puts out hotfixes all the time, so we can't break every time. We need to be robust against small changes in the code. So we have the scanner, we scan the code, and we set the hook. And obviously the low-level work, the function-hooking work, is done in assembly.
So you have to discover the name of the function that you want to hook.
Not the name. I mean, you have to find it in binary.
Okay, so you're not looking for symbols
in here.
There are no symbols.
You have to just find it in binary.
It's in IDA, right? You've got a disassembler.
Okay, so I'm curious
what exactly are you looking
for then? Certain byte patterns
once you've identified the code or what?
Yes, I mean, once you've identified what you actually want to hook,
we usually take a small patch around that place
and then look for this byte pattern.
There are certain things you have to be robust against. For example, the compiler sometimes changes jumps, say from jump-if-zero to jump-if-non-zero, so it basically turns around the parity of the jump, and that's something we have to be robust against. Some others, like long jump versus short jump, these kinds of simple changes in the assembly code, you have to be robust against when you actually do that scanning, otherwise it will just break too often. But if you do that, then it's actually manageable.
And since you're running as a plug-in to PowerPoint, you're already operating in its address space and permissions and whatever, so you're not doing anything that's going to alert Windows that you're doing something wrong?
We are not the only ones doing this.
There are DRM tools
that try to prevent PowerPoint
from saving files.
There are, of course,
all kinds of malware scanners
that try to detect behavior.
So we're not the only ones
patching around in process address spaces.
But maybe we're the only charting add-in that does that.
That's very interesting.
And I wonder if this is something we could have dug into last week when we were talking about the overlays with GOG and Steam and whatever,
with games, how they patch executables too.
That's certainly a realm I've never worked in personally.
It's interesting stuff.
Bringing it back to the range library,
ranges I believe we're hoping is going to get into C++20.
Have you thought about recommending any of the changes you've made
in ThinkTel's range library to Eric Niebler and the standards committee?
Or do you think C++ developers are still going to be pretty happy just getting any version of ranges?
So far, what I've seen in the ranges TS, we looked at that stuff.
And there was one crucial thing that was pretty early on that we had changed, and they actually did change it.
Okay.
Is that when you get an iterator from a range, that you must make sure that the range stays alive.
And that's something they actually added to the standard, that basically the iterator validity is tied to the range validity.
If you don't do that, there's basically no way that I know of
to implement efficiently filter adapters.
Because when you are stacking filter, the point is this,
when you have an iterator of a
filter adapter, the iterator
just by itself,
if it doesn't reference any
range,
if it cannot store
extra data anywhere,
then it has to store its
end iterator, because otherwise, while
filtering, it will basically fall off the edge.
So we are moving your iterator forward until you hit your filter criterion, but you may hit end.
And so
you need to know where end is. So you need to
store the end iterator.
And when
you now start stacking these
filter iterators, then
every one of these, so you have a filter of a filter of a filter, then every one of these filter iterators would store two iterators of its underlying iterator type.
So you get a combinatorial explosion in the size of, or not, is it, no, it's not combinatorial, it's exponential. So you get an exponential explosion in the size of your iterator. At some point, when you stack 10 of these, you have a 1,024-word-large iterator that you're moving through your program. That can't be efficient, right? So we made this one crucial change that allows us to efficiently implement these iterators.
There is one thing that we did in our library
which goes beyond that.
When you're sticking to what Eric did,
you get still linear growth in the size of the iterators.
That's much better than, certainly, the exponential growth,
but it's still kind of bad.
And there is a way to do without,
which is not precluded by the standard.
So you could implement it with the standard,
standard conformant,
and still get small iterators.
But Eric hasn't done it,
and you need extra concepts.
You need like an index concept.
If you're interested,
there's the talk on the website,
on our website,
talking about this index concept. And then you can actually implement filter or transform or things like that efficiently, with small iterators.
Okay. Jason, do you have anything else you want to ask?
I don't think so.
Okay. Well, Arno, it's been great having you on the show today. Is there anything
else you want to tell us about think cell before we let you go? Well, one thing is, of course, we are certainly looking for C++ developers.
So we try to hire as many as we can.
We certainly, I think we have a very high quality bar.
So it's quite difficult to get in.
But we are certainly, we are really looking for developers.
And we would love to hear from anyone looking for a job.
And you are in Berlin?
We are in Berlin, and currently everyone is on-site.
And so far, it's basically relocation to Berlin.
Okay.
Well, it seems like Berlin has a great C++ community.
I mean, with Jens out there running Meeting C++,
I'm sure it's a really strong community.
Yeah.
Okay.
Well, it's been great having you on the show today, Arno.
Thank you very much.
Thanks for joining us.
Thanks so much for listening in as we chat about C++.
I'd love to hear what you think of the podcast.
Please let me know if we're discussing the stuff you're interested in.
Or if you have a suggestion for a topic, I'd love to hear about that too. You can email all your
thoughts to feedback at cppcast.com. I'd also appreciate if you like CppCast on Facebook and
follow CppCast on Twitter. You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter.
And of course, you can find all that info and the show notes on the podcast website at cppcast.com.
Theme music for this episode is provided by podcastthemes.com.