CppCast - Reverse Engineering C++

Starting point is 00:00:00 Episode 192 of CppCast with guest Gal Zaban, recorded March 28, 2019. Sponsor of this episode of CppCast is PVS Studio. PVS Studio is a static application security testing tool for detecting errors and potential vulnerabilities in the source code of programs written in C, C++, C#, and Java. In this episode, we discuss Clang updates and the least secure programming language. Then we'll talk to security researcher Gal Zaban. Gal talks to us about reverse engineering C++. Welcome to episode 192 of CppCast, the first podcast for C++ developers by C++ developers. I'm your host, Rob Irving. Joining me is my co-host, Jason Turner. Jason, how are you doing today?

Starting point is 00:01:22 All right, Rob, how are you doing? Doing okay. Recovering from some travel. Was at the Microsoft Summit last week. Right. How'd that go? It was good. I got to hang out with Simon Brand, who's now the Visual C++ community guy over there. Evangelist, effectively.

Starting point is 00:01:40 Evangelist, something like that. And yeah, did a nice episode with him and Marian Luparu and Tara Raj and Bob Brown. Cool. You been busy with everything? Yes, I mean it's tax season and I have a bunch of travel coming up

Starting point is 00:01:58 and then, well, a couple of conferences, and yeah, definitely been staying busy. Okay. Well, at the top of our episode, I'd like to read a piece of feedback. This week we got an email from Sebastian, and he says, Hey, first of all, great podcast. You're doing a great job at interviewing interesting people and coming up with news I didn't get time to follow,

Starting point is 00:02:17 and getting that during my sport or commute is great. And then he does say he has one complaint, which is the intro music is way too loud, and he says that he listens to a lot of other podcasts but ours is the only one where he has to lower the volume temporarily to avoid ear damage. So I'm surprised I've never heard this complaint before. Have you? No, I've never seen it come up, but it's easy to believe. I mean, if the music is jarring at a different level or something else, I don't know, you'd have to normalize it across or something later. Well, I do, I normalize everything, so there's at least parts of the entire audio track that I thought got

Starting point is 00:02:58 as loud as the intro music, but maybe I can bring the intro music itself just down a little bit. I'll take a look at it with the editing for this episode and see what I can do. Yeah, no, I'm sure there's, yeah, if you can. Well, thanks for the feedback, Sebastian. We'd love to hear your thoughts about the show as well. You can always reach out to us on Facebook, Twitter, or email us at feedback@cppcast.com. And don't forget to leave us a review on iTunes. Joining us today is Gal Zaban. Gal is currently working as a security researcher.

Starting point is 00:03:31 Her passion is reverse engineering with a particular interest in C++ code. In her spare time, when not delving into low-level research, she designs and sews her own clothes and loves to play the clarinet. Gal, welcome to the show. Hello, everyone. Thank you for having me. I have got to ask you some questions about the designing and sewing of your own clothes. I go to sewing expos with my wife at least once a year or so, and I've noticed that the sewing machines that are sold today are, I mean, they're geeky toys just like at any other expo that I go to.

Starting point is 00:04:06 They've got touchscreens and embroidery systems and you can program them and upload files. And like, do you get into that side of things also? So actually researching a sewing machine is stuff that I didn't do until now. When I mean research, I mean like take them apart, look at the code and start to find vulnerabilities in them. Right. But yes, I know that there is lots of gadgets for those things. It's kind of funny

Starting point is 00:04:34 how a hobby can be that extensive when you talk about those things. But I think that the most important thing is the fabric choosing. So you need to choose, you have lots of options and a lot of patterns. And actually, this is the hardest part in sewing clothes, in my opinion.

Starting point is 00:04:57 Right. So the actual design of the clothing itself, in other words. Yes, exactly. Yeah. My wife does a lot of sewing. She's only made a few things completely from scratch, but pretty much modifies nearly everything that she buys. I can understand that. It really saves you the time of going to some expert that could do it for you, and actually you can personalize everything that you have to make it prettier and your own. Right. Okay, well, Gal, we've got a couple of news articles to discuss. Feel free to comment on any of these,

Starting point is 00:05:29 and then we'll start talking more about your research and reverse engineering. Okay? Sounds great. Okay. So first, LLVM Clang 8.0 was released, and I don't know a whole lot about LLVM. Jason, did you have a chance to read through all these changelogs? Oh, goodness. I looked through the Clang side. I didn't look through the LLVM

Starting point is 00:05:51 side specifically. And then I also cross-referenced it with the cppreference.com list of compiler support to see what C++20 things we got in Clang 8. Okay. And there's a few things that stand out, such as constexpr-algorithm. Oh, okay. It is the first shipping vendor with constexpr-algorithm support. So that's cool. Yeah.

Starting point is 00:06:18 That was the main thing that stood out to me on that one. Was that the first C++20 feature? Were there any others in there? Oh yeah, there's several others. constexpr try-catch blocks, nested inline namespaces, this one's particularly fun,

Starting point is 00:06:36 prohibiting aggregates with user-declared constructors. Now, it's considered a design flaw. I'll give a quick overview of this from the listener's perspective here. [...] default constructor or whatever, right? It is possible to still do aggregate initialization of those public members and bypass the constructor effectively, as of C++17. C++20 plugged that issue. And it's kind of goofy.

Starting point is 00:07:17 This is the kind of thing that ends up as a Twitter poll, basically. So that's, yeah, that's a thing that's been fixed. And then a few others, like default constructible and assignable stateless lambdas, range-based for init statements.

Starting point is 00:07:37 So just like the if-init and switch-init statements that we got in C++17, range-based for loops now have that as well, which starts to become a little bit confusing with how many different ways it's possible to construct a for loop. But yeah, they've got a lot going on. And the spaceship operator partial support. Nice, nice. And also there's a bunch of sanitizer work here for implicit conversion sanitizer,

Starting point is 00:08:01 which I'm going to have to spend some time playing with that one. Yeah. Well, it's great to see that they're already doing so much C++20 work when the ink isn't even dry on the standard. Next, we have some conference news. C++ Now, the keynote was announced, and it's going to be a guest we've had on the show before, Hana Dusíková, who talked about compile-time regular expressions.

Starting point is 00:08:25 Her keynote will be on. It's had some updates since her CppCon talk and since we had her on. She'll be presenting some new material from updating the library, which she's been continuing to work on. Yes, and I have to give her serious

Starting point is 00:08:41 props for this because she keeps giving, at many conferences, has given a talk called Compile Time Regular Expressions. Right. But it's always new material. Yeah, I mean, I think even when we had her on, she said it was like the fifth iteration of the library, so I guess maybe she's up to her sixth now? I'm sure at least, at least the sixth, yeah. I guess it will also be interesting to see her talk in Core CPP in May. That's right.

Starting point is 00:09:09 She will be giving a talk with a similar name then, and I expect that there will be more updates again. Was there some other conference news you wanted to go over, Jason? Just in general, we've got C++ Now coming up. We haven't talked about it a lot, and that's probably partially my fault because I'm not planning to go, so I haven't been talking about it. Right, because it's back-to-back with Core C++, right? Yeah, it would be just complicated for me. It's between some work trips

Starting point is 00:09:36 and then right up against with Core C++, yeah. But that's May 5th through the 10th. It's coming up in Aspen. It looks like a great lineup, of course, every year. They did announce the schedule. And I think they announced the schedule, right? Yes, the 2019 schedule is online now. And the tickets are available. It's always a great conference. You should definitely go. Anything else with conference news? I don't know. Let's see. Core C++, that's coming up, of course, which our guest will be speaking at. Do you have anything that you would like to say about Core C++, Gal? I'm really looking forward to Core CPP.

Starting point is 00:10:14 It actually will be my first conference that I'm going to present for developers and not for security researchers. And I guess you will see my talk then. Yeah, I'm looking forward to it. Yep, I'm looking forward for this too. So that's May 14th, 15th, and 16th for those who are listening but haven't checked that out yet. There's definitely still time to buy tickets, and that's in Tel Aviv. Okay, and then a couple more things on the news.

Starting point is 00:10:46 There's this new nameof operator. It's called an operator, but it's a library, and it does, you know, get you the name of functions, variables, class types, pretty much anything. There seems to be a NAMEOF macro in this little header-only library. It seems pretty cool. Yeah, it doesn't even, I don't know, seem like it should be possible. Yeah, yeah. I didn't go into the guts of the code to see how he does it with everything. Did you take a look at it?

Starting point is 00:11:18 I did start reading through the code. The main thing that I noticed is that it is super, super clean. It's like fully constexpr, noexcept-correct. And everything is just like nice and organized. And it's tiny. It's a tiny library, 285 lines of code. It's a lot of functionality out of 285 lines of code. Yeah.

Starting point is 00:11:43 Yeah. And it looks like it runs on Visual C++, GCC, and Clang, so it's nice. Yeah, just a tiny bit of compiler-specific glue in here for sucking out the names of these things. Yeah, definitely seems like the type of thing that would be nice to get standardized. Yeah, I mean, reflection. It's its own thing. Hopefully it'll come soon. And then the last thing is we have this short article about the three

Starting point is 00:12:11 least secure programming languages and I was very happy to see C++ was not at the top of this list. That's the only reason I put this on the news list. It's called the three least, but it does give you the top seven, and we are in the top seven. But in order, it's C, PHP, Java, JavaScript, Python, C++, and then Ruby.

Starting point is 00:12:37 I guess that one of the reasons that CPP is that low is not necessarily because it's more secure than all the others significantly, just because there is a lot of big projects and big codes at that, written in C, for example. And this is one of the reasons that you can see C up on the top with almost 50%. Also, some of the things that I think is that because

Starting point is 00:13:06 the report is only about open source code, there is also a lot of closed source code that is being researched and is not included in this report. And C++, speaking in general about

Starting point is 00:13:23 closed source and not this specific report, has lots of complexity if you compare it to C, and actually C is quite a known language that a lot of people prefer researching instead of C++. Right. So maybe it explains some of the reasons why C has so many vulnerabilities in this report instead of other languages and why C++ is that low. Yeah, I mean, I guess like if you take, you know, critically visible projects like OpenSSL, which are written in pure C, and if they have a couple of CVEs against them, then that becomes a really big deal. Yeah. I also read the Slashdot discussion on this, and it's been a very long time since I've programmed

Starting point is 00:14:09 in PHP, and PHP is listed as number two here. But apparently, there's so many gotchas in PHP, you have to know the language really intimately to avoid writing security flaws. But I don't know.

Starting point is 00:14:26 I haven't spent a lot of time with it in a long time. Also, this report shows 10 years of CVEs that were found. So you might also want to take that into consideration when you look at this report. Right, right. Okay, well, since we're talking about security, maybe we should move on to the interview. So, Gal, you're a security researcher and your focus is on reverse engineering.

Starting point is 00:14:52 So can you tell us a little bit more about what exactly reverse engineering is? Yes. Reverse engineering is a process when you take a binary, usually a binary, but it can also be when you take a binary and try to understand the logic of the code that was written at the beginning. When you take some, what I do is specifically for software, but you can also do it for hardware. When you take a machine, you take a compiled code,

Starting point is 00:15:25 and then you try to understand the source, how the source was written, how everything was built at the beginning, what the developer really wanted to write. So this is in general how reversing, what reversing is.

Starting point is 00:15:41 Okay, so why reverse engineer a program? So there is some reasons why someone would like to reverse a program. I'm going to focus on the software side. So you have the ethical reasons. If you want to protect a product, if you want to make sure you have no vulnerabilities. So in some cases, you would like to research and reverse engineer a code that you don't have its source.

Starting point is 00:16:11 And then you can reverse engineer, find vulnerabilities, and then report them to the vendor that is responsible for them. So this is one of the reasons and one of the things you can do with reversing. Actually understand the logic to protect code. Other reasons can be less, let's say, legal. So white hat versus black hat reasons, right? Yeah, exactly. You have people who reverse code in order to find vulnerabilities and exploit them to cause harm to some product. You can do it to understand logic and steal the logic of some products too.

Starting point is 00:16:54 So you can use reversing for good and for bad, but it's a really strong tool for people, because normally developers don't just give out their code. For extremely critical machines, you usually have to reverse it and understand it on your own. So even if you want to understand the code to protect it, in some cases you just have to reverse it, because, I mean, Linux is open source, but Windows, if you want to research and understand it, you have to reverse code to protect stuff. Right.

Starting point is 00:17:31 That got me wondering, is there any advantage to reversing your own code that you already know the source code to? Can you learn different things about a project that way? So sometimes when you want to debug your code, you can debug the reversed code. I mean, like, if you write drivers, sometimes you have some very hard bugs to fix when you look at the source only.

Starting point is 00:17:59 And sometimes you need to combine looking at the source with looking at the assembly code that actually runs. And this is one of the reasons you would reverse your code even though you have the source. Other reasons can be optimization sometimes, or maybe if you want to understand better what the code you write actually does. So these are some of the reasons that I would think that even if you have the source, you would reverse the code. Okay. How good does the reverse engineered code typically look? Like if you're comparing the output of a reverse engineered assembly

Starting point is 00:18:41 to the source code you started with, is it recognizable? It is recognizable. You can really understand. There is some of the things that after you compile the code are not there anymore, especially if you don't compile with symbols. So you don't have the parameters, you don't have the names, but you have all the structures.

Starting point is 00:19:03 You can understand them by reversing the code. It might take you more time if you are't have the names, but you have all the structures, you can understand them by reversing the code. It might take you more time if you are not experienced with it. But afterwards, everything that is written, you have a compiler at the middle that creates assembly code and it has patterns that you can see. And also you have logic you can understand by it. So if the debug symbols are in there, then you can at least potentially get the names of the variables and the names of the functions and that kind of thing that are being called. Yes.

Starting point is 00:19:36 So if you have debug symbols, so I can actually see a lot of those things that you write as a developer. You have more names. You sometimes have debug strings that are compiled when you compile with the debug. But actually, you can see more stuff when a program is compiled in debug mode. So I'm wondering this as I'm thinking about this more,

Starting point is 00:20:06 are there countermeasures that make it more difficult for your code to be reverse engineered? Yes, there is stuff. There is a lot of stuff actually that makes a reverse engineer job harder. Some stuff is called

Starting point is 00:20:23 anti-debugging that you can add to your codes to prevent people like us to reverse it. Anti-debugging? Yeah. You can do a lot of stuff to make the reverse job harder. Anti-debugging is one of the

Starting point is 00:20:39 options, but it only helps to make the process longer. It cannot prevent people from reversing. It can only make the process longer. For example, you can add a lot of functionality just to confuse reversers,

Starting point is 00:21:00 and sometimes you have obfuscation and stuff like this. And all of those things, at the end, just stall the process but do not prevent it. So, I mean, as a security researcher, I'm assuming your job is more often reverse engineering the code. But I'm just wondering, would you ever make a recommendation to, I don't know, like I'm thinking about voting systems. Everyone's talking about voting systems and what vulnerabilities they have and questioning, you know, whether or not we should be using electronic voting and whatever. Like, should the voting system vendors apply these

Starting point is 00:21:39 anti-debugging countermeasures or whatever to their source code so that it's harder for someone to crack, or should they not? Is there any real advantage to that? So there is an advantage, because usually the purpose of anti-debugging is just to stall the process. You can see it, for example, in video games, when you want someone to not reverse and publish code in order to sell more. So in these election systems, it's quite useful because it will take more time. I'm not sure that it can 100% prevent people from doing so,

Starting point is 00:22:19 but it might stall them enough so the process at the end will be okay. Okay, nothing can be 100 percent secure. Okay. I wanted to interrupt the discussion for just a moment to bring you a word from our sponsors. PVS Studio performs static code analysis and issues warnings for fragments of code that are likely to contain errors and potential vulnerabilities. The tool supports the C, C++, C#, and Java programming languages. At the moment, PVS Studio has 422 diagnostics for C and C++, which enable you to detect dereferencing of null pointers, array bound violations, typos, dead code, resource leaks, and other kinds of errors. PVS Studio supports working with Visual C++, GCC, and Clang compilers, as well as a number of compilers for embedded systems.

Starting point is 00:23:10 The analyzer works in the Windows, Linux, and macOS environments. Follow the links in the description to two new posts from the PVS Studio team. The first one suggests checking your skills to find errors. For the second one, you can read about a non-obvious case of an analyzer false positive. One thing I'm curious about is how does C++ compare to other languages with reverse engineering? Is it easier or more difficult to reverse engineer a C++ library or application? So C++ is quite hard to reverse, especially when you compare it to other languages, because C++ has virtualization and objects and a lot of concepts. Actually, every few years there is a new, there are new concepts every time.

Starting point is 00:24:01 And it's quite hard to understand the code itself because, for example, if you take virtualization and virtual calls specifically, as a reverser you cannot see in the code after compilation exactly what will be called. You can only see that it's calling through a register, and then only at runtime you can understand which functions were called, and this is quite complicated. And also you have all the objects and inheritance that are quite hard to understand. It takes a lot of time, basically, and it really makes the process a bit different. You need to understand a lot of the, I don't know how to call it, metadata, but you need to understand the objects and everything around the code

Starting point is 00:24:58 and not only the logic itself. Okay. So this is what you'll be talking about at your Core C++ talk, right? Yes, exactly. So I'm going to show, because this talk is going to be focused for developers, I'm going to show some of the stuff

Starting point is 00:25:18 that might interest developers more than security researchers. Because when you develop code, you don't always think about what it looks like at the end, what it's compiled to. So I'm going to show some basic stuff, and I'm going to show some complex stuff about how the code that you write really looks in the assembly, what you can understand as a security researcher, like how do I look at C++ code and what makes my job easier and what is just

Starting point is 00:25:54 complicating stuff. And actually, I think that it will show some of the developers don't know about those concepts, how all the stuff that you write really looks like at the end. All right. Can you tell us a little bit more about the tools that you use when reverse engineering C++? So, reverse C++, in general, you need to divide it into a few processes. The first is the disassembler, which actually converts the binary itself to some kind of an assembly language. And then you have a few options for this. One of the things is the most, like the tool that I use the most is IDA, which is IDA Pro, which is a tool that helps disassemble and also to decompile some of the code.

Starting point is 00:26:58 And so if this is how I talked about the first part about disassembly. C++ itself has some, there is some tools that was written by researchers and security researchers specifically that helps understand objects and inheritance and some of them also about the virtualization. Also, universities are quite into the subject of understanding C++ and reversing them and understanding objects after compilation. So there is also some articles that were written. But generally, there isn't lots of tools in this subject because all of them requires more work afterward.

Starting point is 00:27:46 You cannot just run a tool and then get all the information you need. You use the tool and then you need to work on it more. So this is the main problem with CPP. You don't have like an easy one-click solution. Okay, so yeah, I have actually used a .NET decompiler and you can just take a DLL and drop it into there. And even without symbols, it'll output some source code for you to look at. So you're saying with C++, it takes a lot more work than that. Yeah, because .NET has bytecode,

Starting point is 00:28:17 and C++ is a compiled language that doesn't have a bytecode. So when you use .NET and then you use all the decompilers that you have for it, the job is much different. And this is why you could also see all the variables and the names and everything. In C++, it's different. So are any of those tools that you mentioned freely available, or are they commercial projects? So there is a lot of

Starting point is 00:28:45 scripts written in IDAPython for IDA. IDA is not an open source tool. You have other alternatives that are open source. There is also the new tool that was released by the NSA now called

Starting point is 00:29:02 Ghidra, and you have other alternatives to IDA. But if you speak specifically about tools for C++, some of the tools are open source. A lot of scripts, IDAPython scripts, were written to help with this problem.

Starting point is 00:29:18 So this is basically what you have now. I'm curious as you've been working with reverse debugging and I'm just, or reverse engineering, excuse me, thinking about it

Starting point is 00:29:29 from a programmer's perspective, have you ever come across something that like had to be a compiler bug, like the compiler generated incorrect code here for the source code that it was given?

Starting point is 00:29:41 You mean the compiler had a bug that created vulnerabilities? Yeah, or just incorrect code somehow. I don't know. I don't even have a great example. There is vulnerabilities also in compilers and stuff, but it's not something

Starting point is 00:29:56 that I've done before. I just thought with potentially spending that much time staring at assembly or disassembly that you would have come across. I usually look at assembly or disassembly that you would have come across? I usually look at assembly and not the source. So I usually don't have both, so I cannot compare. I can only see what I have in the assembly.

Starting point is 00:30:15 So if there was a source that was compiled wrongly, I could not know that because I don't have the source itself. So could you spend some time, are you able to spend some time talking about what your actual day-to-day job looks like? Yeah, sure. So actually what I do in my job is that I take binaries and frameworks and actually try to understand and find vulnerabilities in them to look for some flaws in the behavior of the code

Starting point is 00:30:48 or to find flaws that cause memory corruptions and then try to understand if they are exploitable or not and sometimes write exploits for them. So this is what I do. And the day-to-day job is quite frustrating and not like it sounds. Reversing as

Starting point is 00:31:12 a concept, it's really hard and really takes a lot of time. If you have one function, you can look at it for days and you can look at it for months and you can stare at code for a long time before even finding something. So it's quite fascinating to be a reverse engineer at the end.

Starting point is 00:31:30 I really enjoy it. I cannot complain at all because I love my job. But at the end, you usually have lots of frustration. You fail a lot as a security researcher because you try to find bugs in someone else's code when you don't have the code itself. So you have the process of reversing the code, which takes a long time, and you have the process of finding vulnerabilities, which is hard by itself. So all of this process is quite hard.

Starting point is 00:32:00 So you usually fail. It's not that you look at stuff and all the vulnerabilities just pop out of the code. It takes a long time to find stuff. So are you hired by the company that wrote the code in the first place? Yes. Okay. So it's all like I do research to protect the product and to find stuff to make the product better. But you have still a lot

Starting point is 00:32:30 of work because sometimes you don't want to have the source and look at the source. You want to look at it from the other side, the red team side perspective when you try to attack something to protect it. Have you for the fun of it and not as part of work,

Starting point is 00:32:48 done any of the bug bounty kind of, like find a security vulnerability in Chrome and get a payback kind of thing? So I sometimes look at that stuff, but I'm not really doing a lot of bug bounties, and I try in my spare time, when I'm not working on vulnerabilities, to actually focus on the reverse engineering itself. I really like the process, and I really think that there are a lot of methods for reverse engineering that should be automated, and some of them should be taken further. I just really like reverse engineering generally, so sometimes I do some stuff to make my day-to-day job more automated

Starting point is 00:33:38 because there is always stuff to do. And also I really like to write tools for reversing C++, as I already presented about. So in my spare time, I usually focus on that and not on bug bounties. I try to actually focus on the reverse engineering itself. Do you want to tell us some more about some of these tools that you've worked on? So one of the big tools that I've written is Virtuailor,

Starting point is 00:34:11 which is an open source tool written in IDAPython. And this tool actually helps reverse engineers to work on C++ code. So what it does, actually, as I said before, one of the biggest problems that we have is that we sometimes have virtual calls in our code. And then we have problems understanding what function is being called. Because what you see is usually a call to a register, which is not really useful when you have the code without running it.

Starting point is 00:34:49 So you usually end up running the code and executing it, and it takes lots of time just to understand a lot of the options that this virtual call can have. And what I made is a tool that can, at runtime, create a connection between the virtual call and the function, and also helps creating structures and other comments on the assembly code, so you can reverse it statically later. Because there are two options for reversing code: you have the static option, when you just have the assembly, you read the assembly, you understand the logic, and you also have the dynamic option, where you run the code and you understand stuff while the code is being executed. And this option is more time consuming than just statically looking at assembly. So what I did is to try to minimize

Starting point is 00:35:51 the dynamic reversing of C++ and actually creating the structures and understanding the virtual call so you could afterwards reverse it and look at it statically. Sounds pretty cool. I hope that people will find it useful as I found it. I believe I found your GitHub project here and was just looking at the screenshots that you have

Starting point is 00:36:22 of some of the things going on. Yeah, so my code is on GitHub. So you can find also some screenshots of the functionalities that the tool does. Some of it is creating structures. Some of it is adding xrefs to the code. So you can find your functions after you finish the dynamic analysis. And I'm really happy if some people will, it's really hard when you have an open source project because you always want to understand what people think about it and if it helped them, what features would they like.

Starting point is 00:37:01 And it would be really nice if people did that more often: just comment, open issues, or send DMs, so I can improve the code and keep adding features. Yeah, definitely. I'm managing my own open source projects, and I feel like you don't hear from at least 95% of your users, so I totally appreciate that. So definitely, if you're listening to this and you use this or give it a try, let her know. I would love that. So for our listeners who are maybe interested in getting a job where they're doing reverse engineering of code and looking for vulnerabilities, do you have any recommendations for them? Like what kinds of things they should be studying or looking into on their own?

Starting point is 00:37:48 And how do you go about finding a job doing this? What I think is really important is that people who do reversing don't necessarily realize how important it is to understand assembly code. So the first thing I would recommend for people who want to start reversing is to write assembly code and not only read it. When you write assembly, you really understand things that you don't necessarily understand when you just read it. Today you have so many decompilers, so a lot of people just decompile the code

Starting point is 00:38:24 and don't look at the assembly enough. I'm one of the people who believe that assembly is really important to know in order to understand the reversing process. And when I say know, I mean know how to write assembly.

Starting point is 00:38:40 So it's just a tip, but I think it will help people a lot in this process. The second thing is that I think people should have a lot of patience. Reversing is quite a long process, and if you want to be good at reversing, you need a lot of experience. You need to do stuff.

Starting point is 00:39:02 You need to read a lot of code, and it's not an easy job that you can learn in a few days and then be a good reverser. So you need to be patient and learn a lot, be up to date with all of the vulnerabilities and things that are found today, and actually be very stubborn when you read code and try to find stuff, because you really need to look at a lot of code, and you sometimes look at the same thing more than once or twice. You need to have the right state of mind when you start working in the field. So this is my recommendation. It's not like a 90s hackers movie where it's all very exciting and happens in about 10 minutes.

Starting point is 00:39:53 Not at all. When you see it on TV, everything looks so glamorous, like, wow, let's press these few buttons and everything just opens up for you. It's exactly the opposite: you press a few buttons and it's not even the beginning. So it's just something that you need to understand before you start, because everyone wants to do the most glamorous thing, and the idea that, wow, you just hack into stuff,

Starting point is 00:40:27 and it's not like that. You have a lot of work, and you have a lot of code to read, and usually, as I said before, you fail. You don't succeed all the time. You usually fail. You try one lead and it doesn't pay off, so you try something else and it's a dead end too,

Starting point is 00:40:47 and then on the tenth try you find something, but it took a few months to work through all your leads. It's really a state of mind that you need to have,

Starting point is 00:41:04 that the feeling when you succeed at something is amazing. But in the process, you really need to be prepared for all the obstacles that you will have. I actually love all of these obstacles, because the more complicated the code is, the more interesting I find it. It might not be the easiest, but you have more of a challenge. And what I like in my job is that I have challenges.

Starting point is 00:41:34 Right, so if you really like solving problems. And puzzles, actually. Puzzles, yeah. Now, you said that you strongly recommend spending time writing assembly. I mean, there's 32-bit Intel, 64-bit Intel, 32- and 64-bit ARM, MIPS. There are lots of commonly used processors. What would you suggest? So I would suggest starting with Intel,

Starting point is 00:42:00 just because a lot of the tutorial material on reversing is written for Intel, and there are a lot of resources. I also reverse some, let's call them, archaic architectures, and I can say that I know how to write assembly code for all of those architectures. But Intel is the basics. ARM is also good practice, because I guess x86 and ARM are the most common ones.

Starting point is 00:42:33 So this is what I would start with. Do you have a recommendation for what kinds of things to do? I mean, for learning? So I guess at the beginning there are some basic programs you can write. I wouldn't start by writing the same code that you write in C++, but you can do basic things at the beginning, like take some numbers and add them, or try to compute Fibonacci, and stuff like this. And at the end, you can do something more complicated.

Starting point is 00:43:07 There are lots of exercises you can try. And actually, if you start with the same exercises that you would do when learning to program, and you do them in assembly, it will really cover what you need for being a good reverser. But actually, you will never be an assembly developer. That's not what I mean here. It's just to practice and to understand what the assembly code looks like.

Starting point is 00:43:42 Right. So this is what I think. Okay. Another question I had, earlier you were saying how, you know, there's some techniques like anti-debugging to help protect a C++ program from reverse engineering. Are there any, like, open source or commercial tools

Starting point is 00:43:59 to, you know, protect your application from reverse engineering that you would recommend? I don't have something specific that I would recommend. There are lots of articles and blogs about it, so you can just read about it and implement it. But I don't have something specific that I would recommend. Okay. Okay.

Starting point is 00:44:23 Well, Gal, it's been great talking to you today. Is there anything else you wanted to go over before we let you go? Actually, no. It was a pleasure to be on your episode. Thanks for coming on. Yeah, thanks for coming on. Thank you. Thanks so much for listening in as we chat about C++. We'd love to hear what you think of the podcast. Please let us know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic, we'd love to hear about that too. You can email all your thoughts to feedback at cppcast.com. We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter. You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter. We'd also like to thank all our patrons who help support the show through Patreon.

Starting point is 00:45:07 If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast. And of course, you can find all that info and the show notes on the podcast website at cppcast.com. Theme music for this episode was provided by podcastthemes.com.
