CyberWire Daily - Pentesting at the speed of thought. [CyberWire-X]
Episode Date: January 19, 2026
While our team is observing the Martin Luther King, Jr. holiday in the United States, please enjoy this CyberWire-X episode featuring the team from Horizon3.ai. In this episode, Dave Bittner speaks with Horizon3.ai co-founder and CEO Snehal Antani about how continuous autonomous penetration testing is reshaping security resilience. Antani reflects on his journey from CIO to DoD operator, where he learned that the hardest part of security isn't patching; it's prioritizing what matters and proving defenses work before attackers do. He explains why vulnerability scans fall short, how "AI hackers" simulate adversary behavior at machine speed, and why organizations must shift from compliance thinking to attacker-centric validation. Antani shares real-world findings, warns of a 77-second domain compromise, and predicts a future of AI fighting AI, with humans by exception.
Resources:
Whitepaper: NodeZero® for Pentesters and Red Teams
Whitepaper: Traditional vs. Autonomous: Why NodeZero® is the Future of Cyber Risk Assessments
Transcript
You're listening to the Cyberwire Network, powered by N2K.
Hello everyone and welcome to this CyberwireX special edition.
I'm Dave Bittner.
Today, we're talking about one of the toughest challenges defenders face, sorting the noise from the signal.
For many teams, vulnerability management feels like a fire hose, loud, constant, and not
especially helpful.
And when everything is labeled critical, nothing really is.
Our guest today is Snehal Antani, co-founder and CEO of Horizon3.ai.
He's lived that struggle firsthand as a CIO, deciding what not to do, asking staff to cancel
family plans for patches that didn't matter, and trying to prove security controls worked
without waiting for an actual breach to test them.
That frustration pushed him to rethink how we validate security in the first place.
Instead of guessing, he decided to continuously pen-test his own environment,
and the challenges he ran into along the way ultimately led him to co-found Horizon3.ai,
a company working to bring autonomous pen-testing and AI hackers into the mainstream.
His argument is simple.
Vuln scanners tell you what might be wrong; pen tests show you what actually matters.
So today we are talking about pen-testing, continuous pen-testing,
and AI hackers, and all that good stuff.
I would love to start with a little background on you.
I know you've said that the hardest part of being a CIO
was deciding what not to do.
How did that come to be part of your own personal journey
and kind of something that you lead with?
It's interesting.
So I'm an engineer by education and trade.
I did my undergrad at Purdue in computer science,
but started my career at IBM doing distributed systems,
working on the mainframe,
working on WebSphere,
working with IBM research and product.
And then I left to be a CIO at GE Capital.
It was my first kind of big executive job.
And it was amazing, one, to have that opportunity and work with this amazing team.
But then, two, to really start to have sympathy for the difficulty of decision making at that level.
And one of those decisions was, from a cybersecurity standpoint, I would get a list of 100,000 vulnerabilities
that must be fixed, right, according to some tools, some measure.
And I would look at that and figure out, well, I don't have the capacity to do this.
So what do I not fix on this list?
And for the things I do have to fix, I've got to look my IT admin in the eyes and tell him or her,
they've got to skip their kid's basketball game or cancel their weekend plans and stay
behind and fix these issues.
So the hardest part of the job was deciding what not to fix.
The second hardest part of the job was telling people to fix stuff I knew wasn't even exploitable
or relevant to the attacker.
And so that was quite difficult, especially in a large scale organization.
And this idea of fiercely prioritizing problems that matter became a key part of my mantra.
How did you communicate that to folks at the board level, that kind of prioritizing?
Because I could imagine that they're thinking, well, it's all going to be patched, right?
Yeah, you know, there's two aspects to it.
The first aspect is the answer isn't just patching.
You know, I feel in industry we over-rotate to patching as the end-all, be-all that solves everything.
But actually, it's a small part of your defensive remediation plan.
There are compensating controls to go off and put in.
There are things you can do to reduce your blast radius.
You know, reduce the roles of particular credentials, reduce the access to
particular share drives or whatnot.
There are things you can do to improve detection and response,
so you at least have a chance at stifling and containing the attacker if they get in.
So the first part was, for all of these issues,
what are the options that I have to reduce risk or to minimize or contain damage?
The second part was in that list, what is the actual consequence?
Like, don't just tell me I'm vulnerable to a particular
vulnerability, that I might be vulnerable.
Show me that I'm actually exploitable.
Show me that threat actors are known to abuse it
and make clear the consequence to the business
if we don't do something.
So, for example, don't just tell me I've got ransomware risk.
Tell me that this vulnerability is exploitable,
known to be abused by Salt Typhoon,
and will enable the attacker to gain access
to accounts receivable credentials, access the accounts receivable system, and steal or interdict financial payments.
That level of precision is key to having the board understand the risk that they're accepting
and understand why they should adjust priorities or marshal new resources to get after it.
And until we're able to precisely describe the consequence in that risk narrative,
it becomes very hard for the board to understand the risk that they're accepting
or why they need to sacrifice resources from one area and realign those resources to you urgently.
Well, along the way in your own journey, when did continuous pen testing become something
that you recognized the specific value of?
So it started even on my first day as CIO at GE Capital back in 2012:
I had no idea if I was secure until the bad guys showed up.
Am I fixing the right vulnerabilities?
Am I logging the right data in Splunk?
Does my team know how to actually respond to a breach?
Or is my EDR actually tuned and working correctly?
And the answer is I don't know.
I have to wait to get hacked.
Or I could hire a consultant to show up once a year
and test a sample set of my environment.
And for every patch Tuesday, I wanted a pen test Wednesday.
You know, every time my environment changed, I wanted to assess whether I was exploitable.
And so I wasn't able to find a way to solve it back then.
I tried a variety of techniques and tools and resourcing.
And then I saw this as a pervasive problem during my time as CTO of Splunk,
because I had the opportunity to meet with lots of CIOs and CISOs in that job
because I worked with the largest Splunk customers at the time.
And then when I left industry to serve within the Department of Defense,
I had that same feeling I did when I first stepped into the role at GE Capital.
I don't know where all the cyber risk is in the organization.
It's a massive footprint.
I don't know where the issues are.
I don't know what I should fix first.
I don't know what the consequences are.
And I need to rapidly assess my security posture so I can fiercely prioritize my cybersecurity hardening plans.
And in that job, that's when it really clicked.
The commander of JSOC said to me,
don't tell me we're secure, show me, and then show me again tomorrow, and then show me again next week,
because our environment's always changing and the adversary always has a vote.
And that's when I realized that it is just not possible to do this if I'm dependent on hiring a consultant
or a government red team to show up once a year, once every 18 months.
I needed to find a way to frequently test the security controls and security posture of my environment as often as possible.
Well, help me understand here. Let's level set a little bit. How do pen tests differ from traditional vulnerability scans?
Yeah, it's a great question. There's two aspects to it. The first is, when you think about a vuln scanner, I was one of the largest Tenable customers in the world at one point,
and one of the largest Qualys customers in the world
in another job.
And when you run a vuln scanner,
it is looking at vulnerabilities on a single machine.
It doesn't know if those vulnerabilities
can actually be exploited.
It doesn't know if there are compensating controls
that are going to prevent the attacker
from actually exploiting that issue.
It just knows that on this machine there is a potential problem.
It also doesn't know how the attacker could use that
and combine it with other issues
or laterally maneuver across the environment.
It's basically looking at one thing on one machine in complete isolation.
And that's just not how attackers behave.
Attackers are combining together different problems.
They're rarely even using vulnerabilities or CVEs.
They're using misconfigurations, credentials that they've collected through a variety of techniques,
misconfigured security tools and other things that don't even constitute being a vulnerability
by that purest definition.
And it's how they're combining these things together, chaining them across machines,
laterally maneuvering across machines,
to achieve a goal, to steal your data,
to compromise your domain and get admin access to everything.
Vuln scanners can't show you how to chain things together.
They can't show you the consequences of what happens,
like domain admin or sensitive data exposure.
It's just an isolated standalone point in time view.
Pen testing gives you the attacker's perspective,
and in cybersecurity,
the only perspective that matters is the attacker's.
Like, the attacker's perspective is what you need to prioritize what to fix.
It's what you need to make sure your tools are working, your controls are working,
your team has the muscle memory.
Penetration testing is the only way to get that attacker's perspective of your environment.
So when you go to a team, a security team or even an IT team,
and you say, hey, everybody, good news, we're going to start continuous
pen testing. Do there tend to be any hurdles, either technically or even culturally, that those
teams understandably sort of throw up in your way? Yeah, it's really interesting. It's still a very
split market. So a really simple qualifying question is with an unlimited budget, how many pen tests
would you run a year? One to two or four or more? Just keep it super simple. And if the answer is one to two,
you're talking to somebody that has a compliance mindset to cybersecurity and a compliance mindset to pen testing.
And if the person says four or more or as many as possible, you're talking to a person that cares deeply about cyber resilience, that cares deeply about proactive security.
Because the goal of running a pen test is not to find problems. It is to quickly fix problems that matter.
That's the goal. And so the more often you run pen tests, the
higher the resolution you have of your exploitable attack surface, the better understanding you have
of what's exploitable, how quickly are you fixing them, how often are they reoccurring and why,
how effective is your detection and response. And so I've found that single question of with
an unlimited budget, how many pen tests would you run to be a really clear way to understand who I'm
talking to, compliance-centric person that's trying to check a box or a person that's actually
focused on cyber resilience of their organization.
We'll be right back.
So you mentioned that you spent some time in the private sector at some high-profile places
and then some time working with the DoD.
What was it that led you to start the current company?
So it's really funny.
I've dreamt of this product and capability since I was 12 years old, 1992.
So I kind of grew up admiring that hacker culture,
the hacker world, the hacker movies.
And I remember kind of envisioning or dreaming like,
imagine being able to look at anything and point, click, shoot, and take it over.
And actually, remember the scene, I think it's in Iron Man 2,
where Tony Stark is testifying in front of the House or the Senate,
pulls up his camera, double taps, and hijacks the televisions, and now he's hacked everything around him.
So it's been this elusive but very interesting idea for a long time.
The hard part of the problem, honestly, is finding offensive talent that knows how to write exploit code that can run against production systems and who aren't criminals.
When I left industry to serve within the U.S. Special Operations community, I had the privilege of meeting and working alongside these incredible cyber professionals.
And so when you start a company, the idea honestly doesn't matter.
It's the early team that matters.
And as my co-founder Tony retired from the Air Force, and the other folks
that had served in various cyber roles finished their tours in the military,
it became the perfect early team to assemble to go off and solve this problem.
My understanding is that you all are making good use of automation here,
and you use the term AI hacker. Unpack that for us.
What does that mean?
Yeah, so when you think about this idea of point, click, shoot, hack, a good analogy is chess.
So in chess, there are well-defined opening moves.
You're going to move the pawns to the center of the board.
You're going to take the knights and maximize reach and maneuverability of them and so on.
And there are well-defined closing moves.
You're going to use the rooks to roll up the king or whatever else.
But the middle of the chess game is completely dynamic.
It very much depends on what your opponent is doing also.
And so pen testing is actually very similar.
There are well-defined opening moves.
You're going to conduct a ping sweep to understand everything that's network reachable.
You're going to use that to do deep service inspection to understand all of the services running on every host you've identified.
You're going to use techniques to harvest user IDs and passwords or NTLM hashes.
You're going to identify juicy or interesting landmarks:
Dell iDRAC, HP iLO, Veeam backup and recovery,
other kinds of virtual appliances and out-of-band services
that are used by admins,
not normally monitored by the SOC team,
and possess highly valuable credentials.
Those are well-defined opening moves
that you can automate and execute.
There are well-defined closing moves
of pilfering data looking for sensitive information,
or finding ways to become domain admin through querying SMB and other things like that.
Those can also be automated, right?
You don't need any special technology magic.
You just need significant deep domain expertise.
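To make those well-defined opening moves concrete, here is a minimal Python sketch of a reachability sweep with basic service discovery. It is illustrative only, not NodeZero's implementation; the subnet and port list are invented, and it should only be pointed at networks you are authorized to test.

```python
# A toy "opening move": sweep a subnet and discover listening services.
# Illustrative sketch only -- not NodeZero's implementation. The subnet
# and port list are made up; only run against networks you may test.
import ipaddress
import socket
from concurrent.futures import ThreadPoolExecutor

SUBNET = "192.168.1.0/28"            # hypothetical lab range
COMMON_PORTS = [22, 80, 443, 445]    # SSH, HTTP, HTTPS, SMB

def probe(host: str, port: int, timeout: float = 0.5) -> bool:
    """TCP connect probe: True if the port accepts a connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def sweep() -> dict[str, list[int]]:
    """Enumerate reachable hosts and the services listening on them."""
    targets = [(str(ip), port)
               for ip in ipaddress.ip_network(SUBNET).hosts()
               for port in COMMON_PORTS]
    results: dict[str, list[int]] = {}
    with ThreadPoolExecutor(max_workers=64) as pool:
        for (host, port), is_open in zip(targets,
                                         pool.map(lambda t: probe(*t), targets)):
            if is_open:
                results.setdefault(host, []).append(port)
    return results

if __name__ == "__main__":
    for host, ports in sweep().items():
        print(f"{host}: listening on {ports}")
```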
But the middle of the chess game, that dynamic part is actually where a blend of machine learning,
reinforcement learning, LLMs, and aspects of AI and expert systems becomes very important,
because it's all about the next best action.
Should this AI hacker go after the router, the printer, or the television next?
And the answer is, well, it depends.
What were the discovered services?
What is the historical record of success?
What is the likelihood that going after the television is going to lead to domain compromise
or sensitive data exposure and other things like that?
So at the end, our AI hacker uses the right technique for the task.
Certain tasks are best solved by just good old boring automation.
Other tasks are best solved as very narrow reasoning problems:
"Is this data valuable?"
is really well solved by an LLM.
But if you ask an LLM to solve a large, unbounded problem,
it's going to quickly veer off the road and go nuts.
And so Markov decision processes or traditional machine learning might be better there.
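As a rough illustration of that next-best-action idea, here is a toy scorer that ranks the router, printer, and television by expected value. The statistics and payoffs are invented; this is a sketch of the general MDP/bandit-style framing, not NodeZero's decision engine.

```python
# A toy "next best action" scorer with invented statistics -- not
# NodeZero's decision engine. It ranks candidate targets by expected
# value: smoothed historical success rate times estimated payoff, the
# kind of quantity an MDP or bandit-style policy would optimize.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str        # e.g. "router", "printer", "television"
    attempts: int    # how often this kind of target has been tried
    successes: int   # how often it led somewhere useful
    payoff: float    # estimated value if it works (0..1)

def expected_value(c: Candidate) -> float:
    """Laplace-smoothed success rate times payoff."""
    rate = (c.successes + 1) / (c.attempts + 2)
    return rate * c.payoff

candidates = [
    Candidate("router",     attempts=40, successes=12, payoff=0.9),
    Candidate("printer",    attempts=25, successes=10, payoff=0.4),
    Candidate("television", attempts=10, successes=1,  payoff=0.2),
]

# Greedy choice; a real system would also budget some exploration.
best = max(candidates, key=expected_value)
print(f"next target: {best.name} (EV = {expected_value(best):.2f})")
```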
So I think that a good technical architect always understands
the problem and tries to use the right tool for the job versus over-rotating towards chasing
the next technology trend. Do you have any examples, any stories about how an autonomous pen test
surfaced something that was actionable, maybe something that a traditional process would have
missed? Yeah, so a really interesting example is Windows Defender at a particular company.
So one of our customers had about 14,000 endpoints, and they had Windows Defender installed on all of them.
And they initiated a pen test with our product NodeZero, and our product ran through its discovery phase and then dynamic execution phase and so on.
And it found one host, one Defender agent out of 14,000, was misconfigured, just one.
And on that one misconfigured EDR agent, NodeZero was able to
gain host compromise. It was able to gain access to sensitive processes like SAM and
LSASS. It was able to gain access to sensitive credentials as a result, and then laterally maneuver
across the organization, eventually becoming domain admin. One out of 14,000 is all it took for
that to happen. And if you think about it, you can't hire a pen tester and give them all 14,000
endpoints as part of the scope. That's just too big. It's too expensive. It's going to take too long.
Clearly, the customer missed this in their own configuration because this had been a problem for
an extended period of time and they had no idea. You can't trust that your security tools are working.
You have to verify that they're delivering the defenses you expected them to deliver using the
attacker's perspective, using that idea of autonomous pen testing. So autonomous pen testing gives you
speed, scale, and comprehensiveness. And that was a really interesting example of the customer
did all the right things in 13,999 places. All it took was one. Yeah, it's a literal needle in the haystack.
So where do you suppose we're headed here? So do you think that continuous autonomous pen testing
is going to replace traditional audits and red team exercises or is this going to be a thing where
they coexist? They're going to coexist, but I think in very different ways than in the traditional
world. So first and foremost, algorithms, AI infrastructure, AI hackers like NodeZero are really
awesome at network penetration testing, because network penetration testing is a graph analytics problem
at the end of the day. There's this Microsoft quote: attackers think in graphs, defenders
think in lists. That is absolutely true for network penetration testing,
especially of production systems,
which we're the best in the world at doing.
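To illustrate that graph-analytics framing, here is a toy sketch that models an environment as a directed attack graph and uses breadth-first search to find the shortest chain of techniques from an initial foothold to domain admin. The hosts, edges, and techniques are all invented for illustration.

```python
# "Attackers think in graphs": a toy attack graph plus BFS for the
# shortest chain of techniques to domain admin. Hosts and techniques
# are invented for illustration; not a model of any real environment.
from collections import deque

# Directed edges: node -> [(next_node, technique_used)]
ATTACK_GRAPH = {
    "phished-workstation": [("file-server", "reused local admin hash")],
    "file-server": [("backup-appliance", "credentials found on open share"),
                    ("print-server", "SMB relay")],
    "backup-appliance": [("domain-controller", "cached domain admin creds")],
    "print-server": [],
    "domain-controller": [],
}

def find_attack_path(start: str, goal: str):
    """Breadth-first search; returns the shortest list of hops, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for nxt, technique in ATTACK_GRAPH.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, technique, nxt)]))
    return None  # no path: a flat list of findings would never reveal this

path = find_attack_path("phished-workstation", "domain-controller")
for src, how, dst in path or []:
    print(f"{src} --[{how}]--> {dst}")
```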
The adjacency to that is in web applications,
but very specifically testing for broken authentication
and app-to-internal pivoting.
That's just a natural extension of infrastructure
or network penetration testing.
What algorithms are not good at, though,
is finding logic flaws in custom code.
Humans are uniquely gifted
at that type of problem set.
So I think humans end up focusing on finding logic flaws in custom code
because that's what they're uniquely good at.
And then algorithms, AI hackers,
are going to primarily focus on infrastructure pen testing at scale.
Kind of number one.
The next two areas are security of your source code.
LLMs are actually proving to be really effective
at finding vulnerabilities in software,
which completely transforms the way we do static application security testing.
Claude, OpenAI's Aardvark, and other tools like that
are incredibly effective at finding bugs in software.
Semgrep, which is a really awesome SAST tool,
has written an interesting paper on how they use a blend of traditional
static application security testing techniques with LLMs
to find high-impact software flaws with a very low false-positive rate.
So I think that's going to be machine-driven.
But the final area that's human-driven is the long-tail of OT
and industrial control systems.
You know, at Horizon 3, I can't go buy myself a nuclear reactor
and add it to my cyber range to learn how to hack these control systems.
I don't have access to that kind of machinery.
I don't have access to the long-tail, bespoke aspects of OT and ICS.
And so I think that's where humans become very focused and specialized also.
We see this with Dragos.
I mean, Dragos is an amazing company, primarily consulting services,
because it takes a very special type of human with bespoke expertise to do long-tail OT testing.
So I think, Dave, the answer is it's going to be a mix.
There are certain parts of the problem stack that AI is incredibly capable of solving.
And there are very specific areas that humans are focused on,
and humans should really be working on things that'll put them on stage at DefCon
and let AI take care of the rest.
Our thanks to Snehal Antani from Horizon3.ai for sharing his insights and experience.
Snehal Antani makes the case that pen testing isn't just about finding problems,
it's about fixing the ones that can truly hurt you
and proving your defenses can stand up to real-world pressure.
In a landscape drowning in theoretical vulnerabilities,
he argues the smarter path is focusing on what attackers can actually exploit and how quickly you can respond.
Our thanks to Snehal for sharing his insights and experience.
I'm Dave Bittner. Thanks for listening to this CyberwireX special edition.
