CyberWire Daily - Pentesting at the speed of thought. [CyberWire-X]
Episode Date: January 19, 2026
While our team is observing the Martin Luther King, Jr. holiday in the United States, please enjoy this CyberWire-X episode featuring the team from Horizon3.ai. In this episode, Dave Bittner speaks with Horizon3.ai co-founder and CEO Snehal Antani about how continuous autonomous penetration testing is reshaping security resilience. Antani reflects on his journey from CIO to DoD operator, where he learned that the hardest part of security isn't patching; it's prioritizing what matters and proving defenses work before attackers do. He explains why vulnerability scans fall short, how "AI hackers" simulate adversary behavior at machine speed, and why organizations must shift from compliance thinking to attacker-centric validation. Antani shares real-world findings, warns of a 77-second domain compromise, and predicts a future of AI fighting AI, with humans by exception.
Resources:
Whitepaper: NodeZero® for Pentesters and Red Teams
Whitepaper: Traditional vs. Autonomous: Why NodeZero® is the Future of Cyber Risk Assessments
Transcript
You're listening to the Cyberwire Network, powered by N2K.
Hello everyone and welcome to this CyberwireX special edition.
I'm Dave Bittner.
Today, we're talking about one of the toughest challenges defenders face, sorting the noise from the signal.
For many teams, vulnerability management feels like a fire hose, loud, constant, and not
especially helpful.
And when everything is labeled critical, nothing really is.
Our guest today is Snehal Antani, co-founder and CEO of Horizon3.ai.
He's lived that struggle firsthand as a CIO, deciding what not to do, asking staff to cancel
family plans for patches that didn't matter, and trying to prove security controls worked
without waiting for an actual breach to test them.
That frustration pushed him to rethink how we validate security in the first place.
Instead of guessing, he decided to continuously pen-test his own environment,
and the challenges he ran into along the way ultimately led him to co-found Horizon3.ai,
a company working to bring autonomous pen-testing and AI hackers into the mainstream.
His argument is simple.
Vuln scanners tell you what might be wrong; pen tests show you what actually matters.
So today we are talking about pen-testing, continuous pen-testing,
and AI hackers, and all that good stuff.
I would love to start with a little background on you.
I know you've said that the hardest part of being a CIO
was deciding what not to do.
How did that come to be part of your own personal journey
and kind of something that you lead with?
It's interesting.
So I'm an engineer by education and trade.
I did my undergrad at Purdue in computer science,
but started my career at IBM doing distributed systems,
working on the mainframe,
working on WebSphere,
working with IBM research and product.
And then I left to be a CIO at GE Capital.
It was my first kind of big executive job.
And it was amazing, one, to have that opportunity and work with this amazing team.
But then, two, to really start to have sympathy for the difficulty of decision making at that level.
And one of those decisions was, from a cybersecurity standpoint, I would get a list of 100,000 vulnerabilities
that must be fixed, right, according to some tools, some measure.
And I would look at that and figure out, well, I don't have the capacity to do this.
So what do I not fix on this list?
And for the things I do have to fix, I've got to look my IT admin in the eyes and tell him or her,
they've got to skip their kid's basketball game or cancel their weekend plans and stay
behind and fix these issues.
So the hardest part of the job was deciding what not to fix.
The second hardest part of the job was telling people to fix stuff I knew wasn't even exploitable
or relevant to the attacker.
And so that was quite difficult, especially in a large scale organization.
And this idea of fiercely prioritizing problems that matter became a key part of my mantra.
How did you communicate that to folks at the board level, that kind of prioritizing?
Because I could imagine that they're thinking, well, it's all going to be patched, right?
Yeah, you know, there's two aspects to it.
The first aspect is the answer isn't just patching.
You know, I feel in industry we over-rotate to patching as the end-all, be-all that solves everything.
But actually, it's a small part of your defensive remediation plan.
There are compensating controls to go off and put in.
There are things you can do to reduce your blast radius.
You know, reduce the roles of particular credentials, reduce the access to
particular share drives or whatnot.
There are things you can do to improve detection and response,
so you at least have a chance at stifling and containing the attacker if they get in.
So the first part was, for all of these issues,
what are the options that I have to reduce risk or to minimize or contain damage?
The second part was in that list, what is the actual consequence?
Like, don't just tell me I'm vulnerable to a particular
vulnerability, that I might be vulnerable.
Show me that I'm actually exploitable.
Show me that threat actors are known to abuse it
and make clear the consequence to the business
if we don't do something.
So, for example, don't just tell me I've got ransomware risk.
Tell me that this vulnerability is exploitable,
known to be abused by Salt Typhoon,
and will enable the attacker to gain access
to accounts receivable credentials, access the accounts receivable system, and steal or interdict financial payments.
That level of precision is key to having the board understand the risk that they're accepting
and understand why they should adjust priorities or marshal new resources to get after it.
And until we're able to precisely describe the consequence in that risk narrative,
it becomes very hard for the board to understand the risk that they're accepting
or why they need to sacrifice resources from one area and realign those resources to you urgently.
Well, along the way in your own journey, when did continuous pen testing become something
that you recognized the specific value of?
So it started even on my first day as CIO at GE Capital back in 2012:
I had no idea if I was secure until the bad guys showed up.
Am I fixing the right vulnerabilities?
Am I logging the right data in Splunk?
Does my team know how to actually respond to a breach?
Or is my EDR actually tuned and working correctly?
And the answer is I don't know.
I have to wait to get hacked.
Or I could hire a consultant to show up once a year
and test a sample set of my environment.
And for every patch Tuesday, I wanted a pen test Wednesday.
You know, every time my environment changed, I wanted to assess whether I was exploitable.
And so I wasn't able to find a way to solve it back then.
I tried a variety of techniques and tools and resourcing.
And then I saw this as a pervasive problem during my time as CTO of Splunk,
because I had the opportunity to meet with lots of CIOs and CISOs in that job
because I worked with the largest Splunk customers at the time.
And then when I left industry to serve within the Department of Defense,
I had that same feeling I did when I first stepped into the role at GE Capital.
I don't know where all the cyber risk is in the organization.
It's a massive footprint.
I don't know where the issues are.
I don't know what I should fix first.
I don't know what the consequences are.
And I need to rapidly assess my security posture so I can fiercely prioritize my cybersecurity hardening plans.
And in that job, that's when it really clicked.
The commander of JSOC said to me,
don't tell me we're secure, show me, and then show me again tomorrow, and then show me again next week,
because our environment's always changing and the adversary always has a vote.
And that's when I realized that it is just not possible to do this if I'm dependent on hiring a consultant
or a government red team to show up once a year, once every 18 months.
I needed to find a way to frequently test the security controls and security posture of my environment as often as possible.
Well, help me understand here. Let's level set a little bit. How do pen tests differ from traditional vulnerability scans?
Yeah, it's a great question. There's two aspects to it. The first is, when you think about a vuln scanner, I was one of the largest Tenable customers in the world at one point,
and one of the largest Qualys customers in the world
in another job.
And when you run a vuln scanner,
it is looking at vulnerabilities on a single machine.
It doesn't know if those vulnerabilities
can actually be exploited.
It doesn't know if there are compensating controls
that are going to prevent the attacker
from actually exploiting that issue.
It just knows that on this machine there is a potential problem.
It also doesn't know how the attacker could use that
and combine it with other issues
or laterally maneuver across the environment.
It's basically looking at one thing on one machine in complete isolation.
And that's just not how attackers behave.
Attackers are combining together different problems.
They're rarely even using vulnerabilities or CVEs.
They're using misconfigurations, credentials that they've collected through a variety of techniques,
misconfigured security tools and other things that don't even constitute being a vulnerability
by that purest definition.
And it's how they're combining these things together, chaining them across machines,
laterally maneuvering across machines,
to achieve a goal, to steal your data,
to compromise your domain and get admin access to everything.
Vuln scanners can't show you how to chain things together.
They can't show you the consequences of what happens,
like domain admin or sensitive data exposure.
It's just an isolated standalone point in time view.
Pen testing gives you the attacker's perspective,
and in cybersecurity,
the only perspective that matters is the attacker's.
Like, the attacker's perspective is what you need to prioritize what to fix.
It's what you need to make sure your tools are working, your controls are working,
your team has the muscle memory.
Penetration testing is the only way to get that attacker's perspective of your environment.
So when you go to a team, a security team or even an IT team,
and you say, hey, everybody, good news, we're going to start continuous
pen testing. Do there tend to be any hurdles, either technically or even culturally, that those
teams understandably sort of throw up in your way? Yeah, it's really interesting. It's still a very
split market. So a really simple qualifying question is with an unlimited budget, how many pen tests
would you run a year? One to two or four or more? Just keep it super simple. And if the answer is one to two,
you're talking to somebody that has a compliance mindset to cybersecurity and a compliance mindset to pen testing.
And if the person says four or more or as many as possible, you're talking to a person that cares deeply about cyber resilience, that cares deeply about proactive security.
Because the goal of running a pen test is not to find problems. It is to quickly fix problems that matter.
That's the goal. And so the more often you run pen tests, the
higher the resolution you have of your exploitable attack surface, the better understanding you have
of what's exploitable, how quickly are you fixing them, how often are they reoccurring and why,
how effective is your detection and response. And so I've found that single question of with
an unlimited budget, how many pen tests would you run to be a really clear way to understand who I'm
talking to, compliance-centric person that's trying to check a box or a person that's actually
focused on cyber resilience of their organization.
We'll be right back.
So you mentioned that you spent some time in the private sector at some high-profile places
and then some time working with the DoD.
What was it that led you to start the current company?
So it's really funny.
I've dreamt of this product and capability since I was 12 years old, 1992.
So I kind of grew up admiring that hacker culture,
the hacker world, the hacker movies.
And I remember kind of envisioning or dreaming like,
imagine being able to look at anything and point, click, shoot, and take it over.
And actually, remember the scene, I think it's in Iron Man 2,
where Tony Stark is testifying in front of the House or the Senate,
pulls up his camera, double taps, and hijacks the televisions, and now he's hacked everything around him.
So it's been this elusive but very interesting idea for a long time.
The hard part of the problem, honestly, is finding offensive talent that knows how to write exploit code that can run against production systems and who aren't criminals.
When I left industry to serve within the U.S. Special Operations community, I had the privilege of meeting and working alongside these incredible cyber professionals.
And so when you start a company, the idea honestly doesn't matter.
It's the early team that matters.
And as my co-founder Tony retired from the Air Force, and the other folks
that had served in various cyber roles finished their tours in the military,
it became the perfect early team to assemble to go off and solve this problem.
My understanding is that you all are making good use of automation here,
and you use the term AI hacker. Unpack that for us.
What does that mean?
Yeah, so when you think about this idea of point, click, shoot, hack, a good analogy is chess.
So in chess, there are well-defined opening moves.
You're going to move the pawns to the center of the board.
You're going to take the knights and maximize reach and maneuverability of them and so on.
And there are well-defined closing moves.
You're going to use the rooks to roll up the king or whatever else.
But the middle of the chess game is completely dynamic.
It very much depends on what your opponent is doing also.
And so pen testing is actually very similar.
There are well-defined opening moves.
You're going to conduct a ping sweep to understand everything that's network reachable.
You're going to use that to do deep service inspection to understand all of the services running on every host you've identified.
You're going to use techniques to harvest user IDs and passwords or NTLM hashes.
You're going to identify juicy or interesting landmarks:
Dell iDRAC, HP iLO, Veeam backup and recovery,
other kinds of virtual appliances and out-of-band services
that are used by admins,
not normally monitored by the SOC team,
and possess highly valuable credentials.
Those are well-defined opening moves
that you can automate and execute.
There are well-defined closing moves
of pilfering data looking for sensitive information,
or finding ways to become domain admin through querying SMB and other things like that.
Those can also be automated, right?
You don't need any special technology magic.
You just need significant deep domain expertise.
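To make those well-defined opening moves concrete, here is a minimal Python sketch of a reachability sweep with basic service discovery. It is illustrative only, not NodeZero's implementation; the subnet and port list are invented, and it should only be pointed at networks you are authorized to test.

```python
# A toy "opening move": sweep a subnet and discover listening services.
# Illustrative sketch only -- not NodeZero's implementation. The subnet
# and port list are made up; only run against networks you may test.
import ipaddress
import socket
from concurrent.futures import ThreadPoolExecutor

SUBNET = "192.168.1.0/28"            # hypothetical lab range
COMMON_PORTS = [22, 80, 443, 445]    # SSH, HTTP, HTTPS, SMB

def probe(host: str, port: int, timeout: float = 0.5) -> bool:
    """TCP connect probe: True if the port accepts a connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def sweep() -> dict[str, list[int]]:
    """Enumerate reachable hosts and the services listening on them."""
    targets = [(str(ip), port)
               for ip in ipaddress.ip_network(SUBNET).hosts()
               for port in COMMON_PORTS]
    results: dict[str, list[int]] = {}
    with ThreadPoolExecutor(max_workers=64) as pool:
        for (host, port), is_open in zip(targets,
                                         pool.map(lambda t: probe(*t), targets)):
            if is_open:
                results.setdefault(host, []).append(port)
    return results

if __name__ == "__main__":
    for host, ports in sweep().items():
        print(f"{host}: listening on {ports}")
```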
But the middle of the chess game, that dynamic part is actually where a blend of machine learning,
reinforcement learning, LLMs, and aspects of AI and expert systems becomes very important,
because it's all about the next best action.
Should this AI hacker go after the router, the printer, or the television next?
And the answer is, well, it depends.
What were the discovered services?
What is the historical record of success?
What is the likelihood that going after the television is going to lead to domain compromise
or sensitive data exposure and other things like that?
So at the end, our AI hacker uses the right technique for the task.
Certain tasks are best solved by just good old boring automation.
Other tasks are best solved as very narrow reasoning problems:
"Is this data valuable?"
is really well solved by an LLM.
But if you ask an LLM to solve a large, unbounded problem,
it's going to quickly veer off the road and go nuts.
And so Markov decision processes or traditional machine learning might be better there.
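As a rough illustration of that next-best-action idea, here is a toy scorer that ranks the router, printer, and television by expected value. The statistics and payoffs are invented; this is a sketch of the general MDP/bandit-style framing, not NodeZero's decision engine.

```python
# A toy "next best action" scorer with invented statistics -- not
# NodeZero's decision engine. It ranks candidate targets by expected
# value: smoothed historical success rate times estimated payoff, the
# kind of quantity an MDP or bandit-style policy would optimize.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str        # e.g. "router", "printer", "television"
    attempts: int    # how often this kind of target has been tried
    successes: int   # how often it led somewhere useful
    payoff: float    # estimated value if it works (0..1)

def expected_value(c: Candidate) -> float:
    """Laplace-smoothed success rate times payoff."""
    rate = (c.successes + 1) / (c.attempts + 2)
    return rate * c.payoff

candidates = [
    Candidate("router",     attempts=40, successes=12, payoff=0.9),
    Candidate("printer",    attempts=25, successes=10, payoff=0.4),
    Candidate("television", attempts=10, successes=1,  payoff=0.2),
]

# Greedy choice; a real system would also budget some exploration.
best = max(candidates, key=expected_value)
print(f"next target: {best.name} (EV = {expected_value(best):.2f})")
```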
So I think that a good technical architect always understands
the problem and tries to use the right tool for the job versus over-rotating towards chasing
the next technology trend. Do you have any examples, any stories about how an autonomous pen test
surfaced something that was actionable, maybe something that a traditional process would have
missed? Yeah, so a really interesting example is Windows Defender at a particular company.
So one of our customers had about 14,000 endpoints, and they had Windows Defender installed on all of them.
And they initiated a pen test with our product NodeZero, and our product ran through its discovery phase and then dynamic execution phase and so on.
And it found one host, one Defender agent out of 14,000, was misconfigured, just one.
And on that one misconfigured EDR agent, NodeZero was able to
gain host compromise. It was able to gain access to sensitive processes like SAM and
LSASS. It was able to gain access to sensitive credentials as a result, and then laterally maneuver
across the organization, eventually becoming domain admin. One out of 14,000 is all it took for
that to happen. And if you think about it, you can't hire a pen tester and give them all 14,000
endpoints as part of the scope. That's just too big. It's too expensive. It's going to take too long.
Clearly, the customer missed this in their own configuration because this had been a problem for
an extended period of time and they had no idea. You can't trust that your security tools are working.
You have to verify that they're delivering the defenses you expected them to deliver using the
attacker's perspective, using that idea of autonomous pen testing. So autonomous pen testing gives you
speed, scale, and comprehensiveness. And that was a really interesting example of the customer
did all the right things in 13,999 places. All it took was one. Yeah, it's a literal needle in the haystack.
So where do you suppose we're headed here? So do you think that continuous autonomous pen testing
is going to replace traditional audits and red team exercises or is this going to be a thing where
they coexist? They're going to coexist, but I think in very different ways than in the traditional
world. So first and foremost, algorithms, AI infrastructure, AI hackers like NodeZero are really
awesome at network penetration testing, because network penetration testing is a graph analytics problem
at the end of the day. There's this Microsoft quote: attackers think in graphs, defenders
think in lists. That is absolutely true for network penetration testing,
especially of production systems,
which we're the best in the world at doing.
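To illustrate that graph-analytics framing, here is a toy sketch that models an environment as a directed attack graph and uses breadth-first search to find the shortest chain of techniques from an initial foothold to domain admin. The hosts, edges, and techniques are all invented for illustration.

```python
# "Attackers think in graphs": a toy attack graph plus BFS for the
# shortest chain of techniques to domain admin. Hosts and techniques
# are invented for illustration; not a model of any real environment.
from collections import deque

# Directed edges: node -> [(next_node, technique_used)]
ATTACK_GRAPH = {
    "phished-workstation": [("file-server", "reused local admin hash")],
    "file-server": [("backup-appliance", "credentials found on open share"),
                    ("print-server", "SMB relay")],
    "backup-appliance": [("domain-controller", "cached domain admin creds")],
    "print-server": [],
    "domain-controller": [],
}

def find_attack_path(start: str, goal: str):
    """Breadth-first search; returns the shortest list of hops, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for nxt, technique in ATTACK_GRAPH.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, technique, nxt)]))
    return None  # no path: a flat list of findings would never reveal this

path = find_attack_path("phished-workstation", "domain-controller")
for src, how, dst in path or []:
    print(f"{src} --[{how}]--> {dst}")
```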
The adjacency to that is in web applications,
but very specifically testing for broken authentication
and app-to-internal pivoting.
That's just a natural extension of infrastructure
or network penetration testing.
What algorithms are not good at, though,
is finding logic flaws in custom code.
Humans are uniquely gifted
at that type of problem set.
So I think humans end up focusing on finding logic flaws in custom code
because that's what they're uniquely good at.
And then algorithms, AI hackers,
are going to primarily focus on infrastructure pen testing at scale.
Kind of number one.
The next two areas are security of your source code.
LLMs are actually proving to be really effective
at finding vulnerabilities in software,
which completely transforms the way we do static application security testing.
Claude, OpenAI's Aardvark, and other tools like that
are incredibly effective at finding bugs in software.
Semgrep, which is a really awesome SAST tool,
has written an interesting paper on how they use a blend of traditional
static application security testing techniques with LLMs
to find high-impact software flaws with a very low false-positive rate.
So I think that's going to be machine-driven.
But the final area that's human-driven is the long-tail of OT
and industrial control systems.
You know, at Horizon 3, I can't go buy myself a nuclear reactor
and add it to my cyber range to learn how to hack these control systems.
I don't have access to that kind of machinery.
I don't have access to the long-tail, bespoke aspects of OT and ICS.
And so I think that's where humans become very focused and specialized also.
We see this with Dragos.
I mean, Dragos is an amazing company, primarily consulting services,
because it takes a very special type of human with bespoke expertise to do long-tail OT testing.
So I think, Dave, the answer is it's going to be a mix.
There are certain parts of the problem stack that AI is incredibly capable of solving.
And there are very specific areas that humans are focused on,
and humans should really be working on things that'll put them on stage at DefCon
and let AI take care of the rest.
Our thanks to Snehal Antani from Horizon3.ai for sharing his insights and experience.
Snehal Antani makes the case that pen testing isn't just about finding problems,
it's about fixing the ones that can truly hurt you
and proving your defenses can stand up to real-world pressure.
In a landscape drowning in theoretical vulnerabilities,
he argues the smarter path is focusing on what attackers can actually exploit and how quickly you can respond.
Our thanks to Snehal for sharing his insights and experience.
I'm Dave Bittner. Thanks for listening to this CyberwireX special edition.
